Elima Hussain, Lipi B Mahanta, Himakshi Borah, Chandana Ray Das.
Abstract
While a publicly available benchmark dataset provides a base for developing new algorithms and comparing results, hospital-based data collected in a real-world clinical setup, following standard protocol, is also very important in AI-based medical research on automated disease diagnosis, prediction, and classification. Primary data must be constantly updated so that the developed algorithms achieve as much accuracy as possible in the regional context. This dataset supports research on image segmentation and final classification for a complete decision support system (https://doi.org/10.1016/j.tice.2020.101347) [1]. Liquid-based cytology (LBC) is one of the cervical screening tests. The repository consists of a total of 963 LBC images sub-divided into four sets representing the four classes: NILM, LSIL, HSIL, and SCC. It comprises pre-cancerous and cancerous lesions related to cervical cancer, classified according to The Bethesda System (TBS). The images were captured at 40x objective magnification using a Leica microscope with an ICC50 HD camera. The samples were collected, with due consent, from 460 patients visiting the O&G department of the public hospital with various gynaecological problems. The images were then viewed and categorized by experts of the pathology department.
Keywords: 40x; Cervical cancer; Cervical cancerous lesions; Cervical pre-cancerous lesions; Liquid-based cytology; Pap smear
Year: 2020 PMID: 32368601 PMCID: PMC7186519 DOI: 10.1016/j.dib.2020.105589
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Dataset description.
Fig. 1. (A) Images belonging to classes (I) NILM and (II) LSIL; (B) images belonging to classes (III) HSIL and (IV) SCC.
| Subject | Computer Science, Computer Vision and Pattern Recognition |
| Specific subject area | Medical Image Processing, Cervical Cancer, Cell segmentation, Cell classification |
| Type of data | Images |
| How data were acquired | Images were captured using a Leica DM 750 microscope with camera model ICC50 HD, in 400x (40x objective lens |
| Data format | Raw JPG |
| Parameters for data collection | Images were captured in 400x (40x objective lens |
| Description of data collection | Liquid-based cytology is preferred here over the conventional Pap test because it provides more uniform fixation, a cleaner background, and well-preserved samples that can be used for further HPV tests. The LBC pap smear slides were collected from three distinguished medical diagnostic centers of the North Eastern Region (NER) of India, namely Babina Diagnostic Pvt. Ltd, Imphal; Gauhati Medical College and Hospital, Guwahati; and Dr. B. Borooah Cancer Institute, Guwahati. All samples were collected under the ethical clearance protocols of the three diagnostic centers, with patient consent, from a total of 460 patients undergoing cervical screening tests. The images were captured at 400x magnification using a Leica DM 750 microscope fitted with an ICC50 HD camera and connected to a high-configuration computer and software. The images represent the sub-categories of cervical lesions (malignant and pre-malignant): NILM (Negative for Intraepithelial Lesion or Malignancy), LSIL (Low-grade Squamous Intraepithelial Lesion), HSIL (High-grade Squamous Intraepithelial Lesion), and SCC (Squamous Cell Carcinoma). |
| Data source location | 1. Babina Diagnostic Pvt. Ltd, Imphal, India; 2. Dr. B. Borooah Cancer Research Institute, Guwahati, Assam, India; 3. Gauhati Medical College and Hospital, Guwahati, Assam, India |
| Data accessibility | Hussain, Elima (2019), “Liquid-based cytology pap smear images for multi-class diagnosis of cervical cancer”, Mendeley Data, V4. |
| Related research article | E. Hussain, L.B. Mahanta, C. Ray, R. Kanta, A comprehensive study on the multi-class cervical cancer diagnostic prediction on pap smear images using a fusion-based decision from ensemble deep convolutional neural network, Tissue Cell. 65 (2020) 101347. |
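
Since the repository is described as JPG images sub-divided into four class sets (NILM, LSIL, HSIL, SCC), a first step in most segmentation or classification work is to index the files by class and check the per-class counts. The following is a minimal sketch of that step, not the authors' pipeline. It assumes the images have been downloaded and arranged as one folder per class, named after the class abbreviation; the folder names, the `lbc_dataset` path, and the helper names `index_lbc_images` and `class_counts` are hypothetical and may need adjusting to the actual layout of the download.

```python
from collections import Counter
from pathlib import Path

# The four Bethesda classes named in the dataset description.
CLASSES = ("NILM", "LSIL", "HSIL", "SCC")


def index_lbc_images(root):
    """Return a list of (path, label) pairs for JPG files under root/<class>/.

    Assumption: each class lives in a folder named after its abbreviation.
    Class folders that do not exist are skipped silently.
    """
    root = Path(root)
    samples = []
    for label in CLASSES:
        class_dir = root / label  # hypothetical folder naming
        if not class_dir.is_dir():
            continue
        for path in sorted(class_dir.iterdir()):
            # Keep only JPG files; ignore notes, thumbnails, etc.
            if path.suffix.lower() in {".jpg", ".jpeg"}:
                samples.append((path, label))
    return samples


def class_counts(samples):
    """Count how many indexed images fall into each class."""
    return Counter(label for _, label in samples)


if __name__ == "__main__":
    # "lbc_dataset" is a placeholder for wherever the images were unpacked.
    samples = index_lbc_images("lbc_dataset")
    for label in CLASSES:
        print(label, class_counts(samples)[label])
    print("total", len(samples))
```

If the layout assumption holds, the printed total can be compared against the 963 images stated in the abstract as a quick completeness check on the download.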