Andrea Acevedo1,2, Anna Merino1, Santiago Alférez3, Ángel Molina1, Laura Boldú1, José Rodellar2.
Abstract
This article makes available a dataset that was used to develop an automatic recognition system for peripheral blood cell images based on convolutional neural networks [1]. The dataset contains 17,092 images of individual normal cells, acquired with the CellaVision DM96 analyzer in the Core Laboratory at the Hospital Clinic of Barcelona. It is organized into eight groups: neutrophils, eosinophils, basophils, lymphocytes, monocytes, immature granulocytes (promyelocytes, myelocytes, and metamyelocytes), erythroblasts, and platelets (thrombocytes). The images are 360 × 363 pixels in JPG format and were annotated by expert clinical pathologists. They were captured from individuals without infection or hematologic or oncologic disease, and free of any pharmacologic treatment at the time of blood collection. This high-quality labelled dataset may be used to train and test machine learning and deep learning models that recognize different types of normal peripheral blood cells. To our knowledge, this is the first publicly available set with large numbers of normal peripheral blood cells, so it is expected to serve as a canonical dataset for model benchmarking.
Keywords: Blood cell automatic recognition; Blood cell images; Blood cell morphology; Deep learning; Hematological diagnosis; Machine learning
Year: 2020 PMID: 32346559 PMCID: PMC7182702 DOI: 10.1016/j.dib.2020.105474
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Types and number of cells in each group.
| CELL TYPE | TOTAL OF IMAGES BY TYPE | % |
|---|---|---|
| neutrophils | 3329 | 19.48 |
| eosinophils | 3117 | 18.24 |
| basophils | 1218 | 7.13 |
| lymphocytes | 1214 | 7.10 |
| monocytes | 1420 | 8.31 |
| immature granulocytes (metamyelocytes, myelocytes and promyelocytes) | 2895 | 16.94 |
| erythroblasts | 1551 | 9.07 |
| platelets (thrombocytes) | 2348 | 13.74 |
| Total | 17,092 | 100 |
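The percentage column can be reproduced directly from the per-class counts. The following is a minimal Python sketch, not part of the original article: the counts are copied from the table, while the function names and the inverse-frequency class weights are illustrative assumptions, shown only as one common way to account for the uneven class sizes when training a classifier.

```python
# Per-class image counts copied from the table above.
COUNTS = {
    "neutrophils": 3329,
    "eosinophils": 3117,
    "basophils": 1218,
    "lymphocytes": 1214,
    "monocytes": 1420,
    "immature granulocytes": 2895,
    "erythroblasts": 1551,
    "platelets": 2348,
}


def class_percentages(counts):
    """Return each class's share of all images, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


def inverse_frequency_weights(counts):
    """Hypothetical helper (an assumption, not from the article).

    Computes weights total / (n_classes * count), so that rarer classes
    receive proportionally larger weights.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    print("total images:", sum(COUNTS.values()))
    percentages = class_percentages(COUNTS)
    weights = inverse_frequency_weights(COUNTS)
    for name in COUNTS:
        print(f"{name:24s}{COUNTS[name]:6d}{percentages[name]:8.2f}{weights[name]:8.3f}")
```

With these weights, a class holding exactly one eighth of the images would receive weight 1; smaller groups such as lymphocytes and basophils end up above 1, and larger ones such as neutrophils below 1.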
Fig. 1 Example images of the different types of normal peripheral blood cells found in the dataset, organized in eight groups, including those more frequently observed in infections and regenerative anaemias.
Fig. 2 Daily workflow performed at the Core Laboratory to obtain the peripheral blood cell images.
| Subject | Hematology |
| Specific subject area | Computational tools for hematological diagnosis using microscopic cell images and automatic learning methods. |
| Type of data | Images |
| How data were acquired | Digital images of normal peripheral blood cells were obtained from samples collected in the Core Laboratory at the Hospital Clinic of Barcelona. To obtain the blood counts, blood samples were analysed in the Advia 2120 instrument. Next, the smear was automatically prepared using the Sysmex SP1000i slide maker–stainer with May Grünwald-Giemsa staining. Then, the CellaVision DM96 automatic analyser was used to obtain individual cell images in JPG format with a size of 360 × 363 pixels. The images obtained were labelled and stored by the clinical pathologists. |
| Data format | Raw |
| Parameters for data collection | The dataset images were obtained from normal individuals, and blood cells were selected based on normal laboratory data. |
| Description of data collection | The images were collected over a 4-year period (2015 to 2019) as part of the daily routine. Blood cell images were annotated and saved under a random number to remove any link to the individual's data, resulting in an anonymized dataset. |
| Data source location | Institution: Hospital Clinic of Barcelona |
| Data accessibility | The dataset is stored in a Mendeley repository: |
| Related research article | Author's name: Andrea Acevedo, Anna Merino, Santiago Alférez, Laura Puigví, José Rodellar |
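Since the data are raw JPG images organized in eight groups, a first step for model training is usually to index the files by class and hold out a test set. The sketch below uses only the Python standard library and rests on several assumptions that are not stated in the source: that the downloaded images sit in one sub-folder per cell group, that the root folder path is supplied by the user, and that a 20% stratified hold-out is wanted. The function names `index_images` and `stratified_split` are hypothetical.

```python
import random
from pathlib import Path


def index_images(root):
    """Map each class sub-folder name to a sorted list of its JPG files.

    Assumes (not confirmed by the source) one sub-folder per cell group.
    """
    index = {}
    for class_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        index[class_dir.name] = sorted(
            f for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in {".jpg", ".jpeg"}
        )
    return index


def stratified_split(index, test_fraction=0.2, seed=0):
    """Split (path, label) pairs into train and test lists, class by class.

    The 20% default and the fixed seed are illustrative choices only.
    """
    rng = random.Random(seed)
    train, test = [], []
    for label, files in index.items():
        shuffled = list(files)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test.extend((f, label) for f in shuffled[:n_test])
        train.extend((f, label) for f in shuffled[n_test:])
    return train, test
```

Splitting within each class keeps the group proportions similar in both subsets, which matters here because the groups range from 1,214 to 3,329 images.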