Florian Kromp, Eva Bozsaky, Fikret Rifatbegovic, Lukas Fischer, Magdalena Ambros, Maria Berneder, Tamara Weiss, Daria Lazic, Wolfgang Dörr, Allan Hanbury, Klaus Beiske, Peter F Ambros, Inge M Ambros, Sabine Taschner-Mandl.
Abstract
Fully automated nuclear image segmentation is the prerequisite to ensure statistically significant, quantitative analyses of tissue preparations, applied in digital pathology or quantitative microscopy. The design of segmentation methods that work independently of the tissue type or preparation is complex, due to variations in nuclear morphology, staining intensity, cell density and nuclei aggregations. Machine learning-based segmentation methods can overcome these challenges; however, high-quality expert-annotated images are required for training. Currently, the few annotated fluorescence image datasets that are publicly available do not cover a broad range of tissues and preparations. We present a comprehensive, annotated dataset including tightly aggregated nuclei of multiple tissues for the training of machine learning-based nuclear segmentation algorithms. The proposed dataset covers sample preparation methods frequently used in quantitative immunofluorescence microscopy. We demonstrate the heterogeneity of the dataset with respect to multiple parameters such as magnification, modality, signal-to-noise ratio and diagnosis. Based on a suggested split into training and test sets and additional single-nuclei expert annotations, machine learning-based image segmentation methods can be trained and evaluated.
Year: 2020 PMID: 32782410 PMCID: PMC7419523 DOI: 10.1038/s41597-020-00608-w
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1 Workflow for ground truth image annotation. (a) Raw image visualizing HaCaT cytospinned nuclei. (b) A machine learning framework was used to annotate the raw image, learning from user interaction within three consecutive steps: S1. foreground extraction, S2. connected component classification (red = non-usable objects, blue = nuclei aggregations, green = single nuclei) and S3. splitting of aggregated objects into single nuclei, resulting in an annotation mask. (c) Zoom-in of the SVG-file showing the nuclear image overlaid with polygons representing each annotated nucleus. Polygons were modified by expert biologists to fit effective nuclear borders. Challenging decisions on how to annotate nuclei, mainly occurring due to aggregated or overlapped nuclei, were presented to an expert pathologist and corrected to obtain the final ground truth. (d) The curated SVG-file was transformed into a labeled nuclear mask.
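Step (d) of the workflow, turning curated polygons into a labeled mask, could look roughly like the sketch below. It is not the authors' conversion tool: it assumes the SVG polygons have already been parsed into lists of `(x, y)` vertices, and the function name `polygons_to_label_mask` and the overwrite rule for overlapping polygons are illustrative choices.

```python
import numpy as np


def polygons_to_label_mask(polygons, shape):
    """Rasterize polygons (lists of (x, y) vertices) into a labeled mask.

    Hypothetical helper: pixel centers are tested with the even-odd rule.
    Polygon i receives label i + 1 and background stays 0. Where polygons
    overlap, the later polygon overwrites the earlier one (an assumption).
    """
    height, width = shape
    ys, xs = np.mgrid[0:height, 0:width]
    px = xs + 0.5  # x coordinate of each pixel center
    py = ys + 0.5  # y coordinate of each pixel center
    mask = np.zeros(shape, dtype=np.int32)

    for label, polygon in enumerate(polygons, start=1):
        verts = np.asarray(polygon, dtype=float)
        inside = np.zeros(shape, dtype=bool)
        x0, y0 = verts[-1]
        for x1, y1 in verts:
            if y0 != y1:  # horizontal edges never cross the test ray
                # Does the edge straddle the horizontal line through the pixel center?
                crosses = (y0 > py) != (y1 > py)
                # x position where the edge meets that horizontal line
                x_at_py = x0 + (py - y0) * (x1 - x0) / (y1 - y0)
                # Toggle for every crossing that lies to the right of the pixel center
                inside ^= crosses & (px < x_at_py)
            x0, y0 = x1, y1
        mask[inside] = label
    return mask
```

Each nucleus keeps its own integer label, so touching nuclei remain distinguishable in the mask, which a plain binary foreground image would not allow.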
Test set split into 10 classes to evaluate the generalizability of machine learning-based image segmentation methods with respect to varying imaging conditions.
| Acronym | Description |
|---|---|
| GNB-I | ganglioneuroblastoma tissue sections |
| GNB-II | ganglioneuroblastoma tissue sections with a low signal-to-noise ratio |
| NB-I | neuroblastoma bone marrow cytospin preparations |
| NB-II | neuroblastoma cell line preparations imaged with different magnifications |
| NB-III | neuroblastoma cell line preparations imaged with LSM modalities |
| NB-IV | neuroblastoma tumor touch imprints |
| NC-I | normal cells cytospin preparations |
| NC-II | normal cells cytospin preparations with low signal-to-noise ratio |
| NC-III | normal cells grown on slide |
| TS | other tissue sections (neuroblastoma, Wilms) |
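A per-class evaluation over the ten test classes above might be organized as in the following sketch. The class acronyms come from the table; the function name `per_class_mean` and the input format, one `(acronym, score)` pair per test image, are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean

# Acronyms of the ten test classes listed in the table above.
TEST_CLASSES = (
    "GNB-I", "GNB-II", "NB-I", "NB-II", "NB-III",
    "NB-IV", "NC-I", "NC-II", "NC-III", "TS",
)


def per_class_mean(scores):
    """Average a segmentation score per test class.

    `scores` is an iterable of (class_acronym, score) pairs, one per
    test image (hypothetical input format).
    """
    grouped = defaultdict(list)
    for acronym, score in scores:
        if acronym not in TEST_CLASSES:
            raise ValueError(f"unknown test class: {acronym!r}")
        grouped[acronym].append(score)
    # Only classes that actually occur in `scores` appear in the result.
    return {acronym: mean(values) for acronym, values in grouped.items()}
```

Reporting one value per class, rather than a single pooled mean, shows whether a method degrades under specific imaging conditions such as a low signal-to-noise ratio.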
Fig. 2 Heterogeneity of the proposed dataset with respect to the type of preparation. GNB: ganglioneuroblastoma, NB: neuroblastoma, TU touch: tumor touch imprint, Tissue: tissue section.
Mean Dice coefficient between randomly selected nuclei of the manual annotations and the ground truth annotations with respect to the human annotator.
| Annotator | GNB-I | GNB-II | NB-I | NB-II | NB-III | NB-IV | NC-I | NC-II | NC-III | TS | Overall |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Biol. exp. | 0.932 | ||||||||||
| Annot. exp. | 0.877 | 0.849 | 0.896 | 0.888 | 0.957 | 0.928 | 0.973 | 0.912 |
Annot. exp.: annotation expert, Biol. exp.: expert biologist, GNB-I: ganglioneuroblastoma tissue sections, GNB-II: ganglioneuroblastoma tissue sections with a low signal-to-noise ratio, NB-I: neuroblastoma bone marrow cytospin preparations, NB-II: neuroblastoma cell line preparations with different magnifications, NB-III: neuroblastoma cell line preparations with different magnifications and imaged with LSM modalities, NB-IV: neuroblastoma tumor touch imprints, NC-I: normal cells (HaCaT) cytospin preparations, NC-II: normal cells (HaCaT) with low signal-to-noise ratio, NC-III: normal cells (HaCaT) grown on slide, TS: other tissue sections (neuroblastoma, Wilms tumor). Bold values set the baseline for machine learning-based image segmentation methods.
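For two binary masks A and B of the same nucleus, the Dice coefficient is 2|A ∩ B| / (|A| + |B|), which ranges from 0 (no overlap) to 1 (identical masks). The sketch below shows how a mean Dice value over selected nuclei could be computed; it is not the authors' evaluation code, the function names are illustrative, and the score of 1.0 for two empty masks is an assumed convention.

```python
import numpy as np


def dice_coefficient(mask_a, mask_b):
    """Dice coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)


def mean_dice(mask_pairs):
    """Mean Dice over (annotation, ground truth) mask pairs, one pair per nucleus."""
    return float(np.mean([dice_coefficient(a, b) for a, b in mask_pairs]))
```

Here each pair is expected to hold the mask of a single nucleus as drawn by the annotator together with the mask of the matching ground truth nucleus; how nuclei are matched is left outside this sketch.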
| Measurement(s) | nucleus • Annotation • Frozen Section • Neuroblastoma • Touch Prep Slide • Centrifuged Smear Slide • cells grown on slide • Ganglioneuroblastoma • Wilms Tumor • HaCaT cell |
| Technology Type(s) | Fluorescence Imaging • machine learning |
| Factor Type(s) | nucleus segmentation |
| Sample Characteristic - Organism | Homo sapiens |