Le Hou1, Rajarsi Gupta2, John S Van Arnam2, Yuwei Zhang2, Kaustubh Sivalenka1, Dimitris Samaras1, Tahsin M Kurc2, Joel H Saltz3.
Abstract
The distribution and appearance of nuclei are essential markers for the diagnosis and study of cancer. Despite the importance of nuclear morphology, there is a lack of large-scale, accurate, publicly accessible nucleus segmentation data. To address this, we developed an analysis pipeline, with a quality control process, that segments nuclei in whole slide tissue images (WSIs) from multiple cancer types. We have generated nucleus segmentation results in 5,060 WSIs from 10 cancer types in The Cancer Genome Atlas (TCGA). One key component of our work is a multi-level quality control process (WSI-level and image patch-level) that evaluates the quality of our segmentation results. The image patch-level quality control used manual segmentation ground truth data from 1,356 sampled image patches. The datasets we publish in this work consist of roughly 5 billion quality-controlled nuclei from more than 5,060 TCGA WSIs from 10 different TCGA cancer types, and 1,356 manually segmented TCGA image patches from the same 10 cancer types plus 4 additional cancer types.
Year: 2020 PMID: 32561748 PMCID: PMC7305328 DOI: 10.1038/s41597-020-0528-1
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
The main contribution of our work: nucleus segmentation data in 10 cancer types.
| Abbreviation | Cancer type | # slides in total | # slides failed QC |
|---|---|---|---|
| BLCA | Urothelial carcinoma of the bladder | 380 | 14 |
| BRCA | Invasive carcinoma of the breast | 1,096 | 88 |
| CESC | Cervical squamous cell carcinoma and endocervical adenocarcinoma | 249 | 54 |
| GBM | Glioblastoma Multiforme | 772 | 40 |
| LUAD | Lung adenocarcinoma | 540 | 59 |
| LUSC | Lung squamous cell carcinoma | 431 | 35 |
| PAAD | Pancreatic adenocarcinoma | 190 | 11 |
| PRAD | Prostate adenocarcinoma | 387 | 19 |
| SKCM | Skin Cutaneous Melanoma | 470 | 64 |
| UCEC | Endometrial Carcinoma of the Uterine Corpus | 545 | 192 |
| Total | | 5,060 | 576 |
We also generated results in 4 additional cancer types (COAD: colon adenocarcinoma, READ: rectal adenocarcinoma, STAD: stomach adenocarcinoma, UVM: uveal melanoma), but the segmentation quality for these is lower than for the 10 cancer types above. To validate the segmentation data, we collected segmentation ground truth in 1,356 patches. This set of manually segmented data is another contribution of our work.
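As a quick consistency check on the table above, the per-type QC failure rates and the totals can be recomputed from the listed slide counts. This is a minimal sketch: the counts are copied from the table, while the dictionary layout and the function name `failure_rates` are illustrative and not part of the published pipeline.

```python
# Slide counts copied from the table of 10 cancer types:
# abbreviation -> (slides in total, slides failed QC).
SLIDES = {
    "BLCA": (380, 14),
    "BRCA": (1096, 88),
    "CESC": (249, 54),
    "GBM": (772, 40),
    "LUAD": (540, 59),
    "LUSC": (431, 35),
    "PAAD": (190, 11),
    "PRAD": (387, 19),
    "SKCM": (470, 64),
    "UCEC": (545, 192),
}


def failure_rates(slides):
    """Return {cancer type: fraction of slides that failed QC}."""
    return {name: failed / total for name, (total, failed) in slides.items()}


rates = failure_rates(SLIDES)
total_slides = sum(total for total, _ in SLIDES.values())
total_failed = sum(failed for _, failed in SLIDES.values())

# Print cancer types from highest to lowest failure rate, then the overall rate.
for name, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate:.1%}")
print(f"Total: {total_failed}/{total_slides} = {total_failed / total_slides:.1%}")
```

The recomputed totals agree with the Total row of the table (5,060 slides, 576 failed QC).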
Fig. 1 Samples of our data. (1) Automatic segmentation results on 5,060 WSIs (samples in top row), summarized in Table 1. (2) Manual segmentation data on over 1,356 patches (samples in bottom rows). Coloring of nuclear masks is for visualization only: it differentiates individual nuclei. We collect a large number of patches with labels for validating the segmentation results.
Fig. 2 Overview of our nucleus segmentation model training: we use a texture inpainting module to synthesize an initial synthetic pathology image patch with its nuclear mask. We then refine the initial synthetic patch using a GAN and compute its sample weight. We finally train a segmentation CNN on this sampled instance. Details are in our technical paper[33] and source code repository.
Fig. 3 Our quality control and data validation pipeline. This QC process is implemented to evaluate segmentation results at the WSI level. It would be infeasible to check the segmentation quality of all the nuclei individually.
We categorize WSIs into groups with different segmentation quality levels.
| WSI groups | Percentage of patches with bad segmentations | # slides |
|---|---|---|
| Best | 0% | 2,346 |
| Good | 0.01–6.67% | 1,246 |
| Adequate | 6.68–13.3% | 593 |
| Problematic | 13.4–20.0% | 302 |
| Unacceptable | >20.0% or failed WSI QC | 573 |
Slides identified as having unacceptable segmentation results are excluded from analysis in the rest of this work.
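The grouping in the table above can be expressed as a simple threshold rule. The following is a minimal sketch, not the authors' implementation: the function name `wsi_quality_group` and its parameters are hypothetical, the cut-offs are the rounded percentages printed in the table, and treating each upper bound as inclusive (so that values falling between the printed ranges go to the next group) is an assumption.

```python
def wsi_quality_group(bad_patch_pct, failed_wsi_qc=False):
    """Map a slide to a quality group from the percentage of its sampled
    patches with bad segmentations (0-100).

    Cut-offs are the rounded values printed in the table; how values that
    fall between the printed ranges are handled is an assumption.
    """
    # A slide that failed WSI-level QC is unacceptable regardless of patches.
    if failed_wsi_qc:
        return "Unacceptable"
    if not 0.0 <= bad_patch_pct <= 100.0:
        raise ValueError("bad_patch_pct must be between 0 and 100")
    if bad_patch_pct == 0.0:
        return "Best"
    if bad_patch_pct <= 6.67:
        return "Good"
    if bad_patch_pct <= 13.3:
        return "Adequate"
    if bad_patch_pct <= 20.0:
        return "Problematic"
    return "Unacceptable"


# Slides in the "Unacceptable" group are the ones excluded from later analysis.
examples = [(0.0, False), (5.0, False), (10.0, False), (20.0, False), (25.0, False), (0.0, True)]
for pct, failed in examples:
    print(pct, failed, wsi_quality_group(pct, failed))
```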
Fig. 4 Examples of automatic segmentation vs. manual segmentation. First two rows: failure cases. Last two rows: randomly selected samples.
Quantitative assessment of the quality of nucleus segmentation, across 10 cancer types.
| WSI groups | # patch labels | Dice | Instance-Dice | Nuclei count correlation | Nuclei count MAE% |
|---|---|---|---|---|---|
| Best | 446 | 0.797 | 0.687 | 0.947 | 15.2% |
| Good | 242 | 0.789 | 0.660 | 0.930 | 16.1% |
| Adequate | 128 | 0.774 | 0.636 | 0.915 | 17.6% |
| Problematic | 52 | 0.788 | 0.625 | 0.879 | 20.5% |
| Unacceptable | 103 | 0.690 | 0.545 | 0.718 | 33.8% |
| Excluding unacceptable | 868 | 0.790 | 0.667 | 0.932 | 16.2% |
The definitions of the WSI groups are given in Table 2. We exclude unacceptable segmentation results from the analysis in the rest of this paper.
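For readers unfamiliar with the metrics in the table above, the sketch below shows one common way to compute pixel-level Dice, the Pearson correlation of nuclei counts, and a count MAE%. It is an illustration under stated assumptions, not the authors' evaluation code: the function names are hypothetical, masks are taken to be binary nested lists, and MAE% is assumed here to be the mean absolute count error divided by the mean ground-truth count, since this record does not give the exact normalization. Instance-Dice is not covered.

```python
import math


def dice(pred, truth):
    """Pixel-level Dice between two binary masks (equally sized nested lists)."""
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in truth for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    denominator = sum(p) + sum(t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if denominator == 0 else 2.0 * intersection / denominator


def pearson(xs, ys):
    """Pearson correlation between predicted and ground-truth nuclei counts."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def count_mae_percent(pred_counts, true_counts):
    """Mean absolute count error as a percentage of the mean true count.

    The normalization is an assumption made for this sketch.
    """
    n = len(true_counts)
    mae = sum(abs(p - t) for p, t in zip(pred_counts, true_counts)) / n
    return 100.0 * mae / (sum(true_counts) / n)


# Toy data for illustration only; these are not values from the dataset.
pred_mask = [[1, 1, 0], [0, 1, 0]]
true_mask = [[1, 0, 0], [0, 1, 1]]
toy_dice = dice(pred_mask, true_mask)
toy_corr = pearson([9, 11, 20], [10, 10, 21])
toy_mae = count_mae_percent([9, 11], [10, 10])
print(toy_dice, toy_corr, toy_mae)
```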
Fig. 5 Top: Dice and MAE% results of all patches. Bottom: Predicted nuclei count (derived from automatic segmentation) vs. ground truth nuclei count (derived from manual segmentation). Pearson correlation = 0.932, p-value < 1.0 × 10⁻³⁰⁸.
Quantitative assessment of the quality of nucleus segmentation, in each of the 10 cancer types.
| Cancer Type | # patch labels | Dice | Instance-Dice | Nuclei count correlation | Nuclei count MAE% |
|---|---|---|---|---|---|
| BLCA | 95 | 0.779 | 0.668 | 0.941 | 20.5% |
| BRCA | 89 | 0.798 | 0.649 | 0.922 | 19.6% |
| CESC | 79 | 0.818 | 0.677 | 0.947 | 13.4% |
| GBM | 86 | 0.809 | 0.723 | 0.938 | 14.4% |
| LUAD | 88 | 0.772 | 0.641 | 0.896 | 17.4% |
| LUSC | 97 | 0.789 | 0.665 | 0.924 | 16.1% |
| PAAD | 91 | 0.785 | 0.679 | 0.933 | 15.8% |
| PRAD | 96 | 0.799 | 0.670 | 0.940 | 14.7% |
| SKCM | 86 | 0.774 | 0.675 | 0.933 | 17.1% |
| UCEC | 61 | 0.778 | 0.629 | 0.900 | 14.6% |
The p-value of the Pearson correlation for every cancer type is smaller than 7.0 × 10⁻²³.
Agreement between annotations from different human annotators. This is the performance upper bound of any automatic segmentation method.
| Inter-annotator | Dice | Instance-Dice | Nuclei count correlation | Nuclei count MAE% |
|---|---|---|---|---|
| Annotator A vs. B | 0.760 | 0.600 | 0.959 | 10.8% |
| Annotator B vs. C | 0.752 | 0.622 | 0.959 | 15.5% |
| Annotator C vs. A | 0.774 | 0.697 | 0.954 | 12.2% |
Comparing labeling from scratch vs. correcting Mask R-CNN’s results.
| Annotator | Dice | Instance-Dice | Nuclei count correlation | Nuclei count MAE% |
|---|---|---|---|---|
| Annotator A | 0.803 | 0.664 | 0.962 | 12.4% |
| Annotator B | 0.793 | 0.631 | 0.984 | 11.2% |
| Annotator C | 0.780 | 0.683 | 0.973 | 9.5% |
| Measurement(s) | nucleus • segmented nucleus |
| Technology Type(s) | unsupervised machine learning • hematoxylin and eosin stain |
| Factor Type(s) | cancer type |
| Sample Characteristic - Organism | Homo sapiens |