Thomas E Tavolara, M Khalid Khan Niazi, Vidya Arole, Wei Chen, Wendy Frankel, Metin N Gurcan.
Abstract
Automatic identification of tissue structures in digital tissue biopsies remains an open problem in digital pathology. Common barriers include the lack of reliable ground truth due to inter- and intra-reader variability, class imbalances, and the inflexibility of discriminative models. To overcome these barriers, we are developing a framework that benefits from a reliable immunohistochemistry ground truth during labeling, overcomes class imbalances through single-task learning, and accommodates any number of classes through a minimally supervised, modular model-per-class paradigm. This study explores an initial application of this framework, based on conditional generative adversarial networks (cGANs), to automatically distinguish tumor from non-tumor regions in colorectal H&E slides. The average precision, sensitivity, and F1 score were 95.13 ± 4.44%, 93.05 ± 3.46%, and 94.02 ± 3.23% during validation and 98.75 ± 2.43%, 88.53 ± 5.39%, and 93.31 ± 3.07% on an external test dataset, respectively. With accurate identification of tumor regions, we plan to further develop our framework to establish a tumor front, from which tumor buds can be detected in a restricted region. This model will be integrated into a larger system that will quantitatively determine the prognostic significance of tumor budding.
Year: 2019 PMID: 31831792 PMCID: PMC6908583 DOI: 10.1038/s41598-019-55257-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Training and validation precision, sensitivity, and F1 score for Inception v3, reported as mean ± standard deviation across folds.
| Training precision (%) | Training sensitivity (%) | Training F1 (%) | Validation precision (%) | Validation sensitivity (%) | Validation F1 (%) |
|---|---|---|---|---|---|
| 95.05 ± 1.11 | 98.91 ± 0.12 | 96.94 ± 0.52 | 90.69 ± 5.77 | 96.77 ± 3.17 | 93.45 ± 2.59 |
Training and validation precision, sensitivity, and F1 score for cGANs trained with two Canny edge conditions, reported as mean ± standard deviation across folds.
| Condition | Training precision (%) | Training sensitivity (%) | Training F1 (%) | Validation precision (%) | Validation sensitivity (%) | Validation F1 (%) |
|---|---|---|---|---|---|---|
| sigma_2 | 99.24 ± 0.32 | 97.20 ± 1.73 | 98.21 ± 0.92 | 94.55 ± 5.04 | 93.20 ± 1.85 | 93.80 ± 2.97 |
| sigma_5 | 99.30 ± 0.51 | 97.54 ± 1.13 | 98.40 ± 0.39 | 95.13 ± 4.44 | 93.05 ± 3.46 | 94.02 ± 3.23 |
Testing precision, sensitivity, and F1 score across 270 high power fields (HPFs) sampled from an independent set of 91 slides: in the tumor bulk, outside the tumor bulk, and along the tumor front.
| Condition | Our framework: precision (%) | Our framework: sensitivity (%) | Our framework: F1 (%) | Pathologist: precision (%) | Pathologist: sensitivity (%) | Pathologist: F1 (%) |
|---|---|---|---|---|---|---|
| sigma_2 | 98.90 ± 2.43 (100.00) | 88.53 ± 5.39 (90.00) | 93.31 ± 3.07 (93.33) | 99.67 | 99.18 | 99.43 |
| sigma_5 | 82.19 ± 5.30 (82.61) | 97.57 ± 3.28 (100.00) | 89.10 ± 3.40 (88.89) | | | |
A model trained on the entire 24-slide dataset was used for this evaluation. Framework results are reported as mean ± standard deviation (median). A pathologist evaluated each HPF.
Testing precision, sensitivity, and F1 score across 182 high power fields, two sampled from each of the 91 slides, in the tumor bulk and outside the tumor bulk.
| Condition | Our framework: precision (%) | Our framework: sensitivity (%) | Our framework: F1 (%) | Inception v3: precision (%) | Inception v3: sensitivity (%) | Inception v3: F1 (%) |
|---|---|---|---|---|---|---|
| sigma_2 | 99.72 ± 0.78 (100.0) | 94.40 ± 5.45 (96.30) | 96.90 ± 3.00 (97.74) | 87.16 ± 14.03 (93.33) | 99.97 ± 0.17 (100.00) | 93.77 ± 6.25 (96.55) |
| sigma_5 | 95.61 ± 7.10 (98.44) | 98.81 ± 2.41 (100.00) | 97.02 ± 4.14 (98.53) | | | |
Models were trained in the same manner as before but with an extreme class imbalance: 10 non-tumor tiles to 1 tumor tile. Results are reported as mean ± standard deviation (median).
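The tables above summarize per-field scores as mean ± standard deviation (median). As a minimal sketch of how such a summary could be computed, the Python below derives precision, sensitivity, and F1 from tile counts and formats the summary string. The function names and the sample counts are hypothetical, and the use of the sample standard deviation is an assumption; the source does not state which estimator was used.

```python
from statistics import mean, median, stdev


def precision_sensitivity_f1(tp, fp, fn):
    """Return (precision, sensitivity, F1) in percent from raw tile counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return 100 * precision, 100 * sensitivity, 100 * f1


def summarize(values):
    """Format scores as 'mean ± standard deviation (median)'.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return f"{mean(values):.2f} ± {stdev(values):.2f} ({median(values):.2f})"


# Hypothetical (true positive, false positive, false negative) tile counts,
# one tuple per high power field; these are NOT data from the study.
hpf_counts = [(9, 0, 1), (8, 1, 1), (10, 0, 0)]

scores = [precision_sensitivity_f1(*counts) for counts in hpf_counts]
precisions, sensitivities, f1_scores = zip(*scores)

print("Precision:  ", summarize(precisions))
print("Sensitivity:", summarize(sensitivities))
print("F1:         ", summarize(f1_scores))
```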
Figure 1. Sample tiles (middle row) and their reconstructions by the tumor cGAN (top row) and the non-tumor cGAN (bottom row), both conditioned with Canny edges (sigma = 2). The first six columns from the left were misclassified using MSE; the two rightmost columns were classified correctly. Tiles were sampled from the set1 validation set.
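The caption above refers to classifying a tile by comparing it against the reconstructions of the per-class cGANs using MSE. A minimal sketch of one such decision rule follows. It assumes the tile is assigned to the class whose reconstruction has the lowest MSE, which the source does not spell out. The function names are hypothetical, and the reconstructions are synthetic stand-ins for cGAN outputs, not real model predictions.

```python
import numpy as np


def mse(a, b):
    """Mean squared error between two equally shaped images."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))


def classify_by_reconstruction(tile, reconstructions):
    """Return (label, errors) for the class with the lowest reconstruction MSE.

    reconstructions: dict mapping a class label to that class model's
    reconstruction of the tile. Assumed rule: lowest MSE wins.
    """
    errors = {label: mse(tile, recon) for label, recon in reconstructions.items()}
    return min(errors, key=errors.get), errors


# Synthetic stand-ins: the "tumor" reconstruction is deliberately closer
# to the tile than the "non_tumor" one.
rng = np.random.default_rng(0)
tile = rng.random((64, 64, 3))
reconstructions = {
    "tumor": tile + rng.normal(0.0, 0.01, tile.shape),
    "non_tumor": tile + rng.normal(0.0, 0.10, tile.shape),
}

label, errors = classify_by_reconstruction(tile, reconstructions)
print(label, errors)
```

Because the rule is an argmin over a dictionary, adding a class only requires adding another entry, which mirrors the model-per-class idea described in the abstract.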
Figure 2. Issues with inter-tumoral tumor budding caused by the crude 'windowing' of the proposed method. Because the window is mostly non-tumor, the patch is classified as non-tumor even though it contains a tumor bud.
Figure 3. How the cGAN was conditioned. On the left is a tumor tile. The tile on the right shows the channel-wise Canny-detected edges. Red, green, and blue correspond to edges extracted from the red, green, and blue channels, respectively.
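The channel-wise edge map described in the caption above can be sketched with scikit-image's `canny` filter applied to each color channel separately. This is an illustrative approximation only: the function name `channelwise_canny`, the intensity scaling, and the default thresholds are assumptions, and the source does not say which Canny implementation or thresholds were used.

```python
import numpy as np
from skimage.feature import canny


def channelwise_canny(tile_rgb, sigma=2.0):
    """Run Canny on each color channel and stack the results.

    tile_rgb: H x W x 3 array, either uint8 or float in [0, 1].
    Returns an H x W x 3 boolean array whose channel c holds the edges
    detected in channel c of the input.
    """
    tile = np.asarray(tile_rgb)
    if tile.dtype == np.uint8:
        tile = tile / 255.0  # assumed scaling to [0, 1]
    tile = tile.astype(np.float64)
    edges = [canny(tile[..., c], sigma=sigma) for c in range(tile.shape[-1])]
    return np.stack(edges, axis=-1)


# Synthetic tile: a bright square in the red channel only.
demo_tile = np.zeros((64, 64, 3), dtype=np.float64)
demo_tile[16:48, 16:48, 0] = 1.0

demo_edges = channelwise_canny(demo_tile, sigma=2.0)
print(demo_edges.shape, demo_edges[..., 0].sum())
```

Passing `sigma=5.0` instead would correspond to the second condition listed in the tables; a larger sigma smooths more before edge detection and so keeps only coarser structures.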