| Literature DB >> 30833650 |
Jason W Wei, Laura J Tafe, Yevgeniy A Linnik, Louis J Vaickus, Naofumi Tomita, Saeed Hassanpour.
Abstract
Classification of histologic patterns in lung adenocarcinoma is critical for determining tumor grade and treatment for patients. However, this task is often challenging due to the heterogeneous nature of lung adenocarcinoma and the subjective criteria for evaluation. In this study, we propose a deep learning model that automatically classifies the histologic patterns of lung adenocarcinoma on surgical resection slides. Our model uses a convolutional neural network to identify regions of neoplastic cells, then aggregates those classifications to infer predominant and minor histologic patterns for any given whole-slide image. We evaluated our model on an independent set of 143 whole-slide images. It achieved a kappa score of 0.525 and an agreement of 66.6% with three pathologists for classifying the predominant patterns, slightly higher than the inter-pathologist kappa score of 0.485 and agreement of 62.7% on this test set. All evaluation metrics for our model and the three pathologists were within 95% confidence intervals of agreement. If confirmed in clinical practice, our model can assist pathologists in improving classification of lung adenocarcinoma patterns by automatically pre-screening and highlighting cancerous regions prior to review. Our approach can be generalized to any whole-slide image classification task, and code is made publicly available at https://github.com/BMIRDS/deepslide.
Entities:
Year: 2019 PMID: 30833650 PMCID: PMC6399447 DOI: 10.1038/s41598-019-40041-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Overview of whole-slide classification of histologic patterns. We used a sliding window approach on the whole slide to generate small patches, classified each patch with a residual neural network, aggregated the patch predictions, and used a heuristic to determine predominant and minor histologic patterns for the whole slide. Patch predictions were made independently of adjacent patches and relative location in the whole-slide image.
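The classify-then-aggregate step in Figure 1 can be sketched compactly. Below is a minimal, illustrative Python sketch of patch-vote aggregation; the class ordering, the 5% minor-pattern threshold, and the patch tensor shape are assumptions made for illustration, not the authors' published settings (their actual implementation lives in the deepslide repository).

```python
from collections import Counter

import torch

# Assumed class ordering; illustrative only.
CLASSES = ["lepidic", "acinar", "papillary", "micropapillary", "solid", "benign"]

def aggregate_patch_predictions(patches, model, minor_threshold=0.05):
    """Classify each fixed-size patch independently with a residual network,
    then vote: the most common cancerous pattern is predominant, and any
    other cancerous pattern above the threshold is reported as minor."""
    model.eval()
    votes = Counter()
    with torch.no_grad():
        for patch in patches:  # each patch: a (1, 3, H, W) float tensor
            pred_idx = model(patch).argmax(dim=1).item()
            votes[CLASSES[pred_idx]] += 1
    cancerous = {c: n for c, n in votes.items() if c != "benign"}
    if not cancerous:  # no predicted neoplastic patches on this slide
        return "benign", []
    total = sum(cancerous.values())
    predominant = max(cancerous, key=cancerous.get)
    minors = [c for c, n in cancerous.items()
              if c != predominant and n / total >= minor_threshold]
    return predominant, minors
```

Because each patch is classified independently of its neighbors, as the caption notes, the aggregation reduces to simple counting over patch votes.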
Distribution of training, development, and test set data among five histologic patterns and benign cases.
| Pattern | Training Set WSI | Training Set Crops | Development Set WSI | Development Set Patches | Test Set WSI | Total WSI |
|---|---|---|---|---|---|---|
| Lepidic | 99 | 515 | 17 | 58 | 64 | 180 |
| Acinar | 115 | 692 | 23 | 269 | 82 | 220 |
| Papillary | 9 | 44 | 3 | 65 | 5 | 17 |
| Micropapillary | 41 | 412 | 9 | 50 | 22 | 72 |
| Solid | 68 | 425 | 9 | 400 | 54 | 131 |
| Benign | — | 2073 | — | 226 | — | — |
WSI denotes whole-slide image. Crops are of variable length and width and are annotated by pathologists; patches are square and of fixed size, obtained by sliding a window over the crops. The class distribution for WSIs in our test set is the average of the labels from the three pathologists.
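As a rough illustration of the crop-to-patch relationship described in this note, the sketch below slides a fixed square window over a variable-size crop; the 224-pixel patch size and non-overlapping stride are assumed values, not settings stated in this entry.

```python
import numpy as np

def crop_to_patches(crop: np.ndarray, patch_size: int = 224,
                    stride: int = 224) -> list:
    """Slide a fixed square window over an (H, W, 3) crop array,
    keeping only windows fully contained in the crop."""
    h, w = crop.shape[:2]
    return [crop[y:y + patch_size, x:x + patch_size]
            for y in range(0, h - patch_size + 1, stride)
            for x in range(0, w - patch_size + 1, stride)]
```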
Figure 2Model’s performance on the 1,068 classic samples for histologic patterns. (A) patch classification results with 95% confidence intervals. (B) ROC curves and their area under the curves (AUC’s) on this development set.
Figure 3Model’s classification of 143 whole-slide images in the test set compared to those of three pathologists. (A) The kappa score of the predominant classification among all pairs of annotations. (B) Agreement percentages of predominant classification among all pairs of annotations. (C) Kappa scores for each histologic pattern among all pairs of annotations regardless of predominant or minor subtypes. P1, P2, and P3 are Pathologist 1, Pathologist 2, and Pathologist 3 respectively.
Comparison of pathologists and our model for classification of predominant subtypes in 143 whole-slide images in our test set.
| Annotator | Average Kappa Score | Average Agreement (%) | Robust Agreement (%) |
|---|---|---|---|
| Pathologist 1 | 0.454 (0.372–0.536) | 61.3 (53.3–69.3) | 66.9 (59.2–74.6) |
| Pathologist 2 | 0.515 (0.433–0.597) | 64.8 (57.0–72.6) | 72.3 (65.0–79.6) |
| Pathologist 3 | 0.514 (0.432–0.596) | 63.1 (55.2–71.0) | 75.4 (68.3–82.5) |
| Inter-pathologist | 0.479 (0.397–0.561) | 62.7 (54.8–70.6) | 71.5 (64.1–78.9) |
| Baseline model | 0.445 (0.364–0.526) | 60.1 (52.1–68.1) | 69.0 (61.4–76.6) |
| Our model | 0.525 | 66.6 | — |
The average kappa score for an annotator is calculated by averaging that annotator's pairwise kappa scores with each of the other three annotators. For instance, Pathologist 1's average is the mean of the kappa scores for Pathologist 1 & Pathologist 2, Pathologist 1 & Pathologist 3, and Pathologist 1 & our model. Average agreement is calculated in the same fashion. Robust agreement indicates agreement of an annotator with at least two of the three other annotators. 95% confidence intervals are shown in parentheses.
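These metric definitions map directly to a few lines of code. Below is a minimal sketch assuming each annotator's per-slide predominant-pattern labels are stored as integer-coded NumPy arrays; `cohen_kappa_score` is scikit-learn's standard Cohen's kappa, and annotator keys like "P1" are hypothetical names.

```python
import numpy as np
from sklearn.metrics import cohen_kappa_score

def average_kappa(target: str, labels: dict) -> float:
    """Mean Cohen's kappa of `target` against each other annotator."""
    others = [a for a in labels if a != target]
    return float(np.mean([cohen_kappa_score(labels[target], labels[a])
                          for a in others]))

def robust_agreement(target: str, labels: dict) -> float:
    """Fraction of slides on which `target` agrees with at least two
    of the three other annotators."""
    others = [a for a in labels if a != target]
    matches = sum((labels[a] == labels[target]).astype(int) for a in others)
    return float(np.mean(matches >= 2))

# Hypothetical usage: labels maps annotator name -> per-slide labels, e.g.
# labels = {"P1": p1, "P2": p2, "P3": p3, "model": m}
# average_kappa("model", labels); robust_agreement("model", labels)
```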
Figure 4. Visualization of the histologic patterns annotated by pathologists (A.i–iv) compared with those detected by our model (B.i–iv).