Kevin De Angeli, Shang Gao, Andrew Blanchard, Eric B Durbin, Xiao-Cheng Wu, Antoinette Stroup, Jennifer Doherty, Stephen M Schwartz, Charles Wiggins, Linda Coyle, Lynne Penberthy, Georgia Tourassi, Hong-Jun Yoon.
Abstract
Objective: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports.
Keywords: CNN; NLP; deep learning; ensemble distillation; selective classification
Year: 2022 PMID: 36110150 PMCID: PMC9469924 DOI: 10.1093/jamiaopen/ooac075
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Number of classes in each task
| Task | Site | Subsite | Laterality | Histology | Behavior |
|---|---|---|---|---|---|
| Classes | 70 | 326 | 7 | 639 | 4 |
Size of individual registries
| Registry | R1 | R2 | R3 | R4 | R5 | R6 |
|---|---|---|---|---|---|---|
| e-Path Reports | 85 789 | 577 094 | 137 135 | 441 732 | 360 375 | 365 152 |
Figure 1. Overview of our training pipeline with a hypothetical example in which 3 different models classify a pathology report as stomach, esophagus, and colon. Our actual implementation consists of 1000 teacher models.
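The hypothetical example in Figure 1 can be sketched in a few lines. This is a minimal illustration, assuming the teachers' hard votes are averaged into a soft label and the student is trained against it with a cross-entropy loss; the function names `soft_label` and `soft_cross_entropy` are hypothetical and are not taken from the paper's code.

```python
import math
from collections import Counter

def soft_label(votes, classes):
    """Turn hard teacher votes into a probability vector over `classes`."""
    counts = Counter(votes)
    total = len(votes)
    return [counts[c] / total for c in classes]

def soft_cross_entropy(student_probs, target):
    """Cross-entropy of the student's probabilities against a soft target."""
    return -sum(t * math.log(p) for t, p in zip(target, student_probs) if t > 0)

# Figure 1 scenario: 3 teachers, 3 different predictions (assumed toy setup).
classes = ["stomach", "esophagus", "colon"]
teacher_votes = ["stomach", "esophagus", "colon"]
target = soft_label(teacher_votes, classes)

# A student that is equally unsure about the three sites matches the target.
loss = soft_cross_entropy([1 / 3, 1 / 3, 1 / 3], target)
```

With a split vote the target spreads probability evenly over the three sites, so a student trained this way is not pushed toward full confidence in any single label for that report.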
Retention proportions results
| Model | Site | Subsite | Laterality | Histology | Behavior |
|---|---|---|---|---|---|
| MtCNN | 90.62 | 34.52 | 87.83 | 23.87 | 99.46 |
| (95% CI) | (90.61, 90.63) | (34.48, 34.56) | (87.82, 87.85) | (23.77, 23.97) | (99.45, 99.46) |
| Student | 91.10 | 36.33 | 88.49 | 27.20 | 99.98 |
| Ensemble | 92.17 | 39.00 | 89.60 | 34.16 | 99.42 |
Notes: The numbers shown represent the percentage of documents remaining after abstention (a higher percentage means more coverage). Intervals represent 95% confidence intervals.
MtCNN: multitask convolutional neural network.
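The retention figures above come from selective classification, in which the model abstains on low-confidence documents. The sketch below shows one way such a retention percentage could be computed, assuming abstention is decided by comparing the maximum softmax score with a fixed threshold; the threshold of 0.9, the toy confidences, and the helper `retention_and_accuracy` are illustrative assumptions, not values or code from the study.

```python
def retention_and_accuracy(confidences, correct, threshold):
    """Return (retention %, accuracy % on retained documents).

    A document is retained when its top softmax score is >= threshold;
    otherwise the model abstains on it.
    """
    kept = [ok for conf, ok in zip(confidences, correct) if conf >= threshold]
    retention = 100.0 * len(kept) / len(confidences)
    accuracy = 100.0 * sum(kept) / len(kept) if kept else float("nan")
    return retention, accuracy

# Toy data (made up): top softmax score and correctness for 5 reports.
confidences = [0.99, 0.98, 0.60, 0.97, 0.40]
correct = [True, True, False, True, True]

retention, accuracy = retention_and_accuracy(confidences, correct, threshold=0.9)
```

Raising the threshold generally lowers retention, because the model abstains on more documents; the accuracy on the retained documents depends on how well confidence tracks correctness.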
Accuracy results
| Model | Site | Subsite | Laterality | Histology | Behavior |
|---|---|---|---|---|---|
| MtCNN | 96.06 | 94.43 | 96.05 | 95.12 | 97.69 |
| (95% CI) | (96.06, 96.07) | (94.42, 94.45) | (96.04, 96.05) | (95.10, 95.14) | (97.69, 97.70) |
| Student | 96.27 | 94.82 | 96.19 | 95.84 | 97.60 |
| Ensemble | 96.19 | 94.55 | 96.10 | 95.78 | 98.00 |
Note: Intervals represent 95% bootstrap confidence intervals.
MtCNN: multitask convolutional neural network.
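For readers unfamiliar with bootstrap intervals, here is a minimal sketch of a percentile bootstrap for accuracy. It assumes the interval is formed by resampling per-document correctness with replacement; the number of resamples, the seed, and the function name `bootstrap_ci` are illustrative assumptions, and the source does not state which bootstrap variant was used.

```python
import random

def bootstrap_ci(correct, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    `correct` is a list of booleans, one per document.
    """
    rng = random.Random(seed)
    n = len(correct)
    stats = []
    for _ in range(n_resamples):
        # Resample documents with replacement and recompute accuracy.
        sample = rng.choices(correct, k=n)
        stats.append(sum(sample) / n)
    stats.sort()
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = max(lo_idx, int((1 - alpha / 2) * n_resamples) - 1)
    return stats[lo_idx], stats[hi_idx]

# Toy data (made up): 90 correct and 10 wrong predictions.
outcomes = [True] * 90 + [False] * 10
lower, upper = bootstrap_ci(outcomes)
```

The intervals in the tables are very narrow, which is what one would expect when the statistic is computed over a large number of documents.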
Figure 2. Histology Task. Distribution of softmaxes for the wrong predictions.
Figure 3. Subsite Task. Distribution of softmaxes for the wrong predictions.
Figure 4. Wrong histology predictions made with confidence >0.97.
Figure 5. Incorrectly annotated pathology report that was fixed during the distillation process. Some sentences were removed to preserve privacy.
Figure 6. Pathology report in which the ensemble prediction votes were split into 3 equivalent groups. This report includes results of 3 analyzed specimens related to the stomach, esophagus, and colon. Some sentences were removed to ensure privacy.