Jae Young Choi1, Hae-Jeong Park2,3, Dongchul Cha1, Chongwon Pae2,3, Se A Lee1, Gina Na1, Young Kyun Hur1, Ho Young Lee1, A Ra Cho1, Young Joon Cho4, Sang Gil Han4, Sung Huhn Kim1.
Abstract
BACKGROUND: Deep learning (DL)-based artificial intelligence may have different diagnostic characteristics than human experts in medical diagnosis. Because DL is a data-driven knowledge system, the heterogeneous incidence of conditions in clinical populations is considered to bias DL more than it biases clinicians. Conversely, because they experience only a limited number of cases, human experts may exhibit large interindividual variability. Thus, understanding how the 2 groups classify given data differently is an essential step for the cooperative use of DL in clinical applications.
Keywords: artificial intelligence; computer-aided diagnosis; convolutional neural network; deep learning; class imbalance problem; eardrum; human-machine cooperation; otology; otoscopy
Year: 2021 PMID: 34889764 PMCID: PMC8701703 DOI: 10.2196/33049
Source DB: PubMed Journal: JMIR Med Inform
Composition of the training and test sets as well as labels, sorted by labeling priority.
| Classification | Training images (n=6900), n (%) | Test-balanced^a images (n=300), n (%) | Test-imbalanced^b images (n=300), n (%) |
| (1) Tympanic perforation | 1793 (25.99) | 50 (16.67) | 51 (17.00) |
| (2) Attic retraction/atelectasis | 521 (7.55) | 50 (16.67) | 20 (6.67) |
| (3) Myringitis/otitis externa | 256 (3.71) | 50 (16.67) | 15 (5.00) |
| (4) Otitis media with effusion | 506 (7.33) | 50 (16.67) | 29 (9.67) |
| (5) Tumors | 285 (4.13) | 50 (16.67) | 18 (6.00) |
| (6) Normal | 3539 (51.29) | 50 (16.67) | 167 (55.67) |
^a All classes are distributed equally.
^b Classes are distributed proportionally to the training set.
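The percentage columns follow directly from the image counts, so they can be rechecked with a few lines of Python. The sketch below is illustrative only: the counts are copied from the table, while the variable and function names (`train_counts`, `imbalanced_counts`, `percentages`) are hypothetical and not taken from the study.

```python
# Image counts copied from the table above (training and test-imbalanced columns).
train_counts = {
    "Tympanic perforation": 1793,
    "Attic retraction/atelectasis": 521,
    "Myringitis/otitis externa": 256,
    "Otitis media with effusion": 506,
    "Tumors": 285,
    "Normal": 3539,
}
imbalanced_counts = {
    "Tympanic perforation": 51,
    "Attic retraction/atelectasis": 20,
    "Myringitis/otitis externa": 15,
    "Otitis media with effusion": 29,
    "Tumors": 18,
    "Normal": 167,
}


def percentages(counts):
    """Return each class's share of the total, in percent, rounded to 2 decimals."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


train_pct = percentages(train_counts)
imbalanced_pct = percentages(imbalanced_counts)

for label in train_counts:
    print(f"{label}: train {train_pct[label]}%, test-imbalanced {imbalanced_pct[label]}%")
```

In the balanced test set, every class has 50 of 300 images, which the same function would report as 16.67%.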
Figure 1. Representative classes and their activation heatmaps (Grad-CAM): (A) attic retraction, (B) myringitis or otitis externa, (C) normal findings, (D) otitis media with effusion, (E) tympanic perforation, (F) middle ear or external ear canal tumors.
Figure 2. Per-class recall and overall classification accuracy (bars = 95% CI) for classes according to the number of training samples and augmentation, trained with 12 different convolutional neural network models and tested on the balanced test set. Acc: overall accuracy; Ar: attic retraction, destruction; No: normal; Oe: myringitis or acute otitis externa; Om: otitis media with effusion; Tp: tympanic perforation; Tu: middle or external ear canal tumors or cerumen impaction.
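As a reminder of what the two plotted metrics measure, here is a minimal Python sketch of per-class recall and overall accuracy. It is not the authors' evaluation code: the function names and the six-image toy example are hypothetical, and only the class abbreviations come from the caption.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Recall per class: correctly predicted images of a class / all images of that class."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: hits[c] / totals[c] for c in totals}


def overall_accuracy(y_true, y_pred):
    """Fraction of all images whose predicted class matches the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels (2 images per class), not data from the study.
y_true = ["Tp", "Tp", "No", "No", "Om", "Om"]
y_pred = ["Tp", "No", "No", "No", "Om", "Tu"]

recall = per_class_recall(y_true, y_pred)
accuracy = overall_accuracy(y_true, y_pred)
mean_recall = sum(recall.values()) / len(recall)
```

When every class has the same number of test images, as in the balanced test set, overall accuracy equals the unweighted mean of the per-class recalls. On an imbalanced set the two can diverge, because frequent classes dominate the accuracy.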
Figure 3. Mean (A) overall diagnostic accuracy and (B) Fleiss generalized kappa for interrater reliability (error bars = 95% CI); the predictions of the ResNet152-based deep learning model were treated as those of a human rater. ENT: otolaryngologists; ENT+ML': machine learning model plus otolaryngologists; ML: baseline machine learning models; ML': augmented machine learning models; Non-ENT: nonotolaryngologists; Non-ENT+ML': machine learning model plus nonotolaryngologists; NS: not statistically significant. *P<.001 (Mann-Whitney test: ENT vs Non-ENT; Wilcoxon matched-pairs signed-rank test: ML vs ML').
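For readers unfamiliar with the agreement statistic, the following Python sketch computes the standard Fleiss kappa, assuming every image is rated by the same number of raters. The caption refers to a "generalized" kappa whose exact variant is not specified here, so this is an approximation of the idea rather than the authors' computation. The function name and the toy ratings are hypothetical.

```python
def fleiss_kappa(ratings, categories):
    """Standard Fleiss kappa.

    ratings: list of items; each item is a list with one category label per rater.
    Assumes every item has the same number of raters (at least 2).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    # counts[i][j] = number of raters who assigned item i to category j
    counts = [[item.count(c) for c in categories] for item in ratings]
    # Observed agreement for each item, then averaged over items
    p_items = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_items) / n_items
    # Chance agreement from the overall share of each category
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(s * s for s in shares)
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy example: 2 images, 3 raters, 2 categories.
toy_ratings = [["No", "No", "Tp"], ["Tp", "Tp", "Tp"]]
kappa_partial = fleiss_kappa(toy_ratings, ["No", "Tp"])
kappa_perfect = fleiss_kappa([["No"] * 3, ["Tp"] * 3], ["No", "Tp"])
```

Treating a model's predictions as one more rater, as the caption describes, would amount to appending the model's label to each item's list before calling the function.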
Figure 4. In the balanced test set, (A) per-class recall and overall accuracy (bars indicate 95% CI) and (B) prediction counts in individual classes (the dotted line at 50 indicates the sample size of the balanced test set for each class; x axis is on a logarithmic scale). Classes are listed left to right by descending number of training samples. Each class had 50 samples in the balanced test set (a total of 300 samples for all 6 classes). Nonotolaryngologists showed excessively high variability and low accuracy and were not plotted. ENT: Y intercept=42.14 (95% CI 39.14-45.24), slope=0.006836 (95% CI 0.004805-0.008939), pseudo R-squared=0.3262; ML': Y intercept=37.89 (95% CI 35.77-40.07), slope=0.01053 (95% CI 0.008981-0.01211), pseudo R-squared=0.8665; ML: Y intercept=26.68 (95% CI 24.73-28.69), slope=0.02028 (95% CI 0.01861-0.02198), pseudo R-squared=0.9167. Acc: overall accuracy; Ar: attic retraction; ENT: otolaryngologist; FN: false negative; ML: baseline machine learning models; ML': augmented machine learning models; No: normal; Oe: myringitis or acute otitis externa; Om: otitis media with effusion; Tp: tympanic perforation; TP: true positive; Tu: middle or external ear canal tumors or cerumen impaction. *P<.01 (Wilcoxon matched-pairs signed-rank test).