| Literature DB >> 35856040 |
Yen-Chi Chen1,2,3, Yuan-Chia Chu4,5,6, Chii-Yuan Huang1,7, Yen-Ting Lee1, Wen-Ya Lee1, Chien-Yeh Hsu6,8, Albert C Yang2,9, Wen-Huei Liao1,7, Yen-Fu Cheng1,2,7,9.
Abstract
Background: Middle ear diseases such as otitis media and middle ear effusion, for which diagnosis is often delayed or missed, are among the most common problems faced by clinicians providing primary care to children and adolescents. Artificial intelligence (AI) has the potential to assist clinicians in detecting and diagnosing middle ear diseases from images.
Keywords: Convolutional neural network; Deep learning; Edge computing; Middle ear diseases; Mobile applications; Transfer learning
Year: 2022 PMID: 35856040 PMCID: PMC9287624 DOI: 10.1016/j.eclinm.2022.101543
Source DB: PubMed Journal: EClinicalMedicine ISSN: 2589-5370
Figure 1. Developing a smartphone-based computing program for diagnosing middle ear disease and providing medical suggestions.
The main architecture of our AI model is a CNN with transfer learning. Each layer extracts different intermediate features of the tympanic membrane image, and all extracted features are integrated to determine the type of middle ear disease and the corresponding treatment. A Core ML model was then built for the new smartphone-based eardrum application, through which users can upload eardrum images to the cloud. The eardrum app analyses the input image and, based on the result, indicates the type of middle ear disease and the treatment to be performed.
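The transfer-learning setup described above can be illustrated with a minimal sketch: a frozen "pretrained backbone" produces features, and only a new softmax classification head is trained. This is a toy NumPy illustration, not the authors' Keras/Core ML pipeline; the backbone, data, and hyperparameters are all synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a frozen pretrained backbone (in the paper this would be a
# network such as InceptionV3 up to global average pooling): a fixed random
# projection whose weights are never updated during training.
W_backbone = rng.normal(size=(64, 16))

def extract_features(images):
    """Frozen feature extractor: flattened images -> pooled feature vectors."""
    return np.maximum(images @ W_backbone, 0.0)  # ReLU activations

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)         # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Synthetic stand-in data: 200 "images", 3 classes (the paper uses 10).
n, n_classes = 200, 3
X = rng.normal(size=(n, 64))
y = rng.integers(0, n_classes, size=n)
X[np.arange(n), y] += 3.0                        # inject a class signal
Y = np.eye(n_classes)[y]                         # one-hot labels

# Transfer learning: only the new classification head is trained.
F = extract_features(X)
F = (F - F.mean(axis=0)) / F.std(axis=0)         # standardize frozen features
W_head = np.zeros((F.shape[1], n_classes))
for _ in range(300):
    P = softmax(F @ W_head)
    W_head -= 0.1 * F.T @ (P - Y) / n            # cross-entropy gradient step

acc = float((softmax(F @ W_head).argmax(axis=1) == y).mean())
print(f"training accuracy of the new head: {acc:.2f}")
```

In the real pipeline, the trained model is converted to Core ML so the smartphone app can run inference on uploaded eardrum images; the sketch only shows why freezing the backbone makes the head cheap to train on a modest dataset.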
Figure 2. Representative images of ten classes of common middle ear conditions/diseases.
(A) normal tympanic membrane, (B) acute otitis media, (C) acute myringitis, (D) chronic suppurative otitis media, (E) otitis media with effusion, (F) tympanic membrane perforation, (G) cerumen impaction, (H) ventilation tube, (I) tympanic membrane retraction, and (J) otomycosis.
Figure 3. Representative demonstration of middle ear disease detection in the mobile application.
Image characteristics and diagnostic performance.
| Model | Acc (%) | Recall (%) | Precision (%) | Sen (%) | Sp (%) | F1-Score (%) | AUC Score | Training/Test Time (sec) | Best Number of Epochs |
|---|---|---|---|---|---|---|---|---|---|
| VGG16 | 97.3 | 97.1 | 97.2 | 97.1 | 99.7 | 97.1 | 0.984 (0.979–0.989) | 1200 | 34 |
| VGG19 | 96.7 | 96.4 | 96.5 | 96.4 | 99.6 | 96.5 | 0.998 (0.996–0.999) | 1650 | 41 |
| Xception | 76.6 | 75.8 | 81.7 | 75.8 | 97.4 | 77.0 | 0.866 (0.853–0.879) | 2550 | 49 |
| InceptionV3 | 98.0 | 97.9 | 98.1 | 97.9 | 99.8 | 98.0 | 0.989 (0.985–0.992) | 9000 | 50 |
| NASNetLarge | 80.8 | 80.7 | 81.4 | 80.7 | 97.9 | 80.0 | 0.893 (0.888–0.898) | 23,200 | 47 |
| ResNet50 | 96.8 | 96.5 | 96.7 | 96.5 | 96.8 | 96.5 | 0.981 (0.977–0.984) | 1850 | 42 |
| NASNetMobile (TL) | 97.2 | 97.1 | 97.2 | 97.1 | 99.7 | 97.1 | 0.984 (0.980–0.988) | 1550 | 2 |
| MobileNetV2 (TL) | 97.6 | 97.4 | 97.4 | 97.4 | 99.7 | 97.4 | 0.985 (0.980–0.990) | 4500 | 9 |
| DenseNet201 (TL) | 97.3 | 97.1 | 97.2 | 97.1 | 99.7 | 97.1 | 0.984 (0.979–0.989) | 1600 | 2 |
Acc = accuracy; Sen = sensitivity; Sp = specificity; AUC = area under the receiver operating characteristic curve; TL = transfer learning with InceptionV3.
Champion model for transfer learning: InceptionV3. Champion model integrated into the smartphone: MobileNetV2 (TL).
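The metrics reported in the table above can all be derived from a confusion matrix. The sketch below computes accuracy and macro-averaged recall, precision, sensitivity, specificity, and F1 for an illustrative 3-class matrix; the paper does not state its averaging scheme, so macro averaging is an assumption here, and the matrix is invented for illustration.

```python
import numpy as np

def multiclass_metrics(cm):
    """Macro-averaged classification metrics from confusion matrix cm[true, pred]."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp          # cases of the class predicted as another
    fp = cm.sum(axis=0) - tp          # other classes predicted as this class
    tn = cm.sum() - tp - fn - fp
    recall = tp / (tp + fn)           # recall is identical to sensitivity
    precision = tp / (tp + fp)
    specificity = tn / (tn + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": float(tp.sum() / cm.sum()),
        "recall": float(recall.mean()),
        "precision": float(precision.mean()),
        "sensitivity": float(recall.mean()),
        "specificity": float(specificity.mean()),
        "f1": float(f1.mean()),
    }

# Illustrative 3-class confusion matrix (not the study's data).
cm = [[50, 2, 1],
      [3, 45, 2],
      [0, 4, 43]]
m = multiclass_metrics(cm)
print({k: round(v, 3) for k, v in m.items()})
```

Note that specificity is typically higher than sensitivity in multiclass settings, as in the table: each class's "negatives" pool together all other classes, so true negatives dominate.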
Figure 4. Representative class activation maps (CAMs) of 10 common eardrum/middle ear diseases. A CAM is a heatmap-like representation derived from the output of the global average pooling layer. The hot spots (red) in a CAM highlight the regions most important to the prediction rather than the whole object; the method does not produce a segmentation with fine boundaries.
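The CAM described in Figure 4 is computed as a weighted sum of the final convolutional feature maps, using the target class's weights from the dense layer that follows global average pooling. The sketch below shows that computation on random toy tensors; all shapes and values are illustrative, not taken from the study's network.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy final convolutional feature maps: K channels of spatial size H x W,
# as produced just before global average pooling (shapes are illustrative).
K, H, W = 8, 7, 7
feature_maps = rng.random((K, H, W))
class_weights = rng.normal(size=K)   # dense-layer weights for the target class

def class_activation_map(fmaps, weights):
    """CAM: weighted sum of feature maps, rescaled to [0, 1] for display."""
    cam = np.tensordot(weights, fmaps, axes=1)   # sum_k w_k * f_k(x, y)
    cam = np.maximum(cam, 0.0)                   # keep only positive evidence
    if cam.max() > 0:
        cam /= cam.max()                         # normalize for the heatmap
    return cam

cam = class_activation_map(feature_maps, class_weights)
print(cam.shape, float(cam.min()), float(cam.max()))
```

The resulting low-resolution map is upsampled to the input image size and overlaid as a heatmap, which is why the hot spots indicate coarse regions rather than fine boundaries.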
Figure 5. Heatmap comparing the results produced by the AI and the human practitioners. GP = general practitioner; R = resident doctor; SP = otolaryngology specialist; AI = artificial intelligence.
Comparison of the proposed model with previous studies.
| Study | N | Classes | Accuracy | Year | Algorithm | Device |
|---|---|---|---|---|---|---|
| Senaras et al. | 247 | 2 | 84.6% | 2017 | CNN | PC |
| Myburgh et al. | 389 | 5 | 81.6%–86.8% | 2018 | DT+CNN | PC |
| Cha et al. | 10,544 | 6 | 93.7% | 2019 | CNN+ETL | PC |
| Livingstone et al. | 1366 | 14 | 88.7% | 2020 | AutoML | PC |
| Khan et al. | 2484 | 3 | 95.0% | 2020 | CNN | PC |
| Wu et al. | 12,230 | 2 | 90.7% | 2021 | CNN+TL | Smartphone-otoscope |
| Zafer et al. | 857 | 4 | 99.5% | 2020 | DCNN, SVM | PC (public dataset) |
| Viscaino et al. | 1060 | 4 | 93.9% | 2020 | SVM, K-NN, DT | PC |
| Zeng et al. | 20,542 | 6 | 95.6% | 2021 | CNN+ETL | Real-time detection device |
| Current study | 2171 | 10 | 97.6% | 2022 | CNN+TL | Smartphone application |
N = number of images in dataset; DT = decision tree; AutoML = automated machine learning; CNN = convolutional neural network; DCNN = deep convolutional neural network; K-NN = K-nearest neighbour; PC = personal computer; ETL = ensemble transfer learning; TL = transfer learning; SVM = support vector machine.