| Literature DB >> 36190937 |
Masahiro Takahashi1, Katsuhiko Noda2, Kaname Yoshida2, Keisuke Tsuchida1, Ryosuke Yui1, Takara Nakazawa1, Sho Kurihara1, Akira Baba3, Masaomi Motegi1, Kazuhisa Yamamoto1, Yutaka Yamamoto1, Hiroya Ojiri3, Hiromi Kojima1.
Abstract
Cholesteatoma is a progressive middle ear disease that can only be treated surgically, but with a high recurrence rate. Depending on the extent of the disease, a surgical approach such as microsurgery with a retroauricular incision or transcanal endoscopic surgery is chosen. However, current examinations cannot sufficiently predict the extension before surgery, and the approach may have to be changed intraoperatively. Large amounts of data are typically required to train deep neural network models; however, the prevalence of cholesteatoma is low (about 1 in 25,000). Developing analysis methods that achieve high accuracy with such a small number of samples is therefore an important issue for medical artificial intelligence (AI) research. This paper presents an AI-based system that automatically detects mastoid extension on CT. This retrospective study included 164 patients (80 with mastoid extension and 84 without) who underwent surgery. We adopted a relatively lightweight neural network model, MobileNetV2, to learn and predict the CT images of the 164 patients. Training was performed on eight divided groups for cross-validation and was repeated 24 times for each of the eight groups to verify accuracy fluctuations caused by randomly augmented learning. Evaluation was performed with each of the 24 single-trained models and with 24 sets of ensemble predictions using 23 models each, on 100% original-size images and 400% zoomed images. Fifteen otolaryngologists diagnosed the same images for comparison. The average accuracy of predicting 400% zoomed images using the ensemble prediction models was 81.14% (sensitivity = 84.95%, specificity = 77.33%). The average accuracy of the otolaryngologists was 73.41% (sensitivity = 83.17%, specificity = 64.13%) and was not affected by their clinical experience. Notably, despite the small number of cases, a highly accurate AI could be created.
These findings represent an important first step in the automatic diagnosis of cholesteatoma extension.
Year: 2022 PMID: 36190937 PMCID: PMC9529134 DOI: 10.1371/journal.pone.0273915
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. CT and MRI (non-EPI diffusion-weighted imaging) findings in patients with cholesteatoma.
Yellow dotted lines indicate lesions. (a) CT showing a soft-tissue density lesion extending from the attic to the mastoid. It is difficult to distinguish cholesteatoma from non-cholesteatoma inflammatory lesions. (b) MRI can diagnose the condition, but the exact extent is difficult to determine.
The number of extracted slices.
| Intraoperative findings | Patients | Total Images | Slices including lesion |
|---|---|---|---|
| M (-) | 84 | 2520 | 912 |
| M (+) | 80 | 2430 | 1513 |
All cases were sub-classified into two groups: cases showing extension to the mastoid (M+) and those without extension to the mastoid (M-). A total of 80 and 84 cases were classified as M+ and M-, respectively. CT slices including the lesion were extracted for the training and evaluation of the DNN models.
The number of patients and images in each group.
| Group | M (-) Patients | M (-) Images | M (+) Patients | M (+) Images |
|---|---|---|---|---|
| A | 11 | 116 | 10 | 190 |
| B | 10 | 113 | 10 | 190 |
| C | 10 | 113 | 10 | 190 |
| D | 10 | 113 | 10 | 189 |
| E | 11 | 115 | 10 | 190 |
| F | 11 | 115 | 10 | 188 |
| G | 11 | 115 | 10 | 190 |
| H | 10 | 112 | 10 | 186 |
For cross-validation, we randomly divided the patients into eight groups and prepared eight datasets, using seven groups for training and the remaining group for evaluation.
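The patient-wise grouping described above (all CT slices from one patient stay in the same fold, so no patient appears in both training and evaluation) can be sketched as follows. This is a generic illustration, not the authors' exact code; the patient IDs are placeholders.

```python
import random

def split_patients_into_groups(patient_ids, n_groups=8, seed=0):
    """Randomly assign patients (not individual images) to n_groups folds,
    so that every slice of a patient lands in the same fold."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    # Deal the shuffled patients round-robin into the groups.
    return [ids[i::n_groups] for i in range(n_groups)]

def make_cross_validation_sets(groups):
    """For each fold, use the remaining seven groups for training
    and the held-out group for evaluation."""
    sets = []
    for i, eval_group in enumerate(groups):
        train = [pid for j, g in enumerate(groups) if j != i for pid in g]
        sets.append({"train": train, "eval": eval_group})
    return sets
```

With 164 patients and eight groups, this yields folds of 20-21 patients each, matching the group sizes in the table above.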
The number of patients and original images in each training set.
| Training Set | Training Groups | Train M (-) Patients | Train M (-) Images | Train M (+) Patients | Train M (+) Images | Evaluation Group | Eval M (-) Patients | Eval M (-) Images | Eval M (+) Patients | Eval M (+) Images |
|---|---|---|---|---|---|---|---|---|---|---|
| Set-1 | A,B,C,D,E,F,G | 74 | 800 | 70 | 1327 | H | 10 | 112 | 10 | 186 |
| Set-2 | B,C,D,E,F,G,H | 73 | 796 | 70 | 1323 | A | 11 | 116 | 10 | 190 |
| Set-3 | C,D,E,F,G,H,A | 74 | 799 | 70 | 1323 | B | 10 | 113 | 10 | 190 |
| Set-4 | D,E,F,G,H,A,B | 74 | 799 | 70 | 1323 | C | 10 | 113 | 10 | 190 |
| Set-5 | E,F,G,H,A,B,C | 74 | 799 | 70 | 1324 | D | 10 | 113 | 10 | 189 |
| Set-6 | F,G,H,A,B,C,D | 73 | 797 | 70 | 1323 | E | 11 | 115 | 10 | 190 |
| Set-7 | G,H,A,B,C,D,E | 73 | 797 | 70 | 1325 | F | 11 | 115 | 10 | 188 |
| Set-8 | H,A,B,C,D,E,F | 73 | 797 | 70 | 1323 | G | 11 | 115 | 10 | 190 |
We trained each dataset 24 times to verify the accuracy fluctuations of each model. Consequently, 192 models were generated (8 datasets × 24 runs = 192 models).
Fig 2. CT images used.
We evaluated prediction accuracy using 25% partial CT images containing the lesion area, cropped from the vertical center 50% and the horizontal left or right 50% of the original image.
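The 25% crop described in the caption (central 50% of rows, left or right 50% of columns) can be expressed as simple array slicing, assuming the CT slice is a NumPy array; the exact preprocessing in the paper may differ.

```python
import numpy as np

def crop_quarter(image, side):
    """Return the 25% region: the vertical center 50% of rows combined
    with the left or right 50% of columns (whichever contains the ear)."""
    h, w = image.shape[:2]
    top, bottom = h // 4, h // 4 + h // 2   # central 50% of rows
    if side == "left":
        left, right = 0, w // 2
    elif side == "right":
        left, right = w // 2, w
    else:
        raise ValueError("side must be 'left' or 'right'")
    return image[top:bottom, left:right]
```

Cropping to the region around the temporal bone removes irrelevant anatomy, which plausibly explains the accuracy gain of 25% images over 100% original-size images reported below.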
Average accuracy of 24 single models and 24 ensemble predictions in single-image unit-based prediction and patient unit-based prediction.
| Prediction Unit | Images | Model | Sensitivity | Specificity | Accuracy |
|---|---|---|---|---|---|
| Single-image unit-based | 100% | Single | 71.87% | 62.45% | 67.16% |
| Single-image unit-based | 100% | Ensemble | 74.90% | 61.76% | 68.33% |
| Single-image unit-based | 25% | Single | 80.17% | 68.70% | 74.43% |
| Single-image unit-based | 25% | Ensemble | 77.12% | 73.75% | 75.43% |
| Patient unit-based | 100% | Single | 80.68% | 62.80% | 71.74% |
| Patient unit-based | 100% | Ensemble | 75.31% | 74.65% | 74.98% |
| Patient unit-based | 25% | Single | 86.82% | 71.97% | 79.40% |
| Patient unit-based | 25% | Ensemble | 84.95% | 77.33% | 81.14% |
The best performance in patient unit-based prediction was an average accuracy of 81.14% (sensitivity = 84.95%, specificity = 77.33%), achieved by ensemble prediction on 25% cropped images.
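The sensitivity, specificity, and accuracy figures in the table follow the standard confusion-matrix definitions, which can be computed as follows (the counts in the usage example are illustrative, not taken from the study):

```python
def binary_metrics(tp, fn, tn, fp):
    """Standard definitions: sensitivity = TP/(TP+FN),
    specificity = TN/(TN+FP), accuracy = (TP+TN)/total."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy
```

Here the positive class is M(+) (mastoid extension), so sensitivity measures how often an extension is caught and specificity how often an extension-free mastoid is correctly cleared.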
Fig 3. Fluctuation of average accuracy in single-image unit-based prediction (a) and patient unit-based prediction (b). The chart indicates that prediction on 25% cropped images has a clear advantage over 100% original-size images, and that ensemble predictions outperform single-model predictions.
Fig 4. ROC curves of the median case in single-image unit-based prediction (a) and patient unit-based prediction (b). The best AUC, 0.8372, was achieved by ensemble prediction on 25% cropped images in patient unit-based prediction. The result shows that predictions on 25% cropped images yield substantially better performance than 100% original-size images regardless of where the threshold is placed.
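The AUC reported in Fig 4 is threshold-independent: it equals the probability that a randomly chosen M(+) case receives a higher score than a randomly chosen M(-) case. A minimal rank-based sketch of that computation (fine for small evaluation sets; libraries use a faster sorted formulation):

```python
def auc_score(labels, scores):
    """ROC AUC as the probability that a random positive scores higher
    than a random negative; ties count as 0.5."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    total = 0.0
    for p in pos:
        for n in neg:
            total += 1.0 if p > n else (0.5 if p == n else 0.0)
    return total / (len(pos) * len(neg))
```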
Fig 5. Differences in years of experience, shown as a scatterplot.
The red line indicates the average accuracy of the otolaryngologist (73.4%), and the blue line indicates AI accuracy (81.1%).
Differences between AI and humans in intraoperative findings and CT densities.
| | Intraoperative mastoid extension: No (n = 84) | Intraoperative mastoid extension: Yes (n = 80) | Mastoid density on CT: No (n = 58) | Mastoid density on CT: Yes (n = 106) | Findings/density agreement: No (n = 36) | Findings/density agreement: Yes (n = 128) |
|---|---|---|---|---|---|---|
| Accuracy of otolaryngologists | 64.2% | 83.1% | 83.3% | 68.0% | 22.6% | 87.7% |
| Accuracy of AI | 71.7% | 82.0% | 80.8% | 74.5% | 50.8% | 84.0% |
Otolaryngologists showed a 15-20 percentage-point difference in accuracy depending on the presence or absence of intraoperative mastoid extension and on the presence or absence of mastoid density on CT, whereas the AI differed by less than 10 points. The accuracy gap between cases in which the intraoperative findings matched the mastoid density on CT and those in which they did not was 65.1 points (87.7% − 22.6%) for otolaryngologists versus 33.2 points (84.0% − 50.8%) for the AI.
Fig 6. Cases easily diagnosed by otolaryngologists but with low AI accuracy.
Upper: Case without mastoid extension. Lower: Case with mastoid extension.