Michail E Klontzas, Ioannis Stathis, Konstantinos Spanakis, Aristeidis H Zibis, Kostas Marias, Apostolos H Karantanas.
Abstract
Differentiating avascular necrosis (AVN) from transient osteoporosis of the hip (TOH) can be difficult even for experienced musculoskeletal (MSK) radiologists. This study used MR images to develop a deep learning method, based on transfer learning and a convolutional neural network (CNN) ensemble, for accurate differentiation between the two diseases. An augmented dataset of 210 hips with TOH and 210 hips with AVN was used to fine-tune three ImageNet-pretrained CNNs (VGG-16, Inception-ResNet-V2, and InceptionV3). The ensemble decision was reached by hard voting, selecting the outcome chosen by at least two of the three CNNs. Inception-ResNet-V2 achieved the highest AUC (97.62%), similar to that of the model ensemble, followed by InceptionV3 (AUC 96.82%) and VGG-16 (AUC 96.03%). Precision for the diagnosis of AVN and recall for the detection of TOH were higher for the model ensemble than for Inception-ResNet-V2. Ensemble performance was significantly higher than that of an MSK radiologist and a fellow (P < 0.001). Deep learning was highly successful in distinguishing TOH from AVN, with the potential to aid treatment decisions and help avoid unnecessary surgery.
Keywords: Artificial Intelligence; Inception-ResNetV2; InceptionV3; MR imaging; VGG-16; avascular necrosis; deep learning; hip; osteoporosis/transient; transfer learning
Year: 2022 PMID: 36010220 PMCID: PMC9406993 DOI: 10.3390/diagnostics12081870
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
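The hard-voting rule described in the abstract (the ensemble outputs whichever class at least two of the three CNNs predict) can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names and the per-model predictions below are hypothetical.

```python
from collections import Counter


def hard_vote(votes):
    """Return the label predicted by the largest number of models."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def ensemble_predict(per_model_predictions):
    """Combine per-model label lists (one list per model) sample by sample."""
    return [hard_vote(sample_votes) for sample_votes in zip(*per_model_predictions)]


# Hypothetical predictions for five hips; not data from the study.
vgg16 = ["AVN", "TOH", "AVN", "TOH", "TOH"]
inception_v3 = ["AVN", "TOH", "TOH", "TOH", "AVN"]
inception_resnet_v2 = ["AVN", "AVN", "AVN", "TOH", "TOH"]

ensemble = ensemble_predict([vgg16, inception_v3, inception_resnet_v2])
print(ensemble)
```

With three voters and two classes, one class always receives at least two votes, so no tie-breaking rule is needed.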
Figure 1. Flow diagram describing the methodology followed for data augmentation, deep learning model training with transfer learning, and the development of a model ensemble for the diagnosis of TOH vs. AVN (created with BioRender.com).
Figure 2. Training/validation accuracy (A,C,E) and training/validation loss (B,D,F) plots for the fine-tuning of the ImageNet-pretrained Inception-ResNetV2 (A,B), VGG-16 (C,D), and InceptionV3 (E,F).
Performance metrics of individual convolutional neural networks and the respective network ensemble.
| AUC | Group | Precision | Recall | F1-score |
|---|---|---|---|---|
| | | | | |
| | AVN | 1 | 0.95 | 0.98 |
| | TOH | 0.95 | 1 | 0.98 |
| | | | | |
| | AVN | 1 | 0.92 | 0.96 |
| | TOH | 0.93 | 1 | 0.96 |
| | | | | |
| | AVN | 1 | 0.94 | 0.97 |
| | TOH | 0.94 | 1 | 0.97 |
| | | | | |
| | AVN | 0.98 | 0.97 | 0.98 |
| | TOH | 0.97 | 0.98 | 0.98 |
AUC: Area Under the Curve; AVN: Avascular Necrosis; TOH: Transient Osteoporosis of the Hip.
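The per-class precision, recall, and F1-score reported in the table follow from a confusion matrix by the standard definitions. The sketch below shows the arithmetic. The counts are hypothetical and chosen only for illustration (the source does not list them here), and `class_metrics` is an illustrative helper name.

```python
def class_metrics(tp, fp, fn):
    """Precision, recall and F1 for one class from its confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (assumption, not study data):
# rows are the true class, columns the predicted class.
avn_as_avn, avn_as_toh = 60, 3
toh_as_avn, toh_as_toh = 0, 63

# For AVN, a false positive is a TOH hip called AVN; a false negative is an AVN hip called TOH.
avn = class_metrics(tp=avn_as_avn, fp=toh_as_avn, fn=avn_as_toh)
# For TOH the roles are mirrored.
toh = class_metrics(tp=toh_as_toh, fp=avn_as_toh, fn=toh_as_avn)

print("AVN", [round(v, 2) for v in avn])
print("TOH", [round(v, 2) for v in toh])
```

Because F1 is computed from the unrounded precision and recall, it can differ slightly from the value obtained by plugging the rounded table entries into the formula.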
Figure 3. Confusion matrices for the ensemble CNN decision (A), VGG-16 (B), InceptionV3 (C), and Inception-ResNetV2 (D). TOH: transient osteoporosis of the hip; AVN: avascular necrosis of the femoral head.
Figure 4. Receiver operating characteristic (ROC) curves of the model ensemble and MSK imaging experts. The model ensemble curve is plotted in pink, the MSK radiologist curve in brown, and the MSK fellow curve in turquoise. MSK Rad: Musculoskeletal Radiologist; AUC: Area Under the Curve.
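The area under an ROC curve such as those in Figure 4 can be computed without plotting, using the rank-based (Mann-Whitney) formulation: the AUC is the fraction of positive/negative pairs in which the positive case receives the higher score, with ties counted as one half. The sketch below is a generic illustration with hypothetical labels and scores. The source does not state how its AUC values were computed.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outranks a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 1 = AVN, 0 = TOH; scores are model probabilities for AVN.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.1]
auc_from_scores = roc_auc(labels, scores)

# A reader who gives only a binary call (here: score >= 0.5) has a single operating point.
decisions = [1 if s >= 0.5 else 0 for s in scores]
auc_from_decisions = roc_auc(labels, decisions)

sensitivity = sum(d for y, d in zip(labels, decisions) if y == 1) / 3
specificity = sum(1 - d for y, d in zip(labels, decisions) if y == 0) / 3
print(auc_from_scores, auc_from_decisions, (sensitivity + specificity) / 2)
```

For binary decisions, the AUC reduces to the mean of sensitivity and specificity, which is why a reader's ROC curve consists of a single operating point joined to the corners.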