Jose E Cejudo1, Akhilanand Chaurasia2,3, Ben Feldberg1, Joachim Krois1,2, Falk Schwendicke1,2. 1. Department of Oral Diagnostics, Digital Health and Health Services Research, Charité-Universitätsmedizin Berlin, 14197 Berlin, Germany. 2. ITU/WHO Focus Group AI on Health, Topic Group Dentistry, 1211 Geneva, Switzerland. 3. Department of Oral Medicine and Radiology, King George's Medical University, Lucknow 226003, Uttar Pradesh, India.
Abstract
OBJECTIVES: To retrospectively assess radiographic data and to prospectively classify radiographs (namely, panoramic, bitewing, periapical, and cephalometric images), we compared three deep learning architectures for their classification performance.
METHODS: Our dataset consisted of 31,288 panoramic, 43,598 periapical, 14,326 bitewing, and 1176 cephalometric radiographs from two centers (Berlin/Germany; Lucknow/India). For a subset of images L (32,381 images), image classifications were available and manually validated by an expert. The remaining subset of images U was iteratively annotated using active learning: a ResNet-34 was trained on L, least-confidence informative sampling was performed on U, and the most uncertain image classifications from U were reviewed by a human expert and iteratively used for re-training. We then employed a baseline convolutional neural network (CNN), a residual network (another ResNet-34, pretrained on ImageNet), and a capsule network (CapsNet) for classification. Early stopping was used to prevent overfitting. Model performance was evaluated using stratified k-fold cross-validation. Gradient-weighted Class Activation Mapping (Grad-CAM) was used to visualize the weighted activation maps.
RESULTS: All three models showed high accuracy (>98%), with ResNet achieving significantly higher accuracy, F1-score, precision, and sensitivity than the baseline CNN and CapsNet (p < 0.05). Specificity was not significantly different. ResNet achieved the best performance, with small variance and the fastest convergence. Misclassification was most common between bitewings and periapicals. Model activation was most notable in the inter-arch space for bitewings, interdentally for periapicals, on bony structures of the maxilla and mandible for panoramics, and on the viscerocranium for cephalometrics.
CONCLUSIONS: Regardless of the model, high classification accuracies were achieved. Image features considered for classification were consistent with expert reasoning.
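The least-confidence sampling step described in the methods can be illustrated as follows. This is a minimal sketch, not the authors' code: the function name `least_confidence_sample` and the toy probability vectors are illustrative assumptions. It scores each unlabeled image by one minus its top predicted class probability and returns the k most uncertain images for expert review.

```python
# Hedged sketch of least-confidence informative sampling over an
# unlabeled pool U: the images whose top softmax probability is lowest
# are the ones the model is least confident about, and are queried first.

def least_confidence_sample(probabilities, k):
    """Return indices of the k least-confident predictions.

    probabilities: list of per-image class-probability vectors
    (each summing to 1, e.g. softmax outputs over the four
    radiograph classes).
    """
    # Confidence of a prediction is its maximum class probability;
    # uncertainty is 1 minus that confidence.
    uncertainties = [1.0 - max(p) for p in probabilities]
    # Rank pool indices by descending uncertainty and take the top k.
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: uncertainties[i], reverse=True)
    return ranked[:k]

# Toy pool: 3 images, 4 classes (panoramic, bitewing, periapical, ceph).
pool = [
    [0.97, 0.01, 0.01, 0.01],  # confident -> not queried
    [0.40, 0.35, 0.15, 0.10],  # most uncertain -> queried first
    [0.60, 0.30, 0.05, 0.05],  # moderately uncertain
]
print(least_confidence_sample(pool, 2))  # -> [1, 2]
```

In the study's loop, the queried images would then be labeled by the human expert, moved from U to L, and the ResNet-34 re-trained before the next sampling round.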