Shintaro Sukegawa, Tamamo Matsuyama, Futa Tanaka, Takeshi Hara, Kazumasa Yoshii, Katsusuke Yamashita, Keisuke Nakano, Kiyofumi Takabatake, Hotaka Kawai, Hitoshi Nagatsuka, Yoshihiko Furuki.
Abstract
The Pell and Gregory classification and Winter's classification are frequently used to classify mandibular third molars and are crucial for safe tooth extraction. This study aimed to evaluate how accurately convolutional neural network (CNN) deep learning models assign these classifications from cropped panoramic radiographs. We compared the diagnostic accuracy of single-task and multi-task learning after labeling 1330 images of mandibular third molars from digital radiographs taken at the Department of Oral and Maxillofacial Surgery of a general hospital (2014-2021). The mandibular third molar classifications were analyzed using a VGG16 CNN model. We statistically evaluated performance metrics [accuracy, precision, recall, F1 score, and area under the curve (AUC)] for each prediction. We found that single-task learning was superior to multi-task learning (all p < 0.05) for all metrics, with large effect sizes. Recall and F1 scores for position classification showed medium effect sizes between single-task and multi-task learning. To our knowledge, this is the first deep learning study to examine single-task and multi-task learning for the classification of mandibular third molars. Our results demonstrated the efficacy of implementing the Pell and Gregory classification and Winter's classification as separate, task-specific models.
Year: 2022 PMID: 35027629 PMCID: PMC8758752 DOI: 10.1038/s41598-021-04603-y
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Prediction performance on the single-task model.
| Task | Statistic | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|---|
| Class | Mean | 0.8541 | 0.8588 | 0.8544 | 0.8538 | 0.9638 |
| | SD | 0.0074 | 0.0075 | 0.0071 | 0.0073 | 0.0018 |
| | 95% CI | 0.851–0.858 | 0.856–0.862 | 0.852–0.857 | 0.851–0.857 | 0.963–0.965 |
| Position | Mean | 0.8895 | 0.8824 | 0.8877 | 0.8831 | 0.9739 |
| | SD | 0.0055 | 0.0075 | 0.0064 | 0.0064 | 0.0017 |
| | 95% CI | 0.887–0.892 | 0.880–0.885 | 0.885–0.890 | 0.881–0.886 | 0.973–0.975 |
| Winter's classification | Mean | 0.8663 | 0.8559 | 0.8003 | 0.8138 | 0.9801 |
| | SD | 0.0052 | 0.0143 | 0.0119 | 0.0123 | 0.0025 |
| | 95% CI | 0.864–0.868 | 0.851–0.861 | 0.796–0.805 | 0.809–0.818 | 0.979–0.981 |
SD standard deviation, 95% CI 95% confidence interval, AUC area under the receiver operating characteristics curve.
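As a rough illustration of how the threshold-based metrics in the table above are defined for a multi-class task, the sketch below computes accuracy and macro-averaged precision, recall, and F1 from true and predicted labels. Macro averaging is an assumption here, since the averaging scheme is not stated in this excerpt, and `macro_metrics` and the toy labels are hypothetical. AUC is omitted because it needs predicted scores rather than hard labels.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1.

    Macro averaging (unweighted mean over classes) is an assumption;
    the source does not state which averaging scheme was used.
    """
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count

    precisions, recalls, f1s = [], [], []
    for c in labels:
        tp = pairs[(c, c)]
        fp = sum(pairs[(t, c)] for t in labels if t != c)
        fn = sum(pairs[(c, p)] for p in labels if p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)

    n_classes = len(labels)
    return {
        "accuracy": sum(pairs[(c, c)] for c in labels) / len(y_true),
        "precision": sum(precisions) / n_classes,
        "recall": sum(recalls) / n_classes,
        "f1": sum(f1s) / n_classes,
    }


# Toy example with Pell and Gregory class labels (not data from the study).
y_true = ["I", "I", "II", "II", "III", "III"]
y_pred = ["I", "II", "II", "II", "III", "I"]
metrics = macro_metrics(y_true, y_pred)
```

With several classes of unequal size, macro averaging weights each class equally, which is one reason recall and F1 can fall below accuracy for an imbalanced label set.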
Prediction performance of the multi-task model, including class, position and Winter’s classification.
| Task | Statistic | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|---|
| Class | Mean | 0.8487 | 0.8541 | 0.8478 | 0.8474 | 0.9606 |
| | SD | 0.0087 | 0.0065 | 0.0083 | 0.0084 | 0.0018 |
| | 95% CI | 0.845–0.852 | 0.851–0.857 | 0.845–0.851 | 0.844–0.851 | 0.960–0.961 |
| Position | Mean | 0.8861 | 0.8779 | 0.8829 | 0.8781 | 0.9733 |
| | SD | 0.0056 | 0.0065 | 0.0084 | 0.0070 | 0.0025 |
| | 95% CI | 0.884–0.888 | 0.875–0.880 | 0.880–0.886 | 0.875–0.881 | 0.972–0.974 |
| Winter's classification | Mean | 0.8537 | 0.8332 | 0.7747 | 0.7896 | 0.9770 |
| | SD | 0.0068 | 0.0124 | 0.0105 | 0.0110 | 0.0024 |
| | 95% CI | 0.851–0.856 | 0.829–0.861 | 0.771–0.779 | 0.786–0.793 | 0.976–0.978 |
SD standard deviation, 95% CI 95% confidence interval, AUC area under the receiver operating characteristics curve.
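The relationship between a mean, its SD, and a 95% CI can be sketched as below, assuming a t-based interval for the mean over repeated evaluation runs. The number of runs is not stated in this excerpt, so `N_RUNS = 20` and the matching critical value `T_CRIT` are assumptions chosen only for illustration, and `confidence_interval` is a hypothetical helper rather than the authors' procedure.

```python
import math


def confidence_interval(mean, sd, n, t_crit):
    """Two-sided confidence interval for a mean: mean +/- t * sd / sqrt(n)."""
    half_width = t_crit * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


N_RUNS = 20     # assumption: number of repeated runs is not given in the source
T_CRIT = 2.093  # two-sided 95% critical value of Student's t with 19 degrees of freedom

# Single-task Class accuracy: mean 0.8541, SD 0.0074 (values from the table).
low, high = confidence_interval(0.8541, 0.0074, N_RUNS, T_CRIT)
```

Because the half-width shrinks with the square root of the number of runs, a different assumed `N_RUNS` gives a visibly different interval for the same mean and SD.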
Prediction performance of the two-task multi-task model including class and position.
| Task | Statistic | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|---|
| Class | Mean | 0.8543 | 0.8590 | 0.8539 | 0.8534 | 0.9633 |
| | SD | 0.0094 | 0.0102 | 0.0094 | 0.0088 | 0.0028 |
| | 95% CI | 0.887–0.892 | 0.856–0.862 | 0.850–0.857 | 0.850–0.857 | 0.962–0.964 |
| Position | Mean | 0.8899 | 0.8814 | 0.8857 | 0.8813 | 0.9737 |
| | SD | 0.0069 | 0.0102 | 0.0077 | 0.8813 | 0.0018 |
| | 95% CI | 0.772–0.814 | 0.878–0.885 | 0.882–0.891 | 0.878–0.884 | 0.973–0.974 |
SD standard deviation, 95% CI 95% confidence interval, AUC area under the receiver operating characteristics curve.
Statistical comparisons by p-value and effect size for the single-task and multi-task models.
| Class | Statistic | Accuracy | Precision | Recall | F1 score | AUC |
|---|---|---|---|---|---|---|
| Multi3 | p-value | 0.029 | 0.064 | 0.006 | 0.006 | < 0.0001 |
| Multi2 | p-value | 0.996 | 0.994 | 0.955 | 0.966 | 0.523 |
| Multi3 | Effect size | 0.674 | 0.554 | 0.857 | 0.817 | 1.823 |
| Multi2 | Effect size | 0.019 | 0.0235 | 0.064 | 0.057 | 0.233 |
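AUC area under the receiver operating characteristics curve, Multi3 three-task multi-task model (class, position, and Winter's classification), Multi2 two-task multi-task model (class and position). Each row compares the multi-task model with the single-task model.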
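For readers who want to relate an effect size to the means and SDs reported in the performance tables, a minimal sketch follows. It assumes Cohen's d with a pooled SD over two equally sized groups; this excerpt does not name the effect-size measure or the statistical test, so `cohens_d` is a hypothetical helper and not necessarily the authors' calculation.

```python
import math


def cohens_d(mean_a, sd_a, mean_b, sd_b):
    """Cohen's d for two groups of equal size, using the pooled SD."""
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return (mean_a - mean_b) / pooled_sd


# Class accuracy: single-task (0.8541, SD 0.0074) vs. three-task model (0.8487, SD 0.0087).
d_class_accuracy = cohens_d(0.8541, 0.0074, 0.8487, 0.0087)
```

Under the common rule of thumb, values near 0.2, 0.5, and 0.8 are read as small, medium, and large effects. A value computed this way from rounded table entries will not necessarily match a figure computed from the raw per-run results.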
Figure 1. Visualization of the judgment basis for classification prediction by a convolutional neural network (CNN) using Grad-CAM.
Figure 2. A depiction of the crop method for data preprocessing.
Distribution of Pell and Gregory, and Winter’s classifications.
| Pell & Gregory class | n | Pell & Gregory position | n | Winter's classification | n |
|---|---|---|---|---|---|
| I | 405 | A | 438 | Horizontal | 514 |
| II | 607 | B | 693 | Mesioangular | 346 |
| III | 318 | C | 199 | Vertical | 282 |
| | | | | Distoangular | 79 |
| | | | | Inverted | 79 |
| | | | | Bucco/lingualangular | 30 |
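A quick consistency check on the distribution above: each of the three label sets should account for all 1330 images, and the per-category shares show how unevenly Winter's categories are represented. The dictionary and function names below are illustrative only.

```python
# Counts copied from the distribution table.
pell_gregory_class = {"I": 405, "II": 607, "III": 318}
pell_gregory_position = {"A": 438, "B": 693, "C": 199}
winters = {
    "Horizontal": 514,
    "Mesioangular": 346,
    "Vertical": 282,
    "Distoangular": 79,
    "Inverted": 79,
    "Bucco/lingualangular": 30,
}


def shares(counts):
    """Return each category's fraction of the total."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


totals = [sum(d.values()) for d in (pell_gregory_class, pell_gregory_position, winters)]
winters_shares = shares(winters)
smallest = min(winters_shares, key=winters_shares.get)
```

The smallest Winter's category makes up only a few percent of the images, which is worth keeping in mind when reading the recall and F1 values for that task.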
Figure 3. Schematic diagram for classification of the mandibular third molars using single-task and multi-task convolutional neural network (CNN) models.
The number of parameters for each of the two types of multi-tasks and single tasks in the VGG16 model.
| VGG16 | Total parameters | Trainable parameters | Non-trainable parameters |
|---|---|---|---|
| Multi-3task (class, position, and Winter's) | |||
| Multi-2task (class and position) + Single-task (Winter's) | |||
| Multi-2task (class and position) | 15,246,157 | 531,462 | 14,714,695 |
| Single-task (Winter's) | 15,243,082 | 528,387 | 14,714,695 |
| Single-task (class + position + Winter's) | |||
| Each single-task (class/position/Winter's) | 15,243,082 | 528,387 | 14,714,695 |
Bold is the sum of the parameters for each task.
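The summed rows have no values in this copy of the table. As a sketch only, the snippet below adds up the per-model counts that are listed, on the assumption (taken from the footnote) that a combined row is the plain sum of its component models. The results are derived here and are not quoted from the source; the Multi-3task row cannot be derived this way because none of its components are listed.

```python
# Per-model parameter counts copied from the table.
MULTI_2TASK = {"total": 15_246_157, "trainable": 531_462, "non_trainable": 14_714_695}
SINGLE_TASK = {"total": 15_243_082, "trainable": 528_387, "non_trainable": 14_714_695}


def add_models(*models):
    """Sum parameter counts column by column across separately trained models."""
    return {key: sum(m[key] for m in models) for key in models[0]}


def is_consistent(model):
    """Check that total = trainable + non-trainable."""
    return model["total"] == model["trainable"] + model["non_trainable"]


# Assumed composition of the two combined rows (derived, not from the source).
multi2_plus_single = add_models(MULTI_2TASK, SINGLE_TASK)
three_single_tasks = add_models(SINGLE_TASK, SINGLE_TASK, SINGLE_TASK)
```

Under this assumption, covering all three tasks with separate single-task models costs roughly three times the parameters of one model, since each model carries its own copy of the non-trainable VGG16 weights.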