J Kühnisch1, O Meyer2, M Hesenius2, R Hickel1, V Gruhn2.
Abstract
Although visual examination (VE) is the preferred method for caries detection, the analysis of intraoral digital photographs in machine-readable form can be considered equivalent to VE. While photographic images are rarely used in clinical practice for diagnostic purposes, they are the fundamental requirement for automated image analysis when using artificial intelligence (AI) methods. Considering that AI has not been used for automatic caries detection on intraoral images so far, this diagnostic study aimed to develop a deep learning approach with convolutional neural networks (CNNs) for caries detection and categorization (test method) and to compare the diagnostic performance with respect to expert standards. The study material consisted of 2,417 anonymized photographs from permanent teeth with 1,317 occlusal and 1,100 smooth surfaces. All the images were evaluated into the following categories: caries free, noncavitated caries lesion, or caries-related cavitation. Each expert diagnosis served as a reference standard for cyclic training and repeated evaluation of the AI methods. The CNN was trained using image augmentation and transfer learning. Before training, the entire image set was divided into a training and test set. Validation was conducted by selecting 25%, 50%, 75%, and 100% of the available images from the training set. The statistical analysis included calculations of the sensitivity (SE), specificity (SP), and area under the receiver operating characteristic (ROC) curve (AUC). The CNN was able to correctly detect caries in 92.5% of cases when all test images were considered (SE, 89.6; SP, 94.3; AUC, 0.964). If the threshold of caries-related cavitation was chosen, 93.3% of all tooth surfaces were correctly classified (SE, 95.7; SP, 81.5; AUC, 0.955). It can be concluded that it was possible to achieve more than 90% agreement in caries detection using the AI method with standardized, single-tooth photographs. 
Nevertheless, the current approach needs further improvement.
Keywords: caries assessment; caries diagnostics; clinical evaluation; convolutional neural networks; deep learning; visual examination
Year: 2021 PMID: 34416824 PMCID: PMC8808002 DOI: 10.1177/00220345211032524
Source DB: PubMed Journal: J Dent Res ISSN: 0022-0345 Impact factor: 6.116
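The abstract describes validation with 25%, 50%, 75%, and 100% of the training images but does not say how those subsets were drawn. The following is a minimal sketch under stated assumptions: subsets are random and nested (each smaller subset is contained in the larger ones), and every image outside the 479-image test set is used for training (2,417 - 479 = 1,938). The function name, the image IDs, and the seed are hypothetical.

```python
import random


def nested_training_subsets(image_ids, fractions=(0.25, 0.50, 0.75, 1.00), seed=0):
    """Return {fraction: list of ids} with nested random subsets.

    Nesting is an assumption for illustration; the abstract only states
    that 25%, 50%, 75%, and 100% of the training images were selected.
    """
    shuffled = list(image_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed for reproducibility
    subsets = {}
    for fraction in fractions:
        count = int(len(shuffled) * fraction)  # truncate to a whole image count
        subsets[fraction] = shuffled[:count]   # prefix of the same shuffle
    return subsets


# Hypothetical image identifiers for an assumed 1,938-image training set.
training_ids = [f"img_{i:04d}" for i in range(1938)]
subsets = nested_training_subsets(training_ids)
sizes = {fraction: len(ids) for fraction, ids in subsets.items()}
```

Because every subset is a prefix of one shuffled list, a model trained on 50% of the images has seen everything the 25% model saw, which makes the training steps directly comparable.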
Overview of the Model Performance of the Convolutional Neural Network When the Independent Test Set (n = 479 with 180 Healthy Tooth Surfaces, 216 Noncavitated Carious Lesions, and 83 Cavitations) Was Used for Overall Caries Detection.
| Overall Caries Detection | True Positives, n | True Positives, % | True Negatives, n | True Negatives, % | False Positives, n | False Positives, % | False Negatives, n | False Negatives, % | ACC (%) | SE (%) | SP (%) | NPV (%) | PPV (%) | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Results from all the included teeth and surfaces ( | ||||||||||||||
| 25% of the images | 156 | 32.6 | 258 | 53.8 | 24 | 5.0 | 41 | 8.6 | 86.4 | 79.2 | 91.5 | 86.7 | 86.3 | 0.924 |
| 50% of the images | 148 | 30.9 | 280 | 58.4 | 32 | 6.7 | 19 | 4.0 | 89.4 | 88.6 | 89.7 | 82.2 | 93.6 | 0.950 |
| 75% of the images | 159 | 33.2 | 276 | 57.6 | 21 | 4.4 | 23 | 4.8 | 90.8 | 87.4 | 92.9 | 88.3 | 92.3 | 0.955 |
| 100% of the images | 163 | 34.0 | 280 | 58.5 | 17 | 3.5 | 19 | 4.0 | 92.5 | 89.6 | 94.3 | 90.6 | 93.6 | 0.964 |
| Results from anterior surfaces—incisors/canines ( | ||||||||||||||
| 25% of the images | 63 | 41.2 | 65 | 42.5 | 4 | 2.6 | 21 | 13.7 | 83.7 | 75.0 | 94.2 | 94.0 | 75.6 | 0.911 |
| 50% of the images | 59 | 38.6 | 79 | 51.6 | 8 | 5.2 | 7 | 4.6 | 90.2 | 89.4 | 90.8 | 88.1 | 91.9 | 0.953 |
| 75% of the images | 64 | 41.8 | 73 | 47.7 | 3 | 2.0 | 13 | 8.5 | 89.5 | 83.1 | 96.1 | 95.5 | 84.9 | 0.947 |
| 100% of the images | 64 | 41.8 | 80 | 52.3 | 3 | 2.0 | 6 | 3.9 | 94.1 | 91.4 | 96.4 | 95.5 | 93.0 | 0.965 |
| Results from posterior surfaces—molars/premolars ( | ||||||||||||||
| 25% of the images | 93 | 28.6 | 193 | 59.2 | 20 | 6.1 | 20 | 6.1 | 87.7 | 82.3 | 90.6 | 82.3 | 90.6 | 0.932 |
| 50% of the images | 89 | 27.3 | 201 | 61.6 | 24 | 7.4 | 12 | 3.7 | 89.0 | 88.1 | 89.3 | 78.8 | 94.4 | 0.945 |
| 75% of the images | 95 | 29.1 | 203 | 62.3 | 18 | 5.5 | 10 | 3.1 | 91.4 | 90.5 | 91.9 | 84.1 | 95.3 | 0.961 |
| 100% of the images | 99 | 30.4 | 200 | 61.3 | 14 | 4.3 | 13 | 4.0 | 91.7 | 88.4 | 93.5 | 87.6 | 93.9 | 0.964 |
| Results from vestibular and oral surfaces—anterior/posterior teeth ( | ||||||||||||||
| 25% of the images | 65 | 28.9 | 126 | 56.0 | 8 | 3.6 | 26 | 11.5 | 84.9 | 71.4 | 94.0 | 89.0 | 82.9 | 0.910 |
| 50% of the images | 62 | 27.6 | 143 | 63.5 | 11 | 4.9 | 9 | 4.0 | 91.1 | 87.3 | 92.9 | 84.9 | 94.1 | 0.954 |
| 75% of the images | 67 | 29.8 | 137 | 60.9 | 6 | 2.7 | 15 | 6.6 | 90.7 | 81.7 | 95.8 | 91.8 | 90.1 | 0.952 |
| 100% of the images | 67 | 29.8 | 142 | 63.1 | 6 | 2.7 | 10 | 4.4 | 92.9 | 87.0 | 95.9 | 91.8 | 93.4 | 0.964 |
| Results from occlusal surfaces—molars/premolars ( | ||||||||||||||
| 25% of the images | 91 | 36.0 | 131 | 51.8 | 16 | 6.3 | 15 | 5.9 | 87.7 | 85.8 | 89.1 | 85.0 | 89.7 | 0.943 |
| 50% of the images | 86 | 34.0 | 136 | 53.7 | 21 | 8.3 | 10 | 4.0 | 87.7 | 89.6 | 86.6 | 80.4 | 93.2 | 0.949 |
| 75% of the images | 92 | 36.4 | 138 | 54.5 | 15 | 5.9 | 8 | 3.2 | 90.9 | 92.0 | 90.2 | 86.0 | 94.5 | 0.961 |
| 100% of the images | 96 | 37.9 | 137 | 54.2 | 11 | 4.3 | 9 | 3.6 | 92.1 | 91.4 | 92.6 | 89.7 | 93.8 | 0.968 |
The calculations were performed for different types of teeth, surfaces, and training steps, which resulted in different subsamples.
ACC, accuracy; AUC, area under the receiver operating characteristic curve; SE, sensitivity; SP, specificity; NPV, negative predictive value; PPV, positive predictive value.
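As a worked check, the diagnostic measures can be recomputed from the four counts in a table row. The sketch below uses the textbook definitions and the counts as they are labeled in the "100% of the images" row for all teeth and surfaces (163, 280, 17, 19); the function name is hypothetical, and the definitions are an assumption about how the values were derived.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Textbook diagnostic measures, in percent, from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "ACC": 100 * (tp + tn) / total,  # share of all correct decisions
        "SE": 100 * tp / (tp + fn),      # sensitivity
        "SP": 100 * tn / (tn + fp),      # specificity
        "PPV": 100 * tp / (tp + fp),     # positive predictive value
        "NPV": 100 * tn / (tn + fn),     # negative predictive value
    }


# Counts as labeled in the table row "100% of the images" (all surfaces).
metrics = diagnostic_metrics(tp=163, tn=280, fp=17, fn=19)
rounded = {name: round(value, 1) for name, value in metrics.items()}
```

Rounded to one decimal, ACC, SE, and SP come out as 92.5, 89.6, and 94.3, matching the row. With these definitions PPV and NPV come out as 90.6 and 93.6, which the table lists under NPV and PPV respectively, so the column labels may follow a different convention for which class counts as positive.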
Overview of the Model Performance of the Convolutional Neural Network When the Independent Test Set (n = 479 with 180 Healthy Tooth Surfaces, 216 Noncavitated Carious Lesions, and 83 Cavitations) Was Used for Detection of Cavitations.
| Detection of Cavitation | True Positives, n | True Positives, % | True Negatives, n | True Negatives, % | False Positives, n | False Positives, % | False Negatives, n | False Negatives, % | ACC (%) | SE (%) | SP (%) | NPV (%) | PPV (%) | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Results from all the included teeth and surfaces ( | ||||||||||||||
| 25% of the images | 382 | 79.7 | 53 | 11.1 | 14 | 2.9 | 30 | 6.3 | 90.8 | 92.7 | 79.1 | 96.5 | 63.9 | 0.916 |
| 50% of the images | 381 | 79.6 | 61 | 12.7 | 15 | 3.1 | 22 | 4.6 | 92.3 | 94.5 | 80.3 | 96.2 | 73.5 | 0.931 |
| 75% of the images | 382 | 79.8 | 61 | 12.7 | 14 | 2.9 | 22 | 4.6 | 92.5 | 94.6 | 81.3 | 96.5 | 73.5 | 0.948 |
| 100% of the images | 381 | 79.5 | 66 | 13.8 | 15 | 3.1 | 17 | 3.6 | 93.3 | 95.7 | 81.5 | 96.2 | 79.5 | 0.955 |
| Results from anterior surfaces—incisors/canines ( | ||||||||||||||
| 25% of the images | 106 | 69.3 | 23 | 15.0 | 7 | 4.6 | 17 | 11.1 | 84.3 | 86.2 | 76.7 | 93.8 | 57.5 | 0.887 |
| 50% of the images | 108 | 70.6 | 29 | 18.9 | 5 | 3.3 | 11 | 7.2 | 89.5 | 90.8 | 85.3 | 95.6 | 72.5 | 0.916 |
| 75% of the images | 109 | 71.2 | 27 | 17.7 | 4 | 2.6 | 13 | 8.5 | 88.9 | 89.3 | 87.1 | 96.5 | 67.5 | 0.932 |
| 100% of the images | 109 | 71.2 | 33 | 21.6 | 4 | 2.6 | 7 | 4.6 | 92.8 | 94.0 | 89.2 | 96.5 | 82.5 | 0.951 |
| Results from posterior surfaces—molars/premolars ( | ||||||||||||||
| 25% of the images | 276 | 84.7 | 30 | 9.2 | 7 | 2.1 | 13 | 4.0 | 93.9 | 95.5 | 81.1 | 97.5 | 69.8 | 0.941 |
| 50% of the images | 273 | 83.7 | 32 | 9.8 | 10 | 3.1 | 11 | 3.4 | 93.6 | 96.1 | 76.2 | 96.5 | 74.4 | 0.932 |
| 75% of the images | 273 | 83.7 | 34 | 10.4 | 10 | 3.1 | 9 | 2.8 | 94.2 | 96.8 | 77.3 | 96.5 | 79.1 | 0.967 |
| 100% of the images | 272 | 83.4 | 33 | 10.1 | 11 | 3.4 | 10 | 3.1 | 93.6 | 96.5 | 75.0 | 96.1 | 76.7 | 0.957 |
| Results from vestibular and oral surfaces—anterior/posterior teeth ( | ||||||||||||||
| 25% of the images | 155 | 68.9 | 39 | 17.3 | 12 | 5.3 | 19 | 8.5 | 86.2 | 89.1 | 76.5 | 92.8 | 67.2 | 0.884 |
| 50% of the images | 156 | 69.3 | 43 | 19.1 | 11 | 4.9 | 15 | 6.7 | 88.4 | 91.2 | 79.6 | 93.4 | 74.1 | 0.923 |
| 75% of the images | 160 | 71.1 | 41 | 18.2 | 7 | 3.1 | 17 | 7.6 | 89.3 | 90.4 | 85.4 | 95.8 | 70.7 | 0.937 |
| 100% of the images | 159 | 70.7 | 47 | 20.9 | 8 | 3.5 | 11 | 4.9 | 91.6 | 93.5 | 85.5 | 95.2 | 81.0 | 0.943 |
| Results from occlusal surfaces—molars/premolars ( | ||||||||||||||
| 25% of the images | 227 | 89.7 | 13 | 5.1 | 2 | 0.8 | 11 | 4.4 | 94.9 | 95.4 | 86.7 | 99.1 | 54.2 | 0.939 |
| 50% of the images | 225 | 88.9 | 17 | 6.7 | 4 | 1.6 | 7 | 2.8 | 95.7 | 97.0 | 81.0 | 98.3 | 70.8 | 0.914 |
| 75% of the images | 222 | 87.7 | 19 | 7.5 | 7 | 2.8 | 5 | 2.0 | 95.3 | 97.8 | 73.1 | 96.9 | 79.2 | 0.962 |
| 100% of the images | 222 | 87.7 | 18 | 7.1 | 7 | 2.8 | 6 | 2.4 | 94.9 | 97.4 | 72.0 | 96.9 | 75.0 | 0.966 |
The calculations were performed for different types of teeth, surfaces, and training steps, which resulted in different subsamples.
ACC, accuracy; AUC, area under the receiver operating characteristic curve; SE, sensitivity; SP, specificity; NPV, negative predictive value; PPV, positive predictive value.
Overview of the Model Performance of the Convolutional Neural Network in Relation to the Main Diagnostic Classes from the Independent Test Set (n = 479).
| Detection of Cavitation | True Positives, n | True Positives, % | True Negatives, n | True Negatives, % | False Positives, n | False Positives, % | False Negatives, n | False Negatives, % | ACC (%) | SE (%) | SP (%) | NPV (%) | PPV (%) | AUC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Results from caries-free teeth or surfaces ( | ||||||||||||||
| 25% of the images | 156 | 86.7 | 0 | 0.0 | 0 | 0.0 | 24 | 13.3 | 86.7 | 86.7 | UC | 0.0 | 100.0 | UC |
| 50% of the images | 148 | 82.2 | 0 | 0.0 | 0 | 0.0 | 32 | 17.8 | 82.2 | 82.2 | UC | 0.0 | 100.0 | UC |
| 75% of the images | 159 | 88.3 | 0 | 0.0 | 0 | 0.0 | 21 | 11.7 | 88.3 | 88.3 | UC | 0.0 | 100.0 | UC |
| 100% of the images | 163 | 90.6 | 0 | 0.0 | 0 | 0.0 | 17 | 9.4 | 90.6 | 90.6 | UC | 0.0 | 100.0 | UC |
| Results from noncavitated caries lesions ( | ||||||||||||||
| 25% of the images | 170 | 78.7 | 0 | 0.0 | 0 | 0.0 | 46 | 21.3 | 78.7 | 78.7 | UC | 0.0 | 100.0 | UC |
| 50% of the images | 187 | 86.6 | 0 | 0.0 | 0 | 0.0 | 29 | 13.4 | 86.6 | 86.6 | UC | 0.0 | 100.0 | UC |
| 75% of the images | 183 | 84.7 | 0 | 0.0 | 0 | 0.0 | 33 | 15.3 | 84.7 | 84.7 | UC | 0.0 | 100.0 | UC |
| 100% of the images | 184 | 85.2 | 0 | 0.0 | 0 | 0.0 | 32 | 14.8 | 85.2 | 85.2 | UC | 0.0 | 100.0 | UC |
| Results from cavitated caries lesions ( | ||||||||||||||
| 25% of the images | 53 | 63.9 | 0 | 0.0 | 0 | 0.0 | 30 | 36.1 | 63.9 | 63.9 | UC | 0.0 | 100.0 | UC |
| 50% of the images | 61 | 73.5 | 0 | 0.0 | 0 | 0.0 | 22 | 26.5 | 73.5 | 73.5 | UC | 0.0 | 100.0 | UC |
| 75% of the images | 61 | 73.5 | 0 | 0.0 | 0 | 0.0 | 22 | 26.5 | 73.5 | 73.5 | UC | 0.0 | 100.0 | UC |
| 100% of the images | 66 | 79.5 | 0 | 0.0 | 0 | 0.0 | 17 | 20.5 | 79.5 | 79.5 | UC | 0.0 | 100.0 | UC |
The calculations included all types of teeth or surfaces, which were classified into each diagnostic category by the independent expert evaluation. As the reference standard served as selection criteria, true-negative and false-positive rates appear as zero values and, in consequence, SP and AUC became uncalculable.
ACC, accuracy; AUC, area under the receiver operating characteristic curve; SE, sensitivity; SP, specificity; NPV, negative predictive value; PPV, positive predictive value; UC, uncalculable.
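The "UC" entries follow from the arithmetic: when only surfaces of one reference class are selected, the true-negative and false-positive counts are both zero, so specificity becomes 0/0. A minimal sketch, assuming the textbook definitions and using the counts from the "100% of the images" row for cavitated lesions (66 and 17); the helper name is hypothetical.

```python
def safe_percentage(numerator, denominator):
    """Return a percentage, or None when the ratio is undefined (0/0)."""
    if denominator == 0:
        return None  # corresponds to "UC" (uncalculable) in the table
    return 100 * numerator / denominator


# Cavitated lesions only (n = 83): every surface belongs to one reference class.
tp, tn, fp, fn = 66, 0, 0, 17

sensitivity = safe_percentage(tp, tp + fn)  # defined: 66 of 83 surfaces
specificity = safe_percentage(tn, tn + fp)  # undefined: no surfaces of the other class
```

The AUC is uncalculable for the same reason: a ROC curve plots sensitivity against the false-positive rate, and the false-positive rate has no denominator when the other class is absent.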
Figure 1. The receiver operating characteristic (ROC) curves illustrate the model performance of the convolutional neural network. Performance is shown for overall caries detection (A) and cavity detection (B) when 25%, 50%, 75%, and 100% of all training images were used. This figure is available in color online.
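For readers unfamiliar with how a ROC curve and its AUC are obtained from model scores, the sketch below lowers a decision threshold step by step, records the false-positive rate and sensitivity at each step, and integrates with the trapezoidal rule. The labels and scores are toy values invented for illustration, not study data, and the function names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false-positive rate, sensitivity) pairs for a falling threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data (not from the study): 1 = carious, 0 = caries free.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

curve = roc_points(labels, scores)
auc = auc_trapezoid(curve)
```

For these toy values the area is 8/9, which equals the share of carious/caries-free pairs in which the carious surface received the higher score.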
Figure 2. Example clinical images and the corresponding test results by the artificial intelligence (AI) algorithms. Furthermore, the illustration includes saliency maps that visualize those image areas (in blue) that were used for decision-making by the convolutional neural network. This figure is available in color online.