| Literature DB >> 33918998 |
Kyoung Min Kim1, Tae-Young Heo2, Aesul Kim3, Joohee Kim3, Kyu Jin Han4, Jaesuk Yun3, Jung Kee Min4.
Abstract
Artificial intelligence (AI)-based diagnostic tools have gained acceptance in ophthalmology. The use of retinal images, such as fundus photographs, is a promising approach for developing AI-based diagnostic platforms. Retinal pathologies occur in a broad spectrum of eye diseases, including neovascular or dry age-related macular degeneration, epiretinal membrane, rhegmatogenous retinal detachment, retinitis pigmentosa, macular hole, retinal vein occlusion, and diabetic retinopathy. Here, we report a fundus image-based AI model for the differential diagnosis of retinal diseases. We classified retinal images with three convolutional neural network models: ResNet50, VGG19, and Inception v3. Furthermore, the performance of several dense (fully connected) layer configurations was compared. For the nine-class diagnosis of eight retinal diseases plus normal controls, the ResNet50 model with an added 128-node dense layer achieved a prediction accuracy of 87.42%. Furthermore, our AI tool augmented ophthalmologists' performance in the diagnosis of retinal disease. These results suggest that the fundus image-based AI tool is applicable to the medical diagnosis of retinal diseases.
Keywords: artificial intelligence; class activation map; convolutional neural network; fundus photograph; retinal diseases
Year: 2021 PMID: 33918998 PMCID: PMC8142986 DOI: 10.3390/jpm11050321
Source DB: PubMed Journal: J Pers Med ISSN: 2075-4426
Fundus images were collected from 549 eyes diagnosed with dAMD, nAMD, DR, ERM, RRD, RP, MH, or RVO. In addition, fundus images of 79 eyes from healthy subjects were collected as controls.
| Disease | dAMD | nAMD | DR | ERM | RRD | RP | MH | RVO | Control |
|---|---|---|---|---|---|---|---|---|---|
| Fundus images (n) | 58 | 79 | 95 | 99 | 80 | 50 | 49 | 39 | 79 |
| Gender: male | 27 | 41 | 53 | 53 | 47 | 26 | 31 | 19 | 40 |
| Gender: female | 31 | 38 | 42 | 46 | 33 | 24 | 18 | 20 | 39 |
| Age (years) | 69.6 ± 8.0 | 69.1 ± 8.3 | 53.2 ± 10.4 | 63.6 ± 7.6 | 54.4 ± 14.6 | 53.4 ± 11.0 | 64.2 ± 8.9 | 67.5 ± 8.0 | 56.7 ± 7.3 |
Age (years) are presented as mean ± standard deviation. dAMD, dry age-related macular degeneration; nAMD, neovascular age-related macular degeneration; DR, diabetic retinopathy; ERM, epiretinal membrane; RRD, rhegmatogenous retinal detachment; RP, retinitis pigmentosa; MH, macular hole; RVO, retinal vein occlusion.
Values of the arguments used in ImageDataGenerator for data augmentation.
| | Argument | Value |
|---|---|---|
| (1) | width_shift_range | 0.4 |
| (2) | height_shift_range | 0.2 |
| (3) | rotation_range | 90 |
| (4) | zoom_range | 0.1 |
| (5) | horizontal_flip | True |
| | vertical_flip | |
| (6) | shear_range | 30 |
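The arguments above correspond to the Keras `ImageDataGenerator` API; a minimal configuration sketch is shown below. The `vertical_flip` value is not stated in the table, so it is left at its default here, and the batch size in the usage comment is an assumption.

```python
# Sketch of the augmentation configuration listed above, using the
# Keras ImageDataGenerator API.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(
    width_shift_range=0.4,   # (1) horizontal translation, fraction of width
    height_shift_range=0.2,  # (2) vertical translation, fraction of height
    rotation_range=90,       # (3) random rotation, degrees
    zoom_range=0.1,          # (4) random zoom
    horizontal_flip=True,    # (5) random left-right flip
    shear_range=30,          # (6) shear intensity, degrees
)
# Typical use: batches = datagen.flow(x_train, y_train, batch_size=32)
```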
Figure 1. Comparison of three different convolutional neural network (CNN) architectures. (A) VGG19, Inception v3, and ResNet50 models. (B) The Inception node uses various convolutional filters. (C) The residual node maps the necessary information through a residual connection.
Comparison of outcomes of convolutional neural network models in two-class (normal vs. disease) diagnosis (accuracy).
| Model | VGG19 | Inception v3 | ResNet50 |
|---|---|---|---|
| Accuracy | 99.12% | 98.08% | 97.85% |
VGG19, Visual Geometry Group with 19 layers; ResNet50, Deep Residual Learning for Image Recognition with 50 layers.
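The three backbones compared above can be adapted to fundus classification by attaching a custom head; a hypothetical Keras configuration sketch for the ResNet50 variant with a 128-node dense layer is shown below. The input size, activation choices, and training settings are assumptions, not the paper's exact setup.

```python
# Sketch: ImageNet-pretrained ResNet50 backbone plus a 128-node dense
# layer and a 9-way softmax output (8 diseases + normal control).
from tensorflow.keras import layers, models
from tensorflow.keras.applications import ResNet50

base = ResNet50(weights="imagenet", include_top=False, pooling="avg",
                input_shape=(224, 224, 3))  # input size is an assumption

model = models.Sequential([
    base,
    layers.Dense(128, activation="relu"),   # added dense layer (128 nodes)
    layers.Dense(9, activation="softmax"),  # nine diagnostic classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
```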
Accuracy results obtained using 5-fold cross-validation.
| Dense Layer | VGG19 | Inception v3 | ResNet50 |
|---|---|---|---|
| 128 nodes | 0.8200 ± 0.0282 | 0.8340 ± 0.0364 | 0.8742 ± 0.0349 |
| 256 nodes | 0.8135 ± 0.0315 | 0.8212 ± 0.0444 | 0.8646 ± 0.0205 |
| 128 nodes + 128 nodes | 0.8168 ± 0.0243 | 0.8360 ± 0.0115 | 0.8694 ± 0.0338 |
| 256 nodes + 256 nodes | 0.8026 ± 0.0365 | 0.8483 ± 0.0381 | 0.8452 ± 0.0351 |
The data are shown as mean ± standard deviation. VGG19, Visual Geometry Group with 19 layers; ResNet50, Deep Residual Learning for Image Recognition with 50 layers.
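The 5-fold cross-validation underlying these accuracies can be sketched as a plain index split; the helper below is illustrative, and the paper's exact fold assignment (e.g., stratification by class) is not specified.

```python
import numpy as np

def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Shuffle sample indices and partition them into n_folds
    train/validation splits, as in k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    folds = np.array_split(idx, n_folds)
    splits = []
    for k in range(n_folds):
        val = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        splits.append((train, val))
    return splits

# 549 disease images + 79 controls = 628 images in total
splits = five_fold_splits(628)
```

Each image then serves as validation data exactly once, and the reported mean ± standard deviation is taken over the five fold accuracies.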
Cross-validation results of the classification performance indices for nine-class diagnosis (accuracy, sensitivity, specificity, PPV, NPV).
| Model | Accuracy | Class | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|---|---|
| ResNet50 with 128 nodes | 87.42% | dAMD | 0.8190 | 0.9844 | 0.8439 | 0.9770 |
| | | DR | 0.9262 | 0.9833 | 0.9052 | 0.9868 |
| | | ERM | 0.9252 | 0.9830 | 0.9089 | 0.9850 |
| | | MH | 0.8192 | 0.7960 | 0.7556 | 0.9861 |
| | | Normal | 0.8830 | 0.9873 | 0.9092 | 0.9800 |
| | | RP | 0.9085 | 0.9966 | 0.9600 | 0.9914 |
| | | RRD | 0.8143 | 0.9870 | 0.9125 | 0.9671 |
| | | RVO | 0.8514 | 0.9916 | 0.8750 | 0.9882 |
| | | wAMD | 0.9708 | 0.9667 | 0.7600 | 0.9964 |
NPV, negative predictive value; PPV, positive predictive value; ResNet50, Deep Residual Learning for Image Recognition with 50 layers; wAMD, wet (neovascular) age-related macular degeneration.
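The per-class sensitivity, specificity, PPV, and NPV above are one-vs-rest statistics derived from a multi-class confusion matrix; the numpy sketch below shows that derivation (illustrative code, not the authors' implementation).

```python
import numpy as np

def one_vs_rest_metrics(cm, cls):
    """Per-class metrics from a confusion matrix cm
    (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    tp = cm[cls, cls]                  # true positives for this class
    fn = cm[cls].sum() - tp            # rest of the true-class row
    fp = cm[:, cls].sum() - tp         # rest of the predicted-class column
    tn = cm.sum() - tp - fn - fp       # everything else
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }

# Toy two-class example: 8 of 10 positives and 9 of 10 negatives correct.
metrics = one_vs_rest_metrics([[8, 2], [1, 9]], cls=0)
```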
Figure 2. Examples of Gradient-weighted Class Activation Mapping (Grad-CAM) visualization of retinal diseases. Grad-CAM visualization of (A) dAMD, (B) wAMD, (C) DR, (D) ERM, (E) MH, (F) RP, (G) RRD, (H) RVO, and (I) a normal retina. Grad-CAM extracts the feature map of the last convolutional layer and overlays a heatmap on the image that reflects the calculated weights of the feature map. Heatmap images show that the AI tool identified pathological changes, such as drusen, bleeding, elevation of the center, pigmentation, surface wrinkling, and retinal detachment. In normal controls, however, the center of the macula is identified, with no degenerated area.
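The heatmap computation described in the caption reduces to a gradient-weighted sum of feature maps. The numpy sketch below shows that core step; in the actual method, `gradients` is obtained by backpropagating the class score through the CNN rather than being supplied directly.

```python
import numpy as np

def grad_cam_heatmap(feature_maps, gradients):
    """Grad-CAM core step: weight each channel of the last conv
    layer's feature maps by the spatially averaged gradient of the
    class score, sum over channels, and keep only positive evidence.
    Both inputs have shape (H, W, K)."""
    alphas = gradients.mean(axis=(0, 1))                    # one weight per channel
    cam = np.tensordot(feature_maps, alphas, axes=([2], [0]))
    cam = np.maximum(cam, 0)                                # ReLU
    if cam.max() > 0:
        cam /= cam.max()                                    # normalize to [0, 1]
    return cam                                              # shape (H, W)
```

The resulting map is upsampled to the input image size and overlaid as the heatmap shown in the figure.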
Figure 3. Heatmap and probability for a correct classification and a misclassification. (A) The AI correctly diagnosed an RP fundus photograph as RP with 100% probability. (B) The AI diagnosed an nAMD fundus photograph as DR with 57.67% probability and as nAMD with 33.2% probability.
Comparison of diagnostic results before and after referring to the AI results. A total of 180 fundus photographs were evaluated, 20 images extracted for each of the nine classes.
| Metric | AI | Before: R1 | Before: R2 | Before: R3 | Before: R4 | After: R1 | After: R2 | After: R3 | After: R4 |
|---|---|---|---|---|---|---|---|---|---|
| Wrong count | 29 | 15 | 21 | 28 | 27 | 14 | 17 | 29 | 23 |
| Accuracy (%) | 83.9 | 91.7 | 88.3 | 84.4 | 85.0 | 92.2 | 90.6 | 83.9 | 87.2 |
| Time (min) | | 50 | 70 | 75 | 32 | 15 | 25 | 24 | 25 |
R1–R4, ophthalmology residents 1–4; "Before" and "After" denote diagnoses made before and after referring to the AI results.