Naoto Shibata, Masaki Tanito, Keita Mitsuhashi, Yuri Fujino, Masato Matsuura, Hiroshi Murata, Ryo Asaoka.
Abstract
The purpose of the study was to develop a deep residual learning algorithm to screen for glaucoma from fundus photography and to compare its diagnostic performance with that of Residents in Ophthalmology. The training dataset consisted of 1,364 color fundus photographs with glaucomatous indications and 1,768 color fundus photographs without glaucomatous features. The testing dataset consisted of 60 eyes of 60 glaucoma patients and 50 eyes of 50 normal subjects. Using the training dataset, a deep learning algorithm known as Deep Residual Learning for Image Recognition (ResNet) was developed to discriminate glaucoma, and its diagnostic accuracy was validated in the testing dataset using the area under the receiver operating characteristic curve (AROC). The presence of glaucoma in the testing dataset was also assessed by three Residents in Ophthalmology. The deep learning algorithm achieved significantly higher diagnostic performance than the Residents in Ophthalmology: with ResNet, the AROC from all testing data was 96.5 (95% confidence interval [CI]: 93.5 to 99.6)%, while the AROCs obtained by the three Residents were between 72.6% and 91.2%.
Year: 2018 PMID: 30279554 PMCID: PMC6168579 DOI: 10.1038/s41598-018-33013-w
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Demographics of subjects in testing dataset.
| | G | mG | N | mN | p value |
|---|---|---|---|---|---|
| n | 33 | 28 | 27 | 22 | |
| Age (years) | |||||
| Mean ± SD | 68.7 ± 7.9 | 60.2 ± 12.1 | 66.9 ± 10.3 | 42.1 ± 16.6 | <0.0001a |
| 95% CI | 65.9–71.5 | 55.5–64.9 | 62.8–71.0 | 34.8–49.5 | |
| Sex | |||||
| Men, n (%) | 12 (36) | 12 (43) | 15 (56) | 7 (32) | 0.3324b |
| Women, n (%) | 21 (64) | 16 (57) | 12 (44) | 15 (68) | |
| Eye | |||||
| Right, n (%) | 19 (58) | 16 (57) | 8 (30) | 11 (50) | 0.1160b |
| Left, n (%) | 14 (42) | 12 (43) | 19 (70) | 11 (50) | |
| BCVA (LogMAR) | |||||
| Mean ± SD | 0.02 ± 0.14 | 0.08 ± 0.22 | −0.03 ± 0.07 | −0.05 ± 0.05 | 0.0049a |
| 95% CI | −0.03–0.06 | 0.00–0.16 | −0.06–0.00 | −0.08–−0.03 | |
| IOP (mmHg) | |||||
| Mean ± SD | 13.5 ± 3.0 | 15.6 ± 5.5 | 13.3 ± 2.9 | 14.7 ± 2.9 | 0.0751a |
| 95% CI | 12.4–14.5 | 13.5–17.8 | 12.2–14.5 | 13.4–16.0 | |
| Spherical equivalent refractive error (D) | |||||
| Mean ± SD | −1.7 ± 2.2 | −9.1 ± 3.1 | −0.45 ± 2.3 | −8.3 ± 1.7 | <0.0001a |
| 95% CI | −2.5–−1.0 | −10.3–−7.9 | −1.4–+0.5 | −9.1–−7.5 | |
| Axial length (mm) | |||||
| Mean ± SD | 24.0 ± 1.4 | 26.6 ± 1.2 | 23.6 ± 1.4 | 26.3 ± 1.1 | <0.0001a |
| 95% CI | 23.5–24.5 | 26.2–27.1 | 23.1–24.2 | 25.8–26.8 | |
| vC/D ratio | |||||
| Mean ± SD | 0.73 ± 0.15 | 0.79 ± 0.11 | 0.53 ± 0.15 | 0.37 ± 0.12 | <0.0001a |
| 95% CI | 0.69–0.79 | 0.74–0.83 | 0.48–0.59 | 0.31–0.42 | |
| cpRNFLT (μm) | |||||
| Mean ± SD | 71.3 ± 17.0 | 63.6 ± 14.8 | 98.0 ± 9.1 | 97.2 ± 10.4 | <0.0001a |
| 95% CI | 65.2–77.3 | 57.8–69.3 | 94.4–101.6 | 92.6–101.8 | |
| mIRT (μm) | |||||
| Mean ± SD | 75.0 ± 13.7 | 67.1 ± 10.4 | 97.9 ± 7.6 | 94.4 ± 6.2 | <0.0001a |
| 95% CI | 70.2–79.9 | 63.0–71.1 | 94.9–100.9 | 91.7–97.2 | |
P values are calculated among the four groups by one-way analysis of variance (ANOVA) (a) for continuous variables, and by the chi-square test (b) for categorical variables.
G: non-highly myopic glaucoma subjects, mG: highly myopic glaucoma subjects, N: non-highly myopic normative subjects, mN: highly myopic normative subjects, SD: standard deviation, 95% CI: 95% confidence interval, BCVA: best-corrected visual acuity, LogMAR: logarithm of the minimum angle of resolution, IOP: intraocular pressure, D: diopter, vC/D ratio: vertical cup-to-disc ratio, cpRNFLT: circumpapillary retinal nerve fiber layer thickness, mIRT: macular inner retinal thickness.
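As a worked illustration of the chi-square test named in the footnote, the sketch below recomputes the test for the Sex rows of the table. It is a minimal standard-library sketch, not the authors' analysis code: the function names are hypothetical, and the closed-form tail probability used here is valid only for 3 degrees of freedom, which is what a 2 × 4 table gives.

```python
import math

def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column variables.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

def chi_square_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# Sex counts copied from the table; columns are G, mG, N, mN.
sex_counts = [
    [12, 12, 15, 7],   # men
    [21, 16, 12, 15],  # women
]
stat = chi_square_statistic(sex_counts)
p_value = chi_square_sf_df3(stat)  # (2 - 1) * (4 - 1) = 3 degrees of freedom
```

With these counts the statistic is about 3.41 and the p value about 0.33, in line with the 0.3324 listed for Sex in the table.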
AROC values obtained with internal three-fold cross validation.
| | Fold 1 for validation | Fold 2 for validation | Fold 3 for validation |
|---|---|---|---|
| Iteration 1 (%) | 95.0 | 94.1 | 96.0 |
| Iteration 2 (%) | 94.2 | 94.9 | 95.9 |
| Iteration 3 (%) | 95.2 | 94.6 | 95.3 |
AROC values were obtained using three-fold cross validation.
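The splitting scheme behind this table can be sketched as follows. This is an illustrative sketch only: the record does not say how images were assigned to folds, so the reshuffle per iteration, the seed handling, and the function name `three_fold_splits` are assumptions, and no stratification by class is attempted.

```python
import random

def three_fold_splits(n_samples, n_iterations=3, n_folds=3, seed=0):
    """Yield (iteration, fold, train_idx, val_idx) for repeated k-fold cross validation."""
    for iteration in range(n_iterations):
        indices = list(range(n_samples))
        # Assumption: each iteration uses a fresh random shuffle of the images.
        random.Random(seed + iteration).shuffle(indices)
        folds = [indices[k::n_folds] for k in range(n_folds)]
        for k in range(n_folds):
            val_idx = folds[k]
            train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
            yield iteration, k, train_idx, val_idx

# Training set size stated in the abstract: 1,364 + 1,768 photographs.
n_images = 1364 + 1768
splits = list(three_fold_splits(n_images))
```

Each of the nine splits holds out one third of the training photographs for validation, which corresponds to one cell of the table.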
Figure 1. The deep residual learning algorithm to diagnose glaucoma using fundus photography. In ResNet, after two instances of convolution and batch normalization, the input is added to the raw output. (a) shows the scheme of the classifier. The network is strongly influenced by ResNet, which has skip connections in each residual block to promote efficient training of deeper layers. This network has 18 convolutional layers in total. (b) shows the detailed explanation of the residual blocks. In the case of (a), the shapes of the input and output are the same. On the other hand, (b) doubles the number of channels with the second convolution, while width and height are halved with max pooling. When the input is added to the output, half of the filters added are zero-padded so that the shapes match. ResNet: residual network.
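The shape-matching step described in the caption can be illustrated with a small pure-Python sketch. It covers only the shortcut addition, with the convolution and batch-normalization path replaced by a stand-in output. Feature maps are nested lists indexed as [channel][row][column], and the helper names, as well as the choice to apply max pooling to the shortcut input, are assumptions made for illustration.

```python
def max_pool_2x2(channel):
    """2x2 max pooling with stride 2 on one channel (a list of rows)."""
    return [
        [max(channel[r][c], channel[r][c + 1], channel[r + 1][c], channel[r + 1][c + 1])
         for c in range(0, len(channel[0]) - 1, 2)]
        for r in range(0, len(channel) - 1, 2)
    ]

def zero_pad_channels(feature_map, out_channels):
    """Append all-zero channels until the map has out_channels channels."""
    height, width = len(feature_map[0]), len(feature_map[0][0])
    n_missing = out_channels - len(feature_map)
    padding = [[[0.0] * width for _ in range(height)] for _ in range(n_missing)]
    return feature_map + padding

def add_shortcut(block_input, block_output):
    """Add the input to the block output, pooling and zero-padding the input if needed."""
    shortcut = block_input
    if len(block_output[0]) != len(block_input[0]):  # height and width were halved
        shortcut = [max_pool_2x2(ch) for ch in shortcut]
    if len(block_output) != len(shortcut):           # channel count was doubled
        shortcut = zero_pad_channels(shortcut, len(block_output))
    return [
        [[s + o for s, o in zip(s_row, o_row)] for s_row, o_row in zip(s_ch, o_ch)]
        for s_ch, o_ch in zip(shortcut, block_output)
    ]

# One 4x4 input channel; stand-in block output with two 2x2 channels.
x = [[[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]]
stand_in_output = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
y = add_shortcut(x, stand_in_output)
```

In this toy case the first output channel receives the pooled input and the second receives only zeros from the shortcut, which is what zero-padding half of the channels amounts to.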
Parameters used in ResNet.
| Learning Rate | Dropout | Batch Size | Momentum SGD: Damping coefficient | Momentum SGD: Weight Decay |
|---|---|---|---|---|
| 05 to 0.1 | 0.5 | 64 | 0.9 | 0.0001 |
SGD: stochastic gradient descent, ResNet: residual network. The learning rate was decayed exponentially as training progressed.
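To make the optimizer settings concrete, the following sketch applies momentum SGD with weight decay and an exponentially decayed learning rate to a toy quadratic. Several things here are assumptions rather than details from the table: the damping coefficient of 0.9 is read as the momentum coefficient, the starting learning rate of 0.1 is simply the upper end of the listed range, and the decay constant `decay_rate` is a made-up value, since the actual schedule is not given.

```python
import math

def learning_rate(step, initial_lr=0.1, decay_rate=0.01):
    """Exponentially decayed learning rate; both defaults are illustrative assumptions."""
    return initial_lr * math.exp(-decay_rate * step)

def momentum_sgd_step(weights, grads, velocity, lr, momentum=0.9, weight_decay=0.0001):
    """One momentum-SGD update with L2 weight decay folded into the gradient."""
    new_velocity = [
        momentum * v - lr * (g + weight_decay * w)
        for w, g, v in zip(weights, grads, velocity)
    ]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity

# Toy problem: minimise f(w) = sum(w_i ** 2), whose gradient is 2 * w.
weights, velocity = [1.0, -2.0], [0.0, 0.0]
for step in range(200):
    grads = [2.0 * w for w in weights]
    weights, velocity = momentum_sgd_step(weights, grads, velocity, learning_rate(step))
```

On this toy objective the weights shrink toward zero; in a real training run the gradients would come from backpropagation over mini-batches of 64 images.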
Figure 2. External validation: receiver operating characteristic curve obtained in the testing dataset (N = 110). The AROC with ResNet was 96.52 (95% confidence interval [CI]: 95.6 to 98.7)%. The AROC values of the three Residents in Ophthalmology were 72.6 (95% CI: 64.1 to 81.1)%, 87.7 (95% CI: 82.3 to 93.2)%, and 91.2 (95% CI: 85.9 to 96.47)%, all significantly lower than that of ResNet. P values were obtained by comparing the AROC with ResNet and those of Residents in Ophthalmology A, B, and C (DeLong’s method with adjustment for multiple comparisons). AROC: area under the receiver operating characteristic curve. ResNet: residual network.
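The AROC itself, together with a DeLong-style variance estimate for a single classifier, can be sketched in plain Python as below. This is an illustration of the general technique, not the authors' pipeline: the record does not state how the confidence intervals were computed, the paired comparison between two readers and the multiple-comparison adjustment are not covered, and the scores are made-up numbers rather than study data.

```python
import math

def _psi(pos_score, neg_score):
    """Mann-Whitney kernel: 1 if the positive case scores higher, 0.5 on ties, else 0."""
    if pos_score > neg_score:
        return 1.0
    if pos_score == neg_score:
        return 0.5
    return 0.0

def aroc_with_delong_variance(pos_scores, neg_scores):
    """Return (AROC, DeLong variance estimate) for one classifier."""
    m, n = len(pos_scores), len(neg_scores)
    # Structural components: the kernel averaged over the opposite class.
    v10 = [sum(_psi(p, q) for q in neg_scores) / n for p in pos_scores]
    v01 = [sum(_psi(p, q) for p in pos_scores) / m for q in neg_scores]
    auc = sum(v10) / m
    s10 = sum((v - auc) ** 2 for v in v10) / (m - 1)
    s01 = sum((v - auc) ** 2 for v in v01) / (n - 1)
    return auc, s10 / m + s01 / n

def normal_ci(auc, variance, z=1.96):
    """Normal-approximation 95% confidence interval, clipped to [0, 1]."""
    half_width = z * math.sqrt(variance)
    return max(0.0, auc - half_width), min(1.0, auc + half_width)

# Made-up classifier scores, not data from the study.
glaucoma_scores = [0.9, 0.8, 0.4]
normal_scores = [0.7, 0.3, 0.2]
auc, var = aroc_with_delong_variance(glaucoma_scores, normal_scores)
low, high = normal_ci(auc, var)
```

Here eight of the nine glaucoma-versus-normal pairs are ranked correctly, so the AROC is 8/9. Comparing two readers on the same eyes would additionally need the covariance between their structural components.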
Figure 3. Receiver operating characteristic curve obtained between the G and N groups in the testing dataset. The AROC with ResNet was 97.1 (95% confidence interval [CI]: 93.3 to 100.0)%. The AROC values of the three Residents in Ophthalmology were 77.4 (95% CI: 67.0 to 87.9)%, 84.9 (95% CI: 76.9 to 92.8)%, and 93.7 (95% CI: 86.8 to 99.8)%. P values were obtained by comparing the AROC with ResNet and those of Residents in Ophthalmology A, B, and C (DeLong’s method with adjustment for multiple comparisons). AROC: area under the receiver operating characteristic curve, ResNet: residual network, G: non-highly myopic glaucoma patients, N: non-highly myopic normative subjects.
Figure 4. Receiver operating characteristic curve obtained between the mG and mN groups in the testing dataset. The AROC with ResNet was 96.4 (95% CI: 92.0 to 100.0)%. The AROC values of the three Residents in Ophthalmology were 66.6 (95% CI: 53.4 to 79.9)%, 91.2 (95% CI: 83.9 to 98.3)%, and 88.8 (95% CI: 80.3 to 97.3)%. P values were obtained by comparing the AROC with ResNet and those of Residents in Ophthalmology A, B, and C (DeLong’s method with adjustment for multiple comparisons). AROC: area under the receiver operating characteristic curve, ResNet: residual network, mG: highly myopic glaucoma patients, mN: highly myopic normative subjects.
AROC values obtained with other models used to diagnose glaucoma.
| | All eyes (N = 110) | N and G groups | mN and mG groups |
|---|---|---|---|
| CNN with 16 layers, similar to VGG16 | 86.3 [79.9–93.0] | 81.8 [71.2–91.4] | 91.2 [83.5–99.0] |
| Random Forests | 77.5 [69.6–85.4] | 76.8 [65.9–87.7] | 78.3 [66.9–89.6] |
| Support Vector Machine | 71.1 [62.7–79.5] | 75.1 [64.1–86.1] | 66.2 [53.0–79.5] |
AROC [95% confidence interval] values were calculated by training (i) a CNN with 16 layers, similar to VGG16, (ii) Random Forests, and (iii) a support vector machine on the whole training dataset, and validating on the testing dataset.
AROC: area under the receiver operating characteristic curve, CNN: convolutional neural network, G: non-highly myopic glaucoma patients, N: non-highly myopic normative subjects, mG: highly myopic glaucoma patients, mN: highly myopic normative subjects.