Erfan Noury1,2, Suria S Mannil3, Robert T Chang3, An Ran Ran4, Carol Y Cheung4, Suman S Thapa5, Harsha L Rao6, Srilakshmi Dasari6, Mohammed Riyazuddin6, Dolly Chang3, Sriharsha Nagaraj6, Clement C Tham4, Reza Zadeh1,7.
Abstract
Purpose: To develop a three-dimensional (3D) deep learning algorithm to detect glaucoma using spectral-domain optical coherence tomography (SD-OCT) optic nerve head (ONH) cube scans and validate its performance on ethnically diverse real-world datasets and on cropped ONH scans.
Year: 2022 PMID: 35551345 PMCID: PMC9145034 DOI: 10.1167/tvst.11.5.11
Source DB: PubMed Journal: Transl Vis Sci Technol ISSN: 2164-2591 Impact factor: 3.048
Figure 1. Building blocks of the dense convolutional blocks used in the convolutional neural network.
Results of the Proposed Model on the Stanford Test and External Data Sets
| Dataset | AUC, 95% CI | Sensitivity, 95% CI | Specificity, 95% CI | F1 Score, 95% CI |
|---|---|---|---|---|
| Stanford | 0.91 (0.90–0.92) | 0.86 (0.80–0.92) | 0.78 (0.68–0.88) | 0.87 (0.86–0.89) |
| Hong Kong | 0.80 (0.78–0.82) | 0.73 (0.67–0.79) | 0.73 (0.61–0.85) | 0.76 (0.75–0.77) |
| India | 0.94 (0.93–0.96) | 0.93 (0.88–0.99) | 0.71 (0.51–0.91) | 0.91 (0.90–0.92) |
| Nepal | 0.87 (0.85–0.90) | 0.79 (0.68–0.90) | 0.79 (0.66–0.92) | 0.80 (0.78–0.83) |
The 95% confidence intervals (CIs) are computed over five independent runs of the model.
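The note above does not say how the intervals were derived from the five runs. As a minimal sketch, assuming a Student t interval around the mean of the per-run scores (the function name and the per-run AUC values below are hypothetical and are not taken from the paper), the computation could look like this:

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 4 degrees of freedom (n = 5 runs).
T_CRIT_DF4 = 2.776


def mean_ci_five_runs(scores):
    """Return (mean, lower, upper) for exactly five per-run scores.

    Assumption: a t-based interval on the mean; the source does not
    state which interval construction was actually used.
    """
    if len(scores) != 5:
        raise ValueError("this sketch hardcodes the t value for five runs")
    mean = statistics.mean(scores)
    # Standard error of the mean, using the sample standard deviation.
    sem = statistics.stdev(scores) / math.sqrt(len(scores))
    half_width = T_CRIT_DF4 * sem
    return mean, mean - half_width, mean + half_width


# Hypothetical per-run AUC values, for illustration only.
runs = [0.90, 0.91, 0.92, 0.91, 0.91]
mean, lower, upper = mean_ci_five_runs(runs)
print(f"{mean:.2f} ({lower:.2f}-{upper:.2f})")
```

With only five runs, an interval of this kind reflects run-to-run training variability rather than sampling uncertainty over patients.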
Figure 2. AUC for all the data sets, with standard deviations computed over five runs of the model to plot the shaded areas.
Figure 3. AUC for the proposed model on the subset of the Stanford test set that was graded by a glaucoma fellowship-trained ophthalmologist, with standard deviations computed over five runs of the model to plot the shaded areas. To assign a ground-truth label, the human grader had access to other screening data, including fundus images, OCT RNFL and GCIPL printouts, IOP values, and visual field parameters, as well as patient history and physical examination data, whereas the model had access only to the OCT scan cube.
Results of the Proposed Model on the Stanford Test Set for Each Myopia Severity Level
| Myopia Severity | Number of Scans (Eyes) | AUC, 95% CI | Sensitivity, 95% CI, % | Specificity, 95% CI, % | F1 Score, 95% CI |
|---|---|---|---|---|---|
| Mild | 166 (67) | 0.92 (0.89–0.95) | 89.37 (84.53–94.21) | 69.09 (59.78–78.40) | 0.87 (0.84–0.91) |
| Moderate | 52 (18) | 0.96 (0.93–1.00) | 91.43 (83.37–99.48) | 89.17 (80.51–97.82) | 0.91 (0.86–0.96) |
| Severe | 51 (13) | 0.99 (0.97–1.00) | 94.47 (91.46–97.48) | 90.00 (73.00–100.0) | 0.97 (0.96–0.98) |
The 95% confidence intervals (CIs) are computed over five independent runs of the model.
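For reference, the sensitivity, specificity, and F1 score reported in these tables follow the standard confusion-matrix definitions. The sketch below assumes glaucoma is the positive class; the function name and the counts are hypothetical and are not the paper's data.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, and F1 from confusion-matrix counts.

    Assumes glaucoma is the positive class and that every denominator
    is non-zero (at least one positive, one negative, one predicted positive).
    """
    sensitivity = tp / (tp + fn)   # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=90, fp=10, tn=40, fn=10)
print(metrics)
```

Sensitivity is the same quantity as recall, which is the term used in the glaucoma severity table.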
Results of the Proposed Model on the Stanford Test Set for Each Glaucoma Severity Level, for Scans Where We Have Glaucoma Severity Information
| Glaucoma Severity | Number of Scans (Eyes) | Recall, 95% CI |
|---|---|---|
| Mild | 225 (50) | 0.84 (0.74–0.94) |
| Moderate | 70 (20) | 0.92 (0.89–0.95) |
| Severe | 66 (29) | 0.98 (0.97–1.00) |
The 95% confidence intervals (CIs) are computed over five independent runs of the model.
Observed Causes of False Predictions of Glaucoma Versus Normal on the Stanford Test Set
| False Predictions | Number (%) of Eyes |
|---|---|
| False positives | 15 |
| Age >70 | 11 (73.3) |
| Severe myopia with tilted discs | 2 (13.3) |
| Large CD (>0.7) | 1 (6.6) |
| Causes unidentifiable | 1 (6.6) |
| False negatives | 34 |
| Mild glaucoma MD >6 | 26 (76.47) |
| Small CD (<0.3) | 3 (8.82) |
| Age <50 | 3 (8.82) |
| Causes unidentifiable | 2 (5.88) |
Figure 4. Saliency visualizations for two cases from the Stanford test set. (a) Top and (b) side view of saliency visualizations of a correctly classified normal eye. (c) Top and (d) side view of saliency visualizations of a correctly classified glaucomatous eye. In most cases, a highlight in the lamina cribrosa region is correlated with a glaucoma prediction, whereas for normal predictions the retinal layer is mostly highlighted. Saliency visualizations were obtained with respect to the predicted class. Regions with higher values are more salient for the model's final prediction.
Results of the Proposed Model Trained With the DiagFind Algorithm, on the Cropped Scans From the Stanford Test Set for Each Myopia Severity Level
| Myopia Severity | Number of Scans (Eyes) | AUC | Sensitivity, % | Specificity, % | F1 Score |
|---|---|---|---|---|---|
| Mild | 24 (24) | 0.77 | 71.43 | 50.00 | 0.69 |
| Moderate | 7 (7) | 0.75 | 75.00 | 66.67 | 0.75 |
| Severe | 4 (4) | 1.00 | 100 | 100 | 1.00 |
The number of cropped scans with myopia severity information that have severe and moderate levels of myopia is very small.
| Algorithm 1: DiagFind | |
|---|---|
| 1: | Train a neural network on a medical imagery classification task. |
| 2: | Utilize saliency methods to find areas of potential sensitivity, and confirm these areas are useful by consulting a domain expert (e.g., a glaucoma-specialized ophthalmologist for this paper). |
| 3: | Further refine these areas of sensitivity to those that correlate with a diagnostic label for which the model is being trained. |
| 4: | Redo training, while utilizing a cropping data augmentation that crops the focus onto the areas of sensitivity. |
| 5: | Manually crop a number of evaluation data points to the area of interest and evaluate and measure the performance of the model on the cropped data. |
| 6: | If the resulting performance of the model is nontrivial, it shows that the identified area contains useful diagnostic information for the given medical imagery problem, since the model has no input other than the area of interest. |
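Step 4 of the algorithm describes a cropping augmentation that focuses on the area of sensitivity. The source gives no implementation details, so the following is only a sketch of one way such a crop could be chosen: a fixed-size box centered on a region of interest with a small random shift, clamped to the volume. The function name, the volume shape, the center, the crop size, and the jitter values are all hypothetical.

```python
import random


def jittered_crop_box(volume_shape, center, crop_size, max_jitter, rng=random):
    """Return per-axis (start, stop) bounds of a crop around `center`.

    Each axis is shifted by a random integer in [-jitter, +jitter] and
    then clamped so the crop stays fully inside the volume.
    All arguments are per-axis sequences of equal length.
    """
    box = []
    for dim, c, size, jitter in zip(volume_shape, center, crop_size, max_jitter):
        if size > dim:
            raise ValueError("crop size exceeds volume size on some axis")
        shift = rng.randint(-jitter, jitter)
        start = c + shift - size // 2
        start = max(0, min(start, dim - size))  # keep the crop inside the volume
        box.append((start, start + size))
    return box


# Hypothetical 3D volume and region of interest, for illustration only.
rng = random.Random(0)
box = jittered_crop_box(
    volume_shape=(64, 128, 128),
    center=(32, 64, 64),
    crop_size=(32, 64, 64),
    max_jitter=(4, 8, 8),
    rng=rng,
)
print(box)
```

For evaluation (step 5), the same function with `max_jitter` set to zeros yields a deterministic crop, which loosely mirrors the manual cropping of evaluation scans described above.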