Xiaoqin Huang, Jian Sun, Krati Gupta, Giovanni Montesano, David P Crabb, David F Garway-Heath, Paolo Brusini, Paolo Lanzetta, Francesco Oddone, Andrew Turpin, Allison M McKendrick, Chris A Johnson, Siamak Yousefi.
Abstract
Objective: To assess the accuracy of probabilistic deep learning models in discriminating normal eyes from eyes with glaucoma using fundus photographs and visual fields.
Design: Algorithm development for discriminating normal and glaucoma eyes using data from a multicenter, cross-sectional, case-control study.
Subjects and participants: Fundus photograph and visual field data from 1,655 eyes of 929 normal and glaucoma subjects were used to develop and test the deep learning models, and an independent group of 196 eyes of 98 normal and glaucoma patients was used to validate them.
Main outcome measures: Accuracy and area under the receiver-operating characteristic curve (AUC).
Keywords: artificial intelligence; automated diagnosis; deep learning; fundus photograph; glaucoma; visual field
Year: 2022 PMID: 36250081 PMCID: PMC9556968 DOI: 10.3389/fmed.2022.923096
Source DB: PubMed Journal: Front Med (Lausanne) ISSN: 2296-858X
Demographic characteristics of the study.
| Characteristic | Development dataset | Independent dataset | Early glaucoma subset |
| Number of subjects | 929 | 98 | 50 |
| Number of eyes | 1,655 | 196 | 86 |
| Average age (years) | 57.1 | 52.9 | 47.7 |
| Average MD (dB) | –4.03 | –4.06 | –2.30 |
| Average PSD (dB) | 4.7 | 4.41 | 2.47 |
| Race, number of eyes (%) | | | |
| White | 1,429 (86.3%) | 144 (73.4%) | 64 (74.4%) |
| Asian | 51 (3.1%) | 8 (4.1%) | 5 (5.8%) |
| Black | 24 (1.5%) | 0 | 0 |
| Unknown | 116 (7.0%) | 38 (19.4%) | 11 (12.8%) |
| Other | 35 (2.1%) | 6 (3.1%) | 6 (7.0%) |
FIGURE 1. Diagram of the deep learning models. Upper: Deep learning model based on fundus photographs. Middle: Hybrid model based on combined visual fields and fundus photographs. Lower: 1-D CNN model based on visual fields.
FIGURE 2. Mean deviation (MD) of eyes in the training subset of the discovery dataset, testing subset of the discovery dataset, independent validation dataset, and the early glaucoma subset.
Average value of evaluation metrics of different probabilistic deep learning models in discriminating normal eyes from eyes with glaucoma.
| Dataset | Full discovery dataset | | | Full independent validation dataset | | | Early glaucoma subset | | |
| Model | Fundus | Visual field | Combined | Fundus | Visual field | Combined | Fundus | Visual field | Combined |
| AUC | 0.90 (0.89, 0.92) | 0.89 (0.86, 0.91) | 0.94 (0.91, 0.96) | 0.94 (0.92, 0.95) | 0.98 (0.98, 0.99) | 0.98 (0.98, 0.99) | 0.90 (0.88, 0.91) | 0.74 (0.73, 0.75) | 0.91 (0.89, 0.93) |
| Accuracy (%) | 82 (79, 84) | 81 (80, 82) | 85 (82, 88) | 88 (86, 89) | 92 (91, 93) | 93 (92, 95) | 83 (80, 85) | 68 (67, 69) | 84 (82, 86) |
| Sensitivity (%) | 71 (67, 76) | 73 (71, 76) | 79 (72, 85) | 77 (72, 82) | 91 (89, 92) | 91 (87, 95) | 59 (48, 70) | 100 (100, 100) | 100 (100, 100) |
| Specificity (%) | 94 (92, 96) | 90 (88, 91) | 92 (91, 93) | 96 (95, 98) | 94 (93, 95) | 95 (94, 97) | 95 (94, 96) | 51 (50, 52) | 75 (72, 78) |
Numbers in parentheses reflect the 95% confidence intervals. AUC, Area under the receiver operating characteristic curve.
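As a rough illustration of how the four metrics in the table above are defined, the following minimal sketch computes accuracy, sensitivity, specificity, and AUC from binary labels and predicted probabilities. The function names, the 0.5 decision threshold, and the toy data are assumptions made for illustration; the source does not state the threshold or the software the authors used.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_score: Sequence[float],
                           threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold.

    Labels: 1 = glaucoma, 0 = normal. The 0.5 threshold is an assumption.
    """
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 1:
            fp += 1
        else:
            tn += 1
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of glaucoma eyes detected
        "specificity": tn / (tn + fp),  # fraction of normal eyes cleared
    }


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the Mann-Whitney statistic: the probability that a randomly
    chosen glaucoma eye scores higher than a randomly chosen normal eye
    (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, not taken from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.1]
print(classification_metrics(labels, scores))
print(auc(labels, scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarises performance across all thresholds, which is one reason the two can diverge as they do in the early glaucoma columns.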
Pair-wise comparison of the AUCs based on the method of DeLong et al. (35).
| Dataset | Model of AUC | | P-value |
| Discovery dataset | Fundus | Visual field | 0.081 |
| | Fundus | Combined | 0.063 |
| | Visual field | Combined | 0.000 |
| Independent validation dataset | Fundus | Visual field | 0.006 |
| | Fundus | Combined | 0.001 |
| | Visual field | Combined | 0.560 |
| Early glaucoma subset | Fundus | Visual field | 0.081 |
| | Fundus | Combined | 0.063 |
| | Visual field | Combined | 0.000 |
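For readers unfamiliar with the DeLong method named in the caption above, the sketch below shows one way the test for two correlated AUCs (two models scored on the same eyes) can be written using only the Python standard library. It is an illustrative implementation, not the authors' code; the function names and the toy scores are assumptions.

```python
import math
from typing import List, Sequence, Tuple


def _psi(x: float, y: float) -> float:
    # Kernel of the Mann-Whitney statistic; ties count as one half.
    return 1.0 if x > y else 0.5 if x == y else 0.0


def _placements(pos: List[float], neg: List[float]) -> Tuple[List[float], List[float]]:
    # Structural components: one value per positive case, one per negative case.
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return v10, v01


def _cov(a: List[float], b: List[float]) -> float:
    # Sample covariance with an (n - 1) denominator.
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(y_true: Sequence[int], scores_a: Sequence[float],
                scores_b: Sequence[float]) -> Tuple[float, float, float, float]:
    """Return (auc_a, auc_b, z, two-sided p) for two models on the same cases."""
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    m, n = len(pos_idx), len(neg_idx)
    v10_a, v01_a = _placements([scores_a[i] for i in pos_idx],
                               [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _placements([scores_b[i] for i in pos_idx],
                               [scores_b[i] for i in neg_idx])
    auc_a, auc_b = sum(v10_a) / m, sum(v10_b) / m
    # Variance of each AUC and their covariance, from the structural components.
    var_a = _cov(v10_a, v10_a) / m + _cov(v01_a, v01_a) / n
    var_b = _cov(v10_b, v10_b) / m + _cov(v01_b, v01_b) / n
    cov_ab = _cov(v10_a, v10_b) / m + _cov(v01_a, v01_b) / n
    var_diff = var_a + var_b - 2.0 * cov_ab
    if var_diff <= 0.0:
        raise ValueError("degenerate variance; the two score sets may be identical")
    z = (auc_a - auc_b) / math.sqrt(var_diff)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail probability
    return auc_a, auc_b, z, p


# Toy data, not taken from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
model_b = [0.6, 0.9, 0.3, 0.5, 0.7, 0.4, 0.2, 0.8]
print(delong_test(labels, model_a, model_b))
```

The covariance term is what distinguishes this from comparing two independent AUCs: because both models are evaluated on the same eyes, their errors are correlated, and ignoring that would misstate the variance of the difference.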
FIGURE 3. Receiver operating characteristic (ROC) curves of the deep learning model for diagnosing glaucoma. Left: ROC of the model for diagnosing glaucoma based on the discovery dataset. Middle: ROC of the model for diagnosing glaucoma based on the independent validation dataset. Right: ROC of the model for diagnosing glaucoma based on an early glaucoma subset.
FIGURE 4. Fundus photographs and corresponding class activation maps (CAMs) at two different convolutional layers. Top row: Correct prediction. Bottom row: Incorrect prediction. Highlighted regions were more important for the model in making a diagnosis.
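The caption above does not say which CAM variant was used. As a hedged illustration only, the sketch below follows the classic class activation map formulation, in which the map for a class is a weighted sum of the feature maps of a convolutional layer, rescaled to [0, 1] for display. The function name, the nested-list representation, and the toy values are assumptions.

```python
from typing import List

Grid = List[List[float]]


def class_activation_map(feature_maps: List[Grid], class_weights: List[float]) -> Grid:
    """Weighted sum of K feature maps (each H x W), min-max scaled to [0, 1].

    feature_maps[k][i][j] is the activation of channel k at pixel (i, j);
    class_weights[k] is the weight linking channel k to the class of interest.
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    raw = [[sum(w * fmap[i][j] for w, fmap in zip(class_weights, feature_maps))
            for j in range(width)]
           for i in range(height)]
    lowest = min(min(row) for row in raw)
    highest = max(max(row) for row in raw)
    span = highest - lowest
    if span == 0.0:
        # A flat map carries no spatial information; return all zeros.
        return [[0.0] * width for _ in range(height)]
    return [[(value - lowest) / span for value in row] for row in raw]


# Toy example with two 2 x 2 feature maps, not taken from the study.
maps = [[[0.0, 1.0], [2.0, 3.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
weights = [1.0, -1.0]
print(class_activation_map(maps, weights))
```

In practice the resulting grid is upsampled to the photograph's resolution and overlaid as a heat map, so that high values mark the regions that contributed most to the predicted class.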
FIGURE 5. Uncertainty of the model in making a diagnosis based on data from the independent validation dataset. Left: Uncertainty level of the AI model for correct and incorrect diagnoses based on fundus photographs, visual fields, and the combined modality. Right: Uncertainty level of the AI model in making a diagnosis from fundus photographs, visual fields, and the combined modality, by glaucoma severity level.
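The text here does not define how uncertainty was quantified. One common choice for probabilistic classifiers, shown below purely as an assumed example, is to run several stochastic forward passes per eye, average the predicted glaucoma probabilities, and report the binary entropy of that average. The function names and sample values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli(p) prediction; 0 when p is 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def predictive_uncertainty(sampled_probs: Sequence[float]) -> Tuple[float, float]:
    """Return (mean probability, predictive entropy) for one eye.

    sampled_probs holds the glaucoma probability from each stochastic
    forward pass of a probabilistic model (an assumed setup).
    """
    mean_prob = sum(sampled_probs) / len(sampled_probs)
    return mean_prob, binary_entropy(mean_prob)


# Hypothetical samples: a confident eye and an ambiguous eye.
print(predictive_uncertainty([0.90, 0.95, 0.85]))
print(predictive_uncertainty([0.40, 0.60, 0.50]))
```

Under this measure, entropy peaks at 1 bit when the averaged probability is 0.5 and falls toward 0 as the model becomes confident, which matches the intuition that incorrect diagnoses tend to carry higher uncertainty.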