Daniel Shu Wei Ting, Louis R Pasquale, Lily Peng, John Peter Campbell, Aaron Y Lee, Rajiv Raman, Gavin Siew Wei Tan, Leopold Schmetterer, Pearse A Keane, Tien Yin Wong.
Abstract
Artificial intelligence (AI) based on deep learning (DL) has sparked tremendous global interest in recent years. DL has been widely adopted in image recognition, speech recognition and natural language processing, but is only beginning to impact on healthcare. In ophthalmology, DL has been applied to fundus photographs, optical coherence tomography and visual fields, achieving robust classification performance in the detection of diabetic retinopathy and retinopathy of prematurity, the glaucoma-like disc, macular oedema and age-related macular degeneration. DL in ocular imaging may be used in conjunction with telemedicine as a possible solution to screen, diagnose and monitor major eye diseases for patients in primary care and community settings. Nonetheless, there are also potential challenges with DL application in ophthalmology, including clinical and technical challenges, explainability of the algorithm results, medicolegal issues, and physician and patient acceptance of the AI 'black-box' algorithms. DL could potentially revolutionise how ophthalmology is practised in the future. This review provides a summary of the state-of-the-art DL systems described for ophthalmic applications, potential challenges in clinical deployment and the path forward. © Author(s) (or their employer(s)) 2019. Re-use permitted under CC BY-NC. No commercial re-use. See rights and permissions. Published by BMJ.
Keywords: glaucoma; imaging; public health; retina; telemedicine
Year: 2018 PMID: 30361278 PMCID: PMC6362807 DOI: 10.1136/bjophthalmol-2018-313173
Source DB: PubMed Journal: Br J Ophthalmol ISSN: 0007-1161 Impact factor: 4.638
Summary table for the different DL systems in the detection of referable diabetic retinopathy, glaucoma suspect, age-related macular degeneration and retinopathy of prematurity using fundus photographs
| DL systems | Year | Test data sets | Test images (n) | CNN | AUC | Sensitivity (%) | Specificity (%) |
| Referable diabetic retinopathy | | | | | | | |
| Abràmoff | 2016 | Messidor-2 | 1748 | AlexNet/VGG | 0.98 | 96.80 | 87.00 |
| Gulshan | 2016 | Messidor-2 | 1748 | Inception-V3 | 0.99 | 87 | 98.50 |
| | | | | | | 96.10 | 93.90 |
| | | EyePACS-1 | 9963 | | 0.991 | 90.30 | 98.10 |
| | | | | | | 97.50 | 93.40 |
| Gargeya and Leng | 2017 | Kaggle images | 75 137 | Customised CNN | 0.97 | NA | NA |
| | | E-Ophtha | 463 | | 0.96 | NA | NA |
| | | Messidor-2 | 1748 | | 0.94 | NA | NA |
| Ting | 2017 | SiDRP 14–15 | 71 896 | VGG-19 | 0.936 | 90.50 | 91.60 |
| | | Guangdong | 15 798 | | 0.949 | 98.70 | 81.60 |
| | | SIMES | 3052 | | 0.889 | 97.10 | 82.00 |
| | | SINDI | 4512 | | 0.917 | 99.3 | 73.3 |
| | | SCES | 1936 | | 0.919 | 100 | 76.30 |
| | | BES | 1052 | | 0.929 | 94.40 | 88.50 |
| | | AFEDS | 1968 | | 0.98 | 98.80 | 86.50 |
| | | RVEEH | 2302 | | 0.983 | 98.90 | 92.20 |
| | | Mexican | 1172 | | 0.95 | 91.80 | 84.80 |
| | | CUHK | 1254 | | 0.948 | 99.3 | 83.10 |
| | | HKU | 7706 | | 0.964 | 100 | 81.30 |
| Abràmoff | 2018 | 10 primary care practice sites from the USA | 892 patients | AlexNet/VGG | NA | 87.2 | 90.7 |
| Glaucoma suspect* | | | | | | | |
| Ting | 2017 | SiDRP 14–15 | 71 896 | VGG-19 | 0.942 | 96.40 | 93.20 |
| Li | 2018 | Guangdong | 48 116 | | 0.986 | 95.60 | 92.00 |
| Age-related macular degeneration | | | | | | | |
| Ting | 2017 | SiDRP 14–15 | 35 948 | VGG-19 | 0.932 | 93.20 | 88.70 |
| Burlina | 2017 | AREDS | 120 656 | AlexNet, OverFeat | 0.940–0.96 | NA | NA |
| Grassmann | 2018 | AREDS | 120 656 | AlexNet, GoogleNet, VGG, Inception-V3, ResNet, Inception-ResNet-V2 | NA | 84.20 | 94.30 |
| Retinopathy of prematurity | | | | | | | |
| Brown | 2018 | i-ROP | 100 | Inception-V1 and U-Net | NA | 100 | 94 |
The diagnostic performance is not comparable between the different DL systems, given the different data sets used in the individual studies.
*Definition of glaucoma suspect: (1) Ting et al (ref 11): vertical cup to disc ratio of 0.8 or greater, and any glaucomatous disc changes; (2) Li et al (ref 16): vertical cup to disc ratio of 0.7 or greater, and any glaucomatous disc changes.
AFEDS, African American Eye Disease Study; AREDS, Age-Related Eye Disease Study; AUC, area under the receiver operating characteristic curve; BES, Beijing Eye Study; CNN, convolutional neural network; CUHK, Chinese University Hong Kong; DL, deep learning; SiDRP 14–15, Singapore Integrated Diabetic Retinopathy Screening Programme; HKU, Hong Kong University; NA, not available; RVEEH, Royal Victorian Eye and Ear Hospital; SCES, Singapore Chinese Eye Study; SIMES, Singapore Malay Eye Study; SINDI, Singapore Indian Eye Study.
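The AUC, sensitivity and specificity columns in the table above can be illustrated with a minimal Python sketch. It is not taken from any of the cited studies: the labels, scores and thresholds are hypothetical toy values, and the function names are illustrative only. It also shows why a single system can report two sensitivity/specificity pairs for the same data set, as in the Gulshan rows: each pair corresponds to a different operating threshold on the same model output.

```python
def confusion_counts(labels, scores, threshold):
    """Count TP, FP, TN, FN when scores >= threshold are called referable."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if y == 1 and predicted:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(labels, scores, threshold):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(labels, scores, threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = referable disease, 0 = non-referable.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.95, 0.80, 0.60, 0.30, 0.70, 0.40, 0.20, 0.15, 0.10, 0.05]

# Two operating points on the same scores (thresholds are arbitrary assumptions).
high_specificity = sensitivity_specificity(labels, scores, threshold=0.75)
high_sensitivity = sensitivity_specificity(labels, scores, threshold=0.25)
area = auc(labels, scores)
```

Raising the threshold trades sensitivity for specificity while the AUC, which summarises all thresholds, stays the same.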
Summary table for the different DL systems in the detection of retinal diseases using OCT
| DL systems | Year | Disease | OCT machines | Test images | CNN | AUC | Accuracy (%) | Sensitivity (%) | Specificity (%) |
| Lee | 2017 | Exudative AMD | Spectralis | 20 613 | VGG-16 | 0.928 | 87.60 | 84.60 | 91.50 |
| Treder | 2018 | Exudative AMD | Spectralis | 100 | Inception-V3 | 0.980 | 100 | NA | NA |
| Kermany | 2018 | CNV, DMO, drusen | Spectralis | 1000 | Inception-V3 | | | | |
| | | 1. Multiclass comparison | | | | 0.999 | 96.50 | 97.80 | 97.40 |
| | | 2. Limited model | | | | 0.988 | 93.40 | 96.60 | 94.00 |
| | | 3. Binary model | | | | | | | |
| | | CNV vs normal | | | | 1 | 100 | 100 | 100 |
| | | DMO vs normal | | | | 0.999 | 98.20 | 96.80 | 99.60 |
| | | Drusen vs normal | | | | 0.999 | 99 | 98 | 99.20 |
| De Fauw | 2018 | Urgent, semiurgent, routine and observation only | Topcon | 997 patients | 1. Deep segmentation network using U-Net | Urgent | 94.5 | | |
| | | Normal, CNV, macular oedema, FTMH, PTMH, CSR, VMT, GA, drusen, ERM | Spectralis | 116 patients | 2. Deep classification network using a custom 29 CNN layers with 5 pooling layers | Urgent referral | 96.6 | | |
The diagnostic performance is not comparable between the different DL systems, given the different data sets used in the individual studies. AUC for specific conditions: CNV 0.993; macular oedema 0.990; normal 0.995; FTMH 1.00; PTMH 0.999; CSR 0.995; VMT 0.980; GA 0.990; drusen 0.967; and ERM 0.966.
AMD, age-related macular degeneration; AUC, area under the receiver operating characteristic curve; CNN, convolutional neural network; CNV, choroidal neovascularisation; CSR, central serous chorioretinopathy; DL, deep learning; DMO, diabetic macular oedema; ERM, epiretinal membrane; FTMH, full-thickness macular hole; GA, geographic atrophy; NA, not available; OCT, optical coherence tomography; PTMH, partial thickness macular hole; VMT, vitreomacular traction.
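Some test sets in the tables above contain as few as 100 images, so a reported sensitivity or accuracy of 100% can still carry wide statistical uncertainty. As a rough illustration only, the Python sketch below computes a Wilson score interval for a proportion. The counts are hypothetical and are not taken from any of the cited studies, and the choice of the Wilson method is an assumption made here, not something the review prescribes.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95% coverage)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# Hypothetical counts: every diseased eye detected, in a small and a large test set.
small = wilson_interval(20, 20)      # 20 of 20 diseased eyes detected
large = wilson_interval(2000, 2000)  # 2000 of 2000 diseased eyes detected
```

With the same observed sensitivity of 100%, the lower bound for the small set is far lower than for the large one, which is one way to express a test set being "not sufficiently powered".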
Figure 1 Archetype analysis with 16 visual field (VF) archetypes (ATs) that were derived from an unsupervised computer algorithm described by Elze et al (ref 49).
The clinical and technical challenges in building and deploying deep learning (DL) techniques from ‘bench to bedside’
| Steps | Potential challenges |
| 1. Identification of training data sets | Patients’ consent and confidentiality issues. Varying standards and regulations between the different institutional review boards. Small training data sets for rare diseases (eg, ocular tumours) or for common diseases that are not routinely captured (eg, cataracts). |
| 2. Validation and testing data sets | Lack of sample size: not sufficiently powered. Lack of generalisability: not tested widely in different populations or on data collected from different devices. |
| 3. Explainability of the results | Demonstration of the regions ‘deemed’ abnormal by DL. Methods to generate heat maps: occlusion tests, class activation, integrated gradient method, soft attention map and so on. |
| 4. Clinical deployment of DL systems | Recommendation of the potential clinical deployment sites. Application for regulatory approval from health authorities (eg, US Food and Drug Administration, European CE marking and so on). Conducting prospective clinical trials. Medical rebate scheme and medicolegal requirement. Ethical challenges. |
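Of the heat map methods listed under explainability, the occlusion test is the simplest to sketch: mask one region of the image at a time and record how far the model's output drops. The Python example below is a minimal illustration under stated assumptions. `toy_score` is a hypothetical stand-in for a trained classifier, and the 4×4 "image", patch size and fill value are arbitrary; none of this reproduces the method of any cited system.

```python
def occlusion_heat_map(image, score_fn, patch=2, fill=0.0):
    """Mask each patch in turn and record the drop in model score for that patch."""
    height, width = len(image), len(image[0])
    baseline = score_fn(image)
    heat = [[0.0] * width for _ in range(height)]
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            rows = range(top, min(top + patch, height))
            cols = range(left, min(left + patch, width))
            occluded = [row[:] for row in image]  # copy so the input is untouched
            for r in rows:
                for c in cols:
                    occluded[r][c] = fill
            drop = baseline - score_fn(occluded)
            for r in rows:
                for c in cols:
                    heat[r][c] = drop  # large drop = region mattered to the score
    return heat


def toy_score(image):
    """Hypothetical classifier stand-in: mean brightness of the lower-right quadrant."""
    h, w = len(image), len(image[0])
    region = [image[r][c] for r in range(h // 2, h) for c in range(w // 2, w)]
    return sum(region) / len(region)


# Hypothetical 4x4 "image" with a bright lesion in the lower-right corner.
image = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.9],
    [0.1, 0.1, 0.9, 0.9],
]
heat = occlusion_heat_map(image, toy_score)
```

Only the lower-right patch produces a non-zero drop here, so the heat map highlights the region the toy model relies on. With a real network the same loop applies, but each masked image requires a full forward pass, which makes the method slow on large images.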
Figure 2 Some examples of heat maps showing the abnormal areas in the retina. (A) Severe non-proliferative diabetic retinopathy (NPDR); (B) geographic atrophy in advanced age-related macular degeneration (AMD) on fundus photographs (ref 11); and (C) diabetic macular oedema on optical coherence tomography.
Figure 3 A representative screenshot from the output of the Moorfields-DeepMind deep learning system for optical coherence tomography segmentation and classification. In this case, the system correctly diagnoses a case of central serous retinopathy with secondary choroidal neovascularisation and recommends urgent referral to an ophthalmologist. Through the creation of an intermediate tissue representation (seen here as two-dimensional thickness maps for each morphological parameter), the system provides ‘explainability’ for the ophthalmologist.