Ruben Hemelings1,2, Bart Elen2, João Barbosa-Breda1,3,4, Erwin Bellon5, Matthew B Blaschko6, Patrick De Boever2,7,8, Ingeborg Stalmans1,9.
Abstract
Purpose: Standard automated perimetry is the gold standard to monitor visual field (VF) loss in glaucoma management, but it is prone to intrasubject variability. We trained and validated a customized deep learning (DL) regression model with Xception backbone that estimates pointwise and overall VF sensitivity from unsegmented optical coherence tomography (OCT) scans.
Year: 2022 PMID: 35998059 PMCID: PMC9424967 DOI: 10.1167/tvst.11.8.22
Source DB: PubMed Journal: Transl Vis Sci Technol ISSN: 2164-2591 Impact factor: 3.048
Figure 1. Top panel: Overview of imaging modalities, the spatial relationship between structure and analyzed VF points, allocated to Garway-Heath sectors. The 30° SLO covers less than half of VF test locations, still considerably more than the three circumpapillary OCT scans (white circles) displayed on the right. Bottom panel: Four cases of the independent test set. Each case features (1) an ONH zoom of the original 30° SLO image, (2) measured VF map and MD, and (3) the corresponding predicted VF map and MD. The displayed cases include an example of early glaucoma (top left), moderate glaucoma with loss in the superior hemifield (bottom left), a myopic eye with severe glaucomatous loss in the inferior hemifield (top right), and severe glaucoma with only a small central island remaining (bottom right).
Study Sample Characteristics
| Characteristic | Train | Val | Test | Total |
|---|---|---|---|---|
| OCT–VF pairs, n | 1006 | 198 | 186 | 1390 |
| Eyes, n | 598 | 137 | 131 | 866 |
| Patients, n | 325 | 84 | 88 | 497 |
| Age, y | 55.6 ± 20 | 57.1 ± 17 | 58.5 ± 17 | 55.8 ± 19 |
| Sex, F/M | 0.51/0.49 | 0.43/0.57 | 0.53/0.47 | 0.50/0.50 |
| MD data available | 1006/1006 (100) | 198/198 (100) | 186/186 (100) | 1390/1390 (100) |
| MD, dB | –7.55 ± 7.55 | –7.95 ± 8.92 | –7.37 ± 7.93 | –7.58 ± 7.80 |
| MD ≥ –6 dB | 600 (60) | 121 (61) | 108 (58) | 829 (60) |
| –6 dB > MD > –12 dB | 181 (18) | 29 (15) | 31 (17) | 241 (17) |
| MD ≤ –12 dB | 225 (22) | 48 (24) | 47 (25) | 320 (23) |
| SphEq data available | 799/1006 (79) | 152/198 (77) | 152/186 (84) | 1103/1390 (79) |
| SphEq, D | –2.17 ± 2.73 | –2.52 ± 2.72 | –2.42 ± 3.17 | –2.25 ± 2.79 |
| +1 D ≤ SphEq | 85 (11) | 19 (13) | 19 (13) | 123 (11) |
| +1 D > SphEq > –1 D | 190 (24) | 13 (9) | 36 (24) | 239 (22) |
| –1 D ≥ SphEq > –6 D | 441 (55) | 105 (69) | 74 (49) | 620 (56) |
| –6 D ≥ SphEq | 83 (10) | 15 (10) | 23 (15) | 121 (11) |
| IOP data available | 638/1006 (63) | 119/198 (60) | 125/186 (67) | 882/1390 (63) |
| Max IOP, mm Hg | 24.08 ± 8.25 | 24.16 ± 7.71 | 22.98 ± 7.57 | 23.94 ± 8.09 |
| vCDR data available | 882/1006 (88) | 173/198 (87) | 166/186 (91) | 1221/1390 (88) |
| vCDR estimate | 0.69 ± 0.22 | 0.69 ± 0.20 | 0.69 ± 0.20 | 0.69 ± 0.21 |
| OCT scan quality | 23.55 ± 4.36 | 23.82 ± 4.37 | 23.63 ± 4.22 | 23.60 ± 4.34 |
Values are presented as number (%) or mean ± SD unless otherwise indicated. D, diopters; IOP, intraocular pressure; SphEq, spherical equivalent; vCDR, vertical cup-disc ratio.
Quantitative Results for All Models Trained for the Estimation of 52 Threshold Values and MD
| Modality | Target | R² | Pearson | MSE (95% CI) | MAE (dB) (95% CI) (MAEdecr%) |
|---|---|---|---|---|---|
| Baseline (validation) | 52 points | 0.00 | 0.00 | 109.57 (93.29–127.23) | 8.31 (7.67–8.99) |
| Baseline (validation) | MD | 0.00 | 0.00 | 79.09 (62.25–97.50) | 7.20 (6.49–7.96) |
| Inner, 3.5 mm | 52 points | 0.55 (0.46–0.62) | 0.75 (0.70–0.80) | 49.26 (41.49–57.52) | 4.98 (4.53–5.45) (40) |
| Inner, 3.5 mm | MD | 0.72 (0.63–0.78) | 0.85 (0.80–0.89) | 22.24 (17.09–27.85) | 3.40 (2.95–3.86) (53) |
| Middle, 4.1 mm | 52 points | 0.54 (0.45–0.61) | 0.75 (0.70–0.80) | 49.38 (41.96–57.40) | 5.12 (4.68–5.57) (38) |
| Middle, 4.1 mm | MD | 0.66 (0.54–0.75) | 0.83 (0.77–0.89) | 26.61 (20.24–33.50) | 3.76 (3.28–4.27) (48) |
| Outer, 4.7 mm | 52 points | 0.57 (0.48–0.63) | 0.77 (0.72–0.82) | 49.02 (39.83–54.41) | 5.10 (4.72–5.51) (39) |
| Outer, 4.7 mm | MD | 0.70 (0.60–0.78) | 0.84 (0.78–0.89) | 23.54 (17.18–30.69) | 3.42 (2.96–3.90) (53) |
| SLO | 52 points | 0.39 (0.31–0.46) | 0.66 (0.59–0.71) | 66.36 (55.52–77.50) | 5.82 (5.28–6.39) (30) |
| SLO | MD | 0.47 (0.33–0.58) | 0.70 (0.59–0.79) | 41.52 (31.58–52.49) | 4.77 (4.19–5.37) (34) |
| Circle scans (weighted average) | 52 points | 0.59 (0.51–0.65) | 0.79 (0.74–0.83) | 44.50 (37.62–51.87) | 4.89 (4.48–5.30) (41) |
| Circle scans (weighted average) | MD | 0.74 (0.65–0.80) | 0.86 (0.81–0.90) | 20.72 (15.57–26.53) | 3.27 (2.84–3.72) (55) |
| Circle scans, SLO (weighted average) | 52 points | | | | |
| Circle scans, SLO (weighted average) | MD | | | | |
| Baseline (test) | 52 points | 0.00 | 0.00 | 101.59 (84.97–119.91) | 7.76 (7.12–8.43) |
| Baseline (test) | MD | 0.00 | 0.00 | 62.48 (48.98–77.32) | 6.32 (5.66–7.01) |
| Test set (186 images) | 52 points | 0.58 (0.51–0.63) | 0.79 (0.75–0.82) | 42.35 (36.21–49.93) | 4.82 (4.45–5.22) (38) |
| Test set (186 images) | MD | 0.75 (0.67–0.81) | 0.87 (0.83–0.91) | 15.73 (11.35–21.06) | 2.89 (2.50–3.30) (54) |
The first section features results on the validation set (198 images), for which the best results are set in bold. The best model setup on validation data, selected on best R², was subsequently used to obtain results on the independent test set (186 images). MAE and MSE baselines for validation and test data were computed through the constant prediction of the mean value (threshold value, MD).
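The MAEdecr% figures in the table follow from the baseline described in the note above: the MAE of a constant prediction of the mean, against which the model MAE is expressed as a percentage decrease. The following is a minimal sketch of that calculation, not the authors' code. The function names are hypothetical, and it assumes a one-dimensional target such as MD with the mean taken over the evaluated set itself (the source does not state which set supplies the mean). The last lines recompute the test-set percentages from the reported MAE values.

```python
from statistics import mean


def mae(y_true, y_pred):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mae_decrease_pct(y_true, y_pred):
    """Percentage decrease in MAE relative to always predicting the mean.

    Assumption: the mean is computed on y_true itself.
    """
    baseline_pred = [mean(y_true)] * len(y_true)
    baseline_mae = mae(y_true, baseline_pred)
    return 100.0 * (baseline_mae - mae(y_true, y_pred)) / baseline_mae


def decrease_from_reported(baseline_mae, model_mae):
    """Same ratio, starting from already aggregated MAE values."""
    return 100.0 * (baseline_mae - model_mae) / baseline_mae


# Reported test-set MAE values (dB): baseline vs. ensemble model.
pointwise_decr = decrease_from_reported(7.76, 4.82)  # 52 threshold values
md_decr = decrease_from_reported(6.32, 2.89)         # mean deviation
print(round(pointwise_decr), round(md_decr))  # 38 54
```

For the 52 threshold values, the same function would be applied per test location (pointwise mean) before aggregating.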
Figure 2. (A) MAEdecr% values for 52 VF threshold values obtained using the model trained on 4.7 mm (outer) OCT scans. MAEdecr% is the decrease in percentage from the baseline MAE, with the latter obtained when always predicting the pointwise mean. (B) Similar to panel A, but for the model trained using en face SLO images, which serves as a baseline for comparison. (C) Final MAEdecr% values obtained on the test set, using the weighted averaged predictions of the four CNNs trained using OCT scans and SLO images. (D) The difference between panels A and B, indicating the superior VF modeling performance of OCT scans across the majority of VF test locations.
Metrics on the Six Visual Field Sectors as Described by Garway-Heath et al., Computed on the Test Set Using the Weighted Ensemble Model
| Sector | R² | Pearson | MAE (dB) | MAE Baseline (MAEdecr%) |
|---|---|---|---|---|
| Central | 0.52 (0.46–0.57) | 0.77 (0.72–0.81) | 4.84 (4.33–5.39) | 7.50 (35) |
| Temporal | 0.55 (0.44–0.62) | 0.77 (0.69–0.83) | 4.41 (4.01–4.85) | 6.30 (30) |
| Inferior | 0.56 (0.46–0.63) | 0.77 (0.70–0.82) | 5.09 (4.66–5.55) | 7.99 (36) |
| Inferior nasal | 0.64 (0.57–0.70) | 0.83 (0.78–0.87) | 4.67 (4.20–5.17) | 8.11 (42) |
| Superior | 0.54 (0.45–0.62) | 0.76 (0.70–0.81) | 4.89 (4.48–5.32) | 7.37 (34) |
| Superior nasal | 0.60 (0.52–0.67) | 0.80 (0.75–0.84) | 4.79 (4.32–5.28) | 8.09 (41) |
MAE baseline was obtained by always predicting the sector threshold mean.
Figure 3. Comparative overview of three original studies (current, Guo et al., and Zhu et al.) that report on the relationship between measured and predicted VF threshold values, stratified by sensitivity (step size of 2 dB). The error ranges obtained by our DL-based approach are smaller than those of previous non-DL studies. Thirty-three of 38 whiskers are located within the 90% CI test–retest limits reported by Artes et al.
Pointwise MAE Aggregated on the Global Level and Six Visual Field Sectors as Described by Garway-Heath et al., Stratified by Three VF Severity Groups in the Test Set
| Sector | MAE, Early VF Loss (MD ≥ −6 dB) | MAE, Moderate VF Loss (−6 dB > MD > −12 dB) | MAE, Advanced VF Loss (MD ≤ −12 dB) |
|---|---|---|---|
| All | 3.582 | 4.561 | 7.849 |
| Central | 3.126 | 4.640 | |
| Temporal | 3.597 | 3.246 | 7.059 |
| Inferior | 3.762 | | 8.047 |
| Inferior nasal | 3.235 | 3.850 | 8.512 |
| Superior | | 4.576 | 6.672 |
| Superior nasal | 3.517 | 4.905 | 7.640 |
The largest MAE per severity group is highlighted in bold, indicating the poorest modeling performance by the ensembled CNN.
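The stratification used in this table can be sketched as follows, assuming the MD cut-offs given in the column headers. This is an illustration only: the function names, the record layout, and the toy values are hypothetical, and pointwise errors are pooled across all eyes within a group, which matches a per-eye average only when every eye contributes the same number of test locations.

```python
from collections import defaultdict
from statistics import mean


def severity_group(md_db):
    """Assign a severity group from the measured MD (dB)."""
    if md_db >= -6:
        return "early"
    if md_db > -12:
        return "moderate"
    return "advanced"


def mae_by_severity(records):
    """Pointwise MAE per severity group.

    records: iterable of (measured_md, measured_thresholds,
    predicted_thresholds) tuples; this layout is an assumption.
    """
    errors = defaultdict(list)
    for md, measured, predicted in records:
        group = severity_group(md)
        errors[group].extend(abs(m - p) for m, p in zip(measured, predicted))
    return {group: mean(errs) for group, errs in errors.items()}


# Toy records with two test locations each (made-up values).
toy_records = [
    (-2.0, [30, 28], [29, 30]),
    (-8.0, [20, 22], [24, 22]),
    (-15.0, [5, 0], [10, 6]),
]
result = mae_by_severity(toy_records)
print(result)
```

Restricting `measured` and `predicted` to the test locations of one Garway-Heath sector before calling `mae_by_severity` would give the sector rows.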
Post Hoc One-at-a-Time Sensitivity Analysis to Assess Influence of Certain Input Factors on the Error Term for the Test Set
| Subset | No. of OCT–VF Pairs | MD MAE, dB / MAEdecr% | 52 Points MAE, dB / MAEdecr% |
|---|---|---|---|
| All (FL ≤20, FP ≤15) | 186 | 2.89/54 | 4.82/38 |
| Effect of other ocular disease | | | |
| Excluding high myopia (≤–6 D) | 125 | 2.69/ | 4.65/38 |
| Excluding history of cataract | 130 | 2.63/46 | 4.49/23 |
| Excluding other types of glaucoma (other than POAG, NTG) | 173 | 2.75/56 | 4.70/38 |
| Excluding history of retinal diseases | 161 | 2.88/53 | 4.95/34 |
| Effect of OCT scan quality | | | |
| Excluding scans with quality <20 | 149 | 2.65/ | 4.64/38 |
| Excluding scans with quality <15 | 181 | 2.82/55 | 4.74/38 |
| Effect of visual field reliability indices | | | |
| Excluding HFA FL <10 | 135 | 2.70/55 | 4.51/38 |
| Excluding HFA FP <10 | 179 | 2.79/56 | 4.74/ |
| Excluding HFA FN <10 | 148 | 2.74/55 | 4.67/37 |
The best performance (largest MAEdecr%) per column is highlighted in bold. Baseline error is the MAE obtained when predicting the mean MD or pointwise value. FL, fixation loss; FN, false negative; FP, false positive; NTG, normal tension glaucoma; POAG, primary open angle glaucoma.
Not all OCT–VF pairs have a SphEq label assigned; hence, omission of pairs might be due to missing values.
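A one-at-a-time analysis of this kind can be sketched as below: each exclusion criterion is applied on its own to the full test set, and the error is recomputed on the remaining pairs. The field names, criteria, and values are hypothetical and do not reproduce the study data. Pairs with a missing SphEq label are dropped by the myopia criterion, which mirrors the note above about omissions caused by missing values.

```python
from statistics import mean

# Toy OCT–VF pairs (made-up values); sph_eq may be missing (None).
pairs = [
    {"sph_eq": -1.0, "quality": 25, "md_abs_error": 2.0},
    {"sph_eq": -7.5, "quality": 22, "md_abs_error": 4.0},
    {"sph_eq": None, "quality": 18, "md_abs_error": 3.0},
    {"sph_eq": 0.5, "quality": 19, "md_abs_error": 1.0},
]

# Each criterion says which pairs to KEEP; only one is active at a time.
criteria = {
    "All": lambda p: True,
    "Excluding high myopia (<= -6 D)":
        lambda p: p["sph_eq"] is not None and p["sph_eq"] > -6,
    "Excluding scans with quality < 20":
        lambda p: p["quality"] >= 20,
}


def one_at_a_time(pairs, criteria):
    """Return {criterion: (number of pairs kept, MD MAE on that subset)}."""
    rows = {}
    for name, keep in criteria.items():
        subset = [p for p in pairs if keep(p)]
        rows[name] = (len(subset), mean(p["md_abs_error"] for p in subset))
    return rows


rows = one_at_a_time(pairs, criteria)
for name, (n, subset_mae) in rows.items():
    print(f"{name}: n={n}, MD MAE={subset_mae:.2f} dB")
```

Because the baseline MAE also changes with each subset, MAEdecr% would need to be recomputed per subset rather than reused from the full test set.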