Vahid Mohammadzadeh1, Melodyanne Cheng1, Sepideh Heydar Zadeh1, Kiumars Edalati1, Dariush Yalzadeh1, Joseph Caprioli1, Sunil Yadav2,3, Ella M Kadas2,3, Alexander U Brandt4, Kouros Nouri-Mahdavi1. 1. Glaucoma Division, Stein Eye Institute, David Geffen School of Medicine, University of California Los Angeles, Los Angeles, CA, USA. 2. Experimental and Clinical Research Center, Max Delbruck Center for Molecular Medicine, Berlin, Germany. 3. Charité - Universitätsmedizin Berlin, corporate member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Berlin, Germany. 4. Department of Neurology, University of California Irvine, CA, USA.
Abstract
Purpose: To test the hypothesis that newly developed shape measures using optical coherence tomography (OCT) macular volume scans can discriminate patients with perimetric glaucoma from healthy subjects. Methods: OCT structural measures defining macular topography and volume were recently developed based on cubic Bézier curves. We exported macular volume scans from 135 eyes with glaucoma (133 patients) and 155 healthy eyes (85 subjects) and estimated global and quadrant-based measures. The best subset of measures to predict glaucoma was explored with a gradient boost model (GBM) with subsequent logistic regression. Accuracy and area under receiver operating curves (AUC) were the primary metrics. In addition, we separately investigated model performance in 66 eyes with mild glaucoma (mean deviation ≥ -6 dB). Results: Average (±SD) 24-2 mean deviation was -8.2 (±6.1) dB in eyes with glaucoma. The main predictive measures for glaucoma were temporal inferior rim height, nasal inferior pit volume, and temporal inferior pit depth. Lower values for these measures predicted higher risk of glaucoma. Sensitivity, specificity, and AUC for discriminating between healthy and glaucoma eyes were 81.5% (95% CI = 76.6-91.9%), 89.7% (95% CI = 78.7-94.2%), and 0.915 (95% CI = 0.882-0.948), respectively. Corresponding metrics for mild glaucoma were 84.8% (95% CI = 72.1%-95.5%), 85.8% (95% CI = 87.1%-97.4%), and 0.913 (95% CI = 0.867-0.958), respectively. Conclusions: Novel macular shape biomarkers detect early glaucoma with clinically relevant performance. Such biomarkers do not depend on intraretinal segmentation accuracy and may be helpful in eyes with suboptimal macular segmentation. Translational Relevance: Macular shape biomarkers provide valuable information for detection of early glaucoma and may provide additional information beyond thickness measurements.
Glaucoma is characterized by the loss of retinal ganglion cells (RGCs) and may result in functional loss if left untreated. Timely treatment of glaucoma is of utmost importance as glaucoma deterioration is associated with worsening of vision-specific quality of life. The RGCs most crucial to central vision are located in the macular region. Macular optical coherence tomography (OCT) is commonly used to assess the health of the RGC-axonal complex. Although current OCT imaging approaches have significantly enhanced detection of early signs of glaucoma, identifying with high accuracy the eyes with glaucoma at highest risk of visual disability remains a pressing unmet need in the field of glaucoma diagnostics. Automated detection of eyes at high risk of early glaucoma with high specificity could make screening or risk stratification possible. This is especially important in earlier stages of the disease so that further functional damage may be averted.
Macular thickness measurements have been the main outcome of interest in both research and clinical settings, with thinning of various layers of the inner retina considered the hallmark of glaucoma. The central macula has a complex configuration, and variations in its thickness and topography have been reported in eyes with glaucoma. Recent studies reported on the foveal shape in healthy subjects and subjects with glaucoma and explored its influence on structure–function (SF) relationships. Yadav et al. and Motamedi and collaborators recently proposed a novel approach to parametrizing the configuration and topography of the central macular and perifoveal region based on cubic Bézier polynomials along with a least square optimization for 3-dimensional modeling. Motamedi et al. evaluated the performance of such measures in differentiating between neuromyelitis optica (NMO) and multiple sclerosis (MS); the resulting area under receiver operating characteristic curve (AUROC) for discriminating between the 2 diseases was 0.800. Yadav and colleagues also reported the differences in the aforementioned shape and topography measures between healthy subjects and patients with various inflammatory optic neuropathies.
The goal of the current study is to test the hypothesis that newly developed macular topographic and volumetric measures (called macular shape measures in this paper) can discriminate eyes with glaucoma from healthy eyes with clinically relevant accuracy. We speculate that these novel macular biomarkers could enhance automated detection of early central damage in patients with glaucoma above and beyond macular thickness measures.
Methods
Study Sample
One hundred thirty-five eyes from 133 patients with glaucoma and 155 eyes of 85 healthy subjects were enrolled in this study. In this dataset, 66 eyes of 66 patients were categorized as having mild perimetric damage, defined as visual field mean deviation (MD) ≥ –6 dB. The current study was carried out in accordance with the tenets of the Declaration of Helsinki and the Health Insurance Portability and Accountability Act (HIPAA) and was approved by the University of California Los Angeles (UCLA) Human Research Protection Program.
Inclusion criteria for eyes with glaucoma were: (1) clinical diagnosis of primary open-angle glaucoma, pseudoexfoliative glaucoma, pigmentary glaucoma, or primary angle-closure glaucoma; (2) age between 40 and 80 years; (3) best corrected visual acuity ≥20/50; and (4) no significant confounding retinal or neurological disease. All study eyes underwent a complete eye examination and macular imaging with Spectralis SD-OCT (Heidelberg Engineering, Heidelberg, Germany). A visual field (VF) defect was considered to be present if both of the following criteria were met: (1) Glaucoma Hemifield Test outside the normal limits or pattern standard deviation with P < 0.05; and (2) 4 abnormal points with P < 0.05 on the pattern deviation plot, both confirmed at least once. VF examinations with false-positive rates >15% were excluded from this study. Eligible healthy eyes were required to have normal VFs and normal optic disc photographs based on the review of two individual glaucoma specialists.
Macular OCT Imaging
We used the Posterior Pole Algorithm of the Spectralis SD-OCT for this study. This algorithm acquires 61 horizontal B-scans spanning a 30 degrees × 25 degrees wide area with activated eye tracker, parallel to the fovea-Bruch's membrane opening axis; each B-scan consists of 768 A-scans. The B-scans are repeated 9 to 11 times (automatic real-time [ART] averaging) to decrease speckle noise and improve image quality. Scans with a quality factor greater than 15 were considered for the analysis. The quality factor is a quality index provided by the proprietary software of Spectralis OCT and ranges from 0 (worst quality) to mid-40s (best quality) under clinical scenarios. We exported .vol files to a personal computer for further analysis by the morphometry algorithm as described below.
All analyses were carried out with a software suite developed by the co-authors (Nocturne GmbH, Berlin, Germany). The Nocturne software is CE-certified as a medical device (that is, the product meets the safety, health, and environmental protection requirements for sale in the European Economic Area; DIN EN ISO 13485:2016, class IM measurement software). The locally installable web application analyzes OCT images of the retina to measure thinning, swelling, and deformation of tissues. After an OCT volume scan is acquired, the user can obtain a report in a few minutes by opening the application on a local computer, uploading the scan, and triggering the required analysis. This deep learning-based software suite consists of a quality check algorithm (Automatic Quality Analysis [AQUA]) and a foveal morphometry algorithm (FOVEAM).
Quality Check With the AQUA Algorithm
The OCT quality check and morphometry analyses were carried out with the Nocturne software developed by the co-authors. Only images with a quality factor of >15 were included. In addition, OCT image quality was checked with a customized automated algorithm, AQUA, which is part of the Nocturne software suite and is based on deep learning. This model ensures that OCT volume scan data are of sufficient quality for subsequent model-driven 3D reconstruction of the fovea. The AQUA algorithm provides a fully standardized, grader-independent, and easily applicable real-time method for evaluation of macular OCT image quality (Fig. 1). AQUA has been validated against reference data labeled by expert reviewers at Berlin's Charité University Hospital and against the OSCAR-IB quality assessment algorithm, a reference standard for quality check of macular OCT images; AQUA was able to discriminate between good- and poor-quality scans (volume, radial, and ring patterns) with 97% accuracy. The AQUA method combines three main OSCAR-IB criteria: (1) foveal center detection, (2) signal quality (signal-to-noise ratio), and (3) image completeness artifacts (scan cuts). These features are evaluated before the fovea morphometry, and the morphometric analysis is performed only if the input volume is free from missing regions, is well-centered on the fovea, and has acceptable signal quality. AQUA predicts the quality of OCT volume scans based on a review of A-scans: if 98% or more of the A-scans have an image quality above the threshold (learned by a deep convolutional neural network [DCNN]), the input volume is included for further processing. AQUA uses parabola fitting in the foveal region for accurate center detection. AQUA not only checks image quality but also evaluates the quality of segmented data.
In our study, we used the segmentation of the upper (inner limiting membrane [ILM]) and lower (Bruch's membrane) boundaries of the retina as provided in the Spectralis .vol files; these two boundaries can be segmented more easily and with higher quality than the inner layers when the quality of OCT images is subpar. Before performing the foveal morphometry, we checked the accuracy of these two segmentation lines to avoid including any data with inappropriate segmentation. Similar to the image quality assessment, AQUA ensures that the segmentation lines are free from missing regions and high jumps and that the ILM and Bruch's membrane do not intersect each other.
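The two acceptance rules described above (the 98% A-scan rule and the segmentation sanity checks) can be sketched as follows. This is only an illustration, not the AQUA implementation: the function names, the `max_jump` tolerance, and the convention that boundary positions are depths in pixels (larger means deeper, `None` means missing) are assumptions made for this example.

```python
def volume_passes_quality(a_scan_quality, threshold, min_fraction=0.98):
    """Accept a volume if at least `min_fraction` of its A-scans exceed `threshold`.

    `a_scan_quality` is a flat list with one quality value per A-scan; in AQUA
    the threshold is learned by the network, here it is simply passed in.
    """
    good = sum(1 for q in a_scan_quality if q > threshold)
    return good / len(a_scan_quality) >= min_fraction


def segmentation_is_plausible(ilm, bm, max_jump):
    """Check ILM and Bruch's membrane (BM) lines along one B-scan.

    Rejects missing values, jumps larger than `max_jump` between neighbouring
    A-scans (hypothetical tolerance), and any point where the ILM is not
    strictly above BM (assumed depth convention: larger value = deeper).
    """
    if any(v is None for v in ilm) or any(v is None for v in bm):
        return False
    for line in (ilm, bm):
        for previous, current in zip(line, line[1:]):
            if abs(current - previous) > max_jump:
                return False
    return all(top < bottom for top, bottom in zip(ilm, bm))


# Hypothetical quality values: 99 of 100 A-scans are above the threshold.
accepted = volume_passes_quality([1.0] * 99 + [0.0], threshold=0.5)
```

Keeping the two checks separate mirrors the text: image quality is judged per A-scan, whereas segmentation adequacy is judged per boundary line.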
Figure 1.
(A) The output from the AQUA algorithm provides information on image quality and segmentation adequacy. AQUA uses a deep learning approach to assess volume scan quality. AQUA accesses each A-scan in the input volume and projects the corresponding quality measure on the SLO image in different colors; for example, sufficient and insufficient signal quality are shown in green and yellow, respectively. Furthermore, upper and lower image cutoffs are represented in red and orange colors, respectively. The interpolation feature will generate artificial B-scans if one to three nonconsecutive B-scans have low signal quality. (B) The FOVEAM algorithm utilizes macular OCT volume scans to create a model-driven 3D reconstruction of the central 4 mm of the macula and estimates various shape and volume measures (see text). T, S, N, and I represent the temporal, superior, nasal, and inferior macula, respectively.
Macular Shape Analysis
Two-dimensional (2D) and three-dimensional (3D) data from the macular OCT volume scans can assess structural changes of the central macula, including the perifoveal area, that may correlate with presence of glaucoma or disease progression. Yadav et al. first proposed a method for parametric modeling of the central macular topography and volume from OCT data, which uses a cubic Bézier polynomial and least square optimization alongside landmark features of the macula to create a best-fit parametric model of the foveal topography within the central 4 mm of the macula. Specifically, the proposed model uses a least square optimization to fit the cubic Bézier polynomial to interior and exterior regions of the fovea based on stable and reliable features of the foveal pit shape, including the lowest point of the pit and the highest point around the pit region. To generate a high-fidelity parameterization of the fovea, the interior and exterior regions are parametrized using different constraints based on the geometric properties of the macula. The foveal shape has two different intrinsic properties: maximum peaks at rim points and minimum depth at the center of the fovea. We utilized these two properties to enforce different constraints in different regions. The whole region was split between interior (between maximum and minimum points) and exterior (beyond the maximum points) areas. For parameterization of the interior region, the main constraint is that the end points must have horizontal tangent lines (because they are extreme points). For the exterior region, we used the same constraint on one side (the maxima points) but not on the other side, which is defined by the rim points. Because of the intrinsic shape differences, the constraints are different.
Using the above model-driven foveal shape analysis method developed by Yadav et al., we acquired thickness, volume, and fitting measures to accurately model and reconstruct the 3D profile of the perifoveal region. The 3D reconstruction exemplified in Figure 1B represents the macula of a patient's right eye.
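For reference, the tangent constraints described above can be written out for a generic cubic Bézier segment. This is the standard textbook form rather than the authors' exact objective function, which is not given in this section; the symbols $P_k$ (control points, with height component $P_{k,z}$), $q_i$ (points sampled on the inner retinal surface), and $t_i$ (their curve parameters) are introduced here only for illustration.

```latex
\begin{aligned}
B(t) &= (1-t)^3 P_0 + 3(1-t)^2 t\,P_1 + 3(1-t)\,t^2 P_2 + t^3 P_3, \qquad t \in [0,1],\\[2pt]
B'(0) &= 3\,(P_1 - P_0), \qquad B'(1) = 3\,(P_3 - P_2),\\[2pt]
\text{horizontal tangent at } t=0:&\quad P_{1,z} = P_{0,z}, \qquad
\text{horizontal tangent at } t=1:\quad P_{2,z} = P_{3,z},\\[2pt]
\min_{P_1,\,P_2}\ &\sum_i \bigl\lVert B(t_i) - q_i \bigr\rVert^2
\quad \text{subject to the tangent constraints of the region.}
\end{aligned}
```

Under this reading, an interior segment (pit minimum to rim maximum) would carry both tangent constraints, whereas an exterior segment would carry only the one at its maximum-point end.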
Foveal Morphometry With the FOVEAM Algorithm
The FOVEAM (fovea morphometry) algorithm calculates topographic and volumetric (shape) biomarkers based on a central 4 mm macular area (see Fig. 1B). The FOVEAM algorithm calculates several shape measures from raw macular volume scans (.vol files) as follows (Figs. 2, 3). The macular volume scans sample the central macula using a stack of linear B-scans aligned with the fovea-Bruch's membrane opening axis. Therefore, radial resampling of the volume scan is needed. The number of radial resampling curves around the fovea is an important variable: sampling that is too sparse could lead to loss of data, whereas sampling that is too dense lengthens the processing time for data extrapolation from the macular volume scans. Yadav et al. estimated the fidelity of the resampling approach with the L2-metric, which assesses the closeness between two geometrical shapes. They found that the resampling process using 12 or more radial lines on the Spectralis volume scans provided adequate resolution when compared to a ground truth of 48 radial scans (average error = 0.2 µm).
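Radial resampling of a raster-like height map can be sketched as below. This is a minimal illustration that assumes a regular grid of surface heights and uses bilinear interpolation; the interpolation scheme used by FOVEAM is not stated here, and all names, grid sizes, and spacings are hypothetical.

```python
import math


def bilinear(grid, x, y, dx, dy):
    """Interpolate grid[row][col] at position (x, y) in mm.

    Columns are spaced `dx` mm apart and rows `dy` mm apart. Positions outside
    the grid are linearly extrapolated from the nearest cell.
    """
    col, row = x / dx, y / dy
    c0 = min(max(int(math.floor(col)), 0), len(grid[0]) - 2)
    r0 = min(max(int(math.floor(row)), 0), len(grid) - 2)
    fc, fr = col - c0, row - r0
    top = grid[r0][c0] * (1 - fc) + grid[r0][c0 + 1] * fc
    bottom = grid[r0 + 1][c0] * (1 - fc) + grid[r0 + 1][c0 + 1] * fc
    return top * (1 - fr) + bottom * fr


def radial_profiles(grid, dx, dy, center, radius, n_lines=12, n_samples=41):
    """Sample `n_lines` diameters through `center`, each from -radius to +radius."""
    cx, cy = center
    profiles = []
    for k in range(n_lines):
        theta = math.pi * k / n_lines  # lines are diameters, so angles span 0..pi
        line = []
        for j in range(n_samples):
            s = -radius + 2 * radius * j / (n_samples - 1)
            x = cx + s * math.cos(theta)
            y = cy + s * math.sin(theta)
            line.append((s, bilinear(grid, x, y, dx, dy)))
        profiles.append(line)
    return profiles


# Hypothetical 5 mm x 5 mm height map sampled every 0.5 mm (a tilted plane),
# resampled on 12 radial lines covering the central 4 mm.
dx = dy = 0.5
grid = [[2.0 * (c * dx) + 3.0 * (r * dy) for c in range(11)] for r in range(11)]
profiles = radial_profiles(grid, dx, dy, center=(2.5, 2.5), radius=2.0)
```

A tilted plane is used as the test surface because bilinear interpolation reproduces it exactly, which makes the sampled values easy to check by hand.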
Figure 2.
Visualization of 2D measures on the central B-scan passing through the foveal pit.
Figure 3.
Volumetric measures of interest consist of the rim volume, pit volume, and inner rim volume. The rim volume (represented by the large blue cylinder) describes the total volume of the perifoveal retina delimited internally by the rim disk and externally by the Bruch's membrane–RPE complex, whereas the inner rim volume (the green cylinder) is defined as the volume of the retina between the Bruch's membrane–RPE complex and the internal limiting membrane within a 0.5 mm radius of the foveal center. The pit volume is the total volume between the rim disk, the highest points on the inner surface of the macula, and the inner surface of the foveal pit.
Measures of Interest
Table 1 summarizes the topographic and volumetric parameters derived from the FOVEAM algorithm. The current version of the algorithm is able to calculate both global measures and sectoral measures within four quadrants around the fovea (temporal superior [TS], temporal inferior [TI], nasal superior [NS], and nasal inferior [NI]). Supplementary Figure S1 provides the algorithm pipeline carried out to estimate the shape biomarkers from raw 3D OCT images.
Table 1.
Description of Central Macular Topographic and Volumetric Measures (Also See Figs. 2 and 3).
Rim disk area: Area represented by a spline fit on the maximum perifoveal points on each radial cut (reconstructed radial foveal cross-section) centered on the foveal center.
Rim disk diameter: Longest distance between opposite maximum perifoveal rim points among radial B-scans.
Rim height: Average of maximum perifoveal rim height measurements on 12 radial B-scans.
Average pit depth: Distance between the center of the foveal pit and the average rim height.
Central foveal thickness: Minimum distance between foveal pit's center and the underlying retinal pigment epithelium (RPE)-Bruch's membrane complex.
Slope disk area: The area derived from a spline fit of all maximum slope points on each reconstructed radial foveal cross-section.
Slope disk diameter: Longest distance among all the corresponding pairs of maximum slope points on radial cross-sections.
Pit flat disk area: Quantifies the flatness of the foveal pit surrounding the foveal center.
Pit flat disk diameter: Distance between two opposite points representing the widest area of the flat portion of foveal pit.
Volume Parameters
Rim volume: Total volume of the perifoveal rim delimited internally by the rim disk and externally by the BM-RPE complex (large blue cylinder, Fig. 3).
Inner rim volume: Volume of the fovea between the region of inner limiting membrane within 0.5 mm of the foveal center and the BM-RPE complex (green cylinder, Fig. 3).
Pit volume: Total volume between the rim disk and the inner surface of the foveal pit (yellow valley, Fig. 3).
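Some of the 2D definitions in Table 1 can be illustrated on radial profiles with a short sketch. It assumes each radial cut is a list of (position, height) pairs with position 0 at the foveal center and height measured above the BM-RPE complex, and it uses raw sampled maxima rather than the fitted Bézier model; the function name and the synthetic profile are hypothetical.

```python
def shape_measures(profiles):
    """Compute a few Table 1 style measures from radial cuts.

    `profiles` is a list of cuts; each cut is a list of (s_mm, height_mm)
    pairs with s = 0 at the foveal center.
    """
    rim_heights, rim_distances, center_heights = [], [], []
    for line in profiles:
        left = [p for p in line if p[0] < 0]
        right = [p for p in line if p[0] > 0]
        s_left, h_left = max(left, key=lambda p: p[1])     # rim point, one side
        s_right, h_right = max(right, key=lambda p: p[1])  # rim point, other side
        rim_heights.extend([h_left, h_right])
        rim_distances.append(s_right - s_left)
        center_heights.append(min(line, key=lambda p: abs(p[0]))[1])
    rim_height = sum(rim_heights) / len(rim_heights)
    central_thickness = sum(center_heights) / len(center_heights)
    return {
        "rim_height": rim_height,
        "rim_disk_diameter": max(rim_distances),
        "central_foveal_thickness": central_thickness,
        "average_pit_depth": rim_height - central_thickness,
    }


def synthetic_cut():
    """Synthetic pit: 0.23 mm thick at the center, 0.34 mm rim at s = +/-1 mm."""
    cut = []
    for j in range(31):
        s = (j - 15) / 10
        cut.append((s, 0.23 + 0.11 * s * s * (2 - s * s)))
    return cut


measures = shape_measures([synthetic_cut() for _ in range(12)])
```

For the synthetic cut, the rim height is 0.34 mm, the rim disk diameter 2.0 mm, the central foveal thickness 0.23 mm, and the average pit depth 0.11 mm.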
Statistical Analyses
We first explored the performance of each global and sectoral measure one at a time (univariable analyses) for glaucoma risk. Both linear logistic models and restricted cubic spline (RCS) logistic models were fit for each individual measure. Because the shape measures are correlated, we used a Gradient Boost logistic model (GBM) using all potential predictors simultaneously to evaluate the relative influence of each potential predictor and determine the best subset of the predictors to retain. The subset of predictors that accounted for most of the relative influence was retained. A conventional logistic model allowing two-way interactions was subsequently fit using the variables selected by GBM. The model combines the variables into a logit score through the following formula (showing only 3 potential predictors):

$$\text{logit score} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_1 x_2 + \beta_5 x_1 x_3$$

where x1, x2, and x3 are the selected variables, β0 is the intercept, and β1, β2, and β3 are the corresponding weights of the variables. In the equation above, β4 and β5 are the weights of the interactions between variables 1 and 2 and between variables 1 and 3, respectively. The odds and risk (P = proportion) of glaucoma can be computed from the logit score as odds = O = exp(logit score) and risk = P = O/(1 + O).
We also included a random subject effect to allow for non-independence between the two eyes of subjects. The Akaike information criterion (AIC) was used to determine which interaction terms were retained in the final logistic model. We report primary outcome measures from the logistic model, including sensitivity, specificity, and AUROC with their corresponding 95% confidence interval (CI) after 5-fold cross-validation. We repeated the same approach to determine the performance of the shape measures for discriminating between mild glaucoma and healthy eyes.
We also investigated the performance of various machine learning (ML) models for discriminating glaucoma eyes from healthy eyes for both the entire dataset and for eyes with early glaucoma. We used a GBM, Naïve Bayes (NB), and a support vector machine (SVM) algorithm for this purpose. We fed the 40 topographic and volumetric parameters of interest to the ML models and carried out parameter tuning with a grid search; a 5-fold cross-validation was done for each ML model. We report the AUROC, sensitivity, specificity, and accuracy along with the corresponding 95% CI for each model.
All the analyses were performed with the software R version 4.0.4 (R Foundation for Statistical Computing, Vienna, Austria).
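The reported performance metrics can be illustrated with a small sketch that computes sensitivity, specificity, and the AUROC from predicted risks. The analyses in this study were run in R; the Python below is only an illustration. The risk cutoff of 0.5 and the toy data are assumptions, and the AUROC is computed with the pairwise (Mann-Whitney) formulation rather than any particular package routine.

```python
def sensitivity_specificity(risks, labels, cutoff=0.5):
    """Return (sensitivity, specificity); labels are 1 = glaucoma, 0 = healthy.

    The cutoff is a hypothetical choice; the text does not state which
    operating point was used.
    """
    tp = sum(1 for r, y in zip(risks, labels) if y == 1 and r >= cutoff)
    fn = sum(1 for r, y in zip(risks, labels) if y == 1 and r < cutoff)
    tn = sum(1 for r, y in zip(risks, labels) if y == 0 and r < cutoff)
    fp = sum(1 for r, y in zip(risks, labels) if y == 0 and r >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auroc(risks, labels):
    """Probability that a random glaucoma eye scores higher than a random
    healthy eye, counting ties as one half."""
    positives = [r for r, y in zip(risks, labels) if y == 1]
    negatives = [r for r, y in zip(risks, labels) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy predicted risks for three glaucoma eyes and three healthy eyes.
risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1]
labels = [1, 1, 1, 0, 0, 0]
sens, spec = sensitivity_specificity(risks, labels)
area = auroc(risks, labels)
```

In a cross-validated setting, the same functions would be applied to the held-out predictions from each fold.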
Results
Table 2 displays the clinical and demographic data for the study eyes. The mean (SD) age was 52.6 (13.6) and 66.7 (9.6) years (P < 0.001) and the average (SD) 24-2 visual field MD was –0.73 (1.77) and –8.24 (6.11) dB in the healthy and glaucoma groups, respectively (P < 0.001). There were no significant differences between the two groups with regard to gender and ethnicity.
Table 2.
Demographic Characteristics of the Healthy Subjects and the Subjects With Glaucoma

| Characteristic | Healthy (N = 155) | Glaucoma (N = 135) | P Value |
|---|---|---|---|
| Age (years), mean (SD) | 52.6 (13.6) | 66.7 (9.6) | < 0.001 |
| Gender: female | 55% | 60% | |
| Gender: male | 45% | 40% | 0.466 |
| Race: Caucasian | 32% | 54% | 0.988 |
| Race: African American | 15% | 15% | 0.988 |
| Race: Hispanic | 15% | 13% | 0.987 |
| Race: Asian | 38% | 18% | 0.988 |
| 24–2 MD (dB), mean (SD) | –0.73 (1.77) | –8.24 (6.11) | < 0.001 |
| Axial length (mm), mean (SD) | 23.8 (1.2) | 24.6 (1.4) | < 0.001 |
| Global rim height (mm), mean (SD) | 0.34 (0.01) | 0.31 (0.02) | < 0.001 |
| Global pit volume (mm³), mean (SD) | 0.25 (0.04) | 0.21 (0.04) | < 0.001 |
| Global pit depth (mm), mean (SD) | 0.11 (0.02) | 0.09 (0.02) | < 0.001 |
Table 3 describes the performance of each of the global shape measures for detection of glaucoma in the entire patient sample based on the linear and RCS logistic models. The best performing global measure in the linear logistic model was the global rim height (AUC = 0.814). The global rim height (AUC = 0.818) was also the best-performing measure based on the RCS logistic model for the entire patient group.
Table 3.
Performance of Global Topographic and Volumetric Parameters for Detection of Glaucoma Based on Univariable Linear and Restricted Cubic Spline Logistic Models for the Entire Dataset

| Global Predictor | Linear: Specificity | Linear: Sensitivity | Linear: Accuracy | Linear: AUC | RCS: Specificity | RCS: Sensitivity | RCS: Accuracy | RCS: AUC |
|---|---|---|---|---|---|---|---|---|
| Rim height | 82.1% | 68.9% | 75.5% | 0.814 | 82.1% | 68.9% | 75.5% | 0.818 |
| Pit depth | 62.8% | 82.2% | 72.5% | 0.769 | 62.8% | 82.2% | 72.5% | 0.769 |
| Rim volume | 66.7% | 77.0% | 71.9% | 0.764 | 66.7% | 77.0% | 71.9% | 0.758 |
| Pit volume | 67.9% | 74.1% | 71.0% | 0.739 | 67.9% | 74.1% | 71.0% | 0.739 |

Linear, linear logistic model; RCS, restricted cubic spline logistic model; AUC, area under the curve.
Overall, 40 global and sectoral variables were used in the GBM analysis. Based on the GBM analysis, the 3 top variables according to their relative influence were the TI rim height, NI pit volume, and TI pit depth, which together accounted for 53% of the total influence of all the shape measures (i.e., all other predictors together added 47%). The final optimal logistic model included the above 3 measures and 2-way interactions between TI rim height and NI pit volume and between TI rim height and TI pit depth. Table 4 provides the estimates, standard errors, and P values for the predictors in the final logistic model. The sensitivity, specificity, accuracy, and AUC (95% CI) of the final logistic model for discriminating between healthy eyes and eyes with glaucoma were 81.5% (95% CI = 76.6%, 91.9%), 89.7% (95% CI = 78.7%, 94.2%), 85.6% (95% CI = 84.1%, 87.1%), and 0.915 (95% CI = 0.882, 0.948), respectively. Figure 4 shows the glaucoma risk for a range of values of each predictor with the other 2 variables fixed at their first quartile, mean, and third quartile values. Lower values for TI rim height, NI pit volume, and TI pit depth were associated with a higher risk of glaucoma.
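The selection step described above, keeping the predictors that account for most of the relative influence, can be sketched as follows. The stopping rule (stop once the retained predictors reach half of the total influence) is an assumption made for this example, and the variable names and influence values are invented placeholders, not results from this study.

```python
def top_predictors(relative_influence, min_share=0.5):
    """Return the highest-ranked predictors whose cumulative share of the
    total relative influence first reaches `min_share`, plus that share."""
    total = sum(relative_influence.values())
    ranked = sorted(relative_influence.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        if cumulative / total >= min_share:
            break
        selected.append(name)
        cumulative += value
    return selected, cumulative / total


# Invented relative-influence values for ten placeholder measures (sum = 100).
influence = {
    "measure_a": 25.0, "measure_b": 16.0, "measure_c": 12.0, "measure_d": 10.0,
    "measure_e": 9.0, "measure_f": 8.0, "measure_g": 7.0, "measure_h": 6.0,
    "measure_i": 4.0, "measure_j": 3.0,
}
selected, share = top_predictors(influence)
```

With these placeholder values, three measures are retained and together carry 53% of the total influence.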
Table 4.
The Coefficients, Standard Errors, and P Values for the Predictors in the Final Optimal Logistic Model for Prediction of Glaucoma Based on Topographic and Volumetric Markers for the Entire Glaucoma Group and Separately for Eyes With Mild Glaucoma

| Predictor | Entire Glaucoma Eyes: Estimate | SE | P Value | Mild Glaucoma Eyes: Estimate | SE | P Value |
|---|---|---|---|---|---|---|
| TI rim height (mm) | 84.6 | 58.8 | 0.1500 | 60.4 | 57.4 | 0.2927 |
| NI pit volume (mm³) | 482.5 | 247.3 | 0.0511 | 334.4 | 257.0 | 0.1933 |
| TI pit depth (mm) | 200.8 | 124.2 | 0.1059 | 186.3 | 114.3 | 0.1030 |
| TI rim height × NI pit volume | –1710.3 | 776.0 | 0.0275 | –1219.6 | 807.5 | 0.1310 |
| TI rim height × TI pit depth | –703.4 | 389.4 | 0.0709 | –663.0 | 363.0 | 0.0678 |

TI, temporal inferior; NI, nasal inferior; SE, standard error.
• Logit score for entire dataset = –20.9 + 84.6 (TI rim height) + 482.5 (NI pit volume) + 200.8 (TI pit depth) – 1710.3 (TI rim height × NI pit volume) – 703.4 (TI rim height × TI pit depth).
• Logit score for eyes with mild glaucoma = –14.1 + 60.4 (TI rim height) + 334.4 (NI pit volume) + 186.3 (TI pit depth) – 1219.6 (TI rim height × NI pit volume) – 663.0 (TI rim height × TI pit depth).
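As a worked illustration, the logit score reported for the entire dataset can be turned into a risk with the odds relationship given in the Statistical Analyses section. The coefficients below are copied from the formula above; the two sets of input values are hypothetical, chosen only to show the direction of the effect, and the published coefficients are rounded, so the output should not be read as a validated risk estimate.

```python
import math

# Coefficients of the logit score reported for the entire dataset.
INTERCEPT = -20.9
B_RIM, B_VOL, B_DEPTH = 84.6, 482.5, 200.8
B_RIM_X_VOL, B_RIM_X_DEPTH = -1710.3, -703.4


def glaucoma_logit(ti_rim_height, ni_pit_volume, ti_pit_depth):
    """Logit score from TI rim height (mm), NI pit volume (mm^3), TI pit depth (mm)."""
    return (INTERCEPT
            + B_RIM * ti_rim_height
            + B_VOL * ni_pit_volume
            + B_DEPTH * ti_pit_depth
            + B_RIM_X_VOL * ti_rim_height * ni_pit_volume
            + B_RIM_X_DEPTH * ti_rim_height * ti_pit_depth)


def risk_from_logit(score):
    """risk = O / (1 + O) with O = exp(logit score)."""
    odds = math.exp(score)
    return odds / (1.0 + odds)


# Hypothetical eyes: one with larger and one with smaller shape measures.
logit_larger = glaucoma_logit(0.34, 0.06, 0.11)
logit_smaller = glaucoma_logit(0.31, 0.05, 0.09)
risk_larger = risk_from_logit(logit_larger)
risk_smaller = risk_from_logit(logit_smaller)
```

For these hypothetical inputs, the eye with the smaller rim height, pit volume, and pit depth receives the higher predicted risk, in line with the direction of effect reported in the text.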
Figure 4.
Effect of each of the three shape predictors identified by the gradient boost model (temporal inferior [TI] rim height, nasal inferior [NI] pit volume, and temporal inferior pit depth) on the glaucoma risk, whereas the other two predictors are fixed at their first quartile (Q1), mean, and third quartile (Q3).
Figure 5.
Receiver operating characteristic (ROC) curves for glaucoma prediction models for the entire dataset (red curve, AUC = 0.915), prediction of mild glaucoma using four selected parameters based on repeat analyses (green curve, AUC = 0.913) and based on parameters derived from the entire dataset (blue curve, AUC = 0.896).
We repeated all the above steps to build a new logistic model based on the healthy eyes and the eyes with early perimetric glaucoma, including only eyes with visual field MD ≥ –6 dB. The GBM selected TS rim height, NI pit volume, TI pit depth, and TS pit depth as the top variables, with an overall relative influence (RI) of 50%. Table 5 displays the estimates, standard errors, and P values for this logistic model using only eyes with early perimetric glaucoma. The logistic model included the interactions between TS pit depth and TS rim height and between TI pit depth and TS rim height. The sensitivity, specificity, accuracy, and AUC (95% CI) were 84.8% (95% CI = 72.1%, 95.5%), 85.8% (95% CI = 87.1%, 97.4%), 85.3%, and 0.913 (95% CI = 0.867, 0.958), respectively. When we applied the logistic model derived from the entire glaucoma group, which included TI rim height, NI pit volume, TI pit depth, and the relevant interactions (see Table 4), to eyes with mild glaucoma and healthy eyes only, the sensitivity, specificity, accuracy, and AUC (95% CI) were 87.9% (95% CI = 68.2%, 95.5%), 76.8% (95% CI = 71.6%, 96.8%), 82.3%, and 0.896 (95% CI = 0.850, 0.942), respectively. Figure 5 represents the receiver operating characteristic (ROC) curves for models used for discriminating normal eyes from glaucoma eyes and from early stage glaucoma eyes.
Table 5.
The Coefficients, Standard Errors, and P Values for the Predictors in the Final Optimal Logistic Model for Prediction of Mild Perimetric Glaucoma (Visual Field Mean Deviation ≥ –6 dB) Based on the Top Four Topographic and Volumetric Markers Selected by the Gradient Boost Model
Predictor                       Estimate     SE         P Value
(Intercept)                     9.77         10.73      0.3622
TS rim height (mm)              –13.50       34.48      0.6953
NI pit volume (mm³)             –70.21       17.85      0.0001
TI pit depth (mm)               1356.51      427.07     0.0015
TS pit depth (mm)               –1139.56     402.76     0.0047
TS pit depth × TS rim height    3758.79      1258.32    0.0028
TI pit depth × TS rim height    –4502.45     1333.19    0.0007
TI, temporal inferior; NI, nasal inferior; TS, temporal superior; SE, standard error.
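The Table 5 coefficients can be turned into a predicted probability by evaluating the linear predictor (logit score) and passing it through the logistic function. The following is a minimal sketch, assuming the modeled outcome is glaucoma (coded 1) and that the predictors are entered in the units shown in the table. The function names and the input values are hypothetical and serve only to show the mechanics; they are not measurements from the study.

```python
import math

# Coefficients copied from Table 5 (mild perimetric glaucoma model).
COEF = {
    "intercept": 9.77,
    "ts_rim_height": -13.50,
    "ni_pit_volume": -70.21,
    "ti_pit_depth": 1356.51,
    "ts_pit_depth": -1139.56,
    "ts_pit_depth_x_ts_rim_height": 3758.79,
    "ti_pit_depth_x_ts_rim_height": -4502.45,
}


def logit_score(ts_rim_height, ni_pit_volume, ti_pit_depth, ts_pit_depth):
    """Linear predictor: main effects plus the two interaction terms."""
    return (
        COEF["intercept"]
        + COEF["ts_rim_height"] * ts_rim_height
        + COEF["ni_pit_volume"] * ni_pit_volume
        + COEF["ti_pit_depth"] * ti_pit_depth
        + COEF["ts_pit_depth"] * ts_pit_depth
        + COEF["ts_pit_depth_x_ts_rim_height"] * ts_pit_depth * ts_rim_height
        + COEF["ti_pit_depth_x_ts_rim_height"] * ti_pit_depth * ts_rim_height
    )


def predicted_probability(score):
    """Logistic (sigmoid) function, written to avoid overflow for large |score|."""
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    e = math.exp(score)
    return e / (1.0 + e)


# Hypothetical input values (mm, mm^3), NOT taken from the study.
score = logit_score(ts_rim_height=0.30, ni_pit_volume=0.05,
                    ti_pit_depth=0.10, ts_pit_depth=0.10)
prob = predicted_probability(score)
print(f"logit score = {score:.4f}, predicted probability = {prob:.3f}")
```

Because the interaction coefficients are large relative to the main effects, small changes in rim height or pit depth can shift the score substantially, so the sign of a single main-effect coefficient should not be read in isolation.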
Logit score = 9.77 – 13.50 (TS rim height) – 70.21 (NI pit volume) + 1356.51 (TI pit depth) – 1139.56 (TS pit depth) + 3758.79 (TS pit depth × TS rim height) – 4502.45 (TI pit depth × TS rim height).
The performance of the three ML models (GBM, NB, and SVM) is provided in Table 6. For the entire dataset, GBM demonstrated the highest performance for discriminating eyes with glaucoma from healthy eyes. The AUCs for the classification task were 0.947 (95% CI = 0.921, 0.973), 0.838 (95% CI = 0.782, 0.895), and 0.845 (95% CI = 0.800, 0.890) for GBM, NB, and SVM, respectively. Similarly, for discriminating eyes with mild glaucoma from the healthy eyes, GBM showed the highest performance among all the ML models, with AUCs of 0.950 (95% CI = 0.918, 0.983), 0.838 (95% CI = 0.782, 0.895), and 0.845 (95% CI = 0.800, 0.890) for GBM, NB, and SVM, respectively.
Table 6.
Results of Three Machine Learning Approaches for Discriminating all Eyes With Glaucoma and Eyes With Early Glaucoma From Healthy Subjects
GBM, gradient boost model; SVM, support vector machine; CI, confidence interval.
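The AUC used to compare the models can be read as the probability that a randomly chosen glaucoma eye receives a higher model score than a randomly chosen healthy eye, with ties counted as one half. The sketch below implements that rank-based definition in plain Python. The scores are made-up illustrative values and the function name is hypothetical; this is not the analysis code used in the study.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Equivalent to the Mann-Whitney U statistic divided by the number of
    pairs; tied pairs contribute 0.5.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative model outputs only (higher = more suspicious for glaucoma).
glaucoma_scores = [0.9, 0.8, 0.7, 0.4]
healthy_scores = [0.6, 0.3, 0.2, 0.1]

auc = auc_from_scores(glaucoma_scores, healthy_scores)
print(f"AUC = {auc:.4f}")
```

The pairwise loop is quadratic in the number of eyes, which is acceptable for datasets of a few hundred eyes; a sort-based rank sum gives the same value more efficiently for larger samples.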
Discussion
The goal of this study was to assess whether the newly developed macular shape measures, which capture both topographical and volumetric features of the central perifoveal macular region, can discriminate eyes with perimetric glaucoma from healthy eyes. Our results demonstrate that these novel macular biomarkers can detect perimetric glaucoma with clinically relevant performance. Quadrant-based measures performed better overall than global measures. The best-performing model incorporated a combination of a few quadrant-based measures and their interactions. We found an AUC of 0.915 for the detection of glaucoma with the selected measures, namely TI rim height, NI pit volume, and TI pit depth. In addition, we explored three ML models (GBM, NB, and SVM) for identifying eyes with glaucoma. All showed good to very good performance for discriminating eyes with glaucoma, and more specifically eyes with early glaucoma, from healthy eyes, with GBM demonstrating the best performance (AUC of 0.948 and 0.950 for the entire dataset and mild glaucoma eyes, respectively).
Most prior studies found that macular thickness measurements derived from OCT imaging performed well for identifying glaucoma at different stages of the disease. These studies investigated both full macular thickness and inner retinal layers. However, macular thickness measurements have several shortcomings. Various artifacts may affect the segmentation of inner retinal layers, especially in more severe glaucoma or when the quality of the images is subpar. Another caveat is that measurement of perifoveal RGCs is particularly difficult even in eyes with good-quality images, as the inner retinal thickness rises from near zero at the center of the fovea to its peak around 700 to 1100 microns from the fovea. Thickness measurements may also be misleading in myopic eyes.
The novel proposed biomarkers are not dependent on the segmentation of individual retinal layers and may be especially helpful when accurate segmentation of the inner retinal layers cannot be achieved. We also hypothesize that such biomarkers may be helpful very early in the disease process, when subtle changes in thickness measurements in individual eyes may be difficult to ascertain due to the wide range of thickness measurements observed in OCT normative databases. The software suite developed by the co-authors is equipped with a built-in algorithm (AQUA), a deep learning model that automatically evaluates the quality of the macular volume scans. Hood et al. proposed that two main macular damage phenotypes may exist: shallow and diffuse damage versus localized and deep damage. The proposed novel shape measures would be particularly useful for defining macular phenotypes, and our team is exploring deep learning approaches to identify such phenotypes. We also speculate that the macular shape measures could be used to monitor disease progression in various stages of glaucoma.
The deep learning model providing novel 3D volumetric and topographic perifoveal biomarkers with an automated image quality check has no precedent in the OCT literature. Vercellin et al. developed an algorithm in MATLAB to provide ganglion cell complex (GCC) volume and reported very good performance in detecting glaucoma. However, the scan quality and segmentation accuracy needed to be checked manually. A recent study reported that 3D optic disc neuroretinal rim measures were able to detect glaucoma progression earlier than 2D thickness measurements. The Nocturne software suite does not rely on segmentation of inner macular layers or thickness measurements. Any artifacts within the macular volume scans can be detected by the AQUA software and, depending on their severity, the affected scans may subsequently be removed from the dataset.
Asaoka et al. recently demonstrated that a deep learning algorithm using pretraining performed better than an approach without pretraining. They reported an AUC of 94% for the new algorithm compared to 77% to 79% without the pretraining. The main difference between the study by Asaoka et al. and ours is that deep learning on raw images represents a black box that offers no insight into how the algorithm performs the task, whereas our proposed biomarkers specifically use predefined features of the perifoveal macula. The performance of both approaches seems to be clinically relevant.
The Nocturne algorithm provides an array of global and sectoral measures. Due to the collinearity observed between many measures (results not shown), a machine learning approach (GBM) was used to select the best measures to discriminate between eyes with glaucoma and healthy eyes. After introducing a loss function, GBM selected TI rim height, NI pit volume, and TI pit depth as the top three shape measures, with an AUC of 0.915 for discriminating eyes with glaucoma of various stages from healthy eyes. The same model still performed well for detection of mild perimetric glaucoma (AUC = 0.896). In addition, when subset selection was carried out using only eyes with mild glaucoma, the identified subset of predictive measures was quite similar, with a higher AUC (0.913). Therefore, the proposed 3D topographic and volumetric measures could be clinically relevant biomarkers for the detection of glaucoma at all stages once their performance is further validated in larger external datasets. Two of the three best sectoral measures belonged to the inferotemporal sector, which is well known to be the most common macular region where early signs of the disease tend to manifest.
Yadav et al. proposed cubic Bézier polynomial modeling and least-squares optimization of landmark features of the macula for parametric modeling of the perifoveal configuration and volume in a cohort of healthy eyes and eyes from patients with neuroinflammatory central nervous system disorders and optic neuritis. Their novel model is based on stable and reliable features of the foveal pit shape, including the lowest point of the pit and the highest retinal point around the foveal pit region. A recent study on foveal morphology showed the importance of accounting for artifacts seen on RGC probability maps in healthy subjects; the proposed modeling of the perifoveal region by Yadav et al. is an important step toward individualized structure-function maps of the macula, as further development of the current deep learning algorithms may help prediction of functional damage in the macular region.
The Nocturne software has an OCT reader module that can read the image data and other available information in OCT volume scans from different devices (Heidelberg Spectralis, Cirrus, Nidek, and Topcon) and different file types (.vol, .img, .dcm, .xml, and .fda). All these files contain, at the minimum, the raw imaging data (e.g., .img format) and potentially the segmentation information (for example, .vol format). If the data file has only the raw image information, the Nocturne segmenter module performs segmentation of the ILM and Bruch's membrane in the foveal region, and this segmentation information is used by the perifoveal shape analysis module. In this study, we used .vol files from the Heidelberg Spectralis OCT; these files contain the segmentation information for the upper (ILM) and lower (Bruch's membrane) retinal boundaries. Therefore, we did not use the segmenter module from Nocturne. However, the ILM and Bruch's membrane segmentation quality for each volume scan was checked using the automated quality analysis module (AQUA) from Nocturne. 
The foveal shape analysis was performed using the fovea morphometry module on the quality-checked data. Because cubic Bézier parametrization is a smooth representation of the foveal shape, the effect of segmentation noise is robustly removed. Furthermore, we have compared the accuracy of our method with current state-of-the-art methods and shown that it is capable of parameterizing different foveal shapes with high fidelity. A similar result was shown in a recent paper by Romero-Boscenes et al., who compared current state-of-the-art methods for foveal shape modeling in terms of accuracy and robustness.
Given the performance of the macular shape parameters, we propose that combining these parameters with macular thickness measures could potentially provide a stronger structural signal for detection of glaucoma, especially in eyes with early-stage glaucoma. Another potential benefit is the use of these novel macular phenotypes for categorizing different stages of glaucoma, which has been of ongoing interest in the glaucoma community.
Our results should be interpreted in light of potential shortcomings. A larger normal database is required to define the factors potentially affecting shape parameters. Specifically, race, gender, and axial length might affect the shape measures in healthy and glaucomatous eyes. The group of eyes with early glaucoma was fairly small, and the results need to be confirmed in a larger study and potentially in eyes with preperimetric glaucoma.
In conclusion, the proposed novel topographic and volumetric macular biomarkers were able to detect perimetric glaucoma across the severity spectrum with clinically relevant performance. Incorporating this model in clinical practice, once validated, could help optimize detection of glaucoma in early stages, especially in eyes with significant early central involvement. The shape biomarkers do not depend on segmentation of intraretinal layers and may prove to be helpful where segmentation of macular layers is suboptimal.
Authors: Kitiya Ratanawongphaibul; Edem Tsikata; Michele Zemplenyi; Hang Lee; Milica A Margeta; Courtney L Ondeck; Janice Kim; Billy X Pan; Paul Petrakos; Anne L Coleman; Fei Yu; Johannes F de Boer; Teresa C Chen Journal: Ophthalmol Glaucoma Date: 2021-03-25