Identifying the Retinal Layers Linked to Human Contrast Sensitivity Via Deep Learning.

Foroogh Shamsi^1, Rong Liu^1,2,3, Cynthia Owsley^2, MiYoung Kwon^1,2.

Abstract

Purpose: Luminance contrast is the fundamental building block of human spatial vision. Therefore contrast sensitivity, the reciprocal of contrast threshold required for target detection, has been a barometer of human visual function. Although retinal ganglion cells (RGCs) are known to be involved in contrast coding, it still remains unknown whether the retinal layers containing RGCs are linked to a person's contrast sensitivity (e.g., Pelli-Robson contrast sensitivity) and, if so, to what extent the retinal layers are related to behavioral contrast sensitivity. Thus the current study aims to identify the retinal layers and features critical for predicting a person's contrast sensitivity via deep learning.
Methods: Data were collected from 225 subjects including individuals with either glaucoma, age-related macular degeneration, or normal vision. A deep convolutional neural network trained to predict a person's Pelli-Robson contrast sensitivity from structural retinal images measured with optical coherence tomography was used. Then, activation maps that represent the critical features learned by the network for the output prediction were computed.
Results: The thickness of both the ganglion cell and inner plexiform layers, reflecting RGC counts, was significantly correlated with contrast sensitivity (r = 0.26 to 0.58, Ps < 0.001 across eccentricities). Importantly, the retinal layers containing RGCs were the critical features the network used to predict a person's contrast sensitivity (average R² = 0.36 ± 0.10).
Conclusions: The findings confirmed the structure-function relationship for contrast sensitivity while highlighting the role of RGC density in human contrast sensitivity.

Year: 2022; PMID: 35179554; PMCID: PMC8859491; DOI: 10.1167/iovs.63.2.27

Source DB: PubMed; Journal: Invest Ophthalmol Vis Sci; ISSN: 0146-0404; Impact factor: 4.799

Luminance contrast, the difference in intensity between light and dark regions of an image, is the fundamental building block of human pattern vision. It is contrast, not light intensity, that is the primary signal sent from the eye to the primary visual cortex. For this reason, when the visual system cannot receive or process the full range of contrast signals, the consequences for everyday activities such as reading, object/face recognition, visual search, walking, and driving are often devastating. Thus contrast sensitivity, the reciprocal of the contrast threshold required for target detection, has been a major barometer of visual function. Beyond this functional significance, the exact mechanism underlying behavioral contrast sensitivity has been of great interest to many scientists because it is assumed to reflect the fundamental properties of human visual processing. Studies have shown that behavioral contrast sensitivity is accounted for by three major factors: the eye's optics, such as optical aberrations and pupil size; the response and sampling properties of retinal neurons, such as the density of cones or ganglion cells; and the properties of cortical neurons, such as divisive normalization or bandpass spatial-frequency tuning. Importantly, it has been well established that contrast information is first encoded by the center-surround structure of retinal ganglion cell (RGC) receptive fields, where the photoreceptor signals are first converted into neural activity. For this reason, the human contrast sensitivity function (i.e., a plot of contrast sensitivity as a function of spatial frequency) has been modeled by the response properties of RGCs. Furthermore, RGC counts have been related to visual perimetry, which represents light sensitivity across the visual field, in patients with glaucoma. For example, Harwerth et al. showed that the light sensitivity threshold measured with visual perimetry is linearly related to either the thickness of the ganglion cell layer or RGC counts.

In light of the known involvement of the retina in contrast coding, here we examined the structure and function relationship for human contrast sensitivity by identifying retinal layers/features closely linked to behavioral contrast sensitivity via deep learning. More specifically, we addressed the following questions: whether knowing the structural properties of the retina allows us to predict a person's contrast sensitivity; whether any particular regions in the retina (e.g., ganglion cell layer or photoreceptor layer) would serve as critical features in the prediction of contrast sensitivity; and if so, how much of the variance in contrast sensitivity can be explained by the retinal structural information. To this end, the structural properties of the retina, such as each retinal layer and its thickness, were obtained with spectral-domain optical coherence tomography (OCT). We used a deep convolutional neural network (CNN) trained to decode Pelli-Robson contrast sensitivity measured in a person's central vision from OCT structural images. We then computed activation maps representing the critical features learned by the network for the output prediction.

To further validate our results, we also included a control experiment. Visual acuity (i.e., the ability to resolve fine spatial details) is another major barometer of visual function and is known to be largely limited by the photoreceptor mosaic and sampling. Interestingly, a recent study showed that the relationship between visual acuity and contrast sensitivity varies across different diagnosis groups including glaucoma, cataract, age-related macular degeneration (AMD), retinitis pigmentosa, and normal vision, indicating the dissociative nature of visual acuity and contrast sensitivity. For this reason, we undertook a control experiment in which we compared the critical layers/features learned by the CNN to decode visual acuity from OCT structural images with those used to decode contrast sensitivity. Given the separable nature of visual acuity and contrast sensitivity, we would expect the network to use different retinal layers/features to predict visual acuity and contrast sensitivity.

Methods

Participants

The study included a total of 225 subjects: 91 patients with primary open-angle glaucoma (mean age = 64.6 ± 8.4 years), 104 normally sighted adults (mean age = 43.8 ± 19.6 years), and 30 patients with AMD (mean age = 74.1 ± 5.6 years).

Glaucoma was clinically diagnosed and confirmed through medical records. The patients with primary open-angle glaucoma in the current study met the following three inclusion criteria: (1) glaucoma-specific changes of the optic nerve or a nerve fiber layer defect, in which the presence of the glaucomatous optic nerve was defined by masked review of optic nerve head photos by glaucoma specialists using previously published criteria; (2) a glaucoma-specific visual field defect, defined as a value on the glaucoma hemifield test from the Humphrey Field Analyzer outside normal limits; and (3) no history of other ocular or neurologic disease or surgery that caused visual field loss. The preperimetric glaucoma patient met inclusion criteria (1) and (3). The visual field test was performed with standard automated perimetry using SITA Standard 24-2 and 10-2 tests with a Humphrey Field Analyzer (Carl Zeiss Meditec, Inc., Dublin, CA, USA). Goldmann size III targets with a diameter of 0.43° were presented for 200 ms at a given test location in a grid on a white background (10 cd/m²). The pupil diameter for each eye was obtained from the Humphrey Field Analyzer. The average mean deviation obtained from the Humphrey Field Analyzer in glaucoma patients was −7.11 ± 9.31 dB for the right eye and −7.63 ± 7.59 dB for the left eye. Visual acuity was measured using Early Treatment Diabetic Retinopathy Study (ETDRS) charts and reported in logarithm of the minimum angle of resolution (logMAR). The mean visual acuity for glaucoma patients was 0.09 ± 0.16 logMAR (or approximately 20/25 Snellen equivalent) for the right eye and 0.08 ± 0.16 logMAR for the left eye. Contrast sensitivity was measured using Pelli-Robson contrast sensitivity charts (each letter spanned about 3° of visual angle at the viewing distance of 1 m). The mean log contrast sensitivity for glaucoma patients was 1.52 ± 0.28 for the right eye and 1.56 ± 0.21 for the left eye.

AMD was clinically diagnosed and confirmed through medical records. According to the Age-Related Eye Disease Study grading performed by our imaging specialist, our patients had either early/intermediate noncentral geographic atrophy, central geographic atrophy, or neovascular AMD. The mean visual acuity for AMD patients was 0.03 ± 0.17 logMAR (or approximately 20/20 Snellen equivalent) for the right eye and 0.07 ± 0.23 logMAR for the left eye. Their mean log contrast sensitivity was 1.55 ± 0.12 for the right eye and 1.54 ± 0.15 for the left eye.

In this study, normal vision was defined as best-corrected visual acuity better than or equal to 0.2 logMAR (or 20/32 Snellen equivalent) in each eye, with normal binocular vision (i.e., stereopsis) and no history of ocular or neurologic disease other than cataract surgery. For normal adults, the mean visual acuity was −0.04 ± 0.10 logMAR (or 20/20 Snellen equivalent) for the right eye and −0.04 ± 0.10 logMAR (or 20/20) for the left eye. The mean log contrast sensitivity was 1.78 ± 0.13 for the right eye and 1.75 ± 0.15 for the left eye. The average mean deviation was −0.2 ± 1.5 dB for the right eye and −0.7 ± 1.6 dB for the left eye.

All participants were native or fluent English speakers without known cognitive or neurologic impairments, confirmed by the Mini Mental Status Exam (≥25 score for those aged 65 and over). Proper refractive correction for the viewing distance was used. The experimental protocols followed the tenets of the Declaration of Helsinki and were approved by the Internal Review Board of the University of Alabama at Birmingham. Written informed consent was obtained from all subjects before the experiment and after explanation of the nature and possible consequences of the study.

Data Collection

Data were collected from subjects (see Participants section for details) as follows.

Visual acuity and contrast sensitivity. For each subject, Pelli-Robson contrast sensitivity and visual acuity were obtained for both eyes (see Participants section for details).

OCT image acquisition and preprocessing. The cross-sectional retinal images were acquired from both eyes with Spectralis Spectral-Domain Optical Coherence Tomography (Heidelberg Engineering GmbH, Heidelberg, Germany). The measurement was made in the macula, approximately 6 × 6 mm centered on the fovea (corresponding to the central 20° visual field). Using the high-resolution volume scan mode with an automatic real-time mean value of 15, 49 B-scans, each consisting of 1024 A-scans, were acquired for normal vision and glaucoma, and 73 B-scans were obtained for AMD. Any scan with a quality score less than 20 dB was excluded from the analysis. The Heidelberg Eye Explorer software (version 6.3.1.0) automatically segmented each B-scan image into 10 layers, and the segmentation data were obtained. The outer four layers, including the inner and outer photoreceptor (PR) segments and the layers above and below the retinal pigment epithelium (RPE), were combined into two layers, PR and RPE, respectively, based on functional similarity and distance, resulting in a total of eight segmented layers. We then performed rigid registration on the OCT images: the whole image was rotated about its center so that the left and right endpoints of Bruch membrane were at equal height. The centering was done by translating the image to the fovea center, which was determined as the point where the distance between the central internal limiting membrane and Bruch membrane is the shortest. Note that the original OCT image spanned about 6 mm in width, but after the image registration only the central 5 mm of the retina was used for further analysis. Also note that no additional adjustment was made for any possible change in image scale that may occur during image rotation. However, images with large offsets or rotation angles were excluded from the analysis to ensure data quality. Eccentricity segmentation separating the image into nine columns was also performed, with each column spanning 0.5 mm in width except for the central column (centered at the fovea), which spanned 1 mm. Thus layer segmentation combined with eccentricity segmentation divided the OCT image into a total of 72 subregions (eight layers × nine columns; see Fig. 1B).
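The eccentricity segmentation described above can be illustrated with a minimal sketch. The function names and the coordinate convention (millimeters from the fovea center, negative to the left) are assumptions made for illustration, not the authors' code; only the column widths, the eight layers, and the 72-subregion count come from the text.

```python
# Eight segmented layers, in the order listed in the Figure 1 caption.
LAYERS = ["RNFL", "GCL", "IPL", "INL", "OPL", "ONL", "PR", "RPE"]


def eccentricity_edges(side_columns=4, side_width_mm=0.5, center_width_mm=1.0):
    """Return column boundaries (mm from the fovea) for the central 5 mm."""
    half_center = center_width_mm / 2
    right = [half_center + i * side_width_mm for i in range(side_columns + 1)]
    left = [-edge for edge in reversed(right)]
    return left + right  # 10 edges -> 9 columns


def column_index(x_mm, edges):
    """Map a horizontal position (mm from the fovea) to a column index 0..8."""
    if not edges[0] <= x_mm <= edges[-1]:
        raise ValueError("position lies outside the analyzed 5 mm window")
    for i in range(len(edges) - 1):
        if x_mm < edges[i + 1]:
            return i
    return len(edges) - 2  # right-most edge belongs to the last column


edges = eccentricity_edges()
# Every (layer, column) pair is one subregion.
subregions = [(layer, col) for layer in LAYERS for col in range(len(edges) - 1)]
print(edges)            # boundaries from -2.5 mm to 2.5 mm
print(len(subregions))  # 8 layers x 9 columns
```

The four 0.5 mm columns on each side plus the 1 mm central column add up to the 5 mm window retained after registration.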

Figure 1.

(A) The architecture of the CNN. The CNN takes OCT B-scan images as the input and predicts Pelli-Robson contrast sensitivity or ETDRS visual acuity data as the output. (B) Image processing steps. (i) An example of an original OCT B-scan cross-sectional image with retinal layer segmentation. OCT images were first segmented into the following eight retinal layers: retinal nerve fiber layer (RNFL) containing the axons of ganglion cells; ganglion cell layer (GCL) containing ganglion cell bodies; inner plexiform layer (IPL) containing the dendritic structures of ganglion cells; inner nuclear layer (INL) containing bipolar cells, horizontal cells, amacrine cells, and Müller glial cell bodies; outer plexiform layer (OPL) containing neuronal synapses; outer nuclear layer (ONL) containing rod and cone granules; PR layer containing inner and outer segments of photoreceptors; and retinal pigment epithelium layer (RPE) containing pigmented cells. (ii) The OCT image after flattening/centering and eccentricity segmentation. Colored lines demarcate the segmented retinal layers and the white vertical lines indicate segmentation by eccentricity. (C) Activation maps. The regression activation maps are computed as a weighted sum of the feature maps (i.e., outputs) of the last convolutional layer. Image correction and adjustment were also performed on activation maps to better localize critical features. Solid lines demarcate the segmented layers and eccentricities.

For each subject, 10 B-scan OCT images were selected (five from each eye: five out of 49 B-scans for normal and glaucoma subjects and five out of 73 B-scans for AMD subjects). The selected B-scans included the B-scan centered at the fovea and four other B-scans to its left and right (two from each direction). After removing the B-scans with low quality scores (less than 20 dB), a total of 2030 pairs of contrast sensitivity/visual acuity and OCT image were obtained for data analysis. The left and right edges of each B-scan image were cropped to exclude possible invalid data points, which resulted in an image size of 496 pixels × 1016 pixels. The images from the left eyes were flipped horizontally to match the images from the right eyes.

CNN Architecture

A CNN was adopted to detail the structure and function relationship of the human retina. The CNN took the down-sampled OCT B-scan images as input and was trained to predict either Pelli-Robson contrast sensitivity or ETDRS visual acuity as output. In the current study, a pretrained VGG16 model with transfer learning was selected because of its generalizability to other datasets. The original VGG16 architecture includes 13 convolutional layers with 3 × 3 filter sizes and “ReLU” activation functions, five max-pooling layers, and three fully connected layers. We changed the model architecture to meet the requirements of our regression task and the nature of our dataset. The last three fully connected layers were replaced with a global average pooling layer, followed by a fully connected layer with a linear activation function to generate the output. Moreover, to avoid information loss due to the image size reduction throughout the network and to achieve activation maps with high resolution, we kept only the first max-pooling layer and removed the remaining max-pooling layers from the network architecture (see Fig. 1A). Because the first layers in CNN models extract more generic features and the last layers contain more specialized features of the training dataset, the weights of the first four convolutional layers were retained from the VGG16 model pretrained on the ImageNet dataset and frozen. The weights of the remaining layers in the network were fine-tuned using our dataset.
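To make the effect of removing the max-pooling layers concrete, the following sketch traces feature-map sizes through the modified network without any deep learning library. It assumes the standard VGG16 layout (3 × 3 convolutions with "same" padding, 2 × 2 pooling with stride 2) and assumes that "the first max-pooling layer" is the one after the first convolutional block; `trace_vgg16` and its parameters are illustrative names, not the authors' implementation.

```python
# (convolutions per block, filters) for the five blocks of standard VGG16.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]


def trace_vgg16(input_size=320, pooled_blocks=(1,), frozen_convs=4):
    """List layers with their output size, channel count, and trainability."""
    size, conv_index, layers = input_size, 0, []
    for block, (n_convs, filters) in enumerate(VGG16_BLOCKS, start=1):
        for _ in range(n_convs):
            conv_index += 1
            layers.append({
                "name": f"conv{conv_index}",
                "kind": "conv",
                "size": size,          # 'same' padding keeps the spatial size
                "channels": filters,
                "trainable": conv_index > frozen_convs,
            })
        if block in pooled_blocks:     # 2x2 max-pooling halves the size
            size //= 2
            layers.append({"name": f"pool{block}", "kind": "pool",
                           "size": size, "channels": filters,
                           "trainable": False})
    # Regression head: global average pooling, then one linear output unit.
    layers.append({"name": "global_average_pooling", "kind": "gap",
                   "size": 1, "channels": layers[-1]["channels"],
                   "trainable": False})
    layers.append({"name": "dense_linear", "kind": "dense",
                   "size": 1, "channels": 1, "trainable": True})
    return layers


modified = [l for l in trace_vgg16() if l["kind"] == "conv"]
original = [l for l in trace_vgg16(pooled_blocks=(1, 2, 3, 4, 5))
            if l["kind"] == "conv"]
print(modified[-1]["size"], original[-1]["size"])
```

Under these assumptions, a 320 × 320 input yields a 160 × 160 map at the last convolutional layer when only one pooling layer is kept, compared with 20 × 20 when all five are kept.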

Data Augmentation

The contrast sensitivity data were unbalanced: the number of eyes with poor contrast sensitivity (CS) or visual acuity (VA) (e.g., CS < 1.5 log units or VA > 0.2 logMAR) was much smaller than the number of eyes in the normal range. Thus we conducted data augmentation to increase the heterogeneity of the training dataset, which is known to help improve the generalization performance of trained models. For example, when contrast sensitivity was less than 1.5 log units or VA was greater than 0.2 logMAR, the corresponding OCT image was tripled in number by randomly rotating within 3° and vertically shifting within six pixels. Moreover, because the number of AMD subjects was about one third of that in each of the other two groups, we used data augmentation to handle the imbalanced training samples from different groups. Specifically, the number of training samples from the AMD group was tripled by randomly rotating the training images of this group within 3° or vertically shifting them within six pixels. Furthermore, to avoid overfitting, real-time data augmentation of all the training data was also performed during model training by randomly rotating the images within 6° and vertically shifting them within 20 pixels. All OCT images were down-sampled to 320 pixels × 320 pixels for the input of the CNN. Additionally, because the VGG16 model requires input images in RGB format (three channels) and our OCT images were in grayscale format (one channel), the images were converted to RGB by repeating the same image array on three channels, which resulted in images of size 320 × 320 × 3.
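The oversampling rule and the grayscale-to-RGB conversion can be sketched as below. This is an illustration under stated assumptions: "tripled" is read as the original image plus two transformed copies, rotation and shift are both sampled uniformly, and the record fields (`log_cs`, `logmar_va`, `group`) are hypothetical names. The sketch only samples the transform parameters; applying them to pixel data would need an image library.

```python
import random


def needs_oversampling(log_cs, logmar_va, group):
    """Poor CS/VA eyes and AMD eyes are the under-represented cases."""
    return log_cs < 1.5 or logmar_va > 0.2 or group == "AMD"


def sample_transform(rng, max_rotation_deg, max_shift_px):
    """Draw a random rotation (degrees) and vertical shift (pixels)."""
    rotation = rng.uniform(-max_rotation_deg, max_rotation_deg)
    shift = rng.randint(-max_shift_px, max_shift_px)
    return rotation, shift


def oversample(records, rng, copies=3):
    """Keep every record; add transformed copies of under-represented ones."""
    out = []
    for rec in records:
        out.append({**rec, "transform": (0.0, 0)})  # untouched original
        if needs_oversampling(rec["log_cs"], rec["logmar_va"], rec["group"]):
            for _ in range(copies - 1):
                out.append({**rec, "transform": sample_transform(rng, 3.0, 6)})
    return out


def gray_to_rgb(image):
    """Repeat a 2-D grayscale array on three channels (H x W x 3)."""
    return [[[px, px, px] for px in row] for row in image]


rng = random.Random(0)
records = [
    {"id": "a", "log_cs": 1.80, "logmar_va": 0.00, "group": "normal"},
    {"id": "b", "log_cs": 1.20, "logmar_va": 0.10, "group": "glaucoma"},
]
augmented = oversample(records, rng)
print(len(augmented))
```

The real-time augmentation mentioned in the text would use the same sampling with wider limits, for example `sample_transform(rng, 6.0, 20)`.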

Model Training and Hyperparameters

We trained five replicates of the model using five different splits of the data. Each split was obtained as follows: for each group of subjects, the data were randomly split by subject into train (70%), validation (15%), and test (15%) subsets. The overall train, validation, and test subsets were formed by pooling the corresponding subsets from all groups, which resulted in totals of 145, 29, and 29 subjects for training, validation, and testing, respectively. Subjects with missing contrast sensitivity/visual acuity information and eyes with poor B-scan quality (e.g., quality score less than 20 dB) were removed from the dataset before the splitting. For each replicate of the model, cross-validation was used to tune its hyperparameters independently. The CNNs were trained to minimize the mean squared error (MSE) between the predicted and true contrast sensitivity/visual acuity values. Note that although the hyperparameters were tuned for each model independently, the resulting values were the same: the RMSprop algorithm with an initial learning rate of 1e-4 and a learning rate decay of 1e-6 was used as the optimization algorithm, and the number of training epochs was set to 150. During training, the 10 selected B-scans for each subject (five for each eye) in the training and validation subsets were used for training and validation, respectively, whereas in the testing step, for each subject in the test subset, only two of the 10 selected B-scans were used (one for each eye, centered at the fovea).
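A subject-level split stratified by diagnosis group, as described above, might look like the following sketch. The group sizes, the rounding rule, and the use of the replicate index as the random seed are assumptions for illustration; splitting by subject rather than by image is the point being shown, since it keeps both eyes and all B-scans of one person in the same subset.

```python
import random


def split_subjects(subjects_by_group, seed, fractions=(0.70, 0.15, 0.15)):
    """Split subject IDs into train/val/test within each group, then pool."""
    rng = random.Random(seed)
    split = {"train": [], "val": [], "test": []}
    for group in sorted(subjects_by_group):
        ids = sorted(subjects_by_group[group])
        rng.shuffle(ids)
        n_val = round(fractions[1] * len(ids))
        n_test = round(fractions[2] * len(ids))
        n_train = len(ids) - n_val - n_test  # remainder goes to training
        split["train"] += ids[:n_train]
        split["val"] += ids[n_train:n_train + n_val]
        split["test"] += ids[n_train + n_val:]
    return split


# Hypothetical cohort: 20 subjects per group, IDs prefixed by group name.
cohort = {
    group: [f"{group}_{i:02d}" for i in range(20)]
    for group in ("glaucoma", "amd", "normal")
}
# Five replicates, each with its own random split.
splits = [split_subjects(cohort, seed) for seed in range(5)]
first = splits[0]
print(len(first["train"]), len(first["val"]), len(first["test"]))
```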

Activation Maps of the CNN

Gradient-weighted regression activation maps were used to highlight the regions in the OCT images that were important for the output prediction. A gradient-weighted regression activation map M was generated as the absolute value of the weighted sum of the feature maps from the last convolutional layer, where the weight of each feature map was the gradient of the predicted output with respect to the pixel values of that feature map, global-average-pooled over its height and width, as follows:

$$M = \left| \sum_{k} \alpha_k A^{k} \right|,$$

where $A^{k}$ is the kth feature map of the last convolutional layer and

$$\alpha_k = \frac{1}{N} \sum_{i} \sum_{j} \frac{\partial y}{\partial A^{k}_{ij}},$$

where $y$ is the predicted output, $A^{k}_{ij}$ is the (i, j) pixel value of $A^{k}$, and $N$ is the total number of pixels in the feature map. The computed activation map was then normalized between 0 and 1 by dividing all the pixels by the maximum pixel value. Note that this computation follows the same idea as the gradient-weighted class activation maps proposed by Selvaraju et al. However, here we used the abs(.) function instead of the ReLU because of the difference in the nature of our task (regression vs. classification; linear vs. softmax activation function of the output layer): in regression, pixels with both positive and negative values in the feature maps could be important for predicting the output, whereas in classification only the pixels with positive values in the class activation map are considered important for calculating the score of the corresponding class. Note that in our modified VGG16 architecture, the max-pooling layers of the original VGG16 architecture were removed (except for the first one) to avoid image size reduction and information loss and to achieve high-resolution activation maps. Each activation map (160 × 160 pixels) was resized back to the input size (496 × 1016 pixels).

Then, the segmentation and registration of the OCT images were applied to their corresponding activation maps, resulting in 72 subregions (eight layers and nine eccentricities) in the activation maps, where the activation value of each subregion was set to the mean of the activation values of all pixels in that subregion. To calculate the mean activation map of each model across its test subjects, we thresholded the normalized maps based on the activation values of the different subregions. Subregions with activation values more than one standard deviation above the mean value (across all subregions) were set to one, and the rest of the subregions were set to zero:

$$T(l, ecc) = \begin{cases} 1, & \text{activation}(l, ecc) > \mu + \sigma \\ 0, & \text{otherwise,} \end{cases}$$

where activation(l, ecc) is the value of the segmented map at layer l (l = 1, 2, …, 8) and eccentricity ecc (ecc = 1, 2, …, 9), and μ and σ are the mean and standard deviation of the activation values across all 72 subregions, respectively. These steps are shown in Figure 1C. These thresholded maps were then averaged over the test samples of each model (with prediction error less than 0.15) to obtain its corresponding mean activation map. Finally, the average activation map was obtained by averaging the mean activation maps across the five models. For visualization purposes, the average activation map was superimposed onto an OCT B-scan cross-sectional image as shown in Figure 1C. For all the experiments, data processing and analysis were conducted using both MATLAB (R2020b; The MathWorks Inc., Natick, MA, USA) and Python 3.7.
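The activation-map computation and the subregion thresholding described in this section can be sketched in plain Python. The sketch assumes the feature maps and the gradients of the output with respect to them have already been extracted from the network as nested lists, and it uses the population standard deviation for σ (the text does not say which estimator was used). All function names are illustrative.

```python
from statistics import mean, pstdev


def regression_activation_map(feature_maps, gradients):
    """|sum_k alpha_k * A^k|, normalized by its maximum value.

    feature_maps, gradients: lists of K two-dimensional lists (H x W);
    gradients[k][i][j] is d(output) / d(feature_maps[k][i][j]).
    """
    height, width = len(feature_maps[0]), len(feature_maps[0][0])
    n_pixels = height * width
    # alpha_k: gradient global-average-pooled over height and width.
    alphas = [sum(sum(row) for row in grad) / n_pixels for grad in gradients]
    raw = [
        [abs(sum(a * fmap[i][j] for a, fmap in zip(alphas, feature_maps)))
         for j in range(width)]
        for i in range(height)
    ]
    peak = max(max(row) for row in raw)
    return [[v / peak for v in row] for row in raw] if peak > 0 else raw


def threshold_subregions(activation):
    """Set subregions above mean + 1 SD to 1, all others to 0."""
    values = list(activation.values())
    cutoff = mean(values) + pstdev(values)
    return {key: 1 if value > cutoff else 0 for key, value in activation.items()}


def mean_thresholded_map(thresholded_maps):
    """Average binary maps: the rate at which each subregion is highlighted."""
    return {key: mean(m[key] for m in thresholded_maps)
            for key in thresholded_maps[0]}


# Toy example: two 2 x 2 feature maps with constant gradients (+1 and -2).
fmaps = [[[1, 2], [3, 4]], [[1, 0], [0, 1]]]
grads = [[[1, 1], [1, 1]], [[-2, -2], [-2, -2]]]
ram = regression_activation_map(fmaps, grads)

# Toy segmented map: 8 layers x 9 eccentricities, one strongly active region.
segmented = {(l, e): 0.1 for l in range(1, 9) for e in range(1, 10)}
segmented[(2, 4)] = 0.9
binary = threshold_subregions(segmented)
print(ram, sum(binary.values()))
```

Taking the absolute value rather than a ReLU keeps regions whose weighted contribution is negative, which matches the regression rationale given in the text.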

Results

Correlation Between the Retinal Layer Thickness and Foveal Contrast Sensitivity

We first examined whether the overall thickness of each retinal layer differs across diagnosis groups, reflecting a loss of a particular type of retinal neuron. Figure 2A compares the thickness of each retinal layer for the different subject groups. One-way analysis of variance and post-hoc pairwise comparisons with the Bonferroni correction were conducted on the mean thickness of the different layers to determine the significance of differences between subject groups. For each subject, the average of each layer across both eyes was taken as the thickness of the corresponding layer. As illustrated in Figure 2A, four observations stood out from this analysis. First, consistent with our prediction, the mean thickness of both the ganglion cell and inner plexiform layers was significantly lower for the glaucoma group than for the AMD and normal groups (Ps < 0.01), but there was no significant difference between the AMD and normal groups. Second, the mean thickness of the photoreceptor layer was significantly lower in both the AMD and glaucoma groups than in the normal group (Ps < 0.01). Third, the mean thickness of the retinal pigment epithelium layer was significantly higher in the AMD and glaucoma groups than in the normal group (Ps < 0.01). Fourth, the other layers, including the retinal nerve fiber, inner nuclear, outer plexiform, and outer nuclear layers, did not show any significant difference among groups (the last three are not shown in Fig. 2A). Furthermore, when we qualitatively compared, in a dual-axis plot, the average thickness of the ganglion cell layer of the healthy eyes in our study (mean age = 44.0 years) as a function of retinal eccentricity with the RGC density acquired from a histological study of the adult human retina (mean age = 34.0 years), we observed an excellent correspondence between the two across retinal eccentricities (Fig. 2B). This result was well aligned with previous findings showing that the thickness of the ganglion cell and inner plexiform layers (i.e., the RGC+ layer) is closely related to RGC counts/density.

Figure 2.

(A) Comparing retinal layer thickness among subject groups. The thickness of each retinal layer within the central 5 mm retina was compared for glaucoma (orange patch), AMD (gray patch), and normal vision groups (green patch). Each patch represents the 95% confidence interval. The box graphs for each layer show the results of one-way analysis of variance and multiple comparisons between the mean thickness of different groups. The significant differences between groups (P < 0.01) are indicated by **. (B) Correspondence between the RGC density and the ganglion cell layer thickness. The average thickness (green line) of the ganglion cell layer of healthy eyes obtained from the current study is plotted against the RGC density (black line) acquired from the histologic study of the human adult retina as a function of retinal eccentricity. (C) Correlation between a person's contrast sensitivity and individual retinal layer thicknesses. The heatmap represents the correlation coefficient between contrast sensitivity and individual retinal layer thickness for each subregion based on the eyes of all subjects.

The question then arises whether the overall thickness of each retinal layer is indeed correlated with a person's behavioral contrast sensitivity. Our correlation analysis showed that a person's contrast sensitivity was significantly correlated with the thickness of the ganglion cell layer and inner plexiform layer (the r value ranged from 0.26, 95% confidence interval [CI]: [0.17, 0.35], to 0.58, 95% CI: [0.51, 0.64], for all the eccentricities except for −0.5 to 0.5 mm; all Ps < 0.001). The thickness of RGC plus inner plexiform layer (RGC+IPL) is known to be highly correlated with RGC counts. As shown in Figure 2C, the correlation coefficient was noticeably higher at the ganglion cell layer in the retinal region between 1 and 2 mm eccentricity, which is consistent with the fact that the receptive fields of RGCs in the foveal region are laterally displaced. Also see Supplementary Table S1 for the correlation values at different layers and eccentricities. These results highlight the linkage between the thickness of RGC-related layers and human contrast sensitivity. Importantly, the fact that our results are well aligned with various previous findings further assured us of the quality of our OCT image acquisition and preprocessing.
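For readers who want to reproduce this kind of analysis, a Pearson correlation with an approximate 95% confidence interval can be sketched as follows. The Fisher z-transformation used for the interval is an assumption, since the text does not state how the confidence intervals were computed, and the sample size in the example is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transformation (n > 3)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Toy data: layer thickness (micrometers) vs. log contrast sensitivity.
thickness = [30.0, 35.0, 40.0, 45.0, 50.0]
log_cs = [1.35, 1.50, 1.50, 1.65, 1.80]
r = pearson_r(thickness, log_cs)

# Interval for a given r with a hypothetical sample of 406 eyes.
low, high = fisher_ci(0.58, 406)
print(round(r, 3), round(low, 2), round(high, 2))
```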

Using Deep Learning to Identify Structural Retinal Features Linked to Foveal Contrast Sensitivity

So far, we have shown how the thickness of the RGC layer is correlated with the contrast sensitivity measured by Pelli-Robson charts. In this section, we explored whether this association between the RGC layer and Pelli-Robson contrast sensitivity can be captured by a deep neural network trained to predict behavioral contrast sensitivity from retinal structural image data (i.e., OCT retinal images). If so, how much of the variance in behavioral contrast sensitivity can be explained by knowing retinal structural image data? To this end, we probed the critical features (layers) in the OCT images that the model used to predict contrast sensitivity. As shown in Figure 1A, the input of the CNN model is the macular OCT image and the output is Pelli-Robson contrast sensitivity measured in central vision (see Methods for more details on the network architecture and training). Note that the focus of this work is not to propose a regression model for predicting the behavioral contrast sensitivity value from retinal structure, but to take advantage of such a model to find the possible linkage between the two by probing the retinal features the model used for predicting behavioral contrast sensitivity. Therefore, good prediction performance on the test (unseen) samples is a prerequisite for ensuring the validity of the potential features. The model was evaluated using the test dataset and achieved high prediction performance: the average MSE was 0.03 ± 0.004 and the average mean absolute error (MAE) was 0.13 ± 0.011 for all subjects (both eyes) in the test subsets. It should be noted that the contrast range of the Pelli-Robson contrast sensitivity chart is from 0 to 2.25 log units. The chart uses the 10 Sloan letters at a constant size, and the letters are arranged in 16 triplets over eight lines, with each triplet of the same contrast level representing an increment of 0.15 log units (0.05 per letter). The MAE of 0.13 log units is therefore smaller than one triplet step on the chart.

Convinced by the prediction performance of the model, we then went on to probe whether there were any particular regions in the retina that served as critical features for the output prediction, that is, which regions in the OCT retinal image were being used to predict Pelli-Robson contrast sensitivity. To this end, we extracted gradient-weighted regression activation maps from the CNN models (see Methods for more details on activation maps). For simplicity, we use the term “activation map” throughout the article. To better localize critical features in activation maps, we performed image correction and segmentation on both OCT images and activation maps. Samples from each test dataset with prediction errors (|true CS − predicted CS|) less than 0.15 were included for computing the mean activation map of the corresponding model. We thresholded the individual activation maps based on the activation values of different subregions and calculated the average over these test samples (see Methods for more details on how we thresholded the individual maps and calculated the mean activation map of each model). Finally, we calculated the average of the mean activation maps from the five replicates of the model. For visualization purposes, the final average activation map was superimposed onto an OCT B-scan cross-sectional image (Fig. 3A(i)). This map shows the average rate at which each subregion was highlighted as a critical region in the individual thresholded activation maps (it can be viewed as the probability of each subregion being used as a critical feature for predicting contrast sensitivity). The original activation maps (without thresholding) for four individual subjects (for one selected eye) from each diagnosis group are shown in Figure 3A(ii) (the subjects were selected from the test samples of a randomly selected model among the five replicates). It is evident that, despite some variability across individual subjects, the pattern of activations is consistent between the average activation map and the individual subjects' activation maps.
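The evaluation metrics and the error-based selection of test samples described above can be expressed in a few lines. This is a minimal sketch with made-up values; the field names `true` and `pred` are hypothetical, and the 0.15 cutoff is the one stated in the text for contrast sensitivity.

```python
def mse(true, pred):
    """Mean squared error between true and predicted values."""
    return sum((t - p) ** 2 for t, p in zip(true, pred)) / len(true)


def mae(true, pred):
    """Mean absolute error between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


def select_accurate(samples, max_abs_error=0.15):
    """Keep test samples whose |true - predicted| is below the cutoff."""
    return [s for s in samples if abs(s["true"] - s["pred"]) < max_abs_error]


# Made-up log contrast sensitivity values for three test eyes.
samples = [
    {"eye": "s1_OD", "true": 1.50, "pred": 1.60},
    {"eye": "s1_OS", "true": 1.80, "pred": 1.70},
    {"eye": "s2_OD", "true": 1.20, "pred": 1.50},
]
true_cs = [s["true"] for s in samples]
pred_cs = [s["pred"] for s in samples]
kept = select_accurate(samples)
print(mse(true_cs, pred_cs), mae(true_cs, pred_cs), [s["eye"] for s in kept])
```

Only the activation maps of the retained samples would then enter the per-model average, so poorly predicted eyes do not contribute to the mean activation map.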

Figure 3.

(A) (i) Average activation map for contrast sensitivity. The contrast sensitivity activation map averaged across all mean activation maps of the five model replicates is illustrated. (ii) Activation maps of individual subjects. Individual activation maps of four subjects (for one selected eye) from each diagnosis group are arranged in each column: glaucoma, AMD, and normal vision. (B) Average activation map for visual acuity. The visual acuity activation map averaged across all mean activation maps of the five model replicates is shown.

Going back to our hypothesis, we expected that, considering the role of RGCs in encoding visual signals, the retinal layers containing RGCs would be the predominant features in predicting a person's contrast sensitivity. Consistent with our prediction, the layers that contain ganglion cell bodies (GCL) and their axons (retinal nerve fiber layer [RNFL]) received the highest activation, as shown in Figures 3A(i) and 3A(ii). These results suggested that the retinal layers containing ganglion cells are linked to Pelli-Robson contrast sensitivity. However, are these retinal layers uniquely linked to contrast sensitivity per se? We cannot rule out the possibility that these layers remain essential regardless of the type of visual function, considering that ganglion cells are the output neurons of the retina. We thus explored the retinal layers linked to visual acuity as a control experiment, capitalizing on the dissociative nature of visual acuity and contrast sensitivity at both behavioral and neurophysiological levels. We expected that, unlike for contrast sensitivity, the network would rely on a different retinal layer (e.g., the photoreceptor layer) for predicting visual acuity. To this end, CNNs with the same architecture as in the contrast sensitivity experiment (Fig. 1A) were trained using OCT images as the inputs and ETDRS visual acuity values as the outputs. 
The same five splits of training, validation, and test subsets used in the contrast sensitivity experiment were employed for training, validating, and testing the five replicates of the model for visual acuity prediction, respectively. To calculate the mean activation map of each model, test samples with prediction errors (|true VA − predicted VA|) less than 0.1 logMAR were considered. The average activation map (the average of the mean activation maps across the five replicates of the model) is illustrated in Figure 3B. Similar to the contrast sensitivity experiment, the trained model achieved good prediction performance on the test dataset. Specifically, the average MSE was 0.02 ± 0.005, and the average MAE was 0.09 ± 0.011 for all samples of the test subsets (i.e., unseen data samples). More importantly, consistent with our prediction, the photoreceptor layer turned out to be the most critical retinal layer linked to foveal visual acuity (Fig. 3B), whereas the ganglion cell layer received much less activation. This result is in accordance with the view that foveal visual acuity is largely limited by the properties of the photoreceptor mosaic or sampling, whereas contrast sensitivity is largely explained by the response properties of RGCs. To our knowledge, this is the first evidence supporting the dissociative nature of the two major visual functions, foveal visual acuity and foveal contrast sensitivity, at the retinal structural level. Note that although we elucidated the critical retinal layers linked to a person's contrast sensitivity, it is well established that other factors can affect the measured behavioral contrast sensitivity of a person, including the luminance condition (e.g., photopic or scotopic vision), the eye's optics, and cortical neurons, as mentioned in the Introduction. 
Therefore we calculated the R2 value between the predicted and true contrast sensitivity to further examine the amount of variance in behavioral contrast sensitivity explained by the properties of the retinal structure. Given our sample, the R2 value ranged from 0.23 (95% CI: [0, 0.46]) to 0.49 (95% CI: [0.26, 0.66]) for different models (see Supplementary Table S2 for the R2 values of the different models), and the average R2 value of 0.36 suggested that on average 36% of the variation in contrast sensitivity can be explained by simply knowing the retinal structure of the eye. This accountability is well in line with the estimates (i.e., R2 values of 0.30 to 0.46) reported in previous studies relating perimetric sensitivity and retinal structural measurements in glaucomatous vision. This level of accountability by the retinal structure is rather remarkable considering the various sources of variation in behavioral data, including optical, retinal, cortical, and cognitive factors.
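As a rough illustration of how the reported test-set metrics could be computed, the following Python sketch derives the MSE, MAE, and proportion of explained variance from paired true and predicted values. Several points here are assumptions rather than details taken from the study: the explained variance is taken to be the squared Pearson correlation, the confidence interval is obtained with a percentile bootstrap, and all function names are hypothetical.

```python
import numpy as np


def prediction_metrics(y_true, y_pred):
    """MSE, MAE, and squared Pearson correlation of predictions (sketch)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    r = np.corrcoef(y_true, y_pred)[0, 1]
    return {
        "mse": float(np.mean(residual ** 2)),
        "mae": float(np.mean(np.abs(residual))),
        # ASSUMPTION: explained variance taken as the squared Pearson r.
        "r2": float(r ** 2),
    }


def bootstrap_r2_interval(y_true, y_pred, n_boot=2000, level=0.95, seed=0):
    """Percentile-bootstrap interval for the squared correlation.

    ASSUMPTION: the interval method used in the study is not stated in this
    section; the percentile bootstrap is used here only for illustration.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rng = np.random.default_rng(seed)
    n = y_true.size
    stats = np.empty(n_boot)
    with np.errstate(invalid="ignore", divide="ignore"):
        for i in range(n_boot):
            # Resample subjects with replacement, keeping pairs together.
            idx = rng.integers(0, n, size=n)
            stats[i] = np.corrcoef(y_true[idx], y_pred[idx])[0, 1] ** 2
    alpha = (1.0 - level) / 2.0
    # nanquantile skips degenerate resamples with zero variance.
    low, high = np.nanquantile(stats, [alpha, 1.0 - alpha])
    return float(low), float(high)
```

With a test subset of modest size, a bootstrap interval of this kind tends to be wide, which is in keeping with the broad intervals reported for the individual models.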

Discussion

Contrast sensitivity, the ability to distinguish between an object and the background, is a foundation for human pattern vision. Behavioral contrast sensitivity is assumed to reflect the essential properties of human visual processing such as the eye's optics, the response characteristics of the retinal neurons, or the cortical neurons. Recently, a number of studies have used deep learning techniques to investigate the relation between retinal structure and visual sensitivity. Here, we aimed to identify the retinal layers/features underlying human contrast sensitivity via deep learning. First, we compared the thickness of the different retinal layers across diagnosis groups. Consistent with previous findings, we found that the GCL and IPL were significantly thinner in glaucoma patients than in the normal and AMD groups. Also, consistent with previous work, our results showed that the PR layer (PRL) was significantly thinner in the glaucoma and AMD groups than in the normal group. However, Matlach et al., using the Heidelberg retina angiograph imaging technique, showed that local measurements of cone density in patients with glaucoma do not differ significantly from those of healthy controls, despite large differences in RGC density, suggesting no thinning of the PRL in glaucomatous eyes. One reason for the difference between our results and those of Matlach et al. might be the age range of the subjects with normal vision, because the thickness of the PRL is known to decrease with aging. In our study, the average age of the normal subjects was 43.8 ± 19.6 years (the median age was 49 years), whereas in Matlach et al.'s study, the median age of the normal group was 57 years. A future study is called for to address the apparently discrepant findings with respect to thinning of the PRL in glaucomatous vision. 
Then, we examined the relation between Pelli-Robson contrast sensitivity and the thickness of the retinal layers. Our results showed that the thickness of the ganglion cell layer (where ganglion cell bodies are located) and inner plexiform layer (where the dendritic structures of ganglion cells are located) within the retinal region between 1 mm and 2 mm eccentricities exhibited the strongest correlation (r = 0.6, P < 0.01) with behavioral contrast sensitivity compared to other regions (Fig. 1C). This finding is consistent with the fact that RGC receptive fields responsible for processing foveal visual input are laterally displaced. The role of RGC counts/sampling density in behavioral contrast sensitivity has been implicated in previous studies. For example, Hess and Field hinted that sparse RGC sampling is likely to lead to decreased contrast sensitivity. Our recent work using a computational model has also demonstrated that RGC undersampling may in part explain the loss of foveal contrast sensitivity in glaucoma patients. RGC counts have also been related to visual perimetry, which represents light sensitivity across the visual field. In particular, a study by Harwerth et al. showed that light sensitivity measured with visual perimetry is linearly related to either the thickness of the ganglion cell layer or RGC counts. Taken together, these results underscore the linkage between RGCs and behavioral contrast sensitivity. In the current study, we used a deep learning technique to elucidate the structure-function relationship for Pelli-Robson contrast sensitivity measured in central vision. We probed the retinal features used by CNNs trained to predict a person's contrast sensitivity from OCT images. Activation maps provide a means of visualizing the regions of the OCT images that the CNN relies on to predict the output. 
Our activation maps indicated that the deep neural network learned and used information from the features of the retinal layer containing ganglion cells to predict Pelli-Robson contrast sensitivity. The pattern of results remained consistent across subjects regardless of their ocular pathology or age. Note that although the goal of this study was not to propose a model for predicting behavioral contrast sensitivity from retinal structure, we checked the prediction performance of the model to confirm the validity of our activation map results. In other words, the detected linkage between behavioral contrast sensitivity and the RGC layer is valid only if the model can reliably predict behavioral contrast sensitivity from OCT images. The results obtained from the test datasets (for all replicates of the model) showed that the CNN models were able to predict a person's contrast sensitivity with good precision (i.e., an average MAE of 0.13). Considering that the smallest contrast step that can be measured with the Pelli-Robson chart is either 0.15 (a triplet of the same contrast) or 0.05 (letter-by-letter scoring), an average MAE of 0.13 indicates good accuracy. Although we cannot rule out the presence of other factors that can affect a person's behavioral contrast sensitivity, including the eye's optics (e.g., age-related lens opacity and pupil size) and the properties of cortical neurons, our results indicated a linkage between Pelli-Robson contrast sensitivity and the RGC layer. In fact, the average R2 value between the true and predicted contrast sensitivity from OCT images (from test samples) indicates that, given our sample, on average 36% of the variance in behavioral contrast sensitivity can be explained by the retinal structure (i.e., the retinal layer containing ganglion cells). 
As mentioned earlier, retinal structure, such as the thickness of retinal layers or RGC counts, has been shown to be correlated with perimetric sensitivity in glaucoma. For example, Shafi et al. studied the relation between perimetric sensitivity and the neuroretinal rim area in patients with glaucoma, and their results suggested a linear relation between the two (R > 0.3, P < 0.005). Thus it is worth noting that the average R2 value (0.36) observed in our study is quite comparable to the values reported in the previous studies (0.30 to 0.46) despite obvious methodological differences between our study and those studies: subject group (glaucoma vs. normal vision, AMD, and glaucoma), measurement site (optic nerve head scan vs. macular scan), measurement method (visual field perimetry vs. foveal contrast sensitivity), and analysis method (correlation/regression analysis vs. deep neural network approach). Thus one major contribution of our current study is to further confirm the relationship between retinal structure and contrast sensitivity using different methods. On the other hand, our control experiment of decoding visual acuity from OCT images suggested that the neural network uses features from the PRL (Fig. 3B), which is in line with the view that foveal visual acuity is largely limited by photoreceptor sampling. It is worth mentioning that, unlike foveal visual acuity, peripheral visual acuity has been shown to be mostly limited by ganglion cell density. Moreover, the RNFL exhibited high activation for both contrast sensitivity and visual acuity prediction, as shown in Figures 3A(i) and 3B. The highest recurrence rate of the RNFL (over all eccentricities) is 0.80 in the contrast sensitivity average activation map (vs. a 0.73 recurrence rate for the GCL) and 0.61 in the visual acuity average activation map (vs. a 0.47 recurrence rate for the PRL). 
The relatively high activation of the RNFL might be due in part to age-related changes in the RNFL and their correlations with visual acuity and contrast sensitivity. Taken together, the results of the current study on predicting Pelli-Robson contrast sensitivity and ETDRS visual acuity suggest that the retinal structure containing the retinal GCL and its axons (RNFL) and the PRL are linked to two major barometers of human visual function (i.e., contrast sensitivity and visual acuity). We, however, acknowledge the limitations of our study. Deep CNN models usually require a large dataset to train a network with good generalizability. However, in the current study, only small datasets were available to train and evaluate the CNN model. We addressed this issue by using transfer learning, as well as data augmentation techniques that have been used extensively in previous studies, including medical applications. Furthermore, the lack of a public OCT dataset that includes Pelli-Robson contrast sensitivity information made it challenging to test the model on other datasets. Also, as shown in previous studies, we cannot rule out the possibility that various factors such as individual differences in axial length, gender, retinal disease, or age might have affected the accuracy of the OCT retinal thickness measurements. Furthermore, understanding the direct correspondence between a foveal stimulus and its underlying retinal structures poses intrinsic difficulties because of the lateral displacement of ganglion cell receptive fields in the fovea and other colocalization issues present in retinal nerve fibers. Although the current study focused on the structure and function relationship in central vision, the question of whether this relationship still holds true in peripheral vision (i.e., visual acuity and contrast sensitivity measured in peripheral vision) should be addressed in a future study. 
Although speculative, it is possible that the potential imprecision of our measurements might have increased the variance in foveal contrast sensitivity or visual acuity that cannot be explained by OCT structural measures in our model. While artificial intelligence has been applied successfully in screening, diagnosing, and monitoring ophthalmic diseases using retinal images, it is still challenging to understand and interpret how artificial intelligence makes a decision. Despite the stated challenges, our painstakingly careful image processing procedure and segmentation analysis enabled us to successfully analyze the activation of the CNN with respect to its prediction outcome, thereby localizing the retinal layers linked to foveal contrast sensitivity or visual acuity. We further showed that the information contained in the thickness of the RGC layers is critical to Pelli-Robson contrast sensitivity. This result is consistent with our earlier work, as well as previous reports, showing that the thickness of the ganglion cell layer and inner plexiform layer (i.e., RGC+IPL) is correlated with visual sensitivity. On the other hand, studies have also shown that the reflectivity (the pixel value in the OCT image) of the retina is associated with visual functions. We, however, did not find any evidence that retinal reflectivity contributes to the prediction of behavioral contrast sensitivity. In summary, our findings confirmed the structure and function relationship for human contrast sensitivity via deep learning while highlighting the role of RGC sampling density for behavioral contrast sensitivity.

