Denis Corbin, Frédéric Lesage.
Abstract
Accumulation of beta-amyloid in the brain and cognitive decline are considered hallmarks of Alzheimer's disease. Knowing from previous studies that these two factors can manifest in the retina, the aim was to investigate whether a deep learning method could predict an individual's cognition from an RGB image of their retina and metadata. A deep learning model, EfficientNet, was used to predict cognitive scores from the Canadian Longitudinal Study on Aging (CLSA) database. The proposed model explained 22.4% of the variance in cognitive scores on the test dataset using fundus images and metadata. Metadata alone proved more effective in explaining the variance in the sample (20.4%) than fundus images alone (9.3%). Attention maps highlighted the optic nerve head as the most influential feature in predicting cognitive scores. The results demonstrate that RGB fundus images are limited in predicting cognition.
Year: 2022 PMID: 35388080 PMCID: PMC8986784 DOI: 10.1038/s41598-022-09719-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Descriptive statistics of predicted variables for each subset of data.
| Characteristics | Training set, Mean | Training set, Std. | Validation set, Mean | Validation set, Std. | Testing set, Mean | Testing set, Std. |
|---|---|---|---|---|---|---|
| Age (years) | 60.93 | 9.57 | 60.93 | 9.46 | 61.02 | 9.54 |
| Body mass index (A.U.) | 28.06 | 5.44 | 27.91 | 5.19 | 28.14 | 5.50 |
| Systolic BP (mmHg) | 120.35 | 16.43 | 119.93 | 16.45 | 120.46 | 16.47 |
| Diastolic BP (mmHg) | 74.49 | 9.89 | 74.26 | 9.76 | 74.48 | 9.82 |
| Executive function (A.U.) | 42.72 | 12.45 | 42.86 | 12.58 | 42.84 | 12.61 |
| Speed (A.U.) | 59.42 | 14.42 | 59.40 | 14.14 | 59.40 | 14.84 |
| Memory (A.U.) | 53.93 | 15.73 | 53.80 | 15.69 | 53.95 | 15.97 |
| Inhibition (A.U.) | 60.78 | 11.83 | 60.98 | 11.63 | 60.87 | 11.72 |
| Global cognition (A.U.) | 38.97 | 11.86 | 38.98 | 11.74 | 39.01 | 12.20 |
| Right eyes (%) | 54.27 | – | 54.02 | – | 54.11 | – |
| Male (%) | 49.29 | – | 49.44 | – | 49.35 | – |
Differences between the CLSA subset and similar works aiming to predict cognition-related measurements.
| Characteristics | CLSA (ours) | Venugopalan et al. | Thanh Duc et al. | Oyama et al. |
|---|---|---|---|---|
| Modality | Multimodal (Fundus and metadata) | Multimodal (MRI, EHR, SNP) | rs-fMRI | NIRS |
| Number of images | 202 | 331 | 202 | |
| Age (years) | N/A | 74.3 (4.70) | 73.4 (13.0) | |
| Sex (% male) | 50.66% | N/A | 49.54% | 43.07% |
| Predicted variable(s) | Global cognition, inhibition, memory, speed, executive function, sex, APOE4, BMI, SBP, DBP and age | AD status | MMSE scores and AD status | MMSE scores |
Values from cohorts other than CLSA are reported from corresponding papers. Mean and standard deviation are reported for numerical values.
Effect of architectures on the mean absolute error (MAE) and the coefficient of determination R2.
| Metrics on CLSA | Baseline | EfficientNet-b3 (Corbin et al., ours) | InceptionV3 (Poplin et al.) | MobileNetV2 (Gerrits et al.) |
|---|---|---|---|---|
| Age: R2 (95% CI) | 0.00 | 0.693 (0.678–0.706) | 0.707 (0.687–0.726) | |
| Age: MAE, years (95% CI) | 7.19 | 3.72 (3.61–3.84) | 3.68 (3.57–3.79) | |
95% CIs on predicted-age metrics were calculated with 2000 bootstrap samples, each the size of the test set. Baseline metrics are obtained by predicting the mean value for all individuals.
Figure 1. Predicted variables from regression tasks with corresponding coefficient of determination R2. Predictions are based on fundus images alone, on metadata alone, or on both fundus images and metadata for cognitive scores. Error bars correspond to 95% CIs on R2, calculated with 2000 bootstrap samples.
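The bootstrap procedure described above (2000 resamples, each the size of the test set, yielding 95% CIs on R2) can be sketched as follows. This is a minimal illustration under the stated setup, not the authors' code; the helper names `r2` and `bootstrap_ci` are invented here:

```python
import numpy as np

def r2(y_true, y_pred):
    """Coefficient of determination R2."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def bootstrap_ci(y_true, y_pred, metric, n_boot=2000, alpha=0.05, seed=0):
    """95% CI of a metric from bootstrap resamples the size of the test set."""
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = [metric(y_true[idx], y_pred[idx])
             for idx in (rng.integers(0, n, size=n) for _ in range(n_boot))]
    return tuple(np.quantile(stats, [alpha / 2, 1 - alpha / 2]))
```

Note that predicting the mean value for every individual yields R2 = 0 exactly, which is why the baseline column in the table above reads 0.00.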
Hyperparameters used in the training of the proposed solution.
| Hyperparameters | Value |
|---|---|
| Loss function | Mean absolute error (MAE) for regression tasks; binary cross-entropy (BCE) for classification tasks |
| Optimizer | Stochastic gradient descent (SGD) |
| Momentum | 0.9 |
| Learning rate (LR) | 0.001 |
| Scheduler | Cosine annealing with 0.0001 max LR |
| Batch size | 32 |
| Epochs | 100 |
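Under cosine annealing, the learning rate decays smoothly from its maximum to a minimum over the training run. A minimal sketch of the standard schedule implied by the table; the exact pairing of 0.001 and 0.0001 as maximum and minimum is an assumption here:

```python
import math

def cosine_annealing_lr(epoch, total_epochs=100, lr_max=0.001, lr_min=0.0001):
    """Standard cosine annealing: lr_max at epoch 0, decaying to lr_min at the final epoch."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total_epochs))
```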
Figure 2. Predicted variables from classification tasks with corresponding AUC. Predictions are based on fundus images alone or on both fundus images and metadata. The effect of predicting at the image level versus the individual level is also illustrated.
Figure 3. Saliency maps highlighting where the network focuses when predicting age (1) and global cognition (2). Column (a) is the input image, (b) the saliency map, and (c) the overlay of (a) and (b). Pink regions on the saliency maps are those with the greatest influence on the prediction, while blue regions had lower importance. Highlighted regions were noted from 100 randomly selected images. For global cognition, the ONH was highlighted in 94% of sampled images, the background in 44%, the vessels in none (0%), and non-specific regions (non-specific features, edge of retinal scans, etc.) in 24%. Regarding age, the ONH was highlighted in 1% of images, the background in 2%, the vessels in 89%, and non-specific regions in 9%.
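Saliency maps of this kind are, at their core, gradient magnitudes of the model output with respect to input pixels; deep learning frameworks compute them exactly with autograd. The idea can be illustrated with finite differences on a toy model (the function, weights, and helper name below are invented for illustration, not the paper's method):

```python
import numpy as np

def finite_diff_saliency(f, x, eps=1e-4):
    """Approximate |df/dx| per pixel via central differences, normalized to [0, 1]."""
    sal = np.zeros_like(x, dtype=float)
    for idx in np.ndindex(x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += eps
        xm[idx] -= eps
        sal[idx] = abs(f(xp) - f(xm)) / (2 * eps)
    return sal / sal.max() if sal.max() > 0 else sal

# Toy "model": responds strongly to the centre pixel, weakly to one corner.
w = np.zeros((5, 5))
w[2, 2] = 10.0
w[0, 0] = 1.0
sal = finite_diff_saliency(lambda img: float(np.sum(w * img)), np.random.rand(5, 5))
```

For this linear toy model the saliency map is proportional to |w|, so the centre pixel dominates, mirroring how the ONH dominates the cognition maps above.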
Modified architecture of the main network.
| Layer type | Output shape |
|---|---|
| Input—retinal fundus | (600, 600, 3) |
| Pretrained network (InceptionV3, MobileNetV2 or EfficientNetB3) | (1000) |
| Fully connected | (500) without metadata subnet; (750) with metadata subnet |
| Batch normalization | |
| ReLU | |
| Dropout (p = 0.2) | |
| Output—fully connected | (1) |
Architecture of the metadata subnet.
| Layer type | Output shape |
|---|---|
| Input—metadata | (303) |
| Fully connected | (500) |
| Batch normalization | |
| ReLU | |
| Dropout (p = 0.2) | |
| Fully connected | (250) |
| Batch normalization | |
| ReLU | |
| Dropout (p = 0.2) |
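The two tables above can be combined into a single model: the 1000-d image features from the pretrained backbone are concatenated with the 250-d metadata embedding before the fully connected head. A minimal PyTorch sketch assuming the layer shapes in the tables; `backbone` stands in for the pretrained EfficientNet-B3/InceptionV3/MobileNetV2 (any module mapping a (B, 3, 600, 600) image to 1000 features), and all class names are illustrative:

```python
import torch
import torch.nn as nn

class MetadataSubnet(nn.Module):
    """Metadata branch per the table: 303 -> 500 -> 250, with BN/ReLU/Dropout."""
    def __init__(self, in_features=303):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 500), nn.BatchNorm1d(500), nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(500, 250), nn.BatchNorm1d(250), nn.ReLU(), nn.Dropout(0.2),
        )

    def forward(self, x):
        return self.net(x)

class FundusRegressor(nn.Module):
    """Main network per the table: backbone features (1000), optionally
    concatenated with metadata features (250), then a FC head to one output."""
    def __init__(self, backbone, use_metadata=True):
        super().__init__()
        self.backbone = backbone
        self.metadata_subnet = MetadataSubnet() if use_metadata else None
        head_in = 1000 + (250 if use_metadata else 0)
        head_out = 750 if use_metadata else 500
        self.head = nn.Sequential(
            nn.Linear(head_in, head_out), nn.BatchNorm1d(head_out),
            nn.ReLU(), nn.Dropout(0.2),
            nn.Linear(head_out, 1),  # output: fully connected, shape (1)
        )

    def forward(self, image, metadata=None):
        feats = self.backbone(image)
        if self.metadata_subnet is not None:
            feats = torch.cat([feats, self.metadata_subnet(metadata)], dim=1)
        return self.head(feats)
```

For classification tasks, the same architecture would end in a sigmoid with BCE loss rather than the scalar regression output.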