Nergis C Khan, Chandrashan Perera, Eliot R Dow, Karen M Chen, Vinit B Mahajan, Prithvi Mruthyunjaya, Diana V Do, Theodore Leng, David Myung.
Abstract
While color fundus photos are used in routine clinical practice to diagnose ophthalmic conditions, evidence suggests that ocular imaging also contains valuable information about the systemic health features of patients. These features can be identified through computer vision techniques, including deep learning (DL) artificial intelligence (AI) models. We aim to construct a DL model that can predict systemic features from fundus images and to determine the optimal method of model construction for this task. Data were collected from a cohort of patients undergoing diabetic retinopathy screening between March 2020 and March 2021. Two models based on the DenseNet201 architecture were created for each of 12 systemic health features: one utilizing transfer learning from ImageNet images and another utilizing transfer learning from 35,126 fundus images. In total, 1277 fundus images were used to train the AI models. Area under the receiver operating characteristic curve (AUROC) scores were used to compare model performance. Models pretrained on ImageNet were superior to those pretrained on retinal images (mean AUROC 0.78 vs. 0.65, p < 0.001). Models using ImageNet pretraining were able to predict systemic features including ethnicity (AUROC 0.93), age > 70 (AUROC 0.90), gender (AUROC 0.85), ACE inhibitor use (AUROC 0.82), and ARB medication use (AUROC 0.78). We conclude that fundus images contain valuable information about the systemic characteristics of a patient. To optimize DL model performance, we recommend that even domain-specific models consider using transfer learning from more generalized image sets to improve accuracy.
Keywords: artificial intelligence; diabetic retinopathy; retinal imaging; transfer learning
Year: 2022 PMID: 35885619 PMCID: PMC9322827 DOI: 10.3390/diagnostics12071714
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
Participant demographics. Patient age, sex, race, ethnicity, and comorbidity status are outlined below.
| Demographic Feature | N | Proportion of Dataset (%) |
|---|---|---|
| Unique participants | 760 | – |
| Total fundus images | 1277 | – |
| Right eyes | 650 | 50.9 |
| Left eyes | 627 | 49.1 |
| Sex | | |
| Male | 432 | 54.7 |
| Female | 358 | 45.3 |
| Age (years) | | |
| 20–29 | 23 | 2.9 |
| 30–39 | 59 | 7.5 |
| 40–49 | 130 | 16.5 |
| 50–59 | 196 | 24.8 |
| 60–69 | 203 | 25.7 |
| 70–79 | 126 | 15.9 |
| 80–89 | 46 | 5.8 |
| 90–99 | 7 | 0.9 |
| Race | | |
| Asian | 202 | 25.6 |
| African American/Black | 253 | 32 |
| White | 68 | 8.6 |
| Native American/Pacific Islander | 18 | 2.3 |
| Other/Unknown | 249 | 31.5 |
| Ethnicity | | |
| Hispanic/Latino | 173 | 21.9 |
| Non-Hispanic/Latino | 547 | 69.2 |
| Other/Unknown | 70 | 8.9 |
| Comorbidities | | |
| Cardiac Disease | 669 | 88 |
| Stroke | 584 | 76.8 |
| Hypertension | 696 | 91.6 |
| Diabetic Retinopathy | 90 | 11.8 |
Figure 1. Representative fundus images with age > 70 ground-truth labels and ImageNet-pretrained model classifications. Above each fundus image, the first row shows the ground truth extracted from the patient EMR and the second row shows the classification made by the ImageNet-pretrained AI model. Green and red indicate agreement and disagreement between the AI model and the ground truth, respectively. For example, the fundus image at the far right of the top row was correctly predicted to be from a patient under 70 years of age (the ground truth and the AI classification agree), whereas the fundus image at the far right of the second row was incorrectly predicted by the AI model.
Figure 2. Visual representation of the methodology used to construct both the ImageNet-pretrained and retinal image-pretrained DL models.
Figure 3. Representative receiver operating characteristic (ROC) curve for ImageNet-pretrained AI model classification of patient age > 70 (yellow line). Note the area under the ROC curve, which gives the achieved AUROC. The performance of a hypothetical random classifier (AUROC = 0.5) is represented by the blue dashed line.
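To make the AUROC in the caption above concrete, here is a minimal sketch of one standard way to compute it: as the fraction of positive/negative pairs in which the positive example receives the higher score (ties count half). The labels and scores are hypothetical values chosen for illustration; they are not data from the study, and the excerpt does not state which implementation the authors used.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = age > 70) and model scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
example_auroc = auroc(labels, scores)  # 8 of 9 pairs are ordered correctly
```

A classifier that assigns the same score to every image gets 0.5 under this definition, which corresponds to the dashed random-classifier line in the figure.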
Figure 4. Five systemic features for which the ImageNet-pretrained AI model achieved the highest classification accuracy based on AUROC. Dark blue represents the ImageNet-pretrained model; light blue represents the retinal image-pretrained model.
ImageNet-pretrained AI model performance. Achieved area under the receiver operating characteristic curve (AUROC), optimized F1 score, sensitivity, and specificity are listed. Systemic features are ordered by descending AUROC.
| Systemic Feature | AUROC | Optimized F1 Score | Sensitivity | Specificity |
|---|---|---|---|---|
| Ethnicity | 0.926 | 0.871 | 0.86 | 0.886 |
| Age > 70 | 0.902 | 0.873 | 0.862 | 0.869 |
| Gender | 0.852 | 0.758 | 0.742 | 0.774 |
| Medication—ACEi | 0.815 | 0.804 | 0.811 | 0.791 |
| Medication—ARB | 0.783 | 0.707 | 0.7 | 0.708 |
| LDL | 0.766 | 0.718 | 0.694 | 0.714 |
| HDL | 0.756 | 0.711 | 0.692 | 0.722 |
| Smoking status | 0.732 | 0.697 | 0.632 | 0.713 |
| HbA1c | 0.708 | 0.669 | 0.638 | 0.634 |
| Cardiac disease | 0.7 | 0.669 | 0.625 | 0.598 |
| Medication—Aspirin | 0.696 | 0.681 | 0.673 | 0.685 |
| Hypertension | 0.687 | 0.695 | 0.643 | 0.623 |
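The sensitivity, specificity, and F1 columns above all depend on the decision threshold applied to the model's output score. The sketch below shows how these quantities can be computed at a given threshold, and how a threshold might be chosen to maximize F1. How the authors obtained their "optimized F1 score" is not described in this excerpt, so the selection rule, the function names, and the toy data here are all assumptions.

```python
def confusion_metrics(labels, scores, threshold):
    """Return (sensitivity, specificity, f1) when predicting 1 for score >= threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0
    return sensitivity, specificity, f1


def best_f1_threshold(labels, scores):
    """Try each observed score as a threshold and keep the one with the highest F1."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: confusion_metrics(labels, scores, t)[2])


# Hypothetical labels and scores, for illustration only.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
threshold = best_f1_threshold(labels, scores)
sensitivity, specificity, f1 = confusion_metrics(labels, scores, threshold)
```

On this toy data the F1-optimal threshold is 0.4, which catches every positive at the cost of one false positive. A threshold tuned for F1 need not be the one that best balances sensitivity and specificity.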
Comparison of ImageNet-pretrained and retinal image-pretrained model performance. The achieved area under the receiver operating characteristic curve (AUROC) is listed for each of the 12 systemic features. The difference in mean AUROC across all features between the two models was statistically significant (p < 0.001).
| Systemic Feature | AUROC of ImageNet Pre-Trained Model | AUROC of Retinal Image Pre-Trained Model |
|---|---|---|
| Gender | 0.852 | 0.576 |
| Medication—ARB | 0.783 | 0.542 |
| Smoking Status | 0.732 | 0.528 |
| Medication—ACEi | 0.815 | 0.612 |
| LDL | 0.766 | 0.624 |
| Hypertension | 0.687 | 0.585 |
| HDL | 0.756 | 0.667 |
| Cardiac Disease | 0.7 | 0.623 |
| HbA1c | 0.708 | 0.64 |
| Age > 70 | 0.902 | 0.84 |
| Medication—Aspirin | 0.696 | 0.638 |
| Ethnicity | 0.926 | 0.907 |
| Mean | 0.777 | 0.648 |
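The mean AUROCs can be recomputed from the twelve rows of the table, and the paired difference can be checked with a simple test. The sketch below uses an exact paired sign-flip permutation test. This is an assumption made for illustration, because the excerpt does not say which statistical test the authors used. The AUROC values are copied from the table; the dictionary keys are shortened labels.

```python
from itertools import product

# AUROC values copied from the table above (ImageNet vs. retinal pretraining).
imagenet = {
    "Gender": 0.852, "Medication-ARB": 0.783, "Smoking status": 0.732,
    "Medication-ACEi": 0.815, "LDL": 0.766, "Hypertension": 0.687,
    "HDL": 0.756, "Cardiac disease": 0.7, "HbA1c": 0.708,
    "Age > 70": 0.902, "Medication-Aspirin": 0.696, "Ethnicity": 0.926,
}
retinal = {
    "Gender": 0.576, "Medication-ARB": 0.542, "Smoking status": 0.528,
    "Medication-ACEi": 0.612, "LDL": 0.624, "Hypertension": 0.585,
    "HDL": 0.667, "Cardiac disease": 0.623, "HbA1c": 0.64,
    "Age > 70": 0.84, "Medication-Aspirin": 0.638, "Ethnicity": 0.907,
}

mean_imagenet = sum(imagenet.values()) / len(imagenet)
mean_retinal = sum(retinal.values()) / len(retinal)

# Paired differences, one per systemic feature.
diffs = [imagenet[k] - retinal[k] for k in imagenet]

# Exact sign-flip permutation test (assumed test, not necessarily the authors'):
# under the null hypothesis each difference is equally likely to be + or -.
observed = abs(sum(diffs))
extreme = 0
for signs in product((1, -1), repeat=len(diffs)):
    if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed - 1e-12:
        extreme += 1
p_value = extreme / 2 ** len(diffs)
```

The recomputed means are about 0.78 and 0.65, in line with the abstract. Because the ImageNet-pretrained model has the higher AUROC on all 12 features, only the all-positive and all-negative sign assignments are as extreme as the observed one, giving p = 2/4096, which is below 0.001.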
Figure 5. Five systemic health features for which the classification accuracy differed the most between the two AI models based on AUROC.
Figure 6. Gradient activation map. This map shows which regions of the image the AI model attends to at various CNN layer depths. The original fundus image is on the far left. The second image shows the areas receiving the greatest attention in the middle layers of the model. The third image shows the regions of the fundus image receiving the most attention in the final layer of the model. The scale on the far right indicates the per-pixel degree of model attention, from most to least.
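The caption does not name the exact method used to produce the map. One widely used gradient-based technique is Grad-CAM, sketched below on a toy example as an assumption rather than a description of the authors' pipeline. Each feature-map channel is weighted by the spatial mean of the class-score gradient for that channel, the weighted channels are summed, negative values are clipped, and the result is scaled to [0, 1]. The activations and gradients are hypothetical 2x2 arrays; in practice they would come from a chosen layer of the trained network.

```python
def grad_cam(activations, gradients):
    """Grad-CAM-style heatmap from K channels of HxW activations and gradients."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = spatial mean of that channel's gradient.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    cam = [[0.0] * width for _ in range(height)]
    for w, channel in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += w * channel[i][j]
    # ReLU: keep only regions that push the class score up.
    cam = [[max(v, 0.0) for v in row] for row in cam]
    # Scale to [0, 1] so the map can be drawn against a fixed color bar.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Hypothetical activations and gradients for two 2x2 channels.
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
heatmap = grad_cam(activations, gradients)
```

Applying the same computation to a middle layer and to the final convolutional layer would yield maps at different depths, comparable to the second and third panels described in the caption.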
Fundus images with corresponding electronic medical record (EMR) feature data. For each of the 12 systemic features of interest, the number of fundus images, out of the complete set of 1277, with corresponding EMR information available is listed.
| Systemic Feature | Images with Corresponding Patient Data | Images without Corresponding Patient Data |
|---|---|---|
| Ethnicity | 1182 | 95 |
| Gender | 1277 | 0 |
| LDL | 1129 | 148 |
| HDL | 291 | 986 |
| Smoking status | 1247 | 30 |
| Age > 70 | 1277 | 0 |
| Cardiac disease | 1277 | 0 |
| HbA1c | 1183 | 60 |
| Hypertension | 1277 | 0 |
| Medication—ARB | 1277 | 0 |
| Medication—ACEi | 1277 | 0 |
| Medication—Aspirin | 1277 | 0 |
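The table above determines how many labeled images each feature-specific model could draw on. A short sketch, using the counts exactly as they appear in the table, computes the fraction of the 1277 images with EMR data for each feature and lists any row whose two counts do not add up to the full set. The variable names are illustrative.

```python
TOTAL_IMAGES = 1277

# (images with EMR data, images without EMR data), copied from the table above.
coverage = {
    "Ethnicity": (1182, 95),
    "Gender": (1277, 0),
    "LDL": (1129, 148),
    "HDL": (291, 986),
    "Smoking status": (1247, 30),
    "Age > 70": (1277, 0),
    "Cardiac disease": (1277, 0),
    "HbA1c": (1183, 60),
    "Hypertension": (1277, 0),
    "Medication-ARB": (1277, 0),
    "Medication-ACEi": (1277, 0),
    "Medication-Aspirin": (1277, 0),
}

# Fraction of all images that carry a label for each feature.
fractions = {name: with_data / TOTAL_IMAGES for name, (with_data, _) in coverage.items()}

# Rows whose two counts do not sum to the full image set.
mismatched = [name for name, (w, wo) in coverage.items() if w + wo != TOTAL_IMAGES]

# Feature with the least labeled data.
least_covered = min(fractions, key=fractions.get)
```

With these counts, HDL has the lowest coverage (291 of 1277 images, roughly 23%), and the HbA1c row is the only one whose counts do not sum to 1277 as listed (1183 + 60 = 1243).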