Omolola I Ogunyemi, Meghal Gandhi, Martin Lee, Senait Teklehaimanot, Lauren Patty Daskivich, David Hindman, Kevin Lopez, Ricky K Taira.
Abstract
OBJECTIVE: Clinical guidelines recommend annual eye examinations to detect diabetic retinopathy (DR) in patients with diabetes. However, timely DR detection remains a problem in medically underserved and under-resourced settings in the United States. Machine learning that identifies patients with latent/undiagnosed DR could help to address this problem.
Keywords: artificial intelligence; diabetic retinopathy; diabetic retinopathy diagnosis; machine learning; safety net providers
Year: 2021 PMID: 34423259 PMCID: PMC8374369 DOI: 10.1093/jamiaopen/ooab066
Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531
Clinical variables for patients with diabetes obtained from LACDHS EHR system
| Socio-demographic variables | | |
|---|---|---|
| Age | Race | Ethnicity |
| Sex | Marital status | Insurance status |
| General health overview | | |
| Diabetes diagnosis date | Date of last eye examination | Pregnancy status |
| Previous diabetic retinopathy treatment | Smoking status | Insulin dependence |
| Clinical measurements | | |
| Body mass index | Diastolic blood pressure | Fasting glucose level |
| Blood urea nitrogen | Systolic blood pressure | HDL |
| Hemoglobin | Hemoglobin A1C | Triglycerides |
| Comorbid conditions | | |
| Peripheral vascular disease | Hypertension | Stroke |
| Depression | Obesity | Nephropathy |
| Dyslipidemia | Neuropathy | Erectile dysfunction |
| Condition of interest | | |
| Diabetic retinopathy diagnosis | | |
Included (or derived variable included) in 14-variable feature subset used for ML.
Converted to “duration of diabetes in years” and “time since last eye exam in months”.
Dropped due to >35% missing data.
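The two preprocessing notes above (converting dates to elapsed durations, and dropping variables with more than 35% missing data) can be illustrated with a minimal sketch. The source does not describe how these steps were implemented, so the function names, the 365.25-day year, the whole-month convention, and the example column names (`var_a`, `var_b`) are all assumptions made for illustration.

```python
from datetime import date

# Threshold taken from the table note: variables with >35% missing data were dropped.
MISSING_THRESHOLD = 0.35


def years_between(start, end):
    """Elapsed time in years, assuming an average year length of 365.25 days."""
    return (end - start).days / 365.25


def months_between(start, end):
    """Whole calendar months elapsed from start to end (assumed convention)."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months


def missing_fraction(records, column):
    """Share of records in which the column is absent or None."""
    missing = sum(1 for r in records if r.get(column) is None)
    return missing / len(records)


def columns_to_keep(records, columns, threshold=MISSING_THRESHOLD):
    """Keep only columns whose missing fraction does not exceed the threshold."""
    return [c for c in columns if missing_fraction(records, c) <= threshold]


# Hypothetical records; var_a is missing in 1 of 4 rows, var_b in 2 of 4.
records = [
    {"var_a": 1.0, "var_b": None},
    {"var_a": 2.0, "var_b": 5.0},
    {"var_a": None, "var_b": None},
    {"var_a": 4.0, "var_b": 7.0},
]
kept = columns_to_keep(records, ["var_a", "var_b"])
duration_years = years_between(date(2010, 1, 1), date(2020, 1, 1))
months_since_exam = months_between(date(2019, 3, 15), date(2020, 1, 10))
```

Here `var_b` is dropped because half of its values are missing, while `var_a` (25% missing) is retained.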
Study population characteristics (N = 40 631)
| Participant characteristics | |
|---|---|
| Age in years | |
| Mean (SD) | 57.5 (10.5) |
| Missing values | 0 (0.0) |
| Gender | |
| Male | 23 242 (57.20) |
| Female | 17 389 (42.80) |
| Missing values | 0 (0.0) |
| Ethnicity | |
| Hispanic or Latino | 28 040 (69.01) |
| Not Hispanic or Latino | 9264 (22.80) |
| Missing values | 3327 (8.19) |
| Race | |
| American Indian or Alaska Native | 135 (0.33) |
| Asian or Asian American | 2626 (6.46) |
| Black or African American | 3324 (8.18) |
| More than one race | 17 612 (43.35) |
| Native Hawaiian or other Pacific Islander | 81 (0.20) |
| White | 4269 (10.51) |
| Missing values | 12 584 (30.97) |
| Marital status | |
| Single | 13 940 (34.31) |
| Married | 13 807 (33.98) |
| Divorced | 1397 (3.44) |
| Widowed | 1407 (3.46) |
| Separated | 861 (2.12) |
| Missing values | 9219 (22.69) |
| Insurance provider | |
| Medicaid | 24 117 (59.36) |
| Medicare | 4274 (10.52) |
| Private | 5569 (13.71) |
| Self-pay | 4887 (12.03) |
| CHAMPUS | 1 (0.00) |
| Other | 98 (0.24) |
| Missing values | 1685 (4.15) |
| Diagnosis of retinopathy | |
| Yes | 12 633 (31.09) |
| No | 27 998 (68.91) |
| Missing values | 0 (0.0) |
Univariate analyses of potential predictor variables
| Variable | P-value for testing significance as predictor | Odds ratio | 95% confidence interval | Interpretation |
|---|---|---|---|---|
| Age | .0003 | 1.00 | 1.00–1.01 | Older, higher risk |
| Marital status (single vs. married) | .08 | | | |
| Sex | <.00001 | 1.54 | 1.48–1.61 | Males are higher risk |
| Ethnicity | <.00001 | 0.63 | 0.60–0.67 | Not Hispanic or Latino is lower risk |
| Duration of diabetes (years) | <.00001 | 1.08 | 1.07–1.08 | Longer duration, higher risk |
| Pregnancy status | .20 | | | |
| Insulin dependence | <.00001 | 3.42 | 3.28–3.58 | Insulin dependence, higher risk |
| Time since last eye exam (months) | <.00001 | 0.98 | 0.98–0.98 | Longer time since last eye exam, lower risk |
| Peripheral vascular disease | <.00001 | 2.85 | 2.42–3.34 | Peripheral vascular disease, higher risk |
| Hypertension | .024 | | | |
| Systolic blood pressure | <.00001 | 1.03 | 1.02–1.03 | Higher systolic blood pressure, higher risk |
| Diastolic blood pressure | .00022 | 1.01 | 1.00–1.01 | Higher diastolic blood pressure, higher risk |
| Depression | .002 | 0.90 | 0.83–0.96 | Depression, lower risk |
| Obesity | <.00001 | 0.63 | 0.60–0.66 | Obesity, lower risk |
| BMI | <.00001 | 0.97 | 0.96–0.97 | Higher BMI, lower risk |
| Stroke | <.00001 | 1.82 | 1.64–2.01 | Stroke, higher risk |
| Nephropathy | <.00001 | 3.73 | 3.53–3.94 | Nephropathy, higher risk |
| Erectile dysfunction | <.00001 | 1.34 | 1.19–1.51 | Erectile dysfunction, higher risk |
| Neuropathy | <.00001 | 2.27 | 2.14–2.40 | Neuropathy, higher risk |
| Dyslipidemia | <.00001 | 0.65 | 0.62–0.68 | Dyslipidemia, lower risk |
| Insurance | .05 | | | |
| BUN | <.00001 | 1.09 | 1.09–1.09 | Higher blood urea nitrogen, higher risk |
| HDL | .16 | | | |
| Hemoglobin | <.00001 | 0.73 | 0.72–0.74 | Higher hemoglobin, lower risk |
| Hemoglobin A1c | <.00001 | 1.24 | 1.22–1.25 | Higher hemoglobin A1C, higher risk |
| Triglycerides | .00002 | 1.00 | 1.00–1.00 | Higher triglycerides, lower risk |
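To make the odds ratio and confidence interval columns easier to read, the following minimal sketch computes an odds ratio with a Wald 95% confidence interval from a 2×2 table of a binary predictor against the outcome. The source does not state which procedure produced the values in the table, so this is one common approach and not necessarily the authors' method; the counts below are hypothetical and chosen only for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: predictor present, outcome present
    b: predictor present, outcome absent
    c: predictor absent,  outcome present
    d: predictor absent,  outcome absent
    z: normal quantile (1.96 for a 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Wald approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_value, ci_low, ci_high = odds_ratio_ci(a=30, b=70, c=20, d=80)
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level. For continuous predictors such as age or blood pressure, the tabulated odds ratio is conventionally read as the change in odds per one-unit increase, which is why values very close to 1.00 can still carry small P-values in a cohort of this size.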
Figure 1. Optimal number of variables/features using various metrics.
14-variable model performance on test and validation sets
| | RF under | XGBOOST under | SVM under | Ensemble model under | DNN under |
|---|---|---|---|---|---|
| Model performance on 14 variables with majority class undersampling on test set | | | | | |
| Sensitivity (%) | 71.52 | 70.85 | 72.81 | 70.68 | |
| Specificity (%) | 73.51 | 74.61 | 72.58 | 74.96 | 72.77 |
| PPV (%) | 54.97 | 55.79 | 54.56 | 56.07 | 54.98 |
| NPV (%) | 85.09 | 84.99 | 85.52 | 84.97 | 85.88 |
| Accuracy (%) | 72.89 | 73.44 | 72.65 | 73.63 | 73.01 |
| Kappa statistic | 0.416 | 0.423 | 0.4158 | 0.426 | 0.4240 |
| AUC | 0.799 | 0.800 | 0.798 | 0.803 | |
| Model performance on 14 variables with majority class undersampling on external validation set | | | | | |
| Sensitivity (%) | 69.06 | 66.78 | 70.00 | 67.38 | |
| Specificity (%) | 76.01 | 77.35 | 75.24 | 77.09 | 74.20 |
| PPV (%) | 44.45 | 45.05 | 44 | 44.98 | 43.75 |
| NPV (%) | 89.83 | 89.34 | 90.02 | 89.47 | 90.55 |
| Accuracy (%) | 74.49 | 75.05 | 74.1 | 74.98 | 73.76 |
| Kappa statistic | 0.3756 | 0.3759 | 0.3728 | 0.3769 | 0.3756 |
| AUC | 0.791 | 0.792 | 0.794 | 0.794 | |
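The metrics reported in this table are related to one another through the confusion matrix and the prevalence of the condition. The sketch below, written under stated assumptions, shows the standard definitions and how PPV and NPV follow from sensitivity, specificity, and prevalence. The confusion-matrix counts are hypothetical, and applying the overall cohort prevalence (31.09%) to the test set is an assumption, since the source does not report the test-set prevalence.

```python
def confusion_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Expected agreement by chance, used for Cohen's kappa.
    p_chance = ((tp + fp) / total) * ((tp + fn) / total) \
        + ((fn + tn) / total) * ((fp + tn) / total)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": accuracy,
        "kappa": (accuracy - p_chance) / (1 - p_chance),
    }


def ppv_npv_from_rates(sensitivity, specificity, prevalence):
    """PPV and NPV via Bayes' rule from sensitivity, specificity, prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical counts, not taken from the study.
metrics = confusion_metrics(tp=70, fp=50, fn=30, tn=150)

# RF test-set sensitivity and specificity from the table, combined with the
# overall cohort prevalence (an assumption for the test set).
ppv, npv = ppv_npv_from_rates(0.7152, 0.7351, 0.3109)
```

Under that prevalence assumption, the RF figures yield a PPV of about 54.9% and an NPV of about 85.1%, close to the reported 54.97% and 85.09%. The same relationship helps explain why PPV is lower and NPV higher on the external validation set, which would be expected if DR prevalence there is lower than in the test set.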