Nele Gerrits1, Bart Elen2, Toon Van Craenendonck2, Danai Triantafyllidou2, Ioannis N Petropoulos3, Rayaz A Malik3, Patrick De Boever2,4,5.
Abstract
Deep neural networks can extract clinical information, such as diabetic retinopathy status and individual characteristics (e.g. age and sex), from retinal images. Here, we report the first study to train deep learning models with retinal images from 3,000 Qatari citizens participating in the Qatar Biobank study. We investigated whether fundus images can predict cardiometabolic risk factors, such as age, sex, blood pressure, smoking status, glycaemic status, total lipid panel, sex steroid hormones and bioimpedance measurements. Additionally, the role of age and sex as mediating factors when predicting cardiometabolic risk factors from fundus images was studied. Predictions at person level were made by combining information from an optic disc-centred and a macula-centred image of both eyes with deep learning models using the MobileNet-V2 architecture. An accurate prediction was obtained for age (mean absolute error (MAE): 2.78 years) and sex (area under the curve: 0.97), while an acceptable performance was achieved for systolic blood pressure (MAE: 8.96 mmHg), diastolic blood pressure (MAE: 6.84 mmHg), Haemoglobin A1c (MAE: 0.61%), relative fat mass (MAE: 5.68 units) and testosterone (MAE: 3.76 nmol/L). We discovered that age and sex were mediating factors when predicting cardiometabolic risk factors from fundus images. We found that deep learning models indirectly predict sex when trained for testosterone. For blood pressure, Haemoglobin A1c and relative fat mass, an influence of age and sex was observed. However, the achieved performance cannot be fully explained by the influence of age and sex. In conclusion, we confirm that age and sex can be predicted reliably from a fundus image and that unique information is stored in the retina that relates to blood pressure, Haemoglobin A1c and relative fat mass. Future research should focus on stratification when predicting person characteristics from a fundus image.
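The abstract does not state how the four images per person were combined. As a minimal sketch, assuming the image-level predictions are simply averaged per person, the person-level prediction and the MAE could be computed as below. The function names and the averaging rule are illustrative assumptions, not the study's documented method.

```python
import numpy as np


def person_level_predictions(person_ids, image_preds):
    """Average image-level predictions per person (assumed combining rule)."""
    person_ids = np.asarray(person_ids)
    image_preds = np.asarray(image_preds, dtype=float)
    # Map every image to the index of its person in the sorted unique id list.
    unique_ids, inverse = np.unique(person_ids, return_inverse=True)
    sums = np.bincount(inverse, weights=image_preds, minlength=len(unique_ids))
    counts = np.bincount(inverse, minlength=len(unique_ids))
    return unique_ids, sums / counts


def mean_absolute_error(y_true, y_pred):
    """MAE: mean of the absolute differences between actual and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))
```

With four images per person (optic disc-centred and macula-centred, both eyes), each person contributes four entries to `image_preds` and one averaged value to the output.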
Year: 2020 PMID: 32523046 PMCID: PMC7287116 DOI: 10.1038/s41598-020-65794-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Descriptive statistics of individuals in this subset of the Qatar Biobank data.
| Characteristics | Qatar Biobank subset |
|---|---|
| Number of participants | 3,000 |
| Number of images | 12,000 |
| Age (years) | 40.6 (13.0) |
| Sex (% male) | 41% |
| Ethnicity | 86% Qatari, 14% mix of 25 other countries |
| Current smoker (%) | 18% |
| Body Mass Index (kg/m²) | 29.7 (6.4) |
| Relative fat mass (%) | 38.4 (9.9) |
| Systolic blood pressure (mmHg) | 114.4 (15.6) |
| Diastolic blood pressure (mmHg) | 66.4 (10.0) |
| Haemoglobin A1c (%) | 5.8 (1.4) |
| Insulin (µU/mL) | 13.3 (16.6) |
| Glucose (mmol/L) | 6.0 (2.5) |
| Sex hormone binding globulin (nmol/L) | 52.4 (43.4) |
| Estradiol (pmol/L) | 254.6 (314.4) |
| Testosterone (nmol/L) | 7.9 (9.3) |
| Total cholesterol (mmol/L) | 5.0 (1.0) |
| HDL cholesterol (mmol/L) | 1.4 (0.4) |
| LDL cholesterol (mmol/L) | 3.0 (0.9) |
| Triglyceride (mmol/L) | 1.3 (0.9) |
For numerical variables the mean and standard deviation are shown.
Figure 1. Model performance on predicting continuous cardiometabolic risk factors in the test set on person level for the regression task. The coefficient of determination is plotted for every risk factor, along with the 95% confidence interval obtained via a bootstrapping methodology. Results for a linear regression using age and sex on person level on the test set are added for every cardiometabolic risk factor, except age, with a red dot. Risk factors included in the exploration of the impact of age and sex on prediction performance have a coefficient of determination higher than 0.20 and are indicated in dark green.
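The caption does not say which bootstrap variant was used. As a minimal sketch, assuming a percentile bootstrap that resamples persons with replacement, the coefficient of determination, its 95% confidence interval and an age-and-sex linear baseline could be computed as follows. The function names, the number of resamples and the 0/1 encoding of sex are illustrative assumptions, not taken from the study.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def bootstrap_r2_ci(y_true, y_pred, n_boot=2000, alpha=0.05, seed=0):
    """Point estimate and percentile bootstrap CI for R2 (assumed variant)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(y_true)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        # Resample persons with replacement, keeping (actual, predicted) pairs together.
        idx = rng.integers(0, n, size=n)
        stats[b] = r2_score(y_true[idx], y_pred[idx])
    lower, upper = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return r2_score(y_true, y_pred), lower, upper


def age_sex_baseline_r2(age_train, sex_train, y_train, age_test, sex_test, y_test):
    """Test-set R2 of an ordinary least-squares fit on age and sex only."""
    # Design matrices: intercept, age, sex (sex assumed to be coded 0/1).
    X_train = np.column_stack([np.ones(len(age_train)), age_train, sex_train])
    X_test = np.column_stack([np.ones(len(age_test)), age_test, sex_test])
    coef, *_ = np.linalg.lstsq(X_train, np.asarray(y_train, dtype=float), rcond=None)
    return r2_score(y_test, X_test @ coef)
```

Comparing the network's R² interval with the baseline value for the same risk factor indicates how much of the performance goes beyond what age and sex alone explain.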
Figure 2. Predicted and actual values for the test set, evaluated on person level, for several cardiometabolic risk factors, stratified by sex (female is coloured blue and male is coloured red). Units for age are years (left upper), for SBP are mmHg (right upper), for DBP are mmHg (middle left), for HbA1c are % (middle right), for relative fat mass are units (lower left) and for testosterone are nmol/L (lower right). The lines represent y = x values.
Performance of the algorithm stratified per age category on person level.
| | <=30 years | >30 and <=39 years | >39 and <=50 years | >50 years | Total |
|---|---|---|---|---|---|
| Sex (% male) | 0.99 | 0.96 | 0.98 | 0.97 | 0.97 |
| Systolic blood pressure (mmHg) | 0.24 | 0.23 | 0.30 | 0.04 | 0.40 |
| Diastolic blood pressure (mmHg) | 0.16 | 0.16 | 0.21 | 0.21 | 0.24 |
| Haemoglobin A1c (%) | −0.14 | 0.28 | 0.24 | 0.06 | 0.34 |
| Relative fat mass value | 0.29 | 0.33 | 0.33 | 0.47 | 0.43 |
| Testosterone (nmol/L) | 0.61 | 0.57 | 0.45 | 0.50 | 0.54 |
Performance is reported as R² (regression tasks) or AUC (classification task) on the test set.
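Within an age stratum, R² is measured against that stratum's own mean, so it can be lower than the overall value and even negative when predictions do worse than the stratum mean. A minimal sketch of such a stratified evaluation, using the age categories from the table, is given below. The function names and the synthetic inputs in the usage note are illustrative assumptions.

```python
import numpy as np


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def r2_by_age_group(age, y_true, y_pred):
    """R2 within each age category of the table, plus the overall R2."""
    age = np.asarray(age, dtype=float)
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    groups = {
        "<=30": age <= 30,
        ">30 and <=39": (age > 30) & (age <= 39),
        ">39 and <=50": (age > 39) & (age <= 50),
        ">50": age > 50,
    }
    results = {}
    for name, mask in groups.items():
        # Skip strata that are too small for a meaningful R2.
        if mask.sum() > 1:
            results[name] = r2_score(y_true[mask], y_pred[mask])
    results["Total"] = r2_score(y_true, y_pred)
    return results
```

For example, a predictor that captures only an age trend tends to score a high total R² but a much lower R² inside each age category, because little age variation remains within a category.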
Performance results for training on subsets of the data, on females, on males and on a random half subset of the Qatar Biobank data.
| | Females | Males | On 1/2 training set | On total training set |
|---|---|---|---|---|
| Age (years) | 0.89 | 0.86 | 0.86 | 0.89 |
| Systolic blood pressure (mmHg) | 0.44 | 0.19 | 0.29 | 0.40 |
| Diastolic blood pressure (mmHg) | 0.14 | 0.21 | 0.14 | 0.24 |
| Haemoglobin A1c (%) | 0.28 | 0.16 | 0.25 | 0.34 |
| Relative fat mass | 0.23 | 0.07 | 0.40 | 0.43 |
| Testosterone (nmol/L) | 0.03 | 0.04 | 0.48 | 0.54 |
For completeness, the performance results on the test set when trained on the total training set are included in the table as well. The R² values obtained on the respective test sets on person level are shown.