Govinda R Poudel, Anthony Barnett, Muhammad Akram, Erika Martino, Luke D Knibbs, Kaarin J Anstey, Jonathan E Shaw, Ester Cerin.
Abstract
The environment we live in, and our lifestyle within this environment, can shape our cognitive health. We investigated whether sociodemographic, neighbourhood environment, and lifestyle variables can be used to predict cognitive health status in adults. Cross-sectional data from the AusDiab3 study, an Australian cohort study of adults (34–97 years; n = 4141), were used. Cognitive function was measured using processing speed and memory tests, and the scores were categorized into distinct classes using latent profile analysis. Sociodemographic variables, measures of the built and natural environment estimated using geographic information system data, and physical activity and sedentary behaviours were used as predictors. Machine learning was performed using gradient boosting machine, support vector machine, artificial neural network, and linear models. Sociodemographic variables predicted processing speed (r² = 0.43) and memory (r² = 0.20) with good accuracy. Lifestyle factors also accurately predicted processing speed (r² = 0.29) but weakly predicted memory (r² = 0.10). Neighbourhood and built environment factors were weak predictors of cognitive function. Sociodemographic (AUC = 0.84) and lifestyle (AUC = 0.78) factors also accurately classified cognitive classes. Sociodemographic and lifestyle variables can predict cognitive function in adults. Machine learning tools are useful for population-level assessment of cognitive health status via readily available and easy-to-collect data.
Keywords: built environment; cognition; machine learning; memory; neighbourhood environment; physical activity; prediction; processing speed; sedentary behaviour; sociodemographic
Year: 2022 PMID: 36078704 PMCID: PMC9517821 DOI: 10.3390/ijerph191710977
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 4.614
Participant characteristics (n = 4141).
| Characteristics | Statistics | Characteristics | Statistics |
|---|---|---|---|
| Age, years, M ± SD | 61.1 ± 11.4 | Sex, female, % | 55.2 |
| Educational attainment, % | | English-speaking background, % | 89.9 |
| Up to secondary | 32.7 | Household income, annual, % | |
| Trade, technician certificate | 29.1 | Up to $49,999 | 32.9 |
| Associate diploma and equiv. | 14.5 | $50,000–$99,999 | 26.8 |
| Bachelor’s degree, post-graduate diploma | 23.1 | $100,000 and over | 28.9 |
| | | Does not know or refusal | 8.8 |
| Living arrangements, % | | | |
| Couple without children | 48.2 | | |
| Couple with children | 26.8 | | |
| Other | 22.4 | | |
| Population density (persons/ha) | 17.4 ± 10.0 | Street intersection density | 62.2 ± 32.2 |
| Dwelling density | 2.9 ± 4.2 | Percentage of parkland | 11.6 ± 12.5 |
| Percentage of commercial land use | 2.5 ± 6.1 | Percentage of blue space | 0.2 ± 1.98 |
| Percentage of residential land use | 73.6 ± 19.9 | Nearest parkland (km) | 0.3 ± 0.3 |
| Nearest blue space (km) | 7.9 ± 9.3 | Aerial distance to trainline | 3.9 ± 5.3 |
| PM2.5 | 6.3 ± 1.7 | Road density major roads (km) | 0.9 ± 1.7 |
| NO2 (ppb) | 5.5 ± 2.1 | Road density minor roads (km) | 8.9 ± 3.7 |
| Area-level IRSAD | 6.4 ± 2.7 | | |
| Vigorous gardening (times/week) | 0.8 ± 1.5 | Muscle strength exercise | 0.9 ± 2.3 |
| Walking for transport | 1.4 ± 3.5 | Walking for leisure | 2.4 ± 2.5 |
| Total walking ( | 3.1 ± 2.6 | | |
| Sitting for work | 1.6 ± 2.2 | Sitting for screen | 1.9 ± 1.3 |
| Sitting for transport | 0.8 ± 0.8 | Non-work computer sitting | 0.6 ± 0.9 |
| Sitting for other (h/day) | 3.4 ± 2.4 | | |
| Memory, CVLT score | 6.5 ± 2.4 | Processing speed, SDMT score | 49.7 ± 11.6 |
| Missing data, % | 2.3 | Missing data, % | 2.0 |
Notes. M, mean; SD, standard deviation; IRSAD, Index of Relative Socioeconomic Advantage and Disadvantage; CVLT, California Verbal Learning Test; SDMT, Symbol Digit Modalities Test; NO2, nitrogen dioxide; ppb, parts per billion.
Coefficient of determination (r², with 95% CI) between predicted and measured processing speed (SDMT) and memory (CVLT) scores. The performance of the regression-based linear model (LM), support vector machine (SVM), artificial neural network (ANN), and gradient boosting machine (GBM) is compared. Different combinations of sociodemographic (SDF), neighbourhood environment (NEF), and lifestyle (LSF) factors were used in the models.
| Predictors | Outcome | GBM | SVM | ANN | LM |
|---|---|---|---|---|---|
| SDF | SDMT | 0.43 (0.37, 0.49) | 0.43 (0.37, 0.48) | 0.40 (0.33, 0.47) | 0.43 (0.37, 0.48) |
| SDF | CVLT | 0.20 (0.14, 0.27) | 0.20 (0.14, 0.27) | 0.18 (0.12, 0.24) | 0.20 (0.14, 0.27) |
| NEF | SDMT | 0.04 (0.02, 0.07) | 0.01 (0.00, 0.03) | 0.01 (0.00, 0.04) | 0.01 (0.00, 0.03) |
| NEF | CVLT | 0.03 (0.01, 0.06) | 0.00 (0.00, 0.02) | 0.01 (0.00, 0.04) | 0.00 (0.00, 0.02) |
| LSF | SDMT | 0.29 (0.23, 0.35) | 0.17 (0.12, 0.23) | 0.26 (0.20, 0.32) | 0.17 (0.12, 0.23) |
| LSF | CVLT | 0.10 (0.06, 0.15) | 0.05 (0.01, 0.10) | 0.07 (0.03, 0.11) | 0.05 (0.02, 0.10) |
| SDF + NEF | SDMT | 0.43 (0.37, 0.49) | 0.42 (0.36, 0.47) | 0.38 (0.31, 0.45) | 0.42 (0.37, 0.48) |
| SDF + NEF | CVLT | 0.22 (0.15, 0.29) | 0.20 (0.14, 0.27) | 0.16 (0.10, 0.23) | 0.21 (0.15, 0.28) |
| SDF + LSF | SDMT | 0.46 (0.41, 0.52) | 0.43 (0.37, 0.49) | 0.41 (0.35, 0.47) | 0.44 (0.38, 0.49) |
| SDF + LSF | CVLT | 0.21 (0.15, 0.28) | 0.20 (0.14, 0.27) | 0.16 (0.09, 0.23) | 0.20 (0.14, 0.27) |
| NEF + LSF | SDMT | 0.30 (0.24, 0.36) | 0.17 (0.12, 0.22) | 0.24 (0.18, 0.30) | 0.17 (0.12, 0.23) |
| NEF + LSF | CVLT | 0.12 (0.07, 0.17) | 0.04 (0.01, 0.09) | 0.04 (0.01, 0.08) | 0.05 (0.02, 0.10) |
| SDF + NEF + LSF | SDMT | 0.46 (0.41, 0.52) | 0.42 (0.37, 0.48) | 0.40 (0.31, 0.48) | 0.43 (0.38, 0.49) |
| SDF + NEF + LSF | CVLT | 0.23 (0.17, 0.30) | 0.20 (0.14, 0.27) | 0.15 (0.10, 0.22) | 0.21 (0.15, 0.28) |
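The r² values in this table describe the association between predicted and measured scores. A minimal sketch of that quantity follows, assuming it is computed as the squared Pearson correlation; the record does not state the exact formula. The function name `r_squared` and the sample scores are hypothetical and serve only as an illustration.

```python
def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted scores.

    Assumption: the table's r2 is the squared correlation between
    predicted and measured values; the source does not give the formula.
    """
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("inputs must have equal length of at least 2")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    # Sums of cross-products and squared deviations around the means
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_m == 0 or var_p == 0:
        raise ValueError("r^2 is undefined when either input is constant")
    return cov ** 2 / (var_m * var_p)


# Hypothetical SDMT scores, invented for illustration (not study data)
measured = [42.0, 55.0, 61.0, 38.0, 50.0]
predicted = [45.0, 52.0, 58.0, 41.0, 49.0]
print(round(r_squared(measured, predicted), 2))
```

Under this definition, a value of 0.43 would mean that predicted scores share about 43% of their variance with measured scores.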
Pooled area under the curve (AUC) estimates and 95% CI for classifying cognitive function classes. Support vector machine (SVM), artificial neural network (ANN), gradient boosting machine (GBM), and linear models (LM) were used for training. Different combinations of sociodemographic (SDF), neighbourhood environment (NEF), and lifestyle (LSF) factors were used in the models.
| Predictors | GBM, AUC (95% CI) | SVM, AUC (95% CI) | ANN, AUC (95% CI) | LM, AUC (95% CI) |
|---|---|---|---|---|
| SDF | 0.84 (0.68, 0.93) | 0.84 (0.68, 0.93) | 0.84 (0.67, 0.93) | 0.85 (0.67, 0.94) |
| NEF | 0.58 (0.51, 0.65) | 0.53 (0.49, 0.57) | 0.53 (0.45, 0.61) | 0.56 (0.50, 0.61) |
| LSF | 0.78 (0.61, 0.89) | 0.62 (0.44, 0.76) | 0.74 (0.61, 0.83) | 0.74 (0.61, 0.85) |
| SDF + NEF | 0.84 (0.68, 0.93) | 0.84 (0.68, 0.93) | 0.84 (0.68, 0.93) | 0.84 (0.68, 0.93) |
| SDF + LSF | 0.85 (0.67, 0.95) | 0.84 (0.68, 0.93) | 0.85 (0.68, 0.93) | 0.85 (0.67, 0.94) |
| NEF + LSF | 0.78 (0.60, 0.89) | 0.64 (0.55, 0.73) | 0.74 (0.62, 0.83) | 0.74 (0.62, 0.83) |
| SDF + NEF + LSF | 0.85 (0.67, 0.94) | 0.84 (0.69, 0.92) | 0.84 (0.68, 0.93) | 0.84 (0.68, 0.93) |
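For readers unfamiliar with the metric, the AUC can be read as the probability that a randomly chosen member of one cognitive class receives a higher model score than a randomly chosen member of the other. The sketch below computes it from that definition and then averages it across imputed datasets. The plain average is an assumption made for illustration, since the record does not describe the pooling procedure; the names `auc` and `mean_auc` are hypothetical.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


def mean_auc(imputed_sets):
    """Average AUC over (labels, scores) pairs, one per imputed dataset.

    Assumption: a simple mean stands in for the study's pooling method,
    which is not specified in this record.
    """
    aucs = [auc(labels, scores) for labels, scores in imputed_sets]
    return sum(aucs) / len(aucs)


# Hypothetical class labels and model scores (not study data)
print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise form is quadratic in sample size, which is acceptable for a demonstration; a rank-based formula would be the usual choice for a cohort of several thousand participants.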
Figure 1. Performance of the gradient boosting machine (GBM) model. Receiver operating characteristic (ROC) curves for prediction of cognitive classes using sociodemographic, neighbourhood environment, and lifestyle variables, and their combinations, via the GBM model. Ten separate ROC curves are shown, one for each of the 10 multiple imputations of the data. tpr = true positive rate, fpr = false positive rate.
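Each ROC curve of this kind plots the true positive rate against the false positive rate as the classification threshold is lowered. A minimal sketch of how such points can be derived from class labels and model scores is given below; the function name `roc_points` and the sample data are hypothetical and do not come from the study.

```python
def roc_points(labels, scores):
    """Return (fpr, tpr) pairs, one per distinct score threshold.

    A case is classified positive when its score is at or above the
    threshold. The curve starts at (0.0, 0.0), where nothing is flagged.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = sum(1 for y in labels if y == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative case")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step through the observed scores
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical labels and scores (not study data)
for fpr, tpr in roc_points([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]):
    print(fpr, tpr)
```

Repeating this for each imputed dataset would give one curve per imputation, which is how ten curves can appear in a single panel.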
Figure 2. Relative contribution of sociodemographic and lifestyle factors to the prediction of cognitive classes using the gradient boosting machine.