Laurent Cleret de Langavant, Eleonore Bayen, Kristine Yaffe.
Abstract
BACKGROUND: Dementia is increasing in prevalence worldwide, yet frequently remains undiagnosed, especially in low- and middle-income countries. Population-based surveys represent an underinvestigated source to identify individuals at risk of dementia.
Keywords: cluster analysis; cognition disorders; data mining; dementia; diagnosis; electronic health records; health surveys; unsupervised machine learning
Year: 2018 PMID: 29986849 PMCID: PMC6056741 DOI: 10.2196/10493
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Figure 1. Unsupervised hierarchical clustering in the Health and Retirement Study cohort. Scatterplot of the first two dimensions of the principal component analysis (dimension 1 and dimension 2 with explained variance) for individuals in the three clusters (red=cluster 1, blue=cluster 2, green=cluster 3).
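The caption describes individuals projected onto principal components and grouped by hierarchical clustering. A minimal sketch of that kind of pipeline follows. It is not the authors' implementation: the Ward linkage, the standardization step, the function name `pca_then_cluster`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def pca_then_cluster(X, n_components=2, n_clusters=3):
    """Project standardized data onto principal components, then cut a Ward tree."""
    # Standardize each column (zero mean, unit variance).
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # PCA via singular value decomposition of the standardized matrix.
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    # Agglomerative clustering with Ward linkage (an assumed choice) on the scores.
    tree = linkage(scores, method="ward")
    # Cut the tree so that at most n_clusters flat clusters remain.
    labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    return scores, explained[:n_components], labels


# Synthetic stand-in data: three groups with different mean levels (not HRS data).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(loc=m, scale=0.5, size=(50, 4)) for m in (0.0, 4.0, 8.0)])
scores, explained, labels = pca_then_cluster(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]`, colored by `labels`, would give a scatterplot of the kind the figure shows, with `explained` supplying the explained-variance annotations for the two axes.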
Demographic and clinical characteristics in the three clusters created by unsupervised machine learning in the Health and Retirement Study (HRS) cohort.
| Demographic and clinical characteristics | All (N=18,165) | Cluster 1 (n=12,231) | Cluster 2 (n=4841) | Cluster 3 (n=1093) | P value^a |
| --- | --- | --- | --- | --- | --- |
| Age (years), mean (SD) | 68.4 (10.5) | 66.1 (9.5) | 71.4 (10.4) | 79.7 (11.3) | <.001 |
| Gender (male), n (%) | 7456 (41.05) | 5580 (45.62) | 1510 (31.19) | 366 (33.49) | <.001 |
| Education (years), mean (SD) | 12.1 (3.4) | 12.8 (3.0) | 10.9 (3.5) | 10 (4.0) | <.001 |
| White, n (%) | 14,967 (82.39) | 10,373 (84.81) | 3770 (77.87) | 824 (75.39) | <.001 |
| Black, n (%) | 2508 (13.81) | 1423 (11.63) | 866 (17.89) | 219 (20.04) | <.001 |
| Hispanic, n (%) | 1472 (8.10) | 886 (7.24) | 465 (9.61) | 121 (11.07) | <.001 |
| Other race/ethnicity, n (%) | 685 (3.77) | 434 (3.55) | 201 (4.15) | 50 (4.57) | .07 |
| Working full time, n (%) | 3773 (20.77) | 3470 (28.37) | 301 (6.22) | 2 (0.18) | <.001 |
| IADL^b (0-5), mean (SD) | 0.4 (1.0) | 0 (0.2) | 0.5 (0.8) | 3.6 (1.4) | <.001 |
| ADL^c (0-5), mean (SD) | 0.4 (1.0) | 0 (0.1) | 0.6 (0.9) | 3.5 (1.4) | <.001 |
| Mobility^d (0-5), mean (SD) | 1.2 (1.5) | 0.4 (0.7) | 2.5 (1.4) | 3.9 (1.4) | <.001 |
| Total word recall (0-20), mean (SD) | 9.4 (4.1) | 10.5 (3.4) | 8.4 (3.5) | 2.2 (4.2) | <.001 |
| CES-D^e (0-8), mean (SD) | 1.6 (2.1) | 0.9 (1.3) | 3.2 (2.2) | 3.4 (3.2) | <.001 |
| Ever had high blood pressure, n (%) | 9167 (50.47) | 5265 (43.05) | 3183 (65.75) | 719 (65.78) | <.001 |
| Ever had diabetes, n (%) | 3029 (16.67) | 1456 (11.90) | 1286 (26.56) | 287 (26.26) | <.001 |
| Ever had cancer, n (%) | 2337 (12.87) | 1364 (11.15) | 788 (16.28) | 185 (16.93) | <.001 |
| Ever had lung disease, n (%) | 1473 (8.11) | 499 (4.08) | 801 (16.57) | 173 (15.83) | <.001 |
| Ever had heart disease, n (%) | 4219 (23.23) | 1854 (15.16) | 1843 (38.07) | 521 (47.67) | <.001 |
| Ever had stroke, n (%) | 1567 (8.63) | 469 (3.83) | 654 (13.51) | 444 (40.62) | <.001 |
| Ever had arthritis, n (%) | 10,231 (56.32) | 5501 (44.98) | 3903 (80.62) | 826 (75.57) | <.001 |
| Ever smoked, n (%) | 10,623 (58.48) | 7105 (58.09) | 2954 (61.02) | 564 (51.60) | <.001 |
| Ever drank alcohol, n (%) | 8103 (44.61) | 6573 (53.74) | 1410 (29.13) | 120 (10.98) | <.001 |
| Body mass index, mean (SD) | 27.2 (5.4) | 26.9 (4.7) | 28.3 (6.5) | 25.2 (6.4) | <.001 |
^a P values are from one-way analysis of variance (ANOVA) or chi-square tests as appropriate.
^b IADL: instrumental activities of daily living, including any difficulty using a telephone, taking medication, handling money, shopping, and preparing meals.
^c ADL: activities of daily living, including any difficulty bathing, eating, dressing, walking across a room, and getting in or out of bed.
^d Mobility: any difficulty walking several blocks, walking one block, walking across the room, climbing several flights of stairs, and climbing one flight of stairs.
^e CES-D: Center for Epidemiologic Studies Depression Scale [13].
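The P values for categorical rows come from chi-square tests across the three clusters. As a rough check, the sketch below recomputes a Pearson chi-square statistic for the "Ever had stroke" row from the counts and cluster sizes printed in the table. It assumes no missing data, so that each cluster size is the denominator; the function name `chi_square_2xk` is hypothetical, and this is not the authors' analysis code.

```python
import math


def chi_square_2xk(events, totals):
    """Pearson chi-square statistic for a 2 x k table of events vs non-events."""
    # Row 0: participants with the condition; row 1: participants without it.
    observed = [list(events), [t - e for e, t in zip(events, totals)]]
    grand = sum(totals)
    row_sums = [sum(row) for row in observed]
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_sums[i] * totals[j] / grand
            stat += (obs - expected) ** 2 / expected
    df = len(totals) - 1  # (2 - 1) * (k - 1)
    return stat, df


# "Ever had stroke": counts in clusters 1-3 and the cluster sizes from the table.
stat, df = chi_square_2xk([469, 654, 444], [12231, 4841, 1093])
# With 2 degrees of freedom the chi-square upper tail has the closed form exp(-x/2).
p_value = math.exp(-stat / 2)
```

Because a 2 x 3 table has exactly 2 degrees of freedom, the closed-form tail probability avoids any dependency beyond the standard library; for other table shapes a statistics package would be needed.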
Classification performance of cluster 3 from unsupervised machine learning in the Health and Retirement Study (HRS) cohort compared to various thresholds of predicted probabilities of dementia from Hurd et al’s model and to the Aging, Demographics, and Memory Study (ADAMS) clinical diagnosis of dementia.
| Classification performance of cluster 3 | Predicted probability of dementia^a >.50 (N=7574) | Predicted probability of dementia^a >.75 (N=7574) | Predicted probability of dementia^a >.90 (N=7574) | Predicted probability of dementia^a >.95 (N=7574) | ADAMS clinical diagnosis of dementia (N=834) |
| --- | --- | --- | --- | --- | --- |
| Sensitivity (%) | 62.9 | 77.3 | 86.7 | 89.5 | 59.3 |
| Specificity (%) | 96.4 | 95.1 | 94.2 | 93.3 | 93.0 |
| Accuracy (%) | 92.1 | 93.6 | 93.7 | 93.1 | 81.3 |
^a Hurd et al’s model [10].
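Sensitivity, specificity, and accuracy in the table above all derive from a 2 x 2 confusion matrix that treats membership in cluster 3 as a positive classification. The sketch below shows the standard definitions. The counts are hypothetical and chosen only for illustration, since the underlying confusion matrices are not reported here; `classification_metrics` is likewise an illustrative name.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    # Sensitivity: share of true dementia cases that fall in cluster 3.
    sensitivity = tp / (tp + fn)
    # Specificity: share of non-cases that fall outside cluster 3.
    specificity = tn / (tn + fp)
    # Accuracy: share of all individuals classified correctly.
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical counts (not taken from the study): 100 cases, 200 non-cases.
sens, spec, acc = classification_metrics(tp=60, fp=14, tn=186, fn=40)
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, which is why it can stay high when most individuals are non-cases even if sensitivity is modest.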
Figure 2. Longitudinal change of instrumental activities of daily living (IADL), activities of daily living (ADL), and mobility scores in both Health and Retirement Study (HRS) and Survey of Health, Ageing and Retirement in Europe (SHARE) cohorts. Linear models with date of assessment at each wave as an independent variable were used to depict the longitudinal change of IADL, ADL, and mobility scores in the three clusters (red=cluster 1, blue=cluster 2, green=cluster 3) in both HRS (left) and SHARE (right) cohorts. A 99% confidence interval (gray color) is drawn for each cluster. The year corresponding to the time of classification is indicated by an arrow.
Figure 3. Unsupervised hierarchical clustering in the Survey of Health, Ageing and Retirement in Europe cohort. Scatterplot of the first two dimensions of the principal component analysis (dimension 1 and dimension 2 with explained variance) for individuals in the three clusters (red=cluster 1, blue=cluster 2, green=cluster 3).
Demographic and clinical characteristics in the three clusters created by unsupervised machine learning in the Survey of Health, Ageing and Retirement in Europe (SHARE) cohort.
| Demographic and clinical characteristics | All (N=58,202) | Cluster 1 (n=40,223) | Cluster 2 (n=15,644) | Cluster 3 (n=2335) | P value^a |
| --- | --- | --- | --- | --- | --- |
| Age (years), mean (SD) | 65.4 (10.4) | 62.7 (9.0) | 70.7 (10.2) | 77.4 (10.6) | <.001 |
| Gender (male), n (%) | 25,182 (43.26) | 19,469 (48.40) | 4825 (30.84) | 888 (38.03) | <.001 |
| Education (years), mean (SD) | 10.3 (4.3) | 11.1 (4.1) | 8.8 (4.0) | 7.8 (4.3) | <.001 |
| Working, n (%) | 3889 (6.68) | 3591 (8.93) | 281 (1.80) | 17 (0.73) | <.001 |
| IADL^b (0-5), mean (SD) | 0.2 (0.8) | 0 (0.2) | 0.3 (0.6) | 3.1 (1.5) | <.001 |
| ADL^c (0-5), mean (SD) | 0.2 (0.8) | 0 (0.1) | 0.4 (0.7) | 3.2 (1.4) | <.001 |
| Mobility^d (0-4), mean (SD) | 0.6 (1.0) | 0.1 (0.4) | 1.3 (1.0) | 3.1 (1.1) | <.001 |
| Total word recall (0-20), mean (SD) | 8.9 (3.8) | 9.9 (3.4) | 7 (3.5) | 3.7 (3.8) | <.001 |
| EURO-D^e (0-12), mean (SD) | 2.6 (2.3) | 1.8 (1.7) | 4.3 (2.4) | 5.1 (2.8) | <.001 |
| Ever had high blood pressure, n (%) | 22,848 (39.26) | 12,840 (31.92) | 8846 (56.55) | 1162 (49.76) | <.001 |
| Ever had diabetes, n (%) | 7208 (12.38) | 3136 (7.78) | 3481 (22.25) | 591 (25.31) | <.001 |
| Ever had cancer, n (%) | 3076 (5.29) | 1510 (3.75) | 1357 (8.67) | 209 (8.95) | <.001 |
| Ever had lung disease, n (%) | 3835 (6.59) | 1444 (3.59) | 2051 (13.11) | 340 (14.56) | <.001 |
| Ever had heart disease, n (%) | 7999 (13.74) | 2975 (7.40) | 4249 (27.16) | 775 (33.19) | <.001 |
| Ever had stroke, n (%) | 2547 (4.37) | 638 (1.59) | 1286 (8.22) | 623 (26.68) | <.001 |
| Ever had arthritis, n (%) | 14,192 (24.38) | 5797 (14.41) | 7347 (46.96) | 1035 (44.33) | <.001 |
| Ever smoked, n (%) | 27,097 (46.56) | 20,120 (50.02) | 6163 (39.40) | 814 (34.86) | <.001 |
| Ever drank alcohol, n (%) | 45,893 (78.85) | 34,061 (84.68) | 10,620 (67.89) | 1212 (51.91) | <.001 |
| Body mass index, mean (SD) | 26.9 (4.8) | 26.4 (4.2) | 28.2 (5.9) | 26.9 (5.8) | <.001 |
^a P values are from one-way ANOVAs or chi-square tests as appropriate.
^b IADL: instrumental activities of daily living, including any difficulty using a telephone, taking medication, handling money, shopping, and preparing meals.
^c ADL: activities of daily living, including any difficulty bathing, eating, dressing, walking across a room, and getting in or out of bed.
^d Mobility: any difficulty walking 100 meters, walking across a room, climbing one flight of stairs, and climbing several flights of stairs.
^e EURO-D: European Union initiative to compare symptoms of depression scale [14].