Yafei Wu1,2,3,4, Chaoyi Xiang1,3,4, Maoni Jia1,3,4, Ya Fang5,6,7,8.
Abstract
OBJECTIVES: To explore heterogeneous disability trajectories among older Chinese adults at the community level, and to construct explainable machine learning models that predict long-term disability trajectories and clarify the mechanisms behind the predictions.
Keywords: ADL limitations; Explanations; Functional disability; Machine learning; Trajectories
Year: 2022 PMID: 35902789 PMCID: PMC9336105 DOI: 10.1186/s12877-022-03295-x
Source DB: PubMed Journal: BMC Geriatr ISSN: 1471-2318 Impact factor: 4.070
Fig. 1 Flow chart of model derivation and validation
Performance of latent class growth model and growth mixture model
| Model | BIC | Entropy | VLMR-LRT | Smallest class |
|---|---|---|---|---|
| LGCM | | | | |
| linear | 105,530.680 | | | |
| quadratic | 105,345.372 | | | |
| cubic | 105,359.053 | | | |
| LCGM | | | | |
| 1-class | 111,004.715 | | | 100% |
| 2-class | 105,054.609 | 0.942 | 0.0000 | 16.173% |
| 3-class | 102,817.976 | 0.892 | 0.0000 | 8.093% |
| 4-class | 101,585.221 | 0.904 | 0.0018 | 3.720% |
| 5-class | 100,586.689 | 0.881 | 0.0009 | 4.278% |
| 6-class | 99,933.216 | 0.862 | 0.0153 | 3.068% |
| GMM | | | | |
| 3-class | 102,109.346 | 0.929 | 0.0000 | 7.158% |
Abbreviations: BIC, Bayesian information criterion; VLMR-LRT, Vuong-Lo-Mendell-Rubin likelihood ratio test; LGCM, latent growth curve model; LCGM, latent class growth model; GMM, growth mixture model
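The fit statistics above can be read as a two-step selection: first pick the growth shape with the lowest BIC, then pick the number of classes. The sketch below is illustrative only. The record does not state the selection rule that was used, so the 5% minimum class share and the 0.05 significance level are assumptions (common rules of thumb), and the function names are hypothetical.

```python
# Fit statistics copied from the table above:
# classes -> (BIC, entropy, VLMR-LRT p-value, smallest class share in %)
lcgm = {
    2: (105054.609, 0.942, 0.0000, 16.173),
    3: (102817.976, 0.892, 0.0000, 8.093),
    4: (101585.221, 0.904, 0.0018, 3.720),
    5: (100586.689, 0.881, 0.0009, 4.278),
    6: (99933.216, 0.862, 0.0153, 3.068),
}
lgcm_bic = {"linear": 105530.680, "quadratic": 105345.372, "cubic": 105359.053}

MIN_CLASS_SHARE = 5.0  # assumption: smallest class should hold at least 5% of the sample
ALPHA = 0.05           # assumption: VLMR-LRT must favour k classes over k-1 at this level


def best_growth_shape(bic_by_shape):
    """Return the growth shape with the lowest BIC."""
    return min(bic_by_shape, key=bic_by_shape.get)


def select_classes(models):
    """Among solutions passing both assumed screens, return the one with the lowest BIC."""
    admissible = [
        k for k, (_, _, p_value, share) in models.items()
        if p_value < ALPHA and share >= MIN_CLASS_SHARE
    ]
    return min(admissible, key=lambda k: models[k][0])


print(best_growth_shape(lgcm_bic))
print(select_classes(lcgm))
```

Under these assumed thresholds the sketch selects the quadratic shape and the 3-class solution, which is consistent with the three trajectory classes reported in Fig. 2. BIC alone would keep favouring more classes, which is why a class-size screen of some kind is usually applied.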
Fig. 2 Heterogeneous disability trajectory classes of older adults with complete information in at least three waves (n = 4,149). The disability score (0–28) was the sum of the BADL score (0–12) and the IADL score (0–16). Three trajectory classes were identified: progressive, high-onset, and normal. Each trajectory represents the mean change pattern of one class
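The composite score described in the caption can be sketched as a small helper. The function name and the range checks are assumptions added for illustration. The caption gives only the two ranges and the fact that they are summed, not the item-level coding.

```python
def disability_score(badl: int, iadl: int) -> int:
    """Sum a BADL score (0-12) and an IADL score (0-16) into a 0-28 composite."""
    if not 0 <= badl <= 12:
        raise ValueError("BADL score must lie in 0..12")
    if not 0 <= iadl <= 16:
        raise ValueError("IADL score must lie in 0..16")
    return badl + iadl


# Hypothetical respondent, not taken from the study data.
print(disability_score(badl=2, iadl=5))
```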
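Sample characteristics by disability trajectory class in 2002
| Characteristic | Total (n = 4,149) | Normal class (n = 3,210) | Progressive class (n = 642) | High-onset class (n = 297) | P value^a | Missing |
|---|---|---|---|---|---|---|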
| Age | 74.99 ± 7.58 | 75.23 ± 7.97 | 85.42 ± 8.72 | 89.12 ± 9.30 | < 0.001 | — |
| Sex | ||||||
| Male | 1927 (46.4%) | 1619 (50.4%) | 248 (38.6%) | 60 (20.2%) | < 0.001 | — |
| Female | 2222 (53.6%) | 1591 (49.6%) | 394 (61.4%) | 237 (79.8%) | ||
| Ethnicity | ||||||
| Han Ancestry | 3852 (92.8%) | 2970 (92.5%) | 602 (93.8%) | 280 (94.3%) | 0.326 | — |
| Minority | 297 (7.2%) | 240 (7.5%) | 40 (6.2%) | 17 (5.7%) | ||
| Education | ||||||
| Illiterate | 2280 (55.1%) | 1635 (51.0%) | 412 (64.5%) | 233 (79.3%) | < 0.001 | 10 (0.24%) |
| Literate | 1859 (44.9%) | 1571 (49.0%) | 227 (35.5%) | 61 (20.7%) | ||
| Occupation | ||||||
| Low level | 497 (12.0%) | 416 (13.0%) | 63 (9.9%) | 18 (6.1%) | 0.003 | 12 (0.29%) |
| High level | 3640 (88.0%) | 2788 (87.0%) | 574 (90.1%) | 278 (93.9%) | ||
| Marital Status | ||||||
| Unmarried/Separated/Divorced/Widowed | 2160 (52.1%) | 1475 (46.0%) | 451 (70.2%) | 234 (78.8%) | < 0.001 | — |
| Married | 1989 (47.9%) | 1735 (54.0%) | 191 (29.8%) | 63 (21.2%) | ||
| Residence | ||||||
| Urban | 1692 (40.8%) | 1242 (38.7%) | 309 (48.1%) | 141 (47.5%) | < 0.001 | — |
| Rural | 2457 (59.2%) | 1968 (61.3%) | 333 (51.9%) | 156 (52.5%) | ||
| Co-residence | ||||||
| Living alone | 664 (16.0%) | 496 (15.5%) | 135 (21.0%) | 33 (11.1%) | < 0.001 | — |
| With family | 3485 (84.0%) | 2714 (84.5%) | 507 (79.0%) | 264 (88.9%) | ||
| Fruit Intake | ||||||
| Low frequency | 2758 (66.5%) | 2124 (66.2%) | 433 (67.4%) | 201 (67.7%) | 0.741 | — |
| High frequency | 1391 (33.5%) | 1086 (33.8%) | 209 (32.6%) | 96 (32.3%) | ||
| Vegetables Intake | ||||||
| Low frequency | 416 (10.0%) | 285 (8.9%) | 77 (12.0%) | 54 (18.2%) | < 0.001 | 1 (0.02%) |
| High frequency | 3732 (90.0%) | 2924 (91.1%) | 565 (88.0%) | 243 (81.8%) | ||
| Tea Consumption | ||||||
| Low frequency | 2797 (67.4%) | 2097 (65.3%) | 459 (71.5%) | 241 (81.1%) | < 0.001 | 1 (0.02%) |
| High frequency | 1351 (32.6%) | 1112 (34.7%) | 183 (28.5%) | 56 (18.9%) | ||
| Smoker | ||||||
| Yes | 979 (23.6%) | 838 (26.1%) | 117 (18.2%) | 24 (8.1%) | < 0.001 | 1 (0.02%) |
| No | 3169 (76.4%) | 2371 (73.9%) | 525 (81.8%) | 273 (91.9%) | ||
| Alcohol Drinker | ||||||
| Yes | 988 (23.8%) | 822 (25.6%) | 136 (21.2%) | 30 (10.1%) | < 0.001 | 4 (0.10%) |
| No | 3157 (76.2%) | 2386 (74.4%) | 505 (78.8%) | 266 (89.9%) | ||
| Regular Exercise | ||||||
| Yes | 1573 (37.9%) | 1261 (39.3%) | 269 (42.0%) | 43 (14.5%) | < 0.001 | 3 (0.07%) |
| No | 2573 (62.1%) | 1948 (60.7%) | 372 (58.0%) | 253 (85.5%) | ||
| Leisure Activity Index | 0.49 ± 0.13 | 0.51 ± 0.12 | 0.45 ± 0.13 | 0.31 ± 0.10 | < 0.001 | 2 (0.05%) |
| Weight | 51.25 ± 10.25 | 52.00 ± 10.02 | 49.65 ± 10.53 | 46.58 ± 10.43 | < 0.001 | — |
| Systolic Pressure | 133.03 ± 17.63 | 132.35 ± 17.23 | 134.65 ± 18.15 | 136.80 ± 20.00 | 0.28 | 4 (0.10%) |
| Diastolic Pressure | 84.65 ± 12.22 | 84.34 ± 12.06 | 85.79 ± 12.76 | 85.58 ± 12.57 | 0.04 | 6 (0.14%) |
| Rhythm of Heart | ||||||
| Regular | 3872 (93.4%) | 3025 (94.3%) | 588 (91.6%) | 259 (87.5%) | < 0.001 | 4 (0.10%) |
| Irregular | 273 (6.6%) | 182 (5.7%) | 54 (8.4%) | 37 (12.5%) | ||
| Heart Rate | 72.44 ± 7.53 | 72.37 ± 7.46 | 72.73 ± 7.61 | 72.51 ± 8.10 | 0.01 | 6 (0.14%) |
| Length from Wrist to Shoulder | 49.84 ± 5.53 | 50.20 ± 5.43 | 49.62 ± 5.90 | 48.33 ± 5.55 | 0.10 | 2 (0.05%) |
| Length from Knee to Floor | 46.66 ± 5.37 | 46.79 ± 5.27 | 46.49 ± 5.81 | 45.59 ± 5.28 | 0.14 | 6 (0.14%) |
| PWB Score | 22.51 ± 3.74 | 22.64 ± 3.71 | 22.42 ± 3.57 | 20.97 ± 4.21 | 0.189 | 226 (5.45%) |
| BADL Score | 11.77 ± 0.99 | 11.96 ± 0.26 | 11.81 ± 0.52 | 9.60 ± 2.70 | < 0.001 | — |
| IADL Score | 13.80 ± 3.82 | 15.05 ± 1.85 | 12.60 ± 3.01 | 2.79 ± 2.44 | < 0.001 | — |
| MMSE score | 27.95 ± 2.74 | 28.20 ± 2.40 | 27.08 ± 3.35 | 24.17 ± 5.12 | < 0.001 | 1680 (40.49%) |
| Chronic Condition | 0.68 ± 0.91 | 0.65 ± 0.85 | 0.77 ± 1.03 | 0.90 ± 1.09 | < 0.001 | 364 (8.77%) |
| Hypertension | ||||||
| Yes | 677 (16.9%) | 486 (15.7%) | 126 (20.3%) | 65 (22.3%) | 0.001 | 145 (3.49%) |
| No | 3327 (83.1%) | 2606 (84.3%) | 495 (79.7%) | 226 (77.7%) | ||
| Diabetes | ||||||
| Yes | 77 (1.9%) | 54 (1.7%) | 19 (3.1%) | 4 (1.4%) | 0.07 | 146 (3.52%) |
| No | 3926 (98.1%) | 3044 (98.3%) | 601 (96.9%) | 281 (98.6%) | ||
| Stroke | ||||||
| Yes | 178 (4.4%) | 110 (3.5%) | 36 (5.8%) | 32 (11.1%) | < 0.001 | 125 (3.01%) |
| No | 3946 (95.6%) | 3005 (96.5%) | 585 (94.2%) | 256 (88.9%) | ||
| Heart Disease | ||||||
| Yes | 363 (9.1%) | 260 (8.4%) | 71 (11.5%) | 32 (11.1%) | 0.02 | 141 (3.40%) |
| No | 3645 (90.9%) | 2841 (91.6%) | 548 (88.5%) | 256 (88.9%) | ||
| Household Income per Capita | ||||||
| Low level | 3029 (75.8%) | 2337 (75.2%) | 476 (77.7%) | 216 (78.3%) | 0.27 | 153 (3.69%) |
| High level | 967 (24.2%) | 770 (24.8%) | 137 (22.3%) | 60 (21.7%) | ||
| Adequate Health Services | ||||||
| Yes | 3792 (91.4%) | 2974 (92.7%) | 586 (91.3%) | 232 (78.1%) | < 0.001 | 1 (0.02%) |
| No | 356 (8.6%) | 235 (7.3%) | 56 (8.7%) | 65 (21.9%) | ||
| Sufficient Financial Support | ||||||
| Yes | 3312 (79.9%) | 2581 (80.4%) | 508 (79.3%) | 223 (75.1%) | 0.08 | 2 (0.05%) |
| No | 835 (20.1%) | 628 (19.6%) | 133 (20.7%) | 74 (24.9%) | ||
Values are presented as mean ± standard deviation or number (%)
Abbreviations: PWB, psychological well-being; BADL, basic activity of daily living; IADL, instrumental activity of daily living; MMSE, Mini-Mental State Examination
^a ANOVA and chi-square tests were performed; the null hypothesis is no difference across the three classes
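As an illustration of the chi-square comparison named in the footnote, the sketch below recomputes Pearson's statistic for the Sex rows, using the counts in the three trajectory-class columns of the table above. It is a plain-Python sketch rather than the authors' analysis code. The closed-form p-value `exp(-x/2)` is exact only for 2 degrees of freedom, which is the case for a 2 × 3 table.

```python
import math


def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Rows: male, female. Columns: the three trajectory classes, as counted in the table.
sex_by_class = [
    [1619, 248, 60],
    [1591, 394, 237],
]

stat, dof = chi_square_statistic(sex_by_class)
# Survival function of the chi-square distribution; this shortcut holds for dof == 2 only.
p_value = math.exp(-stat / 2)
print(f"chi2 = {stat:.1f}, dof = {dof}, p = {p_value:.2e}")
```

The statistic far exceeds 13.816, the critical value for 2 degrees of freedom at the 0.001 level, which agrees with the "< 0.001" reported for Sex.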
Performance of machine learning for three-class task prediction
| Accuracy | 0.706 | 0.734 | 0.773 | 0.735 | 0.771 | 0.714 | 0.744 | 0.774 | 0.744 | 0.771 |
| Recall | 0.779 | 0.810 | 0.844 | 0.785 | 0.834 | 0.780 | 0.810 | 0.843 | 0.780 | 0.840 |
| Precision | 0.769 | 0.807 | 0.854 | 0.775 | 0.833 | 0.771 | 0.800 | 0.848 | 0.773 | 0.848 |
| F1 Score | 0.760 | 0.807 | 0.848 | 0.766 | 0.832 | 0.761 | 0.802 | 0.845 | 0.759 | 0.843 |
| Hamming | 0.221 | 0.190 | 0.156 | 0.215 | 0.166 | 0.220 | 0.190 | 0.157 | 0.220 | 0.160 |
| Jaccard | 0.639 | 0.705 | 0.759 | 0.645 | 0.735 | 0.639 | 0.696 | 0.754 | 0.636 | 0.751 |
| Kappa | 0.507 | 0.495 | 0.568 | 0.516 | 0.567 | 0.512 | 0.518 | 0.575 | 0.514 | 0.564 |
Accuracy, recall, precision, and F1 score were all calculated as weighted metrics. Hamming, Jaccard, and Kappa refer to Hamming distance, Jaccard similarity coefficient, and Cohen’s kappa score, respectively
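The metrics in the footnote can be sketched in plain Python for a single-label multi-class task. The labels below are toy data, not study data, and the function names are hypothetical. For this kind of task, support-weighted recall equals the plain share of correct predictions, and the Hamming distance is its complement. This matches the table, where the Recall and Hamming rows sum to 1.000 in every column. The definition behind the separate Accuracy row is not given in this record.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Share of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall averaged with weights proportional to class support."""
    n = len(y_true)
    total = 0.0
    for label, support in Counter(y_true).items():
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        total += (support / n) * (hits / support)
    return total


def hamming_distance(y_true, y_pred):
    """Share of samples that are misclassified."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Agreement between truth and prediction, corrected for chance agreement."""
    n = len(y_true)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    observed = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    expected = sum(true_counts[label] * pred_counts[label] for label in labels) / n**2
    return (observed - expected) / (1 - expected)


# Toy labels for illustration only.
y_true = ["normal"] * 6 + ["progressive"] * 3 + ["high-onset"]
y_pred = ["normal"] * 5 + ["progressive"] * 3 + ["normal", "high-onset"]

print(weighted_recall(y_true, y_pred), hamming_distance(y_true, y_pred))
print(cohen_kappa(y_true, y_pred))
```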
Performance of machine learning for two-class task prediction
| Accuracy | 0.760 | 0.776 | 0.805 | 0.748 | 0.810 | 0.764 | 0.770 | 0.784 | 0.753 | 0.796 |
| Recall | 0.828 | 0.843 | 0.859 | 0.816 | 0.861 | 0.831 | 0.837 | 0.847 | 0.821 | 0.854 |
| Precision | 0.824 | 0.841 | 0.867 | 0.813 | 0.870 | 0.828 | 0.833 | 0.850 | 0.818 | 0.860 |
| F1 Score | 0.821 | 0.841 | 0.862 | 0.808 | 0.864 | 0.823 | 0.832 | 0.848 | 0.813 | 0.855 |
| Hamming | 0.172 | 0.157 | 0.141 | 0.184 | 0.139 | 0.169 | 0.163 | 0.153 | 0.179 | 0.146 |
| Jaccard | 0.707 | 0.738 | 0.768 | 0.690 | 0.772 | 0.710 | 0.724 | 0.748 | 0.697 | 0.759 |
| Kappa | 0.554 | 0.564 | 0.583 | 0.532 | 0.586 | 0.564 | 0.565 | 0.560 | 0.542 | 0.573 |
Accuracy, recall, precision, and F1 score were all calculated as weighted metrics. Hamming, Jaccard, and Kappa refer to Hamming distance, Jaccard similarity coefficient, and Cohen’s kappa score, respectively
Fig. 3 Decision curve analysis for the prediction models in the full-variable data set
Fig. 4 The relative feature importance (top 20) of RF (A–D) and XGBoost (E–H) in three-class prediction. A overall importance of RF; B SHAP summary plot of RF model when the expected trajectory is progressive; C SHAP summary plot of RF model when the expected trajectory is high-onset; D SHAP summary plot of RF model when the expected trajectory is normal; E overall importance of XGBoost; F SHAP summary plot of XGBoost model when the expected trajectory is progressive; G SHAP summary plot of XGBoost model when the expected trajectory is high-onset; H SHAP summary plot of XGBoost model when the expected trajectory is normal
Fig. 5 The relative feature importance (top 20) of RF (A–B) and XGBoost (C–D) in two-class prediction. A overall importance of RF; B SHAP summary plot of RF model when the expected outcome is abnormal; C overall importance of XGBoost; D SHAP summary plot of XGBoost model when the expected outcome is abnormal
Fig. 6 Local interpretation of samples based on RF and XGBoost. A RF explanation when the expected trajectory is progressive; B XGBoost explanation when the expected trajectory is progressive; C RF explanation when the expected trajectory is abnormal; D XGBoost explanation when the expected trajectory is abnormal
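Decision curve analysis plots a model's net benefit against the threshold probability at which one would act on a prediction. The sketch below uses the standard net-benefit definition for a binary outcome, with toy labels and probabilities that are not from the study. How the authors applied it to their models is not described in this record, so this should be read as a generic illustration with hypothetical function names.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit = TP/n - FP/n * pt / (1 - pt) at threshold probability pt."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify everyone as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy data: 1 = abnormal trajectory, 0 = normal trajectory.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.35, 0.6, 0.05, 0.15]

for pt in (0.1, 0.3, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt:.1f}  model={model_nb:.3f}  treat-all={all_nb:.3f}  treat-none=0.000")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and treat-none (zero) reference lines.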