Shaowu Lin1,2,3, Yafei Wu1,2,3, Ya Fang1,2,3.
Abstract
BACKGROUND: Depression is highly prevalent and is considered the most common psychiatric disorder among the home-based elderly, yet research on forecasting depression risk in this population is still limited. To improve the accuracy of depression forecasting, machine learning (ML) approaches have been recommended alongside more traditional regression approaches.
Keywords: depression; home-based elderly; machine learning; prediction; regression
Year: 2022 PMID: 35111085 PMCID: PMC8801448 DOI: 10.3389/fpsyt.2021.764806
Source DB: PubMed Journal: Front Psychiatry ISSN: 1664-0640 Impact factor: 4.157
Figure 1. Model procedure for training and testing data.
Figure 2. A flow chart for study population selection.
Table 1. Characteristics of the study population in 2011.
| Characteristic | | | P value |
|---|---|---|---|
| Age, years | |||
| 60- | 659 (67.52%) | 1,842 (71.01%) | 0.059 |
| 70- | 276 (28.28%) | 674 (25.98%) | |
| 80- | 41 (4.20%) | 78 (3.01%) | |
| Sex | |||
| Male | 424 (43.44%) | 1,549 (59.71%) | 0.000 |
| Female | 552 (56.56%) | 1,045 (40.29%) | |
| Rural/urban community | |||
| Rural | 708 (72.54%) | 1,463 (56.40%) | 0.000 |
| Urban | 268 (27.46%) | 1,131 (43.60%) | |
| Hukou status | |||
| Agriculture | 835 (85.55%) | 1,839 (70.89%) | 0.000 |
| Non-agriculture | 141 (14.44%) | 754 (29.11%) | |
| Geographic location | |||
| East | 278 (28.48%) | 1,008 (38.86%) | 0.000 |
| Central | 342 (35.04%) | 877 (33.81%) | |
| West | 356 (36.48%) | 709 (27.33%) | |
| Marital status | |||
| Single | 223 (22.85%) | 385 (14.84%) | 0.000 |
| Married | 753 (77.15%) | 2,209 (85.16%) | |
| Educational attainment | |||
| Low | 867 (88.83%) | 1,922 (74.09%) | 0.000 |
| High | 109 (11.17%) | 672 (25.91%) | |
| Occupational status | |||
| Agricultural work | 514 (52.66%) | 1,113 (42.91%) | 0.000 |
| Non-agricultural work | 66 (6.76%) | 348 (13.42%) | |
| Retired | 379 (38.83%) | 1,078 (41.56%) | |
| Unemployed/never work | 17 (1.74%) | 55 (2.12%) | |
| Household per capita income, yuan | |||
| 0- | 714 (73.16%) | 1,735 (66.89%) | 0.001 |
| 5000- | 131 (13.42%) | 448 (17.27%) | |
| 10000- | 131 (13.42%) | 411 (15.84%) | |
| Life satisfaction | |||
| Satisfied | 136 (13.93%) | 724 (27.91%) | 0.000 |
| Medium | 648 (66.39%) | 1,718 (66.23%) | |
| Not satisfied | 192 (19.67%) | 152 (5.86%) | |
| Major misfortune injury experience | |||
| Ever | 124 (9.78%) | 336 (9.41%) | 0.007 |
| Never | 1,144 (90.22%) | 3,234 (90.59%) | |
| Self-reported health status before 15 years old | |||
| Good | 689 (70.59%) | 2,041 (78.68%) | 0.000 |
| Fair | 188 (19.26%) | 401 (15.46%) | |
| Poor | 99 (10.14%) | 152 (5.86%) | |
| Social activities | |||
| Never | 552 (56.56%) | 1,219 (46.99%) | 0.000 |
| Ever | 424 (43.44%) | 1,375 (53.01%) | |
| Smoking | |||
| Never | 598 (61.27%) | 1,375 (53.01%) | 0.000 |
| Ever | 378 (38.73%) | 1,219 (46.99%) | |
| Drinking | |||
| Never | 693 (71.00%) | 1,672 (64.46%) | 0.000 |
| Ever | 283 (29.00%) | 922 (35.54%) | |
| Self-reported memory | |||
| Good | 77 (7.89%) | 626 (24.13%) | 0.000 |
| Fair | 364 (37.30%) | 1,270 (48.96%) | |
| Poor | 535 (54.82%) | 698 (26.91%) | |
| Medical insurance | |||
| No | 66 (6.76%) | 127 (4.90%) | 0.028 |
| Yes | 910 (93.24%) | 2,467 (95.10%) | |
| Medical service | |||
| No | 716 (73.36%) | 2,163 (83.38%) | 0.000 |
| Yes | 260 (26.64%) | 431 (16.62%) | |
| Sleeping time, hour | |||
| 0- | 301 (30.84%) | 285 (10.99%) | 0.000 |
| 4- | 327 (33.50%) | 898 (34.62%) | |
| 6- | 281 (28.79%) | 1,168 (45.03%) | |
| 8- | 67 (6.86%) | 243 (9.37%) | |
| Chronic disease | |||
| No | 181 (18.55%) | 847 (32.65%) | 0.000 |
| Yes | 795 (81.45%) | 1,747 (67.35%) | |
| ADL impairment | |||
| No | 520 (53.28%) | 2,023 (77.99%) | 0.000 |
| Yes | 456 (46.72%) | 571 (22.01%) | |
| Disability | |||
| No | 710 (72.75%) | 2,174 (83.81%) | 0.000 |
| Yes | 266 (27.25%) | 420 (16.19%) | |
| Cognitive ability | 7.51 ± 3.91 | 9.76 ± 3.99 | 0.000 |
| CESD-10 score | 5.73 ± 2.92 | 5.22 ± 2.80 | 0.000 |
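The percentages and the final column of the table can be sanity-checked from the counts alone. The sketch below recomputes the column percentages for the Sex rows and runs a Pearson chi-square test on the resulting 2 × 2 table. The source does not state which test produced the values in the last column, so the choice of Pearson's test (without continuity correction) is an assumption, and the function names are illustrative.

```python
import math

# Counts from the Sex rows of the table
# (rows: male, female; columns: the two study groups).
observed = [[424, 1549],
            [552, 1045]]


def column_percentages(table):
    """Share of each column total, as printed in parentheses in the table."""
    col_totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, col_totals)]
            for row in table]


def pearson_chi2_2x2(table):
    """Pearson chi-square statistic and P value for a 2 x 2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    # With one degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


percentages = column_percentages(observed)
chi2, p_value = pearson_chi2_2x2(observed)
print([[round(p, 2) for p in row] for row in percentages])
print(f"chi2 = {chi2:.2f}, P = {p_value:.2e}")
```

Under these assumptions the P value falls far below 0.001, which is consistent with the 0.000 printed for this variable.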
Table 2. Overview of the performance estimates for each prediction model.
| Model | AUC (M) | AUC (Med) | BS (M) | BS (Med) | Sens | PPV |
|---|---|---|---|---|---|---|
| Logistic | 0.795 | 0.788 | 0.156 | 0.159 | 0.404 | 0.656 |
| Lasso | 0.794 | 0.788 | 0.156 | 0.158 | 0.402 | 0.665 |
| Ridge | 0.794 | 0.788 | 0.156 | 0.159 | 0.402 | 0.654 |
| Random forest | 0.769 | 0.771 | 0.164 | 0.165 | 0.293 | 0.643 |
AUC, area under the receiver operating characteristic curve; BS, scaled Brier score; Sens, sensitivity; PPV, positive predictive value; M, mean; Med, median.
Figure 3. Boxplot of 100 resampling results for each prediction model (see median results in Table 2). Logistic, Logistic regression model; Rf, Random forest model. Left: Area under the curve (AUC). Right: Scaled Brier score, with values below zero indicating a model performance/calibration inferior to that of a chance prediction model applied to the validation dataset.
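The four performance measures reported for the models can be illustrated on a small example. The sketch below uses textbook definitions rather than the authors' code: AUC as the probability that a case receives a higher predicted risk than a non-case, the scaled Brier score as 1 minus the ratio of the Brier score to that of a constant prediction equal to the observed prevalence, and sensitivity and PPV at a classification threshold. The toy data, the 0.5 threshold, and all function names are assumptions made for illustration.

```python
def auc(y_true, y_prob):
    """Probability that a positive case outranks a negative one (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def scaled_brier(y_true, y_prob):
    """1 - Brier / Brier_ref; values below zero are worse than predicting the prevalence."""
    n = len(y_true)
    brier = sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / n
    prevalence = sum(y_true) / n
    brier_ref = prevalence * (1 - prevalence)
    return 1 - brier / brier_ref


def sensitivity_ppv(y_true, y_prob, threshold=0.5):
    """Sensitivity and positive predictive value at an assumed threshold."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tp = sum(1 for y, h in zip(y_true, y_pred) if y == 1 and h == 1)
    fn = sum(1 for y, h in zip(y_true, y_pred) if y == 1 and h == 0)
    fp = sum(1 for y, h in zip(y_true, y_pred) if y == 0 and h == 1)
    return tp / (tp + fn), tp / (tp + fp)


# Invented toy data: 3 cases and 5 non-cases with predicted risks.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.6, 0.3, 0.7, 0.2, 0.2, 0.1, 0.1]

auc_value = auc(y_true, y_prob)
bs_value = scaled_brier(y_true, y_prob)
sens, ppv = sensitivity_ppv(y_true, y_prob)
print(round(auc_value, 3), round(bs_value, 3), round(sens, 3), round(ppv, 3))
```

In a resampling design such as the one summarized in the boxplots, each of these values would be computed once per validation split and then summarized by its mean and median.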
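Table 3. Overview of the decreasing importance of the 23 baseline predictors for each prediction model.
| Predictor | Logistic β | Logistic OR | Logistic rank | Logistic % | Lasso β | Lasso OR | Lasso rank | Lasso % | Ridge β | Ridge OR | Ridge rank | Ridge % | Random forest importance | Random forest rank |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|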
| Life satisfaction | 0.83 | 2.29 | 1 | 128.6 | 0.13 | 1.14 | 1 | 13.8 | 0.12 | 1.13 | 1 | 13.2 | 6.392 | 5 |
| Self-reported memory | 0.59 | 1.80 | 2 | 79.9 | 0.09 | 1.09 | 3 | 9.0 | 0.08 | 1.08 | 4 | 8.5 | 6.702 | 4 |
| Chronic disease | 0.57 | 1.77 | 3 | 77.0 | 0.07 | 1.07 | 5 | 6.9 | 0.07 | 1.07 | 5 | 7.4 | 2.911 | 12 |
| Sex | 0.54 | 1.72 | 4 | 72.2 | 0.08 | 1.09 | 4 | 8.7 | 0.09 | 1.09 | 3 | 9.3 | 2.704 | 15 |
| ADL barrier | 0.54 | 1.71 | 5 | 71.1 | 0.10 | 1.10 | 2 | 10.1 | 0.09 | 1.10 | 2 | 9.9 | 4.390 | 8 |
| Sleeping time | −0.44 | 0.65 | 6 | 35.3 | −0.07 | 0.94 | 6 | 6.4 | −0.06 | 0.94 | 6 | 6.2 | 8.166 | 3 |
| Medical service | 0.39 | 1.47 | 7 | 46.9 | 0.04 | 1.04 | 10 | 4.3 | 0.05 | 1.05 | 9 | 5.0 | 2.834 | 14 |
| Rural/urban community | −0.37 | 0.69 | 8 | 30.7 | −0.05 | 0.95 | 8 | 5.0 | −0.05 | 0.95 | 8 | 4.8 | 2.519 | 19 |
| Disability | 0.32 | 1.37 | 9 | 37.0 | 0.04 | 1.04 | 11 | 4.0 | 0.04 | 1.05 | 10 | 4.6 | 2.692 | 17 |
| Marital status | −0.31 | 0.74 | 10 | 26.5 | −0.05 | 0.95 | 7 | 5.0 | −0.06 | 0.94 | 7 | 5.6 | 2.549 | 18 |
| Medical insurance | −0.26 | 0.77 | 11 | 23.1 | 0.00 | 1.00 | 20 | 0.3 | −0.02 | 0.98 | 15 | 2.3 | 1.254 | 23 |
| Major misfortune injury experience | −0.22 | 0.80 | 12 | 20.0 | −0.02 | 0.98 | 13 | 2.3 | −0.04 | 0.96 | 13 | 3.5 | 1.749 | 22 |
| Educational attainment | −0.22 | 0.80 | 13 | 19.8 | −0.04 | 0.96 | 12 | 3.6 | −0.04 | 0.96 | 12 | 3.9 | 1.891 | 21 |
| Geographic location | 0.20 | 1.22 | 14 | 21.8 | 0.04 | 1.04 | 9 | 4.3 | 0.04 | 1.04 | 11 | 4.4 | 5.426 | 6 |
| Hukou status | −0.18 | 0.83 | 15 | 16.8 | 0.00 | 1.00 | 19 | 0.3 | −0.01 | 0.99 | 18 | 1.1 | 1.895 | 20 |
| Self-reported health status before 15 years old | 0.16 | 1.17 | 16 | 16.8 | 0.02 | 1.02 | 14 | 2.3 | 0.03 | 1.03 | 14 | 2.8 | 3.771 | 10 |
| Social activities | −0.10 | 0.91 | 17 | 9.5 | −0.01 | 0.99 | 15 | 1.1 | −0.02 | 0.98 | 16 | 1.8 | 3.407 | 11 |
| Smoking | 0.07 | 1.07 | 18 | 7.2 | 0.00 | 1.00 | 22 | 0.0 | 0.01 | 1.01 | 22 | 0.6 | 2.700 | 16 |
| Household per capita income | −0.05 | 0.95 | 19 | 5.3 | 0.00 | 1.00 | 21 | 0.0 | −0.01 | 0.99 | 21 | 0.6 | 3.948 | 9 |
| CESD-10 score | 0.04 | 1.04 | 20 | 4.3 | 0.01 | 1.01 | 18 | 0.5 | 0.01 | 1.01 | 20 | 0.6 | 11.508 | 2 |
| Occupational status | −0.04 | 0.96 | 21 | 4.1 | −0.01 | 0.99 | 16 | 1.1 | −0.01 | 0.99 | 17 | 1.3 | 4.522 | 7 |
| Cognitive ability | −0.04 | 0.96 | 22 | 4.0 | −0.01 | 0.99 | 17 | 0.8 | −0.01 | 0.99 | 19 | 0.8 | 13.206 | 1 |
| Drinking | −0.01 | 1.00 | 23 | 0.5 | 0.00 | 1.00 | 23 | 0.0 | 0.01 | 1.01 | 23 | 0.5 | 2.865 | 13 |
Order according to the predictor ranking of the logistic regression model; β, beta coefficient of the (penalized) logistic regression model; OR, odds ratio; %, OR translated to a percentage. The original importance values of the random forest model have been multiplied by 100 to avoid displaying too many digits.
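The relationship between β, OR, and % can be sketched as follows. The odds ratio is the exponential of the coefficient; reading "%" as the absolute distance of the OR from 1, times 100, is an interpretation inferred from the tabulated values (for example OR 1.47 alongside 46.9), not a formula stated in the source. The function name is illustrative, and because the tabulated β values are rounded to two decimals, recomputed results differ slightly from the printed ones.

```python
import math


def odds_ratio_percent(beta):
    """Convert a logistic coefficient to an odds ratio and a percent change in odds."""
    odds_ratio = math.exp(beta)
    # Assumed reading of the "%" column: |OR - 1| * 100.
    percent = abs(odds_ratio - 1) * 100
    return odds_ratio, percent


# Rounded logistic-model coefficients taken from the table.
life_or, life_pct = odds_ratio_percent(0.83)     # Life satisfaction
sleep_or, sleep_pct = odds_ratio_percent(-0.44)  # Sleeping time

print(f"Life satisfaction: OR = {life_or:.2f}, % = {life_pct:.1f}")
print(f"Sleeping time:     OR = {sleep_or:.2f}, % = {sleep_pct:.1f}")
```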
Figure 4. Display of the top 15 predictor variables for predicting depression using Logistic (A), Lasso (B), Ridge (C), and Random forest (D). Feature: predictor; Overall: 100 = most important; and 0 = least important. Self-reported health status*: Self-reported health status before 15 years old.
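A 0 to 100 "Overall" scale of this kind is commonly produced by min-max rescaling of the raw importance values. The sketch below applies that rescaling to a few random forest importance values from the predictor table. The source states only the endpoints of the scale, so the min-max formula is an assumption, and the function name is illustrative.

```python
def rescale_0_100(importances):
    """Min-max rescale so the largest value maps to 100 and the smallest to 0."""
    lo = min(importances.values())
    hi = max(importances.values())
    return {name: 100 * (value - lo) / (hi - lo)
            for name, value in importances.items()}


# Subset of random forest importance values from the predictor table,
# including the largest (Cognitive ability) and smallest (Medical insurance).
rf_importance = {
    "Cognitive ability": 13.206,
    "CESD-10 score": 11.508,
    "Sleeping time": 8.166,
    "Self-reported memory": 6.702,
    "Life satisfaction": 6.392,
    "Medical insurance": 1.254,
}

scaled = rescale_0_100(rf_importance)
for name, value in sorted(scaled.items(), key=lambda item: -item[1]):
    print(f"{name:22s} {value:6.1f}")
```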