Maria Teresa Climent1, Juan Pardo2, Francisco Javier Muñoz-Almaraz2, Maria Dolores Guerrero3, Lucrecia Moreno3.
Abstract
Purpose: Early detection of Mild Cognitive Impairment (MCI) is essential in aging societies, where dementia is becoming increasingly common among the elderly. Our aim is therefore to develop a decision tree that identifies individuals at risk of MCI among non-institutionalized elderly users of community pharmacies. The increasingly clinical and patient-oriented role of the community pharmacist in primary care makes the dispensing of medication a suitable occasion for effective, rapid, easy, and reproducible MCI screening.
Keywords: community pharmacists; decision trees; early detection; memory complaint; mild cognitive impairment; risk factors; sleep duration; statistical learning
Year: 2018 PMID: 30420808 PMCID: PMC6215965 DOI: 10.3389/fphar.2018.01232
Source DB: PubMed Journal: Front Pharmacol ISSN: 1663-9812 Impact factor: 5.810
Figure 1. Flowchart of the research study. The flow on the left represents the research study whose data were analyzed with machine learning algorithms. Based on the results of these techniques, a mass enrollment is proposed for early detection of MCI, and the flowchart of that procedure is displayed on the right.
Description of odds ratios for qualitative variables with at least one statistically significant category in a univariate logistic regression at a significance level α = 0.01.
| Never | 163 (27.12) | 70 (55.12) | | |
| Sometimes | 174 (28.95) | 31 (24.41) | 2.7e-04 | 0.41 (0.26,0.67) |
| Daily | 264 (43.93) | 26 (20.47) | 4.0e-09 | 0.23 (0.14,0.37) |
| No | 372 (61.9) | 43 (33.86) | | |
| Yes | 229 (38.1) | 84 (66.14) | 1.9e-08 | 3.17 (2.12,4.75) |
| No | 514 (85.52) | 83 (65.35) | | |
| Yes | 87 (14.48) | 44 (34.65) | 2e-07 | 3.13 (2.04,4.82) |
| Incomplete Primary | 39 (6.49) | 30 (23.62) | | |
| Primary | 378 (62.9) | 74 (58.27) | 6.0e-07 | 0.25 (0.15,0.44) |
| Secondary | 134 (22.3) | 20 (15.75) | 1.5e-06 | 0.19 (0.1,0.38) |
| Tertiary | 50 (8.32) | 3 (2.36) | 7.1e-05 | 0.08 (0.02,0.27) |
| Male | 260 (43.26) | 32 (25.2) | | |
| Female | 341 (56.74) | 95 (74.8) | 2.1e-4 | 2.26 (1.47,3.49) |
| No | 506 (84.19) | 89 (70.08) | | |
| Yes | 95 (15.81) | 38 (29.92) | 2.4e-4 | 2.27 (1.47,3.53) |
| No | 467 (77.7) | 114 (89.76) | | |
| Yes | 134 (22.3) | 13 (10.24) | 2.8e-3 | 0.4 (0.22,0.73) |
| 6 | 261 (43.43) | 77 (60.63) | | |
| 5 | 94 (15.64) | 20 (15.75) | 0.2402 | 0.72 (0.42,1.24) |
| 4 | 114 (18.97) | 18 (14.17) | 0.0282 | 0.54 (0.31,0.94) |
| 3 | 79 (13.14) | 7 (5.51) | 0.0038 | 0.3 (0.13,0.68) |
| 2 | 31 (5.16) | 5 (3.94) | 0.2263 | 0.55 (0.21,1.45) |
| 1 | 22 (3.66) | 0 (0) | 0.9761 | 0 (0, Inf) |
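The odds ratios and intervals in the table above can be checked from the counts alone. The following is a minimal sketch, assuming the standard sample odds ratio with a Wald 95% confidence interval on the log scale, which for a single binary predictor should coincide with the output of a univariate logistic regression. The function name `odds_ratio_wald` and its argument names are hypothetical; the counts are taken from the Male/Female rows.

```python
import math

def odds_ratio_wald(ref_first, ref_second, cat_first, cat_second, z=1.96):
    """Sample odds ratio of a category versus the reference row.

    ref_* are the two counts of the reference row, cat_* the two counts of
    the category row, in the same column order as the table.
    Returns (odds ratio, lower bound, upper bound) of a Wald interval.
    """
    odds_ratio = (cat_second / cat_first) / (ref_second / ref_first)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se = math.sqrt(1 / ref_first + 1 / ref_second + 1 / cat_first + 1 / cat_second)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se), math.exp(log_or + z * se)

# Reference row "Male": 260 and 32; category row "Female": 341 and 95.
estimate, lower, upper = odds_ratio_wald(260, 32, 341, 95)
print(f"{estimate:.2f} ({lower:.2f}, {upper:.2f})")  # 2.26 (1.47, 3.49)
```

This formula breaks down when a cell count is zero (the reciprocal is undefined), which is consistent with the degenerate `0 (0, Inf)` entry in the last row of the table.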
Description of quantitative variables that are significant in a univariate logistic regression at a significance level α = 0.01.
| Height | 1 | 1.62 (0.08) | 1.59 (0.09) | 3.2e-05 | |
| P20-P80 N(%) | 434 (72.21) | 77 (60.63) | |||
| P20 N(%) | 89 (14.81) | 41 (32.28) | 2.4e-05 | 2.6 (1.67,4.04) | |
| P80 N(%) | 78 (12.98) | 8 (6.3) | 1.6e-01 | 0.58 (0.27,1.24) | |
| Overnight sleeping time | 0 | 7.07 (1.57) | 7.72 (1.83) | 6.1e-05 | |
| 6-9 N(%) | 466 (77.54) | 90 (70.87) | |||
| < 6 N(%) | 99 (16.47) | 16 (12.6) | 0.54318 | 0.84 (0.47,1.49) | |
| >9 N(%) | 36 (5.99) | 21 (16.54) | 0.00021 | 3.02 (1.69,5.41) | |
| Sleeping time | 0 | 7.62 (1.75) | 8.21 (1.93) | 0.001 | |
| 6-8h N(%) | 411 (68.39) | 66 (51.97) | |||
| < 6h N(%) | 51 (8.49) | 10 (7.87) | 5.9e-01 | 1.22 (0.59,2.52) | |
| >9h N(%) | 139 (23.13) | 51 (40.16) | 8.8e-05 | 2.28 (1.51,3.45) | |
| Age | 0 | 74.17 (6.28) | 75.84 (6.77) | 0.0079 | |
| 65-69 N(%) | 162 (26.96) | 29 (22.83) | |||
| 70-74 N(%) | 182 (30.28) | 26 (20.47) | 0.4380 | 0.8 (0.45,1.41) | |
| 75-79 N(%) | 130 (21.63) | 37 (29.13) | 0.0912 | 1.59 (0.93,2.72) | |
| 80-84 N(%) | 88 (14.64) | 16 (12.6) | 0.9633 | 1.02 (0.52,1.97) | |
| 85+ N(%) | 39 (6.49) | 19 (14.96) | 0.0037 | 2.72 (1.38,5.35) |
These variables have been categorized into several groups, and the sample odds ratio and 95% confidence interval are reported for each group.
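As an illustration of this categorization step, the sketch below bins a quantitative variable into the three groups shown in the overnight sleeping time rows (`< 6`, `6-9`, `>9`) and counts the members of each group. It assumes the boundaries 6 and 9 belong to the middle group, as the labels suggest; the function name `sleep_group` is hypothetical and the sample values are made up for the example.

```python
from collections import Counter

def sleep_group(hours):
    """Map overnight sleeping time (hours) to one of three groups.

    Assumption: 6 and 9 hours both fall in the middle "6-9" group.
    """
    if hours < 6:
        return "<6"
    if hours <= 9:
        return "6-9"
    return ">9"

# Made-up sleeping times, used only to show the grouping.
hours_sample = [5.5, 7.0, 8.0, 9.5, 6.0, 10.0]
counts = Counter(sleep_group(h) for h in hours_sample)
print(dict(counts))
```

The resulting group counts, split by outcome, are the inputs needed to compute a per-group odds ratio against the reference group.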
Figure 2. Tree developed with a recursive partitioning algorithm using a training set. In every box, the first line states whether or not the user is classified as being at risk of MCI. The second line consists of two numbers, which indicate the estimated probability of being positive for MCI. The last line in the box is the percentage of the data fulfilling these conditions. The text below the box is the question corresponding to the next split. The warmer the color of the box, the more likely a user in that node is to be positive for MCI.
Figure 3. ROC curves of the predictive models for the test set. Each model assigns a probability to every user, and the sensitivity and specificity are calculated for all possible cut-off points. Blue corresponds to Random Forest (RF), red to Extreme Gradient Boosting (XGBoost), purple to Stochastic Gradient Boosting (GBM), and black to the ensemble of models.
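The construction described in this caption can be sketched as follows: every distinct predicted probability is used as a cut-off, sensitivity and 1 − specificity are computed at each one, and the area under the resulting curve is obtained with the trapezoidal rule. This is a minimal sketch, not the code used in the study; the function names and the four-user toy data are hypothetical.

```python
def roc_curve_points(labels, scores):
    """Return (1 - specificity, sensitivity) pairs for all cut-off points.

    labels: 1 for a positive case, 0 for a negative case.
    scores: probability assigned by the model to each case.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    # A case is predicted positive when its score reaches the cut-off.
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points

def auc_trapezoid(points):
    """Area under the curve through consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# Toy data: two negative and two positive users.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc_trapezoid(roc_curve_points(toy_labels, toy_scores)))  # 0.75
```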
Comparison of the area under ROC curve (AUC) for several models.
| Random forest | 0.6987 | 0.7667 | (0.6709, 0.8624) |
| Extreme gradient boosting | 0.6655 | 0.7873 | (0.691, 0.8837) |
| GBM | 0.6634 | 0.7557 | (0.6532, 0.8582) |
| Ensemble model | 0.7471 | 0.8007 | (0.7044, 0.8969) |
The parameters used by each model are: Random Forest (mtry = 23); Extreme Gradient Boosting (nrounds = 30, max_depth = 1, eta = 0.45, gamma = 0, colsample_bytree = 0.7, and min_child_weight = 1); and GBM (Stochastic Gradient Boosting) (n.trees = 50, interaction.depth = 1, shrinkage = 0.1, and n.minobsinnode = 10). The final one is the Ensemble Model, which combines Random Forest (mtry = 169), Extreme Gradient Boosting (nrounds = 50, max_depth = 1, eta = 0.3, gamma = 0, colsample_bytree = 0.8, and min_child_weight = 1), and GBM (n.trees = 50, interaction.depth = 3, shrinkage = 0.1, and n.minobsinnode = 10).
Correlation of the models in the Ensemble Model: Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Gradient Boosting Machine (GBM).
| | GBM | RF | XGBoost |
| GBM | 1.0000000 | 0.31819462 | 0.45761406 |
| RF | 0.3181946 | 1.00000000 | 0.05869544 |
| XGBoost | 0.4576141 | 0.05869544 | 1.00000000 |
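A symmetric matrix of this kind can be produced by computing a pairwise Pearson correlation between one vector of values per model. The source does not state which quantities were correlated (for example predicted probabilities or resampled performance), so the sketch below simply assumes one numeric vector per model; the function names and the numbers are hypothetical and made up for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Nested dict of pairwise correlations between named vectors."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}

# Made-up values for three models, used only to show the layout.
values = {
    "GBM": [0.10, 0.40, 0.35, 0.80],
    "RF": [0.20, 0.30, 0.50, 0.70],
    "XGBoost": [0.15, 0.50, 0.20, 0.90],
}
matrix = correlation_matrix(values)
for name, row in matrix.items():
    print(name, [round(row[other], 3) for other in values])
```

Low off-diagonal values are generally taken as a sign that the models make different errors, which is the usual motivation for combining them in an ensemble.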
Importance of the variables, scaled according to the varImp method in the caret library, for the complete data set.
| Memory complaint | 14.31 | 9.69 | 3.93 | 19.95 |
| Overnight sleeping time | 11.93 | 9.04 | 6.82 | 14.87 |
| Educational attainment (linear contrast) | 11.29 | 8.85 | 5.51 | 14.41 |
| Reading (linear contrast) | 10.73 | 9.75 | 5.35 | 13.33 |
| Age | 6.86 | 9.00 | 6.76 | 6.41 |
| Sex | 4.41 | 4.84 | 1.27 | 5.69 |
| N06 | 4.12 | 3.14 | 1.38 | 5.55 |
| M01 | 3.87 | 1.27 | 0.63 | 5.90 |
| N06B | 2.71 | 3.01 | 0.82 | 3.47 |
| Job | 2.40 | 1.74 | 2.00 | 2.72 |
| Sleeping time | 2.24 | 5.90 | 5.38 | 0.00 |
| Games (linear contrast) | 1.52 | 1.76 | 1.46 | 1.49 |
| M01A | 1.40 | 1.74 | 0.49 | 1.73 |
| Smoking (linear contrast) | 1.39 | 0.33 | 0.44 | 2.06 |
| M02 | 1.02 | 0.31 | 0.39 | 1.47 |
| C08 | 0.83 | 1.79 | 1.28 | 0.40 |
| TV consumption | 0.79 | 1.30 | 2.30 | 0.00 |
| Day time nap | 0.78 | 1.68 | 2.07 | 0.00 |
| Educational attainment (cubic contrast) | 0.64 | 2.94 | 0.85 | 0.00 |
| Exercises (cubic contrast) | 0.57 | 0.84 | 1.63 | 0.04 |
Variables are ordered by their importance in the ensemble model.
Figure 4. Comparison of predictor and response in the test set for the predictive models discussed in the article. The x-axis shows the name of each model, split into two groups (positive and negative for MCI according to the score in the test); the y-axis shows the probability (predictor) assigned by the model.