Ingrid Rye, Alexandra Vik, Marek Kocinski, Alexander S Lundervold, Astri J Lundervold.
Abstract
Patients with Mild Cognitive Impairment (MCI) have an increased risk of Alzheimer's disease (AD). Early identification of underlying neurodegenerative processes is essential to provide treatment before the disease is well established in the brain. Here we used longitudinal data from the ADNI database to investigate prediction of a trajectory towards AD in a group of patients defined as MCI at a baseline examination. One group remained stable over time (sMCI, n = 357) and one converted to AD (cAD, n = 321). By running two independent classification methods within a machine learning framework, with cognitive function, hippocampal volume and genetic APOE status as features, we obtained a cross-validation classification accuracy of about 70%. This level of accuracy was confirmed across different classification methods and validation procedures. Moreover, the sets of misclassified subjects had a large overlap between the two models. Impaired memory function was consistently found to be one of the core symptoms of MCI patients on a trajectory towards AD. The prediction above chance level shown in the present study should inspire further work to develop tools that can aid clinicians in making prognostic decisions.
Year: 2022 PMID: 36114257 PMCID: PMC9481567 DOI: 10.1038/s41598-022-18805-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Demographics, cognitive function and biological measures in patients defined as stable MCI and converters to AD.
| | sMCI (N = 357) | cAD (N = 321) | t/χ² | p value | Effect size |
|---|---|---|---|---|---|
| | Mean (SD) | Mean (SD) | | | |
| Age | 73.1 (7.45) | 73.9 (7.11) | 1.35 | 0.176 | 0.10 |
| Gender (%F) | 41.2 | 38.9 | 0.352 | 0.553 | 0.02 |
| RAVLT-Im | 36.9 (10.5) | 29.3 (7.7) | 10.56 | < 0.001 | 0.81 |
| RAVLT-delay | 4.88 (3.93) | 2.05 (2.67) | 10.87 | < 0.001 | 0.84 |
| RAVLT-recog | 11.26 (3.16) | 9.42 (3.56) | 7.13 | < 0.001 | 0.55 |
| TMTA | 39.2 (15.6) | 44.7 (21.5) | 3.90 | < 0.001 | 0.30 |
| TMTB | 108.1 (56.9) | 133.8 (73.9) | 5.10 | < 0.001 | 0.39 |
| CFT animals | 17.8 (5.17) | 15.8 (4.75) | 5.13 | < 0.001 | 0.39 |
| GDS: mean (SD) | 1.71 (1.44) | 1.65 (1.38) | 0.53 | 0.596 | 0.04 |
| ANART Total errors | 12.9 (9.3) | 13.3 (9.6) | 0.61 | 0.539 | 0.05 |
| Hippocampus volume | 0.00451 (7.6 × 10⁻⁴) | 0.00398 (6.8 × 10⁻⁴) | 9.64 | < 0.001 | 0.74 |
| APOE (%positive) | 42.3 | 64.2 | 32.45 | < 0.001 | 0.22 |
sMCI stable Mild Cognitive Impairment, cAD converted to Alzheimer’s disease, RAVLT Rey Auditory Verbal Learning Test, TMT Trail Making Test, CFT Category Fluency Test, GDS Geriatric Depression Scale, ANART American National Adult Reading Test.
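The t statistics and effect sizes for the continuous measures can be roughly checked from the summary statistics alone. The sketch below assumes the effect size is Cohen's d with a pooled standard deviation and that the test is a Student's t-test with equal variances; the record states neither, and the function names are illustrative. Using the RAVLT-Im row, the results land close to the reported 10.56 and 0.81, with small differences expected from rounding of the published means and SDs.

```python
from math import sqrt

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Absolute standardized mean difference (assumed effect-size definition)."""
    return abs(m1 - m2) / pooled_sd(sd1, n1, sd2, n2)

def student_t(m1, sd1, n1, m2, sd2, n2):
    """Absolute equal-variance two-sample t statistic from summary statistics."""
    standard_error = pooled_sd(sd1, n1, sd2, n2) * sqrt(1 / n1 + 1 / n2)
    return abs(m1 - m2) / standard_error

# RAVLT-Im row: sMCI 36.9 (10.5), N = 357; cAD 29.3 (7.7), N = 321
ravlt_im = dict(m1=36.9, sd1=10.5, n1=357, m2=29.3, sd2=7.7, n2=321)
d = cohens_d(**ravlt_im)
t = student_t(**ravlt_im)
print(f"d = {d:.2f}, t = {t:.2f}")
```

The Gender and APOE rows report proportions, so they would need a chi-square test rather than this calculation.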
Figure 1. 2 × 2 confusion matrices computed for the sMCI and cAD labels returned from prediction on the test set, compared with the co-occurrences of the observed outcome. The black and purple cells represent misclassified subjects, while the beige and red cells represent correctly classified subjects. The number of occurrences in each cell is given as number of subjects and percentage of the total test set for the RF model and the ensemble.
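As a minimal sketch of how such a matrix translates into summary metrics, the snippet below uses the TN/FP/FN/TP counts reported in the table of correctly and misclassified patients, treating cAD as the positive class. Which of the two models those counts belong to is not stated in this record, so the numbers are illustrative rather than a reproduction of either panel.

```python
# Counts from the table of correctly and misclassified patients
# (positive class = cAD, negative class = sMCI).
counts = {"TN": 54, "FP": 20, "FN": 24, "TP": 41}

total = sum(counts.values())

# Each cell as a percentage of the whole test set, as in the figure.
cell_pct = {cell: 100 * n / total for cell, n in counts.items()}

accuracy = (counts["TP"] + counts["TN"]) / total
sensitivity = counts["TP"] / (counts["TP"] + counts["FN"])  # converters detected
specificity = counts["TN"] / (counts["TN"] + counts["FP"])  # stable patients detected

for cell, pct in cell_pct.items():
    print(f"{cell}: {counts[cell]} subjects ({pct:.1f}% of test set)")
print(f"accuracy={accuracy:.3f} sensitivity={sensitivity:.3f} specificity={specificity:.3f}")
```

With these counts the accuracy is 95 of 139 subjects, in line with the roughly 70% reported in the abstract.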
Figure 2. The figure illustrates the two models’ overlap in misclassified sMCI (a) and cAD (b). Gray symbols represent subjects for which the two models overlapped in misclassification. Purple and blue symbols represent additional subjects misclassified by the Random Forest model and the ensemble model, respectively.
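One simple way to quantify this kind of overlap is to compare the sets of misclassified subject IDs from the two models. The sketch below is an illustration only: the subject IDs are hypothetical placeholders, and the Jaccard index is one possible overlap measure, not one the source reports.

```python
def misclassification_overlap(errors_a, errors_b):
    """Split two models' misclassified subjects into shared and model-specific sets."""
    a, b = set(errors_a), set(errors_b)
    shared, union = a & b, a | b
    return {
        "shared": shared,       # misclassified by both models
        "only_a": a - b,        # misclassified by model A only
        "only_b": b - a,        # misclassified by model B only
        "jaccard": len(shared) / len(union) if union else 0.0,
    }

# Hypothetical subject IDs, for illustration only.
rf_errors = {"s01", "s02", "s03", "s04"}
ensemble_errors = {"s02", "s03", "s04", "s05"}

overlap = misclassification_overlap(rf_errors, ensemble_errors)
print(sorted(overlap["shared"]), overlap["jaccard"])
```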
Demographics, cognitive function and biological measures in correctly and misclassified patients.
| | TN (n = 54) | FP (n = 20) | FN (n = 24) | TP (n = 41) | Multiple comparisons |
|---|---|---|---|---|---|
| | Mean (SD) | Mean (SD) | Mean (SD) | Mean (SD) | |
| Age | 72.1 (7.32) | 74.9 (7.19) | 74.8 (8.01) | 73.3 (7.61) | – |
| Gender (% F) | 38.9 | 55.0 | 29.2 | 43.9 | – |
| RAVLT-Im | 39.15 (8.85) | 30.60 (7.23) | 35.38 (7.60) | 28.12 (4.83) | a, b |
| RAVLT-delay | 5.89 (3.41) | 1.20 (1.54) | 4.38 (3.00) | 1.24 (1.55) | a, b |
| RAVLT-recog | 12.15 (2.66) | 9.40 (3.42) | 11.71 (2.60) | 8.66 (3.63) | a, b |
| TMTA | 37.2 (13.1) | 40.8 (8.3) | 42.6 (28.9) | 45.1 (25.8) | – |
| TMTB | 91.6 (31.9) | 129.6 (61.3) | 130.0 (88.6) | 134.4 (77.4) | a |
| CFT animals | 18.69 (4.82) | 16.80 (5.35) | 16.00 (4.23) | 15.81 (4.24) | – |
| GDS | 1.82 (1.35) | 1.85 (1.14) | 1.29 (1.12) | 1.51 (1.25) | – |
| ANART total errors | 13.0 (9.7) | 9.2 (7.3) | 13.0 (9.9) | 13.3 (10.3) | – |
| Hippocampus volume | 0.00457 (7.4 × 10⁻⁴) | 0.00384 (6.1 × 10⁻⁴) | 0.00439 (6.2 × 10⁻⁴) | 0.00372 (6.8 × 10⁻⁴) | a, b |
| APOE (% positive) | 37.0 | 55.0 | 45.8 | 78.0 | – |
TN correctly classified sMCI, FP sMCI subjects misclassified as converters, FN cAD subjects misclassified as stable, TP cAD subjects correctly classified, RAVLT Rey Auditory Verbal Learning Test, TMT Trail Making Test, CFT Category Fluency Test, GDS Geriatric Depression Scale, ANART American National Adult Reading Test. Multiple comparisons abbreviated as: a = TN differ from FP; b = FN differ from TP. Group mean differences at a Bonferroni-corrected alpha level of 0.004 (rounded) were considered statistically significant.
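The rounded threshold of 0.004 is consistent with dividing a family-wise alpha of 0.05 by the 12 variables listed in the table. Both the 0.05 starting level and the count of 12 are assumptions here, since this record gives only the rounded result; the helper name is illustrative.

```python
def bonferroni_alpha(family_alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return family_alpha / n_tests

# Assumed inputs: family-wise alpha of 0.05 and 12 compared variables.
alpha = bonferroni_alpha(0.05, 12)
print(f"corrected alpha = {alpha:.4f} (rounded: {round(alpha, 3)})")

def is_significant(p_value, threshold=alpha):
    """True when a p value falls below the corrected threshold."""
    return p_value < threshold
```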
Figure 3. Feature importances calculated by decrease in impurity from evaluation on the test set. All the predictors included in the model are displayed on the y-axis, while the x-axis depicts their relative importance.
The table depicts each feature’s importance in descending order calculated by permutation.
| Mean change in accuracy ± variation | Feature |
|---|---|
| 0.0403 ± 0.0503 | HC |
| 0.0245 ± 0.0413 | RAVLT-Im |
| 0.0086 ± 0.0108 | AGE |
| 0.0058 ± 0.0141 | CFT |
| 0.0000 ± 0.0129 | GENDER |
| −0.0014 ± 0.0211 | APOE |
| −0.0014 ± 0.0058 | GDS |
| −0.0029 ± 0.0503 | RAVLT-delay |
| −0.0058 ± 0.0168 | TMTA |
| −0.0101 ± 0.0147 | TMTB |
| −0.0129 ± 0.0058 | ANART |
| −0.0158 ± 0.0279 | RAVLT-recog |
The leftmost column in each row shows the average effect of random shuffling on model accuracy ± how much the accuracy varied from one reshuffling to the next. The two most important features are hippocampal volume (HC) and RAVLT immediate (RAVLT-Im), followed by age, category fluency and gender.
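The general idea behind permutation importance can be sketched in a few lines of standard-library Python. This is not the authors' pipeline: the toy classifier, the synthetic data and the function names are all assumptions made for illustration. Each feature column is shuffled in turn, and the drop in accuracy relative to the unshuffled baseline is averaged over several reshuffles.

```python
import random
from statistics import mean, stdev

def accuracy(predict, X, y):
    """Fraction of rows for which the prediction matches the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Return (mean, SD) of the accuracy drop per feature when it is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    results = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        results.append((mean(drops), stdev(drops)))
    return results

# Synthetic data: feature 0 determines the label, feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

def toy_predict(row):
    """Toy classifier that only looks at feature 0."""
    return int(row[0] > 0.5)

importances = permutation_importance(toy_predict, X, y)
for name, (avg, spread) in zip(["signal", "noise"], importances):
    print(f"{name}: {avg:.4f} ± {spread:.4f}")
```

In this toy setup the ignored feature has an importance of exactly zero. With a fitted model evaluated on a finite test set, shuffling an uninformative feature can raise accuracy by chance, which is how small negative values such as those in the table arise.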