A V Lebedev1, E Westman2, G J P Van Westen3, M G Kramberger4, A Lundervold5, D Aarsland6, H Soininen7, I Kłoszewska8, P Mecocci9, M Tsolaki10, B Vellas11, S Lovestone12, A Simmons12.
Abstract
Computer-aided diagnosis of Alzheimer's disease (AD) is a rapidly developing field of neuroimaging with strong potential to be used in practice. In this context, assessment of models' robustness to noise and imaging protocol differences, together with post-processing and tuning strategies, are key tasks to be addressed in order to move towards successful clinical applications. In this study, we investigated the efficacy of Random Forest classifiers trained using different structural MRI measures, with and without neuroanatomical constraints, in the detection and prediction of AD in terms of accuracy and between-cohort robustness. From the ADNI database, 185 AD patients and 225 healthy controls (HC) were randomly split into training and testing datasets. 165 subjects with mild cognitive impairment (MCI) were distributed according to the month of conversion to dementia (4-year follow-up). Structural 1.5-T MRI scans were processed using Freesurfer segmentation and cortical reconstruction. Using the resulting output, AD/HC classifiers were trained. Training included model tuning and performance assessment using out-of-bag estimation. Subsequently, the classifiers were validated on the AD/HC test set and evaluated for their ability to predict MCI-to-AD conversion. Models' between-cohort robustness was additionally assessed using the AddNeuroMed dataset acquired with harmonized clinical and imaging protocols. In the ADNI set, the best AD/HC sensitivity/specificity (88.6%/92.0% on the test set) was achieved by combining cortical thickness and volumetric measures. The Random Forest model achieved significantly higher accuracy than the reference classifier (linear Support Vector Machine). The models trained using parcelled and high-dimensional (HD) input demonstrated equivalent performance, but the former was more efficient in terms of computation, memory, and time costs. The sensitivity/specificity for detecting MCI-to-AD conversion (but not AD/HC classification performance) was further improved from 79.5%/75% to 83.3%/81.3% by combining morphometric measurements with ApoE genotype and demographics (age, sex, education). When applied to the independent AddNeuroMed cohort, the best ADNI models produced equivalent performance without a substantial drop in accuracy, suggesting good robustness sufficient for future clinical implementation.
Keywords: ADNI; AddNeuroMed; Alzheimer's disease; Computer-aided diagnosis; Mild cognitive impairment; Multi-center study; Random Forest; Structural MRI
Year: 2014 PMID: 25379423 PMCID: PMC4215532 DOI: 10.1016/j.nicl.2014.08.023
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Fig. A1. Workflow diagram. The diagram illustrates the main steps of image post-processing and analysis. The workflow starts and proceeds in two directions, aimed at extracting parcelled and high-dimensional measurements using Freesurfer software (blue box). After this step was completed, the extracted measures underwent outlier detection, and the resulting output was used in further Random Forest classification runs (in the R programming language; gray box). We additionally tuned our models using recursive feature elimination and mtry-parameter adjustment (mtry defines the number of predictors randomly sampled at each node of the classifier). Finally, feature importance vectors from the best models were either mapped into the brain space (for the high-dimensional data) or plotted (for the parcelled input).
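To make the role of mtry concrete, the following is a minimal Python sketch (not the authors' R code) of how a Random Forest draws the candidate predictors considered at a single tree node. The function names are hypothetical. The values 24 and 4 are taken from the "Thickness + volumes (parc)" row of the model tuning table. The `floor(sqrt(p))` rule is the usual default for classification forests; the source does not say which starting value was used.

```python
import math
import random


def sample_split_candidates(n_features, mtry, rng):
    """Return the indices of the `mtry` predictors considered at one tree node.

    Hypothetical helper for illustration only.
    """
    if not 1 <= mtry <= n_features:
        raise ValueError("mtry must be between 1 and n_features")
    # Sampling without replacement: each predictor appears at most once.
    return sorted(rng.sample(range(n_features), mtry))


def default_mtry_classification(n_features):
    """Common default for classification forests: floor(sqrt(p))."""
    return max(1, math.isqrt(n_features))


rng = random.Random(0)  # fixed seed so the illustration is reproducible
candidates = sample_split_candidates(n_features=24, mtry=4, rng=rng)
print(candidates)
print(default_mtry_classification(24))
```

A new random subset is drawn at every node, so smaller mtry values decorrelate the trees, while larger values let each split consider more predictors.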
Fig. A2. Classifier tuning diagram. The diagram describes the main steps performed during tuning of the Random Forest models. This framework was employed for all modalities: cortical thickness, sulcal depth, Jacobian maps, non-cortical volumes, and combined parcelled measurements of cortical thickness and non-cortical volumes. (1) Measurements with near-zero variance were removed from the feature sets, and the resulting output underwent stepwise recursive feature elimination (RFE); (2) 10,000 trees were used to “grow” the first forest (using the full feature set); (3) RFE was performed based on the feature importance vector derived from the first forest, by removing the lowest-ranked 5% of the features at each step (gradually reducing the dimensionality: 100%, 95%, and so on, down to 50%) and comparing the accuracy of the resulting subsets with 5-fold CV; (4) after selection of the optimal feature subset, the mtry parameter was adjusted using 1000 trees (search range ∈ , step = ); (5) the forests were retrained with the optimal parameters using 10,000 trees.
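The elimination schedule in step (3) can be sketched as follows. This is an illustrative Python sketch, not the authors' R implementation. It assumes that each step removes 5% of the original feature count (100%, 95%, and so on, down to 50%) and that features are ranked once by the importance vector of the first forest. The function names and the toy importance values are hypothetical.

```python
def rfe_subset_sizes(n_features, start_pct=100, stop_pct=50, step_pct=5):
    """Feature-subset sizes visited by the elimination schedule.

    Assumption: percentages refer to the original feature count.
    Integer arithmetic avoids floating-point rounding surprises.
    """
    return [n_features * pct // 100
            for pct in range(start_pct, stop_pct - 1, -step_pct)]


def keep_top_features(importance, n_keep):
    """Return the names of the `n_keep` highest-importance features."""
    ranked = sorted(importance, key=importance.get, reverse=True)
    return ranked[:n_keep]


# Toy example: 1000 features give 11 candidate subset sizes.
sizes = rfe_subset_sizes(1000)
print(sizes)

# Hypothetical importance scores for three features.
toy_importance = {"a": 0.5, "b": 0.1, "c": 0.3}
print(keep_top_features(toy_importance, 2))
```

Each candidate subset would then be scored (with 5-fold cross-validation in the source) and the best-performing size retained before mtry is tuned.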
Subject demographics: ADNI cohort.
| | AD | HC | MCI | AD/HC comparison: test (p-value) |
|---|---|---|---|---|
| N | 185 | 225 | 165 | − |
| Age | 75.2 [±7.48] | 75.95 [±5.02] | 75.46 [±7.37] | |
| M/F ratio | 1.01 (93/92) | 1.05 (115/110) | 1.66 (103/62) | χ2 = 0.01 (0.93) |
| Education | 14.6 [±3.24] | 16.0 [±2.85] | 15.65 [±2.97] | |
| MMSE | 23.3 [±1.99] | 29.1 [±0.98] | 27.04 [±1.78] | |
149 of the 165 MCI subjects developed Alzheimer's disease at some point during the 4-year follow-up period (MCI-converters).
AddNeuroMed cohort demographics.
| | AD | HC | MCI |
|---|---|---|---|
| N | 107 | 100 | 114 |
| Age | 75.7 [±5.63] | 73.2 [±6.87] | 74.4 [±5.79] |
| M/F ratio | 0.65 (42/65) | 1.08 (52/48) | 1.11 (60/54) |
| Education | 7.6 [±3.78] | 8.59 [±4.17] | 10.5 [±4.82] |
| MMSE | 20.8 [±4.74] | 29.1 [±1.26] | 27.1 [±1.68] |
21 of the 114 MCI subjects developed Alzheimer's disease at some point during the 1-year follow-up (MCI-converters).
Model tuning.
| Measurement | NZV | Feature subset after RFE | Opt mtry | CPU cores used | RAM usage (RFE/mtry) | Total tuning time |
|---|---|---|---|---|---|---|
| Thickness (HD) | 14,880 | 156,402 | 356 | 6 | 51.3/44 GB | 80 h 50 min |
| Sulcal depth (HD) | 14,851 | 218,983 | 1170 | 6 | 51/46 GB | 84 h 22 min |
| Jacobian (HD) | 0 | 262,147 | 1024 | 6 | 58/45 GB | 89 h 27 min |
| Volumes (ROI) | 0 | 6 | 2 | 10 | 1 GB | 10 min |
| Thickness + volumes (parc) | 0 | 24 | 4 | 10 | 3.8 GB | 3 h 35 min |
NZV — number of features with Near-Zero Variance removed at the first step; RFE — Recursive Feature Elimination; opt mtry — optimal mtry-parameter (see Methods for details).
For these models, an exhaustive search for optimal mtry parameter and feature subset has been performed.
AD/HC performance of the final models: ADNI cohort.
| Models | Feature selection | OOB error [Train] | Sensitivity/specificity (OA) [Test] | AUC (95% C.I.) |
|---|---|---|---|---|
| Cortical thickness | RFE | 14.0% | 88.6%/90.7% (89.62%) | 0.93 (0.9–0.96) |
| Sulcal depth | RFE | 21.3% | 80.0%/74.7% (77.3%) | 0.84 (0.8–0.89) |
| Jacobian | RFE | 21.7% | 77.1%/81.3% (79.2%) | 0.84 (0.79–0.88) |
| Volumes | RFE | 15.0% | 82.9%/86.7% (84.7%) | 0.91 (0.88–0.95) |
| Thickness + volumes (ROI) | RFE | 11.7% | 88.6%/92.0% (90.3%) | 0.94 (0.91–0.96) |
| Thickness | no RFE | 14.7% | 88.6%/89.3% (89.0%) | 0.92 (0.89–0.95) |
| Sulcal depth | no RFE | 14.0% | 80.0%/73.3% (76.67%) | 0.83 (0.79–0.88) |
| Jacobian | no RFE | 21.0% | 80.0%/80.0% (80.0%) | 0.84 (0.79–0.88) |
| Volumes | no RFE | 16.7% | 80.0%/86.7% (83.3%) | 0.91 (0.88–0.94) |
| Thickness + volumes (ROI) | no RFE | 12% | 85.7%/89.3% (87.5%) | 0.93 (0.91–0.96) |
AD/HC — Alzheimer's disease/healthy controls; OA — overall accuracy; OOB — out-of-bag estimate; AUC (95% C.I.) — area under the ROC curve with 95% confidence interval.
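The table does not define how OA is computed beyond "overall accuracy". As an observation rather than something the source states, the listed OA values appear to agree, within rounding, with the unweighted mean of sensitivity and specificity. The short Python sketch below checks this for three rows copied from the table. The function name and the 0.15-point tolerance are assumptions made for illustration.

```python
def mean_sens_spec(sensitivity, specificity):
    """Unweighted mean of sensitivity and specificity, in percent."""
    return (sensitivity + specificity) / 2.0


# (label, sensitivity %, specificity %, reported OA %) copied from the table.
reported = [
    ("Thickness + volumes (ROI), RFE", 88.6, 92.0, 90.3),
    ("Sulcal depth, RFE", 80.0, 74.7, 77.3),
    ("Jacobian, no RFE", 80.0, 80.0, 80.0),
]

TOLERANCE = 0.15  # percentage points; allows for rounding in the table

differences = []
for label, sens, spec, oa in reported:
    diff = abs(mean_sens_spec(sens, spec) - oa)
    differences.append(diff)
    print(f"{label}: mean = {mean_sens_spec(sens, spec):.2f}, reported OA = {oa}")
```

If OA were instead the proportion of all test subjects classified correctly, it would depend on the AD/HC split of the test set, which is not given here.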
Fig. 1. ROC curves: morphometric modalities (AD/HC). The figure illustrates ROC curves of the models trained with different morphometric inputs. Three inputs demonstrate competitive performance: high-dimensional (HD) cortical thickness, volumetric data, and combined parcellated measurements. AD/HC: Alzheimer's disease/healthy controls.
Fig. A5. ROC curves: best models (AD/HC). The figure illustrates ROC curves of the most accurate models trained to differentiate between AD and HC. The model trained using combined parcelled (DK atlas) cortical thickness measures and subcortical volumetric data produced accuracy equivalent to that of the higher-order ensemble model (where all classifiers were combined by a majority vote). AD/HC: Alzheimer's disease/healthy controls.
Fig. 2. Effect of cortical parcellation using DK and D atlases on AD/HC performance. ROC curves differed significantly from the one produced by the model trained using non-parcellated high-dimensional measurements (p-values: 0.002 and 0.009 for Desikan-Killiany (DK) and Destrieux (D), respectively); differences between DK and D were not significant. AD/HC: Alzheimer's disease/healthy controls.
Ability of the AD/HC models to predict MCI-to-AD conversion (detection accuracy for MCI-to-AD converters by month of conversion): morphometric data, ADNI cohort.
| Measurements | 6 m (n = 14) | 12 m (n = 44) | 18 m (n = 30) | 24 m (n = 35) | 36 m+ (n = 26) | NC (n = 16) |
|---|---|---|---|---|---|---|
| Cortical thickness | 78.6% | 79.5% | 70.0% | 62.9% | 65.3% | 75.0% |
| Sulcal depth | 71.4% | 77.3% | 53.3% | 51.4% | 53.8% | 62.5% |
| Jacobian | 60.2% | 70.5% | 76.7% | 71.4% | 53.8% | 56.3% |
| Volumes | 78.6% | 72.7% | 70.0% | 68.6% | 53.8% | 75.0% |
| Combined | 78.6% | 79.5% | 76.7% | 71.4% | 61.5% | 75.0% |
6, 12, 18, 24, 36 m+ — month of MCI-to-AD conversion; NC — non-converters (detected as HC by the classifiers); AD/HC — Alzheimer's disease/healthy controls.
Here the same models trained to classify images from AD and HC were used.
MCI-c/MCI-nc performance of the combined model (morphometry + ApoE + demographics).
| Model | OOB error | AD: Sens/spec (test) | MCI: 6 m (n = 14) | MCI: 12 m (n = 44) | MCI: 18 m (n = 30) | MCI: 24 m (n = 35) | MCI: 36 m+ (n = 26) | MCI: NC (n = 16) |
|---|---|---|---|---|---|---|---|---|
| Th + vol + ApoE+Age+Educ | 11.3% | 90.7%/82.9% (86.7%) | 78.6% | 75.0% | 83.3% | 80.0% | 50.0% | 81.3% |
Th — cortical thickness; Vol — non-cortical volumes; Educ — education; MCI — mild cognitive impairment; ApoE — ApoE genotype; MCI-c/MCI-nc — MCI converters/MCI non-converters.
Classifiers’ performance in the same (ADNI) and separate (AddNeuroMed) cohorts.
| Models | AD: Sens/Spec (OA), same cohort (ADNI) | AD: Sens/Spec (OA), separate cohort (AddNeuroMed) | MCI-converter 1yr sensitivity*, same cohort (ADNI) | MCI-converter 1yr sensitivity*, separate cohort (AddNeuroMed) |
|---|---|---|---|---|
| Thickness | 88.6%/90.7% (89.62%) | 87%/78% (82.5%) | 79.0% | 76.2% |
| Sulcal Depth | 80.0%/74.7% (77.3%) | Failed | 74.4% | Failed |
| Jacobian | 77.1%/81.3% (79.2%) | 78.5%/72% (75.25%) | 65.4% | 57.1% |
| Volumes | 82.9%/86.7% (84.7%) | 70.1%/89% (79.5%) | 75.7% | 57.1% |
| Thickness + volumes (parc) | 88.6%/92.0% (90.3%) | 83.2%/89% (86.1%) | 79.0% | 71.4% |
| Morphometry + ApoE + demographics | 90.7%/82.9% (86.7%) | 84.2%/88.3% (86.25%) | 78.0% | 79% |
The classifiers were trained on the subset from the ADNI dataset and then validated on testing sets from both ADNI (same) and AddNeuroMed (separate) cohorts.
Sens/Spec (OA): Sensitivity/Specificity (Overall Accuracy).
*: for the AddNeuroMed cohort, the MCI-to-AD converter subgroup (n = 21) was defined based on 1-year follow-up.
NB: We did not compare accuracy in detecting MCI non-converters because only 1-year follow-up was available for the AddNeuroMed cohort.