P J Moore, T J Lyons, J Gallacher.
Abstract
Time-dependent data collected in studies of Alzheimer's disease usually has missing and irregularly sampled data points. For this reason time series methods which assume regular sampling cannot be applied directly to the data without a pre-processing step. In this paper we use a random forest to learn the relationship between pairs of data points at different time separations. The input vector is a summary of the time series history and it includes both demographic and non-time varying variables such as genetic data. To test the method we use data from the TADPOLE grand challenge, an initiative which aims to predict the evolution of subjects at risk of Alzheimer's disease using demographic, physical and cognitive input data. The task is to predict diagnosis, ADAS-13 score and normalised ventricles volume. While the competition proceeds, forecasting methods may be compared using a leaderboard dataset selected from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and with standard metrics for measuring accuracy. For diagnosis, we find an mAUC of 0.82, and a classification accuracy of 0.73 compared with a benchmark SVM predictor which gives mAUC = 0.62 and BCA = 0.52. The results show that the method is effective and comparable with other methods.
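The pairing idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the visit layout (a list of `(month, value)` tuples with `None` for a missing measurement), the function name `make_pairs`, and the choice of input features are all hypothetical.

```python
from itertools import combinations


def make_pairs(visits, static):
    """Build (features, target) training pairs from one subject's visits.

    visits: list of (month, value) tuples, irregularly spaced, with value
            set to None where a measurement is missing (assumed layout).
    static: non-time-varying features such as [age, apoe4] (assumed).
    """
    # Drop missing measurements and order the remaining visits by time.
    observed = sorted((m, v) for m, v in visits if v is not None)
    pairs = []
    # Every earlier/later combination of visits yields one training example,
    # so no regular sampling grid is needed.
    for (m0, v0), (m1, v1) in combinations(observed, 2):
        features = [v0, m1 - m0] + list(static)  # value, time separation, statics
        pairs.append((features, v1))
    return pairs


# Hypothetical subject: four visits, one of them missing.
pairs = make_pairs([(0, 20.0), (6, None), (12, 23.0), (30, 29.0)], [75.1, 1])
```

Pairs built this way could then be passed to any tree ensemble, for example scikit-learn's `RandomForestRegressor`, with the time separation acting as an ordinary input feature.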
Year: 2019 PMID: 30763336 PMCID: PMC6375557 DOI: 10.1371/journal.pone.0211558
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Histograms of time series lengths.
Upper: training set LB1 whose time series may cover ADNI, ADNI-GO and ADNI-2. Lower: Set LB2 which is formed only from time series from ADNI-1.
Demographic characteristics and ApoE4 status for participants from the set LB2 compared with a matched evaluation set selected from LB1.
The rows are the sample size n, age as minimum, mean and maximum, gender and ApoE4 status.
| | LB2 | Evaluation set (from LB1) |
|---|---|---|
| n | 110 | 92 |
| Age: min (mean) max | 59.9 (75.1) 87.9 | 57.8 (75.3) 84.8 |
| Gender | 60.9% | 60.9% |
| | 39.1% | 39.1% |
| ApoE4 status | 70.0% | 68.5% |
| | 27.3% | 29.4% |
| | 2.7% | 2.2% |
Fig 2. Partition of a 2D input space using two features, X1 and X2.
Feature X2 is first used to bisect the square, then each rectangle is bisected using feature X1. The different marker styles denote unique classes.
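The two-level partition in Fig 2 corresponds to a depth-2 decision tree. The sketch below is illustrative only: the function name `region`, the unit-square input range and the 0.5 thresholds are assumptions, since the caption does not give the split points.

```python
def region(x1, x2, t2=0.5, t1_lower=0.5, t1_upper=0.5):
    """Return the index (0-3) of the rectangle that contains (x1, x2).

    t2 is the threshold on X2 used for the first split; t1_lower and
    t1_upper are the X1 thresholds used inside the lower and upper
    rectangles. All three values are hypothetical.
    """
    if x2 < t2:
        # Lower rectangle, then bisect on X1.
        return 0 if x1 < t1_lower else 1
    # Upper rectangle, then bisect on X1.
    return 2 if x1 < t1_upper else 3


# Each of the four rectangles would carry the majority class of the
# training points that fall inside it.
corners = [region(0.2, 0.2), region(0.8, 0.2), region(0.2, 0.8), region(0.8, 0.8)]
```

A random forest averages many such trees, each grown on a resampled training set with a random subset of features considered at every split.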
The set of variables from which features are selected for prediction.
The variable MMSE is binary and found by thresholding the raw value at 26. The output VENTS-ICV is the ratio of VENTS and ICV which are predicted using separate models. VENTS is predicted using VENTRICLES, ΔVENTRICLES and TIME_DELAY. ICV is predicted using ICV, ΔICV and TIME_DELAY.
| Variable | Meaning | Diag | ADAS-13 | VENTS-ICV |
|---|---|---|---|---|
| | Diagnosis (NL or MCI or AD) | ✓ | ✓ | |
| | Age | ✓ | ✓ | |
| | Gender | ✓ | ✓ | |
| | ApoE4 status | ✓ | ✓ | |
| MMSE | Mini-mental state examination | ✓ | ✓ | |
| | CDRSB | ✓ | ✓ | |
| | Functional activities questionnaire | ✓ | ✓ | |
| | Participant identifier | ✓ | ✓ | |
| | Middle temporal gyrus | ✓ | ✓ | |
| TIME_DELAY | Number of months delay | ✓ | ✓ | ✓ |
| | Alzheimer’s Disease Assessment Scale | | ✓ | |
| ICV | Intracranial volume | | | ✓ |
| VENTRICLES | Ventricles volume | | | ✓ |
| Δ | MMSE slope | ✓ | ✓ | |
| Δ | Alzheimer’s Disease Assessment Scale slope | | ✓ | |
| Δ | Hippocampus volume slope | ✓ | ✓ | |
| ΔVENTRICLES | Ventricles volume slope | ✓ | ✓ | ✓ |
| ΔICV | Intracranial volume slope | | | ✓ |
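The derived quantities mentioned in the table caption can be sketched as below. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the direction of the MMSE threshold (1 when the raw score is at least 26) is an assumption, and the least-squares slope is only one possible definition of the Δ variables, which the caption does not define.

```python
def binarise_mmse(raw, threshold=26):
    """Threshold the raw MMSE score at 26.

    Assumption: scores at or above the threshold map to 1, others to 0.
    """
    return 1 if raw >= threshold else 0


def slope(months, values):
    """Least-squares slope of values against time in months.

    One possible reading of a 'slope' feature; works for irregular
    sampling. Returns 0.0 when fewer than two distinct times exist.
    """
    n = len(months)
    if n < 2:
        return 0.0
    t_mean = sum(months) / n
    v_mean = sum(values) / n
    denom = sum((t - t_mean) ** 2 for t in months)
    if denom == 0:
        return 0.0
    numer = sum((t - t_mean) * (v - v_mean) for t, v in zip(months, values))
    return numer / denom


def vents_icv(vents_pred, icv_pred):
    """Combine the two separately predicted volumes into the VENTS-ICV ratio."""
    return vents_pred / icv_pred


# Hypothetical subject history.
mmse_flag = binarise_mmse(28)
mmse_slope = slope([0, 6, 12], [30, 29, 28])
ratio = vents_icv(30000.0, 1500000.0)
```

Predicting VENTS and ICV with separate models and dividing afterwards keeps each model's target on its natural scale.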
Fig 3. Variable importance for diagnosis, ADAS-13, ventricles and intracranial volume prediction.
Confusion matrix for predicting 417 diagnosis points of the 110 participants in test set LB4, where the forecast horizon is up to 7 years.
The labels are as follows, NL: healthy, MCI: mild cognitive impairment, AD: Alzheimer’s disease. The overall accuracy is 0.72.
| | NL | MCI | AD | Total | Accuracy |
|---|---|---|---|---|---|
| NL | 193 | 2 | 0 | 195 | 0.99 |
| MCI | 54 | 88 | 8 | 150 | 0.59 |
| AD | 6 | 45 | 21 | 72 | 0.29 |
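The overall and per-row accuracies reported with the confusion matrix can be checked directly from its counts. The short sketch below uses only the numbers in the table; the variable names are illustrative.

```python
labels = ["NL", "MCI", "AD"]

# Counts copied from the confusion matrix, one list per row.
matrix = {
    "NL": [193, 2, 0],
    "MCI": [54, 88, 8],
    "AD": [6, 45, 21],
}

total = sum(sum(row) for row in matrix.values())

# Diagonal entries are the correctly classified points.
correct = sum(matrix[label][i] for i, label in enumerate(labels))
overall = correct / total

# Per-row accuracy: diagonal entry divided by the row total.
row_accuracy = {
    label: matrix[label][i] / sum(matrix[label])
    for i, label in enumerate(labels)
}

print(total, round(overall, 2), {k: round(v, 2) for k, v in row_accuracy.items()})
```

The diagonal sums to 302 of 417 points, which gives the overall accuracy of 0.72 quoted in the caption.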
Test set results for the random forest, mixed effect and SVM estimators.
The values are the mean over ten splits of the test data with the standard deviation in brackets.
| Method | Diagnosis mAUC | Diagnosis BCA | ADAS-13 MAE | VENTS-ICV MAE |
|---|---|---|---|---|
| Random forest | 0.80 (0.06) | 0.74 (0.05) | 5.24 (1.39) | 0.0026 (0.00130) |
| Mixed effects | 0.77 (0.07) | 0.68 (0.07) | 5.87 (1.26) | 0.0034 (0.00098) |
| SVM | 0.62 (0.07) | 0.51 (0.04) | 8.13 (1.02) | 0.0098 (0.00078) |
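The "mean (standard deviation)" entries above can be produced with a few lines of standard-library Python. The sketch below is an illustration only: the helper names and all input values are hypothetical, and the use of the sample standard deviation (`statistics.stdev`) rather than the population form is an assumption, since the caption does not say which was used.

```python
from statistics import mean, stdev


def mae(true_values, predictions):
    """Mean absolute error between observed and predicted values."""
    return mean(abs(t - p) for t, p in zip(true_values, predictions))


def summarise(scores, digits=2):
    """Format per-split scores as 'mean (sd)', as in the table above."""
    return f"{mean(scores):.{digits}f} ({stdev(scores):.{digits}f})"


# Hypothetical ADAS-13 values for one split, not taken from the paper.
split_error = mae([10, 20], [12, 17])

# Hypothetical per-split errors, again purely illustrative.
summary = summarise([1.0, 2.0, 3.0])
```

With ten splits, `summarise` would be called once per method and metric on the list of ten per-split scores.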
Competition leaderboard table at 4 May 2018 where each row represents an entry from a competition team listed in rank order.
The first highlighted row shows the results for our random forest estimator, the middle highlighted entry shows the results for a linear mixed effects model and the bottom highlighted entry for a support vector machine model. There are three target outcomes for prediction: 1) Diagnosis, 2) the ADAS-13 score, and 3) VENTS-ICV which is the ventricles volume divided by intracranial volume. The overall rank is determined by the lowest sum of ranks from mAUC, ADAS-13 MAE and VENTS-ICV MAE.
| Entry | Diagnosis mAUC | Diagnosis BCA | ADAS-13 MAE | ADAS-13 WES | ADAS-13 CPA | VENTS-ICV MAE | VENTS-ICV WES | VENTS-ICV CPA |
|---|---|---|---|---|---|---|---|---|
| | 0.91 | 0.83 | 3.62 | 3.62 | 0.11 | 0.0020 | 0.0018 | 0.13 |
| | 0.93 | 0.85 | 3.72 | 3.10 | 0.02 | 0.0020 | 0.0016 | 0.15 |
| | 0.93 | 0.85 | 3.72 | 3.10 | 0.02 | 0.0020 | 0.0016 | 0.15 |
| | 0.91 | 0.83 | 3.67 | 3.67 | 0.12 | 0.0024 | 0.0022 | 0.08 |
| | 0.91 | 0.74 | 3.73 | 3.70 | 0.01 | 0.0028 | 0.0023 | 0.32 |
| | 0.89 | 0.78 | 4.16 | 4.16 | 0.39 | 0.0023 | 0.0023 | 0.47 |
| | 0.89 | 0.82 | 3.76 | 3.76 | 0.12 | 0.0034 | 0.0029 | 0.15 |
| | 0.89 | 0.82 | 3.80 | 3.80 | 0.11 | 0.0034 | 0.0029 | 0.14 |
| | 0.87 | 0.78 | 4.12 | 4.08 | 0.03 | 0.0027 | 0.0027 | 0.01 |
| | 0.87 | 0.69 | 4.41 | 4.41 | 0.30 | 0.0026 | 0.0026 | 0.46 |
| | 0.84 | 0.74 | 4.54 | 4.17 | 0.49 | 0.0025 | 0.0021 | 0.49 |
| | 0.89 | 0.81 | 3.81 | 3.81 | 0.11 | 0.0057 | 0.0041 | 0.01 |
| | 0.88 | 0.80 | 3.87 | 3.87 | 0.11 | 0.0049 | 0.0038 | 0.05 |
| | 0.91 | 0.74 | 3.73 | 3.70 | 0.01 | 0.0092 | 0.0092 | 0.01 |
| | 0.80 | 0.74 | 4.51 | 4.49 | 0.40 | 0.0027 | 0.0027 | 0.25 |
| Random forest | 0.82 | 0.73 | 5.19 | 4.57 | 0.07 | 0.0023 | 0.0019 | 0.11 |
| | 0.76 | 0.67 | 4.34 | 4.30 | 0.08 | 0.0022 | 0.0021 | 0.08 |
| | 0.88 | 0.80 | 5.00 | 4.78 | 0.03 | 0.0030 | 0.0030 | 0.05 |
| | 0.88 | 0.80 | 3.92 | 3.92 | 0.10 | 0.0060 | 0.0043 | 0.01 |
| | 0.86 | 0.70 | 4.56 | 3.69 | 0.14 | 0.0034 | 0.0032 | 0.43 |
| | 0.81 | 0.73 | 5.13 | 5.14 | 0.01 | 0.0027 | 0.0028 | 0.20 |
| | 0.81 | 0.73 | 4.09 | 4.09 | 0.09 | 0.0045 | 0.0038 | 0.01 |
| | 0.80 | 0.74 | 4.51 | 4.49 | 0.40 | 0.0038 | 0.0038 | 0.42 |
| | 0.80 | 0.68 | 4.14 | 4.14 | 0.29 | 0.0040 | 0.0040 | 0.38 |
| | 0.80 | 0.66 | 4.81 | 4.81 | 0.21 | 0.0038 | 0.0038 | 0.10 |
| | 0.80 | 0.74 | 4.60 | 4.60 | 0.35 | 0.0041 | 0.0041 | 0.12 |
| | 0.88 | 0.69 | 4.98 | 4.98 | 0.34 | 0.0066 | 0.0066 | 0.27 |
| | 0.78 | 0.71 | 4.60 | 4.60 | 0.35 | 0.0041 | 0.0041 | 0.12 |
| | 0.79 | 0.69 | 6.68 | 5.54 | 0.05 | 0.0028 | 0.0023 | 0.32 |
| | 0.81 | 0.72 | 4.70 | 4.70 | 0.09 | 0.0070 | 0.0070 | 0.03 |
| | 0.77 | 0.65 | 4.83 | 4.83 | 0.20 | 0.0038 | 0.0038 | 0.07 |
| | 0.87 | 0.70 | 4.91 | 4.79 | 0.36 | 0.0073 | 0.0073 | 0.46 |
| Mixed effects | 0.77 | 0.68 | 5.85 | 5.85 | 0.38 | 0.0032 | 0.0032 | 0.34 |
| | 0.71 | 0.63 | 6.37 | 6.71 | 0.39 | 0.0026 | 0.0026 | 0.32 |
| | 0.71 | 0.63 | 6.37 | 6.74 | 0.25 | 0.0026 | 0.0026 | 0.27 |
| | 0.79 | 0.66 | 4.69 | 4.69 | 0.09 | 0.0093 | 0.0093 | 0.01 |
| | 0.76 | 0.69 | 5.00 | 4.98 | 0.35 | 0.0042 | 0.0042 | 0.38 |
| | 0.72 | 0.62 | 5.70 | 5.70 | 0.41 | 0.0036 | 0.0036 | 0.43 |
| | 0.73 | 0.59 | 9.63 | 9.63 | 0.45 | 0.0029 | 0.0029 | 0.48 |
| | 0.80 | 0.68 | 6.00 | 6.00 | 0.11 | 0.0075 | 0.0075 | 0.17 |
| | 0.71 | 0.58 | 9.70 | 9.70 | 0.40 | 0.0029 | 0.0029 | 0.26 |
| | 0.74 | 0.68 | 5.70 | 4.60 | 0.21 | 0.0070 | 0.0042 | 0.35 |
| | 0.74 | 0.68 | 5.70 | 4.60 | 0.21 | 0.0070 | 0.0042 | 0.35 |
| | 0.77 | 0.65 | 6.73 | 6.73 | 0.13 | 0.0094 | 0.0094 | 0.02 |
| | 0.78 | 0.68 | 7.39 | 7.39 | 0.12 | 0.0095 | 0.0095 | 0.04 |
| | 0.78 | 0.66 | 8.43 | 5.09 | 0.48 | 0.0096 | 0.0095 | 0.50 |
| … | | | | | | | | |
| SVM | 0.62 | 0.52 | 8.11 | 8.11 | 0.50 | 0.0098 | 0.0098 | 0.50 |
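The rank-sum rule stated in the caption can be sketched as follows, using only the three named rows of the leaderboard. The ranks computed here are therefore relative to these three entries and are not the actual leaderboard positions. The helper name `rank` and the tie handling (ties broken by input order) are assumptions, since the caption does not describe how ties were treated.

```python
# (mAUC, ADAS-13 MAE, VENTS-ICV MAE) copied from the three named rows.
entries = {
    "Random forest": (0.82, 5.19, 0.0023),
    "Mixed effects": (0.77, 5.85, 0.0032),
    "SVM": (0.62, 8.11, 0.0098),
}


def rank(scores, higher_is_better):
    """Map each entry name to its 1-based rank for a single metric.

    Assumption: ties keep their input order, because sorted() is stable.
    """
    ordered = sorted(scores, key=scores.get, reverse=higher_is_better)
    return {name: position + 1 for position, name in enumerate(ordered)}


mauc_rank = rank({n: e[0] for n, e in entries.items()}, higher_is_better=True)
adas_rank = rank({n: e[1] for n, e in entries.items()}, higher_is_better=False)
vents_rank = rank({n: e[2] for n, e in entries.items()}, higher_is_better=False)

# Overall order: lowest sum of the three ranks first.
rank_sum = {n: mauc_rank[n] + adas_rank[n] + vents_rank[n] for n in entries}
overall_order = sorted(rank_sum, key=rank_sum.get)
```

Because mAUC is better when higher and the two MAE values are better when lower, the sort direction is set per metric before the ranks are summed.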