Damiano Archetti, Silvia Ingala, Vikram Venkatraghavan, Viktor Wottschel, Alexandra L Young, Maura Bellio, Esther E Bron, Stefan Klein, Frederik Barkhof, Daniel C Alexander, Neil P Oxtoby, Giovanni B Frisoni, Alberto Redolfi.
Abstract
Understanding the sequence of biological and clinical events along the course of Alzheimer's disease provides insights into dementia pathophysiology and can help participant selection in clinical trials. Our objective is to train two data-driven computational models for sequencing these events, the Event Based Model (EBM) and the discriminative EBM (DEBM), on well-characterized research data, and then to validate the trained models on subjects from clinical cohorts characterized by less-structured data-acquisition protocols. Seven independent data cohorts were considered, totalling 2389 cognitively normal (CN) subjects, 1424 patients with mild cognitive impairment (MCI) and 743 patients with Alzheimer's disease (AD). The Alzheimer's Disease Neuroimaging Initiative (ADNI) data set was used as the training set for the construction of the disease models, while a collection of multi-centric data cohorts was used as the test set for validation. Cross-sectional information on clinical, cognitive, imaging and cerebrospinal fluid (CSF) biomarkers was used. The event sequences obtained with EBM and DEBM differed in the ordering of single biomarkers, but according to both models the first biomarkers to become abnormal were those related to CSF, followed by cognitive scores, while structural imaging showed significant volumetric decreases at later stages of disease progression. Staging of test set subjects based on the sequences obtained with both models showed good linear correlation with the Mini Mental State Examination score (R² = 0.866 for EBM; R² = 0.906 for DEBM). In discriminant analyses, significant differences (p-value ≤ 0.05) between the staging of subjects from the training and test sets were observed in both models. No significant difference between the staging of subjects from the training and test sets was observed (p-value > 0.05) when considering a subset of 562 subjects for whom all biomarker families (cognitive, imaging and CSF) were available.
The event sequence obtained with DEBM recapitulates the heuristic models in a data-driven fashion and is clinically plausible. We demonstrated inter-cohort transferability of two disease progression models and their robustness in detecting AD phases. This is an important step towards the adoption of data-driven statistical models in the clinical domain.
Keywords: Alzheimer's disease; Biomarkers progression; Event-based models; Inter-cohort validation; Patient staging
Year: 2019 PMID: 31362149 PMCID: PMC6675943 DOI: 10.1016/j.nicl.2019.101954
Source DB: PubMed Journal: Neuroimage Clin ISSN: 2213-1582 Impact factor: 4.881
Characteristics of the data sets selected.
| Set | Data set | Full name | Description | Categories |
|---|---|---|---|---|
| Training set | ADNI-1 | Alzheimer's disease neuroimaging initiative - 1 | The Alzheimer's Disease Neuroimaging Initiative ( | CN MCI AD SMC |
| | ADNI-GO | Alzheimer's disease neuroimaging initiative – grand opportunities | | MCI SMC |
| | ADNI-2 | Alzheimer's disease neuroimaging initiative - 2 | | CN MCI AD SMC |
| Test set | ADC | Amsterdam dementia cohort | The ADC includes all patients who come to the Alzheimer Center in Amsterdam (since 2004) for diagnostic work-up and consent to give all their data collected for research ( | SMC MCI AD |
| | ARWiBo | Alzheimer's disease repository without borders | ARWiBo is a cross-sectional data set including data from >2500 patients enrolled in Brescia (Italy) and nearby areas. The data set contains socio-demographic, clinical, genotype, bio-specimen information, MRI T1-weighted images ( | CN MCI AD |
| | EDSD | European DTI study on dementia | EDSD ( | CN MCI AD |
| | OASIS | Open access series of imaging studies | OASIS ( | CN MCI AD |
| | PharmaCog ( | Prediction of cognitive properties of new drug candidates for neurodegenerative diseases in early clinical development | PharmaCog is an industry-academic European project (IMI) aimed at identifying biomarkers sensitive to symptomatic and disease modifying effects of drugs for Alzheimer's disease ( | MCI |
| | ViTA | Vienna transdanube aging | ViTA is a population-based cohort study of all 75-year-old inhabitants of a geographically defined area of Vienna ( | CN MCI AD |
Abbreviations: AD, Alzheimer's disease; MCI, mild cognitive impairment; CN, cognitively normal; SMC: subjective memory complaints.
Diagnoses and biomarker availability.
| Set | Data set | CN | MCI | AD | Sub-Total | MRI | CSF | Cognitive scores |
|---|---|---|---|---|---|---|---|---|
| Training set | ADNI 1/GO/2 | 468 | 753 | 267 | 1488 | 100% | 72% | 100% |
| Test set | ADC | 125 | 80 | 129 | 334 | 100% | 83% | 99% |
| | ARWiBo | 1399 | 169 | 152 | 1720 | 100% | 3% | 59% |
| | EDSD | 179 | 138 | 151 | 468 | 100% | 19% | 97% |
| | OASIS | 177 | 122 | 42 | 341 | 100% | NA | 100% |
| | PharmaCog | 0 | 147 | 0 | 147 | 100% | 99% | 100% |
| | ViTA | 41 | 15 | 2 | 58 | 100% | NA | 100% |
| Total | | 2389 | 1424 | 743 | 4556 | 100% | 36% | 77% |
The number of cognitively normal (CN), mild cognitive impairment (MCI), Alzheimer's disease (AD) and total subjects is reported for each data set. Biomarker availability is expressed as percentage related to the total subjects in each data set. No CSF biomarker is available for OASIS and ViTA data sets.
Demographics and clinical characteristics.
| Set | Variable | CN | MCI | AD | P-value | Total |
|---|---|---|---|---|---|---|
| Training set | Age | 73.9 ± 6.7 | 72.5 ± 7.3 | 73.9 ± 7.9 | 3.22·10⁻⁴ | 73.2 ± 7.0 |
| | Years of education | 16.4 ± 2.7 | 15.9 ± 2.8 | 15.2 ± 2.9 | 1.09·10⁻⁶ | 15.9 ± 2.8 |
| | eTIV (cm³) | 1510 ± 180 | 1540 ± 160 | 1530 ± 160 | 4.20·10⁻³ | 1530 ± 160 |
| | MMSE | 29.1 ± 1.2 | 27.6 ± 1.8 | 23.2 ± 2.0 | 2.2·10⁻¹⁶ | 27.3 ± 2.6 |
| | Sex (% of females) | 52% | 42% | 48% | 1.43·10⁻³ | 46% |
| | APOE4-carrier | 34% | 49%* | 66% | 2.2·10⁻¹⁶ | 49% |
| Test set | Age | 56 ± 17 | 70.6 ± 7.7 | 73.7 ± 8.1 | 2.2·10⁻¹⁶ | 62 ± 16 |
| | Years of education | 10.8 ± 4.8 | 9.0 ± 4.5 | 8.7 ± 4.5 | 2.2·10⁻¹⁶ | 10.2 ± 4.8 |
| | eTIV (cm³) | 1450 ± 160 | 1460 ± 170 | 1470 ± 170 | 0.157 | 1460 ± 160 |
| | MMSE | 28.7 ± 1.4 | 26.5 ± 2.4 | 21.0 ± 4.7 | 2.2·10⁻¹⁶ | 26.6 ± 3.9 |
| | Sex (% of females) | 61% | 49% | 63% | 1.50·10⁻⁵ | 58% |
| | APOE4-carrier | 21% | 43% | 49% | 2.2·10⁻¹⁶ | 43% |
Data are expressed as mean values ± standard deviations. Acronyms: eTIV: estimated total intracranial volume; MMSE: Mini Mental State Examination; APOE4: apolipoprotein E ε4; CN: cognitively normal; MCI: mild cognitive impairment; AD: Alzheimer's disease. P-values were calculated via chi-square test for dichotomous variables and via ANOVA for non-dichotomous variables. Training set values denoted with * are not significantly different from the corresponding values derived from the test subjects (p-value > 0.05).
Fig. 1. Positional variance diagrams of event orderings obtained with EBM and DEBM. Both diagrams show the number of times each biomarker occurred in a specific position from a batch of 50 independent bootstrapped sequences generated using biomarkers of training subjects with EBM (left) and DEBM (right) methods.
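The counting step behind a positional variance diagram can be sketched as follows. This is only an illustration: the biomarker names and the three orderings are made up, the function name `positional_variance` is hypothetical, and the bootstrapping and model fitting that produce the orderings are not shown.

```python
def positional_variance(sequences):
    """Count how often each biomarker appears at each position.

    sequences: list of event orderings, each a list of biomarker
    names with the earliest event first.
    Returns {biomarker: [count at position 0, position 1, ...]}.
    """
    n_positions = len(sequences[0])
    counts = {name: [0] * n_positions for name in sequences[0]}
    for seq in sequences:
        for pos, name in enumerate(seq):
            counts[name][pos] += 1
    return counts


# Hypothetical bootstrap output: three orderings of three biomarkers.
boot = [
    ["CSF", "MMSE", "Hippocampus"],
    ["CSF", "Hippocampus", "MMSE"],
    ["CSF", "MMSE", "Hippocampus"],
]
pvd = positional_variance(boot)
```

Each row of `pvd` corresponds to one row of the diagram; a biomarker whose counts concentrate in a single position has a stable place in the sequence across bootstrap samples.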
Fig. 2. Subject staging based on the sequences obtained with EBM and DEBM methods. Staging of subjects from all diagnostic categories (cognitively normal (CN) in blue, mild cognitive impairment (MCI) in orange, Alzheimer's disease (AD) in red) is shown for (a) training subjects on EBM sequence, (b) training subjects on DEBM sequence, (c) test subjects on EBM sequence and (d) test subjects on DEBM sequence. Histograms are normalized for each diagnostic category. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
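This record does not spell out how a subject is assigned a stage. As a rough sketch only, the snippet below assumes the maximum-likelihood rule commonly described for event-based models: stage k is the number of events, taken in sequence order, that are most plausibly already abnormal. The function name `ebm_stage`, the biomarker names and all likelihood values are hypothetical, and DEBM may use a different staging rule.

```python
import math


def ebm_stage(p_abnormal, p_normal, sequence):
    """Return the stage k (0..N) that maximises the likelihood that the
    first k events in `sequence` have occurred and the rest have not.

    p_abnormal[b] / p_normal[b]: likelihood of the subject's value of
    biomarker b under the abnormal / normal distribution (assumed given).
    """
    best_k, best_ll = 0, -math.inf
    for k in range(len(sequence) + 1):
        ll = sum(math.log(p_abnormal[b]) for b in sequence[:k])
        ll += sum(math.log(p_normal[b]) for b in sequence[k:])
        if ll > best_ll:
            best_k, best_ll = k, ll
    return best_k


# Hypothetical subject: CSF and cognition look abnormal, imaging does not.
sequence = ["CSF", "MMSE", "Hippocampus"]
p_abn = {"CSF": 0.9, "MMSE": 0.7, "Hippocampus": 0.2}
p_nor = {"CSF": 0.1, "MMSE": 0.3, "Hippocampus": 0.8}
stage = ebm_stage(p_abn, p_nor, sequence)
```

With these made-up values the subject is placed at stage 2, after the first two events of the sequence and before the third.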
Fig. 3. Correlation between MMSE score and subject staging for (a) training set subjects on EBM sequence, (b) training set subjects on DEBM sequence, (c) test set subjects on EBM sequence, (d) test set subjects on DEBM sequence. Average and standard deviation of MMSE score of training and test subjects staged on the basis of EBM and DEBM sequences are shown. Coefficients of determination (R²) of the linear regression of MMSE score vs disease stage are reported.
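For readers who want to see how such a coefficient of determination is obtained, here is a minimal sketch of an ordinary least-squares fit of MMSE against stage. The stage and MMSE values are invented for illustration, and whether the original regression used per-stage averages or individual subjects is not stated here.

```python
def r_squared(x, y):
    """Coefficient of determination of the least-squares line y ~ a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical mean MMSE score at four consecutive disease stages.
stages = [0, 1, 2, 3]
mean_mmse = [29, 28, 24, 23]
r2 = r_squared(stages, mean_mmse)
```

A value close to 1 means that MMSE falls almost linearly as the assigned stage increases.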
Measurements of area under the curve (AUC), sensitivity (Sens), specificity (Spec), and balanced accuracy (BalAcc) at a specific threshold (kT) for subjects staged with EBM and DEBM methods on training and test data sets.
| Comparison | EBM kT | EBM Sens | EBM Spec | EBM BalAcc | EBM AUC | DEBM kT | DEBM Sens | DEBM Spec | DEBM BalAcc | DEBM AUC | p-value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Training set | | | | | | | | | | | |
| AD vs CN | 7 | 0.97 | 0.96 | 0.96 | 0.97* | 5 | 0.92 | 0.94 | 0.93 | 0.95* | 1.88·10⁻³ |
| AD vs MCI | 9 | 0.59 | 0.96 | 0.77 | 0.81 | 5 | 0.48 | 0.94 | 0.71 | 0.76 | 5.30·10⁻⁵ |
| MCI vs CN | 6 | 0.88 | 0.52 | 0.70 | 0.73* | 5 | 0.92 | 0.52 | 0.72 | 0.73* | 0.537 |
| Test set | | | | | | | | | | | |
| AD vs CN | 5 | 0.71 | 0.91 | 0.81 | 0.87 | 7 | 0.78 | 0.85 | 0.81 | 0.86 | 3.99·10⁻² |
| AD vs MCI | 12 | 0.77 | 0.71 | 0.74 | 0.78 | 11 | 0.70 | 0.75 | 0.73 | 0.77 | 0.393 |
| MCI vs CN | 1 | 0.63 | 0.62 | 0.62 | 0.63 | 1 | 0.68 | 0.60 | 0.64 | 0.64 | 0.676 |
Thresholds are chosen to maximize the balanced accuracy in each classification task. P-values of the DeLong test performed to compare the AUCs of the EBM and DEBM methods are reported in the last column. AUCs of the training set denoted with * are significantly different from the corresponding values derived from the test subjects (p-value of DeLong test ≤ 0.05).
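The threshold selection described in this note can be sketched as a simple scan over candidate stages. The snippet is illustrative only: it assumes a subject is classified as the more impaired group when its stage is at or above the threshold, ties are resolved in favour of the lowest threshold, and the stages and labels are invented. The name `best_threshold` is hypothetical.

```python
def best_threshold(stages, labels):
    """Scan candidate thresholds and keep the one with the highest
    balanced accuracy.

    labels: 1 = more impaired group (e.g. AD), 0 = less impaired (e.g. CN).
    A subject is called positive when stage >= threshold (assumption).
    Returns (threshold, balanced accuracy, sensitivity, specificity).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for k in sorted(set(stages)):
        tp = sum(1 for s, y in zip(stages, labels) if y == 1 and s >= k)
        tn = sum(1 for s, y in zip(stages, labels) if y == 0 and s < k)
        sens, spec = tp / n_pos, tn / n_neg
        bal_acc = (sens + spec) / 2
        if best is None or bal_acc > best[1]:
            best = (k, bal_acc, sens, spec)
    return best


# Hypothetical stages for four CN (label 0) and four AD (label 1) subjects.
stages = [0, 1, 2, 6, 7, 8, 9, 3]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
k_t, bal_acc, sens, spec = best_threshold(stages, labels)
```

Balanced accuracy is the mean of sensitivity and specificity, so it is not dominated by the larger diagnostic group when the groups differ in size.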
Measurements of area under curve (AUC), sensitivity (Sens), specificity (Spec) and balanced accuracy (BalAcc) at a specific threshold (kT) for the staging obtained with EBM and DEBM methods on training and test data sets not containing missing values.
| Comparison | EBM kT | EBM Sens | EBM Spec | EBM BalAcc | EBM AUC | DEBM kT | DEBM Sens | DEBM Spec | DEBM BalAcc | DEBM AUC | p-value |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Training set | | | | | | | | | | | |
| AD vs CN | 8 | 0.98 | 0.95 | 0.97 | 0.97 | 3 | 0.86 | 0.99 | 0.92 | 0.95 | 3.10·10⁻² |
| AD vs MCI | 8 | 0.70 | 0.95 | 0.83 | 0.83 | 7 | 0.66 | 0.76 | 0.71 | 0.76 | 0.104 |
| MCI vs CN | 5 | 0.89 | 0.51 | 0.70 | 0.72 | 3 | 0.86 | 0.58 | 0.72 | 0.73 | 1.99·10⁻⁸ |
| Test set | | | | | | | | | | | |
| AD vs CN | 4 | 0.88 | 0.94 | 0.91 | 0.95 | 3 | 0.91 | 0.91 | 0.91 | 0.94 | 0.332 |
| AD vs MCI | 4 | 0.57 | 0.94 | 0.76 | 0.80 | 5 | 0.63 | 0.87 | 0.75 | 0.79 | 1.65·10⁻² |
| MCI vs CN | 4 | 0.88 | 0.43 | 0.66 | 0.66 | 3 | 0.91 | 0.52 | 0.71 | 0.70 | 0.296 |
P-values of the DeLong test performed to compare the AUCs of the EBM and DEBM methods are reported in the last column. For both DEBM and EBM, the AUCs of the training set were not significantly different from the corresponding AUCs in the test set (p-values of DeLong test always > 0.05).
Fig. 4. Positional variance diagrams of event sequences computed from training set (left) and test set (right) using EBM (a) and DEBM (b) algorithms. In the case of DEBM, green lines divide the sequences into homogeneous blocks between the training and test sets. Orange boxes represent biomarker exceptions not conserved in the same block when comparing the training vs. test positional variance diagrams. Clear event blocks cannot be identified for EBM sequences. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)