Angela Lombardi, Nicola Amoroso, Domenico Diacono, Alfonso Monaco, Sabina Tangaro, Roberto Bellotti.
Abstract
Characterizing both neurodevelopmental and aging brain structural trajectories is important for understanding normal biological processes and the atypical patterns related to pathological phenomena. Initiatives to share open-access morphological data have contributed significantly to advances in brain structure characterization: they allow large multi-site brain morphology datasets to be shared, which increases the statistical sensitivity of the outcomes. However, using neuroimaging data from multi-site studies requires harmonizing the data across sites to avoid bias. In this work, we evaluated three harmonization techniques on the Autism Brain Imaging Data Exchange (ABIDE) dataset for age prediction analysis in two groups of subjects (controls and subjects with autism spectrum disorder, ASD). We extracted morphological features from the T1-weighted images of a mixed cohort of 654 subjects acquired at 17 sites and used them to predict the biological age of the subjects with three machine learning regression models. A machine learning framework was developed to quantify the effects of the different harmonization strategies on the final performance of the models and on the set of morphological features that are relevant to age prediction, both in the presence and in the absence of pathology. The results show that, even when two harmonization strategies yield predictive models of similar accuracy, the sets of the most age-predictive regions they select can differ, and the mismatch is greater for the ASD subjects. Thus, we propose using a stability index to extract meaningful features for a robust clinical validation of the outcomes of multiple harmonization strategies.
Keywords: FreeSurfer; age prediction; aging; morphological analysis; multi-site harmonization; neurodevelopment
Year: 2020 PMID: 32545374 PMCID: PMC7349402 DOI: 10.3390/brainsci10060364
Source DB: PubMed Journal: Brain Sci ISSN: 2076-3425
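The abstract contrasts harmonization with and without age as a covariate. This record does not name the harmonization technique used, so the following is only a minimal sketch of the general idea using a simplified stand-in (a linear site-offset correction). All function names and the toy data are hypothetical. It illustrates why the choice matters when sites scan subjects of different ages: centring each site on its own mean also removes the between-site age signal, whereas estimating the site offset jointly with an age effect preserves it.

```python
from collections import defaultdict


def _mean(values):
    return sum(values) / len(values)


def _group_by_site(values, sites):
    groups = defaultdict(list)
    for v, s in zip(values, sites):
        groups[s].append(v)
    return groups


def harmonize_without_age(feature, sites):
    """Remove each site's mean, then restore the grand mean."""
    site_mean = {s: _mean(v) for s, v in _group_by_site(feature, sites).items()}
    grand = _mean(feature)
    return [f - site_mean[s] + grand for f, s in zip(feature, sites)]


def harmonize_with_age(feature, age, sites):
    """Remove site offsets estimated jointly with a linear age effect."""
    f_mean = {s: _mean(v) for s, v in _group_by_site(feature, sites).items()}
    a_mean = {s: _mean(v) for s, v in _group_by_site(age, sites).items()}
    # Within-site slope: the age effect estimated after centring per site.
    num = sum((a - a_mean[s]) * (f - f_mean[s])
              for f, a, s in zip(feature, age, sites))
    den = sum((a - a_mean[s]) ** 2 for a, s in zip(age, sites))
    slope = num / den if den else 0.0
    # Site offset: what remains of the site mean once age is accounted for.
    offset = {s: f_mean[s] - slope * a_mean[s] for s in f_mean}
    mean_offset = _mean([offset[s] for s in sites])
    return [f - offset[s] + mean_offset for f, s in zip(feature, sites)]


# Hypothetical noise-free data: site A scans children, site B scans adults.
age = [10, 12, 14, 40, 42, 44]
sites = ["A", "A", "A", "B", "B", "B"]
site_shift = {"A": 5, "B": -3}  # assumed scanner-related offsets
feature = [2 * a + site_shift[s] for a, s in zip(age, sites)]

with_age = harmonize_with_age(feature, age, sites)
without_age = harmonize_without_age(feature, sites)
```

In this toy case `with_age` follows the same age trend at both sites, while `without_age` gives children and adults identical values, so the age information carried by the site difference is lost.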
Figure 1. Statistical framework to compare the three harmonization strategies and select the most age-related predictive features. Three harmonization strategies were applied to each of the two datasets (i.e., NC and ASD). For each resulting dataset, a nested feature selection was performed on the training set in each round of the k-fold validation. Subsequently, 100 stepwise regression models were trained by progressively increasing the training set size. A consensus ranking procedure was used to select the most stable features with the lowest mean absolute error. This procedure was repeated for each of the three regression algorithms (SVR, RF, and Lasso), resulting in nine combinations of “harmonization strategy–regression model”. The performance of the models was compared in order to select the most effective machine learning algorithm for age prediction. For the best regression model, the sets of features that resulted from the different harmonization strategies were compared using a stability index to quantify the effects of the harmonization and support clinical evaluation.
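The two central ideas in this framework, ranking features only on the training part of each fold and then merging the fold-wise rankings into a consensus, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the ranking criterion (absolute Pearson correlation with age), the consensus rule (mean rank position across folds), the helper names, and the synthetic data are all hypothetical choices made for the example.

```python
import random
from statistics import mean


def pearson_abs(x, y):
    """Absolute Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def kfold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and return k (train, test) index pairs."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def rank_features(X, y, rows):
    """Rank feature columns (best first) using only the given rows."""
    n_feat = len(X[0])
    scores = [
        pearson_abs([X[r][c] for r in rows], [y[r] for r in rows])
        for c in range(n_feat)
    ]
    return sorted(range(n_feat), key=lambda c: -scores[c])


def consensus_ranking(X, y, k=5, seed=0):
    """Order features by their mean rank position over the k training folds."""
    n_feat = len(X[0])
    positions = [[] for _ in range(n_feat)]
    for train, _test in kfold_indices(len(X), k, seed):
        # The held-out rows never influence the ranking (no leakage).
        for pos, c in enumerate(rank_features(X, y, train)):
            positions[c].append(pos)
    mean_pos = [mean(p) for p in positions]
    return sorted(range(n_feat), key=lambda c: mean_pos[c])


# Synthetic data: feature 0 tracks age closely, feature 1 is pure noise,
# feature 2 is weakly (negatively) related to age.
rng = random.Random(42)
ages = [rng.uniform(6, 60) for _ in range(100)]
X = [
    [a * 0.5 + rng.gauss(0, 1), rng.gauss(0, 1), -a * 0.1 + rng.gauss(0, 3)]
    for a in ages
]
ranking = consensus_ranking(X, ages, k=5)
```

Here `ranking` lists feature indices from most to least consistently age-related, and the strongly age-related feature 0 comes first. A regression model would then be trained on growing prefixes of this list and scored on the held-out fold.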
Number of ranked features at which the regression models exhibit the minimum average mean absolute error (MAE).
| Harmonization Technique | NC: SVR | NC: RF | NC: Lasso | ASD: SVR | ASD: RF | ASD: Lasso |
|---|---|---|---|---|---|---|
| No harmonization | 400 | 90 | 55 | 150 | 130 | 35 |
| Age covariate | 1000 | 100 | 100 | 400 | 50 | 65 |
| No age covariate | 40 | 30 | 15 | 50 | 20 | 15 |
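Each entry in this table is the feature count at which the average MAE is lowest. A minimal sketch of that selection step is given below; the function names and the MAE values are hypothetical and are not results from the paper.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted ages."""
    return mean([abs(t - p) for t, p in zip(y_true, y_pred)])


def best_feature_count(mae_by_count):
    """Return (feature count, average MAE) with the lowest average MAE.

    mae_by_count maps a number of ranked features to the list of MAE
    values obtained over repeated validation rounds. Ties are broken in
    favour of the smaller feature count.
    """
    avg = {n: mean(v) for n, v in mae_by_count.items()}
    best = min(avg, key=lambda n: (avg[n], n))
    return best, avg[best]


# Hypothetical MAE values (in years) for three candidate feature counts.
mae_by_count = {
    10: [5.0, 5.5],
    20: [4.0, 4.5],
    30: [4.25, 4.75],
}
best_n, best_mae = best_feature_count(mae_by_count)
example_mae = mean_absolute_error([10, 20], [12, 17])
```

With these made-up numbers, `best_n` is 20 and `best_mae` is 4.25.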
Figure 2. Violin plots of MAE resulting from (a) support vector regression (SVR); (b) random forest (RF); and (c) Lasso, for the three harmonization strategies.
Mean MAE ± SD resulting from age prediction for each class and each harmonization technique for the three regression algorithms SVR, Random Forest, and Lasso. The best performances are highlighted in gray.
| Harmonization Technique | NC: SVR | NC: RF | NC: Lasso | ASD: SVR | ASD: RF | ASD: Lasso |
|---|---|---|---|---|---|---|
| No harmonization | | | | | | |
| Age covariate | | | | | | |
| No age covariate | | | | | | |
Mean ± SD resulting from age prediction for each class and each harmonization technique for the three regression algorithms SVR, Random Forest, and Lasso. The best performances are highlighted in gray.
| Harmonization Technique | NC: SVR | NC: RF | NC: Lasso | ASD: SVR | ASD: RF | ASD: Lasso |
|---|---|---|---|---|---|---|
| No harmonization | | | | | | |
| Age covariate | | | | | | |
| No age covariate | | | | | | |
Figure 3. Best-fitting models of chronological age versus predicted age resulting from (a) SVR; (b) RF; (c) Lasso, for the three harmonization strategies.
Best-fitting models of age prediction and adjusted for each class and each harmonization technique for the three regression algorithms SVR, Random Forest and Lasso.
| Harmonization Technique | Class | SVR | RF | Lasso |
|---|---|---|---|---|
| No harmonization | NC | | | |
| No harmonization | ASD | | | |
| Age covariate | NC | | | |
| Age covariate | ASD | | | |
| No age covariate | NC | | | |
| No age covariate | ASD | | | |
|
Signed difference between the of the ideal model fit and each of the nine best-fitting models for NC and ASD classes.
| Harmonization Technique | Class | SVR | RF | Lasso |
|---|---|---|---|---|
| No harmonization | NC | | | |
| No harmonization | ASD | | | |
| Age covariate | NC | | | |
| Age covariate | ASD | | | |
| No age covariate | NC | | | |
| No age covariate | ASD | | | |
Figure 4. Frequency of occurrence of feature categories resulting from (a) SVR; (b) RF; (c) Lasso, for the three harmonization strategies and both populations.
Figure 5. The most significant ROIs for the RF age prediction with no harmonization in the NC population.
Figure 6. The most significant ROIs for the RF age prediction with age covariate harmonization in the NC population.
Figure 7. The most significant ROIs for the RF age prediction with no harmonization in the ASD population.
Figure 8. The most significant ROIs for the RF age prediction with age covariate harmonization in the ASD population.
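Comparing the ROI sets obtained under different harmonization strategies, as in Figures 5 to 8, calls for a set-overlap measure. This record does not define the stability index used, so the sketch below assumes a common stand-in, the Jaccard index averaged over all pairs of feature sets. The function names and ROI labels are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index: shared features divided by all features in either set."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty selections are treated as identical
    return len(a & b) / len(a | b)


def mean_pairwise_stability(feature_sets):
    """Average Jaccard index over every pair of feature sets."""
    pairs = list(combinations(feature_sets, 2))
    if not pairs:
        raise ValueError("need at least two feature sets to compare")
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical ROI selections for one population under two strategies.
rois_no_harmonization = {"roi_a", "roi_b", "roi_c"}
rois_age_covariate = {"roi_b", "roi_c", "roi_d"}

overlap = jaccard(rois_no_harmonization, rois_age_covariate)
stability = mean_pairwise_stability(
    [rois_no_harmonization, rois_age_covariate, rois_no_harmonization]
)
```

A value of 1 means the strategies select identical regions and 0 means no region is shared. Two of the four distinct regions are common to both example sets, so `overlap` is 0.5.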