Wouter van Loon, Frank de Vos, Marjolein Fokkema, Botond Szabo, Marisa Koini, Reinhold Schmidt, Mark de Rooij.
Abstract
Multi-view data refers to a setting where features are divided into feature sets, for example because they correspond to different sources. Stacked penalized logistic regression (StaPLR) is a recently introduced method that can be used for classification and automatically selecting the views that are most important for prediction. We introduce an extension of this method to a setting where the data has a hierarchical multi-view structure. We also introduce a new view importance measure for StaPLR, which allows us to compare the importance of views at any level of the hierarchy. We apply our extended StaPLR algorithm to Alzheimer's disease classification where different MRI measures have been calculated from three scan types: structural MRI, diffusion-weighted MRI, and resting-state fMRI. StaPLR can identify which scan types and which derived MRI measures are most important for classification, and it outperforms elastic net regression in classification performance.
Keywords: feature selection; machine learning; multimodal MRI; penalized regression; stacked generalization
Year: 2022 PMID: 35546881 PMCID: PMC9082949 DOI: 10.3389/fnins.2022.830630
Source DB: PubMed Journal: Front Neurosci ISSN: 1662-453X Impact factor: 5.152
Overview of the scan types, MRI measures, and corresponding number of features used in this study.
| Scan type | MRI measure | Number of features |
|---|---|---|
| (1) structural MRI | (1) gray matter density | 48 |
| | (2) subcortical volumes | 14 |
| | (3) cortical thickness | 68 |
| | (4) cortical area | 68 |
| | (5) cortical curvature | 68 |
| (2) diffusion MRI | (6) fractional anisotropy | 20 |
| | (7) mean diffusivity | 20 |
| | (8) axial diffusivity | 20 |
| | (9) radial diffusivity | 20 |
| (3) resting state fMRI | (10) full FC correlation matrix (20 × 20) | 190 |
| | (11) full FC correlation matrix (70 × 70) | 2,415 |
| | (12) sparse partial FC correlation matrix (20 × 20) | 190[*] |
| | (13) sparse partial FC correlation matrix (70 × 70) | 2,415[*] |
| | (14) SD of full FC matrix (20 × 20) | 190 |
| | (15) SD of full FC matrix (70 × 70) | 2,415 |
| | (16) SD of sparse partial FC matrix (20 × 20) | 190 |
| | (17) SD of sparse partial FC matrix (70 × 70) | 2,415[*] |
| | (18) FC states of full FC matrix (20 × 20) | 5 |
| | (19) FC states of full FC matrix (70 × 70) | 5 |
| | (20) FC states of sparse partial FC matrix (20 × 20) | 5 |
| | (21) FC states of sparse partial FC matrix (70 × 70) | 5 |
| | (22) Graph metrics of full FC matrix (20 × 20) | 124 |
| | (23) Graph metrics of full FC matrix (70 × 70) | 424 |
| | (24) Graph metrics of sp. par. FC matrix (20 × 20) | 124[*] |
| | (25) Graph metrics of sp. par. FC matrix (70 × 70) | 424[*] |
| | (26) FC with visual network 1 | 190,981 |
| | (27) FC with visual network 2 | 190,981 |
| | (28) FC with visual network 3 | 190,981 |
| | (29) FC with default mode network | 190,981 |
| | (30) FC with the cerebellum | 190,981 |
| | (31) FC with sensorimotor network | 190,981 |
| | (32) FC with auditory network | 190,981 |
| | (33) FC with executive control network | 190,981 |
| | (34) FC with frontoparietal network 1 | 190,981 |
| | (35) FC with frontoparietal network 2 | 190,981 |
| | (36) FC with left hippocampus | 190,981 |
| | (37) FC with right hippocampus | 190,981 |
| | (38) Fast eigenvector centrality mapping | 190,981 |
| | (39) ALFF | 190,981 |
| | (40) fALFF | 190,981 |
| Total | | 2,876,515[**] |
The indices s and v are used to refer to the different scan types and MRI measures in .
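As a quick arithmetic check on the table, the feature counts listed for the functional connectivity (FC) matrices appear consistent with counting each unordered pair of regions once, that is, the off-diagonal upper triangle of a symmetric matrix. The sketch below is only an illustration; the helper name `n_unique_pairs` is hypothetical and not taken from the paper.

```python
def n_unique_pairs(k: int) -> int:
    """Number of distinct off-diagonal entries of a symmetric k x k matrix."""
    return k * (k - 1) // 2

# Feature counts copied from the table for the two smaller scan types.
structural = [48, 14, 68, 68, 68]
diffusion = [20, 20, 20, 20]

print(n_unique_pairs(20))  # 190, the count listed for the 20 x 20 FC matrices
print(n_unique_pairs(70))  # 2415, the count listed for the 70 x 70 FC matrices
print(sum(structural), sum(diffusion))  # 266 80
```

The resting state fMRI measures dominate the feature count by several orders of magnitude, which is part of why grouping features into views can be useful here.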
StaPLR with 3 levels, as applied to the current data set.
1. Train a logistic ridge regression classifier (including cross-validation for λ) on the pairs (
2. Apply 10-fold cross-validation to obtain a vector of predictions
3. For each of the three scan types
4. Train a logistic nonnegative lasso classifier (including cross-validation for λ) on the pairs (
5. Apply 10-fold cross-validation to obtain a vector of predictions
6. Collect the predictions
7. Train a logistic nonnegative lasso classifier (including cross-validation for λ) on the pair (
8. Define the final stacked classifier as:
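The steps above can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: the penalty parameters are fixed rather than tuned by cross-validation, the classifiers are fitted with a simple proximal gradient solver written for this example, the data are synthetic, and all function and variable names (`fit_logistic`, `cv_predict`, `fit_staplr3`, `predict_staplr3`, `scan_of_view`) are hypothetical.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-np.clip(t, -30.0, 30.0)))

def fit_logistic(X, y, lam, penalty, n_iter=500):
    """Penalized logistic regression fitted by (proximal) gradient descent.

    penalty="ridge":    adds lam * ||w||^2 / 2 to the mean log-loss.
    penalty="nn_lasso": adds lam * ||w||_1 and constrains w >= 0.
    lam is fixed here (assumption); the algorithm in the text tunes it by CV.
    Returns (intercept, weights); the intercept is not penalized.
    """
    n, p = X.shape
    w, b = np.zeros(p), 0.0
    # Step size 1/L, with L an upper bound on the curvature of the smooth part.
    L = 0.25 * (np.linalg.norm(X, 2) ** 2 / n + 1.0)
    if penalty == "ridge":
        L += lam
    for _ in range(n_iter):
        resid = sigmoid(X @ w + b) - y
        grad_w, grad_b = X.T @ resid / n, resid.mean()
        if penalty == "ridge":
            w = w - (grad_w + lam * w) / L
        else:  # combined proximal step: L1 shrinkage plus projection onto w >= 0
            w = np.maximum(0.0, w - (grad_w + lam) / L)
        b = b - grad_b / L
    return b, w

def cv_predict(X, y, lam, penalty, k=10, seed=0):
    """Out-of-fold predicted probabilities from k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(y))
    z = np.empty(len(y))
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        b, w = fit_logistic(X[train], y[train], lam, penalty)
        z[fold] = sigmoid(X[fold] @ w + b)
    return z

def fit_staplr3(views, scan_of_view, y, lam_base=0.1, lam_meta=0.01):
    """views: list of (n, p_v) arrays; scan_of_view: scan-type label per view."""
    # Steps 1-2: one ridge classifier per MRI measure, plus its CV predictions.
    base = [fit_logistic(X, y, lam_base, "ridge") for X in views]
    Z = np.column_stack([cv_predict(X, y, lam_base, "ridge") for X in views])
    # Steps 3-5: per scan type, a nonnegative lasso on that scan type's predictions.
    scans = sorted(set(scan_of_view))
    cols = {s: [v for v, sv in enumerate(scan_of_view) if sv == s] for s in scans}
    mid = {s: fit_logistic(Z[:, cols[s]], y, lam_meta, "nn_lasso") for s in scans}
    # Step 6: collect the intermediate-level CV predictions, one column per scan type.
    Z2 = np.column_stack(
        [cv_predict(Z[:, cols[s]], y, lam_meta, "nn_lasso") for s in scans]
    )
    # Step 7: meta-level nonnegative lasso on the scan-type predictions.
    meta = fit_logistic(Z2, y, lam_meta, "nn_lasso")
    return {"base": base, "cols": cols, "mid": mid, "scans": scans, "meta": meta}

def predict_staplr3(model, views):
    """Step 8: compose the base, intermediate, and meta classifiers."""
    Z = np.column_stack([sigmoid(X @ w + b) for X, (b, w) in zip(views, model["base"])])
    Z2 = np.column_stack([
        sigmoid(Z[:, model["cols"][s]] @ model["mid"][s][1] + model["mid"][s][0])
        for s in model["scans"]
    ])
    b, w = model["meta"]
    return sigmoid(Z2 @ w + b)

# Synthetic example: views 0 and 2 carry signal, views 1 and 3 are pure noise.
rng = np.random.default_rng(1)
n = 200
y = rng.integers(0, 2, n).astype(float)
views = [
    rng.normal(size=(n, 5)) + (1.5 * y[:, None] if v in (0, 2) else 0.0)
    for v in range(4)
]
scan_of_view = ["structural", "structural", "diffusion", "functional"]
model = fit_staplr3(views, scan_of_view, y)
p = predict_staplr3(model, views)
print("meta-level coefficients:", dict(zip(model["scans"], model["meta"][1])))
```

The nonnegativity constraint at the intermediate and meta levels means a view can only add to the predicted probability in the direction its own classifier suggests, and the lasso penalty allows whole views or scan types to receive a coefficient of exactly zero, which is what makes view selection possible.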
Figure 1. Boxplots of the number of selected features for each MRI measure resulting from the elastic net regression, colored by scan type (red = structural MRI, green = diffusion MRI, blue = resting state fMRI). The numbers at the top of the graph denote the median proportion of nonzero coefficients.
Figure 2. Boxplots of the L2-norm of the regression coefficient vector for each MRI measure resulting from the elastic net regression, colored by scan type (red = structural MRI, green = diffusion MRI, blue = resting state fMRI).
Figure 3. Boxplots of the regression coefficients of StaPLR applied with only 2 levels, using the MRI measures as views, colored by scan type (red = structural MRI, green = diffusion MRI, blue = resting state fMRI).
Figure 4. Box-and-violin plots of the regression coefficients of StaPLR applied with only 2 levels, using the scan types as views.
Figure 5. Box-and-violin plots of the hierarchical StaPLR meta-level regression coefficients for each scan type. The numbers at the top of the graph denote the proportion of times the coefficient was nonzero.
Figure 6. Box-and-violin plots of the hierarchical StaPLR intermediate-level regression coefficients for the structural scan type. The numbers at the top of the graph denote the proportion of times the coefficient was nonzero.
Figure 7. Box-and-violin plots of the hierarchical StaPLR intermediate-level regression coefficients for the diffusion scan type. The numbers at the top of the graph denote the proportion of times the coefficient was nonzero.
Figure 8. Boxplots of the hierarchical StaPLR intermediate-level regression coefficients for the functional scan type. The numbers at the top of the graph denote the proportion of times the coefficient was nonzero.
Figure 9. Influence of each of the MRI measures on the predictions of the hierarchical stacked classifier, as quantified by the minority report measure (MRM). MRM values range from 0 to 1, with 0 indicating no influence and 1 indicating maximum possible influence. The colors of the bars refer to the different scan types (red = structural MRI, green = diffusion MRI, blue = resting state fMRI).
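The caption does not spell out how the minority report measure is computed. The sketch below shows one way such a measure could work, assuming it is the change in the stacked classifier's predicted probability when a single view's prediction moves from 0 to 1 while the other views' predictions are held at their means. That definition, the single-level setup, the function name `mrm`, and all coefficient values are assumptions made for illustration only.

```python
import math

def sigmoid(t: float) -> float:
    return 1.0 / (1.0 + math.exp(-t))

def mrm(intercept, weights, mean_preds, v):
    """Assumed definition: change in the meta-classifier's output when the
    prediction of view v goes from 0 to 1, with other views at their mean."""
    rest = intercept + sum(
        w * z for i, (w, z) in enumerate(zip(weights, mean_preds)) if i != v
    )
    return sigmoid(rest + weights[v]) - sigmoid(rest)

# Hypothetical nonnegative meta-level coefficients for three views.
intercept = -2.5
weights = [4.0, 0.0, 1.0]
mean_preds = [0.5, 0.5, 0.5]

scores = [mrm(intercept, weights, mean_preds, v) for v in range(3)]
print(scores)
```

Under this assumed definition, a view with a zero coefficient gets a score of exactly 0, and because the coefficients are nonnegative every score lies between 0 and 1, which matches the range stated in the caption.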