Xiaowei Zhuang, Zhengshi Yang, Dietmar Cordes.
Abstract
Collecting comprehensive data sets from the same subject has become standard in neuroscience research, and uncovering multivariate relationships among the collected data sets has gained significant attention in recent years. Canonical correlation analysis (CCA) is a powerful multivariate tool for jointly investigating relationships among multiple data sets; it can uncover disease or environmental effects in several modalities simultaneously and comprehensively characterize changes during development, aging, and disease progression. Although an increasing number of studies over the past 10 years have used CCA for multivariate analysis, simple conventional CCA dominates these applications. Multiple CCA-variant techniques have been proposed to improve model performance; however, their complicated multivariate formulations and little-known capabilities have delayed their wide application. Therefore, this study provides a comprehensive review of CCA and its variant techniques. Detailed technical formulations with analytical and numerical solutions, current applications in neuroscience research, and the advantages and limitations of each CCA-related technique are discussed. Finally, a general guideline is provided on how to select the most appropriate CCA-related technique based on the properties of the available data sets and the neuroscience question being targeted.
Keywords: canonical correlation analysis; multivariate analysis; neuroscience
Year: 2020 PMID: 32592530 PMCID: PMC7416047 DOI: 10.1002/hbm.25090
Source DB: PubMed Journal: Hum Brain Mapp ISSN: 1065-9471 Impact factor: 5.038
FIGURE 1 Inclusion and exclusion criteria for this review
FIGURE 2 Number of articles summarized by category (a) and year (b)
FIGURE 3 Technical details of CCA and relationship between CCA and its variants. Background color indicates different techniques: red: conventional CCA; gray: nonlinear CCA; yellow: constrained CCA; orange: multiset CCA; green: other techniques related to CCA. CCA, canonical correlation analysis; PCA, principal component analysis; PLS, partial least squares
CCA application
| CCA variant | Modality 1 | Modality 2 | References |
|---|---|---|---|
| CCA | Brain imaging data | Clinical/behavioral/neuropsychological measurements | Adhikari et al. ( |
| | Brain imaging data | Brain imaging data | Ashrafulla et al. ( |
| | Brain imaging data | Task design | El‐Shabrawy et al. ( |
| | Electrophysiological data | Clinical/behavioral measurements | Abraham et al. ( |
| | Electrophysiological data | Electrophysiological data | Brookes et al. ( |
| | Electrophysiological data | Stimulus | de Cheveigne et al. ( |
| | Genetic information | Clinical/behavioral measurements | Laskaris et al. ( |
| | Clinical/behavioral/demographics/neuropsychological measurements | Clinical/behavioral/demographics/neuropsychological measurements | Bedi et al. ( |
| | Blind‐source separation to denoise electrophysiological data | | Hallez et al. ( |
| PCA/LASSO/regression + CCA | Brain imaging data | Clinical/behavioral/neuropsychological measurements | Churchill et al. ( |
| | Brain imaging data | Brain imaging data | Abrol, Rashid, Rachakonda, Damaraju, and Calhoun ( |
| | Brain imaging data | Genetic data | Bai, Zille, Hu, Calhoun, and Wang ( |
| | Electrophysiological data | Clinical/behavioral measurements | Bologna et al. ( |
Abbreviations: CCA, canonical correlation analysis; LASSO, least absolute shrinkage and selection operator; PCA, principal component analysis.
Constrained CCA application
| CCA variant | Modality 1 | Modality 2 | Reference |
|---|---|---|---|
| Sparse CCA (L1‐norm penalty) | Brain imaging data | Clinical/behavioral/neuropsychological measurements | Badea et al. ( |
| | Brain imaging data | Brain imaging data | Avants, Cook, Ungar, Gee, and Grossman ( |
| | Brain imaging data | Genetic information | Du et al. ( |
| | Genetic information | Clinical/behavioral measurements | Leonenko et al. ( |
| Structure‐sparse CCA | Brain imaging data | Brain imaging data | Lisowska and Rekik ( |
| | Brain imaging data | Genetic information | Du et al. ( |
| Discriminant sparse CCA | Brain imaging data | Genetic information/blood data | Fang et al. ( |
| Constrained CCA | Brain imaging data | Clinical/behavioral/neuropsychological measurements | Grosenick et al. ( |
| | Brain imaging data | Task design | Cordes, Jin, Curran, and Nandy ( |
| Other constraints in CCA | Longitudinal brain imaging data | Genetic information | Du, Liu, Zhu, et al. ( |
Abbreviation: CCA, canonical correlation analysis.
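
The effect of the L1‐norm penalty listed for sparse CCA above can be illustrated with a minimal Python sketch. It is not the implementation used in any of the cited studies: it assumes a simplified, penalized-matrix-decomposition style update in which the within-modality covariance matrices are treated as identity, and the function names, the fixed thresholds `lam_u` and `lam_v`, and the simulated data are all hypothetical.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink entries toward zero by lam; entries with |a| <= lam become exactly 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def unit_norm(w):
    """Rescale w to unit length, failing loudly if the penalty zeroed everything."""
    norm = np.linalg.norm(w)
    if norm == 0.0:
        raise ValueError("penalty removed every feature; lower the threshold")
    return w / norm

def sparse_cca_rank1(X, Y, lam_u=0.3, lam_v=0.3, n_iter=50):
    """First pair of sparse weight vectors (assumption: identity within-set covariance)."""
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    C = X.T @ Y / X.shape[0]                           # cross-correlation matrix (p x q)
    v = np.linalg.svd(C, full_matrices=False)[2][0]    # leading right singular vector
    for _ in range(n_iter):
        u = unit_norm(soft_threshold(C @ v, lam_u))    # update weights for X
        v = unit_norm(soft_threshold(C.T @ u, lam_v))  # update weights for Y
    return u, v

# Simulated example: only 3 features of X and 2 features of Y carry the shared signal.
rng = np.random.default_rng(0)
n, p, q = 400, 50, 40
z = rng.standard_normal(n)
X = rng.standard_normal((n, p))
Y = rng.standard_normal((n, q))
X[:, :3] = 0.5 * X[:, :3] + z[:, None]
Y[:, :2] = 0.5 * Y[:, :2] + z[:, None]

u, v = sparse_cca_rank1(X, Y)
print("nonzero X weights:", np.flatnonzero(u))
print("nonzero Y weights:", np.flatnonzero(v))
```

Because no covariance matrix is inverted, a sketch of this kind does not break down when features outnumber samples; the thresholds would normally be tuned, for example by cross-validation or permutation.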
Nonlinear Kernel CCA applications
| CCA variant | Modality 1 | Modality 2 | Reference |
|---|---|---|---|
| Kernel CCA | Brain imaging data | Brain imaging data | Yang, Cao, et al. ( |
| | Brain imaging data | Task design | Hardoon, Mourão‐Miranda, Brammer, and Shawe‐Taylor ( |
| | Genetic information | Genetic information | Ashad Alam, Komori, Deng, Calhoun, and Wang ( |
| Temporal kernel CCA | Simultaneously recorded multiple modalities | John et al. ( | |
Abbreviation: CCA, canonical correlation analysis.
Multiset CCA applications
| CCA variant | Application or subtype | Detailed modalities | Reference |
|---|---|---|---|
| Multiset CCA | Combine multiple brain imaging data | rsfMRI + task fMRI + sMRI | Lerman‐Sinkoff et al. ( |
| | | sMRI (WM + GM + CSF) + rsfMRI | Lottman et al. ( |
| | | sMRI + fMRI + dMRI | Sui et al. ( |
| | | Multiple task fMRI | Langers, Krumbholz, Bowtell, and Hall ( |
| | | sMRI + fMRI + EEG | Correa, Adali, Li, and Calhoun ( |
| | Combine brain imaging data and other information | Brain imaging data (sMRI/fMRI) + neuropsychological measurements + clinical/behavioral measurements | Baumeister et al. ( |
| | | Brain imaging data (PET + sMRI + fMRI) + neuropsychological measurements | Stout et al. ( |
| | Combine multiple subjects within a single modality | Sub1 + Sub2 + … + SubN within a single modality | Afshin‐Pour, Hossein‐Zadeh, Strother, and Soltanian‐Zadeh ( |
| | Combine multiple subjects from two modalities | Sub1 + Sub2 + … + SubN from fMRI and EEG | Correa, Eichele, Adali, Li, and Calhoun ( |
| | Combine multiple ROIs within a single modality | ROI1 + ROI2 + … + ROIN within a single modality | Deleus et al. ( |
| Constraints in multiset CCA | Sparse multiset CCA | Brain imaging data + genetic information + clinical measurements | Hu, Lin, Calhoun, and Wang ( |
| | Multiset CCA with reference | Brain imaging data (fMRI + sMRI + dMRI) with neuropsychological measurements as reference | Qi et al. ( |
| | | Brain imaging data (fMRI + sMRI + dMRI) with genetic information as reference | Qi, Yang, et al. ( |
Abbreviations: CCA, canonical correlation analysis; CSF, cerebrospinal fluid; dMRI, diffusion‐weighted MRI; EEG, electroencephalogram; GM, gray matter; MRI, magnetic resonance imaging; PET, positron emission tomography; ROI, region of interest; rsfMRI, resting‐state functional MRI; sMRI, structural MRI; Sub, subject; WM, white matter.
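
As a rough illustration of how multiset CCA extracts a pattern shared by more than two data sets, the following Python sketch uses the MAXVAR objective, which is only one of several objective functions that can be chosen for multiset CCA and is an assumption here rather than the choice made in the studies above. The function name `multiset_cca_maxvar` and the simulated data are hypothetical.

```python
import numpy as np

def multiset_cca_maxvar(datasets):
    """Leading shared component across data sets under the MAXVAR objective.

    Each data set is an (n_samples, n_features) array with n_samples > n_features.
    Returns the shared component g (unit norm) and one weight vector per data set.
    """
    bases, factors = [], []
    for X in datasets:
        Q, R = np.linalg.qr(X - X.mean(axis=0))  # orthonormal basis of each centered set
        bases.append(Q)
        factors.append(R)
    # The leading left singular vector of the stacked bases maximizes the summed
    # squared correlation between g and the column space of every data set.
    U, _, _ = np.linalg.svd(np.hstack(bases), full_matrices=False)
    g = U[:, 0]
    # Weights that project each data set onto its best approximation of g.
    weights = [np.linalg.solve(R, Q.T @ g) for Q, R in zip(bases, factors)]
    return g, weights

# Simulated example: three "modalities" that share one latent signal z.
rng = np.random.default_rng(0)
n = 300
z = rng.standard_normal(n)
datasets = [z[:, None] + rng.standard_normal((n, p)) for p in (4, 5, 6)]

g, weights = multiset_cca_maxvar(datasets)
print("|corr(g, z)| =", abs(np.corrcoef(g, z)[0, 1]))
```

The sign of `g` is arbitrary, which is why the example reports the absolute correlation with the simulated latent signal.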
Other CCA applications
| CCA variant | CCA application | Reference |
|---|---|---|
| Supervised local CCA | Combine two modalities | Zhao, Qiao, Shi, Yap, and Shen ( |
| Tensor CCA | Morphological networks | Graa and Rekik ( |
| Bayesian CCA | Realign fMRI data from multiple subjects | Smirnov et al. ( |
| | Task fMRI activation detection | Fujiwara, Miyawaki, and Kamitani ( |
| Others | Toolbox | Bilenko and Gallant ( |
| | Reviews | Liu and Calhoun ( |
Abbreviations: CCA, canonical correlation analysis; fMRI, functional magnetic resonance imaging.
Advantages and limitations of each CCA‐related technique
| Category | CCA variant | Advantages | Limitations |
|---|---|---|---|
| CCA | CCA | 1) Has closed‐form analytical solution 2) Easy to apply 3) Invariant to scaling | 1) Requires 2) Signs of canonical correlations are indeterminate |
| Constrained CCA | Sparse CCA | 1) Removes noninformative features and solves 2) Performs reasonably with high‐dimensional collinear data | Requires optimization expertise |
| | Structure sparse CCA | Removes noninformative features, solving; 1) Improves effectiveness of sparse CCA 2) Produces biologically meaningful results | 1) Requires optimization expertise 2) Requires prior knowledge about the data |
| | Discriminant sparse CCA | Discovers group discriminant features | |
| | Generalized constrained CCA | 1) Reduces false positives 2) Maintains most of the variance in a stable model | 1) Requires optimization expertise 2) Requires predefined constraints |
| Nonlinear CCA | Kernel CCA | 1) Finds nonlinear relationship among modalities 2) Has analytical solution | 1) Requires predefined kernel functions 2) Difficult to project from kernel space back to original feature space, leading to difficulties in interpretation 3) Only linear kernel space can be projected back to the original feature space |
| | Temporal kernel CCA | Most appropriate for simultaneously collected data from two modalities with a time delay | |
| | Deep CCA | 1) Finds unknown nonlinear relationship 2) Purely data‐driven | 1) Requires deep learning expertise 2) Requires large number of training samples (in tens of thousands) |
| Multiset CCA | Multiset CCA | 1) Good for more than two modalities 2) Good for group analysis | 1) Requires predefined objective functions 2) The number of final canonical components does not represent the intersected common patterns across all modalities |
| | Sparse multiset CCA | 1) Good for more than two modalities 2) Removes noninformative features and solves | |
| | Multiset CCA with reference | Supervised fusion technique to link common patterns with a prior known variable | |
Abbreviation: CCA, canonical correlation analysis.
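
Two of the properties listed for conventional CCA, the closed‐form solution and the invariance to scaling, can be checked numerically. The Python sketch below is one standard way to compute CCA (QR decomposition followed by an SVD) and is offered as an illustration under stated assumptions, not as the formulation used in the review: it assumes more samples than features and full-rank data, and the function name `cca` and the simulated data are hypothetical.

```python
import numpy as np

def cca(X, Y):
    """Conventional CCA via QR + SVD.

    X is (n, p) and Y is (n, q), with n larger than p and q and full column rank.
    Returns canonical correlations (descending) and weight matrices A (p, k), B (q, k).
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    Qx, Rx = np.linalg.qr(Xc)             # orthonormal basis for each centered set
    Qy, Ry = np.linalg.qr(Yc)
    U, s, Vt = np.linalg.svd(Qx.T @ Qy)   # singular values are the canonical correlations
    k = min(X.shape[1], Y.shape[1])
    A = np.linalg.solve(Rx, U[:, :k])     # map back to weights on the original features
    B = np.linalg.solve(Ry, Vt.T[:, :k])
    return np.clip(s[:k], 0.0, 1.0), A, B

# Simulated example: two data sets driven by one shared latent variable.
rng = np.random.default_rng(0)
n = 500
z = rng.standard_normal((n, 1))
X = z @ rng.standard_normal((1, 4)) + rng.standard_normal((n, 4))
Y = z @ rng.standard_normal((1, 3)) + rng.standard_normal((n, 3))

rho, A, B = cca(X, Y)
# Rescaling the columns of X changes the weights but not the canonical correlations.
rho_scaled, _, _ = cca(X * np.array([1.0, 10.0, 0.1, 100.0]), Y)
print(rho)
print(np.allclose(rho, rho_scaled))
```

Flipping the sign of a column of `A` together with the matching column of `B` leaves the correlation unchanged, so the sign of each pair of canonical weight vectors is arbitrary.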
Currently applied and potential CCA techniques for each application
| Applications | Currently applied | Potential techniques |
|---|---|---|
| Link two modalities | CCA; Sparse CCA; Structure/discriminant sparse CCA; Kernel CCA; Temporal kernel CCA | Deep CCA |
| Detect task fMRI activation | CCA; Constrained CCA; Kernel CCA | Deep CCA; Sparse CCA |
| Uncover common patterns across multiple modalities | Multiset CCA; Sparse multiset CCA | Multiset constrained CCA; Deep CCA |
| Denoise raw data | CCA | Constrained CCA; Kernel CCA; Deep CCA |
Abbreviations: CCA, canonical correlation analysis; fMRI, functional magnetic resonance imaging.
FIGURE 4 Selecting a canonical correlation analysis (CCA) technique that suits your application. Three scenarios are most commonly encountered in neuroscience applications: CCA with and without constraints (dashed yellow box); nonlinear CCA (dashed gray box); and multiset CCA (dashed orange box)
FIGURE 5 Example of choosing canonical correlation analysis (CCA) variants by following the guideline. Voxel‐wise functional and structural MRI information from cognitively normal subjects and subjects with mild cognitive impairment was used for data fusion analysis. (a) Schematic diagram of (sparse) principal component analysis (PCA) + CCA. The abbreviation sPCA stands for sparse PCA. (b) Schematic diagram of sparse CCA (sCCA). (c) Top panel shows the most disease‐discriminant functional and structural component, and the bottom panel shows the correlation between datasets, the significance of the correlation derived from a nonparametric permutation test, and the classification accuracy for each method
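
The PCA + CCA route in panel (a), followed by a nonparametric permutation test, can be sketched in Python as follows. This is an illustration on simulated data, not the analysis behind the figure: it uses ordinary rather than sparse PCA, and the function names, the number of retained components, and the number of permutations are all assumptions.

```python
import numpy as np

def pca_scores(X, k):
    """Project centered data onto its first k principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def first_canonical_corr(X, Y):
    """Largest canonical correlation between two low-dimensional data sets."""
    Qx, _ = np.linalg.qr(X - X.mean(axis=0))
    Qy, _ = np.linalg.qr(Y - Y.mean(axis=0))
    s = np.linalg.svd(Qx.T @ Qy, compute_uv=False)
    return float(np.clip(s[0], 0.0, 1.0))

def permutation_pvalue(X, Y, n_perm=500, seed=0):
    """Shuffle subjects in Y to build a null distribution of the first correlation."""
    rng = np.random.default_rng(seed)
    observed = first_canonical_corr(X, Y)
    null = np.array([first_canonical_corr(X, Y[rng.permutation(len(Y))])
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p_value

# Simulated example: 60 subjects, 500 "voxels" per modality, one shared latent factor.
rng = np.random.default_rng(0)
n_subjects, n_voxels = 60, 500
z = rng.standard_normal((n_subjects, 1))
func = z @ rng.standard_normal((1, n_voxels)) + rng.standard_normal((n_subjects, n_voxels))
struct = z @ rng.standard_normal((1, n_voxels)) + rng.standard_normal((n_subjects, n_voxels))

# Reduce each modality to 5 components so that CCA is well posed with 60 subjects.
rho, p_value = permutation_pvalue(pca_scores(func, 5), pca_scores(struct, 5))
print(rho, p_value)
```

Reducing the dimensionality first keeps the number of features below the number of subjects, which conventional CCA needs; sparse CCA, as in panel (b), instead applies the constraint inside the CCA step itself.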