Literature DB >> 28881979

Identification of associations between genotypes and longitudinal phenotypes via temporally-constrained group sparse canonical correlation analysis.

Xiaoke Hao¹, Chanxiu Li¹, Jingwen Yan^2,3, Xiaohui Yao^2,3, Shannon L Risacher², Andrew J Saykin², Li Shen^2,3, Daoqiang Zhang¹.

Abstract

MOTIVATION: Neuroimaging genetics identifies the relationships between genetic variants (i.e., the single nucleotide polymorphisms) and brain imaging data to reveal the associations from genotypes to phenotypes. So far, most existing machine-learning approaches are widely used to detect the effective associations between genetic variants and brain imaging data at one time-point. However, those associations are based on static phenotypes and ignore the temporal dynamics of the phenotypical changes. The phenotypes across multiple time-points may exhibit temporal patterns that can be used to facilitate the understanding of the degenerative process. In this article, we propose a novel temporally constrained group sparse canonical correlation analysis (TGSCCA) framework to identify genetic associations with longitudinal phenotypic markers.
RESULTS: The proposed TGSCCA method is able to capture the temporal changes in brain from longitudinal phenotypes by incorporating the fused penalty, which requires that the differences between two consecutive canonical weight vectors from adjacent time-points should be small. A new efficient optimization algorithm is designed to solve the objective function. Furthermore, we demonstrate the effectiveness of our algorithm on both synthetic and real data (i.e., the Alzheimer's Disease Neuroimaging Initiative cohort, including progressive mild cognitive impairment, stable MCI and Normal Control participants). In comparison with conventional SCCA, our proposed method can achieve strong associations and discover phenotypic biomarkers across multiple time-points to guide disease-progressive interpretation.
AVAILABILITY AND IMPLEMENTATION: The Matlab code is available at https://sourceforge.net/projects/ibrain-cn/files/ . CONTACT: dqzhang@nuaa.edu.cn or shenli@iu.edu.

Entities: Chemical

Mesh：

Substances：
Biomarkers

Year: 2017 PMID： 28881979 PMCID： PMC5870577 DOI： 10.1093/bioinformatics/btx245

Source DB: PubMed Journal: Bioinformatics ISSN： 1367-4803 Impact factor: 6.937

1 Introduction

Integrating neuroimaging and molecular genetics technology hold great promising to use brain imaging as quantitative phenotypes to investigate the role of genetic variations. These imaging quantitative traits (QTs) serve as intermediate phenotypes with rich information, which bridge the gap between genetic factors and phenotypic outcomes (Glahn ; Gottesman and Gould, 2003; Hariri ) and may lead to a better understanding of the complex biological mechanism underlying neurodegenerative diseases [e.g., mild cognitive impairment (MCI), the prodromal stage of Alzheimer’s disease (AD)]. In prior imaging genetic studies, pairwise univariate analysis strategies have been performed to identify the associations between single nucleotide polymorphisms (SNPs) and neuroimaging QTs. The most comprehensive studies focused on scanning the entire brain and the entire genome (Shen, ; Stein ). In recent studies, taking into account the inherent structure among genotype or phenotype data, some researchers have developed generalized multivariate linear regression models (Hibar ; Kohannim , 2012; Vounou ; Wang ) and structured bi-multivariate models (Chi ; Lin ; Yan ) to identify multi-SNP-multi-QT associations. Those methods have sufficient power to discover structured phenotypic imaging markers associated with disease-relevant SNPs. However, examining genetic influence on the longitudinal profiles of imaging phenotype is still an under-explored topic in imaging genetics. Specifically, a straight forward approach such as conventional sparse canonical correlation analysis (SCCA) (Chi ; Witten ; Witten and Tibshirani, 2009), which does not take into account the valuable information conveyed by the longitudinal pattern of phenotypic input, is to perform multi-SNP-multi-QT associations at one time-point. In fact, the phenotypes across multiple time-points may exhibit temporal patterns that can be used to describe the degenerative process. Some studies have investigated on prediction of memory impairment and cognitive assessments with longitudinal magnetic resonance imaging (MRI) data (Jie ; Wang ). So far, only a few machine-learning strategies have been proposed to examine how the phenotypic changes are that affected by SNPs. Recently, Wang ) proposed a novel task-correlated longitudinal sparse regression model to study the association between phenotypic imaging markers and the genotypes by taking into account the temporal structure of the longitudinal imaging data. More specifically, they used L21-norm for regression coefficient matrix to jointly select imaging markers that have common effects across all time-points. However, the task-correlated longitudinal sparse regression model assumed that longitudinal imaging markers were related to all candidate SNPs as a task-correlated constraint, which might not hold in real applications. More recently, Vounou ) have proposed a two-step framework based on sparse reduced-rank regression to solve the imaging genetics problem for the genome-wide detection of markers associated with voxel-wise longitudinal changes in brain (Vounou ). They first pre-selected the disease relevant voxel level imaging phenotypes with high-classification performance between AD and NC group by penalized linear discriminant analysis, and then identified the SNPs associated with the multivariate imaging phenotypes from the first step. This approach might be inadequate to capture the dynamics of phenotypic trajectories and thus unable to detect the underlying temporal patterns. Therefore, how to identify the longitudinal phenotypes across consecutive time-points associated to the disease sensitive SNPs is still an important topic in imaging genetic studies. With these observations, the motivation of this study is to identify associations between risk genotypes and longitudinal phenotypes, where we aim to design a powerful model to simultaneously maximize progression-relevant imaging genetic associations and capture the consecutive changes in brain between adjacent time-points. Accordingly, as shown in Figure 1, we propose a novel temporally constrained group sparse canonical correlation analysis (TGSCCA) framework that incorporates the group sparsity constraint and fused penalty to identify the associations between genetic factors and longitudinal phenotypes. In particular, it is promising to find the consecutive patterns that are robust to noises or outliers via considering both joint selection and the fused information in imaging phenotypes from adjacent time-points.

Fig. 1

Schematic illustration of TGSCCA for imaging genetics

Schematic illustration of TGSCCA for imaging genetics In this study, to evaluate the effectiveness and efficiency of our proposed method, we perform experiments on both synthetic and real data. For real data, using the Alzheimer’s Disease Neuroimaging Initiative (ADNI) cohort (Mueller ), we examine imaging genetic associations between SNPs nearby the apolipoprotein E (APOE) gene and region of interest (ROI) measures extracted from longitudinal structural MRI. The empirical results show that our method not only yields clearly improved association performance under the metrics of correlation coefficient but also detects relevant risk SNP loci and imaging ROI markers. The rest of this article is organized as follows. Section 2 introduces the TGSCCA method. The performances of the proposed method are evaluated through both simulations and real data analysis in Section 3. The last section concludes the study.

2 Materials and methods

2.1 Sparse Canonical Correlation Analysis (SCCA)

We first describe relevant notations to present imaging genetic association analysis. We use lowercase letters to denote vectors, and uppercase letters to denote the matrices. Let be the SNP genotype data, be the phenotype data, where N is the number of participants, and p and q are the feature number of SNPs and imaging data, respectively. Canonical correlation analysis (CCA) is a powerful association method that seeks linear transformations of two data sets and to achieve the maximal correlation between and (Hotelling, 1935), which can be formulated as: where we assume that the columns of and are standardized to have zero mean and unit variance, and and are canonical weights, reflecting the contribution of each feature in the identified canonical correlation. However, in imaging genetics applications, the traditional CCA model tends to overfit and does not yield desirable results as the dimension of the data is much higher than the sample size. In addition, the CCA outcome could spread nontrivial effects across all features, which are not desirable for applications needing to identify relevant features. To address these issues, sparse version of CCA (SCCA) (Chi ; Witten ; Witten and Tibshirani, 2009) has been proposed by introducing penalties with L1 regularization for variable selection (Tibshirani, 2011) as follows: where the constraints and are regularization term for the objective function, and and is the corresponding regularization parameters. In imaging genetics applications, the weight vectors and measure the relative contributions of the SNP loci and imaging phenotype ROIs. For easy computation, the variance matrix of and is treated as diagonal matrix, which has shown to be effective and efficient for high-dimensional data (Grellmann ; Witten ).

2.2 Temporally constrained group sparse canonical correlation analysis (TGSCCA)

In clinical practice, imaging phenotypes affected by genetic factors changes over time. To investigate the association between genotypes and longitudinal imaging phenotypes, in this article, we consider how to perform bi-multivariate association analysis across the consecutive time-points. Assume that we have training subjects, and each subject has imaging data derived from different time-points. Given the genotype SNPs data and longitudinal imaging phenotypes at time-point () as input in the association model, where is the number of participants, and are the numbers of feature dimensionalities (i.e., number of SNP loci and brain imaging ROIs). As described in Section 1 and Figure 1, our aim is to discover those longitudinal brain imaging markers associated with genetic factors across different time-points. Task-correlated longitudinal analysis model has recently been successfully investigated and applied to regression problems (Wang et al., 2012a, b), which are inspired by using multi-task learning framework (Liu ; Obozinski ) in machine-learning community. Following their previous work, we induce the joint penalty term L21-norm into the Equation (2) and then develop group sparse canonical correlation analysis (GSCCA) model as follows: where the weight vector and measure the relative importance of the SNP loci and imaging phenotype ROIs at time-point (). and denote control parameters of the regularization terms, respectively. is the weight matrix whose row is the vector of coefficients assigned to the -th feature across different time-points, and is to penalize all coefficients in the same row of matrix for joint feature selection. It is worth noting that the L21 regularization term can be coupled over time dimension and this “group-sparsity” regularizer forces only a small number of features being selected (Yuan and Lin, 2006). In other words, the longitudinal imaging features across all time-points will be identified. To further take into account detecting temporally-constrained imaging genetic associations, we expect to develop our model to explore the association between baseline SNPs and longitudinal imaging phenotypes for a better understanding of underlying progressive mechanism specific to the disease. More specifically, motivated by the existing work (Jie ), we induce a new regularization term called fused least absolute shrinkage and selection operator (Lasso) (Liu ) in machine-learning community and then formulate the TGSCCA model as follows: where the weight vectors and measure the relative contributions of the SNP loci and imaging phenotype ROIs at time-point . The weight vector and are from adjacent time-points. , and denote control parameters of the regularization terms, respectively. The fused Lasso regularization term tends to constrain the differences between two successive canonical weight vectors from adjacent time-points to be small, that is the smoothness of weight vectors encourages neighboring features to be selected together. Due to the two regularization terms in Equation (3), it is promising to find the better solution that is robust to noises or outliers via considering both joint selections and the fused information inherent in imaging genetic associations.

2.3 Optimization algorithm

In this section, we introduce the algorithm to obtain and from Equation (4). The objective function is convex with respect to when is fixed and vice versa. So the iteration procedures mainly contain two steps as follows: We can use the Lagrange multiplier and write the penalties into the matrix form for Equation (5), thus the new objective function is as follows: where and are the model parameters. In this solution, a smooth approximation has been estimated for L1 term by including an extremely small value. Take the derivative regarding and let it be 0. The solution for in each iteration step is as follows: where is a diagonal matrix with the th element as . To obtain when is fixed for Equation (6), we follow the previous work (Fang ; Witten ) and assume that the variance matrix of is treated as diagonal matrix, which has shown to be effective and efficient for optimization. The solution for in each iteration step is as follows: where , is given by , where is the optimum of It is straightforward to verify that Equation (10) is convex but non-smooth because of L21-norm and Fused Lasso regularization term. The basic idea to solve this problem is to use a smooth function to approximate the original non-smooth objective function. In this study, we use the Nesterov’s accelerated proximal gradient (APG) algorithm (Beck and Teboulle, 2009; Chen ) to solve our optimization problem, which is shown in the Algorithm 1. First, we separate Equation (10) into a smooth part Equation (11) and a non-smooth part Equation (12) as follows: We define the approximation function Equation (10) as follows, which is composited by the above smooth part and non-smooth one: where denotes the Frobenius norm, denotes the gradient of on point at the th iteration, and is the step size. Finally, the update step of Nesterov’s APG is defined as: where . The key of APG algorithm is how to solve the update step efficiently. In addition, according to the technique used in Chen , instead of performing gradient descent based on , we compute the search point as: where and . For more details about the solution of L21-norm and Fused Lasso problem, please refer to the previous work (Jie ; Liu ). Input: SNPs ; longitudinal imaging phenotypes at time-point (); parameters , , , . Initialization:, , . While not converge do 1: Calculate the diagonal matrix , where the -th element is ; 2: Update by Equation (8); 3: Scale so that ; 4: Computed the search point according to Equation (15); 5:Find the smallest so that , where is computed by Equation (14); 6: Set ; 7: Scale so that . End while Output: canonical vector .

3 Results and discussions

3.1 Results on simulation data

In this section, we present a simulation study to evaluate the potential power of our proposed TGSCCA method. The procedure of simulation generation is similar to that in Chen )and Fang . We first generated one canonical vector with p’ non-zero entries and successive canonical vector with q’ non-entries, where . Each non-zero variable in and was sampled independently from a uniform distribution in the range of . And then, we randomly generated a latent variable with normal distribution for each sample, where is the signal to noise level. For the data matrix X and Y, the features were simulated from Gaussian distribution and , respectively. We set N = 100, p = 100, q = 50, p’=30, q’=20, . To validate the effects on the performance, we varied the noise level from 0.1 to 0.5 to generate simulation data 1 and simulation data 2, respectively. In our experiments, 5-fold cross-validation strategy is adopted to evaluate the effectiveness of our proposed method. All the regularization parameters are optimally tuned using a grid search from the range of {0.01, 0.02, 0.05, 0.08, 0.1, 0.2, 0.5, 0.8, 1} by another nested 5-fold cross-validation on the training set. We compare SCCA (denoted as sparse canonical correlation analysis to detect associations between SNP loci and imaging features at each time-point), GSCCA (denoted as group sparse canonical correlation analysis to detect associations between SNP loci and longitudinal imaging phenotypic features jointly across all time-points only via L21-norm), and TGSCCA (denoted as temporally-constrained group sparse canonical correlation analysis to detect associations between SNP loci and longitudinal imaging features jointly across all adjacent time-points via L21-norm and fused penalty). The performance on each dataset is assessed with correlation coefficient between X and Y, which are widely used in measuring performances of association analysis. The average results of correlation coefficients on 5-fold testing data are calculated respectively on Simulation 1 and Simulation 2. As shown in Figure 2, joint longitudinal association methods (including GSCCA and TGSCCA) outperform SCCA consistently and significantly in the metrics of correlation coefficients on both simulations. It is worth noting that TGSCCA is comparable with GSCCA due to the low noises in Simulation 1, while TGSCCA is more robust to the data with high noises in Simulation 2. Furthermore, we show the estimated canonical weights from different methods. As shown in Figure 3, the overall profiles of the estimated u and v values from TGSCCA are consistent with the ground truth on both Simulation 1 and Simulation 2, whereas SCCA is only capable of identifying inconsistent signals at different time-points. Although GSCCA can almost capture the same signals on u as TGSCCA, from the unsmooth patterns across the longitudinal case, it may be affected by noises and then draw false discoveries on v without the induced temporal-constraint. From the above results, it is observed that TGSCCA can identify not only the signal locations but also strong correlations, which has certain superiority compared with other methods.

Fig. 2

The averaged correlation coefficients on 5-fold test data using different methods on simulations. (a) Results on simulation data 1. (b) Results on simulation data 2

Fig. 3

The estimated weights of u and v from average 5-fold cross-validation test on simulation data are shown in the left five panels and right five panels. Ground truth of w and v are shown in the most left in the two parts, respectively. The estimated u values and v values are shown in the remaining panels, corresponding to different methods. (a) Results on simulation data 1. (b) Results on simulation data 2

The averaged correlation coefficients on 5-fold test data using different methods on simulations. (a) Results on simulation data 1. (b) Results on simulation data 2 The estimated weights of u and v from average 5-fold cross-validation test on simulation data are shown in the left five panels and right five panels. Ground truth of w and v are shown in the most left in the two parts, respectively. The estimated u values and v values are shown in the remaining panels, corresponding to different methods. (a) Results on simulation data 1. (b) Results on simulation data 2

3.2 Results on real imaging genetic data

3.2.1 ADNI dataset

Real imaging genetics data used in the preparation of this article were obtained from the ADNI database (adni.loni.usc.edu). The ADNI was launched in 2003 as a public–private partnership, led by Principal Investigator Michael W. Weiner, MD. The primary goal of ADNI has been to test whether serial MRI, positron emission tomography (PET), other biological markers, and clinical and neuropsychological assessment can be combined to measure the progression of MCI and early AD. For up-to-date information, see www.adni-info.org. The genotyping and longitudinal imaging data of 114 non-Hispanic Caucasian participants were downloaded from the ADNI website. The demographic information is summarized in Table 1. Specifically, the time points examined in this study for MRI T1-weighted imaging and cognitive assessments (i.e., MMSE and ADAS-Cog) included baseline (BL), Month 06 (M06), Month 12 (M12) and Month 24 (M24). We aligned the preprocessed imaging data [i.e., voxel based morphometry (VBM)] to each participant’s same visit scan, and then created normalized gray matter density maps from MRI data in the standard Montreal Neurological Institute (MNI) space as 2 × 2 × 2 mm3 voxels SPM software package (Ashburner and Friston, 2007). One hundred and sixteen ROI level measurements of mean gray matter densities were further extracted based on the MarsBaR AAL atlas (Tzourio-Mazoyer ). After removal of cerebellum, the imaging measures of 90 ROIs were used as phenotypes in our experiments. For the genotyping data, we included 85 SNPs within ±20k base pairs of the APOE gene boundary based on the ANNOVAR (http://annovar.openbioinformatics.org) annotation. For input in this association study, each SNP value was coded in an additive fashion as 0, 1, 2, indicating the number of minor alleles.

Table 1.

Demographic characteristics of the studied population (the values are denoted as mean ± standard deviation)

Subjects	pMCI (n = 15)	sMCI (n = 41)	NC (n = 58)
Gender (M/F)	8/7	26/15	31/27
Age	71.75±5.92	73.40±7.59	75.71±4.74
Education	16.33±3.54	16.22±2.86	16.38±2.85
MMSE(BL)	26.93±1.91	27.59±1.50	29.21±0.99
MMSE(M06)	26.07±2.69	27.59±1.76	29.03±1.03
MMSE(M12)	25.47±2.72	27.56±1.91	29.38±0.83
MMSE(M24)	22.80±4.00	27.61±2.24	29.12±1.09
ADAS-Cog(BL)	20.64±5.51	15.45±5.80	8.92±3.69
ADAS-Cog(M06)	22.91±8.48	15.52±5.77	8.95±3.75
ADAS-Cog(M12)	24.33±6.57	15.20±5.93	7.58±4.05
ADAS-Cog(M24)	26.95±8.07	16.11±6.28	8.43±4.43

Note: NC = Normal Control, pMCI = progressive Mild Cognitive Impairment, sMCI = stable Mild Cognitive Impairment, MMSE = Mini-Mental State Examination, ADAS-Cog = Alzheimer’s Disease Assessment Scale-Cognitive Subscale.

Demographic characteristics of the studied population (the values are denoted as mean ± standard deviation) Note: NC = Normal Control, pMCI = progressive Mild Cognitive Impairment, sMCI = stable Mild Cognitive Impairment, MMSE = Mini-Mental State Examination, ADAS-Cog = Alzheimer’s Disease Assessment Scale-Cognitive Subscale.

3.2.2 Improved association between risk SNP loci and longitudinal imaging ROI markers

In the real data experiments, we also use 5-fold cross-validation strategy to evaluate the effectiveness of our proposed method. Similar to the previous simulation study, we determine the values of regularization parameters by nested 5-fold cross-validation on the training set. The parameters are tuned in the range of {0.01, 0.02, 0.05, 0.08, 0.1, 0.2, 0.5, 0.8, 1}. In current studies, we compare our proposed joint longitudinal imaging genetic strategies (including GSCCA and TGSCCA) with conventional SCCA method. For measuring the association performance of the compared methods, the average values of Pearson correlation coefficients on 5-fold test sets are calculated to eliminate the bias. As shown in Figure 4, the performances of longitudinal strategies with joint detections (including GSCCA and TGSCCA) are more stable than the conventional SCCA, which treats imaging genetic associations at each time-point independently. As expected, TGSCCA can achieve the best correlation coefficients so that it consistently outperform SCCA and GSCCA. These results demonstrate that the usage of temporal information across adjacent time-points can help improve the performances of association between genotypes and longitudinal imaging phenotypes.

Fig. 4

The averaged correlation coefficients on 5-fold test data using different methods on ADNI

3.2.3 Identification of risk SNP loci

Besides improving association performance, one major goal of this study is to identify some vital SNP loci and imaging phenotypic markers for disease progression in MCI research. Therefore, finding genetic risk factors and imaging ROIs helps scientists better understand how the disease develops and identify possible treatments to study. We aim to present the selected features on the SNP loci and imaging ROIs, whose annotations are shown on the X-axis from top and bottom panels in Figure 5. It shows all comparisons of absolute weight maps for top 10 loci from APOE SNPs associated to top 10 brain ROIs with longitudinal analysis respect to different methods.

Fig. 5

The estimated weights of u (top panels) and v (bottom panels) from average 5-fold cross-validation test on ADNI data using the different methods

The estimated weights of u (top panels) and v (bottom panels) from average 5-fold cross-validation test on ADNI data using the different methods For detecting genetic factors, as shown on top panels in Figure 5, the locus rs76692773 and rs2075649 (Lin ) are the top hits by all methods. However, compared with SCCA, the joint longitudinal detections (including GSCCA and TGSCCA) can discover consistent and clear patterns across all time-points, which indicate our proposed method performs stable in longitudinal imaging genetic associations. It is worth noting that the best-known risk genetic loci rs429358 has not been identified by all methods (including our proposed TGSCCA). It warrants further investigation to confirm whether the eminent risk factor rs429358 is truly not associated with longitudinal VBM phenotypes in the MCI progression. In addition, as the association solution relies on the linear combination of all loci, an individual one might not have a direct influence to the correlation, i.e., it might modulate the influence of another locus. Consequently, these genetic factors selected in this association study should also warrant further investigation for replication in independent and larger cohorts.

3.2.4 Identification of longitudinal imaging ROI markers

For detecting brain imaging ROIs, as shown on bottom panels in Figure 5, the conventional SCCA method identifies some irregular imaging ROIs at different time-points, which are not able to serve screening target over the course of MCI progression. While using joint longitudinal detection strategies, we can obtain clear patterns on the feature panels. For example, right parahippocampal gyrus and right superior parietal gyrus are the two top hits in joint detections (including GSCCA and TGSCCA), which are in accordance with previous findings (Jacobs ; Shen ). In addition, the top ROIs selected for progression across all time-points by TGSCCA can be treated as increasingly stable markers, which are also accordance with the fact that the grey matter atrophy of these ROIs is severe in MCI (Driscoll ). Therefore, compared with SCCA and GSCCA, our proposed TGSCCA can tolerate noises to some extent so that the weight maps of each selected ROI across different time-points are very smooth. This further indicates the advantage of using the temporal-constrained smoothness regularization.

4 Conclusion

In this article, we propose a novel TGSCCA framework to detect risk genetic factors and their correlated longitudinal phenotype markers for a progressive disease (i.e., MCI). This approach explicitly captures the consecutive changes between phenotypes from adjacent time-points by incorporating the group sparsity constraint and fused penalty into the objective function. We also present an effective iterative algorithm to solve the optimization problem. We apply the proposed method on the simulation data and ADNI cohort [including progressive mild cognitive impairment (pMCI), stable MCI (sMCI) and NC participants]. The experimental results show that our proposed TGSCCA model can identify stronger associations than conventional SCCA and GSCCA. Besides the improved association performance, in real imaging genetic data, our model can also detect the risk SNP loci and clearly consistent brain ROIs across all time-points, which provides valuable information to help understand the genetic basis of brain structural change over the progression of MCI and AD. As an interesting future direction, this TGSCCA method can be applied to investigate the potential mechanism of other imaging phenotypes (e.g., fluorodeoxyglucose positron emission tomography (FDG-PET) and Florbetapir F 18 amyloid PET data) (Hao ) and biomarkers such as cerebrospinal fluid and plasma from longitudinal perspective (Fagan ). Therefore, all kinds of biomarker outcomes learned from trajectories across the course of disease can be evaluated, and the findings may have the potential to help with neurodegenerative assessments in clinical practices. In this initial study, the proposed TGSCCA can be successfully applied for longitudinal imaging genetics study on a candidate gene set. However, when the datasets contain more features, it becomes more challenging to identify truly relevant ones. Thus, an interesting future direction could be to develop an improved TGSCCA model by exploring non-convex penalty terms that have been shown to be more effective than L1 based terms in terms of feature selection via sparse learning. In addition, this general framework can be extended and applied to some other interesting fields such as brain-perceived analysis (Connolly ) and gene expression analysis on multiple data sources (Allahyar and de Ridder, 2015; Bunte ; Omranian ), allowing for generating new insights.

31 in total

1. Prediction of Memory Impairment with MRI Data: A Longitudinal Study of Alzheimer's Disease.

Authors: Xiaoqian Wang; Dinggang Shen; Heng Huang
Journal: Med Image Comput Comput Assist Interv Date: 2016-10-02

2. IMAGING GENETICS VIA SPARSE CANONICAL CORRELATION ANALYSIS.

Authors: Eric C Chi; Genevera I Allen; Hua Zhou; Omid Kohannim; Kenneth Lange; Paul M Thompson
Journal: Proc IEEE Int Symp Biomed Imaging Date: 2013-12-31

Review 3. The endophenotype concept in psychiatry: etymology and strategic intentions.

Authors: Irving I Gottesman; Todd D Gould
Journal: Am J Psychiatry Date: 2003-04 Impact factor: 18.112

4. Temporally Constrained Group Sparse Learning for Longitudinal Data Analysis in Alzheimer's Disease.

Authors: Biao Jie; Mingxia Liu; Jun Liu; Daoqiang Zhang; Dinggang Shen
Journal: IEEE Trans Biomed Eng Date: 2016-04-13 Impact factor: 4.538

5. Whole genome association study of brain-wide imaging phenotypes for identifying quantitative trait loci in MCI and AD: A study of the ADNI cohort.

Authors: Li Shen; Sungeun Kim; Shannon L Risacher; Kwangsik Nho; Shanker Swaminathan; John D West; Tatiana Foroud; Nathan Pankratz; Jason H Moore; Chantel D Sloan; Matthew J Huentelman; David W Craig; Bryan M Dechairo; Steven G Potkin; Clifford R Jack; Michael W Weiner; Andrew J Saykin
Journal: Neuroimage Date: 2010-01-25 Impact factor: 6.556

6. Joint sparse canonical correlation analysis for detecting differential imaging genetics modules.

Authors: Jian Fang; Dongdong Lin; S Charles Schulz; Zongben Xu; Vince D Calhoun; Yu-Ping Wang
Journal: Bioinformatics Date: 2016-07-27 Impact factor: 6.937

7. Gene regulatory network inference using fused LASSO on multiple data sets.

Authors: Nooshin Omranian; Jeanne M O Eloundou-Mbebi; Bernd Mueller-Roeber; Zoran Nikoloski
Journal: Sci Rep Date: 2016-02-11 Impact factor: 4.379

8. Correspondence between fMRI and SNP data by group sparse canonical correlation analysis.

Authors: Dongdong Lin; Vince D Calhoun; Yu-Ping Wang
Journal: Med Image Anal Date: 2013-10-31 Impact factor: 8.545

9. From phenotype to genotype: an association study of longitudinal phenotypic markers to Alzheimer's disease relevant SNPs.

Authors: Hua Wang; Feiping Nie; Heng Huang; Jingwen Yan; Sungeun Kim; Kwangsik Nho; Shannon L Risacher; Andrew J Saykin; Li Shen
Journal: Bioinformatics Date: 2012-09-15 Impact factor: 6.937

10. Discovery and Replication of Gene Influences on Brain Structure Using LASSO Regression.

Authors: Omid Kohannim; Derrek P Hibar; Jason L Stein; Neda Jahanshad; Xue Hua; Priya Rajagopalan; Arthur W Toga; Clifford R Jack; Michael W Weiner; Greig I de Zubicaray; Katie L McMahon; Narelle K Hansell; Nicholas G Martin; Margaret J Wright; Paul M Thompson
Journal: Front Neurosci Date: 2012-08-06 Impact factor: 4.677

11 in total

1. Detecting time-evolving phenotypic topics via tensor factorization on electronic health records: Cardiovascular disease case study.

Authors: Juan Zhao; Yun Zhang; David J Schlueter; Patrick Wu; Vern Eric Kerchberger; S Trent Rosenbloom; Quinn S Wells; QiPing Feng; Joshua C Denny; Wei-Qi Wei
Journal: J Biomed Inform Date: 2019-08-22 Impact factor: 6.317

2. Morbigenous brain region and gene detection with a genetically evolved random neural network cluster approach in late mild cognitive impairment.

Authors: Xia-An Bi; Yingchao Liu; Yiming Xie; Xi Hu; Qinghua Jiang
Journal: Bioinformatics Date: 2020-04-15 Impact factor: 6.937

3. Brain Imaging Genomics: Integrated Analysis and Machine Learning.

Authors: Li Shen; Paul M Thompson
Journal: Proc IEEE Inst Electr Electron Eng Date: 2019-10-29 Impact factor: 10.961

4. Improved Prediction of Cognitive Outcomes via Globally Aligned Imaging Biomarker Enrichments Over Progressions.

Authors: Lyujian Lu; Saad Elbeleidy; Lauren Baker; Hua Wang; Li Shen; Huang Heng
Journal: IEEE Trans Biomed Eng Date: 2021-10-19 Impact factor: 4.538

5. A technical review of canonical correlation analysis for neuroscience applications.

Authors: Xiaowei Zhuang; Zhengshi Yang; Dietmar Cordes
Journal: Hum Brain Mapp Date: 2020-06-27 Impact factor: 5.038

6. Identifying progressive imaging genetic patterns via multi-task sparse canonical correlation analysis: a longitudinal study of the ADNI cohort.

Authors: Lei Du; Kefei Liu; Lei Zhu; Xiaohui Yao; Shannon L Risacher; Lei Guo; Andrew J Saykin; Li Shen
Journal: Bioinformatics Date: 2019-07-15 Impact factor: 6.931

Review 7. The Application of Artificial Intelligence in the Genetic Study of Alzheimer's Disease.

Authors: Rohan Mishra; Bin Li
Journal: Aging Dis Date: 2020-12-01 Impact factor: 6.745

8. A single mode of population covariation associates brain networks structure and behavior and predicts individual subjects' age.

Authors: Brent C McPherson; Franco Pestilli
Journal: Commun Biol Date: 2021-08-05

9. A recursive framework for predicting the time-course of drug sensitivity.

Authors: Cheng Qian; Amin Emad; Nicholas D Sidiropoulos
Journal: Sci Rep Date: 2020-10-19 Impact factor: 4.379

10. Machine learning suggests polygenic risk for cognitive dysfunction in amyotrophic lateral sclerosis.

Authors: Katerina Placek; Michael Benatar; Joanne Wuu; Evadnie Rampersaud; Laura Hennessy; Vivianna M Van Deerlin; Murray Grossman; David J Irwin; Lauren Elman; Leo McCluskey; Colin Quinn; Volkan Granit; Jeffrey M Statland; Ted M Burns; John Ravits; Andrea Swenson; Jon Katz; Erik P Pioro; Carlayne Jackson; James Caress; Yuen So; Samuel Maiser; David Walk; Edward B Lee; John Q Trojanowski; Philip Cook; James Gee; Jin Sha; Adam C Naj; Rosa Rademakers; Wenan Chen; Gang Wu; J Paul Taylor; Corey T McMillan
Journal: EMBO Mol Med Date: 2020-12-03 Impact factor: 14.260