Comparison of Penalty Functions for Sparse Canonical Correlation Analysis.

Prabhakar Chalise, Brooke L Fridley.

Abstract

Canonical correlation analysis (CCA) is a widely used multivariate method for assessing the association between two sets of variables. However, when the number of variables far exceeds the number of subjects, as in large-scale genomic studies, the traditional CCA method is not appropriate. In addition, when the variables are highly correlated, the sample covariance matrices become unstable or undefined. To overcome these two issues, sparse canonical correlation analysis (SCCA) for multiple data sets has been proposed using a Lasso type of penalty. However, these methods do not have direct control over the sparsity of the solution. An additional step that uses the Bayesian Information Criterion (BIC) has also been suggested to further filter out unimportant features. In this paper, a comparison of four penalty functions (Lasso, Elastic-net, SCAD and Hard-threshold) for SCCA, with and without the BIC filtering step, has been carried out using both real and simulated genotypic and mRNA expression data. This study indicates that the SCAD penalty with BIC filter would be a preferable penalty function for application of SCCA to genomic data.
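For readers unfamiliar with the four penalties named in the abstract, the sketch below shows the scalar thresholding rule commonly associated with each one. These are the textbook forms and are an assumption here; the paper's exact formulations, parameterizations and tuning values may differ. The function names, the elastic-net scaling by `1 + lam2`, and the SCAD default `a = 3.7` are illustrative choices, not taken from the abstract.

```python
def soft_threshold(z, lam):
    """Lasso-type rule: shrink z toward zero by lam; values within lam of 0 become 0."""
    shrunk = abs(z) - lam
    if shrunk <= 0:
        return 0.0
    return shrunk if z > 0 else -shrunk


def hard_threshold(z, lam):
    """Hard-threshold rule: keep z unchanged if |z| > lam, otherwise set it to 0."""
    return z if abs(z) > lam else 0.0


def elastic_net_threshold(z, lam1, lam2):
    """Elastic-net-type rule (assumed naive form): soft-threshold, then scale by 1/(1 + lam2)."""
    return soft_threshold(z, lam1) / (1.0 + lam2)


def scad_threshold(z, lam, a=3.7):
    """SCAD-type rule: soft-threshold small values, leave large values unshrunk,
    and interpolate linearly in between. Requires a > 2."""
    if a <= 2:
        raise ValueError("SCAD requires a > 2")
    az = abs(z)
    if az <= 2 * lam:
        return soft_threshold(z, lam)
    if az <= a * lam:
        sign = 1.0 if z > 0 else -1.0
        return ((a - 1) * z - sign * a * lam) / (a - 2)
    return z


if __name__ == "__main__":
    # Apply each rule to the same toy loadings to compare how they shrink.
    loadings = [-5.0, -1.5, 0.5, 3.0]
    for name, rule in [
        ("lasso", lambda z: soft_threshold(z, 1.0)),
        ("elastic-net", lambda z: elastic_net_threshold(z, 1.0, 1.0)),
        ("scad", lambda z: scad_threshold(z, 1.0)),
        ("hard", lambda z: hard_threshold(z, 1.0)),
    ]:
        print(name, [round(rule(z), 3) for z in loadings])
```

In this sketch the Lasso and elastic-net rules shrink every surviving value, the hard-threshold rule shrinks none, and the SCAD rule shrinks small values while leaving values beyond `a * lam` untouched.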

Year: 2012   PMID: 21984855   PMCID: PMC3185379   DOI: 10.1016/j.csda.2011.07.012

Source DB: PubMed   Journal: Comput Stat Data Anal   ISSN: 0167-9473   Impact factor: 1.681


  7 in total

1.  ATHENA: the analysis tool for heritable and environmental network associations.

Authors:  Emily R Holzinger; Scott M Dudek; Alex T Frase; Sarah A Pendergrass; Marylyn D Ritchie
Journal:  Bioinformatics       Date:  2013-10-21       Impact factor: 6.937

2.  Integrative analysis of transcriptomic and metabolomic data via sparse canonical correlation analysis with incorporation of biological information.

Authors:  Sandra E Safo; Shuzhao Li; Qi Long
Journal:  Biometrics       Date:  2017-05-08       Impact factor: 2.571

3.  Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study.

Authors:  Theodoulos Rodosthenous; Vahid Shahrezaei; Marina Evangelou
Journal:  Bioinformatics       Date:  2020-11-01       Impact factor: 6.937

4.  Population level inference for multivariate MEG analysis.

Authors:  Anna Jafarpour; Gareth Barnes; Lluis Fuentemilla; Emrah Duzel; Will D Penny
Journal:  PLoS One       Date:  2013-08-05       Impact factor: 3.240

5.  Group sparse canonical correlation analysis for genomic data integration.

Authors:  Dongdong Lin; Jigang Zhang; Jingyao Li; Vince D Calhoun; Hong-Wen Deng; Yu-Ping Wang
Journal:  BMC Bioinformatics       Date:  2013-08-12       Impact factor: 3.169

6.  Robust sparse canonical correlation analysis.

Authors:  Ines Wilms; Christophe Croux
Journal:  BMC Syst Biol       Date:  2016-08-11

7.  Multivariate association between single-nucleotide polymorphisms in Alzgene linkage regions and structural changes in the brain: discovery, refinement and validation.

Authors:  Elena Szefer; Donghuan Lu; Farouk Nathoo; Mirza Faisal Beg; Jinko Graham
Journal:  Stat Appl Genet Mol Biol       Date:  2017-11-27
