A sparse PLS for variable selection when integrating omics data.

Kim-Anh Lê Cao, Debra Rossouw, Christèle Robert-Granié, Philippe Besse.

Abstract

Recent biotechnology advances allow for multiple types of omics data, such as transcriptomic, proteomic or metabolomic data sets, to be integrated. The problem of feature selection has been addressed several times in the context of classification, but needs to be handled in a specific manner when integrating data. In this study, we focus on the integration of two-block data that are measured on the same samples. Our goal is to combine integration and simultaneous variable selection of the two data sets in a one-step procedure using a Partial Least Squares regression (PLS) variant to facilitate the biologists' interpretation. A novel computational methodology called "sparse PLS" is introduced for a predictive analysis to deal with these newly arisen problems. The sparsity of our approach is achieved with a Lasso penalization of the PLS loading vectors when computing the Singular Value Decomposition. Sparse PLS is shown to be effective and biologically meaningful. Comparisons with classical PLS are performed on a simulated data set and on real data sets. On one data set, a thorough biological interpretation of the obtained results is provided. We show that sparse PLS provides a valuable variable selection tool for highly dimensional data sets.
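
The abstract's central idea, a Lasso penalty applied to the PLS loading vectors during the Singular Value Decomposition, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the common formulation in which the Lasso penalty amounts to soft-thresholding the loading vectors inside an alternating (power-iteration) update on the cross-product matrix X^T Y. The function names `soft_threshold` and `sparse_pls_component` and the penalty parameters `lam_u` and `lam_v` are hypothetical, and only the first component is computed (the deflation step needed for further components is omitted).

```python
import numpy as np


def soft_threshold(w, lam):
    """Lasso-style soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


def _normalize(w):
    """Scale w to unit Euclidean norm; leave an all-zero vector unchanged."""
    norm = np.linalg.norm(w)
    return w / norm if norm > 0 else w


def sparse_pls_component(X, Y, lam_u=0.0, lam_v=0.0, n_iter=100, tol=1e-8):
    """Sketch of one pair of sparse loading vectors (u for X, v for Y).

    X has shape (n, p) and Y has shape (n, q), measured on the same n samples.
    lam_u and lam_v are assumed penalty levels; 0 gives the ordinary
    leading singular vectors of X^T Y.
    """
    M = X.T @ Y                                   # cross-product matrix (p, q)
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    u, v = U[:, 0], Vt[0, :]                      # start from the plain SVD
    for _ in range(n_iter):
        # Penalized update of the X loadings, then of the Y loadings.
        u_new = _normalize(soft_threshold(M @ v, lam_u))
        v_new = _normalize(soft_threshold(M.T @ u_new, lam_v))
        converged = np.linalg.norm(u_new - u) < tol
        u, v = u_new, v_new
        if converged:
            break
    return u, v
```

Variables whose entry in `u` (or `v`) is exactly zero are the ones dropped from the X (or Y) block, which is how a penalty of this kind performs variable selection in both data sets at once. In practice the penalty levels would need to be tuned, for example by cross-validation.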

Year:  2008        PMID: 19049491     DOI: 10.2202/1544-6115.1390

Source DB:  PubMed          Journal:  Stat Appl Genet Mol Biol        ISSN: 1544-6115


  122 in total

1.  Human behavioral informatics in genetic studies of neuropsychiatric disease: multivariate profile-based analysis.

Authors:  Cinnamon S Bloss; Kelly M Schiabor; Nicholas J Schork
Journal:  Brain Res Bull       Date:  2010-04-28       Impact factor: 4.077

2.  Sparse partial least squares classification for high dimensional data.

Authors:  Dongjun Chung; Sunduz Keles
Journal:  Stat Appl Genet Mol Biol       Date:  2010-03-03

3.  Methylation potential associated with diet, genotype, protein, and metabolite levels in the Delta Obesity Vitamin Study.

Authors:  Jacqueline Pontes Monteiro; Carolyn Wise; Melissa J Morine; Candee Teitel; Lisa Pence; Anna Williams; Beverly McCabe-Sellers; Catherine Champagne; Jerome Turner; Beatrice Shelby; Baitang Ning; Joan Oguntimein; Lauren Taylor; Terri Toennessen; Corrado Priami; Richard D Beger; Margaret Bogle; Jim Kaput
Journal:  Genes Nutr       Date:  2014-04-24       Impact factor: 5.523

4.  integrOmics: an R package to unravel relationships between two omics datasets.

Authors:  Kim-Anh Lê Cao; Ignacio González; Sébastien Déjean
Journal:  Bioinformatics       Date:  2009-08-25       Impact factor: 6.937

5.  Association of repeatedly measured intermediate risk factors for complex diseases with high dimensional SNP data.

Authors:  Sandra Waaijenborg; Aeilko H Zwinderman
Journal:  Algorithms Mol Biol       Date:  2010-02-11       Impact factor: 1.405

6.  Integrative analysis of gene expression and copy number alterations using canonical correlation analysis.

Authors:  Charlotte Soneson; Henrik Lilljebjörn; Thoas Fioretos; Magnus Fontes
Journal:  BMC Bioinformatics       Date:  2010-04-15       Impact factor: 3.169

7.  Integrative mixture of experts to combine clinical factors and gene markers.

Authors:  Kim-Anh Lê Cao; Emmanuelle Meugnier; Geoffrey J McLachlan
Journal:  Bioinformatics       Date:  2010-03-11       Impact factor: 6.937

8.  SlimPLS: a method for feature selection in gene expression-based disease classification.

Authors:  Michael Gutkin; Ron Shamir; Gideon Dror
Journal:  PLoS One       Date:  2009-07-29       Impact factor: 3.240

9.  Sparse canonical correlation analysis for identifying, connecting and completing gene-expression networks.

Authors:  Sandra Waaijenborg; Aeilko H Zwinderman
Journal:  BMC Bioinformatics       Date:  2009-09-28       Impact factor: 3.169

10.  Predicting qualitative phenotypes from microarray data - the Eadgene pig data set.

Authors:  Christèle Robert-Granié; Kim-Anh Lê Cao; Magali Sancristobal
Journal:  BMC Proc       Date:  2009-07-16