SPARSE INTEGRATIVE CLUSTERING OF MULTIPLE OMICS DATA SETS.

Ronglai Shen, Sijian Wang, Qianxing Mo.

Abstract

High-resolution microarrays and second-generation sequencing platforms are powerful tools for investigating genome-wide alterations in DNA copy number, methylation, and gene expression associated with a disease. An integrated genomic profiling approach that measures multiple omics data types simultaneously in the same set of biological samples would provide a level of resolution that is not available from any single data type. In this study, we use penalized latent variable regression methods for joint modeling of multiple omics data types to identify common latent variables that can be used to cluster patient samples into biologically and clinically relevant disease subtypes. We consider lasso (Tibshirani, 1996), elastic net (Zou and Hastie, 2005), and fused lasso (Tibshirani et al., 2005) methods to induce sparsity in the coefficient vectors, revealing important genomic features that contribute significantly to the latent variables. An iterative ridge regression is used to compute the sparse coefficient vectors. In model selection, a uniform design (Fang and Wang, 1994) is used to seek "experimental" points that are scattered uniformly across the search domain for efficient sampling of tuning parameter combinations. We compare our method to sparse singular value decomposition (SVD) and penalized Gaussian mixture models (GMM) using both real and simulated data sets. The proposed method is applied to integrate genomic, epigenomic, and transcriptomic data for subtype analysis in breast and lung cancer data sets.
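
The abstract states that an iterative ridge regression is used to compute the sparse coefficient vectors, without giving details. One common way to do this for a lasso penalty is a local quadratic approximation, where each |b_j| is replaced by b_j^2 / (2|b_j_old|) plus a constant, so every step reduces to a weighted ridge solve. The sketch below illustrates that general idea on an ordinary regression problem; it is not the authors' algorithm, and the function name, the `lam`, `eps`, `tol`, and `zero_tol` parameters, and the final thresholding step are all assumptions made for illustration.

```python
import numpy as np

def lasso_via_iterative_ridge(X, y, lam, n_iter=200, eps=1e-8,
                              tol=1e-10, zero_tol=1e-6):
    """Approximately minimize 0.5*||y - X b||^2 + lam*||b||_1
    by solving a sequence of weighted ridge problems (hypothetical helper)."""
    p = X.shape[1]
    XtX = X.T @ X
    Xty = X.T @ y
    # Start from a plain ridge fit so no coefficient begins at exactly zero.
    beta = np.linalg.solve(XtX + lam * np.eye(p), Xty)
    for _ in range(n_iter):
        # Local quadratic approximation of the L1 penalty:
        # the ridge weight for coefficient j is 1 / |b_j_old|
        # (capped via eps to avoid division by zero).
        weights = 1.0 / np.maximum(np.abs(beta), eps)
        beta_new = np.linalg.solve(XtX + lam * np.diag(weights), Xty)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    # Coefficients driven to (numerically) zero are set to exactly zero.
    beta[np.abs(beta) < zero_tol] = 0.0
    return beta

# Usage on simulated data (assumed setup): only the first two features matter.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 10))
y_demo = 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + rng.normal(scale=0.5, size=100)
print(lasso_via_iterative_ridge(X_demo, y_demo, lam=20.0))
```

With an orthonormal design the fixed point of this iteration coincides with the soft-thresholding solution of the lasso, which gives a simple way to sanity-check the sketch.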

Year:  2013        PMID: 24587839      PMCID: PMC3935438          DOI: 10.1214/12-AOAS578

Source DB:  PubMed          Journal:  Ann Appl Stat        ISSN: 1932-6157            Impact factor:   2.083


  40 in total

1.  Variable selection for model-based high-dimensional clustering and its application to microarray data.

Authors:  Sijian Wang; Ji Zhu
Journal:  Biometrics       Date:  2007-10-26       Impact factor: 2.571

2.  Spatial smoothing and hot spot detection for CGH data using the fused lasso.

Authors:  Robert Tibshirani; Pei Wang
Journal:  Biostatistics       Date:  2007-05-18       Impact factor: 5.899

3.  Comprehensive high-throughput arrays for relative methylation (CHARM).

Authors:  Rafael A Irizarry; Christine Ladd-Acosta; Benilton Carvalho; Hao Wu; Sheri A Brandenburg; Jeffrey A Jeddeloh; Bo Wen; Andrew P Feinberg
Journal:  Genome Res       Date:  2008-03-03       Impact factor: 9.043

4.  Nonparametric testing for DNA copy number induced differential mRNA gene expression.

Authors:  Wessel N van Wieringen; Mark A van de Wiel
Journal:  Biometrics       Date:  2008-05-13       Impact factor: 2.571

5.  Distinctive gene expression patterns in human mammary epithelial cells and breast cancers.

Authors:  C M Perou; S S Jeffrey; M van de Rijn; C A Rees; M B Eisen; D T Ross; A Pergamenschikov; C F Williams; S X Zhu; J C Lee; D Lashkari; D Shalon; P O Brown; D Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  1999-08-03       Impact factor: 11.205

6.  Regularized Multivariate Regression for Identifying Master Predictors with Application to Integrative Genomics Study of Breast Cancer.

Authors:  Jie Peng; Ji Zhu; Anna Bergamaschi; Wonshik Han; Dong-Young Noh; Jonathan R Pollack; Pei Wang
Journal:  Ann Appl Stat       Date:  2010-03       Impact factor: 2.083

7.  Development and validation of predictive indices for a continuous outcome using gene expression profiles.

Authors:  Yingdong Zhao; Richard Simon
Journal:  Cancer Inform       Date:  2010-05-07

8.  Integrative analysis of gene expression and copy number alterations using canonical correlation analysis.

Authors:  Charlotte Soneson; Henrik Lilljebjörn; Thoas Fioretos; Magnus Fontes
Journal:  BMC Bioinformatics       Date:  2010-04-15       Impact factor: 3.169

9.  Gene expression profiling identifies clinically relevant subtypes of prostate cancer.

Authors:  Jacques Lapointe; Chunde Li; John P Higgins; Matt van de Rijn; Eric Bair; Kelli Montgomery; Michelle Ferrari; Lars Egevad; Walter Rayford; Ulf Bergerheim; Peter Ekman; Angelo M DeMarzo; Robert Tibshirani; David Botstein; Patrick O Brown; James D Brooks; Jonathan R Pollack
Journal:  Proc Natl Acad Sci U S A       Date:  2004-01-07       Impact factor: 11.205

10.  A prediction-based resampling method for estimating the number of clusters in a dataset.

Authors:  Sandrine Dudoit; Jane Fridlyand
Journal:  Genome Biol       Date:  2002-06-25       Impact factor: 13.583

  36 in total

1.  Integrative linear discriminant analysis with guaranteed error rate improvement.

Authors:  Quefeng Li; Lexin Li
Journal:  Biometrika       Date:  2018-10-22       Impact factor: 2.445

2.  Integrative and regularized principal component analysis of multiple sources of data.

Authors:  Binghui Liu; Xiaotong Shen; Wei Pan
Journal:  Stat Med       Date:  2016-01-12       Impact factor: 2.373

3.  Assisted gene expression-based clustering with AWNCut.

Authors:  Yang Li; Ruofan Bie; Sebastian J Teran Hidalgo; Yichen Qin; Mengyun Wu; Shuangge Ma
Journal:  Stat Med       Date:  2018-08-09       Impact factor: 2.373

4.  Statistical Contributions to Bioinformatics: Design, Modeling, Structure Learning, and Integration.

Authors:  Jeffrey S Morris; Veerabhadran Baladandayuthapani
Journal:  Stat Modelling       Date:  2017-06-15       Impact factor: 2.039

5.  Deep-learning approach to identifying cancer subtypes using high-dimensional genomic data.

Authors:  Runpu Chen; Le Yang; Steve Goodison; Yijun Sun
Journal:  Bioinformatics       Date:  2020-03-01       Impact factor: 6.937

6.  Integrative Analysis of "-Omics" Data Using Penalty Functions.

Authors:  Qing Zhao; Xingjie Shi; Jian Huang; Jin Liu; Yang Li; Shuangge Ma
Journal:  Wiley Interdiscip Rev Comput Stat       Date:  2015 Jan-Feb

7.  Simultaneous Covariance Inference for Multimodal Integrative Analysis.

Authors:  Yin Xia; Lexin Li; Samuel N Lockhart; William J Jagust
Journal:  J Am Stat Assoc       Date:  2019-06-28       Impact factor: 5.033

8.  Integrative clustering methods for high-dimensional molecular data.

Authors:  Prabhakar Chalise; Devin C Koestler; Milan Bimali; Qing Yu; Brooke L Fridley
Journal:  Transl Cancer Res       Date:  2014-06-01       Impact factor: 1.241

9.  Semi-supervised identification of cancer subgroups using survival outcomes and overlapping grouping information.

Authors:  Wei Wei; Zequn Sun; Willian A da Silveira; Zhenning Yu; Andrew Lawson; Gary Hardiman; Linda E Kelemen; Dongjun Chung
Journal:  Stat Methods Med Res       Date:  2018-01-16       Impact factor: 3.021

10.  Nonlinear Joint Latent Variable Models and Integrative Tumor Subtype Discovery.

Authors:  Binghui Liu; Xiaotong Shen; Wei Pan
Journal:  Stat Anal Data Min       Date:  2016-03-28       Impact factor: 1.051
