Integrative clustering of multi-level omics data for disease subtype discovery using sequential double regularization.

Sunghwan Kim, Steffi Oesterreich, Seyoung Kim, Yongseok Park, George C Tseng.

Abstract

With the rapid advances in microarray and massively parallel sequencing technologies, data from multiple omics sources on large patient cohorts are now common in consortium studies. Effective integration of multi-level omics data poses new statistical challenges. One important biological objective of such integrative analysis is to cluster patients in order to identify clinically relevant disease subtypes, which will form the basis for tailored treatment and personalized medicine. Several methods have been proposed in the literature for this purpose, including the popular iCluster method used in many cancer applications. When clustering high-dimensional omics data, effective feature selection is critical for better clustering accuracy and biological interpretation. It is also common for a portion of "scattered samples" to have patterns distinct from all major clusters; such samples should not be assigned to any cluster, as they may represent a rare disease subcategory or be in transition between disease subtypes. In this paper, we first propose to improve feature selection in the iCluster factor model with an overlapping sparse group lasso penalty on the omics features, using prior knowledge of inter-omics regulatory flows. We then perform regularization over samples to allow clustering with scattered samples and to generate tight clusters. The proposed group structured tight iCluster method is evaluated on two real breast cancer examples and on simulations to demonstrate its improved clustering accuracy, biological interpretation, and ability to generate coherent tight clusters.
© The Author 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
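
As a rough illustration of the kind of penalty the abstract names, the sketch below evaluates a generic overlapping sparse group lasso penalty, lam_l1 * ||w||_1 + lam_group * sum over groups g of sqrt(|g|) * ||w_g||_2, where a feature may belong to more than one group. This is the textbook form, not necessarily the exact formulation used in the paper; the function name, the weights `lam_l1` and `lam_group`, the sqrt(|g|) group weighting, and the toy groups are all assumptions made for this example.

```python
import numpy as np

def overlapping_sparse_group_lasso_penalty(w, groups, lam_l1, lam_group):
    """Evaluate lam_l1 * ||w||_1 + lam_group * sum_g sqrt(|g|) * ||w_g||_2.

    `groups` is a list of index tuples; indices may appear in several groups.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    w = np.asarray(w, dtype=float)
    # Element-wise sparsity term: encourages individual features to be zero.
    l1_term = np.abs(w).sum()
    # Group term: encourages whole groups of features to be zero together.
    group_term = sum(
        np.sqrt(len(g)) * np.linalg.norm(w[list(g)]) for g in groups
    )
    return lam_l1 * l1_term + lam_group * group_term

# Toy example (assumed data): 5 features, feature 2 is shared by both groups,
# e.g. a feature linked to two regulatory modules.
w = np.array([3.0, 4.0, 0.0, 0.0, 0.0])
groups = [(0, 1, 2), (2, 3, 4)]
penalty = overlapping_sparse_group_lasso_penalty(
    w, groups, lam_l1=1.0, lam_group=1.0
)
print(penalty)
```

In this toy case the second group contributes nothing because all of its loadings are zero, which is the behavior a group penalty rewards: entire groups of features can drop out of the model at once, while the L1 term still allows individual features inside an active group to be zeroed.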

Keywords:  Group structured lasso; Integrative clustering (iCluster); Penalized EM-algorithm; The Cancer Genome Atlas (TCGA)

Year:  2016        PMID: 27549122      PMCID: PMC5255053          DOI: 10.1093/biostatistics/kxw039

Source DB:  PubMed          Journal:  Biostatistics        ISSN: 1465-4644            Impact factor:   5.899


