
Assisted gene expression-based clustering with AWNCut.

Yang Li, Ruofan Bie, Sebastian J Teran Hidalgo, Yichen Qin, Mengyun Wu, Shuangge Ma.

Abstract

In research on complex diseases, gene expression (GE) data have been used extensively for clustering samples. The resulting clusters can serve as the basis for disease subtype identification, risk stratification, and many other purposes. Because of the small sample sizes of genetic profiling studies and the noisy nature of GE data, clustering results are often unsatisfactory. A prominent trend in recent studies is multidimensional profiling, which collects data on GEs and their regulators (copy number alterations, microRNAs, methylation, etc.) from the same subjects. Because of the regulatory relationships, regulators contain important information on the properties of GEs. We develop a novel assisted clustering method that uses regulator information to improve clustering analysis of GE data. To account for the fact that not all GEs are informative, we propose a weighting strategy in which the weights are determined in a data-dependent manner and can discriminate informative GEs from noise. The proposed method is built on the NCut technique and implemented using a simulated annealing algorithm. In simulations, it outperforms multiple direct competitors. In the analysis of TCGA cutaneous melanoma and lung adenocarcinoma data, it yields biologically sensible findings that differ from those of the alternatives.
© 2018 John Wiley & Sons, Ltd.
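
For readers unfamiliar with the two building blocks named in the abstract, the following is a minimal sketch of the standard normalized cut (NCut) criterion minimized by simulated annealing. It is not the AWNCut method itself: the GE weights and the regulator-assisted terms are omitted, and the function names (`ncut`, `anneal_ncut`), the toy similarity matrix, and the cooling schedule are illustrative assumptions rather than anything specified in the paper.

```python
import math
import random


def ncut(W, labels, k):
    """Standard normalized cut: sum over clusters of cut(A, rest) / vol(A)."""
    n = len(W)
    total = 0.0
    for c in range(k):
        members = [i for i in range(n) if labels[i] == c]
        # vol(A): total similarity attached to the members of cluster c
        vol = sum(W[i][j] for i in members for j in range(n))
        if vol == 0:
            return math.inf  # empty or isolated cluster: treat as infeasible
        # cut(A, rest): similarity crossing the cluster boundary
        cut = sum(W[i][j] for i in members for j in range(n) if labels[j] != c)
        total += cut / vol
    return total


def anneal_ncut(W, k, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Minimize NCut by moving one sample at a time (hypothetical schedule)."""
    rng = random.Random(seed)
    n = len(W)
    labels = [i % k for i in range(n)]  # arbitrary non-empty starting partition
    cost = ncut(W, labels, k)
    best, best_cost = labels[:], cost
    t = t0
    for _ in range(steps):
        i, new = rng.randrange(n), rng.randrange(k)
        old = labels[i]
        if new != old:
            labels[i] = new
            new_cost = ncut(W, labels, k)
            # Always accept improvements; accept worse moves with
            # probability exp(-increase / temperature).
            if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
                cost = new_cost
                if cost < best_cost:
                    best, best_cost = labels[:], cost
            else:
                labels[i] = old  # reject: undo the move
        t *= cooling  # geometric cooling
    return best, best_cost


# Toy similarity matrix (assumed data): two groups of four samples,
# strongly similar within a group and weakly similar across groups.
groups = [0, 0, 0, 0, 1, 1, 1, 1]
W = [[0.0 if i == j else (1.0 if groups[i] == groups[j] else 0.01)
      for j in range(8)] for i in range(8)]
labels, cost = anneal_ncut(W, k=2)
```

In this sketch the similarity matrix is fixed in advance; in a weighted variant such as the one the abstract describes, the similarities would depend on feature weights that are updated alongside the cluster labels.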

Keywords:  NCut; assisted analysis; clustering; gene expression data

Year: 2018    PMID: 30094873    PMCID: PMC6447298    DOI: 10.1002/sim.7928

Source DB: PubMed    Journal: Stat Med    ISSN: 0277-6715    Impact factor: 2.373

