Clustering threshold gradient descent regularization: with applications to microarray studies.

Shuangge Ma, Jian Huang.

Abstract

MOTIVATION: An important goal of microarray studies is to discover genes that are associated with clinical outcomes, such as disease status and patient survival. While a typical experiment surveys gene expressions on a global scale, there may be only a small number of genes that have significant influence on a clinical outcome. Moreover, expression data have cluster structures and the genes within a cluster have correlated expressions and coordinated functions, but the effects of individual genes in the same cluster may be different. Accordingly, we seek to build statistical models with the following properties. First, the model is sparse in the sense that only a subset of the parameter vector is non-zero. Second, the cluster structures of gene expressions are properly accounted for.
RESULTS: For gene expression data without pathway information, we divide genes into clusters using commonly used methods, such as K-means or hierarchical approaches. The optimal number of clusters is determined using the Gap statistic. We propose a clustering threshold gradient descent regularization (CTGDR) method for simultaneous cluster selection and within-cluster gene selection. We apply this method to binary classification and censored survival analysis. Compared to the standard TGDR and other regularization methods, the CTGDR takes into account the cluster structure and carries out feature selection at both the cluster level and the within-cluster gene level. We demonstrate the CTGDR on two studies of cancer classification and two studies correlating survival of lymphoma patients with microarray expressions.
AVAILABILITY: R code is available upon request.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
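
ILLUSTRATION: The abstract does not spell out the CTGDR update rule, so the following is only a minimal sketch of what selection "at both the cluster level and within-cluster gene level" could look like inside one thresholded gradient step. Everything in it is an assumption made for illustration and is not taken from the authors' R code: the function name `two_level_threshold_step`, the thresholds `tau_cluster` and `tau_gene`, the use of the Euclidean norm as the cluster-level score, and the fixed step size `step`. Cluster labels are assumed to come from an earlier clustering step such as K-means.

```python
import math


def two_level_threshold_step(beta, grad, clusters, tau_cluster, tau_gene, step):
    """Return updated coefficients after one two-level thresholded gradient step.

    beta     : current coefficients, one per gene
    grad     : gradient of the objective (ascent direction), one per gene
    clusters : cluster label of each gene (e.g. from K-means)
    tau_*    : thresholds in [0, 1]; larger values give sparser updates
    """
    # Group gene indices by cluster label.
    members = {}
    for j, label in enumerate(clusters):
        members.setdefault(label, []).append(j)

    # Cluster-level score: Euclidean norm of the gradient inside the cluster
    # (an assumed choice; other norms would also be plausible).
    score = {
        label: math.sqrt(sum(grad[j] ** 2 for j in idx))
        for label, idx in members.items()
    }
    top_score = max(score.values())

    new_beta = list(beta)
    for label, idx in members.items():
        # Cluster-level selection: skip clusters with a relatively small gradient.
        if top_score == 0 or score[label] < tau_cluster * top_score:
            continue
        # Within-cluster selection: update only genes whose gradient is large
        # relative to the largest gradient in the same cluster.
        local_max = max(abs(grad[j]) for j in idx)
        for j in idx:
            if abs(grad[j]) >= tau_gene * local_max:
                new_beta[j] += step * grad[j]
    return new_beta
```

With both thresholds set to 0 every coefficient moves along its gradient, so the step reduces to ordinary gradient ascent. Raising `tau_cluster` drops whole clusters from the update, and raising `tau_gene` drops individual genes inside the clusters that remain.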

Year:  2006        PMID: 17182700     DOI: 10.1093/bioinformatics/btl632

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937

