
Simultaneous supervised clustering and feature selection over a graph.

Xiaotong Shen, Hsin-Cheng Huang, Wei Pan.

Abstract

In this article, we propose a regression method for simultaneous supervised clustering and feature selection over a given undirected graph. The method estimates homogeneous groups, or clusters, together with informative predictors; each predictor corresponds to one node in the graph, and a connecting path indicates a priori possible grouping among the corresponding predictors. The method seeks a parsimonious model with high predictive power by identifying and collapsing homogeneous groups of regression coefficients. To address computational challenges, we present an efficient algorithm integrating augmented Lagrange multipliers, coordinate descent and difference-of-convex methods. We prove that the proposed method not only identifies the true homogeneous groups and informative features consistently but also leads to accurate parameter estimation. A gene network dataset is analysed to demonstrate that the method can make a difference by exploring dependency structures among the genes.
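The kind of objective described above can be illustrated with a minimal sketch. The abstract does not state the penalty, so everything here is an assumption rather than the authors' formulation. The sketch uses a truncated L1 penalty, min(|z|/tau, 1), on each coefficient and on each coefficient difference across a graph edge. This penalty is nonconvex but can be written as a difference of two convex functions, |z|/tau - max(|z|/tau - 1, 0). The function names, the chain graph and the tuning values `lam1`, `lam2` and `tau` are hypothetical. The code only evaluates the objective; it does not implement the augmented Lagrangian or coordinate descent solver.

```python
import numpy as np

def truncated_l1(z, tau):
    """Truncated L1 penalty min(|z| / tau, 1); equals 1 once |z| >= tau."""
    return np.minimum(np.abs(z) / tau, 1.0)

def graph_penalized_objective(beta, X, y, edges, lam1, lam2, tau):
    """Least-squares loss plus an assumed sparsity penalty on each coefficient
    and an assumed grouping penalty on differences across graph edges."""
    n = X.shape[0]
    residual = y - X @ beta
    loss = residual @ residual / (2.0 * n)
    sparsity = truncated_l1(beta, tau).sum()
    i, j = np.asarray(edges).T          # endpoints of each edge
    grouping = truncated_l1(beta[i] - beta[j], tau).sum()
    return loss + lam1 * sparsity + lam2 * grouping

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
edges = [(0, 1), (1, 2), (2, 3)]        # hypothetical chain graph on 4 predictors

beta_grouped = np.array([1.0, 1.0, 0.0, 0.0])     # two homogeneous groups
beta_scattered = np.array([1.2, 0.8, 0.1, -0.1])  # no exact grouping or zeros
y = X @ beta_grouped                               # noiseless response

obj_grouped = graph_penalized_objective(beta_grouped, X, y, edges, 0.1, 0.1, 0.05)
obj_scattered = graph_penalized_objective(beta_scattered, X, y, edges, 0.1, 0.1, 0.05)
print(obj_grouped, obj_scattered)
```

Under these assumptions the grouped, sparse coefficient vector is penalised for two nonzero coefficients and one edge with unequal endpoints, while the scattered vector pays for all four coefficients and all three edges, so the objective favours collapsing neighbouring coefficients to a common value.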

Keywords:  Expression quantitative trait loci data; High-dimensional data; Nonconvex minimization; Prediction

Year:  2012        PMID: 23843673      PMCID: PMC3629856          DOI: 10.1093/biomet/ass038

Source DB:  PubMed          Journal:  Biometrika        ISSN: 0006-3444            Impact factor:   2.445


  5 in total

1.  Feature Grouping and Selection Over an Undirected Graph.

Authors:  Sen Yang; Lei Yuan; Ying-Cheng Lai; Xiaotong Shen; Peter Wonka; Jieping Ye
Journal:  KDD       Date:  2012

2.  Statistical Contributions to Bioinformatics: Design, Modeling, Structure Learning, and Integration.

Authors:  Jeffrey S Morris; Veerabhadran Baladandayuthapani
Journal:  Stat Modelling       Date:  2017-06-15       Impact factor: 2.039

3.  A significance test for graph-constrained estimation.

Authors:  Sen Zhao; Ali Shojaie
Journal:  Biometrics       Date:  2015-09-22       Impact factor: 2.571

4.  The Cluster Elastic Net for High-Dimensional Regression With Unknown Variable Grouping.

Authors:  Daniela M Witten; Ali Shojaie; Fan Zhang
Journal:  Technometrics       Date:  2014-02-20

5.  Provable Convex Co-clustering of Tensors.

Authors:  Eric C Chi; Brian R Gaines; Will Wei Sun; Hua Zhou; Jian Yang
Journal:  J Mach Learn Res       Date:  2020       Impact factor: 5.177

