
The cluster graphical lasso for improved estimation of Gaussian graphical models.

Kean Ming Tan, Daniela Witten, Ali Shojaie.

Abstract

The task of estimating a Gaussian graphical model in the high-dimensional setting is considered. The graphical lasso, which involves maximizing the Gaussian log likelihood subject to a lasso penalty, is a well-studied approach for this task. A surprising connection between the graphical lasso and hierarchical clustering is introduced: the graphical lasso in effect performs a two-step procedure, in which (1) single linkage hierarchical clustering is performed on the variables in order to identify connected components, and then (2) a penalized log likelihood is maximized on the subset of variables within each connected component. Thus, the graphical lasso determines the connected components of the estimated network via single linkage clustering. Single linkage clustering is known to perform poorly in certain finite-sample settings. Therefore, the cluster graphical lasso is proposed: the features are first clustered using an alternative to single linkage clustering, and the graphical lasso is then performed on the subset of variables within each cluster. Model selection consistency for this technique is established, and its improved performance relative to the graphical lasso is demonstrated in a simulation study, as well as in applications to a university webpage data set and a gene expression data set.
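
Step (1) of the two-step procedure can be illustrated with a minimal sketch. It assumes the known covariance-thresholding characterization, which the abstract itself does not spell out: two variables are joined whenever the absolute value of their sample covariance exceeds the tuning parameter, and the connected components of the resulting graph coincide with the clusters obtained by cutting a single linkage dendrogram at that level. The function name `thresholded_components`, the matrix `S`, and the value of `lam` are hypothetical and chosen only for illustration.

```python
def thresholded_components(S, lam):
    """Return the connected components of the graph that has an edge i-j
    whenever |S[i][j]| > lam, where S is a symmetric covariance matrix
    given as a list of lists. (Illustrative helper, not the authors' code.)"""
    p = len(S)
    parent = list(range(p))  # union-find forest over the p variables

    def find(i):
        # Follow parent pointers to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge the components of every pair whose covariance exceeds lam.
    for i in range(p):
        for j in range(i + 1, p):
            if abs(S[i][j]) > lam:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[rj] = ri

    # Collect variable indices by the root of their component.
    groups = {}
    for i in range(p):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical 4-variable sample covariance matrix (made-up values).
S = [
    [1.0, 0.6, 0.1, 0.0],
    [0.6, 1.0, 0.5, 0.1],
    [0.1, 0.5, 1.0, 0.05],
    [0.0, 0.1, 0.05, 1.0],
]
components = thresholded_components(S, lam=0.3)
print(components)
```

Under these assumptions, step (2) would fit a penalized log likelihood separately on each returned group of variables. The cluster graphical lasso described above would replace this thresholding step with a different clustering of the features before the per-cluster fits.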


Keywords:  hierarchical clustering; high-dimensional setting; network; single linkage clustering; sparsity

Year:  2015        PMID: 25642008      PMCID: PMC4307846          DOI: 10.1016/j.csda.2014.11.015

Source DB:  PubMed          Journal:  Comput Stat Data Anal        ISSN: 0167-9473            Impact factor:   1.681


