Penalized model-based clustering with unconstrained covariance matrices.

Abstract

Clustering is one of the most useful tools for high-dimensional analysis, e.g., for microarray data. It becomes challenging in the presence of a large number of noise variables, which may mask underlying clustering structures. Therefore, noise removal through variable selection is necessary. One effective way is regularization for simultaneous parameter estimation and variable selection in model-based clustering. However, existing methods focus on regularizing the mean parameters representing centers of clusters and ignore dependencies among variables within clusters, which leads to incorrect orientations or shapes of the resulting clusters. In this article, we propose a regularized Gaussian mixture model permitting a treatment of general covariance matrices, taking various dependencies into account. At the same time, this approach shrinks the means and covariance matrices, achieving better clustering and variable selection. To overcome one technical challenge in estimating possibly large covariance matrices, we derive an EM algorithm utilizing the graphical lasso (Friedman et al., 2007) for parameter estimation. Numerical examples, including applications to microarray gene expression data, demonstrate the utility of the proposed method.
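
As a rough illustration of the kind of objective the abstract describes, the sketch below writes a Gaussian mixture log-likelihood with L1 penalties on the cluster means and on the entries of the cluster-specific inverse covariance matrices. The notation (K clusters, p variables, mixing proportions \pi_k, precision matrices W_k, tuning parameters \lambda_1 and \lambda_2, E-step weights \tau_{ik}) and the exact penalty form are assumptions made for illustration; the abstract itself does not state them.

```latex
% Penalized log-likelihood (assumed form): Gaussian mixture likelihood
% minus L1 penalties on the means and on the precision-matrix entries.
\log L_P(\Theta)
  = \sum_{i=1}^{n} \log\Big[ \sum_{k=1}^{K} \pi_k \,
      \phi(x_i;\, \mu_k, \Sigma_k) \Big]
  - \lambda_1 \sum_{k=1}^{K} \sum_{j=1}^{p} |\mu_{kj}|
  - \lambda_2 \sum_{k=1}^{K} \sum_{j,l=1}^{p} |W_{k,jl}|,
  \qquad W_k = \Sigma_k^{-1}.

% M-step for W_k with the means held fixed, where
%   n_k = \sum_i \tau_{ik}, and
%   S_k = \frac{1}{n_k} \sum_i \tau_{ik} (x_i - \mu_k)(x_i - \mu_k)^{\top}.
% Dividing the expected complete-data objective by n_k / 2 gives a
% graphical-lasso problem with penalty 2 \lambda_2 / n_k:
\hat{W}_k = \arg\max_{W \succ 0}
  \Big\{ \log\det W - \operatorname{tr}(S_k W)
         - \frac{2\lambda_2}{n_k} \sum_{j,l=1}^{p} |W_{jl}| \Big\}.
```

Under these assumptions, a variable whose mean is shrunk to zero in every cluster (after standardization) and which is not linked to other variables through the estimated precision matrices would carry no clustering information, which is how shrinkage can double as variable selection.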

Year: 2009 PMID: 20463857 PMCID: PMC2867492 DOI: 10.1214/09-EJS487

Source DB: PubMed Journal: Electron J Stat ISSN: 1935-7524 Impact factor: 1.125
