Sparse Biclustering of Transposable Data.

Kean Ming Tan, Daniela M Witten.

Abstract

We consider the task of simultaneously clustering the rows and columns of a large transposable data matrix. We assume that the matrix elements are normally distributed with a bicluster-specific mean term and a common variance, and perform biclustering by maximizing the corresponding log likelihood. We apply an ℓ1 penalty to the means of the biclusters in order to obtain sparse and interpretable biclusters. Our proposal amounts to a sparse, symmetrized version of k-means clustering. We show that k-means clustering of the rows and of the columns of a data matrix can be seen as special cases of our proposal, and that a relaxation of our proposal yields the singular value decomposition. In addition, we propose a framework for biclustering based on the matrix-variate normal distribution. We demonstrate the performance of our proposals in a simulation study and on a gene expression data set. This article has supplementary material online.
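
The abstract does not spell out an algorithm, so the following is only a minimal sketch of the idea it describes. It assumes the penalized Gaussian log likelihood reduces to half the within-bicluster squared error plus λ times the sum of absolute bicluster means, and that this is minimized by alternating updates of the means, the row labels, and the column labels. The function names, the `lam` parameter, the random initialization, and the fixed iteration count are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def soft_threshold(a, t):
    """Shrink a toward zero by t; the closed-form minimizer for an l1 penalty."""
    return np.sign(a) * np.maximum(np.abs(a) - t, 0.0)

def update_means(X, rows, cols, K, R, lam):
    """Penalized mean of each (row cluster k, column cluster r) block.

    Minimizing 0.5 * sum((x - mu)^2) + lam * |mu| over a block with m entries
    gives mu = soft_threshold(block mean, lam / m). Empty blocks keep mean 0.
    """
    mu = np.zeros((K, R))
    for k in range(K):
        for r in range(R):
            block = X[np.ix_(rows == k, cols == r)]
            if block.size:
                mu[k, r] = soft_threshold(block.mean(), lam / block.size)
    return mu

def sparse_bicluster(X, K, R, lam, n_iter=20, seed=0):
    """Alternate between mean, row-label, and column-label updates (assumed scheme)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rows = rng.integers(K, size=n)   # random initial row labels
    cols = rng.integers(R, size=p)   # random initial column labels
    for _ in range(n_iter):
        mu = update_means(X, rows, cols, K, R, lam)
        # Assign each row to the row cluster with the smallest squared error.
        row_cost = ((X[:, None, :] - mu[:, cols][None, :, :]) ** 2).sum(axis=2)
        rows = row_cost.argmin(axis=1)
        mu = update_means(X, rows, cols, K, R, lam)
        # Assign each column to the column cluster with the smallest squared error.
        col_cost = ((X.T[:, None, :] - mu[rows, :].T[None, :, :]) ** 2).sum(axis=2)
        cols = col_cost.argmin(axis=1)
    return rows, cols, update_means(X, rows, cols, K, R, lam)
```

With `lam = 0` the mean update reduces to the plain block average, which matches the abstract's description of the method as a symmetrized k-means; larger values of `lam` push more bicluster means exactly to zero.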

Keywords:  Clustering; Gene expression; Matrix-variate normal distribution; Unsupervised learning; ℓ1 penalty

Year:  2014        PMID: 25364221      PMCID: PMC4212513          DOI: 10.1080/10618600.2013.852554

Source DB:  PubMed          Journal:  J Comput Graph Stat        ISSN: 1061-8600            Impact factor:   2.302


