
A comparative analysis of biclustering algorithms for gene expression data.

Kemal Eren, Mehmet Deveci, Onur Küçüktunç, Ümit V Çatalyürek.

Abstract

The need to analyze high-dimensional biological data is driving the development of new data mining methods. Biclustering algorithms have been successfully applied to gene expression data to discover local patterns, in which a subset of genes exhibits similar expression levels over a subset of conditions. However, it is not clear which algorithms are best suited for this task. Many algorithms have been published in the past decade, most of which have been compared only to a small number of other algorithms. Surveys and comparisons exist in the literature, but because of the large number and variety of biclustering algorithms, they quickly become outdated. In this article, we partially address the problem of evaluating the strengths and weaknesses of existing biclustering methods. We used the BiBench package to compare 12 algorithms, many of which were recently published or have not been extensively studied. The algorithms were tested on a suite of synthetic data sets to measure their performance under varying conditions, such as different bicluster models, different noise levels, different numbers of biclusters and overlapping biclusters. The algorithms were also tested on eight large gene expression data sets obtained from the Gene Expression Omnibus. Gene Ontology enrichment analysis was performed on the resulting biclusters, and the best enrichment terms are reported. Our analyses show that the biclustering method and its parameters should be selected based on the desired model, whether that model allows overlapping biclusters, and its robustness to noise. In addition, we observe that the biclustering algorithms capable of finding more than one model are more successful at capturing biologically relevant clusters.
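The synthetic-data evaluation described above can be illustrated with a minimal sketch: implant one constant bicluster into Gaussian noise, then score how well a reported bicluster matches the implanted one. This is not the BiBench API. The function names (`make_synthetic`, `jaccard`, `recovery`), the constant-bicluster model, and the use of the Jaccard index as the match score are all assumptions made for illustration; the abstract does not state which scores were used.

```python
import random


def make_synthetic(n_rows, n_cols, bic_rows, bic_cols,
                   level=5.0, noise_sd=1.0, seed=0):
    """Gaussian-noise matrix (list of lists) with one constant
    bicluster implanted on bic_rows x bic_cols.
    Hypothetical helper, not part of BiBench."""
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, noise_sd) for _ in range(n_cols)]
            for _ in range(n_rows)]
    for i in bic_rows:
        for j in bic_cols:
            data[i][j] += level  # shift the implanted cells upward
    return data


def cells(rows, cols):
    """Set of (row, column) cells covered by a bicluster."""
    return {(i, j) for i in rows for j in cols}


def jaccard(b1, b2):
    """Jaccard index of two biclusters, each given as (rows, cols)."""
    a, b = cells(*b1), cells(*b2)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def recovery(expected, found):
    """Average, over expected biclusters, of the best Jaccard match
    among the found biclusters (assumed scoring scheme)."""
    if not expected:
        return 1.0
    if not found:
        return 0.0
    return sum(max(jaccard(e, f) for f in found)
               for e in expected) / len(expected)


# Usage: a 20 x 10 matrix with a 10 x 5 implanted bicluster, and a
# hypothetical algorithm output that misses one of its columns.
true_b = (range(10), range(5))
found_b = (range(10), range(4))
data = make_synthetic(20, 10, *true_b)
print(recovery([true_b], [found_b]))  # 40 shared cells / 50 total -> 0.8
```

Raising `noise_sd`, implanting several biclusters, or letting their row and column sets overlap would reproduce, in miniature, the kinds of conditions the abstract lists.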

Keywords:  biclustering; clustering; gene expression; microarray

Year:  2012        PMID: 22772837      PMCID: PMC3659300          DOI: 10.1093/bib/bbs032

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


