
Biclustering sparse binary genomic data.

Miranda van Uitert, Wouter Meuleman, Lodewyk Wessels.

Abstract

Genomic datasets often consist of large, binary, sparse data matrices. In such a dataset, one is often interested in finding contiguous blocks that (mostly) contain ones. This is a biclustering problem, and while many algorithms have been proposed to deal with gene expression data, only two algorithms have been proposed that specifically deal with binary matrices. None of the gene expression biclustering algorithms can handle the large number of zeros in sparse binary matrices. The two proposed binary algorithms failed to produce meaningful results. In this article, we present a new algorithm that is able to extract biclusters from sparse, binary datasets. A powerful feature is that biclusters with different numbers of rows and columns can be detected, ranging from many rows and few columns to few rows and many columns. It allows the user to guide the search towards biclusters of specific dimensions. When applying our algorithm to an input matrix derived from TRANSFAC, we find transcription factors with distinctly dissimilar binding motifs, but a clear set of common targets that are significantly enriched for GO categories.
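To make the problem statement concrete, the following is a minimal sketch of what "a block that mostly contains ones" can mean in a sparse binary matrix. It is not the algorithm presented in the article, whose details the abstract does not give. The sparse representation (a dict mapping each row to the set of columns holding a one), the function names `density` and `refine`, and the `min_frac` threshold are all assumptions chosen for illustration. The sketch alternates between selecting rows that are mostly ones over the current columns and columns that are mostly ones over the current rows, until the selection stops changing.

```python
from typing import Dict, Set, Tuple

# Hypothetical sparse layout: row index -> set of column indices that hold a 1.
SparseBinary = Dict[int, Set[int]]


def density(matrix: SparseBinary, rows: Set[int], cols: Set[int]) -> float:
    """Fraction of cells in the rows x cols block that contain a one."""
    if not rows or not cols:
        return 0.0
    ones = sum(len(matrix.get(r, set()) & cols) for r in rows)
    return ones / (len(rows) * len(cols))


def refine(
    matrix: SparseBinary,
    n_cols: int,
    seed_cols: Set[int],
    min_frac: float = 0.75,  # assumed tolerance for zeros inside a block
    max_iter: int = 50,
) -> Tuple[Set[int], Set[int]]:
    """Alternate row and column selection, starting from seed columns."""
    rows: Set[int] = set()
    cols: Set[int] = set(seed_cols)
    for _ in range(max_iter):
        # Keep rows that are mostly ones across the current columns.
        new_rows = {
            r for r, ones in matrix.items()
            if len(ones & cols) >= min_frac * len(cols)
        }
        if not new_rows:
            return set(), set()
        # Keep columns that are mostly ones across the selected rows.
        new_cols = {
            c for c in range(n_cols)
            if sum(1 for r in new_rows if c in matrix[r]) >= min_frac * len(new_rows)
        }
        if not new_cols:
            return set(), set()
        if new_rows == rows and new_cols == cols:
            break  # selection is stable
        rows, cols = new_rows, new_cols
    return rows, cols


# Toy 5 x 6 matrix: a 3 x 4 block with one missing entry, plus two stray ones.
example: SparseBinary = {0: {0, 1, 2, 3}, 1: {0, 1, 2}, 2: {0, 1, 2, 3}, 3: {5}, 4: {4}}
block_rows, block_cols = refine(example, n_cols=6, seed_cols={0, 1}, min_frac=0.6)
print(sorted(block_rows), sorted(block_cols), density(example, block_rows, block_cols))
```

A lower `min_frac` tolerates more zeros inside a block, while the choice of seed columns steers which block is found. This is only a loose analogue of guiding the search towards biclusters of specific dimensions.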

Year:  2008        PMID: 19040367     DOI: 10.1089/cmb.2008.0066

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  3 in total

1.  A Tabu-Search Heuristic for Deterministic Two-Mode Blockmodeling of Binary Network Matrices.

Authors:  Michael Brusco; Douglas Steinley
Journal:  Psychometrika       Date:  2011-07-14       Impact factor: 2.500

2.  Large-scale bioactivity analysis of the small-molecule assayed proteome.

Authors:  Tyler William H Backman; Daniel S Evans; Thomas Girke
Journal:  PLoS One       Date:  2017-02-08       Impact factor: 3.240

3.  Fastbreak: a tool for analysis and visualization of structural variations in genomic data.

Authors:  Ryan Bressler; Jake Lin; Andrea Eakin; Thomas Robinson; Richard Kreisberg; Hector Rovira; Theo Knijnenburg; John Boyle; Ilya Shmulevich
Journal:  EURASIP J Bioinform Syst Biol       Date:  2012-10-09
