
CUBIC: identification of regulatory binding sites through data clustering.

Victor Olman, Dong Xu, Ying Xu.

Abstract

Transcription factor binding sites are short fragments in the upstream regions of genes to which transcription factors bind to regulate the transcription of genes into mRNA. Computational identification of transcription factor binding sites remains a challenging, unsolved problem despite the considerable effort devoted to it. We have recently developed a novel technique for identifying binding sites in a set of upstream regions of genes that may be transcriptionally co-regulated and hence may share similar transcription factor binding sites. By exploiting two key features of such binding sites (their high sequence similarity and their relatively high frequency compared with other sequence fragments), we have formulated this problem as a cluster identification problem, that is, the problem of identifying and extracting data clusters from a noisy background. While the classical data clustering problem (partitioning a data set into clusters sharing common or similar features) has been extensively studied, there is no general algorithm for identifying data clusters in a noisy background. In this paper, we present a novel algorithm for solving this problem. We have proved that a cluster identification problem, under our definition, can be solved rigorously and efficiently by searching for substrings with special properties in a linear sequence. We have also developed a method for assessing the statistical significance of each identified cluster, which can be used to rule out accidental data clusters. We have implemented the cluster identification algorithm and the statistical significance analysis method as a software program, CUBIC. CUBIC has been tested extensively. We present here a few applications of CUBIC to challenging cases of binding site identification.
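The idea of reducing cluster identification to a search over a linear sequence can be illustrated with a small sketch. The abstract does not say how the linear sequence is built or which substring properties are tested, so everything below is an assumption made for illustration: candidate sites are fixed-length windows (k-mers), similarity is Hamming distance, the linear sequence is the order in which Prim's minimum spanning tree algorithm visits the k-mers, and a cluster is a contiguous run whose attaching edges are all short. The names `kmers`, `prim_order`, `candidate_clusters`, `max_edge` and `min_size` are hypothetical and are not taken from CUBIC. The statistical significance step is not shown.

```python
def kmers(seqs, k):
    """Collect every length-k window from each sequence (assumed candidate sites)."""
    return [s[i:i + k] for s in seqs for i in range(len(s) - k + 1)]


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def prim_order(points, dist):
    """Run Prim's algorithm from point 0.

    Returns the visiting order and, for each visited point, the weight of
    the edge that attached it to the growing tree (0 for the start point).
    """
    n = len(points)
    in_tree = [False] * n
    best = [float("inf")] * n
    best[0] = 0
    order, weights = [], []
    for _ in range(n):
        # Pick the cheapest point not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        order.append(u)
        weights.append(best[u])
        # Relax the attachment cost of every remaining point.
        for v in range(n):
            if not in_tree[v]:
                d = dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return order, weights


def candidate_clusters(order, weights, max_edge, min_size):
    """Scan the linear ordering for maximal runs attached by short edges.

    A run is broken wherever the attaching edge exceeds max_edge; runs with
    fewer than min_size members are treated as background noise.
    """
    clusters, start = [], 0
    for i in range(1, len(order) + 1):
        if i == len(order) or weights[i] > max_edge:
            if i - start >= min_size:
                clusters.append(order[start:i])
            start = i
    return clusters


# Toy input: four near-identical fragments hidden among unrelated ones.
fragments = ["TTGACA", "TTGACT", "TTGACA", "CCCGGG", "AAATTT", "TTGTCA", "GGGCCC"]
order, weights = prim_order(fragments, hamming)
clusters = candidate_clusters(order, weights, max_edge=1, min_size=3)
print([[fragments[i] for i in c] for c in clusters])
```

In this toy input the four `TTGAC`-like fragments form the only run that survives the thresholds, while the unrelated fragments are each attached by long edges and are discarded. On real upstream regions the fragments would come from something like `kmers(upstream_seqs, k)`, and the quadratic distance scan used here would need a more efficient formulation.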

Year:  2003        PMID: 15290780     DOI: 10.1142/s0219720003000162

Source DB:  PubMed          Journal:  J Bioinform Comput Biol        ISSN: 0219-7200            Impact factor:   1.122


