A combinatorial optimization approach for diverse motif finding applications.

Elena Zaslavsky, Mona Singh.

Abstract

BACKGROUND: Discovering approximately repeated patterns, or motifs, in biological sequences is an important and widely studied problem in computational molecular biology. Most frequently, motif finding applications arise when identifying shared regulatory signals within DNA sequences or shared functional and structural elements within protein sequences. Due to the diversity of contexts in which motif finding is applied, several variations of the problem are commonly studied.
RESULTS: We introduce a versatile combinatorial optimization framework for motif finding that couples graph pruning techniques with a novel integer linear programming formulation. Our approach is flexible and robust enough to model several variants of the motif finding problem, including those incorporating substitution matrices and phylogenetic distances. Additionally, we give an approach for determining statistical significance of uncovered motifs. In testing on numerous DNA and protein datasets, we demonstrate that our approach typically identifies statistically significant motifs corresponding to either known motifs or other motifs of high conservation. Moreover, in most cases, our approach finds provably optimal solutions to the underlying optimization problem.
CONCLUSION: Our results demonstrate that a combined graph theoretic and mathematical programming approach can be the basis for effective and powerful techniques for diverse motif finding applications.
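
ILLUSTRATION: The abstract does not spell out the optimization problem, so the following is only a minimal sketch under stated assumptions. It assumes the common graph view of motif finding: every length-l window of every sequence is a vertex, windows from different sequences are joined by an edge weighted by Hamming distance, and the goal is to pick one window per sequence so that the total pairwise weight is smallest. The function names (`hamming`, `windows`, `best_motif`) and the toy sequences are hypothetical, and plain exhaustive search stands in for the graph pruning and integer linear programming that the paper describes, so this is practical only for very small inputs.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, length):
    """All (start, substring) windows of the given length in seq."""
    return [(i, seq[i:i + length]) for i in range(len(seq) - length + 1)]


def best_motif(seqs, length):
    """Pick one window per sequence minimizing the sum of pairwise distances.

    Assumption: Hamming distance is the edge weight. This is exhaustive
    enumeration (exponential in the number of sequences), not an ILP.
    """
    parts = [windows(s, length) for s in seqs]
    best_cost, best_choice = None, None
    for choice in product(*parts):
        cost = sum(hamming(a[1], b[1]) for a, b in combinations(choice, 2))
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_cost, best_choice


# Toy input (hypothetical): each sequence contains one planted copy of ACGT.
seqs = ["TTACGTAA", "CCACGTGG", "GACGTTTT"]
cost, choice = best_motif(seqs, 4)
print(cost, choice)
```

Swapping `hamming` for a substitution-matrix score or a phylogenetically weighted distance would change only the edge weights, which is one way to picture how a single formulation could cover the problem variants mentioned in RESULTS.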


Year:  2006        PMID: 16916460      PMCID: PMC1570465          DOI: 10.1186/1748-7188-1-13

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


