A generic motif discovery algorithm for sequential data.

Kyle L Jensen, Mark P Styczynski, Isidore Rigoutsos, Gregory N Stephanopoulos.

Abstract

MOTIVATION: Motif discovery in sequential data is a problem of great interest with many applications. However, previous methods have been unable to combine exhaustive search with complex motif representations, and each is typically applicable only to a certain class of problems.
RESULTS: Here we present a generic motif discovery algorithm (Gemoda) for sequential data. Gemoda can be applied to any dataset with a sequential character, including both categorical and real-valued data. As we show, Gemoda deterministically discovers motifs that are maximal in composition and length. The algorithm also allows any choice of similarity metric for finding motifs. Finally, Gemoda's output motifs are representation-agnostic: they can be represented using regular expressions, position weight matrices or any number of other models for any type of sequential data. We demonstrate a number of applications of the algorithm, including the discovery of motifs in amino acid sequences, a new solution to the (l,d)-motif problem in DNA sequences and the discovery of conserved protein substructures.
AVAILABILITY: Gemoda is freely available at http://web.mit.edu/bamel/gemoda
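
For readers unfamiliar with the (l,d)-motif problem mentioned in the abstract, the following is a minimal illustrative sketch of its usual statement: find every string of length l that occurs in each input sequence with at most d mismatches. It is a brute-force enumeration for tiny inputs only and is not Gemoda's method; the function names (hamming, occurs_within, ld_motifs) and the example sequences are hypothetical, chosen for illustration.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(candidate, sequence, d):
    """True if some window of `sequence` is within d mismatches of `candidate`."""
    l = len(candidate)
    return any(
        hamming(candidate, sequence[i:i + l]) <= d
        for i in range(len(sequence) - l + 1)
    )


def ld_motifs(sequences, l, d, alphabet="ACGT"):
    """Brute-force (l,d)-motif search (illustrative only, NOT Gemoda).

    Enumerates all len(alphabet)**l candidate strings and keeps those that
    occur, with at most d mismatches, in every input sequence.
    """
    found = []
    for letters in product(alphabet, repeat=l):
        candidate = "".join(letters)
        if all(occurs_within(candidate, s, d) for s in sequences):
            found.append(candidate)
    return found


# Hypothetical toy input: "ACGT" appears exactly in the first sequence and
# with one substitution in the other two ("ACCT", "AGGT").
toy_sequences = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
motifs = ld_motifs(toy_sequences, l=4, d=1)
```

With this toy input, ACGT is among the strings returned for l=4 and d=1, while no string qualifies for d=0. The enumeration grows exponentially in l, which is why practical tools rely on more careful search strategies.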

Year:  2005        PMID: 16257985     DOI: 10.1093/bioinformatics/bti745

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  7 in total

1.  Real-Time PCR: Revolutionizing Detection and Expression Analysis of Genes.

Authors:  Sa Deepak; Kr Kottapalli; R Rakwal; G Oros; Ks Rangappa; H Iwahashi; Y Masuo; Gk Agrawal
Journal:  Curr Genomics       Date:  2007-06       Impact factor: 2.236

2.  Breaking the computational barrier: a divide-conquer and aggregate based approach for Alu insertion site characterisation.

Authors:  Kun Zhang; Wei Fan; Prescott Deininger; Andrea Edwards; Zujia Xu; Dongxiao Zhu
Journal:  Int J Comput Biol Drug Des       Date:  2009-01-04

3.  Efficient motif search in ranked lists and applications to variable gap motifs.

Authors:  Limor Leibovich; Zohar Yakhini
Journal:  Nucleic Acids Res       Date:  2012-03-13       Impact factor: 16.971

4.  iTriplet, a rule-based nucleic acid sequence motif finder.

Authors:  Eric S Ho; Christopher D Jakubowski; Samuel I Gunderson
Journal:  Algorithms Mol Biol       Date:  2009-10-29       Impact factor: 1.405

5.  A Caenorhabditis motif compendium for studying transcriptional gene regulation.

Authors:  Christoph Dieterich; Ralf J Sommer
Journal:  BMC Genomics       Date:  2008-01-23       Impact factor: 3.969

6.  Comparative analysis of regulatory motif discovery tools for transcription factor binding sites.

Authors:  Wei Wei; Xiao-Dan Yu
Journal:  Genomics Proteomics Bioinformatics       Date:  2007-05       Impact factor: 7.691

7.  Navigating freely-available software tools for metabolomics analysis. (Review)

Authors:  Rachel Spicer; Reza M Salek; Pablo Moreno; Daniel Cañueto; Christoph Steinbeck
Journal:  Metabolomics       Date:  2017-08-09       Impact factor: 4.290
