SMOTIF: efficient structured pattern and profile motif search.

Yongqiang Zhang, Mohammed J Zaki.

Abstract

BACKGROUND: A structured motif consists of several components separated by variable-length gaps; each component is a simple motif that allows either no gaps or only fixed-length gaps. A motif can be represented either as a pattern or as a profile (also called a positional weight matrix). We propose an efficient algorithm, called SMOTIF, to solve the structured motif search problem: given one or more sequences and a structured motif, SMOTIF searches the sequences for all occurrences of the motif. Potential applications include searching for long terminal repeat (LTR) retrotransposons and composite regulatory binding sites in DNA sequences.
RESULTS: SMOTIF can search for both pattern and profile motifs, and it is efficient in both time and space; it outperforms SMARTFINDER, a state-of-the-art algorithm for structured motif search. Experimental results show that SMOTIF is about 7 times faster and consumes 100 times less memory than SMARTFINDER. It can effectively search for LTR retrotransposons and is well suited to searching for motifs with long-range gaps. It is also successful in finding potential composite transcription factor binding sites.
CONCLUSION: SMOTIF is a useful and efficient tool for finding structured pattern and profile motifs. The algorithm is available as open source at http://www.cs.rpi.edu/~zaki/software/sMotif/.
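
To make the problem statement concrete, the following is a minimal brute-force sketch of structured pattern motif search. It is not the SMOTIF algorithm, whose details the abstract does not give. The function names (`matches_at`, `find_structured_motif`), the use of `N` as a wildcard standing in for a fixed-length gap inside a simple motif, and the representation of variable gaps as `(min_gap, max_gap)` pairs are all assumptions made for illustration.

```python
from typing import List, Tuple


def matches_at(seq: str, component: str, start: int) -> bool:
    """Check whether a simple motif matches seq at position start.

    Assumption: 'N' in the component matches any character, which models
    a fixed-length gap inside a simple motif.
    """
    if start < 0 or start + len(component) > len(seq):
        return False
    return all(c == "N" or c == seq[start + i] for i, c in enumerate(component))


def find_structured_motif(
    seq: str,
    components: List[str],
    gaps: List[Tuple[int, int]],
) -> List[Tuple[int, ...]]:
    """Return every occurrence as a tuple of component start positions.

    gaps[i] is the inclusive (min_gap, max_gap) range allowed between
    the end of components[i] and the start of components[i + 1].
    """
    if len(gaps) != len(components) - 1:
        raise ValueError("need exactly one gap range between adjacent components")

    results: List[Tuple[int, ...]] = []

    def extend(idx: int, start: int, positions: List[int]) -> None:
        # Try to place component idx at start, then recurse over allowed gaps.
        if not matches_at(seq, components[idx], start):
            return
        positions = positions + [start]
        if idx == len(components) - 1:
            results.append(tuple(positions))
            return
        min_gap, max_gap = gaps[idx]
        end = start + len(components[idx])
        for gap in range(min_gap, max_gap + 1):
            extend(idx + 1, end + gap, positions)

    for s in range(len(seq)):
        extend(0, s, [])
    return results


if __name__ == "__main__":
    # "ACG" followed by "GCA" after a gap of 1 to 4 characters.
    print(find_structured_motif("ACGTTTGCAAGGA", ["ACG", "GCA"], [(1, 4)]))
    # prints [(0, 6)]
```

This exhaustive search tries every start position and every gap length, so its cost grows quickly with the width of the gap ranges; it is meant only to show what counts as an occurrence, not how to find occurrences efficiently.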

Year:  2006        PMID: 17118189      PMCID: PMC1679804          DOI: 10.1186/1748-7188-1-22

Source DB:  PubMed          Journal:  Algorithms Mol Biol        ISSN: 1748-7188            Impact factor:   1.405


  5 in total

1.  Finding and Characterizing Repeats in Plant Genomes.

Authors:  Jacques Nicolas; Sébastien Tempel; Anna-Sophie Fiston-Lavier; Emira Cherif
Journal:  Methods Mol Biol       Date:  2022

2.  Bioinformatics and genomic analysis of transposable elements in eukaryotic genomes.

Authors:  Mateusz Janicki; Rebecca Rooke; Guojun Yang
Journal:  Chromosome Res       Date:  2011-08       Impact factor: 4.620

3.  Protein sequences classification by means of feature extraction with substitution matrices.

Authors:  Rabie Saidi; Mondher Maddouri; Engelbert Mephu Nguifo
Journal:  BMC Bioinformatics       Date:  2010-04-08       Impact factor: 3.169

4.  ModuleOrganizer: detecting modules in families of transposable elements.

Authors:  Sebastien Tempel; Christine Rousseau; Fariza Tahi; Jacques Nicolas
Journal:  BMC Bioinformatics       Date:  2010-09-22       Impact factor: 3.169

5.  Active motif finder - a bio-tool based on mutational structures in DNA sequences.

Authors:  Mani Udayakumar; Palaniyandi Shanmuga-Priya; Kamalakannan Hemavathi; Rengasamy Seenivasagam
Journal:  J Biomed Res       Date:  2011-11