
High-performance exact algorithms for motif search.

Sanguthevar Rajasekaran, Sudha Balla, Chun-Hsi Huang, Vishal Thapar, Michael Gryk, Mark Maciejewski, Martin Schiller.

Abstract

OBJECTIVE: The human genome project has generated voluminous biological data, and novel computational techniques are needed to extract useful information from it. One such technique is finding patterns that are repeated over many sequences (and possibly over many species). In this paper we study the motif search problem: identifying meaningful patterns (i.e., motifs) in biological data.
METHODS: The general version of the motif search problem is NP-hard. Numerous algorithms have been proposed in the literature to solve this problem. Many of these algorithms fall under the category of heuristics. We concentrate on exact algorithms in this paper. In particular, we concentrate on two different versions of the motif search problem and offer exact algorithms for them.
RESULTS: We present algorithms for two versions of the motif search problem. All of our algorithms are elegant and use only simple data structures such as arrays. For the first version (described as Problem 1 in the paper), we present a simple sorting-based algorithm, SMS (Simple Motif Search). This algorithm has been coded and experimental results have been obtained. For the second version (described as Problem 2 in the paper), we present two different algorithms: a deterministic algorithm (called DMS) and a randomized (Monte Carlo) algorithm. We also show how these algorithms can be parallelized.
CONCLUSIONS: All the algorithms proposed in this paper are improvements over existing algorithms for these versions of motif search in biological sequence data. The algorithms presented have the potential of performing well in practice.
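
The abstract does not define Problem 1 or give the details of SMS, so the following is only a minimal sketch of the general idea it names: a sorting-based search that uses nothing more than arrays. It assumes a simplified task, finding every exact substring of length `l` that occurs in at least `min_seqs` of the input sequences. The function name `frequent_lmers` and its parameters are hypothetical and do not come from the paper.

```python
from typing import Dict, List


def frequent_lmers(sequences: List[str], l: int, min_seqs: int) -> Dict[str, int]:
    """Return each exact l-mer found in at least min_seqs sequences.

    Illustrative simplification only; this is not the paper's SMS algorithm.
    """
    # Collect (l-mer, sequence index) pairs into one flat array.
    pairs = []
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - l + 1):
            pairs.append((seq[start:start + l], idx))

    # Sorting places equal l-mers next to each other, ordered by sequence index.
    pairs.sort()

    result = {}
    i = 0
    while i < len(pairs):
        lmer = pairs[i][0]
        distinct = 0
        last_idx = -1
        # Scan the run of identical l-mers, counting each sequence only once.
        while i < len(pairs) and pairs[i][0] == lmer:
            if pairs[i][1] != last_idx:
                distinct += 1
                last_idx = pairs[i][1]
            i += 1
        if distinct >= min_seqs:
            result[lmer] = distinct
    return result


if __name__ == "__main__":
    print(frequent_lmers(["ACGTAC", "TACGA", "GGACG"], l=3, min_seqs=2))
```

After the sort, a single linear scan is enough to count how many distinct sequences contain each pattern, which is why no data structure beyond an array is required in this sketch. Handling wildcards or mismatches, as a real motif search would need, is outside its scope.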

Year: 2005  PMID: 16328946  DOI: 10.1007/s10877-005-0677-y

Source DB: PubMed  Journal: J Clin Monit Comput  ISSN: 1387-1307  Impact factor: 1.977


