
A graph-theoretical approach for motif discovery in protein sequences.

Elena Czeizler, Tommi Hirvola, Kalle Karhu.   

Abstract

Motif recognition is a challenging problem in bioinformatics due to the diversity of protein motifs. Many existing algorithms identify motifs of a given length, and are therefore either inapplicable or inefficient when searching simultaneously for motifs of various lengths. Searching for gapped motifs, although very important, is highly time-consuming because allowing long gaps causes a combinatorial explosion in the number of candidate motifs. We introduce a new graph-theoretical approach to identify motifs of various lengths, both with and without gaps. We compare our approach with two widely used methods, MEME and GLAM2, analyzing both the quality of the results and the required computational time. Our method yields results of slightly higher quality than MEME while being much faster, requiring one eighth of MEME's query time. By using similarity indexing, we reduce the query times to an average of approximately one sixth of those required by GLAM2, while achieving slightly higher quality of results. More precisely, GLAM2 is 13 times slower than our method on sequence collections smaller than 50,000 bytes, but at least as fast as our method on larger ones. The source code of our C++ implementation is freely available on GitHub: https://github.com/hirvolt1/debruijn-motif.
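The abstract does not describe the graph construction itself; the repository name only suggests that a de Bruijn graph is involved. As a purely illustrative sketch under that assumption, and not the authors' algorithm, the following Python builds a de Bruijn-style graph over the k-mers of a few sequences and reports the k-mers that occur in a minimum number of distinct sequences. The function names `build_debruijn` and `well_supported_kmers`, the parameters `k` and `min_support`, and the toy sequences are all hypothetical. Gapped motifs and similarity indexing are not covered.

```python
from collections import defaultdict


def build_debruijn(sequences, k):
    """Build a de Bruijn-style graph: nodes are (k-1)-mers, edges are k-mers.

    Each edge maps to the set of sequence indices in which its k-mer occurs.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    edges = defaultdict(set)  # (prefix, suffix) -> indices of supporting sequences
    for idx, seq in enumerate(sequences):
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            edges[(kmer[:-1], kmer[1:])].add(idx)
    return edges


def well_supported_kmers(edges, min_support):
    """Return the k-mers (edges) found in at least min_support distinct sequences."""
    return sorted(
        prefix + suffix[-1]  # rebuild the k-mer from its two overlapping (k-1)-mers
        for (prefix, suffix), support in edges.items()
        if len(support) >= min_support
    )


# Toy input: made-up sequences, not data from the paper.
toy_sequences = ["MKRGDSA", "TTRGDSL", "PRGDQQ"]
graph = build_debruijn(toy_sequences, k=3)
shared = well_supported_kmers(graph, min_support=3)
```

On this toy input, `shared` contains only `"RGD"`, the single 3-mer present in all three sequences. Lowering `min_support` to 2 also admits `"GDS"`. Chaining well-supported edges along paths in the graph would be one way to reach longer candidates without fixing the motif length in advance.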


Year:  2015        PMID: 28055896     DOI: 10.1109/TCBB.2015.2511750

Source DB:  PubMed          Journal:  IEEE/ACM Trans Comput Biol Bioinform        ISSN: 1545-5963            Impact factor:   3.710


  1 in total

1.  HH-MOTiF: de novo detection of short linear motifs in proteins by Hidden Markov Model comparisons.

Authors:  Roman Prytuliak; Michael Volkmer; Markus Meier; Bianca H Habermann
Journal:  Nucleic Acids Res       Date:  2017-07-03       Impact factor: 16.971

