Janne Korhonen, Petri Martinmäki, Cinzia Pizzi, Pasi Rastas, Esko Ukkonen.
Abstract
MOODS (MOtif Occurrence Detection Suite) is a software package for matching position weight matrices against DNA sequences. MOODS implements state-of-the-art online matching algorithms and scans considerably faster than a simple brute-force search. MOODS is written in C++, with bindings for the popular BioPerl and Biopython toolkits. It can easily be adapted for different purposes and integrated into existing workflows, and it can also be used as a C++ library. AVAILABILITY: The package, with documentation and usage examples, is available at http://www.cs.helsinki.fi/group/pssmfind. The source code is also available under the terms of the GNU General Public License (GPL).
Year: 2009 PMID: 19773334 PMCID: PMC2778336 DOI: 10.1093/bioinformatics/btp554
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Algorithm benchmarks
| Algorithm or package | 600k, P = 10⁻⁶ | 600k, P = 10⁻⁴ | Chr20, P = 10⁻⁶ | Chr20, P = 10⁻⁴ |
|---|---|---|---|---|
| MOODS: Naive algorithm | 6.5 s | 7.3 s | 689 s | 782 s |
| MOODS: Permutated lookahead | 3.8 s | 6.3 s | 405 s | 677 s |
| MOODS: MLF | 0.4 s | 1.1 s | 16.0 s | 117 s |
| TFBS | 20.4 s | 53.1 s | – | – |
| Motility | 103 s | 103 s | 180 min | 181 min |
| Biopython | 42 min | 41 min | – | – |
| Matches | 952 | 7.3 × 10⁴ | 1.1 × 10⁵ | 6.7 × 10⁶ |
We used two target sequences: ‘600k’ is a 600 kb long human DNA fragment, and ‘Chr20’ is the 62 Mb long human chromosome 20. The total scanning times for each algorithm or package are given, with ‘–’ indicating that the dataset was too large to be processed. The reported times include the construction of the data structures required in scanning as well as the scanning itself. The ‘matches’ row gives the total number of matches found for each P-value.
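For orientation, the kind of brute-force scan that the abstract contrasts MOODS with can be sketched in a few lines of Python. This is an illustrative sketch only, not MOODS code and not its API: the function name `naive_scan`, the toy matrix `PWM`, and the use of a raw score threshold are all assumptions made for the example. In particular, the benchmarks above select matches by P-value, and the conversion from a P-value to a score threshold is not shown here.

```python
from typing import Dict, List, Tuple

# Hypothetical toy matrix (an assumption for illustration): one dict of
# per-base scores for each motif position. This one favours the motif "ACG".
PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def naive_scan(
    sequence: str, pwm: List[Dict[str, float]], threshold: float
) -> List[Tuple[int, float]]:
    """Score every window of the sequence against the matrix.

    Returns (start position, score) for each window whose summed score
    is at least `threshold`. Windows containing a character that the
    matrix does not define (for example "N") are skipped.
    """
    m = len(pwm)
    hits: List[Tuple[int, float]] = []
    for start in range(len(sequence) - m + 1):
        score = 0.0
        valid = True
        # Sum the matrix entry for each base in the current window.
        for column, base in zip(pwm, sequence[start:start + m]):
            if base not in column:
                valid = False
                break
            score += column[base]
        if valid and score >= threshold:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    print(naive_scan("TTACGACGT", PWM, 3.0))  # [(2, 3.0), (5, 3.0)]
```

Every window is scored in full, so the work grows with the product of sequence length and matrix length. That cost is the baseline against which faster scanning methods are measured.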