AlignBucket: a tool to speed up 'all-against-all' protein sequence alignments optimizing length constraints.

Giuseppe Profiti, Piero Fariselli, Rita Casadio.

Abstract

MOTIVATION: The next-generation sequencing era requires reliable, fast and efficient approaches for the accurate annotation of the ever-increasing number of biological sequences and their variations. Transfer of annotation upon similarity search is a standard approach. All-against-all protein comparison is a preliminary step of several available methods that annotate sequences based on information already present in databases. Given the current volume of sequences, methods that pre-process the data are needed to reduce sequence comparison time.
RESULTS: We present an algorithm that optimizes the partition of a large volume of sequences (the whole database) into sets where sequence length values (in residues) are constrained depending on a bounded minimal and expected alignment coverage. The idea is to optimally group protein sequences according to their length, and then to compute the all-against-all sequence alignments only among sequences that fall in a selected length range. We describe a mathematically optimal solution and show that our method leads to a 5-fold speed-up in real-world cases.
AVAILABILITY AND IMPLEMENTATION: The software is available for downloading at http://www.biocomp.unibo.it/~giuseppe/partitioning.html.
CONTACT: giuseppe.profiti2@unibo.it.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
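
The length constraint described in RESULTS can be illustrated with a minimal sketch. It is not the optimal partitioning algorithm of the paper and not the AlignBucket code: it is a plain sliding-window filter over sorted lengths, and it assumes that coverage means shorter length divided by longer length. The function name `candidate_pairs` and the parameter `min_coverage_pct` are hypothetical.

```python
from bisect import bisect_right


def candidate_pairs(lengths, min_coverage_pct):
    """Yield index pairs (i, j) whose lengths permit the required coverage.

    Assumption (not from the paper): coverage = shorter / longer length,
    given as an integer percentage so that comparisons stay exact.
    """
    if not 0 < min_coverage_pct <= 100:
        raise ValueError("min_coverage_pct must be in 1..100")
    # Sort sequence indices by length so compatible partners are contiguous.
    order = sorted(range(len(lengths)), key=lambda i: lengths[i])
    sorted_lengths = [lengths[i] for i in order]
    for pos, short in enumerate(sorted_lengths):
        # Longest partner that still satisfies 100 * short >= pct * long.
        max_len = (100 * short) // min_coverage_pct
        end = bisect_right(sorted_lengths, max_len)
        for other in range(pos + 1, end):
            yield order[pos], order[other]


if __name__ == "__main__":
    lengths = [100, 95, 50, 200, 52]  # toy sequence lengths in residues
    pairs = list(candidate_pairs(lengths, 90))
    total = len(lengths) * (len(lengths) - 1) // 2
    print(pairs)                      # [(2, 4), (1, 0)]
    print(len(pairs), "of", total, "pairs need aligning")
```

In this toy input only 2 of the 10 possible pairs pass the length filter, which is the kind of saving the abstract attributes to restricting alignments to a selected length range.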

Substances: Proteins

Year: 2015; PMID: 26231432; DOI: 10.1093/bioinformatics/btv451

Source DB: PubMed; Journal: Bioinformatics; ISSN: 1367-4803; Impact factor: 6.937

Cited

2 in total

1. BENZ WS: the Bologna ENZyme Web Server for four-level EC number annotation.

Authors: Davide Baldazzi; Castrense Savojardo; Pier Luigi Martelli; Rita Casadio
Journal: Nucleic Acids Res Date: 2021-07-02 Impact factor: 16.971

2. The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

Authors: Giuseppe Profiti; Pier Luigi Martelli; Rita Casadio
Journal: Nucleic Acids Res Date: 2017-07-03 Impact factor: 16.971
