Code optimization of the subroutine to remove near identical matches in the sequence database homology search tool PSI-BLAST.

Mats Aspnäs, Kimmo Mattila, Kristoffer Osowski, Jan Westerholm.

Abstract

A central task in protein sequence characterization is the use of a sequence database homology search tool to find similar protein sequences in other individuals or species. PSI-BLAST is a widely used module of the BLAST package that calculates a position-specific score matrix from the best-matching sequences and performs iterated searches, using a procedure that removes near-identical matches so that many similar sequences do not dominate the score. For some queries and parameter settings, PSI-BLAST may find many similar high-scoring matches, and up to 80% of the total run time may then be spent in this procedure. In this article, we present code optimizations that improve the cache utilization and the overall performance of this procedure. Measurements show that, for queries where the number of similar matches is high, the optimized PSI-BLAST program may be as much as 2.9 times faster than the original program.
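
To illustrate why removing near-identical matches can dominate run time when many matches are similar, the following is a minimal sketch of a greedy purge over aligned sequences. It is not the PSI-BLAST implementation and not the optimization described in the article: the function names, the gap character "-", and the 0.9 identity threshold are all assumptions chosen for illustration. Each candidate is compared against every sequence kept so far, so the work grows roughly with the square of the number of matches.

```python
from typing import List


def fraction_identical(a: str, b: str) -> float:
    """Fraction of aligned columns, ignoring gaps, where a and b agree."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    same = 0
    for x, y in zip(a, b):
        # Assumption: '-' marks a gap; gapped columns are not compared.
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            same += 1
    return same / compared if compared else 0.0


def purge_near_identical(aligned: List[str], threshold: float = 0.9) -> List[int]:
    """Return indices of rows to keep.

    A row is dropped when its identity to any already-kept row is at or
    above the threshold. The threshold value is hypothetical.
    """
    kept: List[int] = []
    for i, seq in enumerate(aligned):
        if all(fraction_identical(seq, aligned[k]) < threshold for k in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    rows = ["ACDEFGHIKL", "ACDEFGHIKL", "ACDEFGHIKV", "MNPQRSTVWY"]
    print(purge_near_identical(rows, threshold=0.9))
```

In this sketch the inner comparison walks two rows column by column, which is the kind of access pattern where data layout and cache behavior can matter once the number of retained rows becomes large.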

Year:  2010        PMID: 20583927     DOI: 10.1089/cmb.2008.0053

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  2 in total

1.  Simple adjustment of the sequence weight algorithm remarkably enhances PSI-BLAST performance.

Authors:  Toshiyuki Oda; Kyungtaek Lim; Kentaro Tomii
Journal:  BMC Bioinformatics       Date:  2017-06-02       Impact factor: 3.169

2.  Div-BLAST: diversification of sequence search results.

Authors:  Elif Eser; Tolga Can; Hakan Ferhatosmanoğlu
Journal:  PLoS One       Date:  2014-12-22       Impact factor: 3.240

