The efficient computation of position-specific match scores with the fast Fourier transform.

S Rajasekaran, X Jin, J L Spouge.

Abstract

Historically, in computational biology the fast Fourier transform (FFT) has been used almost exclusively to count the number of exact letter matches between two biosequences. This paper presents an FFT algorithm that can compute the match score of a sequence against a position-specific scoring matrix (PSSM). Our algorithm finds the PSSM score simultaneously over all offsets of the PSSM with the sequence, although like all previous FFT algorithms, it still disallows gaps. Although our algorithm is presented in the context of global matching, it can be adapted to local matching without gaps. As a benchmark, our PSSM-modified FFT algorithm computed pairwise match scores. In timing experiments, our most efficient FFT implementation for pairwise scoring appeared to be 10 to 26 times faster than a traditional FFT implementation, with only a factor of 2 in the acceleration attributable to a previously known compression scheme. Many important algorithms for detecting biosequence similarities, e.g., gapped BLAST or PSIBLAST, have a heuristic screening phase that disallows gaps. This paper demonstrates that FFT algorithms merit reconsideration in these screening applications.
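The idea of scoring a PSSM at all offsets simultaneously can be illustrated with a minimal sketch. It is not the authors' implementation and omits the compression and acceleration schemes mentioned in the abstract; it only shows the textbook route of one indicator vector per letter, cross-correlated with the matching PSSM row through an FFT. The function names (`fft`, `pssm_scores_fft`, `pssm_scores_direct`), the list-of-dicts PSSM layout, and the DNA alphabet are assumptions made for illustration. A small pure-Python radix-2 FFT keeps the example self-contained; in practice a library FFT would take its place.

```python
import cmath


def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two.

    The inverse transform is returned unnormalised (caller divides by n).
    """
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def pssm_scores_fft(seq, pssm, alphabet="ACGT"):
    """Ungapped PSSM score at every offset where the PSSM fits inside seq.

    pssm is a list of dicts (hypothetical layout), one per PSSM column,
    mapping each letter of the alphabet to its score.
    """
    n, m = len(seq), len(pssm)
    size = 1
    while size < n + m:  # zero-pad so circular correlation never wraps
        size *= 2
    spectrum = [0j] * size
    for letter in alphabet:
        indicator = [1.0 if c == letter else 0.0 for c in seq]
        indicator += [0.0] * (size - n)
        weights = [col[letter] for col in pssm] + [0.0] * (size - m)
        f_ind = fft(indicator)
        f_wts = fft(weights)
        # Cross-correlation theorem for real inputs: F(x) * conj(F(w)).
        # Linearity lets us sum over letters before a single inverse FFT.
        for k in range(size):
            spectrum[k] += f_ind[k] * f_wts[k].conjugate()
    corr = fft(spectrum, inverse=True)
    return [corr[d].real / size for d in range(n - m + 1)]


def pssm_scores_direct(seq, pssm):
    """Reference O(n*m) computation used to check the FFT result."""
    m = len(pssm)
    return [
        sum(pssm[j][seq[d + j]] for j in range(m))
        for d in range(len(seq) - m + 1)
    ]
```

Counting exact letter matches between two sequences is the special case in which each PSSM column scores 1 for the letter of the second sequence at that position and 0 for every other letter.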

Year:  2002        PMID: 11911793     DOI: 10.1089/10665270252833172

Source DB:  PubMed          Journal:  J Comput Biol        ISSN: 1066-5277            Impact factor:   1.479


  6 in total

1.  MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.

Authors:  Kazutaka Katoh; Kazuharu Misawa; Kei-ichi Kuma; Takashi Miyata
Journal:  Nucleic Acids Res       Date:  2002-07-15       Impact factor: 16.971

2.  Sequence alignment by cross-correlation.

Authors:  Alan L Rockwood; David K Crockett; James R Oliphant; Kojo S J Elenitoba-Johnson
Journal:  J Biomol Tech       Date:  2005-12

3.  COSINE: non-seeding method for mapping long noisy sequences.

Authors:  Pegah Tootoonchi Afshar; Wing Hung Wong
Journal:  Nucleic Acids Res       Date:  2017-08-21       Impact factor: 16.971

4.  PSimScan: algorithm and utility for fast protein similarity search.

Authors:  Anna Kaznadzey; Natalia Alexandrova; Vladimir Novichkov; Denis Kaznadzey
Journal:  PLoS One       Date:  2013-03-07       Impact factor: 3.240

5.  Fast index based algorithms and software for matching position specific scoring matrices.

Authors:  Michael Beckstette; Robert Homann; Robert Giegerich; Stefan Kurtz
Journal:  BMC Bioinformatics       Date:  2006-08-24       Impact factor: 3.169

6.  Fast sequence analysis based on diamond sampling.

Authors:  Liangxin Gao; Wenzhen Bao; Hongbo Zhang; Chang-An Yuan; De-Shuang Huang
Journal:  PLoS One       Date:  2018-06-28       Impact factor: 3.240
