Accelerating protein classification using suffix trees.

B Dorohonceanu, C G Nevill-Manning.

Abstract

Position-specific scoring matrices have been used extensively to recognize highly conserved protein regions. We present a method for accelerating these searches using a suffix tree data structure computed from the sequences to be searched. Building on earlier work that allows evaluation of a scoring matrix to be stopped early, the suffix tree-based method excludes many protein segments from consideration at once by pruning entire subtrees. Although suffix trees are usually expensive in space, the fact that scoring matrix evaluation requires an in-order traversal allows nodes to be stored more compactly without loss of speed, and our implementation requires only 17 bytes of primary memory per input symbol. Searches are accelerated by up to a factor of ten.
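The pruning idea can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a plain depth-limited suffix trie built from nested dictionaries rather than the compact suffix tree described above, and the names `build_trie` and `pssm_search`, the matrix layout (one dictionary of residue scores per column), and the `"$"` leaf marker are all assumptions made for illustration. A subtree is abandoned as soon as the score accumulated so far, plus the best score still obtainable from the remaining columns, cannot reach the threshold.

```python
from typing import Dict, List, Tuple

def build_trie(seq: str, depth: int) -> dict:
    """Trie of every length-`depth` window of `seq`.

    Nodes at the final depth hold a "$" entry listing the window start positions.
    Assumes "$" never occurs in the sequence itself.
    """
    root: dict = {}
    for start in range(len(seq) - depth + 1):
        node = root
        for ch in seq[start:start + depth]:
            node = node.setdefault(ch, {})
        node.setdefault("$", []).append(start)
    return root

def pssm_search(seq: str, pssm: List[Dict[str, float]],
                threshold: float) -> List[Tuple[int, float]]:
    """Return (start, score) for every window scoring at least `threshold`."""
    m = len(pssm)
    # best[d] = highest score obtainable from matrix columns d..m-1
    best = [0.0] * (m + 1)
    for d in range(m - 1, -1, -1):
        best[d] = best[d + 1] + max(pssm[d].values())

    hits: List[Tuple[int, float]] = []

    def visit(node: dict, d: int, score: float) -> None:
        # Early stop: no window below this node can reach the threshold,
        # so every segment sharing this prefix is excluded at once.
        if score + best[d] < threshold:
            return
        if d == m:
            hits.extend((pos, score) for pos in node["$"])
            return
        for ch, child in node.items():
            # Residues missing from a column are treated as impossible.
            visit(child, d + 1, score + pssm[d].get(ch, float("-inf")))

    visit(build_trie(seq, m), 0, 0.0)
    return sorted(hits)
```

Because identical prefixes share a path in the trie, each prefix is scored once no matter how often it occurs in the input, and a single failed bound check discards every occurrence together.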

Year: 2000    PMID: 10977073

Source DB: PubMed    Journal: Proc Int Conf Intell Syst Mol Biol    ISSN: 1553-0833


  1 in total

1.  Fast sequence analysis based on diamond sampling.

Authors:  Liangxin Gao; Wenzhen Bao; Hongbo Zhang; Chang-An Yuan; De-Shuang Huang
Journal:  PLoS One       Date:  2018-06-28       Impact factor: 3.240
