Frequent patterns mining in multiple biological sequences.

Ling Chen, Wei Liu.

Abstract

Existing algorithms for mining frequent patterns in multiple biosequences may generate multiple projected databases and short candidate patterns, which can increase computation time and memory requirements. To overcome these shortcomings, we propose a fast and efficient algorithm for mining frequent patterns in multiple biological sequences (MSPM). We first introduce the concept of a primary pattern, which can be extended to form larger patterns in the sequence. To detect frequent primary patterns, a prefix tree is constructed. Based on this prefix tree, a pattern-extending approach is presented that mines frequent patterns without producing a large number of irrelevant candidate patterns. The experimental results show that the MSPM algorithm not only runs faster but also produces higher-quality results than the other methods compared.
Copyright © 2013 Elsevier Ltd. All rights reserved.
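
The abstract's two-stage idea (find frequent short patterns with a prefix tree, then extend them) can be illustrated with a minimal sketch. This is not the MSPM algorithm itself: the abstract does not define a primary pattern, so the sketch assumes it is a contiguous substring of fixed length `k`, and it assumes support means the number of sequences containing the pattern. All names (`TrieNode`, `build_prefix_tree`, `frequent_kmers`, `extend_patterns`, `mine`) are hypothetical.

```python
from collections import defaultdict


class TrieNode:
    def __init__(self):
        self.children = {}
        # Indices of the sequences that contain the pattern ending at this node.
        self.seq_ids = set()


def build_prefix_tree(sequences, k):
    """Insert every length-k substring of every sequence into a prefix tree."""
    root = TrieNode()
    for sid, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            node = root
            for ch in seq[start:start + k]:
                node = node.children.setdefault(ch, TrieNode())
            node.seq_ids.add(sid)
    return root


def frequent_kmers(root, k, min_support):
    """Collect length-k patterns found in at least min_support sequences."""
    result = {}
    stack = [(root, "")]
    while stack:
        node, prefix = stack.pop()
        if len(prefix) == k:
            if len(node.seq_ids) >= min_support:
                result[prefix] = set(node.seq_ids)
            continue
        for ch, child in node.children.items():
            stack.append((child, prefix + ch))
    return result


def extend_patterns(sequences, seeds, min_support):
    """Grow frequent patterns one symbol to the right while they stay frequent.

    Assumption: patterns are contiguous, so only extensions that actually
    follow an occurrence of a frequent pattern are ever generated.
    """
    frequent = dict(seeds)
    frontier = dict(seeds)
    while frontier:
        next_frontier = {}
        for pattern, sids in frontier.items():
            extensions = defaultdict(set)
            for sid in sids:
                seq = sequences[sid]
                pos = seq.find(pattern)
                while pos != -1:
                    end = pos + len(pattern)
                    if end < len(seq):
                        extensions[pattern + seq[end]].add(sid)
                    pos = seq.find(pattern, pos + 1)
            for longer, support_set in extensions.items():
                if len(support_set) >= min_support:
                    next_frontier[longer] = support_set
        frequent.update(next_frontier)
        frontier = next_frontier
    return frequent


def mine(sequences, k, min_support):
    root = build_prefix_tree(sequences, k)
    seeds = frequent_kmers(root, k, min_support)
    return extend_patterns(sequences, seeds, min_support)


if __name__ == "__main__":
    seqs = ["ACGTAC", "TACGTT", "GGACGA"]
    for pattern, sids in sorted(mine(seqs, k=2, min_support=3).items()):
        print(pattern, sorted(sids))
```

Because a contiguous pattern can be frequent only if its prefix is frequent, extending only the surviving patterns avoids enumerating candidates that never occur, which is the general motivation the abstract gives for a pattern-extending approach.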

Keywords:  Biological sequence; Frequent pattern mining; Prefix tree; Primary pattern

Year:  2013        PMID: 24034736     DOI: 10.1016/j.compbiomed.2013.07.009

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  1 in total

1.  MpBsmi: A new algorithm for the recognition of continuous biological sequence pattern based on index structure.

Authors:  Weina Li; Jiadong Ren
Journal:  PLoS One       Date:  2018-04-23       Impact factor: 3.240
