Mining frequent biological sequences based on bitmap without candidate sequence generation.

Qian Wang, Darryl N Davis, Jiadong Ren.

Abstract

Biological sequences carry a great deal of important genetic information about organisms. Furthermore, there is an inheritance law related to protein function and structure that is useful for applications such as disease prediction. Frequent sequence mining is a core technique for association rule discovery, but existing algorithms suffer from low efficiency or high error rates because biological sequences have more distinctive characteristics than general sequences. In this paper, FBSB, an algorithm for mining Frequent Biological Sequence based on Bitmap, is proposed. FBSB uses bitmaps as a simple data structure and transforms each row into a quicksort list (QS-list) for sequence growth. To meet the continuity and accuracy requirements of biological sequence mining, the sequences tested during the FBSB mining process are real ones rather than generated candidates, and all frequent sequences can be mined without errors. Experimental results show that, compared with other algorithms, FBSB achieves better performance in both run time and scalability.
Copyright © 2015 Elsevier Ltd. All rights reserved.
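
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of growing contiguous frequent patterns with bitmaps. It is not the authors' FBSB algorithm and does not reproduce the QS-list. The function names `build_bitmaps` and `mine_frequent_contiguous`, the use of Python integers as bitmaps, and the definition of support as the number of sequences containing a pattern are all assumptions made for illustration. As a simplification, the sketch tries each frequent symbol as an extension, whereas the abstract states that FBSB tests only sequences that actually occur.

```python
from collections import defaultdict


def build_bitmaps(sequences):
    """Map each symbol to one integer bitmap per sequence.

    Bit i of bitmaps[symbol][sid] is set when sequences[sid][i] == symbol.
    """
    bitmaps = defaultdict(lambda: [0] * len(sequences))
    for sid, seq in enumerate(sequences):
        for pos, symbol in enumerate(seq):
            bitmaps[symbol][sid] |= 1 << pos
    return dict(bitmaps)


def mine_frequent_contiguous(sequences, min_support):
    """Return {pattern: support} for contiguous patterns.

    Support is assumed here to be the number of input sequences that
    contain the pattern as a contiguous substring.
    """
    bitmaps = build_bitmaps(sequences)

    # Keep only symbols that are frequent on their own; an infrequent
    # symbol can never extend a pattern into a frequent one.
    frequent_symbols = {
        symbol: masks
        for symbol, masks in bitmaps.items()
        if sum(1 for m in masks if m) >= min_support
    }

    frequent = {}
    stack = []  # items: (pattern, per-sequence bitmaps of end positions)
    for symbol, masks in frequent_symbols.items():
        frequent[symbol] = sum(1 for m in masks if m)
        stack.append((symbol, masks))

    while stack:
        pattern, end_masks = stack.pop()
        for symbol, sym_masks in frequent_symbols.items():
            # Shift end positions by one and intersect with the symbol's
            # bitmap: a surviving bit marks where pattern + symbol ends.
            grown = [(e << 1) & s for e, s in zip(end_masks, sym_masks)]
            support = sum(1 for m in grown if m)
            if support >= min_support:
                frequent[pattern + symbol] = support
                stack.append((pattern + symbol, grown))
    return frequent
```

For example, calling `mine_frequent_contiguous(["ACGT", "ACGA", "TACG"], 3)` returns the patterns `A`, `C`, `G`, `AC`, `CG`, and `ACG`, each with support 3. The shift-and-AND step keeps matches contiguous, which reflects the continuity requirement mentioned in the abstract.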

Keywords:  Biological sequence; Bitmap; Frequent pattern; Quicksort list

Year: 2015    PMID: 26773937    DOI: 10.1016/j.compbiomed.2015.12.016

Source DB: PubMed    Journal: Comput Biol Med    ISSN: 0010-4825    Impact factor: 4.589


  1 in total

1.  MpBsmi: A new algorithm for the recognition of continuous biological sequence pattern based on index structure.

Authors:  Weina Li; Jiadong Ren
Journal:  PLoS One       Date:  2018-04-23       Impact factor: 3.240

