
NOSEP: Nonoverlapping Sequence Pattern Mining With Gap Constraints.


Abstract

Sequence pattern mining aims to discover frequent subsequences as patterns in a single sequence or a sequence database. By combining gap constraints (or flexible wildcards), users can specify special characteristics of the patterns and discover meaningful subsequences suitable for their own application domains, such as finding gene transcription sites from DNA sequences or discovering patterns for time series data classification. Due to the inherent complexity of sequence patterns, including the exponential candidate space with respect to pattern letters and gap constraints, to date, existing sequence pattern mining methods are either incomplete or do not support the Apriori property because the support ratio of a pattern may be greater than that of its subpatterns. Most importantly, patterns discovered by these methods are either too restrictive or too general and cannot represent underlying meaningful knowledge in the sequences. In this paper, we focus on a nonoverlapping sequence pattern mining task with gap constraints, where a nonoverlapping sequence pattern allows sequence letters to be flexibly and maximally utilized for pattern discovery. A new Apriori-based nonoverlapping sequence pattern mining algorithm, NOSEP, is proposed. NOSEP is a complete pattern mining algorithm, which uses a specially designed data structure, Nettree, to calculate the exact occurrence of a pattern in the sequence. Experimental results and comparisons on biology DNA sequences, time series data, and Gazelle datasets demonstrate the efficiency of the proposed algorithm and the uniqueness of nonoverlapping sequence patterns compared to other methods.
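To make the nonoverlapping condition concrete, here is a minimal sketch, not the NOSEP algorithm and not its Nettree structure. It assumes the common reading in which a pattern is a string of letters with the same gap range between consecutive letters, and two occurrences are nonoverlapping when no sequence position is used at the same pattern index by both (a position may still be reused at a different pattern index). The function name `nonoverlapping_occurrences` and the parameters `min_gap` and `max_gap` are hypothetical, chosen for illustration. The search is a greedy leftmost-first scan with backtracking, so the count it returns should be read as a lower bound on the support rather than a guaranteed exact value.

```python
from typing import List, Optional, Set, Tuple


def nonoverlapping_occurrences(
    seq: str, pattern: str, min_gap: int, max_gap: int
) -> List[Tuple[int, ...]]:
    """Greedily collect nonoverlapping occurrences of `pattern` in `seq`.

    An occurrence is a tuple of 0-based positions, one per pattern letter,
    where the number of skipped characters between consecutive positions
    lies in [min_gap, max_gap]. Illustrative sketch only, not NOSEP itself.
    """
    m = len(pattern)
    if m == 0:
        return []
    # used[j] holds the sequence positions already consumed at pattern index j.
    used: List[Set[int]] = [set() for _ in range(m)]

    def extend(j: int, pos: int) -> Optional[List[int]]:
        # Try to place pattern[j] at pos and match the rest, leftmost first.
        if seq[pos] != pattern[j] or pos in used[j]:
            return None
        if j == m - 1:
            return [pos]
        lo = pos + min_gap + 1
        hi = min(pos + max_gap + 1, len(seq) - 1)
        for nxt in range(lo, hi + 1):
            tail = extend(j + 1, nxt)
            if tail is not None:
                return [pos] + tail
        return None  # backtrack: no valid continuation from this position

    found: List[Tuple[int, ...]] = []
    for start in range(len(seq)):
        occ = extend(0, start)
        if occ is not None:
            for j, pos in enumerate(occ):
                used[j].add(pos)
            found.append(tuple(occ))
    return found


# Position 1 serves as the second letter of one occurrence and the first
# letter of the next, which the nonoverlapping condition permits.
print(nonoverlapping_occurrences("AAAA", "AA", 0, 1))
# → [(0, 1), (1, 2), (2, 3)]

# One or two characters must be skipped between 'A' and 'C'.
print(nonoverlapping_occurrences("ATCTAGC", "AC", 1, 2))
# → [(0, 2), (4, 6)]
```

The number of tuples returned is the greedy occurrence count; dividing it by a normalising quantity to obtain a support ratio is left out because the abstract does not specify that definition.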

Year:  2017        PMID: 28976327     DOI: 10.1109/TCYB.2017.2750691

Source DB:  PubMed          Journal:  IEEE Trans Cybern        ISSN: 2168-2267            Impact factor:   11.448


  2 in total

1.  NetNMSP: Nonoverlapping maximal sequential pattern mining.

Authors:  Yan Li; Shuai Zhang; Lei Guo; Jing Liu; Youxi Wu; Xindong Wu
Journal:  Appl Intell (Dordr)       Date:  2022-01-10       Impact factor: 5.019

2.  NetNCSP: Nonoverlapping closed sequential pattern mining.

Authors:  Youxi Wu; Changrui Zhu; Yan Li; Lei Guo; Xindong Wu
Journal:  Knowl Based Syst       Date:  2020-03-31       Impact factor: 8.038

