Junning Deng, Jefrey Lijffijt, Bo Kang, Tijl De Bie.
Abstract
Numerical time series data are pervasive, originating from sources as diverse as wearable devices, medical equipment, and sensors in industrial plants. In many cases, time series contain interesting information in terms of subsequences that recur in approximate form, so-called motifs. Major open challenges in this area include how one can formalize the interestingness of such motifs and how the most interesting ones can be found. We introduce a novel approach that tackles these issues. We formalize the notion of such subsequence patterns in an intuitive manner and present an information-theoretic approach for quantifying their interestingness with respect to any prior expectation a user may have about the time series. The resulting interestingness measure is thus a subjective measure, enabling a user to find motifs that are truly interesting to them. Although finding the best motif appears computationally intractable, we develop relaxations and a branch-and-bound approach implemented in a constraint programming solver. As shown in experiments on synthetic data and two real-world datasets, this enables us to mine interesting patterns in small or mid-sized time series.
Keywords: exploratory data mining; information theory; motif detection; pattern mining; subjective interestingness; time series
Year: 2019 PMID: 33267280 PMCID: PMC7515055 DOI: 10.3390/e21060566
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524
Figure 1. The algorithm correctly retrieves the two patterns in the synthetic data.
Figure 2. Three motif templates identified in the 20-second ECG recording.
Figure 3. Four motif templates identified in the Belgium power load data.
Run-time to search the initial motif set, with a pruning factor of 99%.
| n | l | Time (s) | n | l | Time (s) | n | l | Time (s) |
|---|---|---|---|---|---|---|---|---|
| 1800 | 100 | 9.96 | 3600 | 100 | 50.12 | 7200 | 100 | 369.92 |
| 7200 | 25 | 328.09 | 7200 | 50 | 350.65 | 7200 | 100 | 369.92 |
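
The first row of the table varies the series length n at a fixed motif length l = 100. As a rough reading aid, the sketch below estimates the empirical growth exponent between consecutive values of n from those three timings. The function name `doubling_exponents` and the dictionary `runtimes_by_n` are illustrative names introduced here, not part of the article, and three measurements are far too few to support a claim about asymptotic complexity.

```python
import math

# Run-times in seconds copied from the first table row (motif length l = 100).
runtimes_by_n = {1800: 9.96, 3600: 50.12, 7200: 369.92}


def doubling_exponents(times):
    """Estimate k in time ~ n**k between each pair of consecutive n values.

    For a pair (n1, t1), (n2, t2) the estimate is
    log(t2 / t1) / log(n2 / n1).
    """
    ns = sorted(times)
    estimates = []
    for n_small, n_large in zip(ns, ns[1:]):
        k = math.log(times[n_large] / times[n_small]) / math.log(n_large / n_small)
        estimates.append((n_small, n_large, k))
    return estimates


if __name__ == "__main__":
    for n_small, n_large, k in doubling_exponents(runtimes_by_n):
        print(f"n {n_small} -> {n_large}: time grows roughly like n^{k:.2f}")
```

Both estimates come out above 2, which suggests super-quadratic growth in n over this range. By contrast, the second row shows the run-time changing only mildly (328.09 s to 369.92 s) as l goes from 25 to 100 at n = 7200.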