A disk-aware algorithm for time series motif discovery.

Abdullah Mueen¹, Eamonn Keogh¹, Qiang Zhu¹, Sydney S Cash², M Brandon Westover³, Nima Bigdely-Shamlo⁴.

Abstract

Time series motifs are sets of very similar subsequences of a long time series. They are of interest in their own right, and are also used as inputs to several higher-level data mining algorithms, including classification, clustering, rule discovery and summarization. Despite extensive research in recent years, finding time series motifs exactly in massive databases remains an open problem. Previous efforts either found approximate motifs or considered relatively small datasets residing in main memory. In this work, we build on previous work on pivot-based indexing to introduce a disk-aware algorithm that finds time series motifs exactly in multi-gigabyte databases containing on the order of tens of millions of time series. We have evaluated our algorithm on datasets from diverse areas, including medicine, anthropology, computer networking and image processing, and show that we can find interesting and meaningful motifs in datasets that are many orders of magnitude larger than anything considered before.
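To make the notion of an exact motif concrete, the following is a minimal in-memory sketch, not the disk-aware algorithm of the paper. It assumes the motif is the closest pair of equal-length series under Euclidean distance after z-normalization, and it prunes comparisons with a single randomly chosen reference series via the triangle inequality. The function names (`znorm`, `euclidean`, `brute_force_motif`, `pruned_motif`) and the `seed` parameter are hypothetical and chosen only for illustration.

```python
import math
import random


def znorm(series):
    """Rescale a series to zero mean and unit variance (constant series become zeros)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in series) / n)
    if std == 0.0:
        return [0.0] * n
    return [(v - mean) / std for v in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def brute_force_motif(collection):
    """Compare every pair; returns ((i, j), distance) for the closest pair."""
    data = [znorm(s) for s in collection]
    best, pair = math.inf, None
    for i in range(len(data)):
        for j in range(i + 1, len(data)):
            d = euclidean(data[i], data[j])
            if d < best:
                best, pair = d, (i, j)
    return pair, best


def pruned_motif(collection, seed=0):
    """Closest pair using one random reference series to skip comparisons.

    By the triangle inequality, |d(a, ref) - d(b, ref)| <= d(a, b), so the
    difference of reference distances is a lower bound on the true distance.
    """
    data = [znorm(s) for s in collection]
    rng = random.Random(seed)
    ref = data[rng.randrange(len(data))]  # assumption: a single random reference
    ref_dist = [euclidean(s, ref) for s in data]
    # Sort indices by distance to the reference so lower bounds grow along the scan.
    order = sorted(range(len(data)), key=ref_dist.__getitem__)
    best, pair = math.inf, None
    for a in range(len(order)):
        i = order[a]
        for b in range(a + 1, len(order)):
            j = order[b]
            # Lower bound already reaches the best distance: no later j can improve it.
            if ref_dist[j] - ref_dist[i] >= best:
                break
            d = euclidean(data[i], data[j])
            if d < best:
                best, pair = d, (min(i, j), max(i, j))
    return pair, best
```

Calling `pruned_motif(collection)` on a list of equal-length numeric lists returns the index pair of the closest two series and their distance. Barring exact ties, it agrees with `brute_force_motif` while typically computing fewer distances. Everything here is held in memory, so the sketch does not address the disk-access concerns that the abstract describes.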

Keywords: Bottom-up search; Pruning; Random references; Time series motifs

Year: 2010; PMID: 32153346; PMCID: PMC7062370; DOI: 10.1007/s10618-010-0176-8

Source DB: PubMed; Journal: Data Min Knowl Discov; ISSN: 1384-5810; Impact factor: 3.670
