
Addressing Big Data Time Series: Mining Trillions of Time Series Subsequences Under Dynamic Time Warping.

Thanawin Rakthanmanon, Bilson Campana, Abdullah Mueen, Gustavo Batista, Brandon Westover, Qiang Zhu, Jesin Zakaria, Eamonn Keogh.

Abstract

Most time series data mining algorithms use similarity search as a core subroutine, and thus the time taken for similarity search is the bottleneck for virtually all time series data mining algorithms, including classification, clustering, motif discovery, anomaly detection, and so on. The difficulty of scaling a search to large datasets explains to a great extent why most academic work on time series data mining has plateaued at considering a few million time series objects, while much of industry and science sits on billions of time series objects waiting to be explored. In this work we show that by using a combination of four novel ideas we can search and mine massive time series for the first time. We demonstrate the following unintuitive fact: in large datasets we can exactly search under Dynamic Time Warping (DTW) much more quickly than the current state-of-the-art Euclidean distance search algorithms. We demonstrate our work on the largest set of time series experiments ever attempted. In particular, the largest dataset we consider is larger than the combined size of all of the time series datasets considered in all data mining papers ever published. We explain how our ideas allow us to solve higher-level time series data mining problems such as motif discovery and clustering at scales that would otherwise be untenable. Moreover, we show how our ideas allow us to efficiently support the uniform scaling distance measure, a measure whose utility seems to be underappreciated, but which we demonstrate here. In addition to mining massive datasets with up to one trillion datapoints, we will show that our ideas also have implications for real-time monitoring of data streams, allowing us to handle much faster arrival rates and/or use cheaper and lower-powered devices than are currently possible.
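
For readers unfamiliar with the setting, the following is a minimal sketch of exact subsequence search under DTW with lower-bound pruning. It is an illustration only and is not the paper's method: the abstract does not describe the four ideas, so everything here is an assumption, including the function names (`znorm`, `lb_first_last`, `dtw_sq`, `best_match`), the z-normalization of every candidate, the Sakoe-Chiba band parameter `window`, and the choice of a simple first/last-point lower bound.

```python
import math


def znorm(xs):
    """Z-normalize a sequence (assumption: constant sequences map to zeros)."""
    n = len(xs)
    mean = sum(xs) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in xs]


def lb_first_last(a, b):
    """Cheap lower bound on squared DTW: every warping path must align the
    first points with each other and the last points with each other."""
    if len(a) == 1:
        return (a[0] - b[0]) ** 2
    return (a[0] - b[0]) ** 2 + (a[-1] - b[-1]) ** 2


def dtw_sq(a, b, window, best_so_far=math.inf):
    """Squared DTW between equal-length sequences inside a Sakoe-Chiba band.
    Returns inf early if a whole row already reaches best_so_far."""
    n = len(a)
    prev = [math.inf] * (n + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [math.inf] * (n + 1)
        lo = max(1, i - window)
        hi = min(n, i + window)
        for j in range(lo, hi + 1):
            cost = (a[i - 1] - b[j - 1]) ** 2
            cur[j] = cost + min(prev[j], prev[j - 1], cur[j - 1])
        # Costs are non-negative, so the final value cannot drop below
        # the smallest entry of any completed row.
        if min(cur[lo:hi + 1]) >= best_so_far:
            return math.inf
        prev = cur
    return prev[n]


def best_match(series, query, window):
    """Return (start index, DTW distance) of the best-matching subsequence."""
    m = len(query)
    q = znorm(query)
    best_dist, best_pos = math.inf, -1
    for start in range(len(series) - m + 1):
        cand = znorm(series[start:start + m])
        # Skip the expensive DTW when the lower bound cannot beat the best.
        if lb_first_last(q, cand) >= best_dist:
            continue
        d = dtw_sq(q, cand, window, best_dist)
        if d < best_dist:
            best_dist, best_pos = d, start
    return best_pos, math.sqrt(best_dist)


if __name__ == "__main__":
    series = [5, 5, 5, 6, 5, 5, 10, 12, 14, 12, 10, 5, 5, 6, 5]
    query = [0.0, 1.0, 2.0, 1.0, 0.0]
    print(best_match(series, query, window=1))
```

The search stays exact because a candidate is skipped only when a quantity that can never exceed its true DTW distance already matches or exceeds the best distance found so far.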


Keywords:  Algorithms; Experimentation; Time series; lower bounds; similarity search

Year:  2013        PMID: 31607834      PMCID: PMC6790126     

Source DB:  PubMed          Journal:  ACM Trans Knowl Discov Data        ISSN: 1556-4681            Impact factor:   2.713


  2 in total

1.  A unified framework for gesture recognition and spatiotemporal gesture segmentation.

Authors:  Jonathan Alon; Vassilis Athitsos; Quan Yuan; Stan Sclaroff
Journal:  IEEE Trans Pattern Anal Mach Intell       Date:  2009-09       Impact factor: 6.226

2.  Comparative and demographic analysis of orang-utan genomes.

Authors:  Devin P Locke; LaDeana W Hillier; Wesley C Warren; Kim C Worley; Lynne V Nazareth; Donna M Muzny; Shiaw-Pyng Yang; Zhengyuan Wang; Asif T Chinwalla; Pat Minx; Makedonka Mitreva; Lisa Cook; Kim D Delehaunty; Catrina Fronick; Heather Schmidt; Lucinda A Fulton; Robert S Fulton; Joanne O Nelson; Vincent Magrini; Craig Pohl; Tina A Graves; Chris Markovic; Andy Cree; Huyen H Dinh; Jennifer Hume; Christie L Kovar; Gerald R Fowler; Gerton Lunter; Stephen Meader; Andreas Heger; Chris P Ponting; Tomas Marques-Bonet; Can Alkan; Lin Chen; Ze Cheng; Jeffrey M Kidd; Evan E Eichler; Simon White; Stephen Searle; Albert J Vilella; Yuan Chen; Paul Flicek; Jian Ma; Brian Raney; Bernard Suh; Richard Burhans; Javier Herrero; David Haussler; Rui Faria; Olga Fernando; Fleur Darré; Domènec Farré; Elodie Gazave; Meritxell Oliva; Arcadi Navarro; Roberta Roberto; Oronzo Capozzi; Nicoletta Archidiacono; Giuliano Della Valle; Stefania Purgato; Mariano Rocchi; Miriam K Konkel; Jerilyn A Walker; Brygg Ullmer; Mark A Batzer; Arian F A Smit; Robert Hubley; Claudio Casola; Daniel R Schrider; Matthew W Hahn; Victor Quesada; Xose S Puente; Gonzalo R Ordoñez; Carlos López-Otín; Tomas Vinar; Brona Brejova; Aakrosh Ratan; Robert S Harris; Webb Miller; Carolin Kosiol; Heather A Lawson; Vikas Taliwal; André L Martins; Adam Siepel; Arindam Roychoudhury; Xin Ma; Jeremiah Degenhardt; Carlos D Bustamante; Ryan N Gutenkunst; Thomas Mailund; Julien Y Dutheil; Asger Hobolth; Mikkel H Schierup; Oliver A Ryder; Yuko Yoshinaga; Pieter J de Jong; George M Weinstock; Jeffrey Rogers; Elaine R Mardis; Richard A Gibbs; Richard K Wilson
Journal:  Nature       Date:  2011-01-27       Impact factor: 69.504

  7 in total

1.  Distance metrics optimized for clustering temporal dietary patterning among U.S. adults.

Authors:  Heather A Eicher-Miller; Saul Gelfand; Youngha Hwang; Edward Delp; Anindya Bhadra; Jiaqi Guo
Journal:  Appetite       Date:  2019-09-12       Impact factor: 3.868

2.  A network-based deep learning methodology for stratification of tumor mutations.

Authors:  Chuang Liu; Zhen Han; Zi-Ke Zhang; Ruth Nussinov; Feixiong Cheng
Journal:  Bioinformatics       Date:  2021-01-08       Impact factor: 6.937

3.  Development of a Deep Learning Model for Dynamic Forecasting of Blood Glucose Level for Type 2 Diabetes Mellitus: Secondary Analysis of a Randomized Controlled Trial.

Authors:  Syed Hasib Akhter Faruqui; Yan Du; Rajitha Meka; Adel Alaeddini; Chengdong Li; Sara Shirinkam; Jing Wang
Journal:  JMIR Mhealth Uhealth       Date:  2019-11-01       Impact factor: 4.773

4.  Deep Neural Networks for the Classification of Pure and Impure Strawberry Purees.

Authors:  Zhong Zheng; Xin Zhang; Jinxing Yu; Rui Guo; Lili Zhangzhong
Journal:  Sensors (Basel)       Date:  2020-02-23       Impact factor: 3.576

5.  Realtime Tracking of Passengers on the London Underground Transport by Matching Smartphone Accelerometer Footprints.

Authors:  Khuong An Nguyen; You Wang; Guang Li; Zhiyuan Luo; Chris Watkins
Journal:  Sensors (Basel)       Date:  2019-09-26       Impact factor: 3.576

6.  NEXGB: A Network Embedding Framework for Anticancer Drug Combination Prediction.

Authors:  Fanjie Meng; Feng Li; Jin-Xing Liu; Junliang Shang; Xikui Liu; Yan Li
Journal:  Int J Mol Sci       Date:  2022-08-30       Impact factor: 6.208

7.  Impact of socioeconomic status, population mobility and control measures on COVID-19 development in major cities of China.

Authors:  Sisi Wang; Yuanqing Ye; Xiaolin Xu; Sicong Wang; Xin Xu; Changzheng Yuan; Shu Li; Shuyin Cao; Wenyuan Li; Chen Chen; Kejia Hu; Hao Lei; Hui Zhu; Yong Zhu; Xifeng Wu
Journal:  Zhejiang Da Xue Xue Bao Yi Xue Ban       Date:  2021-02-25
