Yuerong Tong, Jingyi Liu, Lina Yu, Liping Zhang, Linjun Sun, Weijun Li, Xin Ning, Jian Xu, Hong Qin, Qiang Cai.
Abstract
Time series appear in many scientific fields and are an important type of data, and time series analysis is an essential means of discovering the knowledge hidden in them. In recent years, research on time series has produced many results. A statistical analysis of 120,000 publications from 2017 to 2021 shows that current research on time series focuses mostly on classification and prediction. In this study, we therefore analyze the technical development routes of time series classification and prediction algorithms. We select 87 highly relevant, highly cited publications for analysis, aiming to provide a more comprehensive reference base for interested researchers. For time series classification, we cover supervised methods, semi-supervised methods, and early classification of time series, which are key extensions of the classification task. For time series prediction, we discuss the performance and applications of methods ranging from classical statistical methods to neural network methods, and then to fuzzy modeling and transfer learning methods. We hope this article helps readers understand the current state of development and identify possible future research directions, such as the interpretability of time series analysis and online learning modeling. ©2022 Tong et al.
Keywords: Classification; Evaluation models; Prediction; Time series analysis
Year: 2022 PMID: 35634126 PMCID: PMC9138170 DOI: 10.7717/peerj-cs.982
Source DB: PubMed Journal: PeerJ Comput Sci ISSN: 2376-5992
Figure 1. Subject category co-occurrence map of the time series literature (2017–2021).
Figure 2. Keyword co-occurrence map of the time series literature (2017–2021).
Related time series surveys.
| Theme | Related surveys | Topic | Key contributions |
|---|---|---|---|
| Prediction | ( | prediction | Review the time series prediction research of the past 25 years. |
| | ( | time series mining; event prediction | Classify and evaluate event prediction methods in time series. |
| | ( | time series prediction; machine learning | Review the time series prediction of machine learning technology in different states spanning ten years. |
| | ( | prediction; time series mining | Provide a detailed survey of various techniques used to predict different types of time series datasets, and discuss various performance evaluation parameters used to evaluate predictive models. |
| | ( | prediction; machine learning; energy prediction | Comprehensively review existing machine learning techniques used to predict time series energy consumption. |
| | ( | prediction; nonlinear time series; neural network | Summarize the research progress of artificial neural network methods in time series prediction models. |
| | ( | prediction; fuzzy time series | Summarize and review the contributions in the field of fuzzy time series prediction in the past 25 years. |
| | ( | prediction; mixed structure | Analyze various hybrid structures used in time series modeling and prediction. |
| | ( | prediction model; non-stationarity; conversion method | Review and analyze the conversion methods of non-stationary time series, and discuss their advantages and limitations on time series prediction problems. |
| | ( | prediction; deep learning; finance | Provide research on deep learning in the field of financial time series prediction. |
| | ( | counterfactual prediction; deep neural networks | Survey encoder–decoder designs for time series forecasting and recent developments in hybrid deep learning models. |
| | ( | intelligent predictors; hybrid modeling strategies | Analyze various components and combinations in mixed models for time series forecasting. |
| Classification | ( | classification; data mining technology | Study multiple time series and classification techniques and investigate various data mining methods for disease prediction. |
| | ( | deep learning; time series classification | Conduct empirical research on the latest deep neural network architectures for time series classification, and analyze the latest performance of deep learning algorithms for time series classification. |
| | ( | classification; distance | Summarize the development of distance-based time series classification methods. |
| | ( | clustering; classification; visualization; visual analysis | Clarify the main concepts of using clustering or classification algorithms in the visual analysis of time series data. |
| Data mining | ( | data mining; representation; similarity; segmentation; visualization | Comprehensively review the existing research on time series data mining and divide it into research directions such as representation and indexing, similarity measurement, segmentation, visualization, and mining. |
| | ( | data mining; machine learning | Summarize the existing data mining techniques for time series modeling and analysis and divide the main research directions of time series into three sub-fields: dimensionality reduction (time series representation), similarity measurement, and data mining tasks. |
| Clustering | ( | clustering; data mining; dimensionality reduction; distance measurement | Investigate the clustering of time series in various application fields such as science, engineering, business, finance, economics, health care, and government. |
| | ( | time series clustering; subsequence | Review the definition and background of subsequence time series clustering. |
| | ( | clustering; distance measurement; evaluation measures | Reveal the four main components of time series clustering, investigating the improvement trends in the efficiency, quality, and complexity of clustering time series methods over the past decade. |
| | ( | clustering; representative periods | Summarize time series analysis methods used in energy system optimization models. |
| Similarity measure | ( | time series data mining; time series similarity; mining accuracy | Analyze the advantages and disadvantages of current time series similarity measures, and the application of similarity measures in the clustering, classification, and regression of time series data. |
| | ( | multivariate time series; data mining; similarity; similarity search | Summarize the existing time series similarity measures, compare different methods of multivariate time series similarity search, and analyze their advantages and disadvantages. |
| Deep learning | ( | unsupervised feature learning; deep learning | Review the latest developments in deep learning and unsupervised feature learning for time series problems. |
| | ( | deep learning; prediction; classification; anomaly detection | Summarize the latest deep learning methods for time series prediction, classification, and anomaly detection from the aspects of application, network architecture, and ideas. |
| | ( | deep learning; forecasting | Evaluate the performance of several deep learning architectures on multiple datasets. |
| Change detection | ( | time series change detection | Comprehensively review four important aspects of Landsat time series-based change detection research: frequency, preprocessing, algorithm, and application. |
| | ( | online change detection; anomaly detection; time series segmentation | Summarize the main techniques of time series change-point detection, focusing on online methods. |
| Others | ( | correlation; reasoning; multivariate model; semi parametric estimation | Investigate estimation, inference methods, and goodness-of-fit tests for copula-based economic and financial time series models, as well as empirical applications of copulas in economic and financial time series. |
| | ( | hydrological time series analysis; wavelet transform | Summarize and review the research and application of wavelet transform methods in hydrological time series from six aspects. |
| | ( | empirical likelihood | Summarize the progress of empirical likelihood methods for time series data. |
| | ( | complexity test | Discuss complexity testing techniques for time series data. |
| | ( | autocorrelation function (ACF); count; thinning operator | Investigate the development of the field of integer-valued time series modeling, and review the literature on the most relevant thinning operators proposed in the analysis of univariate and multivariate integer-valued time series with limited or unlimited support. |
| | ( | regression analysis; artificial intelligence; exogenous variables; prediction scheme | Systematically review the literature on time series models with explanatory variables. |
| | ( | irreversibility; time-reversal symmetry | Review and compare important algorithms for testing the irreversibility of time series. |
Figure 3. Technology development routes.
Comparison of the accuracy of supervised time series classification.
| Category | 1NN-DTW | Shapelets | Shapelet transform | Shapelet transform | Shapelet learning | Shapelet learning | Shapelet learning | Shapelet learning |
|---|---|---|---|---|---|---|---|---|
| Method | 1NN-DTW | Fast shapelets | Shapelet transform | COTE | LTS | FLAG | RSLA-LS | RSLA-LZ |
| Adiac | 60.0 | 54.9 | 29.2 | 76.9 | 49.7 | 74.2 | | 73.9 |
| Beef | 63.3 | 56.7 | 50.0 | 80.0 | 83.3 | 80.0 | 83.3 | |
| Chlorine | 64.8 | 59.1 | 58.8 | 68.6 | 59.4 | 78.0 | 75.0 | |
| Coffee | | 96.4 | 96.4 | | | | | |
| Diatom | 96.4 | 87.9 | 72.2 | 89.2 | 96.7 | 96.4 | 96.7 | |
| DP_Little | 50.3 | 57.8 | 65.4 | – | | 65.7 | 69.1 | 69.8 |
| DP_Middle | 54.1 | 59.2 | 70.5 | – | 73.5 | 72.9 | 72.6 | |
| DP_Thumb | 53.0 | 59.1 | 58.1 | 65.4 | | 72.4 | 70.7 | 75.0 |
| ECGFiveDays | 78.7 | 99.5 | 77.5 | 99.9 | | 92.0 | | |
| FaceFour | 82.9 | 92.0 | 84.1 | 71.6 | 95.4 | 90.9 | 92.0 | |
| Gun_Point | 94.0 | 94.0 | 89.3 | 93.3 | | 96.7 | 96.7 | 99.3 |
| ItalyPower | 95.2 | 90.5 | 89.2 | 96.2 | 95.9 | 94.6 | 96.5 | |
| Lighting7 | 73.9 | 63.0 | 49.3 | 61.6 | 78.1 | 76.7 | 75.3 | |
| MedicalImages | | 60.5 | 48.8 | 67.1 | 67.8 | 72.4 | 71.4 | 73.4 |
| MoteStrain | 86.8 | 79.8 | 82.5 | 84.0 | 85.1 | 88.8 | | |
| MP_Little | 55.2 | 62.1 | 66.4 | – | | 71.8 | 73.6 | 73.6 |
| MP_Middle | 55.2 | 61.7 | 71.0 | – | 77.3 | 76.6 | 74.7 | |
| Otoliths | 59.3 | 60.9 | – | 60.9 | 67.2 | 64.1 | | 71.9 |
| PP_Little | 55.2 | 48.7 | 59.6 | – | | 68.5 | 71.6 | 70.5 |
| PP_Middle | 50.0 | 56.8 | 61.4 | – | 74.9 | 74.0 | 72.7 | |
| PP_Thumb | 51.2 | 58.9 | 60.8 | – | 70.1 | 68.4 | 69.8 | |
| Sony | 73.2 | 68.5 | – | 87.7 | 85.3 | 92.8 | 93.2 | |
| Symbols | 94.1 | 93.6 | 78.0 | | 93.9 | 87.5 | 91.3 | 92.3 |
| SyntheticC | 99.3 | 93.6 | 94.3 | 81.0 | | | | 99.0 |
| Trace | | | 98.0 | | | 99.0 | 98.0 | |
| TwoLeadECG | 89.3 | 94.6 | 85.0 | 91.6 | | 99.0 | 99.3 | 99.3 |
Notes.
A dash (-) indicates that there is no data available. The bold values represent the highest accuracy for each category.
Comparison of supervised time series classification.
| Category | Methods | Advantages | Disadvantages |
|---|---|---|---|
| 1NN-DTW | 1NN-DTW ( | Simple, no training needed | High time complexity of classification |
| Shapelets | Ye’s ( | High interpretability and robustness, low classification time complexity | High time complexity of the shapelet search; the time cost becomes unacceptable for long sequences |
| Shapelet transform | Line’s ( | High accuracy and flexibility | Long shapelet search time |
| Shapelet learning | LTS ( | High robustness, interpretability, discriminativeness | Long training time |
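To make the 1NN-DTW row concrete, the following is a minimal sketch of nearest-neighbor classification under dynamic time warping. It assumes a squared pointwise cost and no warping window, and the function names and toy data are hypothetical illustrations, not taken from the surveyed papers. The nested loops over both sequences also show why the table lists a high classification time cost: every query is compared against every training series at quadratic cost.

```python
from math import inf


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences.

    Assumes a squared pointwise cost and no warping window.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] ** 0.5


def classify_1nn_dtw(train_series, train_labels, query):
    """Return the label of the training series closest to `query` under DTW."""
    best_label, best_dist = None, inf
    for series, label in zip(train_series, train_labels):
        dist = dtw_distance(series, query)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


# Toy data, made up for illustration: a "bump" class and a "flat" class.
train_series = [[0, 1, 2, 1, 0], [0, 0, 0, 0, 0]]
train_labels = ["bump", "flat"]
# The query is a time-shifted, stretched version of the bump.
query = [0, 0, 1, 2, 2, 1, 0]
predicted = classify_1nn_dtw(train_series, train_labels, query)
```

In this toy case the query warps exactly onto the bump template, so it is assigned the "bump" label even though the two series differ in length. No training step is involved, which matches the "no training needed" advantage in the table.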
Comparison of the accuracy of semi-supervised classification methods.
| Datasets | Class number | Wei ( | DTW-D ( | SUCCESS ( | Xu ( | SSSL ( |
|---|---|---|---|---|---|---|
| Coffee | 2 | 57.1 | 60.1 | 63.2 | 58.8 | 79.2 |
| CBF | 3 | 99.5 | 83.3 | 99.7 | 92.1 | 100.0 |
| ECG | 2 | 76.3 | 95.3 | 77.5 | 81.9 | 79.3 |
| Face four | 4 | 81.8 | 78.2 | 80.0 | 83.3 | 85.1 |
| Gun point | 2 | 92.5 | 71.1 | 95.5 | 72.9 | 82.4 |
| ItalyPow.Dem | 2 | 93.4 | 66.4 | 92.4 | 77.2 | 94.1 |
| Lighting2 | 2 | 65.8 | 64.1 | 68.3 | 69.8 | 81.3 |
| Lighting7 | 7 | 46.4 | 50.3 | 47.1 | 51.1 | 79.6 |
| OSU leaf | 6 | 46.0 | 70.1 | 53.4 | 64.2 | 83.5 |
| Trace | 4 | 95.0 | 80.1 | 100.0 | 78.8 | 100.0 |
| WordsSyn | 25 | 59.0 | 86.3 | 61.8 | 63.9 | 87.5 |
| OliveOil | 4 | 63.3 | 73.2 | 61.7 | 63.9 | 77.6 |
| StarLight Curves | 3 | 86.0 | 74.3 | 80.0 | 75.5 | 87.2 |
Comparison of univariable accuracy in early classification.
| Methods | Wafer | Gun Point | Two patterns | ECG | Synthetic control | OliveOil | CBF |
|---|---|---|---|---|---|---|---|
| ECTS ( | 99.08 | 86.67 | 86.48 | 89.00 | 89.00 | 90.00 | 85.20 |
| RelaxedECTS ( | 99.08 | 86.67 | 86.35 | 89.00 | 88.30 | 90.00 | 85.20 |
| ECDIRE ( | 97.00 | 87.00 | 87.00 | 91.00 | 96.00 | 40.00 | 89.00 |
| EDSC ( | 99.00 | 94.00 | 80.00 | 85.00 | 89.00 | 60.00 | 84.00 |
Comparison of multivariable accuracy in early classification.
| Methods | Syn1 | Syn2 | Wafer | ECG |
|---|---|---|---|---|
| Class number | 2 | 3 | 2 | 2 |
| Variable number | 3 | 4 | 6 | 2 |
| MSD ( | 0.74 | 0.34 | 0.74 | 0.74 |
| MCFEC-QBC ( | 0.99 | 0.77 | 0.9 | 0.77 |
| MCFEC-Rule ( | 0.98 | 0.74 | 0.97 | 0.78 |
| EPIMTS ( | 0.98 | 0.99 | 0.96 | 0.84 |
Figure 4. Technology development routes.
Performance comparison of different methods.
| Methods | S&P 500 Index | S&P 500 Index | S&P 500 Index | Shanghai Composite Index | Shanghai Composite Index | Shanghai Composite Index | Hangzhou Temperature | Hangzhou Temperature | Hangzhou Temperature |
|---|---|---|---|---|---|---|---|---|---|
| Metric | RMSE | MAE | | RMSE | MAE | | RMSE | MAE | |
| ANN | 24.22 | 20.21 | 0.965 | 66.25 | 39.35 | 0.975 | 2.95 | 2.14 | 0.895 |
| UFCNN ( | 24.36 | 19.84 | 0.965 | 93.06 | 57.77 | 0.950 | | | |
| LSTM | 19.04 | 14.42 | 0.978 | | | | 2.86 | 2.09 | 0.901 |
| SeriesNet ( | | | | 63.94 | 38.37 | | 2.82 | 2.06 | 0.903 |
Notes.
The data are obtained from reference (Shen et al., 2020).
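For readers unfamiliar with the RMSE and MAE columns, the sketch below computes both from their standard definitions. The helper names and the toy numbers are illustrative assumptions and are unrelated to the values reported by Shen et al. (2020).

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error: square root of the mean squared residual."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy values, made up for illustration; residuals are 1, 0, -1, -2.
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 11.0]
rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Because RMSE squares the residuals before averaging, it penalizes the single residual of size 2 more heavily than MAE does, which is why the two metrics can rank methods differently.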
Performance comparison of different methods.
| Method category | Methods | Advantages | Disadvantages |
|---|---|---|---|
| Classical method | AR, MA, ARMA, ARIMA | Good at linear problems | Cannot handle nonlinear problems well |
| Traditional machine learning | SVM, LS-SVM ( | Able to solve complex time series data | Cannot handle nonlinear problems well |
| NN | ANN, BPNN, DE-BPNN ( | Able to handle nonlinear problems | Long-term dependence cannot be effectively preserved |
| LSTM | LSTM | Capable of capturing long-term dependencies; its structure suits sequence problems | Suffers from vanishing or exploding gradients and is difficult to train |
| CNN | CNN, UFCNN ( | Efficient | Difficult to capture long-term dependence |
| Hybrid model | ARIMA-ANN ( | Better performance | High complexity |
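As an illustration of the classical methods in the first row, here is a minimal sketch of an AR(1) model fitted by least squares. It assumes a zero-mean series with no intercept term, and `fit_ar1` and `forecast_ar1` are hypothetical helper names rather than any library's API. The model is linear in the previous value, which is the sense in which such methods suit linear problems.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in x_t = phi * x_{t-1} + noise.

    Assumes a zero-mean series and no intercept term.
    """
    pairs = list(zip(series[:-1], series[1:]))
    numerator = sum(prev * cur for prev, cur in pairs)
    denominator = sum(prev * prev for prev, _ in pairs)
    if denominator == 0:
        raise ValueError("series has no variation to fit")
    return numerator / denominator


def forecast_ar1(series, phi, steps=1):
    """Iterate the fitted recursion forward from the last observation."""
    forecasts = []
    last = series[-1]
    for _ in range(steps):
        last = phi * last
        forecasts.append(last)
    return forecasts


# Toy series, made up for illustration: each value is half the previous one.
series = [8.0, 4.0, 2.0, 1.0, 0.5]
phi = fit_ar1(series)
next_values = forecast_ar1(series, phi, steps=2)
```

On this noise-free toy series the estimate recovers the halving rule, so the forecasts continue the geometric decay. A series driven by a nonlinear rule would not be captured by a single coefficient like this.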