Yun Yang, ChongJun Fan, HongLin Xiong.
Abstract
Accurately predicting data flow is an important and challenging problem in industrial automation. However, because data types are diverse, traditional time-series prediction models struggle to perform well across different types of data. To improve the versatility and accuracy of the model, this paper proposes a novel hybrid time-series prediction model based on recursive empirical mode decomposition (REMD) and long short-term memory (LSTM). In REMD-LSTM, we first propose a new REMD to overcome the marginal effects and mode confusion problems of traditional decomposition methods. REMD is then used to decompose the data stream into multiple intrinsic mode functions (IMFs). Next, LSTM predicts each IMF subsequence separately to obtain the corresponding prediction results. Finally, the prediction for the input data is obtained by summing the prediction results of all IMF subsequences. The experimental results show that the prediction accuracy of the proposed model is more than 20% higher than that of the LSTM algorithm. In addition, the model achieves the highest prediction accuracy on all of the different types of data sets. This shows that the proposed model has an advantage in prediction accuracy and versatility over state-of-the-art models. The data used in the experiment can be downloaded from https://github.com/Yang-Yun726/REMD-LSTM.
Keywords: Data decomposition; REMD-LSTM; Time series prediction
Year: 2021; PMID: 34764604; PMCID: PMC8178659; DOI: 10.1007/s10489-021-02442-y
Source DB: PubMed; Journal: Appl Intell (Dordr); ISSN: 0924-669X; Impact factor: 5.086
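The decompose, predict, and sum structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `decompose` is a hypothetical moving-average stand-in for REMD, and `persistence_forecast` is a hypothetical stand-in for the per-IMF LSTM. The only property the sketch relies on is that the components add back up to the original series, so that summing the per-component predictions gives a prediction for the series itself.

```python
from typing import Callable, List, Sequence

Series = List[float]


def moving_average(x: Sequence[float], window: int) -> Series:
    """Centered moving average; the window is truncated at the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def decompose(x: Sequence[float], windows: Sequence[int] = (3, 9)) -> List[Series]:
    """Hypothetical stand-in for REMD (NOT the paper's algorithm).

    Each pass peels off the detail that smoothing removes; the final
    smooth remainder is the last component. The components sum to x.
    """
    components, residual = [], list(x)
    for w in windows:
        smooth = moving_average(residual, w)
        components.append([r - s for r, s in zip(residual, smooth)])
        residual = smooth
    components.append(residual)
    return components


def persistence_forecast(component: Sequence[float]) -> float:
    """Hypothetical stand-in for a per-component LSTM: repeat the last value."""
    return component[-1]


def hybrid_forecast(
    x: Sequence[float],
    predict: Callable[[Sequence[float]], float] = persistence_forecast,
) -> float:
    """Predict each component separately, then sum the predictions."""
    return sum(predict(c) for c in decompose(x))


if __name__ == "__main__":
    series = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0, 5.0, 8.0]
    print(len(decompose(series)), "components")
    print("next-step forecast:", hybrid_forecast(series))
```

With the persistence stand-in, the summed forecast simply reproduces the last observation (up to rounding), which makes the additive bookkeeping easy to check. In the model the abstract describes, a trained LSTM would take the place of `predict` for each IMF.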
Fig. 1 LSTM memory cell structure
Fig. 2 Improved EMD (left) and EMD (right)
Fig. 3 REMD-LSTM model structure diagram
Fig. 4 The time-series diagram of four types of data
Statistics of four types of time series data
| Data | Max | Min | Mean | Var | Std | Median |
|---|---|---|---|---|---|---|
| Stock | 1536.8 | 104.39 | 788.3333 | 134,557.5 | 366.82 | 772.915 |
| Restaurant Sales | 28.641 | 1.3916 | 2.798125 | 4.559266 | 2.1352 | 2.241667 |
| Commodity | 27.229 | 18.219 | 19.01126 | 1.505774 | 1.2271 | 18.83379 |
| Satisfaction Rate | 0.9784 | 0 | 0.546747 | 0.040573 | 0.2014 | 0.58065 |
Fig. 5 Comparison of REMD-LSTM and LSTM time consumption
Fig. 6 The prediction results of REMD-LSTM and LSTM
Fig. 7 The prediction results of the four algorithms
The prediction error results of four algorithms
| Index | Algorithm | Stock | Restaurant | Commodity | Satisfaction |
|---|---|---|---|---|---|
| MSE | REMD-LSTM | 113.04 | 0.08 | 5.03553E-05 | 0.016 |
| MSE | ARIMA | 146.32 | 0.264 | 6.73191E-05 | 0.064 |
| MSE | SVM | 3812.9 | 0.094 | 0.00124 | 0.033 |
| MSE | LSTM | 256.71 | 0.1059 | 0.00161 | 0.037 |
| MSE | Prophet | 8212.28 | 0.21742 | 0.03179 | 0.03378 |
| MSE | XGBoost | 3280.33 | 0.10211 | 0.000222 | 0.03381 |
| RMSE | REMD-LSTM | 10.63 | 0.28 | 0.007096145 | 0.129 |
| RMSE | ARIMA | 12.096 | 0.514 | 0.008204823 | 0.254 |
| RMSE | SVM | 61.749 | 0.3066 | 0.03532 | 0.182 |
| RMSE | LSTM | 16.022 | 0.32543 | 0.04023 | 0.192 |
| RMSE | Prophet | 16.022 | 0.32543 | 0.04023 | 0.192 |
| RMSE | XGBoost | 45.5547 | 0.26565 | 0.01056 | 0.16152 |
| MAE | REMD-LSTM | 8.164 | 0.23 | 0.005454375 | 0.106 |
| MAE | ARIMA | 10.074 | 0.422 | 0.006328749 | 0.198 |
| MAE | SVM | 48.457 | 0.256 | 0.03354 | 0.154 |
| MAE | LSTM | 13.072 | 0.23741 | 0.02196 | 0.145 |
| MAE | Prophet | 83.7201 | 0.3853 | 0.14336 | 0.15852 |
| MAE | XGBoost | 57.2741 | 0.31954 | 0.01492 | 0.18388 |
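The record does not spell out how the three error indices are computed. The sketch below assumes the standard definitions of MSE, RMSE, and MAE; the toy `actual` and `predicted` lists are made up for illustration and are not data from the paper. It also checks that, for the Stock column, the REMD-LSTM, ARIMA, SVM, and LSTM entries satisfy RMSE = sqrt(MSE) up to rounding.

```python
import math
from typing import Sequence


def mse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean squared error (standard definition, assumed here)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error (standard definition, assumed here)."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# (MSE, RMSE) pairs copied from the Stock column of the table.
REPORTED_STOCK = {
    "REMD-LSTM": (113.04, 10.63),
    "ARIMA": (146.32, 12.096),
    "SVM": (3812.9, 61.749),
    "LSTM": (256.71, 16.022),
}

if __name__ == "__main__":
    # Toy values for illustration only.
    actual = [3.0, 5.0, 2.0, 7.0]
    predicted = [2.0, 5.0, 4.0, 8.0]
    print("MSE:", mse(actual, predicted))
    print("RMSE:", rmse(actual, predicted))
    print("MAE:", mae(actual, predicted))

    for name, (m, r) in REPORTED_STOCK.items():
        print(name, "sqrt(MSE) =", round(math.sqrt(m), 3), "reported RMSE =", r)
```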
Fig. 8 The prediction effect of four algorithms on four sets of data
Friedman test result
| Data | REMD-LSTM | ARIMA | SVM | LSTM | Prophet | XGBoost |
|---|---|---|---|---|---|---|
| Stock | 1 | 2 | 5 | 3 | 6 | 4 |
| Restaurant | 1 | 6 | 2 | 4 | 5 | 3 |
| Commodity | 1 | 2 | 4 | 5 | 6 | 3 |
| Satisfaction | 1 | 6 | 2 | 5 | 3 | 4 |
| Average value | 1 | 4 | 3.25 | 4.25 | 5 | 3.5 |
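The "Average value" row can be reproduced from the per-data-set ranks, which follow the ordering of the MSE values in the error table. The sketch below does that and also evaluates the textbook Friedman chi-square statistic, 12N / (k(k+1)) · (Σ R_j² − k(k+1)² / 4), where R_j is the average rank of algorithm j. The statistic is not reported in this record, so the computed value is an illustration under that standard formula rather than a result taken from the paper.

```python
from typing import Dict, List

ALGORITHMS = ["REMD-LSTM", "ARIMA", "SVM", "LSTM", "Prophet", "XGBoost"]

# Ranks copied from the table (1 = best), one row per data set.
RANKS: Dict[str, List[int]] = {
    "Stock": [1, 2, 5, 3, 6, 4],
    "Restaurant": [1, 6, 2, 4, 5, 3],
    "Commodity": [1, 2, 4, 5, 6, 3],
    "Satisfaction": [1, 6, 2, 5, 3, 4],
}


def average_ranks(ranks: Dict[str, List[int]]) -> List[float]:
    """Average rank of each algorithm over all data sets."""
    rows = list(ranks.values())
    return [sum(col) / len(rows) for col in zip(*rows)]


def friedman_statistic(ranks: Dict[str, List[int]]) -> float:
    """Friedman chi-square statistic from N data sets and k algorithms."""
    rows = list(ranks.values())
    n, k = len(rows), len(rows[0])
    avg = average_ranks(ranks)
    return 12 * n / (k * (k + 1)) * (sum(r * r for r in avg) - k * (k + 1) ** 2 / 4)


if __name__ == "__main__":
    for name, rank in zip(ALGORITHMS, average_ranks(RANKS)):
        print(f"{name}: {rank}")
    print("Friedman statistic:", friedman_statistic(RANKS))
```

Deciding significance would additionally require comparing the statistic against a chi-square distribution with k − 1 degrees of freedom, which this sketch leaves out.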
Fig. 9 The influence of the number of cycles (real time = time × 100)