Xike Zhang, Qiuwen Zhang, Gui Zhang, Zhiping Nie, Zifan Gui, Huafei Que.
Abstract
Daily land surface temperature (LST) forecasting is of great significance for climate-related, agricultural, eco-environmental, or industrial studies. Hybrid data-driven prediction models that couple Ensemble Empirical Mode Decomposition (EEMD) with Machine Learning (ML) algorithms are useful for these purposes because they reduce the difficulty of modeling, require less historical data, are easy to develop, and are less complex than physical models. In this article, a computationally simple, less data-intensive, fast and efficient novel hybrid data-driven model that couples EEMD with a Long Short-Term Memory (LSTM) neural network, namely EEMD-LSTM, is proposed to reduce the difficulty of modeling and to improve prediction accuracy. The daily LST data series from the Mapoling and Zhijiang stations in the Dongting Lake basin, central south China, from 1 January 2014 to 31 December 2016 are used as a case study. The EEMD is first employed to decompose the original daily LST data series into several Intrinsic Mode Functions (IMFs) and a single residue item. Then, the Partial Autocorrelation Function (PACF) is used to determine the number of input data sample points for the LSTM models. Next, LSTM models are constructed to predict each decomposed component. The predicted results of all components are aggregated to give the final daily LST. Finally, the prediction performance of the hybrid EEMD-LSTM model is assessed in terms of the Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Pearson Correlation Coefficient (CC) and Nash-Sutcliffe Coefficient of Efficiency (NSCE).
To validate the hybrid data-driven model, the hybrid EEMD-LSTM model is compared with five other models: the Recurrent Neural Network (RNN), LSTM, EMD-RNN (Empirical Mode Decomposition (EMD) coupled with RNN), EMD-LSTM and EEMD-RNN. The comparison results demonstrate that the hybrid EEMD-LSTM model performs better than the other five models. The scatterplots of the predicted results of the six models versus the original daily LST data series also show that the hybrid EEMD-LSTM model is superior to the other five models. It is concluded that the hybrid EEMD-LSTM model proposed in this study is a suitable tool for temperature forecasting.
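The six scores named in the abstract all have standard definitions, which the following minimal sketch illustrates. It is not the authors' code: the function name `forecast_metrics` is hypothetical, the input values are toy numbers rather than data from the study, and skipping zero-valued observations in MAPE is an assumption made here because the percentage error is undefined at 0 °C.

```python
import numpy as np

def forecast_metrics(observed, predicted):
    """Return MSE, MAE, MAPE, RMSE, CC and NSCE for one forecast series."""
    obs = np.asarray(observed, dtype=float)
    pred = np.asarray(predicted, dtype=float)
    err = pred - obs
    mse = np.mean(err ** 2)
    # Assumption: observations equal to zero are excluded from MAPE,
    # because the relative error is undefined there.
    nonzero = obs != 0
    mape = 100.0 * np.mean(np.abs(err[nonzero] / obs[nonzero]))
    return {
        "MSE": mse,
        "MAE": np.mean(np.abs(err)),
        "MAPE": mape,  # in percent
        "RMSE": np.sqrt(mse),
        "CC": np.corrcoef(obs, pred)[0, 1],  # Pearson correlation
        # Nash-Sutcliffe: 1 minus error variance relative to observed variance.
        "NSCE": 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2),
    }

# Toy values for illustration only.
scores = forecast_metrics([10.0, 12.0, 14.0, 16.0], [11.0, 12.0, 13.0, 16.0])
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Lower MSE, MAE, MAPE and RMSE indicate a better forecast, while CC and NSCE approach 1 as the prediction approaches the observations.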
Keywords: Dongting Lake basin; Ensemble Empirical Mode Decomposition (EEMD); Long Short-Term Memory (LSTM); Neural Network (NN); daily land surface temperature; data-driven; forecasting; hybrid model
Year: 2018 PMID: 29883381 PMCID: PMC5982071 DOI: 10.3390/ijerph15051032
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. The architecture of (a) a traditional Artificial Neural Network (ANN) and (b) a Recurrent Neural Network (RNN).
Figure 2. The architecture of the Long Short-Term Memory (LSTM) neural network.
Figure 3. The architecture of the proposed Ensemble Empirical Mode Decomposition (EEMD)-LSTM neural network hybrid data-driven model.
Figure 4. (a) Location of the Dongting Lake basin in central south China; (b) Composition of the basin; (c) Distribution of the Mapoling and Zhijiang meteorological stations.
Figure 5. Daily land surface temperature (LST) data series of Mapoling station (upper) and Zhijiang station (lower) from 1 January 2014 to 31 December 2016.
Figure 6. Decomposition results of the original daily LST data series of Mapoling station by EEMD.
Figure A1. Decomposition results of the original daily surface temperature data series of Zhijiang station by EEMD.
Statistics of the original daily LST data series and the decomposition results of Mapoling station.
| Series | Period | Min. | Max. | Mean | Variance | SD 1 | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| Original data set | 1 January 2014 to 31 December 2016 | −1.5 | 32.8 | 17.6599 | 67.8232 | 8.2355 | −0.2036 | −1.0706 |
| | 1 January 2014 to 30 June 2016 (Training) | −1.5 | 32 | 17.0957 | 65.1861 | 8.0738 | −0.2059 | −1.0738 |
| | 1 July 2016 to 31 December 2016 (Testing) | 0.8 | 32.8 | 20.4565 | 71.4957 | 8.4555 | −0.3571 | −1.1502 |
| IMF1 | 1 January 2014 to 31 December 2016 | −3.7604 | 3.9356 | −0.0045 | 1.076 | 1.0373 | 0.0456 | 1.2377 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.7604 | 3.9356 | −0.0047 | 1.1645 | 1.0791 | 0.0466 | 0.9976 |
| | 1 July 2016 to 31 December 2016 (Testing) | −2.9097 | 2.8863 | −0.0037 | 0.6374 | 0.7984 | 0.0279 | 2.92 |
| IMF2 | 1 January 2014 to 31 December 2016 | −4.1524 | 4.2432 | −0.008 | 1.508 | 1.228 | 0.0174 | 0.5498 |
| | 1 January 2014 to 30 June 2016 (Training) | −4.1524 | 4.2432 | −0.0063 | 1.4944 | 1.2224 | 0.0309 | 0.4341 |
| | 1 July 2016 to 31 December 2016 (Testing) | −4.1085 | 3.6196 | −0.0162 | 1.5756 | 1.2552 | −0.0441 | 1.1147 |
| IMF3 | 1 January 2014 to 31 December 2016 | −4.1166 | 4.8691 | −0.0506 | 1.734 | 1.3168 | 0.0441 | 1.1287 |
| | 1 January 2014 to 30 June 2016 (Training) | −4.1166 | 4.8691 | −0.0763 | 1.6987 | 1.3034 | 0.0537 | 1.3837 |
| | 1 July 2016 to 31 December 2016 (Testing) | −3.9231 | 3.6299 | 0.0768 | 1.8891 | 1.3745 | −0.025 | 0.1554 |
| IMF4 | 1 January 2014 to 31 December 2016 | −2.9359 | 3.4556 | −0.0027 | 1.1967 | 1.0939 | −0.0078 | 0.0501 |
| | 1 January 2014 to 30 June 2016 (Training) | −2.9359 | 3.4556 | −0.0072 | 1.2543 | 1.12 | 0.0216 | 0.0981 |
| | 1 July 2016 to 31 December 2016 (Testing) | −2.1632 | 2.0125 | 0.0197 | 0.9102 | 0.9541 | −0.2184 | −0.7181 |
| IMF5 | 1 January 2014 to 31 December 2016 | −3.5915 | 4.826 | −0.044 | 1.2316 | 1.1098 | 0.0722 | 3.0066 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.5915 | 4.826 | 0.0478 | 1.0681 | 1.0335 | 0.294 | 5.0861 |
| | 1 July 2016 to 31 December 2016 (Testing) | −2.4797 | 1.6551 | −0.499 | 1.7933 | 1.3392 | 0.0258 | −1.2748 |
| IMF6 | 1 January 2014 to 31 December 2016 | −10.941 | 11.8481 | 0.7883 | 49.9635 | 7.0685 | −0.124 | −1.4036 |
| | 1 January 2014 to 30 June 2016 (Training) | −10.941 | 10.1515 | 0.1742 | 47.1743 | 6.8684 | −0.0974 | −1.4479 |
| | 1 July 2016 to 31 December 2016 (Testing) | −9.9938 | 11.8481 | 3.8317 | 52.6572 | 7.2565 | −0.4786 | −1.234 |
| IMF7 | 1 January 2014 to 31 December 2016 | −0.9518 | 1.2903 | −0.0991 | 0.4445 | 0.6667 | 0.6826 | −0.5425 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.9518 | 1.2903 | −0.0261 | 0.4801 | 0.6929 | 0.5038 | −0.8375 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.8916 | 0.2486 | −0.4609 | 0.1108 | 0.3328 | 0.5094 | −0.959 |
| IMF8 | 1 January 2014 to 31 December 2016 | −0.1752 | 0.2321 | 0.0247 | 0.0217 | 0.1472 | 0.0304 | −1.5499 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.1749 | 0.2321 | 0.0593 | 0.0188 | 0.1371 | −0.304 | −1.3216 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.1752 | −0.0809 | −0.1463 | 0.0008 | 0.0281 | 0.7649 | −0.6715 |
| IMF9 | 1 January 2014 to 31 December 2016 | −0.067 | 0.0673 | 0.0225 | 0.0016 | 0.0401 | −0.6397 | −0.8557 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.067 | 0.0673 | −0.0274 | 0.0005 | 0.0216 | −0.138 | −1.1845 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.067 | 0.0073 | 17.0341 | 0.3258 | 0.5708 | −0.5572 | −0.9412 |
| Residue | 1 January 2014 to 31 December 2016 | 15.7958 | 17.7251 | 17.0341 | 0.3258 | 0.5708 | −0.5572 | −0.9412 |
| | 1 January 2014 to 30 June 2016 (Training) | 15.7958 | 17.6306 | 16.9026 | 0.2884 | 0.537 | −0.4171 | −1.0568 |
| | 1 July 2016 to 31 December 2016 (Testing) | 17.6314 | 17.7251 | 17.6859 | 0.0008 | 0.0274 | −0.337 | −1.1062 |
1 SD represents the standard deviation. The minimum, maximum and mean values are given in °C.
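The columns of the table above can be reproduced from any raw series with a few lines of NumPy. The following is a minimal sketch, not the authors' code: the function name `describe_series` is hypothetical, the use of the sample variance (`ddof=1`) is an assumption, and kurtosis is computed as excess kurtosis because the table reports negative values, which ordinary kurtosis cannot take. The input is a toy series, not data from the study.

```python
import numpy as np

def describe_series(values):
    """Summary statistics matching the table columns for one series."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    # Assumption: sample variance (ddof=1); the source does not state
    # which estimator was used.
    variance = x.var(ddof=1)
    centered = x - mean
    m2 = np.mean(centered ** 2)  # second central moment
    m3 = np.mean(centered ** 3)  # third central moment
    m4 = np.mean(centered ** 4)  # fourth central moment
    return {
        "Min.": x.min(),
        "Max.": x.max(),
        "Mean": mean,
        "Variance": variance,
        "SD": np.sqrt(variance),
        "Skewness": m3 / m2 ** 1.5,
        "Kurtosis": m4 / m2 ** 2 - 3.0,  # excess kurtosis (normal = 0)
    }

# Toy series for illustration only.
stats = describe_series([1.0, 2.0, 3.0, 4.0, 5.0])
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Applying the same function separately to the full, training and testing periods of each component would yield the three rows per series shown in the table.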
Statistics of the original daily surface temperature data series and the decomposition results of Zhijiang station.
| Series | Period | Min. | Max. | Mean | Variance | SD 1 | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|---|
| Original data set | 1 January 2014 to 31 December 2016 | −0.7 | 31.8 | 18.0538 | 63.8016 | 7.9876 | −0.2423 | −1.1268 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.7 | 30.85 | 17.5038 | 61.452 | 7.8391 | −0.2228 | −1.1137 |
| | 1 July 2016 to 31 December 2016 (Testing) | 3.15 | 31.8 | 20.7796 | 66.5181 | 8.1559 | −0.491 | −1.1334 |
| IMF1 | 1 January 2014 to 31 December 2016 | −3.4876 | 3.3164 | −0.0006 | 1.0313 | 1.0155 | −0.006 | 0.5469 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.4876 | 3.3164 | −0.0038 | 1.0904 | 1.0442 | 0.0142 | 0.4111 |
| | 1 July 2016 to 31 December 2016 (Testing) | −2.6808 | 2.4776 | 0.0154 | 0.738 | 0.8591 | −0.1602 | 1.4322 |
| IMF2 | 1 January 2014 to 31 December 2016 | −4.9666 | 4.6916 | −0.0137 | 1.4461 | 1.2025 | −0.0209 | 1.3251 |
| | 1 January 2014 to 30 June 2016 (Training) | −4.9666 | 4.6916 | −0.0151 | 1.4498 | 1.2041 | −0.0145 | 1.3825 |
| | 1 July 2016 to 31 December 2016 (Testing) | −3.5345 | 3.7363 | −0.0067 | 1.4281 | 1.195 | −0.0533 | 1.0959 |
| IMF3 | 1 January 2014 to 31 December 2016 | −3.9453 | 4.4972 | −0.0368 | 1.5168 | 1.2316 | 0.0842 | 0.8224 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.9453 | 4.4972 | −0.0552 | 1.5108 | 1.2292 | 0.0978 | 0.899 |
| | 1 July 2016 to 31 December 2016 (Testing) | −3.6699 | 3.4617 | 0.0546 | 1.5364 | 1.2395 | 0.0151 | 0.5368 |
| IMF4 | 1 January 2014 to 31 December 2016 | −3.2607 | 3.8574 | −0.0131 | 1.0952 | 1.0465 | −0.0819 | 1.0059 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.2607 | 3.8574 | −0.0216 | 1.1767 | 1.0848 | −0.0533 | 0.9823 |
| | 1 July 2016 to 31 December 2016 (Testing) | −2.0462 | 1.8588 | 0.029 | 0.6889 | 0.83 | −0.2818 | −0.2253 |
| IMF5 | 1 January 2014 to 31 December 2016 | −3.8762 | 5.271 | 0.0365 | 1.05 | 1.0247 | 0.3922 | 6.5503 |
| | 1 January 2014 to 30 June 2016 (Training) | −3.8762 | 5.271 | 0.0542 | 1.1226 | 1.0595 | 0.4193 | 6.897 |
| | 1 July 2016 to 31 December 2016 (Testing) | −1.2743 | 1.1222 | −0.051 | 0.6811 | 0.8253 | −0.1313 | −1.4911 |
| IMF6 | 1 January 2014 to 31 December 2016 | −11.189 | 11.9534 | 0.7483 | 46.884 | 6.8472 | −0.1075 | −1.3765 |
| | 1 January 2014 to 30 June 2016 (Training) | −10.6116 | 10.1684 | 0.1262 | 42.3756 | 6.5097 | −0.0985 | −1.4462 |
| | 1 July 2016 to 31 December 2016 (Testing) | −11.189 | 11.9534 | 3.8314 | 57.8063 | 7.603 | −0.5543 | −1.1442 |
| IMF7 | 1 January 2014 to 31 December 2016 | −1.6727 | 1.9026 | −0.0502 | 0.7532 | 0.8679 | 0.9044 | −0.1937 |
| | 1 January 2014 to 30 June 2016 (Training) | −1.6727 | 1.9026 | 0.047 | 0.8221 | 0.9067 | 0.7038 | −0.6052 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.9499 | 0.2884 | −0.5321 | 0.1329 | 0.3645 | 0.6622 | −0.7966 |
| IMF8 | 1 January 2014 to 31 December 2016 | −0.496 | 0.5448 | 0.0197 | 0.1395 | 0.3736 | 0.0206 | −1.5351 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.496 | 0.5448 | 0.0921 | 0.1335 | 0.3654 | −0.3236 | −1.362 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.4884 | −0.0747 | −0.339 | 0.0148 | 0.1216 | 0.569 | −0.9077 |
| IMF9 | 1 January 2014 to 31 December 2016 | −0.0582 | 0.0586 | 0.0196 | 0.0012 | 0.0349 | −0.6397 | −0.8556 |
| | 1 January 2014 to 30 June 2016 (Training) | −0.0582 | 0.0586 | 0.0283 | 0.0009 | 0.0306 | −1.1161 | 0.3131 |
| | 1 July 2016 to 31 December 2016 (Testing) | −0.0582 | 0.0068 | −0.0235 | 0.0004 | 0.0189 | −0.1403 | −1.1839 |
| Residue | 1 January 2014 to 31 December 2016 | 16.2408 | 17.8054 | 17.345 | 0.2196 | 0.4686 | −0.7969 | −0.652 |
| | 1 January 2014 to 30 June 2016 (Training) | 16.2408 | 17.8008 | 17.2537 | 0.2142 | 0.4628 | −0.5833 | −0.9155 |
| | 1 July 2016 to 31 December 2016 (Testing) | 17.7763 | 17.8054 | 17.798 | 0.0001 | 0.0083 | −1.1065 | −0.0229 |
1 SD represents the standard deviation. The minimum, maximum and mean values are given in °C.
Figure 7. PACF graphs for the original daily LST data series and the decomposition results of the Mapoling station.
Figure A2. PACF graphs for the original daily surface temperature data series and the decomposition results of the Zhijiang station.
Figure 8. Performance comparison of the forecasting results of (a) Mapoling station and (b) Zhijiang station among RNN, LSTM, Empirical Mode Decomposition (EMD)-RNN, EMD-LSTM, EEMD-RNN and EEMD-LSTM.
Figure 9. Scatterplot of the daily LST comparison of Mapoling station (left) and Zhijiang station (right) between (a) original data and RNN; (b) original data and LSTM; (c) original data and EMD-RNN; (d) original data and EMD-LSTM; (e) original data and EEMD-RNN; (f) original data and EEMD-LSTM from 1 July to 31 December 2016.
Figure 10. Bar plots of the residuals of the (a) Mapoling station and (b) Zhijiang station for original vs. EEMD-LSTM.
Figure A3. Bar plots of the residuals of the (a) Mapoling station and (b) Zhijiang station for original vs. EEMD-LSTM. The dotted lines mark the 95% confidence level.
Figure 11. Bar charts of the statistical summary of the six models’ prediction results versus the original daily LST data series of (a) Mapoling station and (b) Zhijiang station (** indicates a significance level of 0.01).
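The abstract states that the PACF (Figures 7 and A2) is used to obtain the number of input sample points for each LSTM model. The sketch below illustrates one way this could be done and is not the authors' procedure: the function names are hypothetical, the PACF is estimated with the Durbin-Levinson recursion, and the selection rule (count the leading lags whose PACF lies outside the approximate 95% bound ±1.96/√n) is an assumption. The series is synthetic, not data from the study.

```python
import numpy as np

def pacf(series, max_lag):
    """Partial autocorrelations for lags 0..max_lag (Durbin-Levinson)."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    n = len(x)
    denom = np.dot(x, x)
    # Sample autocorrelations r[0..max_lag].
    r = np.array([np.dot(x[: n - k], x[k:]) / denom for k in range(max_lag + 1)])
    pac = np.zeros(max_lag + 1)
    pac[0] = 1.0
    phi_prev = np.zeros(0)  # coefficients of the order k-1 model
    for k in range(1, max_lag + 1):
        num = r[k] - np.dot(phi_prev, r[k - 1:0:-1])
        den = 1.0 - np.dot(phi_prev, r[1:k])
        phi_kk = num / den
        phi_prev = np.append(phi_prev - phi_kk * phi_prev[::-1], phi_kk)
        pac[k] = phi_kk
    return pac

def select_n_inputs(series, max_lag=20):
    """Assumed rule: count leading lags outside the 95% confidence bound."""
    bound = 1.96 / np.sqrt(len(series))
    pac = pacf(series, max_lag)
    n_inputs = 0
    for k in range(1, max_lag + 1):
        if abs(pac[k]) <= bound:
            break
        n_inputs = k
    return max(n_inputs, 1)  # always feed at least one past value

def make_windows(series, n_inputs):
    """Turn a series into (samples, targets) for one-step-ahead prediction."""
    x = np.asarray(series, dtype=float)
    X = np.stack([x[i : i + n_inputs] for i in range(len(x) - n_inputs)])
    y = x[n_inputs:]
    return X, y

# Synthetic AR(1) series for illustration only.
rng = np.random.default_rng(0)
noise = rng.normal(size=2000)
demo = np.zeros(2000)
for t in range(1, 2000):
    demo[t] = 0.8 * demo[t - 1] + noise[t]

n_inputs = select_n_inputs(demo)
X, y = make_windows(demo, n_inputs)
print(n_inputs, X.shape, y.shape)
```

In the hybrid model, a selection of this kind would be repeated for every IMF and for the residue, so that each component gets its own input length before its LSTM model is trained.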