Rabin K Jana, Indranil Ghosh.
Abstract
The natural gas price is an essential financial variable that needs periodic modeling and predictive analysis for many practical implications. Macroeconomic euphoria and external uncertainty make its evolutionary patterns highly complex. We propose a two-stage granular framework to perform predictive analysis of the natural gas futures for the USA (NGF-USA) and the UK natural gas futures for the EU (NGF-UK) for the pre- and during-COVID-19 phases. The residuals of the previous stage are introduced as a new explanatory feature along with standard technical indicators to perform predictive tasks. The importance of the new feature is explained through the Boruta feature evaluation methodology. Maximal Overlap Discrete Wavelet Transformation (MODWT) is applied to decompose the original time-series observations of the natural gas prices to enable granular level forecasting. Random Forest is invoked on each component to fetch the respective predictions. The aggregated component-wise sums lead to final predictions. A rigorous performance assessment signifies the efficacy of the proposed framework. The results show the effectiveness of the residual as a feature in deriving accurate forecasts. The framework is highly efficient in analyzing patterns in the presence of a limited number of data points during the uncertain COVID-19 phase covering the first and second waves of the pandemic. Our findings reveal that the prediction accuracy is the best for the NGF-UK in the pre-COVID-19 period. Also, the prediction accuracy of the NGF-USA is better in the COVID-19 period than in the pre-COVID-19 period.
Keywords: Boruta; COVID-19; Ensemble machine learning; Maximal Overlap Discrete Wavelet Transformation; Natural gas futures; Residual
Year: 2022 PMID: 35095152 PMCID: PMC8783804 DOI: 10.1007/s10479-021-04492-4
Source DB: PubMed Journal: Ann Oper Res ISSN: 0254-5330 Impact factor: 4.854
Fig. 1 Evolutionary patterns of natural gas futures. Figure 1 represents the temporal patterns of natural gas futures for the UK and USA during the mentioned periods. The x-axis shows the duration and the y-axis of figures (a) and (b) shows the natural gas futures for the UK in US$ per MMBtu. The y-axis of figures (c) and (d) shows the NGF-USA in US$ per MMBtu
Descriptive statistics
| Statistic | UK-Pre | UK-During | US-Pre | US-During |
|---|---|---|---|---|
| Mean | 0.60 | 0.44 | 2.79 | 2.30 |
| SD | 0.21 | 0.22 | 0.56 | 0.48 |
| Median | 0.53 | 0.40 | 2.65 | 2.40 |
| Min | 0.31 | 0.10 | 2.07 | 1.48 |
| Max | 1.03 | 1.05 | 4.84 | 3.35 |
| MAD | 0.22 | 0.30 | 0.46 | 0.71 |
| Skewness | 0.46 | 0.34 | 1.53 | 0.05 |
| Kurtosis | −1.28 | −1.04 | 2.25 | −1.42 |
| N.valid | 367 | 367 | 367 | 367 |
| Jarque–Bera Test | 37.55*** | 23.05*** | 228.99*** | 30.72*** |
| Shapiro–Wilk Test | 0.89*** | 0.94*** | 0.85*** | 0.92*** |
| Kolmogorov–Smirnov Test | 0.62*** | 0.54*** | 0.98*** | 0.94*** |
| Anderson–Darling Test | 15.22*** | 6.74*** | 14.45*** | 11.44*** |
| Augmented Dickey-Fuller Test | −1.18# | −2.39# | −2.60# | −3.24# |
| Ljung–Box Test | 361.79*** | 359.78*** | 352.36*** | 354.17*** |
| Terasvirta’s Neural Network Test | 8.07** | 23.41*** | 30.54*** | 6.96** |
| White’s Neural Network Test | 8.15** | 17.22*** | 8.76*** | 6.32** |
| Hurst Exponent | 0.86 | 0.86 | 0.83 | 0.86 |
*** Significant at the 1% level, ** significant at the 5% level, * significant at the 10% level, # not significant. US-Pre and UK-Pre denote NGF-USA and NGF-UK from August 09, 2018 to December 30, 2019, respectively. Similarly, US-During and UK-During denote NGF-USA and NGF-UK from December 31, 2019 to May 20, 2021, respectively
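As a rough plausibility check on the table, the Jarque–Bera statistic can be recomputed from the reported sample size, skewness and kurtosis. The sketch below assumes the textbook form JB = n/6 · (S² + K²/4) and assumes the tabulated kurtosis is excess kurtosis (several entries are negative); the function name `jarque_bera_from_moments` is illustrative and not taken from the source.

```python
def jarque_bera_from_moments(n, skewness, excess_kurtosis):
    """Textbook Jarque-Bera statistic from summary moments.

    Assumption: `excess_kurtosis` is kurtosis minus 3, which is how the
    negative table entries are interpreted here.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return n / 6.0 * (skewness ** 2 + excess_kurtosis ** 2 / 4.0)


# UK-Pre column of the descriptive statistics table (rounded inputs).
jb_uk_pre = jarque_bera_from_moments(367, 0.46, -1.28)
print(f"JB recomputed from rounded moments: {jb_uk_pre:.2f}")
```

For UK-Pre the recomputed value lands close to, but not exactly on, the reported 37.55. The inputs are rounded to two decimals and the moment estimators used by the authors are not stated in this section, so a small gap is expected.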
Fig. 2 MODWT decomposition of natural gas futures
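The framework forecasts each decomposed component separately and sums the component forecasts, which relies on the decomposition being additive. The sketch below illustrates that additivity with a simple Haar à trous-style decomposition written in plain Python. It is a stand-in chosen for brevity, not the MODWT multiresolution analysis used in the paper, and the function name and sample prices are hypothetical.

```python
def haar_additive_decomposition(series, levels):
    """Split a series into `levels` detail components and one smooth component.

    Stand-in for a MODWT multiresolution analysis: each level averages a
    point with the one 2**(j-1) samples earlier (circular boundary), and the
    detail is the difference between successive smooths. By construction,
    the details and the final smooth sum back to the original series.
    """
    n = len(series)
    if n == 0 or levels < 1:
        raise ValueError("need a non-empty series and at least one level")
    previous_smooth = list(series)
    details = []
    for level in range(1, levels + 1):
        step = 2 ** (level - 1)
        smooth = [
            0.5 * (previous_smooth[t] + previous_smooth[(t - step) % n])
            for t in range(n)
        ]
        details.append([p - s for p, s in zip(previous_smooth, smooth)])
        previous_smooth = smooth
    return details, previous_smooth


# Hypothetical price path, only for illustration.
prices = [2.79, 2.65, 2.70, 2.90, 3.10, 2.95, 2.60, 2.40, 2.35, 2.50]
details, smooth = haar_additive_decomposition(prices, levels=4)  # D1..D4 and S4

# Summing the components recovers the original observations, which is why
# component-wise forecasts can be aggregated into a final forecast.
rebuilt = [sum(d[t] for d in details) + smooth[t] for t in range(len(prices))]
print(max(abs(r - p) for r, p in zip(rebuilt, prices)))
```

In the paper's setting, a Random Forest would be fitted to each of D1 to D4 and S4, and the five predictions added in the same way as `rebuilt` above.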
Feature importance of the decomposed components of UK-Pre (Pre COVID-19 NGF-UK)
| Series | Mean of squared residuals | % Var explained | Feature #1 | Feature #2 | Feature #3 | Execution Time (sec) |
|---|---|---|---|---|---|---|
| D1 (F) | 0.000045 | 61.33 | EMA12 (8.15) | MTM5 (7.82) | 0.41 | |
| D1 (NF) | 0.000114 | 3.04 | LAG1 (8.91) | MTM5 (7.84) | WMS (6.56) | 0.52 |
| D2 (F) | 0.000041 | 75.9 | LOSS (20.29) | WMS (16.71) | 0.43 | |
| D2 (NF) | 0.000085 | 50.58 | LOSS (21.62) | WMS (17.33) | LAG1 (13.89) | 0.36 |
| D3 (F) | 0.000031 | 87.03 | LAG1 (20.48) | WMS (15.48) | 0.49 | |
| D3 (NF) | 0.000044 | 81.69 | LAG1 (21.90) | LOSS (16.61) | WMS (15.88) | 0.37 |
| D4 (F) | 0.000017 | 95.74 | LAG1 (19.76) | B5 (16.46) | 0.36 | |
| D4 (NF) | 0.000020 | 94.73 | LAG1 (19.23) | LOSS (16.06) | B5 (14.61) | 0.36 |
| S4 (F) | 0.000054 | 99.88 | MTM10 (10.53) | B10 (9.63) | LAG1 (7.87) | 0.37 |
| S4 (NF) | 0.000056 | 99.87 | MTM20 (9.47) | MTM5 (9.33) | LAG1 (8.86) | 0.34 |
Feature importance of the decomposed components of UK-During (During COVID-19 NGF-UK)
| Series | Mean of squared residuals | % Var explained | Feature #1 | Feature #2 | Feature #3 | Execution Time (sec) |
|---|---|---|---|---|---|---|
| D1 (F) | 0.000049 | 56.29 | UB (7.86) | EMA10 (6.11) | 0.47 | |
| D1 (NF) | 0.000133 | 18.06 | UB (10.29) | EMA10 (7.48) | EMA12 (7.35) | 0.50 |
| D2 (F) | 0.000091 | 51.93 | WMS (11.96) | LAG1 (11.07) | 0.36 | |
| D2 (NF) | 0.000144 | 23.71 | LAG1 (14.05) | WMS (12.04) | EMA5 (8.75) | 0.38 |
| D3 (F) | 0.000062 | 75.20 | LAG1 (17.64) | MTM5 (12.46) | 0.44 | |
| D3 (NF) | 0.000078 | 68.73 | LAG1 (19.87) | EMA5 (12.50) | B5 (12.18) | 0.42 |
| D4 (F) | 0.000017 | 97.20 | LAG1 (18.18) | MTM5 (13.49) | 0.36 | |
| D4 (NF) | 0.000019 | 96.72 | LAG1 (18.37) | MTM5 (14.74) | LAG2 (12.85) | 0.36 |
| S4 (F) | 0.000101 | 99.78 | LAG1 (12.68) | LAG2 (12.21) | ROC20 (11.28) | 0.33 |
| S4 (NF) | 0.000101 | 99.78 | LAG1 (12.20) | LAG2 (11.75) | EMA5 (11.11) | 0.34 |
Feature importance of the decomposed components of US-Pre (Pre COVID-19 NGF-USA)
| Series | Mean of squared residuals | % Var explained | Feature #1 | Feature #2 | Feature #3 | Execution Time (sec) |
|---|---|---|---|---|---|---|
| D1 (F) | 0.003222 | 33.03 | MACD (5.35) | MA20 (4.81) | 0.52 | |
| D1 (NF) | 0.003923 | 25.46 | LAG5 (5.81) | EMA12 (4.76) | MACD (4.44) | 0.62 |
| D2 (F) | 0.001165 | 68.57 | LAG1 (13.89) | EMA5 (10.94) | 0.42 | |
| D2 (NF) | 0.002380 | 35.79 | LAG1 (15.37) | EMA5 (12.82) | WMS (12.27) | 0.48 |
| D3 (F) | 0.000822 | 85.56 | LAG1 (19.39) | EMA5 (12.47) | 0.39 | |
| D3 (NF) | 0.001133 | 80.10 | LAG1 (20.72) | EMA5 (13.52) | MTM10 (12.64) | 0.52 |
| D4 (F) | 0.000661 | 94.74 | LAG1 (19.39) | LAG2 (13.81) | EMA5 (12.50) | 0.37 |
| D4 (NF) | 0.000796 | 93.67 | LAG1 (19.98) | LAG2 (15.33) | EMA5 (12.69) | 0.37 |
| S4 (F) | 0.001700 | 99.43 | LAG1 (13.24) | LAG2 (11.82) | 0.34 | |
| S4 (NF) | 0.001964 | 99.34 | LAG1 (13.20) | LAG2 (13.01) | EMA5 (11.21) | 0.34 |
Feature importance of the decomposed components of US-During (During COVID-19 NGF-USA)
| Series | Mean of squared residuals | % Var explained | Feature #1 | Feature #2 | Feature #3 | Execution Time (sec) |
|---|---|---|---|---|---|---|
| D1 (F) | 0.000575 | 64.74 | LAG1 (6.60) | LAG2 (6.56) | 0.44 | |
| D1 (NF) | 0.001687 | 23.43 | LAG2 (7.74) | LAG5 (7.46) | LAG1 (7.04) | 0.45 |
| D2 (F) | 0.000596 | 77.62 | LOSS (22.42) | WMS (15.07) | 0.39 | |
| D2 (NF) | 0.001317 | 50.53 | LOSS (23.58) | LAG1 (15.17) | WMS (13.90) | 0.39 |
| D3 (F) | 0.000448 | 88.69 | LAG1 (21.49) | LOSS (17.89) | 0.36 | |
| D3 (NF) | 0.000695 | 82.44 | LAG1 (22.52) | LOSS (20.67) | WMS (15.29) | 0.36 |
| D4 (F) | 0.000289 | 95.52 | LAG1 (20.56) | LAG2 (14.27) | 0.36 | |
| D4 (NF) | 0.000341 | 94.71 | LAG1 (20.10) | LOSS (15.39) | MTM5 (14.00) | 0.36 |
| S4 (F) | 0.000649 | 99.69 | LAG1 (12.24) | LAG2 (11.53) | B20 (11.19) | 0.34 |
| S4 (NF) | 0.000730 | 99.65 | LAG1 (12.11) | LAG2 (11.26) | EMA5 (11.15) | 0.34 |
Forecast error and predictive performance based on the training dataset on MODWT-decomposed series
| Series | MAE | RMSE | IA | NSE | TI | MAE | RMSE | IA | NSE | TI |
|---|---|---|---|---|---|---|---|---|---|---|
| UK-Pre | 0.001732 | 0.002878 | 0.987615 | 0.961127 | 0.005087 | 0.002518 | 0.005111 | 0.968857 | 0.917194 | 0.005092 |
| UK-During | 0.001943 | 0.003367 | 0.986098 | 0.955955 | 0.005776 | 0.002778 | 0.004201 | 0.966141 | 0.910027 | 0.005665 |
| US-Pre | 0.009599 | 0.018212 | 0.971798 | 0.924649 | 0.01896 | 0.011984 | 0.021464 | 0.957751 | 0.892595 | 0.020434 |
| US-During | 0.006545 | 0.010399 | 0.989794 | 0.967014 | 0.013529 | 0.009634 | 0.013870 | 0.97179 | 0.923665 | 0.014333 |
Forecast error and predictive performance based on the whole dataset on MODWT-decomposed series
| Series | MAE | RMSE | IA | NSE | TI | MAE | RMSE | IA | NSE | TI |
|---|---|---|---|---|---|---|---|---|---|---|
| UK-Pre | 0.00779 | 0.01129 | 0.99927 | 0.99710 | 0.01500 | 0.01244 | 0.01699 | 0.99835 | 0.99343 | 0.02167 |
| UK-During | 0.00842 | 0.01360 | 0.99908 | 0.99633 | 0.02140 | 0.01190 | 0.01814 | 0.99835 | 0.99347 | 0.02721 |
| US-Pre | 0.03723 | 0.06953 | 0.99610 | 0.98412 | 0.02348 | 0.04691 | 0.08409 | 0.99422 | 0.97766 | 0.02832 |
| US-During | 0.02972 | 0.04699 | 0.99760 | 0.99055 | 0.01888 | 0.04580 | 0.06650 | 0.99508 | 0.98075 | 0.02664 |
Forecast error and predictive performance based on the whole dataset without MODWT-decomposition
| Series | MAE | RMSE | IA | NSE | TI |
|---|---|---|---|---|---|
| UK-Pre | 0.01076 | 0.01500 | 0.99871 | 0.99488 | 0.01938 |
| UK-During | 0.01093 | 0.01747 | 0.99848 | 0.99394 | 0.02649 |
| US-Pre | 0.04339 | 0.07496 | 0.99551 | 0.98225 | 0.02529 |
| US-During | 0.04109 | 0.06108 | 0.99585 | 0.98376 | 0.02453 |
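The five columns in the forecast-error tables can be reproduced for any pair of actual and predicted series. The sketch below assumes the commonly used definitions: IA as Willmott's index of agreement, NSE as the Nash–Sutcliffe efficiency, and TI as Theil's inequality coefficient (RMSE divided by the sum of the root mean squares of the two series). This section does not spell out the formulas, so these definitions and the function name `forecast_metrics` are assumptions.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MAE, RMSE, IA, NSE and TI under the assumed standard definitions."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    n = len(actual)
    mean_actual = sum(actual) / n
    errors = [a - p for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sse / n)
    # Willmott's index of agreement: 1 is a perfect match.
    ia_denominator = sum(
        (abs(p - mean_actual) + abs(a - mean_actual)) ** 2
        for a, p in zip(actual, predicted)
    )
    ia = 1.0 - sse / ia_denominator
    # Nash-Sutcliffe efficiency: 1 is perfect, 0 matches the mean forecast.
    nse = 1.0 - sse / sum((a - mean_actual) ** 2 for a in actual)
    # Theil's inequality coefficient: 0 is a perfect forecast.
    ti = rmse / (
        math.sqrt(sum(a * a for a in actual) / n)
        + math.sqrt(sum(p * p for p in predicted) / n)
    )
    return {"MAE": mae, "RMSE": rmse, "IA": ia, "NSE": nse, "TI": ti}


# Hypothetical values, only to show the call.
print(forecast_metrics([2.79, 2.65, 2.70, 2.90], [2.80, 2.63, 2.74, 2.88]))
```

Under these definitions, lower MAE, RMSE and TI and higher IA and NSE indicate better forecasts, which matches how the tables are read in the abstract.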
Fig. 3 Prediction of natural gas futures. Note. Figure 3 compares the predicted and actual natural gas futures for the UK and USA during the mentioned periods. The x-axis shows the duration, and the y-axis of figures (a) and (b) shows the natural gas futures for the UK in US$ per MMBtu. The y-axis of figures (c) and (d) shows the NGF-USA in US$ per MMBtu
Correlation between the actual and predicted values
| Series | Series with decomposition and | Series with decomposition and | Series without decomposition |
|---|---|---|---|
| | Correlation | Correlation | Correlation |
| UK-Pre | 0.99855 | 0.99671 | 0.99745 |
| UK-During | 0.99817 | 0.99673 | 0.99696 |
| US-Pre | 0.99252 | 0.98897 | 0.99112 |
| US-During | 0.99527 | 0.99042 | 0.99192 |
Forecast error and predictive performance based on the training dataset on decomposed components with the residual as a feature
| Metric | UK-Pre D1 | UK-Pre D2 | UK-Pre D3 | UK-Pre D4 | UK-Pre S4 | UK-During D1 | UK-During D2 | UK-During D3 | UK-During D4 | UK-During S4 |
|---|---|---|---|---|---|---|---|---|---|---|
| MAE | 0.001688 | 0.001639 | 0.001589 | 0.001283 | 0.002458 | 0.001352 | 0.001872 | 0.001718 | 0.00137 | 0.003405 |
| RMSE | 0.003479 | 0.003045 | 0.002545 | 0.001945 | 0.003377 | 0.002947 | 0.004168 | 0.003346 | 0.001898 | 0.004478 |
| IA | 0.964902 | 0.983439 | 0.992352 | 0.997445 | 0.999935 | 0.974904 | 0.970036 | 0.987213 | 0.998446 | 0.999891 |
| NSE | 0.896923 | 0.945863 | 0.972806 | 0.990303 | 0.99974 | 0.922702 | 0.908185 | 0.955372 | 0.993954 | 0.999564 |
| TI | 0.005995 | 0.005181 | 0.004815 | 0.004648 | 0.004796 | 0.005409 | 0.006917 | 0.005526 | 0.003323 | 0.007707 |
Forecast error and predictive performance based on the training dataset on decomposed components with the residual not as a feature
| Metric | UK-Pre D1 | UK-Pre D2 | UK-Pre D3 | UK-Pre D4 | UK-Pre S4 | UK-During D1 | UK-During D2 | UK-During D3 | UK-During D4 | UK-During S4 |
|---|---|---|---|---|---|---|---|---|---|---|
| MAE | 0.003464 | 0.003016 | 0.002118 | 0.00149 | 0.002502 | 0.003447 | 0.003197 | 0.002234 | 0.001561 | 0.003451 |
| RMSE | 0.005337 | 0.004522 | 0.003089 | 0.002152 | 0.003463 | 0.005111 | 0.005293 | 0.003932 | 0.002076 | 0.004592 |
| IA | 0.898186 | 0.960623 | 0.988674 | 0.996868 | 0.999932 | 0.902358 | 0.9482 | 0.982125 | 0.998136 | 0.999885 |
| NSE | 0.757503 | 0.880691 | 0.959922 | 0.988126 | 0.999727 | 0.767536 | 0.851947 | 0.938352 | 0.992761 | 0.999541 |
| TI | 0.005644 | 0.005045 | 0.00476 | 0.005099 | 0.004914 | 0.005701 | 0.00633 | 0.005159 | 0.003237 | 0.0079 |