Ivana Lolić, Petar Sorić, Marija Logarušić.
Abstract
We utilize a battery of ensemble learning techniques [ensemble linear regression (LM), random forest (RF)], as well as two gradient boosting techniques [Gradient Boosting Decision Tree (GBDT) and Extreme Gradient Boosting (XGBoost)], to scrutinize the possibilities of enhancing the predictive accuracy of the Economic Policy Uncertainty (EPU) index. Applied to the data-rich environment of the Newsbank media database, our LM and XGBoost assessments mostly outperform the other two ensemble learning procedures, as well as the original EPU index. Our LM and XGBoost estimates bring EPU closer to the stylized facts of uncertainty than other uncertainty estimates. LM and XGBoost indicators are more countercyclical and have more pronounced leading properties. We find that EPU is more strongly correlated with financial volatility measures than with consumers' assessments of uncertainty. This corroborates that the media place a much higher weight on the financial sector than on the economic issues of consumers. Furthermore, we considerably widen the scope of search terms included in the calculation of the EPU index. Using ensemble learning techniques on such a rich set of keywords, we mostly manage to outperform the standard EPU in terms of correlation with standard uncertainty proxies. We also find that the predictive accuracy of the EPU index can be considerably increased by using a more diversified set of uncertainty-related terms than the original EPU framework. Our estimates perform much better in a monthly setting (targeting industrial production growth) than when targeting quarterly GDP growth. This speaks in favor of uncertainty as a purely short-term phenomenon.
Keywords: Economic policy uncertainty index; Ensemble learning; Gradient boosting; Random forest model; Textual analysis
Year: 2021 PMID: 34305322 PMCID: PMC8280278 DOI: 10.1007/s10614-021-10153-2
Source DB: PubMed Journal: Comput Econ ISSN: 0927-7099 Impact factor: 1.741
Data description
| Variable | Category | Description | Source | Frequency |
|---|---|---|---|---|
| | Economy | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Financial regulation | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Entitlement programs | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Monetary policy | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Government spending and other | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Healthcare | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Trade policy | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | National security | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Sovereign debt and currency crises | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Equity market | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Taxes | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Regulation | Selected keywords: | Newsbank database, keywords defined in accordance with Baker et al. ( | Monthly |
| | Uncertainty | Selected keywords: | Synonyms for 'uncertainty' taken from: | Monthly |
| Synonyms for ‘uncertainty’: | ||||
| SPF_GDP | | Cross-sectional dispersion (75th percentile minus 25th percentile) of quarterly forecasts of real GDP growth (4 quarters ahead) | Survey of Professional Forecasters, Federal Reserve Bank of Philadelphia | Quarterly |
| SPF_FED | | Cross-sectional dispersion (75th percentile minus 25th percentile) of quarterly forecasts of chain-weighted real federal government consumption and gross investment (4 quarters ahead) | Survey of Professional Forecasters, Federal Reserve Bank of Philadelphia | Quarterly |
| SPF_CPI | | Cross-sectional dispersion (75th percentile minus 25th percentile) of quarterly forecasts of headline CPI inflation (4 quarters ahead) | Survey of Professional Forecasters, Federal Reserve Bank of Philadelphia | Quarterly |
| VIX | | The Chicago Board Options Exchange (CBOE) Volatility Index, which measures expected price fluctuations in the S&P 500 Index options over the next 30 days | Yahoo Finance | Monthly |
| sd10 | | Monthly standard deviation of the US Treasury Bond (10 years maturity) yield | Yahoo Finance | Monthly |
| sd500 | | Monthly standard deviation of the daily S&P500 market capitalization index | Yahoo Finance | Monthly |
| govt | | A disagreement indicator in the vein of Bachmann et al., ( | University of Michigan Surveys of Consumers | Monthly |
| dis | | An "aggregate" disagreement indicator in the vein of Girardi and Reuter ( | University of Michigan Surveys of Consumers | Monthly |
| IND | | Year-on-year growth rate of industrial production: manufacturing (NAICS), seasonally adjusted | Federal Reserve Bank of St. Louis | Monthly |
| GDP | | Year-on-year growth rate of real Gross Domestic Product, seasonally adjusted | Federal Reserve Bank of St. Louis | Quarterly |
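
Several of the variables described in this table are simple transformations of raw series. The following is a minimal sketch, not the authors' code, of how a cross-sectional dispersion measure (75th minus 25th percentile), a monthly standard deviation of daily observations, and a year-on-year growth rate could be computed. The function names, the sample numbers, the "inclusive" quantile method, and the use of the sample (rather than population) standard deviation are all assumptions made for illustration; the source does not specify them.

```python
import statistics


def iqr_dispersion(forecasts):
    """75th percentile minus 25th percentile of one survey round's forecasts."""
    # quantiles(n=4) returns the three quartile cut points [Q1, Q2, Q3].
    # The "inclusive" interpolation method is an assumption, not from the source.
    q1, _, q3 = statistics.quantiles(forecasts, n=4, method="inclusive")
    return q3 - q1


def within_month_std(daily_values):
    """Sample standard deviation of the daily observations in one month.

    Using the sample (n - 1) formula is an assumption.
    """
    return statistics.stdev(daily_values)


def yoy_growth(series, periods_per_year):
    """Year-on-year growth in percent: 100 * (x_t / x_{t-k} - 1)."""
    k = periods_per_year  # 12 for monthly data, 4 for quarterly data
    return [100.0 * (series[t] / series[t - k] - 1.0) for t in range(k, len(series))]


# Hypothetical numbers, for illustration only.
print(iqr_dispersion([1.0, 2.0, 3.0, 4.0, 5.0]))
print(within_month_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
print(yoy_growth([100.0, 102.0, 101.0, 103.0, 110.0], periods_per_year=4))
```

With quarterly data the first four observations are consumed by the lag, so a series of five quarters yields a single growth rate.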
Fig. 1 LM and RF estimates for monthly data. Solid and dashed lines represent the LM and RF models, respectively
Fig. 2 LM and RF estimates for quarterly data. Solid and dashed lines represent the LM and RF models, respectively
Fig. 3 XGBoost and GBDT estimates for monthly data. Solid and dashed lines represent the XGBoost and GBDT models, respectively
Fig. 4 XGBoost and GBDT estimates for quarterly data. Solid and dashed lines represent the XGBoost and GBDT models, respectively
Correlation coefficients of (a) LM and RF and (b) XGBoost and GBDT uncertainty estimates with standard uncertainty proxies (monthly data)
| LM estimate of | EPU | RF estimate of | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| a | |||||||||||
| | 0.33*** | 0.42*** | 0.36*** | − 0.33*** | − 0.11 | 0.48*** | 0.11 | 0.11 | 0.03 | − 0.14 | 0.01 |
| | 0.16* | 0.23** | 0.18** | − 0.17* | − 0.03 | 0.35*** | 0.03 | 0.1 | − 0.01 | − 0.12 | − 0.02 |
| | 0.46*** | 0.59*** | 0.51*** | − 0.41*** | − 0.21** | 0.58*** | 0.19** | 0.1 | 0.13 | − 0.1 | − 0.03 |
| | 0.6*** | 0.41*** | 0.58*** | − 0.63*** | − 0.71*** | 0.18** | 0.6*** | 0.53*** | 0.68*** | − 0.13 | 0.12 |
| | 0.41*** | 0.31*** | 0.40*** | − 0.34*** | − 0.47*** | 0.09 | 0.39*** | 0.32*** | 0.46*** | 0.05 | 0.12 |
Pearson’s correlation coefficient is calculated for the period 2010Jan–2020Mar
*, **, and *** denote 0.1, 0.05, and 0.01 significance (respectively)
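
The table notes above describe Pearson's correlation coefficients flagged with significance stars. As a rough sketch of that kind of calculation, the snippet below computes Pearson's coefficient and maps it to stars using the Fisher z approximation for a two-sided p-value. The choice of the Fisher z approximation is an assumption (the source does not state which test was used), and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def significance_stars(r, n):
    """Map a correlation r from n observations to '', '*', '**' or '***'.

    Uses the Fisher z approximation for the two-sided p-value; this is an
    assumption, the source does not state which test was used.
    """
    if abs(r) >= 1.0:
        return "***"  # atanh is undefined at +/-1; treat as strongly significant
    z = math.atanh(r) * math.sqrt(n - 3)
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    if p_value < 0.01:
        return "***"
    if p_value < 0.05:
        return "**"
    if p_value < 0.10:
        return "*"
    return ""


# 2010Jan-2020Mar covers 123 monthly observations.
for r in (0.33, 0.18, 0.16, 0.11):
    print(f"{r:.2f}{significance_stars(r, 123)}")
```

The sample length of 123 follows from the stated period (ten full years plus three months); with shorter quarterly samples the same coefficient needs to be larger in absolute value to earn the same number of stars.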
Correlation coefficient of uncertainty estimates with industrial production y-o-y growth rate
| LM | RF | XGBoost | GBDT | |
|---|---|---|---|---|
| − 0.33*** | − 0.19** | − 0.31*** | − 0.21** | |
| − 0.19** | − 0.12 | − 0.26*** | − 0.17* | |
| − 0.32*** | − 0.27*** | − 0.37*** | − 0.36*** | |
| 0.32*** | − 0.21** | − 0.40*** | − 0.03 | |
| 0.41*** | 0.11 | 0.41*** | 0.5*** | |
Pearson’s correlation coefficient is calculated for the period 2010Jan–2020Mar. The correlation of EPU with industrial production is − 0.08
*, **, and *** denote 0.1, 0.05, and 0.01 significance
A comparison of predictive accuracy of ensemble learning models with EPU (monthly data)
| RMSE LM (h = 0) | 1.65 | 1.72 | 1.65 | 1.63 | 1.63 | – |
| Ratio EPU to LM | 105.8 | 101.5 | 105.8 | 107.6 | 107.7 | 105.7 |
| Ratio EPU to RF | 99.8 | 99.7 | 100.7 | 104.0 | 99.6 | 100.8 |
| Ratio EPU to XGBoost | 105.1 | 103.9 | 107.7 | 100.1 | 110.1 | 105.4 |
| Ratio EPU to GBDT | 99.8 | 99.5 | 103.7 | 99.0 | 112.3 | 102.9 |
| Average | 102.6 | 101.2 | 104.5 | 102.7 | 107.4 | 103.7 |
| RMSE LM (h = 3) | 1.69 | 1.77 | 1.70 | 1.66 | 1.62 | – |
| Ratio EPU to LM | 104.8 | 100.0 | 103.9 | 106.4 | 109.2 | 104.9 |
| Ratio EPU to RF | 101.5 | 103.6 | 103.6 | 102.6 | 99.9 | 102.2 |
| Ratio EPU to XGBoost | 103.9 | 101.6 | 104.8 | 100.0 | 109.9 | 104.0 |
| Ratio EPU to GBDT | 101.6 | 100.9 | 105.1 | 100.0 | 111.2 | 103.8 |
| Average | 103.0 | 101.5 | 104.4 | 102.3 | 107.6 | 103.7 |
| RMSE LM (h = 6) | 1.68 | 1.77 | 1.70 | 1.68 | 1.57 | – |
| Ratio EPU to LM | 104.5 | 99.2 | 103.3 | 104.7 | 111.9 | 104.7 |
| Ratio EPU to RF | 100.6 | 102.4 | 102.1 | 99.8 | 100.6 | 101.1 |
| Ratio EPU to XGBoost | 103.7 | 100.6 | 104.9 | 99.7 | 112.9 | 104.4 |
| Ratio EPU to GBDT | 99.9 | 101.1 | 102.1 | 99.5 | 119.5 | 104.4 |
| Average | 102.2 | 100.8 | 103.1 | 100.9 | 111.2 | 103.7 |
| RMSE LM (h = 12) | 1.68 | 1.77 | 1.71 | 1.73 | 1.55 | – |
| Ratio EPU to LM | 104.8 | 99.6 | 103.3 | 102.0 | 113.5 | 104.6 |
| Ratio EPU to RF | 102.5 | 101.5 | 103.5 | 99.8 | 102.6 | 102.0 |
| Ratio EPU to XGBoost | 104.6 | 100.1 | 105.2 | 102.8 | 114.3 | 105.4 |
| Ratio EPU to GBDT | 101.1 | 104.7 | 100.9 | 101.2 | 128.3 | 107.2 |
| Average | 103.3 | 101.5 | 103.2 | 101.5 | 114.7 | 104.8 |
RMSEs and corresponding ratios are calculated for the period 2011Jan–2020Mar, after taking the 12th lag of uncertainty indicators into account
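
To make the "Ratio EPU to ..." rows concrete, here is a minimal sketch of an RMSE and an RMSE ratio. It assumes the ratio is 100 times the RMSE of the EPU-based forecast divided by the RMSE of the alternative model, so that values above 100 favor the alternative; this direction is inferred from the row labels and the abstract, and is not stated explicitly in the source. The function names are hypothetical.

```python
import math
from statistics import fmean


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(fmean([(a - p) ** 2 for a, p in zip(actual, predicted)]))


def rmse_ratio(rmse_epu, rmse_model):
    """100 * RMSE(EPU-based model) / RMSE(alternative model).

    Assumed direction: values above 100 mean the alternative model has the
    lower forecast error.
    """
    return 100.0 * rmse_epu / rmse_model


# The "Average" row appears to be the plain mean of the four ratios above it.
# First column at h = 0 in the monthly table:
ratios_h0 = [105.8, 99.8, 105.1, 99.8]
print(round(fmean(ratios_h0), 1))
```

Under this reading, an LM RMSE of 1.65 paired with a ratio of 105.8 would imply an EPU-based RMSE of roughly 1.75.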
Correlation coefficients of LM estimates and EPU with standard uncertainty proxies (quarterly data)
| LM estimate of | EPU | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| − 0.17 | − 0.23 | 0.47*** | − 0.2 | − 0.14 | − 0.2 | 0.22 | 0.3* | 0.06 | |
| − 0.3* | − 0.39** | 0.43*** | − 0.38** | − 0.32** | − 0.38** | 0.42*** | 0.47*** | 0.05 | |
| − 0.21 | − 0.23 | 0.26* | − 0.24 | − 0.21 | − 0.26 | 0.24 | 0.35** | 0.0 | |
| 0.3* | 0.2 | − 0.22 | 0.25 | 0.27* | 0.26* | − 0.31** | − 0.15 | 0.54*** | |
| 0.19 | 0.13 | − 0.17 | 0.15 | 0.19 | 0.17 | − 0.2 | − 0.07 | 0.43*** | |
| 0.42*** | 0.32** | − 0.35** | 0.36** | 0.38** | 0.38** | − 0.43*** | − 0.29* | 0.55*** | |
| 0.62*** | 0.71*** | − 0.36** | 0.7*** | 0.66*** | 0.7*** | − 0.69*** | − 0.75*** | 0.32** | |
| 0.43*** | 0.51*** | − 0.31** | 0.47*** | 0.43*** | 0.48*** | − 0.4*** | − 0.54*** | 0.05 | |
Pearson’s correlation coefficient is calculated for the period 2010Q1–2020Q1
*, **, and *** denote 0.1, 0.05, and 0.01 significance
Correlation coefficients of RF estimates and EPU with standard uncertainty proxies (quarterly data)
| RF estimate of | EPU | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| − 0.13 | − 0.15 | − 0.19 | − 0.21 | − 0.27* | − 0.04 | − 0.04 | 0.28* | 0.06 | |
| − 0.39** | − 0.36** | − 0.2 | − 0.37** | − 0.42*** | − 0.23 | 0.15 | 0.3* | 0.05 | |
| − 0.16 | − 0.23 | − 0.27* | − 0.16 | − 0.24 | − 0.15 | − 0.15 | 0.14 | 0.0 | |
| 0.15 | 0.11 | 0.04 | 0.16 | 0.07 | 0.18 | − 0.19 | − 0.01 | 0.54*** | |
| 0.05 | 0.06 | − 0.01 | 0.01 | − 0.02 | 0.11 | − 0.08 | 0.06 | 0.43*** | |
| 0.24 | 0.21 | 0.13 | 0.26 | 0.18 | 0.25 | − 0.21 | − 0.09 | 0.55*** | |
| 0.68*** | 0.67*** | 0.55*** | 0.68*** | 0.77*** | 0.61*** | − 0.48*** | − 0.26* | 0.32** | |
| 0.41*** | 0.47*** | 0.35** | 0.47*** | 0.54*** | 0.42*** | − 0.27* | − 0.14 | 0.05 | |
Pearson’s correlation coefficient is calculated for the period 2010Q1–2020Q1
*, **, and *** denote 0.1, 0.05, and 0.01 significance
Correlation coefficients of XGBoost estimates and EPU with standard uncertainty proxies (quarterly data)
| XGBoost estimate of | EPU | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| − 0.14 | − 0.13 | − 0.17 | − 0.15 | − 0.06 | − 0.02 | 0.2 | 0.04 | 0.06 | |
| − 0.31** | − 0.26* | − 0.26* | − 0.26* | − 0.16 | − 0.11 | 0.38** | − 0.03 | 0.05 | |
| − 0.15 | − 0.16 | − 0.14 | − 0.14 | − 0.06 | − 0.05 | 0.24 | 0 | 0.0 | |
| 0.28* | 0.31** | 0.29* | 0.3* | 0.47*** | 0.44*** | − 0.28* | 0.46*** | 0.54*** | |
| 0.17 | 0.23 | 0.2 | 0.21 | 0.38** | 0.34** | − 0.17 | 0.38** | 0.43*** | |
| 0.38** | 0.4*** | 0.4** | 0.4*** | 0.55*** | 0.5*** | − 0.41*** | 0.5*** | 0.55*** | |
| 0.66*** | 0.61*** | 0.59*** | 0.61*** | 0.45*** | 0.47*** | − 0.68*** | 0.4** | 0.32** | |
| 0.42*** | 0.41*** | 0.38** | 0.39** | 0.29* | 0.28* | − 0.46*** | 0.21 | 0.05 | |
Pearson’s correlation coefficient is calculated for the period 2010Q1–2020Q1
*, **, and *** denote 0.1, 0.05, and 0.01 significance
Correlation coefficients of GBDT estimates and EPU with standard uncertainty proxies (quarterly data)
| GBDT estimate of | EPU | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| − 0.08 | 0.02 | − 0.19 | − 0.33** | 0.05 | − 0.33** | 0.18 | 0.32** | 0.06 | |
| − 0.22 | − 0.24 | − 0.44*** | − 0.56*** | − 0.1 | − 0.59*** | 0.31** | 0.28* | 0.05 | |
| − 0.12 | − 0.07 | − 0.38** | − 0.38** | 0 | − 0.38** | 0.16 | 0.21 | 0.0 | |
| 0.26 | 0.14 | − 0.26 | − 0.26 | 0.3* | − 0.05 | − 0.05 | 0.04 | 0.54*** | |
| 0.17 | − 0.02 | − 0.21 | − 0.23 | 0.15 | − 0.15 | 0.01 | 0.07 | 0.43*** | |
| 0.35** | 0.21 | − 0.22 | − 0.12 | 0.34** | 0.1 | − 0.15 | − 0.04 | 0.55*** | |
| 0.5*** | 0.62*** | 0.51*** | 0.74*** | 0.47*** | 0.85*** | − 0.7*** | − 0.08 | 0.32** | |
| 0.31** | 0.42*** | 0.45*** | 0.7*** | 0.23 | 0.58*** | − 0.42*** | 0.03 | 0.05 | |
Pearson’s correlation coefficient is calculated for the period 2010Q1–2020Q1
*, **, and *** denote 0.1, 0.05, and 0.01 significance
Correlation coefficient of uncertainty estimates with GDP y-o-y growth rate
| − 0.23 | − 0.14 | 0.29* | − 0.19 | − 0.22 | − 0.21 | 0.21 | 0.13 |
| − 0.04 | − 0.12 | − 0.19 | − 0.07 | 0.09 | − 0.15 | − 0.09 | − 0.02 |
| − 0.21 | − 0.17 | − 0.23 | − 0.19 | − 0.37** | − 0.36** | 0.2 | − 0.35** |
| − 0.2 | − 0.01 | 0.24 | 0.07 | − 0.21 | − 0.03 | − 0.04 | − 0.13 |
Pearson’s correlation coefficient is calculated for the period 2010Q1–2020Q1
*, **, and *** denote 0.1, 0.05, and 0.01 significance
A comparison of predictive accuracy of ensemble learning models with EPU (quarterly data)
| RMSE LM (h = 0) | 0.71 | 0.72 | 0.71 | 0.71 | 0.71 | 0.73 | 0.72 | 0.72 | – |
| Ratio EPU to LM | 89.5 | 88.2 | 90.0 | 89.3 | 89.3 | 87.9 | 89.1 | 88.8 | 89.0 |
| Ratio EPU to RF | 87.5 | 88.1 | 87.4 | 87.8 | 88.4 | 89.4 | 87.7 | 87.5 | 88.0 |
| Ratio EPU to XGBoost | 88.4 | 89.8 | 89.4 | 94.9 | 94.3 | 94.5 | 88.9 | 89.6 | 91.2 |
| Ratio EPU to GBDT | 89.0 | 87.4 | 90.6 | 89.0 | 87.4 | 88.7 | 87.7 | 88.0 | 88.5 |
| Average | 88.6 | 88.4 | 89.4 | 90.3 | 89.9 | 90.1 | 88.4 | 88.5 | |
| RMSE LM (h = 1) | 0.72 | 0.73 | 0.71 | 0.72 | 0.73 | 0.73 | 0.73 | 0.73 | – |
| Ratio EPU to LM | 94.0 | 93.7 | 95.4 | 94.1 | 93.5 | 93.2 | 93.4 | 93.6 | 93.9 |
| Ratio EPU to RF | 93.3 | 94.9 | 93.7 | 93.4 | 95.6 | 93.3 | 93.3 | 93.3 | 93.9 |
| Ratio EPU to XGBoost | 93.8 | 94.7 | 94.3 | 96.5 | 97.3 | 98.7 | 93.4 | 94.4 | 95.4 |
| Ratio EPU to GBDT | 94.8 | 93.3 | 95.0 | 96.0 | 93.2 | 95.2 | 93.2 | 93.6 | 94.3 |
| Average | 94.0 | 94.2 | 94.6 | 95.0 | 94.9 | 95.1 | 93.3 | 93.7 | |
| RMSE LM (h = 2) | 0.72 | 0.72 | 0.71 | 0.72 | 0.72 | 0.73 | 0.72 | 0.72 | – |
| Ratio EPU to LM | 92.5 | 92.4 | 93.8 | 93.0 | 93.1 | 91.9 | 93.1 | 92.9 | 92.8 |
| Ratio EPU to RF | 91.5 | 93.8 | 91.6 | 91.5 | 94.3 | 92.0 | 93.5 | 91.6 | 92.5 |
| Ratio EPU to XGBoost | 93.0 | 94.8 | 93.5 | 98.0 | 98.6 | 101.0 | 92.1 | 93.9 | 95.6 |
| Ratio EPU to GBDT | 94.1 | 91.8 | 95.9 | 95.0 | 91.4 | 92.0 | 91.8 | 93.3 | 93.2 |
| Average | 92.8 | 93.2 | 93.7 | 94.4 | 94.4 | 94.2 | 92.6 | 92.9 | |
| RMSE LM (h = 4) | 0.72 | 0.72 | 0.71 | 0.72 | 0.72 | 0.72 | 0.70 | 0.71 | – |
| Ratio EPU to LM | 98.9 | 99.2 | 100.2 | 98.5 | 99.1 | 98.3 | 102.0 | 99.8 | 99.5 |
| Ratio EPU to RF | 97.9 | 104.4 | 97.8 | 98.7 | 103.5 | 97.7 | 102.0 | 98.1 | 100.0 |
| Ratio EPU to XGBoost | 103.0 | 101.5 | 102.0 | 100.4 | 102.3 | 104.5 | 100.4 | 102.2 | 102.0 |
| Ratio EPU to GBDT | 101.0 | 99.6 | 97.8 | 100.5 | 97.7 | 97.7 | 102.4 | 101.5 | 99.8 |
| Average | 100.2 | 101.2 | 99.5 | 99.5 | 100.7 | 99.6 | 101.7 | 100.4 | |
RMSEs and corresponding ratios are calculated for the period 2011Q1–2020Q1, after taking the 4th lag of uncertainty indicators into account
Top ranked combinations of keywords in LM uncertainty estimates
| Variable | Rank | Quarterly data: Policy category | Quarterly data: Uncertainty | Monthly data: Policy category | Monthly data: Uncertainty |
|---|---|---|---|---|---|
| | 1 | Regulation | Uncertain^a | | |
| | 2 | Monetary policy | Unreliable^a | | |
| | 3 | Sovereign debt and currency crises | Concern^a | | |
| | 1 | Monetary policy | Ambivalent^a | | |
| | 2 | Sovereign debt and currency crises | Unreliable^a | | |
| | 3 | Trade policy | Skeptic^a | | |
| | 1 | Sovereign debt and currency crises | Concern^a | | |
| | 2 | Trade policy | Unsettle^a | | |
| | 3 | National security | Hesitant^a | | |
| | 1 | Regulation | Erratic^a | Sovereign debt and currency crises | Uncertain^a |
| | 2 | Sovereign debt and currency crises | Uncertain^a | Sovereign debt and currency crises | Concern^a |
| | 3 | Government spending and other | Undetermine^a | Government spending and other | Undetermine^a |
| | 1 | Trade policy | Erratic^a | Trade policy | Erratic^a |
| | 2 | Taxes | Unsettle^a | Regulation | Uncertain^a |
| | 3 | Government spending and other | Erratic^a | National security | Unclear^a |
| | 1 | Sovereign debt and currency crises | Uncertain^a | Sovereign debt and currency crises | Uncertain^a |
| | 2 | Government spending and other | Erratic^a | Trade policy | Erratic^a |
| | 3 | Sovereign debt and currency crises | Erratic^a | Equity market | Risk^a |
| | 1 | Equity market | Risk^a | Monetary policy | Unpredictabe^a |
| | 2 | Financial regulation | Risk^a | Taxes | Ambigu^a |
| | 3 | Trade policy | Anxiety | Financial regulation | Risk^a |
| | 1 | Healthcare | Undetermine^a | Healthcare | Undetermine^a |
| | 2 | Taxes | Undetermine^a | Taxes | Anxiety |
| | 3 | Equity market | Unpredictabe^a | Entitlement programs | Unpredictabe^a |
^a Denotes a word root