Literature DB >> 35185232

Nowcasting unemployment insurance claims in the time of COVID-19.

William D. Larson, Tara M. Sinclair.

Abstract

Near-term forecasts, also called nowcasts, are most challenging but also most important when the economy experiences an abrupt change. In this paper, we explore the performance of models with different information sets and data structures in order to best nowcast US initial unemployment claims in spring of 2020 in the midst of the COVID-19 pandemic. We show that the best model, particularly near the structural break in claims, is a state-level panel model that includes dummy variables to capture the variation in timing of state-of-emergency declarations. Autoregressive models perform poorly at first but catch up relatively quickly. The state-level panel model, exploiting the variation in timing of state-of-emergency declarations, also performs better than models including Google Trends. Our results suggest that in times of structural change there is a bias-variance tradeoff. Early on, simple approaches to exploit relevant information in the cross sectional dimension improve forecasts, but in later periods the efficiency of autoregressive models dominates.

Keywords:  Forecast evaluation; Google Trends; Panel forecasting; Structural breaks; Time series forecasting

Year:  2021        PMID: 35185232      PMCID: PMC8846950          DOI: 10.1016/j.ijforecast.2021.01.001

Source DB:  PubMed          Journal:  Int J Forecast        ISSN: 0169-2070


Introduction

Simple time series models often forecast well in normal times (see the long literature started by Nelson (1972)), but in the midst of dramatic upheaval, such as in the first few weeks of the response to COVID-19, these models do not adjust quickly to changing conditions (Castle, Clements, & Hendry, 2016). These models exploit a long time series of an aggregate to provide precise, consistent estimates, while typically ignoring information on disaggregate components. We argue that in times of structural change, forecasters should attempt to exploit information within disaggregates, especially when there exists timing variation in the breaks among the disaggregate entities. In the case of the spring of 2020, we find that variation in state-level COVID-19-related state-of-emergency declarations provides useful information for predicting the national quantity of initial unemployment insurance claims.

The number of persons filing initial claims for unemployment insurance (UI) is an important indicator of current US economic conditions (Berge and Jordà, 2011, Lewis et al., 2020). UI data are both timely and frequent, and constitute one of a limited number of macroeconomic indicators available at a weekly cadence. As the COVID-19 pandemic came to the US, initial claims became an important indicator used to evaluate the state of the economy and the economic toll of the pandemic and associated stay-at-home orders in near-real time. The series also became a key variable to forecast in its own right, as a way to predict the economic costs of the pandemic.

In this paper we present current-week forecasts (i.e., nowcasts) of the advance estimates of US national initial unemployment claims produced with different information sets and data structures. We show that a small-T panel model performs well shortly after a structural break if it has relevant information. The variation in emergency declaration dates across the states provides this relevant information and outperforms models that include Google Trends data, as well as autoregressive models. We further show that autoregressive models do catch up within a few periods, but that they perform poorly in the crucial weeks directly after the dramatic increase in claims that came in mid-March of 2020.

Other researchers have recognized the need for tools beyond simple time series models to forecast UI claims in the time of COVID-19. Aaronson, Brave, Butters, Sacks, and Seo (2020) use an event-study approach based on hurricanes in order to use Google Trends data to forecast claims. Goldsmith-Pinkham and Sojourner (2020) also use Google Trends at the state level to forecast claims. Our innovation is to use the timing of emergency declarations across the states in a panel framework. We compare our forecasts to those produced using Google Trends in both panel and time series frameworks, as well as with autoregressive models. We show that all models miss the initial shock, but that in the early periods the information in the declaration-weeks panel leads to substantially better forecasts. Quickly thereafter, the time series models catch up. Google Trends models are typically second- or third-best in each time period, outperforming autoregressive models initially, but still underperforming declarations dummy variable models.

Our research also connects to the debate about forecasting the aggregate versus aggregating forecasts (see discussion in Castle and Hendry (2010), Hendry and Hubrich (2011), Larson (2015), and Heinisch and Scheufele (2018)). We find support for forecasting the individual states and then aggregating when we have useful variation in the timing of the states' breaks, e.g. the emergency declaration dates. This is consistent with the argument from Castle, Fawcett, and Hendry (2011) that we need relevant information available to forecast during breaks. After the usefulness of that information is exhausted, however, simple autoregressive models perform best, with little difference between direct national forecasts and aggregates of state forecasts.

Research using Big Data for forecasting has emphasized the importance of a large number of time observations, but in the face of structural breaks we may not have a long time series with consistent parameters (Bajari, Chernozhukov, Hortaçsu, & Suzuki, 2019). We may, however, have useful information in the panel dimension. Since we have variation in the timing of the emergency declaration by state, we can exploit this information using dummy variables in a panel framework. We show that in this case the cross-sectional information does improve forecasts in the periods near the structural break.

The remainder of our paper is as follows. First, we describe the UI claims, Google Trends, and disaster declaration series. Next, we describe the models used to generate pseudo-real-time forecasts, paying particular attention to information-set assumptions regarding variables and timing and to the structure of the data. Then we turn to a discussion of our results, after which we conclude. Detailed model estimates and alternative forecasts can be found in the appendix.

Data

Unemployment insurance claims

Our target variable is the national total of the advance number of initial UI weekly claims under state programs, not seasonally adjusted (NSA). Fig. 1 reports this variable for each week since January 1998. The previous peak, near the end of the Great Recession, was just under 1 million people filing initial UI claims (NSA). In early 2020, the US economy was relatively strong, with around 215,000 claims each week in February. In response to the COVID-19 pandemic and stay-at-home orders, claims quickly dwarfed the Great Recession high, rising almost 30-fold from this benchmark, with claims exceeding 6 million in a single week. Our analysis is focused on this crisis period, and our forecasts begin with the week ending March 14th, 2020. Our objective throughout the paper is to forecast the advance release of the series presented in Fig. 1.
Fig. 1

National UI Claims. Notes: Data present the advance estimate of U.S. total (50 states plus the District of Columbia and Puerto Rico) weekly initial unemployment insurance claims.

Initial UI claims are released each Thursday morning at 8:30 a.m. Eastern Time by the US Department of Labor (DOL), and the latest data are the preliminary estimates for the week ending the Saturday five days prior. Each week our data set contains initial claims by state for a total of 52 entities in the panel: 50 US states plus the District of Columbia (DC) and Puerto Rico. We refer to all the entities as states for simplicity. The objective of the forecasts will be the advance national totals, which are directly calculated as a sum of the 52 state entities. We focus on NSA numbers throughout our analysis, following the recommendation of Rinz (2020), since the multiplicative seasonal adjustment procedure that is used for UI claims is likely misleading given the magnitude of the changes in the sample we are analyzing. Our pseudo-real-time data set is constructed using the historical weekly file available on the DOL website through the week ending March 7th, 2020, and then the press release PDF files. All models are estimated on data available at the time the forecast was made. In our exercise, forecasts are assumed to be made immediately following the Thursday release, for the next week's release. Unemployment claims data are generally revised once, meaning that the last observation used in each week's estimates will be preliminary data ("advance" estimates in DOL terminology) that are updated the following week.
In our analysis, we develop several forecasting models that exploit different samples of data and variable transformations. For autoregressive models (both state and national), our dependent variable is the natural log of weekly UI claims, and our sample begins in January 1998 in order to exploit a large number of periods, satisfy "Large T" asymptotics, and ensure ergodicity. Augmented Dickey–Fuller tests confirm this series to be stationary despite the rising labor force over the period, with all tests rejecting the null of a unit root at the 1% level. In models exploiting the panel dimension ("Large N" models), our dependent variable is each state's weekly claims normalized relative to the last pre-crisis week, which we identified to be the week ending February 15th. We then reverse this normalization when we aggregate back to the national level to produce the forecast of US national NSA initial unemployment claims. Our panel sample period begins with the week ending February 1st.
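As a concrete illustration of this normalization and re-aggregation step, the following sketch uses made-up claims numbers for two states; the state abbreviations and values are hypothetical, not the paper's data.

```python
# Hypothetical pre-crisis (week ending 2/15/2020) and later-week claims
# for two states; values are illustrative only.
base_week = {"WA": 6_000.0, "CA": 40_000.0}    # last pre-crisis week
raw_claims = {"WA": 14_000.0, "CA": 80_000.0}  # a later crisis week

# Normalize each state's claims relative to its own pre-crisis level,
# as in the "Large N" panel models.
norm_claims = {s: raw_claims[s] / base_week[s] for s in raw_claims}

# After modeling in normalized units, reverse the normalization and sum
# across states to recover the national NSA claims total.
national_total = sum(norm_claims[s] * base_week[s] for s in base_week)
```

In normalized units, each state starts the crisis period near 1, which puts small and large states on a comparable scale before estimation.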

Emergency declaration dates

Our declaration date analysis exploits the variation in timing of emergency declarations by states so that we can learn from the early states when predicting outcomes for later states, similar to the approach of Liu, Moon, and Schorfheide (2020). The event time is based on the week in which a state of emergency was declared regarding COVID-19 for that state. All states, as well as DC and Puerto Rico, declared a state of emergency within a four-week period in February and March of 2020 (Table 1). Washington state was the first to declare an emergency, on February 29th, and West Virginia was the last, on March 16th. Fig. 2 shows normalized state claims grouped by their week of declaration, where 0 on the horizontal axis for each group is the week that the group declared their emergency. From these figures it is clear that all states experienced a substantial increase in claims after declaring a state of emergency. The last states to declare an emergency already had increases before their declaration week, but continued to experience further increases in the weeks following. It is important to note that we are using declarations of states of emergency, not stay-at-home orders or other restrictions, which, as discussed in Cronin and Evans (2020), explain only a small amount of the change in behavior in response to the pandemic.
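The mapping from declaration dates to the four declaration-week groups can be sketched as follows; the Saturday week boundaries (2/29 through 3/21) are our reading of Table 1, not notation from the paper.

```python
from datetime import date

# Saturday ending the first declaration week (Washington declared 2/29/2020).
FIRST_WEEK_END = date(2020, 2, 29)

def declaration_week(decl: date) -> int:
    """Return the 1-based declaration-week group for a declaration date,
    assuming Sunday-Saturday weeks ending 2/29, 3/7, 3/14, and 3/21."""
    days_after = (decl - FIRST_WEEK_END).days
    return (days_after + 6) // 7 + 1  # 0 -> week 1, 1..7 -> week 2, and so on

# Spot checks against the dates in Table 1.
assert declaration_week(date(2020, 2, 29)) == 1  # Washington
assert declaration_week(date(2020, 3, 7)) == 2   # New York
assert declaration_week(date(2020, 3, 8)) == 3   # Oregon
assert declaration_week(date(2020, 3, 16)) == 4  # West Virginia
```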
Table 1

COVID-19 disaster declaration dates.

Week 1:
Washington              2/29/2020

Week 2:
California              3/4/2020
Hawaii                  3/4/2020
Maryland                3/5/2020
Indiana                 3/6/2020
Kentucky                3/6/2020
Utah                    3/6/2020
New York                3/7/2020

Week 3:
Oregon                  3/8/2020
Florida                 3/9/2020
Illinois                3/9/2020
Iowa                    3/9/2020
New Jersey              3/9/2020
Ohio                    3/9/2020
Rhode Island            3/9/2020
Colorado                3/10/2020
Connecticut             3/10/2020
Massachusetts           3/10/2020
Michigan                3/10/2020
North Carolina          3/10/2020
Alaska                  3/11/2020
Arizona                 3/11/2020
District of Columbia    3/11/2020
Louisiana               3/11/2020
New Mexico              3/11/2020
Delaware                3/12/2020
Kansas                  3/12/2020
Montana                 3/12/2020
Nevada                  3/12/2020
Puerto Rico             3/12/2020
Tennessee               3/12/2020
Virginia                3/12/2020
Wisconsin               3/12/2020
Alabama                 3/13/2020
Arkansas                3/13/2020
Idaho                   3/13/2020
Minnesota               3/13/2020
Missouri                3/13/2020
Nebraska                3/13/2020
New Hampshire           3/13/2020
North Dakota            3/13/2020
South Carolina          3/13/2020
South Dakota            3/13/2020
Texas                   3/13/2020
Vermont                 3/13/2020
Wyoming                 3/13/2020
Georgia                 3/14/2020
Mississippi             3/14/2020

Week 4:
Maine                   3/15/2020
Oklahoma                3/15/2020
Pennsylvania            3/16/2020
West Virginia           3/16/2020
Fig. 2

State UI Claims. Notes: Data present UI claims normalized with respect to the week ending February 15th, 2020 in the respective state. Time is normalized such that 0 is the week of the initial COVID-19 emergency declaration.

Google Trends

Google Trends data are used as an index of the relative search volume on Google for the keyword "unemployment" for the US and for each of the 50 states plus DC and Puerto Rico. The Google Trends API only allows five comparison locations per search, so, following Goldsmith-Pinkham and Sojourner (2020), we include California in each of our rounds of data collection and re-normalize each state relative to California. This approach allows us to compare across time and states. In order to get a long daily time series for the national trend, we follow a similar approach, pulling each six-month time period along with the initial six months. Because Google Trends data are based on a sample, we average over 10 different samples to reduce the noise (Stephens-Davidowitz & Varian, 2014). It is important to note that the models that include Google Trends are true nowcasts in the sense that they use information from within the target week. Google Trends data are available with a delay of approximately 36 hours, so for forecasts made on Thursday morning for the data that will be released the following week, we already have Google Trends data through Monday of the reference week. In order to use all information available at the time of our forecast, we create two different Google Trends variables: one is the average of the two days that are available in the current week, and the other is the average over the previous week, whose dates line up with the latest available UI claims numbers. Similar to the UI claims, the Google Trends data are normalized by their average value from the week ending February 15th. Since there are many cases of zeros in Google Trends, we first add 1 to all values; the prior-week variable is then divided by the average of 2/9 through 2/15, and the current-week variable by the average value of 2/9 and 2/10. Google Trends data are available from January of 2014. We use the full series for the national time series model. For the panel models we start in February of 2020, consistent with our other panel models.
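A rough sketch of the sample-averaging and baseline normalization described above, using hypothetical index values (the renormalization against California is omitted here):

```python
# Hypothetical Google Trends draws for the baseline week ending 2/15/2020;
# each inner list is one sample's seven daily index values for a state.
samples = [
    [0, 2, 1, 3, 2, 4, 3],
    [1, 1, 2, 2, 3, 3, 4],
]

# Average across samples day-by-day to reduce Trends sampling noise.
avg_daily = [sum(day) / len(samples) for day in zip(*samples)]

# Add 1 to every value to handle zeros, then form the baseline as the
# average over the baseline week.
baseline = sum(v + 1 for v in avg_daily) / len(avg_daily)

# A later (hypothetical) daily value is normalized against that baseline.
later_value = 12.0
normalized = (later_value + 1) / baseline
```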

Models

Our forecast target is weekly US national initial unemployment claims with a focus on the first release, which is announced the Thursday of the week following the reference week. The nomenclature for the horizon of our forecast is tricky. Models that do not include Google Trends or disaster emergency declarations data can plausibly be considered one-week-ahead forecasts, since they are based on data through the previous week. However, the UI claims data on the previous week are only released on Thursday morning of the current week, which means we cannot produce the forecasts until well into the current week. Thus if we name the horizon based on when we are able to make the forecast, it would be a current-week forecast, also known as a nowcast. We estimate each of our models with data available on Thursday morning of the reference week. Thus the Google Trends data are available through Monday of that week, and declarations for the current week made after Wednesday are assumed not to have been made at the time of the forecast. Following Banbura, Giannone, Modugno, and Reichlin (2013), we use the term “nowcasts” for all models. We consider several combinations of data structures and information sets. Data structures considered include US state-level panel models, state time series models that do not exploit the panel dimension, and national time series models that do not exploit any disaggregate information. For the state-level data, we sum up to the national level for evaluation of the forecast. For information sets, we use single-series autoregressive models, Google Trends at the state and national levels, depending on the model, and dummy variables to convert the event time of the emergency declaration by state into calendar time. In the main results section, we report a total of six different models, where in the case of state-level models we aggregate back to the US national values for the final forecast, which we describe below in greater detail. 
Our main model is a panel model with dummy variables representing the distance, in weeks, between a given week and each state's emergency declaration date. The variation in this information is what gives our panel good forecasting properties. The panel regression is weighted based on covered employment for each state for the week ending March 7th. In this model, state differentials in declarations timing give us information about future UI claims behavior. The states fall into four different groups based on the timing of their emergency declaration. Washington state was the first to declare an emergency, on February 29th, which we classify as a "week 1" declaration. The remaining states all followed suit over the next three weeks (see Table 1) and are given labels of weeks 2 through 4. Our first nowcast was made on March 12th for the week ending March 14th. The timing of our nowcasting exercise affects the availability of information on declaration dates. To illustrate, for the nowcast for the week ending March 14th, we use all information available up to March 12th, which includes the UI claims release that morning covering the week ending March 7th, and all declarations up to March 11th. All week 1 and week 2 declaration states have dummies set equal to 1 in their respective week. However, for week 3 states, only about half of the states have dummy variables set equal to 1 for the week ending March 14th in this particular forecast vintage, because declarations after March 11th are not included. All other week 3 and week 4 declaration states have dummy variables set equal to 0 in all periods in this vintage. For the March 21st forecast vintage, the remaining week 3 states would have their dummy variables set equal to 1 for the week ending March 14th, and so on. In terms of notation, the event-time index j records how many weeks state i is, at week t, from its emergency declaration.
So, for instance, in the week ending March 14th, the dummy variable for Washington, whose declaration date is February 29th, is 1 for j = 2, and 0 otherwise. By the same logic, the dummy variable in the same time period for California (declaration date of March 4th) is 1 for j = 1, and 0 otherwise. We also include dummy variables for the two periods in advance of the declaration once the declaration date is known. These leading dummy variables serve two purposes: they absorb any pre-declaration explanatory power, and they serve a "placebo" role in inference related to declaration pre-trends, though that is not a focus of our model design. Thus we estimate a full set of event-time dummy variables: two leads, one concurrent with the declaration date, and one for each available lag week since the week ending February 29th. We also estimate panel models that include Google Trends data for each state, where one Trends variable is the average of the Sunday and Monday of the current week and the other is the average over the entire prior week. Normalized state claims are aggregated to national claims in several steps. First, we undo the normalization by multiplying estimated normalized claims by the February 15th value of state claims. Next, we aggregate to the national level by summing the 52 state estimates.
Finally, because the estimator for the normalized data is not unbiased in terms of the national sum, we include a bias-adjustment term that is equal to the percent error from the last in-sample predicted value. This is akin to intercept-correcting a forecast based on the last estimated error. The right-hand-side variables consist of lagged values of the data, and are only available once the information set at time t is available. The panel models are then compared to state and national time series models. For state-level autoregressive models, we model the natural log of the state claims separately for each state. We aggregate the state claims to the national level, first by exponentiating and adjusting for the variance of the residuals to correct for bias from Jensen's inequality, and then summing across the 52 states. The national-level model with Google Trends mirrors the panel specification, with normalized US claims as the modeled variable. When calculating the estimated US claims value, we use the same aggregation bias-correction method as the state model, but with national claims instead of state claims. The national-level autoregressive model of order p is estimated in the same way as the state models, with the level value calculated using the estimated residual variance to adjust for Jensen's inequality.
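The log-to-level conversion with the Jensen's-inequality adjustment can be sketched as follows; the point forecast and residual variance are hypothetical numbers, not estimates from the paper.

```python
import math

# Hypothetical log-scale AR point forecast and residual variance.
log_forecast = math.log(250_000.0)
resid_var = 0.04

# Naively exponentiating a log forecast understates the level forecast:
# under normal errors, E[exp(y)] = exp(mu + sigma^2 / 2).
naive_level = math.exp(log_forecast)
adjusted_level = math.exp(log_forecast + resid_var / 2.0)
```

The adjustment matters more for the state-level models, where residual variances are larger, because the upward correction compounds when the 52 state forecasts are summed.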
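The estimating equations themselves did not survive extraction here. Based on the description above, the main declarations panel specification can be written schematically as follows; this is our reconstruction, not the paper's exact notation.

```latex
% Schematic form of the declarations dummy-variable panel model,
% reconstructed from the surrounding text; the regression is weighted
% by state covered employment.
\begin{equation}
  \tilde{c}_{it} \;=\; \alpha \;+\; \sum_{j=-2}^{J} \beta_j \, d_{j,it} \;+\; \varepsilon_{it}
\end{equation}
% where $\tilde{c}_{it}$ is state $i$'s week-$t$ claims normalized by its
% week-ending-2/15/2020 value, and $d_{j,it} = 1$ when week $t$ is
% $j$ weeks from state $i$'s emergency-declaration week.
```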

Results

Weekly summary

We report forecasts for 10 weeks, from the week ending on March 14th, 2020, through the week ending May 16th, 2020. The forecasts are depicted in Fig. 3 along with the actual values of the advance data for the national UI claims, which are the target of the forecasts. Forecast percentage errors (defined as forecast minus actual as a share of actual) are reported in Table 2. The models reported line up with the equations in the previous section. Models 1 through 3 use state-level data, while models 4 and 5 use national-level data. Models 1 and 2 use a panel data structure, with our main focus being the panel declarations dummy variable ("declarations DV") model. The remaining models use a time series data structure. For the autoregressive models, both state and national, we report results for our chosen lag order p. We include Google Trends in the panel framework for model 2 and in the national time series framework for model 4.
Fig. 3

Alternative Forecasts of Weekly UI Claims.

Table 2

Forecasting results.

| Week ending | National claims | [1] State panel, declarations DV | [2] State panel, Google Trends | [3] State time series, AR | [4] National, Google Trends | [5] National, AR | Average |
| 3/14/2020 | 250,869 | −19% | −20% | −16% | −21% | −16% | −18% |
| 3/21/2020 | 2,898,392 | −88% | −91% | −92% | −76% | −91% | −88% |
| 3/28/2020 | 5,823,757 | −38% | −10% | −78% | −44% | −69% | −48% |
| 4/4/2020 | 6,203,348 | 4% | 50% | −55% | 17% | −24% | −2% |
| 4/11/2020 | 4,971,820 | 15% | 13% | −16% | −94% | 5% | −16% |
| 4/18/2020 | 4,267,394 | 4% | 29% | −10% | 53% | −3% | 15% |
| 4/25/2020 | 3,489,173 | −14% | 20% | −3% | 25% | 3% | 6% |
| 5/2/2020 | 2,849,079 | 8% | 19% | 0% | 27% | 5% | 12% |
| 5/9/2020 | 2,345,376 | 15% | 8% | 1% | −15% | 5% | 3% |
| 5/16/2020 | 2,174,298 | 10% | 4% | −5% | 29% | −6% | 6% |
| Mean error | | −323,808 | 321,712 | −1,205,880 | −428,690 | −785,164 | −484,366 |
| MAE | | 728,504 | 972,504 | 1,212,717 | 1,538,948 | 904,441 | 762,507 |
| RMSE | | 1,114,431 | 1,401,606 | 2,002,762 | 2,032,517 | 1,599,849 | 1,237,357 |
| MAE-DM (adjusted) | | | −1.991 | −1.670 | −3.648 | −1.010 | |
| MAE-DM (adj) p-value | | | 0.0776 | 0.1293 | 0.0053 | 0.3387 | |

Notes: This table presents percentage forecast errors ((Forecast - Actual)/ Actual) made on the Thursday of the week ending in the date listed in the row. MAE is the mean absolute forecast error. RMSE is the square-root of the mean squared forecast error. MAE-DM is the Diebold-Mariano small-sample test statistic relative to model [1], with negative sign implying results in favor of model [1]. All statistics are at the national level, including Mean Error, MAE, RMSE, and MAE-DM, which are calculated over the 10 forecast periods.
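The percentage-error, MAE, and RMSE definitions in the notes above can be written out as follows; the forecast values are illustrative, while the actuals are the first three weeks of national claims from Table 2.

```python
# Error metrics used in Table 2, computed from illustrative forecasts.
forecasts = [200_000.0, 350_000.0, 6_000_000.0]              # hypothetical
actuals = [250_869.0, 2_898_392.0, 5_823_757.0]              # advance national claims

errors = [f - a for f, a in zip(forecasts, actuals)]
pct_errors = [e / a for e, a in zip(errors, actuals)]        # (Forecast - Actual) / Actual
mae = sum(abs(e) for e in errors) / len(errors)              # mean absolute error
rmse = (sum(e ** 2 for e in errors) / len(errors)) ** 0.5    # root mean squared error
```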

Focusing on the beginning of the sample, all models perform poorly in the first two weeks. The popular narrative surrounding UI claims highlights the spike for the week ending March 21st, when national claims jumped to a historic high of nearly three million from the previous week's level of just over 250,000. However, there was also a spike in the week ending March 14th, when claims rose from around 200,000 to 250,000. We can see from Table 2 that in the first week all models miss by double-digit percentage points, and in the second week the absolute forecast errors rise to 76%–92%. In these two periods, no model forecasts substantially better or worse than any other. In the third week of our forecast analysis, the week ending March 28th, the panel models perform better than the other models, with the state Google Trends model missing by only 10% and the declarations DV model performing second best. The information contained in the declaration week dummy variables was fully populated in this period, with every state having declared a state of emergency by the time of the forecast. The experiences of Washington, California, New York, and some of the earliest-hit states were predictive of patterns exhibited in other states. The timing differentials were key to forecasting the variation in UI claims in states with later declarations. Table 3 presents our parameter estimates for each vintage week for the state declaration models. We add an additional parameter each week as we move one week further away from the first emergency declaration. Parameter estimates improve from week to week as we get more variation from additional states. For example, if we look at the estimated parameter on the j = 1 dummy variable, we can follow across the row to see how the parameter varies as we get more information from more states. For the model estimated with data through March 12th for the week ending March 14th, there is only one observation that has a one for this variable: Washington in the latest period. Then the following week, the states in the week 2 group have a one for this variable in the last week, and Washington in the week before, so we get a more precise estimate, and so on for two more weeks, until we have as much information as we can get from the states after four weeks. Then for week 5, we still have data revisions that affect the parameter estimates, but the estimates are stable after that. A similar pattern appears for all the coefficients on the dummy variables.
Table 3

Model [1] (Declarations DV Model) Vintage Estimates.

Dependent variable: normalized state UI claims (StateNormClaims). Columns give the vintage forecast model for the week ending on the column date (all weeks ending in 2020).

| Parameter | 3/14 | 3/21 | 3/28 | 4/4 | 4/11 | 4/18 | 4/25 | 5/2 | 5/9 | 5/16 |
| d(j=−2) | −0.0583 | −0.0653 | −0.0653 | −0.0653 | −0.0653 | −0.0653 | −0.0653 | −0.0653 | −0.0653 | −0.0653 |
| d(j=−1) | 0.0600 | 0.0202 | 0.0218 | 0.0218 | 0.0218 | 0.0218 | 0.0218 | 0.0218 | 0.0218 | 0.0218 |
| d(j=0) | −0.0319 | 0.0758⁎⁎ | 1.378 | 1.403 | 1.403 | 1.403 | 1.403 | 1.403 | 1.403 | 1.403 |
| d(j=1) | 0.113⁎⁎⁎ | 0.170⁎⁎ | 11.39⁎⁎⁎ | 12.42⁎⁎⁎ | 12.44⁎⁎⁎ | 12.44⁎⁎⁎ | 12.44⁎⁎⁎ | 12.44⁎⁎⁎ | 12.44⁎⁎⁎ | 12.44⁎⁎⁎ |
| d(j=2) | | 1.377⁎⁎⁎ | 6.122⁎⁎⁎ | 24.37⁎⁎⁎ | 24.38⁎⁎⁎ | 24.42⁎⁎⁎ | 24.42⁎⁎⁎ | 24.42⁎⁎⁎ | 24.42⁎⁎⁎ | 24.42⁎⁎⁎ |
| d(j=3) | | | 20.50⁎⁎⁎ | 25.36⁎⁎⁎ | 32.53⁎⁎⁎ | 31.77⁎⁎⁎ | 31.80⁎⁎⁎ | 31.80⁎⁎⁎ | 31.80⁎⁎⁎ | 31.80⁎⁎⁎ |
| d(j=4) | | | | 29.21⁎⁎⁎ | 26.61⁎⁎⁎ | 27.62⁎⁎⁎ | 27.00⁎⁎⁎ | 27.03⁎⁎⁎ | 27.03⁎⁎⁎ | 27.03⁎⁎⁎ |
| d(j=5) | | | | | 27.49⁎⁎⁎ | 22.62⁎⁎⁎ | 26.89⁎⁎⁎ | 26.17⁎⁎⁎ | 26.23⁎⁎⁎ | 26.23⁎⁎⁎ |
| d(j=6) | | | | | | 23.25⁎⁎⁎ | 15.75⁎⁎⁎ | 21.35⁎⁎⁎ | 21.01⁎⁎⁎ | 21.18⁎⁎⁎ |
| d(j=7) | | | | | | | 13.35⁎⁎⁎ | 11.79⁎⁎⁎ | 15.24⁎⁎⁎ | 14.67⁎⁎⁎ |
| d(j=8) | | | | | | | | 22.48⁎⁎⁎ | 11.90⁎⁎⁎ | 13.48⁎⁎⁎ |
| d(j=9) | | | | | | | | | 16.58⁎⁎⁎ | 9.294⁎⁎⁎ |
| d(j=10) | | | | | | | | | | 17.72⁎⁎⁎ |
| Constant | 1.005⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ | 1.017⁎⁎⁎ |
| Obs | 260 | 312 | 364 | 416 | 468 | 520 | 572 | 624 | 676 | 728 |
| RMSE | 0.192 | 0.201 | 4.693 | 7.364 | 8.313 | 8.686 | 10.18 | 11.02 | 10.99 | 10.91 |
| R2 | 0.026 | 0.201 | 0.432 | 0.610 | 0.681 | 0.681 | 0.614 | 0.565 | 0.547 | 0.533 |

Notes: This table presents parameters estimated using the column-vintage declarations dummy variable models. These correspond to forecasts made on the date in the column header for the week ending that Saturday (two days following). These models exploit disaster declarations made up to the day prior to the date in the column header. Parameters stabilize after five weeks, due to the presence of four separate weeks of disaster declarations and because some declarations do not occur until after the forecast timing cutoff for that week. In the sixth vintage week, a final revision is made due to revisions to the prior week’s data.

Standard errors are omitted from the table. ⁎⁎⁎, ⁎⁎, and ⁎ denote statistical significance at the 1%, 5%, and 10% levels, respectively.

In week 4, the declarations DV model reaches its peak in terms of benefit relative to other models. The forecast misses by only 4%, where no other model has an error of less than 17%. In sum, in weeks 3 and 4, the information from the emergency declarations results in forecast improvements relative to the models without emergency declarations. This period is crucial because the week ending April 4th was the week of peak claims in our sample (and, up to that point, in the history of the series), with over 6.2 million people filing initial unemployment claims across the country. The percentage forecast error does not fully capture the absolute forecast error in these periods; absolute forecast errors for this period are about 250,000 for the declarations DV model, compared to 1,000,000 to well over 3,000,000 for the alternative models. In the latter case, these errors far exceed the previous pre-COVID-19 high in actual claims of about 1,000,000 (see the appendix). In the weeks ending on April 11th and 18th, the relative strength of the forecasting approaches becomes less clear, and it is debatable which forecasting approach is preferred. The trend in the forecast errors for the panel models clearly points to plateauing performance. The state and national AR models, on the other hand, have similar forecast errors but show monotonic improvements in forecasting performance since March 21st. The national Google Trends model continues to forecast poorly in this period. From the week ending April 25th through the end of the sample on May 16th, the AR models are slightly preferred relative to the panel models, and are much better than the national Google Trends model, though the percent errors are generally smaller across all models and the absolute number of claims is also smaller. Consequently, the cost of choosing an inferior forecast in this period is much lower than in the earlier periods.
Percent forecast errors for the AR models hover between 1% and 6%, while the panel models continue to average around 10% to 15%. This corresponds to absolute error differences between the two forecasting approaches of about 100,000 to 200,000.
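The distinction drawn above between percent and absolute forecast errors can be made concrete with a small sketch. All numbers below are illustrative placeholders, not the paper's data, and `pct_error` and `abs_error` are hypothetical helper names.

```python
import numpy as np

# Hypothetical weekly national claims (actuals) and two model forecasts,
# in thousands of filings -- illustrative numbers, not the paper's data.
actual = np.array([6200.0, 5300.0, 4400.0, 3800.0])
fc_panel = np.array([5950.0, 4800.0, 3900.0, 3300.0])  # declarations DV model
fc_ar = np.array([3100.0, 4900.0, 4200.0, 3750.0])     # AR model

def pct_error(actual, forecast):
    """Percent forecast error |A - F| / A, per period."""
    return np.abs(actual - forecast) / actual

def abs_error(actual, forecast):
    """Absolute forecast error |A - F|, per period."""
    return np.abs(actual - forecast)

# A small percent error at a 6-million-claim peak still corresponds to a
# large absolute miss, which is why both metrics are reported.
print(pct_error(actual, fc_panel).round(3))
print(pct_error(actual, fc_ar).round(3))
print(abs_error(actual, fc_ar))  # absolute misses, in thousands
```

The point of the sketch is that the same model can look good on one metric and poor on the other when the level of the series moves by an order of magnitude.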

Discussion

Based on our analysis, we can divide our sample into four distinct periods: (1) the first two weeks, when the crisis is ramping up and all models perform poorly; (2) the next several weeks, when the panel model with declaration DVs stands out; (3) a period of ambiguity, where the models begin to converge; and (4) the end of the sample, when the AR models stand out as best. Fig. 4 illustrates this pattern by zooming in on the forecasts of the declarations DV model and the state autoregressive model. In the periods immediately following the onset of the COVID-19 outbreak, both the level of claims and the percent errors are larger. Accordingly, over the full sample, the Harvey, Leybourne, and Newbold adjusted Diebold–Mariano (“DM”) test (Diebold and Mariano, 2002, Harvey et al., 1997) rejects equal predictive accuracy in favor of the declaration-dates panel model when compared with either of the Google Trends models (models 2 and 4) at the 10% significance level. The AR models also perform worse over the full sample, but not statistically significantly so.
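The equal-predictive-accuracy comparison above can be sketched in code. The function below is a minimal implementation of the DM statistic with the Harvey–Leybourne–Newbold small-sample adjustment under absolute-error loss, assuming SciPy is available; the error series in any application of it would come from the competing forecasts.

```python
import numpy as np
from scipy import stats

def dm_test_hln(e1, e2, h=1):
    """Diebold-Mariano test of equal predictive accuracy with the
    Harvey-Leybourne-Newbold (1997) small-sample adjustment.
    e1, e2: forecast-error series from two models; absolute-error loss.
    Returns the adjusted DM statistic and a two-sided p-value from t(T-1)."""
    d = np.abs(e1) - np.abs(e2)  # loss differential
    T = len(d)
    dbar = d.mean()
    # Long-run variance of dbar; for h = 1 only the lag-0 autocovariance enters.
    gamma = [np.sum((d[k:] - dbar) * (d[:T - k] - dbar)) / T for k in range(h)]
    var_dbar = (gamma[0] + 2 * sum(gamma[1:])) / T
    dm = dbar / np.sqrt(var_dbar)
    # HLN correction factor and Student-t reference distribution.
    k_hln = np.sqrt((T + 1 - 2 * h + h * (h - 1) / T) / T)
    dm_adj = k_hln * dm
    p = 2 * stats.t.sf(abs(dm_adj), df=T - 1)
    return dm_adj, p
```

With only ten weekly forecasts, the small-sample correction and the t rather than normal reference distribution matter, which is why the adjusted version of the test is the natural choice here.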
Fig. 4

State Panel Declarations DV versus State AR Models. Notes: This figure presents the MAEs for two forecasts: [1] (“Declarations DV”) and [3] (“State AR”) from Table 2. The figure shows nearly equal MAEs in the first week (week 1). In weeks 2 and 3, the Panel Declarations DV forecast substantially outperforms the State AR forecast. In weeks 4 and 5, both perform similarly. After week 5, the State AR forecast is consistently more accurate than the Declarations forecast.

We believe these results have external validity beyond the experience in our specific setting. In general, forecasting UI claims in the time of COVID-19 involves a classic bias–variance tradeoff. Exploiting disaggregate information in the periods immediately following a break acts as a form of costly insurance. Early on, the information gained from the panel model is preferred; in later periods, the inefficiency of estimating individual dummy variable coefficients dominates. Fig. 5 illustrates this phenomenon at the state level. In the April 4th vintage, the actual values are almost uniformly above the state AR model forecasts. By the May 16th vintage, the bias is mostly gone and is no longer significant. The declarations DV model, on the other hand, is not significantly biased in either period, but it has a larger variance around the actual values, particularly for the week ending May 16th. The vintage-by-vintage coefficients from the declarations DV models in Table 3 show a clear trend in the parameters when reading down the rows within each column. In the final-period model, the coefficients trace out a hump-shaped curve with a tail that is decreasing at a decreasing rate. This is an inefficient way of modeling what is essentially the same type of decay found in an AR model.22 Consequently, AR models outperform the panel DV model at the end of the sample.
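The contrast between the declaration-dummy structure and AR-style decay can be illustrated with a stylized panel. Everything below is simulated for illustration (the hump-shaped `path`, the state count, the declaration timing, and the noise level are all made up): staggered declaration dates identify event-time dummy coefficients that trace out the hump the text describes, at the cost of one free parameter per horizon.

```python
import numpy as np

# Stylized panel: each "state" declares a state of emergency in a different
# week; claims follow a common hump-shaped path in event time (weeks since
# declaration). All numbers are illustrative, not the paper's data.
rng = np.random.default_rng(42)
path = np.array([50.0, 300.0, 220.0, 120.0, 70.0, 45.0])  # hump, then decay
states, horizons = 20, len(path)
declare_week = rng.integers(0, 3, size=states)  # staggered declaration dates

rows, y = [], []
for s in range(states):
    for w in range(horizons + 2):
        k = w - declare_week[s]  # weeks since this state's declaration
        if 0 <= k < horizons:
            dummy = np.zeros(horizons)
            dummy[k] = 1.0       # event-time dummy variable
            rows.append(dummy)
            y.append(path[k] + rng.normal(0, 5.0))

X, y = np.array(rows), np.array(y)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
# Each coefficient estimates average claims k weeks after declaration; the
# estimates recover the hump shape that an AR model would instead compress
# into one or two decay parameters.
print(beta.round(1))
```

Once the post-break path settles into geometric-looking decay, the free-form dummies add estimation noise without adding information, which is the mechanism behind the AR models' late-sample advantage.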
Fig. 5

State-Level Forecasts, Declarations DV versus State AR. Notes: Data present actual and forecast UI claims by state in two weeks: the week ending April 4th and the week ending May 16th. The Declarations DV forecasts are aggregation-bias corrected. Estimated parameters (and standard errors in parentheses) are from the following Holden and Peel (1990) models estimated across states s within a particular week, where F_s is the forecast in the figure panel: A_s − F_s = α + ε_s. State-level forecasts are shown in greater detail in Table A.6.
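The unbiasedness check referenced in the figure note amounts to a cross-sectional regression of forecast errors on a constant. Below is a minimal sketch with made-up data; `holden_peel_bias_test` is a hypothetical helper name, and the regression collapses to a t-test on the mean forecast error.

```python
import numpy as np
from scipy import stats

def holden_peel_bias_test(actual, forecast):
    """Unbiasedness check in the Holden-Peel (1990) spirit: regress the
    forecast error A_s - F_s on a constant and t-test whether the
    intercept is zero. The cross-section is states within one week."""
    err = np.asarray(actual) - np.asarray(forecast)
    n = len(err)
    alpha = err.mean()                  # estimated bias (the intercept)
    se = err.std(ddof=1) / np.sqrt(n)   # standard error of the mean
    t_stat = alpha / se
    p = 2 * stats.t.sf(abs(t_stat), df=n - 1)
    return alpha, se, p

# Illustrative cross-section of 51 "states" in one week, where the model
# systematically under-predicts (as the state AR forecasts did early on).
rng = np.random.default_rng(1)
forecast = rng.uniform(50, 400, size=51)
actual = forecast + 80 + rng.normal(0, 20, size=51)
alpha, se, p = holden_peel_bias_test(actual, forecast)
# A significantly positive intercept signals under-prediction bias.
```

A significant positive intercept corresponds to the April 4th pattern in the figure (actuals uniformly above the AR forecasts); an insignificant intercept corresponds to the May 16th vintage.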

Conclusion

In this paper we produced current-week forecasts (also known as nowcasts) of the national total of state initial unemployment insurance (UI) claims for ten weeks in the midst of the COVID-19 pandemic in the US. We considered different data structures and information sets and compared their performance. We found that in the weeks immediately following the jump in UI claims associated with the COVID-19 crisis, a panel model exploiting the timing variation in states declaring a state of emergency performed remarkably well, with the lowest mean absolute error for the full sample across the competing models, and statistically significantly better than models that included the Google Trends data available at the time the forecasts were made. Autoregressive models caught up within a few weeks and, in the last weeks of the sample, had the smallest absolute errors.

Prior research has emphasized the usefulness of simple autoregressive models in normal times, but has recommended much more complicated models for recessions, in order to incorporate sufficient relevant information to identify the change in regime (Chauvet & Potter, 2013). Our findings support the view that simple autoregressive models miss dramatic changes, but we showed that it is possible in certain instances to exploit panel information when there is variation in timing in the cross section.23 This allows us to continue using a simple model for forecasting in a time of structural change. Our analysis also emphasizes that autoregressive models are again remarkably useful just a few periods after a break, consistent with the findings of Schorfheide and Song (2020), Primiceri and Tambalotti (2020), and Lenza and Primiceri (2020).

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
