Hajime Tomura
Faculty of Political Science and Economics, Waseda University, Shinjyuku, Tokyo, Japan.
Abstract
BACKGROUND: Social distancing measures to prevent the spread of COVID-19 included restrictions on retail services in many countries. In some countries, governments also subsidized consumer spending on some retail services to help struggling businesses. To evaluate the costs and benefits of government interventions in retail services, it is necessary to measure the infectiousness of each type of consumer activity.
METHODS: This study regresses the log difference over seven days in the number of new confirmed cases of COVID-19 in Japan on lagged values of household expenditures per household on eating out, traveling, admissions to entertainment facilities, clothing and footwear, and other items, as well as a measure of mobility in public transportation in the past 14 days. The sample period of the dependent variable is set from March 1, 2020, to February 1, 2021, in order to avoid a possible structural break due to the spread of mutant strains in 2021. The regression model is estimated by the Bayesian method with a non-informative (improper) prior. The estimated model is evaluated by out-of-sample forecast performance from February 2, 2021, onward.
RESULTS: The out-of-sample forecasts based on the posterior means of the regression coefficients perform well until the Delta variant began to spread in Japan in June 2021. The R² for the out-of-sample forecasts from February 2, 2021, to June 30, 2021, is 0.60. The dependent variable of the regression overshot the out-of-sample forecasts from mid-June to August 2021. Then, the out-of-sample forecasts overpredicted the dependent variable for the rest of 2021.
CONCLUSION: The estimated model is potentially useful for simulating changes in the number of new confirmed cases due to household spending on retail services, provided that it can be adjusted to real-time developments in mutant strains and vaccinations. Such simulations would help in designing cost-efficient government interventions.
Introduction
After the onset of the COVID-19 pandemic, social distancing measures were introduced as part of non-pharmaceutical interventions (NPIs) to contain the spread of COVID-19 across countries [1]. The mitigating effects of these measures on the transmission of COVID-19 have been confirmed by estimates of associations between the introduction of NPIs and subsequent reductions in the time-varying reproduction number, i.e., the average number of secondary cases of infection generated by a primary case at each point in time, across countries [2-10]. Similar associations have also been found between NPIs and the number of reported cases of COVID-19 or the COVID-related mortality rate [11, 12]. These findings are consistent with the predictions of simulation analyses using theoretical epidemiological models [13-15].

Social distancing measures, however, caused revenue losses for businesses, as they restricted physical retail services through stay-at-home orders for residents or more targeted interventions in specific businesses, such as temporary closures of bars and restaurants [16]. Thus, a trade-off arose between containing the spread of COVID-19 and maintaining economic activities [17]. This trade-off has become a major challenge for policy makers across countries.

To compare the costs and benefits of social distancing measures, it is necessary to associate the protective effects of these measures with the economic values of restricted activities. For this purpose, hybrid frameworks have been developed to combine epidemiological models with economic models in which the effects of NPIs on economic production are calibrated [18, 19]. While there is no feedback effect from economic activities to the spread of COVID-19 in this type of framework, theoretical economists have developed a different type of model that introduces behavioral responses by households and firms to COVID-19 and NPIs into epidemiological models [20-22]. This type of model features interdependence between aggregate (i.e., total) consumption in the economy and the spread of COVID-19, focusing on the responses of the entire economy to COVID-19.

One of the remaining questions is how to estimate associations between the values of more detailed types of consumer activities and the spread of COVID-19. This question is motivated by the fact that governments tended to intervene in only limited ranges of retail services, in order to minimize associated revenue losses for businesses [16, 23]. Also, in some countries, governments subsidized consumer spending on some retail services to help struggling businesses. For example, the U.K. government introduced the “Eat Out to Help Out Scheme” in August 2020, which offered 50% discounts on meals up to 10 pounds per diner after the nationwide lockdown from March to May 2020 [24]. Also, the Japanese government introduced the “Go-To-Eat” and “Go-To-Travel” campaigns in the second half of 2020, which subsidized the expenses of eating out, including alcoholic drinks, and the expenses of accommodation and transportation for domestic travel, respectively [25, 26]. The introduction of these campaigns followed the declaration of the first state of emergency from April to May 2020 by the Japanese government.

On the Eat Out to Help Out Scheme in the U.K., an association between the number of bars and restaurants participating in the scheme and the spread of COVID-19 in each region has been confirmed by a panel study using rainfall as an exogenous instrument in the difference-in-differences method [27]. On the Go-To-Travel campaign in Japan, an association between participation in the campaign and the incidence of symptoms indicative of COVID-19 has been reported by a cross-sectional study using Internet survey data [28]. More generally, there have been studies estimating associations between people’s visits to retail services and the spread of COVID-19 by using mobility data across countries [29-36]. While these studies confirmed associations between detailed types of consumer activities and the spread of COVID-19, they used data on physical visits or participation as the measures of consumer activities. It remains an issue to associate the economic values of detailed types of consumer activities with their contributions to the spread of COVID-19.

On this issue, this study estimates associations between classified components of household expenditures and the rate of change in the number of new confirmed cases of COVID-19 by applying a time-series regression model to daily nationwide data in Japan. The estimated regression coefficients measure the infectiousness of each type of consumer activity per unit value of spending per household. In addition, household expenditure data are convenient for estimating associations between detailed types of consumer activities and the spread of COVID-19 simultaneously, because they are categorized according to the classification of goods and services purchased by households. Using the estimated regression coefficients, this study decomposes the rate of change in the number of new confirmed cases of COVID-19 on each date into its associations with the values of spending on classified categories of consumer goods and services, including eating out and traveling.
Methods
Regression model
This study regresses the log difference over seven days in the number of new confirmed cases of COVID-19 on lagged values of classified components of household expenditures and a measure of mobility in public transportation in the past 14 days. The regression form is derived from the law of motion for the number of new infections described by [37]:
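A form of Eq (1) that is consistent with the definitions given next, and with the renewal equation in [37], can be sketched as follows. The upper limit of the sum and the symbol $u_t$ for the expectation error are notational assumptions.

$$I_t = R_t \sum_{s=1}^{t} w_s I_{t-s} + u_t \tag{1}$$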
where $I_t$ is the number of new infections on date $t$, $R_t$ is the instantaneous reproduction number on date $t$, and $w_s$ is the infectivity function for infection incidence $s$ days ago. The first term on the right-hand side is the expected value of $I_t$, given the past numbers of new infections, $I_{t-1}, I_{t-2}, \ldots$. The second term is the expectation error due to the difference between the expected value and the realized value of $I_t$.

To derive the regression form for this study, a formula for the number of new confirmed cases of infection on date $t$, $N_t$, is also specified as follows:
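Based on the description that follows, Eq (2) can be sketched as below. The symbol $v_t$ for the residual is a notational assumption.

$$N_t = (1-\mu) \sum_{m=1}^{M} p_m I_{t-m} + v_t \tag{2}$$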
where the first term on the right-hand side is the number of new symptomatic patients on date $t$, for which $p_m$ denotes the probability of the incubation period being $m$ days, $M$ is the maximum length of an incubation period, and $\mu$ is the probability that an infected individual is asymptomatic. These probabilities are multiplied by the number of new infections $m$ days ago, $I_{t-m}$, assuming that the law of large numbers is applicable. The second term is the residual accounting for reporting delays, false test results, and the detection of presymptomatic and asymptomatic infected individuals by testing.

Combining Eqs (1) and (2) yields the following equation:
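As a sketch of this step, substituting Eq (1), dated $t-m$, for each $I_{t-m}$ in Eq (2) gives an expression of the following form. Here $u_t$ and $v_t$ are assumed symbols for the error terms in Eqs (1) and (2), respectively, and the upper limit of the inner sum is likewise an assumption.

$$N_t = (1-\mu) \sum_{m=1}^{M} p_m R_{t-m} \sum_{s=1}^{t-m} w_s I_{t-m-s} + \left[ (1-\mu) \sum_{m=1}^{M} p_m u_{t-m} + v_t \right] \tag{3}$$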
where the second term on the right-hand side is the residual due to the error terms in Eqs (1) and (2). Because the number of new infections, $I_t$, is not directly observable, the total infectiousness of infected individuals, $\sum_{s} w_s I_{t-m-s}$, for each incubation period, $m$, is approximated in Eq (4) by the number of new confirmed cases in the middle of the range of incubation periods, $N_{t-M/2}$, with a linear coefficient, $\phi$.

Because the total infectiousness of infected individuals is summed over the range of incubation periods on the right-hand side of Eq (3), Eq (4) aims to approximate the average infectiousness of infected individuals over the range of incubation periods by the number of new confirmed cases of infection in the middle of the range. Substituting Eq (4) into Eq (3) yields the following equation:
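The approximation and the resulting equation can be sketched as follows, under the reading that "the middle of the range of incubation periods" corresponds to a lag of $M/2$ days. The symbol $\xi_t$ for the combined residual is a notational assumption.

$$\sum_{s} w_s I_{t-m-s} \approx \phi N_{t-M/2}, \qquad m = 1, 2, \ldots, M \tag{4}$$

$$\frac{N_t}{N_{t-M/2}} = (1-\mu)\,\phi \sum_{m=1}^{M} p_m R_{t-m} + \xi_t \tag{5}$$

Eq (5) follows from dividing both sides of Eq (3) by $N_{t-M/2}$ after applying Eq (4), so that the left-hand side becomes the gross rate of change in the number of new confirmed cases over $M/2$ days.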
where the second term on the right-hand side is the residual due to the error term in Eq (3) and the approximation error in Eq (4).

The instantaneous reproduction number, $R_t$, is assumed to be the following function of explanatory variables:
where the dummy variables denoted by D are, in turn, a time dummy for the year-end and new-year holiday period in Japan; a dummy variable for absolute humidity; and, for i = 1, 2, …, I, a time dummy for each state of emergency or the period before the first state of emergency. Here, I is the number of these time dummies related to states of emergency. The remaining explanatory variables are a set of classified components of household expenditures and a measure of mobility in public transportation, and J is the number of explanatory variables in this set.

The underlying assumption for Eq (6) is that each person has more chances of physical contact with other people if there are more consumer activities in the country. Because consumer activities associated with the same category of household expenditures can have different degrees of infectiousness depending upon the circumstances of spending, such as whether they are online or offline purchases, the coefficient of each variable in Eq (6) captures the average effect of the variable. Mobility in public transportation is also included as a separate explanatory variable, because the infectiousness of physical contacts in public transportation is unlikely to depend on what passengers do at their destinations. Thus, this effect should be measured separately from the infectiousness of consumer activities. In contrast, a measure of mobility at the places of retail services is not included in the explanatory variables, because the infectiousness of each type of consumer activity is captured by each component of household expenditures among the explanatory variables.

In addition, there are cross terms on the right-hand side of Eq (6), because the infectiousness of consumer activities and mobility in public transportation may depend on the weather, and might vary across each state of emergency and the transition period between the onset of the pandemic and the first state of emergency. Moreover, the year-end and new-year holiday dummy is included in the explanatory variables because it is customary in Japan for people to have home parties with families and friends, which can be infectious, during the holiday period around New Year's Day. These gatherings at home are not retail services, and are thus difficult to capture with household expenditure data.

Because the gross rate of change in the number of new confirmed cases on the left-hand side of Eq (5) cannot be negative by definition, it is replaced with the log difference in the number of new confirmed cases, $\ln N_t - \ln N_{t-M/2}$, for the dependent variable in Eq (5), so that there is no need for a restriction on the probability distribution of residuals in the regression due to a non-negativity constraint on the dependent variable. Then, substituting Eq (6) into Eq (5) yields the final form of the regression:
where $F$ denotes an operator that returns the weighted average of lagged values of a time-series variable, using the probability distribution of incubation periods, $\{p_m\}_{m=1}^{M}$, as the weights:
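Given this verbal definition, the operator in Eq (8) can be written in the following form, which assumes that the weights run over incubation periods of 1 to $M$ days as in Eq (2):

$$F(Y_t) = \sum_{m=1}^{M} p_m Y_{t-m} \tag{8}$$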
for an arbitrary time-series variable $Y_t$. The second term in Eq (7), $\epsilon_t$, is the residual due to the error term in Eq (5). This term is assumed to follow an AR(1) process with a normally distributed innovation, in order to allow for serial correlation of residuals:
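A standard way of writing such an AR(1) process is sketched below. The symbols $\rho$ for the autoregressive coefficient, $\eta_t$ for the innovation, and $\sigma$ for its standard deviation are notational assumptions.

$$\epsilon_t = \rho\, \epsilon_{t-1} + \eta_t, \qquad \eta_t \sim N(0, \sigma^2) \tag{9}$$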
Estimation method
The regression model is estimated by the Bayesian method, given the following parameter restrictions and the prior distribution of the initial value of $\epsilon_t$, denoted by $\epsilon_0$:
for i = 1, 2, …, I and j = 1, 2, …, J. The prior distribution of the parameters other than $\epsilon_0$ is an improper prior, i.e., a constant for any set of values of the parameters.

The parameter restrictions in Eq (10) reflect the assumption that consumer activities and mobility in public transportation contribute to the spread of COVID-19 under any condition, whereas their infectiousness is decreasing in the degree of absolute humidity. As explained in the next section, the assumption on the effect of absolute humidity is based on [38]. The parameter restriction in Eq (11) is the stationarity condition for the AR(1) process of $\epsilon_t$. In Eq (12), the prior distribution of $\epsilon_0$ is assumed to be the unconditional probability distribution of $\epsilon_t$.

R ver. 4.1.1 and CmdStan ver. 2.28.1 are used for the estimation of the regression model [39, 40]. The convergence of Markov chain Monte Carlo (MCMC) simulations is confirmed by the function named cmdstan_diagnose() in the CmdStanR package of R. The code and the dataset used in this study are available at https://github.com/hajimetomura/COVID19_household_exp.
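The stationarity condition and the prior for the initial residual described above can be sketched for a generic AR(1) process $\epsilon_t = \rho\,\epsilon_{t-1} + \eta_t$ with $\eta_t \sim N(0, \sigma^2)$, where $\rho$, $\eta_t$, and $\sigma$ are assumed notation. Under stationarity, the unconditional variance $V$ of $\epsilon_t$ satisfies $V = \rho^2 V + \sigma^2$, which gives the forms below for Eqs (11) and (12):

$$|\rho| < 1 \tag{11}$$

$$\epsilon_0 \sim N\!\left(0, \frac{\sigma^2}{1-\rho^2}\right) \tag{12}$$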
Data
Dependent variable
The maximum length of an incubation period, M, is set to 14 days, which is consistent with the guidance published by the Ministry of Health, Labour and Welfare of the Japanese government [41]. Therefore, the dependent variable is the log difference over seven days in the number of new confirmed cases of COVID-19. This feature of the dependent variable is convenient, as it eliminates the day-of-the-week effect on the dependent variable by comparing the numbers of new confirmed cases on the same days of two consecutive weeks. The daily number of new confirmed cases in Japan is published by the Ministry of Health, Labour and Welfare of the Japanese government [42].
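The construction of the dependent variable can be illustrated with a minimal sketch. The function name `log_diff` and the case counts below are hypothetical and are not taken from the study's code or data; the example assumes that days with zero cases are left undefined.

```python
import math

def log_diff(cases, lag=7):
    """Return ln(N_t) - ln(N_{t-lag}) for t >= lag (None where a count is zero)."""
    out = []
    for t in range(lag, len(cases)):
        now, before = cases[t], cases[t - lag]
        if now > 0 and before > 0:
            out.append(math.log(now) - math.log(before))
        else:
            out.append(None)  # log difference is undefined for zero counts
    return out

# Hypothetical two weeks of daily counts with a strong day-of-the-week pattern;
# every count in the second week is double the same weekday's count a week earlier.
cases = [10, 20, 30, 40, 50, 60, 70, 20, 40, 60, 80, 100, 120, 140]
growth = log_diff(cases)  # each entry is approximately ln(2)
```

Because each date is compared with the same day of the previous week, a multiplicative day-of-the-week pattern in reporting cancels out in this example.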
Sample period of the dependent variable for the estimation of the regression model
For the estimation of the regression model, the sample period of the dependent variable is set from March 1, 2020, to February 1, 2021. Observations before this period are dropped because the number of new confirmed cases was not continuously positive at the beginning of the pandemic.

The end of the sample period is also limited because of the spread of mutant strains in 2021 in Japan. The first cases of mutant strains in the country were confirmed on December 25, 2020 [43]. As of February 19, 2021, 173 cases of mutant strains had been confirmed, including 43 cases found in airport quarantine [44]. Thus, a wide spread of mutant strains was not observed in Japan before the end of January 2021. By using explanatory variables only up to the end of January 2021, this study aims to avoid a possible structural break in the regression due to the spread of mutant strains in 2021. Because of a lag between explanatory variables and the dependent variable in the regression, the end of the sample period of the dependent variable is set to February 1, 2021.
Classification of household expenditures among explanatory variables
For explanatory variables, this study uses average household expenditures per household on each date among households with two or more persons. The data source is the Family Income and Expenditure Survey published by the Ministry of Internal Affairs and Communications of the Japanese government [45]. There are no data for single-person households at daily frequency in this survey. Daily data for each month in this survey are published in the month after next. Thus, household expenditure data are available up to December 31, 2021, as of February 2022.

To construct explanatory variables, household expenditures are classified into eight items: meals at bars and restaurants; soft drinks, confectioneries, and fruits at bars and restaurants; alcoholic drinks at bars and restaurants; non-packaged lodging; domestic travel packages; admissions, viewing, and game fees; clothing and footwear; and the other household consumption expenditures. Admissions, viewing, and game fees are the sum of admission fees for theaters, museums, stadiums, sports facilities, and theme parks. (See S1 Table for the original Japanese name of each category of household expenditures in the dataset.) The first six items were the subjects of government interventions during the four states of emergency in Japan before December 2021: from April 7, 2020, to May 25, 2020; from January 7, 2021, to March 21, 2021; from April 25, 2021, to June 20, 2021; and from July 12, 2021, to September 30, 2021 [46-49]. More specifically, the government shortened the opening hours of bars and restaurants in populated prefectures, such as Tokyo, in each state of emergency, and prohibited the sales of alcoholic drinks at bars and restaurants entirely in those prefectures during the third and fourth states of emergency. Also, the government closed large-scale indoor facilities or reduced the size of the audience per event during each state of emergency. The admission fees for these facilities and events are included in admissions, viewing, and game fees. On tourism, the government called for self-restraint on traveling during each state of emergency. However, the government also subsidized the expenses of accommodation and transportation in domestic travel packages from July 22, 2020, to December 27, 2020, in order to make up for reduced revenue in the tourism industry. This nationwide subsidy program was called the “Go-To-Travel” campaign [28]. Given these observations, this study includes the first six items listed above in the explanatory variables, in order to estimate associations between consumer activities subject to government intervention and the spread of COVID-19.

Household expenditures on clothing and footwear are separated from the other household consumption expenditures among the explanatory variables, because this category of household expenditures exhibited a high cross correlation with the dependent variable during the sample period (Table 1). The maximum correlation coefficient for clothing and footwear was almost as high as that for alcoholic drinks at bars and restaurants. Because the other large categories of household expenditures did not show as high a cross correlation with the dependent variable as clothing and footwear, household expenditures other than the first seven items listed above are summed into “the other household consumption expenditures” to form one explanatory variable.
Table 1
Cross-correlation coefficients between the log difference over seven days in the number of new confirmed cases and lagged values of each large category of household expenditures per household in Japan.
| Large category of household expenditures | Maximum correlation coefficient | Corresponding lag of household expenditures (days) |
| Food | 0.17 | 9 |
| Housing | 0.09 | 18 |
| Fuel, light, and water charges | 0.07 | 1 |
| Furniture and household utensils | 0.16 | 10 |
| Clothing and footwear | 0.36 | 9 |
| Medical care | 0.16 | 1 |
| Transportation and communication | 0.15 | 6 |
| Education | 0.25 | 4 |
| Culture and recreation | 0.22 | 9 |
| The other household consumption expenditures | 0.24 | 9 |
| Alcoholic drinks at bars and restaurants | 0.38 | 9 |
Notes: The table shows the maximum cross correlation coefficients between the log difference over seven days in the number of new confirmed cases and lagged values of each large category of nominal household expenditures per household up to 28-day lag in Japan. In the last row, the figures for alcoholic drinks at bars and restaurants are included for comparison, even though it is not part of large categories. The sample period of the log difference over seven days in the number of new confirmed cases is from March 1, 2020, to February 1, 2021.
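The statistic reported in Table 1 can be computed along the following lines. This is a minimal sketch rather than the study's code: the function names are hypothetical, the Pearson correlation and lags starting from one day are assumptions, and the series are synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def max_lagged_correlation(y, x, max_lag=28):
    """Return (correlation, lag) maximizing corr(y_t, x_{t-lag}) over lag = 1..max_lag."""
    best = None
    for lag in range(1, max_lag + 1):
        r = pearson(y[lag:], x[:len(x) - lag])  # pair y_t with x_{t-lag}
        if best is None or r > best[0]:
            best = (r, lag)
    return best

# Synthetic example: y follows x with a three-day delay.
x = [math.sin(i * i) for i in range(60)]
y = [0.0] * 3 + x[:-3]
best_corr, best_lag = max_lagged_correlation(y, x)
```

In this synthetic example the search recovers the three-day delay built into `y`.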
Conversion of nominal household expenditures into “real” values in the construction of explanatory variables
As described above, there was a subsidy program for domestic tourism called the Go-To-Travel campaign from late July to late December 2020 in Japan. While this campaign subsidized accommodation costs for domestic travel, transportation costs were subsidized only if they were packaged with accommodation costs [50]. Therefore, households mainly increased their purchases of domestic travel packages during the campaign period to maximize the subsidies they received. See S1 Fig for the difference between nominal household expenditures (i.e., the amounts of money paid by households) on non-packaged lodging and on domestic travel packages during the campaign period.

To illustrate the subsidy effect of the Go-To-Travel campaign, Fig 1 shows the consumer price indices related to the components of household expenditures included in the explanatory variables. The consumer price index is the average price level for each category of consumer goods and services published by the Ministry of Internal Affairs and Communications of the Japanese government [51]. In the figure, the subsidy effect of the Go-To-Travel campaign is clearly visible as a drop in the consumer price index for lodging during the campaign period, even though this index includes both the prices of non-packaged lodging and the costs of accommodation in domestic travel packages. Thus, households could consume more travel services by buying domestic travel packages with the same amount of money during the campaign period.
Fig 1
Price levels of classified components of household expenditures in Japan.
This figure shows the monthly values of the consumer price index related to each component of household expenditures included in the explanatory variables. The sample period shown in the figure is from January 2020 to December 2021. The index for lodging includes both the prices of non-packaged lodging and the costs of accommodation in domestic travel packages. The index for food consumption at bars and restaurants in general covers the prices of three types of food consumption at bars and restaurants included in the explanatory variables. There are no separate consumer price indices corresponding to non-packaged lodging, domestic travel packages, and the three types of food consumption at bars and restaurants in the data source [51].
For a measure of consumer activities associated with the spread of COVID-19, the amount of packaged travel services consumed by households is more suitable than the nominal value of subsidized consumer spending on domestic travel packages. One way to remove the subsidy effect of the Go-To-Travel campaign and measure the amount of packaged travel services consumed by households is to divide nominal household expenditures on domestic travel packages by the consumer price index of the corresponding category. The resulting value is an artificial measure of quantity, as each nominal expenditure is the product of the quantity and the price of goods or services purchased. In the terminology of economics, this value is called a “real” value. For consistency, real values are used for all the explanatory variables on the classified components of household expenditures per household. See S1 Appendix for more details on how to construct the consumer price index for each explanatory variable. Fig 2 depicts the real values of household expenditures per household used for explanatory variables. For each explanatory variable, the unit of the real value is normalized to 100 yen at the average price in 2020; that is, one unit of real value equals the quantity of consumer goods and services that could be purchased with 100 yen in 2020.
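The conversion described above can be sketched as follows. The function name `real_value` and the numbers are hypothetical, and the example assumes that the price index is scaled so that the 2020 average equals 100; the actual index construction is described in S1 Appendix.

```python
def real_value(nominal_yen, cpi, base_cpi=100.0, unit_yen=100.0):
    """Convert nominal spending (yen) into units of 100 yen at base-year prices.

    Assumes `cpi` is indexed so that the base-year (2020) average is `base_cpi`.
    """
    price_ratio = cpi / base_cpi          # price level relative to the base year
    return nominal_yen / price_ratio / unit_yen

# Hypothetical example: a subsidy lowers the effective price index to 65, so
# 650 yen of nominal spending buys as much as 1,000 yen at base-year prices.
without_subsidy = real_value(1000.0, 100.0)
with_subsidy = real_value(650.0, 65.0)
```

In this example both purchases correspond to about 10 units of real value, even though the nominal amounts differ, which is the sense in which deflating by the price index removes the subsidy effect.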
Fig 2
“Real” values of classified components of household expenditures per household in Japan.
These series are computed by dividing each component of nominal household expenditures per household by the corresponding consumer price index on each date. They are used for explanatory variables in the regression model. The unit of each series is normalized to 100 yen at the average price of each component of household expenditures in 2020. The sample period shown in the figure is from January 1, 2020, to December 31, 2021. Each shadowed period indicates a state of emergency.
Fig 1 implies that the consumer price indices for the types of goods and services other than lodging did not change much during the sample period. Also, as described in S1 Appendix, the subsidy effect of the Go-To-Travel campaign on the consumer price index for lodging is removed when the index is used to compute the real values of household expenditures on non-packaged lodging. Thus, the conversion of nominal household expenditures into real values makes a substantial difference only for domestic travel packages.
Measure of mobility in public transportation
For the measure of mobility in public transportation among the explanatory variables, this study uses “transit_stations” for Japan in the COVID-19 Community Mobility Reports published by Google [52]. This variable measures a percentage change in mobility in public transportation on each date from the average in the benchmark period from January 3, 2020, to February 6, 2020. Fig 3 depicts the values of this variable.
Fig 3
Mobility in public transportation in Japan.
The figure shows transit_stations for Japan in the COVID-19 Community Mobility Reports [52]. The sample period shown in the figure is from February 15, 2020, to December 31, 2021. Each shadowed period indicates a state of emergency.
Nationwide dummy for absolute humidity
To construct the dummy variable for absolute humidity in Eq (6), D, a dummy for absolute humidity being no less than 9 g/m³ is computed for the capital of each prefecture in Japan on each date, using Celsius temperature and relative humidity published by the Japan Meteorological Agency [53]. (See S2 Appendix for the formula used to compute absolute humidity, and S3 Appendix for details on how missing values are filled in for some prefectures in the dataset.) These dummies are weighted by the population of each prefecture in 2019, and then summed across prefectures to compute the population-weighted nationwide average of the prefectural dummies on each date [54]. Fig 4 depicts the resulting series of the nationwide dummy for absolute humidity.
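The construction of the nationwide dummy can be sketched as follows. The humidity formula here is a common approximation (the Tetens formula for saturation vapor pressure combined with the ideal gas law) and is an assumption; the formula actually used in the study is given in S2 Appendix. The function names and the two prefectures are hypothetical.

```python
def absolute_humidity(temp_c, rel_humidity_pct):
    """Approximate absolute humidity (g/m^3) from temperature (C) and relative humidity (%).

    Assumed formula: Tetens saturation vapor pressure (hPa) and the ideal gas law.
    """
    saturation_hpa = 6.1078 * 10 ** (7.5 * temp_c / (temp_c + 237.3))
    vapor_hpa = saturation_hpa * rel_humidity_pct / 100.0
    return 217.0 * vapor_hpa / (temp_c + 273.15)

def nationwide_dummy(prefectures, threshold=9.0):
    """Population-weighted average of prefectural dummies for humidity >= threshold."""
    total_pop = sum(p["population"] for p in prefectures)
    weighted = sum(
        p["population"]
        * (1 if absolute_humidity(p["temp_c"], p["rh_pct"]) >= threshold else 0)
        for p in prefectures
    )
    return weighted / total_pop

# Hypothetical prefectures on one date: A is humid (dummy 1), B is drier (dummy 0).
prefectures = [
    {"name": "A", "population": 3.0, "temp_c": 25.0, "rh_pct": 60.0},
    {"name": "B", "population": 1.0, "temp_c": 20.0, "rh_pct": 50.0},
]
share = nationwide_dummy(prefectures)
```

The resulting series lies between zero and one, which is why the nationwide "dummy" can take fractional values on dates when only some prefectures exceed the threshold.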
Fig 4
Nationwide dummy for absolute humidity in Japan.
This series is the population-weighted nationwide average of prefectural dummies for absolute humidity being no less than 9 g/m³ at the capital of each prefecture on each date. The sample period shown in the figure is from January 1, 2020, to December 31, 2021.
The threshold level of absolute humidity for prefectural dummies, 9 g/m³, is adopted from a study that found that the risk ratio of new infections of COVID-19 had a non-linear relationship with absolute humidity in England, peaking around 6–8 g/m³ [38]. Prefectural dummies for absolute humidity approximate this non-linear relationship at the capital of each prefecture by a step function.
Time dummies
In Eq (6), the time dummy for the year-end and new-year holiday period, D, is set to unity for the period from December 29, 2020, to January 3, 2021. Time dummies related to states of emergency, D for i = 1, 2, 3, are set to unity for each of the following periods separately: the period from the beginning of the sample period to April 6, 2020; the first state of emergency from April 7, 2020, to May 25, 2020; and the second state of emergency from January 7, 2021, to March 21, 2021. For the estimation of the regression model, the time dummy for the second state of emergency is defined only up to the end of the sample period of explanatory variables, i.e., January 31, 2021.
Sample distribution of incubation periods
For the probability distribution of incubation periods in Eq (8), I use the sample distribution of incubation periods in Japan reported by [6]. Fig 5 depicts the sample distribution.
Fig 5
Sample distribution of incubation periods in Japan reported by [6].
Sensitivity analysis
Associations between weather conditions and COVID-19 incidence and transmission have been examined across countries [55]. For example, a global study reported that approximately 85% of the COVID-19 cases reported before May 1, 2020, occurred in regions with outside temperature between 3 and 17°C and absolute humidity between 1 and 9g/m3, the latter of which is consistent with the construction of the nationwide dummy for absolute humidity described above [56]. For sensitivity analysis, the regression model is estimated with not only the baseline set of explanatory variables described above, but also four alternative sets, in each of which the nationwide dummy for absolute humidity is either removed from the set of explanatory variables or substituted by one of the following alternative weather variables: a nationwide dummy for outside temperature being no less than 18°C; the nationwide average of outside temperature; and the nationwide average of absolute humidity. All of these nationwide dummies and averages are population-weighted across prefectures in Japan in the same way as the nationwide dummy for absolute humidity. See S4 Appendix for more details on the construction of alternative weather variables.
In addition, among the classified components of household expenditures included in the explanatory variables, household expenditures on clothing and footwear are purchases of merchandise goods. The other components of household expenditures are purchases of offline services, except “the other household consumption expenditures”, which represents the residual household expenditures. Because the infectiousness of consumer activities can differ between online and offline merchandise shopping, the regression model is also estimated with an alternative set of explanatory variables in which household expenditures on clothing and footwear only include offline purchases on each date. This variable is computed by multiplying the real value of household expenditures on clothing and footwear per household on each date by the monthly offline share of the expenditures in the same month. The monthly offline share can be computed from data available from the Survey of Household Economy published by the Ministry of Internal Affairs and Communications of the Japanese government [57]. See S5 Appendix for more details on the construction of offline household expenditures on clothing and footwear.
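The scaling of daily expenditures by the monthly offline share can be illustrated with a short sketch. The function name `offline_expenditure`, the data layout, and all numbers below are hypothetical and serve only to show the arithmetic; see S5 Appendix for the actual construction.

```python
from datetime import date


def offline_expenditure(daily_real_expenditure, monthly_offline_share):
    """Scale each day's real expenditure by the offline share of its month.

    daily_real_expenditure: dict date -> real expenditure per household
    monthly_offline_share:  dict (year, month) -> share in [0, 1]
    """
    return {
        d: x * monthly_offline_share[(d.year, d.month)]
        for d, x in daily_real_expenditure.items()
    }


# Hypothetical numbers for illustration only (not survey data).
daily = {date(2020, 4, 30): 200.0, date(2020, 5, 1): 300.0}
share = {(2020, 4): 0.80, (2020, 5): 0.75}
offline = offline_expenditure(daily, share)
```

Every date within a month is scaled by the same share, so any within-month variation in the online share is not captured by this variable.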
Results
Estimated regression coefficients
Table 2 shows the estimated regression coefficients of classified components of household expenditures and mobility in public transportation, except the coefficients of cross terms with time dummies related to states of emergency. See S2 Table for the estimation results for all the parameters in the regression model.
Table 2
Estimated regression coefficients of classified components of household expenditures and mobility in public transportation in Japan.
| Explanatory variable | Baseline effect: posterior mean | Baseline effect: 2.5th percentile | Baseline effect: 97.5th percentile | Cross effect of absolute humidity: posterior mean | Cross effect of absolute humidity: 2.5th percentile | Cross effect of absolute humidity: 97.5th percentile |
|---|---|---|---|---|---|---|
| Meals at bars and restaurants | 0.060 | 0.006 | 0.167 | -0.018 | -0.068 | -0.000 |
| Soft drinks, confectioneries, and fruits at bars and restaurants | 1.038 | 0.126 | 2.914 | -0.394 | -1.432 | -0.009 |
| Alcoholic drinks at bars and restaurants | 1.339 | 0.172 | 3.120 | -0.231 | -0.844 | -0.006 |
| Non-packaged lodging | 0.356 | 0.047 | 0.912 | -0.135 | -0.481 | -0.003 |
| Domestic travel packages | 0.221 | 0.029 | 0.608 | -0.100 | -0.361 | -0.002 |
| Admissions, viewing, and game fees | 0.102 | 0.012 | 0.261 | -0.022 | -0.078 | -0.001 |
| Clothing and footwear | 0.266 | 0.029 | 0.768 | -0.120 | -0.468 | -0.003 |
| The other household consumption expenditures | 0.006 | 0.001 | 0.017 | -0.002 | -0.005 | -0.000 |
| Mobility in public transportation | 0.019 | 0.004 | 0.042 | -0.012 | -0.029 | -0.001 |
Notes: The dependent variable is the log difference over seven days in the number of new confirmed cases of COVID-19 in Japan. The sample period of the dependent variable is from March 1, 2020, to February 1, 2021. Each row corresponds to the explanatory variable in the first column. In the rest of columns, “Baseline effect” indicates the linear coefficient of the explanatory variable, γ, whereas “Cross effect of absolute humidity” indicates the coefficient of the cross term between the explanatory variable and the nationwide dummy for absolute humidity, θ, in Eq (6).
The linear coefficient of each explanatory variable (i.e., “Baseline effect” in Table 2) estimates the degree of association between the explanatory variable and the rate of change in the number of new confirmed cases of COVID-19 when absolute humidity is less than 9g/m3 at the capitals of all prefectures (i.e., the nationwide dummy for absolute humidity is zero), whereas the cross effect of absolute humidity measures how much the degree of this association changes if absolute humidity becomes equal to, or greater than, 9g/m3 at the capitals of all prefectures (i.e., the nationwide dummy for absolute humidity becomes one). For each component of household expenditures, the regression coefficient is measured per the real value of spending per household. Thus, the posterior means of regression coefficients in the table indicate that alcoholic drinks at bars and restaurants had the highest association with the spread of COVID-19 per the real value of spending per household, regardless of the degree of absolute humidity. Soft drinks, confectioneries, and fruits at bars and restaurants had the second highest association.
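The ranking across the two humidity regimes can be checked directly from the posterior means in Table 2. The sketch below is illustrative and rests on an assumption: the coefficient on an expenditure component for a given value of the nationwide dummy is taken to be γ + θ × dummy, which treats the dummy as constant over the lag window. By linearity of expectation, the posterior mean of γ + θ equals the sum of the two posterior means. The helper names `effect` and `ranking` are introduced here for illustration.

```python
# Posterior means from Table 2: (baseline effect gamma, cross effect theta).
table2 = {
    "Meals at bars and restaurants": (0.060, -0.018),
    "Soft drinks, confectioneries, and fruits at bars and restaurants": (1.038, -0.394),
    "Alcoholic drinks at bars and restaurants": (1.339, -0.231),
    "Non-packaged lodging": (0.356, -0.135),
    "Domestic travel packages": (0.221, -0.100),
    "Admissions, viewing, and game fees": (0.102, -0.022),
    "Clothing and footwear": (0.266, -0.120),
    "The other household consumption expenditures": (0.006, -0.002),
}


def effect(gamma, theta, dummy):
    """Coefficient on the explanatory variable for a dummy value in [0, 1]."""
    return gamma + theta * dummy


def ranking(dummy):
    """Expenditure components sorted by effect, largest first."""
    return sorted(table2, key=lambda k: effect(*table2[k], dummy), reverse=True)


low_humidity = ranking(0.0)   # nationwide dummy equals zero
high_humidity = ranking(1.0)  # nationwide dummy equals one
```

Mobility in public transportation is left out of the comparison because it is not measured in units of household spending.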
Out-of-sample forecast performance of the regression
Fig 6 plots the observed values of the dependent variable, the fitted values of the regression up to February 1, 2021, and the out-of-sample forecasts of the regression from February 2, 2021, onward. The out-of-sample forecasts are computed by inserting into the regression the values of explanatory variables during the out-of-sample forecast period, given each MCMC sample of parameter values, except the time dummy for the second state of emergency. For the out-of-sample forecast period, the time dummy for the second state of emergency is set to zero, because otherwise the out-of-sample forecasts would overpredict the observed values of the dependent variable significantly (see S2 Fig). As will be described below, this is likely due to the overfitting of regression coefficients for the time dummies related to states of emergency. Given this set-up, the figure exhibits that the observed values of the dependent variable traced the out-of-sample forecasts by the posterior means of regression coefficients closely until mid-June 2021. R2 for the out-of-sample forecasts from February 2, 2021, to June 30, 2021, is 0.60.
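The forecast procedure described above can be sketched as follows, under stated assumptions. The data are synthetic, the function names are hypothetical, and the interval below reflects only the uncertainty in the regression coefficients across MCMC draws. The definition of R2 used here, one minus the ratio of the sum of squared forecast errors to the total sum of squares around the observed mean, is one common convention and is assumed for illustration.

```python
import numpy as np


def forecast_summary(X_future, draws):
    """Summarise forecasts of a linear regression over MCMC coefficient draws.

    X_future: (T, K) explanatory variables in the forecast period
    draws:    (S, K) MCMC samples of the K regression coefficients
    Returns the posterior mean and the 2.5th/97.5th percentiles per date.
    """
    paths = draws @ X_future.T                      # (S, T): one path per draw
    mean = paths.mean(axis=0)
    lower, upper = np.percentile(paths, [2.5, 97.5], axis=0)
    return mean, lower, upper


def r_squared(observed, predicted):
    """One common definition: 1 - SSE / SST, with SST around the observed mean."""
    sse = np.sum((observed - predicted) ** 2)
    sst = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - sse / sst


# Synthetic data for illustration only (not the data used in this study).
rng = np.random.default_rng(0)
T, K, S = 150, 3, 2000
X_future = rng.normal(size=(T, K))
beta_true = np.array([0.5, -0.2, 0.1])
draws = beta_true + 0.05 * rng.normal(size=(S, K))
observed = X_future @ beta_true + 0.3 * rng.normal(size=T)

mean, lower, upper = forecast_summary(X_future, draws)
r2 = r_squared(observed, mean)
```

Because the model is linear in its coefficients, the mean of the forecast paths across draws equals the forecast obtained by inserting the posterior means of the coefficients. Setting a time dummy to zero in the forecast period corresponds to zeroing the relevant columns of `X_future`.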
Fig 6
Fitted values and out-of-sample forecasts of the regression.
The dependent variable is the log difference over seven days in the number of new confirmed cases of COVID-19 in Japan. For out-of-sample forecasts, the time dummy for the second state of emergency is set to zero without changing the posterior means of regression coefficients. The sample period shown in the figure is from March 1, 2020, to January 1, 2022. For the fitted values and the out-of-sample forecasts, the solid line is the posterior mean and the shadowed area indicates the 95% credible interval on each date. Each shadowed period indicates a state of emergency.
On the other hand, the fitted values continuously deviate from the observed values of the dependent variable for the summer of 2020, when there was a surge in the number of new confirmed cases, the so-called “second wave” in Japan, following the first surge in the spring of 2020.
Decomposition of out-of-sample forecasts of the regression
Given the good fit of out-of-sample forecasts by the posterior means of regression coefficients up to mid-June 2021, Fig 7 shows the contribution of each explanatory variable to the dependent variable in the regression, which is measured by the product of the explanatory variable and the posterior mean of the corresponding regression coefficient during the out-of-sample forecast period.
Fig 7
Decomposition of out-of-sample forecasts of the regression: Level.
Each panel shows the product of an explanatory variable and the posterior mean of the corresponding regression coefficient. For out-of-sample forecasts, the time dummy for the second state of emergency is set to zero without changing the posterior means of regression coefficients. The sample period shown in the figure is from February 2, 2021, to January 1, 2022. For household expenditures and mobility in public transportation, “W/o cross effect of absolute humidity” indicates the posterior mean of γF(X) in Eq (7), whereas “Total contribution” indicates the posterior mean of γF(X) + θF(DX) in Eq (7) on each date. Each shadowed period indicates a state of emergency.
Fig 7 plots each explanatory variable’s contribution to the dependent variable without the cross effect of absolute humidity, as well as the total contribution including the cross effect of absolute humidity. The figure demonstrates that a higher degree of absolute humidity reduced the absolute size of each explanatory variable’s contribution to the dependent variable, as it weakened the association between each explanatory variable and the spread of COVID-19. At the same time, the figure also indicates that the main cause of fluctuations in each explanatory variable’s contribution to the dependent variable was not changes in absolute humidity, but idiosyncratic fluctuations in the explanatory variable. This observation is evidence against the possibility that the good fit of out-of-sample forecasts results from reverse causality, in which the dependent variable acts as a confounder that causes concerted fluctuations in the explanatory variables.
Fig 8 plots each explanatory variable’s contribution to the dependent variable on each date in the form of the difference from the beginning of the out-of-sample forecast period. The figure demonstrates the contribution of each explanatory variable to changes in the dependent variable. For the period before mid-June 2021, the figure indicates that the increase in the number of new confirmed cases around the end of March 2021, which is often called the “fourth wave” in Japan following the three surges in the number of new confirmed cases since the onset of the pandemic, was mainly due to increases in alcoholic drinks at bars and restaurants, non-packaged lodging, clothing and footwear, and mobility in public transportation.
Fig 8
Decomposition of out-of-sample forecasts of the regression: Differences from a benchmark date.
Each panel plots the differences in the product of an explanatory variable and the posterior mean of the corresponding regression coefficient from February 2, 2021. For out-of-sample forecasts, the time dummy for the second state of emergency is set to zero without changing the posterior means of regression coefficients. The sample period shown in the figure is from February 2, 2021, to January 1, 2022. For household expenditures and mobility in public transportation, “Total contribution” indicates differences in the posterior mean of γF(X) + θF(DX) in Eq (7) on each date from February 2, 2021. Each shadowed period indicates a state of emergency.
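The decompositions in Figs 7 and 8 can be illustrated with a minimal sketch. The arrays `FX` and `FDX` stand in for F(X) and F(DX) in Eq (7) and are hypothetical; in particular, `FDX` is formed here as the dummy times `FX`, which is a simplifying assumption and not the definition of F in Eq (7). The coefficient values are the posterior means for alcoholic drinks at bars and restaurants and for clothing and footwear from Table 2.

```python
import numpy as np


def decompose(FX, FDX, gamma, theta):
    """Contributions of K explanatory variables over T dates.

    FX, FDX:      (T, K) arrays standing in for F(X) and F(DX) in Eq (7)
    gamma, theta: (K,) posterior means of the baseline and cross coefficients
    """
    without_cross = FX * gamma                 # gamma * F(X)
    total = without_cross + FDX * theta        # gamma * F(X) + theta * F(DX)
    diff_from_start = total - total[0]         # change relative to the first date
    return without_cross, total, diff_from_start


# Hypothetical inputs: two variables observed on three dates.
FX = np.array([[1.0, 2.0], [1.5, 2.0], [2.0, 1.0]])
D = np.array([[0.0], [0.5], [1.0]])            # nationwide humidity dummy by date
FDX = D * FX                                   # simplification, see the lead-in
gamma = np.array([1.339, 0.266])               # Table 2 posterior means
theta = np.array([-0.231, -0.120])

without_cross, total, diff = decompose(FX, FDX, gamma, theta)
```

With positive baseline coefficients and negative cross coefficients, the total contribution is never larger in absolute size than the contribution without the cross effect, which matches the pattern described for Fig 7.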
Decomposition of fitted values of the regression
If the fitted values of the regression are decomposed in the same way as in Fig 7, it can be shown that the cross effects of time dummies related to states of emergency are so large that the other components of contributions of explanatory variables to the dependent variable are almost invisible. Yet, the fitted values of the regression do not change much even if all the time dummies related to states of emergency are set to zero, with the posterior means of regression coefficients held fixed. Thus, the large cross effects of time dummies related to states of emergency offset each other, which is likely due to overfitting. See S3 and S4 Figs for this result.
Given this observation, Fig 9 shows the decomposition of fitted values of the regression when the time dummies related to states of emergency are set to zero without changing the posterior means of regression coefficients. For this case, Fig 10 also plots each explanatory variable’s contribution to the dependent variable in the form of the differences from February 1, 2021, i.e., the end of the sample period for the estimation of the regression model. Fig 10 indicates that fluctuations in alcoholic drinks at bars and restaurants, clothing and footwear, and mobility in public transportation made notably large contributions to changes in the dependent variable before February 2021.
Fig 9
Decomposition of fitted values of the regression: Level.
Each panel shows the product of an explanatory variable and the posterior mean of the corresponding regression coefficient, when time dummies for the period before the first state of emergency, the first state of emergency, and the second state of emergency are set to zero without changing the posterior means of regression coefficients. The sample period shown in the figure is from March 1, 2020, to February 1, 2021. For household expenditures and mobility in public transportation, “W/o cross effect of absolute humidity” indicates the posterior mean of γF(X) in Eq (7), whereas “Total contribution” indicates the posterior mean of γF(X) + θF(DX) in Eq (7) on each date. Each shadowed period indicates a state of emergency.
Fig 10
Decomposition of fitted values of the regression: Differences from a benchmark date.
Each panel shows the differences in the product of an explanatory variable and the posterior mean of the corresponding regression coefficient from February 1, 2021, when time dummies for the period before the first state of emergency, the first state of emergency, and the second state of emergency are set to zero without changing the posterior means of regression coefficients. The sample period shown in the figure is from March 1, 2020, to February 1, 2021. For household expenditures and mobility in public transportation, “Total contribution” indicates differences in the posterior mean of γF(X) + θF(DX) in Eq (7) on each date from February 1, 2021. Each shadowed period indicates a state of emergency.
For sensitivity analysis on the effect of weather conditions on COVID-19 incidence and transmission, the regression model is estimated with four alternative sets of explanatory variables, in each of which the nationwide dummy for absolute humidity is either removed from the set of explanatory variables or substituted by one of the following weather variables: a nationwide dummy for outside temperature being no less than 18°C; the nationwide average of outside temperature; and the nationwide average of absolute humidity. See S4 Appendix for the details of the estimation results.
Table 3 shows the value of R2 for the out-of-sample forecasts of each regression by the posterior means of regression coefficients from February 2, 2021, to June 30, 2021. The table implies that each regression shows a good fit of out-of-sample forecasts, including the one without any weather variable among the explanatory variables. Even though the benchmark regression with the nationwide dummy for absolute humidity shows the highest value of R2 among the five regressions listed in the table, there is only a slim difference between the benchmark regression and the regression without a weather variable. Thus, using out-of-sample forecast performance measured by R2 for model evaluation, this study cannot be conclusive on the importance of weather conditions for COVID-19 incidence and transmission.
Table 3
R2 for the out-of-sample forecasts of regressions with alternative weather variables for explanatory variables.
| Type of weather variable included in the explanatory variables | Nationwide dummy for absolute humidity | Nationwide dummy for outside temperature | No weather variable | Nationwide average of absolute humidity | Nationwide average of outside temperature |
|---|---|---|---|---|---|
| R2 for the out-of-sample forecasts of the regression from February 2, 2021, to June 30, 2021 | 0.60 | 0.48 | 0.57 | 0.31 | 0.43 |
Notes: The first row lists the type of variable included as an explanatory variable in the regression model to represent the effect of weather conditions on the dependent variable. “No weather variable” in the fourth column means that no weather variable is included among the explanatory variables. The second row shows the value of R2 for the out-of-sample forecasts of each regression by the posterior means of regression coefficients from February 2, 2021, to June 30, 2021. See S4 Appendix for more details.
It is also shown in S4 Appendix that in each regression listed in Table 3, alcoholic drinks at bars and restaurants had the highest association with the dependent variable of the regression per the real value of spending per household among the classified components of household expenditures included in the explanatory variables, whereas soft drinks, confectioneries, and fruits at bars and restaurants had the second highest association. Also, the results of the regressions with the three highest values of R2 shown in Table 3, i.e., those with nationwide dummies for absolute humidity and outside temperature, and no weather variable among the explanatory variables, indicate that the assumption on a weather variable in the explanatory variables mostly affects the estimates of the regression coefficients of mobility in public transportation, leaving unchanged the implication of the regression model for the contribution of each classified component of household expenditures to the spread of COVID-19.
In addition, the regression model is also estimated with another alternative set of explanatory variables in which household expenditures on clothing and footwear only include offline purchases. See S5 Appendix for the details of the estimation result. The value of R2 for the out-of-sample forecasts of the regression by the posterior means of regression coefficients from February 2, 2021, to June 30, 2021, is 0.59, which is almost the same as the value for the benchmark regression described above, 0.60. Also, the decomposition of fitted values and out-of-sample forecasts of the regression is similar to that of the benchmark regression reported in Figs 7–10.
A possible reason for this result is the co-movement of online and offline household expenditures on clothing and footwear during the sample period. As shown in S5 Appendix, the monthly online share of household expenditures on clothing and footwear had been stable after a permanent rise in early 2020. If this observation held at daily frequency as well, then the average effect of total (i.e., the sum of online and offline) household expenditures on clothing and footwear would be proportional to the effect of the offline household expenditures on each date. This interpretation is consistent with the good fit of out-of-sample forecasts of the benchmark regression reported above, because if the ratio between online and offline household expenditures on clothing and footwear changed significantly from date to date, then it would lower the fit of out-of-sample forecasts of the benchmark regression by making the true average effect of total household expenditures on clothing and footwear significantly time-varying. Nonetheless, obtaining daily data on the online share of household expenditures on clothing and footwear is necessary to confirm the legitimacy of this interpretation. This issue remains for future research.
Preliminary regression of out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population
As shown in Fig 6, the observed values of the dependent variable overshot the out-of-sample forecasts of the regression from mid-June to August 2021. Then, the out-of-sample forecasts overpredicted the dependent variable from September 2021 onward. Even though the regression model does not have any information about the determinants of out-of-sample forecast errors, the beginning of the overshooting coincided with the spread of the Delta variant in Japan since June 2021 (see Fig 11). Because there was also steady progress in vaccinations in Japan from June 2021, vaccinations might have gradually mitigated the effect of the Delta variant, reducing the magnitude of overshooting in August 2021 (see Fig 12).
Fig 11
Delta-variant share of new confirmed cases of COVID-19 in Japan.
Source: The Ministry of Health, Labour and Welfare, the Government of Japan; and the Tokyo Metropolitan Government [58, 59]. Notes: The Ministry of Health, Labour and Welfare and the Tokyo Metropolitan Government stopped reporting the data after the weeks of Sept. 27, 2021, and Nov. 1, 2021, respectively.
Fig 12
Twice-vaccinated share of the population in Japan.
Source: The Ministry of Health, Labour and Welfare, the Government of Japan [60].
For preliminary analysis, the out-of-sample forecast errors of the regression on each date are regressed on the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population seven days earlier. The lag length of the explanatory variables in this regression is chosen because it corresponds to the mid-point of the range of incubation periods [41]. The Delta-variant share of new confirmed cases in Tokyo is included in the explanatory variables as a proxy for the nationwide share in this regression, because it is available for a longer sample period than the reported nationwide share, as shown in Fig 11. In this regression, the weekly value of this variable is used on each date within the same week, because no daily data are available. The sample period for the estimation of this regression is terminated at the end of October 2021, because the November data for the Delta-variant share of new confirmed cases in Tokyo are based on only five cases due to a decline in the number of new confirmed cases in that month. The estimation method is ordinary least squares.
Table 4 shows the estimation result. There is no constant term among the explanatory variables in the regression, so that the fitted value of the regression equals zero (i.e., the regression does not explain any component of out-of-sample forecast errors) if the two explanatory variables equal zero. A relatively high value of R2 shown in Table 4 implies that the out-of-sample forecast errors can be fitted to a linear combination of the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population. Fig 13 plots the fitted values of the regression of out-of-sample forecast errors, along with the values of out-of-sample forecast errors. The figure implies that, if the out-of-sample forecast errors were associated with the spread of the Delta variant and the progress of vaccinations, then there was an exception from mid-June to July 2021. A caveat is that this implication of the regression is tentative, as the estimation result shown in Table 4 is just an in-sample fit, yet to be validated by out-of-sample forecast performance.
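A regression of this form, with two regressors lagged by seven days and no constant term, can be sketched as below. The data are synthetic and the function name `fit_no_constant` is hypothetical. The coefficient values reported in Table 4 are used only as the ground truth for generating the synthetic series. R2 is computed here in its uncentered form, which is one convention for regressions without a constant; this choice is an assumption made for illustration.

```python
import numpy as np


def fit_no_constant(errors, delta_share, vaccinated_share, lag=7):
    """OLS of forecast errors on two lagged regressors, without a constant.

    All inputs are daily series of equal length; shares are in percent.
    Returns the coefficients and the uncentered R-squared.
    """
    y = errors[lag:]                                   # errors on date t
    X = np.column_stack([delta_share[:-lag],           # regressors on date t - lag
                         vaccinated_share[:-lag]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    r2 = 1.0 - (resid @ resid) / (y @ y)
    return coef, r2


# Synthetic daily series for illustration only (not actual data).
rng = np.random.default_rng(1)
n = 280
delta = np.clip(np.linspace(-50, 150, n), 0, 100)      # share rising from 0% to 100%
vacc = np.clip(np.linspace(-20, 80, n), 0, None)       # share rising from 0% to 80%
true_coef = np.array([0.014, -0.026])                  # ground truth for the simulation
errors = np.zeros(n)
errors[7:] = true_coef[0] * delta[:-7] + true_coef[1] * vacc[:-7]
errors[7:] += 0.05 * rng.normal(size=n - 7)

coef, r2 = fit_no_constant(errors, delta, vacc)
```

Omitting the constant forces the fitted value to zero whenever both shares are zero, which is the property described in the text.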
Table 4
Regression of out-of-sample forecast errors of the regression model reported in Table 2.
| Explanatory variable | Coefficient estimate | Standard error |
|---|---|---|
| Delta-variant share of new confirmed cases (%, 7 days lag) | 0.014 | 0.001 |
| Twice-vaccinated share of the population (%, 7 days lag) | -0.026 | 0.001 |
Number of observations: 272. R2: 0.44
Notes: The dependent variable is the out-of-sample forecast errors generated by the regression model with the posterior means of regression coefficients reported in S2 Table. The lag of each explanatory variable is seven days. There is no constant term in the regression. The sample period for the dependent variable is from February 2, 2021, to October 31, 2021. The estimation method is ordinary least squares.
Fig 13
Fitted values of the regression of out-of-sample forecast errors.
Notes: “Out-of-sample forecast errors” are out-of-sample forecast errors generated by the regression model with the posterior means of regression coefficients reported in S2 Table. “Fitted values” are the fitted values of the regression of the out-of-sample forecast errors reported in Table 4.
Discussion
Principal findings
By estimating a time-series regression model, this study has found that there had been stable associations between classified components of household expenditures and the log difference over seven days in the number of new confirmed cases of COVID-19 in Japan before June 2021, except for a low in-sample fit of the regression during the summer of 2020. These associations can be validated using out-of-sample regression forecasts for February and June 2021. The estimated regression coefficients indicate that alcoholic drinks at bars and restaurants had the highest association with the rate of change in the number of new confirmed cases per the real value of spending per household among the classified components of household expenditures in the explanatory variables. The second-highest association was soft drinks, confectioneries, and fruits at bars and restaurants. The decomposition of fitted values and out-of-sample forecasts of the regression implies that household expenditures on alcoholic drinks at bars and restaurants, clothing and footwear, and mobility in public transportation made substantial contributions to fluctuations in the rate of change in the number of new confirmed cases in Japan between March 2020, that is, the beginning of the sample period, and June 2021.The dependent variable of the regression, which is the log difference over seven days in the number of new confirmed cases, overshot the out-of-sample forecasts of the regression from mid-June to August 2021. The out-of-sample forecasts overpredicted the dependent variable for the rest of 2021. 
The regression of the out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population implies that even though the out-of-sample forecast errors can be fitted to a linear combination of the two explanatory variables up to mid-June 2021 and from August 2021 onward, significant positive forecast errors (i.e., the overshooting of the rate of change in the number of new confirmed cases) from mid-June to July 2021 are left unexplained.
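As an illustration of the kind of regression described above (forecast errors regressed by ordinary least squares, without a constant term, on two explanatory variables lagged by seven days), the following is a minimal sketch. The data are synthetic, and the function names `lag` and `ols_no_constant`, the variable names, and the coefficient values are assumptions made for this example only; they are not the author's code or estimates.

```python
import numpy as np

def lag(series, days):
    """Shift a daily series back by `days`; entries without a lagged value are NaN."""
    series = np.asarray(series, dtype=float)
    if days == 0:
        return series.copy()
    out = np.full(len(series), np.nan)
    out[days:] = series[:-days]
    return out

def ols_no_constant(y, X):
    """Least-squares coefficients of y = X @ b + e, with no intercept column in X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic daily series (illustrative only, not the study's data).
rng = np.random.default_rng(0)
n = 120
delta_share = np.linspace(0.0, 1.0, n)             # hypothetical variant share
vaccinated_share = np.linspace(0.0, 0.6, n) ** 2   # hypothetical vaccinated share

# Explanatory variables enter with a seven-day lag; drop the first seven rows,
# which have no lagged value.
X = np.column_stack([lag(delta_share, 7), lag(vaccinated_share, 7)])[7:]

# Hypothetical forecast errors generated from assumed coefficients plus noise.
forecast_error = 0.5 * X[:, 0] - 0.8 * X[:, 1] + rng.normal(0.0, 0.01, len(X))

coef = ols_no_constant(forecast_error, X)
fitted = X @ coef
print("estimated coefficients:", coef)
```

Comparing `fitted` with `forecast_error` date by date is one way to see which periods such a regression leaves unexplained.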
Contribution of the Go-To-Travel campaign to the spread of COVID-19 in Japan
There was controversy over the Go-To-Travel campaign in Japan between late July and late December 2020 regarding its contribution to the spread of COVID-19 in the country [61]. Fig 10 confirms a non-negligible contribution from an increase in household expenditures on domestic travel packages to an increase in the rate of change in the number of new confirmed cases during the campaign period. This result complements the cross-sectional study that confirmed an association between participation in the Go-To-Travel campaign and the incidence of symptoms indicative of COVID-19, using Internet survey data [28]. At the same time, Fig 10 also shows that the spread of COVID-19 during this period was not entirely due to the Go-To-Travel campaign, as there were contributions from other components of household expenditures as well.

A caveat is that some part of the increase in household expenditures on domestic travel packages during the campaign period might have happened even without the Go-To-Travel campaign. Therefore, to measure the campaign’s contribution to the spread of COVID-19 precisely, it is necessary to identify counterfactual demand for domestic travel packages when there is no implementation of a Go-To-Travel campaign. This issue needs to be addressed in future studies.
Limitations and strengths
The estimated regression coefficients of household expenditures, shown in Table 2, are measured per the real value of spending per household. Thus, they compare the economic value of each type of consumer activity to its contribution to the spread of COVID-19. These estimates complement studies that used mobility data to estimate the associations between people’s physical visits to retail services and the spread of COVID-19 [29-34]. In particular, [29] found that visits to full-service restaurants, fitness centers, cafes and snack bars, hotels and motels, and limited-service restaurants contributed more to the spread of COVID-19 than other types of retail services. The estimated regression coefficients shown in Table 2 have a similar implication for the rankings of infectiousness among the classified components of household expenditures. The present study also has a unique strength over studies using mobility data, as mobility data do not contain information about the amount of money spent on each retail service visited by mobile phone users. In contrast, the household expenditure data used in the present study include such information.

Furthermore, the estimate of infectiousness of each type of consumer activity per the real value of spending may be useful when simulating how much value of consumer demand for each type must be suppressed to keep the number of new confirmed cases from rising. 
Table 2 indicates that among the classified components of household expenditures included in the explanatory variables, it is the most cost-effective to restrict alcoholic drinks at bars and restaurants, followed by soft drinks, confectioneries, and fruits, because reducing the real value of spending on these retail services by one unit would decrease the rate of change in the number of new confirmed cases more than reducing the real value of spending on the other components of household expenditures by one unit.

A caveat is that the estimated regression coefficients may reflect not only the direct effect of each classified component of household expenditures, but also the indirect effects of associated consumer activities summed into “the other household consumption expenditures” among the explanatory variables. They may also include the effects of workplace activities for each consumer service. To disentangle the effects of each type of retail service and the effects of workplace activities precisely, mobility data are likely to have an advantage over household expenditure data. Therefore, the present study is not superior but complementary to studies using mobility data.

It is also important to note that households adjust the composition of their consumption endogenously if the government restricts consumer activities. Therefore, this adjustment must be considered when setting a scenario on household expenditures for the simulations of the spread of COVID-19 with government interventions in the future. However, this issue is beyond the scope of the present study and is left to future research.

The estimation results reported in this study can also be useful in simulating seasonal fluctuations in the spread of COVID-19. 
The decomposition of out-of-sample forecasts of the regression shown in Figs 7 and 8 demonstrates that idiosyncratic fluctuations in each component of household expenditures contributed to the rate of change in the number of new confirmed cases of COVID-19. These idiosyncratic fluctuations were not only due to government interventions in retail services during each state of emergency but also due to seasonal demand, such as farewell and welcome parties with colleagues and schoolmates around the end of March, which is the end of the fiscal year in Japan, and eating out and traveling in mid-May and mid-August, which are holiday periods called the Golden Week and the Obon, respectively, in Japan. Thus, the estimates of associations between the classified components of household expenditures and the spread of COVID-19 reported in this study may be useful for simulating seasonal fluctuations in the rate of change in the number of new confirmed cases on future dates. To confirm the seasonal patterns of household expenditures, see S5 Fig, which plots each classified component of household expenditures per household from 2019 to 2021.

However, a challenge for such a simulation is to add the net effect of mutant strains and vaccinations to the regression model. As demonstrated by the regression results reported in Table 4, one possible way to identify the net effect is to regress the out-of-sample forecast errors of the regression on each mutant strain’s share of new confirmed cases and the vaccinated share of the population for each number of vaccinations. This work would require consideration of the decay in the protective effect of vaccination and the presence of multiple mutant strains. Further investigation into this issue remains a challenge for future research.

Another limitation of this study is the use of nationwide data. The contribution of household expenditures to the spread of COVID-19 may differ across regions in Japan. 
This study uses nationwide data because government statistics on regional household expenditures report only monthly averages [45].

Therefore, the time-series regression model in this study cannot be applied to the rate of change in the number of new confirmed cases in each region because it would not be able to capture cross-regional effects such that an effective contact in one region caused an infection in another region through people’s travel. To capture such an effect, it is necessary to estimate the local infectiousness of infected individuals in each region, such as the regional time-varying reproduction number, and then evaluate the associations between fluctuations in the measure of local infectiousness of infected individuals and the components of household expenditures in the region. To the best of our knowledge, this type of regional analysis has been conducted only with mobility data in Japan [35, 36]. This issue is important because the local infectiousness of household expenditures is likely to depend on population density, which varies substantially across regions in Japan. In addition, even though the regression model in this study shows a similar out-of-sample forecast performance regardless of whether it includes a weather variable among the explanatory variables or whether it includes absolute humidity or outside temperature as a weather variable, exploiting regional differences in weather conditions may help to identify the effects of weather conditions on COVID-19 incidence and transmission more precisely than in the present study. Therefore, integrating household expenditure data into a regional analysis warrants further research.

In addition, a possible alternative to the present study is to estimate a regression model with structural breaks to capture the effects of the spread of mutant strains and the progress of vaccinations using all available sample periods. 
On the one hand, the present study has a strength over this alternative, because it can avoid the possibility of misspecification of the effects of mutant strains and vaccinations by estimating a regression model only for the sample period before the spread of mutant strains and the progress of vaccinations. Moreover, this approach allows using the remaining sample period as testing data for model evaluation. On the other hand, the present study cannot be precise regarding the dates of possible structural breaks. Thus, the two approaches are complementary to each other. It remains a challenge for future research to estimate a regression model with structural breaks on the associations between household expenditures and the spread of COVID-19.

Lastly, despite a good fit of out-of-sample forecasts of the regression model in the present study up to mid-June 2021, the dependent variable of the regression significantly exceeded the fitted values of the regression during the summer of 2020 (see Fig 6), as well as the out-of-sample forecasts of the regression from mid-June to July 2021, even after considering the possible roles of the spread of the Delta variant and the progress of vaccinations in causing out-of-sample forecast errors (see Fig 13). These results may be due to a temporary change in regression coefficients in summer for some seasonal reason or an additional independent factor, such as increased cross-border mobility during the Tokyo 2020 Olympic Games. Regarding the former possibility, regression coefficients will be time-varying if single-person households show significantly different seasonal consumer behavior than households with two or more persons. However, the present study does not include expenditures by single-person households among the explanatory variables due to the lack of daily data. Further investigation into these issues is warranted.
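The decomposition of fitted values discussed in this section can be sketched as follows: in a linear regression, each explanatory variable contributes its coefficient times its value, and the contributions sum to the fitted value. The variable names and all numbers below are hypothetical, and the sketch omits any constant term, lag structure, or other regressors that the estimated model may contain.

```python
import numpy as np

def decompose_fitted_values(X, coef, names):
    """Return each regressor's contribution (its column times its coefficient)."""
    contributions = np.asarray(X, dtype=float) * np.asarray(coef, dtype=float)
    return {name: contributions[:, j] for j, name in enumerate(names)}

# Hypothetical regressors (rows are days) and coefficients, for illustration only.
names = ["alcohol_at_bars", "clothing_footwear", "transit_mobility"]
X = np.array([
    [1.0, 2.0, 0.5],
    [1.2, 1.5, 0.7],
    [0.8, 2.5, 0.4],
])
coef = np.array([0.30, 0.05, 0.10])

parts = decompose_fitted_values(X, coef, names)
fitted = X @ coef

for name, series in parts.items():
    print(name, series)
print("sum of contributions equals fitted values:",
      np.allclose(sum(parts.values()), fitted))
```

Because the contributions are additive, a scenario for one component (for example, lower spending during a state of emergency) can be examined by changing only that column of `X`, subject to the caveat above that households adjust the composition of their consumption endogenously.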
Conclusions
By applying a time-series regression model to daily nationwide data in Japan, this study finds that the classified components of household expenditures and mobility in public transportation had stable associations with the log difference over seven days in the number of new confirmed cases of COVID-19 in Japan before June 2021, except for a low in-sample fit of the regression for the summer of 2020. These associations are validated by a good fit of the out-of-sample regression forecasts from February to June 2021.

The estimated regression coefficients measure the infectiousness of consumer activities associated with each classified component of household expenditures per the real value of spending per household. The decomposition of fitted values and out-of-sample forecasts of the regression indicates that the spread of COVID-19 is associated with idiosyncratic fluctuations in various types of consumer activities. If it is possible to adjust the estimated regression coefficients to real-time developments of mutant strains and vaccinations, then the adjusted coefficients can be potentially useful for simulating changes in the number of new confirmed cases due to household spending on retail services. Such simulations would help in designing cost-efficient government interventions into consumer activities. Given the seasonality of household spending, they would also help predict seasonal fluctuations in the number of new confirmed cases over the year.
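For readers who wish to reproduce the dependent variable and the forecast evaluation in outline, the sketch below computes the log difference over seven days in a daily case series and a conventional R2 between observed and predicted values. The function names are assumptions, the case counts are synthetic, and the R2 formula shown (one minus the ratio of the residual to the total sum of squares) is a common definition that may differ from the one used in this study.

```python
import numpy as np

def log_diff_7(cases):
    """Log difference over seven days: ln(cases[t]) - ln(cases[t-7])."""
    cases = np.asarray(cases, dtype=float)
    return np.log(cases[7:]) - np.log(cases[:-7])

def r_squared(actual, predicted):
    """One minus residual sum of squares over total sum of squares."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic daily case counts that double every seven days (illustrative only).
cases = 100.0 * 2.0 ** (np.arange(21) / 7.0)
growth = log_diff_7(cases)
print(growth[:3])  # each entry is close to ln 2
```

Days with zero reported cases would make the logarithm undefined, so a real series would need to be checked for zeros before applying `log_diff_7`.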
Japanese names of household expenditure variables in the family income and expenditure survey and the consumer price index.
(PDF)
Posterior mean and the credible interval of each parameter in the regression model.
(PDF)
How to construct the consumer price index for each explanatory variable on household expenditures.
(PDF)
Formula to compute absolute humidity.
(PDF)
How to fulfill missing values in Celsius temperature and relative humidity data published by the Japan Meteorological Agency.
(PDF)
Sensitivity analysis with alternative weather variables for explanatory variables.
(PDF)
Sensitivity analysis with offline household expenditures on clothing and footwear for an alternative explanatory variable.
(PDF)
Nominal values of classified components of household expenditures per household in Japan.
(PDF)
Fitted values and out-of-sample forecasts of the regression with time dummies related to states of emergency.
(PDF)
Decomposition of fitted values of the regression with time dummies related to states of emergency.
(PDF)
Fitted values of the regression when time dummies related to states of emergency are set to zero.
(PDF)
Comparison of the “real” values of classified components of household expenditures per household in Japan over 2019–2021.
(PDF)

18 Jan 2022
PONE-D-21-36475
Associations between components of household expenditures and the rate of change in the number of new confirmed cases of COVID-19 in Japan: time-series analysis
PLOS ONE
Dear Dr. Tomura,

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.

Please submit your revised manuscript by 18th Feb 2022. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.

If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.

We look forward to receiving your revised manuscript.

Kind regards,
Shinya Tsuzuki, MD, MSc
Academic Editor
PLOS ONE

Journal Requirements:

1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. 
The PLOS ONE style templates can be found at https://journals.plos.org/plosone/s/file?id=wjVg/PLOSOne_formatting_sample_main_body.pdf and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf

Additional Editor Comments:

I believe both reviewers assessed the manuscript appropriately and agree with the points they raised, then please respond each of them.

[Note: HTML markup is below. Please do not edit.]

Reviewers' comments:

Reviewer's Responses to Questions
Comments to the Author

1. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Yes
Reviewer #2: Partly

**********

2. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: Yes
Reviewer #2: Yes

**********

3. Have the authors made all data underlying the findings in their manuscript fully available?

The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g., participant privacy or use of data from a third party), those must be specified.

Reviewer #1: Yes
Reviewer #2: Yes

**********

4. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #1: Yes
Reviewer #2: Yes

**********

5. Review Comments to the Author

Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. 
(Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #1: Thank you for the opportunity to review this interesting paper that discuss household expenditures and the rate of change in the number of new confirmed cases of COVID-19 in Japan and author discuss potential application for this study to simulation to set policy for COVID-19 pandemic. The main contribution of this paper from my view is that it potentially delivers reference data to set COVID-19 or future pandemic economic policy by monitoring household expenditures data and trends for optimizing pandemic economic policy which was not available. Currently this paper could contain some room for improvement from some methodological point of view. Please refer potential suggestion below for your reference.

Major points;

1. In line 137 author states that “whereas their infectiousness is decreased in the absolute humidity”. From some published papers show associations of Epidemic growth of COVID-19 was not associated with latitude and temperature from UK or other countries, while other publications to some extent negate those associations of absolute humidity and infectiousness which shows this is still open to debate. Also another published study in 2020, for example reported influence of absolute humidity, temperature and population density on COVID-19 spread and decay durations by using multivariate analysis covering multi-prefecture in Japan. Therefore, from methodological standpoints, potentially reader of this paper may also would like to see the data with 1) assumption that infectiousness is not decreased in the absolute humidity and 2) additional cross effect of i) ambient temperature, and ii) population density assessed in addition to Table 2 which is the Estimated regression coefficients of classified components of household expenditures and mobility in public transportation in Japan.

2. 
In this study there seem to be no assessment of effect of vaccinated cohort population in the model, while Japan, it is estimated around 29.1% as of Sep 2021 is the elderly population and some of them are vaccinated based on priority of nationwide vaccination program in first half of 2021. The household expenditure of those eldery population seems NOT negligible portion of Japanese economy and once the eldery population received second vaccination their behavior on consumption may change. It maybe helpful to provide the plausible reason(s) for not considering the effect of vaccination program in the regression model to conduct this analysis from applied economic standpoint since there could be potential impact associated with vaccination.

3. Author concluded as “there had been stable association between components of household expenditure and the spread of COVID-19 in Japan before the Delta variant in June 2021.” It is also helpful to see the result of degree of potential impact of Delta variant on household expenditure in this study since author mentioned in the conclusion section as, “If it is possible to adjust these estimates for real-time developments of mutant strains and vaccinations, then the adjusted estimates will be useful to compare the costs and benefits of a government intervention in each type of retail service that may be implemented in the future.”

Minor points;

1. For reference 41, please consider providing Japanese MHLW reference beside CDC recommendation since this study handles Japan specific situation. Also other parts please confirm if there are Japan specific reference to use please modify.

2. From figure 1-4, please consider providing better resolution figures, currently some of the figures are not optimal resolution...

Reviewer #2: The author aims to evaluate the associations between components of household expenditures and the spread of COVID-19 in Japan. 
With this in mind, he estimates a regression model using the Bayesian method with a non-informative prior.

In light of the current situation, where the topic evaluated is highly relevant to reducing the omicron variant's spread, we again phase the decision to limit some economic activities. After two years of the pandemic, economies worldwide are phasing crises, and therefore, the governments are less willing to adopt measures that limit economic activities (e.g., closing restaurants, bars, and stores), which can worsen the severe effects of the crisis. However, many stakeholders believe that there is no option but to continue applying regulations limiting economic activities. Consequently, as the author states, it is indispensable to estimate the effect on the COVID transmission rate of implementing these regulations. First, it is critical to identify those economic activities whose limitations are the most cost-effective. Second, to estimate and be transparent about the impact of limiting the economic activity is necessary to be accountable to a population that has suffered the negative consequences of the disease and the economic crisis it has caused.

1. Line 83: Does the author means "asymptomatic" instead of "asymptotic"?

2. Line 106-107: Regarding the sentence "The underlying assumption for Eq (6) is that each person has more chances of physical contacts with other people if there are more consumer activities in the country"

a. What about the increase in online shopping? An adjustment of people's behavior could lead to increases in household expenditures that are not necessarily related to physical contact.

b. Is it possible to disentangle the increase in household expenditure through online services (e.g., ordering food)? If the answer is not, the author should mention the implications of a change in consumption behaviors during the pandemic (i.e., increase in online shopping) on the validity of his results and conclusions.

3. 
The period for the independent variable considered is from December 25th, 2020, to February 19th, 2021.

a. Was it not possible to include a structural break in the equation? At least for some of the coefficients? For example, including a break for those periods when the percentage of confirmed cases related to specific COVID variants surpasses a threshold (different thresholds can be tested).

b. Similarly, for vaccination, to include a structural break when the percentage of the population vaccinated surpasses a certain threshold.

c. Interestingly, the variation in the observed values appears to be higher during 2021 than in 2020 (Fig 6). Could this be explained by an adjustment of the families to the current situation, and therefore, a change in the consumption and expenditures patterns? If so, could it suggest that the regression coefficients vary from 2020 to 2021?

4. There are indications of the out-of-sample results of a change in the behavior after June/July 2021 (Fig 6), which coincide with the last mentioned state of emergency.

a. Are there any particular reasons why the last state of emergency could have different behavior? Are those reasons related to the economic activities mentioned in the study? For example, different economic activities were limited, or the degree of limitation varies compared to the previous state of emergency (e.g., the case in which only vaccinated people could eat in restaurants).

5. Line 178-179. The author state that "There is no data for single-person households at the daily frequency in this survey."

a. Single-person households have consumption habits that might differ from those of family households. It could be expected that a single person is more willing to go out to a restaurant or has the time to share a drink with friends. For this segment of the population to make trips on the weekends or have quick holidays could probably be more manageable, giving them the chance to take advantage of the Go-to-travel program. 
The author should discuss the effect of the lack of data from single-person households on the validity of its results and conclusions.

6. Table 1 shows the corresponding lag of household expenditures in days. It is possible to observe important differences depending on the expenditure category. I would be interested in a brief analysis of the potential reasons for those differences.

7. The word "anomalies" is used to define the period in which the equation estimated values do not fit well the observed values.

a. The word "anomalies" is ambiguous and assumes that the equation perfectly captures what can be considered normal behavior. This assumption ignores the previously mentioned potential changes over time (e.g., changes in consumption patterns).

b. The differences between the observed and the predicted values could be related to a problem of omitted variables, for instance, periods with particular characteristics that could increase personal contacts and/or household expenditures (e.g., elections). It would be recommended that the author mention whether he believes this could be a limitation of the work.

8. Line 370: It is unclear why the author suggests that the equation is overfitted. He should further elaborate on this point.

9. There is a lot of material in the discussion that belongs to the Result section.

a. For instance, the paragraph that starts in line 400 describes what is observed in Fig 6, and the one started in line 420 what is observed in Fig 7, Fig 8, and Fig 13. The description should appear only in the result section, and the relevant consequences of the results should be mentioned in the discussion section.

b. I would avoid including additional figures in the discussion section (i.e., Fig 11 and 12). However, if the author considered them necessary for the narrative, the figures might be better located in the appendix.

10. 
The Conclusion mentions a stable association between the expenditure category and the rate of change in the number of confirmed cases until June 2021. Given that the regression is estimated until Feb 2021 and that there is a period of "anomalies" highlighted by the authors, it is probably not accurate to state a stable association until June 2021.

**********

6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files.

If you choose “no”, your identity will remain anonymous but your review may still be made public.

Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #1: No
Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

27 Feb 2022

I thank Reviewer 1 and Reviewer 2 for their valuable comments. Please find below point-by-point responses to their comments.

1. 
Response to Reviewer 1

> Major point 1-1) and 1-2)-i)

Following the reviewer’s comments, I have added sensitivity analysis to the revised manuscript to report the estimates of a regression without any weather variable among the explanatory variables, and regressions with indicators for ambient (outside) temperature substituting the dummy variable for absolute humidity in the explanatory variables. Please see lines 412-441 on page 14 in the revised manuscript and S4 Appendix for the results of sensitivity analysis.

In the sensitivity analysis, I report the result that the out-of-sample forecast performance of the regression model is similarly good, regardless of whether no weather variable is included in the explanatory variables, or whether the dummy variable for absolute humidity is replaced with an indicator for ambient temperature. Even though the benchmark regression with the dummy variable for absolute humidity shows a slightly better out-of-sample forecast performance than the other regressions considered in the sensitivity analysis, I conclude in the revised manuscript that, using out-of-sample forecast performance for model evaluation, the present study cannot be conclusive on the importance of weather conditions for COVID-19 incidence and transmission.

In this analysis, I do not include indicators for both ambient temperature and absolute humidity in the explanatory variables at the same time, because I do not have good information to set prior distributions of the coefficients of the two variables to avoid a multi-collinearity problem, given similar fluctuations of ambient temperature and absolute humidity. I have noted in the revised manuscript that more precise identification of the effect of weather conditions on Covid-19 incidence and transmission remains a challenge for future research. 
Please see the Discussion section of the revised manuscript in lines 624-629 on page 19.
> Major point 1-2)-ii)
Unfortunately, household expenditure data at a daily frequency are available only at the national level in the government statistics (the Family Income and Expenditure Survey [45]) that this study uses, as described in lines 609-612 on page 18. To analyze the effect of population density on COVID-19 incidence and transmission, I would need regional data to have variations of population density among samples. Following the reviewer’s comment, I have noted in the Discussion section of the revised manuscript that it is important to estimate the effect of population density on the local infectiousness of household expenditures and that this issue warrants future research. Please see lines 622-624 on page 19 for this clarification.
> Major point 2
The regression model in the present study does not include the effect of vaccinations, because it is estimated using the data for the dependent variable only up to February 1, 2021. I limit the sample period for estimation in this way in order to avoid a possible structural break due to the spread of mutant strains in 2021, as described in lines 161-176 on page 6. I use data in the subsequent sample period to compute out-of-sample forecasts of the regression model for model evaluation. Because there was no large-scale vaccination effort before February 2021 in Japan, the regression model does not include the effect of vaccinations among the explanatory variables.
Nonetheless, following the reviewer’s comment, I have added a new section (titled “Preliminary regression of out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population”) after the Results section in the revised manuscript to analyze the roles of the Delta-variant share of new confirmed cases and the vaccinated share of the population in explaining the out-of-sample forecast errors of the regression model.
The regression analysis described in this section implies that while the Delta-variant share of new confirmed cases and the vaccinated share of the population can explain the out-of-sample forecast errors up to mid-June 2021 and from August 2021 onward, they cannot account for a rise in the rate of change in the number of new cases from mid-June to July 2021. A caveat is that this conclusion is tentative, as the regression analysis in the new section is yet to be validated by out-of-sample forecast performance. Please see lines 465-502 on pages 15-16 for the regression analysis.
> Major point 3
Following the reviewer’s comment, I have included the Delta-variant share of new confirmed cases in Japan among the explanatory variables in the regression analysis described in the response to Major point 2 in this letter. Please see lines 465-502 on pages 15-16 for the regression analysis.
> Minor point 1
Following the reviewer’s suggestion, I cite the clinical guide published by the MHLW of the Japanese government [41] instead of the CDC recommendation regarding the range of incubation periods of COVID-19. Please see lines 153-155 on pages 5-6.
Except for the citation of the CDC guideline in this part, there is no other citation of foreign public health agencies outside Japan in the revised manuscript.
> Minor point 2
To improve the resolution of figures to a satisfactory level for the PLOS ONE journal, I use the PACE website (https://pacev2.apexcovantage.com/) to convert figures into image files, following the journal’s instructions on figures.
2. Response to Reviewer 2
> Comment 1
Yes. The typo is corrected in the revised manuscript. I thank the reviewer for pointing out the error.
> Comment 2-a
Following the reviewer’s comment, I have added a clarification in the revised manuscript that, because household expenditures can be made both online and offline, the coefficient of each explanatory variable in the regression captures the average effect of the variable.
Please see lines 106-111 on page 4 for the clarification.
> Comment 2-b
Following the reviewer’s comment, I have added a sensitivity analysis to the revised manuscript to report the estimation result of an alternative regression in which household expenditures for clothing and footwear in the explanatory variables include only offline purchases. I do not separate online and offline purchases for the other types of household expenditures in the explanatory variables, because they are offline services except “the other household expenditures”, even if the payments are made online, while “the other household expenditures” are the residual household expenditures other than the classified household expenditures included in the explanatory variables. Please see lines 442-464 on pages 14-15 and S5 Appendix for the sensitivity analysis.
In the sensitivity analysis, I report that the estimation results of the alternative regression with offline household expenditures for clothing and footwear are similar to those of the benchmark regression with total (i.e., the sum of online and offline) household expenditures for clothing and footwear. As a possible reason for this result, I refer to the fact that the online share of household expenditure for clothing and footwear had been stable during the sample period after a permanent increase in early 2020. I also mention the limitation of this interpretation at the end of the sensitivity analysis.
> Comment 3-a
A regression model with a structural break has a strength in the precise identification of the date of the structural break. A challenge for this approach is that the estimation of coefficients both before and after the structural break will be biased unless the regression model is correctly specified for the entire sample period before and after the structural break.
This issue is especially challenging for the present study, because it is still difficult to correctly specify in a regression model the effects of multiple mutant strains, some of which appeared only recently, and of vaccinations, whose protective effects decay over time and may differ across strains. One way to mitigate this challenge is to focus on an early part of the sample period clearly before observed structural breaks, such as the spread of mutant strains and the progress of vaccinations in 2021 in Japan. The present study takes this approach. Another benefit of this approach is that the regression model can be validated by out-of-sample forecasts (i.e., testing data), because not all the available sample period is used for the estimation of the regression model. In contrast, model validation by out-of-sample forecasts is usually difficult for a regression model with a structural break, because a structural break is typically introduced into a regression model to use all the available sample period for in-sample fitting.
Thus, the present study has some benefits that a regression model with a structural break does not have, and vice versa. Following the reviewer’s comment, I mention the benefit of estimating a regression model with a structural break as an issue for future research in the Discussion section in the revised manuscript. Please see lines 631-642 on page 19 for this clarification.
> Comment 3-b
Following the reviewer’s comment, I have added a new section (titled “Preliminary regression of out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the twice-vaccinated share of the population”) after the Results section in the revised manuscript to analyze the roles of the Delta-variant share of new confirmed cases and the vaccinated share of the population in explaining the out-of-sample forecast errors of the regression model.
Please see lines 465-502 on pages 15-16 for this analysis.
> Comment 3-c
If regression coefficients had changed significantly between 2020 and 2021, then the out-of-sample forecasts of the regression would not fit the realized values of the dependent variable well in the out-of-sample forecast period from February 2, 2021, onward. Nonetheless, it is possible that a temporary change in regression coefficients took place during the summer of 2020, for which the in-sample fit of the regression is low. Following the reviewer’s comment, this issue is added as a limitation of the present study in the Discussion section. Please see lines 643-656 on page 19.
> Comment 4
As stated in the reviewer’s comment, it is possible that people changed their behavior during each state of emergency. To capture the effect of such a behavioral change on the infectiousness of household expenditures, a time dummy for each state of emergency during the sample period is introduced in the regression model. Please see lines 119-122 on page 4 for this clarification. Because a time dummy only captures a time-varying average, the estimated coefficient of the time dummy for each state of emergency cannot tell the reason for the estimated time effect. Also, as will be described in the response to the reviewer’s comment 8, the estimated coefficients of the time dummies for the states of emergency are likely to suffer from overfitting. Given this result, I do not try to interpret the estimated values of these coefficients in the revised manuscript.
> Comment 5
Because of the lack of daily data for single-person households, it cannot be known whether expenditures by single-person households were just proportional to expenditures by households with two or more persons, or whether including them would significantly improve the fit of the regression model to data, for example the low in-sample fit during the summer of 2020.
The present study reports that the regression model with expenditures by households with two or more persons can be validated by a good fit of out-of-sample forecasts of the regression. Following the reviewer’s comment, I mention the possibility of improving the fit of the regression model by including single-person household data for an explanatory variable in the Discussion section of the revised manuscript. Please see lines 651-656 on page 19 for this clarification.
> Comment 6
The presence of a lag in the sample correlation between the log difference over 7 days in the number of new confirmed cases and each large category of household expenditures shown in Table 1 is partly due to the presence of incubation periods after infections. Because it is a sample correlation, differences in lag lengths across different categories of household expenditures can also be due to differences in unobserved activities associated with each category of household expenditures. The possible effects of indirectly associated consumer activities for each classified component of household expenditures in the explanatory variables are mentioned in the Discussion section of the revised manuscript. Please see lines 571-575 on page 18 for this description.
In addition, to remove the effect of such indirect channels and extract the direct effect of each household expenditure on the incidence of infection, the regression model in the present study imposes the constraint that household expenditures in the explanatory variables can affect the dependent variable on a future date only in proportion to the exogenous distribution of incubation periods.
This use of the distribution of incubation periods in the regression model is described in lines 78-86 on page 3.
> Comment 7-a
Regarding the use of the word “anomaly” in the previous manuscript, I intended to simply mean that the regression in the manuscript fails to account for the second wave of COVID-19 infection in Japan, rather than that the regression model is perfect. Following the reviewer’s comment, the sentence referred to by this comment is removed from the revised manuscript.
> Comment 7-b
Various possibilities of improving the fit of the regression model to data by incorporating a larger set of explanatory variables are mentioned in the Discussion section in the revised manuscript. Please see responses to the reviewer’s earlier comments, and also lines 643-656 on page 19. Also, I have added a new section on the regression of the out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the vaccinated share of the population, as described in the response to the reviewer’s comment 3-b.
> Comment 8
Overfitting of a regression occurs when regression coefficients estimated by the in-sample fit of the regression produce imprecise out-of-sample forecasts. Following the reviewer’s comment, I have added S2 Fig (a figure in an online appendix) to the revised manuscript to report the out-of-sample forecasts of the regression in which a time dummy for the second state of emergency is not set to zero in the out-of-sample forecast period. This figure demonstrates a poor fit of the out-of-sample forecasts of the regression, despite a good in-sample fit of the regression. Please see lines 686-687 on page 20 for confirmation.
> Comment 9-a
Following the reviewer’s comment, the subsection of the Discussion section referred to by the reviewer is moved to an earlier section, which describes the regression of out-of-sample forecast errors on the Delta-variant share of new confirmed cases and the vaccinated share of the population.
Please see lines 465-502 on pages 15-16 for confirmation.
> Comment 9-b
Following the reviewer’s comment, all the figures in the Discussion section in the previous manuscript are moved either to earlier sections or to online appendices. In the revised manuscript, there is no figure included in the main text of the Discussion section. Please see lines 503-656 on pages 16-19 for confirmation.
> Comment 10
Following the reviewer’s comment, I have added “except a low in-sample fit of the regression for the summer of 2020” after the expression in the conclusion criticized by the reviewer. I have also fully revised the conclusion part of the abstract to avoid using the phrase “stable association”. Please see lines 661-662 on page 19 and the abstract on page 1 for confirmation.
Submitted filename: Response to reviewers.docx
16 Mar 2022
PONE-D-21-36475R1
Associations between components of household expenditures and the rate of change in the number of new confirmed cases of COVID-19 in Japan: time-series analysis
PLOS ONE
Dear Dr. Tomura,
Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process.
Please submit your revised manuscript by Apr 30 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript:
A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'.
A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'.
An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'.
If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter.
If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: https://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols. Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols.
We look forward to receiving your revised manuscript.
Kind regards,
Shinya Tsuzuki, MD, MSc
Academic Editor
PLOS ONE
Journal Requirements:
Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.
Additional Editor Comments:
Both reviewers were basically satisfied with the given responses but raised a few minor concerns. I agree with their points, so please make a few minor changes before publication.
Reviewers' comments:
Reviewer's Responses to Questions
Comments to the Author
1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.
Reviewer #1: (No Response)
Reviewer #2: All comments have been addressed
**********
2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.
Reviewer #1: Yes
Reviewer #2: Yes
**********
3. Has the statistical analysis been performed appropriately and rigorously?
Reviewer #1: Yes
Reviewer #2: Yes
**********
4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data (e.g. participant privacy or use of data from a third party), those must be specified.
Reviewer #1: Yes
Reviewer #2: Yes
**********
5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous.
Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.
Reviewer #1: Yes
Reviewer #2: Yes
**********
6. Review Comments to the Author
Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)
Reviewer #1: Thank you for the opportunity to review. Having read the revised conclusion section, I think it would be better if the author added a more leading conclusion within the conclusion section of the abstract. The current revised version is too short, and it would be better to guide the potential readers of this manuscript by concluding with what the results of this study demonstrated using all the data in the manuscript, with regard to the background and objective of this study, rather than ...(validated by).... Please consider revising the conclusion to state which key points your results really support and their significance to the medical and economic fields.
Reviewer #2: I am satisfied with the answers of the author to my previous comments and only have two additional minor comments to add:
1) The Discussion section is mainly composed of the limitations subsection of the work. Although it is very comprehensive, I would suggest changing the title of this sub-section to "limitations and strengths", since the author also highlights some of the analysis's main points. Additionally, I would recommend mentioning additional strengths of the analysis, so that not only the limitations are accentuated. The Discussion and Conclusion section could also benefit from correction by a native English speaker. They do not have mistakes in words or grammar, but the text could be written more naturally.
2) The conclusion does not highlight the "so what" of the paper.
For example, the relevancy of the results in terms of policy recommendations should be emphasized.
**********
7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.
Reviewer #1: No
Reviewer #2: No
21 Mar 2022
I thank Reviewer 1 and Reviewer 2 for their suggestions to improve the manuscript. Please find below point-by-point responses to their comments.
1. Response to Reviewer 1
Following the reviewer’s suggestion, the Conclusion section of the abstract is changed to a more substantial explanation of this study’s contribution: “The estimated model can be potentially useful in simulating changes in the number of new confirmed cases due to household spending on retail services, if it can be adjusted to real-time developments of mutant strains and vaccinations. Such simulations would help in designing cost-efficient government interventions.”
To accommodate the longer conclusion section within the word limit for the abstract, the Objective section of the previous abstract is dropped from the revised manuscript. Because the Objective section overlapped the Methods section in the previous abstract, there is no significant loss of information caused by this revision.
2. Response to Reviewer 2
1) Following the reviewer’s suggestion, the title of the subsection is changed from “Limitations” to “Limitations and strengths” in the revised manuscript. Please see Line 552 on page 17 for confirmation.
Furthermore, the Discussion and Conclusion section has been edited by a professional English editor at Editage, an academic editing company. The editor is a native English speaker according to the company. Most of the English editing is incorporated in the revised manuscript, except for a few technical terms such as “Delta variant” and “new confirmed cases”, and a few expressions that are kept to be consistent with the earlier sections of the revised manuscript.
I believe the Discussion and Conclusion section has become easier to read, as the reviewer suggested.
2) Following the reviewer’s suggestion to make the Conclusion section more substantial, the end of the Conclusion section is revised to clarify that the findings reported in this study can be potentially used for designing cost-efficient government interventions in consumer activities to contain the spread of COVID-19, and also for predicting seasonal fluctuations in the number of new confirmed cases of COVID-19. Please see lines 674-681 on page 20 for confirmation. For the same purpose, the Conclusion section of the abstract is also revised, as described in the response to Reviewer 1 above.
Submitted filename: Response to reviewers in the second revision.docx
31 Mar 2022
Associations between components of household expenditures and the rate of change in the number of new confirmed cases of COVID-19 in Japan: time-series analysis
PONE-D-21-36475R2
Dear Dr. Tomura,
We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements.
Within one week, you’ll receive an e-mail detailing the required amendments. When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.
An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.
If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact.
If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.
Kind regards,
Shinya Tsuzuki, MD, MSc
Academic Editor
PLOS ONE
Additional Editor Comments (optional):
Reviewers' comments:
4 Apr 2022
PONE-D-21-36475R2
Associations between components of household expenditures and the rate of change in the number of new confirmed cases of COVID-19 in Japan: time-series analysis
Dear Dr. Tomura:
I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.
If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.
If we can help with anything else, please email us at plosone@plos.org.
Thank you for submitting your work to PLOS ONE and supporting open access.
Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Dr. Shinya Tsuzuki
Academic Editor
PLOS ONE