
Using Soccer Games as an Instrument to Forecast the Spread of COVID-19 in Europe.

Juan-Pedro Gómez, Maxim Mironov.

Abstract

We provide strong empirical support for the contribution of soccer games held in Europe to the spread of the COVID-19 virus in March 2020. We analyze more than 1,000 games across 194 regions from 10 European countries. Daily cases of COVID-19 grow significantly faster in regions where at least one soccer game took place two weeks earlier, consistent with the existence of an incubation period. These results weaken as we include stadiums with smaller capacity. We discuss the relevance of these variables as instruments for the identification of the causal effect of COVID-19 on firms, the economy, and financial markets.
© 2021 Elsevier Inc. All rights reserved.

Keywords:  COVID-19; Identification strategy; Instrumental variables; Soccer; Super-spreaders

Year:  2021        PMID: 33642953      PMCID: PMC7900761          DOI: 10.1016/j.frl.2021.101992

Source DB:  PubMed          Journal:  Financ Res Lett        ISSN: 1544-6131


Introduction

There is anecdotal evidence that soccer games have contributed to the spread of the COVID-19 pandemic in Europe.1 In this paper, we provide strong empirical support for this conjecture and discuss the implications of our findings for the identification of the causal impact of COVID-19 on firms, the economy, and financial markets.

Although it makes sense to assume that the original outbreak of the pandemic in China at the end of 2019 is exogenous, this assumption becomes more questionable for the propagation of cases across countries and regions in Europe during the first quarter of 2020. For instance, the uninstrumented number of cases, especially at the beginning of the pandemic, is likely to overestimate the incidence of COVID-19 in well-connected versus remote cities.2 Similarly, cities and regions with more inhabitants and higher population density are likely to experience faster virus spread (Rocklöv and Sjödin (2020)). On the one hand, these regions tend to concentrate a higher percentage of firms and human capital, making any correlation between the number of cases and firm variables (like productivity, growth, solvency, or liquidity) potentially spurious. On the other hand, these regions are likely to concentrate more economic and medical resources to detect and counteract the pandemic. Thus, the raw number of COVID-19 cases might capture the inverse quality of the regional health system, which is likely correlated with firm performance and regional growth.

To overcome these endogeneity issues, we propose four variables related to soccer games played across European regions from 10 countries during the first quarter of 2020. These variables constitute a novel and valuable instrument to explore the causal effect of COVID-19 infections on firm performance, management decisions, and the economy. Methodologically, the exclusion restriction is well founded. National leagues and pan-European tournaments, like the UEFA Champions and Europa leagues, were scheduled well before the original outbreaks of COVID-19 in China. Although there is evidence of the behavioral impact of victories and losses of soccer matches on stock returns (e.g., Edmans, García, and Norli (2007)), our soccer-related instruments are independent of the game's outcome. As far as we know, there is neither theory nor evidence that directly links the number of attendees at a soccer match, or the capacity of the venue where it is played, with, for instance, stock returns, cash holdings, or dividends of firms headquartered in the region, or, alternatively, growth in regional product or unemployment. Theoretically, the physical interaction among spectators in large venues, as well as their arrival at and departure from stadiums, increases the likelihood of being infected with the virus, so that these games ultimately work as “super-spreader” events. The evidence in this paper is consistent with this conjecture and offers solid support for the relevance of these instruments to predict the spread of COVID-19 cases across European regions.

We collect data on soccer games from all competitions (domestic and international) played in 194 regions across Belgium, France, Italy, Germany, the Netherlands, Poland, Spain, Sweden, Switzerland, and the UK, between January 1 and the end of March 2020 (most games in Europe were canceled after March 10). In our main analysis, we include games played in venues with a minimum capacity of 25,000 people. In total, there are 1,051 qualifying games during this period.3 We also collect the confirmed cases of COVID-19 in these regions until the end of March, plus three economic and demographic variables: gross regional product, population, and density. We construct four variables related to the soccer matches, namely: a dummy variable that takes a value of one if there was a soccer game in the region, zero otherwise; a variable that accumulates the number of games played in the region; a variable that accumulates all the spectators who attended those games; and a variable that accumulates the capacity (maximum number of spectators) of the venues where the matches took place.

We document the following findings. First, for any single country and day from March 1 through 14, the rate of change in the number of COVID-19 cases relative to the previous day is, on average, 5.5 percentage points higher in regions where there was at least one soccer game two weeks earlier relative to regions with no games in the same period (as reference, the average rate of change is 23% per day during this period). Additionally, the daily increment of cases is, on average, about 6 basis points higher for every 1% increase in the attendance and venue capacity of games played two weeks earlier. These results are significant at the 1% level, and robust to the inclusion of regional demographic and economic control variables known to affect the virus spread (e.g., Rocklöv and Sjödin (2020)).

Second, games held either in the previous week or more than two weeks earlier have no significant effect on the increment of daily cases. This is consistent with the incubation period and the lack of massive testing in the early stages of the pandemic.4

Third, as we expand the sample to include games held in venues with smaller capacity, the statistical significance of the coefficient on the three soccer-related variables decreases, becoming not significantly different from zero when we include stadiums with a minimum capacity above 10,000 spectators. This evidence is consistent with the effect of “super-spreaders” of the virus documented in other large events (e.g., Dave et al. (2020) and Felbermayr, Hinz, and Chowdhry (2020)).

Fourth, the games played by a (local) team of a given region in another region have no significant effect on the number of cases in the local region, regardless of the game attendance or the venue capacity. Thus, there is no evidence that soccer fans moving to other regions or people gathering in bars in the local region to watch the game have contributed significantly to the spread of the virus.

The rest of the paper is organized as follows. Section 2 describes the data. Results are presented in Section 3. We discuss the limitations of the analysis in Section 4, before concluding with Section 5. Our variables and their sources are described in the Appendix.

Data

Our sample consists of 2,162 region-day observations.5 We collect the accumulated number of diagnosed cases of COVID-19 per day and region from day 1 through 14 of March 2020, in 194 regions from Belgium, France, Italy, Germany, the Netherlands, Poland, Spain, Sweden, Switzerland, and the UK.6 We call this variable Cases.7 Panel A of Table 1 shows that, on average, there are 96 accumulated cases per day and region with an average of 35 accumulated cases per million regional inhabitants and day (variable Cases/Population).
Table 1

Summary Statistics for the Sample of Region-Days

In Panel A, each observation is a region-day pair. Every day from March 1 through March 14, 2020, Cases is the accumulated number of diagnosed cases of COVID-19 in the region during that period. Cases/Population is the number of cases per million inhabitants. We consider all regions in Belgium, France, Italy, Germany, the Netherlands, Poland, Spain, Sweden, Switzerland, and the UK. The distribution of observations across regions is in Table B.1 of Appendix B. Every day from March 1 through March 14, # Games, Attendance, and Capacity are the accumulated number of soccer matches played in the region, their attendance, and the venue capacity, respectively, over the previous 6 weeks. I_Games is a dummy variable that takes a value of 1 if there was at least one soccer match in the region where the firm is located during the previous 6 weeks, zero otherwise. Population is thousands of inhabitants per region; Density is number of inhabitants per square km; GRP is the Gross Regional Product per capita in USD. Log(x) denotes the natural logarithm of x. Δ Log(1+x_t) = Log((1+x_t)/(1+x_{t−1})). In Panel B, we report the average across regions of the weekly accumulated number of games, attendance and venue capacity for up to 6 weekly lags. Table A in the Appendix includes the definition and source of each variable.

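Panel A. Accumulated variables per day and region

| Variable | Mean (1) | Median (2) | St. dev. (3) | # Regions (4) | # Obs. (5) |
|---|---|---|---|---|---|
| Cases | 96 | 8 | 507 | 194 | 2,162 |
| Cases/Population | 35 | 7 | 87 | 194 | 2,162 |
| Log(1+Cases) | 2.434 | 2.197 | 1.902 | 194 | 2,162 |
| Δ Log(1+Cases) | 0.228 | 0.152 | 0.286 | 194 | 2,073 |
| Log(1+Cases/Population) | -11.486 | -11.554 | 1.643 | 194 | 2,162 |
| # Games | 3.296 | 0 | 5.162 | 194 | 2,162 |
| I_Games | 0.444 | 0 | 0.497 | 194 | 2,162 |
| Attendance | 78,953 | 0 | 162,921 | 194 | 2,162 |
| Capacity | 136,092 | 0 | 244,261 | 194 | 2,162 |
| Log(1+Attendance) | 5.115 | 0 | 5.826 | 194 | 2,162 |
| Log(1+Capacity) | 5.481 | 0 | 6.149 | 194 | 2,162 |
| Population (thousands) | 2,287 | 1,199 | 2,782 | 194 | 2,162 |
| Density | 451 | 160 | 1,046 | 194 | 2,162 |
| GRP | 37,428 | 35,240 | 14,728 | 194 | 2,162 |
| Log(Population) | 13.920 | 13.997 | 1.344 | 194 | 2,162 |
| Log(Density) | 5.091 | 5.081 | 1.327 | 194 | 2,162 |
| Log(GRP) | 10.464 | 10.470 | 0.359 | 194 | 2,162 |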
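The Δ Log(1+x) transformation described in the notes to Table 1 can be illustrated with a minimal sketch. The function name `delta_log_cases` and the case counts below are hypothetical and serve only to show the arithmetic; they are not taken from the paper's data.

```python
import math

def delta_log_cases(cumulative_cases):
    """Return Log((1 + Cases_t) / (1 + Cases_{t-1})) for each day after the first.

    `cumulative_cases` is a list of accumulated case counts for one region,
    ordered by day. The first day has no previous value, so the output is
    one element shorter than the input.
    """
    return [
        math.log((1 + today) / (1 + yesterday))
        for yesterday, today in zip(cumulative_cases, cumulative_cases[1:])
    ]

# Hypothetical accumulated cases for one region over four consecutive days.
cases = [0, 1, 3, 7]
growth = delta_log_cases(cases)
```

Adding 1 inside the logarithm keeps the measure defined on days when a region still has zero accumulated cases.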
Then, we collect data on soccer games from all competitions (domestic and international) played in the 194 regions between January 1 and the end of March 2020 (most games in Europe were canceled after March 10). Initially, we include only games played in venues with a minimum capacity of 25,000 people. In total, there are 1,051 qualifying games during the sample period. For each game, we collect the date, playing teams, attendance (when available), venue capacity, and the region and country where the venue is located.
Finally, we also collect the following demographic variables for each region: Population, Density, and Gross Regional Product (GRP) per capita.

First, we want to explore whether there is a pattern in the relation between attendance at these events and the propagation of the virus. Every day, from March 1 through 14, we calculate the number of matches (# Games), Attendance and venue Capacity that took place in each region 1, 2, …, and up to 30 days before. Figure 1 plots the average value of each variable across the 14 days and 194 regions for each day lag. Notice that game attendance and venue capacity are highly correlated across lags (correlation coefficient 0.98). The average match attendance is about 60% of venue capacity and this percentage is very stable across lags. The figure shows periodic spikes around 7, 21, and 28-day lags for the 3 variables. Considering that the first day of our sample is Sunday, March 1, these spikes reflect the higher concentration of soccer matches on weekends (70% of soccer matches take place on weekends). Figure 2 confirms this by plotting the number of soccer games across all regions in our sample, from January 14 through March 14. The horizontal axis marks Saturdays. We can see that a disproportionate number of games fall on Saturday or Sunday.
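The share of matches falling on weekends can be checked with a few lines of code. This is a minimal sketch: the function name `weekend_share` and the fixture dates are hypothetical and are not drawn from the paper's data set.

```python
from datetime import date

def weekend_share(game_dates):
    """Fraction of games played on a Saturday or Sunday."""
    if not game_dates:
        return 0.0
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    weekend = sum(1 for d in game_dates if d.weekday() >= 5)
    return weekend / len(game_dates)

# Hypothetical fixture dates, for illustration only.
games = [
    date(2020, 2, 29),  # Saturday
    date(2020, 3, 1),   # Sunday, the first day of the sample window
    date(2020, 3, 3),   # Tuesday
    date(2020, 3, 7),   # Saturday
]
share = weekend_share(games)
```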
Figure 1

Instrument variables estimated with lags from 1 through 30 days

For every region in our sample and for every day from day 1 through 15 of March 2020, we estimate # Games, Attendance, and venue Capacity x days earlier, where x takes the value of 1 through 30. Panel A (B) presents the average Attendance and Capacity (# Games) over the 2,162 observations for every lag from 1 through 30 days. Variables are defined in Table 1.

Figure 2

Total number of soccer games per day in our sample

The figure represents the total number of games each day from January 14 through March 14 across all regions in our sample. The horizontal axis marks all Saturdays.

Thus, in order to smooth out the effect of weekends, we accumulate games, attendance and venue capacity over weekly windows. For every region in our sample and for every day from March 1 through 14, we estimate the number of soccer matches, the accumulated attendance, and the accumulated venue capacity 1, 2, …, and up to 6 weeks earlier. We also calculate the variable I_Games, which takes a value of 1 if there was at least one soccer match in the region during a given week, zero otherwise. Table 1, Panel A reports the statistics accumulated over the 6-week window. From March 1 through 14, on average, there were games in 44% of the regions over the previous 6 weeks. Additionally, for every day and region, there were on average 3.29 games accumulated over the previous 6 weeks, attended by an average of 78,953 (accumulated) people and played in venues with an average (accumulated) capacity of 136,092 spectators. Table B in the Appendix includes a list of all regions, with the accumulated number of cases, games, attendance and venue capacity in our sample.8
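The weekly accumulation described above can be sketched as follows. This is an illustrative reading of the construction, not the authors' code: the `Game` record, the function `weekly_lag_variables`, and the sample games are hypothetical, and the window for lagged week w is assumed to run from day t − 7w through day t − 1 − 7(w − 1), inclusive.

```python
from dataclasses import dataclass
from datetime import date, timedelta
import math

@dataclass
class Game:
    region: str
    day: date
    attendance: int
    capacity: int

def weekly_lag_variables(games, region, t, w, min_capacity=25_000):
    """Soccer variables for `region` in the w-th week before day t.

    Only games in venues with capacity >= min_capacity are counted.
    The window is assumed to cover days t - 7w through t - 1 - 7(w - 1).
    """
    start = t - timedelta(days=7 * w)
    end = t - timedelta(days=1 + 7 * (w - 1))
    in_window = [
        g for g in games
        if g.region == region
        and g.capacity >= min_capacity
        and start <= g.day <= end
    ]
    attendance = sum(g.attendance for g in in_window)
    capacity = sum(g.capacity for g in in_window)
    return {
        "I_Games": 1 if in_window else 0,
        "n_games": len(in_window),
        "log_attendance": math.log(1 + attendance),
        "log_capacity": math.log(1 + capacity),
    }

# Hypothetical games, for illustration only.
games = [
    Game("A", date(2020, 2, 19), 40_000, 60_000),
    Game("A", date(2020, 2, 22), 30_000, 50_000),
    Game("A", date(2020, 2, 26), 9_000, 12_000),   # below the capacity threshold
    Game("B", date(2020, 2, 22), 35_000, 45_000),  # different region
]
t = date(2020, 3, 4)
week1 = weekly_lag_variables(games, "A", t, w=1)
week2 = weekly_lag_variables(games, "A", t, w=2)
```

In this toy example, region A has no qualifying game in the week before March 4, and two qualifying games in the second lagged week.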
Table B

Statistics per Region and Day

Each day is one observation. Every day from March 1 through March 14, 2020, Cases is the accumulated number of diagnosed cases of COVID-19 in the region until that day. # Games, Attendance, and Capacity are the accumulated number of soccer matches played in venues with capacity of at least 25,000 spectators in the region, their attendance, and the venue capacity, respectively, over the previous 6 weeks. Population is thousands of inhabitants per region; Density is number of inhabitants per square-Km, both as of 2018. The table reports the average value of each variable and region from March 1 through 14. Appendix A describes all variables and their source.

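| Country | Region | Cases | # Games | Attendance | Capacity | Population | Density | # Obs. |
|---|---|---|---|---|---|---|---|---|
| Belgium | Brussels | 70 | - | - | - | 1,199 | 7,381 | 14 |
| Belgium | Flanders | 322 | 9.21 | 140,116 | 276,198 | 6,553 | 481 | 14 |
| Belgium | Wallonia | 165 | 9.79 | 78,607 | 293,571 | 3,624 | 214 | 14 |
| France | Auvergne-Rhône-Alpes | 171 | 12.29 | 362,220 | 665,764 | 7,917 | 113 | 14 |
| France | Bourgogne-Franche-Comté | 117 | - | - | - | 2,818 | 59 | 14 |
| France | Brittany | 66 | 3.36 | 93,004 | 99,969 | 3,307 | 121 | 14 |
| France | Centre-Val de Loire | 16 | - | - | - | 2,578 | 66 | 14 |
| France | Corsica | 31 | - | - | - | 330 | 38 | 14 |
| France | Grand Est | 346 | 6.50 | 132,825 | 180,914 | 5,555 | 97 | 14 |
| France | Hauts-de-France | 187 | 7.86 | 251,519 | 348,176 | 6,007 | 189 | 14 |
| France | Normandy | 33 | 4.14 | 35,226 | 104,321 | 3,336 | 111 | 14 |
| France | Nouvelle-Aquitaine | 41 | 2.93 | 64,568 | 123,337 | 5,936 | 70 | 14 |
| France | Occitanie | 62 | 3.00 | 42,800 | 99,450 | 5,808 | 80 | 14 |
| France | Pays de la Loire | 25 | 6.43 | 107,287 | 204,370 | 3,738 | 116 | 14 |
| France | Provence-Alpes-Côte d'Azur | 78 | 8.64 | 274,276 | 431,566 | 5,022 | 160 | 14 |
| France | Île-de-France | 293 | 4.21 | 190,014 | 201,987 | 12,117 | 1,009 | 14 |
| Germany | Baden-Württemberg | 208 | 12.14 | 319,334 | 443,882 | 10,880 | 304 | 14 |
| Germany | Bavaria | 228 | 9.50 | 442,626 | 515,194 | 12,844 | 182 | 14 |
| Germany | Berlin | 57 | 3.43 | 154,714 | 255,939 | 3,520 | 3,946 | 14 |
| Germany | Brandenburg | 13 | - | - | - | 2,485 | 84 | 14 |
| Germany | Bremen | 13 | 3.57 | 148,673 | 150,357 | 671 | 1,598 | 14 |
| Germany | Hamburg | 35 | 6.29 | 247,474 | 277,885 | 1,787 | 2,367 | 14 |
| Germany | Hesse | 47 | 5.36 | 252,136 | 275,893 | 6,176 | 292 | 14 |
| Germany | Lower Saxony | 60 | 7.00 | 177,349 | 264,286 | 7,927 | 167 | 14 |
| Germany | Mecklenburg-Vorpommern | 12 | 2.79 | 34,076 | 80,786 | 1,612 | 69 | 14 |
| Germany | North Rhine-Westphalia | 448 | 36.29 | 1,307,019 | 1,701,549 | 17,865 | 524 | 14 |
| Germany | Rhineland-Palatinate | 28 | 7.57 | 173,961 | 322,803 | 4,053 | 204 | 14 |
| Germany | Saarland | 9 | - | - | - | 996 | 388 | 14 |
| Germany | Saxony | 21 | 6.43 | 221,229 | 239,863 | 4,085 | 221 | 14 |
| Germany | Saxony-Anhalt | 10 | 3.50 | 58,349 | 95,375 | 2,245 | 110 | 14 |
| Germany | Schleswig-Holstein | 16 | - | - | - | 2,859 | 181 | 14 |
| Germany | Thuringia | 8 | - | - | - | 2,171 | 134 | 14 |
| Italy | Abruzzo | 33 | - | - | - | 1,312 | 121 | 14 |
| Italy | Aosta Valley | 13 | - | - | - | 126 | 39 | 14 |
| Italy | Apulia | 50 | 10.86 | 117,088 | 411,101 | 4,029 | 206 | 14 |
| Italy | Basilicata | 4 | - | - | - | 563 | 56 | 14 |
| Italy | Bolzano | 39 | - | - | - | 521 | 79 | 14 |
| Italy | Calabria | 14 | 3.79 | 43,359 | 104,270 | 1,947 | 128 | 14 |
| Italy | Campania | 102 | 14.64 | 162,686 | 618,583 | 5,802 | 424 | 14 |
| Italy | Emilia-Romagna | 1,204 | 10.07 | 92,181 | 320,486 | 4,459 | 199 | 14 |
| Italy | Friuli-Venezia Giulia | 90 | 8.14 | 57,341 | 204,646 | 1,215 | 153 | 14 |
| Italy | Lazio | 109 | 13.79 | 313,579 | 973,740 | 5,879 | 341 | 14 |
| Italy | Liguria | 129 | 10.36 | 98,136 | 379,061 | 1,551 | 286 | 14 |
| Italy | Lombardy | 4,773 | 20.79 | 512,609 | 1,195,928 | 10,061 | 422 | 14 |
| Italy | Marche | 313 | - | - | - | 1,525 | 162 | 14 |
| Italy | Molise | 11 | - | - | - | 306 | 69 | 14 |
| Italy | Piedmont | 332 | 11.00 | 124,778 | 415,073 | 4,356 | 172 | 14 |
| Italy | Sardinia | 17 | - | - | - | 1,640 | 68 | 14 |
| Italy | Sicily | 55 | 11.29 | 76,326 | 357,356 | 5,000 | 194 | 14 |
| Italy | Trentino-South Tyrol | 50 | - | - | - | 1,072 | 79 | 14 |
| Italy | Tuscany | 197 | 6.86 | 91,255 | 324,343 | 3,730 | 162 | 14 |
| Italy | Umbria | 32 | - | - | - | 882 | 104 | 14 |
| Italy | Veneto | 775 | 4.93 | 46,283 | 192,436 | 4,906 | 267 | 14 |
| Netherlands | Drenthe | 7 | - | - | - | 493 | 188 | 14 |
| Netherlands | Flevoland | 3 | - | - | - | 422 | 299 | 14 |
| Netherlands | Friesland | 2 | 7.14 | 74,363 | 186,429 | 650 | 196 | 14 |
| Netherlands | Gelderland | 24 | 4.07 | 62,696 | 101,786 | 2,084 | 420 | 14 |
| Netherlands | Groningen | 1 | - | - | - | 586 | 252 | 14 |
| Netherlands | Limburg | 26 | - | - | - | 1,118 | 521 | 14 |
| Netherlands | North Brabant | 129 | 2.50 | 86,400 | 87,500 | 2,563 | 523 | 14 |
| Netherlands | North Holland | 26 | 4.29 | 225,453 | 235,671 | 2,878 | 1,082 | 14 |
| Netherlands | Overijssel | 7 | 6.00 | 80,600 | 181,230 | 1,162 | 350 | 14 |
| Netherlands | South Holland | 36 | 3.07 | 143,993 | 157,187 | 3,706 | 1,317 | 14 |
| Netherlands | Utrecht | 42 | - | - | - | 1,354 | 981 | 14 |
| Netherlands | Zeeland | 3 | - | - | - | 384 | 216 | 14 |
| Poland | Greater Poland | 2 | 3.00 | 31,614 | 137,490 | 3,398 | 114 | 11 |
| Poland | Holy Cross | 0 | - | - | - | 1,273 | 109 | 11 |
| Poland | Kuyavia-Pomerania | - | - | - | - | 2,068 | 115 | 11 |
| Poland | Lesser Poland | 1 | 3.55 | 58,265 | 118,773 | 3,287 | 217 | 11 |
| Poland | Lower Silesia | 4 | 2.91 | 23,987 | 124,425 | 2,887 | 145 | 11 |
| Poland | Lublin | 3 | - | - | - | 2,162 | 86 | 11 |
| Poland | Lubusz | 1 | - | - | - | 1,009 | 72 | 11 |
| Poland | Masovia | 4 | 3.55 | 82,805 | 110,274 | 5,204 | 146 | 11 |
| Poland | Opole | 1 | - | - | - | 1,033 | 110 | 11 |
| Poland | Podlaskie | - | - | - | - | 1,191 | 59 | 11 |
| Poland | Pomerania | 0 | 1.91 | 19,007 | 80,151 | 2,220 | 121 | 11 |
| Poland | Silesia | 5 | - | - | - | 4,646 | 377 | 11 |
| Poland | Subcarpathian | 2 | - | - | - | 2,099 | 118 | 11 |
| Poland | Warmia–Masuria | 2 | - | - | - | 1,427 | 59 | 11 |
| Poland | West Pomerania | 2 | - | - | - | 1,693 | 74 | 11 |
| Poland | Łódź | 2 | - | - | - | 2,549 | 140 | 11 |
| Spain | Andalucia | 99 | 10.14 | 339,071 | 453,185 | 8,450 | 96 | 14 |
| Spain | Aragon | 38 | 3.50 | 90,463 | 117,628 | 1,349 | 28 | 14 |
| Spain | Asturias | 31 | 5.93 | 104,055 | 179,357 | 1,077 | 102 | 14 |
| Spain | Canarias | 34 | 2.50 | 30,219 | 81,000 | 2,118 | 284 | 14 |
| Spain | Cantabria | 17 | - | - | - | 594 | 112 | 14 |
| Spain | Castilla y Leon | 55 | 3.00 | 61,847 | 83,538 | 2,546 | 27 | 14 |
| Spain | Castilla-La Mancha | 87 | - | - | - | 2,122 | 27 | 14 |
| Spain | Cataluña | 166 | 7.86 | 373,219 | 576,342 | 7,571 | 236 | 14 |
| Spain | Ceuta | 0 | - | - | - | 84 | 4,422 | 14 |
| Spain | Extremadura | 20 | - | - | - | 1,108 | 27 | 14 |
| Spain | Galicia | 36 | 5.57 | 126,220 | 185,571 | 2,781 | 94 | 14 |
| Spain | Islas Baleares | 14 | - | - | - | 1,119 | 224 | 14 |
| Spain | La Rioja | 113 | - | - | - | 324 | 64 | 14 |
| Spain | Madrid | 916 | 9.00 | 588,469 | 676,052 | 6,499 | 809 | 14 |
| Spain | Melilla | 1 | - | - | - | 81 | 6,216 | 14 |
| Spain | Murcia | 15 | 3.00 | - | 93,537 | 1,474 | 130 | 14 |
| Spain | Navarra | 44 | - | - | - | 645 | 62 | 14 |
| Spain | Pais Vasco | 193 | 10.64 | 400,259 | 447,445 | 2,193 | 303 | 14 |
| Spain | Valencia | 73 | 11.71 | 227,139 | 448,804 | 5,129 | 221 | 14 |
| Sweden | Blekinge | 3 | - | - | - | 160 | 54 | 14 |
| Sweden | Dalarna | 1 | - | - | - | 287 | 10 | 14 |
| Sweden | Gotland | 1 | - | - | - | 59 | 19 | 14 |
| Sweden | Gävleborg | 2 | - | - | - | 287 | 16 | 14 |
| Sweden | Halland | 10 | - | - | - | 329 | 60 | 14 |
| Sweden | Jämtland | 3 | - | - | - | 130 | 3 | 14 |
| Sweden | Jönköping | 12 | - | - | - | 361 | 34 | 14 |
| Sweden | Kalmar | 2 | - | - | - | 245 | 22 | 14 |
| Sweden | Kronoberg | 3 | - | - | - | 200 | 24 | 14 |
| Sweden | Norrbotten | 2 | - | - | - | 250 | 3 | 14 |
| Sweden | Skåne | 54 | - | - | - | 1,362 | 123 | 14 |
| Sweden | Stockholm | 156 | 4.29 | 37,971 | 175,786 | 2,344 | 360 | 14 |
| Sweden | Södermanland | 4 | - | - | - | 295 | 48 | 14 |
| Sweden | Uppsala | 12 | - | - | - | 376 | 46 | 14 |
| Sweden | Värmland | 13 | - | - | - | 281 | 16 | 14 |
| Sweden | Västerbotten | 3 | - | - | - | 270 | 5 | 14 |
| Sweden | Västernorrland | 3 | - | - | - | 245 | 11 | 14 |
| Sweden | Västmanland | 1 | - | - | - | 274 | 53 | 14 |
| Sweden | Västra Götaland | 56 | - | - | - | 1,710 | 71 | 14 |
| Sweden | Örebro | 3 | - | - | - | 302 | 35 | 14 |
| Sweden | Östergötland | 2 | - | - | - | 462 | 44 | 14 |
| Switzerland | Aargau | 18 | - | - | - | 678 | 388 | 9 |
| Switzerland | Appenzell Ausserrhoden | 2 | - | - | - | 55 | 220 | 9 |
| Switzerland | Appenzell Innerrhoden | 0 | - | - | - | 16 | 87 | 9 |
| Switzerland | Basel-Landschaft | 25 | - | - | - | 290 | 502 | 9 |
| Switzerland | Basel-Stadt | 55 | 6.00 | 75,895 | 227,964 | 200 | 5,072 | 9 |
| Switzerland | Bern | 42 | 1.33 | 34,498 | 42,385 | 1,035 | 158 | 9 |
| Switzerland | Fribourg | 17 | - | - | - | 319 | 141 | 9 |
| Switzerland | Geneva | 92 | 5.00 | 11,914 | 150,420 | 499 | 1,442 | 9 |
| Switzerland | Glarus | 1 | - | - | - | 40 | 51 | 9 |
| Switzerland | Graubünden; Grisons | 24 | - | - | - | 198 | 26 | 9 |
| Switzerland | Jura | 5 | - | - | - | 73 | 82 | 9 |
| Switzerland | Luzern | 8 | - | - | - | 410 | 233 | 9 |
| Switzerland | Neuchâtel | 24 | - | - | - | 177 | 206 | 9 |
| Switzerland | Nidwalden | 2 | - | - | - | 43 | 138 | 9 |
| Switzerland | Obwalden | 2 | - | - | - | 38 | 66 | 9 |
| Switzerland | Schaffhausen | 0 | - | - | - | 82 | 246 | 9 |
| Switzerland | Schwyz | 8 | - | - | - | 159 | 143 | 9 |
| Switzerland | Solothurn | 4 | - | - | - | 273 | 308 | 9 |
| Switzerland | St. Gallen | 9 | - | - | - | 508 | 222 | 9 |
| Switzerland | Thurgau | 3 | - | - | - | 276 | 229 | 9 |
| Switzerland | Ticino | 120 | - | - | - | 353 | 110 | 9 |
| Switzerland | Uri | 0 | - | - | - | 36 | 33 | 9 |
| Switzerland | Valais | 17 | - | - | - | 344 | 53 | 9 |
| Switzerland | Vaud | 109 | - | - | - | 799 | 188 | 9 |
| Switzerland | Zug | 7 | - | - | - | 127 | 416 | 9 |
| Switzerland | Zürich | 67 | 5.11 | 25,964 | 133,420 | 1,521 | 701 | 9 |
| UK | Bedfordshire | 3 | - | - | - | 669 | 542 | 6 |
| UK | Berkshire | 12 | - | - | - | 911 | 722 | 6 |
| UK | Bristol | 3 | - | - | - | 463 | 4,224 | 6 |
| UK | Buckinghamshire | 7 | 3.33 | 28,249 | 101,667 | 809 | 432 | 6 |
| UK | Cambridgeshire | 2 | - | - | - | 853 | 252 | 6 |
| UK | Cheshire | 2 | - | - | - | 1,059 | 452 | 6 |
| UK | Cornwall | 5 | - | - | - | 568 | 160 | 6 |
| UK | Cumbria | 7 | - | - | - | 499 | 74 | 6 |
| UK | Derbyshire | 6 | 5.83 | 150,093 | 195,983 | 1,053 | 401 | 6 |
| UK | Devon | 21 | - | - | - | 1,194 | 178 | 6 |
| UK | Dorset | 3 | - | - | - | 772 | 274 | 6 |
| UK | Durham | 3 | - | - | - | 867 | 324 | 6 |
| UK | East Riding of Yorkshire | 2 | 4.33 | 49,732 | 110,067 | 600 | 242 | 6 |
| UK | East Sussex | 9 | 5.00 | 63,266 | 153,750 | 845 | 472 | 6 |
| UK | Essex | 8 | - | - | - | 1,833 | 499 | 6 |
| UK | Gloucestershire | 5 | - | - | - | 916 | 291 | 6 |
| UK | Greater London | 145 | 31.50 | 1,211,548 | 1,447,249 | 8,899 | 5,671 | 6 |
| UK | Greater Manchester | 27 | 13.17 | 415,219 | 563,642 | 2,813 | 2,204 | 6 |
| UK | Hampshire | 18 | 3.00 | 87,876 | 97,515 | 1,844 | 489 | 6 |
| UK | Herefordshire | 1 | - | - | - | 192 | 88 | 6 |
| UK | Hertfordshire | 18 | - | - | - | 1,184 | 721 | 6 |
| UK | Isle of Wight | 1 | - | - | - | 142 | 372 | 6 |
| UK | Kent | 10 | - | - | - | 1,846 | 494 | 6 |
| UK | Lancashire | 6 | 4.33 | 54,252 | 135,924 | 1,498 | 487 | 6 |
| UK | Leicestershire | 4 | 3.83 | 118,206 | 123,863 | 1,053 | 489 | 6 |
| UK | Lincolnshire | 2 | - | - | - | 1,088 | 156 | 6 |
| UK | Merseyside | 10 | 6.50 | 318,321 | 322,475 | 1,423 | 2,200 | 6 |
| UK | Norfolk | - | 2.00 | 54,120 | 54,488 | 904 | 168 | 6 |
| UK | North Yorkshire | 5 | 4.00 | 83,202 | 139,952 | 1,159 | 134 | 6 |
| UK | Northamptonshire | 6 | - | - | - | 748 | 316 | 6 |
| UK | Northumberland | - | - | - | - | 320 | 64 | 6 |
| UK | Nottinghamshire | 9 | 4.00 | 113,541 | 122,412 | 1,154 | 535 | 6 |
| UK | Oxfordshire | 14 | - | - | - | 688 | 264 | 6 |
| UK | Rutland | - | - | - | - | 40 | 104 | 6 |
| UK | Shropshire | 2 | - | - | - | 498 | 143 | 6 |
| UK | Somerset | 2 | - | - | - | 965 | 232 | 6 |
| UK | South Yorkshire | 7 | 8.00 | 206,392 | 297,166 | 1,403 | 904 | 6 |
| UK | Staffordshire | 4 | 4.00 | 92,488 | 120,356 | 1,131 | 417 | 6 |
| UK | Suffolk | 1 | 5.00 | 95,139 | 151,555 | 759 | 200 | 6 |
| UK | Surrey | 11 | - | - | - | 1,190 | 716 | 6 |
| UK | Tyne and Wear | 8 | 7.00 | 254,218 | 348,211 | 1,136 | 2,105 | 6 |
| UK | Warwickshire | 4 | - | - | - | 571 | 289 | 6 |
| UK | West Midlands | 12 | 19.33 | 425,726 | 592,587 | 2,916 | 3,235 | 6 |
| UK | West Sussex | 4 | - | - | - | 859 | 431 | 6 |
| UK | West Yorkshire | 11 | 7.67 | 203,289 | 254,780 | 2,320 | 1,143 | 6 |
| UK | Wiltshire | 6 | - | - | - | 720 | 207 | 6 |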
Panel B of Table 1 presents the average of each variable across the 14 sample days and 194 regions for each week lag. Except for the first week,9 the estimates are very similar across weeks. On average, across weeks 2 through 6, 33% of the regions hosted at least one soccer match per week. There were 0.55 games per week and region, attended by 13,192 people and played in venues with an average (accumulated) capacity of about 22,590 spectators.10

Results

We now analyze the relation between, on the one hand, the number, attendance, and venue capacity of the soccer games held until all competitions were interrupted, and, on the other, the propagation of COVID-19 cases across days and regions during the first two weeks of March 2020. There is evidence that the incubation period of COVID-19 (that is, the “pre-symptomatic” period between becoming infected and developing symptoms of the disease) can be as long as two weeks. Thus, there is likely a lag between the time when match spectators become infected and the time they are tested after developing symptoms compatible with the disease. This is especially relevant in the first two weeks of March 2020, when mass testing (in particular, of asymptomatic people) had not yet been implemented in any country. Figure 3 shows that by March 15, all countries in our sample, except Switzerland and (marginally) Germany, had a ratio of COVID-19 tests per thousand people below 0.2. Most likely, at the onset of the pandemic, only people with symptoms were tested and, eventually, diagnosed as new cases of COVID-19 infection. Therefore, considering the incubation window and that only symptomatic people were tested at that point, we expect the predictive power of our instruments to become significant with a lag after the game.
Figure 3

Daily COVID-19 tests per thousand people

The figure shows the number of daily COVID-19 tests per thousand people from February 1 through March 31, 2020, for the countries in our sample for which data are available. The graph is retrieved from https://ourworldindata.org/coronavirus-testing. Data are collected by Our World in Data at the Oxford Martin School, University of Oxford. Data description and sources per country can be found at https://ourworldindata.org/coronavirus-testing#source-information-country-by-country

To test this prediction, we run the following panel regression in region r and day t from March 1 through 14, 2020:11

$$\Delta \text{Log}(1+\text{Cases}_{r,t}) = a + b\,\Delta \text{Log}(1+\text{Cases}_{r,t-1}) + \sum_{w=1}^{6} c_w\, WX_{r,t,w} + d'\,\text{Controls}_r + FE + \varepsilon_{r,t} \qquad (1)$$

$\Delta \text{Log}(1+\text{Cases}_{r,t})$ represents the (log) difference between 1 plus the number of cases in region r on day t and on day t−1. Likewise, $\Delta \text{Log}(1+\text{Cases}_{r,t-1})$ is the same variable lagged 1 day. For every lagged week w = {1, 2, …, 6} and region r, the variable $WX_{r,t,w}$ represents, alternatively: the dummy variable I_Games, which takes a value of one if there was a soccer match in the region on any day between t − 7w and t − (1 + 7(w − 1)); the natural logarithm of 1 plus the match attendance accumulated over that week, that is, the difference in accumulated Attendance between the end and the start of the week; and the natural logarithm of 1 plus the venue capacity of the games played over that week, computed in the same way from accumulated Capacity. We control for each region's population, density and gross regional product per capita (GRP). Our object of interest is the series of coefficients on the weekly lagged predictors, $c_w$. FE represents country times day fixed effects. All variables are defined in Appendix A. Standard errors are clustered at the region level.

Table 2 presents the results from regression (1) for the three soccer variables. The rate at which the daily number of cases of COVID-19 increases is positive and significantly related to the increase of cases the previous day. It is also higher in more populated and wealthier (higher Log(GRP)) areas.
With respect to the lagged soccer variables, only the coefficient c2, corresponding to I_Games, Log(1+Attendance), or Log(1+Capacity) two weeks earlier, is significant. The other lags are not significant for any of the three variables. In specification (1), for any single country and day from March 1 through 14, the rate of change in the number of COVID-19 cases relative to the previous day is, on average, 5.5 percentage points higher in regions where there was a soccer game two weeks earlier relative to regions with no games in the same period. This result is statistically significant at the 1% level as well as economically significant (the average growth rate of cases was 23% per day during this period). Specifications (2) and (3) show that the rate of change is, on average, about 6 basis points higher for every 1% increase in attendance and venue capacity, respectively. Both results are significant at the 1% level. These results are consistent with the documented incubation period of the virus and the lack of massive testing during the sample period.
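To make the role of the country times day fixed effects concrete, the following is a deliberately simplified sketch of a within-group estimator. It is not the specification in regression (1): it uses a single regressor, omits the lagged dependent variable, the controls and the other weekly lags, and does not compute clustered standard errors. The function name `within_slope` and all numbers are hypothetical, and the data are constructed so that the true slope is 0.055.

```python
from collections import defaultdict

def within_slope(rows):
    """Slope of y on x after removing group means (group = country-day).

    `rows` is a list of (group, x, y) tuples. With a single regressor,
    demeaning x and y within each group and running OLS without an
    intercept gives the same slope as including one dummy per group.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for group, x, y in rows:
        s = sums[group]
        s[0] += x
        s[1] += y
        s[2] += 1
    sxy = sxx = 0.0
    for group, x, y in rows:
        sum_x, sum_y, n = sums[group]
        x_dev = x - sum_x / n
        y_dev = y - sum_y / n
        sxy += x_dev * y_dev
        sxx += x_dev * x_dev
    return sxy / sxx

# Hypothetical region-day rows: (country-day group, game dummy two weeks
# earlier, daily log growth in cases). Each group has its own baseline
# growth; regions with a game grow 0.055 faster by construction.
rows = [
    (("IT", 1), 1, 0.355),
    (("IT", 1), 0, 0.300),
    (("ES", 1), 1, 0.155),
    (("ES", 1), 0, 0.100),
    (("ES", 1), 0, 0.100),
]
slope = within_slope(rows)
```

Because the comparison is made within each country-day group, differences in baseline growth across countries or days do not affect the estimated slope.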
Table 2

Regression of Change in Cases on Weekly Lagged Games, Attendance and Capacity

This table reports the coefficients from the following regression:

$$\Delta \text{Log}(1+\text{Cases}_{r,t}) = a + b\,\Delta \text{Log}(1+\text{Cases}_{r,t-1}) + \sum_{w=1}^{6} c_w\, WX_{r,t,w} + d'\,\text{Controls}_r + FE + \varepsilon_{r,t}$$

$\Delta \text{Log}(1+\text{Cases}_{r,t})$ represents the (log) difference between 1 plus the number of cases in region r on day t and on day t−1. Likewise, $\Delta \text{Log}(1+\text{Cases}_{r,t-1})$ is the same variable lagged 1 day. For every lagged week w = {1, 2, …, 6} and region r, the variable $WX_{r,t,w}$ represents, alternatively: the dummy variable I_Games, which takes a value of one if there was a soccer match in the region on any day between t − 7w and t − (1 + 7(w − 1)); the natural logarithm of 1 plus the match attendance accumulated over that week, that is, the difference in accumulated Attendance between the end and the start of the week; or the natural logarithm of 1 plus the venue capacity accumulated over that week, computed in the same way from accumulated Capacity. We control for each region's Population, Density and Gross Regional Product per capita (GRP). FE represents country times day fixed effects. Appendix A includes the definition and source of each variable. Standard errors (in parentheses) are clustered at the region level. ***, **, * represent statistical significance at the 1, 5, and 10% level, respectively.

| | I_Games (1) | Log(1+Attendance) (2) | Log(1+Capacity) (3) |
|---|---|---|---|
| ΔLog(1 + Cases_{t−1}) | 0.056 | 0.056 | 0.055 |
| | (0.028)** | (0.028)** | (0.028)** |
| Log(Population) | 0.027 | 0.027 | 0.028 |
| | (0.007)*** | (0.007)*** | (0.007)*** |
| Log(Density) | -0.002 | -0.002 | -0.002 |
| | (0.006) | (0.006) | (0.006) |
| Log(GRP) | 0.049 | 0.049 | 0.048 |
| | (0.025)** | (0.025)* | (0.025)** |
| Lagged week 1 (c1) | -0.028 | 0.000 | -0.003 |
| | (0.021) | (0.002) | (0.002) |
| Lagged week 2 (c2) | 0.055 | 0.006 | 0.005 |
| | (0.02)*** | (0.002)*** | (0.002)*** |
| Lagged week 3 (c3) | -0.016 | -0.003 | -0.001 |
| | (0.025) | (0.002) | (0.002) |
| Lagged week 4 (c4) | -0.015 | -0.001 | -0.001 |
| | (0.02) | (0.002) | (0.002) |
| Lagged week 5 (c5) | -0.004 | -0.001 | 0.000 |
| | (0.022) | (0.002) | (0.002) |
| Lagged week 6 (c6) | -0.012 | -0.002 | -0.002 |
| | (0.022) | (0.002) | (0.002) |
| Country × Day FE | Y | Y | Y |
| R-sq | 0.180 | 0.180 | 0.181 |
| Number of Obs. | 2,073 | 2,073 | 2,073 |
| Number of Regions | 194 | 194 | 194 |
Finally, we test if our results change when we include venues with smaller minimum capacity. There is evidence of the role played by large gatherings of people in the dissemination of the virus. These are known as “super-spreader” events (e.g., Dave et al. (2020) and Felbermayr, Hinz, and Chowdhry (2020)). To test the importance of the minimum venue capacity, we expand the sample to include games that took place in venues with a minimum capacity of 10,000 spectators. The extended sample includes 2,314 games. Table 3 presents the results of regression (1) when we consider games held in venues with a minimum capacity of 20, 15 and 10 thousand spectators, respectively, for each of the three soccer variables.
As in Table 2, the daily increment in the number of COVID-19 cases is positively and significantly related to the increase in cases the previous day. It is also higher in more populated and wealthier (higher Log(GRP)) areas. When we include stadiums with a minimum capacity of 20,000 spectators, the rate of change in the number of COVID-19 cases relative to the previous day is, on average, higher by 4.2 percentage points in regions where there was a soccer game two weeks earlier relative to regions with no games in the same period. This is lower than the 5.5 percentage point difference in Table 2. The coefficient is significant at the 5% level (down from 1% in Table 2). The Attendance and Capacity variables show a similar qualitative pattern. However, when we lower the minimum capacity to 15,000 spectators, the coefficient is not statistically different from zero for any of the three variables (it is only marginally significant, at the 10% level, for Attendance). These results are confirmed when the minimum capacity is lowered to 10,000 spectators.
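As a back-of-the-envelope reading of these magnitudes (a sketch, assuming case counts are large enough that Log(1 + Cases) is close to Log(Cases)), the dependent variable approximates the daily log growth of cases, so a coefficient on the game dummy converts into a difference in daily growth as follows:

```latex
\Delta \log(1 + \mathit{Cases}_{t}) \approx \log\frac{\mathit{Cases}_{t}}{\mathit{Cases}_{t-1}},
\qquad e^{0.042} - 1 \approx 0.043,
\qquad e^{0.055} - 1 \approx 0.057 .
```

Under this approximation, the coefficients of 0.042 and 0.055 correspond to daily growth that is roughly 4.3% and 5.7% higher, close to the log-point figures quoted in the text.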
Table 3

Regression of Change in Cases on Weekly Lagged Games, Attendance and Capacity Sorted by minimum venue Capacity (below 25K spectators)

This table reports the coefficients from regression (1).

ΔLog(1 + Cases_t) represents the (log) difference between 1 plus the number of cases in region r on day t with respect to day t−1. Likewise, ΔLog(1 + Cases_{t−1}) is the same variable lagged 1 day. For every lagged week w = {1, 2, …, 6} and region r, the variable WX represents, alternatively, the dummy variable that takes a value of one if there was a soccer match in the region on any day between t − 7 × w and t − (1 + 7 × (w − 1)); the natural logarithm of 1 plus the number of match attendants accumulated over the week, Log(1 + Attendance); or the natural logarithm of 1 plus the venue capacity accumulated over the week, Log(1 + Capacity). We control for each region's Population, Density, and Gross Regional Product per capita (GRP). FE represents country times day fixed effects. Appendix A includes the definition and source of each variable. >20K, >15K, and >10K represent the minimum capacity of venues included in the sample. Standard errors (in parentheses) are clustered at the region level. ***, **, * represent statistical significance at the 1, 5, and 10% level, respectively.

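| Variable | I_Games >20K | I_Games >15K | I_Games >10K | Log(1+Attendance) >20K | Log(1+Attendance) >15K | Log(1+Attendance) >10K | Log(1+Capacity) >20K | Log(1+Capacity) >15K | Log(1+Capacity) >10K |
|---|---|---|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) |
| ΔLog(1 + Cases_{t−1}) | 0.059 | 0.059 | 0.058 | 0.059 | 0.059 | 0.060 | 0.059 | 0.059 | 0.059 |
|  | (0.028)** | (0.029)** | (0.029)** | (0.028)** | (0.029)** | (0.029)** | (0.028)** | (0.029)** | (0.029)** |
| Log(Population) | 0.025 | 0.022 | 0.017 | 0.025 | 0.022 | 0.017 | 0.026 | 0.023 | 0.019 |
|  | (0.007)*** | (0.007)*** | (0.007)** | (0.007)*** | (0.007)*** | (0.008)** | (0.007)*** | (0.008)*** | (0.008)** |
| Log(Density) | -0.002 | -0.003 | -0.005 | -0.003 | -0.003 | -0.004 | -0.001 | -0.002 | -0.004 |
|  | (0.007) | (0.006) | (0.006) | (0.007) | (0.006) | (0.006) | (0.007) | (0.006) | (0.006) |
| Log(GRP) | 0.048 | 0.052 | 0.057 | 0.050 | 0.050 | 0.053 | 0.049 | 0.051 | 0.054 |
|  | (0.025)* | (0.025)** | (0.025)** | (0.025)** | (0.025)** | (0.025)** | (0.025)** | (0.025)** | (0.025)** |
| Lagged week 1 (c1) | 0.003 | 0.014 | 0.007 | 0.002 | 0.002 | 0.003 | 0.000 | 0.001 | 0.000 |
|  | (0.021) | (0.021) | (0.02) | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) |
| Lagged week 2 (c2) | 0.042 | 0.008 | -0.012 | 0.005 | 0.004 | 0.001 | 0.004 | 0.001 | -0.001 |
|  | (0.02)** | (0.022) | (0.021) | (0.002)** | (0.002)* | (0.002) | (0.002)** | (0.002) | (0.002) |
| Lagged week 3 (c3) | -0.021 | 0.003 | 0.019 | -0.003 | -0.001 | 0.000 | -0.002 | 0.000 | 0.001 |
|  | (0.029) | (0.027) | (0.025) | (0.002) | (0.002) | (0.002) | (0.003) | (0.003) | (0.002) |
| Lagged week 4 (c4) | -0.030 | -0.042 | -0.031 | -0.002 | -0.004 | -0.004 | -0.003 | -0.004 | -0.003 |
|  | (0.019) | (0.02)** | (0.024) | (0.002) | (0.002)* | (0.002)* | (0.002) | (0.002)* | (0.002) |
| Lagged week 5 (c5) | -0.020 | -0.022 | -0.011 | -0.002 | -0.002 | -0.001 | -0.002 | -0.002 | -0.001 |
|  | (0.024) | (0.024) | (0.023) | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) | (0.002) |
| Lagged week 6 (c6) | 0.014 | 0.045 | 0.060 | 0.000 | 0.001 | 0.005 | 0.001 | 0.003 | 0.005 |
|  | (0.024) | (0.027)* | (0.028)** | (0.002) | (0.003) | (0.003)* | (0.002) | (0.003) | (0.003)* |
| Country × Day FE | Y | Y | Y | Y | Y | Y | Y | Y | Y |
| R-sq | 0.178 | 0.178 | 0.179 | 0.179 | 0.178 | 0.178 | 0.178 | 0.177 | 0.177 |
| Number of Obs. | 2,073 | 2,073 | 2,073 | 2,073 | 2,073 | 2,073 | 2,073 | 2,073 | 2,073 |
| Number of Regions | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 | 194 |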
We interpret these results as consistent with the evidence from other super-spreader events: a minimum agglomeration of people is needed for the spread of the virus to be statistically detectable.

Limitations of the analysis

In this section, we discuss some limitations of our analysis. First, our regressions explain, on average, only 18% of the change in daily cases. Thus, the coefficients on the soccer variables should be interpreted cross-sectionally: they help explain differences in the incidence of COVID-19 across regions in the early stages of the pandemic, rather than the absolute number of contagions within each region. Furthermore, relative to our sample period, people's awareness has increased and governments around the world have taken measures to promote public hygiene and social distancing. We would now expect any public gathering or mass event to result in much less COVID-19 spreading. For this reason, using soccer games as an instrumental variable is only applicable during the outbreak of the pandemic across Europe in March 2020. This limitation is shared by other studies based on large gatherings, like motorcycle rallies and ski resorts, mentioned in the Introduction. Unlike these events, however, soccer competitions have two advantages as an instrument. First, they take place across several countries, hence expanding the sample size considerably. Second, the games are staggered through the first quarter of 2020, in contrast with other mass events like Carnival celebrations, which take place almost simultaneously across Europe in the same period. Finally, another limitation is that people might also have caught the coronavirus in bars where soccer matches were broadcast, without being physically present at the match venue. To assess the impact of this indirect channel of contagion, we perform the following exercise. For every game in our sample, we replicate Table 2 but consider the spread of cases in the region when a local team plays outside the region. 
In this case, we might expect an increase in bar attendance in the region of the local team but no mass gathering of people, as we predict in the region where the game is actually played.12 That is, in regression (1), for every lagged week w = {1, 2, …, 6} and region r, the variable WX now represents, alternatively, the dummy variable that takes a value of one if there was a soccer match in which a team from region r played outside that region on any day between t − 7 × w and t − (1 + 7 × (w − 1)); the natural logarithm of 1 plus the accumulated number of match attendants at those games, Log(1 + Attendance); or the natural logarithm of 1 plus the accumulated venue capacity of those games, Log(1 + Capacity). We include the same set of controls as in equation (1). Standard errors are clustered at the region level. Results are reported in Table 4. Even accounting for the impact of cross-border movements of fans, a game in which a local team plays outside the region has no significant effect on the virus spread in the region, regardless of the venue attendance or capacity.
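The selection of games for this placebo exercise can be sketched as follows. The record layout (`venue_region`, `team_regions`) and the function name `away_game_dates` are hypothetical, chosen only to illustrate the idea of keeping, for each region, the games its teams played elsewhere.

```python
import datetime as dt

def away_game_dates(games, region):
    """Dates on which a team based in `region` played at a venue outside it."""
    return sorted(
        g["date"]
        for g in games
        if region in g["team_regions"] and g["venue_region"] != region
    )

# Toy fixtures (hypothetical): regions A, B, and C.
games = [
    {"date": dt.date(2020, 3, 1), "venue_region": "A", "team_regions": ("A", "B")},
    {"date": dt.date(2020, 3, 7), "venue_region": "B", "team_regions": ("B", "A")},
    {"date": dt.date(2020, 3, 8), "venue_region": "C", "team_regions": ("C", "C")},
]

away_a = away_game_dates(games, "A")
away_b = away_game_dates(games, "B")
away_c = away_game_dates(games, "C")
print(away_a, away_b, away_c)
```

The resulting dates would then enter the same weekly lag windows as the home games; a derby between two teams from the same region (region C above) produces no away game.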
Table 4

Regression of Change in Cases on Weekly Lagged Games, Attendance and Capacity when a Regional Local Team Plays in a Different Region

This table reports the coefficients from regression (1).

ΔLog(1 + Cases_t) represents the (log) difference between 1 plus the number of cases in region r on day t with respect to day t−1. Likewise, ΔLog(1 + Cases_{t−1}) is the same variable lagged 1 day. For every lagged week w = {1, 2, …, 6} and region r, the variable WX represents, alternatively, the dummy variable that takes a value of one if there was a soccer match in which a local team from region r played outside that region on any day between t − 7 × w and t − (1 + 7 × (w − 1)); the natural logarithm of 1 plus the accumulated number of match attendants at those games, Log(1 + Attendance); or the natural logarithm of 1 plus the accumulated venue capacity of those games, Log(1 + Capacity). We control for each local region's Population, Density, and Gross Regional Product per capita (GRP). FE represents country times day fixed effects. Appendix A includes the definition and source of each variable. Standard errors (in parentheses) are clustered at the region level. ***, **, * represent statistical significance at the 1, 5, and 10% level, respectively.

| Variable | I_Games | Log(1+Attendance) | Log(1+Capacity) |
|---|---|---|---|
|  | (1) | (2) | (3) |
| ΔLog(1 + Cases_{t−1}) | 0.058 | 0.058 | 0.057 |
|  | (0.029)** | (0.029)** | (0.029)** |
| Log(Population) | 0.031 | 0.029 | 0.031 |
|  | (0.007)*** | (0.007)*** | (0.007)*** |
| Log(Density) | 0.000 | -0.001 | 0.000 |
|  | (0.006) | (0.006) | (0.006) |
| Log(GRP) | 0.049 | 0.050 | 0.049 |
|  | (0.024)** | (0.024)** | (0.024)** |
| Lagged week 1 (c1) | -0.022 | -0.002 | -0.002 |
|  | (0.016) | (0.002) | (0.001) |
| Lagged week 2 (c2) | -0.013 | 0.000 | -0.001 |
|  | (0.016) | (0.002) | (0.002) |
| Lagged week 3 (c3) | -0.002 | -0.002 | 0.000 |
|  | (0.016) | (0.002) | (0.001) |
| Lagged week 4 (c4) | 0.021 | 0.002 | 0.002 |
|  | (0.015) | (0.001) | (0.001) |
| Lagged week 5 (c5) | -0.014 | -0.001 | -0.001 |
|  | (0.016) | (0.002) | (0.001) |
| Lagged week 6 (c6) | -0.016 | -0.001 | -0.002 |
|  | (0.015) | (0.001) | (0.001) |
| Country × Day FE | Y | Y | Y |
| R-sq | 0.178 | 0.178 | 0.178 |
| Number of Obs. | 2,073 | 2,073 | 2,073 |
| Number of Regions | 194 | 194 | 194 |

Conclusions and implications

The evidence about the soccer variables introduced in this paper may help overcome potential endogeneity issues in the analysis of how the spread of COVID-19 has affected the economy and firm decisions. Despite the limited time span (March 2020) of these variables, the impact of the COVID-19 pandemic is so deep and unprecedented that we believe this analysis is relevant. Gómez and Mironov (2020), for instance, show that only after instrumenting the number of COVID-19 cases with the soccer variables is there evidence of a causal relation between the propagation of the virus and the cross-section of stock returns of firms headquartered in these regions. The accumulated drop in stock performance during March and April 2020 is significantly larger for firms in regions with a higher incidence of (instrumented) COVID-19, but only when the company's CEO is older than 60 years. The existing evidence shows that older people are more likely to suffer severe illness or even death in case of contagion. Thus, the market appears to be discounting the likelihood of the company's CEO dying of COVID-19. These instruments could also be used to analyze the causal effect of the virus on the drop in regional gross product or employment, or on corporate variables like revenue, cash holdings, dividends, investments, inventories, and accounts payable, as more data become available.
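To illustrate why instrumenting matters, the following sketch compares a naive OLS slope with a simple instrumental-variable estimate on synthetic data. Everything in it is an assumption made for illustration: the data are simulated, the variable `z` merely stands in for a soccer variable, and the single-instrument ratio estimator is not the specification used in Gómez and Mironov (2020).

```python
import random

def cov(a, b):
    """Sample covariance of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

random.seed(0)
n = 20_000
true_effect = 0.5  # assumed causal effect of cases on the outcome (synthetic)

z = [random.gauss(0, 1) for _ in range(n)]  # instrument (stand-in for a soccer variable)
u = [random.gauss(0, 1) for _ in range(n)]  # unobserved confounder (e.g., connectivity)
# Endogenous regressor (cases): driven by the instrument and by the confounder.
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
# Outcome: depends on cases and, directly, on the confounder.
y = [true_effect * xi + 2 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

ols = cov(x, y) / cov(x, x)  # biased, because x is correlated with u
iv = cov(z, y) / cov(z, x)   # IV ratio estimator; equals 2SLS with one instrument
print(round(ols, 3), round(iv, 3))
```

In this simulation the instrument affects the outcome only through `x` by construction; with real data, that exclusion restriction has to be argued rather than assumed.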

Declarations of Competing Interest

None.
Table A

Variables definition and source

Main variables

| Variable | Definition and source |
|---|---|
| Cases | Accumulated number of COVID-19 diagnosed cases per region, from the sources listed below. |
| Cases/Population | Accumulated number of COVID-19 diagnosed cases per million inhabitants per region. |
| # Games | Accumulated number of soccer matches per region. Collected from the website https://www.thesportsman.com/football |
| I_Games | A dummy variable that takes a value of 1 if there was a soccer match in the region where the firm is located, zero otherwise. |
| Attendance | Accumulated number of attendants to all soccer matches in each region. Various websites, including www.footlive.com, www.azscore.com, www.soccerway.com, www.fbref.com, and www.sofascore.com. |
| Capacity | Accumulated maximum capacity in all venues with a minimum capacity of 25,000 spectators that hosted soccer matches per region. Retrieved from the website: https://en.wikipedia.org/wiki/List_of_European_stadiums_by_capacity. |

Sources for Cases

| Country | Agency/Website | Country | Agency/Website |
|---|---|---|---|
| Belgium | Epistat | Poland | Serwis Rzeczypospolitej Polskiej |
| France | Santé Publique France | Spain | Instituto de Salud Carlos III |
| Italy | Dipartimento della Protezione Civile | Sweden | Folkhalsomyndigheten |
| Germany | Robert Koch Institute | Switzerland | FOPH |
| The Netherlands | RIVM | UK | GOV.UK |

Demographic variables

| Variable | Definition |
|---|---|
| Population | Thousands of inhabitants in the region in 2018 |
| Density | Thousands of inhabitants per square kilometer in the region in 2018 |
| GRP | Gross Regional Product: USD per capita in 2018 |

| Country | Agency/Website | Country | Agency/Website |
|---|---|---|---|
| Belgium | NBB.Stat | Poland | Statistics Poland |
| France | INED | Spain | INE |
| Italy | ISTAT | Sweden | SCB |
| Germany | DESTATIS | Switzerland | FSO |
| The Netherlands | CBS | UK | ONS |
  3 in total

1.  High population densities catalyse the spread of COVID-19.

Authors:  Joacim Rocklöv; Henrik Sjödin
Journal:  J Travel Med       Date:  2020-05-18       Impact factor: 8.490

2.  Identification of critical airports for controlling global infectious disease outbreaks: Stress-tests focusing in Europe.

Authors:  Paraskevas Nikolaou; Loukas Dimitriou
Journal:  J Air Transp Manag       Date:  2020-04-10

3.  The contagion externality of a superspreading event: The Sturgis Motorcycle Rally and COVID-19.

Authors:  Dhaval Dave; Drew McNichols; Joseph J Sabia
Journal:  South Econ J       Date:  2020-12-02