Francesco Bloise, Massimiliano Tancioni.
Abstract
We exploit the provincial variability of COVID-19 cases registered in Italy to select the territorial predictors of the pandemic. Absent an established theoretical diffusion model, we apply machine learning to isolate, among 77 potential predictors, those that minimize the out-of-sample prediction error. We first estimate the model considering cumulative cases registered before the containment measures displayed their effects (i.e. at the peak of the epidemic in March 2020), then cases registered between the peak date and when containment measures were relaxed in early June. In the first estimate, the results highlight the dominance of factors related to the intensity and interactions of economic activities. In the second, the relevance of these variables is greatly reduced, suggesting mitigation of the pandemic following the lockdown of the economy. Finally, by considering cases at the onset of the "second wave", we confirm that the territorial distribution of the epidemic is associated with economic factors.
Keywords: COVID-19; Coronavirus; Economic networks; Epidemic; Machine learning; Economic structure
Year: 2021 PMID: 35317020 PMCID: PMC7994006 DOI: 10.1016/j.strueco.2021.01.001
Source DB: PubMed Journal: Struct Chang Econ Dyn ISSN: 0954-349X
Fig. 1. Geographical distribution of COVID-19 cumulative cases per 100,000 inhabitants.
Description of economic activity predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Employment rate | Employed people over provincial population | ISTAT (2017) |
| Unemployment rate | Percentage of active provincial population aged 15-74 who are unemployed | ISTAT (2019) |
| Percentage of employment in agriculture | Percentage of total employees who work in agriculture activities | ISTAT (2017) |
| Percentage of employment in manufacturing | Percentage of total employees who work in manufacturing activities | ISTAT (2017) |
| Percentage of employment in services | Percentage of total employees employed in service activities | ISTAT (2017) |
| Percentage of self-employed workers | Percentage of provincial workers who are self-employed | ISTAT (2011) |
| Value added per employee | Value added in euro per employee (productivity) | ISTAT (2017) |
| Value added per capita | Value added in euro per capita | ISTAT (2017) |
| Value added per capita - Agriculture | Value added of agriculture in euro per resident | ISTAT (2017) |
| Value added per capita - Manufacturing | Value added of manufacturing in euro per resident | ISTAT (2017) |
| Value added per capita - Services | Value added of services in euro per resident | ISTAT (2017) |
| Poverty rate | Percentage of taxpayers declaring less than 10,000 euro in 2018 | Italian Ministry of Economy and Finance (2019) |
| Firm density | Number of firms per km² | ISTAT (2017) |
| Firm size | Average number of employees per firm | ISTAT (2017) |
| Percentage of employment in industrial districts | Percentage of total employees who work in industrial districts | ISTAT (2017) |
| Intensity of export relationships | Average number of areas of the world (e.g. Europe, BRICs, rest of the world) where firms export their products | ISTAT (2018) |
| Unloaded goods in the local harbours | Tons of goods unloaded in the local harbours per inhabitant | ISTAT (2018) |
| Cattle density | Number of livestock units per km² | ISTAT (2010) |
| Density of firms producing animal-derived products | Number of firms producing goods derived from animal products per km² | Italian Ministry of Health (2018) |
Description of education system predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Percentage of compulsory school students | Percentage of compulsory school students over total population. | ISTAT (2018) |
| Percentage of high-school graduates | Percentage of high school graduates over total population | ISTAT (2018) |
| Percentage of people below upper secondary education | Percentage of people below upper secondary education | ISTAT (2011) |
| Percentage of pre-school students | Percentage of provincial students enrolled in pre-school | ISTAT (2018) |
| Percentage of students | Percentage of students over provincial population | ISTAT (2018) |
| Percentage of tertiary graduates | Percentage of provincial population with a tertiary degree | ISTAT (2011) |
| Percentage of university students | Percentage of provincial students who are enrolled in universities | ISTAT (2018) |
Fig. 2. Estimated correlations between log cumulative cases per 100,000 inhabitants and selected covariates as of March 21, 2020.
Elastic net regression of log cumulative cases per 100,000 inhabitants on March 21, 2020.
| | Baseline |
|---|---|
| Distance from the first outbreak: less than or equal to 50 km | 0.496 |
| Value-added per employee | 0.244 |
| Intensity of export relationships | 0.188 |
| Nr. of frost days in a year | 0.161 |
| Mortality from infectious diseases | 0.093 |
| PM10 | 0.072 |
| Employment rate | 0.071 |
| Percentage of employment in manufacturing | 0.048 |
| Average family members | -0.028 |
| Percentage of employment in agriculture | -0.150 |
| Observations | 107 |
| α | 0.333 |
| λ | 0.393 |
| MSE (hold-out sample) | 0.527 |
| Nr. of α | 10 |
| Nr. of λ | 50 |
Constant terms and unselected predictors are not shown. The (α, λ) combination of tuning parameters was selected on the 80% training sample using 5-fold cross-validation. Out-of-sample predictive performance is tested on the remaining 20% of observations.
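To make the estimator behind this table concrete, the following is a minimal sketch of elastic net fitted by coordinate descent. It assumes the common penalty form with a mixing parameter `alpha` and a penalty strength `lam`; the source does not state the software or the exact scaling used, so the objective, the function names, and the toy data are illustrative assumptions, not the authors' implementation.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator that produces exact zeros (the L1 part)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, alpha, lam, n_iter=500):
    """Coordinate descent for the (assumed) objective
        (1/2n) * ||y - Xb||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    X is a list of rows. Columns and y are assumed centred (no intercept),
    and no column may be identically zero."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residuals y - Xb; equal to y because b starts at zero
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Covariance of column j with the residual that excludes b[j]
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, lam * alpha) / (col_ss[j] + lam * (1 - alpha))
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
    return b


# Toy data (made up): the first column drives y, the second is irrelevant
X_toy = [[-1.0, 0.5], [0.0, -1.0], [1.0, 0.5]]
y_toy = [-2.0, 0.0, 2.0]
print(elastic_net(X_toy, y_toy, alpha=1.0, lam=0.5))
```

With `alpha=1` the penalty reduces to the lasso and with `alpha=0` to ridge regression; intermediate values shrink coefficients while still setting some exactly to zero, which is why unselected predictors can be omitted from the table.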
Elastic net regression of log cumulative cases per 100,000 inhabitants: pre- and post-containment measures.
| | March 21 (Model 1) | March 22-June 3 (Model 2) | March 22-June 3 (Model 3) |
|---|---|---|---|
| Distance from the first outbreak: less than or equal to 50 km | 0.000 | 0.149 | 0.000 |
| Distance from the first outbreak: between 51 km and 100 km | 0.496 | 0.141 | 0.059 |
| Value-added per employee | 0.244 | 0.125 | 0.000 |
| Intensity of export relationships | 0.188 | 0.107 | 0.091 |
| Mean altitude of the province | 0.000 | 0.085 | 0.056 |
| Frost days in a year | 0.161 | 0.111 | 0.000 |
| Mortality from infectious diseases | 0.093 | 0.081 | 0.000 |
| Percentage of hospital beds of the elderly | 0.000 | 0.048 | 0.000 |
| Municipality density | 0.000 | 0.046 | 0.066 |
| Mortality rate from pneumonia | 0.000 | 0.000 | 0.052 |
| Average hospital size | 0.000 | 0.040 | 0.000 |
| Foggy days in a year | 0.000 | 0.038 | 0.000 |
| PM10 | 0.072 | 0.036 | 0.000 |
| NO2 | 0.000 | 0.019 | 0.000 |
| Mortality rate | 0.000 | 0.000 | 0.010 |
| Employment rate | 0.071 | 0.000 | 0.000 |
| Percentage of employment in manufacturing | 0.048 | 0.000 | 0.000 |
| Hot days in a year | 0.000 | -0.061 | 0.000 |
| Percentage of families with 5 or more members | 0.000 | -0.065 | -0.077 |
| Average family members | -0.028 | -0.093 | -0.064 |
| Percentage of employment in agriculture | -0.150 | -0.088 | -0.028 |
| Hours of continuity health care services per capita | 0.000 | -0.111 | 0.000 |
| Log cases per 100,000 people on March 21 | Not included | Not included | 0.493 |
| Observations | 107 | 107 | 107 |
Constant terms and predictors that are not selected in any of the three models are not shown.
Regression of log cumulative cases per 100,000 inhabitants on March 21, 2020: Full results.
| | Elastic net | OLS | | |
|---|---|---|---|---|
| | Coefficient | Coefficient | S.E. | P-value |
| Distance from the first outbreak: less than or equal to 50 km | 0.496 | 2.066 | 1.212 | 0.098 |
| Value-added per employee | 0.244 | 0.380 | 0.985 | 0.702 |
| Intensity of export relationships | 0.188 | -0.121 | 0.255 | 0.637 |
| Nr. of frost days in a year | 0.161 | 0.576 | 0.279 | 0.047 |
| Mortality from infectious diseases | 0.093 | 0.153 | 0.154 | 0.328 |
| Concentration of PM10 | 0.072 | 0.244 | 0.305 | 0.430 |
| Employment rate | 0.071 | -0.005 | 1.403 | 0.997 |
| Percentage of employment in manufacturing | 0.048 | Omitted category | | |
| Percentage of employment in industrial districts | 0.000 | 0.068 | 0.183 | 0.710 |
| Percentage of employment in services | 0.000 | 0.234 | 0.433 | 0.593 |
| Percentage of workers who are self-employed | 0.000 | -0.011 | 0.139 | 0.937 |
| Hospital beds per capita | 0.000 | -0.273 | 0.259 | 0.299 |
| Percentage of hospital beds in private clinics | 0.000 | -0.147 | 0.193 | 0.450 |
| Percentage of hospital beds for the elderly | 0.000 | -0.012 | 0.122 | 0.920 |
| Average firm size | 0.000 | 0.179 | 0.287 | 0.537 |
| Average hospital size | 0.000 | 0.207 | 0.193 | 0.293 |
| Population density | 0.000 | -0.122 | 1.835 | 0.947 |
| Municipality density | 0.000 | -0.101 | 0.270 | 0.711 |
| Hospital density | 0.000 | 0.110 | 1.059 | 0.918 |
| Firm density | 0.000 | -0.099 | 1.177 | 0.933 |
| Percentage of tertiary graduates | 0.000 | 0.394 | 0.306 | 0.207 |
| Percentage of high-school graduates | 0.000 | -0.300 | 0.235 | 0.211 |
| Percentage of people below upper secondary education | 0.000 | Omitted category | | |
| Nr. of flights passengers per capita | 0.000 | 0.020 | 0.146 | 0.894 |
| Percentage of passengers from international locations | 0.000 | 0.042 | 0.154 | 0.786 |
| Nr. of public transport passengers per capita | 0.000 | -0.222 | 0.320 | 0.493 |
| Nr. of public transport seats per km/resident | 0.000 | 0.244 | 0.341 | 0.478 |
| Car density | 0.000 | -0.142 | 0.208 | 0.498 |
| Mortality rate for respiratory diseases | 0.000 | 0.052 | 0.236 | 0.827 |
| Mortality rate for pneumonia | 0.000 | -0.341 | 0.274 | 0.223 |
| Mortality rate | 0.000 | -0.333 | 0.403 | 0.415 |
| Percentage of students | 0.000 | -0.300 | 0.256 | 0.251 |
| Percentage of university students | 0.000 | Omitted category | | |
| Percentage of compulsory school students | 0.000 | -0.237 | 0.273 | 0.391 |
| Percentage of pre-school students | 0.000 | 0.078 | 0.273 | 0.778 |
| Value-added per capita | 0.000 | -3.255 | 2.991 | 0.284 |
| Poverty rate | 0.000 | 0.546 | 0.849 | 0.525 |
| Percentage of families with 5 or more members | 0.000 | 0.035 | 0.468 | 0.941 |
| Percentage of males | 0.000 | -0.168 | 0.158 | 0.295 |
| Average age of the population | 0.000 | 0.293 | 1.199 | 0.808 |
| Percentage of immigrants | 0.000 | 0.572 | 0.361 | 0.123 |
| Percentage of people aged 65 or more | 0.000 | -0.007 | 1.131 | 0.995 |
| Concentration of PM2.5 | 0.000 | -0.395 | 0.307 | 0.206 |
| Nr. of foggy days in a year | 0.000 | 0.036 | 0.194 | 0.853 |
| Concentration of NO2 | 0.000 | 0.112 | 0.230 | 0.628 |
| Nr. of windy days in a year | 0.000 | 0.007 | 0.195 | 0.972 |
| Nr. of sunny days in a year | 0.000 | 0.236 | 0.417 | 0.576 |
| Nr. of hot days in a year | 0.000 | -0.230 | 0.175 | 0.198 |
| Nr. of rainy days in a year | 0.000 | 0.299 | 0.270 | 0.276 |
| Percentage of people who live close to a train station | 0.000 | -0.167 | 0.171 | 0.337 |
| Mean altitude of the province | 0.000 | 0.700 | 0.273 | 0.015 |
| Altitude of the province capital | 0.000 | -0.545 | 0.230 | 0.024 |
| Unemployment rate | 0.000 | 0.396 | 0.231 | 0.096 |
| Commuters as a share of the population | 0.000 | 0.282 | 0.272 | 0.307 |
| Percentage of commuters outside their municipality of residence | 0.000 | -0.075 | 0.290 | 0.797 |
| Unloaded goods in the local harbours per capita | 0.000 | 0.180 | 0.170 | 0.299 |
| Percentage of commuters who use a private vehicle | 0.000 | 0.077 | 0.179 | 0.672 |
| Agriculture value-added per capita | 0.000 | 0.264 | 0.188 | 0.170 |
| Services value-added per capita | 0.000 | 1.676 | 1.150 | 0.154 |
| Manufacturing value-added per capita | 0.000 | 1.889 | 0.979 | 0.062 |
| Yearly ship passengers arriving in the local harbours over total population | 0.000 | 0.055 | 0.125 | 0.661 |
| Yearly registered visitors in accommodation facilities as a percentage of the population | 0.000 | 0.163 | 0.212 | 0.448 |
| People that actually live in the province as a percentage of residents | 0.000 | -0.088 | 0.164 | 0.597 |
| Percentage of people who live close to the sea | 0.000 | -0.217 | 0.135 | 0.116 |
| Cattle density | 0.000 | 0.138 | 0.106 | 0.204 |
| Firm density (derived from animal products) | 0.000 | -0.252 | 0.273 | 0.362 |
| General practitioner per capita | 0.000 | 0.187 | 0.294 | 0.529 |
| Hours of continuity health care services per capita | 0.000 | -0.151 | 0.255 | 0.558 |
| Cases handled by the medical homecare as a share of the population | 0.000 | -0.056 | 0.145 | 0.700 |
| Clinic density | 0.000 | 0.362 | 0.306 | 0.246 |
| Degree of provincial interconnection | 0.000 | 0.185 | 0.288 | 0.525 |
| Distance from the first outbreak: between 51 km and 100 km | 0.000 | 0.924 | 1.101 | 0.407 |
| Distance from the first outbreak: between 101 km and 300 km | 0.000 | 0.694 | 0.903 | 0.448 |
| Distance from the first outbreak: between 301 km and 500 km | 0.000 | 0.542 | 0.714 | 0.453 |
| Distance from the first outbreak: more than 500 km | 0.000 | Omitted category | | |
| Average family members | -0.028 | -0.104 | 0.562 | 0.855 |
| Percentage of employment in agriculture | -0.150 | -0.168 | 0.313 | 0.596 |
| MSE (hold-out sample) | 0.578 | 5.763 | | |
| MSE (training sample) | 0.463 | 0.045 | | |
Fig. A.2. Post-selection inference.
Elastic net regression of log cumulative cases per 100,000 inhabitants on March 21, 2020: sensitivity to the inclusion of regional dummies.
| | Baseline | Including regional dummies |
|---|---|---|
| Distance from the first outbreak: less than or equal to 50 km | 0.496 | 0.495 |
| Value-added per employee | 0.244 | 0.244 |
| Intensity of export relationships | 0.188 | 0.188 |
| Frost days in a year | 0.161 | 0.160 |
| Mortality from infectious diseases | 0.093 | 0.092 |
| PM10 | 0.072 | 0.072 |
| Employment rate | 0.071 | 0.071 |
| Percentage of employment in manufacturing | 0.048 | 0.048 |
| Average family members | -0.028 | -0.028 |
| Percentage of employment in agriculture | -0.150 | -0.150 |
| Observations | 107 | 107 |
Constant terms and predictors that are not selected in either of the two models are not shown.
Elastic net regression of log cumulative cases per 100,000 inhabitants on March 21, 2020, excluding predictors related to the geographical location of the first outbreak.
| | Baseline | Excluding distance |
|---|---|---|
| Distance from the first outbreak: less than or equal to 50 km | 0.496 | Not included |
| Value-added per employee | 0.244 | 0.262 |
| Intensity of export relationships | 0.188 | 0.188 |
| Frost days in a year | 0.161 | 0.162 |
| Mortality from infectious diseases | 0.093 | 0.082 |
| PM10 | 0.072 | 0.083 |
| Employment rate | 0.071 | 0.057 |
| Percentage of employment in manufacturing | 0.048 | 0.034 |
| Foggy days in a year | 0.000 | 0.033 |
| Average family members | -0.028 | -0.022 |
| Percentage of employment in agriculture | -0.150 | -0.145 |
| Observations | 107 | 107 |
Constant terms and predictors that are not selected in either of the two models are not shown.
Predictive out-of-sample and in-sample performance of different estimators.
| | Estimated MSE | |
|---|---|---|
| | Hold-out sample | Training sample |
| Elastic net | 0.527 | 0.463 |
| | (0.147) | |
| Lasso | 0.580 | 0.459 |
| | (0.168) | |
| Ridge regression | 0.617 | 0.352 |
| | (0.187) | |
| OLS | 5.763 | 0.045 |
| | (2.159) | |
| Observations | 21 | 86 |
Bootstrapped standard errors (200 replications) in parentheses.
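As an illustration of how a bootstrapped standard error for an MSE can be obtained, here is a minimal sketch. The source does not say what is resampled, so resampling the hold-out prediction errors with replacement is an assumption made here; the function names and the synthetic errors are hypothetical.

```python
import random
import statistics


def mse(errors):
    """Mean squared error from a list of prediction errors."""
    return sum(e * e for e in errors) / len(errors)


def bootstrap_se_mse(errors, replications=200, seed=0):
    """Standard deviation of the MSE across bootstrap resamples of the errors."""
    rng = random.Random(seed)
    n = len(errors)
    draws = []
    for _ in range(replications):
        # Draw n errors with replacement and recompute the MSE
        resample = [errors[rng.randrange(n)] for _ in range(n)]
        draws.append(mse(resample))
    return statistics.stdev(draws)


# Hypothetical hold-out errors for 21 observations (illustrative numbers only)
rng = random.Random(1)
holdout_errors = [rng.gauss(0.0, 0.7) for _ in range(21)]
print(round(mse(holdout_errors), 3), round(bootstrap_se_mse(holdout_errors), 3))
```

With only 21 hold-out observations, a few large errors can dominate the MSE, so the bootstrap spread gives a sense of how much the reported figure depends on the particular sample.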
Fig. A.3. Graphical illustration of the variability of the out-of-sample performance of OLS.
Fig. A.4. Graphical illustration of the variability of the out-of-sample performance of elastic net.
Fig. 3. Graphical illustration of the predictive out-of-sample performances of elastic net and OLS.
Elastic net regression of log cumulative cases per 100,000 inhabitants between September 1, 2020 and October 30, 2020.
| | Coefficient |
|---|---|
| Percentage of the population that lives close to a train station | 0.049 |
| Mean altitude of the province | 0.046 |
| People that actually live in the province as a percentage of residents | 0.028 |
| Firm density | 0.028 |
| Population density | 0.024 |
| Value-added per employee | 0.019 |
| Mortality from pneumonia | 0.013 |
| Windy days in a year | -0.012 |
| Daily hours of sunshine | -0.041 |
| Hours of continuity health care services per capita | -0.043 |
| Unemployment rate | -0.051 |
| Poverty rate | -0.060 |
| Percentage of employment in agriculture | -0.103 |
| Observations | 107 |
| α | 0.111 |
| λ | 0.502 |
| MSE (hold-out sample) | 0.201 |
| Nr. of α | 10 |
| Nr. of λ | 50 |
Constant terms and unselected predictors are not shown. The (α, λ) combination of tuning parameters was selected on the 80% training sample using 5-fold cross-validation. Out-of-sample predictive performance is tested on the remaining 20% of observations.
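The selection procedure described in the note (an 80/20 split, 5-fold cross-validation on the training part, and a final check on the hold-out part) can be sketched as follows. This is only an illustration: it uses a one-predictor toy model with a closed-form elastic net slope, synthetic data, and grids of 10 and 50 values whose spacing is an assumption. None of the names or numbers come from the source beyond the sample size, the split, and the number of folds.

```python
import random


def fit_1d(data, alpha, lam):
    """Closed-form elastic net slope for a single predictor (toy stand-in model)."""
    n = len(data)
    rho = sum(x * y for x, y in data) / n
    z = sum(x * x for x, _ in data) / n
    shrunk = max(abs(rho) - lam * alpha, 0.0) * (1 if rho >= 0 else -1)
    return shrunk / (z + lam * (1 - alpha))


def mse(data, slope):
    """Mean squared prediction error of y = slope * x on the given data."""
    return sum((y - slope * x) ** 2 for x, y in data) / len(data)


def k_folds(items, k):
    """Split items into k folds of nearly equal size."""
    return [items[i::k] for i in range(k)]


def select_by_cv(train, alphas, lams, k=5):
    """Return the (alpha, lam) pair with the lowest average validation MSE."""
    folds = k_folds(train, k)
    best = None
    for alpha in alphas:
        for lam in lams:
            scores = []
            for i in range(k):
                fit_part = [obs for j, f in enumerate(folds) if j != i for obs in f]
                scores.append(mse(folds[i], fit_1d(fit_part, alpha, lam)))
            avg = sum(scores) / k
            if best is None or avg < best[0]:
                best = (avg, alpha, lam)
    return best[1], best[2]


# Synthetic data with 107 observations (values are made up)
rng = random.Random(42)
data = []
for _ in range(107):
    x = rng.gauss(0, 1)
    data.append((x, 0.5 * x + rng.gauss(0, 1)))
rng.shuffle(data)

n_train = round(0.8 * len(data))            # 86 training, 21 hold-out observations
train, holdout = data[:n_train], data[n_train:]
alphas = [i / 9 for i in range(10)]         # 10 values on [0, 1]; spacing assumed
lams = [0.02 * i for i in range(1, 51)]     # 50 values; range and spacing assumed
alpha, lam = select_by_cv(train, alphas, lams)
holdout_mse = mse(holdout, fit_1d(train, alpha, lam))
print(alpha, lam, round(holdout_mse, 3))
```

The hold-out observations never enter the grid search, so the final MSE is an honest estimate of out-of-sample error for the selected pair.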
Description of climate and pollution predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| NO2 | Concentration of nitrogen dioxide (µg/m³) in the provincial capital | ISTAT (2018) |
| PM10 | Concentration of particulate matter with a diameter of 10 micrometres or less (µg/m³) in the provincial capital | ISTAT (2018) |
| PM2.5 | Concentration of particulate matter with a diameter of 2.5 micrometres or less (µg/m³) in the provincial capital | ISTAT (2018) |
| Hot days in a year | Number of days in a year with a max temperature above 30°C: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
| Frost days in a year | Number of days in a year with a max temperature below 3°C: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
| Rainy days in a year | Number of rainy days in a year: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
| Foggy days in a year | Number of foggy days in a year: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
| Daily hours of sunshine | Number of daily hours of sunshine: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
| Number of windy days in a year | Number of days in a year with wind gusts greater than 25 knots: 2008 - 2018 average values. | Il Sole 24 ore. Quality of life index (2018) |
Description of socio-demographic predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Age of the population | Average age of provincial population | ISTAT (2018) |
| Family size | Average number of family members | ISTAT (2011) |
| Percentage of families with 5 or more members | Percentage of families with at least 5 members | ISTAT (2011) |
| Percentage of immigrants | Percentage of foreign residents in the provincial population | ISTAT (2019) |
| Percentage of males | Percentage of male individuals | ISTAT (2019) |
| Percentage of population aged 65 or more | Percentage of provincial population aged 65 years old or more | ISTAT (2019) |
| Percentage of the population living close to a train station | Percentage of provincial population living in a municipality with at least one station with more than 2,500 daily visitors in a year | Ministry of Economic Development (2014) |
| Percentage of the population living close to the sea | Percentage of provincial population living in a municipality located close to the sea | ISTAT (2019) |
| People that actually live in the province as a percentage of residents | Number of people actually living in the province as a percentage of provincial residents | ISTAT (2011) |
| Population density | Number of people per km² | ISTAT (2019) |
Description of geographical and territorial predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Altitude of the province capital | Altitude of the provincial capital measured at City Hall | ISTAT |
| Distance from the first outbreak: less than or equal to 50 km | Dummy for a provincial capital which is 50 km or less away from the province of the first outbreak (Lodi) | Own elaboration using latitude, longitude and the curvature constant |
| Distance from the first outbreak: between 51 and 100 km | Dummy for a provincial capital which is between 51 km and 100 km away from the province of the first outbreak (Lodi) | Own elaboration using latitude, longitude and the curvature constant |
| Distance from the first outbreak: between 101 and 300 km | Dummy for a provincial capital which is between 101 km and 300 km away from the province of the first outbreak (Lodi) | Own elaboration using latitude, longitude and the curvature constant |
| Distance from the first outbreak: between 301 and 500 km | Dummy for a provincial capital which is between 301 km and 500 km away from the province of the first outbreak (Lodi) | Own elaboration using latitude, longitude and the curvature constant |
| Distance from the first outbreak: more than 500 km | Dummy for a provincial capital which is more than 500 km away from the province of the first outbreak (Lodi) | Own elaboration using latitude, longitude and the curvature constant |
| Municipality density | Number of municipalities per km² | ISTAT (2020) |
| Mean altitude of the province | Mean altitude of the provincial territory | ISTAT |
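The distance dummies above are described as derived from latitude and longitude. A minimal sketch of one way to build them is given below, using the haversine great-circle formula. The source does not name the formula or the Earth radius used, so both are assumptions here, as are the function names and the approximate coordinates in the usage lines.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption, not taken from the source


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Upper bounds (km) of the distance bands listed in the table.
# Distances are continuous, so each band is treated as (previous bound, bound].
BANDS = [(50, "<=50 km"), (100, "51-100 km"), (300, "101-300 km"), (500, "301-500 km")]
ALL_LABELS = [label for _, label in BANDS] + [">500 km"]


def distance_band(dist_km):
    """Return the label of the band that contains dist_km."""
    for upper, label in BANDS:
        if dist_km <= upper:
            return label
    return ">500 km"


def distance_dummies(dist_km):
    """One 0/1 indicator per band; exactly one of them equals 1."""
    band = distance_band(dist_km)
    return {label: int(label == band) for label in ALL_LABELS}


# Approximate, illustrative coordinates (decimal degrees)
lodi = (45.31, 9.50)
milan = (45.46, 9.19)
d = haversine_km(*lodi, *milan)
print(round(d, 1), distance_band(d))
```

In a regression, one band has to be left out as the reference category, which matches the "more than 500 km" dummy being listed as the omitted category in the full OLS results.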
Description of health care system predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Average hospital size | Average number of hospital beds per hospital | Italian Ministry of Health (2018) |
| Hospital beds per capita | Number of hospital beds per capita | Italian Ministry of Health (2018) |
| Hospital density | Number of hospitals per km² | Italian Ministry of Health (2018) |
| Mortality from infectious diseases | Number of deaths from infectious diseases per 10,000 people | ISTAT (2017) |
| Mortality rate | Number of deaths per 10,000 people | ISTAT (2017) |
| Mortality rate from pneumonia | Number of deaths from pneumonia per 10,000 people | ISTAT (2017) |
| Mortality rate from respiratory diseases | Number of deaths from respiratory diseases per 10,000 people | ISTAT (2017) |
| Percentage of hospital beds in private clinics | Percentage of total hospital beds hosted by private clinics | Italian Ministry of Health (2018) |
| Percentage of total hospital beds for the elderly | Percentage of provincial hospital beds dedicated to the elderly | Italian Ministry of Health (2018) |
| General practitioner per capita | Number of general practitioners as a share of the provincial population | Italian Ministry of Health (2018) |
| Hours of continuity health care services per capita | Total yearly hours of continuity health care services per capita | Italian Ministry of Health (2018) |
| Cases handled by the medical homecare as a share of the population | Number of patients handled by the medical homecare as a share of the population | Italian Ministry of Health (2018) |
| Clinic density | Number of clinics per km² | Italian Ministry of Health (2018) |
Description of mobility predictors.
| Predictor | Description | Source (year of reference) |
|---|---|---|
| Car density | Number of cars per km² | ISTAT (2018) |
| Nr. of flight passengers per capita | Number of flight passengers in provincial airports between January and February 2020 as a percentage of the provincial population | Association of Italian airport operators (2020) |
| Nr. of public transport passengers per capita | Number of yearly public transport passengers per inhabitant in the provincial capital | ISTAT (2015) |
| Nr. of public transport seats per km/resident | Number of public transport seats per km/resident in the provincial capital | ISTAT (2015) |
| Percentage of commuters | Percentage of daily commuters in provincial population | ISTAT (2011) |
| Percentage of commuters outside their municipality of residence | Percentage of total commuters who travel daily outside the municipality of residence. | ISTAT (2011) |
| Percentage of commuters who use a private vehicle | Percentage of total commuters who use a private vehicle. | ISTAT (2011) |
| Percentage of flight passengers from international locations | Percentage of total flight passengers in local airports going to or coming from international locations between January and February 2020 | Association of Italian airport operators (2020) |
| Yearly registered visitors in accommodation facilities (percentage) | Number of yearly visitors registered in accommodation facilities as a percentage of the provincial population | ISTAT (2018) |
| Yearly ship passenger arrivals in local harbours (percentage) | Number of ship passengers landing in provincial harbours as a percentage of provincial population | ISTAT (2018) |
| Percentage of the population living close to a train station | Percentage of provincial population living in a municipality with at least one station with more than 2,500 daily visitors in a year | Ministry of Economic Development (2014) |
| Degree of provincial interconnection | Number of commuters going to or coming from other provinces as a share of the provincial population | ISTAT (2011) |