
The spread of 2019-nCoV in China was primarily driven by population density. Comment on "Association between short-term exposure to air pollution and COVID-19 infection: Evidence from China" by Zhu et al.

Sergio Copiello, Carlo Grillenzoni.

Abstract

Recently, an article published in the journal Science of the Total Environment and authored by Zhu et al. has claimed the "Association between short-term exposure to air pollution and COVID-19 infection" (doi: https://doi.org/10.1016/j.scitotenv.2020.138704). This note shows that the stated dependence between the diffusion of the infection and air pollution may be the result of spurious correlation due to the omission of a common factor, namely, population density. To this end, the relationship between demographic, socio-economic, and environmental conditions and the spread of the novel coronavirus in China is analyzed with spatial regression models on variables deflated by population size. The infection rate - as measured by the number of cases per 100 thousand inhabitants - is found to be strongly related to the population density. At the same time, the association with air pollution is detected with a negative sign, which is difficult to interpret.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  2019-nCoV; Air pollution; COVID-19; Demography; Novel coronavirus; Population density

Year:  2020        PMID: 32711328      PMCID: PMC7365069          DOI: 10.1016/j.scitotenv.2020.141028

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


Introduction and background

The outbreak of the coronavirus pandemic (Cheng and Shan, 2020) has stimulated a multitude of studies on the topic over just a few months. Yet few of them go beyond the clinical scope and deal with other epidemiological aspects. Nonetheless, both the earlier literature on other viruses and some recent studies on the novel coronavirus have examined the likely relationship between socio-economic and environmental conditions and the diffusion of pandemics. In particular, those studies point to the role potentially played by weather conditions (Iqbal et al., 2020; Sobral et al., 2020), transportation (Adda, 2016; Jia et al., 2020), economic activity (Sarmadi et al., 2020), and air pollution (Coccia, 2020; Conticini et al., 2020). As far as the latter aspect is concerned, it is worth noting that pollution emissions are already known to be associated with respiratory viral infections (Becker and Soukup, 1999; Ciencewicki and Jaspers, 2007; Cui et al., 2003; Horne et al., 2018; Mehta et al., 2013; Xu et al., 2016; Ye et al., 2016).
Recently, an article published in the journal Science of the Total Environment has supported the “Association between short-term exposure to air pollution and COVID-19 infection” (Zhu et al., 2020) based on an analysis of 120 Chinese cities using a generalized additive model. The authors find “significantly positive associations of PM2.5, PM10, CO, NO2 and O3 with COVID-19 confirmed cases” (p. 3). In this note, we show that the results of the study mentioned above may be affected by the issue of spurious correlation due to the omission of a common factor, namely, population density. Far be it from us to deny that air pollution - and other factors as well - may have amplified the spread of the pandemic. The issue lies in the fact that, except for weather conditions, the concurrent factors suggested so far - i.e., transportation volumes, economic activity, and air pollution - are anthropogenic in nature. Thus, they all depend on the extent of human activities: the larger the population is, the higher the transportation volumes, economic activity, air pollution, and virus infections are (Fig. 1). Accordingly, when it comes to measuring the actual effect of anthropogenic causes on the pandemic, normalizing by population size and controlling for population density (Coccia, 2020) is not optional but necessary.
Under the above framework, we consider an alternative model, whose data cover almost all Chinese provinces and their socio-economic variables, normalized by population size. Further, we also consider environmental variables that may explain the infection rate. Our main aim is to raise questions about future directions of the research on ecological and socio-economic aspects of 2019-nCoV.
Fig. 1

Interactions between anthropogenic factors, natural factors, and virus infections.


Materials and method

Outline of the problem

To show how the issue of spurious correlation may affect the relationship between virus contamination (Y) and air pollution (X) through population (Pop), we consider data on people infected by COVID-19 and the level of sulfur dioxide (SO2) emissions in the provinces of China (details about variables, study area, and data sources are provided in the next sub-sections). The scatterplot of X and Y and the fitting line show a positive relationship (Fig. 2, left panel). However, when the variables are expressed as per capita data - RY = Y/Pop and RX = X/Pop - a negative relationship is found (Fig. 2, right panel). The same issue emerges when comparing the simple correlation coefficient (ρ = +0.33) with the partial correlation coefficient (ρ = −0.45): the coefficient changes in both sign and size. This example shows that the net relationship between several phenomena may be quite different from the gross one, the latter being possibly inflated by latent variables. That implies the need to consider all the potential factors that determine the health status of a society.
Fig. 2

Scatterplots and fitting lines of COVID-19 cases and SO2 emissions in Chinese provinces, given the population size.
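
The sign reversal described above can be reproduced on synthetic data. The following is a minimal sketch, not the data or the procedure of this note: the sample size, the noise scales, and the helper names `pearson` and `partial_corr` are illustrative assumptions. Population is built to drive both X and Y upward, while their per capita components are negatively related by construction.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000                # synthetic regions (hypothetical, not the study sample)

# All quantities are in logs. Population is the common factor of X and Y;
# the per capita components are negatively related by construction.
ln_pop = [rng.gauss(0.0, 1.0) for _ in range(n)]
ln_rx = [rng.gauss(0.0, 0.5) for _ in range(n)]           # ln(X / Pop)
ln_ry = [-0.5 * r + rng.gauss(0.0, 0.3) for r in ln_rx]   # ln(Y / Pop)
ln_x = [p + r for p, r in zip(ln_pop, ln_rx)]             # ln X
ln_y = [p + r for p, r in zip(ln_pop, ln_ry)]             # ln Y

gross = pearson(ln_x, ln_y)              # inflated by the common factor
net = partial_corr(ln_x, ln_y, ln_pop)   # population held fixed
per_capita = pearson(ln_rx, ln_ry)       # variables deflated by population

print(f"gross: {gross:+.2f}  partial: {net:+.2f}  per capita: {per_capita:+.2f}")
```

In this construction the gross correlation is positive, whereas both the partial correlation and the correlation between the deflated variables are negative, which is the qualitative pattern of Fig. 2.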


Nomenclature

As far as the dependent variables are concerned, let us denote by Cov the total number of confirmed cases of 2019-nCoV, and by RCov the incidence rate of the infection, namely, the number of cases per 100 thousand inhabitants. The first set of covariates is as follows: Pop is the population size; Den is the population density, namely, the ratio between population and area in km²; Grp is the gross regional product; Pr stands for the yearly average precipitation; Th indicates the annual average maximum temperature; SO2 is the level of sulfur dioxide emissions; Iwg stands for the emissions of industrial waste gases. The second set of covariates includes the variables that depend on human activity and are hence normalized by population: RGrp = Grp/Pop is the per capita gross regional product; RSO2 = SO2/Pop stands for the per capita emissions of sulfur dioxide; RIwg = Iwg/Pop indicates the emissions of industrial waste gases in per capita values.
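
The normalized variables defined above are simple ratios. A minimal sketch follows, in which the function names and the figures for the example province are hypothetical and serve only to illustrate the arithmetic:

```python
def incidence_rate(cases, population, per=100_000):
    """RCov: confirmed cases per `per` inhabitants."""
    return cases * per / population


def density(population, area_km2):
    """Den: inhabitants per square kilometre."""
    return population / area_km2


def per_capita(value, population):
    """Generic deflation by population size, as in RGrp, RSO2, RIwg."""
    return value / population


# Hypothetical province, not taken from the data set of this note.
cov = 1_200          # confirmed cases
pop = 48_000_000     # inhabitants
area = 160_000       # km2

rcov = incidence_rate(cov, pop)   # cases per 100 thousand inhabitants
den = density(pop, area)          # inhabitants per km2
print(rcov, den)
```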

Study area and data sources

This study focuses on 28 mainland Chinese provinces, autonomous regions, and municipalities outside Hubei province. Tibet and Guizhou are excluded due to missing data. Data concerning the cumulative confirmed cases of the 2019-nCoV as of March 22, 2020 (Fig. 3) - irrespective of whether they resulted in deaths, and regardless of the number of people that have recovered from the virus - is collected from the Coronavirus Resource Center of the Johns Hopkins University (Dong et al., 2020).
Fig. 3

Chinese provinces by COVID-19 overall confirmed cases as of March 22, 2020.

Demographic and economic variables are derived from the annual publication of the National Bureau of Statistics of China (National Bureau of Statistics of China, 2019). Information about precipitation and temperature is gathered from Current Results, based on data conveyed by the China Meteorological Administration and the World Meteorological Organization. Data about pollution emissions are taken from the paper “The Pollution state in 31 Provinces and Regions in China” (Yang and Yang, 2011). Although outdated, those values are assumed to be a fair proxy of current emissions in Chinese provinces.

Analytical models

To study the relationships between dependent and independent variables, we use spatial autoregressive (SAR) models (Copiello and Grillenzoni, 2017; Elhorst, 2010) with exogenous predictors, specified as Eq. (1) for Cov and Eq. (2) for RCov. In both equations, i is the province index, α, βj, and ρ are the coefficients, and e and u are the residuals, which are expected to be independent and normally (IN) distributed, with mean zero and constant variance. It is worth noting that the model of Eq. (2) differs from the model of Eq. (1) because the variables that directly depend on human activity are normalized by population in the former. In the models of Eqs. (1), (2), ρ is the spatial autocorrelation coefficient; hence, the terms it multiplies are spatially lagged dependent variables (i.e., the mean values of Cov and RCov in the provinces contiguous to the ith area). These lagged terms aim to identify whether the analyzed phenomenon has a spatial pattern according to Tobler's (1970) first law of geography, namely, that “everything is related to everything else, but near things are more related than distant things” (p. 236). In order to satisfy the assumption of homoscedasticity (i.e., σ² independent of i), all variables are transformed with natural logarithms (ln). The core of the analysis is the statistical significance and sign (+ or −) of the estimated coefficients βj. In particular, a difference in the βj between the models of Eqs. (1), (2) is a symptom of spurious correlation between epidemic and environmental variables.
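
Written with the symbols of the Nomenclature, the two specifications can be sketched as follows. This is a reconstruction under stated assumptions, not a verbatim copy of Eqs. (1), (2): the covariate lists, the overbar notation for the spatial lag, the set N(i) of provinces contiguous to i, and the choice of taking the logarithm of the neighborhood mean are all inferred from the surrounding definitions.

```latex
\begin{align}
\ln \mathrm{Cov}_i  &= \alpha + \rho \,\ln \overline{\mathrm{Cov}}_i
   + \sum_{j} \beta_j \ln X_{ji} + e_i ,
   & e_i &\sim \mathrm{IN}(0, \sigma_e^2) \tag{1} \\
\ln \mathrm{RCov}_i &= \alpha + \rho \,\ln \overline{\mathrm{RCov}}_i
   + \sum_{j} \beta_j \ln Z_{ji} + u_i ,
   & u_i &\sim \mathrm{IN}(0, \sigma_u^2) \tag{2}
\end{align}
% Assumed covariate sets:
%   X = (Pop, Den, Grp, Pr, Th, SO2, Iwg)      first set, absolute values
%   Z = (Den, RGrp, Pr, Th, RSO2, RIwg)        second set, deflated by Pop
% Assumed spatial lag: mean over the provinces contiguous to i
%   \overline{\mathrm{Cov}}_i = \frac{1}{|N(i)|} \sum_{k \in N(i)} \mathrm{Cov}_k
```

Under this reading, a coefficient βj attached to a pollution variable that is positive in Eq. (1) but negative or insignificant in Eq. (2) points to the confounding role of population.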

Results and discussion

The estimates of the models of Eqs. (1), (2) are provided in Tables 1 and 2 (see also Figs. 4 and 5). In general, they are satisfactory as they fulfill the standard assumptions of regression, namely, normal residuals, absence of outliers and multicollinearity, and good fit. Specifically, the hypothesis H0: e, u ~ IN(0, σ²) is not rejected given the low χ²(2) statistics: 2.8637 (p-value 0.2389) for Cov, and 0.7412 (p-value 0.6903) for RCov. The explanatory variables are not affected by multicollinearity according to the low Variance Inflation Factors (VIFs ≤ 2.5, the threshold suggested in Allison, 1999). The adjusted R² values are 0.7124 for Cov and 0.4906 for RCov.
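
The p-values of the normality tests can be checked by hand, since the upper-tail probability of a chi-square variate with 2 degrees of freedom has the closed form exp(−x/2). A minimal sketch, in which the function name `chi2_sf_2df` is an illustrative choice and the two statistics are those quoted above:

```python
import math


def chi2_sf_2df(stat):
    """Upper-tail probability of a chi-square variate with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


p_cov = chi2_sf_2df(2.8637)    # normality test statistic for the Cov model
p_rcov = chi2_sf_2df(0.7412)   # normality test statistic for the RCov model
print(round(p_cov, 4), round(p_rcov, 4))
```

Both values should agree with the reported p-values up to rounding, and both lie well above the usual 0.05 threshold.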
Table 1

Results of the estimation of the model of Eq. (1) for the overall confirmed cases of 2019-nCoV.

Dependent: Cov

| Predictor | Coefficient | Std. err. | t-Stat¹ | p-Value | VIF |
|---|---|---|---|---|---|
| const | −5.997 | 1.256 | −4.773*** | 0.0001 | |
| Grp | 0.889 | 0.110 | 8.078*** | 0.0000 | 1.085 |
| Th | 0.966 | 0.274 | 3.532*** | 0.0016 | 1.085 |

Cov: total number of confirmed cases of 2019-nCoV. Grp: gross regional product. Th: annual average maximum temperature.
¹ Significance levels: * 0.1; ** 0.05; *** 0.01.

Table 2

Results of the estimation of the model of Eq. (2) for the incidence rate of 2019-nCoV.

Dependent: RCov

| Predictor | Coefficient | Std. err. | t-Stat¹ | p-Value | VIF |
|---|---|---|---|---|---|
| const | −1.241 | 0.367 | −3.378*** | 0.0024 | |
| Den | 0.286 | 0.047 | 6.056*** | 0.0000 | 1.022 |
| RIwg | −0.528 | 0.180 | −2.941*** | 0.0069 | 1.022 |

RCov: number of cases of 2019-nCoV per 100 thousand inhabitants. Den: population density. RIwg: per capita emissions of industrial waste gases.
¹ Significance levels: * 0.1; ** 0.05; *** 0.01.
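
Because all variables enter the models in natural logarithms, each coefficient can be read as an elasticity, holding the other covariates fixed. The sketch below is an illustrative reading of the Table 2 point estimates rather than an additional result; the function name `effect_of_scaling` is hypothetical, and the calculation ignores the uncertainty of the estimates.

```python
def effect_of_scaling(beta, factor):
    """Multiplicative change in the response of a log-log model when one
    predictor is multiplied by `factor` and the others are held fixed."""
    return factor ** beta


# Point estimates from Table 2 (dependent variable: RCov).
den_doubling = effect_of_scaling(0.286, 2.0)     # population density doubles
riwg_doubling = effect_of_scaling(-0.528, 2.0)   # per capita waste gases double

print(f"Den x2  -> RCov x{den_doubling:.3f}")
print(f"RIwg x2 -> RCov x{riwg_doubling:.3f}")
```

Under these estimates, doubling the population density goes with an incidence rate that is roughly one fifth higher, whereas doubling per capita emissions of industrial waste gases goes with an incidence rate that is roughly 30% lower.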

Fig. 4

Normal distribution (left panel) and .95 Confidence intervals (right panel) of the residuals for the model of Eq. (1).

Fig. 5

Normal distribution (left panel) and .95 Confidence intervals (right panel) of the residuals for the model of Eq. (2).

The coefficients of the spatially lagged terms are not significant, which indicates the absence of spatial correlation in the data. However, that may depend on the high level of spatial aggregation of provincial data, which bears on the design of suitable local policies to control the epidemic. For example, movement restrictions would be better adopted at least at the national level, provided the national borders are not porous. As regards the analysis of the explanatory variables, in the model with Cov (the absolute number of confirmed cases), the average maximum temperature Th plays a significant role. Apart from indirect effects - namely, the higher the temperature, the higher the level of social interactions, and so the spread of the infection - the positive coefficient of Th invites other interpretations. Assuming that the novel coronavirus was already circulating before December 2019, it could imply that the recent global outbreak is also related to the mild weather conditions experienced in February 2020 (Masters, 2020).
That contrasts with the expectation that the epidemic will spread less easily and more slowly during spring and summer as temperatures get warmer, as also suggested in other articles published in the journal Science of the Total Environment (Ma et al., 2020; Xie and Zhu, 2020). However, it has to be considered that the incidence rate of the infection - adjusted for population density and other factors - has been found to be inversely associated with warmer and drier weather conditions (Byass, 2020). Another significant covariate of the overall confirmed cases is the gross regional product Grp, which takes on a positive sign. Incidentally, that predictor is significantly correlated with some of the variables representing air pollution (SO2: ρ = 0.5516, p-value 0.0023; Iwg: ρ = 0.5412, p-value 0.0029). At first sight, this finding confirms the association found in the study authored by Zhu et al. (2020). Nevertheless, it is a trivial result. It has to be expected that the overall confirmed cases are higher in the most populated areas, which are usually also the most industrialized and wealthy, and, as a consequence, the most polluted ones. The problem is much more evident when turning to the analysis of the predictors of the incidence rate RCov. Population density is a significant driver of the number of cases per 100,000 population. That is in keeping with earlier literature (Amuakwa-Mensah et al., 2017), as well as with recent studies focusing on how population size and population density affect both the current and future spread of COVID-19 disease (Jahangiri et al., 2020; Rocklöv and Sjödin, 2020; Zhang et al., 2020). That might explain why, to date, the epidemic has hit so hard several densely populated areas around the world: the Lombardy region in Italy, North Rhine-Westphalia in Germany, the Madrid metropolitan area in Spain, New York in the United States, São Paulo in Brazil, and so forth.
That is actually the issue with the finding presented by Zhu et al. (2020), namely, that the authors did not normalize the number of novel coronavirus cases by population before testing the relationship with air pollution and other covariates. The authors state that their generalized additive model also includes “city fixed effects … to control for time-invariant city characteristics such as population size and density” (p. 2). Unfortunately, the results are not reported in full detail, so the role played by those fixed effects is unknown, as is how the same fixed effects interact with the variables measuring the pollutants. Hence, there remains an open question: would the COVID-pollution relationship be confirmed using the incidence rate, instead of the number of confirmed cases, as the dependent variable? Furthermore, in the model of Eq. (2), the level of per capita emissions of industrial waste gases RIwg is another significant predictor of the number of cases per 100,000 population. Nevertheless, it takes on a negative sign, which leaves room for doubt about the hypothesis that air pollution has actually played a role in the spread of 2019-nCoV.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.