Modifiable areal unit problem and environmental factors of COVID-19 outbreak.

Yaqi Wang, Qian Di.

Abstract

Several recent studies have explored the association between environmental factors, such as temperature, humidity, and air pollution, and the severity of the COVID-19 outbreak by analyzing the statistical association at the district level. However, we argue that the modifiable areal unit problem (MAUP) arises when aggregating disease and environmental data into districts, leading to bias in such studies. Therefore, in this study, we analyzed the association between environmental factors and the number of COVID-19 death cases under different aggregation strategies to illustrate the presence of MAUP. We used real-world COVID-19 outbreak data from the Hubei and Henan Provinces and studied their association with atmospheric NO2 levels. By fitting linear regression models with penalized splines on NO2, we found that the association between COVID-19 mortality and NO2 varies when data were aggregated (1) at the city level, (2) under two different aggregation strategies, and (3) at the provincial level, indicating the presence of MAUP. Therefore, this study reminds researchers of the presence of MAUP and the necessity to minimize this problem while exploring the environmental determinants of the COVID-19 outbreak.
Copyright © 2020 Elsevier B.V. All rights reserved.

Keywords:  COVID-19; Modifiable areal unit problem; NO2; Spatial analysis; Statistical bias

Year:  2020        PMID: 32534259      PMCID: PMC7274979          DOI: 10.1016/j.scitotenv.2020.139984

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


To the Editor:

The COVID-19 outbreak is now developing into a global pandemic and causing tremendous human and economic loss. Understanding the factors affecting the pathogenesis of the virus is essential to disease control. Several studies have explored the association of COVID-19 transmission or the number of COVID-19 death cases with environmental factors, such as air pollution (Wu et al., 2020), temperature (Oliveiros et al., 2020; Qi et al., 2020; J. Wang et al., 2020; M. Wang et al., 2020; Rahman et al., 2020; Notari, 2020; Sajadi et al., 2020; Triplett, 2020; Zhu and Xie, 2020; Ma et al., 2020), and humidity (Shi et al., 2020; Luo et al., 2020; Ma et al., 2020), to name a few. These studies share a similar ecological study framework: researchers analyzed the statistical association between the number of COVID-19 cases at the city, regional, or national level and a specific environmental factor. Beyond environmental factors, recent studies of the association between COVID-19 cases and the prevalence of Bacillus Calmette–Guérin (BCG) vaccination employed the same approach (Berg et al., 2020; Sala and Miyakawa, 2020). Having found an association between disease mortality and vaccine policy across countries, the researchers suggested that BCG vaccination reduced the risk of COVID-19.

However, we argue that these conclusions depend on how the environmental and disease data were aggregated. We are therefore writing this correspondence to remind researchers of a common flaw in studies on the environmental factors of the COVID-19 outbreak, namely the modifiable areal unit problem (MAUP). MAUP is a statistical bias that arises when point measurements are aggregated into districts. Summary measures, such as the mean and standard deviation, are influenced by the boundaries of the aggregation districts. Consequently, the regression outcome is also influenced by the aggregation strategy. For example, a disease may be negatively associated with temperature at the individual level (Fig. 1); however, after aggregation to the district level, the relationship between disease prevalence and temperature can be positive, negative, or null, depending on the district boundaries.
Fig. 1

An example of MAUP.

Note: a denotes the home addresses of several residents who are either cases or non-cases and their temperature level. b indicates the association between disease and temperature at the individual level. We aggregated individual-level disease data into districts and calculated the prevalence at the district level and the average temperature (area-weighted average). We found that the association between district-level prevalence and average temperature can be positive, null, or negative.
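
The zoning effect described in the Fig. 1 note can be reproduced with a small toy example. The sketch below is not the data behind Fig. 1: the eight residents, their temperatures, and the three zonings are all invented for illustration, and district temperature is taken as the plain mean over residents rather than the area-weighted average used in the figure.

```python
# Toy illustration of MAUP. All numbers are made up for this sketch.
# Each resident is (temperature, is_case); cases tend to sit at lower temperatures.
residents = [
    (10, 1), (12, 1), (14, 0), (16, 1),
    (18, 0), (20, 0), (22, 1), (24, 0),
]


def ols_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def district_slope(zoning):
    """Slope of district prevalence on district mean temperature.

    `zoning` is a list of districts; each district is a list of resident indices.
    """
    mean_temps, prevalences = [], []
    for member_ids in zoning:
        temps = [residents[i][0] for i in member_ids]
        cases = [residents[i][1] for i in member_ids]
        mean_temps.append(sum(temps) / len(temps))   # unweighted mean (assumption)
        prevalences.append(sum(cases) / len(cases))  # share of residents who are cases
    return ols_slope(mean_temps, prevalences)


# Individual-level association: negative by construction.
individual_slope = ols_slope([t for t, _ in residents], [c for _, c in residents])

# Three hypothetical ways of drawing district boundaries around the same residents.
zoning_a = [[0, 1], [2, 3], [4, 5], [6, 7]]
zoning_b = [[0, 2, 4], [1, 5], [3, 6, 7]]
zoning_c = [[0, 1, 4, 5], [2, 3, 6, 7]]

print("individual:", individual_slope)
for name, zoning in [("A", zoning_a), ("B", zoning_b), ("C", zoning_c)]:
    print("zoning", name, "->", district_slope(zoning))
```

With these invented values, the individual-level slope and zoning A are negative, zoning B is positive, and zoning C is null, even though the residents themselves never change.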

We used real-world NO2 concentration and COVID-19 outbreak data from the Hubei and Henan Provinces, China, to illustrate the presence of MAUP. By fitting regression models, we studied the association between COVID-19 death cases and daily NO2 change (1) at the city level, (2) under two different aggregation strategies, and (3) at the provincial level (Figs. 2a and 3a). The dose-response curves for the four scenarios showed different shapes with different regression coefficients (Figs. 2b and 3b). The NO2–COVID-19-death relationship was negative at the provincial level in Hubei Province, which, taken at face value, would indicate a protective effect of NO2 on COVID-19 mortality; however, a positive relationship was observed under aggregation strategy 2 (Fig. 2b). Similar differences were found in Henan Province, where the positive city-level association became negative under one of the aggregation strategies (Fig. 3b). The purpose of this real-world example is not to explore the relationship between COVID-19 and NO2 but to demonstrate that such a relationship can be influenced by aggregation level and aggregation strategy.
Fig. 2

Relationship between daily NO2 (μg/m3) and COVID-19 death cases under different scenarios in Hubei Province.

a shows different aggregation strategies. b shows the corresponding NO2-death relationship under each scenario. The unit for NO2 on the x-axes is μg/m3 (b).

At the city level, each city was presented individually (denoted with different colors); for the two aggregation strategies, adjacent cities were aggregated into districts (denoted with different colors); at the provincial level, all city-level data were aggregated. The COVID-19 data were available from the “nCov2019” package in R, and the environmental data were obtained from the China National Environmental Monitoring Centre (National real time air quality data platform of the China National Environmental Monitoring Centre). We used data from January 27, 2020 to March 10, 2020, when COVID-19 surged in China. During aggregation, deaths across the aggregated cities were summed and their NO2 levels on the same day were averaged.

Under aggregation strategy 1, cities were aggregated into three districts in the following manner: District 1: Enshi, Yichang, Xiangyang, and Shiyan; District 2: Jingmen, Jingzhou, Suizhou, Xiaogan, and Wuhan; District 3: Huanggang, Xianning, Huangshi, and Ezhou. Under aggregation strategy 2, cities were aggregated into four districts: District 1: Xiangyang, Shiyan, Suizhou, and Xiaogan; District 2: Enshi, Jingmen, and Yichang; District 3: Jingzhou, Wuhan, and Huanggang; District 4: Ezhou, Huangshi, and Xianning. Four cities were excluded because of missing environmental data or zero COVID-19 mortality.

We fit regression models with (1) the linear term of NO2 to estimate the beta coefficient and (2) a penalized spline on NO2 to estimate the dose-response curves with COVID-19 death cases under different aggregation strategies. For city- and district-level data, we put a random effect on the city or aggregated district. The dose-response curves in b were placed in the same order as those in a. The corresponding beta coefficients for the four scenarios were −0.052 (city level), −0.75 (strategy 1), −0.83 (strategy 2), and −1.30 (provincial level).
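
The aggregation rule in the Fig. 2 caption (deaths summed, same-day NO2 averaged) can be sketched as follows. The district memberships are the Hubei groupings listed in the caption; the function name `aggregate`, the record layout, and every number in `records` are hypothetical, and the unweighted mean of NO2 is an assumption about how the averaging was done.

```python
from collections import defaultdict

# District definitions for Hubei Province, as listed in the Fig. 2 caption.
STRATEGY_1 = {
    "District 1": ["Enshi", "Yichang", "Xiangyang", "Shiyan"],
    "District 2": ["Jingmen", "Jingzhou", "Suizhou", "Xiaogan", "Wuhan"],
    "District 3": ["Huanggang", "Xianning", "Huangshi", "Ezhou"],
}
STRATEGY_2 = {
    "District 1": ["Xiangyang", "Shiyan", "Suizhou", "Xiaogan"],
    "District 2": ["Enshi", "Jingmen", "Yichang"],
    "District 3": ["Jingzhou", "Wuhan", "Huanggang"],
    "District 4": ["Ezhou", "Huangshi", "Xianning"],
}
# Provincial level: every included city in a single unit.
PROVINCE = {"Hubei": [city for cities in STRATEGY_1.values() for city in cities]}


def aggregate(records, districts):
    """Aggregate city-day records into district-day records.

    records: iterable of (date, city, deaths, no2) tuples.
    Returns {(district, date): (summed deaths, mean NO2)}.
    """
    city_to_district = {
        city: name for name, cities in districts.items() for city in cities
    }
    deaths = defaultdict(int)
    no2_values = defaultdict(list)
    for date, city, city_deaths, city_no2 in records:
        district = city_to_district.get(city)
        if district is None:  # city excluded from the analysis
            continue
        deaths[(district, date)] += city_deaths
        no2_values[(district, date)].append(city_no2)
    # Unweighted mean of same-day NO2 (assumption).
    return {
        key: (deaths[key], sum(vals) / len(vals)) for key, vals in no2_values.items()
    }


# Invented city-day records: (date, city, deaths, NO2 in ug/m3).
records = [
    ("2020-02-01", "Enshi", 1, 10.0),
    ("2020-02-01", "Xiangyang", 2, 14.0),
    ("2020-02-01", "Suizhou", 3, 20.0),
    ("2020-02-01", "Wuhan", 50, 30.0),
]

for label, districts in [("strategy 1", STRATEGY_1),
                         ("strategy 2", STRATEGY_2),
                         ("province", PROVINCE)]:
    print(label, aggregate(records, districts))
```

The same four city records yield different district-level pairs of deaths and NO2 under each grouping, which is the input-side source of the differences between scenarios.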

Fig. 3

Relationship between daily NO2 (μg/m3) and COVID-19 death cases under different scenarios in Henan Province.

a shows the different aggregation strategies. b shows the corresponding NO2-death relationship under different aggregation strategies. The x-axes unit for NO2 is μg/m3 (b).

As for Hubei Province, data from January 27, 2020 to March 10, 2020 were used because mortality subsequently fell to zero.

Under aggregation strategy 1, cities were aggregated into three districts in the following manner: District 1: Sanmenxia, Nanyang, and Xinyang; District 2: Luoyang, Pingdingshan, Xinxiang, and Jiaozuo; District 3: Zhengzhou, Xuchang, Zhoukou, and Shangqiu. Under aggregation strategy 2, cities were also aggregated into three districts: District 1: Xinyang, Nanyang, and Pingdingshan; District 2: Shangqiu, Zhoukou, and Xuchang; District 3: Xinxiang, Jiaozuo, Luoyang, Sanmenxia, and Zhengzhou. Seven cities were excluded because they had no COVID-19 mortality.

We fit regression models with (1) the linear term of NO2 to estimate the beta coefficient and (2) a penalized spline on NO2 to estimate the dose-response curves with COVID-19 death cases under different aggregation strategies. For city- and district-level data, we put a random effect on the city or aggregated district. The dose-response curves in b were placed in the same order as those in a. The corresponding beta coefficients for the four scenarios were 1.9 × 10⁻³ (city level), −2.9 × 10⁻³ (strategy 1), 1.2 × 10⁻² (strategy 2), and 4.7 × 10⁻⁵ (provincial level).
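
To see why the linear coefficient can change sign between scenarios, consider the following sketch. It is not the model described in the captions: in place of a random effect on the city or district, it uses a simpler stand-in, an ordinary least-squares slope with a separate fixed intercept per group (the within-group estimator), and it omits the penalized spline entirely. The function name `within_group_slope` and all data values are hypothetical.

```python
from collections import defaultdict


def within_group_slope(rows):
    """OLS slope of deaths on NO2 with one fixed intercept per group.

    rows: iterable of (group, no2, deaths) tuples.
    Stand-in for a random-intercept model: each group's values are centered
    on that group's own means before the slope is computed.
    """
    by_group = defaultdict(list)
    for group, no2, deaths in rows:
        by_group[group].append((no2, deaths))

    sxy = 0.0
    sxx = 0.0
    for points in by_group.values():
        mean_x = sum(x for x, _ in points) / len(points)
        mean_y = sum(y for _, y in points) / len(points)
        sxy += sum((x - mean_x) * (y - mean_y) for x, y in points)
        sxx += sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Invented district-day observations: (district, NO2, deaths).
district_rows = [
    ("A", 10.0, 5.0), ("A", 20.0, 3.0),
    ("B", 30.0, 9.0), ("B", 40.0, 7.0),
]
# Provincial scenario: the same observations treated as one unit.
province_rows = [("province", no2, deaths) for _, no2, deaths in district_rows]

print("district-level slope:", within_group_slope(district_rows))
print("provincial slope:", within_group_slope(province_rows))
```

In this invented example the slope is negative within each district but positive once the district structure is ignored, so the grouping alone decides the sign of the estimate.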

The modifiable areal unit problem causes unreliable analytical results and encourages false conclusions regarding the dependence of transmission on certain factors, unnecessary control measures, or the unrealistic hope that warm weather or the BCG vaccine will impede COVID-19 transmission.
Solutions for minimizing MAUP and achieving reliable analytical results include: (1) Conducting epidemiological studies at the individual level, with adequate exposure assessment and case-control or cohort study designs. Although the finest data in this article were at the city level, MAUP is already present once individual data are aggregated to the city level. Therefore, the Center for Disease Control and other institutions with individual tracking data may wish to use the preferred individual-level analysis. (2) Combining epidemiological evidence with biological evidence; laboratory results on virus stability under different temperature and humidity conditions (van Doremalen et al., 2020) would be a good supplement to epidemiological findings. (3) If (1) is not possible, conducting the study at the finest spatial scale possible and acknowledging MAUP as a possible limitation. We hope this short correspondence can help other researchers working on this topic to find substantial evidence on environmental contributors and other influential factors of the COVID-19 pandemic.

Funding

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

CRediT authorship contribution statement

Yaqi Wang: Methodology, Software, Validation, Formal analysis, Data curation, Writing - original draft, Visualization. Qian Di: Conceptualization, Writing - review & editing, Supervision.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References

1.  Temperature, Humidity, and Latitude Analysis to Estimate Potential Spread and Seasonality of Coronavirus Disease 2019 (COVID-19).

Authors:  Mohammad M Sajadi; Parham Habibzadeh; Augustin Vintzileos; Shervin Shokouhi; Fernando Miralles-Wilhelm; Anthony Amoroso
Journal:  JAMA Netw Open       Date:  2020-06-01

2.  Association between ambient temperature and COVID-19 infection in 122 cities from China.

Authors:  Jingui Xie; Yongjian Zhu
Journal:  Sci Total Environ       Date:  2020-03-30       Impact factor: 7.963

3.  Effects of temperature variation and humidity on the death of COVID-19 in Wuhan, China.

Authors:  Yueling Ma; Yadong Zhao; Jiangtao Liu; Xiaotao He; Bo Wang; Shihua Fu; Jun Yan; Jingping Niu; Ji Zhou; Bin Luo
Journal:  Sci Total Environ       Date:  2020-03-26       Impact factor: 7.963

4.  Temperature dependence of COVID-19 transmission.

Authors:  Alessio Notari
Journal:  Sci Total Environ       Date:  2020-12-13       Impact factor: 7.963

5.  Aerosol and Surface Stability of SARS-CoV-2 as Compared with SARS-CoV-1.

Authors:  Neeltje van Doremalen; Trenton Bushmaker; Dylan H Morris; Myndi G Holbrook; Amandine Gamble; Brandi N Williamson; Azaibi Tamin; Jennifer L Harcourt; Natalie J Thornburg; Susan I Gerber; James O Lloyd-Smith; Emmie de Wit; Vincent J Munster
Journal:  N Engl J Med       Date:  2020-03-17       Impact factor: 91.245

6.  COVID-19 transmission in Mainland China is associated with temperature and humidity: A time-series analysis.

Authors:  Hongchao Qi; Shuang Xiao; Runye Shi; Michael P Ward; Yue Chen; Wei Tu; Qing Su; Wenge Wang; Xinyi Wang; Zhijie Zhang
Journal:  Sci Total Environ       Date:  2020-04-19       Impact factor: 7.963

7.  Mandated Bacillus Calmette-Guérin (BCG) vaccination predicts flattened curves for the spread of COVID-19.

Authors:  Martha K Berg; Qinggang Yu; Cristina E Salvador; Irene Melani; Shinobu Kitayama
Journal:  Sci Adv       Date:  2020-08-05       Impact factor: 14.136
