
Dynamic model to predict the association between air quality, COVID-19 cases, and level of lockdown.

Yara S Tadano1, Sanja Potgieter-Vermaak2, Yslene R Kachba3, Daiane M G Chiroli3, Luciana Casacio4, Jéssica C Santos-Silva5, Camila A B Moreira6, Vivian Machado7, Thiago Antonini Alves7, Hugo Siqueira8, Ricardo H M Godoi6.   

Abstract

Studies have reported significant reductions in air pollutant levels worldwide due to the lockdowns imposed during the COVID-19 outbreak. Nevertheless, all of these reports are limited to comparisons with data from the same period over the past few years, providing mainly an overview of past events, with no future predictions. Lockdown level can be directly related to the number of new COVID-19 cases, air pollution, and economic restriction. As lockdown status varies considerably across the globe, there is a window for mega-cities to determine the optimum lockdown flexibility. To that end, we first employed four different Artificial Neural Networks (ANN) to examine their compatibility with the original levels of CO, O3, NO2, NO, PM2.5, and PM10 for São Paulo City, the current pandemic epicenter in South America. After checking compatibility, we simulated four hypothetical scenarios: 10%, 30%, 70%, and 90% lockdown to predict air pollution levels. To our knowledge, ANN have not been applied to air pollution prediction by lockdown level. Using a limited database, the Multilayer Perceptron neural network has proven to be robust (with Mean Absolute Percentage Error ∼ 30%), with acceptable predictive power to estimate air pollution changes. We illustrate that air pollutant levels can effectively be controlled and predicted when flexible lockdown measures are implemented. The models will be a useful tool for governments to manage the delicate balance among lockdown, number of COVID-19 cases, and air pollution.
Copyright © 2020 Elsevier Ltd. All rights reserved.

Keywords:  Air pollution; Artificial neural networks; Lockdown flexibility; SARS-CoV-2

Year:  2020        PMID: 33162213      PMCID: PMC7598373          DOI: 10.1016/j.envpol.2020.115920

Source DB:  PubMed          Journal:  Environ Pollut        ISSN: 0269-7491            Impact factor:   8.071


Artificial Neural Networks proved to be robust predictive tools for estimating the best equilibrium among COVID-19 cases, lockdown percentage, and air pollutant levels.

Introduction

The World Health Organization (WHO) stated that South America is the new epicenter of the coronavirus pandemic (CNBC, 2020), and Brazil is one of the countries with the highest incidence of new cases and has the second highest total number of cases in the world. A study by scientists from Imperial College London showed that Brazil had the highest rate of transmission (R0 of 2.81) among the 48 countries they investigated (The Lancet, 2020). To date (September 3, 2020), 6.6% of Brazil’s total cases (3,997,865) were recorded in São Paulo city (262,570). This number constitutes more than 30% of the cases reported in São Paulo state (826,331). On September 3, the number of deaths in São Paulo city was 11,554 (4.4% of confirmed COVID-19 cases led to death), higher than the global rate (3.3%) (SEADE, 2020). Due to the rapid person-to-person transmission of COVID-19, the São Paulo state government ordered a lockdown on March 24, 2020, closing secondary schools, universities, shopping malls, and other commercial entities, and keeping only essential services open (Nakada and Urban, 2020). As expected, beyond their effectiveness in suppressing R0 (Wilder-Smith and Freedman, 2020), these actions led to a scaling down of traffic, industrial, and trade activities, and a consequent reduction in air pollution levels, therefore improving air quality as a whole (Dutheil et al., 2020). In response to the exponential increase in infection rates of the virus worldwide, local and national governments relaxed environmental legislation. For instance, the US EPA allowed industries and other facilities autonomy to decide and report if they meet the legislated requirements (Wu et al., 2020). Similarly, the Brazilian government has largely negated enforcement of environmental legislation during the coronavirus outbreak (The Guardian, 2020), which resulted in additional industrial air pollution emissions, as well as an increase in deforestation in the Amazon (de Oliveira et al., 2020). 
The danger is that reduced enforcement will continue past the virus’s peak to stimulate the economy and therefore put the population at risk. Various scientists reported decreased air pollutant levels, comparing pre- and post-COVID-19 air pollution levels using different methods and scales (Chauhan and Singh, 2020; Dantas et al., 2020; Le et al., 2020; Li et al., 2020; Muhammad et al., 2020; Nakada and Urban, 2020; Sharma et al., 2020; Shehzad et al., 2020; Tobías et al., 2020). However, the available air pollution studies related to the COVID-19 situation are based on satellite images and air quality modeling, and generally compare lockdown period data with monthly means over the past few years. Worldwide, most studies reported in the literature indicated reductions in NOx and PM2.5 levels and an increase in O3 concentration during lockdown (Nakada and Urban, 2020; Sharma et al., 2020; Sicard et al., 2020; Siciliano et al., 2020; Tobías et al., 2020). The following are a few examples of studies using these approaches. Many researchers worldwide reported a reduction in NO2 concentration levels (Chauhan and Singh, 2020; Muhammad et al., 2020; Zambrano-Monserrate et al., 2020). Zambrano-Monserrate et al. (2020) reported reductions in China, USA, Italy, and Spain when Copernicus Atmosphere Monitoring Service data for PM2.5 and NO2 were compared to the previous three years. Rodríguez-Urrego and Rodríguez-Urrego (2020) studied PM2.5 profiles of the 50 most polluted countries and reported an average reduction of 12% worldwide. They used the World Air Quality Index platform to obtain data and compared it to the previous 2 years. Closer to home, Dantas et al. (2020) and Nakada and Urban (2020) compared various air pollutants (including CO, O3, NO2, NO, PM2.5, PM10, and SO2) over different time scales (one-year to five-year trends) in Rio de Janeiro and São Paulo, respectively. In both cases, local data were used. 
Both studies indicated a reduction of all pollutants investigated, except for ozone, which increased. These approaches (using satellite images, air quality modeling, and comparisons of lockdown period data with monthly means over the past few years) are limited, as they provide mainly an overview of past events, with no future predictions. Artificial Neural Networks (ANN), on the other hand, are a nonlinear methodology capable of mapping a set of inputs into an output, which is important to support decisions regarding preventive measures. This approach has been used in air pollution epidemiological studies (Araujo et al., 2020; Kachba et al., 2020; Kassomenos et al., 2011; Tadano et al., 2016; Polezer et al., 2018). In Araujo et al. (2020) and Kassomenos et al. (2011), the ANN performed better than linear approaches such as Generalized Linear Models. Kassomenos et al. (2011) also concluded that ANN is a more flexible and adaptive mathematical approach. In this context, as lockdown status varies considerably across the globe, there is a window of opportunity for mega-cities to determine the optimum level of lockdown to ensure effective management of transmission rates, air quality, and a healthy economy. To our knowledge, ANN have not been applied to air pollution prediction by lockdown level. To that end, we used four Artificial Neural Networks (Extreme Learning Machine – ELM; Echo State Network – ESN; Multilayer Perceptron – MLP; and Radial Basis Function Networks – RBF) to estimate the influence that newly reported COVID-19 cases and lockdown level may have on local air pollution (CO, O3, NO2, NO, PM2.5, and PM10 levels) in São Paulo city. After checking compatibility, we simulated four hypothetical partial lockdown scenarios (10, 30, 70, and 90%) to investigate the relationship between reduced activities and air quality. 
In light of evidence that poor air quality may exacerbate COVID-19 symptoms (Wu et al., 2020) and potentially lead to higher mortality rates, the ANN proved to be a useful predictive tool for governments. Using this approach, the resumption of industrial and other activities can be managed to ensure a sustainable balance among economic health, air quality, and transmission rate.

Materials and methods

Data from São Paulo city were selected to examine the robustness of our approach. São Paulo is the most populous city of Latin America, with around 12.25 million inhabitants (IBGE, 2020), the main hotspot of COVID-19 in Brazil, and one of the most polluted cities in Latin America. The inputs were the daily number of COVID-19 cases, the partial lockdown level, and meteorological variables; the outputs were the daily concentration of each air pollutant (CO [ppm], O3 [μg/m3], NO2 [μg/m3], NO [μg/m3], PM2.5 [μg/m3], and PM10 [μg/m3]). Data on the daily number of newly reported COVID-19 cases and lockdown percentages were collected from March 17, 2020 to May 13, 2020 from the Statistical Portal of São Paulo State (SEADE, 2020). The Intelligence Monitoring System of São Paulo has an agreement with mobile phone companies to track people’s movement. This georeferenced, anonymised information is available on the SEADE website and has been used in this study. Meteorological variables were extracted from the Environmental Company of São Paulo State database (CETESB). These included relative humidity (RH [%]), maximum temperature (MT [°C]), atmospheric pressure (AP [hPa]), wind speed (WS [m/s]), and global solar radiation (GSR [W/m2]) (CETESB, 2020). The data on target pollutant concentrations (CO [ppm], O3 [μg/m3], NO2 [μg/m3], NO [μg/m3], PM2.5 [μg/m3], and PM10 [μg/m3]) were selected from January 01, 2020 to May 13, 2020 (134 samples). As a matter of comparison and to improve the ANN performance, we included the data for a period with zero COVID-19 cases and no lockdown (data from January 01, 2020 to March 16, 2020). Daily concentrations were extracted from the CETESB database. More than sixty-six percent of the hourly averages were similar to the daily average. The data were ratified by the CETESB, which follows the quality assurance/quality control (QA/QC) procedure approved by the State Council of Environment (CONSEMA) of the State of São Paulo. 
Beta radiation is used for PM10 and PM2.5 measurements, chemiluminescence for NO2 and NO, non-dispersive infrared for CO, and ultraviolet analysis for O3 (CETESB, 2020). Data from four CETESB air quality monitoring stations (AQMS) were used due to their locations (Fig. 1). The largest data sets could be obtained from D. Pedro II station (blue spot) and Tietê station (red spot). D. Pedro II station is located downtown, in a high demographic density area, and is influenced mainly by a light-duty fleet, whereas Tietê station is near a busy ring road and is characterized mainly by heavy-duty emissions.
Fig. 1

Locations of the air quality monitoring stations in São Paulo. The satellite map is from Google Maps (Map data©2020 Google; https://www.google.com/maps/place/Brazil/); the satellite is from Google Earth Pro (Map data©2020 Google; www.google.com/maps/@-23.6815315,-46.8754814,10z). The maps were edited with Microsoft Power Point (version 16.28–19081202). Note: AQMS: Air Quality Monitoring Station; Tietê: ring road; D. Pedro II: downtown; ∗Tietê station has no O3 data and was replaced by data from USP-Ipen station.

Table 1 shows that even at these two stations, some data are lacking. PM2.5 data from D. Pedro II station had several gaps on consecutive days, and these were replaced by data from Mooca station (yellow spot) (CETESB, 2020), as the linear correlation of its data with those from D. Pedro II station is 0.95. For missing data on non-consecutive days, the previous day’s values were used. Tietê station had no ozone data, so these were supplemented by data from the nearby USP-Ipen station (green spot).
Table 1

Number of days with no data for each studied AQMS.

AQMSCOO3NO2NOPM10PM2.5
Tietê∗20∗0012
D. Pedro II4100010

Note: AQMS: Air Quality Monitoring Station; Tietê: ring road; D. Pedro II: downtown; ∗ Tietê station has no O3 data and was replaced by data from USP-Ipen station.
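
The rule of carrying the previous day's value into non-consecutive gaps can be sketched as follows. This is only an illustration under stated assumptions: the function name `fill_isolated_gaps` is hypothetical, missing days are represented by `None`, and "non-consecutive" is interpreted as a single missing day with valid neighbours. Runs of consecutive missing days are left untouched, since the text handles those by substituting data from another station.

```python
from typing import List, Optional


def fill_isolated_gaps(series: List[Optional[float]]) -> List[Optional[float]]:
    """Fill isolated one-day gaps with the previous day's value.

    Hypothetical helper: runs of two or more missing days stay as None.
    """
    filled = list(series)
    n = len(series)
    for i, value in enumerate(series):
        if value is not None:
            continue
        prev_missing = i > 0 and series[i - 1] is None
        next_missing = i + 1 < n and series[i + 1] is None
        # Carry the previous value forward only when the gap is a single day
        # and a previous day actually exists.
        if i > 0 and not prev_missing and not next_missing:
            filled[i] = series[i - 1]
    return filled


# Illustrative values only, not data from the study.
daily = [21.0, None, 25.0, None, None, 30.0]
print(fill_isolated_gaps(daily))
```

In this sketch the two-day gap stays marked as missing, so that a separate step (for example, substitution from a well-correlated station) can handle it.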


Artificial Neural Networks

The four ANN used in this study are described below (further details in Araujo et al. (2020)).

Multilayer Perceptron overview

The Multilayer Perceptron (MLP) is a neural model able to map any nonlinear, continuous, bounded, and differentiable function with arbitrary precision, which makes it a universal approximator (Haykin, 2008). The basic building block of an ANN is the artificial neuron, a functional unit responsible for processing information and providing the output response (de Castro, 2007). In an MLP, the neurons are distributed in three kinds of layers. The input layer transmits the data to the intermediate (hidden) layers, where the neurons perform a nonlinear transformation, mapping the input signal to another space. Then, the signal is sent to the output layer, in which the output signal is generated, in most cases by a linear combination. Neurons within the same layer are not connected to each other, while neurons in adjacent layers are fully connected, since this is a feedforward model (Siqueira and Luna, 2019). Training a neural model means using an algorithm to determine its free parameters, that is, to adjust the neurons’ weights. The best-known way to solve this task in an MLP is the backpropagation algorithm, a general iterative tool based on steepest descent, a first-order unconstrained nonlinear optimization method. In this case, the method reduces the mean square error between the desired response and the output of the network (Haykin, 2008). In this work, however, we use a second-order method whose computational cost is similar to that of first-order methods: the Modified Scaled Conjugate Gradient (MSCG) (dos Santos and Von Zuben, 1999). The stop criterion in training is the maximum number of iterations. We also use the hold-out cross-validation method to determine the topology (number of neurons in the hidden layer) and avoid overfitting (Haykin, 2008).
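
The layer structure described above can be illustrated with a minimal forward pass for a single-hidden-layer MLP. This is a sketch only: the hyperbolic tangent hidden activation and the linear output follow the description in the text, but the function name `mlp_forward` and all weight values are hypothetical, and training (MSCG in the paper) is not shown.

```python
import math
from typing import List


def mlp_forward(
    x: List[float],
    w_hidden: List[List[float]],  # one row of input weights per hidden neuron
    b_hidden: List[float],
    w_out: List[float],           # one output weight per hidden neuron
    b_out: float,
) -> float:
    """Forward pass: tanh hidden layer followed by a linear output neuron."""
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Illustrative weights for 2 inputs and 2 hidden neurons (not fitted values).
y_hat = mlp_forward(
    x=[0.5, -1.0],
    w_hidden=[[0.2, -0.4], [0.7, 0.1]],
    b_hidden=[0.0, 0.1],
    w_out=[1.5, -0.5],
    b_out=0.2,
)
print(y_hat)
```

Each hidden neuron reads every input and the output reads every hidden neuron, while neurons in the same layer never exchange information, which is the feedforward connectivity described in the text.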

Radial basis function

The Radial Basis Function networks (RBF) are a well-known ANN model. Like the MLP, they are feedforward architectures and universal approximators, but they present only two layers of neurons (Siqueira and Luna, 2019). The first, intermediate layer performs a nonlinear input-output mapping using radial basis functions, such as the Gaussian function. The second, the output layer, produces the model’s response, similarly to the MLP (Haykin, 2008). The hidden neurons present two parameters: a center c (with the same dimension as the number of inputs) and a dispersion σ. Therefore, the output of each neuron is higher for inputs that are spatially closer to its center. The dispersion modulates the decay of the response with the distance between the inputs and the centers. The Gaussian function is the usual choice of radial basis function. A linear combiner is used to produce the output response (Siqueira and Luna, 2019). The training process of an RBF is performed in two steps. The first is the adjustment of the hidden neurons (centers and dispersions), a task performed by an unsupervised clustering method. In this work, we used the K-Medoids algorithm and assumed that all dispersions are the same (Haykin, 2008). The second step is the adjustment of the output neurons. A simple and efficient tool found in the literature is the Moore–Penrose inverse operator (Haykin, 2008).
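
The response of the hidden layer can be sketched as below. The Gaussian parametrisation exp(−‖x − c‖² / (2σ²)) is a common convention and is an assumption here, since the text does not give the exact formula. The function names are hypothetical, and the centers are supplied by hand rather than found by K-Medoids.

```python
import math
from typing import List


def gaussian_rbf(x: List[float], center: List[float], sigma: float) -> float:
    """Gaussian response: 1.0 at the center, decaying with squared distance."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def rbf_forward(
    x: List[float],
    centers: List[List[float]],
    sigma: float,          # one shared dispersion, as assumed in the text
    w_out: List[float],
    b_out: float = 0.0,
) -> float:
    """Hidden Gaussian layer followed by a linear combiner."""
    hidden = [gaussian_rbf(x, c, sigma) for c in centers]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Illustrative centers and weights only.
centers = [[0.0, 0.0], [1.0, 1.0]]
print(rbf_forward([0.1, 0.0], centers, sigma=0.5, w_out=[2.0, -1.0]))
```

An input near the first center activates mostly the first neuron, which is the locality property the text attributes to the center and dispersion parameters.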

Extreme Learning Machines

Extreme Learning Machines (ELM) are feedforward neural models with a single hidden layer (Huang et al., 2006, 2015). This structure is quite similar to the classic MLP, the training process being the main difference (Siqueira et al., 2018). In an ELM, the weights of the intermediate neurons are randomly generated and are not adjusted during training. The insertion of new neurons in the hidden layer leads to a decrease in the output error (Siqueira et al., 2012a). Training an ELM therefore reduces to finding the best set of weights for the output layer. The usual way to accomplish this is a least-squares solution, in particular the Moore–Penrose generalized inverse operation (Siqueira et al., 2018).
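
A minimal sketch of this training scheme, assuming NumPy is available, is given below. The hidden weights are drawn once at random and kept fixed, and only the output weights are obtained with the Moore–Penrose pseudoinverse. The function names, the uniform sampling range, the added output bias column, and the synthetic data are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def elm_fit(X: np.ndarray, y: np.ndarray, n_hidden: int, seed: int = 0):
    """Train an ELM: random fixed hidden layer, pseudoinverse output layer."""
    rng = np.random.default_rng(seed)
    # Hidden weights and biases are random and never adjusted afterwards.
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))
    b = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = np.tanh(X @ W + b)
    # Append a column of ones so the linear output layer has a bias term.
    H1 = np.hstack([H, np.ones((H.shape[0], 1))])
    beta = np.linalg.pinv(H1) @ y  # least-squares output weights
    return W, b, beta


def elm_predict(X: np.ndarray, W: np.ndarray, b: np.ndarray, beta: np.ndarray) -> np.ndarray:
    H = np.tanh(X @ W + b)
    H1 = np.hstack([H, np.ones((H.shape[0], 1))])
    return H1 @ beta


# Synthetic example with 3 inputs (not data from the study).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(50, 3))
y_demo = X_demo[:, 0] - 2.0 * X_demo[:, 1]
W, b, beta = elm_fit(X_demo, y_demo, n_hidden=20)
print(np.mean((y_demo - elm_predict(X_demo, W, b, beta)) ** 2))
```

Because no iterative optimisation is involved, the whole fit is a single linear solve, which is why the text singles out the training process as the main difference from the MLP.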

Echo State Networks

The Echo State Networks (ESN) are ANN architectures that are highly similar to the ELM in structure and training process. However, unlike the previously mentioned networks, the ESN is a recurrent model, since it presents feedback loops of information. In this case, the recurrence lies in the hidden layer, named the dynamic reservoir (Jaeger, 2001, 2002). Jaeger (2001, 2002) demonstrated that the reservoir is a nonlinear transformation influenced by the recent samples of the input signal, so that its weights can be chosen in advance if specific conditions are respected. In this work, we used the reservoir design of Jaeger (2001). As in the ELM, training consists of determining the weights of the output layer, which may be done using the Moore–Penrose generalized inverse operation (Siqueira et al., 2018).
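
The recurrence in the reservoir can be sketched as follows, assuming NumPy is available. The reservoir here is built with a common heuristic (uniform random weights rescaled to a spectral radius below one); this is an assumption for illustration and is not necessarily the Jaeger (2001) design used in the paper. All names and values are hypothetical.

```python
import numpy as np


def esn_states(U: np.ndarray, n_reservoir: int, spectral_radius: float = 0.9, seed: int = 0) -> np.ndarray:
    """Run the input sequence U (time x inputs) through a fixed random reservoir."""
    rng = np.random.default_rng(seed)
    W_in = rng.uniform(-1.0, 1.0, size=(n_reservoir, U.shape[1]))
    W = rng.uniform(-1.0, 1.0, size=(n_reservoir, n_reservoir))
    # Rescale the recurrent weights so the largest eigenvalue magnitude is < 1.
    W *= spectral_radius / np.max(np.abs(np.linalg.eigvals(W)))
    x = np.zeros(n_reservoir)
    states = []
    for u in U:
        # The new state depends on the current input and on the previous state.
        x = np.tanh(W_in @ u + W @ x)
        states.append(x)
    return np.array(states)


def esn_fit_readout(states: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Only the linear readout is trained, via the pseudoinverse."""
    return np.linalg.pinv(states) @ y


# Synthetic sequence with 2 inputs (not data from the study).
U_demo = np.random.default_rng(2).normal(size=(30, 2))
S = esn_states(U_demo, n_reservoir=10)
w_out = esn_fit_readout(S, U_demo[:, 0])
print(S.shape, w_out.shape)
```

The feedback term `W @ x` is what distinguishes this model from the ELM sketch: an input presented at one time step still influences the state at later steps.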

Computational details

The computational step involved the seven input variables mentioned above: number of new COVID-19 cases, partial lockdown level, maximum temperature, relative humidity, atmospheric pressure, wind speed, and global solar radiation. The desired signals (targets) were the concentrations of each air pollutant (CO, O3, NO2, NO, PM2.5, and PM10). We evaluated the performance in three configurations: with all inputs; without the number of new COVID-19 cases; and without both the number of new COVID-19 cases and the partial lockdown level. The aim was to analyze the robustness of the neural networks in predicting air quality according to COVID-19 variables while using a small database. All configurations included the meteorological variables. To perform the computational analysis, we separated the dataset into three subsets: Training: from January 01 to April 23, 2020 (114 samples); Validation: April 24 to May 03, 2020 (10 samples); Test: May 04 to May 13, 2020 (10 samples). The training subset is used to adjust the models, and the validation subset is used to check for overtraining and define the number of neurons in the intermediate layer. Finally, the test subset is used to evaluate the performance of the models. We also verified whether the use of the Z-score brings some performance gain. It is a mathematical treatment that makes the data series approximately stationary. Some studies have shown the importance of using such an approach (Kachba et al., 2020; Siqueira et al., 2018). To apply the Z-score, the mean is subtracted from each sample and the result is divided by the standard deviation. At the end of the ANN execution, the process is reversed to analyze the performances in the original domain. The number of neurons in the hidden layer was defined by empirical tests, varying from 3 to 100 neurons. The best number for each case was chosen based on the lowest Mean Square Error (MSE) in the test set. The number of neurons in the hidden layer of each neural model is given in Tables A1 and A2 in Appendix A.
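
The chronological split and the Z-score treatment can be sketched as below. The subset sizes (114, 10, and 10 samples) come from the text; the function names are hypothetical, and computing the mean and standard deviation on the training subset only is an assumption, since the text does not state which samples were used.

```python
import statistics
from typing import List, Sequence, Tuple


def chronological_split(samples: Sequence, n_train: int = 114, n_val: int = 10, n_test: int = 10):
    """Split time-ordered samples into training, validation, and test subsets."""
    train = list(samples[:n_train])
    val = list(samples[n_train:n_train + n_val])
    test = list(samples[n_train + n_val:n_train + n_val + n_test])
    return train, val, test


def zscore_fit(values: Sequence[float]) -> Tuple[float, float]:
    """Return the mean and (population) standard deviation of the values."""
    return statistics.fmean(values), statistics.pstdev(values)


def zscore_apply(values: Sequence[float], mean: float, std: float) -> List[float]:
    return [(v - mean) / std for v in values]


def zscore_invert(z_values: Sequence[float], mean: float, std: float) -> List[float]:
    """Map standardized values back to the original domain."""
    return [z * std + mean for z in z_values]


# Illustrative series of 134 daily values (not data from the study).
series = [float(day % 7) + 0.1 * day for day in range(134)]
train, val, test = chronological_split(series)
mean, std = zscore_fit(train)  # assumption: statistics from training data only
test_z = zscore_apply(test, mean, std)
print(len(train), len(val), len(test), zscore_invert(test_z, mean, std)[:2])
```

Inverting the transformation before computing errors is what allows the results with and without the Z-score to be compared in the same units.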
Table A1

Computational results for Tietê Station




CO
PM10
PM2.5
NNMSEMAEMAPENNMSEMAEMAPENNMSEMAEMAPE
Without Z-ScoreAll InputsELM30.0820.24127.33655302.98314.79241.05340118.0409.37968.917
ESN30.1040.27535.00710226.95312.54336.5297084.0898.19860.844
MLP50.0560.18921.05435125.51689.97022.795758.5976.12932.043
RBF900.1070.29041.93790361.60017.40071.1457116.6248.70082.812
Without COVIDELM30.0740.22929.13625363.39316.37247.74725121.7959.63371.572
ESN30.0680.22729.31317355.85216.80755.6391773.7597.76054.564
MLP70.0540.18920.9393228.69212.61932.008554.0205.72825.911
RBF900.1070.29041.91290361.75817.41471.17390116.6848.70882.843
Without COVID and LockdownELM150.1960.35541.28420441.15718.46259.69080142.55510.54876.858
ESN50.0720.21624.92830447.59219.18762.82560124.0089.23368.624
MLP50.1060.24824.8815274.82812.14529.0015073.1815.65522.787
RBF900.1070.29142.00290361.42917.54471.30290116.3378.72082.882
With Z-ScoreAll InputsELM200.1390.33437.91950332.81316.29455.8944598.0538.77859.417
ESN30.0900.24327.40920274.42913.73854.82235110.1668.71067.406
MLP50.0390.13516.13250172.99111.02730.508356.9846.05432.115
RBF500.1070.29041.93790361.60017.40071.1453116.6288.70582.814
Without COVIDELM30.0840.24226.93235472.05119.27765.40230118.1469.50569.580
ESN30.0910.25027.32145361.42617.39861.24325113.5018.46171.963
MLP50.0720.21423.7363229.29912.62432.513543.4845.49423.216
RBF900.1060.29041.85590361.76217.41471.17310115.4538.93082.345
Without COVID and LockdownELM550.2260.38945.34225445.11018.72758.36735140.3259.70172.964
ESN30.0950.26731.36660326.90416.26151.552888.2448.18051.650
MLP50.0960.23724.6605268.29312.73430.2271269.2105.75024.045
RBF900.1090.29242.10660364.25117.59171.33670116.4528.75082.979
NO2NOO3
NNMSEMAEMAPENNMSEMAEMAPENNMSEMAEMAPE
Without Z-ScoreAll InputsELM3570.14318.94926.89655078.08959.08050.7013152.6669.65114.731
ESN3523.30317.20427.1921215628.355107.259111.7613175.9358.48714.828
MLP70608.07819.19819.419703433.16742.27227.680399.3018.25911.462
RBF5886.85925.28536.02158659.76776.705156.9273302.23611.99620.629
Without COVIDELM3510.55419.69124.813349881.516137.9641172.2313276.65713.78118.821
ESN3627.17221.81532.491378881.981219.920827.99415586.09920.91330.830
MLP55353.00016.11618.310259192.32042.76466.78870123.5468.23313.090
RBF7867.11624.99435.4273111554.876285.1362637.35080301.88111.90720.519
Without COVID and LockdownELM3749.90424.51833.48536961.11172.55473.9853131.2999.39412.071
ESN151861.37636.21158.4794525439.828123.837229.8983269.78113.05216.959
MLP3851.31125.37026.479504811.99953.29037.48990101.5706.85711.121
RBF90898.97025.54036.59439205.44581.435160.59590301.79611.91420.527
With Z-ScoreAll InputsELM3487.35418.48621.64936057.23963.55591.5203136.7319.54615.309
ESN3495.24717.89526.537813938.59098.401127.9127238.87613.11718.446
MLP40582.09118.97718.885804080.49449.61028.74117113.7588.88512.352
RBF10889.71325.35836.39638805.06777.825158.0193302.23511.98520.620
Without COVIDELM3474.97915.93620.734360013.692214.7821084.4308264.53813.59718.703
ESN3842.93725.15936.858380788.855203.8751247.2375241.88814.19618.513
MLP45533.28116.09015.52437461.86040.34265.18990127.8659.10614.463
RBF12804.60524.27635.6283109705.490283.4272652.25590302.14211.93320.552
Without COVID and LockdownELM3695.47218.53427.74835391.00757.43858.1663131.2999.39412.071
ESN601887.73036.05158.7004025912.029124.420240.3933347.06515.19919.339
MLP5763.58223.18722.621804334.74549.24833.71290134.0228.17912.276
RBF90898.80325.53536.590909225.29478.847160.69690300.18411.80820.385

NN: Number of neurons; MSE: Mean Square Error; MAE: Mean Absolute Error; MAPE: Mean Absolute Percentage Error; ∗With COVID means including the number of COVID-19 new cases and the partial lockdown.

Table A2

Computational results for D. Pedro II Station




CO
PM10
PM2.5
NNMSEMAEMAPENNMSEMAEMAPENNMSEMAEMAPE
Without Z-ScoreAll InputsELM50.1450.27446.4483201.22111.96549.041351.3035.93539.177
ESN30.2360.446175.795371.1567.84732.640357.3546.08869.635
MLP30.1330.25745.423562.8326.93628.881318.3053.32319.678
RBF30.2060.420175.73890232.00013.80067.8029061.6506.70071.874
Without COVIDELM30.0880.22060.2533167.63510.46934.639367.7766.73842.641
ESN120.3030.467206.96315417.39417.72393.584588.0448.54779.455
MLP450.1010.24242.6661573.6387.07926.9226018.6833.57923.942
RBF30.2040.419174.98390232.20613.80567.8109061.6976.70271.883
Without COVID and LockdownELM30.1190.25159.4993378.42815.71444.049366.1766.59241.157
ESN70.3130.478185.8493252.99813.81468.271382.4557.50775.413
MLP450.1110.27655.5484079.9128.11333.1204027.4144.67525.319
RBF50.1980.414172.17694232.42113.81067.8199061.7136.70371.886
With Z-ScoreAll InputsELM30.0960.22739.3503127.3209.33733.624333.0775.20048.258
ESN70.3400.511201.2903218.38912.56860.732354.3026.17667.292
MLP30.0690.20048.4808076.8437.22024.859315.8063.58228.000
RBF50.2050.420175.62790232.00013.80067.8026061.6506.70071.874
Without COVIDELM50.1230.26867.0113101.6298.64135.393354.7816.22742.818
ESN200.3920.543239.6213294.93214.77873.650388.5927.84162.774
MLP250.1170.24048.97610100.5197.81430.3588018.8723.39522.804
RBF30.2030.419174.83190232.20613.80567.8109061.6976.70271.883
Without COVID and LockdownELM30.1060.28147.5893280.54313.43854.251342.7475.44338.441
ESN150.3900.550239.7963325.95014.60866.2765120.1888.98696.500
MLP120.1010.26953.2398079.4218.15633.8031226.3964.35028.352
RBF30.1990.416171.07990232.42913.81067.8209061.7136.70371.886

NO2NOO3
NNMSEMAEMAPENNMSEMAEMAPENNMSEMAEMAPE

Without Z-ScoreAll InputsELM5785.09020.35330.175104839.14650.450153.3337467.37515.32234.491
ESN5907.30824.26340.671104772.46358.378373.8403353.38515.16332.252
MLP70557.77817.58923.740502657.19238.07281.0157138.8939.42216.985
RBF5854.31325.36251.06955416.42668.009486.65717454.93117.02036.846
Without COVIDELM3445.28217.29323.02774340.59857.558348.301563.0555.99712.454
ESN3822.07525.26052.41635996.85571.571445.1883260.12212.66926.255
MLP35464.06517.07221.11383924.37345.39686.40445120.8249.14316.909
RBF5854.78425.51751.09255461.89468.337488.8023391.84515.01133.241
Without COVID and LockdownELM71064.27527.72657.58933991.35754.068113.3793137.1919.22917.298
ESN101555.61431.14075.076158592.13980.821675.729401214.00029.87759.143
MLP40295.49614.26922.207124320.33848.81591.6353206.56612.42021.391
RBF90868.48725.61251.959905472.40168.400490.9333358.35012.87229.566
With Z-ScoreAll InputsELM3663.13216.81219.98074307.80550.687159.3213115.7699.34818.562
ESN31070.91126.30335.021305849.48067.123420.6553489.52418.49437.389
MLP60493.31918.01523.994353274.15646.06684.6614086.1997.35413.290
RBF7864.57925.50951.69035402.14667.809477.88320456.53417.05836.916
Without COVIDELM5394.52916.52229.06053768.17854.428120.683343.5015.67310.074
ESN3902.43426.09951.69035501.00568.838471.5503260.47111.80125.214
MLP20228.20712.51820.77773911.06047.17076.890589.3217.70314.095
RBF12859.82425.42552.14735401.73067.974478.2343391.50014.99833.216
Without COVID and LockdownELM3782.71124.45239.24635041.66161.826202.7193107.9107.83816.488
ESN401871.58236.53783.638128067.63380.195594.8013389.73514.92028.355
MLP55346.73516.07626.94074490.20751.74787.5433161.22410.40219.405
RBF90868.48425.61251.959405471.10068.391490.9415369.72913.68131.046

NN: Number of neurons; MSE: Mean Square Error; MAE: Mean Absolute Error; MAPE: Mean Absolute Percentage Error; ∗With COVID means including the variables number of COVID-19 new cases and the partial lockdown.

Following the literature, we adopted the MSE as the most important error metric because it is the quantity reduced during the training (adjustment) of the neural models (Araujo et al., 2020; Kachba et al., 2020; Siqueira et al., 2014, 2018, 2020). The artificial neurons in the intermediate layer of the MLP, ELM, and ESN use the hyperbolic tangent as the activation function. In the RBF, the Gaussian function is used. The MLP is trained with the Modified Scaled Conjugate Gradient (MSCG), using a maximum of 500 iterations as the stop criterion. The K-Medoids algorithm in the RBF stops after 10 iterations without modification in the position of the centroids (Figueiredo et al., 2019).
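
The three error metrics reported in Tables A1 and A2 can be computed as in the sketch below. The standard textbook definitions are assumed, since the text does not spell out the formulas; in particular, the MAPE is taken as the mean of the absolute errors relative to the observed values, expressed in percent. Function names and values are illustrative.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Square Error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean Absolute Percentage Error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Illustrative observed and predicted concentrations (not study data).
observed = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mse(observed, predicted), mae(observed, predicted), mape(observed, predicted))
```

The MSE and MAE are in the units of the pollutant (squared for the MSE), whereas the MAPE is scale-free, which makes it the natural choice for comparing pollutants measured on different scales.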

Results and discussion

For simplicity, we divided this section into three parts: first, the descriptive analysis of the databases; second, the ANN prediction results; and lastly, the results for the hypothetical scenarios of 10%, 30%, 70%, and 90% lockdown.

Descriptive analysis

The daily concentrations during the studied period, together with the partial lockdown level, are shown in Appendix A - Figure A1. The São Paulo state government officially ordered lockdown on March 24, 2020; however, the population started to self-isolate voluntarily the week before (first available social isolation data: March 17, 2020). From March 17, 2020 to May 13, 2020, the lockdown varied between 38 and 59%, with an average of 51%.
Fig. A1

Concentrations of CO [ppm] (a), O3 [μg/m3] (b), NO2 [μg/m3] (c), NO [μg/m3] (d), PM2.5 [μg/m3] (e), and PM10 [μg/m3] (f) according to the date.

To visualize changes in air pollution levels due to voluntary self-isolation and/or lockdown, we compared the five-day average before voluntary self-isolation (12–16 March 2020) with a five-day average during self-isolation (17–21 March 2020) (Fig. 2). There is no distinctive change in pollutant levels within experimental error, as may be expected due to a lag in response and a low level of reduced activities. However, comparing a five-day average during the first lockdown period (54–56% lockdown from 24 to 28 March 2020) with the period before lockdown or self-isolation, we do observe a general decrease in pollutant levels for all pollutants at Tietê and for most at D. Pedro II, as shown in Fig. 3. As this period reflects the changes in activities during self-isolation together with the additional reduction of activities under lockdown, this finding is not surprising.
Fig. 2

Five-day average pollutant levels before and during the voluntary self-isolation period at Tietê station (a) and D. Pedro II station (b) (CO concentrations were multiplied by 100).

Fig. 3

Comparison of averages between five days of official lockdown and five days before lockdown for Tietê station (a) and D. Pedro II station (b) (CO concentrations were multiplied by 100).

From Figure A1 we observe that this trend continues until around April 24, after which the relaxation of lockdown rules corresponds to a steady increase in most of the pollutant levels. Not all pollutants appear to be similarly influenced by the lockdown. The particulate matter concentration appears to be influenced by other factors as well, and towards the end of the lockdown period discussed here it reaches much higher values than before the lockdown. The ozone levels generally increased as the lockdown percentage increased. Using a neural network to study atmospheric ozone formation in the Metropolitan Area of São Paulo (MASP), Guardani et al. (1999) found that temperature was the main factor affecting ozone formation and observed higher ozone levels in regions characterized by lower emission levels of ozone precursors. Martins and Andrade (2008) evaluated the potential of VOCs for ozone formation using a three-dimensional air quality model and found that ozone in the MASP is VOC-limited, as commonly observed in urban areas (Li et al., 2019; Siciliano et al., 2020; Tobías et al., 2020). Under these conditions, a decrease of NOx can reduce the removal of O3 through NOx titration and/or the effect of radical terminating reactions, thereby increasing O3 formation (Seinfeld and Pandis, 2016; Sillman, 1999, 2003). Furthermore, Andrade et al. (2017), studying the MASP, explain that decreasing NOx and CO emissions simultaneously contributes to higher ozone levels. This behavior is also reported elsewhere (Gentner et al., 2009; Harley et al., 2005; Marr and Harley, 2002; Stedman, 2004). 
Table 2 presents the linear correlations between the lockdown level (varying from 38 to 59%) and air pollutant concentrations at Tietê and D. Pedro II stations from March 17, 2020 (the first day with available social isolation data) to May 13, 2020. Except for ozone, all pollutants correlated negatively with the lockdown level (ranging from −0.14 for CO at D. Pedro II to −0.60 for NO at Tietê).
Table 2

Linear correlations between lockdown and studied air pollutant concentrations for March 17, 2020 to May 13, 2020.

Station       CO      O3      NO2     NO      PM10    PM2.5
Tietê         −0.45   0.15    −0.57   −0.60   −0.34   −0.38
D. Pedro II   −0.14   0.11    −0.42   −0.33   −0.23   −0.26

Note: ∗Data from USP-Ipen station.
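The coefficients in Table 2 are described only as linear correlations. Assuming they are Pearson coefficients (the text does not name the estimator), a minimal sketch of the calculation is given below. The function name `pearson_r` and the sample values are illustrative and are not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson linear correlation coefficient between two equal-length series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and of squared deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative values, NOT measurements from the study.
lockdown_pct = [40.0, 45.0, 50.0, 55.0, 59.0]
no_ugm3 = [120.0, 110.0, 90.0, 95.0, 70.0]

r = pearson_r(lockdown_pct, no_ugm3)
print(f"r = {r:.2f}")  # negative: concentration falls as lockdown rises
```

A negative coefficient, as found for every pollutant except ozone, means that the concentration tends to fall as the lockdown level rises.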

Finally, Figure A2 (Appendix A) shows the number of newly reported COVID-19 cases per day. COVID-19 cases were first registered on February 25, 2020, and an exponential increase is observed from the beginning of April onwards.
Fig. A2

Number of COVID-19 new cases by day.

ANN estimation analysis

Tables 3 and 4 contain the average and standard deviation of each pollutant level for the three subsets (training, validation, and test) at the two sites. Although the two monitoring sites are in the same city, the descriptive statistics show significant differences. Tietê station (near highways) has higher average concentrations for all pollutants than D. Pedro II station (populated city area). That the model achieved a MAPE of ∼30% at two sites with such different statistical profiles indicates that the evaluation is robust.
Table 3

Average and standard deviation for each studied pollutant for the 3 subsets (Tietê Station).

                  Training              Validation            Test
Pollutant         Average   Std. Dev.   Average   Std. Dev.   Average   Std. Dev.
CO [ppm]          0.69      0.29        0.93      0.46        0.85      0.34
O3 [μg/m3]        70        28          98        21          74        14
NO2 [μg/m3]       68        24          86        31          88        31
NO [μg/m3]        91        69          124       89          151       84
PM2.5 [μg/m3]     13        5.5         24        12          20        9.7
PM10 [μg/m3]      22        8.2         43        19          38        19
Table 4

Average and standard deviation for each studied pollutant for the 3 subsets (D. Pedro II Station).

                  Training              Validation            Test
Pollutant         Average   Std. Dev.   Average   Std. Dev.   Average   Std. Dev.
CO [ppm]          0.30      0.15        0.62      0.40        0.50      0.36
O3 [μg/m3]        65        24          81        19          59        14
NO2 [μg/m3]       43        17          60        33          64        31
NO [μg/m3]        21        19          51        63          75        76
PM2.5 [μg/m3]     12        4.7         20        8.8         16        7.8
PM10 [μg/m3]      19        7.3         37        14          31        16
Tables A1 and A2 (Appendix A) display the ANN computational results for AQMS Tietê (ring road station) and AQMS D. Pedro II (densely populated city area station), respectively. For each case, the best (lowest Mean Square Error, MSE) of 30 independent executions was considered (de Castro, 2007; Haykin, 2008; Siqueira et al., 2018). The shaded values indicate the results with the best performance (lowest MSE). The MLP neural model achieved the best results in all cases except O3 at D. Pedro II station, which was best estimated by the ELM neural model. This is an important observation, as there is no consensus about which ANN is best, and it agrees with the results of Polezer et al. (2018) and Araujo et al. (2020), both applied to air pollution epidemiological studies.
It is important to highlight that the best overall ANN results were achieved when the variables “number of new COVID-19 cases” and “partial lockdown” were both included (8 out of 12 cases). The remaining 4 cases (NO2 and PM2.5 at Tietê, and NO2 and O3 at D. Pedro II) showed the best result considering only “partial lockdown”. In both scenarios the meteorological variables were included. To establish whether Z-score normalization could improve performance, the ANNs were also run on Z-score-normalized data (results shown in Tables A1 and A2). The Z-score proved beneficial in two cases at the Marginal Tietê station and four cases at the D. Pedro II site, so it can be considered as a way to improve the quality of the ANN results.
Figs. 4 and 5 show the observed (continuous red line) and best estimated (dashed blue line) concentration levels of CO (a), O3 (b), NO2 (c), NO (d), PM2.5 (e), and PM10 (f) at Tietê and D. Pedro II stations, respectively, during the period 4–13 May 2020. The lockdown level is indicated as shaded bars.
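The Z-score step mentioned above can be sketched as follows. This is a minimal illustration and not the authors' code. It assumes the mean and standard deviation are estimated on the training subset only and then reused for the other subsets, which is common practice but is not stated in the text. The helper names and sample values are hypothetical.

```python
import statistics


def zscore_fit(values):
    """Estimate the mean and (population) standard deviation of a series."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("constant series cannot be z-scored")
    return mean, std


def zscore_apply(values, mean, std):
    """Map raw values to z-scores using previously fitted statistics."""
    return [(v - mean) / std for v in values]


def zscore_invert(z_values, mean, std):
    """Map z-scores (e.g. network outputs) back to the original units."""
    return [z * std + mean for z in z_values]


# Made-up CO-like values in ppm, for illustration only.
train = [0.5, 0.7, 0.9, 0.6, 0.8]
test = [0.85, 0.40]

mean, std = zscore_fit(train)              # statistics come from training data
train_z = zscore_apply(train, mean, std)
test_z = zscore_apply(test, mean, std)     # reuse the training statistics
restored = zscore_invert(test_z, mean, std)
print(train_z, test_z, restored)
```

Predictions made on the normalized scale must be inverted before errors such as MSE or MAPE are reported in concentration units.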
Fig. 4

Best estimation to predict CO (a), O3 (b), NO2 (c), NO (d), PM2.5 (e), and PM10 (f) levels for Tietê station. Predictions are in dashed lines and observed levels in solid lines. The bars are the partial lockdown.

Fig. 5

Best estimation to predict CO (a), O3 (b), NO2 (c), NO (d), PM2.5 (e), and PM10 (f) levels for D. Pedro II station. Predictions are in dashed lines and observed levels in solid lines. The bars are the partial lockdown.

In general, the predictions obtained with this approach captured the tendencies of the original data reasonably well, with a mean absolute percentage error (MAPE) of about 30% in almost all cases. The exceptions were CO (48%) and NO (81%) at D. Pedro II station (see Tables A1 and A2, Appendix A). Two distinct behaviors can be seen during the test-set period (see Figs. 4 and 5). While the lockdown level remains unchanged (first five days), the main influence can be ascribed to the meteorological variables (Figure A3 shows the raw meteorological data for the test period). After these five days, however, the lockdown percentage jumps from 46% to 53% in two days. As the temperature and relative humidity were relatively stable in the last five days, the lockdown can be regarded as the main contributor to the change in air pollutant levels. Note that the ozone concentration is consistently related to solar irradiation, with similar profiles. This behavior agrees with that observed at the beginning of lockdown (March 17, 2020), as mentioned in Section 3.1. The data displayed here make evident the importance of maintaining continuous and consistent interventions to curb air pollution. This is particularly important during extreme air pollution events, and the evidence indicates that lockdown measures reduce air pollution levels almost immediately.
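For reference, the two error measures used in this section can be computed as sketched below, using the usual definitions: MSE as the mean of the squared residuals and MAPE as the mean of |observed − predicted| / |observed|, expressed in percent. The function names and sample values are illustrative and are not data from the study.

```python
def mse(observed, predicted):
    """Mean square error between observed and predicted series."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    if any(o == 0 for o in observed):
        raise ValueError("MAPE is undefined when an observed value is zero")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


# Made-up concentrations; every prediction is off by 30% of the observed value.
observed = [100.0, 200.0, 50.0]
predicted = [130.0, 140.0, 65.0]

print(f"MSE  = {mse(observed, predicted):.1f}")
print(f"MAPE = {mape(observed, predicted):.1f}%")
```

Because MAPE divides by the observed value, it can become large when concentrations are low, even if the absolute error is small.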
Fig. A3

Meteorological variables raw data for the test set.

Each ANN architecture has advantages and disadvantages. As discussed in Section 2.1, the ESN is a recurrent model, with feedback loops of information in its hidden layer. This characteristic may be relevant for data processing, since more information is available to form the output response. Additionally, the ESN and the ELM require less computational effort to train than the RBF and MLP: their hidden layers are not modified, so no iterative process is needed to adjust the hidden weights. Other works have also shown that such models can outperform traditional, fully trained architectures (Araujo et al., 2020; Siqueira et al., 2012a, 2014, 2018). Despite these advantages and the good results reported in the literature for ESN, ELM, and RBF (Siqueira et al., 2012b, 2018), the MLP errors were smaller than those of the other models. It seems clear that adjusting the hidden weights is an important step in nonlinear mapping applications such as the one presented in this investigation. In this case, there is a set of inputs of different natures (for example, temperature, humidity, and partial lockdown), and mapping these values to another variable is not a trivial task (Kachba et al., 2020; Polezer et al., 2018).
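To illustrate why the ELM needs no iterative training, the sketch below fixes random hidden weights and obtains only the output weights, from a single linear least-squares solve. It is a toy implementation under stated assumptions and not the authors' model: the tanh activation, the small ridge term (added for numerical stability), the hidden-layer size, and the synthetic data are all illustrative choices.

```python
import math
import random


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hidden_features(x, weights, biases):
    """tanh hidden layer with fixed random weights, plus a constant bias feature."""
    acts = [math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, biases)]
    return acts + [1.0]


def train_elm(inputs, targets, n_hidden=10, ridge=1e-3, seed=0):
    """Draw the hidden layer at random, then solve for the output weights once."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    h = [hidden_features(x, weights, biases) for x in inputs]
    k = n_hidden + 1
    # Ridge-regularized normal equations: (H^T H + ridge * I) beta = H^T y
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(h, targets)) for i in range(k)]
    return weights, biases, solve_linear(gram, rhs)


def predict_elm(model, x):
    weights, biases, beta = model
    return sum(b * f for b, f in zip(beta, hidden_features(x, weights, biases)))


# Synthetic toy data (not from the study): y = 2*x1 - x2.
rng = random.Random(1)
xs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(60)]
ys = [2.0 * a - b for a, b in xs]

model = train_elm(xs, ys)
train_mse = sum((predict_elm(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(ys)
print(f"training MSE: {train_mse:.4f}")
```

An MLP, by contrast, adjusts the hidden weights iteratively as well, which costs more computation but, as reported above, gave smaller errors on this data set.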

Hypothetical scenarios

To predict the impact of the partial lockdown on air quality, four hypothetical scenarios were modeled: a minimum lockdown level (10%); possible vertical isolation, applied only to COVID-19 high-risk groups such as over-60s and people with chronic diseases, diabetics, among others (30%); the lockdown percentage considered ideal (70%); and an extreme isolation action (90%). The results are compared in Figs. 6 and 7 for AQMS Tietê and AQMS D. Pedro II, respectively. The red lines correspond to 10% lockdown, the pink lines to 30% lockdown, the blue lines to 70% lockdown, and the green lines to 90% lockdown. The pollutant designation (a–f) is the same as in Figs. 4 and 5.
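The scenario procedure described above amounts to re-running a fitted model with the lockdown input overridden while every other input keeps its observed value. A minimal sketch follows. It assumes a generic `predict` callable. `toy_predict`, the input names, and the numbers are hypothetical placeholders and do not represent the trained networks or the study data.

```python
def run_scenarios(predict, base_inputs, lockdown_levels, lockdown_key="lockdown_pct"):
    """Re-run a fitted predictor with the lockdown input overridden, all else unchanged."""
    results = {}
    for level in lockdown_levels:
        series = []
        for row in base_inputs:
            scenario_row = dict(row)           # copy so observed inputs are not mutated
            scenario_row[lockdown_key] = level
            series.append(predict(scenario_row))
        results[level] = series
    return results


def toy_predict(row):
    """Placeholder for a trained ANN: an arbitrary linear rule, NOT the paper's model."""
    return 100.0 - 0.5 * row["lockdown_pct"] + 1.0 * row["temperature_c"]


# Made-up daily inputs; a real run would carry all model inputs for each day.
observed_inputs = [
    {"lockdown_pct": 46.0, "temperature_c": 20.0},
    {"lockdown_pct": 53.0, "temperature_c": 18.0},
]

scenarios = run_scenarios(toy_predict, observed_inputs, [10, 30, 70, 90])
for level, series in scenarios.items():
    print(f"{level}% lockdown -> {series}")
```

Because only the lockdown input changes, any difference between scenario curves on a given day reflects the model's learned response to lockdown under that day's meteorology.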
Fig. 6

Hypothetical scenarios considering the impact that 10% (red line), 30% (pink line), 70% (blue line), and 90% (green line) lockdown would have on CO (a), O3 (b), NO2 (c), NO (d), PM2.5 (e), and PM10 (f) levels for AQMS Tietê. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

Fig. 7

Hypothetical scenarios considering the impact that 10% (red line), 30% (pink line), 70% (blue line), and 90% (green line) lockdown would have on CO (a), O3 (b), NO2 (c), NO (d), PM2.5 (e), and PM10 (f) levels for AQMS D. Pedro II. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

The data in Fig. 6 (Tietê station) indicate that, in general, higher concentrations are predicted for all pollutants at 10% (red line) and 30% (pink line) lockdown. A different pattern is observed for 7 and 8 May, when the lower lockdown levels also yielded low predicted pollutant concentrations. During these two days, the meteorological conditions changed abruptly (low temperature and solar irradiation, and high relative humidity; see Figure A3). This scenario exemplifies the complex dependence of air pollutant levels on several variables. These findings suggest that when abrupt changes in weather conditions are forecast, lockdown interventions should begin a few days earlier. Our data agree with Hong et al. (2019), who reported that extreme weather events might be a crucial mechanism by which air quality is influenced. The predicted ozone concentration at Tietê station (Fig. 6b) for the 30% lockdown behaved unexpectedly, with higher concentrations than for the 70% and 90% lockdowns. This may be a consequence of the complexity of the variables that influence air quality. Although it may be seen as a poor model fit, we emphasize that it is one case out of twelve.
Although the same abrupt change in meteorological conditions was observed on 7 and 8 May at the D. Pedro II station (Fig. 7), the ANN estimated the response more coherently than for the Tietê station. This may be due to other factors influencing the air pollutant levels at this station. Note that the ozone profiles are as expected, especially for a 10% lockdown. It is important to highlight that the ANN predictions were good even though only one of the seven inputs was changed. We also observe that the particulate matter levels are not greatly influenced by lockdown (as reported by Nakada and Urban, 2020), especially the PM10 concentration. At the D. Pedro II station, the PM2.5 levels also remain very similar regardless of the lockdown level. We acknowledge that air pollutant levels are determined by a complex set of variables, and that even a powerful tool such as an ANN cannot always predict them accurately. However, the data presented here provide adequate evidence that ANN can be used successfully to estimate the impact that different levels of lockdown will have on air quality.

Conclusion

Artificial Neural Networks were able to predict how changes in the level of lockdown affected air quality in São Paulo City. Even with a restricted data set of pollutant levels and meteorological information, the ANN achieved a Mean Absolute Percentage Error (MAPE) of around 30%. Applying the ANN to four hypothetical lockdown scenarios (10%, 30%, 70%, and 90%) revealed the complexity of the estimation problem, which arises from abrupt meteorological changes. For the first time, ANN were used as a tool to describe the balance among air pollution, COVID-19 cases, and partial lockdown, an approach that can be employed in several national contexts. Its predictive power allows governmental bodies and policy makers to manage lockdown responsibly while minimizing economic impact. The method can support improved air pollution control measures (and potentially reduce COVID-19 mortality) by identifying a lockdown level that still sustains sufficient economic activity. Furthermore, in light of the global drive to improve air quality and work towards zero emissions, this approach could also be used in the future to reach emission target levels.

CRediT author statement

Yara S. Tadano: Conceptualization, Methodology, Data curation, Investigation, Validation, Formal analysis, Writing - original draft, Writing - review & editing. Sanja Potgieter-Vermaak: Writing - review & editing. Yslene R. Kachba: Conceptualization, Data curation. Daiane M.G. Chiroli: Writing - original draft. Luciana Casacio: Writing - review & editing. Jéssica C. Santos-Silva: Data curation, Writing - original draft. Camila A.B. Moreira: Data curation, Investigation. Vivian Machado: Visualization, Validation. Thiago Antonini Alves: Data curation, Visualization, Writing - original draft. Hugo Siqueira: Conceptualization, Methodology, Software, Investigation, Validation, Formal analysis, Writing - original draft. Ricardo H.M. Godoi: Formal analysis, Writing - review.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.