Qinhuizi Wu1, Tao Li1, Shifu Zhang1, Jianbo Fu1, Barnabas C Seyler1, Zihang Zhou2, Xunfei Deng3, Bin Wang1, Yu Zhan1
1. Department of Environmental Science and Engineering, Sichuan University, Chengdu, Sichuan, 610065, China.
2. Chengdu Academy of Environmental Sciences, Chengdu, Sichuan, 610072, China.
3. Institute of Digital Agriculture, Zhejiang Academy of Agricultural Sciences, Hangzhou, Zhejiang, 310021, China.
Abstract
Meteorological normalization refers to the removal of meteorological effects on air pollutant concentrations for evaluating emission changes. Various meteorological normalization methods currently exist, and they yield inconsistent results. This study aims to identify the state-of-the-art method of meteorological normalization for characterizing the spatiotemporal variation of NOx emissions caused by the COVID-19 pandemic in China. We obtained hourly data of NO2 concentrations and meteorological conditions for 337 cities in China from January 1, 2019, to December 31, 2020. Three random-forest-based meteorological normalization methods were compared: (1) the method that resamples only meteorological variables, (2) the method that resamples meteorological and temporal variables, and (3) the method that requires no resampling, denoted as Resample-M, Resample-M&T, and Resample-None, respectively. The comparison shows that Resample-M&T considerably underestimated the NOx emission reduction during the lockdowns, Resample-None generated widely fluctuating estimates that blurred the emission recovery trend during work resumption, and Resample-M clearly delineated the emission changes over the entire period. Based on the Resample-M results, the maximum emission reduction occurred during January to February 2020 for most cities, with an average decrease of 19.1 ± 9.4% compared to 2019. From April 2020, when work resumption began, to the end of 2020, the emissions rapidly bounced back for most cities, with an average increase of 12.6 ± 15.8% relative to those during the strict lockdowns. Consequently, we recommend using Resample-M for meteorological normalization, and the normalized NO2 concentration dynamics for each city provide important implications for future emission reduction.
The impact of COVID-19 pandemic control on air quality has been extensively evaluated. In 21 major cities around the world, the daily NO2 concentrations decreased by 3–58% during lockdowns (Benchrif et al., 2021; Singh et al., 2021). In six Chinese megacities, the NO2 concentrations declined by 36–53% during lockdowns (Wang et al., 2020). Since April 2020, most cities in China have lifted strict lockdowns, and economic activities gradually recovered during work resumption. It is valuable to quantitatively assess the spatiotemporal dynamics of air pollutant emissions before, during, and after lockdowns, which will provide insights for future air quality management. The simplest method is to compare the air quality monitoring data in the corresponding periods from adjacent years (Marlier et al., 2020; Wu et al., 2021). Nevertheless, the results obtained by this method are unreliable due to the confounding effects of varying meteorological conditions (Falocchi et al., 2021; Zhao et al., 2020; Zheng et al., 2021b). In previous studies of emission assessment based on monitoring data, the meteorological effects were removed by using a counterfactual operation to address the question: What would the air pollutant concentrations be like if the meteorological conditions were the same throughout an episode (Grange et al., 2018)? This counterfactual operation is commonly referred to as meteorological normalization, which requires statistical models to link air pollutant concentrations with meteorological conditions.
Decision tree-based machine learning algorithms, including random forests and boosted regression trees, are commonly employed for meteorological normalization (Grange and Carslaw, 2019; Grange et al., 2018; Mallet, 2021; Qu et al., 2020). As ensemble models, random forests and boosted regression trees usually consist of hundreds of decision trees, which recursively split the data at internal nodes based on certain rules. Compared to parametric statistical models, these decision tree-based models are superior in handling complicated nonlinearity and interaction, which are essential for accurately simulating the temporal variation of air pollution (Ryan et al., 2021). A model of boosted regression trees was developed to remove the meteorological effects on the temporal variation of PM2.5 to quantitatively evaluate the effectiveness of clean air policies in the Beijing-Tianjin-Hebei region (Qu et al., 2020). In addition, meteorological normalization based on random forests was implemented to assess the impact of COVID-19 lockdowns on air quality in six major Chinese cities (Wang et al., 2020). The results showed that lockdowns reduced NO2 concentrations by 36–53% compared to the business-as-usual (BAU) scenario. It is noteworthy that random forests are more robust than boosted regression trees regarding hyperparameter settings (Callens et al., 2020; Zaborski et al., 2019). Thus, meteorological normalization offers a convenient and rapid way to quantitatively evaluate the effects of policies or big events on air quality.
Three main methods exist for meteorological normalization based on random forests (Grange et al., 2018; Vu et al., 2019; Wang et al., 2020). As the common basis of these methods, a random forest is developed to model the air pollutant concentrations based on the meteorological conditions and temporal indicators (e.g., day of year). For the first method, the random forest is trained with the data of the whole study period, and it then makes predictions on a new dataset generated by resampling the meteorological data across all the samples (Vu et al., 2019). The second method differs from the first only in the resampling step, in which all data for meteorological conditions and temporal indicators except for the trend term are resampled (Grange et al., 2018). For the third method, the random forest is trained with the data for the subperiod before a specified moment (e.g., the beginning of lockdowns) and then used to make predictions with the data for the subperiod after that moment (Wang et al., 2020). While all three methods seem theoretically acceptable, they may yield divergent results when evaluating the effects of policies or big events on emissions or air quality.
This study aims to evaluate the temporal trends of NOx emissions before, during, and after the COVID-19 lockdowns in China by comparing the three meteorological normalization methods based on random forests. As NO2 has a relatively short residence time in the atmosphere (Petetin et al., 2020), the meteorologically normalized concentrations could generally reflect the local emissions. Atmospheric NO2 mainly originates from primary emissions of NO2 and oxidation of NO (Zyrichidou et al., 2015). The meteorologically normalized NO2 concentrations thus indicate the emission change of NOx. Comparing the three methods enables identification of the state-of-the-art method for practical applications, covering the predicted trends of NOx emissions and the effectiveness of removing the meteorological effects. By using the most effective method, we could then characterize the magnitude and date of the maximum reduction in emissions for each city during the lockdowns, and we could investigate the spatiotemporal patterns of the emission recovery for China during work resumption. The results of this study not only provide guidance on implementing meteorological normalization to evaluate emission changes, but also offer insights for air quality management in the future.
Materials and methods
Data preparation
Hourly data of NO2 concentrations and meteorological conditions were prepared for 337 prefecture-level cities over mainland China during 2019–2020 (Fig. S1). The hourly NO2 concentration data were collected from 1621 air quality stations maintained by the China Environmental Monitoring Center (CNEMC, 2020). The hourly meteorological data, including wind speed, wind direction, air pressure, temperature, and relative humidity, were obtained from 2167 meteorological stations operated by the China Meteorological Administration (CMA, 2015). As the air quality stations and the meteorological stations were not collocated, co-kriging interpolation with elevation was adopted to derive the meteorological condition at all air quality stations (Ahmed et al., 2018; Jin et al., 2020). The NO2 concentration and meteorological data were then summarized by averaging over all the sites within each city. The missing values of NO2 concentrations and meteorological data were filled in by using the Stineman interpolation (Stineman, 1980). Sansha city was excluded from the analysis due to the paucity of air quality monitoring data.
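The city-level aggregation and gap filling described above can be sketched in a few lines. This is only an illustration under stated assumptions: the record layout `(city, hour, value)` and both function names are hypothetical, and plain linear interpolation is swapped in for the Stineman interpolation used in the study, which is not available in the Python standard library.

```python
from collections import defaultdict
from statistics import fmean


def city_hourly_means(records):
    """Average station values per (city, hour), skipping missing values (None).

    records: iterable of (city, hour, value) tuples (hypothetical layout).
    """
    groups = defaultdict(list)
    for city, hour, value in records:
        if value is not None:
            groups[(city, hour)].append(value)
    return {key: fmean(vals) for key, vals in groups.items()}


def fill_gaps_linear(series):
    """Fill None gaps in an hourly series.

    Interior gaps are filled by linear interpolation between the nearest
    known neighbours; gaps at either edge copy the nearest known value.
    NOTE: a simple stand-in, not the Stineman interpolation of the paper.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    filled = list(series)
    if not known:
        return filled
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = series[right]
        elif right is None:
            filled[i] = series[left]
        else:
            weight = (i - left) / (right - left)
            filled[i] = series[left] + weight * (series[right] - series[left])
    return filled
```

Averaging before gap filling mirrors the order given in the text; a real pipeline would also need the co-kriging step, which is outside the scope of this sketch.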
Meteorological normalization
We developed random forests to simulate NO2 concentrations based on the temporal variables and previously mentioned meteorological variables (Table S1). Based on the previous studies (Grange et al., 2018; Wang et al., 2020), the temporal variables include the date representation (U), year (Y), day of the year (D), month of the year (M), day of the week (W), hour of the day (H), and the number of days after the first day of the Lunar New Year Holiday (L). L is valid over an interval. For example, February 4, 2019 is the first day of the 2019 Lunar New Year, and then L for February 3, 4, and 5 in 2019 is set as −1, 0, and 1, respectively. Given the meteorological variables in the model, the temporal variables indicate the emission dynamics at various temporal scales. Specifically, U and Y represent the interannual trend, D and M represent seasonality, W represents the weekly cycle, H represents the diurnal pattern, and L represents the holiday effect of the Lunar New Year Festival.
Based on the random forests, we comparatively evaluated the three commonly used meteorological normalization methods (Fig. 1). For the first method (Resample-M), the model was trained with the data from 2019 to 2020, and only the meteorological variables were resampled. For the second method (Resample-M&T), the model was trained with the data from 2019 to 2020, with both the temporal and meteorological variables being resampled. The relative changes ($\Delta E_{\mathrm{M}}$ and $\Delta E_{\mathrm{M\&T}}$) of the meteorologically normalized concentrations from 2019 to 2020 estimated by Resample-M and Resample-M&T, respectively, were evaluated with the following equation (Petetin et al., 2020; Shi et al., 2021):
$$\Delta E_{\mathrm{M}}\ (\text{or } \Delta E_{\mathrm{M\&T}}) = \frac{P_{2020} - P_{2019}}{P_{2019}} \times 100\%$$
where $P_{2020}$ is the predicted value for a day in 2020, and $P_{2019}$ is the predicted value for the corresponding day in 2019. We obtained the approximate lockdown period over the whole country from the Chinese Central Government notices, which lasted from January 20, 2020 to March 31, 2020. For the six major cities, we obtained more specific lockdown periods from their local government notices (Table S12). The buffer time varied among those cities. For the third method (Resample-None), the model was trained with the data from the period before the strict lockdowns began (i.e., January 1, 2019, to January 19, 2020) and predicted the NO2 concentrations for the period from January 20, 2020, to December 31, 2020. None of the predictor variables were resampled. The relative change estimated by Resample-None ($\Delta E_{\mathrm{None}}$) with respect to the COVID-19 pandemic effect was evaluated with the following equation (Shi et al., 2021; Wang et al., 2020):
$$\Delta E_{\mathrm{None}} = \frac{O_{2020} - P_{2020}}{P_{2020}} \times 100\%$$
where $O_{2020}$ is the observed NO2 concentration for a day since January 20, 2020, and $P_{2020}$ is the predicted value for that day. Note that the training and prediction processes of the model were based on hourly data, which were then summarized to daily averages for evaluating the relative changes. The meteorological normalization work was implemented with the R packages of rmweather (Grange, 2020) and ranger (Talbot et al., 2021).
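The resampling step shared by Resample-M and Resample-M&T can be illustrated with a minimal sketch. It is not the rmweather implementation: `normalise`, `relative_change`, and `toy_predict` are hypothetical names, the "model" is a hand-written function standing in for a trained random forest, and the toy data carry the emission dip entirely in the resampled temporal variable (there is no separate trend term), which is a deliberate simplification.

```python
import random


def normalise(predict, rows, resample_keys, n_samples=500, seed=0):
    """Average predictions over random redraws of the chosen predictors.

    predict: callable mapping a dict of predictors to a concentration.
    rows: list of dicts, one per time step.
    resample_keys: predictors replaced, jointly, by the values of a randomly
        drawn donor row; all other predictors keep their original values.
    """
    rng = random.Random(seed)
    normalised = []
    for row in rows:
        total = 0.0
        for _ in range(n_samples):
            donor = rng.choice(rows)
            draw = dict(row)
            for key in resample_keys:
                draw[key] = donor[key]
            total += predict(draw)
        normalised.append(total / n_samples)
    return normalised


def relative_change(p_2020, p_2019):
    """Relative change in percent between two normalised values."""
    return (p_2020 - p_2019) / p_2019 * 100.0


def toy_predict(x):
    # Hypothetical stand-in for a trained random forest: an emission dip on
    # days 30-59 plus a linear wind-speed dilution effect.
    dip = 10.0 if 30 <= x["day_of_year"] < 60 else 0.0
    return 40.0 - dip - 3.0 * x["wind_speed"]


wind_rng = random.Random(1)
rows = [{"day_of_year": d, "wind_speed": wind_rng.uniform(0.0, 3.0)}
        for d in range(90)]

# Resample-M style: only the meteorological predictor is redrawn.
norm_m = normalise(toy_predict, rows, ["wind_speed"])
# Resample-M&T style: the temporal predictor is redrawn as well.
norm_mt = normalise(toy_predict, rows, ["wind_speed", "day_of_year"])

print("dip kept by Resample-M style:  ", round(norm_m[10] - norm_m[45], 1))
print("dip kept by Resample-M&T style:", round(norm_mt[10] - norm_mt[45], 1))
```

In this toy setup the dip survives when only wind speed is resampled and is averaged away when the temporal variable is resampled too. With a real random forest the outcome depends on how the model divides the signal between the trend term and the other temporal variables, so the sketch shows the mechanism rather than the magnitude.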
Fig. 1
Flowchart of three meteorological normalization methods based on random forests. Please refer to Section 2.2 (Meteorological normalization) for the detailed descriptions.
Method evaluations
We evaluated the predictive performance of these three methods by using hold-out validation. The validation process was the same for all three methods. For each city, we randomly sampled 80% of the whole dataset for training a random forest, which made predictions on the remaining 20% of the data. Based on the paired predictions and observations, the predictive performance was measured with the coefficient of determination (R2), root mean square error (RMSE), and normalized mean bias (NMB). The datasets used in the validation were the same between Resample-M and Resample-M&T but different for Resample-None. While Resample-M and Resample-M&T used the data for the whole study period (i.e., January 1, 2019, to December 31, 2020), Resample-None only used the data for the subperiod before the lockdowns (i.e., January 1, 2019, to January 19, 2020). This validation strategy corresponded to the training datasets used by these methods in the meteorological normalization.
We compared the contributions of the meteorological variables to the NO2 concentrations before and after the meteorological normalization by Resample-M and Resample-M&T. The SHapley Additive exPlanations (SHAP) model interpretation technique was employed to measure the contribution of each meteorological variable (Kang et al., 2021; Nabavi et al., 2021; Vega García and Aznarte, 2020). A positive SHAP value means that the predictor increases the prediction, and a negative SHAP value means that the predictor decreases the prediction. The summed SHAP values of all the meteorological variables indicate their overall contributions to the predicted NO2 concentrations. The SHAP values derived from the random forest linking the meteorological data to the observed NO2 concentrations measured the contributions before the meteorological normalization. We developed an additional random forest to link the meteorological data to the NO2 concentrations processed by Resample-M (or Resample-M&T), from which the SHAP values were derived to measure the contributions after the meteorological normalization. The differences between the SHAP values before and after the meteorological normalization indicated the effects of the resampling operations on Resample-M and Resample-M&T.
This evaluation approach was not applied to Resample-None, as it did not involve resampling.
We evaluated the Resample-None method by switching the training and prediction data to examine the potential estimation bias. We first trained a random forest with the 2019 data (RF2019) and made predictions for 2020. We then trained another random forest with the 2020 data (RF2020) and made predictions for 2019. The differences between the predictions and observations were calculated using the following equations:
$$\Delta C_{2019} = P_{2020} - O_{2019}$$
$$\Delta C_{2020} = P_{2019} - O_{2020}$$
where $P_{2020}$ is the annual average prediction of RF2019 for 2020, $O_{2019}$ is the annual average observation for 2019, $P_{2019}$ is the annual average prediction of RF2020 for 2019, $O_{2020}$ is the annual average observation for 2020, and $\Delta C_{2019}$ (or $\Delta C_{2020}$) measures the year-over-year concentration difference attributable to the meteorological variation between 2019 and 2020. As we suspected that the skewed distribution of NO2 concentrations might cause a substantial estimation bias for the Resample-None method, we reran the numerical experiments with the log-transformed NO2 concentrations. In addition, we trained a random forest with the data from January 1, 2019, to January 19, 2019, together with those for all of 2020. This random forest was used to predict the NO2 concentrations for the period of January 20, 2019, to the end of 2019, which were compared to the Resample-None results obtained by training the model with the data before the lockdowns.
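The hold-out validation described at the start of this subsection (an 80/20 random split scored with R2, RMSE, and NMB) can be sketched as follows. The function names are hypothetical, and the formulas are common textbook definitions adopted here as assumptions; the study's own definitions are given in its supplementary material (S.2) and may differ, for example by defining R2 as a squared correlation.

```python
import random
from math import sqrt
from statistics import fmean


def holdout_split(n, train_fraction=0.8, seed=0):
    """Return (train_indices, test_indices) from a random shuffle of range(n)."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = int(round(n * train_fraction))
    return indices[:cut], indices[cut:]


def rmse(obs, pred):
    """Root mean square error, in the units of the observations."""
    return sqrt(fmean((p - o) ** 2 for o, p in zip(obs, pred)))


def r_squared(obs, pred):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = fmean(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def nmb(obs, pred):
    """Normalized mean bias, assumed here to be sum(pred - obs) / sum(obs)."""
    return sum(p - o for o, p in zip(obs, pred)) / sum(obs)
```

A positive NMB under this definition means the model overpredicts on average, which is the direction of bias the switched-year experiment above is designed to detect.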
Sensitivity analyses
We conducted sensitivity analyses on the random forest hyperparameters, the number of resampling times for the meteorological normalization, and the SHAP calculation parameter. We performed one-at-a-time sensitivity analyses on all hyperparameters, including the number of variables randomly selected by the internal nodes of a random forest (mtry: 3, 4, and 5) and the number of decision trees (ntrees: 300, 500, and 1000). For the meteorological normalization, we compared the model outputs when resampling times were tentatively set to 300, 500, and 1000. Moreover, we compared the SHAP values corresponding to different values of the key parameter (nsim: 5, 10, 15, 20, 30, 40, 50), which is the number of Monte Carlo simulations for calculating each SHAP value. Based on the previously mentioned sensitivity analyses, we chose the (hyper)parameters corresponding to low computing costs and acceptable predictive performance.
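To make the role of nsim concrete, the sketch below estimates one SHAP value by Monte Carlo sampling over random feature orderings and random background rows, a widely used permutation-sampling scheme. The text does not say which scheme the authors' software applies, so this is an assumption; `mc_shap` and its arguments are hypothetical names.

```python
import random


def mc_shap(predict, x, background, feature, nsim=10, seed=0):
    """Monte Carlo estimate of the SHAP value of `feature` for instance `x`.

    predict: callable mapping a dict of predictors to a prediction.
    x: dict of predictor values for the instance being explained.
    background: list of dicts used as the reference data.
    nsim: number of Monte Carlo repetitions averaged into the estimate.
    """
    rng = random.Random(seed)
    names = list(x)
    total = 0.0
    for _ in range(nsim):
        reference = rng.choice(background)
        order = names[:]
        rng.shuffle(order)
        # Features ordered before `feature` take their values from x,
        # features ordered after it take their values from the reference row.
        from_x = set(order[:order.index(feature)])
        with_feature = {
            k: (x[k] if (k in from_x or k == feature) else reference[k])
            for k in names
        }
        without_feature = dict(with_feature)
        without_feature[feature] = reference[feature]
        total += predict(with_feature) - predict(without_feature)
    return total / nsim
```

Larger nsim values reduce the sampling noise of each estimate at a proportional computing cost, which is the trade-off the sensitivity analysis examines.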
Spatiotemporal variation of NOx emissions
Based on the meteorologically normalized NO2 concentrations by the Resample-M, Resample-M&T, and Resample-None methods, we evaluated the spatiotemporal variations of the NOx emissions before, during, and after the strict lockdowns across China. The average emission levels (indicated by the meteorologically normalized concentrations) from January 1, 2020 to the date right before the lockdowns were set as the baselines. We estimated the maximum reduction in daily emissions during the lockdowns and the corresponding date for each city. After the lockdowns were lifted, the dates when the emissions bounced back to the baseline levels for the first time were noted as the emission recovery dates. Considering the possibly large emission fluctuation, those dates when emissions instantaneously exceeded the baseline levels were screened out. The maximum reduction in emissions and the emission recovery dates for all the cities were mapped to illustrate the spatial patterns across China. We also characterized the temporal patterns of the national average emissions during the lockdowns and work resumption.
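The per-city summary described above (baseline, maximum reduction with its date, and recovery date) can be sketched as follows. The text does not specify how transient exceedances were screened out, so requiring `min_run` consecutive days at or above the baseline is an assumption, as are the function name and the input layout; the series is assumed to contain one value per calendar day without gaps.

```python
from datetime import date, timedelta
from statistics import fmean


def summarise_city(series, lockdown_start, lockdown_end, min_run=7):
    """Summarise a daily series of meteorologically normalised concentrations.

    series: dict mapping datetime.date to a concentration, starting on the
        first day of the baseline period.
    min_run: hypothetical screening rule; recovery needs this many
        consecutive days at or above the baseline.
    Returns (baseline, trough_day, max_reduction_percent, recovery_day).
    """
    days = sorted(series)
    baseline = fmean(series[d] for d in days if d < lockdown_start)

    in_lockdown = [d for d in days if lockdown_start <= d <= lockdown_end]
    trough_day = min(in_lockdown, key=lambda d: series[d])
    max_reduction = (series[trough_day] - baseline) / baseline * 100.0

    after = [d for d in days if d > lockdown_end]
    recovery_day = None
    for i in range(len(after) - min_run + 1):
        window = after[i:i + min_run]
        if all(series[d] >= baseline for d in window):
            recovery_day = window[0]
            break
    return baseline, trough_day, max_reduction, recovery_day


# Toy series using the national lockdown dates given in the text.
start, lock_start, lock_end = date(2020, 1, 1), date(2020, 1, 20), date(2020, 3, 31)
toy = {}
for offset in range(200):
    day = start + timedelta(days=offset)
    if day < lock_start:
        toy[day] = 100.0
    elif day <= lock_end:
        toy[day] = 80.0
    elif day < date(2020, 5, 1):
        toy[day] = 90.0
    else:
        toy[day] = 101.0
toy[date(2020, 1, 27)] = 60.0   # deepest point of the lockdown dip
toy[date(2020, 4, 5)] = 105.0   # one-day spike that should be screened out

summary = summarise_city(toy, lock_start, lock_end)
print(summary)
```

The one-day spike in April is ignored because it is not followed by a sustained run, so the toy recovery date falls on the first day of the sustained return to baseline.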
Results and discussion
Descriptive statistics
The average NO2 concentrations for most Chinese cities were significantly lower in 2020 than in 2019, and significantly lower during the strict lockdowns than during work resumption (P < 0.05; Fig. S2). The national annual average NO2 concentrations were 26.4 ± 9.3 μg/m3 (in 2019) and 23.7 ± 8.2 μg/m3 (in 2020). The average NO2 concentrations in 283 cities (mainly distributed in Central, North, and South China, including the six major cities) were significantly lower in 2020 than in 2019 (P < 0.05; Fig. S2 and Table S2). The annual NO2 concentrations in Wuhan were 42.0 ± 18.5 μg/m3 (in 2019) and 34.2 ± 17.9 μg/m3 (in 2020). In 2020, the national average NO2 concentrations were 20.4 ± 8.3 μg/m3 during the strict lockdowns and 23.7 ± 10.5 μg/m3 during work resumption. As expected, the NO2 concentrations in 234 cities (mainly distributed in Central, North, and East China, including the six major cities) were significantly lower during the strict lockdowns than during work resumption (P < 0.05; Fig. S2 and Table S3). The average NO2 concentrations in Wuhan were 21.2 ± 9.0 μg/m3 during the strict lockdowns and 37.2 ± 18.6 μg/m3 during work resumption. In addition, the average NO2 concentrations during the lockdowns of 2020 were lower than those of the corresponding period in 2019, while the concentrations during work resumption displayed an opposite trend (Fig. S3).
The meteorological conditions for most Chinese cities were not significantly different between 2019 and 2020 (P > 0.05; Table S4). Comparing the national annual averages between 2019 and 2020, the temperature was 14.6 ± 5.4 °C versus 14.5 ± 5.6 °C, the relative humidity was 67.7 ± 11.8% versus 69.0 ± 11.2%, the atmospheric pressure was 956.9 ± 82.3 hPa versus 957.1 ± 82.5 hPa, while the wind speed in both years was approximately 1.2 ± 0.5 m/s. The cities with significantly different meteorological conditions between 2019 and 2020 exhibited certain spatial patterns (Fig. S16). For example, the cities with significantly higher atmospheric pressure in 2020 than 2019 are mainly distributed in Northwest, Northeast, and South China, while the cities with significantly lower atmospheric pressure in 2020 than 2019 are mainly distributed in Southwest China. For the periods of lockdowns and work resumption (in 2020), the meteorological conditions were not significantly different for most of the cities in China from the corresponding periods in 2019 (P > 0.05; Table S5). For example, the average wind speeds during the strict lockdowns and the corresponding period in 2019 were almost equivalent (1.2 ± 0.5 m/s). Similarly, the average temperature during work resumption (17.4 ± 4.9 °C) differed little from the corresponding period in 2019 (17.8 ± 4.9 °C; Table S6). Regarding the spatial distributions among the six major cities, the annual average meteorological conditions were significantly different (P < 0.05; Fig. S4). The meteorological conditions during the lockdowns and work resumption were also significantly different for these six major cities (P < 0.05; Fig. S5 and S6).
Model evaluation
As expected, given the same predictor variables and hyperparameters, the random forests used by Resample-M, Resample-M&T, and Resample-None exhibited almost identical predictive performance in the hold-out validation (Fig. S7). Based on the sensitivity analysis results, we chose the hyperparameters for the random forests (mtry = 4 and ntrees = 300), the parameter for resampling (nsamples = 500), and the parameter for calculating SHAP (nsim = 10) subject to the balance between computing costs and predictive performance (Table S7–S10). The cross-validation R2 for all the cities ranged from 0.52 to 0.94, with an average of 0.83 ± 0.06. These results demonstrate that the random forests were capable of simulating the NO2 concentration temporal trends under the varying meteorological conditions. The R2 was above 0.80 for more than 80% of the cities, while the other cities with R2 below 0.80 were mainly located in Southwest and Northwest China. For the six major cities, the R2 ranged from 0.83 to 0.91 (Table 1). The predictive performance of this study was comparable or superior to that of the previous NO2 studies with meteorological normalization (Table S11) (Grange et al., 2018; Vu et al., 2019; Wang et al., 2020). The random forests with acceptable predictive performance were essential for the meteorological normalization.
Table 1
Performance of the random forests in predicting NO2 concentrations (a).

| City | Resample-M (or Resample-M&T): RMSE | Resample-M (or Resample-M&T): R2 | Resample-M (or Resample-M&T): NMB | Resample-None: RMSE | Resample-None: R2 | Resample-None: NMB |
|---|---|---|---|---|---|---|
| Beijing | 5.95 | 0.91 | 0.0025 | 6.50 | 0.90 | −0.0077 |
| Chengdu | 7.40 | 0.83 | −0.0004 | 7.17 | 0.83 | 0.0053 |
| Shanghai | 7.61 | 0.87 | −0.0014 | 7.61 | 0.88 | 0.0042 |
| Shenzhen | 5.21 | 0.83 | −0.0045 | 5.60 | 0.84 | −0.0099 |
| Wuhan | 7.22 | 0.91 | −0.0057 | 7.95 | 0.89 | 0.0045 |
| Xi'an | 7.73 | 0.89 | −0.0009 | 7.76 | 0.89 | −0.0058 |

(a) RMSE: root mean square error (μg/m3); R2: coefficient of determination (unitless); NMB: normalized mean bias (unitless). Please refer to S.2 for the equations.
Meteorological normalization method comparison
For the national averages, while the relative-change curves derived by the three methods were analogous, the estimated maximum reductions in daily emissions were considerably different (Fig. S8). The relative changes estimated by the three methods all showed V-shaped curves, with the maximum reductions occurring during the strict lockdowns. The curves first rapidly declined following the implementation of lockdowns and then gradually rose to the pre-lockdown levels. The dates when the maximum reductions occurred were similar between Resample-M (January 26, 2020) and Resample-M&T (January 25, 2020), but lagged by nearly 20 days for Resample-None (February 16, 2020). The maximum reductions were estimated to be −31.5% (Resample-M), −12.5% (Resample-M&T), and −64.1% (Resample-None). Resample-M and Resample-M&T both estimated the emission changes by randomizing the hourly meteorological conditions between 2019 and 2020. By contrast, the Resample-None results indicate the emission changes by replacing the meteorological conditions of 2019 with those of 2020. Essentially, these methods removed the meteorological effects on the NO2 concentrations, and the year-on-year relative changes estimated by these three methods were generally comparable.
The Resample-M results clearly show the short-term variations and long-term trends of the relative changes in the meteorologically normalized NO2 concentrations (Fig. 2). For the short-term variations, the Resample-M results show dramatic declines in relative changes during the lockdowns for the major cities. In Chengdu, Shenzhen, Xi'an, Shanghai, and Wuhan, the maximum reductions in daily emissions were reached within two weeks after the lockdowns began, with the relative changes ranging from −35.1% to −51.3%. In Beijing, the maximum reduction occurred within seven weeks after the lockdowns started due to the extended enforcement, with a relative change of −37.1%. With respect to the long-term trends, the Resample-M results demonstrate the slow growth of emissions during work resumption for the major cities (Fig. 2). More than four months after the lockdowns were lifted, the emissions in Shanghai, Beijing, and Wuhan gradually returned to pre-lockdown levels. The emissions in Chengdu and Xi'an recovered more rapidly, returning to their pre-lockdown levels in approximately two months. In Shenzhen, emissions had already recovered before the city's lockdown was lifted, indicating a low impact from the epidemic on the industrial and commercial activities.
Fig. 2
The relative change of the meteorologically normalized NO2 concentrations in 2020 compared to 2019 for (a) Beijing, (b) Shanghai, (c) Chengdu, (d) Shenzhen, (e) Wuhan, and (f) Xi'an. The shaded areas, green lines, grey lines represent the Resample-M, Resample-M&T, and Resample-None results, respectively. The shaded purple (red) areas indicate decreases (increases) in emissions. The yellow zone annotates the periods of strict lockdowns for each city (Table S12).
The Resample-M&T method might have incorrectly estimated the short-term variations while decently revealing the long-term emission trends (Fig. 2). During the lockdowns, it was difficult to see the dramatic relative change declines in either the six major cities or the national averages (Fig. 2, S8). In Xi'an, Shenzhen, Beijing, and Shanghai, the average relative changes during the lockdowns were lower than those before the lockdowns. Taking Xi'an as an example, the average relative changes were −11.4% during the lockdowns and −18.2% from January 1, 2020, to January 25, 2020. In Chengdu and Wuhan, the daily relative changes were relatively stable from the beginning of 2020 to the lifting of lockdowns. On the other hand, the Resample-M&T results clearly illustrated that emission reductions gradually approached zero and then leveled off. Based on the SHAP results, Resample-M&T removed the meteorological effects and the seasonal variations in the NO2 concentrations (Fig. 3). By contrast, the SHAP results show that Resample-M removed most of the meteorological effects but retained the seasonal emission variation, especially for the northern cities (i.e., Beijing and Xi'an) with stronger seasonal fluctuations in emissions (Meng et al., 2018; Wang et al., 2018).
Fig. 5
The relative change of national average daily NO2 concentrations in 2020 compared to 2019 based on the monitoring data (grey line) and the Resample-M result (red line). The vertical dashed lines correspond to the dates for all subplots in Fig. 4.
Fig. 3
SHAP values of the meteorological variables for (a) Beijing, (b) Shanghai, (c) Chengdu, (d) Shenzhen, (e) Wuhan, and (f) Xi'an before (grey lines) and after the meteorological normalization by Resample-M (shaded areas) and Resample-M&T (black lines). The shaded blue (red) areas indicate the meteorological effects associated with decreased (increased) NO2 concentrations. Please note that the black lines for Resample-M&T are very close to the x-axes.
SHAP values of the meteorological variables for (a) Beijing, (b) Shanghai, (c) Chengdu, (d) Shenzhen, (e) Wuhan, and (f) Xi'an before (grey lines) and after the meteorological normalization by Resample-M (shaded areas) and Resample-M&T (black lines). The shaded blue (red) areas indicate the meteorological effects associated with decreased (increased) NO2 concentrations. Please note that the black lines for Resample-M&T are very close to the x-axes.As the Resample-None results fluctuated widely, it was difficult to identify long-term trends of the relative changes on the meteorologically normalized NO2 concentrations, and the emission reductions tended to be overestimated (Fig. 2, S9-S12). During 2020, the standard deviations of daily relative changes for the six major cities ranged from 21.8% to 28.2% based on the Resample-None results, which were much wider than those estimated by the Resample-M results (7.6–19.3%). When switching the training and prediction years for evaluation, the results indicate that the meteorological conditions of the prediction year always caused higher NO2 concentrations when compared to those of the training year (Fig. S9). Thus, Resample-None might overestimate the emission reductions during and after the lockdowns. In the six major cities, the maximum reductions during lockdowns ranged from 64.0% to 83.5% based on the Resample-None results, which were higher than the Resample-M estimates (35.1–51.3%; Fig. 2). The distribution of NO2 observations was left-skewed, which was approximately symmetric after log-transformation (Fig. S10). Based on the log-transformed data evaluation, we inferred that the overestimation was caused by the left-skewed frequency distribution of NO2 concentrations (Fig. S11). In addition, the relative changes identified by Resample-None during work resumption might not reflect the real trends, such as in Wuhan and Xi'an (Fig. 2). 
When switching the training and prediction years for evaluation, the emissions during summer were always found to be lower in the prediction year than in the training year, which is self-contradictory (Fig. S12). As the NO2 concentrations were relatively low in summer, the overprediction problem was more prominent than in the other seasons. This evaluation result further demonstrates that Resample-None overestimates emission reductions.
Leveraging the “natural experiment” of the COVID-19 lockdowns and work resumption, we comparatively evaluated three meteorological normalization methods that have been commonly used in previous studies (Table 2). Our comparison results suggest that the Resample-M&T method, which resamples both temporal and meteorological variables, was inadequate to reflect short-term variations caused by the lockdowns (Chen et al., 2019; Falocchi et al., 2021; Grange et al., 2018). The Resample-M&T method also removed seasonal variations of emissions. Furthermore, the results obtained by the Resample-None method, which separates the training and prediction data, fluctuated widely and were affected by estimation bias (Lovric et al., 2021; Talbot et al., 2021; Wang et al., 2020). The Resample-None method tended to overestimate the emission reductions in the short term, while making it difficult to reveal the long-term trend. Compared to both Resample-M&T and Resample-None, the Resample-M method, which resamples only the meteorological variables, was the most appropriate for characterizing both the short- and long-term emission trends during the lockdowns and work resumption (Cole et al., 2020; Shi et al., 2021; Vu et al., 2019).
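One possible intuition for the difference between Resample-M and Resample-M&T is sketched below. This is a toy illustration, not the implementation used in this study. The `predict` function stands in for a fitted random forest, the feature names `day` and `wind` are hypothetical, the sampling pool is assumed to be the whole record, and the toy has no separate trend term.

```python
import random
from statistics import mean

def normalize(predict, rows, met_keys, time_keys=(), n_draws=200, seed=0):
    """Average predictions over inputs whose listed features are resampled.

    met_keys are always replaced with values from a randomly drawn donor row;
    time_keys are replaced as well only when supplied (the Resample-M&T case).
    """
    rng = random.Random(seed)
    swap = list(met_keys) + list(time_keys)
    out = []
    for row in rows:
        preds = []
        for _ in range(n_draws):
            donor = rng.choice(rows)                      # random donor time step
            x = dict(row)
            x.update({k: donor[k] for k in swap})         # overwrite resampled features
            preds.append(predict(x))
        out.append(mean(preds))
    return out

# Toy data: 60 days, with the emission signal dropping by 40% from day 30 onward.
data_rng = random.Random(1)
rows = [{"day": d, "wind": data_rng.uniform(1.0, 5.0)} for d in range(60)]

def predict(x):
    """Stand-in for a fitted model: emission signal minus a wind dilution term."""
    emission_signal = 24.0 if x["day"] >= 30 else 40.0
    return emission_signal - 2.0 * x["wind"]

resample_m = normalize(predict, rows, met_keys=["wind"])
resample_mt = normalize(predict, rows, met_keys=["wind"], time_keys=["day"])

# Difference between the mean normalized value before and after day 30.
drop_m = mean(resample_m[:30]) - mean(resample_m[30:])
drop_mt = mean(resample_mt[:30]) - mean(resample_mt[30:])
print(f"Resample-M drop: {drop_m:.1f}, Resample-M&T drop: {drop_mt:.1f}")
```

In this toy setup, keeping the temporal variable fixed preserves the step change, whereas resampling it averages the step away, which is consistent with the short-term underestimation reported for Resample-M&T.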
Table 2
Literature review of meteorological normalization studies based on decision trees.
| Meteorological normalization method (a) | Algorithm | References |
| --- | --- | --- |
| Resample-M | Random Forest | (Cole et al., 2020; Shi et al., 2021; Vu et al., 2019) |
| Resample-M | Gradient Boosting Machine | Qu et al. (2020) |
| Resample-M&T | Random Forest | (Falocchi et al., 2021; Grange and Carslaw, 2019; Grange et al., 2018; Mallet, 2021) |
| Resample-M&T | Gradient Boosting Machine | Mallet (2021) |
| Resample-None | Random Forest | (Achebak et al., 2021; Hu et al., 2021; Wang et al., 2020) |
| Resample-None | Gradient Boosting Machine | Petetin et al. (2020) |
(a) Please refer to “2.2 Meteorological normalization” for definitions of these methods.
Spatiotemporal distributions derived by the Resample-M method
Based on the Resample-M results, the spatiotemporal distributions of relative changes in the meteorologically normalized NO2 concentrations during the lockdowns exhibited a series of concentric zones (Fig. 4). The restrictive measures during the COVID-19 pandemic caused considerable reductions in anthropogenic NOx emissions (Wang and Su, 2020; Wang et al., 2021; Wu et al., 2021), which was corroborated by the satellite retrievals (Fig. S13). Almost all cities in China underwent emission reductions during the lockdowns, with an average relative change of −19.1 ± 9.4%. When the strict lockdowns began to be implemented around January 10, 2020, the emission reductions generally occurred in a concentric zone centered on Wuhan with a radius of 400 km (Fig. 4b). By January 20, 2020, the radius of the concentric zone had gradually expanded to 500–700 km (Fig. 4c). By January 26, 2020, the national average emission reduction reached a maximum of 31.5% (Fig. 5), and emission reductions occurred across the entire nation (Fig. 4d). The date of maximum emission reduction varied between cities, but generally occurred in January and February 2020, when the strictest lockdown measures were implemented in most cities (Gao et al., 2021) (Fig. 6a).
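As a rough sketch of how key dates could be read off a city's series of relative changes, the snippet below takes the date of maximum reduction as the minimum of a smoothed series and the recovery date as the first later day on which the smoothed series returns to a chosen level. The smoothing window, the recovery threshold of 0%, and the synthetic series are all assumptions made for illustration; the criteria actually used for Fig. 6 are not restated here.

```python
from datetime import date, timedelta

def rolling_mean(values, window):
    """Trailing moving average; early entries average whatever is available."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out

def key_dates(dates, rel_change, window=7, recovery_level=0.0):
    """Return (date of maximum reduction, recovery date or None)."""
    smooth = rolling_mean(rel_change, window)
    i_min = min(range(len(smooth)), key=smooth.__getitem__)
    i_rec = next(
        (i for i in range(i_min + 1, len(smooth)) if smooth[i] >= recovery_level),
        None,
    )
    return dates[i_min], (dates[i_rec] if i_rec is not None else None)

# Synthetic relative change (%): falls to -40% over 30 days, then recovers by 0.5 points per day.
dates = [date(2020, 1, 1) + timedelta(days=d) for d in range(120)]
series = [-40.0 * d / 30 if d <= 30 else -40.0 + 0.5 * (d - 30) for d in range(120)]

max_reduction_date, recovery_date = key_dates(dates, series)
print(max_reduction_date, recovery_date)
```

Smoothing before locating the minimum is a design choice meant to keep a single noisy day from defining the key date; a different window or threshold would shift the reported dates.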
Fig. 4
Composite diagram showing spatial distribution of relative changes in the meteorologically normalized NO2 concentrations based on the Resample-M method. The expanding blue circle indicates the dynamic change of the emission reduction, and the shrinking red circle indicates the dynamic change of emission recovery.
Fig. 5
The relative change of national average daily NO2 concentrations in 2020 compared to 2019 based on the monitoring data (grey line) and the Resample-M result (red line). The vertical dashed lines correspond to the dates for all subplots in Fig. 4.
Fig. 6
Emission reduction and recovery in China based on Resample-M. (a) Maximum emission reduction dates of each city during the lockdown period. The color bar represents the dates ranging from January 20, 2020 to May 2, 2020. (b) Emission recovery dates of cities. The color bar represents the dates ranging from January 20, 2020 to December 31, 2020.
During work resumption, the concentric zones with Wuhan at their center gradually shrank, indicating the spatiotemporal dynamics of emission recovery (Fig. 4). After the lifting of lockdowns in late March and early April of 2020, the national average emissions progressively increased until reaching their pre-lockdown levels in early October 2020 (Fig. 5). From October 5 to the end of 2020, the average relative change was 7.0 ± 21.9% for all the cities. The concentric zones with respect to emission change began to shrink after the lockdowns were lifted (Fig. 4e and f). The emissions for most cities returned to or even exceeded the pre-lockdown levels by early October 2020 (Fig. 4g and h). The dates of emission recovery were considerably different between cities due to their differing economic development statuses, levels of impact from the COVID-19 pandemic, and timelines for closure/reopening (Wang and Su, 2020; Zhang et al., 2020). In Shenzhen, the emissions recovered even before the lifting of lockdowns.
The emissions recovered between April and May 2020 for Chengdu and Xi'an, but only in November for Wuhan and Shanghai. It is noteworthy that the emissions in Beijing remained below the pre-lockdown levels even at the end of 2020 (Fig. 6b).
In contrast to the meteorologically normalized results, the observed NO2 concentrations make it difficult to obtain the spatiotemporal distributions of relative emission change during the lockdowns and work resumption (Fig. 5 and S13). For example, during the lockdowns, the daily average concentrations on February 7, 2020, showed a year-on-year increase for many cities in Northeast and South China (Fig. S14a). Similarly, during work resumption, the NO2 observations on November 22, 2020, dramatically decreased across the nation (Fig. S14b). Nevertheless, the emission reduction was predominant during the lockdowns (Beloconi et al., 2021; Hua et al., 2021; Sathe et al., 2021). The emission levels recovered around early October 2020, and were unlikely to decline sharply in late November (Fig. 5). The relative changes in daily observed concentrations for the nation and the six major cities would suggest that the emissions recovered rapidly after the lockdowns were lifted and then fluctuated widely, which might not reflect the actual temporal trend of the emissions (Fig. 5 and S15). Therefore, we were unable to accurately identify the key dates of emission reduction and recovery based on the observed concentrations due to the confounding effects of meteorological conditions. The concentrations processed by the meteorological normalization clearly characterized the emission dynamics during the lockdowns and work resumption.
A previous study (Zheng et al., 2021a) reported a bottom-up emission inventory for China, in which the reduction rate of national NOx emissions was higher in March than in January. Nevertheless, our results show that the reduction rate was slightly higher in January than in March.
This discrepancy could be attributed to the potential underestimation of NOx emission reduction in January based on the bottom-up emission inventory, as the relative decrease in NO2 concentrations based on the chemical transport model simulation was much smaller than that in the surface observations (Fig. 6c in Zheng et al., 2021a). More efforts are required to verify the estimated emission change. In another meteorological normalization study (Dai et al., 2021), the effects of the Chinese Spring Festival and the COVID-19 epidemic on air pollution emission reduction were evaluated separately. The meteorologically normalized surface NO2 concentrations in 31 major cities changed by −29.5% after the lockdown started. By excluding the effect of the Chinese Spring Festival (−14.1%), Dai et al. (2021) inferred that the lockdown itself accounted for a change of −15.4%, which was comparable to the result of this study (−18.1%).
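The figures quoted from Dai et al. (2021) are consistent with a simple subtraction, assuming the Spring Festival and lockdown effects are additive:

```latex
\[
\underbrace{-29.5\%}_{\text{total change}}
\;-\;
\underbrace{(-14.1\%)}_{\text{Spring Festival effect}}
\;=\;
\underbrace{-15.4\%}_{\text{lockdown effect}}
\]
```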
Conclusions
To the best of our knowledge, this is the first study that systematically compares the commonly used methods for meteorological normalization. Our comparison shows that resampling only the meteorological variables (not the temporal variables) is the state-of-the-art method, which clearly characterizes the short- and long-term trends of NOx emissions. Based on the results obtained by this method, the spatiotemporal distribution of relative emission change exhibited a series of concentric zones with Wuhan as their center during the lockdowns and work resumption. The maximum emission reduction occurred in January and February 2020 (during the lockdowns) for most of the cities in China. After the lockdowns were lifted in early April 2020, the national emissions largely returned to their pre-lockdown levels by early October 2020. With this meteorological normalization method, we can conveniently assess the impact of policies and significant events on emissions and air quality based on readily accessible pollutant concentration and meteorological condition data. In the future, mutual verification could be conducted between meteorologically normalized concentrations and associated emission inventories.
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Authors: Zongbo Shi; Congbo Song; Bowen Liu; Gongda Lu; Jingsha Xu; Tuan Van Vu; Robert J R Elliott; Weijun Li; William J Bloss; Roy M Harrison Journal: Sci Adv Date: 2021-01-13 Impact factor: 14.136