Zekai Wu, Wenqin Zhao, Yaqiong Lv.
Abstract
Air quality affects people's daily lives. The air quality index (AQI) is an essential indicator for controlling air pollution and protecting public health; accurate AQI forecasts provide timely pollution warnings and remind people to take protective measures in advance. To this end, this paper develops a new ensemble learning model for AQI forecasting. (1) The signal decomposition technique complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) first decomposes the nonlinear, nonstationary historical AQI series into several more regular and more stable subseries. (2) Fuzzy entropy (FE) is used as the feature indicator to recombine subseries with similar trends, which avoids over-decomposition and reduces computing time. (3) An ensemble long short-term memory (LSTM) neural network forecasts each reconstructed subseries, and the subseries forecasts are summed to give the final AQI prediction. To validate the proposed model, daily AQI data for Wuhan, China, from January 1, 2019, to February 28, 2022, are used as the experimental case, and the model is compared with other commonly used forecasting models. Benchmarking results of the numerical study show that the proposed model predicts AQI more accurately than the other forecasting models.
Keywords: Air quality index; CEEMDAN; Forecasting; Fuzzy entropy; LSTM neural network
Year: 2022 PMID: 36196368 PMCID: PMC9522547 DOI: 10.1007/s11869-022-01252-6
Source DB: PubMed Journal: Air Qual Atmos Health ISSN: 1873-9318 Impact factor: 5.804
Corresponding AQI values and air pollutant concentration limits
| AQI | SO2 (μg/m3) | NO2 (μg/m3) | PM10 (μg/m3) | CO (mg/m3) | O3 (μg/m3) | PM2.5 (μg/m3) |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 50 | 50 | 40 | 50 | 2 | 100 | 35 |
| 100 | 150 | 80 | 150 | 4 | 160 | 75 |
| 150 | 475 | 180 | 250 | 14 | 215 | 115 |
| 200 | 800 | 280 | 350 | 24 | 265 | 150 |
| 300 | 1600 | 565 | 420 | 36 | 800 | 250 |
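The breakpoint table can be turned into a sub-index calculation. The sketch below assumes the usual piecewise-linear interpolation between adjacent breakpoints and takes the largest pollutant sub-index as the AQI; the record itself does not spell out this rule, so both steps are assumptions. The names `AQI_LEVELS`, `BREAKPOINTS`, `sub_index`, and `aqi` are illustrative, and the particulate column is taken to be PM10.

```python
# Breakpoints copied from the table above (one list per pollutant column).
AQI_LEVELS = [0, 50, 100, 150, 200, 300]
BREAKPOINTS = {
    "SO2":   [0, 50, 150, 475, 800, 1600],  # μg/m3
    "NO2":   [0, 40, 80, 180, 280, 565],    # μg/m3
    "PM10":  [0, 50, 150, 250, 350, 420],   # μg/m3
    "CO":    [0, 2, 4, 14, 24, 36],         # mg/m3
    "O3":    [0, 100, 160, 215, 265, 800],  # μg/m3
    "PM2.5": [0, 35, 75, 115, 150, 250],    # μg/m3
}


def sub_index(pollutant, concentration):
    """Interpolate linearly between the two breakpoints that bracket the value."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    limits = BREAKPOINTS[pollutant]
    for k in range(len(limits) - 1):
        lo, hi = limits[k], limits[k + 1]
        if concentration <= hi:
            i_lo, i_hi = AQI_LEVELS[k], AQI_LEVELS[k + 1]
            return i_lo + (i_hi - i_lo) * (concentration - lo) / (hi - lo)
    raise ValueError("concentration exceeds the highest breakpoint in the table")


def aqi(concentrations):
    """Assumed rule: the AQI is the largest sub-index over all pollutants."""
    return max(sub_index(p, c) for p, c in concentrations.items())


# A PM2.5 level of 55 μg/m3 lies halfway between the 35 and 75 breakpoints.
print(sub_index("PM2.5", 55))            # 75.0
print(aqi({"PM2.5": 55, "NO2": 40}))     # 75.0
```

Concentrations above the last breakpoint raise an error here because the table gives no limits beyond an AQI of 300.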
Air quality grades according to AQI and related information
| AQI | Grade | Description | Suggestions |
|---|---|---|---|
| 0–50 | I | Excellent | All groups of people can carry on normal activities |
| 51–100 | II | Moderate | Very few extremely sensitive people should decrease outdoor activities |
| 101–150 | III | Light pollution | Special groups of people should decrease long-time, high-intensity outdoor activities |
| 151–200 | IV | Moderate pollution | Special groups of people should avoid long-time, high-intensity outdoor activities. The general groups of people should moderately decrease outdoor activities |
| 201–300 | V | Heavy pollution | Special groups of people should stay indoors and stop outdoor activities. The general groups of people should decrease outdoor activities |
| >300 | VI | Severe pollution | Special groups of people should stay indoors. The general groups of people should avoid outdoor activities |
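The grade table maps directly onto a lookup function. This is a minimal sketch; `GRADES` and `aqi_grade` are illustrative names, and non-integer values that fall between two listed ranges (for example 50.5) are assigned to the higher grade, which is an assumption.

```python
# (upper bound, grade, description) rows taken from the table above.
GRADES = [
    (50, "I", "Excellent"),
    (100, "II", "Moderate"),
    (150, "III", "Light pollution"),
    (200, "IV", "Moderate pollution"),
    (300, "V", "Heavy pollution"),
]


def aqi_grade(aqi):
    """Return (grade, description) for an AQI value."""
    if aqi < 0:
        raise ValueError("AQI must be non-negative")
    for upper, grade, description in GRADES:
        if aqi <= upper:
            return grade, description
    return "VI", "Severe pollution"  # anything above 300


print(aqi_grade(106.7))  # ('III', 'Light pollution')
```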
Fig. 1 The unit network structure of LSTM
Fig. 2 Flow chart of the proposed hybrid model
Fig. 3 Location of Wuhan
Fig. 4 Heatmap of population distribution in Hubei province
Fig. 5 The original data of AQI time series
Statistics of AQI in Wuhan for January and February, 2019–2022
| Indicator | 2019 | 2020 | 2021 | 2022 |
|---|---|---|---|---|
| Average AQI value | 106.7 | 70.5 | 88.2 | 91.1 |
| Proportion of I or II air quality grades | 50.8% | 81.7% | 72.9% | 67.8% |
Descriptive statistics of two data sets
| Data set | Count | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|---|
| Training data | 924 | 20 | 223 | 78.77 | 32.44 |
| Test data | 231 | 25 | 226 | 82.92 | 36.23 |
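The two counts are consistent with the study period: January 1, 2019 to February 28, 2022 spans 1155 days, and 924 of 1155 is exactly 80%. The sketch below reproduces the counts under the assumption that the split is chronological (earliest days for training) and that no days are missing; the record does not state either point.

```python
from datetime import date, timedelta

start, end = date(2019, 1, 1), date(2022, 2, 28)
n_days = (end - start).days + 1                  # both endpoints included
days = [start + timedelta(days=i) for i in range(n_days)]

n_train = 924                                    # training count from the table
train_days, test_days = days[:n_train], days[n_train:]

print(n_days, len(train_days), len(test_days))   # 1155 924 231
print(len(train_days) / n_days)                  # 0.8
```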
Fig. 6 CEEMDAN decomposition results of the original AQI series
FE values and recombination results of CEEMDAN components
| Component | FE value | Recombination | New sequence |
|---|---|---|---|
| IMF1 | 2.6965 | IMF1 | FE-IMF1 |
| IMF2 | 1.7928 | IMF2 | FE-IMF2 |
| IMF3 | 1.0623 | IMF3 | FE-IMF3 |
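| IMF4 | 0.7337 | IMF4&IMF5&IMF6 | FE-IMF4 |
| IMF5 | 0.4589 | IMF4&IMF5&IMF6 | FE-IMF4 |
| IMF6 | 0.1678 | IMF4&IMF5&IMF6 | FE-IMF4 |
| IMF7 | 0.0571 | IMF7&IMF8&Residue | FE-IMF5 |
| IMF8 | 0.0137 | IMF7&IMF8&Residue | FE-IMF5 |
| Residue | 0.0034 | IMF7&IMF8&Residue | FE-IMF5 |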
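For readers unfamiliar with the FE values in this table, the following is a minimal pure-Python sketch of one common fuzzy entropy formulation: templates have their local mean removed, similarity is exp(-d^n / tol) with d the Chebyshev distance, and the entropy is the log ratio of average similarities at embedding dimensions m and m + 1. The parameters `m=2`, `r=0.2`, and `n=2` are assumed defaults, not values reported in this record, so the sketch is not expected to reproduce the numbers above.

```python
import math
import random


def fuzzy_entropy(x, m=2, r=0.2, n=2):
    """Fuzzy entropy sketch; m, r and n are assumed defaults."""
    N = len(x)
    mean = sum(x) / N
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / N)
    tol = r * std  # tolerance scaled by the spread of the series

    def phi(dim):
        count = N - m  # same number of templates for dim = m and dim = m + 1
        templates = []
        for i in range(count):
            segment = x[i:i + dim]
            baseline = sum(segment) / dim
            templates.append([v - baseline for v in segment])
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i == j:
                    continue  # skip self-matches
                d = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                total += math.exp(-(d ** n) / tol)
        return total / (count * (count - 1))

    return math.log(phi(m)) - math.log(phi(m + 1))


# Synthetic series for illustration only: irregular noise versus a smooth sine.
random.seed(0)
noise = [random.gauss(0, 1) for _ in range(200)]
sine = [math.sin(2 * math.pi * i / 50) for i in range(200)]
print(fuzzy_entropy(noise) > fuzzy_entropy(sine))  # True
```

The comparison mirrors the pattern in the table: the irregular high-frequency components (IMF1, IMF2) have the largest FE values, while the smooth residue has a value near zero.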
Fig. 7 Reconstructed sequences by FE
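The recombination step can be sketched as a pointwise sum of the components in each group. The grouping below is copied from the FE table; the rule that produced the grouping is not given in this record, so it is hard-coded rather than derived. The component values are toy placeholders, not AQI data, and `GROUPS` and `recombine` are illustrative names.

```python
# Grouping taken from the "Recombination" column of the FE table.
GROUPS = {
    "FE-IMF1": ["IMF1"],
    "FE-IMF2": ["IMF2"],
    "FE-IMF3": ["IMF3"],
    "FE-IMF4": ["IMF4", "IMF5", "IMF6"],
    "FE-IMF5": ["IMF7", "IMF8", "Residue"],
}


def recombine(components, groups=GROUPS):
    """Sum the member components of each group point by point."""
    return {
        name: [sum(values) for values in zip(*(components[m] for m in members))]
        for name, members in groups.items()
    }


# Toy components of length 3 (placeholder numbers only).
names = [f"IMF{k}" for k in range(1, 9)] + ["Residue"]
toy = {name: [float(k), float(2 * k), float(3 * k)]
       for k, name in enumerate(names, start=1)}

fe_imfs = recombine(toy)
print(fe_imfs["FE-IMF4"])  # [15.0, 30.0, 45.0]
```

Because every component belongs to exactly one group, the five reconstructed sequences still sum to the same series as the nine original components.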
Main parameters of LSTM
| Parameter | Value |
|---|---|
| Hidden layer | 1 |
| Activation function | tanh |
| Optimizer | Adam |
| Loss function | Mean squared error |
| Hidden units | 128 |
| Batch size | 24 |
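The parameters in this table could be wired into a Keras model roughly as follows. This is a sketch under assumptions, not the authors' implementation: the record does not name the deep learning framework, the single-unit `Dense` output layer is assumed, and `LSTM_PARAMS` and `build_lstm` are illustrative names. TensorFlow is imported inside the function so the configuration can be inspected without it installed.

```python
# Values copied from the parameter table above.
LSTM_PARAMS = {
    "hidden_layers": 1,
    "hidden_units": 128,
    "activation": "tanh",
    "optimizer": "adam",
    "loss": "mse",        # mean squared error
    "batch_size": 24,
}


def build_lstm(window_length, params=LSTM_PARAMS):
    """Build a one-hidden-layer LSTM regressor for univariate input windows."""
    from tensorflow import keras  # lazy import; requires TensorFlow

    model = keras.Sequential([
        keras.Input(shape=(window_length, 1)),  # one feature per time step
        keras.layers.LSTM(params["hidden_units"],
                          activation=params["activation"]),
        keras.layers.Dense(1),                  # assumed linear output layer
    ])
    model.compile(optimizer=params["optimizer"], loss=params["loss"])
    return model
```

A model built this way would be trained with `model.fit(X, y, epochs=..., batch_size=LSTM_PARAMS["batch_size"])`, where `X` has shape `(samples, window_length, 1)`.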
Window length and epoch of each FE-IMF forecasting model
| Reconstructed component | Window length | Epoch |
|---|---|---|
| FE-IMF1 | 2 | 180 |
| FE-IMF2 | 5 | 190 |
| FE-IMF3 | 7 | 140 |
| FE-IMF4 | 11 | 220 |
| FE-IMF5 | 7 | 130 |
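The window length determines how many past values feed each forecast. The sketch below builds input/target pairs for one-step-ahead forecasting (an assumption; the forecasting horizon is not stated in this record) and sums per-component forecasts into the final AQI prediction, as the abstract describes. `WINDOW_AND_EPOCHS`, `make_windows`, and `superimpose` are illustrative names.

```python
# (window length, epochs) per reconstructed component, from the table above.
WINDOW_AND_EPOCHS = {
    "FE-IMF1": (2, 180),
    "FE-IMF2": (5, 190),
    "FE-IMF3": (7, 140),
    "FE-IMF4": (11, 220),
    "FE-IMF5": (7, 130),
}


def make_windows(series, window_length):
    """Pair each run of `window_length` values with the value that follows it."""
    inputs, targets = [], []
    for i in range(len(series) - window_length):
        inputs.append(series[i:i + window_length])
        targets.append(series[i + window_length])
    return inputs, targets


def superimpose(component_forecasts):
    """Sum the forecasts of all components point by point."""
    return [sum(values) for values in zip(*component_forecasts.values())]


X, y = make_windows([1, 2, 3, 4, 5], window_length=2)
print(X)  # [[1, 2], [2, 3], [3, 4]]
print(y)  # [3, 4, 5]
print(superimpose({"a": [1.0, 2.0], "b": [0.5, 0.5]}))  # [1.5, 2.5]
```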
Fig. 8 Prediction results of each FE-IMF
Fig. 9 Forecasting result of AQI
Values of the evaluation indices for the four models
| Model | RMSE | MAPE | R2 | GAR |
|---|---|---|---|---|
| ARIMA | 27.17 | 28.98% | 0.4503 | 58.77% |
| LSTM | 23.83 | 26.46% | 0.5769 | 63.23% |
| EEMD-LSTM | 16.22 | 17.20% | 0.8041 | 77.02% |
| CEEMDAN-FE-LSTM | 9.45 | 10.47% | 0.9334 | 89.45% |
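The evaluation indices can be computed as sketched below. RMSE and MAPE follow their standard definitions. Two readings are assumptions: the column with values between 0 and 1 is treated as the coefficient of determination, and GAR is treated as the percentage of days on which the predicted AQI falls in the same air quality grade as the observed AQI, since this record does not expand the acronym. The sample numbers are made up for illustration.

```python
import math


def rmse(actual, predicted):
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def grade(aqi):
    """Grade number 1..6 using the upper bounds 50, 100, 150, 200, 300."""
    for number, upper in enumerate((50, 100, 150, 200, 300), start=1):
        if aqi <= upper:
            return number
    return 6


def gar(actual, predicted):
    """Assumed definition: share of days with matching grade, in percent."""
    hits = sum(grade(a) == grade(p) for a, p in zip(actual, predicted))
    return 100 * hits / len(actual)


actual = [50, 100, 150, 200]       # illustrative values only
predicted = [55, 90, 150, 210]
print(rmse(actual, predicted))                 # 7.5
print(round(r_squared(actual, predicted), 3))  # 0.982
print(gar(actual, predicted))                  # 50.0
```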
Fig. 10 Values of the four indices for the four models