Multi-step short-term PM2.5 forecasting for enactment of proactive environmental regulation strategies.

Gul Muhammad Khan¹, Sohail Yousaf¹, Saba Gul²,³.

Abstract

Particulate matter is one of the key contributors to air pollution and climate change. Long-term exposure to air pollutants has serious health implications for both humans and plants, with a detrimental impact on the economy. Among the pollutants used to determine air quality, particulate matter has been linked to pulmonary complications, cardiovascular diseases, growth retardation and, ultimately, death. In agriculture, crop yield is reduced by the deposition of particulate matter on plant stomata, which raises food security concerns. The deleterious impact of air pollutants on human health and on agricultural and economic well-being highlights the importance of quantifying and forecasting particulate matter. Several deterministic and deep learning models have been employed in recent years to forecast the concentration of particulate matter. Among them, deep learning models have shown promising results in modeling and forecasting time series data. We explore a recurrent neural network, the long short-term memory (LSTM) model, to predict particulate matter (PM2.5) from multi-step multi-variate data of two of the most polluted regions of Asia: Beijing, China and Punjab, Pakistan. The LSTM model is tuned using Bayesian optimization to select appropriate hyper-parameters and weight initialization strategies for each dataset. The model predicted PM2.5 for the next hour with a root-mean-square error (RMSE) of 0.1913 (91.5% accuracy); the error increases with the number of time steps, with the 24-hour-ahead prediction having an RMSE of 0.7290. For the Punjab dataset, where data are recorded once a day, the RMSE for the next-day forecast is 0.2192. These multi-step short-term forecasts would play a pivotal role in establishing an early warning system based on the calculated air quality index (AQI) and would enable the government to enact policies to contain air pollution.

Keywords: AQI; Air pollution; Early warning system; Forecasting; LSTM; Particulate matter

Year: 2022 PMID: 35445884 PMCID: PMC9022063 DOI: 10.1007/s10661-022-10029-4

Source DB: PubMed Journal: Environ Monit Assess ISSN: 0167-6369 Impact factor: 3.307

Introduction

Owing to their detrimental impact on climate and on human and plant health, fine dust particles in the ambient atmosphere, known as particulate matter (PM), are of significant research interest. The major source of particulate matter is anthropogenic activity, which produces a complex mixture of fine particles and water droplets. South Asian countries in particular have some of the most polluted cities in the world, with particulate matter concentrations well above the standards set by the WHO (PM2.5: 25 µg/m³ 24-hour mean; PM10: 50 µg/m³ 24-hour mean). Particulate matter adversely affects human health by causing cardiovascular and pulmonary diseases and increasing the risk of lung cancer. Evidence suggests that deleterious health effects are attributable to long-term exposure to combustion-derived nanoparticles, which augments atherogenesis and causes acute adverse vascular and thrombotic effects (Mills et al., 2009). Particulate matter has also been observed to affect food safety, since it acts as a carrier for hazardous materials such as heavy metals, the accumulation of which causes organ damage (Noh et al., 2019). The metallic and other chemical elements in particulate matter have been linked to health issues such as pneumonia, asthma, cardiovascular disease, and neurological diseases, the combined effect of which can be fatal (Park, 2021). Moreover, during the COVID-19 pandemic, studies examined the association between particulate matter (PM2.5 and PM10) and the spread of the disease, with some suggesting that these fine particles act as a transport agent for SARS-CoV-2 (Nor et al., 2021). Solimini et al. (2021) applied negative binomial mixed-effect models across 63 countries to examine the statistical correlation of climate and air pollution parameters with COVID-19 cases; the results were suggestive of a link between them.
Thus, the quantification and analysis of air pollutant trends is of prime importance owing to their effect on human and plant life and on the economy. Forecasting air pollution and the air quality index (Gul & Khan, 2020) would enable governments and environmental protection agencies to enact policies to contain the outbreak of airborne diseases and to educate the public about the hazards associated with the concentration of a particular pollutant. To study pollutant trends and their impact on different domains, several deterministic and statistical models have been explored (Bai et al., 2018; Javadinejad et al., 2021; Ostad-Ali-Askari et al., 2017; Reddy et al., 2017). Pollutant concentrations depend strongly on meteorological parameters and change rapidly with them, which makes their patterns difficult to model accurately. Deep learning models in particular were found to be promising because recurrent neural networks (RNN) can emulate the trends in time series to predict pollutants. In this article, we present a recurrent neural network, the long short-term memory (LSTM) model, tuned by a Bayesian optimization strategy to learn and model these trends effectively. The network is trained on two datasets of different terrain from two of the most polluted regions in the world: Beijing, China and Punjab, Pakistan. A multi-step multi-variate LSTM model is introduced which emulates the pattern of particulate matter by encapsulating historical events and quantifies the air quality in a region. The proposed model would enable air quality regulatory bodies to take timely decisions to control emissions, inform the general public about detrimental health implications, and present solutions while monitoring the short-term trend of pollutants.

Literature survey

Worldwide epidemiological and toxicological studies suggest a strong association between exposure to particulate matter and adverse health effects, including pulmonary disease, cardiovascular disease, lung cancer, and premature mortality (Dockery et al., 1994; Lelieveld et al., 2015; Pope et al., 1995). Evidence from recent studies suggests that the most harmful effects of particulate matter are related to particle size. The effect of exposure is strongly influenced by weather, topography, concentration, and source. As particle size decreases, acidity and the ability to penetrate the lower respiratory tract increase (Kim et al., 2015). The short-term impact of fine particles and meteorological extremes on human health was studied in Seoul, South Korea by observing their correlation with mortality due to cerebrovascular diseases. The impact of particulate matter on conditions such as asthma, pneumonia, and neurological diseases was more pronounced under extreme weather conditions, which can often lead to death (Park, 2021). The impact of long-term exposure to PM2.5 and its relation to mortality rate was explored by Wang et al. (2020) in a study comprising 53 million senior Medicare beneficiaries living across the United States. Exposure to PM2.5 was observed to contribute to respiratory disease, cardiovascular disease, and certain cancers. The data showed that Black, younger, and urban beneficiaries were most vulnerable to the consequences of long-term exposure. Farhadi et al. (2020) carried out a meta-analysis to substantiate the relationship between exposure to PM2.5 and hospitalizations for myocardial infarction, and found that PM2.5 plays a key role in the development of myocardial infarction in humans.
Airborne microorganisms are pervasive in the atmosphere and are a vital constituent of particulate matter; owing to their pathogenic nature, they can cause a wide range of diseases (Zhai et al., 2018). Park et al. (2018) constructed a database of toxicity scores for source-specific particulate matter to gain insight into its role in triggering adverse health effects; this information can help decision makers create apposite abatement policies. In (Gul & Khan, 2020), an LSTM-inspired hazard level prediction system was developed using meteorological and pollutant data of two of the most polluted cities in the world. The hazard level for the next 24 hours was predicted for both cities with an average accuracy of 97%. Owing to the detrimental effect of PM2.5 and PM10 on human health, determining their concentration is of great importance. In (Doreswamy et al., 2020), several machine learning models are evaluated for forecasting pollutant concentration in Taiwan using chronological data from 76 stations recorded over 5 years. Gradient boosting regression outperformed the other regression models on the TAQMN dataset. A seamless air quality forecasting system would help decision makers improve air quality and reduce its impact on human health, agriculture, transport, economy, climate, and ecosystems. A novel SVR-based model is introduced in (Hu et al., 2016) and trained on static and dynamic data of air pollutant concentrations in Sydney. In comparison with an ANN model, the SVR model forecast the hourly concentration of air pollutants more accurately. Because air pollutants are dispersed according to wind direction and speed, the concentration of particulate matter is strongly correlated with spatiotemporal characteristics.
To leverage the spatial and temporal dependency of air pollutants for determining air quality, a weighted long short-term memory neural network extended model (WLSTME) was introduced in (Xiao et al., 2020). On pollutant and meteorological data from Beijing, Tianjin, and Hebei for 2015 to 2017, the network outperformed STSVR, LSTME, and GWR. A forecasting model using LSTM is introduced in (Han et al., 2018), which uses sensor data of aerosol optical depth (AOD), particulate matter, and meteorological conditions. The network was reported to predict the concentrations of harmful gases with 80% variability. The system was installed in Beijing, China, and its predictions have reportedly helped reduce air pollution in Beijing by 23%. Owing to the temporal characteristics of air pollutants, an LSTM recurrent neural network is employed by (Reddy et al., 2017) to estimate pollutant concentration 6 to 10 hours into the future. The proposed network predicted pollutant concentration for several future time steps with the same accuracy as a single-step forecast of 1 hour, which demonstrates its predictive robustness. A hybrid deep learning model is proposed in (Du et al., 2021), which employs a Bi-LSTM to capture temporal trends in the data and a 1-D CNN to learn spatial characteristics. The model learned the non-linear correlations and interdependence of multi-variate temporal pollutant data and produced effective forecasts on two real-time datasets from Beijing, China. LSTM models were observed to capture the non-linear correlations of highly variable pollutant data more effectively than deterministic and statistical models.
Thus, we propose an LSTM model tuned using optimization strategies that restructure the network to learn the temporal characteristics of multivariate pollutant data. The model was evaluated on two real-world air quality datasets to assess its forecasting performance and ability to generalize.

Prediction model framework

Forecasting air pollutants from meteorological and pollutant data requires capturing temporal trends, which are modelled more accurately by recurrent neural networks (RNN) (Reddy et al., 2017). A multi-step multi-variable LSTM model is introduced to capture the sequential trends in the data and forecast particulate matter. The prediction framework is built around an LSTM, chosen for its ability to retain information over longer sequences (Gul & Khan, 2020; Park, 2021; Reddy et al., 2017). An LSTM layer captures the temporal trends in the data and is followed by two dense layers. The LSTM layer has 128 nodes and is configured with an activation function, a weight initializer, and an L2 regularizer (Fig. 1). The weight initializer is selected according to the activation function to prevent vanishing and exploding gradients during backpropagation. L2 regularization, described by Eq. 1, is used to counter over-fitting. The first dense layer comprises 64 nodes and is followed by an output layer that forecasts PM2.5. The output layer can be widened to match the number of predicted time steps.

Fig. 1

Proposed Network Architecture
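
As a rough illustration of the layer sizes described above, the following sketch counts trainable parameters for an LSTM layer of 128 units, a dense layer of 64 units and an output layer with one unit per forecast step, and shows the form an L2 penalty usually takes. The input width of 8 features (PM2.5 plus seven meteorological variables), the function names and the penalty coefficient are assumptions made for illustration; the text does not state them.

```python
def lstm_params(n_features, units):
    """Trainable weights of one standard LSTM layer.

    Four gates (input, forget, cell, output), each with an input kernel,
    a recurrent kernel and a bias vector.
    """
    return 4 * units * (n_features + units + 1)


def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def model_params(n_features, horizon, lstm_units=128, dense_units=64):
    """Parameter counts for LSTM(128) -> Dense(64) -> Dense(horizon)."""
    counts = {
        "lstm": lstm_params(n_features, lstm_units),
        "dense": dense_params(lstm_units, dense_units),
        "output": dense_params(dense_units, horizon),
    }
    counts["total"] = sum(counts.values())
    return counts


def l2_penalty(weights, lam):
    """Usual L2 term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)


if __name__ == "__main__":
    # n_features=8 is an assumed input width, not a value from the text.
    print(model_params(n_features=8, horizon=1))
    print(model_params(n_features=8, horizon=24))
```

Under these assumptions, widening the output layer from 1 to 24 steps changes only the last layer's parameter count; the LSTM and first dense layer are unchanged.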

Employed datasets

We evaluated the network on two datasets from some of the most polluted regions in Asia. The first dataset covers Beijing, China, and is publicly available from the UCI website (Beijing Data) (Liang et al., 2015); it was further extended by (Reddy et al., 2017). The modified Beijing air quality dataset comprises PM2.5 pollutant data and meteorological parameters: dew point, temperature, pressure, cumulative hours of snow, combined wind direction, cumulative wind speed, and cumulative hours of rain. The data were recorded at 35 stations across the city of Beijing over a span of 7 years, from 2010 to 2017, and contain 43,825 samples (Table 2).

Table 2

Data-set Specifications Of Modified UCI Dataset

Modified Beijing air quality Database Details
City/Region	Beijing, China
Time Span	2010–2017
Meteorological data	Dew Point, Temperature, Pressure, Combined wind direction, Cumulated wind speed, Cumulated hours of snow, Cumulated hours of rain
Pollutants Data	PM2.5 concentration (µg/m³)
Number of Recording Stations	35

Data cleansing is performed and aberrant sensor values (high spikes in samples) caused by defects or anomalies are removed (Reddy et al., 2017). We further modified the dataset by adding columns for AQI and hazard level, computed according to Eq. 2 and the standards set by the US EPA, to visualize and understand the nature of the data (see Fig. 2). The formula for computing AQI is given by Eq. 2 (Kanchan et al., 2015), with PM2.5 used as the primary pollutant for quantifying air quality:

$$I_p = \frac{I_{Hi} - I_{Lo}}{BP_{Hi} - BP_{Lo}}\left(C_p - BP_{Lo}\right) + I_{Lo} \qquad (2)$$

where $I_p$ is the index for pollutant p, $C_p$ is the rounded concentration of pollutant p, $BP_{Hi}$ is the breakpoint greater than or equal to $C_p$, $BP_{Lo}$ is the breakpoint less than or equal to $C_p$, $I_{Hi}$ is the AQI corresponding to $BP_{Hi}$, and $I_{Lo}$ is the AQI corresponding to $BP_{Lo}$. The hazard levels are classified into seven categories according to pollutant concentration, as listed in Table 1 together with the level of health concern.

Fig. 2

Air Quality Index (AQI) Scale, EPA USA

Table 1

Air Quality Index set by environment protection agency, US

O3 (ppm)	PM10 (µg/m³)	PM2.5 (µg/m³)	CO (ppm)	SO2 (ppm)	NO2 (ppm)	AQI Values	Level of Health Concern
0.000–0.059	0–54	0–12	0–4.4	0.000–0.034	-	0–50	Good
0.060–0.075	55–154	12.1–35.4	4.5–9.4	0.035–0.144	-	51–100	Moderate
0.076–0.095	155–254	35.5–55.4	9.5–12.4	0.145–0.224	-	101–150	Unhealthy for sensitive groups
0.096–0.115	255–354	55.5–150.4	12.5–15.4	0.225–0.304	-	151–200	Unhealthy
0.116–0.374	355–424	150.5–250.4	15.5–30.4	0.305–0.604	0.65–1.24	201–300	Very Unhealthy
-	425–504	250.5–350.4	30.5–40.4	0.605–0.804	1.25–1.64	301–400	Hazardous
-	505–604	350.5–500.4	40.5–50.4	0.805–1.004	1.65–2.04	401–500	Hazardous
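
The AQI calculation of Eq. 2 can be sketched for PM2.5 with the breakpoints of Table 1. This is a minimal illustration, not the authors' code: the function name, the rounding of the concentration to one decimal place and the rounding of the result to an integer are assumptions.

```python
# (BP_Lo, BP_Hi, I_Lo, I_Hi, level) for PM2.5 in ug/m3, copied from Table 1.
PM25_BREAKPOINTS = [
    (0.0, 12.0, 0, 50, "Good"),
    (12.1, 35.4, 51, 100, "Moderate"),
    (35.5, 55.4, 101, 150, "Unhealthy for sensitive groups"),
    (55.5, 150.4, 151, 200, "Unhealthy"),
    (150.5, 250.4, 201, 300, "Very Unhealthy"),
    (250.5, 350.4, 301, 400, "Hazardous"),
    (350.5, 500.4, 401, 500, "Hazardous"),
]


def pm25_to_aqi(concentration):
    """Return (AQI, level of health concern) for a PM2.5 concentration."""
    c = round(concentration, 1)  # assumed rounding of C_p
    for bp_lo, bp_hi, i_lo, i_hi, level in PM25_BREAKPOINTS:
        if bp_lo <= c <= bp_hi:
            # Linear interpolation between the two breakpoints (Eq. 2).
            index = (i_hi - i_lo) / (bp_hi - bp_lo) * (c - bp_lo) + i_lo
            return round(index), level
    raise ValueError(f"PM2.5 concentration {concentration} is outside Table 1")


if __name__ == "__main__":
    for value in (8.0, 35.4, 100.0, 150.5):
        print(value, pm25_to_aqi(value))
```

Applying this function to every row of a dataset yields the AQI and hazard-level columns of the kind described in the text.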

The t-distributed stochastic neighbor embedding (t-SNE) is used for non-linear dimensionality reduction of the modified Beijing air quality dataset; Fig. 3 shows the distribution of the data with respect to the hazard levels calculated using Table 1.

Fig. 3

t-SNE plot of the modified UCI Beijing air quality dataset

The second dataset covers regions across Punjab, with four stations in Lahore, one of the most polluted cities in the world, and one station each in Multan and Gujranwala. The dataset comprises meteorological parameters acquired from the Pakistan Meteorological Department and pollutant concentrations obtained from the Environment Protection Department, Punjab, as described in Table 3; it was recorded over a span of 3 years at 6 stations. Data cleansing is performed by removing rows affected by sensor failure. AQI columns are added to the dataset using Eq. 2, and the health hazard is categorized into seven levels according to the concentration ranges in Fig. 2.

Table 3

Data-set Specifications Of Punjab Dataset

Punjab Database Details
City/Region	Punjab, Pakistan
Time Span	2017–2020
Meteorological data	Wind direction, temperature, barometric pressure, humidity, and visibility
Pollutants Data	PM10 concentration (µg/m³), PM2.5 concentration (µg/m³), nitrogen dioxide (ppm), sulphur dioxide (ppm) and surface ozone (ppm)
Number of Recording Stations	6

The data distribution of the Punjab dataset by hazard intensity is shown in the t-SNE plot (Fig. 4).

Fig. 4

t-SNE plot of Punjab dataset

Network hyper-parameters optimization

Hyperparameters of the deep neural network used to forecast particulate matter (PM2.5) are optimized to deliver the best performance on a particular dataset, with the validation set used for evaluation. The search space is explored to find the optimal set of hyperparameters according to Eq. 3:

$$x^{*} = \underset{x \in X}{\arg\min}\; f(x) \qquad (3)$$

where f(x) is the score of the objective function to be minimized, $x^{*}$ is the set of hyperparameters with the lowest score, and x is any value from the domain X. Several optimization strategies such as grid search, random search and Bayesian optimization are frequently employed for algorithmic optimization (Bergstra et al., 2011; Bergstra et al., 2013). In grid and random search, the experiments are independent of one another, so the results of earlier trials do not guide later ones; the search space is explored inefficiently, which makes these methods computationally expensive and time consuming. Bayesian optimization, by contrast, is a sequential model-based optimization algorithm that, instead of blindly exploring the search space, makes decisions based on the results of previous experiments. It thus uses historical evidence to narrow the search space down to an optimal set of hyperparameters (Bergstra et al., 2013). Bayesian search uses Bayes' rule (Eq. 4) to exploit prior knowledge and direct the search towards hyperparameter combinations that have a higher probability of improving model performance:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \qquad (4)$$

where $P(A \mid B)$ is the posterior probability, $P(B \mid A)$ is the likelihood, P(A) is the class prior probability and P(B) is the predictor prior probability. Expected improvement (EI) is employed as the acquisition function, which defines the criterion for selecting hyperparameters from the Gaussian process model used as a surrogate function for tuning (Eq. 5):

$$EI_{y^{*}}(x) = \int_{-\infty}^{y^{*}} \left(y^{*} - y\right) p(y \mid x)\, dy \qquad (5)$$

Here $p(y \mid x)$ is the Gaussian surrogate probability model, x is the hyperparameter, y is the true objective function score and $y^{*}$ is the lowest score of the true objective function observed so far. To find the optimal hyperparameters under the surrogate model $p(y \mid x)$, the expected improvement is maximized with respect to x.
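
When the surrogate's prediction at a candidate x is Gaussian with mean mu and standard deviation sigma, the expected improvement for a minimization problem has a well-known closed form, (y* - mu)Φ(z) + sigma φ(z) with z = (y* - mu)/sigma. The sketch below evaluates it and picks the candidate with the largest value. The function names and the candidate learning rates with their surrogate means and deviations are hypothetical and serve only to illustrate the selection step.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, y_best):
    """Expected improvement over y_best when minimizing, for y ~ N(mu, sigma^2)."""
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(y_best - mu, 0.0)
    z = (y_best - mu) / sigma
    return (y_best - mu) * normal_cdf(z) + sigma * normal_pdf(z)


def pick_next(candidates, y_best):
    """candidates: list of (hyperparameter, mu, sigma); returns the best by EI."""
    return max(candidates, key=lambda c: expected_improvement(c[1], c[2], y_best))


if __name__ == "__main__":
    # Hypothetical surrogate predictions of validation MSE per learning rate.
    candidates = [(0.001, 0.0430, 0.0005), (0.01, 0.0420, 0.0010), (0.03, 0.0425, 0.0030)]
    print(pick_next(candidates, y_best=0.0416))
```

The formula rewards both a low predicted score and a high predictive uncertainty, which is how the search balances exploiting known good regions against exploring untested ones.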

Results and analysis

To capture the temporal characteristics of the air pollution parameters, an LSTM recurrent neural network is employed. The dataset features are preprocessed by normalizing the data and removing anomalous entries caused by sensor failure. The data are then rearranged into cyclic packets, in which the previous value of PM2.5 is fed in alongside the meteorological data to predict the next time stamp. We start with a single-step prediction of 1 hour and gradually extend it to 24 hourly steps by increasing the number of output steps, which gives insight into the network's performance on multi-step data. After selection of an appropriate network, hyperparameters such as the learning rate, optimizer and activation function are tuned using Bayesian optimization. Expected improvement is used by Bayesian optimization as the acquisition criterion for selecting the next hyperparameter to evaluate. The network's performance for each hyperparameter from the search space is gauged on the validation set, which guides the search in an appropriate direction, as depicted in Fig. 5.

Fig. 5

Hyper-parameter tuning using Bayesian optimization
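
The preprocessing described in this section, normalization followed by rearranging the series into input and output packets, can be sketched as a sliding window over time-ordered rows. This is one plausible reading of the packet arrangement, not the authors' code; the function names, the z-score normalization and the window lengths are assumptions.

```python
def standardize(column):
    """Scale a list of numbers to zero mean and unit variance."""
    mean = sum(column) / len(column)
    var = sum((v - mean) ** 2 for v in column) / len(column)
    std = var ** 0.5 or 1.0  # avoid division by zero for constant columns
    return [(v - mean) / std for v in column]


def make_windows(rows, target_index, n_past, n_future):
    """Build supervised samples from time-ordered feature rows.

    Each input is n_past consecutive rows (pollutant and meteorological
    features); each output is the target column for the next n_future steps.
    """
    inputs, outputs = [], []
    last_start = len(rows) - n_past - n_future
    for start in range(last_start + 1):
        past = rows[start:start + n_past]
        future = rows[start + n_past:start + n_past + n_future]
        inputs.append(past)
        outputs.append([row[target_index] for row in future])
    return inputs, outputs


if __name__ == "__main__":
    # Toy series: column 0 plays the role of PM2.5, column 1 of temperature.
    rows = [[float(t), 10.0 * t] for t in range(6)]
    X, y = make_windows(rows, target_index=0, n_past=3, n_future=2)
    print(len(X), y)
```

Increasing `n_future` from 1 to 24 while holding `n_past` fixed corresponds to the move from single-step to multi-step forecasting described above.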

Robust initialization techniques are used in this section to ease the training of deep neural networks by mitigating vanishing and exploding gradients. Weight initialization schemes such as LeCun, Glorot and He initialization are employed to train the proposed network effectively. The network was optimized by first tuning the activation function and optimizer for each dataset and then the learning rate. The tuned hyperparameters, evaluated by their performance on the validation set for the modified Beijing air quality dataset, are listed in Tables 4 and 5. MSE is used as the metric during training and validation to gauge the performance of the network across the search space and find optimal hyperparameters.

Table 4

Selection of Optimizer and Activation using Bayesian optimization on modified Beijing air quality dataset

Optimizer	Activation	MSE
Adagrad	tanh	0.041617
Adagrad	linear	0.0416348
Adagrad	linear	0.0416676
Adagrad	tanh	0.0419074
Adadelta	relu	0.0419169
Adagrad	linear	0.0419371
Adagrad	softsign	0.0420134
Adagrad	tanh	0.04209329

The bold and italic emphasis is used to highlight the best parameter in each table

Table 5

Optimal learning rate selection using Bayesian optimization on modified Beijing air quality dataset

Learning rate	MSE
0.03	0.0411263
0.013885	0.0413445
0.00665	0.0415720
0.01271	0.0416139
0.009799	0.0416274
0.005592	0.0417381
0.011722	0.0418148
0.003004	0.0418748

After selection of the optimal hyperparameters, the network is reconstructed and retrained, and a boost in performance is observed. For weight initialization, He (Kaiming) initialization is selected (He et al., 2015) based on the nature of the activation function. Because ReLU is not differentiable at zero and is asymmetric, He et al. (2015) proposed a weight initialization scheme tailored to deep neural networks that employ asymmetric, non-linear activation functions. He normal initialization draws weights from a normal distribution (N) with a mean of 0.0 and a standard deviation of $\sqrt{2/n_{in}}$, where $n_{in}$ is the number of inputs to the node. He uniform initialization draws weights from a uniform distribution (U) over the range $-\sqrt{6/n_{in}}$ to $\sqrt{6/n_{in}}$, where $n_{in}$ is the number of input nodes. Upon applying the He, Glorot and LeCun weight initializers to the network, it was observed that He uniform initialization helped the network most in reaching appropriate weights for learning. Figures 6 and 7 show the training and validation performance of the network, which forecasts the PM2.5 level for the next hour and the next day with RMSE of 0.1913 and 0.6341, respectively, on the test set.

Fig. 6

Actual Vs. Predicted values of employed architecture on Hourly data of modified UCI Beijing air quality dataset

Fig. 7

Actual Vs. Predicted values of employed architecture on 24 hour data of modified UCI Beijing air quality dataset

Forecasting air pollutants can help identify long-term trends and the impact of exposure, and can enable the relevant bodies to devise strategies to contain the exponential growth of pollutants and warn sensitive groups. To study longer-term patterns of particulate matter, the model is further modified to predict multiple future time steps. Table 6 shows that the evaluation metrics degrade as the number of future hours increases: the root-mean-square error (RMSE) generally increases and the variance score (R²) drops as additional steps are predicted, although the trend is not strictly monotonic (the 12-hour horizon scores worse than the 15-, 18- and 21-hour horizons). The degradation arises because the number of future steps to predict increases while the number of past steps is kept constant; that it remains gradual shows the robustness of the proposed LSTM model in multi-step prediction of long sequences.

Table 6

Test RMSE for multi-step prediction of PM2.5 on modified Beijing air quality dataset

Model	RMSE	R²
3 future hours	0.2598	0.932526
6 future hours	0.3475	0.879228
9 future hours	0.4096	0.832258
12 future hours	0.5578	0.688855
15 future hours	0.5195	0.730129
18 future hours	0.5020	0.748015
21 future hours	0.5408	0.707541
24 future hours	0.7290	0.468538
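
The two metrics in Table 6 can be computed as in the sketch below, which also checks an arithmetic pattern in the reported values: each (RMSE, R²) pair satisfies R² ≈ 1 − RMSE² to within about 1e-4. That is what one would expect if the target were scaled to unit variance before evaluation; this scaling is an inference from the numbers, not something stated in the text. The function names are illustrative.

```python
def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    n = len(y_true)
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n) ** 0.5


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# (RMSE, R2) pairs copied from Table 6, for 3 to 24 future hours.
TABLE_6 = [
    (0.2598, 0.932526), (0.3475, 0.879228), (0.4096, 0.832258),
    (0.5578, 0.688855), (0.5195, 0.730129), (0.5020, 0.748015),
    (0.5408, 0.707541), (0.7290, 0.468538),
]

if __name__ == "__main__":
    for reported_rmse, reported_r2 in TABLE_6:
        implied_r2 = 1.0 - reported_rmse ** 2
        print(reported_rmse, reported_r2, round(implied_r2, 6))
```

If that inference holds, the RMSE values here are in normalized units and are not directly comparable with RMSE values reported in µg/m³; R² is the scale-independent quantity.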

The Deep Air model proposed by (Reddy et al., 2017) was used to predict the particulate matter (PM2.5) of Beijing, China, and was trained and evaluated on the modified UCI Beijing air quality dataset. On the same dataset, our tuned model predicted particulate matter more effectively at both single and multiple steps, as shown in Table 7. The lower root-mean-square error (RMSE) and higher variance score (R²) in Table 7 show the robustness of our model in comparison with (Reddy et al., 2017).

Table 7

Comparison of test RMSE with Deep Air for various future time lags on modified Beijing air quality dataset

Model	Time Steps	RMSE	R²
Our Model	1 future hour/single step	0.1913	0.963402
Deep Air (Reddy et al., 2017)	1 future hour/single step	12.78	0.96
Our Model	5 future hours	0.3229	0.895758
Deep Air (Reddy et al., 2017)	5 future hours	44.15	0.689
Our Model	10 future hours	0.4086	0.833078
Deep Air (Reddy et al., 2017)	10 future hours	74.8	0.588

The network is then re-tuned for the Punjab dataset by observing the trend of the validation-set MSE. Bayesian optimization is used to search the space of learning rates, activation functions and optimizers. Since Tanh, which is non-linear in nature, is selected as an optimized activation function, the best practice of weight initialization used to prevent vanishing gradients is Xavier (Glorot) initialization, as proposed by (Glorot & Bengio, 2010). Glorot uniform initialization draws each weight from a uniform probability distribution (U) over the range $-\sqrt{6/(n_{in}+n_{out})}$ to $\sqrt{6/(n_{in}+n_{out})}$, where $n_{in}$ is the number of input nodes and $n_{out}$ is the number of output nodes in the weight tensor. In the case of Glorot normal initialization, samples are drawn from a truncated normal distribution that is centered on zero and has a standard deviation of $\sqrt{2/(n_{in}+n_{out})}$. LeCun uniform initialization draws random samples from a uniform probability distribution (U) over the range $-\sqrt{3/n_{in}}$ to $\sqrt{3/n_{in}}$, where $n_{in}$ is the number of input nodes in the weight tensor; the number of output nodes does not enter this range. In the case of LeCun normal initialization, samples are drawn from a truncated normal distribution that is centered on zero and has a standard deviation of $\sqrt{1/n_{in}}$.

Although, according to (Glorot & Bengio, 2010), Glorot initialization performs better for ReLU activation, LeCun normal initialization achieved superior performance in terms of MSE on our validation set. Tables 8 and 9 describe the results of the tuned hyperparameters. These parameters are employed to retrain the network, and a significant improvement in performance is observed. The performance of the network on the training and validation sets is described in Fig. 8. The network can forecast PM2.5 with an RMSE of 0.2192 on the test set for the 24-hour future time step.
The higher RMSE in the case of the Punjab dataset can be attributed to the data being recorded daily, which results in high variance and makes it more difficult for the network to capture trends than in the modified UCI dataset, where parameters are recorded hourly.
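The four initialization schemes discussed above can be illustrated with a minimal sketch using only the Python standard library. This is not the authors' code: the function names are hypothetical, and truncating the normal distribution at two standard deviations is an assumption (a common convention in deep learning libraries) rather than something stated in the text.

```python
import math
import random


def glorot_uniform_limit(n_in, n_out):
    # Half-width of the uniform range U(-limit, limit) for Glorot uniform.
    return math.sqrt(6.0 / (n_in + n_out))


def glorot_normal_std(n_in, n_out):
    # Standard deviation for Glorot normal.
    return math.sqrt(2.0 / (n_in + n_out))


def lecun_uniform_limit(n_in):
    # Half-width of the uniform range for LeCun uniform (depends on n_in only).
    return math.sqrt(3.0 / n_in)


def lecun_normal_std(n_in):
    # Standard deviation for LeCun normal (depends on n_in only).
    return math.sqrt(1.0 / n_in)


def sample_uniform(n_in, n_out, limit, rng):
    # n_in x n_out weight matrix with entries drawn from U(-limit, limit).
    return [[rng.uniform(-limit, limit) for _ in range(n_out)]
            for _ in range(n_in)]


def sample_truncated_normal(n_in, n_out, std, rng):
    # n_in x n_out weight matrix drawn from N(0, std^2); values farther than
    # two standard deviations from zero are redrawn (assumed truncation rule).
    def draw():
        while True:
            x = rng.gauss(0.0, std)
            if abs(x) <= 2.0 * std:
                return x
    return [[draw() for _ in range(n_out)] for _ in range(n_in)]


if __name__ == "__main__":
    rng = random.Random(0)
    n_in, n_out = 8, 16  # illustrative layer sizes, not taken from the paper
    w_glorot = sample_uniform(n_in, n_out, glorot_uniform_limit(n_in, n_out), rng)
    w_lecun = sample_truncated_normal(n_in, n_out, lecun_normal_std(n_in), rng)
    print(len(w_glorot), len(w_glorot[0]), len(w_lecun), len(w_lecun[0]))
```

Because the LeCun variants ignore the number of output nodes, they give a larger spread than the Glorot variants whenever a layer has at least one output node, which may be one reason the two schemes behave differently during tuning.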

Table 8

Selection of Optimizer and Activation using Bayesian optimization on Punjab dataset

Optimizer	Activation	MSE
Nadam	relu	0.6396949
Nadam	relu	0.7275196
Nadam	relu	0.7335685
Nadam	relu	0.7343104
Nadam	relu	0.7576756
Nadam	relu	0.7613520
Nadam	relu	0.7674339
Nadam	relu	0.7830696

The bold and italic emphasis is used to highlight the best parameter in each table

Table 9

Optimal learning rate selection using Bayesian optimization on Punjab dataset

Learning rate	MSE
0.0199939	0.5433736
0.0032243	0.6428966
0.0035615	0.6451763
0.0145944	0.6503993
0.0046863	0.6523746
0.0013331	0.6587814
0.0013885	0.6672297
0.0129494	0.6792039

The bold and italic emphasis is used to highlight the best parameter in each table

Fig. 8

Actual vs. predicted values of the employed architecture on 24-hour data of the Punjab dataset

Since pollutant data is highly variable and changes rapidly with meteorological parameters, the forecasting capability of the model drops gradually as the prediction extends to further future time steps, as observed. As Tables 4 to 9 show, the model generalizes well and emulates the trend of particulate matter for multi-step multivariate data of different regions.
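The way forecast error is tracked across future time steps can be sketched as a per-horizon RMSE computation. The snippet below is an illustrative assumption about how such an evaluation might be organized, not the authors' evaluation code; the function name and the toy values are hypothetical.

```python
import math


def rmse_per_horizon(y_true, y_pred):
    # y_true and y_pred are lists of samples; each sample is a list with one
    # value per future time step. Returns one RMSE value per future step.
    n_samples = len(y_true)
    n_steps = len(y_true[0])
    result = []
    for k in range(n_steps):
        squared_error = sum((y_true[i][k] - y_pred[i][k]) ** 2
                            for i in range(n_samples))
        result.append(math.sqrt(squared_error / n_samples))
    return result


if __name__ == "__main__":
    # Made-up values for three samples and a three-step horizon.
    actual = [[0.2, 0.3, 0.5], [0.1, 0.4, 0.6], [0.3, 0.3, 0.2]]
    predicted = [[0.2, 0.2, 0.1], [0.1, 0.5, 0.9], [0.3, 0.4, 0.6]]
    for step, err in enumerate(rmse_per_horizon(actual, predicted), start=1):
        print(f"step {step}: RMSE = {err:.4f}")
```

Reporting the error separately for each step, rather than as a single average, makes the gradual degradation with longer horizons directly visible.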

Conclusion

A deep learning model was proposed for quantifying hazard level by predicting particulate matter concentration, and it was evaluated on two of the most polluted regions in the world: Beijing, China and Punjab, Pakistan. Hyper-parameter tuning and weight initialization strategies were adopted to tune the network by exploring the search space effectively. The tuned LSTM model was used to learn and model the temporal trends of the particulate matter and meteorological data in order to predict future concentrations of PM2.5. The hourly concentration of PM2.5 in Beijing was predicted with an RMSE of 0.1913, and based on the average 24-hour data the RMSE rises to 0.6341, with state-of-the-art performance observed. The 24-hour prediction for Punjab has an RMSE of 0.2192; this degradation in model performance can be attributed to the drastic variance in data recorded over a span of 24 hours. When the historic data is fed in hourly time stamps, the degradation in performance was observed to be subtle. The LSTM forecasting model helps in mapping the AQI level and identifying the health concerns associated with it. This would enable the general public, government and environmental protection agencies to quantify the risk associated with the air quality index. It would also enable the authorities to take effective measures to minimize the consequences, and assist environmental protection agencies in enacting policies that reduce the health and economic risk associated with high concentrations of particulate matter.

