Development of a stacked ensemble model for forecasting and analyzing daily average PM2.5 concentrations in Beijing, China.

Binxu Zhai, Jianguo Chen.

Abstract

A stacked ensemble model is developed for forecasting and analyzing daily average concentrations of fine particulate matter (PM2.5) in Beijing, China. Guided by an exploratory data analysis, special feature extraction procedures (simplification, polynomial, transformation and combination) are applied before modeling to identify potentially significant features. Stability feature selection and tree-based feature selection methods are then used to select important variables and to evaluate feature importance. Single models, namely LASSO, AdaBoost, XGBoost and a multi-layer perceptron optimized by a genetic algorithm (GA-MLP), are established in the level 0 space and are then integrated by support vector regression (SVR) in the level 1 space via stacked generalization. The feature importance analysis identifies nitrogen dioxide (NO2) and carbon monoxide (CO) concentrations measured in the city of Zhangjiakou as the most important pollution factors for forecasting PM2.5 concentrations. Among meteorological factors, local extreme wind speed and maximal wind speed have the strongest effect on the cross-regional transport of contaminants. Pollutants from the cities of Zhangjiakou and Chengde have a stronger impact on air quality in Beijing than other surrounding factors. Model evaluation shows that, when applied to new data, the ensemble model generally performs better than a single nonlinear forecasting model, with a coefficient of determination (R²) of 0.90 and a root mean squared error (RMSE) of 23.69 μg/m³. For single pollutant grade recognition, the proposed model performs better on days with good air quality than on days with high levels of pollution. The overall classification accuracy is 73.93%, with most misclassifications occurring between adjacent categories. The results demonstrate the interpretability and generalizability of the stacked ensemble model.
Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
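
The two-level structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes scikit-learn is available, uses synthetic data from `make_regression` in place of the Beijing measurements, substitutes `GradientBoostingRegressor` for XGBoost, and uses a plain `MLPRegressor` without genetic-algorithm tuning. All hyperparameters are arbitrary placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    AdaBoostRegressor,
    GradientBoostingRegressor,
    StackingRegressor,
)
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (assumption: NOT the Beijing PM2.5 dataset).
X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Level 0: heterogeneous single models, loosely mirroring the abstract.
# GradientBoostingRegressor stands in for XGBoost; the MLP is not GA-optimized.
level0 = [
    ("lasso", make_pipeline(StandardScaler(), Lasso(alpha=0.1))),
    ("adaboost", AdaBoostRegressor(n_estimators=50, random_state=0)),
    ("gbr", GradientBoostingRegressor(n_estimators=100, random_state=0)),
    (
        "mlp",
        make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
        ),
    ),
]

# Level 1: SVR trained on the level-0 predictions.
level1 = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=100.0))

# cv=5 means the level-1 model is fitted on out-of-fold level-0 predictions,
# which is the core idea of stacked generalization.
stack = StackingRegressor(estimators=level0, final_estimator=level1, cv=5)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)

# The two evaluation metrics reported in the abstract.
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
print(f"R2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

Because the data here are synthetic, the printed scores say nothing about the paper's reported results. For real daily series, a time-aware splitter such as `TimeSeriesSplit` would likely be a safer choice for `cv` than the default folds.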

Keywords:  Air quality forecast; Feature extraction; Feature importance analysis; Feature selection; Stacked generalization strategy

Year:  2018        PMID: 29679837     DOI: 10.1016/j.scitotenv.2018.04.040

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


  9 in total

1.  Flexible Bayesian Ensemble Machine Learning Framework for Predicting Local Ozone Concentrations.

Authors:  Xiang Ren; Zhongyuan Mi; Ting Cai; Christopher G Nolte; Panos G Georgopoulos
Journal:  Environ Sci Technol       Date:  2022-03-21       Impact factor: 11.357

2.  Machine-learning model to predict the cause of death using a stacking ensemble method for observational data.

Authors:  Chungsoo Kim; Seng Chan You; Jenna M Reps; Jae Youn Cheong; Rae Woong Park
Journal:  J Am Med Inform Assoc       Date:  2021-06-12       Impact factor: 4.497

3.  Explainable artificial intelligence (XAI) for exploring spatial variability of lung and bronchus cancer (LBC) mortality rates in the contiguous USA.

Authors:  Zia U Ahmed; Kang Sun; Michael Shelly; Lina Mu
Journal:  Sci Rep       Date:  2021-12-16       Impact factor: 4.379

4.  A data calibration method for micro air quality detectors based on a LASSO regression and NARX neural network combined model.

Authors:  Bing Liu; Yueqiang Jin; Dezhi Xu; Yishu Wang; Chaoyang Li
Journal:  Sci Rep       Date:  2021-10-27       Impact factor: 4.379

5.  The Prediction of Influenza-like Illness and Respiratory Disease Using LSTM and ARIMA.

Authors:  Yu-Tse Tsan; Der-Yuan Chen; Po-Yu Liu; Endah Kristiani; Kieu Lan Phuong Nguyen; Chao-Tung Yang
Journal:  Int J Environ Res Public Health       Date:  2022-02-07       Impact factor: 3.390

6.  Deep Ensemble Machine Learning Framework for the Estimation of PM2.5 Concentrations.

Authors:  Wenhua Yu; Shanshan Li; Tingting Ye; Rongbin Xu; Jiangning Song; Yuming Guo
Journal:  Environ Health Perspect       Date:  2022-03-07       Impact factor: 11.035

7.  Compute Tomography Radiomics Analysis on Whole Pancreas Between Healthy Individual and Pancreatic Ductal Adenocarcinoma Patients: Uncertainty Analysis and Predictive Modeling.

Authors:  Shuo Wang; Chi Lin; Alexander Kolomaya; Garett P Ostdiek-Wille; Jeffrey Wong; Xiaoyue Cheng; Yu Lei; Chang Liu
Journal:  Technol Cancer Res Treat       Date:  2022 Jan-Dec

8.  A Particulate Matter Concentration Prediction Model Based on Long Short-Term Memory and an Artificial Neural Network.

Authors:  Junbeom Park; Seongju Chang
Journal:  Int J Environ Res Public Health       Date:  2021-06-24       Impact factor: 3.390

9.  Application of RR-XGBoost combined model in data calibration of micro air quality detector.

Authors:  Bing Liu; Xianghua Tan; Yueqiang Jin; Wangwang Yu; Chaoyang Li
Journal:  Sci Rep       Date:  2021-08-02       Impact factor: 4.379

