
Comparison of Machine Learning and Land Use Regression for fine scale spatiotemporal estimation of ambient air pollution: Modeling ozone concentrations across the contiguous United States.

Xiang Ren, Zhongyuan Mi, Panos G Georgopoulos.

Abstract

BACKGROUND: Spatial linear Land-Use Regression (LUR) is commonly used for long-term modeling of air pollution in support of exposure and epidemiological assessments. Machine Learning (ML) methods, in conjunction with spatiotemporal modeling, can provide more flexible exposure-relevant metrics and have been studied using different model structures. There is, however, a lack of comparisons of the methods available within these two modeling frameworks that could guide model/algorithm selection in air quality epidemiology.
OBJECTIVE: The present study compares thirteen algorithms for spatial/spatiotemporal modeling, applied to daily maxima of 8-hour running averages of ambient ozone concentrations at spatial resolutions corresponding to census tracts, to support estimation of annual ozone design values across the contiguous US. These algorithms were selected from nine representative categories and trained using predictors that included chemistry-transport model predictions, meteorological factors, land use and land cover, and stationary and mobile emissions.
METHODS: To obtain the best predictive performance, model structures were optimized through a repeated coarse/fine grid search guided by expert knowledge. Six target-oriented validation strategies were used to prevent overfitting and avoid over-optimistic model evaluation results. To take full advantage of the power of different algorithms, we introduced tuning of sample weights in spatiotemporal modeling to ensure predictive accuracy for peak concentrations, which is crucial for exposure assessments. In spatial modeling, four interpretation and visualization tools were introduced to explain predictions from different algorithms.
RESULTS: Nonlinear ML methods achieved higher prediction accuracy than linear LUR, and the improvements were more pronounced for spatiotemporal modeling (a nearly 10%-40% decrease in predicted RMSE). With tuned sample weights, spatiotemporal models can predict the concentrations used to calculate ozone design values with accuracy comparable to, or even better than, that of spatial models (a nearly 30% decrease in cross-validated RMSE). We visualized the underlying nonlinear relationships, heterogeneous associations and complex interactions from the two best-performing ML algorithms, i.e., Random Forest and Extreme Gradient Boosting, and found that these complex patterns were relatively less important for model accuracy in spatial modeling.
CONCLUSION: Machine Learning can provide estimates that are more interpretable and practical than those from linear regression, improving accuracy in modeling human exposures. Careful design of hyperparameter tuning, together with flexible data splitting and validation, is crucial to obtain reliable and stable results. Desirable/successful nonlinear models are expected to capture similar nonlinear patterns and interactions across different ML algorithms.
Copyright © 2020 The Authors. Published by Elsevier Ltd. All rights reserved.
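
The abstract does not spell out the six validation strategies or how the sample weights were tuned, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes a leave-location-out split (no monitor appears in both training and test folds) as one possible target-oriented strategy, a single-predictor weighted linear fit as a stand-in for the actual algorithms, and a hypothetical peak threshold of 70 (arbitrary units). All function names and the synthetic data are made up for illustration.

```python
import math
import random

def peak_weights(y, threshold, peak_weight):
    """Weight observations at or above `threshold` by `peak_weight`, others by 1."""
    return [peak_weight if v >= threshold else 1.0 for v in y]

def group_k_folds(groups, k):
    """Yield (train_idx, test_idx) so that no group (e.g. a monitor) is split."""
    unique = sorted(set(groups))
    fold_of = {g: i % k for i, g in enumerate(unique)}
    for fold in range(k):
        test = [i for i, g in enumerate(groups) if fold_of[g] == fold]
        train = [i for i, g in enumerate(groups) if fold_of[g] != fold]
        yield train, test

def weighted_linear_fit(x, y, w):
    """Closed-form weighted least squares for y ~ a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def cv_peak_rmse(x, y, groups, threshold, peak_weight, k=5):
    """Grouped-CV RMSE evaluated only on held-out peak observations."""
    truth, pred = [], []
    for train, test in group_k_folds(groups, k):
        xt = [x[i] for i in train]
        yt = [y[i] for i in train]
        a, b = weighted_linear_fit(xt, yt, peak_weights(yt, threshold, peak_weight))
        for i in test:
            if y[i] >= threshold:
                truth.append(y[i])
                pred.append(a + b * x[i])
    return rmse(truth, pred)

# Synthetic data (assumption): 20 hypothetical monitors, 30 days each, with
# curvature at high values so an unweighted linear fit under-predicts peaks.
rng = random.Random(0)
groups = [m for m in range(20) for _ in range(30)]
x = [rng.uniform(20, 90) for _ in groups]
y = [xi + 0.01 * max(xi - 60, 0) ** 2 + rng.gauss(0, 3) for xi in x]

# Tune the peak weight by comparing held-out peak RMSE across candidates.
candidates = [1.0, 2.0, 5.0, 10.0]
scores = {w: cv_peak_rmse(x, y, groups, threshold=70.0, peak_weight=w)
          for w in candidates}
best_weight = min(scores, key=scores.get)
print(scores, best_weight)
```

Grouping the folds by monitor keeps the evaluation closer to the intended use (predicting at unmonitored locations) than a purely random split would, and scoring only held-out peaks makes the weight choice reflect the accuracy that matters for design values. In practice the linear stand-in would be replaced by whichever algorithm is being tuned.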

Keywords:  Black-box model interpretation; Land use regression; Machine learning; Ozone; Spatiotemporal modeling

Year:  2020        PMID: 32593834     DOI: 10.1016/j.envint.2020.105827

Source DB:  PubMed          Journal:  Environ Int        ISSN: 0160-4120            Impact factor:   9.621


  6 in total

1.  New Deep Learning Model to Estimate Ozone Concentrations Found Worrying Exposure Level over Eastern China.

Authors:  Sichen Wang; Xi Mu; Peng Jiang; Yanfeng Huo; Li Zhu; Zhiqiang Zhu; Yanlan Wu
Journal:  Int J Environ Res Public Health       Date:  2022-06-11       Impact factor: 4.614

2.  Modeling spatial variation of gaseous air pollutants and particulate matters in a Metropolitan area using mobile monitoring data.

Authors:  Jia Xu; Wen Yang; Zhipeng Bai; Renyi Zhang; Jun Zheng; Meng Wang; Tong Zhu
Journal:  Environ Res       Date:  2022-02-08       Impact factor: 8.431

3.  Flexible Bayesian Ensemble Machine Learning Framework for Predicting Local Ozone Concentrations.

Authors:  Xiang Ren; Zhongyuan Mi; Ting Cai; Christopher G Nolte; Panos G Georgopoulos
Journal:  Environ Sci Technol       Date:  2022-03-21       Impact factor: 11.357

4.  Ozone Concentration Levels in Urban Environments-Upper Silesia Region Case Study.

Authors:  Joanna Kobza; Mariusz Geremek; Lechosław Dul
Journal:  Int J Environ Res Public Health       Date:  2021-02-04       Impact factor: 3.390

5.  The impact of the COVID-19 pandemic on air pollution: A global assessment using machine learning techniques.

Authors:  Jasper S Wijnands; Kerry A Nice; Sachith Seneviratne; Jason Thompson; Mark Stevenson
Journal:  Atmos Pollut Res       Date:  2022-04-28       Impact factor: 4.831

6.  Multi-stage ensemble-learning-based model fusion for surface ozone simulations: A focus on CMIP6 models.

Authors:  Zhe Sun; Alexander T Archibald
Journal:  Environ Sci Ecotechnol       Date:  2021-09-15
