Improving the robustness of beach water quality modeling using an ensemble machine learning approach.

Leizhi Wang, Zhenduo Zhu, Lauren Sassoubre, Guan Yu, Chen Liao, Qingfang Hu, Yintang Wang.

Abstract

Microbial pollution of beach water can expose swimmers to harmful pathogens. Predictive modeling provides an alternative method for beach management that addresses several limitations of traditional culture-based methods of assessing water quality. Widely used machine learning methods often show high variability in performance from one year or beach to another. As a result, the best machine learning method varies between beaches and years, making method selection difficult. This study proposes an ensemble machine learning approach, referred to as model stacking, that has a two-layered learning structure: the outputs of five widely used individual machine learning models (multiple linear regression, partial least squares, sparse partial least squares, random forest, and Bayesian network) are taken as input features for another model that produces the final prediction. Applying this approach to three beaches along eastern Lake Erie, New York, USA, we show that the model stacking approach generally produced reliably good predictions compared with all five base models. The accuracy ranking of the stacking model was consistently 1st or 2nd in every year, with yearly-average accuracy of 78%, 81%, and 82.3% at the three studied beaches, respectively. This study highlights the value of the model stacking approach in predicting beach water quality and solving other pressing environmental problems.
Copyright © 2020 Elsevier B.V. All rights reserved.
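
The two-layered structure described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the data are synthetic, the task is framed as regression, and the three base learners (`ols`, `ridge`, `knn`) are simple hypothetical stand-ins rather than the five base models named above. Each base learner is fitted on training folds, its out-of-fold predictions become the input features of a second-layer model, and that second-layer model produces the final prediction.

```python
import numpy as np

# Hypothetical base learners (stand-ins, not the paper's five models).
# Each takes training data and returns a prediction function.
def ols(X, y):
    A = np.column_stack([np.ones(len(X)), X])
    coef = np.linalg.lstsq(A, y, rcond=None)[0]
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ coef

def ridge(X, y, alpha=50.0):
    mu_x, mu_y = X.mean(axis=0), y.mean()
    Xc = X - mu_x
    coef = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - mu_y))
    return lambda Xn: (Xn - mu_x) @ coef + mu_y

def knn(X, y, k=5):
    def predict(Xn):
        # Squared Euclidean distance from every new point to every training point.
        d = ((Xn[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
        nearest = np.argsort(d, axis=1)[:, :k]
        return y[nearest].mean(axis=1)
    return predict

def fit_stack(X, y, base_learners, n_folds=5, seed=0):
    """Layer 1: out-of-fold base predictions. Layer 2: linear meta-model on them."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), n_folds)
    oof = np.zeros((len(X), len(base_learners)))
    for held_out in folds:
        train = np.setdiff1d(np.arange(len(X)), held_out)
        for j, learner in enumerate(base_learners):
            # Predict only on rows the base model did not see during fitting.
            oof[held_out, j] = learner(X[train], y[train])(X[held_out])
    meta = ols(oof, y)                                   # second-layer model
    fitted = [learner(X, y) for learner in base_learners]  # refit on all data

    def predict(Xn):
        level_one = np.column_stack([f(Xn) for f in fitted])
        return meta(level_one)
    return predict

# Synthetic data, for illustration only.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = 2.0 * X[:, 0] + np.sin(3.0 * X[:, 1]) + rng.normal(scale=0.3, size=300)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

stack = fit_stack(X_train, y_train, [ols, ridge, knn])
stack_pred = stack(X_test)
rmse = float(np.sqrt(np.mean((stack_pred - y_test) ** 2)))
print(f"stacked RMSE on held-out data: {rmse:.3f}")
```

Training the second layer on out-of-fold predictions, rather than on in-sample fits, is a common design choice in stacking because it keeps the meta-model from rewarding base learners that merely memorize their training data.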

Keywords:  E. coli; Fecal indicator bacteria; Machine learning model; Model stacking; Water quality

Year:  2020        PMID: 33131841     DOI: 10.1016/j.scitotenv.2020.142760

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


  3 in total

1.  Systematic review of predictive models of microbial water quality at freshwater recreational beaches.

Authors:  Cole Heasley; J Johanna Sanchez; Jordan Tustin; Ian Young
Journal:  PLoS One       Date:  2021-08-26       Impact factor: 3.240

2.  UAV-based multi-sensor data fusion and machine learning algorithm for yield prediction in wheat.

Authors:  Shuaipeng Fei; Muhammad Adeel Hassan; Yonggui Xiao; Xin Su; Zhen Chen; Qian Cheng; Fuyi Duan; Riqiang Chen; Yuntao Ma
Journal:  Precis Agric       Date:  2022-08-03       Impact factor: 5.767

3.  A data-driven interpretable ensemble framework based on tree models for forecasting the occurrence of COVID-19 in the USA.

Authors:  Hu-Li Zheng; Shu-Yi An; Bao-Jun Qiao; Peng Guan; De-Sheng Huang; Wei Wu
Journal:  Environ Sci Pollut Res Int       Date:  2022-09-22       Impact factor: 5.190
