Predicting uncertainty of machine learning models for modelling nitrate pollution of groundwater using quantile regression and UNEEC methods.

Omid Rahmati, Bahram Choubin, Abolhasan Fathabadi, Frederic Coulon, Elinaz Soltani, Himan Shahabi, Eisa Mollaefar, John Tiefenbacher, Sabrina Cipullo, Baharin Bin Ahmad, Dieu Tien Bui.

Abstract

Although estimating the uncertainty of models used to predict nitrate contamination of groundwater is essential for groundwater management, it has generally been ignored. This gap motivated the present study, which explores the predictive uncertainty of machine-learning (ML) models in this field using two residual-based uncertainty methods: quantile regression (QR) and uncertainty estimation based on local errors and clustering (UNEEC). Prediction-interval coverage probability (PICP), the most important of the statistical measures of uncertainty, was used to evaluate uncertainty. Three state-of-the-art ML models, support vector machine (SVM), random forest (RF), and k-nearest neighbor (kNN), were selected to spatially model groundwater nitrate concentrations. The models were calibrated with nitrate concentrations from 80 wells (70% of the data) and then validated with nitrate concentrations from 34 wells (30% of the data). Both uncertainty and predictive performance criteria should be considered when comparing and selecting the best model. The results indicate that kNN is the best model: it had the lowest uncertainty based on the PICP statistic in both the QR (0.94) and the UNEEC (0.85-0.91 across all clusters) methods, and its predictive performance (RMSE = 10.63, R² = 0.71) was similar to that of RF (RMSE = 10.41, R² = 0.72) and better than that of SVM (RMSE = 13.28, R² = 0.58). Determining the uncertainty of ML models used for spatially modelling groundwater-nitrate pollution enables managers to make better risk-based decisions and consequently increases the reliability and credibility of groundwater-nitrate predictions.
Copyright © 2019 Elsevier B.V. All rights reserved.
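
The abstract reports PICP values but does not give the formula. As a rough illustration only, the sketch below assumes the usual definition: the fraction of observed values that fall inside their prediction intervals, PICP = (1/n) Σ 1[L_i ≤ y_i ≤ U_i]. The function name `picp` and all numbers are hypothetical and are not taken from the study.

```python
def picp(observed, lower, upper):
    """Prediction-interval coverage probability.

    Returns the fraction of observations y_i that satisfy
    lower_i <= y_i <= upper_i (assumed standard definition).
    """
    if not (len(observed) == len(lower) == len(upper)):
        raise ValueError("inputs must have the same length")
    if len(observed) == 0:
        raise ValueError("inputs must be non-empty")
    inside = sum(
        1 for y, lo, hi in zip(observed, lower, upper) if lo <= y <= hi
    )
    return inside / len(observed)


# Hypothetical validation wells: observed nitrate values and the
# lower/upper bounds of each well's prediction interval.
observed = [12.0, 30.5, 8.2, 45.0, 22.1]
lower = [5.0, 20.0, 10.0, 30.0, 15.0]
upper = [20.0, 40.0, 18.0, 50.0, 25.0]

coverage = picp(observed, lower, upper)
print(f"PICP = {coverage:.2f}")  # 4 of 5 observations are covered
```

In practice the interval bounds would come from an uncertainty method such as QR or UNEEC applied to the model residuals, and the resulting coverage is usually read against the nominal confidence level of the intervals.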

Keywords:  GIS; Groundwater pollution; Machine learning; Nitrate concentration; Uncertainty assessment

Year:  2019        PMID: 31255823     DOI: 10.1016/j.scitotenv.2019.06.320

Source DB:  PubMed          Journal:  Sci Total Environ        ISSN: 0048-9697            Impact factor:   7.963


  4 in total

1.  Shallow Landslide Susceptibility Mapping: A Comparison between Logistic Model Tree, Logistic Regression, Naïve Bayes Tree, Artificial Neural Network, and Support Vector Machine Algorithms.

Authors:  Viet-Ha Nhu; Ataollah Shirzadi; Himan Shahabi; Sushant K Singh; Nadhir Al-Ansari; John J Clague; Abolfazl Jaafari; Wei Chen; Shaghayegh Miraki; Jie Dou; Chinh Luu; Krzysztof Górski; Binh Thai Pham; Huu Duy Nguyen; Baharin Bin Ahmad
Journal:  Int J Environ Res Public Health       Date:  2020-04-16       Impact factor: 3.390

2.  Machine learning-based estimation of riverine nutrient concentrations and associated uncertainties caused by sampling frequencies.

Authors:  Shengyue Chen; Zhenyu Zhang; Juanjuan Lin; Jinliang Huang
Journal:  PLoS One       Date:  2022-07-13       Impact factor: 3.752

3.  Groundwater Potential Mapping Combining Artificial Neural Network and Real AdaBoost Ensemble Technique: The DakNong Province Case-study, Vietnam.

Authors:  Phong Tung Nguyen; Duong Hai Ha; Abolfazl Jaafari; Huu Duy Nguyen; Tran Van Phong; Nadhir Al-Ansari; Indra Prakash; Hiep Van Le; Binh Thai Pham
Journal:  Int J Environ Res Public Health       Date:  2020-04-04       Impact factor: 3.390

4.  Machine Learning Model-Based Simple Clinical Information to Predict Decreased Left Atrial Appendage Flow Velocity.

Authors:  Chao Li; Guanhua Dou; Yipu Ding; Ran Xin; Jing Wang; Jun Guo; Yundai Chen; Junjie Yang
Journal:  J Pers Med       Date:  2022-03-10
