Hoseung Jung, Cornelius Senf, Philip Jordan, Tobias Krueger.
Abstract
River water quality monitoring at limited temporal resolution can lead to imprecise and inaccurate classification of physicochemical status due to sampling error. Bayesian inference allows for the quantification of this uncertainty, which can assist decision-making. However, implicit assumptions of Bayesian methods can cause further uncertainty in the uncertainty quantification, so-called second-order uncertainty. In this study, and for the first time, we rigorously assessed this second-order uncertainty for inference of common water quality statistics (mean and 95th percentile) based on sub-sampling high-frequency (hourly) total reactive phosphorus (TRP) concentration data from three watersheds. The statistics were inferred from the low-resolution sub-samples using a Bayesian lognormal model, the Bayesian bootstrap, the frequentist t test, and the face-value approach, and were compared with those of the high-frequency data as benchmarks. The t test exhibited a high risk of bias in estimating the water quality statistics of interest and the corresponding physicochemical status (up to 99% of sub-samples). The Bayesian lognormal model provided a good fit to the high-frequency TRP concentration data and the least biased classification of physicochemical status (< 5% of sub-samples). Our results suggest wide applicability of Bayesian inference for water quality status classification, a new approach for regulatory practice that provides uncertainty information about water quality monitoring and regulatory classification with reduced bias compared to frequentist approaches. Furthermore, the study elucidates sizeable second-order uncertainty due to the choice of statistical model, which could be quantified based on the high-frequency data.
Keywords: Bayesian uncertainty quantification; Ecological status classification; High-frequency monitoring; Phosphorus; Second-order uncertainty; Water Framework Directive
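The sub-sampling and Bayesian bootstrap steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the hourly series is synthetic, the sub-sample size of 12 is hypothetical, the function name `bayesian_bootstrap` is invented here, and the weighting follows the standard Bayesian bootstrap (Dirichlet(1, ..., 1) weights over the observed values).

```python
import math
import random


def bayesian_bootstrap(sample, n_draws=2000, q=0.95, rng=None):
    """Return posterior draws of the mean and the q-quantile of `sample`."""
    rng = rng or random.Random()
    ordered = sorted(sample)
    means, quantiles = [], []
    for _ in range(n_draws):
        # Dirichlet(1, ..., 1) weights: normalised unit-exponential draws.
        g = [rng.expovariate(1.0) for _ in ordered]
        total = sum(g)
        w = [x / total for x in g]
        means.append(sum(wi * xi for wi, xi in zip(w, ordered)))
        # Weighted quantile: first value whose cumulative weight reaches q.
        cum = 0.0
        for wi, xi in zip(w, ordered):
            cum += wi
            if cum >= q:
                quantiles.append(xi)
                break
        else:
            quantiles.append(ordered[-1])
    return means, quantiles


rng = random.Random(42)
# Synthetic stand-in for an hourly TRP series (mg P/L); NOT the study data.
hourly = [rng.lognormvariate(math.log(0.03), 0.8) for _ in range(24 * 365)]
# Hypothetical low-resolution sub-sample of 12 grab samples (assumed size).
sub = rng.sample(hourly, 12)
means, q95s = bayesian_bootstrap(sub, rng=rng)
```

Because every draw only re-weights the observed values, a bootstrap 95%ile can never exceed the largest value in the sub-sample, whereas a parametric model such as the lognormal can extrapolate beyond it.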
Year: 2020 PMID: 32242256 PMCID: PMC7118042 DOI: 10.1007/s10661-020-8223-4
Source DB: PubMed Journal: Environ Monit Assess ISSN: 0167-6369 Impact factor: 2.513
Class boundaries of physicochemical status in Irish rivers for TRP concentration
| Statistic | High | Good | Moderate |
|---|---|---|---|
| Mean (mg P L−1) | ≤ 0.025 | > 0.025 and ≤ 0.035 | > 0.035 |
| 95%ile (mg P L−1) | ≤ 0.045 | > 0.045 and ≤ 0.075 | > 0.075 |
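A face-value classification against these boundaries can be sketched as below. This is an illustrative sketch, not the regulatory procedure: it assumes that "Good" covers the range between the High and Moderate limits, the names `classify` and `class_confidence` are hypothetical, the sample values are invented, and the confidence calculation (share of posterior draws falling in each class) is one plausible reading of how a Bayesian classification confidence could be obtained.

```python
import statistics

# Upper limits of the High and Good classes (mg P/L), taken from the table.
MEAN_BOUNDS = (0.025, 0.035)
P95_BOUNDS = (0.045, 0.075)


def classify(value, bounds):
    """Map a statistic to a status class using (high_max, good_max)."""
    high_max, good_max = bounds
    if value <= high_max:
        return "High"
    if value <= good_max:
        return "Good"
    return "Moderate"


def class_confidence(draws, bounds):
    """Share of posterior draws in each class (assumed confidence measure)."""
    counts = {"High": 0, "Good": 0, "Moderate": 0}
    for d in draws:
        counts[classify(d, bounds)] += 1
    return {k: v / len(draws) for k, v in counts.items()}


# Hypothetical sub-sample of 12 TRP concentrations (mg P/L).
sample = [0.010, 0.012, 0.015, 0.018, 0.020, 0.022,
          0.025, 0.028, 0.030, 0.035, 0.040, 0.090]

sample_mean = statistics.fmean(sample)
# 95th percentile = last of the 19 cut points that split the data in 20.
sample_p95 = statistics.quantiles(sample, n=20, method="inclusive")[-1]

mean_class = classify(sample_mean, MEAN_BOUNDS)
p95_class = classify(sample_p95, P95_BOUNDS)
```

The face-value approach uses only the point estimates, so it reports a class without any statement of how likely the neighbouring classes are; `class_confidence` shows how posterior draws could supply that information.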
Fig. 1 Frequency distributions of hourly TRP concentration from 2011 to 2013 (high-freq), bimodal lognormal mixture (mixture), and unimodal lognormal and gamma distributions fitted to the frequency distributions via maximum likelihood and their log-likelihoods
Fig. 2 Mean and 95%ile calculated with the high-frequency data (2011–2013) and their sampling distributions simulated by sub-sampling the high-frequency data according to operational (OM; a and b) and surveillance (SM; b and c) monitoring schemes
Mean, skewness, and width of 95% highest density interval (HDI95) of sampling distributions of mean and 95th percentile simulated under operational monitoring (OM) and surveillance monitoring (SM) scenarios
| Statistic | Site | OM: Mean | OM: 95%ile | SM: Mean | SM: 95%ile |
|---|---|---|---|---|---|
| Mean (mg P L−1) | Arable A | 0.00 | −0.01 | 0.00 | −0.01 |
| | Grassland A | 0.00 | −0.01 | 0.00 | −0.01 |
| | Grassland B | 0.00 | 0.00 | 0.00 | 0.00 |
| Skewness (−) | Arable A | 2.58 | 3.36 | 1.61 | 1.75 |
| | Grassland A | 1.58 | 3.08 | 1.07 | 2.47 |
| | Grassland B | 1.38 | 1.9 | 0.87 | 1.34 |
| HDI95 (mg P L−1) | Arable A | 0.05 | 0.2 | 0.04 | 0.14 |
| | Grassland A | 0.03 | 0.09 | 0.02 | 0.06 |
| | Grassland B | 0.04 | 0.18 | 0.03 | 0.14 |
Fig. 3 Sampling distributions of relative errors of mean (a and c) and 95th percentile (b and d) pooled across study sites simulated under operational monitoring (OM; a and b) and surveillance monitoring (SM; c and d) scenarios
Second-order uncertainty of estimated mean and 95%ile from random low-resolution sub-samples using Bayesian lognormal model and bootstrap
| Scheme | Statistic | Method | Site | HR^a (%) | HDI95^b 2.5% (mg P L−1) | HDI95^b median (mg P L−1) | HDI95^b 97.5% (mg P L−1) | RMBE^c 2.5% (%) | RMBE^c median (%) | RMBE^c 97.5% (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| Operational monitoring (OM) | Mean | Lognormal | Arable A | 91 | 0.01 | 0.03 | 0.07 | −33 | 4 | 77 |
| | | | Grassland A | 86 | 0.01 | 0.03 | 0.11 | −29 | 0 | 69 |
| | | | Grassland B | 88 | 0.02 | 0.04 | 0.12 | −26 | −1 | 64 |
| | | Bootstrap | Arable A | 85 | 0.01 | 0.02 | 0.05 | −35 | −4 | 55 |
| | | | Grassland A | 79 | 0.01 | 0.03 | 0.09 | −31 | −4 | 55 |
| | | | Grassland B | 78 | 0.01 | 0.03 | 0.12 | −29 | −5 | 57 |
| | 95%ile | Lognormal | Arable A | 92 | 0.03 | 0.04 | 0.07 | −43 | 13 | 137 |
| | | | Grassland A | 75 | 0.05 | 0.07 | 0.12 | −52 | −11 | 105 |
| | | | Grassland B | 81 | 0.06 | 0.08 | 0.14 | −42 | −5 | 122 |
| | | Bootstrap | Arable A | 9 | 0.01 | 0.01 | 0.01 | −48 | −13 | 127 |
| | | | Grassland A | 6 | 0.01 | 0.01 | 0.01 | −60 | −16 | 119 |
| | | | Grassland B | 6 | 0.01 | 0.01 | 0.01 | −52 | −15 | 164 |
| Surveillance monitoring (SM) | Mean | Lognormal | Arable A | 89 | 0.01 | 0.01 | 0.03 | −25 | −1 | 33 |
| | | | Grassland A | 84 | 0.01 | 0.02 | 0.04 | −21 | −2 | 30 |
| | | | Grassland B | 85 | 0.01 | 0.02 | 0.04 | −20 | −3 | 28 |
| | | Bootstrap | Arable A | 91 | 0.01 | 0.02 | 0.03 | −25 | −2 | 34 |
| | | | Grassland A | 87 | 0.01 | 0.02 | 0.06 | −22 | −2 | 34 |
| | | | Grassland B | 90 | 0.02 | 0.03 | 0.09 | −21 | −3 | 38 |
| | 95%ile | Lognormal | Arable A | 90 | 0.03 | 0.05 | 0.09 | −31 | 3 | 58 |
| | | | Grassland A | 64 | 0.03 | 0.06 | 0.14 | −43 | −15 | 35 |
| | | | Grassland B | 74 | 0.03 | 0.07 | 0.16 | −34 | −8 | 49 |
| | | Bootstrap | Arable A | 19 | 0.01 | 0.01 | 0.02 | −30 | −2 | 91 |
| | | | Grassland A | 9 | 0.01 | 0.02 | 0.02 | −44 | −3 | 86 |
| | | | Grassland B | 8 | 0.01 | 0.02 | 0.02 | −39 | −1 | 117 |

^a Hit rate
^b Length of 95% highest density interval of the posterior distribution
^c Relative mean bias error
For the HDI95 and the RMBE, the median and the central 95% range across all sub-samples are given
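The three diagnostics in this table can be computed from posterior draws along the following lines. The source defines them only by name, so the formulas here are assumptions: the hit rate is taken as the percentage of sub-samples whose HDI95 contains the high-frequency benchmark, and the RMBE as the posterior mean minus the benchmark, relative to the benchmark. The function names are hypothetical.

```python
import math


def hdi(draws, mass=0.95):
    """Shortest interval containing at least `mass` of the draws."""
    s = sorted(draws)
    n = len(s)
    k = max(1, math.ceil(mass * n))  # number of draws inside the interval
    # Slide a window of k consecutive sorted draws and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: s[i + k - 1] - s[i])
    return s[best], s[best + k - 1]


def hdi_length(draws, mass=0.95):
    """Length of the highest density interval (the HDI95 column)."""
    lo, hi = hdi(draws, mass)
    return hi - lo


def rmbe_percent(draws, benchmark):
    """Assumed RMBE: (posterior mean - benchmark) / benchmark, in percent."""
    return 100.0 * (sum(draws) / len(draws) - benchmark) / benchmark


def hit_rate_percent(intervals, benchmark):
    """Assumed hit rate: percentage of intervals that contain the benchmark."""
    hits = sum(lo <= benchmark <= hi for lo, hi in intervals)
    return 100.0 * hits / len(intervals)
```

For right-skewed posteriors, such as those typical of concentration data, the highest density interval sits closer to the mode than an equal-tailed interval with the same coverage, which is one reason to report its length rather than a symmetric range.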
Fig. 5 Examples of posterior distributions and confidence intervals (CI) of mean and 95%ile computed using the Bayesian lognormal model (Logn) and Bayesian bootstrap (BB) and the frequentist t test, respectively, given lower-biased (a and b) and upper-biased (c and d) operational monitoring sub-samples from Grassland A and Arable A, respectively
Fig. 4 Proportions of physicochemical status classes determined with the face-value approach and the right-tailed t test and the confidences of classification estimated with the Bayesian lognormal model (Logn) and the Bayesian bootstrap (BB) given 10,000 operational and surveillance monitoring sub-samples