Kazutaka M Takeshita, Takehiko I Hayashi, Hiroyuki Yokomizo.
Abstract
The goals of observational dataset analysis vary with the management phase of environments threatened by anthropogenic chemicals. For example, in the early management phases, severely compromised sites must be identified to determine candidate sites for implementing measures. One of the most effective approaches is to develop regression models with high predictive power for the dependent variable, selected using the Akaike information criterion. However, this approach may be theoretically inappropriate for obtaining the information needed in other chemical management phases, such as the intervention effect size of a chemical, which is required in the late management phase to evaluate the necessity of an effluent standard and its specific value. Despite this, statistical methods have rarely been chosen according to the data analysis objective of each chemical management phase. This study provides an overview of the primary data analysis objectives in the early and late chemical management phases. For each objective, several suitable statistical analysis methods for observational datasets are detailed. In addition, the study presents examples of linear regression analysis procedures using an available dataset derived from field surveys conducted in Japanese rivers.
Integr Environ Assess Manag 2022;18:1414-1422.
Keywords: Akaike information criterion; Biomonitoring; Organic pollution; Statistical causal inference; Trace metal
Year: 2022 PMID: 34878734 PMCID: PMC9539851 DOI: 10.1002/ieam.4564
Source DB: PubMed Journal: Integr Environ Assess Manag ISSN: 1551-3777 Impact factor: 3.084
Independent variable set, Akaike information criterion (AIC), and Akaike weights of the top 10 models with the minimum ΔAIC values
| Model rank | Independent variables included (of 9 candidates) | AIC | ΔAIC | Akaike weight |
|---|---|---|---|---|
| 1 | 6 | −70.68 | 0.00 | 0.16 |
| 2 | 7 | −68.99 | 1.69 | 0.07 |
| 3 | 7 | −68.87 | 1.81 | 0.07 |
| 4 | 7 | −68.72 | 1.96 | 0.06 |
| 5 | 5 | −68.64 | 2.04 | 0.06 |
| 6 | 4 | −68.01 | 2.67 | 0.04 |
| 7 | 5 | −67.90 | 2.78 | 0.04 |
| 8 | 6 | −67.89 | 2.79 | 0.04 |
| 9 | 5 | −67.78 | 2.90 | 0.04 |
| 10 | 8 | −67.16 | 3.52 | 0.03 |
Abbreviation: TOC, total organic carbon.
Candidate independent variables: TOC, nickel, zinc, copper, pH, water temperature, riverbed, flow velocity, and basin.
ΔAIC indicates differences in AIC values relative to the minimum value.
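The relationship between the AIC, ΔAIC, and Akaike weight columns can be checked with a short calculation. The Python sketch below recomputes ΔAIC from the tabulated AIC values and converts each ΔAIC into a relative likelihood, exp(−ΔAIC/2). The tabulated weights sum to about 0.61, which suggests that they were normalized over a larger candidate set than the 10 models shown; this is an inference from the table, not something stated here. The sketch therefore scales the relative likelihoods by the top model's tabulated weight instead of treating the 10 models as the full set. All function and variable names are illustrative.

```python
import math

# AIC values and Akaike weights of the top 10 models, copied from the table.
aic = [-70.68, -68.99, -68.87, -68.72, -68.64,
       -68.01, -67.90, -67.89, -67.78, -67.16]
published_weights = [0.16, 0.07, 0.07, 0.06, 0.06,
                     0.04, 0.04, 0.04, 0.04, 0.03]


def delta_aic(values):
    """Difference of each AIC from the smallest AIC in the set."""
    best = min(values)
    return [round(v - best, 2) for v in values]


def relative_likelihoods(deltas):
    """exp(-delta/2): support for each model relative to the best one."""
    return [math.exp(-d / 2.0) for d in deltas]


def akaike_weights(deltas):
    """Relative likelihoods normalized to sum to 1 over the supplied models."""
    rel = relative_likelihoods(deltas)
    total = sum(rel)
    return [r / total for r in rel]


deltas = delta_aic(aic)
rel = relative_likelihoods(deltas)

# Weights if these 10 models were the whole candidate set (an assumption
# that the tabulated weights do not appear to share).
weights_top10 = akaike_weights(deltas)

# Weights implied by the top model's tabulated weight and each ratio.
implied = [published_weights[0] * r for r in rel]

for rank, (d, w_pub, w_imp) in enumerate(
        zip(deltas, published_weights, implied), start=1):
    print(f"model {rank:2d}  dAIC={d:4.2f}  "
          f"published={w_pub:.2f}  implied={w_imp:.3f}")
```

The implied values agree with the tabulated weights to within rounding, which is consistent with the weights having been computed as normalized exp(−ΔAIC/2).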
Estimated intercept and partial regression coefficients for the independent variables of the model with the minimum Akaike information criterion value
| Intercept | TOC | Copper | Water temperature | Riverbed sediment: Gravel | Riverbed sediment: Boulder | Flow velocity | Basin: Osawa | Basin: Koromo | Basin: Suikawa |
|---|---|---|---|---|---|---|---|---|---|
| 5.91 | −0.54 | 0.54 | −0.15 | −0.73 | −1.12 | 1.46 | −0.49 | −1.20 | 0.00 |
| (1.23) | (0.15) | (0.19) | (0.09) | (0.44) | (0.40) | (0.85) | (0.79) | (0.70) | (0.55) |
Note: Values in parentheses are standard errors of the estimates.
Abbreviation: TOC, total organic carbon.
Coefficients for “Gravel” and “Boulder” represent the difference from the reference sediment category (sand).
Coefficients for each basin represent the difference from the reference basin (Yata River).
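As a rough aid to reading the table, an approximate 95% interval for each coefficient can be obtained as estimate ± 1.96 × standard error. The sketch below assumes a normal approximation, because the residual degrees of freedom are not given here; intervals based on the t distribution would be somewhat wider. The dictionary and function names are illustrative.

```python
# Estimates and standard errors copied from the table above.
estimates = {
    "Intercept": (5.91, 1.23),
    "TOC": (-0.54, 0.15),
    "Copper": (0.54, 0.19),
    "Water temperature": (-0.15, 0.09),
    "Riverbed sediment: Gravel": (-0.73, 0.44),
    "Riverbed sediment: Boulder": (-1.12, 0.40),
    "Flow velocity": (1.46, 0.85),
    "Basin: Osawa": (-0.49, 0.79),
    "Basin: Koromo": (-1.20, 0.70),
    "Basin: Suikawa": (0.00, 0.55),
}

Z_95 = 1.96  # normal-approximation quantile (assumption; see lead-in)


def wald_interval(estimate, se, z=Z_95):
    """Approximate confidence interval: estimate +/- z * standard error."""
    return (estimate - z * se, estimate + z * se)


intervals = {term: wald_interval(est, se)
             for term, (est, se) in estimates.items()}

for term, (low, high) in intervals.items():
    est = estimates[term][0]
    # Flag intervals that do not contain zero.
    flag = "*" if low > 0 or high < 0 else " "
    print(f"{term:28s} {est:6.2f}  [{low:6.2f}, {high:6.2f}] {flag}")
```

Under this approximation the interval for TOC lies entirely below zero, whereas the interval for water temperature includes zero.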
Figure 1. Estimated marginal means (black dots) and their 95% confidence intervals (bars) for Simpson's diversity index for Insecta in 14 basins in Japan. We used the parameter estimates of the model with the minimum AIC value and the R package “emmeans” to depict this figure.
Figure 2. Heatmap and contour lines of the predicted Simpson's diversity index for Insecta, at TOC concentrations of 0.3–6.7 mg/L and total Ni concentrations of 0.001–0.2 mg/L. The prediction was based on the estimated marginal means calculated using the parameter estimates of the model with the minimum AIC value. In chemical management, it is often more useful to present the outputs of data analysis based on total concentrations rather than on free‐ion or dissolved concentrations. Therefore, we predicted the Simpson's diversity index by relating it to the concentrations of TOC and total Ni through metal speciation calculation with WHAM and numerical calculation with R. Details of this procedure are provided in the Supporting Information.
Estimated intercept and partial regression coefficients for the independent variables and their p‐values, of the models with a set of independent variables satisfying the backdoor criterion to estimate the intervention effect in total organic carbon and nickel concentrations on Simpson's diversity index
| Intercept | TOC | Nickel | Zinc | Copper | pH | Riverbed sediment: Gravel | Riverbed sediment: Boulder | Flow velocity | Basin: Osawa | Basin: Koromo | Basin: Suikawa |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2.97 | −0.57 | −0.28 | 0.23 | 0.55 | 0.22 | −0.71 | −1.02 | 1.10 | −0.90 | −1.01 | 0.09 |
| (6.46) | (0.20) | (0.26) | (0.34) | (0.25) | (0.81) | (0.47) | (0.42) | (0.92) | (0.92) | (0.83) | (0.61) |
Note: Values in parentheses are standard errors of the estimates.
Abbreviation: TOC, total organic carbon.
Coefficients for “Gravel” and “Boulder” represent the difference from the reference sediment category (sand).
Coefficients for each basin represent the difference from the reference basin (Yata River).
Figure 3. Predicted Simpson's diversity index for Insecta at (A) TOC concentrations of 0.3–6.7 mg/L and (B) total Ni concentrations of 0.001–0.2 mg/L. The prediction was based on the estimated marginal means calculated using the parameter estimates of the following three regression models: multiple linear regression model with the independent variable set satisfying the backdoor criterion (i.e., statistical causal inference, solid black line), simple linear regression model (solid blue line), and the model with the minimum AIC value (solid pink line). Details of the procedure to depict this figure are provided in the Supporting Information.
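The contrast that Figure 3 draws between a simple regression and a regression adjusted for a variable set satisfying the backdoor criterion can be illustrated on synthetic data. The sketch below does not use the field-survey dataset. It assumes a hypothetical structure in which a single confounder Z influences both an exposure X and an outcome Y, and every effect size in it is invented for illustration. Ordinary least squares is fitted with NumPy's `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical data-generating process (all values are assumptions):
# Z confounds the relationship between exposure X and outcome Y.
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
true_effect = -0.5
y = true_effect * x + 1.0 * z + rng.normal(size=n)


def ols(columns, response):
    """Least-squares coefficients, with the intercept first."""
    design = np.column_stack([np.ones(len(response))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, response, rcond=None)
    return coef


# Simple regression of Y on X: the open backdoor path through Z biases
# the slope.
naive = ols([x], y)[1]

# Regression of Y on X and Z: here {Z} satisfies the backdoor criterion,
# so the coefficient of X estimates the intervention effect.
adjusted = ols([x, z], y)[1]

print(f"true effect     : {true_effect:.2f}")
print(f"simple slope    : {naive:.2f}")
print(f"adjusted slope  : {adjusted:.2f}")
```

In this construction the adjusted slope recovers the assumed effect closely, while the simple slope is pulled toward zero by the confounder. With real observational data the adjustment set has to be justified from a causal diagram, and a linear model is itself an assumption.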