Longzhu Q Shen, Giuseppe Amatulli, Tushar Sethi, Peter Raymond, Sami Domisch.
Abstract
Nitrogen (N) and phosphorus (P) are essential nutrients for life processes in water bodies. In excessive quantities, however, they can be a significant source of aquatic pollution. Eutrophication has become a widespread issue arising from chemical nutrient imbalance and is largely attributed to anthropogenic activities. In view of this phenomenon, we present a new geo-dataset to estimate and map the concentrations of N and P in their various chemical forms at a spatial resolution of 30 arc-seconds (∼1 km) for the conterminous US. The models were built using Random Forest (RF), a machine learning algorithm, which regressed the seasonally measured N and P concentrations collected at 62,495 stations across US streams for the period 1994-2018 onto a set of 47 in-house built environmental variables that are available at a near-global extent. The seasonal models were validated through internal and external validation procedures, and their predictive power, measured by Pearson correlation coefficients, reached approximately 0.66 on average.
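The "30 arc-seconds (∼1 km)" equivalence can be checked with a short calculation. The sketch below is only an illustration: it assumes a spherical Earth with a mean radius of 6371 km, and the function name `arcsec_to_km` is hypothetical rather than part of the dataset's tooling.

```python
import math

# Assumption: spherical Earth with mean radius 6371 km (not stated in the source).
EARTH_RADIUS_KM = 6371.0


def arcsec_to_km(arcsec: float, latitude_deg: float = 0.0) -> tuple:
    """Return the (north-south, east-west) extent in km of a cell `arcsec` wide."""
    angle_rad = math.radians(arcsec / 3600.0)  # arc-seconds -> degrees -> radians
    north_south = EARTH_RADIUS_KM * angle_rad
    # Meridians converge towards the poles, so east-west extent shrinks with latitude.
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


ns, ew = arcsec_to_km(30.0)
ns40, ew40 = arcsec_to_km(30.0, latitude_deg=40.0)
print(f"equator:  {ns:.3f} km x {ew:.3f} km")
print(f"40 deg N: {ns40:.3f} km x {ew40:.3f} km")
```

Under these assumptions a 30 arc-second cell is a little under 1 km north to south, and narrower east to west at the latitudes of the conterminous US, which is why the abstract gives the size as approximate.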
Year: 2020 PMID: 32467642 PMCID: PMC7256043 DOI: 10.1038/s41597-020-0478-7
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Chemical nutrients with their USGS Parameter Code (PC) and abbreviation.
| PC | Description | Abbreviation |
|---|---|---|
| 00600 | Total Nitrogen | TN |
| 00665 | Total Phosphorus | TP |
| 00602 | Total Dissolved Nitrogen | TDN |
| 00666 | Total Dissolved Phosphorus | TDP |
| 00618 | Nitrate | NO3 |
Number of observations of each nutrient remaining after data cleaning, for each of the four seasons.
| Season | Winter | Spring | Summer | Autumn |
|---|---|---|---|---|
| Month | 11-12-01 | 02-03-04 | 05-06-07 | 08-09-10 |
| TN | 1651 | 3090 | 3220 | 2254 |
| TDN | 678 | 1158 | 1237 | 875 |
| NO3 | 1628 | 2761 | 3314 | 2238 |
| TP | 2595 | 4831 | 5860 | 4155 |
| TDP | 911 | 1651 | 2175 | 1412 |
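The seasonal grouping in the table above can be written down as a small lookup. This is a minimal sketch, not the authors' code: the month groups and counts are copied from the table, while the names `SEASON_MONTHS`, `season_of`, and `OBSERVATIONS` are hypothetical.

```python
# Month groups as listed in the "Month" row of the table above.
SEASON_MONTHS = {
    "Winter": (11, 12, 1),
    "Spring": (2, 3, 4),
    "Summer": (5, 6, 7),
    "Autumn": (8, 9, 10),
}

# Invert the mapping so each calendar month points to its season.
MONTH_TO_SEASON = {
    month: season for season, months in SEASON_MONTHS.items() for month in months
}


def season_of(month: int) -> str:
    """Return the season for a calendar month (1-12)."""
    if month not in MONTH_TO_SEASON:
        raise ValueError(f"month must be in 1..12, got {month}")
    return MONTH_TO_SEASON[month]


# Observation counts per season (Winter, Spring, Summer, Autumn), from the table.
OBSERVATIONS = {
    "TN": (1651, 3090, 3220, 2254),
    "TDN": (678, 1158, 1237, 875),
    "NO3": (1628, 2761, 3314, 2238),
    "TP": (2595, 4831, 5860, 4155),
    "TDP": (911, 1651, 2175, 1412),
}

# Total number of cleaned observations per nutrient across all seasons.
totals = {nutrient: sum(counts) for nutrient, counts in OBSERVATIONS.items()}
print(season_of(1), totals)
```

Note that winter spans the year boundary (November to January), so a grouping based on calendar year alone would split one winter across two years.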
Fig. 1 Spatio-temporal distribution of TN and TP. Spatial and seasonal distribution of the Water Quality Portal’s stations. The seasonal mean of Total Nitrogen (TN) and Total Phosphorus (TP) at each station is shown as a coloured circle whose size also increases with the TN or TP value.
Stream environmental predictors. Overview of all 47 environmental predictors used in the models.
| Variable type | Variable name | Variable description | Variable Alias |
|---|---|---|---|
| elevation | dem | Average elevation | dem_avg |
| slope | slope | Average slope | slope_ave |
| topology | ord | Stream order | lentic_lotic01 |
| soil | soil01 | Soil organic carbon | soil_avg_01 |
| soil | soil02 | Soil pH in H2O | soil_avg_02 |
| soil | soil03 | Sand content mass fraction | soil_avg_03 |
| soil | soil04 | Silt content mass fraction | soil_avg_04 |
| soil | soil05 | Clay content mass fraction | soil_avg_05 |
| soil | soil06 | Coarse fragments (>2 mm fraction) volumetric | soil_avg_06 |
| soil | soil07 | Cation exchange capacity | soil_avg_07 |
| soil | soil08 | Bulk density of the fine earth fraction | soil_avg_08 |
| soil | soil09 | Depth to bedrock (R horizon) up to maximum 240 cm | soil_avg_09 |
| soil | soil10 | Probability of occurrence (0–100%) of R horizon | soil_avg_10 |
| land cover | lc01 | Evergreen/deciduous needleleaf trees | lu_avg_01 |
| land cover | lc02 | Evergreen broadleaf trees | lu_avg_02 |
| land cover | lc03 | Deciduous broadleaf trees | lu_avg_03 |
| land cover | lc04 | Mixed/other trees | lu_avg_04 |
| land cover | lc05 | Shrubs | lu_avg_05 |
| land cover | lc06 | Herbaceous vegetation | lu_avg_06 |
| land cover | lc07 | Cultivated and managed vegetation | lu_avg_07 |
| land cover | lc08 | Regularly flooded shrub/herbaceous vegetation | lu_avg_08 |
| land cover | lc09 | Urban/built-up | lu_avg_09 |
| land cover | lc10 | Snow/ice | lu_avg_10 |
| land cover | lc11 | Barren lands/sparse vegetation | lu_avg_11 |
| land cover | lc12 | Open water | lu_avg_12 |
| temperature | tmin | Monthly temperature average min | |
| temperature | tmax | Monthly temperature average max | |
| precipitation | prec | Sum of monthly precipitation | |
| hydroclimate | hydro01 | Annual Mean Upstream Temperature | hydro_ave_01 |
| hydroclimate | hydro02 | Mean Upstream Diurnal Range (mean of monthly (max temp - min temp)) | hydro_ave_02 |
| hydroclimate | hydro03 | Upstream Isothermality (hydro02 / hydro07) (* 100) | hydro_ave_03 |
| hydroclimate | hydro04 | Upstream Temperature Seasonality (standard deviation * 100) | hydro_ave_04 |
| hydroclimate | hydro05 | Maximum Upstream Temperature of Warmest Month | hydro_ave_05 |
| hydroclimate | hydro06 | Minimum Upstream Temperature of Coldest Month | hydro_ave_06 |
| hydroclimate | hydro07 | Upstream Temperature Annual Range (hydro05 - hydro06) | hydro_ave_07 |
| hydroclimate | hydro08 | Mean Upstream Temperature of Wettest Quarter | hydro_ave_08 |
| hydroclimate | hydro09 | Mean Upstream Temperature of Driest Quarter | hydro_ave_09 |
| hydroclimate | hydro10 | Mean Upstream Temperature of Warmest Quarter | hydro_ave_10 |
| hydroclimate | hydro11 | Mean Upstream Temperature of Coldest Quarter | hydro_ave_11 |
| hydroclimate | hydro12 | Annual Upstream Precipitation | hydro_ave_12 |
| hydroclimate | hydro13 | Upstream Precipitation of Wettest Month | hydro_ave_13 |
| hydroclimate | hydro14 | Upstream Precipitation of Driest Month | hydro_ave_14 |
| hydroclimate | hydro15 | Upstream Precipitation Seasonality (Coefficient of Variation) | hydro_ave_15 |
| hydroclimate | hydro16 | Upstream Precipitation of Wettest Quarter | hydro_ave_16 |
| hydroclimate | hydro17 | Upstream Precipitation of Driest Quarter | hydro_ave_17 |
| hydroclimate | hydro18 | Upstream Precipitation of Warmest Quarter | hydro_ave_18 |
| hydroclimate | hydro19 | Upstream Precipitation of Coldest Quarter | hydro_ave_19 |
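Several hydroclimate predictors are defined in terms of others, for example hydro07 = hydro05 - hydro06 and hydro03 = (hydro02 / hydro07) * 100. The sketch below works through those formulas for a single location. It is an illustration under stated assumptions: the twelve monthly temperatures are made-up values, the function name `temperature_indices` is hypothetical, and the upstream aggregation implied by the variable names is not reproduced here.

```python
from statistics import mean


def temperature_indices(tmax, tmin):
    """Derive hydro02, hydro05, hydro06, hydro07 and hydro03 from monthly values.

    `tmax` and `tmin` are 12 monthly maximum and minimum temperatures.
    Formulas follow the variable descriptions in the table above.
    """
    if len(tmax) != 12 or len(tmin) != 12:
        raise ValueError("expected 12 monthly values for tmax and tmin")
    hydro02 = mean(hi - lo for hi, lo in zip(tmax, tmin))  # mean diurnal range
    hydro05 = max(tmax)                  # maximum temperature of warmest month
    hydro06 = min(tmin)                  # minimum temperature of coldest month
    hydro07 = hydro05 - hydro06          # temperature annual range
    hydro03 = hydro02 / hydro07 * 100    # isothermality
    return {
        "hydro02": hydro02,
        "hydro03": hydro03,
        "hydro05": hydro05,
        "hydro06": hydro06,
        "hydro07": hydro07,
    }


# Hypothetical monthly temperatures in degrees Celsius (not from the dataset).
tmax = [2.0, 4.0, 9.0, 15.0, 21.0, 26.0, 29.0, 28.0, 24.0, 17.0, 10.0, 4.0]
tmin = [-6.0, -5.0, -1.0, 4.0, 10.0, 15.0, 18.0, 17.0, 13.0, 6.0, 1.0, -4.0]

idx = temperature_indices(tmax, tmin)
print(idx)
```

Because hydro03 and hydro07 are computed from other columns, they are correlated with their inputs by construction, which is worth keeping in mind when reading variable importances from a model trained on all 47 predictors.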
Fig. 2 Bivariate maps for TN and TP. Bivariate maps showing the predicted Total Nitrogen (TN) and Total Phosphorus (TP) values in ppm across the four seasons. For improved visualisation, streams and rivers on the original 30 arc-second resolution maps were aggregated using the mean value of a moving window of 10 × 10 grid cells. Red indicates high-concentration areas, which mainly coincide with areas of high agricultural or grazing activity or with urban zones. Blue indicates areas of low nutrient load, which are frequently occupied by forests or deserts.
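The aggregation described in the Fig. 2 caption can be sketched as a windowed mean over a grid in which only stream cells carry values. The caption does not say whether the windows overlap or how empty cells are handled, so the sketch assumes non-overlapping blocks and represents non-stream cells as `None`; the function name `block_mean` is hypothetical.

```python
from typing import List, Optional, Sequence

Grid = Sequence[Sequence[Optional[float]]]


def block_mean(grid: Grid, window: int) -> List[List[Optional[float]]]:
    """Aggregate `grid` with the mean of non-overlapping `window` x `window` blocks.

    Cells equal to None (assumed here to mark non-stream cells) are ignored;
    a block with no valid cells yields None.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows, window):
        out_row = []
        for c0 in range(0, cols, window):
            values = [
                grid[r][c]
                for r in range(r0, min(r0 + window, rows))
                for c in range(c0, min(c0 + window, cols))
                if grid[r][c] is not None
            ]
            out_row.append(sum(values) / len(values) if values else None)
        out.append(out_row)
    return out


# Toy 4 x 4 grid aggregated with a 2 x 2 window; the caption uses 10 x 10.
grid = [
    [1.0, None, 2.0, 2.0],
    [3.0, None, None, 4.0],
    [None, None, 0.5, None],
    [None, None, 1.5, None],
]
coarse = block_mean(grid, window=2)
print(coarse)
```

Averaging only the valid cells keeps sparse stream networks visible at the coarser scale, instead of diluting them with the surrounding empty land cells.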
Fig. 3 Correlation plots for TN and TP in testing. Seasonal correlation plots for TN and TP for the testing data sets. Horizontal axes represent the observations and vertical axes represent the predicted values. Ticks labelled in black are Box-Cox transformed values and ticks in blue are original values in ppm. Pearson coefficients (r) and RMSE values are given in the upper-left corner box.
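The quantities named in the Fig. 3 caption (Box-Cox transformed values, Pearson r, and RMSE) can be computed as in the sketch below. This is a generic illustration, not the authors' pipeline: the Box-Cox parameter `LAMBDA` and the observed and predicted concentrations are hypothetical, since the record gives neither.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """Box-Cox transform of a strictly positive value."""
    if x <= 0:
        raise ValueError("Box-Cox requires x > 0")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def inv_box_cox(y: float, lam: float) -> float:
    """Inverse Box-Cox transform, mapping back to original units (e.g. ppm)."""
    return math.exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)


def pearson_r(obs, pred) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(obs)
    mean_o, mean_p = sum(obs) / n, sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    ss_o = sum((o - mean_o) ** 2 for o in obs)
    ss_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(ss_o * ss_p)


def rmse(obs, pred) -> float:
    """Root-mean-square error between observations and predictions."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


LAMBDA = 0.25  # hypothetical Box-Cox parameter, not taken from the paper
obs_ppm = [0.5, 1.0, 2.0, 4.0]    # hypothetical observed concentrations
pred_ppm = [0.6, 0.9, 2.5, 3.0]   # hypothetical predicted concentrations

obs_t = [box_cox(v, LAMBDA) for v in obs_ppm]
pred_t = [box_cox(v, LAMBDA) for v in pred_ppm]

print("r (transformed scale):", round(pearson_r(obs_t, pred_t), 3))
print("RMSE (ppm):", round(rmse(obs_ppm, pred_ppm), 3))
```

Reporting r on the transformed scale and RMSE in ppm mirrors the two tick scales in the figure; which scale each statistic in the paper refers to should be taken from the article itself.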
Fig. 4 Residual maps for TN and TP. Residuals are computed using the testing sub-dataset (observations minus predictions). Each map also reports the error for the testing sub-dataset in ppm, and the error using observations in the low/high density areas, respectively.
| Measurement(s) | concentration of nitrogen atom in water • phosphorus atom |
| Technology Type(s) | machine learning |
| Factor Type(s) | geographic location • seasonal measurement • year of data collection |
| Sample Characteristic - Environment | stream • river • fresh water body |
| Sample Characteristic - Location | contiguous United States of America |