Thorsten Behrens, Raphael A Viscarra Rossel, Ruth Kerry, Robert MacMillan, Karsten Schmidt, Juhwan Lee, Thomas Scholten, A-Xing Zhu.
Abstract
Spatial autocorrelation in the residuals of spatial environmental models can be due to missing covariate information. In many cases, this spatial autocorrelation can be accounted for by using covariates from multiple scales. Here, we propose a data-driven, objective and systematic method for deriving the relevant range of scales, with distinct upper and lower scale limits, for spatial modelling with machine learning, and evaluate its effect on modelling accuracy. We also tested an approach that uses the variogram to see whether such an effective scale space can be approximated a priori and at smaller computational cost. Results showed that modelling with an effective scale space can improve spatial modelling with machine learning and that there is a strong correlation between properties of the variogram and the relevant range of scales. Hence, the variogram of a soil property can be used for a priori approximations of the effective scale space for contextual spatial modelling and is therefore an important analytical tool not only in geostatistics, but also for analyzing structural dependencies in contextual spatial modelling.
Year: 2019 PMID: 31616033 PMCID: PMC6794247 DOI: 10.1038/s41598-019-51395-3
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The first and second columns show the contextual machine learning results for the scale in meters and the respective Gaussian pyramid octaves, while the third column shows the corresponding Moran's I values. The green line represents the additive and the blue line the subtractive approach. The relevant scale range determined by the contextual machine learning method is marked by orange and red vertical lines representing the lower and upper limits of the effective scale space. The corresponding variographically determined limits are displayed in light and dark grey. The dashed lines show the octaves closest to these values, which were used for modelling. The normalized (0–1) experimental variograms are also shown for both the scales and the octaves in the respective transformations.
Figure 2. Spherical isotropic variograms of the soil properties for the four study sites. The properties of the isotropic variograms are shown in Table 1.
Table 1. Properties of the isotropic spherical variograms of the soil properties of the different datasets.
| Dataset | Nugget | Sill | Range | Nugget/sill * range |
|---|---|---|---|---|
| Meuse | 0.075 | 0.454 | 1026 | 169 |
| Lachlan | 0.006 | 0.015 | 47926 | 18516 |
| Rhine-Hesse | 0.792 | 2.257 | 9966 | 3497 |
| Piracicaba | 0.003 | 0.014 | 3334 | 805 |
Figure 3. Increase in cross-validated prediction accuracy using all scales compared to the covariates only at the original (finest) scale.
Figure 4. Nugget:sill ratio of the isotropic variograms for the four study sites.
Table 2. Lower and upper boundaries of the effective scale space based on isotropic and anisotropic variograms of the soil properties.
| Dataset | Minimum scale (isotropic) | Minimum scale (anisotropic) | Maximum scale (isotropic) | Maximum scale (anisotropic) |
|---|---|---|---|---|
| Meuse | 84 | 30 | 513 | 677 |
| Lachlan | 9258 | 395 | 23963 | 34041 |
| Rhine-Hesse | 1748 | 299 | 4983 | 7660 |
| Piracicaba | 402 | 246 | 1667 | 4175 |
The minimum scale is calculated by multiplying the nugget:sill ratio by the range of the variogram, and the maximum scale equals the range of the variogram. For the anisotropic variograms, the overall minimum and maximum scales were selected. The original variogram distances were divided by a factor of 2 to obtain a common basis for the analysis of scales with the Gaussian scale space (see Methods).
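The calculation described in this note can be illustrated with a minimal Python sketch. The function name `effective_scale_limits` and the `distance_factor` parameter are hypothetical names introduced here for illustration and are not taken from the paper. The inputs are the rounded isotropic variogram parameters listed in the first table, so the computed minimum scales only approximate the tabulated isotropic values, whereas the maximum scales (half the range) match them.

```python
def effective_scale_limits(nugget, sill, vrange, distance_factor=2.0):
    """Approximate the effective scale space from variogram parameters.

    Hypothetical helper (not from the paper):
      minimum scale = (nugget / sill) * range / distance_factor
      maximum scale = range / distance_factor
    The default factor of 2 follows the table note on halving the
    variogram distances.
    """
    if sill <= 0:
        raise ValueError("sill must be positive")
    min_scale = (nugget / sill) * vrange / distance_factor
    max_scale = vrange / distance_factor
    return min_scale, max_scale


# Rounded isotropic variogram parameters (nugget, sill, range) per dataset.
variograms = {
    "Meuse": (0.075, 0.454, 1026),
    "Lachlan": (0.006, 0.015, 47926),
    "Rhine-Hesse": (0.792, 2.257, 9966),
    "Piracicaba": (0.003, 0.014, 3334),
}

for name, (nugget, sill, vrange) in variograms.items():
    lower, upper = effective_scale_limits(nugget, sill, vrange)
    # Rounded inputs mean the lower limit is only an approximation.
    print(f"{name}: minimum scale ~ {lower:.0f}, maximum scale = {upper:.0f}")
```

Because the published nugget and sill values are rounded, recomputing the lower limit from them can deviate from the tabulated figure, most visibly for datasets with very small nugget and sill values such as Lachlan.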
Figure 5. Comparison of the influence of selecting the relevant range based on the contextual machine learning method (green) and the variography method (yellow) with the results for the full range of scales (blue).