M Tighe, N Forster, C Guppy, D Savage, P Grave, I M Young.
Abstract
The provenance or origin of a soil sample is of interest in soil forensics, archaeology, and biosecurity. In all of these fields, highly specialized and often expensive analysis is usually combined with expert interpretation to estimate sample origin. In this proof of concept study we apply rapid and non-destructive spectral analysis to the question of direct soil provenancing. This approach is based on one of the underlying tenets of soil science - that soil pedogenesis is spatially unique, and thus digital spectral signatures of soil can be related directly, rather than via individual soil properties, to a georeferenced location. We examine three different multivariate regression techniques to predict GPS coordinates in two nested datasets. With a minimum of data processing, we show that in most instances Eastings and Northings can be predicted to within 20% of the range of each within the dataset using the spectral signatures produced via portable x-ray fluorescence. We also generate 50 and 95% confidence intervals of prediction and express these as a range of GPS coordinates. This approach has promise for future application in soil and environmental provenancing.
Year: 2018 PMID: 29453358 PMCID: PMC5816621 DOI: 10.1038/s41598-018-21530-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Model fit parameters and prediction results of Eastings and Northings using Vis-NIR and PXRF analysis of Australian Farm and Local dataset samples.
| Scanning method | Dataset | Regression | Eastings: Components¹ | Eastings: R² | Eastings: RMSE (%) | Eastings: AIC | Northings: Components¹ | Northings: R² | Northings: RMSE (%) | Northings: AIC |
|---|---|---|---|---|---|---|---|---|---|---|
| Vis-NIR | Australia – Farm | PLS | 4 | 0.25 | 315 (20) | 169 | 3 | 0.66 | 384 (15) | 173 |
| Vis-NIR | Australia – Farm | PCR | 10 | 0.07 | 350 (22) | 184 | 4 | 0.68 | 376 (14) | 174 |
| Vis-NIR | Australia – Farm | EARTH | 8,6 | 0.24 | 318 (20) | — | 6,4 | 0.09 | 632 (24) | — |
| Vis-NIR | Australia – Local | PLS | 10 | 0.72 | 426 (14) | 244 | 14 | 0.48 | 1604 (23) | 301 |
| Vis-NIR | Australia – Local | PCR | 18 | 0.72 | 427 (14) | 260 | 15 | 0.54 | 1505 (22) | 301 |
| Vis-NIR | Australia – Local | EARTH | 7,5 | 0.56 | 535 (18) | — | 12,9 | 0.22 | 1973 (29) | — |
| PXRF | Australia – Farm | PLS | 8 | −0.13 | 387 (25) | 183 | 4 | 0.67 | 370 (14) | 174 |
| PXRF | Australia – Farm | PCR | 12 | 0.35 | 293 (19) | 167 | 4 | 0.68 | 376 (14) | 174 |
| PXRF | Australia – Farm | EARTH | 7,5 | −0.69 | 473 (30) | — | 13,9 | 0.51 | 461 (17) | — |
| PXRF | Australia – Local | PLS | 12 | 0.62 | 496 (17) | 254 | 17 | 0.75 | 1118 (16) | 294 |
| PXRF | Australia – Local | PCR | 17 | 0.73 | 420 (14) | 257 | 17 | 0.71 | 1208 (18) | 297 |
| PXRF | Australia – Local | EARTH | 13,8 | 0.67 | 465 (16) | — | 16,11 | 0.53 | 1530 (22) | — |
¹ For PLS and PCR, components are the number of latent variables that minimized the adjusted general cross validation value on the training portion of data. For EARTH, the components are the number of terms and knot points automatically selected during the building of the piecewise splines of the model with 10-fold validation as per the methods section.
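The RMSE columns pair each error value with a bracketed percentage, which, read alongside the abstract's "within 20% of the range" statement, appears to be the RMSE expressed as a share of the coordinate range in the dataset. The following is a minimal sketch of that calculation under this assumed definition. The function names and the sample coordinates are hypothetical and are not taken from the study.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent_of_range(observed, predicted):
    """RMSE as a percentage of the observed coordinate range (assumed definition)."""
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return 100.0 * rmse(observed, predicted) / span


# Hypothetical Eastings (m) for four test samples; not data from the study.
observed_eastings = [1000.0, 1500.0, 2000.0, 3000.0]
predicted_eastings = [1200.0, 1300.0, 2200.0, 2800.0]

error_m = rmse(observed_eastings, predicted_eastings)
error_pct = rmse_percent_of_range(observed_eastings, predicted_eastings)
print(f"RMSE = {error_m:.0f} m ({error_pct:.0f}% of range)")
```

Scaling by the range lets the Farm and Local datasets, which cover different extents, be compared on a common footing.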
Figure 1. Cumulative probability distributions of distance predictions falling within a specified prediction error (m) for the Australian Farm dataset. The three regression approaches and instrument combinations as described in text are presented for (a) Eastings predictions and (b) Northings predictions. Vis-NIR = black lines, PXRF = grey lines. PLS = solid lines. PCR = short dashed lines. EARTH = dash-dot lines.
Figure 2. Cumulative probability distributions of distance predictions falling within a specified prediction error (m) for the Australian Local dataset. The three regression approaches and instrument combinations selected as described in text are presented for (a) Eastings predictions and (b) Northings predictions. Vis-NIR = black lines, PXRF = grey lines. PLS = solid lines. PCR = short dashed lines. EARTH = dash-dot lines.
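Curves of this kind can be read as empirical cumulative distributions: for each error threshold, the fraction of test predictions whose absolute error is at or below it. The sketch below illustrates that reading only. The function name and the error values are hypothetical, and the authors' own procedure for building the distributions may differ.

```python
def cumulative_error_fraction(abs_errors, thresholds):
    """For each threshold (m), return the fraction of errors at or below it."""
    if not abs_errors:
        raise ValueError("abs_errors must be non-empty")
    n = len(abs_errors)
    return [sum(1 for e in abs_errors if e <= t) / n for t in thresholds]


# Hypothetical absolute Easting errors (m); not data from the study.
abs_errors = [50.0, 120.0, 200.0, 310.0, 640.0]
thresholds = [100.0, 250.0, 500.0, 1000.0]

fractions = cumulative_error_fraction(abs_errors, thresholds)
for t, f in zip(thresholds, fractions):
    print(f"error <= {t:.0f} m: {f:.0%} of predictions")
```

A curve that rises faster at small thresholds indicates a model and instrument combination with lower cumulative error.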
Figure 3. Sample points for the Australian Farm dataset, with the simulated average 50% (dark grey) and 95% (light grey) prediction space for the independent test samples overlain as the moving window of average prediction uncertainty as per text. As such the size of the shaded rectangles can be taken as graphical representations of model predictive performance. Prediction spaces were extracted from the probability distributions with the lowest cumulative error in Fig. 1.
Figure 4. Sample points for the Australian Local dataset, with the simulated average 50% (dark grey) and 95% (light grey) prediction space for the independent test samples overlain as the moving window of average prediction uncertainty as per text. As such the size of the shaded rectangles can be taken as graphical representations of model predictive performance. Prediction spaces were extracted from the probability distributions with the lowest cumulative error in Fig. 2. The samples of the Local dataset that also comprise part of the Farm dataset are shown as filled dark grey squares.
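One simple way to picture a rectangular prediction space is to take the 50th and 95th percentiles of the absolute Easting and Northing errors and use them as half-widths around a predicted point. The sketch below does only that. It is an illustration under that assumption and does not reproduce the simulation or moving-window averaging described in the text; the function names, error values, and predicted coordinates are all hypothetical.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty list."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


def prediction_box(easting, northing, abs_err_e, abs_err_n, level):
    """Rectangle (min_e, max_e, min_n, max_n) around a predicted point.

    Half-widths are the `level` percentile of the absolute errors in each
    coordinate (an assumed, simplified definition of the prediction space).
    """
    half_e = percentile(abs_err_e, level)
    half_n = percentile(abs_err_n, level)
    return (easting - half_e, easting + half_e,
            northing - half_n, northing + half_n)


# Hypothetical absolute test errors (m) and predicted point; not study data.
abs_err_e = [0.0, 100.0, 200.0, 300.0, 400.0]
abs_err_n = [0.0, 200.0, 400.0, 600.0, 800.0]

box50 = prediction_box(5000.0, 9000.0, abs_err_e, abs_err_n, 50)
box95 = prediction_box(5000.0, 9000.0, abs_err_e, abs_err_n, 95)
print("50% box:", box50)
print("95% box:", box95)
```

Under this reading, a smaller rectangle corresponds to better predictive performance, which matches how the captions describe the shaded areas.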