Elizabeth Jeanne Parent, Serge-Étienne Parent, Léon Etienne Parent.
Abstract
The accuracy of infrared (IR) models for measuring soil particle-size distribution (PSD) depends on soil preparation, methodology (sedimentation, laser), settling times and relevant soil features. Compositional soil data may require isometric log-ratio (ilr) transformation to avoid numerical biases. Machine learning can relate the numerous independent variables that may affect near-infrared (NIR) spectra to particle-size distribution. Our objective was to reach high infrared spectroscopy (IRS) prediction accuracy across a large range of PSD methods and soil properties. A total of 1298 soil samples from eastern Canada were IR-scanned. Spectra were processed by Stochastic Gradient Boosting (SGB) to predict sand, silt, clay and carbon. The slope and intercept of the log-log relationships between settling time and suspension density function (SDF) (R² = 0.84-0.92) performed similarly to NIR spectra using either ilr-transformed values (R² = 0.81-0.93) or raw percentages (R² = 0.76-0.94). Settling times of 0.67 min and 2 h were the most accurate for NIR predictions (R² = 0.49-0.79). The NIR prediction of sand was more accurate for the sieving method (R² = 0.66) than for the sedimentation method (R² = 0.53). The NIR 2X gain was less accurate (R² = 0.69-0.92) than the 4X gain (R² = 0.87-0.95). The mid-infrared (MIR) spectra (R² = 0.45-0.80) performed better than the NIR spectra (R² = 0.40-0.71). Adding soil carbon, reconstituted bulk density, pH, red-green-blue color, and oxalate and Mehlich-3 extracts returned R² values of 0.86-0.91 for texture prediction. In addition to the slope and intercept of the SDF, the 4X gain, method and pre-treatment classes, soil carbon and color appeared to be promising features for routine SGB-processed NIR particle-size analysis. Machine learning methods support cost-effective NIR analysis of soil texture.
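The slope and intercept of the log-log relationship between settling time and suspension density can be illustrated with an ordinary least-squares fit. The following is a minimal sketch, not the authors' procedure: it assumes that log density is regressed on log time, that natural logarithms are used, and that the readings (`times_min`, `densities`) are hypothetical values generated from a known power law.

```python
import math


def loglog_fit(times_min, densities):
    """Least-squares fit of ln(density) on ln(time); returns (slope, intercept)."""
    if len(times_min) != len(densities) or len(times_min) < 2:
        raise ValueError("need at least two paired readings")
    xs = [math.log(t) for t in times_min]
    ys = [math.log(d) for d in densities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical settling times (minutes) and suspension densities that follow
# density = 40 * t**-0.25 exactly, so the fit should recover those parameters.
times_min = [0.67, 2.0, 15.0, 120.0, 420.0]
densities = [40.0 * t ** -0.25 for t in times_min]
slope, intercept = loglog_fit(times_min, densities)
```

Under these assumptions, each soil sample is summarized by two numbers (slope and intercept) that could then serve as model features in place of, or alongside, individual texture fractions.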
Year: 2021 PMID: 34283823 PMCID: PMC8291647 DOI: 10.1371/journal.pone.0233242
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Distribution of soil sampling areas in the eastern provinces of Canada.
Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under ODbL. Contains information from OpenStreetMap and OpenStreetMap Foundation, which is made available under the Open Database License. http://maps.stamen.com/#toner/12/37.7706/-122.3782.
Ranges of soil properties (0–20 cm) in the data set (particle-size distribution according to the multi-point 7-h sedimentation method using 45-sec settling time for sand and 7-h settling time for clay).
| Site | Mean | Standard deviation | Minimum | Maximum |
|---|---|---|---|---|
| | 5.36 | 0.67 | 3.37 | 7.90 |
| g kg⁻¹ | | | | |
| | 613 | 24 | 1 | 986 |
| | 41 | 50 | 0 | 282 |
| | 108 | 115 | 1 | 541 |
| | 191 | 126 | 5 | 580 |
| | 235 | 154 | 3 | 766 |
| | 118 | 103 | 0 | 587 |
| | 256 | 15 | 5 | 816 |
| | 131 | 13 | 5 | 839 |
| | 25.5 | 2.4 | 0.1 | 443 |
| | 1.9 | 1.6 | 0.7 | 19 |
| | 0.3 | 0.4 | 0 | 10.3 |
| mg kg⁻¹ | | | | |
| | 5599 | 2449 | 447 | 10400 |
| | 4000 | 2693 | 528 | 15346 |
| | 309 | 304 | 7 | 2287 |
| | 30 | 14 | 3 | 153 |
| | 803 | 464 | 68 | 2266 |
| | 829 | 578 | 88 | 5585 |
| | 118 | 62 | 6 | 362 |
| | 1776 | 1437 | 113 | 5103 |
| | 191 | 224 | 5 | 1572 |
| | 232 | 108 | 72 | 715 |
| | 1265 | 431 | 310 | 2241 |
| | 25 | 20 | 5 | 206 |
| | 8 | 9 | 0 | 130 |
| | 4 | 3 | 0.4 | 30 |
| % | | | | |
| | 47 | 8 | 28 | 68 |
| | 34 | 10 | 11 | 58 |
| | 54 | 7 | 31 | 75 |
| g cm⁻³ | | | | |
| | 1.06 | 0.16 | 0.71 | 1.46 |
Fig 2. Particle-size distribution of the studied soils in the Canadian textural diagram.
Fig 3. Distribution of ilr textural variables across methodologies.
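To make the ilr textural variables concrete, the sketch below converts a sand-silt-clay composition into two ilr balances and back. The balance design used here (sand against silt and clay, then silt against clay) is an assumption chosen for illustration; the record does not state which partition the authors used, and the function names and sample values are hypothetical.

```python
import math


def ilr_texture(sand, silt, clay):
    """Return two ilr balances for a 3-part texture composition.

    Assumed partition: b1 = [sand | silt, clay], b2 = [silt | clay].
    All parts must be strictly positive; zeros need replacement beforehand.
    """
    if min(sand, silt, clay) <= 0:
        raise ValueError("all parts must be > 0 before log-ratio transformation")
    total = sand + silt + clay
    s, si, c = sand / total, silt / total, clay / total
    b1 = math.sqrt(2.0 / 3.0) * math.log(s / math.sqrt(si * c))
    b2 = math.sqrt(1.0 / 2.0) * math.log(si / c)
    return b1, b2


def ilr_inverse(b1, b2):
    """Map two balances back to (sand, silt, clay) fractions summing to 1."""
    # Centered log-ratio coordinates implied by the same partition.
    clr_sand = math.sqrt(2.0 / 3.0) * b1
    clr_silt = -b1 / math.sqrt(6.0) + b2 / math.sqrt(2.0)
    clr_clay = -b1 / math.sqrt(6.0) - b2 / math.sqrt(2.0)
    parts = [math.exp(v) for v in (clr_sand, clr_silt, clr_clay)]
    total = sum(parts)
    return tuple(p / total for p in parts)


# Hypothetical soil: 600, 250 and 150 g/kg of sand, silt and clay.
balances = ilr_texture(600.0, 250.0, 150.0)
recovered = ilr_inverse(*balances)
```

Because the balances are unbounded real numbers, a model can predict them freely, and the inverse transform guarantees that the back-transformed fractions are positive and sum to one.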
Comparison of methods (method1 minus method2) using paired t-test and confidence intervals (p ≤ 0.05).
| Target variable | Method1 | Method2 | p.value | N |
|---|---|---|---|---|
| | | | ns | 38 |
| | | | ** | 46 |
| | | | ** | 106 |
| | | | ** | 763 |
| | | | ns | 227 |
| | | | ** | 38 |
| | | | ** | 46 |
| | | | ** | 106 |
| | | | ** | 763 |
| | | | ** | 227 |
| | | | ns | 38 |
| | | | | 46 |
| | | | * | 106 |
| | | | ** | 763 |
| | | | ** | 227 |
| | | | * | 746 |
| | | | ns | 26 |
| | | | ** | 34 |
| | | | * | 77 |
| | | | ns | 358 |
| | | | * | 84 |
ns, *, **: Non-significant, significant at the 0.05 level, and significant at the 0.01 level, respectively.
N: Sample size.
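The paired comparison (method1 minus method2) summarized above can be sketched as follows. This is an illustrative calculation under stated assumptions, not the authors' code: the function name and the sample values are hypothetical, and the p-value or confidence interval would still have to be obtained from a Student t distribution with the returned degrees of freedom, for example through a statistics library.

```python
import math
import statistics


def paired_t(method1, method2):
    """Return (mean difference, t statistic, degrees of freedom) for paired data."""
    if len(method1) != len(method2) or len(method1) < 2:
        raise ValueError("need at least two paired observations")
    # Differences are taken as method1 minus method2, as in the table caption.
    diffs = [a - b for a, b in zip(method1, method2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean_diff, mean_diff / std_err, n - 1


# Hypothetical values of one textural variable measured on the same three
# samples by two methods.
method1 = [11.0, 12.0, 13.0]
method2 = [10.0, 10.0, 10.0]
mean_diff, t_stat, dof = paired_t(method1, method2)
```

Pairing matters here because both methods are applied to the same soil samples, so the test works on within-sample differences and ignores the large between-sample variation in texture.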
Fig 4. Mean differences (p ≤ 0.025) in textural balances between several methodologies against the reference 2-point 2-h sedimentation method without pre-treatment.
Accuracy, slope, intercept, and number of observations of models.
| Set | Dependent variables | Independent variables | Laboratory method for calibration | R² Sand | R² Silt | R² Clay | R² C | RMSE Sand | RMSE Silt | RMSE Clay | RMSE C | Value interpretation* | N |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | | 0.80 | 0.79 | 0.45 | 0.28 | 10.80 | 10.64 | 1.32 | 0.67 | - | 92 |
| | | | | 0.86 | 0.80 | 0.64 | 0.63 | 9.11 | 10.35 | 1.06 | 0.48 | - | |
| | | | | 0.90 | 0.85 | 0.80 | 0.58 | 9.05 | 6.94 | 5.27 | 0.72 | + | 485 |
| | | | | 0.89 | 0.91 | 0.85 | 0.71 | 9.30 | 7.17 | 4.21 | 0.62 | ++ | |
| | | | | 0.63 | 0.44 | 0.70 | 0.40 | 16.17 | 13.22 | 6.94 | 0.92 | - | 667 |
| | | | | 0.76 | 0.49 | 0.79 | 0.36 | 13.42 | 11.50 | 6.60 | 0.95 | - | |
| | | | | 0.75 | 0.45 | 0.80 | 0.21 | 14.75 | 11.34 | 7.88 | 0.77 | - | 222 |
| | | | | 0.71 | 0.40 | 0.69 | 0.02 | 15.78 | 11.81 | 9.75 | 0.86 | - | |
| | | | | 0.94 | 0.87 | 0.95 | 0.85 | 7.26 | 6.15 | 3.62 | 0.48 | ++ | 311 |
| | | | | 0.83 | 0.69 | 0.92 | 0.76 | 12.42 | 9.40 | 4.78 | 0.62 | + | |
| | | | | 0.90 | 0.81 | 0.93 | 0.84 | 8.75 | 6.98 | 3.94 | 0.73 | + | 860 |
| | | | | 0.91 | 0.76 | 0.94 | 0.89 | 8.40 | 7.74 | 3.84 | 0.61 | + | |
| | | | | 0.90 | 0.84 | 0.92 | - | 8.63 | 6.34 | 4.29 | - | ++ | |
| | | | | 0.91 | 0.86 | 0.90 | 0.69 | 9.27 | 5.86 | 6.06 | 0.36 | ++ | 156 |
| | | | | 0.94 | 0.82 | 0.97 | 0.97 | 7.78 | 7.67 | 3.07 | 0.71 | ++ | |
| | | | | 0.96 | 0.88 | 0.97 | 0.95 | 6.59 | 6.34 | 3.36 | 0.97 | ++ | |
| | | | | 0.89 | 0.63 | 0.94 | 0.99 | 10.59 | 10.84 | 4.66 | 0.52 | - | |
| | | | | 0.92 | 0.78 | 0.96 | 0.94 | 8.84 | 8.39 | 3.53 | 1.06 | + | |
| | | | | 0.94 | 0.86 | 0.95 | 0.96 | 7.55 | 6.64 | 4.10 | 0.84 | ++ | |
| | | | | 0.95 | 0.84 | 0.96 | 0.98 | 7.19 | 7.17 | 3.71 | 0.54 | ++ | |
| | | | | 0.89 | 0.70 | 0.93 | 0.98 | 10.62 | 9.83 | 4.71 | 0.64 | + | |
| | | | | 0.66 | - | - | - | 12.15 | - | - | - | + | 746 |
| | | | | 0.53 | - | - | - | 14.73 | - | - | - | - | |
C: Carbon; PT: Pre-treatments (no peroxide or peroxide); MIR: MIR scores; RMSE: Root mean square error; N: Sample size.
*: Value interpretation on sand, silt and clay contents.
-: Vulnerable estimations; +: Approximated estimations; ++: Research application estimations; +++: Quality control estimations [48,49].
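For readers who want to reproduce accuracy figures of the kind tabulated above, the sketch below computes RMSE and a coefficient of determination from observed and predicted values. It is a hedged illustration: the record does not say whether R² was computed as 1 − SS_res/SS_tot (used here) or as a squared correlation, and the observed and predicted values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need paired, non-empty sequences")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need paired, non-empty sequences")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have no variance")
    return 1.0 - ss_res / ss_tot


# Hypothetical laboratory (observed) and model-predicted sand contents, in %.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
model_rmse = rmse(observed, predicted)
model_r2 = r_squared(observed, predicted)
```

RMSE stays in the units of the target variable, which is why it complements R² when comparing models calibrated on data sets with different ranges of sand, silt, clay or carbon.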