Gerald Forkuor, Ozias K L Hounkpatin, Gerhard Welp, Michael Thiel.
Abstract
Accurate and detailed spatial soil information is essential for environmental modelling, risk assessment and decision making. Using Remote Sensing data as a secondary source of information in digital soil mapping has been found to be cost effective and less time consuming than traditional soil mapping approaches. However, the potential of Remote Sensing data to improve knowledge of local-scale soil information in West Africa has not been fully explored. This study investigated the use of high spatial resolution satellite data (RapidEye and Landsat), terrain/climatic data and laboratory-analysed soil samples to map the spatial distribution of six soil properties (sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and nitrogen) in a 580 km² agricultural watershed in south-western Burkina Faso. Four statistical prediction models were tested and compared: multiple linear regression (MLR), random forest regression (RFR), support vector machine (SVM) and stochastic gradient boosting (SGB). Internal validation was conducted by cross validation, while the predictions were validated against an independent set of soil samples from both the modelling area and an extrapolation area. Model performance statistics revealed that the machine learning techniques performed marginally better than MLR, with RFR providing the highest accuracy in most cases. The inability of MLR to handle non-linear relationships between dependent and independent variables was found to be a limitation in accurately predicting soil properties at unsampled locations. Satellite data acquired during ploughing or early crop development stages (e.g. May, June) were found to be the most important spectral predictors, while elevation, temperature and precipitation emerged as prominent terrain/climatic variables in predicting soil properties.
The results further showed that the shortwave infrared and near infrared channels of Landsat 8, as well as the soil-specific indices of redness, coloration and saturation, were prominent predictors in digital soil mapping. Given the growing availability of free Remote Sensing data (e.g. Landsat, SRTM, Sentinels), soil information at local and regional scales in data-poor regions such as West Africa can be improved with relatively few financial and human resources.
Year: 2017 PMID: 28114334 PMCID: PMC5256943 DOI: 10.1371/journal.pone.0170478
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Map of the study watershed and locations of soil sampling.
Statistical parameters of the mid infrared spectroscopy-partial least squares regression prediction models (n = 100 samples) and of the predicted dataset (n = 1104 samples).
| Parameter | R2 (%), full cross-validation | RMSECV | RPD, full cross-validation | Slope, full cross-validation | R2 (%), test set (V = 10%) | RMSEP | RPD, test set | Slope, test set | Predicted min | Predicted max | Predicted mean | Predicted SD |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sand (%) | 70.5 | 6.8 | 1.8 | 0.7 | 80.9 | 5.7 | 2.5 | 0.7 | 3.5 | 66.2 | 37.6 | 9.6 |
| Silt (%) | 75.8 | 4.9 | 2 | 0.8 | 88.2 | 3.9 | 3 | 0.8 | 22.4 | 67.1 | 45.1 | 8.6 |
| Clay (%) | 77.6 | 6.2 | 2.1 | 0.8 | 80.6 | 5.5 | 2.4 | 0.8 | 0 | 54.5 | 20.8 | 8 |
| CEC (cmolc kg-1) | 75.6 | 3.6 | 2 | 0.8 | 90.5 | 3.2 | 3.6 | 0.8 | 0 | 36.3 | 12.5 | 5.9 |
| SOC (%) | 95.3 | 0.1 | 4.6 | 0.9 | 92.2 | 0.2 | 3.6 | 0.9 | 0 | 4.4 | 1.5 | 0.6 |
| Nitrogen (%) | 85.5 | 0 | 2.6 | 0.9 | 85.7 | 0 | 3 | 0.8 | 0 | 0.3 | 0.1 | 0.1 |
RMSECV: root mean square error of cross validation, RMSEP: root mean square error of prediction, RPD: ratio of performance to deviation, V: validation set, SD: standard deviation.
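The error statistics named in the table note can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE, and it uses the sample standard deviation. The function names `rmse` and `rpd` are hypothetical.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error between paired observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(observed, predicted):
    """Ratio of performance to deviation.

    Assumed definition: sample SD of the observed values divided by the RMSE.
    Raises ZeroDivisionError if the predictions are perfect (RMSE of zero).
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


if __name__ == "__main__":
    lab_values = [1.0, 2.0, 3.0, 4.0]   # made-up reference measurements
    model_values = [1.0, 2.0, 3.0, 6.0]  # made-up predictions
    print(rmse(lab_values, model_values), rpd(lab_values, model_values))
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the property.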
Spectral bands of satellite images used and definitions of soil and vegetation indices calculated.
| Sensor | Spectral bands |
|---|---|
| RapidEye | Blue (B), Green (G), Red (R), Red edge (RdE), Near infrared (NIR) |
| Landsat 8 | Blue (B), Green (G), Red (R), Near infrared (NIR), Shortwave infrared 1 (SWIR 1), Shortwave infrared 2 (SWIR 2) |

| Index | Formula | Property represented |
|---|---|---|
| Brightness Index (BI) | ((R^2 + G^2 + B^2) / 3)^0.5 | Average reflectance magnitude |
| Saturation Index (SI) | (R - B) / (R + B) | Spectral slope |
| Hue Index (HI) | (2 * R - G - B) / (G - B) | Primary colors |
| Coloration Index (CI) | (R - G) / (R + G) | Soil color |
| Redness Index (RI) | R^2 / (B * G^3) | Hematite content |
| Normalized Difference Vegetation Index (NDVI) | (NIR - R) / (NIR + R) | Health and amount of vegetation |
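The index definitions above translate directly into code. The sketch below evaluates them for a single pixel. The function name `soil_indices` and the example reflectance values are hypothetical, and the inputs are assumed to be surface reflectances on a common scale.

```python
import math


def soil_indices(blue, green, red, nir):
    """Compute the soil and vegetation indices for one pixel.

    Raises ZeroDivisionError where a denominator is zero, e.g. the
    Hue Index when green equals blue.
    """
    return {
        "BI": math.sqrt((red ** 2 + green ** 2 + blue ** 2) / 3),
        "SI": (red - blue) / (red + blue),
        "HI": (2 * red - green - blue) / (green - blue),
        "CI": (red - green) / (red + green),
        "RI": red ** 2 / (blue * green ** 3),
        "NDVI": (nir - red) / (nir + red),
    }


if __name__ == "__main__":
    # Made-up reflectances, for illustration only.
    for name, value in soil_indices(0.1, 0.2, 0.3, 0.5).items():
        print(f"{name}: {value:.4f}")
```

For whole images the same expressions would be applied band-wise to arrays, with zero denominators masked rather than raised as errors.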
Terrain and climatic variables considered in this study.
| Parameter | Definition | Units |
|---|---|---|
| Slope* | Inclination of the land surface from the horizontal | radians/% |
| Steepest slope | Maximal rate of elevation change in the gravitational field | radians |
| Curvature | Curvature | degree m-1 |
| General curvature | Combination of horizontal and vertical curvature | degree m-1 |
| Plan curvature* | Horizontal (contour) curvature | degree m-1 |
| Maximum curvature | Maximum curvature | degree m-1 |
| Minimum curvature | Minimum curvature | degree m-1 |
| Total curvature | Curvature of the surface itself | degree m-1 |
| Parallel curvature | Parallel curvature | degree m-1 |
| Rectangle curvature | Rectangle curvature | degree m-1 |
| Flow line curvature | Flow line curvature | degree m-1 |
| Profile curvature | Vertical rate of change of slope | degree m-1 |
| Horizontal curvature | Measure of flow convergence and divergence | degree m-1 |
| Flow direction* | Path of water flow | - |
| Aspect | Direction the slope faces | degree |
| Cosine aspect | Direction the slope faces: eastness | degree |
| Sine aspect | Direction the slope faces: northness | degree |
| Elevation | Vertical distance above sea level | m |
| Protection index | Extent to which a cell is protected by relief, based on the immediately surrounding cells | |
| Topographic position index | Location higher or lower than the average of its surroundings | |
| SAGA Wetness Index | Ratio of local catchment area to slope | - |
| Flow accumulation* | Ultimate flow path of every cell on the landscape grid | - |
| Channel network base level | Channel network base level elevation | m |
| Temperature (mean annual) | Temperature | °C |
| Precipitation (mean annual) | Precipitation | mm |
The variables with (*) were calculated in SAGA as well as ArcGIS due to slight differences in the computational algorithms used by the two software packages.
Number of spectral and terrain/climatic predictors used in modelling each soil parameter.
| Data/Parameter | Sand | Silt | Clay | CEC | SOC | Nitrogen |
|---|---|---|---|---|---|---|
| Spectral | 17 | 22 | 21 | 12 | 26 | 19 |
| Terrain/climatic | 9 | 10 | 5 | 13 | 12 | 12 |
| Total | 26 | 32 | 26 | 25 | 38 | 31 |
Internal model validation based on the 80% training data (all spectral and terrain/climatic predictors).
| Model | Sand RMSE | Sand R2 | Silt RMSE | Silt R2 | Clay RMSE | Clay R2 | CEC RMSE | CEC R2 | SOC RMSE | SOC R2 | Nitrogen RMSE | Nitrogen R2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 7.566 | 0.346 | 5.940 | 0.537 | 6.946 | 0.212 | 4.786 | 0.357 | 0.546 | 0.348 | 0.038 | 0.352 |
| RFR | 7.586 | 0.342 | 5.937 | 0.538 | 7.022 | 0.185 | 4.689 | 0.383 | 0.528 | 0.39 | 0.038 | 0.354 |
| SVM | 7.592 | 0.342 | 6.091 | 0.519 | 6.993 | 0.206 | 4.889 | 0.333 | 0.551 | 0.341 | 0.038 | 0.339 |
| SGB | 7.707 | 0.318 | 6.094 | 0.514 | 7.164 | 0.162 | 4.767 | 0.360 | 0.539 | 0.367 | 0.038 | 0.339 |
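The validation design, with 80% of the samples used for training and internal cross validation and 20% held out for testing, can be sketched as follows. This is only an illustration under stated assumptions: the source does not say how the split was drawn or how many folds were used, so the random shuffle, the `seed` value, `k=10` and the function names are all hypothetical.

```python
import random


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Randomly split samples into (train, test) lists.

    The seed is an arbitrary choice that makes the split reproducible.
    """
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    n_test = round(len(samples) * test_fraction)
    test = [samples[i] for i in indices[:n_test]]
    train = [samples[i] for i in indices[n_test:]]
    return train, test


def k_folds(n_samples, k=10):
    """Yield (train_indices, validation_indices) for k-fold cross validation.

    k=10 is an assumed default, not a value taken from the study.
    """
    folds = [list(range(start, n_samples, k)) for start in range(k)]
    for held_out, validation in enumerate(folds):
        training = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield training, validation


if __name__ == "__main__":
    train, test = train_test_split(list(range(100)))
    print(len(train), len(test))
    for training, validation in k_folds(len(train), k=10):
        print(len(training), len(validation))
```

Each model would be fitted on the training indices of a fold and scored on its validation indices, with the held-out 20% kept for the external validation.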
External validation in small catchment based on 20% testing data with spectral data and terrain/climatic variables.
| Model | Sand RMSE | Sand sMAPE | Silt RMSE | Silt sMAPE | Clay RMSE | Clay sMAPE | CEC RMSE | CEC sMAPE | SOC RMSE | SOC sMAPE | Nitrogen RMSE | Nitrogen sMAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 8.482 | 0.189 | 5.900 | 0.107 | 6.708 | 0.239 | 4.787 | 0.415 | 0.541 | 0.285 | 0.043 | 0.290 |
| RFR | 7.764 | 0.180 | 5.708 | 0.103 | 6.590 | 0.242 | 4.593 | 0.318 | 0.512 | 0.261 | 0.041 | 0.273 |
| SVM | 8.415 | 0.188 | 5.899 | 0.107 | 6.667 | 0.234 | 4.897 | 0.394 | 0.549 | 0.283 | 0.043 | 0.287 |
| SGB | 7.954 | 0.189 | 5.819 | 0.107 | 6.791 | 0.242 | 4.562 | 0.314 | 0.526 | 0.272 | 0.041 | 0.286 |
External validation based on 102 samples outside the small catchment with spectral data and terrain/climatic variables.
| Model | Sand RMSE | Sand sMAPE | Silt RMSE | Silt sMAPE | Clay RMSE | Clay sMAPE | CEC RMSE | CEC sMAPE | SOC RMSE | SOC sMAPE | Nitrogen RMSE | Nitrogen sMAPE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MLR | 17.341 | 0.547 | 9.350 | 0.157 | 11.804 | 0.548 | 5.597 | 0.469 | 0.847 | 0.505 | 0.059 | 0.496 |
| RFR | 14.115 | 0.314 | 8.713 | 0.146 | 10.623 | 0.478 | 4.891 | 0.415 | 0.765 | 0.472 | 0.053 | 0.457 |
| SVM | 20.257 | 0.193 | 9.106 | 0.153 | 14.738 | 0.566 | 5.669 | 0.448 | 0.750 | 0.471 | 0.057 | 0.488 |
| SGB | 15.184 | 0.341 | 8.846 | 0.148 | 10.875 | 0.497 | 4.960 | 0.398 | 0.759 | 0.476 | 0.051 | 0.454 |
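The external validation tables report the symmetric mean absolute percentage error (sMAPE) as a fraction. Several variants of sMAPE exist and this record does not state which one was used. The sketch below shows one common form, the mean of |predicted - observed| divided by the average of |observed| and |predicted|. The function name and the zero-denominator convention are assumptions.

```python
def smape(observed, predicted):
    """Symmetric mean absolute percentage error, returned as a fraction.

    Assumed variant: mean of |p - o| / ((|o| + |p|) / 2).
    A pair where both values are zero contributes zero error.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    terms = []
    for o, p in zip(observed, predicted):
        denominator = (abs(o) + abs(p)) / 2
        terms.append(0.0 if denominator == 0 else abs(p - o) / denominator)
    return sum(terms) / len(terms)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(smape([100.0, 200.0], [110.0, 180.0]))
```

Because the error is scaled by the magnitude of the values, sMAPE allows a rough comparison across properties with different units, which RMSE alone does not.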
First five predictors that were highly significant for RFR (based on “IncNodePurity” importance measure) and MLR analysis.
| Model | Rank | Sand | Silt | Clay | CEC | SOC | Nitrogen |
|---|---|---|---|---|---|---|---|
| MLR | 1 | june_SWIR2 | june_SWIR2 | june_NIR | june_SWIR2 | Elevation | Elevation |
| MLR | 2 | june_green | June_RI | June_RI | May_RI | prep | March_NDVI |
| MLR | 3 | June_CI | may_red | may_blue | may_RE | march_NIR | march_NIR |
| MLR | 4 | may_green | june_red | June_SI | June_BI | March_NDVI | march_green |
| MLR | 5 | April_HI | June_BI | June_CI | june_red | june_SWIR1 | March_CI |
| RFR | 1 | june_SWIR2 | June_RI | june_NIR | june_SWIR2 | june_red | june_NIR |
| RFR | 2 | may_NIR | May_SI | June_RI | june_blue | june_NIR | June_SI |
| RFR | 3 | june_green | june_SWIR1 | june_blue | May_RI | Elevation | Elevation |
| RFR | 4 | May_SI | june_SWIR2 | june_SWIR1 | March_NDVI | June_SI | march_green |
| RFR | 5 | may_green | May_CI | temp | june_red | June_BI | may_red |
The names of the spectral predictors (see Table 2) are a concatenation of the month of satellite acquisition and a spectral channel or index. For example, “May_BI” represents the brightness index calculated from the May RapidEye image. prep: precipitation, temp: temperature.
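The naming convention described in this note is easy to take apart programmatically. The helper below is a hypothetical sketch: it splits a predictor name at the first underscore and lower-cases the month, because the table mixes spellings such as "june_SWIR2" and "June_RI". Names without an underscore are treated as terrain/climatic variables.

```python
def parse_predictor(name):
    """Split a predictor name into (month, channel_or_index).

    Returns (None, name) for terrain/climatic predictors such as
    "Elevation", "prep" or "temp", which carry no month prefix.
    """
    month, separator, channel = name.partition("_")
    if not separator:
        return None, name
    return month.lower(), channel


if __name__ == "__main__":
    for predictor in ["May_BI", "june_SWIR2", "Elevation", "temp"]:
        print(predictor, "->", parse_predictor(predictor))
```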
Fig 2. Spatial distribution of sand, silt, clay, cation exchange capacity (CEC), soil organic carbon (SOC) and total nitrogen (N) in the topsoil of the studied watershed.