Jianli Ding, Aixia Yang, Jingzhe Wang, Vasit Sagan, Danlin Yu.
Abstract
Soil organic carbon (SOC) is an important soil property that has a profound impact on soil quality and plant growth. With 140 soil samples collected from Ebinur Lake Wetland National Nature Reserve, Xinjiang Uyghur Autonomous Region of China, this research evaluated the feasibility of visible/near infrared (VIS/NIR) spectroscopy data (350-2,500 nm) and simulated EO-1 Hyperion data to estimate SOC in arid wetland regions. Three machine learning algorithms, Ant Colony Optimization-interval Partial Least Squares (ACO-iPLS), Recursive Feature Elimination-Support Vector Machine (RF-SVM), and Random Forest (RF), were employed to select spectral features and further estimate SOC. Results indicated that the feature wavelengths pertaining to SOC were mainly within the ranges of 745-910 nm and 1,911-2,254 nm. The combination of RF-SVM and first derivative pre-processing produced the highest estimation accuracy, with optimal values of Rt (correlation coefficient of the testing set), RMSEt and RPD of 0.91, 0.27% and 2.41, respectively. For the simulated EO-1 Hyperion data, the Support Vector Machine (SVM) based recursive feature elimination algorithm produced the most accurate estimate of SOC content: for the testing set, Rt was 0.79, RMSEt was 0.19%, and RPD was 1.61. This practice provides an efficient, low-cost approach with potentially high accuracy to estimate SOC contents and hence supports better management and protection strategies for desert wetland ecosystems.
Keywords: Desert wetland soil; Ebinur lake wetland; Machine learning; Soil organic carbon
Year: 2018 PMID: 30357023 PMCID: PMC6195798 DOI: 10.7717/peerj.5714
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Study area and locations of sampling points.
Vectorization by Jingzhe Wang.
Descriptive statistics of soil organic carbon in both training and testing sets.
| Data set | Sample size | Min/% | Max/% | Mean/% | St.dev/% |
|---|---|---|---|---|---|
| Training sets | 70 | 0.02 | 2.97 | 0.51 | 0.64 |
| Testing sets | 70 | 0.01 | 3.42 | 0.40 | 0.65 |
Figure 2. Spectral reflectance of different wetland soil organic carbon contents.
(A) Arenosols, (B) Solonetz, (C) Solonetz, (D) Solonchaks, (E) Solonetz.
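The abstract reports that first derivative pre-processing of the spectra gave the best results, but this record does not say how the derivative was computed. The following is a minimal sketch that assumes a plain finite-difference derivative; the function name `first_derivative` and the sample values are hypothetical, and the authors may well have used a different scheme (for example a smoothed derivative).

```python
def first_derivative(wavelengths, values):
    """Finite-difference first derivative of a spectrum.

    Assumption: central differences at interior points and one-sided
    differences at the two ends. This is an illustrative choice, not
    the method documented in the source.
    """
    if len(wavelengths) != len(values):
        raise ValueError("wavelengths and values must have the same length")
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to take a derivative")

    derivative = []
    for i in range(n):
        lo = max(i - 1, 0)        # previous point, clamped at the start
        hi = min(i + 1, n - 1)    # next point, clamped at the end
        slope = (values[hi] - values[lo]) / (wavelengths[hi] - wavelengths[lo])
        derivative.append(slope)
    return derivative


# Hypothetical reflectance values at 1 nm spacing, for illustration only.
example_wavelengths = [350.0, 351.0, 352.0, 353.0]
example_reflectance = [0.10, 0.12, 0.14, 0.16]
example_derivative = first_derivative(example_wavelengths, example_reflectance)
```

A derivative of this kind removes constant baseline offsets between spectra, which is one common reason for applying it before feature selection.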
Selected feature wavelengths, training sets and testing sets results by ACO-iPLS method.
| Pre-processing | Selected wavelengths/nm | Training R | Training RMSE/% | Testing R | Testing RMSE/% | RPD |
|---|---|---|---|---|---|---|
| A′ | 1,786∼1,929 | 0.86 | 0.33 | 0.83 | 0.40 | 1.63 |
| 1∕A′ | 494∼638 | 0.64 | 0.57 | 0.76 | 0.74 | 0.87 |
| lgA′ | 1,786∼1,929 | 0.73 | 0.50 | 0.82 | 0.59 | 1.10 |
Selected feature wavelengths, training sets and testing sets results by RF-SVM method.
| Pre-processing | Selected wavelengths/nm | Training R | Training RMSE/% | Testing R | Testing RMSE/% | RPD |
|---|---|---|---|---|---|---|
| A′ | 780, 1,911, 783, 779, 768, 759, 793, 794, 2,254, 910, 1,677, 1,912, 2,089, 745, 825, 2,088, 746, 2,090, 1,913, 1,751 | 0.97 | 0.16 | 0.91 | 0.27 | 2.41 |
| 1∕A′ | 663, 1,836, 658, 2,431, 2,494, 618, 999, 746, 370, 2,475, 960, 510, 1,081, 443, 1,681, 1,123, 360, 793, 2,123, 2,476 | 0.99 | 0.03 | 0.84 | 0.34 | 1.88 |
| lgA′ | 706, 736, 731, 1,943, 779, 721, 413, 510, 704, 397, 732, 1,944, 1,085, 2,091, 2,347, 881, 2,422, 1,966, 2,257, 2,111 | 0.99 | 0.03 | 0.81 | 0.45 | 1.44 |
Selected feature wavelengths, training sets and testing sets results by RF method.
| Pre-processing | Selected wavelengths/nm | Training R | Training RMSE/% | Testing R | Testing RMSE/% | RPD |
|---|---|---|---|---|---|---|
| A′ | 794, 740, 758, 713, 741, 821, 789, 766, 613, 682, 732, 776, 822, 720, 769, 746, 635, 733, 940, 668 | 0.98 | 0.15 | 0.92 | 0.33 | 1.98 |
| 1∕A′ | 1,403, 1,402, 1,390, 1,399, 1,405, 1,404, 2,189, 2,196, 620, 2,176, 822, 2,192, 809, 2,177, 670, 632, 2,191, 1,388, 727, 2,315 | 0.98 | 0.14 | 0.83 | 0.43 | 1.51 |
| lgA′ | 676, 633, 2,189, 2,202, 2,195, 675, 1,402, 2,183, 722, 632, 620, 703, 821, 2,205, 2,193, 689, 2,200, 646, 812, 714 | 0.98 | 0.14 | 0.90 | 0.41 | 1.58 |
Figure 3. Selected spectral interval by ACO-iPLS with first derivative spectra.
Figure 5. Selected wavelengths by RF with first derivative spectra.
Comparison of the results by different models with first derivative spectra.
| Modeling methods | Training R | Training RMSE/% | Testing R | Testing RMSE/% | RPD |
|---|---|---|---|---|---|
| ACO-iPLS | 0.86 | 0.33 | 0.83 | 0.40 | 1.63 |
| RF | 0.98 | 0.15 | 0.92 | 0.33 | 1.98 |
| RF-SVM | 0.97 | 0.16 | 0.91 | 0.27 | 2.41 |
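The accuracy measures used in these tables (correlation coefficient, RMSE and RPD) are not defined in this record. The sketch below assumes their common definitions: Pearson's R between measured and estimated values, root mean square error, and RPD as the standard deviation of the reference values divided by the RMSE. The function names are hypothetical; the numbers in the check are copied from the tables above.

```python
import math


def pearson_r(measured, estimated):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_e = sum(estimated) / n
    cov = sum((m - mean_m) * (e - mean_e) for m, e in zip(measured, estimated))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_m * var_e)


def rmse(measured, estimated):
    """Root mean square error of the estimates."""
    n = len(measured)
    return math.sqrt(sum((m - e) ** 2 for m, e in zip(measured, estimated)) / n)


def rpd(reference_sd, rmse_value):
    """Assumed definition: SD of the reference values divided by the RMSE."""
    return reference_sd / rmse_value


# Testing-set standard deviation from the descriptive statistics table (%).
TESTING_SD = 0.65

# (testing RMSE/%, reported RPD) for the first-derivative models.
reported = {
    "ACO-iPLS": (0.40, 1.63),
    "RF": (0.33, 1.98),
    "RF-SVM": (0.27, 2.41),
}

recomputed = {name: rpd(TESTING_SD, r) for name, (r, _) in reported.items()}
for name, value in recomputed.items():
    print(f"{name}: recomputed RPD {value:.2f}, reported {reported[name][1]}")
```

Under this assumed definition, the ratio of the testing-set standard deviation to each testing RMSE matches the reported RPD values for the three first-derivative models to within the rounding of the tabulated inputs.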
Figure 4. Selected wavelengths by RF-SVM with first derivative spectra.
Figure 6. Measured content and the values estimated by SVM model with simulated EO-1 Hyperion data.
Selected feature wavelengths and training sets and testing sets results by RF-SVM method with simulated EO-1 Hyperion data.
| Modeling methods | Selected wavelengths/nm | Training R | Training RMSE/% | Testing R | Testing RMSE/% | RPD |
|---|---|---|---|---|---|---|
| RF-SVM | 824, 813, 2,194, 2,426, 2,093, 702, 2,083, 712, 2,436, 803, 2,174, 2,214, 2,163, 2,416, 2,103, 722, 2,133, 1,810, 1,669, 2,123 | 0.96 | 0.23 | 0.79 | 0.19 | 1.61 |
Figure 7. Comparison of the measured content and the values estimated by different models.