Boqiang Xie, Jianli Ding, Xiangyu Ge, Xiaohang Li, Lijing Han, Zheng Wang.
Abstract
Soil organic carbon (SOC), as the largest carbon pool on the land surface, plays an important role in soil quality, ecological security and the global carbon cycle. Modeling strategies driven by multisource remote sensing data are not yet well understood for accurately mapping SOC. Here, we hypothesized that a modeling strategy driven by Sentinel-2A MultiSpectral Instrument (MSI) data would outperform one based on Landsat 8 Operational Land Imager (OLI) data because of the finer spatial and spectral resolutions of the Sentinel-2A MSI data. To test this hypothesis, the Ebinur Lake wetland in Xinjiang was selected as the study area. SOC was estimated from Sentinel-2A and Landsat 8 data combined with climatic variables, topographic factors, index variables and Sentinel-1A data, yielding a common variable model for each of Sentinel-2A and Landsat 8 and a full variable model for Sentinel-2A. We used three ensemble learning algorithms to assess the prediction performance of the modeling strategies: random forest (RF), gradient boosted decision tree (GBDT) and extreme gradient boosting (XGBoost). The results show that: (1) The Sentinel-2A model outperformed the Landsat 8 model in predicting SOC content, and the Sentinel-2A full variable model under the XGBoost algorithm achieved the best results (R² = 0.804, RMSE = 1.771, RPIQ = 2.687). (2) The full variable model of Sentinel-2A, with the addition of the red-edge bands and red-edge indices, improved R² by 6% and 3.2% over the common variable Landsat 8 and Sentinel-2A models, respectively. (3) In the SOC map of the Ebinur Lake wetland, areas with higher SOC content were mainly concentrated in the oasis, while the mountainous and lakeside areas had lower SOC content. Our results provide a program to monitor the sustainability of terrestrial ecosystems from a satellite perspective.
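The three accuracy measures quoted in the abstract can be sketched in plain Python. The definitions below are the conventional ones (R² as 1 - SSE/SST, RPIQ as the interquartile range of the observed values divided by RMSE). The abstract does not state which exact variants the authors used, so the function names and the quartile method are assumptions, and the sample values are made up.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations (g/kg for SOC)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination as 1 - SSE/SST (assumed variant)."""
    mean_obs = statistics.fmean(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


def rpiq(observed, predicted):
    """Ratio of performance to interquartile range: IQR(observed) / RMSE."""
    # The "inclusive" quartile method is an assumption; other methods shift the IQR slightly.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    return (q3 - q1) / rmse(observed, predicted)


# Hypothetical SOC values (g/kg), invented purely to exercise the functions.
obs = [2.0, 4.0, 6.0, 8.0, 10.0]
pred = [3.0, 4.0, 5.0, 8.0, 11.0]
print(rmse(obs, pred), r_squared(obs, pred), rpiq(obs, pred))
```

Under this definition of RPIQ, the best reported result (RMSE = 1.771, RPIQ = 2.687) would correspond to a validation-set interquartile range of about 1.771 × 2.687 ≈ 4.76 g/kg.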
Keywords: Landsat 8; Sentinel-1A; Sentinel-2A; digital soil mapping; ensemble learning algorithms; soil organic carbon
Year: 2022 PMID: 35408299 PMCID: PMC9003097 DOI: 10.3390/s22072685
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. The study area is located in the Xinjiang Uyghur Autonomous Region within China: (A) Xinjiang Uyghur Autonomous Region within China; (B) the Ebinur Lake basin; (C) Sentinel-2A image; (D) Landsat 8 image; (E) landscape around the Ebinur Lake; (F) farmland landscape within the oasis. Images A, C and D were created using the red, green and blue bands of remote sensing images.
Landsat 8 and Sentinel-2A data band information.
| Satellite Sensor Name | Band Name | Spectral Position (nm) | Central Wavelength (nm) | Original Resolution (m) |
|---|---|---|---|---|
| Sentinel-2A/MSI | B2-Blue | 458–523 | 490 | 10 |
| | B3-Green | 543–578 | 560 | 10 |
| | B4-Red | 650–680 | 665 | 10 |
| | B5-Red Edge 1 | 698–713 | 705 | 20 |
| | B6-Red Edge 2 | 733–748 | 740 | 20 |
| | B7-Red Edge 3 | 773–793 | 783 | 20 |
| | B8-NIR | 785–900 | 842 | 10 |
| | B8A-Red Edge 4 | 855–875 | 865 | 20 |
| | B11-SWIR1 | 1565–1655 | 1610 | 20 |
| | B12-SWIR2 | 2100–2280 | 2190 | 20 |
| Landsat 8/OLI | B2-Blue | 450–515 | 483 | 30 |
| | B3-Green | 525–600 | 560 | 30 |
| | B4-Red | 630–680 | 660 | 30 |
| | B5-NIR | 845–885 | 865 | 30 |
| | B6-SWIR1 | 1560–1660 | 1650 | 30 |
| | B7-SWIR2 | 2100–2300 | 2220 | 30 |
Sentinel-1A data information.
| Date | Sensor Mode | Polarization | Direction |
|---|---|---|---|
| 26 July 2017 | IW | VV | Ascending |
| 26 July 2017 | IW | VH | Ascending |
Spectral index information for Landsat 8 and Sentinel-2A data.
| Index | Formula | Sentinel-2A MSI Equation | Landsat 8 OLI Equation |
|---|---|---|---|
| NDVI | | | |
| VI | | | |
| VI | | | |
| VI | | | |
| TVI | | | |
| SAVI | | | |
| SATVI | | | |
| NBR2 | | | |
| BI | | | |
| BI2 | | | |
| RI | | | |
| CI | | | |
| LSWI | | | |
| MSI | | | |
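To illustrate how a single index maps onto the two sensors, the sketch below computes NDVI with its standard definition, (NIR - Red)/(NIR + Red), using band names from the band information table. The sensor-specific equations are not reproduced in this record, so pairing Sentinel-2A B8 and Landsat 8 B5 as the NIR band is an assumption, and the function and dictionary names are hypothetical.

```python
# Assumed red/NIR band pairing per sensor, based on the band information table.
NDVI_BANDS = {
    "Sentinel-2A": {"red": "B4", "nir": "B8"},
    "Landsat 8": {"red": "B4", "nir": "B5"},
}


def ndvi(reflectance, sensor):
    """Standard NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    `reflectance` maps band names to surface reflectance values.
    """
    bands = NDVI_BANDS[sensor]
    red = reflectance[bands["red"]]
    nir = reflectance[bands["nir"]]
    if nir + red == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / (nir + red)


# Hypothetical reflectances for a single vegetated pixel.
pixel_s2 = {"B4": 0.08, "B8": 0.40}
pixel_l8 = {"B4": 0.08, "B5": 0.40}
print(ndvi(pixel_s2, "Sentinel-2A"), ndvi(pixel_l8, "Landsat 8"))
```

Keeping the band pairing in a lookup table means the same index function serves both sensors, which is one way to build the shared index variables of a common variable model.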
The newly proposed red-edge spectral indices.
| Index | Formula | Sentinel-2 MSI Equation |
|---|---|---|
| NDVI re1 | | |
| NDVI re2 | | |
| NDVI re3 | | |
| EVI re1 | | |
| EVI re2 | | |
| EVI re3 | | |
| DVI re1 | | |
| DVI re2 | | |
| DVI re3 | | |
| RVI re1 | | |
| RVI re2 | | |
| RVI re3 | | |
| TVI re1 | | |
| TVI re2 | | |
| TVI re3 | | |
| SAVI re1 | | |
| SAVI re2 | | |
| SAVI re3 | | |
| SATVI re1 | | |
| SATVI re2 | | |
| SATVI re3 | | |
| BI re1 | | |
| BI re2 | | |
| BI re3 | | |
| BI2 re1 | | |
| BI2 re2 | | |
| BI2 re3 | | |
| RI re1 | | |
| RI re2 | | |
| RI re3 | | |
| CI re1 | | |
| CI re2 | | |
| CI re3 | | |
| NDWI re1 | | |
| MSI re1 | | |
Details of the Landsat 8 and Sentinel-2A modeling strategies.
| Model Name | Variable Combinations |
|---|---|
| Model A-I | Landsat 8(6band) |
| Model A-II | Landsat 8(6band) + Spectral index |
| Model A-III | Landsat 8(6band) + Spectral index + Climate variables + Topographic variables |
| Model A-IV | Landsat 8(6band) + Spectral index + Climate variables + Topographic variables + Sentinel-1A |
| Model B-I | Sentinel-2A(6band) |
| Model B-II | Sentinel-2A(6band) + Spectral index |
| Model B-III | Sentinel-2A(6band) + Spectral index + Climate variables + Topographic variables |
| Model B-IV | Sentinel-2A(6band) + Spectral index + Climate variables + Topographic variables + Sentinel-1A |
| Model C-I | Sentinel-2A(10band) |
| Model C-II | Sentinel-2A(10band) + Spectral index + Red-edge index |
| Model C-III | Sentinel-2A(10band) + Spectral index + Red-edge index + Climate variables + Topographic variables |
| Model C-IV | Sentinel-2A(10band) + Spectral index + Red-edge index + Climate variables + Topographic variables + Sentinel-1A |
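The twelve strategies in the table follow a regular pattern: three band sets (A, B, C) crossed with four cumulative stages (I to IV), with the red-edge indices added only to series C from stage II onward. A minimal sketch of that pattern is shown below. The group labels are placeholder strings rather than the authors' variable names, and the individual variables inside each group are not listed.

```python
# Cumulative variable groups added at each stage (placeholder labels).
STAGES = [
    [],                                                           # I: bands only
    ["spectral_index"],                                           # II
    ["spectral_index", "climate", "topographic"],                 # III
    ["spectral_index", "climate", "topographic", "sentinel_1a"],  # IV
]

# Band set per series, plus groups that only this series adds.
SERIES = {
    "A": ("landsat8_6band", []),
    "B": ("sentinel2a_6band", []),
    "C": ("sentinel2a_10band", ["red_edge_index"]),  # red-edge from C-II onward
}

ROMAN = ["I", "II", "III", "IV"]


def build_strategies():
    """Return {model name: ordered list of variable groups}."""
    strategies = {}
    for series, (bands, extra) in SERIES.items():
        for numeral, stage in zip(ROMAN, STAGES):
            groups = [bands] + stage
            if stage and extra:
                # Place the series-specific indices right after the spectral indices.
                groups = [bands, stage[0]] + extra + stage[1:]
            strategies[f"Model {series}-{numeral}"] = groups
    return strategies


for name, groups in build_strategies().items():
    print(name, "=", " + ".join(groups))
```

Series A and B share the same stages and differ only in the sensor, which is what makes them directly comparable, while series C isolates the contribution of the extra bands and red-edge indices.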
Descriptive statistics for the entire SOC dataset, the training dataset and the validation dataset.
| Dataset | Sample Size | Minimum (g/kg) | Maximum (g/kg) | Median (g/kg) | Mean (g/kg) | Standard Deviation (g/kg) |
|---|---|---|---|---|---|---|
| Whole dataset | 95 | 1.487 | 25.015 | 7.715 | 7.723 | 4.380 |
| Training dataset | 66 | 1.487 | 25.015 | 7.889 | 7.905 | 4.582 |
| Validation dataset | 29 | 1.699 | 17.245 | 7.036 | 7.310 | 3.926 |
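The table reports 66 training and 29 validation samples out of 95, roughly a 70/30 split. How the samples were partitioned is not stated here. The sketch below shows one simple possibility, a seeded random shuffle, where the seed, the function name and the use of integer sample IDs are illustrative assumptions.

```python
import random


def split_samples(sample_ids, n_train=66, seed=0):
    """Randomly partition sample IDs into training and validation lists.

    n_train=66 mirrors the table; the random shuffle and the seed are
    assumptions, not the documented procedure.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)  # local RNG, so global random state is untouched
    return ids[:n_train], ids[n_train:]


train_ids, valid_ids = split_samples(range(95))
print(len(train_ids), len(valid_ids), round(len(train_ids) / 95, 3))
```

A purely random split does not guarantee that both subsets cover the full SOC range. In the table the validation maximum (17.245) is well below the training maximum (25.015), so a stratified or rank-based split would be a reasonable alternative to consider.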
Figure 2. (a) Importance of variables in the Landsat 8 model; (b) importance of variables in the Sentinel-2A model.
Predictive performances of the models.
| Modeling Technique | Model Name | All Variables: R² | All Variables: RMSE (g/kg) | All Variables: RPIQ | Top 5 Variables: R² | Top 5 Variables: RMSE (g/kg) | Top 5 Variables: RPIQ |
|---|---|---|---|---|---|---|---|
| RF | Model A-I | 0.583 | 2.781 | 1.711 | 0.606 | 2.474 | 1.924 |
| | Model A-II | 0.633 | 2.692 | 1.768 | 0.648 | 2.343 | 2.031 |
| | Model A-III | 0.627 | 2.640 | 1.803 | 0.661 | 2.299 | 2.070 |
| | Model A-IV | 0.681 | 2.447 | 1.945 | 0.709 | 2.141 | 2.223 |
| | Model B-I | 0.615 | 2.660 | 1.789 | 0.624 | 2.617 | 1.818 |
| | Model B-II | 0.632 | 2.502 | 1.902 | 0.655 | 2.426 | 1.962 |
| | Model B-III | 0.569 | 2.537 | 1.876 | 0.685 | 2.179 | 2.184 |
| | Model B-IV | 0.701 | 2.401 | 1.982 | 0.736 | 2.067 | 2.303 |
| | Model C-I | 0.615 | 2.596 | 1.833 | 0.654 | 2.252 | 1.885 |
| | Model C-II | 0.693 | 2.405 | 1.979 | 0.694 | 2.230 | 2.135 |
| | Model C-III | 0.640 | 2.387 | 1.994 | 0.713 | 2.148 | 2.216 |
| | Model C-IV | 0.705 | 2.106 | 2.260 | 0.744 | 2.005 | 2.374 |
| GBDT | Model A-I | 0.531 | 2.630 | 1.810 | 0.641 | 2.393 | 1.976 |
| | Model A-II | 0.689 | 2.463 | 1.933 | 0.695 | 2.237 | 2.128 |
| | Model A-III | 0.670 | 2.369 | 2.009 | 0.704 | 2.236 | 2.129 |
| | Model A-IV | 0.671 | 2.374 | 2.004 | 0.723 | 2.132 | 2.233 |
| | Model B-I | 0.626 | 2.483 | 1.917 | 0.654 | 2.367 | 2.010 |
| | Model B-II | 0.649 | 2.364 | 2.014 | 0.699 | 2.244 | 2.121 |
| | Model B-III | 0.681 | 2.229 | 2.135 | 0.713 | 2.110 | 2.255 |
| | Model B-IV | 0.708 | 2.132 | 2.232 | 0.753 | 2.057 | 2.334 |
| | Model C-I | 0.659 | 2.347 | 2.028 | 0.682 | 2.238 | 2.126 |
| | Model C-II | 0.663 | 2.370 | 2.008 | 0.708 | 2.190 | 2.174 |
| | Model C-III | 0.687 | 2.267 | 2.100 | 0.727 | 2.084 | 2.284 |
| | Model C-IV | 0.751 | 2.104 | 2.262 | 0.772 | 1.965 | 2.423 |
| XGBoost | Model A-I | 0.600 | 2.483 | 1.917 | 0.637 | 2.327 | 2.045 |
| | Model A-II | 0.677 | 2.394 | 1.988 | 0.702 | 2.155 | 2.209 |
| | Model A-III | 0.693 | 2.420 | 1.966 | 0.726 | 2.124 | 2.241 |
| | Model A-IV | 0.701 | 2.291 | 2.077 | 0.759 | 2.003 | 2.376 |
| | Model B-I | 0.685 | 2.236 | 2.129 | 0.701 | 2.175 | 2.188 |
| | Model B-II | 0.693 | 2.342 | 2.033 | 0.722 | 2.242 | 2.223 |
| | Model B-III | 0.712 | 2.111 | 2.254 | 0.754 | 1.987 | 2.395 |
| | Model B-IV | 0.735 | 2.037 | 2.337 | 0.788 | 1.921 | 2.477 |
| | Model C-I | 0.694 | 2.290 | 2.079 | 0.727 | 2.119 | 2.246 |
| | Model C-II | 0.715 | 2.161 | 2.203 | 0.749 | 2.000 | 2.380 |
| | Model C-III | 0.726 | 2.028 | 2.347 | 0.786 | 1.830 | 2.600 |
| | Model C-IV | 0.771 | 1.899 | 2.506 | 0.804 | 1.771 | 2.687 |
Figure 3. (a) Spatial distribution of SOC content predicted using the Landsat 8 model; (b) spatial distribution of SOC content predicted using the full variable Sentinel-2A model; (a-I), (a-II), (b-I) and (b-II) show specific details in an area for comparison.