Xiaoting Zhou1, Weicheng Wu1, Ziyu Lin1, Guiliang Zhang2, Renxiang Chen2, Yong Song2, Zhiling Wang2, Tao Lang2, Yaozu Qin1, Penghui Ou1, Wenchao Huangfu1, Yang Zhang1, Lifeng Xie1, Xiaolan Huang1, Xiao Fu1, Jie Li1, Jingheng Jiang1, Ming Zhang1, Yixuan Liu1, Shanling Peng1, Chongjian Shao1, Yonghui Bai1, Xiaofeng Zhang3, Xiangtong Liu4, Wenheng Liu1.
Abstract
Landslides are one of the major geohazards threatening human society. The objective of this study was to conduct a landslide hazard susceptibility assessment for Ruijin, Jiangxi, China, and to provide technical support to the local government for implementing disaster reduction and prevention measures. Machine learning approaches, e.g., random forests (RFs) and support vector machines (SVMs), were employed, and multiple geo-environmental factors, such as land cover, NDVI, landform, rainfall, lithology, and proximity to faults, roads, and rivers, were used as predictors. For categorical factors, three processing approaches were proposed: simple numerical labeling (SNL), weight assignment (WA)-based, and frequency ratio (FR)-based. The 19 geo-environmental factors were then converted to raster format under each approach to constitute three 19-band datasets, DS1, DS2, and DS3, one per processing approach. Next, 155 observed landslides that occurred in the past decades were vectorized, of which 70% were randomly selected to compose a training set (TS1) and the remaining 30% to form a validation set (VS1). Non-landslide (no-risk) samples distributed across the whole study area were identified in low-slope (<1-3°) zones such as urban areas and croplands, and were added to TS1 and VS1 in the same ratio. For comparison, we used the FR approach to identify no-risk samples in both flat and non-flat areas, and merged them with the field-observed landslides to constitute another pair of training and validation sets (TS2 and VS2) using the same ratio of 7:3. The RF algorithm was applied to model the probability of landslide occurrence using DS1, DS2, and DS3 as predictive variables and TS1 and TS2 for training to obtain the SNL-based, WA-based, and FR-based RF models, respectively.
Verified against VS1 and VS2, the three models have similar overall accuracy (OA), 89.61%, 91.47%, and 94.54%, and Kappa coefficient (KC), 0.7926, 0.8299, and 0.8908, respectively. All of them are much better than the three models obtained with the SVM algorithm, which have OA of 81.79%, 82.86%, and 83%, and KC of 0.6337, 0.655, and 0.660. New case verification with 26 recent landslide events from 2017-2020 revealed that the landslide susceptibility map from WA-based RF modeling properly identified the high and very high susceptibility zones where 23 new landslides had occurred, and performed better than the SNL-based and FR-based RF modeling, although the FR-based model has slightly higher OA and KC. Hence, we concluded that all three RF models achieve reasonable risk prediction, but WA-based and FR-based RF modeling deserve a recommendation for application elsewhere. The results of this study may serve as a reference for the local authorities in the prevention and early warning of landslide hazards.
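The FR-based processing of categorical factors mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the formula, so the code below assumes the commonly used definition of the frequency ratio (share of landslides in a class divided by share of area in that class) and assumes that each class label is then replaced by its FR value. The function name `frequency_ratio`, the class names, and the toy raster are hypothetical and are not taken from the paper.

```python
from collections import Counter

def frequency_ratio(class_of_cell, landslide_cells):
    """Return the frequency ratio (FR) of every class of one factor.

    class_of_cell   -- list with one class label per raster cell
    landslide_cells -- indices of the cells that contain a landslide
    Assumed definition: FR = (landslides in class / all landslides)
                             / (cells in class / all cells)
    """
    area = Counter(class_of_cell)
    slides = Counter(class_of_cell[i] for i in landslide_cells)
    total_area = len(class_of_cell)
    total_slides = len(landslide_cells)
    return {
        c: (slides.get(c, 0) / total_slides) / (area[c] / total_area)
        for c in area
    }

# Hypothetical toy raster: 6 "granite" cells and 4 "shale" cells.
cells = ["granite"] * 6 + ["shale"] * 4
# Hypothetical landslide locations: 1 in granite, 3 in shale.
slide_idx = [0, 6, 7, 8]

fr = frequency_ratio(cells, slide_idx)

# Assumed recoding step: each categorical label becomes its FR value,
# giving a numeric band that a classifier can consume.
fr_band = [fr[c] for c in cells]
```

Under this definition, an FR above 1 means the class hosts more landslides than its area share would suggest, and an FR below 1 means fewer.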
Keywords: geo-environmental factor quantification; landslide; random forest; susceptibility zoning
Year: 2021 PMID: 34072874 PMCID: PMC8199194 DOI: 10.3390/ijerph18115906
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Location of the study area, Ruijin County, Jiangxi, China, and location of the training and validation sites of landslides in the study area.
Figure 2. Methodological flowchart.
Figure 3. Geo-environmental factors 1: (a) NDVI and (b) rivers.
Figure 4. Geo-environmental factors 2: (a) lithology and (b) faults.
Figure 5. Geo-environmental factors 3: (a) slope and (b) aspect.
Figure 6. Geo-environmental factors 4: (a) land use/cover and (b) roads.
Figure 7. Landslide susceptibility index (LSI) of the study area and distribution of the non-landslide points.
Figure 8. Landslide susceptibility zonation maps of Ruijin: (a) from the simple numeric labeling (SNL)-based RF modeling; (b) from the weight assignment (WA)-based RF modeling; and (c) from the frequency ratio (FR)-based RF modeling.
Distribution of landslides within different susceptibility levels.
| Susceptibility Level | Area (km²), SNL | Area (km²), WA | Area (km²), FR | Area (%), SNL | Area (%), WA | Area (%), FR | Historical Landslides (number), SNL | Historical Landslides (number), WA | Historical Landslides (number), FR | Historical Landslides (%), SNL | Historical Landslides (%), WA | Historical Landslides (%), FR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Very High | 118.72 | 107.13 | 135.32 | 4.86 | 4.39 | 5.13 | 132 | 137 | 135 | 85.16 | 88.39 | 87.10 |
| High | 437.27 | 363.78 | 212.66 | 17.92 | 14.91 | 12.70 | 18 | 14 | 14 | 11.61 | 9.03 | 9.03 |
| Medium | 665.71 | 545.69 | 364.47 | 27.28 | 22.56 | 18.79 | 3 | 1 | 5 | 1.94 | 0.65 | 3.23 |
| Low | 726.33 | 745.11 | 679.71 | 29.76 | 30.53 | 25.27 | 1 | 2 | 1 | 0.65 | 1.29 | 0.65 |
| Very Low | 492.35 | 678.68 | 1048.24 | 20.18 | 27.81 | 38.12 | 1 | 1 | 0 | 0.65 | 0.65 | 0.00 |
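The landslide percentages in this table follow directly from the landslide counts and the 155 observed landslides. A short sketch of that arithmetic is given here, using the counts from the table; the variable names are illustrative, and the combined "High plus Very High" share is a derived quantity that the table itself does not report.

```python
# Historical landslide counts per susceptibility level, copied from the table.
levels = ["Very High", "High", "Medium", "Low", "Very Low"]
counts = {
    "SNL": [132, 18, 3, 1, 1],
    "WA": [137, 14, 1, 2, 1],
    "FR": [135, 14, 5, 1, 0],
}

def shares(level_counts):
    """Percentage of landslides per level, rounded to two decimals."""
    total = sum(level_counts)
    return [round(100 * n / total, 2) for n in level_counts]

# Per-level percentages for each RF model.
pct = {model: shares(c) for model, c in counts.items()}

# Derived quantity: share of landslides in the High and Very High zones.
top_two = {
    model: round(100 * (c[0] + c[1]) / sum(c), 2)
    for model, c in counts.items()
}

for model in counts:
    print(model, dict(zip(levels, pct[model])), "High + Very High:", top_two[model])
```

Each model's counts sum to 155, and the recomputed percentages match the landslide percentage columns of the table.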
Figure 9. Out-of-bag (OOB) error plot versus number of trees (NT) with RF modeling: (a) simple numeric labeling (SNL)-based RF modeling using TS1, (b) weight assignment (WA)-based RF modeling using TS1, and (c) frequency ratio (FR)-based RF modeling using TS2.
Figure 10. Frequency ratio (FR) of each geo-environmental factor: (a) distance to roads; (b) distance to rivers; (c) distance to lithostratigraphic boundaries; (d) slope; (e) elevation; and (f) NDVI.
Figure 11. Importance (%) of the geo-environmental factors in landslide events from different random forest (RF) modeling.
Performance of the RF and SVM algorithms vs. validation sets (VS1 and VS2).
| Item | SNL-Based RF Model (VS1) | WA-Based RF Model (VS1) | FR-Based RF Model (VS2) | SNL-Based SVM Model (VS1) | WA-Based SVM Model (VS1) | FR-Based SVM Model (VS2) |
|---|---|---|---|---|---|---|
| Precision (%) | 94.67 | 95.00 | 94.00 | 83.33 | 84.67 | 92.67 |
| Recall (%) | 85.54 | 88.67 | 95.27 | 82.78 | 83.55 | 77.65 |
| KC (%) | 79.26 | 82.99 | 89.08 | 63.37 | 65.50 | 66.00 |
| OA (%) | 89.61 | 91.49 | 94.54 | 81.79 | 82.86 | 83.00 |
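For readers unfamiliar with the four indicators in this table, the following sketch shows how precision, recall, overall accuracy (OA), and Cohen's Kappa coefficient (KC) are conventionally computed from a binary confusion matrix. The confusion-matrix counts are hypothetical, chosen only to make the arithmetic easy to follow; they are not the paper's validation results, and the record does not state which class the authors treated as positive.

```python
def binary_metrics(tp, fp, fn, tn):
    """Standard indicators from a 2x2 confusion matrix.

    tp/fp/fn/tn -- true positives, false positives,
                   false negatives, true negatives
    """
    n = tp + fp + fn + tn
    oa = (tp + tn) / n                      # overall accuracy
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (oa - pe) / (1 - pe)            # Cohen's Kappa
    return {"precision": precision, "recall": recall, "oa": oa, "kappa": kappa}

# Hypothetical counts for 100 validation samples (not from the paper).
m = binary_metrics(tp=40, fp=5, fn=10, tn=45)
print({k: round(100 * v, 2) for k, v in m.items()})
```

Because Kappa discounts the agreement expected by chance, it is lower than OA for the same confusion matrix, which is consistent with the KC row sitting below the OA row for every model in the table.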
Figure 12. Prediction of the landslide susceptibility and case verification: (a) from simple numeric labeling (SNL)-based RF modeling; (b) from weight assignment (WA)-based RF modeling; (c) from frequency ratio (FR)-based RF modeling; (d) landslide behind the No. 6 Middle School of Ruijin; and (e) bulges on the side wall feet of the Longzhu Temple.