Yue Wang1, Deliang Sun1, Haijia Wen2,3,4, Hong Zhang1, Fengtai Zhang5.
Abstract
To compare the random forest (RF) model and the frequency ratio (FR) model for landslide susceptibility mapping (LSM), this research selected Yunyang County as the study area because of its frequent natural disasters, especially landslides. A landslide inventory was built from historical records, satellite images, and extensive field surveys. Subsequently, a geospatial database was established based on 987 historical landslides in the study area. All the landslides were then randomly divided into two datasets: 70% were used as the training dataset and 30% as the test dataset. Furthermore, under five primary conditioning factors (i.e., topography factors, geological factors, environmental factors, human engineering activities, and triggering factors), 22 secondary conditioning factors were selected to form an evaluation factor library for analyzing landslide susceptibility. On this basis, the RF model training and the FR model mathematical analysis were performed, and the established models were used for the landslide susceptibility simulation over the entire area of Yunyang County. Next, based on the analysis results, the susceptibility maps were divided into five classes: very low, low, medium, high, and very high. In addition, the importance of conditioning factors was ranked and the influence of landslides was explored by using the RF model. The area under the curve (AUC) of the receiver operating characteristic (ROC) curve, precision, accuracy, and recall were used to analyze the predictive ability of the two LSM models. The results indicated a difference in performance between the two models: the RF model (AUC = 0.988) performed better than the FR model (AUC = 0.716). Moreover, compared with the FR model, the RF model showed a higher degree of coincidence between the areas in the high and the very low susceptibility classes, on the one hand, and the geographical spatial distribution of historical landslides, on the other. Therefore, it was concluded that the RF model was more suitable for landslide susceptibility evaluation in Yunyang County because of its model performance, reliability, and stability. The outcome also provided a theoretical basis for the application of machine learning techniques (e.g., RF) in landslide prevention, mitigation, and urban planning, so as to respond adequately to the increasing demand for effective and low-cost tools in landslide susceptibility assessments.
Keywords: Yunyang County; frequency ratio model; landslide susceptibility; random forest model
Year: 2020 PMID: 32545618 PMCID: PMC7345078 DOI: 10.3390/ijerph17124206
Source DB: PubMed Journal: Int J Environ Res Public Health ISSN: 1660-4601 Impact factor: 3.390
Figure 1. Location of the study area.
Figure 2. Rainfall and temperature distribution in Yunyang County (2009–2018).
Figure 3. Landslide inventory map of the study area.
Figure 4. The scale of landslide type and trigger.
Figure 5. Typical landslides: (a) Dashiban Landslide (1 April 2014); (b) Jiuxingping Landslide (15 June 2012); and (c) Liekou Mountain Landslide (1 April 2014).
Classification of conditioning factors.
| Factor | Type | Classification |
|---|---|---|
| Elevation/m | Continuous | (1) <340; (2) 340~543; (3) 543~690; (4) 690~832; (5) 832~951; (6) 951~1053; (7) 1053~1144; (8) 1144~1302; (9) 1302~1556; (10) 1556~1654; (11) >1654 |
| Slope/° | Continuous | (1) <5; (2) 5~10; (3) 10~15; (4) 15~20; (5) 20~25; (6) 25~30; (7) 30~35; (8) 35~40; (9) >40 |
| RDLS/m | Continuous | (1) <20; (2) 20~30; (3) 30~40; (4) 40~50; (5) 50~80; (6) 80~120; (7) >120 |
| Aspect | Categorical | (1) Flat; (2) North; (3) Northeast; (4) East; (5) Southeast; (6) South; (7) Southwest; (8) West; (9) Northwest |
| Slope position | Categorical | (1) Ridge; (2) Upper slope; (3) Middle slope; (4) Flats slope; (5) Lower slope; (6) Valley |
| Micro-landform | Categorical | (1) Canyons, and Deeply incised streams; (2) Midslope drainages, and shallow valleys; (3) Upland drainages, and Headwaters; (4) U-shape valleys; (5) Plains; (6) Open slopes; (7) Upper slopes, and Plateau; (8) Local ridges hills in valleys; (9) Midslope ridges, and Small hills in plains; (10) Mountain tops, and High narrow ridges |
| Curvature | Continuous | (1) <−1; (2) −1~−0.5; (3) −0.5~0; (4) 0~0.5; (5) 0.5~1; (6) >1 |
| Profile Curvature | Continuous | (1) <−1; (2) −1~−0.5; (3) −0.5~0; (4) 0~0.5; (5) 0.5~1; (6) >1 |
| Plan Curvature | Continuous | (1) <−1; (2) −1~−0.5; (3) −0.5~0; (4) 0~0.5; (5) 0.5~1; (6) >1 |
| TRI | Continuous | (1) <1.05; (2) 1.05~1.1; (3) 1.1~1.15; (4) 1.15~1.2; (5) >1.2 |
| TWI | Continuous | (1) <4; (2) 4~6; (3) 6~8; (4) 8~10; (5) >10 |
| STI | Continuous | (1) <20; (2) 20~40; (3) 40~70; (4) 70~100; (5) 100~200; (6) >200 |
| SPI | Continuous | (1) <15; (2) 15~30; (3) 30~45; (4) 45~60; (5) 60~100; (6) 100~1000; (7) >1000 |
| Lithology | Categorical | (1) J3s, J3p, J3zj, J3D; (2) J2xs, J2s; (3) J2z, J1-2z, J1z, J2x, J2zs, J1b-j2Q; (4) T3xj, T3zj, T3z2; (5) T2b2, T2b; (6) T1d; (7) T1-2j1, T1-2j2, T1-2j3, T1j; (8) P2 1+w, P2 l-d, P2; (9) P1 m + g, P1 l + q, P1, C; (10) O |
| Distance from fault/m | Continuous | (1) <500; (2) 500~1000; (3) 1000~1500; (4) 1500~2000; (5) 2000~2500; (6) 2500~3000; (7) >3000 |
| CRDS | Categorical | (1) Dip-slope I; (2) Dip-slope II; (3) Outward slope; (4) Oblique slope; (5) Tangential slope; (6) Reverse slope; (7) Flat |
| NDVI | Continuous | (1) 0~0.1; (2) 0.1~0.15; (3) 0.15~0.2; (4) 0.2~0.25; (5) >0.25 |
| Distance from rivers/m | Continuous | (1) <100; (2) 100~200; (3) 200~300; (4) 300~400; (5) 400~500; (6) 500~600; (7) >600 |
| Annual average rainfall/mm | Continuous | (1) <1221; (2) 1221~1251; (3) 1251~1276; (4) 1276~1308; (5) 1308~1343; (6) 1343~1389; (7) 1389~1440; (8) >1440 |
| Land cover | Categorical | (1) Meadow; (2) Farmland; (3) Water area; (4) Forest; (5) Garden plot; (6) Others; (7) Residential land; (8) Transportation |
| Distance from roads/m | Continuous | (1) <100; (2) 100~200; (3) 200~300; (4) 300~400; (5) 400~500; (6) 500~600; (7) >600 |
| POI kernel density | Continuous | (1) 0–1; (2) 1–2; (3) 2–3; (4) 3–4; (5) 4–5; (6) 5–10; (7) >10 |
Figure 6. Thematic maps of topographic factors: (a) Elevation; (b) relief degree of land surface (RDLS); (c) Slope; (d) Aspect; (e) Slope position; (f) Curvature; (g) Plan Curvature; (h) Profile Curvature; (i) Micro-landform; (j) topographic wetness index (TWI); (k) terrain roughness index (TRI); (l) sediment transport index (STI); (m) stream power index (SPI).
Figure 7. Thematic maps of geological factors: (a) Lithology; (b) Distance from fault; (c) The combination reclassification of stratum dip direction and slope aspect (CRDS).
Figure 8. Thematic maps of environmental factors: (a) normalized difference vegetation index (NDVI); (b) Land cover.
Figure 9. Thematic maps of triggering factors: (a) Annual average rainfall; (b) Distance from rivers.
Figure 10. Thematic maps of human engineering activities: (a) Point of interest (POI) kernel density; (b) Distance from the road.
Data and data sources.
| Data Name | Data Sources | Type | Scale |
|---|---|---|---|
| Historical landslide | Chongqing Geological Monitoring Station | Dataset | |
| Elevation | Aster satellite | Grid | 30 m |
| Geological data | National Geological Data Center | Grid | 1:200,000 |
| Land cover | Chongqing Municipal Bureau of Land and Resources | Vector | 1:100,000 |
| Administrative division | Chongqing Municipal Bureau of Land and Resources | Vector | 1:100,000 |
| River network | Chongqing Water Resources Bureau | Vector | 1:100,000 |
| Satellite image | Geospatial Data Cloud platform | Grid | 30 m |
| Annual rainfall | Chongqing Meteorological Administration | Dataset | 90 m |
| Road | Chongqing Transportation Commission | Vector | 1:100,000 |
| POI of Chongqing | Web Crawler | Dataset | |
Figure 11. The methodological framework of the study.
Figure 12. The schematic diagram of the RF algorithm.
Figure 13. The error rate of the overall RF model (black line: out-of-bag (OOB); red line: without landslide; green line: with landslide).
Explanation of statistical-index-based evaluations.
| No | Metric | Equation | Definition |
|---|---|---|---|
| 1 | Precision | $\frac{TP}{TP + FP}$ | The fraction of relevant instances in the retrieved instances. |
| 2 | Sensitivity (SST) | $\frac{TP}{TP + FN}$ | The percentage of landslide cells that are correctly classified. |
| 3 | Specificity (SPF) | $\frac{TN}{TN + FP}$ | The percentage of non-landslide cells that are correctly classified. |
| 4 | Accuracy (ACC) | $\frac{TP + TN}{TP + FP + FN + TN}$ | The proportion of landslide and non-landslide cells which are correctly classified. |
| 5 | Recall | $\frac{TP}{TP + FN}$ | It indicates how many positive examples in the sample are predicted correctly. |
$TP$ is the number of correctly predicted landslide cells. $FP$ is the sum of cells of non-landslides that are classified as landslide. $FN$ is the sum of cells of landslides that are classified as non-landslide. $TN$ is the number of correctly predicted non-landslide cells. $TP + FP + FN + TN$ is the sum of landslides and non-landslides.
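As a worked check of these definitions, the following minimal Python sketch computes the five metrics from confusion-matrix counts. The function name `classification_metrics` is an illustrative assumption and not part of the study. The counts are those of the RF confusion matrix reported in this article, read with rows as predicted conditions (907 true positives, 9 false positives, 80 false negatives, 9861 true negatives).

```python
def classification_metrics(tp, fp, fn, tn):
    """Return the statistical-index metrics for a binary confusion matrix.

    tp: landslide cells predicted as landslide
    fp: non-landslide cells predicted as landslide
    fn: landslide cells predicted as non-landslide
    tn: non-landslide cells predicted as non-landslide
    """
    total = tp + fp + fn + tn
    return {
        "precision": tp / (tp + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        # Recall uses the same ratio as sensitivity.
        "recall": tp / (tp + fn),
    }


# Counts from the RF confusion matrix in this article.
rf_metrics = classification_metrics(tp=907, fp=9, fn=80, tn=9861)
for name, value in rf_metrics.items():
    print(f"{name}: {value:.3f}")
```

Rounded to three decimals, these values agree with the precision (0.990), recall (0.919 and 0.999), and accuracy (0.992) listed for the RF model.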
The accuracy of 10-fold cross-validation of the RF model.
| Subset | Training Accuracy | Testing Accuracy | Subset | Training Accuracy | Testing Accuracy |
|---|---|---|---|---|---|
| 1 | 1.000 | 0.900 | 6 | 1.000 | 0.902 |
| 2 | 1.000 | 0.904 | 7 | 1.000 | 0.906 |
| 3 | 1.000 | 0.909 | 8 | 1.000 | 0.918 |
| 4 | 1.000 | 0.915 | 9 | 1.000 | 0.885 |
| 5 | 1.000 | 0.916 | 10 | 1.000 | 0.916 |
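The ten folds can be summarized with a few lines of Python. This is only a sketch over the values tabulated above; the summary statistics themselves (mean, range, training/testing gap) are derived here and are not reported in the source.

```python
from statistics import mean

# Per-fold accuracies copied from the 10-fold cross-validation table.
training_accuracy = [1.000] * 10
testing_accuracy = [0.900, 0.904, 0.909, 0.915, 0.916,
                    0.902, 0.906, 0.918, 0.885, 0.916]

mean_test = mean(testing_accuracy)
# Gap between mean training and mean testing accuracy.
gap = mean(training_accuracy) - mean_test

print(f"mean testing accuracy: {mean_test:.4f}")
print(f"range: {min(testing_accuracy):.3f} to {max(testing_accuracy):.3f}")
print(f"training/testing gap: {gap:.4f}")
```

The testing accuracy stays within a narrow band across folds, while the training accuracy is 1.000 in every fold.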
Figure 14. Landslide susceptibility map in the RF model: (a) Enlarged area of the valley; (b) Enlarged area along the river.
Statistical results of landslide susceptibility in different classes of the RF model.
| Landslide Probability | Susceptibility Class | Grid Number | Area Proportion | Landslides | Landslide Proportion | Landslide Density (pcs/km²) |
|---|---|---|---|---|---|---|
| <0.06 | Very low | 1,373,501 | 34.1% | 26 | 2.6% | 0.021 |
| 0.06–0.12 | Low | 1,147,495 | 28.5% | 58 | 5.9% | 0.056 |
| 0.12–0.21 | Medium | 991,347 | 24.6% | 138 | 14.0% | 0.155 |
| 0.21–0.31 | High | 396,368 | 9.8% | 147 | 14.9% | 0.412 |
| >0.31 | Very high | 118,277 | 2.9% | 618 | 62.6% | 5.806 |
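The density column (landslides per km²) can be reproduced from the grid and landslide counts, as the sketch below suggests. It assumes each grid cell is 30 m × 30 m, matching the 30 m elevation grid listed under data sources; the cell size used for this table is not stated explicitly, so treat it as an assumption. The names `CELL_SIZE_M` and `landslide_density` are illustrative.

```python
CELL_SIZE_M = 30.0  # assumption: 30 m grid cells, as for the elevation data
CELL_AREA_KM2 = CELL_SIZE_M ** 2 / 1e6  # 900 m^2 expressed in km^2

# (susceptibility class, grid number, landslide count) from the RF table.
rf_classes = [
    ("Very low", 1_373_501, 26),
    ("Low", 1_147_495, 58),
    ("Medium", 991_347, 138),
    ("High", 396_368, 147),
    ("Very high", 118_277, 618),
]


def landslide_density(cells, landslides, cell_area_km2=CELL_AREA_KM2):
    """Landslides per square kilometre for one susceptibility class."""
    return landslides / (cells * cell_area_km2)


densities = {
    name: round(landslide_density(cells, count), 3)
    for name, cells, count in rf_classes
}
total_landslides = sum(count for _, _, count in rf_classes)

print(densities)
print(total_landslides)
```

Under the 30 m assumption the computed densities match the tabulated values to three decimals, and the landslide counts sum to the 987 landslides in the inventory.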
Classification and FR of conditioning factors.
| Factor | Type | Classification/Grid Number/FR |
|---|---|---|
| Elevation/m | Continuous | (1) <340/647744/1.8208; (2) 340~543/99121/1.1279; (3) 543~690/783868/1.0606; (4) 690~832/614750/0.7927; (5) 832~951/390895/0.5657; (6) 951~1053/235266/0.5048; (7) 1053~1144/142794/0.3728; (8) 1144~1302/140356/0.2042; (9) 1302~1556/75487/0.0543; (10) 1556~1654/13725/0.0000; (11) >1654/5892/0.0000 |
| Slope/° | Continuous | (1) <5/231687/0.5303; (2) 5~10/562181/0.8013; (3) 10~15/747696/1.2652; (4) 15~20/743059/1.2952; (5) 20~25/638172/1.1743; (6) 25~30/481841/0.9264; (7) 30~35/317712/0.6316; (8) 35~40/180510/0.5672; (9) >40/139136/0.4415 |
| RDLS/m | Continuous | (1) <20/1498140/1.0157; (2) 20~30/976617/1.3054; (3) 30~40/708930/1.0616; (4) 40~50/433899/0.6350; (5) 50~80/397593/0.5689; (6) 80~120/40815/0.2015; (7) >120/3177/0.0000 |
| Aspect | Categorical | (1) Flat/832/0.0000; (2) North/559138/1.0693; (3) Northeast/418623/0.8413; (4) East/476389/0.9542; (5) Southeast/493071/1.0382; (6) South/599652/0.9629; (7) Southwest/493884/1.1028; (8) West/501655/1.0776; (9) Northwest/498750/0.9278 |
| Slope position | Categorical | (1) Ridge/1487219/1.0188; (2) Upper slope/282489/1.0583; (3) Middle slope/74587/0.4941; (4) Flats slope/469978/1.0631; (5) Lower slope/217136/0.9619; (6) Valley/1510585/0.9814 |
| Micro-landform | Categorical | (1) Canyons, and Deeply incised streams/1447542/1.0920; (2) Midslope drainages, and shallow valleys/93339/1.0530; (3) Upland drainages, and Headwaters/213973/0.6507; (4) U-shape valleys/269975/1.3197; (5) Plains/14080/0.0000; (6) Open slopes/67096/1.2207; (7) Upper slopes, and Plateau/242500/0.8444; (8) Local ridges hills in valleys/214246/0.9940; (9) Midslope ridges, and Small hills in plains/99198/1.8165; (10) Mountain tops, and High narrow ridges/1380045/0.8606 |
| Curvature | Continuous | (1) <−1/739914/0.9354; (2) −1~−0.5/473674/0.9597; (3) −0.5~0/932464/1.2473; (4) 0~0.5/697900/0.9213; (5) 0.5~1/455266/0.9355; (6) >1/742776/0.8932 |
| Profile Curvature | Continuous | (1) <−1/397552/0.8344; (2) −1~−0.5/466912/1.0087; (3) −0.5~0/1127645/0.9914; (4) 0~0.5/1146502/1.0787; (5) 0.5~1/495095/1.0836; (6) >1/408288/0.8526 |
| Plan Curvature | Continuous | (1) <−1/199130/0.7198; (2) −1~−0.5/399120/0.9440; (3) −0.5~0/1453089/1.0963; (4) 0~0.5/1352014/1.0329; (5) 0.5~1/421528/0.8549; (6) >1/217113/0.7922 |
| TRI | Continuous | (1) <1.05/1956974/1.0568; (2) 1.05~1.1/921054/1.1916; (3) 1.1~1.15/495536/0.9917; (4) 1.15~1.2/273264/0.6744; (5) >1.2/395166/0.5078 |
| TWI | Continuous | (1) <4/522384/0.7134; (2) 4~6/2078046/1.0327; (3) 6~8/878720/1.0346; (4) 8~10/368110/1.1904; (5) >10/194734/0.9043 |
| STI | Continuous | (1) <20/2917053/0.9841; (2) 20~40/522250/0.9410; (3) 40~70/282523/0.9857; (4) 70~100/119897/1.3321; (5) 100~200/119007/1.3076; (6) >200/81264/1.0583 |
| SPI | Continuous | (1) <15/1742958/1.0315; (2) 15~30/660771/0.8987; (3) 30~45/276316/0.9485; (4) 45~60/166651/0.9092; (5) 60~100/256386/0.7986; (6) 100~1000/734492/1.0649; (7) >1000/204420/1.2220 |
| Lithology | Categorical | (1) J3s, J3p, J3zj, J3D/1151138/1.2333; (2) J2xs, J2s/1168089/1.2294; (3) J2z, J1-2z, J1z, J2x, J2zs, J1b-j2Q/617821/0.8741; (4) T3xj, T3zj, T3z2/359056/0.3418; (5) T2b2, T2b/379171/0.9711; (6) T1d/87456/0.7485; (7) T1-2j1, T1-2j2, T1-2j3, T1j/211999/0.3281; (8) P2 1+w, P2 l-d, P2/53807/0.1521; (9) P1 m + g, P1 l + q, P1, C/1323/0.0000; (10) O/8310/0.4923 |
| Distance from fault/m | Continuous | (1) <500/3370/0.0000; (2) 500~1000/517/1.5808; (3) 1000~1500/6879/1.7852; (4) 1500~2000/8551/1.4362; (5) 2000~2500/10401/0.7871; (6) 2500~3000/12126/1.0127; (7) >3000/3993800/0.9983 |
| CRDS | Categorical | (1) Dip-slope I/65686/1.5565; (2) Dip-slope II/280703/1.0053; (3) Outward slope/345779/1.2419; (4) Oblique slope/1001302/1.0987; (5) Tangential slope/635257/1.1330; (6) Reverse slope/1476082/0.8672; (7) Flat/231650/0.5296 |
| NDVI | Continuous | (1) 0~0.1/208363/0.6878; (2) 0.1~0.15/390391/0.7762; (3) 0.15~0.2/1282355/0.9196; (4) 0.2~0.25/1429333/1.0370; (5) >0.25/731025/1.2771 |
| Distance from rivers/m | Continuous | (1) <100/244170/0.8718; (2) 100~200/203437/1.7104; (3) 200~300/214861/1.6194; (4) 300~400/186399/1.6471; (5) 400~500/192759/1.2317; (6) 500~600/182581/1.1210; (7) >600/2816099/0.8460 |
| Annual average rainfall/mm | Continuous | (1) <1221/291710/1.6968; (2) 1221~1251/521092/1.1775; (3) 1251~1276/626074/0.9735; (4) 1276~1308/854004/1.0346; (5) 1308~1343/832774/1.0463; (6) 1343~1389/663280/0.7401; (7) 1389~1440/199205/0.3491; (8) >1440/49295/0.0830 |
| Land cover | Categorical | (1) Meadow/727009/0.8346; (2) Farmland/71553/1.4188; (3) Water area/3057711/0.7712; (4) Forest/9809/0.9130; (5) Garden plot/26616/0.8581; (6) Others/14531/0.0000; (7) Residential land/132682/0.9227; (8) Transportation/109/1.4084 |
| Distance from roads/m | Continuous | (1) <100/625582/1.7864; (2) 100~200/424193/1.0905; (3) 200~300/402164/1.2113; (4) 300~400/313648/0.9266; (5) 400~500/298588/0.7129; (6) 500~600/258887/0.9171; (7) >600/1717244/0.7175 |
| POI kernel density | Continuous | (1) 0–1/286078/0.5581; (2) 1–2/1291095/0.8624; (3) 2–3/1153628/1.0752; (4) 3–4/513828/1.2508; (5) 4–5/244087/0.8553; (6) 5–10/316584/1.1508; (7) >10/235006/1.3238 |
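The FR values in this table can be read through the conventional frequency ratio definition: the share of landslides falling in a class divided by the share of grid cells in that class. This record does not spell out the formula, so the sketch below assumes that standard definition. The function name `frequency_ratio` and the counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def frequency_ratio(class_cells, class_landslides):
    """Frequency ratio per class of one conditioning factor.

    FR = (landslides in class / all landslides) / (cells in class / all cells)
    """
    total_cells = sum(class_cells)
    total_landslides = sum(class_landslides)
    return [
        (n / total_landslides) / (c / total_cells)
        for c, n in zip(class_cells, class_landslides)
    ]


# Hypothetical counts for a four-class factor (not data from the study).
cells = [4000, 3000, 2000, 1000]
landslides = [10, 30, 40, 20]

fr = frequency_ratio(cells, landslides)
print([round(value, 4) for value in fr])
```

Under this definition, FR above 1 marks a class with more landslides than its area share would suggest, FR below 1 marks fewer, and a class containing no landslides gets FR = 0, as in the 0.0000 entries of the table.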
Figure 15. Landslide susceptibility map in the FR model: (a) Enlarged area of the valley; (b) Enlarged area along the river.
Statistical results of landslide susceptibility in different classes of the FR model.
| Landslide Probability | Susceptibility Class | Grid Number | Area Proportion | Landslides | Landslide Proportion | Landslide Density (pcs/km²) |
|---|---|---|---|---|---|---|
| <21.10 | Very low | 986,559 | 24.5% | 56 | 5.7% | 0.063 |
| 21.10–22.19 | Low | 1,093,869 | 27.1% | 181 | 18.3% | 0.184 |
| 22.19–22.93 | Medium | 875,313 | 21.7% | 230 | 23.3% | 0.292 |
| 22.93–24.54 | High | 924,737 | 22.9% | 387 | 39.2% | 0.465 |
| >24.54 | Very high | 152,601 | 3.8% | 133 | 13.5% | 0.968 |
Confusion matrix of RF.
| RF | True Condition: Landslide | True Condition: Non-Landslide | Summation |
|---|---|---|---|
| Predicted: Landslide | 907 | 9 | Precision: 0.990 |
| Predicted: Non-Landslide | 80 | 9861 | Precision: 0.992 |
| Summation | Recall: 0.919 | Recall: 0.999 | Accuracy: 0.992 |
Confusion matrix of FR.
| FR | True Condition: Landslide | True Condition: Non-Landslide | Summation |
|---|---|---|---|
| Predicted: Landslide | 711 | 4114 | Precision: 0.147 |
| Predicted: Non-Landslide | 276 | 5756 | Precision: 0.954 |
| Summation | Recall: 0.720 | Recall: 0.538 | Accuracy: 0.600 |
Figure 16. ROC curve and AUC value.
Figure 17. Quantitative comparison of landslide susceptibility class: (a) percentage of susceptibility regions (%); (b) percentage of landslides (%).
Figure 18. New landslide maps: (a) RF; (b) FR.
Figure 19. Mean decrease accuracy (sorted in descending order from top to bottom) of attributes, as assigned by the RF.
Figure 20. AUC values of RF model with different reduced landslide influencing factors.
Figure 21. Typical factors in landslide density statistics: (a) Elevation; (b) Annual average rainfall; (c) Slope; (d) Distance from faults; (e) Lithology.