The Influence of Different Knowledge-Driven Methods on Landslide Susceptibility Mapping: A Case Study in the Changbai Mountain Area, Northeast China.

Zhongjun Ma¹, Shengwu Qin¹, Chen Cao¹, Jiangfeng Lv¹, Guangjie Li¹, Shuangshuang Qiao¹, Xiuyu Hu¹.

Abstract

Landslides are one of the most frequent geomorphic hazards, and they often result in the loss of property and human life in the Changbai Mountain area (CMA), Northeast China. The objective of this study was to produce and compare landslide susceptibility maps for the CMA using an information content model (ICM) with three knowledge-driven methods (the analytic hierarchy process with the ICM (AHP-ICM), the entropy weight method with the ICM (EWM-ICM), and the rough set with the ICM (RS-ICM)) and to explore the influence of different knowledge-driven methods for a series of parameters on the accuracy of landslide susceptibility mapping (LSM). In this research, the landslide inventory data (116 landslides) were randomly divided into two datasets: 70% (81 landslides) were used for training the models and 30% (35 landslides) were used for validation. In addition, 13 layers of landslide conditioning factors, namely, altitude, slope gradient, slope aspect, lithology, distance to faults, distance to roads, distance to rivers, annual precipitation, land type, normalized difference vegetation index (NDVI), topographic wetness index (TWI), plan curvature, and profile curvature, were taken as independent, causal predictors. Landslide susceptibility maps were developed using the ICM, RS-ICM, AHP-ICM, and EWM-ICM, in which weights were assigned to every conditioning factor. The resultant susceptibility maps were validated using the area under the ROC curve (AUC) method. The success accuracies of the landslide susceptibility maps produced by the ICM, RS-ICM, AHP-ICM, and EWM-ICM methods were 0.931, 0.939, 0.912, and 0.883, respectively, with prediction accuracy rates of 0.926, 0.927, 0.917, and 0.878, respectively. Hence, it can be concluded that the four models used in this study gave close results, with the RS-ICM exhibiting the best performance in landslide susceptibility mapping.

Keywords: AHP; Changbai Mountain area; Cohen’s kappa index; GIS; entropy weight method; landslide susceptibility mapping; rough set

Year: 2019 PMID: 33267086 PMCID: PMC7514856 DOI: 10.3390/e21040372

Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.524

1. Introduction

Landslides are one of the most frequent geomorphic hazards, and they have considerable economic and ecological consequences [1,2,3]. To mitigate these social and economic losses, it is valuable and essential to assess the landslide susceptibility in a region. Therefore, in recent years, the assessment of landslide susceptibility, which refers to the likelihood of a landslide occurring in an area on the basis of the local terrain and environmental conditions, has become a topic of major interest [4,5]. Landslide susceptibility mapping (LSM) is considered in the decision-making process involving land use management as an efficient approach to reduce property damage and economic loss in landslide-prone areas [1,6,7,8,9]. The outcome maps would be useful for general planned development activities and disaster management in the future, such as choosing new urban areas and infrastructural activities, as well as for environmental protection. Landslide susceptibility maps can be obtained using both qualitative (inventory-based and knowledge-driven methods) or quantitative approaches (data-driven methods and physically based models) [4,10,11,12,13,14,15,16,17]. Landslide inventory-based techniques, as a prelude to all other methods, include the collection of past landslide data, construction of databases, and production of susceptibility maps based on those data [18]. Landslide inventory mapping can be carried out using a variety of methods that were updated and summarized by Corominas et al. [17]. Knowledge-driven methods that estimate landslide potential from the practical experience and expertise of the researcher are used by geomorphologists to analyze aerial photographs or to conduct field surveys [19]. Data-driven landslide susceptibility assessment methods are used to select and analyze factors affecting landslides in areas with environmental conditions similar to those where past landslides have been reported [17]. 
They can be grouped into bivariate statistical analysis, multivariate statistical analysis, and active learning statistical analysis. In bivariate statistical analysis, the weights of the landslide conditioning factors are assigned based on landslide density using different methods, including frequency ratio (FR) [13,15,20,21,22], the information content model (ICM) [23,24], weight of evidence (WoE) [16], certainty factors (CF) [25], favorability functions (FF) [26], and the likelihood ratio model (LRM) [27]. The multivariate statistical methods evaluate the combined relationship between a dependent variable (landslide occurrence) and a series of independent variables (landslide controlling factors); the most popular methods to analyze the resulting matrix include logistic regression (LR) [6,13,28,29,30,31,32,33], discriminant analysis (DA) [34,35], and random forest (RF) [36,37,38], while active learning statistical analysis includes methods such as artificial neural networks (ANNs) [3,6,39,40,41,42]. Physically based methods, such as deterministic techniques, are based on mathematical modeling of the physical mechanisms controlling slope failure [43,44,45,46,47,48,49]. However, it is reported that these methods are only applicable over large areas when the geological and geomorphological conditions are fairly homogeneous and the landslide types are simple [17]. Moreover, several studies have used two or more models to produce landslide susceptibility maps and compare their accuracy [3,15,16,36,50,51,52,53]. Although many approaches are available for producing landslide susceptibility maps, the focus in this paper is on the influence of different knowledge-driven methods for a series of parameters on the accuracy of LSM. A comparative study of the ICM combined with three knowledge-driven methods, that is, rough set (RS), analytic hierarchy process (AHP), and entropy weight method (EWM), has not been considered so far in the literature. 
Therefore, four models, namely, ICM, AHP-ICM, EWM-ICM, and RS-ICM, were used to produce landslide susceptibility maps in the Changbai Mountain area (CMA). The four models were validated based on receiver operating characteristic (ROC) curves and the Cohen’s kappa index, and the influence of different knowledge-driven methods is discussed in the study.

2. Study Area

The CMA is located in northeastern China and is bounded by North Korea (Figure 1) [54], between approximately 41°21′N and 43°01′N and 127°01′E and 128°54′E, where the climate is temperate continental monsoon with mean annual temperatures of −7.3 to +4.8 °C and average annual precipitation ranging from 700 to 1400 mm, depending on the elevation. The vegetation is dominated by mixed forests (600–1100 m), coniferous forests (1100–1800 m), sub-alpine Betula ermanii forests (1800–2000 m), or alpine tundra vegetation above 2000 m [55]. The CMA is one of the largest volcanic areas in East Asia, with volcanic tectonic geomorphology, water landscapes, glaciers, and periglacial landforms [56]. The east and southeast areas of the study area border North Korea and Russia. The study area includes seven counties and cities, including Changbai, Fusong, and Antu. The total area of the CMA is approximately 19,774 km².

Figure 1

Study area and shaded relief image showing the surface.

The topography is diverse, including valleys, basins, hills, and steep slopes. The terrain is centered on Baiyun Mountain and descends gradually toward the surrounding area; it is characterized by high relief, deep incision, and steep slopes. The highest point of the CMA is Baiyun Mountain, with an elevation of 2694 m. The lowest point is northwest of the Songhua River, with an elevation of 276 m. The tectonic pattern of the area is dominated by NSS- and NNS-trending fault zones supplemented by SW-, SN- and NW-trending faults. The main faults include the Shulan-Yitong graben fault, the Dunhua-Mishan graben, and the Tumenjiang fault [57]. Extensive tectonic fragmentation, frequent volcanic activity, steep slopes, a dense drainage and road network, and extensive human activity, including North Korea’s nuclear testing, together make the study area particularly prone to landslides.

3. Data Collection

Extensive field investigations and observations were carried out in the CMA to produce a detailed and reliable landslide inventory map. In all, 116 landslides were identified and mapped in the study area from aerial photos supported by field investigation from 2012 to 2015. The landslides in the study area are characterized by two modes, slump-tensile rupture and creep-tensile rupture, both of which eventually lead to slope instability. Of these landslides, 70% (81 landslides) were randomly selected for model training and the remaining 30% (35 landslides) were selected for validation. A series of field investigations were undertaken to identify the relationship between landslide occurrence and environmental factors. Figure 2 illustrates some typical landslides that destroyed railways and roads.

Figure 2

Field views of landslides considered in the area. (a) Wanlihe station landslide; (b) Qinxiang landslide.

Thirteen layers of landslide conditioning factors, namely, altitude, slope gradient, slope aspect, lithology, distance to faults, distance to roads, distance to rivers, annual precipitation, land type, normalized difference vegetation index (NDVI), topographic wetness index (TWI), plan curvature, and profile curvature, were taken as independent, causal predictors for producing LSM. The selection of the 13 predictors was based on the works of previous researchers, data availability, and experience and knowledge of landslide activity in the study area [24,58]. The continuous predictors, such as altitude, slope gradient, slope aspect, distance to faults, distance to roads, distance to rivers, NDVI, TWI, plan curvature, and profile curvature, were classified according to natural break classes and previous studies, and the discrete predictors, including lithology, annual precipitation, and land type, were classified based on the existing classification. The spatial database for the study area is shown in Table 1.

Table 1

Spatial database for the study area.

Data Layers	Data Type	Scale
Altitude	Grid	30 m × 30 m
Slope gradient	Grid	30 m × 30 m
Slope aspect	Grid	30 m × 30 m
Lithology	Polygon	1:100000
Annual precipitation	Polygon	1:100000
Land type	Polygon	1:100000
Distance to rivers	Polygon	1:100000
Distance to faults	Polygon	1:100000
Distance to roads	Polygon	1:100000
Plan curvature	Grid	30 m × 30 m
Profile curvature	Grid	30 m × 30 m
Topographic wetness index (TWI)	Grid	30 m × 30 m
Normalized difference vegetation index (NDVI)	Grid	30 m × 30 m

The details about the data collection procedure and the preparation of the thematic layers are as follows: The altitude map, slope gradient, slope aspect, plan curvature, and profile curvature were produced using the digital elevation model (DEM) with a grid size of 30 m × 30 m. In the present study area, the altitude ranges from 276 to 2694 m and is reclassified into five categories using natural break classes: 276–715 m, 715–924 m, 924–1162 m, 1162–1507 m, and 1507–2694 m. The slope gradient is reclassified into six classes, namely, 0–8°, 8–15°, 15–25°, 25–35°, 35–45°, and >45°. The slope aspect is reclassified into eight classes: North (−22.5–22.5°), Northeast (22.5–67.5°), East (67.5–112.5°), Southeast (112.5–157.5°), South (157.5–202.5°), Southwest (202.5–247.5°), West (247.5–302.5°), and Northwest (302.5–347.5°). The plan curvature is reclassified into three classes: <−0.5, −0.5–0.5, and >0.5. The profile curvature is reclassified into three classes: <−0.5, −0.5–0.5, and >0.5. Lithology, annual precipitation, land type, rivers, faults, and road feature maps were obtained from the China Geological Survey. The distance to faults was classified into five classes: 0–500, 500–1000, 1000–1500, 1500–2000, and >2000 m. In the case of distance to rivers, there are five classes with 100 m intervals. For distance to roads, there are five classes: 0–500, 500–1000, 1000–1500, 1500–2000, and >2000 m. The hydrological factor TWI was calculated using Equation (1). The TWI can be used as an estimate of spatial patterns for soil moisture, since topography controls the hydrological conditions of surface runoff and groundwater flow: $\mathrm{TWI} = \ln\left(\alpha / \tan\beta\right)$ (1), where $\alpha$ is the specific catchment area and $\beta$ is the slope gradient (in degrees). In the current study, TWI is divided into five classes: <9, 9–11, 11–14, 14–18, and >18. 
The NDVI is a standardized index for generating an image displaying greenness, and it was prepared using Landsat-7 images based on the following equation: $\mathrm{NDVI} = (\mathrm{NIR} - \mathrm{R}) / (\mathrm{NIR} + \mathrm{R})$ (2), where $\mathrm{NIR}$ (band 4) and $\mathrm{R}$ (band 3) are the near-infrared and red bands of the electromagnetic spectrum, respectively. This index outputs values between −1.0 and 1.0, mostly representing greenness, where any negative values are mainly generated from clouds, water, and snow, and values near zero are mainly generated from rock and bare soil [1]. In this study, the NDVI map was divided into five classes: <0.1, 0.1–0.3, 0.3–0.5, 0.5–0.7, and >0.7.
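The two index definitions and the class breaks quoted above can be illustrated with a minimal Python sketch. The function names, the `classify` helper, and the example cell values are hypothetical and only serve as an illustration; the study itself derived these layers in GIS software, and a flat cell (slope of 0°) would need special handling because tan(0) = 0.

```python
import math

def twi(specific_catchment_area, slope_deg):
    """Topographic wetness index ln(a / tan(beta)); the slope is given in degrees."""
    return math.log(specific_catchment_area / math.tan(math.radians(slope_deg)))

def ndvi(nir, red):
    """Normalized difference vegetation index from near-infrared and red values."""
    return (nir - red) / (nir + red)

def classify(value, breaks):
    """Return the 1-based class index of value for ascending class breaks."""
    for index, upper in enumerate(breaks, start=1):
        if value < upper:
            return index
    return len(breaks) + 1

# Class breaks quoted in the text.
TWI_BREAKS = [9, 11, 14, 18]
NDVI_BREAKS = [0.1, 0.3, 0.5, 0.7]

# Hypothetical cell values, for illustration only.
cell_twi = twi(specific_catchment_area=500.0, slope_deg=20.0)
cell_ndvi = ndvi(nir=0.5, red=0.1)
print(classify(cell_twi, TWI_BREAKS), classify(cell_ndvi, NDVI_BREAKS))
```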

4. Methodology

4.1. The Rough Set Model

RS theory was proposed by Pawlak as a mathematical framework for approximate reasoning that considers uncertainty and vagueness in decision-making processes [59,60]. A rough set is used to modify the index weights of the factor layers, which can make the weight distribution more reasonable. The operation steps are as follows. Formally, a quadruple $S = (U, A, V, f)$ is an information system, where $U$ is the non-empty finite set of objects called the universe; $A = C \cup D$ is the non-empty finite set of attributes; $V = \bigcup_{a \in A} V_a$, where $V_a$ is the range of attribute $a$; and $f: U \times A \to V$ is called an information function such that $f(x, a) \in V_a$, where $x \in U$ and $a \in A$; $C$ denotes the condition attribute set; and $D$ is the decision attribute set. The knowledge expression system, with both condition and decision attributes, is a decision table. The significance of different condition attributes for the decision attribute set is different, and some attributes are redundant. Rough set theory, by mining the potential relationship between the condition and the decision attribute sets in the knowledge system, obtains the weight values of different condition attributes. In decision-making systems, the importance of the conditional attribute $c_i$ can be calculated as: $\mathrm{Sig}(c_i) = \gamma_C(D) - \gamma_{C - \{c_i\}}(D)$ (3), where $\gamma_C(D)$ reflects the dependency degree of the decision attribute set $D$ on conditional set $C$, so $0 \le \gamma_C(D) \le 1$: $\gamma_C(D) = 0$ means that $D$ does not depend on $C$; $0 < \gamma_C(D) < 1$ means that $D$ partly depends on $C$; $\gamma_C(D) = 1$ means that $D$ totally depends on $C$, and the dependency is calculated using Equation (4): $\gamma_C(D) = |\mathrm{POS}_C(D)| / |U|$ (4). If $c_i$ is deleted from the conditional attribute set $C$, the dependency of the decision attribute set $D$ on set $C - \{c_i\}$ is calculated using Equation (5): $\gamma_{C - \{c_i\}}(D) = |\mathrm{POS}_{C - \{c_i\}}(D)| / |U|$ (5), where $\mathrm{POS}_C(D)$ is the positive region of $D$ with respect to $C$ and $|U|$ is the number of samples in set $U$. Then, the index weight vector can be obtained using Equation (6): $w_i = \mathrm{Sig}(c_i) / \sum_{j=1}^{n} \mathrm{Sig}(c_j)$ (6).
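The dependency-based weighting described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the decision table is a hypothetical toy example (three condition attributes, with the decision equal to "first OR second attribute"), and the function names are invented for the sketch. The sketch assumes that at least one attribute has non-zero significance; otherwise the normalization would divide by zero.

```python
from collections import defaultdict

def dependency(rows, attrs):
    """Dependency degree: share of objects whose condition class has one decision."""
    groups = defaultdict(list)
    for conditions, decision in rows:
        key = tuple(conditions[a] for a in attrs)
        groups[key].append(decision)
    # Positive region: objects in condition classes that map to a single decision.
    positive = sum(len(ds) for ds in groups.values() if len(set(ds)) == 1)
    return positive / len(rows)

def rough_set_weights(rows):
    """Weights from the drop in dependency when each attribute is removed."""
    all_attrs = list(range(len(rows[0][0])))
    full = dependency(rows, all_attrs)
    significance = [
        full - dependency(rows, [a for a in all_attrs if a != i])
        for i in all_attrs
    ]
    total = sum(significance)
    return [s / total for s in significance]

# Hypothetical decision table: (condition values, decision value).
toy_table = [
    ((0, 0, 0), 0),
    ((0, 1, 0), 1),
    ((1, 0, 0), 1),
    ((1, 1, 0), 1),
]
weights = rough_set_weights(toy_table)
print(weights)
```

In this toy table the third attribute is constant, so removing it does not change the dependency and it receives a weight of zero, which is the same mechanism by which a redundant predictor can end up with zero weight.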

4.2. The Analytic Hierarchy Process

The analytic hierarchy process (AHP), a multi-criteria decision analysis method, was proposed by Saaty [61] and has been widely used in LSM [23,24]. The weights can be derived by taking the principal eigenvector of a square reciprocal matrix of pairwise comparisons between the criteria. The pairwise comparison of the 9-point rating scale is shown in Table 2. This approach can be described in four steps as follows [51,62,63]:

Table 2

Pair-wise comparison of 9-point rating scale.

Importance	Definition
1	Equal importance
3	Moderate prevalence of one over another
5	Strong or essential prevalence
7	Very strong or demonstrated prevalence
9	Extremely high prevalence
2, 4, 6, 8	Intermediate values

Step 1: Establish the hierarchical tree model for landslide susceptibility mapping.
Step 2: Build the judgement matrix based on pairwise comparison.
Step 3: Calculate the weights or the level of influence for each element based on the least squares, logarithmic least squares, eigenvector, or approximation methods.
Step 4: Check the consistency of the weights using the consistency ratio (CR). The CR must be equal to or less than 10%; otherwise, the pairwise comparisons have to be revised.
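Steps 3 and 4 can be sketched in Python using the eigenvector approach. The sketch below uses the judgement matrix reported later in Table 6 and a plain power iteration; the random index RI = 1.56 for a 13 × 13 matrix is an assumption taken from commonly tabulated Saaty values, and the authors may have used a different weighting variant or RI table, so the output need not reproduce the published weights or CR exactly.

```python
from fractions import Fraction

# Judgement matrix copied from Table 6 (rows and columns are X1..X13).
TABLE_6 = """
1 1/6 1/2 1/7 1/3 1/4 1 1/4 1/5 1/4 1/3 1/2 1/2
6 1 4 1/2 3 2 5 3 1 1 2 3 3
2 1/4 1 1/5 1 1/3 2 1/2 1/3 1/2 1/2 1 1
7 2 5 1 4 2 5 3 1 2 3 4 4
3 1/3 1 1/4 1 1/2 2 1/2 1/2 2 1 1 1
4 1/2 3 1/2 2 1 3 1 1 1 1 2 2
1 1/5 1/2 1/5 1/2 1/3 1 1/3 1/4 1/4 1/3 1/2 1/2
4 1/3 2 1/3 2 1 3 1 1/2 1 1 2 2
5 1 3 1 2 1 4 2 1 1 2 3 3
4 1 2 1/2 1/2 1 4 1 1 1 1 2 2
3 1/2 2 1/3 1 1 3 1 1/2 1 1 1 1
2 1/3 1 1/4 1 1/2 2 1/2 1/3 1/2 1 1 1
2 1/3 1 1/4 1 1/2 2 1/2 1/3 1/2 1 1 1
"""
A = [[float(Fraction(tok)) for tok in line.split()]
     for line in TABLE_6.strip().splitlines()]

def principal_eigen(matrix, iterations=200):
    """Power iteration: returns the normalized principal eigenvector and lambda_max."""
    n = len(matrix)
    w = [1.0 / n] * n
    for _ in range(iterations):
        v = [sum(row[j] * w[j] for j in range(n)) for row in matrix]
        total = sum(v)
        w = [x / total for x in v]
    aw = [sum(row[j] * w[j] for j in range(n)) for row in matrix]
    lambda_max = sum(aw[i] / w[i] for i in range(n)) / n
    return w, lambda_max

RANDOM_INDEX_13 = 1.56  # assumed RI for n = 13 (commonly tabulated value)

weights, lambda_max = principal_eigen(A)
n = len(A)
ci = (lambda_max - n) / (n - 1)   # consistency index
cr = ci / RANDOM_INDEX_13         # consistency ratio
for name, w in zip([f"X{i}" for i in range(1, n + 1)], weights):
    print(name, round(w, 4))
print("CR =", round(cr, 4))
```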

4.3. The Entropy Weight Method

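The entropy method is an objective method for calculating the weight of evaluation predictors based on measured values [19,40,64,65,66]. The operation steps are as follows: Step 1: Establish the matrix $X = (x_{ij})_{m \times n}$ of the original evaluation data according to the evaluation objects and indicators, where $m$ is the number of evaluation objects and $n$ is the number of evaluation indicators. Step 2: Normalize matrix $X$. For the efficiency type, the larger the better: $r_{ij} = \frac{x_{ij} - \min_i x_{ij}}{\max_i x_{ij} - \min_i x_{ij}}$ (7). For the cost type, the smaller the better: $r_{ij} = \frac{\max_i x_{ij} - x_{ij}}{\max_i x_{ij} - \min_i x_{ij}}$ (8). The standardization process yields the standard-grade matrix $R = (r_{ij})_{m \times n}$. Step 3: Calculate the entropy value $H_j$: $H_j = -\frac{1}{\ln m} \sum_{i=1}^{m} f_{ij} \ln f_{ij}$ (9), where $f_{ij} = r_{ij} / \sum_{i=1}^{m} r_{ij}$; if $f_{ij} = 0$, $f_{ij} \ln f_{ij} = 0$. Step 4: Calculate the weight $w_j$ of each indicator based on the entropy values: $w_j = \frac{1 - H_j}{n - \sum_{j=1}^{n} H_j}$ (10).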
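The four steps can be sketched as a small Python function. This is an illustrative sketch of the standard entropy weight method with min-max normalization, not the authors' code; the function name, the `larger_is_better` flags, and the example matrix are hypothetical. It assumes every indicator column contains at least two distinct values, since a constant column would make the normalization divide by zero.

```python
import math

def entropy_weights(matrix, larger_is_better):
    """Entropy weights for an m x n matrix (rows = objects, columns = indicators)."""
    m, n = len(matrix), len(matrix[0])
    entropies = []
    for j in range(n):
        column = [row[j] for row in matrix]
        lo, hi = min(column), max(column)
        # Step 2: min-max normalization, direction depends on the indicator type.
        if larger_is_better[j]:
            r = [(x - lo) / (hi - lo) for x in column]
        else:
            r = [(hi - x) / (hi - lo) for x in column]
        total = sum(r)
        f = [x / total for x in r]
        # Step 3: entropy, with the convention 0 * ln(0) = 0.
        h = -sum(p * math.log(p) for p in f if p > 0) / math.log(m)
        entropies.append(h)
    # Step 4: lower entropy (more contrast between objects) gives a larger weight.
    denominator = n - sum(entropies)
    return [(1 - h) / denominator for h in entropies]

# Hypothetical data: indicator 1 varies evenly, indicator 2 is concentrated.
example = [[1, 1], [2, 1], [3, 1], [4, 10]]
ewm_weights = entropy_weights(example, larger_is_better=[True, True])
print([round(w, 3) for w in ewm_weights])
```

The second indicator concentrates all of its normalized mass in one object, so its entropy is zero and it receives the larger weight.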

4.4. The Information Content Model

The information content model (ICM), a statistical analysis method that has been used with good results for landslide susceptibility assessment [67], is derived from the information theory introduced by C. E. Shannon. The ICM is used to calculate the effects of various engineering geological environments on landslides. The calculation procedure is as follows: Step 1: Calculate the information content $I(x_i, H)$ for each factor $x_i$ that influences landslide occurrence: $I(x_i, H) = \ln \frac{P(x_i \mid H)}{P(x_i)}$ (11), where $P(x_i \mid H)$ is the probability of occurrence of $x_i$ in the landslide area and $P(x_i)$ is the probability of occurrence of $x_i$ in the study area; in practice, $I(x_i, H) = \ln \frac{N_i / N}{S_i / S}$ (12), where $N$ is the number of landslides in the study area, $S$ is the total number of pixels in the study area, $N_i$ is the number of pixels for factor $x_i$ in the landslide area, and $S_i$ is the number of pixels for factor $x_i$ in the study area. Step 2: Calculate the total information content $I$ over all factors: $I = \sum_{i=1}^{n} I(x_i, H)$ (13), where $I$ is the total information content and $n$ is the total number of predictors. The greater the value of $I$, the more likely a landslide is to occur.
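As a worked check of Step 1, the sketch below recomputes the information content and landslide density for the altitude classes from the counts reported in Table 3. It assumes the natural logarithm, N = 81 training landslides, and a cell area of 30 m × 30 m; the variable names are illustrative. A class with zero landslides would need separate handling, since the logarithm of zero is undefined.

```python
import math

N_LANDSLIDES = 81             # training landslides
CELL_AREA_KM2 = 0.03 * 0.03   # one 30 m x 30 m grid cell in km^2

# Altitude classes from Table 3: (number of landslides, pixel count).
altitude_classes = {
    "276-715": (46, 6_092_214),
    "715-924": (15, 6_952_902),
    "924-1162": (12, 4_897_656),
    "1162-1507": (4, 2_570_941),
    "1507-2694": (4, 794_775),
}
TOTAL_PIXELS = sum(pixels for _, pixels in altitude_classes.values())

def information_content(n_i, s_i, n_total, s_total):
    """ln of (share of landslides in the class) / (share of area in the class)."""
    return math.log((n_i / n_total) / (s_i / s_total))

ic = {}
density = {}
for name, (n_i, s_i) in altitude_classes.items():
    ic[name] = information_content(n_i, s_i, N_LANDSLIDES, TOTAL_PIXELS)
    density[name] = n_i / (s_i * CELL_AREA_KM2)   # landslides per km^2
    print(name, round(ic[name], 2), round(density[name], 4))
```

Under these assumptions the values agree with the Information Content and Landslide density columns of Table 3 to the reported precision, for example 0.69 and 0.0084 for the 276–715 m class.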

4.5. The Landslide Susceptibility Assessment

Based on the ICM combined with the AHP, RS, and EWM, the weighted information content can be obtained using the following formula: $\mathrm{LSI} = \sum_{i=1}^{n} w_i \, I(x_i, H)$ (14), where $w_i$ are the weights for the landslide conditioning predictors calculated by the AHP, RS, and EWM, and $\mathrm{LSI}$ is the comprehensive index of the landslide susceptibility.
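The weighted aggregation can be illustrated for a single hypothetical grid cell. The sketch below uses only three of the thirteen factors for brevity, with information content values taken from Table 3 and the corresponding AHP weights from Table 6; the choice of cell and the function name are assumptions made for illustration.

```python
def susceptibility_index(ic_values, weights=None):
    """Sum of information content values, optionally weighted per factor."""
    if weights is None:
        weights = [1.0] * len(ic_values)   # plain ICM: every factor counts equally
    return sum(w * ic for w, ic in zip(weights, ic_values))

# Hypothetical cell: altitude 276-715 m, slope 25-35 degrees, <500 m from a road.
cell_ic = [0.69, 0.99, 2.28]             # information content values (Table 3)
ahp_weights = [0.0218, 0.1386, 0.0889]   # AHP weights for X1, X2, X6 (Table 6)

plain_index = susceptibility_index(cell_ic)
weighted_index = susceptibility_index(cell_ic, ahp_weights)
print(round(plain_index, 2), round(weighted_index, 4))
```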

4.6. Performance Evaluation

To assess the performance and measure the spatial consistency of the four models, the Cohen’s kappa index, a measure of the reliability of a classification model, was used [68,69,70]. The Cohen’s kappa index is obtained as: $k = \frac{P_{obs} - P_{exp}}{1 - P_{exp}}$ (15), where $P_{obs} = \frac{TP + TN}{N}$ is the observed agreement and $P_{exp} = \frac{(TP + FN)(TP + FP) + (FP + TN)(FN + TN)}{N^2}$ is the expected agreement, with $N = TP + TN + FP + FN$. Of these, $FP$, $FN$, $TP$, and $TN$ are the numbers of false positives, false negatives, true positives, and true negatives, respectively. In our case, a $k$ value close to 0 means that the agreement is no better than chance, whereas a $k$ value close to 1 indicates a perfect agreement.
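A minimal Python sketch of the kappa calculation from a two-class confusion matrix follows. The counts in the example are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def cohen_kappa(tp, tn, fp, fn):
    """Cohen's kappa for a binary confusion matrix."""
    total = tp + tn + fp + fn
    observed = (tp + tn) / total
    # Chance agreement from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fp + tn) * (fn + tn)) / total ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical counts: observed agreement 0.8, expected agreement 0.5.
kappa = cohen_kappa(tp=40, tn=40, fp=10, fn=10)
print(round(kappa, 3))
```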

5. Results

5.1. LSM using the ICM

The information content (IC) for causative predictors was calculated with Equations (11), (12), and (13), and the results are listed in Table 3. The maps of the thirteen influencing factors are shown in Figure 3.

Table 3

Information content for causative predictors.

Factor	Class	Number of Landslides	Total Count	Information Content	Landslide density (one/km²)
Altitude/m	276–715	46	6,092,214	0.69	0.0084
	715–924	15	6,952,902	−0.57	0.0024
	924–1162	12	4,897,656	−0.44	0.0027
	1162–1507	4	2,570,941	−0.89	0.0017
	1507–2694	4	794,775	0.28	0.0056
Slope gradient/°	0–8	16	8,989,246	−0.76	0.0020
	8–15	23	6,698,970	−0.10	0.0038
	15–25	21	4,135,676	0.29	0.0056
	25–35	12	1,175,687	0.99	0.0113
	35–45	4	263,478	1.38	0.0169
	>45	5	45,431	3.37	0.1223
Lithology group	Extra-hard rock	45	15,716,967	−0.28	0.0032
	Hard rock	30	4,450,560	0.57	0.0075
	Soft rock	4	721,619	0.38	0.0062
	Extra-soft rock	2	401,503	0.27	0.0055
Distance to faults/m	<500	8	1,562,977	0.30	0.0057
	500–1000	6	1,590,516	−0.01	0.0042
	1000–1500	10	1,603,260	0.50	0.0069
	1500–2000	4	1,593,415	−0.41	0.0028
	>2000	53	14,958,243	−0.07	0.0039
Slope aspect	North	6	2,874,520	−0.60	0.0023
	Northeast	6	2,665,103	−0.52	0.0025
	East	6	2,713,061	−0.54	0.0025
	Southeast	8	2,311,122	−0.09	0.0038
	South	14	2,559,523	0.36	0.0061
	Southwest	24	2,584,545	0.89	0.0103
	West	12	2,931,061	0.07	0.0045
	Northwest	5	2,669,553	−0.71	0.0021
Distance to roads/m	<500	41	1,099,232	2.28	0.0414
	500–1000	2	982,690	−0.63	0.0023
	1000–1500	2	921,682	−0.56	0.0024
	1500–2000	1	882,297	−1.21	0.0013
	>2000	35	17,416,244	−0.64	0.0022
Distance to rivers/m	0–100	9	1,924,137	0.21	0.0052
	100–200	27	1,868,346	1.34	0.0161
	200–300	14	1,790,363	0.72	0.0087
	300–400	8	1,682,849	0.22	0.0053
	>400	23	14,042,793	−0.84	0.0018
Annual precipitation/mm	<700	18	4,524,190	0.05	0.0044
	700–800	41	7,781,530	0.33	0.0059
	800–900	9	5,299,708	−0.80	0.0019
	900–1100	3	1,846,085	−0.85	0.0018
	>1100	10	1,849,861	0.33	0.0059
Land type	Cultivation	13	929,169	1.30	0.0155
	Bush	2	662,587	−0.23	0.0034
	Grass	1	176,242	0.40	0.0063
	Residential land	1	236,892	0.10	0.0047
	River	3	192,095	1.41	0.0174
	Forest	61	19,105,160	−0.17	0.0035
NDVI	<0.1	7	1,306,736	0.34	0.0060
	0.1–0.3	11	1,534,455	0.63	0.0080
	0.3–0.5	22	3,657,943	0.46	0.0067
	0.5–0.7	40	14,605,211	−0.33	0.0030
	>0.7	1	187,856	0.34	0.0059
TWI	<9	43	7,275,471	0.44	0.0066
	9–11	23	9,193,638	−0.42	0.0028
	11–14	8	3,649,323	−0.55	0.0024
	14–18	5	964,716	0.31	0.0058
	>18	2	225,340	0.85	0.0099
Plan curvature	<−0.5	18	3,382,177	0.34	0.0059
	−0.5–0.5	43	14,431,242	−0.24	0.0033
	>0.5	20	3,495,068	0.41	0.0064
Profile curvature	<−0.5	24	4,378,283	0.37	0.0061
	−0.5–0.5	30	12,458,967	−0.46	0.0027
	>0.5	27	4,471,238	0.46	0.0067

Figure 3

The study area’s influencing factor maps: (a) altitude; (b) slope gradient; (c) slope aspect; (d) lithology; (e) distance to faults; (f) distance to roads; (g) distance to rivers; (h) annual precipitation; (i) land type; (j) normalized difference vegetation index (NDVI); (k) topographic wetness index (TWI); (l) plan curvature; (m) profile curvature.

It is clear that landslide occurrence increases with the slope gradient. The slope gradient class >45° has the highest IC value (3.37), and the lowest IC value is −0.76 for slope class 0–8°. In the case of slope aspect, the IC value is positive from south to west, with the maximum value (0.89) at southwest-facing slopes followed by south-facing (0.36) slopes. In terms of altitude, the IC values are positive for the ranges of 276–715 m and 1507–2694 m, with the highest value for altitudes between 276 and 715 m. For the lithology groups, the Hard rock class is associated with the highest IC value, whereas the Extra-hard rock class has the lowest IC value. The plan and profile curvature IC values are negative only for the range of −0.5–0.5, at −0.24 and −0.46, respectively. The concave (>0.5) and convex (<−0.5) slopes are positive for landslide susceptibility in the study area. As for land types, Cultivation and River are more susceptible to landslides. For TWI, the class >18 has the highest IC value, and for NDVI, the class 0.5–0.7 has the lowest IC value. In terms of distance to faults, the highest and lowest IC values are located in the intervals of 1000–1500 m and 1500–2000 m, respectively. In the case of distance to roads, the interval <500 m has the highest IC value, which means that the landslide susceptibility is higher in this area. River incision can cause instability of slopes by changing the groundwater level and eroding the slope toe. Generally, beyond 100 m the IC values decrease with increasing distance to rivers, and the IC value is negative only for the class >400 m. In the case of annual precipitation, the landslides are mainly distributed within 700–800 mm and >1100 mm, and the IC values of both classes are 0.33. The final ICM method landslide susceptibility map is shown in Figure 4.

Figure 4

The landslide susceptibility map for the CMA extracted with the ICM.

5.2. LSM Using the RS-ICM Method

The following predictors were selected as indices for LSM: altitude, slope gradient, slope aspect, lithology, distance to faults, distance to roads, distance to rivers, annual precipitation, land type, NDVI, TWI, plan curvature, and profile curvature. The 13 causative predictors were classified into grades 1, 2, 3, 4, 5, 6, 7, 8, and 9 according to landslide density as shown in Table 4. In ArcGIS software (version 10.2, Esri, Redlands, CA, USA), the values of all evaluation factor layers were extracted at the landslide locations, and the results were used as condition attributes. We generated landslide density maps (Figure 5) based on the landslide distribution. The landslides were divided into six levels, 1, 2, 3, 4, 5, and 6, using natural break classification according to the landslide density, and the results were used as decision attributes. The initial decision table, with 85 rows and 8 columns, was established by defining the density of landslides as the decision attribute set. The weights of the 13 predictors were calculated using the rough set method, as shown in Table 5. The final RS-ICM landslide susceptibility map is shown in Figure 6.

Table 4

Evaluation class based on landslide density.

Landslide Density (one/km²)	Landslide Susceptibility Grade
0.001–0.002	1
0.002–0.003	2
0.003–0.004	3
0.004–0.005	4
0.005–0.006	5
0.006–0.007	6
0.007–0.008	7
0.008–0.009	8
>0.009	9

Figure 5

Landslide density maps based on landslide distribution.

Table 5

Weights of 13 predictors using the rough set approach.

Predictors	X₁	X₂	X₃	X₄	X₅	X₆	X₇	X₈	X₉	X₁₀	X₁₁	X₁₂	X₁₃
Weights	0.0610	0.0244	0.0976	0.1585	0.0366	0.0122	0.0976	0.0732	0.1829	0.0488	0	0.1220	0.0854

Notes: X1: Altitude; X2: Slope gradient; X3: Slope aspect; X4: Lithology; X5: Distance to faults; X6: Distance to roads; X7: Distance to rivers; X8: Annual precipitation; X9: Land type; X10: NDVI; X11: TWI; X12: Plan curvature; X13: Profile curvature.

Figure 6

The landslide susceptibility map of the CMA extracted with the RS-ICM method.

5.3. LSM Using the AHP-ICM Method

In the study area, the 13 layers of landslide conditioning predictors were compared with each other to determine their relative importance using the analytic hierarchy process (AHP) mentioned above. The judgement matrix for these evaluation predictors is shown in Table 6. The CR = 0.0183, which is <0.1, indicates that the calculated weights are reasonable. The final landslide susceptibility map of the AHP-ICM method is shown in Figure 7.

Table 6

Pair-wise comparison matrix for influencing factor weights.

Heading	X₁	X₂	X₃	X₄	X₅	X₆	X₇	X₈	X₉	X₁₀	X₁₁	X₁₂	X₁₃	Weights
X₁	1	1/6	1/2	1/7	1/3	1/4	1	1/4	1/5	1/4	1/3	1/2	1/2	0.0218
X₂	6	1	4	1/2	3	2	5	3	1	1	2	3	3	0.1386
X₃	2	1/4	1	1/5	1	1/3	2	1/2	1/3	1/2	1/2	1	1	0.0405
X₄	7	2	5	1	4	2	5	3	1	2	3	4	4	0.1832
X₅	3	1/3	1	1/4	1	1/2	2	1/2	1/2	2	1	1	1	0.0586
X₆	4	1/2	3	1/2	2	1	3	1	1	1	1	2	2	0.0889
X₇	1	1/5	1/2	1/5	1/2	1/3	1	1/3	1/4	1/4	1/3	1/2	1/2	0.0252
X₈	4	1/3	2	1/3	2	1	3	1	1/2	1	1	2	2	0.0773
X₉	5	1	3	1	2	1	4	2	1	1	2	3	3	0.1221
X₁₀	4	1	2	1/2	1/2	1	4	1	1	1	1	2	2	0.0864
X₁₁	3	1/2	2	1/3	1	1	3	1	1/2	1	1	1	1	0.0661
X₁₂	2	1/3	1	1/4	1	1/2	2	1/2	1/3	1/2	1	1	1	0.0457
X₁₃	2	1/3	1	1/4	1	1/2	2	1/2	1/3	1/2	1	1	1	0.0457

Figure 7

The landslide susceptibility map for the CMA extracted with the AHP-ICM method.

5.4. LSM Using the EWM-ICM Method

The weights of causative predictors based on EWM were calculated according to the principles of entropy weight theory and are listed in Table 7. First, the IC values were multiplied by the weights in Table 7, and all the weighted factor maps were then aggregated. Finally, the maps were reclassified to produce the EWM-ICM-generated LSM (Figure 8).

Table 7

The weights of 13 predictors using the EWM approach.

Predictors	X₁	X₂	X₃	X₄	X₅	X₆	X₇	X₈	X₉	X₁₀	X₁₁	X₁₂	X₁₃
Weights	0.06	0.30	0.05	0.02	0.02	0.25	0.08	0.04	0.07	0.02	0.05	0.02	0.04

Figure 8

The CMA landslide susceptibility map extracted with the EWM-ICM method.

5.5. Validation

To determine the statistical reliability of the results, it is essential to validate the four models. For this purpose, the ROC curve was constructed, and the area under the ROC curve (AUC) was used for the quantitative comparison of the four models. The comparison results are shown in Figure 9. The success rate (Figure 9a) comes from the training dataset (70%, 81 landslides), and the prediction rate (Figure 9b) comes from the validation dataset (30%, 35 landslides). As shown in Figure 9, the ICM, RS-ICM, AHP-ICM, and EWM-ICM success rates were 0.931, 0.939, 0.912, and 0.883, and their prediction accuracy rates were 0.926, 0.927, 0.917, and 0.878, respectively. The Cohen’s kappa indexes were 0.721, 0.743, 0.720, and 0.663 for the ICM, RS-ICM, AHP-ICM, and EWM-ICM, respectively. These values indicate a substantial agreement between the observed and the predicted values for all four models.

Figure 9

Receiver operating characteristic (ROC) curve evaluation of the four models: (a) success rate curve; (b) prediction rate curve.
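The AUC used for validation can be computed without drawing the curve, through its rank interpretation: it equals the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell. The sketch below is a minimal illustration of that idea; the scores are hypothetical, and how the non-landslide samples were drawn in the study is not stated in this section.

```python
def auc(positive_scores, negative_scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in positive_scores:
        for q in negative_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

# Hypothetical susceptibility scores for landslide and non-landslide cells.
landslide_scores = [0.9, 0.8, 0.4]
non_landslide_scores = [0.7, 0.3, 0.2]
example_auc = auc(landslide_scores, non_landslide_scores)
print(round(example_auc, 3))
```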

The susceptibility maps were classified as low, moderate, high, and very high based on a natural break approach. Table 8 displays the total area and ratios of low, moderate, high and very high susceptibility for the four models. The ICM, RS-ICM, and AHP-ICM methods with better LSM performance in the CMA have approximately the same result in area ratio of different landslide susceptibility classes, and they have relatively large differences compared to EWM-ICM.

Table 8

Distribution of area in different landslide susceptibility classes.

Model	Area (km²) / Ratio (%)	Low susceptibility	Moderate susceptibility	High susceptibility	Very high susceptibility
ICM	Area	642.67	676.50	414.96	181.83
ICM	Ratio	33.54	35.31	21.66	9.49
RS-ICM	Area	522.98	727.38	486.82	178.77
RS-ICM	Ratio	27.30	37.96	25.41	9.33
AHP-ICM	Area	515.62	697.16	505.96	197.22
AHP-ICM	Ratio	26.91	36.39	26.41	10.29
EWM-ICM	Area	165.52	704.41	824.07	221.96
EWM-ICM	Ratio	8.64	36.77	43.01	11.58

6. Discussion

6.1. Spatial Consistency of the Four Models

The Cohen’s kappa indexes were calculated to measure the spatial consistency of the four models. The results are shown in Table 9, which indicates a substantial agreement between the ICM and AHP-ICM, and a moderate agreement between the RS-ICM and AHP-ICM and between the ICM and RS-ICM. For the RS-ICM and EWM-ICM, the Cohen’s kappa index was 0.140, indicating only slight agreement. It should be noted that the Cohen’s kappa indexes between EWM-ICM, which has the lowest performance for LSM in the CMA, and the other three models are small (0.140–0.325), which indicates that EWM-ICM differs considerably from the other three models in spatial pattern, whereas the ICM, RS-ICM, and AHP-ICM methods yield similar results.

Table 9

Cohen’s kappa index between two models.

Models	ICM and RS-ICM	ICM and AHP-ICM	ICM and EWM-ICM	RS-ICM and AHP-ICM	RS-ICM and EWM-ICM	AHP-ICM and EWM-ICM
Cohen’s Kappa Index	0.595	0.698	0.325	0.484	0.140	0.286

6.2. Predisposing Factors Analysis of Information

The ICM is a simple and effective tool for landslide susceptibility assessment, and its information values represent the contribution of each predictor class to landslide occurrence. The number of landslides increases with slope gradient, because slope gradient affects not only the stress distribution inside the slope mass but also the depth of the weathering layer and the surface runoff [58]. Landslides are, however, mainly concentrated in the altitude range of 276–715 m, because the catchment areas are mainly concentrated in this range. Notably, landslides developed primarily within the hard rock lithology group rather than the soft rock group. The hard rock group is mainly composed of basalt and trachyte, which are prone to rock fall, and the landslides in the study area show two failure modes, slump-tensile rupture and creep-tensile rupture, both of which are related to the development of rock fall. The landslides develop in the weak layer of high, steep slopes in the hard rock group. Once a rock fall occurs at the front edge of the slope, the trailing edge breaks through along the weak layer and gradually forms a connected slip surface, producing a stepped rock fall-landslide; these conditions are consistent with those under which landslide disasters occur. For distance to faults, distance to roads, and distance to rivers, landslide occurrence tends to decrease as the distance increases. The reasons can be summarized as follows: (a) strong weathering and well-developed rock structural planes in fault zones provide favorable conditions for landslides; (b) roads increase stress and strain on the back of the slope, resulting in slope disturbance and failure; (c) the degree of surface incision is directly related to the development of drainage, so the closer a slope is to a river, the more severe the erosion and the more landslides occur. For slope aspect, south-facing slopes (south, southeast, and southwest) were more prone to landslides. The reason is that these slopes receive more sunlight or are affected by the orientation of the discontinuities controlling the landslides, which agrees with the conclusion of Du [58]. Annual precipitation is one of the major triggering factors of landslides [71]. It is generally accepted that the probability of landslide occurrence increases with rainfall, but the results show negative IC values in the 800–900 mm and 900–1000 mm intervals, mainly because these two intervals lie within the Extra-hard rock group, which is not prone to landslides. Land type is also generally accepted to play a crucial role in landslide distribution [36]. In the study area, landslides developed primarily in the Cultivation and Residential land types, where human activity is most intense.
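To make the sign of the information values concrete, the following minimal sketch computes them for the classes of a single predictor. It assumes the commonly used form of the information content model, I_i = ln((N_i/N)/(S_i/S)), where N_i and S_i are the landslide count and the area (cell count) of class i, and N and S are the corresponding totals; the exact form used in this study may differ (for example, in the base of the logarithm). The function name and all counts are hypothetical and are not data from the study.

```python
import math

def information_values(landslides_per_class, cells_per_class):
    """Return ln((N_i/N) / (S_i/S)) for each class of one predictor.

    Assumed form of the information content model. Classes with no
    landslides or no area get None because the logarithm is undefined.
    """
    total_landslides = sum(landslides_per_class)
    total_cells = sum(cells_per_class)
    values = []
    for n_i, s_i in zip(landslides_per_class, cells_per_class):
        if n_i == 0 or s_i == 0:
            values.append(None)
            continue
        landslide_share = n_i / total_landslides  # share of all landslides in class i
        area_share = s_i / total_cells            # share of the study area in class i
        values.append(math.log(landslide_share / area_share))
    return values

# Hypothetical counts for three classes of one predictor (not data from the study).
example = information_values([2, 30, 8], [4000, 3000, 3000])
```

A negative value means that the class holds a smaller share of the landslides than of the area. This is how a high-rainfall class can still receive a negative IC value when few landslides fall inside it.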

6.3. Comparative Analysis of Three Knowledge-Driven Methods

The ICM, an objective evaluation method commonly applied in statistical analysis, is suitable for LSM [24]. Four landslide susceptibility maps were produced, one with the ICM alone and three by combining the ICM with a knowledge-driven weighting method, to help mitigate the social and economic losses induced by landslides in the CMA. In a comprehensive evaluation, the key issue is to determine the weight of each predictor, which reflects the relative importance of each evaluation indicator. Once the evaluation object and the evaluation indicators are fixed, the LSM result depends entirely on the weights. Hence, the choice of weighting method directly affects the rationality and credibility of the landslide susceptibility zoning. In this study, three knowledge-driven methods, covering both subjective and objective evaluation, were used to produce landslide susceptibility maps. RS theory is an effective tool for dealing with vague and uncertain information. Peng et al. [60] described RS theory as an attribute reduction tool for identifying the significant environmental parameters of a landslide. Liu et al. [72] used RS theory to clarify the relationship between landslides and environmental factors in the Qinggan River of the Three Gorges area. The AHP is an expert-based evaluation method that is often applied in landslide susceptibility assessment and mapping [63]. However, the AHP has been criticized for its inability to adequately handle the ambiguity and imprecision involved in converting the linguistic labels attached to the ratio scale into the crisp numbers used in the comparison matrix [73]. The EWM is an objective weighting method that determines the criteria weights by solving mathematical models without any consideration of the decision maker's preferences. However, its results are sometimes contrary to the actual situation, and it is difficult to give a clear explanation for them. According to the success accuracy results, the RS-ICM (AUC = 0.939) performed best for LSM in the CMA, and the ICM (AUC = 0.931) and AHP-ICM (AUC = 0.912) performed better than the EWM-ICM (AUC = 0.883).
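As an illustration of why EWM weights depend only on the data, the sketch below implements the textbook entropy weight calculation: each column of a non-negative decision matrix is converted to proportions, its entropy is computed, and the weights are proportional to one minus the entropy. This is an assumed, generic formulation; the normalization and input matrix used in this study are not given in this section, and the function name and numbers are hypothetical.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for the columns of a non-negative decision matrix.

    Rows are evaluation objects and columns are predictors. Requires at
    least two rows and a positive sum in every column.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    k = 1.0 / math.log(n_rows)  # scales the entropy into [0, 1]
    divergences = []
    for j in range(n_cols):
        column = [row[j] for row in matrix]
        total = sum(column)
        # Proportions p_ij = x_ij / total; entries with p = 0 contribute 0.
        entropy = -k * sum((x / total) * math.log(x / total)
                           for x in column if x > 0)
        # Clamp tiny negative values caused by floating-point rounding.
        divergences.append(max(0.0, 1.0 - entropy))
    total_divergence = sum(divergences)
    if total_divergence == 0:
        # Every column is uniform, so no predictor is more informative.
        return [1.0 / n_cols] * n_cols
    return [d / total_divergence for d in divergences]

# Hypothetical matrix: the first column is constant, the second varies.
weights = entropy_weights([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
```

In this toy case the constant column receives a weight of essentially zero and the varying column receives essentially all of the weight. The method rewards dispersion in the data regardless of physical meaning, which is one way its results can run contrary to expert expectation.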

6.4. Importance of Predictors

The weight assignment of landslide conditioning predictors is the basis of LSM. This paper focuses on how different knowledge-driven methods for weighting a series of parameters influence the accuracy of LSM. Figure 10 displays the weights obtained with the RS, AHP, and EWM methods. With RS, land type (0.1829), lithology (0.1585), and plan curvature (0.1220) had the highest weights. With AHP, lithology (0.1832), slope gradient (0.1386), and land type (0.1221) had the highest weights, and with EWM, slope gradient (0.3) and distance to roads (0.25) had the highest weights. The results show that lithology and land type are crucial predictors for LSM in the CMA: both rank highly in the RS-ICM (AUC = 0.939) and the AHP-ICM (AUC = 0.912), which perform better than the EWM-ICM (AUC = 0.883). Other studies report different rankings. Chen et al. [52] found that slope gradient, altitude, and rainfall had the highest importance in landslide occurrence. Kawabata and Bandibas [74] mentioned that geology is the most important factor. Meinhardt et al. [75] reported that slope gradient, lithology, and precipitation increase have higher importance in landslide occurrence. Pham et al. [76] mentioned that distance to roads, slope gradient, elevation, and rainfall have higher importance in landslide occurrence, which is consistent with the results of Youssef et al. [77]. Overall, slope gradient, lithology, distance to roads, and land type emerge as the four most important predictors affecting landslide susceptibility in the CMA.

Figure 10

Bar chart of the weights assigned to the thirteen predictors by the three methods.
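The role of the weights can be sketched as follows, assuming that each weighted model combines the information values of the classes a map cell falls into as a weighted sum, LSI = Σ w_j · I_j. This is a common construction, but it is an assumption here because this section does not spell out the combination rule. The three RS weights are those reported above; the information values and the function name are hypothetical.

```python
def weighted_susceptibility(cell_info_values, weights):
    """Weighted sum of information values for one map cell.

    Both arguments are dicts keyed by predictor name; only predictors
    present in `weights` are included in the sum.
    """
    return sum(weights[name] * cell_info_values[name] for name in weights)

# RS weights for three predictors as reported in the text;
# the remaining ten predictors are omitted from this sketch.
rs_weights = {"lithology": 0.1585, "land_type": 0.1829, "plan_curvature": 0.1220}

# Hypothetical information values of the classes that one cell falls into.
cell = {"lithology": 0.8, "land_type": 1.2, "plan_curvature": -0.1}

index = weighted_susceptibility(cell, rs_weights)
```

Under this assumption, a predictor with a large weight, such as land type under RS, dominates the index whenever its class carries a large information value. This is why the choice of weighting method can shift the resulting susceptibility map.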

7. Conclusions

Landslides are one of the most frequent geomorphic hazards and often result in loss of property and human life in the CMA. In this research, 13 layers of landslide conditioning predictors, namely, altitude, slope gradient, slope aspect, lithology, distance to faults, distance to roads, distance to rivers, annual precipitation, land type, NDVI, TWI, plan curvature, and profile curvature, were taken as independent, causal predictors. Three knowledge-driven methods, that is, RS, AHP, and EWM, were combined with the ICM to produce landslide susceptibility maps of the CMA, and their performance was compared. The influence of the different knowledge-driven methods for a series of parameters on the accuracy of LSM was explored. The results demonstrate that all four models predict landslide susceptibility well. The RS-ICM performed best for LSM in the CMA, followed by the ICM and the AHP-ICM, whose accuracies were slightly higher than that of the EWM-ICM. In addition, the importance of different predictors in landslide occurrence was investigated, and it was concluded that lithology and land type are crucial predictors for LSM in the CMA. The four landslide susceptibility maps show that the high-susceptibility areas consist mainly of three parts: the region within a radius of 13,000 m around the Tianchi volcanic cone together with the mountain area southwest of Changbaishan Tianchi, the hard rock group area, and the areas surrounding roads. The outcome of this research is useful for future development planning and disaster management, and it can help engineers reduce losses caused by existing and future landslides through prevention, mitigation, and avoidance.
