Ataollah Shirzadi, Karim Soliamani, Mahmood Habibnejhad, Ataollah Kavian, Kamran Chapi, Himan Shahabi, Wei Chen, Khabat Khosravi, Binh Thai Pham, Biswajeet Pradhan, Anuar Ahmad, Baharin Bin Ahmad, Dieu Tien Bui.
Abstract
The main objective of this research was to introduce novel machine learning ensembles that combine the alternating decision tree (ADTree) with the multiboost (MB), bagging (BA), rotation forest (RF) and random subspace (RS) ensemble algorithms, under two scenarios of different sample sizes and raster resolutions, for spatial prediction of shallow landslides around Bijar City, Kurdistan Province, Iran. The models were evaluated using several statistical measures and the area under the receiver operating characteristic curve (AUROC). Results show that the RS model obtained a high goodness-of-fit and prediction accuracy for sample sizes of 60%/40% and 70%/30% with a raster resolution of 10 m, while the MB model did so for sample sizes of 80%/20% and 90%/10% with a raster resolution of 20 m. The RS-ADTree and MB-ADTree ensemble models outperformed the ADTree model in the two scenarios. Overall, MB-ADTree with a sample size of 80%/20% and a resolution of 20 m (area under the curve (AUC) = 0.942) and with a sample size of 60%/40% and a resolution of 10 m (AUC = 0.845) had the highest and lowest prediction accuracy, respectively. The findings confirm that the newly proposed models are very promising alternative tools to assist planners and decision makers in the task of managing landslide-prone areas.
Keywords: GIS; Iran; alternating decision tree; landslide; machine learning algorithms
Year: 2018 PMID: 30400627 PMCID: PMC6263474 DOI: 10.3390/s18113777
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Location of landslides in the study area in Kurdistan Province of Iran.
Figure 2. Lithological map of the study area.
Landslide conditioning factors and their classes for landslide modeling in Bijar City.
| Category | No. | Landslide Causal Factor | Classes |
|---|---|---|---|
| Topographic factors | 1 | Slope (°) | (1) 0–5; (2) 5–10; (3) 10–15; (4) 15–20; (5) 20–25; (6) 25–30; (7) 30–45; (8) >45 |
| Topographic factors | 2 | Aspect | (1) Flat; (2) North; (3) Northeast; (4) East; (5) Southeast; (6) South; (7) Southwest; (8) West; (9) Northwest |
| Topographic factors | 3 | Elevation (m) | (1) 1573–1700; (2) 1700–1800; (3) 1800–1900; (4) 1900–2000; (5) 2000–2100; (6) 2100–2200; (7) 2200–2300; (8) 2300–2400; (9) >2400 |
| Topographic factors | 4 | Curvature (m⁻¹) | (1) [(−12.5)–(−1.4)]; (2) [(−1.4)–(−0.4)]; (3) [(−0.4)–(−0.2)]; (4) [(−0.2)–0.9]; (5) [0.9–2.5]; (6) [2.5–15.6] |
| Topographic factors | 5 | Plan curvature (m⁻¹) | (1) [(−6.7)–(−0.8)]; (2) [(−0.8)–(−0.2)]; (3) [(−0.2)–0]; (4) [0–0.4]; (5) [0.4–1.1]; (6) [1.1–10.4] |
| Topographic factors | 6 | Profile curvature (m⁻¹) | (1) [(−10.7)–(−1.7)]; (2) [(−1.7)–(−0.7)]; (3) [(−0.7)–(−0.2)]; (4) [(−0.2)–0.2]; (5) [0.2–0.9]; (6) [0.9–7.5] |
| Topographic factors | 7 | STI | (1) 0–7; (2) 7–14; (3) 14–21; (4) 21–28; (5) 28–35; (6) 35–42 |
| Hydrological factors | 8 | Rainfall (mm) | (1) 263–270; (2) 270–300; (3) 300–330; (4) 330–360; (5) 360–390; (6) 390–420; (7) 420–450 |
| Hydrological factors | 9 | Annual solar radiation (h) | (1) 3.015–6.563; (2) 5.563–6.747; (3) 6.747–6.849; (4) 6.849–6.930; (5) 6.930–7.073; (6) 7.073–7.236; (7) 7.236–8.215 |
| Hydrological factors | 10 | SPI | (1) 0–998; (2) 998–6986; (3) 6986–19,961; (4) 19,961–45,911; (5) 45,911–101,803; (6) 101,803–255,505 |
| Hydrological factors | 11 | TWI | (1) 1–3; (2) 3–4; (3) 4–6; (4) 6–8; (5) 8–9; (6) 9–11 |
| Hydrological factors | 12 | Distance to rivers (m) | (1) 0–50; (2) 50–100; (3) 100–150; (4) 150–200; (5) >200 |
| Hydrological factors | 13 | River density (km/km²) | (1) 0–1.9; (2) 1.9–3.2; (3) 3.2–4.2; (4) 4.2–5.2; (5) 5.2–6.3; (6) 6.3–7.8; (7) 7.8–13.2 |
| Lithological factors | 14 | Lithology | (1) Quaternary; (2) Tertiary; (3) Cretaceous |
| Lithological factors | 15 | Distance to faults (m) | (1) 0–200; (2) 200–400; (3) 400–600; (4) 600–800; (5) 800–1000; (6) >1000 |
| Lithological factors | 16 | Fault density (km/km²) | (1) 0–0.3; (2) 0.3–0.8; (3) 0.8–1.2; (4) 1.2–1.7; (5) 1.7–2.1; (6) 2.1–2.5; (7) 2.5–3.2 |
| Land cover factors | 17 | Land use | (1) Residential area; (2) Arable land (dry farming and cultivated lands); (3) Woodland; (4) Grassland; (5) Barren land |
| Land cover factors | 18 | NDVI | (1) [(−0.23)–(−0.061)]; (2) [(−0.061)–(−0.0081)]; (3) [(−0.0081)–0.060]; (4) [0.060–0.14]; (5) [0.14–0.24]; (6) [0.24–0.41]; (7) [0.41–0.73] |
| Anthropogenic factors | 19 | Distance to roads (m) | (1) 0–50; (2) 50–100; (3) 100–150; (4) 150–200; (5) >200 |
| Anthropogenic factors | 20 | Road density (km/km²) | (1) 0–0.0013; (2) 0.0013–0.0027; (3) 0.0027–0.0041; (4) 0.0041–0.0055; (5) 0.0055–0.0069; (6) 0.0069–0.0083; (7) 0.0083–0.0097 |
Factor selection based on the information gain ratio technique.
| Conditioning Factor | 10 m, 60%/40% (AM) | 10 m, 60%/40% (R) | 10 m, 70%/30% (AM) | 10 m, 70%/30% (R) | 10 m, 80%/20% (AM) | 10 m, 80%/20% (R) | 10 m, 90%/10% (AM) | 10 m, 90%/10% (R) | 20 m, 60%/40% (AM) | 20 m, 60%/40% (R) | 20 m, 70%/30% (AM) | 20 m, 70%/30% (R) | 20 m, 80%/20% (AM) | 20 m, 80%/20% (R) | 20 m, 90%/10% (AM) | 20 m, 90%/10% (R) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Slope angle | 0.105 | 2 | 0.482 | 1 | 0.509 | 1 | 0.484 | 1 | 0.135 | 2 | 0.655 | 1 | 0.459 | 1 | 0.481 | 1 |
| TWI | 0.142 | 1 | 0.597 | 2 | 0.427 | 2 | 0.409 | 2 | 0.142 | 1 | 0.482 | 2 | 0.428 | 2 | 0.409 | 2 |
| Aspect | 0.071 | 3 | 0.065 | 10 | 0.058 | 11 | 0.088 | 6 | 0.071 | 3 | 0.072 | 9 | 0.065 | 7 | 0.085 | 6 |
| STI | 0.064 | 4 | 0.195 | 4 | 0.172 | 4 | 0.186 | 4 | 0.064 | 4 | 0.195 | 4 | 0.173 | 3 | 0.186 | 3 |
| Profile curvature | 0.005 | 5 | 0.042 | 12 | 0.094 | 7 | 0.031 | 12 | 0 | - | 0.032 | 12 | 0.011 | 12 | 0 | - |
| Plan curvature | 0 | - | 0.221 | 3 | 0.174 | 3 | 0.191 | 3 | 0 | - | 0.440 | 3 | 0.172 | 4 | 0.167 | 4 |
| Elevation | 0 | - | 0.096 | 7 | 0.086 | 8 | 0.095 | 5 | 0 | - | 0.096 | 7 | 0.059 | 9 | 0.095 | 5 |
| Curvature | 0 | - | 0.114 | 5 | 0.106 | 5 | 0.085 | 7 | 0 | - | 0.065 | 10 | 0.022 | 11 | 0.046 | 11 |
| Land use | 0 | - | 0.064 | 9 | 0.058 | 11 | 0.050 | 11 | 0 | - | 0.080 | 8 | 0.058 | 10 | 0.070 | 8 |
| Rainfall | 0 | - | 0.051 | 11 | 0.064 | 10 | 0.057 | 10 | 0 | - | 0.051 | 11 | 0.065 | 8 | 0.057 | 10 |
| SPI | 0 | - | 0.070 | 8 | 0.075 | 9 | 0.071 | 9 | 0 | - | 0.116 | 6 | 0.076 | 6 | 0.071 | 9 |
| Solar radiation | 0 | - | 0.099 | 6 | 0.092 | 6 | 0.081 | 8 | 0 | - | 0.119 | 5 | 0.077 | 5 | 0.076 | 7 |
AM, Average Merit; R, Rank.
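
As a rough illustration of what an information gain ratio measures, the sketch below scores one categorical conditioning factor against a binary landslide label. It is a minimal textbook version under stated assumptions, not the authors' procedure: how the average merit (AM) values in the table were obtained is not described here, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def gain_ratio(factor_classes, labels):
    """Information gain of a factor divided by the factor's own split entropy."""
    n = len(labels)
    groups = {}
    for cls, y in zip(factor_classes, labels):
        groups.setdefault(cls, []).append(y)
    # Label entropy remaining after partitioning the samples by factor class.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(factor_classes)
    # A factor with a single class carries no information.
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: slope class per sample, landslide (1) or not (0).
slope_class = [1, 1, 2, 2, 3, 3, 3, 3]
landslide = [0, 0, 0, 1, 1, 1, 1, 1]
gr = gain_ratio(slope_class, landslide)
print(round(gr, 4))
```

A factor whose gain ratio is 0 adds nothing to the separation of landslide and non-landslide samples, which is the sense in which factors with AM = 0 in the table would be candidates for removal.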
Figure 3. Flowchart of modeling process and methodology used in this study.
Model performance using training dataset and ADTree algorithm.
| Statistical Measure | 10 m, 60%/40% | 10 m, 70%/30% | 10 m, 80%/20% | 10 m, 90%/10% | 20 m, 60%/40% | 20 m, 70%/30% | 20 m, 80%/20% | 20 m, 90%/10% |
|---|---|---|---|---|---|---|---|---|
| TP | 60 | 77 | 85 | 91 | 55 | 72 | 81 | 89 |
| TN | 47 | 81 | 76 | 89 | 48 | 78 | 82 | 92 |
| FP | 7 | 0 | 4 | 9 | 12 | 9 | 8 | 11 |
| FN | 20 | 4 | 13 | 11 | 19 | 3 | 7 | 8 |
| SST (sensitivity) | 0.750 | 0.951 | 0.867 | 0.892 | 0.743 | 0.960 | 0.920 | 0.918 |
| SPF (specificity) | 0.870 | 1.000 | 0.950 | 0.908 | 0.800 | 0.897 | 0.911 | 0.893 |
| ACC (accuracy) | 0.799 | 0.975 | 0.904 | 0.900 | 0.769 | 0.926 | 0.916 | 0.905 |
| Kappa | 0.597 | 0.950 | 0.809 | 0.800 | 0.537 | 0.851 | 0.831 | 0.810 |
| RMSE | 0.351 | 0.157 | 0.291 | 0.300 | 0.407 | 0.239 | 0.273 | 0.298 |
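
The count-based measures in the table follow from the four confusion-matrix cells. The sketch below uses the standard definitions of sensitivity, specificity, accuracy and Cohen's kappa, and takes its input from the 10 m, 60%/40% training column (TP = 60, TN = 47, FP = 7, FN = 20). The function name is hypothetical, and RMSE is left out because it depends on predicted probabilities that are not listed here.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Standard binary-classification measures from confusion-matrix counts."""
    n = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)   # share of landslide samples detected
    specificity = tn / (tn + fp)   # share of non-landslide samples detected
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (accuracy - expected) / (1 - expected)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "kappa": kappa,
    }


# Counts from the 10 m, 60%/40% training column of the table above.
m = confusion_metrics(tp=60, tn=47, fp=7, fn=20)
print({name: round(value, 3) for name, value in m.items()})
```

For this column the rounded results are 0.750, 0.870, 0.799 and 0.597, matching the tabulated sensitivity, specificity, accuracy and kappa.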
Model performance using validation dataset and ADTree algorithm.
| Statistical Measure | 10 m, 60%/40% | 10 m, 70%/30% | 10 m, 80%/20% | 10 m, 90%/10% | 20 m, 60%/40% | 20 m, 70%/30% | 20 m, 80%/20% | 20 m, 90%/10% |
|---|---|---|---|---|---|---|---|---|
| TP | 26 | 27 | 17 | 10 | 19 | 27 | 18 | 10 |
| TN | 37 | 23 | 19 | 10 | 25 | 22 | 20 | 10 |
| FP | 18 | 3 | 5 | 1 | 35 | 3 | 4 | 1 |
| FN | 7 | 7 | 3 | 1 | 9 | 8 | 1 | 1 |
| SST (sensitivity) | 0.788 | 0.794 | 0.850 | 0.909 | 0.679 | 0.771 | 0.947 | 0.909 |
| SPF (specificity) | 0.673 | 0.885 | 0.792 | 0.909 | 0.417 | 0.880 | 0.833 | 0.909 |
| ACC (accuracy) | 0.716 | 0.833 | 0.818 | 0.909 | 0.500 | 0.817 | 0.884 | 0.909 |
| Kappa | 0.631 | 0.666 | 0.636 | 0.818 | 0.572 | 0.633 | 0.727 | 0.818 |
| RMSE | 0.363 | 0.182 | 0.390 | 0.331 | 0.484 | 0.256 | 0.342 | 0.309 |
Figure 4. The effects of sample size and raster resolution on the performance of landslide modeling: (a) sample size of 60%/40%; (b) sample size of 70%/30%; (c) sample size of 80%/20%; and (d) sample size of 90%/10%.
Optimal values of the number of iterations and the seed for different sample sizes and raster resolutions using the ensemble models.
| Ensemble Model | 90%/10%, 20 m (S) | 90%/10%, 20 m (I) | 80%/20%, 20 m (S) | 80%/20%, 20 m (I) | 70%/30%, 10 m (S) | 70%/30%, 10 m (I) | 60%/40%, 10 m (S) | 60%/40%, 10 m (I) |
|---|---|---|---|---|---|---|---|---|
| MB | 7 | 15 | 5 | 11 | 3 | 10 | 1 | 14 |
| BA | 3 | 10 | 4 | 10 | 6 | 10 | 8 | 10 |
| RS | 4 | 10 | 8 | 10 | 1 | 11 | 7 | 16 |
| RF | 6 | 15 | 3 | 13 | 5 | 13 | 1 | 14 |
I, iteration; S, seed.
Figure 5. Trends in the seed number and the number of iterations during the landslide modeling process: (a) optimum number of iterations and (b) optimum seed number for the combination of 60%/40% with a raster resolution of 10 m; (c) optimum number of iterations and (d) optimum seed number for the combination of 70%/30% with a raster resolution of 10 m; (e) optimum number of iterations and (f) optimum seed number for the combination of 80%/20% with a raster resolution of 20 m; (g) optimum number of iterations and (h) optimum seed number for the combination of 90%/10% with a raster resolution of 20 m.
Results of ensemble modeling for the combination of 60%/40% and a raster resolution of 10 m.
| Criteria | ADTree (T) | ADTree (V) | RF (T) | RF (V) | RS (T) | RS (V) | BA (T) | BA (V) | MB (T) | MB (V) |
|---|---|---|---|---|---|---|---|---|---|---|
| True positive | 60 | 26 | 46 | 26 | 60 | 30 | 48 | 27 | 52 | 29 |
| True negative | 47 | 37 | 61 | 36 | 63 | 36 | 59 | 37 | 63 | 33 |
| False positive | 7 | 18 | 21 | 18 | 7 | 14 | 19 | 17 | 15 | 15 |
| False negative | 20 | 7 | 6 | 8 | 4 | 8 | 8 | 7 | 4 | 11 |
| Sensitivity | 0.750 | 0.788 | 0.885 | 0.765 | 0.938 | 0.789 | 0.857 | 0.794 | 0.929 | 0.725 |
| Specificity | 0.870 | 0.673 | 0.744 | 0.667 | 0.900 | 0.720 | 0.756 | 0.685 | 0.808 | 0.688 |
| Accuracy | 0.799 | 0.716 | 0.799 | 0.705 | 0.918 | 0.750 | 0.799 | 0.727 | 0.858 | 0.705 |
| AUROC | 0.864 | 0.737 | 0.907 | 0.796 | 0.974 | 0.791 | 0.889 | 0.788 | 0.940 | 0.756 |
T, training dataset; V, validation dataset.
Results of ensemble modeling for the combination of 70%/30% and a raster resolution of 10 m.
| Criteria | ADTree (T) | ADTree (V) | RF (T) | RF (V) | RS (T) | RS (V) | BA (T) | BA (V) | MB (T) | MB (V) |
|---|---|---|---|---|---|---|---|---|---|---|
| True positive | 77 | 27 | 73 | 28 | 76 | 28 | 75 | 28 | 80 | 28 |
| True negative | 81 | 23 | 77 | 22 | 79 | 21 | 78 | 21 | 78 | 22 |
| False positive | 0 | 3 | 8 | 2 | 5 | 2 | 6 | 2 | 1 | 8 |
| False negative | 4 | 7 | 4 | 8 | 2 | 9 | 3 | 9 | 3 | 2 |
| Sensitivity | 0.951 | 0.794 | 0.948 | 0.778 | 0.974 | 0.757 | 0.962 | 0.757 | 0.964 | 0.933 |
| Specificity | 1.000 | 0.885 | 0.906 | 0.917 | 0.940 | 0.913 | 1.000 | 0.913 | 0.987 | 0.733 |
| Accuracy | 0.975 | 0.833 | 0.926 | 0.833 | 0.957 | 0.817 | 0.981 | 0.817 | 0.975 | 0.833 |
| AUROC | 0.979 | 0.862 | 0.984 | 0.898 | 0.997 | 0.901 | 0.983 | 0.893 | 0.996 | 0.892 |
T, training dataset; V, validation dataset.
Results of ensemble modeling for the combination of 80%/20% and a raster resolution of 20 m.
| Criteria | ADTree (T) | ADTree (V) | RF (T) | RF (V) | RS (T) | RS (V) | BA (T) | BA (V) | MB (T) | MB (V) |
|---|---|---|---|---|---|---|---|---|---|---|
| True positive | 81 | 18 | 81 | 19 | 78 | 18 | 81 | 18 | 81 | 18 |
| True negative | 82 | 20 | 82 | 20 | 82 | 20 | 82 | 20 | 82 | 20 |
| False positive | 8 | 4 | 8 | 3 | 11 | 4 | 8 | 4 | 8 | 4 |
| False negative | 7 | 1 | 7 | 2 | 7 | 2 | 7 | 2 | 7 | 2 |
| Sensitivity | 0.920 | 0.947 | 0.920 | 0.905 | 0.918 | 0.900 | 0.920 | 0.900 | 0.920 | 0.900 |
| Specificity | 0.911 | 0.833 | 0.911 | 0.870 | 0.882 | 0.833 | 0.911 | 0.833 | 0.911 | 0.833 |
| Accuracy | 0.916 | 0.884 | 0.916 | 0.886 | 0.899 | 0.864 | 0.916 | 0.864 | 0.916 | 0.864 |
| AUROC | 0.967 | 0.903 | 0.987 | 0.937 | 0.972 | 0.926 | 0.974 | 0.926 | 0.988 | 0.934 |
T, training dataset; V, validation dataset.
Results of ensemble landslide modeling for the combination of 90%/10% and a raster resolution of 20 m.
| Criteria | ADTree (T) | ADTree (V) | RF (T) | RF (V) | RS (T) | RS (V) | BA (T) | BA (V) | MB (T) | MB (V) |
|---|---|---|---|---|---|---|---|---|---|---|
| True positive | 89 | 10 | 96 | 10 | 88 | 10 | 87 | 10 | 92 | 10 |
| True negative | 92 | 10 | 94 | 10 | 92 | 9 | 93 | 10 | 95 | 10 |
| False positive | 11 | 1 | 4 | 1 | 12 | 1 | 13 | 1 | 8 | 1 |
| False negative | 8 | 1 | 6 | 1 | 8 | 2 | 7 | 1 | 5 | 1 |
| Sensitivity | 0.918 | 0.909 | 0.941 | 0.909 | 0.917 | 0.833 | 0.926 | 0.909 | 0.948 | 0.909 |
| Specificity | 0.893 | 0.909 | 0.959 | 0.909 | 0.885 | 0.900 | 0.877 | 0.909 | 0.922 | 0.909 |
| Accuracy | 0.905 | 0.909 | 0.950 | 0.909 | 0.900 | 0.864 | 0.900 | 0.909 | 0.935 | 0.909 |
| AUROC | 0.957 | 0.876 | 0.983 | 0.913 | 0.968 | 0.884 | 0.968 | 0.921 | 0.992 | 0.926 |
T, training dataset; V, validation dataset.
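
The AUROC rows in the tables above summarize how well predicted susceptibility scores rank landslide samples above non-landslide samples. A minimal sketch of that rank-based definition follows. It is not the authors' evaluation code, and the function name and the scores and labels are hypothetical toy values.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical susceptibility scores with landslide (1) / non-landslide (0) labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.1]
labels = [1, 1, 0, 1, 0, 0]
auc = auroc(scores, labels)
print(round(auc, 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for validation sets of the size reported here. A sort-based version would be preferable for full raster grids.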
Figure 6. Landslide susceptibility mapping prepared by the ADTree model and its ensembles: (a) ADTree, sample size 60/40 and resolution 10 m; (b) RS-ADTree, sample size 60/40 and resolution 10 m; (c) ADTree, sample size 70/30 and resolution 10 m; (d) RS-ADTree, sample size 70/30 and resolution 10 m; (e) ADTree, sample size 80/20 and resolution 20 m; (f) MB-ADTree, sample size 80/20 and resolution 20 m; (g) ADTree, sample size 90/10 and resolution 20 m; (h) MB-ADTree, sample size 90/10 and resolution 20 m.
Figure 7. Model comparison and evaluation of the ADTree and its ensembles in different sample sizes and raster resolutions using: training dataset (a,c,e,g); and validation dataset (b,d,f,h).