Dieu Tien Bui, Ataollah Shirzadi, Himan Shahabi, Kamran Chapi, Ebrahim Omidavr, Binh Thai Pham, Dawood Talebpour Asl, Hossein Khaledian, Biswajeet Pradhan, Mahdi Panahi, Baharin Bin Ahmad, Hosein Rahmani, Gyula Gróf, Saro Lee.
Abstract
In this study, we introduce a novel hybrid artificial intelligence approach, RF-ADTree, which uses rotation forest (RF) as a meta/ensemble classifier with the alternating decision tree (ADTree) as its base classifier, to spatially predict gully erosion in the Klocheh watershed of Kurdistan province, Iran. A total of 915 gully erosion locations and 22 gully conditioning factors were used to construct the database. Several soft computing benchmark models (SCBM) were used for comparison with the proposed model: the ADTree, the Support Vector Machine with two kernel functions, Polynomial and Radial Basis Function (SVM-Polynomial and SVM-RBF), Logistic Regression (LR), and Naïve Bayes Multinomial Updatable (NBMU). The results indicated that 19 conditioning factors were effective, of which distance to river, geomorphology, land use, hydrological group, lithology, and slope angle were the most important for gully modeling. The RF-ADTree ensemble model (area under the curve (AUC) = 0.906) significantly improved on the prediction accuracy of the ADTree model (AUC = 0.882). The proposed model also had the highest performance (AUC = 0.913) in comparison with the SVM-Polynomial model (AUC = 0.879), the SVM-RBF model (AUC = 0.867), the LR model (AUC = 0.75), the ADTree model (AUC = 0.861), and the NBMU model (AUC = 0.811).
Keywords: Geographic information science; Kurdistan province; ensemble algorithms; geomorphology; gully erosion; machine learning
Year: 2019 PMID: 31146336 PMCID: PMC6603737 DOI: 10.3390/s19112444
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Location of the study area and gully erosion sites in Kurdistan province, Iran.
Figure 2. Photos of gully erosion in the Klocheh Watershed, Kurdistan province, Iran.
Gully conditioning factors and their classes for gully modeling in Klocheh Watershed, Kurdistan Province, Iran.
| No. | Factors | Classes | Classification Method |
|---|---|---|---|
| 1 | Slope (°) | (1) 0–2; (2) 2–5; (3) 5–10; (4) 10–15; (5) 15–20; (6) >20 | Manual |
| 2 | Aspect | (1) Flat; (2) North; (3) Northeast; (4) East; (5) Southeast; (6) South; (7) Southwest; (8) West; (9) Northwest | Azimuth |
| 3 | Elevation (m) | (1) 1612–1700; (2) 1700–1800; (3) 1800–1900; (4) 1900–2000; (5) 2000–2100; (6) 2100–2200; (7) 2200–2300; (8) 2300–2400 | Manual |
| 4 | Plan curvature (m⁻¹) | (1) [(−5.67)–(−0.736)]; (2) [(−0.736)–(−0.188)]; (3) [(−0.188)–0.149]; (4) [0.149–0.697]; (5) [0.6974–5.08] | Natural break |
| 5 | Profile curvature (m⁻¹) | (1) [(−6.357)–(−0.972)]; (2) [(−0.972)–(−0.187)]; (3) [(−0.187)–0.317]; (4) [0.317–1.1]; (5) [1.1–7.94] | Natural break |
| 6 | STI | (1) 0–1.286; (2) 1.286–2.894; (3) 2.894–5.145; (4) 5.145–8.468; (5) 8.468–27.33 | Natural break |
| 7 | VD | (1) 0–48.231; (2) 48.231–108.520; (3) 108.520–176.340; (4) 176.340–254.720; (5) 254.720–384.340 | Natural break |
| 8 | Rainfall (mm) | (1) 261–286; (2) 286–298; (3) 298–306; (4) 306–312; (5) 312–322 | Natural break |
| 9 | SPI | (1) 0–112.4; (2) 112.4–224.8; (3) 224.8–401.5; (4) 401.5–722.7; (5) 722.7–4095 | Natural break |
| 10 | TWI | (1) 1–3; (2) 3–4; (3) 4–5; (4) 5–6; (5) 6–9.059 | Natural break |
| 11 | HG | (1) A; (2) B; (3) C; (4) D | HG type |
| 12 | Flow accumulation | (1) 0–5; (2) 5–10; (3) 10–20; (4) 20–30; (5) >30 | Manual |
| 13 | Permeability | (1) Low; (2) Moderate; (3) High | Permeability type |
| 14 | Distance to river (m) | (1) 0–20; (2) 20–40; (3) 40–60; (4) 60–80; (5) >80 | Manual |
| 15 | River density (km/km²) | (1) 0–2.775; (2) 2.775–4.810; (3) 4.810–6.598; (4) 6.598–8.694; (5) 8.694–15.72 | Natural break |
| 16 | Lithology | (1) JL; (2) JS; (3) Mm; (4) PLb; (5) Pcg; (6) Plm; (7) Plt; (8) Qal; (9) Qc; (10) Qtr; (11) Qt1; (12) Qt2 | Lithology type |
| 17 | Distance to fault (m) | (1) 0–100; (2) 100–200; (3) 200–500; (4) 500–1000; (5) >1000 | Manual |
| 18 | Fault density (km/km²) | (1) 0–0.287; (2) 0.287–0.823; (3) 0.823–1.270; (4) 1.270–1.820; (5) 1.820–2.440 | Natural break |
| 19 | Land use | (1) Woodland; (2) Dry-farming and cultivated lands; (3) Poor pastures; (4) Semi-dense pastures; (5) Destroyed pastures | Land use type |
| 20 | Distance to road (m) | (1) 0–100; (2) 100–200; (3) 200–300; (4) 300–500; (5) >500 | Manual |
| 21 | Road density (km/km²) | (1) 0–0.684; (2) 0.684–1.750; (3) 1.750–2.570; (4) 2.570–3.690; (5) 3.690–6.980 | Natural break |
| 22 | Geomorphology | (1) Valley plain unit; (2) Hilly unit; (3) Mountain unit; (4) New plain unit; (5) Old plain unit; (6) Fluvial sediment unit | Geomorphology type |
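As an illustration of how a manually classified factor in the table above could be applied, the following minimal sketch maps a slope angle to its class number using the slope breaks listed in row 1. The function name `slope_class` and the handling of values that fall exactly on a break are assumptions; the table does not state which side of each interval is closed.

```python
from bisect import bisect_left

# Upper bounds (degrees) of slope classes 1-5, taken from row 1 of the table.
# Class 6 is the open-ended ">20" class.
SLOPE_BREAKS = [2, 5, 10, 15, 20]


def slope_class(slope_deg: float) -> int:
    """Return the 1-based slope class for a slope angle in degrees.

    Assumption: a value equal to a break is assigned to the lower class
    (for example, 5.0 falls in class 2, "2-5").
    """
    if slope_deg < 0:
        raise ValueError("slope angle must be non-negative")
    # bisect_left counts the breaks strictly below the value.
    return bisect_left(SLOPE_BREAKS, slope_deg) + 1


if __name__ == "__main__":
    for angle in (0, 3.5, 5, 12, 20, 45):
        print(angle, "->", slope_class(angle))
```

The same pattern would apply to the other manually classified factors (elevation, flow accumulation, and the distance factors) by swapping in their break values.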
Figure 3. The flowchart of the study.
Figure 4. The most important conditioning factors for gully erosion modeling in the Klocheh Watershed, Kurdistan Province, Iran.
Parameters of the machine learning algorithms used for gully modeling in the Klocheh Watershed, Kurdistan Province, Iran.
| Model Name | Description of Parameters |
|---|---|
| RF-ADTree | Classifier: ADTree; MaxGroup: 3; MinGroup: 3; Number of iterations: 10; Number of Groups: False; Projection Filter: PCA; Removed Percentage: 50; Number of seeds: 5 |
| ADTree | Number of Boosting Iterations: 10; Random Seed: 0; Save Instance Data: False; Search Path: Expand all paths |
| LR | Maximum Its: −1; Ridge: 1.0 × 10⁸ |
| SVM-Polynomial | Build Logistic Models: True; C: 1; Checks turned Off: False; Epsilon: 1.0 × 10¹²; Filter Type: No normalization/standardization; Kernel: PolyKernel; Number of folds: −1; Tolerance Parameter: 0.001 |
| SVM-RBF | Build Logistic Models: True; C: 1; Checks turned Off: False; Epsilon: 1.0 × 10¹²; Filter Type: No normalization/standardization; Kernel: RBF; Number of folds: −1; Tolerance Parameter: 0.001 |
| NBMU | - |
Figure 5. Modeling process for selecting the best values of the number-of-seeds and number-of-iterations parameters for the rotation forest (RF) meta/ensemble classifier based on the alternating decision tree (RF-ADTree) model: (a) number of seeds based on the area under the curve (AUC) of the Receiver Operating Characteristic (ROC), (b) number of iterations based on the AUC of the ROC, (c) number of seeds based on the root mean square error (RMSE), and (d) number of iterations based on the RMSE.
Model performances in the training dataset for the new hybrid “RF-ADTree” model and other benchmark models.
| Measures | NBMU | SVM-Polynomial | SVM-RBF | LR | ADTree | RF-ADTree |
|---|---|---|---|---|---|---|
| True positive | 466 | 461 | 494 | 470 | 476 | 501 |
| True negative | 513 | 574 | 558 | 550 | 551 | 570 |
| False positive | 174 | 179 | 146 | 170 | 164 | 139 |
| False negative | 127 | 66 | 82 | 90 | 89 | 70 |
| Sensitivity | 0.786 | 0.875 | 0.858 | 0.839 | 0.842 | 0.877 |
| Specificity | 0.747 | 0.762 | 0.793 | 0.764 | 0.771 | 0.804 |
| Accuracy | 0.765 | 0.809 | 0.822 | 0.797 | 0.802 | 0.837 |
| RMSE | 0.398 | 0.378 | 0.375 | 0.376 | 0.379 | 0.373 |
| AUC | 0.844 | 0.871 | 0.895 | 0.876 | 0.885 | 0.909 |
Model performances in the validation dataset for the new hybrid “RF-ADTree” model and other benchmark models.
| Measures | NBMU | SVM-Polynomial | SVM-RBF | LR | ADTree | RF-ADTree |
|---|---|---|---|---|---|---|
| True positive | 201 | 195 | 198 | 201 | 204 | 213 |
| True negative | 210 | 244 | 227 | 236 | 240 | 240 |
| False positive | 74 | 80 | 77 | 47 | 71 | 62 |
| False negative | 65 | 31 | 48 | 39 | 35 | 35 |
| Sensitivity | 0.756 | 0.863 | 0.805 | 0.838 | 0.854 | 0.859 |
| Specificity | 0.739 | 0.753 | 0.747 | 0.834 | 0.772 | 0.795 |
| Accuracy | 0.747 | 0.798 | 0.773 | 0.836 | 0.807 | 0.824 |
| RMSE | 0.403 | 0.380 | 0.381 | 0.380 | 0.384 | 0.378 |
| AUC | 0.843 | 0.863 | 0.873 | 0.869 | 0.882 | 0.906 |
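The sensitivity, specificity, and accuracy rows in the two performance tables follow from the confusion-matrix counts in the first four rows, with values reported as proportions between 0 and 1. The sketch below reproduces the RF-ADTree validation column under the standard definitions; the function name `confusion_metrics` is an illustrative choice, not something taken from the paper.

```python
def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Compute standard classification metrics from confusion-matrix counts."""
    return {
        # Share of actual gully locations that were predicted as gully.
        "sensitivity": tp / (tp + fn),
        # Share of actual non-gully locations that were predicted as non-gully.
        "specificity": tn / (tn + fp),
        # Share of all locations that were classified correctly.
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# RF-ADTree counts from the validation table.
rf_adtree_validation = confusion_metrics(tp=213, tn=240, fp=62, fn=35)
rounded = {name: round(value, 3) for name, value in rf_adtree_validation.items()}
print(rounded)
# {'sensitivity': 0.859, 'specificity': 0.795, 'accuracy': 0.824}
```

Running the same function on the other columns gives a quick consistency check between the counts and the reported rates.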
Figure 6. Histograms of all models for selecting the best classification method of gully susceptibility maps.
Figure 7. Gully erosion maps obtained by the RF-ADTree model and the other soft computing benchmark models.
Figure 8. Histograms of gully susceptibility classes for the six most important factors. DR: distance to river; L: lithology; Gm: geomorphology; HG: hydrological group; Lu: land use; S: slope. 1: ADTree model; 2: Logistic Regression (LR) model; 3: Naïve Bayes Multinomial Updatable (NBMU) model; 4: Support Vector Machine-Radial Basis Function (SVM-RBF) kernel model; 5: SVM-Polynomial kernel model; 6: RF-ADTree model.
Figure 9. Model comparison using ROC (a) and AUC (b).
Average ranking of the six gully erosion models for the study area using Friedman's test.
| No. | Gully Models | Mean Ranks | χ² | Sig. |
|---|---|---|---|---|
| 1 | SVM-Polynomial | 2.29 | 2040 | 0.000 |
| 2 | SVM-RBF | 2.71 | | |
| 3 | LR | 3.06 | | |
| 4 | NBMU | 3.49 | | |
| 5 | ADTree | 4.65 | | |
| 6 | RF-ADTree | 4.80 | | |
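To make the Friedman ranking concrete, here is a minimal pure-Python sketch of how mean ranks and the Friedman χ² statistic can be obtained from a matrix of per-location model outputs. The small score matrix is synthetic and purely illustrative, the helper names are hypothetical, and the statistic is computed without a tie correction, which may differ from the software the authors used. The last lines check one property of the reported table: the mean ranks of k = 6 models must sum to k(k + 1)/2 = 21.

```python
def average_ranks(row):
    """Rank values in ascending order (1 = smallest), averaging tied ranks."""
    order = sorted(range(len(row)), key=lambda i: row[i])
    ranks = [0.0] * len(row)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and row[order[end + 1]] == row[order[pos]]:
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2 + 1
        pos = end + 1
    return ranks


def friedman(scores):
    """Return (mean_ranks, chi2) for n rows (locations) by k columns (models).

    Uses chi2 = 12n / (k(k+1)) * sum((mean_rank_j - (k+1)/2)^2),
    with no correction for ties.
    """
    n, k = len(scores), len(scores[0])
    rank_rows = [average_ranks(row) for row in scores]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * sum((m - (k + 1) / 2) ** 2 for m in mean_ranks)
    return mean_ranks, chi2


# Synthetic example: 4 locations scored by 3 hypothetical models.
demo_scores = [
    [0.1, 0.5, 0.9],
    [0.2, 0.4, 0.8],
    [0.3, 0.6, 0.7],
    [0.2, 0.1, 0.9],
]
demo_mean_ranks, demo_chi2 = friedman(demo_scores)
print(demo_mean_ranks, demo_chi2)  # [1.25, 1.75, 3.0] 6.5

# Consistency check on the mean ranks reported in the table (k = 6).
REPORTED_MEAN_RANKS = [2.29, 2.71, 3.06, 3.49, 4.65, 4.80]
print(round(sum(REPORTED_MEAN_RANKS), 2))  # 21.0
```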
Performance of the RF-ADTree model compared to other gully erosion models using the Wilcoxon signed-rank test (two-tailed).
| No. | Pairwise Comparison | NPD | NND | z-value | p-value | Significance |
|---|---|---|---|---|---|---|
| 1 | SVM-Polynomial vs. SVM-RBF | 303 | 540 | −9.755 | 0.000 | Yes |
| 2 | SVM-Polynomial vs. LR | 245 | 700 | −13.424 | 0.000 | Yes |
| 3 | SVM-Polynomial vs. NBMU | 349 | 905 | −9.343 | 0.000 | Yes |
| 4 | SVM-Polynomial vs. ADTree | 196 | 1057 | −23.838 | 0.000 | Yes |
| 5 | SVM-Polynomial vs. RF-ADTree | 129 | 1126 | −26.125 | 0.000 | Yes |
| 6 | SVM-RBF vs. LR | 325 | 568 | −4.621 | 0.000 | Yes |
| 7 | SVM-RBF vs. NBMU | 434 | 813 | −3.536 | 0.000 | Yes |
| 8 | SVM-RBF vs. ADTree | 234 | 1009 | −21.050 | 0.000 | Yes |
| 9 | SVM-RBF vs. RF-ADTree | 194 | 1049 | −23.189 | 0.000 | Yes |
| 10 | LR vs. NBMU | 448 | 780 | −2.020 | 0.043 | Yes |
| 11 | LR vs. ADTree | 273 | 978 | −19.344 | 0.000 | Yes |
| 12 | LR vs. RF-ADTree | 222 | 1019 | −21.772 | 0.000 | Yes |
| 13 | NBMU vs. ADTree | 291 | 916 | −19.038 | 0.000 | Yes |
| 14 | NBMU vs. RF-ADTree | 249 | 919 | −19.714 | 0.000 | Yes |
| 15 | ADTree vs. RF-ADTree | 578 | 591 | −0.616 | 0.538 | No |
NPD: number of positive differences; NND: number of negative differences.
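The pairwise comparison can be sketched as follows: a minimal pure-Python Wilcoxon signed-rank test that counts positive and negative differences and derives a z-value and two-tailed p-value from the normal approximation. It assumes the z-value is based on the smaller of the two rank sums (which always yields a non-positive z) and applies no tie or continuity correction; the paired scores are synthetic and the function names are hypothetical. The helper `two_tailed_p` can also be applied directly to z-values from the table: z = −2.020 and z = −0.616 give p ≈ 0.043 and p ≈ 0.538, matching the LR vs. NBMU and ADTree vs. RF-ADTree rows.

```python
import math


def two_tailed_p(z: float) -> float:
    """Two-tailed p-value of a standard normal z statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def wilcoxon_signed_rank(x, y):
    """Return (n_pos, n_neg, z, p) for paired samples x and y.

    Differences are y - x; zero differences are dropped. Normal
    approximation without tie or continuity correction (an assumption).
    """
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    # Rank absolute differences, averaging the ranks of ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        while end + 1 < n and abs(diffs[order[end + 1]]) == abs(diffs[order[pos]]):
            end += 1
        for i in order[pos:end + 1]:
            ranks[i] = (pos + end) / 2 + 1
        pos = end + 1
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    n_pos = sum(1 for d in diffs if d > 0)
    n_neg = n - n_pos
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_pos, w_neg) - mean) / sd
    return n_pos, n_neg, z, two_tailed_p(z)


# Synthetic paired outputs of two hypothetical models at 8 locations.
model_a = [10, 20, 30, 40, 50, 60, 70, 80]
model_b = [12, 25, 29, 47, 58, 66, 79, 84]
n_pos, n_neg, z, p = wilcoxon_signed_rank(model_a, model_b)
print(n_pos, n_neg, round(z, 3))  # 7 1 -2.38

# p-values recomputed from z-values reported in the table.
print(round(two_tailed_p(-2.020), 3), round(two_tailed_p(-0.616), 3))  # 0.043 0.538
```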