Dieu Tien Bui, Himan Shahabi, Ataollah Shirzadi, Kamran Chapi, Biswajeet Pradhan, Wei Chen, Khabat Khosravi, Mahdi Panahi, Baharin Bin Ahmad, Lee Saro.
Abstract
In this study, land subsidence susceptibility was assessed for a study area in South Korea using four machine learning models: Bayesian Logistic Regression (BLR), Support Vector Machine (SVM), Logistic Model Tree (LMT) and Alternating Decision Tree (ADTree). Eight conditioning factors were identified as the most important factors affecting land subsidence in the Jeong-am area and were used in the modelling: slope angle, distance to drift, drift density, geology, distance to lineament, lineament density, land use and rock-mass rating (RMR). About 24 previously recorded land subsidence locations were surveyed and split into a training dataset (70% of the data) and a validation dataset (30% of the data). Each model generated a land subsidence susceptibility map (LSSM). The maps were verified using statistical indices, the area under the receiver operating characteristic curve (AUROC), and success rate (SR) and prediction rate (PR) curves. The results indicated that the BLR model produced an LSSM with higher accuracy and reliability than the other applied models, although the other models also gave reasonable results.
Keywords: GIS; South Korea; land subsidence; machine learning algorithms
Year: 2018 PMID: 30065216 PMCID: PMC6111310 DOI: 10.3390/s18082464
Source DB: PubMed Journal: Sensors (Basel) ISSN: 1424-8220 Impact factor: 3.576
Figure 1. Study area: (a) geographic location of the study area in the northeast of South Korea; (b) location of the study area between Mt. Baek-Wu to the west and Mt. Ham-Beak to the southeast; (c) and (d) photographs of surveyed subsidence locations taken during field surveys.
Land subsidence conditioning factors and their classes.
| Land Subsidence Factors | Classes | GIS Data Type | Scale |
|---|---|---|---|
| Slope angle (°) | (1) 0–10; (2) 10–20; (3) 20–30; (4) 30–40; (5) >40 | GRID | 1 m × 1 m |
| Distance to drift (m) | (1) 0–2; (2) 2–8; (3) 8–19; (4) 19–50; (5) >50 | Line | 1:5000 |
| Drift density (m/m²) | (1) 0–0.002; (2) 0.002–0.0448; (3) 0.0448–0.120; (4) 0.120–0.299; (5) 0.299–0.952 | Polygon | 1:5000 |
| Geology | (1) Gobangsan Group; (2) Sadong Group | Polygon | 1:50,000 |
| Distance to lineament (m) | (1) 0–10; (2) 10–20; (3) 20–30; (4) 30–60; (5) >60 | Line | 1:5000 |
| Lineament density (m/m²) | (1) 0–0.001; (2) 0.001–0.029; (3) 0.029–0.0435; (4) 0.0435–0.052; (5) 0.052–0.109 | Polygon | 1:5000 |
| Land use | (1) Mixed forest land; (2) Deciduous forest; (3) Mixed barren land; (4) Commercial area; (5) Coniferous forest; (6) Other grasses; (7) Transportation; (8) Natural grasses; (9) Field | Polygon | 1:50,000 |
| RMR | (1) 0.00366–1.26; (2) 1.26–1.54; (3) 1.54–1.93; (4) 1.93–2.79; (5) 2.79–4 | Polygon | 1:5000 |
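The class boundaries in the table above can be applied to a raw factor value with a simple lookup. The following is a minimal sketch, not the authors' GIS workflow: the function name `classify`, the constant names, and the rule that a value lying exactly on a boundary goes to the higher class are all assumptions, since the table does not say which class owns a shared boundary.

```python
from bisect import bisect_right

# Upper bounds of classes 1-4 as listed in the table; class 5 is open-ended.
SLOPE_BREAKS = [10, 20, 30, 40]          # slope angle, degrees
DRIFT_DISTANCE_BREAKS = [2, 8, 19, 50]   # distance to drift, metres


def classify(value, breaks):
    """Return the 1-based class index of `value` for ascending break points.

    Assumption: a value equal to a break point falls into the higher class.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    return bisect_right(breaks, value) + 1


if __name__ == "__main__":
    print(classify(7.5, SLOPE_BREAKS))             # slope of 7.5 degrees
    print(classify(25.0, SLOPE_BREAKS))            # slope of 25 degrees
    print(classify(120.0, DRIFT_DISTANCE_BREAKS))  # 120 m from the nearest drift
```

Categorical factors such as geology and land use would need a direct name-to-class mapping instead of break points.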
Figure 2. Flowchart of the land subsidence modelling process in the study area.
Figure 3. Predictive capability of the most important land subsidence conditioning factors for land subsidence modelling.
Model results and analysis using training and validation datasets. TP: true positive, TN: true negative, FP: false positive, FN: false negative, SST: sensitivity, SPC: specificity, ACC: accuracy, T: training; V: validation.
| Metric | BLR (T) | BLR (V) | SVM (T) | SVM (V) | LMT (T) | LMT (V) | ADTree (T) | ADTree (V) |
|---|---|---|---|---|---|---|---|---|
| TP | 16 | 5 | 16 | 4 | 15 | 5 | 14 | 4 |
| TN | 15 | 6 | 14 | 6 | 14 | 5 | 15 | 5 |
| FP | 2 | 1 | 2 | 1 | 3 | 2 | 2 | 2 |
| FN | 1 | 2 | 3 | 3 | 2 | 2 | 3 | 3 |
| SST | 0.941 | 0.714 | 0.842 | 0.571 | 0.882 | 0.714 | 0.824 | 0.571 |
| SPC | 0.882 | 0.857 | 0.875 | 0.857 | 0.824 | 0.714 | 0.882 | 0.714 |
| ACC | 0.912 | 0.786 | 0.857 | 0.714 | 0.853 | 0.714 | 0.853 | 0.643 |
| Kappa | 0.822 | 0.571 | 0.764 | 0.571 | 0.764 | 0.428 | 0.764 | 0.428 |
| RMSE | 0.297 | 0.426 | 0.323 | 0.430 | 0.335 | 0.432 | 0.363 | 0.462 |
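The sensitivity, specificity and accuracy rows follow directly from the four confusion-matrix counts. The sketch below recomputes them for one column of the table. The function name `confusion_metrics` is hypothetical, and the kappa line assumes the standard Cohen's kappa formula, which the source does not spell out, so it may not reproduce every kappa entry exactly.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute SST, SPC, ACC and Cohen's kappa from confusion-matrix counts."""
    n = tp + tn + fp + fn
    sst = tp / (tp + fn)   # sensitivity: share of subsidence cases detected
    spc = tn / (tn + fp)   # specificity: share of non-subsidence cases detected
    acc = (tp + tn) / n    # overall accuracy
    # Agreement expected by chance (assumed Cohen's kappa definition).
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (acc - p_chance) / (1 - p_chance)
    return {"SST": sst, "SPC": spc, "ACC": acc, "Kappa": kappa}


if __name__ == "__main__":
    # BLR, validation dataset: TP = 5, TN = 6, FP = 1, FN = 2 (from the table).
    for name, value in confusion_metrics(tp=5, tn=6, fp=1, fn=2).items():
        print(f"{name}: {value:.3f}")
```

For the BLR validation column this gives 0.714, 0.857, 0.786 and 0.571, matching the tabulated values. RMSE is not included because it depends on the predicted probabilities, which the table does not list.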
Parameters of machine learning algorithms applied in this study.
| Algorithm | Parameters |
|---|---|
| BLR | Hyperparameter value range, R: 0.01–3.16; specific hyperparameter value, 0.27; maximum number of iterations, 1000; number of folds in the internal cross-validation or pruning, 2; random number seed, 1; threshold for classification, 0.5. |
| LMT | Minimum number of instances at which a node is considered for splitting, 15; fixed number of iterations for LogitBoost, −1. |
| SVM | Build logistic model, False; C, 0.1; epsilon, 1.0 × 10⁻¹²; filter type, normalized training data; kernel function, polykernel; number of folds, −1; random seed, 1; tolerance parameter, 0.001. |
| ADTree | Number of boosting iterations, 10; random seed, 0; search path, expand all paths. |
Figure 4. Land subsidence susceptibility maps using: (a) the Bayesian logistic regression (BLR), (b) the support vector machine (SVM), (c) the logistic model tree (LMT) and (d) the alternating decision tree (ADTree).
Figure 5. Model validation and comparison using AUROC based on the (a) training and (b) validation datasets.
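AUROC can be read as the probability that a randomly chosen subsidence location receives a higher susceptibility score than a randomly chosen non-subsidence location. The sketch below computes it by direct pairwise comparison. The function name `auroc` and the scores are hypothetical illustrations, not values from the study.

```python
def auroc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the positive
    case has the higher score; tied pairs count as one half.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical susceptibility scores, for illustration only.
    subsidence = [0.9, 0.8, 0.4]
    non_subsidence = [0.5, 0.3, 0.2]
    print(f"AUROC = {auroc(subsidence, non_subsidence):.3f}")
```

A value of 1.0 means the scores separate the two groups perfectly, and 0.5 means they do no better than chance.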
Figure 6. Model validation and comparison using (a) success rate curve and (b) prediction rate curve.
Performance comparison of the machine learning models for land subsidence using the Wilcoxon signed-rank test (two-tailed). The significance level is 0.05.
| No. | Pairwise Comparison | Number of Positive Differences | Number of Negative Differences | z-value | p-value | Significance |
|---|---|---|---|---|---|---|
| 1 | BLR vs. SVM | 27 | 7 | −4.078 | 0.000 | Yes |
| 2 | BLR vs. LMT | 24 | 10 | −2.522 | 0.012 | Yes |
| 3 | BLR vs. ADTree | 28 | 4 | −4.469 | 0.000 | Yes |
| 4 | SVM vs. LMT | 27 | 7 | −4.043 | 0.000 | Yes |
| 5 | SVM vs. ADTree | 33 | 1 | −5.069 | 0.000 | Yes |
| 6 | LMT vs. ADTree | 33 | 1 | −5.003 | 0.000 | Yes |
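The quantities in this table (counts of positive and negative differences, a z statistic and a two-tailed p-value) can be obtained from paired model outputs roughly as follows. This is a minimal sketch under stated assumptions: it uses the normal approximation without tie or continuity corrections, it builds z from the smaller rank sum so that z is never positive, and the function name and sample data are hypothetical. The software used in the study may apply different conventions, so it will not necessarily reproduce the tabulated z-values.

```python
import math


def wilcoxon_signed_rank(x, y):
    """Two-tailed Wilcoxon signed-rank test (normal approximation).

    Returns (n_positive, n_negative, z, p). Zero differences are dropped and
    tied absolute differences receive average ranks.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_pos, w_neg) - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    n_pos = sum(1 for d in diffs if d > 0)
    n_neg = sum(1 for d in diffs if d < 0)
    return n_pos, n_neg, z, p


if __name__ == "__main__":
    # Hypothetical paired outputs of two models on the same ten samples.
    model_a = [float(v) for v in range(1, 11)]
    model_b = [0.0] * 10
    n_pos, n_neg, z, p = wilcoxon_signed_rank(model_a, model_b)
    verdict = "Yes" if p < 0.05 else "No"
    print(f"positive={n_pos} negative={n_neg} z={z:.3f} p={p:.3f} significant={verdict}")
```

The "Significance" column then corresponds to comparing the p-value with the 0.05 level.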