Baofeng Di, Hanyue Zhang, Yongyao Liu, Jierui Li, Ningsheng Chen, Constantine A Stamatopoulos, Yuzhou Luo, Yu Zhan.
Abstract
A gradient boosting machine (GBM) was developed to model debris-flow susceptibility in Sichuan, Southwest China, for risk management. A total of 3839 debris-flow events during 1949-2017 were compiled from the Sichuan Geo-Environment Monitoring program, field surveys, and satellite imagery interpretation. In cross-validation, the GBM, with a prediction accuracy of 82.0% and an area under the curve (AUC) of 0.88, outperformed the benchmark models: Logistic Regression, K-Nearest Neighbor, Support Vector Machine, and Artificial Neural Network. The elevation range, precipitation, and aridity index were the most important variables in determining susceptibility. The water erosion intensity, road construction, channel gradient, and human settlement sites also contributed substantially to the formation of debris flows. The susceptibility map produced by the GBM shows that the spatial distribution of high-susceptibility watersheds closely follows the locations of the topographical extreme belt, fault zone, seismic belt, and dry valleys. This study provides critical information for debris-flow risk mitigation and prevention.
Year: 2019 PMID: 31467342 PMCID: PMC6715629 DOI: 10.1038/s41598-019-48986-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Observation vs. prediction of watershed-based debris flow in Sichuan by gradient boosting machineᵃ.
| | Observation-N | Observation-Y | Accuracy (%) |
|---|---|---|---|
| Prediction-N | 1514 | 259 | 85.4 |
| Prediction-Y | 186 | 515 | 73.5 |
| Total | 1700 | 774 | 82.0 |
ᵃ N: no debris flow was observed; Y: debris flow was observed.
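The accuracy column can be reproduced directly from the four counts in the table. The sketch below is only a worked check of that arithmetic. The variable names are illustrative, and the sensitivity and specificity values are derived here from the same counts rather than reported in the source.

```python
# Counts copied from the table above (rows: prediction, columns: observation).
pred_n_obs_n = 1514  # predicted N, observed N
pred_n_obs_y = 259   # predicted N, observed Y
pred_y_obs_n = 186   # predicted Y, observed N
pred_y_obs_y = 515   # predicted Y, observed Y


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


total = pred_n_obs_n + pred_n_obs_y + pred_y_obs_n + pred_y_obs_y

# Row-wise accuracy: share of each predicted class that matches the observation.
accuracy_pred_n = pct(pred_n_obs_n, pred_n_obs_n + pred_n_obs_y)
accuracy_pred_y = pct(pred_y_obs_y, pred_y_obs_n + pred_y_obs_y)

# Overall accuracy: all correct predictions over all watersheds.
accuracy_overall = pct(pred_n_obs_n + pred_y_obs_y, total)

# Derived here, not reported in the table: column-wise rates.
sensitivity = pct(pred_y_obs_y, pred_n_obs_y + pred_y_obs_y)  # observed Y found
specificity = pct(pred_n_obs_n, pred_n_obs_n + pred_y_obs_n)  # observed N found

print(accuracy_pred_n, accuracy_pred_y, accuracy_overall)  # 85.4 73.5 82.0
print(sensitivity, specificity)                            # 66.5 89.1
```

The row-wise figures are the shares of each predicted class that were correct, so they differ from the column-wise sensitivity and specificity.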
Figure 1. Change in predictive deviance during the stepwise removal of the least important variable from the gradient boosting machine (GBM). The dashed red line indicates the setting for the final GBM, which achieves quasi-optimal performance with far fewer variables.
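The stepwise removal described in the caption can be sketched as a simple backward-elimination loop. This is a minimal illustration under stated assumptions, not the authors' procedure: `toy_fit` is a hypothetical stand-in for training and cross-validating a GBM, the weights are invented numbers, and the `tolerance` rule for "quasi-optimal" is an assumed selection criterion.

```python
def backward_elimination(variables, fit):
    """Repeatedly refit and drop the least important variable.

    `fit(names)` must return (deviance, importance_by_name).
    Returns a list of (variables_used, deviance), one entry per step.
    """
    remaining = list(variables)
    history = []
    while True:
        deviance, importance = fit(remaining)
        history.append((tuple(remaining), deviance))
        if len(remaining) == 1:
            break
        least = min(remaining, key=lambda name: importance[name])
        remaining.remove(least)
    return history


def pick_quasi_optimal(history, tolerance):
    """Smallest variable set whose deviance is within `tolerance` of the best."""
    best = min(deviance for _, deviance in history)
    candidates = [names for names, deviance in history
                  if deviance <= best + tolerance]
    return min(candidates, key=len)


# Hypothetical weights, for illustration only (not values from the paper).
TOY_WEIGHTS = {"ED": 0.30, "AR": 0.25, "AI": 0.20, "WaE": 0.02, "RL": 0.01}


def toy_fit(names):
    """Stand-in for model training: deviance grows with each dropped weight."""
    dropped = sum(w for name, w in TOY_WEIGHTS.items() if name not in names)
    importance = {name: TOY_WEIGHTS[name] for name in names}
    return 1.0 + dropped, importance


history = backward_elimination(TOY_WEIGHTS, toy_fit)
selected = pick_quasi_optimal(history, tolerance=0.05)
print(selected)  # ('ED', 'AR', 'AI')
```

In practice `fit` would wrap a real training and validation routine; the loop itself does not depend on the model type.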
Performance comparison of Gradient Boosting Machine (GBM), Logistic Regression (LR), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN) in predicting susceptibility of debris flow based on cross-validation.
| Metric | GBM | LR | KNN | SVM | ANN |
|---|---|---|---|---|---|
| Accuracy (%) | **82.0** | 79.5 | 78.6 | 81.8 | 80.5 |
| AUCᵃ | **0.88** | 0.85 | 0.83 | 0.86 | 0.86 |
ᵃ AUC: area under the receiver operating characteristic (ROC) curve. The best performance is shown in bold.
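For readers unfamiliar with the metric, the AUC equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it that way on invented toy data. The function name and the numbers are illustrative assumptions, not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` holds 1 for debris flow observed and 0 otherwise.
    Tied scores count as half a correct pair.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example with made-up susceptibility scores.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.5]
toy_auc = roc_auc(toy_labels, toy_scores)
print(round(toy_auc, 3))  # 0.778 (7 of 9 pairs ranked correctly)
```

The pairwise loop is quadratic in the number of cases, which is fine for illustration. A rank-based formula gives the same value more efficiently on large data sets.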
Figure 2. Variable importance plot for the gradient boosting machine predicting the susceptibility of debris flow in Sichuan. The relative importance values are normalized to sum to 100 for more intuitive interpretation. Please refer to Table 4 for the description of the variable acronyms.
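The normalization mentioned in the caption is a simple rescaling, sketched below. The raw importance values are hypothetical placeholders, and the function name is an assumption made for this example.

```python
def normalize_importance(raw):
    """Rescale non-negative importance scores so that they sum to 100."""
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("importance scores must have a positive sum")
    return {name: 100.0 * value / total for name, value in raw.items()}


# Hypothetical raw scores, for illustration only (not values from the paper).
toy_raw = {"ED": 4.0, "AR": 3.0, "AI": 2.0, "WaE": 1.0}
relative = normalize_importance(toy_raw)
print(relative)  # {'ED': 40.0, 'AR': 30.0, 'AI': 20.0, 'WaE': 10.0}
```

After rescaling, each value reads directly as a percentage share of the total importance.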
Table 4. Summary of predictor variables for developing the classification models.
| Variable | Acronym | Unit | Reference |
|---|---|---|---|
| Debris flow events | DF | - | |
| Area | Area | m² | |
| Perimeter | PE | m | |
| Elevation difference | ED | m | |
| Channel gradient | CG | % | |
| Average slope | AS | ° | |
| Average aspect | AA | ° | |
| Channel length | CL | km | |
| Active fault | AF | m | |
| Seismic intensity | SI | - | |
| Rock hardness | RH | - | |
| Aridity index | AI | - | |
| Moisture index | MI | - | |
| Annual accumulated temperature above 10 °C | ACT | °C | |
| Annual rainfall | AR | mm | |
| Annual temperature | AT | °C | |
| Maximum daily rainfall | MDR_1 | mm | |
| Maximum 3-day rainfall | MDR_3 | mm | |
| Normalized Difference Vegetation Index | NDVI | - | |
| Soil texture | ST | - | |
| Water erosion intensity | WaE | - | |
| Wind erosion intensity | WiE | - | |
| Freeze-thaw erosion intensity | FtE | - | |
| Land use | LU | - | |
| Settlement sites | RA | - | |
| Population density | PD | persons/km² | |
| Road length | RL | km | |
*Detailed classification of selected indexes:
Seismic intensity: (1) < VI; (2) VI; (3) VII; (4) VIII; (5) ≥ IX.
Rock hardness: (1) Very strong; (2) Strong; (3) Medium; (4) Weak; (5) Very weak; (6) Solum.
Soil texture: (1) Sand; (2) Silt; (3) Clay.
Water erosion intensity: (1) Very low; (2) Low; (3) Moderate; (4) High; (5) Very high; (6) Extreme.
Wind erosion intensity: (1) Very low; (2) Low; (3) Moderate; (4) High; (5) Very high; (6) Extreme.
Freeze-thaw erosion intensity: (1) Very low; (2) Low; (3) Moderate; (4) High.
Land use: (1) Paddy field; (2) Dry farm; (3) Forest; (4) Shrubbery; (5) Open forest; (6) Other forest; (7) High coverage grassland; (8) Moderate coverage grassland; (9) Low coverage grassland; (10) River; (11) Lake; (12) Reservoir; (13) Permanent glacier; (14) Mudflat; (15) Bottomland; (16) Urban land; (17) Rural residential area; (18) Other construction land; (19) Sand; (20) Gobi; (21) Saline-alkali soil; (22) Wetland; (23) Bare; (24) Rock; (25) Others.
Road length: (1) Highway; (2) National road; (3) Provincial road; (4) County road; (5) Railway.
Figure 3. Spatial distributions of the predicted susceptibility of watershed-based debris flow by using (a) gradient boosting machine, (b) logistic regression, (c) K-nearest neighbor, (d) support vector machine, and (e) artificial neural network. N/A: Not applicable. The debris-flow formation conditions were inadequate in the plateaus and plains, and thus these areas were excluded from the susceptibility modeling.
Classification for the predicted susceptibility of watershed-based debris flow by using gradient boosting machine.
| Class | Number of Watersheds | Area (km²) | Percentage (%) |
|---|---|---|---|
| Very low | 1342 | 226600 | 47 |
| Low | 328 | 56500 | 12 |
| Moderate | 212 | 33500 | 7 |
| High | 292 | 51500 | 10 |
| Very high | 297 | 58600 | 12 |
| N/Aᵃ | / | 58000 | 12 |
ᵃ N/A: Not applicable. The debris-flow formation conditions were inadequate in the plateaus and plains, and thus these areas were excluded from the susceptibility modeling.
Figure 4. Map of the study area, watershed divisions, and locations of the observed debris flow.
Figure 5. Correlations among the predictor variables of the final gradient boosting machine. The correlations were evaluated by using the Spearman correlation coefficient. Please refer to Table 4 for the description of the variable acronyms. The color of each grid cell represents the strength of the correlation (see the color bar at the bottom) between the two variables labelled at the left and top edges.
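The Spearman coefficient named in the caption is the Pearson correlation of the ranked values, which makes it sensitive to any monotonic relationship rather than only a linear one. The sketch below is a minimal standard-library implementation that uses average ranks for ties. The two toy series are invented for illustration and are not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up values for two variables across five watersheds.
toy_rainfall = [800, 950, 1100, 1300, 1600]
toy_moisture = [0.5, 0.7, 0.6, 1.1, 1.4]
rho = spearman(toy_rainfall, toy_moisture)
print(round(rho, 3))  # 0.9
```

Applying `spearman` to every pair of predictor columns would yield the kind of matrix that the figure displays as a colored grid.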