Lijun Sun, Nanyan Hu, Yicheng Ye, Wenkan Tan, Menglong Wu, Xianhua Wang, Zhaoyun Huang.
Abstract
Rockburst forecasting plays a crucial role in the prevention and control of rockburst disasters. To improve prediction accuracy, this study optimizes both the data structure and the algorithm. At the data structure level, the Yeo-Johnson transform, K-means SMOTE oversampling, and determination of the optimal rockburst feature dimension are applied; at the algorithm level, ensemble stacking is performed on the optimized data. First, the Yeo-Johnson transform is used to address the many outliers in the rockburst data, and the K-means SMOTE algorithm is used to address the class imbalance. Then, 21 new features are generated from the six original rockburst features using the PolynomialFeatures function in Sklearn, and principal component analysis (PCA) is applied to eliminate the correlations among the resulting 27 features. Thirteen machine learning algorithms are trained on datasets that retain different numbers of features after dimensionality reduction in order to determine the optimal rockburst feature dimension. Finally, the 14-feature rockburst dataset is used as the input for ensemble stacking. The results show that the ensemble stacking model based on the Yeo-Johnson transform, K-means SMOTE, and the optimal rockburst feature dimension can improve the accuracy of rockburst prediction by 0.1602 to 0.3636. Compared with the 13 single machine learning models without data preprocessing, the combined data structure and algorithm optimization effectively improves the accuracy of rockburst prediction.
Year: 2022 PMID: 36097043 PMCID: PMC9468028 DOI: 10.1038/s41598-022-19669-5
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
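
The feature counts quoted in the abstract are consistent with a degree-2 polynomial expansion: six original features give 6 squares and 15 pairwise products, that is 21 new terms and 27 features in total. The following minimal sketch enumerates those terms with the standard library only. It assumes a degree-2 expansion without a bias column, which the abstract does not state explicitly, and the names `x1` to `x6` are placeholders rather than the paper's feature names.

```python
from itertools import combinations_with_replacement


def degree2_terms(names):
    """Return (linear, quadratic) monomials of a degree-2 expansion without bias."""
    linear = [(name,) for name in names]
    # Pairs with replacement cover both squares (x1*x1) and cross terms (x1*x2).
    quadratic = list(combinations_with_replacement(names, 2))
    return linear, quadratic


# Placeholder names for the six original rockburst features.
names = [f"x{i}" for i in range(1, 7)]
linear, quadratic = degree2_terms(names)

print(len(linear), len(quadratic), len(linear) + len(quadratic))
```

The three printed counts correspond to the original features, the generated features, and the full set that is passed to PCA.
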
Statistical parameters of different rockburst grades.
| Rockburst grades | Statistical parameters | Rockburst features | | | | | |
|---|---|---|---|---|---|---|---|
| I | Maximum | 118.40 | 237.10 | 17.66 | 7.90 | 5.26 | 8.21 |
| | Minimum | 1.60 | 18.32 | 0.38 | 1.10 | 0.05 | 45.42 |
| | Mean | 26.38 | 104.30 | 4.83 | 3.43 | 0.43 | 24.81 |
| | Coefficient of variation | 0.90 | 0.51 | 0.58 | 0.62 | 2.26 | 0.50 |
| II | Maximum | 148.40 | 263.00 | 22.60 | 10.00 | 4.55 | 42.96 |
| | Minimum | 13.50 | 26.06 | 0.77 | 0.85 | 0.11 | 4.48 |
| | Mean | 51.83 | 127.88 | 6.71 | 4.31 | 0.51 | 22.94 |
| | Coefficient of variation | 0.52 | 0.39 | 0.61 | 0.44 | 1.27 | 0.41 |
| III | Maximum | 132.10 | 304.00 | 54.15 | 10.00 | 2.56 | 80.00 |
| | Minimum | 14.40 | 30.00 | 1.50 | 2.03 | 0.09 | 2.97 |
| | Mean | 65.52 | 145.75 | 8.21 | 5.52 | 0.48 | 23.00 |
| | Coefficient of variation | 0.34 | 0.31 | 0.82 | 0.28 | 0.5 | 0.52 |
| IV | Maximum | 110.35 | 306.58 | 58.59 | 11.20 | 0.82 | 32.24 |
| | Minimum | 30.10 | 80.60 | 2.50 | 1.90 | 0.26 | 2.80 |
| | Mean | 82.37 | 160.34 | 11.50 | 6.34 | 0.54 | 17.58 |
| | Coefficient of variation | 0.29 | 0.34 | 0.83 | 0.29 | 0.30 | 0.37 |

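
The coefficient of variation in the table above is the ratio of the standard deviation to the mean, so a value above 1 (such as the 2.26 reported for one grade I feature) means the spread exceeds the mean. The sketch below shows the calculation on hypothetical sample values. Whether the source used the population or the sample standard deviation is not stated; the population form is assumed here.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Population standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return pstdev(values) / m


# Hypothetical measurements, not taken from the rockburst dataset.
sample = [10.0, 20.0, 30.0]
cv = coefficient_of_variation(sample)
print(round(cv, 4))
```

Because the ratio is dimensionless, it allows the dispersion of features measured on very different scales to be compared directly.
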
Rockburst classification standards.
| Rockburst grades | Rockburst features | |||||
|---|---|---|---|---|---|---|
| I | 0–24.0 | 0–80.0 | 0–5.0 | 0–2.0 | 0.1–0.3 | 40.0–53.0 |
| II | 24.0–60.0 | 80.0–120.0 | 5.0–7.0 | 2.0–3.5 | 0.3–0.5 | 26.7–40.0 |
| III | 60.0–126.0 | 120.0–180.0 | 7.0–9.0 | 3.5–5.0 | 0.5–0.7 | 14.5–26.7 |
| IV | 126.0–200.0 | 180.0–320.0 | 9.0–30.0 | 5.0–6.5 | 0.7–0.9 | 0–14.5 |
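
Reading a grade from the classification standards amounts to an interval lookup. The sketch below does this for the first feature column of the table, whose grade boundaries are 24.0, 60.0, and 126.0. The function name is hypothetical, and assigning a boundary value to the higher grade is an assumption, since the table lists each boundary in both adjacent intervals.

```python
from bisect import bisect_right

GRADES = ["I", "II", "III", "IV"]

# Upper bounds of grades I, II, and III for the first feature column of the table.
FIRST_FEATURE_BOUNDS = [24.0, 60.0, 126.0]


def grade_for(value, bounds=FIRST_FEATURE_BOUNDS):
    """Map a feature value to a grade; a boundary value goes to the higher grade."""
    return GRADES[bisect_right(bounds, value)]


for value in (10.0, 24.0, 65.52, 130.0):
    print(value, grade_for(value))
```

This lookup only fits columns whose intervals increase with grade. The last column of the table runs in the opposite direction (40.0–53.0 for grade I down to 0–14.5 for grade IV) and would need the bounds reversed, and values outside the tabulated range are not rejected here.
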

Figure 1. Proportion of each rockburst grade in the dataset.
Figure 2. Overlaid histograms of each feature in the rockburst dataset.
Figure 3. Unscaled data features.
Figure 4. Data features after the Yeo–Johnson transformation.
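
For reference, the Yeo–Johnson transformation is a power transform defined piecewise so that it also accepts zero and negative values. The sketch below implements the standard definition for a single value and a fixed parameter `lam`. In practice the parameter is estimated from the data, for example by maximum likelihood as in Sklearn's `PowerTransformer`; the fixed values used here are purely illustrative.

```python
import math


def yeo_johnson(x, lam):
    """Yeo-Johnson transform of a scalar x for a given power parameter lam."""
    if x >= 0:
        if abs(lam) < 1e-12:
            return math.log1p(x)                 # limit lam -> 0
        return ((x + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        return -math.log1p(-x)                   # limit lam -> 2
    return -(((1.0 - x) ** (2.0 - lam)) - 1.0) / (2.0 - lam)


# lam = 1 leaves values unchanged; lam < 1 compresses large positive values.
for lam in (1.0, 0.5, 0.0):
    print(lam, [round(yeo_johnson(v, lam), 4) for v in (-3.0, 0.0, 3.0, 99.0)])
```

With `lam` below 1 the right tail is pulled in, which is why the transform is useful for skewed features with large outliers.
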
Prediction results with the original rockburst dataset.
| Model | Rockburst grades | Training precision | Training recall rate | Training F1 | Training accuracy | Test precision | Test recall rate | Test F1 | Test accuracy |
|---|---|---|---|---|---|---|---|---|---|
| SVC | I | 0.7353 | 0.6579 | 0.6944 | 0.6262 | 0.6667 | 0.7692 | 0.7143 | 0.5507 |
| | II | 0.5854 | 0.4364 | 0.5000 | | 0.4167 | 0.2632 | 0.3226 | |
| | III | 0.6107 | 0.9091 | 0.7306 | | 0.5476 | 0.7931 | 0.6479 | |
| | IV | 0.0000 | 0.0000 | 0.0000 | | 0.0000 | 0.0000 | 0.0000 | |
| DT | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.7273 | 0.6154 | 0.6667 | 0.6667 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6842 | 0.6842 | 0.6842 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6875 | 0.7586 | 0.7213 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.4286 | 0.3750 | 0.4000 | |
| KNN | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.8182 | 0.6923 | 0.7500 | 0.6377 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6667 | 0.6316 | 0.6486 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6207 | 0.6207 | 0.6207 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.4545 | 0.6250 | 0.5363 | |
| NBM | I | 0.5952 | 0.6579 | 0.6250 | 0.5680 | 0.5000 | 0.7692 | 0.6061 | 0.4493 |
| | II | 0.5250 | 0.3818 | 0.4421 | | 0.3077 | 0.2105 | 0.2500 | |
| | III | 0.6058 | 0.7159 | 0.6562 | | 0.5457 | 0.5172 | 0.5263 | |
| | IV | 0.4000 | 0.3200 | 0.3556 | | 0.2500 | 0.2500 | 0.2500 | |
| GP | I | 0.7500 | 0.7105 | 0.7297 | 0.7524 | 0.5714 | 0.6154 | 0.5926 | 0.4928 |
| | II | 0.6809 | 0.5818 | 0.6275 | | 0.2308 | 0.1579 | 0.1875 | |
| | III | 0.7727 | 0.9659 | 0.8586 | | 0.5385 | 0.7241 | 0.6176 | |
| | IV | 0.8462 | 0.4400 | 0.5789 | | 0.6667 | 0.2500 | 0.3636 | |
| MLP | I | 0.7812 | 0.6579 | 0.7143 | 0.6942 | 0.7500 | 0.4615 | 0.5714 | 0.5217 |
| | II | 0.6304 | 0.5273 | 0.5743 | | 0.4118 | 0.3684 | 0.3889 | |
| | III | 0.7182 | 0.8977 | 0.7980 | | 0.5500 | 0.7586 | 0.6377 | |
| | IV | 0.5556 | 0.4000 | 0.4651 | | 0.2500 | 0.1250 | 0.1667 | |
| QDA | I | 0.7742 | 0.6316 | 0.6957 | 0.6117 | 0.7143 | 0.7692 | 0.7407 | 0.5362 |
| | II | 0.5660 | 0.5455 | 0.5556 | | 0.4375 | 0.3684 | 0.4000 | |
| | III | 0.6747 | 0.6364 | 0.6550 | | 0.5926 | 0.5517 | 0.5714 | |
| | IV | 0.4103 | 0.6400 | 0.5000 | | 0.3333 | 0.5000 | 0.4000 | |
| GB | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.6667 | 0.6154 | 0.6400 | 0.6377 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6000 | 0.6316 | 0.6154 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6774 | 0.7241 | 0.7000 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.5000 | 0.3750 | 0.4286 | |
| XgBoost | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.7692 | 0.7692 | 0.7692 | 0.6522 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6429 | 0.4737 | 0.5455 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6875 | 0.7586 | 0.7213 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.4000 | 0.5000 | 0.4444 | |
| LightBoost | I | 1.0000 | 1.0000 | 1.0000 | 0.9951 | 0.6923 | 0.6923 | 0.6923 | 0.6377 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6000 | 0.4737 | 0.5294 | |
| | III | 1.0000 | 0.9886 | 0.9943 | | 0.6471 | 0.7586 | 0.6984 | |
| | IV | 0.9615 | 1.0000 | 0.9804 | | 0.5714 | 0.5000 | 0.5333 | |
| RF | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.8182 | 0.6923 | 0.7500 | 0.6957 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6316 | 0.6316 | 0.6316 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.7097 | 0.7586 | 0.7333 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.6250 | 0.6250 | 0.6250 | |
| ET | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.8000 | 0.6154 | 0.6957 | 0.6957 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6316 | 0.6316 | 0.6316 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.7188 | 0.7931 | 0.7541 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.6250 | 0.6250 | 0.6250 | |
| CatBoost | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.7500 | 0.6923 | 0.7200 | 0.6957 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.7059 | 0.6316 | 0.6667 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6970 | 0.7931 | 0.7419 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.5714 | 0.5000 | 0.5333 | |

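
The per-grade precision, recall rate, and F1 values in the table, together with the overall accuracy, all derive from simple counts. The sketch below recomputes the SVC test-set row. The source reports only the ratios, so the counts used here (correct predictions, predicted totals, and actual totals per grade, for a 69-sample test set) are an assumption, chosen because they reproduce the reported figures to four decimals. Returning 0 when a grade is never predicted is also an assumed convention.

```python
def class_metrics(tp, predicted, actual):
    """Precision, recall, and F1 for one class, with 0.0 when undefined."""
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Assumed counts per grade: (correctly predicted, predicted as grade, actually grade).
svc_test_counts = {
    "I": (10, 15, 13),
    "II": (5, 12, 19),
    "III": (23, 42, 29),
    "IV": (0, 0, 8),
}

metrics = {grade: class_metrics(*counts) for grade, counts in svc_test_counts.items()}
accuracy = (
    sum(tp for tp, _, _ in svc_test_counts.values())
    / sum(actual for _, _, actual in svc_test_counts.values())
)

for grade, (p, r, f1) in metrics.items():
    print(grade, round(p, 4), round(r, 4), round(f1, 4))
print("accuracy", round(accuracy, 4))
```

Under these counts the SVC never predicts grade IV, which is consistent with its all-zero grade IV row in the table.
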
Prediction results with the rockburst dataset after preprocessing.
| Model | Rockburst grades | Training precision | Training recall rate | Training F1 | Training accuracy | Test precision | Test recall rate | Test F1 | Test accuracy |
|---|---|---|---|---|---|---|---|---|---|
| SVC | I | 0.9888 | 1.0000 | 0.9944 | 0.9802 | 0.9333 | 0.9333 | 0.9333 | 0.8051 |
| | II | 0.9888 | 0.9778 | 0.9832 | | 0.7143 | 0.6667 | 0.6897 | |
| | III | 0.9767 | 0.9546 | 0.9655 | | 0.6765 | 0.7931 | 0.7302 | |
| | IV | 0.9667 | 0.9886 | 0.9775 | | 0.9231 | 0.8276 | 0.8727 | |
| DT | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.8710 | 0.9000 | 0.8852 | 0.7966 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6765 | 0.7677 | 0.7188 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.7407 | 0.6897 | 0.7143 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.9231 | 0.8276 | 0.8727 | |
| KNN | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9355 | 0.9667 | 0.9508 | 0.8136 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.778 | 0.7000 | 0.7368 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.7097 | 0.7586 | 0.7333 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.8276 | 0.8276 | 0.8276 | |
| NBM | I | 0.7300 | 0.8295 | 0.7766 | 0.6808 | 0.7714 | 0.9000 | 0.8308 | 0.6949 |
| | II | 0.6441 | 0.4222 | 0.5101 | | 0.7059 | 0.4000 | 0.5105 | |
| | III | 0.5795 | 0.5795 | 0.5795 | | 0.5714 | 0.6897 | 0.6250 | |
| | IV | 0.7383 | 0.8977 | 0.8103 | | 0.7419 | 0.7931 | 0.7667 | |
| GP | I | 0.9556 | 0.9773 | 0.9663 | 0.9379 | 0.9333 | 0.9333 | 0.9333 | 0.7797 |
| | II | 0.9556 | 0.9556 | 0.9556 | | 0.6667 | 0.6667 | 0.6667 | |
| | III | 0.9195 | 0.9091 | 0.9143 | | 0.6562 | 0.7241 | 0.6885 | |
| | IV | 0.9295 | 0.9091 | 0.9143 | | 0.8846 | 0.7931 | 0.8364 | |
| MLP | I | 1.0000 | 1.0000 | 1.0000 | 0.9972 | 0.8750 | 0.9333 | 0.9032 | 0.7627 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.6774 | 0.7000 | 0.6885 | |
| | III | 1.0000 | 0.9886 | 0.9943 | | 0.6786 | 0.6552 | 0.6667 | |
| | IV | 0.9888 | 1.0000 | 0.9944 | | 0.8148 | 0.7586 | 0.7857 | |
| QDA | I | 0.8049 | 0.7500 | 0.7765 | 0.6780 | 0.7586 | 0.7333 | 0.7458 | 0.6356 |
| | II | 0.6377 | 0.4889 | 0.5535 | | 0.5833 | 0.4667 | 0.5185 | |
| | III | 0.5568 | 0.5568 | 0.5568 | | 0.4722 | 0.5862 | 0.5231 | |
| | IV | 0.7043 | 0.9205 | 0.7980 | | 0.7586 | 0.7586 | 0.7586 | |
| GB | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9032 | 0.9333 | 0.9180 | 0.7881 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.8400 | 0.7000 | 0.7636 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6364 | 0.7241 | 0.6774 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.7931 | 0.7931 | 0.7931 | |
| XgBoost | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9333 | 0.9333 | 0.9333 | 0.7797 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.7692 | 0.6667 | 0.7143 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6111 | 0.7586 | 0.6769 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.8462 | 0.7586 | 0.8000 | |
| LightBoost | I | 1.0000 | 1.0000 | 1.0000 | 0.9972 | 0.9655 | 0.9333 | 0.9492 | 0.7797 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.7857 | 0.7333 | 0.7586 | |
| | III | 1.0000 | 0.9886 | 0.9943 | | 0.5938 | 0.6552 | 0.6230 | |
| | IV | 0.9888 | 1.0000 | 0.9944 | | 0.7931 | 0.7931 | 0.7931 | |
| RF | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9333 | 0.9333 | 0.9333 | 0.7966 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.7500 | 0.7000 | 0.7241 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6667 | 0.7586 | 0.7079 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.8519 | 0.7931 | 0.8214 | |
| ET | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9355 | 0.9667 | 0.9508 | 0.8136 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.7586 | 0.7333 | 0.7458 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6667 | 0.7586 | 0.7097 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.9200 | 0.7931 | 0.8519 | |
| CatBoost | I | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 0.9062 | 0.9667 | 0.9355 | 0.7966 |
| | II | 1.0000 | 1.0000 | 1.0000 | | 0.8400 | 0.7000 | 0.7636 | |
| | III | 1.0000 | 1.0000 | 1.0000 | | 0.6364 | 0.7241 | 0.6774 | |
| | IV | 1.0000 | 1.0000 | 1.0000 | | 0.8214 | 0.7931 | 0.8070 | |

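
The effect of preprocessing on each single model can be read off by subtracting the test accuracies in the two results tables. The short sketch below does that arithmetic with the values copied from the tables. It covers only the 13 single models, not the stacking model, and the variable names are illustrative.

```python
# Test-set accuracy of each model on the original dataset.
original = {
    "SVC": 0.5507, "DT": 0.6667, "KNN": 0.6377, "NBM": 0.4493, "GP": 0.4928,
    "MLP": 0.5217, "QDA": 0.5362, "GB": 0.6377, "XgBoost": 0.6522,
    "LightBoost": 0.6377, "RF": 0.6957, "ET": 0.6957, "CatBoost": 0.6957,
}

# Test-set accuracy of each model on the preprocessed dataset.
preprocessed = {
    "SVC": 0.8051, "DT": 0.7966, "KNN": 0.8136, "NBM": 0.6949, "GP": 0.7797,
    "MLP": 0.7627, "QDA": 0.6356, "GB": 0.7881, "XgBoost": 0.7797,
    "LightBoost": 0.7797, "RF": 0.7966, "ET": 0.8136, "CatBoost": 0.7966,
}

gains = {model: round(preprocessed[model] - original[model], 4) for model in original}
largest = max(gains, key=gains.get)
smallest = min(gains, key=gains.get)

for model, gain in sorted(gains.items(), key=lambda item: item[1], reverse=True):
    print(f"{model:<10} {gain:+.4f}")
print("largest gain:", largest, "| smallest gain:", smallest)
```

Every model improves on the test set, with the gain ranging from about 0.10 to about 0.29 across the 13 models.
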
Figure 5. Mean decrease accuracy graph of the ET and KNN models.
Figure 6. Heat map of the Pearson correlation coefficients of rockburst features.
Figure 7. Average prediction accuracy of 26 datasets.
Figure 8. Ensemble stacking flow chart.
Figure 9. Confusion matrix diagrams for the XgBoost model and stacking model.
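
As a rough illustration of how the steps named in the abstract could be chained, the following sketch builds a Sklearn pipeline with a Yeo-Johnson transform, degree-2 polynomial features, PCA down to 14 components, and a stacking classifier. It is not the authors' implementation: the data are synthetic, the base learners and the logistic-regression meta-learner are hypothetical choices, all hyperparameters are arbitrary, and K-means SMOTE oversampling is omitted because it lives in the separate imbalanced-learn package.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import (
    ExtraTreesClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, PowerTransformer

# Synthetic stand-in data: 240 samples, 6 skewed features, 4 grades (0 to 3).
rng = np.random.default_rng(0)
X = rng.lognormal(mean=1.0, sigma=0.8, size=(240, 6))
y = rng.integers(0, 4, size=240)

# Hypothetical base learners and meta-learner for the stacking step.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("et", ExtraTreesClassifier(n_estimators=50, random_state=0)),
        ("knn", KNeighborsClassifier(n_neighbors=5)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

model = make_pipeline(
    PowerTransformer(method="yeo-johnson"),            # reduce skew and outliers
    PolynomialFeatures(degree=2, include_bias=False),  # 6 -> 27 features
    PCA(n_components=14),                              # decorrelate, keep 14
    stack,
)

model.fit(X[:180], y[:180])
X_reduced = model[:-1].transform(X[180:])  # output of the preprocessing steps
predictions = model.predict(X[180:])
print(X_reduced.shape, predictions.shape)
```

Because the labels here are random, the resulting accuracy is meaningless; the sketch only shows how the stages connect. Keeping the preprocessing inside the pipeline means it is fitted on the training split only, which avoids leaking test information into the transform and PCA.
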