Literature DB >> 30030501

An approach to predict the height of fractured water-conducting zone of coal roof strata using random forest regression.

Dekang Zhao1,2, Qiang Wu3,4.   

Abstract

Water inrushes from coal-roof strata account for a great proportion of coal mine accidents, and the height of fractured water-conducting zone (FWCZ) is of significant importance for the safe production of coal mines. A novel and promising model for predicting the height of FWCZ was proposed based on random forest regression (RFR), which is a powerful intelligent machine learning algorithm. RFR has high prediction accuracy and is robust in dealing with the complicated and non-linear problems. Also, it can evaluate the importance of the variables. In this study, the proposed model was applied to Hongliu Coal Mine in Northwest China. 85 field measured samples were collected in total, with 60 samples (70%) used for training and 20 (30%) used for validation. For comparison, a support vector machine (SVM) model was also constructed for the prediction. The results show that the two models are in accordance with the field measured data, and RFR shows a better performance on good tolerance to outliers and noises and efficiently on high-dimensional data sets. It is demonstrated that RFR is more practicable and accurate to predict the height of FWCZ. The achievements will be helpful in preventing and controlling the water inrushes from coal-roof strata, and also can be extended to various engineering applications.

Entities:  

Year:  2018        PMID: 30030501      PMCID: PMC6054685          DOI: 10.1038/s41598-018-29418-2

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

In mining activities, mine water has always been a great threat to the coal mine safety. According to statistics, more than 25 billion tons of coal resources are at the risk of water inrushes in China[1]. With increased mining depths in recent years, the hydrogeological conditions of mining become more and more complicated, and the water inrushes from coal-roof strata are increasingly serious[2,3]. During coal extraction, the strata overlying the coal seams move significantly downward due to the rock pressure, forming multiple fractures and fissures in the coal-roof strata. Once these fractures are interconnected and the impermeability of aquitards is destroyed, various kinds of water bodies from coal-roof strata, including surface water, goaf water and aquifer water, will flow into the mining area through the fractures, resulting in water-inrush accidents. The accidents may cause tremendous loss of life and property. Therefore, in order to effectively prevent water inrushes and ensure the safe production of the coal mines, it is essential to accurately predict the height of fractured water-conducting zone (FWCZ) of coal-roof strata. Aiming at the prediction of the height of FWCZ, scholars proposed many methods, including empirical formula method, field measured method, theoretical calculation, numerical simulation and so on[4-12]. In the early 1980s, Liu[4] proposed an empirical formula by the regression analysis of the limited field measured data collected from several large-scale coalmines in North China. But the formula only considers a few factors so that it is unable to precisely reflect the complicated development mechanism of the water-conducting fractures. For this problem, Hu[5] summarized the nonlinear statistical relation between the FWCZ and multiple mining factors including mining height, hard-rock lithology ratio, working face length, mining depth and so on. Shi[6] analyzed the movement characteristics of the overlying strata and the division theory of the “four zones” in overlying strata, then proposed theoretical formulas considering multiple mining factors. To ensure the mining safety of shallow coal seams under water-rich aquifers and determine the development of the fractured water-conducting zone, Liu[7] built a numerical model to analyze the damage zone distribution in Flac3D model. Furthermore, the mathematical theories were gradually adopted in this field. For instance, Yang[8] used the analytic hierarchy process and the fuzzy cluster analysis method to predict the height of FWCZ. Generally, the theoretical calculation model and the numerical simulation have the problems of over idealization and the difficult selection of the mechanical parameters. Meanwhile, the methods were also combined to determine the height of FWCZ. Based on the traditional empirical formula, the crack-measure system and the borehole television detecting system, Gao[9] developed the theoretical calculation, quantitative analysis and detection of the FWCZ of coal-roof strata. At present, the most effective method is the field measurement by using borehole video camera system, water injection system and other direct monitoring equipment. However, these systems require large quantities of engineering with enormous expenses. In recent years, with the rapid development of artificial intelligence technologies, the application of machine learning algorithms (MLAs), such as decision tree (DT), support vector machine (SVM), artificial neural network (ANN) and so on, to predicting the height of FWCZ has gradually been a trend[13-17]. For example, Sun[13] proposed a synthetic calculation system that coupled genetic algorithm (GA) and support vector regression (SVR). This system reflected the relationship between the height of FWCZ and the mining factors effectively. Wu[14] presented a radial basis function neural networks (RBFNN) model to predict the height of FWCZ for fully mechanized longwall mining with sublevel caving. However, these MLAs also have certain limitations in practical applications. For instance, numerous data pre-treatment is required in the DT model, and it tends to fall into local optimum; as for the ANN model, it has the shortcomings of over learning, slow convergence speed and local minimum value[2]. Considering all the aforementioned problems, this paper proposed a predicting model of the height of FWCZ based on random forest regression (RFR), which is a non-parametric regression approach introduced by Breiman[18] in 2001. RFR is a nonlinear modeling tool, coupling the main advantages of two major learning techniques: bagging and random feature selection[18]. It is suitable for the problems with unclear priori knowledge and incomplete data. Further, unlike simple DTs and neural networks (NNs), RFR runs efficiently on high-dimensional data sets. But if there are a lot of irrelevant variables, the DTs does not perform well. The objective of decision tree is to find the interaction between variables, and the weakness of the neural network is its inability to explain its reasoning process and reasoning basis[18]. Compared with the traditional intelligence algorithms, such as ANN and SVM, RFR has high prediction accuracy and good tolerance to outliers and noises[18,19]. Because of its superior performance, RFR has been widely applied to various fields such as biology, medicine, economics, management, remote sensing and other fields in recent years[19-25]. RBFNN has strong nonlinear fitting ability, it can map any complex nonlinear relation, and its learning rule is simple, which is easy to realize by computer. However, the theory and learning algorithm need to be further improved[26-29]. The group method of data handling (GMDH)-type neural network does not need the preset network structure, and the classification rules are expressed by some simple polynomials. However, the GMDH training algorithm can obtain good results only in the case that the noise and interference are distributed by Gaussian, otherwise, the training algorithm often overfits the network[30,31]. In the current study, RFR was used to predict the height of FWCZ. To verify the effectiveness of the generated RFR model, it was applied to Hongliu Coal Mine in Northwest China. Also, a SVM model was constructed for comparison. The results indicate that the RFR model has a better performance, and the prediction results are in good accordance with the field measured data observed using borehole video camera system (BVCS).

Material and Methods

Study area

The Hongliu Coal Mine is situated in the middle east of the Ningxia Hui Autonomous Region in Northwest China, approximately 80 km northwest of Yinchuan City (Fig. 1). The mine is distributed in the NW-SE direction, and it has an area of 79.55 km2 with a length of 15 km and a width of 5.5 km. The general elevation is approximately 1400 m above sea level. Topographically, the mine is located in the west of Mu Us Desert, and the landform in the study area is classified as hilly terrain. It has a semiarid-desert continental-monsoon climate with a mean annual precipitation of 216.3 mm.
Figure 1

Location of the study area and geological structure map.

Location of the study area and geological structure map. Most areas of the mine are covered by aeolian sand of Quaternary, except that sporadic bedrocks are exposed in certain local regions in the southwest of the mine field. According to drilling data, the main strata include: Shangtian Formation of Upper Triassic, Yan’an Formation of Medium Jurassic, An’ding Formation of Upper Jurassic, Qingshuiying Formation of Paleogene (Oligocene) and Quaternary. Hydrogeologically, aquifers in the mine area can be divided into five groups according to the type of aquifer media and void: Quaternary loose alluvial pore aquifer group; Cretaceous rock pore and fractured aquifer group; Yan’an formation (Jurassic System) rock pore and fractured aquifer group; Upper Triassic fractured aquifer group; Permian sandstone and Carboniferous thin limestone aquifer group. Structurally, the overall structural complexity in this area is moderate. In general, the Hongliu Coal Mine takes on a linear structure in NW direction. The crisscrossed faults are widely distributed in the study area. According to statistics, 44 faults and five large-scale folds have been exposed by drilling in the study area. The coal-measure strata in the mining area are in the Yan’an formation of medium Jurassic System. There are 18 coal seams. The main stable and minable coal seams are No. 2 and No. 4 with the average thickness of 4.61 m and 2.97 m, respectively.

Division zones of the coal-roof strata after mining

After the mining of the coal seam, the coal-roof strata are destroyed in various degrees, and have an obvious zoning property. According to the damage degrees and the movement characteristics, the coal-roof strata are divided into three zones: caved zone, fractured zone and continuous bending zone[4,17,32], as illustrated in Fig. 2. The fractured water-conducting zone studied in this paper consists of the caved zone and fractured zone.
Figure 2

Division zones of the coal-roof strata after mining.

Division zones of the coal-roof strata after mining.

Caved zone

Caved zone is at the bottom of the overlying strata. With the moving forward of the mining working face, the immediate roof strata bear imbalance stress. When the load applied on the strata exceeding their bearing capacity, fractures generate. Finally, the strata crush, and the rocks irregularly fall into the void zone until it is filled. Thus, if an aquitard is located within the caved zone, its impermeability will become invalid in different degrees. So the caved zone provides ideal passages of the water from above aquifers to the working face.

Fractured zone

Fractured zone is above the caved zone, and the strata in this zone still maintain a certain continuity compared with the caved zone. The vertical fractures, inclined fractures and horizontal abscission-layer are heavily developed and distributed in the rocks at the bottom of this zone. The damage extent gradually decreases from the bottom of the fracture zone to the upper part, leading to the decrease of the fractures upward to the integrity rocks. This zone makes it possible that the fractures connect the aquifers, causing water inrushes from coal-roof strata. This zone is the main part of the water-conducting zone.

Continuous bending zone

Continuous bending zone refers to the strata between the fractured zone and the ground surface. The strata in this zone present the basic properties of downward movement without fractures developed within the rocks, especially the soft rock and loose soil strata. The movement of the strata almost hardly affects the impermeability of the aquitards in this zone, and it plays a protective role of the aquitards. A few fissures may appear in certain tension positions, but in general the strata maintain continuous[17].

Random forest regression (RFR)

RFR, introduced by Breiman in 2001, is an ensemble learning algorithm of multiple regression trees. Compared with simple decision trees, RFR runs efficiently on high-dimensional data sets, and it is more accurate and robust to noise[18,19]. Besides, RFR has great advantages over traditional intelligent algorithms[18-24]. On the one hand, it has a very fast learning process and can handle a large number of input variables while assessing the importance of variables. On the other hand, when building a forest, it can internally estimate the generalization error and estimating missing data can maintain high accuracy even if most of the data is lost. RFR is an ensemble of regression trees (RTs) to predict the value of a variable. It draws multiple samples based on the bootstrap resampling method from the original samples, and then constructs the decision trees model for the samples. Finally, the prediction output is obtained by calculating the average value of all prediction trees[18]. Figure 3 shows the sketch map of the RFR structure, and the specific implementation procedures of the RFR algorithm are as follows:
Figure 3

Sketch map of the RFR structure.

Sketch map of the RFR structure. (1) Draw k samples randomly from the original training set X (N samples) using bootstrap resampling method, and then k regression trees are constructed. In this process, the probability that each sample wouldn’t be drawn is p = (1−1/N). If N tends to infinity, p ≈ 0.37, as indicates that about 37% of the samples in the original training set X are not drawn, these data are called out-of-bag (OOB) data. These OOB data can be used to be test samples. (2) For k bootstrap samples, k unpruned regression trees are created respectively. In the tree growing process, for each node, m attributes are randomly selected from the total M attributes as internal nodes (m < M). Then, according to the minimum Gini index principle, an optimal attribute is selected from m attributes as a split variable to make the branches grow. (3) The generated k regression trees constitute the final random forest regression model. The model estimation performance could be evaluated based on the indices: mean square error of OOB (MSE) and coefficients of determination ().where n is the total number of the OOB samples; y is the observed output value; ŷ is the predicted output obtained by the generated RFR regression model; is the predicted variance of the OOB output.

Variables importance measures

The RFR model provides two ways to calculate the importance degree of each variable index: mean decrease in Gini index and mean decrease in accuracy[18-20]. The mean decrease in Gini index means the total impurity decrease of each variable at each tree node. The method evaluates the importance of the variables by calculating the Gini index based on the Equation (3), and then accumulates the total impurity decrease of all the trees.where p is the probability of the samples belonging to the i-th leaf; N is the number of the leave; I is the Gini index. The basic principle of the OOB error estimation method is: when the noise is added to a related feature which plays an important role in the accuracy, the prediction accuracy of the RFR will decrease significantly. The main procedures are as follows: firstly, for the generated RFR, the OOB error e of each decision tree is calculated according to the OOB data; secondly, the j-th eigenvalue X of the OOB data is changed randomly (namely the noise interference is added artificially); then, the OOB data with noise are used to test the performance of the RFR and a new OOB error is obtained. Finally, the importance degree of the variable X can be calculated according to the Equation (4):where X is the j-th eigenvalue of the OOB data; e is the initial OOB error; is the OOB error with noise; n is the number of the decision trees; I(X) is the importance of the variable X. The greater the OOB error caused by the change of the variable X, the more the decrease in accuracy, indicating the more important the variable is.

Construction of the main controlling factors system

The development of FWCZ of coal-roof strata is influenced by multiple factors. And it has a complex nonlinear relationship with the strata geological features, rock mechanics and mining conditions[17]. Based on a large number of field observations for fully-mechanized mining and theoretical studies, five main controlling factors were selected, including mining depth, mining height, lithology type of the overlying strata, working-face length and coal-seam dip angle. A brief overview of the five factors is described as follows.

Mining depth

According to the theories of mining engineering geology and rock mechanics, the situ stress of the strata around the underground excavation space has a great impact on the destruction scope of the surrounding rock. Generally, the primary rock stress of the surrounding rock is proportional to the mining depth. With the increase of the mining depth of the coal seam, the in situ stresses and the displacement of the overlying rock gradually increase, which will lead to more fractures developed in the coal-roof strata.

Mining height

Mining height is the decisive factor of the fractured zone height. The greater the mining height, the larger the range of the coal-roof plastic zone. And a greater space available to the caving rock will form, resulting in a greater height of the fractured zone. In the traditional empirical formula prediction method, mining height is the only factor that controls the FWCZ height.

Lithology type

When the overlying rock is disturbed by the mining activities, the brittle rock with higher hardness (such as limestone and sandstone) is apt to crack and produce fractures. While, for the soft rock (such as mudstone and shale), the plastic deformation mainly occurs, and fractures rarely appear. After the extraction of the coal seams, the compressive strength of the overlying rock directly affects the rock failure degree. The rock with a greater compressive strength will be not prone to be destroyed. Generally, according to the uniaxial compressive strength of the rock, the lithology of the overlying strata is classified into four types[13-15]: hard, medium hard, medium soft and soft, with the quantitative values of 4, 3, 2 and 1, respectively.

Working-face length

Working-face length, like the mining height, is an index that reflects the influence of the mining space size on the fractured water-conducting zone. According to the material mechanics theory, the curvature of a rock beam with two ends fixed is proportional to the span. The greater the length of the working face, the greater the downward curvature of the coal-roof strata. Thus, the break probability of the rock beam increases, resulting in a higher height of the fractured zone.

Coal-seam dip angle

The influence that the coal-seam dip angle on the overlying strata is mainly embodied in the different failure forms of the strata. When the coal seam is horizontal, the form of the fractured zone is nearly symmetrical, showing a saddle shape. With the increase of the dip angle, the failure form of overlying rock gradually develops into parabola and arch shapes.

Results and Discussion

Datasets used

The collection of the datasets is the most important part for any machine learning algorithm. In this study, 85 field measured datasets for fully-mechanized mining were collected from several large-scale coalmines in North China, referring to the previous research documents[13-17]. Each case contains the field measured data of the aforementioned five main-controlling factors and the height of FWCZ. Of the 85 datasets, 60 (70%) were randomly selected for training (Table 1), while the remaining 25 (30%) for model testing. Figure 4 shows the detailed flowchart of the methodology used in this study.
Table 1

Field measured sample datasets for model training.

No.Sample sourceMining depth (m)Mining height (m)Lithology typeWorking-face length (m)Coal-seam dip angle (°)Height of fractured water-conducting zone (m)
1No. 3241 working face in Qidong Coal Mine5502.441801555.32
2No. 8 coal seam in Yangzhuang Coal Mine3201.7465627.5
3No. 2 coal seam in Tiebei Coal Mine12531150522
4No. 4320 working face in Xinglongzhuang Coal Mine45084170886.8
5No. 4308 working face in Dongtan Coal Mine4334306035
6No. 16 coal seam in Zhaopo Coal Mine1201.2275831
7No. 3 coal seam in Wulanmulun Coal Mine101.12.23158163
8No. 13013 coal seam in Baodian Coal Mine4172.9480468
9No. 2 coal seam in Guantai Coal Mine30042752120
10No. 8 coal seam in Luling Coal Mine2764.51350717.2
11No. 7 coal seam in Fangezhuang Coal Mine8442108330
12No. 1203 working face in Daliuta Coal Mine4941135545
13No. C13-1 working face in Panxie Coal Mine1173.42205272
14No. 1 coal seam in Xinji Coal Mine29061645885.6
15No. 4 coal seam in Liuhualing Coal Mine892.03469745.86
16No. 16 coal seam in Laoshidan Coal Mine2001.514504.5
17No. 2 coal seam in Kongji Coal Mine20084897648
18No. 11 coal seam in Tongting Coal Mine23021853752.5
19No. 1 coal seam in Qilianta Coal Mine564.3455042.5
20No. 31107 working face in Luxi Coal Mine3502.52135520
21No. 9 coal seam in Luling Coal Mine284721303.526
22No. 7141 working face in Qidong Coal Mine5202.331741250.675
23No. 3241 working face in Qidong Coal Mine5092.25318012.534.925
24No. 7130 working face in Qidong Coal Mine402.5331701219.6
25No. 1013 working face in Wugou Coal Mine386.53.131501040.79
26No. 1017 working face in Wugou Coal Mine3803.53180645.84
27No. 345 working face in Qinan Coal Mine395.53.4531601426.7
28No. 1031 working face in Taoyuan Coal Mine384.22.653190.52133
29No. 1062 working face in Taoyuan Coal Mine306331502833.615
30No. 745 working face in Haizi Coal Mine404.52.33951819.5
31No. 1031 working face in Haizi Coal Mine313.52.4365621.9
32No. 841 working face in Zhuxianzhuang Coal Mine342.53.831141328.455
33No. 821 working face in Zhuxianzhuang Coal Mine338.51.93115.52020.995
34No. 822 working face in Zhuxianzhuang Coal Mine3161.931651226.085
35No. 721 working face in Zhuxianzhuang Coal Mine2961.9495.51517.84
36No. II 865 working face in Zhuxianzhuang Coal Mine493.7513.4331301593.175
37No. 8212 working face in Xutong Coal Mine3952.53178926.33
38No. 7126 working face in Xutong Coal Mine478.52.53180833.755
39No. 16028 working face in Paner Coal Mine3401.82178319.69
40No. 1207 working face in Paner Coal Mine31922148517.155
41No. 1201(3) working face in Paner Coal Mine3112285319.11
42No. 1201(1) working face in Paner Coal Mine327.52278722.995
43No. 12128 working face in Paner Coal Mine355.522125323.865
44No. 12118 working face in Paner Coal Mine349231305.522.31
45No. 12117 working face in Paner Coal Mine36323180816.845
46No. 1701(3) working face in Pansan Coal Mine44723107430.965
47No. 1711(3) working face in Pansan Coal Mine420.52.83135341.13
48No. 1211(3) working face in Pansan Coal Mine509.5321401026.01
49No. 14032(3) working face in Panyi Coal Mine3832.22125513.035
50No. 14021(3) working face in Panyi Coal Mine376.522124512.675
51No. 1401(3) working face in Panyi Coal Mine3911.82125514.29
52No. 1402(3) working face in Panyi Coal Mine4042.22150621.195
53No. 1412(3) working face in Panyi Coal Mine4153.43120830.085
54No. 1121(1) working face in Panyi Coal Mine4181.83120624.69
55No. 2622(3) working face in Panyi Coal Mine552.55.83182844.36
56No. 1121(3) working face in Panyi Coal Mine490.563182744.19
57No. 1211(3) working face in Xieqiao Coal Mine44543198829.265
58No. 1221(3) working face in Xieqiao Coal Mine490.553172852.76
59No. 1221(3) working face in Zhangji Coal Mine605.533136238.185
60No. 1212(3) working face in Zhangji Coal Mine5163.93205231.765
Figure 4

Detailed flowchart of the proposed methodology.

Field measured sample datasets for model training. Detailed flowchart of the proposed methodology.

Establishment of the RFR model

In the RFR, two parameters are required to define: the number of trees in the forest (ntree), and the number of the random variables of the split nodes (mtry). To maximize the model accuracy, it is necessary to optimize the combination of the parameters mtry and ntree[18]. When ntree is defined with a small value, the RFR prediction error is uncontrollable and the model performance cannot achieve the optimal identification. Conversely, if the parameter ntree is too large, the computation time and required memory will increase accordingly. By repeated operation, it is found that when ntree = 200, the MSEOOB tends to be stable and the model does not tend to over fitting. According to Breiman[18], mtry < M. In this case study, there are five variables, namely M = 5. To assess the optimal value of mtry, three RFR models were created for mtry = 1, mtry = 2 and mtry = 3 (Fig. 5). Figure 5 shows the change of the error depending on the number of the trees ntree. The results show that when ntree = 200, the error of the model is stable, and when mtry = 1, the MSEOOB is lowest at about 6.9 m2. Therefore, considering both the accuracy and computation cost, the two optimized parameters of the RFR are as follows: mtry = 1 and ntree = 200.
Figure 5

The OOB error of the RFR model.

The OOB error of the RFR model. The contribution of each factor to the generated RFR model is shown in Fig. 6. As shown, the importance degree of each factor is measured based on two ways: mean decrease in Gini index and OOB mean decrease in accuracy. According to the Gini index, mining height and mining depth have the highest importance, followed by coal-seam dip angle and working-face length, while lithology type has the lowest importance. Regarding the OOB mean decrease in accuracy, the order of the importance degree is consistent with the result obtained by Gini index method. Based on both of the features of importance, mining height and mining depth are the two most important factors out of the five factors, as suggests that they contribute overwhelmingly to the development of the FWCZ height.
Figure 6

Importance degree of the main controlling factors determined by two ways: (a) Mean decrease in Gini index; (b) Mean decrease in accuracy. (MH: mining height; MD: mining depth; CSDA: coal-seam dip angle; WFL: working-face length; LT: lithology type).

Importance degree of the main controlling factors determined by two ways: (a) Mean decrease in Gini index; (b) Mean decrease in accuracy. (MH: mining height; MD: mining depth; CSDA: coal-seam dip angle; WFL: working-face length; LT: lithology type).

SVM model for comparison

For comparison, support vector machine (SVM) regression model was also used for the prediction of the height of fractured water-conducting zone. SVM has superior prediction performance in various fields for data modeling and function optimization because of its ability to represent non-linearities[23]. The radial basis function (RBF) was adopted as the kernel function, and the two main parameters RBF kernel coefficient γ and penalty coefficient C were determined as 0.1 and 0.5. And then the SVM regression model was constructed using the same training data aforementioned.

Model evaluation

The model evaluation is an important procedure before the model application. The root mean square error (RMSE) and the coefficient of determination R2 were utilized to evaluate the performance of the two generated regression models. RMSE is generally used for measuring the residual errors, and it reflects the difference between original and modeled values. The lower the RMSE, the better the model performs. R2 provides a measure of how well the predicted output of the regression model fits the observed data. The value of R2 varies between 0 and 1. A higher R2 indicates that the regression model fits the data better. The two indices are defined as follows:where n is the total number of the test samples; y is the observed output value of the test samples; y is the predicted value by the generated models; ŷ is the average output value of the test samples. Table 2 lists 25 sample datasets for model testing to evaluate the performance of the models. Using the two regression models generated above, the heights of FWCZ of the 25 cases were predicted. Figure 7 shows the predicted value against the observed data with the test data using SVM and RFR, respectively. Based on the Equations (5) and (6), the RMSE and R2 of the two models were calculated as Table 3. As it shown, the RFR model has the lower RMSE and higher R2 with the value of 2.363 and 0.968, respectively (compared to 4.396 and 0.902 for SVM model). Therefore, it is concluded that both models are reasonable, and RFR has a better performance compared with the SVM.
Table 2

Field measured sample datasets for model testing.

No.Sample sourceMining depth (m)Mining height (m)Lithology typeWorking-face length (m)Coal-seam dip angle (°)Height of fractured water-conducting zone (m)
1No. 1215(3) working face in Zhangji Coal Mine520.533202233.365
2No. 1242(1) working face in Gubei Coal Mine6203.142403.520.215
3No. 7192 working face in Kongzhuang Coal Mine2205.331202546.5
4No. S4101 working face in Pingshuo Coal Mine3607.693220345.125
5No. ZF2801 working face in Xiagou Coal Mine3479.93100279.255
6No. 5306 working face in Xinglongzhuang Coal Mine4126.93160438.8
7No. 6206 working face in Wangzhuang Coal Mine3165.932484.560.81
8No. I03(2) working face in Laogongyingzi Coal Mine2403.51195721.675
9No. I03(4) working face in Laogongyingzi Coal Mine2403.51195717.445
10No. 3202 working face in Wangpo Coal Mine474.165.83230465.395
11No. 93101 working face in Nantun Coal Mine541.55.2831756.549.25
12No. 1301 working face in Jisan Coal Mine4806.33170446.13
13No. 1305 working face in Dongtan Coal Mine6008.783223.35654.08
14No. 2308 working face in Xinglongzhuang Coal Mine332.857.153160516.115
15No. 2306 working face in Xinglongzhuang Coal Mine319.28.231607.527.84
16No. 2302 working face in Xinglongzhuang Coal Mine278.158.73170828.56
17No. 2300 working face in Xinglongzhuang Coal Mine2828.553140525.255
18No. 23S2 working face in Xinglongzhuang Coal Mine258.558.453175320.85
19No. 2303 working face in Xinglongzhuang Coal Mine286.457.83150835.9
20No. 1314 working face in Baodian Coal Mine3508.531696.555.255
21No. 2605 working face in Yangcun Coal Mine187.51.233001010.41
22No. 63110 working face in Nantun Coal Mine368.055.773125648.35
23No. 2186 working face in Donghuantuo Coal Mine4203.43702356.8
24Fangezhuang Coal Mine1731.94702025.3
25No. 1672 working face in Qianjiaying Coal Mine4463.841431740
Figure 7

Comparison of the observed and predicted height with the test data by using: (a) SVM; (b) RFR.

Table 3

RMSE and R2 of the RFR and SVM models.

Prediction modelRMSE (m) R 2
SVM4.3960.902
RFR2.6360.968
Field measured sample datasets for model testing. Comparison of the observed and predicted height with the test data by using: (a) SVM; (b) RFR. RMSE and R2 of the RFR and SVM models.

Model application

Engineering background and predicted results

The No. 1121 working face of the No. 2 coal seam, located in the center of Hongliu Coal Mine, is the initial mining face of the mine. The length of the working face is 1379 m, and the average mining depth is 265 m. The fully-mechanized longwall mining method is adopted in the mining process. The No. 2 coal seam belongs to the Yan’an Formation of the Jurassic System, with an average thickness of 5.28 m. The dip angle of the No. 2 coal seam varies from 5° to 15°, with the average value of 10°. Figure 8 displays the typical geological column of the No. 1121 working face overlying strata. As it shown, the strata directly overlying the coal mainly consist of the silt and fine sandstones in the lower Zhiluo formation, which are considered as aquitards. The average total thickness of these strata is 52.2 m. According to the rock division rule aforementioned, the sandstone is considered to be hard rock, so the lithology type of the strata is quantified as 4. The first aquifer overlying the No. 2 coal seam is about 51.28 m distance from the coal. It consists of grit sandstones with great thickness, and it has a rich water-abundance property. Thus, in order to evaluate the risk of water inrushes from overlying the coal seam and take corresponding measures, it is necessary to precisely predict the height of the FWCZ. By applying the above generated SVM and RFR models to the No. 1121 working face of the No. 2 coal seam, the height of FWCZ is predicted as 64.17 m and 62.96 m, respectively.
Figure 8

Typical geological column of the No. 1121 working face overlying strata.

Typical geological column of the No. 1121 working face overlying strata.

Practical situation of the No. 1121 working face

When the No. 1121 working face moved forward about 56 m, a large amount of water from the overlying aquifer leaked into the working face. The maximum water inflow was up to 1817 m3/h, so the mining operation had to be terminated. For drawing up the water-inrush prevention measures scientifically, the borehole video camera system (BVCS) was used to observe the height of the fractured water-conducting zone. BVCS is an exploration technology which can directly observe the inner conditions of the boreholes based on the optics theory. The system can be used to observe the strata lithology, geological structures properties, fracture-zone development conditions, groundwater levels change and so on[10]. In this study, the BVCS was used to observe the change of the fractures development degree with the increase of the borehole depth and determine the top boundaries of the fractured zone and the caved zone. Figure 9 shows the video camera images of the borehole HL-1. According to the images, the rocks above 279.07 m are sandstone and mudstone interbed, and they are relatively integrated except that a few small cracks in the horizontal direction appear in certain local positions (Fig. 9a).
Figure 9

Video camera images of the borehole HL-1: (a) Integrate rock without fracture; (b) Fractured zone with various forms of fractures; (c) Caved zone.

Video camera images of the borehole HL-1: (a) Integrate rock without fracture; (b) Fractured zone with various forms of fractures; (c) Caved zone. Figure 9b shows the fractured zone images with various forms of fractures: a nearly vertical fracture with a small width appears at the borehole depth of 279.07–279.27 m; there are many abscission-layer phenomena between 286.3 m and 294.68 m; the crisscrossed fractures with large displacements are distributed in the rocks below 294 m. Therefore, according to the fractures development conditions described above, the depth of 279.07 m is considered as the top boundary of the fractured zone. As Fig. 9c shown, the rocks below the depth of 298.9 m were damaged seriously, and there is a vast void area with obvious mining collapse characteristics. So the depth of 298.9 m is determined as the top boundary of the caved zone. Based on the formula proposed by China Coal Industry Bureau[20], the height of FWCZ of coal-roof strata can be calculated as follows:where H is the maximum height of FWCZ (m); H′ is the depth of the coal-seam floor (m); h is the depth of the fractured zone’s top boundary (m); M is the thickness of the mining coal seam (m). According to the drilling data and the video camera images of borehole HL-1, the depth of the No. 2 coal-floor is 345.88 m, and the thickness of the coal seam is 5.28 m. Therefore, combined with the Equation (7), the height of FWCZ is calculated to be H = (345.88–279.07–5.28) = 61.53 m. Table 4 shows the comparison between the field measured data and the prediction results obtained by the proposed methods.
Table 4

Comparison between the field measured data and the prediction results obtained by the SVM and RFR.

ModelPrediction results (m)Field measured data by BVCS (m)Absolute error (m)Relative error (%)
SVM64.1761.532.644.29
RFR62.961.432.32
Comparison between the field measured data and the prediction results obtained by the SVM and RFR. The results show that compared with the field measured data, the SVM and RFR methods have the relative error of 4.29% and 2.32%, respectively. It indicates that both of the prediction results are generally in good agreement with the field-observed result, and the RFR model has a better performance in the application of the study area, which is in accordance with the above conclusion.

Summary and Conclusions

To ensure the safe production of coal mines, this study proposed a prediction model of the height of FWCZ based on RFR. RFR is a robust machine learning method that can be used to evaluate the variable importance and predict the height of FWCZ. Compared with the traditional MLAs, RFR has numerous advantages, especially, its high prediction accuracy and it is well suitable for the problems with unclear priori knowledge and incomplete data. For the objective problems faced in this study, for instance, the lack of data samples, the RFR model can still maintain a high degree of accuracy. Then, the RFR model was applied to Hongliu Coal Mine in Northwest China. And the main conclusions are reached as follows. Five variables were selected to construct the main controlling factors system. And according to the importance degree measurement by the mean decrease in Gini index and OOB mean decrease in accuracy, mining height and mining depth are the top two most important factors out of the five variables. For comparison with the generated RFR model, a SVM model was also constructed using the same training datasets. By the validation of the two models, the RFR model has the lower RMSE and higher R2 with the value of 2.363 and 0.968, respectively (compared to 4.396 and 0.902 for SVM model). The two generated models were applied to the No. 1121 working face in Hongliu coal mine to verify the effectiveness of the models. The prediction heights of the FWCZ by using RFR and SVM are 62.96 m and 64.17 m, respectively. Field measured data by borehole video camera system is 61.53 m, and the RFR and SVM have the relatively error of 2.32% and 4.29%, respectively. It is concluded that RFR has a better performance in the application of the study area compared with the SVM. This study shows the potential to provide a novel approach to predict the height of FWCZ. The results provide a reference for water-inrush risk management, prevention and reduction in the study area.
  3 in total

1.  A comparison of random forest regression and multiple linear regression for prediction in neuroscience.

Authors:  Paul F Smith; Siva Ganesh; Ping Liu
Journal:  J Neurosci Methods       Date:  2013-09-06       Impact factor: 2.390

2.  Predictive modeling of groundwater nitrate pollution using Random Forest and multisource variables related to intrinsic and specific vulnerability: a case study in an agricultural setting (Southern Spain).

Authors:  Victor Rodriguez-Galiano; Maria Paula Mendes; Maria Jose Garcia-Soldado; Mario Chica-Olmo; Luis Ribeiro
Journal:  Sci Total Environ       Date:  2014-01-24       Impact factor: 7.963

3.  Dissolution Trapping of Carbon Dioxide in Heterogeneous Aquifers.

Authors:  Mohamad Reza Soltanian; Mohammad Amin Amooie; Naum Gershenzon; Zhenxue Dai; Robert Ritzi; Fengyang Xiong; David Cole; Joachim Moortgat
Journal:  Environ Sci Technol       Date:  2017-06-16       Impact factor: 9.028

  3 in total
  2 in total

1.  Understanding the Capability of an Ecosystem Nature-Restoration in Coal Mined Area.

Authors:  Xiaoqin Cui; Suping Peng; Laurence R Lines; Guowei Zhu; Zhenqi Hu; Fan Cui
Journal:  Sci Rep       Date:  2019-12-23       Impact factor: 4.379

2.  Experimental investigation on rockburst behavior of the rock-coal-bolt specimen under different stress conditions.

Authors:  Gen-Shui Wu; Wei-Jian Yu; Jian-Ping Zuo; Chun-Yuan Li; Jie-Hua Li; Shao-Hua Du
Journal:  Sci Rep       Date:  2020-05-05       Impact factor: 4.379

  2 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.