In hydraulic fracturing operations, small rounded particles called proppants are mixed with fracture fluids and injected into the targeted formation. The proppant particles hold the fracture open against formation closure stresses, providing a conduit for reservoir fluid flow. The fracture's capacity to transport fluids is called fracture conductivity and is the product of proppant permeability and fracture width. Prediction of the propped fracture conductivity is essential for fracture design optimization. Several theoretical and a few empirical models have been developed in the literature to estimate fracture conductivity, but these models are either too complex to be practical or of limited accuracy because of data limitations. In this research, and for the first time, a machine learning approach was used to generate simple and accurate propped fracture conductivity correlations in unconventional gas shale formations. Around 350 consistent data points were collected from experiments on several important shale formations, namely, Marcellus, Barnett, Fayetteville, and Eagle Ford. Several machine learning models were utilized in this research, such as artificial neural network (ANN), fuzzy logic, and functional network. The ANN model provided the highest accuracy in fracture conductivity estimation with R2 of 0.89 and 0.93 for training and testing data sets, respectively. We observed that a higher accuracy could be achieved by creating a correlation specific for each shale formation individually. Easily obtained input parameters were used to predict the fracture conductivity, namely, fracture orientation, closure stress, proppant mesh size, proppant load, static Young's modulus, static Poisson's ratio, and brittleness index. Exploratory data analysis showed that the features above are important, with closure stress being the most significant.
Hydraulic
fracturing enables the exploitation of hydrocarbons from
tight shale formations at commercial rates. A huge amount of slurry
containing the proppant is placed in the fractures to keep them open
and conductive under formation closure stresses. The ability of these
fractures to deliver hydrocarbons is called propped fracture conductivity.
It is described mathematically as the product of the proppant permeability
and the width of the fracture.[1,2]

Under controlled
laboratory conditions, propped fracture conductivity
is measured precisely using an American Petroleum Institute (API)
conductivity cell. Extensive studies have been conducted on common
shale formations (namely, Marcellus, Barnett, Fayetteville, and Eagle
Ford) to understand the factors influencing propped fracture conductivity.
Each shale formation is composed of different minerals with different
amounts. For instance, Barnett is a clay-rich shale that may contain
up to 50% clay minerals. In contrast, the Eagle Ford has a minor clay
content and is made of more than 70% carbonate. Figure 1 shows a rough estimate of each shale’s
mineralogical content.[3−6]
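The definition above (conductivity as the product of proppant permeability and fracture width) can be sketched in a few lines. The function name, the units, and the numbers below are illustrative assumptions and are not taken from the study.

```python
def fracture_conductivity(permeability_md: float, width_ft: float) -> float:
    """Propped fracture conductivity (md-ft) as permeability (md) times width (ft)."""
    if permeability_md < 0 or width_ft < 0:
        raise ValueError("permeability and width must be non-negative")
    return permeability_md * width_ft


# Hypothetical proppant pack: 150,000 md permeability and 0.12 in propped width.
width_ft = 0.12 / 12.0  # convert inches to feet
print(fracture_conductivity(150_000.0, width_ft))
```

Because the two factors multiply, halving either the pack permeability or the propped width halves the conductivity.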
Figure 1. Mineralogy of different shale formations.

Shale and proppant properties combined control the propped fracture
conductivity along with the closure stress.[7,8] When
the proppant load is low, the conductivity is mainly controlled by
the shale properties. In contrast, at high proppant load, the proppant
characteristics become more important in determining the fracture
conductivity.[9,10]

On the one hand, the shale-related
properties have a major effect
on propped fracture conductivity at low proppant concentrations. The
effect of fracture orientation relative to the bedding planes on fracture
conductivity becomes more pronounced when the shale mechanical properties,
that is, static Young’s modulus (Estatic) and static Poisson’s
ratio (PR), are strongly anisotropic.[11] The fracture topography
also plays an important role: the conductivity increases with
the fracture surface roughness.[12] Knorr[13] found that increasing
the fracture surface roughness slows the rate at which
the conductivity declines. On the other hand, increasing
the proppant concentration yields a higher conductivity.[14] Also, as the proppant size increases, the conductivity
increases.[15]

Other factors could
also influence the magnitude of fracture conductivity
due to shale, fracture fluids, and proppant interactions. For instance,
short-term fracture conductivity loss can result from water
damage, clay swelling, and proppant embedment.[16−18] Long-term
fracture conductivity loss during production occurs due to fines migration,
proppant crushing, and shale creep.[10,19−21] Non-Darcy flow, due to high gas velocity, is known to reduce the
effective proppant permeability.[22] These
interactions make predicting propped fracture conductivity
a challenging task.[23]

There are several
reported approaches in the literature to predict
fracture conductivity. These models are either theoretical or
empirical, that is, derived from lab data. Zhang et al.[24] derived a theoretical model to predict shale fracture conductivity
considering the damage caused by water. The model coefficients were
obtained from experiments on the Barnett shale. Thus, it is only applicable
to shales with similar properties and mineralogy. Jia et al.[25] provided a mathematical model to estimate nonspherical
proppant conductivity, while Xu et al.[26] added the impact of fracture tortuosity. The discrete element
method combined with computational fluid dynamics has also been used to estimate
the propped fracture conductivity profile.[27] Many theoretical models have been proposed to predict propped fracture
conductivity, but only a few empirical models have been developed. In fact,
generating sufficient experimental data that consider all factors
impacting fracture conductivity is challenging. Kainer[28] employed multiple linear regression to generate
empirical correlations based on experimental data for different shale
types. He developed several models and determined the corresponding
significant predictors for each type. The correlation coefficient
obtained, however, is low.

The measurement of shale propped
fracture conductivity in a laboratory
setup is the most accurate approach but is a time-consuming and costly
process. Predicting fracture conductivity from theoretical models
is challenging as these models are complex and require many input
parameters that might not be available. The previously developed empirical
correlations are based on limited data, which makes their accuracy questionable. This suggests
the need for a quick, cheap, and precise method to predict the propped
fracture conductivity of different shale formations. The objective
of this work is to develop such models by applying different machine learning
(ML) approaches to a large experimental data set.
Methodology
The experimental data for the prediction of propped fracture conductivity
were taken from several published sources from the same research group.[8−10,12] An extensive literature
review was conducted to collect data obtained with the same API conductivity cell and
experimental procedures, which ensured a sufficiently large set of consistent
experimental data on several shales under a controlled environment.
After data preprocessing, the total number of collected data points
was 351. The relative importance of the features controlling fracture
conductivity was examined through the correlation coefficient (CC)
criterion. This study used Pearson, Spearman, and Kendall CC criteria
to investigate the relationship between the features and fracture
conductivity. Then the data were analyzed, and correlations were extracted
using artificial neural network (ANN), fuzzy logic (FL), and functional
network (FN) models.
Exploratory Data Analysis
Exploratory
data analysis
(EDA) is the process of investigating the relationships, patterns,
and collinearity among the input parameters. The experimental data
include investigations of fracture orientation, proppant mesh size,
closure stress, proppant load, Estatic, PR, brittleness index (BI), and measured propped fracture conductivity.
The fracture orientation is a categorical variable that can only be
vertical or horizontal. Transforming the categorical variable into
a dummy variable is a common practice to account for the different
categories in the model. The presence of a category is
represented by 1, while its absence is represented by 0. Thus, the fracture
orientation is dimensionless. The mesh size range was averaged and
then converted into a particle diameter as shown in Table 1. The data were obtained from
experiments on different shale formations such as Marcellus, Eagle
Ford, Barnett, and Fayetteville. A complete statistical description
of the data used for training is given in Table 2.
Table 1. Conversion of the Proppant Size from API to Inches

| mesh size (API) | mesh size (in) |
|---|---|
| 30 | 0.0232 |
| 40 | 0.0165 |
| 50 | 0.0117 |
| 70 | 0.0083 |
| 100 | 0.0059 |
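The two preprocessing steps described above (dummy-coding the fracture orientation and converting an averaged mesh range to a particle diameter) can be sketched as follows. The lookup values come from Table 1; the function names, the "20/40" string format, and the choice of horizontal as the category coded 1 are assumptions made for illustration.

```python
# Table 1 of the text: averaged API mesh size -> particle diameter (in).
MESH_TO_INCH = {30: 0.0232, 40: 0.0165, 50: 0.0117, 70: 0.0083, 100: 0.0059}


def encode_orientation(orientation: str) -> int:
    """Dummy-code the orientation (assumption: horizontal -> 1, vertical -> 0)."""
    label = orientation.strip().lower()
    if label not in ("horizontal", "vertical"):
        raise ValueError(f"unknown fracture orientation: {orientation!r}")
    return 1 if label == "horizontal" else 0


def mesh_range_to_diameter(mesh_range: str) -> float:
    """Average a mesh range such as '20/40' and look the result up in Table 1."""
    bounds = [float(part) for part in mesh_range.split("/")]
    average = sum(bounds) / len(bounds)
    if not average.is_integer() or int(average) not in MESH_TO_INCH:
        raise KeyError(f"average mesh {average} is not listed in Table 1")
    return MESH_TO_INCH[int(average)]


print(encode_orientation("Vertical"), mesh_range_to_diameter("20/40"))
```

A mesh range whose average is not listed in Table 1 raises an error instead of being interpolated, since the text does not describe any interpolation.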
Table 2. Statistical Analysis of the Data Utilized for ML Modeling
Pair Plots
A pair plot shows both the distribution of each single variable and the
pairwise relationships between variables, which makes it useful when the
data have more than three dimensions. Figure 2 shows the pair plot
of the overall propped fracture conductivity data. The pair plot was
used to identify redundant variables by examining the
interrelationships among the selected features.
Figure 2. Pair plot of the combined conductivity data set for different shale formations such as Marcellus, Eagle Ford, and combined Fayetteville and Barnett.
Relative Importance
ML models are data-driven, so feature selection, that is, picking the most
informative input parameters, is essential. A common way to do this is to compute the CC between
each input parameter and the target parameter. The value of the CC between
a pair of variables always lies between −1 and 1. A CC
value close to −1 indicates a strong inverse relationship
between the pair of variables, while a value close to 1
indicates a strong direct relationship. A CC
value of zero indicates that no such relationship exists between the two variables.

In this study, Pearson, Spearman, and Kendall CC criteria were
used to estimate the linear relationship between input parameters
and the output parameter. The definitions of the Pearson, Spearman, and
Kendall CC criteria are given as eqs 1–3.

$$\mathrm{CC} = \frac{k\sum xy - \sum x \sum y}{\sqrt{\left[k\sum x^{2} - \left(\sum x\right)^{2}\right]\left[k\sum y^{2} - \left(\sum y\right)^{2}\right]}} \qquad (1)$$

where x is the value of the x-variable,
y is the value of the y-variable,
and k is the total number of samples. The Spearman
correlation between two variables is equal to the Pearson correlation
between the rank values of those two variables; while Pearson’s
correlation assesses linear relationships, Spearman’s correlation
assesses monotonic relationships (whether linear or not). If there
are no repeated data values, a perfect Spearman correlation of +1
or −1 occurs when each of the variables is a perfect monotone
function of the other.

$$\rho = \frac{\mathrm{cov}(x, y)}{\sigma_{x}\,\sigma_{y}} \qquad (2)$$

where cov(x, y) is the covariance of the rank variables
and σx and σy are the standard deviations of the rank variables. A Kendall tau
CC test is a nonparametric hypothesis test for statistical dependence
between a pair of variables. The Kendall tau CC is comparable in purpose
to the Spearman rank CC. While the Spearman rank CC is like the Pearson
CC but computed from ranks, the Kendall tau correlation represents
a probability.

$$\tau = \frac{n_{c} - n_{d}}{\tfrac{1}{2}\,n\,(n-1)} \qquad (3)$$

where nc is the number
of concordant pairs, nd is the
number of discordant pairs, and n is the total
number of samples.

Figure 3 shows the
Pearson, Spearman, and Kendall CC features with the natural log of
propped fracture conductivity. The CC values of all input features
based on Pearson criteria improved significantly by taking the natural
logarithm of propped fracture conductivity. One might notice that
the most significant parameter is the closure stress, which negatively
correlates with the fracture conductivity. It is also confirmed that
the proppant mesh size, proppant load, specimen Estatic, and BI are equally important and contribute positively
to fracture conductivity.
Figure 3. Relative importance of the input parameters with the natural log of propped fracture conductivity.
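The three correlation criteria discussed in this section can be sketched in plain Python. This is a minimal illustration under stated assumptions: the data are synthetic (an exponential decay of conductivity with closure stress, chosen only to show why taking the natural log can strengthen the Pearson CC), ties are handled with average ranks, and the Kendall statistic is the simple tau-a form.

```python
import math


def pearson(x, y):
    """Pearson CC from raw sums."""
    k = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    return (k * sxy - sx * sy) / math.sqrt((k * sxx - sx ** 2) * (k * syy - sy ** 2))


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for m in range(i, j + 1):
            result[order[m]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman CC: Pearson CC computed on the ranks."""
    return pearson(ranks(x), ranks(y))


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / (n(n-1)/2)."""
    n = len(x)
    nc = nd = 0
    for i in range(n):
        for j in range(i + 1, n):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                nc += 1
            elif sign < 0:
                nd += 1
    return (nc - nd) / (0.5 * n * (n - 1))


# Synthetic data (not from the study): conductivity decaying with closure stress.
stress = [500, 1000, 2000, 3000, 4000, 5000, 6000]
cond = [1000.0 * math.exp(-0.0008 * s) for s in stress]
log_cond = [math.log(c) for c in cond]

print("Pearson, raw:", round(pearson(stress, cond), 3))
print("Pearson, ln :", round(pearson(stress, log_cond), 3))
print("Spearman    :", round(spearman(stress, cond), 3))
print("Kendall     :", round(kendall_tau(stress, cond), 3))
```

On this synthetic decay the rank-based criteria are unaffected by the log transform, while the Pearson CC moves closer to −1 once the relationship is linearized.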
Data Stratification
The data set was divided randomly
into two parts: 70% of the data set was utilized for training the
model, while the remaining 30% was kept for testing the model. Data
were stratified randomly by Python. This splitting process ensured
that the testing data fell within the range of the training data.
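A random 70/30 split whose testing rows stay inside the training range could look like the sketch below. The text only states that the split was random and done in Python; the redraw-until-valid strategy, the function name, and the synthetic rows are assumptions for illustration.

```python
import random


def split_within_range(rows, test_fraction=0.30, seed=0, max_tries=1000):
    """Randomly split rows; redraw until every test value lies in the training min-max range."""
    rng = random.Random(seed)
    n_test = round(len(rows) * test_fraction)
    indices = list(range(len(rows)))
    for _ in range(max_tries):
        rng.shuffle(indices)
        test = [rows[i] for i in indices[:n_test]]
        train = [rows[i] for i in indices[n_test:]]
        lows = [min(column) for column in zip(*train)]
        highs = [max(column) for column in zip(*train)]
        if all(lo <= v <= hi for row in test for v, lo, hi in zip(row, lows, highs)):
            return train, test
    raise RuntimeError("no split found with the test set inside the training range")


# Synthetic rows of (closure stress in psi, proppant load); not the study's data.
rows = [(s, load)
        for s in (500, 1000, 2000, 3000, 4000, 5000, 6000)
        for load in (0.01, 0.025, 0.05, 0.075)]
train, test = split_within_range(rows)
print(len(train), len(test))
```

Keeping the test rows inside the training range means the model is only evaluated on interpolation, which matches the stated intent of the split.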
FN Model
The FN technique is an advanced yet simple
extension of the conventional neural network technique.[29] The FN is a supervised data learning technique mostly used
for function approximation and regression purposes.[30] The FN algorithm learns from both domain
and data knowledge. A typical FN model is comprised of three units:
an input storing unit, an output storing unit, and several layers
of processing units. Unlike neural networks, the algorithm itself
determines the network topology, structures, and unknown neural functions.
In the conventional neural network model, the number of neurons is
fixed and usually user-defined. The weights and biases (connections
between the neurons) of the associated neurons are learned and estimated
during the training phase.[31] In contrast,
in FN models, the number of neural functions is adaptive and varied
during the structural learning and parametric learning phases.
FL Model
FL is a computing approach based on degrees
of truth rather than conventional sharp cutoffs. In contrast to
Boolean logic, in which the truth value of a variable is either 0
or 1, a truth value in FL can be any real number between 0 and 1. Fuzzy sets
are more realistic and mimic human reasoning in making decisions based
on ambiguous or non-numeric data. FL is composed of four components,
namely, rule base, fuzzification, inferential engine, and defuzzification.
The rule base governs the decision making using a set of rules and
conditions. Fuzzification turns the exact inputs known as crisp inputs
into fuzzy sets, which are fed into the inferential engine. The latter
evaluates the degree of matching between each rule and the fuzzy input
to determine which rules should be fired. Finally, the defuzzification
turns the fuzzy sets back into a crisp value.
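The four FL components can be illustrated with a small first-order Sugeno-type system for one normalized input. This is only a sketch: the Gaussian membership functions and linear rule outputs mirror the kind of system the study reports, but the two rules, their parameters, and the function names are hypothetical.

```python
import math


def gaussian_mf(x, center, sigma):
    """Fuzzification: degree (0 to 1) to which x belongs to a Gaussian fuzzy set."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def sugeno_predict(x, rules):
    """Inference and defuzzification: firing-strength-weighted average of linear rule outputs."""
    strengths = [gaussian_mf(x, r["center"], r["sigma"]) for r in rules]
    outputs = [r["slope"] * x + r["intercept"] for r in rules]
    total = sum(strengths)
    if total == 0.0:
        raise ValueError("no rule fired for this input")
    return sum(s * o for s, o in zip(strengths, outputs)) / total


# Hypothetical rule base on normalized closure stress (0 = low, 1 = high).
RULES = [
    {"center": 0.0, "sigma": 0.4, "slope": -1.0, "intercept": 8.0},
    {"center": 1.0, "sigma": 0.4, "slope": -3.0, "intercept": 9.0},
]

for stress_n in (0.0, 0.5, 1.0):
    print(stress_n, round(sugeno_predict(stress_n, RULES), 3))
```

Each rule contributes in proportion to how well the input matches its fuzzy set, so the prediction moves smoothly from one linear regime to the other instead of switching at a sharp cutoff.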
ANN Model
An ANN is a computing system that mimics the
biological neural network of an animal brain. In this study, comprehensive
numerical experimentation was performed to arrive at the optimum ANN
model. The number of hidden layers, the number of neurons
in each hidden layer, the transfer function, the learning algorithm,
and the learning rate were selected based on the accuracy of the
predicted values.
Results and Discussion
In this study,
two scenarios were examined to train the ML models.
The first scenario was to combine all the data sets of different formations
and develop a single generalized model. The second scenario was to
create an ML model for each formation separately.

The FN model
was used on a combined data set to develop a generalized
model. On the overall data set, FN predicted the propped fracture
conductivity with R2 of 0.853 on training
and R2 of 0.817 on testing. The root mean
square error (RMSE) of the natural logarithm of conductivity was 0.051
on training and 0.104 on testing. The average absolute percentage
error (AAPE) was relatively high, about 14.811% on training and 17.444%
on testing. The training and testing cross plots are shown in Figure 4.
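The three accuracy measures quoted for each model can be computed as sketched below. The formulas are the common textbook definitions; treating R2 as the coefficient of determination (rather than a squared correlation coefficient) is an assumption, and the sample values are made up.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def aape(actual, predicted):
    """Average absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values standing in for measured and predicted ln(conductivity).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmse(actual, predicted), aape(actual, predicted), r_squared(actual, predicted))
```

Because the AAPE divides by the measured value, it is sensitive to samples whose target is close to zero, which is worth keeping in mind when it is reported alongside the RMSE.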
Figure 4. Training and testing cross plots of the FN model on the overall data set.

A type of FL named adaptive network-based
fuzzy inference system (ANFIS)
was employed to optimize the fuzzy inference system. The latter was
generated with genfis2 using subtractive
clustering. In genfis2, the input and output membership functions
are fixed to Gaussian and linear, respectively. The number
of membership functions was two, and the cluster radius was set
to 0.8. The FL models trained on the combined data set and on
each separate shale formation data set are summarized in Table 3.
Table 3. FL Model Accuracy Comparison

| formations | AAPE (training) | AAPE (testing) | RMSE (training) | RMSE (testing) | R2 (training) | R2 (testing) |
|---|---|---|---|---|---|---|
| overall | 16.0861 | 11.6894 | 0.6418 | 0.6425 | 0.8341 | 0.8376 |
| Marcellus | 35.6522 | 31.2780 | 0.9418 | 0.8296 | 0.6882 | 0.7314 |
| Eagle Ford | 4.0536 | 3.8391 | 0.3118 | 0.3031 | 0.9153 | 0.9299 |
| Barnett/Fayetteville | 10.9987 | 8.7878 | 0.5296 | 0.4761 | 0.9180 | 0.9297 |
The trained model on the whole data set could predict
propped fracture
conductivity with R2 of 0.83. The RMSE
was 0.64, while the AAPE was 16.08% on training and 11.69% on testing.
The training and testing cross plots are shown in Figure 5.
Figure 5. Training and testing cross plots of the FL model on the overall data set.

On the overall data set, ANN was
able to predict propped fracture
conductivity with an R2 of 0.895 on training
and 0.892 on testing. The RMSE of the natural logarithm of conductivity
was 0.030 on training and 0.057 on testing. The AAPE was quite high,
about 11.860% on training and 7.869% on testing. The training and
testing cross plots are shown in Figure 6. The hyperparameters of the proposed ANN
model are listed in Table 4.
Figure 6. Training and testing cross plots of the ANN model on the overall data set.

Table 4. Optimum Values for the Proposed ANN Model for Propped Fracture Conductivity on the Overall Data Set

| ANN parameters | optimum values |
|---|---|
| inputs | 6 |
| middle layer | 1 |
| neurons in the middle layer | 10 |
| training algorithm | Levenberg–Marquardt (LM) |
| learning rate, α | 0.20 |
| activation function of the middle layer | tangent sigmoidal |
| activation function of the outer layer | pure linear |

ANN performed
better than FN and FL. The FN and FL models resulted
in lower R2, higher AAPE, and higher RMSE
during both training and testing compared to the ANN model.

For the second scenario, the whole data set was segregated based
on the formation type. An ANN model was trained separately for Marcellus,
Eagle Ford, and combined Barnett and Fayetteville formations. The
prediction performances of the ANN models for each formation are shown
on training and testing cross plots in Figures 7–9. Table 5 shows
the comparison of the ANN model accuracy for the two scenarios.
Figure 7. Training and testing cross plots of the ANN model on the Marcellus shale data set.

Figure 8. Training and testing cross plots of the ANN model on the Eagle Ford shale data set.

Figure 9. Training and testing cross plots of the ANN model on the combined Barnett shale and Fayetteville data set.

Table 5. ANN Model Accuracy Comparison

| formations | AAPE (training) | AAPE (testing) | RMSE (training) | RMSE (testing) | R2 (training) | R2 (testing) |
|---|---|---|---|---|---|---|
| overall | 11.29 | 5.912 | 0.035 | 0.04 | 0.886 | 0.929 |
| Marcellus | 24.694 | 16.444 | 0.078 | 0.142 | 0.816 | 0.804 |
| Eagle Ford | 1.471 | 2.223 | 0.016 | 0.029 | 0.978 | 0.978 |
| Barnett/Fayetteville | 3.491 | 1.51 | 0.026 | 0.024 | 0.989 | 0.996 |

In this
study, in addition to the trained ANN model, explicit empirical
correlations to predict propped fracture conductivity of the shale
formations were proposed. The normalized form of the proposed fracture
conductivity is given by eq 4, where Y = w1 O + w1 Mesh + w1 PL + w1 σc + w1 Estatic + w1 υstatic + w1 BI, with each w1 denoting the hidden-layer weight applied to the corresponding input. The w1, w2, b1,
and b2 are the weights and biases (see Table 6) of the propped fracture
conductivity model. O, Mesh, PL, σc, Estatic, υstatic, and BI are the normalized values of fracture orientation,
mesh size, proppant load, closure stress, Estatic, PR, and BI. These values can be determined using eqs 5–11.
Table 6. Weights and Biases of the Optimized ANN Model on the Overall Data Set, Marcellus Shale Data Set, Eagle Ford Shale Data Set, and Combined Barnett and Fayetteville Shale Data Set
The proposed equations
to predict propped fracture conductivity W on the overall data set, Marcellus shale,
Eagle Ford shale, and Barnett shale are given as eqs 12–15, where W for each formation
can be calculated using eq 4. The normalized parameters in eq 4 can be calculated using eqs 5–11. The minimum and
maximum values of each input parameter are reported in Table 2.
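How an explicit correlation of this kind is evaluated can be sketched as a forward pass through a network with one tangent-sigmoidal hidden layer and a linear output. Several points are assumptions rather than the study's equations: the min-max scaling to [−1, 1], the placement of the bias b1 inside the tangent sigmoid, the function names, and all weights, which are placeholders and not the values of Table 6.

```python
import math


def normalize(value, vmin, vmax):
    """Min-max scaling to [-1, 1] (assumed form of the normalization)."""
    return 2.0 * (value - vmin) / (vmax - vmin) - 1.0


def tansig(y):
    """Tangent sigmoidal transfer function, mathematically equal to tanh(y)."""
    return 2.0 / (1.0 + math.exp(-2.0 * y)) - 1.0


def ann_forward(inputs_n, w1, b1, w2, b2):
    """Normalized output of a one-hidden-layer network with a pure linear output layer."""
    output = b2
    for w1_row, b1_i, w2_i in zip(w1, b1, w2):
        y_i = sum(w * x for w, x in zip(w1_row, inputs_n)) + b1_i
        output += w2_i * tansig(y_i)
    return output


# Placeholder network: 7 normalized inputs, 2 hidden neurons, made-up weights.
inputs_n = [1.0, 0.2, -0.5, normalize(3000.0, 500.0, 6000.0), 0.1, -0.3, 0.4]
w1 = [[0.3, 0.5, 0.8, -1.2, 0.4, -0.1, 0.6],
      [-0.2, 0.1, 0.4, -0.9, 0.7, 0.2, 0.3]]
b1 = [0.05, -0.10]
w2 = [0.9, 0.6]
b2 = 0.2
print(ann_forward(inputs_n, w1, b1, w2, b2))
```

The network output is still in normalized form; converting it back to a conductivity value requires the inverse of whatever scaling was applied to the target during training.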
Sensitivity Analysis
A sensitivity analysis was carried
out to validate the proposed empirical correlation for predicting the
propped fracture conductivity on the overall data set. In the sensitivity
analysis, one variable was changed while the other variables were kept
constant at their mean values, as listed in Table 2. Figure 10 shows the
sensitivity of the predicted propped fracture conductivity to the closure stress (σc)
at proppant loads of 0.01, 0.025, 0.05, and 0.075 for
horizontal and vertical fracture orientations. The σc was changed from its minimum value of 500 psi to a maximum value
of 6000 psi. These minimum and maximum values were those on which
the model was trained and are listed in Table 2. As evident from the sensitivity analysis
results, the increase in the proppant load increases the propped fracture
conductivity. The results obtained agree well with the
experimental observations.
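The one-variable-at-a-time procedure described above can be sketched as follows. The sweep logic follows the text (closure stress from 500 to 6000 psi at several proppant loads, other inputs held fixed); the `toy_model` function and the baseline values are hypothetical stand-ins for the trained correlation and the mean values of the data set.

```python
import math


def sweep(model, baseline, name, values):
    """Vary one input over `values` while all other inputs stay at their baseline."""
    results = []
    for value in values:
        point = dict(baseline)  # copy so the baseline is never modified
        point[name] = value
        results.append((value, model(point)))
    return results


def toy_model(p):
    """Hypothetical stand-in for the trained correlation (not the study's model)."""
    return 1.0e5 * p["proppant_load"] * math.exp(-0.0005 * p["closure_stress"])


# Hypothetical baseline; the study holds the other inputs at their mean values.
baseline = {"closure_stress": 3000.0, "proppant_load": 0.05}
stresses = list(range(500, 6001, 500))

for load in (0.01, 0.025, 0.05, 0.075):
    curve = sweep(toy_model, dict(baseline, proppant_load=load), "closure_stress", stresses)
    print(load, [round(c, 1) for _, c in curve[:3]])
```

Sweeping only within the trained range (500 to 6000 psi here) keeps the check on interpolation, where a data-driven model can reasonably be trusted.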
Figure 10. Sensitivity analysis of propped fracture conductivity with closure stress at different proppant loads for both horizontal and vertical fracture orientations.

Figure 11 shows
the sensitivity of the predicted propped fracture conductivity
to the closure stress (σc) at different Estatic values of shales such as 20, 25, and
30 GPa at horizontal and vertical fracture orientations. The closure
stress was changed from its minimum value of 500 psi to a maximum
value of 6000 psi. An increase in the Estatic value of the shale increases the overall propped fracture conductivity.
This indicates that the developed empirical correlation is capable of
predicting the propped fracture conductivity behavior within
the range on which the model was trained, as listed in Table 2.
Figure 11. Sensitivity analysis of propped fracture conductivity with closure stress at different Estatic values of the rock for both horizontal and vertical fracture orientations.

Similarly, the impact of the proppant mesh size on the
fracture conductivity
for both vertical and horizontal orientations is shown in Figure 12. The model predicts
that increasing the proppant size results in higher fracture
conductivity. This behavior is well reported in the literature, which
shows that larger proppant particles provide higher proppant
permeability. The model reproduces this behavior within
the range of data in Table 2.
Figure 12. Sensitivity analysis of propped fracture conductivity with closure stress at different mesh sizes for both horizontal and vertical fracture orientations.

Several BIs that help to characterize
unconventional resources have been defined in the literature. The most common ones are based on elastic
properties, mineralogical composition, or strength parameters. The
mathematical descriptions of the different BIs are expressed as eqs 16–18. All these BIs are equivalent and yield
high values for quartz-rich rocks.[32]

In this work, the BI defined by elastic properties was used in
developing the different ML models. The other two methods could not be
implemented as the mineralogical composition and strength parameters
were not available for each sample. Figure 13 shows that the higher the rock elastic
BI values, the higher the conductivity. Proppant embedment is lower
in brittle formations, and hence a wider fracture width is achieved.
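For readers unfamiliar with elastic brittleness indices, one widely cited definition averages a normalized Young's modulus and a normalized Poisson's ratio. The sketch below uses that definition with its commonly quoted normalization bounds (1 to 8 Mpsi and 0.15 to 0.40); whether the study used exactly this form and these bounds is an assumption, and the function name is illustrative.

```python
def elastic_brittleness_index(e_static_mpsi, poisson_ratio,
                              e_min=1.0, e_max=8.0, pr_min=0.15, pr_max=0.40):
    """Elastic BI (%) as the mean of normalized Young's modulus and Poisson's ratio.

    The bounds are commonly quoted defaults and are assumptions here.
    """
    e_brit = (e_static_mpsi - e_min) / (e_max - e_min) * 100.0
    # Lower Poisson's ratio means more brittle, so the scale is inverted.
    pr_brit = (poisson_ratio - pr_max) / (pr_min - pr_max) * 100.0
    return 0.5 * (e_brit + pr_brit)


# Illustrative rock: 25 GPa converted to Mpsi (1 GPa is about 0.145 Mpsi), PR = 0.25.
print(elastic_brittleness_index(25.0 * 0.145, 0.25))
```

Under this definition a stiffer rock or a lower Poisson's ratio raises the BI, which is consistent with the trend described above of higher conductivity at higher elastic BI.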
Figure 13. Sensitivity analysis of propped fracture conductivity with closure stress at different BIs for both horizontal and vertical fracture orientations.
Conclusions
This study provides a simple and accurate propped fracture conductivity
correlation based on several ML methods, such as ANN, FL, and FN.
More than 350 data points were collected from API conductivity experiments
in four shale formations: Marcellus, Barnett, Fayetteville, and Eagle
Ford. Based on the analysis of the collected data, the following can
be concluded:

- Pearson, Spearman, and Kendall CC criteria show that closure stress is the most important feature for predicting propped fracture conductivity. The proppant load, size, and formation elastic properties (i.e., Estatic, PR, and BI) are also significant.
- The ANN model provided the best result compared to the FL and FN models. The R2 of the general conductivity model was 0.89 and 0.93 for the training and testing data sets, respectively.
- The formation-specific ANN models were more accurate than the general correlation except for Marcellus shale. Hence, it is advised to use the specific correlations for Barnett, Fayetteville, and Eagle Ford and the general correlation for Marcellus.
- The provided correlations are easy to use and require simple input parameters such as fracture orientation, closure stress, proppant mesh size, proppant load, Estatic, PR, and BI.
- The provided correlations are data-driven and should be used only in the specified ranges.