The maximum (Shmax) and minimum (Shmin) horizontal stresses are essential parameters for well planning and hydraulic fracturing design. These stresses can be accurately measured using field tests such as the leak-off test, step-rate test, and so forth, or approximated using physics-based equations. These equations require measuring some in situ geomechanical parameters such as the static Poisson’s ratio and static elastic modulus via experimental tests on retrieved core samples. However, such measurements are not usually accessible for all drilled wells. In addition, the recently proposed machine learning (ML) models are based on expensive and destructive tests. Therefore, this study aims at developing a new approach to predict the least principal stresses in a time- and cost-effective way. New models have been developed using ML approaches, that is, artificial neural network (ANN) and support vector machine (SVM), to predict the Shmin and Shmax gradients (outputs) from well-log data (inputs). A wide-ranging set of actual field data was collected and extensively analyzed before being fed to the algorithms to train the models. The developed ANN-based models outperformed the SVM-based ones with a mean absolute percentage error (MAPE) not exceeding 0.30% between the actual and predicted output values. Besides, new equations have been developed to mimic the processing of the optimized networks. The new empirical equations were verified with another unseen data set, resulting in a remarkably close match to the actual stress-gradient values, confirmed by a prediction accuracy exceeding 90% in addition to a MAPE of 0.43%. The results’ statistics confirmed the robustness of the developed equations to predict the Shmin and Shmax gradients with a high degree of accuracy whenever the logging data are available.
The downhole formation stresses are key factors in different operations in the petroleum industry. How the stresses are concentrated in the vicinity of the wellbore directly affects the drilling operation since it controls the wellbore integrity and hence may cause many drilling-related incidents, that is, stuck bottom-hole assembly, pack-off, and lost circulation.[1] The availability of formation
stress data that describe the wellbore stress-state condition would
contribute to providing viable solutions to many integrity-related
wellbore problems that may be encountered during drilling. These solutions
include determining the optimum mud weight, defining the safe drilling
window, specifying stable trajectories, determining casing setting
depths, and so forth.[2] Furthermore, defining the
downhole stress condition or distribution is considered the cornerstone
for developing a representative geomechanical model of subsurface
formations whereby a broad suite of problems along different stages
of the reservoir life could be addressed and resolved.[3−7]

With a simplifying assumption, three mutually orthogonal principal
stress components can represent the downhole stress state, that is,
the overburden stress (Sv) and the least
principal stresses: the maximum (Shmax) and minimum (Shmin) horizontal stresses. Since the vertical stress (Sv) results from the compressive load of the overburden formations, it can be estimated from the overburden formation-bulk-density log.[8]

There are two types of techniques, that is, direct and indirect
methods, to determine the least principal stresses. The direct method
comprises the direct measurement of the stress state by conducting in situ field tests such as the leak-off test, mini-frac
test, step-rate test, and so forth.[2,9,10] Shmax cannot be directly measured using
these methods;[11] hence, theoretical (empirical)
correlations are developed to estimate Shmax depending
on the values of Sv and Shmin.[12,13] The main challenges of this method are being
time-consuming, expensive, and usually unavailable for most of the
wells. Making matters even more challenging, such tests are typically applied at specific depths, which means no continuous profile of these stresses would be available based solely on these direct tests.

On the other hand, the indirect methods
involve the determination
of the least principal stresses using the well-log data. Different
physics-informed theoretical models, that is, uniaxial strain theory
and poroelastic strain models, were developed to determine the downhole
formation stresses.[14−17] These models are based on lab measurements of some in situ geomechanical parameters, that is, static elastic moduli, strains,
and static Poisson’s ratio. These parameters can be accurately measured through lab tests (e.g., triaxial tests) conducted on cores retrieved from the downhole formations.[17]

Thereafter, the measured parameters would be presented in a continuous
profile form after correlating them to the conventional logging data.
Besides, there is still a need for at least one direct field test,
that is, the leak-off test, to incorporate the effect of tectonic
stresses on the generated profiles.[18−20] However, one main drawback
of this technique is the high cost of retrieving such core samples
to be subjected to such lab measurements. This, in turn, limits the
accessibility of this kind of information for most of the drilled
wells. Some recent studies introduced the application of machine learning
(ML) to estimate the downhole principal stresses using the breakout
data.[21,22] The breakout geometries can be derived from
the analysis of image logs.[23] However,
borehole breakouts are considered destructive techniques that are
based on failure models.[24] Besides, most
drilled wells lack such data due to the high cost and time consumption
of running these special logs. Accordingly, based on the literature,
direct nondestructive techniques to determine the formation stresses
are yet to be researched. Another approach was introduced by AlTammar and Alruwaili[25] to estimate Shmin and Shmax based on the caliper log data; however, an uncertainty analysis has to be incorporated into the model for geomechanical properties that are not readily available.

Therefore, a project was initiated
to investigate the feasibility
of ML to estimate the formation stresses using the available and easy-to-get
data such as mechanical data and logging data. The results of the
first phase demonstrated the ability of ML-based models to predict
the in situ stresses using the mechanical drilling
data.[26] The second phase of the project,
which is the subject of this paper, investigated the application of
ML to predict the least principal stresses using logging data in a
white-box version.

Therefore, this study aims at developing
a new, robust tool that
can estimate the gradients of the least principal stresses, Shmin and Shmax, from the conventional well-log data
by deploying ML approaches: artificial neural network (ANN) and support
vector machine (SVM). The ML approaches have been selected due to
the recent high computational capabilities of computers and the outstanding
performance of such approaches to mimic and solve highly complex problems.
Recently, different ML approaches have been successfully applied in
the field of petroleum-related geomechanics such as predicting unconfined
compressive strength,[27,28] elastic parameters,[29,30] and wellbore failures.[31]

The novelty
of this study was extended to develop state-of-the-art
equations to estimate Shmin and Shmax directly
from the logging data. These equations, with a detailed procedure
for application, introduce the developed ML models in a white-box
version to allow the reproducibility of the results, unlike the usual
black-box nature of the ML models.
Data Analysis
This section describes the data set used for this study with summarized
insights on the data preprocessing applied before proceeding with
the model development.
Data Description
Field measurements (2385 data points) were collected from two wells in a Middle East field
representing a complex carbonate reservoir. These data include well-logging
records and in situ maximum and minimum horizontal
stresses, Shmax and Shmin. The logging data
comprise gamma-ray (GR) log, formation bulk density (RHOB) log, compressional
(DTC) and shear (DTS) wave transit-time log, neutron porosity (Phi)
log, dynamic Poisson’s ratio (PRd), and dynamic
elastic modulus (Ed). The data collected
from well-A have been used for training and testing the models, while
the data gathered from well-B were directed to validate the developed
models and verify their performance.
Data
Acquisition
The stress magnitude
can be estimated either by employing field tests or using developed
theoretical-based equations. The equations developed based on the
poroelastic model are considered the most common and applicable method
to estimate the stress profile at the desired depth of the drilled
wells.[6,11,32] Blanton and Olson[16] were the first to introduce anisotropy in the in situ horizontal stress equation for different lithologies. Their model considers the effect of the tectonic stresses by introducing the tectonic strains into the equation. Accordingly, the least principal stresses can be estimated using eqs 1 and 2,[16,32] where PRs is the static Poisson’s ratio, Sv is the vertical stress component, α is Biot’s elastic coefficient, and ε_x and ε_y are the elastic strains in the Shmin and Shmax directions, respectively.

First, the vertical stress Sv was estimated from the RHOB log by integrating the formation density from the surface to the depth of interest using eq 3, where ρ(z) is the formation density at a certain depth z and g is the gravitational acceleration.

Then, the dynamic Poisson’s ratio (PRd) and dynamic elastic modulus (Ed) were estimated based on the acoustic and RHOB logs using the formulas listed in Appendix A. The calculated Ed and PRd were then correlated with Es and PRs obtained from the experimental tests conducted on the samples cored from the downhole formations.

To determine the elastic strain values ε_x and ε_y, an equal-strain assumption was initially considered for both directions before estimating Shmin using eq 1. A field test was then used to calibrate Shmin for tectonic effects. In the case of not achieving an accurate match with the measured value, Shmin was recalculated using different strain-ratio values (ε_x/ε_y). This step was repeated iteratively until an acceptable convergence of the Shmin values and an accurate match were achieved. Finally, the Shmin and Shmax profiles were estimated and considered as the outputs for the proposed ML models.
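For reference, the poroelastic strain model referenced above as eqs 1 and 2, together with the overburden integral of eq 3, is commonly published in the following form. This is a reconstruction consistent with the symbol definitions above; the pore-pressure term Pp belongs to the standard Biot effective-stress formulation even though it is not listed among the symbols here:

```latex
S_{hmin} = \frac{PR_s}{1 - PR_s}\,(S_v - \alpha P_p) + \alpha P_p
         + \frac{E_s}{1 - PR_s^{2}}\,(\varepsilon_x + PR_s\,\varepsilon_y) \qquad (1)

S_{hmax} = \frac{PR_s}{1 - PR_s}\,(S_v - \alpha P_p) + \alpha P_p
         + \frac{E_s}{1 - PR_s^{2}}\,(\varepsilon_y + PR_s\,\varepsilon_x) \qquad (2)

S_v = \int_{0}^{z} \rho(z)\, g \,\mathrm{d}z \qquad (3)
```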
Statistical Descriptive Analysis
The obtained data in this study were statistically analyzed and described by deploying different statistical measures, as listed in Table 1. This helps provide a better understanding of the data and their distribution. The descriptive measures listed indicated that both the logging and stress data cover a wide range with a representative distribution and hence give more confidence to capture the nature of the problem. The data ranges can be summarized as follows: GR: 3.34–90.47 API unit, DTC: 44.82–66.12 μs/ft, DTS: 81.28–132.47 μs/ft, RHOB: 2.32–3.04 g/cm3, Phi: 0.28–0.32 fraction, Ed: 5.70–14.79 Mpsi, PRd: 0.28–0.33 fraction, Shmin: 11 292.34–12 361.17 psi, and Shmax: 12 308.02–14 599.00 psi.
Table 1. Descriptive Statistical Summary of the Data Set Used in This Study

parameter       | minimum  | maximum  | mean     | std    | skewness
GR (API unit)   | 3.34     | 90.47    | 29.56    | 14.25  | 0.64
DTC (μs/ft)     | 44.82    | 66.12    | 48.43    | 2.89   | 2.66
DTS (μs/ft)     | 81.28    | 132.47   | 89.97    | 6.94   | 2.66
RHOB (g/cm3)    | 2.32     | 3.04     | 2.82     | 0.11   | –0.92
Phi (fraction)  | 0.28     | 0.32     | 0.29     | 0.01   | 1.09
Ed (Mpsi)       | 5.70     | 14.79    | 12.37    | 1.57   | –1.34
PRd (fraction)  | 0.28     | 0.33     | 0.30     | 0.01   | 1.55
Shmin (psi)     | 11292.34 | 12361.17 | 11886.61 | 274.67 | –0.07
Shmax (psi)     | 12308.02 | 14599.00 | 13778.29 | 450.26 | –0.50
Data Preprocessing
Data preprocessing is an essential step for developing ML-based models since the quality of the data has a considerable impact on the ability of the model to learn and give accurate predictions.[33] Therefore, the obtained data were initially preprocessed before being fed to the proposed models.[34] The data set was first cleaned of missing data, redundant or duplicated information, and contextual errors such as negative values and unreasonable values that do not make sense from an engineering point of view. Then, a MATLAB code was specially designed to detect and eliminate the outliers using several techniques, that is, quartiles, and so forth.
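As an illustrative sketch (in Python rather than the authors' MATLAB code, with the conventional 1.5×IQR fences assumed, since the exact quartile rule is not stated), the quartile-based outlier removal can look like this:

```python
import numpy as np

def remove_outliers_iqr(data, k=1.5):
    """Drop rows where any column falls outside the quartile fences.

    data : 2D array (rows = depth samples, columns = log curves).
    k    : fence multiplier (1.5 is the conventional Tukey choice).
    """
    q1 = np.percentile(data, 25, axis=0)
    q3 = np.percentile(data, 75, axis=0)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # keep a row only if every column lies inside its fences
    mask = np.all((data >= lower) & (data <= upper), axis=1)
    return data[mask]

# Example with hypothetical GR/DTC readings: one bad GR value is removed.
logs = np.array([[30.0, 48.0], [32.0, 49.0], [31.0, 47.5],
                 [29.0, 48.5], [500.0, 48.0]])   # last row is an outlier
clean = remove_outliers_iqr(logs)
```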
Dimensionality Reduction
Dimensionality reduction refers to the process of reducing the number of input features, that is, the logging data, to obtain a set of principal features. Accordingly, the redundant and irrelevant input information was identified by studying the collinearity between the input features and then removed. First, the input data were normalized between 0 and 1 for better representation. Then, the correlation coefficient (R-value) was calculated between the inputs to evaluate how strongly each input linearly correlates with the others (Table 2). In the case of having two or more features with an R-value of more than 0.95, only one of them would be considered, and the others would be excluded. Therefore, only GR, RHOB, and DTC were selected to feed the proposed models after excluding the others that have almost the same distribution as the selected ones (Figure 1).
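The collinearity screening described above can be sketched as follows. This is an illustrative Python version with synthetic data (not the field set), using the 0.95 R-value cutoff from the text:

```python
import numpy as np

def drop_collinear(names, X, threshold=0.95):
    """Normalize features to [0, 1], then keep only one representative of
    any group whose pairwise |R| exceeds the threshold (0.95 here)."""
    Xn = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    R = np.corrcoef(Xn, rowvar=False)
    keep = []
    for j in range(len(names)):
        # keep feature j only if it is not highly correlated
        # with an already-kept feature
        if all(abs(R[j, k]) <= threshold for k in keep):
            keep.append(j)
    return [names[j] for j in keep]

# Toy example: "DTS" is an affine copy of "DTC", so only one survives.
rng = np.random.default_rng(0)
dtc = rng.uniform(44.8, 66.1, 200)
dts = 2.0 * dtc + 1.0            # perfectly collinear with DTC
gr = rng.uniform(3.3, 90.5, 200)
kept = drop_collinear(["DTC", "DTS", "GR"], np.column_stack([dtc, dts, gr]))
```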
Table 2. Correlation Coefficient Analysis among the Input Features (Logging Data)

parameter | GR    | DTC   | DTS   | RHOB  | Phi   | Ed    | PRd
GR        | 1.00  |       |       |       |       |       |
DTC       | –0.45 | 1.00  |       |       |       |       |
DTS       | –0.45 | 1.00  | 1.00  |       |       |       |
RHOB      | –0.21 | –0.35 | –0.35 | 1.00  |       |       |
Phi       | –0.49 | 0.95  | 0.95  | –0.27 | 1.00  |       |
Ed        | 0.38  | –0.95 | –0.95 | 0.51  | –0.96 | 1.00  |
PRd       | –0.48 | 0.98  | 0.98  | –0.31 | 0.99  | –0.96 | 1.00
Figure 1
Graphic display
of the distribution of the normalized logging data
where the y-axis represents the normalized data values
and the x-axis represents the data index.
Moreover, taking the square root (Sqrt) of the GR values reduced their skewness from 0.63 to −0.07, approaching zero, which indicates a distribution closer to normal. Therefore, Sqrt(GR) values were considered instead of GR values as an input feature.
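The effect of the square-root transformation on skewness can be reproduced on synthetic data. This Python sketch uses a gamma-distributed, GR-like sample (an assumption for illustration, not the field data) and a plain third-moment skewness estimate:

```python
import numpy as np

def sample_skewness(x):
    """Third standardized moment of a 1D sample."""
    x = np.asarray(x, dtype=float)
    z = (x - x.mean()) / x.std()
    return float(np.mean(z ** 3))

# Synthetic right-skewed sample: the square-root transform pulls the
# skewness toward zero, mirroring the reported drop from 0.63 to -0.07
# for the actual GR log.
rng = np.random.default_rng(42)
gr = rng.gamma(shape=2.0, scale=15.0, size=2385)  # right-skewed, API-like

skew_raw = sample_skewness(gr)
skew_sqrt = sample_skewness(np.sqrt(gr))
```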
Correlation Analytics
Pearson’s
correlation coefficient was used to investigate the relative importance of each input feature to the outputs. This correlation helps identify to what extent the output depends on each input feature.[35] The R-value between Shmin
to enhance this value, the formation depth was integrated into the
stress profile to express it as a stress gradient profile instead.
Studying the correlation between each feature and the Shmin gradient, Figure 2a shows a significant increase in the R-value from 0.29, −0.27, and 0.12 to −0.53, 0.60, and 0.21 for Sqrt(GR), DTC, and RHOB, respectively, compared to the initial case with the Shmin values.
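A minimal sketch of why the gradient form correlates better: dividing the stress by depth (an assumed definition of the gradient) removes the depth trend that dilutes the correlation with a depth-independent log response. The data here are synthetic, purely for illustration:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson's correlation coefficient between two 1D arrays."""
    x, y = np.asarray(x), np.asarray(y)
    xd, yd = x - x.mean(), y - y.mean()
    return float(xd @ yd / np.sqrt((xd @ xd) * (yd @ yd)))

rng = np.random.default_rng(1)
depth = np.linspace(9000.0, 11000.0, 500)          # ft, assumed range
grad = 0.55 + 0.02 * rng.standard_normal(500)      # stress gradient, psi/ft
shmin = grad * depth                               # absolute stress, psi
log_feature = grad + 0.005 * rng.standard_normal(500)  # gradient-driven log

r_abs = pearson_r(log_feature, shmin)          # diluted by the depth trend
r_grad = pearson_r(log_feature, shmin / depth) # depth trend removed
```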
Figure 2
Correlation coefficient between (a) Shmin and Shmin-gradient and (b) Shmax and Shmax-gradient,
with each input feature [Sqrt(GR), DTC, and RHOB].
Similarly, the Shmax gradient was found to have
a relatively
higher R-value with the input features compared to the Shmax R-values, as shown in Figure 2b. Therefore, the Shmin and Shmax gradients were considered the proposed models’ outputs instead of the absolute Shmin and Shmax values. The formula used to calculate Pearson’s correlation coefficient is presented in Appendix A.
Model Development
The proposed models were
then developed using the preprocessed
data set by employing ANN and SVM techniques to predict the Shmin and Shmax gradients based on the selected conventional
logging data: GR, DTC, and RHOB.
Artificial
Neural Network
ANN is a supervised-learning technique that has recently become well-known for its high capability of modeling several engineering problems with a high degree of complexity. The basic architecture of a neural network typically
consists of three types of layers: the input layer, hidden layer(s),
and output layer.[41] The input features
are assigned to the input layer that has weighted connections with
the hidden layer(s). The neurons in the hidden layer process the input
data before being transferred through the network connections to the
output layer to ultimately produce the output in the output layer.[36] The optimization process of the network aims
at tuning the weights of the network connections as well as the biases
to yield the lowest possible error for a given network configuration.[37,38]
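A minimal sketch of such a single-hidden-layer network, using scikit-learn and synthetic data for illustration (the study itself used MATLAB; tanh stands in for tansig, and lbfgs for the Levenberg–Marquardt trainer, which scikit-learn does not provide):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(500, 3))            # normalized GR, DTC, RHOB
y = 0.3 * X[:, 0] - 0.5 * X[:, 1] + 0.2 * X[:, 2]   # synthetic target

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# One hidden layer with a tanh (tansig-like) activation; the weights and
# biases are tuned to minimize the error, as described in the text.
ann = MLPRegressor(hidden_layer_sizes=(30,), activation="tanh",
                   solver="lbfgs", max_iter=2000, random_state=0)
ann.fit(X_tr, y_tr)
score = ann.score(X_te, y_te)   # R^2 on held-out data
```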
Support Vector Machine
SVM is one
of the most common ML techniques, well-known for its high capability
to deal with classification and regression applications with a high
degree of complexity.[41] It follows the
supervised learning approach while carrying out the transformation
of the input data set into a higher-degree dimensional (n-dimensional) feature space whereby more space would be available
for training instances to achieve the optimal hyper-plane.[42] Several parameters are required to be adequately optimized during SVM training to develop a robust model with optimal performance.[42−44] Recently, many studies employed the SVM technique
in estimating several petroleum-related parameters and in geomechanics-related
applications.[45−49]
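A minimal SVM-regression sketch with a Gaussian (RBF) kernel, the epsilon-insensitive loss, and a C penalty, mirroring the parameter names tuned later in this study; scikit-learn and synthetic data are used here purely for illustration:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(300, 3))       # three normalized features
y = np.sin(3.0 * X[:, 0]) + 0.5 * X[:, 1]      # smooth synthetic target

# Gaussian (RBF) kernel; C controls the penalty on deviations larger
# than the epsilon-insensitive tube, as discussed in the text.
svr = SVR(kernel="rbf", C=400.0, epsilon=0.01, gamma="scale")
svr.fit(X, y)
r2 = svr.score(X, y)   # fit quality on the training data
```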
Results and Discussion
ANN-Based
Model Development
In this
study, ANN was employed to develop new models that can estimate the
Shmin and Shmax gradients based on the well-log
data as feeding inputs. The obtained data set was initially divided
into three main categories: training, validation, and testing sets.
Typically, multiple models are trained using the training set with
different hyper-parameters before being tested internally utilizing
the validation set to evaluate the selected hyper-parameters. The
developed model with those hyper-parameters that yield acceptable
prediction accuracy on the validation set is then tested using the
testing set to evaluate the generalization error of the trained model.[39]

Ratios ranging from 70 to 90% were tested
for the training set, and for each trial, the rest of the data was
split using a one-to-one ratio for the testing and validation sets.
Meanwhile, different combinations of the ANN parameters were tested
to optimize the model. Table 3 lists the ANN-parameter options that have been tested in
addition to the selected (optimized) ones.
Table 3. Tested Options for Optimizing the Developed ANN-Based Models

parameter                       | tested options/ranges                                                                  | optimized: Shmin gradient model | optimized: Shmax gradient model
number of hidden layers         | 1–4                                                                                    | single hidden layer             | single hidden layer
number of neurons in each layer | 5–40                                                                                   | 30                              | 15
split ratio (training/validation/testing) | 70–90% for the training set; the rest divided by a 1-to-1 ratio between validation and testing | 0.8/0.1/0.1 | 0.8/0.1/0.1
training algorithm              | trainlm, trainbfg, trainrp, trainscg, traincgb, traincgf, traincgp, trainoss, traingdx | trainlm                         | trainlm
transfer function               | tansig, logsig, elliotsig, radbas, hardlim, satlin                                     | tansig                          | tansig
learning rate                   | 0.01–0.9                                                                               | 0.05                            | 0.15
The gradient descent algorithm was implemented while iteratively updating the network parameters along the negative gradient direction of the objective function. The process includes considering random values of the model
function. The process includes considering random values of the model
hyperparameters and iteratively adjusting them using the available
options to eventually reduce the loss function over a series of trials
(epochs). The model’s hyper-parameters are updated through
each iteration to minimize the loss of the next iteration using the
back propagation technique.

A MATLAB code was developed to test
different scenarios while optimizing
the network. Each scenario includes different combinations of the
available options of the ANN parameters. The prediction for each case
was evaluated in terms of the R-value to assess the
correlation between the predicted and actual output values. In addition,
the prediction error was evaluated using the mean absolute percentage
error (MAPE) and root-mean-squared error (RMSE) between the predicted
and observed output values for the training, validation, and testing
processes. Achieving the highest R-value besides
the lowest MAPE and RMSE was the objective criteria to select the
optimized parameters of the network. The mathematical formulas used
to calculate MAPE and RMSE are stated in Appendix
A.
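The three evaluation metrics can be sketched directly from their standard definitions (the Appendix A formulas themselves are not reproduced here, so these are the conventional forms):

```python
import numpy as np

def r_value(actual, predicted):
    """Pearson correlation coefficient between actual and predicted."""
    return float(np.corrcoef(actual, predicted)[0, 1])

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def rmse(actual, predicted):
    """Root-mean-squared error."""
    actual, predicted = np.asarray(actual), np.asarray(predicted)
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Sanity check with a perfect prediction: R -> 1, MAPE = 0, RMSE = 0.
y = np.array([0.55, 0.56, 0.57, 0.58])
assert abs(r_value(y, y) - 1.0) < 1e-9
assert mape(y, y) == 0.0 and rmse(y, y) == 0.0
```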
Shmin Gradient
Prediction
The tuning process of the developed Shmin gradient model resulted in a network architecture of three layers: one input layer including the input features [Sqrt(GR), DTC, and RHOB], one hidden layer with 30 neurons, and one output (Shmin gradient) layer. The developed model was trained by the Levenberg–Marquardt algorithm (trainlm) with a learning rate of 0.05 using a transfer function of the tan-sigmoidal type for the hidden layer and a linear function for the output layer. Figure 3 shows a typical architecture schematic of the developed ANN-based models. The crossplots between the predicted and actual Shmin gradients (Figure 4) showed a significant match with an R-value of 0.90 and a MAPE not exceeding 0.14% for both the training and testing processes.
Figure 3
Typical architecture of the developed
ANN-based models.
Figure 4
Crossplots between the
actual and predicted Shmin gradients
for the developed ANN-based model for (a) training and (b) testing
processes.
After fitting a regression model,
the prediction residuals have
been checked to ensure reliable regression results. Therefore, the
residuals of the Shmin-gradient model were plotted versus
the fitted values, as depicted in Figure 5a, which shows the random scattering of the residuals around zero. The residual histogram was also found to be approximately normally distributed (Figure 5b), which demonstrates that all the fitted values have almost the same degree of scattering.[40]
Figure 5
Analysis of the prediction residuals of the Shmin-gradient
ANN-based model: (a) residuals vs fitted values and (b) histogram
of the prediction residuals.
Shmax Gradient Prediction
Similarly, the optimized network for predicting the Shmax gradient contained one hidden layer with 15 neurons. The model was
trained with a learning rate of 0.15 using trainlm as the learning algorithm.

The narrow scatter of the points along the 45° line in the crossplots shown in Figure 6 indicates the agreement between the observed Shmax gradients and the predicted ones for both the training and testing. This is further verified by the low MAPE of 0.30% between the observed and predicted values for the testing process. In addition, the average R-value is 0.98 for both. The evaluation metrics (R-value, MAPE, and RMSE) listed in Table 4 describe the accuracy of the ANN-based models. Furthermore, plotting the model prediction residuals versus the fitted values showed a scattered pattern around zero (Figure 7a), with approximately a normal distribution in the residual histogram plot depicted in Figure 7b. These measures indicate the stable prediction (regression) performance of the developed model.
Figure 6
Crossplots between the
actual and predicted Shmax gradients
for the developed ANN-based model for (a) training and (b) testing
processes.
Table 4. Summary of the Metrics Used for Evaluating the Accuracy of the Developed ANN-Based and SVM-Based Models

output parameter | model | training R-value | training MAPE (%) | training RMSE | testing R-value | testing MAPE (%) | testing RMSE
Shmin gradient   | ANN   | 0.92 | 0.12 | 0.0016 | 0.92 | 0.14 | 0.0013
Shmin gradient   | SVM   | 0.86 | 0.18 | 0.0019 | 0.86 | 0.16 | 0.0017
Shmax gradient   | ANN   | 0.98 | 0.28 | 0.0037 | 0.98 | 0.30 | 0.0038
Shmax gradient   | SVM   | 0.98 | 0.34 | 0.0041 | 0.97 | 0.41 | 0.0041
Figure 7
Analysis of the prediction residuals of the
Shmax-gradient
ANN-based model: (a) residuals vs fitted values and (b) histogram
of the prediction residuals.
SVM-Based
Model Development
The same
data set was used for building the SVM-based models to estimate the
Shmin and Shmax gradients using the same input
features. For optimizing the SVM-based models, both Gaussian and polynomial
kernel functions were tested with different SVM-model optimizing parameters:
epsilon, lambda, kernel option, C-parameter, and
verbose. The model was trained using 70% of the obtained data, while
the rest were used for the validation and testing processes with a
one-to-one ratio. For both the Shmin- and Shmax-gradient models, the sensitivity analysis shows that the epsilon,
lambda, and verbose parameters did not significantly impact prediction
accuracy. The Gaussian kernel function yielded better prediction performance
regarding the R-value between the predicted and actual
output values than the polynomial function. Varying kernel options
from one to nine showed that a kernel option of 3.5 gave the best
prediction performance with the lowest MAPE for both the Shmin- and Shmax-gradient models. A C-parameter
of 400 was selected for the Shmin gradient model, while
600 was chosen for the Shmax-gradient model. Increasing
the C-parameter value beyond the values chosen resulted
in an over-fitting problem in the developed models, indicated by a low training error but very high errors in the testing process. These selected values of the SVM-based model parameters yielded
the best prediction performance during the testing process in terms
of the R-value of 0.86 and 0.97 and MAPE values of
0.16 and 0.41% between the predicted and the actual values for the
Shmin and Shmax gradient models, respectively.
The statistical parameters (R-value, MAPE, and RMSE) describing the performance of the SVM-based models to estimate the Shmin and Shmax gradients are listed in Table 4.

Table 5 summarizes the selected SVM parameters for the developed Shmin- and Shmax-gradient models. Figures 8 and 9 show the crossplots between the predicted and observed output values for the model development processes (training and testing).
Table 5. Tested Options for Optimizing the Developed SVM-Based Models

parameter       | tested options/ranges            | selected: Shmin gradient model | selected: Shmax gradient model
kernel function | Gaussian, polynomial, htrbf, rbf | Gaussian                       | Gaussian
kernel option   | 1.5–7                            | 3.5                            | 3.5
lambda          | 1 × 10−7 to 1 × 10−1             | 1 × 10−5                       | 1 × 10−5
epsilon         | 0.00001–0.1                      | 0.1                            | 0.1
verbose         | 1                                | 1                              | 1
C-parameter     | 10–1000                          | 400                            | 600
Figure 8
Crossplots between the actual and predicted
Shmin gradients
for the developed SVM-based models for (a) training and (b) testing
processes.
Figure 9
Crossplots between the actual and predicted
Shmax gradients
for the developed SVM-based models for (a) training and (b) testing
processes.
Comparing the prediction performance of both ANN and
SVM models
in the testing data set showed that the developed ANN-based models
outperformed the SVM-based ones while predicting Shmin and
Shmax gradients. The developed ANN-based models yielded
better predictions for the testing process of the developed models
regarding higher R-values of 0.92 and 0.98 and lower MAPE values of 0.14 and 0.30% between the predicted and actual Shmin and Shmax gradients. In comparison, the predictions of the developed SVM-based models resulted in R-values of 0.86 and 0.97 with MAPE values of 0.16 and 0.41% for the Shmin and Shmax gradient models, respectively (Figures 10 and 11). Furthermore, the ANN approach has the additional advantage that equations imitating the neural-network processing can be extracted.
Figure 10
Comparison of the prediction performance between the developed
(Shmin gradient) ANN-based and SVM-based models in terms
of (a) R-value and (b) MAPE for training, validation,
and testing processes.
Figure 11
Comparison of the prediction
performance between the developed
(Shmax gradient) ANN-based and SVM-based models in terms
of (a) R-value and (b) MAPE for training, validation,
and testing processes.
Empirical
Equations for Estimating Shmin and Shmax Gradients
One of the primary outcomes of this study was the development of new empirical
equations that can be used to estimate the Shmin and Shmax gradients without
needing to run the MATLAB codes. Accordingly, the Shmin and Shmax gradients can
be calculated using the novel ANN-based eqs 4 and 5, respectively. The subscript
“normalized” refers to the normalized form of the Shmin and Shmax gradients, and
the input parameters should first be normalized using the point-slope form in
eq 6, where X is the actual value of the input parameter, Xmin and Xmax are the
minimum and maximum values of the input features, respectively, and Xnormalized
is the normalized form of the input parameter. The normalized forms of the Shmin
and Shmax gradients in eqs 4 and 5 can be calculated using eqs 7 and 8. The
[√GR]normalized, DTCnormalized, and RHOBnormalized terms represent the
normalized forms of the input parameters obtained using eq 6. These equations
were established to mimic the developed ANN-based models utilizing the tuned
weights and biases of the optimized networks. The weights and biases of the
developed Shmin and Shmax models in eqs 7 and 8 are listed in Tables 6 and 7,
respectively. The input parameters should be measured in the following units:
GR in API units, DTC in μs/ft, and RHOB in g/cm3.
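As a minimal sketch of the computation just described, the following assumes the single-hidden-layer architecture implied by the weight tables, with a tanh hidden activation and a linear output (the MATLAB fitting-network convention), and a min-max normalization to [−1, 1]; the paper's exact scaling range and activation are assumptions here, and all names are illustrative.

```python
import numpy as np

def normalize(x, x_min, x_max):
    """Min-max scale a raw input to [-1, 1].

    A common point-slope normalization (MATLAB mapminmax convention);
    the exact scaling range used in the paper is an assumption here.
    """
    return 2.0 * (x - x_min) / (x_max - x_min) - 1.0

def ann_output_normalized(x_norm, W1, b1, W2, b2):
    """Evaluate a single-hidden-layer network: tanh hidden layer, linear output.

    x_norm : length-3 vector of normalized inputs (sqrt(GR), DTC, RHOB)
    W1     : (n_hidden, 3) input-to-hidden weights (the j columns of Table 6 or 7)
    b1     : (n_hidden,) hidden-layer biases
    W2     : (n_hidden,) hidden-to-output weights
    b2     : scalar output bias
    """
    hidden = np.tanh(W1 @ x_norm + b1)
    return float(W2 @ hidden + b2)
```

The normalized output would then be converted back to an actual stress gradient by inverting the same min-max mapping for the output variable.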
Table 6
Extracted Weights and Biases to Be Used in Eq 7 for Estimating the Shmin Gradient
i     W1i,j=1   W1i,j=2   W1i,j=3   W2i      b1,i     b2
1     –3.921    0.527     1.072     0.298    4.579    –0.870
2     2.934     2.286     2.366     –0.298   –3.932
3     0.043     –6.361    –0.428    –0.876   –5.893
4     –3.070    –2.316    –2.405    0.664    2.966
5     2.943     1.749     –2.839    0.685    –3.041
6     –1.677    –4.048    0.278     0.127    2.709
7     –0.767    –4.177    –4.646    –0.340   1.410
8     –3.120    –3.061    0.288     0.945    1.909
9     0.968     –2.507    –2.958    0.441    –1.716
10    –3.968    1.687     1.129     –0.330   1.137
11    –1.476    4.656     2.314     0.339    2.230
12    3.189     2.455     –1.452    0.154    –1.175
13    –3.588    0.998     2.001     0.278    0.889
14    –0.228    1.546     3.449     0.625    1.180
15    –1.189    3.067     –2.984    0.213    0.542
16    1.895     –2.761    2.420     0.148    1.167
17    3.512     1.644     –2.954    0.251    0.944
18    –1.783    –1.520    –0.185    0.752    –1.164
19    –4.342    –0.693    –3.299    0.380    –0.803
20    –4.097    –0.671    –2.223    –0.496   –0.770
21    2.793     –3.729    0.564     –0.697   1.154
22    1.768     –2.186    3.675     –0.134   1.558
23    –0.492    –1.169    5.038     –0.751   –3.895
24    –3.288    0.868     –2.794    0.185    –2.401
25    3.086     0.805     3.362     0.112    1.596
26    4.075     –1.966    1.095     –0.105   2.999
27    –4.754    1.211     –1.468    –0.870   –2.993
28    –0.508    1.499     –4.257    0.431    –3.777
29    2.612     1.414     2.946     –0.492   4.200
30    0.698     –1.903    5.501     0.693    –4.498
Table 7
Extracted Weights and Biases to Be Used in Eq 8 for Estimating the Shmax Gradient
i     W1i,j=1   W1i,j=2   W1i,j=3   W2i      b1,i      b2
1     1.719     –2.422    12.895    –0.327   –10.487   –0.250
2     –1.273    1.409     –8.854    –0.437   7.050
3     –5.160    –1.220    –4.956    –1.429   2.775
4     –5.181    1.336     –2.033    0.170    3.534
5     0.431     1.159     3.291     0.184    0.899
6     5.567     –3.632    –6.719    –0.045   –5.231
7     –9.146    –4.834    –5.639    –0.201   2.091
8     –5.561    –1.786    –5.311    1.451    2.698
9     0.096     –4.575    –0.572    0.207    –0.478
10    –2.126    0.730     19.793    0.040    –6.191
11    –6.788    3.517     3.563     0.083    –0.822
12    –0.056    –3.909    –0.429    0.448    –2.425
13    –2.571    –4.794    3.664     1.154    –8.127
14    2.126     4.353     –3.162    1.428    7.309
15    2.434     –12.016   –1.286    0.129    8.345
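As a concrete illustration of how the table entries map onto the arrays used in the network equation, the first three rows of Table 7 can be assembled as follows; the assignment of the j = 1..3 columns to the normalized √GR, DTC, and RHOB inputs, in that order, follows the input ordering stated in the text.

```python
import numpy as np

# First three hidden neurons of the Shmax model (rows i = 1..3 of Table 7).
# Each W1 row holds one neuron's three input weights (columns j = 1..3).
W1 = np.array([
    [ 1.719, -2.422, 12.895],
    [-1.273,  1.409, -8.854],
    [-5.160, -1.220, -4.956],
])
W2 = np.array([-0.327, -0.437, -1.429])   # hidden-to-output weights
b1 = np.array([-10.487, 7.050, 2.775])    # hidden-layer biases
b2 = -0.250  # single output bias, listed once on the first row of the table
```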
Model Verification
For further investigation of the performance of the developed equations, 456
unseen data points from well B were used. These data involved the logging
measurements (GR, RHOB, and DTC) and the corresponding Shmin and Shmax
gradients. The logging data were fed as inputs to the developed ANN-based
equations, and the results were then compared with the actual stress-gradient
values. The predictions of both the Shmin and Shmax gradients remarkably
matched the actual values, with MAPE values of 0.18 and 0.43% and R-values
exceeding 0.90 for the Shmin and Shmax predictions, respectively (Figure 12).
These results demonstrate the outstanding performance of the developed
ANN-based equations in developing continuous profiles of the Shmin and Shmax
with high accuracy whenever the well-log data are available.
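The two verification metrics quoted above can be computed as in this minimal sketch; the function names and example values are illustrative only.

```python
import numpy as np

def mape(actual, predicted):
    # Mean absolute percentage error, reported in percent.
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

def r_value(actual, predicted):
    # Pearson correlation coefficient between actual and predicted series.
    return float(np.corrcoef(actual, predicted)[0, 1])
```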
Figure 12
Prediction performance
of the developed ANN-based equations (actual
vs predicted stress gradients) for the verification process: (a) Shmin-gradient prediction and (b) Shmax-gradient prediction.
Having continuous profiles of the least principal
stresses for
the drilled wells could help provide practical solutions to several
wellbore instability issues that may affect the well integrity. Besides,
such data would help develop a comprehensive geomechanical model of
the subsurface formations. As a result, a broad suite of problems
along different stages of the well life could be addressed and avoided.

It should be highlighted that the developed correlations are recommended mainly
for carbonate formations, from which most of the data used in developing the
models were obtained. Other formation types may have different log responses to
the geomechanical properties that control the downhole stress distributions, so
some error might be expected when the equations are applied to different
lithologies. Moreover, it is recommended to employ the developed equations
using inputs within the ranges and with the same units listed in Table to
ensure reliable results.
Conclusions
New models were developed using two ML techniques, ANN and SVM, to predict the
maximum (Shmax) and minimum (Shmin) horizontal stress gradients. The developed
models used the conventional logging data (GR, RHOB, and DTC) as inputs to the
algorithms.
The findings of this research can be highlighted as follows:

- The prediction performance of the models developed by ANN surpassed that of
the SVM-based ones, with an accuracy exceeding 90% and a MAPE of 0.30%.
- Novel equations were established from the tuned weights and biases of the
optimized neural networks. These equations can estimate the Shmin and Shmax
gradients directly from the logging data.
- The new equations were validated using a different data set, achieving a
clear match between the predicted and actual stress-gradient values, with a
MAPE not exceeding 0.43%. These results reflect the robustness of the new
equations in accurately estimating the Shmin and Shmax gradients directly from
the well-logging data.