Real-time prediction of the formation pressure gradient is critical for drilling operations, as it can improve both the quality of the decisions taken and the economics of drilling. The pressure while drilling (PWD) tool can provide pressure data while drilling, but its cost and limited availability restrict its use in many wells. The models available in the literature for pressure gradient prediction are based on well logging or on a combination of some drilling parameters and well logging. Well-logging data, however, are not available for every well or for every section of most wells. The objective of this paper is to use support vector machines (SVMs), functional networks (FNs), and random forest (RF) to develop three models for real-time pore pressure gradient prediction using both mechanical and hydraulic drilling parameters. The input parameters are mud flow rate (Q), standpipe pressure (SPP), rate of penetration (ROP), and rotary speed (RS). A data set of 3239 field data points was used to develop the predictive models. A different data set unseen by the models was used to validate them. The three models predicted the pore pressure gradient with a correlation coefficient (R) of 0.99 and 0.97 for training and testing, respectively. The root-mean-squared error (RMSE) between the predicted and the actual pore pressure data ranged from 0.008 to 0.021 psi/ft for training and testing, respectively. Moreover, the average absolute percentage error (AAPE) ranged from 0.97% to 3.07% for training and testing, respectively. The RF model outperformed the other models with an R of 0.99 and an RMSE of 0.01. For the validation data set, the models predicted the pore pressure gradient with high accuracy (R of 0.99, RMSE around 0.01, and AAPE around 1.8%). This work shows the reliability of the developed models in predicting the pressure gradient from both mechanical and hydraulic drilling parameters while drilling.
Petroleum engineering is an engineering field concerned with the
activities related to hydrocarbon production, which can be either
crude oil or natural gas. As a part of petroleum engineering, drilling
is considered the only way for the actual discovery of reservoirs
to produce hydrocarbons. While drilling a well through a certain geological
column, various formations with different properties and pressures
(pore and fracture pressures) are encountered. Knowledge of the
drilling window, which is bounded by the pore and fracture pressures versus
depth, helps in reaching the target with minimum time and cost.
The pore pressure, or formation pressure, is the pressure exerted
by the fluids contained in the porous media. The normal pore pressure
at any depth results from the weight of the water column extended
from a certain depth to the surface. The normal pressure gradient
ranges from 0.433 psi/ft for fresh water to 0.465 psi/ft for salt
water.[1] The term abnormal pressure
describes any deviation from the normal gradient, which
can be either overpressure or subnormal pressure.[2] Overpressure, also called geopressure, is a pore pressure
greater than the normal pressure. It results from an extra
pressure source added to the normal pressure and may cause kicks and
blowouts. The excess pressure source may be geological, mechanical,
geochemical, geothermal, or a combination of these.[3] Subnormal pressure, in contrast, is a pore pressure lower than the normal
pressure and may cause differential-pressure pipe sticking and loss
of circulation.[3] Real-time pore pressure
prediction may enhance well trajectory, casing, and mud program designs
and provide better wellbore stability analysis, resulting in reducing
the overall drilling time and cost.[4]
Real-time pore pressure prediction can be either qualitative or
quantitative. The methods proposed in the literature used well-logging
data or formation properties, and a few studies combined logging and drilling
parameter data. Hottman and Johnson[5] predicted
the pore pressure based on logging data from Miocene and Oligocene
shales. The authors established Cartesian cross-plots relating the
pressure gradient and the difference in sonic travel time or resistivity
ratio between the observed and the normal trends. Pennebaker[6] used the sonic travel time ratio instead of the
difference in sonic travel time used by Hottman and Johnson.[5] Matthews and Kelly[7] utilized a semilog scale instead of the Cartesian one for the same
correlation of Hottman and Johnson. Eaton[8] mentioned that both pore and overburden pressures control log-derived
data. Eaton[8] developed an empirical sonic-based
model to estimate the pore pressure gradient in shales. Gardner et
al.[9] made a new attempt to predict
the pore pressure by including the overburden pressure data after
analyzing the data provided by Hottman and Johnson. Bowers[10] predicted the pore pressure from sonic data
(slowness) after replacing the effective stress (αV) with (αV – pore pressure).
Foster
and Whalen[11] used the equivalent
depth method for the first time to predict the pore pressure from
electrical logs. Additionally, Ham[12] used
the equivalent depth method with resistivity, sonic, and density data
to estimate the pore pressure and calculate the required mud density.
Eaton[8,13] proposed empirical resistivity- and conductivity-based
models to predict the formation pressure gradient in shales from well
logs. Bingham[14] introduced the d exponent to correct the rate of penetration (ROP) for
the changes in the hole diameter, weight on bit (WOB), and rotary
speed (RS). Jorden and Shirley[15] modified
the Bingham approach by suggesting a new term called dexp. Rehm and McClendon[16] modified
Jorden and Shirley’s dexp by considering
the mud weight effect. Eaton[17] noticed
that the corrected d exponent plot is similar to
the resistivity log plot. Consequently, the author introduced a model
to estimate the pore pressure at any depth using observed dc, normal trendline value of dc, overburden pressure, and normal pore pressure gradients.
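The d-exponent workflow described above can be sketched in a few lines. The expressions below are the commonly quoted field-unit forms of the Jorden and Shirley d exponent, the Rehm and McClendon mud-weight correction, and Eaton's d-exponent relation with an exponent of 1.2; they are not reproduced from this paper, and the function names and sample numbers are hypothetical.

```python
import math

def d_exponent(rop_ft_per_h, rs_rpm, wob_lb, bit_diameter_in):
    """d exponent in its commonly quoted field-unit form (assumed, not from this paper)."""
    numerator = math.log10(rop_ft_per_h / (60.0 * rs_rpm))
    denominator = math.log10(12.0 * wob_lb / (1.0e6 * bit_diameter_in))
    return numerator / denominator

def corrected_d_exponent(d, normal_mw_ppg, actual_mw_ppg):
    """Mud-weight correction: scale d by the normal-to-actual mud weight ratio."""
    return d * normal_mw_ppg / actual_mw_ppg

def eaton_gradient(overburden_grad, normal_grad, dc_observed, dc_normal, exponent=1.2):
    """Eaton-type pore pressure gradient (psi/ft) from observed and trendline dc."""
    ratio = (dc_observed / dc_normal) ** exponent
    return overburden_grad - (overburden_grad - normal_grad) * ratio

# Hypothetical readings, for illustration only.
d = d_exponent(rop_ft_per_h=30.0, rs_rpm=120.0, wob_lb=25000.0, bit_diameter_in=8.5)
dc = corrected_d_exponent(d, normal_mw_ppg=9.0, actual_mw_ppg=10.0)
on_trend = eaton_gradient(1.0, 0.465, dc_observed=1.5, dc_normal=1.5)
below_trend = eaton_gradient(1.0, 0.465, dc_observed=1.2, dc_normal=1.5)
print(round(d, 3), round(dc, 3), round(on_trend, 3), round(below_trend, 3))
```

When the observed dc sits on the normal trendline, the relation returns the normal gradient; a dc below the trendline yields a gradient above normal, which is the overpressure signal these methods look for. Note that this approach needs a normal trendline, which the models proposed in this work avoid.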
Machine Learning Studies in Petroleum Engineering
Machine
learning has various tools such as artificial neural networks
(ANNs), support vector machines (SVMs), adaptive neurofuzzy inference
system (ANFIS), and functional networks (FNs), which provide robust
performance and high accuracy for classification and prediction problems.[18] Machine learning is widely used in many disciplines
of engineering, medicine, economics, military, and marine sectors.[19] It has been widely used in petroleum engineering
because it can solve complex problems, handle
big data, and represent them with high accuracy
compared to other models.[20] Different models
were introduced for different purposes such as ROP prediction and
optimization for different drilled formations and well profiles,[21] estimating the oil recovery factor,[22] lithology classification,[23] well planning,[24] the formation
lithology,[25] prediction of formation tops,[26] estimating the properties of reservoir fluids,[27] fracture density estimation,[28] detecting the downhole abnormalities during horizontal
drilling,[29] wellbore stability,[30] predicting the compressional and shear sonic
times,[31] fracture pressure prediction while
drilling,[32] estimating the content of total
organic carbon,[33] identifying and estimating
the rock failure parameters,[34] estimating
the wear of a drill bit from the drilling parameters,[35] predicting the rheological properties of drilling fluids
in real time,[36−39] and estimating the rock static Young’s modulus.[40,41]
A few studies used machine learning tools to predict the pore pressure
gradient. Ahmed et al.[42] applied ANN to
develop a white-box pore pressure prediction model using seven parameters.
The parameters used were porosity (φ), rotation speed (rpm),
WOB, mud weight (MW), bulk density, rate of penetration (ROP), and
interval transit time (Δt). Ahmed et al.[43] used five artificial intelligence tools (ANN,
ANFIS, RBF, SVM, and FN) to estimate the pore pressure with the same
seven input variables used in Abdulmalek et al.’s[42] study. Aliouane et al.[44] proposed fuzzy logic (FL)- and ANN-based models to predict the pore
pressure from logging data in shale gas reservoirs. Hu et al.[45] applied the ANN to predict the formation pressure.
The authors used four input parameters, which are depth, density,
interval transit time, and gamma ray. Li et al.[46] used the ANN to predict the pore pressure by including
input parameters, such as gamma ray, interval transit time, natural
potential, and pipe pressure test data.
Rashidi and Asadi[47] introduced an ANN-based
model to predict the pore pressure using two parameters (mechanical
specific energy and drilling efficiency). The proposed model does
not consider bit hydraulics and bit wear, which may cause wrong predictions
in very soft and very hard abrasive formations. Keshavarzi and Jahanbakhshi[48] applied the back-propagation neural network
(BPNN) and the general regression neural network (GRNN) to predict
the pressure gradient in the Asmari reservoir in Iran. The input parameters
were depth, permeability, rock density, and porosity.
The objective
of this work is to use SVMs, FNs, and RF to develop
three models to predict the pore pressure gradient while drilling
using the available mechanical and hydraulic drilling parameters.
Unlike the other empirical models, the proposed models do not require
a pressure trend (such as normal pressure trend) to predict the pore
pressure gradient. The high cost and low availability of the pressure
while drilling (PWD) tool limit its usage in many wells. The available
models in the literature are based on well logging or a combination
of some drilling parameters and well logging. Well-logging data
are not available for every well or for every section of most wells. Moreover,
the logging while drilling (LWD) tool, when it is run, is located tens of feet above
the drill bit, so its readings do not reflect
the formations being drilled at that instant.
Methodology
The study started with data acquisition from vertical wells in
the Middle East followed by the filtration and cleaning process. The
data set went through the analysis stage to obtain more information
about the inputs and the target. Then, random division of the data
set took place for training and testing. The model development stage
started with running the initial case, and the parameters were updated
until obtaining the best results. Finally, the optimum parameters
were extracted, and the models were validated using unseen data, which
were not included in training and testing. Figure 1 summarizes the methodology followed in this
study to build the different models. In this work, three machine learning
models were developed to predict the pore pressure gradient of the
downhole formations while drilling using three machine learning tools:
SVM, FNs, and RF.
Figure 1
Flowchart of the methodology followed in this work.
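The data division in the workflow above can be sketched as follows. This is a minimal illustration, not the authors' program: it assumes that the 3239 points include the 92 validation points (which would leave the 3147 points reported in the title of Table 1 for training and testing), and it picks the validation points at random, whereas the study took them from selected sections of the wells. The function name and seed are hypothetical.

```python
import random

def split_dataset(n_total, n_validation, train_fraction, seed=0):
    """Return (train, test, validation) index lists with no overlap."""
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)      # random division of the records
    validation = indices[:n_validation]       # held back, never used for fitting
    remaining = indices[n_validation:]
    n_train = int(train_fraction * len(remaining))
    return remaining[:n_train], remaining[n_train:], validation

# Assumed bookkeeping: 3239 points in total, 92 of them kept for validation.
train_idx, test_idx, validation_idx = split_dataset(3239, 92, 0.75)
print(len(train_idx), len(test_idx), len(validation_idx))  # prints 2360 787 92
```

Keeping the validation indices out of both the training and testing lists is what makes the final check a test on unseen data.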
Data Processing and Analysis
Data Description
A data set of around
3239 points was collected from some vertical wells in a field in the
Middle East. The data included hydraulic and mechanical drilling parameters
in addition to the pore pressure gradient data with their corresponding
depths. These drilling parameters include hydraulic measurements such
as pump rate and standpipe pressure (SPP) and mechanical measurements
such as RS, ROP, torque (T), and WOB. The drilling
parameters were used as model inputs to predict the pore pressure
gradient as a target.
These drilling parameters can be measured
at either the surface or the downhole during normal drilling operations.
Additionally, these parameters are affected by formations being drilled
and their fluid content. The field data were statistically analyzed
showing data variability, as the data cover a wide range of the inputs
and the outputs, as shown in Table 1. For example, the data have a wide range of the pore
pressure gradient, covering normal (around 0.465 psi/ft), abnormal
(greater than 0.465 psi/ft), and subnormal (lower than 0.465 psi/ft)
pressure.
Table 1
Statistical Analysis of the Field Data Used to Generate the Predictive Models (Total of 3147 Data Points)
| statistical parameter | depth (ft) | pump rate (gal/min) | SPP (psi) | RS (rpm) | WOB (klb) | T (klb ft) | ROP (ft/h) | pore pressure (psi) | pressure gradient (psi/ft) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| minimum | 12 591 | 283.69 | 2000.54 | 65.92 | 5.21 | 2.87 | 3.02 | 4572.55 | 0.36 |
| maximum | 14 700 | 308.83 | 3140.57 | 148.96 | 20.73 | 5.82 | 65.08 | 8147.59 | 0.58 |
| mean | 13 714 | 299.52 | 2599.73 | 118.73 | 14.13 | 3.85 | 27.22 | 6696.00 | 0.48 |
| standard deviation | 761.67 | 4.53 | 377.66 | 22.05 | 2.38 | 0.38 | 9.43 | 1419.43 | 0.08 |
| skewness | –0.07 | –0.78 | 0.09 | –0.86 | –0.10 | 0.99 | 0.45 | –0.27 | –0.36 |
| kurtosis | 1.22 | 3.55 | 1.28 | 2.47 | 4.44 | 4.84 | 3.60 | 1.27 | 1.36 |
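The pressure regimes covered by the data can be made concrete with a small sketch. It uses the standard field-unit factor of 0.052 psi/ft per lb/gal to obtain a hydrostatic gradient and then labels a gradient relative to the 0.465 psi/ft normal value quoted above. The tolerance band and the function names are assumptions introduced for illustration; the paper does not state where it draws the boundaries.

```python
PSI_PER_FT_PER_PPG = 0.052  # standard field-unit factor: psi/ft per lb/gal

def hydrostatic_gradient(fluid_density_ppg):
    """Pressure gradient (psi/ft) of a static fluid column of the given density."""
    return PSI_PER_FT_PER_PPG * fluid_density_ppg

def classify_gradient(gradient_psi_ft, normal=0.465, tolerance=0.01):
    """Label a gradient; the tolerance band around normal is an assumed value."""
    if gradient_psi_ft > normal + tolerance:
        return "overpressure"
    if gradient_psi_ft < normal - tolerance:
        return "subnormal"
    return "normal"

fresh_water = hydrostatic_gradient(8.33)        # fresh water is about 8.33 lb/gal
labels = [classify_gradient(g) for g in (0.36, 0.465, 0.58)]
print(round(fresh_water, 3), labels)
```

With the minimum (0.36) and maximum (0.58) gradients from Table 1, the sketch returns a subnormal and an overpressure label, respectively, which matches the statement that the data span all three regimes.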
Data Processing
In machine learning,
the data quality is as important as the quality of a prediction or
a classification model. The field data should be filtered and analyzed
for better prediction.[49] A specially designed
MATLAB program was used to clean the data by removing all values that are not representative,
such as missing values, negative values, −999 values, not a number (NaN) entries, and
any unrealistic data points.
Then, the outliers should be excluded, as they may cause major
issues in statistical analysis.[50] Outliers
may result from human errors and/or instrument errors.
Outlier detection can be performed by techniques such as the Z score and a box-and-whisker plot.[51] The reliability of the input data was tested by different strategies
such as comparing the collected data with the working ranges of the
tools and with the same parameters in offset wells in the same field.
Additionally, the actual pore pressure gradient values were compared
to the pore pressure values generated using the common trends of the
pressure gradient for different formations in this field. This comparison
ensured a reasonable match between the actual and the generated values,
confirming the reliability of the collected data.
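The cleaning steps above can be sketched in Python (the authors used a MATLAB program that is not shown in the paper). The sketch drops records containing missing, NaN, −999, or negative entries and then applies a Z-score filter to one column. The threshold of 3, the column names, and the sample values are assumptions for illustration only.

```python
import math
import statistics

SENTINEL = -999.0  # placeholder value written in place of a missing reading

def is_valid(value):
    """True for a present, finite, non-negative reading that is not the sentinel."""
    return (value is not None and math.isfinite(value)
            and value != SENTINEL and value >= 0.0)

def clean(rows):
    """Keep only the records in which every field is a valid reading."""
    return [row for row in rows if all(is_valid(v) for v in row.values())]

def drop_outliers(rows, column, z_max=3.0):
    """Remove records whose Z score in the given column exceeds z_max (assumed 3)."""
    values = [row[column] for row in rows]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0.0:  # constant column: nothing can be flagged
        return list(rows)
    return [row for row in rows if abs(row[column] - mean) / std <= z_max]

# Hypothetical records: 19 plausible ROP readings plus five problem rows.
raw = [{"rop": float(v), "spp": 2500.0} for v in range(25, 44)]
raw += [
    {"rop": 500.0, "spp": 2500.0},          # implausible spike, left to the Z score
    {"rop": -999.0, "spp": 2500.0},         # sentinel
    {"rop": float("nan"), "spp": 2500.0},   # NaN
    {"rop": None, "spp": 2500.0},           # missing
    {"rop": 30.0, "spp": -5.0},             # negative pressure
]
cleaned = clean(raw)
filtered = drop_outliers(cleaned, "rop")
print(len(raw), len(cleaned), len(filtered))
```

A box-and-whisker rule could replace the Z score in `drop_outliers` without changing the rest of the sketch.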
Input Selection
The formation properties
affect the drillability of the different geological strata during
drilling operations, as these properties dictate the resistance to
drill through these formations. The drilling parameters can somehow
reflect the resistance encountered during drilling the geological
column. The generated cuttings have effects on the pump pressures
and their rates during drilling. The formation types and the drilling
parameters play a significant role in controlling the ROP.[52,53] As a result, the aforementioned drilling parameters can somehow
reflect the nature of the drilled formations, and in turn, their pore
pressure gradients. The ROP can be utilized as an indicator to detect
overpressurized formations during drilling. The increased porosity
(trapped fluids) and the reduced density of the under-compacted formations
make them more drillable. The ROP was used to build the models as
it reflects the effects of other drilling parameters such as WOB.
Additionally, RS was used as it indirectly includes the torque effect.
Two mechanical drilling parameters (ROP and RS) are used in addition
to two hydraulic drilling parameters (SPP and pump rate).
Results and Discussion of the Developed Prediction Models
SVM Model
SVM is a supervised learning
tool used to analyze data for regression and classification problems
with a high degree of complexity. Moreover, SVM has advantages over
other artificial intelligence algorithms, such as generalization capability,
strong interference capacity, and less learning time.[54,55] SVM transfers the data from a lower-dimensional to a higher-dimensional
space, called kernel space, which provides more space for training
examples to find a support vector classifier (hyperplane), which reduces
the number of misclassified points.[56] SVM
uses kernel functions to move the data to the kernel space by systematically
finding the support vector classifiers in the higher dimension. The
kernel function selection is based on the nature of the data. The
performance of the SVM-based model relies on the optimization process
of many parameters to develop the desired predictive model with a
high accuracy.
The filtered data were split into two groups,
with a ratio of 75:25 for training and testing the model, respectively.
The SVM model parameters, including various combinations of different
options available for SVM parameters, were optimized by running different
cases for each parameter. The standard practice to optimize the hyperparameters
is to use grid search or random search with cross-validation. These
methods search through a fixed (grid search) or random (random search)
set of values for the hyperparameters and select the one that provides
the best modeling performance after being evaluated by cross-validation.
In this study, the hyperparameters were instead optimized one at a time.
Several metrics were calculated for every run
in which a single hyperparameter was changed, and the runs were
then compared to select the optimum value.
Different
kernel functions, like Gaussian and polynomial functions,
with different tuning parameters, such as regularization parameter
(C) (from 1 to 3000), kernel option (from 1 to 40),
epsilon, and lambda, were tested to obtain the best performance. The
regularization (penalty) parameter governs the generalization
ability of the machine by controlling its sensitivity
to outliers; in other words, it tells the algorithm how much it
should care about the misclassified points. The kernel option is a
scalar or a vector containing the options for the kernel function
selected. In the case of the polynomial kernel, the kernel option
is a scalar that gives the degree of the polynomial or a vector in
which the first element is the degree of the polynomial and other
elements give the bandwidth of each dimension; thus, the vector is
of size n + 1, where n is the dimension
of the problem. For the Gaussian kernel, the kernel option defines
the bandwidth of each dimension.
The model performance was evaluated
using R and
AAPE between the predicted and actual target values. The correlation
coefficient and AAPE were calculated by eqs 1 and 2, as shown in Appendix 1. The parameters giving the highest R and the lowest errors (RMSE and AAPE) between the actual
and the predicted pore pressure gradient values were chosen. The SVM-based
model, with its optimized parameters, as listed in Table 2, predicted the pore pressure
gradient with an R value of 0.98 and an AAPE of 1.53%
for training and an R value of 0.97 and an AAPE of
1.98% for testing. Moreover, the RMSE was 0.016 and 0.018 for training
and testing, respectively. Figure 2 shows the cross-plots between the estimated and actual
pore pressure gradients, in which the points closely follow
the 45° line, showing a high accuracy of prediction.
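The three evaluation metrics used throughout this section can be written out as a short sketch. The formulas are the standard definitions of the Pearson correlation coefficient, RMSE, and AAPE; they are assumed to match the equations in Appendix 1, which is not reproduced here. The sample gradients are made up.

```python
import math

def correlation_coefficient(actual, predicted):
    """Pearson correlation coefficient R between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def rmse(actual, predicted):
    """Root-mean-squared error, in the units of the target (psi/ft here)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def aape(actual, predicted):
    """Average absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Made-up gradients (psi/ft), for illustration only.
actual = [0.40, 0.45, 0.50, 0.55]
predicted = [0.41, 0.44, 0.51, 0.54]
r_value = correlation_coefficient(actual, predicted)
rmse_value = rmse(actual, predicted)
aape_value = aape(actual, predicted)
print(round(r_value, 3), round(rmse_value, 3), round(aape_value, 2))
```

R is dimensionless and measures how well the predictions track the trend, whereas RMSE and AAPE measure the size of the errors; a model can score well on one and poorly on the other.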
Table 2
SVM-Based Model Optimized Parameters
| parameter | T |
| --- | --- |
| kernel function | Gaussian |
| C | 10 |
| kernel option | 30 |
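For comparison with the one-at-a-time tuning used in this study, the grid search with cross-validation mentioned above can be sketched as follows. The sketch does not train an SVM: a Gaussian-weighted nearest-neighbor regressor stands in as the model so that the example stays self-contained, and its two hyperparameters (`k` and `bandwidth`) are hypothetical stand-ins for C and the kernel option. The data are synthetic.

```python
import itertools
import math

def predict(train_x, train_y, x, k, bandwidth):
    """Gaussian-weighted mean of the k nearest training targets (stand-in model)."""
    nearest = sorted(zip((math.dist(x, tx) for tx in train_x), train_y))[:k]
    weights = [math.exp(-d * d / (2.0 * bandwidth ** 2)) for d, _ in nearest]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed: fall back to a plain mean
        return sum(y for _, y in nearest) / len(nearest)
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / total

def cv_rmse(xs, ys, params, n_folds=5):
    """Mean RMSE over n_folds interleaved folds for one hyperparameter set."""
    scores = []
    for fold in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        test = [i for i in range(len(xs)) if i % n_folds == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        errors = [predict(train_x, train_y, xs[i], **params) - ys[i] for i in test]
        scores.append(math.sqrt(sum(e * e for e in errors) / len(errors)))
    return sum(scores) / n_folds

def grid_search(xs, ys, grid):
    """Score every combination in the grid and keep the lowest CV RMSE."""
    names = list(grid)
    best_params, best_score = None, math.inf
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = cv_rmse(xs, ys, params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Synthetic stand-in data: two scaled inputs and a smooth target.
xs = [(i / 40.0, (7 * i % 40) / 40.0) for i in range(40)]
ys = [0.40 + 0.10 * a + 0.05 * math.sin(6.0 * b) for a, b in xs]
grid = {"k": [1, 3, 5], "bandwidth": [0.05, 0.2, 1.0]}
best_params, best_score = grid_search(xs, ys, grid)
print(best_params, round(best_score, 4))
```

Because every combination is scored, a grid search can find settings that only work well together, which a one-at-a-time sweep may miss; the price is that the number of runs grows with the product of the option counts.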
Figure 2
Cross-plots
between the predicted and actual pressure gradient
results: (a) training and (b) testing (SVM model).
FN Model
FNs are considered a powerful
tool that, like artificial neural networks (ANNs), has a high capability
for prediction and classification problems in engineering.[57,58] FNs are a generalization of ANNs in which the activation functions
associated with neurons are learnt from data (not fixed). In ANNs,
the weights of the neurons should be learnt, while they are suppressed
in FNs.[52] Unlike ANNs, FNs do not require
weights on the neuron connections as they use multiargument functional
models and the weight effect inherently exists within these functions.[59,60] The outputs of the neurons are forced to converge to an equivalent
output. The specification of the initial topology in the FN is based
on the features of the problem. As a result, understanding the problem
can assist in developing the structure of the network.
The same
set of data was used to develop the FN predictive model. The data
were randomly divided into a ratio of 75/25 for training and testing,
respectively. The inputs were pump rate, ROP, RS, and SPP. Five FN
methods, with up to four relationship types each, were examined: exhaustive search
(FNESM), forward selection (FNFSM), backward elimination (FNBEM),
forward–backward (FNFBM), and backward–forward (FNBFM).
The performance of each method was compared to that of the others based
on R and AAPE to select the optimum method. Based
on the optimization process, as listed in Table 3, FNFBM and FNBFM with type 3 (nonlinear)
resulted in the highest R of 0.97 for training and
0.96 for testing between the actual and predicted output values. Moreover,
the RMSE was around 0.019 and 0.021 for training and testing, respectively,
and the AAPE was around 2.8% for training and 3.07% for testing. Figure 3 shows the cross-plots
between the predicted and the actual pore pressure gradients, in which
the points coincide with the 45° line.
Table 3
FN Model Performance for Various Methods (a)
| FN method | relationship type | R_training | R_testing | AAPE_training | AAPE_testing |
| --- | --- | --- | --- | --- | --- |
| FNESM | 1 | 0.9430 | 0.9394 | 4.3283 | 4.4767 |
| FNESM | 2 | 0.9622 | 0.9562 | 3.1991 | 3.4261 |
| FNFSM | 1 | 0.9430 | 0.9394 | 4.3283 | 4.4767 |
| FNFSM | 2 | 0.9622 | 0.9562 | 3.1991 | 3.4261 |
| FNFSM | 3 | 0.9673 | 0.9607 | 2.8830 | 3.1588 |
| FNFSM | 4 | 0.9631 | 0.9574 | 3.0901 | 3.3141 |
| FNBEM | 1 | 0.9430 | 0.9394 | 4.3283 | 4.4767 |
| FNBEM | 2 | 0.9603 | 0.9537 | 3.2099 | 3.4565 |
| FNBEM | 3 | 0.9649 | 0.9588 | 3.0436 | 3.3243 |
| FNBEM | 4 | 0.9599 | 0.9539 | 3.3287 | 3.5710 |
| FNFBM | 1 | 0.9430 | 0.9394 | 4.3283 | 4.4767 |
| FNFBM | 2 | 0.9622 | 0.9562 | 3.1939 | 3.4201 |
| FNFBM | 3 | 0.9696 | 0.9631 | 2.8032 | 3.0798 |
| FNBFM | 1 | 0.9430 | 0.9394 | 4.3283 | 4.4767 |
| FNBFM | 2 | 0.9622 | 0.9562 | 3.1939 | 3.4201 |
| FNBFM | 3 | 0.9696 | 0.9631 | 2.8032 | 3.0798 |
(a) The optimum runs (FNFBM and FNBFM with type 3) were set in bold in the original table.
Figure 3
Cross-plots between the
predicted and actual pressure gradient
results: (a) training and (b) testing (FN model).
RF Model
RF is considered an ensemble
learning technique that can be utilized for regression and classification
problems.[61] It combines hundreds or thousands
of decision trees, each trained on a slightly different set of
observations. It uses a process called bootstrapping, an iterative
resampling technique that estimates statistics of a population by sampling
a data set with replacement. The predictions of the trees are averaged
to provide the final prediction in a process called aggregation. About
33% of the original data are not included in a given bootstrap sample, and
this portion, called out-of-bag data, is used to internally check the
RF-based model accuracy.[62] The RF is better
than a single decision tree because it limits overfitting
without increasing the error margin.[63]
After the optimization process, in which the tuning parameters
were adjusted, the optimum parameters were obtained, as listed in Table 4. The max_features option determines
the number of features to be considered when searching for the best
split (e.g., if it is “sqrt,” then max_features = sqrt(n_features),
and if it is “log2,”
then max_features = log2(n_features)).
The option n_estimators determines the number of
trees in the forest, while max_depth determines the maximum depth
of each tree. The RF model predicted the pore pressure gradient using
the same input parameters with R values of 0.99 for
training and 0.98 for testing, with an AAPE not exceeding 1.79%. Moreover,
the RMSE did not exceed 0.011 psi/ft for both training and testing. Figure 4 shows the cross-plots
between the predicted and the actual pore pressure gradients, in which
the points significantly coincide with the 45° line.
Table 4
RF-Based Model Optimized Parameters
| parameter | available options | optimum option |
| --- | --- | --- |
| max_features | [“auto”, “sqrt”, “log2”] | sqrt |
| max_depth | [3, 4, 5, ..., 30] | 11 |
| n_estimators | [3, 4, 5, ..., 150] | 100 |
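The bootstrapping, aggregation, and max_features ideas described for the RF model can be illustrated without fitting any trees. The sketch below simulates bootstrap samples to estimate the out-of-bag share, averages a few made-up tree outputs, and evaluates the "sqrt" and "log2" rules for the four inputs used here. The sample size of 3147 follows the title of Table 1; the rounding rule for max_features is an assumption based on common library behavior, and all names are hypothetical.

```python
import math
import random

def out_of_bag_fraction(n_samples, rng):
    """Draw one bootstrap sample and return the share of rows it never picked."""
    drawn = {rng.randrange(n_samples) for _ in range(n_samples)}
    return 1.0 - len(drawn) / n_samples

def aggregate(tree_predictions):
    """Aggregation step for regression: average the individual tree outputs."""
    return sum(tree_predictions) / len(tree_predictions)

rng = random.Random(0)
n_samples, n_trees = 3147, 100
mean_oob = sum(out_of_bag_fraction(n_samples, rng) for _ in range(n_trees)) / n_trees
limit = (1.0 - 1.0 / n_samples) ** n_samples   # chance that a given row is never drawn

# Features tried per split for the four inputs (pump rate, SPP, ROP, RS).
n_features = 4
sqrt_features = max(1, int(math.sqrt(n_features)))   # assumed rounding rule
log2_features = max(1, int(math.log2(n_features)))

forest_output = aggregate([0.46, 0.48, 0.50])        # made-up tree predictions
print(round(mean_oob, 3), round(limit, 3), sqrt_features, log2_features)
```

For large data sets the out-of-bag share tends to 1/e, a little over one third, which is the order of magnitude quoted above. With only four inputs, "sqrt" and "log2" both give two features per split, so the choice between them would not be expected to change the model.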
Figure 4
Cross-plots
between the predicted and actual pressure gradient
results: (a) training and (b) testing (RF model).
The optimum
results of the three models were compared, as listed
in Table 5, which shows
that the RF model is the most appropriate technique for predicting
the pore pressure gradient, with the highest R value
and the lowest error for training and testing. The SVM model came
in second place, followed by the FN model as the least accurate
model. The histogram of errors for the RF model (the lowest errors)
was constructed, as shown in Figure 5. Moreover, the RMSE was calculated for the testing
data set for the RF model after dividing it into three categories
(subnormal, normal, and supernormal). The RMSE was 0.016, 0.02, and
0.015 for subnormal, normal, and supernormal, respectively. The
analysis performed in each category suggests
that the RMSE is a more informative evaluation metric
than R for checking the models’ performance in
this problem. To illustrate this, a small sample was taken from the supernormal
pressure data set, and the RMSE and R were calculated for
this sample. Then, the results were compared to the overall RMSE and R of the full supernormal pressure data set. The sample
had an RMSE of 0.001 and an R value of around 0.7,
while the full supernormal pressure data set had an RMSE of 0.015
and an R value of about 0.94. Thus,
both gave small RMSE values but clearly different R values.
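The behavior described above, where R drops on a small sample even though the errors stay small, can be reproduced on synthetic numbers. The sketch below builds a gradient series spanning 0.36 to 0.58 psi/ft with a fixed error of ±0.01 psi/ft and compares the full series with a narrow slice of it. The values are invented and are not the field data; the function names are hypothetical.

```python
import math

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(actual)
    mean_a, mean_p = sum(actual) / n, sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

def rmse(actual, predicted):
    """Root-mean-squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Synthetic gradients from 0.36 to 0.58 psi/ft with an alternating 0.01 psi/ft error.
actual = [0.36 + 0.002 * i for i in range(111)]
predicted = [a + (0.01 if i % 2 == 0 else -0.01) for i, a in enumerate(actual)]

narrow = slice(70, 81)  # a small sample covering only 0.50 to 0.52 psi/ft
r_full, rmse_full = pearson_r(actual, predicted), rmse(actual, predicted)
r_narrow = pearson_r(actual[narrow], predicted[narrow])
rmse_narrow = rmse(actual[narrow], predicted[narrow])
print(round(r_full, 2), round(rmse_full, 3), round(r_narrow, 2), round(rmse_narrow, 3))
```

The error size is identical in both cases, yet R falls sharply on the narrow slice because R compares the errors with the spread of the actual values, and that spread is small when the sample covers a narrow pressure range. RMSE is not affected by the range in this way.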
Table 5
Summary of the Different Model Performance
| technique | R_training | R_testing | AAPE_training (%) | AAPE_testing (%) | RMSE_training | RMSE_testing |
| --- | --- | --- | --- | --- | --- | --- |
| RF | 0.99 | 0.98 | 0.97 | 1.79 | 0.008 | 0.011 |
| SVM | 0.97 | 0.97 | 1.54 | 1.98 | 0.016 | 0.018 |
| FN | 0.97 | 0.96 | 2.10 | 3.07 | 0.019 | 0.021 |
Figure 5
Histograms
of errors for the RF model: (a) training and (b) testing.
Model Application and Validation
The performance of
the proposed three models was validated for predicting
the pressure gradient using a total of 92 unseen data points from
the same field that were not included in building the models. Table 6 shows the statistical
analysis for the selected data set, which is very similar to that of the
training and testing data set. The actual
and predicted pressure gradients from the models were compared.
The models predicted a continuous profile of the pressure gradients
based on the continuous profiles of the available drilling parameters.
For validation, the RF and SVM models estimated the pore pressure gradient with high R values of about 0.99, and the FN model with an R value of about 0.98, between
the actual and predicted output values. Moreover, the
AAPE did not exceed 1.2% for RF, 1.6% for SVM, and 2.6% for FN. The
RF model gave the lowest RMSE of 0.01 psi/ft compared to the SVM and FN
models. Figure 6 shows
the cross-plots between the predicted and the actual pressure gradient
results for the validation data set. Moreover, the measured and predicted
pore pressure gradient values were plotted on the same graph to track
the differences along the selected sections, as shown in Figure 7, showing a high
accuracy of pore pressure gradient prediction.
Table 6
Statistical Analysis of the Data Set Used for Validating the Models
| statistical parameter | depth (ft) | pump rate (gal/min) | SPP (psi) | RS (rpm) | ROP (ft/h) | pressure gradient (psi/ft) |
| --- | --- | --- | --- | --- | --- | --- |
| minimum | 12 598 | 287.28 | 2054.72 | 68.36 | 9.27 | 0.36 |
| maximum | 14 687 | 308.83 | 3112.56 | 146.17 | 53.95 | 0.57 |
| mean | 13 698 | 299.85 | 2598.76 | 123.29 | 27.98 | 0.48 |
| standard deviation | 775.65 | 4.51 | 379.89 | 19.41 | 9.28 | 0.08 |
| skewness | 0.03 | –0.91 | 0.18 | –1.33 | 0.62 | –0.18 |
| kurtosis | 1.19 | 3.99 | 1.23 | 3.98 | 3.51 | 1.25 |
Figure 6
Cross-plots between the
predicted and the actual pressure gradient
results for 92 unseen data points: (a) RF model, (b) SVM model, and
(c) FN model.
Figure 7
Pore pressure gradient profile for 92 unseen
data points: (a) RF
model, (b) SVM model, and (c) FN model.
Conclusion
In this study, a new approach for pore pressure
gradient prediction
while drilling from the available hydraulic and mechanical drilling
parameters using different techniques of AI was proposed. The
models were built using around 3239 field data points from
several wells. The developed models do not require a pressure trend (such
as normal pressure trend) to estimate the pore pressure. The proposed
models can be included in any automated drilling system to predict
the pressure gradient in real time at a reasonable cost. Additionally,
they can economically replace the PWD tool and can reduce the nonproductive
time by predicting some drilling problems and avoiding
them before they occur. These models can technically and economically
enhance drilling operations, both while drilling and in predrilling
design, by supporting the right actions and avoiding possible problems
such as kicks, blowouts, and loss of circulation. The outcomes of
this study can be summarized as follows:
The optimum SVM model used the Gaussian kernel function with a C value of 10 and kernel option of 30.
The optimum FN model used either FNFBM or FNBFM with type 3 (nonlinear) to obtain the best results.
The optimum parameters of the RF model were sqrt as max_features, max_depth of 11, and n_estimators of 100.
The three models predicted the pore pressure gradient with a correlation coefficient (R) between 0.99 and 0.97 for training and testing.
The AAPE ranged from 0.97% to 3.07% for training and testing between the predicted and the actual pore pressure data. Moreover, the RMSE did not exceed 0.021 psi/ft for all models.
The RF model outperformed the other models with an R of 0.99, RMSE of 0.011 psi/ft, and an AAPE of 0.97%. Furthermore, it predicted the pore pressure gradient for the validation stage with a high accuracy (R of 0.99, RMSE of 0.01 psi/ft, and AAPE around 1.19%).
The common practice in overbalanced drilling is to have a safety margin over the pore pressure gradient, which, in some cases, may be in the range of 0.1 psi/ft. As a result, gradient prediction with 0.02 psi/ft RMSE is reasonable.