
Data-Driven Approaches to Predict Thermal Maturity Indices of Organic Matter Using Artificial Neural Networks.

Zeeshan Tariq1, Mohamed Mahmoud1, Mohamed Abouelresh1, Abdulazeez Abdulraheem1.   

Abstract

Prediction of thermal maturity index parameters in organic shales plays a critical role in defining the hydrocarbon prospect and the proper economic evaluation of a field. Hydrocarbon potential in shales is evaluated using organic indices such as total organic carbon (TOC), thermal maturity temperature, source potentials, and the hydrogen and oxygen indices. Direct measurement of these parameters in the laboratory is the most accurate way to obtain representative values but is, at the same time, very expensive. In the absence of such facilities, other approaches such as analytical solutions and empirical correlations are used to estimate the organic indices in shale. The objective of this study is to develop data-driven, machine learning-based models to predict continuous profiles of geochemical logs of an organic shale formation. The machine learning models are trained using petrophysical wireline logs as inputs and the corresponding laboratory-measured core data as targets for Barnett Shale formations. More than 400 log data points and the corresponding core data were collected for this purpose. The petrophysical wireline logs are the γ-ray, bulk density, neutron porosity, sonic transit time, spontaneous potential, and shallow resistivity logs. The corresponding core data include the experimental results from Rock-Eval pyrolysis and Leco TOC measurements. A backpropagation artificial neural network coupled with a particle swarm optimization (PSO) algorithm was used in this work. In addition to the development of optimized PSO-ANN models, explicit empirical correlations were extracted from the fine-tuned weights and biases of the optimized models. The proposed models work with high accuracy within the range of the data set on which they were trained.
The proposed models can give real-time quantification of the organic matter maturity that can be linked with the real-time drilling operations and help identify the hotspots of mature organic matter in the drilled section.


Year:  2020        PMID: 33564733      PMCID: PMC7864083          DOI: 10.1021/acsomega.0c03751

Source DB:  PubMed          Journal:  ACS Omega        ISSN: 2470-1343


Introduction

The depletion of conventional oil and gas resources will lead to a shortfall in the supply of the world's energy needs.[1] Therefore, unconventional resources, in particular shale gas, are gaining popularity in the recent era of oil and gas.[2−4] Organic-rich shale is one of the most vital sources of unconventional oil and gas. Appropriate geochemical characterization of shale resources plays a critical role in defining the prospect and developing the economic model of a field. For example, production from the Barnett Shale is controlled mainly by the thermal maturity, total organic carbon (TOC), and thickness of the shale target.[5] Geochemical analysis aims to evaluate the organic richness, thermal maturity, and hydrocarbon potential of organic-rich shale.[6] The wettability and pore structure of organic-rich shale are also affected by geochemical parameters.[7] Accurate estimation of the organic richness and thermal maturity of shale helps reduce the risk carried by petroleum well drilling. Mineral heterogeneity, complex lithology, and natural fractures in shales pose great challenges to the accurate estimation of organic content.[8,9] Core measurements and well logs are the main ways to obtain geochemical parameters. Direct measurement of organic richness in the laboratory on core samples is the most accurate way to obtain thermal maturity parameters. However, retrieving core samples from each well of every field and carrying out laboratory experiments on them is time-consuming and costly; consequently, core-based geochemical data are very scarce. On the other hand, well log data are a prime component of all well drilling plans and hence are readily available. Limited core sample data and the associated well logs are used to develop correlations that can then be applied to the whole well.
Therefore, empirical correlations and machine learning-based models are used to obtain these parameters indirectly from well logs (need reference). The accuracy and applicability of these models depend on the data set and the region from which it was collected (need reference). Machine learning (ML) and artificial intelligence (AI) are captivating fields that integrate computational power with human intelligence to produce smart and reliable solutions to extremely nonlinear and highly complicated problems.[10] In the past two decades, engineering journals have reported numerous articles utilizing AI and ML for regression, function approximation, and classification problems.[11−13] With the advent of soft computing techniques, several correlations utilizing techniques from the field of AI have come to the fore, especially in reservoir characterization,[14−16] reservoir engineering,[17−20] and reservoir geomechanics.[21,22] In petroleum geochemistry, such correlations can be seen in the work of Rahaman.[23] In recent years, support vector machine (SVM) and least-squares support vector machine techniques have been actively used to predict total organic carbon (TOC),[24,25] and researchers have used artificial neural networks to predict TOC and thermal maturity as well.[26−28] The literature survey shows that most attention has been given to predicting only the TOC content of organic shales. However, organic matter geochemical analysis of shale gas formations requires the estimation of a suite of parameters such as Tmax, S1, S2, S3, and TOC. Therefore, the objective of this study is to explore the potential of machine learning in predicting these five geochemical parameters (Tmax, S1, S2, S3, and TOC) for the Barnett Shale. This study utilized both machine learning and evolutionary algorithms to arrive at the optimum model.
In addition, five explicit empirical correlations derived from the PSO-ANN algorithm are proposed; these correlations do not require any ML-based software for their execution.

Background

Geochemical properties of shale such as thermal maturity, source potentials (S1–S3), total organic carbon content (TOC), hydrogen index (HI), and oxygen index (OI) are important parameters for evaluating its production potential.[8,29] S1 and S2 are called the volatile hydrocarbon and the remaining hydrocarbon (or oil) potential, respectively.[30] The HI and OI are calculated from TOC and the source potentials. The maximum temperature (Tmax) is a chemical indicator of thermal maturity: Tmax is the temperature at which S2 attains its maximum hydrocarbon generation.[31] It accounts for the hydrogen and oxygen richness of the shale. TOC, on the other hand, is an important indicator of organic richness in shale plays and is expressed in weight percentage (wt %). Usually, a TOC of less than 0.5 wt % means that no organic matter exists in the shale rock, whereas a TOC greater than 0.5 wt % is a positive sign for the existence of organic matter. Several cross plots, such as Tmax versus hydrogen index (HI), S2 versus TOC, and HI versus OI, are used to evaluate the level of thermal maturity and the kerogen type.[32−34] Table 1 presents the ranges of TOC and Tmax according to the maturity level.
Table 1

TOC and Tmax Range Describing the Level of Maturity

parameter | no hydrocarbon | maturity range | postmaturity range | refs
TOC, wt % | less than 0.5 | 0.5–2 | greater than 2 | (35, 36)
Tmax, °C | less than 435 | 435–465 | greater than 465 | (33, 37)
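The calculation of HI and OI from TOC and the source potentials mentioned above follows the standard Rock-Eval conventions, which the text does not spell out explicitly; the sketch below uses those standard definitions:

```python
def hydrogen_index(s2, toc):
    # HI = 100 * S2 / TOC, in mg HC/g TOC (standard Rock-Eval definition)
    return 100.0 * s2 / toc

def oxygen_index(s3, toc):
    # OI = 100 * S3 / TOC, in mg CO2/g TOC
    return 100.0 * s3 / toc

print(hydrogen_index(4.0, 2.0))  # 200.0
print(oxygen_index(0.3, 2.0))    # ≈ 15.0
```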
There are three methods to determine geochemical parameters of organic-rich shale: direct measurement, the single well log method,[38] and the composite well log method.[39] TOC can be measured directly in the laboratory in different ways, such as filter acidification,[40] nonfilter acidification,[40] total minus coulometric,[41] Rock-Eval,[42] laser-induced pyrolysis,[43] and diffuse reflectance infrared Fourier transform spectroscopy (DRIFTS).[34] Indirect methods involve the utilization of petrophysical well logs and seismic data. A large number of models are reported in the literature for the prediction of geochemical parameters using composite well logs.[6,24−27,44−53] Schmoker[38] established the first correlation to predict TOC, for the Devonian shale formation; it is expressed in eq 1 and gives results in volume percentage. Schmoker[54] later modified his correlation for the Bakken formation, as given by eq 2, where ρ is the organic matter density in g/cm3, ρmi is the average density of grain and pore fluid in g/cm3, and Rρ is the ratio of the organic matter to organic carbon in weight percentage. Passey et al.[55] suggested an easy-to-use model for TOC prediction, summarized in eqs 3 and 4; this model is currently widely used for evaluating unconventional resource reserves. Here, Rbaseline and R are the base formation and evaluated formation resistivities in Ω·m, respectively, Δ log R represents the log separation, Δtbaseline and Δt are the base formation and evaluated formation sonic transit times, both in μs/ft, and LOM represents the formation level of maturity. Sultan[28] used a self-adaptive differential evolution algorithm to optimize artificial neural networks (ANN) and presented an empirical correlation to predict the TOC of the Barnett shale; his correlation for TOC prediction is given by eqs 5–8. Table 2 presents some of the recent research works related to the prediction of organic matter in shale using machine learning and nonlinear regression approaches for the relevant geological fields.
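The Passey et al. Δ log R model described above is well documented in the literature; the equations themselves are not reproduced in this extract, so the sketch below uses the standard published form of the method:

```python
import math

def delta_log_r(rt, rt_baseline, dt, dt_baseline):
    # Passey et al. (1990) curve-separation term:
    # dlogR = log10(R / R_baseline) + 0.02 * (dt - dt_baseline)
    return math.log10(rt / rt_baseline) + 0.02 * (dt - dt_baseline)

def toc_passey(dlog_r, lom):
    # TOC (wt %) = dlogR * 10^(2.297 - 0.1688 * LOM)
    return dlog_r * 10 ** (2.297 - 0.1688 * lom)

# Hypothetical readings for illustration (not from the paper's data set)
dlr = delta_log_r(rt=20.0, rt_baseline=5.0, dt=80.0, dt_baseline=70.0)
print(round(toc_passey(dlr, lom=10.0), 3))  # ≈ 3.26 wt %
```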
Table 2

Summary of the Research Related to the Prediction of Organic Matter in Shale

refs | study conducted | technique | method type | input parameters^a | geological field study
Tan et al.[24] | prediction of TOC | artificial intelligence | epsilon-SVR, nu-SVR, SMO-SVR, and RBF | CNL, GR, AC, K, TH, U, PE, RHOB, and RT | Huangping syncline, China
Rui et al.[25] | prediction of TOC | artificial intelligence | SVM | wireline log data such as RHOB, GR, SP, RT, and DT | Beibu Gulf basin
Lawal et al.[27] | prediction of TOC | artificial intelligence | ANN | XRD data: SiO2, Al2O3, MgO, and CaO | Devonian Shale
Sultan[28] | prediction of TOC | artificial intelligence | self-adaptive differential evolution-based ANN | well logs: GR, DT, RT, and RHOB | Devonian Shale
Mahmoud[56] | prediction of TOC | artificial intelligence | ANN | well logs: GR, DT, RT, and RHOB | Devonian Shale
Zhao et al.[50] | prediction of TOC | regression | nonlinear | CNL | Ordos Basin in China and Bakken Shale of North Dakota
Wang et al.[53] | prediction of TOC | regression | nonlinear | DT and RT | Sichuan Basin, Southern China
Alizadeh et al.[46] | prediction of TOC and S2 | artificial intelligence | ANN | DT and RT | Dezful Embayment, Iran
Handhal et al.[45] | prediction of TOC | artificial intelligence | SVR, ANN, KNN, random forest, and rotation forest | GR, RHOB, NPHI, RILD, and DT | Rumaila Oil Field, Iraq
Wang et al.[48] | TOC, S1, and S2 | artificial intelligence | ANN | RHOB, NPHI, RT, and DT | Bohai Bay Basin, China

^a GR = γ-ray, RHOB = bulk density, LLD = deep lateral log, LLS = shallow lateral log, MSFL = microspherical focused log, RILD = deep induction resistivity log, DT = compressional wave travel time, TH = thorium, U = uranium, K = potassium, RT = resistivity log, NPHI = neutron porosity, SP = spontaneous potential, CNL = compensated neutron log, PE = photoelectric index, SiO2 = silicon dioxide, Al2O3 = aluminium oxide, MgO = magnesium oxide, and CaO = calcium oxide.


Results and Discussion

This section presents the combined results for the prediction of the five thermal maturity parameters: Tmax, S1, S2, S3, and TOC. MATLAB version 2020a was used to train the models. An ANN is a stochastic learning technique that can generate nonunique results at every run; therefore, a seed number was assigned during each model run to obtain a unique result. A multiobjective function, defined using eq , was designed to obtain the most accurate results during both training and testing, and the results were further improved by coupling PSO with the ANN. In the objective function, MAEtraining^−1 is the inverse of the mean absolute error (MAE) during training, MAEtesting^−1 is the inverse of the MAE during testing, R2training is the R2 obtained during training, and R2testing is the R2 obtained during testing. The inverse of MAE was taken so that both MAE and R2 advance in the same direction, maximizing the objective function. The step-by-step pseudocode for the proposed PSO-ANN algorithm for thermal maturity parameter prediction is given in Table 3. Figure 1 shows the workflow chart of the proposed PSO-ANN algorithm to predict thermal maturity parameters.
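The paper's exact objective-function expression is not reproduced in this extract; one plausible equal-weight form combining the four terms described above is:

```python
def objective(mae_train, mae_test, r2_train, r2_test):
    # Combines inverse MAE and R^2 so that all four terms improve in the
    # same direction; the equal weighting here is an assumption, since the
    # paper's exact expression is not shown.
    return 1.0 / mae_train + 1.0 / mae_test + r2_train + r2_test

# 1/0.5 + 1/0.8 + 0.92 + 0.90 ≈ 5.07
print(objective(0.5, 0.8, 0.92, 0.90))
```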
Table 3

Step-by-Step Pseudocode for the Proposed PSO-ANN Algorithm for Thermal Maturity Parameter Prediction

step | working
1 | start
2 | set input variables
3 | initialize parameters of ANN such as learning rate, activation functions, etc.
4 | vary the number of hidden layers (sensitivity of hidden layers, 1–3)
5 | vary the number of neurons in the hidden layer (sensitivity of neurons, 5–30)
6 | select the learning algorithm of ANN
7 | select the learning rate [0, 1] for the selected learning algorithm
8 | train and test the ANN model
9 | evaluate the objective function for a minimum convergence value
10 | extract weights and biases from the trained model
11 | initialize parameters of PSO algorithm such as the number of iterations, population of particles, cognitive and social accelerations, and initial and final inertia weights
12 | set the range of the sample search space for each extracted weight and bias
13 | feed the extracted weights and biases into the PSO algorithm as the initial population
14 | evaluate the objective function for a minimum convergence value
15 | run the iterative process until the stopping criterion^a is achieved
16 | pick the global best solution
17 | set the optimum weights and biases from the globally best model in the network for the prediction of thermal maturity parameters
18 | end

^a Stopping criterion: a maximum number of iterations is attained or a maximum level of inactivity is reached.
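The pseudocode above can be sketched end to end. The following is a simplified NumPy illustration on synthetic data with assumed hyperparameters, not the authors' MATLAB implementation: a small 6-10-1 network (tansig hidden layer, linear output) whose flattened weights are refined by a particle swarm seeded around an initial weight vector.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, echoing the paper's seeded runs

N_IN, N_HID = 6, 10                     # six log inputs, ten hidden neurons
DIM = N_HID * N_IN + N_HID + N_HID + 1  # w1 + b1 + w2 + b2, flattened

def forward(theta, X):
    """Feed-forward ANN: tansig hidden layer, linear output."""
    w1 = theta[:N_HID * N_IN].reshape(N_HID, N_IN)
    b1 = theta[N_HID * N_IN:N_HID * (N_IN + 1)]
    w2 = theta[N_HID * (N_IN + 1):-1]
    b2 = theta[-1]
    return np.tanh(X @ w1.T + b1) @ w2 + b2

def fitness(theta, X, y):
    # Inverse MAE: larger is better, mirroring the inverse-error idea above
    return 1.0 / (np.mean(np.abs(forward(theta, X) - y)) + 1e-9)

def pso_refine(X, y, theta0, n_particles=20, n_iter=100,
               w=0.7, c1=1.2, c2=1.2):
    """Refine ANN weights with PSO, seeding the swarm around theta0."""
    pos = theta0 + 0.1 * rng.standard_normal((n_particles, DIM))
    pos[0] = theta0                     # keep the starting weights in the swarm
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_fit = np.array([fitness(p, X, y) for p in pos])
    gbest = pbest[pbest_fit.argmax()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, 1))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        fit = np.array([fitness(p, X, y) for p in pos])
        better = fit > pbest_fit
        pbest[better], pbest_fit[better] = pos[better], fit[better]
        gbest = pbest[pbest_fit.argmax()].copy()
    return gbest

# Synthetic stand-in for normalized well-log/core data (real data not shown here)
X = rng.standard_normal((200, N_IN))
y = X @ rng.standard_normal(N_IN)        # hypothetical target
theta0 = 0.1 * rng.standard_normal(DIM)  # stands in for the trained ANN weights
theta = pso_refine(X, y, theta0)
print(fitness(theta, X, y) >= fitness(theta0, X, y))  # True: refinement never hurts
```

Because the starting weight vector is included in the swarm, the refined solution is guaranteed to be at least as fit as the initial one, which is the point of seeding PSO with the backpropagation-trained weights.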

Figure 1

Workflow chart of the proposed PSO-ANN algorithm to obtain thermal maturity parameters.

In general, the trained ANN models for the prediction of the thermal maturity of an organic shale comprised six input neurons (the GR, RHOB, DT, RILD, NPHI, and SP logs) and ten middle-layer neurons. Figure 2 shows the sensitivity of the objective function to the number of neurons; a middle layer with ten neurons was chosen because it achieved the maximum value of the objective function. The general topology of the proposed model for the prediction of thermal maturity parameters is given in Figure 3, and the optimum values of the ANN parameters for the different models are listed in Table 4. The total data set for each model was stratified in the proportion of 70 and 30%: the 70% portion was used for training, and the 30% portion was used for testing of the trained model.
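The 70/30 split described above can be sketched as a simple seeded random split (the paper's exact shuffling scheme is not specified, so this is an illustrative assumption):

```python
import numpy as np

def train_test_split(X, y, train_frac=0.7, seed=42):
    """Seeded random 70/30 split of paired log/core data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))      # shuffle sample indices reproducibly
    cut = int(train_frac * len(X))
    tr, te = idx[:cut], idx[cut:]
    return X[tr], X[te], y[tr], y[te]

# Placeholder arrays standing in for 400 samples of six logs
X = np.arange(400 * 6, dtype=float).reshape(400, 6)
y = np.arange(400, dtype=float)
X_tr, X_te, y_tr, y_te = train_test_split(X, y)
print(len(X_tr), len(X_te))  # 280 120
```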
Figure 2

Optimal number of neurons in the middle layer using an objective function.

Figure 3

General topology of the proposed ANN model for the prediction of thermal maturity parameters.

Table 4

Optimum Values for the Proposed ANN Models

parameter of the ANN model | range | Tmax | S1 | S2 | S3 | TOC
number of input parameters | | 6 | 6 | 6 | 6 | 6
middle layer(s) | 1–3 | 1 | 1 | 1 | 1 | 1
neurons in the middle layer | 5–15 | 10 | 10 | 10 | 10 | 10
learning algorithm | quasi-Newton, conjugate gradient, Levenberg–Marquardt (LM), Newton's method, gradient descent (GD), resilient backpropagation (RB), Fletcher–Powell conjugate gradient, one-step secant | LM | RB | RB | LM | GD
rate of learning, α | 0.1–0.5 | 0.15 | 0.2 | 0.10 | 0.16 | 0.18
middle-layer transfer function | tangential sigmoidal (tansig), logarithmic sigmoidal, hyperbolic sigmoidal, linear, rectified linear unit | tansig | tansig | tansig | tansig | tansig
epochs | 100–500 | 150 | 115 | 125 | 180 | 260
outer-layer transfer function | linear | linear | linear | linear | linear | linear
For the Tmax model, a total of 400 data points was obtained. On the training data set, the ANN model predicted Tmax with an R2 of 0.917, an average absolute percentage error (AAPE) of 1.006%, and a root-mean-square error (RMSE) of 0.258, while on the testing data set it predicted Tmax with an R2 of 0.918, an AAPE of 1.137%, and an RMSE of 0.428. The training and testing scatter plots are shown in Figure 4. The learning algorithm utilized was LM with a learning rate of 0.15. With this combination, the optimum model was stored, and its weights and biases were extracted. The mathematical model for Tmax utilizing the optimum weights and biases is given in Appendix A.
Figure 4

Training and testing of the Tmax model.

For the S1 model, a total of 400 data points was obtained. On the training data set, the ANN model predicted S1 with an R2 of 0.919, an AAPE of 10.14%, and an RMSE of 0.003, while on the testing data set it predicted S1 with an R2 of 0.883, an AAPE of 12.232%, and an RMSE of 0.006. The training and testing scatter plots are shown in Figure 5. The learning algorithm utilized was RB with a learning rate of 0.15. With this combination, the optimum model was stored, and its weights and biases were extracted. The mathematical model for S1 utilizing the optimum weights and biases is given in Appendix B.
Figure 5

Training and testing of the S1 model.

For the S2 model, a total of 380 data points was obtained. On the training data set, the ANN model predicted S2 with an R2 of 0.839, an AAPE of 13.8%, and an RMSE of 0.006, while on the testing data set it predicted S2 with an R2 of 0.827, an AAPE of 15.538%, and an RMSE of 0.010. The training and testing scatter plots are shown in Figure 6. The learning algorithm utilized was RB with a learning rate of 0.15. With this combination, the optimum model was stored, and its weights and biases were extracted. The mathematical model for S2 utilizing the optimum weights and biases is given in Appendix C.
Figure 6

Training and testing of the S2 model.

For the S3 model, a total of 450 data points was obtained. On the training data set, the ANN model predicted S3 with an R2 of 0.891, an AAPE of 5.4%, and an RMSE of 0.001, while on the testing data set it predicted S3 with an R2 of 0.868, an AAPE of 6.11%, and an RMSE of 0.001. The training and testing scatter plots are shown in Figure 7. The learning algorithm utilized was LM with a learning rate of 0.15. With this combination, the optimum model was stored, and its weights and biases were extracted. The mathematical model for S3 utilizing the optimum weights and biases is given in Appendix D.
Figure 7

Training and testing of the S3 model.

For the TOC model, a total of 360 data points was obtained. On the training data set, the ANN model predicted TOC with an R2 of 0.825, an AAPE of 7.863%, and an RMSE of 0.029, while on the testing data set it predicted TOC with an R2 of 0.826, an AAPE of 8.713%, and an RMSE of 0.048. The training and testing scatter plots are shown in Figure 8. The learning algorithm utilized was GD with a learning rate of 0.15. With this combination, the optimum model was stored, and its weights and biases were extracted. The mathematical model for TOC utilizing the optimum weights and biases is given in Appendix E.
Figure 8

Training and testing of the TOC model.

To assess the improvement in accuracy obtained with the proposed PSO-ANN-based algorithm, the conventional ANN and the PSO-ANN were compared in terms of the R2 obtained on the overall data sets (training and testing). The bar chart in Figure 9 shows that for all five models (Tmax, S1, S2, S3, and TOC) the R2 values obtained using the PSO-ANN algorithm were much higher than those of the conventional ANN model, indicating that the proposed PSO-ANN-based models are considerably more accurate than models based on the conventional ANN technique.
Figure 9

Comparison of conventional ANN and PSO-ANN in terms of the coefficient of determination on an overall data set.


Conclusions

A good estimation of shale geochemical properties requires a sophisticated approach: small errors in the predicted results waste man-hours and large investments, while a small improvement in estimation practice can improve the worth of an exploration project manyfold. The development of robust and improved models for the prediction of the thermal maturity of organic shale was the focus of this study. To achieve this objective, an ANN coupled with a PSO algorithm was employed. The proposed models were evaluated using various statistical measures such as RMSE, AAPE, MAE, and R2, and a step-by-step comprehensive analysis of the optimum model selection, along with statistical and graphical metrics, was presented. The generalization capability of the proposed models was tested using a blind data set: by correlating the predicted maturity indices with the core-based ones, the proposed models were found to be effective, faster, and more readily available than laboratory analysis, and they are completely reproducible. The proposed PSO-ANN-based models can give reliable predictions in the absence of experimental data and are therefore a good candidate for inclusion in any software package for a complete analysis of geochemical data without carrying out Rock-Eval pyrolysis experiments in the laboratory. The proposed models can provide real-time quantification of organic matter maturity using readily available well logs, helping to identify the hotspots of mature organic matter in the drilled section.

Materials and Methods

Studied Geological Field

The Mississippian Barnett Shale in the Fort Worth Basin, North Texas, is a classic world-class unconventional shale gas reservoir.[57] It consists mainly of silica-rich mudstone interlaminated with clay- and calcareous-rich mudstone and was deposited in a low-energy, relatively deep water environment.[35,58] The subsurface thickness of the Barnett Shale reaches about 1000 ft (304.8 m) in the Newark sub-basin. In the Newark East field, from which the data come, the Barnett Shale is thermally mature, averages 4–5 wt % total organic carbon (TOC), and is trapped between two impermeable limestone beds, which are the most favorable conditions for vertical well completion. However, the variation in thickness, mineral composition, organic richness, and thermal maturity levels across the entire Fort Worth Basin raises the need to reduce the uncertainty of predicted reservoir properties in the rest of the basin and to develop AI models that can be applied to other shale gas reservoirs.

Geochemical Analysis

Pyrolysis is the thermal decomposition of materials at elevated temperatures. The geochemical analysis of organic matter is based on Rock-Eval pyrolysis, which is used to evaluate the thermal maturity and organic richness of the source rock. From the pyrolysis experiment, the quality, quantity, type, thermal maturity, hydrogen index, migration index, production index, and oxygen index of the organic matter can be determined. Typically, five parameters are measured: Tmax, S1, S2, S3, and TOC. S1 accounts for the free hydrocarbon released at 300 °C, measured in mg HC/g rock. S2 accounts for the hydrocarbon released from the cracking of kerogen in the temperature range between 300 and 600 °C, measured in mg HC/g rock. S3 accounts for the carbon dioxide (CO2) released from the breaking of carboxyl groups and other oxygen-containing compounds, measured in mg CO2/g rock. TOC is measured by oxidizing the residue left from the pyrolysis process at a fixed temperature of 600 °C.[6,36] Figure 10 shows a schematic of the different fractions obtained from the total organic matter.
Figure 10

Diagram showing the fractions of the total organic matter.


Data Analytics

The statistical description of the data set used to train the AI models for geochemical parameter prediction is given in Table 5. The ranges of the input parameters for each model are practically reasonable. The complete data set utilized for the training of each model is shown in Figure 11.
Table 5

Ranges of the Data Used for AI Modeling

model | GR, API (min–max) | RHOB, g/cc | NPHI, vol/vol | RILD, Ω·m | Δtc, μs/ft | SP, mV
Tmax (°C) | 18.229–417.06 | 2.37–2.86 | 0.003–0.33 | 4.8–1283.34 | 45.27–93.48 | –154.18 to –28.62
S1 (mg HC/g rock) | 19.120–336.73 | 2.37–2.86 | 0.003–0.33 | 1.3–1027.90 | 45.27–93.39 | –154.18 to –28.62
S2 (mg HC/g rock) | 18.229–372.29 | 2.37–2.86 | 0.003–0.33 | 1.3–1283.34 | 45.27–93.39 | –154.18 to –28.81
S3 (mg CO2/g rock) | 18.229–417.06 | 2.37–2.83 | 0.003–0.29 | 4.8–1088.66 | 45.27–92.08 | –154.18 to –28.81
TOC (wt %) | 55.500–359.76 | 2.20–2.68 | 0.03–0.33 | 6.0–148.87 | 62.05–93.39 | –86.69 to –29.25
Figure 11

Well logs’ input data (AT90 is a RILD log).

The core data corresponding to the conventional wireline well logs were collected. The frequency distributions of the measured Tmax, S1, S2, S3, and TOC from the geochemical analysis are shown in Figure 12. The Tmax data are evenly distributed over a wide range between 420 and 540 °C. S1 is mainly distributed between 0.1 and 1. S2 is mainly distributed between 0.3 and 1.6; about 60% of the cores have an S2 lower than 1, and only a few have values above 1.5. The S3 data are uniformly distributed over a range between 0.1 and 0.3. The TOC data are mainly distributed between 2 and 6, with only a few points above 6. The frequency histograms show that the core data are distributed over a wide range of values and are quite heterogeneous.
Figure 12

Frequency distribution of Tmax, S1, S2, S3, and TOC.


Feature Selection

Feature selection was performed by evaluating the relative importance of each input parameter with respect to the output parameter using the Pearson correlation coefficient (CC) criterion, given by

CC = [kΣxy − (Σx)(Σy)] / √{[kΣx² − (Σx)²][kΣy² − (Σy)²]}

where x and y are the two variables and k is the sample size. The value of CC lies between −1 and +1: values near −1 indicate an inverse relationship between the two variables, values near +1 indicate a direct relationship, and values near zero indicate a weak relationship. Figure 13 shows the CC of the input parameters (the GR, RHOB, NPHI, AT90, Δtc, and SP logs) with the target parameters (Tmax, S1, S2, S3, and TOC).
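The Pearson CC criterion can be computed directly from its definition:

```python
import math

def pearson_cc(x, y):
    """Pearson correlation coefficient for paired samples x, y of size k."""
    k = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    num = k * sxy - sx * sy
    den = math.sqrt((k * sxx - sx * sx) * (k * syy - sy * sy))
    return num / den

print(pearson_cc([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 (perfect direct relationship)
print(pearson_cc([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0 (perfect inverse relationship)
```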
Figure 13

Relative importance of the input parameters such as GR, ρ, NPHI, AT90 (RILD), Δtc, and SP logs with the output parameters such as Tmax, S1, S2, S3, and TOC.


Accuracy Metrics

The models were evaluated based on goodness-of-fit measures: the average absolute percentage error (AAPE), mean absolute error (MAE), root-mean-square error (RMSE), and coefficient of determination (R2). The definitions of these measures are given in Table 6.
Table 6

Statistical Indicators of Model Performance Evaluation^a

goodness-of-fit test | mathematical expression
average absolute percentage error | AAPE = (100/n) Σ |(Ymeasured − Ypredicted)/Ymeasured|
mean absolute error | MAE = (1/n) Σ |Ymeasured − Ypredicted|
root-mean-square error | RMSE = √[(1/n) Σ (Ymeasured − Ypredicted)²]
coefficient of determination | R² = 1 − Σ(Ymeasured − Ypredicted)² / Σ(Ymeasured − Ȳmeasured)²

^a Ymeasured is the measured value of TOC, Ypredicted is the estimated value from the model, and n is the total number of samples.
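The four statistical indicators in Table 6 can be implemented directly from their standard definitions:

```python
import math

def aape(y_meas, y_pred):
    # Average absolute percentage error, %
    return 100.0 / len(y_meas) * sum(
        abs((m - p) / m) for m, p in zip(y_meas, y_pred))

def mae(y_meas, y_pred):
    # Mean absolute error
    return sum(abs(m - p) for m, p in zip(y_meas, y_pred)) / len(y_meas)

def rmse(y_meas, y_pred):
    # Root-mean-square error
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(y_meas, y_pred))
                     / len(y_meas))

def r2(y_meas, y_pred):
    # Coefficient of determination
    mean = sum(y_meas) / len(y_meas)
    ss_res = sum((m - p) ** 2 for m, p in zip(y_meas, y_pred))
    ss_tot = sum((m - mean) ** 2 for m in y_meas)
    return 1.0 - ss_res / ss_tot

# A perfect prediction gives zero errors and R^2 = 1
ym, yp = [1.0, 2.0, 4.0], [1.0, 2.0, 4.0]
print(aape(ym, yp), mae(ym, yp), rmse(ym, yp), r2(ym, yp))  # 0.0 0.0 0.0 1.0
```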


Machine Learning Method

An artificial neural network (ANN) is a machine learning (ML) technique mostly used for function approximation. It is comprised of a series of layers: an input layer, one or more middle layers, and an output layer. The middle layer is also called a hidden layer, and it can be single or multiple, depending on the training data set.[59] The selection of the number of neurons in the middle layers depends on the overall model performance in terms of accuracy.[60] A transfer function exists between the input layer and the middle layer, and another transfer function exists between the middle layer and the output layer. Various choices of transfer function are available, such as linear, sigmoidal, radial basis, and rectified linear unit (ReLU) types. A detailed description of the theory and utilization of ANN can be found in our previous publications.[10,61,62] Particle swarm optimization (PSO) is utilized to optimize the weights and biases of a neural network. In the past, many researchers have found good results by coupling PSO with other AI techniques such as ANN, least-squares support vector machine (LSSVM), and adaptive neuro-fuzzy inference system (ANFIS).[63−67] PSO is an evolutionary algorithm (EA) inspired by the social movement of birds and fish. EA algorithms are based on a stochastic approach that looks for the best possible solution in the search space. The PSO algorithm depends on four parameters: population size, inertia weight, particle velocity, and the cognitive and social acceleration parameters.
A detailed discussion of the PSO algorithm can be found in the publication of Abido.[68] The particle velocity term is given by

v_i^(n+1) = w·v_i^n + c1·r1·(p_i − x_i^n) + c2·r2·(p_gb − x_i^n)

where w is the inertia weight of the particle (0 ≤ w ≤ 1.2), v_i^n is the particle velocity, c1 is the cognitive parameter (0 ≤ c1 ≤ 1.2), c2 is the social parameter (0 ≤ c2 ≤ 1.2), r1 and r2 are random numbers in [0, 1], n is the iteration number, p_i is the local best solution of the ith particle, p_gb is the global best solution, and x_i^n is the position of the ith particle at the nth iteration. The next position of each candidate solution in the search space is created by summing the current particle position and the particle velocity:

x_i^(n+1) = x_i^n + v_i^(n+1)
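A single velocity/position update for one particle, under the standard PSO formulation (with random factors r1 and r2, which the text does not list explicitly), can be sketched as:

```python
import random

def pso_step(pos, vel, pbest, gbest, w=0.8, c1=1.2, c2=1.2):
    """One velocity and position update for a single particle (lists of floats)."""
    new_vel = [
        w * v
        + c1 * random.random() * (pb - x)   # pull toward the particle's best
        + c2 * random.random() * (gb - x)   # pull toward the swarm's best
        for v, x, pb, gb in zip(vel, pos, pbest, gbest)
    ]
    new_pos = [x + v for x, v in zip(pos, new_vel)]
    return new_pos, new_vel

# Illustrative two-dimensional particle with hypothetical best positions
pos, vel = [0.0, 0.0], [0.0, 0.0]
pos, vel = pso_step(pos, vel, pbest=[1.0, 1.0], gbest=[2.0, -1.0])
```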
Table 7

Weights and Biases of the Proposed Model for Tmax Prediction

(Nh = hidden-layer neuron; columns GR–SP = weights between input and hidden layers, w1; w2 = weights between hidden and output layers; b1 = hidden-layer bias; b2 = output-layer bias)

Nh | GR | ρ | NPHI | RILD | ΔtC | SP | w2 | b1 | b2
1 | –1.7941 | 0.2967 | 0.2405 | 3.9757 | –0.6821 | 0.1778 | –1.7717 | 5.1003 | 1.4308
2 | –2.0779 | 0.8055 | 1.2726 | –0.5317 | –0.2028 | –0.3555 | 2.9596 | 1.0329 |
3 | –0.2541 | –1.2323 | 2.3337 | –0.0487 | 4.0613 | 1.0812 | 0.5933 | 6.0397 |
4 | 0.1803 | –1.5660 | –0.7003 | 1.9781 | 0.3196 | 0.4630 | –1.3631 | 2.3029 |
5 | –1.8247 | 0.7878 | 2.1122 | 1.7892 | 0.2356 | 2.8205 | 1.0260 | –2.4688 |
6 | 0.8137 | –1.3633 | –0.8132 | 1.1902 | –0.4400 | 0.0050 | 2.5527 | 1.4510 |
7 | –0.4772 | –3.1851 | –0.8343 | 0.4839 | 0.3739 | –3.1749 | –0.5414 | 0.1539 |
8 | –1.1043 | 3.5500 | –3.5252 | –0.9498 | 2.1798 | 4.9600 | –0.1921 | –3.1789 |
9 | –1.7033 | 0.4957 | –0.6426 | –0.4278 | –1.3891 | 0.1037 | 1.8245 | –4.7481 |
10 | 2.3884 | –2.4291 | 1.0258 | –0.2708 | –4.4849 | 2.0855 | –0.4637 | 1.1179 |
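The explicit correlation implied by these weight tables is a single-hidden-layer feed-forward pass, y = w2·tansig(w1·x + b1) + b2, applied to normalized inputs. The sketch below shows the shapes only, with zero placeholder weights; plug in the ten rows of Table 7 for the actual Tmax model, and note that the input normalization and output denormalization follow the paper's appendices, which are not reproduced here:

```python
import numpy as np

def ann_correlation(x, w1, b1, w2, b2):
    """Explicit ANN correlation: y = w2 . tansig(w1 . x + b1) + b2.
    x must be normalized the same way as the training data (see Table 5)."""
    return float(w2 @ np.tanh(w1 @ x + b1) + b2)

# Placeholder arrays with the right shapes (replace with Table 7 values)
w1 = np.zeros((10, 6))   # weights between input and hidden layer
b1 = np.zeros(10)        # hidden-layer biases
w2 = np.zeros(10)        # weights between hidden and output layer
b2 = 1.4308              # output-layer bias from Table 7
x = np.zeros(6)          # normalized GR, RHOB, NPHI, RILD, DTC, SP
print(ann_correlation(x, w1, b1, w2, b2))  # 1.4308 with all-zero weights
```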
Table 8

Weights and Biases of the Proposed Model for S1 Prediction

(Nh = hidden-layer neuron; columns GR–SP = weights between input and hidden layers, w1; w2 = weights between hidden and output layers; b1 = hidden-layer bias; b2 = output-layer bias)

Nh | GR | ρ | NPHI | RILD | ΔtC | SP | w2 | b1 | b2
1 | –0.0872 | 2.3795 | 0.61 | 0.9346 | –1.0302 | 2.6013 | 2.2032 | 2.7471 | –1.125
2 | 2.9553 | 1.196 | 0.9532 | –0.1497 | 2.0403 | –0.6852 | –1.5668 | 3.5566 |
3 | 0.9799 | –2.0489 | –0.5193 | 1.0599 | 1.1969 | 3.2706 | –2.0522 | –0.5307 |
4 | –0.3019 | 1.5661 | 0.1572 | 0.0954 | 0.0242 | –2.0559 | –3.1289 | 1.1359 |
5 | 1.8638 | –0.8322 | –3.3966 | 0.6946 | 2.8564 | 3.4534 | –1.377 | –3.9418 |
6 | –3.2506 | 0.8794 | 1.5293 | 1.2903 | 0.8682 | 0.4842 | –3.4481 | –0.2281 |
7 | 1.7284 | –1.3916 | 4.3578 | –2.3109 | –5.3719 | 0.012 | –1.0532 | 0.6863 |
8 | –2.3325 | 0.5383 | 1.3951 | –0.8393 | 0.687 | 0.3473 | 3.2165 | –2.2594 |
9 | 0.6438 | 1.1076 | 3.1629 | 0.6633 | 0.8062 | –0.8584 | 1.4581 | 3.789 |
10 | 1.7403 | –1.5715 | 1.5465 | –2.1928 | –0.0194 | –2.249 | –0.7571 | 1.6332 |
Table 9

Weights and Biases of the Proposed Model for S2 Prediction

(Nh = hidden-layer neuron; columns GR–SP = weights between input and hidden layers, w1; w2 = weights between hidden and output layers; b1 = hidden-layer bias; b2 = output-layer bias)

Nh | GR | ρ | NPHI | RILD | ΔtC | SP | w2 | b1 | b2
1 | –3.615 | –3.658 | 5.426 | 1.372 | 2.019 | –5.133 | –0.406 | –2.487 | 0.711
2 | 5.917 | –0.778 | 1.821 | 1.348 | –2.140 | 1.007 | –0.509 | 5.152 |
3 | –2.575 | –0.020 | 1.776 | –0.650 | –1.119 | 1.817 | 5.629 | –0.469 |
4 | –0.339 | 0.970 | –1.707 | –1.300 | –0.551 | –0.881 | 1.030 | –0.459 |
5 | –0.890 | –0.478 | 0.944 | –9.386 | –0.114 | 2.103 | 2.762 | –11.697 |
6 | –1.869 | 0.389 | 0.903 | 1.473 | 0.715 | –0.465 | 2.594 | 1.915 |
7 | 1.196 | –9.027 | –3.565 | 1.744 | –4.834 | –6.791 | –0.331 | 1.166 |
8 | –2.109 | 1.027 | 0.875 | 0.538 | 1.649 | –0.208 | –2.602 | 0.392 |
9 | –1.703 | –0.176 | 1.199 | –0.617 | –1.307 | 1.459 | –6.972 | –0.500 |
10 | 4.054 | –6.402 | –1.309 | –5.677 | –6.265 | 17.736 | 0.621 | –10.018 |
Table 10

Weights and Biases of the Proposed Model for S3 Prediction

(Nh = hidden-layer neuron; columns GR–SP = weights between input and hidden layers, w1; w2 = weights between hidden and output layers; b1 = hidden-layer bias; b2 = output-layer bias)

Nh | GR | ρ | NPHI | RILD | ΔtC | SP | w2 | b1 | b2
1 | –1.366 | –0.203 | 0.630 | –0.508 | –1.215 | –1.586 | –2.372 | 2.103 | –2.081
2 | 0.248 | 0.581 | –0.757 | –4.469 | 0.632 | 1.145 | 3.125 | –3.801 |
3 | –1.266 | 3.104 | 2.077 | –0.856 | –1.393 | –2.028 | 0.958 | 1.611 |
4 | –0.034 | –0.877 | 0.945 | –4.088 | –0.183 | 0.358 | 2.039 | –3.462 |
5 | –0.485 | 0.079 | 2.798 | –0.098 | –2.098 | –2.592 | 1.451 | –1.440 |
6 | –0.131 | 1.104 | –0.111 | 0.917 | –1.104 | –0.280 | 3.982 | 0.190 |
7 | 0.006 | 0.473 | 0.145 | –3.503 | –0.081 | 0.578 | –3.989 | –3.248 |
8 | 0.342 | 0.235 | –0.188 | –0.969 | –1.883 | 0.090 | –2.849 | –1.548 |
9 | 0.459 | –0.080 | 1.096 | –1.149 | –1.765 | 2.086 | 2.723 | 1.431 |
10 | –1.719 | –0.903 | –2.539 | 0.334 | –3.960 | 0.820 | 0.985 | 4.271 |
Table 11

Weights and Biases of the Proposed Model for TOC Prediction

(Nh = hidden-layer neuron; columns GR–SP = weights between input and hidden layers, w1; w2 = weights between hidden and output layers; b1 = hidden-layer bias; b2 = output-layer bias)

Nh | GR | ρ | NPHI | RILD | ΔtC | SP | w2 | b1 | b2
1 | 1.759 | –1.040 | 1.233 | 2.680 | –0.320 | –1.225 | 1.372 | –1.683 | 1.923
2 | 0.875 | 0.288 | 0.309 | –1.399 | –0.701 | 0.682 | 4.431 | –0.756 |
3 | 0.990 | –0.414 | –1.286 | –1.252 | –0.283 | 3.069 | –1.250 | –1.101 |
4 | –1.100 | –1.601 | –1.629 | –3.078 | 0.297 | –0.882 | 1.538 | –3.488 |
5 | 2.962 | –1.130 | 0.781 | –2.217 | –1.349 | 0.861 | –0.787 | 0.840 |
6 | –0.886 | 1.227 | 0.386 | 6.639 | 2.619 | –3.780 | 1.059 | 2.219 |
7 | 0.417 | –1.863 | 0.815 | –0.082 | 1.125 | –1.183 | –1.524 | 2.046 |
8 | –0.847 | –1.005 | 0.479 | 0.991 | 0.410 | 0.273 | –0.556 | –2.068 |
9 | –0.606 | –1.045 | –1.203 | 0.960 | 1.449 | 0.264 | 2.000 | 1.132 |
10 | 2.624 | –2.675 | –0.025 | –0.704 | 0.800 | –1.333 | 1.013 | 3.426 |
