
Modeling the Solubility of Sulfur in Sour Gas Mixtures Using Improved Support Vector Machine Methods.

Yu-Chen Wang¹, Zheng-Shan Luo¹, Yi-Qiong Gao¹, Yu-Lei Kong¹.

Abstract

The study of sulfur solubility is of great significance to the safe development of sulfur-containing gas reservoirs. However, because measurement is difficult, experimental data on sulfur solubility remain limited. For this small-sample, information-poor setting, a weighted least-squares support vector machine (WLSSVM)-based machine learning model suitable for a wide temperature and pressure range is proposed to improve the prediction accuracy of sulfur solubility in sour gas. First, we use the comprehensive gray relational analysis method to extract the important factors affecting sulfur solubility as the model input parameters. Then, we use the whale optimization algorithm (WOA) and gray wolf optimizer (GWO) intelligence algorithms to find the optimal values of the penalty factor and kernel coefficient and substitute them into three common kernel functions. The optimal kernel function is selected, and the final WOA-WLSSVM and GWO-WLSSVM models are established. Finally, four evaluation indicators and an outlier diagnostic method are introduced to test the performance of the proposed models. The empirical results show that the WOA-WLSSVM model has better performance and reliability; the average absolute relative deviation is as low as 3.45%, the determination coefficient (R²) is as high as 0.9987, and the prediction accuracy is much higher than that of other models.

Entities: Chemical

Year: 2021 PMID: 34901650 PMCID: PMC8655918 DOI: 10.1021/acsomega.1c05032

Source DB: PubMed Journal: ACS Omega ISSN: 2470-1343

Introduction

Natural gas combustion emits far less harmful gas than other fossil energy sources, so natural gas plays an important role in supporting low-carbon and green development worldwide. At present, unconventional oil and gas (such as sour gas reservoirs) account for an increasing proportion of the world’s new oil and gas production and reserves. China’s proven geological reserves of sour gas with high H₂S and CO₂ content exceed 5000 × 10⁸ m³, accounting for approximately one-fourth of the total reserves of gas reservoirs in China.[1] The discovery of large high-sulfur gas reservoirs such as the Luojiazhai Gas Field, Puguang Gas Field, Dukouhe Gas Field, Tieshanpo Gas Field, and Yuanba Gas Field provides an important gas source guarantee for the national “West-East Gas Pipeline Project.”[2] Sulfur deposition damage is the main feature that distinguishes high-sulfur gas reservoirs from conventional gas reservoirs, and it is also one of the main factors that affect the economic benefits of high-sulfur gas field development. Since the 1950s, scholars from the United States, Canada, Germany, and other countries have carried out extensive research on sulfur deposition during the exploitation of sulfur-bearing gas fields. They regard sulfur solubility as an important condition for identifying sulfur deposition, so accurate prediction of sulfur solubility in sour gas is very important for the development of sulfur gas fields.[3,4] At present, there are four methods for obtaining the solubility of sulfur in sulfur-containing gas: experimental measurement, equation of state (EOS), empirical models, and machine learning. As early as 1960, Kennedy conducted the first experiment on the solubility of elemental sulfur in single-component gas and multicomponent mixed gas. 
Since 1990 in China, Gu Mingxing, Zeng Ping, Yang Xuefeng, Bian Xiaoqiang, Sun Changyu, Hu Jinghong, and others have also analyzed the solubility of elemental sulfur.[5] Sulfur solubility experiments usually need to be carried out at high temperatures (303.2–433.15 K) and high pressures (6.7–155 MPa), and H₂S is toxic and corrosive, which makes the experiments more difficult. Therefore, experimental data on sulfur solubility are scarce and valuable compared with other solubility data and are an important basis for our subsequent studies. EOS and empirical formulas are not only difficult to calculate but also have certain limitations.[6,7] Machine learning (ML), a relatively young and important branch of artificial intelligence, can now also be used to predict sulfur solubility and is gradually showing excellent performance and practicality.[8] Table 1 compares various ML methods used to predict sulfur solubility. Most previous studies used an artificial neural network (ANN) to make predictions, for example, feedforward neural networks (Mohammadi),[9] the GA-LM-BP hybrid model (Chen),[10] and the cascaded forward neural network (CFNN) hybrid model (Amar M N).[11] Although the ANN is an efficient and long-established ML model, the complexity of the model itself (the increase in layers and parameters) necessitates a large amount of data for training. However, the precipitation of sulfides in sour gas reservoirs is a long-term process, and it is difficult to obtain comprehensive first-hand data. Therefore, compared with the ANN, the support vector machine (SVM), an ML model suitable for small samples and poor information, is better matched to the conditions of sulfur solubility research. To date, SVM has rarely been used to calculate sulfur solubility. 
Bian et al.[12,13] were the first to combine the gray wolf optimizer (GWO) algorithm with a least-squares support vector machine (LSSVM), using 70% of the experimental sulfur solubility data in 184 groups of mixed gases as the training set to train the LSSVM. The model’s average absolute relative deviation (AARD) of 3.5029% and R² of 0.9976 showed excellent predictive performance. In addition, Liu et al.[14] used SVR to predict the thermodynamic properties of pure fluids and their mixtures and also obtained excellent prediction results.

Table 1

Comparison of Several Machine Learning Methods for Sulfur Solubility Prediction

ML model	key features and results
Mohammadi (2008)	A feedforward neural network (FNN) is used for the first time to predict the solubility of sulfur in pure H₂S at high temperatures (316–433 K) and high pressure (60 MPa). The results show that the average relative error between the predicted value and the experimental value is 6.1%.
Chen (2014)	A GA-LM-BP ANN model is proposed, and 74 sets of data are used to train and test the model. The simulation results show that the average absolute relative deviation (AARD) between the training results and measured values is 5.90%, and the AARD for the test results is 5.54%.
Bian (2019)	A GWO-LSSVM hybrid model is used, and five influencing factors are considered. This model shows good performance, with the minimum average absolute relative deviation (AARD = 3.5029%) and the maximum determination coefficient (R² = 0.9976) for all 239 data (for pure H₂S and sour gas).
Amar M. N. (2020)	Three models (CFNN, GEP, and MLP) are established, and it is concluded that for calculating the solubility in pure H₂S and in sour gas mixtures, the cascaded forward neural network (CFNN) prediction model is better than other methods. The overall RMSE values of the CFNN model are 3.8101 and 0.0232, respectively.

In summary, experimental measurement is slow, costly, and hazardous, whereas EOS and empirical models have low universality and require excessive calculation. Among ML methods, ANN-based prediction models have been widely used in research on sulfur solubility and perform well in practice; SVM-based prediction models have received little attention so far and have broad prospects in sulfur solubility prediction.[13] It is important to note that although ML methods allow direct modeling from existing data, the development of other methods remains valuable and should not be displaced by ML. In this work, a comprehensive gray relational analysis (CGRA) method that combines the difference and division methods is first constructed to screen out the main factors affecting sulfur solubility and thereby determine the model input parameters.[15] Then, an SVM-based hybrid machine learning model (WOA&GWO-WLSSVM) is proposed to predict sulfur solubility in sour gas. The input parameters of the model are reservoir temperature, pressure, and the mole fractions of CH₄, H₂S, and CO₂, and the target parameter is sulfur solubility. The model is developed and tested using 245 data sets from the public literature, evaluated by four statistical indicators (average absolute relative deviation (AARD), root-mean-squared error (RMSE), standard deviation (SD), and R²), and compared with the prediction results of three empirical formulas and three ML models. The results show that the AARD and R² of WOA-WLSSVM reach 3.45% and 0.9987, respectively, both better than those of the other models, indicating that the model performs well and predicts more accurately. 
In addition, outlier diagnosis is carried out through the leverage method; only a few data points fall outside the valid range, which indicates that the model passes the statistical test and is valid and reliable. This paper is organized as follows, and the research process is shown in Figure 1. In Section 2, the modeling technique is described in detail. Section 3 describes the data analysis and model training. In Section 4, the prediction results of the model are evaluated through statistical indicators and the leverage method, and a rigorous quantitative evaluation of the performance of the new model is conducted. Section 5 gives the conclusion of this research.

Figure 1

Research process description.

Modeling Techniques

CGRA: An Improvement Based on GRA

When dealing with problems that have complex interrelationships, we often lack complete information and sufficient data. The gray relational analysis (GRA) method does not require a large amount of sample data; it focuses on the degree of relevance between the impact index and the research question.[16] Traditional GRA uses the absolute value of the difference between two data sequences to calculate the correlation degree. It considers only the geometric similarity between data sequences and ignores their numerical proximity.[17] For example, if two curves are parallel, traditional GRA gives a correlation degree of 1 even though the curves do not coincide, so the calculated value does not match the actual situation. Therefore, a CGRA method that combines the difference and division methods is constructed; it uses both distance similarity and shape similarity to describe the degree of relevance, which addresses this shortcoming of traditional GRA. To enhance the generalization ability and robustness of the weighted LSSVM (WLSSVM) model, CGRA is used to extract and analyze the features. The procedure is as follows:

Construction of the feature matrix: Let X0 be the quantity that characterizes the behavior of the system, and let its observed value at sequence number k be x0(k); then X0 = (x0(1), x0(2), ···, x0(m)) is called the characteristic behavior sequence of the system. Let Xi be a system factor, and let its observed value at sequence number k be xi(k); then Xi = (xi(1), xi(2), ···, xi(m)) (i = 1, 2, ···, n) is called the behavior sequence of the system’s related factors. These n + 1 sequences form a characteristic matrix of order m × (n + 1) (eq 1), where m is the dimension of the eigenvector; n is the sample number; the subscript k = 1, 2, ···, m; and i = 1, 2, ···, n (the same is true below).

Calculation of the difference matrix: The difference between each component of the characteristic behavior sequence of the system and the corresponding component of each related-factor behavior sequence is calculated to form a difference matrix (eq 2), where Δx0i(k) represents the difference between the kth eigenvalue of the system feature and the kth eigenvalue of the ith sample in the sequence of related factors. Δx0i(k) is then substituted into eq 3 to give the gray correlation degree of the shape similarity.

Calculation of the quotient matrix: The quotient of each component of the system characteristic behavior sequence and the corresponding component of each related-factor behavior sequence is calculated to form a quotient matrix (eq 4), where Δx0i′(k) represents the quotient of the kth eigenvalue of the system feature and the kth eigenvalue of the ith sample in the behavior sequence of related factors. Δx0i′(k) is then substituted into eq 5 to give the gray correlation degree of the distance similarity.

Calculation of the comprehensive gray correlation degree: Combining eqs 3 and 5 gives the comprehensive gray relational degree (eq 6).
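
The limitation of traditional GRA described above can be illustrated with a short sketch. The code below is not the CGRA of this study; it assumes the classical difference-based gray relational coefficient with a distinguishing coefficient `rho = 0.5`, and the function name and arguments are hypothetical. Any normalization of the sequences is assumed to be done beforehand.

```python
import numpy as np

def gray_relational_grades(reference, factors, rho=0.5):
    """Classical (difference-based) gray relational grade of each factor.

    reference: shape (m,), the characteristic behavior sequence x0(k).
    factors:   shape (n, m), one related-factor sequence xi(k) per row.
    rho:       distinguishing coefficient (0.5 is an assumed, common default).
    """
    x0 = np.asarray(reference, dtype=float)
    xi = np.asarray(factors, dtype=float)
    delta = np.abs(xi - x0)                 # difference matrix, one row per factor
    d_min, d_max = delta.min(), delta.max()
    if d_max == 0.0:                        # every factor coincides with x0
        return np.ones(xi.shape[0])
    coeff = (d_min + rho * d_max) / (delta + rho * d_max)
    return coeff.mean(axis=1)               # average over k gives the grade

x0 = np.array([1.0, 2.0, 3.0, 4.0])
# A single curve parallel to x0 (offset by 5) still receives a grade of 1.
print(gray_relational_grades(x0, [x0 + 5.0]))
```

Because the difference is constant along a parallel curve, the minimum and maximum differences are equal and every coefficient evaluates to 1, which is the behavior that motivates adding a distance-based (quotient) term.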

WLSSVM: An Improvement Based on SVM

SVM is an ML model suitable for small samples and poor information. Because sulfide precipitation in acid gas reservoirs is a long-term, continuous process, comprehensive first-hand data are difficult to obtain, and this setting is well suited to the SVM model. LSSVM is a special extension of SVM: it reduces the computational complexity, but it also reduces the robustness of the model.[18] In 2002, Suykens proposed an improved LSSVM algorithm, WLSSVM. Its core idea is to assign weights to the training errors of LSSVM, which can effectively reduce the impact of noise in the training samples and improve the rate of convergence.[19] WLSSVM starts from the LSSVM optimization problem and weights each error term ξi with a coefficient vi.[20] The optimization problem is described by eq 7, where b is the threshold value; ω is the weight coefficient vector; ϕ(·) is the mapping from the input space to a high-dimensional space; ϑ is the regularization parameter; ξi is the error term; and vi is the weight, which is calculated from the training error of the sample. The Lagrange function is then introduced, where αi (i = 1, 2, ···, N) are the Lagrange multipliers, and the Karush–Kuhn–Tucker conditions are applied. In the feature space, the inner product in the mapped space is simplified by introducing a kernel function. Three main types of kernel functions are considered: the sigmoid kernel, the polynomial kernel, and the radial basis function (RBF) kernel, where σ is the kernel parameter. The optimization problem in eq 7 can then be transformed into a linear system, in which l1×N is the unit row vector of size 1 × N and lN×1 is the unit column vector of size N × 1. Solving this system gives the expression of the WLSSVM model.
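
As a rough illustration of the weighting idea, the sketch below fits a WLSSVM regressor in two passes: an unweighted LSSVM solve, followed by a solve in which each error is reweighted. Several details are assumptions not taken from this paper: the RBF convention exp(−‖x − z‖²/(2σ²)), the dual linear system with diagonal term 1/(ϑ·v), and the Hampel-style weighting rule with constants 2.5 and 3 commonly associated with Suykens' weighted LSSVM. The argument `gamma` plays the role of the regularization parameter ϑ, and all function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    # Assumed convention: K(x, z) = exp(-||x - z||^2 / (2 * sigma2)).
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma2))

def solve_lssvm(K, y, gamma, v):
    # Solve [[0, 1^T], [1, K + diag(1 / (gamma * v))]] [b; alpha] = [0; y].
    n = len(y)
    M = np.zeros((n + 1, n + 1))
    M[0, 1:] = 1.0
    M[1:, 0] = 1.0
    M[1:, 1:] = K + np.diag(1.0 / (gamma * v))
    sol = np.linalg.solve(M, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]

def error_weights(e, c1=2.5, c2=3.0):
    # Robust scale estimate from the median absolute deviation of the errors.
    s = 1.483 * np.median(np.abs(e - np.median(e)))
    r = np.abs(e) / max(s, 1e-12)
    v = np.where(r <= c1, 1.0, (c2 - r) / (c2 - c1))
    return np.clip(v, 1e-4, 1.0)            # large errors get a tiny weight

def fit_wlssvm(X, y, gamma, sigma2):
    K = rbf_kernel(X, X, sigma2)
    _, alpha = solve_lssvm(K, y, gamma, np.ones(len(y)))   # unweighted pass
    v = error_weights(alpha / gamma)        # LSSVM errors are e_i = alpha_i / gamma
    return solve_lssvm(K, y, gamma, v)      # weighted pass: returns (b, alpha)

def predict(X_train, b, alpha, X_new, sigma2):
    return rbf_kernel(X_new, X_train, sigma2) @ alpha + b
```

In this form, samples whose first-pass error is large relative to the robust scale contribute almost nothing to the second fit, which is how the weighting reduces the influence of noisy training points.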

Swarm-Based Algorithm

Swarm-based algorithms are emerging intelligent algorithms that have attracted the attention of an increasing number of researchers. They are closely connected with artificial life, especially evolutionary strategies and genetic algorithms. Some classic intelligence algorithms, such as differential evolution, the genetic algorithm (GA), and the ant lion optimizer, are often used to optimize WLSSVM models. Although these classic algorithms improve the classification performance of WLSSVM to some extent, they do not easily escape local extremes, which results in low classification accuracy.[21] Compared with these algorithms, the GWO and WOA adopt a new search mechanism. They have the advantages of simple and fast calculations, fewer parameters, and strong global search capabilities, so they are more likely to avoid local extremes. They have also been used in different ML applications.[22]

GWO Algorithm

The GWO is a new type of swarm-based algorithm derived from the social hierarchy mechanism and hunting behavior of gray wolves in nature.[23] At present, the GWO algorithm has been successfully applied to power systems, UAV path planning, economic dispatch assignment, PI controller optimization, workshop scheduling, and other fields.[24,25] In the GWO, there are four social classes of wolves. The α, β, and δ wolves are the first three classes (α > β > δ) and play an important role in guiding the main search direction, and a large number of ω wolves at the lowest level attack the prey. The algorithm mechanism is shown in Figure 2. The main optimization process can be divided into four stages.[26]

Figure 2

Principles of the GWO.

Encircling prey: The position of each gray wolf in the search space is updated according to the position of the prey. The update equation is as follows:

$$X(t+1) = X_p(t) - A \cdot D \tag{19}$$

where t is the number of iterations, Xp is the position of the prey, X is the position of the gray wolf, and D is the distance between the prey and the gray wolf, which is defined as follows:

$$D = \left| C \cdot X_p(t) - X(t) \right| \tag{20}$$

A and C are vector coefficients, and the calculation formulas are as follows:

$$A = 2a \cdot r_1 - a \tag{21}$$

$$C = 2 r_2 \tag{22}$$

where a linearly decreases from 2 to 0 as the iteration progresses, and r1 and r2 are random numbers in [0,1].

Hunting: According to the information of the α, β, and δ wolves, the positions of the individual gray wolves in the pack are updated. The update formula is as follows:

$$X(t+1) = \frac{X_1 + X_2 + X_3}{3} \tag{23}$$

where X1, X2, and X3 are defined by eqs 24–26:

$$X_1 = X_\alpha - A_1 \cdot D_\alpha \tag{24}$$

$$X_2 = X_\beta - A_2 \cdot D_\beta \tag{25}$$

$$X_3 = X_\delta - A_3 \cdot D_\delta \tag{26}$$

where Xα, Xβ, and Xδ are the three optimal solutions in the tth iteration and Dα, Dβ, and Dδ are defined by eqs 27–29:

$$D_\alpha = \left| C_1 \cdot X_\alpha - X \right| \tag{27}$$

$$D_\beta = \left| C_2 \cdot X_\beta - X \right| \tag{28}$$

$$D_\delta = \left| C_3 \cdot X_\delta - X \right| \tag{29}$$

Attacking prey: Attacking prey is the final stage of the hunting process and is equivalent to strengthening the local search. The wolf ends the attack when the prey stops moving, which is controlled by A and a. A changes with a, and a lies in [0,2] over the whole iteration process. When |A| < 1, the wolf can move to any position between its current position and its prey. When |A| > 1, the wolves look for new spaces to find better prey.
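
The stages above can be put together in a compact sketch. This is a simplified illustration, assuming the standard GWO update rules, rather than the implementation used in this study: the three leaders are re-selected from the current population at each iteration, positions are clipped to the search bounds, and the function name, arguments, and defaults are hypothetical.

```python
import numpy as np

def gwo_minimize(f, lower, upper, n_agents=30, max_iter=200, seed=0):
    """Minimize f over a box with a basic gray wolf optimizer (sketch)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X = rng.uniform(lower, upper, size=(n_agents, lower.size))
    best_x, best_f = None, np.inf
    for t in range(max_iter):
        fitness = np.array([f(x) for x in X])
        order = np.argsort(fitness)
        if fitness[order[0]] < best_f:                # remember the best wolf so far
            best_x, best_f = X[order[0]].copy(), float(fitness[order[0]])
        leaders = X[order[:3]].copy()                 # alpha, beta, delta wolves
        a = 2.0 - 2.0 * t / max_iter                  # decreases linearly from 2 to 0
        X_new = np.zeros_like(X)
        for leader in leaders:
            A = 2.0 * a * rng.random(X.shape) - a     # A = 2a*r1 - a
            C = 2.0 * rng.random(X.shape)             # C = 2*r2
            D = np.abs(C * leader - X)                # distance to this leader
            X_new += (leader - A * D) / 3.0           # average of X1, X2, X3
        X = np.clip(X_new, lower, upper)
    return best_x, best_f

# Toy objective; in a tuning setting f could instead be a validation error
# evaluated as a function of the penalty factor and kernel coefficient.
x_best, f_best = gwo_minimize(lambda x: float(np.sum(x ** 2)), [-5, -5], [5, 5])
```

With |A| < 1 late in the run the wolves contract around the leaders (attack), while |A| > 1 early on lets them step past the leaders (search), which mirrors the role of a described above.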

WOA Algorithm

The WOA was also proposed by Professor Mirjalili,[27] slightly later than the GWO, so some influence of the GWO on the WOA can be seen. The main feature of the WOA is the use of random individuals or optimal individuals to simulate the hunting behavior of humpback whales and the use of spirals to simulate their bubble-net attack mechanism.[28,29] The predation process of the whale is summarized as follows:

Encircling prey: When a whale is looking for prey, it first determines the position of the prey and then encircles it. Assuming that the current optimal position is the target prey, the individuals in the group move toward the optimal position. The vector D is the distance between an individual and the optimal whale position. The location is updated by eqs 30 and 31:

$$D = \left| C \cdot X^*(t) - X(t) \right| \tag{30}$$

$$X(t+1) = X^*(t) - A \cdot D \tag{31}$$

where t is the current iteration number, X*(t) is the position of the best whale in generation t, and X(t) is the position of the whale in generation t. The random vectors A and C are defined as follows:

$$A = 2a \cdot r - a \tag{32}$$

$$C = 2r \tag{33}$$

where r is a random vector in [0,1] and a = 2 – 2t/Tmax (Tmax is the maximum number of iterations). When |A| ≤ 1, the whale considers that it has found its prey and can launch a bubble attack.

Bubble-net attacking method: In the WOA, two whale predation methods are established, namely, the shrinking hunting method and the spiral bubble-net attacking method. Shrinking hunting is achieved by reducing the vector a (the size of vector A is in [−a, a]); when the spiral bubble-net attack is launched, the individual whales attack their prey along a spiral path. The position update equation is as follows:

$$X(t+1) = D' \cdot e^{bl} \cdot \cos(2\pi l) + X^*(t) \tag{34}$$

where D′ = |X*(t) – X(t)| represents the distance between the whale and the current optimal position, the constant b defines the shape of the spiral, and l is a random number in [−1,1]. To simulate the attack of whale groups on prey, both shrinking encirclement and spiral paths are used. The WOA sets a probability p, where p is a random number in [0,1]. It is assumed that the probabilities of the whales using the two predation methods are both 0.5, and the iterative mathematical model of the whale position is as follows:

$$X(t+1) = \begin{cases} X^*(t) - A \cdot D, & p < 0.5 \\ D' \cdot e^{bl} \cdot \cos(2\pi l) + X^*(t), & p \ge 0.5 \end{cases} \tag{35}$$

Searching for prey: During the predation process, in addition to updating its position toward the optimal position, the whale also updates its position randomly; this gives the whale a larger search range so that the WOA has a better global search capability. When |A| ≥ 1, the whale conducts a random search for prey, and the mathematical expression at this stage is as follows:

$$D = \left| C \cdot X_{rand} - X(t) \right| \tag{36}$$

$$X(t+1) = X_{rand} - A \cdot D \tag{37}$$

where Xrand is the position vector of a random agent in the population.
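
The three behaviors (encircling, spiral bubble-net attack, and random search) can be combined in a short sketch. It is a simplified illustration under stated assumptions, not this study's implementation: A and C are drawn as scalars per whale so that |A| is a single number, the spiral constant is set to b = 1, the best position is updated as soon as a better one is found, and positions are clipped to the bounds. All names and defaults are hypothetical.

```python
import numpy as np

def woa_minimize(f, lower, upper, n_agents=30, max_iter=200, b=1.0, seed=0):
    """Minimize f over a box with a basic whale optimization algorithm (sketch)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X = rng.uniform(lower, upper, size=(n_agents, lower.size))
    fitness = np.array([f(x) for x in X])
    best_x, best_f = X[fitness.argmin()].copy(), float(fitness.min())
    for t in range(max_iter):
        a = 2.0 - 2.0 * t / max_iter                  # a = 2 - 2t/Tmax
        for i in range(n_agents):
            A = 2.0 * a * rng.random() - a            # scalar draw (assumption)
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                if abs(A) < 1.0:                      # shrinking encirclement
                    target = best_x
                else:                                 # random search for prey
                    target = X[rng.integers(n_agents)].copy()
                X[i] = target - A * np.abs(C * target - X[i])
            else:                                     # spiral bubble-net attack
                l = rng.uniform(-1.0, 1.0)
                D_prime = np.abs(best_x - X[i])
                X[i] = D_prime * np.exp(b * l) * np.cos(2.0 * np.pi * l) + best_x
            X[i] = np.clip(X[i], lower, upper)
            fi = float(f(X[i]))
            if fi < best_f:
                best_x, best_f = X[i].copy(), fi
    return best_x, best_f

x_best, f_best = woa_minimize(lambda x: float(np.sum(x ** 2)), [-5, -5], [5, 5])
```

The probability split at 0.5 follows the text; the choice between encircling and random search depends only on |A|, so exploration dominates early in the run and exploitation dominates once a falls below 1.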

K-Fold Cross-Validation

Before K-fold cross-validation was proposed, hold-out validation was often used, in which the data are split only once and are therefore not fully utilized. When training a model, however, the number of samples is often insufficient. K-fold cross-validation uses the data set efficiently and helps to avoid overfitting and underfitting. The principle of K-fold cross-validation is to divide the entire data sample set into K groups and, in turn, to use K − 1 groups as the training set and the remaining group (the ith group) as the testing set; each time the model is trained, the corresponding score is obtained, and the final average score is used as the model evaluation criterion.[30] The K-fold cross-validation structure is shown in Figure 3.
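
A minimal sketch of this splitting scheme is given below. The shuffling, the seed, and the function names are assumptions made for illustration; `fit` and `score` are hypothetical user-supplied callables.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each group is the testing set once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, folds[i]

def cross_val_score(fit, score, X, y, k=10, seed=0):
    """Average score over the k folds.

    fit(X_train, y_train) returns a model; score(model, X_test, y_test)
    returns a number. Both are supplied by the caller.
    """
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k, seed):
        model = fit(X[train_idx], y[train_idx])
        scores.append(score(model, X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```

Every sample appears in exactly one testing fold, so all of the data contribute to both training and evaluation, which is the advantage over a single hold-out split.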

Figure 3

K-fold cross-validation structure.

Establishment of the WOA&GWO-WLSSVM Model

According to the basic principle of the WLSSVM algorithm, it is important to obtain appropriate parameters (penalty factor ϑ and kernel coefficient σ²) for the WLSSVM model. Therefore, this study uses two intelligence algorithms, the GWO and WOA, to optimize these parameters and improve the regression performance of the model. Figure 4 shows the overall framework of the WOA&GWO-WLSSVM model. The establishment of the model is divided into two major stages:

Figure 4

Overall framework of the WOA&GWO-WLSSVM model.

Training stage: The training sample data are read and normalized; WLSSVM is optimized through the GWO and WOA, and the optimal values of the penalty factor and the kernel coefficient are found. The optimal values are then substituted into three common kernel functions, and the MSE and R² are used as the verification standards to select the kernel function and determine the final prediction model.

Prediction stage: The normalized test set is substituted into the final prediction model, the predicted value is denormalized, and the MSE and R² between the actual and predicted values are calculated. In this iterative loop, the iteration ends when the MSE is smallest and R² is largest (within the maximum number of iterations), and the final sulfur solubility prediction is output.
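
The normalization and denormalization steps can be sketched as follows. The target range [−1, +1] matches the input data form reported for the trained models; the column-wise min-max rule and the function names are assumptions, since the scaling formula is not spelled out in the text.

```python
import numpy as np

def fit_minmax(X, lo=-1.0, hi=1.0):
    """Record column-wise minima and maxima of the training data."""
    X = np.asarray(X, dtype=float)
    return X.min(axis=0), X.max(axis=0), lo, hi

def normalize(X, params):
    """Map each column linearly onto [lo, hi] using the stored training range."""
    x_min, x_max, lo, hi = params
    return lo + (hi - lo) * (np.asarray(X, dtype=float) - x_min) / (x_max - x_min)

def denormalize(Z, params):
    """Invert normalize(), e.g. to return predictions to physical units."""
    x_min, x_max, lo, hi = params
    return x_min + (np.asarray(Z, dtype=float) - lo) * (x_max - x_min) / (hi - lo)
```

The scaling parameters are taken from the training set only and then reused for the test set, so the test data do not influence the transformation; a column with zero range would need separate handling.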

Data Analysis and Model Training

Experimental Data

A total of 245 sets of experimental sulfur solubility data[6,7,31−35] were collected from the literature for the establishment and evaluation of the model. The data set used in the study is shown in Table 2 (temperature (303.2–486 K), pressure (6.7–155 MPa), and H₂S content (1–26.62%)). It is more extensive than the data sets used in previous studies (Bian, Liang Fu, Amar). The Sun data set is used as an independent checking set to evaluate the application performance of the model in actual gas reservoir engineering. The predicted values over the k testing folds are averaged to give the final predicted result for the testing set (k = 10 in the present study).[36]

Table 2

Sulfur Solubility Data Sets Used in the Study

author	temperature (K)	pressure (MPa)
Brunner and Woll (1980)	373.15–433.15	10–60
Brunner (1988)	398–486	6.7–155
Gu (1993)	363.2–383.2	10–50
Sun CY (2003)	303.2–363.2	20–45
Yang XF (2009)	373.15	24–36
Bian XQ (2010)	336.2–396.6	10–55.2
Zhang GD (2014)	373.15–425.65	20–66.52

The training set is used to adjust the parameters ϑ and σ²; the testing set does not participate in training and is used to evaluate the generalization ability of the final model.

Selection of the Model Input Parameters

When determining the input parameters of the model, it is necessary to investigate the main factors affecting the solubility of sulfur in the mixed gas. The CGRA method is used to obtain the gray correlation coefficient values.[37,38] The larger the gray correlation coefficient value of a factor is, the greater its impact on the research objective is.[39] As shown in Figure 5, the most influential factor is H₂S content, followed by CO₂ content, reservoir pressure, temperature, and CH₄ content. The gray correlation coefficient values of N₂ and C₂H₆ content are less than 0.5, so these two factors are eliminated. Therefore, the new model aims to obtain the best regression between sulfur solubility and H₂S content, CO₂ content, reservoir pressure, temperature, and CH₄ content.

Figure 5

CGRA for sulfur solubility in mixed acid gas.

Determination of the Model Details

The accuracy of model prediction is closely related to the choice of the kernel function. Different kernel functions will cause WLSSVM to choose different support vector algorithms.[40−42] The three common kernel functions are each substituted into the model, and the MSE and R² are used as the verification standards. These are given by eqs 38 and 39, and the results are shown in Table 3.

$$\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( y_i^{\exp} - y_i^{\mathrm{cal}} \right)^2 \tag{38}$$

$$R^2 = 1 - \frac{\sum_{i=1}^{N} \left( y_i^{\exp} - y_i^{\mathrm{cal}} \right)^2}{\sum_{i=1}^{N} \left( y_i^{\exp} - y^{\mathrm{ave\,exp}} \right)^2} \tag{39}$$

where N is the number of all experimental sulfur solubility data points and yexp, ycal, and yaveexp represent the experimental value of sulfur solubility, the predicted value, and the average value of the experimental data, respectively.

Table 3

MSE and R² of Different Kernel Functions

kernel function	validation criteria	WOA-WLSSVM	GWO-WLSSVM
K₁(x,x_i)	MSE	0.866	0.741
K₁(x,x_i)	R²	0.714	0.622
K₂(x,x_i)	MSE	0.511	0.642
K₂(x,x_i)	R²	0.318	0.253
K₃(x,x_i)	MSE	0.029	0.044
K₃(x,x_i)	R²	0.945	0.914

Table 3 shows that the MSE and R² corresponding to the RBF kernel function (K₃(x, x_i)) are both the best, and the prediction accuracy is significantly higher than that of the other two kernel functions. This indicates that the approximation characteristics of the RBF kernel suit the sulfur solubility data used in this study, so the RBF kernel function best meets the requirements of sulfur solubility regression prediction here. During the training process, trial-and-error testing is used to determine the parameters of the GWO-WLSSVM and WOA-WLSSVM models. The parameters are listed in Table 4.

Table 4

Parameters of the Trained Model

parameter	GWO-WLSSVM	WOA-WLSSVM
input data form	[−1, +1]	[−1, +1]
input variables	5	5
max iterations	200	200
search agents	30	30
ϑ_Best	0.7833	2.3718
σ²_Best	8.8485	12.9816

Results and Discussion

Quantitative Evaluation

To verify the prediction effect of the model, the following four statistical indicators were selected for quantitative evaluation: the R², the AARD, the RMSE, and the SD. They are calculated using eqs 39–42. The comparison between the predictions for the training set, the testing set, and the checking set and the experimental data is shown in Table 5 and Figure 6. For the training set, both the WOA-WLSSVM model and the GWO-WLSSVM model have a low AARD value and a high R² value. The calculated data points are in good agreement with the experimental data points, which indicates that the two models have a strong fitting ability. For the testing set, the WOA-WLSSVM model gives AARD = 3.68% and R² = 0.9985, and the GWO-WLSSVM model gives AARD = 3.84% and R² = 0.9983; the predictions of the former are slightly more consistent with the experimental values, which indicates that the WOA-WLSSVM model has a better prediction effect. To examine the accuracy of the two models in this study, three widely used data sets[6,32,33] are used to compare the prediction results with the experimental data, as shown in Figures 7–9.
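
For reference, the four indicators can be computed as in the sketch below. The AARD, RMSE, and R² expressions follow their usual definitions; the SD expression (a relative-deviation form with N − 1 in the denominator) is an assumption, because the exact formula is not reproduced in the text, and the function name is hypothetical.

```python
import numpy as np

def evaluation_metrics(y_exp, y_cal):
    """Return AARD (%), RMSE, SD, and R2 for experimental vs. calculated values."""
    y_exp = np.asarray(y_exp, dtype=float)
    y_cal = np.asarray(y_cal, dtype=float)
    resid = y_exp - y_cal
    rel = resid / y_exp                       # relative deviation per point
    n = y_exp.size
    return {
        "AARD_percent": 100.0 * np.mean(np.abs(rel)),
        "RMSE": np.sqrt(np.mean(resid ** 2)),
        "SD": np.sqrt(np.sum(rel ** 2) / (n - 1)),   # assumed form of the SD
        "R2": 1.0 - np.sum(resid ** 2) / np.sum((y_exp - y_exp.mean()) ** 2),
    }

# Three experimental/calculated pairs (g/m^3) used only as a small example.
metrics = evaluation_metrics([0.057, 0.105, 0.083], [0.055, 0.102, 0.082])
```

AARD and SD are relative measures and therefore weight low-solubility points more heavily, whereas RMSE is dominated by the points with the largest absolute solubility.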

Table 5

Statistical Evaluation Results of the Sulfur Solubility Prediction Model (a, b)

data sets	AARD (%)	SD	RMSE	R²
(a) WOA-WLSSVM
training sets	3.35	0.06	0.03	0.9991
testing sets	3.68	0.07	0.04	0.9985
checking sets	3.87	0.09	0.01	0.9896
all sets	3.45	0.07	0.02	0.9987
(b) GWO-WLSSVM
training sets	3.43	0.06	0.03	0.9987
testing sets	3.84	0.08	0.05	0.9983
checking sets	3.89	0.08	0.01	0.9888
all sets	3.47	0.07	0.02	0.9983

Figure 6

(a,b) Results of training and testing.

Figure 7

(a,b) Predicted results compared with the experimental results: Brunner.

Figure 8

Predicted results compared with the experimental results: Bian.

Figure 9

Predicted results compared with the experimental results: Zhang.

To evaluate the application performance of the model in actual gas reservoir engineering, a new data set, that of Sun[35] (whose experimental data are more representative and suitable for most sour gas reservoirs), is used as a checking set for application performance testing, as shown in Table 6. The relative error (RE) indicates that the predicted values of the two new models do not differ greatly from the experimental values. Of the two, WOA-WLSSVM has the lower RE, which indicates that it performs better and can better predict sulfur solubility in acid gas reservoirs.

Table 6

Performance Testing with a New Data Set

gas composition	temperature (K)	pressure (MPa)	experimental value (g/m³)	WOA-WLSSVM calculated value (g/m³)	WOA-WLSSVM RE (%)	GWO-WLSSVM calculated value (g/m³)	GWO-WLSSVM RE (%)
4.95% H₂S, 7.40% CO₂, 87.65% CH₄	303.2	30	0.057	0.055	3.509	0.091	2.247
	303.2	40	0.105	0.102	2.857	0.123	2.500
	323.2	30	0.083	0.082	1.205	0.111	5.932
	323.2	40	0.128	0.121	5.469	0.153	1.325
	343.2	35	0.152	0.145	4.605	0.165	5.096
	343.2	40	0.175	0.182	4.000	0.203	3.571
	363.2	40	0.220	0.221	0.455	0.284	2.899
	363.2	45	0.284	0.283	0.352	0.355	0.281
9.93% H₂S, 7.16% CO₂, 82.91% CH₄	303.2	30	0.089	0.087	2.247	0.091	2.247
	303.2	40	0.120	0.123	2.500	0.123	2.500
	323.2	30	0.118	0.115	2.542	0.111	5.932
	323.2	40	0.151	0.148	1.987	0.153	1.325
	343.2	35	0.157	0.160	1.911	0.165	5.096
	343.2	40	0.196	0.195	0.510	0.203	3.571
	363.2	40	0.276	0.272	1.449	0.284	2.899
	363.3	45	0.356	0.359	0.843	0.355	0.281
14.98% H₂S, 7.31% CO₂, 77.71% CH₄	303.2	30	0.118	0.123	4.237	0.122	3.390
	303.2	40	0.139	0.138	0.719	0.142	2.158
	323.2	30	0.142	0.143	0.704	0.145	2.113
	323.2	40	0.190	0.187	1.579	0.188	1.053
	343.2	35	0.231	0.235	1.732	0.227	1.732
	343.2	40	0.287	0.261	9.059	0.268	6.620
	363.2	40	0.497	0.523	5.231	0.484	2.616
	363.2	45	0.666	0.681	2.252	0.671	0.751
17.71% H₂S, 6.81% CO₂, 75.48% CH₄	303.2	20	0.012	0.014	16.667	0.013	8.333
	303.2	30	0.133	0.112	15.789	0.113	15.038
	303.2	40	0.162	0.172	6.173	0.157	3.086
	323.2	30	0.148	0.144	2.703	0.144	2.703
	323.2	40	0.244	0.239	2.049	0.249	2.049
	343.2	35	0.267	0.271	1.498	0.271	1.498
	343.2	40	0.351	0.345	1.709	0.345	1.709
	363.2	40	0.618	0.633	2.427	0.623	0.809
	363.2	45	0.814	0.812	0.246	0.832	2.211
26.62% H₂S, 7.00% CO₂, 66.38% CH₄	303.2	30	0.193	0.202	4.663	0.213	10.363
	303.2	40	0.248	0.271	9.274	0.246	0.806
	323.2	30	0.240	0.235	2.083	0.237	1.250
	323.2	40	0.368	0.375	1.902	0.372	1.087
	343.2	35	0.488	0.451	7.582	0.495	1.434
	343.2	40	0.657	0.761	15.830	0.703	7.002
	363.2	40	1.194	1.231	3.099	1.201	0.586
	363.2	45	1.455	1.475	1.375	1.507	3.574
10.00% H₂S, 0.86% CO₂, 89.14% CH₄	303.2	30	0.081	0.084	3.704	0.085	4.938
	303.2	40	0.113	0.116	2.655	0.102	9.735
	323.2	30	0.117	0.123	5.128	0.125	6.838
	323.2	40	0.124	0.129	4.032	0.119	4.032
	343.2	35	0.152	0.148	2.632	0.148	2.632
	343.2	40	0.180	0.186	3.333	0.179	0.556
	363.2	40	0.225	0.230	2.222	0.242	7.556
	363.2	45	0.317	0.345	8.833	0.312	1.577
10.03% H₂S, 10.39% CO₂, 79.58% CH₄	303.2	30	0.091	0.085	6.593	0.088	3.297
	303.2	40	0.127	0.133	4.724	0.123	3.150
	323.2	30	0.130	0.136	4.615	0.136	4.615
	323.2	40	0.155	0.159	2.581	0.152	1.935
	343.2	35	0.160	0.164	2.500	0.165	3.125
	343.2	40	0.204	0.198	2.941	0.199	2.451
	363.2	40	0.293	0.287	2.048	0.285	2.730
	363.2	45	0.366	0.382	4.372	0.372	1.639

Model Comparison

The accuracy and reliability of the models were further verified using all the data and the four statistical indicators mentioned above. The WOA-WLSSVM and GWO-WLSSVM models were compared with three widely used empirical models (those of Roberts,[43] Guo–Wang,[44] and Hu[45]) and three ML models (those of Chen,[10] Amar,[11] and Bian[12]); the results are shown in Table 7. The statistical indicators of the empirical models are generally inferior to those of the ML methods. Among the ML methods, WOA-WLSSVM obtains the best AARD, RMSE, and R²: its AARD is 0.70, 0.06, 0.05, and 0.02 percentage points lower than that of the Chen model, Amar model, Bian model, and GWO-WLSSVM, respectively; its RMSE is 0.011, 0.002, 0.003, and 0.001 lower than that of the Chen model, Amar model, Bian model, and GWO-WLSSVM, respectively; and its R² reaches 0.9987, which is higher than that of the other models, indicating a higher degree of fit and a better prediction effect. Its SD (0.07) equals that of the other ML models except the Chen model (0.06), from which it differs by only 0.01.

Table 7

Comparison of the New Model with Other Models

Model	AARD (%)	SD	RMSE	R²
Roberts model	64.36	0.86	0.67	0.6792
Guo–Wang model	12.84	0.15	0.17	0.9833
Hu model	17.32	0.22	0.21	0.9731
Chen model	4.15	0.06	0.032	0.9968
Amar model	3.51	0.07	0.023	0.9981
Bian model	3.50	0.07	0.024	0.9976
WOA-WLSSVM	3.45	0.07	0.021	0.9987
GWO-WLSSVM	3.47	0.07	0.022	0.9983
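
As a minimal sketch of how indicators of this kind can be computed, the Python snippet below evaluates them for the eight points of the 14.98% H₂S mixture listed in the data table above. It assumes that the two columns after pressure hold the experimental and WOA-WLSSVM-predicted solubilities. The `relative_sd` formula is an assumed definition (root of the summed squared relative errors divided by N − 1), because the paper's own definition of SD is not restated in this section; the function names are illustrative only.

```python
import math

# Rows of the 14.98% H2S mixture; column roles are an assumption.
experimental = [0.118, 0.139, 0.142, 0.190, 0.231, 0.287, 0.497, 0.666]
predicted = [0.123, 0.138, 0.143, 0.187, 0.235, 0.261, 0.523, 0.681]


def ard_percent(exp, pred):
    """Absolute relative deviation of each point, in percent."""
    return [100.0 * abs(p - e) / e for e, p in zip(exp, pred)]


def aard_percent(exp, pred):
    """Average absolute relative deviation, in percent."""
    return sum(ard_percent(exp, pred)) / len(exp)


def rmse(exp, pred):
    """Root-mean-square error in the units of the data."""
    return math.sqrt(sum((p - e) ** 2 for e, p in zip(exp, pred)) / len(exp))


def r_squared(exp, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_exp = sum(exp) / len(exp)
    ss_res = sum((e - p) ** 2 for e, p in zip(exp, pred))
    ss_tot = sum((e - mean_exp) ** 2 for e in exp)
    return 1.0 - ss_res / ss_tot


def relative_sd(exp, pred):
    """Assumed SD definition: spread of the relative errors about zero."""
    squared = sum(((e - p) / e) ** 2 for e, p in zip(exp, pred))
    return math.sqrt(squared / (len(exp) - 1))


ards = ard_percent(experimental, predicted)
print("ARD per point (%):", [round(a, 3) for a in ards])
print("AARD (%):", round(aard_percent(experimental, predicted), 3))
print("RMSE:", round(rmse(experimental, predicted), 4))
print("R2:", round(r_squared(experimental, predicted), 4))
print("SD (assumed definition):", round(relative_sd(experimental, predicted), 4))
```

The per-point deviations computed this way can be compared directly with the relative-deviation column of the data table. The values in Table 7 are aggregated over the full data set, so this eight-point subset will not reproduce them.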

The model we propose is based on SVM, and at the macro level the approach used in this work resembles Bian’s approach.[12] Therefore, in this section, we compare 10-fold cross-validation scores, which not only evaluate the prediction effect but also reflect the stability of the model.[30,46] The stability of the model is directly related to its application effect in actual engineering and is also the focus of our attention. The scores of the proposed models and Bian’s model after 10-fold cross-validation are shown in Table 8. The mean score of WOA-WLSSVM is 0.8941, which is 0.0496 higher than that of GWO-WLSSVM, and its SD σ is 0.0192 lower. This indicates that WOA finds the optimal parameters of WLSSVM more precisely and better satisfies the requirement for high accuracy and precision. The mean score of GWO-WLSSVM is 12.80% higher than that of Bian’s model, and its SD is lower, indicating that the improved WLSSVM model outperforms the improved LSSVM model in terms of prediction accuracy and stability. It should be noted that, as Table 7 shows, the model of Bian et al. also performs very well and predicts much better than the empirical models, indicating that the GWO-LSSVM model is equally reliable and applicable. This further illustrates that the improved SVM model is an efficient method for sulfur solubility prediction.

Table 8

10-Fold Cross-Validation Score

number	WOA-WLSSVM	GWO-WLSSVM	Bian model (GWO-LSSVM)
1	0.7991	0.7732	0.6112
2	0.8713	0.7361	0.6301
3	0.8863	0.8890	0.7119
4	0.9211	0.8213	0.7702
5	0.9502	0.8402	0.7322
6	0.8899	0.9031	0.8071
7	0.8818	0.9004	0.8102
8	0.9033	0.8912	0.7969
9	0.9075	0.7919	0.7829
10	0.9306	0.8989	0.8346
mean score	0.8941	0.8445	0.7487
standard deviation (σ)	0.0390	0.0582	0.0729
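
The summary rows of Table 8 can be rechecked from the ten fold scores. The short sketch below does this with Python's standard library. It assumes that σ is the population standard deviation (divisor N, `statistics.pstdev`) and that the 12.80% figure is a relative gain in mean score; both assumptions are inferred from the tabulated numbers rather than stated in the text.

```python
from statistics import mean, pstdev

# Fold scores copied from Table 8.
fold_scores = {
    "WOA-WLSSVM": [0.7991, 0.8713, 0.8863, 0.9211, 0.9502,
                   0.8899, 0.8818, 0.9033, 0.9075, 0.9306],
    "GWO-WLSSVM": [0.7732, 0.7361, 0.8890, 0.8213, 0.8402,
                   0.9031, 0.9004, 0.8912, 0.7919, 0.8989],
    "Bian (GWO-LSSVM)": [0.6112, 0.6301, 0.7119, 0.7702, 0.7322,
                         0.8071, 0.8102, 0.7969, 0.7829, 0.8346],
}

# Mean and population standard deviation for each model.
summary = {name: (mean(s), pstdev(s)) for name, s in fold_scores.items()}

for name, (avg, sigma) in summary.items():
    print(f"{name}: mean = {avg:.4f}, sigma = {sigma:.4f}")

# Relative gain in mean score of GWO-WLSSVM over Bian's model, in percent.
gwo_mean = summary["GWO-WLSSVM"][0]
bian_mean = summary["Bian (GWO-LSSVM)"][0]
relative_gain = 100.0 * (gwo_mean - bian_mean) / bian_mean
print(f"GWO-WLSSVM vs Bian: {relative_gain:.2f}% higher mean score")
```

A lower σ across folds means that the score depends less on which subset of the data is held out, which is the sense in which the text uses "stability".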

Outlier Diagnosis

Outlier diagnosis identifies unreasonable data. Here, the leverage method is used to search the data set for outliers as a reliability analysis, and a Williams plot is drawn to show the relationship between the standardized cross-validation residuals and the hat index (H).[47−49] H is defined as follows:

$$H = X\left(X^{t}X\right)^{-1}X^{t}$$

where X is a two-dimensional matrix composed of n data values (rows) and k input variables (columns) and t denotes the transpose; the hat index of each data point is the corresponding diagonal element of H. In the Williams plot, there is a square area (0 ≤ H ≤ H* and −3 ≤ SR ≤ 3) bounded by the standard residual (SR) and the leverage threshold H* (H* is generally equal to 3(k + 1)/n).[50] If most of the data points fall within this square area, there are few abnormal data points, which also supports the statistical validity of the model. The Williams plots of the two new models after outlier detection are shown in Figure 10. Most of the sulfur solubility data predicted by the two models lie within the valid ranges of [−3, 3] for SR and [0, H*] for H, so both models proposed in this study pass the statistical test. The WOA-WLSSVM model has fewer outliers, so it is more effective and reliable than GWO-WLSSVM.

Figure 10

(a,b) Diagnosis of outliers.
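
The quantities plotted in a Williams plot can be sketched in a few lines of NumPy. The example below is illustrative only: the input matrix and residuals are synthetic placeholders rather than the paper's data, the function name `williams_diagnostics` is hypothetical, and the residuals are standardized simply by their sample standard deviation, which is an assumption and may differ from the cross-validation standardization used in the study.

```python
import numpy as np


def williams_diagnostics(X, residuals):
    """Return leverages, standardized residuals, H*, and an outlier mask."""
    X = np.asarray(X, dtype=float)
    residuals = np.asarray(residuals, dtype=float)
    n, k = X.shape

    # Diagonal of the hat matrix H = X (X^t X)^-1 X^t, one value per point.
    xtx_inv = np.linalg.pinv(X.T @ X)
    leverage = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Leverage threshold H* = 3(k + 1)/n.
    h_star = 3.0 * (k + 1) / n

    # Assumed standardization: divide by the sample standard deviation.
    standardized = residuals / residuals.std(ddof=1)

    # A point lies outside the valid square if either limit is exceeded.
    outside = (leverage > h_star) | (np.abs(standardized) > 3.0)
    return leverage, standardized, h_star, outside


# Synthetic placeholder data: 50 points, 3 input variables.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
residuals_demo = rng.normal(scale=0.01, size=50)

leverage, standardized, h_star, outside = williams_diagnostics(X_demo, residuals_demo)
print("H* =", h_star)
print("points outside the valid square:", int(outside.sum()))
```

Plotting `standardized` against `leverage`, with horizontal lines at ±3 and a vertical line at `h_star`, gives the Williams plot layout described above.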

Conclusions

The main factors affecting the solubility of sulfur in sour gases were screened by an improved CGRA method, and the input variables of the WLSSVM model were determined. As an improvement on the traditional SVM, WLSSVM improves the rate of convergence and saves computational cost. Using the WOA and GWO swarm-based algorithms to find the optimal parameters, the WOA-WLSSVM and GWO-WLSSVM sulfur solubility prediction models (for sour gas) were established. Statistical analysis (AARD, RMSE, SD, and R²) and statistical tests (outlier diagnostics) show that the two models have good accuracy, robustness, generalization, validity, and reliability. The WOA-WLSSVM model is superior to the other predictive models, including empirical models and ML models: its AARD is as low as 3.45%, its R² is as high as 0.9987, and its prediction accuracy exceeds that of the other prediction models compared. This indicates that the improved SVM model is an efficient method for predicting sulfur solubility in sour gas mixtures. The sulfur solubility data set (245 data points) used in this study covers temperatures of 303.2–486 K, pressures of 6.7–155 MPa, and H₂S contents of 1–26.62%, which is a wider range than the data sets in previous studies (Bian, Liang Fu, and Amar). It should be noted that, within this range, the prediction of sulfur solubility using the ML method is better than that using other models. For predictions outside this range, the validity of the model needs to be tested using reliable experimental data.
