Literature DB >> 35707186

A Clinical Decision Support System (CDSS) for Unbiased Prediction of Caesarean Section Based on Features Extraction and Optimized Classification.

Ashir Javeed¹, Liaqat Ali², Abegaz Mohammed Seid³, Arif Ali⁴, Dilpazir Khan⁴, Yakubu Imrana^5,6.

Abstract

Nowadays, caesarean section (CS) is given preference over vaginal birth and this trend is rapidly rising around the globe, although CS has serious complications such as pregnancy scar, scar dehiscence, and morbidly adherent placenta. Thus, CS should only be performed when it is absolutely necessary for mother and fetus. To avoid unnecessary CS, researchers have developed different machine-learning- (ML-) based clinical decision support systems (CDSS) for CS prediction using electronic health record of the pregnant women. However, previously proposed methods suffer from the problems of poor accuracy and biasedness in ML. To overcome these problems, we have designed a novel CDSS where random oversampling example (ROSE) technique has been used to eliminate the problem of minority classes in the dataset. Furthermore, principal component analysis has been employed for feature extraction from the dataset while, for classification purpose, random forest (RF) model is deployed. We have fine-tuned the hyperparameter of RF using a grid search algorithm for optimal classification performance. Thus, the newly proposed system is named ROSE-PCA-RF and it is trained and tested using an online CS dataset available on the UCI repository. In the first experiment, conventional RF model is trained and tested on the dataset while in the second experiment, the proposed model is tested. The proposed ROSE-PCA-RF model improved the performance of traditional RF by 4.5% with reduced time complexity, while only using two extracted features through the PCA. Moreover, the proposed model has obtained 96.29% accuracy on training data while improving the accuracy of 97.12% on testing data.

Entities: Chemical

Mesh：

Year: 2022 PMID： 35707186 PMCID： PMC9192258 DOI： 10.1155/2022/1901735

Source DB: PubMed Journal: Comput Intell Neurosci

1. Introduction

Medical practitioners recommend a caesarean section (CS) to avoid the complications of vaginal birth or sometimes women do not want to go through delivery pain; thus, they prefer caesarean over vaginal birth [1, 2]. The frequent CS could be fatal for the fetus as well as the mother. Therefore, CS should not be performed unless it becomes an essential likelihood that will improve the outcomes. Hence, most of the medical practitioners believe that CS should be performed only when normal vaginal birth has some complications which might pose a life threat for the mother and the child [3]. Unfortunately, nowadays the use of CS is common among the pregnant women and this trend is rapidly rising worldwide. Even the figure from the world health organizations is also terrifying predicting 29% of the all births eventually take place through CS by 2030 [4]. The CS is a surgical procedure, during which doctors usually slitted the abdomen in a horizontal direction just above the pubic bone that eventually gives a way out for a baby from the womb. Generally, CS is recommended by the doctors after examining the medical maternal condition. Furthermore, CS is not a cheap option for birth, especially for those mothers who belong to developing nations. Additionally, the women who receive medical maternity care should also be aware of the consequence (risk and benefits) of CS and also show her consent to it [3]. The situation becomes even worse: when CS is done unnecessarily, this presents serious health hazards to the mother. Therefore, planning for the CS can be done from the beginning of the pregnancy or in the middle stage of the pregnancy. But the riskiest situation for the decision of CS is when complications arise just before the birth time. Thus, early identification and appropriate management of CS can improve the outcome of birth for mother and fetus [5]. Furthermore, the labor attending team also lacks the tools and standardized operating procedures which eventually assist them to make an informed decision about CS while medical indication is not clear enough [6]. To improve this scenario, the attendants of maternity should have tools that assist them in the selection of an appropriate way of birth. Nowadays, the healthcare sector generates a massive amount of the patient's data including information about the patients age, sex, blood group, and laboratory reports, etc. The patients data are preserved electronically for future use; thus it is referred to as an electronic health record (EHR) [7]. Researcher and scientists are utilizing this EHR for designing and developing various diagnostic systems for early detection of diseases like Parkinson's disease (PD) using hand written images [8-10], PD detection using voice [11-13], hepatitis disease [10, 14], lung cancer detection [15, 16], carcinoma detection [17, 18], heart failure prediction [19, 20], heart disease detection [21-23], and mortality prediction [24, 25]. These diagnostic systems can predict diseases in a timely and cost-effective manner. Therefore, researchers have demonstrated a strong interest in designing and construction of CDSS for the effective prediction of CS so that unnecessary surgery can be avoided that ultimately will help to improve the health of women. From the literature, it is revealed that scientists had designed different CDSS using ML and data mining approaches. Gharehchopogh et al. [26] had studied several risk factors which are associated with childbirth to determine whether the maternal required a CS or not. They had utilized decision tree (DT) in their proposed diagnostic system. Caruana et al. [27] demonstrated from their research work that there is an immense difference between the physician's decision regarding CS to the actual requirements. In their study, they had used RF and naive Bayes (NB) as a classifier and achieved the highest accuracy of 90%. Furthermore, Stylios et al. [28] used fuzzy cognitive maps along with artificial neural networks (ANN) for the prediction of birth through observing different features such as fetus heart signal, uterine contractions, and psychological condition of the subject. Vovsha et al. [29] studied support vector machine (SVM) with radial basis kernel function and logistic regression (LR) for the identification of premature birth. They had used cost function to tackle with the skewness problem in relation to the size of each class in the dataset. Maroufizadeh et al. [30] proposed a system that had obtained the highest accuracy of 70% on 2120 Iranian primiparous women through using LR, RF, and ANN. A mobile application for android system was designed and developed by Iftitah and Rulaningtyas [31] using NB for the prediction of delivery method and their proposed app obtained the accuracy of 90%. Saleem et al. [32] derived robust features from the dataset using multivariate extensions of EMD, and through their proposed method, the accuracy of 90% was recorded. Moreover, Amin and Ali [33] evaluated the performance of different ML models for the prediction of CS during the birth time. They had studied NB, kNN, SVM, RF, and LR machine learning models and obtained the highest accuracy of 90% using kNN. Recently, Rahman et al. [34] had explored various ML models such as kNN, DT, SVM, RF, and GB to classify CS for healthcare operational decision. In their study, they had used several data imbalance techniques to improve the minority classes samples in the dataset. After extensive experimental work, they had obtained the accuracy of 95%. Moreover, Abdillah et al. proposed a novel method using linear discernment analysis (LDA) and SVM for the classification of CS and they had used various evaluation metrics to test the proposed LDA-SVM model. According to experimental results, they had obtained the accuracy of 74.5% using the holdout validation scheme [35]. From the literature review, it is evident that researchers have developed several CDSS for CS prediction based on ML algorithms. However, ML models suffer from various problems such as underfitting and overfitting, where underfitting means ML model show excellent performance on training dataset while, for testing dataset, ML model display poor performance. On the other side, if a ML model performs better on a testing dataset but degrades in performance on a training dataset, this is referred to as overfitting. These issues typically arise, when datasets contained a small number of samples and imbalance classes such that Rahman et al. [34] and Abdillah et al. [35] have used in their proposed models. Additionally, feature preprocessing is a critical component of ML models for attaining superior outcomes, but researchers have given little attention to this facet in their suggested ML models for CS prediction in the past. Motivated by these findings, we have developed a CDSS for CS prediction where the number of samples in the dataset was increased using the ROSE [36] technique to overcome the problem of imbalance classes in the dataset. Moreover, we used scaling method for data normalization and hybridized two ML algorithms to design a predictive model for CS. The first algorithm is used for feature extraction from the dataset that eventually helps to improve the accuracy of the newly proposed system and also reduce its computational time. For feature extraction, the newly proposed system employs principal component analysis (PCA) to reduce the dimensionality of the dataset by generating new features from the dataset which have complimentary information about the prediction of CS. The second algorithm is a classifier, and the newly proposed model for CS prediction employs the RF model for classification. To improve the performance of RF, we have used grid search algorithm for finding the optimal parameters of the RF which eventually boost the classification performance of RF. The two algorithms collaborate to form a hybrid system and we have named it ROSE-PCA-RF model. Figure 1 presents the workflow of the proposed ROSE-PCA-RF model.

Figure 1

Workflow of the proposes ROSE-PCA-RF model.

The prime contributions of this study are summed up as follows: The proposed ROSE-PCA-RF model automatically predicts CS for the pregnant women using EHR. ROSE has been used for the first on the CS dataset to deal with the problem of biasedness in ML model which occur due to minority classes. Unfortunately, previous studies neither discussed this problem nor took any measures to avoid it. The newly developed ROSE-PCA-RF model has improved the performance of the traditional random forest model by 4.5% for CS prediction and also reveals better performance than fifteen recently proposed ML-based models. The proposed ROSE-PCA-RF model has used principal component analysis to reduce the data dimensionality; i.e., the dataset features reduced from six to only two features, which ultimately helps to gain better performance in terms of accuracy as well as lower time complexity. The proposed ROSE-PCA-RF model has demonstrated significantly improved performance in comparison to other state-of-the-art ML models on the CS dataset.

2. Materials and Methods

2.1. Dataset

For the experimental purposes, we have collected a CS dataset from an online UCI machine learning repository [37]. There are two classes in the dataset, where one class eventually needs a CS for the childbirth while the other class suggests a normal method of vaginal birth. The number of variables in the selected dataset is 6 and the total number of samples are 80 pregnant women among which 34 are negative cases or normal vaginal birth case while 46 are the positive cases for the CS. All the females selected in the dataset have age range from 18 to 40 years. The data type of all the variables in this dataset is integers. The dataset consists of five features along with another feature as default for decision making. The dataset is based on the following features such as age, delivery number (DN), delivery time (DT), blood pressure (BP), heart problem (HP), and default. DN is actually the total number of times a subject has been pregnant; for example, a subject is about to deliver a baby for the first time, second time, etc. Another feature of the dataset is the delivery time that basically describes the time on which a subject gives a birth such as on time birth, preterm birth, and postterm birth. Other two variables relevant to the subject personal health condition such as blood pressure and HP. HP has binary value that means if a subject has HP problem, then the value for the HP variable will be 1. Otherwise its value will be 0 in case the subject does not suffer from HP. The last variable in the dataset is the default that has a binary value of 1 or 0. If it is 1, then the subject has to go through CS while in case of 0 value, the subject does not need a CS and in such a situation normal vaginal birth procedure should be adopted. Table 1 provide an overview of all the variables (features) of CS dataset.

Table 1

The description of dataset.

Sr. no	Feature	Data type	(Mean ± std)_Caesarian	(Mean ± std)_normal
1	Age	Integer	28.02 ± 5.58	27.24 ± 4.00
2	Delivery number	Integer	1.76 ± 0.86	1.53 ± 0.65
3	Delivery time	Integer	0.52 ± 0.77	0.79 ± 0.83
4	Blood pressure	Integer	0.98 ± 0.79	1.03 ± 0.57
5	Heart problem	Integer	0.52 ± 0.50	0.18 ± 0.38
6	Caesarean	Integer	1.00 ± 0.00	0.00 ± 0.00

2.2. Proposed Method

The proposed clinical decision support system (CDSS) for CS prediction has two main components that are hybridized and work as one black-box model. The idea behind hybridizing the two components into a block is that they work in connection, where first component of the system is a feature extraction module that extracts the useful features from the dataset and reduce the data dimensionality, while second module of the system works as a predictive model. The extracted features from the first module of the system will be given as input to the predictive module for classification of the CS. Data plays a vital part in the predictive model for acquiring better results; therefore, we have to preprocess the collected dataset. Unfortunately, the collected dataset of CS has imbalance classes and small number of samples which can cause the problem of biasness in the results. It has been studied from the literature that if ML models are trained on imbalance data, then models show biased performance by favoring the majority class while ignoring the minority class [38, 39]. This is due to the fact that minority class samples are trained infrequently during the training process; thus, prediction of the minority class is rare, ignored, and undiscovered [40]. As a result, the test's samples of the minority class are misclassified in comparison to majority class [40, 41]. For binary classification such as CS prediction, a model will display higher rate of sensitivity (if a caesarean subject belongs to majority class, such as in case of CS data) and lower rate of specificity (when vaginal birth subjects are minority class). Such kind of performance clearly depicts the biased nature of a ML model toward majority class. However, this biased behavior of the ML models was totally ignored in the previously studies conducted for CS prediction. To avoid this problem, we have oversampled the training data, while the testing data remain in pristine condition so that the predicted result from the proposed model should be unbiased. In recent past, Zhang and Chen [36] has presented ROSE method to deal with the problem of imbalance classes in a dataset; the proposed system utilized ROSE technique to overcome the problem of imbalance classes in a dataset. Usually, the problem of imbalance class and a small number of samples in a dataset is resolved by randomly increasing the class distribution in the dataset. But the working of ROSE is quite different from the traditional way of oversampling techniques. The ROSE utilizes a receiver operator curve (ROC) to check the efficiency of the classifier rather than accuracy. While in the second step, cross-validation scheme is used to verify the quality of the data that is oversampled. Thus, ROSE can be used as a tool to explore minority class in a dataset and effectively oversample the classes in a dataset. Let us consider a training set R, with Q samples {a, b}, X = 1,…, N class label wϵλ = {b0, b1}. A are the random vector attributes defined on γ, ρ(a) as probability density function. Let ℕbe the number of examples belonging to class B. The ROSE working is described as follows: Choose b = b with probability π. Choose {a, b} ϵ R, such as a=a, with probability 1/N. Sample a from U(·, a), a probability distribution center at a, and covariance matrix D. Through ROSE, the training data have been oversampled to deal with the limitations of a small number of samples in the dataset. Furthermore, ROSE also helps to eliminate the risk of minority classes problem in the dataset which tends to cause biasness problem in the predictive model. Figure 2(a) provides an overview of the original distribution of training data values in the dataset. After oversampling of training dataset, the normalization process is carried out on the dataset. The performance of ML algorithms improves after data standardization. As, CS dataset has only numeric values with different scales; therefore, for data standardization, we have applied Standard-Scaler function on the data. Standard-Scalar function works as rescaling the distribution of the data values such that the mean of observed value is 0 and standard deviation is 1. Figure 2(b) presents the training data distribution after oversampling and scaling.

Figure 2

Overview of data processing. (a) Original training data distribution. (b) Training data distribution after oversampling and scaling.

After preprocessing of the dataset, features will be extracted from the dataset by applying PCA. The main reason for feature extraction is to reduce the dimensionality of the dataset and this can be done by mapping the attributes of higher dimension features to the lower dimension feature space [42]. PCA is an unsupervised learning method, where the data dimensionality is reduced through correlation among multidimensional data group. Furthermore, PCA can simplify the data structure and facilitate easier use of the dataset without parameter limitation and also minimize the information loss. PCA also reduces the calculation cost of the algorithm [43]. The mathematical equations for feature extraction through PCA are given as follows: In (1), every column in the matrix is the feature. Take the average of each feature and then subtract the average from real data values to the new centralized data. Now, calculate the covariance matrix: Solve eigenvalue ν and eigenvector λ of covariance matrix by eigenvalue decomposition method, sort the eigenvalue in descending order, and then select the highest κ from them corresponding to κ eigenvectors used as row vector to build eigenvector matrix α. Now multiply the dataset ℕ∗ℙ by eigenvector of ν dimensional eigenvector and obtain the data matrix θ = αω of the last dimension reduction. Figure 3 provides the overview of PCA working for features extraction from the dataset. It is also evident from Figure 3 that PCA has generated only two new features from the dataset after 2D analysis and newly extracted features are given as input to the RF for the classification of CS.

Figure 3

Workflow of PCA for features extraction.

The formulation of the RF model for CS classification is given as follows: RF algorithm is an ensemble classifier γ(α, β) where β are similarly scattered random vectors of RF and the most evident classes are selected as input α through each tree participation. For input sample γ that is identical to the samples of training dataset, δ samples and σ features were derived from F feature space. ℕ training sets will be obtained, such as υ1, υ2,…., υ by randomly selecting n times. From corresponding training sets, the DT τ1, τ2,... τ will be obtained and each tree in the forest is completely grown apart from prune. For classification purposes, numerous decision trees are included in the construction of RF. In the DT, two hyperparameters are important for classification such as the number of trees in the forest (E) and (D) depth of each tree [44]. In this study, we have employed a grid search algorithm to find out the optimal value of E and D which eventually improve the performance of the RF model. Thus, a RF model is built and new sample data are ingested into the model; then, decision trees will work along with an analysis for finding out the class of new sample. At the end, all the tree in the RF vote for making the classification decision of given sample of data. In RF, trees are assembled by resampling to the training data that is known as bootstrap. For model integration with replacement method, bootstrap is used because it is a simple and reliable technique [45]. From training set, fixed number of samples are extracted using bootstrap random sampling. The number of samples are returned back into the training set after sampling. A new batch of bootstrap is generated from the extracted samples and there is also the probability of collected samples to be sampled again after returning to training set. Previously, collected samples are likely to be sampled after returning back to the training set. For instance, there is a random sample of ∂ sample. The probability of whether it will be collected each time or not is given as 1/∂ and (1 − 1/∂). When random sampling is done for T time, then probability of the given sample is given as (1 − 1/∂)T, where T converge to ∞, and (1 − 1/∂)T converge to 1/e = 0.368. Furthermore, there exists a mirror sample and (1/3) instance will go to the new samples. Out-of-Bag is the data that are ignored during data extraction process. This is known as Out-of-Bag (OOB) error and is mathematically denoted as In (3), 𝕏 shows the error for testing ℕ𝔹 data and ℕ𝔹 stands for OOB. Gini Index helps to build a DT and also describes the impurity level of the model by CART algorithm. When Gini Index value is small, then model has minor impurity. The Gini Index formulation for classification problem with probability of n category is ℧ for N classes, given as follows: Gini Index is also used for feature selection in DT; therefore, the mathematical formula for Gini Index is given as Model overfitting can be resemble by DT; therefore, prepruning and postpruning methods are deployed to reduce the rate of overfitting. Furthermore, prepruning leads to premature growth of the DT while postpruning can achieve better results. Thus, DT has been grown without pruning for better results. After designing and optimization of the RF, the extracted features from the dataset by PCA have been fed into the RF model for the classification of CS. Through holdout validation scheme, 70% data from the dataset have been used for the training of proposed ROSE-PCA-RF model. While 30% data are used for the testing of purpose model. The newly proposed ROSE-PCA-RF model has shown improved performance and achieved the highest accuracy for unbiassed CS prediction by eliminating the problems of model overfitting and imbalance classes in the dataset by deploying ROSE technique.

2.3. Validation Scheme

From the literature, it has been studied that the holdout validation scheme is used as a standard for evaluating the performance of ML-based predictive models for medical applications [14, 22]. In holdout validation scheme, a dataset is divided into two parts where one part is used for the training purpose, while the rest part is used for testing of ML model. Moreover, Souza et al. [6] and Abdillah et al. [35] had used holdout validation schemes to evaluate their prediction models for CS. They had divided the dataset into two parts where 70% of the data had been used for training purpose of the ML model while 30% data are used for testing the predictive model. Thus, for train-test purpose of the proposed model, we have used the above mentioned criteria of data partition in our experiments.

2.4. Evaluation Metrics

There are various evaluation metrics available to measure the performance of ML models such as sensitivity, specificity, accuracy, f-score, confusion metric, Matthews correlation coefficient (MCC), area under the curve (AUC), and the ROC. To evaluate the performance of the newly proposed ROSE-PCA-RF CDSS, we have selected accuracy, sensitivity, specificity, MCC, and F1-score as evaluation metrics, where accuracy is defined as the correctly classified samples in the dataset [46]. The precise classification of the CS samples in the dataset is known as sensitivity. Moreover, specificity describes the accurate classification of the vaginal birth samples. Mathematically the evaluation metrics are given aswhere TP stands for the number of true positive that is CS samples which are correctly classified for caesarean method of the childbirth. FP stands for the number of false positives and it is the number of samples that should receive CS treatment, but they were predicted for the vaginal birth method. TN stands for the true negative that describes the vaginal birth samples which are correctly predicted for the vaginal birth. FN stands for false negative that describes the vaginal birth samples which are predicted for CS treatment. The key element to measure the performance of the predictive model is the statistical analysis. Therefore, we have selected MCC as a statistical analysis of our binary classification problem. MCC helps to determine the test accuracy through values ranging between −1 and 1, where 1 is the precise predictions and −1 denotes inferior predictions.

3. Experimental Results and Discussion

Four different types of experiments were carried out to rigorously assess the performance and efficacy of the newly proposed CDSS method for CS prediction. In the first experiment, we have demonstrated the impact of unbalancing classes in the dataset using seven conventional ML models. In the second experiment, the traditional RF algorithm is fine-tuned using a grid search algorithm and is tested on the balanced CS dataset. While in the third experiment, the dataset is preprocessed through random oversampling, standardization, and normalization. Following that, the newly proposed method ROSE-PCA-RF is tested on process data with balanced classes for the decision of CS. In the fourth experiment, other conventional ML methods are fine-tuned and tested on the same balanced CS dataset for performance comparison. All experiments are carried out on a system powered by an Intel (R) Core (TM) i5-8250U CPU running at 1.60 GHz and running Windows 10 Home 64 bit as the operating system. All of the experiments make use of the Python software package as a software tool.

3.1. Experiment#1: Impact of Imbalance Classes in the Dataset

In this section, we have developed several ML models to demonstrate the impact of imbalanced data for the prediction of CS. From Table 2, it can be depicted that ML models are sensitive to the imbalanced data. The developed ML models are clearly biased in favor of majority class. For instance, it can be observed from Table 2 that we obtained a high rate of sensitivity and low rate of specificity (highlighted in the Table 2) when ML models are trained using imbalanced data. To avoid this biasedness problem in the ML models, we take a step to balance the training data and for this purpose we deployed ROSE method [36] to balance the size of each class in the training data. After balancing the data, it is evident from Table 3 that the performance of ML models is improved, i.e., the ML models does not suffer from the biased performance as can be seen from the values of sensitivity and specificity highlighted in Table 3.

Table 2

Performance of conventional predictive models on imbalanced CS dataset, where Acctrain is accuracy on training data, Acctest accuracy on test data, Sen sensitivity, Spec specificity, and MCC Matthews correlation coefficient.

Table 3

Performance of various state-of-the-art predictive models on balance CS dataset, where Acctrain is accuracy on training data, Acctestis accuracy on test data, Sen is sensitivity, Spec is specificity, and MCC is Matthews correlation coefficient.

3.2. Experiment#2: Conventional RF Model for Caesarean Section Prediction

This experiment was created using the Python programming language to assess the effectiveness of the RF classifier on CS dataset. This experiment has two phases: in the first phase, we deployed a conventional RF without tuning hyperparameters, while in the second phase, we fine-tuned the RF's hyperparameters using a grid search algorithm. Table 3 shows the results of this experiment, in which conventional RF obtained an accuracy of 90.25% on test data, while RF with fine-tuned hyperparameter obtained an accuracy of 92.59% on test data. ROC charts were also used to validate the results of this experiment. Figure 4 depicts the performance of both traditional and fine-tuned RF models.

Figure 4

Performance comparison between conventional and tuned RF. (a) ROC plot of the conventional RF. (b) ROC plot of the fine-tuned RF.

3.3. Experiment#3: The Proposed ROSE-PCA-RF Model for Caesarean Section Prediction

In this experiment, we have assessed the performance of the proposed ROSE-PCA-RF model for CS classification. We have employed the principal component analysis as a feature extraction method from the dataset, which not only improved the accuracy of the random forest but also significantly reduced the time complexity of the proposed ROSE-PCA-RF model by shrinking the feature space, reducing the number of features from six to two. The extracted features are fed into the fine-tuned RF model. For the RF optimization task, we have used a grid search algorithm, which assisted us in determining the optimal depth of the tree (D) and its edges (E). The obtained accuracy along with other performance evaluation metric values is given in Table 4, for the newly proposed ROSE-PCA-RF model. Table 4 shows that the newly proposed ROSE-PCA-RF model has achieved the best accuracy of 97.12% for CS prediction with a max depth hyperparameter D value of 6 and an estimator value E of 25.

Table 4

Classification accuracy of the proposed ROSE-PCA-RF model with optimal hyperparameters of RF on balance dataset, where E is the number of estimators, D is the depth hyperparameter, Acctrain is the accuracy on training data, Acctest is the accuracy on test data, Sen is sensitivity, Spec is specificity, and MCC is Matthews correlation coefficient.

In comparison to the traditional random forest model, the proposed ROSE-PCA-RF model has improved the performance of the random forest classifier by 5.0% for the CS prediction and this can be observed by comparison of Tables 3 and 4. Furthermore, we have evaluated the performance of the proposed ROSE-PCA-RF model using the ROC, where ML model with the highest AUC is considered more efficient and accurate. Figure 5 depicts the ROC plot of the newly proposed ROSE-PCA-RF model for CS prediction that has AUC of 98%. In comparison to the ROC plot from Figure 4(a), the conventional RF model has an AUC of 90% and from Figure 4(b), it can observed that fine-tuned RF model has an AUC of 92%. From Figure 6, it can be depicted that the conventional random forest has took time of 1.4317 seconds for the execution while the newly proposed ROSE-PCA-RF method has completed all the processing tasks within 1.3279 seconds which is lesser as compared to the conventional RF. Hence, the proposed ROSE-PCA-RF model has less time complexity than traditional RF.

Figure 5

ROC plot of the proposed ROSE-PCA-RF model.

Figure 6

Execution time comparison between conventional RF and the proposed ROSE-PCA-RF.

3.4. Experiment#4: Other ML Models for Caesarean Section Prediction

Hereby, we have evaluated the performance of other state-of-the-art ML models such as NB, LR, DT, AdaBoost, and SVM with different kernels functions (Linear, BRF, Poly, and Sigmoid) for CS prediction. The performance of each of the aforementioned conventional ML models can be depicted from Table 3 with their fine-tuned hyperparameter values. The main reason for conducting this comparison is to demonstrate the efficacy of these ML models for CS prediction. Table 3 shows that the DT performed admirably, achieving an accuracy of 92.56%, while the newly proposed ROSE-PCA-RF model achieved an accuracy of 97.12%, which is 4.56% better than the DT's performance. Furthermore, we have plotted the ROC of these cutting-edge ML models to analyze the AUC, as shown in Figure 7.

Figure 7

ROC charts of the state-of-the-art ML model with optimized hyperparameters on balance dataset.

In comparison to the proposed ROSE-PCA-RF model, the DT has shown fairly good performance. Moreover, the DT's ROC plot has shown an AUC of 93%, whereas the proposed ROSE-PCA-RF model achieved the AUC of 98% in ROC plot in Figure 5, which is 5% higher than the DT's AUC. As a result, the proposed ROSE-PCA-RF model's performance refinement is also confirmed by the ROC plot and AUC.

3.5. Comparative Study with Previously Proposed Methods for Caesarean Section Prediction

From the literature, we have studied several ML-based CDSSs which were designed and developed by the researchers for CS prediction. Table 5 provides a brief description of these ML-based methods which were presented in the past. Our proposed ROSE-PCA-RF model has shown significant improved performance in comparison to the recently proposed method of Abdillah et al. [35] and Rahman et al. [34]. Furthermore, Abdillah et al. [35] had used several data partition schemes for training and testing purposes of their proposed LDA-SVM model. But for the holdout validation scheme, their proposed LDA-SVM model obtained an accuracy of 70% on testing dataset while 67.86% on training dataset, while our proposed model obtained the accuracy of 96.26% on training data and 97.12% on testing data which means the newly proposed ROSE-PCA-RF model does not suffer from the problem of overfitting as the model has shown better performance on both training and testing data.

Table 5

Classification accuracies comparison with the previously proposed methods for CS prediction.

Study (year)	Method	Accuracy (%)	Balancing
Verhoeven et al. (2009) [47]	SPSS	76.00	No
Gharehchopogh et al. (2012) [26]	DT C4.5	86.25	No
Vovsha et al. (2014) [29]	LR, SVM	65.00	No
Sodsee (2014) [48]	CPD-NN, kNN	75.00	No
Maroufizadeh et al. (2018) [30]	LR, RF, ANN	70.00	No
Iftitah and Rulaningtyas (2018) [31]	Naive Bayes	90.00	No
Amin and Ali (2018) [33]	WEKA software	95.00	No
Ayyappan 2019 [49]	SMO in PUK kernel	75.00	No
Souza et al. (2019) [6]	LR	88.03	No
Saleem et al. (2019) [32]	AdaBoost	91.80	No
Lee and Ahn (2019) [50]	ANN	91.00	No
Khan et al. (2020) [51]	AdaBoost	88.69	No
Meyer et al. (2020) [52]	XGBoost	85.00	No
Abdillah et al. (2021) [35]	LDA-SVM	70.83	No
Rahman et al. (2021) [34]	SMOTE-RF	93.00	Yes
Rahman et al. (2021) [34]	SMOTE-SVM	94.00	Yes
Proposed method (2022)	ROSE + PCA-RF	97.12	Yes

Additionally, for effective evaluation of the proposed ROSE-PCA-RF model, we have deployed various evaluation metrics such as F1-score, MCC, sensitivity, and specificity. To avoid the overfitting problem of the ML model, sensitivity and specificity values are important. Our proposed model not only achieved better accuracy on testing and training dataset, but also has shown excellent results of sensitivity and specificity, which means ROSE-PCA-RF model does not suffer from the problem of overfitting. Hence, the proposed ROSE-PCA-RF method is more precise as compared to that of Abdillah et al. [35] and Rahman et al. [34] proposed models and the rest of the other fifteen proposed methods for CS prediction in the past. This can be depicted from Table 5.

4. Conclusion

In this study, we have identified the problem of lower accuracy and overfitting in the recently proposed ML models for CS prediction and presented a novel CDSS for making an informed decision for CS. ROSE technique is used on the CS dataset for the first time to overcome the limitation of a small number of samples and imbalance classes in the dataset. The proposed ROSE-PCA-FR systems have two components that are hybridized and work as a single system. The first component of the proposed model works to extract significantly important features from the dataset to achieve better accuracy and for this purpose, we have employed PCA that has reduced the number of features from 6 to only two features. The second component of the proposed model works as a classifier and, for this objective, a RF algorithm has been utilized for the classification of CS. The hyperparameter of the RF model are fine-tuned using a grid search algorithm. From the experimental results, it was observed that the newly proposed ROSE-PCA-RF model has outperformed the traditional random forest model along with the other fifteen recently proposed models for CS prediction by using only two features which are extracted by PCA from the dataset. Moreover, the newly proposed ROSE-PCA-RF model has also demonstrated lower time complexity as compared to the traditional random forest model because the proposed model has used a lesser number of features than the traditional random forest. From experimental results, it can be concluded that the proposed ROSE-PCA-RF method can assist the physicians and labor attending personnel in making informed decisions for CS during a childbirth.

19 in total

1. Machine learning for sub-population assessment: evaluating the C-section rate of different physician practices.

Authors: Rich Caruana; Radu S Niculescu; R Bharat Rao; Cynthia Simms
Journal: Proc AMIA Symp Date: 2002

2. Validation of models that predict Cesarean section after induction of labor.

Authors: C J M Verhoeven; A Oudenaarden; M A A Hermus; M M Porath; S G Oei; B W J Mol
Journal: Ultrasound Obstet Gynecol Date: 2009-09 Impact factor: 7.299

3. Clustering-based undersampling with random over sampling examples and support vector machine for imbalanced classification of breast cancer diagnosis.

Authors: Jue Zhang; Li Chen
Journal: Comput Assist Surg (Abingdon) Date: 2019-08-12 Impact factor: 1.787

4. Automated Detection of Parkinson's Disease Based on Multiple Types of Sustained Phonations Using Linear Discriminant Analysis and Genetically Optimized Neural Network.

Authors: Liaqat Ali; Ce Zhu; Zhonghao Zhang; Yipeng Liu
Journal: IEEE J Transl Eng Health Med Date: 2019-10-07 Impact factor: 3.316

5. Knowledge and attitude of pregnant women to caesarean section in a semi-urban community in northwest Nigeria.

Authors: Ao Ashimi; Tg Amole; Ld Aliyu
Journal: J West Afr Coll Surg Date: 2013-04

6. Artificial Neural Network Analysis of Spontaneous Preterm Labor and Birth and Its Major Determinants.

Authors: Kwang Sig Lee; Ki Hoon Ahn
Journal: J Korean Med Sci Date: 2019-04-29 Impact factor: 2.153

7. A Strategy for Classification of "Vaginal vs. Cesarean Section" Delivery: Bivariate Empirical Mode Decomposition of Cardiotocographic Recordings.

Authors: Saqib Saleem; Syed Saud Naqvi; Tareq Manzoor; Ahmed Saeed; Naveed Ur Rehman; Jawad Mirza
Journal: Front Physiol Date: 2019-03-19 Impact factor: 4.566

8. A Feature-Driven Decision Support System for Heart Failure Prediction Based on χ ² Statistical Model and Gaussian Naive Bayes.

Authors: Liaqat Ali; Shafqat Ullah Khan; Noorbakhsh Amiri Golilarz; Imrana Yakubu; Iqbal Qasim; Adeeb Noor; Redhwan Nour
Journal: Comput Math Methods Med Date: 2019-11-20 Impact factor: 2.238

9. Data Mining in Employee Healthcare Detection Using Intelligence Techniques for Industry Development.

Authors: Abolfazl Mehbodniya; Ihtiram Raza Khan; Sudeshna Chakraborty; M Karthik; Kamakshi Mehta; Liaqat Ali; Stephen Jeswinde Nuagah
Journal: J Healthc Eng Date: 2022-01-10 Impact factor: 2.682

1 in total

1. An Intelligent Learning System for Unbiased Prediction of Dementia Based on Autoencoder and Adaboost Ensemble Learning.

Authors: Ashir Javeed; Ana Luiza Dallora; Johan Sanmartin Berglund; Peter Anderberg
Journal: Life (Basel) Date: 2022-07-21

1 in total