Shweta Kharya, Edeh Michael Onyema, Aasim Zafar, Mohd Anas Wajid, Rockson Kwasi Afriyie, Tripti Swarnkar, Sunita Soni.
Abstract
There are growing concerns about mortality from breast cancer, much of which results from delayed detection and treatment. An effective computational approach is therefore needed to develop a predictive model that helps patients and physicians manage the disease in a timely manner. This study presents a Weighted Bayesian Belief Network (WBBN) model for breast cancer prediction using the UCI breast cancer dataset. A new automated ranking method was used to assign weights to attribute-value pairs based on their impact on causing the disease. Associations were generated using weighted association rule mining between pairs of attributes, among multiple attributes, and with class labels. Weighted Bayesian confidence and weighted Bayesian lift measures were used to select strong rules to build the model. To build the WBBN, the Open Markov tool was used for structure and parameter learning from the generated strong rules. The model was trained on 70% of the records and tested on the remaining 30%, with minimum support = 36% and minimum confidence = 70%, and achieved an accuracy of 97.18%. Experimental results show that WBBN achieved better results in most cases compared to other predictive models. The study would contribute to the fight against breast cancer and to the quality of treatment.
Year: 2022 PMID: 35909874 PMCID: PMC9328988 DOI: 10.1155/2022/3813705
Source DB: PubMed Journal: Comput Intell Neurosci
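As a rough illustration of the thresholding step described in the abstract, the sketch below computes plain (unweighted) support and confidence for candidate rules over a toy set of discretized records and keeps the rules that meet minimum support = 36% and minimum confidence = 70%. The records, item names, and helper functions are hypothetical, and the unweighted measures are a stand-in: the paper's weighted support, weighted Bayesian confidence, and weighted Bayesian lift are not reproduced here.

```python
from itertools import combinations

# Hypothetical toy records: each is a set of "attribute=value" items plus a class item.
RECORDS = [
    {"clump=2", "size=4", "class=yes"},
    {"clump=2", "size=4", "class=yes"},
    {"clump=2", "size=3", "class=yes"},
    {"clump=1", "size=3", "class=no"},
    {"clump=1", "size=3", "class=no"},
    {"clump=2", "size=4", "class=no"},
    {"clump=1", "size=4", "class=yes"},
    {"clump=1", "size=3", "class=no"},
    {"clump=2", "size=4", "class=yes"},
    {"clump=1", "size=3", "class=no"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(1 for r in records if itemset <= r) / len(records)


def confidence(antecedent, consequent, records):
    """support(antecedent and consequent) / support(antecedent)."""
    s_a = support(antecedent, records)
    return support(antecedent | consequent, records) / s_a if s_a else 0.0


def strong_rules(records, target, min_support=0.36, min_confidence=0.70):
    """Return (antecedent, support, confidence) for rules antecedent -> target
    with one or two antecedent items that pass both thresholds."""
    items = sorted({i for r in records for i in r if not i.startswith("class=")})
    rules = []
    for size in (1, 2):
        for combo in combinations(items, size):
            antecedent = frozenset(combo)
            s = support(antecedent | {target}, records)
            c = confidence(antecedent, frozenset({target}), records)
            if s >= min_support and c >= min_confidence:
                rules.append((tuple(sorted(antecedent)), round(s, 2), round(c, 2)))
    return rules


if __name__ == "__main__":
    for rule in strong_rules(RECORDS, "class=yes"):
        print(rule)
```

Raising either threshold shrinks the rule set, which matches the general trade-off between the number of rules and their reliability.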
Figure 1. Workflow diagram of the proposed model.
Breast Cancer Dataset with discretized values.
| S.No | Attribute_name | Discretized values |
|---|---|---|
| 1 | Clump_thickness [1–10] | 1, 2 |
| 2 | Uniformity_of_cellsize [1–10] | 3, 4 |
| 3 | Uniformity_of_cellshape [1–10] | 5, 6 |
| 4 | Marginal_Adhesion [1–10] | 7, 8 |
| 5 | Single_Epithelial_cellsize [1–10] | 9, 10 |
| 6 | Bare_Nuclei [1–10] | 11, 12 |
| 7 | Bland_Chromatin [1–10] | 13, 14 |
| 8 | Normal_Nucleoli [1–10] | 15, 16 |
| 9 | Mitoses [1–10] | 17, 18 |
| 10 | Output (class label representing 2 types of breast cancer class) | 19, 20 |
Weights of attribute-value pairs for class label = 'yes'.
| Attribute, value pair | Rank | Weights (class label = Yes) |
|---|---|---|
| Clump Thickness,1 | 18 | .009 |
| Clump Thickness,2 | 10 | .11 |
| UniformityofCellSize,3 | 19 | .002 |
| UniformityofCellSize,4 | 9 | .12 |
| UniformityofCellShape,5 | 20 | .001 |
| UniformityofCellShape,6 | 8 | .095 |
| MarginalAdhesion,7 | 17 | .015 |
| MarginalAdhesion,8 | 11 | .107 |
| SingleEpithelialCell,10 | 7 | .122 |
| BareNuclei,12 | 7 | .122 |
| BlandChromatin,13 | 21 | .001 |
| BlandChromatin,14 | 8 | .121 |
| NormalNuclei,16 | 7 | .122 |
| Mitosis,17 | 7 | .122 |
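One common way to use per-item weights like those in the table above is to score an itemset by the mean weight of its items multiplied by its relative frequency in the data. The sketch below assumes that definition, which this section does not state, so it should be read as an illustration and not as the paper's formula. Three weights are copied from the table; the records and function names are hypothetical.

```python
# Weights copied from the table above (class label = yes).
WEIGHTS = {
    ("Clump Thickness", 2): 0.11,
    ("UniformityofCellSize", 4): 0.12,
    ("MarginalAdhesion", 8): 0.107,
}

# Hypothetical discretized records, each a set of (attribute, value) pairs.
RECORDS = [
    {("Clump Thickness", 2), ("UniformityofCellSize", 4), ("MarginalAdhesion", 8)},
    {("Clump Thickness", 2), ("UniformityofCellSize", 4)},
    {("Clump Thickness", 2), ("MarginalAdhesion", 8)},
    {("UniformityofCellSize", 4)},
]


def itemset_weight(itemset, weights):
    """Assumed definition: mean of the weights of the items in the itemset."""
    return sum(weights[item] for item in itemset) / len(itemset)


def weighted_support(itemset, records, weights):
    """Assumed definition: itemset weight times the fraction of records
    that contain every item in the itemset."""
    frequency = sum(1 for r in records if itemset <= r) / len(records)
    return itemset_weight(itemset, weights) * frequency


if __name__ == "__main__":
    pair = {("Clump Thickness", 2), ("UniformityofCellSize", 4)}
    print(weighted_support(pair, RECORDS, WEIGHTS))
```

Under this assumed definition, a frequent itemset made of low-weight pairs can score below a rarer itemset made of high-weight pairs, which is the point of weighting.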
Generation of strong rules on the basis of weighted Bayesian confidence (WBC) and weighted Bayesian lift (WBL), and the resulting accuracy.
| S.No | Minimum threshold | Training dataset (%) | Testing dataset (%) | No. of association rules based on weighted support and weighted confidence | No. of strong association rules based on WBC and WBL | Accuracy (%) |
|---|---|---|---|---|---|---|
| 1 | Support = 36%, Confidence = 70% | 100 | 100 | 22 | 10 | 97.08 |
| 2 | Support = 36%, Confidence = 70% | 80 | 20 | 11 | 7 | 95.7 |
| 3 | Support = 36%, Confidence = 70% | 70 | 30 | 11 | 5 | 97.18 |
| 4 | Support = 36%, Confidence = 70% | 60 | 40 | 11 | 7 | 92.5 |
| 5 | Support = 40%, Confidence = 80% | 100 | 100 | 11 | 7 | 89.53 |
| 6 | Support = 40%, Confidence = 80% | 80 | 20 | 11 | 7 | 95.74 |
| 7 | Support = 40%, Confidence = 80% | 70 | 30 | 11 | 7 | 86 |
| 8 | Support = 40%, Confidence = 80% | 60 | 40 | 28 | 12 | 92.55 |
| 9 | Support = 26%, Confidence = 60% | 100 | 100 | 23 | 11 | 89.53 |
| 10 | Support = 26%, Confidence = 60% | 80 | 20 | 22 | 11 | 95.74 |
| 11 | Support = 26%, Confidence = 60% | 70 | 30 | 23 | 12 | 97.18 |
| 12 | Support = 26%, Confidence = 60% | 60 | 40 | 11 | 9 | 92.55 |
| 13 | Support = 10%, Confidence = 50% | 100 | 100 | 23 | 12 | 89.53 |
| 14 | Support = 10%, Confidence = 50% | 80 | 20 | 23 | 12 | 95.74 |
| 15 | Support = 10%, Confidence = 50% | 70 | 30 | 23 | 12 | 97 |
| 16 | Support = 10%, Confidence = 50% | 60 | 40 | 23 | 12 | 92.5 |
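The training and testing percentages in the table above correspond to a holdout evaluation. A minimal sketch of such a split, and of accuracy expressed as a percentage, is given below using only the standard library. The function names, the fixed seed, and the toy labels are hypothetical, and the paper's exact splitting procedure is not stated in this section.

```python
import random


def split_records(records, train_fraction, seed=0):
    """Shuffle a copy of `records` and split it into (train, test) parts."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def accuracy(true_labels, predicted_labels):
    """Percentage of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)


if __name__ == "__main__":
    train, test = split_records(range(10), 0.7)  # 70% training, 30% testing
    print(len(train), len(test))
    print(accuracy(["yes", "no", "yes", "no"], ["yes", "no", "no", "no"]))
```

Fixing the seed keeps the split reproducible, so accuracies from different threshold settings are compared on the same test records.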
Performance of WBBN on various Clinical Datasets.
| Datasets | No. of class labels | Minimum threshold | Training dataset (%) | Testing dataset (%) | No. of strong association rules based on WBC and WBL | Accuracy (%) |
|---|---|---|---|---|---|---|
| Heart | 5 | Support = 36%, Confidence = 70% | 70 | 30 | 7 | 92.7 |
| Pima Indian | 1 | Support = 40%, Confidence = 80% | 80 | 20 | 7 | 95.8 |
| Hepatitis | 2 | Support = 36%, Confidence = 70% | 70 | 30 | 6 | 94.18 |
| Liver disorder | 2 | Support = 40%, Confidence = 80% | 70 | 30 | 5 | 94.3 |
Figure 2. WBBN model using Open Markov.
Comparison of the proposed model with other models applied to the Wisconsin Breast Cancer (WBC) dataset.
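| Dataset | Model | Accuracy (%) |
|---|---|---|
| Wisconsin Breast Cancer Dataset | Weighted Bayesian Belief Network | 97.18 |
| Wisconsin Breast Cancer Dataset | Naive Bayes | 95.99 |
| Wisconsin Breast Cancer Dataset | SVM | 97.13 |
| Wisconsin Breast Cancer Dataset | MLP | 95.27 |
| Wisconsin Breast Cancer Dataset | KNN | 95.27 |
| Wisconsin Breast Cancer Dataset | Decision tree | 95.36 |
| Wisconsin Breast Cancer Dataset | CART | 93.3 |
| Wisconsin Breast Cancer Dataset | Bayesian Network | 96.5 |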
Comparison of the proposed model with existing Bayesian models on the WBC dataset.
| Model | Dataset | Accuracy (%) |
|---|---|---|
| WBBN | WBC | 97.18 |
| Bayesian Network | WBC | 96.1 |
| Bayes Net | WBC | 95.25 |
| Bayesian Network | WBC | 96.5 |