Habib Dhahri, Eslam Al Maghayreh, Awais Mahmood, Wail Elkilani, Mohammed Faisal Nagi.
Abstract
Several empirical studies have addressed breast cancer diagnosis using machine learning and soft computing techniques, and many claim that their algorithms are faster, easier, or more accurate than others. This study combines genetic programming with machine learning algorithms to build a system that accurately differentiates between benign and malignant breast tumors. The aim was to optimize the learning algorithm: we applied genetic programming to select the best features and the best parameter values for the machine learning classifiers. Performance was evaluated using sensitivity, specificity, precision, accuracy, and ROC curves. The results show that genetic programming can automatically find the best model by combining feature preprocessing methods and classifier algorithms.
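The four threshold-based measures named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the example counts are hypothetical, and malignant is assumed to be the positive class.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute four standard measures from confusion-matrix counts.

    tp/fn: positive (assumed malignant) cases predicted correctly/incorrectly.
    tn/fp: negative (assumed benign) cases predicted correctly/incorrectly.
    """
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate (recall)
        "specificity": tn / (tn + fp),   # true negative rate
        "precision": tp / (tp + fp),     # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts for illustration only; they are not taken from the study.
metrics = binary_metrics(tp=38, fp=1, tn=73, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are reported separately because a classifier can score well on accuracy while still missing many malignant cases when the classes are imbalanced.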
Year: 2019 PMID: 31814951 PMCID: PMC6878785 DOI: 10.1155/2019/4253641
Source DB: PubMed Journal: J Healthc Eng ISSN: 2040-2295 Impact factor: 2.682
Figure 1. Example of pipeline.
Figure 2. Flowchart of GP.
Figure 3. Wrapper methods.
Figure 4. Embedded methods.
Comparison of feature-selection algorithms.
| Search algorithm | Selected attributes | Number of selected attributes |
|---|---|---|
| PSO | 1, 9, 10, 16, 21, 23, 24, 25, 26, 27, 30, 31 | 12 |
| Evolutionary search | 1, 3, 9, 10, 11, 15, 23, 24, 25, 26, 27, 29, 30 | 13 |
| Genetic algorithm | 1, 7, 9, 10, 16, 21, 23, 24, 25, 26, 29, 30 | 12 |
| Best first | 1, 4, 9, 10, 16, 21, 23, 25, 26, 27, 29, 30 | 12 |
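One way to read this table is to ask which attributes every search algorithm agrees on. The sketch below copies the attribute indices from the table and intersects them; the variable names are illustrative, and the overlap is a reading of the table, not a result reported in the source.

```python
# Attribute indices selected by each search algorithm, copied from the table.
selected = {
    "PSO": {1, 9, 10, 16, 21, 23, 24, 25, 26, 27, 30, 31},
    "Evolutionary search": {1, 3, 9, 10, 11, 15, 23, 24, 25, 26, 27, 29, 30},
    "Genetic algorithm": {1, 7, 9, 10, 16, 21, 23, 24, 25, 26, 29, 30},
    "Best first": {1, 4, 9, 10, 16, 21, 23, 25, 26, 27, 29, 30},
}

# Attributes chosen by all four algorithms.
common = set.intersection(*selected.values())
print("Selected by all four:", sorted(common))

# Attributes chosen by exactly one algorithm.
unique = {
    name: sorted(a for a in attrs
                 if sum(a in other for other in selected.values()) == 1)
    for name, attrs in selected.items()
}
print("Selected by one algorithm only:", unique)
```

Seven attributes (1, 9, 10, 23, 25, 26, 30) appear in all four selections.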
Figure 5. ROC curve for LDA.
Figure 6. ROC curve for LR.
Figure 7. ROC curve for ET.
Figure 8. ROC curve for RF.
Figure 9. ROC curve for GB.
Figure 10. ROC curve for AB.
Figure 11. ROC curve for DT.
Figure 12. ROC curve for KNN.
Figure 13. ROC curve for GNB.
Figure 14. Combining feature extraction.
Figure 15. Comparison of classifier accuracy.
Figure 16. Comparison of log-loss classifier.
F1-measure results for breast cancer classification.
| | GB | DT | RF | GNB | SVM | KNN | AB | LDA | QDA | LR | ET |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Benign (%) | 96.69 | 95.36 | 97.37 | 96.69 | 78.72 | 93.42 | 98.67 | 96.73 | 97.26 | 96.10 | 98.01 |
| Malignant (%) | 93.51 | 90.91 | 94.74 | 93.51 | 0 | 86.84 | 97.44 | 93.33 | 95.12 | 91.89 | 96.10 |
| Average (%) | 95.57 | 93.80 | 96.45 | 95.57 | 51.10 | 91.11 | 98.23 | 95.33 | 96.51 | 94.63 | 97.34 |
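For reference, the F1-measure is the harmonic mean of precision and recall, computed per class. The sketch below uses hypothetical helper names and inputs. The source does not say how the Average row is computed; the support-weighted mean shown here is one common convention and is an assumption.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def weighted_average(scores, supports):
    """Mean of per-class scores, weighted by the number of samples per class."""
    total = sum(supports)
    return sum(s * n for s, n in zip(scores, supports)) / total


# Hypothetical per-class precision and recall, for illustration only.
f1_a = f1_score(precision=0.5, recall=1.0)
f1_b = f1_score(precision=0.0, recall=0.0)  # a class with no true positives
print(round(f1_a, 4), f1_b)

# Hypothetical class sizes (supports) of 3 and 1 samples.
print(weighted_average([f1_a, f1_b], supports=[3, 1]))
```

A per-class F1 of 0, as in the SVM column for the malignant class, means the classifier produced no true positives for that class.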
Log-loss results for breast cancer classification.
| | GB | DT | RF | GNB | SVM | KNN | AB | LDA | QDA | LR | ET |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Log-loss (%) | 0.06 | 2.12 | 0.09 | 0.19 | 0.59 | 0.992 | 0.39 | 0.16 | 0.25 | 0.13 | 0.09 |
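Log-loss scores predicted probabilities, not hard labels, so it penalizes confident wrong predictions heavily. The following is a minimal sketch of the standard binary log-loss (cross-entropy) definition; the function name, the clipping constant `eps`, and the sample inputs are assumptions for illustration and are not from the source.

```python
import math


def log_loss(y_true, p_pred, eps: float = 1e-15) -> float:
    """Mean binary cross-entropy.

    y_true: true labels, each 0 or 1.
    p_pred: predicted probability of label 1 for each sample.
    eps:    clipping bound that keeps log() away from 0 (an assumed value).
    """
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to (0, 1) to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Hypothetical predictions, for illustration only.
uninformative = log_loss([1, 0], [0.5, 0.5])    # always predicting 0.5
confident_ok = log_loss([1, 0], [0.9, 0.1])     # confident and correct
confident_bad = log_loss([1, 0], [0.1, 0.9])    # confident and wrong
print(uninformative, confident_ok, confident_bad)
```

Lower values are better. A model that always predicts 0.5 scores ln 2 (about 0.693), which gives a baseline for reading the values in the table.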
Figure 17. Validation accuracy.