Mubarak Taiwo Mustapha, Dilber Uzun Ozsahin, Ilker Ozsahin, Berna Uzun.
Abstract
On average, breast cancer kills one woman per minute. However, there are more reasons for optimism than ever before: when diagnosed early, patients with breast cancer have a better chance of survival. This study employs a novel approach that combines artificial intelligence and a multi-criteria decision-making method for a more robust evaluation of machine learning models. The machine learning techniques comprise several supervised learning algorithms, while the multi-criteria decision-making technique is the Preference Ranking Organization Method for Enrichment Evaluations (PROMETHEE). The Support Vector Machine, having achieved a net outranking flow of 0.1022, is ranked as the most favorable model for the early detection of breast cancer. The net outranking flow is the balance between the positive and negative outranking flows, so the higher the net flow, the better the alternative. K-nearest neighbor, logistic regression, and random forest classifier ranked second, third, and fourth, with net flows of 0.0316, -0.0032, and -0.0541, respectively. The least preferred alternative is the naive Bayes classifier, with a net flow of -0.0766. These results indicate that the proposed method supports a sound decision when selecting the most appropriate machine learning model and gives the decision-maker the option of introducing new criteria into the decision-making process.
Keywords: benign; decision-making; machine learning; malignant; supervised learning
Year: 2022 PMID: 35741136 PMCID: PMC9221649 DOI: 10.3390/diagnostics12061326
Source DB: PubMed Journal: Diagnostics (Basel) ISSN: 2075-4418
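The net outranking flow described in the abstract is the PROMETHEE II score: the positive flow measures how strongly an alternative outranks the others on the weighted criteria, and the negative flow how strongly it is outranked. A minimal sketch of that computation, assuming the simple "usual" preference function (the paper's exact preference function and weights are not reproduced here):

```python
import numpy as np

def promethee_ii(matrix, weights, maximize):
    """PROMETHEE II outranking flows.

    matrix   : (n_alternatives, n_criteria) performance table
    weights  : criterion weights summing to 1
    maximize : boolean per criterion; True where larger is better
    """
    n, _ = matrix.shape
    # Orient every criterion so that larger values are preferred.
    data = np.where(maximize, matrix, -matrix)
    phi_plus = np.zeros(n)
    phi_minus = np.zeros(n)
    for a in range(n):
        for b in range(n):
            if a == b:
                continue
            # "Usual" preference function: full preference whenever
            # alternative a strictly beats b on a criterion.
            pref_ab = np.sum(weights * (data[a] > data[b]))
            phi_plus[a] += pref_ab
            phi_minus[b] += pref_ab
    # Average over the n-1 pairwise comparisons per alternative.
    phi_plus /= (n - 1)
    phi_minus /= (n - 1)
    return phi_plus, phi_minus, phi_plus - phi_minus  # net = phi+ - phi-
```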
Figure 1. Digitized image of the Wisconsin breast dataset [27].
Class distribution.
| Label | Count | Designation |
|---|---|---|
| Malignant (M) | 212 | 1 |
| Benign (B) | 357 | 0 |
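These counts match the Wisconsin diagnostic dataset bundled with scikit-learn, which can be used to verify the class distribution; note that scikit-learn encodes malignant as 0 and benign as 1, the reverse of the paper's designation:

```python
from collections import Counter
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
counts = Counter(data.target_names[t] for t in data.target)
print(counts)  # benign: 357, malignant: 212

# Remap to the paper's designation: malignant -> 1, benign -> 0.
y = (data.target == 0).astype(int)
print(y.sum())  # 212 malignant cases
```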
Figure 2. Confusion matrix [17].
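The confusion matrix underlies the performance criteria (accuracy, recall, precision, F1-score, ROC AUC, log loss) that populate the decision matrices below. A sketch of how these could be computed for one alternative; the SVC configuration and train/test split here are illustrative assumptions, not the paper's exact experimental setup:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import (confusion_matrix, accuracy_score, recall_score,
                             precision_score, f1_score, roc_auc_score, log_loss)

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2,
                                          random_state=0, stratify=y)
scaler = StandardScaler().fit(X_tr)            # SVM benefits from scaling
model = SVC(probability=True).fit(scaler.transform(X_tr), y_tr)

y_pred = model.predict(scaler.transform(X_te))
y_prob = model.predict_proba(scaler.transform(X_te))[:, 1]

print(confusion_matrix(y_te, y_pred))
print("accuracy ", accuracy_score(y_te, y_pred))
print("recall   ", recall_score(y_te, y_pred))
print("precision", precision_score(y_te, y_pred))
print("f1       ", f1_score(y_te, y_pred))
print("roc auc  ", roc_auc_score(y_te, y_prob))
print("log loss ", log_loss(y_te, y_prob))
```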
The linguistic scale of importance.
| Linguistic Scale | Triangular Fuzzy Scale | Criteria |
|---|---|---|
| Very High (VH) | (0.75, 1, 1) | Accuracy, recall, precision |
| High (H) | (0.50, 0.75, 1) | Number of training samples needed, the impact of feature scaling, the impact of hyperparameter tuning |
| Medium (M) | (0.25, 0.50, 0.75) | Tolerance to irrelevant attributes |
| Low (L) | (0, 0.25, 0.50) | - |
| Very Low (VL) | (0, 0, 0.25) | - |
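Before the outranking computation, the triangular fuzzy importances above must be reduced to crisp criterion weights. One common choice, assumed here since the paper's defuzzification step is not reproduced, is the centroid (l + m + u)/3 followed by normalization:

```python
# Triangular fuzzy numbers from the linguistic scale table.
fuzzy_scale = {
    "VH": (0.75, 1.00, 1.00),
    "H":  (0.50, 0.75, 1.00),
    "M":  (0.25, 0.50, 0.75),
    "L":  (0.00, 0.25, 0.50),
    "VL": (0.00, 0.00, 0.25),
}

# Linguistic importance assigned to each criterion (from the table).
criteria = {
    "accuracy": "VH", "recall": "VH", "precision": "VH",
    "training samples needed": "H", "feature scaling": "H",
    "hyperparameter tuning": "H", "tolerance to irrelevant attributes": "M",
}

# Centroid defuzzification, then normalize so the weights sum to 1.
crisp = {c: sum(fuzzy_scale[s]) / 3 for c, s in criteria.items()}
total = sum(crisp.values())
weights = {c: v / total for c, v in crisp.items()}
print(weights)
```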
Decision matrix of alternatives (Wisconsin dataset).
| Alternative | Accuracy | Recall | Precision | F1-Score | ROC AUC | Log Loss | Number of Training Samples Needed | Impact of Feature Scaling | Impact of Hyperparameter Tuning | Tolerance to Irrelevant Attributes |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 99.0% | 99.0% | 99.5% | 99.0% | 99.5% | −0.828 | 0.92 | 0.92 | YES | 0.92 |
| Random Forest | 97.5% | 97.0% | 98.0% | 97.0% | 99.0% | −0.815 | 0.75 | 0.08 | YES | 0.08 |
| Logistic Regression | 97.5% | 97.0% | 98.0% | 97.0% | 99.0% | −0.815 | 0.50 | 0.25 | NO | 0.50 |
| KNN | 98.0% | 98.0% | 98.5% | 98.0% | 99.0% | −0.819 | 0.08 | 0.92 | YES | 0.50 |
| Naive Bayes | 97.5% | 97.0% | 98.0% | 97.0% | 99.0% | −0.815 | 0.50 | 0.08 | NO | 0.75 |
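As a usage illustration, the five performance columns of this matrix can be fed to the promethee_ii sketch above. Equal weights and an all-maximize orientation are placeholder assumptions (the paper uses the fuzzy weights and additional criteria), so these flows will not reproduce the paper's values exactly:

```python
import numpy as np

# Performance criteria from the Wisconsin decision matrix (percent scores).
# Columns: accuracy, recall, precision, F1-score, ROC AUC.
matrix = np.array([
    [99.0, 99.0, 99.5, 99.0, 99.5],   # SVM
    [97.5, 97.0, 98.0, 97.0, 99.0],   # Random Forest
    [97.5, 97.0, 98.0, 97.0, 99.0],   # Logistic Regression
    [98.0, 98.0, 98.5, 98.0, 99.0],   # KNN
    [97.5, 97.0, 98.0, 97.0, 99.0],   # Naive Bayes
])
weights = np.full(5, 1 / 5)           # equal weights: a placeholder
maximize = np.array([True] * 5)       # all five metrics: higher is better

phi_plus, phi_minus, net = promethee_ii(matrix, weights, maximize)
for name, f in zip(["SVM", "RF", "LR", "KNN", "NB"], net):
    print(f"{name:>4}: net flow = {f:+.4f}")
```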
The complete ranking of alternatives using the Wisconsin dataset.
| Ranking | Alternatives | Positive Outranking Flow | Negative Outranking Flow | Net Flow |
|---|---|---|---|---|
| 1 | SVM | 0.3954 | 0.0000 | 0.3954 |
| 2 | KNN | 0.1807 | 0.0962 | 0.0845 |
| 3 | Logistic Regression | 0.0516 | 0.1357 | −0.0841 |
| 4 | Naive Bayes | 0.0317 | 0.1718 | −0.1401 |
| 5 | Random Forest | 0.0136 | 0.2693 | −0.2557 |
Figure 3. Rainbow ranking of machine learning algorithms. NOTS: number of training samples; IOFS: impact of feature scaling; IOHT: impact of hyperparameter tuning; TTIA: tolerance to irrelevant attributes; ROC-AUC: area under the receiver operating characteristic curve.
Decision matrix of alternatives for the BIRADS dataset.
| Alternative | Accuracy | Recall | Precision | F1-Score | ROC AUC | Log Loss | Number of Training Samples Needed | Impact of Feature Scaling | Impact of Hyperparameter Tuning | Tolerance to Irrelevant Attributes |
|---|---|---|---|---|---|---|---|---|---|---|
| SVM | 97.0% | 95.5% | 97.5% | 98.5% | 99.5% | −0.8110 | 0.92 | 0.92 | YES | 0.92 |
| Random Forest | 96.0% | 96.0% | 98.0% | 98.0% | 99.0% | −0.8026 | 0.75 | 0.08 | YES | 0.08 |
| Logistic Regression | 95.5% | 95.5% | 97.0% | 96.5% | 99.0% | −0.7984 | 0.50 | 0.25 | NO | 0.50 |
| KNN | 95.5% | 96.0% | 97.5% | 96.0% | 98.5% | −0.7990 | 0.08 | 0.92 | YES | 0.50 |
| Naive Bayes | 94.0% | 94.0% | 96.0% | 96.0% | 98.0% | −0.7860 | 0.50 | 0.08 | NO | 0.75 |
The complete ranking of alternatives using the BIRADS dataset.
| Ranking | Alternatives | Positive Outranking Flow | Negative Outranking Flow | Net Flow |
|---|---|---|---|---|
| 1 | SVM | 0.3152 | 0.0000 | 0.3152 |
| 2 | KNN | 0.1734 | 0.1493 | 0.0241 |
| 3 | Logistic Regression | 0.0836 | 0.1157 | −0.0321 |
| 4 | Random Forest | 0.0729 | 0.1438 | −0.0709 |
| 5 | Naive Bayes | 0.0044 | 0.2408 | −0.2363 |
Figure 4. Rainbow ranking of ML algorithms using the BIRADS dataset. NOTS: number of training samples; IOFS: impact of feature scaling; IOHT: impact of hyperparameter tuning; TTIA: tolerance to irrelevant attributes; ROC-AUC: area under the receiver operating characteristic curve.