
Comparing supervised and semi-supervised Machine Learning Models on Diagnosing Breast Cancer.

Nosayba Al-Azzam, Ibrahem Shatnawi.

Abstract

BACKGROUND: Breast cancer is the most common cancer in US women and the second leading cause of cancer death among women.
OBJECTIVES: To compare and evaluate the performance and accuracy of the key supervised and semi-supervised machine learning algorithms for breast cancer prediction.
MATERIALS AND METHODS: We used nine machine learning classification algorithms for supervised learning (SL) and semi-supervised learning (SSL): 1) Logistic regression; 2) Gaussian Naive Bayes; 3) Linear Support vector machine; 4) RBF Support vector machine; 5) Decision Tree; 6) Random Forest; 7) Xgboost; 8) Gradient Boosting; 9) KNN. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset was used to train and test these models. To ensure the robustness of the models, we applied K-fold cross-validation and optimized hyperparameters. We evaluated and compared the models using accuracy, precision, recall, F1-score, and ROC curves.
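The evaluation procedure named above (K-fold cross-validation scored with accuracy, precision, recall, and F1-score) can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names `k_fold_indices` and `binary_metrics` are hypothetical, the folds are contiguous and unshuffled, and label 1 is assumed to mean the positive (malignant) class. The abstract does not state the value of K or whether the folds were stratified.

```python
from typing import Dict, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    Assumption: no shuffling or stratification, for simplicity.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        # The first `extra` folds get one additional sample.
        size = base + (1 if i < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, precision, recall, and F1 with label 1 as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels for illustration only; these are not results from the paper.
folds = k_fold_indices(n_samples=10, k=3)
scores = binary_metrics([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])
```

In practice each fold's training indices would be used to fit a classifier and the test indices to score it, with the per-fold metrics averaged. Shuffled, stratified folds are usually preferable when the classes are imbalanced.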
RESULTS: All models performed well under both SL and SSL. SSL achieved high accuracy (90%-98%) with just half of the training data. The KNN model for SL and logistic regression for SSL achieved the highest accuracy of 98%.
CONCLUSION: The accuracies of the SSL algorithms are very close to those of the SL algorithms. The accuracies of all models are in the range of 91-98%. SSL is a promising and competitive approach to this problem. Using a small sample of labeled data and low computational power, SSL is fully capable of replacing SL algorithms in diagnosing tumor type.
© 2021 The Authors.
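The idea of learning from a small labeled sample plus unlabeled data can be illustrated with self-training, one common SSL scheme. The abstract does not say which SSL technique the authors used, so self-training is an assumption here, as are the nearest-centroid base classifier, the `min_margin` confidence threshold, and every function name. The data are synthetic toy points, not WDBC features.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def fit_centroids(X: Sequence[Point], labels: Sequence[Optional[int]]) -> Dict[int, Point]:
    """Compute one centroid per class (0 and 1) from the labeled points only."""
    centroids = {}
    for cls in (0, 1):
        members = [x for x, lab in zip(X, labels) if lab == cls]
        if not members:
            raise ValueError(f"no labeled samples for class {cls}")
        centroids[cls] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids


def predict_with_margin(centroids: Dict[int, Point], x: Point) -> Tuple[int, float]:
    """Return the nearest class and the distance gap, used as a confidence proxy."""
    d0 = math.dist(x, centroids[0])
    d1 = math.dist(x, centroids[1])
    return (0 if d0 <= d1 else 1), abs(d0 - d1)


def self_train(
    X: Sequence[Point],
    y: Sequence[Optional[int]],
    min_margin: float,
    max_rounds: int = 10,
) -> Tuple[Dict[int, Point], List[Optional[int]]]:
    """Self-training loop: None in y marks an unlabeled sample.

    Each round fits on the currently labeled points, then pseudo-labels
    every unlabeled point whose margin is at least min_margin.
    """
    labels = list(y)
    for _ in range(max_rounds):
        centroids = fit_centroids(X, labels)
        newly_labeled = 0
        for i, x in enumerate(X):
            if labels[i] is None:
                pred, margin = predict_with_margin(centroids, x)
                if margin >= min_margin:
                    labels[i] = pred
                    newly_labeled += 1
        if newly_labeled == 0:
            break  # nothing confident left to add
    return fit_centroids(X, labels), labels


# Synthetic toy data: two labeled points, five unlabeled ones.
X = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (0.8, 1.1), (4.8, 5.2), (5.1, 4.9), (3.0, 3.0)]
y = [0, 1, None, None, None, None, None]
model, final_labels = self_train(X, y, min_margin=1.0)
new_prediction, _ = predict_with_margin(model, (0.9, 1.0))
```

Points close to one cluster receive pseudo-labels and refine the centroids, while the ambiguous point midway between the clusters stays unlabeled because its margin never reaches the threshold. Any classifier that reports a confidence score could take the place of the nearest-centroid stand-in.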


Keywords:  ANN, Artificial Neural Network; Breast cancer; Cov, covariance; Diagnosis; EDA, Exploratory Data Analysis; FNA, fine needle aspirate; FPR, False positive rate; ID3, Information Gain; KNN, K-nearest neighbor; MRI, Magnetic resonance imaging; Machine learning algorithms; RBF, Radial Basis Function; ROC, Receiver Operator Characteristic; SL, Supervised Learning; SSL, Semi-Supervised Learning; SVM, Support vector machine; Semi-supervised; Supervised; TPR, True positive rate; WDBC, Wisconsin Diagnostic Breast Cancer; Xgboost, eXtreme Gradient Boosting; t-SNE, t-distributed Stochastic Neighbor Embedding

Year:  2021        PMID: 33489117      PMCID: PMC7806524          DOI: 10.1016/j.amsu.2020.12.043

Source DB:  PubMed          Journal:  Ann Med Surg (Lond)        ISSN: 2049-0801


