MSclassifier: median-supplement model-based classification tool for automated knowledge discovery.

Emmanuel S Adabor, George K Acquaah-Mensah, Gaston K Mazandu.

Abstract

High-throughput technologies have resulted in an exponential growth of publicly available and accessible datasets for biomedical research. Efficient computational models, algorithms and tools are required to exploit these datasets for knowledge discovery to aid medical decisions. Here, we introduce a new tool, MSclassifier, based on median-supplement approaches to machine learning, to enable automated and effective binary classification for optimal decision making. The MSclassifier package estimates medians of features (attributes) to deduce supplementary data, which are subsequently introduced into the training set to balance it and to build superior classification models. To test our approach, we use it to determine HER2 receptor expression status phenotypes in breast cancer and to predict protein subcellular localization (plasma membrane and nucleus). Using independent-sample and cross-validation tests, we evaluate the performance of MSclassifier and compare it with well-established tools that can perform such tasks. In the HER2 receptor expression status phenotype identification tasks, MSclassifier achieved a statistically significantly higher classification rate than the best-performing existing tool (90.30% versus 89.83%, p=8.62e-3). In the subcellular localization prediction tasks, MSclassifier and one other existing tool achieved comparably high performance (93.42% versus 93.19%, p=0.06), and both outperformed tools based on Naive Bayes classifiers. Overall, the application and evaluation of MSclassifier reveal its potential to be applied to a variety of binary classification problems. The MSclassifier package provides an R-portable and user-friendly application to a broad audience, enabling experienced end-users as well as non-programmers to perform effective classification in biomedical and other fields of study.

Copyright: © 2020 Adabor ES et al.
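
The median-supplement idea described in the abstract (estimate feature medians, derive supplementary rows, add them to the training set for balancing) can be illustrated with a minimal Python sketch. This is not the MSclassifier R package or its API, and the exact supplementation rule used by the authors is not given in the abstract. The sketch assumes one plausible reading: compute per-class feature medians and append the median row to the smaller class until the class counts match. The function names `class_medians` and `median_supplement` and the toy data are hypothetical.

```python
from collections import Counter
from statistics import median


def class_medians(X, y):
    """Return {label: [median of each feature over the rows with that label]}."""
    rows_by_label = {}
    for row, label in zip(X, y):
        rows_by_label.setdefault(label, []).append(row)
    return {
        label: [median(column) for column in zip(*rows)]
        for label, rows in rows_by_label.items()
    }


def median_supplement(X, y):
    """Append per-class median rows until every class matches the largest class.

    Assumption: this balancing rule is an illustrative guess, not the
    procedure documented for MSclassifier.
    """
    medians = class_medians(X, y)
    counts = Counter(y)
    target = max(counts.values())
    X_out = [list(row) for row in X]
    y_out = list(y)
    for label, n in counts.items():
        for _ in range(target - n):
            X_out.append(list(medians[label]))
            y_out.append(label)
    return X_out, y_out


# Toy, made-up data: three "neg" rows and two "pos" rows with two features each.
X = [[1.0, 10.0], [2.0, 12.0], [3.0, 14.0], [8.0, 1.0], [9.0, 3.0]]
y = ["neg", "neg", "neg", "pos", "pos"]

X_bal, y_bal = median_supplement(X, y)
print(Counter(y_bal))
print(X_bal[-1])  # the supplementary "pos" row built from feature medians
```

The balanced `X_bal` and `y_bal` could then be passed to any binary classifier. Whether the supplementary rows should be added per class, to the minority class only, or in some other proportion is a design choice this sketch does not settle.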

Keywords:  Breast cancer; HER2 receptor status; classification; machine learning; protein subcellular localization; software package

Year:  2020        PMID: 33456763      PMCID: PMC7788522          DOI: 10.12688/f1000research.25501.1

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


