Multiclass cancer classification by support vector machines with class-wise optimized genes and probability estimates.

Ashish Anand, P N Suganthan.

Abstract

We investigate the multiclass classification of cancer microarray samples. Compared with classifying two cancer types from gene expression data, classifying more than two cancer types is a relatively hard and less studied problem. We used class-wise optimized genes with a corresponding one-versus-all support vector machine (OVA-SVM) classifier for each class to maximize the utilization of the selected genes. The final prediction was made using probability scores from all classifiers. We used three different methods of estimating probability from the decision value. Among the three probability methods, Platt's approach was more consistent, whereas the isotonic approach performed better for datasets with unequal proportions of samples in different classes. A probability-based decision not only gives a true and fair comparison between different one-versus-all (OVA) classifiers but also makes it possible to use them in any post-analysis. Several ensemble experiments combining the three probability methods, an example of post-analysis, were implemented to study their effect on classification accuracy. We observe that the ensemble did help improve predictive accuracy on cancer datasets, especially those involving unbalanced samples. A four-fold external stratified cross-validation experiment was performed on the six multiclass cancer datasets to obtain unbiased estimates of prediction accuracy. Analysis of class-wise frequently selected genes on two cancer datasets demonstrated that the approach was able to select important and relevant genes consistent with the literature. This study demonstrates successful implementation of the framework of class-wise feature selection and multiclass classification for prediction of cancer subtypes on six datasets.
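
The probability-based decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each OVA classifier has already produced a raw decision value and that its Platt sigmoid parameters (A, B) have already been fitted; the class labels, decision values, and parameters below are hypothetical numbers chosen only for illustration.

```python
import math


def platt_probability(decision_value, a, b):
    """Map an SVM decision value f to a probability using Platt's sigmoid,
    p = 1 / (1 + exp(a * f + b)). The parameters a and b are assumed to be
    fitted beforehand on held-out decision values."""
    z = a * decision_value + b
    # Evaluate 1 / (1 + exp(z)) in a form that avoids overflow for large |z|.
    if z >= 0:
        e = math.exp(-z)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(z))


def predict_class(decision_values, platt_params):
    """Convert every OVA decision value to a probability and return the
    class with the highest probability, together with all probabilities."""
    probs = {
        label: platt_probability(value, *platt_params[label])
        for label, value in decision_values.items()
    }
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical decision values from three OVA classifiers for one sample.
decision_values = {"A": 0.8, "B": -0.3, "C": 1.1}
# Hypothetical per-classifier Platt parameters (a, b).
platt_params = {"A": (-2.0, 0.0), "B": (-1.5, 0.2), "C": (-0.5, 0.1)}

label, probs = predict_class(decision_values, platt_params)
print(label, {k: round(v, 3) for k, v in probs.items()})
```

In this toy case class C has the largest raw decision value, but class A has the highest calibrated probability. Each OVA classifier here is trained on its own gene subset, so raw decision values are not on a common scale; comparing probabilities instead is one way to read the abstract's claim of a fair comparison between classifiers.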

Year:  2009        PMID: 19406131     DOI: 10.1016/j.jtbi.2009.04.013

Source DB:  PubMed          Journal:  J Theor Biol        ISSN: 0022-5193            Impact factor:   2.691


  5 in total

1.  Identification and optimization of classifier genes from multi-class earthworm microarray dataset.

Authors:  Ying Li; Nan Wang; Edward J Perkins; Chaoyang Zhang; Ping Gong
Journal:  PLoS One       Date:  2010-10-28       Impact factor: 3.240

2.  RNA-Seq-Based Breast Cancer Subtypes Classification Using Machine Learning Approaches.

Authors:  Zhezhou Yu; Zhuo Wang; Xiangchun Yu; Zhe Zhang
Journal:  Comput Intell Neurosci       Date:  2020-10-29

3.  A comprehensive simulation study on classification of RNA-Seq data.

Authors:  Gökmen Zararsız; Dincer Goksuluk; Selcuk Korkmaz; Vahap Eldem; Gozde Erturk Zararsiz; Izzet Parug Duru; Ahmet Ozturk
Journal:  PLoS One       Date:  2017-08-23       Impact factor: 3.240

4.  Some remarks on protein attribute prediction and pseudo amino acid composition.

Authors:  Kuo-Chen Chou
Journal:  J Theor Biol       Date:  2010-12-17       Impact factor: 2.691

5.  Identification of potential biomarkers to differentially diagnose solid pseudopapillary tumors and pancreatic malignancies via a gene regulatory network.

Authors:  Pengping Li; Yuebing Hu; Jiao Yi; Jie Li; Jie Yang; Jin Wang
Journal:  J Transl Med       Date:  2015-11-14       Impact factor: 5.531
