Gene extraction for cancer diagnosis by support vector machines--an improvement.

Te Ming Huang, Vojislav Kecman.

Abstract

OBJECTIVE: To improve the performance of gene extraction for cancer diagnosis by recursive feature elimination with support vector machines (RFE-SVMs). Cancer diagnosis from DNA microarray data faces many challenges, the most serious being the presence of thousands of genes and, at best, only several dozen patient samples. Classification in such high-dimensional spaces from a limited number of samples is therefore both extremely difficult and error-prone. An improved RFE-SVM is introduced and used here to eliminate less relevant genes and thereby to reduce the overall number of genes used in medical diagnosis.
METHODS: The paper shows why and how the usually neglected penalty parameter C and some standard data preprocessing techniques (normalizing and scaling) influence the classification results and the gene selection of RFE-SVMs. The genes selected by RFE-SVMs are compared with those selected by eight other gene selection algorithms implemented in the Rankgene software to investigate whether there is any consensus among the algorithms, so that the search for the right set of genes can be narrowed.
RESULTS: The improved RFE-SVM is applied to two benchmark data sets, colon cancer and lymphoma, with various values of the parameter C and different standard preprocessing techniques. Here, decreasing C leads to a smaller diagnosis error in comparison with other known methods applied to these benchmark data sets. With an appropriate parameter C and a proper preprocessing procedure, the reduction in diagnosis error is as high as 36%.
CONCLUSIONS: The results suggest that, with a properly chosen parameter C, the extracted genes and the constructed classifier overfit the training data less, leading to increased accuracy in selecting relevant genes. Finally, a comparison of the gene rankings obtained by different algorithms shows that there is significant consensus among the various algorithms as to which set of genes is relevant.
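
The procedure summarized in the abstract can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. Several things here are assumptions made only for illustration: the synthetic data standing in for microarray samples, the column-wise standardization used as the preprocessing step, the removal of one gene per iteration, and the plain subgradient-descent solver swapped in for a proper SVM solver. The helper names (standardize, train_linear_svm, rfe_svm) are hypothetical. The sketch ranks genes by squared weight of a linear SVM, the usual RFE criterion, and repeats the selection for a small and a larger penalty parameter C.

```python
import random

def standardize(X):
    """Scale each column (gene) to zero mean and unit variance."""
    n, d = len(X), len(X[0])
    cols = []
    for j in range(d):
        col = [row[j] for row in X]
        mean = sum(col) / n
        std = (sum((v - mean) ** 2 for v in col) / n) ** 0.5 or 1.0
        cols.append([(v - mean) / std for v in col])
    return [[cols[j][i] for j in range(d)] for i in range(n)]

def train_linear_svm(X, y, C, epochs=200, lr=0.01):
    """Minimize 0.5*||w||^2 + C*sum(hinge loss) by batch subgradient descent."""
    d = len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = list(w), 0.0  # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # this sample violates the margin
                for j in range(d):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w

def rfe_svm(X, y, C, n_keep):
    """Retrain on the surviving genes and drop the one with the smallest w_j^2."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y, C)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        active.pop(worst)
    return active

# Synthetic stand-in for microarray data (assumption): 30 samples, 12 genes,
# of which only genes 0 and 1 carry a class-dependent shift.
random.seed(0)
n_samples, n_genes = 30, 12
y = [1 if i % 2 == 0 else -1 for i in range(n_samples)]
X = [[random.gauss(0, 1) + (1.5 * yi if j < 2 else 0.0) for j in range(n_genes)]
     for yi in y]
X = standardize(X)

# Run the elimination for a small and a larger penalty parameter C.
selected = {C: rfe_svm(X, y, C, n_keep=3) for C in (0.01, 1.0)}
print(selected)
```

Comparing the gene indices returned for the two values of C gives a rough feel for how the penalty parameter can change which genes survive the elimination. On real data, the choice of C and of preprocessing would be assessed by cross-validated diagnosis error, as the abstract describes.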

Year:  2005        PMID: 16026974     DOI: 10.1016/j.artmed.2005.01.006

Source DB:  PubMed          Journal:  Artif Intell Med        ISSN: 0933-3657            Impact factor:   5.326


  19 in total

1.  An automatic method for arterial pulse waveform recognition using KNN and SVM classifiers.

Authors:  Tânia Pereira; Joana S Paiva; Carlos Correia; João Cardoso
Journal:  Med Biol Eng Comput       Date:  2015-09-24       Impact factor: 2.602

2.  Investigating the efficacy of nonlinear dimensionality reduction schemes in classifying gene and protein expression studies.

Authors:  George Lee; Carlos Rodriguez; Anant Madabhushi
Journal:  IEEE/ACM Trans Comput Biol Bioinform       Date:  2008 Jul-Sep       Impact factor: 3.710

3.  Network Medicine: New Paradigm in the -Omics Era.

Authors:  Nancy Lan Guo
Journal:  Anat Physiol       Date:  2011-12-13

4.  Study on Contribution of Biological Interpretable and Computer-Aided Features Towards the Classification of Childhood Medulloblastoma Cells.

Authors:  Daisy Das; Lipi B Mahanta; Shabnam Ahmed; Basanta Kr Baishya; Inamul Haque
Journal:  J Med Syst       Date:  2018-07-04       Impact factor: 4.460

5.  Accurate prediction of coronary artery disease using reliable diagnosis system.

Authors:  Indrajit Mandal; N Sairam
Journal:  J Med Syst       Date:  2012-02-12       Impact factor: 4.460

6.  Identification of disease-causing genes using microarray data mining and Gene Ontology.

Authors:  Azadeh Mohammadi; Mohammad H Saraee; Mansoor Salehi
Journal:  BMC Med Genomics       Date:  2011-01-26       Impact factor: 3.063

7.  A mixture model with a reference-based automatic selection of components for disease classification from protein and/or gene expression levels.

Authors:  Ivica Kopriva; Marko Filipović
Journal:  BMC Bioinformatics       Date:  2011-12-30       Impact factor: 3.169

8.  Recursive gene selection based on maximum margin criterion: a comparison with SVM-RFE.

Authors:  Satoshi Niijima; Satoru Kuhara
Journal:  BMC Bioinformatics       Date:  2006-12-25       Impact factor: 3.169

9.  Application of two machine learning algorithms to genetic association studies in the presence of covariates.

Authors:  Bareng A S Nonyane; Andrea S Foulkes
Journal:  BMC Genet       Date:  2008-11-14       Impact factor: 2.797

10.  Development and multicenter validation of a CT-based radiomics signature for discriminating histological grades of pancreatic ductal adenocarcinoma.

Authors:  Na Chang; Lingling Cui; Yahong Luo; Zhihui Chang; Bing Yu; Zhaoyu Liu
Journal:  Quant Imaging Med Surg       Date:  2020-03