A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis.

Alexander Statnikov, Constantin F Aliferis, Ioannis Tsamardinos, Douglas Hardin, Shawn Levy.

Abstract

MOTIVATION: Cancer diagnosis is one of the most important emerging clinical applications of gene expression microarray technology. We are seeking to develop a computer system that creates powerful and reliable cancer diagnostic models from microarray data. To keep a realistic perspective on clinical applications, we focus on multicategory diagnosis. To equip the system with the optimum combination of classifier, gene selection and cross-validation methods, we performed a systematic and comprehensive evaluation of several major algorithms for multicategory classification, several gene selection methods, multiple ensemble classifier methods and two cross-validation designs, using 11 datasets spanning 74 diagnostic categories, 41 cancer types and 12 normal tissue types.
RESULTS: Multicategory support vector machines (MC-SVMs) are the most effective classifiers in performing accurate cancer diagnosis from gene expression data. The MC-SVM techniques by Crammer and Singer, Weston and Watkins and one-versus-rest were found to be the best methods in this domain. MC-SVMs outperform other popular machine learning algorithms, such as k-nearest neighbors, backpropagation and probabilistic neural networks, often to a remarkable degree. Gene selection techniques can significantly improve the classification performance of both MC-SVMs and other non-SVM learning algorithms. Ensemble classifiers do not generally improve performance of the best non-ensemble models. These results guided the construction of a software system GEMS (Gene Expression Model Selector) that automates high-quality model construction and enforces sound optimization and performance estimation procedures. This is the first such system to be informed by a rigorous comparative analysis of the available algorithms and datasets.
AVAILABILITY: The software system GEMS is available for download from http://www.gems-system.org for non-commercial use.
CONTACT: alexander.statnikov@vanderbilt.edu.
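
As a rough illustration of what a "sound performance estimation procedure" can look like when gene selection and cross-validation are combined, the sketch below reselects genes inside every training fold, so the held-out samples never influence which genes are kept. It is a minimal sketch under stated assumptions, not the GEMS implementation: the abstract does not describe the protocol in this detail, the between-class to within-class sum-of-squares ratio is just one common filter score chosen here for illustration, and a nearest-centroid rule stands in for the multicategory SVMs so that the example needs only the Python standard library. All function names and the toy data are hypothetical.

```python
from collections import defaultdict


def stratified_folds(labels, k):
    """Deal each class's sample indices round-robin into k folds."""
    by_class = defaultdict(list)
    for i, label in enumerate(labels):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def bw_scores(X, y):
    """Between-class / within-class sum of squares for every gene (column)."""
    classes = sorted(set(y))
    scores = []
    for g in range(len(X[0])):
        column = [row[g] for row in X]
        grand_mean = sum(column) / len(column)
        between = within = 0.0
        for c in classes:
            values = [v for v, label in zip(column, y) if label == c]
            class_mean = sum(values) / len(values)
            between += len(values) * (class_mean - grand_mean) ** 2
            within += sum((v - class_mean) ** 2 for v in values)
        scores.append(between / (within + 1e-12))  # guard against zero spread
    return scores


def select_genes(X, y, n_genes):
    """Indices of the n_genes highest-scoring genes."""
    scores = bw_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranked[:n_genes]


def fit_centroids(X, y, genes):
    """Mean expression profile per class, restricted to the selected genes."""
    grouped = defaultdict(list)
    for row, label in zip(X, y):
        grouped[label].append([row[g] for g in genes])
    return {label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in grouped.items()}


def predict(centroids, row, genes):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    vec = [row[g] for g in genes]
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])))


def cross_validate(X, y, k, n_genes):
    """Stratified k-fold accuracy with gene selection refit in each fold."""
    correct = 0
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        X_train = [X[i] for i in train_idx]
        y_train = [y[i] for i in train_idx]
        genes = select_genes(X_train, y_train, n_genes)  # training samples only
        centroids = fit_centroids(X_train, y_train, genes)
        correct += sum(predict(centroids, X[i], genes) == y[i] for i in test_idx)
    return correct / len(y)


# Toy data (hypothetical): 3 classes, 4 samples each, 5 genes.
# Genes 0 and 1 separate the classes; genes 2-4 are uninformative.
X = [
    [0.1, 10.2, 3, 7, 1], [0.3, 9.8, 8, 2, 6], [-0.2, 10.1, 5, 5, 9], [0.0, 9.9, 1, 9, 4],
    [5.1, 0.2, 7, 3, 2], [4.8, -0.1, 2, 8, 7], [5.2, 0.1, 9, 1, 5], [4.9, 0.0, 4, 6, 3],
    [10.2, 5.1, 6, 4, 8], [9.9, 4.8, 1, 7, 2], [10.1, 5.2, 3, 2, 6], [9.8, 5.0, 8, 9, 1],
]
y = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
accuracy = cross_validate(X, y, k=4, n_genes=2)
```

Selecting genes once on the full dataset and then cross-validating only the classifier would let the test samples leak into the gene ranking and tends to inflate the accuracy estimate; refitting the selection per fold avoids that. Swapping the nearest-centroid stand-in for an actual multicategory SVM would change only `fit_centroids` and `predict`.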

Year:  2004        PMID: 15374862     DOI: 10.1093/bioinformatics/bti033

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


  133 in total

1.  Computational Analysis of Muscular Dystrophy Sub-types Using A Novel Integrative Scheme.

Authors:  Chen Wang; Sook Ha; Jianhua Xuan; Yue Wang; Eric Hoffman
Journal:  Neurocomputing       Date:  2012-09-01       Impact factor: 5.719

2.  Extension of the visualization tool MapMan to allow statistical analysis of arrays, display of corresponding genes, and comparison with known responses. (Review)

Authors:  Björn Usadel; Axel Nagel; Oliver Thimm; Henning Redestig; Oliver E Blaesing; Natalia Palacios-Rojas; Joachim Selbig; Jan Hannemann; Maria Conceição Piques; Dirk Steinhauser; Wolf-Rüdiger Scheible; Yves Gibon; Rosa Morcuende; Daniel Weicht; Svenja Meyer; Mark Stitt
Journal:  Plant Physiol       Date:  2005-07       Impact factor: 8.340

3.  Formative evaluation of a prototype system for automated analysis of mass spectrometry data.

Authors:  N Fananapazir; M Li; D Spentzos; C F Aliferis
Journal:  AMIA Annu Symp Proc       Date:  2005

4.  Classification algorithms for phenotype prediction in genomics and proteomics. (Review)

Authors:  Habtom W Ressom; Rency S Varghese; Zhen Zhang; Jianhua Xuan; Robert Clarke
Journal:  Front Biosci       Date:  2008-01-01

5.  Effects of SVM parameter optimization on discrimination and calibration for post-procedural PCI mortality.

Authors:  Michael E Matheny; Frederic S Resnic; Nipun Arora; Lucila Ohno-Machado
Journal:  J Biomed Inform       Date:  2007-05-18       Impact factor: 6.317

6.  Analyse multiple disease subtypes and build associated gene networks using genome-wide expression profiles.

Authors:  Sara Aibar; Celia Fontanillo; Conrad Droste; Beatriz Roson-Burgo; Francisco J Campos-Laborie; Jesus M Hernandez-Rivas; Javier De Las Rivas
Journal:  BMC Genomics       Date:  2015-05-26       Impact factor: 3.969

7.  A Multianalyte Panel Consisting of Extracellular Vesicle miRNAs and mRNAs, cfDNA, and CA19-9 Shows Utility for Diagnosis and Staging of Pancreatic Ductal Adenocarcinoma.

Authors:  Zijian Yang; Michael J LaRiviere; Jina Ko; Jacob E Till; Theresa Christensen; Stephanie S Yee; Taylor A Black; Kyle Tien; Andrew Lin; Hanfei Shen; Neha Bhagwat; Daniel Herman; Andrew Adallah; Mark H O'Hara; Charles M Vollmer; Bryson W Katona; Ben Z Stanger; David Issadore; Erica L Carpenter
Journal:  Clin Cancer Res       Date:  2020-04-16       Impact factor: 12.531

8.  Treatment-related features improve machine learning prediction of prognosis in soft tissue sarcoma patients.

Authors:  Jan C Peeken; Tatyana Goldberg; Christoph Knie; Basil Komboz; Michael Bernhofer; Francesco Pasa; Kerstin A Kessel; Pouya D Tafti; Burkhard Rost; Fridtjof Nüsslin; Andreas E Braun; Stephanie E Combs
Journal:  Strahlenther Onkol       Date:  2018-03-20       Impact factor: 3.621

9.  Blood gene expression signatures predict exposure levels.

Authors:  P R Bushel; A N Heinloth; J Li; L Huang; J W Chou; G A Boorman; D E Malarkey; C D Houle; S M Ward; R E Wilson; R D Fannin; M W Russo; P B Watkins; R W Tennant; R S Paules
Journal:  Proc Natl Acad Sci U S A       Date:  2007-11-02       Impact factor: 11.205

10.  Sparse representation for classification of tumors using gene expression data.

Authors:  Xiyi Hang; Fang-Xiang Wu
Journal:  J Biomed Biotechnol       Date:  2009-03-15