
Comparative evaluation of set-level techniques in predictive classification of gene expression samples.

Matěj Holec, Jiří Kléma, Filip Železný, Jakub Tolar.

Abstract

BACKGROUND: Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention, as this approach typically yields more compact and interpretable results than traditional methods that rely on individual genes. The set-level strategy can also be adopted, with similar benefits, in predictive classification tasks accomplished with machine learning algorithms. Initial studies of the predictive performance of set-level classifiers have yielded conflicting results. The goal of this study is to provide a more conclusive evaluation by testing various components of the set-level framework within a large collection of machine learning experiments.
RESULTS: Curated gene sets make better classification features than sets assembled without regard to biological relevance. For selecting the best gene sets for classification, the Global test outperforms the gene-set methods GSEA and SAM-GS as well as two generic feature-selection methods. For aggregating the expression values of a set's genes into a single feature value, both singular value decomposition (SVD) and the SetSig technique improve on simple arithmetic averaging. Set-level classifiers learned with 10 features selected by the Global test slightly outperform baseline gene-level classifiers learned with all original features, although they are slightly less accurate than gene-level classifiers learned after a prior feature-selection step.
CONCLUSION: Set-level classifiers do not boost predictive accuracy; however, they do achieve competitive accuracy if learned with the right combination of ingredients.
AVAILABILITY: Open-source, publicly available software was used for classifier learning and testing. The gene expression datasets and the gene set database used are also publicly available. The full tabulation of experimental results is available at http://ida.felk.cvut.cz/CESLT.
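
The aggregation step mentioned in the results can be illustrated with a minimal sketch. It assumes an expression matrix with samples in rows and genes in columns, and it takes the SVD-based feature to be the projection of each sample onto the first right singular vector of the centered submatrix for a gene set. The abstract does not specify the exact variant used in the study, so this is one common reading, not the authors' implementation. The function names, the toy data, and the gene-set names are all hypothetical. SetSig is not shown.

```python
import numpy as np

def aggregate_mean(expr, gene_sets):
    """Set-level features by arithmetic averaging.

    expr: array of shape (n_samples, n_genes)
    gene_sets: dict mapping a set name to a list of gene column indices
    Returns an array of shape (n_samples, n_sets).
    """
    return np.column_stack(
        [expr[:, idx].mean(axis=1) for idx in gene_sets.values()]
    )

def aggregate_svd(expr, gene_sets):
    """Set-level features from the first singular component of each set.

    Assumed variant: center each gene within the set, then project the
    samples onto the first right singular vector. The sign of each
    feature is arbitrary, as is usual for SVD.
    """
    feats = []
    for idx in gene_sets.values():
        sub = expr[:, idx]
        centered = sub - sub.mean(axis=0)
        _, _, vt = np.linalg.svd(centered, full_matrices=False)
        feats.append(centered @ vt[0])
    return np.column_stack(feats)

# Toy data (hypothetical): 6 samples, 5 genes, 2 gene sets.
rng = np.random.default_rng(0)
expr = rng.normal(size=(6, 5))
gene_sets = {"set_a": [0, 1, 2], "set_b": [3, 4]}

mean_feats = aggregate_mean(expr, gene_sets)
svd_feats = aggregate_svd(expr, gene_sets)
```

Either feature matrix has one column per gene set and could then be passed to any standard classifier in place of the gene-level matrix.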

Year:  2012        PMID: 22759420      PMCID: PMC3382436          DOI: 10.1186/1471-2105-13-S10-S15

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


