BACKGROUND: Analysis of gene expression data in terms of a priori-defined gene sets has recently received significant attention, as this approach typically yields more compact and interpretable results than traditional methods that rely on individual genes. The set-level strategy can also be adopted, with similar benefits, in predictive classification tasks accomplished with machine learning algorithms. Initial studies of the predictive performance of set-level classifiers have yielded conflicting results. The goal of this study is to provide a more conclusive evaluation by testing the various components of the set-level framework within a large collection of machine learning experiments. RESULTS: Genuine curated gene sets constitute better features for classification than sets assembled without biological relevance. For identifying the gene sets best suited to classification, the Global test outperforms the gene-set methods GSEA and SAM-GS as well as two generic feature-selection methods. For aggregating the expressions of a set's genes into a single feature value, both singular value decomposition (SVD) and the SetSig technique improve on simple arithmetic averaging. Set-level classifiers learned with 10 features selected by the Global test slightly outperform baseline gene-level classifiers learned with all original data features, although they are slightly less accurate than gene-level classifiers learned with a prior feature-selection step. CONCLUSION: Set-level classifiers do not boost predictive accuracy; however, they do achieve competitive accuracy if learned with the right combination of ingredients. AVAILABILITY: Open-source, publicly available software was used for classifier learning and testing. The gene expression datasets and the gene set database used are also publicly available. The full tabulation of experimental results is available at http://ida.felk.cvut.cz/CESLT.
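The SVD-based aggregation step mentioned in the results can be illustrated with a minimal sketch. This is an assumption-laden illustration, not the paper's implementation: the function names, the samples-by-genes matrix layout, and the choice to center each gene before the decomposition are all assumptions. The idea is that each gene set is collapsed into one feature per sample by projecting the samples onto the set's first singular (principal) direction, whereas the baseline simply averages the set's genes.

```python
import numpy as np

def svd_aggregate(expr, gene_idx):
    """Collapse one gene set into a single feature per sample via SVD.

    expr     -- (n_samples x n_genes) expression matrix
    gene_idx -- column indices of the genes belonging to the set
    Returns an n_samples vector: each sample's projection onto the
    set's first singular (principal) direction.
    """
    sub = expr[:, gene_idx]
    sub = sub - sub.mean(axis=0)              # center each gene's profile
    _, _, vt = np.linalg.svd(sub, full_matrices=False)
    return sub @ vt[0]                        # projection onto 1st right singular vector

def mean_aggregate(expr, gene_idx):
    """Baseline aggregation: plain arithmetic averaging over the set's genes."""
    return expr[:, gene_idx].mean(axis=1)
```

Either function maps the original samples-by-genes matrix to a samples-by-gene-sets feature matrix, on which an ordinary classifier can then be trained; the study's comparison concerns which such mapping preserves the most class-relevant signal.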