
The limitations of simple gene set enrichment analysis assuming gene independence.

Pablo Tamayo, George Steinhardt, Arthur Liberzon, Jill P Mesirov.

Abstract

Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently, Irizarry et al. (2009) proposed a simplified approach as a serious contender: it uses a one-sample t-test score to assess enrichment and ignores gene-gene correlations. Their argument criticizes the nonparametric nature of Gene Set Enrichment Analysis and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with the results of Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, because of the significant variance inflation they produce in the enrichment scores, and that they should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods.
© The Author(s) 2012.
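
The variance inflation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' benchmark: it assumes gene-level scores that are standard normal under the null and share one common pairwise correlation rho (an equicorrelation model chosen only for simplicity), and it uses sqrt(m) * mean(z) as a simplified stand-in for the one-sample score. Under those assumptions the set score has variance 1 + (m - 1) * rho rather than 1. The function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_set_scores(m, rho, n_sim, rng):
    """Draw null gene-set scores sqrt(m) * mean(z) for m equicorrelated genes.

    Assumption: each gene-level score z_i is N(0, 1) under the null and every
    pair of genes in the set has correlation rho (0 <= rho < 1).
    """
    shared = rng.standard_normal((n_sim, 1))   # factor common to all genes
    noise = rng.standard_normal((n_sim, m))    # gene-specific noise
    z = np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * noise
    return np.sqrt(m) * z.mean(axis=1)


def variance_inflation_factor(m, rho):
    """Theoretical variance of the set score under the equicorrelation model."""
    return 1.0 + (m - 1) * rho


rng = np.random.default_rng(0)
m, n_sim = 50, 20000
results = {}
for rho in (0.0, 0.05, 0.2):
    scores = simulate_set_scores(m, rho, n_sim, rng)
    # Fraction of null sets called significant if the score is (wrongly)
    # compared against N(0, 1) at the two-sided 5% level.
    false_positive_rate = np.mean(np.abs(scores) > 1.96)
    results[rho] = (scores.var(), variance_inflation_factor(m, rho), false_positive_rate)
    print(f"rho={rho:.2f}  empirical var={results[rho][0]:.2f}  "
          f"theory={results[rho][1]:.2f}  false positives={false_positive_rate:.3f}")
```

With rho = 0 the score behaves like a standard normal variable, while even a modest correlation inflates its variance and the false positive rate of a test that assumes independence. This is the kind of effect an empirical (permutation-based) null distribution is meant to absorb.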

Keywords:  Gene set enrichment analysis; gene expression

Year:  2012        PMID: 23070592      PMCID: PMC3758419          DOI: 10.1177/0962280212460441

Source DB:  PubMed          Journal:  Stat Methods Med Res        ISSN: 0962-2802            Impact factor:   3.021


