Beyond standard pipeline and p < 0.05 in pathway enrichment analyses.

Wentian Li, Andrew Shih, Yun Freudenberg-Hua, Wen Fury, Yaning Yang.

Abstract

A standard pathway/gene-set enrichment analysis, the over-representation analysis, is based on four values: the sizes of two gene-sets, the size of their overlap, and the size of the gene universe from which the gene-sets are chosen. The standard result of such an analysis is based on the p-value of a statistical test. We supplement this standard pipeline with six cautions: (1) any p-value threshold used to distinguish enriched gene-sets from non-enriched ones is to a certain degree arbitrary; (2) genes in a gene-set may be correlated, which potentially overcounts the gene-set size; (3) any attempt to impose multiple testing correction will increase the false negative rate; (4) gene-sets in a gene-set database may be correlated, which potentially overcounts the factor for multiple testing correction; (5) the discrete nature of the data makes it possible that a minimal change in counts leads to a quantum change in the p-value threshold-based conclusion; (6) the two gene-sets may not be chosen from the universe of all human genes, but in fact from a subset of that universe, or even from two different subsets of all genes. Careful reconsideration of these issues can have an impact on the conclusion of an enrichment analysis. Some of our cautions mirror the call from statisticians that reaching a conclusion from data is not a simple matter of a p-value smaller than 0.05, but a thoughtful process with due diligence.
Copyright © 2021 Elsevier Ltd. All rights reserved.
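
Illustrative sketch (not part of the published abstract): the four values named above are enough to compute an over-representation p-value. The abstract does not say which statistical test is used; the code below assumes the upper-tail hypergeometric test (equivalent to a one-sided Fisher exact test), which is a common choice for this kind of analysis. The function name `ora_p_value` and all counts in the demo are hypothetical and chosen only to show how cautions (5) and (6) can be probed.

```python
from math import comb


def ora_p_value(universe: int, set_a: int, set_b: int, overlap: int) -> float:
    """Upper-tail hypergeometric p-value P(X >= overlap).

    Assumption: gene-set B (size set_b) is modelled as a random draw,
    without replacement, from a universe of `universe` genes, of which
    `set_a` belong to gene-set A. X is the size of the overlap.
    """
    if not (0 <= overlap <= min(set_a, set_b) <= max(set_a, set_b) <= universe):
        raise ValueError("inconsistent counts")
    total = comb(universe, set_b)
    largest_overlap = min(set_a, set_b)
    # math.comb returns 0 when k > n, so impossible overlaps contribute nothing.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, largest_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    # Caution (5): changing the overlap by a single gene changes the p-value
    # in a discrete jump, which may move it across a fixed threshold.
    for k in (4, 5, 6):
        print("overlap", k, "p =", ora_p_value(20000, 200, 100, k))

    # Caution (6): the same three counts evaluated against a smaller
    # assumed universe give a different p-value.
    for n in (20000, 10000):
        print("universe", n, "p =", ora_p_value(n, 200, 100, 5))
```

Exact integer arithmetic via `math.comb` avoids any dependency outside the standard library; for very large gene-sets a log-space or library implementation would be faster.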

Keywords:  Gene-set enrichment; Human genes; Pathway analysis; Pipelines; Statistical significance

Year:  2021        PMID: 33774420      PMCID: PMC9179938          DOI: 10.1016/j.compbiolchem.2021.107455

Source DB:  PubMed          Journal:  Comput Biol Chem        ISSN: 1476-9271            Impact factor:   3.737


