Alin Tomoiaga, Peter Westfall, Michele Donato, Sorin Draghici, Sonia Hassan, Roberto Romero, Paola Tellaroli.
Abstract
Identifying the biological pathways that are related to various clinical phenotypes is an important concern in biomedical research. Based on estimated expression levels and/or p-values, over-representation analysis (ORA) methods provide rankings of pathways, but they are tainted because pathways overlap. This crosstalk phenomenon has not been rigorously studied and classical ORA does not take into consideration: (i) that crosstalk effects in cases of overlapping pathways can cause incorrect rankings of pathways, (ii) that crosstalk effects can cause both excess type I errors and type II errors, (iii) that rankings of small pathways are unreliable and (iv) that type I error rates can be inflated due to multiple comparisons of pathways. We develop a Bayesian hierarchical model that addresses these problems, providing sensible estimates and rankings, and reducing error rates. We show, on both real and simulated data, that the results of our method are more accurate than the results produced by the classical over-representation analysis, providing a better understanding of the underlying biological phenomena involved in the phenotypes under study. The R code and the binary datasets for implementing the analyses described in this article are available online at: http://www.eng.wayne.edu/page.php?id=6402.
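To make the crosstalk problem concrete, the following is a minimal sketch of classical ORA as it is commonly implemented, using an upper-tail hypergeometric test. It is not the authors' Bayesian hierarchical model. The gene names, the two pathways, the universe size and the helper `ora_p_value` are all hypothetical and chosen only for illustration.

```python
from math import comb


def ora_p_value(pathway, de_genes, universe_size):
    """Upper-tail hypergeometric p-value: P(X >= observed DE genes in pathway).

    Assumes every gene in `pathway` and `de_genes` belongs to a universe of
    `universe_size` genes (an assumption of this sketch).
    """
    n_de = len(de_genes)
    n_path = len(pathway)
    observed = len(pathway & de_genes)
    total = comb(universe_size, n_de)
    tail = sum(
        comb(n_path, k) * comb(universe_size - n_path, n_de - k)
        for k in range(observed, min(n_path, n_de) + 1)
    )
    return tail / total


# Hypothetical toy data: 20 genes, 5 of them differentially expressed (DE).
universe_size = 20
de_genes = {"g0", "g1", "g2", "g3", "g4"}

# Pathway A contains all DE genes; pathway B shares four of them with A.
pathway_a = {"g0", "g1", "g2", "g3", "g4", "g5"}
pathway_b = {"g1", "g2", "g3", "g4", "g10", "g11"}

p_a = ora_p_value(pathway_a, de_genes, universe_size)
p_b = ora_p_value(pathway_b, de_genes, universe_size)

# Naive check: score B using only the genes it does not share with A.
p_b_unique = ora_p_value(pathway_b - pathway_a, de_genes, universe_size)

print(f"A: {p_a:.4g}  B: {p_b:.4g}  B without shared genes: {p_b_unique:.4g}")
```

In this toy setting pathway B scores below 0.05 even though every DE gene it contains is shared with pathway A, and the genes unique to B carry no signal at all (p = 1). Dropping shared genes is only a crude diagnostic used here to expose the effect; it is not a substitute for a model that handles overlap directly.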
Keywords: Bayes model; data augmentation; gene expression; genomic pathway analysis; hierarchical modelling
Year: 2016; PMID: 33456621; PMCID: PMC7810237; DOI: 10.1007/s12561-016-9160-1
Source DB: PubMed; Journal: Stat Biosci; ISSN: 1867-1764