Laurence de Torrente, Samuel Zimmerman, Deanne Taylor, Yu Hasegawa, Christine A Wells, Jessica C Mar.
Abstract
Identifying the pathways that control a cellular phenotype is the first step to building a mechanistic model. Recent examples in developmental biology, cancer genomics, and neurological disease have demonstrated how changes in the variability of gene expression can highlight important genes that are under different degrees of regulatory control. Simple statistical tests exist to identify differentially variable genes; however, methods for investigating changes in gene expression variability in the context of pathways and gene sets are under-explored. Here we present pathVar, a new method that provides functional interpretation of gene expression variability changes at the level of pathways and gene sets. pathVar is based on a multinomial exact test, or an asymptotic Chi-squared test as a more computationally efficient alternative. The method can be used for gene expression studies from any technology platform in all biological settings, either with a single phenotypic group or with two-group comparisons. To demonstrate its utility, we applied the method to a diverse set of diseases, species and samples. Results from pathVar are benchmarked against analyses based on average expression and two methods of GSEA, and demonstrate that analyses using both statistics (average expression and expression variability) are useful for understanding transcriptional regulation. We also provide recommendations for the choice of variability statistic that have been informed through analyses on simulations and real data. Based on the datasets selected, we show how pathVar can be used to gain insight into expression variability of single cell versus bulk samples, different stem cell populations, and cancer versus normal tissue comparisons.
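To make the idea of a multinomial exact test concrete, the following is a minimal Python sketch of how a one-group comparison could work: genes are binned into variability levels, and the counts for one pathway are compared with reference proportions. It is an illustration under stated assumptions, not the pathVar implementation (pathVar itself is an R package). The three levels, the reference proportions, the gene counts, and all function names are hypothetical.

```python
from math import factorial, prod


def compositions(n, k):
    """Yield every way to place n genes into k ordered variability levels."""
    if k == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


def multinomial_pmf(counts, probs):
    """Probability of one vector of level counts under the given proportions."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


def exact_multinomial_pvalue(observed, ref_probs):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    p_obs = multinomial_pmf(observed, ref_probs)
    n, k = sum(observed), len(observed)
    total = 0.0
    for counts in compositions(n, k):
        p = multinomial_pmf(counts, ref_probs)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
            total += p
    return min(total, 1.0)


# Hypothetical numbers (assumptions, not taken from the paper): reference
# proportions for low/medium/high variability, and the counts of one small
# pathway's genes in the same three levels.
reference = (0.25, 0.50, 0.25)
pathway_counts = (4, 0, 0)
p_value = exact_multinomial_pvalue(pathway_counts, reference)
print(f"exact multinomial p-value: {p_value:.6f}")
```

The enumeration visits every possible count vector, so its cost grows quickly with the number of genes and levels. This is consistent with the abstract's description of the asymptotic Chi-squared test as the more computationally efficient alternative.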
Keywords: Bioinformatics; Cellular heterogeneity; Functional genomics; Gene expression variability; Single cell analysis; Transcriptional regulation
Year: 2017 PMID: 28560097 PMCID: PMC5444375 DOI: 10.7717/peerj.3334
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. The distribution of gene expression variability highlights the regulatory control that different genes in the pathway are subjected to.
(A) Absolute gene expression is a proxy for how genes are transcriptionally regulated between samples. Studying the consistency of how genes are expressed can also add information on pathway control e.g., lower levels of inter-individual variability may reflect increased regulatory control. (B) By considering the distribution of gene expression variability, we may be able to understand transcriptional regulation in a more comprehensive manner—this is the premise of the pathVar method.
Figure 2. Overview of pathVar, including the main functions in the R package.
Figure 3. Example of four significant KEGG pathways for one-group pathVar analysis of the Bock embryonic stem cell data.
(A) Variability count distribution for the reference. (B) Spliceosome pathway (hsa03040), (C) oxidative phosphorylation (hsa00190), (D) ECM-receptor interaction (hsa04512). The red stars indicate a significant difference between the pathway and reference distribution for a specific level of expression variability.
Figure 4. Example of two significant KEGG pathways when comparing human embryonic stem cell (ESC) and induced pluripotent stem cell (iPSC) data using the two-group pathVar analysis.
(A) Oxidative phosphorylation (hsa00190), (B) DNA replication (hsa03030). In both pathways, a higher number of genes with lower variability are present in ESCs versus iPSCs. The red stars indicate a significant difference between the two groups for a specific level of expression variability.
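A two-group comparison of this kind can be illustrated with an asymptotic Chi-squared test on the counts of pathway genes per variability level in each group. The Python sketch below is a hedged illustration only, not the pathVar code: the counts are invented, the function names are hypothetical, and it treats the two count vectors as independent samples, which is a simplifying assumption.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with df degrees of
    freedom, using the power series for the lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def two_group_chi2(counts_a, counts_b):
    """Chi-squared test of homogeneity on a 2 x k table of level counts.

    Assumes both groups contain at least one gene.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    grand = n_a + n_b
    stat, used_levels = 0.0, 0
    for a, b in zip(counts_a, counts_b):
        col = a + b
        if col == 0:
            continue  # a level that is empty in both groups carries no information
        used_levels += 1
        exp_a = n_a * col / grand
        exp_b = n_b * col / grand
        stat += (a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b
    df = used_levels - 1
    return stat, df, chi2_sf(stat, df)


# Hypothetical counts of one pathway's genes at low/medium/high variability
# in each group (invented for illustration, not values from the paper).
esc_counts = (6, 3, 1)
ipsc_counts = (2, 3, 5)
stat, df, p_value = two_group_chi2(esc_counts, ipsc_counts)
print(f"chi-squared = {stat:.4f}, df = {df}, p = {p_value:.4f}")
```

The asymptotic approximation is cheap to compute but becomes unreliable when expected counts per level are small; in that situation an exact test is the safer choice.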