Joseph G Azofeifa, Mary A Allen, Josephina R Hendrix, Timothy Read, Jonathan D Rubin, Robin D Dowell.
Abstract
Transcription factors (TFs) exert their regulatory influence through the binding of enhancers, resulting in coordination of gene expression programs. Active enhancers are often characterized by the presence of short, unstable transcripts termed enhancer RNAs (eRNAs). While their function remains unclear, we demonstrate that eRNAs are a powerful readout of TF activity. We infer sites of eRNA origination across hundreds of publicly available nascent transcription data sets and show that eRNAs initiate from sites of TF binding. By quantifying the colocalization of TF binding motif instances and eRNA origins, we derive a simple statistic capable of inferring TF activity. In doing so, we uncover dozens of previously unexplored links between diverse stimuli and the TFs they affect.
Year: 2018 PMID: 29449408 PMCID: PMC5848612 DOI: 10.1101/gr.225755.117
Source DB: PubMed Journal: Genome Res ISSN: 1088-9051 Impact factor: 9.043
Figure 1. Enhancer RNA (eRNA) presence marks the active subset of TF binding. (A) ROC analysis of TF binding site prediction via eRNA presence. False-positive and true-positive rates are varied by thresholding the penalized likelihood ratio statistic generated from Tfit. (B) TF binding peaks (Supplemental Table S1) were grouped according to eRNA association. A box-and-whiskers plot displays the median and variability in the proportion of histone mark association between the groups across all TFs (Supplemental Table S1). Asterisks indicate a P-value <10⁻¹⁰ by z-test. All data in A and B are from K562 cells. (C) Pairwise cell type–associated TF binding peaks were grouped according to eRNA presence from matched cell types (Supplemental Table S2). A gene was considered “neighboring” at a distance <10 kb. (D) Log base 10 FPKM fold change of “neighboring” genes related to eRNA-grouped NR2F2 binding peaks. (E) Histogram of log base 10 FPKM fold change of “neighboring” genes for all possible eRNA-grouped TF ChIP-seq data sets (n = 255).
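The thresholding described for panel A can be illustrated with a minimal Python sketch. It is not Tfit and not the authors' code: `scores` stands in for a per-site statistic such as the penalized likelihood ratio, `labels` marks whether a site overlaps a TF binding peak, and both names and the toy values are assumptions made for illustration.

```python
def roc_points(scores, labels):
    """Sweep a threshold over scores and return (FPR, TPR) pairs.

    scores: hypothetical per-site statistic (higher = stronger eRNA call).
    labels: 1 if the site overlaps a TF binding peak, 0 otherwise.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    if pos == 0 or neg == 0:
        raise ValueError("need at least one positive and one negative site")
    points = [(0.0, 0.0)]  # threshold above every score: nothing is called
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Trapezoidal area under a list of (FPR, TPR) points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (invented): two bound sites score higher than two unbound sites.
toy_points = roc_points([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
toy_auc = auc(toy_points)
```

Lowering the threshold calls more sites, so both rates can only rise as the sweep proceeds; the curve ends at (1, 1) once every site is called.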
Figure 2. Motif colocalization with eRNA origins varies by cell type. (A) An example locus of GRO-seq, the inferred eRNA origin, and computation of “motif displacement” (MD) and the associated MD-score. (B) Each row is a TF motif model, and each column is a bin of a histogram (100 bins) where heat is proportional to the frequency of a motif instance at that distance from an eRNA origin. (C) A comparison between the expected MD-score for a motif model (x-axis) and the observed MD-score in a K562 GRO-cap experiment (Core et al. 2014). Red and green dots indicate MD-scores significantly above or below expectation, respectively (P-value <10⁻⁶). (D) MD-scores were computed and ranked under six nascent transcription data sets. (E) Each row corresponds to a nascent data set, and each column relates to motif frequency. These MD distributions are shown for two demonstrative examples (JUND and CLOCK) and the associated MD-scores, sorted by publication.
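As a rough illustration of the motif displacement idea in panel A, the Python sketch below computes a score as the fraction of motif instances close to an eRNA origin among those within a wider window. Several details are assumptions, not taken from this record: the inner radius `small` of 150 bp, the use of each motif's nearest origin only, and the function name `md_score`. The outer radius of 1,500 bp mirrors the 1.5 kb window mentioned in the Figure 3 caption.

```python
from bisect import bisect_left


def md_score(origins, motifs, small=150, large=1500):
    """Fraction of motif instances within `small` bp of the nearest eRNA
    origin, among those within `large` bp.

    origins, motifs: genomic coordinates on one chromosome (integers).
    small: assumed inner radius in bp (hypothetical default).
    large: outer radius in bp (1.5 kb).
    Returns None when no motif falls inside the outer window.
    """
    origins = sorted(origins)
    if not origins:
        return None
    near = total = 0
    for m in motifs:
        i = bisect_left(origins, m)
        # Only the origins flanking the motif can be the nearest one.
        flanking = origins[max(i - 1, 0):i + 1]
        displacement = min(abs(m - o) for o in flanking)
        if displacement <= large:
            total += 1
            if displacement <= small:
                near += 1
    return near / total if total else None


# Toy coordinates (invented): displacements are 50, 100, 1000, 1400, 3000 bp.
toy_score = md_score([1000, 10000], [1050, 900, 2000, 11400, 13000])
```

With these assumed radii, motif positions spread uniformly over the outer window would give a ratio near `small / large`, so values well above that suggest motifs concentrated at eRNA origins.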
Figure 3. MD-scores predict TF activity. (A, top) The MD distribution, MD-score, and the number of motifs within 1.5 kb of any eRNA origin before and after stimulation with Nutlin-3a (abbreviated Nutlin) on TP53 (Allen et al. 2014), the TF known to be activated. (Bottom) For all motif models (each dot), the change in MD-score (ΔMDS) following perturbation (y-axis) relative to the number of motifs within 1.5 kb of any eRNA origin (x-axis). Red points indicate significantly increased or decreased MD-scores (P-value <10⁻⁶). Similar analysis for TNF activation of the NF-κB complex (B) (Luo et al. 2014) and estradiol activation of estrogen receptor (ESR1; C) (Hah et al. 2013). (D) A time series data set following treatment with flavopiridol (Jonkers et al. 2014). The y-axis indicates the MD-score change relative to time point zero. Blue dots indicate a significant MD-score difference (P-value <10⁻⁶). A darker shaded line indicates a time trajectory with at least one significant MD-score. (E) Time series data set following treatment with Kdo2-lipid A (KLA) where each time point is normalized to time-matched DMSO (Kaikkonen et al. 2014). Therefore, the y-axis indicates MD-score difference relative to the time point–matched DMSO sample. NCBI Sequence Read Archive (SRA) SRR numbers of these comparisons are outlined in Supplemental Table S4.
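One way to reason about the ΔMDS comparison is to treat each MD-score as a proportion and compare the two conditions with a pooled two-proportion z-test. The captions here do not state which test was used for ΔMDS, so the choice of test, the function name `delta_md`, and the toy counts below are all assumptions. Only the 10⁻⁶ significance threshold comes from the caption.

```python
import math


def delta_md(near_a, total_a, near_b, total_b):
    """Change in MD-score from condition A to B, with a two-sided P-value.

    near_*:  motif instances close to an eRNA origin in each condition.
    total_*: motif instances within 1.5 kb of any eRNA origin.
    Uses a pooled two-proportion z-test (an assumed choice of test).
    Returns (delta, z, p_value).
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both conditions need at least one motif instance")
    score_a = near_a / total_a
    score_b = near_b / total_b
    pooled = (near_a + near_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:  # both proportions are exactly 0 or exactly 1
        return score_b - score_a, 0.0, 1.0
    z = (score_b - score_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return score_b - score_a, z, p_value


# Toy counts (invented): the score rises from 0.10 to 0.30 after treatment.
toy_delta, toy_z, toy_p = delta_md(100, 1000, 300, 1000)
toy_significant = toy_p < 1e-6  # threshold quoted in the caption
```

Because the standard error shrinks as the motif counts grow, the same ΔMDS is more likely to reach significance for motifs with many instances near eRNA origins, which is one reason to plot ΔMDS against motif count as in the bottom panels.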