Literature DB >> 28190769

Paf1 Has Distinct Roles in Transcription Elongation and Differential Transcript Fate.

Harry Fischl¹, Françoise S Howe¹, Andre Furger¹, Jane Mellor².

Abstract

RNA polymerase II (Pol2) movement through chromatin and the co-transcriptional processing and fate of nascent transcripts is coordinated by transcription elongation factors (TEFs) such as polymerase-associated factor 1 (Paf1), but it is not known whether TEFs have gene-specific functions. Using strand-specific nucleotide resolution techniques, we show that levels of Paf1 on Pol2 vary between genes, are controlled dynamically by environmental factors via promoters, and reflect levels of processing and export factors on the encoded transcript. High levels of Paf1 on Pol2 promote transcript nuclear export, whereas low levels reflect nuclear retention. Strains lacking Paf1 show marked elongation defects, although low levels of Paf1 on Pol2 are sufficient for transcription elongation. Our findings support distinct Paf1 functions: a core general function in transcription elongation, satisfied by the lowest Paf1 levels, and a regulatory function in determining differential transcript fate by varying the level of Paf1 on Pol2.

Entities: Chemical Disease Gene Mutation Species

Keywords: NET-seq; Paf1; RNA polymerase II; Saccharomyces cerevisiae; TEF-seq; differential mRNP nuclear export; lncRNA; mRNA; nucleosome; transcription elongation

Mesh：

Substances：

Year: 2017 PMID： 28190769 PMCID： PMC5316414 DOI： 10.1016/j.molcel.2017.01.006

Source DB: PubMed Journal: Mol Cell ISSN： 1097-2765 Impact factor: 17.970

Introduction

Nuclear export of transcripts via the nuclear pore complex is a key step in eukaryotic gene expression (Wickramasinghe and Laskey, 2015). Although the machinery controlling RNA export is highly conserved, there are many examples of selective RNA export. However, the regulatory processes controlling preferential export of some RNAs over others are not understood. One well studied example of selective RNA export in yeast concerns the generally distinct fate of mRNAs compared with long non-coding RNAs (lncRNAs) (Tuck and Tollervey, 2013). Currently, the distinction between mRNAs and lncRNAs is believed to be made during 3′ end formation of the transcript, following transcription elongation but prior to the acquisition of export competence, with the length of the poly(A) tail being a determinant for this process (Tuck and Tollervey, 2013). The lncRNAs, and some 20% of mRNAs that behave like lncRNAs, are preferentially captured and retained by Mtr4-dependent surveillance. These transcripts are subsequently degraded in the nucleus by the nuclear exosome and are characterized by their reduced association with the conserved nuclear-cytoplasmic export factor Mex67/NXF1/TAP (tandem affinity purification) (Tuck and Tollervey, 2013). Efficient nuclear export of a transcript requires RNA binding proteins such as Nab2, controlling poly(A) tail length, and an adaptor protein such as Yra1/REF, linking the transcript to Mex67 and the nuclear pore complex (Baejen et al., 2014, Hieronymus and Silver, 2003, Tuck and Tollervey, 2013). However, there are likely to be multiple parallel export pathways, with the export of particular mRNAs occurring at different rates in association with particular combinations of poly(A) binding proteins and adaptors. Mex67 links elongation to transcript export as co-transcriptionally deposited monoubiquitylation of lysine 123 on histone H2B (H2Bub) facilitates recruitment of Mex67 to the transcript (Vitaliano-Prunier et al., 2012). The deposition of H2Bub requires the Paf1 complex (Paf1C), containing Paf1 (RNA polymerase II [Pol2]-associated factor 1), in the Pol2 transcription elongation complex (TEC) (Tomson and Arndt, 2013, Yang et al., 2016). Paf1C was originally described as interacting with the TATA binding protein, the transcription elongation factors (TEFs) Spt4-Spt5, and facilitates chromatin transcription (FACT), and, in addition to controlling deposition of H2Bub, Paf1C has been implicated in many aspects of gene regulation (Kowalik et al., 2015, Poli et al., 2016, Tomson and Arndt, 2013, Yang et al., 2016), including release of the promoter-proximal paused Pol2 into productive elongation (Chen et al., 2015, Yu et al., 2015) and transcription elongation on DNA or chromatin templates (Rondón et al., 2004, Tous et al., 2011). Consistent with a direct role in elongation, chromatin immunoprecipitation (ChIP) reveals accumulation of Paf1 over the transcribed region of genes (Mayer et al., 2010). Currently, our understanding of the relationship between TEFs such as Paf1 and elongating Pol2 is limited because ChIP, the technique used to capture TEF interactions in chromatin, cannot resolve the DNA strand involved (Mahony and Pugh, 2015). This problem is compounded by recent observations showing that budding yeast, fly, and mammalian genomes are extensively transcribed, often on both strands of DNA (reviewed in Mellor et al., 2016, and Murray and Mellor, 2016). To address exactly how Paf1 influences transcription elongation and transcript fate, we applied nucleotide resolution strand-specific techniques to map (1) Pol2 on genes in the absence of Paf1 (by native elongating transcript sequencing [NET-seq]) and (2) Paf1 association with elongating Pol2 (by TEF-associated nascent elongating transcript sequencing [TEF-seq], an adaption of NET-seq). Here we show that Paf1 controls the differential nuclear export of mRNAs and preferential nuclear retention of the majority of lncRNAs and some mRNAs. Paf1 shows selective enrichment on, or depletion from, Pol2-transcribing RNAs that are also selectively enriched or depleted for 3′ end processing and export factors. Paf1 enrichment/depletion on Pol2 is dynamic and controlled by environmental factors via promoters, suggesting that the fate of transcripts is equally dynamic and determined before their transcription. Paf1 also functions to control Pol2 distribution on genes during elongation, but this function is common to all genes and is distinct from its role in orchestrating differential nuclear export of specific transcripts. Together, our findings support distinct functions for Paf1C associated with Pol2, a core function in Pol2 elongation at all genes and a regulatory function determining differential transcript fate at selected genes.

Results

Mapping Strand-Specific Paf1 Association with Pol2 Using TEF-Seq

To map the position of Paf1 on elongating Pol2, we immunoprecipitated) FLAG-tagged Paf1 and sequenced the nascent RNA, purified from the active site of co-immunoprecipitated Pol2, from the 3′ end (Figure 1A; Tables S1 and S2; Figures S1 and S2). This technique, called TEF-seq, is related to NET-seq, which maps the position of the 3′ nucleotide of the nascent RNA in transcriptionally engaged Pol2 (Churchman and Weissman, 2011). Alignment of reads shows the position of the final incorporated nucleotide, giving a strand-specific nucleotide resolution map of transcriptionally engaged Pol2 associated with Paf1. Replicates show that TEF-seq data are reproducible (Figure S1H). To ensure that our map (Figure 1B) reports only the signal from Pol2 associated with Paf1, we used two important controls. First, we removed the background signal that is not specific to tagged Paf1 using material present in an immunoprecipitation (IP) from a strain lacking a FLAG tag (no-tag control; STAR Methods; Figures S1E–S1G). Second, to show that the RNA recovered is specifically associated with Pol2, we subjected Paf1 to single or sequential IP procedures (Paf1-FLAG or Rpb3-FLAG and then Paf1-HA) and normalized to levels of Pol2 also subjected to single or sequential IPs (Rpb2-FLAG or Rpb3-FLAG and then Rpb2-HA) by tagging the Rpb3 and/or Rpb2 components of Pol2 (Figure 1; STAR Methods). The Pol2-normalized Paf1 metagene profiles for the single and sequential IPs are similar, suggesting that TEF-seq captures only Paf1 associated with elongating Pol2 and not from a Pol2-independent but direct association of the Paf1C with RNA. Thus, we are capturing Paf1 specifically associated with Pol2, although Paf1 could be associated with Pol2 via the transcript emanating from its active site (Dermody and Buratowski, 2010), via the phosphorylated C-terminal domain (CTD) of its largest subunit, Rpb1 (Custódio and Carmo-Fonseca, 2016, Qiu et al., 2012) or via another TEF (Tomson and Arndt, 2013).

Figure 1

Mapping Paf1 on Pol2 Using TEF-Seq

(A) Schematic showing single or sequential immunoprecipitation of Pol2 (via the components Rpb3 or Rpb2, NET-seq) or TEF:Pol2 complexes (TEF-seq) via epitope tags from cryogenically lysed and DNase I-treated yeast extracts.

(B) Examples of Rpb3 and Rpb2 NET-seq and Paf1 TEF-seq profiles on individual genes. Images in integrated genome viewer (IGV) (Thorvaldsdóttir et al., 2013) show loci on the Watson (top, filled arrows) or Crick (bottom, open arrows) strands. Scales allow relative changes to be visualized and are proportional to each other across all IGV images.

(C) Comparison of Rpb2-normalized Paf1 metagene profiles from single (blue) or sequential (black) IP procedures aligned at the TSS or PAS (dashed line, right). Pol2-normalized (using either Rpb2 or Rpb3 NET-seq data) TEF metagene profiles are calculated as follows. TEF/Pol2 (TEF-seq counts/NET-seq counts) ratios are calculated for each 10-nt bin across the specified region. For TSS-aligned plots, the final 200 nt of each gene before the TES is excluded. For PAS or TES-aligned plots, the initial 500 nt after the TSS of each gene is excluded. These ratios are multiplied by a TEF and Pol2 subunit-specific scaling factor to enable visualization on the same arbitrary linear scale. The mean (scaling factor × TEF)/Pol2 ratios are then calculated for each 10-nt bin across all genes after first excluding the top 10% and bottom 10% of ratios to prevent single-nucleotide spikes from skewing the mean (see STAR Methods for details).

See also Tables S1 and S2 and Figures S1 and S2.

Paf1 Oscillates on Pol2 during Transcription through Chromatin

The association of Paf1 with elongating Pol2 is not uniform but shows a remarkable periodicity (∼170 nt), reminiscent of phased positioned nucleosomes (Jiang and Pugh, 2009; Figures 1C and 2A). To examine this in more detail, TEF-seq profiles were obtained for the Pol2-associated histone chaperones Spt6 and Spt16 (Tables S1 and S2; Figures S1 and S2). The Rpb3-normalized metagene profiles for Spt6, Spt16, and Paf1 revealed their dynamic behavior during transcription through chromatin (Figure 2). Relative to the center point of nucleosomes +2 to +5, these factors oscillate around nucleosomes, with peaks and troughs separated by ∼80 nt, offset for Spt6 compared with Spt16 and Paf1, which are known to interact (Krogan et al., 2004, Squazzo et al., 2002). The marked phasing offset between Spt6 and Spt16 oscillations may reflect their preferential affinities for H3 or H2A-H2B dimers, respectively (Bortvin and Winston, 1996, Hondele et al., 2013). The ability to detect the dynamics of transcription through chromatin reflects the high resolution of the TEF-seq data.

Figure 2

Levels of Paf1 Oscillate on Pol2

(A) Rpb3-normalized TEF metagene profiles for Paf1, Spt6, and Spt16 from the TSS to +1400 nt averaged across mRNA genes (see Figure 1C for details).

(B) Rpb3-normalized TEF metagene profiles aligned to the center (dashed line) of nucleosomes +2 to +5 (relative to the TSS) averaged across mRNA genes.

(C) p Values (one-tailed Student’s t test) calculated to test whether there is a significant difference in the mean TEF/Rpb3 ratio at the center compared with the center +50 nt of each nucleosome.

(D) Schematic showing five stages of the oscillations between Pol2, Spt6, Spt16, Paf1, and nucleosomes (the green circle is Paf1 and Spt16, and purple is Spt6). The equivalent positions (1–5) are shown in (B).

See also Tables S1 and S2 and Figures S1 and S2.

Paf1 Is Enriched and Depleted on Pol2

The high resolution of these data gives us the opportunity to study Paf1 levels on Pol2 transcribing different genes. TEF-seq reads for Paf1 from the transcript start site (TSS) + 400 nt to the transcript end site (TES) − 200 nt were counted, thus excluding regions in which Paf1 shows large changes in its association with Pol2. Differential occupancy was assessed by comparing the Paf1-specific counts to counts for Rpb3 across the same regions of mRNA genes. Paf1 shows a large range in relative occupancy levels, with 138 genes significantly (p < 0.05) Paf1-enriched and 153 Paf1-depleted (Figures 3A and 3B; Table S3; see STAR Methods for selection of gene groups for statistical analysis). Paf1 is also depleted from Pol2 on lncRNA genes regardless of whether they encode stable or unstable transcripts (Figures 3C and 3D). No significant difference in transcript levels between the Paf1-FLAG- and Rpb3-FLAG-tagged strains was evident for Paf1-enriched or -depleted genes (Figure S3), showing that the Paf1 and Rpb3 epitope tags are not responsible for the differential levels observed. Finally, enrichment and depletion of Paf1 is also observed at the same sets of genes when the read counts for Paf1 in the single or sequential IPs are normalized to read counts for Rpb2 (instead of Rpb3) in single or sequential IPs, respectively (Figures 3B and 3D).

Figure 3

Differential Levels of Paf1 on Pol2 Are Determined by the Promoter and CTD Phosphorylation Status

(A and B) Differential Paf1 occupancy at mRNA genes.

(A) Scatterplot showing the base mean occupancy against the log2 relative Paf1-to-Rpb3 occupancy ratio for each gene. Yellow points, 0.05 ≤ p < 0.1. Red points, p < 0.05.

(C and D) Differential Paf1 occupancy at lncRNA (triangles) and mRNA (circles) genes.

(E) Sequence motifs enriched within 500 nt upstream of the most significantly (p < 0.1) Paf1-enriched genes. Discriminative Regular Expression Motif Elicitation (DREME) (Bailey, 2011) was used to uncover and display the motifs.

(F) Bar chart showing Paf1 enrichment on Pol2 (Rpb3) for the hybrid constructs indicated. The URA3 coding region replaces that of CMK2 or SLC1 (the endogenous genes are Paf1-enriched and Paf1-depleted, respectively). Error bars show the SEM (n = 2).

(B, D, G, and H) Pol2-normalized TEF metagene profiles from the TSS to +800 nt (D and G) or +1,200 nt (B and H) averaged across all mRNA genes (black) and all lncRNA genes (purple, D and G) or the 1,000 most Paf1-enriched or -depleted mRNA genes (B and H), selected by ordering the genes by their log2 relative Paf1-to-Rpb3 occupancy ratio (see Figure 1C for details). Profiles are smoothed by calculating a running mean average using a moving window containing five 10-nt bins. p Values (one-tailed Welch’s t test) are calculated to test whether there is a significant difference in the ratios of the two groups at each 10-nt bin from 400–700 nt (D and G) or 400–1,100 nt (B and H). The upper quartile of these is taken, meaning that 75% of p values are lower than that shown.

Genes with Paf1-Enriched Pol2 Share Regulatory Features

Paf1-enriched mRNA genes share distinctive features not seen in Paf1-depleted mRNA genes. First, there is a significant enrichment (p < 0.00023) for divergently expressed gene pairs, 100 pairs within the 1,200 most Paf1-enriched genes, selected by ordering the genes by their log2 relative Paf1-to-Rpb3 ratio. This suggests that the shared bi-directional promoter between these divergent pairs, which are often, but not always, co-regulated (Yan et al., 2015), may be responsible for promoting Paf1 enrichment on both genes. This enrichment for divergent pairs is also present within the original class of 138 significantly (p < 0.05) Paf1-enriched genes (four pairs, p < 0. 017). There is no significant enrichment for divergent pairs within the 1,200 most Paf1-depleted genes (70 pairs, p = 0.73). Second, Paf1-enriched genes share regulatory features (Figure 3E; Figure S3). They are regulated by transcription factors such as Sfp1, Haa1, and Cbf1/Cpf1 (a MYC homolog) (Hu et al., 2007, Keller et al., 2001, Mellor et al., 1990). Interestingly, degradation of promoter-bound MYC releases PAF1C for transfer onto Pol2 during elongation in mammalian cells (Jaenicke et al., 2016). In addition, the promoters of significantly (p < 0.1) Paf1-enriched genes are enriched with DNA sequence elements within 500 nt of the TSS, such as AAAATTTT/C (104 of 229, p = 1.9 × 10−9) and CTCATCG/T (81 of 229, p = 6.4 × 10−9), binding sites for Sfp1 and Dot6/Tod6, respectively, often present together in promoters (Jorgensen et al., 2002). This supports a role for the promoter in determining Paf1 enrichment on Pol2. To validate this, we replaced the coding regions of Paf1-enriched CMK2 and Paf1-depleted SLC1 with URA3 and immunoprecipitated the transcripts associated with FLAG-tagged Rpb3 or FLAG-tagged Paf1. Assessment of levels of nascent transcripts immunoprecipitated from the hybrid genes revealed that the Pol2 transcribing URA3 from the promoter of the Paf1-enriched gene is more Paf1-enriched than when transcribing URA3 from the promoter of the Paf1-depleted gene (Figure 3F). This further implicates the promoter in determining levels of Paf1 on Pol2. Next we asked how the promoter influences enrichment or depletion of Paf1 on Pol2 transcribing different genes.

Differential Levels of Paf1 on Pol2 Reflect Levels of Serine 2 and 5 Phosphorylation of the Pol2 CTD

One feature of lncRNA genes, transcribed by Paf1-depleted Pol2, is that Pol2 shows reduced levels of Ser2-phosphorylated CTD on Rpb1 (Milligan et al., 2016, Murray et al., 2015). This is consistent with the documented role for Paf1C in determining levels of Ser2 phosphorylation on the Rpb1 CTD in Pol2 (Dronamraju and Strahl, 2014). Because Paf1C can also interact with Pol2 via its Ser2-phosphorylated CTD (Qiu et al., 2012), this positive feedback provides a possible explanation for how differential recruitment, controlled by the promoter, is maintained during transcription. To relate enrichment or depletion from Pol2 to the phosphorylation status of the CTD, we used TEF-seq to assess levels of six additional TEFs on Pol2 (Tables S1 and S2). These include (1) Set2, the lysine 36 on histone H3 (H3K36) methyltransferase, and Pcf11, required for 3′ end formation, both of which, like Paf1, show some dependence on Ser2-phosphorylated CTD for their association with Pol2; (2) Spt6 and Spt16, which are known to interact with Paf1; (3) Cet1, a capping enzyme that interacts with Ser5-phosphorylated CTD; and (4) Ssu72, a Ser5 phosphatase (Figures 3G and 3H; Table S2; Figure S2). Set2, Spt6, and Spt16 show significantly decreased recruitment to Pol2 on lncRNA genes compared with mRNA genes and on the 1,000 most Paf1-depleted genes compared with the 1,000 most Paf1-enriched genes, whereas no significant decrease is observed for Pcf11, Cet1, or Ssu72. Because Pcf11 and Cet1 interact with CTD repeats phosphorylated at Ser2 or Ser5 alone (Table S2), respectively, differential levels of these modifications are unlikely to be related to levels of Paf1. By contrast, Set2 and Paf1 preferentially interact with the CTD when both Ser2 and Ser5 are phosphorylated within the same and/or adjacent repeats (Kizer et al., 2005, Qiu et al., 2012). This analysis suggests that differential Ser2 phosphorylation, only within or next to Ser5 phosphorylated repeats, is the key to maintaining differential Paf1 during transcription. Finally, because a paf1Δ strain lacking all Paf1 protein shows reduced levels of both Ser2 phosphorylation and Pcf11 on genes (Tomson and Arndt, 2013) but Pcf11 does not show significantly decreased levels on Paf1-depleted genes (Figures 3G and 3H), we propose that a low level of Paf1, equal to that on the most Paf1-depleted genes identified by our TEF-seq analysis, may determine a level of CTD repeats phosphorylated at Ser2 alone sufficient for Pcf11 recruitment.

Paf1-Enriched or Depleted Pol2 on Genes Is Not Related to Transcription Elongation

To address whether Paf1 enrichment/depletion on Pol2 is related to the documented role of Paf1 in transcription elongation (Rondón et al., 2004, Tous et al., 2011), we subjected a paf1Δ strain lacking all Paf1 protein to NET-seq, which effectively captures all forms of engaged Pol2 on the genome (Churchman and Weissman, 2011, Nguyen et al., 2014). Pol2 metagene profiles (Figure 4A), heatmaps showing Pol2 profiles on all genes (Figure 4B), and Pol2 profiles on individual genes (Figure 4C) reveal distinct differences. The most notable differences in the paf1Δ strain are the shifts in the position of the 5′ peak toward the 3′ end and increased reads over the 3′ region of genes relative to reads over the 5′ region, with a notable peak just preceding the polyadenylation site (PAS). To see whether genes with Pol2 enriched or depleted for Paf1 have different distributions of Pol2 over their transcribed regions, we compared the metagene profiles for each class (Figure 4D). The genes with Paf1-enriched or Paf1-depleted Pol2 showed an identical profile after controlling for gene length. Importantly, both profiles were different from that obtained from a paf1Δ strain lacking all Paf1 protein. Thus, a certain level of Paf1 ensures a normal profile of Pol2 during transcription elongation on all genes, which must be equivalent to or below that on the most Paf1-depleted gene identified by our TEF-seq analysis. If Paf1 enrichment or depletion is not related to elongation, then what is it doing?

Figure 4

Distinct Functions for Paf1 in Elongation and RNA Fate

(A) Metagene occupancy profiles averaged across mRNA genes using Rpb3 chromatin immunoprecipitation on a microarray chip (ChIP-chip) data (Mayer et al., 2010) and Rpb3 NET-seq data for WT and paf1Δ strains. ChIP-chip and WT NET-seq profiles are plotted on a linear scale from minimum to maximum, and the paf1Δ profile is plotted so that the area under the curve (AUC) between the TSS and the PAS is the same as the WT AUC (see STAR Methods for details).

(B) Heatmaps showing Pol2 occupancy determined by Rpb3 NET-seq over all genes ordered by length (see STAR Methods for details).

(D) Metagene NET-seq occupancy profiles averaged across mRNA genes restricted by length (>1,800–4,000 nt or >1,200–1,800 nt) that are either Paf1-enriched (log2 relative Paf1-to-Rpb3 occupancy ratio > 0.2) (n = 378 and 449, respectively) or depleted (log2 relative Paf1-to-Rpb3 occupancy ratio < −0.2) (n = 614 and 335, respectively). Profiles are shown for WT and paf1Δ strains with data plotted so that the AUCs between the TSS and the PAS are the same.

Paf1-Enriched Pol2 Encodes Transcripts Enriched for RNA Binding Proteins Involved in Processing and Nuclear Export

We took a bioinformatics approach to learn more about the consequences of Paf1 enrichment or depletion on Pol2. Analysis of photoactivatable ribonucleoside-enhanced crosslinking and immunoprecipitation (PAR-CLIP) datasets in which RNA-bound factors are directly UV cross-linked to transcripts revealed that Paf1-enriched or -depleted genes encode transcripts that are similarly enriched or depleted for RNA processing and nuclear export factors. Mean PAR-CLIP counts per nucleotide for 23 RNA-associated factors already normalized for transcript levels were calculated for all transcripts encoded by each Paf1 group: enriched, depleted, or showing no significant enrichment or depletion (p < 0.05) (Figure 5A; Table S4; Baejen et al., 2014). Paf1-enriched or depleted genes encode transcripts that are significantly (p < 0.001) enriched or depleted for six of the 23 RNA-associated factors (Cft2 [CPSF-100], Mex67 [TAP], Mpe1, Nab2 [ZC3H14], Pab1 [PABPC1], and Npl3) compared with those with no particular Paf1 enrichment or depletion. These factors have a range of functions, including transcript cleavage and polyadenylation (Cft2 and Mpe1), poly(A) binding (Nab2 and Pab1), and nuclear export of transcripts (Mex67 and Npl3). Yra1, an RNA adaptor that interacts with Mex67, Hpr1, and Sub2, components of mRNA export complexes, is also enriched on transcripts encoded by Paf1-enriched genes. Interestingly, Paf1C interacts with Hpr1, Yra1, Npl3, and Nab2 (Table S4).

Figure 5

Paf1 Influences the Nuclear Export of Transcripts

(A) Transcripts from Paf1-enriched or Paf1-depleted mRNA genes show increased or decreased binding, respectively, to Cft2, Mex67, Mpe1, Nab2, Npl3, and Pab1. mRNA genes were split into Paf1-enriched (red), Paf1-depleted (purple), and no significant difference in Paf1 (blue) groups (p < 0.05). Bar charts show the average PAR-CLIP counts per nucleotide, normalized for differences in transcript levels, for each group for each protein using data from Baejen et al. (2014). Error bars show the SEM p values (Welch’s t test) calculated to test the significance of the increase or decrease in the mean of the enriched or depleted group relative to the no significant difference group, respectively.

(B) Transcripts from Paf1-enriched or Paf1-depleted mRNA genes show increased binding to cytoplasmic factors or the nuclear surveillance factor Mtr4, respectively. Bar charts show the percent of mRNA transcripts from each Paf1 group in each of ten clusters defined by Tuck and Tollervey (2013). Transcripts have been clustered according to their association with particular combinations of RNA binding proteins. The asterisk marks cluster II, in which transcripts are associated specifically with the poly(A) binding protein Pab1 but retained in the nucleus for degradation.

(C) Top: profiles showing average relative occupancy for Paf1 (blue) and Rpb3 (black) in 50-nt bins from TSS −100 nt to PAS +100 nt on selected mRNA genes. TSS and PAS are shown by dashed lines. The asterisk marks gaps in the profiles resulting from discarding non-uniquely aligned reads. Blue shading indicates the first 400 nt of each gene, reflecting regions were Paf1 has not reached maximum levels on Pol2. Paf1 levels are multiplied by a scaling factor (2.2) across all profiles to enable visualization of relative changes. Bottom: bar charts showing the mean ratio of nuclear to whole-cell transcript levels for the loci shown in the WT and paf1Δ strains, measured using qRT-PCR for transcripts encoded by the genes shown above. Transcript levels are normalized to the level of the ADH1 transcript in each sample. Error bars show the SEM p values (Welch’s t test) calculated to test the significance of the increase in the mean ratio in the paf1Δ strain (n = 4).

(D) RNA FISH for CMK2 transcripts in WT and paf1Δ strains. The bar chart shows the mean ratio of nuclear to whole-cell transcripts in more than 200 cells per repeat experiment. Error bars show the SEM p value (Welch’s t test) calculated to test the significance of the increase in the mean ratio in the paf1Δ strain (n = 4). Also shown are images showing the distribution of transcripts in a wide field and selected images of individual cells for WT and paf1Δ strains. Selected individual cells are numbered 1–4. Blue is DAPI, red dots are CMK2 transcripts (see also Figures S4 and S5).

(E) Schematic showing the known relationships between Paf1 in Paf1C, H2Bub, and RNA processing and export factors (arrows indicate known direct or indirect interactions). Also shown are the features determined in this work, including the role of transcription factors (TFs) and the phosphorylated CTD (Ser2 and Ser5) in selective Paf1 enrichment on Pol2.

Paf1-Enriched Pol2 Promotes Nuclear Export of the Encoded Transcript

If the processing and export of these selected transcripts depend on levels of Paf1 on Pol2 transcribing the gene, we predict that Paf1-enriched genes would show increased nuclear retention of their encoded transcripts when PAF1 is ablated (paf1Δ). Indeed, this is what we observed by cell fractionation followed by qRT-PCR at transcripts from Paf1-enriched genes (Figure 5C). No significant increase in nuclear retention was observed for transcripts from Paf1-depleted genes or genes at which Paf1 is neither enriched nor depleted. We used single-molecule RNA fluorescence in situ hybridization (FISH) to detect the transcripts from Paf1-enriched CMK2 in wild-type (WT) and paf1Δ backgrounds to confirm these observations (Figure 5D). Counting the number of CMK2 transcripts in the nucleus and cytoplasm of more than 1,000 individual cells reveals a significant increase in nuclear relative to cytoplasmic transcripts in the absence of Paf1. As expected, an lncRNA showed no change in subcellular localization in the paf1Δ background (Figure S4). These data support a link between Paf1-enriched Pol2 on genes and nuclear export of the encoded transcripts (Figure 5E). However, because deletion of Paf1 may result in changes that indirectly affect transcript export, we sought to confirm the link between differential Paf1 enrichment and RNA export using physiological conditions to alter Paf1 levels on Pol2.

Paf1 Enrichment or Depletion on Pol2 Is Dynamic and Related to Differential RNA Export from the Nucleus

Because promoters are involved in Paf1 enrichment on Pol2, we expect enrichment/depletion of Paf1 on Pol2 to be dynamic and change with environmental conditions. We used a simple carbon source shift, transferring cells from glucose- to galactose-containing medium for 5, 15, or 60 min, and compared Paf1 TEF-seq profiles and Rpb3 NET-seq profiles at the three time points with the same data for the cells grown in glucose. We observed significant enrichment or depletion for Paf1 on Pol2 for each sample (Figure 6A; Table S3). Importantly, by comparing how genes that are Paf1-enriched or -depleted under one condition change under the other three conditions, we demonstrate unequivocally that levels of Paf1 on Pol2 are dynamic and that a gene transcribed by Paf1-depleted Pol2 under one condition can be transcribed by Paf1-enriched Pol2 under another condition, and vice versa, independent of levels of transcription. Consistent with a role for transcription factors in differential recruitment of Paf1, 19 of 30 genes (63%) with significantly (p < 0.05) Paf1-enriched Pol2 in both glucose and after 60 min in galactose are regulated by Sfp1.

Figure 6

Paf1 Enrichment or Depletion on Pol2 Is Dynamic and Related to RNA Export

(A) Differential Paf1 occupancy at mRNA genes cultured in glucose (GLU) or after 5, 15, or 60 min in galactose (GAL) (columns). The rows show genes that are significantly (p < 0.05) enriched (green) or depleted (purple) under each of the four conditions and how the patterns of enrichment or depletion for those genes under one condition changes as the environment changes (growth conditions). For each condition, Paf1 TEF-seq and Rpb3 NET-seq data were obtained in duplicate. Each scatterplot shows the base mean occupancy against the log2 relative Paf1-to-Rpb3 occupancy ratio for each mRNA gene.

(B) Profiles showing average relative occupancy for Paf1 (blue) and Rpb3 (black) in 100-nt bins across selected mRNA genes. Paf1 levels are multiplied by 2.2 across all GLU profiles and 1.8 across all GAL 60-min profiles because this brings the level of Paf1 compared with Rpb3 to the same level for both conditions for the control gene, CLB1, which shows almost no change in relative Paf1-to-Rpb3 occupancy between these two conditions. Pol2 on the selected genes is significantly (p < 0.05) Paf1-enriched in glucose, and all show a decrease in the log2 relative Paf1-to-Rpb3 occupancy ratio greater than 1.2 upon the switch to galactose for 60 min.

(C) Bar charts showing the mean ratio of nuclear to whole-cell transcript levels in glucose and after 60 min in galactose for the loci shown, measured using qRT-PCR for transcripts encoded by the genes shown above. Transcript levels are normalized to the level of the CLB1 transcript in each sample. Error bars show the SEM p values (Welch’s t test) calculated to test the significance of the increase in the mean ratio in galactose (n = 4).

Discussion

Paf1 and its associated proteins in Paf1C likely act as a platform, recruiting factors that connect Paf1 to multiple transcription processes from elongation to polyadenylation and termination (Chen et al., 2015, Tomson and Arndt, 2013, Yang et al., 2016, Yu et al., 2015). Given Paf1C’s role in mediating post-translational modifications to histones, such as H2Bub, and its capacity to form protein:protein interactions with factors involved in RNA processing and nuclear export (Table S4), it is not surprising that Paf1 has additional roles in regulating nuclear export of transcripts, as we show here. What is surprising is that this is a selective event and distinct from Paf1’s role in elongation. Given the biochemical evidence that Paf1’s function in elongation is direct (Rondón et al., 2004, Tous et al., 2011), we propose that there may be two distinct Paf1-associated functions. Low levels of Paf1, equal to that on the most Paf1-depleted gene defined using TEF-seq, would be sufficient for the core elongation function on all genes. The second function would result from differential association of Paf1 with Pol2 above this level, leading to selective promotion of RNA nuclear export. Our analysis implicates the promoter as the main feature linked to enrichment of Paf1 on Pol2 transcribing the associated gene. Furthermore, levels of Ser2-phosphorylated residues with the CTD, only when neighboring Ser5 residues are also phosphorylated, maintain differential enrichment or depletion during transcription elongation. There are a number of possible ways to envisage how the promoter is linked to Paf1 enrichment on Pol2, all involving specific DNA sequence elements and transcription factors such as Sfp1, Haa1, or Cbf1. Transcription factors (TFs) are known to influence the transition from a more Ser5- to a more Ser2-phosphorylated CTD during transcription elongation (Kwak and Lis, 2013). By analogy with the role of MYC in mammalian cells (Jaenicke et al., 2016), TFs could recruit Paf1, perhaps in association with another factor, such as CK2, known to interact with both Paf1 and Sfp1, and then transfer Paf1 to Pol2. Alternatively, TFs could recruit CTD kinases such as Bur1 (CDK9) or Ctk1 (CDK12), leading to differential Ser2 phosphorylation levels on Pol2 and, hence, differential Paf1 levels. Finally, TFs could promote recruitment of a Paf1-enriched pre-initiation complex (PIC) as opposed to a mediator-enriched PIC (Hampsey, 1998). Current models propose that transcript fate is determined by poly(A) tail length and the proteins associated with different transcripts during 3′ end formation (Tuck and Tollervey, 2013). However, we show here that transcript poly(A) tail length (Subtelny et al., 2014) showed no relationship to Paf1 enrichment or depletion on mRNA genes (Figure S5) and, thus, is unlikely to be related to Paf1’s role in selectively promoting transcript nuclear export. Instead, our data suggest that the proteins associated with these transcripts, and thus their fate, can be determined before 3′ end formation and is related to the levels of Paf1 in the elongation complex during transcription elongation, which is dynamic and subject to environmental control via promoters. Transcript fate can be modulated by signals received at the promoter, already shown to influence transcript localization in the cytoplasm (Zid and O’Shea, 2014) and half-life (Bregman et al., 2011, Trcek et al., 2011). Our data add differential RNA export to the functions likely to be determined by the promoter. When Paf1 is associated with Pol2, we envisage two consequences. Paf1C is required for H2Bub. H2Bub stimulates the ubiquitylation of the associated with the Pta1 subunit of CPF (APT) complex subunit Swd2, which, in turn, causes an association between Mex67 and Swd2 and independently facilitates recruitment of the Mex67 adaptor Npl3 to nascent transcripts to promote nuclear export (Vitaliano-Prunier et al., 2012). Consistently, genes and transcription units with Pol2 depleted for Paf1 also show reduced H2Bub (Murray et al., 2015). Alternatively, Paf1C could simply be transported with Pol2 to the 3′ region to recruit or modify RNA binding and processing enzymes, such as Nab2, Npl3, Yra1, and Hpr1. Paf1 interacts with Hpr1 (Chang et al., 1999), which, in turn, interacts with Mex67 and Yra1, facilitating interactions with the nuclear pore. Hpr1 also undergoes sumoylation, enabling it to bind specific RNAs, including members of the HAA1 regulon (Bretes et al., 2014), shown here to be Paf1-enriched. NET-seq supports a role for Paf1 in the control of transcription elongation. Paf1 changes the relative distribution of Pol2 over the 5′ and 3′ regions of genes. Paf1 deletion in yeast and knockdown in mammals results in increased Pol2 CTD phosphorylation at Ser5 and reduced phosphorylation at Ser2 (Mueller et al., 2004, Nordick et al., 2008, Yu et al., 2015), although, in other work, an increase in Ser2 phosphorylation is observed (Chen et al., 2015). The disruption of Paf1-dependent CTD phosphorylation patterns could contribute to or result from slower elongation (Tomson and Arndt, 2013) and compromised checkpoints (Laitem et al., 2015, Rodríguez-Molina et al., 2016), leading to more Pol2 on the body of the gene, and supports some conservation of Paf1’s role in controlling transcription elongation across the species. However, Paf1’s role in transcription elongation is not related to the differences in Paf1 levels on Pol2 observed at different genes. In summary, our findings support distinct forms of the Pol2 elongation complex, with a core function in elongation and a regulatory function determining differential transcript fate, determined by levels of Paf1 on Pol2. Paf1 is unlikely to be unique in showing enrichment or depletion on elongating Pol2; preliminary data show enrichment or depletion of other elongation factors on Pol2 at different groups of genes. The use of an IP step in TEF-seq makes it eminently adaptable to studying a range of different Pol2-associated factors in different organisms, cell types, or growth conditions and to enhance our understanding of transcription elongation and its relationship to transcript fate. It will be important to determine the conservation of Paf1’s role in transcript fate in metazoans, and development of TEF-seq in these systems will enable this. We anticipate that, given the degree of overlapping pervasive transcription in eukaryotic genomes (Mellor et al., 2016, Murray and Mellor, 2016), the strand-specific nucleotide resolution of TEF-seq will be an invaluable tool.

STAR★Methods

Key Resources Table

Contact for Reagent and Resource Sharing

As Lead Contact, Jane Mellor is responsible for all reagent and resource requests. Please contact Jane Mellor at jane.mellor@bioch.ox.ac.uk with requests and inquiries.

Experimental Model and Subject Details

All yeast strains used in this study are listed in the Key Resources Table. All strains and genetic manipulations were verified by sequencing and phenotype. Strains were streaked from glycerol stocks onto 2% agar YPD (1% yeast extract (Difco), 1% bactopeptone, 2% glucose) plates and grown (48 hr, 30°C). Cells were then grown (overnight, 30°C) in 5 mL or 25 mL YPD. This culture was used to inoculate a 25 mL or 1 L YPD culture at OD600 0.2 which was grown (30°C, 200 rpm) to OD600 0.6-0.7. For carbon source shift experiments, cell cultures were centrifuged (3000 rpm, 3 min) and then resuspended in YPG (1% yeast extract (Difco), 1% bactopeptone, 2% galactose) prewarmed to 30°C. Resuspended cells were incubated (30°C, 200 rpm) for the specified time.

Method Detail

Genetic manipulation of yeast strains

All strains used in this study are listed in the KEY RESOURCES TABLE. Genetic manipulation of strains was carried out using the homologous recombination method described by (Longtine et al., 1998). For gene deletion strains, PCR products were made containing a selection marker with promoter and terminator sequences flanked at both ends by 40bp of sequence homologous to sequences either side of the region to be deleted. For 3xFLAG tagged strains, first a plasmid was created from plasmid pFA6a-GFP(S65T)-His3MX6 (Longtine et al., 1998) with the 3xFLAG sequence amplified from strain YSC001 by PCR inserted in place of GFP. PCR products were made using this template consisting of a 40bp sequence homologous to the first 40bp upstream of the stop codon of the gene to be tagged followed by the FLAG sequence, His5 selection marker and 40bp of sequence homologous to a region downstream of the gene to be tagged. 3xHA tagged strains were created in a similar manner using the pFA6a-3HA-KanMX6 plasmid (Longtine et al., 1998). Cells to be transformed were grown to log phase, pelleted, re-suspended in 450 μL 0.1 M LiAc/TE and incubated (> 1 hr, 4°C). 100 μL of cell suspension, 10 μL of gel-extracted PCR product, 10 μL calf thymus DNA (Sigma D8661), 700 μL 0.1 M LiAc/TE, 40% PEG were incubated (30 min, 30°C) then heat-shocked (20 min, 42°C). Cells were pelleted (5 min, 7000 rpm), re-suspended in H20 and plated onto appropriate selection media. DNA was extracted from transformants, screened by PCR and confirmed by sequencing.

NET-seq/TEF-seq

Yeast culture

Cells were grown in 25 mL YPD (1% yeast extract (Difco), 1% bactopeptone, 2% glucose) overnight. This was used to inoculate 1 L of YPD in a 2 L flask to give a starting OD600 of 0.2. 2 L of cells were grown (30°C, 170 rpm) to OD600 of 0.65 (mid-log) and harvested by filtration onto a 0.45 μm pore size, 90 mm diameter nitrocellulose filter paper (Whatman). Cells were scraped off the filter paper with a spatula pre-cooled in liquid nitrogen and flash frozen in liquid nitrogen. Cells were ground in 6 cycles of 3 min at a 15 Hz shaking frequency in a 50 mL grinding jar (Retch) with a 25 mm stainless steel ball using a Retsch MM400 mixer mill. The grinding chamber was cooled in liquid nitrogen in between cycles. 1 g of yeast grindate was stored at −80°C. For carbon source shift experiments 2 L of cells were grown to mid-log in 2 × 2 L flasks, then centrifuged (3000 rpm, 3 min) and then resuspended in 2 L YPG (1% yeast extract (Difco), 1% bactopeptone, 2% galactose) prewarmed to 30°C. Resuspended cells were returned to 2 L flasks, incubated (30°C, 170 rpm) and harvested by filtration as above after the requisite incubation time.

Immunoprecipitation (IP)

1 g of yeast grindate was resuspended in Lysis buffer A (20 mM HEPES (pH 7.4), 110 mM KOAc, 0.5% Triton X-100, 0.1% Tween 20, 10 mM MnCl2, 1x proteinase inhibitors (complete, EDTA-free (Roche)), 50 U/ml SUPERase.In (Invitrogen)). Resuspended grindate was incubated (4°C, 20 min) with 660 U of DNase I (Promega). Insoluble cell debris was pelleted by spinning (16,000 g, 4°C, 10 min). Supernatants were combined (a 20 μL input sample was taken and combined with 20 μL 2 × SDS loading buffer (80 mM Tris-HCl (pH 6.8), 200 mM DTT, 3.2% SDS, 0.1% Bromophenol Blue, 1.6% glycerol)) and added to 0.5 mL of anti-FLAG M2 affinity agarose beads (Sigma) pre-washed with 2 × 10 mL Lysis buffer A (without SUPERase.In). Beads and supernatant were incubated (4°C, 2.5 h) on a nutator, before spinning (1,000 g, 4°C, 2 min). A 20 μL unbound sample was taken and combined with 20 μL 2 × SDS loading buffer. Beads were washed four times with 10 mL Wash buffer A (20 mM HEPES (pH 7.4), 110 mM KOAc, 0.5% Triton X-100, 0.1% Tween 20, 1 mM EDTA). Beads were incubated (30 min, 4°C) twice with 300 μL of Elution buffer (Lysis buffer A with 1 mg/mL 3 × FLAG peptide (Sigma)). The elution supernatants were combined (a 20 μL elute sample was taken and combined with 20 μL 2 × SDS loading buffer) and RNAs were extracted using a miRNeasy kit (QIAGEN) according to the manufacturer’s instructions. For the sequential IP procedure, the elution supernatants from two IPs were combined giving a total of 1200 ul and added to 15 ul anti-HA dynabeads (Pierce) pre-washed with 2 × 1 mL Lysis buffer A. Beads and supernatant were incubated (4°C, 2.5 h) on a nutator. A 20 μL secondary unbound sample was taken and combined with 20 μL 2 × SDS loading buffer. Beads were washed five times with 1 mL Wash buffer A. Beads were resuspended in 100 uL Lysis buffer A (a 5 uL secondary elute sample was taken and combined with 15 uL Lysis buffer A and 20 uL 2 × SDS loading buffer) and RNAs were extracted using a miRNeasy kit (QIAGEN) according to the manufacturer’s instructions. Input, unbound and elute samples were analyzed by western blot.

Western blotting

Protein samples were separated by gel electrophoresis on 7.5% or 10% SDS polyacrylamide gels and transferred to nitrocellulose membranes. Membranes were incubated with 5% BSA in TBST (20 mM Tris-HCl (pH 7.5), 150 mM NaCl, 0.1% Tween-20) for 2 hr, then primary antibody in 2.5% BSA in TBST for 1.5 hr, washed, then incubated with rabbit, mouse or rat HRP-conjugated secondary antibody (Sigma) diluted 1:4000 in 2.5% BSA in TBST for 45 min and washed again. Primary antibodies and their dilutions are detailed in the Key Resources Table. Antibody binding was visualized using chemiluminescence (Pierce) and X-ray film.

Library generation

Linker ligation and RNA fragmentation

Immunoprecipitated RNA (3 μg in 30 μL 10 mM Tris-HCl, pH 7.0) was denatured (2 min, 80°C) and placed on ice. A 5′ adenylated, 3′ blocked with a dideoxy C base cloning linker 5rApp/CTGTAGGCACCATCAAT/3ddC (Integrated DNA Technologies) was ligated to the 3′ ends of RNAs by first dividing the RNA into three microfuge tubes and then adding 10 μL ligation reaction mix to give final concentrations of 50 ng μL−1 cloning linker 1, 12% PEG 8000, 1 × T4 RNA ligase 2 (Rnl2) (truncated) ligation buffer, 10 U μL−1 T4 Rnl2 (truncated) (NEB) and incubating for 3 hr at 37°C. Ligated RNA was incubated (95°C, 35 min) with 20 μL Alkaline fragmentation buffer (100 mM NaCO3 (pH 9.2), 2 mM EDTA) to fragment linker ligated RNA to a narrow size range to reduce size bias of future steps. RNA was precipitated by incubating (30 min, −20°C) with ice cold 500 μL H20, 60 μL 3 M NaOAc (pH 5.5), 2 μL 15 mg mL−1 GlycoBlue (Ambion) and 0.75 mL isopropanol and then spinning (16,000 g, 4°C, 30 min). Pellets were washed with 0.75 mL 80% ethanol, dried (10 min, room temperature) and resuspended sequentially in the same 10 μL 10 mM Tris-HCl (pH 7.0).

Size selection

Ligated and fragmented RNA was mixed with 10 μL 2 × TBE-urea loading dye (89 mM Tris-HCl, 89 mM Boric acid, 2 mM EDTA, 12% Ficoll, 7 M Urea, 2.5 mg/ml Orange G), denatured (2 min, 80°C) and run on a 10 well 10% TBE-urea gel (Biorad) (200 V, 32 min). The gel was stained with SYBRGold (Invitrogen) and the region containing 38-95nt fragments excised. This gel piece was divided between three microfuge tubes and physically disrupted. Each was incubated (70°C, 10 min) in 200 μL H20. The tubes were pooled and gel debris was removed using a Costar-Spin-X column (Corning). RNA was precipitated by adding 60 μL 3 M NaOAc (pH 5.5), 2 μL 15 mg mL−1 GlycoBlue and 0.75 mL isopropanol, incubating (30 min, −20°C) and then spinning (16,000 g, 4°C, 30 min). Pellets were washed with 0.75 mL 80% ethanol, dried (10 min, room temperature) and resuspended in 10 μL 10 mM Tris-HCl (pH 7.0).

Reverse transcription and circularisation

Reverse transcription (RT) was carried out by adding 3.28 μL 5 × FS buffer, 0.82 μL dNTPs (10 mM each), 0.5 μL 100 μM RT primer (This is phosphorylated at the 5′ end (5 Phos) for later circularization and contains two 18 carbon spacer sequences (iSp18)), denaturing (80°C, 2 min) and incubating (48°C, 30 min) with 0.5 μL Superase.In, 0.82 μL 0.1 M DTT, 0.82 μL Superscript III (Invitrogen). 1.8 μL 1 M NaOH was added and RNA was degraded (98°C, 20 min). To neutralize, 1.8 μL 1 M HCl was added. cDNA was mixed with 20 μL 2 × TBE-urea loading dye, denatured (3 min, 95°C), and run (loaded in 2 wells, 20 μL per well) on a 10 well 10% TBE-urea gel (Biorad) (200 V, 50 min). The gel was stained with SYBRGold (Invitrogen) and the regions containing the RT product excised. Both gel pieces were physically disrupted and incubated (70°C, 10 min) in 200 μL H20 in separate tubes. Gel debris was removed and tubes were pooled. cDNA was precipitated by adding 25 μL 3 M NaCl, 2 μL GlycoBlue and 0.75 mL isopropanol, incubating (−20°C, 30 min) and then spinning (16,000 g, 4°C, 30 min). The pellet was washed with 0.75 mL 80% ethanol, dried (10 min, room temperature) and resuspended in 15 μL 10 mM Tris-HCl (pH 8.0). The RT product was circularized by incubating (60°C, 60 min) with 2 μL 10 × CircLigase buffer, 1 μL 1 mM ATP, 1 μL 50 mM MnCl2, 1 μL CircLigase (Epicenter). The enzyme was heat-inactivated at 80°C, 10 min.

PCR amplification

16.7 μL 5 × HF Phusion buffer, 1.7 μL dNTPs (10 mM), 0.4 μL 100 μM Barcoded primer (A, B, C, or D), 0.4 μL 100 μM Primer 1, 59.2 μL H20, 0.8 μl Phusion polymerase (NEB) was added to 5 μL circularized DNA. This was divided (16.7 μL per tube) between five 0.2 mL tubes. PCR reactions were heated (98°C, 30 s) and then submitted to 7 temperature cycles (98°C, 10 s; 60°C, 10 s; 72°C, 10 s). At the end of cycle 3 and each subsequent cycle one tube was removed and placed on ice. 3.4 μL DNA loading dye (1.5 g Ficoll 400, 25 mg Orange G in 10 mL H20) was added to each reaction and run on an 8% TBE gel (Invitrogen) (180 V, 55 min). The gel was stained with SYBRGold and the PCR product excised from the PCR reaction with the highest unsaturated signal without higher molecular weight products. The gel piece was physically disrupted and incubated (room temperature, overnight, with agitation) in 0.67 mL DNA soaking buffer (0.3 M NaCl, 10 mM Tris-HCl pH 8.0, 1 mM EDTA). DNA was precipitated by adding 2 μL GlycoBlue and 0.68 mL isopropanol, incubating (−20°C, 30 min) and spinning (16,000 g, 4°C, 30 min). The pellet was washed with 0.75 mL 80% ethanol, dried (10 min, room temperature) and resuspended in 10 μL 10 mM Tris-HCl (pH 8.0).

Sequencing

Samples were submitted to Harvard Biopolymers facility. They determined DNA quality and quantity using the TapeStation (Agilent) and qPCR and then sequenced samples on an Illumina HiSeq 2000 machine. 50nt were sequenced from one end using the sequencing primer. Barcoded samples were pooled so that 2-3 samples were multiplexed per lane. See the Key Resources Table for a list of DNA primers used for library generation and sequencing.

Biochemistry

Nuclear Extraction

50 mL from a 75 mL OD600 0.6 culture was pelleted (1,000 g, 3 min), and incubated (< 5 min, 30°C) in 0.5 mL spheroplasting buffer A (1 M sorbitol, 0.5 mM 2-mercaptoethanol, 20 mg/ml zymolase 20T (MP Biomedicals)). Once > 70% of cells were spheroplasted, cells were pelleted (16,000 g, 10 s, 4°C), washed carefully twice with 1 mL 1 M sorbitol, resuspended in 100 μL 18% Ficoll and then mixed quickly with 1 mL 9% Ficoll. Nuclei were pelleted (16,000 g, 5 min, 4°C) and washed twice with 1 mL 1 M sorbitol. RNA was extracted from nuclei as detailed below. Total RNA was extracted from the remaining 25 mL of culture as detailed -below.

RNA extraction

25 mL OD600 0.6 culture was pelleted (1,000 g, 3 min), resuspended in 400 μL T.E.S. (100 mM Tris-HCl (pH 7.5), 100 mM EDTA (pH 8.0), 0.5% SDS) and 400 μL phenol:chloroform (pH 4.7) and incubated (65°C, 20 min, 1400 rpm). After spinning (16,000 g, 20 min, 4°C), the upper layer was mixed with 40 μL 3 M NaOAc pH 5.5 and 1 mL ethanol and incubated (−80°C, 30 min). RNA precipitate was pelleted (16,000 g, 10 min, 4°C) and resuspended in 100 μL H20.

Reverse transcription

4 μg of extracted RNA was incubated (37°C, 1 h) with 2 μL DNase I buffer, 1 μL RNaseOUT (Invitrogen), 6 μL H20, 1 μL DNase I (Roche) then heated (70°C, 15 min). 0.5 μg RNA was reverse transcribed by incubating (65°C, 5 min) with 0.6 μL random primers, 1 μL dNTPs in a 13 μL final volume and then adding 4 μL First-Strand buffer, 1 μL 0.1 M DTT, 1 μL RNaseOUT, 1 μL Superscript III (replaced with 1 μL H20 in control reactions) and incubating at 50°C (1 h) and then 70°C (15 min). Complementary DNA (cDNA) levels for each gene were analyzed by Real Time quantitative PCR (qPCR) using a RotorGene (Corbett), SYBR Green mix (Bioline) and primers. Transcript levels in each sample were normalized to the level of the ADH1 transcript for the comparison between BY4741 and paf1Δ samples as this gene is not Paf1-enriched or depleted in YPD. Transcript levels in each sample were normalized to the level of the CLB1 transcript for the comparison between BY4741 in YPD and BY4741 after 60 min in YPG samples as the level of Paf1 on Pol2 at this gene does not change between these two conditions. The mean transcript level or nuclear/total transcript level was then calculated across all replicates. Error bars show the SEM. See Table S5 for primers used in qPCR.

Assessingf Paf1 on Pol2 at URA3

We used artificial constructed genes to assess the level of Paf1 on Pol2 downstream of a Paf1-enriched or Paf1-depleted gene promoter. The coding region of the URA3 gene was inserted in place of CMK2 (a Paf1-enriched gene) or SLC1 (a Paf1-depleted gene) coding regions. Thus URA3 was placed downstream of the promoter of a Paf1-enriched or a Paf1-depleted gene. The NET-seq/TEF-seq IP protocol was then carried out on Paf1-3xFLAG, Rpb3-3xFLAG or non-tagged versions of these strains inclusive of the purification of RNA step. 1 μg of purified RNA was reverse transcribed using random primers as above. cDNA levels for 4 control genes (RCR1, TIM21, CLB1 and PUN1) and for test gene URA3 were established for each sample. The level of Paf1, Rpb3 and no-tag signal is expected to be the same in all strains at the control genes. Therefore the control genes were used to determine scaling factors to normalize the signal for each factor to the same level for each strain. An average scaling factor was then determined and this was applied to the signal at URA3. (Paf1 signal - no-tag signal)/(Rpb3 signal - no-tag signal) was calculated for URA3 at each locus to give a value for the level of Paf1 on Pol2. 2 biological replicates were carried out.

RNA fluorescence in situ hybridization (RNA FISH)

35 mL OD600 0.6 culture was fixed by incubating (45 min, 80 rpm, 22°C) with 4.8 mL 32% paraformaldehyde. Fixed cells were washed twice with 10 mL FISH buffer A (1.2 M sorbitol, 0.1 M KHPO4 (pH 7.5)) and resuspended in 1 mL FISH buffer B (FISH buffer A, 20 mM ribonucleoside vanadyl-complex (VRC), 20 μM 2-mercaptoethanol). The mixture was incubated (∼25 min) at 30°C with 15 μL lyticase (25 U μL−1, Sigma) until > 70% of cells were spheroplasted. Cells were pelleted (1000 g, 3 min, 4°C) and washed with and then resuspended in 1 mL FISH buffer B without 2-mercaptoethanol. 150 μL of cells were left to settle (30 min, 4°C) on poly-L-lysine treated coverslips. These were washed with 2 mL FISH buffer A to remove unattached cells, incubated (−20°C, > 3 h) in 2 mL 70% ethanol and then rehydrated for 5 min in 2 mL FISH wash buffer (10% formamide, 2x SSC). For hybridization, coverslips were placed cell-coated side down on a 48 μL drop containing 50 nM of 42 × 20nt Cy3-labeled probes complementary to CMK2 (Biosearch Technologies) or 29 × 20nt Cy3-labeled probes complementary to GAL1 antisense (AS) transcript, 0.1 g mL−1 dextran sulfate, 1 mg mL−1 E. coli tRNA, 2 mM VRC, 20 μg ml−1 BSA, 2x SSC, 10% formamide and incubated (30°C, 20 h) in a sealed Parafilm chamber (See Table S6 for list of FISH probes used). Coverslips were twice incubated (30°C, 30 min) in pre-warmed 2 mL FISH wash buffer, dipped in H20, air-dried, placed cell-coated side down on a drop of ProLong Diamond Antifade Mountant with DAPI (Life Technologies), allowed to polymerize for 24 hr in the dark and then sealed with nail varnish. Cells were imaged using a DeltaVision Elite wide-field fluorescence deconvolution microscope using a 100x/1.40 objective lens. For CMK2, 31 0.2 μm z stacks were imaged using an exposure time of 0.08 s and 1 s for DAPI and Cy3 channels respectively for each stack. For GAL1 AS, 21 0.2 μm z stacks were imaged with an exposure time of 0.01 s and 1 s for DAPI and Cy3 channels respectively.

Quantification and statistical analysis

NET-seq/TEF-seq sequencing data processing

Alignment of sequencing reads

FASTQ files with sanger (+33) encoded quality scores were uploaded to Galaxy (Blankenberg et al., 2010). Adaptor sequences were clipped and reads shorter than 15nt after clipping were discarded. Reads were aligned to the S. cerevisiae genome (2011 assembly) using Bowtie for Illumina (Langmead, 2010). Parameters were chosen so non-uniquely aligned reads were discarded (-n 1, -e 70, -l 28, -k 1, -m 1). The SAM file output was converted to a BAM file and downloaded (Li et al., 2009).

Narrowing reads and data visualization

Using R/Bioconductor, aligned reads were narrowed to their 3′ nucleotide and used to create BigWig files for data visualization in the IGV (Thorvaldsdóttir et al., 2013). When available, data from combined biological replicates were used for visualization.

Subtraction of non-tag specific signal

Subtraction of non-tag specific signal was done using data from a non-tagged control strain. Signal that is not specific to the FLAG-tagged TEFs, Rpb2 or Rpb3 is present within alignments, mainly resulting from the non-specific binding to and elution from the affinity matrix of processed transcripts undergoing degradation, generating the 3′OH that allows their incorporation into the libraries and alignment. This signal can be removed using signal obtained from a strain grown in the same conditions without a FLAG-tagged factor (no-tag; control IP). To do this the Pol3-transcribed SCR1 gene (chosen as no signal specific to a factor associated with the Pol2 transcription elongation complex should map to this gene) was divided into 10nt bins exclusive of the first 20nt and final 50nt of the gene (position chrV:442007-442458). The ratio of counts (FLAG-tagged factor/no-tag control) for each bin was calculated. The mean ratio (mean SCR1 ratio) was taken and mean SCR1 ratio × no-tag reads were subtracted from FLAG-tagged factor reads genome-wide. For the sequential IP, signal that is not specific to the HA-tagged factor is removed using signal obtained from a strain without an HA-tagged factor (RPB3-3xFLAG, no-HA-tag; control IP). In this case the signal that is not specific to the HA-tagged factor results to a greater extent from Pol2 associated nascent transcripts due to non-specific binding of Pol2, enriched during the first IP step, to the HA-beads during the second IP step. Reads over Pol3 transcribed-SCR1 are not representative of this signal and so cannot be used to determine the ratio necessary for its removal. Instead the ratio of total reads before alignment (HA-tagged factor/no-HA-tag control) was used assuming linear proportionality between this and the signal not specific to the HA-tagged factor. This ratio × no-tag reads were subtracted from HA-tagged factor reads genome-wide. For visualization, an 11 nt running mean was calculated, except the running mean window did not overlap gene TSSs or TESs or the 3′ nt of exons (upstream of an intron) or 3′ nt of introns. When available, data from combined biological replicates were used for visualization.

mRNA and ncRNA annotations

The TSS and PAS/TES positions of all genes encoding mRNAs, stable lncRNAs (termed SUTs; stable unannotated transcripts) and unstable lncRNAs (termed CUTs; cryptic unstable transcripts) were taken from (Pelechano et al., 2013), selecting first for the most abundant transcript isoforms expressed in YPD and, of these, the longest, to obtain unique annotations for 5579 mRNAs, 612 stable lncRNAs and 344 unstable lncRNAs. The annotations for the start and end of coding sequences, introns, snRNAs, snoRNAs and other ncRNAs were taken from the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org/).

Metagene analysis

Data from combined biological replicates when available were used for plotting metagene profiles. All metagene profiles were plotted using single IP TEF-seq and/or single IP NET-seq data unless otherwise stated.

Standard average occupancy profiles

Standard average occupancy profiles (metagene occupancy profiles) were plotted using all mRNA genes unless otherwise stated. For plots of the 5′ region of genes, genes were aligned by their TSS. The final 200nt upstream of the PAS/TES were excluded to avoid transitions occurring at the 3′ end of genes influencing these profiles. The region from the TSS-100nt to TSS+1000nt was split into 10nt bins and the number of counts in each bin was calculated from Rpb3 ChIP-chip data taken from (Mayer et al., 2010) and Rpb3 NET-seq data for wild-type and paf1Δ strains. The mean number of counts for each bin at each position was then calculated excluding the top 5% and bottom 5% of bins. These were excluded to prevent highly and lowly transcribed genes from skewing the mean. Mean counts were plotted on a linear scale from the minimum value to the maximum value except otherwise stated. For plots of the 3′ region of genes, genes were aligned by their PAS, genes shorter than 501nt and the region from the TSS to +500nt of genes were excluded to avoid transitions occurring at the 5′ end of genes influencing these profiles. The region from PAS-400nt to PAS+100nt was split into 10nt bins and the mean number of counts for each bin was calculated and plotted in the same manner.

Rpb3-normalized TEF profiles

Rpb3-normalized average TEF occupancy profiles (metagene occupancy profiles) were plotted for all mRNA genes or all lncRNA genes unless otherwise stated. For plots of the 5′ and 3′ end of genes, genes were aligned to the TSS and the PAS respectively and the same regions were excluded as before. For plots of transitions relative to nucleosomes +2 to +5, the position of nucleosomes was taken from (Brogaard et al., 2012). This publication uses chemical modification of histones to give a base pair resolution genome-wide map of the position of the center of each nucleosome. The region from −100nt to + 100nt around each of the nucleosomes after the TSS+50nt until the PAS were taken from each gene. Again each region was split into 10nt bins and the number of counts from TEF-seq for each TEF divided by the number of counts for Rpb3 NET-seq was calculated. These were multiplied by a factor-specific scaling factor to enable plotting on the same arbitrary linear scale (factor specific scaling factors: Cet1, 1; Paf1, 2.2; Pcf11, 2.7; Spt6, 0.47; Spt16, 1.04; Set2, 6; Ssu72, 8). This is necessary as differences in sequencing depth and the level at which Pol2 is co-immunoprecipitated with different TEFs, leads to large differences in the level of counts for each factor. The mean (scaling factor × TEF)/Rpb3 ratio was then calculated for each bin after first excluding undefined ratios and the top 10% and bottom 10% of ratios. These were excluded to prevent single nucleotide spikes in the data seen with TEF-seq from skewing the mean.

Rpb2-normalized TEF profiles

Rpb2-normalized average TEF occupancy profiles (metagene occupancy profiles) were plotted in the same way as for Rpb3-normalized average TEF occupancy profiles. For single IP data, the mean 0.88 × (Paf1-FLAG TEF-seq)/(Rpb2-FLAG NET-seq) ratio was calculated for each bin. For sequential IP data, the mean 0.75 × (Rpb3-FLAG then Paf1-HA TEF-seq)/(Rpb3-FLAG then Rpb2-HA NET-seq) ratio was calculated for each bin.

Use of Rpb2 or Rpb3 NET-seq data

There are twelve Pol2 subunits in humans and yeast (Rpb1-12 in yeast). Here, single IP NET-seq data, giving a map of transcribing Pol2, have been obtained either by immunoprecipitating Pol2 via its Rpb3 subunit, as in the original protocol (Churchman and Weissman, 2011), or via its Rpb2 subunit. The Pol2 profiles obtained differ slightly depending on which subunit is IP’d, with slightly fewer reads in the 5′ region relative to the 3′ region of genes for Rpb2 than Rpb3. This results in different Pol2-normalized Paf1 occupancy profiles depending on which subunit is used for normalization (see Figures 3B and 3D). We envisage that the more exterior location of the Rpb3 subunit within the Pol2 complex enables this subunit to be more effectively IP’d from all regions of the gene, whereas the Rpb2 subunit, at the center of the Pol2 complex, may be masked by the binding of particular TEFs to Pol2, decreasing the ability to IP it from particular regions. As we predict Rpb3 NET-seq data give a truer signal for the profile of Pol2, we have used these data for the majority of our analysis. Rpb2 NET-seq data have only been used to confirm the similarity between Pol2-normalized Paf1-occupancy profiles obtained from single and sequential IP procedures. This is because only Rpb2 sequential IP NET-seq data have been obtained (Rpb3-FLAG IP followed by Rpb2-HA IP).

Heatmaps

All mRNA genes were ordered by length and split into 10nt bins. Using Rpb3 single IP NET-seq data for wild-type or paf1Δ strains, the mean number of counts in each bin was calculated. Negative mean counts were replaced with a zero. Mean counts for each bin across a gene were divided by the sum of the mean counts for that gene to normalize for differences in expression level. These values were converted to a heatmap using a linear scale from white to black.

Differential occupancy analysis

Signal for single IP Paf1 TEF-seq and single IP Rpb3 NET-seq from the TSS+400nt to TES-200nt were counted, thus excluding regions in which Paf1 shows large changes in association with Pol2, for two biological replicates for each carbon source condition. Differential occupancy was assessed using the DESeq algorithm within the DESeq2 package (Love et al., 2014) by comparing the counts for Paf1 and Rpb3 at all mRNA genes (Figure 2A; Table S3) or all mRNA and lncRNA genes (Figure 2B). This gives a log2 value for the relative Paf1 to Rpb3 occupancy for each gene and a p value, adjusted for multiple testing using the Benjamini-Hochberg method, showing the significance of Paf1-enrichment or depletion on Pol2 at each gene. For the different downstream analyses carried out on Paf1-enriched and -depleted mRNA gene groups different cut-offs were used based either by selecting genes with log2 relative Paf1 to Rpb3 occupancy above or below a certain threshold or by selecting genes with an adjusted p value below a certain threshold. Reductions in the stringency of the thresholds used to determine which genes are included in each group, increases the probability of the inclusion of false positives. However, such reductions are necessary for particular analyses to ensure that there are sufficient numbers of genes to analyze. By including larger numbers of genes in particular analyses, any observations are less likely to be the result of random noise thus increasing the robustness of these observations. For metagene profiles, including more genes helps to reduce signal noise. The particular cut-off used in each analysis and the number of genes in each group that this gives is specified when described in the main text.

Divergent promoters enrichment analysis

For the analysis of genes expressed divergently from bi-directional divergent promoters, all yeast mRNA genes at which a log2 relative Paf1 to Rpb3 occupancy value had been established using the DESeq algorithm were taken. The number of divergently expressed gene pairs within this list was then established (661 gene pairs). Divergently expressed gene pairs were defined as genes transcribed in opposite directions to each other with TSSs within 1200 bp upstream of each other and with no other gene from this list between these two genes. Two groups consisting of the 1200 most Paf1-enriched and the 1200 most Paf1-depleted genes were established based on ordering genes by their log2 value for relative Paf1 to Rpb3 occupancy. To establish the significance of the enrichment of divergently expressed gene pairs within these groups, 10000 random samples containing 1200 genes were repeatedly taken from the list of genes and the number of pairs within each sample was counted, giving a probability distribution of the number of expected divergently expressed gene pairs within a random sample of this size. This was compared to the number of divergently expressed gene pairs within each group to give a p value showing the significance of any enrichment or depletion above or below what would be expected by chance.

Promoter motif enrichment analysis

For analysis of the enrichment of motifs within the promoters of Paf1-enriched genes the 500 bp DNA sequence upstream of the TSS was taken from Paf1-enriched genes (p value < 0.1, 229 genes). DREME (Villanyi et al., 2014) was then used to search for significantly enriched motifs on both strands compared to the same promoter regions taken from Paf1-depleted genes (p value < 0.1, 219 genes).

RNA FISH analysis

Images were deconvolved using a conservative deconvolution method and 15 cycles using DeltaVision Softworx software. Image quantification was carried out using Fiji (Schindelin et al., 2012). In this, images were compressed to 2D images displaying the maximum intensity projection for each pixel across z stacks 12 to 22 (for CMK2) or 6-16 (for GAL1 AS). Cells were outlined manually. For nuclei, an automatic threshold was applied on the DAPI signal using Otsu’s thresholding algorithm (Otsu, 1979). Regions corresponding to cell nuclei above the threshold were then automatically outlined. For CMK2, Cy3 signal contrast was enhanced by applying the “Sharpen” processing command. Dots corresponding to CMK2 or GAL1 AS transcripts were then counted for cells and nuclei by applying the “Find Maxima” command with a noise tolerance adjusted so that control cells in which CMK2 or GAL1 was deleted gave no counts. For CMK2, 4 biological replicates were performed for both BY4741 and paf1Δ strains with 230-400 cells counted per replicate per strain. For GAL1 AS, 3 biological replicates were performed for both GAL1::ADH1 and GAL1::ADH1 paf1Δ strains with 150-350 cells counted per replicate per strain. The nuclear to whole cell transcript ratio was calculated for each replicate. Bar charts show the mean of these ratios for each transcript in each strain. Error bars show the SEM.

Data and Software Availability

The accession number for the raw and processed sequencing data reported in this paper is ArrayExpress: E-MTAB-4568. Images and other data are available at Mendeley (DOI: 10.17632/2vhybtrwvk.1).

Author Contributions

H.F. and J.M. designed the experiments. H.F and F.S.H. performed the experiments and analyzed the data. H.F. and J.M. wrote the manuscript. H.F., F.H., A.F., and J.M. revised the manuscript.

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Antibodies

Mouse monoclonal anti FLAG-tag peptide sequence (clone M2) (Dilution for western blot: 1:5000)	Sigma-Aldrich	Cat#F1804, RRID: AB_262044
Rat monoclonal anti HA-tag peptide sequence (clone 3F10) (Dilution for western blot: 1:500)	Roche	Cat#ROAHAHA
Rabbit monoclonal anti residues 1-80 from N terminus of Rpb1 (clone σ80) (Dilution for western blot: 7:1000)	Santa Cruz	Cat#sc-25758, RRID: AB_655813
Rat monoclonal anti-Ser2P Rpb1 CTD (clone 3E10) (Dilution for western blot: 1:1000)	Millipore	Cat#04-1571, RRID: AB_11212363
Rat monoclonal anti-Ser5P Rpb1 CTD (clone 3E8) (Dilution for western blot: 1:2000)	Millipore	Cat#04-1572, RRID: AB_10615822
Rat monoclonal anti-Ser7P Rpb1 CTD (clone 4E12) (Dilution for western blot: 1:2000)	Millipore	Cat#04-1570, RRID: AB_10618152

Chemicals, Peptides, and Recombinant Proteins

Anti-FLAG M2 affinity agarose beads (affinity gel)	Sigma-Aldrich	Cat#A2220
Anti-HA Dynabeads	Pierce	Cat#88837
3 × FLAG peptide	Sigma-Aldrich	Cat#F4799

Critical Commercial Assays

miRNeasy kit	QIAGEN	Cat#217004

Deposited Data

Raw and processed data	This paper	ArrayExpress: E-MTAB-4568
Images and data	This paper	Mendeley: DOI: 10.17632/2vhybtrwvk.1
Annotations of most abundant mRNA and lncRNA transcript isoforms	Pelechano et al., 2013	GEO: GSE39128
Annotations of introns and non-coding RNAs	Saccharomyces Genome Database (SGD)	http://www.yeastgenome.org/
S. cerevisiae reference genome: sacCer3 (April 2011 sequence)	SGD	http://www.yeastgenome.org/
Rpb3 ChIP-chip processed data	Mayer et al., 2010	ArrayExpress: E-TABM-1033
Processed data	Baejen et al., 2014	GEO: GSE59676
Gene lists from clustering analysis	Tuck and Tollervey, 2013	N/A

Experimental Models: Organisms/Strains

S. cerevisiae: Name = BY4741; Genotype = MATa; his3Δ1; leu2Δ0; met15Δ0; ura3Δ0	Euroscarf	N/A
S. cerevisiae: Name = YSC001; Genotype = RPB3-3xFLAG-NAT	Churchman and Weissman, 2011	N/A
S. cerevisiae: Name = BY4741 RPB3-3xFLAG; Genotype = RPB3-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 RPB2-3xFLAG; Genotype = RPB2-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 SPT6-3xFLAG; Genotype = SPT6-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 SPT16-3xFLAG; Genotype = SPT16-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 PAF1-3xFLAG; Genotype = PAF1-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 PCF11-3xFLAG; Genotype = PCF11-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 CET1-3xFLAG; Genotype = CET1-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 SSU72-3xFLAG; Genotype = SSU72-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4742 SET2-3xFLAG; Genotype = SET2-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 RPB3-3xFLAG PAF1-3xHA; Genotype = RPB3-3xFLAG-His3MX6, PAF1-3xHA-KanMX6	This paper	N/A
S. cerevisiae: Name = BY4741 RPB3-3xFLAG RPB2-3xHA; Genotype = RPB3-3xFLAG-His3MX6, RPB2-3xHA-KanMX6	This paper	N/A
S. cerevisiae: Name = BY4741 RPB3-3xFLAG paf1Δ; Genotype = RPB3-3xFLAG-HisMX6, paf1::KanMX6	This paper	N/A
S. cerevisiae: Name = BY4741 paf1Δ; Genotype = paf1Δ::KanMX6	This paper	N/A
S. cerevisiae: Name = BY4741 slc1::URA3; Genotype = slc1::URA3	This paper	N/A
S. cerevisiae: Name = BY4741 slc1::URA3 RPB3-3xFLAG; Genotype = slc1::URA3, RPB3-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 slc1::URA3 PAF1-3xFLAG; Genotype = slc1::URA3, PAF1-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 cmk2::URA3; Genotype = cmk2::URA3	This paper	N/A
S. cerevisiae: Name = BY4741 cmk2::URA3 RPB3-3xFLAG; Genotype = cmk2::URA3, RPB3-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 cmk2::URA3 PAF1-3xFLAG; Genotype = cmk2::URA3, PAF1-3xFLAG-His3MX6	This paper	N/A
S. cerevisiae: Name = BY4741 GAL1::ADH1_T; see Murray et al. (2012) for genotype details	Murray et al., 2012	N/A
S. cerevisiae: Name = BY4741 GAL1::ADH1_Tpaf1Δ; Genotype = BY4741 GAL1::ADH1_T, paf1::KanMX6	This paper	N/A

Sequence-Based Reagents

RT primer: 5′/5Phos/ATCTCGTATGCCGTCTTCTGCTTG/iSp18/CACTCA/iSp18/TCCGACGATCATTGATGGTGCCTACAG 3′	IDT	Custom order
Barcoded primer 1: AATGATACGGCGACCACCGAGATCTACACGATCGGAAGAGCACACGTCTGAACTCCAGTCACATGCCATCCGACGATCATTGATGG	IDT	Custom order
Barcoded primer 2: AATGATACGGCGACCACCGAGATCTACACGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGCATCTCCGACGATCATTGATGG	IDT	Custom order
Barcoded primer 3: AATGATACGGCGACCACCGAGATCTACACGATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCTCCGACGATCATTGATGG	IDT	Custom order
Barcoded primer 4: AATGATACGGCGACCACCGAGATCTACACGATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCATCCGACGATCATTGATGG	IDT	Custom order
Primer 1: CAAGCAGAAGACGGCATACGA	IDT	Custom order
Sequencing primer: TCCGACGATCATTGATGGTGCCTACAG	IDT	Custom order
Primers for reverse transcription PCR. See Table S5	N/A	N/A
RNA FISH probes. See Table S6	Biosearch Technologies	Custom order

Software and Algorithms

Galaxy	Blankenberg et al., 2010	https://usegalaxy.org
Bowtie for Illumina	Langmead, 2010	https://usegalaxy.org
Samtools	Li et al., 2009	https://usegalaxy.org
Integrated Genome Viewer	Thorvaldsdóttir et al., 2013	http://software.broadinstitute.org/software/igv/
R/Bioconductor	Bioconductor	https://www.bioconductor.org/
DREME	Bailey, 2011	http://meme-suite.org/
DESeq2	Love et al., 2014	https://bioconductor.org/packages/release/bioc/html/DESeq2.html
DeltaVision Softworx	GE Healthcare	N/A
Fiji	Schindelin et al., 2012	http://imagej.net/Fiji

89 in total

1. Using both strands: The fundamental nature of antisense transcription.

Authors: Struan C Murray; Jane Mellor
Journal: Bioarchitecture Date: 2016

2. Different strategies for carboxyl-terminal domain (CTD) recognition by serine 5-specific CTD phosphatases.

Authors: Stéphane Hausmann; Hisashi Koiwa; Shankarling Krishnamurthy; Michael Hampsey; Stewart Shuman
Journal: J Biol Chem Date: 2005-09-07 Impact factor: 5.157

3. A novel assay identifies transcript elongation roles for the Nup84 complex and RNA processing factors.

Authors: Cristina Tous; Ana G Rondón; María García-Rubio; Cristina González-Aguilera; Rosa Luna; Andrés Aguilera
Journal: EMBO J Date: 2011-04-08 Impact factor: 11.598

4. Single-molecule mRNA decay measurements reveal promoter- regulated mRNA stability in yeast.

Authors: Tatjana Trcek; Daniel R Larson; Alberto Moldón; Charles C Query; Robert H Singer
Journal: Cell Date: 2011-12-23 Impact factor: 41.582

5. Pol II CTD kinases Bur1 and Kin28 promote Spt5 CTR-independent recruitment of Paf1 complex.

Authors: Hongfang Qiu; Cuihua Hu; Naseem A Gaur; Alan G Hinnebusch
Journal: EMBO J Date: 2012-07-13 Impact factor: 11.598

6. A map of nucleosome positions in yeast at base-pair resolution.

Authors: Kristin Brogaard; Liqun Xi; Ji-Ping Wang; Jonathan Widom
Journal: Nature Date: 2012-06-28 Impact factor: 49.962

7. The Paf1 complex represses small-RNA-mediated epigenetic gene silencing.

Authors: Katarzyna Maria Kowalik; Yukiko Shimada; Valentin Flury; Michael Beda Stadler; Julia Batki; Marc Bühler
Journal: Nature Date: 2015-03-25 Impact factor: 49.962

8. A transcriptome-wide atlas of RNP composition reveals diverse classes of mRNAs and lncRNAs.

Authors: Alex Charles Tuck; David Tollervey
Journal: Cell Date: 2013-08-29 Impact factor: 41.582

9. Sumoylation of the THO complex regulates the biogenesis of a subset of mRNPs.

Authors: Hugo Bretes; Jérôme O Rouviere; Thibaut Leger; Marlene Oeffinger; Frédéric Devaux; Valérie Doye; Benoit Palancade
Journal: Nucleic Acids Res Date: 2014-02-05 Impact factor: 16.971

10. Extensive transcriptional heterogeneity revealed by isoform profiling.

Authors: Vicent Pelechano; Wu Wei; Lars M Steinmetz
Journal: Nature Date: 2013-04-24 Impact factor: 49.962

18 in total

1. Paf1 and Ctr9, core components of the PAF1 complex, maintain low levels of telomeric repeat containing RNA.

Authors: Joana Rodrigues; David Lydall
Journal: Nucleic Acids Res Date: 2018-01-25 Impact factor: 16.971

Review 2. Emerging Properties and Functional Consequences of Noncoding Transcription.

Authors: Ryan Ard; Robin C Allshire; Sebastian Marquardt
Journal: Genetics Date: 2017-10 Impact factor: 4.562

Review 3. Emerging Insights into the Roles of the Paf1 Complex in Gene Regulation.

Authors: S Branden Van Oss; Christine E Cucinotta; Karen M Arndt
Journal: Trends Biochem Sci Date: 2017-09-01 Impact factor: 13.807

4. The Paf1 Complex Broadly Impacts the Transcriptome of Saccharomyces cerevisiae.

Authors: Mitchell A Ellison; Alex R Lederer; Marcie H Warner; Travis N Mavrich; Elizabeth A Raupach; Lawrence E Heisler; Corey Nislow; Miler T Lee; Karen M Arndt
Journal: Genetics Date: 2019-05-15 Impact factor: 4.562

5. Reliance of Host-Encoded Regulators of Retromobility on Ty1 Promoter Activity or Architecture.

Authors: Alicia C Salinero; Simey Emerson; Tayla C Cormier; John Yin; Randall H Morse; M Joan Curcio
Journal: Front Mol Biosci Date: 2022-07-01

6. PAF1 regulation of promoter-proximal pause release via enhancer activation.

Authors: Fei Xavier Chen; Peng Xie; Clayton K Collings; Kaixiang Cao; Yuki Aoi; Stacy A Marshall; Emily J Rendleman; Michal Ugarenko; Patrick A Ozark; Anda Zhang; Ramin Shiekhattar; Edwin R Smith; Michael Q Zhang; Ali Shilatifard
Journal: Science Date: 2017-08-31 Impact factor: 47.728

7. Biochemical insights into Paf1 complex-induced stimulation of Rad6/Bre1-mediated H2B monoubiquitination.

Authors: Feilong Chen; Beibei Liu; Lu Guo; Xuan Ge; Wei Feng; De-Feng Li; Hao Zhou; Jiafu Long
Journal: Proc Natl Acad Sci U S A Date: 2021-08-17 Impact factor: 11.205

Review 8. The Paf1 Complex: A Keystone of Nuclear Regulation Operating at the Interface of Transcription and Chromatin.

Authors: Alex M Francette; Sarah A Tripplehorn; Karen M Arndt
Journal: J Mol Biol Date: 2021-04-01 Impact factor: 6.151

9. The ribosome assembly gene network is controlled by the feedback regulation of transcription elongation.

Authors: Fernando Gómez-Herreros; Thanasis Margaritis; Olga Rodríguez-Galán; Vicent Pelechano; Victoria Begley; Gonzalo Millán-Zambrano; Macarena Morillo-Huesca; Mari Cruz Muñoz-Centeno; José E Pérez-Ortín; Jesús de la Cruz; Frank C P Holstege; Sebastián Chávez
Journal: Nucleic Acids Res Date: 2017-09-19 Impact factor: 16.971

10. Activity-Dependent Regulation of Alternative Cleavage and Polyadenylation During Hippocampal Long-Term Potentiation.

Authors: Mariana M Fontes; Aysegul Guvenek; Riki Kawaguchi; Dinghai Zheng; Alden Huang; Victoria M Ho; Patrick B Chen; Xiaochuan Liu; Thomas J O'Dell; Giovanni Coppola; Bin Tian; Kelsey C Martin
Journal: Sci Rep Date: 2017-12-12 Impact factor: 4.379