
Integrative analysis reveals histone demethylase LSD1 promotes RNA polymerase II pausing.

Hani Jieun Kim^1,2, Pishun Li^3, Taiyun Kim^1,2,4, Andrew J Oldfield^5, Xiaofeng Zheng^3, Pengyi Yang^1,2,4.

Abstract

Lysine-specific demethylase 1 (LSD1) is well-known for its role in decommissioning enhancers during mouse embryonic stem cell (ESC) differentiation. Its role in gene promoters remains poorly understood despite its widespread presence at these sites. Here, we report that LSD1 promotes RNA polymerase II (RNAPII) pausing, a rate-limiting step in transcription regulation, in ESCs. We found the knockdown of LSD1 preferentially affects genes with higher RNAPII pausing. Next, we demonstrate that the co-localization sites of LSD1 and MYC, a factor known to regulate pause-release, are enriched for other RNAPII pausing factors. We show that LSD1 and MYC directly interact and MYC recruitment to genes co-regulated with LSD1 is dependent on LSD1 but not vice versa. The co-regulated gene set is significantly enriched for housekeeping processes and depleted of transcription factors compared to those bound by LSD1 alone. Collectively, our integrative analysis reveals a pleiotropic role of LSD1 in promoting RNAPII pausing.


Keywords: Molecular biology; molecular mechanism of gene regulation; omics

Year: 2022 PMID: 36124234 PMCID: PMC9482124 DOI: 10.1016/j.isci.2022.105049

Source DB: PubMed Journal: iScience ISSN: 2589-0042

Introduction

Lysine-specific demethylase 1 (LSD1) (Shi et al., 2004), also known as KDM1A/AOF2, plays a pleiotropic role in a broad spectrum of biological processes including stem cell pluripotency, differentiation, and cancer (Amente et al., 2013). It binds to both promoters and enhancers to specifically catalyze the demethylation at H3K4 and H3K9 (Shi et al., 2004). In embryonic stem cells (ESCs), the inhibition or ablation of LSD1 has been shown to lead to severe proliferative defects and cell death (Adamo et al., 2011; Wang et al., 2009; Whyte et al., 2012). Many studies have provided further insight into the role of LSD1 at enhancers in diverse cell types including ESCs (AlAbdi et al., 2020; Petell et al., 2016; Tatsumi et al., 2020; Vinckier et al., 2020; Whyte et al., 2012). For example, LSD1 has been reported to decommission enhancers of ESC-specific genes during mouse ESC differentiation (Whyte et al., 2012). While these studies demonstrate the functional role of LSD1 at enhancers in ESCs, the functional implication of LSD1 binding at gene promoters remains largely unknown. At the promoters of actively transcribed genes, pausing of RNA polymerase II (RNAPII) during early elongation is an obligate part of the transcription cycle that is experienced by RNAPII at almost all genes (Core and Adelman, 2019) across cell and tissue types (Day et al., 2016). Blocking the release of the paused RNAPII through the use of potent small molecule inhibitors has been shown to globally trap RNAPII at promoters, thereby abrogating nearly all RNA synthesis in mammalian cells (Henriques et al., 2013; Jonkers and Lis, 2015; Rahl et al., 2010). Release of the paused RNAPII in promoter-proximal regions has been shown to be the rate-limiting step in productive elongation, making the efficiency of pause release a central determinant of gene expression.
Although early studies have shown that paused RNAPII is preferentially found at many developmental genes in the Drosophila melanogaster embryo (Zeitlinger et al., 2007), subsequent research has demonstrated that pausing is widespread in higher eukaryotes and is not limited to developmental genes but also occurs at genes associated with essential biological processes such as cell proliferation and stress response (Vinckier et al., 2020). For example, in ESCs, RNAPII pausing has been implicated in key processes such as self-renewal (Min et al., 2011) and differentiation (Core and Adelman, 2019; Gaertner and Zeitlinger, 2014; Jonkers and Lis, 2015; Levine, 2011). Thus, the identification of factors that modulate RNAPII pausing and release at genes of different functions is of great interest for understanding gene regulation. Many events and factors are involved in the establishment and release of paused RNAPII (Chen et al., 2018; Gamot et al., 2014; Gardini et al., 2014; Meng and Bartholomew, 2018). After recruitment to the promoter region of a gene, RNAPII begins transcribing a short, nascent RNA approximately 25–50 nucleotides long. However, RNAPII halts within the initially transcribed region and remains promoter-proximally paused until it receives further signals. The paused state is stabilized by SPT5 and NELF-A (Yamaguchi et al., 2013), with NELF-A preventing the reactivation of the RNAPII catalytic site (Henriques et al., 2018; Vos et al., 2018a, 2018b). The release of RNAPII into productive RNA synthesis is triggered by the activity of the kinase positive transcription elongation factor b (P-TEFb), of which CDK9 is the catalytic subunit. Recruitment of P-TEFb was proposed to be facilitated by c-MYC (MYC), BRD4, various subunits of the Mediator, and super elongation complexes (Li et al., 2018; Liu et al., 2015; Rahl et al., 2010).
Critically, the phosphorylation of SPT5 by P-TEFb has been shown to be directly linked to pause release (Cheng et al., 2012; Guo et al., 2000; Marshall and Price, 1995; Yamada et al., 2006), causing the dissociation of NELF-A from RNAPII to enable reactivation and continued elongation of the nascent RNA (Cheng and Price, 2007; Wu et al., 2003; Yamada et al., 2006). Here, we report that LSD1 promotes RNAPII pausing in mouse ESCs. We found that LSD1 presence is pervasive at gene promoters and in particular co-localizes with a number of factors associated with paused RNAPII. Correlation of LSD1 binding sites against a compendium of ChIP-seq datasets demonstrated a strong enrichment of factors involved in RNAPII pausing with LSD1, and the knockdown of LSD1 preferentially affects genes with higher RNAPII pausing than those with lower pausing. Our functional experiments demonstrate that LSD1 knockdown causes a global reduction of RNAPII pausing at genes, in particular those that are bound by LSD1. Interestingly, our analysis demonstrated shared and unique binding profiles between LSD1 and MYC, a well-known player in RNAPII release in mammalian cells (Rahl et al., 2010) that has been demonstrated to recruit the elongation factor SPT5 (Baluapuri et al., 2019) as well as facilitate the recruitment of P-TEFb (Rahl et al., 2010). To further understand the potential interplay between LSD1 and MYC in the regulation of RNAPII pausing, we investigated the co-localization sites of these two factors and found that they are associated with significantly higher RNAPII pausing and are more enriched for pausing and release factors. Finally, we reveal that LSD1 targets different gene categories depending on MYC occupancy. Collectively, our results unveil a role for LSD1 in regulating RNAPII pausing and thereby gene expression, and highlight a role of LSD1 at promoters that may be distinct from its role in ESC differentiation.

Results

LSD1 co-localizes with key components of the RNAPII pausing machinery at gene promoters in embryonic stem cells

We previously developed an analytic tool, PAD (http://pad2.maths.usyd.edu.au/), for identifying TFs, chromatin remodelers, and histone modifications that co-localize on chromatin based on their genomic binding profiles in ESCs, as measured using ChIP-seq (Yang et al., 2017). Here, we extended PAD to enable targeted co-localization analysis within 12 functional genomic regions identified by ChromHMM (Ernst and Kellis, 2017) (Figure S1; see STAR Methods). Previous studies have largely focused on the role of LSD1 at enhancers (Vinckier et al., 2020; Whyte et al., 2012), and despite the pervasiveness of LSD1 binding at promoters (Whyte et al., 2012), its role at promoters remains largely uninvestigated. To investigate the role of LSD1 at promoters, we correlated its binding profiles with those of other DNA- and chromatin-binding proteins at promoter regions using PAD. Our analysis shows a strong co-localization of LSD1 at promoters with core RNAPII pausing factors (RNAPII, CDK9, NELF-A, and SPT5) and additional factors such as MYC (Baluapuri et al., 2019; Price, 2010; Rahl et al., 2010) and BRD4 (Devaiah et al., 2012; Gatchalian et al., 2018; Itzen et al., 2014; Lu et al., 2015; Patel et al., 2013) (Figure 1A). In particular, LSD1 co-localized most strongly with CDK9, the enzymatic subunit of the pause release factor P-TEFb (Booth et al., 2018; Larochelle et al., 2012; Lu et al., 2016), across the three regulatory regions (Figure 1B). When we further investigated the co-localization at the binding sites of LSD1, we found that while the ChIP-seq signals of the LSD1-CoREST complex components CoREST, HDAC1, and HDAC2 were observed at comparable levels at promoters, bivalent domains, and enhancers (Figure 1C), the co-localization of LSD1 with RNAPII pausing and release factors was strongest at promoters (Figure 1C).
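
The ranking idea behind this kind of co-localization analysis can be illustrated with a minimal Python sketch. It is not the PAD implementation: it assumes that each factor's binding has been reduced to a binary vector over a shared set of promoter bins, and it simply ranks factors by their Pearson correlation with LSD1. The function names, the toy vectors, and the placeholder factor "PLURIPOTENCY_TF" are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant vector carries no co-localization signal
    return cov / sqrt(vx * vy)

def rank_by_colocalization(reference, others):
    """Rank factors by the correlation of their binding vector with the reference."""
    scores = {name: pearson(reference, vec) for name, vec in others.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (hypothetical): 1 = peak present in a promoter bin, 0 = absent.
lsd1 = [1, 1, 0, 1, 0, 0, 1, 1]
profiles = {
    "CDK9": [1, 1, 0, 1, 0, 0, 1, 1],
    "SPT5": [1, 1, 0, 1, 0, 1, 1, 0],
    "PLURIPOTENCY_TF": [0, 0, 1, 0, 1, 1, 0, 0],
}
ranking = rank_by_colocalization(lsd1, profiles)
for name, score in ranking:
    print(f"{name}\t{score:.3f}")
```

Restricting the bins to one class of regulatory region at a time (promoters, bivalent domains, or enhancers) would give a region-specific ranking in the spirit of the analysis described above.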

Figure 1

Co-localization of LSD1 with RNAPII pausing machinery across regulatory regions

(A) DNA binding proteins are ranked by the correlation of binding sites with LSD1 at promoter regions. A panel of genes known to be associated with RNAPII pausing are highlighted in red and two ESC pluripotency-associated TFs are highlighted in green for contrast. Gene set enrichment test was applied to the RNAPII-associated proteins (those in red) with respect to their correlation with LSD1.

(B) Co-localization heatmaps of LSD1 and selected RNAPII-associated factors across three regulatory regions (promoters, bivalent domains, and enhancers).

(C) Binding sites of LSD1, components of the LSD1 complex (CoREST, HDAC1, and HDAC2), and factors associated with the RNAPII pausing machinery (RNAPII, NELF-A, MYC, CDK9, BRD4, and SPT5) at the three regulatory regions.

(D) Pie charts showing the proportion of binding sites of factors across the 12 regulatory regions.

LSD1 regulates the transcription of genes associated with RNAPII pausing

Given the widespread presence of LSD1 at gene promoters, we first investigated the relationship between gene expression and LSD1 binding. We found that the higher the LSD1 binding signal at a gene promoter, the higher the expression of the gene in ESCs (Figure 2A). Comparing gene expression of wild-type ESCs with those after LSD1 knockdown shows that genes with higher expression are more downregulated upon LSD1 knockdown, providing evidence that LSD1 binding at gene promoters contributes to their transcriptional activation (Figure S2A). These results are consistent with experimental studies that have shown that LSD1 is a transcriptional activator (Amente et al., 2013; Sehrawat et al., 2018). We next sought to assess the relationship between RNAPII pausing and LSD1 binding of genes. To do this, we calculated the RNAPII pausing index (PI) of all genes based on RNAPII ChIP-seq data (Muse et al., 2007) (see STAR Methods) and correlated them with the LSD1 ChIP-seq signal. Comparison with global run-on sequencing (GRO-seq) (Williams et al., 2015), which measures nascent RNA from transcriptionally engaged RNAPII, confirms a strong positive correlation between RNAPII ChIP-seq and GRO-seq signals (Figures S2B and S2C). Our analysis suggests that LSD1 is preferentially bound at the promoters of paused genes (Figure 2B). Furthermore, we found that LSD1-bound genes show similar levels of RNAPII PI to those bound by other factors known to regulate RNAPII pausing or release in ESCs (Figure 2C). These results suggest that LSD1-bound genes are subject to RNAPII pausing.
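
As an illustration of how a pausing index can be computed, the following Python sketch assumes the commonly used definition: length-normalized RNAPII signal in a promoter-proximal window divided by length-normalized signal across the gene body. The window lengths, the pseudocount, and the read counts are hypothetical; the windows actually used in this study are specified in the STAR Methods.

```python
def pausing_index(promoter_reads, promoter_len, body_reads, body_len, pseudocount=1.0):
    """Promoter-proximal read density divided by gene-body read density.

    Higher values indicate more RNAPII held near the TSS relative to the
    gene body. The pseudocount (an assumption) avoids division by zero.
    """
    if promoter_len <= 0 or body_len <= 0:
        raise ValueError("window lengths must be positive")
    promoter_density = (promoter_reads + pseudocount) / promoter_len
    body_density = (body_reads + pseudocount) / body_len
    return promoter_density / body_density

# Hypothetical read counts for two genes with the same gene-body signal.
paused = pausing_index(promoter_reads=499, promoter_len=500,
                       body_reads=999, body_len=10_000)
unpaused = pausing_index(promoter_reads=49, promoter_len=500,
                         body_reads=999, body_len=10_000)
print(paused, unpaused)
```

Under this definition, a gene with strong promoter-proximal RNAPII signal and little gene-body signal receives a high PI, which is the sense in which "higher pausing" is used in the text.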

Figure 2

Knockdown of LSD1 affects genes with higher RNAPII pausing than those with lower pausing (see also Figure S2)

(A) Gene sets were partitioned according to the level of expression in ESCs and the level of LSD1 signal (RPM) (+/− 1kb around the transcription start site [TSS]) was quantified for each gene set.

(B) Boxplot of LSD1 signal (RPM) at gene promoters grouped according to RNAPII pausing index (PI), calculated as the ratio of RNAPII binding at the TSS against RNAPII binding across the gene body (see STAR Methods).

(C) Boxplot of RNAPII PI of genes bound by NELF-A, LSD1, CDK9, MYC, TBP, BRD4, and SPT5.

(D and E) Boxplot of log2 fold-change in gene expression for gene sets grouped in terms of RNAPII PI after knockdown of LSD1 in (d) mouse ESCs and (e) human ESCs (hESCs). ∗ denotes statistical significance (p < 0.05) using the Wilcoxon rank-sum test. For each boxplot, center line, median; box limits, upper and lower quartiles; whiskers, 1.5 times the interquartile range; points, outliers.

We reasoned that if LSD1 has a functional role in controlling gene expression via regulation of RNAPII pausing, loss of LSD1 expression should affect the expression of LSD1 target gene sets with higher RNAPII PI much more than those with lower RNAPII PI. To test this, we partitioned all quantified genes in ESCs into five groups and assessed their log2 fold change in expression before and after LSD1 knockdown (Foster et al., 2010). We found that the log2 fold change in gene expression after LSD1 knockdown scales with RNAPII pausing, indicating that LSD1 knockdown preferentially affects genes with higher RNAPII PI (Figure 2D). Lastly, we confirmed our observations in human ESCs, where LSD1 knockdown produced the same pausing-dependent log2 fold change in expression as in mouse ESCs (Adamo et al., 2011) (Figure 2E).
Collectively, these results suggest a potential role for LSD1 in regulating gene expression through RNAPII pausing, and these findings motivated us to investigate the mechanistic role of LSD1 in modulating transcriptional programs associated with RNAPII pausing.
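
The grouping step described above can be sketched as follows. This is a minimal Python illustration, assuming that genes are split into five equal-sized, rank-based groups by PI and that each group is summarized by its median log2 fold change. The grouping rule used in the study may differ, and the gene values below are invented.

```python
from statistics import median

def quantile_groups(values, n_groups=5):
    """Assign each value to one of n_groups rank-based bins (0 = lowest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    groups = [0] * len(values)
    for rank, idx in enumerate(order):
        groups[idx] = min(rank * n_groups // len(values), n_groups - 1)
    return groups

def median_by_group(groups, scores, n_groups=5):
    """Median score within each group, indexed by group number."""
    return [median(s for g, s in zip(groups, scores) if g == k)
            for k in range(n_groups)]

# Toy data (hypothetical): ten genes with a PI and a knockdown log2 fold change.
pi = [0.5, 1.2, 2.0, 3.1, 4.4, 5.0, 6.3, 7.7, 8.1, 9.9]
log2fc = [0.1, 0.0, -0.1, -0.2, -0.3, -0.4, -0.6, -0.7, -0.9, -1.1]
groups = quantile_groups(pi)
medians = median_by_group(groups, log2fc)
print(medians)
```

A monotonic trend in the per-group medians, from the lowest to the highest PI group, is the pattern the text describes as a pausing-dependent change in expression.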

LSD1 knockdown causes global reduction of RNAPII pausing

Given its association with RNAPII pausing and the transcription of genes, we next investigated the direct impact of LSD1 knockdown on RNAPII pausing at gene promoters. By genome-wide ChIP-seq profiling of RNAPII with or without LSD1 knockdown (Figures 3A and 3B), we found that the depletion of LSD1 in ESCs reduced RNAPII pausing at various LSD1-bound genes (Figures 3A and S3C). Meta-gene analysis of the average RNAPII ChIP-seq signal across the gene bodies of all LSD1-bound genes revealed a clear reduction of RNAPII pausing (Figures 3B and S3D). Notably, we found that the knockdown of LSD1 led to a global reduction of RNAPII pausing (Figure 3C), and the extent of this reduction was significantly greater in LSD1-bound genes than in all genes, even when genes were stratified into low and high transcriptional groups (Figures 3C and S3E). Collectively, these results indicate that LSD1 promotes RNAPII pausing and that a decrease in LSD1 expression causes a reduction of RNAPII pausing at gene promoters.

Figure 3

Impact of LSD1 knockdown on RNAPII pausing (see also Figure S3)

(A) ChIP-seq profiles of LSD1 and RNAPII with non-targeted (NT) or shLSD1 knockdown (KD) at six example genes. Promoter regions of genes are highlighted.

(B) Meta-gene body profile of input (blue) and RNAPII ChIP-seq (red) signal in NT and LSD1 KD samples.

(C) Boxplot showing the fold change (LSD1 KD versus NT) of RNAPII pausing index (PI) in LSD1-bound genes or all genes that have quantified PIs. p-values denote the statistical significance using the Wilcoxon rank-sum test. Center line, median; box limits, upper and lower quartiles; whiskers, 1.5 times the interquartile range; points, outliers.


LSD1 and MYC co-occupancy is enriched for RNAPII pausing

MYC is known to play a major role in RNAPII release in mammalian cells (Rahl et al., 2010). It binds to promoter-proximal regions of transcribed genes to regulate the release of paused RNAPII by recruiting P-TEFb (Rahl et al., 2010). Recently, MYC has also been shown to recruit the elongation factor SPT5, a subunit of the elongation factor DSIF, to promoter-proximal RNAPII, wherein SPT5 travels with RNAPII and enhances its processivity during transcriptional elongation (Baluapuri et al., 2019). Given the important role of MYC in regulating the transcriptional activity of RNAPII and the strong co-localization between LSD1 and the pause-release factor P-TEFb, we asked whether comparing sites of LSD1 and MYC co-localization against sites where either LSD1 or MYC is found alone may provide additional insight into the role of LSD1 in RNAPII pause release. To test this, we partitioned binding sites of LSD1 and MYC into three categories: MYC-specific sites, co-localized sites, and LSD1-specific sites (Figure 4A). Approximately 57% of MYC binding sites were co-localized with LSD1 and 19% of LSD1 binding sites were co-localized with MYC. As a quality control, we assessed the LSD1 signal at MYC-specific sites and the MYC signal at LSD1-specific sites and found that they are significantly depleted, as expected from the definition of these categories (Figures 4A and S4A). In comparison, the signal of each factor at the co-localized sites was similar to its signal at its own specific sites.
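
The three-way partition can be illustrated with a small Python sketch. It assumes that peaks are represented as half-open (chromosome, start, end) intervals and that any overlap of at least one base counts as co-localization; the overlap criterion used in the study is not stated in this section, so that rule, the function names, and the toy peaks are assumptions.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def partition_sites(myc_peaks, lsd1_peaks):
    """Split peaks into MYC-specific, co-localized, and LSD1-specific lists.

    Co-localized sites are reported as the MYC peaks that overlap an LSD1 peak.
    The pairwise scan is quadratic and only suitable for small examples.
    """
    myc_specific, colocalized = [], []
    for m in myc_peaks:
        if any(overlaps(m, l) for l in lsd1_peaks):
            colocalized.append(m)
        else:
            myc_specific.append(m)
    lsd1_specific = [l for l in lsd1_peaks
                     if not any(overlaps(l, m) for m in myc_peaks)]
    return myc_specific, colocalized, lsd1_specific

# Toy peaks (hypothetical coordinates).
myc = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
lsd1 = [("chr1", 150, 250), ("chr1", 900, 1000), ("chr2", 300, 400), ("chr2", 150, 180)]
myc_only, shared, lsd1_only = partition_sites(myc, lsd1)
print(len(myc_only), len(shared), len(lsd1_only))
print(f"fraction of MYC sites co-localized: {len(shared) / len(myc):.2f}")
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the pairwise scan; the classification logic stays the same.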

Figure 4

Partition of genome-wide binding sites into MYC-specific, co-localized, and LSD1-specific sites reveals enrichment of pausing (see also Figure S4)

(A) Density plots and boxplots of MYC and LSD1 occupancy (RPM) at the three genomic sites. The three lines denote the averaged signal across sites from each category: MYC-specific (orange), co-localized (purple), and LSD1-specific (blue) sites.

(B) Venn diagram showing the overlap of binding sites by the presence of either MYC and/or LSD1.

(C and D) Boxplots of (c) regularised and log-transformed RNA expression and (d) RNAPII pausing in WT ESCs. Wilcoxon rank-sum test was used in (C and D). Center line, median; box limits, upper and lower quartiles; whiskers, 1.5 times the interquartile range; points, outliers.

(E) Co-immunoprecipitation experiment reveals a direct interaction between LSD1 and MYC with or without the presence of benzonase in mouse embryonic stem cells.

(F and G) Bar plot of ChIP-qPCR results showing MYC and LSD1 occupancy at selected genes in ESCs transduced with GFP (Non-targeted, NT), MYC (shMYC), or LSD1 (shLSD1) shRNA lentivirus. For ChIP-qPCR analysis, ESCs were collected 96h after transduction. % Input values were plotted as mean ± standard error. Student’s t-test; ∗p-value<0.05; ∗∗p-value<0.01; ∗∗∗p-value<0.001, ns = not significant.

Next, with the hypothesis that LSD1 co-occupancy would reveal a differential binding profile of RNAPII pausing factors at MYC binding sites, we evaluated the difference in occupancy of RNAPII pausing factors (RNAPII, TBP, NELF-A, SPT5, BRD4, and CDK9) at these regions. Strikingly, we saw that compared to the LSD1-specific or MYC-specific sites, sites co-occupied by LSD1 and MYC were significantly enriched for all RNAPII pausing factors.
Interestingly, we observed that MYC-specific and co-localized sites are epigenetically more permissive with higher H3K27ac and H3K4me3 and lower H3K27me3 signals (Figures S4B and S4C) than LSD1-specific sites, suggesting that their target genes may be more transcriptionally active. Collectively, these findings show that MYC and LSD1 co-localized sites are active promoters that are enriched for RNAPII pause/release factors. To further characterize genes regulated by either LSD1 or MYC or both (Figure 4B), we identified the target genes bound by LSD1 and/or MYC and analyzed their expression profiles in ESCs. While all three gene sets have higher than average transcription in ESCs (Figure 4C), we found that genes targeted by MYC or both MYC and LSD1 were significantly more expressed than LSD1-specific target genes (Figure 4C), consistent with what we observed at the epigenetic level (Figures S4B and S4C). We next compared the RNAPII PI of the three sets of target genes in ESCs (Figure 4D). We found genes bound specifically by MYC, a factor well-documented for its role in RNAPII pause release (Baluapuri et al., 2019; Lu et al., 2015; De Pretis et al., 2017; Price, 2010; Rahl et al., 2010), had a similar level of RNAPII pausing compared to those bound specifically by LSD1 (Figure 4D), and consistent with our observation that the co-occupied sites are enriched for RNAPII pausing factors, we found that genes co-occupied by LSD1 and MYC exhibited significantly higher RNAPII pausing than those with LSD1 or MYC alone.

MYC binding at co-localized target genes is dependent on the presence of LSD1

To further strengthen the mechanistic insight into the role of LSD1 in RNAPII pausing, we investigated the interaction between MYC and LSD1. Our association studies suggested a functional link between LSD1 and MYC, as revealed by the significant enrichment of RNAPII-associated factors at target genes co-bound by LSD1 and MYC compared to the singly occupied target genes. First, we performed co-immunoprecipitation analysis to test for an interaction between LSD1 and MYC, which revealed that, indeed, they directly interact in ESCs and do so independently of the presence of DNA and RNA (Figure 4E). Next, we investigated whether the binding of MYC at its target sites is dependent on the presence of LSD1. Previously, MYC has been shown to be responsible for the recruitment of SPT5 (Baluapuri et al., 2019); therefore, the dependency of MYC recruitment on LSD1 has significant implications for RNAPII pausing and elongation. Leveraging the target genes bound by LSD1 or MYC or both, we performed MYC and LSD1 ChIP-qPCR at these sites under either LSD1 or MYC knockdown. Our findings revealed that the binding of MYC is dependent on the presence of LSD1 specifically at target genes that are co-regulated by these two factors, whereas MYC binding was not affected by the depletion of LSD1 at sites specifically targeted by MYC (Figure 4F). In contrast, depletion of MYC had no effect on LSD1 binding at co-regulated target genes, suggesting that the binding of LSD1 is independent of the presence of MYC (Figure 4G). Collectively, these findings support a co-operative role of LSD1 and MYC in regulating transcription.

Occupancy of LSD1 and MYC is associated with different transcription factors

As our analysis suggested differential regulation of genes targeted by both LSD1 and MYC versus those targeted by LSD1 or MYC alone, we performed de novo motif discovery from the DNA sequences of the three sets of binding sites (Figure S5A) and transcription factor (Figure S5B) enrichment from the three gene sets to further characterize their difference. We found that the top two enriched motifs differed between the three sets of binding sites. Specifically, we found that the sites co-occupied by LSD1 and MYC, but not those by MYC alone, were enriched with motifs of KLF family member SP1 (Kaczynski et al., 2003) (Figure S5A). This is in agreement with previous research that found that KLF family members are DNA-binding transcription factors specifically associated with RNAPII promoters with slow TBP (TATA-binding protein) turnover, which is associated with high transcriptional activity when compared to promoters with fast TBP turnover (Hasegawa and Struhl, 2019). At LSD1-specific binding sites, we found that the most enriched transcription factor in this gene set is CTCF, the binding of which at promoter-proximal regions has recently been shown to regulate distal enhancer-dependent gene transcription (Kubo et al., 2021). We note that the TF enrichment is consistent with our de novo motif analysis as we generated a motif resembling those of Lin28A in the LSD1-specific binding sites, which have been shown to be occupied near CTCF target sites (Tan and Yeo, 2016) (Figure S5). MYC and REST (a subunit of coREST complex) are highlighted as positive controls (Figure S5B). Collectively, these enrichment studies suggest different factors are involved in the regulation of the differentially occupied target genes.

Occupancy of LSD1 and MYC is associated with different gene functions

Lastly, we demonstrate that different gene sets are enriched in each category (Figure 5A). Whereas MYC-specific genes are enriched for pathways such as RNA splicing and miRNA processing, in agreement with previous knowledge (Kim et al., 2020), co-localized genes are largely enriched for metabolic processes, and LSD1-specific genes are enriched for pathways pertaining to cellular organization and localization (Figure 5A). Together, these results suggest that genes co-bound by LSD1 and MYC and those bound by them individually may undergo different transcriptional regulations and perform distinct biological functions.

Figure 5

Differential enrichment of transcription factors and housekeeping genes at LSD1 sites (see also Figure S5)

(A) Over-representation analysis of gene set enrichment from the three categories. X-axis denotes the degree of enrichment in terms of the negative log10 p-value.

(B) Overlap and ratios of MYC, LSD1, and their co-localization sites with transcription factors (left) and housekeeping genes (right).

(C) Test of non-independent overlap of (B) using Fisher’s exact test.

(D) Schematic of the differential regulation of LSD1 target genes. Sites occupied by LSD1 only and those co-occupied with RNAPII pausing machinery show differential transcriptional control, epigenetic landscape, and enrichment of transcription factors and housekeeping genes in ESCs.

Given the different functional enrichment of genes found by the above analysis, we next conducted a test for non-independent overlap of the three gene sets with transcription factors, which are frequently associated with enhancers and cell type-specificity (Eeckhoute et al., 2009), and housekeeping genes. Intriguingly, we found a reciprocal pattern of enrichment for transcription factors vs. housekeeping genes in the three gene sets. LSD1-specific target genes were enriched for genes encoding transcription factors whilst MYC-specific and co-localized genes were not (Figures 5B and 5C). In contrast, co-localized and MYC-specific target genes were enriched for housekeeping genes whilst LSD1-specific genes were depleted of them, consistent with the gene set enrichment results showing that these sites were enriched for genes related to, for example, metabolic processes that take place regardless of the cell type (Figure 5A).
Collectively, our findings reveal a putative role of LSD1 in RNAPII pausing at promoters through its association with MYC and P-TEFb that may be distinct from its enhancer-related role in ESC differentiation (Whyte et al., 2012) and demonstrate differential transcriptional control, epigenetic landscape, and enrichment of transcription factors and housekeeping genes between LSD1 sites distinguished by RNAPII pausing (Figure 5D).
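
The overlap test described above can be sketched with a one-sided Fisher's exact test, which for a 2×2 table is equivalent to a hypergeometric tail probability. The Python example below is a minimal illustration using only the standard library; the gene counts are invented and the function name is hypothetical.

```python
from math import comb

def fisher_enrichment_p(overlap, set_size, category_size, universe):
    """One-sided Fisher's exact p-value, P(X >= overlap).

    X is the number of genes shared between a target gene set of size
    set_size and a category (for example, transcription factors) of size
    category_size, both drawn from a universe of `universe` genes.
    """
    max_k = min(set_size, category_size)
    total = comb(universe, set_size)
    tail = sum(
        comb(category_size, k) * comb(universe - category_size, set_size - k)
        for k in range(overlap, max_k + 1)
    )
    return tail / total

# Hypothetical counts: 20 genes in total, 5 in the category, 5 in the target set.
p_enriched = fisher_enrichment_p(overlap=4, set_size=5, category_size=5, universe=20)
p_no_overlap = fisher_enrichment_p(overlap=0, set_size=5, category_size=5, universe=20)
print(p_enriched, p_no_overlap)
```

Depletion would be assessed with the opposite tail, P(X <= overlap), computed by summing from 0 up to the observed overlap.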

Discussion

Here we show how the association of LSD1 at the promoters of actively transcribed genes corresponds to genes engaged in RNAPII pausing in ESCs and demonstrate a functional role of LSD1 in RNAPII pausing. LSD1 has been shown to decommission enhancers during ESC differentiation (Vinckier et al., 2020; Whyte et al., 2012), but its role at gene promoters remains elusive. Although the regulation of LSD1 at gene promoters can be attributed to both its binding at distal regulatory elements such as enhancers and promoter-proximal regions, we found that the majority of the LSD1 binding sites are located at promoter-proximal regions (71.5%, Figure 1D). Several lines of evidence suggest the involvement of LSD1 in RNAPII pausing. We observed that the LSD1 signal was strongly and positively correlated with gene expression, supporting a transactivating role of LSD1 in gene transcription. This is consistent with a recent study that demonstrates that LSD1 is recruited to promoters by c-MYB to promote active transcription (Ahmed and Streit, 2018). Previously, LSD2, the only mammalian homolog of LSD1, has also been shown to have an important function in active gene transcription elongation (Fang et al., 2010). Moreover, our genome-wide chromatin-binding profiles demonstrated that LSD1 binding sites at RNAPII paused sites are strongly co-localized with RNAPII pause-release factors (Figure 1C). A recent study has demonstrated a functional role of SIRT6, a histone deacetylase, in the release of RNAPII pausing by preventing the eviction of NELF-E from the chromatin (Etchegaray et al., 2019), highlighting the direct involvement of chromatin remodellers in RNAPII pausing and release. Previous studies have shown that LSD1 physically interacts with RNAPII (Latos et al., 2015), MYC (Baluapuri et al., 2019; Weimann et al., 2013), and SNAIL (Carmichael et al., 2020), which in Drosophila embryos has been shown to inhibit the release of paused RNAPII (Bothma et al., 2011).
Among 22 methyltransferases, LSD1 was the only one found to be in direct physical contact with MYC in the methyltransferase interactome (Baluapuri et al., 2019; Weimann et al., 2013). We confirmed LSD1 as a physical interactor of MYC in mouse ESCs through co-immunoprecipitation analysis (Figure 4E). Our findings further demonstrate a causal link between LSD1 and MYC function by investigating the LSD1 dependency of MYC genomic binding. Our results revealed that the recruitment of MYC is dependent on the presence of LSD1 specifically at target genes that are co-regulated by these two factors but not vice versa. Collectively, these findings strongly support the co-operative role between LSD1 and MYC in regulating transcription and highlight at least two distinct streams of MYC-regulated transcription that are either LSD1 dependent or independent. A recent study demonstrated a generic role of MYC in governing the transition of RNAPII from a paused state into a transcriptionally engaged mode by directly recruiting SPT5 to RNAPII (Baluapuri et al., 2019). Consistent with the function of MYC as a universal amplifier of gene expression (Nie et al., 2012), the ablation of MYC led to global depletion of SPT5 in chromatin (Baluapuri et al., 2019). Strikingly, we found that among the MYC binding sites, the presence of LSD1 facilitated the enrichment of RNAPII pausing factors at the promoters of genes and a higher pausing index (Figure 4). In comparison, genes bound by MYC demonstrated lower RNAPII pausing index but still retained a transcriptional level comparable to genes bound by both factors. We note that whilst we defined sites or genes to be either bound or unbound in this study, the LSD1 signal is continuous (Figure S4A) and so are the other RNAPII pausing-associated factors (Figure S4B), and hence the effect of LSD1 is in general global as was seen in Figure 3C and was also demonstrated by the changes of the gene expression in Figure 2D. 
Consistent with these findings, it has previously been shown that genetic ablation and chemical inhibition of LSD1 in pluripotent cell lines (mouse ESCs and F9 cells) increased the level of H3K56ac (Yin et al., 2014), a histone modification associated with the gene body (Jonkers and Lis, 2015) and recently identified as a key histone mark, along with H3K9ac, that modulates transcriptional pausing and elongation (Etchegaray et al., 2019). In another study, LSD1 ablation and catalytic inhibition led to an upregulation of H4K16ac (Yin et al., 2014), a histone mark that has been implicated in the regulation of RNAPII promoter-proximal pausing by recruiting BRD4 and P-TEFb (Kapoor-Vazirani et al., 2011; Zippo et al., 2009). Thus, our integrative analysis suggests a role for LSD1 in the regulation of RNAPII pause release through its direct association with critical pause release factors including MYC. We note that these secondary changes in histone modification levels brought about by LSD1 knockdown, which have been shown to impact gene transcription, may partially contribute toward the global decrease in RNAPII occupancy observed in our study with LSD1 knockdown (Figure 3C). In addition, our genome-wide binding profiles reveal a strong co-localization of LSD1 and CDK9 (Figure 1C), the enzymatic subunit of P-TEFb, whose activity has been proposed as the rate-limiting step in paused RNAPII release (Gressel et al., 2017; Larochelle et al., 2012; Parua et al., 2018). The co-localization of LSD1 and CDK9 was observed at both promoter and enhancer elements, where CDK9 has been shown to be involved in enhancer-associated RNAPII pausing (Bacon and D’Orso, 2019; Ghavi-Helm et al., 2014; Henriques et al., 2018). Future studies on LSD1 and CDK9 may shed light on their relationship in regulating RNAPII pausing. In ESCs, the catalytic activity of LSD1 has been shown to render enhancers inactive and is associated with the silencing of target genes (Whyte et al., 2012).
Studies have shown that LSD1 occupies the regulatory domains of pluripotency genes, including transcription factors, and decommissions them only when cells undergo lineage-specific differentiation (Vinckier et al., 2020; Whyte et al., 2012). Although we found that LSD1-specific genes are enriched for transcription factors, supporting its role in regulating cell type-specific differentiation, genes co-bound by LSD1 and MYC are significantly depleted of transcription factors and instead enriched for housekeeping genes, in particular those involved in the regulation of cell type-invariant metabolic processes. Consistent with this, loss of LSD1 function has been shown to be associated with significant proliferative defects in neural (Sun et al., 2010) and embryonic stem cells (Yin et al., 2014) and has been implicated in cellular growth pathways and in the metastatic and oncogenic potential of several types of cancer (Amente et al., 2013; Wang et al., 2007, 2009). Our integrative analysis presents evidence for a previously unanticipated role of the histone demethylase LSD1 in cooperation with other RNAPII pausing factors in modulating cell type-specific and cell type-invariant genes and forms the basis for further investigation. Future work will be required to further elucidate the mechanism of LSD1 action in RNAPII pausing, including assessing the various phosphorylated forms of the carboxy-terminal domain of RNAPII upon LSD1 knockdown and evaluating how LSD1 activity affects its binding with MYC and its influence on RNAPII pausing.

Limitations of the study

Whilst our study highlights the role of LSD1 in RNAPII pausing, several limitations of the study remain to be addressed that will further shed light on its mechanism of action. For example, experiments probing the role of the LSD1 histone demethylase activity and its relationship with RNAPII pausing, the direct interaction of LSD1 with RNAPII pausing related factors, and the changes in the levels of key indicators of RNAPII pausing and releasing (such as RNAPII CTD Ser2 and Ser5 phosphorylation) should further reveal important mechanistic insights. We believe these are important future research avenues that will elucidate the exact mechanism of action of LSD1 in RNAPII pausing.

STAR★Methods

Key resources table

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact, Dr. Pengyi Yang (pengyi.yang@sydney.edu.au).

Materials availability

This study did not generate new unique reagents.

Experimental model and subject details

E14TG2a mouse ESCs (ATCC) were routinely cultured on gelatin-coated plates in DMEM (GIBCO) supplemented with 15% ESC-qualified fetal bovine serum (GIBCO), 10 mM 2-mercaptoethanol, 1 mM nonessential amino acids, and 1,000 U/mL of LIF (Millipore). For cell maintenance, ESCs were cultured in the serum-free ESGRO medium (Millipore). 293T cells were maintained in DMEM with 10% fetal bovine serum (Biological Industries) at 37°C with 5% CO2.

Method details

Lentiviral shRNA constructs, lentivirus preparation and transduction

shRNA constructs for mouse Lsd1 were designed to target 21-base-pair (bp) transcript-specific regions. Complementary single-stranded oligos were annealed and cloned into the AgeI and EcoRI sites of the pLKO.1-TRC lentivirus vector as recommended by The RNAi Consortium. The sequences targeted by the shRNAs are as follows: Lsd1-1#, GAGTTGAAAGAGCTTCTTAAT; Lsd1-2#, CCACAAGTCAAACCTTTATTT; c-Myc-1#, ACTTCACCAACAGGAACTATG; c-Myc-2#, TGGAGATGATGACCGAGTTAC. Lsd1-2# and c-Myc-2# were used in the RT-qPCR and ChIP-qPCR analyses. All plasmids were verified by sequencing. Lentivirus was produced by transfecting 293T cells. Briefly, shRNA constructs were cotransfected with the envelope plasmid pMD2.G (Addgene_12259) and the packaging vector psPAX2 (Addgene_12260), both kind gifts from Didier Trono, using LipoFiter reagent (Hanbio). Viruses were harvested 48 h after transfection and titered by colony formation assay. The viruses were then used to transduce E14 ESCs. To select for infected cells, puromycin was added to the media at a concentration of 1 μg/mL 24 h after infection. Cells were treated with puromycin for two days and harvested at day 4 for RNA extraction and RT-qPCR.

RT-qPCR

Total RNA was extracted using the GeneJET RNA Purification Kit (ThermoFisher Scientific, K0731). cDNA was synthesized from 1 μg of total RNA using the PerfectStart® Uni RT&qPCR Kit for qPCR (Transgen Biotech, AUQ-01) according to the manufacturer’s instructions. RT-qPCR was carried out with PerfectStart® Green qPCR SuperMix (Transgen Biotech, AQ601-01) using Roche LightCycler 480 (Roche). The relative expression level was determined using the 2^(-ΔCq) method and normalized to GAPDH expression. The primers used in the paper are listed in Table S1.
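As an illustration of the relative quantification described above, the following minimal Python sketch computes 2^(-ΔCq) with ΔCq taken as Cq(target) minus Cq(reference, here GAPDH). The function name and the Cq values are hypothetical and are not taken from this study.

```python
def relative_expression(cq_target: float, cq_reference: float) -> float:
    """Relative expression by the 2^(-dCq) method.

    dCq = Cq(target gene) - Cq(reference gene, e.g. GAPDH).
    """
    delta_cq = cq_target - cq_reference
    return 2.0 ** (-delta_cq)


# Hypothetical Cq values for one sample (illustration only).
example = relative_expression(cq_target=24.0, cq_reference=18.0)
print(example)
```

A target that crosses threshold six cycles later than the reference is thus reported at 2^(-6) of the reference level.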

Co-immunoprecipitation

E14 ESCs were collected and lysed in lysis buffer (50 mM Tris–HCl [pH 7.5], 150 mM NaCl, 1% Triton X-100, 1 mM EDTA, protease inhibitor cocktail and 1 mM DTT). The lysates were treated with 20 U/mL benzonase (Solarbio) and then cleared by centrifugation at 12,000 rpm for 20 min at 4°C. The supernatant was diluted with binding buffer (50 mM Tris–HCl [pH 7.5], 150 mM NaCl, 8% glycerol, 1 mM EDTA, 1× protease inhibitor cocktail and 1 mM DTT) until the concentration of Triton X-100 decreased to 0.2%. Antibodies (c-Myc, Abcam#ab32072; LSD1, a gift kindly provided by Jiemin Wong) were added to the lysate at a final concentration of 1 μg/mg and incubated overnight at 4°C, followed by antibody-protein complex capture with Protein A/G Magnetic Beads (Pierce, cat#88802). After washing (50 mM Tris–HCl [pH 7.5], 150 mM NaCl, 0.1% Triton X-100, 1 mM EDTA, protease inhibitor cocktail and 1 mM DTT), complexes were eluted, boiled in 1× SDS loading buffer (50 mM Tris–HCl [pH 6.8], 2% SDS, 100 mM DTT, 10% glycerol and 0.1% bromophenol blue dye) and analyzed by western blot.

Western blot analysis

Cells were lysed in ice-cold cell lysis buffer (50 mM Tris [pH 7.5], 150 mM NaCl, 1% Triton X-100, 1 mM EDTA) supplemented with protease inhibitor cocktail (Roche, cat#11836153001) and DTT. The proteins were separated on 10% SDS-PAGE gels and electrotransferred onto nitrocellulose transfer membranes (PALL, P-N66485), which were blocked in 5% skimmed milk. Probing was performed with specific primary antibodies (4°C, overnight) and horseradish peroxidase-conjugated secondary antibodies (room temperature, 1 h). The primary antibodies used in this study targeted LSD1 (gift of Prof. Jiemin Wong, dilution 1:1000; HUABIO #ER1802-12, dilution 1:1000), c-Myc (Abcam #ab32072, dilution 1:1000; ZENBIO #380784, dilution 1:1000), β-Tubulin (Abmart #M30109, dilution 1:2000), Pol II (Santa Cruz #sc-47701, dilution 1:1000), Phospho-Pol II (Ser2) (ZENBIO #381832, dilution 1:1000), Phospho-Pol II (Ser5) (ZENBIO #381833, dilution 1:1000), and GAPDH (HUABIO #ET1601-4, dilution 1:1000).

ChIP-qPCR and ChIP-seq

We followed an established ChIP protocol (Li et al., 2017) in which lentivirus-transduced ESCs were fixed using 1% formaldehyde for 10 min, and 0.125 M glycine was added to stop the fixation. Cells were then harvested, and DNA was fragmented to 300–500 base pairs (bp) by sonication with a Scientz08-III sonicator (SCIENTZ). 10% of the total supernatant was retained for sequencing the input control. Immunoprecipitation was performed with antibodies (Pol II, Santa Cruz#sc-47701; c-Myc, Abcam#ab32072; LSD1 antibody, a gift from Jiemin Wong) conjugated to Dynabeads protein A/G beads (Pierce, cat#88802). The DNA from ChIP was eluted, reverse-cross-linked, and purified with AMPure XP beads (Beckman). qPCR from ChIP experiments was carried out with PerfectStart® Green qPCR SuperMix (Transgen Biotech, AQ601-01) using Roche LightCycler 480 (Roche). For primer sequences used in ChIP-qPCR validation experiments, see Table S2. For ChIP-seq, 1 ng of ChIP DNA or input DNA was used to generate sequencing libraries using the NEBNext Ultra II DNA Library Prep Kit for Illumina (NEB). Libraries were sequenced on the NovaSeq 6000 (Illumina) using the 75-nt sequencing protocol.

PAD implementation

Public ChIP-seq data analysis

All public ChIP-seq data analyzed in this study were generated from mouse ESC lines (Yang et al., 2017). Reads from each ChIP-seq dataset were aligned to the mouse genome (mm9 assembly). For each DNA-binding protein, mapped reads were binned across the mouse genome with a 1 kb bin size, quantified as reads per kilobase per million mapped reads (RPKM), and used in PAD clustering (see next section). ChIP-seq read density plots were generated by counting the reads within 2.5 kb upstream and downstream of sites of interest in 100 bp windows, normalizing the counts to RPKM, and plotting them as histograms. Data for heatmaps were generated in a similar manner.
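The RPKM normalization of binned read counts can be sketched as follows. This is a minimal illustration using the standard RPKM definition (reads × 10^9 / (bin length in bp × total mapped reads)); the function name and the counts are hypothetical, and the sketch is not the code used to build PAD.

```python
import numpy as np


def rpkm(bin_counts, bin_size_bp, total_mapped_reads):
    """Convert raw read counts per bin to RPKM.

    RPKM = reads * 1e9 / (bin length in bp * total mapped reads)
    """
    counts = np.asarray(bin_counts, dtype=float)
    return counts * 1e9 / (bin_size_bp * total_mapped_reads)


# Hypothetical counts for three 1 kb bins in a library of one million mapped reads.
values = rpkm([10, 0, 5], bin_size_bp=1000, total_mapped_reads=1_000_000)
print(values)
```

The same function applies to the 100 bp windows used for the read density plots by setting `bin_size_bp=100`.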

PAD framework

PAD2 extends the PAD clustering implementation (Yang et al., 2017). It was developed in Python 3.7 using the Django web framework, and its interactive plots are rendered with Plotly.js. PAD clustering (Yang et al., 2017) (http://pad2.maths.usyd.edu.au) was previously developed to characterize the co-localization of TFs and epigenomic marks at various genomic regions based on their ChIP-seq profiling. Here, we extended PAD by increasing the number of partitions for genomic binding sites and histone modifications to 12 functional genomic regions identified by ChromHMM (Ernst and Kellis, 2017) in ESCs (Figure S1).

Mapping of genome to functional regions

A collection of transcription factor, chromatin remodeller, and histone mark ChIP-seq datasets was processed and quantified in 1 kb bins across the genome. Each peak file was mapped to the 12 genomic functional regions (Figure S1) using the intersect method in BEDTools v2.28.0 (Quinlan, 2014), and the fold change was calculated. The fold change of a protein binding at each functional region is defined as follows:

FC_r = P_r / G_r

where P_r denotes the percentage of protein binding at genomic region r, and G_r denotes the percentage of the whole genome that genomic region r covers.
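The fold change described above can be illustrated with a short Python sketch. It assumes the fold change is the simple ratio of the percentage of a protein's binding sites falling in a region to the percentage of the genome that the region covers; the function name, region names, and numbers are hypothetical.

```python
def region_fold_change(peaks_per_region, bp_per_region):
    """Fold change of binding per region: (% of peaks in region) / (% of genome in region).

    Both arguments are dicts keyed by region name; the ratio form is an assumption.
    """
    total_peaks = sum(peaks_per_region.values())
    total_bp = sum(bp_per_region.values())
    fold_change = {}
    for region, n_peaks in peaks_per_region.items():
        pct_binding = 100.0 * n_peaks / total_peaks
        pct_genome = 100.0 * bp_per_region[region] / total_bp
        fold_change[region] = pct_binding / pct_genome
    return fold_change


# Hypothetical peak counts and region sizes (illustration only).
peaks = {"promoter": 70, "enhancer": 20, "other": 10}
sizes = {"promoter": 10, "enhancer": 10, "other": 80}
fc = region_fold_change(peaks, sizes)
print(fc)
```

Values above 1 indicate that a protein binds a region more often than expected from the region's share of the genome.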

Clustering of DNA binding proteins

To perform clustering of DNA binding proteins at different genomic regions, we combine a set of peak files from a user-specified selection of DNA binding proteins and functional regions to a matrix. We then perform hierarchical clustering with Pearson’s correlation as a similarity metric.
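A minimal sketch of this clustering step, using SciPy, is given below. The rows of the matrix are proteins and the columns are functional regions. SciPy's `correlation` metric is 1 minus Pearson's correlation, so highly correlated profiles are close together. The average-linkage method, the function name, and the toy matrix are assumptions made for illustration; the source does not state which linkage PAD uses.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def cluster_proteins(matrix, n_clusters, method="average"):
    """Hierarchical clustering of rows using 1 - Pearson correlation as distance.

    The linkage method is an assumption; it is not specified in the text.
    """
    tree = linkage(np.asarray(matrix, dtype=float), method=method, metric="correlation")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical profiles of four proteins across four functional regions.
profiles = np.array([
    [1.0, 2.0, 3.0, 4.0],   # protein A
    [2.0, 4.0, 6.0, 9.0],   # protein B, similar profile to A
    [4.0, 3.0, 2.0, 1.0],   # protein C
    [9.0, 6.0, 4.0, 2.0],   # protein D, similar profile to C
])
labels = cluster_proteins(profiles, n_clusters=2)
print(labels)
```

With these toy profiles, A and B fall into one cluster and C and D into the other, because correlation distance ignores the overall scale of each row.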

Quantification and statistical analysis

Analysis and visualization of ChIP-seq data

ChIP-seq data (paired-end reads) generated in this study were aligned using STAR (v2.5.4b) (Dobin et al., 2013) with the following parameters: --outFilterMultimapNmax 1 --alignEndsType EndToEnd --outFilterMismatchNmax 3 --alignIntronMax 1. BedTools (v2.26.0) (Quinlan, 2014) and SamTools (v1.7) (Li et al., 2009) were used to manipulate and convert files. Coverage analyses on bed and bam files were performed using BedTools, and DeepTools (v2.0) (Ramírez et al., 2014) was used to generate the matrices, visualize them as heatmaps, and create the average ChIP-seq profiles.

Binding sites and target genes

To define binding sites for each DNA-binding factor, aligned reads were processed using SISSRs with a common input (Oldfield et al., 2014) and those with a p < 0.001, a stringent cut-off, were called as binding sites (Jothi et al., 2008). MYC-specific and LSD1-specific sites were defined as any MYC or LSD1 peaks that lack LSD1 or MYC ChIP-seq signals, respectively, at the same locus. Co-localized sites were determined as loci where MYC and LSD1 peaks overlap (within 500 bp window). Based on the mm9 RefSeq annotation, genes with closest TSSs to binding sites (within 1.5 kb) of a DNA-binding factor are assigned as the target genes of that factor.
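The partition of sites into co-localized, LSD1-specific, and MYC-specific sets can be sketched as below. This is a simplified illustration that works on peak centres only and treats two peaks as overlapping when their centres lie within 500 bp of each other; both choices are assumptions. The study additionally requires factor-specific sites to lack the other factor's ChIP-seq signal, which is not modelled here. All names and coordinates are hypothetical.

```python
from bisect import bisect_left


def build_index(peaks):
    """Group peak centres by chromosome and sort them for binary search."""
    index = {}
    for chrom, pos in peaks:
        index.setdefault(chrom, []).append(pos)
    for centres in index.values():
        centres.sort()
    return index


def has_partner(chrom, pos, partner_index, window=500):
    """True if any partner peak centre lies within +/- window bp of pos."""
    centres = partner_index.get(chrom, [])
    i = bisect_left(centres, pos - window)
    return i < len(centres) and centres[i] <= pos + window


def classify_sites(lsd1_peaks, myc_peaks, window=500):
    """Split peaks into co-localized, LSD1-specific and MYC-specific sets."""
    lsd1_index, myc_index = build_index(lsd1_peaks), build_index(myc_peaks)
    colocalized = [p for p in lsd1_peaks if has_partner(p[0], p[1], myc_index, window)]
    lsd1_specific = [p for p in lsd1_peaks if not has_partner(p[0], p[1], myc_index, window)]
    myc_specific = [p for p in myc_peaks if not has_partner(p[0], p[1], lsd1_index, window)]
    return colocalized, lsd1_specific, myc_specific


# Hypothetical peak centres (chromosome, position).
lsd1 = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
myc = [("chr1", 1200), ("chr2", 2000)]
co, lsd1_only, myc_only = classify_sites(lsd1, myc)
print(co, lsd1_only, myc_only)
```

In practice the same partition is commonly obtained with interval tools such as BEDTools; the pure-Python version is shown only to make the logic explicit.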

Calculation of RNAPII pausing index

To calculate the RNAPII pausing index, we used the method described by Day et al. (2016). Specifically, for each gene, we calculated the pausing index as follows:

pausing index = (RNAPII read density in the TSS region) / (RNAPII read density in the gene body)

where the transcription start site (TSS) region of a gene is defined as −50 bp to +300 bp around the TSS and the gene body is defined as +300 bp downstream of the TSS to +3 kb past the transcription end site. To segregate genes into different tiers of pausing, we categorized genes into five pausing groups based on their RNAPII pausing index at the promoter so that each group contained roughly the same number of genes. The same pausing groups (i.e., homologous genes) were used to evaluate the effect of LSD1 knockdown in hESCs. Differences in median RNAPII pausing indices were assessed using the Wilcoxon rank-sum test, where the number of asterisks denotes the level of statistical significance: ∗∗∗, p < 0.001; ∗∗, p < 0.01; and ∗, p < 0.05.
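The pausing index and the five equal-sized pausing groups can be sketched as follows. The sketch assumes the index is the ratio of RNAPII read density (reads per bp) in the 350 bp TSS region to that in the gene body, and that groups are assigned by rank; function names and all numbers are hypothetical.

```python
import numpy as np


def pausing_index(tss_reads, body_reads, body_len_bp, tss_len_bp=350):
    """Ratio of read density in the TSS region (-50 to +300 bp) to the gene body.

    Genes with zero gene-body reads would give a division by zero and
    should be filtered out beforehand.
    """
    tss_density = np.asarray(tss_reads, dtype=float) / tss_len_bp
    body_density = np.asarray(body_reads, dtype=float) / np.asarray(body_len_bp, dtype=float)
    return tss_density / body_density


def pausing_groups(index_values, n_groups=5):
    """Assign genes to rank-based groups of roughly equal size (1 = least paused)."""
    values = np.asarray(index_values, dtype=float)
    ranks = np.argsort(np.argsort(values))  # 0-based rank of each gene
    return ranks * n_groups // len(values) + 1


# Hypothetical gene: 350 reads in the TSS region, 1000 reads over a 10 kb gene body.
pi_example = pausing_index([350], [1000], [10_000])

# Hypothetical pausing indices for ten genes, split into five groups of two.
groups = pausing_groups([0.5, 3, 1, 8, 2, 20, 4, 0.1, 6, 10])
print(pi_example, groups)
```

Rank-based assignment keeps group sizes balanced even when the pausing index distribution is heavily skewed.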

Gene expression analysis

Previously published data from LSD1 knockout and WT mouse ESC samples measured using the Illumina MouseWG-6 v2.0 expression beadchip were downloaded from the NCBI Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) under accession number GSE21131 (Foster et al., 2010). The log2 fold change for each gene was then calculated by averaging across duplicates within the LSD1 KD and WT samples separately and subtracting the log2-transformed WT values from the log2-transformed LSD1 KD values. The same approach was taken to process data from MYC knockdown and WT mouse ESC samples measured using Illumina HiSeq 2500 (GSE113329; Seruggia et al., 2019) and microarray data from LSD1 knockdown and WT human ESC samples measured using the Affymetrix Human Promoter 1.0R Array (GSE24844; Adamo et al., 2011). RNA-seq data from ESCs in the naïve state (GSE117896) were used for defining expressed genes (Yang et al., 2019). Specifically, expressed genes were defined as any gene with a regularised log expression equal to or higher than 5.
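The log2 fold change calculation can be sketched as below. The sketch assumes that expression values are already log2-transformed, that replicates are averaged per condition, and that the fold change is reported as knockdown minus wild type; the function name and values are hypothetical.

```python
import numpy as np


def log2_fold_change(kd_log2, wt_log2):
    """Per-gene log2 fold change: mean of KD replicates minus mean of WT replicates.

    Inputs are genes x replicates arrays of log2 expression values (an assumption).
    """
    kd_mean = np.asarray(kd_log2, dtype=float).mean(axis=1)
    wt_mean = np.asarray(wt_log2, dtype=float).mean(axis=1)
    return kd_mean - wt_mean


# Hypothetical log2 expression for two genes measured in duplicate.
kd = [[8.0, 8.0], [10.0, 12.0]]
wt = [[9.0, 9.0], [10.0, 10.0]]
lfc = log2_fold_change(kd, wt)
print(lfc)
```

Under this convention, negative values indicate genes that are downregulated upon knockdown.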

Motif and transcription factor enrichment analysis

For motif enrichment analysis, DNA sequences (500 bp) flanking the center of the binding sites for each factor were first extracted from the mm9 assembly using the ‘getfasta’ command of bedtools (Quinlan and Hall, 2010). These sequences were subsequently searched using MEME (Bailey et al., 2006) with a minimum and maximum window size of 5 and 15, respectively, and the zero or one motif occurrence per sequence (zoops) option. The two most enriched motifs (based on the MEME reported E-value) were presented and annotated using known motifs. For the transcription factor enrichment analysis, we used the target gene sets and the ChIP-X Enrichment Analysis 3 (ChEA3) software to generate the gene-set associated TF ranks (Keenan et al., 2019). The transcription factor target over-representation analysis was performed using putative target genes determined by ChIP-seq experiments from ENCODE. A Fisher’s exact test (FET) with a background of 20,000 was used to quantify the significance of the overlap between the input gene set and the TF target gene sets, and the ten most closely associated TFs were visualized.
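The over-representation test can be illustrated with SciPy's `fisher_exact`. The sketch below builds the 2 × 2 contingency table for one TF target set against a background of 20,000 genes and runs a one-sided test. It is not ChEA3's own code; the function name, the choice of a one-sided alternative, and the gene counts are assumptions.

```python
from scipy.stats import fisher_exact


def tf_overlap_test(n_input, n_tf_targets, n_overlap, background=20000):
    """One-sided Fisher's exact test for enrichment of TF targets in an input gene set."""
    table = [
        # TF targets,              not TF targets
        [n_overlap, n_input - n_overlap],                       # in input set
        [n_tf_targets - n_overlap,
         background - n_input - n_tf_targets + n_overlap],      # not in input set
    ]
    odds_ratio, p_value = fisher_exact(table, alternative="greater")
    return odds_ratio, p_value


# Hypothetical counts: 200 input genes, 500 TF targets, 50 genes shared.
odds, p = tf_overlap_test(n_input=200, n_tf_targets=500, n_overlap=50)
print(odds, p)
```

With these counts only about 5 shared genes would be expected by chance, so an overlap of 50 yields a very small p-value.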

Functional enrichment analysis

Functional enrichment analysis, in terms of overrepresentation of genes from a pathway, was performed using Fisher’s exact test against the gene ontology (GO) terms from the GO database (Blake et al., 2015). p-values were adjusted using the Benjamini-Hochberg method.
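For reference, the Benjamini-Hochberg adjustment can be written in a few lines of NumPy. This is a generic sketch of the standard procedure, not the code used in this study; the function name and p-values are hypothetical.

```python
import numpy as np


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    p = np.asarray(pvalues, dtype=float)
    n = len(p)
    order = np.argsort(p)
    # Scale the i-th smallest p-value by n / i.
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    monotone = np.minimum.accumulate(scaled[::-1])[::-1]
    adjusted = np.empty(n)
    adjusted[order] = np.minimum(monotone, 1.0)
    return adjusted


# Hypothetical p-values for four GO terms.
adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(adj)
```

Each adjusted value is the smallest false discovery rate at which the corresponding GO term would be called significant.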

REAGENT or RESOURCE	SOURCE	IDENTIFIER
Antibodies

Anti-RNAP2	Santa Cruz	Cat#sc-47701; RRID:AB_677353
Anti-cMYC	Abcam	Cat#ab32072; RRID:AB_731658
Anti-LSD1	Gift from Jiemin Wong	N/A
Anti-beta-tubulin	Abmart	Cat#M30109; RRID:AB_2916070

Chemicals, peptides, and recombinant proteins

Puromycin	Selleck	S7417
Recombinant Mouse LIF Protein	Millipore	ESG1107
Protease inhibitor cocktail	Roche	11836153001

Critical commercial assays

GeneJET RNA Purification Kit	ThermoFisher Scientific	K0731
PerfectStart® Uni RT&qPCR Kit	Transgen Biotech	AUQ-01
PerfectStart® Green qPCR SuperMix	Transgen Biotech	AQ601-01
AMPure XP	Beckman Coulter	A63881

Deposited data

RNAPII ChIP-seq datasets of WT and LSD1 knockdown ESCs	This paper	GEO: GSE173089
Mouse ESCs MYC ChIP-seq	Chen et al. (2008)	GEO: GSE11431
Mouse ESCs NANOG ChIP-seq	Whyte et al. (2013)	GEO: GSE44286
Mouse ESCs Sox2 ChIP-seq	Whyte et al. (2013)	GEO: GSE44286
Mouse ESCs SPT5 ChIP-seq	Rahl et al. (2010)	GEO: GSE20530
Mouse ESCs NELF-A ChIP-seq	Rahl et al. (2010)	GEO: GSE20530
Mouse ESCs BRD4 ChIP-seq	Gatchalian et al. (2018)	GEO: GSE111264
Mouse ESCs LSD1 ChIP-seq	Whyte et al. (2012)	GEO: GSE27841
Mouse ESCs CDK9 ChIP-seq	Whyte et al. (2013)	GEO: GSE44286
Mouse ESCs TBP ChIP-seq	https://www.ncbi.nlm.nih.gov/geo/	GEO: GSE22303
Mouse ESCs CoREST ChIP-seq	Whyte et al. (2012)	GEO: GSE27841
Mouse ESCs HDAC1-2 ChIP-seq	Whyte et al. (2012)	GEO: GSE27841
Mouse ESCs H3K27ac ChIP-seq	Yang et al. (2019)	GEO: GSE117896
Mouse ESCs H3K4me3 ChIP-seq	Yang et al. (2019)	GEO: GSE117896
Mouse ESCs H3K27me3 ChIP-seq	Yang et al. (2019)	GEO: GSE117896
Mouse ESCs GRO-seq	Williams et al. (2015)	GEO: GSE43390
Mouse ESCs LSD1 KD microarray	Foster et al. (2010)	GEO: GSE21131
Mouse ESCs cMYC KD RNA-seq	Seruggia et al. (2019)	GEO: GSE113329
Human ESCs LSD1 KD microarray	Adamo et al. (2011)	GEO: GSE24844
Mouse ESCs 0h RNA-seq	Yang et al. (2019)	GEO: GSE117896

Experimental models: Cell lines

E14TG2a mouse embryonic stem cells	ATCC	CRL-1821

Oligonucleotides

shRNA-#1 targeting sequence for LSD1: GAGTTGAAAGAGCTTCTTAAT	This paper	N/A
shRNA-#2 targeting sequence for LSD1: CCACAAGTCAAACCTTTATTT	This paper	N/A
shRNA-#1 targeting sequence for c-MYC: ACTTCACCAACAGGAACTATG	This paper	N/A
shRNA-#2 targeting sequence for c-MYC: TGGAGATGATGACCGAGTTAC	This paper	N/A
Primers for RT-qPCR. See Table S1	This paper	N/A
Primers for ChIP-qPCR. See Table S2	This paper	N/A

Software and algorithms

PAD2	This paper	http://pad2.maths.usyd.edu.au
R version 4.1.0	R Development Core Team, 2016	https://www.R-project.org/
Bowtie2	Langmead and Salzberg, 2012	http://bowtie-bio.sourceforge.net/bowtie2/index.shtml
Samtools	Li et al. (2009)	http://samtools.sourceforge.net
deepTools 2.0	Ramírez et al. (2014)	https://deeptools.readthedocs.io/en/develop/

90 in total

1. NELF and DSIF cause promoter proximal pausing on the hsp70 promoter in Drosophila.

Authors: Chwen-Huey Wu; Yuki Yamaguchi; Lawrence R Benjamin; Maria Horvat-Gordon; Jodi Washinsky; Espen Enerly; Jan Larsson; Andrew Lambertsson; Hiroshi Handa; David Gilmour
Journal: Genes Dev Date: 2003-06-01 Impact factor: 11.361

Review 2. P-TEFb: Finding its ways to release promoter-proximally paused RNA polymerase II.

Authors: You Li; Min Liu; Lin-Feng Chen; Ruichuan Chen
Journal: Transcription Date: 2018-01-12

Review 3. RNA polymerase II pausing during development.

Authors: Bjoern Gaertner; Julia Zeitlinger
Journal: Development Date: 2014-03 Impact factor: 6.868

Review 4. Chromatin-state discovery and genome annotation with ChromHMM.

Authors: Jason Ernst; Manolis Kellis
Journal: Nat Protoc Date: 2017-11-09 Impact factor: 13.491

5. c-Myc is a universal amplifier of expressed genes in lymphocytes and embryonic stem cells.

Authors: Zuqin Nie; Gangqing Hu; Gang Wei; Kairong Cui; Arito Yamane; Wolfgang Resch; Ruoning Wang; Douglas R Green; Lino Tessarollo; Rafael Casellas; Keji Zhao; David Levens
Journal: Cell Date: 2012-09-28 Impact factor: 41.582

6. MEME: discovering and analyzing DNA and protein sequence motifs.

Authors: Timothy L Bailey; Nadya Williams; Chris Misleh; Wilfred W Li
Journal: Nucleic Acids Res Date: 2006-07-01 Impact factor: 16.971

7. Integrator complex regulates NELF-mediated RNA polymerase II pause/release and processivity at coding genes.

Authors: Bernd Stadelmayer; Gaël Micas; Adrien Gamot; Pascal Martin; Nathalie Malirat; Slavik Koval; Raoul Raffel; Bijan Sobhian; Dany Severac; Stéphanie Rialle; Hugues Parrinello; Olivier Cuvier; Monsef Benkirane
Journal: Nat Commun Date: 2014-11-20 Impact factor: 14.919

8. A non-canonical BRD9-containing BAF chromatin remodeling complex regulates naive pluripotency in mouse embryonic stem cells.

Authors: Jovylyn Gatchalian; Shivani Malik; Josephine Ho; Dong-Sung Lee; Timothy W R Kelso; Maxim N Shokhirev; Jesse R Dixon; Diana C Hargreaves
Journal: Nat Commun Date: 2018-12-03 Impact factor: 14.919

9. Promoter-specific dynamics of TATA-binding protein association with the human genome.

Authors: Yuko Hasegawa; Kevin Struhl
Journal: Genome Res Date: 2019-11-15 Impact factor: 9.043

10. Genome-wide identification of in vivo protein-DNA binding sites from ChIP-Seq data.

Authors: Raja Jothi; Suresh Cuddapah; Artem Barski; Kairong Cui; Keji Zhao
Journal: Nucleic Acids Res Date: 2008-08-06 Impact factor: 16.971