Anja S Knaupp1, Monika Mohenska1, Michael R Larcombe1, Ethan Ford2, Sue Mei Lim1, Kayla Wong1, Joseph Chen1, Jaber Firas1, Cheng Huang3, Xiaodong Liu1, Trung Nguyen2, Yu B Y Sun1, Melissa L Holmes1, Pratibha Tripathi4, Jahnvi Pflueger2, Fernando J Rossello1, Jan Schröder1, Kathryn C Davidson1, Christian M Nefzger1, Partha P Das4, Jody J Haigh5, Ryan Lister2, Ralf B Schittenhelm6, Jose M Polo7. 1. Department of Anatomy and Developmental Biology, Monash University, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, VIC 3800, Australia; Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia. 2. Australian Research Council Centre of Excellence in Plant Energy Biology, The University of Western Australia, Perth, WA 6009, Australia; Harry Perkins Institute of Medical Research, Nedlands, WA 6009, Australia. 3. Monash Proteomics and Metabolomics Facility, Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia. 4. Department of Anatomy and Developmental Biology, Monash University, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, VIC 3800, Australia. 5. Australian Centre for Blood Diseases, Monash University, Clayton, VIC 3004, Australia; Department of Pharmacology and Therapeutics, University of Manitoba, Winnipeg, MB, Canada; Research Institute in Oncology and Hematology, CancerCare Manitoba, Winnipeg, MB, Canada. 6. Monash Proteomics and Metabolomics Facility, Department of Biochemistry and Molecular Biology, Monash University, Clayton, VIC 3800, Australia. Electronic address: ralf.schittenhelm@monash.edu. 7. 
Department of Anatomy and Developmental Biology, Monash University, Clayton, VIC 3800, Australia; Development and Stem Cells Program, Monash Biomedicine Discovery Institute, Clayton, VIC 3800, Australia; Australian Regenerative Medicine Institute, Monash University, Clayton, VIC 3800, Australia. Electronic address: jose.polo@monash.edu.
Abstract
Cellular identity is ultimately dictated by the interaction of transcription factors with regulatory elements (REs) to control gene expression. Advances in epigenome profiling techniques have significantly increased our understanding of cell-specific utilization of REs. However, it remains difficult to dissect the majority of factors that interact with these REs due to the lack of appropriate techniques. Therefore, we developed TINC: TALE-mediated isolation of nuclear chromatin. Using this new method, we interrogated the protein complex formed at the Nanog promoter in embryonic stem cells (ESCs) and identified many known and previously unknown interactors, including RCOR2. Further interrogation of the role of RCOR2 in ESCs revealed its involvement in the repression of lineage genes and the fine-tuning of pluripotency genes. Consequently, using the Nanog promoter as a paradigm, we demonstrated the power of TINC to provide insight into the molecular makeup of specific transcriptional complexes at individual REs as well as into cellular identity control in general.
Pluripotent stem cells (PSCs) carry immense therapeutic potential as they can produce any cell type of the body and can self-renew indefinitely. The two most common in vitro PSCs are blastocyst-derived embryonic stem cells (ESCs) (Kaufman et al., 1983; Martin 1981) and induced PSCs (iPSCs), which are obtained from somatic cells through expression of the transcription factors (TFs) OCT4, SOX2, KLF4, and C-MYC (Takahashi et al., 2007; Takahashi and Yamanaka 2006). Together with OCT4 and SOX2, NANOG forms the core transcriptional network in ESCs/iPSCs that mediates expression of self-renewal and pluripotency genes and repression of differentiation genes (Loh et al., 2006; Boyer et al., 2005; Chen et al., 2008; Marson et al., 2008; Kim et al., 2008). ESCs express Nanog heterogeneously and can be maintained upon Nanog deletion, although they are differentiation prone (Chambers et al., 2007). These NANOG fluctuations seem to play a role in lineage commitment with high levels impeding ESC differentiation (Abranches et al., 2014; Chambers et al., 2003; Kalmar et al., 2009).
Like any gene, Nanog expression is regulated by TFs that interact with regulatory elements (REs) and other factors. At least two REs control Nanog in ESCs: the promoter and the −5 kb enhancer (Apostolou et al., 2013; Kagey et al., 2010). NANOG binds to both of these REs (Chen et al., 2008; Kim et al., 2008) and mediates positive (Wu et al., 2006) and negative (Fidalgo et al., 2012; Navarro et al., 2012) feedback loops. Positive transcriptional regulation has been associated with binding of OCT4 and SOX2 to the Nanog promoter (Rodda et al., 2005). Conversely, Nanog auto-repression has been linked to the interaction of NANOG with ZFP281 and the NuRD repressor complex (Fidalgo et al., 2012). 
Nevertheless, while some of the proteins that occupy these REs are known, the full complex composition remains largely elusive.
This knowledge gap is partly due to the difficulty in dissecting a specific regulatory complex. Chromatin immunoprecipitation (ChIP) has been invaluable for studying in vivo DNA-protein interactions but only allows interrogation of one factor at a time, and requires a priori candidates and appropriate antibodies. Thus, being able to simultaneously analyze an entire protein complex is extremely advantageous but is challenging because (1) a genomic region of low abundance (e.g., two copies per cell) has to be targeted specifically and (2) the interacting proteins have to be enriched efficiently (e.g., mass spectrometry [MS] analyzes proteins without amplification).
Several groups have set out to overcome these limitations by developing locus-specific isolation or proximity labeling methods. These include methods based on nucleic acid hybridization (Antão et al., 2012; Déjardin and Kingston 2009; Ide and Dejardin 2015; Kennedy-Darling et al., 2014) or DNA-binding proteins, such as LexA (Byrum et al., 2012; Fujita and Fujii 2011), TetR (Pourfarzad et al., 2013), Cas9 (Waldrip et al., 2014; Fujita and Fujii 2013; Gao et al., 2018; Schmidtmann et al., 2016; Tsui et al., 2018; Qiu et al., 2019; Liu et al., 2017), or TALEs (Byrum et al., 2013; Fang et al., 2018; Fujita et al., 2013). Although nucleic acid hybridization-based approaches pioneered the field, they require intensive optimization (Antão et al., 2012; Déjardin and Kingston 2009; Ide and Dejardin 2015), which might be why they have not yet been widely adopted or translated to single-copy elements in mammalian cells. To interrogate such challenging regions, different groups have adapted the CRISPR-Cas9 technology (Fujita and Fujii 2013; Qiu et al., 2019; X. Liu et al., 2020, 2017). 
Unlike TetR and LexA, CRISPR-Cas9 does not require insertion of binding sites into the target sequence and can be customized relatively easily. However, there are increasing reports of high-frequency off-targeting events by CRISPR-Cas9 (Fu et al., 2013; Duan et al., 2014; Kuscu et al., 2014; Lin et al., 2014; X. Wang et al., 2015; Liu et al., 2018), which in turn potentially skew the results of such single-locus studies. Importantly, TALEs are associated with significantly lower off-targeting than CRISPR-Cas9 (X. Wang et al., 2015). Furthermore, evidence suggests that TALEs can be utilized to enrich a specific locus from the complex mammalian genome (Fang et al., 2018).
Taking advantage of these characteristics, we developed a TALE-based method, termed TINC (TALE-mediated isolation of nuclear chromatin), that allows isolation of a specific genomic region from mammalian cells. Using TINC, we interrogated the regulatory complex formed at the Nanog promoter in mouse ESCs.
Results
Development of TINC to Identify the Nanog Regulatory Complex
To interrogate the protein complex formed at the Nanog promoter in ESCs we developed and applied TINC (Figure 1A). In brief, cells expressing a 3xHA-tagged TALE designed to bind to a region of interest are fixed and the nuclei and chromatin are isolated before sonication. The target region is then isolated using affinity purification of the TALE, and nucleic acids and proteins are further processed. To minimize selection of proteins enriched through TALE off-targeting, we performed TINC with two different TALEs targeting the Nanog promoter and considered only proteins enriched by both TALEs as genuine binders. To that end, four TALEs were designed (Figure S1A) and tested by ChIP-qPCR to identify the most efficient TALEs (Figure S1B). Consequently, TALE1 and TALE2 were selected (Figure 1B). Importantly, ESCs stably expressing either TALE showed unaltered Nanog expression (Figures 1C and S1C) and retained their pluripotency potential (Figure S1D).
Figure 1
TINC Allows Isolation of the Nanog Regulatory Complex from ESCs
(A) Schematic of the TINC method.
(B) Two TALEs were designed to bind upstream of the OCT4-SOX2 motif in the Nanog promoter. ATAC-seq shows that the promoter and the −5 kb Nanog enhancer are located in open chromatin, and ChIP-seq indicates that both regions are targeted by OCT4 and SOX2.
(C) qRT-PCR of Nanog transcript levels (fold Gapdh) (mean ± SD; n = 6 independent experiments).
(D) TINC-qPCR confirmed strong enrichment of the Nanog promoter over the genomic background (Sox2 regulatory region 2) by TALE1 and TALE2 (mean ± SD; n = 3 independent experiments, e.g., R1, R2, and R3).
(E) Overlap of proteins isolated by TALE1 and TALE2 upon subtraction of proteins enriched unspecifically from the empty vector controls (n = 3 independent experiments).
(F) Overlap of the intersection of proteins identified by TALE1 and TALE2 in three TINC runs shown in Figure 1E. The 455 proteins present in at least two of the three replicates (i.e., TINC proteins) contained many of the previously published Nanog promoter binders.
(G) ChIP-seq data processed by ChIP-Atlas (Oki et al., 2018) shown at the Nanog promoter. Proteins identified as Nanog promoter binders by ChIP-Atlas but not the original publication are marked with an asterisk. Below each BigWig track (black) are the respective peak calls by the ChIP-Atlas (red). The promoter (P) and the −5 kb enhancer (RE) of Nanog are indicated in orange.
See also Figure S1 and Tables S1 and S2.
Next, TINC was performed with both TALEs in three independent experiments. As negative control, empty vector-transfected ESCs were used. The DNA isolated in each TINC reaction was analyzed by qPCR, which indicated comparable Nanog promoter enrichment efficiencies by both TALEs (approximately 30-fold over background) (Figure 1D). To determine the genome-wide binding sites of TALE1 and TALE2, two replicates were sequenced. 
This revealed that TALE1 and TALE2 had 12 and 3 off-targets, respectively (Figures S1E and S1F; Table S1), which is considerably fewer than what had been shown for comparable CRISPR-Cas9 approaches (Liu et al., 2020, 2017). Importantly, this analysis also confirmed that the only site co-bound by both TALEs is the Nanog promoter (Figure S1E). Hence, the protein content of each sample was analyzed by MS. Upon subtraction of proteins also detected in the negative controls, we obtained several hundred proteins from each sample and an overlap of approximately 60% between proteins enriched by TALE1 and TALE2 (Figure 1E). Further comparative analyses of the proteins revealed that 241 were present in all three and an additional 214 in two of the three replicates (Figure 1F; Table S1). Importantly, these 455 proteins, hereafter referred to as TINC proteins, contained 30 published binders (Figure 1F; Tables S1 and S2). In addition, ChIP-Atlas, which processes published ChIP sequencing (ChIP-seq) data via its own pipeline and can hence identify binding events not detected in the original publication (Oki et al., 2018), confirmed Nanog promoter targeting by an additional 12 TINC proteins (Figure 1G; Tables S1 and S2). Notably, TINC did not simply result in the detection of highly abundant chromatin-associated proteins, but also in the specific enrichment of lowly abundant candidates, which further supports the validity and sensitivity of TINC (Figure S1G). Altogether, this demonstrates that TINC enables interrogation of specific regulatory complexes and has the power to uncover hundreds of interacting proteins.
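The selection logic described above (subtract control proteins, keep proteins recovered by both TALEs, and require detection in at least two of three replicates) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `tinc_candidates`, the per-replicate data layout, and the toy protein identifiers are assumptions made purely for illustration, and the sketch assumes control subtraction is applied within each replicate.

```python
from collections import Counter

def tinc_candidates(replicates, min_replicates=2):
    """Return proteins enriched by both TALEs in at least `min_replicates` runs.

    `replicates` is a list of dicts, one per run, each holding sets of protein
    identifiers under the (hypothetical) keys 'tale1', 'tale2' and 'control'.
    """
    counts = Counter()
    for rep in replicates:
        # Remove proteins also seen in the empty-vector control, then keep
        # only those recovered by both TALEs in this replicate.
        tale1_specific = rep["tale1"] - rep["control"]
        tale2_specific = rep["tale2"] - rep["control"]
        counts.update(tale1_specific & tale2_specific)
    # Keep proteins shared by both TALEs in enough replicates.
    return {protein for protein, n in counts.items() if n >= min_replicates}

# Toy input with placeholder identifiers (not real data).
toy_runs = [
    {"tale1": {"A", "B", "C", "X"}, "tale2": {"A", "B", "D", "X"}, "control": {"X"}},
    {"tale1": {"A", "B", "E"}, "tale2": {"A", "C", "E"}, "control": set()},
    {"tale1": {"A", "B", "C"}, "tale2": {"A", "B", "C"}, "control": {"C"}},
]
print(sorted(tinc_candidates(toy_runs)))
```

In the toy input, protein A passes in all three runs and B in two, so both are retained, whereas E (one run) and X and C (present in a control) are dropped. Raising `min_replicates` to 3 would correspond to the stricter subset of proteins found in every replicate.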
Analysis of the Nanog Regulatory Complex Reveals Proteins with Different Functions
Enrichment analyses of the TINC proteins showed overrepresentation of nucleic acid and chromatin binders, transcriptional regulators, and protein-modifying enzymes with roles in cell cycle, DNA replication and repair, histone modification, and stem cell regulation (Figures 2A and 2B). In accordance, protein network analysis identified protein clusters related to many of these processes (Figure 2C; Table S3). This shows that, while the cell cultures were not synchronized and therefore proteins associated with various stages of the cell cycle were detected, TINC identified a large number of proteins related to transcriptional regulation with a role in stem cell identity as expected from Nanog.
Figure 2
Characterization of the Proteins Identified at the Nanog Promoter in ESCs
(A) Protein class categories for all 455 TINC proteins. A total of 197 proteins were not assigned.
(B) Clustered GO biological processes and KEGG pathways enriched in the TINC protein dataset. Enrichment is the −log10(geometric mean) of p-values of clustered terms.
(C) Protein-protein interaction network analysis and MCL clustering of the TINC proteins. GO biological processes and KEGG pathways enriched in the top ten clusters are displayed.
(D) Enriched processes and pathways for TFs identified by TINC. Significance cutoff q = 0.05 (Benjamini-Hochberg correction) is shown (x axis).
(E) Overlap of the TINC proteins with the pluripotency regulatory network (Fisher's exact test, p < 2.2 × 10^−16). Proteins identified in all three TINC replicates are depicted in red and proteins identified in two out of the three TINC replicates are shown in orange. The size of the node (protein) indicates the number of proteins interacting with that specific protein.
(F) KD of 37 TINC proteins resulted in a significant change in Nanog expression in a genome-wide RNAi screen (Gingold et al., 2014).
See also Figure S2 and Table S3.
The TINC proteins included 56 TFs and 101 epigenetic modifiers (Figure S2A; Table S1). While the epigenetic modifiers were mainly associated with histone modifications (Figure S2B), the TFs showed enrichment for various processes related to transcriptional regulation and maintenance of pluripotency (Figure 2D). Intersection of the TINC proteins with the previously defined pluripotency regulatory network (Nefzger et al., 2017; Xu et al. 2013, 2014) revealed that 69 key network components form part of the Nanog regulatory complex (Figures 2E and S2C). 
We also identified ESRRB, SALL4, and NACC1 in our TALE pull-downs; however, these key regulators were also present in the negative controls (albeit with a lower number of peptide spectrum matches) and were therefore excluded from further analyses.
To determine the transcriptional output of the TINC proteins, we utilized a published RNAi screen conducted with a Nanog-GFP reporter ESC line (Gingold et al., 2014). Interestingly, while we identified 381 of the authors' small interfering RNA targets at the Nanog promoter, knockdown (KD) of only 23 resulted in a significant decrease in Nanog expression, indicative of a role of these proteins in maintenance of pluripotency (Figures 2F, S2D, and S2E). Conversely, KD of 14 TINC proteins resulted in a significant increase in Nanog expression, suggesting that these proteins act as repressors of Nanog and might play a role in silencing of this gene during differentiation (Figures 2F, S2D, and S2E). Together, these results show that various transcriptional regulators, including activators and repressors, form part of the Nanog regulatory complex in ESCs.
Different Requirements for Nanog Regulatory Complex Members in Pluripotency Acquisition and Maintenance
Cellular reprogramming is driven by reconfiguration of the epigenome, which leads to silencing of the somatic network and activation of the pluripotency program (Chronis et al., 2017; Knaupp et al., 2017; Maherali et al., 2007; Polo et al., 2012). To determine which TINC proteins are specific to pluripotency and hence upregulated during reprogramming of mouse embryonic fibroblasts (MEFs) into iPSCs, we performed differential gene expression (DGE) analysis. DGE analysis revealed that 61% of the TINC proteins are expressed significantly higher in iPSCs (versus 8% in MEFs) and that many of the known Nanog binders might not be pluripotency specific (Figure 3A).
Figure 3
Interrogation of Novel Nanog Promoter Interactors Identified by TINC
(A) DGE analysis of TINC protein expression in iPSCs versus MEFs. Proteins selected for follow-up experiments are shown in red. Significance cutoff false discovery rate (FDR) of 0.05 is shown (y axis).
(B) ChIP-qPCR analysis for Nanog promoter enrichment by 3xHA-tagged factors stably expressed in ESCs (mean ± SD; n = 3 independent experiments). ∗∗p ≤ 0.01 versus ESC control.
(C) qRT-PCR analysis of Nanog expression upon shRNA-mediated KD of selected factors and normalized to the levels of housekeeping gene Ywhaz (mean ± SD; n = 3 independent experiments). ∗p ≤ 0.05 and ∗∗p ≤ 0.01 versus non-targeting (NT) shRNA control.
(D) Heatmap showing standardized TINC protein expression during MEF reprogramming (grouped by protein categories). Black bars indicate that the proteins are a part of the pluripotency network from Figure 2E and blue bars whether they were identified in all three or in two out of the three TINC replicates.
(E) Distribution of Pearson's correlation coefficients of TINC protein expression in relation to the expression of Nanog during MEF reprogramming.
(F) Standardized gene expression dynamics of the TINC proteins during MEF reprogramming. Targets selected for further investigation are highlighted and their reprogramming correlation coefficients to Nanog are shown in brackets.
(G) MEF reprogramming efficiency upon shRNA-mediated KD of selected targets (mean ± SD; n = 4 technical replicates, representative of three independent experiments).
See also Figure S3.
Next, we selected proteins with a large change (e.g., RCOR2 and DNMT3L), an intermediate change (e.g., WDR76 and ZFP57), and no change (e.g., ATAXIN10 and QRICH1) in DGE for further investigation. To overcome the lack of ChIP antibodies, these six proteins as well as the positive controls MYBL2 (Zhan et al., 2012) and SOX2 were 3xHA tagged. 
This allowed us to confirm Nanog promoter binding by the majority of these candidates (Figure 3B), further validating TINC as well as endorsing ChIP-Atlas, which had identified ZFP57 as a Nanog binder (Figure 1G) while the original publication did not (Strogantsev et al., 2015). To determine whether these proteins directly control Nanog expression, we used small hairpin RNAs (shRNAs) for which we obtained approximately 75% target KD (Figures S3A and S3B). Depletion of several of these factors resulted in a change in ESC morphology, which was most pronounced for Nanog and Rcor2 (Figure S3C). As shown previously, KD of Nanog led to a considerable amount of cell death (Chen et al., 2012), while loss of Rcor2 triggered the cells to grow in a monolayer instead of dome-shaped colonies (Figure S3C). Despite the perceived morphological changes, only KD of Mybl2 and Qrich1 resulted in a significant change in Nanog expression (Figure 3C).
Analysis of TINC protein expression during MEF reprogramming revealed that many are transiently or permanently upregulated and that these proteins are mainly associated with chromatin organization and transcriptional regulation (Figures 3D, S3D, and S3E). Furthermore, 73% of the TINC proteins, including the majority of our candidates, are upregulated at the end of reprogramming and are strongly correlated with Nanog expression (Figures 3E, 3F, and S3D). Interestingly, while KD of Mybl2, Dnmt3l, Qrich1, Wdr76, and Zfp57 resulted in a significant decrease in reprogramming efficiency, KD of Rcor2 led to a significant increase (Figure 3G). Overall, these results suggest that many of the proteins identified at the Nanog promoter in the pluripotent state also play a role in establishing this state during reprogramming.
Since it has previously been proposed that Rcor2 positively regulates MEF reprogramming (Yang et al., 2011), we set out to further explore its role during this conversion. 
Using a different reprogramming system (OKMS instead of OKSM) and two alternative Rcor2 shRNAs individually and combined, we observed the same trend (Figures S3F and S3G). Flow cytometric analysis during reprogramming revealed that the total number of cells en route to becoming iPSCs (SSEA1+) approximately doubled due to an overall increase in cell numbers rather than an increase in the percentage of SSEA1+ cells (Figures S3H–S3J). Conversely, overexpression of Rcor2 led to a decrease in iPSC formation, although non-significant, which indicates that RCOR2 may impair pluripotency acquisition (Figure S3K).
RCOR2 Is a Component of the Pluripotency Regulatory Network
To gain further insight into the targets and partners of RCOR2, we first performed ChIP-seq, which revealed extensive binding to intergenic and intronic regions in ESCs (Figure S4A; Table S4). Approximately 13% of the binding sites of RCOR2 are located in promoter regions, including the promoters of Nanog, Oct4, and Sox2 (Figures 4A and S4A). Indeed, the majority of RCOR2 target genes are expressed (e.g., 8,514) and associated with various cell division- and pluripotency-related functions, while low or unexpressed targets (e.g., 3,030) are mainly associated with neuronal processes (Figure S4B). Intersection of the genome-wide binding sites and target genes of RCOR2 revealed extensive co-occupancy of RCOR2 with NANOG, OCT4, or SOX2 targets (Figures 4B and S4C; Table S4). A hypergeometric test confirmed that RCOR2 co-occupancy with OCT4 (p = 0), SOX2 (p = 1.01664 × 10^−310), or NANOG (p = 3.96871 × 10^−246) targets is highly significant. Notably, genes occupied by all four factors have the highest median expression, include many key pluripotency regulators, and are enriched for various developmental processes (Figures 4B, 4C, and S4D). In agreement, motif enrichment analysis showed several pluripotency TF motifs, including OCT4, SOX2, and NANOG in ESC enhancers (Figure 4D). In line with the observed binding of RCOR2 to neuronal genes (Figure S4B), we obtained motif enrichment for REST, a repressor associated with silencing of neuronal genes in non-neuronal cells (Chong et al., 1995; Schoenherr and Anderson 1995).
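To make the hypergeometric test mentioned above concrete, the following minimal sketch computes the upper-tail probability of seeing at least a given overlap between two target-gene sets drawn from a common gene universe. The text does not state which software, gene universe, or set sizes were used, so the function name `hypergeom_upper_tail`, its parameters, and the toy numbers are illustrative assumptions rather than the authors' analysis.

```python
from math import comb

def hypergeom_upper_tail(overlap, universe, set_a, set_b):
    """P(X >= overlap) for two gene sets of sizes set_a and set_b.

    Models drawing `set_b` genes without replacement from `universe` genes,
    of which `set_a` are targets of the first factor; X counts shared targets.
    """
    upper = min(set_a, set_b)
    total = comb(universe, set_b)
    # Sum the exact probability mass from the observed overlap upward.
    tail = sum(
        comb(set_a, k) * comb(universe - set_a, set_b - k)
        for k in range(overlap, upper + 1)
    )
    # Integer arithmetic stays exact until this final division to a float.
    return tail / total

# Toy numbers only: 10 genes, 4 targets of factor A, 3 of factor B, 2 shared.
print(hypergeom_upper_tail(overlap=2, universe=10, set_a=4, set_b=3))
```

For very large overlaps in a genome-scale universe, the result can fall below the smallest representable floating-point number and be reported as 0, which is one plausible reading of a reported p-value of exactly 0.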
Figure 4
RCOR2 Is Part of the Pluripotency Regulatory Network in ESCs
(A) Tracks depicting RCOR2 ChIP-seq at Nanog, Oct4, and Sox2. Promoters (P) and known regulatory elements (REs), including the Oct4 distal (DE) and proximal (PE) enhancers are indicated in orange. ATAC-seq shows regions of open chromatin.
(B) Intersection of NANOG, SOX2, OCT4, and RCOR2 target genes in ESCs. Expressed targets (red), repressed targets (blue), and the expression distributions are shown.
(C) Proteins of the pluripotency regulatory network are colored indicative of whether they are targets of NANOG (N), SOX2 (S), OCT4 (O), and/or RCOR2 (R). The size of the node (protein) indicates the number of proteins interacting with that specific protein.
(D) Motif enrichment analysis of RCOR2 ChIP-seq peaks. Where a point is present, a significant enrichment for the motif (x axis) was found at all sites, pluripotency enhancers, or at other sites (non-pluripotency enhancers) occupied by RCOR2 (y axis). Point size represents the proportion of sequences featuring the motif and color gradient the enrichment significance.
(E) Volcano plot showing the enrichment of proteins copurified by RCOR2 in comparison with the negative control (n = 3 technical replicates). Lines indicate the threshold above which proteins are significantly enriched (FDR < 0.05 and log2FC > 1). TINC proteins are shown in red and NuRD complex components and interactors are indicated.
(F) DGE analysis of TINC protein expression in iPSCs versus MEFs. Proteins also detected in RCOR2 CoIPs are colored in turquoise. Significance cutoff FDR = 0.05 is shown (y axis).
See also Figure S4 and Table S4.
Next, to determine interactors of RCOR2, we conducted co-immunoprecipitations (CoIPs), which identified 368 proteins (Figures 4E and 4F; Table S4). 
Among those were 79 transcriptional regulators predominantly associated with histone modifications, including LSD1, which had previously been found to interact with RCOR2 in pluripotent and neural cells (Yang et al., 2011; Wang et al., 2016) (Figure S4E; Table S4). Upon close examination of the TINC results, we noticed that LSD1 had also been excluded due to traces in the negative controls. LSD1 forms part of various complexes, including CoREST/REST and NuRD (Andrés et al., 1999; Foster et al., 2010; Mosammaparast and Shi 2010). Notably, while RCOR2 CoIP resulted in enrichment of various NuRD complex components, CoREST and REST were not detected (Figure 4E; Table S4). Furthermore, integration of published ChIP-seq data confirmed extensive co-binding of RCOR2 and LSD1 within ESCs (Figures S4F and S4G; Table S4). Interestingly, while some of these sites are also bound by REST and CoREST or, to a larger extent, by NuRD complex components, many sites are exclusively targeted by LSD1 and RCOR2 (Figures S4F and S4G). Notably, ESC enhancers did not show major occupancy by CoREST or REST but extensive targeting by LSD1, RCOR2, and NuRD members (Figure S4H). Inhibition of LSD1 has previously been shown to promote iPSC formation (Sun et al., 2016). Importantly, concomitant inhibition of LSD1 alleviated the inhibitory effect of RCOR2 overexpression, suggesting that RCOR2 exerts its function at least partly via LSD1 during reprogramming (Figure S3K). Together, our results revealed that RCOR2 binds extensively in ESCs and forms part of various regulatory complexes, including LSD1, NuRD, and CoREST/REST complexes.
RCOR2 Is Required for Efficient ESC Differentiation
To further investigate the role of RCOR2, we created a CRISPR-Cas9 knockout (KO) ESC line (Figure 5A). Rcor2 depletion was confirmed at the DNA, RNA, and protein levels (Figures S5A–S5C). Notably, Rcor2 KO resulted in morphological changes similar to those observed upon shRNA-mediated KD (Figures S3C and 5A). Subsequent RNA sequencing of the Rcor2 KO line revealed transcriptional deregulation of 1,977 genes, approximately 74% of which are RCOR2 targets (Figure 5B; Table S4). Interestingly, many RCOR2 targets that showed transcriptional deregulation upon Rcor2 KO were related to cell cycle and cell division (Figure S5D). In agreement, growth rate analyses revealed a significant decrease in ESC and MEF doubling time upon Rcor2 depletion, suggesting that RCOR2 is a negative regulator of cell-cycle progression (Figure S5E). Furthermore, the RNA sequencing analysis revealed that RCOR2 is involved in transcriptional regulation of various pluripotency network components, including repression of Oct4 and Nanog (Figures 5C and S5F).
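For readers unfamiliar with doubling time as a growth readout, the following Python sketch shows one common way to derive it from two cell counts. It assumes exponential growth between the two time points; the function name and the counts are hypothetical, and the text does not state how doubling times were calculated in this study.

```python
import math


def doubling_time(n_start, n_end, hours):
    """Doubling time in hours, assuming exponential growth between two cell counts."""
    if n_start <= 0 or n_end <= n_start or hours <= 0:
        raise ValueError("requires 0 < n_start < n_end and hours > 0")
    return hours * math.log(2) / math.log(n_end / n_start)


# Hypothetical counts: an 8-fold expansion in 48 h corresponds to three doublings.
td = doubling_time(1e5, 8e5, 48.0)
print(round(td, 2))
```

Under this model, a shorter doubling time at the same starting density means more cells at any later time point, which is the sense in which a decrease in doubling time reflects faster proliferation.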
Figure 5
Rcor2 KO ESCs Have a Differentiation Impairment
(A) Representative bright-field images of WT and Rcor2 KO ESCs. Scale bar, 250 μm.
(B) DGE analysis of Rcor2 KO versus WT ESCs separated into genes occupied by RCOR2 and genes that are not RCOR2 targets (y axis). Red indicates genes that show an increase and blue indicates genes that show a decrease in expression upon Rcor2 KO (n = 2 independent experiments).
(C) Proteins of the pluripotency regulatory network colored according to their change in expression upon Rcor2 KO.
(D) Representative bright-field images of WT and Rcor2 KO EBs on days 4 and 7 of culture. Scale bar, 250 μm.
(E) EB sizes on day 4 (D4) and day 7 (D7) of culture as measured by diameter (mean ± SD; n ≥ 8 EBs, representative of two independent experiments).
(F) Frequency of contractile cardiac colonies obtained in an EB cardiac differentiation assay (mean ± SD; n = 3 technical replicates).
(G) qRT-PCR analysis to examine pluripotency marker expression in Rcor2 KO EBs. Transcript levels were normalized to the levels of the housekeeping gene Ywhaz and then to corresponding WT EBs (mean ± SD; n = 3 technical replicates, representative of 2 independent experiments).
See also Figure S5, Table S4, and Video S1.
Next, to assess the possible role of RCOR2 during differentiation, we subjected the KO and wild-type (WT) ESC lines to an embryoid body (EB) formation assay. Rcor2 KO EBs showed a considerable amount of cell death and were significantly smaller than WT EBs (Figures 5D and 5E). Furthermore, differentiation into cardiac cells revealed that WT EBs generated beating colonies, while Rcor2 KO EBs did not (Figure 5F; Video S1). Importantly, qRT-PCR revealed persistent expression of pluripotency genes in the Rcor2 KO EBs (Figure 5G). In agreement with the EB assay, teratoma assays showed limited growth for Rcor2 KO ESCs in vivo. 
Although Rcor2 KO ESCs were eventually able to form teratomas containing cell types from all three germ layers, these teratomas took an additional 5 days to reach a palpable size (Figures S5G and S5H). Together, these data suggest that RCOR2 plays a role in downregulating pluripotency genes during differentiation.
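The two-step qRT-PCR normalization described for the EB analysis (first to the housekeeping gene Ywhaz, then to the corresponding WT EBs) can be sketched in Python with the widely used 2^-ΔΔCt calculation. That this exact formula was applied is an assumption, as is the roughly 100% primer efficiency it implies; the function name and Ct values below are hypothetical.

```python
def relative_expression(ct_target_ko, ct_ref_ko, ct_target_wt, ct_ref_wt):
    """Fold change of a target transcript in KO relative to WT via 2^-ddCt.

    Step 1: normalize each sample to the reference (housekeeping) gene.
    Step 2: normalize the KO value to the WT value.
    """
    d_ct_ko = ct_target_ko - ct_ref_ko  # delta Ct in KO EBs
    d_ct_wt = ct_target_wt - ct_ref_wt  # delta Ct in WT EBs
    return 2.0 ** -(d_ct_ko - d_ct_wt)


# Hypothetical Ct values: the target crosses threshold two cycles earlier in KO
# at equal reference-gene Ct, which corresponds to a 4-fold higher level.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)
```

A value above 1 indicates higher transcript levels in KO than in WT EBs, which is how persistent pluripotency marker expression would appear in such an analysis.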
Discussion
We developed TINC, which allows isolation of a specific genomic locus, and applied it to interrogate the protein complex formed at the Nanog promoter in ESCs. To reduce the number of false positives, two TALEs were utilized and only proteins enriched by both were considered true binders. Although our experimental setup allowed us to confirm the reproducibility of TINC, it prevented the use of label-free quantitative software packages (such as MaxQuant [Cox and Mann 2008]). In the future, such quantitative approaches could be used to decrease the number of false negatives (e.g., despite higher peptide spectrum matches in the TALE samples, LSD1, ESRRB, SALL4, and NACC1 were excluded due to trace levels in the negative controls).
TINC allowed us to identify 455 proteins at the Nanog promoter in ESCs (i.e., TINC proteins). While only a fraction of these proteins were known binders, many had previously been associated with a role in pluripotency. For example, TINC revealed direct Nanog promoter targeting by JMJD1A, RCOR2, and ZFP57, all of which had been linked to a change in Nanog expression upon KD (Loh et al., 2007; Riso et al., 2016; Yang et al., 2011). Many of the TINC proteins are upregulated during reprogramming, and KD of several of them impaired iPSC formation, suggesting that they play a role in establishing the pluripotent state. Conversely, KD of Rcor2 resulted in a significant increase in colony numbers. Further investigation revealed that Rcor2 depletion leads to increased Nanog levels, a decrease in MEF/ESC doubling time, and increased cell numbers, all of which facilitate reprogramming. Furthermore, our data suggest that RCOR2 mediates its function at least partially through an interaction with LSD1, inhibition of which also promotes cellular reprogramming (Sun et al., 2016). 
We validated our results in different reprogramming systems; however, since previous work identified RCOR2 as a positive regulator of cellular reprogramming (Yang et al., 2011), its function might depend on its expression level and/or the reprogramming system. Further studies are needed to resolve these differing observations.
Interestingly, RCOR2 seems to be non-essential for ESC maintenance despite occupying many expressed genes. Indeed, approximately 70% of its targets are transcriptionally active and are associated with maintenance of pluripotency, while inactive RCOR2 targets are enriched for various neuron-related functions. Similarly, REST has been implicated in the repression of neuronal genes in non-neuronal cells (Chong et al., 1995; Schoenherr and Anderson 1995). Importantly, our data revealed that RCOR2 forms part of the REST complex in ESCs and occupies approximately 40% of REST binding sites. While many of these regions are also targeted by LSD1, they account for less than 5% of RCOR2 or LSD1 binding sites. Notably, RCOR2 and LSD1 have a similar number of targets (approximately 12,000 genes), over 70% of which are shared. This suggests that, while RCOR2 forms part of the REST complex in ESCs, it also interacts with alternative LSD1 complexes, including NuRD. Previous work has shown that LSD1 is significantly more likely to occupy pluripotency enhancers with NuRD complex components than with the CoREST/REST complex (Whyte et al., 2018), and our data revealed that RCOR2 does so too. This suggests that RCOR2 may be involved in the fine-tuning of pluripotency genes and the repression of lineage-specific genes in ESCs as part of various LSD1 complexes.
Similar to RCOR2, LSD1 does not play a major role in ESC maintenance; however, LSD1 inhibition leads to incomplete repression of many ESC genes during differentiation (Whyte et al., 2018). 
Similarly, Rcor2 KO ESCs formed EBs inefficiently, and these EBs showed prolonged expression of pluripotency genes as well as poor differentiation potential in vitro and in vivo. This further supports a functional interaction of LSD1 and RCOR2 during pluripotency exit.
In summary, TINC allowed us to interrogate the regulatory complex formed at the Nanog promoter in an unbiased manner and revealed that transcriptional regulation of this TF occurs at many different levels. Furthermore, our data suggest that many factors that aid in downregulating Nanog during differentiation, such as RCOR2, already reside at its promoter in the pluripotent state. Together, this implies a highly complex and coordinated interplay of multiple factors that ensures the correct NANOG levels so that ESCs can self-renew and exert their full differentiation potential, rapidly and on cue.
Experimental Procedures
Generation of TALE-Expressing ESC Lines
The Nanog targeting TALEs were created as described (Briggs et al., 2012) and stable ESC lines were generated as described in the Supplemental Information.
TINC
Approximately 1 × 10⁹ cells were fixed with formaldehyde, and nuclei and chromatin were isolated as described previously (Knaupp et al., 2017; Kustatscher et al., 2014). The chromatin was then sonicated using a Bioruptor NextGen device (Diagenode) and the TALEs immunoprecipitated using Anti-HA Agarose Resin (Pierce 26182) as described in the Supplemental Information. Nanog promoter enrichment was confirmed by qPCR before liquid chromatography-tandem MS (LC-MS/MS) analysis.
LC-MS/MS Analysis of Proteins Enriched by TINC
To determine the proteins isolated by TINC, in-gel tryptic digests were analyzed using an Orbitrap Fusion Tribrid mass spectrometer (Thermo Fisher Scientific) as described in the Supplemental Information.
All other experimental procedures are detailed in the Supplemental Information.
Data and Code Availability
The accession numbers for the data reported in this paper are GEO: GSE160816 and ProteomeXchange: PXD022088 (Perez-Riverol et al., 2019).
Author Contributions
J.M.P. conceptualized the study. A.S.K. and J.M.P. conceived and designed the experiments. R.B.S. and A.S.K. designed the proteomics approach. A.S.K. and M.R.L. performed the experiments with support from S.M.L., K.W., J.C., J.F., X.L., Y.B.Y.S., M.L.H., J.P., P.T., and K.C.D. M.M. performed the bioinformatic analyses with assistance from F.J.R. and J.S. under supervision of J.M.P. TALEs were created by E.F. with support from T.N. and J.J.H. LC-MS/MS analyses were performed by R.B.S. with support from C.H. R.L., P.P.D., C.M.N., and K.C.D. helped with experimental design and manuscript editing. A.S.K., M.M., M.R.L., R.B.S., and J.M.P. wrote the manuscript and all authors approved of the final manuscript.
Conflicts of Interest
J.M.P. is a founder and member of the SAB of Mogrify; however, the work presented in this manuscript is not related to this company. All other authors have no conflict of interest or financial interest to declare.