A WIZ/Cohesin/CTCF Complex Anchors DNA Loops to Define Gene Expression and Cell Identity.

Megan Justice, Zachary M Carico, Holden C Stefan, Jill M Dowen.

Abstract

Chromosome structure is a key regulator of gene expression. CTCF and cohesin play critical roles in structuring chromosomes by mediating physical interactions between distant genomic sites. The resulting DNA loops often contain genes and their cis-regulatory elements. Despite the importance of DNA loops in maintaining proper transcriptional regulation and cell identity, there is limited understanding of the molecular mechanisms that regulate their dynamics and function. We report a previously unrecognized role for WIZ (widely interspaced zinc finger-containing protein) in DNA loop architecture and regulation of gene expression. WIZ forms a complex with cohesin and CTCF that occupies enhancers, promoters, insulators, and anchors of DNA loops. Aberrant WIZ function alters cohesin occupancy and increases the number of DNA loop structures in the genome. WIZ is required for proper gene expression and transcriptional insulation. Our results uncover an unexpected role for WIZ in DNA loop architecture, transcriptional control, and maintenance of cell identity.
Copyright © 2020 The Author(s). Published by Elsevier Inc. All rights reserved.

Keywords:  CTCF; DNA loop; WIZ; cellular identity; cohesin; gene expression; genome organization; stem cell; transcription

Year:  2020        PMID: 32294452      PMCID: PMC7212317          DOI: 10.1016/j.celrep.2020.03.067

Source DB:  PubMed          Journal:  Cell Rep            Impact factor:   9.423


INTRODUCTION

Eukaryotic genomes are organized into DNA loops that play important roles in gene expression control. DNA binding proteins and transcriptional cofactors can facilitate interactions between enhancers and promoters, or spatially constrain such interactions, to ensure proper transcriptional regulation of genes. Despite the importance of DNA looping to genome structure and function, the molecular mechanisms that control dynamic DNA loops and gene expression are poorly understood. Two structural regulators of DNA loops are cohesin and CTCF. CTCF is a zinc finger-containing protein that binds to a specific DNA sequence motif that is often found at the anchors of DNA loops (Rowley and Corces, 2018). Cohesin is a ring-shaped structural maintenance of chromosomes (SMC) protein complex that is thought to facilitate formation or stabilization of a DNA loop. The DNA loops occupied by cohesin and CTCF have been termed topologically associating domains (TADs), loop domains, CTCF contact domains, and insulated neighborhoods (Dixon et al., 2012; Dowen et al., 2014; Gibcus and Dekker, 2013; Gorkin et al., 2014; Merkenschlager and Nora, 2016; Nora et al., 2012). These DNA loops can influence the targeting of enhancers to specific genes at a locus and prevent enhancers from acting on other nearby genes (Dowen et al., 2014). DNA loop anchor sites have also been described as insulator elements for their ability to block other potential DNA loops and serve as barriers to the spread of a chromatin state (Bonev and Cavalli, 2016; Eagen, 2018). Recently, several other proteins have been reported to occupy DNA loop anchors, including YY1, BRD2, TOP2B, and ZNF143 (Hsu et al., 2017; Uusküla-Reimand et al., 2016; Weintraub et al., 2017; Wen et al., 2018). How these proteins regulate the formation and/or dissolution of DNA loops, and how such activities affect gene expression, remain unclear.
Recently, WIZ (widely interspaced zinc finger-containing protein) was reported to localize to CTCF binding sites and therefore represents a candidate structural regulator of long-range DNA interactions (Isbel et al., 2016). WIZ is a zinc finger-containing protein that occupies promoters and CTCF binding sites in the mouse adult cerebellum (Isbel et al., 2016). Previous studies implicate WIZ in heterochromatin formation via the recruitment and stabilization of G9a and GLP histone methyltransferases to DNA, thereby directing the deposition of H3K9me1, H3K9me2, and H3K27me1 at specific sites in the genome (Bian et al., 2015; Mozzetta et al., 2014; Simon et al., 2015; Ueda et al., 2006; Wu et al., 2011). These histone modifications are associated with HP1 binding and Polycomb-mediated transcriptional repression of genes (Bannister et al., 2001; Lachner et al., 2001; Mozzetta et al., 2014). Like several other DNA loop structuring factors, the homozygous loss of Wiz results in embryonic lethality (Daxinger et al., 2013; Isbel et al., 2016). Heterozygous loss of Wiz results in improper expression of protocadherin genes in the brain and causes decreased activity and increased anxiety-like behavior in mice (Isbel et al., 2016). Here, we investigate the role of WIZ at CTCF binding sites in the genome and report a role for WIZ distinct from that with G9a and GLP in heterochromatin formation. We identify a function for WIZ in DNA loop architecture, regulation of gene expression, and maintenance of stem cell identity.

RESULTS

WIZ Binds CTCF Sites across the Mammalian Genome

To investigate the chromosomal localization of WIZ relative to other proteins that contribute to long-range DNA interactions, we performed chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) in mouse embryonic stem cells (mESCs) for WIZ, CTCF, and the cohesin subunit SMC1A (Table S1). We found that WIZ occupies 44,018 sites in the genome, including many sites occupied by CTCF and SMC1A (Figures 1A, 1B, and S1A). WIZ and CTCF signals were highly correlated across the genome and also within peaks (Figure 1C). WIZ binding was enriched at cis-regulatory elements, including enhancers, promoters, CTCF sites, cohesin-mediated DNA loop anchors, and super-enhancers (Figures 1D and S1B). Like CTCF and cohesin, WIZ was enriched at the boundaries of insulated neighborhood structures throughout the genome. WIZ binding sites were enriched for the CTCF consensus sequence motif, which was the top motif represented in WIZ peaks (Figure 1E; Table S2). Motif discovery on (1) WIZ peaks that overlap CTCF peaks and (2) WIZ peaks that do not overlap CTCF peaks revealed mostly overlapping results, with enriched motifs for CTCF, ZIC1, and ZIC4 identified in both lists (Figure S1C). This suggests that WIZ is not recruited, in a DNA sequence-specific manner, to a class of sites independent of CTCF. Importantly, our CTCF ChIP-seq data are similar to other published datasets in terms of both peak number and overlap (Figure S1D) (Nora et al., 2017). We confirmed the specificity of the WIZ antibody using a Myc-tagged version of human WIZ, which is highly conserved with mouse WIZ (Figure S1E). Patterns of CTCF binding across the genome are strikingly consistent among different cell types (Cuddapah et al., 2009; de Wit et al., 2015; Dixon et al., 2012; Kim et al., 2007; Nora et al., 2012).
Similarly, WIZ shows a moderate overlap in binding between mESCs and adult mouse cerebellum, and 17,313 of 21,817 conserved WIZ peaks (79%) across these tissues overlap with conserved CTCF peaks in these tissues (Figure S1F) (Isbel et al., 2016; Shen et al., 2012). Additionally, WIZ and CTCF transcript levels are generally correlated across many different human cell types, consistent with a possible widespread role for WIZ with CTCF (Figure S1G). Together, these data show that WIZ occupies many sites across the genome with cohesin and CTCF, including the anchors of DNA loops and insulated neighborhoods.
Figure 1.

WIZ Occupies Enhancers, Promoters, Insulators, and DNA Loop Anchors across the Embryonic Stem Cell Genome

(A) Genome Browser tracks showing ChIP-seq signal for WIZ, CTCF, and SMC1A. High-confidence SMC1A ChIA-PET interactions are depicted as black lines (Dowen et al., 2014).

(B) Average signal plots and clustered heatmaps displaying WIZ, CTCF, and SMC1A ChIP-seq signal (Z score normalized) at WIZ peaks.

(C) Correlation of WIZ and CTCF ChIP-seq signal (Z score normalized) at a union set of peaks (Pearson correlation r = 0.93).

(D) Average signal plots showing the occupancy of WIZ, CTCF, and SMC1A at enhancers, promoters, CTCF sites, DNA loop anchors from cohesin ChIA-PET, and insulated neighborhoods.

(E) MEME-ChIP motif discovery identifies the CTCF consensus motif as the top motif present within WIZ peaks.

See also Figure S1 and Table S2. See STAR Methods for detailed description of genomics analyses. Datasets used in this figure are listed in Table S1.
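
The correlation reported in panel (C) can be illustrated with a minimal sketch, assuming per-peak signal values for the two factors have already been extracted at a shared (union) peak set. The function names and the toy signal values below are hypothetical and are not taken from the study; the actual analysis is described in STAR Methods.

```python
import math


def zscore(values):
    """Z score normalize a list: subtract the mean, divide by the population SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("constant signal cannot be Z score normalized")
    return [(v - mean) / sd for v in values]


def pearson_r(x, y):
    """Pearson correlation, computed as the mean product of paired Z scores."""
    if len(x) != len(y):
        raise ValueError("signals must be measured at the same set of peaks")
    zx, zy = zscore(x), zscore(y)
    return sum(a * b for a, b in zip(zx, zy)) / len(x)


# Hypothetical per-peak signal at a union set of five peaks (illustration only).
wiz_signal = [1.0, 2.0, 3.0, 4.0, 5.0]
ctcf_signal = [2.1, 3.9, 6.2, 8.0, 9.8]
r = pearson_r(wiz_signal, ctcf_signal)
```

Because Pearson correlation is invariant to shifting and scaling, Z score normalization does not change r; it only places the two signals on a comparable scale for plotting.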

WIZ Interacts with CTCF and the Cohesin Complex

To determine if WIZ forms a complex with CTCF and cohesin, we performed co-immunoprecipitations (coIPs) followed by western blots. Pull-downs using antibodies targeting either WIZ or CTCF co-purified CTCF and WIZ, respectively, suggesting that WIZ and CTCF are in a complex with each other (Figure 2A). Additionally, SMC1A was also co-purified using either WIZ or CTCF antibodies. These interactions appear to be independent of DNA and RNA, as nuclear extracts for the coIPs were prepared in the presence of a nuclease. To investigate co-occupancy of CTCF and WIZ on chromatin, a sequential ChIP experiment (re-ChIP) was performed in which a CTCF or control IgG antibody was used in a first ChIP reaction (Figure S2A). From the CTCF ChIP eluate, a second ChIP experiment was performed using CTCF, WIZ, IgG, or no antibody as a control. Both CTCF and WIZ antibodies showed enrichment in the second ChIP, demonstrating that CTCF and WIZ co-occupy chromatin sites. Together these results suggest that WIZ physically interacts, either directly or indirectly, with both CTCF and cohesin.
Figure 2.

WIZ Forms a Complex with CTCF and Cohesin

(A) Western blot analysis showing co-immunoprecipitation of WIZ, CTCF, and SMC1A, as well as IgG controls from nuclear lysates.

(B) Genome Browser tracks showing CTCF and RAD21 occupancy in wild-type and Wiz cells at an ectopic RAD21 peak. WIZ occupancy in wild-type cells is shown.

(C) Genome Browser tracks showing CTCF and RAD21 occupancy in wild-type and Wiz cells at a differential RAD21 site. WIZ occupancy in wild-type cells is shown.

(D) Overlap of RAD21 peaks in wild-type and Wiz cells. For shared RAD21 peaks and ectopic RAD21 peaks, the overlap with functional elements in the genome is shown (CTCF sites, enhancers, promoters, other).

(E) Average signal plots and heatmaps of RAD21 signal in wild-type and Wiz cells at 25,549 ectopic RAD21 peaks in Wiz cells.

(F) MA plots showing differential enrichment of RAD21 and CTCF between wild-type and Wiz cells. Sites of significantly differential enrichment are shown in green.

See also Figure S2 and Table S2. See STAR Methods for detailed description of genomics analyses. Datasets used in this figure are listed in Table S1.

We next considered whether WIZ directly binds DNA at CTCF sites. WIZ and CTCF both contain multiple C2H2-type zinc finger motifs, but the two proteins share minimal amino acid sequence identity. WIZ has 6 zinc fingers that are widely spaced compared with those of other proteins, including CTCF. CTCF has 11 zinc fingers, of which zinc fingers 4–7 bind the core DNA consensus motif (Nakahashi et al., 2013). Importantly, the amino acids within zinc fingers 4–7 of CTCF that are responsible for the recognition of the CTCF DNA sequence motif are not conserved in WIZ. Because CTCF-occupied sites are known to have low nucleosome density (Nora et al., 2017), we examined nucleosome density at sites co-occupied by WIZ and CTCF and compared them with WIZ-occupied sites that are not CTCF peaks. WIZ/CTCF peaks showed decreased nucleosome occupancy, while WIZ peaks not overlapping CTCF peaks tend to be of low-amplitude signal and do not show well-positioned nucleosomes (Figure S2B) (Mullen et al., 2011). Taken together with the coIP experiments, our data suggest that WIZ exists in a complex with CTCF and cohesin at sites at which CTCF specifically binds to its consensus motif in DNA. To investigate the function of WIZ in cohesin and CTCF occupancy on the genome, we generated Wiz cells using CRISPR/Cas9 genome editing (Figures S2C and S2D). These cells have a large in-frame deletion that removes 67% of the coding sequence of the Wiz gene (including zinc fingers 2–5), likely resulting in a null allele. Western blot analysis confirmed that an epitope within the deleted region is not detected in Wiz cells. Wiz cells may have a slight reduction in CTCF levels. Likewise, we observed reduced WIZ levels following small interfering RNA (siRNA) depletion of CTCF in wild-type (WT) cells, possibly indicating that WIZ protein stability is sensitive to CTCF levels. Additionally, GAPDH siRNA treatment in Wiz cells may have reduced CTCF levels compared with WT cells. 
Importantly, the levels of cohesin subunits SMC1A, RAD21, and SMC3 are largely unaltered in Wiz cells (Figure S2E). We performed ChIP-seq for CTCF and the cohesin subunit RAD21 in WT and Wiz cells (Figures 2B and 2C). Notably, the core cohesin complex members SMC1A, SMC3, and RAD21 are frequently used interchangeably as proxies for cohesin and their signals are correlated. We used a spike-in of human chromatin during the ChIPs in order to quantitatively measure relative levels of enrichment. Wiz cells showed increased cohesin signal, which could be detected in two distinct analyses. First, ChIP-seq for RAD21 revealed a striking increase in the number of cohesin peaks in Wiz cells (Figure 2D). Although most of the RAD21 peaks in WT cells were preserved in Wiz cells (84%), there was also a large increase in ectopic RAD21 peaks in Wiz cells (Figure 2E). The 25,549 ectopic RAD21 peaks represented 48% of the total RAD21 peaks detected in Wiz cells. The ectopic RAD21 peaks rarely overlapped CTCF sites (1,896 of 25,549), which is a strikingly different pattern from the shared RAD21 peaks that preferentially occur at CTCF sites. Motif discovery performed on the 25,549 ectopic RAD21 peaks identified the CTCF motif, as a subset of ectopic cohesin sites does overlap with CTCF, but did not reveal a striking relationship with other DNA-binding factors (Figure S2F). Importantly, the ectopic RAD21 peaks in Wiz cells did not strongly overlap SMC1A peaks in WT cells. Specifically, 13,729 of the 25,549 ectopic RAD21 peaks did not overlap SMC1A peaks in WT cells, supporting the notion that these are de novo peaks (Figure S2G). Importantly, the RAD21 ChIP-seq datasets showed similar IP efficiencies in WT and Wiz cells (Figure S2H). A second, distinct analysis, which identifies differential ChIP-seq signal at a largely conserved peak set, revealed increased cohesin signal at cohesin-occupied sites in Wiz cells.
We observed 23,056 sites of differential cohesin enrichment in Wiz cells, with the majority (97.3%) of these sites showing stronger RAD21 signal in Wiz cells compared with WT (Figure 2F). Although the analyses used to identify ectopic RAD21 peaks and sites with differential RAD21 signal are distinct, they likely measure different aspects of the same phenomenon of altered cohesin occupancy across the genome. CTCF binding was largely unchanged in Wiz cells, as 26,498 of 29,082 peaks (91%) were preserved (Figure S2I). Quantitative analysis revealed that only 1,969 sites (7%) exhibited differential CTCF enrichment between WT and Wiz cells, of which most represented increased signal within weak CTCF peaks. The majority of sites with differential CTCF signal also showed differential cohesin signal; however, there was a large class of sites that also displayed only differential cohesin signal (Figure S2J). These data demonstrate that Wiz cells gain a large number of ectopic cohesin peaks across the genome at sites that are rarely CTCF occupied. Additionally, Wiz cells show increased cohesin enrichment at many sites normally occupied by cohesin and CTCF.
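
The distinction between shared and ectopic peaks drawn above can be sketched as a simple interval-overlap classification. This is a minimal illustration only: the one-base-pair overlap criterion, the function names, and the toy coordinates are assumptions, not the criteria used in the study (see STAR Methods for the actual analysis).

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least 1 bp."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_shared_ectopic(mutant_peaks, wt_peaks):
    """Split mutant-cell peaks into those overlapping any WT peak (shared)
    and those overlapping none (ectopic). Naive O(n*m) scan for clarity."""
    shared, ectopic = [], []
    for peak in mutant_peaks:
        if any(overlaps(peak, w) for w in wt_peaks):
            shared.append(peak)
        else:
            ectopic.append(peak)
    return shared, ectopic


# Hypothetical toy peaks (illustration only).
wt_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
mutant_peaks = [("chr1", 150, 250), ("chr1", 1000, 1100), ("chr2", 100, 200)]
shared, ectopic = split_shared_ectopic(mutant_peaks, wt_peaks)

# Fraction of ectopic RAD21 peaks overlapping CTCF sites, from the counts in the text.
ctcf_fraction = 1896 / 25549  # about 7.4%
```

For genome-scale peak sets, a sorted sweep or an interval tree would replace the nested scan; the classification logic stays the same.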

WIZ Is Required for Proper Gene Expression and Maintenance of Cell Identity

Proteins involved in structuring DNA loops are required for proper regulation of gene expression, as their loss can cause mis-targeting of enhancers to inappropriate genes and alter expression of cell identity genes (Dowen et al., 2014; Sun et al., 2019). Previous studies have shown that depletion of cohesin or CTCF leads to misregulation of genes, with fewer transcriptional changes seen following acute and partial depletions, and complete or long-term loss causing hundreds of misexpressed genes (Dowen et al., 2013; Ing-Simmons et al., 2015; Kagey et al., 2010; Nora et al., 2017; Rao et al., 2017; Seitan et al., 2013; Sofueva et al., 2013; Viny et al., 2015; Zuin et al., 2014). In order to examine the role of WIZ in gene regulation, we performed RNA sequencing (RNA-seq) in Wiz cells. Overall, 3,683 genes were differentially expressed in Wiz cells, with 1,519 genes downregulated and 2,164 genes upregulated compared with WT (10% false discovery rate [FDR]; Figure 3A; Table S3). Gene Ontology analysis revealed that some of the biological processes most affected in Wiz cells include system development, anatomic structure morphogenesis, and regulation of cell differentiation (Figure 3B; Table S4). Similarly, gene set enrichment analysis (GSEA) revealed that differentially expressed genes (DEGs) are enriched in the "regulation of embryonic development" gene set (Figure S3A; Table S4). Among the DEGs, the stem cell identity genes Nanog and Pou5f1(Oct4) were downregulated in Wiz cells, whereas endodermal transcription factors Sox17 and Gata6 were among the most upregulated (Figure 3C). These results suggest a widespread role for WIZ in proper transcriptional regulation and maintenance of embryonic stem cell identity.
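
As an illustration of how a 10% FDR threshold partitions genes into upregulated and downregulated sets, the following minimal sketch applies the Benjamini-Hochberg adjustment to per-gene p values. The text specifies only the FDR threshold; the choice of Benjamini-Hochberg, the function names, and the toy gene records are assumptions made for this example.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify_degs(genes, fdr=0.10):
    """genes: list of (name, log2_fold_change, p_value) tuples.
    Returns (upregulated, downregulated) gene names at the given FDR."""
    padj = benjamini_hochberg([g[2] for g in genes])
    up = [g[0] for g, q in zip(genes, padj) if q < fdr and g[1] > 0]
    down = [g[0] for g, q in zip(genes, padj) if q < fdr and g[1] < 0]
    return up, down


# Hypothetical records (illustration only; not values from Table S3).
toy_genes = [
    ("geneA", 2.0, 0.001),
    ("geneB", -1.5, 0.004),
    ("geneC", 0.3, 0.60),
    ("geneD", -0.2, 0.90),
]
up, down = classify_degs(toy_genes)

# The counts reported in the text are internally consistent.
total_degs = 1519 + 2164
```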
Figure 3.

WIZ Is Required for Proper Gene Expression

(A) Changes in gene expression from RNA-seq in Wiz cells versus wild-type cells. Genes with significant changes in expression (false discovery rate [FDR]-adjusted p < 0.1) are colored with upregulated genes shown in red and downregulated genes in blue.

(B) Gene Ontology analysis identifies misregulated biological processes in Wiz cells that are involved in stem cell identity and differentiation.

(C) Differentially expressed genes ranked by log2 fold change with key pluripotency and cell identity markers indicated.

(D) Model depicting a Super-enhancer Domain where transcriptional insulation may occur.

(E) Change in gene expression at ten Super-enhancer Domains from RNA-seq in Wiz cells versus wild-type cells.

(F) Change in gene expression of DEGs located inside Super-enhancer Domains and within 150 kb of a Super-enhancer Domain.

(G) Average signal plots showing RAD21 and CTCF signal in wild-type and Wiz cells at Super-enhancer Domains.

See also Figure S3, Table S3, and Table S4. See STAR Methods for detailed description of genomics analyses. Datasets used in this figure are listed in Table S1.

Given that WIZ co-occupies the genome with CTCF and cohesin, we next evaluated genes whose expression is controlled by genome architecture (Dowen et al., 2014). Insulated neighborhoods are DNA loops formed by cohesin and CTCF. Some insulated neighborhoods focus the activity of strong enhancers on highly expressed target genes inside of loops and prevent enhancers from accessing genes outside of loops (Dowen et al., 2014). Importantly, genes within insulated neighborhoods often encode master transcription factors and other regulators of cell identity. Therefore, to assess whether WIZ supports transcriptional insulation at DNA loops, we examined expression of genes within and outside of insulated neighborhoods previously identified from cohesin ChIA-PET data (Dowen et al., 2014). We focused on insulated neighborhoods that contain super-enhancers, termed super-enhancer domains (SDs), and their highly expressed target genes (Figure 3D) (Dowen et al., 2014). At ten example SDs, we observed decreased expression of super-enhancer target genes and increased expression of genes outside of the DNA loop in Wiz cells relative to WT cells (Figure 3E). At eight of these ten SDs, WIZ peaks overlapped the anchors of the SD or fell within 1 kb of the anchors of the SD. Moreover, 111 DEGs in Wiz cells lie within SDs and tend to decrease in expression compared with all DEGs (Figures 3F and S3B). Furthermore, 178 DEGs that are located outside of SDs tend to show increased expression, consistent with inappropriate enhancer targeting. RAD21 signal was increased both at the boundaries and inside of SDs, but CTCF signal was largely unchanged (Figure 3G). Importantly, the function of WIZ in supporting transcriptional insulation appears distinct from its previously reported role as a G9a cofactor in heterochromatin formation. 
Although more than 8,500 genes are differentially expressed in G9a−/− ESCs (Mozzetta et al., 2014), about 3,500 genes are differentially expressed in Wiz cells (Figure S3C). Only 2,252 genes are differentially expressed in both G9a−/− ESCs and Wiz cells, consistent with the notion that WIZ and G9a do not genocopy each other with regard to their roles in regulating gene expression. Furthermore, although Wiz cells showed a signature change in gene expression inside and immediately outside of SDs, G9a−/− cells did not show this pattern (Figure S3D). Together, these results suggest that WIZ is required for proper transcriptional insulation and control of embryonic stem cell gene expression programs.

WIZ Is a Regulator of DNA Loops

To directly investigate whether WIZ is required for DNA loop architecture, we performed Hi-C in WT and Wiz cells. We generated two biological replicates for WT and Wiz cells, totaling 783 million reads and 835 million reads, respectively (Table S5). Overall, WT and Wiz cells displayed similar patterns of DNA interactions, as measured by the distances between PETs in the datasets (Figure S4A). To assess whether specific features of genome organization were altered in Wiz cells, we examined DNA loops, contact domains, and genome compartmentalization using established analysis methods. DNA loops, distinct contacts between pairs of specific distal loci, were identified using HiCCUPS (Rao et al., 2014). DNA loops were largely intact in Wiz cells; however, local changes in DNA loops were detected, such as at the Dazl locus (Figures 4A and 4B). This locus also displayed differential cohesin and CTCF occupancy, as well as differential expression of the Dazl and Tbc1d5 genes between WT and Wiz cells (Figure S4B). Overall, there was an increase in total DNA loop number in Wiz cells (4,119) versus WT cells (3,094) (Figure 4C). By comparing the DNA loops detected in WT cells and Wiz cells, we identified 2,321 persistent loops, in which a DNA loop used identical anchor sites in both cell lines. Differential loops were also detected, in which one or both anchors were altered in one of the cell lines. There were 773 DNA loops specific to WT cells and 1,798 DNA loops specific to Wiz cells. The DNA loops in Wiz cells were smaller than those in WT cells, with the mean loop size decreased from ~606 kb in WT cells to ~521 kb (Figures 4D and S4C). We also assessed the strength of loops using aggregate peak analysis (APA) (Rao et al., 2014), which revealed that persistent DNA loops display stronger APA scores in Wiz cells than WT cells (Figure S4D). 
As expected, WT cells displayed stronger APA scores than Wiz cells at WT-specific DNA loops, and Wiz cells displayed stronger APA scores than WT cells at Wiz-specific DNA loops.
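
The classification of loops into persistent and cell-line-specific sets described above can be sketched with set operations, treating a loop as persistent only when both anchors are identical in the two cell lines. The loop representation (chromosome plus binned start coordinates of the two anchors), the function names, and the toy loops are assumptions made for illustration; HiCCUPS output contains additional fields.

```python
def classify_loops(wt_loops, mutant_loops):
    """Partition loops by exact anchor identity between two cell lines."""
    wt, mutant = set(wt_loops), set(mutant_loops)
    return {
        "persistent": wt & mutant,
        "wt_specific": wt - mutant,
        "mutant_specific": mutant - wt,
    }


def mean_loop_size(loops):
    """Mean distance in bp between the two anchor start coordinates."""
    sizes = [anchor2 - anchor1 for _chrom, anchor1, anchor2 in loops]
    return sum(sizes) / len(sizes)


# Hypothetical loops as (chrom, anchor1_start, anchor2_start), illustration only.
wt_loops = [("chr1", 100_000, 700_000), ("chr2", 50_000, 650_000)]
mutant_loops = [("chr1", 100_000, 700_000), ("chr2", 50_000, 400_000),
                ("chr3", 10_000, 300_000)]
classes = classify_loops(wt_loops, mutant_loops)

# The loop counts reported in the text add up as expected.
wt_total = 2321 + 773        # persistent + WT-specific
mutant_total = 2321 + 1798   # persistent + Wiz-specific
```

Under this definition a loop that keeps one anchor but shifts the other is counted once as WT-specific and once as mutant-specific, which matches the description of differential loops as having one or both anchors altered.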
Figure 4.

WIZ Is Important for DNA Loop Architecture of the Genome

(A) Genome Browser tracks showing DNA loops, cohesin (RAD21) occupancy, and CTCF occupancy in wild-type cells and Wiz cells. DNA loops were identified using HiCCUPS. WIZ occupancy in wild-type cells is shown.

(B) Hi-C maps showing signal in wild-type and Wiz cells (left) at the Dazl locus at 5 kb resolution. Differential signal between wild-type and Wiz cells shown on the right.

(C) Venn diagram showing both persistent and differential DNA loops between wild-type and Wiz cells.

(D) Size distribution of DNA loops in wild-type and Wiz cells. To compare the means of the distributions, a Wilcoxon rank-sum test was performed; **** represents an adjusted p value of 4e-18.

(E) Proportion of shared RAD21 peaks, ectopic RAD21 peaks and differential RAD21 sites that overlap the anchor of a DNA loop detected in wild-type or Wiz cells.

(F) Compartmentalization of the genome into A and B compartments on the basis of eigenvector (EV) score, represented as wild-type versus Wiz cells. Eigenvector track of chromosome 1 is shown for each replicate.

See also Figure S4 and Table S5. See STAR Methods for detailed description of genomics analyses. Datasets used in this figure are listed in Table S1.

The increased number of DNA loops detected in Wiz cells is consistent with the increased cohesin occupancy observed by differential RAD21 ChIP-seq signal and the presence of ectopic peaks. Notably, 26.5% of shared RAD21 sites and a similar 23.8% of differential RAD21 sites (showing mostly increased signal in Wiz cells) are located in DNA loop anchors (Figure 4E). Only 9.6% of ectopic RAD21 peaks overlap a DNA loop anchor. This suggests that although there are many new ectopic cohesin sites across the genome of Wiz cells, they are less frequently engaged in a DNA loop than shared cohesin sites. To investigate the relationship between differential DNA loops and differential gene expression, we identified classes of DNA loops across the genome and examined the expression of genes inside. The classes included persistent loops, differential loops, loops that are both differential and persistent (because of the nesting of multiple DNA loops), or no loops. Overall, the proportion of DEGs was unchanged across these four classes of DNA loops (Figure S4E). However, there was a difference in the APA score of all DNA loops detected in each cell line, with the loops detected in Wiz cells showing stronger insulation scores than the DNA loops detected in WT cells (Figure S4F). Additionally, DNA loops containing DEGs showed stronger insulation scores in Wiz cells than the DNA loops containing DEGs in WT cells (Figure S4F). Furthermore, at seven of the ten example SDs (Figure 3E), either the gene inside or outside was located within 25 kb of a WT-specific DNA loop anchor or Wiz-specific DNA loop anchor. Finally, we assessed changes in contact domains, defined as discrete regions of increased chromatin interactions above background, using the Arrowhead algorithm (Rao et al., 2014). 
Although the number and size of contact domains were not significantly altered between WT and Wiz cells, the overlap of contact domains revealed the presence of WT-specific and Wiz-specific structures (Figures S4G and S4H). Compartmentalization of the genome into A (active) and B (inactive) compartments was investigated using principal-component analysis (Heinz et al., 2010; Lieberman-Aiden et al., 2009) and revealed minimal instances of compartment switching between WT and Wiz cells (Figure 4F). Taken together, these results suggest that loss of WIZ causes changes in DNA loops by altering cohesin occupancy at specific sites across the genome.
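
Compartment switching between two samples can be illustrated by comparing the sign of the eigenvector (EV) score bin by bin. This is a minimal sketch under stated assumptions: both EV tracks are assumed to cover the same genomic bins and to be oriented so that positive values correspond to the A compartment; the function names and toy values are hypothetical.

```python
def call_compartment(ev):
    """Assign 'A' for positive EV, 'B' for negative EV, None for exactly zero."""
    if ev > 0:
        return "A"
    if ev < 0:
        return "B"
    return None


def compartment_switches(ev_wt, ev_mutant):
    """Return (bin_index, wt_call, mutant_call) for bins whose call differs.
    Bins with an undefined call in either sample are skipped."""
    if len(ev_wt) != len(ev_mutant):
        raise ValueError("EV tracks must cover the same bins")
    switches = []
    for i, (a, b) in enumerate(zip(ev_wt, ev_mutant)):
        call_a, call_b = call_compartment(a), call_compartment(b)
        if call_a and call_b and call_a != call_b:
            switches.append((i, call_a, call_b))
    return switches


# Hypothetical EV scores for four bins (illustration only).
ev_wt = [0.5, -0.2, 0.1, 0.0]
ev_mutant = [0.4, 0.3, -0.1, 0.2]
switched = compartment_switches(ev_wt, ev_mutant)
```

In practice, bins with EV scores near zero are often treated with caution, since small fluctuations can flip the sign without reflecting a genuine change in compartment.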

DISCUSSION

Here, we demonstrate that WIZ is required for embryonic stem cell identity gene expression programs and represents a DNA loop structuring protein. WIZ forms a complex with CTCF and cohesin at the anchors of DNA loops across the mammalian genome. Aberrant WIZ function causes many changes in gene expression, including at DNA loops important for regulating stem cell identity genes. Like other proteins involved in structuring DNA loops, WIZ is essential for embryonic viability and is ubiquitously expressed across many cell and tissue types (Isbel et al., 2016; Uhlén et al., 2015). Although WIZ has previously been implicated in heterochromatin formation, this work reveals a distinct role for WIZ in transcriptional regulation and DNA loop architecture. WIZ forms a complex with CTCF and cohesin at many sites across the genome, including CTCF binding sites, enhancers, promoters, DNA loop anchors, and insulated neighborhoods (Dowen et al., 2014). Recruitment of WIZ to these sites is likely mediated by interaction with CTCF and not direct binding to DNA, as coIP experiments revealed a physical interaction between CTCF and WIZ that is not dependent on DNA or RNA. Additionally, although both WIZ and CTCF contain zinc finger motifs, WIZ lacks the key residues that mediate CTCF binding to its consensus DNA sequence motif (Nakahashi et al., 2013). Specific aspects of how CTCF and WIZ interact remain to be investigated, including identifying the domains involved and determining whether the interaction is direct. WIZ regulates both cohesin distribution on chromatin and DNA loop architecture. Wiz cells display >20,000 ectopic cohesin peaks that tend to not overlap CTCF sites, enhancers, or promoters of genes. The appearance of a large number of ectopic cohesin peaks in Wiz cells suggests that WIZ normally acts by negatively regulating cohesin occupancy on the genome. The ectopic cohesin peaks are less likely to engage in a DNA loop than shared cohesin peaks. 
These findings suggest that aberrant cohesin localization alone is not sufficient for formation of DNA loops and that the gained cohesin occupancy and DNA loops in Wiz cells are not simply a consequence of altered gene expression programs, as they tend not to overlap promoters. Aberrant WIZ function caused an overall increase in DNA loop number and decrease in DNA loop size. Both persistent DNA loops and DNA loops specific to Wiz cells were stronger than those in WT cells. Although DNA loops were altered in Wiz cells, major changes in contact domains and compartmentalization of the genome into A and B compartments were not observed. This is consistent with evidence that contact domains and compartments are largely a product of transcriptional and chromatin state, while DNA loops are a product of the activity of cohesin and CTCF (Rao et al., 2017). We did not detect a significant global relationship between DEGs and either persistent or differential DNA loops in our analysis. Taken together, these results suggest that WIZ normally restricts cohesin levels and distribution across the genome limiting the number of DNA loops. Wiz cells display gene expression signatures consistent with loss of pluripotency. In mESCs, the genes responsible for maintenance of stem cell identity and pluripotency are housed within DNA loops (Dowen et al., 2014; Sun et al., 2019). Wiz cells showed decreased expression of several of these genes, including Nanog, Pou5f1(Oct4), and Prdm14. This is consistent with previous studies which found that DNA loop structuring proteins and complexes, such as cohesin, are required for maintenance of stem cell identity (Hu et al., 2009; Kagey et al., 2010). Furthermore, genes that direct changes in cell identity during differentiation, such as Gata6 and Sox17, exist in insulated neighborhoods and show increased expression in Wiz cells. 
Although our study cannot distinguish between direct and indirect transcriptional effects of WIZ deletion, the altered expression of cell identity genes in Wiz cells likely contributes to broad transcriptional changes, affecting biological processes such as cellular differentiation, morphogenesis, and development. Thus, we conclude that WIZ is required for maintenance of embryonic stem cell identity, potentially through its regulation of DNA loop architecture. Importantly, the phenotype of Wiz mESCs cannot be fully explained by loss of G9a/GLP-mediated heterochromatin formation. Previous work showed that a double knockout of G9a and GLP did not alter expression of Pou5f1(Oct4), Prdm14, and Gata6 (Mozzetta et al., 2014), which were identified as DEGs in Wiz cells. If WIZ solely functions in heterochromatin formation, then Wiz cells should largely genocopy loss of G9a and GLP, but they do not. Instead, we propose that WIZ can regulate gene expression through its role in mediating genome architecture. Several recent reports have identified candidate DNA loop structuring factors that associate with cohesin and CTCF, including BRD2, ZNF143, YY1, and TOP2A/2B (Hsu et al., 2017; Uusküla-Reimand et al., 2016; Weintraub et al., 2017; Wen et al., 2018). The molecular mechanisms by which these proteins, and WIZ, regulate DNA loop architecture remain unclear. Notably, loss of the cohesin unloading factor WAPL has been shown to increase the number of DNA loops, similar to Wiz cells, but WAPL-deficient cells show larger DNA loops, while Wiz cells display smaller loops than WT cells (Haarhuis et al., 2017). It is unclear how WIZ might support the lengthening of DNA loops, while also suppressing their number. As Wiz cells did not display an overall change in cohesin subunit protein levels, it is possible that WIZ regulates the ratio of DNA-associated versus free cohesin in the nucleus. 
Alternatively, WIZ could regulate the translocation of cohesin along DNA and/or cohesin stability at CTCF sites. Further studies are needed to elucidate the precise role of WIZ, along with other structural regulators, in DNA loop architecture.

In conclusion, WIZ is required for proper gene expression and maintenance of stem cell identity. WIZ co-occupies the genome with the DNA loop structuring proteins cohesin and CTCF. Aberrant WIZ function causes an increase in cohesin and, to a lesser extent, CTCF occupancy across the genome. This is associated with an increase in the number of DNA loops, which tend to be smaller than those found in wild-type cells. This work identifies WIZ as a structural regulator of DNA loop architecture that is important for proper transcriptional regulation of cell identity genes.

STAR★METHODS

LEAD CONTACT AND MATERIALS AVAILABILITY

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, Jill Dowen (jilldowen@unc.edu). All unique/stable cell reagents generated in this study are available from the Lead Contact with a completed Materials Transfer Agreement.

EXPERIMENTAL MODEL AND SUBJECT DETAILS

Cell lines

V6.5 murine embryonic stem cells were a gift from R. Young of the Whitehead Institute for Biomedical Research. V6.5 are male cells derived from a C57BL/6(F) x 129/sv(M) cross. HEK293T (female human embryonic kidney) cells were a gift from R. Young of the Whitehead Institute for Biomedical Research.

Cell culture

Naive V6.5 murine embryonic stem cells (mESCs) were grown on irradiated murine embryonic fibroblasts in standard serum + LIF conditions, as previously described (Dowen et al., 2014). Briefly, KnockOut DMEM (Thermo Fisher Scientific, 10829-018) was supplemented with 15% fetal bovine serum (VWR, 97068-085). Cell counts were obtained on a Countess II FL Automated Cell Counter (Invitrogen). HEK293T cells were cultured in DMEM (GIBCO, 11995065) supplemented with 10% cosmic calf serum (Thermo Fisher Scientific, SH3008703), 1x GlutaMAX (Thermo Fisher Scientific, 35050-061), 100 U/ml penicillin, and 100 µg/ml streptomycin (Thermo Fisher Scientific, 15140-122), and were passaged similarly to mESCs.

Genome editing

mESCs were transfected with plasmids containing a sgRNA, Cas9 and a fluorescent gene (eGFP or mCherry) using Lipofectamine 2000 (Thermo Fisher Scientific, 11-668-027). Two days later single cells were sorted by UNC Flow Cytometry Core Facility staff using a FACSAria II (BD Biosciences). 104 cells were collected, expanded, screened by PCR and DNA sequencing, and cryogenically stored. Individual allele sequences were determined by PCR of the region surrounding the mutated site, followed by TOPO-TA cloning (Thermo Fisher Scientific, K4575J10) and Sanger sequencing. sgRNA sequences are provided below and were designed using the CRISPR tool (http://zlab.bio/guide-design-resources) (Cong and Zhang, 2015). The Wiz deletion allele (referred to in this text as Wiz) contains a homozygous deletion from exon 3 to exon 7. The official allele name according to the International Committee on Standardized Genetic Nomenclature for Mice is Wiz.

sgRNA 1: 5′-CCATGCCCTTCCCGCCTACC-3′
sgRNA 2: 5′-TCCCTGGTTGGCCGAAGTGC-3′

Wiz murine embryonic stem cell line sequence at the site of genome editing:
CCACTGTCAGCTGCCCTTCCAGCTTCAAGCCCTGGTTCTCCCCCAG-deletion-GCGGGAAGGGCATGGTGAGAGGTAAGCGTGGCCGTCTTGAAGTCAGGAAGCTTCTGGGTGGCCCCCCTGGCTCCCGGCCCAGGGGACTTGGGGGCAACTCCGCTGCTGAGGTGGCCAGCAGCTCTTGCAGGATGTTGATGGGTGAGATGGTGAGTTCCCAGTTGGTGATGCCAAAGTCACGAAGGTGGGCCCGGGCATGACTGGAGAGGCCAGCCCGAGTATCAAAAC
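
As a quick sanity check on the sequences above, the following minimal sketch computes the reverse complement of sgRNA 1 and asks how much of it matches the sequence immediately downstream of the deletion junction. The helper names (`reverse_complement`, `longest_suffix_at_start`) are illustrative assumptions and are not part of the study's pipeline; only the first 22 nt of the downstream flank are used.

```python
# Illustrative check only; helper names are hypothetical, not from the study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_suffix_at_start(query: str, target: str) -> str:
    """Return the longest suffix of `query` that `target` starts with ('' if none)."""
    for start in range(len(query)):
        if target.startswith(query[start:]):
            return query[start:]
    return ""


SGRNA_1 = "CCATGCCCTTCCCGCCTACC"
# First 22 nt following the "-deletion-" junction in the allele sequence above.
DOWNSTREAM_FLANK = "GCGGGAAGGGCATGGTGAGAGG"

rc = reverse_complement(SGRNA_1)
retained = longest_suffix_at_start(rc, DOWNSTREAM_FLANK)
print(rc, retained, len(retained))
```

In this sketch the last 15 nt of the reverse-complemented guide coincide with the start of the downstream flank, which would be consistent with sgRNA 1 targeting the strand opposite to the one shown. This is a string-level observation only and says nothing further about the editing outcome.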

METHOD DETAILS

Chromatin immunoprecipitation

Chromatin immunoprecipitation (ChIP) was performed using the following antibodies: WIZ (Novus Biologicals, NBP1-80586), CTCF (Active Motif, 61311), SMC1 (Bethyl Laboratories, A300-055A), RAD21 (Bethyl Laboratories, A300-080A). For WIZ ChIP-seq replicate 1 in wild-type cells, mESCs were crosslinked in 1% formaldehyde (Sigma Aldrich, F1635) for 20 minutes, then quenched with 125 mM glycine. Cells were lysed first with Lysis Buffer 1 (50 mM HEPES-KOH pH 7.5, 140 mM NaCl, 1 mM EDTA, 10% glycerol, 0.5% NP-40, and 0.25% Triton X-100) by incubating cells in the buffer for 10 minutes at 4°C. Nuclei were next lysed with Lysis Buffer 2 (10 mM Tris-HCl pH 8, 200 mM NaCl, 1 mM EDTA, and 0.5 mM EGTA) by incubating nuclei in the buffer for 10 minutes at room temperature. Finally, nuclear extracts were resuspended in Sonication Buffer 1 (20 mM Tris-HCl pH 8, 150 mM NaCl, 2 mM EDTA, 0.1% SDS, and 1% Triton X-100). Cells were sonicated using a Branson probe sonicator with the following settings: 18% amplitude, 30 s on, 60 s off, 17 cycles, 8.5 minutes total. WIZ antibody (4 µg, NBP1-80586) was incubated with Protein G Dynabeads (150 µl, Thermo Fisher Scientific, 10004D) for 6 hours at 4°C. Unbound antibody was removed by washing beads three times with PBS before sonicated chromatin was added to antibody-conjugated beads and incubated overnight at 4°C. Beads were washed with sonication buffer, wash buffer 1 (20 mM Tris-HCl pH 8, 500 mM NaCl, 2 mM EDTA, 0.1% SDS, and 1% Triton X-100), wash buffer 2 (10 mM Tris-HCl pH 8, 250 mM LiCl, 1 mM EDTA, and 1% NP-40), and wash buffer 3 (10 mM Tris pH 8, 1 mM EDTA, and 50 mM NaCl). Chromatin was eluted from beads by adding elution buffer (50 mM Tris pH 8, 10 mM EDTA, and 1% SDS) and incubating at 65°C for 1 hour, spinning down the mixture, then moving the supernatant to a new tube. Supernatant was left at 65°C overnight to reverse crosslinks.
RNA was degraded by adding TE and RNase A (Sigma Aldrich, R4642) to the tubes at 37°C for 2 hours followed by protein degradation with CaCl2 and Proteinase K (NEB, P8107S) for 30 minutes at 55°C. DNA was extracted with phenol:chloroform and precipitated by addition of NaCl, glycogen, and ethanol. The resulting DNA pellet was resuspended in 10 mM Tris-HCl pH 8. The ChIP-seq library was prepared using the ThruPLEX-FD Prep kit (Takara, R400428).

For WIZ ChIP-seq replicate 2, CTCF ChIP-seq replicate 1, and SMC1 ChIP-seq replicate 1, 50 million mESCs were crosslinked in 1% formaldehyde for 2 minutes before quenching with 125 mM glycine. Crosslinked cells were lysed using Lysis Buffer 1 and Lysis Buffer 2 before resuspension in Sonication Buffer 1. Human chromatin (from HEK293T cells) was spiked in (to final 5%) prior to sonication for the indicated experiments. Sonication of nuclei was performed on a Covaris E220 with the following settings: Duty Factor 8, PIP/W 210, and 200 cycles per burst for 12 minutes. Chromatin fragments of 200-1,000 base pairs were generated. Antibodies (WIZ, NBP1-80586, 4 µg; CTCF, 61311, 10 µg; and SMC1A, A300-055A, 10 µg) were incubated with 100 µl Protein G Dynabeads for 6 hours. Unbound antibody was removed via washing, as detailed above, before incubation of antibody-bound beads with chromatin overnight. Beads were then washed with Sonication Buffer 1 and Wash Buffers 1, 2, and 3 as detailed above. Chromatin was eluted as described above. Crosslinks were reversed overnight via incubation at 65°C and addition of 5 µl Proteinase K. The Zymo ChIP DNA Clean and Concentrator kit (Zymo Research, D5205) was used to purify DNA following Proteinase K digestion. Sequencing libraries were prepared using the NEBNext Ultra II DNA library prep kit for Illumina (NEB, E7645S).
For CTCF ChIP-seq replicate 2, 40 million cells were crosslinked in 1% formaldehyde for 2 minutes and lysed with Lysis Buffers 1 and 2 as indicated above, before resuspension in Sonication Buffer 1 and addition of human spike-in chromatin. Sonication was performed on a Covaris E220 as detailed above and the ChIP protocol was completed as detailed above. Sequencing libraries were prepared using a Hyper Prep kit (Roche/Kapa Biosciences, KK8502) according to manufacturer’s instructions.

For SMC1A ChIP-seq replicate 2, cells were crosslinked for 20 minutes in 1% formaldehyde (Sigma Aldrich, F1635) and quenched by adding 125 mM glycine. Cells were lysed using Lysis Buffers 1 and 2, and recovered nuclei were resuspended in Sonication Buffer 1 and sonicated using a Bioruptor (Diagenode) water bath sonicator. Insoluble material was then cleared by spinning sonicated lysates for 10 minutes at 21,000 rcf. To perform the IP, 15 µg of anti-SMC1A antibody (Bethyl Laboratories, A300-055A) was incubated with 100 µl Dynabeads per IP for 8 hours, after which beads were washed twice with PBS to remove excess antibodies. An estimated 25 × 10^6 cell equivalents of chromatin in 550 µl were then incubated overnight at 4°C with the antibody-bound beads while rotating. Beads were collected using a magnet and the unbound fraction removed, followed by washes with Wash Buffers 1, 2, and 3 as detailed above. Crosslinks were reversed as described above, and phenol:chloroform extraction was used to collect DNA. Libraries were prepared using a ThruPLEX DNA-seq kit according to manufacturer’s instructions, and 50 bp single-end sequencing reads were collected using the Illumina HiSeq 2500 platform.

For RAD21 ChIP-seq, cells were crosslinked in 1% formaldehyde (ThermoFisher, 28906) for 5 minutes in PBS and quenched by adding glycine. Cells were lysed using Lysis Buffers 1 and 2, and recovered nuclei were resuspended in 1 ml Covaris Shearing Buffer (50 mM Tris pH 7.5, 10 mM EDTA, 0.1% SDS).
Human HEK293T nuclei prepared in the same manner were spiked in to a final concentration of 5%, and nuclei were sheared for 12 minutes using a Covaris E220 water bath sonicator with device settings of duty factor 5, PIP/W of 140, and 200 cycles per burst. Insoluble material was cleared by spinning sonicated lysates for 10 minutes at 21,000 rcf. 10 µg anti-RAD21 antibody (Bethyl Laboratories, A300-080A) was incubated for 8 hours with 30 µl of Dynabeads in PBS, after which excess antibodies were washed from the beads. 10^7 cell equivalents of sheared chromatin were incubated overnight with the antibody-bound beads in 1 ml Sonication Buffer 1, washed with Wash Buffers 1, 2, and 3, and crosslinks were reversed overnight as described above. DNA was purified using a ChIP DNA Clean and Concentrator Kit (Zymo Research, D5205) according to manufacturer’s instructions. Sequencing libraries were prepared using a Hyper Prep kit (Kapa Biosciences, KK8502) according to manufacturer’s instructions, and 150 bp paired-end sequencing reads were collected using the Illumina NovaSeq SP sequencing platform and reagents.

Re-ChIP

For the Re-ChIP experiment, 120 million wild-type mESCs were crosslinked in 1% formaldehyde for 5 minutes before lysing with Lysis Buffers 1 and 2 and resuspension in Covaris Shearing Buffer. Cells were sonicated using a Covaris E220 sonicator with the settings detailed above. Input material was saved post-sonication and the remaining chromatin was divided into three tubes for IP of CTCF (Active Motif, 61311, 10 µg), WIZ (Novus Biologicals, NBP1-80586, 4 µg), and IgG (Bethyl Laboratories, P120-101, 10 µg). For the first ChIP, the Active Motif Re-ChIP-IT kit (53016) was used with the following modifications: 1) the antibodies and beads were incubated together for 6 hours prior to the addition of chromatin; 2) incubation of chromatin with antibodies/beads was performed overnight at 4°C. Following the first ChIP, additional input material was saved before each tube of chromatin was divided into four tubes for the second ChIP with CTCF (Active Motif, 61311, 10 µg), WIZ (Novus Biologicals, NBP1-80586, 4 µg), IgG (Bethyl Laboratories, P120-101, 10 µg), or no antibody as a control and allowed to incubate overnight. Input material from the first ChIP was purified using the Zymo ChIP DNA Clean and Concentrator kit (Zymo Research, D5201) prior to qPCR analysis. qPCR was performed on an Applied Biosystems QuantStudio 6 qPCR machine using primers found in Table S6.

High-throughput sequencing

50 bp or 100 bp single-end or paired-end sequencing was performed on Illumina HiSeq 4000, HiSeq 2500, or NovaSeq 6000 platforms using Illumina reagents according to manufacturer’s instructions.

Transfection

A pcDNA3.1 empty vector and a vector containing human Wiz cDNA driven by a CMV promoter with a Myc-His tag were obtained from Dr. Samantha Pattenden. The plasmids were transformed into DH5-α competent E. coli and purified using the Zymo II Midiprep kit (Zymo Research, D4200). For transfection, 8 µg of isolated plasmid was mixed with 2 M CaCl2. The mixture was added to 2X HEPES Buffered Saline (HBS) while air bubbles were introduced with an empty pipette. Once the entire CaCl2/DNA mixture had been added to the HBS, the DNA/HBS mixture was added dropwise to a 10 cm plate of HEK293T cells. Cells were incubated for 24 hours before receiving a change to fresh media. After an additional 24 hours, cells were scraped and pelleted for collection.

Co-immunoprecipitation

Co-immunoprecipitation studies were performed using a Nuclear Complex Co-IP Kit (Active Motif, 54001) and Protein G Dynabeads (Thermo Fisher Scientific, 10009D). Each immunoprecipitation was performed using 100 µg of nuclear extract. Input material was loaded as a control, with 1X corresponding to 20 µg of protein.

Western blotting

Cells were washed with PBS and collected via scraping. Pellets were resuspended in Lysis Buffer A (10 mM HEPES pH 7.9, 10 mM KCl, 0.1 mM EDTA, and 0.1 mM EGTA) with 1x protease inhibitor cocktail (Sigma Aldrich, 11697498001) and incubated for 15 minutes at 4°C before addition of 1 mL 10% NP-40 and pelleting via centrifugation. The resulting pellet was resuspended in cold TEN250/0.1 buffer (50 mM Tris-HCl pH 7.5, 250 mM NaCl, 5 mM EDTA, and 0.1mM NP-40) and incubated for a minimum of 30 minutes at 4°C. Following pelleting via centrifugation, the nuclear fraction (supernatant) was collected. Prior to western blotting, protein levels were quantified using the DC assay from BioRad (BioRad, 5000112). Samples were run in 4%-20% Tris-Glycine gels (BioRad, 4568094) and transferred to PVDF membranes (VWR, 29301-856). Membranes were blocked for at least 45 minutes with 5% Blotting-Grade Blocker (BioRad, 1706404) before overnight incubation at 4°C with primary antibody. The membrane was then washed 3 × 10 minutes with TBS-T before incubation for 1 hour at room temperature with secondary antibody. After 3 × 10 minute washes with TBS-T, membranes were imaged using either Thermo SuperSignal West Pico (Thermo Fisher Scientific, 34577) or Thermo SuperSignal West Femto (Thermo Fisher Scientific, 34094) chemiluminescent substrate with an Amersham Imager 600. Primary antibodies included those used previously in ChIP-seq experiments and anti-H3 (Abcam, ab1791) and anti-MYC tag (Abcam, ab9106).

RT-qPCR

Three replicates of the Wiz clonal cell line and wild-type cell line were resuspended in 1ml Trizol (Invitrogen, 15596018). Chloroform (Sigma Aldrich, C2432) was added for phase separation. RNA was purified using the Zymo RNA Clean and Concentrator Kit (Zymo Research, R1013). cDNA was prepared with Superscript IV (Thermo Fisher Scientific, 18091050). qPCR was performed on an Applied Biosystems QuantStudio 6 qPCR machine using primers found in Table S6.

RNA-seq

Three replicates of the Wiz clonal cell line and wild-type cell line were resuspended in 1 ml Trizol (Invitrogen, 15596018). Chloroform (Sigma Aldrich, C2432) was added for phase separation. RNA was purified using the Zymo RNA Clean and Concentrator Kit (Zymo Research, R1013). Libraries were prepared using a TruSeq RNA Library Prep Kit v2 (Illumina, RS-122-2001) with indexes AR001-AR009. Library cleanup was performed using Agencourt AMPure XP beads (Beckman Coulter, A63881). Sequencing was performed on an Illumina HiSeq 4000 with 50 bp paired-end reads.

Hi-C

2-5 × 10^6 cells were crosslinked with 1% formaldehyde in PBS for 2 minutes, then quenched by addition of 125 mM glycine. Hi-C library construction was performed using an Arima-HiC kit (Arima Genomics) and Hyper Prep DNA-seq library prep kit (Kapa Biosciences, KK8502) according to manufacturer’s instructions, with the following modifications. Digested Hi-C libraries were treated with Arima ligase for 1 hour at room temperature instead of the recommended 15 minutes, and barcoded sequencing adapters were ligated for 1 hour at 20°C instead of the recommended 15 minutes.

QUANTIFICATION AND STATISTICAL ANALYSIS

ChIP-seq analysis and normalization

Replicates were merged as raw fastq files before reads were aligned to a merged genome containing both mouse genome assembly mm10 and human genome assembly hg38 using bowtie (v 1.2.2) (parameters -v 2 -p 24 -S -m 1 --best --strata) (Langmead et al., 2009). Mouse chromosomes were denoted by an Mchr prefix to allow separation from human chromosomes in later steps. Duplicate sequences were removed using samtools (v 1.9) markdup (-r -s) (Li et al., 2009). Reads mapping to mouse and human chromosomes were separated using samtools idxstats and counted with awk. A bam file containing only mouse reads was created using samtools view and converted to bed format using bedtools (v 2.26) bamtobed, and reads were extended by 200 bp (Quinlan and Hall, 2010). Extended bed files were used to call peaks using MACS (v 2016-02-15) with a false discovery rate of 1% (macs2 callpeak -f BED -g mm -q 0.01) (Zhang et al., 2008). To obtain a high confidence peak set, called peak summits were expanded by 50 bp on either side using awk and any expanded peak overlapping a repeat element (defined using the Repeat Masker Track from UCSC genome browser) was removed prior to any peak-related analysis. A normalization factor was calculated for each sample using the formula 1/h, where h is the number of human aligned reads in millions, as described previously (Orlando et al., 2014). Normalization to the provided reference (human genome), rather than the total read depth of the dataset, enables the discovery and quantification of dynamic epigenomic changes. The bed file containing mouse reads was converted to bedgraph using bedtools genomecov (-bga -scale 1/h) before being converted to a bigwig file with bedGraphToBigWig from ucsctools (v 320) (Kent et al., 2002). Z-score normalization was performed where indicated using a custom R script from Spencer Nystrom of Dr. Daniel McKay’s lab.
Overlap peak lists were generated by using bedtools intersect on summit files generated by MACS extended by 50 bp on either side. Average signal plots were generated using deeptools (v 3.0.1) computeMatrix (reference-point for CTCF sites, promoters, and enhancers; scale-regions for meta-loop anchors and insulated domains) followed by deeptools plotProfile (Ramírez et al., 2016). Enhancers were defined by merging ChIP-seq data from the master transcription factors Nanog, Sox2, and Oct4 and calling peaks on the merged data (Whyte et al., 2013). Heatmaps were generated using deeptools computeMatrix reference-point or scale-regions followed by deeptools plotHeatmap. Data in Figure 1B were subjected to k-means clustering with 2 clusters using deeptools plotHeatmap --kmeans 2 (chosen based on results from k-means clustering with a range of 2-6 clusters). In order to visualize different classes of binding sites in each group featured in Figure S2J, deeptools bigwigCompare was used to create a subtractive bigwig track in which WT RAD21 ChIP-seq signal was subtracted from Wizdel RAD21 ChIP-seq signal at each site. Each heatmap in Figure S2J was then plotted in order of descending signal based on the subtractive RAD21 heatmap values. Correlation plots were generated using deeptools multiBigwigSummary followed by plotCorrelation (--removeOutliers --skipZeros --corMethod pearson). Coverage tracks were visualized using the UCSC Genome Browser. Unbiased motif analysis was performed using MEME-ChIP (Bailey et al., 2009). Differentially bound CTCF and cohesin sites were identified using DiffBind (Ross-Innes et al., 2012).
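
The spike-in normalization described above can be sketched as follows. This is a minimal illustration, assuming reads have already been counted per genome; the function names and the example numbers are hypothetical and do not come from the study's custom script.

```python
def spike_in_scale_factor(human_reads: int) -> float:
    """Return 1/h, where h is the number of human-aligned reads in millions."""
    if human_reads <= 0:
        raise ValueError("spike-in normalization needs at least one human read")
    return 1.0 / (human_reads / 1_000_000)


def scale_coverage(coverage, factor):
    """Multiply the depth of each (chrom, start, end, depth) interval by `factor`."""
    return [(chrom, start, end, depth * factor)
            for chrom, start, end, depth in coverage]


# Hypothetical sample with 2.5 million human (spike-in) reads.
factor = spike_in_scale_factor(2_500_000)
# Hypothetical mouse coverage intervals, using the Mchr prefix from the text.
bedgraph = [("Mchr1", 0, 100, 10.0), ("Mchr1", 100, 200, 25.0)]
scaled = scale_coverage(bedgraph, factor)
print(factor, scaled)
```

Under these assumptions, a sample with more human reads receives a smaller factor, so mouse coverage is expressed relative to the fixed spike-in amount rather than to total sequencing depth. In the pipeline described in the text, the same factor is passed to `bedtools genomecov` via `-scale`.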

RNA-seq analysis

RNA sequence reads were aligned to genomic sequence using STAR (version 2.6.0a) (Dobin et al., 2013). Differentially expressed genes were identified using DESeq2 from Bioconductor (Love et al., 2014) (Table S3). A PCA plot was generated using DESeq2 plotPCA and an MA plot was generated using DESeq2 plotMA. Locations of insulated neighborhoods (Super-enhancer Domains and Polycomb Domains) were obtained from Dowen et al. (2014). Coordinates of insulated neighborhoods were converted from mm9 to mm10 using the UCSC LiftOver tool. Genes located within or near these neighborhoods were identified using bedtools intersect. The ranked list of DEGs by expression, the barplot of GO terms, and the barplot of pairs of genes exhibiting the loss of insulation signature were manually generated with Microsoft Excel (Table S4). Violin plots of genes within and near insulated neighborhoods were generated using ggplot2 geom_violin (Wickham, 2016). Significance of violin plots was computed using the Wilcoxon test via compare_means from R package ggpubr (Wickham, 2016). Gene Set Enrichment Analysis was performed to identify potential impacts on biological processes (Subramanian et al., 2005) (Table S4).
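
The assignment of genes to insulated neighborhoods can be illustrated with a small interval-overlap sketch. It is a simplified, pure-Python stand-in for `bedtools intersect`, assuming half-open coordinates on a single assembly; the gene names, neighborhood name, and coordinates below are invented for illustration.

```python
from typing import List, Tuple

# (chrom, start, end, name) with half-open [start, end) coordinates.
Interval = Tuple[str, int, int, str]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share a chromosome and at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def genes_in_neighborhoods(genes: List[Interval],
                           neighborhoods: List[Interval]) -> List[str]:
    """Return sorted names of genes overlapping any neighborhood."""
    return sorted({g[3] for g in genes for n in neighborhoods if overlaps(g, n)})


# Hypothetical inputs (not taken from the study's tables).
neighborhoods = [("chr6", 122_600_000, 122_720_000, "domain_1")]
genes = [
    ("chr6", 122_650_000, 122_660_000, "geneA"),  # inside the domain
    ("chr6", 123_000_000, 123_010_000, "geneB"),  # same chromosome, outside
    ("chr7", 122_650_000, 122_660_000, "geneC"),  # different chromosome
]
print(genes_in_neighborhoods(genes, neighborhoods))
```

This quadratic scan is fine for a toy example; for genome-scale inputs a sorted sweep or the bedtools implementation used in the study would be the practical choice.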

Hi-C analysis

Initial processing of Hi-C data was performed with the Juicer software package (Durand et al., 2016a). Reads were aligned using BWA-mem with default parameters, after which PCR duplicates, reads with Q ≤ 30, and self-ligated fragments were filtered out before Hi-C matrices were assembled (Table S5) (Li and Durbin, 2009). Matrices were visualized using the Juicebox software package (Durand et al., 2016b). WT replicates were well correlated (Pearson R = 0.98) and Wiz replicates were well correlated (Pearson R = 0.93). Loops were called using the HiCCUPS algorithm (Rao et al., 2014) (parameters -p 8,4,2 -i 14,10,6) on Knight-Ruiz balanced matrices at resolutions of 5 kb, 10 kb, and 25 kb, and the resulting list of merged loops was used for subsequent analyses. Contact domains were called using the Arrowhead algorithm (Rao et al., 2014) with default parameters on Knight-Ruiz balanced matrices at 25 kb resolution. Eigenvectors for analysis and visualization of compartmentalization were calculated by passing aligned reads into the HOMER Hi-C analysis software package (Heinz et al., 2010; Lieberman-Aiden et al., 2009). Loops were classified as either dynamic or static by measuring the distance from a left anchor in one genotype to the nearest left anchor in the other genotype using bedtools closest and repeating this analysis for right anchors. Due to resolution limitations, a dynamic anchor was defined as being more than 25,001 bp away from the nearest same-side anchor in the other genotype. To be considered static, both anchors of a loop must be within 25 kb of an anchor in the opposite genotype. A dynamic loop may have either one or two altered anchors. Loop size was determined by measuring the distance from the start of the left anchor to the end of the right anchor. The same analyses were performed using domain anchors. Genes were assigned to loop structures by performing bedtools intersect between the classes of looped regions and a list of gene promoters.
Log fold change of genes inside various looped regions was plotted using pheatmap.
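
The static versus dynamic loop classification and the loop size measurement described above can be sketched as follows. This is a simplified illustration rather than the study's code: the `Loop` type, the function names, and the example coordinates are hypothetical, the anchor distance is computed as the gap between intervals (0 when they overlap), which may differ slightly from the distance reported by `bedtools closest`, and the cutoff is taken as 25,000 bp.

```python
from typing import List, NamedTuple


class Loop(NamedTuple):
    chrom: str
    left_start: int
    left_end: int
    right_start: int
    right_end: int


def _span(loop: Loop, side: str):
    """Return (start, end) of the requested anchor ('left' or 'right')."""
    if side == "left":
        return loop.left_start, loop.left_end
    return loop.right_start, loop.right_end


def gap(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Distance between two intervals; 0 if they overlap or touch."""
    return max(0, max(a_start, b_start) - min(a_end, b_end))


def nearest_anchor_distance(loop: Loop, others: List[Loop], side: str) -> float:
    """Distance from one anchor of `loop` to the nearest same-side anchor in `others`."""
    start, end = _span(loop, side)
    dists = [gap(start, end, *_span(o, side)) for o in others if o.chrom == loop.chrom]
    return min(dists) if dists else float("inf")


def classify(loop: Loop, others: List[Loop], max_dist: int = 25_000) -> str:
    """'static' if both anchors lie within max_dist of a same-side anchor, else 'dynamic'."""
    left = nearest_anchor_distance(loop, others, "left")
    right = nearest_anchor_distance(loop, others, "right")
    return "static" if left <= max_dist and right <= max_dist else "dynamic"


def loop_size(loop: Loop) -> int:
    """Distance from the start of the left anchor to the end of the right anchor."""
    return loop.right_end - loop.left_start


# Hypothetical loops in two genotypes.
wt = [Loop("chr1", 100_000, 105_000, 500_000, 505_000)]
mutant = [
    Loop("chr1", 110_000, 115_000, 500_000, 505_000),  # both anchors near a WT loop
    Loop("chr1", 100_000, 105_000, 300_000, 305_000),  # right anchor moved
]
labels = [classify(loop, wt) for loop in mutant]
print(labels, [loop_size(loop) for loop in mutant])
```

In this toy example the first mutant loop is labeled static and the second dynamic because only its right anchor has shifted, which mirrors the statement that a dynamic loop may have one or two altered anchors.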

DATA AND CODE AVAILABILITY

The accession number for the raw and processed sequencing data reported in this paper is GEO: GSE137285. All datasets are summarized in Table S1. Oligos used are detailed in Table S6. Custom ChIP-seq processing script is available at GitHub: https://github.com/dowenlab.

KEY RESOURCES TABLE

REAGENT or RESOURCE | SOURCE | IDENTIFIER
Antibodies
WIZ antibody | Novus Biologicals | Cat#NBP1-80586; RRID: AB_11011659
CTCF antibody | Active Motif | Cat#61311; RRID: AB_2614975
SMC1A antibody | Bethyl Laboratories, Inc | Cat#A300-055A; RRID: AB_2192467
RAD21 antibody | Bethyl Laboratories, Inc | Cat#A300-080A; RRID: AB_2176615
IgG antibody | Bethyl Laboratories, Inc | Cat#P120-101; RRID: AB_479829
H3 antibody | Abcam | Cat#ab1791; RRID: AB_302613
MYC tag antibody | Abcam | Cat#ab9106; RRID: AB_307014

Chemicals, Peptides, and Recombinant Proteins
cOmplete Protease Inhibitor Cocktail | Sigma Aldrich | Cat#11697498001
DMEM, high glucose, pyruvate | GIBCO | Cat#11995065
HyClone Cosmic Calf Serum | Thermo Fisher Scientific | Cat#SH3008703
GlutaMAX Supplement | Thermo Fisher Scientific | Cat#35050-061
Penicillin-Streptomycin (10,000 U/mL) | Thermo Fisher Scientific | Cat#15140-122
Lipofectamine 2000 Transfection Reagent | Invitrogen | Cat#11-668-027
Formaldehyde Solution | Sigma Aldrich | Cat#F1635
Protein G Dynabeads | Thermo Fisher Scientific | Cat#10004D
Pierce 16% Formaldehyde, Methanol-free | Thermo Fisher Scientific | Cat#28906
DC Protein Assay Kit II | BioRad | Cat#5000112
Blotting-Grade Blocker | BioRad | Cat#1706404
SuperSignal West Pico PLUS Chemiluminescent Substrate | Thermo Fisher Scientific | Cat#34577
SuperSignal West Femto Maximum Sensitivity Substrate | Thermo Fisher Scientific | Cat#34094
TRIzol Reagent | Invitrogen | Cat#15596018
Chloroform | Sigma Aldrich | Cat#C2432
AMPure XP beads | Beckman Coulter | Cat#A63881
Proteinase K | New England BioLabs | Cat#P8107S
Ribonuclease A (RNase A) from Bovine Pancreas | Sigma Aldrich | Cat#R4642
KnockOut DMEM | Thermo Fisher Scientific | Cat#10829-018
Premium Grade Fetal Bovine Serum (FBS) | VWR | Cat#97068-085; Lot#249B17

Critical Commercial Assays
TOPO TA Cloning Kit for Sequencing | Invitrogen | Cat#K4575J10
ThruPLEX DNA-seq Kit | Takara Bio | Cat#R400428
Zymo ChIP DNA Clean & Concentrator | Zymo Research | Cat#D5205
NEBNext Ultra II DNA Library Prep Kit | New England BioLabs | Cat#E7645S
KAPA HyperPrep Kit | Roche/Kapa | Cat#KK8502
Re-ChIP-IT | Active Motif | Cat#53016
Nuclear Complex Co-IP Kit | Active Motif | Cat#54001
RNA Clean and Concentrator Kit | Zymo Research | Cat#R1013
SuperScript IV First Strand Synthesis System | Invitrogen | Cat#18091050
TruSeq RNA Library Prep Kit v2 | Illumina | Cat#RS-122-2001
Arima-HiC Kit | Arima Genomics | N/A (This is the company’s only product at the time of publication.)
ZymoPURE II Plasmid Midiprep Kit | Zymo Research | Cat#D4200

Deposited Data
Calibrated (Spike-In) ChIP-seq | This study | GEO: GSE137285
Hi-C | This study | GEO: GSE137285
RNA-seq | This study | GEO: GSE137285
OCT4 ChIP-seq | Whyte et al. (2013). PMID: 23582322 | GEO: GSE44286
SOX2 ChIP-seq | Whyte et al. (2013). PMID: 23582322 | GEO: GSE44286
NANOG ChIP-seq | Whyte et al. (2013). PMID: 23582322 | GEO: GSE44286
WIZ ChIP-seq (Cerebellum) | Isbel et al. (2016). PMID: 27410475 | GEO: GSE76909
CTCF ChIP-seq (Cerebellum) | Shen et al. (2012). PMID: 22763441 | GEO: GSE23830
H3 ChIP-seq | Mullen et al. (2011). PMID: 22036565 | GEO: GSE23830
CTCF ChIP-seq (mESC) | Nora et al. (2017). PMID: 28525758 | GEO: GSE98671
SMC1 ChIA-PET | Dowen et al. (2014). PMID: 25303531 | GEO: GSE57911
mESC RNA-seq | Mozzetta et al. (2014). PMID: 24389103 | GEO: GSE49669
G9a/ mESC RNA-seq | Mozzetta et al. (2014). PMID: 24389103 | GEO: GSE49669

Experimental Models: Cell Lines
Murine Embryonic Stem Cells (mESC) v6.5 | Laboratory of Dr. Richard Young | N/A
Human Embryonic Kidney Cells (HEK293T) | Laboratory of Dr. Richard Young | N/A
Wizdel Mouse Embryonic Stem Cells (mESC) | This study | N/A

Oligonucleotides
RT-qPCR Primers | This study | See Table S6
ChIP-qPCR Primers | This study | See Table S6

Software and Algorithms
Custom ChIP-seq Processing Script | This study | https://github.com/dowenlab
Bowtie | Langmead et al. (2009). http://genomebiology.biomedcentral.com/articles/10.1186/gb-2009-10-3-r25 | v1.2.2
Samtools | Li et al. (2009) | v1.9
Bedtools | Quinlan and Hall (2010) | v2.26
MACS | Zhang et al. (2008) | v2016-02-15
UCSC Tools | Kent et al. (2002). https://doi.org/10.1101/gr.229102 | v320
DeepTools | Ramírez et al. (2016) | v3.0.1
MEME Suite | Bailey et al. (2009) | N/A
DiffBind | Ross-Innes et al. (2012) | N/A
STAR Aligner | Dobin et al. (2013) | v2.6.0a
DESeq2 | Love et al. (2014). http://genomebiology.biomedcentral.com/articles/10.1186/s13059-014-0550-8 | N/A
ggpubr, ggplot2 | Wickham (2016) | N/A
Juicer/Juicebox | Durand et al., 2016b, 2016a. https://doi.org/10.1016/j.cels.2016.07.002; https://doi.org/10.1016/j.cels.2015.07.012 | N/A
Arrowhead, HiCCUPS | Rao et al. (2014). https://doi.org/10.1016/j.cell.2014.11.021 | N/A
HOMER | Heinz et al. (2010). https://doi.org/10.1016/j.molcel.2010.05.004 | N/A
BWA (Burrows-Wheeler Aligner) | Li and Durbin (2009) | N/A
GSEA (Gene Set Enrichment Analysis) | Subramanian et al. (2005). https://doi.org/10.1073/pnas.0506580102 | N/A

Other
4-20% Mini-PROTEAN TGX Stain-Free Protein Gels | BioRad | Cat#4568094
FluoroTrans Transfer Membranes | VWR | Cat#29301-856
  61 in total

1.  Histone methyltransferase G9a contributes to H3K27 methylation in vivo.

Authors:  Hui Wu; Xiuzhen Chen; Jun Xiong; Yingfeng Li; Hong Li; Xiaojun Ding; Sheng Liu; She Chen; Shaorong Gao; Bing Zhu
Journal:  Cell Res       Date:  2010-11-16       Impact factor: 25.617

2.  A Role for Widely Interspaced Zinc Finger (WIZ) in Retention of the G9a Methyltransferase on Chromatin.

Authors:  Jeremy M Simon; Joel S Parker; Feng Liu; Scott B Rothbart; Slimane Ait-Si-Ali; Brian D Strahl; Jian Jin; Ian J Davis; Amber L Mosley; Samantha G Pattenden
Journal:  J Biol Chem       Date:  2015-09-03       Impact factor: 5.157

3.  Proteomics. Tissue-based map of the human proteome.

Authors:  Mathias Uhlén; Linn Fagerberg; Björn M Hallström; Cecilia Lindskog; Per Oksvold; Adil Mardinoglu; Åsa Sivertsson; Caroline Kampf; Evelina Sjöstedt; Anna Asplund; IngMarie Olsson; Karolina Edlund; Emma Lundberg; Sanjay Navani; Cristina Al-Khalili Szigyarto; Jacob Odeberg; Dijana Djureinovic; Jenny Ottosson Takanen; Sophia Hober; Tove Alm; Per-Henrik Edqvist; Holger Berling; Hanna Tegel; Jan Mulder; Johan Rockberg; Peter Nilsson; Jochen M Schwenk; Marica Hamsten; Kalle von Feilitzen; Mattias Forsberg; Lukas Persson; Fredric Johansson; Martin Zwahlen; Gunnar von Heijne; Jens Nielsen; Fredrik Pontén
Journal:  Science       Date:  2015-01-23       Impact factor: 47.728

4.  Cohesin Loss Eliminates All Loop Domains.

Authors:  Suhas S P Rao; Su-Chen Huang; Brian Glenn St Hilaire; Jesse M Engreitz; Elizabeth M Perez; Kyong-Rim Kieffer-Kwon; Adrian L Sanborn; Sarah E Johnstone; Gavin D Bascom; Ivan D Bochkov; Xingfan Huang; Muhammad S Shamim; Jaeweon Shin; Douglass Turner; Ziyi Ye; Arina D Omer; James T Robinson; Tamar Schlick; Bradley E Bernstein; Rafael Casellas; Eric S Lander; Erez Lieberman Aiden
Journal:  Cell       Date:  2017-10-05       Impact factor: 41.582

5.  Comprehensive mapping of long-range interactions reveals folding principles of the human genome.

Authors:  Erez Lieberman-Aiden; Nynke L van Berkum; Louise Williams; Maxim Imakaev; Tobias Ragoczy; Agnes Telling; Ido Amit; Bryan R Lajoie; Peter J Sabo; Michael O Dorschner; Richard Sandstrom; Bradley Bernstein; M A Bender; Mark Groudine; Andreas Gnirke; John Stamatoyannopoulos; Leonid A Mirny; Eric S Lander; Job Dekker
Journal:  Science       Date:  2009-10-09       Impact factor: 47.728

6.  Mediator and cohesin connect gene expression and chromatin architecture.

Authors:  Michael H Kagey; Jamie J Newman; Steve Bilodeau; Ye Zhan; David A Orlando; Nynke L van Berkum; Christopher C Ebmeier; Jesse Goossens; Peter B Rahl; Stuart S Levine; Dylan J Taatjes; Job Dekker; Richard A Young
Journal:  Nature       Date:  2010-08-18       Impact factor: 49.962

7.  The zinc finger proteins ZNF644 and WIZ regulate the G9a/GLP complex for gene repression.

Authors:  Chunjing Bian; Qiang Chen; Xiaochun Yu
Journal:  Elife       Date:  2015-03-19       Impact factor: 8.140

8.  Topoisomerase II beta interacts with cohesin and CTCF at topological domain borders.

Authors:  Liis Uusküla-Reimand; Huayun Hou; Payman Samavarchi-Tehrani; Matteo Vietri Rudan; Minggao Liang; Alejandra Medina-Rivera; Hisham Mohammed; Dominic Schmidt; Petra Schwalie; Edwin J Young; Jüri Reimand; Suzana Hadjur; Anne-Claude Gingras; Michael D Wilson
Journal:  Genome Biol       Date:  2016-08-31       Impact factor: 13.583

9.  Fast and accurate short read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2009-05-18       Impact factor: 6.937

10.  Cohesin-mediated interactions organize chromosomal domain architecture.

Authors:  Sevil Sofueva; Eitan Yaffe; Wen-Ching Chan; Dimitra Georgopoulou; Matteo Vietri Rudan; Hegias Mira-Bontenbal; Steven M Pollard; Gary P Schroth; Amos Tanay; Suzana Hadjur
Journal:  EMBO J       Date:  2013-11-01       Impact factor: 11.598


8.  A fast Myosin super enhancer dictates muscle fiber phenotype through competitive interactions with Myosin genes.

Authors:  Matthieu Dos Santos; Stéphanie Backer; Frédéric Auradé; Matthew Man-Kin Wong; Maud Wurmser; Rémi Pierre; Francina Langa; Marcio Do Cruzeiro; Alain Schmitt; Jean-Paul Concordet; Athanassia Sotiropoulos; F Jeffrey Dilworth; Daan Noordermeer; Frédéric Relaix; Iori Sakakibara; Pascal Maire
Journal:  Nat Commun       Date:  2022-02-24       Impact factor: 14.919

9.  Loss of Wiz Function Affects Methylation Pattern in Palate Development and Leads to Cleft Palate.

Authors:  Ivana Bukova; Katarzyna Izabela Szczerkowska; Michaela Prochazkova; Inken M Beck; Jan Prochazka; Radislav Sedlacek
Journal:  Front Cell Dev Biol       Date:  2021-06-02

Review 10.  Wiring the Brain by Clustered Protocadherin Neural Codes.

Authors:  Qiang Wu; Zhilian Jia
Journal:  Neurosci Bull       Date:  2020-09-17       Impact factor: 5.271

