Julie Segueni, Daan Noordermeer.
Abstract
The emergence and progression of cancers are accompanied by a dysregulation of transcriptional programs. The three-dimensional (3D) organization of the human genome has emerged as an important multi-level mediator of gene transcription and regulation. In cancer cells, this organization can be restructured, providing a framework for the deregulation of gene activity. The CTCF protein, initially identified as the product of a tumor suppressor gene, is a jack-of-all-trades for the formation of 3D genome organization in normal cells. Here, we summarize how CTCF is involved in the multi-level organization of the human genome and we discuss emerging insights into how perturbed CTCF function and DNA binding cause the activation of oncogenes in cancer cells, mostly through a process of enhancer hijacking. Moreover, we highlight non-canonical functions of CTCF that can be relevant for the emergence of cancers as well. Finally, we provide guidelines for the computational identification of perturbed CTCF binding and reorganized 3D genome structure in cancer cells.
Keywords: 3C, chromosome conformation capture; 3D genome organization; AS, alternative splicing; CBS, CTCF binding site; CNV, copy number variation; CRE, cis-regulatory element; CT, chromosome territory; CTCF; Cancer genomes; Computational biology; DSB, DNA double strand break; Enhancer hijacking; FISH, fluorescence in situ hybridization; GIST, gastrointestinal stromal tumors; Hi-C, high-throughput 3C; RBR, RNA binding region; SE, super enhancer; T-ALL, T cell acute lymphoblastic leukemia; TAD, topologically associating domain; TADs; Topologically associating domains; Tumor suppressor; ZF, zinc finger
Year: 2022 PMID: 35685367 PMCID: PMC9166472 DOI: 10.1016/j.csbj.2022.05.044
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1. Different scales of intra-chromosomal 3D genome organization in human cells and their corresponding appearance in Hi-C interaction maps. A: Hi-C compartments (also known as A and B compartments) constitute the largest scale of organization and represent alternating active and inactive regions that each preferentially engage in homotypic interactions. B: TADs are sub-Megabase domains embedded within Hi-C compartments. Within a TAD, interactions are enriched relative to interactions with surrounding domains. TADs appear as triangles along the diagonal of the Hi-C map. C: DNA loops represent interactions between two genomic loci, e.g. an enhancer and a promoter (red and blue bars) or the two extremities of a TAD (purple bars). They appear as a punctate increase of signal (black dots) within or on top of a TAD in a Hi-C map. Below each schematic Hi-C map, approximate length scales are indicated. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
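The statement that interactions are enriched within a TAD can be made concrete with a simplified insulation-style score. The sketch below is an illustration only and is not taken from the article or from any specific Hi-C tool: it builds a toy contact matrix with block-like domains and averages the contacts that cross each bin boundary. The function names, the window size and the threshold are all assumptions chosen for this example.

```python
def toy_contact_map(domain_sizes, inside=10.0, outside=1.0):
    """Build a symmetric toy contact matrix with block-like domains (hypothetical data)."""
    labels = []
    for idx, size in enumerate(domain_sizes):
        labels.extend([idx] * size)
    n = len(labels)
    return [[inside if labels[i] == labels[j] else outside for j in range(n)]
            for i in range(n)]


def insulation_scores(matrix, window=2):
    """Mean contact frequency across each bin boundary.

    The score at key b describes the boundary between bin b-1 and bin b:
    it averages contacts between the `window` bins upstream and the
    `window` bins downstream of that boundary.
    """
    n = len(matrix)
    scores = {}
    for b in range(window, n - window + 1):
        values = [matrix[i][j]
                  for i in range(b - window, b)
                  for j in range(b, b + window)]
        scores[b] = sum(values) / len(values)
    return scores


def call_boundaries(scores, threshold):
    """Return boundary positions whose score falls below an arbitrary threshold."""
    return sorted(b for b, s in scores.items() if s < threshold)


if __name__ == "__main__":
    hic = toy_contact_map([4, 5, 4])           # three toy domains
    scores = insulation_scores(hic, window=2)
    print(call_boundaries(scores, threshold=2.0))
```

Low scores mark positions where few contacts cross from one side to the other, which is how the edges of the triangles in a schematic Hi-C map would show up numerically. Real pipelines normalize the matrix and call local minima rather than applying a fixed threshold.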
Fig. 2. Domain organization of the CTCF protein. The linear organization of the CTCF protein is indicated on top, including its different functional domains. The reverse complement of the 15 bp core DNA binding motif that is recognized by ZFs 3–7 is indicated below. Lollipops highlight the positions where the presence of a methylated CpG dinucleotide can prevent CTCF binding [46].
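The two sequence operations mentioned in this legend, taking a reverse complement and locating CpG dinucleotides, can be sketched in a few lines. The example sequence below is an arbitrary placeholder and is not the CTCF core motif shown in the figure; the function names are likewise assumptions made for illustration.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def cpg_positions(seq):
    """Return 0-based start positions of every CpG dinucleotide in seq."""
    s = seq.upper()
    return [i for i in range(len(s) - 1) if s[i:i + 2] == "CG"]


if __name__ == "__main__":
    placeholder = "ACGTTCGGA"  # arbitrary sequence, NOT the CTCF motif
    rc = reverse_complement(placeholder)
    print(rc, cpg_positions(placeholder), cpg_positions(rc))
```

Because the CpG dinucleotide is its own reverse complement, both strands of a site contain the same number of CpGs, so candidate methylation positions can be reported on either strand.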
Fig. 3. Functions of CTCF in transcriptional regulation and impact of perturbed CTCF binding in cancer cells. A: TAD insulation, enhancer-promoter looping and heterochromatin barrier function through CTCF binding. A schematic Hi-C map with 3 TADs and simulated CTCF, Rad21 (cohesin complex) and H3K27me3 ChIP-seq are depicted. The yellow gene (center TAD) is inactive because of the enhancer-blocking activity of CTCF at the TAD boundary. The green gene on the right is activated by its enhancer through DNA loop formation. B: Schematic chromatin organization of the 3 TADs from panel A, showing the relative position between neighboring TADs, the containment of heterochromatin and the physical proximity between the green gene and its enhancer at loop anchors mediated by CTCF and the cohesin complex. C: Consequence of changes in the DNA sequence or methylation status of a CBS on TAD structure. Perturbation of CTCF binding at the CBSs that separate TAD 2 and TAD 3 (purple arrow) causes a fusion between the domains. This allows the hijacking of the enhancer located in former TAD 3 by the gene previously located in TAD 2. D: Schematic chromatin organization of the TADs from panel C. TAD 2 and 3 have fused, with the disruption of the boundary causing enhancer hijacking and gene activation. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Fig. 4. The non-canonical function of CTCF in alternative splicing. CTCF-mediated alternative splicing occurs in a methylation-dependent manner. When exon 2 is unmethylated, CTCF can bind and will pause RNA Pol II, leading to the inclusion of the exon. When exon 2 is methylated, CTCF will not bind and the exon will be skipped.
Fig. 5. Overview of computational strategies for the analysis of protein-DNA binding and 3D genome organization in cancer cells. The left panel shows an outline for the computational analysis of protein-DNA binding data from the mapping of raw sequencing reads to the identification of genomic features like differential peaks and binding motifs. Commonly used tools are indicated for each step. The right panel shows an outline for the computational analysis of 3D genome organization data from the mapping of raw sequencing reads to the identification of genomic features like A/B compartments, TAD boundaries and DNA loops. The outcomes from both analyses can be intersected to identify correlations between protein binding (e.g. CTCF) and 3D genome organization.
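The final intersection step described in this legend can be illustrated with a minimal sketch that flags TAD boundaries carrying a CTCF peak in a normal sample but not in a tumor sample. This is not the workflow of any tool named in the figure: the coordinates are invented toy data, and the function names and the 10 kb flank are assumptions chosen for this example.

```python
def peaks_near(boundary, peaks, flank):
    """Return peaks overlapping the window [pos - flank, pos + flank] around a boundary."""
    chrom, pos = boundary
    return [p for p in peaks
            if p[0] == chrom and p[1] <= pos + flank and p[2] >= pos - flank]


def boundaries_losing_ctcf(boundaries, normal_peaks, tumor_peaks, flank=10_000):
    """Boundaries with a nearby CTCF peak in normal cells but none in tumor cells."""
    lost = []
    for boundary in boundaries:
        in_normal = peaks_near(boundary, normal_peaks, flank)
        in_tumor = peaks_near(boundary, tumor_peaks, flank)
        if in_normal and not in_tumor:
            lost.append(boundary)
    return lost


if __name__ == "__main__":
    # Toy coordinates, invented for illustration only.
    boundaries = [("chr1", 500_000), ("chr1", 1_200_000), ("chr2", 800_000)]
    normal_peaks = [("chr1", 498_000, 498_400),
                    ("chr1", 1_201_000, 1_201_300),
                    ("chr2", 300_000, 300_400)]
    tumor_peaks = [("chr1", 498_100, 498_450),
                   ("chr2", 300_000, 300_400)]
    print(boundaries_losing_ctcf(boundaries, normal_peaks, tumor_peaks))
```

Boundaries returned by such a comparison are candidates for the domain fusions and enhancer hijacking events discussed in the text; in practice they would still need to be checked against the Hi-C maps of the same samples.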
Comparison of Hi-C analysis pipelines. Asterisks indicate the inclusion of tools with comparable output. The cooler file format allows interoperability between pipelines.
* For the HOMER toolbox, citation number is not restricted to Hi-C related tools.
* The Juicer pipeline allows the identification of contact domains, not TAD boundaries.
* The HOMER toolbox does not provide a “Distance vs Counts” but a “Distal-To-Local” tool to analyze chromatin compaction.