Allegra Angeloni, Ozren Bogdanovic.
Abstract
In vertebrates, cytosine-guanine (CpG) dinucleotides are predominantly methylated, with ∼80% of all CpG sites containing 5-methylcytosine (5mC), a repressive mark associated with long-term gene silencing. The exceptions to such a globally hypermethylated state are CpG-rich DNA sequences called CpG islands (CGIs), which are mostly hypomethylated relative to the bulk genome. CGIs overlap promoters from the earliest vertebrates to humans, indicating a concerted evolutionary drive compatible with CGI retention. CGIs are characterised by DNA sequence features that include DNA hypomethylation, elevated CpG and GC content and the presence of transcription factor binding sites. These sequence characteristics are congruous with the recruitment of transcription factors and chromatin modifying enzymes, and transcriptional activation in general. CGIs colocalize with sites of transcriptional initiation in hypermethylated vertebrate genomes; however, a growing body of evidence indicates that CGIs might exert their gene regulatory function in other genomic contexts. In this review, we discuss the diverse regulatory features of CGIs, their functional readout, and the evolutionary implications associated with CGI retention in vertebrates and possibly in invertebrates.
Keywords: CpG islands; DNA methylation; chromatin; orphan CpG islands
Year: 2021 PMID: 34156435 PMCID: PMC8286816 DOI: 10.1042/BST20200695
Source DB: PubMed Journal: Biochem Soc Trans ISSN: 0300-5127 Impact factor: 5.407
Figure 1. Sequence and chromatin features of CGIs.
CGIs are characterised by elevated CpG density and GC content [10,11,14,74], transcription factor binding sites (TFBS) [42–46], and G-quadruplex (G4) DNA sequences [47–49,52,53]. CGIs overlap key gene regulatory elements, such as promoters [2,4,5,27,113,114] and enhancers [6,7,9], and can thus switch between active/poised and repressive chromatin states, depending on the activity of the gene that they regulate. These states are influenced by the complement of so-called ‘reader’ proteins targeted to CGIs, which include transcriptional activators (CBP/P300, SETD1, CFP1, TET1, KDM2A, RNAP2) and repressors (PRC1, PRC2, KDM2B) [6,9,23,28,30,38,39,81–84,110–112]. In exceptional cases, such as in imprinted control regions (ICRs) or cancer testis antigen (CTA) gene promoters, CGIs can be stably silenced through DNA methylation (5mC) and methyl-CpG binding proteins (MBDs), a state which is reinforced by constant targeting of DNA methyltransferases (DNMTs) [3,24,25,56,61,62]. It is not yet clear how the continuous presence of 5mC within CGIs influences G4 sequences [52].
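The two sequence metrics named in this caption, GC content and CpG density, can be computed directly from a DNA string. The following is a minimal sketch only, not the method used in the review: CpG density is expressed as the widely used observed/expected CpG ratio, and the default thresholds in the hypothetical helper `looks_like_cgi` (length of at least 200 bp, GC content of at least 0.5, observed/expected ratio of at least 0.6) are the commonly cited Gardiner-Garden and Frommer criteria, which are an assumption here and are not taken from the text above. All function names are illustrative.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G).

    Returns 0.0 when the sequence contains no C or no G, since the
    expected CpG count is then zero.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    n_cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return n_cpg * len(seq) / (n_c * n_g)


def looks_like_cgi(seq: str, min_len: int = 200,
                   min_gc: float = 0.5, min_oe: float = 0.6) -> bool:
    """Apply the assumed length, GC-content and CpG o/e thresholds to seq."""
    return (len(seq) >= min_len
            and gc_content(seq) >= min_gc
            and cpg_obs_exp(seq) >= min_oe)


if __name__ == "__main__":
    window = "CG" * 100  # toy 200 bp sequence, maximally CpG-rich
    print(gc_content(window), cpg_obs_exp(window), looks_like_cgi(window))
```

In practice such a check would be applied in sliding windows along a genome, and a purely sequence-based call says nothing about the methylation or chromatin state of the region.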
Figure 2. DNA methylation status, sequence features, and CGI presence in ten metazoan genomes.
Global DNA methylation levels were obtained from whole-genome bisulfite sequencing datasets of the following species: sponge (Amphimedon queenslandica), lancelet (Branchiostoma lanceolatum), sea vase (Ciona intestinalis), Pacific oyster (Crassostrea gigas), honeybee (Apis mellifera), worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), frog (Xenopus tropicalis), mouse (Mus musculus), and zebrafish (Danio rerio) [6,20,21,73,115]. The presence of CGIs in the genome has been described previously [2,5,116], and elevated CpG density at transcription start sites (TSS) has been discussed previously [2,19–21,70,72].