Wil A M Loenen, Elisabeth A Raleigh.
Abstract
The 1952 observation of host-induced non-hereditary variation in bacteriophages by Salvador Luria and Mary Human led to the discovery in the 1960s of modifying enzymes that glucosylate hydroxymethylcytosine in T-even phages and of genes encoding corresponding host activities that restrict non-glucosylated phage DNA: rglA and rglB (restricts glucoseless phage). In the 1980s, appreciation of the biological scope of these activities expanded dramatically with the demonstration that plant and animal DNA was also sensitive to restriction in cloning experiments. The rgl genes were renamed mcrA and mcrBC (modified cytosine restriction). The new class of modification-dependent restriction enzymes was named Type IV, as distinct from the familiar modification-blocked Types I-III. A third Escherichia coli enzyme, Mrr (modified DNA rejection and restriction), recognizes both methylcytosine and methyladenine. In recent years, the universe of modification-dependent enzymes has expanded greatly. Technical advances allow use of Type IV enzymes to study epigenetic mechanisms in mammals and plants. Type IV enzymes recognize modified DNA with low sequence selectivity and have emerged many times independently during evolution. Here, we review biochemical and structural data on these proteins, the resurgent interest in Type IV enzymes as tools for epigenetic research and the evolutionary pressures on these systems.
Year: 2013 PMID: 23990325 PMCID: PMC3874153 DOI: 10.1093/nar/gkt747
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. DNA modifications recognized by Type IV enzymes. Enzymatic DNA modifications in the major groove of double-stranded DNA are methylation at cytosine C5 or N4, or at adenosine N6; and glucosylation of a pre-existing 5-hydroxymethylcytosine. The beta-glucosyl derivative is shown; other configurations and other sugars are known to be added by some phages. hm5C is incorporated during replication, after conversion of the dCTP pool to hmdCTP. Phosphorothioate modification of the backbone is carried out postsynthetically. Other biological DNA modifications are known. Only those shown to elicit action of characterized Type IV enzymes are shown here.
DNA modifications that elicit cleavage by Type IV enzymes
| Protein | m5C | hm5C | ghm5C | m4C | m6A | PT | References |
|---|---|---|---|---|---|---|---|
| EcoKMcrA | (+) | (+) | (−) | NT | (−) | NT | ( |
| ScoA3McrA | + | NT | NT | NT | (+) | + | ( |
| EcoKMcrBC | + | + | − | (+) | − | NT | ( |
| BanUMcrB | (+) | ( | |||||
| BanUMcrB3 | (+) | ( | |||||
| EcoKMrr | (+) | (−) | (−) | (−?) | (+) | NT | ( |
| BanUMrr | (+) | (+?) | ( | ||||
| ScoA3Mrr | (−) | (+) | ( | ||||
| ZmoMrr | (+?) | (+?) | ( | ||||
| SauUSI | + | + | − | − | −? | NT | ( |
| SauNewI | (+) | ( | |||||
| SepRPMcrR | (+) | ( | |||||
| ScoA3I | (+) | ( | |||||
| PvuRts1I family | +/− | + | + | NT | − | NT | ( |
| GmrSD | − | − | + | NT | − | NT | ( |
| ScoA3II+III | (+) | (−) | ( |
Modifications: m5C: 5-methylcytosine; hm5C: 5-hydroxymethylcytosine; ghm5C: glucosylated hydroxymethylcytosine; m4C: N4-methylcytosine; m6A: N6-methyladenine; PT: phosphorothioation of non-bridging oxygen in DNA linkages, also called S-DNA.
+/−: at least 100-fold less activity on this substrate than on substrates with a + entry.
(−), (+): based on in vivo restriction of phage infection or plasmid transformation with appropriate host mutant configurations; in vitro cleavage results have not been reported.
(+?): either m5C or m6A is recognized; these were not distinguished in the reported experiments.
−?: m6A sites tested were not cleaved, but few modified sequences were tested.
NT: not tested.
Where the name found in REBASE (and listed at the left) is not the same as that used in the cited report, the genomic locus_ID or the name used in the publication is given in the References column.
DNA modifications that elicit cleavage by other modification-dependent enzymes (Type IIM)
| Protein | m5C | hm5C | ghm5C | m4C | m6A | PT | References |
|---|---|---|---|---|---|---|---|
| DpnI | − | − | − | − | + | NT | ( |
| MspJI family | + | + | − | − | − | NT | ( |
| SgeI | + | ( | |||||
| AoxI | + | ||||||
| BisI | + | ||||||
| BlsI | + | ||||||
| GlaI | + | ( | |||||
| GluI | + | ||||||
| KroI | + | ||||||
| MalI | + | ||||||
| MteI | + | ||||||
| PcsI | + |
Characteristics of Type IV restriction enzymes
| Protein | Subunits/ domains | DNA Binding | Endonuclease domain | NTP hydrolysis | Recognition site |
|---|---|---|---|---|---|
| EcoKMcrA | – | ||||
| N-terminal | DBD | (Y > R)m5CGR bound ( | |||
| C-terminal | H-N-Hc | Bioinformatic ID ( | |||
| ScoA3McrA | – | Some CmCWGG and some S-DNA (PT modified) sites are cleaved ( | |||
| N-terminal | DBD? | Not related to EcoMcrA ( | |||
| C-terminal | H-N-Hc | 37% identical to EcoMcrA ( | |||
| EcoKMcrBC | GTP | Rm5C(N30-35|)-(N30-3000)-Rm5C | |||
| McrB-N | DUF3578 | McrB binds DNA ( | |||
| McrB-C | P-loop NTPase | ( | |||
| McrC | PD-(D/E)XK? | Bioinformatic ID ( | |||
| EcoKMrr | – | m6A or m5C; sequence specificity ambiguous ( | |||
| N-terminal | Mrr-N | Presumed DNA binding ( | |||
| C-terminal | Mrr-cat (D/E).. (D/E/Q) × K | Bioinformatic ID ( | |||
| SauUSI | ATP or dATP | Sm5CNGS; two copies required for cleavage ( | |||
| N-terminal | PLDc-2 | ( | |||
| Middle | P-loop NTPase | ( | |||
| C-terminal | DUF3427 | ( | |||
| PvuRts1I | PD-(D/E)XK? | – | mC(N11-13/N9-10|)G 2-base extensions ( | ||
| EcoCTGmrSD | ? | ? | UTP>>GTP, CTP | Cuts T-even DNA ( | |
| GmrS | Motifs suggested | DUF262 | ( | ||
| GmrD | DUF1524 | To be determined |
Recognition sites (represented 5′→3′) are those determined in vitro by binding or cleavage experiments.
McrBC cleavage results in a double-strand cut near one Rm5C site (72,73,74) but requires cooperation of two sites (39,40) or a translocation block (73). The sites may be on different daughters across a fork (75). They are separated by 30–3000 bp (39,72,74) and may be on either strand (39,76); disposition of opposing nicks is not tightly constrained (73), and minor cleavage clusters are found ∼40, ∼50 and ∼60 nt from the m5C (74).
Degeneracy abbreviations: B = C or G or T; D = A or G or T; H = A or C or T; K = G or T; M = A or C; N = A or C or G or T; R = A or G; S = C or G; V = A or C or G; W = A or T; Y = C or T.
Cleavage positions are listed as (N# to top cut/# to bottom cut|). If no number is listed, the position of cleavage has not been determined. A hyphenated pair of numbers (e.g. PvuRts1I N11-13/N9-10) indicates the range of positions at which cleavage may occur.
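The degeneracy abbreviations above map directly onto character classes, so a recognition site written in this notation can be searched for in a plain sequence. The following is a minimal sketch, not a method from the review: the function names `site_to_regex` and `find_sites` are hypothetical, and the search looks only at base sequence. It cannot tell whether a cytosine or adenine is actually modified, which is the property these enzymes depend on.

```python
import re

# Degeneracy codes exactly as listed in the table footnote.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "K": "GT", "M": "AC",
    "N": "ACGT", "R": "AG", "S": "CG", "V": "ACG", "W": "AT", "Y": "CT",
}


def site_to_regex(site):
    """Translate a degenerate site (e.g. 'SCNGS') into a compiled regex."""
    parts = []
    for code in site.upper():
        bases = IUPAC[code]  # raises KeyError on an unknown symbol
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return re.compile("".join(parts))


def find_sites(site, sequence):
    """Return 0-based start positions of all matches, including overlaps."""
    # A lookahead keeps each match zero-width, so overlapping sites are found.
    lookahead = re.compile(f"(?=({site_to_regex(site).pattern}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    # Sequence context of the SauUSI site Sm5CNGS, with the modification
    # mark dropped because a plain string cannot carry it.
    print(find_sites("SCNGS", "AAGCTGCTT"))
```

Positions returned this way are candidate sites only; whether a Type IV or Type IIM enzyme acts on them depends on the modification state of the DNA.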
Characteristics of other modification-dependent enzymes (Type IIM)
| Protein | Subunits/ domains | DNA Binding | Endonuclease Domain | Recognition site | Comment and references |
|---|---|---|---|---|---|
| DpnI family | G m6A|TC | 13 characterized isoschizomers | |||
| N-terminal | PD..(D/E)XK | ( | |||
| C-terminal | wHTH | R m6A|TC | ( | ||
| MspJI family | mC with preferences | Second copy stimulates cleavage | |||
| MspJI | m5CNNR(N9/13|) | ( | ||
| N-terminal | SRA-like | ( | |||
| C-terminal | Mrr-cat (D/E)..(D/E/Q)XK | ( | |||
| FspEI | Like MspJI | C m5C(N12/16|) | ( | ||
| LpnPI | Like MspJI | C m5CDG(N10/14|) | ( | ||
| AspBHI | Like MspJI | YS m5CNS(N8/12|) | ( | ||
| RlaI | Like MspJI | V m5CW | ( | ||
| SgrTI | Like MspJI | C m5CDS(N10/14|) | ( | ||
| SgeI | Like MspJI | m5CNNR(N9/13|) | 49% identical to MspJI; ( | ||
| No family assigned | Information from | ||||
| AoxI | |RG m5CY | ||||
| BisI | G m5C|NGC | ||||
| BlsI | RYN|R Y | At least two m5C required | |||
| GlaI | R m5C|GY | ( | |||
| GluI | GmC|NG m5C | ||||
| KroI | G| m5CCGGC | ||||
| MteI | GmCG m5C|NGm5CGm5C | ||||
| PcsI | m5CG(N5|N2)m5CG | ||||
| PkrI | Gm5CN|G m5C | At least 3 m5C required |
Figure 2. McrA functional domains. Domain function was inferred indirectly from genetic analysis by Anton & Raleigh (2004) (57). Many mutations in the N-terminal domain spared some activity in one or more of three functional tests (grey segments), while others were deficient in all activities (black segments). One mutation (asterisk) was fully active on m5C-containing substrates but fully inactive in the hm5C challenge in vivo. Most mutations in the C-terminus (pale grey segment) retained function in one test that was interpreted as measuring m5C binding ability. A predicted structural model by Bujnicki, Radlinska and Rychlewski (2000) (58) for this C-terminal region is compatible with these results.
Figure 3. McrBC assembly model. Two proteins are expressed from mcrB in vivo. Both the complete protein (McrB-L) and a smaller one missing the N-terminus (McrB-S; top row) bind GTP, forming high-order multimers detected by gel filtration (second row). When visualized by scanning transmission electron microscopy, these appear as heptameric rings with a central channel. Rings of McrB-L in top views show projections that may correspond to the N-terminal DNA-binding domain (red segment). Both forms can then associate with McrC, again as judged by gel filtration. McrB-L:GTP can bind to its specific substrate (RmC) in the absence of McrC (third row); in its presence, the substrate is cleaved (fourth row). GTP hydrolysis is required for cleavage (arrow): a supershifted binding complex forms in the presence of GTP-gamma-S, but no cleavage occurs. Translocation accompanies GTP hydrolysis; double-stranded cleavage requires collaboration between two complexes, or a translocation block. The path of the DNA in the figure is arbitrary, as is the conformation of McrC. Modified from Bourniquel, A.A. and Bickle, T.A., "Complex restriction enzymes: NTP-driven molecular motors" (84), with permission. Copyright © 2002 Elsevier Masson SAS. All rights reserved.
Figure 4. McrB-N in comparison to other base-flipping proteins. (A) SRA domains SUVH5 (3Q0C) and UHRF1 (2ZKD) use loops extending from a crescent formed from two beta sheets to flip C or m5C from undeformed B-form DNA into a pocket (top row), whereas McrB-N (3SSC; bottom row) uses loops from one beta sheet to distort the DNA and flip the base. It resembles the human alkyladenine glycosylase (1BNK) (bottom row) in bending the DNA toward the major groove, while flipping the base via the minor groove. Figure 5 of Sukackaite et al. (60). (B) The SRA-like hemi-methylated 5mC recognition domains. A ribbon model of the N-terminal domain of the MspJI structure (4F0Q and 4F0P; left) compared with the SRA domain of UHRF1 (PDB 3FDE; right). The crescent shape formed by interacting beta sheets and helices αB and αC are the conserved features of the SRA domain highlighted here. Loops on the concave side of UHRF1 participate in flipping the base, and similar loops presumably do so for MspJI. Two of these vary in length among family members and may play roles in sequence context specificity. Figure 2a and b from Horton et al. (80).
Figure 5. Schematic diagrams of cleavage positions for MspJI and AbaSDFI. Cleavage of both strands is elicited by a singly modified site for both MspJI (A) and AbaSDFI (C). Cleavage position is fixed relative to the modified site, but with a four-base 5′ extension for MspJI and a two-base 3′ extension for AbaSDFI. When a site is symmetrically modified (as for CpG sites in mammalian DNA), a 32 base-pair fragment is excised from the DNA (B). (A) Figure 2a and (B) Figure 3a reprinted with permission from Cohen-Karni et al. (52). (C) Figure 5a from Wang et al. (47).
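The 32 base-pair figure for a symmetrically modified CpG can be checked with simple coordinate arithmetic. The sketch below is illustrative only and rests on an assumed reading of the MspJI notation m5CNNR(N9/13|): the motif is 4 nt long, the cut on the modified strand follows 9 further nucleotides, and the cut on the opposite strand follows 13, both on the 3′ side of the modified cytosine. The function names and the coordinate convention are hypothetical.

```python
# Assumed reading of "m5CNNR(N9/13|)"; these constants are not stated in
# this form in the source and are labelled here as an interpretation.
MOTIF_LEN = 4             # m5C, N, N, R
SAME_STRAND_GAP = 9       # nt between the motif and the cut, modified strand
OPPOSITE_STRAND_GAP = 13  # nt between the motif and the cut, other strand


def mspji_cuts(mc_pos, strand):
    """Return (top_cut, bottom_cut) for one modified cytosine.

    Positions are 0-based top-strand coordinates. A cut value k means the
    backbone is broken between positions k - 1 and k.
    """
    same = MOTIF_LEN + SAME_STRAND_GAP          # 13
    opposite = MOTIF_LEN + OPPOSITE_STRAND_GAP  # 17
    if strand == "top":
        return mc_pos + same, mc_pos + opposite
    if strand == "bottom":
        # Mirror image: cleavage lies to the left in top-strand coordinates.
        return mc_pos + 1 - opposite, mc_pos + 1 - same
    raise ValueError("strand must be 'top' or 'bottom'")


def excised_fragment(cpg_c_pos):
    """Outer boundaries and length of the fragment from a fully modified CpG.

    cpg_c_pos is the top-strand C; the bottom-strand m5C pairs with the G
    at cpg_c_pos + 1.
    """
    right_top, right_bottom = mspji_cuts(cpg_c_pos, "top")
    left_top, left_bottom = mspji_cuts(cpg_c_pos + 1, "bottom")
    start = min(left_top, left_bottom)
    end = max(right_top, right_bottom)
    return start, end, end - start


if __name__ == "__main__":
    print(mspji_cuts(100, "top"))
    print(excised_fragment(100))
```

Under these assumptions the two cuts made from each modified cytosine are offset by four positions, matching the four-base 5′ extension, and the outer boundaries of the released fragment span 32 positions, counting the single-stranded extensions at both ends.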