Giedrius Sasnauskas, Evelina Zagorskaitė, Kotryna Kauneckaitė, Giedre Tamulaitiene, Virginijus Siksnys.
Abstract
The eukaryotic Set and Ring Associated (SRA) domains and structurally similar DNA recognition domains of prokaryotic cytosine modification-dependent restriction endonucleases recognize methylated, hydroxymethylated or glucosylated cytosine in various sequence contexts. Here, we report the apo-structure of the N-terminal SRA-like domain of the cytosine modification-dependent restriction enzyme LpnPI that recognizes modified cytosine in the 5'-C(mC)DG-3' target sequence (where mC is 5-methylcytosine or 5-hydroxymethylcytosine and D = A/T/G). Structure-guided mutational analysis revealed LpnPI residues involved in base-specific interactions and demonstrated binding site plasticity that allowed limited target sequence degeneracy. Furthermore, modular exchange of the LpnPI specificity loops by structural equivalents of related enzymes AspBHI and SgrTI altered sequence specificity of LpnPI. Taken together, our results pave the way for specificity engineering of the cytosine modification-dependent restriction enzymes.
Year: 2015 PMID: 26001968 PMCID: PMC4499157 DOI: 10.1093/nar/gkv548
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Structurally characterized SRA domains and their recognition sequences
| Protein | Recognition siteᵃ | Base modification | PDB ID | References |
|---|---|---|---|---|
| UHRF1-SRA | 5′-(mC)G-3′ | 5mC, 5hmC | 2ZO0, 2ZO1, 3CLZ, 2ZKD, 2ZKE | |
| UHRF2-SRA | 5′-(mC)G-3′ | 5hmC > 5mC | 4PW5, 4PW6, 4PW7 | |
| SUVH5-SRA | 5′-(mC)G-3′, 5′-(mC)HH-3′ | 5mC | 3Q0B, 3Q0C, 3Q0D | |
| MspJI | 5′-(mC)NNR-3′ | 5mC, 5hmC | 4R28, 4F0Q, 4F0P | |
| AspBHI | 5′-YS(mC)NS-3′ | 5mC, 5hmC | 4OC8 | |
| LpnPI | 5′-C(mC)DG-3′ | 5mC, 5hmC | 4RZL | this work |
| PvuRts1I | 5′-(mC)-3′ | 5hmC, 5ghmC | 4OQ2, 4OKY | |
| AbaSI | 5′-(mC)-3′ | 5hmC, 5ghmC | 4PAR, 4PBA, 4PBB | |

ᵃ (mC) – modified cytosine; N – any nucleotide; D – A, T, or G; H – A, C, or T; Y – T or C; S – G or C; R – A or G.
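The degenerate symbols in the footnote follow the standard IUPAC nucleotide code, so the recognition sites in the table can be turned into search patterns mechanically. The sketch below is illustrative only and is not part of the study: the function names `site_to_regex` and `find_sites`, and the use of the letter `M` to stand for a modified cytosine in the input string, are assumptions made for this example.

```python
import re

# IUPAC nucleotide codes used in the table, written as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "D": "[AGT]", "H": "[ACT]",
    "Y": "[CT]", "S": "[CG]", "R": "[AG]",
}

def site_to_regex(site, mod_symbol="M"):
    """Translate a site such as 'C(mC)DG' into a regex pattern string.

    '(mC)' is matched as `mod_symbol`, a hypothetical one-letter stand-in
    for the modified cytosine in the sequence being searched.
    """
    parts = []
    i = 0
    while i < len(site):
        if site.startswith("(mC)", i):
            parts.append(re.escape(mod_symbol))
            i += 4
        else:
            parts.append(IUPAC[site[i]])
            i += 1
    return "".join(parts)

def find_sites(seq, site):
    """Return 0-based start positions of all matches, including overlapping ones."""
    pattern = site_to_regex(site)
    # A zero-width lookahead lets overlapping sites be reported separately.
    return [m.start() for m in re.finditer(f"(?=(?:{pattern}))", seq)]

if __name__ == "__main__":
    print(site_to_regex("C(mC)DG"))          # LpnPI site as a pattern
    print(find_sites("GCMTGA", "C(mC)DG"))   # T at +1 is allowed by D
    print(find_sites("GCMCGA", "C(mC)DG"))   # C at +1 is excluded by D
```

Note that `M` is itself an IUPAC ambiguity code (A or C) in other contexts; it is used here only as a placeholder for the modified base.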
Data collection and refinement statistics
| Parameter | Value |
|---|---|
| Space group | |
| a (Å) | 82.037 |
| b (Å) | 82.037 |
| c (Å) | 152.829 |
| Wavelength (Å) | 1.0012 |
| X-ray source | MAX II I911-3 beamline |
| Total reflections | 438 186 |
| Unique reflections | 33 984 |
| Resolution range (Å) | 41.4-2.1 |
| Completeness (%) (last shell) | 100 (100) |
| Multiplicity (last shell) | 12.9 (12.8) |
| I/σ (last shell) | 21.5 (5.1) |
| R(merge) (%) (last shell) | 10.1 (52.4) |
| B(iso) from Wilson (Å²) | 21.68 |
| Resolution range (Å) | 41.019–2.10 |
| Reflections work/test | 60 566/6585 |
| Protein atoms | 3490 |
| Solvent molecules | 422 |
| R-factor (%) | 16.6 |
| R-free (%) | 19.9 |
| R.M.S.D. bond lengths (Å) | 0.010 |
| R.M.S.D. angles (°) | 1.092 |
| Ramachandran core region (%) | 97.56 |
| Ramachandran allowed region (%) | 2.44 |
| Ramachandran disallowed region (%) | 0 |
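Multiplicity is conventionally the ratio of total to unique reflections, so the overall value in the table can be cross-checked against the reflection counts. The snippet below is a minimal sketch of that arithmetic; the function name `multiplicity` is an assumption for illustration.

```python
def multiplicity(total_reflections, unique_reflections):
    """Average number of observations per unique reflection."""
    return total_reflections / unique_reflections

if __name__ == "__main__":
    # Reflection counts taken from the data collection rows of the table.
    value = multiplicity(438_186, 33_984)
    print(round(value, 1))  # compare with the reported overall multiplicity of 12.9
```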
Oligoduplex substrates
| Oligoduplex | Sequenceᵃ | Specificationᵇ |
|---|---|---|
| gC(mC)TG | 5′-CCGTAG | Oligoduplex with a standard LpnPI recognition site; the reference substrate in DNA cleavage studies |
| | 3′-GGCATC | |
| tC(mC)TG | 5′-CCGTA | As gC(mC)TG, but the −2 bp is T:A |
| | 3′-GGCAT | |
| aC(mC)TG | 5′-CCGTA | The −2 bp is A:T |
| | 3′-GGCAT | |
| cC(mC)TG | 5′-CCGTA | The −2 bp is C:G |
| | 3′-GGCAT | |
| gG(mC)TG | 5′-CCGTAG | The −1 bp is G:C |
| | 3′-GGCATC | |
| gT(mC)TG | 5′-CCGTAG | The −1 bp is T:A |
| | 3′-GGCATC | |
| gA(mC)TG | 5′-CCGTAG | The −1 bp is A:T |
| | 3′-GGCATC | |
| gC(mC)AG | 5′-CCGTAG | The +1 bp is A:T |
| | 3′-GGCATC | |
| gC(mC)CG | 5′-CCGTAG | The +1 bp is C:G |
| | 3′-GGCATC | |
| gC(mC)GG | 5′-CCGTAG | The +1 bp is G:C |
| | 3′-GGCATC | |
| gC(mC)TC | 5′-CCGTAG | The +2 bp is C:G |
| | 3′-GGCATC | |
| gC(mC)TA | 5′-CCGTAG | The +2 bp is A:T |
| | 3′-GGCATC | |
| gC(mC)TT | 5′-CCGTAG | The +2 bp is T:A |
| | 3′-GGCATC | |
| gG(mC)TC | 5′-CCGTAG | The −1 bp is G:C and the +2 bp is C:G |
| | 3′-GGCATC | |
| gT(mC)TT | 5′-CCGTAG | The −1 bp is T:A and the +2 bp is T:A |
| | 3′-GGCATC | |
| gA(mC)TA | 5′-CCGTAG | The −1 bp is A:T and the +2 bp is A:T |
| | 3′-GGCATC | |
| gCCTG | 5′-CCGTAGC | As gC(mC)TG, but 5mC is replaced with an unmodified cytosine |
| | 3′-GGCATCGGACCAGCTAGGATCGACCAGCGG-5′ | |
ᵃ ‘5’ designates 5-methylcytosine; the DNA regions recognized by LpnPI are underlined; DNA base pairs that deviate from the reference substrate ‘gC(mC)TG’ are shown in typeface.
ᵇ Base pairs upstream of 5mC are numbered −1 and −2; base pairs downstream of 5mC are numbered +1 and +2.
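The substrate names encode the four base pairs flanking 5mC, so the position numbering in footnote b can be applied programmatically to list how a substrate deviates from the reference. The following is a minimal sketch under that reading of the naming scheme; `flanks` and `describe` are hypothetical helper names, and names without `(mC)` (such as the unmethylated control) are deliberately rejected.

```python
import re

REFERENCE = "gC(mC)TG"
POSITIONS = (-2, -1, +1, +2)  # numbering relative to 5mC, as in footnote b
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def flanks(name):
    """Return the four flanking bases (-2, -1, +1, +2) of a name like 'gC(mC)TG'."""
    m = re.fullmatch(r"([acgtACGT]{2})\(mC\)([acgtACGT]{2})", name)
    if m is None:
        raise ValueError(f"unexpected substrate name: {name!r}")
    return (m.group(1) + m.group(2)).upper()

def describe(name, reference=REFERENCE):
    """List the positions at which `name` differs from the reference substrate."""
    diffs = []
    for pos, ref_base, base in zip(POSITIONS, flanks(reference), flanks(name)):
        if ref_base != base:
            diffs.append(f"{pos:+d} bp is {base}:{COMPLEMENT[base]}")
    return diffs

if __name__ == "__main__":
    for substrate in ("tC(mC)TG", "gG(mC)TG", "gC(mC)AG", "gG(mC)TC"):
        print(substrate, describe(substrate))
```

Under this scheme the lowercase first letter is the −2 position, which is why the double substitutions such as gG(mC)TC differ from the reference at −1 and +2.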
Figure 1. DNA recognition domain of restriction endonuclease LpnPI. (A) Sequence alignment of the N-terminal domains of LpnPI (LpnPI-N) and AspBHI (AspBHI-N). Secondary structure elements of LpnPI-N and AspBHI-N are numbered as in (13). Residues that form the flipped-out base binding pocket are marked with asterisks. Alignment was generated with ESPript (31). (B) Superimposition of LpnPI-N (in yellow) and AspBHI-N (in white; PDB 4OC8). The putative LpnPI/AspBHI DNA recognition loops are colored as follows: Loop-B3, blue/light blue; Loop-78, green/lime; Loop-2B, magenta/light magenta; Loop-6C, cyan/aquamarine. (C) The model of DNA-bound LpnPI-N, based on the crystal structure of DNA-bound UHRF1-SRA (PDB 3FDE). DNA recognition loops are colored as in (B), the flipped cytosine and the orphan intra-helical guanine are shown in red. (D) Schematic representation of LpnPI interactions with DNA. Protein loops and the 5mC:G base pair are colored as in panel (C); other bases comprising the LpnPI recognition site are shown in light orange and are numbered from ‘−1’ (the bp upstream of 5mC) to ‘+2’ (the 2nd bp downstream of 5mC).
Figure 2. DNA recognition by SRA domains. The structures of the DNA-bound UHRF1-SRA and MspJI, and the apo-structures of MspJI, LpnPI-N and AspBHI-N (PDB ID: 3FDE, 4R28, 4F0Q, 4RZL, 4OC8) were superimposed with MultiProt (29). Equivalent DNA recognition elements in all panels are shown in identical orientation. Left: recognition of the flipped-out base in the protein pocket; center: Loop-6C or ‘CpG recognition’/‘NKR finger’ loop (cyan); right: Loop-B3 or ‘base-flipping-promotion’ loop (blue), Loop-2B (magenta), and Loop-78 (green). In all panels the flipped-out base and the orphan intra-helical guanine are colored red; other nucleotides comprising the specific recognition sequence of the corresponding protein are colored orange and are numbered from ‘−2’ (the second bp upstream of 5mC) to ‘+3’ (the third bp downstream of 5mC). (A) DNA recognition by MspJI. Loop-6C occupies a similar position both in the apo- and the DNA-bound structures and does not make base-specific contacts. Residues Q33, E65, and K173 from the ‘2B’, ‘B3’, and ‘78’ loops, respectively, are close to the DNA bases. (B and C) The models of DNA-bound LpnPI and AspBHI based on the co-crystal structure of UHRF1-SRA. Loop-B3 residues 41–43 and Loop-6C residues 91–93 and 99 differ between LpnPI and AspBHI. LpnPI Loop-2B and Loop-78 residues 27–29 and 136–137 were mutated in the present study; AspBHI Loop-2B residues T25 and D32 are critical for the enzyme function (13); AspBHI Loop-78 residue R132 overlaps with the critical LpnPI residue R137. (D) DNA recognition by the SRA domain of the eukaryotic UHRF1 protein. The loops equivalent to Loop-78 and Loop-2B in MspJI-like restriction endonucleases are colored green and magenta, respectively.
Catalytic activity of LpnPI mutants
| Mutation | kobs (s⁻¹)ᵃ | Activity (%)ᵇ |
|---|---|---|
| wt LpnPI | (3.3 ± 0.8) × 10⁻³ | 100 |
| 5(h)mC binding pocket | | |
| D71A | (1.0 ± 0.3) × 10⁻⁶ | 0.03 |
| D71N | (2.0 ± 0.4) × 10⁻⁴ | 6 |
| Loop-2B (contacts downstream of 5(h)mC) | | |
| S25A | (7.0 ± 1.5) × 10⁻³ | 200 |
| N27A | No cleavage | <0.01 |
| D30A | (1.0 ± 0.3) × 10⁻³ | 30 |
| Loop-B3 (adjacent to orphan guanine) | | |
| G41S | (1.0 ± 0.1) × 10⁻⁵ | 0.3 |
| N42A | No cleavage | <0.01 |
| M43A | (1.0 ± 0.2) × 10⁻⁴ | 3 |
| M43Q | (1.6 ± 0.1) × 10⁻³ | 50 |
| Loop-6C (contacts upstream of 5(h)mC) | | |
| R98A | (1.0 ± 0.6) × 10⁻⁶ | 0.03 |
| Loop-78 (contacts downstream of 5(h)mC) | | |
| R137A | (0.7 ± 0.3) × 10⁻⁵ | 0.2 |
| S136A | (2.1 ± 0.1) × 10⁻³ | 60 |
ᵃ Oligoduplex DNA cleavage reactions were performed on the ‘gC(mC)TG’ substrate (Table 3) and the observed rate constants kobs were determined by single-exponential fits (see Materials and Methods for details). The lowest DNA cleavage rate measured in our assay is 3 × 10⁻⁷ s⁻¹.
ᵇ The activity is expressed as the ratio kobs(mutant)/kobs(wt) × 100%.
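The activity column follows directly from the ratio defined in footnote b. The sketch below recomputes it from the central kobs values in the table; the reported percentages appear to be rounded to one significant figure, which is an inference from the numbers rather than something the source states. The names `relative_activity` and `round_sig` are assumptions for this example.

```python
import math

# Central kobs values (s^-1) copied from the table; error bounds are ignored here.
K_OBS = {
    "wt": 3.3e-3, "D71A": 1.0e-6, "D71N": 2.0e-4, "S25A": 7.0e-3,
    "D30A": 1.0e-3, "G41S": 1.0e-5, "M43A": 1.0e-4, "M43Q": 1.6e-3,
    "R98A": 1.0e-6, "R137A": 0.7e-5, "S136A": 2.1e-3,
}
DETECTION_LIMIT = 3e-7  # lowest measurable rate (s^-1), from footnote a

def relative_activity(k_mutant, k_wt=K_OBS["wt"]):
    """kobs(mutant) / kobs(wt) x 100%."""
    return k_mutant / k_wt * 100.0

def round_sig(x, digits=1):
    """Round a positive number to the given count of significant figures."""
    exponent = math.floor(math.log10(abs(x)))
    return round(x, digits - 1 - exponent)

if __name__ == "__main__":
    for name, k in K_OBS.items():
        print(name, round_sig(relative_activity(k)))
    # Mutants with no detectable cleavage fall below this bound.
    print("limit", relative_activity(DETECTION_LIMIT))
```

The detection limit corresponds to roughly 0.009% of the wild-type rate, consistent with the `<0.01` entries for N27A and N42A.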
Figure 3. Recognition site preference of LpnPI. Oligoduplex DNA cleavage reactions were performed under standard reaction conditions and the observed rate constants kobs were determined by single-exponential fits (see Materials and Methods for details). The recognition sequences of the DNA substrates are shown on the left-hand side of the graphs. ‘(mC)’ stands for 5mC (the last substrate in each graph is the unmethylated control); sequence positions that differ from the reference ‘gC(mC)TG’ substrate are marked with grey boxes; full oligoduplex sequences are listed in Table 3. The reaction rates of LpnPI mutants that show increased cleavage due to loop replacement are marked by blue streaked bars; ‘−’ marks undetectable cleavage (rate lower than 3 × 10⁻⁷ s⁻¹, the starting position of the x-axis). Alignments of the LpnPI/AspBHI Loop-6C and the LpnPI/SgrTI Loop-2B that were replaced in the LpnPI-91RLL and LpnPI-27HTG are shown above panels A and B, respectively. (A) Wt enzyme and the LpnPI variant LpnPI-91RLL (Loop-6C replacement) on DNA substrates with variable sequence upstream and downstream of 5mC. (B) Wt enzyme and the LpnPI variant LpnPI-27HTG (Loop-2B replacement) on DNA substrates with variable sequence upstream and downstream of 5mC. (C) Wt enzyme and the ‘double-swap’ LpnPI variant LpnPI-27HTG-91RLL on DNA substrates with variable sequences upstream and downstream of 5mC.
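As a generic illustration of what a single-exponential fit yields, the sketch below estimates kobs from a substrate-depletion time course modeled as S(t) = exp(−kobs·t). It uses a log-linear least-squares fit through the origin, which is a simplification chosen for this example; the fitting procedure actually used in the study is described in its Materials and Methods, which are not reproduced here. The function name `fit_k_obs` and the synthetic time points are assumptions.

```python
import math

def fit_k_obs(times, fractions):
    """Estimate kobs from the fraction of uncleaved substrate over time.

    Fits ln(fraction) = -kobs * t by least squares with the line forced
    through the origin (fraction = 1 at t = 0).
    """
    num = sum(t * math.log(f) for t, f in zip(times, fractions))
    den = sum(t * t for t in times)
    return -num / den

if __name__ == "__main__":
    # Synthetic, noise-free data generated with the wild-type rate from the table.
    k_true = 3.3e-3  # s^-1
    times = [0, 60, 120, 300, 600, 1200]  # seconds (hypothetical sampling)
    fractions = [math.exp(-k_true * t) for t in times]
    print(fit_k_obs(times, fractions))
```

With noisy experimental data, a nonlinear fit to the untransformed fractions weights the points differently and may be preferable to the log-linear shortcut shown here.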