Katarzyna H Kaminska, Mikihiko Kawai, Michal Boniecki, Ichizo Kobayashi, Janusz M Bujnicki.
Abstract
BACKGROUND: Catalytic domains of Type II restriction endonucleases (REases) belong to a few unrelated three-dimensional folds. While the PD-(D/E)XK fold is the most common among these enzymes, crystal structures have also been determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI). Bioinformatics analyses supported by mutagenesis experiments suggested that some REases belong to the HNH fold (e.g. R.KpnI), and that a small group represented by R.Eco29kI belongs to the GIY-YIG fold. However, for a large fraction of REases with known sequences, the three-dimensional fold and the architecture of the active site remain unknown, mostly due to extreme sequence divergence that hampers detection of homology to enzymes with known folds.
Year: 2008 PMID: 19014591 PMCID: PMC2630997 DOI: 10.1186/1472-6807-8-48
Source DB: PubMed Journal: BMC Struct Biol ISSN: 1472-6807
Figure 1. Multiple sequence alignment of 25 members of the R.Hpy188I family, together with the structurally characterized homologs from the GIY-YIG superfamily, UvrC (1yd0) and I-TevI (1mk0). The members are indicated by the NCBI GI number followed by the original REBASE name (for sequences available in REBASE, e.g. R.Hpy188I) or by the abbreviated genus and species name, with the exception of Marine for the marine metagenome and Synpha for Synechococcus phage S-PM2. Amino acid residues are colored according to the similarity of their physico-chemical properties. Secondary structure (ss), as determined experimentally for UvrC and I-TevI and predicted for R.Hpy188I (taken from the final model presented in the present study), is indicated below the alignment as tubes (helices) and arrows (strands). The alternative positions of the Arg residue (84, 104 or 105 in the R.Hpy188I sequence) are indicated by asterisks (*) above the alignment, whereas the positions of the two Cys residues, C90 and C101, are indicated by plus characters (+).
Figure 2. A structural model of R.Hpy188I (A and B) compared to its experimentally characterized homologs (C and D). Coordinates of the R.Hpy188I model are available for download from the FTP server. (A) R.Hpy188I with the homology-modeled part colored according to secondary structure (helices in red, strands in yellow, loops in grey) and regions modeled de novo shown in cyan. (B) R.Hpy188I model colored according to the MetaMQAP score. Reliable regions are colored blue, less reliable regions are colored green (predicted deviation from the native structure ~3 Å), and unreliable regions are colored yellow to red. (C) GIY-YIG domain in I-TevI (1mk0). (D) GIY-YIG domain in UvrC (1yd0).
Figure 3. Superposition, in stereo view, of the predicted active site residues of R.Hpy188I (green), UvrC (1yd0; blue) and I-TevI (1mk0; red). A divalent metal ion (Mn²⁺ in the case of the 1yd0 structure) is shown as a yellow sphere.
Figure 4. A structural model of R.Hpy188I with presumptive active site residues and the pair of cysteines (A), compared with alternative locations of the R84 residue transferred to positions 104 or 105, i.e. "computationally modeled" double mutants R84A/T104R (B) and R84A/N105R (C), respectively. The orientation of the models is the same as in Figure 2.
Figure 5. Surface features in the R.Hpy188I model. The orientation is the same as in Figure 2. (A) Sequence conservation: conserved regions are colored purple, variable regions are colored teal, and regions of average conservation are white. (B) Electrostatic potential: positively and negatively charged regions are colored blue and red, respectively.
Figure 6. A minimum evolution phylogenetic tree of the R.Hpy188I family. The branches of the tree are labeled with the sequence names as in Figure 1. Values at the nodes indicate the percentage of bootstrap support. Features that distinguish different lineages are indicated on the right side of the figure (amino acid residues are numbered according to their position in the R.Hpy188I sequence).
Genomic context analysis of R.Hpy188I family members
| R.Hpy188I family member^a | | Neighboring MTase homolog | | | |
| GI (REBASE) | Truncation^b | GI (REBASE) | Truncation^b | Relationship to other proteins (PFAM, TIGRFAMs and BLASTP^c) | Nucleotide sequence |
| R.Hpy188I branch | | | | | |
| 8248059 (R.Hpy188I) | - | 8248058 (M.Hpy188I) | - | Experimental [ | 8248057 |
| 8052220 (R.HpyF17I) | - | 8052219 (M.HpyF17I) | - | TIGR02986, M.Hpy188I | 8052218 |
| 137606550 (Marine) | 5' | 137606551 | 5' | M.Hpy188I | 137606549 (env_nt) |
| 138608358 (Marine) | 5' | 138608359, 138608360^d | 5', gap | M.Hpy188I | 138608357 (env_nt) |
| 134801317 (Marine) | 5' | N.A.^e | N.A. | N.A. | 134801314 (env_nt) |
| 136020097 (Marine) | 3' | 136020096 | 5' | M.EsaSS1928P | 136020095 (env_nt) |
| 137734056 (Marine) | 5' | N.A.^e | N.A. | N.A. | 137734055 (env_nt) |
| 139454792 (Marine) | 5' | 139454794, 139454795^d | gap | M.Hpy188I | 139454791 (env_nt) |
| 139696470 (Marine) | 3' | N.A.^e | N.A. | N.A. | 139696468 (env_nt) |
| 140559819 (Marine) | 5'^f and 3' | N.A.^e | N.A. | N.A. | 140559817 (env_nt) |
| 140872195 (Marine) | 3'^f | 140872191 | 5'^f | M.EsaSS1928P | 140872190 (env_nt) |
| 135963558 (Marine) | 5' | 135963559, 135963560^g | complex | M.Hpy188I | 135963557 (env_nt) |
| 136192486 (Marine) | 5' | 136192485, 136192487^h | frameshift | PF02384.7, M.Hpy188I | 136192484 (env_nt) |
| R.HpyAORF481P branch | | | | | |
| 135833547 (Marine) | 5' | 135833548 | 3' | PF02086.6, M.HpyAORF481P | 135833546 (env_nt) |
| 144033223 (Marine) | 3' | 144033224 | - | M.MunI | 144033222 (env_nt) |
| 141400025 (Marine) | 3' | 141400026 | 5' and 3'^f | PF02086.6, M.HpyAORF481P | 141400023 (env_nt) |
| 143652664 (Marine) | - | 143652663 | - | PF02086.6, M.HpyAORF481P | 143652662 (env_nt) |
| 134783906 (Marine) | 3' | N.A.^e | N.A. | N.A. | 134783905 (env_nt) |
| 135316365 (Marine) | 5' | N.A.^e | N.A. | N.A. | 135316364 (env_nt) |
| 144056752 (Marine) | 5' | 144056751 | 3' | PF02086.6, M.HpyAORF481P | 144056750 (env_nt) |
| 108562884 (R.HpyHORF458P) | - | 108562883 (M.HpyHORF458P) | - | PF02086.6, M.HpyAORF481P | 108562424^i |
| 15611501 (R.Hpy99ORF433P) | - | 15611500 (M.Hpy99ORF433P) | - | PF02086.6, M.HpyAORF481P | 15611071^j |
| 15645110 (R.HpyAORF481P) | - | 15645109 (M.HpyAORF481P) | - | PF02086.6, M.HpyAORF481P | 15644634^k |
| 57242192 (R.CupORF237P) | - | 57242195 (M.CupORF237P) | - | PF02086.6, M.HpyAORF481P | 57242183^l |
| 58533050 (Synpha) | - | 58532812 (M.SspSPM2ORFAP) | - | PF02086.6, M.HpyAORF481P | 58532811 |
N.A., Not applicable.
a The order of the sequences is the same as their order in Figure 1.
b End of the cloned DNA fragment or end of the sequenced part of the cloned fragment. 5' or 3' indicate missing sequence data at the 5' or 3' side of the putative nuclease or MTase gene (corresponding to truncation of N- or C- terminus of the protein sequence, respectively).
c A homologous MTase is described by its REBASE name.
d The MTase gene includes an internal gap (unsequenced fragment).
e An MTase gene was not detected on one side of the REase gene, and the other side is not available for analysis; thus the presence of an MTase can be neither excluded nor confirmed.
f The clone contains an internal gap (unsequenced fragment), which overlaps with one end of the gene.
g Probably a combination of rearrangement(s) and frameshift(s).
h A frameshift mutation splits the MTase gene.
i Genomic neighborhood also includes M.HpyHORF460P (GI 108562885).
j Genomic neighborhood also includes R.Hpy99XIP (GI 15611503) and M.Hpy99XI (GI 15611502).
k Genomic neighborhood also includes R.HpyAORF483P (GI 15645111) and M.HpyAORF483P (pseudogene).
l Another putative RM system comprising M.CupORF235P (GI 57242193) and CUP0236 (GI 57242194; not in REBASE; a homolog of HacSORF520P from REBASE) is inserted into the CupORF237P RM system.
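The truncation notation defined in footnote b can be read mechanically: a 5' entry means the N-terminus of the protein is missing, and a 3' entry means the C-terminus is missing. The following is a minimal sketch of that mapping, assuming the column values are written as in the table ("-", "5'", "3'", "5' and 3'", "gap", "N.A." and so on). The function name `truncated_termini` is hypothetical and is not part of the original study.

```python
import re


def truncated_termini(annotation):
    """Return the protein termini that lack sequence data for one table entry.

    Per footnote b: 5' -> N-terminus missing, 3' -> C-terminus missing.
    Labels without a 5'/3' token ('-', 'gap', 'complex', 'frameshift',
    'N.A.') yield an empty set. This is an illustrative helper only.
    """
    termini = set()
    # Match each 5' or 3' token; a trailing footnote letter is ignored.
    for side in re.findall(r"([53])'", annotation):
        termini.add("N-terminus" if side == "5" else "C-terminus")
    return termini


if __name__ == "__main__":
    for entry in ["-", "5'", "3'", "5' and 3'", "gap"]:
        print(repr(entry), sorted(truncated_termini(entry)))
```

Entries such as "gap" or "frameshift" describe internal problems with the gene rather than a missing end, so the sketch deliberately reports no truncated terminus for them.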