Juliane Krebes, Richard D Morgan, Boyke Bunk, Cathrin Spröer, Khai Luong, Raphael Parusel, Brian P Anton, Christoph König, Christine Josenhans, Jörg Overmann, Richard J Roberts, Jonas Korlach, Sebastian Suerbaum.
Abstract
The genome of Helicobacter pylori is remarkable for its large number of restriction-modification (R-M) systems, and strain-specific diversity in R-M systems has been suggested to limit natural transformation, the major driving force of genetic diversification in H. pylori. We have determined the comprehensive methylomes of two H. pylori strains at single base resolution, using Single Molecule Real-Time (SMRT®) sequencing. For strains 26695 and J99-R3, 17 and 22 methylated sequence motifs were identified, respectively. For most motifs, almost all sites occurring in the genome were detected as methylated. Twelve novel methylation patterns corresponding to nine recognition sequences were detected (26695, 3; J99-R3, 6). Functional inactivation, correction of frameshifts as well as cloning and expression of candidate methyltransferases (MTases) permitted not only the functional characterization of multiple, yet undescribed, MTases, but also revealed novel features of both Type I and Type II R-M systems, including frameshift-mediated changes of sequence specificity and the interaction of one MTase with two alternative specificity subunits resulting in different methylation patterns. The methylomes of these well-characterized H. pylori strains will provide a valuable resource for future studies investigating the role of H. pylori R-M systems in limiting transformation as well as in gene regulation and host interaction.
Year: 2013 PMID: 24302578 PMCID: PMC3936762 DOI: 10.1093/nar/gkt1201
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Circos plots displaying the distribution of methylated bases in the genomes of H. pylori 26695 (A) and J99-R3 (B). The outermost and innermost tracks represent the interpulse duration ratios. The colored tracks in between represent the location of methylation of the different motifs. Each motif is displayed by one track except for the asymmetric Type I recognition sites, for which both motifs were combined. For m6A and m4C methylation, the cutoff for calling a site modified was Qmod >100 (kinetic score); for the TET-treated m5C methylation, a kinetic score threshold >50 was used. The position and type of methylation for each motif is depicted in the legend. Novel motifs are highlighted in red. The genome tick marks display the genomic positions (kbp). The locations of the cag pathogenicity island (cagPAI) and PZ are indicated. Please note that the PZ of strain 26695 is split into two regions (21).
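The calling rule stated in the Figure 1 caption can be written as a small threshold check. This is only a sketch of the stated cutoffs, not the authors' analysis pipeline; the names `QMOD_CUTOFFS` and `is_called_modified` are hypothetical, and the comparison is assumed to be strict because the caption writes ">".

```python
# Kinetic-score cutoffs as quoted in the Figure 1 caption.
# The m5C cutoff applies to TET-treated samples.
QMOD_CUTOFFS = {"m6A": 100, "m4C": 100, "m5C": 50}


def is_called_modified(mod_type: str, qmod: float) -> bool:
    """Return True when the score is strictly above the cutoff for this type."""
    return qmod > QMOD_CUTOFFS[mod_type]
```

Under these assumptions, a score of 60 would call an m5C site modified but not an m6A or m4C site.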
Methylated sequence motifs detected for H. pylori 26695
| Type of R-M system | Methyltransferase activity | Type of methylation | Total number in genome | Number methylated | % detected methylated | Assignment | Locus | Reference |
|---|---|---|---|---|---|---|---|---|
| I | 5′-AC | m6A | 330 | 326 | 98.8 | M.HpyAXIII | hp0850 | This study |
| | 3′-TGTN8 | | 330 | 326 | 98.8 | | | |
| II | 5′- | m5C | 4885 | 3913 | 80.1 | M2.HpyAVI | hp0051 | ( |
| | 3′-GG | m6A | 4885 | 4841 | 99.1 | M1.HpyAVI | hp0050 | ( |
| | 5′-C | m5C | 1304 | 1198 | 91.9 | M.HpyAV | hp0054 | This study/( |
| | 3′-GG | m6A | 1304 | 1252 | 96.0 | | | |
| | 5′-G | m6A | 10 834 | 10 764 | 99.4 | M.HpyAIII | hp0092 | ( |
| | 5′- | m6A | 608 | 601 | 98.8 | M.HpyAX | hp0260 | ( |
| | 5′-A | m6A | 972 | 957 | 98.5 | M.HpyAVII | hp0478 | ( |
| | 5′-G | m5C | 12 538 | 6328 | 50.5 | M.HpyAVIII | hp1121 | ( |
| | 5′-C | m6A | 14 782 | 14 741 | 99.7 | M.HpyAI | hp1208 | ( |
| | 5′-G | m6A | 5570 | 5549 | 99.6 | M.HpyAIV | hp1352 | ( |
| | 5′-GAAG | m6A | 4823 | 4809 | 99.7 | M1.HpyAII | hp1367 | ( |
| | 3′-CTT | m4C | 4823 | 4593 | 95.2 | M2.HpyAII | hp1368 | |
| | 5′-GCGT | m6A | 2058 | 2046 | 99.4 | HpyAXIV | hp1517 | This study |
| III | 5′-GC | m6A | 4260 | 4234 | 99.4 | M.HpyAXI | hp0593 | ( |
| IIG or III | 5′-CGR | m6A | 3275 | 3240 | 98.9 | - | n.d. | This study |
a. The methylated position within the motif is highlighted in bold. Underlining indicates the modified base in the complementary strand. Pairs of reverse-complementary motifs belonging to one recognition sequence are grouped together. Novel recognition sequences are highlighted by an asterisk.
b. The total number includes motifs occurring on the ‘+’- and ‘−’-strand.
c. The MTase M.HpyAXIII achieves specificity from distantly located S subunit S.HpyAXIII (HP0790).
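Footnote b states that the "Total number in genome" column counts motif occurrences on both strands. A minimal sketch of such a count follows, assuming a linear uppercase sequence (a circular chromosome would also need sites spanning the origin) and standard IUPAC degenerate codes. All function names are hypothetical, and this is not the software used in the study. Shorthand such as N6 or N8, as used in the Type I motifs, is expanded to a run of N.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "M": "[AC]", "K": "[GT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def expand_repeats(motif: str) -> str:
    """Expand shorthand such as 'N6' into 'NNNNNN'."""
    return re.sub(r"([A-Z])(\d+)", lambda m: m.group(1) * int(m.group(2)), motif)


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile a motif into a lookahead regex so overlapping sites are counted."""
    body = "".join(IUPAC[base] for base in expand_repeats(motif))
    return re.compile(f"(?=({body}))")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def count_both_strands(seq: str, motif: str) -> int:
    """Count motif occurrences on the '+' strand plus those on the '-' strand."""
    pattern = motif_to_regex(motif)
    plus = sum(1 for _ in pattern.finditer(seq))
    minus = sum(1 for _ in pattern.finditer(reverse_complement(seq)))
    return plus + minus
```

For a palindromic motif every site is found once per strand, so `count_both_strands("GAATTC", "GAATTC")` returns 2; the sequence and motif in this call are toy inputs, not data from the tables.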
Methylated sequence motifs detected for H. pylori J99-R3
| Type of R-M system | Methyltransferase activity | Type of methylation | Total number in genome | Number methylated | % detected methylated | Assignment | Locus | Reference |
|---|---|---|---|---|---|---|---|---|
| I | 5′-A | m6A | 611 | 610 | 99.8 | M.Hpy99XVII | jhp0786 | This study |
| | 3′-TTCN6G | | 611 | 610 | 98.8 | | | |
| | 5′-RT | m6A | 370 | 363 | 98.1 | M.Hpy99XVI | jhp0786 | This study |
| | 5′-A | m6A | 1582 | 1575 | 99.6 | M.Hpy99XV | jhp1423 | This study |
| | 5′-A | m6A | 287 | 285 | 99.3 | M.Hpy99XV | jhp1423 | This study |
| | 3′-TTCN6 | | 287 | 286 | 99.7 | | | |
| II | 5′-C | m6A | 5027 | 4993 | 99.3 | M1.Hpy99V | jhp0043 | ( |
| | 5′-G | m6A | 210 | 203 | 96.7 | M.Hpy99II | jhp0045 | ( |
| | 5′-G | m6A | 10 958 | 10 934 | 99.8 | M.Hpy99VI | jhp0085 | ( |
| | 5′- | m6A | 674 | 653 | 96.9 | M.Hpy99VII | jhp0244 | ( |
| | 5′- | m4C | 3624 | 3581 | 98.8 | M.Hpy99VIII | jhp0248 | ( |
| | 5′-A | m6A | 854 | 841 | 98.5 | M.Hpy99XIX | jhp0430 | ( |
| | 5′-A | m5C | 1000 | 765 | 76.5 | M.Hpy99XI | jhp0435 | ( |
| | 5′-G | m6A | 368 | 355 | 96.5 | M.Hpy99XII | jhp0454 | ( |
| | 5′- | m4C | 2398 | 2323 | 96.9 | M.Hpy99IV | jhp0629 | ( |
| | 5′-C | m4C | 538 | 454 | 84.4 | M.Hpy99I | jhp0756 | ( |
| | 5′- | m6A | 3906 | 3838 | 98.3 | M.Hpy99XVIII | jhp1012 | ( |
| | 5′-G | m5C | 13 798 | 6917 | 54.0 | M.Hpy99III | jhp1050 | ( |
| | 5′-C | m6A | 15 120 | 15 091 | 99.8 | M.Hpy99X | jhp1131 | ( |
| | 5′-G | m6A | 5516 | 5490 | 99.5 | M.Hpy99IX | jhp1271 | ( |
| | 5′-GGWTA | m6A | 2676 | 2655 | 99.2 | Hpy99XIV | jhp1272 | This study |
| | 5′-GCCT | m6A | 3106 | 3096 | 99.7 | Hpy99XIII | jhp1409 | This study |
a. The methylated position within the motif is highlighted in bold. Underlining indicates the modified base in the complementary strand. Pairs of reverse-complementary motifs belonging to one recognition sequence are grouped together. Novel recognition sequences are highlighted by an asterisk.
b. The total number includes motifs occurring on the ‘+’- and ‘−’-strand.
c. JHP0786 (M) can form two distinct Type I MTase complexes and exhibits a dual specificity (methylation of two recognition sites) depending on the associated S subunit. The MTase complex composed of JHP0786 and JHP0785 (S) was designated M.Hpy99XVI and the one containing JHP0786 and JHP0726 (S) was named M.Hpy99XVII.
d. This Type I MTase complex has dual specificities caused by the presence of three TRDs in the associated S subunit (JHP1422).
e. For Hpy99V, there are two MTases associated. M1.Hpy99V is an active m6A MTase methylating GGAG, while M2.Hpy99V is an inactive m5C MTase, as the TET-treated samples showed no methylation of its recognition site CTCC.
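The "% detected methylated" column in both motif tables appears to be the number of methylated sites divided by the total number of sites, expressed as a percentage to one decimal place. The sketch below recomputes it and compares it with a listed value; the function names and the 0.05 tolerance are assumptions made for illustration.

```python
def percent_methylated(n_methylated: int, n_total: int) -> float:
    """Percentage of motif occurrences detected as methylated, to one decimal."""
    return round(100 * n_methylated / n_total, 1)


def matches_reported(n_methylated: int, n_total: int, reported: float,
                     tol: float = 0.05) -> bool:
    """True when the recomputed percentage agrees with the listed one."""
    return abs(percent_methylated(n_methylated, n_total) - reported) <= tol


# Rows as listed in the 26695 table: (number methylated, total, listed %).
rows = [(326, 330, 98.8), (3913, 4885, 80.1), (6328, 12538, 50.5)]
for n_meth, n_total, listed in rows:
    print(n_meth, n_total, listed, matches_reported(n_meth, n_total, listed))
```

Most rows reproduce their listed value under this rule. Two J99-R3 rows as printed here do not: 610 of 611 recomputes to 99.8 against a listed 98.8, and 6917 of 13 798 recomputes to 50.1 against a listed 54.0. Whether the counts or the percentages are off cannot be determined from this text, so those rows are worth checking against the original publication.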
Figure 2. Graphical representation of the silenced JHP0414/JHP0415 Type I MTase of J99-R3 (A), the putative phase-variable Type III MTases (B), BcgI-like R-M systems (C) and Type ISP MTases (D). (A) The gene sequences of HsdM (JHP0415) and HsdS (JHP0414) are displayed as gray bars and the protein sequences as yellow bars. SMRT sequencing of the J99-R3 genome did not reveal methylation by the Type I MTase complex (dashed lines), while expression of the enzyme complex on a plasmid in E. coli ER2796 and subsequent SMRT analyses demonstrated methylation of the recognition site AAAN6TGG. (B–D) The gene sequences (gray arrows) contain one or two putative or authentic frameshifts that would prevent translation of full-length proteins (colored bars). Each frameshift was repaired through site-directed mutagenesis (addition or deletion of the indicated number of nucleotides; +15 nt: TAAGGTTAATATATG) and correction was verified by targeted Sanger sequencing. The activity of the MTase proteins was pretested with a methylation activity assay. If an MTase showed activity in the assay, the gDNA of the corresponding E. coli host strain ER2796 was subjected to SMRT sequencing and analyzed for methylation. ‘mut’ marks alleles with frameshift corrections, ‘trunc’ labels alleles that are not fully translated, ‘n.d.’ indicates that no SMRT sequencing was performed and dashed lines indicate that no methylation could be detected through SMRT sequencing. Filled inverted triangles highlight the position of the frameshift mutations. JHP: ORFs of J99; HP: ORFs of 26695. Homologous systems are displayed in the same color.
Figure 3. Functional characterization of JHP0785/JHP0786 and JHP0726 through insertion mutagenesis (A) and expression of the R-M genes in E. coli ER2796 (B) confirmed that the Type I MTase JHP0786 interacts with the two S subunits JHP0785 or JHP0726 to achieve methylation of two different recognition sites. The methylation patterns of the J99-R3 parental and mutant genomes and the E. coli ER2796 host genomes were analyzed by SMRT sequencing. Motifs displayed in gray are not modified. Methylation of a recognition site is indicated by coloring and boldface type. The methylated nucleotide within a site is highlighted by the addition of a methyl group (CH3) to the sequence. ‘M’ is the abbreviation for methyltransferase, which acts as a homodimer. The two S subunits are composed of either one or two TRDs and interaction of TRDs is achieved by helical connector regions (H) adjacent to each TRD. (A) Expression of the three R-M components in J99-R3 and functional inactivation of either component (labeled by a black cross) resulted in different methylation patterns of the two distinct recognition sites. (B) Analysis of enzyme activity through plasmid-based expression of either JHP0786/JHP0785 (pRRS) alone or together with JHP0726 (pACYC184) in E. coli.
Figure 4. Graphical representation of the unique ability of the Type I MTase complex JHP1422/JHP1423 (HsdS/HsdM) to recognize and methylate two distinct recognition sites. (A) Schematic representation of the domain structure of the JHP1422 specificity subunit (HsdS) deduced from the J99 genome sequence (21). The proposed protein harbors three TRDs of which TRD1 is unique and TRD2 and 3 are duplicated as indicated by the coloring. Helical connector regions (H), known as conserved regions (CR), flank each TRD. (B) Inactivation of JHP1423 (HsdM) by insertion mutagenesis and subsequent SMRT analysis resulted in a lack of methylation of two different recognition sites as indicated by the gray color of the two sites. (C) Joint expression of JHP1423 (M) and several allele variants of JHP1422 led to different methylation patterns in the E. coli host genome. Cloning of JHP1422/JHP1423 yielded two naturally occurring allelic variants of JHP1422 (left panel: allele variant 1 and 2, JHP1423 sequence not shown) that differ in composition and protein length. In contrast to the proposed sequence, the H2 region following TRD2 is replaced by the H1 for allele variant 1. Allele variant 2 shows the same exchange but, in addition, has a second TRD2 flanked by the H2 region. Both variants have a different impact on methylation of the two recognition sites (right panel). Allele variants 3–6 were obtained by mutagenesis to reveal the specificity of the TRDs. Coloring and boldface type highlight methylation of the recognition sites, and the methyl group (CH3) indicates the modified nucleotide. Recognition sequences displayed in gray are not modified. The methylation patterns were analyzed by SMRT sequencing of the J99-R3jhp1423 mutant genome (B) and E. coli ER2796 genomes (C).
Figure 5. Graphical representation of the specificity switching phenomenon of the two homologous Type IIG MTases JHP1272 and HP1353-HP1354. (A) The gene sequences (gray arrows) contain two putative or authentic frameshifts that would prevent translation of full-length proteins (blue bars). Each frameshift was repaired through site-directed mutagenesis [addition or deletion of C nucleotide(s) is indicated] and correction was verified by targeted Sanger sequencing. Plasmid DNA (Hpy99XIV) or gDNA (remainder) of the E. coli host strain ER2796 was subjected to SMRT sequencing and analyzed for methylation. ‘mut’ marks alleles with frameshift corrections, ‘n.d.’ indicates that no SMRT sequencing was performed and dashed lines indicate that no methylation could be detected. Filled inverted triangles highlight the position of the frameshift mutations. JHP: ORF of J99; HP: ORF of 26695. (B–D) Portions of the amino-acid multiple sequence alignments including secondary structure predictions (raw output of PROMALS) of the MTases JHP1272 and HP1353-HP1354. The numbers preceding each sequence display the position within the amino acid sequence. An asterisk at the end of a sequence indicates a stop codon. CTRD: C-terminal additional recognition domain; TRD: target recognition domain. (B) Secondary structure-guided alignment of JHP1272 and HP1353 CTRD and TRD with two MmeI family enzymes. The alignment shows the similarity of both MTases to the secondary structure of MmeI and MaqI in the MmeI region known to specify recognition of the −2 and −1 bases (relative to the methylated base within a recognition site). For these positions a changing specificity in the two active variants of JHP1272 and HP1353-HP1354 was observed. Red text indicates predicted alpha helices and blue text indicates predicted beta strands. The arrows indicate the four beta strands forming the structural motif in MmeI family enzymes that present −2 and −1 contact amino acids. The underlined PPPP region represents the first homopolymeric repeat of C nucleotides. (C) Putative recognition region for −1 and −2 base positions within the regular TRD and CTRD of the two MTases. Amino acid identities are depicted in bold black, similarities in bold green. (D) Alignment of HP1353-HP1354 and JHP1272 CTRD. Both regions show 87% amino acid identity (82 of 94), which most likely accounts for the identical specificity switch for −1 and −2 base positions observed for both protein variants containing the CTRD.
Comparison of methylation patterns between H. pylori 26695 and J99-R3
| Specificity | Modification | Type | Subtype | Locus M gene (26695) | Locus S gene (26695) | Locus M gene (J99-R3) | Locus S gene (J99-R3) | Nomenclature (26695) | Nomenclature (J99-R3) |
|---|---|---|---|---|---|---|---|---|---|
| Specificities found in 26695 and J99-R3 | |||||||||
| C | m6A | IIP | Alpha | hp1208 | – | jhp1131 | – | M.HpyAI | M.Hpy99X |
| G | m6A | IIP | Beta | hp0092 | – | jhp0085 | – | M.HpyAIII | M.Hpy99VI |
| | m6A | IIP | Beta | hp0260 | – | jhp0244 | – | M.HpyAX | M.Hpy99VII |
| G | m6A | IIP | Beta | hp1352 | – | jhp1271 | – | M.HpyAIV | M.Hpy99IX |
| A | m6A | IIP | Gamma | hp0478 | – | jhp0430 | – | M.HpyAVII | M.Hpy99XIX |
| G | m5C | IIP | 5mC | hp1121 | – | jhp1050 | – | M.HpyAVIII | M.Hpy99III |
| G | m6A | IIS | Beta | hp0050 | – | jhp0043 | – | M1.HpyAVI | M1.Hpy99V |
| Unique specificities of 26695 | |||||||||
| GAA | m6A | IIS | Beta | hp1367 | – | No homologue | – | M1.HpyAII | – |
| | m4C | IIS | Beta | hp1368 | – | No homologue | – | M2.HpyAII | – |
| | m5C | IIS | 5mC | hp0051 | – | jhp0044 | – | M2.HpyAVI | M2.Hpy99V, inactive |
| GCGT | m6A | IIG | Gamma | hp1517 | – | (jhp1409) | – | HpyAXIV | (Hpy99XIII) |
| GA | m6A, m5C | IIS | Gamma,5mC | hp0054 | – | No homologue | – | M.HpyAV | – |
| GC | m6A | III | Beta | hp0593 | – | No homologue | – | M.HpyAXI | – |
| AC | m6A | I | Gamma | hp0850 | hp0790 | (jhp0786) | – | M.HpyAXIII | (M.Hpy99XVII) |
| CGR | m6A | ? | ? | ? | – | – | – | – | – |
| Unique specificities of J99-R3 | |||||||||
| G | m6A | IIP | Gamma | (hp0503P) | – | jhp0454 | – | M.HpyAXII, inactive | M.Hpy99XII |
| | m4C | IIP | Beta | (hp0263P) | – | jhp0248 | – | Inactive | M.Hpy99VIII |
| | m6A | IIP | Gamma | (hp0369P) | – | jhp1012 | – | Inactive | M.Hpy99XVIII |
| | m4C | IIP | Beta | No homologue | – | jhp0629 | – | – | M.Hpy99IV |
| G | m6A | IIP | Beta | No homologue | – | jhp0045 | – | – | M.Hpy99II |
| C | m4C | IIP | Beta | No homologue | – | jhp0756 | – | – | M.Hpy99I |
| A | m5C | IIP | 5mC | (hp0483 fs) | – | jhp0435 | – | Inactive | M.Hpy99XI |
| GCCT | m6A | IIG | Gamma | (hp1517) | – | jhp1409 | – | (HpyAXIV) | Hpy99XIII |
| GGWTA | m6A | IIG | Gamma | (hp1353-mut1) | – | jhp1272 | – | (HpyAXVI-mut1) | Hpy99XIV |
| RT | m6A | I | Gamma | (hp0850) | – | jhp0786 | jhp0785 | (M.HpyAXIII) | M.Hpy99XVI |
| A | m6A | I | Gamma | (hp0850) | – | jhp0786 | jhp0726 | (M.HpyAXIII) | M.Hpy99XVII |
| A | m6A | I | Gamma | No homologue | – | jhp1423 | jhp1422 | – | M.Hpy99XV |
| A | m6A | I | Gamma | No homologue | – | jhp1423 | jhp1422 | – | M.Hpy99XV |
| Specificities found in 26695 following frameshift repair of the corresponding genes | |||||||||
| GGANN | m6A | ISP | Gamma | hp0669 (hp0668) | – | (jhp0612P) | – | M.HpyAXVIII | inactive |
| CRTTA | m6A | IIG | Gamma | hp1353-mut1 | – | (jhp1272) | – | HpyAXVI-mut1 | (Hpy99XIV) |
| CRTCN | m6A | IIG | Gamma | hp1353-mut2 | – | (jhp1272-mut1) | – | HpyAXVI-mut2 | (Hpy99XIV-mut1) |
| TC | m6A | III | Beta | hp1370 | – | No homologue | – | M.HpyAXVII | – |
| Specificities found in J99-R3 following frameshift repair of the corresponding genes | |||||||||
| TC | m6A | IIB | Gamma | (hp1472) | (hp1471) | jhp1365 | jhp1364 | Inactive | Hpy99XXII |
| GWC | m6A | III | Beta | (hp1522) | – | jhp1411 | – | Inactive | M.Hpy99XXI |
| GGWCN | m6A | IIG | Gamma | (hp1353-mut2) | – | jhp1272-mut1 | – | (HpyAXVI-mut2) | Hpy99XIV-mut1 |
| Silenced specificity found in J99-R3 | |||||||||
| AA | m6A | I | Gamma | (hp0463P) | – | jhp0415 | jhp0414 | Inactive | M.Hpy99XX |
a. The methylated position within the motif is highlighted in bold. If present, underlining indicates the modified base in the complementary strand.
b. Loci and systems displayed in brackets are homologous but have different specificities. The addition of a ‘P’ indicates that a gene is inactive. The locus hp0483 is inactivated by an authentic frameshift (fs).