Movement of DNA sequence recognition domains between non-orthologous proteins.

Yoshikazu Furuta, Ichizo Kobayashi.

Abstract

Comparisons of proteins show that they evolve through the movement of domains. However, in many cases, the underlying mechanisms remain unclear. Here, we observed the movements of DNA recognition domains between non-orthologous proteins within a prokaryote genome. Restriction-modification (RM) systems, consisting of a sequence-specific DNA methyltransferase and a restriction enzyme, contribute to maintenance/evolution of genomes/epigenomes. RM systems limit horizontal gene transfer but are themselves mobile. We compared Type III RM systems in Helicobacter pylori genomes and found that target recognition domain (TRD) sequences are mobile, moving between different orthologous groups that occupy unique chromosomal locations. Sequence comparisons suggested that a likely underlying mechanism is movement through homologous recombination of similar DNA sequences that encode amino acid sequence motifs that are conserved among Type III DNA methyltransferases. Consistent with this movement, incongruence was observed between the phylogenetic trees of TRD regions and other regions in proteins. Horizontal acquisition of diverse TRD sequences was suggested by detection of homologs in other Helicobacter species and distantly related bacterial species. One of these RM systems in H. pylori was inactivated by insertion of another RM system that likely transferred from an oral bacterium. TRD movement represents a novel route for diversification of DNA-interacting proteins.

Year:  2012        PMID: 22821560      PMCID: PMC3467074          DOI: 10.1093/nar/gks681

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Comparisons of proteins indicate that they evolved through movement of domains. However, the elementary steps of these movements have been unclear. In eukaryotes, exon shuffling and alternative splicing generate proteins with switched domains (1). In this work, we report the movement of domain sequences between non-orthologous proteins of a prokaryote species lacking an exon–intron structure. The mobile domains are involved in DNA recognition. Recognition of DNA sequences by proteins is central to life. Restriction (R) and modification (M) enzymes have provided paradigms for understanding recognition of well-defined DNA sequences (2). In three (I–III) types of restriction–modification (RM) systems (3), an M enzyme methylates DNA at a specific sequence while an R enzyme cleaves DNA lacking this methylation. Type IV restriction enzymes cut DNA that is methylated at a specific sequence (4). DNA methyltransferases bring about three types of modification: m5C, m6A and m4C, and they are mainly grouped into six classes, α to ζ, according to the order of nine conserved motifs and the target recognition domain (TRD) (5,6). Most of the known Type II systems consist of separate R and M enzymes that independently recognize a target sequence and catalyze reactions (4,7). In M proteins, amino acid sequence motifs common to DNA methyltransferases are well conserved and the TRD is easily identified (6), while R proteins have much less similarity to each other (8). Type I systems consist of R, M and specificity (S) subunit genes, the products of which form multisubunit enzymes for modification (SM) or restriction (SMR) (9). Sequence recognition is determined by the TRD in the S subunit. The TRD consists of two domains, each of which recognizes half of a bipartite target sequence (10). Type III systems consist of res and mod genes. The mod gene product alone has M activity, while the complex of the two gene products has R enzyme activity (11).
The mod subunit is responsible for target recognition and its TRD can be easily identified. Type III mod genes in some host-adapted bacterial pathogens are known for diversity in the sequence recognized by the TRD and for involvement in phase variation of global gene expression (12–14). To date, Type III systems have been annotated by sequence similarity to known Type III enzymes (4), and almost all mod genes are classified as β type (REBASE Enzymes, http://rebase.neb.com/cgi-bin/msublist). RM systems, which limit horizontal transfer of genes, are themselves mobile. Some acquire mobility by traveling with another class of mobile elements such as plasmids and prophages (15–22). Some RM genes are flanked by insertion sequence (IS) elements (23–26). The mobility of other RM systems that are unlinked to a typical mobile element has been suggested by analyses of their phylogenetic relationships, genome contexts and genome comparisons. Phylogenetic trees of RM genes suggest horizontal gene transfer between distantly related prokaryotes (27–29). Genome comparison has revealed insertion of RM systems with long target duplications (30). A large genome inversion was observed next to an RM insertion, suggesting involvement of RM activity in the inversion (31,32). These observations strongly support the nature of RM systems as mobile elements and their contribution to various genome rearrangements. This concept is also supported by experimental analyses (33,34). The biological significance of RM systems has been mainly explained by their activity as a defense system for host cells against invading DNAs such as those from bacteriophages. Recent work has demonstrated their role beyond defense against invaders, suggesting they are like a watchdog, maintaining epigenetic order. RM systems define specific epigenetic status in a genome by combinatorial methylation of specific genome sequences (35). This epigenetic status regulates the transcriptome (12–14). 
Alteration of the epigenetic status can lead to cell death by R enzyme activity (36–38). This may help maintain the epigenome and RM systems themselves (39). For example, an attack on the host bacterium by R enzyme activity might contribute to maintenance of healthy genomes under stressful conditions (38). Helicobacter pylori are pathogenic epsilonproteobacteria in the human stomach (40) that are known to code for abundant and diverse RM systems (41,42). They are also known for high mutation and homologous recombination rates and for natural competence, which results in high diversity in genomic DNA between isolates (43,44). Helicobacter pylori have coevolved with humans since human ancestors began migrating out of Africa and show wide phylogeographic differentiation (45). By phylogenetic analysis, H. pylori are grouped into hspWAfrica, hpEurope, hspAmerind, hspEastAsia and other groups (43). Most earlier studies characterizing their enzymes were restricted to Type II RM systems and to European strains. We compared the complete genome sequences of global H. pylori strains for diversity in Type I RM systems and found domain sequence movement (DoMo) within an orthologous gene (46). In this work, we analyzed diversity in Type III mod genes in global H. pylori genome sequences. In addition to mobility of the mod gene itself, our results revealed various modes of mobility of TRD sequences between mod genes.

MATERIALS AND METHODS

Comparison of RM systems

The complete genome sequences used (Table 1) were downloaded from the National Center for Biotechnology Information (NCBI) database as of 1 November 2010, except for those of F16, F30, F32 and F57, which had been obtained by our group (47). The locus tags in the sequences were used as registered in the NCBI database. The genome sequence of strain 908 was not used (except for in core tree construction) because its completeness was not guaranteed.
Table 1.

Strains. Footnote a: based on a phylogenetic tree of the core genes (47).

Sequences of RM systems were downloaded from REBASE (41) as of 15 July 2011. Assignment of a gene as Type III mod in H. pylori and other Helicobacter species was confirmed by significant amino acid sequence similarity by BLASTP (48) (e-value < 1e−24) of at least one allele at a locus with EcoP15 mod, which was confirmed experimentally as Type III mod (49,50). For further confirmation, a conserved small PD-(D/E)XK motif was also detected in the C-terminal region of Type III res genes paired with the mod genes or their homologs (51). Nucleotide and amino acid sequences were aligned by mafft (52) and ClustalW (53) with the default parameters. For the gene split at Locus 0 of F57 by conjugative transposon insertion, the nucleotide sequences of HPF57_0278 and HPF57_0312 were concatenated and used for analysis. A homology group of TRDs was defined as a set of TRD sequences that clustered with each other at an e-value < 1e−90 by BLASTN (54).
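The homology-group definition above can be illustrated with a minimal sketch. It assumes single-linkage clustering (any pair of TRD sequences with a BLASTN e-value below the threshold ends up in the same group), which the text does not state explicitly. The function name `cluster_by_evalue` and the toy identifiers and e-values are hypothetical and are not data from this study.

```python
from collections import defaultdict


def cluster_by_evalue(hits, threshold=1e-90):
    """Group sequences by single linkage over pairwise hits.

    hits: iterable of (query_id, subject_id, evalue) tuples, e.g. parsed
    from tabular BLASTN output. Two sequences are linked when a hit between
    them has an e-value strictly below `threshold` (assumed rule).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)
        if evalue < threshold and root_q != root_s:
            parent[root_s] = root_q

    groups = defaultdict(list)
    for seq in list(parent):
        groups[find(seq)].append(seq)
    return sorted(sorted(members) for members in groups.values())


# Toy input: made-up identifiers and e-values, for illustration only.
toy_hits = [
    ("trd1", "trd2", 1e-120),
    ("trd2", "trd3", 1e-95),
    ("trd3", "trd4", 1e-20),   # above the threshold, so no link
    ("trd4", "trd5", 0.0),
]
toy_groups = cluster_by_evalue(toy_hits)
print(toy_groups)
```

With single linkage, trd1 and trd3 fall into one group through trd2 even without a direct hit between them; a stricter rule (for example, requiring all pairs to pass the threshold) would give different groups.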

Phylogenetic tree construction

Homologs of TRDs in other species were sought by BLASTP (48) against the NCBI nr database, using as queries the amino acid sequences of TRDs that contained no internal stop codon. BLASTP hits derived from H. pylori were omitted from the results. Hits with an e-value < 1e−50, which corresponds to ∼50% amino acid sequence identity, were retrieved (Supplementary Table S1). The 16S rRNA sequence of the species of each hit was retrieved from the sequence list for the All-Species Living Tree Project (55), and one representative species per genus was chosen for phylogenetic tree construction (Supplementary Table S2). A phylogenetic tree of the 16S rRNAs was drawn by the maximum likelihood method with the Kimura 2-parameter model by MEGA (56) with 1000 bootstrap replicates. Helicobacter felis was not included because no sequences annotated as 16S rRNA in its genome were full length. Phylogenetic trees for tree comparison were drawn by the neighbor-joining method with the Kimura 2-parameter model by MEGA (56) with 1000 bootstrap replicates. The core tree in Figure 5 was redrawn from a published sequence alignment (47) using MEGA (56) with 1000 bootstrap replicates.
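For readers unfamiliar with the Kimura 2-parameter model named above, the following sketch computes the pairwise distance d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. It is an illustration of the standard formula only, not a reimplementation of MEGA; the function name and the handling of gaps (skipped) are assumptions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned nucleotide sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption;
    tree-building software offers several gap treatments). U is read as T.
    Raises ValueError (from math.log) when the sequences are saturated.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper().replace("U", "T"),
                    seq2.upper().replace("U", "T")):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy 10-bp alignments (made up): one transition, then one transversion.
d_transition = k2p_distance("AAAAAAAAAA", "GAAAAAAAAA")
d_transversion = k2p_distance("AAAAAAAAAA", "CAAAAAAAAA")
print(d_transition, d_transversion)
```

The same proportion of differences gives a slightly different distance depending on whether the changes are transitions or transversions, which is the point of the two-parameter correction.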
Figure 5.

Comparison of phylogenetic trees of non-TRD regions of mod genes at each locus and the core genome. Left, non-TRD regions; right, core genome. Numbers indicate bootstrap values. Colors indicate phylogenetic groups in Table 1.

Detection of horizontally transferred genes by pentanucleotide word composition

Whether each whole mod gene sequence had been horizontally transferred was inferred from its pentanucleotide composition (57). In brief, we first extracted all coding and non-coding regions from a whole genome sequence and constructed a Markov chain model of coding regions as a training model. Then, the posterior probability that the nucleotide sequence of a gene appeared in the coding regions of the same genome was calculated using the Markov chain model and Bayes' theorem. Statistical significance was assessed against the posterior probabilities of 100 artificial sequences generated from the Markov chain model. P < 0.01 was used as the threshold.
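The procedure above can be sketched as follows. This is a minimal illustration under several assumptions that are not stated in the text: a fourth-order Markov chain (so that each transition spans one pentanucleotide word), a second chain trained on non-coding regions as the alternative in Bayes' theorem, an equal prior of 0.5, add-one pseudocounts, and an empirical P-value defined as the fraction of simulated sequences whose posterior is no higher than that of the gene. All function names are hypothetical, and the published method (57) may differ in each of these choices.

```python
import math
import random
from collections import defaultdict

ALPHABET = "ACGT"
ORDER = 4                    # assumed: 4-base context + 1 emitted base
UNSEEN = math.log(0.25)      # assumed fallback for unseen contexts


def train(seqs, order=ORDER, pseudocount=1.0):
    """Return {context: {base: log P(base | context)}} estimated from seqs."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, pseudocount))
    for s in seqs:
        for i in range(order, len(s)):
            ctx, base = s[i - order:i], s[i]
            if base in ALPHABET and all(c in ALPHABET for c in ctx):
                counts[ctx][base] += 1
    return {ctx: {b: math.log(v / sum(row.values())) for b, v in row.items()}
            for ctx, row in counts.items()}


def log_likelihood(seq, model, order=ORDER):
    return sum(model.get(seq[i - order:i], {}).get(seq[i], UNSEEN)
               for i in range(order, len(seq)))


def posterior_coding(seq, coding, noncoding, prior=0.5):
    """P(coding-like | seq) by Bayes' theorem, computed in log space."""
    a = log_likelihood(seq, coding) + math.log(prior)
    b = log_likelihood(seq, noncoding) + math.log(1 - prior)
    m = max(a, b)
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))


def simulate(model, length, rng, order=ORDER):
    """Generate one artificial sequence from the trained chain."""
    seq = list(rng.choice(sorted(model)))
    while len(seq) < length:
        row = model.get("".join(seq[-order:]))
        if row is None:
            seq.append(rng.choice(ALPHABET))
        else:
            bases = list(row)
            weights = [math.exp(row[b]) for b in bases]
            seq.append(rng.choices(bases, weights=weights)[0])
    return "".join(seq[:length])


def empirical_p(seq, coding, noncoding, n_sim=100, seed=0):
    """Fraction of simulated native-like sequences scoring no higher than seq."""
    rng = random.Random(seed)
    observed = posterior_coding(seq, coding, noncoding)
    null = [posterior_coding(simulate(coding, len(seq), rng), coding, noncoding)
            for _ in range(n_sim)]
    return sum(p <= observed for p in null) / n_sim


# Toy training data (made up): periodic strings stand in for genome regions.
coding_model = train(["ACGT" * 50])
noncoding_model = train(["AAAAT" * 40])
native_like = posterior_coding("ACGT" * 10, coding_model, noncoding_model)
foreign_like = posterior_coding("AAAAT" * 8, coding_model, noncoding_model)
p_value = empirical_p("AAAAT" * 8, coding_model, noncoding_model)
print(native_like, foreign_like, p_value)
```

With 100 simulated sequences, an empirical P-value below 0.01 can only be reached when no simulated sequence scores as low as the gene under test, which is consistent with the threshold quoted above.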

RESULTS

Detecting Type III mod genes in H. pylori genomes

We retrieved Type III mod gene sequences from 19 H. pylori complete genomes using REBASE and homology search (see ‘Materials and Methods’ section). We used only complete genome sequences to ensure finding all possible orthologs and paralogs. Helicobacter pylori are known for phylogeographic divergence in their genomes (45). The 19 chosen strains were assigned to hpEurope, hspWAfrica, hspEAsia and hspAmerind groups (Table 1) based on the core phylogenetic tree (confirmed by STRUCTURE analysis, Koji Yahara, personal communication) (47). Each gene had a prefix in the locus tag that was unique to the genome (Table 1). This grouping information was used in the analysis of horizontal transfer of TRD sequences below. The examined H. pylori genomes had five orthologous groups of Type III mod genes, each at a unique locus (Figure 1; see Supplementary Figure S1 for locus tags).
Figure 1.

TRDs of Type III mod genes in H. pylori complete genomes. (A) Relative positions of conserved motifs and TRD in Type III mod genes. Roman numerals indicate the conserved motifs of DNA methyltransferases (6). (B) Locus 0. HP0260 homologs. A large triangle on the homolog in strain F57 represents insertion of a conjugative transposon. (C) Locus 1. HP1369 homologs. (D) Locus 2. HP1522 homologs. (E) Locus 3. jhp1296 homologs. (F) Locus 4. HP0593 homologs. Members of the same TRD homology group are in the same color. Small vertical bar in orange, start codon; small vertical bar in green, stop codon generated by a frameshift mutation. For locus tags, see Supplementary Figure S1.

The mod homologs at loci 1 through 4 were found linked to a res-like gene with a conserved small PD-(D/E)XK motif in the C-terminal region. The mod homolog at locus 0 is, however, not linked to such a gene in any of these strains when present, although many of its homologs in other species are linked with one. The mod homolog at locus 0 may be a solitary methyltransferase evolutionarily related to Type III RM systems.

Structural variations and insertion of a different RM system likely from an oral bacterium

Type III mod loci were found with insertion/deletion of entire genes, truncations of varying length at mononucleotide repeats (14) and nonsense mutations. Some alterations in start and stop codons indicated by small vertical bars in Figure 1 were associated with phase variation (58). Among the identified loci, locus 4 carried the mod gene in only 5 of the 19 strains. Genome context analysis revealed that this absence was due to an apparent substitution by a hypothetical gene in all except a single strain (Figure 2A). The exception was strain PeCan4, in which the mod gene was substituted with a Type II RM system and an IS (ISHp608) (Figure 2A). The Type II RM genes had close homology with TdeIII RM genes in Treponema denticola, as detected by BLASTP (Figure 2B and C). The amino acid identities in the coding region were 53% for M genes and 62% for R genes. This bacterial species is mainly found in the oral cavity (59), suggesting horizontal gene transfer of this Type II RM system to H. pylori from the oral bacterium or a related bacterium. The neighboring IS may have helped this transfer, although we cannot exclude the possibility that the insertion occurred after the transfer.
Figure 2.

Evolution at Locus 4. (A) Structure in various strains. IIM, Type II system M gene; IIR, Type II system R gene; IIIM, Type III system mod gene; IIIR, Type III system res gene. (B) Alignment of amino acid sequences between HPPC_02695 (Type II M) and M.TdeIII derived from T. denticola. (C) Alignment of amino acid sequences between HPPC_02670 (Type II R) and TdeIII derived from T. denticola. Identical residues in alignments are shaded.

Diversity in TRD sequences on mod genes

Although Locus 0 (previously assigned as mod-1 (60); Figure 1B and Supplementary Figure S2) showed strong conservation in the TRD, the other four orthologous groups at the remaining loci (defined as Loci 1 through 4, previously assigned as mod-3, mod-5, mod-4 and mod-2, respectively (60); Figure 1C–F and Supplementary Figures S3–S6) had significant sequence diversity in their TRDs. Allelic variation at locus 2 (Figure 1D) has been reported (14,61). In this work, we defined TRD homology groups using clustering of TRD sequences after BLASTN analysis. TRD sequences were clustered in the same homology group when the e-value in BLASTN was < 1e−90. We identified 22 distinct TRD homology groups in all, with two to eight distinct TRD homology groups at each of the four loci, among the 19 strains.

Movement of TRD sequences between mod genes of different homology groups at different loci

The diversity of TRD sequences within the same orthologous group (and at the same locus) (Figure 1) can be explained by allelic homologous recombination at the conserved regions flanking the TRD sequences, referred to as non-TRD regions. In addition, we found that some TRDs were shared by mod genes of different orthologous groups at different loci (Figure 1). TRD homology groups A and C were found at loci 1 and 3, while TRD homology group D was found at loci 1, 2 and 4. This suggested movement of a TRD sequence between different orthologous groups at different loci. We hypothesized about the mechanism underlying such movements. Homologous recombination involving most of the non-TRD regions cannot be used for movement because the non-TRD regions were not conserved between different orthologous groups. By detailed comparison of the flanking sequences, we found that the movement could be explained by recombination using a short DNA sequence similarity at the regions flanking the TRD sequences. These are common among the methyltransferases of different orthologous groups (Figure 3).
Figure 3.

Sequence alignments at suggested recombination sites for TRD replacement. (A) (i) Recombination scheme, (ii) 5′-side alignment and (iii) 3′-side alignment of group A. (B) (i) Recombination scheme, (ii) 5′-side alignment and (iii) 3′-side alignment of group C. (C) (i) Recombination scheme, (ii) 5′-side alignment and (iii) 3′-side alignment of group D at loci 2 and 4. (D) (i) Recombination scheme, (ii) 5′-side alignment and (iii) 3′-side alignment of group D at loci 1 and 2. Gray, conserved sequences at each locus; boxed, sequences for recombination; red hatched box, sequence corresponding to conserved amino acid sequence of motif I, FxGxG.

For example, the movement of group A between loci 1 and 3 could be explained by the sequence similarity of 13 bp at the 5′-side of the TRD sequences, and 117 bp at the 3′-side (Figure 3A). The movement of group C between loci 1 and 3 could also be explained by 17 bp of sequence similarity at the 5′-side and 117 bp at the 3′-side of the TRD sequences (Figure 3B). The cases for group A and group C are similar in the sequences involved. For group D movement between loci 2 and 4, the sequence similarity was 32 bp at the 5′-side and 57 bp at the 3′-side of the TRD sequences (Figure 3C). In contrast, the sequence similarity between loci 1 and 2 was 8 bp at the 5′-side and 26 bp at the 3′-side (Figure 3D). For the 5′-boundary, the expansion of a poly-G repeat at locus 1 might have disrupted a longer region of similarity. Poly-G repeats were associated with two additional recombination areas (Figure 3A and B). We do not know whether these were related to the process or a consequence of the recombination.

Phylogenetic incongruence between TRD and non-TRD regions consistent with TRD movement

Movement of a TRD sequence between different loci (orthologous groups) might lead to phylogenetic incongruence between the TRD and the remaining non-TRD gene regions. Phylogenetic trees were compared for all four loci (Figure 4; see Supplementary Figure S7 for locus tags and bootstrap values). TRD homology groups A, C and D, each of which formed a cluster in the TRD tree, were connected with more than one cluster of the non-TRD regions’ tree, consistent with the movement of these TRD sequences between orthologous groups (loci). In addition, clustering in the TRD tree seemed to be mostly independent of locus. This was in contrast to clustering in TRD homology group W and non-TRD homology group at Locus 0, where all members were clustered into a single group. These results are consistent with the movement of TRD sequences between loci (homology groups with respect to non-TRD regions) during evolution. In other words, the diversity of TRD appeared to have occurred through acquiring a new sequence rather than only through accumulation of mutations in an ancestral TRD sequence unique to each locus (orthologous group).
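As a simple illustration of the pattern described above, the sketch below lists which TRD homology groups occur at more than one locus. The group-to-locus occurrences are those named in the text (groups A and C at loci 1 and 3, group D at loci 1, 2 and 4, group W at Locus 0); the remaining homology groups are omitted because their loci are not listed here, and the function name is hypothetical. A group shared by several loci is only an indicator that the TRD and non-TRD trees may disagree, not a substitute for the tree comparison.

```python
from collections import defaultdict


def groups_shared_across_loci(observations):
    """Return {trd_group: sorted loci} for groups seen at more than one locus.

    observations: iterable of (locus, trd_group) pairs.
    """
    loci_by_group = defaultdict(set)
    for locus, trd_group in observations:
        loci_by_group[trd_group].add(locus)
    return {group: sorted(loci)
            for group, loci in sorted(loci_by_group.items())
            if len(loci) > 1}


# One pair per locus at which the text reports the group.
observations = [
    (1, "A"), (3, "A"),
    (1, "C"), (3, "C"),
    (1, "D"), (2, "D"), (4, "D"),
    (0, "W"),
]
shared = groups_shared_across_loci(observations)
print(shared)
```

Groups A, C and D are reported as shared, whereas group W, confined to Locus 0, is not.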
Figure 4.

Comparison of phylogenetic trees of the TRD and non-TRD regions of Type III mod genes at all loci. Left, TRD trees; right, non-TRD region. Line colors correspond to TRD colors in Figure 1. TRD groups shared by different loci are connected with thicker lines. For locus tags and bootstrap values, see Supplementary Figure S7.

Movement of mod genes between H. pylori genomes inferred from phylogenetic incongruence

To follow any mobility of the mod genes with respect to the entire genome, the phylogenetic trees of the non-TRD regions at each locus were compared with a phylogenetic tree of concatenated core genes, which in some sense reflects the overall evolution of the entire genome (Figure 5). Some groups showed entanglement compared with the core tree. For Locus 1 (Figure 5B), the major clustering into two in the non-TRD regions was different from the clustering in the core tree. Separation of SJM180 from the other hpEurope strains and clustering with several hspEastAsia strains might reflect a horizontal transfer from an hspEastAsia strain to an hpEurope strain. For Locus 2 (Figure 5C), PeCan4 was separated from the hspAmerind cluster and included in the hpEurope cluster. A transfer might have occurred from an hpEurope strain to a PeCan4 ancestor. Tree comparison of Locus 3 (Figure 5D) showed a more complex pattern, which separated all the major phylogeographic groups (hpEurope, hspAmerind, hspEastAsia) into more than one of the three major clusters. These apparent incongruities between the trees of the non-TRD regions and the core genome suggested recombination of the mod genes with the remainder of the genome, consistent with the high degree of mutual homologous recombination in H. pylori (43). Combined with TRD mobility between mod genes, the mobility of mod genes revealed multilevel mobility in RM systems. From these limited results, we have not so far obtained evidence to relate the two levels of mobility: TRD movement between mod genes and mod movement between genomes. We do not know whether the mod genes moved by themselves or together with linked genes. A hypothesis of transfer of many genes other than the mod gene at locus 2 from the European/African lineages to PeCan4 would be consistent with the position of this strain in the core tree.

mod genes and TRD sequences in other Helicobacter species

We investigated the origin of the extremely diverse TRD sequences of mod genes in H. pylori. Rather than simple accumulation of mutations and intraspecific transfer, we suspected acquisition by horizontal gene transfer from other bacterial species as suggested for the Type I specificity subunit (62). First, we searched for homologs of mod genes in Helicobacter species other than H. pylori to determine history of gain/loss of mod genes themselves (Figure 6 and Supplementary Figure S8). The number of mod genes per genome varied from 0 to 5. More than half showed homology to the non-TRD region of H. pylori mod genes. These mod homologs carry a TRD sequence from the various TRD homology groups found above in H. pylori. In addition, 10 novel groups for TRD and five novel groups for non-TRD were found among the non-pylori Helicobacter species. To determine the gain/loss history, we listed these mod homologs in a phylogenetic tree of Helicobacter species (Figure 6).
Figure 6.

Distribution of TRD homologs in Helicobacter species. Each column in the right represents a homology group found in the Helicobacter species. Groups 0 through 4 correspond to mod genes at Loci 0 through 4 in H. pylori. Colors in a square represent the TRD sequences as in Figure 1.

Homologs of mod at H. pylori Locus 0 and mod at locus 4 were observed only in H. pylori and H. acinonychis. The simplest explanation for this pattern is that these two subfamilies were acquired after separation of H. cetorum from the common ancestor of H. pylori and H. acinonychis and before separation of the latter two species. Because homologs of Locus 2 mod were also observed in H. cetorum, those homologs might have been acquired before separation of H. cetorum and the other two species. Homologs of the Locus 3 mod gene were found in H. pylori, H. acinonychis, H. cetorum and H. cinaedi, a species that is distantly related to H. pylori but also infects humans. Horizontal transfer might have occurred between H. cinaedi and the common ancestor of H. pylori and H. cetorum. Five mod homology groups (designated groups 5–9) were not homologous to any non-TRD region of the H. pylori mod genes identified (Figure 6). They were found in H. acinonychis, H. cetorum, H. suis, H. canadensis, H. bilis and H. cinaedi. Species-specific mod homology groups were found even within closely related species such as H. pylori, H. acinonychis and H. cetorum, which suggested frequent gain/loss events of whole mod genes. A phylogenetic tree of the 10 newly identified TRDs and the H. pylori TRDs (Supplementary Figure S8) revealed their extensive diversity. None of the new TRDs was closely related to another. The tree as a whole was consistent with extensive horizontal transfer of TRDs between Helicobacter species. Interestingly, TRD homology group D, found with groups (loci) 1, 2 and 4 mod genes in H. pylori, was also associated with group 9 mod in H. suis. This observation suggested that TRD movement between non-orthologous mod genes could occur not only in H. pylori but also in other species and contribute to distant horizontal transfer.

TRD sequences in distantly related bacteria

Next, to examine the possibility of the transfer of TRD sequences from/to distantly related species, we searched for sequences that showed similarity to the diverse TRD sequences by BLASTP analysis against the nr database (Figure 7). For some homology groups of TRD, similar sequences were detected in species distant from H. pylori, but not in epsilonproteobacteria such as other Helicobacter and Campylobacter species. This suggested distant horizontal transfer as opposed to vertical transfer. In particular, TRD homology group R had strong sequence similarity even at the nucleotide level to a gene in Haemophilus (e-value around 1e−80 by BLASTN) and group U to one in Mycoplasma (e-value 2e−60 by BLASTN). These results strongly supported a relatively recent horizontal transfer of TRD sequences between these pairs of distantly related species.
Figure 7.

TRD homologs detected in distant species. Left, a 16S rRNA-based phylogenetic tree of a genus where a TRD homolog was detected. Middle, classes of these genera. Right, presence (color) or absence (white) of a homolog of each TRD group in each genus. Note that they are colored when at least one of the species in the genus was detected with similar sequence to TRD groups. Asterisk: at least one of similar genes in a genus was detected as horizontally transferred by pentanucleotide composition analysis of its entire open reading frame.

We also assessed horizontal gene transfer based on pentanucleotide word compositions of entire open reading frames (see ‘Materials and Methods’ section). This method detects coding sequences recently transferred from a distantly related organism. All mod genes in H. pylori were judged as recently transferred except for TRD C and TRD I. The TRD group C was found in many bacterial groups, so we could not determine which served as the donor and which as the recipient. TRD group O might have moved from Fusobacterium to H. pylori and Neisseria, while TRD group T might have moved from Ureaplasma to epsilonproteobacteria.

DISCUSSION

Movement of TRD between genes for Type III RM systems

We analyzed Type III mod genes at the nucleotide sequence level in global H. pylori genomes. We found mobility of the TRD sequences between different mod orthologs at different loci. The mechanism underlying the TRD movement was suggested to be recombination at 8–117 bp of similar sequences flanking TRD regions. The mod genes analyzed in this work all belonged to the β group, as do most of the other mod genes, based on the arrangement of the methyltransferase motifs and TRD (49). The TRD was flanked by motifs IV–VIII at the 5′-side and motifs X–III at the 3′-side. Recombination apparently took advantage of the conservation of DNA sequences at both the flanking regions that encode the conserved amino acid motifs. A related mode of TRD diversification by recombination was previously observed in the specificity subunit of Type I RM systems and other genes (46,63–65). Some Type I specificity subunits consist of two TRDs flanked by the same pair of 19–49 bp sequences. Taking advantage of these repeat sequences for recombination, a TRD sequence can replace another TRD at the same or at another TRD site. The target TRD site can be at a separate locus. The movement between the two TRD sites is named DoMo (46). TRD sequence movement by DoMo in the Type I specificity subunit is restricted to the same orthologous group, whether within the same locus or between different loci. However, the TRD movement of Type III mod we found here is unique because it takes place between different orthologous groups, taking advantage of the weak homology at the motif sequences conserved among DNA methyltransferases.

Comparison with related reactions

Movement of sequences between homologous genes at different loci is known as gene conversion (66). It is frequently observed in H. pylori outer membrane protein genes such as those in the bab and sab families (67,68). Gene conversion from an unexpressed to an expressed locus mediates antigenic variation of outer membrane proteins and pili in several bacteria (69–71). The TRD movement reported in this work is unique in the low sequence similarity of the recombining regions (13–52% nucleotide sequence identity) between different orthologous groups; these regions encode conserved amino acid motifs of DNA methyltransferases. Most other examples of gene conversion use long flanking regions conserved between genes as recombination sites. The TRD movements use relatively long similar sequences at the 3′-side of the TRD, but short (13–32 bp) similar sequences at the 5′-side. This might explain the apparently lower frequency of TRD movement observed between different orthologous groups. An example similar to this short conserved-motif-driven gene conversion was described for rearrangement in a tandem paralog cluster in Staphylococcus aureus (72). In that case, conserved motif sequences in tandem paralogs are used as sites for unequal homologous recombination. Similar unequal recombination was not found for Type III RM systems here but has been observed for Type I RM systems (46). Another example of domain shuffling in an intronless gene was reported in albumin-binding genes, which are suggested to use multiple 15-bp direct repeats, the recer sequences, within a gene (73,74). This also differs from our case, where non-repeated sequences at both the 5′- and 3′-sides were used.
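
A nucleotide sequence identity value such as those quoted above can be computed from a pairwise alignment of two recombining regions. The following Python sketch shows one common convention and is an assumption for illustration only: it takes two sequences that have already been aligned to equal length, counts gap columns as mismatches, and divides by the alignment length. The convention used for the values reported in this work is not restated here, and other conventions (for example, excluding gap columns) give different numbers.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identical columns between two pre-aligned sequences.

    Assumptions: inputs have equal length, '-' marks a gap, and gap
    columns count as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment, not taken from the mod genes analyzed in the paper.
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))
```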

Inactivation of an RM system by insertion of another RM system from a distantly related bacterium

The Type III RM system at locus 4 appeared to have decayed following insertion of a Type II RM system that likely transferred from an oral bacterium (Figure 2). The oral cavity has been suggested as a reservoir of H. pylori (75); thus, frequent interactions between H. pylori and oral bacteria may account for this observation. This replacement of one epigenetic system by another through distant horizontal transfer was likely accompanied by a change in the epigenome, more specifically, in the DNA methylation pattern. This might represent a conflict between epigenetic systems (35,36). Other examples of ‘RM-on-RM-type’ insertions were found in genome comparisons (30,76,77) and are reminiscent of the ‘transposon-on-transposon’ structure often found in eukaryotic genomes (78,79).

TRD movement between distantly-related bacteria

Homologs of an H. pylori TRD homology group were found in a different Helicobacter species (Figure 6). Because the mod non-TRD sequence belongs to a different homology group, this transfer might have taken place through the above mechanism. In other words, the above mechanism might have promoted distant horizontal transfer of various TRDs. The H. pylori TRD homologs are found in a wide variety of bacteria (Figure 7). We do not yet know the relative contribution of three possible processes to this distribution: TRD movement between mod genes, mod gene movement and Type III RM system movement.

Biological significance of switches in TRD of RM systems

The capacity to switch the TRD of a mod gene affects both the R and M activities of Type III RM systems. A change in restriction specificity will alter the repertoire of acceptable DNAs. This will not only affect the defense against infecting genetic elements but also, in a wider sense, limit the future direction of genome evolution. The change in modification specificity will lead to a change in epigenetic methylation states and, therefore, in the transcriptome (14,80). This two-sided change could provide the bacterium with an ability to adapt to various environments by changing the genome and global gene expression. Each of these patterns could be modulated by variation in the strength of each enzyme activity. Each of these unique epigenome states may define an elementary unit of natural selection. Such concepts of RM-driven adaptive evolution can be evaluated through examination of additional H. pylori genomes and experimentation. The mobility of mod genes in Helicobacter species (Figure 6) and the mobility of TRDs in distantly related bacteria (Figure 7) are consistent with the concept of epigenetics-driven evolution. They likely lead to changes in the recognition sequence of both R and M genes and, therefore, to changes in the epigenome. Currently, recognition sequences for Type III mod genes of H. pylori are known only for HP0260 and HP0593, which are the genes of the 26695 strain with TRD homology group W recognizing GATC at locus 0 and group D recognizing CTGCAG at locus 4 (81). Further experimental determination of recognition sequences, either by cleavage tests of methylated sites with restriction enzymes of known recognition sequence (81) or by single-molecule real-time sequencing methods (82), would further our understanding of the diversity of RM systems and its biological significance.

SUPPLEMENTARY DATA

Supplementary Data are available at NAR Online: Supplementary Figures 1–8 and Supplementary Tables 1–3.

FUNDING

The Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science [21370001 to I.K., 24790412 to Y.F.]; the Ministry of Education, Culture, Sports, Science and Technology (MEXT) [24113506 to I.K., 24119503 to Y.F.]; the global COE project of Genome Information Big Bang from MEXT [to I.K.] and the Takeda Science Foundation [to Y.F.]. Funding for open access charge: The Grants-in-Aid for Scientific Research from the Japan Society for the Promotion of Science [21370001 to I.K.].

Conflict of interest statement. None declared.