
Genome-wide methylation patterns in Salmonella enterica Subsp. enterica Serovars.

Cary Pirone-Davies1, Maria Hoffmann1, Richard J Roberts2, Tim Muruvanda1, Ruth E Timme1, Errol Strain3, Yan Luo3, Justin Payne1, Khai Luong4, Yi Song4, Yu-Chih Tsai4, Matthew Boitano4, Tyson A Clark4, Jonas Korlach4, Peter S Evans1, Marc W Allard1.   

Abstract

The methylation of DNA bases plays an important role in numerous biological processes including development, gene expression, and DNA replication. Salmonella is an important foodborne pathogen, and methylation in Salmonella is implicated in virulence. Using single molecule real-time (SMRT) DNA-sequencing, we sequenced and assembled the complete genomes of eleven Salmonella enterica isolates from nine different serovars, and analysed the whole-genome methylation patterns of each genome. We describe 16 distinct N6-methyladenine (m6A) methylated motifs, one N4-methylcytosine (m4C) motif, and one combined m6A-m4C motif. Eight of these motifs are novel, i.e., they have not been previously described. We also identified the methyltransferases (MTases) associated with 13 of the motifs. Some motifs are conserved across all Salmonella serovars tested, while others were found only in a subset of serovars. Eight of the nine serovars contained a unique methylated motif that was not found in any other serovar (most of these motifs were part of Type I restriction modification systems), indicating the high diversity of methylation patterns present in Salmonella.

Year:  2015        PMID: 25860355      PMCID: PMC4393132          DOI: 10.1371/journal.pone.0123639

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

The methylation of DNA is important in all kingdoms of life as a mechanism of epigenetic control [1-3]. Methylation is achieved through the action of methyltransferase enzymes (MTases), which covalently attach methyl groups to DNA bases. In eukaryotes, 5-methylcytosine (m5C) is the most common methylation. In contrast, N6-methyladenine (m6A) is the most frequent methylation in prokaryotes, although N4-methylcytosine (m4C) and m5C are also widespread. Methylation in eukaryotes has been well studied and is known to mediate diverse processes including growth, development, and disease [4]. In prokaryotes, methylation is a key component of restriction-modification (RM) systems, which protect cells from foreign DNA. RM systems are composed of multiple proteins, including at least one MTase, which recognizes and methylates a base contained within a specific sequence motif, and one restriction endonuclease (REase), which cleaves foreign DNA with a methylation pattern different from that of the host DNA. RM systems are subdivided into four main classes that differ in subunit composition, motif characteristics, cofactor requirements, and location of DNA cleavage (for review, see [5]). In brief, Type I RM systems are composed of two restriction subunits (R), two methylation subunits (M) and one specificity subunit (S), which recognizes specific DNA sequences. Recognized motifs are asymmetric and bipartite. Type II systems include one R and one M subunit which can function independently, and recognized motifs are mostly symmetric. Type III systems are hetero-oligomers composed of a mod subunit (recognizes and modifies DNA) and a res subunit that is only active in a mod-res complex. Type IV systems are the only RM systems that recognize methylated, rather than unmethylated, sites. Methylation in bacteria also influences critical processes including gene regulation, cell cycle control, pathogenicity, and DNA repair [2].
Despite the important implications of bacterial methylation, its distribution, diversity, and functional consequences have not been extensively investigated. This paucity of data can, in part, be attributed to technological limitations. Methylation studies in eukaryotes have been facilitated by the development of detection methods for m5C, including bisulfite conversion, which allows for genome-wide modification analyses. Comparable methods have not been available for the detection of m6A and m4C until recent advances in sequencing technology. SMRT sequencing couples whole-genome sequencing with the simultaneous detection of base modifications using kinetic signals during DNA polymerization [6, 7]. This new technology has led to insights regarding the methylomes of several bacterial species [8-12]. However, methylation is widespread throughout the bacterial kingdom and is very diverse [13]. Thus, more studies are needed to gain a comprehensive understanding of the distribution and diversity of methylation motifs and their associated MTases, and ultimately to comprehend methylation functions and evolutionary history in these organisms. Salmonella enterica is the leading cause of death and hospitalizations due to foodborne pathogens each year [14]. Previous studies have shown that the methylation of the Gm6ATC motif by the MTase Dam is an essential factor in the virulence of Salmonella, and that a lack of methylation leads to attenuation in animal models [15]. Subsequent studies have elucidated the mechanisms by which some virulence genes are regulated by Dam, including the plasmid-encoded fimbriae (pef) locus [16] and the std fimbrial operon [17]. In addition, Dam regulates both the phase variation of STM2209-STM2208 which alters lipopolysaccharide O-antigen side chain length [18], and the phase variation of the phage P22 glucosyltransferase (gtr) operon which controls O-antigen glucosylation [19]. 
Thus, it is possible that the methylation of other motifs in Salmonella also may have implications for virulence, pathogenicity, and other functions. Here, we sequenced and closed the genomes of six Salmonella enterica isolates from five serovars. We then analysed their methylomes, along with the methylomes of four additional serovars that we sequenced previously [11, 20–22], and employed a bioinformatics approach to identify methyltransferases and match them to observed methylated motifs in the genomes. We also examined how methylation patterns varied between Salmonella serovars.

Materials and Methods

We selected five serovars of Salmonella enterica subsp. enterica from our in-house strain collection at the FDA-CFSAN. These included Salmonella enterica subsp. enterica serovar Bareilly (S. Bareilly), S. Abaetetuba, S. Abony, S. Anatum, S. Bredeney, S. Montevideo, and two isolates of S. Enteritidis. We also included data from four serovars we sequenced previously, S. Javiana, S. Typhimurium, S. Heidelberg, and S. Cubana [11, 20–22] (see Table 1 for strain names and accession numbers).
Table 1

Summary of Salmonella genomes sequenced in this study.

| Serovar | Chromosome size (bp) | Plasmid size (bp) | GenBank Accession (chromosome) | GenBank Accession (plasmid) | Phage | MTase on phage (specificity, if known) | MTase on plasmid (specificity, if known) |
| S. Bareilly CFSAN000189 | 4730612 | 78193 | CP006053.1 | CP006054.1 | Salmon_Fels_1_NC_010391; Gifsy_1_NC_010392 | _ | M.SbaUORF280P |
| S. Abony CFSAN001275 | 4737447 | NA | CP007534.1 | _ | Entero_ST64T_NC_004348; Gifsy_2_NC_010393 | _ | _ |
| S. Anatum CFSAN000665 | 4706101 | NA | CP007531.1 | _ | Salmon_Fels_1_NC_010391; Gifsy_1_NC_010392 | M.SenAnaORF14155P | _ |
| S. Cubana CFSAN002050 | 4977480 | 166,668; 122,863 | CP006055.1 | CP006056.1; CP006057.1 | Gifsy_1_NC_010392; Salmon_vB_SemP_Emek_NC_018275 | _ | M.Sen2050ORF235P (GATC); M.Sen2050ORF245P; M.Sen2050ORF400P; M.Sen2050ORF480P (CAGCTG) |
| S. Heidelberg CFSAN002069 | 4783943 | 110,363; 37,679 | CP005390.2 | CP005389.2; CP005391.2 | Entero_P22_NC_002371; Gifsy_2_NC_010393 | M.Sen2069ORF4005P (GATC) | M.Sen2069ORF23325P |
| S. Heidelberg CFSAN002064 | 4783867 | 37692 | CP005995.1 | CP005994.1 | Entero_P22_NC_002371; Gifsy_2_NC_010393 | M.Sen2069ORF21380P (GATC) | _ |
| S. Javiana CFSAN001992 | 4634161 | 24,012; 17,094 | CP004027.1 | CP004026.1; CP004028.1 | Gifsy_2_NC_010393; Salmon_RE_2010_NC_019488; Entero_PsP3_NC_005340 | M.SenJORF19790P (GATC) | _ |
| S. Montevideo CFSAN000255 | 4694375 | NA | CP007530.1 | _ | Salmon_vB_SosS_Oslo_NC_018279; Entero_Fels_2_NC_010463 | M.Sen255II (ATGCAT) | _ |
| S. Enteritidis CFSAN000158 | 4679662 | 59369 | CP007528.1 | CP007529.1 | Salmon_RE_2010_NC_019488; Gifsy_2_NC_010393 | M.Sen158III (GATC) | _ |
| S. Enteritidis CFSAN000111 | 4679081 | 39599 | CP007598.1 | CP007599.1 | Gifsy_2_NC_010393; Salmon_RE_2010_NC_019488 | M.Sen1427ORF7910P (GATC) | _ |
| S. Typhimurium CFSAN001921 | 4859931 | 3,609; 4,675; 221,009 | CP006048.1 | CP006052.1; CP006051.1; CP006050.1 | Salmon_ST64B_NC_004313; Gifsy_1_NC_010392; Gifsy_2_NC_010393; Entero_ST104_NC_005841 | _ | M.SenTFORF23885P (CAGCTG); M.SenTFORF24805P (CCNGG) |
Each strain was plated onto Trypticase Soy Agar and incubated overnight at 37°C. Cells were then inoculated into Trypticase Soy Broth for DNA extraction. A 1-ml aliquot was pelleted, and genomic DNA was extracted using the DNeasy Blood and Tissue kit from Qiagen (Qiagen, CA, USA). All samples were analyzed at the exponential stage of growth. DNA was sheared to approximately 10 kb using a Covaris g-TUBE (Covaris, Inc.; Woburn, MA). SMRTbell 10 kb template libraries were prepared using DNA Template Prep Kit 2.0 and the Low-Input 10 kb Library Protocol (Pacific Biosciences; Menlo Park, CA, USA). In brief, DNA was concentrated, repaired, ligated to hairpin adapters, and purified. Incompletely formed SMRTbell templates were digested with a combination of Exonucleases III and VII. Adapters were annealed, and SMRT sequencing was carried out on the PacBio RS II (Pacific Biosciences; Menlo Park, CA, USA) using standard protocols. Analysis of sequence reads was performed using SMRT Analysis 1.10 and the SMRT Portal 2.0 platform (Pacific Biosciences). De novo assembly was performed using the Hierarchical Genome Assembly Process (HGAP) with default parameters [23]. HGAP consists of three steps to ensure high accuracy. First, Basic Local Alignment with Successive Refinement (BLASR) is used to align all reads to the longest seed reads, and a consensus is generated to create pre-assembled reads. Pre-assembled reads are then assembled using the Celera assembler. Finally, all reads are mapped to the de novo assembly, and final consensus and accuracy scores are determined using the Quiver consensus algorithm. HGAP outputs assemblies with overlapping regions at the ends. Coordinates of this region were identified using dot plots in Gepard [24], and the region was trimmed from one end to circularize the genome. Genomes were checked manually for even sequencing coverage.
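The end-trimming step can be illustrated with a toy sketch. The version below is not the procedure used here (the overlap coordinates were read from Gepard dot plots); it assumes, for illustration only, that the two contig ends overlap with an exact match, which real long-read assemblies rarely do. The function name `trim_terminal_overlap` and the `min_overlap` parameter are hypothetical.

```python
def trim_terminal_overlap(contig: str, min_overlap: int = 4) -> str:
    """Remove the redundant copy of a terminal overlap from a linear contig.

    Toy assumption: the prefix and suffix of the contig are identical over
    the overlap. Real data would need an alignment that tolerates errors.
    """
    # Try the longest candidate overlap first (at most half the contig).
    for k in range(len(contig) // 2, min_overlap - 1, -1):
        if contig[:k] == contig[-k:]:
            # Keep one copy of the overlap by dropping the trailing copy.
            return contig[:-k]
    # No exact terminal overlap of the required length was found.
    return contig


if __name__ == "__main__":
    print(trim_terminal_overlap("ACGTTTGGCAACGT"))
```

Trimming from one end only, as described above, leaves a single copy of the shared region so that the last base of the trimmed contig joins the first base in the circular molecule.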
Genomes were annotated using the NCBI (National Center for Biotechnology Information) Prokaryotic Genomes Automatic Annotation Pipeline [25] (http://www.ncbi.nlm.nih.gov/genomes/static/Pipeline.html). Prophages were detected using PHAST [26]. Only prophages scored as intact are reported here. We excluded putative intact prophages that did not show significant sequence similarity to known phages using the Basic Local Alignment Search Tool (BLAST) with default parameters. Motif detection and analysis were also carried out using SMRT Analysis 1.1 and the RS_Modification_and_Motif_Analysis.1 protocol as described at http://www.pacb.com/pdf/TN_Detecting_DNA_Base_Modifications.pdf. Interpulse durations (IPDs) were measured based on the kinetic signals [7] and processed as described previously [6]. At each position in the genome, the observed IPD was compared to the IPD of an in-silico control using a two-sample t-test, and a QV score was calculated as QV = -10 log10(p-value). Bases were accepted as modified based on a minimum QV threshold value. QV 30 was used as a threshold for preliminary analyses. A plot of QV versus coverage was then constructed using publicly available R scripts found at: https://github.com/PacificBiosciences/motif-finding. The observed bimodal distribution of kinetic data, resulting from modified and unmodified positions, was then used to determine a more stringent QV threshold (S1 Fig). Only sites with a minimum of 25x coverage were included. Motifs were identified using the algorithm MotifMaker. m6A and m4C motifs can be reliably detected with 25x coverage across all positions in the genome, but m5C requires either significantly higher coverage (~100x) or Tet-methylation for confident detection. In this study we report only m6A and m4C methylations.
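As a worked illustration of the score and filters described above, the following minimal sketch converts a p-value to a QV and applies the preliminary thresholds (QV 30, 25x coverage). It assumes the logarithm is base 10, as in Phred-style scores; the function names and default arguments are hypothetical and are not part of SMRT Analysis.

```python
import math


def modification_qv(p_value: float) -> float:
    """Phred-style modification score: QV = -10 * log10(p-value)."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must be in (0, 1]")
    return -10.0 * math.log10(p_value)


def passes_filters(p_value: float, coverage: int,
                   min_qv: float = 30.0, min_coverage: int = 25) -> bool:
    """Accept a position only if both coverage and QV reach their minimums."""
    return coverage >= min_coverage and modification_qv(p_value) >= min_qv


if __name__ == "__main__":
    # A p-value of 1e-4 corresponds to a QV of about 40.
    print(modification_qv(1e-4), passes_filters(1e-4, coverage=30))
```

Under this definition, each tenfold decrease in the p-value adds 10 to the QV, so the more stringent threshold chosen from the bimodal QV distribution corresponds to a smaller accepted p-value.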
To identify MTases, assembled genomes were scanned for homologs of RM system genes using in-house software (e value < 1e-11) to identify putative MTases as previously described [10]. Predicted specificities were assigned to candidate MTases based on specificities of the known MTases. The presence of functional motifs and information regarding the placement of the gene within the genome were also used to support or reject those assignments, as were known characteristics of different MTase types. For example, Type III MTases and most Type IIG systems only methylate one strand of their recognition sequence, whereas Type I systems have bipartite recognition sequences. MTase candidates with predicted specificities were matched where possible with observed motifs found in our motif analyses. If a single candidate MTase existed for an observed motif, then that gene was assumed to be responsible for that particular specificity. If multiple candidates existed for a single motif, no MTase was assigned. When making assignments of new motifs to specific MTases, we always cross-checked the matched gene against other similar genes in REBASE and against the unassigned motifs from the more than 700 other genomes for which we have PacBio data. In many cases, the same motif occurred in a different genome with an essentially identical methyltransferase or specificity subunit protein sequence, adding weight to the strength of the assignment. Raw processed PacBio data files were deposited in the Sequence Read Archive (SRA) database of the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/sra) (S2 Table) and MTase information and sequences were deposited in REBASE (http://rebase.neb.com/rebase/rebase.html).
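The assignment rule stated above (one candidate: assign; several or no candidates: leave unassigned) can be sketched as follows. This is an illustrative simplification, not the in-house software: it assumes predicted specificities are given as plain motif strings, and the enzyme names in the example are placeholders rather than REBASE entries.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def assign_mtases(observed_motifs: Iterable[str],
                  predicted: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """Map each observed motif to an enzyme only when exactly one candidate fits.

    `predicted` maps a candidate enzyme name to its predicted motif,
    or to None when no specificity could be predicted.
    """
    candidates_by_motif = defaultdict(list)
    for enzyme, motif in predicted.items():
        if motif is not None:
            candidates_by_motif[motif].append(enzyme)

    assignments = {}
    for motif in observed_motifs:
        candidates = candidates_by_motif.get(motif, [])
        # Zero or multiple candidates: no enzyme is assigned.
        assignments[motif] = candidates[0] if len(candidates) == 1 else None
    return assignments


if __name__ == "__main__":
    predicted = {"enzyme_A": "CAGAG", "enzyme_B": "GATC",
                 "enzyme_C": "GATC", "enzyme_D": None}
    print(assign_mtases(["CAGAG", "GATC", "ACCANCC"], predicted))
```

In this toy input, the motif with two matching candidates stays unassigned, which mirrors why several GATC-specific enzymes could not be resolved by sequence similarity alone.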

Results and Discussion

Genome Assemblies

Each genome was assembled into a single, circular chromosomal contig and up to three plasmids. Consensus accuracy scores were at least 99.9999% for all assemblies. Sizes of Salmonella chromosomes ranged from 4,547,600 to 4,977,480 bp, and plasmid sizes ranged from 3,609 to 221,009 bp (Table 1). Sequences were deposited in GenBank. Putative prophages and BLAST alignment data are reported in Table 1.

Methylation Patterns

This is the first comparative report of genome-wide methylation patterns in the pathogenic bacterium Salmonella enterica. We analyzed the methylomes of five Salmonella enterica subsp. enterica serovars, including two isolates of S. Enteritidis, and sequenced and released their closed genomes. We present those results, along with data from four additional Salmonella serovars, S. Javiana, S. Typhimurium, S. Heidelberg, and S. Cubana, which we analyzed previously [11, 20–22]. In total, we observed 18 motifs among the nine Salmonella serovars: 16 m6A motifs; one m4C motif, m4CCWWGG; and one motif, Gm6ATGN5G4m GC, that is methylated by a Type I system with both m6A and m4C activities (Fig 1; an underscore represents the base which is methylated in the opposite DNA strand; W = A or T). Eight of the motifs were novel, i.e., they have not been previously observed in any bacterial species. We were able to match 13 of the Salmonella motifs to their respective MTase enzymes in most of the serovars tested (S1 Table).
Fig 1

The methylomes of eleven Salmonella serovars.

Violet = MTase identified, light violet = MTase unknown, red = novel motif, orange = MTase present in genome, motif not observed. Light violet/red stripe = novel motif, MTase unknown. Roman numerals indicate MTase type (I – III). Note the majority of novel MTases are Type I systems.

Several motifs were common among multiple serovars, while other motifs were unique to specific serovars. All Salmonella serovars examined contained the methylated motifs ATGCm6AT, CAGm6AG, and Gm6ATC. In all serovars, we identified a Type III MTase responsible for the methylation of CAGm6AG, and an extremely common Type II MTase was found to methylate the ATGCm6AT motif (see Table 2 for a list of enzyme names specific to each strain). The methylation of ATGCm6AT was never complete (38–78.5%). This MTase is usually active in Salmonella, although rarely active in E. coli, and is not thought to be an essential gene [27]. Confident assignment of an MTase to the Gm6ATC motifs could only be performed in eight of the eleven isolates: two were orphan MTases, and the remainder were common Type II enzymes. In multiple serovars, we identified candidate enzymes that have the potential to methylate this motif (Table 3).
Table 2

Summary of motifs and methyltransferases found in each Salmonella genome.

| Serovar | Enzyme Assignment | Gene Locus_Tag (GenBank) | Type | Sub-Type | Motif Observed | Motif Unique* | % Methylated (5'-3'/3'-5') | Number of Methylated Motifs (5'-3' strand/3'-5' strand) | Number of Motifs in Genome (5'-3' strand/3'-5' strand) |
| S. Bareilly CFSAN000189 | M.SbaUI | SEEB0189_17520 | III | beta | CAGm6AG | no | 97.7 | 5652 | 5787 |
|  | M.SbaUII | SEEB0189_19945 | II | beta | Cm6AGCTG | no | 88.2 | 1466 | 1662 |
|  | M.SbaUIII | SEEB0189_19740 | I | gamma | CCGm6ANNNNNGTC | yes | 98.6/98.6 | 482/482 | 489/489 |
|  | M.SbaUIV | SEEB0189_02925 | II | beta | ATGCm6AT | no | 78.5 | 1093 | 1392 |
|  | M.SbaUDam | SEEB0189_02450 | Orphan | _ | G6mATC | no | 98.6/98.6 | 37148 | 37688 |
| S. Abony CFSAN001275 | M.SenAboI | SEEA0014_11325 | III | beta | CAGm6AG | no | 96.8 | 5391 | 5569 |
|  | M.SenAboII | SEEA0014_03225 | II | beta | ATGCm6AT | no | 38 | 283 | 744 |
|  | M.SenAboIV | SEEA0014_08865 | I | gamma | GAm6ACNNNNNNNTTA | yes | 94.9/93.5 | 410/404 | 432/432 |
|  | M.SenAboDam | SEEA0014_03700 | Orphan | alpha | G6mATC | no | 95.1/95.1 | 35607 | 37436 |
|  | M1.SenAboIII | SEEA0014_08700 | I | gamma | G6mATGNNNNNG4mGC/ G4mCCNNNNNCATC | yes | 96.1/31.0 | 1260/406 | 1311/1311 |
|  | M2.SenAboIII | SEEA0014_08705 | I | gamma | G6mATGNNNNNG4mGC/ G4mCCNNNNNCATC | yes | 96.1/31.0 | 1260/406 | 1311/1311 |
| S. Anatum CFSAN000665 | M.SenAnaI | SEEA1592_11695 | I | gamma | CCm6ANNNNNNNNTGAG | yes | 99.7/99.4 | 354/353 | 355/355 |
|  | M.SenAnaII | SEEA1592_09525 | III | beta | CAGm6AG | no | 100.0 | 5509 | 5511 |
|  | M.SenAnaIII | SEEA1592_17520 | II | beta | ATGCm6AT | no | 66.7/66.7 | 674 | 1010 |
|  | M.SenAnaIV | SEEA1592_11855 | II | beta | 4mCCWWGG | no | 83.3/83.3 | 1423 | 1708 |
|  | M.SenAnaDam | SEEA1592_01330 | Orphan | _ | G6mATC | no | 99.8/99.8 | 37140 | 37224 |
| S. Cubana CFSAN002050 | M.Sen2050I | CFSAN002050_08375 | III | beta | CAGm6AG | no | 95.1 | 6235 | 6558 |
|  | M.Sen2050II | CFSAN002050_23900 | II | beta | ATGCm6AT | no | 45.1/45.1 | 510 | 1131 |
|  | _ | _ | I | _ | GGm6ANNNNNNATTA | yes | 92.7/92.3 | 459/457 | 495/495 |
|  | _ | _ | I | _ | TCm6ANNNNNGTTY | yes | 95.5/92.3 | 1248/1338 | 1352/1352 |
| S. Heidelberg CFSAN002064 | M.Sen2064I | CFSAN002064_15765 | I | gamma | Gm6AGNNNNNNRTAYG | no | 97.9/97.5 | 231/230 | 236/236 |
|  | M.Sen2064II | CFSAN002064_18310 | III | beta | CAGm6AG | no | 98.2 | 5587 | 5691 |
|  | M.Sen2064III | CFSAN002064_10125 | II | beta | ATGCm6AT | no | 42.4 | 319 | 752 |
|  | _ | _ | II | _ | ACCm6ANCC | no | 99.4 | 2703 | 2719 |
| S. Heidelberg CFSAN002069 | M.Sen2069I | CFSAN002069_07060 | III | beta | CAGm6AG | no | 97.9 | 5816 | 5939 |
|  | M.Sen2069II | CFSAN002069_09575 | I | gamma | Gm6AGNNNNNNRTAYG | no | 97.5/97.9 | 238/239 | 244/244 |
|  | M.Sen2069III | CFSAN002069_15235 | II | beta | ATGCm6AT | no | 42.2/42.2 | 217 | 514 |
|  | _ | _ |  |  | ACCm6ANCC |  | 99 | 2747 | 2774 |
| S. Javiana CFSAN001992 | M.SenJI | CFSAN001992_09405 | III | beta | CAGm6AG | no | 97.8 | 5410 | 5523 |
|  | M.SenJII | CFSAN001992_11490 | I | gamma | CCm6AYNNNNNRTANNC | yes | 98.1/97.7 | 474/472 | 483/483 |
|  | M.SenJIII | CFSAN001992_16620 | II | beta | ATGCm6AT | no | 58.8/58.8 | 803 | 1364 |
|  | _ | _ |  |  | G6mATC | no | 98.9/98.9 | 36330 | 36738 |
| S. Montevideo CFSAN000255 | M.Sen255I | Y007_00590 | III | beta | CAGm6AG | no | 99.4 | 5504 | 5535 |
|  | M.Sen255II | Y007_12075 | II | beta | ATGCm6AT | no | 50.0/50.0 | 387 | 774 |
|  | _ | _ | I | _ | CG6mAYNNNNNNNRTRTC | yes | 99.1/98.9 | 439/438 | 443/443 |
|  | _ | _ | II | alpha | G6mATC | no | 99.1/99.1 | 36866 | 37204 |
|  | _ | _ | I | _ | GCm6ANNNNNNCTGA | no | 98.6/99.5 | 554/559 | 562/562 |
| S. Enteritidis CFSAN000111 | M.Sen1427II | SEEE1427_7355 | I | gamma | CGm6ANNNNNNTRCC | no | 98.4/97.9 | 1721/1712 | 1749/1749 |
|  | _ | _ | II |  | G6mATC | no | 98.8/98.8 | 36824 | 37256 |
|  | M.Sen1427I | SEEE1427_9465 |  |  | CAGm6AG | no | 99.2 | 5505 | 5549 |
|  | M.Sen1427III |  |  |  | ATGC6mAT | no | 43 | 315 | 732 |
| S. Enteritidis CFSAN000158 | M.Sen158I | SEEE0968_18850 | III | beta | CAGm6AG | no | 98.1 | 5490 | 5599 |
|  | M.Sen158II | SEEE0968_20955 | I | gamma | CGm6ANNNNNNTRCC | no | 99.0/98.0 | 1739/1722 | 1757/1757 |
|  | M.Sen158III | SEEE0968_03950 | II | beta | ATGC6mAT | no | 41.3/41.3 | 302 | 732 |
| S. Typhimurium CFSAN001921 | M.SenTFI | CFSAN001921_15255 | III | beta | CAGm6AG | no | 89.3 | 5635 | 6308 |
|  | M.SenTFII | CFSAN001921_17800 | I | gamma | CRTm6AYNNNNNNCTC | no | 90.7/89.1 | 233/229 | 257/257 |
|  | M.SenTFIII | CFSAN001921_00055 | II | beta | ATGCm6AT | no | 60.9 | 630 | 1035 |
|  | SenTFIV | CFSAN001921_17955 | II | alpha | GATC6mAG | no | 94.3 | 2841 | 3011 |

*A unique motif refers to one that has not been previously observed in any bacterial species.
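The percentage column in Table 2 follows from the two count columns. As a minimal check, assuming the percentage is simply the number of methylated motif sites divided by the number of motif sites in the genome, the sketch below recomputes two of the tabulated values (the function name is hypothetical).

```python
def percent_methylated(n_methylated: int, n_sites: int) -> float:
    """Percentage of motif sites detected as methylated, to one decimal place."""
    if n_sites <= 0:
        raise ValueError("n_sites must be positive")
    return round(100.0 * n_methylated / n_sites, 1)


if __name__ == "__main__":
    # ATGCm6AT in S. Bareilly CFSAN000189: 1093 of 1392 sites.
    print(percent_methylated(1093, 1392))
    # CAGm6AG in S. Typhimurium CFSAN001921: 5635 of 6308 sites.
    print(percent_methylated(5635, 6308))
```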

Table 3

Methyltransferases identified in the Salmonella serovars, but not assigned to a motif.

| Serovar | Enzyme Assignment | Type | SubType | Motif (if known) |
| S. Bareilly CFSAN000189 | M.SbaUORF19730P | I | gamma | - |
|  | M.SbaUORF280P | II | beta | - |
| S. Abony CFSAN001275 | M.SenAboORF8720P | I | gamma | - |
| S. Anatum CFSAN000665 | M.SenAnaDamP | Orphan | alpha | G6mATC |
|  | M.SenAnaORF14155P | II | alpha | G6mATC |
| S. Cubana CFSAN002050 | M.Sen2050DamP | Orphan | alpha | G6mATC |
|  | M.Sen2050ORF235P | II | alpha | G6mATC |
|  | M.Sen2050ORF245P | II | gamma | - |
|  | M.Sen2050ORF400P | II | gamma | - |
|  | M.Sen2050ORF480P | II | beta | - |
|  | M.Sen2050ORF4940P | I | gamma | - |
|  | M.Sen2050ORF5885P | I | gamma | - |
| S. Heidelberg CFSAN002064 | M.Sen2064DamP | Orphan | alpha | G6mATC |
|  | M.Sen2064ORF21380P | II | alpha | G6mATC |
|  | Sen2064ORF15615P | II | G,S | GATC6mAG |
| S. Heidelberg CFSAN002069 | M.Sen2069DamP | Orphan | alpha | G6mATC |
|  | M.Sen2069ORF4005P | II | alpha | G6mATC |
|  | M.Sen2069ORF23325P | II | beta | - |
|  | Sen2069ORF9735P | II | G,S | GATC6mAG |
| S. Javiana CFSAN001992 | M.SenJORF11520P | I | gamma | - |
|  | M.SenJDamP | Orphan | alpha | G6mATC |
|  | M.SenJORF19790P | II | alpha | G6mATC |
|  | M.SenJORF20475P | II | alpha | G6mATC |
|  | M.SenJORF6415P | II |  | G6mATC |
| S. Montevideo CFSAN000255 | M.Sen255DamP | Orphan | alpha | G6mATC |
|  | M.Sen255ORF17075P | II | alpha | G6mATC |
|  | M.Sen255ORF20925P | I | gamma | - |
|  | M.Sen255ORF5995P | I | gamma | - |
| S. Enteritidis CFSAN000111 | M.Sen1427DamP | Orphan | alpha | G6mATC |
|  | M.Sen1427ORF7380P | I | gamma | - |
|  | M.Sen1427ORF7910P | II | alpha | G6mATC |
| S. Enteritidis CFSAN000158 | M.Sen158DamP | Orphan | alpha | G6mATC |
|  | M.Sen158ORFDP | II | alpha | G6mATC |
|  | M.Sen158ORF20930P | I | gamma | - |
| S. Typhimurium CFSAN001921 | M.SenTFDamP | Orphan | alpha | G6mATC |
|  | M.SenTFORF6885P | II |  | G6mATC |
|  | M.SenTFORF23885P | II | beta | Cm6AGCTG |
|  | M.SenTFORF24320P | II |  | - |
|  | M.SenTFORF3520P | III | beta | - |

a. m5C MTases are not included.

Other observed motifs were common among a subsection of the serovars examined. For example, S. Typhimurium and both isolates of S. Heidelberg contained the common motif Gm6AGN6RTAYG that is methylated by a Type I MTase. Six of the nine serovars, S. Bareilly, S. Abony, S. Cubana, S. Javiana, S. Montevideo, and S. Anatum, contained a motif not found in the other serovars tested (Fig 1). For example, in S. Anatum, we observed the motif CCm6AN8TGAG. Fig 2 shows the kinetic signals of three of these motifs. In most cases these unique motifs were strongly methylated. Several novel motifs were not matched to any MTases, including GGm6AN6ATTA and RAm6ACN5TGA in S. Cubana, and CGm6AYN7RTRTC in S. Montevideo.
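Motifs such as those above are written with IUPAC degenerate codes (for example R = A or G, Y = C or T, W = A or T, N = any base). The following sketch shows one way to locate candidate sites for such a motif in a sequence by translating it to a regular expression; it is an illustration only, not MotifMaker, and the function names are hypothetical. Methylation marks (m6A, m4C) must be stripped from the motif string before searching.

```python
import re
from typing import List

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Compile an IUPAC motif; the lookahead lets overlapping sites be found."""
    body = "".join(IUPAC[base] for base in motif.upper())
    return re.compile("(?=(" + body + "))")


def find_sites(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of the motif on the given strand."""
    return [m.start() for m in motif_to_regex(motif).finditer(sequence.upper())]


if __name__ == "__main__":
    # Unmodified form of Gm6AGN6RTAYG on a made-up sequence.
    print(find_sites("TTGAGACGTAAATACGCC", "GAGNNNNNNRTAYG"))
```

Asymmetric motifs, such as the bipartite Type I motifs, would also need to be searched on the reverse complement to count sites on both strands; that step is omitted here for brevity.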
Fig 2

Diagram of the interpulse duration (IPD) ratio of three novel motifs identified in three Salmonella serovars.

Vertical axis = IPD ratio, horizontal axis = genome position. IPD ratio listed next to bar. A. Motif CCm6AN8TGAG in S. Anatum. B. Motif RAm6ACN5TGA in S. Cubana. C. Motif CGm6AYN7RTRTC in S. Montevideo.

Several observed motifs could not be assigned to a single MTase. In some cases, there were multiple MTases with predicted specificities that matched that of an observed motif. In these cases, it was not possible to predict which enzyme was responsible for the methylation of the observed motifs, and thus no enzyme was assigned. Furthermore, we could not rule out the possibility that multiple enzymes methylated the same motif, as has been observed with Gm6ATC [28]. MTases may also be promiscuous [29], i.e., they methylate multiple motifs, making a match to any single motif unrealistic. In some cases, there was no MTase present in the genome with a specificity predicted to recognize an observed methylated motif. On other occasions, we did not observe the methylation of a motif that we predicted would be present based on a putative MTase identification. For example, in S. Heidelberg CFSAN002064, we detected the gene for the putative methyltransferase Sen2064ORF15615P, and predicted that it would be responsible for GATCm6AG methylation. However, we did not observe the activity of this methyltransferase in S. Heidelberg, which suggests that the enzyme was inactive. Inactivity can result from a mutation that disables the enzyme; alternatively, the enzyme may be functional but not active at the time of analysis. For example, some MTases may be inactive due to transcriptional silencing, as is often found when the genes are present as part of a prophage [30]. Furthermore, an MTase may be transcribed but, for unknown reasons, may not routinely modify its target motif [12]. Cloning MTase genes has been shown to be a useful approach for their characterization [6], and may help to match motifs to predicted MTases in cases where bioinformatics alone was insufficient.
This approach should be incorporated into future studies that target particular MTases. For example, cloning Sen2064ORF15615P in an expression vector would resolve whether the enzyme is intrinsically inactive or was simply not functional in S. Heidelberg at the time of analysis. We cannot completely rule out the possibility that DNA MTase genes exist that show no similarity to characterized MTase genes. However, with methylation data from more than 700 genomes available and almost 2,500 characterized and 50,000 putative MTase genes identified in REBASE, the chances of finding a completely new way of methylating DNA are becoming increasingly small. In particular, we rarely come across a case where we can be certain that there are insufficient MTases to account for the observed patterns of methylation. However, in Salmonella enterica subsp. enterica serovar Heidelberg CFSAN002064, the methylated motif ACCm6ANCC occurs without a matching MTase, which may indicate that a plasmid is missing. This contrasts with CFSAN002069, which also has this motif, but does have a potential plasmid-encoded MTase. In other cases we have observed that this motif is present in strains containing plasmids (R.J. Roberts, unpublished). Furthermore, as more genome sequence data and PacBio methylation data appear, our ability to predict recognition sequences from sequence data alone is growing. Already, rules are becoming apparent for predicting the specificity of Type IIG enzymes [31]. Most of the novel motifs observed in each serovar were modified by Type I RM systems (Fig 1). Type I systems have a modular structure that may allow sequence specificities to diversify more easily than the structures of other RM types (for review, see [32]). Each system consists of two methylase (M) subunits, two restriction endonuclease (R) subunits, and one sequence specificity (S) subunit [33, 34]. The S subunit has two target recognition domains (TRDs), each of which recognizes one half of the target motif.
Recombination events may occur on the S subunit, either within a single TRD or within the sequence that joins the two, resulting in novel specificity. Also, R and M subunits may interact with foreign S subunits entering the cell, also resulting in novel specificity. This has been observed in Lactococcus [35]. One interesting Type I motif, Gm6ATGN5G4m GC, is exhibited by the specificity subunit of the SenAboIII system. This example of cooperation between an m6A methylase and an m4C methylase is quite rare and has only been infrequently observed previously (R. Morgan, unpublished observations). Unique motifs found among closely related taxa may be the result of horizontal gene transfer (HGT). Studies have demonstrated that HGT accounts for the movements of RM systems based on evidence of codon usage bias [36] and differential GC content of RM genes [37]. We identified several MTases that are located on prophages and plasmids, indicating possible mechanisms of transfer (Table 1). Also, through BLAST similarity searches against REBASE we found that several MTase sequences are most similar, or highly similar, to enzymes in Enterobacteriaceae genera other than Salmonella, suggesting that these systems may have been acquired via HGT. For example, M.SbaUII from S. Bareilly, which methylates the motif Cm6AGCTG, is most similar to an MTase found in Pectobacterium. Currently, we are building a robust Salmonella phylogeny, including representatives of other Enterobacteriaceae genera, to test these and other evolutionary hypotheses. In some taxa, we detected a proportion of motifs that were not fully methylated within the genome. In particular, only 38–78.5% of ATGCm6AT sites across the genome were methylated, and 89.3–100% of CAGm6AG sites were methylated (Table 2). Orphan MTases or RM systems with an inactive REase often do not methylate all sites in the genome, as complete methylation at all sites to protect from cleavage is usually unnecessary. 
Incomplete methylation may also be due to the fact that cells are analyzed at different times during the cell cycle, or methylation at certain sites may be inhibited by DNA binding proteins [38]. Environmental factors, including culture conditions, may also affect the frequency of methylation [9, 39]. Incomplete methylation may play a role in the regulation of gene expression. Thus, studies examining the functional implications of ATGCm6AT and CAGm6AG methylation will be particularly interesting. In several of the genomes, ATGCm6AT methyltransferases are biased towards preferentially methylating this motif when preceded by a cytosine, a thymine, or both. For example, in S. Heidelberg CFSAN002069, AATGCm6AT and GATGCm6AT are methylated at lower frequencies than TATGCm6AT and CATGCm6AT. All four motifs are found in a roughly 1:1:1:1 ratio throughout the genome, indicating a true bias in methyltransferase activity. Currently, we are investigating the biological significance of these observations. Interestingly, 20 ATGCm6AT motifs are present in a collection of 101 previously characterized Salmonella virulence genes [40], and ten of these are AATGCm6ATs, a much higher proportion than what is expected by chance.
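The flanking-base comparison described above can be sketched as a simple tally. The example below groups ATGCAT sites by the base immediately upstream and reports the fraction of sites in each group that were called methylated. It is a toy illustration under stated assumptions: the sequence is treated as linear, only one strand is scanned, and `methylated_starts` is a hypothetical set of 0-based site start positions taken from a modification call set.

```python
from collections import Counter
from typing import Dict, Set


def methylation_by_upstream_base(sequence: str, methylated_starts: Set[int],
                                 core: str = "ATGCAT") -> Dict[str, float]:
    """Fraction of core-motif sites methylated, grouped by the preceding base."""
    seq = sequence.upper()
    totals, hits = Counter(), Counter()
    # Start at index 1 so that every site has an upstream base.
    start = seq.find(core, 1)
    while start != -1:
        upstream = seq[start - 1]
        totals[upstream] += 1
        if start in methylated_starts:
            hits[upstream] += 1
        start = seq.find(core, start + 1)
    return {base: hits[base] / totals[base] for base in totals}


if __name__ == "__main__":
    toy = "CATGCATTATGCATGATGCAT"  # sites preceded by C, T and G
    print(methylation_by_upstream_base(toy, {1, 8}))
```

Comparing the per-context totals against each other is what supports the roughly 1:1:1:1 statement, while the per-context fractions expose any bias in methyltransferase activity.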

Conclusions

In total, we observed 18 motifs among the nine Salmonella serovars, eight of which are novel. These findings indicate the diversity of motifs present in Salmonella enterica. The functions of the observed motifs are unknown, except for Gm6ATC, which has been well studied and is involved in a variety of biological processes including virulence [15]. In E. coli, methylation of CTGCm6AG by the MTase M.EcoGIII has been shown to affect the transcription of over 30% of genes [12]. It is possible that the methylation of the Salmonella motifs described here also plays a role in virulence and other cell functions, and thus merits further study. Future studies should also continue to explore how methylation patterns vary across serovars, and examine within-serovar variation. Methylation may be useful as a typing marker, as closely related taxa are often difficult to differentiate using morphological and molecular markers. The reconstruction of a Salmonella phylogeny, along with the analysis of the methylomes, will allow us to address these issues and gain a broader view of the evolutionary history and functional significance of methylation within the genus.

S1 Fig. Kinetic score (QV) vs. sequencing coverage for adenine residues in S. Heidelberg CFSAN002064.

The line indicates the QV cutoff used for MTase specificity determination. (TIF)

S1 Table. Explanation of MTase assignments.

(XLSX)

S2 Table. SRA Accession numbers.

(XLSX)