Evolution of enterohemorrhagic Escherichia coli O26 based on single-nucleotide polymorphisms.

Stefan Bletz, Martina Bielaszewska, Shana R Leopold, Robin Köck, Anika Witten, Jörg Schuldes, Wenlan Zhang, Helge Karch, Alexander Mellmann.

Abstract

Enterohemorrhagic Escherichia coli (EHEC) O26:H11/H⁻ is the predominant non-O157 EHEC serotype among patients with diarrhea, bloody diarrhea, and hemolytic uremic syndrome (HUS) worldwide. To elucidate the phylogeny of these strains and the association between their phylogenetic background and the clinical outcome of infection, we investigated 120 EHEC O26:H11/H⁻ strains isolated between 1965 and 2012 from asymptomatic carriers and patients with diarrhea or HUS. Whole-genome shotgun sequencing (WGS) was applied to ten representative EHEC O26 isolates to determine single nucleotide polymorphism (SNP) localizations within a predefined set of core genes. A multiplex SNP assay, comprising a randomly distributed subset of 48 SNPs, was established to detect SNPs in 110 additional EHEC O26 strains. Within approximately 1 Mb of core genes, WGS resulted in 476 high-quality bi-allelic SNP localizations. Forty-eight of these were subsequently investigated in the 110 additional EHEC O26 strains, and four different SNP clonal complexes (SNP-CCs) were identified. SNP-CC2 was significantly associated with the development of HUS. Within the subsequently established evolutionary model of EHEC O26, we dated the emergence of human EHEC O26 to approximately 19,700 years ago and demonstrated a recent evolution within humans into the four SNP-CCs over the past 1,650 years. WGS and subsequent SNP typing enabled us to gain new insights into the evolution of EHEC O26, suggesting a common theme in this EHEC group with analogies to EHEC O157. In addition, the SNP-CC analysis may help to assess the risk of progression to HUS in infected individuals and to implement more specific infection control measures.

Keywords:  EHEC O26; SNP typing; enterohemorrhagic E. coli; evolution; whole-genome shotgun sequencing

Year:  2013        PMID: 24105689      PMCID: PMC3814194          DOI: 10.1093/gbe/evt136

Source DB:  PubMed          Journal:  Genome Biol Evol        ISSN: 1759-6653            Impact factor:   3.416


Introduction

Enterohemorrhagic Escherichia coli (EHEC) are a highly pathogenic subgroup of Shiga toxin (Stx)-producing E. coli. In humans, EHEC infections cause watery and bloody diarrhea and hemolytic uremic syndrome (HUS) (Tarr et al. 2005; Mellmann et al. 2009), which is the most common cause of acute renal failure in children (Brandt et al. 1994; Kaplan 1998; Tarr et al. 2005). Although EHEC O157:H7 is the serotype most commonly associated with HUS worldwide (Banatvala et al. 2001; Robert-Koch-Institut 2008; Centers for Disease Control and Prevention [CDC] 2011), the large O104:H4 outbreak in Germany in spring 2011 (Bielaszewska et al. 2011; Mellmann et al. 2011) and several outbreaks caused by other non-O157 EHEC such as O26 (Bradley et al. 2012; Brown et al. 2012; L’Abée-Lund et al. 2012; Wahl et al. 2011) attest to the potential threat posed by non-O157 EHEC. Among these, EHEC O26:H11/H− (nonmotile) are the serotypes that are most frequently associated with severe human diseases in Europe (Gerber et al. 2002; Tozzi et al. 2003; Ethelberg et al. 2004; Espié et al. 2008; Mellmann et al. 2008; Zimmerhackl et al. 2010; Käppeli et al. 2011; Buvens et al. 2012) and the United States (Jelacic et al. 2003; Brooks et al. 2005; Hedican et al. 2009). Furthermore, EHEC O26 has also been increasingly detected in South American (Rivas et al. 2006), Asian (Hiroi et al. 2012), and Australian (Vally et al. 2012) patients, demonstrating the global dissemination of this EHEC serogroup. Moreover, EHEC O26 infections can be comparable with EHEC O157 infections in the severity of the acute HUS and long-term sequelae (Gerber et al. 2002; Pollock et al. 2011; Zieg et al. 2012; Rosales et al. 2012). The evolution of EHEC O157 was thoroughly characterized in step-wise evolutionary models, where EHEC O157 emerged from E. coli O55:H7 by loss and acquisition of virulence and phenotypic traits (Feng et al. 2007).
This scenario was built on multilocus enzyme electrophoresis and multilocus sequence typing (MLST) data (Feng et al. 1998, 2007). Later, analyses based on multilocus variable number of tandem repeat analysis (MLVA) and single nucleotide polymorphisms (SNPs) enabled a precise reconstruction of this model and further improved the branching into different, evolutionarily conserved subtypes (Leopold et al. 2009; Jenke et al. 2010, 2012). This approach allowed different rates of HUS to be assigned to different subtypes (Alpers et al. 2009; Jenke et al. 2010). In contrast, little is known about the evolution of EHEC O26. Recently, we identified a newly emerging, highly virulent clone within EHEC O26 based on MLST and specific virulence determinants (Bielaszewska et al. 2013); however, neither its evolutionary origin nor its reservoir is currently known. We therefore applied whole-genome shotgun sequencing (WGS) of representative EHEC O26 isolates from human diseases to develop an evolutionary model of this important pathogen and subsequently investigated the molecular epidemiology of a diverse European collection of EHEC O26 using a SNP-based assay. In addition, we examined whether the detected genotypes also reflect the presence of highly pathogenic clones within the population of EHEC O26, which would ultimately enable an assessment of the risk that an EHEC O26 infection progresses to HUS.

Materials and Methods

Bacterial Isolates

In total, we investigated 120 EHEC O26:H11/H− isolates (supplementary table S1, Supplementary Material online). All isolates were intimin (eae) positive and harbored either the Stx1a-encoding gene (stx1a), the Stx2a-encoding gene (stx2a), or both. Ten of the 120 isolates, representing phylogenetic breadth based on isolation year and country (if applicable), were subjected to WGS for subsequent development of the evolutionary model and the SNP assay. For evaluation of the model and investigation of the association of genotypes with clinical outcomes, we included a representative subset of well-characterized clinical isolates from the previously published study (Bielaszewska et al. 2013) and four isolates from asymptomatic carriers. Into this otherwise randomly selected subset of isolates, all isolates from countries other than Germany (n = 60) and the rare MLST sequence types (STs) ST396, ST591, ST1565, ST1566, and ST1705 were included (for details see table S1, Supplementary Material online). The chromosomal sequence of O26:H11 strain 11368 (Ogura et al. 2009) (NCBI accession number NC_013361.1) served as reference.

Whole-Genome Shotgun Sequencing, Sequence Analysis, SNP Discovery

For WGS of the ten EHEC O26 isolates, sequencing libraries were prepared using the Nextera XT chemistry (Illumina Inc., San Diego, CA, USA) for a 100 bp paired-end sequencing run on an Illumina HiScanSQ sequencer in accordance with the manufacturer’s recommendations (Illumina Inc.). Sequencing reads were assembled using the CLC bio Genomic Workbench reference assembler (CLC bio, Denmark) with the chromosomal sequence of the O26:H11 strain 11368 (Ogura et al. 2009) as reference. To create a robust phylogeny, we extracted the core genome open reading frame (ORF) sequences, starting from the previously published list of core ORFs (Mellmann et al. 2011), and included all ORFs that were present in all O26 isolates (n = 1,130). As an outgroup for phylogenetic analysis, the chromosomal sequence of EHEC O111:H− strain 11128, NCBI acc. no. NC_013364.1 (Ogura et al. 2009), was used. SNPs were discovered by mapping the consensus sequence of the respective isolate against the O26 reference sequence using the Ridom SeqSphere software version 0.99 beta (Ridom GmbH, Münster, Germany).
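As a rough illustration of what SNP discovery against a reference involves, the following minimal sketch scans aligned sequences column by column and reports the positions that are bi-allelic. It assumes gap-free sequences of equal length that are already aligned to the reference; the function name `find_biallelic_snps` and the toy sequences are hypothetical and do not reproduce the Ridom SeqSphere workflow used in the study.

```python
from typing import Dict, List, Tuple


def find_biallelic_snps(reference: str, isolates: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (1-based position, reference allele, variant allele) for bi-allelic sites.

    Assumption: every isolate sequence is already aligned to the reference and
    has the same length (real pipelines work from read mappings instead).
    """
    for name, seq in isolates.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not aligned to the reference")

    valid_bases = set("ACGT")
    snps = []
    for pos, ref_base in enumerate(reference):
        # Collect every allele observed in this column, reference included.
        alleles = {ref_base} | {seq[pos] for seq in isolates.values()}
        if not alleles <= valid_bases:
            continue  # skip columns with ambiguous bases such as N
        if len(alleles) == 2:
            (variant,) = alleles - {ref_base}
            snps.append((pos + 1, ref_base, variant))
    return snps


# Toy example (hypothetical sequences, not study data).
reference = "ATGCCGTA"
isolates = {"iso1": "ATGACGTA", "iso2": "ATGACGTC", "iso3": "ATGCCGTG"}
print(find_biallelic_snps(reference, isolates))  # [(4, 'C', 'A')]
```

In the toy data, position 4 carries exactly two alleles and is reported, whereas position 8 carries three alleles and is skipped, mirroring the restriction to bi-allelic SNP localizations.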

Phylogenetic Analysis

To infer the evolutionary model of EHEC O26 based on core genome ORF sequences of the ten shotgun genome sequenced isolates and the reference strain 11368, a neighbor-joining tree was initially constructed using the MEGA5 software with default parameters (Tamura et al. 2011). For deeper analysis of the evolutionary history, we concatenated the ORFs present in all isolates and calculated the Ks values according to the modified version of the Yang-Nielsen algorithm (MYN) using the KaKs calculator 2.0 (Zhang et al. 2006; Wang et al. 2010). Using these Ks values together with an estimated synonymous substitution rate of 1.44 × 10⁻¹⁰ per base pair per generation (Lenski et al. 2003) and 300 generations per year for E. coli (in vivo) (Guttman and Dykhuizen 1994), we determined the age interval between two isolates as Ks/(1.44 × 10⁻¹⁰ per bp per generation × 300 generations per year). The concatenated sequences of the intermediate isolates (postulated ancestors) were defined by the sequences of the precursor isolate within the neighbor-joining tree. To portray the SNP data from all 120 isolates, we generated a minimum spanning tree using the Ridom SeqSphere software version 0.99 beta (Ridom GmbH).
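The age calculation described above is a single division, and a short sketch makes the units explicit. The rate and generation constants are the values quoted in the text; the function name `divergence_years` and the example Ks value are hypothetical and chosen only for illustration, not taken from the study data.

```python
SYNONYMOUS_RATE = 1.44e-10   # substitutions per bp per generation (Lenski et al. 2003)
GENERATIONS_PER_YEAR = 300   # in vivo estimate for E. coli (Guttman and Dykhuizen 1994)


def divergence_years(ks: float,
                     rate: float = SYNONYMOUS_RATE,
                     generations_per_year: float = GENERATIONS_PER_YEAR) -> float:
    """Age interval in years: Ks / (rate per generation x generations per year)."""
    if rate <= 0 or generations_per_year <= 0:
        raise ValueError("rate and generations_per_year must be positive")
    return ks / (rate * generations_per_year)


# Hypothetical Ks value, used only to show the arithmetic.
example_ks = 8.64e-4
print(round(divergence_years(example_ks)))                            # 20000
print(round(divergence_years(example_ks, generations_per_year=200)))  # 30000
```

Because the generation rate sits in the denominator, lowering the assumed number of generations per year from 300 to 200 lengthens every age estimate by a factor of 1.5.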

Bead-Based Multiplex Assay for SNP Detection

For robustness, the discovered SNP localizations were divided into four multiplex sets (supplementary table S4, Supplementary Material online) for subsequent detection using MagPlex-TAG microspheres on the Luminex MAGPIX platform (Luminex Corp., Austin, TX). All 96 multiplex polymerase chain reaction (PCR) primers and 96 multiplex allele-specific primer extension (ASPE) primers, each with the appropriate TAG sequence appended, were designed for each set of SNPs (wild type and variant) with PrimerPlex 2.6 (PREMIER Biosoft International, Palo Alto, CA). For each SNP locus, one ASPE primer for the reference allele (wild type) and one ASPE primer for the allelic variant (variant) were designed. In principle, this dual design ensures that, when samples are screened, each locus yields one positive and one negative allele call, so that possible tri-allelic or even tetra-allelic polymorphisms would be detected during the measurements. The multiplex PCR primers with the respective amplicon sizes and the multiplex ASPE primers with capture sequences (TAG sequence) and corresponding bead number are shown in supplementary table S4, Supplementary Material online.

The following procedure was performed with minor modifications (discussed later) in accordance with the manufacturer’s recommendations (Song et al. 2010). Briefly, the multiplex PCR reaction for amplification of 12 loci per set (supplementary table S4, Supplementary Material online) was performed in 12.5 µl containing 6.25 µl REDTaq ReadyMix (Sigma-Aldrich, Munich, Germany), 1 µl of each forward and reverse primer mix (5 µM), 3.25 µl PCR water, and 1 µl template DNA extracted from a single fresh colony. The PCR cycling parameters consisted of an initial step at 80 °C for 5 min, 30 cycles at 94 °C for 45 s, 60 °C for 45 s, and 72 °C for 60 s, and a final step at 72 °C for 10 min. Subsequently, 5 µl of the PCR product were purified using 1.5 U Exonuclease I (E. coli) and 1.5 U Shrimp Alkaline Phosphatase (Exo/SAP) by incubation at 37 °C for 45 min followed by an inactivation step at 80 °C for 15 min. For the ASPE reaction, a 20 µl reaction contained 2 µl 10× buffer, 1 µl MgCl₂ (25 mM), 1 µl dNTP mix (100 µM dTTP, dGTP, dATP; Sigma-Aldrich), 0.25 µl biotin-dCTP (400 µM, Invitrogen, Darmstadt, Germany), 1 µl ASPE primer mix (0.5 µM), 0.75 µl AmpliTaq (5 U/µl, Applied Biosystems, Foster City, CA), 9 µl PCR water, and 5 µl purified PCR product. The cycling conditions were 96 °C for 2 min, followed by 30 cycles at 94 °C for 30 s, 60 °C for 60 s, and 74 °C for 2 min. For the final hybridization step, the appropriate MagPlex-TAG microspheres for each set of the 24 corresponding bead types (1,250 beads of each per reaction) were used. The hybridization mix was subjected to two washing steps and incubated in 1× Tm Hybridization buffer containing 4 µg/ml streptavidin-R-phycoerythrin conjugate (SAPE) (Invitrogen, Darmstadt, Germany) at 37 °C for 15 min. Finally, the fluorescence was measured in 50 counts within 1 min using the xPonent 4.1 software (Luminex Corp.).

Based on the median fluorescence intensity (MFI) and the net MFI, a SNP call was evaluated only if the following quality criteria were met: detection of ≥50 beads per bead type, an MFI > 300, and a ratio MFI(called allele)/[MFI(wild-type allele) + MFI(variant allele)] > 0.9 (Song et al. 2010). Measurements that did not fulfill these criteria were Sanger sequenced to validate the putative SNPs. For initial validation of this SNP assay, the putative SNP localizations of two samples (1226/65 and 3271/00) that differed considerably in their SNP profile were Sanger sequenced. Moreover, reproducibility of the assay was evaluated by analyzing one O26 isolate (126814/98) starting from the extracted DNA in five independent replicates (Song et al. 2010).
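The three quality criteria for a SNP call can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the names `AlleleReading` and `call_snp` are hypothetical, the bead-count criterion is assumed to apply to both bead types of a locus, and the MFI threshold is assumed to apply to the called allele. The text does not spell out these details, and the sketch is not the xPonent implementation.

```python
from dataclasses import dataclass
from typing import Optional

MIN_BEADS = 50           # at least 50 beads detected per bead type
MIN_MFI = 300.0          # MFI must exceed 300
MIN_ALLELIC_RATIO = 0.9  # called allele / (wild type + variant) must exceed 0.9


@dataclass
class AlleleReading:
    mfi: float       # median fluorescence intensity for this allele's bead type
    bead_count: int  # number of beads counted for this bead type


def call_snp(wild_type: AlleleReading, variant: AlleleReading) -> Optional[str]:
    """Return "wild type" or "variant", or None if the quality criteria fail.

    A None result corresponds to a measurement that would be sent for
    Sanger sequencing.
    """
    if wild_type.bead_count < MIN_BEADS or variant.bead_count < MIN_BEADS:
        return None
    total = wild_type.mfi + variant.mfi
    if total <= 0:
        return None
    # The candidate call is the allele with the stronger signal.
    if wild_type.mfi >= variant.mfi:
        label, called = "wild type", wild_type
    else:
        label, called = "variant", variant
    if called.mfi <= MIN_MFI:
        return None
    if called.mfi / total <= MIN_ALLELIC_RATIO:
        return None
    return label


# Hypothetical readings, for illustration only.
print(call_snp(AlleleReading(2600, 80), AlleleReading(150, 75)))  # wild type
print(call_snp(AlleleReading(900, 80), AlleleReading(700, 80)))   # None
```

The second example fails only the ratio criterion (900/1,600 ≈ 0.56), which is the situation in which both allele-specific primers give a comparable signal.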

Data and Statistical Analysis

The Luminex MAGPIX analysis calculates the following values with the xPonent software: MFI, net MFI, Count, Allelic Call, and Allelic Ratio. The data were exported and further processed in Microsoft Excel. We tabulated the clinical picture (HUS, bloody diarrhea, diarrhea, asymptomatic, and unknown disease) against the corresponding SNP allelic profiles to evaluate whether certain SNP genotypes were associated with a specific disease. For statistical analysis, we used Fisher’s exact test as implemented in Epi Info 7 (Centers for Disease Control and Prevention, Atlanta, GA). P values < 0.05 were considered statistically significant.

Results

Whole-Genome Shotgun Sequencing of Ten Representative EHEC O26 Isolates for SNP Discovery

The SNP discovery was performed using WGS of ten EHEC O26 isolates listed in supplementary table S1, Supplementary Material online. After assembly, we queried the assemblies for the 1,144 ORF sequences of the previously published E. coli core ORF definition (Mellmann et al. 2011). In total, 1,130 of these 1,144 ORFs were present in all O26 isolates and were extracted for further analysis (supplementary table S2, Supplementary Material online). Within these 1,130 ORFs, representing 952,632 bp of the chromosome, we identified 476 SNPs (in 298 ORFs) (supplementary table S3, Supplementary Material online). Of these, we manually selected a total of 48 SNP localizations at random, 12 per quarter of the chromosome (fig. 1), to develop a multiplex SNP assay. All 48 SNPs were bi-allelic, synapomorphic polymorphisms; of these, 30 were nonsynonymous (ns) and 18 were synonymous (s) when compared with the reference sequence (table 1).
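The selection of 12 SNP localizations per chromosome quarter amounts to stratified random sampling. The sketch below shows one way this could be automated; the authors selected manually, so the function `pick_snps_per_quarter`, the fixed seed, and the toy genome length are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence


def pick_snps_per_quarter(positions: Sequence[int],
                          genome_length: int,
                          per_quarter: int = 12,
                          seed: int = 0) -> Dict[int, List[int]]:
    """Randomly pick `per_quarter` SNP positions from each quarter of a chromosome.

    Positions are 1-based; quarters are numbered 0 to 3.
    """
    rng = random.Random(seed)  # fixed seed so that the selection is repeatable
    quarter_size = genome_length / 4
    bins: Dict[int, List[int]] = {q: [] for q in range(4)}
    for pos in positions:
        if not 1 <= pos <= genome_length:
            raise ValueError(f"position {pos} lies outside the chromosome")
        quarter = min(int((pos - 1) // quarter_size), 3)
        bins[quarter].append(pos)

    chosen: Dict[int, List[int]] = {}
    for quarter, members in bins.items():
        if len(members) < per_quarter:
            raise ValueError(f"quarter {quarter} has fewer than {per_quarter} SNPs")
        chosen[quarter] = sorted(rng.sample(members, per_quarter))
    return chosen


# Toy chromosome of 4,000 bp with a SNP every 10 bp (hypothetical data).
toy_positions = list(range(10, 4001, 10))
selection = pick_snps_per_quarter(toy_positions, genome_length=4000)
print(sum(len(snps) for snps in selection.values()))  # 48
```

Stratifying by quarter guarantees an even spread of markers around the chromosome, which a single unstratified draw of 48 positions would not.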
Fig. 1. Distribution of the investigated 1,130 core genome ORFs (in green), of the discovered 476 SNPs (in blue), and the 48 SNPs of the multiplex assay (in red) illustrated in a circular map of the reference genome of O26:H11 strain 11368.

Table 1

List of the 48 Synapomorphic SNPs in 47 Loci in This Study Based on the Genome Sequence of O26:H11 Strain 11368 (GenBank accession number NC_013361.1)

| Locus Tag | Gene (No. of SNPs) | Absolute SNP Position | SNP | SNP Effect | SNP Allele Frequencyᵃ (%) |
|---|---|---|---|---|---|
| ECO26_0083 | fruR (1) | 90659 | T → Gᵇ | Synonymous | 78.33 |
| ECO26_0094 | murC (1) | 102806 | T → G | Nonsynonymous (Tyr → Asp) | 5.83 |
| ECO26_0341 | ykgF (1) | 363280 | T → Gᶜ | Synonymous | 56.67 |
| ECO26_0370 | prpC (1) | 398342 | T → Cᶜ | Nonsynonymous (Tyr → His) | 56.67 |
| ECO26_0554 | ybcF (1) | 598509 | A → G | Nonsynonymous (Thr → Ala) | 5.83 |
| ECO26_0653 | ybdK (1) | 693357 | A → Cᵇ | Nonsynonymous (Lys → Thr) | 89.17 |
| ECO26_0785 | sdhB (1) | 841192 | C → Gᵈ | Nonsynonymous (Asp → Glu) | 6.67 |
| ECO26_0787 | sucB (1) | 844779 | T → G | Synonymous | 10.00 |
| ECO26_0968 | ybjG (1) | 1020736 | T → Aᵈ | Synonymous | 6.67 |
| ECO26_1012 | cydC (1) | 1066493 | G → Cᶜ | Nonsynonymous (Arg → Thr) | 56.67 |
| ECO26_1062 | ssuD (1) | 1135005 | T → Gᵇ | Nonsynonymous (Ile → Ser) | 89.17 |
| ECO26_1083 | ycbG (1) | 1157959 | A → Gᵇ | Nonsynonymous (Asn → Asp) | 89.17 |
| ECO26_1434 | ptsG (1) | 1447293 | A → C | Nonsynonymous (Asp → Ala) | 100.00 |
| ECO26_1531 | hyaE (1) | 1528765 | C → Tᵇ | Nonsynonymous (Ala → Val) | 89.17 |
| ECO26_1687 | minD (1) | 1652262 | A → G | Synonymous | 100.00 |
| ECO26_1741 | narH (1) | 1711223 | T → Cᵇ | Nonsynonymous (Val → Ala) | 89.17 |
| ECO26_1835 | yciK (1) | 1787173 | T → C | Nonsynonymous (Met → Thr) | 100.00 |
| ECO26_1890 | ycjF (1) | 1844594 | G → Aᵇ | Nonsynonymous (Gly → Asp) | 89.17 |
| ECO26_2286 | speG (1) | 2221328 | A → Tᶜ | Nonsynonymous (Glu → Val) | 56.67 |
| ECO26_2339 | fumC (1) | 2265373 | T → Cᵇ | Nonsynonymous (Leu → Pro) | 89.17 |
| ECO26_2367 | pdxH (1) | 2297263 | C → Gᶜ | Synonymous | 56.67 |
| ECO26_2432 | ydiA (1) | 2368018 | C → Aᵈ | Nonsynonymous (Leu → Ile) | 6.67 |
| ECO26_2433 | aroH (1) | 2368309 | C → A | Nonsynonymous (Pro → Thr) | 99.17 |
| ECO26_2838 | rcsA (1) | 2735850 | A → Gᵇ | Nonsynonymous (His → Arg) | 89.17 |
| ECO26_3081 | fruB (1) | 3018001 | T → Cᵇ | Nonsynonymous (Val → Ala) | 89.17 |
| ECO26_3092 | yejE (1) | 3030971 | A → Tᵇ | Nonsynonymous (Ser → Cys) | 89.17 |
| ECO26_3306 | truA (1) | 3232688 | G → Tᶜ | Nonsynonymous (Arg → Leu) | 56.67 |
| ECO26_3433 | oxc (1) | 3349414 | G → Aᵈ | Synonymous | 6.67 |
| ECO26_3489 | yfeG (1) | 3410529 | T → Gᵈ | Nonsynonymous (Cys → Gly) | 6.67 |
| ECO26_3612 | recO (1) | 3551044 | T → Aᵇ | Synonymous | 89.17 |
| ECO26_3961 | ygeY (1) | 3914494 | A → G | Synonymous | 88.33 |
| ECO26_3979 | lysS (1) | 3939467 | G → Aᵈ | Nonsynonymous (Val → Ile) | 6.67 |
| ECO26_4164 | ttdB (1) | 4129239 | A → G | Synonymous | 100.00 |
| ECO26_4280#1 | glmM (2) | 4253102 | T → C | Synonymous | 98.33 |
| ECO26_4280#2 | | 4254024 | T → Gᶜ | Synonymous | 56.67 |
| ECO26_4302 | kdsC (1) | 4272578 | T → Cᶜ | Nonsynonymous (Val → Ala) | 56.67 |
| ECO26_4343 | tldD (1) | 4314205 | T → Cᶜ | Nonsynonymous (Cys → Arg) | 56.67 |
| ECO26_4351 | yhdH (1) | 4326496 | G → Tᵇ | Synonymous | 89.17 |
| ECO26_4486 | yrfF (1) | 4446501 | A → Cᵈ | Synonymous | 6.67 |
| ECO26_4837 | gidA (1) | 4858567 | T → C | Nonsynonymous (Val → Ala) | 90.00 |
| ECO26_4917 | uhpT (1) | 4943021 | A → Cᵇ | Synonymous | 89.17 |
| ECO26_5089 | secE (1) | 5143059 | C → Aᵇ | Nonsynonymous (Asn → Lys) | 89.17 |
| ECO26_5096 | rpoC (1) | 5151319 | T → C | Synonymous | 100.00 |
| ECO26_5099 | thiG (1) | 5157488 | C → T | Nonsynonymous (Leu → Phe) | 90.00 |
| ECO26_5139 | lysC (1) | 5206902 | T → Cᵇ | Nonsynonymous (Val → Ala) | 89.17 |
| ECO26_5227 | adiC (1) | 5302486 | A → Gᵇ | Nonsynonymous (Tyr → Cys) | 89.17 |
| ECO26_5302 | dipZ (1) | 5389233 | C → Gᵇ | Synonymous | 89.17 |
| ECO26_5384 | cysQ (1) | 5467060 | T → Gᵇ | Synonymous | 89.17 |

ᵃ SNP allele frequency among the 120 isolates used in this study.

ᵇ Canonical SNP (canSNP) for SNP-CC4.

ᶜ SNP differentiating SNP-CC2 and SNP-CC3.

ᵈ Canonical SNP (canSNP) for SNP-CC1.

SNP Typing of 120 O26 Isolates Using the Multiplex Assay

To achieve the desired robustness of the assay, we divided the multilocus genotyping assay into four sets comprising 12 SNP localizations each (supplementary table S4, Supplementary Material online). To test the assay accuracy, two isolates (1226/65 and 3271/00) were initially SNP genotyped and all alleles were Sanger sequenced. In all localizations, the SNP assay was concordant with the sequencing results. To test reproducibility, one strain (126814/98) was tested in five independent repeats. Supplementary figures S1 and S2, Supplementary Material online, show the data of the average MFI minus background correction (net MFI) with standard deviation of the different SNP alleles of the 48 investigated SNPs. In all cases, the SNP call was unambiguous and the MFI of called alleles was always at least 13-fold greater than the MFI of corresponding uncalled alleles. After this validation, a total of 5,760 SNPs were called in the 120 EHEC O26 strains. Only 2.1% (122 SNPs) had to be Sanger sequenced for confirmation, because the values were ambiguous. In all cases, however, the failure was due to mutations in the binding region of ASPE primer and sequencing confirmed the missing SNPs as known alleles (table 1). Overall, SNP genotyping of the 120 EHEC O26 isolates resulted in ten unique SNP profiles. Their phylogenetic relationships are displayed in a minimum spanning tree (MST) in figure 2. Clustering of SNP genotypes enabled us to assign SNP clonal complexes (SNP-CCs) as phylogenetically conserved groups, which is analogous to MLST, where MLST clonal complexes are phylogenetically informative groups (Feil and Spratt 2001). Isolates sharing ≥90% of the 48 SNPs (i.e., ≥44 SNPs) were grouped, resulting in four different SNP-CCs (SNP-CC1 to SNP-CC4) (fig. 2). Further details of the 48 investigated SNP localizations, for example, their ability to serve as a canonical SNP for a certain SNP-CC, are given in table 1. 
Of the determined SNP-CCs, SNP-CC2 and SNP-CC3 encompassed most isolates (60 [50.0%] and 39 [32.5%] isolates, respectively). The remaining SNP-CCs contained 13 (10.8%, SNP-CC4) and 8 (6.7%, SNP-CC1) isolates. Comparison of SNP data with MLST corroborated this separation, as all isolates of SNP-CC2 and SNP-CC4 were exclusively MLST ST29 and ST21, respectively. Moreover, nearly all (36 of 39) SNP-CC3 isolates were ST21 and the majority of SNP-CC1 isolates (6 of 8) were ST29, with a few single locus variants (slv) of either ST21 (ST591, ST1565, and ST1705 in SNP-CC3) or ST29 (ST396 and ST1566 in SNP-CC1). Taken together, SNP genotyping subdivided the EHEC O26 population into four different SNP-CCs: one (SNP-CC2) corresponding to the recently described highly pathogenic “new clone” that separated from the remaining O26 population (Bielaszewska et al. 2013), and three further SNP-CCs.
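The grouping rule (isolates sharing at least 90% of the SNPs fall into the same clonal complex) can be sketched as a clustering step. The text does not state how chains of similar profiles were handled, so the single-linkage behavior below is an assumption, and `shared_alleles`, `group_into_complexes`, and the toy 10-SNP profiles are hypothetical.

```python
from typing import Dict, List


def shared_alleles(a: str, b: str) -> int:
    """Number of positions at which two SNP profiles carry the same allele."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same number of SNPs")
    return sum(x == y for x, y in zip(a, b))


def group_into_complexes(profiles: Dict[str, str], min_shared: int) -> List[List[str]]:
    """Single-linkage grouping: link two profiles if they share >= min_shared alleles."""
    names = list(profiles)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Walk up to the group representative, shortening the path on the way.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if shared_alleles(profiles[first], profiles[second]) >= min_shared:
                parent[find(first)] = find(second)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


# Toy 10-SNP profiles (hypothetical); 9 shared alleles corresponds to 90%.
toy_profiles = {
    "p1": "AAAAAAAAAA",
    "p2": "AAAAAAAAAG",
    "p3": "GGGGGGGGGG",
    "p4": "GGGGGGGGGA",
}
print(group_into_complexes(toy_profiles, min_shared=9))  # [['p1', 'p2'], ['p3', 'p4']]
```

For the 48-SNP assay, the equivalent call would use `min_shared=44`, the threshold given in the text.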
Fig. 2. The minimum spanning tree (MST) shows the molecular phylogeny of 120 EHEC O26 isolates. The different colors represent the symptoms of the infected patients. Each node represents a unique SNP profile. SNP clonal clusters (SNP-CCs) are numbered (SNP-CC1 to SNP-CC4). The node size reflects the number of isolates. Small numbers on connecting lines display the distance (number of differing SNPs) between two nodes.

Evolutionary Model of EHEC O26

The phylogenetic topology of the ten representative EHEC O26 isolates (supplementary table S1, Supplementary Material online), the O26:H11 reference strain 11368 and, as an outgroup, the most closely related EHEC serotype, O111:H− strain 11128 (NC_013364.1) (Ogura et al. 2009; Ju et al. 2012), is shown in a neighbor-joining tree (fig. 3). The branching within this tree was concordant with the separation based on the SNP assay (fig. 2), underlining the unbiased representativeness of the selected 48 SNP localizations. We propose an evolutionary model of EHEC O26 with a subdivision into four SNP-CCs. Using the Ks values (number of synonymous substitutions per synonymous site) of strains 11128 (EHEC O111) and A10, along with hypothetical intermediate isolates A01 to A10 and a common ancestor of E. coli O26 and O111 as previously proposed (Whittam et al. 1988), we postulate that E. coli O26 and O111 separated 19,700 years ago (fig. 4). Since then, EHEC O26 likely developed sequentially from SNP-CC1 to SNP-CC4. The evolution of these clonal clusters occurred within 1,650 years of this bifurcation (fig. 4). During this evolution, the core genome and stx, the most important virulence marker, evolved in parallel, as both were almost exclusively associated in a fixed combination within the different SNP-CCs (fig. 4).
Fig. 3. Neighbor-joining tree based on 1,130 concatenated ORFs of ten representative isolates (gray) and reference strains O26:H11 (11368) and O111:H− (11128) (white). The SNP clonal clusters (SNP-CCs) are marked and demonstrate the quartering of the phylogenetic tree. Phylogenetic analysis generated by MEGA5 (Tamura et al. 2011).

Fig. 4. Evolutionary model and calculated age distances for EHEC O26 pathogens based on the neighbor-joining tree (fig. 3) and inserted in the SNP clonal clusters (SNP-CC1 to SNP-CC4). Blue boxes are the EHEC O26 isolates with shotgun genome sequencing data. In gray, hypothetical founders of O26 isolates are shown (A01 to A10). The ancestry is calculated based on the phylogeny displayed in figure 3. White boxes show the two EHEC O26:H11 and EHEC O111:H− reference strains (strains 11368 and 11128, respectively) that are fully sequenced; EHEC O111:H− is assumed to be the closest relative of serogroup O26 (Whittam et al. 1988). Blue lines connect the isolates and the hypothetical ancestors and red numbers show the synonymous/nonsynonymous SNPs between these genotypes. The gray line connects the O111:H− reference strain and the first O26 ancestor A10 as the common O26/O111 ancestor is not known. Distances are not drawn to scale.

Clinical Implications of EHEC O26 Separation into Four SNP-CCs

Finally, we investigated whether the separation of EHEC O26 into SNP-CCs is associated with human diseases by analyzing the association of the SNP genotypes with clinical outcomes of the infection. Indeed, isolates of SNP-CC2, which comprised 50.0% of all strains and were responsible for 61.8% (47 of 76) of all HUS cases in this study (table 2), showed a highly significant association (odds ratio 3.86, 95% confidence interval 1.63–9.30, P < 0.01) with the development of HUS. In contrast, none of the other SNP-CCs exhibited a statistically significant positive association with HUS (table 2).
Table 2

Distribution of Diseases over Four Different SNP-CC of 120 EHEC O26 Strains Isolated from Patients and Associations of EHEC O26 SNP-CCs with HUS

| SNP-CC (Total No. Isolates) | Disease (HUS/BD/D/A/U) | OR (95% CI) (HUS) | P Value |
|---|---|---|---|
| SNP-CC1 (8) | 5/0/2/0/1 | 0.96 (0.19–5.41) | 0.96 |
| SNP-CC2 (60) | 47/0/11/1/1 | 3.86 (1.63–9.30) | <0.01 |
| SNP-CC3 (39) | 22/3/11/3/0 | 0.65 (0.27–1.52) | 0.28 |
| SNP-CC4 (13) | 2/5/6/0/0 | 0.08 (0.01–0.42) | <0.01 |

Note.—BD, bloody diarrhea; D, diarrhea; A, asymptomatic; U, unknown; OR, odds ratio; CI, confidence interval.
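The odds ratios in table 2 can be recomputed from the counts in the table by comparing each SNP-CC with all remaining isolates. The sketch below does this and adds a two-sided Fisher's exact test written with the standard library. The helper names are hypothetical, the comparison group is assumed to be all non-HUS isolates (including those with unknown disease), and the P values from this sketch need not match the Epi Info 7 output exactly.

```python
from math import comb

# (HUS cases, total isolates) per SNP-CC, taken from table 2.
TABLE2 = {"SNP-CC1": (5, 8), "SNP-CC2": (47, 60), "SNP-CC3": (22, 39), "SNP-CC4": (2, 13)}
TOTAL_HUS = sum(hus for hus, _ in TABLE2.values())        # 76
TOTAL_ISOLATES = sum(total for _, total in TABLE2.values())  # 120


def two_by_two(snp_cc: str):
    """2x2 counts: (HUS in CC, non-HUS in CC, HUS elsewhere, non-HUS elsewhere)."""
    hus, total = TABLE2[snp_cc]
    a, b = hus, total - hus
    c = TOTAL_HUS - hus
    d = TOTAL_ISOLATES - total - c
    return a, b, c, d


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    return (a * d) / (b * c)


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Sum the probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    p_value = 0.0
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(x)
        if p <= observed * (1 + 1e-7):  # small tolerance for floating-point ties
            p_value += p
    return min(p_value, 1.0)


for name in TABLE2:
    counts = two_by_two(name)
    print(name, round(odds_ratio(*counts), 2), fisher_exact_two_sided(*counts) < 0.05)
```

With this comparison group, the recomputed odds ratios round to the values reported in table 2 (0.96, 3.86, 0.65, and 0.08).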

Discussion

To elucidate the evolutionary history of EHEC O26 and to analyze more precisely the differentiation of this globally emerging pathogen into groups with different genotypic and clinical characteristics, we applied WGS and SNP typing of a diverse collection of ten EHEC O26. Analysis of SNPs within 1,130 core genome genes enabled us not only to develop a multiplex assay with a reduced number of SNP localizations for high-throughput grouping of EHEC O26 into distinct and phylogenetically and clinically meaningful SNP-CCs but also to establish an evolutionary model of EHEC O26. We were surprised by the clear-cut separation of EHEC O26 into four distinct SNP-CCs based on SNP data of the 120 strains as our previous investigations based on MLST and virulence profiling distinguished only a single highly pathogenic clone that was distinct from the remaining EHEC O26 population in Europe (Bielaszewska et al. 2013). However, this separation corroborates with studies of EHEC O157 (Manning et al. 2008; Leopold et al. 2009; Eppinger et al. 2011) and EHEC O104 (Brzuszkiewicz et al. 2011; Mellmann et al. 2011; Rasko et al. 2011), where genome sequencing information precisely refined and thereby proved evolutionary models. From an evolutionary perspective, different scenarios could explain the separation of EHEC O26 into distinct SNP-CCs. First, an evolutionary bottleneck could have led to a reduction of the EHEC O26 population into four major genotypes. As this must have occurred in a highly specific manner favoring at least four genotypes during the emergence of EHEC O26, this scenario is unlikely. Another explanation for the emergence of different EHEC O26 SNP-CCs is the evolutionary concept of an “epidemic” population structure (Smith et al. 2000). 
In this model, highly adaptive and frequently pathogenic clones arise from a recombining background population for a certain time period before they disappear again in the background population because of diversification predominantly driven by recombination and secondarily by point mutations (Smith et al. 2000). In addition to the fact that E. coli in general exhibits frequent recombination (Wirth et al. 2006), especially between closely related members of this species (Leopold et al. 2011), it is also known that diversity and recombination within tightly constrained clones such as highly pathogenic EHEC is very limited (Noller et al. 2003; Wirth et al. 2006; Manning et al. 2008; Leopold et al. 2009), thus not favoring this model. The most likely model to explain this scenario could be the model of source–sink evolution dynamics that was introduced for bacterial pathogens by Sokurenko et al. (2006) and has been described for EHEC O157 (Leopold et al. 2009) and uropathogenic E. coli (Chattopadhyay et al. 2007). This model postulates that a diverse population of EHEC O26 has already circulated over a longer period of time in an evolutionary stable niche (source) and only few strains were able to adapt during the transfer into a new niche (sink) with positive and purifying selection (Chattopadhyay et al. 2007). Calculation of the Ka/Ks value further corroborated this hypothesis by indicating significant purifying selection with a Ka/Ks value of 0.19 between the O26/O111 ancestor and the first strain (A10) of SNP-CC1 during the emergence of O26, that is, the transfer into the sink. This model immediately raises the question of the natural reservoir of EHEC O26. Although cattle are the known major reservoir of EHEC O26 (Blanco et al. 2004; Geue et al. 2009; Chase-Topping et al. 2012), highly pathogenic EHEC O26 of the new clone carrying solely stx2a (Bielaszewska et al. 2013) have only rarely been isolated outside humans (Allerberger et al. 2003; Blanco et al. 
2004; Chase-Topping et al. 2012). Future studies applying the described SNP multiplex assay on nonhuman samples are necessary to elucidate potential reservoirs and to further understand the evolutionary dynamics between source and sink. By calculating the Ks value, we were also able to develop a timeline of the EHEC O26 evolution (fig. 4). Using EHEC O111 as the closest relative to EHEC O26 (Whittam et al. 1988; Ogura et al. 2009), we dated the separation from a common ancestor at approximately 19,700 years ago. Estimating 200 generations/year as done for EHEC O157:H7 (Leopold et al. 2009), we calculate that EHEC O26 separated from the common EHEC O26/O111 ancestor 29,600 years ago. Interestingly, like EHEC O157 where two different groups emerged (sorbitol-fermenting O157:H− [subgroup B] and nonsorbitol-fermenting EHEC O157:H7 [subgroup C]) and of which the latter further differentiated over at least 2,500 years, EHEC O26 also diverged into the four extant SNP-CCs over the course of 2,400 years. We further calculated the association of certain EHEC disease entities (HUS vs. diarrhea without HUS and asymptomatic cases) with SNP-CCs to determine whether there is a clinical impact of the clustering into 4 SNP-CCs (table 2). Indeed, only SNP-CC2 was significantly associated with what the treating physicians termed HUS, underlining the clinical importance of the EHEC O26 grouping. Whether the infected individual finally develops a severe EHEC disease is of course also influenced by yet unknown host factors. Interestingly, the presence of certain genotypes with an increased virulence is again similar to EHEC O157 (Manning et al. 2008; Jenke et al. 2010, 2012). Altogether, these observations suggest a common theme in the EHEC evolution, which is driven by the transfer into new hosts, that is, the humans, and rapid selection processes. One limitation of our study might be a potential sampling bias toward isolates from severely ill patients. 
However, SNP data from the diverse collection of strains spanning several decades and different countries still reflected this separation into four SNP-CCs. Moreover, the four isolates from asymptomatic carriers shared identical SNP genotypes with isolates from severely ill patients, further corroborating the separation into the four SNP-CCs. A second limitation might be the limited geographical distribution of the strains used to determine the evolutionary model because, with the exception of one strain, all were isolated from patients in Germany. Including additional strains from different continents may provide additional information; however, the grouping of the SNP genotypes of the 110 isolates from seven European countries into the four SNP-CCs also supported our model, at least for Europe. Another limitation might be the inclusion of only 1,130 ORFs; however, we were not aiming to separate closely related strains during outbreak investigations but used these genes, which are conserved within E. coli, solely to generate a robust phylogenetic signal.

In summary, based on WGS and subsequent multiplexed SNP calling, we established an evolutionary model of the emergence and further diversification of EHEC O26 into four phylogenetically and clinically meaningful SNP-CCs. These data broaden our knowledge of the evolution of this important human pathogen and suggest a common theme in EHEC evolution. Moreover, information about the SNP-CC may help to implement more specific infection control measures and may enable a risk assessment for each detected EHEC O26 isolate. Future studies should also focus on EHEC O26 in nonhuman environments to understand their behavior and evolutionary processes in their likely reservoirs.
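
The Ks-based dating and the Ka/Ks reading used in this discussion can be illustrated with a minimal Python sketch. It assumes a strict molecular clock and a pairwise Ks between two extant lineages (hence the factor 2). The function names and the mutation rate in the usage lines are hypothetical placeholders, not values or code from this study; only the 200 generations/year figure and the Ka/Ks value of 0.19 come from the text. The label returned by selection_regime reflects direction only and is not a test of statistical significance.

```python
def divergence_time_years(ks, mu_per_site_per_generation, generations_per_year=200.0):
    """Estimate years since two lineages split, assuming a strict molecular clock.

    ks: synonymous substitutions per synonymous site between two extant lineages.
    mu_per_site_per_generation: assumed synonymous mutation rate (hypothetical input).
    generations_per_year: 200 is the estimate quoted in the text.
    """
    if ks < 0 or mu_per_site_per_generation <= 0 or generations_per_year <= 0:
        raise ValueError("ks must be >= 0; rate and generations/year must be > 0")
    # Substitutions accumulate along both branches since the split, hence the factor 2.
    generations = ks / (2.0 * mu_per_site_per_generation)
    return generations / generations_per_year


def selection_regime(ka_ks):
    """Label the direction of selection implied by a Ka/Ks ratio."""
    if ka_ks < 0:
        raise ValueError("Ka/Ks cannot be negative")
    if ka_ks < 1.0:
        return "purifying"
    if ka_ks > 1.0:
        return "positive"
    return "neutral"


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the call; they are not study data.
    years = divergence_time_years(ks=2e-6, mu_per_site_per_generation=1e-10)
    print(f"Estimated divergence: {years:.0f} years")
    # Ka/Ks of 0.19 is the value reported in the text for the emergence of O26.
    print(selection_regime(0.19))
```

Because the estimate scales inversely with both the assumed mutation rate and the assumed number of generations per year, any date derived this way is only as reliable as those two assumptions.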

Supplementary Material

Supplementary tables S1–S4 and figures S1 and S2 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).
References (10 of 60 shown)

1.  Characterisation of the Escherichia coli strain associated with an outbreak of haemolytic uraemic syndrome in Germany, 2011: a microbiological study.

Authors:  Martina Bielaszewska; Alexander Mellmann; Wenlan Zhang; Robin Köck; Angelika Fruth; Andreas Bauwens; Georg Peters; Helge Karch
Journal:  Lancet Infect Dis       Date:  2011-06-22       Impact factor: 25.071

2.  Vital signs: incidence and trends of infection with pathogens transmitted commonly through food--foodborne diseases active surveillance network, 10 U.S. sites, 1996-2010.

Authors: 
Journal:  MMWR Morb Mortal Wkly Rep       Date:  2011-06-10       Impact factor: 17.586

3.  Enterohemorrhagic Escherichia coli O26:H11-Associated Hemolytic Uremic Syndrome: Bacteriology and Clinical Presentation.

Authors:  Lothar-Bernd Zimmerhackl; Alejandra Rosales; Johannes Hofer; Magdalena Riedl; Therese Jungraithmayr; Alexander Mellmann; Martina Bielaszewska; Helge Karch
Journal:  Semin Thromb Hemost       Date:  2010-09-23       Impact factor: 4.180

4.  Origins of the E. coli strain causing an outbreak of hemolytic-uremic syndrome in Germany.

Authors:  David A Rasko; Dale R Webster; Jason W Sahl; Ali Bashir; Nadia Boisen; Flemming Scheutz; Ellen E Paxinos; Robert Sebra; Chen-Shan Chin; Dimitris Iliopoulos; Aaron Klammer; Paul Peluso; Lawrence Lee; Andrey O Kislyuk; James Bullard; Andrew Kasarskis; Susanna Wang; John Eid; David Rank; Julia C Redman; Susan R Steyert; Jakob Frimodt-Møller; Carsten Struve; Andreas M Petersen; Karen A Krogfelt; James P Nataro; Eric E Schadt; Matthew K Waldor
Journal:  N Engl J Med       Date:  2011-07-27       Impact factor: 91.245

5.  Enterohemorrhagic Escherichia coli O26:H11/H-: a new virulent clone emerges in Europe.

Authors:  Martina Bielaszewska; Alexander Mellmann; Stefan Bletz; Wenlan Zhang; Robin Köck; Annelene Kossow; Rita Prager; Angelika Fruth; Dorothea Orth-Höller; Monika Marejková; Stefano Morabito; Alfredo Caprioli; Denis Piérard; Geraldine Smith; Claire Jenkins; Katarína Curová; Helge Karch
Journal:  Clin Infect Dis       Date:  2013-02-01       Impact factor: 9.079

6.  Obscured phylogeny and possible recombinational dormancy in Escherichia coli.

Authors:  Shana R Leopold; Stanley A Sawyer; Thomas S Whittam; Phillip I Tarr
Journal:  BMC Evol Biol       Date:  2011-06-27       Impact factor: 3.260

7.  Prospective genomic characterization of the German enterohemorrhagic Escherichia coli O104:H4 outbreak by rapid next generation sequencing technology.

Authors:  Alexander Mellmann; Dag Harmsen; Craig A Cummings; Emily B Zentz; Shana R Leopold; Alain Rico; Karola Prior; Rafael Szczepanowski; Yongmei Ji; Wenlan Zhang; Stephen F McLaughlin; John K Henkhaus; Benjamin Leopold; Martina Bielaszewska; Rita Prager; Pius M Brzoska; Richard L Moore; Simone Guenther; Jonathan M Rothberg; Helge Karch
Journal:  PLoS One       Date:  2011-07-20       Impact factor: 3.240

8.  Human infections with non-O157 Shiga toxin-producing Escherichia coli, Switzerland, 2000-2009.

Authors:  Ursula Käppeli; Herbert Hächler; Nicole Giezendanner; Lothar Beutin; Roger Stephan
Journal:  Emerg Infect Dis       Date:  2011-02       Impact factor: 6.883

9.  Highly virulent Escherichia coli O26, Scotland.

Authors:  Kevin G J Pollock; Sheetal Bhojani; T James Beattie; Lesley Allison; Mary Hanson; Mary E Locking; John M Cowden
Journal:  Emerg Infect Dis       Date:  2011-09       Impact factor: 6.883

10.  KaKs_Calculator 2.0: a toolkit incorporating gamma-series methods and sliding window strategies.

Authors:  Dapeng Wang; Yubin Zhang; Zhang Zhang; Jiang Zhu; Jun Yu
Journal:  Genomics Proteomics Bioinformatics       Date:  2010-03       Impact factor: 7.691
