
Genetic diversity and delineation of Salmonella Agona outbreak strains by next generation sequencing, Bavaria, Germany, 1993 to 2018.

Alexandra Dangel1,2, Anja Berger1,2, Ute Messelhäußer1, Regina Konrad1, Stefan Hörmansdorfer1, Nikolaus Ackermann1, Andreas Sing1.   

Abstract

Background: In 2017, a food-borne Salmonella Agona outbreak caused by infant milk products from a French supplier occurred in Europe. Simultaneously, S. Agona was detected in animal feed samples in Bavaria.

Aim: Using next generation sequencing (NGS) and three data analysis methods, this study's objectives were to verify clonality of the Bavarian feed strains, rule out their connection to the outbreak, explore the genetic diversity of Bavarian S. Agona isolates from 1993 to 2018 and compare the analysis approaches employed for practicality and ability to delineate outbreaks caused by the genetically monomorphic Agona serovar.

Methods: In this observational retrospective study, three 2017 Bavarian feed isolates were compared to a French outbreak isolate and 48 S. Agona isolates from our strain collections. The latter included human, food, feed, veterinary and environmental isolates, of which 28 were epidemiologically outbreak related. All isolates were subjected to NGS and analysed by: (i) a publicly available species-specific core genome multilocus sequence typing (cgMLST) scheme, (ii) single nucleotide polymorphism phylogeny and (iii) an in-house serovar-specific cgMLST scheme. Using additional international S. Agona outbreak NGS data, the cluster resolution capacity of the two cgMLST schemes was assessed.

Results: We could prove clonality of the feed isolates and exclude their relation to the French outbreak. All approaches confirmed former Bavarian epidemiological clusters.

Conclusion: Even for S. Agona, species-level cgMLST can produce reasonable resolution, being standardisable by public health laboratories. For single samples or homogeneous sample sets, higher resolution by serovar-specific cgMLST or SNP genotyping can facilitate outbreak investigations.

Keywords:  Europe; Germany; Salmonella; bacterial infections; food-borne infections; molecular methods; outbreaks; public health; salmonellosis; surveillance

Year:  2019        PMID: 31064635      PMCID: PMC6505185          DOI: 10.2807/1560-7917.ES.2019.24.18.1800303

Source DB:  PubMed          Journal:  Euro Surveill        ISSN: 1025-496X


Introduction

Salmonellosis is one of the most common food-borne human diseases. It is often transmitted via contaminated meat, eggs or seafood products. Moreover, due to the robustness of Salmonella spp., dried products like herbs or spices have also proven their potential as vehicles of infection. Of more than 2,600 different serovars of S. enterica, only a few non-typhoidal serovars are responsible for most human infections. In S. enterica subsp. enterica, these serovars include for example Enteritidis, Typhimurium and Agona. In Europe, S. Agona is far from leading the list of pathogenic serovars, as cases of S. Enteritidis and S. Typhimurium are much more numerous [1]. Globally, S. Agona is a common pathogen and food-borne outbreaks connected to it have been consistently reported in several countries. Examples are a 2002−03 outbreak caused by aniseed-fennel-caraway tea products affecting 77 patients in Germany [2,3], a 2008 outbreak connected to meat products of one supplier causing 163 infections in 10 different countries with most cases in the United Kingdom [4], a 2011 multi-state outbreak in the United States (US) caused by fresh papaya resulting in more than 100 infected patients [5], and a point-source outbreak caused by tuna sushi in Sydney, Australia in 2015 [6]. In outbreak investigations, serotyping and phage typing have been used for decades in many laboratories including reference laboratories. Serotyping still serves as a gold-standard technique for routine typing. In combination with other typing techniques like phage typing, it may be suited for the investigation of small, geographically limited outbreaks [7]. However, many serovars are polyphyletic, so serotyping sometimes groups genetically unrelated isolates together and thus fails to recognise evolutionary groupings in some cases. 
Therefore, attempts were made some years ago to replace this technique by molecular typing methods such as multilocus sequence typing (MLST), which is able to recognise evolutionary relationships with higher resolution [8]. Furthermore, pulsed-field gel electrophoresis (PFGE), which classifies bacteria based on their band pattern after chromosomal restriction, has for many years been used globally as a standard molecular technique in outbreak investigations [9-11]. However, despite the advantages of molecular techniques, traditional serotyping is still universally used and provides an important historical context. Beyond PFGE and MLST, variable number of tandem repeats (VNTRs) proved to be suitable molecular targets for assessing genetic polymorphisms within bacterial species [12,13]. Multilocus variable-number tandem repeat analysis (MLVA), a form of VNTR typing, showed increased analysis depth in outbreak investigations and proved to be suitable for important Salmonella serovars such as Enteritidis, Typhimurium or Dublin [14-16]. However, the variability of protocols and targets hindered comprehensive standardisation, although efforts towards this are ongoing [17,18]. All these techniques provide reliable first-level classification and are discriminative enough to investigate epidemiologically well-defined outbreaks. Nonetheless, a number of studies have since shown that whole genome sequencing (WGS) gives the highest resolution for outbreak investigation, especially if case distribution is diffuse with respect to geographical area or time frame of occurrence [7,19,20]. For implementation of WGS in S. Agona outbreak investigations and molecular surveillance, it has to be taken into account that this serovar is monophyletic, more homogeneous, and evolutionarily younger than most other well-investigated pathogenic serovars [21,22]. 
In 2017, three feed samples (rapeseed meal) of a factory in the district of Lower Bavaria were submitted to the Bavarian Health and Food Safety Authority. Culture-based species and serovar identification detected S. Agona in all three samples. In December 2017, an outbreak of the same serovar was reported in France, attributable to 37 French cases and two international cases, caused by infant milk products of a French supplier and traceable to one single French production facility [23,24]. The aim of the current WGS investigation, using next generation sequencing (NGS) was to verify potential clonality of the Bavarian isolates, to exclude any connection between the Bavarian feed samples and the simultaneous French outbreak and to gain a more precise insight into the genetic diversity of S. Agona collected in Bavaria over the past 25 years.

Methods

Isolates and sequence data used in the analyses

For this observational retrospective study, we used 48 S. Agona isolates dated from 1993 to 2018 from our strain collections to investigate the three isolates of the Bavarian rapeseed meal from 2017 in a wider context. The total of 51 isolates comprised all the S. Agona isolates available to us. As many diagnostic laboratories exist in Bavaria, our isolates were not representative of the occurrence of S. Agona in this federal state, where, for example, 99 cases of human infections had been officially notified in the 2011 to 2018 period alone. The 48 isolates that we employed were from humans, food, feed, animals or the environment. A total of 28 thereof were outbreak-related. Outbreak isolates belonged to four epidemiologically-linked events represented by 15, three, three and two isolates as well as an additional five isolates with a suspected epidemiological connection. The 48 isolates from 1993 to 2018 were studied by WGS together with the three 2017 isolates of the Bavarian rapeseed feed. Raw NGS data of the published representative isolate of the French outbreak [24], available under European Nucleotide Archive (ENA) accession ERR2219379, were also added to the bioinformatics analyses. For the evaluation of necessary analysis depth, 70 NGS raw datasets from various European outbreaks, published by Zhou et al. 2013 [21], available under National Center for Biotechnology Information (NCBI) bioproject PRJEB1944, were added to the data analysis as well. To distinguish the subset of isolates derived from our strain collections, or data thereof, from the data published by Zhou et al. [21], we further refer to our 51 S. Agona isolates/strains as ‘Bavarian’.

Species and serovar identification

All Bavarian S. Agona strains were cultured on Columbia sheep blood agar (Oxoid, Wesel, Germany) and identified by matrix-assisted laser desorption/ionisation time-of-flight mass spectrometry (MALDI-TOF MS; Bruker, Bremen, Germany). Somatic (O) and flagellar (H) antigens were identified by slide agglutination (antisera provided by Sifin, Berlin, Germany) according to the White−Kauffmann−Le Minor scheme [25].

Next generation sequencing

Salmonella Agona isolates were freshly grown on blood agar plates. One inoculation loop of bacterial material was suspended in 50 µL phosphate buffered saline (PBS) and cells were pre-treated with 1 µg lysozyme for 15 min at 37 °C followed by a 2 hour incubation step at 65 °C with 200 µL incorporation buffer, 200 µL lysis buffer, 30 µL 20 mg/mL Proteinase K and 10 µL 10 mg/mL ribonuclease (RNase) A (all reagents from Promega, Mannheim, Germany). Genomic DNA (gDNA) was then isolated with the Maxwell 16 LEV Blood DNA Kit on the Maxwell 16 instrument (Promega, Mannheim, Germany) according to manufacturer’s instructions with Tris buffer for gDNA elution. Whole genome libraries for NGS were prepared using the Nextera XT kit (Illumina, San Diego, California, US). Next generation sequencing was performed on the Illumina MiSeq with 2x250 bp paired-end reads. Sequencing runs were evaluated for quality using the Illumina SAV Software. Sequencing data were uploaded to the NCBI sequence read archive (SRA) [26], under BioProject PRJNA473689.

Multilocus sequence typing analyses

Core genome MLST (cgMLST) of reads was performed with Ridom SeqSphere+ software (Ridom, Münster, Germany [27]) with default settings for trimming and Velvet assembly. For the assignment of cgMLST alleles, two different schemes were used: (i) a publicly available S. enterica (species level) cgMLST scheme designed by Enterobase and (ii) an in-house serovar-specific cgMLST scheme.

Enterobase-designed Salmonella enterica core genome multilocus typing scheme

The publicly available species-specific S. enterica cgMLST scheme implemented in the SeqSphere+ software, with the 3,002 target loci developed by Enterobase as the Salmonella cgMLST v2 scheme, was employed to analyse sequencing data [28,29].

In-house serovar-specific Salmonella Agona core genome multilocus typing scheme

An in-house developed serovar-specific S. Agona scheme, based on reference genome NC_011149.1 of strain SL-483 and query genomes NC_022991.1, NZ_CP015024.1 and NZ_CP011259.1, was generated using the following default filter thresholds. For the reference genome, the filter thresholds were: (i) minimum length: 60 bases; (ii) start codon and single stop codon required at beginning and end of gene; (iii) homologous/paralogous gene filter, excluding multiple copies of a gene with basic local alignment search tool (BLAST) overlap ≥ 100 bp or identity ≥ 90%; (iv) overlap filter, excluding overlap with other genes > 4 bases. For the query genomes, the filter thresholds were: (i) start and stop codon required at beginning and end of gene; (ii) BLAST hit locus overlap = 100% and identity ≥ 90% in every query genome; (iii) BLAST options: word size = 11, mismatch penalty = −1, match reward = 1, gap open costs = 5, gap extension costs = 2. The final scheme comprised 4,111 target loci. New alleles and sequence types (ST) were submitted to the nomenclature server for the public scheme.
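As an illustration of the first two reference genome filters (minimum length, and start codon plus single terminal stop codon), a minimal Python sketch is given below. It is not the SeqSphere+ implementation: the function name, the set of accepted start codons and the requirement that the length be a multiple of three are assumptions made for the example, and the homologue and overlap filters are not covered.

```python
# Illustrative only: a hypothetical helper, not part of SeqSphere+.
START_CODONS = {"ATG", "GTG", "TTG"}  # assumption: common bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LENGTH = 60                        # minimum gene length in bases (from the text)


def passes_reference_filters(seq: str) -> bool:
    """Return True if a coding sequence meets the length and codon criteria."""
    seq = seq.upper()
    # Assumption: a valid coding sequence has a length divisible by three.
    if len(seq) < MIN_LENGTH or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    if codons[0] not in START_CODONS:
        return False
    # Exactly one stop codon is allowed, and it must be the last codon.
    stop_positions = [i for i, codon in enumerate(codons) if codon in STOP_CODONS]
    return stop_positions == [len(codons) - 1]
```

A candidate gene with an internal stop codon, a missing start codon or fewer than 60 bases would be rejected by this check before any BLAST-based filtering.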

Assessment of relationships between isolates’ core genome multilocus sequence type allelic profiles by minimum spanning trees

cgMLST typing results were visualised in minimum spanning trees (MSTs). For each scheme, samples were excluded if fewer than 90% of the scheme’s target loci fulfilled the ‘good target’ quality control (QC) criteria (target present in the isolate, same length as reference +/− 3 triplets, without ambiguities and without frame shifts in consensus; Table 1). In both schemes, a cluster was defined as a group of closely related cgMLST-analysed isolates with a single-linkage threshold of ≤ 7 alleles. This was the default distance threshold for the software-implemented public S. enterica scheme and was adopted for the S. Agona scheme to allow direct comparison. During typing with the S. enterica cgMLST scheme, the SeqSphere+ software assigns an existing cluster type (CT) to each isolate with an allelic distance ≤ 7 to an already established CT founder profile on the central nomenclature server. Otherwise, a new CT is established and uploaded, and the isolate becomes the founder of this CT [30,31].
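The cluster definition above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SeqSphere+ algorithm: allelic profiles are represented as lists of allele numbers with None for a target that failed QC, missing targets are ignored pairwise when counting differences, and all function names are hypothetical.

```python
from itertools import combinations

CLUSTER_THRESHOLD = 7    # maximum single-linkage allelic distance (from the text)
MIN_GOOD_TARGETS = 0.90  # minimum fraction of 'good targets' (from the text)


def passes_qc(profile):
    """Keep an isolate only if at least 90% of loci have a called allele."""
    called = sum(allele is not None for allele in profile)
    return called / len(profile) >= MIN_GOOD_TARGETS


def allelic_distance(p, q):
    """Count loci with differing alleles, ignoring loci missing in either profile."""
    return sum(a != b for a, b in zip(p, q) if a is not None and b is not None)


def single_linkage_clusters(profiles, threshold=CLUSTER_THRESHOLD):
    """Group isolates connected by chains of pairwise distances <= threshold."""
    names = [name for name, profile in profiles.items() if passes_qc(profile)]
    parent = {name: name for name in names}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(names, 2):
        if allelic_distance(profiles[a], profiles[b]) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    # Only groups of two or more isolates are reported as clusters.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)
```

Because linkage is single, two isolates more than seven alleles apart can still end up in one cluster if an intermediate isolate lies within seven alleles of both.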
Table 1

Characteristics of sequenced Bavarian Salmonella Agona isolates and French representative outbreak isolate, Germany, 1993−2018 (n = 52)

| Sample name | Year | Material | Isolation source | Country/state | Country of origin/travel history | S. enterica cgMLST CT | cgMLST cluster | Epidemiological link |
|---|---|---|---|---|---|---|---|---|
| ERR2219379 a | 2017 | NA | Human | France | NA | 704 | ND | NA |
| SA0001 | 2011 | Stool | Human | Germany/Bavaria | Iraq | 1195 | 2 | Travel cluster |
| SA0002 | 2011 | Stool | Human | Germany/Bavaria | Iraq | 1215 | ND | NA |
| SA0004 | 2012 | Stool | Human | Germany/Bavaria | Afghanistan | 1195 | 2 | Travel cluster |
| SA0005 | 2012 | Stool | Human | Germany/Bavaria | Madagascar | 1209 | None | NA |
| SA0006 | 2013 | Stool | Human | Germany/Bavaria | Germany | 1210 | ND | NA |
| SA0007 | 2013 | Stool | Human | Germany/Bavaria | Russia | 1211 | ND | NA |
| SA0008 | 2013 | Bacterial strain | Human | Germany/Bavaria | Germany | 1212 | ND | NA |
| SA0009 | 2013 | Bacterial strain | Human | Germany/Bavaria | Germany | 1199 | 6 | NA |
| SA0011 | 2014 | Stool | Human | Germany/Bavaria | Unspecified foreign country | 1214 | 8 | NA |
| SA0012 | 2014 | Stool | Human | Germany/Bavaria | Syria | 1195 | 2 | Travel cluster |
| SA0013 | 2014 | Stool | Human | Germany/Bavaria | Syria | 1195 | 2 | Travel cluster |
| SA0015 | 2014 | Stool | Human | Germany/Bavaria | Syria | 1194 | ND | NA |
| SA0016 | 2014 | Stool | Human | Germany/Bavaria | Nigeria | 1195 | 2 | Travel cluster |
| SA0017 | 2015 | Stool | Human | Germany/Bavaria | Kosovo under UN Security Council Resolution 1244 | 1196 | ND | NA |
| SA0018 b | 2015 | Stool | Human | Germany/Bavaria | Eritrea | 1214 | 8 b | NA |
| SA0019 | 2015 | Bacterial strain | Human | Germany/Bavaria | Colombia | 1197 | ND | NA |
| SA0031 b | 2018 | Nutritional supplement | Nutritional supplement | Germany/Bavaria | NA | 1513 | ND b | NA |
| SA0032 | 2017 | Stool | Human | Germany/Bavaria | NA | 1198 | ND | NA |
| SA0033 | 2017 | Stool | Human | Germany/Bavaria | NA | 1124 | ND | NA |
| SA0034 | 2017 | Stool | Human | Germany/Bavaria | NA | 1199 | 6 | NA |
| SA0035 | 2017 | Cattle | Cattle | Germany/Bavaria | NA | 1200 | ND | NA |
| SA0036 | 2015 | Chicken faeces | Environment | Germany/Bavaria | NA | 1134 | 4 | Chicken cluster |
| SA0037 b,c | 2015 | Cattle | Cattle | Germany/Bavaria | NA | ND c | ND b,c | NA |
| SA0038 | 2015 | Chicken faeces | Environment | Germany/Bavaria | NA | 1134 | 4 | Chicken cluster |
| SA0039 | 2015 | Chicken faeces | Environment | Germany/Bavaria | NA | 1134 | 4 | Chicken cluster |
| SA0040 | 2015 | Chicken | Environment | Germany/Bavaria | NA | 1201 | ND | NA |
| SA0041 | 2008 | Black pepper | Spices | Germany/Bavaria | NA | 1202 | ND | NA |
| SA0042 | 2017 | Animal feed | Animal feed | Germany/Bavaria | NA | 1193 | 3 | Feed cluster |
| SA0043 | 2017 | Animal feed | Animal feed | Germany/Bavaria | NA | 1193 | 3 | Feed cluster |
| SA0044 | 2017 | Animal feed | Animal feed | Germany/Bavaria | NA | 1193 | 3 | Feed cluster |
| SA0045 | 2003 | Digestive tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0046 | 2003 | Digestive tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0047 | 2003 | Cough and bronchial tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0048 | 1994 | Turkey leg | Food | Germany/Bavaria | NA | 1204 | ND | NA |
| SA0050 | 1994 | Shredded coconut | Food | Germany/Bavaria | NA | 1205 | 5 | Coconut cluster |
| SA0051 | 1994 | Shredded coconut | Food | Germany/Bavaria | NA | 1207 | 7 | NA |
| SA0052 | 1994 | Shredded coconut | Food | Germany/Bavaria | NA | 1205 | 5 | Coconut cluster |
| SA0053 | 2003 | Aniseed | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0054 | 2003 | Children calming tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0055 | 2003 | Children calming tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0056 | 2003 | Aniseed | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0057 b | 2003 | Aniseed organic | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 b | Tea outbreak |
| SA0058 | 2003 | Aniseed organic | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0059 | 2003 | Pectoral tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0060 | 2003 | Flatulence tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0061 | 2003 | Pectoral and cough tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0062 | 2003 | Pectoral and cough tea | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0063 | 2003 | Aniseed | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0064 | 2003 | Aniseed | Tea/raw tea | Germany/Bavaria | NA | 1203 | 1 | Tea outbreak |
| SA0065 | 1993 | Turkey | Food | Germany/Bavaria | NA | 1207 | 7 | NA |
| SA0066 | 2018 | Stool | Human | Germany/Bavaria | Thailand, Cambodia, Vietnam | 1206 | ND | NA |

cgMLST: core genome multilocus sequence typing; CT: cluster type; NA: no information available; ND: none detected; UN: United Nations.

a Representative dataset of the French outbreak.

b Excluded from cgMLST with S. Agona scheme.

c Excluded from cgMLST with S. enterica scheme.


Multilocus sequence typing

In silico MLST analysis of NGS data was performed with the standard seven-gene target scheme [32].

Single nucleotide polymorphism phylogeny

Whole genome (wg) single nucleotide polymorphism (SNP)-based phylogeny was calculated using the run_snp_pipeline script of the PHEnix pipeline by Public Health England [33]. It includes trimming with trimmomatic [34], mapping to the reference genome NC_011149.1 of strain SL-483 with bwa-mem mapping [35] with default settings, variant calling and filtering (frequency ≥ 0.9, mapping quality score ≥ 30, read depth ≥ 10) by Genome Analysis Toolkit (GATK)2 Unified Genotyper [36]. Variant calls for all SNP positions passing filters and all positions not passing filters were extracted and SNPs were concatenated to alignments with the vcf2fasta-script from the same pipeline, allowing ≤ 90% missing data per sample and ≤ 20% missing data per position throughout all samples. Maximum likelihood (ML) trees were generated from SNP alignments by RaxML [37], including 100 bootstrap replicates.
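The filtering thresholds described above can be illustrated with a short Python sketch. It does not reproduce the PHEnix code: the function names, the representation of calls as lists with None for a position that failed filtering, and the order of the two missing-data filters (samples first, then positions) are assumptions made for the example.

```python
MIN_FREQ = 0.9                  # minimum variant frequency (from the text)
MIN_MAPQ = 30                   # minimum mapping quality score (from the text)
MIN_DEPTH = 10                  # minimum read depth (from the text)
MAX_MISSING_PER_SAMPLE = 0.9    # allowed fraction of missing data per sample
MAX_MISSING_PER_POSITION = 0.2  # allowed fraction of missing data per position


def passes_filters(freq, mapq, depth):
    """Return True if a single variant call meets all three thresholds."""
    return freq >= MIN_FREQ and mapq >= MIN_MAPQ and depth >= MIN_DEPTH


def build_alignment(calls):
    """Build a SNP alignment from {sample: [base or None, ...]}.

    None marks a position whose call did not pass the filters.
    """
    n_positions = len(next(iter(calls.values())))
    # Assumption: samples are filtered before positions.
    kept = {
        sample: bases
        for sample, bases in calls.items()
        if sum(b is None for b in bases) / n_positions <= MAX_MISSING_PER_SAMPLE
    }
    if not kept:
        return {}
    columns = [
        i for i in range(n_positions)
        if sum(kept[s][i] is None for s in kept) / len(kept) <= MAX_MISSING_PER_POSITION
    ]
    # Remaining missing calls are written as N in the alignment.
    return {s: "".join(kept[s][i] or "N" for i in columns) for s in kept}
```

The resulting dictionary of equal-length strings corresponds to the concatenated SNP alignment that would then be passed to a tree-building program.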

Results

Next generation sequencing of the 51 Bavarian S. Agona isolates (named with prefix SA), collected between 1993 and 2018, yielded high-quality reads and reference genome coverage of 28−171 fold. In our bioinformatics data analyses, we included the published NGS raw data (ENA accession: ERR2219379) of an isolate from a case of the infant-milk-caused outbreak originating from a France-based manufacturer [24]. This isolate served as a representative of the French outbreak. All isolates were typed by in silico MLST, resulting in ST 13, the typical ST for serovar Agona [8]. In a next step, the isolates were typed with the Ridom SeqSphere+ software using the public S. enterica cgMLST scheme, consisting of 3,002 Enterobase-developed target loci (Figure 1). Additionally, wg SNP analysis was performed to investigate the phylogenetic relationship at the highest possible resolution (Figure 2).
Figure 1

Minimum spanning tree of the core genome multilocus sequence type allelic profiles of Salmonella Agona strains, including 51 Bavarian isolates, a representative isolate of an infant-milk-caused outbreak in France and the reference strain SL-483, Germany, 1993−2018 (n = 53 isolates)

cgMLST: core genome multilocus sequence typing. Bavarian S. Agona samples have the prefix SA. The representative isolate of the infant-milk outbreak in France [23] is ERR2219379. The National Center for Biotechnology Information (NCBI) reference genome of strain SL-483 has GenBank accession number NC_011149.1. The analysis and tree were obtained with the public Ridom SeqSphere+-integrated S. enterica cgMLST scheme of 3,002 target loci. On the tree, allele distances between samples are indicated. Clusters of samples with maximum seven alleles distance are shaded in grey. Samples are colour coded by their isolation source, as given in the legend.

Figure 2

Maximum likelihood tree resulting from the whole genome single nucleotide polymorphism-based phylogenetic analysis of Salmonella Agona strains including 51 Bavarian isolates, a representative isolate of an infant milk-caused outbreak in France and the reference strain SL-483, Germany, 1993−2018 (n = 53 isolates)

cgMLST: core genome multilocus sequence typing. Bavarian S. Agona samples have the prefix SA. The representative isolate of the infant milk outbreak in France [23] is ERR2219379. The National Center for Biotechnology Information (NCBI) reference genome of strain SL-483 has GenBank accession number NC_011149.1 and figures in the tree as ‘reference’. In the tree, samples are colour coded according to isolation source, S. Agona cgMLST cluster and collection year as given in the legend. The general scale bar indicates 0.001 substitutions per site (597 SNPs), based on an alignment of 596,849 positions. SNP distance bars between specific samples are indicated on the right side. Blue vertical SNP scale bars indicate epidemiologically-linked cgMLST clusters, blue vertical dashed lines indicate cgMLST clusters without epidemiological link and black vertical lines indicate exemplary distances between non clustered samples.

The MST from results of the S. enterica cgMLST scheme, including all samples exceeding the target-QC cut-off, revealed in total eight clusters with maximum six alleles difference (Figure 1). The four most relevant clusters (1−4) comprised three to 15 samples with a maximum within cluster difference of zero to five alleles (Table 2). Each of the remaining four clusters (5−8) included two samples and internal distances ranging from zero (cluster 7) to six alleles (cluster 8).
Table 2

Allele and single nucleotide polymorphism differences within and between clusters of the sequenced Bavarian Salmonella Agona isolates, Germany, 1993−2018 (n = 34 isolates)

| Measure | Method | Cluster 1 (15) | Cluster 2 (5) | Cluster 3 (3) | Cluster 4 (3) | Cluster 5 (2) | Cluster 6 (2) | Cluster 7 (2) | Cluster 8 a (2) |
|---|---|---|---|---|---|---|---|---|---|
| Epidemiological link | | Tea outbreak | Travel cluster | Feed cluster | Chicken cluster | Coconut cluster | None detected | None detected | None detected |
| Within cluster distance: median (min–max) | S. enterica cgMLST | 0 (0–3) | 2 (0–5) | 0 (0–2) | 0 (0–0) | 4 (4–4) | 2 (2–2) | 0 (0–0) | 6 (6–6) |
| | S. Agona cgMLST | 0 (0–3) | 2 (0–6) | 1 (0–3) | 0 (0–2) | 3 (3–3) | 2.5 (0–5) | 0 (0–0) | None a |
| | SNP phylogeny | 1 (0–5) | 3.5 (0–7) | 2 (0–2) | 2 (0–2) | 6 (6–6) | 7 (7–7) | 0 (0–0) | 8 (8–8) |
| Min distance to nearest neighbour | S. enterica cgMLST | 10 | 20 | 17 | 20 | 10 | 27 | 23 | 12 |
| | S. Agona cgMLST | 15 | 30 | 25 | 28 | 15 | 33 | 37 | None a |
| | SNP phylogeny | 21 | 47 | 37 | 37 | 21 | 49 | 59 | 22 |
| Nearest neighbour outside cluster in question | S. enterica cgMLST | SA0040, SA0050 (in cluster 5) | SA0007 | SA0053, SA0054, SA0055, SA0058 (all cluster 1) | SA0055 (in cluster 1) | SA0053, SA0054, SA0055, SA0058 (all cluster 1) | SA0053 (in cluster 1) | SA0055, SA0056 (all cluster 1) | SA0015 |
| | S. Agona cgMLST | SA0007, SA0050 (in cluster 5) | SA0047 (in cluster 1) | SA0047 and SA0058 (both in cluster 1), SA0007 | SA0007 | SA0055, SA0056 (all cluster 1) | SA0053 (in cluster 1) | SA0007 | None a |
| | SNP phylogeny | SA0050 (in cluster 5) | SA0047 (in cluster 1) | SA0007 | SA0007 | SA0057 (in cluster 1) | SA0047 (in cluster 1) | SA0018 a (in cluster 8) | SA0015 |

cgMLST: core genome multilocus sequence typing; max: maximum; min: minimum; SNP: single nucleotide polymorphism.

a Cluster 8 not detected in S. Agona cgMLST due to exclusion of sample SA0018.
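The measures reported in Table 2 can be derived from a pairwise distance matrix. The Python sketch below shows one way to do so; it assumes that the within-cluster value is the median (min–max) over all pairwise distances between cluster members, which the table does not state explicitly. Function names and the use of frozenset keys are illustrative choices.

```python
from itertools import combinations
from statistics import median


def within_cluster_summary(distances, members):
    """Return (median, min, max) of pairwise distances between cluster members.

    distances maps frozenset({a, b}) to the allele or SNP distance of a and b.
    """
    values = [distances[frozenset(pair)] for pair in combinations(members, 2)]
    return median(values), min(values), max(values)


def nearest_neighbours(distances, members, all_isolates):
    """Return the minimum distance to isolates outside the cluster and who they are."""
    outside = [iso for iso in all_isolates if iso not in members]
    pairs = [(distances[frozenset((m, o))], o) for m in members for o in outside]
    best = min(d for d, _ in pairs)
    # Several outside isolates can share the same minimum distance.
    return best, sorted({o for d, o in pairs if d == best})
```

Both functions accept allele distances from either cgMLST scheme or SNP distances, so the same calculation can fill all three method rows of the table.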

None of the 51 Bavarian isolates collected from 1993–2018 clustered with the representative sample from the recent outbreak in France. The French representative sample ERR2219379 differed from the Bavarian samples in at least 15 alleles/40 SNPs. Differences of this size were also observed between unrelated isolates and clusters from different years or with different epidemiological origins throughout the whole sample set (Figure 1, Figure 2, Table 1). The three Bavarian feed isolates collected in 2017 from rapeseed cake (SA0042, SA0043, SA0044) formed a distinct cluster with two alleles maximum distance (Figure 1, cluster 3), corresponding to two SNPs in the wg SNP analysis (Figure 2). They differed from the nearest neighbours outside the cluster in at least 17 alleles/37 SNPs (Table 2). Furthermore, some of the Bavarian strains isolated in former years aggregated in epidemiologically described clusters, although the isolates available in our strain collection for sequencing mainly covered food or veterinary samples, and corresponding human isolates were not available for analysis. Most strikingly, the isolates of the biggest cluster with 15 samples (Figure 1, cluster 1) were all isolated from tea or raw tea products connected to a diffuse outbreak caused by aniseed-fennel-caraway tea products in 2002–03. The outbreak investigation at that time identified contaminated raw tea imported from Turkey as the source [2,3]. All isolates of this cluster were closely connected with zero to one allele single-linkage distance (Figure 1) and three alleles or five SNPs (Figure 2) maximum distance within the cluster. They differed in at least 10 alleles/26 SNPs from all other non-connected isolates (Table 2). 
The five isolates of cluster 2 with single-linkage distances of zero to two alleles and maximum intra-cluster distance of five alleles or seven SNPs were obtained from asylum seekers between 2011 and 2015 (Table 1). Details of their travel history regarding countries or period are unknown. Hence the slightly higher variation of their allelic/SNP distances than in other point-sourced epidemiologically linked clusters is not surprising. Cluster 4 consists of three samples with zero alleles/SNPs difference to each other but at least 20 alleles/45 SNPs difference to the other samples. Isolates in this cluster shared a clear epidemiological link, as samples originated from chicken faecal samples, collected in 2015 in different laying hen flocks of one Bavarian egg producer. Clusters 5, 6, 7 and 8 only consist of two isolates each, respectively. Cluster 5 was built up from two shredded coconut samples from 1994 with four alleles/six SNPs difference which likely have an epidemiological link although information on their origin or supplier is not available. Isolates from two human patients without epidemiological link clustered together with two alleles (cluster 6) and seven SNPs distance. Cluster 7 came from two genetically identical strains from food samples from 1993−94 (shredded coconut and turkey) for which no epidemiological link is known. One additional pair of closely related human samples (SA0011 and SA0018) with reported travel history to Eritrea (SA0018) and an unspecified country (SA0011) built cluster 8 with six alleles and eight SNPs difference. To evaluate the needed resolution of the cgMLST analysis regarding analysed genomic content and consequential cluster demarcation, the species level typing with the public S. enterica cgMLST scheme, consisting of Enterobase-developed 3,002 target loci was compared with an in-house generated ad hoc serovar Agona-specific cgMLST scheme with 4,111 target loci (Figure 3). 
Typing with both schemes was performed using the SeqSphere+ software algorithms. As no empirical threshold had been defined for the serovar-specific scheme prior to the analysis, it was used with the same thresholds as the species-specific scheme. For the comparison of the data analysis approaches, 70 published NGS raw datasets of isolates described by Zhou et al. [21] were added in both approaches to widen the view on cluster definition and concordance of genomic and epidemiological data on a supra-regional level. This dataset comprised S. Agona isolates from 1952 to 2010 and covered several European outbreaks. The results of the public S. enterica cgMLST scheme (Figure 3A) are very similar to those of the in-house S. Agona cgMLST scheme (Figure 3B) and SNP-profiling (Figure 2). All Bavarian and published outbreak clusters were detected by both schemes (Figure 3A and B). Therefore, the usage of the public S. enterica scheme was generally evaluated as practical for S. Agona outbreak investigations.
Figure 3

Minimum spanning trees obtained with different typing schemes (Panels A and B) of the core genome multilocus sequence type allelic profiles of 51 Bavarian Salmonella Agona isolates, a representative isolate of an infant milk outbreak in France, the reference strain SL-483 and 70 isolates from various European outbreaks (n = 123 isolates)a

cgMLST: core genome multilocus sequence typing. a Due to shortfall below the 90% good target threshold for the S. Agona scheme target loci, SA0018 does not figure in panel B, which shows a total of 122 isolates. Bavarian S. Agona samples have the prefix SA. The representative isolate of the infant milk outbreak in France [23] is ERR2219379. The National Center for Biotechnology Information (NCBI) reference genome of strain SL-483 has GenBank accession number NC_011149.1. The 70 isolates from various European outbreaks are published in a separate study [21]. Allele distances between samples are indicated and clusters of samples with minimum spanning distances of zero to seven alleles are shaded in grey. Samples are colour coded by their isolation source, as given in the legend.

As SA0018 was excluded from the serovar-specific cgMLST due to shortfall below the 90% good target threshold for the S. Agona scheme target loci, cluster 8 detected in species-specific cgMLST (Figure 1, Figure 3A) was not apparent in serovar-specific cgMLST (Figure 3B). Due to a smaller number of target loci in the public cgMLST scheme, allele distances between samples were generally lower than with the serovar-specific scheme. Two epidemiologically unrelated clusters linked by an unrelated Irish environmental isolate (ERS180349) (Figure 3B, clusters 2 and 12) could not be delimitated clearly in the S. enterica scheme with a default cluster threshold of seven alleles (Figure 3A, cluster 2 + 12). 
Those two clusters could be separated more clearly with the in-house S. Agona cgMLST scheme (Figure 3B, clusters 2 and 12), using the same cluster threshold. One consisted of the Bavarian patient isolates with travel history to African or Arabian countries described above (cluster 2; Tables 1 and 2); the other, described by Zhou et al. (Figure 3B, cluster 12), derived from a large multi-country 2008−09 outbreak originating from Ireland [21]. In the species-level cgMLST scheme, the software assigns CTs for all samples. These were different for most isolates of the two unrelated clusters, suggesting, in addition to the epidemiological information, two different sources for these two clusters.

Discussion

Our high-resolution WGS-based analysis approach, using either wg SNP profiling or species- or serovar-specific cgMLST, delivered good and reliable results for typing S. Agona isolates and evaluating their affiliation to, or delineation from, outbreak clusters. We could thereby exclude a connection between the Bavarian feed sample isolates from 2017 and a large outbreak due to infant milk products, which occurred at the same time in France and other European countries. Furthermore, we could confirm S. Agona epidemiological clusters from former years as well as identify previously unrecognised clusters. As recently shown for serovar S. Enteritidis [38], cgMLST was standardisable and easily applicable to S. Agona and could reach an analysis depth comparable to wg SNP profiling.

Recently, cgMLST was published as a tool with reasonable resolution for investigations of Salmonella population structure [29], and the extensively curated S. enterica-based scheme was made available on a publicly accessible website [28]. The same scheme is also implemented in the Ridom SeqSphere+ software, although with its own allele calling and clustering rules, and was used in this study to compare the resolution of species-specific typing with that of our in-house serovar-specific scheme. Even at species level, the outbreaks were clearly identified, as they were with the serovar-specific scheme. Only two independent outbreak clusters could not be clearly delimited by the publicly available scheme, owing to its lower number of target loci and the cluster threshold used. Of course, more focused approaches such as serovar-level cgMLST or wg SNP genotyping deliver more detailed typing results, capturing at least the portion of serovar-specific diversity manifested in the core or reference genome. This is particularly relevant for monomorphic organisms such as S. Agona, in which much of the diversity is generated by mobile elements that are mostly not covered by core genome or single reference-based approaches [21], and for situations in which single samples must be assigned to or excluded from specific clusters, especially when epidemiological information is not fully discriminative.

Consequently, the serovar-specific in-house scheme developed for this investigation worked very well for the analysed sample set. However, because its targets were not curated and it was created from only one reference and three query genomes, it will need further optimisation and testing with diverse sample sets before it can be considered suitable on a broader scale. This is suggested by the fact that four isolates had to be excluded from typing with the in-house scheme because they fell below the 90% good target threshold, but only one isolate with the publicly available scheme (Table 1). A publicly available scheme whose target loci underwent extensive curation, like the Enterobase S. enterica scheme [29], can give suitable and reliable results even when serotyping has not yet been performed or a specific reference is lacking.

Generally, cluster definition is based on a single-linkage allelic difference threshold. The software implements a proposed default threshold of seven alleles for the S. enterica scheme. This threshold was adopted for the S. Agona scheme for comparability and because no empirically tested threshold had been established for this scheme. However, as shown with the high-resolution in-house S. Agona cgMLST scheme, clusters with a clear epidemiological link differed by no more than four alleles and six SNPs. Therefore, a cluster with a distance above four but below the threshold of seven alleles (e.g. Figure 3B, clusters 6 and 11) may contain falsely grouped samples. Thus, based on the analysed sample set, the cluster thresholds in both schemes could be lowered from seven.
However, in theory a cluster threshold should be lower in a scheme with fewer loci, like the S. enterica scheme (3,002 loci), than in a scheme with more loci, like the S. Agona scheme (4,111 loci). A decreased cluster threshold would be in accordance with the special genetic characteristics of S. Agona, which emerged more recently and is more monomorphic than most other serovars [21]. However, while facilitating cluster delimitation in S. Agona, a decreased threshold would very likely impair clustering, especially in the S. enterica scheme, for more heterogeneous serovars or when linked isolates have evolved over a longer time period.

Cluster types assigned by the publicly available S. enterica scheme are intended to roughly classify genetically similar samples in an easy way. This is very helpful and an important step towards standardisation, also in terms of inter-laboratory communication and for alerts concerning the detection of a known CT. In the case of the two epidemiologically unrelated clusters 2 and 12 (Figure 3B), which were not clearly delimited by the public S. enterica scheme (Figure 3A, cluster 2 + 12), the software-assigned CT values for each isolate were mainly distinct for the two clusters (cluster 2: CT-1195, cluster 12: CT-25). Hence, using the CT as a simplified measure to distinguish between clusters from different sources helped, although the clusters were difficult to resolve with an MST and the applied single-linkage clustering threshold. However, given the complexity of NGS data, the logic on which a simplification such as the CT value is based always has to be considered to use it correctly, as it can also result in false inclusion in or exclusion from a cluster.
Indeed, the CTs are assigned depending on an isolate's proximity, in terms of allelic distance, to a specific CT founder allele profile, or by incremental expansion of the nomenclature when a new isolate exhibits a not yet known and sufficiently different allele combination [30,31]. The incremental nomenclature expansion, together with the fact that only the distance to the nearest CT founder is reflected in the CT assignment of new isolates, can lead to the assignment of the same CT to isolates with allelic distances above the threshold, or of different CTs to isolates below a cluster threshold. This may be the case, for example, for isolate ERS180350, which is correctly grouped within the Irish outbreak cluster of 2008 with the common CT-25 (Figure 3B, cluster 12) but is tagged with a deviant CT (CT-1224). Furthermore, microbial evolution can differ in certain epidemiological niches, which such a simplification cannot fully cover. Because of these limitations, outbreak investigations should generally avoid relying solely on simplifications like CT values and should also include visual inspection of cluster formation in trees to avoid overlooking connected samples. Classifications by clusters or simplified measures such as CTs should always be interpreted with caution, especially when allelic or SNP distances fall near an empirically tested cluster threshold. Moreover, particularly, but not only, for organisms with specific genetic characteristics such as S. Agona, empirical as well as epidemiological data should always be taken into account in addition to the molecular data [39].

Our analysis also shows that if a common reference or close genetic relationship for a specific set of isolates is known, high-resolution approaches facilitate analysis and enable clear grouping of individual isolates in dispute, whereas the versatility of genetically broader approaches enables more standardised results for more heterogeneous sets of isolates.
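
The two pitfalls of founder-based CT assignment noted above can be reproduced with a small sketch. It is only a hypothetical reading of the description given here (nearest founder within a threshold, otherwise a new CT number); the function names, the toy 20-locus profiles and the sequential numbering are assumptions, and the actual nomenclature rules of the software may differ.

```python
def allele_distance(p, q):
    """Number of loci at which two complete allelic profiles differ."""
    return sum(1 for a, b in zip(p, q) if a != b)


def assign_cts(isolates, threshold):
    """Assign a CT to each isolate in order of arrival.

    An isolate takes the CT of its nearest founder if that founder lies
    within the threshold; otherwise it founds a new CT (hypothetical rule).
    """
    founders = []  # list of (ct_number, founder_profile)
    assigned = {}
    for name, profile in isolates:
        nearest = min(
            founders,
            key=lambda f: allele_distance(profile, f[1]),
            default=None,
        )
        if nearest is not None and allele_distance(profile, nearest[1]) <= threshold:
            assigned[name] = nearest[0]
        else:
            ct = len(founders) + 1  # incremental expansion of the nomenclature
            founders.append((ct, profile))
            assigned[name] = ct
    return assigned


# Toy profiles over 20 hypothetical loci.
f1 = [0] * 20                      # first isolate, becomes founder of CT 1
a = [1] * 7 + [0] * 13             # 7 alleles from f1
b = [0] * 7 + [1] * 7 + [0] * 6    # 7 alleles from f1, at different loci
c = [1] * 8 + [0] * 12             # 8 alleles from f1, 1 allele from a

cts = assign_cts([("f1", f1), ("a", a), ("b", b), ("c", c)], threshold=7)
print(cts)
print("a-b distance:", allele_distance(a, b))
print("a-c distance:", allele_distance(a, c))
```

In this toy case a and b share a CT although they are 14 alleles apart, while a and c receive different CTs although they differ by a single allele. The outcome also depends on arrival order: if c had been processed before a, a would have been nearest to c.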
To reduce interpretation complexity and further extend the fast and easy usability of NGS approaches for public health analyses, further standardisation and harmonisation between laboratories are nevertheless required. Many researchers and authorities have already realised this, but implementation and ongoing optimisation will be a huge effort in the coming years.

In conclusion, consistent with results obtained for other serovars and species, wg SNP profiling as well as serovar- or species-specific cgMLST can be used for reasonable, reproducible and reliable high-resolution classification of S. Agona WGS data to detect outbreak clusters. We showed this with a representative dataset from regional and international sources covering human, food, feed, veterinary and environmental isolates, and thereby various focus areas of public health authorities. With this approach, relationships between past or international cases could also be inferred using representative public data. We also highlighted the importance and supportive power of epidemiological sample data and of an integrated view of both molecular and epidemiological data. Importantly, NGS results still need careful evaluation, as approaches to their interpretation often have to trade off highest resolution against versatility. Standardisation and harmonisation at an international level will further improve the use of the surplus of information coming from NGS in molecular surveillance.