
Whole-Genome Sequencing for National Surveillance of Shigella flexneri.

Marie A Chattaway, David R Greig, Amy Gentle, Hassan B Hartman, Timothy J Dallman, Claire Jenkins.

Abstract

National surveillance of Shigella flexneri ensures the rapid detection of outbreaks to facilitate public health investigation and intervention strategies. In this study, we used whole-genome sequencing (WGS) to type S. flexneri in order to detect linked cases and support epidemiological investigations. We prospectively analyzed 330 isolates of S. flexneri received at the Gastrointestinal Bacteria Reference Unit at Public Health England between August 2015 and January 2016. Traditional phenotypic and WGS sub-typing methods were compared. PCR was carried out on isolates exhibiting phenotypic/genotypic discrepancies with respect to serotype. Phylogenetic relationships between isolates were analyzed by WGS using single nucleotide polymorphism (SNP) typing to facilitate cluster detection. For 306/330 (93%) isolates there was concordance between serotype derived from the genome and phenotypic serology. Discrepant results between the phenotypic and genotypic tests were attributed to novel O-antigen synthesis/modification gene combinations or indels identified in O-antigen synthesis/modification genes rendering them dysfunctional. SNP typing identified 36 clusters of two isolates or more. WGS provided microbiological evidence of epidemiologically linked clusters and detected novel O-antigen synthesis/modification gene combinations associated with two outbreaks. WGS provided reliable and robust data for monitoring trends in the incidence of different serotypes over time. SNP typing can be used to facilitate outbreak investigations in real-time thereby informing surveillance strategies and providing the opportunities for implementing timely public health interventions.


Keywords:  Shigella flexneri; outbreaks; phylogeny; surveillance; whole-genome sequencing

Year:  2017        PMID: 28974944      PMCID: PMC5610704          DOI: 10.3389/fmicb.2017.01700

Source DB:  PubMed          Journal:  Front Microbiol        ISSN: 1664-302X            Impact factor:   5.640


Introduction

Shigellosis is caused by four species of Shigella (S. boydii, S. dysenteriae, S. flexneri and S. sonnei) and is transmitted via the fecal-oral route. Symptoms typically start 1–2 days after exposure and include diarrhea, bloody diarrhea, abdominal pain, fever, and tenesmus. The burden of shigellosis is highest in developing countries, with up to 167 million episodes of diarrhea annually, leading to over a million deaths (Kotloff et al., 1999). A multicenter study of shigellosis in six Asian countries indicated the incidence rate to be highest in children under 4 years of age and in adults over 70 years of age (von Seidlein et al., 2006). In the United Kingdom, S. flexneri is most commonly associated with travelers’ diarrhea and with outbreaks of gastrointestinal symptoms in men who have sex with men (MSM) (Simms et al., 2015). Furthermore, there are reports of increased intercontinental dissemination of multidrug-resistant S. flexneri (Baker et al., 2015). Between 2004 and 2015, 18,266 Shigella cases were reported by the Gastrointestinal Bacteria Reference Unit (GBRU) for England and Wales, with S. flexneri accounting for 7075 (39%) of these infections (S. sonnei n = 8897, 49%; S. boydii n = 1364, 7%; S. dysenteriae n = 808, 4%; Shigella species unknown n = 122, 1%). Shigella flexneri isolates are traditionally serotyped phenotypically using antisera raised in rabbits, although molecular PCR methods have been implemented in a number of reference laboratories (Zhang et al., 2012; Gentle et al., 2016). Serotyping provides limited resolution, with serotypes 2a (n = 2448, 35%), 3a (n = 1476, 21%), 6 (n = 1047, 15%), and 1b (n = 711, 10%) accounting for 81% of S. flexneri cases. Without a higher level of discrimination, outbreak detection is dependent on the identification of epidemiological links between cases belonging to the same serotype. Whole-genome sequencing (WGS) has been shown to have potential in replacing traditional phenotypic and PCR methods (Gentle et al., 2016) for routine surveillance. 
This approach has the added value of further discriminating strains by their genetic relatedness at the single nucleotide polymorphism (SNP) level and has been used to investigate multiple gastrointestinal outbreaks at Public Health England (PHE) (McDonnell et al., 2013; Dallman et al., 2015, 2016). The aim of this study was to compare traditional serotyping with WGS for serotyping S. flexneri for routine public health surveillance and to evaluate the utility of WGS data to support the investigation of epidemiologically linked clusters and outbreaks.

Materials and Methods

Bacterial Strains

Bacterial isolates of S. flexneri from 330 cases were submitted to the Gastrointestinal Bacteria Reference Unit between August 2015 and January 2016, from local and regional hospital laboratories in England and Wales. This strain set comprised the following phenotypic serotypes (numbers of isolates belonging to each serotype in parentheses): 1a (3), 1b (16), 1c (25), 2a (185), 2b (16), 3a (41), 3b (7), 4av/E1037 (9), 6 (19), X (4), Y (4), X and Y (1). Epidemiological data on age, sex, and region of residence were available from laboratory report forms. Travel history was available for 176 cases. All isolates were serotyped and whole-genome sequenced.

Serotyping and PCR

Phenotypic identification of S. flexneri isolates was confirmed using the Omnilog GenIII MicroPlate (Biolog, Hayward, CA, United States). Serotyping was carried out using standard methods by slide agglutination with commercially available monovalent antisera (Denka Seiken, Japan), monoclonal antibody reagents (Reagensia AB, Sweden), and in-house antisera raised in rabbits (Gross and Rowe, 1985). Molecular PCR was carried out on isolates with discrepancies between phenotypic and WGS typing results, as previously described (Gentle et al., 2016), and a serotype was assigned according to the gene combination detected (Supplementary Table ).

Whole-Genome Sequencing

Genome sequencing and sequence analysis were carried out as previously described (Dallman et al., 2015). Genomic DNA extracted from the 330 S. flexneri isolates using the QiaSymphony DNA extraction platform (Qiagen) was fragmented and tagged for multiplexing with Nextera XT DNA Sample Preparation Kits (Illumina) and sequenced using the Illumina HiSeq 2500 at PHE. A reference database containing the gene sequences encoding the 12 O-antigen synthesis or modification genes described by Sun et al. (2011, 2012a,b), including wzxc1-5, wzxe1-5, wzx6, gtrI, gtrII, gtrIV, gtrV, gtrX, gtr1c, oac, oac1b and opt, was constructed. Using the GeneFinder tool (Doumith, unpublished), FASTQ reads were mapped to the S. flexneri O-antigen synthesis or modification genes using Bowtie 2 (Langmead and Salzberg, 2012), and the best match to each target was reported in XML format with metrics including coverage, depth, mixture and nucleotide similarity for quality assessment. Only in silico predictions of serotype that matched a gene determinant at >80% nucleotide identity over >80% of the target gene length were accepted. FASTQ sequences were deposited in the National Center for Biotechnology Information Short Read Archive under the bioproject PRJNA315192 (see Supplementary Table for SRA identifiers).
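The acceptance rule for in silico gene calls (>80% nucleotide identity over >80% of the target length) can be illustrated with a minimal Python sketch. This is not the GeneFinder implementation: the `GeneHit` record, its field names, and the example hit values are hypothetical and chosen only to show how the two thresholds combine.

```python
from dataclasses import dataclass

# Thresholds taken from the text: both must be exceeded for a call to be accepted.
MIN_IDENTITY = 80.0  # percent nucleotide identity
MIN_COVERAGE = 80.0  # percent of target gene length covered


@dataclass
class GeneHit:
    """Hypothetical best-match record for one O-antigen gene target."""
    gene: str
    identity: float  # percent nucleotide identity to the reference gene
    coverage: float  # percent of the reference gene length covered by reads


def accepted_genes(hits):
    """Return the set of gene names whose hit passes both thresholds."""
    return {
        h.gene
        for h in hits
        if h.identity > MIN_IDENTITY and h.coverage > MIN_COVERAGE
    }


# Invented example values, not data from the study.
hits = [
    GeneHit("gtrII", identity=99.5, coverage=100.0),  # passes both thresholds
    GeneHit("oac", identity=97.0, coverage=62.0),     # fails on coverage
    GeneHit("gtrX", identity=79.0, coverage=100.0),   # fails on identity
]
print(sorted(accepted_genes(hits)))
```

In this sketch only `gtrII` is retained. The set of accepted genes would then be compared against the gene combinations that define each serotype.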

Cluster Detection

Short reads were quality trimmed and mapped to the reference S. flexneri serotype 2a strain 2457T (AE014073.1) (Wei et al., 2003) or the reference strain NC_007613 if S. flexneri serotype 6, using BWA v0.75 (Li et al., 2009; Bolger et al., 2014). The Sequence Alignment Map output from BWA was sorted and indexed to produce a Binary Alignment Map (BAM) using Samtools (Li and Durbin, 2010). GATK v2.6.5 was used to create a Variant Call Format (VCF) file from each of the BAMs, which were further parsed to extract only SNP positions of high quality (MQ > 30, DP > 10, GQ > 30, Variant Ratio > 0.9) (McKenna et al., 2010). Gubbins v2.0.0 (Croucher et al., 2015) was used to identify recombinant regions of the genome, which were subsequently masked for phylogenetic analysis. Pseudosequences of polymorphic positions were used to create maximum likelihood trees using RAxML v8.1.17 (Stamatakis, 2014). De novo assembly was carried out with SPAdes v3.5.0 using the ‘--careful’ and ‘-k 21,33,55,65,77,83,91’ options (Bankevich et al., 2012). To proactively detect outbreaks from WGS data, SNP typing was carried out on S. flexneri isolates belonging to clonal complex (CC) 245 and CC145. CC145 mostly comprises S. boydii serotypes but includes S. flexneri serotype 6, as this S. flexneri serotype was misidentified historically (Wirth et al., 2006; Chattaway et al., 2017). Hierarchical single linkage clustering was performed at seven descending thresholds of SNP distance (Δ250, Δ100, Δ50, Δ25, Δ10, Δ5, Δ0) as previously described (Dallman et al., 2016). This clustering results in a discrete seven-digit code in which each number represents the cluster membership at each descending SNP distance threshold. The resultant SNP address describes an isolate's position in the S. flexneri population structure, where two isolates with the same SNP address have 0 SNP differences between them.
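The SNP address can be illustrated with a minimal Python sketch of single linkage clustering at the seven thresholds listed above. It is not the PHE pipeline. It assumes a precomputed symmetric matrix of pairwise SNP distances and assumes that two isolates are linked when their distance is less than or equal to the threshold. Cluster numbers are assigned in order of first appearance, whereas a production database would keep cluster numbers stable over time. The distance matrix is invented.

```python
# Descending SNP distance thresholds named in the text.
THRESHOLDS = (250, 100, 50, 25, 10, 5, 0)


def single_linkage_labels(dist, threshold):
    """Label isolates by single linkage cluster at one SNP threshold.

    Two isolates share a cluster if a chain of pairwise links with
    distance <= threshold connects them (union-find implementation).
    """
    n = len(dist)
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] <= threshold:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)

    # Number clusters 1, 2, 3, ... in order of first appearance.
    numbering = {}
    labels = []
    for i in range(n):
        root = find(i)
        if root not in numbering:
            numbering[root] = len(numbering) + 1
        labels.append(numbering[root])
    return labels


def snp_addresses(dist):
    """Return one seven-part address string per isolate."""
    per_level = [single_linkage_labels(dist, t) for t in THRESHOLDS]
    return [
        ".".join(str(level[i]) for level in per_level)
        for i in range(len(dist))
    ]


# Invented distances: A and B identical, C is 3 SNPs away, D is 120 SNPs away.
dist = [
    [0, 0, 3, 120],
    [0, 0, 3, 120],
    [3, 3, 0, 120],
    [120, 120, 120, 0],
]
for name, address in zip("ABCD", snp_addresses(dist)):
    print(name, address)
```

In this toy example A and B receive the same address (0 SNP differences), C differs from them only in the final position (within 5 SNPs but not identical), and D separates from the others at the Δ100 level.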

Results

Demographic of Patients

There were 330 isolates reported between 1st August 2015 and 18th January 2016: 236 (71%) were from males, 85 (26%) were from females, and for nine (3%) cases the sex was not stated. Travel history was not provided for 154 (47%) cases; 85 cases reported travel in the 7 days prior to onset of symptoms and 91 (27%) reported that they did not travel during the 7 days prior to onset of symptoms. The most frequently reported destinations were India (n = 16, 5%) and Pakistan (n = 11, 3%), with other countries accounting for <1% each.

Comparison of WGS Predicted Serotype versus PCR Predicted Serotype Results

Of the 330 cultures tested prospectively by WGS, 306 (93%) had concordant results with phenotypic serotyping. Three of the mismatched results between the phenotypic and genotypic tests were attributed to mutations identified in O-antigen synthesis or modification genes (Table ), including nonsense mutations resulting in early stop codons (n = 2) and a frameshift mutation. Repeat testing of sample 4 revealed the discrepancy was due to an auto-agglutination reaction of the strain with the sera (Table ). The remaining mismatches were novel serotype gene profiles detected by WGS, both associated with outbreaks (Cluster 2, n = 13; Cluster 4, n = 5) (Tables ). Cluster 2 comprised 13 isolates associated with a local community outbreak. The isolates failed to agglutinate with any of the serotype-specific S. flexneri antisera and were positive for the wzx1-5 and gtrIc gene targets in the PCR but negative for gtrI. Typical strains of serotype 1c are positive for wzx1, gtrI, and gtrIc (Supplementary Table ). These strains were designated 1c variants (1cv). Cluster 4 comprised five isolates of S. flexneri, phenotypically identified as serotype 3a, associated with an outbreak of gastrointestinal symptoms in five captive chimpanzees. The isolates had wzx1-5, oac, and gtrX but were also positive for gtrII and were designated 3a variant (3av).

Outbreak Investigation Using SNP Clustering Typing

Analysis of the WGS data organized 161 of the 330 isolates into 36 five-SNP single-linkage clusters of two or more isolates associated with CC245 and one cluster in CC145. The median number of cases in these clusters was 2 and ranged from 2 to 27; 32 (89%) of the clusters investigated comprised fewer than five isolates (Table ). It was not possible to identify epidemiological links associated with these small clusters using only the limited epidemiological data available from laboratory report forms. However, 9/32 (28%) comprised at least one case reporting recent travel abroad prior to onset of symptoms. During the period of the study, four outbreaks were identified following routine surveillance of local hospital reports of gastrointestinal symptoms caused by S. flexneri. SNP typing confirmed that all the isolates belonging to each outbreak cluster were closely related and monophyletic. Cluster 1 was the largest cluster and had a high male to female ratio, with 97% of cases being adult males (for two cases the gender was not stated) (Table ). Isolates from this cluster were observed throughout the study period. The minimum SNP distance between isolates in this cluster was zero, with a median distance of 17. The maximum SNP distance between any two isolates was 64; however, the majority of these SNPs were caused by a transposase-mediated recombination of pic, encoding a serine protease. This demographic of adult males has previously been shown to be characteristic of clusters linked to sexual transmission among the MSM community (Borg et al., 2012; Gilbart et al., 2015; Baker et al., 2015; Simms et al., 2015). Cluster 1 was part of a larger outbreak of S. flexneri serotype 2a previously described by Simms et al. (2015). Clusters 2 and 3 were community outbreaks and cases were geographically linked. They were temporally restricted and genetically homogeneous, exhibiting 0–1 SNPs difference in the core genome between isolates. 
Despite a thorough epidemiological investigation, the source and route of transmission associated with Cluster 2 could not be determined. Cluster 3 was linked to consumption of contaminated food at a restaurant, most likely due to an infected food handler. Cluster 4 was associated with an outbreak of gastrointestinal symptoms in a group of captive chimpanzees with a minimum SNP distance between isolates of one and a maximum SNP distance of four. The S. flexneri population structure clusters into seven distinct phylogenetic groups (PGs) (Connor et al., 2015). Phylogenetic analysis of the diversity of CC245 in the PHE collection is shown in Figure . Four out of seven clades are represented by samples received by PHE through routine surveillance. With respect to the clusters described in this study, Clusters 1, 3, and 4 fall within PG3 and Cluster 2 falls within PG1.

Figure. Maximum likelihood tree showing a single representative from each 10 SNP single linkage cluster (N = 156) within the PHE clonal complex 245 database (N = 1333). Phylogenetic Groups as defined by Connor et al. (2015) are listed 1–7 and the four outbreak clusters annotated. Refer to Table for details of the outbreak clusters.
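The minimum, maximum, and median SNP distances reported for each cluster are summaries over all pairs of isolates in that cluster. A minimal Python sketch of that calculation follows, assuming a precomputed symmetric matrix of pairwise SNP distances. The function name and the distance values are hypothetical and are not data from this study.

```python
from itertools import combinations
from statistics import median


def pairwise_snp_summary(dist, members):
    """Return (min, max, median) SNP distance over all pairs in a cluster.

    dist    -- symmetric matrix of pairwise SNP distances
    members -- indices of the isolates belonging to the cluster
    """
    distances = [dist[i][j] for i, j in combinations(members, 2)]
    if not distances:
        raise ValueError("a cluster needs at least two isolates")
    return min(distances), max(distances), median(distances)


# Invented distances for four isolates in one cluster.
dist = [
    [0, 0, 1, 2],
    [0, 0, 1, 3],
    [1, 1, 0, 5],
    [2, 3, 5, 0],
]
print(pairwise_snp_summary(dist, [0, 1, 2, 3]))
```

The six pairwise distances here are 0, 1, 1, 2, 3 and 5, giving a minimum of 0, a maximum of 5 and a median of 1.5. A wide gap between the median and the maximum, as described for Cluster 1, is a prompt to check whether a few isolates carry recombinant regions.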

Discussion

This study showed WGS to be a robust and reliable method for serotyping S. flexneri isolates and provided additional strain discrimination at the SNP level. There was high concordance between phenotypic serotyping and WGS serotyping (93%), thus facilitating the prospective comparison of WGS data with historical phenotypic data and ensuring continuity for monitoring trends in the incidence of different serotypes over time. In three of the mismatches, the WGS derived serotype was predicted from the presence of genes that may not be expressed phenotypically. WGS provided insight into the effect of mutations on O-antigen modification genes and potential mechanisms that inactivate phenotypic expression. WGS analysis also identified novel serotype gene profiles associated with two outbreaks. Serotype is not a robust phylogenetic marker, as the O-antigen synthesis/modification genes are encoded on mobile genetic elements (prophages) (Connor et al., 2015). Prior to the implementation of WGS, detection of outbreaks of S. flexneri at PHE relied on the identification of epidemiological links between cases, as serotyping was not discriminatory enough to detect outbreaks during routine surveillance. During this study, SNP typing provided microbiological evidence that the isolates associated with each of the four outbreaks identified were closely related. SNP typing has been used previously to investigate outbreaks of S. sonnei, and this study provides further evidence of the utility of this approach (McDonnell et al., 2013; Dallman et al., 2016; Mook et al., 2016; Baker et al., 2017). Three of the outbreak clusters were temporally and genetically restricted, with fewer than five SNP differences observed between outbreak isolates. In contrast, an outbreak representing on-going person-to-person transmission within the MSM community exhibited between zero and 64 SNP differences between isolates. S. flexneri may become endemic in defined populations or communities, such as religious communities or sexual networks. Over time, circulating strains will accumulate mutations, leading to an increase in observed SNPs between cases in that network. Similarly, extended transmission provides greater opportunities for horizontal gene transfer in the population, and therefore bioinformatic analyses have to be robust to such influxes of variation. This heterogeneity in genetic conservation between epidemiologically linked cases highlights the need to be flexible with respect to the case cluster definition. Epidemiological data associated with each outbreak, and WGS analyses of the deeper phylogenetic relationship between isolates, should be used in concert to inform outbreak investigations. The use of WGS for routine surveillance of S. flexneri provides reliable and robust data that can be used to monitor trends in the incidence of different serotypes over time. WGS derived serotyping data ensure backward compatibility with historical phenotypic serotyping data. SNP typing can be used to facilitate outbreak investigations in real-time, thereby enhancing surveillance strategies and providing opportunities for implementing rapid public health interventions.

Informed Consent

Informed consent was not required as all data in this study was anonymized.

Author Contributions

DG and AG performed the DNA extractions, PCR, phenotypic serotyping and identification. MC and CJ implemented the wet lab WGS pipelines and performed analysis. TD, HH, and DG performed bioinformatic analysis. MC and CJ wrote the manuscript and TD, DG, and HH contributed to the manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Table 1

Summary of mismatched phenotypic and genotypic results.

Reference no. | Sample no. | Phenotypic serotype | PCR serotype | WGS serotype | Explanation of mismatch
SRR4787737 | 1 | 1a | 1b | 1b | OacIb contains an early stop codon at position 128, resulting in the 1a phenotype
SRR4786841 | 2 | X variant/Y variant | 3a | 3a | Oac contains an early stop codon at position 20 resulting in the X phenotype
SRR5018321 | 3 | Y variant | 2a | 2a | Frameshift mutation (insertion of two A’s) at position 1116 rendering gtrII non-functional and resulting in the Y phenotype
SRR5017471 | 4 | X variant | Y | Y | Auto agglutination strain resulting in incorrect phenotype
SRR4788187, SRR4897598 | 5–6 | 1c | 1c | 1a | No explanation for the mismatch could be determined
Cluster 2 (13 isolates) | 7–19 | Negative serology | 1cv | 1cv | Novel serotype
Cluster 4 (five isolates) | 20–24 | 3a | 3av | 3av | Novel serotype

Table 2

Summary of Clusters detected by WGS.

Cluster no. | No. cases | Travel | M:F ratio | Age range | Serotype | Epidemiological context/transmission route | SNP address | Minimum SNP difference | Maximum SNP difference | Median SNP difference
1 | 121 | 7 | 119:0 | 20–63 | 2a | MSM | 34.42.42.42.# | 0 | 40ˆ | 17
2 | 13 | 0 | 4:9 | 23–96 | $1cv | Community | 3.45.45.46.46.46.47 | 0 | 0 | 0
3 | 7 | 1 | 3:4 | 21–53 | 2b | Restaurant | 8.9.9.138.140.154.167 | 0 | 0 | 0
4 | 5 | N/A | N/A | N/A | $3av | Captive chimpanzees | 45.130.197.328.353.405.# | 1 | 4 | 2.5

1.  The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.

Authors:  Aaron McKenna; Matthew Hanna; Eric Banks; Andrey Sivachenko; Kristian Cibulskis; Andrew Kernytsky; Kiran Garimella; David Altshuler; Stacey Gabriel; Mark Daly; Mark A DePristo
Journal:  Genome Res       Date:  2010-07-19       Impact factor: 9.043

2.  Development of a multiplex PCR assay targeting O-antigen modification genes for molecular serotyping of Shigella flexneri.

Authors:  Qiangzheng Sun; Ruiting Lan; Yiting Wang; Ailan Zhao; Shaomin Zhang; Jianping Wang; Yan Wang; Shengli Xia; Dong Jin; Zhigang Cui; Hongqing Zhao; Zhenjun Li; Changyun Ye; Shuxia Zhang; Huaiqi Jing; Jianguo Xu
Journal:  J Clin Microbiol       Date:  2011-08-31       Impact factor: 5.948

3.  Fast gapped-read alignment with Bowtie 2.

Authors:  Ben Langmead; Steven L Salzberg
Journal:  Nat Methods       Date:  2012-03-04       Impact factor: 28.547

4.  Use of whole-genome sequencing for the public health surveillance of Shigella sonnei in England and Wales, 2015.

Authors:  Timothy J Dallman; Marie A Chattaway; Piers Mook; Gauri Godbole; Paul D Crook; Claire Jenkins
Journal:  J Med Microbiol       Date:  2016-06-14       Impact factor: 2.472

5.  Global burden of Shigella infections: implications for vaccine development and implementation of control strategies.

Authors:  K L Kotloff; J P Winickoff; B Ivanoff; J D Clemens; D L Swerdlow; P J Sansonetti; G K Adak; M M Levine
Journal:  Bull World Health Organ       Date:  1999       Impact factor: 9.408

6.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

7.  Identification of a divergent O-acetyltransferase gene oac 1b from Shigella flexneri serotype 1b strains.

Authors:  Qiangzheng Sun; Ruiting Lan; Yan Wang; Jianping Wang; Shengli Xia; Yiting Wang; Jin Zhang; Deshan Yu; Zhenjun Li; Huaiqi Jing; Jianguo Xu
Journal:  Emerg Microbes Infect       Date:  2012-09-05       Impact factor: 7.163

8.  Fast and accurate long-read alignment with Burrows-Wheeler transform.

Authors:  Heng Li; Richard Durbin
Journal:  Bioinformatics       Date:  2010-01-15       Impact factor: 6.937

9.  Retrospective analysis of whole genome sequencing compared to prospective typing data in further informing the epidemiological investigation of an outbreak of Shigella sonnei in the UK.

Authors:  J McDonnell; T Dallman; S Atkin; D A Turbitt; T R Connor; K A Grant; N R Thomson; C Jenkins
Journal:  Epidemiol Infect       Date:  2013-02-21       Impact factor: 4.434

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937
