
Draft genomes of Shigella strains used by the STOPENTERICS consortium.

Omar Rossi, Kate S Baker, Armelle Phalipon, François-Xavier Weill, Francesco Citiulo, Philippe Sansonetti, Christiane Gerke, Nicholas R Thomson.

Abstract

BACKGROUND: Despite a significant global burden of disease, there is still no vaccine against shigellosis widely available. One aim of the European Union funded STOPENTERICS consortium is to develop vaccine candidates against Shigella. Given the importance of translational vaccine coverage, here we aimed to characterise the Shigella strains being used by the consortium by whole genome sequencing, and report on the stability of strains cultured in different laboratories or through serial passage.
METHODS: We sequenced, de novo assembled and annotated 20 Shigella strains being used by the consortium. These comprised 16 different isolates belonging to 7 serotypes, and 4 derivative strains. Derivative strains from common isolates were manipulated in different laboratories or had undergone multiple passages in the same laboratory. Strains were mapped against reference genomes to detect SNP variation and phylogenetic analysis was performed.
RESULTS: The genomes assembled into similar total lengths (range 4.14-4.83 Mbp) and had similar numbers of predicted coding sequences (average of 4,400). Mapping analysis showed the genetic stability of strains through serial passages and culturing in different laboratories, as well as varying levels of similarity to published reference genomes. Phylogenetic analysis revealed the presence of three main clades among the strains and published references, one containing the Shigella flexneri serotype 6 strains, a second containing the remaining S. flexneri serotypes and a third comprised of Shigella sonnei strains.
CONCLUSIONS: This work increases the number of publicly available Shigella genomes and specifically provides information on strains being used for vaccine development by STOPENTERICS. It also provides information on the variability among strains maintained in different laboratories and through serial passage. This work will guide the selection of strains for further vaccine development.

Keywords:  Genome; STOPENTERICS; Shigella; Vaccine

Year:  2015        PMID: 26042182      PMCID: PMC4454270          DOI: 10.1186/s13099-015-0061-5

Source DB:  PubMed          Journal:  Gut Pathog        ISSN: 1757-4749            Impact factor:   4.181


Background

Shigella are Gram-negative bacteria and the etiologic agents of shigellosis, a global human health problem, especially in developing countries and in children younger than 5 years. Shigellosis is estimated to cause 125 million cases and 100,000 deaths annually [1], and is one of the main causes of traveller’s diarrhea. The genus Shigella comprises four serogroups (Shigella dysenteriae, Shigella sonnei, Shigella flexneri and Shigella boydii), subdivided into 50 different serotypes based on the carbohydrate composition of the O antigen of their lipopolysaccharide [2]; the distribution of serotypes varies among regions and over time [3]. As no vaccines are currently widely available, one of the aims of the European Union-funded STOPENTERICS consortium (Vaccination against Shigella and ETEC: novel antigens, novel approaches) [4] is to develop novel vaccine candidates against Shigella [e.g. the Generalized Modules for Membrane Antigens (GMMA) approach [5, 6]], as well as to improve the immunogenicity of existing antigens (e.g. synthetic chemistry for glycoconjugates [7]). To this end, partners of the STOPENTERICS consortium have been integrating basic research, particularly genomics, transcriptomics, proteomics and other high-throughput technologies, with novel vaccine technologies and synthetic chemistry [7]. To bring together the Shigella expertise needed to identify novel vaccine candidates and take them rapidly through to clinical trials, the research is carried out across academic institutions (e.g. University of Oxford, Wellcome Trust Sanger Institute, Institut Pasteur) and vaccine companies (Novartis Vaccines Institute for Global Health and Sanofi-Pasteur).
To ensure the congruence of strains between laboratories, and to create a public resource for vaccine development and further Shigella research, we whole-genome sequenced the Shigella strains used by the STOPENTERICS consortium, which were chosen because they offer the most effective breadth of cross-protection against Shigella spp. in endemic areas [8], and report the assembly and annotation of their draft genomes. We assessed SNPs between strains and against references, defined the phylogenetic relationships of the strains, and compared the genetic stability of strains maintained in different consortium laboratories and after serial passage.

Methods

Bacterial strains

The Shigella strains analysed in this study and relevant metadata are summarized in Table 1. Strains were serotyped by slide agglutination using commercially available monovalent antisera (Denka Seiken, Japan) to all type specific somatic antigens and the group factor antigens [9].
Table 1

Summary of assembly, annotation and mapping results

Name in the study | True name, country of infection, year of isolation | Sample run accession | Sample accession | De novo assembly genomic size | Contigs number | Average contigs length | N50 | CDS detected | Reference used for mapping | Number of SNPs detected | % of reference mapped
Ss_53G | Korea, 2000 | ERS387232 | ERR477376 | 4,698,814 | 402 | 11,688.59 | 29,856 | 4,495 | Ss 53G | 2 | 92.86
Ss_53G_p | Korea, 2000 | ERS387243 | ERR477387 | 4,832,559 | 406 | 11,902.85 | 28,177 | 4,655 | Ss 53G | 2 | 92.80
Ss_25931 | Unknown | ERS387235 | ERR477379 | 4,799,852 | 426 | 11,267.26 | 27,765 | 4,578 | Ss 53G | 630 | 89.51
Sf 1a_Sh07.3008 | Sh07-3008, Cameroon, 2007 | ERS445026 | ERR573382 | 4,139,080 | 265 | 15,619.17 | 34,771 | 4,044 | Sf 2a 2457T | 3,459 | 86.58
Sf 1b_Sh04.7434 | Sh04-7432, Tunisia, 2004 | ERS445024 | ERR573380 | 4,402,078 | 314 | 14,019.36 | 34,552 | 4,342 | Sf 2a 2457T | 2,935 | 87.63
Sf 1b_Sh04.9462 | Sh04-9462, Cameroon, 2004 | ERS445025 | ERR573381 | 4,272,358 | 280 | 15,258.42 | 34,756 | 4,206 | Sf 2a 2457T | 3,207 | 87.36
Sf 2a_2457T | Japan, 1954 | ERS387233 | ERR477377 | 4,681,429 | 344 | 13,608.81 | 35,441 | 4,583 | Sf 2a 2457T | 195 | 93.72
Sf 2a_2457T_p | Japan, 1954 | ERS387242 | ERR477386 | 4,697,211 | 343 | 13,694.49 | 35,151 | 4,605 | Sf 2a 2457T | 192 | 93.88
Sf 3a_6865_1 | Unknown | ERS387236 | ERR477380 | 4,665,099 | 335 | 13,925.67 | 35,495 | 4,550 | Sf 2a 2457T | 7,543 | 86.79
Sf 3a_6865_2 | Unknown | ERS445023 | ERR573379 | 4,704,030 | 330 | 14,254.64 | 35,991 | 4,580 | Sf 2a 2457T | 7,708 | 87.08
Sf 5a_M90T | Unknown <1980 | ERS387234 | ERR477378 | 4,486,899 | 327 | 13,721.40 | 32,160 | 4,391 | Sf 5a M90T | 25 | 97.82
Sf 6_Sh10.5302_1 | 201005302, Madagascar, 2010 | ERS387237 | ERR477381 | 4,414,146 | 446 | 9,897.19 | 22,838 | 4,269 | Sb Sb227 | 4,408 | 89.30
Sf 6_Sh10.5302_2 | 201005302, Madagascar, 2010 | ERS445029 | ERR573385 | 4,351,336 | 426 | 10,214.4 | 22,774 | 4,168 | Sb Sb227 | 4,406 | 89.10
Sf 6_Sh10.3933 | 201003933, Nigeria, 2010 | ERS387238 | ERR477382 | 4,508,368 | 433 | 10,411.94 | 23,090 | 4,386 | Sb Sb227 | 4,456 | 89.33
Sf 6_Sh10.8537 | 201008537, Egypt, 2010 | ERS387239 | ERR477383 | 4,524,547 | 425 | 10,645.99 | 23,238 | 4,398 | Sb Sb227 | 4,451 | 89.77
Sf 6_Sh10.6306 | 201006306, India, 2010 | ERS387240 | ERR477384 | 4,481,178 | 439 | 10,207.69 | 23,066 | 4,367 | Sb Sb227 | 4,467 | 89.44
Sf 6_Sh10.6237 | 201006237, Mexico, 2010 | ERS387241 | ERR477385 | 4,528,968 | 434 | 10,435.41 | 24,012 | 4,397 | Sb Sb227 | 4,389 | 89.39
Sf 6_NCDC.2924-71 | NCDC 2924-71, Unknown, 1971 | ERS445027 | ERR573383 | 4,392,208 | 413 | 10,634.89 | 22,784 | 4,246 | Sb Sb227 | 4,288 | 89.66
Sf 6_Sc544 | Unknown, <1977 | ERS445028 | ERR573384 | 4,430,667 | 415 | 10,676.31 | 22,494 | 4,302 | Sb Sb227 | 4,296 | 88.82
Sf 6_Sh11.10088 | 201110088, France (Reunion Island), 2011 | ERS445030 | ERR573386 | 4,547,256 | 423 | 10,750.01 | 23,991 | 4,428 | Sb Sb227 | 4,483 | 89.88
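
For readers less familiar with the assembly metrics in Table 1, the following minimal sketch shows how an N50 and an average contig length can be computed from a list of contig lengths. It illustrates the standard definitions only and is not the software used in this study; the function names and the toy contig lengths are hypothetical.

```python
from typing import List


def n50(contig_lengths: List[int]) -> int:
    """Return the N50: the largest length L such that contigs of
    length >= L together hold at least half of the assembled bases."""
    if not contig_lengths:
        raise ValueError("no contigs supplied")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for a non-empty list")


def mean_contig_length(contig_lengths: List[int]) -> float:
    """Average contig length: total assembled bases / number of contigs."""
    return sum(contig_lengths) / len(contig_lengths)


# Hypothetical toy assembly (not data from this study).
toy = [100, 80, 60, 40, 20]
print(n50(toy))                 # 80
print(mean_contig_length(toy))  # 60.0
```

As a consistency check on the table itself, dividing the genomic size of Ss_53G (4,698,814 bp) by its contig count (402) gives 11,688.59 bp, matching the reported average contig length.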

DNA extraction and genome sequencing

Bacterial cultures were grown overnight in liquid Luria–Bertani (LB) medium to an optical density (measured at 600 nm) of approximately 3. Genomic DNA was isolated using the Wizard kit (Promega, Madison, WI, USA) according to the manufacturer’s instructions. Purified DNA was then sequenced at the Wellcome Trust Sanger Institute (WTSI). Paired-end libraries (150 bp reads, approximately 500 bp insert size) were generated and sequenced on the Illumina MiSeq instrument (San Diego, CA, USA) according to in-house protocols [10, 11]. Sequence data for each of the strains were deposited in the European Nucleotide Archive (accession numbers in Table 1).

Genomic analysis

Sequencing reads were trimmed using Trimmomatic v0.27 [12] to remove adapters and bases with a PHRED score of <30, and reads shorter than 50 bp after trimming were discarded. High quality reads were then mapped to the relevant reference strains (Table 1) using SMALT (http://www.sanger.ac.uk/resources/software/smalt/), and single nucleotide polymorphisms (SNPs) were called using Samtools [13]. Nucleotides where mapping quality was below 30 and genotyping quality was below 50 were excluded from further analysis. Mapping coverage was approximately 70-fold for all isolates. De novo assembly was performed using Velvet Optimiser [14] and contiguous sequences were annotated using Prokka [15]. Clustering and BLAST comparisons were used to determine the presence/absence of genes in annotated assemblies as previously described [16].
To prepare a multiple sequence alignment for phylogenetic analysis, sequencing data from strains in this study and simulated fastq data created from published reference genomes were mapped to the chromosome of S. flexneri 2457T (GenBank accession: NC_004741.1). The other reference isolates (and their accessions) used in this analysis were: S. sonnei Ss046 (NC_007384.1), S. sonnei 53G (NC_016822.1), S. flexneri 5 M90T (AGNM01000000), S. flexneri 5a 8401 (NC_008258.1), S. flexneri 2a NCTC1 (LM651928), S. flexneri 2a 301 (NC_004337.2), S. flexneri X 2002017 (NC_017328.1) and S. boydii Sb 227 (NC_007613.1). Core genes (n = 2,427) with 100% mapping coverage in all isolates were identified, and phylogenetic analysis was performed using RAxML v7.0.3 [17] on the 43,349 variable sites (subset from 2,306,256 bp) of these core genes.
In silico molecular serotyping of S. flexneri isolates was performed on the de novo assembly of each isolate, as in [18]. Briefly, the presence/absence and known differences of the gtr genes (encoding enzymes responsible for the type-specific antigens I, II, IV, V, X and IC), the oac genes (encoding enzymes that mediate O-acetylation in serotypes 1b, 3a, 3b and 4b) and wzx6 (specific for serotype 6) were analyzed, allowing the six different S. flexneri serotypes to be differentiated.
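
The quality thresholds applied to called sites can be illustrated with a short sketch. It assumes that a site is retained only when both its mapping quality and its genotype quality meet the stated cut-offs (30 and 50), which is one reading of the description above; the `SiteCall` record and the function names are hypothetical and do not correspond to any Samtools data structure or to the pipeline actually used.

```python
from dataclasses import dataclass
from typing import Iterable, List

MIN_MAPPING_QUALITY = 30   # cut-off stated in the Methods
MIN_GENOTYPE_QUALITY = 50  # cut-off stated in the Methods


@dataclass(frozen=True)
class SiteCall:
    """Hypothetical record for one called site (illustration only)."""
    position: int
    ref: str
    alt: str
    mapping_quality: float
    genotype_quality: float


def passes_filters(call: SiteCall) -> bool:
    # Assumption: a site must satisfy BOTH thresholds to be kept.
    return (call.mapping_quality >= MIN_MAPPING_QUALITY
            and call.genotype_quality >= MIN_GENOTYPE_QUALITY)


def filter_calls(calls: Iterable[SiteCall]) -> List[SiteCall]:
    """Keep only the sites that pass both quality thresholds."""
    return [call for call in calls if passes_filters(call)]


# Made-up example calls, not data from this study.
example = [
    SiteCall(10, "A", "G", mapping_quality=60, genotype_quality=99),
    SiteCall(20, "C", "T", mapping_quality=20, genotype_quality=99),
    SiteCall(30, "G", "A", mapping_quality=60, genotype_quality=10),
]
print([call.position for call in filter_calls(example)])  # [10]
```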

Results and discussion

Sixteen different Shigella isolates belonging to seven different serotypes were sequenced (listed in Table 1). These included S. sonnei (2 isolates) and S. flexneri serotypes 1a, 1b (2 isolates), 2a, 3a, 5a and 6 (eight different isolates), plus four derivative strains that had either undergone serial passage (S. sonnei 53G, S. flexneri 2a 2457T) or been cultivated and had their DNA extracted in a different laboratory (S. flexneri 3a 6865 and S. flexneri 6 10.5302). Derivative strains from the same isolate that were manipulated in different laboratories of the STOPENTERICS consortium were denoted ‘_1’ and ‘_2’, whereas those that had undergone serial passage (~10 passages) in the same laboratory were denoted ‘_p’. The derivatives allowed us to assess the genetic stability of strains across laboratories and through serial passage.
Results of genomic assembly and annotation were similar for all strains (Table 1). The strains assembled into an average of 381 contigs (range 265–446), with an average contig length of 12,141 bp (range 9,897–15,619) and an average N50 of 28,620 bp (range 22,494–35,991). The resulting genomic size was similar for all the strains and fell within the range of 4.14–4.83 Mbp. Similarly, automated annotation predicted an average of 4,400 coding sequences per genome (range 4,044–4,583; Table 1). The serotypes of the Shigella strains were confirmed based on the combinations of gtr and oac genes, encoding the relevant enzymes for the serotype-specific OAg modifications [18] (not shown).
To facilitate strain comparisons and phylogenetic analysis, sequence reads were mapped to existing Shigella reference genomes (Table 1). The percentage of the reference genome covered by mapped reads ranged from 87 to 98% and the number of SNPs varied depending on the isolate (Table 1). These data showed comparatively few SNPs (<200) when an isolate was compared to a previously published reference of itself (as in the case of S. sonnei 53G, S. flexneri 2a 2457T and S. flexneri 5a M90T). Higher numbers of SNPs were seen where no such reference was available. For example, several hundred SNPs were seen when an isolate was mapped to a reference genome of a different isolate of the same serotype (e.g. Ss_25931 mapped against Ss_53G), and several thousand SNPs were seen when the isolate was mapped to a reference isolate from a phylogenetically related, but distinct, serotype (e.g. S. flexneri 6 isolates mapped against S. boydii strain Sb227).
To assess the genomic stability of isolates held at different laboratories and through serial passage within the same laboratory, we resequenced a number of isolates and compared their mapping results to the relevant reference (Table 1). Two isolates (original and passaged) of S. sonnei 53G had only two SNPs relative to the published reference genome, and these SNPs were the same in both isolates. Similarly, the sequences of original and passaged S. flexneri 2a strain 2457T were very similar, but had 195 and 192 SNPs, respectively, relative to the published reference genome. Among these SNPs, 188 were common to both isolates and the remaining seven and four sites, respectively, were not resolved in the other isolate, indicating that the two isolates were likely identical to each other. The level of genetic variation compared to the reference strain was surprising (~200 SNPs) and may have biological significance, showing the utility of obtaining up-to-date genetic information for the exact strain being worked with in a given project. Two strains, Sf 3a_6865 and Sf 6_10.5302, were manipulated for sequencing in separate laboratories in the consortium. These strains differed by only one and two SNPs respectively, indicating that over a 2–3 year time period, isolate genomes remain relatively stable through passage and between laboratories, but may differ significantly from published references.
To assess the phylogenetic relationship of the isolates, we constructed a maximum likelihood phylogenetic tree of a large core genome shared among the strains (Figure 1). Consistent with expectations based on prior evolutionary studies of shigellae [19, 20], the strains were divided into three main clades, with the S. flexneri 6 strains being phylogenetically removed from the remaining S. flexneri serotypes, and the S. sonnei strains forming a separate clade.
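
The comparison of SNP calls between an original and a derivative strain described above amounts to a set comparison. The sketch below is a minimal illustration under the assumption that each SNP is represented as a (reference position, alternative base) pair; the positions are invented so that the set sizes mirror the counts reported for S. flexneri 2a 2457T (195 and 192 SNPs, 188 shared), and the function name is hypothetical.

```python
from typing import Dict, Set, Tuple

Snp = Tuple[int, str]  # (reference position, alternative base)


def compare_snp_sets(original: Set[Snp], derivative: Set[Snp]) -> Dict[str, Set[Snp]]:
    """Split two SNP sets into shared calls and calls unique to each strain."""
    return {
        "shared": original & derivative,
        "only_original": original - derivative,
        "only_derivative": derivative - original,
    }


# Invented positions; only the set sizes follow the text.
shared = {(pos, "A") for pos in range(1, 189)}                  # 188 shared SNPs
original = shared | {(pos, "T") for pos in range(1000, 1007)}   # 195 in total
passaged = shared | {(pos, "G") for pos in range(2000, 2004)}   # 192 in total

result = compare_snp_sets(original, passaged)
print(len(result["shared"]),
      len(result["only_original"]),
      len(result["only_derivative"]))  # 188 7 4
```

Sites that appear in only one set would then be inspected individually, since in this study they reflected positions that were not resolved in the other isolate rather than confirmed differences.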
Figure 1

Mid-point rooted maximum likelihood phylogeny of strains based on core genome. Names of strains sequenced in this study are abbreviated and those of reference genomes are given in full.
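
The phylogeny was inferred from the variable sites of the core gene alignment. As a minimal sketch of what extracting variable sites involves, the function below returns the alignment columns at which more than one unambiguous base is observed. Ignoring ambiguous characters such as N and gaps is an assumption made here for illustration; the strain names and sequences are hypothetical and this is not the procedure or software used in the study.

```python
from typing import Dict, List


def variable_sites(alignment: Dict[str, str]) -> List[int]:
    """Return 0-based column indices where at least two different
    unambiguous bases (A, C, G, T) occur among the aligned sequences."""
    sequences = list(alignment.values())
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all aligned sequences must have the same length")
    sites = []
    for column in range(length):
        # Assumption: N, gaps and other symbols are ignored, not counted as variants.
        bases = {seq[column].upper() for seq in sequences} & set("ACGT")
        if len(bases) > 1:
            sites.append(column)
    return sites


# Hypothetical toy alignment (not data from this study).
toy_alignment = {
    "strain_a": "ACGTAC",
    "strain_b": "ACGTTC",
    "strain_c": "ANGTAC",
}
print(variable_sites(toy_alignment))  # [4]
```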


Conclusions

The work presented here increases the number of publicly available Shigella genomes, including, for the first time, sequencing data for S. sonnei 25931, two S. flexneri 1b, one S. flexneri 1a, one S. flexneri 3a and eight S. flexneri 6 isolates. We provide details on the draft genomes generated from these sequencing data, and report SNP variation in strains maintained in different laboratories and after serial passage. We also describe the relatedness of the strains and isolates used by the STOPENTERICS consortium, and have deposited these data as a public resource. Data presented in this work will guide the selection of strains for further vaccine development and contribute to a growing awareness of diversity in Shigella.
  19 in total

1.  RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2006-08-23       Impact factor: 6.937

2.  Prokka: rapid prokaryotic genome annotation.

Authors:  Torsten Seemann
Journal:  Bioinformatics       Date:  2014-03-18       Impact factor: 6.937

3.  Multiple independent origins of Shigella clones of Escherichia coli and convergent evolution of many of their characteristics.

Authors:  G M Pupo; R Lan; P R Reeves
Journal:  Proc Natl Acad Sci U S A       Date:  2000-09-12       Impact factor: 11.205

4.  Revisiting the molecular evolutionary history of Shigella spp.

Authors:  Jian Yang; Huan Nie; Lihong Chen; Xiaobing Zhang; Fan Yang; Xingye Xu; Yafang Zhu; Jun Yu; Qi Jin
Journal:  J Mol Evol       Date:  2006-12-09       Impact factor: 2.395

5.  The Sequence Alignment/Map format and SAMtools.

Authors:  Heng Li; Bob Handsaker; Alec Wysoker; Tim Fennell; Jue Ruan; Nils Homer; Gabor Marth; Goncalo Abecasis; Richard Durbin
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

Review 6.  Structure and genetics of Shigella O antigens.

Authors:  Bin Liu; Yuriy A Knirel; Lu Feng; Andrei V Perepelov; Sof'ya N Senchenkova; Quan Wang; Peter R Reeves; Lei Wang
Journal:  FEMS Microbiol Rev       Date:  2008-04-16       Impact factor: 16.408

7.  Draft genome sequences of the type strains of Shigella flexneri held at Public Health England: comparison of classical phenotypic and novel molecular assays with whole genome sequence.

Authors:  Philip M Ashton; Kate S Baker; Amy Gentle; David J Wooldridge; Nicholas R Thomson; Timothy J Dallman; Claire Jenkins
Journal:  Gut Pathog       Date:  2014-03-31       Impact factor: 4.181

8.  A large genome center's improvements to the Illumina sequencing system.

Authors:  Michael A Quail; Iwanka Kozarewa; Frances Smith; Aylwyn Scally; Philip J Stephens; Richard Durbin; Harold Swerdlow; Daniel J Turner
Journal:  Nat Methods       Date:  2008-12       Impact factor: 28.547

9.  Global and regional mortality from 235 causes of death for 20 age groups in 1990 and 2010: a systematic analysis for the Global Burden of Disease Study 2010.

Authors:  Rafael Lozano; Mohsen Naghavi; Kyle Foreman; Stephen Lim; Kenji Shibuya; Victor Aboyans; Jerry Abraham; Timothy Adair; Rakesh Aggarwal; Stephanie Y Ahn; Miriam Alvarado; H Ross Anderson; Laurie M Anderson; Kathryn G Andrews; Charles Atkinson; Larry M Baddour; Suzanne Barker-Collo; David H Bartels; Michelle L Bell; Emelia J Benjamin; Derrick Bennett; Kavi Bhalla; Boris Bikbov; Aref Bin Abdulhak; Gretchen Birbeck; Fiona Blyth; Ian Bolliger; Soufiane Boufous; Chiara Bucello; Michael Burch; Peter Burney; Jonathan Carapetis; Honglei Chen; David Chou; Sumeet S Chugh; Luc E Coffeng; Steven D Colan; Samantha Colquhoun; K Ellicott Colson; John Condon; Myles D Connor; Leslie T Cooper; Matthew Corriere; Monica Cortinovis; Karen Courville de Vaccaro; William Couser; Benjamin C Cowie; Michael H Criqui; Marita Cross; Kaustubh C Dabhadkar; Nabila Dahodwala; Diego De Leo; Louisa Degenhardt; Allyne Delossantos; Julie Denenberg; Don C Des Jarlais; Samath D Dharmaratne; E Ray Dorsey; Tim Driscoll; Herbert Duber; Beth Ebel; Patricia J Erwin; Patricia Espindola; Majid Ezzati; Valery Feigin; Abraham D Flaxman; Mohammad H Forouzanfar; Francis Gerry R Fowkes; Richard Franklin; Marlene Fransen; Michael K Freeman; Sherine E Gabriel; Emmanuela Gakidou; Flavio Gaspari; Richard F Gillum; Diego Gonzalez-Medina; Yara A Halasa; Diana Haring; James E Harrison; Rasmus Havmoeller; Roderick J Hay; Bruno Hoen; Peter J Hotez; Damian Hoy; Kathryn H Jacobsen; Spencer L James; Rashmi Jasrasaria; Sudha Jayaraman; Nicole Johns; Ganesan Karthikeyan; Nicholas Kassebaum; Andre Keren; Jon-Paul Khoo; Lisa Marie Knowlton; Olive Kobusingye; Adofo Koranteng; Rita Krishnamurthi; Michael Lipnick; Steven E Lipshultz; Summer Lockett Ohno; Jacqueline Mabweijano; Michael F MacIntyre; Leslie Mallinger; Lyn March; Guy B Marks; Robin Marks; Akira Matsumori; Richard Matzopoulos; Bongani M Mayosi; John H McAnulty; Mary M McDermott; John McGrath; George A Mensah; Tony R Merriman; Catherine Michaud; Matthew Miller; 
Ted R Miller; Charles Mock; Ana Olga Mocumbi; Ali A Mokdad; Andrew Moran; Kim Mulholland; M Nathan Nair; Luigi Naldi; K M Venkat Narayan; Kiumarss Nasseri; Paul Norman; Martin O'Donnell; Saad B Omer; Katrina Ortblad; Richard Osborne; Doruk Ozgediz; Bishnu Pahari; Jeyaraj Durai Pandian; Andrea Panozo Rivero; Rogelio Perez Padilla; Fernando Perez-Ruiz; Norberto Perico; David Phillips; Kelsey Pierce; C Arden Pope; Esteban Porrini; Farshad Pourmalek; Murugesan Raju; Dharani Ranganathan; Jürgen T Rehm; David B Rein; Guiseppe Remuzzi; Frederick P Rivara; Thomas Roberts; Felipe Rodriguez De León; Lisa C Rosenfeld; Lesley Rushton; Ralph L Sacco; Joshua A Salomon; Uchechukwu Sampson; Ella Sanman; David C Schwebel; Maria Segui-Gomez; Donald S Shepard; David Singh; Jessica Singleton; Karen Sliwa; Emma Smith; Andrew Steer; Jennifer A Taylor; Bernadette Thomas; Imad M Tleyjeh; Jeffrey A Towbin; Thomas Truelsen; Eduardo A Undurraga; N Venketasubramanian; Lakshmi Vijayakumar; Theo Vos; Gregory R Wagner; Mengru Wang; Wenzhi Wang; Kerrianne Watt; Martin A Weinstock; Robert Weintraub; James D Wilkinson; Anthony D Woolf; Sarah Wulf; Pon-Hsiu Yeh; Paul Yip; Azadeh Zabetian; Zhi-Jie Zheng; Alan D Lopez; Christopher J L Murray; Mohammad A AlMazroa; Ziad A Memish
Journal:  Lancet       Date:  2012-12-15       Impact factor: 79.321

10.  Trimmomatic: a flexible trimmer for Illumina sequence data.

Authors:  Anthony M Bolger; Marc Lohse; Bjoern Usadel
Journal:  Bioinformatics       Date:  2014-04-01       Impact factor: 6.937
