Chad R Laing, Matthew D Whiteside, Victor P J Gannon.
Abstract
Food safety is a global concern, with upward of 2.2 million deaths due to enteric disease every year. Current whole-genome sequencing platforms allow routine sequencing of enteric pathogens for surveillance and during outbreaks; however, a remaining challenge is the identification of genomic markers that are predictive of strain groups that pose the most significant health threats to humans, or that can persist in specific environments. We have previously developed the software program Panseq, which identifies the pan-genome among a group of sequences, and the SuperPhy platform, which uses this pan-genome information to identify biomarkers that are predictive of groups of bacterial strains. In this study, we examined the pan-genome of 4893 genomes of Salmonella enterica, an enteric pathogen responsible for the loss of more disability-adjusted life years than any other enteric pathogen. We identified a pan-genome of 25.3 Mbp, a strict core of 1.5 Mbp present in all genomes, and a conserved core of 3.2 Mbp found in at least 96% of these genomes. We also identified 404 genomic regions of 1000 bp that were specific to the species S. enterica. These species-specific regions were found to encode mostly hypothetical proteins, effectors, and other proteins related to virulence. Markers unique to each of the six S. enterica subspecies were also identified. No serovar had pan-genome regions that were present in all of its genomes and absent in all other serovars; however, each serovar did have genomic regions that were universally present among all constituent members and statistically predictive of the serovar. The phylogeny based on SNPs within the conserved core genome was found to be highly concordant with the phylogeny based on the presence/absence of 1000 bp regions of the entire pan-genome.
Future studies could use these predictive regions as components of a vaccine to prevent salmonellosis, as well as in simple and rapid diagnostic tests for both in silico and wet-lab applications, with uses ranging from food safety to public health. Lastly, the tools and methods described in this study could be applied as a pan-genomics framework to other population genomic studies seeking to identify markers for other bacterial species and their sub-groups.
Keywords: Salmonella; food safety; genomics; pan-genome; predictive markers
Year: 2017 PMID: 28824552 PMCID: PMC5534482 DOI: 10.3389/fmicb.2017.01345
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
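The abstract separates a strict core (regions present in all genomes) from a conserved core (regions present in at least 96% of genomes). The following is a minimal sketch of that classification, assuming a presence/absence table that maps each pan-genome region to the set of genomes carrying it. The function name, region identifiers, and data layout are hypothetical and are not taken from Panseq or SuperPhy.

```python
from typing import Dict, Set


def classify_regions(
    presence: Dict[str, Set[str]],
    genomes: Set[str],
    conserved_threshold: float = 0.96,
) -> Dict[str, str]:
    """Label each region as strict core, conserved core, or accessory.

    `presence` maps a region ID to the genomes that carry it (assumed layout).
    """
    if not genomes:
        raise ValueError("at least one genome is required")
    labels = {}
    for region, carriers in presence.items():
        # Count only carriers that belong to the study set.
        n_carriers = len(carriers & genomes)
        if n_carriers == len(genomes):
            labels[region] = "strict_core"
        elif n_carriers / len(genomes) >= conserved_threshold:
            labels[region] = "conserved_core"
        else:
            labels[region] = "accessory"
    return labels
```

In this sketch the labels are mutually exclusive for convenience; a region labelled `strict_core` also satisfies the 96% criterion, so the conserved core as described in the abstract would be the union of the first two labels.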
The frequency of the subspecies observed within the study set of 4936 Salmonella enterica genomes, prior to any quality filtering.
| Subspecies | No. |
|---|---|
| | 4913 |
| | 7 |
| | 7 |
| | 4 |
| | 4 |
| | 1 |
The serovars with more than 20 representatives in the current study set of 4936 Salmonella enterica genomes, and their frequency, prior to any quality filtering.
| Serovar | No. |
|---|---|
| Typhi | 1977 |
| Typhimurium | 758 |
| Enteritidis | 413 |
| Heidelberg | 201 |
| Paratyphi | 158 |
| Kentucky | 155 |
| Agona | 136 |
| Weltevreden | 120 |
| Bareilly | 106 |
| Newport | 82 |
| Tennessee | 77 |
| Montevideo | 69 |
| Saintpaul | 48 |
| Infantis | 39 |
| Senftenberg | 35 |
| Bovismorbificans | 34 |
| Hadar | 33 |
| Muenchen | 30 |
| Anatum | 27 |
| Schwarzengrund | 27 |
| Dublin | 24 |
| Cerro | 21 |
The average number of species-specific genomic regions per genome for each serovar of subspecies enterica with at least 10 representative genomes, among the 4870 quality-filtered subspecies enterica genomes of this study.
| Serovar | Average no. species-specific regions |
|---|---|
| Enteritidis | 401.7 |
| Anatum | 401.5 |
| Muenchen | 400.5 |
| Hadar | 400.3 |
| Typhimurium | 400.1 |
| Newport | 399.8 |
| Thompson | 399.7 |
| Saintpaul | 399.6 |
| Heidelberg | 397.4 |
| Dublin | 395.2 |
| Infantis | 394.9 |
| Braenderup | 392.8 |
| Weltevreden | 390.0 |
| Bareilly | 388.5 |
| Kentucky | 380.3 |
| Plymouth/Zega | 377.9 |
| Senftenberg | 376.5 |
| Mbandaka | 374.5 |
| Lubbock | 374.1 |
| Reading | 370.4 |
| Agona | 369.5 |
| Tennessee | 368.3 |
| Schwarzengrund | 362.3 |
| Paratyphi | 361.5 |
| Derby | 360.7 |
| Montevideo | 360.1 |
| Typhi | 358.1 |
| Bovismorbificans | 355.3 |
| Cerro | 342.0 |
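The per-serovar averages above can be reproduced from per-genome counts with a simple group-and-filter step. The sketch below assumes each genome is represented as a hypothetical `(serovar, number_of_species_specific_regions)` pair; the input format and function name are illustrative only.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def average_regions_per_serovar(
    records: Iterable[Tuple[str, int]],
    min_genomes: int = 10,
) -> Dict[str, float]:
    """Average per-genome region counts for serovars with enough genomes."""
    counts: Dict[str, List[int]] = defaultdict(list)
    for serovar, n_regions in records:
        counts[serovar].append(n_regions)
    # Keep only serovars that meet the minimum sample size, as in the table.
    return {
        serovar: round(sum(values) / len(values), 1)
        for serovar, values in counts.items()
        if len(values) >= min_genomes
    }
```

Serovars with fewer than `min_genomes` genomes are dropped rather than reported with an unreliable average.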
Putative functions of the S. enterica species-specific regions, based on the best hit for each region; only functions identified more than once are listed.
| Putative protein function | Frequency |
|---|---|
| Hypothetical | 64 |
| Secreted effector | 10 |
| Membrane | 7 |
| Secretion system apparatus | 5 |
| Uncharacterized | 5 |
| Fimbrial | 5 |
| Pathogenicity island 2 effector | 4 |
| Fimbrial assembly | 4 |
| Outer membrane usher | 4 |
| MFS transporter | 3 |
| Oxidoreductase | 3 |
| Histidine kinase | 3 |
| Putative inner membrane | 3 |
| Putative cytoplasmic | 3 |
| LysR family transcriptional regulator | 3 |
| Transcriptional regulator | 2 |
| Permease | 2 |
| Outer membrane | 2 |
| Type III secretion | 2 |
| Phosphoglycerate transport | 2 |
| AraC family transcriptional regulator | 2 |
| Conserved hypothetical | 2 |
| Methyl-accepting chemotaxis | 2 |
| Hybrid sensor histidine kinase/response regulator | 2 |
| Glycosyl transferase, partial | 2 |
| Phenylacetaldehyde dehydrogenase | 2 |
| Pathogenicity island 1 effector | 2 |
| | 2 |
| Type III secretion system | 2 |
| Transcriptional regulator, partial | 2 |
| Cytoplasmic | 2 |
| Fimbrial chaperone | 2 |
| Putative sialic acid transporter | 2 |
The number of subspecies-specific pan-genome markers: regions that were either present in every member of the subspecies and absent from all genomes of every other subspecies, or absent from every member of the subspecies and present in all genomes of every other subspecies.
| Subspecies | No. markers |
|---|---|
| | 207 |
| | 93 |
| | 9 |
| | 134 |
| | 192 |
| | 135 |
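The marker definition in the caption above is a set comparison: a region qualifies if it is carried by every genome in the group and by none outside it, or the reverse. A minimal sketch follows, assuming the same hypothetical region-to-carriers mapping; it is an illustration of the criterion, not the SuperPhy implementation.

```python
from typing import Dict, List, Set, Tuple


def group_specific_markers(
    presence: Dict[str, Set[str]],
    group: Set[str],
    others: Set[str],
) -> Tuple[List[str], List[str]]:
    """Return (regions unique to the group, regions uniquely missing from it)."""
    if not group or not others:
        raise ValueError("both the group and the comparison set must be non-empty")
    present_markers, absent_markers = [], []
    for region, carriers in presence.items():
        in_group = carriers & group
        in_others = carriers & others
        if in_group == group and not in_others:
            # Present in every group member, absent everywhere else.
            present_markers.append(region)
        elif not in_group and in_others == others:
            # Absent from every group member, present everywhere else.
            absent_markers.append(region)
    return sorted(present_markers), sorted(absent_markers)
```

Because both conditions are all-or-nothing, a single misassembled or mislabelled genome can eliminate a marker, which is one reason quality filtering before the comparison matters.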
The number of pan-genome regions that were universally present or universally absent within a serovar, and also statistically over- or under-represented relative to all other genomes, for the 10 most abundant serovars among the 4870 subspecies enterica genomes of this study.
| Serovar | No. universally present | No. universally absent |
|---|---|---|
| Typhi | 288 | 2720 |
| Typhimurium | 41 | 698 |
| Enteritidis | 18 | 440 |
| Heidelberg | 121 | 840 |
| Paratyphi | 65 | 202 |
| Kentucky | 177 | 331 |
| Agona | 161 | 638 |
| Weltevreden | 426 | 608 |
| Bareilly | 87 | 436 |
| Newport | 226 | 360 |
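The caption above combines universal presence with statistical over-representation but does not name the test used. As an illustration only, the sketch below assumes a one-sided Fisher's exact test on the 2x2 table of carriers versus non-carriers inside and outside a serovar; both the choice of test and the function name are assumptions, not details confirmed by the source.

```python
from math import comb


def enrichment_p_value(
    group_with: int, group_total: int, other_with: int, other_total: int
) -> float:
    """One-sided Fisher's exact test for over-representation in the group.

    Returns the probability of seeing at least `group_with` carriers in the
    group if carriers were distributed at random (hypergeometric tail).
    """
    if not (0 <= group_with <= group_total and 0 <= other_with <= other_total):
        raise ValueError("carrier counts must lie between 0 and the group size")
    total = group_total + other_total
    carriers = group_with + other_with
    upper = min(group_total, carriers)
    # Sum hypergeometric terms for k = group_with .. upper carriers in the group.
    tail = sum(
        comb(carriers, k) * comb(total - carriers, group_total - k)
        for k in range(group_with, upper + 1)
    )
    return tail / comb(total, group_total)
```

Under-representation can be tested with the same function by counting non-carriers instead of carriers. When many regions are tested, a multiple-testing correction would normally be applied to the resulting p-values.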