Kelsey J Jesser, Willy Valdivia-Granda, Jessica L Jones, Rachel T Noble.
Abstract
Vibrio parahaemolyticus is a ubiquitous and abundant member of native microbial assemblages in coastal waters and shellfish. Though V. parahaemolyticus is predominantly environmental, some strains have infected human hosts and caused outbreaks of seafood-related gastroenteritis. To understand differences among clinical and environmental V. parahaemolyticus strains, we used high-quality DNA sequencing data to compare the genomes of V. parahaemolyticus isolates (n = 43) from a variety of geographic locations and clinical and environmental sample matrices. We used phylogenetic trees inferred from multilocus sequence typing (MLST) and whole-genome (WG) alignments, as well as a novel classification and genome clustering approach that relies on protein motif fingerprints (MFs), to assess relationships between V. parahaemolyticus strains and identify novel molecular targets associated with virulence. Differences in strain clustering at more than one position were observed between the MLST and WG phylogenetic trees. The WG phylogeny had higher support values and strain resolution since isolates of the same sequence type could be differentiated. The MF analysis revealed groups of protein motifs that were associated with the pathogenic MLST type ST36 and a large group of clinical strains isolated from human stool. A subset of the stool- and ST36-associated protein motifs was selected for further analysis, and the motif sequences were found in genes with a variety of functions, including transposases, secretion system components and effectors, and hypothetical proteins. DNA sequences associated with these protein motifs are candidate targets for future molecular assays to improve surveys of pathogenic V. parahaemolyticus in the environment and seafood.
Keywords: MLST; Vibrio parahaemolyticus; genomics; phylogenetics; protein motif fingerprinting; virulence; whole-genome sequencing
Year: 2019 PMID: 31139608 PMCID: PMC6519141 DOI: 10.3389/fpubh.2019.00066
Source DB: PubMed Journal: Front Public Health ISSN: 2296-2565
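Table 1. Vibrio parahaemolyticus isolates.
| Isolate | BioSample ID | Source | Year | Location | Serotype | ST | tdh/trh |
|---|---|---|---|---|---|---|---|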
| 1 | SAMN02368229 | Oyster | 2007 | FL | O4:Kuk | 536 | –/– |
| 2 | SAMN02368232 | Oyster | 2007 | FL | O11:Kuk | 734 | –/– |
| 3 | SAMN02368266 | Oyster | 2007 | FL | O4:K42 | 1146 | –/– |
| 4 | SAMN02368267 | Oyster | 2007 | FL | O11:Kuk | 1153 | –/– |
| 5 | SAMN02368274 | Oyster | 2007 | FL | O5:Kuk | 743 | –/– |
| 6 | SAMN02368227 | Oyster | 2007 | LA | O4:K10 | 732 | –/– |
| 7 | SAMN03358821 | Oyster | 2007 | PEI, Canada | O11:Kuk | 1152 | –/– |
| 8 | SAMN02368264 | Oyster | 2007 | PEI, Canada | O11:Kuk | 1152 | –/– |
| 9 | SAMN02368270 | Oyster | 2007 | SC | O3:Kuk | 741 | –/– |
| 10 | SAMN02368244 | Oyster | 2007 | WA | O3:Kuk | 1148 | –/– |
| 11 | SAMN02741394 | Oyster | 2010 | MD | Unk | 34 | –/+ |
| 12 | SAMN02741402 | Oyster | 2010 | MD | Unk | 8 | –/+ |
| 13 | SAMN02368293 | Stool | 2006 | HI | O4:K4 | 283 | –/– |
| 14 | SAMN02368297 | Stool | 2006 | MA | O4:K53 | 749 | +/+ |
| 15 | SAMN02368298 | Stool | 2006 | MA | O1:Kuk | 3 | +/– |
| 16 | SAMN02368290 | Stool | 2006 | MD | O5:K47 | 1144 | –/+ |
| 17 | SAMN02368282 | Stool | 2006 | ME | O5:Kuk | 1150 | –/+ |
| 18 | SAMN03358827 | Stool | 2006 | NY | O10:Kuk | 636 | +/+ |
| 19 | SAMN03358828 | Stool | 2006 | NY | O3:K6 | 3 | +/– |
| 20 | SAMN02368284 | Stool | 2006 | NY | O4:Kuk | 36 | +/+ |
| 21 | SAMN02368283 | Stool | 2006 | NY | O10:Kuk | 809 | –/+ |
| 22 | SAMN03358830 | Stool | 2006 | NY | O4:K12 | 36 | +/+ |
| 23 | SAMN02368288 | Stool | 2006 | VA | O8:K41 | Undefined | –/– |
| 24 | SAMN02368286 | Stool | 2006 | VA | O5:K17 | 674 | –/– |
| 25 | SAMN02368315 | Stool | 2007 | AK | O4:K63 | 36 | +/+ |
| 26 | SAMN02368321 | Stool | 2007 | GA | O4:K8 | Undefined | +/– |
| 27 | SAMN02368292 | Stool | 2007 | HI | O5:Kuk | 79 | –/– |
| 28 | SAMN02368291 | Stool | 2007 | HI | O5:K17 | 79 | –/– |
| 29 | SAMN02368322 | Stool | 2007 | IA | O4:K12 | 36 | +/+ |
| 30 | SAMN02368323 | Stool | 2007 | IA | O4:K12 | 36 | +/+ |
| 31 | SAMN02368304 | Stool | 2007 | MD | O3:K56 | 750 | +/+ |
| 32 | SAMN03358834 | Stool | 2007 | NV | O1:Kuk | 199 | +/+ |
| 33 | SAMN03358837 | Stool | 2007 | NY | O10:Kuk | 636 | +/+ |
| 34 | SAMN03358839 | Stool | 2007 | OR | O1:Kuk | 65 | –/+ |
| 35 | SAMN02368303 | Stool | 2007 | SD | O1:K56 | 775 | +/+ |
| 36 | SAMN02368318 | Stool | 2007 | VA | O1:K20 | 1132 | +/+ |
| 37 | SAMN02368312 | Stool | 2007 | WA | O4:K12 | 36 | +/+ |
| 38 | SAMN02368311 | Stool | 2007 | WA | O4:K12 | 36 | +/+ |
| 39 | SAMN02368325 | Stool | 2007 | WA | O4:Kuk | 36 | +/+ |
| 40 | SAMN02368333 | Stool | 2009 | OK | O4:K12 | 36 | +/+ |
| 41 | SAMN01923894 | Unk | 2006 | USA | Unk | 3 | Not typed |
| 42 | SAMN01940374 | Water | 2009 | USA | Unk | 1567 | –/– |
| 43 | SAMN02368278 | Hand | 2006 | LA | O1:Kuk | 744 | –/– |
BioSample ID: IDs are searchable in NCBI's BioSample database; web entries include sample information and links to raw sequence data.
Year and location: year and location of collection for environmental isolates, and of sample isolation from the patient for clinical isolates.
ST: "Undefined" indicates an insertion in the recA MLST locus; strains with similar insertions have been described previously.
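As a reading aid for the isolate table, the sketch below tallies hemolysin genotype by isolation source. It is a minimal illustration only: the rows are a small subset copied from the table, the field names are hypothetical labels chosen here, and the last column is assumed to report tdh/trh presence or absence in that order.

```python
from collections import Counter

# A few rows copied from the isolate table (not the full table).
ROWS = """\
| 1 | SAMN02368229 | Oyster | 2007 | FL | O4:Kuk | 536 | –/– |
| 11 | SAMN02741394 | Oyster | 2010 | MD | Unk | 34 | –/+ |
| 19 | SAMN03358828 | Stool | 2006 | NY | O3:K6 | 3 | +/– |
| 20 | SAMN02368284 | Stool | 2006 | NY | O4:Kuk | 36 | +/+ |
| 22 | SAMN03358830 | Stool | 2006 | NY | O4:K12 | 36 | +/+ |
| 42 | SAMN01940374 | Water | 2009 | USA | Unk | 1567 | –/– |
"""

# Hypothetical field names; the source table carries no header row.
FIELDS = ["isolate", "biosample", "source", "year",
          "location", "serotype", "st", "hemolysin"]


def parse_row(line):
    """Split one pipe-delimited row into a dict keyed by FIELDS."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return dict(zip(FIELDS, cells))


def tally_by_source(text):
    """Count isolates for each (source, hemolysin genotype) pair."""
    records = [parse_row(line) for line in text.splitlines() if line.strip()]
    return Counter((r["source"], r["hemolysin"]) for r in records)


counts = tally_by_source(ROWS)
for (source, genotype), n in sorted(counts.items()):
    print(source, genotype, n)
```

Running the same tally over all 43 rows would give the per-source genotype counts for the full data set; the subset here only demonstrates the parsing and counting steps.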
Figure 1. Approximate maximum-likelihood phylogenetic tree based on a whole genome (WG) sequence alignment of 43 V. parahaemolyticus genomes. Colors indicate hemolysin gene (tdh and trh) presence or absence and shape indicates sample isolation source. Inset is an enlargement of the ST36 clade to illustrate subclade diversity and resolution. Nodes are labeled with FastTree support values. Circles at nodes indicate bootstrap support values (1000 replicates) of >0.9 (black), >0.7 (gray), and >0.5 (white). Numbered clusters correspond with clusters listed in Table 2.
Figure 2. Maximum-likelihood phylogenetic tree based on an alignment of multilocus sequence typing (MLST) loci for 43 V. parahaemolyticus genomes. Colors indicate hemolysin gene (tdh and trh) presence or absence and shape indicates sample isolation source. Nodes are labeled with bootstrap support values (1000 replicates). Numbered clusters correspond with clusters listed in Table 2.
Table 2. Cluster membership of Vibrio parahaemolyticus isolates in the WG and MLST phylogenies and the MF clustering analysis.
| Isolate | BioSample ID | WG cluster | MLST cluster | MF cluster |
|---|---|---|---|---|
| 1 | SAMN02368229 | 1 | 1 | |
| 2 | SAMN02368232 | 3 | 3 | |
| 3 | SAMN02368266 | 1 | 3 | |
| 4 | SAMN02368267 | 3 | 1 | |
| 5 | SAMN02368274 | 3 | 1 | |
| 6 | SAMN02368227 | 2 | 3 | |
| 7 | SAMN03358821 | 3 | 1 | |
| 8 | SAMN02368264 | 3 | 1 | |
| 9 | SAMN02368270 | 1 | | |
| 10 | SAMN02368244 | 3 | 2 | |
| 11 | SAMN02741394 | 3 | 1 | |
| 12 | SAMN02741402 | 1 | 3 | |
| 13 | SAMN02368293 | 1 | 1 | |
| 14 | SAMN02368297 | 1 | 3 | Stool cluster |
| 15 | SAMN02368298 | 1 | 3 | |
| 16 | SAMN02368290 | 1 | 3 | Stool cluster |
| 17 | SAMN02368282 | 3 | 1 | Stool cluster |
| 18 | SAMN03358827 | 2 | 1 | Stool cluster |
| 19 | SAMN03358828 | 1 | 3 | |
| 20 | SAMN02368284 | 2 | 2 | ST36 |
| 21 | SAMN02368283 | 1 | 3 | Stool cluster |
| 22 | SAMN03358830 | 2 | 2 | ST36 |
| 23 | SAMN02368288 | 1 | N/A | Stool cluster |
| 24 | SAMN02368286 | 1 | 1 | |
| 25 | SAMN02368315 | 2 | 2 | ST36 |
| 26 | SAMN02368321 | 1 | N/A | |
| 27 | SAMN02368292 | 1 | 3 | Stool cluster |
| 28 | SAMN02368291 | 1 | 3 | Stool cluster |
| 29 | SAMN02368322 | 2 | 2 | ST36 |
| 30 | SAMN02368323 | 2 | 2 | ST36 |
| 31 | SAMN02368304 | 3 | 1 | |
| 32 | SAMN03358834 | 2 | 1 | Stool cluster |
| 33 | SAMN03358837 | 2 | 1 | Stool cluster |
| 34 | SAMN03358839 | 1 | 3 | Stool cluster |
| 35 | SAMN02368303 | 3 | 1 | Stool cluster |
| 36 | SAMN02368318 | 3 | 1 | Stool cluster |
| 37 | SAMN02368312 | 2 | 2 | ST36 |
| 38 | SAMN02368311 | 2 | 2 | ST36 |
| 39 | SAMN02368325 | 2 | 2 | ST36 |
| 40 | SAMN02368333 | 2 | 2 | ST36 |
| 41 | SAMN01923894 | 1 | 3 | |
| 42 | SAMN01940374 | 1 | 1 | |
| 43 | SAMN02368278 | 1 | 3 | |
WG cluster: corresponds to numbered clusters in the WG phylogeny (Figure 1).
MLST cluster: corresponds to numbered clusters in the MLST phylogeny (Figure 2).
MF cluster: corresponds to labeled MF clusters (Figure 3).
Isolate not included in MLST analysis due to an insertion in the recA MLST locus.
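One way to compare the two phylogenies from the cluster membership table is to cross-tabulate WG and MLST cluster assignments. The sketch below is illustrative only: it uses a handful of rows copied from the table, assumes the third and fourth columns are the WG and MLST clusters respectively, and treats the cluster numbers as arbitrary labels that need not match between trees. Isolates without an MLST cluster (blank or N/A) are skipped.

```python
from collections import Counter

# A few rows copied from the cluster membership table (not the full table).
ROWS = """\
| 1 | SAMN02368229 | 1 | 1 | |
| 3 | SAMN02368266 | 1 | 3 | |
| 20 | SAMN02368284 | 2 | 2 | ST36 |
| 22 | SAMN03358830 | 2 | 2 | ST36 |
| 23 | SAMN02368288 | 1 | N/A | Stool cluster |
| 32 | SAMN03358834 | 2 | 1 | Stool cluster |
"""


def parse_clusters(text):
    """Return a list of (wg, mlst, mf) cluster labels, one per row."""
    assignments = []
    for line in text.splitlines():
        if not line.strip():
            continue
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        _isolate, _biosample, wg, mlst, mf = cells
        assignments.append((wg, mlst, mf))
    return assignments


def cross_tab(assignments):
    """Count isolates per (WG cluster, MLST cluster) pair.

    Isolates with no MLST cluster (blank or N/A) are left out.
    """
    return Counter(
        (wg, mlst) for wg, mlst, _mf in assignments if mlst not in ("", "N/A")
    )


table = cross_tab(parse_clusters(ROWS))
for (wg, mlst), n in sorted(table.items()):
    print(f"WG {wg} / MLST {mlst}: {n}")
```

A WG cluster whose isolates spread across several MLST clusters in the resulting table is a place where the two trees group strains differently.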
Figure 3. Pearson correlation average linkage hierarchical clustering of Vibrio motif fingerprints (MFs) across 43 V. parahaemolyticus genomes revealed a large cluster of stool isolates (green) and clustered ST36 isolates together (blue). Protein motifs associated with these clusters are designated by colored boxes, and specific motif taxonomies that are proposed as targets for quantitative molecular assay design are labeled with colored stars.
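The clustering named in the Figure 3 caption can be sketched in a few lines. This is a minimal standard-library illustration, not the authors' pipeline: the isolate names and presence/absence profiles are hypothetical toy data, distance is taken as 1 minus the Pearson correlation, and clusters are merged by average linkage (mean pairwise distance between members).

```python
from itertools import combinations
from math import sqrt

# Hypothetical motif presence/absence profiles (toy data, not from the study).
PROFILES = {
    "iso_A": [1, 1, 0, 0, 1, 0],
    "iso_B": [1, 1, 0, 0, 0, 0],
    "iso_C": [0, 0, 1, 1, 0, 1],
    "iso_D": [0, 0, 1, 1, 1, 1],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("Pearson correlation is undefined for a constant profile")
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering with distance = 1 - Pearson r.

    Returns the merges in order as (members_a, members_b, distance).
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1 - pearson(profiles[a], profiles[b])
        for a, b in combinations(names, 2)
    }

    def between(c1, c2):
        # Average linkage: mean of all pairwise member distances.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


for left, right, d in average_linkage(PROFILES):
    print(left, "+", right, f"at distance {d:.3f}")
```

For real data, SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` and `metric="correlation"` performs the same kind of clustering more efficiently; the pure-Python version above is meant only to make the steps explicit.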