Yael Koton1, Michal Gordon2, Vered Chalifa-Caspi2, Naiel Bisharat1.
Abstract
In 1996, a common-source outbreak of severe soft tissue and bloodstream infections erupted among Israeli fish farmers and fish consumers due to changes in fish marketing policies. The causative pathogen was a new strain of Vibrio vulnificus, named biotype 3, which displayed a unique biochemical and genotypic profile. Initial observations suggested that the pathogen emerged as a result of genetic recombination between two distinct populations. We applied a whole genome shotgun sequencing approach using several V. vulnificus strains from Israel in order to study the pan genome of V. vulnificus and determine the phylogenetic relationship of biotype 3 with existing populations. The core genome of V. vulnificus, based on 16 draft and complete genomes, consisted of 3068 genes, representing between 59 and 78% of the whole genome of the 16 strains. The accessory genome varied in size from 781 to 2044 kbp. Phylogenetic analysis based on whole, core, and accessory genomes displayed similar clustering patterns with two main clusters, clinical (C) and environmental (E); all biotype 3 strains formed a distinct group within the E cluster. Annotation of accessory genomic regions found in biotype 3 strains and absent from the core genome yielded 1732 genes, of which the vast majority encoded hypothetical proteins, phage-related proteins, and mobile element proteins. A total of 1916 proteins (including 713 hypothetical proteins) were present in all human pathogenic strains (both biotype 3 and non-biotype 3) and absent from the environmental strains. Clustering analysis of the non-hypothetical proteins revealed 148 protein clusters shared by all human pathogenic strains; these included transcriptional regulators, arylsulfatases, methyl-accepting chemotaxis proteins, acetyltransferases, GGDEF family proteins, transposases, type IV secretory system (T4SS) proteins, and integrases. Our study showed that V. vulnificus biotype 3 evolved from environmental populations and formed a genetically distinct group within the E cluster. The unique epidemiological circumstances facilitated disease outbreak and brought this genotype to the attention of the scientific community.
Keywords: Vibrio vulnificus; accessory genome; aquaculture; core genome; evolution; microbial genome; whole genome shotgun sequences
Year: 2015 PMID: 25642229 PMCID: PMC4295529 DOI: 10.3389/fmicb.2014.00803
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Characteristics of strains used in the study.
| Strain | Biotype | Source | Year of isolation | Country | Sequence type |
| --- | --- | --- | --- | --- | --- |
| 101/04 | 1 | Fish | 1997 | Israel | 59 |
| 2322 | 1 | Water | 1997 | Israel | 10 |
| 491771 | 3 | Human | 1997 | Israel | 8 |
| VV9-09 | 3 | Human | 1999 | Israel | 8 |
| VV 4-03 | 3 | Human | 2003 | Israel | 8 |
Sequence type as determined by multi-locus sequence typing (MLST).
Quality assessment of assembled genomes.
| No. of reads | 9,786,036 | 9,150,870 | 7,465,222 | 7,291,670 | 7,944,338 | NA | NA |
| Average length – contigs (bp) | 14,853 | 12,811 | 13,463 | 9,774 | 11,454 | 23,531 | 41,029 |
| No. contigs (≥0 bp) | 370 | 413 | 391 | 543 | 460 | 218 | 140 |
| No. contigs (≥1000 bp) | 63 | 78 | 177 | 200 | 190 | 187 | 115 |
| GC (%) | 46.32 | 46.4 | 46.42 | 46.43 | 46.42 | 46.49 | 46.73 |
| N50 | 506,351 | 324,391 | 54,238 | 57,095 | 57,166 | 52,210 | 230,903 |
| N75 | 143,535 | 161,992 | 33,564 | 34,718 | 33,212 | 32,756 | 109,426 |
| No. of fully unaligned contigs | 133 | 162 | 101 | 175 | 139 | 68 | 43 |
| Fully unaligned length (bp) | 170,338 | 152,003 | 326,751 | 390,311 | 369,492 | 284,904 | 226,696 |
| No. mismatches per 100 kbp | 3717.52 | 3753.33 | 3024.61 | 3013.1 | 3022.73 | 3019.71 | 2875.88 |
| No. indels per 100 kbp | 73.28 | 93.49 | 77.05 | 80.98 | 77.7 | 80.33 | 59.75 |
| Genome fraction (%) | 63.6 | 61.4 | 79.8 | 78.9 | 79.4 | 79.9 | 79.3 |
| No. predicted genes (≥0 bp) | 5017 | 4836 | 4961 | 5068 | 5026 | 4907 | 5291 |
| No. predicted genes (≥300 bp) | 4380 | 4255 | 4206 | 4264 | 4196 | 4279 | 4699 |
| No. predicted genes (≥1500 bp) | 752 | 722 | 719 | 713 | 714 | 680 | 806 |
| No. predicted genes (≥3000 bp) | 85 | 84 | 73 | 71 | 72 | 63 | 78 |
N50 is the contig length such that all contigs of that length or longer together cover at least half of the assembly. N75 is defined in the same way, with 75% instead of 50%.
Strains sequenced by others (Danin-Poleg et al., .
The average number of mismatches per 100,000 aligned bases. True SNPs and sequencing errors are not distinguished and are counted equally.
The average number of indels per 100,000 aligned bases. Several consecutive single nucleotide indels are counted as one indel.
Percentage of aligned bases in the reference.
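The N50 and N75 definitions in the footnotes above can be illustrated with a minimal sketch. The function name `nx` and the contig lengths are hypothetical and chosen only for illustration; this is not the tool used to produce the table.

```python
def nx(contig_lengths, percent):
    """Return the largest length L such that contigs of length >= L
    together cover at least `percent` % of the total assembly length."""
    if not contig_lengths:
        raise ValueError("contig_lengths must not be empty")
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest, accumulating covered bases.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return length


# Hypothetical contig lengths in bp (total = 300), not data from the study.
lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(lengths, 50)
n75 = nx(lengths, 75)
print(n50, n75)
```

With these made-up lengths, the two longest contigs already cover half of the assembly, so N50 is the length of the second contig; N75 requires the four longest contigs.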
Figure 1. Venn diagram representing differential and shared gene counts between representative strains of the three biotypes. Biotype 1 = strain CMCP6, biotype 2 = strain ATCC 33147, and biotype 3 = strain 491771.
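As a rough sketch of how the shared and differential counts in a three-way Venn diagram can be derived, the regions follow from plain set operations. The sketch assumes genes have already been assigned to orthologous groups with shared identifiers, which the source does not describe; the function name `venn3_counts` and the gene identifiers are hypothetical.

```python
def venn3_counts(a, b, c):
    """Count the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "only_a": len(a - b - c),
        "only_b": len(b - a - c),
        "only_c": len(c - a - b),
        "a_and_b_only": len((a & b) - c),
        "a_and_c_only": len((a & c) - b),
        "b_and_c_only": len((b & c) - a),
        "all_three": len(a & b & c),
    }


# Hypothetical orthologous-group IDs per strain, not data from the study.
biotype1 = {"g1", "g2", "g3", "g4"}
biotype2 = {"g1", "g2", "g5"}
biotype3 = {"g1", "g3", "g5", "g6"}

counts = venn3_counts(biotype1, biotype2, biotype3)
print(counts)
```

The seven region counts always sum to the size of the union of the three sets, which is a convenient sanity check on real data.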
Figure 2. Functional annotations of core and accessory genes. (A) COG categories and (B) COG subcategories of predicted genes within the core and accessory genomes of V. vulnificus. Each category or subcategory is graphed as a percentage of the total number of genes in the core or accessory genomes. Accessory genome percentages are averages of the 16 analyzed genomes.
Figure 3. REALPHY analysis based on WGS sequence data from the current study and other publicly available genome sequence data. (A) Based on whole genomes, (B) based on core genomes, (C) based on accessory genomes. Human pathogenic strains are shown in red font. Bootstrapping was performed using 500 iterations. For clarity, some strains were not included in the analysis.
Figure 4. Circular view of BLAST alignment of three core genomes against the complete genome of reference strain CMCP6. The circles, from outside to inside, show: CDS on the positive and negative strands, group E1 core genome, group E2 core genome, group C core genome, and GC content. Figure generated using CGView (Grant and Stothard, 2008).