Kazunori Murase, Eiji Arakawa, Hidemasa Izumiya, Atsushi Iguchi, Taichiro Takemura, Taisei Kikuchi, Ichiro Nakagawa, Nicholas R Thomson, Makoto Ohnishi, Masatomo Morita.
Abstract
Approximately 200 O-serogroups of Vibrio cholerae have already been identified; however, only 2 serogroups, O1 and O139, are strongly related to pandemic cholera. The study of non-O1 and non-O139 strains has hitherto been limited. Nevertheless, there are other clinically and epidemiologically important serogroups causing outbreaks with cholera-like disease. Here, we report a comprehensive genome analysis of the whole set of V. cholerae O-serogroup reference strains to provide an overview of this important bacterial pathogen. The analysis revealed structural diversity in the O-antigen biosynthesis gene clusters located at specific loci on chromosome 1, as well as 16 pairs of strains with almost identical O-antigen biosynthetic gene clusters but different serological patterns. This difference might be due to the presence of O-antigen biosynthesis-related genes at secondary loci on chromosome 2.
Keywords: O-antigen biosynthetic gene cluster; O-serogroup reference strain; Vibrio cholerae; multi-chromosomal bacteria
Year: 2022 PMID: 35930328 PMCID: PMC9484750 DOI: 10.1099/mgen.0.000860
Source DB: PubMed Journal: Microb Genom ISSN: 2057-5858
General genome statistics for 11 V. cholerae strains
| General genome statistics | N16961 | VCSRO5 | VCSRO17 | VCSRO63 | VCSRO77 | VCSRO102 | VCSRO207 | VCSRO45 | VCSRO51 | VCSRO96 | VCSRO162 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Phylogenetic cluster | Cluster 3 | Cluster 3 | Cluster 3 | Cluster 3 | Cluster 3 | Cluster 3 | Cluster 3 | Cluster 2 | Cluster 2 | Cluster 2 | Cluster 1 |
| Chromosome 1 | | | | | | | | | | | |
| Genome size (bp) | 2 961 149 | 2 952 352 | 2 939 341 | 2 869 733 | 3 064 657 | 2 874 693 | 2 868 058 | 3 021 501 | 2 967 527 | 2 887 793 | 2 966 062 |
| No. of CDSs | 2775 | 2720 | 2703 | 2623 | 2801 | 2601 | 2592 | 2767 | 2737 | 2632 | 2691 |
| No. of rRNA operons | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 | 8 |
| No. of tRNA and tmRNA | 95 | 99 | 100 | 101 | 101 | 96 | 101 | 100 | 102 | 97 | 103 |
| GC content (%) | 47.70 | 47.90 | 47.69 | 48.08 | 47.76 | 47.99 | 48.01 | 47.87 | 47.93 | 48.09 | 47.68 |
| No. of genomic islands* | 6 | 5 | 5 | 2 | 5 | 4 | 3 | 6 | 5 | 3 | 6 |
| No. of strain-specific genes | 147 | 139 | 96 | 114 | 198 | 66 | 93 | 170 | 163 | 82 | 273 |
| Proportion of unique genes (%) | 5.30 | 5.11 | 3.55 | 4.35 | 7.07 | 2.54 | 3.59 | 6.14 | 5.96 | 3.12 | 10.14 |
| No. of core genes | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 | 1254 |
| Proportion of core genes (%) | 45.19 | 46.10 | 46.39 | 47.81 | 44.77 | 48.21 | 48.38 | 45.32 | 45.82 | 47.64 | 46.60 |
| Chromosome 2 | | | | | | | | | | | |
| Genome size (bp) | 1 072 315 | 1 070 220 | 1 102 179 | 1 155 566 | 1 007 849 | 1 123 019 | 1 163 376 | 1 096 179 | 1 004 624 | 1 165 751 | 1 094 700 |
| No. of CDSs | 1115 | 956 | 976 | 1035 | 916 | 1013 | 1017 | 990 | 895 | 1081 | 971 |
| No. of rRNA operons | – | – | – | – | – | – | – | – | – | – | – |
| No. of tRNA and tmRNA | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 4 | 3 |
| GC content (%) | 46.92 | 47.20 | 47.28 | 46.95 | 47.17 | 46.87 | 46.85 | 46.66 | 47.06 | 46.62 | 46.53 |
| No. of genomic islands* | 1 | 3 | 3 | 3 | 3 | 2 | 5 | 5 | 2 | 4 | 3 |
| No. of strain-specific genes | 144 | 89 | 87 | 123 | 153 | 153 | 139 | 83 | 67 | 184 | 222 |
| Proportion of unique genes (%) | 12.91 | 9.31 | 8.91 | 11.88 | 16.70 | 15.10 | 13.67 | 8.38 | 7.49 | 17.02 | 22.86 |
| No. of core genes | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 | 196 |
| Proportion of core genes (%) | 17.58 | 20.50 | 20.08 | 18.94 | 21.40 | 19.35 | 19.27 | 19.80 | 21.90 | 18.13 | 20.19 |
* The relevant characteristics of the genomic islands identified in each strain are shown in Table S4.
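The two proportion rows in the table appear to be derived from the count rows. The source does not state the denominator, but the N16961 values are reproduced when the number of CDSs on the same chromosome is used, so the Python sketch below makes that assumption. The dictionary layout and the helper name `percent_of_cds` are illustrative only.

```python
# Values for strain N16961, copied from the table above.
n16961 = {
    "Chromosome 1": {"cds": 2775, "strain_specific": 147, "core": 1254},
    "Chromosome 2": {"cds": 1115, "strain_specific": 144, "core": 196},
}


def percent_of_cds(count, cds):
    """Return `count` as a percentage of the CDS total, rounded to 2 decimals.

    Assumption: the table's proportions use the per-chromosome CDS count
    as the denominator (inferred from the numbers, not stated in the source).
    """
    return round(100 * count / cds, 2)


proportions = {
    chrom: {
        "unique_pct": percent_of_cds(v["strain_specific"], v["cds"]),
        "core_pct": percent_of_cds(v["core"], v["cds"]),
    }
    for chrom, v in n16961.items()
}

for chrom, p in proportions.items():
    print(f"{chrom}: unique {p['unique_pct']:.2f} %, core {p['core_pct']:.2f} %")
```

Under this assumption the core genes account for a much smaller share of Chr2 than of Chr1 (17.58 % versus 45.19 % for N16961), while the strain-specific share is larger on Chr2.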
Fig. 1. Classification of the O-antigen biosynthetic gene cluster. A representative of each type is enlarged from Fig. S1. An O-antigen synthesis unit, which contains genes related to nucleotide sugar biosynthesis, glycosyltransferases and O-antigen processing, is enclosed in a box. The O139 type of the O-antigen biosynthetic gene cluster possesses wbfABCDF and wzz in the 5′ region of the operon. The two-unit type of the O-antigen biosynthetic gene cluster possesses two synthesis units and seven conserved genes in the 5′ region of the second operon.
Fig. 2. Pan-genome profile and phylogenetic relation of the genomes. (a) A pan-genome curve for 191 V. cholerae genomes was generated with PanGP by plotting the total number of distinct gene families against the number of genomes considered. Similarly, the number of shared gene families is plotted against the number of genomes to generate the core genome plot, which shows how the core genome contracts as more genomes are added sequentially. (b) Assignments of core and non-core genes to COG and KEGG categories, as predicted by their respective databases. The values in each category indicate the relative abundance of the core or non-core gene sets identified in the pan-genome profile of the 191 V. cholerae genomes. (c) The core gene-based phylogenetic tree, classified into three groups (cluster 1, light green; cluster 2, pale pink; cluster 3, lavender) according to statistical significance, as calculated by the hierBAPS clustering method. The heatmap shows the pairwise comparison of ANI values calculated at the whole-genome level by FastANI (v1.3). (d) The pan-genome profile and the relevant statistics are shown in the circular phylogram or bar plots. Orthologous gene clusters in the circular phylogram were organized by Euclidean distance and the Ward linkage algorithm in the anvi'o (v5) platform.
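The two curves described in panel (a) can be illustrated with a minimal Python sketch. It is not the PanGP procedure: PanGP works on sampled combinations of genomes, whereas this sketch adds genomes in one fixed order. The function name `pan_core_curve` and the toy gene-family sets are hypothetical and do not come from the study's data.

```python
def pan_core_curve(genomes):
    """Track pan- and core-genome sizes as genomes are added one by one.

    genomes: list of sets of gene-family IDs, in order of addition.
    Returns (pan_sizes, core_sizes), one entry per genome added.
    """
    pan, core = set(), None
    pan_sizes, core_sizes = [], []
    for families in genomes:
        pan |= families  # pan-genome: union of all families seen so far
        # core genome: families present in every genome seen so far
        core = set(families) if core is None else core & families
        pan_sizes.append(len(pan))
        core_sizes.append(len(core))
    return pan_sizes, core_sizes


# Toy input (hypothetical): four genomes, gene families named by letters.
toy_genomes = [
    {"A", "B", "C", "D"},
    {"A", "B", "C", "E"},
    {"A", "B", "F"},
    {"A", "B", "C", "G", "H"},
]
pan_sizes, core_sizes = pan_core_curve(toy_genomes)
print("pan: ", pan_sizes)
print("core:", core_sizes)
```

With each added genome the pan-genome size can only grow or stay flat and the core size can only shrink or stay flat, which is the expansion and contraction the two plots depict.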
Fig. 3. Whole-genome alignment profile of 11 V. cholerae strains. (a) Dot plot representation of DNA sequence homology of Chr1 or Chr2 between strains. GenomeMatcher (v2.30) was used for blastn analysis and visualization of the results. (b) Linear maps of Chr1 (left panel) or Chr2 (right panel) with a large inversion were built using AliTV (v1.0.6) visualization software, based on the whole-genome alignments with the Lastz aligner. The red plots represent the shared sequences showing >95 % similarity between two different genomes. The grey segments indicate the inverted region on Chr1 or Chr2.
Fig. 4. Linkage of representative genomes from each phylogenetic cluster in V. cholerae. The linkages of gene synteny in Chr1 and Chr2 were visualized using Circos (v0.69–7) and are shown by orange and light blue lines, respectively. The outermost circles represent the GIs, chromosomes and GC contents of each reference genome. There was no synteny between Chr1 and Chr2 in any strain.
Fig. 5. Distribution of GIs identified on Chr1 and Chr2 in 191 V. cholerae strains. The profile was plotted according to the phylogenetic tree shown in Fig. 2c. Blue and red dots indicate the presence of GIs identified on Chr1 and Chr2, respectively.