Tammi Vesth, Trudy M Wassenaar, Peter F Hallin, Lars Snipen, Karin Lagesen, David W Ussery.
Abstract
Thirty-two genome sequences of various Vibrionaceae members are compared, with emphasis on what makes V. cholerae unique. As few as 1,000 gene families are conserved across all the Vibrionaceae genomes analysed; this number roughly doubles for gene families conserved within the species V. cholerae. Of these, approximately 200 gene families that cluster at various locations on the genome are not found in other sequenced Vibrionaceae; these are possibly unique to the V. cholerae species. By comparing the gene family content of the analysed genomes, the relatedness to a particular species is identified for two unspeciated genomes. Conversely, two genomes presumably belonging to the same species have suspiciously dissimilar gene family content. We are able to identify a number of genes that are conserved in, and unique to, V. cholerae. Some of these genes may be crucial to the niche adaptation of this species.
Year: 2010 PMID: 19830476 PMCID: PMC2807590 DOI: 10.1007/s00248-009-9596-7
Source DB: PubMed Journal: Microb Ecol ISSN: 0095-3628 Impact factor: 4.552
Table 1. Vibrionaceae genomes used in this analysis
| GPID | Organism | Contigs | Accession/GenBank | Status | No. of genes | Ref. |
|---|---|---|---|---|---|---|
| 36 | | 2 | AE003852.1 | Fully sequenced | 3,828 | |
| 15667 | | 2 | CP000626.1 | Fully sequenced | 3,875 | |
| 32853 | | 2 | CP001235.1 | Fully sequenced | 3,934 | |
| 33555 | | 2 | CP001485.1 | Fully sequenced | 3,774 | |
| 15666 | | 153 | NZ_AAKF00000000 | Unfinished (Easygene) | 3,421 | |
| 15670 | | 268 | NZ_AAKJ00000000 | Unfinished (NCBI) | 3,815 | |
| 33559 | | 8 | NZ_ACIA00000000 | Unfinished (NCBI) | 3,632 | |
| 33557 | | 17 | NZ_ACHZ00000000 | Unfinished (NCBI) | 3,748 | |
| 33553 | | 11 | NZ_ACHX00000000 | Unfinished (NCBI) | 3,811 | |
| 32851 | | 2 | CP001233.1 | Fully sequenced | 3,693 | |
| 18495 | | 162 | NZ_AAWF00000000 | Unfinished (NCBI) | 3,425 | |
| 18265 | | 254 | NZ_AAUR00000000 | Unfinished (NCBI) | 3,758 | |
| 18253 | | 257 | NZ_AAUT00000000 | Unfinished (NCBI) | 3,771 | |
| 17723 | | 154 | NZ_AATY00000000 | Unfinished (Easygene) | 3,407 | |
| 33561 | | 12 | NZ_ACFQ00000000 | Unfinished (NCBI) | 3,574 | |
| 33549 | | 5 | NZ_ACHV00000000 | Unfinished (NCBI) | 3,461 | |
| 33579 | | 35 | NZ_ACHW00000000 | Unfinished (NCBI) | 3,621 | |
| 33551 | | 20 | NZ_ACHY00000000 | Unfinished (NCBI) | 3,600 | |
| 13564 | | 143 | NZ_ABGR00000000 | Unfinished (NCBI) | 3,935 | |
| 19857 | | 3 | CP000789.1 | Fully sequenced | 6,064 | |
| 349 | | 2 | AE016795.2 | Fully sequenced | 4,538 | |
| 1430 | | 3 | BA000037.2 | Fully sequenced | 5,028 | |
| 19397 | | 158 | NZ_ABCH00000000 | Unfinished (NCBI) | 5,360 | |
| 15693 | | 222 | NZ_AAKK00000000 | Unfinished (Easygene) | 4,004 | |
| 13616 | | 99 | NZ_AAND00000000 | Unfinished (NCBI) | 4,590 | |
| 32815 | | 2 | FM954973.1 | Fully sequenced | 4,434 | |
| 19395 | | 78 | NZ_ACCV00000000 | Unfinished (Easygene) | 3,780 | |
| 360 | | 2 | BA000031.2 | Fully sequenced | 4,832 | |
| 12986 | | 3 | CP000020.1 | Fully sequenced | 3,823 | |
| 19393 | | 3 | CP001133.1 | Fully sequenced | 4,039 | |
| 30703 | | 6 | FM178379.1 | Fully sequenced | 4,284 | |
| 13128 | | 3 | CR354531.1 | Fully sequenced | 5,480 | |
GPID: genome project identifier at NCBI. Contigs: the number of contiguous sequences, which for a completely sequenced genome is at least two (for two chromosomes) and can be up to six when plasmids are present. Unfinished sequences are represented by multiple contigs per chromosome.
a: Strains containing the genes encoding the cholera enterotoxin subunits are indicated.
Figure 1. Phylogenetic tree of the 16S rRNA gene extracted from 32 sequenced Vibrio genomes listed in Table 1. Environmental V. cholerae lacking the cholera enterotoxin genes are highlighted in bright green, whilst pathogenic V. cholerae genomes are in dark green. Further colouring was used for species for which two genomes are represented.
Figure 2. Pan-genome family clustering of the 32 Vibrio genome sequences. The two plots represent weighted values for genes present in at least 90% of the genomes (stabilome) or genes found in only a few (two to four) genomes (mobilome). The colours highlighting the species are the same as in Fig. 1.
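The stabilome and mobilome thresholds given in the caption of Figure 2 can be illustrated with a minimal Python sketch. It assumes each genome has already been reduced to a set of gene family identifiers; the function name `classify_families` and its parameters are hypothetical, and the weighting used for the published plots is not described here, so it is not reproduced.

```python
from collections import Counter


def classify_families(genomes, stable_fraction=0.9, mobile_range=(2, 4)):
    """Split gene families into stabilome and mobilome sets.

    genomes: dict mapping a genome name to an iterable of gene family IDs
             (hypothetical input format, not taken from the paper).
    stable_fraction: minimum fraction of genomes carrying a family for it
                     to count as stabilome (90% in the caption).
    mobile_range: inclusive (low, high) number of genomes carrying a family
                  for it to count as mobilome (two to four in the caption).
    """
    n_genomes = len(genomes)
    # Count in how many genomes each family occurs (once per genome).
    counts = Counter(fam for fams in genomes.values() for fam in set(fams))

    stabilome = {f for f, c in counts.items() if c / n_genomes >= stable_fraction}
    low, high = mobile_range
    mobilome = {f for f, c in counts.items() if low <= c <= high}
    return stabilome, mobilome
```

With very few genomes the two sets can overlap (a family present in all of four genomes satisfies both thresholds); with 32 genomes the 90% cut-off corresponds to 29 or more genomes, so the sets are disjoint.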
Figure 3. Pan- and core genome plot of the 32 Vibrionaceae genomes. The colours highlighting species are the same as in Fig. 1.
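A pan- and core genome plot of this kind can be sketched as a running union and intersection of gene family sets. The following is only an illustration, assuming genomes are supplied in a fixed order as sets of gene family identifiers; `pan_core_curve` is a hypothetical helper, not the authors' pipeline.

```python
def pan_core_curve(genomes):
    """Return (name, pan_size, core_size) after adding each genome in order.

    genomes: ordered list of (name, set_of_gene_family_ids) pairs
             (hypothetical input format).
    The pan-genome is the union of all families seen so far; the core
    genome is the intersection of the families of all genomes seen so far.
    """
    pan = set()
    core = None
    curve = []
    for name, families in genomes:
        families = set(families)
        pan |= families
        core = families if core is None else core & families
        curve.append((name, len(pan), len(core)))
    return curve
```

The pan-genome size can only grow and the core genome size can only shrink as genomes are added, so the result depends on the order in which genomes are supplied.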
Figure 4. BLAST matrix of the 32 Vibrionaceae genomes. The colours highlighting the species are the same as in Fig. 1. Since the reciprocal similarity (reported as percent) is not readable at this resolution, every matrix cell is coloured using the scales as indicated. The bottom row identifies hits (other than hits-to-self) found within a genome. Four matrix cells reporting high pairwise similarities are outlined; their numbers are specified in the text.
Figure 5. BLAST atlas with V. cholerae strain N16961 as a reference strain, showing chromosomes 1 (top) and 2 (bottom). The best BLAST hits identified with genes from N16961 in the other V. cholerae genomes are represented in dark red, for the location as it appears in N16961. BLAST hits in the other genomes are shown in various colours as indicated to the right. Major areas conserved in V. cholerae but not in other Vibrionaceae are identified as gap B, gap C, gap D and gap F in green; areas that are found in toxigenic V. cholerae only are marked black as gap A, gap E and gap G. The superintegron on chromosome 2 of V. cholerae is also indicated.
Table 2. A selection of genes located in the gaps marked in Fig. 5
| Location in N16961 (bp) | Gene annotation |
|---|---|
| Gap A (850000–913000) | |
| 852903–851557 | Citrate/sodium symporter |
| 853165–854235 | Citrate (pro-3S)-lyase ligase |
| 854287–854583 | Citrate lyase subunit gamma |
| 854565–855455 | Citrate lyase, beta subunit |
| 855391–856995 | Citrate lyase, alpha subunit |
| 856992–857528 | citX protein |
| 857506–858447 | citG protein |
| 869812–866873 | Helicase-related protein |
| 870391–869813 | Tellurite resistance protein-related |
| 871298–870819 | Transcriptional regulator, putative |
| 873242–874225 | Transposase, putative |
| 876974–880015 | ToxR-activated gene A protein |
| 881390–884728 | Inner membrane protein, putative |
| 885773–886267 | tagD protein |
| 888405–886543 | Toxin co-regulated pilus biosynthesis |
| 888846–889511 | Toxin co-regulated pilus biosynthesis |
| 889496–889906 | Toxin co-regulated pilus biosynthesis |
| 890449–891123 | Toxin co-regulated pilin |
| 891203–892495 | Toxin co-regulated pilus biosynthesis |
| 892495–892947 | Toxin co-regulated pilus biosynthesis |
| 892950–894419 | Toxin co-regulated pilus biosynthesis |
| 894412–894867 | Toxin co-regulated pilus biosynthesis |
| 894855–895691 | Toxin co-regulated pilus biosynthesis |
| 895707–896165 | Toxin co-regulated pilus biosynthesis |
| 896155–897666 | Toxin co-regulated pilus biosynthesis |
| 897641–898663 | Toxin co-regulated pilus biosynthesis |
| 898673–899689 | Toxin co-regulated pilus biosynthesis |
| 899896–900726 | TCP pilus virulence regulatory protein |
| 900726–901487 | Leader peptidase TcpJ |
| 901494–903374 | Accessory colonization factor AcfB |
| 903380–904150 | Accessory colonization factor AcfC |
| 904648–905556 | tagE protein |
| 906206–905559 | Accessory colonization factor AcfA |
| 914124–912856 | Phage family integrase |
| Gap B (975000–1010000) | |
| 978644–979144 | Phosphotyrosine protein phosphatase |
| 981833–982387 | Serine acetyltransferase-related protein |
| 982384–983532 | Exopolysacch. biosynth protein EpsF |
| 983529–984938 | Polysacch. export protein, putative (gfcE) |
| 986166–986597 | Serine acetyltransferase-related protein |
| 986597–987937 | capK protein, putative |
| 987913–989010 | Polysaccharide biosynthesis protein, putative |
| 1001910–1002437 | Polysaccharide export-related protein (gfcE) |
| 1002462–1004675 | Putative exopolysacch. biosynth protein |
| Gap C (1130000–1160000) | |
| 1139646–1142912 | Chitinase, putative |
| 1147856–1148998 | Response regulator |
| 1149033–1149398 | Response regulator |
| 1149990–1151309 | Sensory box sensor histidine kinase |
| 1151321–1152625 | Sensor histidine kinase |
| 1152625–1154235 | Response regulator |
| 1154252–1155595 | Response regulator |
| 1157228–1155624 | Sensor histidine kinase |
| 1158044–1157232 | Periplasmic binding protein-related |
| Gap D (1478000–1520000) | |
| 2086826–2087584 | CDP-diacylglycerol-glyc.-3-phosph-3-phosphatidyltransferase |
| 2087587–2088519 | Phosphatidate cytidylyltransferase |
| 2094741–2095604 | PvcB protein |
| 2098112–2097183 | LysR family transcriptional regulator |
| 2098432–2100258 | pvcA protein |
| 2117923–2119977 | Methyl-accepting chemotaxis protein |
| 2120575–2120030 | Transcriptional regulator |
| 2120663–2121826 | Benzoate transport protein |
| Gap E (1537000–1587500) | |
| 1541452–1543170 | Sensor histidine kinase/response regulator |
| 1545396–1543231 | Toxin secretion transporter, putative |
| 1546802–1545399 | RTX toxin transporter |
| 1548919–1546757 | RTX toxin transporter |
| 1549662–1550123 | RTX toxin activating protein |
| 1550108–1563784 | RTX toxin RtxA |
| 1564376–1564152 | RstC protein |
| 1564844–1564470 | RstB1 protein |
| 1565901–1564822 | RstA1 protein |
| 1566027–1566365 | Transcriptional repressor RstR |
| 1567341–1566967 | Cholera enterotoxin, B subunit |
| 1568114–1567338 | Cholera enterotoxin, A subunit |
| 1569412–1568213 | Zona occludens toxin |
| 1569702–1569409 | Accessory cholera enterotoxin |
| 1571241–1570993 | Colonization factor |
| 1571760–1571377 | RstB2 protein |
| 1572817–1571738 | RstA1 protein |
| 1572943–1573281 | Transcriptional repressor RstR |
| 1577272–1575704 | Phage replication protein Cri |
| 1582123–1580555 | Phage replication protein Cri |
| 1583160–1583513 | Transposase OrfAB, subunit A |
| 1583510–1584382 | Transposase OrfAB, subunit B |
| Gap F (1896000–1956000) | |
| 1896092–1897327 | Phage family integrase |
| 1900831–1898009 | Helicase, putative |
| 1903632–1902898 | Chemotaxis protein MotB-related |
| 1908858–1905790 | Type I restriction enzyme HsdR |
| 1916009–1913628 | DNA methylase HsdM, putative |
| 1933231–1935654 | Neuraminidase |
| 1936007–1935801 | Transcriptional regulator |
| 1936121–1936597 | DNA repair protein RadC, putative |
| 1938391–1937519 | Transposase OrfAB, subunit B |
| 1938732–1938388 | Transposase OrfAB, subunit A |
| 1941671–1941351 | Transcriptional regulator, putative |
| 1942032–1941658 | Middle operon regulator-related |
| 1944457–1943306 | eha protein |
| Gap G (chromosome II, 213000–223000) | |
| 213207–214250 | GMP reductase |
| 214574–215725 | DNA methyltransferase |
| 220262–219825 | IS1004 transposase |
All gene annotations are taken from the reference genome V. cholerae strain N16961. Hypothetical proteins were excluded. Gaps A, E and G are conserved in pathogenic strains, whereas gaps B, C, D and F are conserved in all V. cholerae genomes analysed (Figure 1).
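The coordinate ranges in the table can be handled programmatically. The sketch below parses a range such as `852903–851557` and checks whether a gene lies entirely within a gap. It rests on two assumptions that the table does not state: that coordinates are 1-based and inclusive, and that a start larger than the end marks a gene on the reverse strand. The helper names `parse_range` and `in_gap` are hypothetical.

```python
import re


def parse_range(text):
    """Parse 'start–end' (en dash or hyphen) into a normalised record.

    Assumption: coordinates are 1-based and inclusive, and a first number
    larger than the second indicates the reverse strand.
    """
    match = re.fullmatch(r"\s*(\d+)\s*[–-]\s*(\d+)\s*", text)
    if match is None:
        raise ValueError(f"not a coordinate range: {text!r}")
    first, second = int(match.group(1)), int(match.group(2))
    return {
        "start": min(first, second),
        "end": max(first, second),
        "length": abs(second - first) + 1,
        "reversed": first > second,
    }


def in_gap(gene, gap):
    """True if the gene lies entirely inside the gap interval."""
    return gap["start"] <= gene["start"] and gene["end"] <= gap["end"]


gap_a = parse_range("850000–913000")
symporter = parse_range("852903–851557")   # citrate/sodium symporter row
integrase = parse_range("914124–912856")   # phage family integrase row
```

Under these assumptions, `in_gap(symporter, gap_a)` is true, whereas the phage family integrase listed last under gap A extends beyond the stated gap boundary, so `in_gap(integrase, gap_a)` is false.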