Peter E Chen, Christopher Cook, Andrew C Stewart, Niranjan Nagarajan, Dan D Sommer, Mihai Pop, Brendan Thomason, Maureen P Kiley Thomason, Shannon Lentz, Nichole Nolan, Shanmuga Sozhamannan, Alexander Sulakvelidze, Alfred Mateczun, Lei Du, Michael E Zwick, Timothy D Read.
Abstract
BACKGROUND: New DNA sequencing technologies have enabled detailed comparative genomic analyses of entire genera of bacterial pathogens. Prior to this study, three species of the enterobacterial genus Yersinia that cause invasive human diseases (Yersinia pestis, Yersinia pseudotuberculosis, and Yersinia enterocolitica) had been sequenced. However, there were no genomic data on the Yersinia species with more limited virulence potential, which are frequently found in soil and water environments.
Year: 2010 PMID: 20047673 PMCID: PMC2847712 DOI: 10.1186/gb-2010-11-1-r1
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Strains sequenced in this study
| Species | ATCC number | Other designations | Year isolated | Location isolated | Description | Optimum growth temperature | Reference |
|---|---|---|---|---|---|---|---|
| | 35236T | CNY 6065 | NR | Czechoslovakia | Drinking water | 26°C | [ |
| | 43970T | CDC 2475-87 | NR | France | Human stool | 26°C | [ |
| | 33641T | CDC 1461-81, CIP 80-29 | NR | Denmark | Sewage | 26°C | [ |
| | 29909T | CIP 80-28 | NR | NR | Human urine | 37°C | [ |
| | 33638T | CIP 80-30 | NR | NR | Human urine | 26°C | [ |
| | 43969T | CDC 2465-87 | NR | USA | Soil | 26°C | [ |
| | 43380T | H271-36/78, CDC 3022-85 | 1978 | Germany | Dog feces | 26°C | [ |
| | 29473T | 2396-61 | 1961 | Idaho, USA | Rainbow trout ( | 26°C | [ |
NR, not reported in reference publication.
Genomes summary
| Species | Type strain | NCBI project ID | GenBank accession number | Total reads | Number of contigs >500 nt | Total length of large contigs | % large contigs <Q40 | Number of contigs aligned to chromosomal scaffold |
|---|---|---|---|---|---|---|---|---|
| | ATCC_43380 | 29767 | [Genbank: | 991,106 | 83 | 4,303,720 | 0.11 | 60 |
| | ATCC_29473 | 29769 | [Genbank: | 1,347,304 | 103 | 3,716,658 | 0.004 | 68 |
| | ATCC_35236 | 29741 | [Genbank: | 1,125,002 | 104 | 4,277,123 | 0.006 | 60 |
| | ATCC_33638 | 29761 | [Genbank: | 1,374,452 | 86 | 4,637,246 | 0.003 | 63 |
| | ATCC_29909 | 29755 | [Genbank: | 1,768,909 | 74 | 4,684,150 | 0.003 | 68 |
| | ATCC_33641 | 29743 | [Genbank: | 1,504,985 | 90 | 4,864,031 | 0.005 | 56 |
| | ATCC_43969 | 16105 | [Genbank: | 1,825,876 | 110 | 4,535,932 | 0.003 | 80 |
| | ATCC_43970 | 16104 | [Genbank: | 1,263,275 | 144 | 4,316,521 | 0.006 | 91 |
Distribution of common repeat sequences
| ERIC | YPAL | IS1541C | Aldovae3 | ||
|---|---|---|---|---|---|
| 0 | 3 | 5 | 0 | 5 | |
| 54 | 43 | 33 | 61 | 38 | |
| 55 | 52 | 29 | 5 | 36 | |
| 63 | 144 | 100 | 3 | 75 | |
| 6 | 84 | 46 | 0 | 40 | |
| 9 | 45 | 6 | 9 | 13 | |
| 0 | 57 | 6 | 0 | 5 | |
| 2 | 91 | 48 | 0 | 43 | |
| 2 | 99 | 70 | 0 | 59 | |
| 6 | 62 | 26 | 0 | 20 | |
| 0 | 37 | 8 | 0 | 7 | |
| 45 | 2 | 0 | 0 | 2 |
Three of the repeat sequences found using de novo searches matched the known repeat elements ERIC, YPAL, and IS1541C and are identified as such. Kristensenii39 and Aldovae3 are elements found from de novo searches in the Y. kristensenii and Y. aldovae genomes, respectively.
Figure 1. The phylogeny of the Yersinia genus was constructed from a dataset of 681 concatenated, conserved protein sequences using the Neighbor-Joining (NJ) algorithm implemented by PHYLIP [51]. The tree was rooted using E. coli. The scale measures the number of substitutions per residue. Tree topologies computed using maximum likelihood and parsimony estimates are identical to each other and to the NJ tree (Additional file 20). The only branches not supported in more than 99% of the 1,000 bootstrap replicates using both methods are marked with asterisks. Both of these branches were supported by >57% of replicates.
Yersinia core size reduction by exclusion of one species
| Species excluded | Core protein families |
|---|---|
| None | 2,072 |
| | 2,074 |
| | 2,085 |
| | 2,079 |
| | 2,077 |
| | 2,080 |
| | 2,076 |
| | 2,078 |
| | 2,091 |
| | 2,232 |
| | 2,076 |
| | 2,094 |
Core protein families with two or more members were recalculated in each case (see Materials and methods) with the protein set from one genome omitted.
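The leave-one-out recalculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes protein families have already been clustered and are given as a mapping from a family name to the set of genomes that contribute at least one member. The function names and the toy genome labels (`A`, `B`, `C`) are hypothetical.

```python
from typing import Dict, Set


def core_families(families: Dict[str, Set[str]], genomes: Set[str]) -> Set[str]:
    """Return the families that have at least one member in every genome."""
    return {name for name, present in families.items() if genomes <= present}


def leave_one_out_core_sizes(
    families: Dict[str, Set[str]], genomes: Set[str]
) -> Dict[str, int]:
    """Core size with no genome excluded, then with each genome excluded in turn."""
    sizes = {"None": len(core_families(families, genomes))}
    for excluded in sorted(genomes):
        remaining = genomes - {excluded}
        sizes[excluded] = len(core_families(families, remaining))
    return sizes


# Toy input (hypothetical): four families across three genomes.
toy_genomes = {"A", "B", "C"}
toy_families = {
    "fam1": {"A", "B", "C"},  # present everywhere, so always core
    "fam2": {"A", "B"},       # becomes core only when C is excluded
    "fam3": {"B", "C"},       # becomes core only when A is excluded
    "fam4": {"A"},            # never core
}

toy_sizes = leave_one_out_core_sizes(toy_families, toy_genomes)
print(toy_sizes)
```

Excluding a genome can only keep or enlarge the core, which is why every row of the table is at least as large as the "None" row; a large jump for one genome suggests that genome lacks many families shared by all the others.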
Figure 2. Comparison of major COG groups. Bars represent the number of proteins assigned to COG superfamilies [52] for each genome, based on matches to the Conserved Domain Database [95] with an E-value threshold <10^-10. The COG groups are: U, intracellular trafficking; G, carbohydrate transport and metabolism; R, general function prediction; I, lipid transport and metabolism; D, cell cycle control; H, coenzyme transport and metabolism; B, chromatin structure; P, inorganic ion transport and metabolism; W, extracellular structures; O, post-translational modification; J, translation; A, RNA processing and editing; L, replication, recombination and repair; C, energy production; M, cell wall/membrane biogenesis; Q, secondary metabolite biosynthesis; Z, cytoskeleton; V, defense mechanisms; E, amino acid transport and metabolism; K, transcription; N, cell motility; T, signal transduction; F, nucleotide transport; S, function unknown.
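The per-genome counts behind such a bar chart could be tallied roughly as shown below. This is a sketch under stated assumptions: the hit list is taken to be already reduced to one best domain match per protein, each hit is a hypothetical `(protein_id, cog_letter, evalue)` tuple, and the only value taken from the caption is the E-value cutoff of 10^-10.

```python
from collections import Counter
from typing import Iterable, Tuple

Hit = Tuple[str, str, float]  # (protein_id, COG category letter, E-value)


def count_cog_categories(hits: Iterable[Hit], max_evalue: float = 1e-10) -> Counter:
    """Count proteins per COG category, keeping hits with E-value below the cutoff."""
    counts: Counter = Counter()
    for _protein_id, cog_letter, evalue in hits:
        if evalue < max_evalue:
            counts[cog_letter] += 1
    return counts


# Toy hits (hypothetical identifiers and values).
toy_hits = [
    ("p1", "J", 1e-30),  # kept
    ("p2", "J", 1e-5),   # rejected: E-value above the cutoff
    ("p3", "K", 1e-12),  # kept
    ("p4", "J", 1e-11),  # kept
]

toy_counts = count_cog_categories(toy_hits)
print(dict(toy_counts))
```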
Figure 3. Distribution of protein clusters. (a) The Venn diagram shows the number of protein clusters unique or shared between the two other high virulence Yersinia species (see Materials and methods). (b) The number of shared and unique clusters that do not contain a single member of the eight low human virulence genomes sequenced in this study.
Figure 4. Protein-based comparison. The map represents the blast score ratio (BSR) [98,99] to the protein encoded by Y. enterocolitica [15]. Blue indicates a BSR >0.70 (strong match); cyan 0.69 to 0.4 (intermediate); green <0.4 (weak). Red and pink outer circles are locations of the Y. enterocolitica genes on the + and - strands. The genomes are ordered from outside to inside based on the greatest overall similarity to Y. enterocolitica: Y. kristensenii, Y. frederiksenii, Y. mollaretii, Y. intermedia, Y. bercovieri, Y. aldovae, Y. rohdei, Y. ruckeri, Y. pseudotuberculosis, and Y. pestis. The black bars on the outside refer to genome islands in Y. enterocolitica identified by Thomson et al. [15].
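The colour binning used in this map can be illustrated with a short sketch. The caption does not define BSR, so the code assumes the commonly used definition: the bit score of the best match to a reference protein in the compared genome, divided by the bit score of that reference protein aligned to itself. The function names are hypothetical. The caption leaves values between 0.69 and 0.70 unassigned; the sketch treats everything from 0.4 up to and including 0.70 as intermediate, which is also an assumption.

```python
def blast_score_ratio(match_bits: float, self_bits: float) -> float:
    """BSR = bit score against the compared genome / self-match bit score."""
    if self_bits <= 0:
        raise ValueError("self-match bit score must be positive")
    return match_bits / self_bits


def classify_bsr(bsr: float) -> str:
    """Bin a BSR value using the thresholds given in the figure caption."""
    if bsr > 0.70:
        return "strong"        # drawn in blue
    if bsr >= 0.4:
        return "intermediate"  # drawn in cyan
    return "weak"              # drawn in green


# Toy example (hypothetical scores): a reference protein scoring 500 bits
# against itself and 450 bits against its best match in another genome.
example_bsr = blast_score_ratio(450.0, 500.0)
print(example_bsr, classify_bsr(example_bsr))
```

Normalising by the self-match score keeps the ratio comparable across proteins of different lengths, since long proteins produce large raw bit scores even for modest similarity.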
Key niche-specific genes in Yersinia
| | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| + | + | + | - | + | + | + | + | + | |
| + | + | - | - | + | + | + | + | + | |
| + | + | + | + | + | + | + | + | ||
| + | + | + | - | + | + | + | + | + | |
| + | + | + | + | + | + | + | + | ||
| + | + | + | - | + | + | + | + | + | |
| + | + | - | + | + | + | + | + | ||
| + | - | + | - | + | + | + | + | + | |
| - | - | - | - | +/- hyfABCGHINfdhF | +/- (hyaD, hypEDB) | - | - | + | |
| - | - | - | - | - | - | + | + | + | |
| - | - | - | - | - | - | +/- | - | - |
Abbreviations: cbi, cobalamin (vitamin B12) biosynthesis; pdu, 1,2-propanediol utilization; ttr, tetrathionate respiration; eut, ethanolamine degradation; hyd-2 and hyd-4, hydrogenases 2 and 4, respectively; ure, urease; mtn, methionine salvage pathway; opg, osmoprotectant (synthesis of periplasmic branched glucans).