Zhemin Zhou, Xiaomin Li, Bin Liu, Lothar Beutin, Jianguo Xu, Yan Ren, Lu Feng, Ruiting Lan, Peter R Reeves, Lei Wang.
Abstract
There are 29 E. coli genome sequences available, mostly related to studies of species diversity or mode of pathogenicity, including two genomes of the well-known O157:H7 clone. However, there have been no genome studies of closely related clones aimed at exposing the details of evolutionary change. Here we sequenced the genome of an O55:H7 strain, closely related to the major pathogenic O157:H7 clone, for which genome sequences have been published, and undertook comparative genomic and proteomic analysis. We were able to allocate most differences between the genomes to individual mutations, recombination events, or lateral gene transfer events, in specific lineages. Major differences include a type II secretion system present only in the O55:H7 chromosome, fewer type III secretion system effectors in O55:H7, and 19 phage genomes or phage-like elements in O55:H7 compared to 23 in O157:H7, with only three common to both. Many other changes were found in both O55:H7 and O157:H7 lineages, but in general there has been more change in the O157:H7 lineages. For example, we found 50% more synonymous mutational substitutions in O157:H7 compared to O55:H7. The two strains also diverged at the proteomic level. Mutational synonymous SNPs were used to estimate a divergence time of 400 years using a new clock rate, in contrast to 14,000 to 70,000 years using the traditional clock rates. The same approaches were applied to three closely related extraintestinal pathogenic E. coli genomes, and similar levels of mutation and recombination were found. This study revealed for the first time the full range of events involved in the evolution of the O157:H7 clone from its O55:H7 ancestor, and suggested that O157:H7 arose quite recently. Our findings also suggest that E. coli has a much lower frequency of recombination relative to mutation than was observed in a comparable study of a Vibrio cholerae lineage.
Year: 2010 PMID: 20090843 PMCID: PMC2806823 DOI: 10.1371/journal.pone.0008700
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
General features of the chromosome and plasmid of E. coli CB9615.
| | Chromosome | pO55 | Total |
| length (bp) | 5,386,352 | 66,001 | 5,452,353 |
| G+C ratio (%) | 50.52 | 48.87 | 50.5 |
| open reading frame (ORF) | 5,028 | 109 | 5,137 |
| protein coding region (% of genome size) | 87 | 84.4 | 87 |
| average ORF length (bp) | 923.2 | 511 | 914.5 |
| pseudogene | 69 | - | 69 |
| rRNA (16S-23S-5S) | 7 | 0 | 7 |
| tRNA and tmRNA | 101 | 0 | 101 |
| ncRNA | 52 | 0 | 52 |
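The Total column can be cross-checked against the chromosome and plasmid columns. The following is a minimal sketch, not the authors' procedure: it assumes that lengths and ORF counts are simple sums, that the total G+C ratio is a length-weighted mean, and that the total average ORF length is an ORF-count-weighted mean. The dictionary keys and the `combine` helper are illustrative names.

```python
# Values copied from the table above (chromosome and plasmid pO55 of CB9615).
chromosome = {"length_bp": 5_386_352, "gc_percent": 50.52, "orfs": 5028, "avg_orf_bp": 923.2}
plasmid = {"length_bp": 66_001, "gc_percent": 48.87, "orfs": 109, "avg_orf_bp": 511.0}


def combine(a, b):
    """Combine two replicons: sum the counts, weight the averages.

    Assumption: G+C is weighted by replicon length and average ORF
    length is weighted by ORF count.
    """
    length = a["length_bp"] + b["length_bp"]
    orfs = a["orfs"] + b["orfs"]
    gc = (a["gc_percent"] * a["length_bp"] + b["gc_percent"] * b["length_bp"]) / length
    avg_orf = (a["avg_orf_bp"] * a["orfs"] + b["avg_orf_bp"] * b["orfs"]) / orfs
    return {"length_bp": length, "orfs": orfs, "gc_percent": gc, "avg_orf_bp": avg_orf}


total = combine(chromosome, plasmid)
print(total)
```

Under these assumptions the recomputed values agree with the Total column (5,452,353 bp, 5,137 ORFs, and G+C and average ORF length matching to one decimal place).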
Figure 1. Alignment of the genomes of CB9615, Sakai and EDL933.
Scales are in Mbp. The grey and yellow shaded regions represent segments present in all strains, with inverted segments shaded in yellow. Purple and cyan boxes within the grey or yellow regions represent the indels of phages or phage-like elements (insertions and deletions respectively) defined in Table S9. Red and green boxes represent other major indels (insertions and deletions respectively) that involve changes in the gene numbers in Table S10, with the indel numbers also shown as defined in Table S9. The orange boxes indicate the E. coli O157:H7 O-antigen segment gained by recombination. Figure S1 is a greatly expanded version of this figure showing individual genes.
Figure 2. Tree showing the relationships of CB9615 and 2 O157:H7 strains.
The tree topology is taken from the alignment of 26 completed genomes (Figure 3) and Whittam [7]. For each lineage, the numbers of mutations (including small indels), recombination events, and insertion or deletion events (large indels) are shown in a grid, as specified in the key. Mutations are shown as intergenic, other non-coding, non-synonymous or synonymous SNPs (igSNPs, ncSNPs, nsSNPs, sSNPs), and as small insertions and small deletions, or as indels where the two cannot be distinguished. Large indels are separated into insertions or deletions where possible. Events allocated to the divergence between CB9615 and O157:H7, or between Sakai and EDL933, but not to either lineage, are shown in the grids between the two lineages. The branch point estimates for group B [12] (including strain 493-89), G5101 and F6141, and clusters 1 (strains 14359 and 87-14) and 2 (86-24) are marked with dotted lines on the O157:H7 lineage, and that for TB182A on the O55:H7 lineage. The distribution of SNPs along that lineage is based on reanalysis of data from Zhang et al. [17] and Leopold et al. [12].
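The distinction between synonymous and non-synonymous SNPs (sSNPs and nsSNPs) used in the caption above can be illustrated with a short sketch. It is not the authors' pipeline: it assumes a single-base change within one codon and uses the standard codon assignments, and the function name `classify_snp` is hypothetical.

```python
from itertools import product

# Standard codon assignments in the conventional TCAG order
# (first base varies slowest, third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, pos, alt):
    """Classify a single-base substitution in a codon.

    codon: reference codon (three letters from ACGT)
    pos:   0-based position of the substituted base within the codon
    alt:   the substituted base
    Returns 'sSNP' if the encoded amino acid is unchanged, else 'nsSNP'.
    """
    codon, alt = codon.upper(), alt.upper()
    if alt == codon[pos]:
        raise ValueError("alternative base equals the reference base")
    mutated = codon[:pos] + alt + codon[pos + 1:]
    return "sSNP" if CODON_TABLE[codon] == CODON_TABLE[mutated] else "nsSNP"


print(classify_snp("CTT", 2, "C"))  # Leu -> Leu
print(classify_snp("GAA", 0, "A"))  # Glu -> Lys
```

Intergenic and non-coding SNPs (igSNPs, ncSNPs) fall outside protein-coding codons, so they would need genome annotation rather than a codon table to classify.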
Summary of mutational and recombination changes in the 3 O55:H7/O157:H7 genomes.
| Lineage | Mutational SNPs | | | | | | | | | Recombination events | | | | Recombination-related SNPs | | | | | | | |
| | NS | S | I | NC | ins | del | indel | total | No. genes | No. events | % genome involved | Average and range of divergence (%) | No. genes | NS | S | I | NC | ins | del | indel | total |
| CB9615 | 545 | 392 | 204 | 66 | 31 | 35 | - | 1273 | 838 (486) | 7 | 0.02% | 6.94 (3.08–8.67) | 10 | 61 | 44 | 4 | 14 | 0 | 1 | - | 124 |
| O157 | 797 | 556 | 276 | 61 | 34 | 56 | - | 1780 | 1145 (690) | Total: 5 | 2.01% | 3.24 (3.22–27.91) | 92 | 744 | 1839 | 259 | 557 | 18 | 12 | - | 3429 |
| | | | | | | | | | | Other: 4 | 0.01% | 8.60 (3.23–27.91) | 4 | 20 | 17 | 0 | 0 | 0 | 0 | - | 37 |
| | | | | | | | | | | Rseg: 1 | 2.00% | 3.22 | 88 | 724 | 1822 | 259 | 557 | 18 | 12 | - | 3392 |
| Sakai | 64 | 22 | 19 | 12 | 16 | 15 | - | 148 | 109 (61) | 7 | 0.70% | 0.47 (0.12–8.90) | 11 | 19 | 10 | 9 | 26 | 0 | 1 | - | 65 |
| EDL933 | 99 | 39 | 36 | 19 | 198 | 22 | - | 413 | 181 (82) | 20 | 1.56% | 1.10 (0.12–78.63) | 56 | 196 | 34 | 148 | 47 | 26 | 3 | - | 454 |
| O55/O157 | 62 | 26 | 24 | 5 | - | - | 23 | 140 | 66 (44) | 21 | 0.48% | 2.91 (0.53–24.39) | 38 | 165 | 253 | 61 | 134 | - | - | 13 | 626 |
| Sakai/EDL933 | 2 | 0 | 0 | 0 | - | - | 7 | 9 | 8 (2) | 7 | 1.25% | 2.70 (0.69–9.98) | 28 | 218 | 54 | 9 | 64 | - | - | 5 | 350 |
| Total | 1569 | 1035 | 559 | 163 | 279 | 128 | 30 | 3763 | 1897 (1193) | 67 | - | 2.56 (0.12–78.45) | 223 | 1403 | 2234 | 490 | 842 | 44 | 17 | 18 | 5048 |
The CB9615, EDL933 and Sakai lineages are the strain-specific lineages as shown in Figure 2. The O157 lineage is the segment of the O157:H7 lineage prior to divergence of EDL933 and Sakai. Events shown in the O55/O157 and Sakai/EDL933 rows are those allocated to the O55/O157 and Sakai/EDL933 divergence respectively, but not to a specific lineage.
Excludes SNPs in regions thought to have entered by recombination.
NS, non-synonymous; S, synonymous; I, intergenic; NC, in non-coding genes; ins, insertion; del, deletion; indel, insertion or deletion (not distinguishable).
The number in brackets is number of genes carrying at least 1 non-synonymous SNP.
Covers only recombinant regions longer than 20 bp.
Includes 3388 SNPs in the large recombinant event involving the O-antigen gene cluster.
Excludes the 3388 SNPs in the large recombinant event involving the O-antigen gene cluster.
The 3388 SNPs in the large recombinant event involving the O-antigen gene cluster.
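The synonymous mutational SNP counts in the table can be related to the abstract's statement that O157:H7 carries about 50% more synonymous substitutions than O55:H7. The sketch below is one plausible reading, not necessarily the authors' calculation: it assumes the O157:H7 count for a strain is the shared O157 lineage plus that strain's own lineage, compared with the CB9615 lineage. The function name `synonymous_excess` is illustrative.

```python
# Mutational SNP counts (NS = non-synonymous, S = synonymous) per lineage,
# copied from the table above.
mutational = {
    "CB9615": {"NS": 545, "S": 392},
    "O157": {"NS": 797, "S": 556},
    "Sakai": {"NS": 64, "S": 22},
    "EDL933": {"NS": 99, "S": 39},
}


def synonymous_excess(strain):
    """Percent excess of synonymous SNPs on the path to an O157:H7 strain
    (shared O157 lineage + strain-specific lineage) relative to CB9615.

    Assumption: the two lineage segments are simply added together.
    """
    o157_path = mutational["O157"]["S"] + mutational[strain]["S"]
    return 100.0 * (o157_path / mutational["CB9615"]["S"] - 1.0)


for strain in ("Sakai", "EDL933"):
    print(f"{strain}: {synonymous_excess(strain):.1f}% more synonymous SNPs than CB9615")
```

Under this assumption the excess is roughly 47% for Sakai and 52% for EDL933, in line with the approximate 50% figure quoted in the abstract.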
Figure 3. Maximum likelihood phylogenetic tree of 26 Escherichia coli and Shigella strains.
The phylogenetic tree of the Escherichia core genome genes was constructed from the concatenated alignments of the 2034 genes in the core genome of the E. coli/Shigella genomes. The closely related species, E. fergusonii (CU928158), was chosen to root the tree.
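Concatenating per-gene alignments of core genes, as described in the caption above, can be sketched as follows. This is a minimal illustration with toy data, not the authors' workflow: the gene names, strain labels and sequences are hypothetical, and a gene is treated as "core" simply when every listed strain has an aligned sequence for it.

```python
def concatenate_core(alignments, strains):
    """Concatenate per-gene alignments for genes present in all strains.

    alignments: dict mapping gene name -> {strain: aligned sequence}
    strains:    list of strain names that define the core set
    Returns (core_genes, supermatrix), where supermatrix maps each strain
    to its concatenated alignment, with genes joined in sorted name order.
    """
    core_genes = sorted(
        gene for gene, aln in alignments.items() if all(s in aln for s in strains)
    )
    parts = {s: [] for s in strains}
    for gene in core_genes:
        aln = alignments[gene]
        # Every aligned sequence of one gene must have the same length.
        if len({len(aln[s]) for s in strains}) != 1:
            raise ValueError(f"alignment of {gene} has unequal sequence lengths")
        for s in strains:
            parts[s].append(aln[s])
    return core_genes, {s: "".join(seqs) for s, seqs in parts.items()}


# Toy example with hypothetical genes and strains.
toy = {
    "geneA": {"s1": "ATGAAA", "s2": "ATGAAG", "s3": "ATGAAA"},
    "geneB": {"s1": "ATG-CC", "s2": "ATGTCC"},  # missing in s3, so not core
    "geneC": {"s1": "TTGA", "s2": "TTGA", "s3": "CTGA"},
}
core, matrix = concatenate_core(toy, ["s1", "s2", "s3"])
print(core)
print(matrix)
```

The concatenated matrix would then be passed to a maximum likelihood tree-building program together with an outgroup sequence; the specific software used is not stated in this section.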
Principal characteristics of the 28 Escherichia coli/Shigella strains.
| Name | Additional information | Serotype | Clinical condition (pathotype) | GenBank accession | Genome sequence reference |
| K-12 | K-12 MG1655 | O16 | Commensal | U00096 | |
| | K-12 W3110 | O16 | Commensal | AP009048 | |
| | K-12 DH10B | O16 | Commensal | CP000948 | |
| | K-12 BW2952 | O16 | Commensal | CP001396 | |
| HS | | O9 | Commensal | CP000802 | |
| ATCC 8739 | | O146 | Commensal | CP000946 | |
| IAI1 | | O8 | Commensal | CU928160 | |
| EDL933 | | O157:H7 | Diarrhoea (EHEC) | AE005174 | |
| Sakai | | O157:H7 | Diarrhoea (EHEC) | BA000007 | |
| UMN026 | | O17:K52:H18 | Cystitis (ExPEC) | CU928163 | |
| IAI39 | | O7:K1 | Pyelonephritis (ExPEC) | CU928164 | |
| UTI89 | | O18 | Cystitis (ExPEC) | CP000243 | |
| APEC O1 | | O1 | Colisepticemia (ExPEC) | CP000468 | |
| S88 | | O45:K1:H7 | Newborn meningitis (ExPEC) | CU928161 | |
| CFT073 | | O6:K2:H1 | Pyelonephritis (ExPEC) | AE014075 | |
| ED1a | | O81 | Commensal | CU928162 | |
| 536 | | O6:K15:H31 | Pyelonephritis (ExPEC) | CP000247 | |
| E2348/69 | | O127:H6 | Diarrhoea (EPEC) | FM180568 | |
| E24377A | | O139:H28 | Diarrhoea (ETEC) | CP000800 | |
| SMS-3-5 | | O19:H34 | Commensal | CP000970 | |
| SE11 | | O152:H28 | Commensal | AP009240 | |
| B4 Sb227 | | B4 | Shigellosis | CP000036 | |
| B18 BS512 | | B18 | Shigellosis | CP001063 | |
| SS Ss046 | | Sonnei | Shigellosis | CP000038 | |
| F2a 301 | | F2a | Shigellosis | AE005674 | |
| F2a 2457T | | F2a | Shigellosis | AE014073 | |
| F5b 8401 | | F5b | Shigellosis | CP000266 | |
| D1 Sd197 | | D1 | Shigellosis | CP000034 | |
Name as used in this paper.
O antigen not expressed in K-12 due to mutation.
EAEC (Enteroaggregative E. coli), EPEC (Enteropathogenic E. coli), EHEC (Enterohaemorrhagic E. coli), ExPEC (Extraintestinal pathogenic E. coli).
Figure 4. Tree showing the relationships of UTI89, S88 and APEC O1 ExPEC strains.
The tree topology is for mutational SNPs as allocated by the virtual outgroup analysis (Tables S7 and S8). For each lineage the numbers of mutations and recombination events are shown in grids as for Figure 2.