Kenshiro Oshima, Hidehiro Toh, Yoshitoshi Ogura, Hiroyuki Sasamoto, Hidetoshi Morita, Sang-Hee Park, Tadasuke Ooka, Sunao Iyoda, Todd D Taylor, Tetsuya Hayashi, Kikuji Itoh, Masahira Hattori.
Abstract
We sequenced and analyzed the genome of the commensal Escherichia coli (E. coli) strain SE11 (O152:H28), recently isolated from the feces of a healthy adult and classified into E. coli phylogenetic group B1. SE11 harbored a 4.8 Mb chromosome encoding 4679 protein-coding genes and six plasmids encoding 323 protein-coding genes. None of the SE11 genes had sequence similarity to known genes encoding phage- and plasmid-borne virulence factors found in pathogenic E. coli strains. Comparative genome analysis with the laboratory strain K-12 MG1655 identified 62 genes poorly conserved between these two non-pathogenic strains and 1186 genes absent in MG1655. These genes in SE11 were mostly encoded in large insertion regions on the chromosome or in the plasmids, and were notably enriched in genes for fimbriae and autotransporters, cell-surface appendages that contribute substantially to bacterial adherence to host cells and to bacterial conjugation. These data suggest that SE11 may have evolved to acquire and accumulate functions advantageous for stable colonization of intestinal cells, and that adhesion-associated functions are important for the commensality of E. coli in the human gut habitat.
Year: 2008 PMID: 18931093 PMCID: PMC2608844 DOI: 10.1093/dnares/dsn026
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Circular representation of the SE11 chromosome. From the outside in: circles 1 and 2 show the positions of protein-coding genes on the positive and negative strands, respectively. Circles 3 and 4 show the positions of protein-coding genes that have orthologs in E. coli strains E24377A and K-12 MG1655, respectively. Circle 5 shows the positions of the prophages PP_SE11 (blue), the integrative elements IE_SE11 (orange), and a large segment near aspV (green). Circle 6 shows the positions of tRNA genes (purple) and rRNA genes (brown). Circle 7 shows a plot of GC skew [(G – C)/(G + C); khaki indicates values > 0; purple indicates values < 0]. Circle 8 shows a plot of G + C content (higher values outward).
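The GC skew and G + C content tracks in circles 7 and 8 are windowed statistics over the chromosome sequence. The following is a minimal Python sketch of the formula given in the caption. The window size, the step, and the function names `gc_skew` and `gc_content` are illustrative assumptions, not parameters reported for this figure.

```python
def gc_skew(seq, window=10_000, step=None):
    """GC skew, (G - C) / (G + C), for successive windows of a DNA sequence.

    The window and step defaults are assumptions chosen for illustration.
    """
    step = step or window  # non-overlapping windows unless a step is given
    seq = seq.upper()
    values = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows without any G or C are reported as 0.0 to avoid dividing by zero.
        values.append((g - c) / (g + c) if (g + c) else 0.0)
    return values


def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


# Toy input: a G-rich window followed by a C-rich window.
print(gc_skew("GGGGCCCC", window=4))
print(gc_content("GGCCAATT"))
```

Positive values correspond to the khaki regions of circle 7 and negative values to the purple regions; on a real chromosome the sign change is commonly used to locate the replication origin and terminus.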
Figure 2. Circular representations of the three larger plasmids of SE11. The outer and inner circles of each plasmid represent genes on the positive and negative strands, respectively.
General features of the SE11 genome
| Feature | Chromosome | Plasmid pSE11-1 | Plasmid pSE11-2 | Plasmid pSE11-3 | Plasmid pSE11-4 | Plasmid pSE11-5 | Plasmid pSE11-6 |
|---|---|---|---|---|---|---|---|
| Size (bp) | 4 887 515 | 100 021 | 91 158 | 60 555 | 6929 | 5366 | 4082 |
| GC content (%) | 50.8 | 50.5 | 50.2 | 48.6 | 48.0 | 46.2 | 49.4 |
| Protein-coding gene | 4679 | 124 | 112 | 67 | 10 | 7 | 3 |
| Assigned function | 2772 | 70 | 49 | 46 | 4 | 2 | 1 |
| Conserved hypothetical | 1474 | 48 | 38 | 11 | 2 | 3 | 2 |
| Unknown function | 118 | 6 | 23 | 10 | 4 | 2 | 0 |
| Phage related | 315 | 0 | 2 | 0 | 0 | 0 | 0 |
| IS element | 33 | 2 | 5 | 12 | 0 | 0 | 0 |
| rRNA gene | 22 | 0 | 0 | 0 | 0 | 0 | 0 |
| tRNA gene | 86 | 0 | 0 | 0 | 0 | 0 | 0 |
Figure 3. Classification of all 5002 protein-coding genes in SE11 based on comparison with those in MG1655 and 12 other E. coli strains. The 5002 protein-coding genes annotated in SE11 were compared with those in 13 other sequenced E. coli strains and classified into the following categories, shown as percentages. A: highly conserved genes with MG1655 (952); B: highly conserved genes in all 14 strains (2802); C: SE11 genes absent in MG1655 (1016); D: SE11-specific genes in all 14 E. coli strains (170); E: poorly conserved genes with MG1655 (62); A + B: total highly conserved genes with MG1655 (3754); C + D: total SE11 genes absent in MG1655 (1186). Numbers of classified genes are given in parentheses.
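The category totals in the caption can be checked with simple arithmetic. The sketch below recomputes the combined categories and the percentage shares from the counts given above. The percentages are derived here from those counts only and are not values quoted from the figure; the variable names are illustrative.

```python
# Gene counts per category, copied from the Figure 3 caption.
counts = {
    "A": 952,   # highly conserved with MG1655
    "B": 2802,  # highly conserved in all 14 strains
    "C": 1016,  # SE11 genes absent in MG1655
    "D": 170,   # SE11-specific among the 14 strains
    "E": 62,    # poorly conserved with MG1655
}

total = sum(counts.values())
conserved_with_mg1655 = counts["A"] + counts["B"]  # A + B
absent_in_mg1655 = counts["C"] + counts["D"]       # C + D

# Percentage share of each category, rounded to one decimal place.
percentages = {name: round(100 * n / total, 1) for name, n in counts.items()}

print(total, conserved_with_mg1655, absent_in_mg1655)
print(percentages)
```

The five categories partition the annotated gene set, so their sum should match the 5002 protein-coding genes listed in the general-features table (4679 chromosomal plus 323 plasmid-borne).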
Prophages and integrative elements in the SE11 chromosome
| Element | Start | End | Size (kb) | Integration site | Phage type | Sequence duplication (nt) |
|---|---|---|---|---|---|---|
| Prophage | | | | | | |
| PP_SE11-1 | 608 272 | 658 623 | 50.4 | | Lambda-like | 47 |
| PP_SE11-2 | 1 475 327 | 1 527 950 | 52.6 | ttcA | Lambda-like | 43 |
| PP_SE11-3 | 1 739 450 | 1 786 245 | 46.8 | between ydfJ and rspB | Lambda-like | – |
| PP_SE11-4 | 1 928 706 | 1 964 507 | 35.8 | between btuC and ihfA | P2-like | 25 |
| PP_SE11-5 | 2 122 554 | 2 167 374 | 44.8 | yecE | Lambda-like | 11 |
| PP_SE11-6 | 2 261 390 | 2 309 029 | 47.6 | | Unclear | 77 |
| PP_SE11-7 | 2 660 266 | 2 696 947 | 36.7 | yfcI | Mu-like | 5 |
| Integrative element | | | | | | |
| IE_SE11-1 | 295 509 | 328 178 | 32.7 | | – | 19 |
| IE_SE11-2 | 3 014 785 | 3 021 558 | 6.8 | ssrA | – | – |
| IE_SE11-3 | 4 769 885 | 4 786 593 | 16.7 | | – | – |
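The Size (kb) column follows directly from the Start and End coordinates. The sketch below recomputes the prophage sizes and their combined share of the chromosome. It assumes inclusive, 1-based coordinates (hence the `+ 1`), which the table does not state explicitly; the helper name `size_kb` is hypothetical.

```python
CHROMOSOME_BP = 4_887_515  # chromosome size from the general-features table

# (start, end) coordinates copied from the prophage rows of the table.
prophages = {
    "PP_SE11-1": (608_272, 658_623),
    "PP_SE11-2": (1_475_327, 1_527_950),
    "PP_SE11-3": (1_739_450, 1_786_245),
    "PP_SE11-4": (1_928_706, 1_964_507),
    "PP_SE11-5": (2_122_554, 2_167_374),
    "PP_SE11-6": (2_261_390, 2_309_029),
    "PP_SE11-7": (2_660_266, 2_696_947),
}


def size_kb(start, end):
    """Element length in kb, ASSUMING inclusive 1-based coordinates."""
    return (end - start + 1) / 1000


# Per-element sizes rounded to one decimal place, as in the table.
sizes = {name: round(size_kb(*coords), 1) for name, coords in prophages.items()}

# Combined prophage length and its fraction of the chromosome.
total_kb = sum(size_kb(*coords) for coords in prophages.values())
fraction = total_kb * 1000 / CHROMOSOME_BP

print(sizes)
print(round(total_kb, 1), round(100 * fraction, 1))
```

The same calculation applies to the three integrative elements; they are left out here only to keep the example short.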
Figure 4. Locations and lengths of the strain-specific segments. The horizontal axis represents the location on the MG1655 chromosome and the vertical axis shows the lengths of the strain-specific segments (>5 kb) relative to the MG1655 chromosome. The positions of PP_SE11 and IE_SE11 are indicated for SE11. For O157, prophages at the same loci as the large SE11-specific segments are indicated. The positions of the 10 prophages in MG1655 are shown at the top of the figure. The total length of the strain-specific segments (>5 kb) is indicated under each strain name.
Figure 5. Comparison of the genomic locations of three SE11 prophages with the corresponding locations of the related prophages of K-12 and O157 strains. Genomic organizations of PP_SE11-1 (A), PP_SE11-2 (B), and PP_SE11-3 (C). Genes and their orientations are depicted with arrows using the following colors: red, integrase genes; orange, phage-related genes; yellow, transposase genes; green, genes outside PP_SE11; blue, virulence-associated genes in O157; gray, genes in MG1655 and O157 conserved in SE11. Light blue bars indicate orthologous regions.
Genes for fimbriae in SE11
| Locus | Fimbrial type | Presence in:a | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | | MG1655 | E24377A | HS | O157 Sakai | CFT073 | UTI89 | 536 |
| Chromosome | | | | | | | | |
| ECSE_0135–ECSE_0141 | | + | + | + | + | + | + | (+) |
| ECSE_0555–ECSE_0560 | | + | + | + | + | − | − | − |
| ECSE_0775–ECSE_0778 | | + | + | + | + | − | − | − |
| ECSE_0999–ECSE_1005 | | + | + | + | + | − | − | − |
| ECSE_1099–ECSE_1106 | Curli | + | + | + | + | + | + | (+) |
| ECSE_1592–ECSE_1597 | | (+) | (+) | (+) | + | + | (+) | (+) |
| ECSE_2377–ECSE_2380 | | + | + | + | (+) | + | + | + |
| ECSE_2643–ECSE_2648 | | + | + | + | + | + | + | + |
| ECSE_3324–ECSE_3326 | | + | + | + | − | + | + | − |
| ECSE_3375–ECSE_3378 | CS1-like | − | + | + | − | − | − | + |
| ECSE_3428–ECSE_3431 | | + | + | + | (+) | − | − | − |
| ECSE_4015–ECSE_4018 | | − | + | − | (+) | − | − | − |
| ECSE_4585–ECSE_4593 | type 1 | + | (+) | + | + | + | + | + |
| Plasmid | | | | | | | | |
| ECSE_P1-0108–ECSE_P1-0120 | type IV pili | − | (+) | − | − | − | − | − |
| ECSE_P2-0001–ECSE_P2-0005 | Caf-like | − | − | − | − | − | − | − |
| ECSE_P3-0031–ECSE_P3-0037 | F4-like | − | − | − | − | − | − | − |
| ECSE_P3-0061–ECSE_P3-0066 | | − | − | − | − | − | − | − |
a ‘+’ indicates a locus where all genes are present; ‘−’ indicates a locus where all genes are absent; and ‘(+)’ indicates a locus where one or more genes, but not all, are absent or disrupted.
Genes for autotransporter in SE11
| Locus | Length (aa) | Presence in:a | | | | | | |
|---|---|---|---|---|---|---|---|---|
| | | MG1655 | E24377A | HS | O157 Sakai | CFT073 | UTI89 | 536 |
| ECSE_0327 | 765 | − | − | + | (+) | + | + | + |
| ECSE_0393 | 968 | (+) | + | (+) | + | + | + | + |
| ECSE_1215 | 773 | (+) | + | (+) | − | − | − | − |
| ECSE_1251 | 961 | + | + | + | (+) | − | − | − |
| ECSE_1600 | 1806 | (+) | (+) | + | (+) | − | − | − |
| ECSE_2459 | 761 | + | + | + | + | (+) | (+) | (+) |
| ECSE_2494 | 1244 | + | + | + | + | + | + | + |
| ECSE_3884 | 1616 | − | + | + | + | + | + | + |
a ‘+’ indicates presence; ‘−’ indicates absence; and ‘(+)’ indicates presence of a truncated gene.