Sadayuki Ochi, Tohru Shimizu, Kaori Ohtani, Yoshio Ichinose, Hideyuki Arimitsu, Kentaro Tsukamoto, Michio Kato, Takao Tsuji.
Abstract
We report here the complete nucleotide sequence of pEntH10407 (65 147 bp), an enterotoxigenic Escherichia coli enterotoxin plasmid (Ent plasmid), which is self-transmissible at low frequency. Within the plasmid, we identified 100 open reading frames (ORFs) that could encode polypeptides. These ORFs included regions encoding heat-labile (LT) and heat-stable (STIa) enterotoxins, regions encoding tools for plasmid replication and an incomplete tra (conjugation) region. The LT and STIa region was located 13.5 kb apart and was surrounded by three IS1s and an IS600 in opposite reading orientations, indicating that the enterotoxin genes may have been horizontally transferred into the plasmid. We identified a single RepFIIA replication region (2.0 kb) including RepA proteins similar to RepA1, RepA2, RepA3 and RepA4. The incomplete tra region was made up of 17 tra genes, which were nearly identical to the corresponding genes of R100, and showed evidence of multiple insertions of ISEc8 and ISEc8-like elements. These data suggest that pEntH10407 has the mosaic nature characteristic of bacterial virulence plasmids, and that this mosaic structure contains information about its evolution. Although the tra genes might originally have rendered pEntH10407 self-transferable to the same degree as R100, multiple insertion events have occurred in the tra region of pEntH10407 and made it less mobile. Another self-transmissible plasmid might have helped pEntH10407 to transfer efficiently into the H10407 strain. In this paper, we suggest another possibility: that the enterotoxigenic H10407 strain might have been formed by auto-transfer of pEntH10407 at a low rate using the incomplete tra region.
Year: 2009 PMID: 19767599 PMCID: PMC2762410 DOI: 10.1093/dnares/dsp015
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Map of pEntH10407K. The sequence of pEntH10407K was determined by a whole-genome shotgun strategy. Sequence reads were assembled and gaps were closed by direct sequencing of PCR products amplified with oligonucleotide primers designed to anneal to each end of neighbouring contigs. The sequence was annotated using GenomeGambler (Xanagen Inc., Kanagawa, Japan). ORFs encoding products that were at least 50 amino acids in length were identified first; then possible ORFs were selected by combinations of database matches and by the presence of a ribosome binding site. Inner circle: ORFs, with their orientations, colour-coded by functional category: red, known or putative virulence-associated proteins; pink, conjugal DNA transfer; orange, IS-related or transposase fragments; purple, intact IS or transposase; blue, plasmid replication, maintenance or other DNA metabolic functions; green, conserved hypothetical proteins; yellow, putative proteins. The outer circle shows the scale in base pairs. Nomenclature of ORFs is given in Table 1. The figure was generated using the program ‘in silico MolecularCloning GE’ (In Silico Biology, Inc., Kanagawa, Japan).
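The first annotation step described in the caption (keeping ORFs whose products are at least 50 amino acids long) can be sketched as a simple six-frame scan. This is only an illustration under stated assumptions: it is not the GenomeGambler procedure, it treats ATG as the only start codon, and the names `find_orfs` and `reverse_complement` are hypothetical. The later steps (database matches, ribosome binding sites) are not modelled.

```python
# Illustrative six-frame ORF scan; NOT the GenomeGambler pipeline.
# Assumptions: ATG is the only start codon, linear sequence, uppercase ACGT.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=50):
    """Return (strand, start, end, aa_length) tuples for ATG..stop ORFs.

    Coordinates are 1-based, inclusive, on the forward strand, with
    start <= end for both strands; the stop codon is inside the span.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon
                elif start is not None and codon in STOPS:
                    aa = (i - start) // 3  # residues, stop excluded
                    if aa >= min_aa:
                        lo, hi = start + 1, i + 3
                        if strand == "-":
                            # map reverse-strand positions to forward ones
                            lo, hi = n - hi + 1, n - lo + 1
                        orfs.append((strand, lo, hi, aa))
                    start = None
    return orfs


# Toy sequence: ATG + 60 alanine codons + stop = 61 residues.
toy = "ATG" + "GCT" * 60 + "TAA"
forward_hits = find_orfs(toy)
reverse_hits = find_orfs("CC" + reverse_complement(toy))
```

Note that this sketch reports every ORF with start <= end and a separate strand flag, whereas the table below writes counterclockwise ORFs with the larger coordinate first.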
Table 1. ORFs of pEntH10407K
| ORF | Orientationᵃ | Position (bp) | Size (aa) | Homologue by BLAST | Identity/similarity (%) | Accession no. |
|---|---|---|---|---|---|---|
| ORF001 | + | 39–641 | 200 | Lytic transglycosylase | 94/96 | ABG29580 | |
| ORF002 | − | 801–667 | 44 | Conserved hypothetical protein | 100/100 | ABD60010 | |
| ORF003 | − | 1759–938 | 273 | Conserved hypothetical protein YubP | 9/99 | BAA78846 | |
| ORF004 | − | 2166–1870 | 98 | Conserved hypothetical protein | 100/100 | BAA78845 | |
| ORF005 | − | 2318–2190 | 42 | Hypothetical protein | |||
| ORF006 | + | 2509–2826 | 105 | Conserved hypothetical protein | 91/93 | BBA97937 | |
| ORF007 | − | 2971–2807 | 54 | Conserved hypothetical protein | 93/93 | EDX27828 | |
| ORF008 | − | 3223–3062 | 53 | Post-segregation killing protein | 72/84 | P16077 | |
| ORF009 | − | 3281–3066 | 71 | Modulator of post-segregation killing protein | 50/58 | P23587 | |
| ORF010 | − | 3746–3432 | 104 | Zn-dependent dehydrogenases | 99/100 | CAI79556 | |
| ORF011 | − | 4462–3743 | 239 | PsiA | 97/98 | BAA78841 | |
| ORF012 | − | 4893–4459 | 144 | PsiB | 100/100 | ABD51587 | |
| ORF013 | − | 5649–4948 | 233 | Predicted transcriptional regulator | 91/96 (truncated) | ABD51586 | |
| ORF014 | − | 6912–5662 | 416 | Predicted transcriptional regulator | 96/97 (truncated) | AAW58879 | |
| ORF015 | − | 7209–6976 | 77 | Conserved hypothetical protein | 98/99 | ABE10669 | |
| ORF016 | − | 7832–7266 | 188 | Ssb | 94/96 | BBA78826 | |
| ORF017 | + | 7858–8094 | 78 | Hypothetical protein | |||
| ORF018 | − | 8273–8025 | 82 | Hypothetical protein | |||
| ORF019 | + | 8997–10 664 | 555 | Conserved hypothetical protein YkfC | 63/77 | ABI41559 | |
| ORF020 | − | 11 020–10 778 | 80 | Conserved hypothetical protein | 93/93 | ZP_00719262 | |
| ORF021 | − | 11 583–11 020 | 187 | Conserved hypothetical protein | 97/99 | BAF33947 | |
| ORF022 | − | 12 991–11 630 | 453 | Conserved hypothetical protein | 90/94 | BAF33946 | |
| ORF023 | + | 12 995–13 135 | 46 | Conserved hypothetical protein | 100/100 | CAP07686 | |
| ORF024 | − | 13 849–13 415 | 144 | Conserved hypothetical protein | 94/97 | BBA78815 | |
| ORF025 | − | 14 084–13 863 | 73 | Conserved hypothetical protein | 99/100 | ABE10652 | |
| ORF026 | − | 14 768–14 085 | 227 | Conserved hypothetical protein | 97/98 | AAS76410 | |
| ORF027 | − | 15 251–14 844 | 135 | Conserved hypothetical protein | 94/97 | AAW58863 | |
| ORF028 | + | 15 284–16 246 | 320 | StbA | 99/99 | ABD59972 | |
| ORF029 | + | 16 246–16 599 | 117 | StbB | 99/99 | ABD59971 | |
| ORF030 | + | 17 011–17 286 | 91 | InsA of IS | 100/100 | AAA58242 | |
| ORF031 | + | 17 205–17 708 | 167 | InsB of IS | 100/100 | AAA96694 | |
| ORF032 | − | 17 982–17 719 | 87 | Hypothetical protein | |||
| ORF033 | − | 18 983–18 720 | 87 | Conserved hypothetical protein | 95/97 | AAS58634 | |
| ORF034 | − | 19 618–19 115 | 167 | InsB of IS | 99/99 | AAA96694 | |
| ORF035 | − | 19 812–19 537 | 91 | InsA of IS | 99/100 | AAA58242 | |
| ORF036 | + | 19 925–20 158 | 77 | ORF1 of IS | 80/80 (truncated) | ABB68584 | |
| ORF037 | + | 20 208–21 026 | 272 | ORF2 of IS | 99/99 | AAN43456 | |
| ORF038 | − | 21 722–21 348 | 124 | LT-B | 100/100 | AAC60441 | |
| ORF039 | − | 22 549–21 719 | 276 | LT-A | 100/100 | P43530 | |
| ORF040 | + | 22 812–23 084 | 90 | Conserved hypothetical protein | 99/100 | AAZ91090 | |
| ORF041 | + | 23 065–23 334 | 89 | Transposase of IS | 76/79 (truncated) | AAM14707 | |
| ORF042 | + | 23 250–23 564 | 104 | Transposase of IS | 99/99 (truncated) | AAM14707 | |
| ORF043 | + | 23 779–25 002 | 407 | Putative transposase | 99/99 (truncated) | AAT35239 | |
| ORF044 | − | 25 347–24 928 | 139 | Putative transposase | 99/99 (truncated) | CAI79504 | |
| ORF045 | + | 25 487–26 164 | 225 | ORF1 of IS | 100/100 | AAW51734 | |
| ORF046 | + | 26 164–26 511 | 115 | ORF2 of IS | 100/100 | AAW51735 | |
| ORF047 | + | 26 531–28 102 | 523 | ORF3 of IS | 100/100 | AAW51736 | |
| ORF048 | − | 28 394–28 176 | 72 | STIa | 100/100 | P01559 | |
| ORF049 | − | 29 343–28 840 | 167 | InsB of IS | 99/99 | AAA96694 | |
| ORF050 | − | 29 537–29 262 | 91 | InsA of IS | 100/100 | AAA58242 | |
| ORF051 | + | 29 641–29 796 | 51 | Putative transposase | 98/100 (truncated) | CAA07835 | |
| ORF052 | − | 30 316–30 008 | 102 | Predicted transcriptional regulator | 100/100 | ZP_00713086 | |
| ORF053 | + | 30 288–30 512 | 74 | Putative transposase | 92/94 (truncated) | AAM14707 | |
| ORF054ᵇ | + | 30 664–31 479 | 271 | Aminoglycoside 3′-phosphotransferase | 100/100 | AAA80260 | |
| ORF055 | + | 32 512–32 628 | 38 | Conserved hypothetical protein | 95/95 | ACD54240 | |
| ORF056 | − | 34 173–32 701 | 490 | BaeS | 73/86 | ABE10335 | |
| ORF057 | − | 34 892–34 170 | 240 | BaeR | 83/90 | ABE10334 | |
| ORF058 | + | 35 034–35 633 | 199 | Thiosulphate reductase cytochrome B subunit | 67/82 | CAD42043 | |
| ORF059 | + | 35 644–36 414 | 256 | Oxidoreductase, molybdopterin-binding subunit | 82/91 | CAD42042 | |
| ORF060 | + | 36 444–36 683 | 79 | Conserved hypothetical protein | 59/70 (truncated) | ABE10331 | |
| ORF061 | − | 39 074–37 563 | 503 | Putative ATP binding protein | 25/43 | CAD16966 | |
| ORF062 | − | 39 721–39 434 | 95 | RelE | 93/96 | ABD51640 | |
| ORF063 | − | 39 969–39 718 | 83 | RelB | 100/100 | ABD51639 | |
| ORF064 | − | 40 233–40 042 | 63 | Conserved hypothetical protein | 95/95 | AAL72549 | |
| ORF065 | − | 40 370–40 185 | 61 | RepA4 | 90/93 (truncated) | BAA78895 | |
| ORF066 | − | 40 573–40 325 | 82 | RepA4 | 88/90 (truncated) | ABC42205 | |
| ORF067 | − | 40 873–40 703 | 56 | Conserved hypothetical protein | 81/85 | ACD06080 | |
| ORF068 | − | 41 793–40 936 | 285 | RepA1 | 99/100 | CAI79519 | |
| ORF069 | − | 41 860–41 786 | 24 | TapA | 100/100 | BBA78893 | |
| ORF070 | − | 41 937–41 806 | 43 | RepA3 | 87/89 (truncated) | AAA26066 | |
| ORF071 | − | 42 354–42 094 | 86 | RepA2 | 99/100 | ABE10578 | |
| ORF072 | − | 43 184–42 594 | 196 | Superfamily I DNA/RNA helicase | 100/100 | AAW58927 | |
| ORF073 | − | 43 430–43 227 | 67 | YmoA | 97/97 | AAO49553 | |
| ORF074 | − | 43 937–43 476 | 153 | Thermonuclease family protein | 98/98 | ABE10720 | |
| ORF075 | − | 44 394–44 182 | 70 | Conserved hypothetical protein | 100/100 | BAF33997 | |
| ORF076 | − | 45 086–44 526 | 186 | FinO | 99/99 | AAC70069 | |
| ORF077 | − | 45 944–45 141 | 267 | TraX | 99/100 | BAF33995 | |
| ORF078 | − | 51 177–45 907 | 1756 | TraI | 98/99 | CAA39337 | |
| ORF079 | − | 53 465–51 177 | 762 | TraD | 96/96 | BAA78884 | |
| ORF080 | − | 53 944–53 516 | 142 | Conserved hypothetical protein YhfA | 99/99 (truncated) | ABE10712 | |
| ORF081 | − | 55 646–54 075 | 523 | ORF3 of IS | 100/100 | AAW51736 | |
| ORF082 | − | 56 013–55 666 | 115 | ORF2 of IS | 100/100 | AAW51735 | |
| ORF083 | − | 56 690–56 013 | 225 | ORF1 of IS | 100/100 | AAW51734 | |
| ORF084 | + | 56 910–57 311 | 133 | L0013 of IS | 99/100 | ABG71816 | |
| ORF085 | + | 57 308–57 655 | 115 | L0014 of IS | 100/100 | AAG54624 | |
| ORF086 | + | 57 705–59 243 | 512 | L0015 of IS | 99/99 | AAG54625 | |
| ORF087 | − | 59 613–59 377 | 78 | Conserved hypothetical protein YfhA | 99/99 (truncated) | ABD60024 | |
| ORF088 | − | 59 827–59 606 | 73 | TraR | 99/100 | ABC42235 | |
| ORF089 | − | 60 477–59 962 | 171 | TraV | 98/99 | BAA78858 | |
| ORF090 | − | 60 725–60 474 | 83 | TrbG | 96/98 | BAA97951 | |
| ORF091 | − | 61 038–60 718 | 106 | TrbD | 89/94 | ABC42237 | |
| ORF092 | − | 61 612–61 025 | 195 | TraP | 96/97 | BAA78856 | |
| ORF093 | − | 63 032–61 581 | 483 | TraB | 99/99 | BAA78855 | |
| ORF094 | − | 63 760–63 032 | 242 | TraK | 99/99 | BAA78854 | |
| ORF095 | − | 64 313–63 747 | 188 | TraE | 99/100 | BAA78853 | |
| ORF096 | − | 64 646–64 335 | 103 | TraL | 99/100 | BAA97945 | |
| ORF097 | − | 65 026–64 661 | 121 | TraA | 96/98 | BAA78851 | |
| ORF098 | − | 65 453–65 058 | 131 | TraY | 100/100 | BAA97943 | |
| ORF099 | − | 66 241–65 552 | 229 | TraJ | 98/99 | BAA97942 | |
| ORF100 | − | 66 811–66 428 | 127 | TraM | 98/100 | ABD51596 |
ᵃ +, clockwise; −, counterclockwise.
ᵇ Derived from the kanamycin resistance gene inserted into pEntH10407 to provide a selectable marker.
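The Position and Size columns of the ORF table are linked by simple arithmetic, which gives a quick consistency check on individual rows. The sketch below assumes that each listed span includes the stop codon; that assumption is not stated in the table but fits the rows tested here. The function name `expected_aa` is hypothetical.

```python
def expected_aa(start, end):
    """Protein length implied by ORF coordinates (1-based, inclusive).

    Assumes the span includes the stop codon, so residues = codons - 1.
    Works for either orientation because only the span length matters.
    """
    nt = abs(end - start) + 1
    if nt % 3 != 0:
        raise ValueError(f"span of {nt} nt is not a whole number of codons")
    return nt // 3 - 1


# (ORF, start, end, size in aa) copied from the table above.
rows = [
    ("ORF001", 39, 641, 200),
    ("ORF002", 801, 667, 44),
    ("ORF038", 21722, 21348, 124),   # LT-B
    ("ORF048", 28394, 28176, 72),    # STIa
    ("ORF078", 51177, 45907, 1756),  # TraI
]

mismatches = [name for name, s, e, aa in rows if expected_aa(s, e) != aa]
```

For ORF001, for example, 641 − 39 + 1 = 603 nt, which is 201 codons, or 200 residues once the stop codon is subtracted.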
Figure 2. Relationship of the complete transfer operons of R100 and F plasmid to the incomplete operon of pEntH10407K. (A) The incomplete tra region in pEntH10407K is compared with the transfer operons of R100 and F plasmids. ORFs are orientated according to pEntH10407K. Diagonally hatched arrows indicate genes of the transfer system. tra genes are in uppercase; trb genes are in lowercase. ORFs are not to scale. Although transcription proceeds from right to left in R100 and F plasmids, pEntH10407K contains only the portion of the tra region from traM to traR, which is similar to the analogous ORFs in the F plasmid, and another portion from traD to traX, which is similar to the corresponding region from traD to traX of R100. Between the two tra regions, both transposase and putative transposase genes (from ORF080 to ORF087) were recognized. (B) Per cent sequence similarity of each ORF in pEntH10407K to the analogous sequences from the R100 and F plasmids. HP indicates a hypothetical protein. Values indicate identity obtained by BLASTX comparison of each pEntH10407K ORF with the corresponding ORF from R100 and F plasmids. For each pair of comparisons, the higher identity value is in the shaded box. A continuous line indicates that a similar ORF was not found. (C) Comparison of the putative oriT regions of conjugative plasmids, aligned at the nick site (arrow) according to Frost et al.[32] Conserved sequence is marked with the shaded box.
Table 2. Transfer proficiency of pEntH10407K and the mutant plasmids
| Plasmid in donor | Relevant genotype in | Transfer frequency |
|---|---|---|
| pUC19 | —ᵃ | <10⁻¹¹ |
| pBluescript II SK(+) | —ᵃ | <10⁻¹¹ |
| R100 | | 2.44 × 10⁻⁶ |
| pEntH10407K | | 2.84 × 10⁻⁹ |
| pEntH10407KΔ | | <10⁻¹¹ |
To evaluate the transmissibility of pEntH10407K, various plasmids [pUC19, pBluescript II SK(+), R100, pEntH10407K and mutated pEntH10407K] were electroporated into a derivative (a spontaneous nalidixic acid-resistant mutant) of the E. coli K-12 strain. The transformants were used in the experiment as donor strains. Aliquots from overnight cultures of the donor and the recipient in Luria–Bertani medium were mixed at a 1:1 ratio and incubated at 37°C for 12 h. After mating, the mixtures were spread, both undiluted and diluted, on selective agar. As controls, aliquots of the donor and recipient cultures were also spread separately on selective plates. Transfer frequencies were calculated per donor bacterium.
ᵃ No tra region exists.
ᵇ A complete tra region is present.
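A transfer frequency "per donor bacterium" is the ratio of transconjugant to donor viable counts. The sketch below shows that arithmetic under stated assumptions: the plate counts, dilutions and plated volume are hypothetical values chosen only for illustration, and the helper names are not from the source. The final line uses the two frequencies reported in the table to express how much less efficiently pEntH10407K transfers than R100.

```python
def cfu_per_ml(colonies, dilution, plated_ml=0.1):
    """Viable count per mL from one plate.

    dilution is the dilution factor of the plated sample
    (e.g. 1e-6, or 1 for an undiluted mating mixture).
    """
    return colonies / (dilution * plated_ml)


def transfer_frequency(transconjugants_per_ml, donors_per_ml):
    """Transconjugants obtained per donor bacterium."""
    return transconjugants_per_ml / donors_per_ml


# Hypothetical plate counts, used only to illustrate the calculation.
donors = cfu_per_ml(colonies=200, dilution=1e-6)       # 2.0e9 CFU/mL
transconjugants = cfu_per_ml(colonies=50, dilution=1)  # 5.0e2 CFU/mL
frequency = transfer_frequency(transconjugants, donors)

# Ratio of the frequencies reported in the table (R100 / pEntH10407K).
fold_difference = 2.44e-6 / 2.84e-9
```

With the reported values, R100 transfers roughly 860 times more frequently than pEntH10407K, which is consistent with the abstract's description of pEntH10407 as self-transmissible at low frequency.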
Figure 3. Dot plot analysis of pEntH10407K versus R100. Dot matrix analysis was performed using the Harrplot 2.0 software (Software Development) with a window setting of 15 and a threshold of 10. A solid diagonal line represents similarity. The toxin region (pathogenicity islet) and the incomplete tra region of pEntH10407K are indicated by horizontally and diagonally hatched boxes, respectively, at the top of the dot plot. The region of the ISEc8 and ISEc8-like elements in the incomplete tra region is indicated by an open box. The tra region of R100 is indicated by a closed box at the right side of the dot plot.
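The window and threshold settings in the caption can be read in the usual dot-matrix sense: a dot is drawn at (i, j) when at least `threshold` of the `window` aligned positions starting at i and j are identical. The sketch below implements that reading. It is an assumption that Harrplot applies the settings exactly this way, and the function name `dot_matrix` is hypothetical. The nested loops are far too slow for full plasmids and are meant only to make the parameters concrete.

```python
def dot_matrix(a, b, window=15, threshold=10):
    """Return (i, j) pairs where windows of a and b share >= threshold matches.

    i and j are 0-based start positions of the compared windows.
    """
    dots = []
    for i in range(len(a) - window + 1):
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(a[i:i + window],
                                                 b[j:j + window]))
            if matches >= threshold:
                dots.append((i, j))
    return dots


# Comparing a short toy sequence with itself yields the main diagonal.
toy = "ACGTTGCAAGGCTTACGATC"
self_dots = dot_matrix(toy, toy)

# Two sequences with no shared bases yield no dots.
no_dots = dot_matrix("A" * 20, "C" * 20)
```

Runs of dots along a diagonal correspond to the solid diagonal lines described in the caption, while breaks in the diagonal mark regions without a similar counterpart, such as inserted elements.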