Preeti Mehta, Sherwood Casjens, Sankaran Krishnaswamy.
Abstract
BACKGROUND: Many sequenced bacterial genomes harbor phage-like elements or cryptic prophages. These elements have been implicated in pathogenesis, serotype conversion and phage immunity. The e14 element is a defective lambdoid prophage element present at 25 min in the E. coli K-12 genome. This prophage encodes important functional genes such as lit (T4 exclusion), mcrA (modified cytosine restriction activity) and pin (recombinase).
Year: 2004 PMID: 14733619 PMCID: PMC331406 DOI: 10.1186/1471-2180-4-4
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Figure 1. Overview of the e14 element. The genetic functions of a generic lambdoid bacteriophage genome (brown rectangle) are shown at the top, together with a transcriptional map (black arrows). In the middle, the section of the E. coli K-12 genome that contains e14 (gray rectangle) is shown with ORFs denoted by rectangular arrows oriented in the direction of transcription (green – host genes; red – e14 genes that are likely nonfunctional; black – e14 genes that are known to be functional; blue – e14 genes whose functionality cannot be assessed at present; parentheses indicate the boundaries of the P-invertible element). Small black arrows above the e14 map denote putative promoters, vertical lines denote putative terminators and small black squares denote putative operators. The yellow regions between the lambdoid and e14 maps indicate regions where e14 has homology to at least one known member of the lambdoid phage family (see text for details). Below, colored rectangles mark the regions of highest homology between e14 and various known phages and prophages, with regions of greater similarity drawn closer to the e14 map (these are not meant to show all known homologies, only the closest ones). CPS-53 is a defective prophage in E. coli K-12, CP-933H is a prophage in E. coli EDL933, and CP073-5, Sti4b and Sti8 are provisional names for prophages in E. coli CFT073 and S. typhi CT18 (Supplementary Material of Ref. [39]).
Annotation of genes encoded by the e14 element. The functional annotation of each e14 gene is listed along with its BLAST and FASTA hits, the closest structural homolog (if any) and the cluster to which the gene belongs. TM indicates a transmembrane region and SS the presence of a signal sequence; COG, SM, PF, IPR and PS are prefixes for COG, SMART, PFAM, INTERPRO and PROSITE IDs, respectively. Genes for which direct or indirect evidence of transcriptional or translational expression is available are marked with a (+) sign, and genes that are inducible on SOS induction are marked with (l+), in the last column of the table.
| YmfD | b1137 | 659–1324 | 221 (-) | 0.35 | IPR001601 | Very weak match to methyltransferase and tellurite resistance TehB (l+) |
| YmfE | b1138 | 1325–2029 | 234 (-) | 0.31 | TM(22–42, 59–79, 154–174, 186–206) | (l+) |
| Lit | b1139 | 2487–3380 | 297 (+) | 0.37 | PS00142 TM (61–82, 149–178) | T4 exclusion, Interacts with DNA, is a protease (l+) |
| IntE | b1140 | 3471–4598 | 375 (-) | 0.45 | IPR002104, PF00589 | phage integrase (l+) |
| Vxis | b1141 | 4579–4824 | 81 (-) | 0.44 | - | phage excisionase (l+) |
| YmfH | b1142 | 4861–5172 | 103 (-) | 0.54 | TM(42–62, 73–93) | similar to Q8FET3 of |
| YmfI | b1143 | 5289–5630 | 113 (+) | 0.39 | - | (l+) |
| YmfJ | b1144 | 5568–5852 | 94 (-) | 0.47 | - | similar to Zinc finger protein Q8BGS3 (l+) |
| YmfK | b1145 | 6051–6725 | 224 (-) | 0.44 | IPR006198, PF00717 | cI/c2 repressor (l+) |
| b1146 | b1146 | 6513–7016 | 167 (+) | - | - | probable homolog of cro from |
| YmfL | b1147 | 7048–7617 | 189 (+) | 0.49 | - | - |
| YmfM | b1148 | 7614–7952 | 112 (+) | 0.50 | - | - |
| YmfN | b1149 | 7962–9329 | 455 (+) | 0.54 | IPR005021, PF03354, COG4626, SM00345 | Fusion of a replicase and a phage terminase |
| YmfR | b1150 | 9341–9523 | 60 (+) | 0.60 | TM(5–25, 26–46) | - |
| YmfO | b1151 | 9523–9934 | 137 (+) | 0.56 | IPR006944, PF04860 | Probable pseudogene, phage portal |
| YmfP | b1152 | 9935–10714 | 259 (+) | 0.58 | - | tail protein (baseplate?) |
| YmfQ | b1153 | 10705–11289 | 194 (+) | 0.57 | SS(1–32) | tail protein (baseplate?) |
| YcfK | b1154 | 11293–11922 | 209 (+) | 0.50 | COG3299 | tail fibre |
| YmfS | b1155 | 11924–12337 | 137 (+) | 0.42 | PF02413, IPR003458 | tail fibre assembly |
| TfaE | b1156 | 12309–12911 | 200 (-) | 0.48 | PF02413, IPR003458 | tail fibre assembly |
| StfE | b1157 | 12911–13411 | 166 (-) | 0.47 | - | side tail fibre |
| PinE | b1158 | 13477–14031 | 184 (+) | 0.49 | PF00239, PF02796, PS00397, PS00398 | DNA invertase – catalyses the inversion of 1800 bp P-region (+) |
| McrA | b1159 | 14138–14971 | 277 (+) | 0.38 | SM00507, IPR002711, IPR003615 | Modified cytosine restriction endonuclease A (+) |
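
The coordinate and length columns above can be cross-checked with simple arithmetic. The sketch below assumes that the coordinates are inclusive and that each span includes the stop codon; neither convention is stated in the table, and the function name `implied_protein_length` is chosen here for illustration only.

```python
from typing import Optional


def implied_protein_length(start: int, end: int) -> Optional[int]:
    """Residue count implied by inclusive coordinates that include a stop codon.

    Returns None when the span is not a whole number of codons.
    """
    span = end - start + 1
    if span % 3 != 0:
        return None
    return span // 3 - 1  # subtract the stop codon


# (gene, start, end, listed length) copied from rows of the table above
rows = [
    ("YmfD", 659, 1324, 221),
    ("Lit", 2487, 3380, 297),
    ("IntE", 3471, 4598, 375),
    ("YmfO", 9523, 9934, 137),
]

for gene, start, end, listed in rows:
    implied = implied_protein_length(start, end)
    if implied is None:
        note = "span is not a multiple of 3"
    elif implied == listed:
        note = "consistent with listed length"
    else:
        note = "differs from listed length"
    print(gene, implied, listed, note)
```

Under these assumptions, the YmfO span (412 bp) is not a whole number of codons, which is at least compatible with its annotation as a probable pseudogene.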
Homologous regions of e14 with other phage and bacterial genomes. All the regions listed show greater than 85% identity within the region of the match. The matching regions are ordered by their position in the e14 genome. Figure 1 provides a schematic representation of this table.
| 24627–25019 | 4859–5249 | ymfI | |
| 25540–27256 | 5903–9926 | ymfK, ymfL, ymfM | |
| 1156–2211 | 8192–9247 | ymfN | |
| 15305–16983 | 9926–11604 | ymfO, ycfK | |
| 27440–27600 | 5031–5190 | ymfI | |
| 29587–29625 | 7544–7582 | ymfL | |
| 1140–1405 | 8228–8493 | ymfN | |
| 1608–2840 | 8696–9928 | ymfN | |
| 38985–38623 | 4859–5220 | ymfI | |
| 37962–37900 | 5900–5963 | - | |
| 36862–36316 | 7035–7581 | ymfL | |
| 36307–35709 | 7596–8194 | ymfM | |
| 153052–153211 | 6–165 | icd | |
| 25330–25963 | 11395–14028 | pin | |
| 23421–23500 | 11517–11596 | ycfK | |
| 24611–24263 | 12232–12580 | ymfS, tfaA | |
| 1197399–1197608 | 7–216 | icd | |
| 325210–325530 | 11604–11284 | ycfK | |
| 921146–921257 | 13459–13570 | pin | |
| 2684817–2685390 | 14032–13459 | pin | |
| 20864–20264 | 12256–12856 | ymfS, tfaA | |
| 19918–19709 | 13205–13414 | pin | |
| 22957–23180 | 8225–8448 | ymfN | |
| 23410–23498 | 8678–8766 | ymfN | |
| 23764–23883 | 9035–9154 | ymfN | |
| 23735–23883 | 9006–9154 | ymfN | |
| 23941–24443 | 9212–9714 | ymfR, ymfO | |
| 3853–3216 | 12324–12961 | tfaA, stfE (-) | |
| 3890–4390 | 12918–13415 | stfE | |
| 6180–6377 | 13218–13415 | stfE | |
| 4700–4880 | 13211–13415 | stfE | |
| 2998–2803 | 13215–13410 | stfE | |
| 5203–5418 | 13200–13415 | stfE | |
| 5701–5905 | 13211–13415 | stfE | |
| 6438–6879 | 13476–13917 | pin | |
| 3525667–3526015 | 12581–12233 | tfaA, stfE | |
| 3526698–3527201 | 13396–13898 | stfE, pin | |
| 3524880–3525072 | 13410–13218 | stfE | |
| 2754201–2754696 | 13898–13404 | stfE, pin | |
| 1410749–1410959 | 13205–13415 | stfE | |
| 1411306–1411507 | 13218–13419 | stfE | |
| 1408359–1408791 | 13908–13476 | pin | |
| 1409336–1409551 | 13415–13200 | stfE | |
| 1408852–1409071 | 13415–13196 | stfE | |
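
The 85% identity cutoff used for this table can be illustrated with a small calculation. This is a minimal sketch, assuming identity is counted over all columns of a pairwise alignment (gap columns included in the denominator); the alignment program's own definition may differ, and the function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"  # a gap paired with a gap is not a match
    )
    return 100.0 * matches / len(aligned_a)


def passes_cutoff(aligned_a: str, aligned_b: str, cutoff: float = 85.0) -> bool:
    """True when the match exceeds the cutoff (strictly greater than)."""
    return percent_identity(aligned_a, aligned_b) > cutoff


# Toy alignment, not taken from the e14 data: 9 of 10 columns match.
print(percent_identity("ACGTACGTAC", "ACGTACGTAA"))
print(passes_cutoff("ACGTACGTAC", "ACGTACGTAA"))
```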
Figure 2. The regulatory region of the e14 element. The possible ORFs for the cro (b1146) and cI (b1145) repressors are shown in blue, with their orientation indicated by arrows. The inverted repeats detected by Allison et al. [31] for SfV are boxed. The palindromic regions are underlined. The ribosome binding sites (RBS) and the putative -10 and -35 elements of the early right operon are shown in differently colored letters within the box.
Figure 3. Cumulative GC-plot of e14. The plot was computed with a window size of 500 and shows the regions of minima, which were analyzed as possible origins of replication. The y-axis represents ∑ (G-C)/(G+C) multiplied by 1000. The x-axis gives the base positions in e14.
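
The quantity plotted in Figure 3 can be sketched as follows. The caption gives the window size (500) and the scaling (×1000) but not the step between windows, so non-overlapping windows are an assumption here, as is the function name `cumulative_gc_skew`.

```python
from typing import List, Tuple


def cumulative_gc_skew(
    seq: str, window: int = 500, scale: float = 1000.0
) -> List[Tuple[int, float]]:
    """Running sum of (G-C)/(G+C) over consecutive non-overlapping windows.

    Returns (end position of window, scaled cumulative skew) pairs.
    """
    seq = seq.upper()
    points = []
    total = 0.0
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        # A window with no G or C contributes nothing to the sum.
        skew = (g - c) / (g + c) if (g + c) else 0.0
        total += skew
        points.append((start + window, total * scale))
    return points


# Toy sequence, not e14: a G-rich stretch followed by a C-rich stretch.
toy = "G" * 10 + "C" * 10
points = cumulative_gc_skew(toy, window=10)
print(points)

# The global minimum of the curve is the kind of position examined as a candidate origin.
print(min(points, key=lambda p: p[1]))
```

With a real sequence, the returned pairs can be passed directly to any plotting library, with position on the x-axis and the scaled sum on the y-axis.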
Predicted promoters for the e14 element. Putative promoters were predicted using the BPROM web server. Scores are as given by BPROM. Promoters with a score above 3 were considered for the study. Only promoters that could be associated with a gene are listed. The promoter for the shorter ORF of b1146 was predicted based on Allison et al. [31] and the GeneMark program and is therefore omitted from the table.
| 2070 | - | gtatataat | ttgtaa | 72 | 47 | 9.24 | ymfE, ymfD |
| 2461 | + | gtatatact | ctgaag | 62 | 19 | 6.95 | lit |
| 6003 | - | ttttatact | tttatg | 76 | 33 | 8.09 | ymfJ |
| 6919 | - | cacaaaact | ttgctc | 17 | 31 | 3.24 | ymfK |
| 6800 | + | atgtaatat | ttgaag | 61 | 54 | 3.5 | ymfL |
| 13200 | + | acttaaaat | ttgcat | 67 | 50 | 4.68 | pinE |
| 14113 | + | aagtagtat | ttgcaa | 44 | 55 | 5.55 | mcrA |
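
The score cutoff described for this table can be expressed as a small filter over the rows. The sketch below assumes that the first column is the promoter position, the second the strand, the seventh the BPROM score and the eighth the associated gene(s); the table's headers were not preserved, so this column assignment is a guess, and the helper names are illustrative.

```python
from typing import Dict, List

# Three rows copied from the table above.
ROWS = """\
| 2070 | - | gtatataat | ttgtaa | 72 | 47 | 9.24 | ymfE, ymfD |
| 6919 | - | cacaaaact | ttgctc | 17 | 31 | 3.24 | ymfK |
| 6800 | + | atgtaatat | ttgaag | 61 | 54 | 3.5 | ymfL |
"""


def parse_row(line: str) -> Dict[str, object]:
    """Split one pipe-delimited row into named fields (column meanings assumed)."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    return {
        "position": int(cells[0]),
        "strand": cells[1],
        "score": float(cells[6]),
        "genes": [g.strip() for g in cells[7].split(",")],
    }


def passes_score_cutoff(row: Dict[str, object], cutoff: float = 3.0) -> bool:
    """Keep promoters whose score is strictly above the cutoff."""
    return float(row["score"]) > cutoff


promoters: List[Dict[str, object]] = [
    parse_row(line) for line in ROWS.splitlines() if line.strip()
]
kept = [p for p in promoters if passes_score_cutoff(p)]
for p in kept:
    print(p["position"], p["strand"], p["score"], p["genes"])
```

All rows in the table pass this filter, as expected, since the caption states that only promoters scoring above 3 were considered.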
Predicted terminators for the e14 element. Rho-independent terminators in the e14 genome as predicted by the GCG terminator program. Only terminators that could be associated with genes are listed here.
| Position | Strand | Sequence |
| --- | --- | --- |
| 3379 | + | gatatggctgtccgccgctcgcttaaagtggactttttagtttttatcatg |
| 5769 | + | tgctaacaaaatgcgggcctcagtgcctgcatttggctctatctgctgcaa |
| 7072 | + | cactggaaaatagaaaaacagcctgagtggtacgtgaaagctgtcagaaaa |
| 11921 | + | aagatgaaaatatactgttgcttaaataccgttggtttttttatggatggc |
| 14045 | + | ttgtgtacaaaagaaagtaaaacaacagcaacttgttgcaattttatcaat |
| 15382 | + | ttaaatattgaaacgggcgtataacacgcccgttgttttatttatgtggat |
| 622 | - | ctaaagatgtatgtgaaggggccgcgctcgcggccttttttacattccgca |
| 1385 | - | agtcggaaaaatcccggacgataaaataaaagaatttttcactaaaaataa |
| 6057 | - | agcctaatcaatgtttatgaacctgcttcggcaggtttttttatacttgac |
Predicted phage-like elements using a comparative protein-based approach. Phage elements were detected in other genomes using orthology to e14 proteins as the criterion. Clustering of orthologous proteins (COG hits) for the e14 proteins in different organisms was examined. Only those organisms with two or more COG hits in the e14 element are listed. Estimates of the boundaries of each phage element are provided. This analysis identified 26 phage-related regions, of which 23 are already known phage regions in the bacterial genomes. Two of the remaining three regions (labeled P2 and P3) are probably non-phage regions. *Regions that have not previously been identified as prophage elements are marked P1, P2 and P3. $Denotes approximate boundaries.
| b1149, b1151, b1159 | z1359, z1362, z1356 | 1250302–1295563 | CP-933M | |
| b1149, b1151, b1158, b1159 | z1803, z1806, z1817, z1800 | 1626570–1674696 | CP-933N | |
| b1149, b1149, b1151, b1159 | z6045, z6070, z6042, z6047 | 2271618–2331237 | CP-933P | |
| b1145, b1154, b1155, b1157, b1158 | z0309, z0314, z0315, z0317, zpinH | 300060–310646 | CP-933H | |
| b1140, b1141, b1145, b1155, b1157 | z1866, z1867, zumuD, z1920, z1918 | 1701990–1749459 | CP-933X | |
| b1140, b1143, b1155 | zintT, z2978, z2983 | 2668339–2689384 | CP-933T | |
| b1149, b1151 | z1854, z1849 | 1678701–1693737 | CP-933C | |
| b1140, b1145 | zintU, z3126 | 2743223–2788401 | CP-933U | |
| b1140, b1145 | zintO, z2090 | 1849324–1929903 | CP-933O | |
| b1145, b1149, b1151 | z3358, z3332, z3328 | 2966157–3015089 | CP-933V | |
| b1156, b1158 | ybcx, ybck | 564025–585326 | DLP12 | |
| b1156, b1157, b1158 | ynac, stfr, pinr | 1409966–1433025 | rac | |
| b1156, b1157, b1158 | ydfm, ydfn, pinq | 1630450–1646830 | qin | |
| b1154, b1156 | yfdk, tfaS | 2464404–2474619 | KpLE1, CPS-53 | |
| b1149, b1158 | BS_lexA, BS_yneB | $1902658–1919056 | *P1 | |
| b1151, b1152 | BS_ykeA, BS_xkdT | 1316849–1347491 | PBSX | |
| b1152, b1158 | BS_yqbT, BS_spoIVCA | $2652219–2700977 | SKIN | |
| b1149, b1151 | Mlr8521, Mlr8522 | $6974260–7021772 | Meso2 | |
| b1143, b1149 | Mlr4761, Mlr4759 | $3776480–3781495 | *P2 | |
| b1150, b1154, b1157 | CC1890, CC1902, CC1904 | $2096699–2098302 | *P3 | |
| b1140, b1149 | XF1642, XF1645 | 1595657–1629967 | XfP4 | |
| b1140, b1152 | XF1555, XF1598 | 1519081–1532748 | XfP3 | |
| b1152, b1153, b1157 | NMA1323, NMA1324, NMA1325 | 1207176–1236496 | Pnm2 | |
| b1152, b1153 | NMA1826, NMA1825 | 1768530–1807766 | Pnm1 | |
| b1140, b1159 | spy1488, spy1468 | 1189125–1222634 | 370.2 | |
| b1157, b1158 | spy0671, spy0655, spy1468 | 529591–570493 | 370.1 | |
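
The listing criterion for this table (two or more hits to e14 proteins within a region) amounts to a simple count. This is a minimal sketch of that filter only, not of the orthology search itself; the function name and the single-hit entry are hypothetical and are included solely to show a region being excluded.

```python
from typing import Dict, List


def regions_with_min_hits(
    hits_by_region: Dict[str, List[str]], min_hits: int = 2
) -> List[str]:
    """Return the regions that have at least `min_hits` hits to e14 genes."""
    return [
        region
        for region, hits in hits_by_region.items()
        if len(hits) >= min_hits
    ]


hits = {
    "CP-933M": ["b1149", "b1151", "b1159"],  # from the table above
    "DLP12": ["b1156", "b1158"],             # from the table above
    "hypothetical_region": ["b1140"],        # made-up single-hit entry, not in the table
}

print(regions_with_min_hits(hits))
```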