Shivapriya Chithambaram, Ramanandan Prabhakaran, Xuhua Xia.
Abstract
Because phages use their host translation machinery, their codon usage should evolve toward that of highly expressed host genes. We used two indices to measure codon adaptation of phages to their host: rRSCU (the correlation in relative synonymous codon usage [RSCU] between phages and their host) and the Codon Adaptation Index (CAI), computed with highly expressed host genes as the reference set because phage translation depends on host translation machinery. These indices are appropriate for this purpose only when hosts exhibit little mutation bias, so only phages parasitizing Escherichia coli were included in the analysis. For double-stranded DNA (dsDNA) phages, both rRSCU and CAI decrease with increasing number of transfer RNA genes encoded by the phage genome. rRSCU is greater for dsDNA phages than for single-stranded DNA (ssDNA) phages, and the low rRSCU values of ssDNA phages are mainly due to poor concordance in RSCU values for Y-ending codons between ssDNA phages and the E. coli host, consistent with the predicted effect of C→T mutation bias in the ssDNA phages. Strong C→T mutation bias would improve codon adaptation in codon families (e.g., Gly) where highly expressed host genes favor U-ending codons over C-ending codons ("U-friendly" codon families) but decrease codon adaptation in other codon families where highly expressed host genes favor C-ending codons over U-ending codons ("U-hostile" codon families). It is remarkable that ssDNA phages with increasing C→T mutation bias also increased the usage of codons in the "U-friendly" codon families, thereby achieving CAI values almost as large as those of dsDNA phages. This represents a new type of codon adaptation.
Keywords: Escherichia coli; bacteriophage; codon adaptation; deamination; mutation bias; phage-host coevolution
Year: 2014 PMID: 24586046 PMCID: PMC4032129 DOI: 10.1093/molbev/msu087
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
The Effect of tRNA-Mediated Selection in Escherichia coli, Whose Genomic Sequence Has Equal Nucleotide Frequencies, Presumably Resulting from Little Mutation Bias.
| AA | Codon | Codon count (a) | tRNA genes (b) | CF |
|---|---|---|---|---|
| Glu | GAA | 4,683 | 4 | A-ending |
| | GAG | 1,459 | 0 | |
| Phe | UUC | 2,229 | 2 | C-ending |
| | UUU | 872 | 0 | |
| Leu4 (c) | CUA | 54 | 1 | |
| | CUG | 5,698 | 4 | G-ending |
| | CUC | 541 | 1 | |
| | CUU | 357 | 0 | |
| Arg4 (c) | CGA | 34 | 0 | |
| | CGG | 33 | 1 | |
| | CGC | 1,530 | 0 | |
| | CGU | 2,995 | 3 | U-ending |
Note.—CF, codon favored by tRNA.
(a) Number of codons in highly expressed E. coli genes compiled in the EMBOSS package (Rice et al. 2000).
(b) Number of E. coli tRNA genes with anticodon forming Watson-Crick pairing with the associated codon. Nucleotide A at the first anticodon position is mostly modified to inosine.
(c) Leu and Arg are each coded by a four-codon subfamily and a two-codon subfamily. Leu4 and Arg4 refer to their respective four-codon subfamilies.
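The abstract's distinction between "U-friendly" and "U-hostile" codon families can be illustrated with the Y-ending codon counts in the table above. The following is a minimal sketch, not the authors' procedure: the function name `classify_family` and the simple majority rule (U-ending count greater than C-ending count) are assumptions made for illustration.

```python
# Y-ending codon counts from highly expressed E. coli genes (table above).
HOST_Y_COUNTS = {
    "Phe": {"UUC": 2229, "UUU": 872},
    "Leu4": {"CUC": 541, "CUU": 357},
    "Arg4": {"CGC": 1530, "CGU": 2995},
}


def classify_family(codon_counts):
    """Label a codon family by which Y-ending codon the host uses more.

    Assumed rule for illustration: "U-friendly" if U-ending codons
    outnumber C-ending codons, otherwise "U-hostile" (ties included).
    """
    u_total = sum(n for codon, n in codon_counts.items() if codon.endswith("U"))
    c_total = sum(n for codon, n in codon_counts.items() if codon.endswith("C"))
    return "U-friendly" if u_total > c_total else "U-hostile"


labels = {aa: classify_family(counts) for aa, counts in HOST_Y_COUNTS.items()}
for aa, label in labels.items():
    print(f"{aa}: {label}")
```

With these counts, Phe and Leu4 come out as U-hostile and Arg4 as U-friendly, so a C→T bias would be expected to help adaptation in Arg4 and hurt it in Phe and Leu4.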
Fictitious Codon Usage for Highly Expressed Host Genes (HOST) and Two Phage Genes (PG1 and PG2).
| AA | Codon | Count (HOST) | Count (PG1) | Count (PG2) | RSCU (HOST) | RSCU (PG1) | RSCU (PG2) |
|---|---|---|---|---|---|---|---|
| Gly | GGA | 400 | 50 | 75 | 0.8889 | 1 | 1 |
| | GGG | 300 | 30 | 45 | 0.6667 | 0.6 | 0.6 |
| | GGC | 100 | 20 | 30 | 0.2222 | 0.4 | 0.4 |
| | GGU | 1,000 | 100 | 150 | 2.2222 | 2 | 2 |
| Phe | UUC | 2,000 | 20 | 10 | 1.8182 | 0.2 | 0.2 |
| | UUU | 200 | 180 | 90 | 0.1818 | 1.8 | 1.8 |
Note.—rRSCU between HOST and PG1 is identical to that between HOST and PG2, but PG2 will have higher CAI than PG1 when CAI is computed with HOST as the reference set of genes.
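The note above can be checked numerically. The sketch below recomputes RSCU, rRSCU, and CAI from the fictitious counts in the table. It assumes the standard definitions (RSCU as a count divided by the mean count in its codon family, rRSCU as a Pearson correlation over codons, and CAI as the geometric mean of relative adaptiveness values w = count / largest count in the family for the reference set). The function names are illustrative, and the code is not claimed to match the authors' own implementation.

```python
import math

# Fictitious codon counts copied from the table above.
COUNTS = {
    "HOST": {"Gly": {"GGA": 400, "GGG": 300, "GGC": 100, "GGU": 1000},
             "Phe": {"UUC": 2000, "UUU": 200}},
    "PG1": {"Gly": {"GGA": 50, "GGG": 30, "GGC": 20, "GGU": 100},
            "Phe": {"UUC": 20, "UUU": 180}},
    "PG2": {"Gly": {"GGA": 75, "GGG": 45, "GGC": 30, "GGU": 150},
            "Phe": {"UUC": 10, "UUU": 90}},
}


def rscu(families):
    """RSCU of each codon: its count divided by the family's mean count."""
    values = {}
    for codons in families.values():
        family_mean = sum(codons.values()) / len(codons)
        for codon, n in codons.items():
            values[codon] = n / family_mean
    return values


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def r_rscu(host, phage):
    """Correlation between host and phage RSCU values across codons."""
    h, p = rscu(host), rscu(phage)
    codons = sorted(h)
    return pearson([h[c] for c in codons], [p[c] for c in codons])


def cai(host, gene):
    """Geometric mean of w over all codons of the gene (host as reference).

    Assumes every host count is nonzero, which holds for this toy data.
    """
    log_sum, total = 0.0, 0
    for aa, codons in gene.items():
        best = max(host[aa].values())
        for codon, n in codons.items():
            log_sum += n * math.log(host[aa][codon] / best)
            total += n
    return math.exp(log_sum / total)


host_rscu = rscu(COUNTS["HOST"])
r_pg1 = r_rscu(COUNTS["HOST"], COUNTS["PG1"])
r_pg2 = r_rscu(COUNTS["HOST"], COUNTS["PG2"])
cai_pg1 = cai(COUNTS["HOST"], COUNTS["PG1"])
cai_pg2 = cai(COUNTS["HOST"], COUNTS["PG2"])
print(f"rRSCU: PG1={r_pg1:.4f}, PG2={r_pg2:.4f}")
print(f"CAI:   PG1={cai_pg1:.4f}, PG2={cai_pg2:.4f}")
```

PG1 and PG2 have identical RSCU values, so their rRSCU values are equal. PG2 nevertheless has the higher CAI (about 0.37 against about 0.26 under these definitions) because a larger share of its codons falls in the Gly family, where its preferred codon agrees with the host's, and a smaller share in Phe, where it does not.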
Fig. Codon adaptation of the phage genes, measured by rRSCU, decreases with increasing number of tRNA genes encoded in phage genomes.
Number of A- or G-Ending Codons (Ncod), RSCU, and Number of tRNA Genes (NtRNA) for Escherichia coli and Two Phage Species (WV8 and bV_EcoS_AKFV33).
| AA | Codon | E. coli Ncod (a) | E. coli RSCU | E. coli NtRNA | WV8 Ncod | WV8 RSCU | WV8 NtRNA | bV_EcoS_AKFV33 Ncod | bV_EcoS_AKFV33 RSCU | bV_EcoS_AKFV33 NtRNA |
|---|---|---|---|---|---|---|---|---|---|---|
| E | GAA | 4,683 | 1.525 | 4 | 1,125 | 1.259 | 1 | 1,489 | 1.365 | 1 |
| E | GAG | 1,459 | 0.475 | | 662 | 0.741 | | 692 | 0.635 | |
| G | GGA | 118 | 0.068 | 1 | 245 | 0.584 | 1 | |||
| G | GGG | 267 | 0.154 | 1 | 150 | 0.357 | ||||
| K | AAA | 4,129 | 1.595 | 5 | 1,262 | 1.195 | 1 | 1,551 | 1.364 | 1 |
| K | AAG | 1,050 | 0.406 | | 851 | 0.805 | 1 | 723 | 0.636 | 1 |
| L | CUA | 54 | 0.033 | 1 | 233 | 0.745 | 1 | 544 | 1.335 | 1 |
| L | CUG | 5,698 | 3.427 | 3 | 318 | 1.017 | | 433 | 1.063 | |
| L | UUA | 210 | 0.774 | 1 | 718 | 1.453 | 1 | |||
| L | UUG | 333 | 1.227 | 1 | 270 | 0.547 | ||||
| P | CCA | 474 | 0.564 | 1 | 408 | 2.032 | 1 | 428 | 1.558 | 1 |
| P | CCG | 2,509 | 2.983 | 1 | 62 | 0.309 | | 154 | 0.561 | |
| Q | CAA | 550 | 0.355 | 2 | 481 | 1.058 | 1 | 593 | 1.06 | 1 |
| Q | CAG | 2,548 | 1.645 | 2 | 428 | 0.942 | 1 | 526 | 0.94 | 1 |
| R | AGA | 21 | 1.235 | 8 | 438 | 1.581 | 1 | 317 | 1.461 | 1 |
| R | AGG | 13 | 0.765 | 1 | 116 | 0.419 | | 117 | 0.539 | |
| S | UCA | 189 | 0.261 | 1 | 498 | 1.64 | 1 | |||
| S | UCG | 275 | 0.380 | 1 | 38 | 0.125 | ||||
| T | ACA | 181 | 0.160 | 1 | 447 | 1.002 | 1 | |||
| T | ACG | 526 | 0.465 | 1 | 164 | 0.368 | ||||
| V | GUA | 1,329 | 0.805 | 5 | 765 | 1.508 | 1 | |||
| V | GUG | 1,784 | 1.080 | 231 | 0.455 | |||||
Note.—See text for reasons of including only R-ending codons.
(a) From highly expressed E. coli genes, as compiled in the EMBOSS distribution (Rice et al. 2000).
Mean and Distribution of rRSCU Values for Various dsDNA and ssDNA Phage Families.
| Type | Phage Family | N | Minimum | Maximum | Average | SD |
|---|---|---|---|---|---|---|
| dsDNA | Myoviridae | 9 | 0.3437 | 0.9207 | 0.6953 | 0.2359 |
| | Podoviridae | 12 | 0.2553 | 0.8034 | 0.4216 | 0.1859 |
| | Siphoviridae | 16 | 0.2412 | 0.8955 | 0.6600 | 0.2355 |
| | Tectiviridae | 1 | 0.6084 | 0.6084 | 0.6084 | NA |
| ssDNA | Inoviridae | 4 | 0.2700 | 0.3922 | 0.3449 | 0.0524 |
| | Microviridae | 7 | 0.2757 | 0.3709 | 0.3173 | 0.0409 |
Note.—NA, not applicable.
Contrasting rRSCU Values for R-Ending Codons and for Y-Ending Codons (designated by rRSCU.R and rRSCU.Y, respectively).
| Family | ACCN | rRSCU.R | rRSCU.Y |
|---|---|---|---|
| Microviridae | NC_001330 | 0.6504 | 0.0854 |
| Microviridae | NC_001420 | 0.4530 | 0.0332 |
| Microviridae | NC_007856 | 0.4652 | 0.0447 |
| Microviridae | NC_007817 | 0.4168 | 0.0200 |
| Microviridae | NC_001422 | 0.4497 | 0.0843 |
| Microviridae | NC_012868 | 0.6009 | 0.1118 |
| Microviridae | NC_007821 | 0.6030 | 0.1158 |
| Inoviridae | NC_001332 | 0.5475 | 0.1709 |
| Inoviridae | NC_001954 | 0.4753 | 0.2154 |
| Inoviridae | NC_002014 | 0.5892 | 0.2105 |
| Inoviridae | NC_003287 | 0.4876 | 0.0894 |
| Mean | | 0.5217 | 0.1074 |
Effect of Life Cycle of dsDNA Phages on Codon Usage Concordance between Phage and Host, Measured by rRSCU.
| PhageFam | PhageName | Accession | LifeCycle | rRSCU |
|---|---|---|---|---|
| Myoviridae | Enterobacteria phage Mu | NC_000929 | Temperate | 0.9207 |
| Myoviridae | Enterobacteria phage P2 | NC_001895 | Temperate | 0.9011 |
| Myoviridae | Enterobacteria phage P4 | NC_001609 | Temperate | 0.8287 |
| Myoviridae | Enterobacteria phage SfV | NC_003444 | Temperate | 0.8750 |
| Myoviridae | Escherichia phage D108 | NC_013594 | Temperate | 0.9207 |
| Myoviridae | Enterobacteria phage JSE | NC_012740 | Virulent | 0.4789 |
| Myoviridae | Enterobacteria phage Phi1 | NC_009821 | Virulent | 0.4971 |
| Myoviridae | Enterobacteria phage phiEcoM-GJ1 | NC_010106 | Virulent | 0.3437 |
| Myoviridae | Enterobacteria phage RB49 | NC_005066 | Virulent | 0.4917 |
| Podoviridae | Escherichia phage phiV10 | NC_007804 | Temperate | 0.7308 |
| Podoviridae | Stx2 converting phage I | NC_003525 | Temperate | 0.8034 |
| Podoviridae | Enterobacteria phage 13a | NC_011045 | Virulent | 0.3181 |
| Podoviridae | Enterobacteria phage EcoDS1 | NC_011042 | Virulent | 0.4021 |
| Podoviridae | Enterobacteria phage K1-5 | NC_008152 | Virulent | 0.2629 |
| Podoviridae | Enterobacteria phage K1E | NC_007637 | Virulent | 0.2553 |
| Podoviridae | Enterobacteria phage K1F | NC_007456 | Virulent | 0.2553 |
| Podoviridae | Enterobacteria phage N4 | NC_008720 | Virulent | 0.2661 |
| Podoviridae | Enterobacteria phage T3 | NC_003298 | Virulent | 0.5306 |
| Podoviridae | Enterobacteria phage T7 | NC_001604 | Virulent | 0.3274 |
| Podoviridae | Enterobacteria phage BA14 | NC_011040 | Virulent | 0.4504 |
| Siphoviridae | Enterobacteria phage BP-4795 | NC_004813 | Temperate | 0.8049 |
| Siphoviridae | Enterobacteria phage cdtI | NC_009514 | Temperate | 0.8307 |
| Siphoviridae | Enterobacteria phage HK022 | NC_002166 | Temperate | 0.7416 |
| Siphoviridae | Enterobacteria phage HK97 | NC_002167 | Temperate | 0.7303 |
| Siphoviridae | Enterobacteria phage lambda | NC_001416 | Temperate | 0.8520 |
| Siphoviridae | Enterobacteria phage N15 | NC_001901 | Temperate | 0.8955 |
| Siphoviridae | Escherichia Stx1 converting bacteriophage | NC_004913 | Temperate | 0.8108 |
| Siphoviridae | Stx2-converting phage 1717 | NC_011357 | Temperate | 0.8335 |
| Siphoviridae | Enterobacteria phage SSL-2009a | NC_012223 | Temperate | 0.7853 |
| Siphoviridae | Enterobacteria phage EPS7 | NC_010583 | Virulent | 0.2583 |
| Siphoviridae | Enterobacteria phage JK06 | NC_007291 | Virulent | 0.2565 |
| Siphoviridae | Enterobacteria phage RTP | NC_007603 | Virulent | 0.2412 |
| Siphoviridae | Enterobacteria phage T1 | NC_005833 | Virulent | 0.4637 |
| Siphoviridae | Enterobacteria phage TLS | NC_009540 | Virulent | 0.4734 |
Note.—The phages are organized by phage families (PhageFam) and then by life cycle (LifeCycle: temperate or virulent) within each phage family.
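The life-cycle contrast in this table can be summarized with a few lines of code. The sketch below uses only the nine Myoviridae rows. The variable names are illustrative, and the use of the sample standard deviation (n - 1 denominator) is an assumption, although it does appear to reproduce the Myoviridae row of the family summary table (average 0.6953, SD 0.2359).

```python
from statistics import mean, stdev

# (phage, life cycle, rRSCU) for the Myoviridae rows of the table above.
MYOVIRIDAE = [
    ("Mu", "Temperate", 0.9207),
    ("P2", "Temperate", 0.9011),
    ("P4", "Temperate", 0.8287),
    ("SfV", "Temperate", 0.8750),
    ("D108", "Temperate", 0.9207),
    ("JSE", "Virulent", 0.4789),
    ("Phi1", "Virulent", 0.4971),
    ("phiEcoM-GJ1", "Virulent", 0.3437),
    ("RB49", "Virulent", 0.4917),
]

# Family-wide summary, comparable to the per-family summary table.
all_values = [r for _, _, r in MYOVIRIDAE]
overall_mean = mean(all_values)
overall_sd = stdev(all_values)  # sample SD, n - 1 denominator (assumed)

# Group rRSCU values by life cycle and average each group.
by_cycle = {}
for _, cycle, r in MYOVIRIDAE:
    by_cycle.setdefault(cycle, []).append(r)
cycle_means = {cycle: mean(values) for cycle, values in by_cycle.items()}

print(f"Myoviridae: mean={overall_mean:.4f}, SD={overall_sd:.4f}")
for cycle, m in cycle_means.items():
    print(f"  {cycle}: mean rRSCU={m:.4f} (n={len(by_cycle[cycle])})")
```

Within this family the temperate phages average roughly 0.89 and the virulent phages roughly 0.45, which is the pattern the table is meant to show.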
Fig. Positive association between SKEWTC, defined as (NT - NC)/(NT + NC), where Ni is the number of nucleotide i in a phage genome, and F4, the percentage of codons in four codon families (Gly, Arg4, Ser4, and Val) in which highly expressed E. coli genes prefer U-ending codons over C-ending codons. Results are from 11 ssDNA E. coli phages. Because U-rich codons will increase, and C-rich codons decrease, with increasing C→T mutation bias, only the Gly codon family should be used for testing the predicted positive correlation; doing so gives r = 0.6837 and P = 0.02036.
Fig. Usage of UUN codons increases, and usage of CCN codons decreases, with C→T mutation bias measured by TC skew at the third codon position (SKEWTC3), but to different extents.
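The two skew measures used in these captions follow directly from the definition (NT - NC)/(NT + NC). The sketch below is a minimal illustration: the function names `tc_skew` and `tc_skew3` and the toy sequence are hypothetical, and the third-position variant assumes an in-frame coding sequence whose length is a multiple of three.

```python
def tc_skew(seq):
    """TC skew, (NT - NC) / (NT + NC), of a nucleotide sequence.

    RNA-style input is accepted by treating U as T.
    """
    seq = seq.upper().replace("U", "T")
    n_t, n_c = seq.count("T"), seq.count("C")
    if n_t + n_c == 0:
        raise ValueError("sequence contains no T or C")
    return (n_t - n_c) / (n_t + n_c)


def tc_skew3(cds):
    """TC skew restricted to third codon positions of an in-frame CDS."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return tc_skew(cds[2::3])


# Toy coding sequence (hypothetical): codons GGT GGC GGT TTT.
toy_cds = "GGTGGCGGTTTT"
skew_all = tc_skew(toy_cds)     # 5 T and 1 C over the whole sequence
skew_third = tc_skew3(toy_cds)  # third positions are T, C, T, T
print(skew_all, skew_third)
```

A skew of 0 means T and C are equally frequent, and values approaching 1 indicate the excess of T over C expected under strong C→T mutation bias.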