Abstract
Two alternative hypotheses attribute different benefits to codon-anticodon adaptation. The first assumes that protein production is rate limited by both initiation and elongation and that codon-anticodon adaptation would result in higher elongation efficiency and more efficient and accurate protein production, especially for highly expressed genes. The second claims that protein production is rate limited only by initiation efficiency but that improved codon adaptation and, consequently, increased elongation efficiency have the benefit of increasing ribosomal availability for global translation. To test these hypotheses, a recent study engineered a synthetic library of 154 genes, all encoding the same protein but differing in degrees of codon adaptation, to quantify the effect of differential codon adaptation on protein production in Escherichia coli. The surprising conclusion that "codon bias did not correlate with gene expression" and that "translation initiation, not elongation, is rate-limiting for gene expression" contradicts the conclusion reached by many other empirical studies. In this paper, I resolve the contradiction by reanalyzing the data from the 154 sequences. I demonstrate that translation elongation accounts for about 17% of total variation in protein production and that the previous conclusion is due to the use of a codon adaptation index (CAI) that does not account for the mutation bias in characterizing codon adaptation. The effect of translation elongation becomes undetectable only when translation initiation is unrealistically slow. A new index of translation elongation, ITE, is formulated to facilitate studies on the efficiency and evolution of the translation machinery.
Keywords: codon usage bias; codon-anticodon adaptation; index of translation elongation; translation efficiency; translation elongation
Year: 2014 PMID: 25480780 PMCID: PMC4317663 DOI: 10.1534/genetics.114.172106
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Codon frequency (CF) for E. coli highly expressed genes (HEGs) and non-HEGs, as well as the computed Si values according to Equation 1
| AA | Codon | CFHEG | CFnon-HEG | Si |
|---|---|---|---|---|
| A | GCA | 1973 | 25,511 | 1.1495 |
| A | GCG | 2654 | 43,261 | 0.9118 |
| A | GCC | 1306 | 33,463 | 0.5646 |
| A | GCU | 2288 | 18,526 | 1.7865 |
| C | UGC | 475 | 8,397 | 1.1541 |
| C | UGU | 270 | 6,802 | 0.8098 |
| D | GAC | 2786 | 23,226 | 1.5125 |
| D | GAU | 2345 | 41,472 | 0.7130 |
| E | GAA | 4683 | 49,154 | 1.1180 |
| E | GAG | 1459 | 22,920 | 0.7470 |
| F | UUC | 2229 | 20,332 | 1.7637 |
| F | UUU | 872 | 29,556 | 0.4746 |
| G | GGA | 118 | 10,786 | 0.7282 |
| G | GGG | 267 | 14,842 | 1.1975 |
| G | GGC | 2987 | 37,418 | 0.8210 |
| G | GGU | 3583 | 30,154 | 1.2221 |
| H | CAC | 1160 | 12,144 | 1.7105 |
| H | CAU | 477 | 17,170 | 0.4975 |
| I | AUA | 22 | 5,926 | 0.0000 |
| I | AUC | 3488 | 30,787 | 1.5592 |
| I | AUU | 1640 | 39,788 | 0.5673 |
| K | AAA | 4129 | 41,696 | 1.0469 |
| K | AAG | 1050 | 13,057 | 0.8502 |
| L | CUA | 54 | 5,258 | 0.1275 |
| L | CUG | 5698 | 66,130 | 1.0694 |
| L | CUC | 541 | 14,591 | 1.2085 |
| L | CUU | 357 | 14,679 | 0.7927 |
| L | UUA | 210 | 18,739 | 0.7639 |
| L | UUG | 333 | 18,273 | 1.2422 |
| M | AUG | 2444 | 35,527 | 0.0000 |
| N | AAC | 2832 | 26,674 | 1.5850 |
| N | AAU | 539 | 23,652 | 0.3402 |
| P | CCA | 474 | 11,046 | 0.5779 |
| P | CCG | 2509 | 29,125 | 1.1601 |
| P | CCC | 38 | 7,443 | 0.2235 |
| P | CCU | 343 | 9,235 | 1.6258 |
| Q | CAA | 550 | 20,405 | 0.4975 |
| Q | CAG | 2548 | 36,780 | 1.2788 |
| R | AGA | 21 | 2,880 | 0.9782 |
| R | AGG | 13 | 1,681 | 1.0374 |
| R | CGA | 34 | 4,837 | 1.2807 |
| R | CGG | 33 | 7,370 | 0.8158 |
| R | CGC | 1530 | 28,473 | 0.6413 |
| R | CGU | 2995 | 25,528 | 1.4001 |
| S | AGC | 1015 | 20,868 | 1.3432 |
| S | AGU | 168 | 11,802 | 0.3931 |
| S | UCA | 189 | 9,614 | 0.9119 |
| S | UCG | 275 | 11,909 | 1.0711 |
| S | UCC | 1110 | 10,649 | 0.8950 |
| S | UCU | 1320 | 10,217 | 1.1094 |
| T | ACA | 181 | 9,527 | 0.7719 |
| T | ACG | 526 | 19,197 | 1.1132 |
| T | ACC | 2533 | 29,335 | 0.9108 |
| T | ACU | 1286 | 10,950 | 1.2389 |
| V | GUA | 1329 | 13,513 | 1.5053 |
| V | GUG | 1784 | 34,133 | 0.8000 |
| V | GUC | 824 | 19,972 | 0.4993 |
| V | GUU | 2669 | 22,297 | 1.4485 |
| W | UGG | 819 | 19,945 | 0.0000 |
| Y | UAC | 1569 | 15,094 | 1.5503 |
| Y | UAU | 865 | 21,207 | 0.6083 |
CFHEG: taken from the Ecoli_high.cut file distributed with EMBOSS 6.4 (Rice ), representing a compilation of codon usage from known highly expressed E. coli K12 genes.
CFnon-HEG: mean codon frequencies from four sequenced E. coli K12 genomes (NC_010473, NC_020518, NC_007779, and NC_000913) minus CFHEG.
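Equation 1 is not reproduced in this record, so the following Python sketch rests on an assumption inferred from the tabulated numbers: each Si is taken to be the codon's CFHEG/CFnon-HEG ratio divided by the pooled CFHEG/CFnon-HEG ratio of its purine-ending (A/G) or pyrimidine-ending (C/U) synonymous subfamily. The function names `ending_class`, `s_values`, and `heg_only_weights` are hypothetical. For contrast, the sketch also computes a CAI-style relative adaptiveness that uses only the HEG counts (each count divided by the family maximum) and therefore ignores the non-HEG background.

```python
# Alanine codon counts copied from the table: codon -> (CF_HEG, CF_non-HEG)
ALA = {
    "GCA": (1973, 25511),
    "GCG": (2654, 43261),
    "GCC": (1306, 33463),
    "GCU": (2288, 18526),
}


def ending_class(codon):
    """Return 'R' for purine-ending (A/G) or 'Y' for pyrimidine-ending (C/U) codons."""
    return "R" if codon[-1] in "AG" else "Y"


def s_values(counts):
    """Assumed form of Equation 1 (not confirmed by the source text).

    Each codon's HEG/non-HEG ratio is divided by the pooled HEG/non-HEG
    ratio of its R- or Y-ending subfamily.
    """
    totals = {}
    for codon, (heg, non_heg) in counts.items():
        key = ending_class(codon)
        h, n = totals.get(key, (0, 0))
        totals[key] = (h + heg, n + non_heg)
    result = {}
    for codon, (heg, non_heg) in counts.items():
        h, n = totals[ending_class(codon)]
        result[codon] = (heg / non_heg) / (h / n)
    return result


def heg_only_weights(counts):
    """CAI-style relative adaptiveness: HEG count over the family maximum."""
    top = max(heg for heg, _ in counts.values())
    return {codon: heg / top for codon, (heg, _) in counts.items()}


s = s_values(ALA)
w = heg_only_weights(ALA)
for codon in ALA:
    print(codon, round(s[codon], 4), round(w[codon], 4))
```

Under this assumption the four alanine values round to the Si column of the table. The comparison also illustrates the point made in the abstract: GCG has the highest HEG count and so receives the top HEG-only weight, yet its Si is below 1 because non-HEGs use it heavily as well, whereas GCU has the largest Si.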
Figure 1. Relationship between protein abundance (measured by GFP normalized fluorescence; data kindly provided by Dr. Plotkin) and translation elongation efficiency ITE, contrasting with that between protein abundance and CAI (codon adaptation index).
Figure 2. Ranked protein abundance rProt (protein abundance is measured by GFP normalized fluorescence; data kindly provided by Dr. Plotkin) increases with translation elongation efficiency ITE, except for the group with extraordinarily strong secondary structure at the 5′ end (the MFE1 group). rProt also increases with decreasing stability of secondary structure, with MFE1 having the most stable and MFE4 the weakest secondary structure. The range of MFE is indicated for each of the four MFE groups.