Naama Wald, Maya Alroy, Maya Botzman, Hanah Margalit.
Abstract
Synonymous codons are unevenly distributed among genes, a phenomenon termed codon usage bias. Understanding the patterns of codon bias and the forces shaping them is a major step towards elucidating the adaptive advantage codon choice can confer at the level of individual genes and organisms. Here, we perform a large-scale analysis to assess the codon usage bias pattern of pyrimidine-ending codons in highly expressed genes in prokaryotes. We find a bias pattern linked to the degeneracy of the encoded amino acid. Specifically, we show that codon-pairs that encode two- and three-fold degenerate amino acids are biased towards the C-ending codon, while codon-pairs that encode four-fold degenerate amino acids are biased towards the U-ending codon. This codon usage pattern is widespread in prokaryotes, and its strength is correlated with translational selection both within and between organisms. We show that this bias is associated with an improved correspondence with the tRNA pool, avoidance of mis-incorporation errors during translation and moderate stability of the codon–anticodon interaction, all consistent with more efficient translation.
Year: 2012 PMID: 22581775 PMCID: PMC3424539 DOI: 10.1093/nar/gks348
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. An association between the degeneracy level of amino acids and the codon bias pattern in N1N2Y3 codon-pairs. The matrix of codon-pair biases in multiple organisms (green: U-Bias, magenta: C-Bias, black: N-Bias) was clustered using a Hamming distance metric. The clustering reveals a clear distinction between codon-pairs belonging to codon families with four-fold degeneracy (yellow bar) and those belonging to two- and three-fold degenerate families (cyan bar), where the former are statistically significantly biased towards the U-ending codon and the latter are statistically significantly biased towards the C-ending codon. This distinction is most pronounced in a subset of organisms (blue bar).
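The Hamming distance mentioned in the caption can be illustrated with a minimal Python sketch. It assumes each codon-pair is represented by a vector of categorical bias calls, one per organism; the `hamming` helper, the normalization by vector length, and the toy vectors are illustrative assumptions, not the authors' implementation or data.

```python
def hamming(a, b):
    """Fraction of positions at which two equal-length bias vectors differ."""
    if len(a) != len(b):
        raise ValueError("bias vectors must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Hypothetical bias calls for three codon-pairs across five organisms
# ('U' = U-Bias, 'C' = C-Bias, 'N' = N-Bias). These are NOT data from the paper.
toy = {
    "GCY": ["U", "U", "U", "N", "U"],  # four-fold family (Ala)
    "CCY": ["U", "U", "N", "N", "U"],  # four-fold family (Pro)
    "AAY": ["C", "C", "C", "N", "C"],  # two-fold family (Asn)
}

d_four = hamming(toy["GCY"], toy["CCY"])   # differ at 1 of 5 positions
d_mixed = hamming(toy["GCY"], toy["AAY"])  # differ at 4 of 5 positions
print(d_four, d_mixed)
```

In this toy example the two four-fold codon-pairs sit closer to each other than to the two-fold codon-pair, which is the kind of separation a clustering on such distances would pick up.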
Figure 2. Organisms conforming to the fold rule demonstrate strong translational selection and fast growth. OFRS and ENC′diff were calculated as described in the ‘Materials and Methods’ section, and minimal generation times were taken from (46). (A) Association between OFRS and ENC′diff was evaluated by computing the Pearson correlation coefficient. (B) Association between OFRS and minimal generation time was evaluated by computing the Spearman correlation coefficient (since the generation time variable is not normally distributed). Organisms most compatible with the fold rule (marked by the blue bar in Figure 1) are colored dark gray and have relatively high ENC′diff and low minimal generation time values.
Figure 3. U-Bias in 4-fold codon-pairs is associated with the tRNA repertoire. Circles represent organisms with the indicated U34N35N36 (X-axis) and G34N35N36 (Y-axis) tRNA gene copy numbers. The only exception is the Arg codon-pair CGY, for which the common tRNA is A34C35G36 (where A is modified to I). Circle size represents the number of organisms with the specific tRNA gene counts. Circle color represents the direction of bias in the codon-pair (green: U-Bias; magenta: C-Bias). Circles along the diagonal indicate an equal number of gene copies of the two tRNAs. Most organisms with C-Bias have a low U34/G34 ratio.
Propensity towards C-Bias depends on the similarity between amino acids sharing the codon quartet
| | Ser(AGY)–Arg(AGR) | Phe(UUY)–Leu(UUR) | Asn(AAY)–Lys(AAR) | His(CAY)–Gln(CAR) | Asp(GAY)–Glu(GAR) |
|---|---|---|---|---|---|
| BLOSUM substitution values | −1:−2 | 0 | 0 | 1 | 1:2 |
| C-Bias | 199 | 210 | 189 | 140 | 69 |
| U-Bias | 1 | 4 | 7 | 33 | 62 |
| N-Bias | 41 | 27 | 45 | 68 | 110 |
| C/U Ratio | 199 | 52.5 | 27 | 4.2 | 1.1 |
a. The range gives the minimal and maximal substitution values observed in the BLOSUM 70, 75, 80, 85 and 90 substitution matrices.
b. Calculated from organisms with high ENC′diff values (upper 50%) for Ser, Phe, Asn, His and Asp.
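The C/U Ratio row appears to be the C-Bias count divided by the U-Bias count in each column. The following Python sketch recomputes it from the counts in the table; the dictionary layout and the name `cu_ratio` are illustrative assumptions, and rounding to one decimal place is chosen only to match the precision shown.

```python
# (C-Bias count, U-Bias count) per codon-pair, copied from the table above.
counts = {
    "Ser(AGY)": (199, 1),
    "Phe(UUY)": (210, 4),
    "Asn(AAY)": (189, 7),
    "His(CAY)": (140, 33),
    "Asp(GAY)": (69, 62),
}


def cu_ratio(c_bias, u_bias):
    """Ratio of organisms showing C-Bias to organisms showing U-Bias."""
    return c_bias / u_bias


ratios = {pair: round(cu_ratio(c, u), 1) for pair, (c, u) in counts.items()}
for pair, ratio in ratios.items():
    print(pair, ratio)
```

The ratio falls from 199 for Ser(AGY) to about 1.1 for Asp(GAY) as the BLOSUM substitution value of the amino-acid pair sharing the quartet rises, which is the trend the table title describes.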
Figure 4. Hydrogen bonds stabilizing codon–anticodon interactions [an extension of the model suggested in (61)]. In black are the standard Watson–Crick interactions formed between N1:N36 and N2:N35, representing two or three hydrogen bonds each. In light gray is the loop-stabilizing bond between U33 and N35, which occurs only when N35 is a purine. In dark gray is the interaction formed between N3:N34, which represents two hydrogen bonds for U3:G34 and three hydrogen bonds for C3:G34.
Extended stability score of N1N2Y3 codon-pairs
| Amino acid | Codon pair | Fold | N1:N36 H-bonds | N2:N35 H-bonds | N2 Y/R | C3:G34 H-bonds | U3:G34 H-bonds | N1N2C3 Extended stability score | N1N2U3 Extended stability score |
|---|---|---|---|---|---|---|---|---|---|
| Asn | AAY | 2 | 2 | 2 | 0 | 3 | 2 | 7 | 6 |
| Tyr | UAY | 2 | 2 | 2 | 0 | 3 | 2 | 7 | 6 |
| Ser | AGY | 2 | 2 | 3 | 0 | 3 | 2 | 8 | 7 |
| Ile | AUY | 3 | 2 | 2 | 1 | 3 | 2 | 8 | 7 |
| His | CAY | 2 | 3 | 2 | 0 | 3 | 2 | 8 | 7 |
| Asp | GAY | 2 | 3 | 2 | 0 | 3 | 2 | 8 | 7 |
| Cys | UGY | 2 | 2 | 3 | 0 | 3 | 2 | 8 | 7 |
| Phe | UUY | 2 | 2 | 2 | 1 | 3 | 2 | 8 | 7 |
| Arg | CGY | 4 | 3 | 3 | 0 | 2(+) | 2(−) | 8(+) | 8(−) |
| Thr | ACY | 4 | 2 | 3 | 1 | 3 | 2 | 9 | 8 |
| Leu | CUY | 4 | 3 | 2 | 1 | 3 | 2 | 9 | 8 |
| Gly | GGY | 4 | 3 | 3 | 0 | 3 | 2 | 9 | 8 |
| Val | GUY | 4 | 3 | 2 | 1 | 3 | 2 | 9 | 8 |
| Ser | UCY | 4 | 2 | 3 | 1 | 3 | 2 | 9 | 8 |
| Pro | CCY | 4 | 3 | 3 | 1 | 3 | 2 | 10 | 9 |
| Ala | GCY | 4 | 3 | 3 | 1 | 3 | 2 | 10 | 9 |
a. Marks the extended stability score of the preferred codon according to the observed fold rule.
b. The Arg-CGY codon-pair is read by the ACG (modified to ICG) tRNA. Since bonds formed within an I:C base pair are stronger than within an I:U pair, the relevant hydrogen bonds and the extended stability scores are marked with (+) and (−), respectively.
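The scores in the table are consistent with summing four terms: the N1:N36 hydrogen bonds, the N2:N35 hydrogen bonds, one extra bond when N2 is a pyrimidine (so that N35 is a purine), and the N3:G34 hydrogen bonds. A minimal Python sketch under that reading follows; the function name and lookup tables are illustrative assumptions, and the Arg CGY pair is deliberately rejected because it is read by an I34 tRNA rather than a G34 tRNA.

```python
# Watson-Crick hydrogen bonds by codon base: A:U pairs form 2, G:C pairs form 3.
WC_HBONDS = {"A": 2, "U": 2, "G": 3, "C": 3}
# Hydrogen bonds between the third codon base and G34 (per the Figure 4 caption).
N3_G34_HBONDS = {"C": 3, "U": 2}
PYRIMIDINES = {"C", "U"}


def extended_stability(codon):
    """Extended stability score of an N1N2Y3 codon read by a G34 tRNA."""
    n1, n2, n3 = codon
    if n3 not in N3_G34_HBONDS:
        raise ValueError("only pyrimidine-ending (Y3) codons are covered")
    if n1 + n2 == "CG":
        # Arg CGY is read by ICG, so the G34 assumption does not hold.
        raise ValueError("CGY is read by an I34 tRNA; not covered by this sketch")
    # The U33:N35 loop-stabilizing bond exists only when N35 is a purine,
    # i.e. when the codon's second base N2 is a pyrimidine.
    loop_bond = 1 if n2 in PYRIMIDINES else 0
    return WC_HBONDS[n1] + WC_HBONDS[n2] + loop_bond + N3_G34_HBONDS[n3]


for codon in ("AAC", "AAU", "UUC", "ACU", "GCC", "GCU"):
    print(codon, extended_stability(codon))
```

For example, `extended_stability("AAC")` gives 7 and `extended_stability("GCU")` gives 9, matching the Asn and Ala rows.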
Summary of potential forces affecting codon bias in N1N2Y3 codon-pairs
| Amino acid | Codon pair | Preferred codon | Watson–Crick | GC | tRNA pool | Error minimization | Stability | Extended stability |
|---|---|---|---|---|---|---|---|---|
| Asn | AAY | C | + | − − | NR | + | + | + |
| Tyr | UAY | C | + | − − | NR | NR | + | + |
| Ser | AGY | C | + | − | NR | + | NR | + |
| Ile | AUY | C | + | − | NR | NR | + | + |
| His | CAY | C | + | − − | NR | + | NR | + |
| Asp | GAY | C | + | − − | NR | + | NR | + |
| Cys | UGY | C | + | + | NR | NR | NR | + |
| Phe | UUY | C | + | − | NR | + | + | + |
| Arg | CGY | U | NR | − | − | NR | + | + |
| Thr | ACY | U | − | ++ | + | NR | NR | + |
| Leu | CUY | C/U | NR | ++ | + | NR | NR | NR |
| Gly | GGY | U | − | − | ++ | NR | + | + |
| Val | GUY | U | − | + | + | NR | NR | + |
| Ser | UCY | U | − | ++ | ++ | NR | NR | + |
| Pro | CCY | U | − | + | ++ | NR | + | + |
| Ala | GCY | U | − | + | + | NR | + | + |
a. Codon preference is determined by the bias direction observed in organisms with high ENC′diff, as seen in Supplementary Figure S2.
b. Agreement with a perfect Watson–Crick base pairing: (+) the preferred codon forms a perfect Watson–Crick pairing with the available tRNA, (−) the preferred codon forms a wobble pairing with the tRNA, (NR) both codons form wobble pairing (due to A to I modification) or there is no preferred codon.
c. GC effect on bias direction as seen in Supplementary Figure S5: (−) no effect, (− −) opposite effect, (+) weak effect, (++) strong effect.
d. tRNA pool size effect on bias direction: (−) no effect, (+) weak effect, (++) strong effect, (NR) not relevant.
e. Indication of error minimization: (+) observed, (NR) not relevant.
f. Agreement with Grosjean’s stability model (56): (+) the preferred codon forms a moderate interaction with the tRNA compared to its synonym, (NR) not relevant to SWY and WSY codon-pairs.
g. Agreement with the Extended Stability model: (+) the preferred codon forms a moderate interaction with the tRNA compared to its synonym, (NR) there is no distinctly preferred codon.