Andrea Porceddu, Sara Zenoni, Salvatore Camiolo.
Abstract
Little is known about the natural selection of synonymous codons within the coding sequences of plant genes. We analyzed the distribution of synonymous codons within plant coding sequences and found that preferred codons tend to encode the more conserved and functionally important residues of plant proteins. This was consistent among several synonymous codon families and applied to genes with different expression profiles and functions. Most of the randomly chosen alternative sets of codons scored weaker associations than the actual sets of preferred codons, suggesting that codon position within plant genes and codon usage bias have coevolved to maximize translational accuracy. All these findings are consistent with the mistranslation-induced protein misfolding theory, which predicts that highly preferred codons are selected more frequently at sites where translation errors could compromise protein folding or functionality. Our results provide important insight for future studies of protein folding, molecular evolution, and transgene design for optimal expression.
Keywords: coding sequences evolution; codon bias; constrained sites
Year: 2013 PMID: 23695187 PMCID: PMC3698923 DOI: 10.1093/gbe/evt078
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Global Test for Translational Accuracy in Plant Genes
| Orthologous from | Alignments with | Whole Coding Sequence | | Lacking First 100 Residues | |
|---|---|---|---|---|---|
| | | Odds Ratio | | Odds Ratio | |
| | | 1.06 | 22.36 | 1.07 | 25.77 |
| | | 1.08 | 18.45 | 1.09 | 18.87 |
| | | 1.09 | 18.08 | 1.12 | 21.2 |
| | | 1.12 | 21.02 | 1.13 | 20.51 |
| | | 1.12 | 26.51 | 1.08 | 14.71 |
| | | 1.11 | 22.27 | 1.06 | 12.30 |
| | | 1.06 | 4.02 | 1.04 | 2.35 |
| | | 1.06 | 6.22 | 1.03 | 2.77 |
Note.—An odds ratio greater than 1 indicates the preferential usage of preferred codons to encode evolutionarily constrained residues. The positions of evolutionarily constrained residues in A. thaliana proteins were identified from alignments between A. thaliana and O. sativa or A. lyrata orthologs. Evolutionarily conserved residues in rice proteins were identified from alignments between O. sativa and A. thaliana or B. distachyon orthologs.
**P < 0.01.
***P < 0.001.
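The pooled odds ratios reported above can be illustrated with a minimal sketch. It assumes the Mantel-Haenszel pooling commonly used for the Akashi test, with one 2x2 table per stratum (for example, one gene and one amino acid); the tables here do not spell out the exact formula, so the function name, the stratum layout, and the counts below are hypothetical.

```python
from math import sqrt


def mantel_haenszel(tables):
    """Pool 2x2 tables into one odds ratio and a Z statistic.

    Each table is (a, b, c, d), an assumed layout:
      a = preferred codons at constrained sites
      b = preferred codons at unconstrained sites
      c = unpreferred codons at constrained sites
      d = unpreferred codons at unconstrained sites
    """
    num = den = 0.0             # Mantel-Haenszel numerator and denominator
    observed = expected = variance = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:               # a stratum this small carries no information
            continue
        num += a * d / n
        den += b * c / n
        observed += a
        # Hypergeometric mean and variance of a under independence
        expected += (a + b) * (a + c) / n
        variance += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if den == 0 or variance == 0:
        raise ValueError("tables are too sparse to estimate an odds ratio")
    return num / den, (observed - expected) / sqrt(variance)


# Hypothetical counts for two strata
odds_ratio, z = mantel_haenszel([(10, 5, 5, 10), (4, 4, 4, 4)])
print(round(odds_ratio, 3), round(z, 3))
```

An odds ratio above 1 in this sketch means preferred codons are enriched at constrained sites, which is the direction described in the note above.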
Global Test for Translational Accuracy in Plant Genes after Controlling for the Lipman-Wilbur Effect
| Orthologous from | Alignments with | Whole Coding Sequence | | Lacking First 100 Residues | |
|---|---|---|---|---|---|
| | | Odds Ratio | | Odds Ratio | |
| | | 1.17 | 16.23 | 1.08 | 15.94 |
| | | 1.18 | 23.88 | 1.11 | 13.52 |
| | | 1.45 | 8.23 | 1.29 | 4.5 |
Note.—An odds ratio greater than 1 indicates the preferential usage of preferred codons to encode evolutionarily constrained residues. Only unconstrained sites involving amino acids encoded by codons with the same silent bases and the same favored base(s) were considered. The positions of evolutionarily constrained and unconstrained residues in A. thaliana proteins were identified from alignments between A. thaliana and A. lyrata orthologs. Evolutionarily conserved residues in rice proteins were identified from alignments between O. sativa and B. distachyon orthologs.
***P < 0.001.
Selection for Translational Accuracy for Residues Included in the Functional Domains of Arabidopsis thaliana Proteins
| Algorithm | All Genes | |
|---|---|---|
| | Odds Ratio | |
| ProDom | 1.106 | 5.18 |
| PatternScan | 1.05 | 7.44 |
***P < 0.001.
Signatures of Translational Selection Are Consistent for Most Amino Acids in Plant Proteins
| Residue | | | Rice (Low GC) | | Rice (High GC) | |
|---|---|---|---|---|---|---|
| | Odds Ratio | | Odds Ratio | | Odds Ratio | |
| Ala | 1.05 | 5.97 | 1.24 | 16.09 | 1.03 | 0.85 |
| Cys | 1.20 | 9.64 | ND | ND | 1.43 | 1.37 |
| Asp | 1.01 | 1.40 | ND | ND | ND | ND |
| Glu | 1.14 | 16.21 | 1.17 | 10.70 | 1.65 | 4.70 |
| Phe | 1.03 | 3.18 | ND | ND | ND | ND |
| Gly | 0.93 | 6.54 | 1.04 | 2.20 | 0.75 | 2.84 |
| His | 1.05 | 3.35 | ND | ND | 1.19 | 1.18 |
| Ile | 0.99 | 1.23 | 0.92 | 4.68 | 1.30 | 2.46 |
| Leu | 1.13 | 16.72 | 1.21 | 12.93 | 1.05 | 1.37 |
| Asn | 1.11 | 9.72 | ND | ND | 2.14 | 5.63 |
| Pro | 0.92 | 4.86 | 1.12 | 6.10 | 1.16 | 2.82 |
| Gln | 1.31 | 23.11 | 1.24 | 10.69 | 1.69 | 3.47 |
| Arg | 1.05 | 4.19 | 1.109 | 5.76 | 1.01 | 0.230 |
| Ser | 0.95 | 4.29 | ND | ND | 1.18 | 3.88 |
| Thr | 0.93 | 7.65 | 0.984 | 0.58 | 0.96 | 0.70 |
| Val | 1.00 | 0.14 | 1.09 | 6.20 | 0.92 | 2.05( |
| Tyr | 1.13 | 8.72 | ND | ND | 0.69 | 1.80 |
| Lys | 1.11 | 12.42 | 1.16 | 9.37 | 1.36 | 2.31( |
Note.—ND, not determined. Significance level in parentheses disappears after Benjamini–Hochberg correction for multiple testing (Benjamini and Hochberg 1996).
*P < 0.05.
**P < 0.01.
***P < 0.001.
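The Benjamini–Hochberg correction mentioned in the note can be sketched as follows. This is a minimal illustration of the standard step-up procedure, not the authors' script; the function name, the `alpha` default, and the P values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test stays significant
    after controlling the false discovery rate at `alpha`."""
    m = len(p_values)
    # Indices of the P values, sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose P value satisfies p_(k) <= (k / m) * alpha
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Every test ranked at or below the cutoff is declared significant
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            significant[idx] = True
    return significant


# Hypothetical P values: 0.03 and 0.04 pass an uncorrected 0.05 threshold
# but no longer pass once the correction is applied.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

Entries that are nominally significant but fail the corrected threshold correspond to the significance levels shown in parentheses in the table.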
Signatures of Selection for Arabidopsis thaliana Genes with Different Expression Profiles (Breadth and Level of Expression)
| | Odds Ratio | |
|---|---|---|
| Expression breadth low | 1.06 | 6.36 |
| Expression breadth intermediate | 1.06 | 5.52 |
| Expression breadth high | 1.05 | 14.54 |
| Expression level low | 1.06 | 11.075 |
| Expression level intermediate | 1.05 | 7.32 |
| Expression level high | 1.04 | 6.75 |
**P < 0.01.
***P < 0.001.
Codon Preferences and Translational Accuracy in Arabidopsis thaliana and Rice Genes (Low-GC Data Set)
| Amino Acid | Codon | | | | |
|---|---|---|---|---|---|
| | | Preferentiality (High vs. Low) | Accuracy (Akashi Test) | Preferentiality (High vs. Low) | Accuracy (Akashi Test) |
| Ala | GCA | 0.73 | 0.97 | 0.88** | 1.24** |
| | GCC | 1.23 | 0.99 | 0.86* | 0.79** |
| | GCG | 0.94 | 0.94 | 0.95 | 0.66 |
| | GCT | 1.19 | 1.05* | 1.24 | 1.25** |
| Cys | TGC | 1.13* | 1.20** | 1.09 | 1.16* |
| | TGT | 0.88* | 0.82** | 0.92 | 0.86* |
| Asp | GAC | 1.35 | 1.01 | 1.01 | 0.98 |
| | GAT | 0.74 | 0.99 | 0.99 | 1.02 |
| Glu | GAA | 0.71 | 0.87 | 0.88** | 0.85** |
| | GAG | 1.41 | 1.14 | 1.13** | 1.17** |
| Phe | TTC | 1.60 | 1.03 | 1.14* | 1.08 |
| | TTT | 0.62 | 0.96 | 0.87* | 0.92 |
| Gly | GGA | 1.06* | 1.05* | 0.93 | 1.19** |
| | GGC | 0.79 | 0.93* | 0.90 | 0.78** |
| | GGG | 0.73 | 1.08* | 0.97 | 1.03 |
| | GGT | 1.27 | 0.93* | 1.18 | 1.04 |
| His | CAC | 1.57 | 1.05 | 0.91 | 0.97 |
| | CAT | 0.64 | 0.95 | 1.10 | 1.03 |
| Ile | ATA | 0.50 | 1.05* | 0.80 | 1.02 |
| | ATC | 1.58 | 0.98 | 1.07 | 1.07 |
| | ATT | 1.06* | 0.97 | 1.14* | 0.92 |
| Leu | CTA | 0.78 | 1.03 | 0.86* | 1.05 |
| | CTC | 1.37 | 1.05* | 0.93 | 0.91* |
| | CTG | 0.92* | 1.07* | 0.96 | 1.00 |
| | CTT | 1.20 | 1.12** | 1.29 | 1.22** |
| | TTA | 0.61 | 0.84 | 0.83** | 0.89* |
| | TTG | 1.03 | 0.90** | 0.99 | 0.91* |
| Asn | AAC | 1.56 | 1.11** | 1.02 | 1.17* |
| | AAT | 0.64 | 0.89** | 0.98 | 0.85* |
| Pro | CCA | 0.96 | 0.99 | 0.96 | 1.19** |
| | CCC | 1.08 | 0.92* | 0.92 | 0.89* |
| | CCG | 0.91* | 0.98 | 0.81* | 0.69** |
| | CCT | 1.05 | 1.04 | 1.16** | 1.12* |
| Gln | CAA | 0.78 | 0.76 | 0.82 | 0.81** |
| | CAG | 1.29 | 1.31 | 1.22 | 1.24** |
| Arg | AGA | 0.79 | 0.84 | 0.81 | 0.90* |
| | AGG | 1.22 | 1.05 | 1.01 | 0.90* |
| | CGA | 0.75 | 1.05 | 0.97 | 1.26* |
| | CGC | 1.13* | 1.07 | 1.04 | 0.93 |
| | CGG | 0.64 | 1.11* | 0.94 | 1.02 |
| | CGT | 1.64 | 1.09* | 1.46 | 1.33** |
| Ser | AGC | 1.03 | 1.00 | 0.99 | 0.84** |
| | AGT | 0.87 | 0.86 | 0.98 | 0.79** |
| | TCA | 0.86 | 1.11** | 0.95 | 1.19** |
| | TCC | 1.19 | 0.95 | 1.01 | 1.09* |
| | TCG | 1.06 | 1.04 | 1.24 | 1.03 |
| | TCT | 1.09 | 1.00 | 1.00 | 1.05 |
| Thr | ACA | 0.73 | 1.15** | 0.95 | 1.14* |
| | ACC | 1.48 | 0.88** | 1.01 | 0.89* |
| | ACG | 0.77 | 0.90* | 0.87 | 0.94 |
| | ACT | 1.14 | 1.00 | 1.09 | 0.99 |
| Val | GTA | 0.54 | 0.83 | 0.80 | 0.86* |
| | GTC | 1.29 | 1.00 | 1.00 | 1.02 |
| | GTG | 1.06* | 1.08* | 0.92 | 0.98 |
| | GTT | 1.11 | 1.03 | 1.21 | 1.10* |
| Tyr | TAC | 1.76 | 1.13** | 1.08 | 1.12 |
| | TAT | 0.57 | 0.88** | 0.93 | 0.89 |
| Lys | AAA | 0.70 | 0.89** | 0.90* | 0.86** |
| | AAG | 1.42 | 1.11** | 1.11* | 1.16** |
Note.—Columns list the odds ratios for preferential synonymous codon usage in the 10% of genes with the highest and lowest expression levels, based on AT40 (Schmid et al. 2005) and OS5 (Jain et al. 2007) for Arabidopsis and rice, respectively.
***P < 0.001; **P < 0.01; *P < 0.05 (after Benjamini–Hochberg correction for multiple testing [Benjamini and Hochberg 1996]).
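One plausible reading of the preferentiality column is an odds ratio comparing how often a codon is used, relative to its synonyms, in highly versus lowly expressed genes. The sketch below follows that assumption; the function names, the input layout (in-frame coding sequences as strings), and the toy sequences are hypothetical and not taken from the paper.

```python
from collections import Counter


def codon_counts(cds_list):
    """Count codons over a list of in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        seq = cds.upper()
        # Step through complete codons only; ignore a trailing partial codon
        for i in range(0, len(seq) - len(seq) % 3, 3):
            counts[seq[i:i + 3]] += 1
    return counts


def preferentiality(codon, family, high_cds, low_cds):
    """Odds ratio of using `codon` rather than its synonyms in `family`,
    in highly expressed versus lowly expressed genes (assumed definition)."""
    high = codon_counts(high_cds)
    low = codon_counts(low_cds)
    a = high[codon]                                  # codon, high set
    b = sum(high[c] for c in family if c != codon)   # synonyms, high set
    c_low = low[codon]                               # codon, low set
    d = sum(low[c] for c in family if c != codon)    # synonyms, low set
    if b == 0 or c_low == 0:
        raise ValueError("counts are too sparse to form an odds ratio")
    return (a * d) / (b * c_low)


# Toy example for the Glu family (GAA, GAG)
high_set = ["GAGGAGGAGGAA"]   # GAG x3, GAA x1
low_set = ["GAGGAAGAAGAA"]    # GAG x1, GAA x3
print(preferentiality("GAG", ["GAA", "GAG"], high_set, low_set))
```

Under this definition, a value above 1 means the codon is used more often in highly expressed genes than in lowly expressed ones.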