Salvatore Camiolo, Gaurav Sablok, Andrea Porceddu.
Abstract
Mistranslation errors compromise fitness by wasting resources on nonfunctional proteins. To reduce the cost of mistranslation, natural selection favors the most accurately translated codons at sites that are particularly important for protein structure and function. We investigated the determinants underlying selection for translational accuracy in several species of plants belonging to three clades: Brassicaceae, Fabidae, and Poaceae. Although signatures of translational selection were found in genes from a wide range of species, the underlying factors varied in nature and intensity. Indeed, the degree of synonymous codon bias at evolutionarily conserved sites varied among plant clades while remaining uniform within each clade. This is unlikely to reflect solely the diversity of tRNA pools because there is little correlation between synonymous codon bias and tRNA abundance, so other factors must affect codon choice and translational accuracy in plant genes. Accordingly, synonymous codon choice at a given site was affected not only by the selection pressure at that site, but also by its participation in protein domains or mRNA secondary structures. Although these effects were detected in all the species we analyzed, their impact on translational accuracy differed between evolutionarily distant plant clades. The domain effect was found to enhance translational accuracy in dicot and monocot genes with a high GC content, but to oppose the selection of more accurate codons in monocot genes with a low GC content.
Keywords: RNA folding; codon bias; protein domains; translational accuracy
Year: 2017 PMID: 28533334 PMCID: PMC5499143 DOI: 10.1534/g3.117.040626
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Phylogenetic trees corresponding to the clades we analyzed: Arabidopsis thaliana (AT), Arabidopsis lyrata (AL), Brassica rapa (BR), Capsella rubella (CR), Eutrema salsugineum (ES), Fragaria vesca (FV), Glycine max (GM), Medicago truncatula (MT), Prunus persica (PP), Phaseolus vulgaris (PV), Brachypodium distachyon (BD), Oryza sativa (OS), Sorghum bicolor (SB), and Zea mays (ZM).
Figure 2. (A) Heat map showing the odds ratios for each codon in each species based on Seforta codon enrichment analysis (the average of all pairwise comparisons within each clade). (B) Heat map showing the tRNA-RSCU values in each species.
Average GC3 content of optimal and accurate codons in four species belonging to the fabids, malvids, and monocots
| Species | Average GC3 (optimal codons) | Average GC3 (accurate codons) |
|---|---|---|
| | 67.9 | 55.2 |
| | 70.8 | 44.8 |
| | 73.7 | 100.0 |
| | 40.0 | 51.9 |
| | 17.7 | 95.5 |
| | 6.3 | 50.0 |
Figure 3. Average GC3, average GC, and percentage of conserved sites within different portions of the transcripts in A. thaliana, M. truncatula, and O. sativa. Position axes refer to the number of sliding windows analyzed (window step = 3 nt/1 amino acid; window size = 9 nt/3 amino acids). Average GC3 and GC values are calculated at a nucleotide level, whereas the percentage of conserved sites refers to the protein coordinates. Red represents the domain regions and blue represents the nondomain regions.
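The sliding-window scheme described in the Figure 3 caption (window size 9 nt, step 3 nt) can be illustrated with a minimal sketch. The function name `sliding_gc` and the assumption that the input is an in-frame coding sequence starting at codon position 1 are illustrative choices, not details taken from the paper.

```python
def sliding_gc(cds, window=9, step=3):
    """Return a list of (GC, GC3) fractions, one pair per window.

    Assumes `cds` is an in-frame coding sequence, so every third
    base of each window is a codon third position.
    """
    cds = cds.upper()
    results = []
    for start in range(0, len(cds) - window + 1, step):
        chunk = cds[start:start + window]
        # Overall GC fraction across all bases in the window.
        gc = sum(base in "GC" for base in chunk) / window
        # GC3: GC fraction restricted to codon third positions.
        third = chunk[2::3]
        gc3 = sum(base in "GC" for base in third) / len(third)
        results.append((gc, gc3))
    return results


if __name__ == "__main__":
    for gc, gc3 in sliding_gc("ATGGCCGCGTTT"):
        print(f"GC={gc:.3f}  GC3={gc3:.3f}")
```

Averaging these per-window values across genes at each window index, separately for domain and nondomain regions, would give curves of the kind the caption describes.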
Compositional statistics of four plant species belonging to the three clades (A. thaliana for malvids, M. truncatula for fabids, and O. sativa and Z. mays for monocots)
| Species | ΔGC3 (D−ND) | Δ%Conserved (D−ND) | Δ%Conserved (S−L) | Odds ratio (All) | Odds ratio (Domains) | Odds ratio (Nondomains) | Odds ratio (Stems) | Odds ratio (Loops) |
|---|---|---|---|---|---|---|---|---|
| | 2.8 | 5.29 | 0.40 | 1.11 | 1.14 | 1.09 | 1.10 | 1.14 |
| | 1.5 | 14.97 | 1.21 | 1.03 | 1.07 | 1.02 ns | 0.99 ns | 1.03 ns |
| | 7.4 | 19.27 | 0.03 ns | 1.08 | 1.01 ns | 1.06 | 1.02 ns | 1.13 |
| | 2.9 | 17.64 | 0.26 ns | 1.06 | 1.02 ns | 1.01 ns | 1.07 | 1.06 |
| | 5.1 | 18.15 | 0.28 ns | 0.79 | 0.84 | 0.79 | 0.80 | 0.78 |
| | 1.0 | 16.34 | 0.38 ns | 1.05 | 1.00 ns | 1.02 ns | 1.00 ns | 1.06 |
Odds ratios refer to the Akashi test on the genomes of each species calculated for: all, overall transcripts; domains, transcript segments covering protein domain regions; nondomains, transcript segments covering nondomain regions; stems, transcript portions containing codons with the third base in a stem region; and loops, transcript portions containing codons with the third base in a loop region. D, codons in protein domain regions; ND, codons in protein nondomain regions; S, codons with the third base in a stem region; L, codons with the third base in a loop region; ns, nonsignificant.
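As a rough illustration of how a pooled odds ratio of this kind can be obtained, the Akashi test is commonly implemented with the Mantel-Haenszel procedure over many 2×2 tables (for example, one per gene and amino acid), each cross-classifying sites as conserved or variable and codons as preferred or unpreferred. The sketch below assumes that formulation; the function name and the table layout `(a, b, c, d)` are hypothetical and not taken from the paper.

```python
def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio over stratified 2x2 tables.

    Each table is a tuple (a, b, c, d), assumed here to mean:
      a = conserved sites with a preferred codon
      b = conserved sites with an unpreferred codon
      c = variable sites with a preferred codon
      d = variable sites with an unpreferred codon
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio undefined: no discordant counts")
    return numerator / denominator


if __name__ == "__main__":
    strata = [(10, 5, 5, 10), (8, 4, 4, 8)]
    print(mantel_haenszel_odds_ratio(strata))
```

Under this layout, a value above 1 indicates that preferred codons are enriched at conserved sites and a value below 1 indicates depletion. Significance would be assessed separately, for instance with the Mantel-Haenszel chi-square statistic.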
Figure 4. Variation of Fop and translational accuracy along the transcript. Each value represents an adjacent window containing five amino acids. The corresponding codons were pooled from all the genes to form a supersequence, which has been used for the calculations.
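The positional pooling described in the Figure 4 caption can be sketched as follows: codons falling in the same five-codon window are pooled across genes, and the frequency of optimal codons (Fop) is computed per window. This is a minimal sketch under stated assumptions: the set of optimal codons is supplied by the caller, Met, Trp, and stop codons are excluded as in the usual Fop definition, and the function name `fop_by_window` is hypothetical.

```python
def fop_by_window(cds_list, optimal_codons, window_codons=5):
    """Fop per adjacent window, pooling codons across all genes.

    `optimal_codons` is a caller-supplied set (an assumption here).
    Returns one value per window index; None if a window has no
    informative codons.
    """
    # Codons without synonymous alternatives, plus stop codons.
    excluded = {"ATG", "TGG", "TAA", "TAG", "TGA"}
    counts = []  # counts[w] = [optimal, total] for window w
    for cds in cds_list:
        cds = cds.upper()
        codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
        for idx, codon in enumerate(codons):
            if codon in excluded:
                continue
            w = idx // window_codons
            while len(counts) <= w:
                counts.append([0, 0])
            counts[w][1] += 1
            if codon in optimal_codons:
                counts[w][0] += 1
    return [opt / tot if tot else None for opt, tot in counts]


if __name__ == "__main__":
    genes = ["GCTGCCGCTGCCGCTGCC", "GCTGCT"]
    print(fop_by_window(genes, {"GCT"}))
```

Pooling before dividing, rather than averaging per-gene ratios, keeps short windows with few informative codons from dominating the curve.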