Abstract
The tRNA adaptation index (tAI) is a widely used measure of the efficiency by which a coding sequence is recognized by the intra-cellular tRNA pool. Among other components, this index includes weights that represent wobble interactions between codons and tRNA molecules. Currently, these weights are based only on gene expression in Saccharomyces cerevisiae. However, the efficiencies of the different codon-tRNA interactions are expected to vary among organisms. In this study, we suggest a new approach for adjusting the tAI weights to any target model organism without the need for gene expression measurements. Our method is based on optimizing the correlation between the tAI and a measure of codon usage bias. Here, we show that in non-fungal organisms the new tAI weights predict protein abundance significantly better than the traditional tAI weights. The unique tRNA-codon adaptation weights computed for 100 different organisms exhibit a significant correlation with evolutionary distance. The reported results demonstrate the usefulness of the new measure in future genomic studies.
Keywords: codon usage bias; protein levels; ribosome; tRNA adaptation index; wobble interactions
Year: 2014 PMID: 24906480 PMCID: PMC4195497 DOI: 10.1093/dnares/dsu017
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Crick's wobble rules for calculating W
| Codon third position | Anticodon first position | |||
|---|---|---|---|---|
| U | I | |||
| C | G | |||
| A | U | |||
| G | C |
The W values are calculated based on Equation (1). The 64 codons of the genetic code are clustered into 16 groups of four codons each. The four codons in each group differ only in their third position (the wobble position). The formulas for calculating the W values for each of the four codons in a group are given in the table. i denotes the index of the codon in the quartet that ends with U; i + 1, i + 2, and i + 3 denote the three other codons, which end with C, A, and G, respectively. j denotes the index of the tRNA whose anticodon starts with I; all base pairings between the ith codon and the jth anticodon are Watson-Crick (WC). j + 1, j + 2, and j + 3 denote the three other tRNAs, whose anticodons start with G, U, and C, respectively. tGCN represents the tRNA gene copy number corresponding to the interaction between the ith codon and the jth tRNA. For each codon, W sums over all tRNAs that can pair with the codon. For example, the GCU codon, which ends with U, can either pair with anticodons that start with I (IGC), generating a standard WC base pairing, or pair with anticodons that start with G (GGC), generating a wobble base pairing.
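The per-codon sum described above can be sketched in a few lines of Python. Equation (1) is not reproduced in this record, so the sketch assumes the standard tAI form, W_i = Σ_j (1 − s_ij) · tGCN_ij, with s = 0 for WC pairs. The wobble s-values are the Eukarya means from the table of mean inferred wobble S-values; the gene copy numbers and the names `PAIRING`, `S`, and `quartet_w` are hypothetical, and lysidine pairing is left out for brevity.

```python
# Sketch of the W calculation for one codon quartet, assuming
#   W_i = sum over pairing tRNAs j of (1 - s_ij) * tGCN_ij
# Names and gene copy numbers are illustrative, not taken from the study.

# Anticodon first bases that can read each codon third base:
# first entry is the WC partner, second entry is the wobble partner.
PAIRING = {
    "U": ["I", "G"],  # I:U is WC, G:U is wobble
    "C": ["G", "I"],  # G:C is WC, I:C is wobble
    "A": ["U", "I"],  # U:A is WC, I:A is wobble
    "G": ["C", "U"],  # C:G is WC, U:G is wobble
}

# s-values keyed by (anticodon first base, codon third base).
# WC pairs are assumed to carry no penalty (s = 0); wobble values are
# the Eukarya means reported in the mean S-values table.
S = {
    ("I", "U"): 0.0, ("G", "C"): 0.0, ("U", "A"): 0.0, ("C", "G"): 0.0,
    ("G", "U"): 0.7861, ("I", "C"): 0.4659,
    ("I", "A"): 0.9075, ("U", "G"): 0.6295,
}


def quartet_w(tgcn):
    """Return W for the four codons of a quartet, keyed by third base.

    tgcn maps an anticodon first base (I, G, U, C) to its tRNA gene
    copy number; missing bases count as zero copies.
    """
    w = {}
    for codon_base, anticodon_bases in PAIRING.items():
        w[codon_base] = sum(
            (1.0 - S[(a, codon_base)]) * tgcn.get(a, 0)
            for a in anticodon_bases
        )
    return w


# Hypothetical quartet with 11 copies of the I-starting tRNA and 5 of the
# U-starting tRNA (for example, IGC and UGC anticodons reading GCN codons).
example_w = quartet_w({"I": 11, "U": 5})
```

With these inputs the U-ending codon is read only through its WC partner, while the C-ending codon has no WC partner in the pool and relies entirely on the penalized I:C wobble interaction.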
The different base-pairings
| I | G | U | C | A | L | |
|---|---|---|---|---|---|---|
| I | — | — | — | |||
| G | — | — | — | — | ||
| U | — | — | — | — | ||
| C | — | — | — | — | — | |
| A | — | — | — | — | — | — |
| L | — | — | — | — | — | |
S-values are assigned to the pairing between the first position of the jth anticodon (tRNA) and the third position of the ith codon. S-values of WC base pairs are shown in italics; wobble values are shown in bold. Interactions that are not included in the calculation of the tAI are marked with hyphens. Lysidine (L) is a modified form of cytidine (C) found in bacterial tRNA.[44,45]
Figure 1. Dot plots of log(PA) vs. stAI and the corresponding Spearman rank correlations between stAI and PA. The correlations (and P-values) are calculated for the eight model organisms with PA measurements, which include three bacteria (A–C), three non-fungal eukaryotes (D–F), and two fungi (G–H).
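For readers unfamiliar with the statistic in the caption, the following is a minimal sketch of a Spearman rank correlation: the Pearson correlation of the ranks, with tied values sharing their average rank. The function names and the stAI and PA numbers are hypothetical and only illustrate the calculation.

```python
# Minimal sketch of the Spearman rank correlation (Pearson correlation
# computed on ranks). Input values below are made up for illustration.

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0  # average of 1-based positions
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation between x and y."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-gene stAI scores and protein abundance (PA) values.
stai_scores = [0.21, 0.35, 0.30, 0.48, 0.52]
protein_abundance = [12.0, 40.0, 95.0, 310.0, 2500.0]
rho = spearman(stai_scores, protein_abundance)
```

Because only ranks enter the calculation, the result is unchanged by the log transform applied to PA in the dot plots.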
Spearman rank correlation of the original tAI and the stAI with PA
| Organism | Number of genes | Number of proteins | tAI | stAI | Change (%) |
|---|---|---|---|---|---|
| Non-fungal | | | | | |
| | 4,145 | 688 | 0.5032 | 0.5493 | +8.39 |
| | 4,501 | 1,266 | 0.3574 | 0.36757 | +2.76 |
| | 3,667 | 2,114 | 0.0959 | 0.19408 | +50.58 |
| | 28,163 | 8,478 | 0.3328 | 0.3762 | +11.53 |
| | 22,830 | 6,959 | 0.0919 | 0.0956 | +3.87 |
| | 10,926 | 6,510 | 0.4878 | 0.5001 | +2.46 |
| Fungi | | | | | |
| | 5,869 | 2,666 | 0.6915 | 0.5802 | −19.18 |
| | 5,017 | 1,464 | 0.6554 | 0.56715 | −15.58 |
The correlations between tAI and PA vs. the correlations between stAI and PA in eight model organisms with available PA data. The third column refers to the number of genes with available PA measurements in each organism.
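The caption does not say how the Change (%) column is derived. The reported figures appear consistent with the difference between the two correlations expressed relative to the stAI correlation, and the sketch below checks that reading against the table. This formula is an inference from the numbers, not a statement from the source; `relative_change` and `ROWS` are hypothetical names.

```python
# Check of an inferred formula for the "Change (%)" column:
#   change = 100 * (r_stAI - r_tAI) / r_stAI
# The formula is an assumption inferred from the reported numbers.

# (tAI correlation, stAI correlation, reported change in %), in table order.
ROWS = [
    (0.5032, 0.5493, 8.39),
    (0.3574, 0.36757, 2.76),
    (0.0959, 0.19408, 50.58),
    (0.3328, 0.3762, 11.53),
    (0.0919, 0.0956, 3.87),
    (0.4878, 0.5001, 2.46),
    (0.6915, 0.5802, -19.18),
    (0.6554, 0.56715, -15.58),
]


def relative_change(tai_r, stai_r):
    """Percent change of the stAI correlation over the tAI correlation,
    expressed relative to the stAI correlation."""
    return 100.0 * (stai_r - tai_r) / stai_r


# Largest gap between the recomputed and the reported change, in
# percentage points; small gaps are expected from rounded inputs.
max_gap = max(abs(relative_change(t, s) - c) for t, s, c in ROWS)
```

Under this reading the recomputed values stay within a few hundredths of a percentage point of the reported ones, which is what rounding of the tabulated correlations would produce.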
Figure 2. Comparison between the stAI and the tAI. The middle bars represent the number of times (based on the jack-knifing analysis) the stAI outperformed the other versions of the tAI; as can be seen, the stAI outperforms the tAI in all non-fungal organisms.
Figure 3. Principal component analysis (PCA) of the 100 different S sets demonstrates clustering of S according to evolutionary domain. The first three components of the PCA are presented. Each point in the figure represents one of the 100 analysed organisms; the shape of the point corresponds to the organism's domain in the tree of life, and the colour corresponds to the cluster to which the point was assigned by the k-means algorithm.
The mean inferred wobble S-values
| | SG:U | SI:C | SI:A | SU:G | SL/agm:A |
|---|---|---|---|---|---|
| Eukarya | 0.7861 | 0.4659 | 0.9075 | 0.6295 | — |
| Bacteria | 0.6294 | 0.4211 | 0.8773 | 0.698 | 0.7309 |
| Archaea | 0.3898 | 0.3774 | 0.5015 | 0.4363 | 0.6453 |
| Mean | 0.6 | 0.42 | 0.76 | 0.588 | 0.6881 |
The mean inferred wobble S-values for each domain of life and for the entire analysed dataset (last row).
Figure 4. S distributions among different domains of life. Each figure contains three histograms representing the S in the different domains of life; the mean and SD of the S-values in each domain are also reported. The P-values corresponding to the comparison between every two S means appear at the bottom of the figure (see section 2 sub-section ‘Permutation test for comparing two S means’).
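The sub-section on the permutation test is not reproduced in this record, so the authors' exact procedure is unknown here. As an illustration only, a generic two-sample permutation test on the difference of means might look like the sketch below. The function name, the number of permutations, the add-one correction, and the S-values are all assumptions.

```python
# Generic two-sample permutation test on the difference of means.
# Illustrative sketch; not necessarily the procedure used in the study.
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test_means(a, b, n_perm=2000, seed=0):
    """Empirical two-sided P-value for the difference between two means.

    Group labels are shuffled n_perm times; the P-value is the share of
    shuffles whose absolute mean difference is at least the observed one,
    with an add-one correction so the estimate is never exactly zero.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        perm_a = pooled[:len(a)]
        perm_b = pooled[len(a):]
        if abs(mean(perm_a) - mean(perm_b)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical S-values for organisms from two domains.
domain_one = [0.90, 0.88, 0.91, 0.92, 0.87, 0.90]
domain_two = [0.50, 0.48, 0.52, 0.51, 0.49, 0.50]
p_value = permutation_test_means(domain_one, domain_two)
```

A permutation test makes no distributional assumption about the S-values, which suits bounded quantities such as these.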
Figure 5. sI:A distribution within the major phyla of the eukaryotic and bacterial domains with a significant empirical P-value (see details in section 2).