Salvatore Camiolo, Lorenzo Farina, Andrea Porceddu.
Abstract
The codon composition of coding sequences plays an important role in the regulation of gene expression. Herein, we report systematic differences in the usage of synonymous codons among Arabidopsis thaliana genes that are expressed specifically in distinct tissues. Although we observed that both regionally and transcriptionally associated mutational biases were associated significantly with codon bias, they could not explain the observed differences fully. Similarly, given that transcript abundances did not account for the differences in codon usage, it is unlikely that selection for translational efficiency can account exclusively for the observed codon bias. Thus, we considered the possible evolution of codon bias as an adaptive response to the different abundances of tRNAs in different tissues. Our analysis demonstrated that in some cases, codon usage in genes that were expressed in a broad range of tissues was influenced primarily by the tissue in which the gene was expressed maximally. On the basis of this finding we propose that genes that are expressed in certain tissues might show a tissue-specific compositional signature in relation to codon usage. These findings might have implications for the design of transgenes in relation to optimizing their expression.
Year: 2012 PMID: 22865738 PMCID: PMC3454886 DOI: 10.1534/genetics.112.143677
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562
Results generated using PERMANOVA by the analysis of 500 datasets that were produced following the internal double randomization scheme
| Dataset | Average pseudo-F | No. of datasets with significance at P < 0.05 (%) | No. of datasets with significance at P < 0.01 (%) |
|---|---|---|---|
| EB1 | 1.57 | 483 (96.6) | 423 (84.6) |
| EB2 | 1.48 | 482 (95.4) | 420 (84.0) |
| EB_5-6-7 | 1.41 | 383 (76.6) | 233 (46.6) |
PERMANOVA used the pseudo-F to test the null hypothesis that there was no difference among the levels of the classification variable. The number of permutations used to assess the significance of F was set to 4999. The last two columns represent the number (and percentage) of the 500 analyses that indicated a significant effect at P < 0.05 or P < 0.01, respectively (see Materials and Methods for more detail). EB, expression breadth.
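The pseudo-F and its permutation P value described above can be illustrated with a minimal Python sketch. It assumes the standard one-way pseudo-F computed from a square distance matrix and is not the authors' implementation; the function names `pseudo_f` and `permanova`, the toy data, and the use of Euclidean distance are all assumptions made for illustration.

```python
import numpy as np

def pseudo_f(dist, labels):
    """One-way pseudo-F from a square distance matrix (illustrative sketch)."""
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    n = len(labels)
    groups = np.unique(labels)
    d2 = dist ** 2
    # Total sum of squares: squared distances over all unordered pairs, divided by N.
    ss_total = d2[np.triu_indices(n, k=1)].sum() / n
    # Within-group sum of squares: same quantity computed inside each group.
    ss_within = 0.0
    for g in groups:
        idx = np.flatnonzero(labels == g)
        sub = d2[np.ix_(idx, idx)]
        ss_within += sub[np.triu_indices(len(idx), k=1)].sum() / len(idx)
    ss_among = ss_total - ss_within
    a = len(groups)
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permanova(dist, labels, n_perm=4999, seed=0):
    """Observed pseudo-F and a permutation P value obtained by shuffling labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    f_obs = pseudo_f(dist, labels)
    exceed = sum(
        pseudo_f(dist, rng.permutation(labels)) >= f_obs for _ in range(n_perm)
    )
    # The observed arrangement counts as one of the permutations.
    return f_obs, (exceed + 1) / (n_perm + 1)

# Toy data (hypothetical): two groups of 10 points with shifted means.
toy_rng = np.random.default_rng(1)
points = np.vstack([toy_rng.normal(0.0, 1.0, (10, 3)),
                    toy_rng.normal(3.0, 1.0, (10, 3))])
toy_labels = np.array(["A"] * 10 + ["B"] * 10)
toy_dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
f_obs, p_value = permanova(toy_dist, toy_labels, n_perm=999)
print(f_obs, p_value)
```

With 4999 permutations, as in the table above, the smallest attainable P value under this scheme is 1/5000 = 0.0002.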
Summary of results generated by using PERMANCOVA through the analysis of 500 datasets that were produced following the internal double randomization scheme
| Covariate | Main factor (tissue): average pseudo-F | Main factor (tissue): no. of significant datasets (%) | Covariate: average pseudo-F | Covariate: no. of significant datasets (%) |
|---|---|---|---|---|
| GC intergenic | 1.58 | 482 (96.4) | 1.41 | 137 (27.4) |
| Expr level (pE) | 1.35 | 337 (67.4) | 3.03 | 490 (98.0) |
| Expr level (avgEL) | 1.32 | 310 (62.0) | 3.04 | 500 (100.0) |
| Fop (tRNA) | 1.54 | 471 (94.2) | 12.39 | 500 (100.0) |
| Gi + Ti | 1.34 | 318 (63.6) | 1.67 | 292 (58.5) |
The covariables were the GC content of intergenic sequences, the G + T content of introns, the Fop, and the expression level measured as either pE or avgEL. The number of permutations used to assess the significance of both the covariable and the independent variable pseudo-F was set to 999 (see Materials and Methods for more detail).
Summary of results generated by PERMANOVA on 500 datasets that were produced by applying the double randomization scheme to the datasets of EB1 genes that were expressed at high or low levels
| Dataset | Average pseudo-F | No. of datasets with significance at P < 0.05 (%) | No. of datasets with significance at P < 0.01 (%) |
|---|---|---|---|
| Root | 1.74 | 298 (59.6) | 148 (29.6) |
| High | |||
| Seed | 1.70 | 315 (63) | 138 (27.6) |
| High | |||
| Pollen | 0.91 | 7 (1.45) | 2 (0.4) |
| High | |||
| EB1 low | 1.285 | 153 (30.6) | 43 (8.6) |
| EB1 high | 2.06 | 492 (98.4) | 476 (95.2) |
Results for root, seed, and pollen are summarized, together with PERMANOVA results between EB1 genes of different tissues that were expressed at either high or low levels. In each case, significance was calculated by a permutation approach that involved 4999 permutations.
Figure 1 Differentiation between tissue-specific datasets. Dendrogram representing the average differentiation among tissue-specific genes in terms of synonymous codon usage.
Figure 2 Mutual information of synonymous codon usage. The mutual information (MI) of synonymous codon usage in tissue-specific genes is significantly higher than that expected from a random distribution. Each row represents a cluster of genes expressed in a tissue-specific manner, whereas each column represents a codon. Statistical significance is expressed as −log(P).
The tissue signature is the ratio between the average distance of EB2 (or EB_5-6-7) genes from a randomly selected set of EB1 genes and the average distance of the same genes from the cognate EB1 genes
| Tissue | Cognate EB1 distance | Random EB1 distance | Tissue signature | P |
|---|---|---|---|---|
| EB2 dataset | ||||
| Flower | 1.19 | 1.16 | 0.98 | <0.001 |
| Pollen | 1.04 | 1.17 | 1.12 | <0.0001 |
| Root | 1.03 | 0.99 | 0.96 | <0.0001 |
| Seed | 1.00 | 0.97 | 0.96 | <0.0001 |
| Shoot apex | 0.88 | 1.26 | 1.43 | <0.0001 |
| EB_5-6-7 dataset | ||||
| Flower | 1.05 | 0.97 | 0.92 | <0.0001 |
| Pollen | 1.22 | 1.24 | 1.02 | 0.343 |
| Root | 1.14 | 1.10 | 0.96 | 0.011 |
| Seed | 1.30 | 1.29 | 0.99 | 0.420 |
| Shoot apex | 1.25 | 1.44 | 1.16 | <0.0001 |
Distances were calculated as the average pseudo-F, which in turn was calculated using nonparametric MANOVA.
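As a quick check of the definition, the tissue signature can be recomputed from the two distance columns as random EB1 distance divided by cognate EB1 distance. The short Python sketch below does this for the EB2 rows; the helper name `tissue_signature` is hypothetical, and because the inputs are the rounded values from the table, recomputed ratios may differ from the reported ones in the second decimal place.

```python
# (tissue, cognate EB1 distance, random EB1 distance, reported signature)
# Values copied from the EB2 rows of the table above.
eb2_rows = [
    ("Flower", 1.19, 1.16, 0.98),
    ("Pollen", 1.04, 1.17, 1.12),
    ("Root", 1.03, 0.99, 0.96),
    ("Seed", 1.00, 0.97, 0.96),
    ("Shoot apex", 0.88, 1.26, 1.43),
]

def tissue_signature(cognate, random_):
    """Ratio of the distance from random EB1 genes to the distance from cognate EB1 genes."""
    return random_ / cognate

recomputed = {}
for tissue, cognate, random_, reported in eb2_rows:
    recomputed[tissue] = tissue_signature(cognate, random_)
    print(f"{tissue}: recomputed {recomputed[tissue]:.2f}, reported {reported:.2f}")
```

Under this definition, a signature above 1 means the genes lie closer, on average, to their cognate EB1 genes than to a random EB1 set, as for pollen and shoot apex in the EB2 dataset.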