Chen Xu, Jing Dong, Chunfa Tong, Xindong Gong, Qiang Wen, Qiang Zhuge.
Abstract
We used large samples of expressed sequence tags to characterize the patterns of codon usage bias (CUB) in seven different Citrus species and to analyze how selection and base composition have shaped it. We found that A- and T-ending codons are predominant in Citrus species. Next, we identified 21 codons for 18 different amino acids that were considered preferred codons in all seven species. We then performed correspondence analysis and constructed plots for the effective number of codons (ENCs) to analyze synonymous codon usage. Multiple regression analysis showed that gene expression in each species had a constant influence on the frequency of optimal codons (FOP). Base composition differences between the proportions were large. Finally, positive selection was detected during the evolutionary process of the different Citrus species. Overall, our results suggest that codon usage was the result of positive selection. Codon usage variation among Citrus genes is influenced by translational selection, mutational bias, and gene length. CUB is strongly affected by selection pressure at the translational level, and gene length plays only a minor role. One possible explanation for this is that the selection-mediated codon bias is consistently strong in Citrus, which is one of the most widely cultivated fruit trees.
Keywords: citrus; codon usage; evolution
Year: 2013 PMID: 23761955 PMCID: PMC3667683 DOI: 10.4137/EBO.S11930
Source DB: PubMed Journal: Evol Bioinform Online ISSN: 1176-9343 Impact factor: 1.625
Differences in relative synonymous codon usage (RSCU) across codons between genes with high and low levels of expression.
Notes: The right-most column shows codons with significantly increased usage in highly expressed genes, as determined by a t-test (P < 0.05). Each * represents a species for which the t-test was significant, in the order in which the species are listed in the figure. Codons above the horizontal dotted lines were designated optimal codons and were used to calculate the frequency of optimal codon usage (FOP) in all species. The color in the figure indicates the gradient of ΔRSCU values, from the most positive (green) to the most negative (orange).
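As a rough illustration of the quantity compared here, the sketch below computes RSCU for one synonymous codon family and the difference (ΔRSCU) between a high-expression and a low-expression gene set. It assumes the standard definition of RSCU (observed codon count divided by the mean count across the synonymous family); the function names and the codon counts are hypothetical and are not taken from the study.

```python
# Synonymous codons for one amino acid (His), used purely as an illustration.
HIS_CODONS = ("CAU", "CAC")


def rscu(counts, family):
    """RSCU = observed count / mean count over the synonymous family."""
    total = sum(counts.get(codon, 0) for codon in family)
    if total == 0:
        # No observations for this amino acid: RSCU is undefined, report 0.
        return {codon: 0.0 for codon in family}
    expected = total / len(family)
    return {codon: counts.get(codon, 0) / expected for codon in family}


def delta_rscu(high_counts, low_counts, family):
    """RSCU in highly expressed genes minus RSCU in lowly expressed genes."""
    high = rscu(high_counts, family)
    low = rscu(low_counts, family)
    return {codon: high[codon] - low[codon] for codon in family}


# Hypothetical codon counts, NOT data from the study.
high_expression = {"CAU": 70, "CAC": 30}
low_expression = {"CAU": 40, "CAC": 60}

d = delta_rscu(high_expression, low_expression, HIS_CODONS)
for codon, value in d.items():
    print(f"{codon}: {value:+.2f}")
```

For a two-codon family the two ΔRSCU values are equal in magnitude and opposite in sign, because the RSCU values within a family always sum to the family size.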
Likelihood ratio statistics.
| Comparison | 2Δℓ | df | P-value | χ² critical value (1%) |
|---|---|---|---|---|
| M1 (neutral) vs. M2 (selection) | 635.54 | 2 | 0.000E+000 | 9.21 |
| M7 (beta) vs. M8 (beta and ω) | 650.86 | 2 | 0.000E+000 | 9.21 |
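The test statistics in this table can be reproduced from the log-likelihood values reported in the parameter-estimate table for models M1, M2, M7, and M8. The sketch below is a minimal check, assuming the usual likelihood ratio test in which twice the log-likelihood difference is compared against a χ² distribution with 2 degrees of freedom. It relies on the closed form of the χ² survival function for 2 degrees of freedom, exp(−x/2), so only the standard library is needed; the function names are illustrative.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference between alternative and null."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(x):
    """Survival function of a chi-square variable with 2 df: exp(-x/2)."""
    return math.exp(-x / 2.0)


def chi2_critical_df2(alpha):
    """Critical value for chi-square with 2 df, from inverting exp(-x/2)."""
    return -2.0 * math.log(alpha)


# Log-likelihoods as reported for each site model.
m1_vs_m2 = lrt_statistic(-64600.24, -64282.47)
m7_vs_m8 = lrt_statistic(-64609.64, -64284.21)

print(f"M1 vs. M2: 2*dl = {m1_vs_m2:.2f}, P = {chi2_sf_df2(m1_vs_m2):.3e}")
print(f"M7 vs. M8: 2*dl = {m7_vs_m8:.2f}, P = {chi2_sf_df2(m7_vs_m8):.3e}")
print(f"1% critical value (df = 2): {chi2_critical_df2(0.01):.2f}")
```

Both statistics exceed the 1% critical value by a wide margin, and the corresponding P-values are small enough to print as zero at three decimal places in scientific notation with a three-digit exponent.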
Figure 1. Effective number of codons (ENC) as a measure of overall average codon usage bias (CUB) in seven Citrus species. The actual mean ENC, mean GC3s, and mean GC are shown below each bar. Error bars show 95% confidence intervals based on the standard error of the mean. A lower ENC represents greater bias.
Figure 2. The ENC plot of Citrus. The continuous curve represents the relationship between GC3s and ENC values under random codon usage.
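The caption does not state how the continuous curve was computed. A common choice for ENC plots is Wright's (1990) null expectation, ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s. The sketch below assumes that formula; it is an illustration of the standard curve, not a confirmed reproduction of the figure, and the function name is hypothetical.

```python
def expected_enc(gc3s):
    """Expected ENC under random codon usage for a given GC3s (Wright 1990).

    Assumes the standard null curve: 2 + s + 29 / (s^2 + (1 - s)^2).
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("GC3s must lie between 0 and 1")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# Tabulate the curve at a few GC3s values.
for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"GC3s = {s:.2f} -> expected ENC = {expected_enc(s):.2f}")
```

Genes that fall well below this curve have stronger codon bias than base composition at third positions alone would predict, which is usually read as a sign of selection on codon usage.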
Figure 3. Proportion of variation in the frequency of optimal codon usage explained by gene expression, gene length, and base composition at synonymous sites.
Figure 4. Prediction of the ω value of each branch during evolutionary processes. Unrooted tree representing the phylogenetic relationship between the seven species. Maximum likelihood (ML) estimates of non-synonymous (dN) and synonymous (dS) substitution rates, dN/dS ratios (in parentheses), and selection acting on preferred codons are shown above each branch and were calculated from the concatenated data set of 84 genes. Branch lengths are proportional to the synonymous substitution rate.
Parameter estimates and log-likelihood values under models of variable ratios among sites.
| Model | Description | P | lnL | | Parameter estimates | Positively selected sites |
|---|---|---|---|---|---|---|
| M1 | Nearly neutral | 1 | −64600.24 | 1.55928 | P0 = 0.20008 | Not allowed |
| | | | | | ω0 = 0.19552 | |
| | | | | | P1 = 0.79992 | |
| | | | | | ω1 = 1.00000 | |
| M2 | Positive selection | 3 | −64282.47 | 1.72982 | P0 = 0.06407 | 1Q 5N |
| | | | | | ω0 = 0.00000 | 1031W 1037A 1353E 1384R 1388S 1394H 1400A 1413D |
| | | | | | P1 = 0.77815 | 1418L 1737E 1738R 1742C 1760V 1769S 1770R 1787V |
| | | | | | ω1 = 1.00000 | 1973R 1976R 1992A 1995S 2266L 2272N 2280L 2283S |
| | | | | | P2 = 0.15778 | 2285S |
| | | | | | ω2 = 4.51408 | 2390T 2 |
| | | | | | | 2514S 2516C 2522Y 2535L 2570S 2587R 2660C 2662M |
| | | | | | | 2743L 2748K 2756M 2818W 2821S 2822H 2823C 2827L |
| | | | | | | 2829G 2838A |
| | | | | | | 3018S |
| | | | | | | 3040F |
| | | | | | | 3778R 3780Y 3793T 3808Y 3810Y 3816L 3830L 3832S |
| M7 | Beta | 2 | −64609.64 | 1.56036 | P = 0.45889 | Not allowed |
| | | | | | q = 0.08925 | |
| M8 | Beta and ω | 4 | −64284.21 | 1.71973 | P0 = 0.84072 | 1Q 5N |
| | | | | | P = 0.27207 | 929S 1031W 1037A 1264T 1353E 1384R 1388S 1394H |
| | | | | | q = 0.03464 | 1400A 1413D 1418L 1737E 1738R 1742C 1760V 1769S |
| | | | | | (P1 = 0.15928) | 1770R 1787V |
| | | | | | ω = 4.34756 | 2272N 2280L 2283S 2285S |
| | | | | | | 2352R 2362R 2363H 2390T 2 |
| | | | | | | 2 |
| | | | | | | 2587R 2660C 2662M |
| | | | | | | 2672S |
| | | | | | | 2686F 2691H 2735W 2743L 2748K 2756M 2817T 2818W |
| | | | | | | 2819M 2821S 2822H |
| | | | | | | 2964K 3012L |
| | | | | | | 3491P 3495V |
| | | | | | | 3808Y 3810Y |
Notes: P represents the number of free parameters in the ω-distribution. Sites inferred to be under positive selection at the 99% level are bold and those at the 95% level are in italic.
Optimal codon table in Citrus sinensis.
| Amino acid | Codon | ΔRSCU | Frequency per thousand bases |
|---|---|---|---|
| Ala | GCU | 0.59 | 29.3 |
| | GCA | 0.75 | 20.6 |
| | GCC | −0.81 | 15.9 |
| | GCG | −0.52 | 8.3 |
| Arg | AGA | 0.81 | 14.9 |
| | AGG | 0.26 | 14.9 |
| | CGU | −0.02 | 5.5 |
| | CGA | 0.09 | 4.8 |
| | CGG | −0.32 | 4.4 |
| | CGC | −0.8 | 4.6 |
| Gly | GGU | 0.46 | 19.7 |
| | GGA | 0.37 | 18.7 |
| | GGG | −0.12 | 14 |
| | GGC | −0.71 | 17.2 |
| His | CAU | 0.76 | 12.4 |
| | CAC | −0.76 | 10.6 |
| Val | GUU | 0.59 | 27.6 |
| | GUA | 0.38 | 8.3 |
| | GUC | −0.66 | 11.5 |
| | GUG | −0.31 | 21.2 |
| Lys | AAA | 0.17 | 25.7 |
| | AAG | −0.17 | 34 |
| Phe | UUU | 0.63 | 23.3 |
| | UUC | −0.63 | 21 |
| Pro | CCU | 0.78 | 16.8 |
| | CCC | −0.53 | 11.2 |
| | CCG | −0.86 | 7.3 |
| | CCA | 0.61 | 16 |
| Thr | ACU | 0.69 | 18.5 |
| | ACA | 0.67 | 15.1 |
| | ACC | −0.71 | 10.6 |
| | ACG | −0.66 | 6.9 |
| Asn | AAU | 0.67 | 24.8 |
| | AAC | −0.67 | 21.7 |
| Asp | GAU | 0.56 | 33.8 |
| | GAC | −0.56 | 18.6 |
| Cys | UGU | 0.52 | 8.5 |
| | UGC | −0.52 | 8.8 |
| Gln | CAA | 0.13 | 18.7 |
| | CAG | −0.13 | 16.5 |
| Glu | GAA | 0.32 | 28.6 |
| | GAG | −0.32 | 31.3 |
| Ile | AUU | 0.49 | 24.2 |
| | AUA | 0.32 | 12.6 |
| | AUC | −0.81 | 16.4 |
| Leu | UUA | 0.48 | 12.3 |
| | UUG | 0.25 | 22.4 |
| | CUU | 0.47 | 25.2 |
| | CUA | 0.16 | 8.4 |
| | CUC | −1.24 | 14.6 |
| | CUG | −0.11 | 13 |
| Met | AUG | 0 | 0 |
| Tyr | UAU | 0.72 | 15.4 |
| | UAC | −0.72 | 13.9 |
| Ser | UCU | 0.7 | 17.7 |
| | UCA | 0.52 | 17 |
| | AGU | 0.6 | 11.2 |
| | UCC | −0.81 | 9.8 |
| | UCG | −0.73 | 9.1 |
| | AGC | −0.27 | 12 |
Notes: ΔRSCU: difference in relative synonymous codon usage between predicted genes with high and low expression levels, based on EST sequences in Citrus sinensis. Optimal codons (red box) were identified based on differences in relative synonymous codon usage. Frequency per thousand bases: usage frequency per thousand bases in high-confidence coding sequences identified from full-length cDNA in Citrus sinensis. Optimal codons (green box) were identified based on different frequencies per thousand bases (P < 0.05).
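Once a set of optimal codons has been chosen, the frequency of optimal codons (FOP) for a gene is simply the share of its synonymous codons that belong to that set. The sketch below assumes the common convention of excluding Met, Trp, and stop codons from the count. The optimal-codon set and the coding sequence used here are placeholders for illustration only; they are not the preferred codons or sequences identified in the study.

```python
# Codons with no synonymous alternative, plus stop codons, are not scored.
NON_DEGENERATE = {"AUG", "UGG"}
STOP_CODONS = {"UAA", "UAG", "UGA"}


def fop(cds, optimal_codons):
    """Fraction of scored codons in a coding sequence that are optimal."""
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    scored = [c for c in codons
              if c not in NON_DEGENERATE and c not in STOP_CODONS]
    if not scored:
        return 0.0
    return sum(c in optimal_codons for c in scored) / len(scored)


# Placeholder optimal set and sequence, NOT taken from the study.
hypothetical_optimal = {"GCU", "CAU", "GAU"}
example = fop("ATGGCTGCCCATCACGATTAA", hypothetical_optimal)
print(f"FOP = {example:.2f}")
```

In this toy sequence five codons are scored (GCU, GCC, CAU, CAC, GAU) and three of them are in the placeholder optimal set, so FOP is 3/5.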