C A Whittle, Y Sun, H Johannesson.
Abstract
Neurospora comprises a primary model system for the study of fungal genetics and biology. In spite of this, little is known about genome evolution in Neurospora. For example, the evolution of synonymous codon usage is largely unknown in this genus. In the present investigation, we conducted a comprehensive analysis of synonymous codon usage and its relationship to gene expression and gene length (GL) in Neurospora tetrasperma and Neurospora discreta. For our analysis, we examined codon usage among 2,079 genes per organism and assessed gene expression using large-scale expressed sequence tag (EST) data sets (279,323 and 453,559 ESTs for N. tetrasperma and N. discreta, respectively). Data on relative synonymous codon usage revealed 24 codons (and two putative codons) that are more frequently used in genes with high than with low expression and thus were defined as optimal codons. Although codon-usage bias was highly correlated with gene expression, it was independent of selectively neutral base composition (introns), thus demonstrating that translational selection drives synonymous codon usage in these genomes. We also report that GL (coding sequences [CDS]) was inversely associated with optimal codon usage at each gene expression level, with highly expressed short genes having the greatest frequency of optimal codons. Optimal codon frequency was moderately higher in N. tetrasperma than in N. discreta, which might be due to variation in selective pressures and/or mating systems.
Year: 2011 PMID: 21402862 PMCID: PMC3089379 DOI: 10.1093/gbe/evr018
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Summary of Gene and EST Data Used in the Present Analysis
| Taxon | Number of Genes | Number of Highly Expressed Genes | Number of Lowly Expressed Genes | Number of ESTs Examined | Number of ESTs Matching Genes |
| N. tetrasperma | 2,079 | 688 | 1,391 | 279,323 | 105,003 |
| N. discreta | 2,079 | 702 | 1,377 | 453,559 | 161,242 |
Figure. The relationship between gene expression level and codon usage in Neurospora species. (A) The mean ENCs for lowly (<10 ESTs per 100,000) and highly (≥10 ESTs per 100,000) expressed genes. (B) The mean GC3 values for lowly and highly expressed genes. Error bars represent standard errors. Different letters (a or b) above bars indicate statistically significant differences (P < 0.05).
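The expression threshold in the caption (10 ESTs per 100,000) and the GC3 measure can be illustrated with a minimal Python sketch. The function names are hypothetical, and the choice of denominator (the total number of ESTs examined for a taxon) is an assumption, since the source does not state how the per-100,000 rate was normalized.

```python
def ests_per_100k(matching_ests: int, total_ests: int) -> float:
    """Scale a gene's EST count to ESTs per 100,000.

    Assumption: the denominator is the total number of ESTs examined
    for the taxon; the source does not specify this.
    """
    return matching_ests / total_ests * 100_000


def expression_class(rate_per_100k: float, threshold: float = 10.0) -> str:
    """Return 'high' for >= threshold ESTs per 100,000, otherwise 'low'."""
    return "high" if rate_per_100k >= threshold else "low"


def gc3(cds: str) -> float:
    """Fraction of codons whose third nucleotide is G or C."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    third_bases = [cds[i + 2] for i in range(0, usable, 3)]
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical gene with 28 matching ESTs out of 279,323 examined.
rate = ests_per_100k(28, 279_323)
label = expression_class(rate)
```

Under this normalization, a gene would need roughly 28 or more matching ESTs in the N. tetrasperma data set to be classed as highly expressed.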
The Difference in Relative Synonymous Codon Usage (▵RSCU) in Highly Versus Lowly Expressed Genes in Neurospora tetrasperma and N. discreta
| | | N. tetrasperma | | | | N. discreta | | | |
| Codon | Amino Acid | Mean RSCU, High Exp | Mean RSCU, Low Exp | ▵RSCU | Significance | Mean RSCU, High Exp | Mean RSCU, Low Exp | ▵RSCU | Significance |
| GCA | Ala | 0.1956 | 0.4777 | −0.2821 | ** | 0.2250 | 0.4806 | −0.2556 | ** |
| GCG | Ala | 0.5228 | 0.8695 | −0.3467 | ** | 0.5029 | 0.8704 | −0.3675 | ** |
| AGA | Arg | 0.4472 | 0.6415 | −0.1943 | ** | 0.4670 | 0.6282 | −0.1612 | ** |
| CGA | Arg | 0.2029 | 0.5418 | −0.3390 | ** | 0.2552 | 0.5433 | −0.2881 | ** |
| CGG | Arg | 0.4235 | 0.7713 | −0.3478 | ** | 0.4070 | 0.7761 | −0.3691 | ** |
| AGG | Arg | 0.6866 | 1.1374 | −0.4508 | ** | 0.6823 | 1.1442 | −0.4619 | ** |
| AAT | Asn | 0.2504 | 0.4893 | −0.2388 | ** | 0.2837 | 0.5047 | −0.2211 | ** |
| GAT | Asp | 0.7297 | 0.7949 | −0.0651 | * | 0.7563 | 0.8036 | −0.0473 | |
| TGT | Cys | 0.2152 | 0.4578 | −0.2426 | ** | 0.2157 | 0.4583 | −0.2426 | ** |
| CAA | Gln | 0.3881 | 0.7019 | −0.3138 | ** | 0.4180 | 0.7097 | −0.2917 | ** |
| GAA | Glu | 0.3305 | 0.6150 | −0.2845 | ** | 0.3636 | 0.6117 | −0.2480 | ** |
| GGA | Gly | 0.3594 | 0.6709 | −0.3115 | ** | 0.3850 | 0.6634 | −0.2784 | ** |
| GGG | Gly | 0.1820 | 0.5178 | −0.3358 | ** | 0.1785 | 0.5349 | −0.3564 | ** |
| CAT | His | 0.4014 | 0.6610 | −0.2595 | ** | 0.4219 | 0.6799 | −0.2579 | ** |
| ATT | Ile | 0.8619 | 0.9150 | −0.0531 | | 0.8770 | 0.9196 | −0.0426 | |
| ATA | Ile | 0.0549 | 0.2078 | −0.1529 | ** | 0.0637 | 0.2153 | −0.1516 | ** |
| TTA | Leu | 0.0391 | 0.1267 | −0.0876 | ** | 0.0474 | 0.1303 | −0.0829 | ** |
| CTA | Leu | 0.1668 | 0.3896 | −0.2228 | ** | 0.1693 | 0.3999 | −0.2306 | ** |
| TTG | Leu | 0.7561 | 1.0437 | −0.2876 | ** | 0.7508 | 1.0453 | −0.2944 | ** |
| CTG | Leu | 1.0216 | 1.3779 | −0.3563 | ** | 1.0046 | 1.4020 | −0.3974 | ** |
| AAA | Lys | 0.1338 | 0.3765 | −0.2427 | ** | 0.1569 | 0.3841 | −0.2273 | ** |
| TTT | Phe | 0.3908 | 0.6737 | −0.2829 | ** | 0.4283 | 0.6905 | −0.2622 | ** |
| CCA | Pro | 0.2523 | 0.5846 | −0.3322 | ** | 0.2679 | 0.5970 | −0.3291 | ** |
| CCG | Pro | 0.4855 | 0.9180 | −0.4325 | ** | 0.4838 | 0.9099 | −0.4260 | ** |
| AGC | Ser | 1.3687 | 1.4512 | −0.0825 | | 1.3735 | 1.4404 | −0.0668 | |
| TCG | Ser | 1.0090 | 1.1733 | −0.1642 | * | 0.9901 | 1.1778 | −0.1877 | * |
| AGT | Ser | 0.2827 | 0.5621 | −0.2794 | ** | 0.3103 | 0.5628 | −0.2525 | ** |
| TCA | Ser | 0.2156 | 0.5276 | −0.3120 | ** | 0.2523 | 0.5278 | −0.2756 | ** |
| ACA | Thr | 0.3168 | 0.6412 | −0.3243 | ** | 0.3561 | 0.6244 | −0.2683 | ** |
| ACG | Thr | 0.5405 | 0.9533 | −0.4128 | ** | 0.5431 | 0.9662 | −0.4231 | ** |
| TAT | Tyr | 0.4027 | 0.5991 | −0.1964 | ** | 0.4179 | 0.6056 | −0.1877 | ** |
| GTA | Val | 0.1475 | 0.3198 | −0.1723 | ** | 0.1673 | 0.3160 | −0.1488 | ** |
| GTG | Val | 0.6124 | 1.0538 | −0.4414 | ** | 0.6021 | 1.0696 | −0.4676 | ** |
Note: Pair-wise t-tests were conducted for each codon across all highly expressed versus all lowly expressed genes. Significance levels are shown, and P values have been Bonferroni corrected (*indicates 0.00001 < P < 0.05; **indicates P ≤ 0.00001). Codons in bold have been assigned as optimal codons (N = 26). Underlined codons are the primary optimal codon per synonymous codon family, that is, the codon with the largest positive ▵RSCU. The standard errors for mean RSCU values are provided in supplementary table 1 (Supplementary Material online).
This codon was designated as a putative optimal codon for this amino acid because its RSCU is greater in the highly expressed genes, even though the comparison is not statistically significant.
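For readers unfamiliar with the measure, RSCU is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, and ▵RSCU here is the mean RSCU in highly expressed genes minus the mean in lowly expressed genes. The following Python sketch illustrates both for a single gene and a single codon family; the function names and the toy sequence are hypothetical, and the authors' actual pipeline is not described in this excerpt.

```python
from collections import Counter
from typing import Dict, Sequence

# Synonymous codon family for alanine in the standard genetic code.
ALA_CODONS = ("GCT", "GCC", "GCA", "GCG")


def rscu(cds: str, family: Sequence[str]) -> Dict[str, float]:
    """RSCU of each codon in one synonymous family for one coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in family)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("amino acid is not encoded in this sequence")
    expected = total / len(family)  # count expected under equal usage
    return {codon: counts[codon] / expected for codon in family}


def delta_rscu(mean_high: float, mean_low: float) -> float:
    """Mean RSCU in highly expressed genes minus mean in lowly expressed genes."""
    return mean_high - mean_low


# Toy sequence (hypothetical): four alanine codons.
toy = rscu("GCCGCCGCTGCA", ALA_CODONS)

# GCA row of the table, first taxon block: 0.1956 (high) and 0.4777 (low).
gca_delta = round(delta_rscu(0.1956, 0.4777), 4)
```

Within one family the RSCU values sum to the number of synonymous codons, so a value above 1 indicates a codon used more often than expected under equal usage.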
Two-Way ANOVA Results for the Fop with Gene Expression and GL as Factors
| | N. tetrasperma | | | | | | N. discreta | | | | | |
| Source | DOF | Sum of Squares | Mean Square | F | P | Proportion of Variation Explained | DOF | Sum of Squares | Mean Square | F | P | Proportion of Variation Explained |
| Gene expression | 1 | 9.95 | 9.95 | 1038.92 | <10⁻¹⁰ | 32.0% | 1 | 9.77 | 9.77 | 1026.74 | <10⁻¹⁰ | 31.9% |
| GL | 2 | 1.24 | 0.62 | 63.75 | <10⁻¹⁰ | 4.00% | 2 | 1.59 | 0.80 | 83.81 | <10⁻¹⁰ | 5.18% |
| Gene expression × GL | 2 | 0.07 | 0.03 | 3.49 | 0.03 | 0.22% | 2 | 0.30 | 0.15 | 15.93 | 3.0 × 10⁻⁷ | 0.98% |
| Residual | 2,073 | 19.85 | 0.01 | | | | 2,073 | 19.72 | 0.01 | | | |
| Total | 2,078 | 31.07 | 0.02 | | | | 2,078 | 30.64 | 0.02 | | | |
Three categories of GL (short, medium, and long) and two categories of gene expression (low and high) were used. The N values for each of the six combinations of gene expression and GL were as follows: N. tetrasperma, NHigh_Short = 200, NHigh_Medium = 251, NHigh_Long = 237, NLow_Short = 398, NLow_Medium = 575, NLow_Long = 418; N. discreta, NHigh_Short = 186, NHigh_Medium = 261, NHigh_Long = 255, NLow_Short = 412, NLow_Medium = 565, NLow_Long = 400.
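The "Proportion of Variation Explained" column appears to be each effect's sum of squares divided by the total sum of squares. This is an inference from the table values rather than a stated definition, and the short Python check below (with a hypothetical helper name) uses the rounded figures from the first set of columns, so agreement is only to within rounding.

```python
# Sums of squares copied from the first set of columns in the ANOVA table.
SUMS_OF_SQUARES = {
    "Gene expression": 9.95,
    "GL": 1.24,
    "Gene expression x GL": 0.07,
}
TOTAL_SS = 31.07


def percent_explained(effect_ss: float, total_ss: float) -> float:
    """Share of the total sum of squares attributed to one effect, in percent.

    Assumption: this is how the table's proportion column was derived.
    """
    return effect_ss / total_ss * 100


percentages = {
    source: percent_explained(ss, TOTAL_SS)
    for source, ss in SUMS_OF_SQUARES.items()
}

for source, pct in percentages.items():
    print(f"{source}: {pct:.1f}%")
```

On this reading, gene expression accounts for roughly a third of the variation in Fop, while GL and the interaction term account for a few percent or less.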
Figure. The Fop value for genes from different GL categories and gene expression levels. (A) Neurospora tetrasperma and (B) N. discreta. All comparisons among GLs (per expression level) and among gene expression levels (per GL) within each figure are statistically significantly different (P < 0.05) after post hoc analysis of ANOVAs and correction for multiple tests. Error bars represent standard errors.
Figure. The mean GC content at third nucleotide positions of codons (GC3) and for introns of genes (GCI) from each combination of gene expression level (High, Low) and GL (Short, Medium, Long). (A) Neurospora tetrasperma and (B) N. discreta. All comparisons between GC3 and GCI are statistically significantly different within each of the six combinations of gene expression and GL per taxon (t-tests, P < 0.05 after Bonferroni correction). Error bars represent standard errors.
Figure. The difference in the Fop between Neurospora tetrasperma and N. discreta for each of the 2,079 genes examined herein. Genes are provided on the x axis in the order they occur on the chromosomes (Chr).
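A per-gene Fop difference of the kind plotted here can be sketched as follows, taking Fop as the fraction of a gene's codons that are optimal. Several details are assumptions: the optimal-codon set below is a two-codon placeholder and not the 26 codons identified in the study, the exclusion of Met, Trp, and stop codons follows common practice and is not confirmed by this excerpt, and the direction of subtraction is chosen arbitrarily.

```python
from typing import Set

# Codons with no synonymous alternative, plus stop codons (standard code).
NON_DEGENERATE = {"ATG", "TGG"}
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Placeholder set for illustration only; not the study's optimal codons.
EXAMPLE_OPTIMAL = {"GCC", "AAG"}


def fop(cds: str, optimal: Set[str]) -> float:
    """Frequency of optimal codons among the scored codons of one gene."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    scored = [c for c in codons
              if c not in NON_DEGENERATE and c not in STOP_CODONS]
    if not scored:
        raise ValueError("no codons available for scoring")
    return sum(c in optimal for c in scored) / len(scored)


# Two hypothetical orthologous sequences, one per taxon.
fop_a = fop("ATGGCCGCCAAGAAATAA", EXAMPLE_OPTIMAL)
fop_b = fop("ATGGCCGCAAAGAAATAA", EXAMPLE_OPTIMAL)

# Assumed direction: first taxon minus second taxon.
fop_difference = fop_a - fop_b
```

Repeating this for each orthologous gene pair and ordering the results by chromosomal position would give a series comparable in form to the one described in the caption.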