Guiming Liu, Jinyu Wu, Huanming Yang, Qiyu Bao.
Abstract
The usage of alternative synonymous codons in Corynebacterium glutamicum, a well-known bacterium used in industry for the production of amino acids, has been investigated by multivariate analysis. As C. glutamicum is a GC-rich organism, G and C are expected to predominate at the third position of codons. Indeed, overall codon usage analyses have indicated that C- and/or G-ending codons are predominant in this organism. Through multivariate statistical analysis, apart from mutational selection, we identified three other trends of codon usage variation among the genes. Firstly, the majority of highly expressed genes are scattered towards the positive end of the first axis, whereas the majority of lowly expressed genes are clustered towards the other end of the first axis. Furthermore, the distinct difference between the two sets of genes is that C-ending codons predominate in putatively highly expressed genes, suggesting that C-ending codons are translationally optimal in this organism. Secondly, the majority of the putatively highly expressed genes tend to be located on the leading strand, which indicates that replicational and transcriptional selection might be invoked. Thirdly, highly expressed genes are more conserved than lowly expressed genes, as judged by synonymous and nonsynonymous substitutions among orthologous genes from the genomes of C. glutamicum and C. diphtheriae. We also analyzed other factors that might influence codon usage, such as gene length and hydrophobicity, and found their contributions to be weak.
Year: 2010 PMID: 20445740 PMCID: PMC2860111 DOI: 10.1155/2010/343569
Source DB: PubMed Journal: Comp Funct Genomics ISSN: 1531-6912
Figure 1: (a) The GC content and GC skew of the C. glutamicum genome, computed with a window size of 24 kb and a step size of 3 kb. (b) The ENC plot of C. glutamicum. The continuous curve represents the relationship between GC3s and ENC values under random codon usage. (c) Distribution of C. glutamicum genes on the plane defined by the two main axes of the correspondence analysis.
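The quantities plotted in panels (a) and (b) can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: it assumes the usual definition of GC skew, (G − C)/(G + C), treats the sequence as linear rather than circular, and uses Wright's commonly cited formula for the expected ENC as a function of GC3s. The function names `gc_windows` and `expected_enc` are hypothetical.

```python
def gc_windows(seq, window=24_000, step=3_000):
    """Yield (start, gc_content, gc_skew) for each full window along seq.

    Assumption: the sequence is treated as linear, so windows that would
    run past the end are simply skipped.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        gc_content = (g + c) / len(chunk)
        # GC skew is undefined for a window without G or C; report 0.0 then.
        gc_skew = (g - c) / (g + c) if (g + c) else 0.0
        yield start, gc_content, gc_skew


def expected_enc(gc3s):
    """Expected ENC under random codon usage for a given GC3s (0..1),
    using Wright's approximation (assumed here, not stated in the caption)."""
    return 2 + gc3s + 29 / (gc3s ** 2 + (1 - gc3s) ** 2)


if __name__ == "__main__":
    # Toy sequence with a tiny window so the example runs instantly.
    for start, content, skew in gc_windows("GGGGCCCCGGGGGGCC", window=8, step=8):
        print(start, content, skew)
    print(expected_enc(0.5))
```

For a real genome the defaults (24 kb window, 3 kb step) match the values given in the caption; genes falling well below the `expected_enc` curve are the ones whose codon usage is shaped by more than base composition.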
Result of factorial correspondence analyses on codon usage in C. glutamicum.
| | Inertia | CAI | GC3s | G3s | C3s | A3s | T3s |
|---|---|---|---|---|---|---|---|
| Axis1 | 20.33 | 0.855** | 0.594** | 0.557** | 0.881** | −0.112** | −0.593** |
| Axis2 | 10.49 | 0.367** | 0.006 | 0.524** | 0.336** | 0.657** | −0.542** |
**Correlation is significant at the 0.01 level.
Codon usage in putative highly expressed and lowly expressed genes of C. glutamicum.
| AA | Codon | High N | High RSCU | Low N | Low RSCU | AA | Codon | High N | High RSCU | Low N | Low RSCU |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 149 | 0.17 | 1058 | 1.37 | Ser | UCU | 371 | 0.77 | 539 | 1.22 |
| | UUC | 1608 | 1.83 | 481 | 0.63 | | UCC | 1972 | 4.11 | 284 | 0.64 |
| Leu | UUA | 19 | 0.03 | 486 | 0.78 | | UCA | 175 | 0.36 | 360 | 0.82 |
| | UUG | 239 | 0.33 | 1284 | 2.06 | | UCG | 59 | 0.12 | 676 | 1.53 |
| | CUU | 656 | 0.91 | 707 | 1.13 | Pro | CCU | 527 | 0.92 | 492 | 1.11 |
| | CUC | 1583 | 2.21 | 341 | 0.55 | | CCC | 185 | 0.32 | 205 | 0.46 |
| | CUA | 140 | 0.2 | 274 | 0.44 | | CCA | 1454 | 2.55 | 371 | 0.84 |
| | CUG | 1666 | 2.32 | 651 | 1.04 | | CCG | 119 | 0.21 | 705 | 1.59 |
| Ile | AUU | 412 | 0.43 | 1207 | 1.71 | Thr | ACU | 304 | 0.38 | 605 | 1.2 |
| | AUC | 2477 | 2.57 | 654 | 0.93 | | ACC | 2758 | 3.45 | 427 | 0.85 |
| | AUA | 7 | 0.01 | 257 | 0.36 | | ACA | 74 | 0.09 | 378 | 0.75 |
| Met | AUG | 1143 | 1 | 863 | 1 | | ACG | 61 | 0.08 | 601 | 1.2 |
| Val | GUU | 1495 | 1.45 | 883 | 1.11 | Ala | GCU | 1707 | 1.18 | 957 | 1.08 |
| | GUC | 1631 | 1.59 | 452 | 0.57 | | GCC | 1138 | 0.79 | 512 | 0.58 |
| | GUA | 336 | 0.33 | 396 | 0.5 | | GCA | 2529 | 1.75 | 789 | 0.89 |
| | GUG | 650 | 0.63 | 1443 | 1.82 | | GCG | 411 | 0.28 | 1300 | 1.46 |
| Tyr | UAU | 47 | 0.07 | 632 | 1.41 | Cys | UGU | 64 | 0.44 | 213 | 1.3 |
| | UAC | 1311 | 1.93 | 262 | 0.59 | | UGC | 230 | 1.56 | 114 | 0.7 |
| TER | UAA | 100 | 2.19 | 39 | 0.85 | TER | UGA | 4 | 0.09 | 47 | 1.03 |
| | UAG | 33 | 0.72 | 51 | 1.12 | Trp | UGG | 577 | 1 | 661 | 1 |
| His | CAU | 47 | 0.09 | 560 | 1.35 | Arg | CGU | 744 | 1.62 | 557 | 1.45 |
| | CAC | 981 | 1.91 | 271 | 0.65 | | CGC | 1887 | 4.11 | 362 | 0.94 |
| Gln | CAA | 284 | 0.34 | 539 | 0.79 | | CGA | 89 | 0.19 | 317 | 0.83 |
| | CAG | 1385 | 1.66 | 830 | 1.21 | | CGG | 16 | 0.03 | 489 | 1.27 |
| Asn | AAU | 141 | 0.13 | 795 | 1.37 | Ser | AGU | 19 | 0.04 | 483 | 1.09 |
| | AAC | 1955 | 1.87 | 369 | 0.63 | | AGC | 284 | 0.59 | 305 | 0.69 |
| Lys | AAA | 298 | 0.26 | 664 | 0.91 | Arg | AGA | 5 | 0.01 | 236 | 0.62 |
| | AAG | 2014 | 1.74 | 800 | 1.09 | | AGG | 13 | 0.03 | 341 | 0.89 |
| Asp | GAU | 959 | 0.55 | 1537 | 1.53 | Gly | GGU | 1026 | 0.93 | 1069 | 1.46 |
| | GAC | 2559 | 1.45 | 468 | 0.47 | | GGC | 2830 | 2.55 | 562 | 0.77 |
| Glu | GAA | 2117 | 1.04 | 1095 | 0.91 | | GGA | 549 | 0.5 | 568 | 0.78 |
| | GAG | 1941 | 0.96 | 1313 | 1.09 | | GGG | 26 | 0.02 | 724 | 0.99 |
N: the number of codons; AA: amino acid.
*Codon with significantly (P < .01) higher frequencies in highly expressed genes.
High: codons in highly expressed genes; Low: codons in lowly expressed genes.
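The RSCU values in the table can be reproduced from the codon counts. The sketch below assumes the standard definition, in which a codon's RSCU is its observed count divided by the mean count of all synonymous codons for the same amino acid. It also shows how relative adaptiveness weights and a CAI could be derived from such counts; using the High counts as the CAI reference set is an assumption made purely for illustration, since this record does not say which reference set was used. The function names are hypothetical, and the second codon of each pair (UUC, UAC) follows from the standard genetic code.

```python
import math
from collections import defaultdict


def rscu(counts, codon_to_aa):
    """RSCU per codon: count / mean count of its synonymous codon family."""
    families = defaultdict(list)
    for codon, aa in codon_to_aa.items():
        families[aa].append(codon)
    result = {}
    for codons in families.values():
        total = sum(counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid absent from the gene set
        mean = total / len(codons)
        for c in codons:
            result[c] = counts.get(c, 0) / mean
    return result


def relative_adaptiveness(rscu_values, codon_to_aa):
    """w = RSCU divided by the largest RSCU in the same codon family."""
    best = defaultdict(float)
    for codon, value in rscu_values.items():
        aa = codon_to_aa[codon]
        best[aa] = max(best[aa], value)
    return {c: v / best[codon_to_aa[c]] for c, v in rscu_values.items()}


def cai(codons, weights):
    """Geometric mean of w over the codons that have a positive weight."""
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    return math.exp(sum(logs) / len(logs)) if logs else 0.0


# Counts copied from the High columns for Phe and Tyr.
codon_to_aa = {"UUU": "Phe", "UUC": "Phe", "UAU": "Tyr", "UAC": "Tyr"}
high_counts = {"UUU": 149, "UUC": 1608, "UAU": 47, "UAC": 1311}

high_rscu = rscu(high_counts, codon_to_aa)
print({codon: round(value, 2) for codon, value in high_rscu.items()})

weights = relative_adaptiveness(high_rscu, codon_to_aa)
print(round(cai(["UUC", "UAC", "UUU"], weights), 3))
```

Rounded to two decimals, the computed RSCU values agree with the High RSCU column for these four codons (0.17, 1.83, 0.07, 1.93).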
Percentages of genes in C. glutamicum on the leading (versus lagging) strand.
| Range | Total number | Leading (number) | Leading (percent) | Lagging (number) | Lagging (percent) |
|---|---|---|---|---|---|
| CAI < 0.35 | 1958 | 1087 | 55.52 | 871 | 44.48 |
| 0.35 < CAI < 0.65 | 721 | 434 | 60.19 | 287 | 39.81 |
| CAI > 0.65 | 61 | 41 | 67.21 | 20 | 32.79 |
| Ribosomal | 52 | 44 | 84.62 | 8 | 15.38 |
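The percentages in this table follow directly from the gene counts, and recomputing them is a quick consistency check. A minimal sketch, with the counts copied from the table and a hypothetical helper name:

```python
def strand_percentages(leading, lagging):
    """Return (leading %, lagging %) rounded to two decimals."""
    total = leading + lagging
    return round(100 * leading / total, 2), round(100 * lagging / total, 2)


# (leading, lagging) gene counts per category, copied from the table.
counts = {
    "CAI < 0.35": (1087, 871),
    "0.35 < CAI < 0.65": (434, 287),
    "CAI > 0.65": (41, 20),
    "Ribosomal": (44, 8),
}

for label, (leading, lagging) in counts.items():
    lead_pct, lag_pct = strand_percentages(leading, lagging)
    print(f"{label}: total={leading + lagging}, "
          f"leading={lead_pct}%, lagging={lag_pct}%")
```

The leading-strand share rises from 55.52% in the lowest CAI class to 67.21% in the highest and 84.62% for ribosomal genes, which is the trend the abstract describes.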
Figure 2: Plot of CAI values for C. glutamicum against Ka and Ks. (a) Plot of CAI values for C. glutamicum against Ka. (b) Plot of CAI values for C. glutamicum against Ks. The correlation coefficients (r) and level of significance (P) are shown.
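A correlation coefficient of the kind reported in Figure 2 can be computed as sketched below. This assumes r is a Pearson coefficient, which the caption does not state, and the CAI and Ks values are made-up toy numbers, not data from the paper. The significance level P is not computed here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-gene values, for illustration only.
toy_cai = [0.2, 0.4, 0.6, 0.8]
toy_ks = [0.9, 0.7, 0.5, 0.3]
print(pearson_r(toy_cai, toy_ks))
```

A negative r between CAI and Ka or Ks would correspond to the abstract's statement that highly expressed genes are more conserved than lowly expressed ones.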