Sameer Hassan, Vasantha Mahalingam, Vanaja Kumar.
Abstract
Synonymous codon usage in the protein-coding genes of thirty-two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. Compositional bias was identified as one of the major factors influencing codon usage. Codons ending in C or G are preferred in highly expressed genes, and C-ending codons are strongly preferred over G-ending codons. A strong negative correlation between the effective number of codons (Nc) and GC3s content was also observed, showing that codon usage is affected by gene nucleotide composition. Translational selection, operating at the level of translational accuracy, was also found to play a role in shaping codon usage. A high level of heterogeneity is seen within and between the genomes. Gene length was also found to influence codon usage in 11 of the 32 phage genomes. Mycobacteriophage Cooper was identified as the most highly biased genome, with better translation efficiency, comparing well with the host-specific tRNA genes.
Year: 2010 PMID: 20150956 PMCID: PMC2817497 DOI: 10.1155/2009/316936
Source DB: PubMed Journal: Adv Bioinformatics ISSN: 1687-8027
Relative synonymous codon usage for all the genes of 32 mycobacteriophages was calculated. AA and N are the amino acid and number of codons, respectively.
| AA | Codon | N | RSCU | AA | Codon | N | RSCU |
|---|---|---|---|---|---|---|---|
| Phe | UUU | 1795 | (0.20) | Ser | UCU | 2127 | (0.40) |
| | UUC | 15765 | (1.80) | | UCC | 7032 | (1.32) |
| Leu | UUA | 195 | (0.03) | | UCA | 1889 | (0.35) |
| | UUG | 5037 | (0.66) | | UCG | 10888 | (2.04) |
| Tyr | UAU | 2447 | (0.31) | Cys | UGU | 874 | (0.33) |
| | UAC | 13288 | (1.69) | | UGC | 4496 | (1.67) |
| Ter | UAA | 393 | (0.00) | Ter | UGA | 1365 | (0.00) |
| Ter | UAG | 388 | (0.00) | Trp | UGG | 11727 | (1.00) |
| Leu | CUU | 2805 | (0.37) | Pro | CCU | 3503 | (0.42) |
| | CUC | 13867 | (1.82) | | CCC | 10146 | (1.21) |
| | CUA | 1400 | (0.18) | | CCA | 2630 | (0.31) |
| | CUG | 22302 | (2.93) | | CCG | 17369 | (2.06) |
| His | CAU | 2427 | (0.40) | Arg | CGU | 5190 | (0.83) |
| | CAC | 9611 | (1.60) | | CGC | 16237 | (2.60) |
| Gln | CAA | 3076 | (0.30) | | CGA | 3195 | (0.51) |
| | CAG | 17766 | (1.70) | | CGG | 9987 | (1.60) |
| Ile | AUU | 3524 | (0.40) | Thr | ACU | 2990 | (0.32) |
| | AUC | 21992 | (2.52) | | ACC | 20215 | (2.20) |
| | AUA | 636 | (0.07) | | ACA | 2910 | (0.32) |
| Met | AUG | 11923 | (1.00) | | ACG | 10698 | (1.16) |
| Asn | AAU | 2886 | (0.29) | Ser | AGU | 1693 | (0.32) |
| | AAC | 16738 | (1.71) | | AGC | 8424 | (1.58) |
| Lys | AAA | 3185 | (0.29) | Arg | AGA | 692 | (0.11) |
| | AAG | 18997 | (1.71) | | AGG | 2110 | (0.34) |
| Val | GUU | 4218 | (0.40) | Ala | GCU | 7738 | (0.50) |
| | GUC | 18361 | (1.73) | | GCC | 26730 | (1.72) |
| | GUA | 1855 | (0.17) | | GCA | 6591 | (0.42) |
| | GUG | 17981 | (1.70) | | GCG | 21142 | (1.36) |
| Asp | GAU | 8082 | (0.42) | Gly | GGU | 10057 | (0.78) |
| | GAC | 30303 | (1.58) | | GGC | 27920 | (2.17) |
| Glu | GAA | 8983 | (0.50) | | GGA | 5241 | (0.41) |
| | GAG | 26657 | (1.50) | | GGG | 8345 | (0.65) |
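The RSCU values in this table follow from the codon counts: each count is divided by the mean count of its synonymous family. The sketch below recomputes the Phe and Ile entries from the N column. The function name `rscu` and the dictionary layout are illustrative choices, not part of the original analysis.

```python
# Codon counts for two amino acids, copied from the N column of the table above.
COUNTS = {
    "Phe": {"UUU": 1795, "UUC": 15765},
    "Ile": {"AUU": 3524, "AUC": 21992, "AUA": 636},
}


def rscu(codon_counts):
    """Return RSCU per codon: observed count / mean count of the synonymous family."""
    total = sum(codon_counts.values())
    degeneracy = len(codon_counts)  # number of synonymous codons for this amino acid
    if total == 0:
        raise ValueError("amino acid is absent; RSCU is undefined")
    return {codon: n * degeneracy / total for codon, n in codon_counts.items()}


for aa, counts in COUNTS.items():
    for codon, value in rscu(counts).items():
        print(aa, codon, round(value, 2))
```

An RSCU above 1 means the codon is used more often than expected under equal usage of synonyms, and the values for one amino acid sum to its degeneracy.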
Mean Nc and GC3s values for the 32 mycobacteriophages, with standard deviations in parentheses.
| Phages | Nc | GC3s |
|---|---|---|
| 244 | 38.98 (6.96) | 80.65 (7.87) |
| Bxb1 | 40.17 (7.28) | 80.05 (7.334) |
| Bxz2 | 37.93 (5.70) | 82.72 (6.46) |
| Che9c | 39.48 (6.96) | 81.01 (7.842) |
| Rosebush | 32.53 (4.04) | 88.44 (4.27) |
| Omega | 43.91 (6.54) | 75.49 (7.01) |
| Halo | 37.87 (5.24) | 82.46 (5.59) |
| Barnyard | 47.96 (7.24) | 65.84 (8.87) |
| Bxz1 | 35.80 (5.64) | 85.09 (6.48) |
| Cjw1 | 38.80 (6.40) | 80.64 (7.83) |
| Corndog | 37.87 (6.69) | 82.65 (7.23) |
| Orion | 37.79 (5.55) | 82.85 (5.17) |
| Plot | 44.28 (5.69) | 73.35 (6.59) |
| Llij | 43.52 (6.18) | 72.23 (5.51) |
| Pipefish | 35.12 (3.55) | 87.35 (3.53) |
| PMC | 42.62 (6.28) | 71.95 (5.30) |
| Qyrzula | 32.45 (3.93) | 88.56 (4.37) |
| Wildcat | 47.62 (6.63) | 66.37 (6.29) |
| D29 | 37.83 (5.28) | 82.58 (5.59) |
| L5 | 40.67 (5.70) | 80.79 (5.86) |
| PBI1 | 44.07 (5.67) | 73.71 (6.63) |
| PG1 | 37.59 (5.20) | 82.7 (4.90) |
| Cooper | 31.44 (4.65) | 89.35 (5.90) |
| Che12 | 39.06 (4.83) | 81.94 (5.75) |
| Catera | 35.44 (5.66) | 85.53 (6.24) |
| TM4 | 34.07 (3.60) | 88.19 (4.64) |
| Che8 | 43.62 (6.33) | 70.57 (5.37) |
| Tweety | 42.76 (5.81) | 72.07 (5.51) |
| U2 | 39.58 (7.03) | 81.10 (7.26) |
| Bethlehem | 40.22 (6.98) | 80.35 (6.57) |
| Giles | 36.48 (5.04) | 83.38 (5.24) |
| Che9d | 44.10 (7.13) | 71.6 (5.8) |
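The inverse relationship between Nc and GC3s is visible even in these genome-level means. The sketch below computes a Pearson correlation for a subset of phages taken from the table above. It is an illustration only: it is assumed here that a Pearson coefficient is appropriate, and the study's own correlations may have been computed per gene rather than on genome means.

```python
import math

# (mean Nc, mean GC3s) for a subset of phages, copied from the table above.
PHAGES = {
    "Cooper": (31.44, 89.35),
    "Rosebush": (32.53, 88.44),
    "TM4": (34.07, 88.19),
    "D29": (37.83, 82.58),
    "L5": (40.67, 80.79),
    "Wildcat": (47.62, 66.37),
    "Barnyard": (47.96, 65.84),
}


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


nc_values = [nc for nc, _ in PHAGES.values()]
gc3s_values = [gc for _, gc in PHAGES.values()]
r = pearson_r(nc_values, gc3s_values)
print("Pearson r (Nc vs GC3s):", round(r, 3))
```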
GC % and base composition in the third codon position for 32 mycobacteriophage genomes.
| Virus | G + C | GC1s | GC2s | GC (1st + 2nd) | GC3s | C3s | T3s | A3s | G3s |
|---|---|---|---|---|---|---|---|---|---|
| 244 | 63.26 | 63.3 | 44.7 | 54 | 81 | 51.14 | 10.58 | 8.88 | 30.01 |
| Bxb1 | 63.43 | 63.4 | 45.7 | 54.55 | 80.4 | 44.89 | 11.53 | 8.53 | 35.63 |
| Bxz2 | 64.41 | 64 | 45.6 | 54.8 | 83 | 48.62 | 9.12 | 8.25 | 34.56 |
| Che9c | 65.66 | 66 | 48.9 | 57.45 | 81.4 | 47.27 | 10.17 | 8.9 | 34.16 |
| Rosebush | 69.16 | 68.3 | 49.8 | 59.05 | 88.5 | 51.21 | 6.26 | 5.36 | 37.73 |
| Omega | 61.48 | 62.4 | 45.1 | 53.75 | 76.1 | 45.43 | 13.88 | 10.78 | 30.52 |
| Halo | 63.79 | 63.4 | 44.5 | 53.95 | 82.7 | 47.4 | 9.57 | 8.06 | 35.5 |
| Barnyard | 58.05 | 60.9 | 45.5 | 53.2 | 67 | 38.55 | 18.39 | 15.97 | 27.68 |
| Bxz1 | 65.04 | 64.1 | 44.9 | 54.5 | 85.3 | 51.84 | 9.61 | 5.39 | 33.72 |
| Cjw1 | 63.52 | 63.7 | 45.1 | 54.4 | 81 | 51.39 | 10.2 | 9.27 | 29.76 |
| Corndog | 65.64 | 64.8 | 48.4 | 56.6 | 83 | 46.49 | 9.76 | 7.7 | 36.64 |
| Orion | 66.9 | 66.6 | 50.2 | 58.4 | 83.17 | 50.46 | 10.19 | 7.04 | 32.85 |
| Plot | 60.22 | 61.6 | 44.3 | 52.95 | 74.1 | 42.4 | 15.73 | 11.05 | 31.34 |
| Llij | 61.66 | 62.4 | 48.9 | 55.65 | 73.1 | 39.73 | 16.12 | 11.79 | 32.87 |
| Pipefish | 67.64 | 65.4 | 49.3 | 57.35 | 87.5 | 49.1 | 7.19 | 5.51 | 38.72 |
| PMC | 61.46 | 62.8 | 48.2 | 55.5 | 72.8 | 39.67 | 15.81 | 12.38 | 32.66 |
| Qyrzula | 69.14 | 68.1 | 49.8 | 58.95 | 88.6 | 51.07 | 6.08 | 5.41 | 38 |
| Wildcat | 57.03 | 59.8 | 42.8 | 51.3 | 67.7 | 35.37 | 22.27 | 11.57 | 31.45 |
| D29 | 63.78 | 63.4 | 44.4 | 53.9 | 82.9 | 47.52 | 9.51 | 8 | 35.49 |
| L5 | 62.49 | 62.1 | 43.5 | 52.8 | 81.2 | 47.45 | 10.01 | 9.3 | 33.79 |
| PBI1 | 60.28 | 61.4 | 44.3 | 52.85 | 74.5 | 42.71 | 15.41 | 10.99 | 31.37 |
| PG1 | 66.83 | 66.5 | 50.2 | 58.35 | 83.12 | 50.46 | 10.24 | 7.07 | 32.79 |
| Cooper | 69.26 | 67.6 | 50 | 58.8 | 89.47 | 52.65 | 6.32 | 4.38 | 37.19 |
| Che12 | 63.26 | 62.5 | 44.4 | 53.45 | 82.25 | 48.61 | 9.86 | 8.3 | 33.79 |
| Catera | 65.17 | 64.1 | 44.9 | 54.5 | 85.8 | 52.29 | 9.44 | 5.1 | 33.71 |
| Che8 | 61.41 | 63.71 | 48.6 | 56.16 | 70.57 | 37.08 | 16.05 | 12.52 | 34.32 |
| TM4 | 68.7 | 68.3 | 48.9 | 58.6 | 88.19 | 47.26 | 6.66 | 5.08 | 40.99 |
| Che9d | 61.4 | 63.02 | 48.15 | 55.59 | 71.6 | 37.07 | 15.54 | 11.94 | 35.41 |
| Bethlehem | 63.29 | 63.37 | 45.22 | 54.30 | 80.35 | 42.56 | 11.32 | 7.93 | 38.15 |
| Giles | 67.89 | 66.55 | 52.72 | 59.64 | 83.38 | 48.11 | 6.26 | 10.06 | 35.55 |
| Tweety | 61.83 | 63.12 | 48.89 | 56.01 | 72.07 | 38.77 | 14.85 | 12.15 | 34.2 |
| U2 | 63.87 | 63.61 | 45.92 | 54.77 | 81.1 | 43.53 | 10.68 | 7.83 | 37.94 |
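GC3s is commonly defined as the fraction of G or C at the third position of synonymous codons, that is, excluding Met (AUG), Trp (UGG), and stop codons. The sketch below applies that definition to a single coding sequence. The function name `gc3s` and the example sequence are hypothetical, and the study's exact software settings are not given here.

```python
# Codons with no synonymous alternative (Met, Trp) and the three stop codons
# are excluded, following the usual definition of GC3s.
EXCLUDED = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """Fraction of G or C at the third position of synonymous codons in a CDS."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    third_bases = [codon[2] for codon in codons if codon not in EXCLUDED]
    if not third_bases:
        raise ValueError("sequence contains no synonymous codons")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy sequence: ATG GCC AAA TGG TTC TAA
print(gc3s("ATGGCCAAATGGTTCTAA"))
```

In the toy sequence only GCC, AAA, and TTC are counted, so two of three third positions are G or C.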
Figure 1. Nc plot of the thirty-two mycobacteriophages. Genes of individual phages are shown in different colors.
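An Nc plot is usually read against the curve of expected Nc under GC3s composition alone, as given by Wright (1990). The sketch below evaluates that standard curve. It is an assumption that Figure 1 uses this curve, and applying it to genome means from the tables above (instead of per-gene values) is purely illustrative.

```python
def expected_nc(gc3s_fraction):
    """Expected Nc when codon bias is due to GC3s composition only (Wright, 1990).

    gc3s_fraction is GC3s expressed as a fraction between 0 and 1.
    """
    s = gc3s_fraction
    if not 0.0 <= s <= 1.0:
        raise ValueError("GC3s must be given as a fraction in [0, 1]")
    return 2 + s + 29 / (s ** 2 + (1 - s) ** 2)


# (phage, observed mean Nc, mean GC3s in percent) taken from the tables above.
for name, nc_observed, gc3s_percent in [("Cooper", 31.44, 89.35),
                                        ("Barnyard", 47.96, 65.84)]:
    nc_expected = expected_nc(gc3s_percent / 100)
    print(name, "observed:", nc_observed, "expected:", round(nc_expected, 2))
```

Points falling well below the curve indicate bias beyond what nucleotide composition alone would predict.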
Correlation coefficients of Nc with C3s, T3s, A3s, and G3s base composition.
| Phage name | C3s | T3s | A3s | G3s |
|---|---|---|---|---|
| 244 | −.814** | .816** | .655** | −0.126NS |
| Bxb1 | −.509** | .727** | .867** | −.532** |
| Bxz2 | −.631** | .636** | .829** | −.315* |
| Che9c | −.705** | .829** | .831** | −0.19NS |
| Rosebush | −.599** | .688** | .609** | −0.063NS |
| Omega | −.714** | .635** | .618** | −0.141NS |
| Halo | −.564** | .394** | .690** | −0.129NS |
| Barnyard | −.641** | .694** | .796** | −.621** |
| Bxz1 | −.742** | .725** | .777** | −0.123NS |
| Cjw1 | −.808** | .774** | .629** | −0.104NS |
| Corndog | −.484** | .788** | .621** | −.320** |
| Orion | −.817** | .726** | .673** | 0.147NS |
| Plot | −.666** | .709** | .717** | −.401** |
| Llij | −.278* | .348** | 0.212NS | −0.041NS |
| Pipefish | −.372** | .527** | .567** | −0.085NS |
| PMC | −.442** | .409** | −0.047 | 0.247NS |
| Qyrzula | −.657** | .743** | .665** | −0.048NS |
| Wildcat | −0.216NS | 0.069NS | .487** | −.274* |
| D29 | −.564** | .391** | .690** | −0.13NS |
| L5 | −.642** | .668** | .735** | −0.182NS |
| PBI1 | −.662** | .718** | .702** | −.387** |
| PG1 | −.815** | .735** | .616** | 0.214NS |
| Cooper | −.672** | .806** | .844** | −0.165NS |
| Che12 | −.711** | .708** | .628** | −0.115NS |
| Catera | −.716** | .768** | .702** | −0.125NS |
| TM4 | −.575** | .635** | .601** | −0.021NS |
| Bethlehem | −.583** | .700** | .758** | −.468** |
| Che8 | −.404** | .410** | 0.036NS | 0.109NS |
| Tweety | −.359** | .353** | 0.106NS | 0.088NS |
| Giles | −.535** | .502** | .406** | 0.158NS |
| U2 | −.514** | .783** | .880** | −.544** |
| Che9d | −0.102NS | 0.085 | .401** | −0.194NS |
Significant correlations are marked ** for P < .01 and * for P < .05; NS, nonsignificant.
Figure 2. Plot of Nc versus gene length for all mycobacteriophage genomes.
Correlation coefficients of gene length with Nc and GC3s values.
| Phage name | Nc | GC3s |
|---|---|---|
| Bxb1 | −.440** | .445** |
| Bxz2 | −.328* | .414** |
| Omega | −.223* | .243** |
| Bxz1 | −.241** | .205* |
| Cjw1 | −.227* | 0.213NS |
| Corndog | −.331** | .336** |
| Llij | −.342** | .332** |
| Wildcat | −.280* | 0.026NS |
| Catera | −.200* | .168* |
| U2 | −.475** | .427** |
| Bethlehem | −.472** | .445** |
Significant correlations are marked ** for P < .01 and * for P < .05; NS, nonsignificant.
Figure 3. Correspondence analysis of relative synonymous codon usage values of mycobacteriophages (32 genomes).
Figure 4. Scatter plot of mycobacteriophages and Nc values.
Figure 5. Scatter plot of mycobacteriophages and GC3s values.
Correlation of axis 1 with other codon usage indices.
| Virus | GC3s | A3s | T3s | G3s | C3s | Gravy | Aromaticity |
|---|---|---|---|---|---|---|---|
| 244 | −.930** | .729** | .829** | −.230* | −.798** | −0.206NS | 0.178NS |
| Bxb1 | −.891** | .871** | .689** | −.417** | −.603** | −0.222NS | 0.078NS |
| Bxz2 | −.904** | .865** | .676** | −.503** | −.502** | −.323* | −0.16NS |
| Che9c | −.921** | .854** | .848** | −0.201NS | −.717** | −0.002NS | −0.053NS |
| Rosebush | −.689** | .323** | .743** | 0.108NS | −.638** | 0.151NS | 0.028NS |
| Omega | −.897** | .650** | .771** | −.262** | −.736** | 0.004NS | −0.055NS |
| Halo | −.746** | .720** | .498** | −0.125NS | −.641** | −0.091NS | 0.013NS |
| Barnyard | −.933** | .843** | .792** | −.647** | −.730** | 0.02NS | 0.023NS |
| Bxz1 | −.901** | .859** | .729** | −.221** | −.718** | 0.005NS | −0.048NS |
| Cjw1 | −.929** | .796** | .742** | −.362** | −.712** | −.259* | 0.153NS |
| Corndog | −.904** | .647** | .838** | −0.175NS | −.652** | −0.184NS | −0.002NS |
| Orion | −.763** | .641** | .644** | 0.099NS | −.729** | −.362** | −0.169NS |
| Plot | −.826** | .820** | .594** | −.427** | −.631** | 0.172NS | 0.229NS |
| Llij | 0.15NS | .763** | −.846** | −.912** | .896** | −0.186NS | −0.166NS |
| Pipefish | .370** | −0.205NS | −.348** | −0.036NS | .263* | −0.105NS | 0.185NS |
| PMC | 0.097NS | .853** | −.919** | −.944** | .908** | −0.229NS | −0.078NS |
| Qyrzula | −.641** | .323** | .731** | 0.09NS | −.588** | 0.061NS | −0.076NS |
| Wildcat | 0.14NS | −.788** | .556** | .411** | −0.208NS | 0.017NS | −0.036NS |
| D29 | .740** | −.721** | −.479** | 0.126NS | .634** | 0.241NS | −0.041NS |
| L5 | .862** | −.775** | −.670** | .384** | .485** | .296* | −0.071NS |
| PBI1 | .845** | −.806** | −.664** | .365** | .705** | −0.131NS | −.269* |
| PG1 | .740** | −0.524NS | −.678** | −0.219NS | .753** | .277** | 0.118NS |
| Cooper | .782** | −.782** | −.609** | 0.153NS | .563** | 0.2NS | −0.001NS |
| Che12 | .803** | −.752** | −.596** | 0.201NS | .650** | 0.214NS | −0.249NS |
| Catera | .914** | −.840** | −.766** | .233** | .702** | 0.041NS | 0.002NS |
| TM4 | 0.085NS | | 0.258NS | 0.182NS | −0.094NS | 0.003NS | 0.008NS |
| Bethlehem | .882** | −.881** | −.625** | .505** | .602** | 0.229NS | .294* |
| Che8 | 0.124NS | .816** | −.883** | −.917** | .899** | −0.242NS | −0.189NS |
| Tweety | −0.113NS | −.797** | .891** | .934** | −.902** | 0.231NS | 0.098NS |
| Giles | −.352** | .730** | −0.447NS | −.887** | .691** | −.444** | −0.102NS |
| U2 | .922** | −.918** | −.737** | .522** | .537** | 0.066NS | 0.01NS |
| Che9d | 0.16NS | .778** | −.831** | −.857** | .862** | −0.027NS | −0.231NS |
Significant correlations are marked ** for P < .01 and * for P < .05; NS, nonsignificant.
Relative Synonymous Codon Usage for the highly and lowly expressed genes.
| AA | Codon | RSCU (a) | N (a) | RSCU (b) | N (b) | AA | Codon | RSCU (a) | N (a) | RSCU (b) | N (b) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | 0.08 | (34) | 0.57 | (185) | Ser | UCU | 0.11 | (27) | 0.94 | (178) |
| | UUC* | 1.92 | (839) | 1.43 | (466) | | UCC* | 1.20 | (284) | 0.88 | (166) |
| Leu | UUA | 0.00 | (0) | 0.16 | (41) | | UCA | 0.10 | (23) | 0.84 | (160) |
| | UUG | 0.09 | (33) | 1.55 | (390) | | UCG | 1.76 | (417) | 1.67 | (317) |
| | CUU | 0.07 | (26) | 0.93 | (235) | Pro | CCU | 0.16 | (60) | 0.85 | (249) |
| | CUC* | 1.38 | (518) | 1.16 | (292) | | CCC* | 1.49 | (545) | 0.88 | (257) |
| | CUA | 0.01 | (5) | 0.49 | (123) | | CCA | 0.12 | (43) | 0.71 | (209) |
| | CUG* | 4.45 | (1676) | 1.70 | (429) | | CCG* | 2.23 | (813) | 1.56 | (459) |
| Ile | AUU | 0.19 | (78) | 1.01 | (288) | Thr | ACU | 0.09 | (47) | 0.78 | (217) |
| | AUC* | 2.81 | (1181) | 1.74 | (495) | | ACC* | 3.11 | (1573) | 1.20 | (333) |
| | AUA | 0.00 | (1) | 0.25 | (72) | | ACA | 0.07 | (33) | 0.70 | (193) |
| Met | AUG | 1.00 | (648) | 1.00 | (427) | | ACG | 0.73 | (367) | 1.32 | (364) |
| Val | GUU | 0.15 | (88) | 0.92 | (332) | Ala | GCU | 0.22 | (183) | 0.91 | (428) |
| | GUC* | 1.97 | (1131) | 1.04 | (373) | | GCC* | 2.50 | (2078) | 1.03 | (483) |
| | GUA | 0.08 | (48) | 0.34 | (123) | | GCA | 0.22 | (181) | 0.76 | (355) |
| | GUG | 1.79 | (1025) | 1.69 | (608) | | GCG | 1.06 | (882) | 1.30 | (611) |
| Tyr | UAU | 0.10 | (41) | 0.69 | (185) | Cys | UGU | 0.06 | (7) | 0.83 | (123) |
| | UAC* | 1.90 | (767) | 1.31 | (350) | | UGC* | 1.94 | (212) | 1.17 | (175) |
| TER | UAA | 0.56 | (20) | 0.79 | (28) | TER | UGA | 1.77 | (63) | 1.63 | (58) |
| | UAG | 0.67 | (24) | 0.59 | (21) | Trp | UGG | 1.00 | (531) | 1.00 | (468) |
| His | CAU | 0.12 | (36) | 0.89 | (236) | Arg | CGU | 0.52 | (164) | 1.37 | (364) |
| | CAC* | 1.88 | (570) | 1.11 | (296) | | CGC* | 3.79 | (1189) | 1.50 | (399) |
| Gln | CAA | 0.06 | (29) | 0.63 | (221) | | CGA | 0.18 | (55) | 0.85 | (227) |
| | CAG* | 1.94 | (1021) | 1.37 | (480) | | CGG | 1.41 | (442) | 1.23 | (328) |
| Asn | AAU | 0.10 | (50) | 0.75 | (265) | Ser | AGU | 0.11 | (27) | 0.73 | (139) |
| | AAC* | 1.90 | (926) | 1.25 | (446) | | AGC* | 2.71 | (642) | 0.94 | (178) |
| Lys | AAA | 0.06 | (33) | 0.60 | (229) | Arg | AGA | 0.00 | (1) | 0.41 | (109) |
| | AAG* | 1.94 | (1046) | 1.40 | (539) | | AGG | 0.11 | (33) | 0.64 | (170) |
| Asp | GAU | 0.15 | (151) | 0.89 | (564) | Gly | GGU | 0.54 | (305) | 1.04 | (414) |
| | GAC* | 1.85 | (1868) | 1.11 | (706) | | GGC* | 2.90 | (1646) | 1.19 | (471) |
| Glu | GAA | 0.29 | (270) | 0.84 | (513) | | GGA | 0.14 | (81) | 0.91 | (363) |
| | GAG* | 1.71 | (1609) | 1.16 | (704) | | GGG | 0.41 | (235) | 0.86 | (341) |
*Codons whose occurrences are significantly higher (P < .01) in genes at the extreme left of axis 1 than in genes at the extreme right of axis 1. Each group contains 10% of the sequences at either extreme of the major axis generated by correspondence analysis. AA: amino acid; N: number of codons; a: genes at the extreme left of axis 1; b: genes at the extreme right of axis 1.
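The footnote reports significance at P < .01 but does not name the test. One common approach is a 2x2 chi-square test comparing a codon against its synonyms in the two gene sets. The sketch below assumes that approach (without continuity correction) and applies it to the Phe counts from the table above. The function name `chi2_2x2` is illustrative and is not the authors' stated procedure.

```python
def chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 d.f., no continuity correction)
    for the contingency table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


CRITICAL_P01 = 6.635  # chi-square critical value for 1 d.f. at P = .01

# Phe in the table above: UUC vs. its synonym UUU,
# in the left-extreme (a) and right-extreme (b) gene sets.
chi2 = chi2_2x2(839, 34, 466, 185)
print("chi-square:", round(chi2, 1), "significant at P < .01:", chi2 > CRITICAL_P01)
```

Under this assumed test, UUC usage differs strongly between the two gene sets, in line with its asterisk in the table.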