Abstract
Codon usage bias (CUB) is a unique property of genomes, and its analysis has contributed to a better understanding of the molecular features and evolutionary processes of particular genes. In this study, genetic indices associated with CUB, including relative synonymous codon usage (RSCU) and the effective number of codons (ENC), as well as nucleotide composition, were investigated in the Clonorchis sinensis tyrosinase genes and their platyhelminth orthologs, which play an important role in eggshell formation. The RSCU patterns differed substantially among the tyrosinase genes examined. In a neutrality analysis, the correlation between GC12 and GC3 was statistically significant, and the regression line had a relatively gradual slope (0.218). The NC-plot, i.e., GC3 vs ENC, showed that most of the tyrosinase genes fell below the expected curve. The codon adaptation index (CAI) values of the platyhelminth tyrosinases had a narrow distribution, between 0.685/0.714 and 0.797/0.837, and were negatively correlated with their ENC values. Taken together, these results suggest that CUB in the tyrosinase genes appears to be governed primarily by selection pressure rather than mutational bias, although the latter provided an additional force in shaping the CUB of the C. sinensis and Opisthorchis viverrini genes. It was also apparent that the equilibrium point between selection pressure and mutational bias is inclined much more toward selection pressure in highly expressed C. sinensis genes than in poorly expressed genes.
Keywords: Clonorchis sinensis; codon usage bias; mutational bias; selection pressure; tyrosinase
Year: 2017 PMID: 28506040 PMCID: PMC5450960 DOI: 10.3347/kjp.2017.55.2.175
Source DB: PubMed Journal: Korean J Parasitol ISSN: 0023-4001 Impact factor: 1.341
G+C content of platyhelminth tyrosinase gene sequences
| Species | Accession no. | Length in bp (codon no.) | GC (%) | GC1 (%) | GC2 (%) | GC3 (%) | CAI-1 | CAI-2 | ENC |
|---|---|---|---|---|---|---|---|---|---|
| | GAA27975 | 1431 (477) | 45.8 | 48.6 | 43.6 | 45.3 | 0.721 | 0.752 | 55.1 |
| | GAA32069 | 1449 (483) | 45.3 | 50.7 | 38.9 | 46.4 | 0.700 | 0.730 | 59.4 |
| | GAA48882 | 1434 (478) | 47.5 | 52.5 | 40.6 | 49.4 | 0.697 | 0.742 | 58.0 |
| | GAA48883 | 1419 (473) | 46.8 | 51.0 | 42.5 | 46.9 | 0.710 | 0.740 | 58.2 |
| | GAA54899 | 978 (326) | 49.2 | 52.1 | 44.2 | 51.2 | 0.690 | 0.756 | 58.2 |
| | | | | | | | | | |
| | XP_009169523 | 1038 (346) | 49.5 | 50.9 | 45.1 | 52.6 | 0.693 | 0.755 | 55.1 |
| | XP_009170910-1 | 1419 (473) | 47.7 | 52.4 | 43.6 | 47.1 | 0.711 | 0.746 | 58.1 |
| | XP_009170910-2 | 1431 (477) | 46.3 | 49.7 | 43.0 | 46.3 | 0.718 | 0.744 | 55.0 |
| | XP_009170911 | 1434 (478) | 46.4 | 51.3 | 40.4 | 47.5 | 0.701 | 0.748 | 58.4 |
| | XP_009173140 | 1449 (483) | 46.2 | 50.7 | 39.8 | 48.2 | 0.696 | 0.736 | 59.1 |
| | | | | | | | | | |
| | | 1494 (498) | 35.0 | 42.2 | 38.6 | 24.3 | 0.763 | 0.810 | 44.9 |
| | | 1380 (460) | 34.3 | 39.1 | 39.1 | 24.6 | 0.761 | 0.815 | 43.4 |
| | | 1425 (475) | 32.8 | 36.4 | 36.2 | 25.7 | 0.790 | 0.820 | 45.7 |
| | | 1437 (479) | 40.2 | 43.2 | 39.0 | 38.2 | 0.752 | 0.771 | 54.8 |
| | | 1545 (515) | 33.9 | 38.3 | 36.1 | 27.2 | 0.793 | 0.791 | 46.2 |
| | | 1647 (549) | 37.3 | 39.5 | 34.8 | 37.5 | 0.745 | 0.761 | 52.5 |
| | | | | | | | | | |
| | XP_002572828 | 1437 (479) | 31.2 | 39.7 | 39.0 | 14.8 | 0.795 | 0.820 | 36.4 |
| | XP_002576328 | 1446 (482) | 30.9 | 38.4 | 41.7 | 12.7 | 0.782 | 0.825 | 34.5 |
| | | | | | | | | | |
| | AAW24636 | 1458 (486) | 34.3 | 41.8 | 40.7 | 20.4 | 0.786 | 0.805 | 40.9 |
| | AAW26996 | 1416 (472) | 30.7 | 39.8 | 41.7 | 10.6 | 0.786 | 0.827 | 32.8 |
| | | | | | | | | | |
| | KGB32970 | 1458 (486) | 31.4 | 40.1 | 38.7 | 15.4 | 0.795 | 0.826 | 36.1 |
| | KGB41928 | 1140 (380) | 32.0 | 41.6 | 40.0 | 14.5 | 0.767 | 0.837 | 35.9 |
| | | | | | | | | | |
| | AJE29953 | 1425 (475) | 46.9 | 51.4 | 43.2 | 46.1 | 0.690 | 0.741 | 61.0 |
| | | | | | | | | | |
| | AGC74039 | 1509 (503) | 49.8 | 53.3 | 42.1 | 54.1 | 0.685 | 0.714 | 54.1 |
| | | | | | | | | | |
| | ELU13195 | 1602 (534) | 53.6 | 53.6 | 44.9 | 62.4 | 0.631 | 0.717 | 58.3 |
Values in parentheses are the percentages (%) of G+C in the whole genomic and coding DNA sequences, respectively, of the respective organisms.
Each gene is identified by the accession number of its protein product or transcribed mRNA sequence (italicized).
Only partial sequence information is currently available from GenBank.
The codon adaptation index (CAI) values were calculated with reference to Caenorhabditis elegans (CAI-1) and Escherichia coli (CAI-2) gene sets, respectively.
ENC, effective number of codons.
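The GC, GC1, GC2 and GC3 columns above are position-specific G+C percentages of a coding sequence. The following is a minimal Python sketch of how such values can be derived, not the authors' actual pipeline. The function name `gc_by_codon_position` and the toy sequence are hypothetical. It assumes an in-frame coding sequence and counts every codon, including ATG, TGG and the stop codon; the source does not state whether such codons were excluded.

```python
def gc_by_codon_position(cds: str) -> dict[str, float]:
    """Return percent G+C overall and at codon positions 1, 2 and 3.

    Assumes `cds` is an in-frame coding sequence (length divisible by 3).
    """
    seq = cds.upper()
    if not seq or len(seq) % 3 != 0:
        raise ValueError("coding sequence must be non-empty and a multiple of 3 long")

    def gc_percent(bases: str) -> float:
        # Share of G or C among the given bases, as a percentage.
        return 100.0 * sum(base in "GC" for base in bases) / len(bases)

    result = {
        "GC": gc_percent(seq),
        "GC1": gc_percent(seq[0::3]),  # first codon positions
        "GC2": gc_percent(seq[1::3]),  # second codon positions
        "GC3": gc_percent(seq[2::3]),  # third codon positions
    }
    # GC12 is the mean of GC1 and GC2, as used in neutrality plots.
    result["GC12"] = (result["GC1"] + result["GC2"]) / 2
    return result


# Hypothetical five-codon sequence (ATG GCG TGC GAA TAA), for illustration only.
toy = gc_by_codon_position("ATGGCGTGCGAATAA")
for name, value in toy.items():
    print(f"{name}: {value:.1f}")
```

For the toy sequence, positions 1 and 2 each contain two G/C bases out of five and position 3 contains three, so GC1 and GC2 are 40% and GC3 is 60%.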
Fig. 1. Neutrality plot (GC12 against GC3) of platyhelminth tyrosinase genes. The dotted line was obtained by regression analysis. Each gene is identified by the GenBank accession no. of the corresponding protein or cDNA (italicized), as follows: 1, GAA27975; 2, GAA32069; 3, GAA48882; 4, GAA48883; 5, GAA54899; 6, XP_009169523; 7, XP_009170910-1; 8, XP_009170910-2; 9, XP_009170911; 10, XP_009173140; 11, GAKN01000997; 12, GAKN01001049; 13, GAKN01005471; 14, GAKN01006835; 15, GAKN01007889; 16, JN983828; 17, XP_002572828; 18, XP_002576328; 19, AAW24636; 20, AAW26996; 21, KGB32970; 22, KGB41928; 23, AJE29953; 24, AGC74039; 25, ELU13195.
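The regression behind a neutrality plot can be sketched in a few lines of Python. The block below is an illustration under stated assumptions, not the authors' analysis: it uses the rounded GC1, GC2 and GC3 values transcribed from the G+C content table (genes 1 to 25, in table order), takes GC12 as the mean of GC1 and GC2, and fits an ordinary least-squares line of GC12 on GC3. The helper name `least_squares` is hypothetical.

```python
# (GC1, GC2, GC3) in percent, transcribed from the G+C content table, genes 1-25.
GC_TABLE = [
    (48.6, 43.6, 45.3), (50.7, 38.9, 46.4), (52.5, 40.6, 49.4),
    (51.0, 42.5, 46.9), (52.1, 44.2, 51.2), (50.9, 45.1, 52.6),
    (52.4, 43.6, 47.1), (49.7, 43.0, 46.3), (51.3, 40.4, 47.5),
    (50.7, 39.8, 48.2), (42.2, 38.6, 24.3), (39.1, 39.1, 24.6),
    (36.4, 36.2, 25.7), (43.2, 39.0, 38.2), (38.3, 36.1, 27.2),
    (39.5, 34.8, 37.5), (39.7, 39.0, 14.8), (38.4, 41.7, 12.7),
    (41.8, 40.7, 20.4), (39.8, 41.7, 10.6), (40.1, 38.7, 15.4),
    (41.6, 40.0, 14.5), (51.4, 43.2, 46.1), (53.3, 42.1, 54.1),
    (53.6, 44.9, 62.4),
]


def least_squares(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


gc3 = [row[2] for row in GC_TABLE]
gc12 = [(row[0] + row[1]) / 2 for row in GC_TABLE]  # mean of GC1 and GC2
slope, intercept = least_squares(gc3, gc12)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f}")
```

With these rounded inputs the slope should land close to the 0.218 reported in the abstract. In a neutrality plot, a slope near 1 would mean GC12 tracks GC3, as expected if mutational bias alone shaped base composition, whereas a shallow slope points to constraint on the first two codon positions.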
Relative usage of synonymous codons in Clonorchis sinensis tyrosinase genes
| Amino acid | Codon | Relative synonymous codon usage (no.) | | | | |
|---|---|---|---|---|---|---|
| | | GAA27975 | GAA32069 | GAA48882 | GAA48883 | GAA54899 |
| Ala | GCG | 0.14 (1) | 0.60 (3) | 0.60 (3) | 0.67 (4) | 0.71 (3) |
| | GCC | 1.10 (8) | 0.80 (4) | 1.40 (7) | 1.33 (8) | 1.18 (5) |
| | GCA | 1.79 (13) | 1.20 (6) | 1.80 (9) | 0.67 (4) | 0.94 (4) |
| | GCT | 0.97 (7) | 1.40 (7) | 0.20 (1) | 1.33 (8) | 1.18 (5) |
| Cys | TGC | 0.84 (8) | 0.64 (7) | 0.73 (8) | 0.74 (7) | 0.86 (3) |
| | TGT | 1.16 (11) | 1.36 (15) | 1.27 (14) | 1.26 (12) | 1.14 (4) |
| Asp | GAC | 0.83 (12) | 1.04 (14) | 1.00 (16) | 0.73 (12) | 1.03 (15) |
| | GAT | 1.17 (17) | 0.96 (13) | 1.00 (16) | 1.27 (21) | 0.97 (14) |
| Glu | GAG | 0.62 (5) | 0.95 (10) | 0.93 (13) | 0.87 (10) | 1.50 (6) |
| | GAA | 1.38 (11) | 1.05 (11) | 1.07 (15) | 1.13 (13) | 0.50 (2) |
| Phe | TTC | 1.13 (13) | 1.17 (17) | 0.94 (15) | 0.97 (14) | 0.76 (8) |
| | TTT | 0.87 (10) | 0.83 (12) | 1.06 (17) | 1.03 (15) | 1.24 (13) |
| Gly | GGG | 0.00 (0) | 0.33 (3) | 0.69 (5) | 0.35 (3) | 1.05 (5) |
| | GGC | 0.71 (6) | 0.67 (6) | 0.28 (2) | 0.94 (8) | 0.84 (4) |
| | GGA | 1.41 (12) | 1.78 (16) | 1.66 (12) | 1.12 (9) | 1.26 (6) |
| | GGT | 1.88 (16) | 1.22 (11) | 1.38 (10) | 1.65 (14) | 0.84 (4) |
| His | CAC | 1.47 (11) | 1.21 (17) | 1.10 (11) | 1.22 (11) | 0.67 (3) |
| | CAT | 0.53 (4) | 0.79 (11) | 0.90 (9) | 0.78 (7) | 1.33 (6) |
| Ile | ATC | 0.92 (8) | 0.80 (8) | 1.04 (8) | 1.14 (8) | 1.09 (4) |
| | ATA | 0.35 (3) | 0.90 (9) | 0.78 (6) | 0.29 (2) | 0.82 (3) |
| | ATT | 1.73 (15) | 1.30 (13) | 1.17 (9) | 1.57 (11) | 1.09 (4) |
| Lys | AAG | 0.72 (9) | 0.74 (10) | 0.55 (3) | 0.70 (8) | 0.20 (1) |
| | AAA | 1.28 (16) | 1.26 (17) | 1.45 (8) | 1.30 (15) | 1.80 (9) |
| Leu | TTG | 2.17 (13) | 1.59 (9) | 1.89 (12) | 2.06 (12) | 1.82 (10) |
| | CTG | 0.83 (5) | 1.94 (11) | 1.42 (9) | 0.86 (5) | 1.27 (7) |
| | CTC | 0.83 (5) | 0.53 (3) | 0.95 (6) | 0.69 (4) | 1.27 (7) |
| | TTA | 0.67 (4) | 0.35 (2) | 0.47 (3) | 0.51 (3) | 0.36 (2) |
| | CTA | 0.33 (2) | 0.71 (4) | 0.79 (5) | 0.69 (4) | 0.18 (1) |
| | CTT | 1.17 (7) | 0.88 (5) | 0.47 (3) | 1.20 (7) | 1.09 (6) |
| Met | ATG | 1.00 (11) | 1.00 (11) | 1.00 (14) | 1.00 (15) | 1.00 (5) |
| Asn | AAC | 0.89 (12) | 1.14 (12) | 0.82 (9) | 0.87 (10) | 0.82 (7) |
| | AAT | 1.11 (15) | 0.86 (9) | 1.18 (13) | 1.13 (13) | 1.18 (10) |
| Pro | CCG | 0.69 (5) | 0.59 (4) | 1.03 (8) | 0.65 (5) | 0.89 (6) |
| | CCC | 0.83 (6) | 1.04 (7) | 1.03 (8) | 0.77 (6) | 0.44 (3) |
| | CCA | 0.97 (7) | 1.19 (8) | 1.55 (12) | 1.42 (11) | 1.78 (12) |
| | CCT | 1.52 (11) | 1.19 (8) | 0.39 (3) | 1.16 (9) | 0.89 (6) |
| Gln | CAG | 0.62 (4) | 0.59 (5) | 1.25 (10) | 0.89 (4) | 0.40 (4) |
| | CAA | 1.38 (9) | 1.41 (12) | 0.75 (6) | 1.11 (5) | 1.20 (6) |
| Arg | AGG | 0.39 (2) | 1.00 (5) | 0.39 (2) | 0.38 (2) | 0.40 (1) |
| | CGG | 1.35 (7) | 0.80 (4) | 1.16 (6) | 0.94 (5) | 0.80 (2) |
| | CGC | 0.58 (3) | 1.20 (6) | 0.39 (2) | 0.38 (2) | 0.40 (1) |
| | AGA | 1.16 (6) | 0.60 (3) | 0.97 (5) | 1.12 (6) | 2.00 (5) |
| | CGA | 1.55 (8) | 1.60 (8) | 1.74 (9) | 1.88 (10) | 0.40 (1) |
| | CGT | 0.97 (5) | 0.80 (4) | 1.35 (7) | 1.31 (7) | 2.00 (5) |
| Ser | AGC | 0.64 (3) | 0.90 (3) | 0.55 (2) | 1.14 (4) | 1.11 (5) |
| | TCG | 1.29 (6) | 0.30 (1) | 0.00 (0) | 1.14 (4) | 0.89 (4) |
| | TCC | 0.64 (3) | 0.30 (1) | 1.36 (5) | 0.29 (1) | 1.11 (5) |
| | AGT | 1.29 (6) | 2.10 (7) | 1.36 (5) | 0.86 (3) | 1.11 (5) |
| | TCA | 1.29 (6) | 0.90 (3) | 1.36 (5) | 0.86 (3) | 1.78 (8) |
| | TCT | 0.86 (4) | 1.50 (5) | 1.36 (5) | 1.71 (6) | 0.00 (0) |
| Thr | ACG | 0.87 (5) | 1.05 (5) | 1.00 (7) | 0.83 (5) | 1.09 (6) |
| | ACC | 1.04 (6) | 0.84 (4) | 1.14 (8) | 0.83 (5) | 1.82 (10) |
| | ACA | 0.35 (2) | 1.05 (5) | 1.00 (7) | 1.17 (7) | 0.55 (3) |
| | ACT | 1.74 (10) | 1.05 (5) | 0.86 (6) | 1.17 (7) | 0.55 (3) |
| Val | GTG | 1.60 (10) | 1.33 (8) | 1.57 (11) | 1.12 (7) | 1.71 (9) |
| | GTC | 0.16 (1) | 0.50 (3) | 0.86 (6) | 0.48 (3) | 0.95 (5) |
| | GTA | 0.32 (2) | 1.00 (6) | 0.71 (5) | 0.64 (4) | 0.38 (2) |
| | GTT | 1.92 (12) | 1.17 (7) | 0.86 (6) | 1.76 (11) | 0.95 (5) |
| Trp | TGG | 1.00 (15) | 1.00 (13) | 1.00 (11) | 1.00 (15) | 1.00 (10) |
| Tyr | TAC | 1.18 (13) | 0.77 (10) | 0.84 (8) | 1.67 (15) | 0.75 (3) |
| | TAT | 0.82 (9) | 1.23 (16) | 1.16 (11) | 0.33 (3) | 1.25 (5) |
| Stop | TAG | 0.00 (0) | 0.00 (0) | 3.00 (1) | 0.00 (0) | ND |
| | TGA | 0.00 (0) | 3.00 (1) | 0.00 (0) | 3.00 (1) | ND |
| | TAA | 3.00 (1) | 0.00 (0) | 0.00 (0) | 0.00 (0) | ND |
ND, not determined because only partial sequence information is available for the GAA54899 gene.
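Each RSCU value in the table above can be recovered from the codon counts given in parentheses: the observed count of a codon is divided by the mean count across its synonymous family. The Python sketch below illustrates that standard definition with the Ala codons of GAA27975. The function name `rscu` and the zero-count convention are illustrative choices, not taken from the source.

```python
def rscu(counts: dict[str, int]) -> dict[str, float]:
    """RSCU for one synonymous codon family.

    RSCU = observed count / (family total / number of synonymous codons).
    """
    total = sum(counts.values())
    if total == 0:
        # Amino acid absent from the gene: RSCU is undefined.
        # Returning 0.0 is a convention chosen here for illustration.
        return {codon: 0.0 for codon in counts}
    expected = total / len(counts)  # count per codon if usage were uniform
    return {codon: n / expected for codon, n in counts.items()}


# Ala codon counts for GAA27975, taken from the table above.
ala_gaa27975 = {"GCG": 1, "GCC": 8, "GCA": 13, "GCT": 7}
for codon, value in rscu(ala_gaa27975).items():
    print(codon, f"{value:.2f}")
# prints:
# GCG 0.14
# GCC 1.10
# GCA 1.79
# GCT 0.97
```

The printed values match the Ala row for GAA27975. A value above 1 marks a codon used more often than expected under uniform synonymous usage, and the values within a family sum to the number of synonymous codons.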
Fig. 2. Scatter plots of GC3 (A) and codon adaptation index (CAI, B) vs effective number of codons (ENC) of the platyhelminth tyrosinase genes. The continuous curve in panel A represents the null hypothesis that the GC bias at synonymous sites is due solely to mutation and not to selection. The expected ENC values along the curve were calculated with the equation $\mathrm{ENC} = 2 + S + \dfrac{29}{S^{2} + (1 - S)^{2}}$, where $S$ is the given GC3 value expressed as a fraction. The dotted line in panel B was obtained by regression analysis between ENC and CAI. For gene identities, refer to the legend for Fig. 1.
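The expected curve in panel A can be evaluated directly from the equation in the legend. The Python sketch below does so for three genes from the G+C content table and reports whether each observed ENC lies below the curve. It assumes GC3 is passed as a fraction between 0 and 1 (the table lists percentages); the function name `expected_enc` and the choice of genes are illustrative.

```python
def expected_enc(gc3: float) -> float:
    """Expected ENC when GC bias at synonymous sites is due to mutation alone.

    `gc3` is the GC3 value as a fraction in [0, 1], i.e. S in
    ENC = 2 + S + 29 / (S^2 + (1 - S)^2).
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


# (accession, GC3 in percent, observed ENC), taken from the G+C content table.
genes = [
    ("GAA27975", 45.3, 55.1),
    ("AAW26996", 10.6, 32.8),
    ("AJE29953", 46.1, 61.0),
]

for accession, gc3_percent, observed in genes:
    expected = expected_enc(gc3_percent / 100.0)  # convert percent to fraction
    position = "below" if observed < expected else "on or above"
    print(f"{accession}: observed {observed}, expected {expected:.2f}, {position} the curve")
```

Of these three genes, GAA27975 and AAW26996 fall below the curve while AJE29953 lies slightly above it, consistent with the statement that most, but not all, tyrosinase genes were below the expected curve.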
Fig. 3. Correspondence analysis of relative synonymous codon usage. The distributions of genes (A) and codons (B) are shown along the first and second axes. Numerals in parentheses in panel A indicate the GC content of all codons in each of the respective genes.
Correlation between tyrosinase gene position on axis 1 and axis 2 and nucleotide compositions in different codon positions
| Axis | GC | GC1 | GC2 | GC12 | GC3 | ENC |
|---|---|---|---|---|---|---|
| 1 | −0.972 | −0.887 | −0.482 | −0.814 | −0.995 | −0.961 |
| 2 | −0.116 | −0.323 | −0.409 | −0.375 | 0.022 | 0.080 |
ENC, effective number of codons.
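The entries in the table above are correlation coefficients. The axis coordinates themselves are not listed, so the computation cannot be reproduced from this page; as a stand-in, the Python sketch below computes a correlation between CAI-1 and ENC for eight genes from the G+C content table. It assumes Pearson's r, which is an assumption because the source does not name the coefficient used. The function name `pearson_r` and the choice of genes are illustrative.

```python
import math


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# CAI-1 and ENC for GAA27975, GAA32069, XP_002572828, XP_002576328,
# AAW24636, AAW26996, AJE29953 and AGC74039 (from the G+C content table).
cai1 = [0.721, 0.700, 0.795, 0.782, 0.786, 0.786, 0.690, 0.685]
enc = [55.1, 59.4, 36.4, 34.5, 40.9, 32.8, 61.0, 54.1]

r = pearson_r(cai1, enc)
print(f"r(CAI-1, ENC) = {r:.2f}")
```

For this subset the coefficient is strongly negative, in line with the abstract's statement that CAI values were negatively correlated with ENC.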