Hoda Mirsafian, Adiratna Mat Ripen, Aarti Singh, Phaik Hwan Teo, Amir Feisal Merican, Saharuddin Bin Mohamad.
Abstract
Synonymous codon usage bias occurs in taxa across all three domains of life. Although the frequency of codon usage differs between species and within the genome of a single species, the bias is nonrandom and tissue-specific. Several factors, including GC content, nucleotide distribution, protein hydropathy, protein secondary structure, and translational selection, are reported to contribute to codon usage preference. Synonymous codon usage patterns can help reveal the expression patterns of genes as well as the evolutionary relationships between sequences. In this study, synonymous codon usage bias patterns were determined for the evolutionarily close proteins of the albumin superfamily, namely, albumin, α-fetoprotein, afamin, and vitamin D-binding protein. Our study demonstrated that the genes of the four albumin superfamily members have low GC content and high values of the effective number of codons (ENC), suggesting high expressivity of these genes and less bias in codon usage preferences. This study also provided evidence that the albumin superfamily members are not subjected to mutational selection pressure.
Year: 2014 PMID: 24707212 PMCID: PMC3951064 DOI: 10.1155/2014/639682
Source DB: PubMed Journal: ScientificWorldJournal ISSN: 1537-744X
Genomic information of the reference sequences, grand average hydrophobicity score, ENCs, GC content, and GC3 of human albumin superfamily members.
| Parameter | Albumin (ALB) | Afamin (AFM) | Alpha-fetoprotein (AFP) | Vitamin D-binding protein (VDBP) |
|---|---|---|---|---|
| GenBank accession number | NM_000477.5 | NM_001133.2 | NM_001134.1 | NM_000583.3 |
| Gene length (bp) | 2264 | 1997 | 2032 | 2024 |
| Grand average of hydrophobicity score (Gravy score) | −0.354 | −0.248 | −0.388 | −0.336 |
| GC content (%) | 42.95 | 42.02 | 39.28 | 44.63 |
| Effective number of codons (ENC) | 53.91 | 51.65 | 54.78 | 56.62 |
| GC3 (%) | 38.00 | 37.10 | 37.30 | 42.80 |
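To make the GC content and GC3 rows concrete, the following is a minimal Python sketch of how such percentages can be computed from a coding sequence. It assumes the simplest definitions (all bases for GC content, every third codon position for GC3); the source does not state whether the authors excluded Met, Trp, or stop codons from GC3, so the exact values in the table may follow a slightly different convention. The function names and the toy sequence are hypothetical and are not taken from the paper.

```python
def gc_content(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds):
    """Percentage of codons whose third position is G or C.

    Assumption: every codon is counted, including Met, Trp, and stop codons.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # ignore a trailing partial codon
    third_bases = cds[2:usable:3]      # bases at codon position 3
    return 100.0 * sum(base in "GC" for base in third_bases) / len(third_bases)


# Hypothetical toy coding sequence (not from any albumin superfamily gene):
# ATG AAA TTT GCC GGA TAA
toy_cds = "ATGAAATTTGCCGGATAA"

print(f"GC  = {gc_content(toy_cds):.2f}%")
print(f"GC3 = {gc3_content(toy_cds):.2f}%")
```

For the toy sequence, 6 of 18 bases are G or C and 2 of 6 third positions are G or C, so both values come out at one third.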
Figure 1. Comparison of percent similarity and identity of nucleotide sequences and amino acid sequences of human albumin superfamily members.
Nucleotide distribution of human albumin superfamily members.
| Nucleotide | ALB (%) | AFP (%) | AFM (%) | VDBP (%) |
|---|---|---|---|---|
| A | 30.4 (556) | 32.6 (596) | 32.8 (591) | 29.9 (426) |
| T | 26.7 (488) | 25.4 (465) | 27.9 (502) | 25.5 (363) |
| G | 23.0 (421) | 21.7 (397) | 20.1 (361) | 21.4 (305) |
| C | 19.9 (365) | 20.3 (372) | 19.2 (346) | 23.2 (331) |
| AT | 57.049 | 57.978 | 60.722 | 55.368 |
| GC | 42.951 | 42.022 | 39.278 | 44.632 |
The values in parentheses represent the number of individual nucleotides in the genes of human albumin superfamily members.
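The AT and GC percentages in the nucleotide distribution table follow directly from the counts given in parentheses. The short Python sketch below recomputes them as a consistency check; the counts are copied from the table, while the variable and function names are illustrative only.

```python
# Nucleotide counts as listed in parentheses in the table (per gene).
counts = {
    "ALB":  {"A": 556, "T": 488, "G": 421, "C": 365},
    "AFP":  {"A": 596, "T": 465, "G": 397, "C": 372},
    "AFM":  {"A": 591, "T": 502, "G": 361, "C": 346},
    "VDBP": {"A": 426, "T": 363, "G": 305, "C": 331},
}


def composition(c):
    """Return (total bases, AT percentage, GC percentage) for one gene."""
    total = sum(c.values())
    at = 100.0 * (c["A"] + c["T"]) / total
    gc = 100.0 * (c["G"] + c["C"]) / total
    return total, at, gc


summary = {gene: composition(c) for gene, c in counts.items()}
for gene, (total, at, gc) in summary.items():
    print(f"{gene}: n={total}, AT={at:.3f}%, GC={gc:.3f}%")
```

Rounded to three decimals, the recomputed values agree with the AT and GC rows of the table (for example, 57.049% and 42.951% for ALB).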
Figure 2. Codon frequency distribution of human albumin superfamily members.
Relative synonymous codon usage in human albumin superfamily members. The value in bold indicates the codons used with high frequency.
| Amino acid | Codons | RSCU1 | Number | RSCU2 | Number | RSCU3 | Number | RSCU4 | Number |
|---|---|---|---|---|---|---|---|---|---|
| Phe | UUU | | 25 | | 17 | | 28 | | 11 |
| | UUC | 0.57 | 10 | 0.94 | 15 | 0.70 | 15 | 0.84 | 8 |
| Leu | UUA | 0.94 | 10 | 1.00 | 10 | | 11 | 0.53 | 5 |
| | UUG | 1.13 | 12 | 1.10 | 11 | 0.87 | 8 | 0.63 | 6 |
| | CUU | | 19 | 0.90 | 9 | | 11 | 1.16 | 11 |
| | CUC | 0.66 | 7 | 0.40 | 4 | 0.98 | 9 | 0.95 | 9 |
| | CUA | 0.38 | 4 | 0.90 | 9 | 0.65 | 6 | 1.05 | 10 |
| | CUG | 1.13 | 12 | | 17 | 1.09 | 10 | | 16 |
| Ile | AUU | | 4 | | 15 | 1.07 | 10 | | 3 |
| | AUC | | 4 | 0.71 | 8 | 0.75 | 7 | | 3 |
| | AUA | 0.33 | 1 | 0.97 | 11 | | 11 | 0.75 | 2 |
| Val | GUU | 1.12 | 12 | | 11 | | 13 | 0.89 | 6 |
| | GUC | 0.65 | 7 | 0.80 | 6 | 0.67 | 6 | | 8 |
| | GUA | 0.74 | 8 | 0.80 | 6 | 0.67 | 6 | 1.04 | 7 |
| | GUG | | 16 | 0.93 | 7 | 1.22 | 11 | 0.89 | 6 |
| Ser | UCU | 0.64 | 3 | 1.26 | 8 | | 12 | 1.43 | 10 |
| | UCC | | 7 | 0.47 | 3 | 0.86 | 5 | 1.29 | 9 |
| | UCA | 1.29 | 6 | | 10 | 1.03 | 6 | | 12 |
| | UCG | 0.64 | 3 | 0.47 | 3 | 0.00 | 0 | 0.00 | 0 |
| Pro | CCU | | 10 | | 9 | | 12 | 1.38 | 9 |
| | CCC | 1.00 | 6 | 0.76 | 4 | 0.71 | 5 | 0.92 | 6 |
| | CCA | 1.17 | 7 | 1.52 | 8 | 1.57 | 11 | | 10 |
| | CCG | 0.17 | 1 | 0.00 | 0 | 0.00 | 0 | 0.15 | 1 |
| Thr | ACU | 0.97 | 7 | | 16 | 1.18 | 10 | 1.25 | 10 |
| | ACC | 1.24 | 9 | 0.67 | 6 | 0.82 | 7 | | 11 |
| | ACA | | 11 | 1.33 | 12 | | 13 | 1.13 | 9 |
| | ACG | 0.28 | 2 | 0.22 | 2 | 0.47 | 4 | 0.25 | 2 |
| Ala | GCU | | 30 | 1.20 | 15 | | 11 | | 15 |
| | GCC | 0.89 | 14 | 0.88 | 11 | 0.71 | 5 | 1.09 | 9 |
| | GCA | 1.08 | 17 | | 21 | 1.29 | 9 | 0.85 | 7 |
| | GCG | 0.13 | 2 | 0.24 | 3 | 0.43 | 3 | 0.24 | 2 |
| Tyr | UAU | | 13 | | 9 | | 9 | | 9 |
| | UAC | 0.63 | 6 | 0.94 | 8 | 0.94 | 8 | 0.88 | 7 |
| His | CAU | | 11 | | 13 | | 8 | | 4 |
| | CAC | 0.63 | 5 | 0.38 | 3 | 0.77 | 5 | | 4 |
| Gln | CAA | | 11 | | 23 | | 17 | | 8 |
| | CAG | 0.90 | 9 | 0.85 | 17 | 0.74 | 10 | 0.67 | 4 |
| Asn | AAU | | 11 | | 10 | | 17 | | 12 |
| | AAC | 0.71 | 6 | | 10 | 0.97 | 16 | 0.67 | 6 |
| Lys | AAA | | 40 | | 33 | | 28 | 0.93 | 20 |
| | AAG | 0.67 | 20 | 0.71 | 18 | 0.67 | 14 | | 23 |
| Asp | GAU | | 25 | | 21 | | 15 | | 16 |
| | GAC | 0.61 | 11 | 0.73 | 12 | 0.70 | 8 | 0.77 | 10 |
| Glu | GAA | | 38 | | 34 | | 40 | | 27 |
| | GAG | 0.77 | 24 | 0.76 | 21 | 0.64 | 19 | 0.74 | 16 |
| Cys | UGU | 0.86 | 15 | | 18 | 0.81 | 13 | | 14 |
| | UGC | | 20 | 0.94 | 16 | | 19 | | 14 |
| Arg | CGU | | 3 | | 2 | | 2 | 0.00 | 0 |
| | CGC | 0.22 | 1 | 0.25 | 1 | 0.27 | 1 | 0.00 | 0 |
| | CGA | | 3 | | 2 | | 2 | | 2 |
| | CGG | 0.44 | 2 | 0.25 | 1 | 0.00 | 0 | 0.46 | 1 |
| Ser | AGU | | 6 | | 9 | 0.79 | 5 | | 6 |
| | AGC | 0.64 | 3 | 0.51 | 3 | | 9 | 0.71 | 5 |
| Arg | AGA | | 13 | | 11 | | 12 | | 5 |
| | AGG | 1.11 | 5 | 1.75 | 7 | 1.36 | 5 | | 5 |
| Gly | GGU | 0.92 | 3 | 0.75 | 3 | 0.62 | 4 | 0.29 | 1 |
| | GGC | 0.92 | 3 | 0.75 | 3 | 0.77 | 5 | | 5 |
| | GGA | | 6 | | 6 | | 13 | | 5 |
| | GGG | 0.31 | 1 | | 4 | 0.62 | 4 | 0.86 | 3 |
RSCU1: RSCU values for ALB; RSCU2: RSCU values for AFP; RSCU3: RSCU values for AFM; RSCU4: RSCU values for VDBP.
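The relationship between the "Number" and RSCU columns can be illustrated with the standard definition of relative synonymous codon usage: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The Python sketch below assumes the authors used this standard definition, which is consistent with the listed values but is not stated in this excerpt; the function name `rscu` is illustrative. The counts are the ALB counts for Phe and Gly from the table.

```python
def rscu(codon_counts):
    """RSCU for one family of synonymous codons.

    RSCU = observed count / (family total / number of synonymous codons)
    """
    total = sum(codon_counts.values())
    n = len(codon_counts)
    if total == 0:
        # Amino acid absent from the gene: no usage to report.
        return {codon: 0.0 for codon in codon_counts}
    return {codon: count * n / total for codon, count in codon_counts.items()}


# ALB codon counts taken from the table above.
phe_alb = {"UUU": 25, "UUC": 10}
gly_alb = {"GGU": 3, "GGC": 3, "GGA": 6, "GGG": 1}

for family in (phe_alb, gly_alb):
    for codon, value in rscu(family).items():
        print(codon, round(value, 2))
```

For these two families the sketch gives 0.57 for UUC and 0.92, 0.92, and 0.31 for GGU, GGC, and GGG, matching the ALB values listed in the table. Values above 1 indicate codons used more often than expected under equal usage.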
A + U and G + C preferential codon usage of human albumin superfamily members.
| Gene | A + U | G + C |
|---|---|---|
| ALB | 17 | 3 |
| AFP | 17 | 1 |
| AFM | 18 | 2 |
| VDBP | 11 | 4 |