
Determination of mutation trend in proteins by means of translation probability between RNA codes and mutated amino acids.

Guang Wu, Shaomin Yan.

Abstract

In this study, we estimate the probability that a changed RNA codon is translated into each amino acid. With the 183 translation probabilities so determined and the amino-acid composition of eight highly mutated proteins, we construct the theoretical distributions of mutated amino acids in these proteins and compare them with the actual distributions produced by documented mutations. We then trace the pattern of translation probabilities from RNA codons to mutated amino acids for 1053 point missense mutations. Finally, we conclude on statistical grounds that the natural mutation trend follows the theoretical translation probability.

Year:  2005        PMID: 16202392      PMCID: PMC7117410          DOI: 10.1016/j.bbrc.2005.09.106

Source DB:  PubMed          Journal:  Biochem Biophys Res Commun        ISSN: 0006-291X            Impact factor:   3.575


Amino-acid coding obeys the standard genetic code: four distinct symbols grouped in clusters of three elements, or triplets, code for the 20 different amino acids and the STOP signal, which marks the end of protein translation [1]. Translation of the genetic code requires attachment of tRNAs to their cognate amino acids, and the editing by phenylalanyl-tRNA synthetase is essential for faithful translation of the genetic code [2]. Translation could be enhanced by increasing the rate of elongation, reducing the cost of proofreading [3], [4], increasing the accuracy of translation [5], [6], [7], or by any combination of those mechanisms [8]. Protein charge heterogeneity [9] and protein release factors [10] can also influence the translation of mRNAs.

A genetic mutation gives rise to the mutation at the protein level. Between RNA and protein, each RNA codon has an unambiguous relationship with its translated amino acid; for example, each of the four RNA codons ACU, ACC, ACA, and ACG is translated to the amino acid threonine. Between DNA and RNA, a single-base change in a DNA nucleotide leads to the corresponding change in the RNA codon. For instance, a single-base change in DNA can change the RNA codon ACU to ACC, ACA, ACG, AUU, AAU, AGU, UCU, CCU, or GCU, if we do not consider the possibility that a base changes to itself (for example, U to U in the RNA codon ACU). As a result, the changed RNA codons are translated to different amino acids (Table 1). Table 1 shows that the translated amino acids induced by the changes in the RNA codon are not equally distributed (the last two rows).
Table 1

Possible changes in the RNA codon ACU and the related changes in the translated amino acids

Changed position | RNA codon | Translated amino acid
First position | ACU | Threonine
First position | UCU | Serine
First position | CCU | Proline
First position | GCU | Alanine
Second position | ACU | Threonine
Second position | AUU | Isoleucine
Second position | AAU | Asparagine
Second position | AGU | Serine
Third position | ACU | Threonine
Third position | ACC | Threonine
Third position | ACA | Threonine
Third position | ACG | Threonine

Translated amino acids: 1 alanine + 1 asparagine + 1 isoleucine + 1 proline + 2 serines + 6 threonines
Translation probability: 1/12 + 1/12 + 1/12 + 1/12 + 2/12 + 6/12
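
The counting behind Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `CODE` and `position_outcomes` are hypothetical and not part of the study, and, as in Table 1, the unchanged base is counted once at each position, giving 12 outcomes.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def position_outcomes(codon):
    """Count the amino acids obtained by trying every base at every position."""
    outcomes = Counter()
    for pos in range(3):
        for base in BASES:  # includes the unchanged base, as Table 1 does
            changed = codon[:pos] + base + codon[pos + 1:]
            outcomes[CODE[changed]] += 1
    return outcomes


counts = position_outcomes("ACU")
total = sum(counts.values())
for amino_acid, n in sorted(counts.items()):
    print(f"{amino_acid}: {n}/{total}")
```

For ACU this yields threonine with 6/12 and serine with 2/12, in line with the last two rows of Table 1.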
This provides a way to estimate the mutation trend from RNA to amino acid. For example, given an RNA sequence and its protein sequence, suppose we are interested in the RNA codon ACU and would like to know which type of mutated amino acid is likely to be formed by an unspecified change in this codon. From Table 1 we know that the resulting amino acid is most likely to be threonine, because it has the largest probability, whereas the other types of mutated amino acids have small probabilities. This is very suggestive, because we can eventually identify the mutated amino acids with the largest translation probability from the change in RNA codes.

On the other hand, the explicit correspondence between RNA codons and their translated amino acids, with its different translation probabilities, provides another way to estimate the mutation trend, from amino acid to RNA. For example, suppose we again have an RNA sequence and its protein sequence, now with many documented mutations. We are interested in the amino acid threonine and its mutated amino acids, and we would like to know the distribution of translation probability from RNA codons to threonine and its mutated amino acids in order to analyze the mutation trend from RNA to amino acid. The amino acid threonine can be translated from four RNA codons: ACU, ACC, ACA, and ACG. Naturally, we obtain different translation probabilities depending on whether or not we include the cases of “U” changing to “U,” “A” changing to “A,” “C” changing to “C,” and “G” changing to “G” (Table 2).

As can be seen in Table 2, threonine can be mutated to eight different amino acids with different translation probabilities if we do not count the cases in which “U” changes to “U,” “A” changes to “A,” “C” changes to “C,” and “G” changes to “G.” So far, we do not know the probabilistic pattern, with respect to the changes in the corresponding RNA codons, in proteins with many mutations. Intuitively, we might not expect the mutations to follow the maximal translation probability (the last two rows in Table 2).
Table 2

Mutated amino acids and their translation probabilities with regard to the changes in four RNA codons ACU, ACC, ACA, and ACG

RNA codon | ACU | ACC | ACA | ACG
Original amino acid | Threonine | Threonine | Threonine | Threonine
Mutated amino acids:
Changes in the first position in RNA codon | Serine | Serine | Serine | Serine
Changes in the first position in RNA codon | Proline | Proline | Proline | Proline
Changes in the first position in RNA codon | Alanine | Alanine | Alanine | Alanine
Changes in the second position in RNA codon | Isoleucine | Isoleucine | Isoleucine | Methionine
Changes in the second position in RNA codon | Asparagine | Asparagine | Lysine | Lysine
Changes in the second position in RNA codon | Serine | Serine | Arginine | Arginine
Changes in the third position in RNA codon | Threonine | Threonine | Threonine | Threonine
Changes in the third position in RNA codon | Threonine | Threonine | Threonine | Threonine
Changes in the third position in RNA codon | Threonine | Threonine | Threonine | Threonine

Total I: 4 alanines + 2 arginines + 2 asparagines + 3 isoleucines + 2 lysines + methionine + 4 prolines + 6 serines + 12 threonines
Translation probability I: 4/36 + 2/36 + 2/36 + 3/36 + 2/36 + 1/36 + 4/36 + 6/36 + 12/36
Total II: 4 alanines + 2 arginines + 2 asparagines + 3 isoleucines + 2 lysines + methionine + 4 prolines + 6 serines
Translation probability II: 4/24 + 2/24 + 2/24 + 3/24 + 2/24 + 1/24 + 4/24 + 6/24

I and II indicate the inclusion and exclusion, respectively, of the same type of amino acids before and after mutation, i.e., the mutated amino acid is threonine.
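
The two totals in Table 2 can be reproduced with a short Python sketch. The helper names `single_base_neighbours` and `totals_for` are hypothetical and introduced only for illustration; the sketch assumes the standard genetic code and pools all codons of the chosen amino acid, which for threonine means ACU, ACC, ACA, and ACG.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_neighbours(codon):
    """Yield the nine codons that differ from `codon` at exactly one position."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def totals_for(amino_acid):
    """Return (Total I, Total II) counts pooled over the amino acid's codons."""
    total_i = Counter()
    for codon, aa in CODE.items():
        if aa == amino_acid:
            total_i.update(CODE[n] for n in single_base_neighbours(codon))
    # Total II drops the outcomes that leave the amino acid unchanged.
    total_ii = Counter({aa: n for aa, n in total_i.items() if aa != amino_acid})
    return total_i, total_ii


total_i, total_ii = totals_for("Thr")
print(sum(total_i.values()), sum(total_ii.values()))  # denominators of I and II
```

For threonine the denominators come out as 36 and 24, with serine the most probable change (6/24) and methionine the least probable (1/24) under Total II.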

Of the two approaches offered by the translation probability, the first can be used to predict what will happen in the future, i.e., which type of amino acid is likely to be formed by a change in an RNA codon, as in Table 1. The second can be used to analyze what happened in the past, that is, which RNA codon was changed and led to the mutation at the amino-acid level, as in Table 2. At this stage, the second approach is more intriguing because a large number of proteins with their mutations have been documented, so a detailed analysis can reveal the concepts that may govern the trend of natural mutations. In this study, we apply this approach to proteins with many mutations, which we have studied in the past using another computational approach [11], [12], [13], [14], [15], [16], [17], [18].

Materials and methods

The following eight proteins with numerous mutations are used in this study: the copper-transporting ATPase 2 (ATP7B, accession numbers for protein and DNA are P35670 and U11700 [19]) with 125 point mutations [11], the Bruton’s tyrosine kinase (BTK, Accession Nos. Q06187 and U10087) with 112 point mutations [12], the haemoglobin α-chain (HBA, Accession Nos. P01922 and J00153) with 130 point mutations [13], the low-density lipoprotein receptor (LDLR, Accession Nos. P01130 and L00352) with 134 point mutations [14], the human p53 protein (p53, Accession Nos. P04637 and M14695) with 192 point mutations [15], the phenylalanine hydroxylase protein (PH4H, Accession Nos. P00439 and K03020) with 187 point mutations [16], the von Hippel–Lindau disease tumor suppressor (VHL, Accession Nos. P40337 and AF010238) with 108 point mutations [17], and the human coronavirus OC43 (OC43, Accession Nos. P36334 and L14643) with 65 point mutations [18].

Translation probability of amino acid from RNA codon. The translation probability is calculated for all RNA codons in the same way as shown in Table 2. In total, there are 183 possible translation probabilities, including those for the STOP codons (Table 3).
Table 3

Mutated amino acids and their translation probabilities

Columns list the mutated amino acids translated from the changed RNA codon, by changed position. Each group of codons is followed by the number of translated amino acids and their translation probability.

Original amino acid | RNA codon | First position | Second position | Third position

Phe | UUU | Leu, Ile, Val | Ser, Tyr, Cys | Phe, Leu, Leu
Phe | UUC | Leu, Ile, Val | Ser, Tyr, Cys | Phe, Leu, Leu
Number: 2Cys + 2Ile + 6Leu + 2Ser + 2Val + 2Tyr
Probability: 2/16 + 2/16 + 6/16 + 2/16 + 2/16 + 2/16

Leu | UUA | Leu, Ile, Val | Ser, STOP, STOP | Phe, Phe, Leu
Leu | UUG | Leu, Met, Val | Ser, Trp, STOP | Phe, Phe, Leu
Number: 4Phe + Ile + Met + 2Ser + 2Val + Trp + 3STOP
Probability: 4/14 + 1/14 + 1/14 + 2/14 + 2/14 + 1/14 + 3/14

Leu | CUU | Phe, Ile, Val | Pro, His, Arg | Leu, Leu, Leu
Leu | CUC | Phe, Ile, Val | Pro, His, Arg | Leu, Leu, Leu
Leu | CUA | Leu, Ile, Val | Pro, Gln, Arg | Leu, Leu, Leu
Leu | CUG | Leu, Met, Val | Pro, Gln, Arg | Leu, Leu, Leu
Number: 2Phe + 2His + 3Ile + Met + 4Pro + 2Gln + 4Arg + 4Val
Probability: 2/22 + 2/22 + 3/22 + 1/22 + 4/22 + 2/22 + 4/22 + 4/22

Ile | AUU | Phe, Leu, Val | Thr, Asn, Ser | Ile, Ile, Met
Ile | AUC | Phe, Leu, Val | Thr, Asn, Ser | Ile, Ile, Met
Ile | AUA | Leu, Leu, Val | Thr, Lys, Arg | Ile, Ile, Met
Number: 2Phe + Lys + 4Leu + 3Met + 2Asn + Arg + 2Ser + 3Thr + 3Val
Probability: 2/21 + 1/21 + 4/21 + 3/21 + 2/21 + 1/21 + 2/21 + 3/21 + 3/21

Met | AUG | Leu, Leu, Val | Thr, Lys, Arg | Ile, Ile, Ile
Number: 3Ile + Lys + 2Leu + Arg + Thr + Val
Probability: 3/9 + 1/9 + 2/9 + 1/9 + 1/9 + 1/9

Val | GUU | Phe, Leu, Ile | Ala, Asp, Gly | Val, Val, Val
Val | GUC | Phe, Leu, Ile | Ala, Asp, Gly | Val, Val, Val
Val | GUA | Leu, Leu, Ile | Ala, Glu, Gly | Val, Val, Val
Val | GUG | Leu, Leu, Met | Ala, Glu, Gly | Val, Val, Val
Number: 4Ala + 2Asp + 2Glu + 2Phe + 4Gly + 3Ile + 6Leu + Met
Probability: 4/24 + 2/24 + 2/24 + 2/24 + 4/24 + 3/24 + 6/24 + 1/24

Ser | UCU | Pro, Thr, Ala | Phe, Tyr, Cys | Ser, Ser, Ser
Ser | UCC | Pro, Thr, Ala | Phe, Tyr, Cys | Ser, Ser, Ser
Ser | UCA | Pro, Thr, Ala | Leu, STOP, STOP | Ser, Ser, Ser
Ser | UCG | Pro, Thr, Ala | Leu, STOP, Trp | Ser, Ser, Ser
Number: 4Ala + 2Cys + 2Phe + 2Leu + 4Pro + 4Thr + Trp + 2Tyr + 3STOP
Probability: 4/24 + 2/24 + 2/24 + 2/24 + 4/24 + 4/24 + 1/24 + 2/24 + 3/24

Pro | CCU | Ser, Thr, Ala | Leu, His, Arg | Pro, Pro, Pro
Pro | CCC | Ser, Thr, Ala | Leu, His, Arg | Pro, Pro, Pro
Pro | CCA | Ser, Thr, Ala | Leu, Gln, Arg | Pro, Pro, Pro
Pro | CCG | Ser, Thr, Ala | Leu, Gln, Arg | Pro, Pro, Pro
Number: 4Ala + 2His + 4Leu + 2Gln + 4Arg + 4Ser + 4Thr
Probability: 4/24 + 2/24 + 4/24 + 2/24 + 4/24 + 4/24 + 4/24

Thr | ACU | Ser, Pro, Ala | Ile, Asn, Ser | Thr, Thr, Thr
Thr | ACC | Ser, Pro, Ala | Ile, Asn, Ser | Thr, Thr, Thr
Thr | ACA | Ser, Pro, Ala | Ile, Lys, Arg | Thr, Thr, Thr
Thr | ACG | Ser, Pro, Ala | Met, Lys, Arg | Thr, Thr, Thr
Number: 4Ala + 2Arg + 2Asn + 3Ile + 2Lys + Met + 4Pro + 6Ser
Probability: 4/24 + 2/24 + 2/24 + 3/24 + 2/24 + 1/24 + 4/24 + 6/24

Ala | GCU | Ser, Pro, Thr | Val, Asp, Gly | Ala, Ala, Ala
Ala | GCC | Ser, Pro, Thr | Val, Asp, Gly | Ala, Ala, Ala
Ala | GCA | Ser, Pro, Thr | Val, Glu, Gly | Ala, Ala, Ala
Ala | GCG | Ser, Pro, Thr | Val, Glu, Gly | Ala, Ala, Ala
Number: 2Asp + 2Glu + 4Gly + 4Pro + 4Ser + 4Thr + 4Val
Probability: 2/24 + 2/24 + 4/24 + 4/24 + 4/24 + 4/24 + 4/24

Tyr | UAU | His, Asn, Asp | Phe, Ser, Cys | Tyr, STOP, STOP
Tyr | UAC | His, Asn, Asp | Phe, Ser, Cys | Tyr, STOP, STOP
Number: 2Cys + 2Asp + 2Phe + 2His + 2Asn + 2Ser + 4STOP
Probability: 2/16 + 2/16 + 2/16 + 2/16 + 2/16 + 2/16 + 4/16

Ochre | UAA | Gln, Lys, Glu | Leu, Ser, STOP | Tyr, Tyr, STOP
Amber | UAG | Gln, Lys, Glu | Leu, Ser, Trp | Tyr, Tyr, STOP
Number: 2Glu + 2Lys + 2Leu + 2Gln + 2Ser + Trp + 4Tyr
Probability: 2/15 + 2/15 + 2/15 + 2/15 + 2/15 + 1/15 + 4/15

His | CAU | Tyr, Asn, Asp | Leu, Pro, Arg | His, Gln, Gln
His | CAC | Tyr, Asn, Asp | Leu, Pro, Arg | His, Gln, Gln
Number: 2Asp + 2Leu + 2Asn + 2Pro + 4Gln + 2Arg + 2Tyr
Probability: 2/16 + 2/16 + 2/16 + 2/16 + 4/16 + 2/16 + 2/16

Gln | CAA | Lys, Glu, STOP | Leu, Pro, Arg | His, His, Gln
Gln | CAG | Lys, Glu, STOP | Leu, Pro, Arg | His, His, Gln
Number: 2Glu + 4His + 2Lys + 2Leu + 2Pro + 2Arg + 2STOP
Probability: 2/16 + 4/16 + 2/16 + 2/16 + 2/16 + 2/16 + 2/16

Asn | AAU | Tyr, His, Asp | Ile, Thr, Ser | Asn, Lys, Lys
Asn | AAC | Tyr, His, Asp | Ile, Thr, Ser | Asn, Lys, Lys
Number: 2Asp + 2His + 2Ile + 4Lys + 2Ser + 2Thr + 2Tyr
Probability: 2/16 + 2/16 + 2/16 + 4/16 + 2/16 + 2/16 + 2/16

Lys | AAA | STOP, Gln, Glu | Ile, Thr, Arg | Asn, Asn, Lys
Lys | AAG | STOP, Gln, Glu | Met, Thr, Arg | Asn, Asn, Lys
Number: 2Glu + Ile + Met + 4Asn + 2Gln + 2Arg + 2Thr + 2STOP
Probability: 2/16 + 1/16 + 1/16 + 4/16 + 2/16 + 2/16 + 2/16 + 2/16

Asp | GAU | Tyr, His, Asn | Val, Ala, Gly | Asp, Glu, Glu
Asp | GAC | Tyr, His, Asn | Val, Ala, Gly | Asp, Glu, Glu
Number: 2Ala + 4Glu + 2Gly + 2His + 2Asn + 2Val + 2Tyr
Probability: 2/16 + 4/16 + 2/16 + 2/16 + 2/16 + 2/16 + 2/16

Glu | GAA | STOP, Gln, Lys | Val, Ala, Gly | Asp, Asp, Glu
Glu | GAG | STOP, Gln, Lys | Val, Ala, Gly | Asp, Asp, Glu
Number: 2Ala + 4Asp + 2Gly + 2Lys + 2Gln + 2Val + 2STOP
Probability: 2/16 + 4/16 + 2/16 + 2/16 + 2/16 + 2/16 + 2/16

Cys | UGU | Arg, Ser, Gly | Phe, Ser, Tyr | Cys, Trp, STOP
Cys | UGC | Arg, Ser, Gly | Phe, Ser, Tyr | Cys, Trp, STOP
Number: 2Phe + 2Gly + 2Arg + 4Ser + 2Trp + 2Tyr + 2STOP
Probability: 2/16 + 2/16 + 2/16 + 4/16 + 2/16 + 2/16 + 2/16

Opal | UGA | Arg, Arg, Gly | Leu, Ser, STOP | Cys, Cys, Trp
Number: 2Cys + Gly + Leu + 2Arg + Ser + Trp
Probability: 2/8 + 1/8 + 1/8 + 2/8 + 1/8 + 1/8

Trp | UGG | Arg, Arg, Gly | Leu, Ser, STOP | Cys, Cys, STOP
Number: 2Cys + Gly + Leu + 2Arg + Ser + 2STOP
Probability: 2/9 + 1/9 + 1/9 + 2/9 + 1/9 + 2/9

Arg | CGU | Cys, Ser, Gly | Leu, Pro, His | Arg, Arg, Arg
Arg | CGC | Cys, Ser, Gly | Leu, Pro, His | Arg, Arg, Arg
Arg | CGA | STOP, Arg, Gly | Leu, Pro, Gln | Arg, Arg, Arg
Arg | CGG | Trp, Arg, Gly | Leu, Pro, Gln | Arg, Arg, Arg
Number: 2Cys + 4Gly + 2His + 4Leu + 4Pro + 2Gln + 2Ser + Trp + STOP
Probability: 2/22 + 4/22 + 2/22 + 4/22 + 4/22 + 2/22 + 2/22 + 1/22 + 1/22

Ser | AGU | Cys, Arg, Gly | Ile, Thr, Asn | Ser, Arg, Arg
Ser | AGC | Cys, Arg, Gly | Ile, Thr, Asn | Ser, Arg, Arg
Number: 2Cys + 2Gly + 2Ile + 2Asn + 6Arg + 2Thr
Probability: 2/16 + 2/16 + 2/16 + 2/16 + 6/16 + 2/16

Arg | AGA | STOP, Arg, Gly | Ile, Thr, Lys | Ser, Ser, Arg
Arg | AGG | Trp, Arg, Gly | Met, Thr, Lys | Ser, Ser, Arg
Number: 2Gly + Ile + 2Lys + Met + 4Ser + 2Thr + Trp + STOP
Probability: 2/14 + 1/14 + 2/14 + 1/14 + 4/14 + 2/14 + 1/14 + 1/14

Gly | GGU | Cys, Arg, Ser | Val, Ala, Asp | Gly, Gly, Gly
Gly | GGC | Cys, Arg, Ser | Val, Ala, Asp | Gly, Gly, Gly
Gly | GGA | STOP, Arg, Arg | Val, Ala, Glu | Gly, Gly, Gly
Gly | GGG | Trp, Arg, Arg | Val, Ala, Glu | Gly, Gly, Gly
Number: 4Ala + 2Cys + 2Asp + 2Glu + 6Arg + 2Ser + 4Val + Trp + STOP
Probability: 4/24 + 2/24 + 2/24 + 2/24 + 6/24 + 2/24 + 4/24 + 1/24 + 1/24

Ala, alanine; Arg, arginine; Asn, asparagine; Asp, aspartic acid; Cys, cysteine; Gln, glutamine; Glu, glutamic acid; Gly, glycine; His, histidine; Ile, isoleucine; Leu, leucine; Lys, lysine; Met, methionine; Phe, phenylalanine; Pro, proline; Ser, serine; Thr, threonine; Trp, tryptophan; Tyr, tyrosine; Val, valine.
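
As a cross-check on Table 3, the following Python sketch rebuilds the per-group counts from the standard genetic code and tallies the number of translation probabilities. It rests on one assumption that is ours, not the study's stated procedure: codons are grouped by original amino acid together with their first two bases, which happens to reproduce the 25 row groups of Table 3 (for example, UUA/UUG and CUU/CUC/CUA/CUG as separate leucine groups). The names `table3_groups` and `n_probabilities` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def table3_groups():
    """Map (original amino acid, first two bases) to counts of mutated amino acids."""
    groups = {}
    for codon, original in CODE.items():
        profile = groups.setdefault((original, codon[:2]), Counter())
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # a base changing to itself is not counted
                mutated = CODE[codon[:pos] + base + codon[pos + 1:]]
                if mutated != original:  # drop changes that keep the amino acid
                    profile[mutated] += 1
    return groups


groups = table3_groups()
n_probabilities = sum(len(profile) for profile in groups.values())
print(len(groups), n_probabilities)

gly = groups[("Gly", "GG")]
print({aa: f"{n}/{sum(gly.values())}" for aa, n in sorted(gly.items())})
```

Under this grouping the sketch finds 183 probabilities in 25 groups, and the glycine group matches the last row of Table 3.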

Theoretical distribution of mutated amino acids. With the probabilities in Table 3, we can construct a theoretical distribution of mutated amino acids. For example, imagine a protein containing equal numbers of the 20 types of amino acids. According to the probabilities in Table 3, we would expect its mutations to be distributed in a pattern similar to that in Fig. 1; say, a mutated amino acid has a 5.8% chance of being alanine, and there is a 5.5% chance that a mutation truncates this imaginary protein by introducing a STOP codon. Similarly, from the composition of a real protein we can obtain its theoretical distribution of mutated amino acids, as for the eight highly mutated proteins in this study.
Fig. 1

Theoretical distribution of mutated amino acids for an imaginary protein containing equal numbers of 20 types of amino acids.
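
One way to arrive at percentages like those quoted for the imaginary protein is to pool the counts of Table 3 over all sense codons. The Python sketch below does this; it is an assumption about the weighting, not a statement of the exact procedure behind Fig. 1, although under this reading alanine and STOP come out at about 5.8% and 5.5%. The names `pooled` and `percent` are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

pooled = Counter()
for codon, original in CODE.items():
    if original == "STOP":
        continue  # only codons that encode an amino acid are mutated here
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutated = CODE[codon[:pos] + base + codon[pos + 1:]]
            if mutated != original:  # assumption: unchanged amino acids are excluded
                pooled[mutated] += 1

total = sum(pooled.values())
percent = {aa: 100 * n / total for aa, n in pooled.items()}
for aa in sorted(percent, key=percent.get, reverse=True):
    print(f"{aa}: {percent[aa]:.1f}%")
```

For a real protein, the same loop could be weighted by how often each codon or amino acid occurs in its sequence; how the study weights the composition is not spelled out here, so that extension is left open.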

Actual distribution of mutated amino acids. We can construct the actual distribution of mutated amino acids in proteins with numerous mutations. For instance, the recorded mutations in human p53 protein include the following mutated amino acids: 9 alanines, 9 arginines, 4 asparagines, 9 aspartic acids, 13 cysteines, 10 glutamines, 6 glutamic acids, 11 glycines, 10 histidines, 9 isoleucines, 14 leucines, 7 lysines, 5 methionines, 9 phenylalanines, 14 prolines, 18 serines, 13 threonines, 4 tryptophans, 5 tyrosines, and 9 valines. With these data, we can construct the actual distribution of mutated amino acids and compare it with the theoretical one.

Determination of translation probability of single mutated amino acid at a position. The distribution of translation probability of mutated amino acids from RNA codons is determined as follows. Taking human p53 protein as an example, the amino acid at position 125 is threonine and its RNA codon is ACG. A mutation at this position changes threonine to methionine, which has the smallest probability in Table 2. Thus, this mutation follows the minimal-probability pathway rather than the maximal-probability one.

Determination of translation probability of multiple mutated amino acids at a position. Five mutations occur at position 245 of human p53 protein: the amino acid glycine is changed into alanine, cysteine, aspartic acid, serine, and valine, with corresponding translation probabilities of 4/24, 2/24, 2/24, 2/24, and 4/24, respectively (the last row in Table 3). In this manner, we obtain the distribution of translation probability of mutated amino acids from RNA codons for the 192 point mutations in human p53 protein.

Determination of mutated amino acid that cannot be explained by a single change in RNA codon. At the beginning of this study, we did not expect to find mutations that cannot be explained by the explicit correspondence between RNA codons and translated amino acids, but we did meet such cases. For example, the mutation at position 140 in human p53 protein changes threonine to tyrosine; however, Table 2 offers no such possibility. This case therefore cannot be explained by the explicit correspondence between RNA codon and translated amino acid.

Statistics. The differences between theoretical and actual translation probabilities are compared using the Wilcoxon signed rank test with SigmaStat for Windows (SPSS, 1992–2003), and p < 0.05 is considered statistically significant.
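
The per-mutation lookup described in these paragraphs can be sketched as a small Python function. The function name `translation_probability` is hypothetical, and the codon family is assumed to be the set of codons that share the original amino acid and the first two bases, which mirrors the row groups of Table 3. The glycine codon GGC used below is only a stand-in for any GGN codon, since all four share one row group.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translation_probability(codon, mutated):
    """Return (numerator, denominator) for original codon -> mutated amino acid,
    or None when no single-base change in the codon family gives that amino acid."""
    original = CODE[codon]
    family = [c for c, aa in CODE.items() if aa == original and c[:2] == codon[:2]]
    profile = Counter()
    for member in family:
        for pos in range(3):
            for base in BASES:
                if base != member[pos]:
                    outcome = CODE[member[:pos] + base + member[pos + 1:]]
                    if outcome != original:
                        profile[outcome] += 1
    if mutated not in profile:
        return None
    return profile[mutated], sum(profile.values())


print(translation_probability("ACG", "Met"))  # threonine to methionine
print(translation_probability("GGC", "Ala"))  # glycine to alanine
print(translation_probability("ACC", "Tyr"))  # threonine to tyrosine
```

The pair is returned unreduced on purpose, so that 4/24 stays 4/24 as in Table 3 rather than collapsing to 1/6; a result of `None` flags a mutation that a single-base change cannot explain.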

Results and discussion

Theoretical and actual distributions of mutated amino acids in eight highly mutated proteins

As the evolutionary history is reasonably long, we would expect the actual distribution of mutated amino acids in a protein to approach its theoretical distribution. Fig. 2 shows the theoretical and actual distributions of mutated amino acids in eight highly mutated proteins. We can see several characteristics in Fig. 2: (i) The theoretical distributions in Fig. 2 differ from that in Fig. 1 and also from one another. This is understandable because the amino-acid composition of each protein differs from that of our imaginary protein, which contains equal numbers of the 20 types of amino acids, and from that of the other proteins. (ii) There is no actual distribution for the STOP codon, which is certainly due to the fact that premature termination results in the synthesis of deleterious truncated proteins. (iii) Some actual distributions of mutated amino acids are very similar to the theoretical distributions, such as isoleucine and threonine in human p53 protein, which at least means that mutations to these types of amino acids would not lead to the dysfunction of human p53 protein. (iv) Some actual distributions of mutated amino acids are very different from the theoretical distributions, such as alanine and arginine in human p53 protein. These phenomena suggest that mutations to these types of amino acids would lead to the death of human p53 protein; otherwise we would expect to see more records of these mutated amino acids and a smaller difference between the theoretical and actual distributions.
Fig. 2

Theoretical and actual distributions of mutated amino acids for eight highly mutated proteins. ATP7B, copper-transporting ATPase 2; BTK, Bruton’s tyrosine kinase; HBA, hemoglobin α chain; LDLR, low-density lipoprotein receptor; p53, human p53 protein; PH4H, phenylalanine hydroxylase protein; VHL, von Hippel–Lindau disease tumor suppressor.

Most missense errors have little effect on protein function, since they only exchange one amino acid for another. Statistical and biochemical studies have revealed non-random patterns in codon assignments. The canonical genetic code is known to be highly efficient in minimizing the effects of mistranslational errors and point mutations, since the biochemical properties of the resulting amino acid are usually very similar to those of the original one when an amino acid is converted to another by error [20]. Therefore, the difference between theoretical and actual distributions of mutated amino acids highlights the direction of mutations, that is, with which types of mutated amino acid a protein can survive.

Mutated amino acids that cannot be explained by single-base change in RNA codon

Table 4 lists the mutated amino acids in the proteins studied herein that cannot be explained by a single-base change under the standard genetic code. Possible explanations for this phenomenon are that the mutated amino acid arises at the protein level rather than through translation from mRNA, or that the mutated amino acid results not from a single-base change in the RNA codon but from two or three changes (the fourth and fifth columns in Table 4).
Table 4

Mutated amino acids that cannot be explained by a single-base change in RNA codons

Protein | Position | Mutation | Change in 2 RNA codes | Change in 3 RNA codes
BTK | 594 | Glycine → glutamine | GGG → CAG |
OC43 | 29 | Lysine → valine | AAA → GUA |
OC43 | 173 | Glutamine → asparagine | CAA → AAU, AAC |
OC43 | 603 | Leucine → threonine | CUU → ACU |
OC43 | 630 | Leucine → threonine | UUA → ACA |
OC43 | 896 | Glutamic acid → cysteine | | GAA → UGU, UGG
OC43 | 912 | Aspartic acid → serine | GAU → UCU, AGU |
p53 | 140 | Threonine → tyrosine | ACC → UAC |
p53 | 157 | Valine → serine | GUC → UCC, AGC |
p53 | 174 | Arginine → histidine | | AGG → CAU, CAC
p53 | 247 | Asparagine → tryptophan | | AAC → UGG
PH4H | 157 | Arginine → asparagine | AGA → AAU, AAC |
VHL | 70 | Glutamic acid → leucine | GAG → UUG, CUG |
VHL | 101 | Leucine → glycine | CUG → GGG |
VHL | 115 | Histidine → glutamic acid | CAC → GAA, GAG |
VHL | 157 | Threonine → aspartic acid | ACU → GAU |

BTK, Bruton’s tyrosine kinase; OC43, human coronavirus OC43; p53, human p53 protein; PH4H, phenylalanine hydroxylase protein; VHL, von Hippel–Lindau disease tumor suppressor.
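
Whether a mutation needs one, two, or three base changes can be checked by comparing the original codon with every codon of the mutated amino acid. The Python sketch below does so; the function name `min_base_changes` is hypothetical, and the standard genetic code is assumed throughout.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def min_base_changes(codon, mutated):
    """Return the fewest base changes needed to reach `mutated` from `codon`,
    together with the target codons that achieve that minimum."""
    distances = {
        target: sum(a != b for a, b in zip(codon, target))
        for target, aa in CODE.items()
        if aa == mutated
    }
    fewest = min(distances.values())
    return fewest, [t for t, d in distances.items() if d == fewest]


print(min_base_changes("ACC", "Tyr"))  # p53 position 140
print(min_base_changes("GUC", "Ser"))  # p53 position 157
print(min_base_changes("AAC", "Trp"))  # p53 position 247
```

A minimum of 1 means the mutation is explained by a single-base change; 2 or 3 corresponds to the fourth and fifth columns of Table 4.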

Amino-acid misincorporation has been demonstrated during high-level expression [21]. Any errors of translation in editing-defective cells were due to amino-acid misincorporation rather than to frameshift errors, and an editing deficiency does not contribute to the frequency of spontaneous mutations [22]. Selection at the amino-acid level can influence synonymous codon usage [23]. Non-standard genetic codes are genetic codes in which one or more codons have a different amino-acid assignment from that found in the standard genetic code. A diversity of non-standard genetic codes has been found in the modern biosphere. The majority of non-standard codes arise from alterations in the tRNA, with most occurring by post-transcriptional modifications, such as base modification or RNA editing, rather than by substitutions within tRNA anticodons [24]. In some ciliate species, the UAG and UAA codons are found to encode glutamine, and UGA to encode cysteine and tryptophan [25]. Thus, the findings in Table 4 may involve non-standard genetic codes.

Tracing of translation probability from RNA codons to mutated amino acids

Fig. 3 illustrates the translation probability versus the frequency of mutated amino acids in these eight highly mutated proteins. Although these proteins differ in amino-acid composition, function, location, and so on, a common pattern can be seen in Fig. 3, and it is much clearer than the patterns in Fig. 2; for example, the translation probability of 2/16 has the largest frequency of mutated amino acids. This is to be expected, because this probability appears most frequently (31.15%) among the 183 translation probabilities in Table 3. In terms of the percentage of frequency among all the mutations, there is no statistically significant difference between the theoretical and actual situations in Table 5 (p = 0.109, Wilcoxon signed rank test). This means that the natural mutation trend in principle follows the theoretical translation probability listed in Table 3 if the sample is large enough, although we can expect some difference between theoretical and actual mutation frequencies due to nonsense mutations, dysfunctional mutants, etc.
Fig. 3

Translation probability versus frequency of mutated amino acids in eight highly mutated proteins.

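Table 5

Percentage of mutations in theoretical and actual situations

Translation probability | Theoretical situation based on Table 3: frequency | Theoretical: % | Actual situation in all mutations in Fig. 3: frequency | Actual: %
1/8 | 4 | 2.19 | 0 | 0.00
2/8 | 2 | 1.09 | 0 | 0.00
1/9 | 7 | 3.83 | 21 | 1.99
2/9 | 4 | 2.19 | 11 | 1.04
3/9 | 1 | 0.55 | 7 | 0.66
1/14 | 7 | 3.83 | 4 | 0.38
2/14 | 5 | 2.73 | 18 | 1.71
3/14 | 1 | 0.55 | 0 | 0.00
4/14 | 2 | 1.09 | 6 | 0.57
1/15 | 1 | 0.55 | 0 | 0.00
2/15 | 5 | 2.73 | 0 | 0.00
4/15 | 1 | 0.55 | 0 | 0.00
1/16 | 2 | 1.09 | 9 | 0.85
2/16 | 57 | 31.15 | 343 | 32.57
4/16 | 8 | 4.37 | 46 | 4.37
6/16 | 2 | 1.09 | 21 | 1.99
1/21 | 2 | 1.09 | 1 | 0.09
2/21 | 3 | 1.64 | 10 | 0.95
3/21 | 3 | 1.64 | 20 | 1.90
4/21 | 1 | 0.55 | 1 | 0.09
1/22 | 3 | 1.64 | 15 | 1.42
2/22 | 7 | 3.83 | 67 | 6.36
3/22 | 1 | 0.55 | 1 | 0.09
4/22 | 6 | 3.28 | 96 | 9.12
1/24 | 5 | 2.73 | 25 | 2.37
2/24 | 18 | 9.84 | 110 | 10.45
3/24 | 3 | 1.64 | 17 | 1.61
4/24 | 19 | 10.38 | 170 | 16.14
6/24 | 3 | 1.64 | 34 | 3.23
Total | 183 | 100 | 1053 | 100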
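
The theoretical frequency column of Table 5 is a tally of how often each translation probability occurs among the 183 entries of Table 3. The Python sketch below reproduces that tally; as an assumption of this illustration, codons are grouped by original amino acid and first two bases to mirror the row groups of Table 3, and the names `groups` and `frequency` are hypothetical. The actual frequencies require the documented mutations and are not reproduced here.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Standard genetic code; codons run UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO = (
    "Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr STOP STOP Cys Cys STOP Trp "
    "Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg "
    "Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg "
    "Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly"
).split()
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Per-group counts of mutated amino acids, excluding unchanged amino acids.
groups = {}
for codon, original in CODE.items():
    profile = groups.setdefault((original, codon[:2]), Counter())
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                mutated = CODE[codon[:pos] + base + codon[pos + 1:]]
                if mutated != original:
                    profile[mutated] += 1

# Tally each (numerator, denominator) pair across all groups.
frequency = Counter()
for profile in groups.values():
    denominator = sum(profile.values())
    for numerator in profile.values():
        frequency[(numerator, denominator)] += 1

n_total = sum(frequency.values())
for (num, den), n in sorted(frequency.items(), key=lambda kv: (kv[0][1], kv[0][0])):
    print(f"{num}/{den}: {n} ({100 * n / n_total:.2f}%)")
```

The probability 2/16 appears 57 times out of 183 (31.15%), which is why it is expected to dominate Fig. 3.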
As the genetic code is degenerate, a set of synonymous sequences can code for the same polypeptide [26]. The relationships between synonymous and non-synonymous substitution rates and between synonymous rate and codon usage bias are important to our understanding of the roles of mutation and selection in the evolutionary process [27]. Synonymous codons differ in their capacity to minimize the effects of errors due to mutation or mistranslation. Natural selection for error minimization at the protein level plays a role in the evolution of coding sequences in Drosophila and rodents [28]. At least 10% of the variation in codon bias can be explained by mutation pressure [29]. Furthermore, the effect of selection on individual codons changes over time [30]. The selection pressure is for reduced protein synthesis cost; say, most reassignments give amino acids that are less expensive to synthesize. Mitochondrial genetic codes evolve to match the amino-acid requirements of proteins [31]. Our analyses herein indicate that the natural mutation trend in principle follows the theoretical translation probability. Consistent with previous studies, nature would tend to construct proteins with the least expenditure of time and energy [32].

In conclusion, we have analyzed the translation probabilities of mutant amino acids governed by the standard genetic code in proteins with many mutations. The differences between theoretical and actual distributions of mutated amino acids indicate with which types of mutated amino acids a protein can survive. Some mutated amino acids cannot be explained by single-base errors under the standard genetic code, and may involve non-standard genetic codes. In principle, the natural mutation trend follows the theoretical translation probability.
  27 in total

1.  Translation conditional models for protein coding sequences.

Authors:  F Rodolphe; C Mathé
Journal:  J Comput Biol       Date:  2000 Feb-Apr       Impact factor: 1.479

2.  Selection at the amino acid level can influence synonymous codon usage: implications for the study of codon adaptation in plastid genes.

Authors:  B R Morton
Journal:  Genetics       Date:  2001-09       Impact factor: 4.562

3.  Inhibited cell growth and protein functional changes from an editing-defective tRNA synthetase.

Authors:  Jamie M Bacher; Valérie de Crécy-Lagard; Paul R Schimmel
Journal:  Proc Natl Acad Sci U S A       Date:  2005-01-12       Impact factor: 11.205

Review 4.  Translational selection and molecular evolution.

Authors:  H Akashi; A Eyre-Walker
Journal:  Curr Opin Genet Dev       Date:  1998-12       Impact factor: 5.578

5.  Selection on codon usage for error minimization at the protein level.

Authors:  Marco Archetti
Journal:  J Mol Evol       Date:  2004-09       Impact factor: 2.395

6.  Estimation of amino acid pairs sensitive to variants in human phenylalanine hydroxylase protein by means of a random approach.

Authors:  Guang Wu; Shaomin Yan
Journal:  Peptides       Date:  2002-12       Impact factor: 3.750

7.  Determination of amino acid pairs sensitive to variants in human copper-transporting ATPase 2.

Authors:  Guang Wu; Shaomin Yan
Journal:  Biochem Biophys Res Commun       Date:  2004-06-18       Impact factor: 3.575

8.  The effects of mutation and natural selection on codon bias in the genes of Drosophila.

Authors:  R M Kliman; J Hey
Journal:  Genetics       Date:  1994-08       Impact factor: 4.562

Review 9.  Energy cost of translational proofreading in vivo. The aminoacylation of transfer RNA in Escherichia coli.

Authors:  H Jakubowski
Journal:  Ann N Y Acad Sci       Date:  1994-11-30       Impact factor: 5.691

10.  Mistranslation of human phosphoglycerate kinase in yeast in the presence of paromomycin.

Authors:  C M Grant; M F Tuite
Journal:  Curr Genet       Date:  1994-08       Impact factor: 3.886

  5 in total

1.  Prediction of mutations in H1 neuraminidases from North America influenza A virus engineered by internal randomness.

Authors:  Guang Wu; Shaomin Yan
Journal:  Mol Divers       Date:  2008-02-19       Impact factor: 2.943

Review 2.  Mutation trend of hemagglutinin of influenza A virus: a review from a computational mutation viewpoint.

Authors:  Guang Wu; Shao-Min Yan
Journal:  Acta Pharmacol Sin       Date:  2006-05       Impact factor: 6.150

3.  Predicting Crystallization Propensity of Proteins from Arabidopsis Thaliana.

Authors:  Shaomin Yan; Guang Wu
Journal:  Biol Proced Online       Date:  2015-11-23       Impact factor: 3.244

4.  Prediction of mutations engineered by randomness in H5N1 hemagglutinins of influenza A virus.

Authors:  G Wu; S Yan
Journal:  Amino Acids       Date:  2007-11-02       Impact factor: 3.520

5.  Prediction of mutations engineered by randomness in H5N1 neuraminidases from influenza A virus.

Authors:  G Wu; S Yan
Journal:  Amino Acids       Date:  2007-08-28       Impact factor: 3.520
