Sujatha Thankeswaran Parvathy, Varatharajalu Udayasuriyan, Vijaipal Bhadana.
Abstract
Codon usage bias is the preferential or non-random use of synonymous codons, a ubiquitous phenomenon observed in bacteria, plants and animals. Different species have consistent and characteristic codon biases. Codon bias varies not only with species, family or group within a kingdom, but also between the genes within an organism. Codon usage bias has evolved through mutation, natural selection, and genetic drift in various organisms. Genome composition, GC content, expression level and length of genes, position and context of codons in the genes, recombination rates, mRNA folding, and tRNA abundance and interactions are some of the factors influencing codon bias. The factors shaping codon bias may also be involved in the evolution of the universal genetic code. Codon usage bias is a critical factor determining gene expression and cellular function, as it influences diverse processes such as RNA processing, protein translation and protein folding. Codon usage bias reflects the origin, mutation patterns and evolution of species or genes. Investigations of codon bias patterns in genomes can reveal phylogenetic relationships between organisms, horizontal gene transfers and the molecular evolution of genes, and can identify the selective forces that drive their evolution. The most important application of codon bias analysis is in the design of transgenes, to increase gene expression levels through codon optimization, for the development of transgenic crops. The review gives an overview of deviations of the genetic code, factors influencing codon usage bias, codon usage bias of nuclear and organellar genes, computational methods to determine codon usage, and the significance and applications of codon usage analysis in biological research, with emphasis on plants.
Keywords: Anticodon; CUB indices; Codon adaptation; Codon optimization; Genetic code; Synonymous codon; tRNA abundance
Year: 2021 PMID: 34822069 PMCID: PMC8613526 DOI: 10.1007/s11033-021-06749-4
Source DB: PubMed Journal: Mol Biol Rep ISSN: 0301-4851 Impact factor: 2.316
Universal genetic code
Degeneracy of genetic code
| Number of synonymous codons for an amino acid | 1 | 2 | 3 | 4 | 6 |
|---|---|---|---|---|---|
| Number of amino acids encoded by corresponding codons | 2 | 9 | 1 | 5 | 3 |
| Specifications of amino acids | Methionine (Met), Tryptophan (Trp) | Phenylalanine (Phe), Tyrosine (Tyr), Histidine (His), Glutamine (Gln), Asparagine (Asn), Lysine (Lys), Aspartic acid (Asp), Glutamic acid (Glu), Cysteine (Cys) | Isoleucine (Ile) | Proline (Pro), Threonine (Thr), Alanine (Ala), Valine (Val), Glycine (Gly) | Serine (Ser), Leucine (Leu), Arginine (Arg) |
| Properties of amino acids with symbols | | | | | |
| Non-polar aliphatic R group | Methionine (M) | | Isoleucine (I) | Alanine (A), Valine (V), Glycine (G) | Leucine (L) |
| Non-polar aromatic R group | Tryptophan (W) | Phenylalanine (F), Tyrosine (Y) | | | |
| Polar negatively charged R group (acidic) | | Aspartic acid (D), Glutamic acid (E) | | | |
| Polar positively charged R group (basic) | | Histidine (H), Lysine (K) | | | Arginine (R) |
| Polar uncharged R group (neutral) | | Asparagine (N), Glutamine (Q), Cysteine (C) | | Proline (P), Threonine (T) | Serine (S) |
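
The degeneracy classes in the table can be recomputed from the standard genetic code. The following is a minimal sketch, not taken from the review: the 64-letter string encodes the standard code in U, C, A, G order, and the names `CODON_TABLE` and `degeneracy_classes` are illustrative assumptions.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order (UUU, UUC, UUA, ...).
# "*" marks a stop codon. Names here are illustrative, not from the review.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def degeneracy_classes():
    """Group amino acids by how many synonymous codons encode them."""
    codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
    classes = {}
    for aa, n_codons in codons_per_aa.items():
        classes.setdefault(n_codons, []).append(aa)
    return {n: sorted(aas) for n, aas in sorted(classes.items())}


if __name__ == "__main__":
    for n, aas in degeneracy_classes().items():
        print(n, len(aas), "".join(aas))
```

The three stop codons are excluded, so the class sizes account for the 61 sense codons.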
Fig. 1. Origin of life and genetic code. (a) Origin of 4 amino acids in the prebiotic soup encoded by a simple genetic code. P1 indicates the nucleotide at codon position 1 and P2 indicates the nucleotide at codon position 2. (b) Evolution of the translation system and the genetic code
Deviations of standard genetic code
| Sl no | Codon | Standard amino acid or code | Deviation or alternative code | Examples |
|---|---|---|---|---|
| 1 | UGA | STOP | Trp (Tryptophan) | Bacteria (Mycoplasma, Spiroplasma), yeast and vertebrate mitochondria, protists (Trypanosomatids) |
| | | | Cys (Cysteine) | Ciliates ( |
| | | | Sec (Selenocysteine) | Many species in three domains of life |
| | | | Gly (Glycine) | Gammaproteobacteria |
| 2 | UAR | STOP | Gln (Glutamine) | Ciliates, green algae |
| 3 | UAA | STOP | Glu (Glutamic acid) | Ciliates, Trypanosomatids |
| | | | Gln (Glutamine) | Heteropteran insect |
| | | | Tyr (Tyrosine) | |
| 4 | UAG | STOP | Pyl (Pyrrolysine) | Few methanogenic Archaea and anaerobic bacteria |
| | | | Gln (Glutamine) | Anaerobic flagellate |
| | | | Leu (Leucine) | Heteropteran insect, mitochondrial genomes of several green algae, chytrid fungus |
| | | | Ala (Alanine) | Organelles of some green algae like |
| 5 | UUA | Leu (Leucine) | STOP | Mitochondria of |
| 6 | UUG | | | |
| 7 | UCA | Ser (Serine) | STOP | Mitochondria of Sphaeropleales |
| 8 | UCG | | | |
| 9 | CUN | Leu (Leucine) | Thr (Threonine), Ala (Alanine) | In yeast mitochondria |
| 10 | CUG | Leu (Leucine) | Ser (Serine) | Fungi Candida and Ascomycetes |
| | | | Ala (Alanine) | Mitochondria of yeast |
| 11 | CGG | Arg (Arginine) | Leu (Leucine) | Mitochondria of |
| 12 | AUA | Ile (Isoleucine) | Met (Methionine) | Yeast and vertebrate mitochondria |
| | | | Leu (Leucine) | Nematodes |
| | | | STOP | Some animal and yeast mitochondria |
| 13 | AAR/AAA | Lys (Lysine) | Asn (Asparagine) | Mitochondria of |
| | | | Ser (Serine) | |
| 14 | AGR/AGA | Arg (Arginine) | Ser (Serine) | Mitochondria of echinoderms, fungi and most animals |
| | | | STOP | Vertebrate mitochondria |
| 15 | AGG | Arg (Arginine) | Gly (Glycine) | Mitochondria of metazoans |
| | | | Ser (Serine) | Mitochondria of Sphaeropleales |
| | | | Leu (Leucine) | Mitochondria of |
| | | | Ala (Alanine) | Mitochondria of Sphaeropleales |
| | | | Met (Methionine) | |
| 16 | GGG | Gly (Glycine) | Leu (Leucine) | Nematodes ( |
| | | | Ile (Isoleucine) | |
References [21, 24–29]
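
The effect of such reassignments on translation can be illustrated by overriding a few entries of the standard code. The sketch below is an assumption-laden illustration, not the review's method: it applies only the vertebrate mitochondrial deviations listed in the table (UGA as Trp, AUA as Met, AGA and AGG as STOP), and the names `translate` and `VERT_MITO_OVERRIDES` are hypothetical.

```python
from itertools import product

# Standard genetic code in U, C, A, G order; "*" marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Deviations for vertebrate mitochondria as listed in the table above
# (hypothetical name; only these four codons are overridden).
VERT_MITO_OVERRIDES = {"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"}


def translate(mrna, overrides=None):
    """Translate every complete codon; unknown codons become 'X'."""
    table = {**STANDARD, **(overrides or {})}
    seq = mrna.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(table.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


if __name__ == "__main__":
    message = "AUGAUAUGAAGA"
    print(translate(message))                       # standard code
    print(translate(message, VERT_MITO_OVERRIDES))  # with the overrides
```

The function deliberately does not halt at stop codons, so the two readings of the same message can be compared position by position.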
Fig. 2. Mechanism of incorporation of non-standard amino acids and ribosomal frameshifting. (a) Incorporation of selenocysteine. In prokaryotes, serine is attached to tRNAsec to form Ser-tRNAsec, which is then converted to Sec-tRNAsec. In Archaea and eukaryotes, Ser-tRNAsec is phosphorylated to Sep-tRNAsec and then converted to Sec-tRNAsec. SerRS indicates seryl-tRNA synthetase; SelA, selenocysteine synthase; SelenoP, selenophosphate; PSTK, O-phosphoseryl-tRNA kinase; and SepSecS, Sep-tRNA:Sec-tRNA synthase. Sec incorporation at a UGA codon requires an mRNA stem-loop structure, the Sec insertion sequence (SECIS), SECIS binding protein 2 (SBP2), and a Sec-specific elongation factor (EFSec). Mechanism of incorporation of selenocysteine in Archaea and prokaryotes modified and redrawn from [30]. (b) Incorporation of non-standard amino acids (NSAAs) in proteins through an orthogonal translation system (OTS). Orthogonal tRNA (o-tRNA)/aminoacyl-tRNA synthetase (o-aaRS) pairs from phylogenetically distant organisms are used to charge tRNA with NSAAs. (c) Programmed -1 ribosomal frameshifting. The ribosome shifts 1 nucleotide towards the 5′ end of the mRNA. This requires a heptanucleotide slippery sequence with the consensus X_XXY_YYZ, where X = any nucleotide, Y = A/U and Z = A/C/U, a spacer region, and a 3′ RNA secondary structure
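
The slippery-sequence consensus in panel c can be expressed as a regular expression. This is a minimal sketch under stated assumptions: it reads the consensus as three identical X, three identical Y (A or U) and one Z (A, C or U), it does not check the reading frame, spacer or downstream structure, and the names `SLIPPERY` and `find_slippery_sites` are hypothetical.

```python
import re

# X XXY YYZ: X = any nucleotide (three identical), Y = A or U (three identical),
# Z = A, C or U. The lookahead lets overlapping candidates be reported.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2([AU])\3\3[ACU]))")


def find_slippery_sites(mrna):
    """Return (0-based position, heptamer) for every consensus match."""
    seq = mrna.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in SLIPPERY.finditer(seq)]


if __name__ == "__main__":
    print(find_slippery_sites("GCGUUUAAACGGG"))
    print(find_slippery_sites("AUGUUUUUUAGG"))
```

A match only flags a candidate heptamer; whether frameshifting occurs also depends on the spacer and the 3′ secondary structure named in the caption.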
Fig. 3. Comparison of codon usage bias in model plant species using a heat map of codon usage. AT refers to Arabidopsis thaliana, PT to Populus trichocarpa and PP to Physcomitrella patens. The RSCU values for each codon, obtained from the Kazusa codon database (https://www.kazusa.or.jp/codon/), are indicated inside the box in white. Colour code from 0 to 50 indicates the RSCU values. The amino acids are colour-coded to indicate the groups as shown below. (Color figure online)
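
Relative synonymous codon usage (RSCU) is commonly defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes that definition and the standard genetic code; it is not the procedure used to build the figure, and the function name `rscu` is illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code in T, C, A, G order (DNA alphabet); "*" marks stop.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """RSCU per codon: count * family_size / total count of the family."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU undefined
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values


if __name__ == "__main__":
    for codon, value in sorted(rscu("GCTGCTGCTGCC").items()):
        print(codon, value)
```

Under this definition a value of 1 means no bias within the synonymous family, and the values within a family sum to the family size.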
Comparison of preferred codons in three model plants
| Amino acid | Plant species | | | Number of codons |
|---|---|---|---|---|
| Non-polar aliphatic | ||||
| Met | AUG | AUG | AUG | 1 |
| Ile | AUU | AUU | 3 | |
| Val | GUU | GUU | 4 | |
| Ala | GCU | GCU | GCU | 4 |
| Gly | GGA | GGA | GGA | 4 |
| Leu | CUU | CUU | 6 | |
| Non-polar aromatic | ||||
| Trp | UGG | UGG | UGG | 1 |
| Phe | UUU | UUU | 2 | |
| Tyr | UAU | UAU | 2 | |
| Polar acidic | ||||
| Asp | GAU | GAU | GAU | 2 |
| Glu | GAA | GAA | 2 | |
| Polar basic | ||||
| His | CAU | CAU | 2 | |
| Lys | AAG | AAG | 2 | |
| Arg | AGA | AGA | 6 | |
| Polar neutral | ||||
| Gln | CAA | CAA | 2 | |
| Asn | AAU | AAU | 2 | |
| Cys | UGU | UGU | 2 | |
| Pro | CCU | CCU | 4 | |
| Thr | ACU | ACU | 4 | |
| Ser | UCU | UCU | 6 | |
| STOP CODON | UGA | UGA | UGA/ | 3 |
| Total | 64 | |||
Codon in bold indicates variant codon when compared in all the three plant species
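
As an illustration of how a preferred-codon table feeds into codon optimization, a protein sequence can be back-translated using one preferred codon per amino acid. The sketch below is a simplified assumption, not the review's procedure: it uses only the codons printed in the table above (for rows where one species differs, the codon shown for the other two), ignores every other design constraint, and the names `PREFERRED` and `back_translate` are hypothetical.

```python
# One preferred codon per amino acid, copied from the table above.
# Simplification: real codon optimization also weighs GC content,
# mRNA structure and codon context, which this sketch ignores.
PREFERRED = {
    "M": "AUG", "I": "AUU", "V": "GUU", "A": "GCU", "G": "GGA", "L": "CUU",
    "W": "UGG", "F": "UUU", "Y": "UAU", "D": "GAU", "E": "GAA",
    "H": "CAU", "K": "AAG", "R": "AGA",
    "Q": "CAA", "N": "AAU", "C": "UGU", "P": "CCU", "T": "ACU", "S": "UCU",
}
PREFERRED_STOP = "UGA"


def back_translate(protein, add_stop=True):
    """Return an mRNA string using the preferred codon for each residue."""
    try:
        codons = [PREFERRED[residue] for residue in protein.upper()]
    except KeyError as err:
        raise ValueError(f"unknown amino acid symbol: {err.args[0]!r}") from None
    if add_stop:
        codons.append(PREFERRED_STOP)
    return "".join(codons)


if __name__ == "__main__":
    print(back_translate("MAG"))
```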
Fig. 4. Factors affecting codon usage bias.
Major factors affecting codon usage bias in organisms, such as GC content of the genome, population size, gene expression level, protein length, codon position and context, tRNA abundance and interactions, and mRNA structure, are indicated diagrammatically. tRNA interactions are classified into frequency bias, co-occurrence bias and pair bias. E, P and A indicate the exit, peptidyl and aminoacyl sites of the ribosome. The tRNA interactions were modified and redrawn from [1]
Comparison of codon bias in genes based on their level of expression
| Codon usage index | Gene expression: High | Gene expression: Intermediate | Gene expression: Low | Reference |
|---|---|---|---|---|
| ARSU values | > 13 | 9–13 | < 9 | [ |
RSCU values are indicated in brackets below the most preferred codon
a: RSCU values are not exactly clear, as they are shown only in a graph in the reference