Stress induced MAPK genes show distinct pattern of codon usage in Arabidopsis thaliana, Glycine max and Oryza sativa.

H Surachandra Singha, Supriyo Chakraborty, Himangshu Deka

Abstract

Mitogen activated protein kinase (MAPK) genes provide resistance to various biotic and abiotic stresses. Codon usage profiling reveals characteristic features of genes such as nucleotide composition, gene expressivity and optimal codons. The present study is a comparative analysis of codon usage patterns for different MAPK genes in three organisms, viz. Arabidopsis thaliana, Glycine max (soybean) and Oryza sativa (rice). The study revealed a high AT content in the MAPK genes of Arabidopsis and soybean, whereas rice showed a balanced AT-GC content at the third synonymous codon position. The genes show a low bias in codon usage, as reflected in the high values (50.83 to 56.55) of the effective number of codons (Nc). The prediction of the gene expression profile suggested that these genes might be under the selective pressure of translational optimization, as reflected in the low codon adaptation index (CAI) values ranging from 0.147 to 0.208.

Keywords:  Arabidopsis; MAPK; codon usage bias; mutational pressure; synonymous codons

Year:  2014        PMID: 25187684      PMCID: PMC4135292          DOI: 10.6026/97320630010436

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Many codons in the genetic code are functionally synonymous. Every amino acid except methionine and tryptophan is encoded by two to six codons, and the choice among them is called synonymous codon usage. The synonymous codons of an amino acid do not occur with equal frequency in a gene. Some codons occur at a high frequency, indicating that the usage of that particular codon is biased. The study of codon usage bias may assist in designing DNA primers [1]. The usage of synonymous codons varies between as well as within the genomes of organisms [2, 3]. This variation could be a consequence of natural selection and/or mutational pressure acting on the accuracy and efficiency of the translation process. Study of the synonymous codon usage patterns of a gene in various organisms could provide insight into its evolutionary profile as well as its level of expression [4, 5]. In the course of evolution, plants have developed their own mechanisms to combat the biotic and abiotic stresses to which they are continually subject. Mitogen activated protein kinase (MAPK) genes present in plants actively respond to various stresses. Abiotic stresses mainly include drought, high salinity, heat, cold, freezing, limited nutrient availability and heavy metals [6, 7]. The MAPK genes are generally classified into three distinct types: MAPKKK, MAPKK and MAPK. The chain of reactions in the stress signalling pathways involves these three MAPK families; MAPKKKs phosphorylate the serine/threonine residues of MAPKKs in their activation loop, and the MAPKKs in turn doubly phosphorylate the MAPKs at the T-X-Y motif (T-E-Y or T-D-Y) in the activation loop. The MAPK genes are activated by stress stimuli and form various cascades, which make up the pathways that regulate the stress response. The cascade acts downstream of receptors on the extracellular surface, or acts as a sensor that transduces extracellular signals into intracellular responses [8].
The Arabidopsis genome has been reported to harbour 80 MAPKKK, 10 MAPKK and 20 MAPK genes. Whole genome analysis of rice (Oryza sativa) has revealed 75 MAPKKKs, 8 MAPKKs and 15 MAPKs [9, 10]. In-silico studies of soybean (Glycine max) have identified 38 MAPKs, 11 MAPKKs and 150 MAPKKKs [11]. Chilling stress is one of the serious problems associated with the production of major crops such as rice and maize [12] in the temperate zone. The Arabidopsis MAPKs, based on their structural motifs and sequence similarities, are classified into four groups (A-D). Groups A, B and C possess a T-E-Y motif, whereas the fourth group, D, possesses a T-D-Y motif [13, 14]. A signalling pathway comprising MEKK1-MKK2-MPK4/MPK6 is reported to regulate the response to abiotic stresses in Arabidopsis [15]. Abiotic stresses induce the production of reactive oxygen species (ROS). Several MAPK signalling pathways are not only induced by ROS but also regulate ROS production in turn. Many MAPK pathways that respond to abiotic stresses have been studied in rice. An experiment on long-term exposure of rice plants to cooling stress revealed the involvement of several kinases, namely MEK1 (MAPKK) and MAP1 (MAPK) [16]. The MAPK5 gene in rice has shown multiple activities, responding to both biotic and abiotic stress. Drought resistance in rice plants is provided by a MAPKKK of the B3 subgroup named DSM1 [17]. MAPK33 has also shown activity in withstanding drought [18]. Microarray analysis of MPK4 activity in soybean revealed that it negatively regulates SA and H2O2 accumulation. Accordingly, silencing of MPK4 in soybean significantly increases SA and H2O2 accumulation, up-regulating genes involved in the defense response and giving the plants better resistance to downy mildew and soybean mosaic virus than vector control plants. MPK4 has also been reported to down-regulate genes involved in growth and development, such as those in auxin signalling pathways and in cell cycle and proliferation [19].
Very little information is available on the codon usage pattern of MAPK genes across plant species, despite the substantial literature on their physiological mechanisms. The present study was therefore undertaken to elucidate the codon usage of MAPK genes. A comparative analysis of the codon usage patterns of the MAPK genes reported in the three organisms, A. thaliana, G. max and O. sativa, could help to analyse the conserved nature of the genes, their codon usage patterns and the role of base composition in determining the optimal codons. Information on codon usage patterns could also help reveal the evolutionary history of individual genes within or between organisms, as well as the expression of those genes.

Methodology

Coding DNA sequence data:

Coding sequence data (a total of 127 cds) for the different genes of the MAPK families of Arabidopsis, soybean and rice were retrieved from NCBI (www.ncbi.nlm.nih.gov). The genes and their accession numbers are listed in Table 1 (see supplementary material). Different analytical parameters of codon usage bias were estimated and analysed for each of these MAPK genes.

Analysis of the codon usage profile:

Several parameters have been used to characterize the sequence data of the genes. Gene members of each of the three MAPK families are analysed and compared family-wise across the three organisms.

RSCU:

The relative synonymous codon usage (RSCU) is the observed frequency of a codon divided by the frequency expected if all the synonymous codons for that amino acid were used equally. It helps characterise whether the codons in a sequence are used without bias or whether certain codons are preferred. Codons with RSCU values greater than 1 are considered to have a positive codon usage bias, and those with values less than 1 a negative codon usage bias [20]. The online bioinformatics tool codonW (mobyle.pasteur.fr/cgi-bin/portal.py#forms::codonW) was used to estimate the RSCU, which is mathematically expressed as: (for equation (1), please see supplementary material), where xij is the number of occurrences of the jth codon for the ith amino acid and ni is the number of alternative synonymous codons available for the ith amino acid.
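The RSCU calculation described above can be illustrated with a short Python sketch. It is not the codonW implementation; it assumes the standard nuclear genetic code, and the function name `rscu` and the toy sequence are hypothetical. Each codon count is multiplied by the number of synonymous codons for its amino acid and divided by the total count for that amino acid.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumption: nuclear code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for every amino acid observed in the coding sequence."""
    cds = cds.upper().replace("U", "T")
    # Count in-frame codons; a trailing partial codon is ignored.
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))

    # Group codons into synonymous families, skipping stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from this sequence
        n_i = len(codons)  # number of synonymous codons for this amino acid
        for codon in codons:
            values[codon] = counts[codon] * n_i / total
    return values


# Toy input: phenylalanine encoded three times by TTT and once by TTC.
example = rscu("TTTTTTTTTTTC")
print(example["TTT"], example["TTC"])  # 1.5 0.5
```

In this sketch methionine and tryptophan always receive an RSCU of 1, since each has a single codon; they carry no information on bias and can be dropped from any downstream comparison.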

GC content:

GC content is a measure of the occurrence of the nucleotide bases guanine (G) and cytosine (C) in the entire sequence [21]. It can also be measured separately at the 1st, 2nd and 3rd codon positions throughout the sequence, designated GC1S, GC2S and GC3S respectively [22]. Other indices of nucleotide bias, namely A%, T%, G% and C%, were measured as well, together with the nucleotide frequencies at the third codon position, designated A3S, T3S, G3S and C3S. Mathematically, GC% is expressed as: (for equation (2), please see supplementary material).
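As an illustration of these composition measures, the following Python sketch computes the overall GC%, the GC% at each codon position, and a GC3s-style value. The function names are hypothetical, and the GC3s variant rests on an assumption about the convention used: it skips ATG, TGG and the stop codons, which have no synonymous alternative at the third position.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide string."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_by_position(cds):
    """Return (GC1, GC2, GC3): GC% at codon positions 1, 2 and 3."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return tuple(gc_percent(cds[pos:usable:3]) for pos in range(3))


# Codons without a synonymous alternative (assumed exclusion list for GC3s).
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def gc3s(cds):
    """GC% at the third position of codons that have synonymous alternatives."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    thirds = "".join(c[2] for c in codons if c not in NON_SYNONYMOUS)
    return gc_percent(thirds)


toy = "ATGGCCTTTTAA"  # ATG GCC TTT TAA, a made-up sequence
print(gc_percent(toy))      # overall GC%
print(gc_by_position(toy))  # (25.0, 25.0, 50.0)
print(gc3s(toy))            # 50.0
```

The corresponding AT value at any position is simply 100 minus the GC value, which is how the AT3 figures reported alongside GC3 can be checked.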

Effective number of codons (Nc):

As proposed by Wright in 1990, Nc measures the absolute synonymous codon usage bias [23]. It quantifies how far the codon usage of a gene departs from equal usage of the synonymous codons for each amino acid. The value of Nc ranges from 20 (when only one codon is used for each amino acid) to 61 (when all synonymous codons are used with equal frequency) [1]. It may be expressed as: (for equation (3), please see supplementary material), where F2 is the probability that two randomly chosen codons for an amino acid with two synonymous codons are identical, F3 is the probability that two randomly chosen codons for an amino acid with three synonymous codons are identical, and so on [1, 23].
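A minimal Python sketch of Wright's Nc is given below. It assumes the standard genetic code and the usual form of the estimator, Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where each F is the average homozygosity F = (n Σp² − 1)/(n − 1) over the amino acids of that degeneracy class. The function name and the handling of missing classes are illustrative choices, not the codonW implementation.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumption: nuclear code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of amino acids in each degeneracy class of the standard code.
CLASS_WEIGHTS = {2: 9, 3: 1, 4: 5, 6: 3}


def effective_number_of_codons(cds):
    """Estimate Nc for one coding sequence, capped at 61."""
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    # Homozygosity F for each amino acid, grouped by degeneracy class.
    f_values = defaultdict(list)
    for codons in families.values():
        k = len(codons)
        n = sum(counts[c] for c in codons)
        if k == 1 or n < 2:
            continue  # Met/Trp, or too few observations to estimate F
        sum_p2 = sum((counts[c] / n) ** 2 for c in codons)
        f_values[k].append((n * sum_p2 - 1) / (n - 1))

    mean_f = {k: sum(v) / len(v) for k, v in f_values.items()}
    # If isoleucine is absent, F3 is taken as the mean of F2 and F4.
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2
    if any(mean_f.get(k, 0) <= 0 for k in CLASS_WEIGHTS):
        raise ValueError("sequence too short or too sparse to estimate Nc")

    nc = 2 + sum(weight / mean_f[k] for k, weight in CLASS_WEIGHTS.items())
    return min(nc, 61.0)
```

With a sequence that uses a single codon for every amino acid, each F equals 1 and the function returns 20; with perfectly even usage the raw estimate slightly exceeds 61 for finite samples, which is why the result is capped.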

Gene expression analysis:

The expressivity of a gene is characterised on the basis of its codon adaptation index (CAI). CAI is calculated as the geometric mean of the relative adaptiveness (w) of all the codons in a gene. The CAI value ranges from 0 (when the least frequent codons are used) to 1 (when the most frequent codons are used) [24]. Highly expressed genes tend to have high CAI values, whereas genes with low expression have low CAI values. CAI is mathematically expressed as: (for equation (4), please see supplementary material), where L is the number of codons in the gene and wc(k) is the w value for the kth codon in the gene. Relative adaptiveness (w) can be calculated as: (for equation (5), please see supplementary material), where Xij is the number of occurrences of the jth codon for the ith amino acid in the reference set of highly expressed genes and Xmax is the maximum Xij for the ith amino acid.
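The two steps described above, deriving w from a reference set and taking the geometric mean over a gene, can be sketched in Python as follows. The sketch assumes the standard genetic code, excludes methionine, tryptophan and stop codons, and replaces a zero count in the reference set with 0.5 so that w never becomes zero. The function names and the toy reference set are hypothetical and are not the reference set used in the study.

```python
import math
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in TCAG order (assumption: nuclear code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_of(cds):
    """Split a coding sequence into in-frame codons."""
    cds = cds.upper().replace("U", "T")
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def relative_adaptiveness(reference_cds):
    """w = X_ij / X_max for each codon, from a list of reference sequences."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(codons_of(cds))

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":  # skip stops and single-codon amino acids
            families[aa].append(codon)

    w = {}
    for codons in families.values():
        x_max = max(counts[c] for c in codons)
        if x_max == 0:
            continue  # amino acid absent from the reference set
        for codon in codons:
            # Zero counts are replaced by 0.5 (assumed pseudo-count).
            w[codon] = max(counts[codon], 0.5) / x_max
    return w


def cai(cds, w):
    """Geometric mean of w over the scorable codons of a gene."""
    logs = [math.log(w[c]) for c in codons_of(cds) if c in w]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Toy reference set: TTT used three times, TTC once.
w = relative_adaptiveness(["TTTTTTTTTTTC"])
print(cai("TTTTTT", w))  # 1.0, only the most frequent codon is used
print(cai("TTTTTC", w))  # geometric mean of 1 and 1/3
```

Working in logarithms avoids numerical underflow when the product of many small w values is taken over a long gene.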

Results

Characteristic patterns of nucleotide composition:

The nucleotide composition of the MAPK genes in the three organisms shows a clear pattern of resemblance, with a few exceptions in rice (Figure 1). In the MAPKKK family, the overall GC% of Arabidopsis is 35.6%, which is low, revealing a high overall AT content. The overall GC content in the soybean and rice gene sequences is 43.99% and 53.10% respectively. This result suggests that there is a balance between the AT and GC content in the rice genes, while in soybean the overall AT content slightly exceeds the overall GC content for MAPK genes. From the GC3% in all three gene families across the three species, it is evident that the GC% is markedly suppressed in Arabidopsis and soybean at the third synonymous codon position. The rice genes show an overall distinct pattern from the other two species in respect of the GC3%. The overall GC3 content in all the gene families of rice is slightly lower (40.02%-45.39%) than the AT3 content (54.61%-59.08%). Thus the comparative study of the MAPKKK, MAPKK and MAPK family genes shows that Arabidopsis and soybean resemble each other in their pattern of nucleotide usage, whereas the rice genes deviate from the other two species (Table 1).

Figure 1: Nucleotide composition (%) of the three MAPK gene families in Arabidopsis, soybean and rice

Synonymous codon usage pattern:

Codons with RSCU values greater than 1 are considered to have positive codon usage bias [22]. The most preferred codon, i.e. the one with the highest RSCU value, for each of 18 amino acids is shown in Table 2 (see supplementary material). Inspection of the overall RSCU values reveals that the codons TTT, GTT, AAT and GAT, coding for phenylalanine, valine, asparagine and aspartic acid respectively, have the highest preference in all three organisms. Besides this, the comparison between Arabidopsis and soybean (G. max) reveals a significant resemblance in codon preference; 11 out of 18 amino acids show the same preferred codon. Rice, however, does not show any significant trend: its preferred codon matches that of soybean for only two amino acids and that of Arabidopsis for three. The resemblance between Arabidopsis and soybean is further evident from the codon preference in the MAPKK family, where 12 out of 18 amino acids show the same preferred codon. Only one amino acid, valine, is encoded by the same most preferred codon (GTT) in the three MAPK gene families across all the species. The MAPKKK family shows the strongest resemblance between Arabidopsis and soybean; 15 out of 18 amino acids have the same preferred codon. In rice, however, the most preferred codon for most amino acids differs from that in Arabidopsis and soybean. The comparison of RSCU values of the most preferred codon for each amino acid in the three organisms is shown in Figure 2 for each gene family. The third position of the preferred codons is predominantly occupied by the nucleotide T, followed by A, in all three gene families, indicating that this preference could be due to translational selection or mutation bias.

Figure 2: RSCU values of Arabidopsis, soybean and rice for the three gene families: (A) MAPK; (B) MAPKK and (C) MAPKKK
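The species-to-species comparison of preferred codons reported above can be sketched in Python. The sketch assumes per-species RSCU values are already available as dictionaries keyed by codon; the function names and the RSCU numbers in the example are made up for illustration and are not taken from Table 2.

```python
from itertools import product

# Standard genetic code in TCAG order (assumption: nuclear code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def preferred_codons(rscu_values):
    """Map each degenerate amino acid to its codon with the highest RSCU."""
    best = {}
    for codon, value in rscu_values.items():
        aa = CODON_TABLE.get(codon)
        if aa is None or aa in "*MW":
            continue  # unknown codon, stop codon, Met or Trp
        if aa not in best or value > rscu_values[best[aa]]:
            best[aa] = codon
    return best


def shared_preferences(rscu_a, rscu_b):
    """Amino acids whose most preferred codon is the same in both species."""
    pref_a = preferred_codons(rscu_a)
    pref_b = preferred_codons(rscu_b)
    return sorted(aa for aa in pref_a.keys() & pref_b.keys()
                  if pref_a[aa] == pref_b[aa])


# Illustrative values only (hypothetical, two amino acids).
species_a = {"TTT": 1.4, "TTC": 0.6, "GTT": 1.8, "GTC": 0.7, "GTA": 0.6, "GTG": 0.9}
species_b = {"TTT": 0.8, "TTC": 1.2, "GTT": 1.5, "GTC": 0.9, "GTA": 0.5, "GTG": 1.1}
print(shared_preferences(species_a, species_b))  # ['V']
```

Counting the length of the returned list over all 18 degenerate amino acids gives the kind of "n out of 18" agreement figure quoted for each gene family.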

Expressivity of genes:

The codon adaptation index (CAI), proposed by Sharp and Li in 1987, is a measure used to predict the expressivity of genes [24]. The CAI values of the genes belonging to the different families were found to be very low, ranging from 0.147 to 0.208 (Table 1), which indicates that the genes are probably not optimized for high expression (Figure 3). This could be because MAPK gene expression is induced by stress and MAPK genes are not house-keeping genes by nature. Hence their CAI values are low, indicating low expression under a non-stress environment.

Figure 3: Frequency of estimated CAI values of different genes in Arabidopsis, rice and soybean

Bias in codon usage:

The Nc values of all the genes are on the higher side, ranging from 50.83 to 56.55. These high Nc values signify that the gene sequences show a low bias in codon usage.

Discussion

The comparative study of codon usage for MAPK genes in three species, i.e. Arabidopsis, soybean and rice, has shown that the first two organisms share a high degree of similarity. The MAPK genes of these two organisms show a very high resemblance in codon preference in their coding sequences. The gene family-wise comparison across the three organisms revealed that Arabidopsis and soybean are compositionally alike. In contrast, rice showed a somewhat different nucleotide composition from the other two species. This could be because Arabidopsis and soybean are dicots, unlike rice, which is a monocot. Arabidopsis and soybean genes are AT-rich both overall and at the third synonymous codon position. The rice MAPK sequences, on the other hand, are comparatively GC-rich both overall and at the third synonymous position. Guo and co-workers (2007) also reported a high GC content in rice genes [25]. The results suggest that the codon usage pattern of the MAPK genes in the three species might be influenced by their base composition. The Nc values in the three species are consistently on the higher side for all the genes, indicating a low bias in the codon usage pattern of these genes. Overall perusal of the RSCU values for the three gene families revealed a high similarity between the codon usage in Arabidopsis and G. max. The majority of the preferred codons in the genes of these two species end in T at the silent third codon position. This may be due to the high AT content of these genes. In O. sativa, however, the preference for nucleotides at the third synonymous codon position is balanced in the MAPK and MAPKK genes, which is evident from the nearly equal distribution of overall GC and AT content in these two gene families. In the MAPKKK gene family of O. sativa, the preferred codons showed an increased tendency to use T at the third position.
Valine is encoded by the most preferred codon GTT in all three species. Highly expressed genes generally tend to use a limited number of codons preferentially [24]. The level of gene expression as estimated by the CAI values suggests that the MAPK genes are probably not highly expressed. Since stress induces the expression of MAPK genes, these genes might have evolved to give low expression under non-stress or normal conditions. The CAI values of all the genes are close to 0, suggesting that they have low expressivity [24]. Translational selection might have played a role in this context, leaving the MAPK genes optimized for low expression under a non-stress environment.

Conclusion

This work is the first attempt to gain insight into the codon usage profiles of stress induced MAPK genes across three plant species. Mutational pressure and natural selection have been proposed as the main forces behind codon usage bias in organisms ranging from small prokaryotes to large plants and animals [21, 26, 27]. In the present study, the results indicate that, apart from mutational pressure, translational selection might play a pivotal role in optimizing these genes for efficient translation. The MAPK families comprise a large number of genes that interact with each other to combat different environmental stresses. Depending on the type of stress and the species involved, the codon usage as well as the expression of the genes may vary. Thus, further detailed analysis is needed of the codon usage pattern in MAPK genes and the associated genes involved in the cascade under biotic and abiotic stress in different species.

Disclosure

The authors do not have any competing interest. No financial aid from any funding agency was obtained for undertaking the present study.

References (27 in total, 10 listed)

1.  BWMK1, a novel MAP kinase induced by fungal infection and mechanical wounding in rice.

Authors:  C He; S H Fong; D Yang; G L Wang
Journal:  Mol Plant Microbe Interact       Date:  1999-12       Impact factor: 4.171

2.  Evidence of selectively driven codon usage in rice: implications for GC content evolution of Gramineae genes.

Authors:  Xingyi Guo; Jiandong Bao; Longjiang Fan
Journal:  FEBS Lett       Date:  2007-02-08       Impact factor: 4.124

3.  The selection-mutation-drift theory of synonymous codon usage.

Authors:  M Bulmer
Journal:  Genetics       Date:  1991-11       Impact factor: 4.562

4.  Variation in G + C-content and codon choice: differences among synonymous codon groups in vertebrate genes.

Authors:  A Marín; J Bertranpetit; J L Oliver; J R Medina
Journal:  Nucleic Acids Res       Date:  1989-08-11       Impact factor: 16.971

5.  The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications.

Authors:  P M Sharp; W H Li
Journal:  Nucleic Acids Res       Date:  1987-02-11       Impact factor: 16.971

6.  Overexpression of the mitogen-activated protein kinase gene OsMAPK33 enhances sensitivity to salt stress in rice (Oryza sativa L.).

Authors:  Seong-Kon Lee; Beom-Gi Kim; Taek-Ryoun Kwon; Mi-Jeong Jeong; Sang-Ryeol Park; Jung-Won Lee; Myung-Ok Byun; Hawk-Bin Kwon; Benjamin F Matthews; Choo-Bong Hong; Soo-Chul Park
Journal:  J Biosci       Date:  2011-03       Impact factor: 1.826

7.  Synonymous codon usage in Escherichia coli: selection for translational accuracy.

Authors:  Nina Stoletzki; Adam Eyre-Walker
Journal:  Mol Biol Evol       Date:  2006-11-13       Impact factor: 16.240

8.  Analysis of synonymous codon usage in H5N1 virus and other influenza A viruses.

Authors:  Tong Zhou; Wanjun Gu; Jianmin Ma; Xiao Sun; Zuhong Lu
Journal:  Biosystems       Date:  2005-04-07       Impact factor: 1.973

9.  Evidence for a trade-off between translational efficiency and splicing regulation in determining synonymous codon usage in Drosophila melanogaster.

Authors:  Tobias Warnecke; Laurence D Hurst
Journal:  Mol Biol Evol       Date:  2007-09-28       Impact factor: 16.240

10.  Identification, nomenclature, and evolutionary relationships of mitogen-activated protein kinase (MAPK) genes in soybean.

Authors:  Achal Neupane; Madhav P Nepal; Sarbottam Piya; Senthil Subramanian; Jai S Rohila; R Neil Reese; Benjamin V Benson
Journal:  Evol Bioinform Online       Date:  2013-09-22       Impact factor: 1.625
