Analysis of codon usage patterns and influencing factors in Nipah virus.

Supriyo Chakraborty, Bornali Deb, Parvin A Barbhuiya, Arif Uddin.

Abstract

Codon usage bias (CUB) is the unequal usage of synonymous codons for an amino acid, in which some codons are used more often than others. Its analysis is widely used to understand molecular biology, genetics, and the functional regulation of gene expression. Nipah virus (NiV) is an emerging zoonotic paramyxovirus that causes fatal disease in both humans and animals. NiV was first identified during a disease outbreak in Malaysia in 1998, and outbreaks have recurred periodically since 2001 in India, Bangladesh, and the Philippines. Because no such work had been reported, we used bioinformatics tools to analyze codon usage patterns genome-wide in 11 NiV genomes. The compositional properties revealed overall GC and AT contents of 41.96% and 58.04%, respectively, i.e. Nipah virus genes were AT-rich. Correlation analysis between the overall nucleotide composition and the composition at the 3rd codon position suggested that both mutation pressure and natural selection might influence CUB across Nipah genomes. The neutrality plot revealed that natural selection might have played a major role, and mutation pressure a minor role, in shaping the codon usage bias in NiV genomes.
Copyright © 2019 Elsevier B.V. All rights reserved.

Keywords:  Codon usage bias; Mutation pressure; Natural selection; Nipah virus

Year:  2019        PMID: 30664908      PMCID: PMC7114725          DOI: 10.1016/j.virusres.2019.01.011

Source DB:  PubMed          Journal:  Virus Res        ISSN: 0168-1702            Impact factor:   3.303


Introduction

Degeneracy or redundancy of the genetic code ensures that multiple codons codify the same amino acid except for two amino acids i.e. methionine and tryptophan. The codons that encode the same amino acid are called synonymous codons. Numerous previous studies have shown that the usage of these synonymous codons in mRNA molecules in varying frequencies leads to a phenomenon known as codon usage bias (CUB) (Hasegawa et al., 1979). The evolution of CUB is very complex and a highly debatable subject. Various evolutionary processes explain the origin of synonymous codon usage variation or CUB, among them the two most accepted theories are the neutral theory and the selection-mutation drift balance model theory (Duret and Mouchiroud, 1999; Sharp et al., 1986, 1993; Shields and Sharp, 1987). However, the impact of these evolutionary forces in different species remains undetermined (Akashi, 1997; Hershberg and Petrov, 2008). In addition, various biological factors have been identified to be associated with CUB such as GC composition, gene expression level, gene length, protein structure, tRNA abundance and its types, hydrophobicity and hydrophilicity of the protein (Bains, 1987; Bernardi and Bernardi, 1986; Gouy and Gautier, 1982; Ikemura, 1981; Tao and Dafu, 1998). The relationships of codon usage between viruses and their hosts are fascinating as it has significance in overall viral existence, its codon adaptation to host, evasion of host’s immune system by viral pathogen and their co-evolution. CUB can provide significant insights relating to functional regulation of gene expression level, identification of horizontally transferred genes, optimization of protein expression level and adaptation of pathogens to certain specific hosts (Chaney and Clark, 2015; Lithwick and Margalit, 2005; Liu et al., 2012). Nipah virus (NiV), an emerging zoonotic paramyxovirus, possesses high pathogenicity that causes fatal disease in both animals and humans (Wong et al., 2002). 
NiV is a single-stranded RNA virus belonging to the genus Henipavirus within the family Paramyxoviridae (Chua et al., 2000). Genome size varies from 18,246 to 18,252 nucleotides and the number of genes varies from 6 to 9 (http://www.ncbi.nlm.nih.gov). NiV was first identified during a disease outbreak in Malaysia in 1998 (Lee et al., 1999). Outbreaks of NiV have occurred periodically since 2001 in India, Bangladesh, and the Philippines (Arankalle et al., 2011; Ching et al., 2015; Hossain et al., 2008; Hsu et al., 2004). In mid-2018, a Nipah outbreak was again reported in southern parts of India. The natural hosts of the virus are fruit bats of the family Pteropodidae (Olson et al., 2002). The virus is transmitted through contact with NiV-infected animals or through consumption of contaminated food, posing a prominent risk of epidemic outbreaks. Human-to-human transmission has also been observed (Clayton, 2017; Escaffre et al., 2013). In the outbreaks in Malaysia and Singapore, pigs were reported to be the intermediate hosts, whereas in Bangladesh the source was date palm sap contaminated by infected fruit bats (Chua et al., 2001; Clayton et al., 2016). Clinical features are highly variable and may range from asymptomatic infection through acute respiratory syndrome to fatal encephalitis in humans, and respiratory disease in swine (Harcourt et al., 2001). To date, there are neither therapeutic nor prophylactic treatments against NiV. A low mortality rate was observed when ribavirin was administered during the Malaysian outbreak (Chong et al., 2001), but the same drug was ineffective in preventing the death of NiV-infected hamsters (Georges-Courbot et al., 2006). Treatment of NiV infection is therefore limited to supportive and preventive care.
Analysis of codon usage patterns in different NiV genomes would improve our understanding of the mechanism of codon distribution and variation in NiV genomes as part of the virus's molecular biology, and of the factors influencing those patterns. In this paper, we report the CUB and compositional properties of the open reading frames (ORFs) in eleven NiV genomes. This study might give insights into pathogen evolution and disease propagation. In addition, its biological relevance might shed light on therapeutic interventions against NiV.

Materials and methods

Sequence data

The complete ORFs of eleven Nipah virus genomes were retrieved in FASTA format from the GenBank database of the National Center for Biotechnology Information (NCBI), accessible at http://www.ncbi.nlm.nih.gov (Table 1). Only ORFs whose length was an exact multiple of three nucleotides and that had appropriate start and stop codons were analyzed in the present study. Nipah virus is an RNA virus, but NCBI databases replace the nucleotide U of the RNA genome with T and present nucleic acid sequences conventionally in 5’–3’ orientation following the Wisconsin system. This system has the added advantage of representing the ORF and the mRNA of a gene with the same sequence. We therefore used the NCBI format of the ORFs for analysis, with T representing U throughout.
Table 1

Accession number and acronym of 11 genomes of Nipah virus.

| Nipah virus | Acronym | GenBank Acc. No. | Genome size (nt) | CDS |
|---|---|---|---|---|
| USA_NP_112021.1_1 | NiV USA | NC_002728.1 | 18,246 | 9 |
| Malayasia_UMMC2_AAK50548.1_1 | NiV M1 | AY029768.1 | 18,246 | 8 |
| Malaysia_UMMC1_AAK50540.1_1 | NiV M2 | AY029767.1 | 18,246 | 8 |
| Malaysia_AAF73377.1_1 | NiV M3 | AF212302.2 | 18,246 | 8 |
| Malaysia_CAF25493.1_1 | NiV M4 | AJ627196.1 | 18,246 | 6 |
| Malaysia_NV/MY/99/UM-0128_CAD92359.1_1 | NiV M5 | AJ564623.1 | 18,246 | 6 |
| Malaysia_NV/MY/99/VRI-2794_CAD92347.1_1 | NiV M6 | AJ564621.1 | 18,246 | 6 |
| Bangladesh_NIVBGD2008RAJBARI_AEZ01384.1_1 | NiV B1 | JN808863.1 | 18,252 | 9 |
| Bangladesh_NIVBGD2008MANIKGONJ_AEZ01379.1_1 | NiV B2 | JN808857.1 | 18,252 | 9 |
| Bangladesh_AAY43911.1_1 | NiV B3 | AY988601.1 | 18,252 | 9 |
| India_Ind-Nipah-07-FG_ACT32611.1_1 | NiV I | FJ513078.1 | 18,252 | 6 |

Nucleotide composition

Overall nucleotide composition (A, C, T and G%) and nucleotide composition at the third codon position of each ORF (A3, C3, T3 and G3%) in each genome were analyzed. The overall GC content and the GC contents at the 1st, 2nd and 3rd codon positions were determined for all eleven genomes of NiV. In addition, various nucleotide skews, namely AT skew (A−T)/(A+T), GC skew (G−C)/(G+C), purine skew (A−G)/(A+G), pyrimidine skew (T−C)/(T+C), amino skew (A−C)/(A+C) and keto skew (T−G)/(T+G), were computed for each ORF to understand the dynamics of nucleotide usage in different NiV genomes.
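
As a rough illustration of the quantities defined above, the following Python sketch computes position-specific GC content and the six skews for a single ORF string. The function names and the toy sequence are hypothetical; this is not the authors' PERL program, and T stands for U as in the NCBI-format ORFs.

```python
from collections import Counter


def gc_by_position(orf: str) -> dict:
    """Return overall GC% and GC% at codon positions 1, 2 and 3 of an ORF."""
    orf = orf.upper()

    def gc_pct(bases: str) -> float:
        return 100.0 * sum(b in "GC" for b in bases) / len(bases) if bases else 0.0

    return {
        "GC": gc_pct(orf),
        "GC1": gc_pct(orf[0::3]),  # every first codon position
        "GC2": gc_pct(orf[1::3]),  # every second codon position
        "GC3": gc_pct(orf[2::3]),  # every third codon position
    }


def skew(x: int, y: int) -> float:
    """Generic skew (x - y) / (x + y); returns 0.0 when both counts are zero."""
    return (x - y) / (x + y) if (x + y) else 0.0


def nucleotide_skews(orf: str) -> dict:
    """Compute the six skews listed in the text from raw nucleotide counts."""
    counts = Counter(orf.upper())
    a, t, g, c = counts["A"], counts["T"], counts["G"], counts["C"]
    return {
        "AT": skew(a, t),
        "GC": skew(g, c),
        "purine": skew(a, g),
        "pyrimidine": skew(t, c),
        "amino": skew(a, c),
        "keto": skew(t, g),
    }


if __name__ == "__main__":
    toy_orf = "ATGGCATAA"  # hypothetical toy sequence, not a NiV ORF
    print(gc_by_position(toy_orf))
    print(nucleotide_skews(toy_orf))
```

Averaging these per-ORF values over all ORFs of a genome would give genome-level figures of the kind reported in the Results.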

Relative synonymous codon usage

Relative synonymous codon usage (RSCU) is the ratio of the observed frequency of a codon to the frequency expected if all synonymous codons for that amino acid were used equally in mRNA molecules. An RSCU value <1.0 implies that the codon is used less frequently than expected, and a value >1.0 that it is used more frequently. An RSCU value >1.6 identifies an overrepresented codon, while an RSCU value <0.6 identifies an underrepresented codon in the ORF (Behura and Severson, 2012; Sharp and Li, 1986). RSCU is determined mathematically as

$$\mathrm{RSCU}_{ij} = \frac{X_{ij}}{\frac{1}{n}\sum_{j=1}^{n} X_{ij}}$$

Here, X_ij is the frequency of occurrence of the j-th codon for the i-th amino acid (any X with a value of zero is arbitrarily allotted a value of 0.5), and n is the total number of synonymous codons that encode the i-th amino acid.
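
The RSCU definition above (observed codon count divided by the mean count of its synonymous family) can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the standard genetic code (NCBI Translation Table 1) is built from the conventional TCAG ordering, the names `CODON_TABLE` and `rscu` are hypothetical, and the 0.5 pseudo-count for zero-frequency codons mentioned in the text is deliberately not applied.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI TCAG order (T stands for U).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(orf: str) -> dict:
    """Return RSCU values for the codons of every amino acid present in the ORF.

    Stop codons, Met (ATG) and Trp (TGG) are skipped because they have no
    synonymous alternatives, leaving the 59 codons analysed in the text.
    """
    orf = orf.upper()
    codons = [orf[i:i + 3] for i in range(0, len(orf) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this ORF
        expected = total / len(family)  # equal usage within the family
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


if __name__ == "__main__":
    # Hypothetical toy ORF: four alanine codons, three GCA and one GCT.
    for codon, value in sorted(rscu("GCAGCAGCAGCT").items()):
        print(codon, value)
```

With this toy input, GCA is used three times as often as expected under equal usage, so it would fall in the overrepresented range (>1.6) defined above.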

Effective number of codons (ENC)

This parameter is broadly used for measuring the degree of CUB. It estimates the magnitude of deviation of the codon usage of a gene from equal usage of synonymous codons. The ENC value ranges from 20 (highest bias) to 61 (lowest or no bias). A high ENC value indicates low CUB, while a low ENC value signifies high CUB of a gene. An ENC value below 35 is regarded as indicating significant CUB of a gene (Wright, 1990). ENC is calculated for the standard genetic code (Translation Table 1 of NCBI) as

$$\mathrm{ENC} = 2 + \frac{9}{F_2} + \frac{1}{F_3} + \frac{5}{F_4} + \frac{3}{F_6}$$

Here, F_k (k = 2, 3, 4 or 6) is the average of the F values for k-fold degenerate amino acids. The F value indicates the probability of two randomly chosen codons for an amino acid being identical.
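
A minimal Python sketch of the ENC calculation described above is given here for orientation. It assumes Wright's (1990) homozygosity estimator F = (n Σp² − 1)/(n − 1) for each amino acid, the common fallback of averaging F2 and F4 when isoleucine (the only 3-fold family) is missing, and a cap at 61; the function names are hypothetical and the authors' own program may differ in such details.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in NCBI TCAG order (T stands for U).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_families() -> dict:
    """Map each amino acid with synonymous codons to the list of its codons."""
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)
    return dict(families)


def enc(orf: str) -> float:
    """Effective number of codons of one ORF (20 = extreme bias, 61 = no bias)."""
    orf = orf.upper()
    counts = Counter(orf[i:i + 3] for i in range(0, len(orf) - 2, 3))

    f_by_degeneracy = defaultdict(list)
    for codons in synonymous_families().values():
        n = sum(counts[c] for c in codons)
        if n < 2:
            continue  # F is undefined for fewer than two observed codons
        f = (n * sum((counts[c] / n) ** 2 for c in codons) - 1) / (n - 1)
        if f > 0:
            f_by_degeneracy[len(codons)].append(f)

    mean_f = {k: sum(v) / len(v) for k, v in f_by_degeneracy.items()}
    if 3 not in mean_f and 2 in mean_f and 4 in mean_f:
        mean_f[3] = (mean_f[2] + mean_f[4]) / 2  # fallback when Ile is absent
    if any(k not in mean_f for k in (2, 3, 4, 6)):
        raise ValueError("too few codons to estimate ENC")

    value = 2 + 9 / mean_f[2] + 1 / mean_f[3] + 5 / mean_f[4] + 3 / mean_f[6]
    return min(value, 61.0)
```

Short ORFs give unstable F values, which is one reason the sketch raises an error rather than returning a number when a degeneracy class cannot be estimated.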

Correspondence analysis (COA)

Correspondence analysis on relative synonymous codon usage is often used in CUB analysis to elucidate the major trends in variation of data (Greenacre, 1984). The axes of relative inertia are the major trends.

Parity rule 2

The Parity Rule 2 (PR2) bias plot is drawn with AT-bias [A/(A+T)] on the y-axis and GC-bias [G/(G+C)] on the x-axis. The centre of the plot, where both coordinates equal 0.5, indicates that no bias exists between the complementary nucleotides (A = T and G = C).
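
The coordinates of one point in such a plot can be sketched as below. This is a simplified illustration that uses all third codon positions of an ORF; restricting the counts to 2-fold, 4-fold or 6-fold degenerate families, as done in this study, would require an additional codon-family filter. The function name `pr2_point` is hypothetical.

```python
def pr2_point(orf: str) -> tuple:
    """Return (GC-bias, AT-bias) = (G3/(G3+C3), A3/(A3+T3)) for one ORF.

    Counts are taken at the third codon position; 0.5 is returned for an
    axis when neither of its two nucleotides occurs.
    """
    third = orf.upper()[2::3]
    a, t, g, c = (third.count(base) for base in "ATGC")
    gc_bias = g / (g + c) if (g + c) else 0.5
    at_bias = a / (a + t) if (a + t) else 0.5
    return gc_bias, at_bias


if __name__ == "__main__":
    # Hypothetical toy ORF with one A, T, G and C at the third positions.
    print(pr2_point("GCAGCTGCGGCC"))
```

A point at (0.5, 0.5) corresponds to parity; displacement along either axis shows which nucleotide of a complementary pair is favoured.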

Neutrality plot

Neutrality plot is the graphical representation of GC12 versus GC3 which measures the mutation-selection equilibrium in shaping the codon usage variation. Each dot represents an independent gene in the plot. The influence of mutation pressure on the CUB of a gene is indicated from the slope of the regression line of GC12 on GC3 approaching 1 (Sueoka, 1988). However, natural selection is assumed to play a significant role in CUB if the points are widely scattered in the plot.
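
The regression underlying the neutrality plot can be sketched in plain Python as follows: each ORF contributes one (GC3, GC12) point, and the slope of the ordinary least-squares line of GC12 on GC3 is the regression coefficient. The helper names are hypothetical, and GC12 is taken here as the pooled GC fraction of the first and second codon positions, which is an assumption about how it was averaged.

```python
def gc_fraction(bases: str) -> float:
    """Fraction of G or C in a string of bases (0.0 for an empty string)."""
    return sum(b in "GC" for b in bases) / len(bases) if bases else 0.0


def gc12_gc3(orf: str) -> tuple:
    """Return (GC12, GC3) for one ORF, ignoring any trailing partial codon."""
    orf = orf.upper()
    orf = orf[: len(orf) - len(orf) % 3]
    gc12 = gc_fraction(orf[0::3] + orf[1::3])
    gc3 = gc_fraction(orf[2::3])
    return gc12, gc3


def regression_slope(x: list, y: list) -> float:
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("GC3 values are all identical; slope is undefined")
    return sxy / sxx


def neutrality_slope(orfs: list) -> float:
    """Slope of GC12 (y) on GC3 (x) across the ORFs of one genome."""
    points = [gc12_gc3(orf) for orf in orfs]
    return regression_slope([p[1] for p in points], [p[0] for p in points])
```

A slope near 1 would point to mutation pressure acting similarly on all codon positions, whereas a slope near 0 would suggest that GC12 is constrained independently of GC3.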

Mutational responsive index (MRI)

MRI is the measure of the mutational drift in codons (Gouy and Gautier, 1982). It is a component of CUB which is based on compositional properties of the gene. A positive MRI value indicates directional mutation pressure but a negative MRI value indicates that translational selection operates on the gene (Gatherer and McEwan, 1997).

Translational selection (P2)

P2 measures the codon-anticodon interaction efficiency and detects the translational efficiency of a gene. It is calculated as

$$P2 = \frac{WWC + SSU}{WWY + SSY}$$

where W = A or T, S = C or G and Y = C or T (T standing for U, as in the ORF sequences). A P2 value >0.5 indicates a bias in favor of translational selection according to Gouy and Gautier (1982).
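
The P2 ratio can be computed by classifying each codon by its first two bases (both weak, W, or both strong, S) and its third base. The sketch below is a hypothetical illustration that counts every matching codon in the ORF without excluding any amino acid; the function name `p2` is an assumption.

```python
def p2(orf: str) -> float:
    """Return P2 = (WWC + SSU) / (WWY + SSY) for one ORF (T stands for U)."""
    orf = orf.upper().replace("U", "T")
    weak, strong = set("AT"), set("CG")
    wwc = ssu = wwy = ssy = 0
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        first_two = set(codon[:2])
        if first_two <= weak:          # WW- codons
            wwy += codon[2] in "CT"
            wwc += codon[2] == "C"
        elif first_two <= strong:      # SS- codons
            ssy += codon[2] in "CT"
            ssu += codon[2] == "T"
    denominator = wwy + ssy
    return (wwc + ssu) / denominator if denominator else float("nan")


if __name__ == "__main__":
    # Hypothetical toy ORF: AAC (WWC), GGT (SSU), AAT (WWY only), GGC (SSY only).
    print(p2("AACGGTAATGGC"))
```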

Software used in the study

Pearson’s correlation analysis was performed to investigate the relationship of ENC (codon usage bias) with overall GC content, codon-position-specific GC contents (GC1, GC2, GC3), nucleotide skews, amino acids; and also the relationships among overall nucleotide compositions and the compositions at third position of codons using SPSS software (SPSS Inc., Chicago, Illinois, USA) (Chakraborty et al., 2017). Correspondence analysis was performed using PAST software (Paleontological statistics software package) (Deb et al., 2018). All indices of codon usage bias were estimated using a program written in PERL computer language by the corresponding author (SC).
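
For readers without access to SPSS, the Pearson coefficient itself is straightforward to compute; the sketch below is a plain-Python illustration with a hypothetical function name. It returns only r and does not reproduce the two-tailed significance tests reported in the tables.

```python
from math import sqrt


def pearson_r(x: list, y: list) -> float:
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)
```

In this study each list would hold one value per ORF of a genome, for example the ENC values and the GC3% values, so the number of points per genome is small (6 to 9 ORFs).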

Results

Nucleotide composition of Nipah genomes

The distribution and usage of synonymous codons in coding sequences are greatly influenced by the nucleotide composition of a genome (Jenkins and Holmes, 2003). We therefore analyzed the nucleotide composition of the ORFs of Nipah genomes, as shown in Fig. 1. The mean A% was the highest (32.98), followed by T% (25.07), G% (21.92) and C% (20.03) across all genomes. The overall AT and GC contents were 58.04% and 41.96%, respectively, indicating that NiV genes were AT-rich. The mean GC contents at the three codon positions were GC1% (49.07), GC2% (38.72) and GC3% (38.08), indicating that the first codon position had almost equal AT and GC contents, whereas the second and third codon positions were AT-rich.
Fig. 1

Overall nucleotide composition and its composition at 3rd codon position in genomes of Nipah virus.

Codon usage bias of genes in Nipah genomes

To analyze the extent of variation in CUB, the ENC values of the ORFs in each Nipah virus genome were estimated (S1). The ENC values of the 11 NiV genomes are shown in Table 2; the mean value across all genomes was 51.87, which suggested that the CUB of NiV genomes was low (Butt et al., 2014).
Table 2

Average ENC values of genes in 11 genomes of Nipah virus.

| Nipah virus | ENC value |
|---|---|
| NiV USA | 51.87 |
| NiV M1 | 51.79 |
| NiV M2 | 51.79 |
| NiV M3 | 51.79 |
| NiV M4 | 51.55 |
| NiV M5 | 51.58 |
| NiV M6 | 51.57 |
| NiV B1 | 52.31 |
| NiV B2 | 52.31 |
| NiV B3 | 51.96 |
| NiV I | 52.05 |
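
The reported mean ENC can be checked directly from the per-genome averages in Table 2. The short Python snippet below only repeats that arithmetic; the variable names are illustrative.

```python
# Average ENC per genome, copied from Table 2.
enc_by_genome = {
    "NiV USA": 51.87, "NiV M1": 51.79, "NiV M2": 51.79, "NiV M3": 51.79,
    "NiV M4": 51.55, "NiV M5": 51.58, "NiV M6": 51.57, "NiV B1": 52.31,
    "NiV B2": 52.31, "NiV B3": 51.96, "NiV I": 52.05,
}

mean_enc = sum(enc_by_genome.values()) / len(enc_by_genome)
print(f"mean ENC = {mean_enc:.2f}")  # prints: mean ENC = 51.87

# Every genome lies well above the threshold of 35 used for significant CUB.
assert all(value > 35 for value in enc_by_genome.values())
```

The values span only 51.55 to 52.31, so the 11 genomes are nearly indistinguishable by this measure.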

Codon usage pattern in Nipah genomes

Relative synonymous codon usage (RSCU) values of the 59 synonymous codons in 11 genomes of Nipah virus were estimated. The patterns of codon usage were very similar in most of the genomes, as shown in Fig. 2. Moreover, the RSCU values indicated that the nucleotides A and T were preferred over G and C at the third codon position. This result, along with the nucleotide compositional analysis, suggests that mutational pressure might affect the codon usage pattern in Nipah genomes. In our analysis, the RSCU values of most codons were in the range 0.6–1.6, indicating stable genetic composition of the genes. However, a few codons were overrepresented (RSCU > 1.6), i.e. ACA, GCA, AGA, GGA, AGG, CCT, GTT and TCA, as depicted in Fig. 3. On the other hand, a few codons were underrepresented (RSCU < 0.6), i.e. TCC, TCG, CGC, CGG, CGT, ACG, GCG, CCG, GGC, TGC and CCC, as shown in Fig. 4. A comparison of the overrepresented and underrepresented codons of the 11 NiV genomes is shown in Fig. 5.
Fig. 2

Comparison of RSCU values of codons in genomes of Nipah virus.

Fig. 3

Overrepresented codons in genomes of Nipah virus.

Fig. 4

Underrepresented codons in genomes of Nipah virus.

Fig. 5

Comparison of overrepresented and underrepresented codons in genomes of Nipah virus.

Correspondence analysis of Nipah genomes

To decipher the variation in codon usage among the different genomes of Nipah virus, we performed correspondence analysis on the RSCU values of codons. The contributions of Axis 1 and Axis 2 differed between genomes, as shown in S2. The AT-ended codons lay much closer to the axes, with a clustering tendency, than the GC-ended codons, indicating that nucleotide composition under mutational pressure might have a significant role in shaping the CUB of genes.

Role of mutational pressure on codon usage bias in Nipah virus genomes

To estimate the role of mutational pressure in shaping CUB in the 11 genomes, we compared correlation coefficients between the overall nucleotide compositions (A%, T%, G%, C%, GC%) and the nucleotide compositions at the third codon position of the genes (A3%, T3%, G3%, C3%, GC3%) (Table 3). Significant correlations were found between several of these pairs at p < 0.01 or p < 0.05, indicating that mutational pressure might play an important role in shaping the codon usage bias. This analysis suggests that compositional properties under mutational pressure might have affected the codon usage patterns. However, on correlating ENC with GC contents (GC%, GC1%, GC2%, GC3% and GC12%) (Table 4), we found a significant correlation of ENC with GC3% only in NiV M4 (0.819*), NiV M5 (0.842*), NiV M6 (0.845*) and NiV I (0.909*), indicating a higher impact of mutational pressure on these genomes of Nipah virus than on the other genomes.
Table 3

Correlation analysis between overall nucleotide composition and its composition at the 3rd codon position in coding sequences of Nipah virus.

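| Nipah virus | Correlation | A3% | T3% | G3% | C3% | GC3% |
|---|---|---|---|---|---|---|
| NiV USA | A% | 0.779* | −0.472 | 0.237 | −0.888** | −0.147 |
| | T% | −0.530 | 0.689* | −0.355 | 0.221 | −0.280 |
| | G% | 0.107 | −0.554 | 0.398 | 0.132 | 0.484 |
| | C% | 0.237 | −0.258 | −0.081 | 0.355 | 0.077 |
| | GC% | 0.147 | −0.472 | 0.25 | 0.223 | 0.368 |
| NiV M1 | A% | 0.777* | −0.463 | 0.24 | −0.884** | −0.139 |
| | T% | −0.490 | 0.715* | −0.394 | 0.164 | −0.346 |
| | G% | 0.033 | −0.556 | 0.425 | 0.207 | 0.543 |
| | C% | 0.163 | −0.24 | −0.085 | 0.466 | 0.117 |
| | GC% | 0.068 | −0.471 | 0.271 | 0.315 | 0.428 |
| NiV M2 | A% | 0.777* | −0.465 | 0.240 | −0.886** | −0.139 |
| | T% | −0.490 | 0.714* | −0.394 | 0.172 | −0.346 |
| | G% | 0.033 | −0.554 | 0.425 | 0.200 | 0.543 |
| | C% | 0.163 | −0.237 | −0.085 | 0.461 | 0.117 |
| | GC% | 0.068 | −0.468 | 0.271 | 0.308 | 0.428 |
| NiV M3 | A% | 0.777* | −0.465 | 0.240 | −0.886** | −0.139 |
| | T% | −0.490 | 0.714* | −0.394 | 0.172 | −0.346 |
| | G% | 0.033 | −0.554 | 0.425 | 0.200 | 0.543 |
| | C% | 0.163 | −0.237 | −0.085 | 0.461 | 0.117 |
| | GC% | 0.068 | −0.468 | 0.271 | 0.308 | 0.428 |
| NiV M4 | A% | 0.819* | −0.107 | −0.73 | −0.894* | −0.924** |
| | T% | −0.310 | 0.856* | −0.206 | −0.037 | −0.123 |
| | G% | −0.256 | −0.561 | 0.659 | 0.498 | 0.634 |
| | C% | −0.078 | −0.745 | 0.366 | 0.538 | 0.526 |
| | GC% | −0.211 | −0.639 | 0.586 | 0.535 | 0.625 |
| NiV M5 | A% | 0.795 | −0.189 | −0.652 | −0.873* | −0.895* |
| | T% | −0.355 | 0.855* | −0.132 | −0.014 | −0.076 |
| | G% | −0.211 | −0.522 | 0.585 | 0.464 | 0.591 |
| | C% | 0.004 | −0.749 | 0.210 | 0.526 | 0.457 |
| | GC% | −0.158 | −0.610 | 0.489 | 0.511 | 0.578 |
| NiV M6 | A% | 0.795 | −0.186 | −0.652 | −0.872* | −0.892* |
| | T% | −0.351 | 0.857* | −0.135 | −0.023 | −0.088 |
| | G% | −0.211 | −0.527 | 0.585 | 0.469 | 0.596 |
| | C% | 0 | −0.748 | 0.215 | 0.53 | 0.467 |
| | GC% | −0.158 | −0.616 | 0.489 | 0.517 | 0.585 |
| NiV B1 | A% | 0.807** | −0.334 | 0.092 | −0.753* | −0.232 |
| | T% | −0.532 | 0.595 | −0.457 | 0.615 | −0.184 |
| | G% | 0.128 | −0.478 | 0.487 | −0.305 | 0.344 |
| | C% | 0.12 | −0.219 | 0.123 | −0.006 | 0.12 |
| | GC% | 0.119 | −0.431 | 0.418 | −0.232 | 0.309 |
| NiV B2 | A% | 0.807** | −0.334 | 0.092 | −0.753* | −0.232 |
| | T% | −0.532 | 0.595 | −0.457 | 0.615 | −0.184 |
| | G% | 0.128 | −0.478 | 0.487 | −0.305 | 0.344 |
| | C% | 0.12 | −0.219 | 0.123 | −0.006 | 0.12 |
| | GC% | 0.119 | −0.431 | 0.418 | −0.232 | 0.309 |
| NiV B3 | A% | 0.810** | −0.377 | 0.107 | −0.753* | −0.221 |
| | T% | −0.522 | 0.618 | −0.443 | 0.601 | −0.196 |
| | G% | 0.108 | −0.501 | 0.494 | −0.314 | 0.377 |
| | C% | 0.133 | −0.155 | 0.012 | 0.077 | 0.045 |
| | GC% | 0.106 | −0.439 | 0.401 | −0.217 | 0.322 |
| NiV I | A% | 0.820* | −0.457 | −0.713 | −0.74 | −0.859* |
| | T% | −0.428 | 0.75 | −0.244 | 0.334 | 0.165 |
| | G% | −0.168 | −0.252 | 0.733 | 0.104 | 0.367 |
| | C% | −0.085 | −0.514 | 0.28 | 0.36 | 0.397 |
| | GC% | −0.154 | −0.34 | 0.646 | 0.182 | 0.396 |

** and * denote correlations significant at p < 0.01 and p < 0.05, respectively (2-tailed).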

Table 4

Correlation coefficient between ENC and various GC contents in Nipah virus.

| Nipah virus | ENC and GC% | ENC and GC1% | ENC and GC2% | ENC and GC3% | ENC and GC12% |
|---|---|---|---|---|---|
| NiV USA | 0.616 | 0.616 | 0.369 | 0.170 | 0.558 |
| NiV M1 | 0.616 | 0.636 | 0.349 | 0.183 | 0.563 |
| NiV M2 | 0.616 | 0.636 | 0.352 | 0.183 | 0.565 |
| NiV M3 | 0.616 | 0.636 | 0.349 | 0.183 | 0.563 |
| NiV M4 | 0.630 | 0.526 | 0.112 | 0.819* | 0.397 |
| NiV M5 | 0.597 | 0.478 | 0.080 | 0.842* | 0.343 |
| NiV M6 | 0.599 | 0.480 | 0.083 | 0.845* | 0.348 |
| NiV B1 | 0.342 | 0.466 | 0.442 | −0.243 | 0.486 |
| NiV B2 | 0.342 | 0.466 | 0.442 | −0.243 | 0.486 |
| NiV B3 | 0.275 | 0.335 | 0.324 | −0.129 | 0.354 |
| NiV I | 0.403 | 0.159 | 0.029 | 0.909* | 0.110 |

** and * denote correlations significant at p < 0.01 and p < 0.05, respectively (2-tailed).

Parity plot analysis of Nipah virus genomes

Parity rule 2 bias plots are used to predict the relative magnitude of mutational pressure and natural selection on gene composition. If mutational pressure alone is the cause of CUB, then AT and GC will be proportionally distributed among the degenerate codon groups of a gene (Chakraborty et al., 2017). However, if both mutational pressure and natural selection act on CUB, then AT and GC will not be proportionally distributed in the degenerate codon groups (Sueoka, 1995). To investigate the roles of mutational pressure and natural selection, the relationships between G and C content and between A and T content in 2-fold, 4-fold and 6-fold degenerate codon families were analyzed with PR2 bias plots (S3). Our results for the 2-fold, 4-fold and 6-fold families revealed unequal distribution of GC and AT, suggesting that both evolutionary forces, i.e. mutational pressure and natural selection, are responsible for shaping the CUB in NiV genomes.

Role of natural selection on codon usage patterns in Nipah virus

It has been proposed that if the codon usage pattern is influenced by mutational pressure only, then the frequencies of the nucleotides G and C should equal those of A and T at the third codon position (Zhang et al., 2013a). However, in the Nipah virus genomes the GC and AT compositions at the third codon position were unequal (Fig. 1), indicating that natural selection might play a role in shaping CUB. Further, we drew a neutrality plot of GC12 against GC3 (S4) to identify which of mutational pressure and natural selection is the major determinant of CUB, and analyzed the regression coefficients (Table 5). All regression coefficients were lower than 0.5, indicating that selection played a dominant role in shaping the codon usage bias of NiV genomes.
Table 5

Regression coefficients of GC12 on GC3 content in genomes of Nipah virus.

| Nipah virus | Regression coefficient of GC12 on GC3 |
|---|---|
| NiV USA | 0.1363 |
| NiV M1 | −0.1052 |
| NiV M2 | −0.1023 |
| NiV M3 | −0.1052 |
| NiV M4 | 0.3323 |
| NiV M5 | 0.2935 |
| NiV M6 | 0.3012 |
| NiV B1 | −0.2339 |
| NiV B2 | −0.2339 |
| NiV B3 | −0.216 |
| NiV I | 0.0991 |

Relationship between codon usage bias and nucleotide skews

Correlation coefficients were estimated between ENC and the various nucleotide skews to unravel their interrelationships in coding sequences. We found a significant correlation (p < 0.05) between ENC and nucleotide skewness in NiV USA (−0.707*) and in NiV B3 (−0.692*) (S5); thus nucleotide skewness might also affect the codon usage patterns in Nipah virus.

Mutational responsive index (MRI) and translational selection (P2)

The mean MRI values were 0.52 in NiV USA, NiV M1, NiV M2, NiV M3 and NiV M4, and 0.51 in NiV M5, NiV M6, NiV B1, NiV B2, NiV B3 and NiV I. A positive MRI value indicates directional mutation pressure (Gatherer and McEwan, 1997). In our study, the MRI values in all the Nipah genomes were positive, indicating that directional mutation pressure might have influenced the CUB of NiV genomes. Further, the mean P2 values were 0.06 in NiV USA, NiV M1, NiV M2, NiV M3, NiV B1, NiV B2 and NiV B3, and 0.04 in NiV M4, NiV M5, NiV M6 and NiV I. The value of P2 was less than 0.5 (Gouy and Gautier, 1982), indicating that the role of translational selection in CUB was low in NiV genomes. As the MRI and P2 values were almost the same in all the Nipah genomes, the effects of directional mutation pressure and translational selection appear to be similar across genomes.

Discussions

CUB arises from the unequal distribution of synonymous codons in mature mRNA transcripts. Biased usage of synonymous codons is common across a wide variety of organisms, from prokaryotes to eukaryotes (Akashi and Eyre-Walker, 1998; Duret, 2002). To understand the magnitude and causes of codon bias, analysis of codon usage and overall nucleotide composition is essential for investigating evolutionary relatedness, genomic characteristics and the roles of mutational pressure and natural selection in genomic composition (Shackelton et al., 2006). As codon usage bias is strongly influenced by the nucleotide composition of a gene, we first investigated the nucleotide composition of the ORFs in 11 genomes of Nipah virus and found A% to be the highest, followed by T% > G% > C%; NiV genomes were thus AT-rich. The GC contents at the three codon positions were GC1% (49.07), GC2% (38.72) and GC3% (38.08). As in NiV genomes, GC1 was higher than GC2 and GC3 in dengue virus, where the variation in the GC composition profile was apparently associated with geographical location (Lara-Ramírez et al., 2014). To understand the variation in codon usage, we further investigated the ENC values of the ORFs of the different Nipah virus genomes and found a mean ENC of 51.87 (i.e. much higher than 35), indicating a low magnitude of CUB. Low CUB might be useful for efficient replication of Nipah virus genomes, which have different choices for codon usage (Jenkins and Holmes, 2003). Similarly, in a codon bias analysis of complete genomic coding sequences of 50 genetically and ecologically diverse human RNA viruses, the mean ENC value was 50.9, ranging from 38.9 (Hepatitis A virus) to 58.3 (Eastern equine encephalitis virus), depicting low CUB in most of the viruses, although a few variations were evident (Jenkins and Holmes, 2003).
On analysis of synonymous codon usage in SARS Coronavirus, ENC values of genes were found to range from 42.19 to 59.06 with a mean of 48.99, indicating lower codon usage bias. In order to understand the codon usage patterns, we estimated the RSCU values of 59 synonymous codons and found A/T-ended codons were preferentially used over G/C-ended codons, with a discrete compositional distribution, depicting mutational pressure might play a role in shaping the CUB. In Chikungunya genomes, the codon AGA for amino acid Arg and CUG for Leu were overrepresented whereas GUU for Val, CGU, CGG for Arg and CUU, CUC for Leu were underrepresented (Butt et al., 2014). Combining the results of nucleotide composition and RSCU analysis in this study, we observed that compositional properties had high impact in selection of preferred codons coupled with the presence of mutational pressure, supporting the result of Butt et al. (2014). Codon usage variation is multifactorial in nature; therefore for estimating the rate of variation of codon usage, we used a multivariate statistical technique i.e. correspondence analysis. We found mutational pressure might affect the codon usage as GC and AT ended codons were placed close to the axes; however natural selection might also affect the codon bias as some GC and AT ended codons were discretely distributed away from the axes. Gu et al. (2004), from the study of SARS Coronavirus and Nidovirales, reported on correspondence analysis of base compositions and found compositional properties mainly influenced the variations of codon biases in viruses. Moreover, correspondence analysis of Chikungunya viruses revealed that different genotypes were distributed across all planes of axes, indicating the impact of favorable transmission vectors, host range, susceptibility and climatic conditions in shaping codon usage bias and the least influence of mutational pressure (Butt et al., 2014). 
All viruses in general, and RNA viruses in particular, are exposed to a high magnitude of mutational pressure (Drake, 1993). Adaptation and co-evolution of viruses with their hosts have mostly been analyzed by estimating mutation at synonymous and non-synonymous coding sites in specific genes (Yang et al., 2014). To assess the impact of mutational pressure on codon bias, we correlated the overall nucleotide composition (A%, T%, G%, C%, GC%) with the nucleotide composition at the third codon position of the genes (A3%, T3%, G3%, C3%, GC3%) using the Pearson correlation method and found significant correlations between them. These results suggest that mutational pressure played a significant role in the codon usage of NiV genomes. On correlating ENC with GC3%, a significant correlation was obtained at p < 0.05 in NiV M4, NiV M5, NiV M6 and NiV I. Previous studies on H5N1 virus and other influenza A viruses reported that GC composition varied in a similar pattern, indicating that mutational bias played a major role in codon usage; however, the codon usage of the NS2 and M2 genes was independent of GC content (Zhou et al., 2005). Previous studies also reported that mutational pressure shaped codon usage in some RNA viruses, as the mutation rate is much higher in RNA viruses than in DNA viruses (Drake and Holland, 1999). Butt et al. (2014) reported significant correlations among various nucleotide properties at p < 0.01 or p < 0.05, suggesting that mutational pressure was mostly responsible for nucleotide compositional patterns and for affecting the dynamics of CUB in genes. To investigate the codon patterns in Zika virus, correlation analysis was performed among nucleotide compositions and codon compositions in different combinations, and significant correlations were reported, indicating the role of mutational pressure in codon bias (Butt et al., 2016).
In the present study, parity rule 2 bias plot analysis was performed to understand the roles of mutational pressure and natural selection in gene composition. We found a non-proportional distribution of GC and AT counts in 2-fold, 4-fold and 6-fold degenerate codon families, suggesting a role for both mutational pressure and natural selection in codon composition. In Marburg virus, A and T were overrepresented compared to G and C in 4-fold degenerate codon families (Nasrullah et al., 2015). Butt et al. (2016) reported that in Zika virus the nucleobases A and G were preferred over T and C in 4-fold degenerate families. If codon usage were shaped exclusively by mutational pressure, the distribution of G and C would equal that of A and T at the wobble codon position (Zhang et al., 2013b). In our analysis, the role of natural selection, apart from mutational pressure, was distinguishable and evident from the codon usage patterns in 11 genomes of Nipah virus. To determine the magnitude of mutational pressure relative to natural selection, a neutrality plot was drawn. The results revealed that mutational pressure might affect CUB, although natural selection was the prime determinant of codon usage in NiV. A similar result was obtained in Zika virus, with natural selection (96.8%) being the major factor over mutational pressure (3.2%) in shaping the codon bias (Butt et al., 2016). In contrast, neutrality plot analysis in Marburg virus revealed that mutational pressure (92.6%) was dominant over natural selection (7.4%) for codon usage bias (Nasrullah et al., 2015). Further, Li et al. (2018) reported that in H3N2 canine influenza virus mutational pressure (17.37%) played a less significant role than natural selection (82.63%) in codon bias formation.
Nucleotide skewness also affects the codon usage bias of genes. From the correlation analysis between ENC and nucleotide skews, we found a significant correlation of codon bias with amino skew in NiV USA and NiV B3, supporting an effect of skew on CUB formation. Berkhout et al. (2002) reported a skew analysis in retroviral genomes that provided information about the nucleotide preferences involved in shaping codon usage. On further analysis of the mutational responsive index (MRI) and translational selection (P2) of the ORFs, it was observed that mutational pressure might play a more effective role than translational selection in the 11 genomes of Nipah virus.

Conclusion

Recurrent Nipah virus (NiV) outbreaks cause alarm because of the fatal nature of the disease, which claims lives almost every year. To date, no effective drug is available as a prophylactic or therapeutic measure against NiV; only supportive and preventive care can be offered. The present study of 11 NiV genomes revealed that NiV genes are AT-rich (58.04%) and that the overall CUB of genes was low (ENC > 51), suggesting that almost all synonymous codons are used in the ORFs. This suggests the existence of genetic variability in codon usage in NiV genomes. Further, the neutrality plot revealed that natural selection might have played a more prominent role than mutation pressure in shaping the CUB of genes during the evolution of Nipah virus. Codon bias analysis could be harnessed in designing peptide vaccines against disease-causing viruses based on the expression potential of surface proteins, for example by identifying efficient epitopes of surface proteins for boosting cellular immunity.

Conflict of interests

The authors declare no conflict of interest.

References

1.  Codon and amino acid usage in retroviral genomes is consistent with virus-specific nucleotide pressure.

Authors:  Ben Berkhout; Andrei Grigoriev; Margreet Bakker; Vladimir V Lukashov
Journal:  AIDS Res Hum Retroviruses       Date:  2002-01-20       Impact factor: 2.205

2.  Small regions of preferential codon usage and their effect on overall codon bias--the case of the plp gene.

Authors:  D Gatherer; N R McEwan
Journal:  Biochem Mol Biol Int       Date:  1997-09

3.  Poly(I)-poly(C12U) but not ribavirin prevents death in a hamster model of Nipah virus infection.

Authors:  M C Georges-Courbot; H Contamin; C Faure; P Loth; S Baize; P Leyssen; J Neyts; V Deubel
Journal:  Antimicrob Agents Chemother       Date:  2006-05       Impact factor: 5.191

4.  Molecular characterization of the polymerase gene and genomic termini of Nipah virus.

Authors:  B H Harcourt; A Tamin; K Halpin; T G Ksiazek; P E Rollin; W J Bellini; P A Rota
Journal:  Virology       Date:  2001-08-15       Impact factor: 3.616

5.  The neurological manifestations of Nipah virus encephalitis, a novel paramyxovirus.

Authors:  K E Lee; T Umapathi; C B Tan; H T Tjia; T S Chua; H M Oh; K M Fock; A Kurup; A Das; A K Tan; W L Lee
Journal:  Ann Neurol       Date:  1999-09       Impact factor: 10.422

6.  Analysis of codon usage pattern of mitochondrial protein-coding genes in different hookworms.

Authors:  Bornali Deb; Arif Uddin; Gulshana Akthar Mazumder; Supriyo Chakraborty
Journal:  Mol Biochem Parasitol       Date:  2017-11-20       Impact factor: 1.759

7.  Nipah virus encephalitis reemergence, Bangladesh.

Authors:  Vincent P Hsu; Mohammed Jahangir Hossain; Umesh D Parashar; Mohammed Monsur Ali; Thomas G Ksiazek; Ivan Kuzmin; Michael Niezgoda; Charles Rupprecht; Joseph Bresee; Robert F Breiman
Journal:  Emerg Infect Dis       Date:  2004-12       Impact factor: 6.883

8.  Large-scale genomic analysis of codon usage in dengue virus and evaluation of its phylogenetic dependence.

Authors:  Edgar E Lara-Ramírez; Ma Isabel Salazar; María de Jesús López-López; Juan Santiago Salas-Benito; Alejandro Sánchez-Varela; Xianwu Guo
Journal:  Biomed Res Int       Date:  2014-07-17       Impact factor: 3.411

9.  Synonymous codon usage in TTSuV2: analysis and comparison with TTSuV1.

Authors:  Zhicheng Zhang; Wei Dai; Dingzhen Dai
Journal:  PLoS One       Date:  2013-11-26       Impact factor: 3.240

10.  Genetic and evolutionary analysis of emerging H3N2 canine influenza virus.

Authors:  Gairu Li; Ruyi Wang; Cheng Zhang; Shilei Wang; Wanting He; Junyan Zhang; Jie Liu; Yuchen Cai; Jiyong Zhou; Shuo Su
Journal:  Emerg Microbes Infect       Date:  2018-04-25       Impact factor: 7.163
