Comprehensive analysis of synonymous codon usage patterns in orf3 gene of porcine epidemic diarrhea virus in China.

Xin Xu1, Pengfei Li2, Yating Zhang3, Xianhe Wang3, Jiaxin Xu3, Xuening Wu3, Yujiang Shen3, Dexuan Guo3, Yuchang Li3, Lili Yao3, Liyang Li3, Baifen Song3, Jinzhu Ma3, Xinyang Liu3, Shuyan Xu4, Hua Zhang5, Zhijun Wu6, Hongwei Cao7.   

Abstract

The ORF3 protein of porcine epidemic diarrhea virus (PEDV) functions as an ion channel that influences virus virulence and production. Given the importance of the PEDV orf3 gene, we performed a comprehensive analysis of its synonymous codon usage patterns. Base composition analysis showed that PEDV orf3 genes are A/T rich and G/C poor, with T the most abundant nucleotide. The relative synonymous codon usage value of each codon revealed that codon usage bias exists. The mean ENC value across genes was 48.75, indicating a low codon usage bias as well as relatively unstable variation among PEDV orf3 genes. Correlation analysis between base composition and codon usage bias indicated that mutational bias has an impact on PEDV codon usage bias. Neutrality analysis suggested that natural selection pressure plays a more important role than mutational bias in shaping codon usage bias. Moreover, other factors, including hydrophobicity and aromaticity, were also found to influence codon usage variation among the PEDV orf3 genes. This study not only represents the most systematic analysis of codon usage patterns in PEDV orf3 genes, but also offers a basic account of the mechanisms shaping the codon usage bias.
Copyright © 2019 Elsevier Ltd. All rights reserved.

Keywords:  Codon usage bias; Mutational bias; Natural selection; Porcine epidemic diarrhea virus; orf3 gene

Year:  2019        PMID: 31677415      PMCID: PMC7172109          DOI: 10.1016/j.rvsc.2019.09.012

Source DB:  PubMed          Journal:  Res Vet Sci        ISSN: 0034-5288            Impact factor:   2.534


Porcine epidemic diarrhea (PED) is a highly contagious, acute enteric viral disease characterized by vomiting, watery diarrhea and severe dehydration, resulting in >80% mortality in neonatal piglets (Song et al., 2015). The first PED outbreak was recognized in England in the early 1970s, and outbreaks have since been reported continually in other European, American and Asian countries, including China (Song and Park, 2012; Sun et al., 2012). The causative agent of PED is porcine epidemic diarrhea virus (PEDV), a member of the family Coronaviridae, subfamily Coronavirinae, genus Alphacoronavirus, which also includes other swine, bat and human coronaviruses (Chen et al., 2008). PEDV is a large, enveloped, single-stranded positive-sense RNA virus whose genome of approximately 28 knt comprises a 5′ untranslated region (5′-UTR), at least seven open reading frames (ORF1a, ORF1b, and ORF2–6), and a 3′ polyadenylated tail (Lee et al., 2015). Replicase proteins are encoded by ORF1a and ORF1b, and the remaining five ORFs encode the spike protein (S), the ORF3 protein (ORF3), the small membrane protein (E), the membrane protein (M), and the nucleocapsid protein (N) (Chen et al., 2014). The orf3 product is the only accessory protein in PEDV and has been found to function as an ion channel that influences virus virulence and production (Song et al., 2003; Wang et al., 2012). Because it is highly conserved in the majority of PEDV strains, the orf3 gene is widely used for diagnosis of PEDV infection (Wang et al., 2016). Differences in orf3 between attenuated and wild-type strains can also serve as a marker of viral adaptation to the host and as a potential tool for studying molecular evolution.
Previous studies of PEDV orf3 genes have been limited mainly to phylogenetic analysis (Huang et al., 2013), and few synonymous codon usage analyses have been performed (Chen et al., 2014). In particular, the factors influencing nucleotide composition and synonymous codon usage bias in PEDV orf3 genes have been studied only to a limited extent. Because there are fewer amino acids than codons, every amino acid except tryptophan and methionine is encoded by two to six codons. This phenomenon is known as synonymous codon usage (Chen et al., 2017). It is well known that the synonymous codons for each amino acid are not used randomly in the genomes of organisms; some codons are used more frequently than others, which is referred to as synonymous codon usage bias (Marín et al., 1989). Many studies have characterized codon usage bias in viruses, bacteria, fungi, and other organisms (D'Andrea et al., 2011). For example, rotavirus and rubella virus show strong codon usage bias across their genomes, with the degree of deviation depending on the virus (Belalov and Lukashev, 2013). In contrast, other viruses display weak codon usage bias, such as classical swine fever virus (CSFV) (Tao et al., 2009), enterovirus 71 (EV71) (Zhang et al., 2014), and Newcastle disease virus (Cao et al., 2014). To date, codon usage in RNA viruses has been shown to be related to mutation bias, translational selection, dinucleotide bias, and other factors (Zhou et al., 2005; Sharp et al., 2010; Hussain et al., 2019). Elucidating the extent and causes of codon usage bias is beneficial for understanding viral molecular evolution (Shackelton et al., 2006).
Considering the highly contagious nature of PEDV and the significance of the orf3 gene, it is necessary to analyze the codon usage patterns of the PEDV orf3 gene during its evolution. Such an analysis can provide important information about virus evolution, regulation of gene expression and protein synthesis, and can further aid vaccine design, which may require high levels of viral antigen expression to induce immunity (Butt et al., 2014). In the present study, a total of 518 coding sequences (CDS) of the orf3 gene of PEDV strains isolated from China (sequences with >99% identity were excluded) were retrieved from the GenBank database (https://www.ncbi.nlm.nih.gov/nucleotide/). The Clustal X software (Thompson et al., 1997) was used to align the orf3 gene sequences. The CodonW program (version 1.4.2) (http://codonw.sourceforge.net//) was used to calculate the effective number of codons (ENC), the total G + C content, and the G + C content at the first, second and third codon positions. Detailed information on the 518 orf3 gene sequences is provided in the supplemental data. Base composition analysis showed that T (38.22% ± 0.25%) was the most abundant base, followed by A (23.77% ± 0.17%), G (19.86% ± 0.23%) and C (17.09% ± 0.33%). The average GC content of all PEDV orf3 genes was 36.95% (range 36.16% to 37.95%, SD 0.29%), and the average GC3s content was 33.21% (range 31.36% to 35.91%, SD 0.65%), indicating that all of the PEDV orf3 genes were A/T rich and G/C poor. Sharp and Li (1986) first proposed that the relative synonymous codon usage (RSCU) value of each codon can be calculated to directly reflect the characteristics of codon usage. An RSCU value of 1.0 indicates no bias, whereas a value above or below 1.0 indicates positive or negative codon usage bias, respectively (Ma et al., 2002).
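The base composition and GC3s figures reported above can be illustrated with a minimal Python sketch. It is not the CodonW implementation: the function names and the toy sequence are hypothetical, and GC3s is taken here as the G + C fraction at the third position of codons that have synonymous alternatives (Met, Trp and stop codons skipped), which is the usual definition but an assumption about how the values in this study were produced.

```python
from collections import Counter

# Assumption: codons without synonymous alternatives (Met, Trp) and stop
# codons are excluded when computing GC3s.
NON_SYNONYMOUS = {"ATG", "TGG", "TAA", "TAG", "TGA"}


def base_composition(cds: str) -> dict:
    """Return the percentage of each base (A, C, G, T) in a coding sequence."""
    seq = cds.upper().replace("U", "T")
    counts = Counter(seq)
    total = sum(counts[b] for b in "ACGT")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def gc3s(cds: str) -> float:
    """Return the G + C percentage at synonymous third codon positions."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    third = [c[2] for c in codons if c not in NON_SYNONYMOUS]
    if not third:
        raise ValueError("no synonymous codons in sequence")
    return 100.0 * sum(b in "GC" for b in third) / len(third)


# Toy sequence (hypothetical, not a PEDV orf3 gene): ATG TTT GCC TAA
toy = "ATGTTTGCCTAA"
composition = base_composition(toy)
toy_gc3s = gc3s(toy)
```

For the toy sequence only TTT and GCC contribute to GC3s, so one of the two third positions is G or C.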
To gain insight into the characteristics of synonymous codon usage in PEDV orf3 genes, RSCU values were calculated using the program GCUA (version 1.2) (ftp://ftp.nhm.ac.uk/pub/gcua), and the RSCU values of all 61 codons are displayed in Table 1. The preferentially used codons were U-ended (11 codons), C-ended (4 codons), A-ended (3 codons), and G-ended (3 codons). Notably, the predominance of U-ended codons among the preferred synonymous codons is consistent with the base composition result above, in which T was the most abundant base. These results support the conclusion that T is the most abundant base and the most preferred nucleotide at the third codon position, suggesting that codon usage bias exists in the synonymous codon usage pattern of the PEDV orf3 gene and is influenced by compositional constraints.
Table 1

Overall RSCU of the 518 collected sequences of the orf3 gene of the PEDV.

AA    Codon   N      RSCU      AA    Codon   N      RSCU
Phe   UUU     8073   1.61      Ser   UCU     3240   2.32
      UUC     1947   0.39            UCC     987    0.71
Leu   UUA     1848   0.72            UCA     1472   1.06
      UUG     3548   1.38            UCG     590    0.42
Tyr   UAU     5140   1.42      Cys   UGU     1503   0.99
      UAC     2100   0.58            UGC     1547   1.01
ter   UAA     1      0.00      ter   UGA     519    0.00
ter   UAG     0      0.00      Trp   UGG     520    1.00
Leu   CUU     5664   2.20      Pro   CCU     7      0.03
      CUC     1565   0.61            CCC     6      0.02
      CUA     1770   0.69            CCA     1042   3.95
      CUG     1034   0.40            CCG     1      0.00
His   CAU     1080   1.97      Arg   CGU     1035   1.34
      CAC     14     0.03            CGC     1043   1.35
Gln   CAA     3602   1.77            CGA     508    0.66
      CAG     476    0.23            CGG     517    0.67
Ile   AUU     7797   2.40      Thr   ACU     2721   1.61
      AUC     1425   0.44            ACC     521    0.31
      AUA     522    0.16            ACA     2068   1.23
Met   AUG     1047   1.00            ACG     1431   0.85
Asn   AAU     3632   1.54      Ser   AGU     1557   1.12
      AAC     1092   0.46            AGC     520    0.37
Lys   AAA     2585   1.25      Arg   AGA     495    0.64
      AAG     1554   0.75            AGG     1047   1.35
Val   GUU     4045   1.84      Ala   GCU     4857   1.97
      GUC     3169   1.44            GCC     334    0.14
      GUA     533    0.24            GCA     3120   1.27
      GUG     1043   0.47            GCG     1550   0.63
Asp   GAU     2043   0.72      Gly   GGU     2577   1.66
      GAC     3598   1.28            GGC     2600   1.67
Glu   GAA     1569   0.86            GGA     513    0.33
      GAG     2059   1.14            GGG     523    0.34

The preferentially used codons and their RSCU values for the orf3 gene of PEDV are in bold and italic. AA, amino acid; N, number of codons; RSCU, cumulative relative synonymous codon usage.
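As an illustration of how the RSCU values in Table 1 relate to the codon counts, the following Python sketch applies the standard definition (observed count divided by the mean count over the synonymous codons of the same amino acid). This is not the GCUA program; the function name is hypothetical, and the counts are copied from the Phe and Pro rows of Table 1.

```python
def rscu(counts: dict) -> dict:
    """RSCU for one amino acid family.

    counts maps each synonymous codon to its observed count. RSCU is the
    observed count divided by the count expected under equal usage,
    i.e. n_codons * count / total.
    """
    total = sum(counts.values())
    n_codons = len(counts)
    return {codon: n_codons * c / total for codon, c in counts.items()}


# Counts taken from Table 1 (Phe and Pro families).
phe = rscu({"UUU": 8073, "UUC": 1947})
pro = rscu({"CCU": 7, "CCC": 6, "CCA": 1042, "CCG": 1})

phe_rounded = {codon: round(v, 2) for codon, v in phe.items()}
pro_rounded = {codon: round(v, 2) for codon, v in pro.items()}
# phe_rounded -> {'UUU': 1.61, 'UUC': 0.39}
# pro_rounded -> {'CCU': 0.03, 'CCC': 0.02, 'CCA': 3.95, 'CCG': 0.0}
```

The rounded values agree with the corresponding entries in Table 1, and the RSCU values within a family always sum to the number of synonymous codons.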

The ENC value of a gene is commonly used to quantify the extent of codon usage bias. ENC values range from 20 to 61: a value of 20 indicates extreme bias, in which only one codon is used for each amino acid, whereas a value of 61 indicates no bias (Comeron and Aguade, 1998). To investigate the variation of codon usage bias in PEDV orf3 genes, the ENC values of the 518 genes were calculated. The ENC values varied from 45.44 to 56.37, with an average ± SD of 48.75 ± 1.29, which represents a relatively low codon usage bias and unstable variation. In addition, we performed the same analysis on a total of 294 coding sequences (CDS) of the M gene of PEDV strains collected from China. The ENC values of the M gene varied from 47.45 to 60.47, with an average ± SD of 56.29 ± 1.74, which represents comparatively stable variation and a lower codon usage bias than orf3. Mutational pressure and translational selection are thought to be the two major factors influencing codon usage variation in RNA virus genomes (Belalov and Lukashev, 2013). The plot of ENC versus GC3s can be used to analyze the synonymous codon usage bias of viral genes (Wright, 1990). Genes whose codon usage is constrained only by G + C mutational bias will lie on or just below the predicted curve in the ENC-GC3s plot (Zhang et al., 2014). As shown in Fig. 1A, most points lay below the predicted curve, suggesting that G + C mutational bias might play a major role in PEDV orf3 codon usage. Some points, however, were located above the expected curve, suggesting that codon bias is also related to translational selection combined with other factors.
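The predicted curve in the ENC-GC3s plot is usually drawn from the expectation given by Wright (1990), ENC = 2 + s + 29 / (s^2 + (1 - s)^2), where s is GC3s expressed as a fraction. The sketch below assumes that this standard formula is the one behind the curve in Fig. 1A; the function name is hypothetical.

```python
def expected_enc(gc3s_fraction: float) -> float:
    """Expected ENC when codon usage is shaped only by G + C composition.

    gc3s_fraction is GC3s as a fraction between 0 and 1 (Wright, 1990).
    """
    s = gc3s_fraction
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# Expected ENC at the mean GC3s reported for orf3 (33.21%).
expected_at_mean = expected_enc(0.3321)

# Observed mean ENC reported in the text.
observed_mean = 48.75
lies_below_curve = observed_mean < expected_at_mean
```

At the reported mean GC3s the expected ENC is about 54.5, so the observed mean of 48.75 falls below the curve, in line with the description of Fig. 1A.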
Fig. 1

(A) ENC values of PEDV orf3 genes plotted against GC3s. The dotted red line shows the predicted ENC values. Blue dots show the ENC values obtained for the PEDV orf3 genes. (B) The first 20 axes are used to display the tendency of codon usage bias of PEDV orf3 genes. The plot is drawn according to the relative and cumulative inertia of the first 20 factors, respectively. The relative inertia is represented by the bar chart and the cumulative inertia by the curve, based on principal component analysis. (C) Positions of the PEDV orf3 genes in the plot of the first two major axes by COA of RSCU values. The first and second axes account for 21.7% and 15.38% of the total variation, respectively. (D) Neutrality analysis of GC3s against GC12s, showing the relative roles of mutational pressure and natural selection. (E) Summary of the correlation analysis among nucleotide composition, Axis1, Axis2, Gravy, Aroma, GC3s, GC12s and ENC. * P value ≤.05; ** P value ≤.01. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Subsequently, we performed a correspondence analysis (COA) to investigate the trends in usage variation of the 59 synonymous codons among PEDV orf3 genes, following a previously described method (Chen et al., 2014). Based on the relative and cumulative inertia of the first 20 factors, we used the Origin software (version 8.0) to display the distribution of each vector. The first principal axis accounted for 21.7% of the total variation. The next three axes accounted for 15.38%, 14.58%, and 14.50% of the variation, respectively, so the first four axes together accounted for 66.16% of the total variation (Fig. 1B). At the same time, COA was carried out on the RSCU values of each gene, and the distribution of the genes in the plane defined by the first two principal axes of the COA is displayed in Fig. 1C.
The results showed that the vast majority of virulent genes were distributed around the origin of the coordinate axes and were not far from one another. Meanwhile, some genes were located at different positions in the plane, dispersed and far away from the origin. These strains, mainly collected from southern China, were more widely dispersed than those from other regions of China. In addition, most of the studied strains were isolated from southern China, and their ENC values (51.73 ± 2.94) were higher than both the overall average ENC and the values of strains from other regions of China. These data reflect the relatively low codon usage bias among different strains, indicating that mutational bias might contribute to the codon usage bias of the PEDV genome. The above results revealed that both mutation pressure and natural selection contribute to the codon usage bias of the orf3 gene of PEDV. Thus, to determine which one plays the more important role in shaping codon usage bias, the GC3s values were plotted against the GC12s values (Chen et al., 2014). The neutrality plot shows the balance between directional mutation pressure and natural selection in shaping codon usage in the orf3 gene of PEDV (Fig. 1D). We found that GC3s was significantly correlated with GC12s (r = −0.442, P < .01), with a regression coefficient of −0.2368, indicating that relative neutrality was 23.68% and, conversely, that the relative constraint from natural selection was 76.32%. These results demonstrate that, compared with mutational pressure, natural selection plays the major role in influencing the codon usage bias of the orf3 gene of PEDV. To further analyze the possible effects of mutational pressure on the codon usage bias in PEDV orf3 genes, we performed a correlation analysis among the nucleotide compositions (A%, T%, G%, C%, and GC%), the codon compositions (A3s, T3s, G3s, C3s, and GC3s) and the ENC values.
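The arithmetic behind the neutrality analysis described above can be sketched in Python. The sketch assumes the common convention of regressing GC12s (y axis) on GC3s (x axis) and reading the magnitude of the slope as the share attributed to mutational pressure; the function names are hypothetical and the per-gene values used in the study are not reproduced here.

```python
def ols_slope(x: list, y: list) -> float:
    """Slope of the ordinary least squares line y = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def neutrality_shares(slope: float) -> tuple:
    """Return (mutation %, selection %) implied by a neutrality-plot slope.

    Assumption: the absolute slope is read as relative neutrality and the
    remainder is attributed to natural selection.
    """
    mutation = abs(slope) * 100.0
    return mutation, 100.0 - mutation


# Slope reported in the text for PEDV orf3.
mutation_pct, selection_pct = neutrality_shares(-0.2368)

# Sanity check of the regression helper on synthetic points (y = 2x + 1).
synthetic_slope = ols_slope([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

With the reported slope of −0.2368 this gives roughly 23.68% for mutational pressure and 76.32% for natural selection, matching the figures quoted above.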
Furthermore, correlation and regression analyses were performed using the values of the first two axes of the COA (Chen et al., 2014) and Spearman's rank correlation method (Tsai et al., 2007). All statistical analyses were conducted using SPSS (version 17.0). The nucleotide compositions were correlated with most of the codon compositions (Fig. 1E). Furthermore, there was a significant correlation between the ENC values and most of the nucleotide compositions, with all P values <0.01, which indicates that mutational bias shapes the synonymous codon usage pattern of the PEDV orf3 gene. Finally, we evaluated the correlation between the general average hydropathicity (Gravy) and aromaticity (Aroma) values and the codon contents. The Gravy value was correlated with A3s, G3s, C3s, U3s, GC3s, GC12s and ENC. The Aroma value was correlated with A3s, G3s, C3s, GC3s, GC12s and ENC. Both the Gravy and Aroma values were correlated with Axis 1 and Axis 2, indicating that natural selection influences the codon usage bias of PEDV orf3 genes. In conclusion, the codon usage bias of the PEDV orf3 gene is comparatively low. Two main factors, mutational bias and natural selection pressure, contribute to the codon usage pattern, with the latter playing the more critical role. Moreover, other factors, such as hydrophobicity and aromaticity, also influence codon usage bias. This study not only represents the most comprehensive analysis of PEDV orf3 codon usage patterns, but also provides a basic understanding of the mechanisms of codon usage bias. However, this study applies only to PEDV isolates from China, and our future work will focus on comparing PEDV isolates from other parts of the world to examine more extensively the factors that drive the outbreak and evolution of this virus.
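For readers unfamiliar with the Gravy and Aroma indices used in the correlation analysis, the following Python sketch shows how they are conventionally computed from a protein sequence: Gravy as the mean Kyte-Doolittle hydropathy value per residue, and Aroma as the fraction of aromatic residues (Phe, Tyr, Trp). This is an illustration under those standard definitions, not the software used in the study; the function names and the toy peptide are hypothetical.

```python
# Kyte-Doolittle hydropathy scale (one-letter amino acid codes).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

AROMATIC = {"F", "Y", "W"}


def gravy(protein: str) -> float:
    """General average hydropathicity: mean hydropathy value per residue."""
    residues = protein.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aroma(protein: str) -> float:
    """Aromaticity: fraction of residues that are Phe, Tyr or Trp."""
    residues = protein.upper()
    return sum(aa in AROMATIC for aa in residues) / len(residues)


# Toy peptide (hypothetical, not an ORF3 sequence).
toy_peptide = "MFLGLFQYTI"
toy_gravy = gravy(toy_peptide)
toy_aroma = aroma(toy_peptide)
```

A positive Gravy value indicates an overall hydrophobic protein; the toy peptide contains three aromatic residues out of ten.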

Declaration of Competing Interest

There is no conflict of interest among the contributors of this paper.
References

1. Paul M Sharp; Laura R Emery; Kai Zeng. Forces that influence the evolution of codon bias. Philos Trans R Soc Lond B Biol Sci. 2010-04-27.

2. J M Comeron; M Aguadé. An evaluation of measures of synonymous codon usage bias. J Mol Evol. 1998-09.

3. Hong-Wei Cao; De-Shan Li; Hua Zhang. Analysis of synonymous codon usage in Newcastle disease virus hemagglutinin-neuraminidase (HN) gene and fusion protein (F) gene. Virusdisease. 2013-11-12.

4. Jianmin Ma; Tong Zhou; Wanjun Gu; Xiao Sun; Zuhong Lu. Cluster analysis of the codon use frequency of MHC genes from different species. Biosystems. 2002 Mar-May.

5. Rui-Qin Sun; Ru-Jian Cai; Ya-Qiang Chen; Peng-Shuai Liang; De-Kun Chen; Chang-Xu Song. Outbreak of porcine epidemic diarrhea in suckling piglets, China. Emerg Infect Dis. 2012-01.

6. Yao-Wei Huang; Allan W Dickerman; Pablo Piñeyro; Long Li; Li Fang; Ross Kiehne; Tanja Opriessnig; Xiang-Jin Meng. Origin, evolution, and genotyping of emergent porcine epidemic diarrhea virus strains in the United States. MBio. 2013-10-15.

7. Daesub Song; Bongkyun Park. Porcine epidemic diarrhoea virus: a comprehensive review of molecular epidemiology, diagnosis, and vaccines. Virus Genes. 2012-01-22.

8. D S Song; J S Yang; J S Oh; J H Han; B K Park. Differentiation of a Vero cell adapted porcine epidemic diarrhea virus from Korean field strains by restriction fragment length polymorphism analysis of ORF 3. Vaccine. 2003-05-16.

9. Azeem Mehmood Butt; Izza Nasrullah; Yigang Tong. Genome-wide analysis of codon usage and influencing factors in chikungunya viruses. PLoS One. 2014-03-04.

10. Pan Tao; Li Dai; Mengcheng Luo; Fangqiang Tang; Po Tien; Zishu Pan. Analysis of synonymous codon usage in classical swine fever virus. Virus Genes. 2008-10-29.