
Analysis of the codon usage of the ORF2 gene of feline calicivirus.

Minghui Zang, Wanting He, Fanshu Du, Gongjian Wu, Bohao Wu, Zhenlei Zhou.

Abstract

Feline calicivirus (FCV) is a highly prevalent pathogen of the domestic cat that causes acute infections of the oral cavity and upper respiratory tract. The E region of the ORF2 protein is responsible for the induction of virus-neutralizing antibodies, so it is important to understand the codon usage of this gene. Here, we analysed 90 coding sequences of ORF2 and show that the gene has a low codon usage bias. In addition, although mutational bias is one of the factors shaping the codon usage bias of this gene, natural selection plays a more significant role. Our results reveal part of the mechanisms driving FCV evolution and lay a foundation for further research on FCV.
Copyright © 2017 Elsevier B.V. All rights reserved.

Keywords:  Codon usage; Feline calicivirus; Mutation pressure; Natural selection; Neutrality analysis; Open reading frame 2 (ORF2)

Year:  2017        PMID: 28625542      PMCID: PMC7106028          DOI: 10.1016/j.meegid.2017.06.013

Source DB:  PubMed          Journal:  Infect Genet Evol        ISSN: 1567-1348            Impact factor:   3.342


Introduction

Feline calicivirus (FCV) is a highly prevalent pathogen of the domestic cat, with widespread distribution across the world (Radford et al., 2007). It has been suggested that almost all members of the Felidae, such as cats, tigers and cheetahs, are susceptible to the virus. FCV has also been isolated from the faeces of dogs (Gabriel et al., 1996), and FCV strains have been shown to circulate between different animal species. FCV belongs to the genus Vesivirus of the family Caliciviridae. It has a small, non-enveloped, positive-sense, single-stranded RNA genome of approximately 7700 nucleotides, of which the 5′ end is linked covalently to the VPg protein and the 3′ end is polyadenylated. The genome contains three open reading frames (ORFs) referred to as ORF1, ORF2 and ORF3 (Prikhodko et al., 2014). ORF2 encodes the capsid precursor protein, which is processed by the viral protease to release a small 124 amino acid protein called the leader of the capsid (LC) and the mature capsid protein (VP1). Comparative analysis of ORF2 sequences has been used to elucidate phylogenetic relationships among different FCV isolates (Prikhodko et al., 2014). Codons that encode the same amino acid are referred to as synonymous codons. Although synonymous codons encode the same amino acid, their corresponding tRNAs may differ in relative abundance in the cell and in the speed at which they are recognized by the ribosome, which affects codon usage. The usage of synonymous codons is a non-random process, with some codons being used more often than others (Marín et al., 1989). This phenomenon, called ‘codon usage bias’, is found in numerous organisms, including prokaryotes, eukaryotes and viruses (Liu et al., 2011). Codon usage is influenced by two major factors, natural selection and mutation bias (Gu et al., 2004).
Differences in codon usage between the virus and the host can affect the overall survival of the virus, its ability to evade the host immune system and its evolution (Moratorio et al., 2013). Thus, understanding the codon usage of viruses can provide information about viral evolution and expand our understanding of how codon usage regulates viral gene expression. This can aid rational vaccine design by enabling efficient expression of viral proteins that induce long-lasting immunity. Because different FCV strains cause disease with a wide range of clinical signs, it is important to characterize the genetic variation, evolution and codon usage pattern of FCV to understand how these viral strains cause disease. The aim of this study was to describe the genetic features of the ORF2 gene of FCV. To this end, we analysed in detail the codon usage pattern of the gene and the evolutionary forces that shape it.

Materials and methods

Sequence data

A total of 90 coding sequences (CDS) of ORF2 of FCV strains were retrieved from the National Center for Biotechnology Information (NCBI) GenBank database (https://www.ncbi.nlm.nih.gov/nucleotide/) and included in this study. The details of the sequences analysed, including accession number, time of collection and geographical distribution, are shown in Table S1.

Nucleotide composition analysis

The frequency of each nucleotide (A%, U%, G% and C%) was calculated using BioEdit. The nucleotide composition at the third position of synonymous codons (A3s, T3s, G3s, C3s) was calculated using the CodonW package. The G + C content at the first (GC1s), second (GC2s) and third (GC3s) codon positions was calculated with the same program, as was the mean G + C content of the first and second positions (GC12s).
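The quantities described above can be illustrated with a minimal Python sketch. It is not the BioEdit or CodonW implementation: the function names are hypothetical, and as a simplification it counts every codon position, whereas CodonW's "s" indices (for example GC3s) are restricted to synonymous positions.

```python
from collections import Counter


def nucleotide_percentages(cds):
    """Return A%, C%, G% and U% of a coding sequence (T is reported as U)."""
    seq = cds.upper().replace("T", "U")
    counts = Counter(seq)
    total = sum(counts[base] for base in "ACGU")
    return {base: 100.0 * counts[base] / total for base in "ACGU"}


def gc_by_codon_position(cds):
    """Return G + C content (%) at codon positions 1, 2 and 3, plus their GC12 mean.

    Simplification (assumption): all codons are counted, not only synonymous ones.
    """
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    result = {}
    for position in range(3):
        bases = seq[position:usable:3]
        result[f"GC{position + 1}"] = 100.0 * sum(b in "GC" for b in bases) / len(bases)
    result["GC12"] = (result["GC1"] + result["GC2"]) / 2
    return result


if __name__ == "__main__":
    toy_cds = "ATGGCCTTT"  # toy sequence, not an FCV sequence
    print(nucleotide_percentages(toy_cds))
    print(gc_by_codon_position(toy_cds))
```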

Synonymous codon usage analysis

Relative synonymous codon usage (RSCU)

The RSCU value of each codon (excluding Met, Trp and termination codons) was calculated to reflect codon usage characteristics independently of amino acid composition and sequence length, as first proposed by Sharp and Li (1986). The RSCU value of a codon is the ratio of its observed frequency to its expected frequency assuming that all codons for a particular amino acid are used evenly (Peden, 1999), and it was calculated using the following equation:

$$\mathrm{RSCU}_{ij} = \frac{g_{ij}}{\sum_{j}^{n_i} g_{ij}} \, n_i$$

where g_ij represents the observed number of the ith codon for the jth amino acid, which has n_i kinds of synonymous codons (Nasrullah et al., 2015). A high RSCU value is normally taken to reflect a strong codon usage bias. RSCU values of < 1.0, 1.0 and > 1.0 indicate negative codon usage bias, no bias and positive codon usage bias, respectively (Chen et al., 2014a).
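As an illustration of this definition (observed count divided by the count expected if all synonymous codons of an amino acid were used evenly), the following Python sketch computes RSCU from codon counts. The function name and input layout are assumptions made for the example; the Cys counts are taken from Table 1.

```python
def rscu(codon_counts_by_amino_acid):
    """Compute RSCU for each codon.

    Input: {amino_acid: {codon: observed_count}}; every synonymous codon of an
    amino acid must be listed, with a count of 0 if it is never used.
    """
    result = {}
    for counts in codon_counts_by_amino_acid.values():
        total = sum(counts.values())
        n_synonymous = len(counts)
        expected = total / n_synonymous  # count under perfectly even usage
        for codon, observed in counts.items():
            result[codon] = observed / expected if total else float("nan")
    return result


if __name__ == "__main__":
    # Cys codon counts as reported in Table 1 of the paper.
    values = rscu({"Cys": {"TGC": 331, "TGT": 429}})
    print({codon: round(value, 4) for codon, value in values.items()})
```

With the pooled Cys counts, the sketch gives 0.8711 for TGC and 1.1289 for TGT, which match the corresponding Table 1 entries.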

Effective number of codons (ENC)

The ENC measures the magnitude of the codon usage bias of a single gene (Wright, 1990). The ENC value is not influenced by amino acid composition or gene length (Morla et al., 2016). The ENC value was calculated using the formula given below:

$$\mathrm{ENC} = 2 + s + \frac{29}{s^{2} + (1 - s)^{2}}$$

where s stands for the GC3s composition (Chen et al., 2014a). The ENC value ranges from 20 (only one of the possible synonymous codons is used for the corresponding amino acid) to 61 (all possible synonymous codons are used equally for the corresponding amino acid) (Wright, 1990). In contrast to the RSCU value, the smaller the ENC value, the greater the extent of codon usage bias. An ENC value of ≤ 35 is considered a sign of strong codon bias (Comeron and Aguadé, 1998).
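For orientation, the sketch below shows one common way to estimate ENC directly from observed codon counts, following the homozygosity-based estimator usually attributed to Wright (1990): Nc = 2 + 9/F2 + 1/F3 + 5/F4 + 3/F6, where Fk is the mean codon homozygosity of amino acids with k synonymous codons. This is an illustrative assumption about the estimator, not a transcription of the CodonW code; the function names and input layout are hypothetical, and the handling of missing degeneracy classes is simplified (an error is raised).

```python
def homozygosity(counts):
    """Codon homozygosity F for one amino acid, from its synonymous codon counts."""
    n = sum(counts)
    if n <= 1:
        return None  # too few observations to estimate F
    return (n * sum((c / n) ** 2 for c in counts) - 1) / (n - 1)


def effective_number_of_codons(counts_by_amino_acid):
    """Estimate ENC from {amino_acid: [counts of each synonymous codon]}.

    Amino acids are grouped by degeneracy (2, 3, 4 or 6 synonymous codons);
    Met and Trp (one codon each) contribute the constant 2.
    """
    weights = {2: 9, 3: 1, 4: 5, 6: 3}  # number of amino acids per class
    f_values = {k: [] for k in weights}
    for counts in counts_by_amino_acid.values():
        k = len(counts)
        if k in weights:
            f = homozygosity(counts)
            if f is not None and f > 0:
                f_values[k].append(f)
    nc = 2.0
    for k, weight in weights.items():
        if not f_values[k]:
            raise ValueError(f"no usable amino acid with {k} synonymous codons")
        mean_f = sum(f_values[k]) / len(f_values[k])
        nc += weight / mean_f
    return min(nc, 61.0)  # values above 61 are capped at the theoretical maximum
```

When every amino acid uses a single codon, each F equals 1 and the estimate is 20; perfectly even usage gives the capped maximum of 61.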

ENC-plot

ENC plots were drawn to determine the factors (especially mutation pressure) that influence codon usage bias (Wright, 1990), with GC3s values on the x axis and ENC values on the y axis. Under the null model, if codon usage is constrained only by G + C composition, the ENC values would lie on or around the standard curve (Jiang et al., 2007). Otherwise, if the observed ENC values lie far below the standard curve, other factors such as natural selection play a major role in shaping codon usage bias.
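The standard curve of an ENC plot is commonly written as ENC = 2 + s + 29 / (s² + (1 − s)²), with s the GC3s value expressed as a fraction (Wright, 1990). The Python sketch below evaluates this curve and checks whether an observed point lies below it; the function names are hypothetical, and using the mean GC3s (38.3%) and mean ENC (53.70) reported in this study as a single point is only an illustration.

```python
def expected_enc(gc3s):
    """Expected ENC when codon usage is constrained only by GC3s (a fraction, 0 to 1)."""
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


def below_standard_curve(gc3s, observed_enc):
    """True if the observed ENC lies below the standard curve at this GC3s."""
    return observed_enc < expected_enc(gc3s)


if __name__ == "__main__":
    mean_gc3s = 0.383  # 38.3% reported in the Results, converted to a fraction
    mean_enc = 53.70   # mean ENC reported in the Results
    print(round(expected_enc(mean_gc3s), 2))
    print(below_standard_curve(mean_gc3s, mean_enc))
```

At GC3s = 0.383 the curve gives an expected ENC of about 57.4, so a point at the reported mean ENC of 53.70 falls below the curve.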

Principal component analysis (PCA)

PCA, a widely used multivariate statistical approach for determining the major trends in codon usage variation among genes (Gupta and Ghosh, 2001), was performed using GraphPad Prism 6.0. The RSCU values of each gene, excluding Met, Trp and termination codons, are represented as a 59-dimensional vector, which is transformed into a smaller number of uncorrelated factors (Lu et al., 2013).
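To make the idea concrete, the following pure-Python sketch finds the first principal axis of a matrix of RSCU vectors (one row per sequence) and the fraction of total variance it explains. The study itself used GraphPad Prism; this sketch swaps in power iteration on the covariance matrix, and all names and the toy data are assumptions for illustration.

```python
import random


def first_axis_inertia(rows, iterations=500, seed=0):
    """Return (fraction of variance on the first principal axis, axis vector).

    rows: list of equal-length numeric vectors, e.g. 59 RSCU values per sequence.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    total_variance = sum(cov[j][j] for j in range(d))
    if total_variance == 0:
        raise ValueError("all rows are identical; no variance to decompose")
    rng = random.Random(seed)
    axis = [rng.random() for _ in range(d)]
    for _ in range(iterations):  # power iteration converges to the top eigenvector
        product = [sum(cov[a][b] * axis[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in product) ** 0.5
        if norm == 0:
            break
        axis = [x / norm for x in product]
    # Rayleigh quotient of the unit-length axis gives the top eigenvalue.
    eigenvalue = sum(axis[a] * sum(cov[a][b] * axis[b] for b in range(d))
                     for a in range(d))
    return eigenvalue / total_variance, axis


if __name__ == "__main__":
    toy_rows = [[t, 2 * t, -t] for t in range(5)]  # toy data lying on one line
    fraction, axis = first_axis_inertia(toy_rows)
    print(round(fraction, 6))
```

Further axes could be obtained by deflating the covariance matrix, which is omitted here to keep the sketch short.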

Grand average of hydropathicity (GRAVY) and aromaticity (AROMA) indices

To determine whether natural selection shapes codon usage bias, the Gravy and Aroma scores were calculated. Both indices were obtained from CodonW and reflect the frequencies of hydrophobic and aromatic amino acids, respectively (Kyte and Doolittle, 1982). A higher Gravy or Aroma value indicates a more hydrophobic or more aromatic amino acid product.
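Both indices can be sketched in a few lines of Python. The sketch assumes that Gravy is the mean Kyte and Doolittle (1982) hydropathy value over the residues of the translated protein and that Aroma is the fraction of aromatic residues (Phe, Trp, Tyr); the function names are hypothetical and this is not the CodonW code.

```python
# Kyte and Doolittle (1982) hydropathy values, one-letter amino acid codes.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}
AROMATIC = set("FWY")


def gravy(protein):
    """Mean hydropathy of a protein; unknown symbols (e.g. '*', 'X') are skipped."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aroma(protein):
    """Fraction of aromatic residues (Phe, Trp, Tyr) in a protein."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    return sum(aa in AROMATIC for aa in residues) / len(residues)


if __name__ == "__main__":
    toy_protein = "MFWYAKL"  # toy peptide, not an FCV sequence
    print(gravy(toy_protein), aroma(toy_protein))
```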

Neutrality analysis

Neutrality analysis was used to determine the roles of mutational bias and natural selection in shaping codon usage bias. In neutrality plots, GC12s values are plotted against GC3s values, with each dot representing an independent FCV strain. A regression slope close to zero indicates that codon usage bias is constrained only by natural selection, whereas a slope close to one represents complete neutrality (Sueoka, 1988).
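A neutrality plot reduces to an ordinary least-squares regression of GC12s on GC3s. The Python sketch below computes the slope and intercept; the function name and the toy values are assumptions for illustration and are not data from this study.

```python
def neutrality_slope(gc3s_values, gc12s_values):
    """Least-squares slope and intercept of GC12s (y) regressed on GC3s (x)."""
    n = len(gc3s_values)
    mean_x = sum(gc3s_values) / n
    mean_y = sum(gc12s_values) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3s_values)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(gc3s_values, gc12s_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Toy values constructed to lie on a line; not FCV data.
    gc3s = [0.30, 0.35, 0.40, 0.45]
    gc12s = [0.40 + 0.22 * x for x in gc3s]
    slope, intercept = neutrality_slope(gc3s, gc12s)
    print(round(slope, 2), round(intercept, 2))
```

A slope near 0 would point to selection as the dominant constraint, and a slope near 1 to neutrality, in line with the interpretation given above.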

Statistical analysis

Correlation analysis was performed using GraphPad Prism 6.0.
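The type of correlation coefficient is not stated. Assuming a Pearson coefficient, the calculation can be sketched in Python as follows; the function name is hypothetical, and the p-values reported by GraphPad Prism are not reproduced here.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Toy values; not data from this study.
    print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))
```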

Results

Nucleotide composition of the FCV ORF2 gene

The nucleotide compositions of 90 coding sequences of FCV ORF2 were calculated. The mean values of A%, C%, G% and U% were 26.48%, 21.23%, 23.20% and 29.10%, with standard deviations (SD) of 0.53, 0.73, 0.44 and 0.62, respectively. This indicates that U and A were more abundant than C and G, with U the most abundant nucleotide. The nucleotide compositions at the third codon position (A3, U3, G3, C3, GC3) revealed that the mean U3% (43.57%) was the highest among the four nucleotides, which is consistent with the overall nucleotide content of the FCV ORF2 gene (Table S2). The GC3 values ranged from 32.8% to 44.3% (mean 38.3%), indicating that A/U-terminated codons are preferred over G/C-terminated codons.

Relative synonymous codon usage (RSCU) and effective number of codons (ENC) of the FCV ORF2 gene

The RSCU values of all 61 codons were calculated and are displayed in Table 1. Among the 18 most frequently used synonymous codons, 12 preferred codons ended with U (GCU for Ala, UGU for Cys, GAU for Asp, UUU for Phe, GGU for Gly, AUU for Ile, CUU for Leu, CCU for Pro, UCU for Ser, ACU for Thr, GUU for Val, UAU for Tyr), 3 ended with A (GAA for Glu, AAA for Lys, CAA for Gln), 2 ended with C (CAC for His, AAC for Asn) and one ended with G (AGG for Arg). Codons ending in U were therefore the most frequently used. This is in accordance with the fact that U was the most abundant nucleotide, and suggests that codon usage is influenced by compositional constraints.
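Table 1

RSCU of 90 sequences of the ORF2 gene of FCV.

| Amino acid | Codon | RSCU | Number |
|---|---|---|---|
| Ala | GCA | 1.1088 | 971 |
| Ala | GCC | 0.8655 | 758 |
| Ala | GCG | 0.2170 | 190 |
| Ala | GCT | 1.8088 | 1584 |
| Cys | TGC | 0.8711 | 331 |
| Cys | TGT | 1.1289 | 429 |
| Asp | GAC | 0.8030 | 1645 |
| Asp | GAT | 1.1970 | 2452 |
| Glu | GAA | 1.2405 | 1372 |
| Glu | GAG | 0.7595 | 840 |
| Phe | TTC | 0.9181 | 1328 |
| Phe | TTT | 1.0819 | 1565 |
| Gly | GGA | 1.2784 | 1473 |
| Gly | GGC | 0.7021 | 809 |
| Gly | GGG | 0.6735 | 776 |
| Gly | GGT | 1.3461 | 1551 |
| His | CAC | 1.1190 | 823 |
| His | CAT | 0.8810 | 648 |
| Ile | ATA | 0.4546 | 726 |
| Ile | ATC | 0.9305 | 1486 |
| Ile | ATT | 1.6149 | 2579 |
| Lys | AAA | 1.0958 | 1275 |
| Lys | AAG | 0.9042 | 1052 |
| Leu | CTA | 0.7986 | 630 |
| Leu | CTC | 0.9394 | 741 |
| Leu | CTG | 0.6554 | 517 |
| Leu | CTT | 1.9294 | 1522 |
| Leu | TTA | 0.6377 | 503 |
| Leu | TTG | 1.0395 | 820 |
| Trp | TGG | 1 | 1259 |
| Met | ATG | 1 | 1264 |
| Asn | AAC | 1.0364 | 1522 |
| Asn | AAT | 0.9636 | 1415 |
| Pro | CCA | 1.4004 | 1372 |
| Pro | CCC | 0.8359 | 819 |
| Pro | CCG | 0.2715 | 266 |
| Pro | CCT | 1.4922 | 1462 |
| Gln | CAA | 1.4014 | 1290 |
| Gln | CAG | 0.5986 | 551 |
| Arg | AGA | 1.6088 | 514 |
| Arg | AGG | 1.6651 | 532 |
| Arg | CGA | 0.6291 | 201 |
| Arg | CGC | 0.7418 | 237 |
| Arg | CGG | 0.6354 | 203 |
| Arg | CGT | 0.7199 | 230 |
| Ser | AGC | 0.5226 | 482 |
| Ser | AGT | 0.8403 | 775 |
| Ser | TCA | 1.2089 | 1115 |
| Ser | TCC | 0.9389 | 866 |
| Ser | TCG | 0.2906 | 268 |
| Ser | TCT | 2.1988 | 2028 |
| Thr | ACA | 1.2033 | 1234 |
| Thr | ACC | 1.0054 | 1031 |
| Thr | ACG | 0.3023 | 310 |
| Thr | ACT | 1.4890 | 1527 |
| Val | GTA | 0.5403 | 548 |
| Val | GTC | 0.8233 | 835 |
| Val | GTG | 1.1171 | 1133 |
| Val | GTT | 1.5193 | 1541 |
| Tyr | TAC | 0.9414 | 900 |
| Tyr | TAT | 1.0586 | 1012 |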
The ENC values ranged from 49.38 to 57.55 (mean ± SD, 53.70 ± 1.639), indicating some variation among the 90 FCV strains. The high ENC values (ENC > 45) indicate a low codon usage bias.

The role of mutational bias in shaping the codon usage of the FCV ORF2 gene

ENC plots were drawn with ENC values plotted against GC3s values according to the geographical distribution of the strains used in this study (Fig. 1). All the strains, shown in a different colour for each continent, were located below the theoretical curve (Fig. 1A). Additionally, strains isolated from the same continent did not cluster together. In particular, the 90 strains collected from 9 countries were widely distributed (Fig. 1B). This indicates that mutational pressure combined with other factors contributes to the codon usage bias of the FCV ORF2 gene (Sueoka 1988).
Fig. 1

ENC plots depicting the relationship between ENC and the GC content at the third codon position (GC3s) according to geographical distribution of each strain by (A) continents or (B) country.

In addition, there was a correlation between the nucleotide composition (A%, U%, G%, C%, GC%) and the codon contents (A3s, T3s, C3s, G3s, GC3s) (p < 0.05), except between A3s and T. Furthermore, there was a significant correlation between the ENC values and the nucleotide compositions (p < 0.01), which indicates that mutational bias influences the synonymous codon usage pattern of the ORF2 gene of FCV.

Principal component analysis (PCA)

PCA, a multivariate method, was employed to characterize the variation in synonymous codon usage (Singh et al. 2016). We found that the first four principal axes accounted for 54.86% of the total variation, with the first, second, third and fourth principal axes accounting for 20.81%, 13.8%, 10.83% and 9.42%, respectively (Fig. 2). This suggests that the first and second axes contributed most to the variation in RSCU of synonymous codons. PCA was performed based on the continent and country of isolation (Fig. 3). Based on the distribution of different strains on the first two axes, we found that the distribution of Asian strains, especially strains collected from China, was more widespread than the distribution of strains isolated from Oceania, Europe and North America. Moreover, most of the strains isolated from North America were located near the origin, indicating that mutational pressure contributed to the codon usage of the FCV ORF2 gene.
Fig. 2

Principal component analysis of the amino acid usage frequencies of the ORF2 gene. The relative and cumulative inertia of the first 20 factors are shown based on principal component analysis.

Fig. 3

Principal component analysis based on geographical distribution according to: (A) continent or (B) country. Different geographical distributions are represented by different colours.


The role of natural selection in shaping the codon usage pattern of the ORF2 gene of FCV

Natural selection is normally considered to contribute to some extent to codon usage bias. We therefore evaluated the correlation between the Gravy and Aroma values and the codon contents (A3s, G3s, C3s, U3s and GC3s) (Table 2). We found a correlation between the Aroma values and U3s, G3s and GC3s (p < 0.05), confirming that natural selection influences the codon usage bias of the FCV ORF2 gene.
Table 2

Correlation analysis of the nucleotide composition, Axis1, Axis2, Gravy, Aroma, nucleotide at the third position and ENC.

| | A | C | G | T | GC | Gravy | Aroma | Axis1 | Axis2 |
|---|---|---|---|---|---|---|---|---|---|
| Nc | −0.4652** | 0.6024** | 0.4801** | −0.6511** | 0.7161** | −0.2012 | −0.1371 | 0.1103 | 0.3489** |
| GC3s | −0.6888** | 0.7532** | 0.6418** | −0.7512** | 0.9126** | −0.0802 | −0.2666* | 0.1408 | 0.4044** |
| T3s | 0.3443** | −0.76** | −0.46** | 0.9259** | −0.83** | 0.1378 | 0.2722** | −0.2728** | −0.2962** |
| C3s | −0.5442** | 0.8967** | 0.2532* | −0.7673** | 0.841** | −0.1449 | −0.2608* | 0.1534 | 0.3926** |
| A3s | 0.8373** | −0.377** | −0.54** | 0.1082 | −0.561** | −0.0879 | 0.1313 | 0.015 | −0.375** |
| G3s | −0.5055** | 0.1586 | 0.8325** | −0.3455** | 0.5283** | 0.0371 | −0.1316 | −0.0318 | 0.1925 |

* means p < 0.05.

** means p < 0.01.


Natural selection plays a more important role than mutation pressure in shaping the codon usage of FCV

We found that both mutation pressure and natural selection contribute to the codon usage bias of the ORF2 gene of FCV. Thus, to understand which one plays a more important role, the GC12s values (the mean of GC1s and GC2s) were plotted against the GC3s values (Fig. 4). We found a correlation between GC12s and GC3s (p < 0.05) with a correlation coefficient of 0.22, indicating that relative neutrality was 22% or, conversely, that the relative constraint from natural selection was 78%. Thus, natural selection plays a greater role than mutational pressure in shaping the codon usage bias of the ORF2 gene of FCV.
Fig. 4

Neutrality analysis in relation to GC3s and GC12s.


Discussion

Zoonotic RNA viruses, such as influenza viruses and coronaviruses, are highly susceptible to recombination and cross-species transmission (Su et al., 2015, Su et al., 2016). FCV is an RNA virus and as such has evolved rapidly since its emergence. Previous studies on FCV have mostly focused on infectivity (Pesavento et al. 2004) and prevalence (Knowles et al. 1989). However, there are no studies on the codon usage bias of FCV. ORF2 is one of the three ORFs of the FCV genome and encodes the major capsid protein VP1. Therefore, we studied the codon usage of ORF2 of FCV here for the first time. Previous studies of other RNA viruses reported stronger codon usage bias than that found here. For example, analysis of the G gene of Rabies virus (RABV) showed ENC values ranging from 44.40 to 51.40 (Zhao et al. 2016); Foot and Mouth Disease Virus (FMDV) displayed an ENC value of 51.42 (Zhou et al. 2013); Porcine Epidemic Diarrhea Virus (PEDV) of 47.91 (Chen et al. 2014b); and Severe Acute Respiratory Syndrome (SARS) of 48.99 (Zhao et al. 2008). In contrast, the mean ENC value of ORF2 of FCV reported here was 53.70 (SD ± 1.639); thus, in comparison with the above viruses, the degree of codon usage bias of FCV is lower. Codon usage bias is mainly influenced by natural selection (Romero et al. 2003) and mutation pressure (Jenkins et al. 2001). Here, we used ENC plots (Fig. 1) and PCA (Fig. 3) according to geographical distribution to investigate the major factors shaping the codon usage bias of the FCV ORF2 gene. We found only one strain isolated from the USA, one from Japan and one from Australia to lie near the standard curve, suggesting that mutation pressure contributed to the codon usage bias of these three strains. PCA revealed that mutation pressure is the dominant force shaping the codon usage of sequences isolated from North America (Fig. 3).
Furthermore, analysis of the relationships between nucleotide composition and the codon contents at the third base position suggested that mutation pressure is one of the factors shaping the codon usage of FCV ORF2. Correlation analysis of Gravy, Aroma values and codon content (A3s, G3s, C3s, U3s and GC3s) showed a correlation between Aroma and U3s, G3s and GC3s, confirming that natural selection contributes to the codon usage of the ORF2 gene of FCV. Since both mutation pressure and natural selection are important in driving the codon usage of ORF2, we performed neutrality analysis to understand which of the two forces has the bigger impact. We found that natural selection is the major force driving the codon usage of ORF2. This is the first study analysing the codon usage of the FCV ORF2 gene and describing the forces that drive FCV evolution. In the future, more epidemiological surveys and sequence analyses are required to examine the factors that drive FCV evolution.
References (10 of 28 shown)

1. N Sueoka. Directional mutation pressure and neutral molecular evolution. Proc Natl Acad Sci U S A, 1988-04.

2. S K Gupta; T C Ghosh. Gene expressivity is the main factor in dictating the codon usage variation among the genes in Pseudomonas aeruginosa. Gene, 2001-07-25.

3. Alan D Radford; Karen P Coyne; Susan Dawson; Carol J Porter; Rosalind M Gaskell. Feline calicivirus (review). Vet Res, 2007-02-13.

4. G M Jenkins; M Pagel; E A Gould; P M de A Zanotto; E C Holmes. Evolution of base composition and codon usage bias in the genus Flavivirus. J Mol Evol, 2001-04.

5. P M Sharp; W H Li. An evolutionary perspective on synonymous codon usage in unicellular organisms. J Mol Evol, 1986.

6. Gonzalo Moratorio; Andrés Iriarte; Pilar Moreno; Héctor Musto; Juan Cristina. A detailed comparative analysis on the overall codon usage patterns in West Nile virus. Infect Genet Evol, 2013-01-16.

7. P A Pesavento; N J MacLachlan; L Dillard-Telm; C K Grant; K F Hurley. Pathologic, immunohistochemical, and electron microscopic findings in naturally occurring virulent systemic feline calicivirus infection in cats. Vet Pathol, 2004-05.

8. Niraj K Singh; Anuj Tyagi; Rajinder Kaur; Ramneek Verma; Praveen K Gupta. Characterization of codon usage pattern and influencing factors in Japanese encephalitis virus. Virus Res, 2016-05-14.

9. Izza Nasrullah; Azeem M Butt; Shifa Tahir; Muhammad Idrees; Yigang Tong. Genomic analysis of codon usage shows influence of mutation pressure, natural selection, and host features on Marburg virus evolution. BMC Evol Biol, 2015-08-26.

10. Sheng Zhao; Qin Zhang; Xiaolin Liu; Xuemin Wang; Huilin Zhang; Yan Wu; Fei Jiang. Analysis of synonymous codon usage in 11 human bocavirus isolates. Biosystems, 2008-02-21.