Abdullah Sheikh, Abdulla Al-Taher, Mohammed Al-Nazawi, Abdullah I Al-Mubarak, Mahmoud Kandeel.
Abstract
The nucleocapsid (N) protein of a coronavirus plays a crucial role in virus assembly and in its RNA transcription. It is important to characterize a virus at the nucleotide level to discover the genomic sequence variations and similarities, relative to other viruses, that could affect the functions of its genes and proteins. This entails a comprehensive and comparative analysis of the viral genomes of interest for preferred nucleotides, codon bias, nucleotide changes at the 3rd position (NT3s), synonymous codon usage and relative synonymous codon usage. In this study, the variations in the N proteins among 13 different coronaviruses (CoVs) were analysed at the nucleotide and amino acid levels in an attempt to reveal how these viruses adapt to their hosts relative to their preferred codon usage in the N genes. The results revealed that, overall, eighteen amino acids had different preferred codons and eight of these were over-biased. The N genes had a higher AT% than GC%, and the values of their effective number of codons ranged from 40.43 to 53.85, indicating a slight codon bias. Neutrality plots and correlation analyses showed a very high level of GC3s/GC correlation in porcine epidemic diarrhea CoV (pedCoV), followed by Middle East respiratory syndrome-CoV (MERS CoV), porcine delta CoV (dCoV), bat CoV (bCoV) and feline CoV (fCoV), with r values of 0.81, 0.68, -0.47, 0.98 and 0.58, respectively. These data implied a high rate of evolution of the CoV genomes and a strong influence of mutation on evolutionary selection in the CoV N genes. This type of genetic analysis would be useful for evaluating a virus's host adaptation and evolution, and is thus of value to vaccine design strategies.
Keywords: Amino acid; Codon bias; Coronavirus; Nucleocapsid protein; Preferred nucleotides
Year: 2020 PMID: 31911390 PMCID: PMC7119019 DOI: 10.1016/j.jviromet.2019.113806
Source DB: PubMed Journal: J Virol Methods ISSN: 0166-0934 Impact factor: 2.014
Fig. 1 ENc plots of N genes from 13 different CoVs representing the relation between GC3s and Nc frequencies.
GC nucleotide frequencies at third positions (GC3s) plotted against the effective number of codons (Nc). GC3s and Nc regression is denoted by a linear dotted line and the solid line represents the relation between GC3s and Nc.
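The caption does not give the equation behind the solid line. In ENc plots this curve is commonly drawn from Wright's approximation for the Nc expected when GC3s alone constrains codon choice, so the sketch below assumes that formula. The function name `expected_enc` is illustrative and not taken from the paper.

```python
def expected_enc(gc3s: float) -> float:
    """Expected Nc when only GC3s constrains codon choice.

    Assumes Wright's approximation: Nc = 2 + s + 29 / (s^2 + (1 - s)^2),
    where s is the GC3s fraction. The source does not state which curve it used.
    """
    if not 0.0 <= gc3s <= 1.0:
        raise ValueError("gc3s must lie in [0, 1]")
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# (GC3s, observed ENc) pairs copied from the codon usage indices table.
observed = {"pedCoV": (0.42, 53.85), "hCoV HKU1": (0.22, 40.43)}

for name, (gc3s, enc) in observed.items():
    print(f"{name}: observed Nc {enc:.2f}, expected Nc {expected_enc(gc3s):.2f}")
```

A gene whose observed Nc falls below the expected curve is usually read as being shaped by something in addition to third-position base composition.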
Nucleotide composition of the N genes of 13 CoVs.
| Frequencies of nucleotides | pedCoV | MERS CoV | ibCoV | cCoV | dCoV | tgCoV | hCoV 229E | bvCoV | bCoV | hCoV HKU1 | caCoV | fCoV | hCoV OC43 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Adenine (A) | 0.30 | 0.29 | 0.30 | 0.30 | 0.27 | 0.32 | 0.28 | 0.29 | 0.28 | 0.29 | 0.31 | 0.31 | 0.29 |
| Cytosine (C) | 0.22 | 0.25 | 0.19 | 0.22 | 0.25 | 0.19 | 0.19 | 0.21 | 0.25 | 0.20 | 0.18 | 0.20 | 0.22 |
| Guanine (G) | 0.24 | 0.21 | 0.26 | 0.21 | 0.21 | 0.22 | 0.21 | 0.24 | 0.22 | 0.18 | 0.20 | 0.22 | 0.24 |
| Thymine (T) | 0.22 | 0.23 | 0.23 | 0.26 | 0.25 | 0.25 | 0.29 | 0.24 | 0.23 | 0.31 | 0.29 | 0.24 | 0.23 |
| T3s | 0.44 | 0.45 | 0.48 | 0.54 | 0.39 | 0.43 | 0.51 | 0.46 | 0.45 | 0.62 | 0.44 | 0.47 | 0.46 |
| C3s | 0.29 | 0.27 | 0.14 | 0.20 | 0.29 | 0.24 | 0.20 | 0.23 | 0.28 | 0.15 | 0.22 | 0.24 | 0.24 |
| A3s | 0.28 | 0.32 | 0.38 | 0.33 | 0.30 | 0.40 | 0.33 | 0.31 | 0.30 | 0.34 | 0.39 | 0.35 | 0.30 |
| G3s | 0.24 | 0.17 | 0.23 | 0.17 | 0.23 | 0.19 | 0.20 | 0.23 | 0.18 | 0.12 | 0.20 | 0.20 | 0.23 |
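A minimal sketch of how base frequencies of this kind can be computed from a coding sequence. It reports plain per-base fractions, overall and at third codon positions. The T3s to G3s rows above do not sum to 1 within a column, so the source presumably normalises them differently (for example per synonymous codon class); the third-position function here is therefore a simplified stand-in, and both function names are illustrative.

```python
from collections import Counter


def base_frequencies(seq: str) -> dict:
    """Fraction of A, C, G and T in a sequence (U is read as T)."""
    seq = seq.upper().replace("U", "T")
    counts = Counter(b for b in seq if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: counts[b] / total for b in "ACGT"}


def third_position_frequencies(cds: str) -> dict:
    """Plain base fractions at the third position of each complete codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return base_frequencies(cds[2:usable:3])


# Toy coding sequence for illustration only (not a real N gene).
toy_cds = "ATGGCTTAA"
print(base_frequencies(toy_cds))
print(third_position_frequencies(toy_cds))
```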
RSCU values of the various CoVs.
The values in bold are preferred codons for the respective amino acids. The cells with negatively biased values have a diagonal line. Over-biased codon values are displayed in bold with shaded cells.
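Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below assumes the standard genetic code and skips amino acids with a single codon (Met, Trp). The function name `rscu` is illustrative, and the cut-offs the authors used for "over-biased" and "negatively biased" are not given here, so none are applied.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU per codon for every amino acid present with more than one codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0 or len(family) == 1:
            continue  # amino acid absent, or no synonymous choice
        expected = total / len(family)  # count under equal usage
        for c in family:
            values[c] = counts[c] / expected
    return values


# Toy sequence: four leucine codons, three of them CTG.
print(rscu("CTGCTGCTGTTA"))
```

An RSCU above 1 means a codon is used more often than equal usage would predict, and a value below 1 means less often.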
Codon Usage Indices of various CoVs.
| Index | pedCoV | MERS CoV | ibCoV | cCoV | dCoV | tgCoV | hCoV 229E | bvCoV | bCoV | hCoV HKU1 | caCoV | fCoV | hCoV OC43 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| ENc | 53.85 | 49.53 | 48.84 | 45.84 | 53.6 | 50.74 | 46.95 | 51.14 | 50.32 | 40.43 | 49.82 | 50.86 | 53.85 |
| GC3s | 0.42 | 0.36 | 0.29 | 0.29 | 0.42 | 0.34 | 0.31 | 0.37 | 0.38 | 0.22 | 0.33 | 0.34 | 0.38 |
| GC | 0.46 | 0.47 | 0.45 | 0.43 | 0.47 | 0.41 | 0.42 | 0.46 | 0.48 | 0.39 | 0.40 | 0.43 | 0.46 |
| GRAVY | −1.07 | −0.86 | −1.01 | −0.89 | −0.45 | −0.87 | −0.57 | −0.82 | −0.84 | −0.84 | −0.43 | −1.02 | −0.86 |
| AROMO | 0.06 | 0.07 | 0.07 | 0.06 | 0.08 | 0.08 | 0.07 | 0.08 | 0.07 | 0.10 | 0.10 | 0.08 | 0.08 |
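The two protein-level indices in the table can be sketched as follows. This assumes that GRAVY is the mean Kyte-Doolittle hydropathy over all residues and that AROMO is the fraction of aromatic residues (Phe, Tyr, Trp), which is how these indices are commonly defined; the source does not spell out either definition, and the function names are illustrative.

```python
# Kyte-Doolittle hydropathy scale (assumed to be the scale behind GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _residues(protein: str) -> list:
    """Keep only the 20 standard amino acids, upper-cased."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids found")
    return residues


def gravy(protein: str) -> float:
    """Grand average of hydropathy; negative values indicate a hydrophilic protein."""
    residues = _residues(protein)
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)


def aromaticity(protein: str) -> float:
    """Fraction of residues that are Phe, Tyr or Trp."""
    residues = _residues(protein)
    return sum(aa in "FYW" for aa in residues) / len(residues)


# Toy peptide for illustration only.
print(gravy("MKRSA"), aromaticity("MKRSA"))
```

Under these definitions, the uniformly negative GRAVY row above would indicate that all 13 N proteins are hydrophilic overall.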
Fig. 2 Neutrality plots of N genes from 13 different CoVs.
The GC nucleotide base frequencies at the third positions (GC3s) were plotted against the GC frequencies of first and second positions (GC).
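A neutrality plot is usually summarised by the regression slope and the Pearson correlation between third-position GC and first/second-position GC. The sketch below assumes GC3s on the x-axis and a least-squares fit. The function name and the three data points are hypothetical and chosen only to make the arithmetic easy to check.

```python
from math import sqrt


def neutrality_fit(gc3: list, gc12: list) -> tuple:
    """Return (slope, r) for a least-squares fit of gc12 on gc3."""
    n = len(gc3)
    if n != len(gc12) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    syy = sum((y - mean_y) ** 2 for y in gc12)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    if sxx == 0 or syy == 0:
        raise ValueError("both variables must vary")
    return sxy / sxx, sxy / sqrt(sxx * syy)


# Hypothetical points lying exactly on a line with slope 0.5.
slope, r = neutrality_fit([0.2, 0.3, 0.4], [0.40, 0.45, 0.50])
print(f"slope = {slope:.2f}, r = {r:.2f}")
```

In the usual reading, a slope near 1 with a strong correlation points to mutation pressure acting similarly on all codon positions, while a slope near 0 points to selection constraining the first and second positions.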