Zoya Shafat1, Anwar Ahmed2, Mohammad K Parvez3, Shama Parveen1.
Abstract
Background: Hepatitis E virus (HEV) is a member of the family Hepeviridae and causes acute infections that result in thousands of deaths worldwide. Its zoonotic nature, together with its capacity for human-to-human transmission, has led scientists across the globe to study its different aspects. HEV infection is also associated with a mortality rate of about 30% in pregnant women. The HEV genome is organized into three open reading frames (ORFs): ORF1, ORF2 and ORF3. An additional reading frame, ORF4, has recently been discovered; its encoded protein is exclusive to genotype 1 (GT 1) isolates of HEV. ORF4 is suggested to play a crucial role in pregnancy-associated pathology and enhanced replication. Although studies have documented the importance of ORF4, the genetic features of ORF4 protein genes in terms of compositional patterns have not been elucidated. Because codon usage plays a critical role in establishing the host-pathogen relationship, the present study reports a codon usage analysis (based on nucleotide sequences of HEV ORF4 available in the public database) in three hosts, along with the factors influencing the codon usage patterns of the ORF4 protein genes of HEV.
Keywords: Hepatitis E virus (HEV); Mutational pressure; Natural selection; Nucleotide composition; Open reading frame 4 (ORF4); Synonymous codon usage
Year: 2022 PMID: 35573872 PMCID: PMC9086417 DOI: 10.1186/s43088-022-00244-w
Source DB: PubMed Journal: Beni Suef Univ J Basic Appl Sci ISSN: 2314-8535
Nucleotide composition analysis of ORF4 of hepatitis E viruses
| Nucleotide | Human | Rat | Ferret |
|---|---|---|---|
| A | 15.094 | 16.356 | 17.753 |
| C | 35.597 | 29.451 | 28.768 |
| T(U) | 21.341 | 27.122 | 27.119 |
| G | 27.966 | 27.070 | 26.358 |
| A1 | 20.754 | 24.301 | 22.826 |
| C1 | 32.704 | 29.891 | 30.978 |
| U1 | 26.415 | 25.310 | 25.543 |
| G1 | 20.125 | 20.496 | 20.652 |
| A2 | 8.176 | 8.307 | 14.456 |
| C2 | 42.893 | 26.708 | 24.728 |
| U2 | 24.402 | 39.052 | 35.108 |
| G2 | 24.528 | 25.931 | 25.706 |
| A3 | 16.352 | 16.459 | 15.978 |
| C3 | 31.194 | 31.754 | 30.597 |
| U3 | 13.207 | 17.003 | 20.706 |
| G3 | 39.245 | 34.782 | 32.717 |
| AU | 36.441 | 43.478 | 44.872 |
| GC | 63.563 | 56.498 | 55.126 |
| GC1 | 52.829 | 50.387 | 51.63 |
| GC2 | 67.421 | 52.639 | 50.434 |
| GC3 | 70.439 | 66.536 | 63.314 |
Values are expressed as percentages
Fig. 1 Comparative analysis of nucleotide composition patterns between HEV host organisms (human, rat and ferret)
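The quantities in the composition table (overall base percentages, position-specific percentages such as A3 or C2, and GC1, GC2, GC3) can be computed directly from a coding sequence. The following is a minimal sketch, assuming each percentage is taken over all complete codons of a single in-frame sequence; the function name `composition` is hypothetical, and the software actually used in the study is not stated here.

```python
from collections import Counter


def composition(seq: str) -> dict[str, float]:
    """Overall and codon-position-specific nucleotide percentages.

    Assumes `seq` is an in-frame coding sequence (DNA or RNA alphabet).
    """
    seq = seq.upper().replace("T", "U")
    seq = seq[: len(seq) - len(seq) % 3]  # drop any trailing partial codon
    if not seq:
        raise ValueError("sequence must contain at least one complete codon")

    result: dict[str, float] = {}

    # Overall base percentages (A, C, U, G) over the whole sequence.
    overall = Counter(seq)
    for base in "ACUG":
        result[base] = 100.0 * overall[base] / len(seq)
    result["AU"] = result["A"] + result["U"]
    result["GC"] = result["G"] + result["C"]

    # Percentages at the first, second and third codon positions.
    for pos in (1, 2, 3):
        column = seq[pos - 1 :: 3]  # every third base, starting at this position
        counts = Counter(column)
        for base in "ACUG":
            result[f"{base}{pos}"] = 100.0 * counts[base] / len(column)
        result[f"GC{pos}"] = result[f"G{pos}"] + result[f"C{pos}"]

    return result
```

For the two-codon toy sequence `ATGGCC`, the third codon positions are G and C, so `composition("ATGGCC")["GC3"]` evaluates to 100.0 while `GC1` and `GC2` are both 50.0.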
Average RSCU values in ORF4 of hepatitis E viruses
| Amino acid | Codon | Human | Rat | Ferret |
|---|---|---|---|---|
| Phe (F) | UUU | 0 | ||
| UUC | 2 | |||
| Leu (L) | UUA | 0.9 | 0.25 | |
| UUG | 2.4 | |||
| CUU | 0.3 | 0.51 | 0.45 | |
| CUC | 0.3 | 0.55 | ||
| CUA | 0.9 | |||
| CUG | 1.2 | |||
| Ile (I) | AUU | 1.5 | ||
| AUC | 1.5 | |||
| AUA | 0 | |||
| Val (V) | GUU | 0 | 1.11 | 0.93 |
| GUC | 0.8 | 0.12 | ||
| GUA | 0.8 | 0.98 | 0.47 | |
| GUG | 2.4 | 1.02 | ||
| Ser (S) | UCU | 0 | 0.84 | |
| UCC | 0.51 | |||
| UCA | 0.45 | |||
| UCG | 0.58 | |||
| AGU | 0 | 0 | ||
| AGC | ||||
| Pro (P) | CCU | |||
| CCC | 0.87 | |||
| CCA | 0.87 | 0.64 | ||
| CCG | 0.58 | |||
| Thr (T) | ACU | 0.39 | 0.41 | 0.63 |
| ACC | 0.71 | 1.5 | ||
| ACA | 0.78 | 0.75 | ||
| ACG | 1.13 | |||
| Ala (A) | GCU | 0.58 | 0.58 | |
| GCC | ||||
| GCA | 0.31 | 0.97 | ||
| GCG | 0.45 | 0.58 | ||
| Tyr (Y) | UAU | 2 | 0.5 | 0.4 |
| UAC | 0 | 1.5 | ||
| His (H) | CAU | 0 | 0.4 | 0.11 |
| CAC | 0 | 1.6 | ||
| Gln (Q) | CAA | 0.33 | 0.56 | 0.34 |
| CAG | 1.44 | |||
| Asn (N) | AAU | 0 | 0 | 0 |
| AAC | 2 | 2 | 2 | |
| Lys (K) | AAA | 0 | 0 | 0 |
| AAG | 2 | 2 | 2 | |
| Asp (D) | GAU | 2 | 1.88 | 2 |
| GAC | 0 | 0.12 | 0 | |
| Glu (E) | GAA | 0 | 0.11 | 0.44 |
| GAG | 2 | 1.89 | ||
| Cys (C) | UGU | 0 | 0.51 | 1 |
| UGC | 1 | |||
| Arg (R) | CGU | 0.6 | 0.29 | 0.52 |
| CGC | 1.2 | |||
| CGA | 1.2 | 0.2 | 0.05 | |
| CGG | 1.2 | |||
| AGA | 0.6 | 0.04 | 0.26 | |
| AGG | 1.2 | |||
| Gly (G) | GGU | 0.36 | 0.73 | 0.84 |
| GGC | 1.26 | |||
| GGA | 0.36 | 0 | 0.84 | |
| GGG | 1.15 | 1.05 | ||
The preferred codons are indicated in bold
Fig. 2 Comparative analysis of relative synonymous codon usage (RSCU) patterns between HEV hosts (human, rat and ferret)
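RSCU is conventionally defined as the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally, so a value above 1 marks a codon used more often than expected and a value of 0 marks an unused codon. The sketch below applies that standard definition to a single in-frame sequence; the function name `rscu` is hypothetical, the standard genetic code is assumed, and averaging across isolates per host (as the table title implies) is not shown.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(seq: str) -> dict[str, float]:
    """Relative synonymous codon usage for one in-frame coding sequence."""
    seq = seq.upper().replace("T", "U")
    codons = [seq[i : i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group synonymous codons; skip stops and single-codon amino acids (Met, Trp).
    families: dict[str, list[str]] = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa not in "*MW":
            families[aa].append(codon)

    values: dict[str, float] = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0:
            continue  # amino acid absent from the sequence: RSCU is undefined
        for c in members:
            # observed count / (family total / number of synonymous codons)
            values[c] = counts[c] * len(members) / total
    return values
```

With this definition, a two-codon amino acid encoded exclusively by one codon yields values of 2 and 0, which is the pattern seen for Asn (AAC, AAU) and Lys (AAG, AAA) in the table above.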
Preferred codons for each amino acid in ORF4 across HEV hosts
| Codon (amino acid) | Human | Rat | Ferret |
|---|---|---|---|
| UUU(F) | 0 | 3.9 | |
| UUC(F) | 4.4 | ||
| UUA(L) | 3 | 3.3 | 1 |
| UUG(L) | 7.4 | ||
| CUU(L) | 1 | 2.9 | 1.8 |
| CUC(L) | 1 | 5.6 | 2.2 |
| CUA(L) | 3 | 7 | 5.3 |
| CUG(L) | 4 | 5 | |
| AUU(I) | |||
| AUC(I) | 3 | 3.7 | 4.8 |
| AUA(I) | 0 | 3.6 | 4.8 |
| GUU(V) | 0 | 2.6 | 2 |
| GUC(V) | 1 | 0.3 | |
| GUA(V) | 1 | 2.3 | 1 |
| GUG(V) | 2.2 | ||
| UCU(S) | 0 | 2.4 | |
| UCC(S) | 4.4 | 3.7 | 1.5 |
| UCA(S) | 5 | 1.3 | 4 |
| UCG(S) | 3.7 | 1.7 | |
| AGU(S) | 4 | 0 | 0 |
| AGC(S) | 3 | 4.6 | |
| CCU(P) | 4 | 4.6 | |
| CCC(P) | 3.1 | ||
| CCA(P) | 5 | 3.1 | 2.2 |
| CCG(P) | 10 | 3.7 | 2 |
| ACU(T) | 1 | 1 | 1 |
| ACC(T) | 3.2 | 1.7 | |
| ACA(T) | 2 | 3.4 | 1.2 |
| ACG(T) | 1.8 | ||
| GCU(A) | 4 | 2 | 1.8 |
| GCC(A) | |||
| GCA(A) | 1 | 3.6 | 3 |
| GCG(A) | 3 | 1.6 | 1.8 |
| UAU(Y) | 1 | 1 | |
| UAC(Y) | 0 | ||
| CAU(H) | 0 | 0.1 | 0.2 |
| CAC(H) | 0 | ||
| CAA(Q) | 1 | 0.7 | 1.7 |
| CAG(Q) | |||
| AAU(N) | 0 | 0 | 0 |
| AAC(N) | |||
| AAA(K) | 0 | 0 | 0 |
| AAG(K) | |||
| GAU(D) | |||
| GAC(D) | 0 | 0.1 | 0 |
| GAA(E) | 0 | 0.1 | 1 |
| GAG(E) | |||
| UGU(C) | 0 | 1.7 | 3 |
| UGC(C) | |||
| CGU(R) | 1 | 1 | 2 |
| CGC(R) | |||
| CGA(R) | 2 | 0.7 | 0.2 |
| CGG(R) | 2 | 5.4 | 4.7 |
| AGA(R) | 1 | 0.1 | 1 |
| AGG(R) | 2 | 6.6 | 6.8 |
| GGU(G) | 1 | 1.7 | 2 |
| GGC(G) | |||
| GGA(G) | 1 | 0 | 2 |
| GGG(G) | 4 | 2.7 | 2.5 |
Comparison of codon usage frequencies of preferred codons among HEV hosts. Preferred codons, which have the highest codon frequency, are shown in bold
List of used codons based on frequency in ORF4 of hepatitis E viruses
| Host organism | Category | Codon (aa) | Frequency | Codon (aa) | Frequency |
|---|---|---|---|---|---|
| Human | Most frequent | CCC(P) | 11 | GCC(A) | 5 |
| | | CCG(P) | 10 | CCA(P) | 5 |
| | | UUG(L) | 8 | UCA(S) | 5 |
| | | UGC(C) | 7 | CAG(Q) | 5 |
| | | UCG(S) | 5.6 | GGC(G) | 5 |
| | Least frequent | CUU(L) | 1 | AAC(N) | 1 |
| | | CUC(L) | 1 | AAG(K) | 1 |
| | | GUC(V) | 1 | GAU(D) | 1 |
| | | GUA(V) | 1 | CGU(R) | 1 |
| | | ACU(T) | 1 | AGA(R) | 1 |
| | | GCA(A) | 1 | GGU(G) | 1 |
| | | UAU(Y) | 1 | GGA(G) | 1 |
| | | CAA(Q) | 1 | | |
| | Not used | UUU(F) | 0 | AAU(N) | 0 |
| | | AUA(I) | 0 | AAA(K) | 0 |
| | | UCU(S) | 0 | GAC(D) | 0 |
| | | UAC(Y) | 0 | GAA(E) | 0 |
| | | CAU(H) | 0 | UGU(C) | 0 |
| | | CAC(H) | 0 | AAU(N) | 0 |
| Rat | Most frequent | CUG(L) | 7.6 | AGC(S) | 6.1 |
| | | UUG(L) | 7.4 | UUC(F) | 5.7 |
| | | CGC(R) | 7.1 | CUC(L) | 5.6 |
| | | CUA(L) | 7 | CGG(R) | 5.4 |
| | | GCC(A) | 6.7 | UGC(C) | 5 |
| | | AGG(R) | 6.6 | GGC(G) | 5 |
| | Least frequent | CAU(H) | 0.1 | CGA(R) | 0.7 |
| | | AGA(R) | 0.1 | CAA(Q) | 0.7 |
| | | GAC(D) | 0.1 | ACU(T) | 1 |
| | | GAA(E) | 0.1 | UAU(Y) | 1 |
| | | GUC(V) | 0.3 | CGU(R) | 1 |
| | | CAC(H) | 0.6 | | |
| | Not used | AAA(K) | 0 | GGA(G) | 0 |
| | | AGU(S) | 0 | | |
| Ferret | Most frequent | UUG(L) | 8.5 | UCU(S) | 5.7 |
| | | CGC(R) | 8.5 | CUA(L) | 5.3 |
| | | CAG(Q) | 8.3 | UUU(F) | 5.2 |
| | | AGG(R) | 6.8 | CUG(L) | 5 |
| | | GCC(A) | 5.8 | CCC(P) | 5 |
| | Least frequent | AAG(K) | 0.2 | GUA(V) | 1 |
| | | CAU(H) | 0.2 | ACU(T) | 1 |
| | | CGA(R) | 0.2 | UAU(Y) | 1 |
| | | AAC(N) | 0.4 | GAA(E) | 1 |
| | | UUA(L) | 1 | AGA(R) | 1 |
| | Not used | AAU(N) | 0 | GAC(D) | 0 |
| | | AAA(K) | 0 | AGU(S) | 0 |
The table lists codons, the amino acid (aa) encoded by the codon (in parentheses), and frequency of use (as a percentage). Codons are listed from most to least frequent. The space in the table separates the most frequently used, least frequently used and the unused codons for each host