Bingzhe Li, Han Wu, Ziping Miao, Linjie Hu, Lu Zhou, Yihan Lu.
Abstract
Hepatitis E virus (HEV) is an emerging zoonotic pathogen with multiple species and genotypes, which may be classified into human, animal, and zoonotic HEV. The codon usage bias of HEV remains unclear. This study aims to characterize the codon usage of HEV and elucidate the main drivers of its codon usage bias. Seven HEV genotypes were included in the study: HEV-1 (human HEV), HEV-3 and HEV-4 (zoonotic HEV), and HEV-8, HEV-B, HEV-C1, and HEV-C2 (emerging animal HEV). Complete coding sequences (ORF1, ORF2, and ORF3) were obtained from GenBank. All genotypes except HEV-8 tended to use codons ending in G/C. Codon usage bias was determined for the HEV genotypes based on relative synonymous codon usage (RSCU) and principal component analysis (PCA). Codon usage bias differed widely across human, zoonotic, and animal HEV genotypes; furthermore, it varied within certain genotypes such as HEV-4, HEV-8, and HEV-C1. In addition, dinucleotide abundance revealed that translational selection shaped a unique dinucleotide usage pattern in HEV. Moreover, parity rule 2 (PR2) analysis, effective number of codons (ENC)-plot analysis, and neutrality analysis were jointly performed. Natural selection played the leading role in forming HEV codon usage bias: it was the predominant driver in HEV-1, HEV-3, HEV-B, and HEV-C1, whereas in HEV-4, HEV-8, and HEV-C2 it acted in combination with mutation pressure. Our findings may provide insights into HEV evolution and codon usage bias.
Keywords: codon usage; effective codon number; hepatitis E virus; relatively synonymous codon usage; zoonotic pathogen
Year: 2022 PMID: 35801104 PMCID: PMC9253588 DOI: 10.3389/fmicb.2022.938651
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
Nucleotide composition of complete genomes in seven HEV genotypes.
| Genotype | GC1 | GC2 | GC3 | A3s | U3s | C3s | G3s |
|---|---|---|---|---|---|---|---|
| HEV-B | 64.68 ± 0.46 | 48.97 ± 0.15 | 55.89 ± 0.83 | 17.62 ± 1.21 | 33.95 ± 1.15 | 31.44 ± 0.78 | 36.65 ± 0.65 |
| HEV-C1 | 62.8 ± 1.53 | 50.80 ± 3.14 | 57.62 ± 6.03 | 17.73 ± 3.38 | 31.74 ± 5.89 | 33.74 ± 2.48 | 35.97 ± 5.64 |
| HEV-C2 | 55.78 ± 2.07 | 55.13 ± 2.73 | 55.46 ± 0.88 | 22.87 ± 1.97 | 27.82 ± 3.55 | 29.86 ± 0.85 | 35.89 ± 0.84 |
| HEV-1 | 64.49 ± 0.23 | 51.27 ± 0.16 | 60.23 ± 0.52 | 10.87 ± 0.39 | 34.46 ± 0.67 | 41.77 ± 0.69 | 29.70 ± 0.46 |
| HEV-3 | 63.45 ± 0.42 | 50.77 ± 0.27 | 54.81 ± 1.09 | 14.98 ± 0.73 | 37.03 ± 1.08 | 35.20 ± 1.13 | 30.55 ± 0.79 |
| HEV-4 | 61.78 ± 3.81 | 52.20 ± 3.30 | 54.21 ± 3.70 | 17.19 ± 4.00 | 35.21 ± 7.99 | 33.25 ± 1.35 | 30.99 ± 3.92 |
| HEV-8 | 60.70 ± 4.03 | 50.31 ± 0.42 | 49.89 ± 5.61 | 16.63 ± 3.89 | 40.22 ± 10.91 | 31.18 ± 0.87 | 27.71 ± 5.34 |
The values in the cells were represented as mean% ± SD%.
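As a rough illustration of how position-specific GC values such as GC1, GC2, and GC3 can be derived, the following minimal sketch counts G and C at each codon position of an in-frame coding sequence. The function name `gc_by_position` and the toy sequence are hypothetical, and the sketch does not exclude Met, Trp, or stop codons, which the authors' software may have handled differently.

```python
def gc_by_position(cds: str) -> dict:
    """Return GC percentage at codon positions 1, 2 and 3 of an in-frame CDS."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    result = {}
    for pos in range(3):
        bases = seq[pos:usable:3]  # every third base, starting at this position
        gc = sum(base in "GC" for base in bases)
        result[f"GC{pos + 1}"] = 100.0 * gc / len(bases) if bases else 0.0
    return result


# Toy sequence (hypothetical, not an HEV sequence): ATG GCG CCC TAA
print(gc_by_position("ATGGCGCCCTAA"))
```

Averaging these per-sequence values within a genotype, together with their standard deviation, would give figures in the same form as the table above.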
Correlation analysis between nucleotide composition and that at the third codon position of HEV complete coding sequences.
| Genotype | Correlation | A3% | T3% | G3% | C3% | GC3% |
|---|---|---|---|---|---|---|
| HEV-1 | A% | 0.93 | −0.45 | −0.75 | 0.4 | −0.04 |
| | T% | −0.69 | 0.92 | 0.61 | −0.88 | −0.64 |
| | G% | −0.96 | 0.56 | 0.91 | −0.56 | −0.06 |
| | C% | 0.65 | −0.92 | −0.63 | 0.91 | 0.67 |
| | GC% | 0.31 | −0.87 | −0.32 | 0.86 | 0.82 |
| HEV-3 | A% | 0.95 | 0.05 | −0.72 | −0.16 | −0.55 |
| | T% | −0.1 | 0.97 | −0.08 | −0.79 | −0.79 |
| | G% | −0.57 | −0.14 | 0.97 | −0.09 | 0.44 |
| | C% | −0.13 | −0.9 | −0.1 | 0.94 | 0.85 |
| | GC% | −0.43 | −0.84 | 0.46 | 0.75 | 0.96 |
| HEV-4 | A% | 0.31 | −0.31 | 0.19 | −0.02 | 0.26 |
| | T% | −0.06 | 0.05 | 0.09 | −0.4 | −0.05 |
| | G% | −0.2 | 0.25 | −0.15 | −0.03 | −0.23 |
| | C% | −0.04 | 0.03 | −0.14 | 0.42 | 0 |
| | GC% | −0.14 | 0.15 | −0.21 | 0.39 | −0.12 |
| HEV-8 | A% | 0.26 | −0.05 | 0.02 | −0.86 | −0.07 |
| | T% | 0.09 | 0.11 | −0.13 | −0.85 | −0.23 |
| | G% | −0.81 | 0.67 | −0.65 | 0.81 | −0.56 |
| | C% | 0.31 | −0.5 | 0.53 | 0.67 | 0.61 |
| | GC% | −0.17 | −0.05 | 0.09 | 0.93 | 0.19 |
| HEV-B | A% | 0.97 | −0.59 | −0.39 | −0.22 | −0.42 |
| | T% | −0.70 | 0.92 | 0.25 | −0.52 | −0.3 |
| | G% | −0.26 | −0.11 | 0.80 | −0.01 | 0.45 |
| | C% | −0.38 | −0.19 | −0.19 | 0.88 | 0.66 |
| | GC% | −0.49 | −0.23 | 0.23 | 0.83 | 0.86 |
| HEV-C1 | A% | 0.47 | 0.8 | −0.76 | −0.7 | −0.86 |
| | T% | 0.03 | 0.98 | −0.56 | −0.97 | −0.82 |
| | G% | −0.2 | −0.98 | 0.68 | 0.9 | 0.88 |
| | C% | −0.16 | −0.98 | 0.64 | 0.98 | 0.88 |
| | GC% | −0.18 | −1 | 0.67 | 0.96 | 0.89 |
| HEV-C2 | A% | −0.11 | 0.26 | 0.08 | −0.32 | −0.52 |
| | T% | 0.33 | −0.18 | 0.55 | −0.75 | −0.13 |
| | G% | 0.19 | −0.05 | 0.48 | −0.68 | −0.22 |
| | C% | −0.2 | 0.04 | −0.45 | 0.67 | 0.26 |
| | GC% | −0.19 | 0.03 | −0.41 | 0.63 | 0.27 |
p < 0.01;
p < 0.05.
Number of preferred codons in the seven HEV genotypes.
| | HEV-1 | HEV-3 | HEV-4 | HEV-8 | HEV-B | HEV-C1 | HEV-C2 |
|---|---|---|---|---|---|---|---|
| Number of preferred codons (RSCU > 1) | 31 | 30 | 29 | 28 | 31 | 29 | 27 |
| Number of preferred codons ending in G/C | 20 | 16 | 15 | 14 | 16 | 18 | 19 |
| Number of overrepresented codons (RSCU > 1.6) | 11 | 4 | 1 | 6 | 1 | 1 | 1 |
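The counts above rest on RSCU, which compares the observed count of a codon with the count expected if all synonymous codons for that amino acid were used equally. The sketch below is one minimal way to compute it under the standard genetic code. The function name `rscu` and the toy input are hypothetical, and the thresholds 1 and 1.6 are taken from the table rows above.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict:
    """RSCU per codon for the 59 codons with synonymous alternatives."""
    seq = cds.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # skip stop codons, Met and Trp
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for c in family:
            # observed count divided by the mean count across the family
            values[c] = counts[c] * len(family) / total
    return values


# Toy input (hypothetical): four alanine codons, three GCC and one GCT.
values = rscu("GCCGCCGCCGCT")
preferred = sorted(c for c, v in values.items() if v > 1.0)
overrepresented = sorted(c for c, v in values.items() if v > 1.6)
print(values["GCC"], preferred, overrepresented)
```

Applied to each genotype's sequences, counting the codons above each threshold would yield rows of the kind shown in the table.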
Figure 1. Principal component analysis (PCA) based on the hepatitis E virus (HEV) complete coding sequences. The first dimension was plotted against the second dimension. PCA plot showed the deviations and similarity among the 59 synonymous codons of 98 HEV sequences included in the study. Seven HEV genotypes were presented by colors. The ellipses in the figure predicted new observations with a probability of 0.95. New observations from the same group were expected to fall inside the ellipses.
Figure 2. Dinucleotide abundance frequency based on the HEV complete coding sequences. The dashed lines showed overrepresented and underrepresented values. Seven HEV genotypes were presented by colors.
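Dinucleotide abundance is usually expressed as an odds ratio: the observed frequency of a dinucleotide divided by the product of the frequencies of its two bases. The sketch below assumes that definition. The function name `dinucleotide_odds` is hypothetical, and the cutoffs 1.23 and 0.78 are a commonly used convention assumed here, since this record does not state where the dashed lines sit.

```python
from collections import Counter


def dinucleotide_odds(sequence: str) -> dict:
    """Observed/expected ratio for every dinucleotide present in the sequence."""
    seq = sequence.upper().replace("U", "T")
    mono = Counter(seq)
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1
    odds = {}
    for pair, count in di.items():
        fx = mono[pair[0]] / n_mono
        fy = mono[pair[1]] / n_mono
        odds[pair] = (count / n_di) / (fx * fy)
    return odds


# Assumed cutoffs (conventional values, not taken from this record).
OVER, UNDER = 1.23, 0.78

odds = dinucleotide_odds("AACC")  # toy input, hypothetical
for pair, value in sorted(odds.items()):
    label = "over" if value > OVER else "under" if value < UNDER else "normal"
    print(pair, round(value, 2), label)
```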
Figure 3. Parity Rule 2 (PR2) plot based on the HEV complete coding sequences. The center of the plot, where the value of both coordinates was 0.5, indicated no bias in mutation or selection rates. Seven HEV genotypes were presented by colors.
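A PR2 plot places each sequence at the point (G3/(G3+C3), A3/(A3+T3)), so a sequence with A=T and G=C at the third codon position lands at (0.5, 0.5). The following minimal sketch computes that point. The function name `pr2_point` is hypothetical, and the sketch uses all third positions, whereas PR2 analyses are often restricted to four-fold degenerate codon families; which variant the authors used is not stated in this record.

```python
def pr2_point(cds: str) -> tuple:
    """Return (G3/(G3+C3), A3/(A3+T3)) for an in-frame coding sequence."""
    seq = cds.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    third = seq[2:usable:3]           # bases at the third codon position
    a, t, g, c = (third.count(base) for base in "ATGC")
    gc_bias = g / (g + c) if g + c else float("nan")
    at_bias = a / (a + t) if a + t else float("nan")
    return gc_bias, at_bias


# Toy input (hypothetical): third positions are G, C, A, T, so no bias.
print(pr2_point("GCGGCCGCAGCT"))
```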
Figure 4. Effective number of codons (ENCs)-plot analysis based on the HEV complete coding sequences. ENC values were plotted against GC3s of the genotypes. The black line represented the standard curve when the codon usage bias was determined by only the GC3s composition. Seven HEV genotypes were presented by colors.
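The standard curve in an ENC-plot is conventionally given by Wright's expression ENC = 2 + s + 29 / (s² + (1 − s)²), where s is GC3s as a fraction. The sketch below evaluates that curve, assuming this is the formula behind the black line. The function name `expected_enc` is hypothetical, and the inputs are the mean GC3 values from the nucleotide composition table, used only as stand-ins for GC3s.

```python
def expected_enc(gc3s: float) -> float:
    """Expected ENC when codon usage is shaped only by GC3s (fraction, 0 to 1)."""
    s = gc3s
    return 2.0 + s + 29.0 / (s ** 2 + (1.0 - s) ** 2)


# Mean GC3 (%) per genotype from the composition table, as stand-ins for GC3s.
mean_gc3 = {"HEV-1": 60.23, "HEV-3": 54.81, "HEV-8": 49.89}
for genotype, percent in mean_gc3.items():
    print(genotype, round(expected_enc(percent / 100.0), 2))
```

An observed ENC well below the expected value for the same GC3s suggests that factors other than base composition, such as selection, also shape codon usage.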
Correlation analysis between ENC and GC contents of HEV complete coding sequences.
| Genotype | ENC and GC% | ENC and GC1% | ENC and GC2% | ENC and GC3% | ENC and GC12% |
|---|---|---|---|---|---|
| HEV-1 | −0.56 | −0.5 | 0.37 | −0.54 | −0.16 |
| HEV-3 | −0.27 | 0.14 | −0.33 | −0.3 | −0.09 |
| HEV-4 | −0.12 | −0.91 | 0.79 | 0.77 | −0.27 |
| HEV-8 | 0.03 | −1 | −0.5 | 0.99 | −1 |
| HEV-B | −0.44 | −0.39 | 0.3 | −0.39 | −0.26 |
| HEV-C1 | 0.54 | 0.79 | 0.78 | 0.21 | 0.81 |
| HEV-C2 | −0.04 | −0.98 | 0.97 | 0.88 | 0.83 |
p < 0.01;
p < 0.05.
Figure 5. Neutrality analysis based on the HEV complete coding sequences. The correlation between GC content at first and second positions of codon (GC12s) and at third position of codon (GC3s) was calculated. The solid lines by colors represented the linear regression of GC12 against GC3s for the seven HEV genotypes. * Represented correlation significant at p < 0.05.
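The regression in a neutrality plot can be sketched with ordinary least squares, with GC3 on the x-axis and GC12 on the y-axis. A slope near 1 is usually read as codon usage driven mainly by mutation pressure, and a slope near 0 as dominated by selection. The function name `neutrality_fit` and the data points below are hypothetical and are not taken from the study.

```python
def neutrality_fit(gc3, gc12):
    """Least-squares slope and intercept for GC12 regressed on GC3."""
    n = len(gc3)
    mean_x = sum(gc3) / n
    mean_y = sum(gc12) / n
    sxx = sum((x - mean_x) ** 2 for x in gc3)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(gc3, gc12))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical per-sequence values (percent), for illustration only.
gc3_values = [50.0, 55.0, 60.0]
gc12_values = [56.0, 57.0, 58.0]
slope, intercept = neutrality_fit(gc3_values, gc12_values)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Under the usual interpretation, the slope approximates the relative contribution of mutation pressure and the remainder is attributed to natural selection.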
Relative codon deoptimization index (RCDI) values of HEV ORF1, ORF2, and ORF3 coding sequences with different hosts.
| HEV genotypes | Host | ORF1 | ORF2 | ORF3 | Average RCDI | |
|---|---|---|---|---|---|---|
| HEV-1 | | 1.22 | 1.35 | 1.77 | 1.45 | |
| HEV-3 | | 1.19 | 1.46 | 1.60 | 1.42 | 0.289 |
| | | 1.23 | 1.42 | 1.49 | 1.38 | |
| HEV-4 | | 1.20 | 1.44 | 1.73 | 1.46 | 0.932 |
| | | 1.26 | 1.54 | 1.59 | 1.46 | |
| HEV-8 | | 1.41 | 1.71 | 1.77 | 1.63 | |
| HEV-B | | 1.18 | 1.14 | 1.69 | 1.34 | |
| HEV-C1 | | 1.20 | 1.21 | 2.12 | 1.51 | |
| HEV-C2 | | 1.30 | 1.33 | 1.66 | 1.43 | |
Multiple comparisons using Tukey's HSD test showed significant differences in RCDI values among the HEV coding sequences, except between ORF2 and ORF3 in HEV-8 and between ORF1 and ORF2 in HEV-B, HEV-C1, and HEV-C2.