Donald B Smith1, Jeff Vanek2, Sandeep Ramalingam2, Ingolfur Johannessen2, Kate Templeton2, Peter Simmonds1.
Abstract
The presence of a hypervariable region (HVR) within the genome of hepatitis E virus (HEV) remains unexplained. Previous studies have described the HVR as a proline-rich spacer between flanking functional domains of the ORF1 polyprotein. Others have proposed that the region has no function, that it reflects a hypermutable region of the virus genome, that it is derived from the insertion and evolution of host sequences or that it is subject to positive selection. This study attempts to differentiate between these explanations by documenting the evolutionary processes occurring within the HVR. We have measured the diversity of HVR sequences within acutely infected individuals or amongst sequences derived from epidemiologically linked samples and, surprisingly, find relative homogeneity amongst these datasets. We found no evidence of positive selection for amino acid substitution in the HVR. Through an analysis of published sequences, we conclude that the range of HVR diversity observed within virus genotypes can be explained by the accumulation of substitutions and, to a much lesser extent, through deletions or duplications of this region. All published HVR amino acid sequences display a relative overabundance of proline and serine residues that cannot be explained by a local bias towards cytosine in this part of the genome. Although all published HVRs contain one or more SH3-binding PxxP motifs, this motif does not occur more frequently than would be expected from the proportion of proline residues in these sequences. Taken together, these observations are consistent with the hypothesis that the HVR has a structural role that is dependent upon length and amino acid composition, rather than a specific sequence.
Year: 2012 PMID: 22837418 PMCID: PMC3542125 DOI: 10.1099/vir.0.045351-0
Source DB: PubMed Journal: J Gen Virol ISSN: 0022-1317 Impact factor: 3.891
Limiting-dilution analysis of HVR diversity during acute infection
Virus genotype was inferred from the sequence of the ORF2 region of the virus genome.
| Patient | Age/sex | Presenting symptoms | Peak ALT (U l⁻¹) | Recent travel history | Virus genotype | No. of sequences obtained | Variable sites* | Mean diversity (%)† | Syn‡ | Nsyn‡ |
| 2 | 29/M | Jaundice, fever | | Indian subcontinent | 1 | 21 | 1 | 0.08 | 1 | 0 |
| 7 | 55/F | Jaundice | 2989 | Spain | 3 | 5 | 0 | 0 | 0 | 0 |
| 101 | 62/M | Jaundice | 2881 | None | 3 | 6 | 0 | 0 | 0 | 0 |
| 102 | 58/M | Jaundice | 2245 | None | 3 | 20 | 4 | 0.20 | 2 | 2 |
| 103 | 40/M | Jaundice | 6185 | Spain | 3 | 28 | 0 | 0 | 0 | 0 |
| 104 | 25/M | Fulminant hepatitis | 4121 | India | 1 | 22 | 5 | 0.30 | 2 | 3 |
| 21§ | 55/M | Jaundice | 1993 | None | 3 | 21 | 25 | 2.6 | 19 | 7 |
| | | | | | 3 | A: 17 | 3 | 0.14 | 1 | 2 |
| | | | | | 3 | B: 4 | 1 | 0.40 | 1 | 0 |
| 22 | 68/F | Abnormal liver function | 4023 | Spain | 3 | 18 | 1 | 0.20 | 0 | 1 |
*No. of variable sites amongst sequence set.
†Mean p distance amongst sequence set.
‡Syn, no. of synonymous substitutions; Nsyn, no. of non-synonymous substitutions.
§For patient 21, the diversity amongst the two populations (A and B) of virus sequences detected is also shown separately.
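The "variable sites" and "mean diversity" columns can be illustrated with a minimal Python sketch. The p distance between two aligned sequences is the proportion of sites at which they differ, and the mean is taken here over all unordered pairs in a set. The function names, the handling of gap characters and the toy sequences are assumptions made for illustration; they are not taken from the study and need not match the software the authors used.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ('-') in either sequence are skipped.
    sites = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def mean_p_distance(seqs):
    """Mean p distance over all unordered pairs (multiply by 100 for %)."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def variable_sites(seqs):
    """Number of alignment columns with more than one nucleotide (gaps ignored)."""
    upper = [s.upper() for s in seqs]
    return sum(len(set(col) - {"-"}) > 1 for col in zip(*upper))


# Toy alignment (hypothetical, not study data): one variable site in three clones.
clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
print(variable_sites(clones))
print(round(100 * mean_p_distance(clones), 2))
```

With many identical clones and a single variant, as for several patients in the table, the mean pairwise value stays small even though one site is variable.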
HVR diversity amongst closely related sequences
Sources: Hu, human; Sw, pig; Mo, mongoose; Bo, wild boar.
| Sequences | Virus type | No. of seqs | Country of origin | Source | HVR mean | dN/dS |
| AF010429, AY230202 | 1 | 2 | Morocco | Hu | 1.1 | (0 nsyn/1 syn)* |
| AB291951–57, 60 | 3 | 8 | Japan | Hu | 3.6 | 0.32 |
| AB443624–26 | 3 | 3 | Japan | Sw | 2.2 | 0.16 |
| AB074918, -20, AB089824, AB630970 | 3 | 4 | Japan | Hu | 7.3 | 0.23 |
| AB591733, AB236320 | 3 | 2 | Japan | Mo | 3.7 | 0.91 |
| FJ426403, -04 | 3 | 2 | Korea | Sw | 1.3 | (1 nsyn/0 syn)* |
| EU495158, -59 | 3 | 2 | France | Hu | 4.5 | 0.40 |
| EU495171, -48 | 3 | 2 | France | Hu | 1.8 | 1.80 |
| EU495156, -57 | 3 | 2 | France | Hu | 1.8 | 0.23 |
| EU495163–66, EU723515–16 | 3 | 6 | France | Hu | 3.5 | 0.50 |
| EU495178–80 | 3 | 3 | France | Hu | 13.0 | 0.33 |
| FJ653660, AB369687 | 3 | 2 | Thailand | Hu | 6.9 | 0.51 |
| AB161717–19 etc.† | 4 | 18 | Japan | Hu | 2.4 | 0.38 |
| AB200239 etc.† | 4 | 8 | Japan | Hu, Sw | 3.1 | 0.29 |
| DQ450072, GU119961 | 4 | 3 | China | Sw | 9.5 | 0.93 |
| JF915746 etc.† | 4 | 7 | China, Japan | Hu, Sw, Bo | 12.8 | 0.37 |
| AY621103, -06, AY594199 | 4 | 3 | China | Sw | 1.0 | 1.33 |
| AJ272108 etc.† | 4 | 10 | China | Hu, Sw | 18.0 | 0.46 |
| EU366959 etc.† | 4 | 7 | China, Korea | Hu, Sw | 16.3 | 0.52 |
*dN/dS could not be calculated for the sequences with GenBank accession numbers AF010429 and AY230202 (one synonymous substitution only) or for FJ426403 and -04 (one non-synonymous substitution only).
†Group AB161717–19 etc. includes the sequences with GenBank accession numbers AB2209723, -25–29, AB291959, -65–68, AB113311, -12, AB161717 and AB480825; group AB200239 etc. includes AB193176–78, AB097811, -12, AB099347, AB091395, AB220971, AB080575 and AB481227; group JF915746 etc. includes AB602440, AB521806, AB521805, AB602439, AB369690 and EF570133; group AJ272108 etc. includes AY621103, AY621104, AY621106, AY594199, GU361892, AY621105, FJ610232, GU206559 and HM152568; group EU366959 etc. includes GU119960, AB197673, HQ634346, EF077630, AB197674 and FJ763142.
Fig. 1. Duplicated regions in genotype 3 HVR sequences. For representative genotype 3 isolates, the HVR amino acid sequence is shown split into two lines in order to display the proposed duplicated regions (indicated in bold). Dots indicate identity to the top sequence. The last sequence is a representative genotype 3 sequence without a duplicated region.
Fig. 2. Diversity of HVR in genotype 4. Single representatives of each phylogenetic group of genotype 4 HVR sequences are shown. Identity with the top sequence is indicated by dots; gaps in the sequence are indicated by dashes.
Fig. 3. HVR variation between virus genotypes. Representative sequences were genotype 3 (GenBank accession no. EU495148), genotype 3 (AB189071), genotype 3 (rabbit; GU937805), genotype 4 (EU366959), wild boar (AB573435 and AB602441), genotype 1 (M80581) and genotype 2 (M74506). Identity with the top sequence is indicated by dots; gaps are indicated by dashes.
Fig. 4. Biased amino acid composition of the HVR. The ratio of the mean amino acid composition of HVR of genotypes 1, 3 and 4 compared with that for ORF1 as a whole is shown for each amino acid.
Fig. 5. Over-representation of cytosine at the second codon position in the HVR. For the first, second and third position of codons, the frequency of each nucleotide in ORF1 as a whole or in the HVR is shown for genotypes 1, 3 and 4.
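The kind of tabulation summarised in Fig. 5 can be sketched in Python as follows: for an in-frame coding sequence, the frequency of each nucleotide is computed separately for the first, second and third codon positions. The function name and the toy sequence are hypothetical and not taken from the study. Proline codons (CCN) and four of the six serine codons (TCN) carry cytosine at the second position, which is why a proline- and serine-rich region would be expected to show this pattern.

```python
from collections import Counter


def codon_position_frequencies(cds):
    """Return three dicts (codon positions 1-3) of A/C/G/T frequencies.

    Assumes `cds` is in frame, uses the DNA alphabet and has no gaps.
    """
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("sequence must be non-empty and a multiple of 3 long")
    freqs = []
    for pos in range(3):
        bases = cds[pos::3]  # every third base, starting at this codon position
        counts = Counter(bases)
        freqs.append({b: counts[b] / len(bases) for b in "ACGT"})
    return freqs


# Toy sequence (hypothetical): codons CCT TCA CCG encode Pro-Ser-Pro.
first, second, third = codon_position_frequencies("CCTTCACCG")
print(second["C"])  # all three codons have C at the second position
```

Running the same function on a whole ORF1 sequence and on its HVR segment would give the two sets of frequencies that the figure compares.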
Fig. 6. PxxP motifs occur at a level expected by chance in the HVR. For each genotype, the figure shows the ratio of the number of PxxP motifs in the HVR compared with the number expected by chance for a peptide of the same amino acid composition and length. Sequences comprised epidemiologically unlinked sequences from genotype 1 (n = 18), genotype 2 (n = 1), genotype 3 sequences with (type 3I, n = 42) and without (type 3N, n = 38) a duplicated region, genotype 4 (n = 15) and divergent types from wild boar (n = 2) and genotype 3 sequences from rabbits (n = 3).
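The observed-to-expected comparison in Fig. 6 can be illustrated with a short Python sketch. The expectation below is the mean number of PxxP occurrences over random permutations of the same peptide, which preserves length and amino acid composition. This null model, the function names and the toy peptide are assumptions made for illustration; the source does not state how the expected value was calculated.

```python
def count_pxxp(seq):
    """Count PxxP occurrences (overlapping allowed): P at positions i and i+3."""
    seq = seq.upper()
    return sum(seq[i] == "P" and seq[i + 3] == "P" for i in range(len(seq) - 3))


def expected_pxxp(seq):
    """Expected PxxP count if the residues of `seq` were randomly permuted.

    There are (n - 3) start positions; under a random permutation the chance
    that both flanking positions hold proline is k/n * (k - 1)/(n - 1),
    where k is the number of prolines and n is the peptide length.
    """
    n = len(seq)
    k = seq.upper().count("P")
    if n < 4:
        return 0.0
    return (n - 3) * k * (k - 1) / (n * (n - 1))


def pxxp_ratio(seq):
    """Observed / expected PxxP count; about 1 means no enrichment."""
    expected = expected_pxxp(seq)
    if expected == 0:
        raise ValueError("fewer than two prolines or peptide too short")
    return count_pxxp(seq) / expected


# Toy peptide (hypothetical, not an HVR sequence).
peptide = "PAAPSSPTTP"
print(count_pxxp(peptide), round(expected_pxxp(peptide), 3))
```

A ratio close to 1 across genotypes would correspond to the conclusion in the caption that the motif occurs at the level expected from proline content alone.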