Takumi Motoya1,2, Koo Nagasawa3, Yuki Matsushima4, Noriko Nagata1, Akihide Ryo5, Tsuyoshi Sekizuka6, Akifumi Yamashita6, Makoto Kuroda6, Yukio Morita7, Yoshiyuki Suzuki8, Nobuya Sasaki2, Kazuhiko Katayama9, Hirokazu Kimura3,5,10.
Abstract
Human norovirus (HuNoV) is a leading cause of viral gastroenteritis worldwide, of which GII.4 is the most predominant genotype. Unlike other genotypes, GII.4 has created various variants that escaped from previously acquired immunity of the host and caused repeated epidemics. However, the molecular evolutionary differences among all GII.4 variants, including recently discovered strains, have not been elucidated. Thus, we conducted a series of bioinformatic analyses using numerous, globally collected, full-length GII.4 major capsid (VP1) gene sequences (466 strains) to compare the evolutionary patterns among GII.4 variants. The time-scaled phylogenetic tree constructed using the Bayesian Markov chain Monte Carlo (MCMC) method showed that the common ancestor of the GII.4 VP1 gene diverged from GII.20 in 1840. The GII.4 genotype emerged in 1932, and then formed seven clusters including 14 known variants after 1980. The evolutionary rate of GII.4 strains was estimated to be 7.68 × 10⁻³ substitutions/site/year. The evolutionary rates probably differed among variants as well as domains [protruding 1 (P1), shell, and P2 domains]. The Osaka 2007 variant strains probably contained more nucleotide substitutions than any other variant. Few conformational epitopes were located in the shell and P1 domains, although most were contained in the P2 domain, which, as previously established, is associated with attachment to host factors and antigenicity. We found that positive selection sites for the whole GII.4 genotype existed in the shell and P1 domains, while Den Haag 2006b, New Orleans 2009, and Sydney 2012 variants were under positive selection in the P2 domain. Amino acid substitutions overlapped with putative epitopes or were located around the epitopes in the P2 domain. The effective population sizes of the present strains increased stepwise for Den Haag 2006b, New Orleans 2009, and Sydney 2012 variants.
These results suggest that HuNoV GII.4 rapidly evolved in a few decades, created various variants, and altered its evolutionary rate and antigenicity.
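To give a feel for the scale of the reported evolutionary rate, the following minimal sketch converts a per-site rate into an expected number of substitutions across the whole gene. The rate is taken from the abstract; the gene length of 1,620 nucleotides (540 codons) is an assumption used only for illustration, and the function name is hypothetical.

```python
# Rate reported in the abstract (substitutions/site/year).
RATE = 7.68e-3

# ASSUMPTION: approximate VP1 gene length in nucleotides (540 codons x 3).
# This value is not stated in the abstract.
ASSUMED_GENE_LENGTH_NT = 1620


def expected_substitutions(rate: float, sites: int, years: float = 1.0) -> float:
    """Expected substitutions across `sites` positions over `years` years."""
    return rate * sites * years


per_year = expected_substitutions(RATE, ASSUMED_GENE_LENGTH_NT)
print(f"Expected substitutions per year across the gene: {per_year:.1f}")
```

Under these assumptions the rate corresponds to roughly a dozen nucleotide substitutions per year across the gene; a different assumed length scales the result proportionally.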
Keywords: GII.4; Norovirus; VP1; bioinformatics; molecular evolution
Year: 2017 PMID: 29259596 PMCID: PMC5723339 DOI: 10.3389/fmicb.2017.02399
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Time-scaled phylogenetic trees of the complete HuNoV capsid VP1 gene constructed by the Bayesian MCMC method. (A) The maximum clade credibility tree with the dataset including HuNoV GI.1 and all GII genotypes; (B) Enlarged tree focused on GII.4. Gray bars indicate the 95% highest probability densities for each branch year.
Figure 2. Evolutionary rates of nucleotide sequences in the full-length GII.4 VP1 gene. (A) Evolutionary rates for domains within the GII.4 VP1 gene; (B) Evolutionary rates for each GII.4 variant. The y-axis represents the evolutionary rate (substitutions/site/year). Statistical results for multiple comparisons among the domains and among the GII.4 variants are shown in Tables S2 and S3, respectively.
Figure 3. SimPlot analysis of the representative HuNoV GII.4 strains. Each variant's similarity to the Bristol 1993 variant is represented. The positions of the shell, P1, and P2 domains in the VP1 gene are shown below the graph.
Figure 4. Phylogenetic distances between the nucleotide sequences of the full-length GII.4 VP1 gene. (A) Intra-genotype phylogenetic distances among HuNoV GII.4 strains. The x-axis shows the phylogenetic distance, and the y-axis represents the number of sequence pairs corresponding to each distance; (B) Intra-variant phylogenetic distances among HuNoV GII.4 strains. The x-axis represents each variant, and the y-axis indicates the phylogenetic distance. Data are expressed as mean ± standard deviation. Statistical results for multiple comparisons are shown in Table S4.
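The pairwise summary described for Figure 4 (a distance for every pair of sequences, then mean ± standard deviation) can be sketched as follows. This is only an illustration: it uses the simple p-distance (proportion of differing aligned sites) as a stand-in, which is not necessarily the phylogenetic distance measure used in the study, and the three short sequences are made-up toy data.

```python
from itertools import combinations
from statistics import mean, stdev


def p_distance(a: str, b: str) -> float:
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(1 for x, y in zip(a, b) if x != y)
    return diffs / len(a)


# Hypothetical toy alignment (not real norovirus sequences).
toy = {
    "s1": "ATGAAGATGG",
    "s2": "ATGAAGATGA",
    "s3": "ATGCAGATGA",
}

# One distance per unordered pair of sequences.
dists = [p_distance(toy[i], toy[j]) for i, j in combinations(toy, 2)]

print("pairwise distances:", dists)
print(f"mean = {mean(dists):.3f}, SD = {stdev(dists):.3f}")
```

A histogram of `dists` would correspond to a panel like (A), while grouping the sequences by variant before computing the mean and standard deviation would correspond to a panel like (B).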
Figure 5. Structural models for the capsid VP1 protein of each HuNoV GII.4 variant. Three-dimensional VP1 dimer structures for the Bristol 1993 (A), Den Haag 2006b (B), New Orleans 2009 (C), and Sydney 2012 (D) variants are shown. The two chains forming each dimer are colored gray (chain A) and dim gray (chain B). Predicted epitopes of each variant are colored red, and the epitope regions are circled. Positive selection sites (aa9, aa294, aa376, aa393, aa412, and aa534) are colored blue. Of note, aa6 could not be shown because the N terminus was not included in the structural model. Amino acid substitutions in the other variants relative to the GII.4 Bristol 1993 strain are colored green.
Parameters for evolutionary rates and Bayesian skyline plot analyses in HuNoV GII.4 and the variants.
| Dataset | No. of strains | Substitution model | Clock model | Tree prior | Chain length | Sampling interval |
|---|---|---|---|---|---|---|
| GII.4 all variants (Complete) | 466 | GTR+Γ+I | Relaxed clock log normal | Coalescent exponential population | 300,000,000 | 2,000 |
| | | | Strict clock | Coalescent Bayesian skyline | 220,000,000 | 8,000 |
| GII.4 all variants (Shell domain) | 466 | HKY+Γ+I | Strict clock | Coalescent exponential population | 200,000,000 | 5,000 |
| GII.4 all variants (P1 domain) | 466 | SYM+Γ | Relaxed clock log normal | Coalescent exponential population | 210,000,000 | 2,000 |
| GII.4 all variants (P2 domain) | 466 | GTR+Γ+I | Relaxed clock exponential | Coalescent constant population | 150,000,000 | 5,000 |
| US95_96 | 33 | SYM+Γ+I | Relaxed clock exponential | Coalescent exponential population | 250,000,000 | 2,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 2,000 |
| Farmington Hills 2002 | 18 | K80+Γ | Relaxed clock exponential | Coalescent exponential population | 1,400,000,000 | 10,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 5,000 |
| Asia 2003 | 17 | K80+Γ | Relaxed clock log normal | Coalescent exponential population | 550,000,000 | 4,000 |
| | | | Strict clock | Coalescent Bayesian skyline | 100,000,000 | 2,000 |
| Hunter 2004 | 22 | K80+Γ | Relaxed clock exponential | Coalescent exponential population | 550,000,000 | 4,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 2,000 |
| Yerseke 2006a | 13 | TrNef+Γ | Relaxed clock exponential | Coalescent exponential population | 1,160,000,000 | 10,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 2,000 |
| Den Haag 2006b | 145 | HKY+Γ+I | Relaxed clock exponential | Coalescent exponential population | 300,000,000 | 10,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 300,000,000 | 10,000 |
| Osaka 2007 | 11 | TrNef+Γ | Relaxed clock exponential | Coalescent exponential population | 100,000,000 | 2,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 2,000 |
| Apeldoorn 2007 | 25 | TrNef+Γ | Relaxed clock exponential | Coalescent exponential population | 300,000,000 | 10,000 |
| | | | Relaxed clock exponential | Coalescent Bayesian skyline | 100,000,000 | 5,000 |
| New Orleans 2009 | 86 | HKY+Γ+I | Random local clock | Coalescent exponential population | 200,000,000 | 2,000 |
| | | | Random local clock | Coalescent Bayesian skyline | 300,000,000 | 6,000 |
| Sydney 2012 | 68 | TPM2+Γ+I | Relaxed clock log normal | Coalescent exponential population | 100,000,000 | 5,000 |
| | | | Random local clock | Coalescent Bayesian skyline | 100,000,000 | 5,000 |
Where a dataset has two lines, the upper line gives the parameters for the evolutionary rate analysis and the lower line gives those for the Bayesian skyline plot (BSP) analysis.
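The last two numeric columns of the table above appear to be the MCMC chain length and the sampling interval; that reading is an assumption, since the extracted table does not label them. Under that assumption, a short sketch shows how many states each run would log before any burn-in is discarded. The function name and the selection of runs are illustrative only.

```python
def samples_logged(chain_length: int, sampling_interval: int) -> int:
    """Number of states logged when sampling every `sampling_interval` steps.

    ASSUMPTION: burn-in is ignored, and the initial state is not counted.
    """
    if sampling_interval <= 0:
        raise ValueError("sampling_interval must be positive")
    return chain_length // sampling_interval


# (chain length, sampling interval) pairs copied from the table;
# the interpretation of the two columns is assumed.
runs = {
    "GII.4 all variants, rate": (300_000_000, 2_000),
    "GII.4 all variants, BSP": (220_000_000, 8_000),
    "Farmington Hills 2002, rate": (1_400_000_000, 10_000),
}

for name, (chain, interval) in runs.items():
    print(f"{name}: {samples_logged(chain, interval):,} logged states")
```

Longer chains paired with wider sampling intervals keep the number of logged states comparable across datasets, which may be why the small-sample variants use both larger values.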
Figure 6. Bayesian skyline plots for VP1 sequences of HuNoV GII.4. Plots for all GII.4 strains (A), US95_96 variant strains (B), Farmington Hills 2002 variant strains (C), Asia 2003 variant strains (D), Hunter 2004 variant strains (E), Yerseke 2006a variant strains (F), Den Haag 2006b variant strains (G), Osaka 2007 variant strains (H), Apeldoorn 2007 variant strains (I), New Orleans 2009 variant strains (J), and Sydney 2012 variant strains (K) are shown. The y-axis represents the effective population size on a logarithmic scale, whereas the x-axis denotes time in years. The solid black line indicates the median posterior value, and the blue lines show the 95% highest probability density intervals.
Putative conformational epitopes for the capsid VP1 proteins of HuNoV GII.4 variants.
| Epitope | A | A | A | A | A | A | C | C | D | D | D | E | E | E | ||||||||||||||||||||||||
| Bristol 1993 | P | A | N | T | I | A | G | S | H | D | T | N | N | D | T | R | A | D | G | S | Q | A | G | D | G | D | - | H | H | Q | N | G | Y | N | R | T | G | H |
| Camberwell 1994 | · | S | H | · | · | V | · | · | · | · | · | · | · | · | · | · | · | · | · | · | · | T | · | · | · | · | - | · | · | · | · | · | · | · | · | · | · | · |
| US95_96 | · | · | H | · | · | · | · | · | · | · | · | · | · | · | · | · | G | · | · | · | · | T | · | · | · | · | - | N | · | · | · | · | · | · | · | · | · | · |
| Kaiso 2003 | · | S | · | · | · | · | · | · | R | N | S | D | · | · | · | · | · | · | · | · | · | P | · | · | · | · | - | R | · | · | · | · | · | D | · | · | · | · |
| Farmington Hills 2002 | · | S | · | · | · | · | · | T | · | N | N | · | · | · | · | · | G | · | · | · | E | T | · | · | · | N | G | T | · | · | · | · | · | S | · | · | · | · |
| Lanzou 2002 | · | S | · | · | · | · | · | T | · | · | · | · | · | · | · | · | G | · | · | · | E | T | · | · | · | N | S | A | · | · | · | · | · | N | · | · | · | · |
| Asia 2003 | · | S | · | I | · | P | · | T | R | T | A | D | · | · | · | K | G | · | · | · | E | T | · | · | · | S | S | A | · | R | · | · | · | D | · | · | V | · |
| Hunter 2004 | · | S | · | · | · | · | · | T | Q | N | S | S | · | · | · | · | R | · | · | · | E | T | · | · | · | S | T | T | · | · | · | · | · | D | · | D | S | · |
| Yerseke 2006a | · | S | · | · | · | · | · | T | Q | E | S | S | · | · | · | · | R | · | · | · | E | T | · | · | · | S | T | T | · | · | · | · | · | D | · | D | S | · |
| Den Haag 2006b | · | S | · | · | · | · | · | · | R | N | S | E | · | · | · | K | G | · | · | · | E | T | H | · | · | S | T | T | · | R | · | · | · | S | · | N | V | · |
| Osaka 2007 | · | S | · | · | · | · | · | · | R | N | A | D | · | · | · | · | S | · | · | · | E | S | · | · | · | S | T | T | · | R | · | · | · | · | · | · | · | · |
| Apeldoorn 2007 | · | S | · | · | · | · | · | · | R | N | A | D | · | · | · | · | · | · | · | · | D | · | N | · | · | N | T | A | · | R | · | · | · | S | · | N | S | · |
| New Orleans 2009 | · | S | · | · | · | P | · | · | R | N | A | D | · | · | · | · | T | N | · | · | E | T | N | · | · | S | T | T | P | R | · | · | · | S | · | N | I | · |
| Sydney 2012 | · | S | · | · | · | T | · | · | R | N | E | D | R | · | · | · | T | · | · | · | E | · | N | · | · | G | T | T | · | R | · | · | · | S | · | N | T | · |
Red colors represent putative epitopes of each HuNoV GII.4 variant in this analysis.
Amino acids corresponding to epitope regions identified by Lindesmith et al. (.
Positive selection sites in the present strains of HuNoV GII.4.
| Asn6Ser | ° | ° | ° |
| Ser6Asn | |||
| Asn9Ser, Thr, His, Lys | ° | ° | ° |
| Ser9Asn | |||
| Asn17Ser, His, Thr | ° | ||
| Ala294Val, Thr, Pro, Gly | ° | ||
| Val294Gly, Ala | |||
| Thr294Ala, Ile, Ser | |||
| Pro294Ser, Thr | |||
| Ser294Pro, Ala | |||
| Gly294Arg | |||
| Tyr352Ser, Arg, Phe, Leu | ° | ||
| Ser352Tyr, Leu | |||
| Leu352Phe | |||
| Thr368Ser, Ala, Asn | ° | ||
| Ala368Val, Ser, Thr, Asp | |||
| Ser368Gly, Asn, Arg | |||
| Gly368Ala, Ser | |||
| Asn368Glu, Asp, Ser | |||
| Glu368Ala, Gly | |||
| Val368Phe | |||
| Glu376Gln, Asp, Val | ° | ||
| Gln376Glu, Asn | |||
| Asp376Glu, Val, Gly | |||
| Val376Glu, Ile | |||
| Gly393Ser, Asn, Asp | ° | ° | |
| Asn393Asp, Ser, Gly | |||
| Ser393Asn, Gly, Thr, Ala | |||
| Asp393Asn, Gly, Glu | |||
| Thr395Asn, Ala | ° | ° | |
| Asn395His, Thr | |||
| His395Arg, Asp, Pro | |||
| Ala395Thr | |||
| Val413Gly, Ala, Ile | ° | ||
| Gly413Ser, Val, Asn | |||
| Ser413Thr, Asn, Gly | |||
| Thr413Ser, Ile, Ala | |||
| Ile413Thr, Val | |||
| Ser494Thr, Pro, Ala | ° | ||
| Thr494Ala | |||
| Thr534Ala, Ser | ° | ° | ° |
| Ala534Val, Thr | |||
| Total | 5 | 6 | 9 |
Mean dN/dS = 0.130 (95% CI = 0.124–0.136).
p-value < 0.05.
Positive selection sites in HuNoV GII.4 variants.
| Farmington Hills 2002 | Asn9His, Thr | ° | ||
| Thr395Ala | ° | |||
| Ala395Thr | ||||
| Lanzou 2002 | Gly255Ser | ° | ||
| Asia 2003 | Val413Ala | ° | ||
| Hunter 2004 | Arg340Gly | ° | ||
| Yerseke 2006a | Ser98Gly | ° | ||
| Den Haag 2006b | Asn9Ser, Thr, His | ° | ° | |
| Pro357Asp | ° | |||
| Ser393Gly, Asn | ° | ° | ° | |
| Gly393Ser | ||||
| Asn412Asp, Ser | ° | ° | ° | |
| Asp412Gly | ||||
| His414Gln, Pro | ° | |||
| Pro414His | ||||
| Osaka 2007 | Leu352Tyr, Phe | ° | ||
| Ser393Asn | ° | |||
| Asn407Gly, Ser | ° | |||
| Thr412Asp | ° | |||
| Apeldoorn 2007 | Ala359Thr, Ser | ° | ||
| New Orleans 2009 | Pro294Ser, Thr | ° | ° | ° |
| Ser294Ala, Pro | ||||
| Ala294Thr | ||||
| Asn341Asp | ° | |||
| Asp341Asn | ||||
| Glu376Asp, Val, Gln | ° | ° | ° | |
| Asp376Val, Glu | ||||
| Val376Asp, Glu, Ile | ||||
| Ile413Thr, Val | ° | |||
| Thr413Ile | ||||
| Sydney 2012 | Ile293Thr | ° | ||
| Ser309Asn | ° | ° | ||
| Asn309Ser | ||||
| His373Arg, Asn | ° | |||
| Ser393Gly, Asn, Thr | ° | ° | ° | |
| Gly393Ser | ||||
| Tyr460His | ° |
Positive selection sites were not found in the Camberwell 1994, US95_96, and Kaiso 2003 variants.
Number of negative selection sites in the present strains of HuNoV GII.4.
| Variant | Shell domain | P1 domain | P2 domain | Total |
|---|---|---|---|---|
| Camberwell 1994 | 0 | 0 | 0 | 0 |
| US95_96 | 4 (1.8%) | 1 (0.5%) | 1 (0.6%) | 6 (1.1%) |
| Kaiso 2003 | 0 | 0 | 0 | 0 |
| Farmington Hills 2002 | 0 | 0 | 0 | 0 |
| Lanzou 2002 | 0 | 0 | 0 | 0 |
| Asia 2003 | 0 | 0 | 0 | 0 |
| Hunter 2004 | 0 | 0 | 0 | 0 |
| Yerseke 2006a | 0 | 0 | 0 | 0 |
| Den Haag 2006b | 34 (15.4%) | 16 (9.0%) | 9 (6.2%) | 59 (10.9%) |
| Osaka 2007 | 4 (1.8%) | 3 (1.6%) | 4 (2.7%) | 11 (2.0%) |
| Apeldoorn 2007 | 2 (0.9%) | 0 | 0 | 2 (0.3%) |
| New Orleans 2009 | 10 (4.5%) | 12 (6.7%) | 3 (2.0%) | 25 (4.6%) |
| Sydney 2012 | 3 (1.3%) | 0 | 0 | 3 (0.5%) |
| All GII.4 | 140 (63.6%) | 142 (80.2%) | 87 (60.8%) | 369 (68.3%) |
Values are the numbers of sites identified consistently by the SLAC, FEL, and IFEL methods.
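The percentages in the table above appear to be site counts divided by the number of codons in each column's region. The site totals are not stated here; back-calculating from the "All GII.4" row suggests roughly 220, 177, and 143 sites for the three domain columns and 540 overall, and the printed values look truncated rather than rounded to one decimal place. Both points are inferences, so the sketch below treats them as labeled assumptions.

```python
# ASSUMPTION: site totals per column, back-calculated from the "All GII.4" row
# (e.g. 140 sites reported as 63.6% implies about 220 sites). Not stated in the source.
ASSUMED_SITES = {"domain_col_1": 220, "domain_col_2": 177, "domain_col_3": 143, "total": 540}


def percent_truncated(count: int, sites: int) -> float:
    """Percentage of sites, truncated (not rounded) to one decimal place.

    Integer arithmetic avoids floating-point edge cases in the truncation.
    """
    return (count * 1000 // sites) / 10


# Counts copied from the Den Haag 2006b row of the table.
den_haag_2006b = {"domain_col_1": 34, "domain_col_2": 16, "domain_col_3": 9, "total": 59}

for col, count in den_haag_2006b.items():
    pct = percent_truncated(count, ASSUMED_SITES[col])
    print(f"{col}: {count} ({pct}%)")
```

With these assumed totals the sketch reproduces the percentages printed for the Den Haag 2006b and All GII.4 rows, which supports the inferred site counts but does not confirm them.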