Rapid replacement of human respiratory syncytial virus A with the ON1 genotype having 72 nucleotide duplication in G gene.

You-Jin Kim¹, Dae-Won Kim², Wan-Ji Lee¹, Mi-Ran Yun², Ho Yeon Lee¹, Han Saem Lee¹, Hee-Dong Jung¹, Kisoon Kim³.

Abstract

Human respiratory syncytial virus (HRSV) is the main cause of severe respiratory illness in young children and elderly people. We investigated the genetic characteristics of the circulating HRSV subgroup A (HRSV-A) to determine the distribution of genotype ON1, which has a 72-nucleotide duplication in the attachment G gene. We obtained 456 HRSV-A positive samples between October 2008 and February 2013, which were subjected to sequence analysis. The first ON1 genotype was discovered in August 2011 and 273 samples were identified as ON1 up to February 2013. The prevalence of the ON1 genotype increased rapidly from 17.4% in 2011-2012 to 94.6% in 2012-2013. The mean evolutionary rate of the G protein gene was calculated as 3.275 × 10⁻³ nucleotide substitutions/site/year, and several positively selected sites for amino acid substitutions were located in the predicted epitope region. This basic and important information may facilitate a better understanding of HRSV epidemiology and evolution.

Keywords: Attachment G gene; Diversity; Duplication; Evolution; Human respiratory syncytial virus (HRSV)

Year: 2014 PMID: 24820343 PMCID: PMC7106136 DOI: 10.1016/j.meegid.2014.05.007

Source DB: PubMed Journal: Infect Genet Evol ISSN: 1567-1348 Impact factor: 3.342

Introduction

Human respiratory syncytial virus (HRSV) is recognized by pediatricians as the most common cause of acute respiratory tract infections and is a leading cause of hospital admissions and death among children aged <5 years worldwide (Hall et al., 2013, Cho et al., 2013, Munywoki et al., 2013, Bezerra et al., 2011, Nair et al., 2010). The World Health Organization has estimated that the annual global disease burden is more than 64 million HRSV infections and 160,000 deaths related to HRSV infection (World Health Organization (WHO), 2009). HRSV infection is a major concern in developed and developing countries, but no effective vaccine is available and immunoprophylaxis is the only treatment for preventing HRSV infection, although access is limited (Chang, 2011, Rudraraju et al., 2013, Graham, 2011, Jorquera et al., 2013, Shaw et al., 2013, Wang et al., 2011a, Zeitlin et al., 2013). Two antigenic groups of HRSV have been differentiated based on antigenic variability in the attachment G gene, i.e., HRSV subgroup A (HRSV-A) and HRSV subgroup B (HRSV-B). Ten HRSV-A and 13 HRSV-B genotypes have been classified in different geographical regions, which are designated as GA1–GA7, SAA1, NA1, and NA2 for HRSV-A, and GB1–GB4, SAB1–SAB3, and BA1–BA6 for HRSV-B (Auksornkitti et al., 2013, Eshaghi et al., 2012, Lee et al., 2012, Khor et al., 2013, Shobugawa et al., 2009). Most previous molecular epidemiological and divergence studies of HRSV have focused on analyses of nucleotide and/or amino acid changes in part of the G protein, which is a type II glycoprotein that mediates attachment of the virus to the cell during virus entry, and is one of the targets of the immune response (Agoti et al., 2010, Cui et al., 2013, Baek et al., 2012, Tan et al., 2013, Murata and Catherman, 2012). These studies have yielded significant volumes of partial genomic information for HRSV and primary analyses based on variation in the G protein. 
In particular, the BA genotype of HRSV-B, which was isolated in Buenos Aires, Argentina, in 1999, contains a duplication of 60 nucleotides (nt) in the C-terminal third of the G protein gene and is the predominant strain according to global epidemiological studies (Sullender et al., 1991, Trento et al., 2006, Trento et al., 2010, van Niekert and Venter, 2011, Zhang et al., 2010). More recently, a similar duplication was reported in HRSV-A (ON1) isolates from Canada, Germany, Malaysia, Thailand, Kenya, and South Korea, which was characterized as a 72-nt duplication in the C-terminal third of the G gene (Munywoki et al., 2013, Auksornkitti et al., 2013, Eshaghi et al., 2012, Lee et al., 2012, Prifert et al., 2013). The exact mechanism by which such duplications confer an advantage under selective pressure, and the factors that may have increased the fitness of the BA genotype relative to other HRSV-B viruses, remain to be defined. Similarly, the basis of the evolutionary advantage and antigenic dominance of the HRSV-A (ON1) strain after the introduction of 72 duplicated bases in the G gene also needs to be clarified. In this study, we investigated the emergence of the new HRSV-A ON1 genotype and conducted an in-depth analysis of the genetic characteristics of the G protein gene. In addition, we predicted the epitopes of the duplicated G protein with an insertion of 23 amino acids, which we compared with ancestral strains. This analysis is important for elucidating the antigenic variation of the HRSV G protein and its relationships with clinical manifestations and vaccine development.

Materials and methods

Ethical statement

The clinical samples used in this study were collected as part of the laboratory surveillance system in South Korea, i.e., the Acute Respiratory Infection Network (ARI-NET) conducted until April 2009 and the Korea Influenza and Respiratory Viruses Surveillance System (KINRESS) since May 2009. This study was approved by the Korea National Institute of Health institutional review boards (Approval Nos. 2010-03EXP-1-R, 2011-03CON-04-C, 2011-06EXP01-C, 2012-08EXP-06-3C, and 2012-09CON-03-4C) because it involved anonymization of the remaining respiratory tract samples, which were not related to human gene studies. These samples were collected for respiratory virus diagnosis and written informed consent was obtained from the patients, their parents, or legal guardians.

Specimen collection and virus detection

This study used nasal aspirate specimens and throat swab samples taken from patients enrolled in ARI-NET and KINRESS with acute respiratory illness who were diagnosed as HRSV-positive including co-infected samples with influenza virus, human rhinovirus, adenovirus, human coronavirus, human bocavirus, human enterovirus, parainfluenza virus or human metapneumovirus. ARI-NET used conventional reverse transcriptase (RT)-PCR (Solgent, Seoul, South Korea) to detect the HRSV-A and -B subgroups simultaneously. By contrast, KINRESS used an improved real-time one-step RT-PCR to distinguish the HRSV-A and -B subgroups (Kogen Bio, Seoul, South Korea) after July 2011, where viral RNA was extracted from 140 μL of each respiratory specimen using QIAamp Viral RNA Mini Kits (Qiagen GmbH, Hilden, Germany).

PCR amplification of the G gene

HRSV-positive clinical samples were subjected to amplification of the partial G gene using a G gene-specific primer set for sequence analysis, i.e., the forward primer G(151–173)F: CTGGCAATGATAATCTCAACTTC and reverse primer F(3–22)R: CAACTCCATTGTTATTTGCC (da Silva et al., 2008). The cDNA was prepared using the viral RNA extraction method employed by the routine respiratory virus test. The reaction mixture contained 5 μL of RNA, which was mixed with a final concentration of 10 mM dNTPs, 20 μM random primer, 1× RT buffer, 200 U of Superscript III reverse transcriptase (Invitrogen, CA, USA), 40 U of RNase-OUT RNase Inhibitor (Invitrogen), 25 mM MgCl2, and 0.1 mM dithiothreitol (DTT), and RNase-free water was added to make a final volume of 20 μL. The mixture was then incubated at 25 °C for 5 min, 50 °C for 60 min, and 72 °C for 5 min to terminate cDNA synthesis. Next, 5 μL of cDNA was added to a PCR mixture containing 1 μL of SP-Taq DNA polymerase (2.5 U/μL) (Cosmo Genetech, Seoul, South Korea), 36 μL of distilled water, 5 μL of 10× PCR buffer, 1 μL of 10 mM dNTPs, and 1 μL each of the forward and reverse primers (both 20 μM) for the G gene. Primary denaturation was conducted at 95 °C for 10 min, which was followed by 35 cycles of PCR where each cycle comprised denaturation for 40 s at 95 °C, annealing for 40 s at 54 °C, and elongation for 1 min at 72 °C, with a final extension cycle of 5 min at 72 °C. The PCR products were separated by electrophoresis using 1% agarose gel and visualized using 1× SYBR Safe DNA Gel Stain (Invitrogen).
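The PCR set-up described above can be checked with simple arithmetic. The sketch below sums the listed component volumes and derives final concentrations from C1·V1 = C2·V2. The dictionary keys and function names are hypothetical labels introduced for illustration only, and the scaling helper assumes no pipetting overage.

```python
# Component volumes (in microliters) for one PCR reaction, as listed in the text.
# The key names are illustrative labels, not identifiers from the original protocol.
PCR_MIX_UL = {
    "SP-Taq DNA polymerase (2.5 U/uL)": 1.0,
    "distilled water": 36.0,
    "10x PCR buffer": 5.0,
    "10 mM dNTPs": 1.0,
    "forward primer (20 uM)": 1.0,
    "reverse primer (20 uM)": 1.0,
    "cDNA template": 5.0,
}


def total_volume_ul(mix):
    """Sum of all component volumes in one reaction."""
    return sum(mix.values())


def final_concentration(stock, stock_volume_ul, total_ul):
    """Dilution rule C1 * V1 = C2 * V2, solved for C2 (same unit as `stock`)."""
    return stock * stock_volume_ul / total_ul


def master_mix(mix, n_reactions, template_key="cDNA template"):
    """Scale every non-template component for n reactions (assumes no overage)."""
    return {k: v * n_reactions for k, v in mix.items() if k != template_key}


total = total_volume_ul(PCR_MIX_UL)                 # total reaction volume in uL
primer_um = final_concentration(20.0, 1.0, total)   # final primer concentration in uM
buffer_x = final_concentration(10.0, 5.0, total)    # final buffer strength in "x"
dntp_mm = final_concentration(10.0, 1.0, total)     # final dNTP concentration in mM
```

Under these assumptions the components add up to a 50 μL reaction with each primer at 0.4 μM and the buffer at 1×.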

Nucleotide sequencing and phylogenetic analysis

The amplified PCR products were sequenced bidirectionally with the same primers used for PCR amplification, using an ABI 3730xl DNA analyzer (Applied Biosystems, Foster City, CA, USA). The sequences were edited with SeqMan Pro in the Lasergene 8 software suite (version 8.0; DNAStar, Madison, WI, USA) and aligned using MEGA 5 (ver. 5.05). The 717-nt fragments of the G gene, from nt position 181 to the stop codon based on the A2 strain, were used for further analysis. To minimize potential biases in the alignment and to obtain a more tractable representation of the dataset, identical sequences were removed using CD-HIT before performing the alignments (Huang et al., 2010). After clustering sequences with an amino acid sequence similarity of >0.98 into one cluster, one sequence was selected from each cluster, and sequences with no identical counterpart were also removed. The final representative dataset comprised 18 sequences, which were submitted to GenBank and assigned accession numbers AB860223–AB860240. To obtain a comprehensive representation of the diverse HRSV-A subgroup, we downloaded 15 publicly available HRSV sequences: WUE/14576/12 (JX912357), NG-016-04 (AB470478), RSV A2 (M74568), Chiba-C/24031 (AB698559), ON138-0111A (JN257694), MO48 (AF233914), SEL/02/98072 (AF193325), CN2395 (AF233905), CH09 (AF065254), NG-009-02 (AB175815), NY20 (AF233918), MO02 (AF233910), NG-082-05 (AB470479), SA98V603 (AF348807), and WUE/16397/12 (JX912364) (Prifert et al., 2013). In total, 33 sequences were used in the subsequent analysis, which comprised 18 representative sequences and 15 reference sequences. The phylogenetic trees were constructed using MUSCLE and MEGA5 (Tamura et al., 2011). The maximum composite likelihood model for nucleotide sequences and the JTT model for amino acid sequences were used with the neighbor-joining (NJ) method to perform the distance calculations. The trees generated were visualized and edited using EvolView (Zhang et al., 2012).
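The idea behind the clustering step can be illustrated with a simplified greedy scheme. This is only a sketch and not a reimplementation of CD-HIT: it assumes the sequences are already aligned to equal length, measures similarity as the fraction of identical positions, and uses hypothetical function names and toy sequences. The `drop_singletons` option reflects one possible reading of the removal of sequences with no identical counterpart.

```python
def identity(a, b):
    """Fraction of identical positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


def greedy_cluster(seqs, threshold=0.98, drop_singletons=False):
    """Assign each sequence to the first cluster whose representative it matches.

    seqs: dict mapping a sequence name to its aligned amino acid sequence.
    Returns a list of (representative_name, member_names) tuples.
    """
    clusters = []
    for name, seq in seqs.items():
        for rep, members in clusters:
            if identity(seqs[rep], seq) > threshold:
                members.append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append((name, [name]))
    if drop_singletons:
        clusters = [c for c in clusters if len(c[1]) > 1]
    return clusters


# Toy, made-up sequences (not real G protein data).
toy = {
    "s1": "A" * 100,
    "s2": "A" * 99 + "G",       # 99% identical to s1
    "s3": "A" * 95 + "G" * 5,   # 95% identical to s1
}
clusters = greedy_cluster(toy)
kept = greedy_cluster(toy, drop_singletons=True)
```

Plotting `len(members)` for each representative corresponds to the cluster-size bars drawn next to the phylogenetic tree.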

Selective pressure analysis

To investigate the selective pressure, we used a dataset that comprised 18 C-terminal regions (secondary hypervariable region) of the G genes and NG-016-04 (GenBank accession No. AB470478) as a reference sequence. In this analysis, we performed a multiple sequence alignment and generated a phylogenetic tree using CLUSTALW and MEGA5. The nucleotide frequencies in the codon positions were assumed based on unequal codon frequencies. The maximum-likelihood (ML) method was used to analyze the selection pressure with the CODEML program in the phylogenetic analysis by ML package (PAML, http://abacus.gene.ucl.ac.uk/software/paml.html) (Yang, 2007). CODEML was used to estimate the ratio of nonsynonymous (dN) to synonymous (dS) codon changes per site. Positive selection was defined as dN > dS (ω ratio >1). Different codon substitution models were tested in this study, i.e., M1 and M7 (neutral), and M2 and M8 (positive). The likelihood ratio test statistics were calculated as twice the difference between the log-likelihood values (2ΔL) of the nested models and were compared against the χ2 distribution (two degrees of freedom).
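The likelihood ratio test described above can be reproduced by hand from the log-likelihood values reported in Table 3. The sketch below is a minimal illustration with hypothetical function names. It relies on the fact that the survival function of a χ2 distribution with two degrees of freedom has the closed form exp(−x/2), so no statistics library is needed.

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Likelihood ratio test statistic: twice the log-likelihood difference."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf_df2(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return math.exp(-x / 2.0) if x > 0 else 1.0


# Log-likelihood values taken from Table 3.
lnl_m1, lnl_m2 = -818.392077, -814.034775
lnl_m7, lnl_m8 = -817.440963, -814.035088

stat_m1_m2 = lrt_statistic(lnl_m1, lnl_m2)   # M1 (neutral) vs M2 (positive selection)
stat_m7_m8 = lrt_statistic(lnl_m7, lnl_m8)   # M7 (neutral) vs M8 (positive selection)

p_m1_m2 = chi2_sf_df2(stat_m1_m2)
p_m7_m8 = chi2_sf_df2(stat_m7_m8)

print(f"M1 vs M2: 2dL = {stat_m1_m2:.2f}, p = {p_m1_m2:.4f}")
print(f"M7 vs M8: 2dL = {stat_m7_m8:.2f}, p = {p_m7_m8:.4f}")
```

Both statistics round to the values listed in the model comparison column of Table 3 (8.71 and 6.81), and both p-values fall below 0.05.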

Evolution estimation using Bayesian skyline plot analysis

This analysis used the partial G gene sequences of HRSV samples isolated in the present study between 2008 and 2013 (n = 456) and additional sequences (n = 48) reported from South Korea during 1991–2010 (Baek et al., 2012, Choi and Lee, 2000). After removing 100% identical sequences using CD-HIT (Huang et al., 2010), the population dynamics of HRSV were estimated over time using a Bayesian Markov chain Monte Carlo (MCMC) approach in BEAST version 1.7, which included the date of virus sampling (Drummond et al., 2012). The dataset was analyzed with an uncorrelated log-normal relaxed clock using the general time-reversible substitution model selected by jModelTest version 2.0 (Posada, 2008). The MCMC chain was run for 200 million steps to achieve convergence, with sampling every 10,000 steps. Convergence was assessed based on the effective sample size (ESS) after a 10% burn-in using Tracer version 1.5 (http://beast.bio.ed.ac.uk/Tracer) and only parameters with ESS > 200 were accepted. The uncertainties of the estimates were indicated by the 95% highest posterior density intervals. The final tree was visualized and edited with FigTree version 1.3.1 (http://tree.bio.ed.ac.uk/software/figtree).
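To make the sampling scheme and the ESS criterion concrete, the sketch below counts the samples retained after burn-in and estimates an ESS from the autocorrelation of a trace. It is an illustration only: the function names are hypothetical, the count ignores the initial state of the chain, the autocorrelation sum is truncated at the first non-positive lag, and this simple estimator is not claimed to match the one implemented in Tracer. The two traces are simulated toy data, not BEAST output.

```python
import random


def n_retained(chain_length, sample_every, burn_in_fraction):
    """Number of logged samples kept after discarding the burn-in."""
    n_samples = chain_length // sample_every
    return n_samples - int(n_samples * burn_in_fraction)


def effective_sample_size(trace):
    """ESS = n / (1 + 2 * sum of positive-lag autocorrelations).

    The sum stops at the first lag whose autocorrelation is <= 0.
    """
    n = len(trace)
    mean = sum(trace) / n
    centered = [x - mean for x in trace]
    var = sum(c * c for c in centered) / n
    if var == 0:
        return float(n)
    rho_sum = 0.0
    for lag in range(1, n):
        rho = sum(centered[i] * centered[i + lag] for i in range(n - lag)) / (n * var)
        if rho <= 0:
            break
        rho_sum += rho
    return n / (1.0 + 2.0 * rho_sum)


retained = n_retained(200_000_000, 10_000, 0.10)

# Toy traces: independent draws versus a strongly autocorrelated AR(1) series.
rng = random.Random(0)
iid_trace = [rng.gauss(0, 1) for _ in range(2000)]
ar_trace = [0.0]
for _ in range(1999):
    ar_trace.append(0.9 * ar_trace[-1] + rng.gauss(0, 1))

ess_iid = effective_sample_size(iid_trace)
ess_ar = effective_sample_size(ar_trace)
```

A chain of 200 million steps logged every 10,000 steps yields 20,000 samples, of which 18,000 remain after a 10% burn-in. The autocorrelated toy trace has a much smaller ESS than the independent one, which is why a threshold such as ESS > 200 is used to judge whether a parameter has been sampled adequately.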

Epitope prediction

We predicted the epitopes for three HRSV-A G gene sequences using seven prediction tools, i.e., BepiPred (Larsen et al., 2006), LBtope (Singh et al., 2013), BCPRED and FBCPRED (El-Manzalawy et al., 2008), Antigenic (Rice et al., 2000), LEPS (Wang et al., 2011b), and Epitopia (Rubinstein et al., 2009). These tools were run with their default parameter values, and the common epitopes predicted by four or more tools with ⩾10 consecutive amino acids were selected.

Results

General characteristics of HRSV subgroups and the distributions of genotypes

We collected 36,404 samples during 2010–2013 via KINRESS, a nationwide surveillance system for outpatients with acute respiratory illness that covers about 100 hospitals located throughout Korea and includes all ages. Of these, 18,112 samples were positive for respiratory viral infections and 1,643 samples (9.1%) had positive results for HRSV. The mean age of the HRSV-positive patients was 3.85 years. The female ratio and co-infection rate were 46.3% and 21.9%, respectively, and the corresponding data for each period are summarized in Table 1. During the 2010–2011 season, 234 samples (4.58% of respiratory virus-positive patients) were HRSV-positive. During the next two seasons, however, the HRSV-positive ratio increased and 643 (8.41%) and 766 (14.30%) cases were detected during the 2011–2012 and 2012–2013 seasons, respectively, as shown in Fig. 1A. There were no specific differences in the gender distribution, mean age, or co-infection rate during these three consecutive seasons.

Table 1

Subject data in this study.

Period		2010.5.-2011.4.	2011.5.-2012.4.	2012.5.-2013.2.	2010.5.-2013.2.
Patients enrolled in KINRESS	Positive for respiratory infections/ enrolled total no.	5,108/11,300	7,647/14,788	5,357/10,316	18,112/36,404

HRSV positive patients (n = 1,643)	Total no. (%)	234 (4.58)	643 (8.41)	766 (14.30)	1,643 (9.1)
	F/M	113/121	293/350	355/409⁎	761/880
	Female ratio	48.3	45.6	46.5	46.3
	Mean age	4.19 year	3.71 year	3.87 year	3.85 year
	Co-infections (%)	38 (16.2)	156 (24.3)	166 (21.7)	360 (21.9)

KINRESS, Korea Influenza and Respiratory Viruses Surveillance System; HRSV, human respiratory syncytial virus.

⁎ Two samples have no record of gender.

Fig. 1

Prevalence and genotype distributions of HRSV-positive samples in South Korea during 2010–2013. (A) Prevalence of HRSV-positive samples, including the HRSV-A and -B subgroups, between May 2010 and February 2013. (B) Distributions of the HRSV-A and -B subgroups between October 2010 and February 2013. (C) Distributions of HRSV-A genotypes between July 2011 and February 2013.

To investigate the distributions of the subgroups, we randomly selected HRSV-positive samples obtained during the 2010–2011 season, which were analyzed by real-time RT-PCR because the RT-PCR methods used in that season could not distinguish subgroups HRSV-A and -B. We tested 172 samples, and only 20 samples (11.6%) belonged to the HRSV-A subgroup. In 2011–2012, the prevalence of the HRSV-A subgroup had increased greatly and over 98% of the samples were found to be HRSV-A, while the prevalence of the HRSV-A subgroup was also high in the following season (89.1%) (Fig. 1B and Table 2).

Table 2

General information and frequencies of HRSV subgroup A and B over 3 consecutive seasons in South Korea.

Period	2010.5.–2011.4.	2011.5.–2012.4.	2012.5.–2013.2.
Total no. of HRSV positive	234	643	766

Sample no. of subgroup determined	172*	637	764
HRSV-A	20	627	681
HRSV-B	152	10	83
Ratio of A subgroup (%)	11.6	98.4	89.1

Sample no. of genotype† determined in HRSV-A	1	167	258
GA5	0	5	0
NA1	1	133	14
ON1	0	29	244
Ratio of ON1 (%)	0	17.4	94.6

HRSV, human respiratory syncytial virus.

* Before July 2011, the A/B subgroup was determined for randomly selected samples.

† From October 2008 to April 2009, nine GA5 and 21 NA1 genotype strains were identified among 30 randomly selected samples.

Eshaghi et al. (2012) reported the discovery of a novel genotype in Canada with a 72-nt duplication in the G gene during the winter season in 2010–2011 and we also found ON1 genotype strains in South Korea, for which we reported the whole-genome sequences (Lee et al., 2012). Thus, extensive genotype and sequence analyses were performed using the HRSV-A subgroup to further study the prevalence of ON1 genotype strains. In total, we investigated 426 HRSV-A samples obtained during 2010–2013, where we analyzed the G gene sequences to determine whether a 72-nt duplication was present or absent. According to the broad sequence analysis, the first ON1 genotype strains were discovered in August 2011. In the season from May 2011 to April 2012, 29 samples (17.4%) had the ON1 genotype in South Korea among 167 HRSV-A samples tested. The major genotype in that season was NA1 (133 samples, 79.6%) and the GA5 genotype was also identified (five samples, 3.0%). These genotype distributions were similar to the results reported by the Canadian group (Eshaghi et al., 2012). In the next season (2012–2013), however, the prevalence of the ON1 genotypes increased significantly to 94.6% (244/258 HRSV-A-positive samples) and the GA5 genotype was not detected (Table 2 and Fig. 1C). NA1 genotype strains were still detected but only at a very low frequency (14 samples, 5.4%). Our analysis of this newly emerged strain showed that the rapid replacement of the non-ON1 genotypes without a 72-nt duplication in the G gene by the ON1 genotype was dramatic compared with the earlier spread of the HRSV BA genotype.
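The percentages quoted above follow directly from the counts in Table 2. The short sketch below recomputes them as a consistency check. The function and variable names are hypothetical, and the counts are copied from the table.

```python
def percent(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimal places."""
    return round(100.0 * part / whole, digits)


# HRSV-A genotype counts per season, copied from Table 2.
GENOTYPE_COUNTS = {
    "2011-2012": {"GA5": 5, "NA1": 133, "ON1": 29},
    "2012-2013": {"GA5": 0, "NA1": 14, "ON1": 244},
}

# (HRSV-A count, number of subgroup-determined samples) per season, from Table 2.
SUBGROUP_COUNTS = {
    "2010-2011": (20, 172),
    "2011-2012": (627, 637),
    "2012-2013": (681, 764),
}


def genotype_shares(counts):
    """Share of each genotype among the genotyped HRSV-A samples of one season."""
    total = sum(counts.values())
    return {genotype: percent(n, total) for genotype, n in counts.items()}


shares = {season: genotype_shares(c) for season, c in GENOTYPE_COUNTS.items()}
hrsv_a_ratio = {season: percent(a, n) for season, (a, n) in SUBGROUP_COUNTS.items()}
```

Here `shares["2011-2012"]["ON1"]` and `shares["2012-2013"]["ON1"]` reproduce the 17.4% and 94.6% figures, and `hrsv_a_ratio` reproduces the subgroup A ratios of 11.6%, 98.4%, and 89.1%.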

Phylogenetic analysis and alignment

In a further analysis of the 426 HRSV-A sequences obtained during 2010–2013, we added 30 unpublished HRSV-A sequences collected between October 2008 and April 2009. A clustering approach was applied (as described in the Materials and methods section) to remove any redundant sequences and to make the data clear and manageable, while ensuring that the data still represented the diversity of HRSV. The numbers of sequences that belonged to the same clusters (amino acid sequence similarity >0.98) were plotted next to the phylogenetic tree. Finally, 18 isolates and 15 reference sequences were analyzed and the clusters in the phylogenetic tree generated four major groups: ON1, NA1, GA5, and others, including the reference sequences. The phylogenies based on the nucleotide and amino acid sequences (Fig. 2A and B) were in almost complete agreement in terms of topology. The branching patterns differed slightly between groups, but the clade compositions were stable within the groups.

Fig. 2

Phylogenetic analysis of the HRSV-A subgroup using partial G sequences based on (A) nucleotide and (B) amino acid sequences. Partial G protein sequences were used to study the phylogenetic relationships between HRSV-A subgroups. The nucleotide and protein sequences were aligned using the MUSCLE algorithm and the phylogenetic trees were generated by the neighbor-joining method with MEGA5 based on 1000 bootstrap replicates. The length of the square bar indicates the number of sequences that belong to the same cluster, while the color indicates the year the sample was isolated. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

However, several sequences (n = 8), which were represented by the GG818/12 sequence, had an apparently different branching structure, although there was low statistical support (bootstrap) for how most of these branches were related. Based on the alignment analysis, these clusters had a 1-nt deletion at nt position 899 in the 72-nt duplicated region. This deletion caused a frame shift from amino acid position 300 in the C-terminal region to the stop codon located at amino acid position 325 (Fig. 3). We considered that the GG818/12 clusters represented a subgenotype, which we designated as ON1_OS.
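The link between the deleted nucleotide and the first affected amino acid is a matter of integer arithmetic. The sketch below assumes that nucleotide positions are numbered from the first base of the G open reading frame. The function names and the 18-nt toy sequence are hypothetical and are not taken from the HRSV genome.

```python
def codon_index(nt_position):
    """1-based codon number that contains a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1


def delete_nt(seq, nt_position):
    """Return seq with the nucleotide at the given 1-based position removed."""
    return seq[:nt_position - 1] + seq[nt_position:]


def codons(seq):
    """Split a sequence into complete codons, ignoring any trailing 1-2 bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_changed_codon(original, mutated):
    """1-based index of the first codon that differs, or None if none differ."""
    for i, (a, b) in enumerate(zip(codons(original), codons(mutated)), start=1):
        if a != b:
            return i
    return None


# Position 899 falls in codon 300, matching the reported start of the frame shift.
shift_start = codon_index(899)

# Toy open reading frame (made up) to show how a 1-nt deletion shifts the frame.
toy_orf = "ATGGCTAAAGGTCCTTGA"
toy_mutant = delete_nt(toy_orf, 8)
toy_first_change = first_changed_codon(toy_orf, toy_mutant)
```

In the toy example every codon from the third onward changes, including the original stop codon, which is the same kind of downstream change described for the ON1_OS sequences.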

Fig. 3

Amino acid alignment of G protein sequences from amino acid position 212 to the C-terminal end. Positively selected sites according to the M8 model are marked as closed black circles.

In addition, the first ON1 sequences identified, i.e., ON67-1210A/2010 from Canada (Eshaghi et al., 2012) and US469/11 in the present study, had a duplicated region identical to the template. However, the L298P and Y305H substitutions found in other ON1 clusters were the same as those reported from Japan by Tsukagoshi et al. (2013).

Selective pressure on the amino acid sequence of attachment G protein

The deduced amino acid sequences contained various substitutions in the secondary variable region of the G protein. In total, 18 representative sequences and an NA1 genotype reference sequence (NG-016-04, AB470478) were subjected to selection pressure analysis, as described in the Materials and methods section. The dN/dS ratio (nonsynonymous to synonymous codon changes per site) averaged over all sites ranged from 0.669 to 1.185 with different codon substitution models. Models M2 and M8, which allowed for positive selection, fitted our dataset better than the two neutral models (M1 and M7). Using the NG-016-04 strain as an outgroup, 22 and 24 sites were found to be under positive selection by the M2 and M8 models, respectively, and these positively selected sites are shown in Fig. 3 and Table 3. The empirical Bayes analysis showed that 3 of the 24 positively selected sites (positions 237, 274, and 322) were supported at the 95% level in the M8 model. Previously, positions 225, 256, 274, and 311 were reported to be flip-flop sites that tend to revert to a previous state over time (Botosso et al., 2009). At these sites, we found that the substitutions V225A, P256L, and L274P/T were forward replacements, whereas L311H/P was a backward replacement.

Table 3

Parameter estimates, dN/dS, values of log-Likelihood (l), positive selection sites, and Likelihood Ratio Tests (LRT) in the G gene analysis of HRSV-A isolated in South Korea between 2008 and 2013.

Model	Parameter estimates	dN/dS	Log-likelihood (l)	Positively selected sites⁎	Model comparison
M1	w0 = 0.0	0.6688	−818.392077		8.71
	w1 = 1.0				d.f. = 2, p < 0.05
	p0 = 0.33115
	p1 = 0.66885

M2†	w0 = 0.15815	1.1852	−814.034775	V225A, E232G/D, D237N/Y, L247M, S250N, N251Y/S, T253K/I, P256L, E262R, L274P/T, Y280H, E309T, L311H/P, S312N, Q313L, S314Y/P, S316H, S317P, N319Q, T320Q/I, T321N/S, K322D
	w1 = 1.0
	w2 = 2.31504
	p0 = 0.52384
	p1 = 0.0
	p2 = 0.47616

M7	p = 0.01165	0.70	−817.440963		6.81
	q = 0.00500				d.f. = 2, p < 0.05

M8‡	p0 = 0.52384	1.1849	−814.035088	V225A, E232G/D, D237N/Y, L247M, S250N, N251Y/S, T253K/I, P256L, N260S, E262R, N273Y, L274P/T, Y280H, E309T, L311H/P, S312N, Q313L, S314Y/P, S316H, S317P, N319Q, T320Q/I, T321N/S, K322D
	p1 = 0.47616
	p = 18.69927
	q = 99.0
	w = 2.31393

HRSV, human respiratory syncytial virus.

⁎ Sites were numbered based on the ON1 sequence (ON67-1210A, GenBank accession No. JN257693).

† Posterior probability of positively selected sites with the M2 model: 50% to 74% (232, 317, 225, 253, 280, 319, 256, 250, 313, 251, 247, 321, 312); 75% to 84% (316, 314, 262, 311, 320); 85% to 94% (322, 309, 274, 237); and >95% (none).

‡ Posterior probability of positively selected sites with the M8 model: 50% to 74% (260, 273, 232, 280, 317, 225, 253, 319, 256, 247, 250, 313, 251); 75% to 84% (none); 85% to 94% (321, 312, 316, 314, 262, 311, 320, 309); and >95% (322, 274, 237).

A frame shift occurred from amino acid position 300 because of the 1-nt deletion, but the results of the analysis showed that there was high pressure for positive selection from amino acid position 309 to the stop codon.

Evolutionary dynamics of the G gene

To estimate the dynamics of the nucleotide substitutions and variation in HRSV-A, all 456 of the HRSV-A sequences obtained from this study during 2008–2013 and 48 previously published sequences from South Korea during 1991–2010 (Baek et al., 2012, Choi and Lee, 2000) were analyzed using the Bayesian skyline plot method after removing identical sequences (Fig. 4). The calculated mean evolutionary rate was 3.275 × 10⁻³ nucleotide substitutions/site/year, which was faster than the whole-genome evolutionary rate of 6.47 × 10⁻⁴ for HRSV-A (Tan et al., 2013) but slower than the rate of 4.7 × 10⁻³ nucleotide substitutions/site/year in the HRSV-B G gene during the 10 years since the discovery of the 60-nt duplicated BA genotypes (Trento et al., 2010). A previous study reported a similar evolutionary rate for the HRSV-A G gene of 3.57 × 10⁻³ nucleotide substitutions/site/year (Tsukagoshi et al., 2013).

Fig. 4

Genealogies and the corresponding Bayesian skyline plot showing the demographic history of HRSV sequences, which are drawn using the same time scale. The y-axis of the skyline plot (lower panel) represents the population size, shown as the product of the effective population size (Ne) and the generation time (τ). The black line represents the median estimate and the area between the blue and sky-blue lines shows the 95% highest posterior density limits. The maximum clade credibility (MCC) tree (upper panel) is represented on the same time scale as the skyline plot. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The time course analysis showed that the genetic diversity remained steady for about 20 years, including the period when the NA1 genotype emerged in the late 1990s, consistent with the analysis of Kushibuchi et al. in Japan (Kushibuchi et al., 2013). The relative genetic diversity increased around 2006. Following that period, a limited fluctuation was observed in 2010, with a subsequent dramatic increase in the effective virus population size after 2011. The specific selective pressure applied to the virus in 2006 remains to be clarified, but this increasing trend in the relative genetic diversity of HRSV was followed by the initial appearance of the HRSV-A ON1 strain. This analysis also suggests that the ON1 genotype emerged in 2009, although the first ON1 case was discovered in 2011. The Bayesian skyline plot also demonstrated that the growth phase of the virus population size agreed with the rapid emergence of the HRSV-A ON1 strain.

The G protein is the major antigenic protein expressed on the surface of HRSV, so we assumed that the 72-nt duplication, which caused a 23-amino-acid duplication in the secondary hypervariable region of the ectodomain, may have affected the antigenicity, antigen recognition by antibodies, or virulence.
Thus, we performed B-cell epitope prediction using the representative prototype A2 (GenBank accession No. M74568), the ON1 genotype GN435/11 (GenBank accession No. JX627336) (Lee et al., 2012), and the ON1_OS subgenotype GG818/12 sequences (Table 4). For each sequence, 5–7 antigenic peptides were identified by computational prediction. The 121–131, 133–146, 188–203, 221–231, and 272–284 epitope positions were determined for all three sequences. The predicted epitopes were similar but the 287–299 epitope located in the duplicated region was predicted only in the GG818/12 sequence.

Table 4

Epitope prediction results of novel ON1 types (GN435/11, GG818/12) and prototype A2 strains. Bold letter means common epitope regions in three strains.

Sequences	Position	Predicted epitope regions
A2 (M74568)	85–102	NTTPTYLTQNPQLGISPS
	121–131	GVKSTLQSTTVKT
	133–152	TKNTTTTQTQPSKPTTKQRQ
	188–231	RIPNKKPGKKTTTKPTKKPTLKTTKKDPKPQTTKSKEVPTTKPT
	266–291	HSTSSEGNPSPSQVSTTSEYPSQPSS
GN435/11 (JX627336)	115–146	LASTTPSAESTPQSTTVKIKNTTTTQILPSKP
	188–203	RIPNKKPGKKTTTKPT
	221–242	KPKEVLTTKPTGKPTINTTKTN
	251–265	NTKGNPEHTSQEETL
	272–284	GYLSPSQVYTTSG
GG818/12 (AB860239)	81–94	NQIKNTTPTYLTQS
	115–152	LASTTPSAESTPQSTTVKIKNTTTTQILPSKPTTKQRQ
	188–212	RIPNKKPGKKTTTKPTKKPTLKTTK
	221–242	KPKEVLTTKPTGKPTINTTKTN
	251–265	NTKGNPEHTSQEETL
	272–284	GYLSPSQVYTTSG
	287–299	ETLHSTTSEGYLS

Discussion

In this study, we analyzed the distribution in South Korea of the novel HRSV ON1 genotype, which has a 72-nt duplicated genetic region, and its rapid replacement of the non-duplicated strains. Our results provide up-to-date sequence information on the prevalent HRSV genotypes in South Korea and insights into HRSV evolution. In South Korea, the first ON1 genotype was discovered in August 2011, which was the season after Eshaghi et al. identified the novel ON1 genotype in Canada (2010–2011) (Eshaghi et al., 2012). However, the time-scaled evolutionary analysis estimated that the ON1 genotype emerged in 2009, and it became the predominant strain in South Korea in 2012–2013. Studies of HRSV-B show that the prevalence of the 60-nt duplicated BA genotype has fluctuated over the last 10 years but all of the HRSV-B subgroups now belong to the BA genotype (Baek et al., 2012, Trento et al., 2010, van Niekert and Venter, 2011). Further long-term molecular epidemiological studies are required during consecutive seasons to determine whether the ON1 genotype will replace the non-ON1 genotypes completely. In addition to the introduction of a 72-nt duplication, the C-terminal region of the G gene has been the target of various amino acid sequence changes via substitutions and deletion. We found L298P and Y305H substitutions, as well as a 1-nt deletion that caused a frame shift from amino acid position 300 and resulted in a G protein two amino acids longer than that in previously reported ON1 strains (Munywoki et al., 2013, Auksornkitti et al., 2013, Eshaghi et al., 2012, Lee et al., 2012, Prifert et al., 2013). The nucleotide substitution in the duplicated region was also found in Japan (Tsukagoshi et al., 2013). However, this is the first report of the 1-nt deletion in the duplicated region, and we designated this as the ON1_OS subgenotype in the present study.
This ON1_OS strain was isolated from eight patients who were enrolled between October 2012 and January 2013 in five cities in South Korea (Fig. 1C). This finding suggests that these strains are replicating and spreading. The replacement of a 25-amino-acid region at the C-terminus with a novel sequence is a substantial change, and because the structure of the G protein is still unknown, it is possible that the 72-nt duplication or other variations caused structural changes. In vitro and in vivo functional studies of this novel G protein, examining attachment to host cells and interactions with the immune response, are needed to understand the effects of this major change. Previous reports have shown that a major HRSV type dominates in a given season, although two or more different genotypes cocirculate at different levels (Trento et al., 2010). A similar prevalence pattern was observed in the present study. For example, HRSV-A was not the leading cause of infection among patients with acute respiratory illness during the 2010–2011 season, but it was prevalent in the following season of 2011–2012, which might be related to the emergence of the novel HRSV-A genotype in the human population. HRSV-A remained prevalent in 2012–2013, but we found that the prevalent subgroup in the 2013–2014 season was HRSV-B in South Korea (data not shown). The G protein is the major antigen of HRSV and is known to be highly antigenic with high genetic diversity, which may be related to frequent reinfection with HRSV (Johnson et al., 1987, Sullender, 2000, Parveen et al., 2006). Variation and positive selection for amino acid changes are concentrated in two hypervariable regions, and the C-terminal third of the G protein contains multiple epitopes (Melero et al., 1997). Immune escape by new variants may contribute to the diversity and prevalence of HRSV.
Combinations of novel substitutions and flip-flop changes in amino acids are involved in various epitope transitions, which reflect the immune status of the human population. Our epitope prediction analysis suggested the presence of a new epitope in the GG818/12 strain, in which a 1-nt deletion occurred after the 72-nt duplication. We also found that several positively selected sites, i.e., amino acids 237, 274, and 280, were located in common epitopes with a high probability, which also supports this hypothesis. Furthermore, protective immunity mediated by neutralizing antibodies, which is important for host defense against HRSV, is known to be of short duration (Sande et al., 2013). Immunological evidence is still required to clarify this phenomenon, but periodic shifts between the prevalent HRSV-A and -B subgroups, together with variations in the G protein and short-lived immunity, might be accompanied by changes in the herd immune status of human populations with respect to specific genotypes (Collins and Graham, 2008). We used Bayesian skyline plots to infer the relative genetic diversity based on 318 HRSV-A G genes collected between 1991 and 2013, which may reflect the general trend. Previous reports indicate that the rapid substitution rate of the G gene, compared with other genes or the entire HRSV genome, has contributed to its high evolutionary rate (Tan et al., 2013). Most recently, Balmaks et al. (2013) also described the population dynamics of the ON1 genotype based on limited sequence information and showed that the effective population size of the ON1 genotype expanded slowly and even decreased before the beginning of the 2012–2013 season. In contrast, our data, which encompass HRSV-A G genes collected in 2013, strongly indicate that the relative genetic diversity patterns of the HRSV-A G genes were significantly correlated with the emergence and prevalence of the HRSV-A ON1 genotype, which emerged rapidly and has persisted as the predominant strain.
In summary, our large-scale analysis of the HRSV-A G gene indicates that the emergence of the HRSV-A ON1 genotype in South Korea was correlated with an increase in genetic diversity. It remains to be clarified whether changes in the antigenicity of the G protein and/or substantial changes in the herd immune status have contributed to the rapid dominance of HRSV-A ON1. Furthermore, it will be crucial to understand the factor(s) that determine the viral phenotype, the fitness of HRSV-ON1 and its variants such as the HRSV-ON1_OS subgenotype, and its transmission or complete replacement of other genotypes, as seen with the BA genotype of HRSV-B. Although our limited genetic analysis suggested that the emergence of the strain in South Korea might have been mediated by transmission from overseas (data not shown), cumulative time-scaled evolutionary information is still required to verify the global spread of the ON1 genotype precisely. Therefore, meticulous and continuous monitoring of the evolutionary trends in the G gene is essential to obtain insights that may facilitate vaccine development and the refinement of public health responses against HRSV infection.

Conclusion

We investigated the emergence of the new HRSV-A ON1 genotype, which has a 72-nt duplication in the G gene, and its rapid replacement of the non-duplicated strains over 3 years. The time-scaled evolutionary analysis supports a drastic increase in genetic diversity associated with the prevalence of the new genotype, which diverged around 2009. In addition, we predicted the epitopes of the duplicated G protein, which has an insertion of 23 amino acids, and the results suggest antigenic variation of the HRSV G protein.

52 in total

1. Ten years of global evolution of the human respiratory syncytial virus BA genotype with a 60-nucleotide duplication in the G protein gene.

Authors: Alfonsina Trento; Inmaculada Casas; Ana Calderón; Maria L Garcia-Garcia; Cristina Calvo; Pilar Perez-Breña; José A Melero
Journal: J Virol Date: 2010-05-26 Impact factor: 5.103

2. PAML 4: phylogenetic analysis by maximum likelihood.

Authors: Ziheng Yang
Journal: Mol Biol Evol Date: 2007-05-04 Impact factor: 16.240

3. Genetic diversity and molecular epidemiology of the G protein of subgroups A and B of respiratory syncytial viruses isolated over 9 consecutive epidemics in Korea.

Authors: E H Choi; H J Lee
Journal: J Infect Dis Date: 2000-05-15 Impact factor: 5.226

4. Complete genome sequence of human respiratory syncytial virus genotype A with a 72-nucleotide duplication in the attachment protein G gene.

Authors: Wan-Ji Lee; You-jin Kim; Dae-Won Kim; Han Saem Lee; Ho Yeon Lee; Kisoon Kim
Journal: J Virol Date: 2012-12 Impact factor: 5.103

5. Genetic diversity among respiratory syncytial viruses that have caused repeated infections in children from rural India.

Authors: Shama Parveen; Shobha Broor; Suresh Kumar Kapoor; Karen Fowler; Wayne M Sullender
Journal: J Med Virol Date: 2006-05 Impact factor: 2.327

Review 6. The path to an RSV vaccine.

Authors: Christine A Shaw; Max Ciarlet; Brian W Cooper; Lamberto Dionigi; Paula Keith; Karen B O'Brien; Maryam Rafie-Kolpin; Philip R Dormitzer
Journal: Curr Opin Virol Date: 2013-05-30 Impact factor: 7.090

7. Molecular epidemiology of human respiratory syncytial virus over three consecutive seasons in Latvia.

Authors: Reinis Balmaks; Irina Ribakova; Dace Gardovska; Andris Kazaks
Journal: J Med Virol Date: 2013-12-02 Impact factor: 2.327

8. Emerging genotypes of human respiratory syncytial virus subgroup A among patients in Japan.

Authors: Yugo Shobugawa; Reiko Saito; Yasuko Sano; Hassan Zaraket; Yasushi Suzuki; Akihiko Kumaki; Isolde Dapat; Taeko Oguma; Masahiro Yamaguchi; Hiroshi Suzuki
Journal: J Clin Microbiol Date: 2009-06-24 Impact factor: 5.948

9. Predicting linear B-cell epitopes using string kernels.

Authors: Yasser El-Manzalawy; Drena Dobbs; Vasant Honavar
Journal: J Mol Recognit Date: 2008 Jul-Aug Impact factor: 2.137

Review 10. Respiratory syncytial virus: current progress in vaccine development.

Authors: Rajeev Rudraraju; Bart G Jones; Robert Sealy; Sherri L Surman; Julia L Hurwitz
Journal: Viruses Date: 2013-02-05 Impact factor: 5.048

25 in total

1. Conservation of G-Protein Epitopes in Respiratory Syncytial Virus (Group A) Despite Broad Genetic Diversity: Is Antibody Selection Involved in Virus Evolution?

Authors: Alfonsina Trento; Leyda Ábrego; Rosa Rodriguez-Fernandez; Maria Isabel González-Sánchez; Felipe González-Martínez; Adriana Delfraro; Juan M Pascale; Juan Arbiza; José A Melero
Journal: J Virol Date: 2015-05-20 Impact factor: 5.103

2. Dominance of the ON1 Genotype of RSV-A and BA9 Genotype of RSV-B in Respiratory Cases from Jeddah, Saudi Arabia.

Authors: Hessa A Al-Sharif; Sherif A El-Kafrawy; Jehad M Yousef; Taha A Kumosani; Mohammad A Kamal; Norah A Khathlan; Reham M Kaki; Abeer A Alnajjar; Esam I Azhar
Journal: Genes (Basel) Date: 2020-11-09 Impact factor: 4.096

3. Prevalence and genetic characterisation of respiratory syncytial viruses circulating in Bulgaria during the 2014/15 and 2015/16 winter seasons.

Authors: Neli Korsun; Svetla Angelova; Iren Tzotcheva; Irina Georgieva; Snezhina Lazova; Snezhana Parina; Ivaylo Alexiev; Penka Perenovska
Journal: Pathog Glob Health Date: 2017-09-26 Impact factor: 2.894

4. Characteristics and Their Clinical Relevance of Respiratory Syncytial Virus Types and Genotypes Circulating in Northern Italy in Five Consecutive Winter Seasons.

Authors: Susanna Esposito; Antonio Piralla; Alberto Zampiero; Sonia Bianchini; Giada Di Pietro; Alessia Scala; Raffaella Pinzani; Emilio Fossali; Fausto Baldanti; Nicola Principi
Journal: PLoS One Date: 2015-06-05 Impact factor: 3.240

5. Genetic diversity and evolutionary insights of respiratory syncytial virus A ON1 genotype: global and local transmission dynamics.

Authors: Venkata R Duvvuri; Andrea Granados; Paul Rosenfeld; Justin Bahl; Alireza Eshaghi; Jonathan B Gubbay
Journal: Sci Rep Date: 2015-09-30 Impact factor: 4.379

6. Complete genome sequences of human respiratory syncytial virus genotype a and B isolates from South Korea.

Authors: Mi-Ran Yun; A-Reum Kim; Han Saem Lee; Dae-Won Kim; Wan-Ji Lee; Kisoon Kim; Sung Soon Kim; You-Jin Kim
Journal: Genome Announc Date: 2015-04-23

7. Novel respiratory syncytial virus (RSV) genotype ON1 predominates in Germany during winter season 2012-13.

Authors: Julia Tabatabai; Christiane Prifert; Johannes Pfeil; Jürgen Grulich-Henn; Paul Schnitzler
Journal: PLoS One Date: 2014-10-07 Impact factor: 3.240

8. Molecular and clinical characterization of human respiratory syncytial virus in South Korea between 2009 and 2014.

Authors: E Park; P H Park; J W Huh; H J Yun; H K Lee; M H Yoon; S Lee; G Ko
Journal: Epidemiol Infect Date: 2017-10-09 Impact factor: 4.434

9. Molecular Evolution of the Capsid Gene in Norovirus Genogroup I.

Authors: Miho Kobayashi; Shima Yoshizumi; Sayaka Kogawa; Tomoko Takahashi; Yo Ueki; Michiyo Shinohara; Fuminori Mizukoshi; Hiroyuki Tsukagoshi; Yoshiko Sasaki; Rieko Suzuki; Hideaki Shimizu; Akira Iwakiri; Nobuhiko Okabe; Komei Shirabe; Hiroto Shinomiya; Kunihisa Kozawa; Hideki Kusunoki; Akihide Ryo; Makoto Kuroda; Kazuhiko Katayama; Hirokazu Kimura
Journal: Sci Rep Date: 2015-09-04 Impact factor: 4.379

10. Molecular Characterization of Human Respiratory Syncytial Virus in the Philippines, 2012-2013.

Authors: Rungnapa Malasao; Michiko Okamoto; Natthawan Chaimongkol; Tadatsugu Imamura; Kentaro Tohma; Isolde Dapat; Clyde Dapat; Akira Suzuki; Mayuko Saito; Mariko Saito; Raita Tamaki; Gay Anne Granada Pedrera-Rico; Rapunzel Aniceto; Reynaldo Frederick Negosa Quicho; Edelwisa Segubre-Mercado; Socorro Lupisan; Hitoshi Oshitani
Journal: PLoS One Date: 2015-11-05 Impact factor: 3.240