Sequence variability of the respiratory syncytial virus (RSV) fusion gene among contemporary and historical genotypes of RSV/A and RSV/B.

Anne M Hause1,2, David M Henke3, Vasanthi Avadhanula1, Chad A Shaw3, Lorena I Tapia4,5, Pedro A Piedra1,6.   

Abstract

BACKGROUND: The fusion (F) protein of RSV is the major vaccine target. This protein undergoes a conformational change from pre-fusion to post-fusion. Both conformations share antigenic sites II and IV. Pre-fusion F has unique antigenic sites p27, ø, α2α3β3β4, and MPE8; whereas, post-fusion F has unique antigenic site I. Our objective was to determine the antigenic variability for RSV/A and RSV/B isolates from contemporary and historical genotypes compared to a historical RSV/A strain.
METHODS: The F sequences of isolates from GenBank, Houston, and Chile (N = 1,090) were used for this analysis. Sequences were compared pair-wise to a reference sequence, a historical RSV/A Long strain. Variability (calculated as %) was defined as changes at each amino acid (aa) position when compared to the reference sequence. Only aa at antigenic sites with variability ≥5% were reported.
RESULTS: A total of 1,090 sequences (822 RSV/A and 268 RSV/B) were analyzed. When compared to the reference F, those domains with the greatest number of non-synonymous changes included the signal peptide, p27, heptad repeat domain 2, antigenic site ø, and the transmembrane domain. The RSV/A subgroup had 7 aa changes in the antigenic sites: site I (N = 1), II (N = 1), p27 (N = 4), and α2α3β3β4(AM14) (N = 1), ranging in frequency from 7-91%. In comparison, RSV/B had 19 aa changes in antigenic sites: I (N = 3), II (N = 1), p27 (N = 9), ø (N = 4), α2α3β3β4(AM14) (N = 1), and MPE8 (N = 1), ranging in frequency from 79-100%.
DISCUSSION: Although antigenic sites of RSV F are generally well conserved, differences are observed when comparing the two subgroups to the reference RSV/A Long strain. Further, these discrepancies are accentuated in the antigenic sites in pre-fusion F of RSV/B isolates, often occurring with a frequency of 100%. This could be of importance if a monovalent F protein from the historical GA1 genotype of RSV/A is used for vaccine development.

Year:  2017        PMID: 28414749      PMCID: PMC5393888          DOI: 10.1371/journal.pone.0175792

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Background

Respiratory syncytial virus (RSV) is a major cause of lower respiratory tract illness (LRTI) among infants and young children and contributes significantly to morbidity and mortality in this age group. RSV is classified into two subgroups, RSV/A and RSV/B, based on variation in the attachment (G) gene. Viruses from both subgroups circulate, though usually one subgroup dominates a given RSV season [1]. The G protein and fusion (F) protein are the only two surface glycoproteins capable of inducing a neutralizing antibody response [2]. However, the F protein is far more conserved than the G protein and, for this reason, has been the major antigen of focus for RSV vaccine development [3]. There is currently no licensed vaccine against RSV; however, there is a large pipeline containing candidate vaccines that are in preclinical to late stages of development [4]. Most of these vaccines are monovalent and utilize the F protein or sequence isolated in the 1960s from an RSV/A virus belonging to the GA1 genotype. The RSV/A and RSV/B subgroups are further divided into genotypes based on variability in the distal third of the G gene, the hypervariable mucin-like domain [1,5]. During RSV season more than one genotype from the same RSV subgroup co-circulates within a community outbreak. The GA2 genotype has been the dominant genotype for RSV/A for nearly a decade. However, it is rapidly being replaced by the Ontario (ON1) genotype [6]. The Buenos Aires (BA) genotype has been the dominant genotype for RSV/B since 2005 [7]. Interestingly, both ON1 and BA have a unique duplication in the distal third of their G genes, 72 and 60 nucleotides respectively [7,8]. The F protein has been identified as having at least two dominant conformations: the pre-fusion and post-fusion F forms. The F protein’s pre-fusion conformation is metastable and readily rearranges into the stable post-fusion conformation [9]. 
Each of these conformations has been expressed as a protein crystal; however, modifications had to be made to stabilize the F protein, in particular for the pre-fusion conformation. Thus, it is possible that the crystallized pre-fusion protein does not represent the protein’s true form prior to virus-to-cell fusion. Both the pre-fusion and post-fusion conformations of the F protein are being explored as vaccine candidates [4,10,11]. These two conformations share some antigenic sites but also have their own antigenic sites. Two known antigenic sites (II and IV) are present in both pre- and post-fusion F [12,13]. Antigenic site II is the target of the therapeutic monoclonal antibody palivizumab. In addition, pre-fusion F has antigenic site ø, MPE-8, α2, α3, β3 & β4 (recognized by AM14), and p27 [9,14-16]; post-fusion F has the unique antigenic site I [17]. Although the F protein is generally thought to be well conserved, variability has been observed in some of the F domains: the signal peptide, transmembrane domain, not defined 2 site, and antigenic site ø [18].

In this report we examine the sequence variability of the F gene from a large bank of RSV sequences that span over 50 years. To better understand the impact this variability may have on vaccine development, we have focused on the antigenic sites of the pre-fusion and post-fusion F and used as our reference the F gene from a historical sequence. This reference gene belongs to the genotype GA1, which is often utilized in the development of RSV vaccines.

Materials and methods

Virus strains

In order to robustly represent and categorize the contemporary virus, previously sequenced and published RSV clinical isolates from the Department of Molecular Virology and Microbiology of Baylor College of Medicine, Houston, Texas (n = 118) and the Programa de Virología of Universidad de Chile (n = 102) were utilized in this study [18]. An additional 1017 RSV F gene sequences were obtained from the GenBank (www.ncbi.nlm.nih.gov/genbank/) database during October 2015 (S1 Table). GenBank sequences represent the publicly available sequence information for the RSV F gene provided by multiple study sites from 1961 through 2014. This data is not longitudinal surveillance information and should not be interpreted as such. All available RSV F gene sequences at the time of the download were considered in this study. When available, the corresponding G gene was also acquired and included. Additional information on the sequences, including date of sample collection and country of origin, was also obtained when such information was available.

Genotype assignment

To ensure the historical data’s validity for every sequence, an unstructured cluster analysis of viral subgroup was conducted. Based on pairwise similarity scoring between any two viral sequences, the Lance-Williams dissimilarity score was used to create major group populations within the entire population of viral sequences. This method of categorizing the viral subgrouping was conducted among all F sequences and the two major groupings of G sequences (those with the duplication of the distal third of the G gene and those without). Once two major groups (representing RSV/A and RSV/B) had been established, the cluster split was compared to a priori subgroup information. Only strains which grouped the same between the a priori (historical call) and unstructured genotype call were utilized. In addition, sequences deemed of poor fidelity were removed. Between these control steps, 147 F sequences were removed from the analysis (S1 Fig).

To understand different populations of RSV, similar viral genome sequences are categorized into genotypes. This assignment was preferentially performed on the virus’ G gene, then its F gene. As per convention, the distal third of the gene was utilized for genotype assignment [1]. This region of sequence was selected by multiply aligning all G gene sequences then removing the region from the 649th nucleotide to the 5’ end, with respect to the reference sequence. The remaining sequence represents ~27.7% of the gene. The surrounding sequence was seen to be relatively conserved between strains and provided a buffer before the insertion position seen in BA and ON genotypes. For those sequences without a corresponding G gene, only the F gene was used to provide genotype assignment as previously described [18]. By comparing every strain’s gene to every other’s, we were able to rank the similarity of strains independently of one another. Based on this ranking we were able to group previously unassigned viral strains to their most appropriate genotype according to the similarity scores.

Basic assumptions were applied before the similarity scores were assigned, creating informed, distinct groupings. These assumptions distinguish obvious viral classifications, i.e. subgroups A vs. B and those genotypes which contain duplications within the distal third of the G gene, genotypes BA and ON. Our similarity score ranking is based on the maximal pairwise similarity score of the multiply aligned genes (i.e. nearest neighbor method), encompassing both the non-genotype-labeled sequences and the previously genotype-labeled reference sequences. Pairwise alignment was conducted on DNA under a Smith-Waterman algorithm implemented using R version 3.0.1 (Biostrings package version 2.30.1). A substitution matrix of 2 and -2 for matches and mismatches, respectively, was used, along with a -6 and -0.2 penalty for gap openings and extensions, respectively. RSV subgroups were kept separate during genotype assignment. G gene scoring was conducted only for non-genotyped sequences not seen with the insertion (exclusion of Ontario and Buenos Aires genotypes). Known genotypes were taken a priori and then amended using the insertion within the distal third of the G gene to assign genotypes to those isolates with duplications in the distal third of the G gene. Only scores of known genotype sequences were utilized as references for a maximal similarity to the unknown counterparts. To ensure all sequences were aligned correctly, the conserved glycosylation sites were confirmed to have 100% consensus.

Additional steps were taken to ensure accurate genotype assignment. First, the sequence length of the G gene was examined to ensure that those sequences with insertions were correctly assigned to their respective genotypes (ON1 or BA). Next, phylogenetic trees for the F and G gene were constructed to examine genotype clustering. Tree construction was conducted in a bootstrap fashion using 1,000 iterations. The optimized trees allowed for topology, base frequencies, the rate matrix, and the proportion of variable sites to be optimized. Parameters were chosen by maximizing the tree’s log likelihood of the protein. Tree construction was conducted in R 3.3.0 (under the phangorn package v. 2.0.4).
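
The pairwise scoring and nearest-neighbor assignment described above can be illustrated with a minimal Python sketch. It is not the authors' R/Biostrings code: the function names are hypothetical, the affine-gap recurrence (Gotoh's variant of Smith-Waterman) is an assumed implementation, and the convention that a gap of length k costs the opening penalty plus k extension penalties is an assumption about how the quoted -6 and -0.2 values combine.

```python
from typing import Dict, List, Tuple

MATCH, MISMATCH = 2.0, -2.0          # substitution scores quoted in the text
GAP_OPEN, GAP_EXTEND = -6.0, -0.2    # gap penalties quoted in the text


def local_alignment_score(a: str, b: str) -> float:
    """Best local alignment score (Smith-Waterman with affine gaps)."""
    neg_inf = float("-inf")
    cols = len(b) + 1
    h_prev = [0.0] * cols            # H row i-1: best score ending at each cell
    f_prev = [neg_inf] * cols        # F row i-1: best score ending in a vertical gap
    best = 0.0
    for i in range(1, len(a) + 1):
        h_cur = [0.0] * cols
        f_cur = [neg_inf] * cols
        e = neg_inf                  # E for the cell to the left (horizontal gap)
        for j in range(1, cols):
            e = max(e + GAP_EXTEND, h_cur[j - 1] + GAP_OPEN + GAP_EXTEND)
            f_cur[j] = max(f_prev[j] + GAP_EXTEND,
                           h_prev[j] + GAP_OPEN + GAP_EXTEND)
            sub = MATCH if a[i - 1] == b[j - 1] else MISMATCH
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def assign_genotype(query: str,
                    references: Dict[str, List[str]]) -> Tuple[str, float]:
    """Return the genotype of the highest-scoring labeled reference."""
    best_genotype, best_score = "", float("-inf")
    for genotype, seqs in references.items():
        for ref in seqs:
            score = local_alignment_score(query, ref)
            if score > best_score:
                best_genotype, best_score = genotype, score
    return best_genotype, best_score
```

In this sketch only labeled reference sequences are scored against an unlabeled query, mirroring the statement that only known-genotype sequences served as references. Keeping the subgroups separate would correspond to passing a `references` dictionary that contains genotypes of a single subgroup.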

Amino acid variability analysis by subgroup and genotype

A final goal was to characterize the stability of the RSV F protein. This was accomplished by viewing changes within the F protein. To determine the amino acid variability in the F domains of RSV/A and RSV/B, the RSV/A and RSV/B subgroups were compared to the historical RSV/A Long strain (ATCC VR-26; RSV/A Long), a GA1 genotype. As the specific type of nucleic acid change is informative, amino acids with both synonymous and non-synonymous nucleotide changes were considered for our analysis. When constructing the non-synonymous/synonymous bar chart of the amino acids comprising the F gene, all sequences were grouped by distinct subgroup. Each sequence contributed equal influence to the graph. Codons with an unknown or missing base were dropped from the analysis. Previously defined F gene domains were assigned color blocks [19,20]. Additionally, antigenic sites were highlighted in shades of gray respective to the F protein conformation on which they are found (pre-fusion, post-fusion, or both). Variability was reported as the percentage of each unique amino acid at a given residue found in a subgroup. Individual genotypes of each subgroup contributed equally to the proportion of amino acids found at each residue. Genotypes with fewer than five sequences were excluded from the analysis. Changes occurring at ≤5% variability were excluded from this report. Amino acid variability was also decomposed by genotype. Within-genotype differences from the reference sequence were reported as the percentage of each unique amino acid found in a genotype at a given residue. All changes were reported for genotypes [1,21-25].
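
The subgroup-level variability calculation can be sketched as follows. This is an illustrative reading of the rules stated above (equal weight per genotype, genotypes with fewer than five sequences excluded, changes at ≤5% omitted), not the authors' code. The function names and the input layout (a dictionary mapping each genotype to its aligned amino acid sequences) are assumptions made for the example.

```python
from collections import Counter, defaultdict
from typing import Dict, List

MIN_SEQS_PER_GENOTYPE = 5    # genotypes with fewer sequences are excluded
REPORT_THRESHOLD = 5.0       # percent; changes at or below this are omitted


def residue_percentages(by_genotype: Dict[str, List[str]],
                        pos: int) -> Dict[str, float]:
    """Percent of each amino acid at 0-based `pos`, genotypes weighted equally."""
    kept = {g: s for g, s in by_genotype.items()
            if len(s) >= MIN_SEQS_PER_GENOTYPE}
    totals: Dict[str, float] = defaultdict(float)
    for seqs in kept.values():
        counts = Counter(seq[pos] for seq in seqs)
        for aa, n in counts.items():
            # within-genotype proportion, then an equal share per genotype
            totals[aa] += n / len(seqs) / len(kept)
    return {aa: 100.0 * p for aa, p in totals.items()}


def reportable_changes(by_genotype: Dict[str, List[str]],
                       reference: str, pos: int) -> Dict[str, float]:
    """Non-reference residues at `pos` whose frequency exceeds the threshold."""
    pct = residue_percentages(by_genotype, pos)
    return {aa: p for aa, p in pct.items()
            if aa != reference[pos] and p > REPORT_THRESHOLD}
```

Because each retained genotype contributes an equal share, a genotype with few sequences influences the subgroup percentage as much as a heavily sampled one. This is why the subgroup frequencies in Table 2 need not equal a simple pooled count over all sequences.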

Entropy analysis

In order to further quantify sequence variability, each amino acid and nucleotide position of the F protein was examined using an estimate of Shannon (information) entropy, defined as ∑−i log(i), where i is the weighted proportion of each unique amino acid found in a given population and at a given residue. This measure of variability is representative of the disorder of each amino acid position within its populations (subgroups RSV/A and RSV/B). Thus, entropy is minimized when perfect consensus is found at a position; it is maximized when there is a uniform distribution over all options. Individual genotypes of each subgroup contributed equally to the proportion of amino acids found at a given residue.
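
As a minimal sketch of this entropy estimate, the snippet below computes the genotype-weighted proportions at one residue and then sums −p·log(p) over them, where p plays the role of i in the definition above. The logarithm base is not stated in the text, so the natural logarithm is assumed here; the function names and the input layout (one string of observed residues per genotype) are hypothetical.

```python
import math
from collections import Counter
from typing import Dict


def weighted_proportions(column_by_genotype: Dict[str, str]) -> Dict[str, float]:
    """Proportion of each residue at one position, genotypes weighted equally."""
    props: Dict[str, float] = {}
    n_genotypes = len(column_by_genotype)
    for residues in column_by_genotype.values():
        for aa, n in Counter(residues).items():
            props[aa] = props.get(aa, 0.0) + n / len(residues) / n_genotypes
    return props


def shannon_entropy(proportions: Dict[str, float]) -> float:
    """Sum of -p * log(p) over residues (natural log is an assumption)."""
    h = -sum(p * math.log(p) for p in proportions.values() if p > 0)
    return h if h > 0 else 0.0
```

With perfect consensus the only proportion is 1 and the entropy is 0; with two residues at equal weighted frequency it is ln 2 ≈ 0.69 under the natural-log assumption.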

Results

A total of 1,090 RSV F gene sequences were utilized for this study. Of these, 352 had been previously assigned a genotype and were used as references for genotype assignment and construction of phylogenetic trees. The remaining 873 were assigned genotypes based on our previously described methods. Of the 586 with a corresponding G gene, 90 were observed to have a duplication in the distal third of the G gene indicative of the genotypes ON (72 nucleotide insertion) or BA (60 nucleotide insertion). The remaining 496 sequences with a corresponding G gene were genotyped based on the distal third of the G gene. Those 288 sequences without a corresponding G gene were genotyped based on pair wise assignment of the full F gene. Of the 1,090 sequences, 822 were from the RSV/A subgroup and 268 were RSV/B (Table 1). Among the RSV/A subgroup, the most dominant genotypes were GA2 (44%) and GA5 (36%). The most dominant genotype among the RSV/B subgroup was BA (71%). Sequences of a particular genotype generally clustered together on the phylogenetic trees of their respective subgroups. Distance between branches of the phylogenetic trees was greater among the G trees (S2 Fig, S3 Fig) than the F trees (S4 Fig, S5 Fig), indicating greater variability among the G sequences than the F sequences.
Table 1

Number of sequences from GenBank, Houston, and Chile grouped by their pairwise similarity score assigned genotypes among the RSV/A and RSV/B subgroups.

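| Subgroup | Genotype | Number of Sequences |
|---|---|---|
| RSV/A | GA1 | 38 |
| | GA5 | 294 |
| | GA3 | 10 |
| | GA4 | 2 |
| | GA7 | 13 |
| | NA1 | 13 |
| | SAA | 5 |
| | GA2 | 364 |
| | ON | 83 |
| RSV/A SUB-TOTAL | | 822 |
| RSV/B | GB1 | 12 |
| | GB4 | 16 |
| | SAB | 12 |
| | GB3 | 38 |
| | BA | 190 |
| RSV/B SUB-TOTAL | | 268 |
| TOTAL | | 1,090 |
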
Descriptive information was available for 965 (86%) F gene sequences, including date of sample collection. This information is not longitudinal and therefore not necessarily representative of the epidemic in natural populations. The oldest sample included in this dataset is from 1956; the majority of the samples were obtained between 2001 and 2014. The appearance of different genotypes and shifts in their dominance are evident when the genotype assignments are plotted by the year in which the samples were obtained (Fig 1). Notably, there was a resurgence of GA5 viruses in 2013. It is also clear that, although RSV/A predominates in most years, there is annual co-circulation of the two subgroups.
Fig 1

Appearance of RSV/A and RSV/B genotypes and dominance over time (1961–2014).

Sequences assigned genotypes were assessed by their sample acquisition date. The included inset depicts those years (1961–2000) with a small number of available sequences.

Amino acid variability of RSV subgroups

The F domains of RSV/A and RSV/B were compared to the historical RSV/A Long strain (ATCC VR-26; RSV/A Long). When compared to the RSV/A Long strain, viruses in the RSV/A subgroup had a number of nucleotide changes, the majority of which were synonymous (Fig 2). Those domains with the greatest number of non-synonymous changes included the signal peptide, p27, heptad repeat domain 2, antigenic site ø and the transmembrane domain. S2 Table shows all the non-synonymous changes that were detected in domains that have not been reported to have antigenic sites. Overall, there were a greater number of non-synonymous nucleotide changes in the RSV/B subgroup (N = 60) than the RSV/A subgroup (N = 21), when compared to the RSV/A Long strain. For both subgroups, approximately one-third of the non-synonymous changes occurred in antigenic sites. Those domains that had non-synonymous changes among RSV/A also had non-synonymous changes among RSV/B, often at the same amino acid residue. However, the changes were more numerous among RSV/B and occurred with a higher frequency. For example, the signal peptide of RSV/B had 15 amino acid changes, all occurring with a frequency of >90%, with the exception of a secondary change in AA4 that occurred in 7% of sequences. Conversely, the signal peptide of RSV/A had four amino acid changes, none of which occurred at a frequency >90%. A number of non-synonymous changes among RSV/B were seen in additional domains, including antigenic sites.
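
The distinction between synonymous and non-synonymous changes can be made concrete with a short sketch that compares codons of an isolate to the reference under the standard genetic code. It assumes the two coding sequences are already aligned in frame without gaps, and the function names are hypothetical. Codons containing an unknown or missing base are dropped, as stated in the methods.

```python
from itertools import product
from typing import Dict

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_codon(ref_codon: str, obs_codon: str) -> str:
    """Label one codon pair; codons with non-ACGT characters are dropped."""
    ref, obs = ref_codon.upper(), obs_codon.upper()
    if ref not in CODON_TABLE or obs not in CODON_TABLE:
        return "dropped"
    if ref == obs:
        return "identical"
    same_aa = CODON_TABLE[ref] == CODON_TABLE[obs]
    return "synonymous" if same_aa else "non-synonymous"


def count_changes(ref_gene: str, obs_gene: str) -> Dict[str, int]:
    """Count synonymous and non-synonymous codon differences along a gene."""
    counts = {"synonymous": 0, "non-synonymous": 0}
    for i in range(0, min(len(ref_gene), len(obs_gene)) - 2, 3):
        kind = classify_codon(ref_gene[i:i + 3], obs_gene[i:i + 3])
        if kind in counts:
            counts[kind] += 1
    return counts
```

For example, a change from AAA to AAT replaces lysine with asparagine and is counted as non-synonymous, whereas CTG to CTA leaves leucine unchanged and is counted as synonymous.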
Fig 2

The fusion genes of RSV/A isolates are more similar to the RSV/A Long strain than RSV/B isolates.

Non-synonymous/synonymous ratio graph of amino acids with non-synonymous or synonymous changes in the fusion gene for a) RSV/A isolates and b) RSV/B isolates (compared to the RSV/A Long strain). Fusion gene domains are depicted by assigned color blocks. Antigenic sites are highlighted in shades of gray respective to the protein conformation on which they are found.

The viruses in the RSV/A subgroup have fewer amino acid changes in antigenic sites than the viruses in the RSV/B subgroup when compared to the historical RSV/A Long strain (Table 2). Those seven amino acid changes in antigenic sites in RSV/A (with frequency >5%) occurred in sites I, II, p27, and α2α3β3β4 (AM14) and ranged from 7–91% in frequency. A total of nineteen amino acid changes occurred in the antigenic sites of RSV/B viruses. The majority of these changes (N = 15) occurred in pre-fusion antigenic sites (antigenic site ø, MPE-8, α2α3β3β4 (AM14), and p27) and ranged from 79–100% in frequency. Seven amino acids (384, 276, 124, 125, 129, 129, and 169) among the antigenic sites share vulnerability to change in both RSV/A and RSV/B isolates. Changes occurred at a greater rate in RSV/B. For example, the change L129V occurred in site p27 of RSV/A at a rate of 14%. Among the RSV/B subgroup, a change occurred at the same amino acid site; this change, from leucine to isoleucine, occurred at a rate of 100%.
Table 2

Frequency of amino acid changes in antigenic sites of the fusion gene for RSV/A (N = 822) and RSV/B (N = 268) compared to the RSV/A Long strain.

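| Antigenic Site | Amino Acids | RSV/A AA Change | RSV/A Frequency | RSV/B AA Change | RSV/B Frequency |
|---|---|---|---|---|---|
| I | 380–400 | - | - | N380S | 100% |
| | | V384I | 91% | V384T | 100% |
| | | - | - | P389S | 100% |
| II | 254–277 | N276S | 38% | N276S | 100% |
| IV | 422–438 | - | - | - | - |
| p27 | 109–136 | - | - | L111A | 100% |
| | | - | - | R113Q | 100% |
| | | - | - | F114Y | 98% |
| | | - | - | L119I | 100% |
| | | - | - | N121T | 100% |
| | | T122A | 9% | - | - |
| | | K124N | 87% | K124N | 100% |
| | | T125N | 16% | T125L | 98% |
| | | - | - | T128S | 100% |
| | | L129V | 14% | L129I | 100% |
| ø | 62–69, 196–210 | - | - | N67T | 100% |
| | | - | - | D200N | 100% |
| | | - | - | K201N | 100% |
| | | - | - | K209Q | 88% |
| α2α3β3β4 (AM14) | 148–194 | S169N | 7% | S169N | 100% |
| MPE8 | 44–50, 305–310 | - | - | L45F | 79% |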

Changes with ≤5% frequency were omitted from this table. Individual genotypes of each subgroup contributed equally to the proportion of amino acids found at each residue.

Amino acid variability among RSV genotypes

The F antigenic sites of isolates for each genotype were compared to the historical RSV/A Long strain (ATCC VR-26; RSV/A Long). Among antigenic site I (Table 3), the change V384I was observed at a rate of ≥90% for all RSV/A genotypes except GA1, for which the change was observed with less frequency (50%). The change P389S was observed among GA2 strains at a very low rate (0.30%) and among all RSV/B genotypes with a frequency of 100%. An additional amino acid change, V384T, was observed among all RSV/B genotypes with a rate of 100%.
Table 3

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site I.

 380381382383384385386387388389390391392393394395396397398399400
NLCNVDIFNPKYDCKIMTSKT
GA1....Ib................
    2° AA.b
GA5....Ia.....a..........a.a
    2° AA.dIdAd
GA3....Ia................
GA7....Ia................
    2° AAVc
NA1....Ia................
SAA....Ia................
GA2.a.a..Ia.....a.a..........
    2° AADdFd.SdNd
    3° AATd
ON..a..Ia................
    2° AAFd
 380381382383384385386387388389390391392393394395396397398399400
NLCNVDIFNPKYDCKIMTSKT
GB1Sa...Ta....Sa...........
GB4Sa...Ta....Sa...........
SABSa...Ta....Sa...........
GB3Sa...Ta....Sa...........
BASa...Ta....Sa...........

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Among antigenic site II (Table 4), the amino acid change N276S was observed among contemporary RSV/A genotypes GA2, NA1, and ON, occurring at a rate of 70%, 100%, and 99%, respectively. This change occurred in all RSV/B genotypes with a frequency of 100%, with the exception of BA, for which the change occurred in 95% of sequences.
Table 4

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site II.

 254255256257258259260261262263264265266267268269270271272273274275276277
NSELLSLINDMPITNDQKKLMSNN
GA1........................
GA5...............a.........
    2° AAId
GA3........................
GA7........................
NA1......................Sa.
SAA........................
GA2..a....................Sb.
    2°AANd.c
ON..a....................Sa.
    2° AAGc.
 254255256257258259260261262263264265266267268269270271272273274275276277
NSELLSLINDMPITNDQKKLMSNN
GB1......................Sa.
GB4......a................Sa.
    2° AATc
SAB......................Sa.
GB3..a....................Sa.
    2° AAGc
BA......................Sa.
    2° AA.c

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Antigenic site IV (Table 5) was well conserved among all genotypes of RSV/A and RSV/B. Two amino acid changes were observed at a low frequency (≤1%) among the GA5 genotype.
Table 5

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site IV.

 422423424425426427428429430431432433434435436437438
CTASNKNRGIIKTFSNG
GA1.................
GA5.......a........a..
    2° AA Dd       Fd 
GA3.................
GA7.................
NA1.................
SAA.................
GA2.................
ON.................
 422423424425426427428429430431432433434435436437438
CTASNKNRGIIKTFSNG
GB1.................
GB4.................
SAB.................
GB3.................
BA.................

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Site p27 was the most variable antigenic site (Table 6). The change K124N was observed at a rate of ≥90% for all RSV/A genotypes except GA1, for which the change was observed with less frequency (18%). Several amino acid changes occurred with high frequency among single RSV/A genotypes, including T122A in GA1 (61%), T125N in GA5 (89%), and L129V in GA7 (100%). Among the RSV/B genotypes, the following changes occurred with an observed frequency of ≥90%: L111A, R113Q, F114Y, L119I, N121T, K124N, T125L, T128S, L129I. These changes were also observed in the contemporary genotypes of RSV/A (GA2 and ON).
Table 6

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site p27.

 109110111112113114115116117118119120121122123124125126127128129130131132133134135136
RELPRFMNYTLNNTKKTNVTLSKKRKRR
GA1......a.......Ab..b.b...........
    2° AAVc.cNcNc
    3° AAPcTc
GA5......a..a.....a.a.NaNa..a..a..a.....
    2° AALdYdSdNd.IcVdNd
GA3....a.....a......Na............
    2° AAScHc
GA7...............Na....Va.......
NA1.........a......Na............
    2° AAHc
SAA...............Na............
GA2......a.a.a.a.a.a.a.a.a.aNa.a..a.a..a.a.a....
    2° AASdRdDcHcIdIdDdSdAdNdTcIdIdIdGdRdId
    3° AANdFdKdIdYdAd
ON............a..a.Na.a..a.........
    2° AADcAcYdIdId
    3° AANd
 109110111112113114115116117118119120121122123124125126127128129130131132133134135136
RELPRFMNYTLNNTKKTNVTLSKKRKRR
GB1..Aa.QaYa....Ia.Ta..NaLa..SaIa.......
GB4..Aa.QaYa....Ia.Ta..NaLa..SaIa.......
SAB..Aa.QaYa....Ia.Ta..NaLa..SaIa.......
GB3..Aa.QaYa.a..a.Ia.Ta..NaLa..SaIa.......
    2° AAHcHcTcHcTc
BA..Aa.aQaYa.a..a.Ia.Ta.a.aNaLa..aSaIa.a......
    2° AASdIdCdAcIdGcScPcAcTdId
    3° AAIdRd.dSd

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Among antigenic site ø (Table 7), a number of amino acid changes occurred with low frequency within isolates of the RSV/A genotypes. Among RSV/B genotypes, the amino acid changes N67T, D200N, and K201N occurred with a frequency of ≥90%. The amino acid change K209Q occurred more frequently in the GB1 (100%), SAB (100%), and BA (99.5%) genotypes than in GB4 (63%) and GB3 (76%).
Table 7

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site ø.

 6263646566676869196197198199200201202203204205206207208209210
SNIKENKCKNYIDKQLLPIVNKQ
GA1.a....a........... a.......
    2° AARcKcVc
GA5..........a.............
    2° AATd
GA3.......................
GA7.......................
NA1.......................
SAA.......................
GA2..a..........a....a...a....
    2° AATdMdFdVd
ON.......................
 6263646566676869196197198199200201202203204205206207208209210
SNIKENKCKNYIDKQLLPIVNKQ
GB1.....Ta......NaNa.......Qa.
GB4.....Ta......NaNa.......Qb.
    2° AA.c
SAB.....Ta......NaNa.......Qa.
GB3....a.Ta.a.....NaNa.......Qb.
    2° AAQcIcNc.c
BA.....Ta....a..NaNa.......Qa.
    2° AAIc.d

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Among site α2α3β3β4(AM14) (Table 8), a number of amino acid changes occurred at a low rate within the RSV/A genotypes. These changes include S169N, which was observed among GA1 (3%), GA5 (42%), and GA2 (0.5%). Among RSV/B genotypes, the amino acid change S169N was observed in all RSV/B genotypes at a frequency of 100%, except BA, for which the change occurred in 99.5% of sequences.
Table 8

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site α2α3β3β4(AM14).

148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194
IASGIAVSKVLHLEGEVNKIKSALLSTNKAVVSLSNGVSVLTSKVLD
GA1.a....................a.a...a.a.....................
    2° AATcVcRcNcQcTc
    3° AATc
GA5..................a..a..b..............a...........
    2° AADdLcNcTd
GA3...............................................
GA7...............................................
NA1...............................................
SAA...............................................
GA2.....a.............a....a............a.a.a...........a
    2° AATdKdNdFdIdDdHd
ON.......................a.......................
    2° AAPd
 148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194
IASGIAVSKVLHLEGEVNKIKSALLSTNKAVVSLSNGVSVLTSKVLD
GB1.....................Na.........................
    2° AAI
GB4.....................Na.........................
SAB.....................Na.........................
GB3.....a................Na.........................
    2° AAMc
BA.....a................Na...a.a.....................
    2° AAMdDdQcAd
    3° AAVdLd

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Among site MPE8 (Table 9), a number of amino acid changes occurred with low frequency within the RSV/A genotypes. The amino acid change L45F occurred with high frequency in the RSV/B genotypes GB1 (100%), GB4 (69%), SAB (100%), and GB3 (100%), but not in BA, for which the change occurred in 25% of sequences. The amino acid change L305I was observed at a rate of 100% for all RSV/B genotypes.
Table 9

Genotype specific amino acid changes when compared to the RSV/A Long strain in antigenic site MPE8.

 44454647484950305306307308309310
YLSALRTLYGVID
GA1.............
GA5.......a......
    2° AAId
GA3.............
GA7.............
NA1.............
SAA............
GA2.a...........a.
    2° AAFdMd
ON.............
 44454647484950305306307308309310
YLSALRTLYGVID
GB1.Fa.....Ia.....
GB4.Fb.....Ia.....
    2° AA.c
SAB.Fa.....Ia.....
GB3.Fa.....Ia.....
BA..b.....Ia.....
    2° AAFc
    3° AAId

The percentage of each unique amino acid found in a genotype at a given residue is indicated by the superscripts:

d(≤1%)

c(2–45%)

b(46–89%), and

a(≥90%).

Amino acid entropy of RSV subgroups

The amino acids of the F protein were analyzed for entropy, another measure of variability. The theoretical range for entropy is 0 to 3.3 (given that all amino acids have equal representation at any one particular location). Amino acids with entropy values of 0.1 or less are considered stable. The higher the entropy value, the greater the likelihood of variability at that residue. Those amino acids within the top 5% (≥95th percentile) of entropy are reported in Table 10. For RSV/A, the mean entropy value of the amino acids within the top 5% of entropy was 0.39 (0.18–1.03). For RSV/B, the mean entropy value of the top 5% was 0.26 (0.10–0.92). Many of the amino acids that fell within the top 5% of entropy belonged to the signal peptide, cytoplasmic tail, transmembrane domain, or p27. In addition, a number of the amino acids with the highest entropy values were within antigenic sites. Among RSV/A isolates, the amino acids of antigenic sites with high entropy values resided in antigenic sites I and II, p27, and α2α3β3β4(AM14), while among the RSV/B isolates they were found in MPE-8, p27, antigenic site ø, and α2α3β3β4(AM14). The distribution of entropy values among the amino acids in the F protein of RSV/A and RSV/B was similar (Fig 3). For both RSV/A and RSV/B, most of the entropy values in the amino acids of antigenic sites were low, at ≤0.1. Higher entropy values (>0.1) were observed more frequently in RSV/A (N = 27) than RSV/B (N = 19) viruses.
Table 10

Amino acids of the RSV fusion gene with the greatest 5% entropy among isolates in RSV/A and RSV/B subgroups.

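| RSV/A domain | RSV/A AA position | RSV/A entropy | RSV/B domain | RSV/B AA position | RSV/B entropy |
|---|---|---|---|---|---|
| Transmembrane | 540 | 1.03 | Transmembrane | 529 | 0.92 |
| Not Defined 2 | 105 | 0.67 | Not Defined 2 | 103 | 0.58 |
| Antigenic Site II | 276 | 0.67 | MPE8 | 45 | 0.53 |
| Heptad Repeat 2 | 518 | 0.51 | Cytoplasmic Tail | 573 | 0.41 |
| p27 | 124 | 0.46 | Not Defined 4 | 312 | 0.4 |
| p27 | 125 | 0.45 | Antigenic Site ø | 209 | 0.37 |
| Signal Peptide | 16 | 0.44 | Signal Peptide | 4 | 0.34 |
| Transmembrane | 547 | 0.44 | Not Defined 4 | 278 | 0.31 |
| p27 | 129 | 0.41 | Signal Peptide | 7 | 0.3 |
| Signal Peptide | 8 | 0.4 | Heptad Repeat 2 | 518 | 0.3 |
| Signal Peptide | 20 | 0.37 | Cytoplasmic Tail | 551 | 0.21 |
| Signal Peptide | 6 | 0.34 | Signal Peptide | 11 | 0.2 |
| p27 | 122 | 0.34 | Signal Peptide | 12 | 0.2 |
| Antigenic Site I | 384 | 0.31 | Transmembrane | 472 | 0.2 |
| Signal Peptide | 19 | 0.27 | Signal Peptide | 8 | 0.16 |
| Not Defined 4 | 356 | 0.27 | Signal Peptide | 15 | 0.13 |
| α2α3β3β4 (AM14) | 169 | 0.24 | Transmembrane | 527 | 0.12 |
| Not Defined 2 | 101 | 0.21 | p27 | 117 | 0.11 |
| Not Defined 2 | 103 | 0.19 | Cytoplasmic Tail | 562 | 0.11 |
| Cytoplasmic Tail | 574 | 0.19 | Antigenic Site ø | 65 | 0.1 |
| Signal Peptide | 2 | 0.18 | p27 | 114 | 0.1 |
| Signal Peptide | 13 | 0.18 | α2α3β3β4 (AM14) | 172 | 0.1 |
| Signal Peptide | 15 | 0.17 | Cytoplasmic Tail | 564 | 0.1 |
| Not Defined 4 | 377 | 0.17 | p27 | 125 | 0.09 |
| Signal Peptide | 4 | 0.13 | Transmembrane | 526 | 0.09 |
| p27 | 117 | 0.13 | Signal Peptide | 16 | 0.08 |
| Signal Peptide | 22 | 0.11 | Signal Peptide | 22 | 0.08 |
| α2α3β3β4 (AM14) | 152 | 0.1 | Not Defined 1 | 39 | 0.08 |
| Not Defined 3 | 213 | 0.1 | α2α3β3β4 (AM14) | 178 | 0.08 |
| Transmembrane | 535 | 0.1 | Not Defined 4 | 291 | 0.08 |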

Entropy was defined as ∑ −i log(i), where i is the proportion of each amino acid found at a given residue. Individual genotypes of each subgroup contributed equally to these proportions.
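The entropy definition and the equal weighting of genotypes can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the genotype labels are hypothetical, the natural logarithm is assumed because the base is not stated, and every genotype is assumed to contain at least one sequence.

```python
import math
from collections import Counter

def genotype_weighted_proportions(residues_by_genotype):
    """Amino acid proportions at one position, weighting genotypes equally.

    residues_by_genotype: dict mapping a genotype label to the list of
    one-letter amino acids observed at the position in that genotype.
    Each genotype contributes 1 / (number of genotypes) of the total weight,
    regardless of how many sequences it contains.
    """
    n_genotypes = len(residues_by_genotype)
    proportions = {}
    for residues in residues_by_genotype.values():
        counts = Counter(residues)
        for aa, n in counts.items():
            share = (n / len(residues)) / n_genotypes
            proportions[aa] = proportions.get(aa, 0.0) + share
    return proportions

def entropy(proportions):
    """Sum of -p * log(p) over amino acids (natural log is an assumption)."""
    return sum(-p * math.log(p) for p in proportions.values() if p > 0)

# Hypothetical position: a large conserved genotype and a small mixed one.
position = {
    "genotype_1": ["N"] * 98,   # 98 sequences, all N
    "genotype_2": ["N", "K"],   # 2 sequences, one N and one K
}
props = genotype_weighted_proportions(position)
print(props)           # {'N': 0.75, 'K': 0.25}
print(entropy(props))  # about 0.562
```

With pooled counts, K would make up only 1% of the 100 sequences; giving each genotype equal weight raises its proportion to 25%, which is how sparsely sampled genotypes avoid being swamped by heavily sampled ones.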

Fig 3

The entropy values in amino acids within antigenic sites of the fusion gene of RSV/A and RSV/B have a similar distribution.

Entropy was defined as ∑ −i log(i), where i is the proportion of each amino acid found at a given residue. Individual genotypes of each subgroup contributed equally to these proportions.

Discussion

The F protein of RSV is the central antigen for RSV vaccine development. Because most vaccines in development are monovalent and based on historical sequences of the GA1 genotype of RSV/A, we chose the historical RSV/A Long strain, which was originally isolated in 1956, as our reference sequence. We utilized 1,090 sequences from GenBank that were obtained over the past 6 decades from various locations throughout the world and took several steps to ensure that our sequences were assigned the correct genotype (S1 Fig). Focus was given to the analysis of amino acids, as opposed to nucleotides, because of the relatively high rate of variability as well as the functionality of the molecules. Comparing the antigenic sites of RSV/A and RSV/B sequences with those of the RSV/A Long strain, we found that, while these sites are generally well conserved, differences did exist and were most pronounced among the pre-fusion sites of RSV/B. The amino acid changes observed in the antigenic sites occurred at a frequency of 90% or higher in the RSV/B sequences, and fewer changes were detected in RSV/A sequences. This may indicate that a monovalent RSV F vaccine based on the historical GA1 genotype or on a contemporary RSV/A genotype may provide lower efficacy against infections caused by RSV/B viruses than against those caused by RSV/A viruses.

In adults who have been infected multiple times with RSV during their lifetime, the majority of neutralizing antibodies against RSV found in sera target pre-fusion F [26]. Monoclonal antibodies directed at site ø, which is unique to pre-fusion F, have greater neutralization capacity than palivizumab, which is directed against site II, a site found in both the pre-fusion and post-fusion forms of F. Palivizumab is licensed for the prevention of severe RSV infection in high-risk infants who were born prematurely or who have chronic lung disease or hemodynamically significant congenital heart disease. For this reason, both pre-fusion and post-fusion F are intriguing targets for vaccine development. However, we found pre-fusion sites to be the most variable of the RSV antigenic sites. Indeed, it has been hypothesized that variability in antigenic site ø of RSV/A and RSV/B may result in subgroup-specific immunity. Although subgroup-specific epitopes have been identified, the neutralizing potential of monoclonal antibodies developed against the pre-fusion antigenic sites is not well defined [27].

Site p27 was found to be the most variable antigenic site in both RSV/A and RSV/B. The F protein of RSV is unique in that it contains two furin sites that are cleaved to form a fully activated pre-fusion F; p27 is the cleavage product that is removed when furin-like proteases cleave at the two furin sites that flank it. Cleavage at the two furin sites of the F protein was thought to occur as a post-translational process, with the fully cleaved F being transported to the cell surface. Recently, it has been described that the second furin site does not undergo cleavage until the virus infects the cell and is internalized by macropinocytosis. After internalization, the second furin site is cleaved, making the virus fully infectious [28]. If this RSV entry mechanism is correct, it would indicate that an intermediate F that still possesses p27 is present on the respiratory epithelial cell surface as the virus buds and is released into respiratory secretions. It would also indicate that virions with intermediate F containing p27 are exposed to the host immune response during RSV infection. Fuentes et al. recently demonstrated p27 to be a dominant antigenic site recognized by sera from children and adults infected with RSV [16]. Among young children, sera showed significantly greater binding activity for the p27 epitope than for other antigenic sites, including site II and site IV [16]. The great variability of p27 between RSV/A and RSV/B may hint at subgroup-specific immunity.

Evaluating the entropy of RSV/A and RSV/B allows us to understand the variability of sequences within each subgroup and also to compare the two. When examining the amino acids within the top 5% of entropy, we found RSV/A to be more variable than RSV/B and to have a greater number of residues with higher entropy values (>0.1). This is consistent with studies of overall variability in the F gene of the two subgroups [29,30]. Both RSV/A and RSV/B had amino acids with high entropy values in the same non-antigenic site domains. However, they differed in the high-entropy amino acids of antigenic sites. Both subgroups had amino acids with high entropy values in p27 and α2α3β3β4(AM14). Among RSV/A isolates, however, amino acids with high entropy values were also found in antigenic sites I and II, whereas among RSV/B isolates they were found in antigenic sites ø and MPE8. Some antigenic sites may be more conserved within each subgroup, if not subgroup specific.

Our study is limited by the type and number of sequences available to us; however, this is the largest collection to date used to analyze sequences of the F gene. Our data also skew towards more recent collection years, as the cost of sequencing has decreased during this time. For this reason, some genotypes are better represented than others. To help balance the overrepresentation and underrepresentation of viruses in different genotypes, equal weight was given to each genotype in our subgroup analysis. Likewise, the number of RSV/A sequences reported in this study is approximately three times that of RSV/B. This might reflect a selection bias in the sequences available in GenBank, or it might represent seasonal variability by location, with RSV/A isolates being the predominant viruses. Most of the antigenic site changes observed among the genotypes of RSV/A and RSV/B are conserved, although some genotypes appear to be more susceptible to change. This might be attributed to the smaller sample sizes of some of these genotypes. Additionally, low-level variability appears more frequently among the genotypes of RSV/A. This could be due to the overall larger size of the RSV/A subgroup. In addition, we are limited in that the corresponding G genes were not available for all of our F genes. A further limitation was the quality of the sequencing data available to us. While we believe our system of genotyping based on only the F gene to be sufficient, some misclassification is possible.

In summary, this study documents that most of the known antigenic sites of RSV F are generally well conserved, though differences do exist when comparing the two subgroups to the reference RSV/A Long strain. Additionally, we found a number of differences in non-antigenic sites, which may contain antigenic domains that have yet to be identified. To our surprise, the non-synonymous changes in the antigenic sites that were detected in the RSV/B isolates occurred at nearly 100% frequency. The significance of these non-synonymous changes in the antigenic domains is unclear. Ongoing RSV F vaccine trials, primarily with monovalent formulations, will provide insight into the importance of these observed differences and could impact the next generation of RSV F vaccine formulations.

Accession numbers for sequences acquired from GenBank. (DOCX)

Frequency of amino acid changes in non-antigenic domains of the fusion gene for RSV/A (N = 822) and RSV/B (N = 268) compared to the RSV/A Long strain. (DOCX)

Frequency of nucleotide changes in antigenic sites of the fusion gene for RSV/A (N = 822) and RSV/B (N = 268) compared to the RSV/A Long strain. (DOCX)

RSV fusion gene sequence collection strategy. (TIF)

Phylogenetic trees for the attachment protein of RSV/A. (TIF)

Phylogenetic trees for the attachment protein of RSV/B. (TIF)

Phylogenetic trees for the fusion protein of RSV/A. (TIF)

Phylogenetic trees for the fusion protein of RSV/B. (TIF)