Michael E Bose, Jie He, Susmita Shrivastava, Martha I Nelson, Jayati Bera, Rebecca A Halpin, Christopher D Town, Hernan A Lorenzi, Daniel E Noyola, Valeria Falcone, Giuseppe Gerna, Hans De Beenhouwer, Cristina Videla, Tuckweng Kok, Marietjie Venter, John V Williams, Kelly J Henrickson.
Abstract
BACKGROUND: Human respiratory syncytial virus (RSV) is the leading cause of respiratory tract infections in children globally, with nearly all children experiencing at least one infection by the age of two. Partial sequencing of the attachment glycoprotein gene is conducted routinely for genotyping, but relatively few whole genome sequences are available for RSV. The goal of our study was to sequence the genomes of RSV strains collected from multiple countries to further understand the global diversity of RSV at a whole-genome level.
Year: 2015 PMID: 25793751 PMCID: PMC4368745 DOI: 10.1371/journal.pone.0120098
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Information on the RSV Strains Sequenced.
| Accession Number | Strain Name | Base Pairs | Gaps (bp) |
|---|---|---|---|
| KF826827 | RSVA/Homo sapiens/ARG/159/2004 | 15209 | - |
| KF826828 | RSVA/Homo sapiens/ARG/162/2004 | 15191 | - |
| KF530260 | RSVA/Homo sapiens/ARG/170/2005 | 14980 | - |
| KF826838 | RSVA/Homo sapiens/ARG/177/2006 | 15189 | - |
| KF826841 | RSVA/Homo sapiens/ARG/190/2007 | 15194 | - |
| KF826846 | RSVA/Homo sapiens/ARG/202/2008 | 15142 | - |
| KF826847 | RSVA/Homo sapiens/AUS/248/2007 | 15179 | - |
| KF826848 | RSVA/Homo sapiens/AUS/249/2007 | 15190 | - |
| KF530261 | RSVA/Homo sapiens/DEU/106/2008 | 15100 | - |
| KF826830 | RSVA/Homo sapiens/DEU/107/2009 | 15186 | - |
| KF826831 | RSVA/Homo sapiens/DEU/108/2009 | 15195 | - |
| KF826854 | RSVA/Homo sapiens/ITA/119/2009 | 15192 | - |
| KF826832 | RSVA/Homo sapiens/ITA/120/2009 | 15176 | - |
| KF826833 | RSVA/Homo sapiens/ITA/121/2009 | 15129 | - |
| KF826855 | RSVA/Homo sapiens/ITA/123/2009 | 15163 | - |
| KF826856 | RSVA/Homo sapiens/ITA/125/2009 | 15188 | - |
| KF826826 | RSVA/Homo sapiens/MEX/23/2004 | 15197 | - |
| KF826816 | RSVA/Homo sapiens/MEX/25/2005 | 14948 | 1215 |
| KF826836 | RSVA/Homo sapiens/MEX/26/2006 | 15165 | - |
| KF826837 | RSVA/Homo sapiens/MEX/27/2006 | 15194 | - |
| KF826840 | RSVA/Homo sapiens/MEX/29/2007 | 15106 | - |
| KF826817 | RSVA/Homo sapiens/MEX/43/2009 | 14744 | 117 |
| KF530268 | RSVA/Homo sapiens/MEX/59/2007 | 15128 | - |
| KF826852 | RSVA/Homo sapiens/USA/629–1/2007 | 15167 | - |
| KF826850 | RSVA/Homo sapiens/USA/629–11–1/2008 | 15193 | - |
| KF530263 | RSVA/Homo sapiens/USA/629–4/2007 | 15118 | 1279 |
| KF826823 | RSVA/Homo sapiens/USA/629–4360/1998 | 15197 | - |
| KF826824 | RSVA/Homo sapiens/USA/629–4392/1998 | 15200 | - |
| KF826821 | RSVA/Homo sapiens/USA/629–8–2/2007 | 15177 | - |
| KF530269 | RSVA/Hep2_lab/USA/629-Q0030_RSV60/2009 | 15049 | - |
| KF826849 | RSVA/Hep2_lab/USA/629-Q0115_RSV89/2010 | 15182 | - |
| KF530267 | RSVA/Homo sapiens/ZAF/323/2007 | 15078 | 1518, 530, 1797, 130, 1078 |
| KF530258 | RSVA/Homo sapiens/ZAF/324/2007 | 15123 | 592 |
| KF530264 | RSVA/Homo sapiens/ZAF/332/2008 | 15039 | 1002, 670, 290, 174 |
| KF826839 | RSVB/Homo sapiens/ARG/187/2006 | 15278 | - |
| KF826842 | RSVB/Homo sapiens/ARG/195/2007 | 15269 | - |
| KF826845 | RSVB/Homo sapiens/ARG/201/2008 | 15147 | - |
| KF530265 | RSVB/Homo sapiens/DEU/111/2009 | 14673 | 258, 670 |
| KF826853 | RSVB/Homo sapiens/DEU/114/2008 | 15183 | - |
| KF530266 | RSVB/Homo sapiens/DEU/115/2008 | 15064 | - |
| KF826834 | RSVB/Hep2_lab/ITA/126_RSV78/2009 | 15218 | - |
| KF826835 | RSVB/Hep2_lab/ITA/127_RSV79/2009 | 15238 | - |
| KF530262 | RSVB/Homo sapiens/ITA/128/2009 | 15045 | - |
| KF826857 | RSVB/Homo sapiens/ITA/129/2009 | 15278 | - |
| KF826858 | RSVB/Homo sapiens/ITA/130/2009 | 15233 | - |
| KF826859 | RSVB/Homo sapiens/ITA/131/2009 | 15263 | - |
| KF826818 | RSVB/Homo sapiens/ITA/132/2009 | 14730 | 207, 1197, 1383, 1456, 687 |
| KF826819 | RSVB/Homo sapiens/ITA/133/2009 | 15200 | 926, 737, 388 |
| KF826820 | RSVB/Homo sapiens/ITA/134/2009 | 15181 | 54 |
| KF826860 | RSVB/Homo sapiens/ITA/135/2009 | 15249 | - |
| KF826825 | RSVB/Homo sapiens/MEX/20/2004 | 15259 | - |
| KF826829 | RSVB/Homo sapiens/MEX/24/2005 | 15215 | - |
| KF826843 | RSVB/Homo sapiens/MEX/51/2008 | 15280 | - |
| KF826844 | RSVB/Homo sapiens/MEX/62/2008 | 15218 | - |
| KF826851 | RSVB/Homo sapiens/USA/629–24/2007 | 15279 | - |
| KF826822 | RSVB/Homo sapiens/USA/629–5/2007 | 15269 | - |
| KF530259 | RSVB/Homo sapiens/ZAF/319/2006 | 15157 | 308 |
a These genomes were sequenced by next-generation sequencing.
b Partial G CDS sequences have previously been published for these samples with different strain names. Accession numbers for these sequences are HQ711732, HQ711709, HQ711688, and HQ711801 [11].
c These sequences were produced from viruses isolated in tissue culture.
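The strain names in the table follow a slash-separated pattern (subtype, host, country code, isolate identifier, collection year). As a minimal sketch, assuming that five-field layout holds for every row, the names could be split into fields as follows; the `Strain` type and `parse_strain` helper are hypothetical names introduced for illustration only.

```python
from collections import Counter
from typing import NamedTuple


class Strain(NamedTuple):
    subtype: str   # "RSVA" or "RSVB"
    host: str      # e.g. "Homo sapiens" or "Hep2_lab"
    country: str   # three-letter country code, e.g. "ARG"
    isolate: str   # isolate identifier, kept as an opaque string
    year: int      # collection year


def parse_strain(name: str) -> Strain:
    """Split a strain name of the assumed form subtype/host/country/isolate/year."""
    parts = name.split("/")
    if len(parts) != 5:
        raise ValueError(f"unexpected strain name: {name!r}")
    subtype, host, country, isolate, year = parts
    return Strain(subtype, host, country, isolate, int(year))


# A few names copied from the table above.
names = [
    "RSVA/Homo sapiens/ARG/159/2004",
    "RSVA/Hep2_lab/USA/629-Q0030_RSV60/2009",
    "RSVB/Homo sapiens/ITA/132/2009",
    "RSVB/Homo sapiens/ZAF/319/2006",
]
strains = [parse_strain(n) for n in names]
by_subtype = Counter(s.subtype for s in strains)
```

Keeping the isolate identifier as a string avoids guessing at the internal structure of identifiers such as `629-Q0030_RSV60`.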
Fig 1. Protein Entropy Plots.
This is an entropy plot of the concatenated predicted protein sequences of all of the viruses sequenced in this study. Entropy values were calculated using BioEdit 7.0 and the plot was generated using Microsoft Excel. Black bars are for RSVA sequences and red bars are for RSVB sequences. The higher the bar, the greater the variation at that position in the protein sequence. The abbreviated protein names are listed across the top of the plot in the order in which their CDS sequences appear in the genome.
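The per-position entropy behind a plot like this can be illustrated with a short sketch. This is not BioEdit's code: it is a plain Shannon entropy over the residues in each alignment column, and the use of the natural logarithm and the treatment of gap characters as ordinary symbols are assumptions. The toy alignment is invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (natural log) of the residues in one alignment column."""
    total = len(column)
    freqs = (n / total for n in Counter(column).values())
    return sum(-p * math.log(p) for p in freqs)


def alignment_entropy(sequences: list[str]) -> list[float]:
    """Per-position entropy for aligned sequences of equal length."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]


# Hypothetical four-sequence alignment, not data from the study.
toy = ["MSKN", "MSKT", "MPKT", "MSKT"]
entropies = alignment_entropy(toy)
```

A fully conserved column gives an entropy of zero, and the value grows as more residues share the column more evenly, which is why taller bars indicate more variable positions.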
Sites under positive selection in RSVA and RSVB.
| Type | Gene | Site | SLAC dN-dS | SLAC p-value | FEL dN-dS | FEL p-value | FUBAR dN-dS | FUBAR Post. Pr. | Reference Position |
|---|---|---|---|---|---|---|---|---|---|
| RSVA | G | 4 | 3.237 | 0.042 | 1.091 | 0.002 | 0.779 | 0.994 | 4 |
| RSVA | G | 124 | 0.752 | 0.019 | 0.433 | 0.043 | 0.176 | 0.877 | |
| RSVA | G | 161 | 0.338 | 0.233 | 0.346 | 0.009 | 0.202 | 0.957 | |
| RSVA | G | 162 | 0.474 | 0.129 | 0.35 | 0.006 | 0.212 | 0.969 | 162 |
| RSVA | G | 247 | 1.333 | 0.002 | 1.162 | 0.005 | 0.847 | 0.998 | |
| RSVA | G | 258 | 0.405 | 0.175 | 0.453 | 0.003 | 0.239 | 0.977 | |
| RSVA | G | 301 | 1.686 | 0.001 | 0.807 | 0.062 | 0.697 | 0.98 | |
| RSVA | G | 313 | 1.879 | <0.001 | 0.776 | 0.002 | 0.637 | 0.994 | |
| RSVA | G | 317 | 2.885 | 0.003 | 0.834 | 0.017 | 0.716 | 0.989 | |
| RSVA | G | 324 | 2.009 | 0.15 | 1.751 | 0.004 | 2.034 | 0.998 | |
| RSVA | F | 19 | 5.37 | 0.083 | 1.931 | 0.029 | 1.025 | 0.964 | 19 |
| RSVA | F | 518 | 6.54 | 0.026 | 0.787 | 0.008 | 0.324 | 0.956 | 518 |
| RSVB | G | 219 | 4.19 | <0.001 | 1.318 | 0.001 | 0.81 | 1 | |
| RSVB | G | 286 | 1.288 | 0.005 | 0.498 | 0.005 | 0.248 | 0.961 | 261 |
| RSVB | G | 292 | 1.571 | 0.011 | 1.043 | 0.003 | 0.806 | 0.996 | |
| RSVB | G | 305 | 0.894 | 0.027 | 0.617 | 0.035 | 0.207 | 0.874 | 280 |
a Position in reference sequence. Accession M74568 is the reference for RSVA and AF013254 is the reference for RSVB. Sites in bold were also predicted to be under positive selection in previous publications.
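One way to read the table is to ask which methods support each site. The sketch below does this for four rows copied from the table. The cutoffs (p < 0.05 for SLAC and FEL, posterior probability > 0.9 for FUBAR) are assumptions chosen for illustration, since the criteria used in the study are not stated here, and `parse_p` and `supporting_methods` are hypothetical helper names.

```python
def parse_p(value: str) -> float:
    """Convert a reported value such as '0.042' or '<0.001' to a float upper bound."""
    return float(value.lstrip("<"))


def supporting_methods(slac_p: str, fel_p: str, fubar_pp: str,
                       p_cutoff: float = 0.05,
                       pp_cutoff: float = 0.9) -> list[str]:
    """List the methods whose statistic passes the assumed cutoff."""
    methods = []
    if parse_p(slac_p) < p_cutoff:
        methods.append("SLAC")
    if parse_p(fel_p) < p_cutoff:
        methods.append("FEL")
    if float(fubar_pp) > pp_cutoff:
        methods.append("FUBAR")
    return methods


# (type, gene, site, SLAC p-value, FEL p-value, FUBAR posterior probability)
ROWS = [
    ("RSVA", "G", 4, "0.042", "0.002", "0.994"),
    ("RSVA", "G", 161, "0.233", "0.009", "0.957"),
    ("RSVA", "G", 313, "<0.001", "0.002", "0.994"),
    ("RSVB", "G", 305, "0.027", "0.035", "0.874"),
]
support = {(t, g, s): supporting_methods(a, b, c) for t, g, s, a, b, c in ROWS}
```

Under these assumed cutoffs, each of the four example sites is supported by at least two of the three methods.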
Fig 2. G Protein Entropy Plot and Positive Selected Sites.
This is an entropy plot of the G protein sequences with positively selected sites also shown. Entropy values were calculated using BioEdit 7.0 from the alignments used for the positive selection analysis and the plot was generated using Microsoft Excel. Black bars are for RSVA sequences and red bars are for RSVB sequences. Sites predicted to be under positive selection in this study are shown with black diamonds for RSVA and red diamonds for RSVB. Near the top are shown sites predicted to be under positive selection or diversifying selection from previously published studies with black pluses for RSVA and red pluses for RSVB.
Gene End Sequences of RSVA and RSVB.
| Gene | RSVA Sequence 3’→5’ | RSVA Occurrences | RSVB Sequence 3’→5’ | RSVB Occurrences |
|---|---|---|---|---|
| | | 34 | | 23 |
| | | 17 | | 23 |
| | | 16 | | |
| | | 27 | | 22 |
| | | 1 | UCAAUU—GUUUUU | 1 |
| | UCAAUU—AUUUUU | 3 | | |
| | UCAAUU—GUUUUU | 2 | | |
| | | 27 | | 22 |
| | UCAAU—-GUUUUUUUU | 2 | UCAUU—-GUUUUUUUU | 1 |
| | | 2 | | |
| | UCAAU—-GUUUUU | 1 | | |
| | UCAAU—GCUUUUUU | 2 | | |
| | | 12 | | 22 |
| | | 19 | UCUAUUU-AUUUU | 1 |
| | | 1 | | |
| | | 26 | | 21 |
| | | 6 | | 2 |
| | | 29 | | 20 |
| | UCAGU—AAUUUUUU | 2 | | 3 |
| | UCAAU—AAUUUUU | 2 | | |
| | | 29 | | 21 |
| | | 1 | UCAAU-AUAUUUUU | 2 |
| | UCAAU-AUAUUUUUU | 2 | | |
| | UCAAU-AUAUUUUUUU | 1 | | |
| | UCAGU-AUAUUUU | 1 | | |
| | | 30 | | 20 |
| | | 31 | UCAAU—AAUUUUUU | 21 |
| | | 1 | | |
| | UCAGU—AAUUUUUU | 1 | | |
a Underlined sequences were identified previously. Dashes were added so the U-tracts line up.
b These numbers only include sequences from this study.
Fig 3. CDS Evolutionary Rates.
This plot shows the estimated evolutionary rates for each CDS in RSVA and RSVB. Error bars represent the 95% HPD values. Rates were determined using BEAST v1.8.0 with the GTR model of nucleotide substitution, gamma-distributed rate variation among sites, an uncorrelated lognormal relaxed molecular clock, and a flexible Bayesian skyline tree prior. Each CDS was run with a chain length of 50 million and sampled 10,000 times.
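For readers unfamiliar with the error bars, a 95% HPD (highest posterior density) interval is the narrowest interval containing 95% of the posterior samples. The sketch below shows a common sample-based way to compute it; it is not BEAST's implementation, and the simulated rate samples are hypothetical stand-ins for a sampled posterior.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of consecutive sorted samples the interval must cover.
    window = max(1, math.ceil(mass * n))
    # Slide a window of that size along the sorted samples; keep the narrowest.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


# Hypothetical posterior: 10,000 draws from a lognormal distribution,
# standing in for sampled substitution rates. Not data from the study.
rng = random.Random(0)
rates = [rng.lognormvariate(math.log(1e-3), 0.2) for _ in range(10_000)]
low, high = hpd_interval(rates)
```

For a skewed posterior, this interval is shorter than the equal-tailed interval obtained from the 2.5th and 97.5th percentiles, which is one reason HPD intervals are commonly reported for rate estimates.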
Fig 4. Tree of RSVA Genome Sequences.
This is a maximum clade credibility tree of RSVA genome sequences generated in this study and retrieved from GenBank. Tip times correspond to date of collection with the scale axis across the bottom showing the years. Tip labels show the accession number, country of isolation, and collection date. The labels are color coded with black for sequences from this study (FTS), grey for sequences with an undetermined genotype (UND), and the remaining colors corresponding to previously published genotypes as shown in the key in the upper left corner. Brackets highlight the major clades. Bayesian posterior probabilities are shown for key nodes.
Fig 5. Tree of RSVB Genome Sequences.
This is a maximum clade credibility tree of RSVB genome sequences generated in this study and retrieved from GenBank. Tip times correspond to date of collection with the scale axis across the bottom showing the years. Tip labels show the accession number, country of isolation, and collection date. The labels are color coded with black for sequences from this study (FTS), grey for sequences with an undetermined genotype (UND), and the remaining colors corresponding to previously published genotypes as shown in the key in the upper left corner. Brackets highlight the major clades. Bayesian posterior probabilities are shown for key nodes.