Charles N Agoti, James R Otieno, Patrick K Munywoki, Alexander G Mwihuri, Patricia A Cane, D James Nokes, Paul Kellam, Matthew Cotten.
Abstract
Human respiratory syncytial virus (RSV) is associated with severe childhood respiratory infections. A clear description of local RSV molecular epidemiology, evolution, and transmission requires detailed sequence data and can inform new strategies for virus control and vaccine development. We have generated 27 complete or nearly complete genomes of RSV from hospitalized children attending a rural coastal district hospital in Kilifi, Kenya, over a 10-year period using a novel full-genome deep-sequencing process. Phylogenetic analysis of the new genomes demonstrated the existence and cocirculation of multiple genotypes in both RSV A and B groups in Kilifi. Comparison of local versus global strains demonstrated that most RSV A variants observed locally in Kilifi were also seen in other parts of the world, while the Kilifi RSV B genomes encoded a high degree of variation that was not observed in other parts of the world. The nucleotide substitution rates for the individual open reading frames (ORFs) were highest in the regions encoding the attachment (G) glycoprotein and the NS2 protein. The analysis of RSV full genomes, compared to subgenomic regions, provided more precise estimates of the RSV sequence changes and revealed important patterns of RSV genomic variation and global movement. The novel sequencing method and the new RSV genomic sequences reported here expand our knowledge base for large-scale RSV epidemiological and transmission studies.
IMPORTANCE: The new RSV genomic sequences and the novel sequencing method reported here provide important data for understanding RSV transmission and vaccine development. Given the complex interplay between RSV A and RSV B infections, the existence of local RSV B evolution is an important factor in vaccine deployment.
Year: 2015 PMID: 25609811 PMCID: PMC4403408 DOI: 10.1128/JVI.03391-14
Source DB: PubMed Journal: J Virol ISSN: 0022-538X Impact factor: 5.103
Summary of RSV primers used in this study
| Target | Primer | Sequence (5′ to 3′) | Strand | Position | Tm | % with 0 MM | % with 0–3 MM |
|---|---|---|---|---|---|---|---|
| RSVA | rsvas | ACGCGAAAAAATGCGTACAAC | Plus | 1 | 57.13 | 18.28 | 18.97 |
| RSVA | rsva52 | TGTGCATGTTATTACAAGTAGTGATATTTG | Plus | 266 | 56.96 | 95.52 | 98.97 |
| RSVA | rsva50 | GCATGTTATTACAAGTAGTGATATTTGCC | Plus | 269 | 57.51 | 95.17 | 98.97 |
| RSVA | rsva117 | ATAAGAGATGCCATGGTTGGTTTAAGA | Plus | 2849 | 58.44 | 95.86 | 100.00 |
| RSVA | rsva86 | AAGAGATGCCATGGTTGGTTTAAGA | Plus | 2851 | 58.43 | 95.86 | 100.00 |
| RSVA | rsva175 | TTCTCTTAAACCAACCATGGCATCT | Minus | 2878 | 58.43 | 95.86 | 100.00 |
| RSVA | rsva39 | CTTCTCTTAAACCAACCATGGCATC | Minus | 2879 | 58.22 | 95.86 | 100.00 |
| RSVA | rsva1820 | GCAGCATATGCAGCAACAATC | Plus | 5207 | 56.95 | 93.79 | 98.97 |
| RSVA | rsva1914 | CAGCATATGCAGCAACAATCCAA | Plus | 5208 | 58.32 | 93.10 | 98.62 |
| RSVA | rsva1644 | CAACTCCATTGTTATTTGCCCC | Minus | 5674 | 56.05 | 89.66 | 100.00 |
| RSVA | rsva1688 | CAACTCCATTGTTATTTGCCCCA | Minus | 5674 | 57.54 | 89.66 | 100.00 |
| RSVA | rsva704 | ATGTGTTGCCATGAGCAAACTC | Plus | 7893 | 57.95 | 91.03 | 100.00 |
| RSVA | rsva731 | GCCATGAGCAAACTCCTCACT | Plus | 7900 | 58.49 | 71.38 | 99.31 |
| RSVA | rsva341 | TTGTCAGGTAGTATCATTATTTTTGGCATG | Minus | 8196 | 58.53 | 98.97 | 99.31 |
| RSVA | rsva312 | AGGATATTTGTCAGGTAGTATCATTATTTTTGG | Minus | 8203 | 58.08 | 98.97 | 100.00 |
| RSVA | rsva374 | AAGAGAACTCAGTGTAGGTAGAATGTTT | Plus | 10360 | 57.89 | 96.55 | 100.00 |
| RSVA | rsva350 | AGAACTCAGTGTAGGTAGAATGTTTG | Plus | 10363 | 56.64 | 96.55 | 100.00 |
| RSVA | rsva497 | GCTTGATTGAATTTGCTGAGATCTGT | Minus | 10620 | 58.44 | 95.52 | 100.00 |
| RSVA | rsva539 | ATGCTTGATTGAATTTGCTGAGATCTG | Minus | 10622 | 58.68 | 95.52 | 100.00 |
| RSVA | rsva1220 | GATTGGGTGTATGCATCTATAGATAACAAG | Plus | 12386 | 57.94 | 95.86 | 99.31 |
| RSVA | rsva1232 | ATTGGGTGTATGCATCTATAGATAACAAG | Plus | 12387 | 57.17 | 95.86 | 99.31 |
| RSVA | rsva364 | TTATATATCCCTCTCCCCAATCTTTTTCAAA | Minus | 13070 | 58.32 | 96.21 | 100.00 |
| RSVA | rsva385 | ATCAGTTATATATCCCTCTCCCCAATCTT | Minus | 13075 | 58.46 | 96.21 | 100.00 |
| RSVA | rsva4066 | GTTGTATAACAAACTACCTGTGATTTTAATCAG | Minus | 14983 | 57.95 | 88.97 | 99.31 |
| RSVA | rsva5632 | TAACTATAATTGAATACAGTGTTAGTGTGTAGC | Minus | 15063 | 57.95 | 29.31 | 95.17 |
| RSVA | rsvae | ACGAGAAAAAAAGTGTCAAAAACTAATA | Minus | 15223 | 55.09 | 17.59 | 18.28 |
| RSVB | rsvbs | ACGCGAAAAAATGCGTACTACA | Plus | 1 | 57.56 | 43.14 | 43.14 |
| RSVB | rsvb3 | TGGGGCAAATAAGAATTTGATAAGTGC | Plus | 44 | 58.58 | 48.04 | 54.90 |
| RSVB | rsvb1021 | GGGGCAAATAAGAATTTGATAAGTGCTATT | Plus | 45 | 58.75 | 47.06 | 54.90 |
| RSVB | rsvb33 | ATATTAGGAATGCTCCATACATTAGTAGTTG | Plus | 2777 | 57.21 | 88.24 | 100.00 |
| RSVB | rsvb71 | TAAGAGATGCTATGGTTGGTCTAAGAGA | Plus | 2841 | 58.69 | 90.20 | 100.00 |
| RSVB | rsvb50 | AGTCTTGCCATAGCCTCTAACCT | Minus | 2937 | 58.57 | 93.14 | 100.00 |
| RSVB | rsvb95 | CCATTTTTTCGCTTTCCTCATTCCTA | Minus | 2963 | 58.14 | 95.10 | 100.00 |
| RSVB | rsvb7884 | AGTATATGTGGCAACAATCAACTCTG | Plus | 5202 | 57.48 | 81.37 | 100.00 |
| RSVB | rsvb7996 | TATGTGGCAACAATCAACTCTGC | Plus | 5206 | 57.70 | 81.37 | 100.00 |
| RSVB | rsvb7442 | GATGTGGAGGGCTCGGATG | Minus | 5548 | 57.92 | 75.49 | 100.00 |
| RSVB | rsvb7423 | CCATGGTTATTTGCCCCAGATTTAAT | Minus | 5662 | 57.87 | 77.45 | 99.02 |
| RSVB | rsvb3762 | AGAGGTCATTGCTTGAATGGTAGAA | Plus | 7642 | 57.98 | 93.14 | 100.00 |
| RSVB | rsvb3712 | AAGAGCATAGACACTTTGTCTGAAATAAG | Plus | 7762 | 57.89 | 77.45 | 100.00 |
| RSVB | rsvb3652 | GCTTATGGTTATGCTTTTGTGGATATCTAAT | Minus | 8130 | 58.41 | 89.22 | 98.04 |
| RSVB | rsvb3660 | GCAATCATGCTTTCACTTGAGATCAA | Minus | 8247 | 58.67 | 64.71 | 98.04 |
| RSVB | rsvb32 | AAGAAGAGTACTAGAGTATTACTTGAGAGATAA | Plus | 10236 | 57.04 | 90.20 | 100.00 |
| RSVB | rsvb52 | AAATCCAAATCTTAGCAGAGAAAATGATAG | Plus | 10412 | 56.70 | 96.08 | 100.00 |
| RSVB | rsvb47 | CCATGCAGTTCATCTAATACATCACTG | Minus | 10673 | 58.13 | 90.20 | 99.02 |
| RSVB | rsvb168 | TGCATGTCTATATGTACATATTATTGTGACAAG | Minus | 10746 | 58.25 | 91.18 | 99.02 |
| RSVB | rsvb651 | ATCGACATTGTGTTTCAAAATTGCATAAG | Plus | 12640 | 58.40 | 81.37 | 100.00 |
| RSVB | rsvb165 | TTCAAAATTGCATAAGTTTTGGTCTTAGC | Plus | 12653 | 58.06 | 88.24 | 100.00 |
| RSVB | rsvb27 | TTAATGAACATATGATCAGTTATATACCCCTCT | Minus | 13088 | 57.88 | 79.41 | 100.00 |
| RSVB | rsvb60 | AACTTAAAACTGTGACAGCCTTTTATTCT | Minus | 13325 | 58.08 | 89.22 | 100.00 |
| RSVB | rsvb1199 | ATAGTACACTACCTGTTATTTTAATCAGCTTCT | Minus | 14977 | 58.56 | 88.24 | 100.00 |
| RSVB | rsvb989 | TATAGTACACTACCTGTTATTTTAATCAGCTTC | Minus | 14978 | 57.57 | 88.24 | 100.00 |
| RSVB | rsvbe | ACGAGAAAAAAAGTGTCAAAAACTAATGT | Minus | 15216 | 57.47 | 5.88 | 6.86 |
Primer mapping position in RSVA (GenBank accession number FJ948820) or RSVB (GenBank accession number JQ582843).
Tm (melting temperature) calculated using a Python script that approximates the method of Breslauer et al. (51).
Percentage of full-length RSVA genomes (n = 290) or full-length RSVB genomes (n = 102) showing perfect homology to primer, i.e., 0 mismatches (MM).
Percentage of full-length RSVA genomes (n = 290) or full-length RSVB genomes (n = 102) showing the target sequence for the primer with up to 3 mismatches.
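The two percentage columns can be reproduced in outline as follows. This is a minimal sketch, not the authors' script: it assumes an ungapped sliding-window comparison (Hamming distance), assumes that Minus-strand primers are matched as their reverse complement against the plus-sense genome, and uses hypothetical function names (`min_mismatches`, `percent_matching`) with toy genomes in place of the 290 RSVA or 102 RSVB genomes.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence; unknown bases become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def min_mismatches(primer: str, genome: str, strand: str = "Plus") -> int:
    """Smallest number of mismatches between the primer target and any
    ungapped window of the genome (assumption: no indels are considered)."""
    target = primer.upper() if strand == "Plus" else reverse_complement(primer)
    genome = genome.upper()
    k = len(target)
    best = k  # worst case: every position mismatches
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        distance = sum(1 for a, b in zip(target, window) if a != b)
        best = min(best, distance)
        if best == 0:
            break
    return best


def percent_matching(primer, genomes, strand="Plus", max_mm=0):
    """Percentage of genomes carrying the target with at most max_mm mismatches."""
    hits = sum(1 for g in genomes if min_mismatches(primer, g, strand) <= max_mm)
    return 100.0 * hits / len(genomes)


# Toy data: one perfect target, one target with a single mismatch, one unrelated.
rsvas = "ACGCGAAAAAATGCGTACAAC"            # primer rsvas from the table
toy_genomes = [
    "TT" + rsvas + "TT",
    "TT" + "ACGCGAAAAAATGCGTACTAC" + "TT",  # one substitution near the 3' end
    "G" * 25,
]
pct_0_mm = percent_matching(rsvas, toy_genomes, max_mm=0)
pct_0_3_mm = percent_matching(rsvas, toy_genomes, max_mm=3)
```

Calling `percent_matching` once with `max_mm=0` and once with `max_mm=3` yields the two columns for a given primer.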
Details for samples used in this study
| MiSeq | Age (mo) | Sample date (day-mo-yr) | Group | Length (nt) | Coverage | Present in G set | Present in F set | GenBank no. | ENA no. |
|---|---|---|---|---|---|---|---|---|---|
| 10028_10 | 0 | 07-Jan-02 | A | 9,346 | 6,401 | Yes | KP317918 | ERR323212 | |
| 10028_11 | 6 | 27-Apr-02 | A | 7,091 | 10,370 | | | KP317940 | ERR323213 |
| 10028_12 | 6 | 28-Jan-03 | A | 9,776 | 5,692 | Yes | KP317955 | ERR323214 | |
| 11866_65 | 5 | 13-Feb-03 | A | 12,151 | 7,347 | Yes | Yes | KP317949 | ERR438932 |
| 11865_75 | 8 | 24-Mar-04 | A | 14,985 | 12,283 | Yes | Yes | KP317956 | ERR438910 |
| 10891_50 | 6 | 21-Jan-05 | A | 5,396 | 3,554 | Yes | Yes | KP317948 | ERR376407 |
| 10891_56 | 0 | 02-Feb-05 | A | 5,396 | 2,369 | Yes | Yes | KP317924 | ERR376413 |
| 9696_45 | 14 | 20-Feb-06 | A | 14,778 | 3,830 | Yes | Yes | KP317944 | ERR303303 |
| 10891_57 | 1 | 23-Feb-06 | A | 14,841 | 4,640 | Yes | Yes | KP317942 | ERR376414 |
| 10891_58 | 0 | 29-Mar-06 | A | 8,864 | 6,016 | | | KP317943 | ERR376415 |
| 10891_59 | 3 | 04-Jan-07 | A | 11,496 | 5,316 | Yes | KP317937 | ERR376416 | |
| 10891_60 | 1 | 05-Jan-07 | A | 14,791 | 4,454 | Yes | Yes | KP317926 | ERR376417 |
| 10891_51 | 0 | 07-Mar-08 | A | 14,967 | 4,882 | Yes | Yes | KP317933 | ERR376408 |
| 10891_52 | 11 | 17-Mar-08 | A | 5,636 | 1,201 | Yes | KP317931 | ERR376409 | |
| 10899_38 | 1 | 22-Feb-09 | A | 14,854 | 8,478 | Yes | Yes | KP317950 | ERR381723 |
| 10899_40 | 4 | 26-Jan-10 | A | 10,113 | 13,351 | Yes | KP317916 | ERR381725 | |
| 10899_41 | 18 | 10-Feb-10 | A | 14,713 | 12,405 | Yes | Yes | KP317935 | ERR381726 |
| 11864_54 | 3 | 29-Apr-10 | A | 14,716 | 7,071 | Yes | Yes | KP317936 | ERR438905 |
| 11862_33 | 3 | 26-Aug-10 | A | 14,719 | 8,961 | Yes | Yes | KP317921 | ERR438868 |
| 11864_53 | 1 | 25-Mar-11 | A | 14,735 | 6,891 | Yes | Yes | KP317951 | ERR438904 |
| 11862_28 | 28 | 13-Apr-11 | A | 15,214 | 10,434 | Yes | Yes | KP317920 | ERR438864 |
| 11862_29 | 4 | 23-Mar-12 | A | 14,950 | 12,922 | Yes | Yes | KP317953 | ERR438865 |
| 11862_32 | 14 | 30-Apr-12 | A | 7,197 | 6,180 | | | KP317947 | ERR438867 |
| 9697_16 | 10 | 06-Jul-02 | B | 15,040 | 5,419 | Yes | Yes | KP317939 | ERR303322 |
| 9697_10 | 8 | 13-Jan-03 | B | 9,790 | 6,853 | Yes | KP317930 | ERR303316 | |
| 10140_1 | 46 | 02-Apr-04 | B | 12,034 | 12,174 | Yes | Yes | KP317919 | ERR331021 |
| 9697_7 | 10 | 22-Dec-04 | B | 15,080 | 4,480 | Yes | Yes | KP317925 | ERR303313 |
| 9697_6 | 2 | 25-Dec-04 | B | 14,998 | 6,523 | Yes | Yes | KP317954 | ERR303312 |
| 9697_5 | 1 | 27-Jan-06 | B | 15,234 | 3,682 | Yes | Yes | KP317917 | ERR303311 |
| 9465_10 | 23 | 27-Feb-09 | B | 14,995 | 16,190 | Yes | Yes | KP317938 | ERR303268 |
| 9465_11 | 31 | 13-Feb-10 | B | 15,004 | 11,722 | Yes | Yes | KP317941 | ERR303269 |
| 9465_12 | 22 | 06-Apr-10 | B | 15,260 | 14,855 | Yes | Yes | KP317932 | ERR303270 |
| 9465_6 | 17 | 09-May-10 | B | 15,333 | 13,719 | Yes | Yes | KP317952 | ERR303264 |
| 9465_7 | 3 | 01-Feb-11 | B | 15,233 | 14,182 | Yes | Yes | KP317927 | ERR303265 |
| 9465_8 | 2 | 14-Apr-11 | B | 15,323 | 15,367 | Yes | Yes | KP317945 | ERR303266 |
| 9465_9 | 1 | 08-Jul-11 | B | 15,237 | 14,709 | Yes | Yes | KP317928 | ERR303267 |
| 9465_3 | 8 | 14-Jan-12 | B | 14,995 | 12,378 | Yes | Yes | KP317946 | ERR303261 |
| 9465_1 | 19 | 13-Feb-12 | B | 15,233 | 12,994 | Yes | Yes | KP317934 | ERR303259 |
| 9465_4 | 14 | 01-Mar-12 | B | 15,179 | 14,802 | Yes | Yes | KP317923 | ERR303262 |
| 10911_9 | 1 | 23-Mar-12 | B | 14,977 | 12,504 | Yes | Yes | KP317929 | ERR376442 |
| 9465_2 | 5 | 16-May-12 | B | 14,941 | 12,906 | Yes | Yes | KP317922 | ERR303260 |
Final sequence length obtained from de novo assembly of short read data (see Materials and Methods).
Coverage calculated by mapping all reads to final assembled contig. Coverage was calculated as the number of mapped reads/(length of the genome fragment/129).
Samples yielding sufficient sequence for G region analysis (Fig. 5).
Samples yielding sufficient sequence for F region analysis (Fig. 5).
The final genome data were deposited in GenBank with the indicated accession numbers.
Short-read data available at European Nucleotide Archive (http://www.ebi.ac.uk/ena).
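The coverage formula in the footnote above is simple arithmetic and can be written out directly. The sketch below is illustrative only; the footnote does not define the constant 129, so treating it as a read-length-like divisor is an assumption, and the function names are hypothetical.

```python
def coverage(mapped_reads: int, fragment_length: int, divisor: int = 129) -> float:
    """Coverage = mapped reads / (fragment length / 129), as stated in the footnote.

    The divisor 129 is copied from the footnote; its meaning (for example,
    an average read length in nucleotides) is an assumption.
    """
    if fragment_length <= 0:
        raise ValueError("fragment_length must be positive")
    return mapped_reads / (fragment_length / divisor)


def implied_mapped_reads(cov: float, fragment_length: int, divisor: int = 129) -> float:
    """Invert the formula: mapped reads implied by a reported coverage value."""
    return cov * fragment_length / divisor


# Sample 11865_75 from the table: length 14,985 nt, coverage 12,283.
reads_11865_75 = implied_mapped_reads(12283, 14985)
```

Inverting the formula, as in `implied_mapped_reads`, gives a rough read count for any row of the table from its length and coverage values.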
FIG 5. Kilifi versus global changes in the G, F, and NS2 proteins. (A) Kilifi compared to global G protein changes. For each group, the G protein sequences were identified as Kilifi or non-Kilifi (global) and aligned, and a consensus amino acid sequence was generated (at 60% level). The first portion shows the positions of O-linked (red) and N-linked (blue) glycosylation sites, the second portion shows general features of the G protein, and the third portion shows total changes (Kilifi plus global) at each position. The fourth portion shows amino acid differences in each G sequence from the consensus. Amino acid changes observed only in Kilifi are marked in red, and changes observed either globally or in Kilifi are marked in gray. Gaps are not indicated. N-linked and O-linked glycosylation sites were determined using NetNGlyc 1.0 and NetOGlyc 3.1 (46–48). (B) Kilifi versus global F protein changes. Changes in F protein were determined and are depicted as in panel A. Known motifs of the F protein (49) include signal peptide (SP), heptad repeat C (HRC), 27-mer fragment (p27), putative fusion peptide (FP), heptad repeat A (HRA), domains 1 and 2 (Dom1&2), heptad repeat B (HRB), transmembrane domain (TM), and cytoplasm domain (CP). Antigenic sites I, II, and IV (ASI, ASII, and ASIV) are sites of neutralizing antibody binding (40, 50). (C) Kilifi versus global NS2 protein changes. Changes in NS2 protein were determined and are depicted as in panel A. Known motifs of the NS2 protein include the TRAF3-interacting domain (TRAF3-ID) and C-terminal tetrapeptide sequence (DLNP) (43).
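The consensus-and-difference step described for panel A can be sketched as below. This is an illustration under stated assumptions rather than the authors' pipeline: the input is assumed to be a pre-aligned set of equal-length amino acid strings, gaps (`-`) are ignored when counting, positions where no residue reaches the 60% level are written as `X` and skipped when listing differences, and the function names are hypothetical.

```python
from collections import Counter


def consensus(aligned, threshold=0.6, unknown="X"):
    """Per-column consensus of an alignment at the given frequency threshold."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    out = []
    for column in zip(*aligned):
        residues = [r for r in column if r != "-"]  # assumption: gaps do not vote
        if not residues:
            out.append("-")
            continue
        residue, count = Counter(residues).most_common(1)[0]
        out.append(residue if count / len(residues) >= threshold else unknown)
    return "".join(out)


def changes_from_consensus(seq, cons, unknown="X"):
    """Set of (1-based position, residue) differences from the consensus.

    Gaps in the sequence are skipped, mirroring "Gaps are not indicated".
    """
    return {
        (pos, res)
        for pos, (res, ref) in enumerate(zip(seq, cons), start=1)
        if res != "-" and ref not in ("-", unknown) and res != ref
    }


# Toy alignment (not real RSV sequences).
toy_alignment = ["MSKNK", "MSKNK", "MSKTK", "MPKNK", "MSK-K"]
toy_consensus = consensus(toy_alignment)
toy_changes = [changes_from_consensus(s, toy_consensus) for s in toy_alignment]
```

Each entry of `toy_changes` corresponds to one row of the fourth portion of the panel: the set of positions at which that sequence departs from the consensus.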
FIG 1. (A) PCR primer target sites in RSVA and RSVB. The primer target sequences in representative RSVA (left) and RSVB (right) viruses were determined. Circular markers indicate positions of primer target sites in the test genome color-coded by number of mismatches with the primer; gray bars indicate lengths and positions of the predicted products. (B) Two examples of reverse transcription-PCR function. The DNA products of reverse transcription and PCR amplification of two samples were resolved by agarose gel electrophoresis and visualized by ethidium bromide staining. Sizes of some of the molecular size markers (in base pairs) are indicated to the left of the gel. Lane m, molecular size markers; lanes 1 to 6, individual 2- to 3-kb RSV amplicons 1 to 6, respectively. (C) Flowchart of the RSV sequencing process.
FIG 2. Phylogenetic analysis of the Kilifi RSVA and RSVB genomes. (A) MrBayes tree of representative global RSVA genome sequences together with the 11 novel Kilifi RSVA genome sequences. (B) MrBayes tree of representative global RSVB genome sequences together with the 16 novel Kilifi RSVB genome sequences. Trees were inferred using the Bayesian methods in MrBayes (http://mrbayes.sourceforge.net/index.php) under the GTR model of evolution. The numbers next to the branches indicate the posterior probabilities. The Kilifi taxa are indicated in red font. Thinned global reference sets for RSVA and RSVB were prepared from all available RSV genomes by clustering at 99% identity using uclust (32). See Materials and Methods for additional details.
FIG 3. Comparison of RSVB genomes with identical G regions. Each panel represents a genome nucleotide alignment of RSVs that had identical G gene sequences. The G protein ORF portions of the genomes, which were identical, are highlighted in gray across the panels. The vertical lines mark nucleotide substitutions between the genomes that occur outside the G gene region. The blue blocks indicate a gap in the sequence.
FIG 4. (A) Estimates of the nucleotide substitution rates for RSVA and RSVB in the individual ORFs and for the whole-genome sequence. (B) Estimates of tMRCA for RSVA and RSVB for the individual ORFs and for the whole-genome sequence. The analysis was undertaken using the usearch-thinned data sets (37 genome sequences for RSVA and 23 sequences for RSVB). The analysis was performed with BEAST (36).
Kilifi versus global evolution
| Protein | No. of distinct changes for all Kilifi and global viruses | No. of distinct changes in Kilifi viruses | No. (%) of distinct changes unique to Kilifi viruses |
|---|---|---|---|
| RSVA G | 409 | 68 | 7 (11.8) |
| RSVB G | 299 | 70 | 30 (42.9) |
| RSVA F | 200 | 45 | 9 (22.2) |
| RSVB F | 81 | 19 | 13 (79.3) |
| RSVA NS2 | 73 | 18 | 3 (16.7) |
| RSVB NS2 | 38 | 16 | 13 (81.3) |
Number of distinct amino acid changes observed in Kilifi and not in other parts of the world. “Distinct changes” means that the set of changes is reduced to a unique set with multiple occurrences of a change counted only once.
The P value for Fisher's exact test was <0.01 for the number of location-specific distinct changes compared to total distinct changes for RSVA versus RSVB.
The P value for Fisher's exact test was <0.05 for the number of location-specific distinct changes compared to total distinct changes for RSVA versus RSVB.
The P value for Fisher's exact test was <0.05 for the number of location-specific distinct changes compared to total distinct changes for RSVA versus RSVB.
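The counting of distinct changes and the RSVA-versus-RSVB comparison in the footnotes above can be sketched as follows. This is a hedged illustration, not the authors' analysis: the function names are hypothetical, the two-sided Fisher's exact test is implemented directly from the hypergeometric distribution, and the contingency table (changes unique to Kilifi versus the remaining Kilifi changes, for RSVA versus RSVB) is an assumption, because the source does not spell out which totals were compared.

```python
from math import comb


def distinct_changes(per_sequence_changes):
    """Union of per-sequence change sets: repeated changes are counted once."""
    return set().union(*per_sequence_changes)


def unique_to_local(local_sets, global_sets):
    """Distinct changes seen in local (Kilifi) sequences but in no global one."""
    return distinct_changes(local_sets) - distinct_changes(global_sets)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))  # small tolerance for float ties
    return min(1.0, total)


# Toy change sets of (position, residue) pairs.
kilifi = [{(4, "T"), (9, "L")}, {(4, "T")}]
world = [{(9, "L")}, {(12, "S")}]
kilifi_only = unique_to_local(kilifi, world)

# G protein row of the table: 7 of 68 (RSVA) versus 30 of 70 (RSVB)
# Kilifi changes were unique to Kilifi (assumed table layout).
p_g = fisher_exact_two_sided(7, 68 - 7, 30, 70 - 30)
```

Under this assumed table layout, the G protein counts give a P value well below the 0.01 level quoted in the first footnote.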