J R Otieno, E M Kamau, J W Oketch, J M Ngoi, A M Gichuki, Š Binter, G P Otieno, M Ngama, C N Agoti, P A Cane, P Kellam, M Cotten, P Lemey, D J Nokes.
Abstract
The respiratory syncytial virus (RSV) group A variant with the 72-nucleotide duplication in the G gene, genotype ON1, was first detected in Kilifi in 2012 and has almost completely replaced circulating genotype GA2 strains. This replacement suggests some fitness advantage of ON1 over the GA2 viruses in Kilifi, and might be accompanied by important genomic substitutions in ON1 viruses. Close observation of such a new virus genotype introduction over time provides an opportunity to better understand the transmission and evolutionary dynamics of the pathogen. We have generated and analysed 184 RSV-A whole-genome sequences (WGSs) from Kilifi (Kenya) collected between 2011 and 2016, the first ON1 genomes from Africa and the largest collection globally from a single location. Phylogenetic analysis indicates that RSV-A circulation in this coastal Kenya location is characterized by multiple introductions of viral lineages from diverse origins but with varied success in local transmission. We identified signature amino acid substitutions between ON1 and GA2 viruses' surface proteins (G and F), polymerase (L), and matrix M2-1 proteins, some of which were positively selected, and thereby provide an enhanced picture of RSV-A diversity. Furthermore, five of the eleven RSV open reading frames (ORFs) (G, F, L, N, and P) formed distinct phylogenetic clusters for the two genotypes. This might suggest that coding regions outside of the most frequently studied G ORF also play a role in the adaptation of RSV to host populations, with the alternative possibility that some of the substitutions are neutral and provide no selective advantage. Our analysis provides insight into the epidemiological processes that define RSV spread, highlights the genetic substitutions that characterize emerging strains, and demonstrates the utility of large-scale WGS in molecular epidemiological studies.
Keywords: ON1; RSV; genomic epidemiology; phylodynamics; respiratory syncytial virus; virus evolution
Year: 2018 PMID: 30271623 PMCID: PMC6153471 DOI: 10.1093/ve/vey027
Source DB: PubMed Journal: Virus Evol ISSN: 2057-1577
Figure 1. Sample sequencing and genome details. The two RSV-A whole genome amplification strategies used in this study are shown in (A), i.e. six and fourteen amplicons. For each panel the positions of primer targets for each amplicon are indicated. The locations of the eleven RSV ORFs are indicated on top of panel 1. (B) The proportion of RSV genome length sequence recovered (using KC731482 as the reference) for all 184 genomes, plotted as a function of the sample’s diagnostic real-time PCR Ct value. (C) The distribution of the diagnostic real-time PCR Ct values for the 184 sequenced samples reported here (KCH and KHDSS). (D) The log values of the sequencing depth (see Materials and methods) at each position of the genome assemblies along the concatenated RSV ORFs (i.e. excluding the intergenic regions).
Figure 2. Global and local ON1 MCC trees and PCA. (A) MCC tree inferred from 344 global full genome sequences (see Materials and methods) with the tips colour coded by the continent of sample collection. All the African samples in this dataset (in blue, K1-3 and vertical bars) were available only from Kilifi (Kenya). Node labels are posterior probabilities indicating support for the selected nodes. (B) The evolutionary rate estimates for the different genotype ON1 ORFs. (C) An MCC tree inferred from 154 ON1 genomes from Kilifi annotated with identified lineages A and B, and the tips colour coded by epidemic season. (D) A PCA (see Materials and methods) of the same dataset as (C), similarly annotated with the epidemic season. The percentage of variance explained by each component is indicated on the axes.
Figure 3. Global ON1 G-gene MCC phylogenetic tree. An MCC tree inferred from 1,167 partial ON1 G gene global sequences with the tips colour coded by source continent.
Figure 4. Pairwise genomic distances and genome-wide amino acid variation. The distributions of pairwise genetic distances between genotype GA2 and ON1 genome sequences are shown in (A) and (B), respectively. (C) An entropy plot showing amino acid variation along the concatenated ORFs of Kilifi RSV-A genomes.
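Panel (C) of Figure 4 summarises amino acid variation as per-site entropy. The caption does not state the exact formula, log base, or tool used, so the following is only a minimal sketch under assumptions: it computes Shannon entropy with the natural logarithm for each column of an amino acid alignment and ignores gap or unknown characters. The function names `column_entropy` and `alignment_entropy` are hypothetical and are not taken from the study.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (natural log) of one alignment column.

    Assumption: gap/unknown symbols ('-', 'X', '?') are ignored.
    """
    residues = [c for c in column.upper() if c not in "-X?"]
    if not residues:
        return 0.0
    n = len(residues)
    counts = Counter(residues)
    return -sum((k / n) * math.log(k / n) for k in counts.values())


def alignment_entropy(seqs: list[str]) -> list[float]:
    """Per-site entropy for a list of aligned, equal-length sequences."""
    if len({len(s) for s in seqs}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    # zip(*seqs) walks the alignment column by column
    return [column_entropy("".join(col)) for col in zip(*seqs)]
```

A fully conserved site scores zero, and the score rises as residues at a site become more evenly mixed, which is why peaks in such a plot point to variable positions.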
Figure 5. Estimated TMRCA for Kilifi RSV-A viruses and ORFs. (A) MCC tree inferred from 184 RSV-A complete genome sequences (concatenated coding regions only) from Kilifi with the tips colour coded by genotype, i.e. ON1 (cyan) and GA2 (red). The two node bars indicate the 95% HPD interval for the TMRCA for the Kilifi GA2 and ON1 viruses (grey), and Kilifi ON1 strains (blue). Node labels are posterior probabilities indicating support for the selected nodes. (B) The TMRCA (with 95% HPD interval) of the node separating Kilifi RSV-A genotype GA2 and ON1 viruses for a concatenated set of all ORFs and five different ORFs. The stars (*) indicate node posterior support of less than 0.9 (i.e. low support) for the split between GA2 and ON1 in the nucleoprotein (N) and phosphoprotein (P) ORFs.
Signature nonsynonymous substitutions between genotype ON1 and GA2 viruses.
| ORF | ORF Nt Pos. | ORF AA Pos. | Nt Change | AA Change | SNP type |
|---|---|---|---|---|---|
| G | 424 | 142 | TT → CA | L → Q | Substitution |
| G | 622 | 208 | C → A | L → I | Transversion |
| G | 695 | 232 | G → A | G → E | Transition |
| G | 709 | 237 | A → G | N → D | Transition |
| G | 758 | 253 | A → C | K → T | Transversion |
| G | 817 | 273 | T → A | Y → N | Transversion |
| G | 821 | 274 | C → T | P → L | Transition |
| G | 851 | 284 | 72 nt duplication | 24 AA insertion | Deletion |
| G | 929 (GA2: 857) | 310 | C → T | P → L | Transition |
| G | 941 (GA2: 869) | 314 | T → C | L → P | Transition |
| F | 346 | 116 | A → G | N → D | Transition |
| F | 364 | 122 | G → A | A → T | Transition |
| M2-1 | 349 | 117 | A → C | N → H | Transversion |
| L | 1792 | 598 | C → T | H → Y | Transition |
| L | 5175 | 1725 | A → T | E → D | Transversion |
Positions are relative to ON1 strains; the corresponding positions in GA2 strains (which lack the duplication) within the G protein are shown in brackets.
Positively selected sites.
Nt, nucleotide; AA, amino acid; Pos., Position.
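The position and SNP type columns of the table follow simple rules that can be checked mechanically. The sketch below is illustrative only and is not the authors' pipeline: it assumes 1-based ORF nucleotide coordinates, and the names `codon_position` and `snp_type` are hypothetical. A transition swaps a purine for a purine (A, G) or a pyrimidine for a pyrimidine (C, T); any other single-base change is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def codon_position(nt_pos: int) -> int:
    """Map a 1-based ORF nucleotide position to its 1-based codon (amino acid) position."""
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1


def snp_type(ref: str, alt: str) -> str:
    """Classify a single-nucleotide change as 'Transition' or 'Transversion'."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected single bases from A, C, G, T")
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    # Both bases in the same chemical class means a transition
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "Transition" if same_class else "Transversion"
```

For example, `codon_position(424)` gives 142 and `snp_type("C", "A")` gives `"Transversion"`, matching the first two G rows of the table. Multi-base changes such as TT → CA and the 72-nucleotide duplication fall outside this single-base classification.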