Whole-Genome Sequencing Analysis from the Chikungunya Virus Caribbean Outbreak Reveals Novel Evolutionary Genomic Elements.

Kenneth A Stapleford1, Gonzalo Moratorio1, Rasmus Henningsson1,2, Rubing Chen3, Séverine Matheus4, Antoine Enfissi4, Daphna Weissglas-Volkov5, Ofer Isakov5, Hervé Blanc1, Bryan C Mounce1, Myrielle Dupont-Rouzeyrol6, Noam Shomron5, Scott Weaver3, Magnus Fontes2, Dominique Rousset4, Marco Vignuzzi1.   

Abstract

BACKGROUND: Chikungunya virus (CHIKV), an alphavirus and member of the Togaviridae family, is capable of causing severe febrile disease in humans. In December of 2013 the Asian Lineage of CHIKV spread from the Old World to the Americas and then rapidly throughout the New World. Given this new emergence in naïve populations, we studied the viral genetic diversity present in infected individuals to understand how CHIKV may have evolved during this continuing outbreak.
METHODOLOGY/PRINCIPAL FINDINGS: We used deep-sequencing technologies coupled with well-established bioinformatics pipelines to characterize the minority variants and diversity present in CHIKV-infected individuals from Guadeloupe and Martinique, two islands in the center of the epidemic. We observed changes in the consensus sequence as well as a diverse range of minority variants present at various levels in the population. Furthermore, we found that overall diversity was dramatically reduced after single passages in cell lines. Finally, we constructed an infectious clone from this outbreak and identified a novel 3' untranslated region (UTR) structure, not previously found in nature, that led to increased replication in insect cells.
CONCLUSIONS/SIGNIFICANCE: Here we performed an intrahost quasispecies analysis of the new CHIKV outbreak in the Caribbean. We identified novel variants present in infected individuals, as well as a new 3'UTR structure, suggesting that CHIKV has rapidly evolved in a short period of time once it entered this naïve population. These studies highlight the need to continue viral diversity surveillance over time as this epidemic evolves in order to understand the evolutionary potential of CHIKV.

Year:  2016        PMID: 26807575      PMCID: PMC4726740          DOI: 10.1371/journal.pntd.0004402

Source DB:  PubMed          Journal:  PLoS Negl Trop Dis        ISSN: 1935-2727


Introduction

Arthropod-borne viruses (arboviruses) pose an imminent threat to public health worldwide and are continuously re-emerging or spreading to uninfected areas. In particular, chikungunya virus (CHIKV) has recently spread to the Americas to cause an estimated 1.7 million cases of severe, debilitating and often chronic arthralgia after roughly 60 years of circulation within Africa and Asia [1] [2] [3] [4]. This raises questions about how CHIKV will spread, evolve, and adapt in new environments in the near future. Previous epidemics of CHIKV have been attributed to adaptive mutations within the viral glycoproteins, allowing the virus to more readily infect the Asian tiger mosquito Aedes albopictus, and thus increase its transmission throughout areas of the world harboring this mosquito species. Interestingly, the CHIKV strain that has arrived in the Americas is from the Asian lineage and does not yet contain these adaptive mutations. However, both Aedes aegypti and Ae. albopictus are prevalent throughout many parts of North and South America [5], which, along with an enormous naïve human population, gives this new strain ample opportunity to undergo adaptive evolution. Using deep-sequencing technologies, we recently characterized the evolution of CHIKV within the mosquito host, where we recapitulated the emergence of previous epidemic variants and identified novel mutations yet to be detected in nature [6]. A survey of the mutant spectra present in human clinical samples, on the other hand, has not yet been performed for CHIKV. Here, we characterize the minority variants directly from human samples, collected between week 52 of 2013 and week 5 of 2014, by whole-genome deep sequencing. While no significant consensus changes were observed between these samples collected within a short period of time, our data reveal considerable intra-host genetic diversity.
Most importantly, we identify a 3’ untranslated genome region (UTR) duplication that may have been missed by the initial sequencing performed on the ongoing epidemic in the Americas, which seems unique among the circulating CHIKV strains around the world.

Methods

Ethics Statement

Samples involved in this study were chosen among human serum specimens received as part of standard diagnostic and expertise activities of the arboviruses National Reference Center for French Departments of the Americas located in French Guiana. The donor samples were rendered completely anonymous and renumbered prior to preparation of extracted RNA for sequencing with only the week of sampling and island of origin retained. Of the 100 samples, 20 gave whole-genome deep sequence coverage and five others (last five in Table 1) gave partial coverage and were retained for analysis, and assigned new IDs as indicated in Table 1.
Table 1

Clinical samples used in this study.

Sample (a)  Week-Year Sampled  Island of Origin  RNA copies/ml  Accession Number
M100        52–2013            MART              6.2E+07        LN898093
G100        2–2014             GUA               1.7E+04        LN898094
M101        3–2014             MART              4.7E+04        LN898095
M102        3–2014             MART              2.5E+07        LN898096
G101        3–2014             GUA               1.6E+07        LN898097
G102        3–2014             GUA               2.5E+06        LN898098
G103        3–2014             GUA               3.5E+06        LN898099
M103        3–2014             MART              2.1E+07        LN898100
M104        3–2014             MART              2.8E+07        LN898101
G104        3–2014             GUA               2.2E+07        LN898102
G105        3–2014             GUA               3.0E+06        LN898103
M105        3–2014             MART              9.4E+06        LN898104
M106        3–2014             MART              3.3E+06        LN898105
M107        3–2014             MART              1.3E+07        LN898106
M108        3–2014             MART              8.6E+05        LN898107
M109        4–2014             MART              4.8E+07        LN898108
M110        4–2014             MART              4.5E+06        LN898109
G106        5–2014             GUA               8.7E+07        LN898110
G107        5–2014             GUA               3.6E+06        LN898111
M111        3–2014             MART              8.0E+06        LN898112
M112        3–2014             MART              2.1E+06        n/a
M113        4–2014             MART              7.6E+04        n/a
M114        4–2014             MART              2.8E+04        n/a
M115        3–2014             MART              1.5E+06        n/a
M116        3–2014             MART              1.2E+03        n/a

(a) Individual from which the infectious clone was derived.

MART = Martinique, GUA = Guadeloupe, n/a = not submitted to bank, incomplete genome coverage

Selection of Clinical Samples

One hundred human sera positive for CHIKV by qRT-PCR were randomly selected from among (1) those sampled between week 52 of 2013 and week 5 of 2014, around the beginning of the epidemic phase in the French Caribbean islands, and (2) those having a high viral load (mostly between 10^6 and 10^7 copies/ml), although some samples with lower viral loads were added to examine whether sampling bias with respect to viral load had occurred (Table 1). One third of these samples were from Guadeloupe and two thirds from Martinique. The consensus sequences of the 20 whole-genome samples were deposited in the European Nucleotide Archive, with the accession numbers indicated in Table 1, and accessible at http://www.ebi.ac.uk/ena/data/view/LN898093-LN898112.

Deep-Sequencing Analysis

Total RNA from patient serum was isolated by Trizol (Sigma) extraction following the manufacturer’s protocol, resuspended in nuclease-free water, and used directly for cDNA synthesis using the Maxima H Minus First Strand cDNA Synthesis Kit (Thermo Scientific) with random hexamers. Following cDNA synthesis, approximately 2 kb amplicons of the CHIKV genome were amplified by Phusion DNA polymerase using primers designed from the published St. Martin CHIKV strain CNR20235 (Table 2) (http://www.european-virus-archive.com/article147.html). Amplicons were subsequently purified with a NucleoSpin PCR purification kit (Macherey-Nagel), quantified by PicoGreen, and fragmented as described previously [6]. Sequences were obtained with an Illumina NextSeq 500 machine and aligned against the CNR20235 reference sequence using the ViVan pipeline [7], which differentiates statistically significant variants from the total SNPs identified within reads. For example, patient 1, amplicon 1 presented 2557 SNPs in the quality-filtered reads along the 871 nucleotide sites sequenced, with the lowest frequencies at 0.00001. ViVan statistical analysis reduced these SNPs to 1188, with the lowest frequency at 0.0001, for an average read coverage of 80,000X. We set an additional, conservative cut-off of a minimum of 3,000X coverage and 0.001 frequency, bringing the total SNPs in this sample to 564. Average coverages were above 70,000X for all samples. All samples had profiles similar to the example given above, with no apparent outliers: 95–100% of all possible SNPs were represented in quality-filtered reads, 36–48% in ViVan-filtered data and 16–23% above the conservative cut-off. Variants with a frequency above 0.5 (50% of the total population) were considered consensus changes and were added to the CNR20235 reference sequence. The consensus sequences obtained from the 20 whole-genome samples were deposited in the European Nucleotide Archive with accession numbers listed in Table 1.
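The cut-offs described above can be illustrated with a minimal Python sketch. It applies only the final thresholds stated in the text (at least 3,000X coverage, frequency of at least 0.001, and consensus status above 0.5). It does not reproduce the ViVan statistical test. The `Variant` record, the function names, and the toy reference string are hypothetical and are not part of the published pipeline.

```python
from dataclasses import dataclass

MIN_COVERAGE = 3000         # conservative read-depth cut-off (3,000X)
MIN_FREQUENCY = 0.001       # lowest variant frequency retained
CONSENSUS_FREQUENCY = 0.5   # above this, the variant becomes the consensus base


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one substitution at one genome site."""
    position: int      # 1-based genome position
    ref: str           # reference base
    alt: str           # variant base
    frequency: float   # fraction of reads carrying the variant base
    coverage: int      # read depth at the site


def classify(variant):
    """Return 'consensus', 'minority', or None if the variant fails the cut-offs."""
    if variant.coverage < MIN_COVERAGE or variant.frequency < MIN_FREQUENCY:
        return None
    if variant.frequency > CONSENSUS_FREQUENCY:
        return "consensus"
    return "minority"


def apply_consensus_changes(reference, variants):
    """Write consensus-level substitutions into a copy of the reference sequence."""
    bases = list(reference)
    for variant in variants:
        if classify(variant) == "consensus":
            bases[variant.position - 1] = variant.alt
    return "".join(bases)


# Toy data, for illustration only.
toy_variants = [
    Variant(2, "C", "T", 0.97, 80000),   # consensus change
    Variant(5, "A", "G", 0.012, 75000),  # minority variant
    Variant(7, "G", "A", 0.40, 1200),    # discarded: coverage too low
]
print(apply_consensus_changes("ACGTACGT", toy_variants))
```

In this sketch a variant sitting exactly at a threshold (coverage 3,000 or frequency 0.001) is kept, which is one reading of "a minimum of"; the source does not state how ties were handled.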
To align “total” and “unique” reads, an in-house pipeline was used. The reads were trimmed to remove low-quality bases using fastq-mcf [8] and aligned with bwa-mem [9] to an artificial reference genome consisting of the two reference genomes (the published Asian Lineage reference and the Caribbean reference).
For deep sequencing of tissue culture passaged virus, human sera were placed directly on Vero or C6/36 cells and supernatants were collected three days post-infection for C6/36 cells or at cell death for Vero cells. Viral RNA was extracted and analyzed as described above.

Phylogenetic Alignment and Analysis

Full-length CHIKV sequences were aligned using the CLUSTAL W program [10]. Once aligned, the program Model Generator [11] was used to identify the evolutionary model that best described our sequence dataset. The Akaike information criterion and the hierarchical likelihood ratio test both indicated that the GTR + Γ + I model best fit the sequence data. Maximum-likelihood phylogenetic trees were constructed under the GTR + Γ + I model using PhyML [11]. As a measure of the robustness of each node, we used an approximate Likelihood Ratio Test (aLRT), which tests whether the branch under study provides a significant gain in likelihood over the null hypothesis in which that branch is collapsed while the rest of the tree topology is left unchanged. aLRT was calculated using three different approaches: (a) minimum of Chi square-based calculations; (b) a Shimodaira-Hasegawa-like procedure (SH-like) [12] [13], which is non-parametric; and (c) a combination of both (SH-like and the minimum Chi square-based calculations), which is the most conservative option for these calculations. In addition, the bootstrap method was also used.
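As an illustration of the model-selection step, the Akaike information criterion is AIC = 2k − 2 ln L, where k is the number of free model parameters and ln L is the maximized log-likelihood; the model with the lowest AIC is preferred. The Python sketch below shows that comparison only. The log-likelihood values are invented for the example and are not results from this study, and the parameter counts cover substitution-model parameters only (branch lengths are shared by all candidates on a fixed tree and so do not change the ranking).

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def best_model(fits):
    """Return the name of the model with the lowest AIC.

    fits maps a model name to a (log_likelihood, n_params) tuple.
    """
    return min(fits, key=lambda name: aic(*fits[name]))


# Hypothetical fits: (ln L, number of free substitution-model parameters).
example_fits = {
    "JC69": (-52000.0, 0),
    "HKY85": (-50500.0, 4),
    "GTR": (-50200.0, 8),
    "GTR+G+I": (-49800.0, 10),
}

for name, (lnl, k) in example_fits.items():
    print(name, aic(lnl, k))
print("preferred:", best_model(example_fits))
```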

Construction of Caribbean Strain Infectious Clone

Patient serum was first inoculated on the Ae. albopictus mosquito cell line C6/36 and CHIKV obtained was subsequently amplified on Vero cells to generate a working viral stock. Viral RNA was isolated by Trizol extraction and cDNA was synthesized as described above. The infectious clone was constructed using four PCR amplicons generated by Phusion DNA polymerase using the primers in and subcloned into the plasmid containing the published Indian Ocean Lineage (IOL) infectious clone [14] using common restriction sites. In brief, amplicon one was subcloned into the BamHI and AgeI restriction sites of the IOL infectious clone also generating a unique AgeI restriction site in the Caribbean CHIKV infectious clone sequence. The BamHI site was then removed by site-directed mutagenesis. Amplicon two was subcloned between the two AgeI restriction sites, followed by the subcloning of three into the 3’ AgeI and XhoI restriction sites. Finally, amplicon four was subcloned into the XhoI and NotI restriction sites. Each cloning precursor was Sanger sequenced and the final clone was Sanger sequenced in full.

Cell Culture and Viruses

Baby Hamster Kidney (BHK-21) cells and Vero cells were maintained in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% fetal calf serum and 1% penicillin/streptomycin (P/S) at 37°C with 5% CO2. Ae. albopictus cells (C6/36 and U4.4) were maintained in Leibovitz L-15 medium supplemented with 10% fetal bovine serum, 1% P/S, 1% nonessential amino acids, and 1% tryptose phosphate broth at 28°C and 5% CO2. Working viral stocks from the Caribbean infectious clone were generated as described previously [14]. The Caribbean strain infectious clone lacking the 3’UTR duplication was commercially synthesized and supplied by the laboratory of Andres Merits. The CHIKV strain NC/2011-58 (accession # HE806461) was a gift from the Institut Pasteur–New Caledonia. All viruses were passaged once over BHK cells to obtain a working viral stock. Viral titers were determined by plaque assay on Vero cells as previously described [14]. The strains from Mexico, Dominican Republic and Trinidad used to confirm the presence of the 3'UTR duplication were obtained from Scott Weaver and Rubing Chen.

Viral Growth Curves

Mammalian cells (BHK-21 and Vero) and insect cells (C6/36 and U4.4) were infected with each virus at an MOI of 0.1 in infection media (DMEM containing 0.2% bovine serum albumin, 1 mM HEPES pH 7.4, and 1% P/S) for one hour at 37°C for mammalian cells and 28°C for insect cells. Virus was subsequently removed and cells were washed twice with phosphate buffered saline (PBS) and complete media was added. Aliquots of the viral supernatant were taken at the indicated time points and viral titers were determined by plaque assay as described previously [14].
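For readers less familiar with the arithmetic behind an MOI of 0.1, the following Python sketch shows the standard relationships: a plaque-assay titer is the plaque count divided by the product of dilution and plated volume, and the inoculum volume is MOI × cell number ÷ titer. The cell number, plaque count, dilution, and volumes are hypothetical values chosen for the example; the source does not report them.

```python
def plaque_titer(plaques, dilution, volume_plated_ml):
    """Titer in PFU/ml from a plaque count at a given dilution and plated volume."""
    return plaques / (dilution * volume_plated_ml)


def inoculum_volume_ml(moi, n_cells, titer_pfu_per_ml):
    """Volume of virus stock (ml) needed to infect n_cells at the requested MOI."""
    return moi * n_cells / titer_pfu_per_ml


# Hypothetical example: 42 plaques counted at a 1e-5 dilution, 0.2 ml plated.
stock_titer = plaque_titer(plaques=42, dilution=1e-5, volume_plated_ml=0.2)

# Hypothetical well containing 5e5 cells, infected at the MOI used in the text.
volume = inoculum_volume_ml(moi=0.1, n_cells=5e5, titer_pfu_per_ml=stock_titer)

print(f"stock titer: {stock_titer:.2e} PFU/ml")
print(f"inoculum: {volume * 1000:.2f} microlitres of stock")
```

In practice the computed stock volume would be diluted in infection medium to a workable inoculum volume.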

Results

To analyze the viral diversity present within human hosts infected during the current CHIKV outbreak in the Americas, we deep-sequenced 25 viral strains from sera of infected patients from Martinique and Guadeloupe (Table 1). The samples, consisting of approximately 70% from Martinique, were taken between week 52–2013 and week 5–2014 and represented the same proportion of patients who were diagnosed as CHIKV positive during this time. Using our deep-sequencing data, we assembled the consensus sequences obtained from each patient to determine the degree of genetic variability of these strains by phylogenetic approaches. These sequences were aligned with 63 full-length CHIKV strains isolated elsewhere, representing all major CHIKV lineages. Subsequently, maximum likelihood phylogenetic trees were constructed. The results of our analysis were in agreement with previous studies [1] [15] [16], placing these viruses in the Asian Lineage of CHIKV, where they cluster with the St. Martin strain CNR20236. Furthermore, deep-sequencing analysis identified a variety of unique high-frequency intra-host minority variants (at greater than 0.5% frequency) in infected individuals, as well as five synonymous consensus sequence differences with respect to the initially reported strain from St. Martin, which were common to all patients (nsP2 position 2716, A>G; nsP3 position 4507, C>A; position 4513, A>G; and the 3’UTR position 11952, C>T; position 11953, G>A) (indicated in bold).

Phylogenetic analysis of chikungunya virus human samples from Martinique and Guadeloupe.

A. Maximum likelihood phylogenetic tree of chikungunya virus strains using complete genome sequences. Scale bar represents genetic distance. Numbers at the branches show aLRT values (1 = 1000). Genotypes are indicated on the right. B. Maximum likelihood phylogenetic tree based on the Caribbean outbreak cluster with additional Asian strains of chikungunya virus using full genome sequences. Scale bar denotes genetic distance. Numbers at the branches show aLRT values (1 = 1000). Strains from Martinique are in blue and those from Guadeloupe are in red.

a nt = nucleotide; b AA = amino acid; c syn = synonymous. Bold print indicates five novel synonymous sequence changes. The table shows any mutation present above 0.1 in at least one individual, and any mutation above 0.001 found in more than one individual.

We then evaluated the overall genetic variability of the outbreak strain across the entire CHIKV genome. With the exception of the five consensus changes, yielding a frequency of 1, we found a number of high-frequency minority variants scattered throughout the genome; yet the majority were unique to individual patients, suggesting they did not mediate significant adaptation at the population level. In particular, these variants were found primarily in the nonstructural proteins and 3’UTR, with only several variants present in the structural genes. When we analyzed low-frequency minority variants we found a diverse population containing variants with frequencies ranging from as much as 30% to less than 0.1% of the population. We next analyzed the specific variants in each gene. We found that the nonstructural proteins presented many more variants (56 at a frequency of at least 10% of the population) than the structural proteins (5 variants at a frequency above 10% of the population).
In particular, within nsP1, which functions as the methyltransferase and is necessary for RNA replication, we observed eight amino acid changes and three synonymous changes in individual patients and localized variation concentrated in the methyltransferase and D3 domains. nsP2 is a multifunctional protein serving as the viral helicase, protease, and NTPase. We observed considerable variation scattered across nsP2: this protein contained six synonymous and six amino acid changes, including one mutation (G772D) present at roughly 50% of the viral population in one individual. The nsP3 gene, which encodes a phosphoprotein required for RNA replication, contained the largest number of minority variants, with 23 total changes. These include only five synonymous changes and 18 coding changes, with several minority variants making up a considerable portion of the viral population. We noted the mutations of several serine residues in nsP3 (S48R, S255F, and S340P) that were present at nearly 100% in one individual and could change the phosphorylation state of the protein, as well as several charge changes (G113R, R178Q and E415K) that may have functional roles in RNA binding or viral replication. The viral RNA-dependent RNA polymerase, nsP4, contained 10 sequence changes with three synonymous and seven coding changes. Interestingly, one patient presented a stop codon (L441Stop) at 63% of the viral population that is coupled with the R178Q mutation of nsP3 mentioned above. It is possible that these mutations may function together to ablate the large amount of truncated nsP4 in this individual.

Minority variant analysis of chikungunya virus nonstructural proteins.

A schematic representation of the chikungunya virus nonstructural proteins is shown above the graph. Nonstructural domains represent the following: nsP1, nonstructural protein 1, genome nucleotides 77–1681, MT = methyltransferase domain (amino acids 1–170), MB = membrane binding domain (amino acids 171–300); nsP2, nonstructural protein 2, genome nucleotides 1682–4072, NTD = N-terminal domain, MTLD = methyltransferase-like domain; nsP3, nonstructural protein 3, genome nucleotides 4073–5653, ZBD = zinc binding domain, HVR = hypervariable region (amino acids 328–530); nsP4, nonstructural protein 4, genome nucleotides 5654–7489, CatD = catalytic domain (amino acids 365–479). Variants with a frequency greater than 0.001 were used for analysis. Top graph represents the mean variant frequency over all samples. Middle graph represents the mean frequency (black) and variant range (red) over all samples. Bottom graph represents mean variant frequency smoothed with a Gaussian kernel.
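The Gaussian-kernel smoothing mentioned for the bottom graph can be sketched in plain Python as follows. This is a generic illustration of the technique, not the authors' plotting code: the kernel width `sigma` (in nucleotides), the truncation at three standard deviations, and the renormalization at the sequence ends are all assumptions, since the source does not state the bandwidth or edge handling used.

```python
import math


def gaussian_smooth(values, sigma):
    """Smooth a per-position series with a truncated, normalized Gaussian kernel.

    values: mean variant frequency at each nucleotide position
    sigma:  kernel standard deviation in positions (assumed, not from the source)
    """
    radius = max(1, int(3 * sigma))
    offsets = range(-radius, radius + 1)
    weights = [math.exp(-(d * d) / (2 * sigma * sigma)) for d in offsets]

    smoothed = []
    for i in range(len(values)):
        total = 0.0
        norm = 0.0
        for d, w in zip(offsets, weights):
            j = i + d
            if 0 <= j < len(values):   # ignore kernel tails beyond the sequence ends
                total += w * values[j]
                norm += w
        smoothed.append(total / norm)
    return smoothed


# Toy profile: a single high-frequency site in an otherwise flat background.
profile = [0.002] * 10 + [0.3] + [0.002] * 10
print([round(x, 4) for x in gaussian_smooth(profile, sigma=2.0)])
```

Smoothing of this kind spreads isolated peaks over neighboring positions, which makes regional hot spots easier to see but lowers the apparent height of single-site variants.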

Minority variant analysis of chikungunya virus structural proteins.

A schematic representation of the chikungunya virus structural proteins is shown above. Capsid, genome nucleotides 7555–8337, NLS = nuclear localization signal (amino acids 60–99), NES = nuclear export signal (amino acids 143–155); E3, genome nucleotides 8338–8529; E2, genome nucleotides 8530–9801, TM = transmembrane domain (amino acids 363–391); 6k, genome nucleotides 9802–9981; E1, genome nucleotides 9982–11301, FL = fusion loop (amino acids 83–101), TM = transmembrane domain (amino acids 413–436). Variants with a frequency greater than 0.001 were used for analysis. Top graph represents the mean frequency over all samples. Middle graph represents the mean frequency (black) and variant range (red) over all samples. Bottom graph represents mean variant frequency smoothed with a Gaussian kernel.

In contrast, the structural genes presented fewer minority variants, at lower frequencies, with only two nonsynonymous variants in the E3 and E1 genes, with frequencies between 10 and 14 percent of the viral population. Given the role of the viral glycoproteins E2 and E1 in viral evolution and transmission, we looked at the overall genetic variation within these two proteins. Here, we found variation in portions of the protein involved in protein-protein interactions (Domains I and II of E1, as well as the E2 and E1 stem domains) and in the transmembrane domains required for membrane binding. Interestingly, regions flanking the fusion loop (amino acids 83–101) of E1 contained more variation than the fusion loop itself, indicating that changes directly in the fusion loop are poorly tolerated. Importantly, no samples contain any significant levels of previously observed vector-adaptive mutations such as E2 L210Q, E1 T98A, or E1 A226V, which could potentially facilitate dissemination of this virus in Aedes albopictus mosquitoes [17].
In addition to the variation within serum samples, we also addressed how minority variants change as they are passaged through tissue culture, a technique that is commonly used to amplify viral stocks from low-titer human samples. By passaging five different individual sera once through mammalian and insect cells, we found the number of higher-frequency variants dropped considerably, from roughly 60 variants in the human samples to 15 in the tissue culture passages. Of the 15 variants, the six consensus sequence changes were maintained in insect cells; however, passaging through mammalian cells removed the two consensus changes in nsP3 (positions 4507 and 4513). Furthermore, passaging virus through tissue culture cells also identified five novel variants not found in the human samples. These included two synonymous changes in E2 (position 8874, C>T) and E1 (position 10104, G>A), as well as three coding changes in nsP2 (G460S, 37% of the population in mammalian cells), nsP4 (L455M, 95% of the population in insect cells), and E1 (G274V, 95% of the population in mammalian cells) in unique sera passages. When we compared the variants between serum and tissue culture passages of the same samples, we found that the high-frequency variants present in the sera were indeed maintained over passaging. Finally, passaging each virus a single time through mammalian cells maintained more diversity than passaging virus once through insect cells (bottom graphs in each panel), something that has been seen previously when looking at viral adaptation between such disparate hosts [14]. Taken together, these data shed light on variable hot spots within the CHIKV genome, identify novel variants circulating at high frequency in individuals, and suggest that if wildtype-like population diversity is desired, it may be best to passage viral strains through highly permissive mammalian cell lines.

Comparison of minority variants present in either mammalian or insect tissue culture passaged viral stocks.

A. Frequency of variants present in nsP3 (nucleotides 4073–5653). B. Frequency of variants present in E2 (nucleotides 8530–9801). C. Frequency of variants present in E1 (nucleotides 9982–11301). Top graphs represent the mean variant frequency (black) and variant frequency range (red) over each sample. Bottom graphs represent mean low-level variant frequency (0.001–0.04) smoothed with a Gaussian kernel.

nt = nucleotide, AA = amino acid, syn = synonymous. The table shows any mutation present above 0.1 in at least one individual, and any mutation above 0.001 found in more than one individual.

In addition to characterizing the viral diversity present in the coding regions of the CHIKV genome, we also examined the diversity within the noncoding untranslated regions (UTR). To begin, we analyzed the well-conserved 5’UTR and 3’ subgenomic promoter and found only slight variations in these regions (frequencies ranging from 0.01 to 10% in the 5’UTR and 0.01 to 1% in the subgenomic promoter), suggesting that RNA secondary structure in these regions is maintained and may be important for viral replication.

Analysis of minority variants in noncoding genome regions and identification of a novel duplication in the 3’UTR.

A-C. Minority variants present in the 5’UTR, genome nucleotides 1–76 (A), subgenomic promoter, genome nucleotides 7490–7554 (B), and 3’UTR, genome nucleotides 11302–11944 (C). Graphs represent the mean variant frequency (black) and variant range (red) at each nucleotide. D. Schematic and alignment of the published Asian strain and novel Caribbean strain containing a 177 nucleotide duplication of the 3’ end of 1+2a and complete 1+2b in the 3’UTR. “Total reads” show the read coverage of all reads aligning to the two lineage references. E. Quantification of the total “unique reads”. Unique reads show the read coverage of all reads where the alignment to the Caribbean reference is superior to the alignment to the Asian Lineage reference. F. The presence of the 3'UTR duplication was confirmed in clinical samples obtained from Mexico [18], Dominican Republic and Trinidad.

In contrast, we found a higher degree of genetic diversity within the 3’UTR. Interestingly, when we initially aligned the deep-sequencing reads to the published St. Martin strain of CHIKV, we observed a near doubling of read coverage mapping to a region in the 3'UTR, spanning the 1+2a and 1+2b repeat regions (Asian Lineage), suggesting that this region may have been duplicated. Examination of filtered reads that did not properly align to the original reference sequence identified reads that overlapped the expected junction site where the duplication would have occurred. Indeed, this 177 nucleotide duplication was confirmed by Sanger sequencing of RT-PCR amplicons and mapped to a duplication of the 3’ portion of the 1+2a region and the complete 1+2b region (Caribbean Lineage). Subsequently, when we generated a new reference sequence containing the expected duplication for alignment of deep-sequencing data, the reads that originally could not map to the genome now mapped perfectly to the duplication in all patients where sequence data were available.
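The first signal described above, a near doubling of read coverage over a contiguous stretch, can be screened for with a simple per-position scan. The Python sketch below is a minimal illustration under stated assumptions: the fold threshold of 1.8, the minimum run length of 50 nucleotides, and the use of the median depth as baseline are arbitrary choices made for the example, not parameters from this study. Amplicon-based coverage is uneven in real data, so any hit would still need the confirmation steps reported in the text (junction-spanning reads and Sanger sequencing).

```python
from statistics import median


def find_elevated_regions(coverage, fold=1.8, min_length=50):
    """Return (start, end) runs, 1-based inclusive, where depth >= fold * median depth.

    coverage:   read depth at each position of the region being scanned
    fold:       multiple of the median depth that counts as elevated (assumed)
    min_length: shortest run reported, in nucleotides (assumed)
    """
    threshold = fold * median(coverage)
    regions = []
    start = None
    for i, depth in enumerate(coverage):
        if depth >= threshold:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_length:
                regions.append((start + 1, i))
            start = None
    # Close a run that extends to the end of the scanned region.
    if start is not None and len(coverage) - start >= min_length:
        regions.append((start + 1, len(coverage)))
    return regions


# Toy depth profile for a 643-nt region with a 177-nt stretch at double depth.
toy_coverage = [1000] * 300 + [2000] * 177 + [1000] * 166
print(find_elevated_regions(toy_coverage))
```

A coverage scan of this kind only flags candidate regions; it cannot by itself distinguish a duplication from amplification bias or overlapping amplicons.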
This duplication was found in patients presenting both low and high viremia. Importantly, we confirmed this duplication by RT-PCR in CHIKV clinical samples from Mexico, the Dominican Republic and Trinidad, suggesting that this novel genetic element is present in samples throughout the Caribbean islands and Americas.

To understand the function and evolutionary potential of these novel minority variants and RNA structural elements, we constructed an infectious clone of this virus and obtained an infectious clone lacking the 3’UTR duplication (Caribbean-∆3’UTR Duplication). Similar to what has been published previously, we found both of these viruses to replicate similarly to another Asian lineage strain of CHIKV (NC-2011) in mammalian cells [19]. However, in mosquito cells we found the Caribbean strain containing the 3’UTR duplication to have a roughly 10-fold growth advantage over the Asian strain as well as over a Caribbean strain lacking the 3’UTR duplication. These data suggest that the 3’UTR duplication not only has no negative impact on viral replication, but that this novel element directly provides an advantage to Asian strains in insects. Prior historic duplications in the 3’UTR are believed to have allowed the Asian genotype to recover from genetic drift after its introduction from Africa in the late 19th or early 20th century [19]. This infectious clone, containing the correct 3'UTR of the American strain of CHIKV, will provide a powerful tool with which to study pathogenesis and evolution.

Design and growth of the Caribbean strain infectious clone.

A. Schematic of the infectious clone, including a new synonymous AgeI restriction site and duplication in the 3’UTR. B and C. Viral growth curves of the Caribbean infectious clone (Caribbean–circle), a Caribbean clone lacking the 3’UTR duplication (Caribbean-∆3’UTR Duplication—square), or an Asian lineage strain of chikungunya virus (NC-2011—triangle) in either mammalian (Vero) or insect (C6/36) cells respectively. Mean values and SEM, n = 3, two-tailed unpaired t-test.
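The summary statistics named in the caption (mean, SEM, and an unpaired t-test with n = 3) can be computed as in the Python sketch below. The titer values are invented for illustration and are not data from this study. The sketch uses Student's pooled-variance form of the unpaired test and log10-transformed titers; both are assumptions, since the caption does not say whether equal variances or a log transform were used. Only the t statistic and degrees of freedom are computed; a p-value would require a t-distribution table or a statistics library.

```python
from math import sqrt
from statistics import mean, stdev


def mean_sem(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


def unpaired_t(a, b):
    """Student's unpaired t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    var_a, var_b = stdev(a) ** 2, stdev(b) ** 2
    pooled = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Hypothetical log10(PFU/ml) titers at one time point, three replicates each.
with_duplication = [7.9, 8.1, 8.0]
without_duplication = [6.9, 7.1, 7.0]

print("with duplication (mean, SEM):", mean_sem(with_duplication))
print("without duplication (mean, SEM):", mean_sem(without_duplication))
print("t statistic, df:", unpaired_t(with_duplication, without_duplication))
```

With three replicates per group the test has four degrees of freedom, for which the two-tailed 5% critical value of t is 2.776.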

Discussion

In this study we characterized the minority variants and viral populations within CHIKV-infected individuals from the recent Caribbean islands outbreak. It should be noted that the whole-genome analysis presented here was obtained from generally high viremia samples, which could possibly bias the observed diversity. Here we show that genetic diversity is spread throughout the coding region of the CHIKV genome, with higher levels of variation, including amino acid changes, in some nonstructural proteins such as nsP3. Given the relatively low frequency of most of these variants (<10%), these mutations likely represent de novo generated neutral or lower-fitness variants that would be purified during transmission bottlenecks. It is well known that the majority of mutations in RNA viruses bear a negative fitness cost [20] [21]. In our own study, human samples presented thousands of variants, over 80% of which had frequencies below 0.5%, yet above the background noise of sequencing. Most of the variants presenting stop mutations that would result in nonviable virus are within this group. The variants presented here likely retained some level of replicative capacity, yet not sufficient fitness to outcompete the master sequence. It will be interesting to further characterize these variants in the context of the infectious clone, which we have developed to better understand how these variants may function in CHIKV pathogenesis and disease. Our data also indicate that a considerable loss of the diversity present within human samples occurs after even a single passage in cell culture, especially in C6/36 mosquito cells. Even a single amplification passage in cell culture, although often necessary for diagnostics and surveillance, may thus introduce artifacts or purify previously existing minority variants that were in the process of emerging in human samples.
Phylogenetic analysis revealed that the strains collected in Guadeloupe and Martinique islands belong to the Asian genotype circulating in the Caribbean. Furthermore, the strains sequenced in this study show a close phylogenetic relationship, as reflected in the short genetic distances of their tree branches. We observed strains from Brazil (TR206 and AMA2798) and Mexico (InDRE_4CHIK and InDRE_51CHIK) to cluster with our cohort as well. Interestingly, the strain InDRE_4CHIK in Mexico was obtained from an imported case from Antigua and Barbuda, islands in close geographic proximity to Guadeloupe and Martinique [22] [23], and the strain from Brazil (TR206) was obtained from a patient who had recently traveled to Guadeloupe [15]. Finally, our phylogenetic analysis supports the previous hypothesis that CHIKV was introduced into the Americas by a single entry event of the Asian genotype [15]. Moreover, whereas previous analyses focused on amino acid changes that may impart a functional or enzymatic impact, we also analyzed variation in the noncoding, untranslated regions of the genome. In this analysis, we identified a novel 3’UTR duplication that had not previously been observed in nature, yet is present in several viruses circulating in the Americas including Mexico, the Dominican Republic and Trinidad. The fact that we have found this duplication in clinical samples from 2013 and again in 2015 suggests this duplication is fixed in the current circulating Caribbean strain. Importantly, we found this new UTR structure to provide no disadvantage for the virus and, interestingly, it led to increases in viral titers in mosquito cells when compared to a similar Asian strain and a Caribbean strain clone lacking the duplication.
As the 3’UTR has been shown to play essential roles in arbovirus replication, evolution, and host adaptation [19] [24], it will be interesting to dissect the role of this novel structure in the ability of the virus to specifically infect mosquitoes native to the affected areas, as well as its effect on pathogenesis in humans. These studies highlight the need to carefully re-analyze deep-sequencing assemblies for anomalies, such as sudden increases in read coverage and unmapped, unfiltered reads that still contain virus-specific sequences, which may be indicative of duplications and insertions. The origin of this duplication still needs to be determined, to understand whether it first arose in Asia prior to arriving in the Caribbean, or was generated just before or after December 9, 2013, the start of the epidemic in St. Martin [25], and has since spread throughout the Americas due to a fitness advantage. Both scenarios are possible, as chikungunya virus was first observed in Martinique on December 19, 2013 and in Guadeloupe on December 28, 2013, and the earliest sample analyzed in this study was collected on December 26, 2013, shortly after the spread of the virus to Martinique. In either case, a population bottleneck was involved that may have facilitated the fixation of this beneficial insertion. Nonetheless, this study provides an in-depth look at the minority variants present during an ongoing chikungunya virus epidemic, identifying novel variants and structural elements, and describes the construction of an infectious clone of this virus for future studies of the pathogenesis, adaptation and evolution of chikungunya virus in the Americas. The study of these evolutionary elements, which may play crucial roles in chikungunya virus evolution and adaptation, will be necessary to address potential future public health issues in both disease and viral dissemination.
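One way to screen an assembly for the sudden increases in read coverage mentioned above is sketched below in Python. This is a simplified illustration under stated assumptions, not the procedure used in this study: the function `flag_coverage_jumps`, the window size, and the fold-change cutoff are all hypothetical. The idea is that reads from a duplicated segment, when mapped to a reference lacking the duplication, may pile up and roughly double the local depth.

```python
def flag_coverage_jumps(depth, window=50, fold=1.8):
    """Return 0-based positions where local read depth rises sharply.

    depth: list of per-base read depths along the reference.
    window: number of bases averaged on each side (assumed value).
    fold: minimum ratio of downstream to upstream mean depth (assumed value).
    """
    hits = []
    for i in range(window, len(depth) - window + 1):
        upstream = sum(depth[i - window:i]) / window
        downstream = sum(depth[i:i + window]) / window
        if upstream > 0 and downstream / upstream >= fold:
            hits.append(i)
    return hits


# Synthetic depth profile: a 100-base segment with doubled coverage.
profile = [100] * 200 + [200] * 100 + [100] * 200
jumps = flag_coverage_jumps(profile)
print(jumps[0], jumps[-1])
```

Flagged positions are only candidates; confirming a duplication would still require inspecting the reads that span the putative junction.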
Table 2

Primers used to PCR amplify the chikungunya virus genome.

Primer Name | Sequence (5’–3’) | Genome Region
Primer 1 Forward | CACGTAGCCTACCAGTTTCTTA | 5’ UTR–nsP1
Primer 2 Reverse | ATGGAACACCGATGGTAGGTG |
Primer 3 Forward | AACCCCGTTCATGTACAACGC | nsP1–nsP2
Primer 4 Reverse | CGGCATGTTGTACTCATTCG |
Primer 5 Forward | CGAATTCGACAGCTTTGTAG | nsP1–nsP2
Primer 6 Reverse | GCACATGATGTCCGTTTATC |
Primer 7 Forward | GACCAAGACTGAAAGTTGTAC | nsP2–nsP3
Primer 8 Reverse | CCACATAGTATGTATCTCTGC |
Primer 9 Forward | GCGTACTGGGACGTAAGTTTA | nsP2–nsP3
Primer 10 Reverse | GGACGCACTCTCCTGGAGTTTC |
Primer 11 Forward | CTGTACGGGAAGTGAGTATGAC | nsP3–nsP4
Primer 12 Reverse | CATACCGGATTTCATCATAGC |
Primer 13 Forward | GGAGACGCCGTTTTAGAAACG | nsP4–capsid
Primer 14 Reverse | CGCTTGAAGGCCAATTTGGCC |
Primer 15 Forward | GCAGAGAGAGAATGTGCATG | Capsid–E2
Primer 16 Reverse | CCGCTTTAGCTGTTCTAATGC |
Primer 17 Forward | GGAACTACCTTGCAGCACGTAC | E2–E1
Primer 18 Reverse | GGCGTTAGTCATCGAGTGCAC |
Primer 19 Forward | GTACAGCAGAGTGTAAGGACCA | E1–3’UTR
Primer 20 Reverse | CATATACCTTCTTACCTAC |
Primer 21 Forward | GAACATGCCTATCTCCATCGAC | E1–3’UTR
Primer 22 Reverse | AACATCTCCTACGTCCCTATGG |
Table 3

Primers used to construct Caribbean chikungunya virus infectious clone.

Primer Name | Sequence (5’–3’)
BamHI Forward | GATTAATAACCCATCATGGATCCTG
AgeI Reverse | GTTGTAAATGGCCTGGACCGGTGTC
AgeI Forward | GGTATATATTCTCGTCGGACACCGG
AgeI2 Reverse | GGCTTCTTTTTCTTTTGAACCGGTT
AgeI2 Forward | GCCCCCCAAAAAGAAACCGGTT
XhoI Reverse | GTTACCCCACGTGACCTCGAGCC
XhoI Forward | GTTAACCGTGCCGACTGAGGGGCTCG
PolyANotI Reverse | GAGGATGCATTGCGGCCGCTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTGAAATATTAAAAACAAAATAAC
BamHI removal Forward | CCCATCATGGATtCTGTGTACGTGGATATAG
BamHI removal Reverse | CTATATCCACGTACACAGaATCCATGATGGG
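As a sanity check on the two "BamHI removal" primers listed in Table 3, the short Python sketch below tests whether they are exact reverse complements of each other and whether the BamHI recognition sequence (GGATCC) is absent from the mutagenic primer. The helper `reverse_complement` is introduced here for illustration, and reading the lowercase base as the substituted position is an assumption, not something stated in the table.

```python
# Translation table for DNA complement; case is preserved so the
# lowercase (presumably mutated) base stays visible.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequences copied from Table 3.
bamhi_forward = "GATTAATAACCCATCATGGATCCTG"
removal_forward = "CCCATCATGGATtCTGTGTACGTGGATATAG"
removal_reverse = "CTATATCCACGTACACAGaATCCATGATGGG"

BAMHI_SITE = "GGATCC"

primers_are_complementary = reverse_complement(removal_forward) == removal_reverse
site_in_original = BAMHI_SITE in bamhi_forward.upper()
site_in_mutant = BAMHI_SITE in removal_forward.upper()

print(primers_are_complementary, site_in_original, site_in_mutant)
```

A complementary primer pair carrying a single-base substitution is the usual design for site-directed mutagenesis, which is consistent with the primer names in the table.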
Table 4

Frequency of significant mutations (>0.001) identified in patient samples.

Genome Region | nt Position(a) | nt Change | AA Change(b,c) | Frequency per sample (column order: M100, G100, M101, M102, G101, G102, G103, M103, M104, G104, G105, M105, M106, M107, M108, M109, M110, G106, G107, M111; blank cells omitted)
5’UTR | 53 | A>T | | 0.106 0.089 0.077 0.086 0.081 0.070 0.078 0.088 0.066 0.072 0.071 0.004 0.004 0.003 0.003 0.003 0.003 0.003 0.002 0.003
nsp1 | 476 | T>A | C134S | 0.101 0.098 0.090 0.098 0.097 0.106 0.094 0.100 0.107 0.090 0.102 0.017 0.017 0.012 0.010 0.013 0.011 0.012 0.010 0.015
nsp1 | 479 | T>C | syn>L | 0.988 0.999 0.999 0.985 0.061 0.999 0.010 0.006
nsp1 | 633 | A>C | N186T | 0.010 0.011 0.010 0.010 0.011 0.009 0.011 0.008 0.103 0.008 0.009 0.004 0.004 0.049 0.005 0.006 0.005 0.013 0.005 0.004
nsp1 | 667 | A>T | syn>T | 0.071 0.064 0.078 0.040 0.038 0.044 0.041 0.062 0.192 0.040 0.041 0.011 0.012 0.015 0.004 0.005 0.004 0.006 0.005 0.012
nsp1 | 734 | G>C | G220R | 0.127
nsp1 | 809 | T>G | S245A | 0.100 0.097 0.089 0.084 0.076 0.092 0.096 0.084 0.079 0.073 0.102 0.051 0.052 0.029 0.041 0.049 0.037 0.046 0.046 0.051
nsp1 | 864 | T>A | V263E | 0.113 0.009 0.019 0.016 0.007 0.005 0.001 0.001 0.003 0.001 0.004 0.002 0.002 0.002
nsp1 | 1112 | C>T | L346F | 0.120
nsp1 | 1525 | C>T | syn>Y | 0.994 0.995 0.996 0.107 0.995
nsp1 | 1527 | G>A | S484N | 0.100
nsp1 | 1571 | G>A | E499K | 0.188 0.004
nsp2 | 2008 | C>T | syn>H | 0.158
nsp2 | 2038 | G>A | syn>S | 0.002 0.971 0.002 0.981
nsp2 | 2639 | A>T | I320F | 0.011 0.023 0.012 0.011 0.013 0.010 0.103 0.014 0.014 0.009 0.006 0.005 0.004 0.005 0.007 0.005 0.005
nsp2 | 2672 | C>T | R331C | 0.110 0.002 0.002
nsp2 | 2716 | A>G | syn>T | 0.997 0.997 0.995 0.994 0.993 0.995 0.998 0.998 0.997 0.996 0.994 0.997 0.998 0.998 0.998 0.997 0.997 0.998 0.997 0.997
nsp2 | 2788 | T>A | syn>I | 0.123 0.094 0.133 0.098 0.073 0.087 0.090 0.022 0.021 0.021 0.014 0.012 0.017 0.014 0.013
nsp2 | 2962 | T>C | syn>S | 0.999
nsp2 | 3063 | T>G | I461S | 0.065 0.058 0.052 0.053 0.119 0.069 0.018 0.018 0.017 0.022 0.017 0.017 0.022 0.018
nsp2 | 3196 | T>A | syn>A | 0.005 0.006 0.004 0.006 0.004 0.008 0.002 0.002 0.156 0.002 0.002 0.003 0.002 0.002
nsp2 | 3519 | T>A | V613E | 0.116 0.129 0.056 0.047 0.096 0.010 0.009 0.014 0.008 0.007 0.008 0.008
nsp2 | 3591 | T>A | L637H | 0.002 0.003 0.004 0.005 0.001 0.002 0.002 0.002 0.131 0.001
nsp2 | 3996 | G>A | G772D | 0.504
nsp3 | 4219 | T>A | S48R | 0.031 0.032 0.037 0.037 0.028 0.033 0.038 0.160 0.180 0.036 0.006 0.005 0.006 0.006 0.003 0.007 0.005 0.006 0.005
nsp3 | 4221 | C>T | A49V | 0.136 0.002
nsp3 | 4334 | G>A | A88T | 0.001 0.144
nsp3 | 4409 | G>C | G113R | 0.671
nsp3 | 4479 | C>T | T135M | 0.002 0.001 0.463
nsp3 | 4507 | C>A | syn>R | 0.994 0.995 0.994 0.992 0.995 0.993 0.999 0.993 0.994 0.994 0.994 0.999 0.998 0.998 0.998 0.997 0.997 0.999 0.997 0.997
nsp3 | 4513 | A>G | syn>K | 1.000 0.999 0.999 0.999 0.998 1.000 0.999 0.998 0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999 0.999 1.000 0.999 0.999
nsp3 | 4605 | G>A | R178Q | 0.957 0.003 0.001
nsp3 | 4606 | C>A | R178Q | 0.959
nsp3 | 4839 | C>T | S255F | 0.991
nsp3 | 4864 | T>A | syn>L | 0.021 0.015 0.015 0.008 0.020 0.018 0.011 0.017 0.004 0.002 0.003 0.003 0.004 0.005 0.999 0.089 0.003
nsp3 | 5090 | T>C | S340P | 0.987
nsp3 | 5151 | A>C | D359A | 0.047 0.033 0.061 0.093 0.058 0.122 0.195 0.045 0.066 0.055 0.021 0.054 0.039 0.023 0.033 0.026 0.029 0.027 0.031
nsp3 | 5179 | A>C | E368D | 0.041 0.016 0.058 0.106 0.065 0.133 0.075 0.050 0.034 0.052 0.021 0.034 0.023 0.020 0.024 0.016 0.020 0.016 0.019
nsp3 | 5190 | A>C | D359A | 0.031 0.017 0.041 0.080 0.045 0.114 0.017 0.039 0.043 0.043 0.022 0.034 0.022 0.019 0.025 0.016 0.017 0.017 0.019
nsp3 | 5221 | T>A | syn>L | 0.075 0.063 0.105 0.071 0.074 0.039 0.068 0.067 0.075 0.063 0.006 0.005 0.007 0.005 0.005 0.005 0.004 0.005 0.006
nsp3 | 5298 | T>A | V408E | 0.027 0.018 0.032 0.020 0.151 0.022 0.025 0.020 0.025 0.004 0.003 0.004 0.004 0.005 0.004 0.004 0.004 0.003
nsp3 | 5305 | T>A | C410Stop | 0.037 0.027 0.042 0.041 0.115 0.026 0.021 0.022 0.022 0.005 0.004 0.004 0.004 0.004 0.004 0.004 0.004 0.002
nsp3 | 5315 | G>A | E415K | 0.745
nsp3 | 5325 | T>A | I417K | 0.025 0.034 0.026 0.119 0.036 0.023 0.017 0.023 0.018 0.010 0.012 0.010 0.003 0.005 0.004 0.003 0.004 0.007
nsp3 | 5334 | T>A | M420K | 0.023 0.024 0.020 0.121 0.033 0.026 0.020 0.019 0.020 0.004 0.004 0.005 0.004 0.004 0.003 0.004 0.003 0.005
nsp3 | 5376 | T>C | V435A | 0.224 0.003
nsp3 | 5564 | C>T | syn>L | 0.227
nsp4 | 5999 | T>C | S116P | 0.008 0.008 0.001 0.001 0.065
nsp4 | 6075 | T>C | I141T | 0.004 0.105 0.017 0.002
nsp4 | 6687 | T>C | L345P | 0.179
nsp4 | 6767 | A>C | I372L | 0.105
nsp4 | 6805 | G>A | syn>A | 0.078 0.086 0.076 0.057 0.060 0.017 0.120 0.065 0.004 0.002 0.001 0.001
nsp4 | 6864 | T>C | I404T | 0.969
nsp4 | 6975 | T>A | L441Stop | 0.179
nsp4 | 7076 | T>G | S475A | 0.072 0.069 0.079 0.080 0.074 0.625 0.079 0.078 0.075 0.070 0.063 0.030 0.013 0.015 0.013 0.013 0.010 0.013 0.010 0.015
nsp4 | 7159 | T>C | syn>A | 0.063 0.051 0.063 0.100 0.078 0.090 0.062 0.076 0.082 0.090 0.017 0.016 0.020 0.015 0.020 0.015 0.014 0.018 0.012
E3 | 8492 | A>G | Q52R | 0.143
E2 | 8892 | T>C | syn>I | 1.000 1.000 1.000 1.000 0.998 1.000 0.998 1.000
E2 | 9690 | G>A | syn>G | 0.049 0.068 0.176 0.989 0.069 0.072 0.058
E1 | 10335 | T>A | F118L | 0.097 0.114 0.084 0.115 0.106 0.090 0.091 0.114 0.096 0.014 0.012 0.014 0.012 0.014 0.011 0.010 0.011 0.015
E1 | 11046 | T>C | syn>S | 0.002 0.924
3’UTR | 11302 | C>T | | 0.984
3’UTR | 11311 | T>C | | 0.979 0.001 0.997
3’UTR | 11416 | T>A | | 0.015 0.005 0.005 0.006 0.003 0.002 0.008 0.005 0.011 0.002 0.002 0.003 0.001 0.004 0.004 0.005 0.996 0.003 0.006
3’UTR | 11525 | C>T | | 0.412 0.016 0.003 0.006 0.004 0.464 0.004 0.474 0.021 0.018
3’UTR | 11775 | C>T | | 0.955 0.980 0.966 0.897 0.997 0.946 0.957 0.965 0.996 0.995 0.872 0.989 0.892 0.982 0.921
3’UTR | 11776 | G>A | | 0.995 0.997 0.990 0.977 0.997 0.993 0.997 0.993 0.997 0.997 0.890 0.996 0.916 0.994 0.946
3’UTR | 11791 | C>T | | 0.362 0.586 0.624 0.595 0.640 0.593

a nt = nucleotide

b AA = amino acid

c syn = synonymous

Bold print indicates five novel synonymous sequence changes. The table shows any mutation present above 0.1 in at least one individual, and any mutation above 0.001 found in more than one individual.
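The inclusion rule stated in the note above can be written as a small predicate. The following Python sketch is one reading of that rule, with a hypothetical function name (`passes_table_filter`) and with `None` standing in for samples where the mutation was not detected; it is not code from the study.

```python
def passes_table_filter(freqs, high=0.1, low=0.001):
    """Apply the table inclusion rule to one mutation.

    freqs: per-sample frequencies; None marks a sample without the mutation.
    Returns True if the mutation exceeds `high` in at least one sample,
    or exceeds `low` in more than one sample.
    """
    observed = [f for f in freqs if f is not None]
    above_high = any(f > high for f in observed)
    above_low_count = sum(f > low for f in observed)
    return above_high or above_low_count > 1


# Rows taken from Table 4 (only the reported values are listed).
g220r = [0.127]                # nsp1 734 G>C, reported in a single sample
r331c = [0.110, 0.002, 0.002]  # nsp2 2672 C>T

print(passes_table_filter(g220r), passes_table_filter(r331c))
```

Under this reading, a mutation seen in only one sample is retained only if its frequency there exceeds 0.1.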

Table 5

Frequency of significant mutations (>0.001) identified in tissue culture passaged viral populations.

Genome Region | nt Position | nt Change | AA Change | Frequency per viral population (column order: C6/36-1, C6/36-2, C6/36-3, C6/36-4, C6/36-5, Vero-1, Vero-2, Vero-3, Vero-4, Vero-5; blank cells omitted)
nsp1 | 479 | T>C | syn>L | 0.010 1.000 1.000
nsp1 | 1525 | C>T | syn>Y | 0.006 0.996 0.973 0.978 0.012
nsp2 | 2716 | A>G | syn>T | 0.996 0.999 0.997 0.997 0.997 0.999 0.999 0.999 0.999 0.999
nsp2 | 3059 | G>A | G460S | 0.371
nsp3 | 4507 | C>A | syn>R | 0.998 0.998 0.998 0.997 0.998
nsp3 | 4513 | A>G | syn>K | 0.999 0.999 0.999 0.999 0.996
nsp3 | 4864 | T>A | syn>L | 0.007 0.003 0.004 0.999 0.003 0.012 0.011 0.990 0.009 0.011
nsp4 | 7016 | C>A | L455M | 0.951 0.020 0.081
E2 | 8874 | C>T | syn>F | 0.005 0.002 0.002 0.002 0.990
E2 | 8892 | T>C | syn>I | 0.999 0.998 0.998 1.000 1.000 1.000 0.999 1.000
E1 | 10104 | G>A | syn>T | 0.979 0.002 0.003 0.004 0.002 0.002 0.003 0.002 0.968
E1 | 10802 | G>T | G274V | 0.044 0.041 0.096 0.093 0.088 0.959
3’UTR | 11775 | C>T | | 0.978 0.952 0.985 0.981
3’UTR | 11776 | G>A | | 0.999 0.995 0.994
3’UTR | 11791 | C>T | | 0.359 0.291 0.499 0.539

nt = nucleotide, AA = amino acid, syn = synonymous. The table shows any mutation present above 0.1 in at least one individual, and any mutation above 0.001 found in more than one individual.

  21 in total

1.  CONSEL: for assessing the confidence of phylogenetic tree selection.

Authors:  H Shimodaira; M Hasegawa
Journal:  Bioinformatics       Date:  2001-12       Impact factor: 6.937

2.  An approximately unbiased test of phylogenetic tree selection.

Authors:  Hidetoshi Shimodaira
Journal:  Syst Biol       Date:  2002-06       Impact factor: 15.683

3.  The distribution of fitness effects caused by single-nucleotide substitutions in an RNA virus.

Authors:  Rafael Sanjuán; Andrés Moya; Santiago F Elena
Journal:  Proc Natl Acad Sci U S A       Date:  2004-05-24       Impact factor: 11.205

4.  Host alternation of chikungunya virus increases fitness while restricting population diversity and adaptability to novel selective pressures.

Authors:  Lark L Coffey; Marco Vignuzzi
Journal:  J Virol       Date:  2010-11-03       Impact factor: 5.103

5.  Chikungunya in the Americas.

Authors:  Isabelle Leparc-Goffart; Antoine Nougairede; Sylvie Cassadou; Christine Prat; Xavier de Lamballerie
Journal:  Lancet       Date:  2014-02-08       Impact factor: 79.321

6.  Chikungunya outbreak in the Caribbean region, December 2013 to March 2014, and the significance for Europe.

Authors:  W Van Bortel; F Dorleans; J Rosine; A Blateau; D Rousset; S Matheus; I Leparc-Goffart; O Flusin; Cm Prat; R Cesaire; F Najioullah; V Ardillon; E Balleydier; L Carvalho; A Lemaître; H Noel; V Servas; C Six; M Zurbaran; L Leon; A Guinard; J van den Kerkhof; M Henry; E Fanoy; M Braks; J Reimerink; C Swaan; R Georges; L Brooks; J Freedman; B Sudre; H Zeller
Journal:  Euro Surveill       Date:  2014-04-03

7.  Emergence and transmission of arbovirus evolutionary intermediates with epidemic potential.

Authors:  Kenneth A Stapleford; Lark L Coffey; Sreyrath Lay; Antonio V Bordería; Veasna Duong; Ofer Isakov; Kathryn Rozen-Gagnon; Camilo Arias-Goeta; Hervé Blanc; Stéphanie Beaucourt; Türkan Haliloğlu; Christine Schmitt; Isabelle Bonne; Nir Ben-Tal; Noam Shomron; Anna-Bella Failloux; Philippe Buchy; Marco Vignuzzi
Journal:  Cell Host Microbe       Date:  2014-06-11       Impact factor: 21.023

Review 8.  The distribution of fitness effects of new mutations.

Authors:  Adam Eyre-Walker; Peter D Keightley
Journal:  Nat Rev Genet       Date:  2007-08       Impact factor: 53.242

9.  PHYML Online--a web server for fast maximum likelihood-based phylogenetic inference.

Authors:  Stéphane Guindon; Franck Lethiec; Patrice Duroux; Olivier Gascuel
Journal:  Nucleic Acids Res       Date:  2005-07-01       Impact factor: 16.971

10.  Chikungunya virus 3' untranslated region: adaptation to mosquitoes and a population bottleneck as major evolutionary forces.

Authors:  Rubing Chen; Eryu Wang; Konstantin A Tsetsarkin; Scott C Weaver
Journal:  PLoS Pathog       Date:  2013-08-29       Impact factor: 6.823

