Taj Azarian, Afsar Ali, Judith A Johnson, David Mohr, Mattia Prosperi, Nazle M Veras, Mohammed Jubair, Samantha L Strickland, Mohammad H Rashid, Meer T Alam, Thomas A Weppelmann, Lee S Katz, Cheryl L Tarr, Rita R Colwell, J Glenn Morris, Marco Salemi.
Abstract
Phylodynamic analysis of genome-wide single-nucleotide polymorphism (SNP) data is a powerful tool to investigate underlying evolutionary processes of bacterial epidemics. The method was applied to investigate a collection of 65 clinical and environmental isolates of Vibrio cholerae from Haiti collected between 2010 and 2012. Characterization of isolates recovered from environmental samples identified a total of four toxigenic V. cholerae O1 isolates, four non-O1/O139 isolates, and a novel nontoxigenic V. cholerae O1 isolate with the classical tcpA gene. Phylogenies of strains were inferred from genome-wide SNPs using coalescent-based demographic models within a Bayesian framework. A close phylogenetic relationship between clinical and environmental toxigenic V. cholerae O1 strains was observed. As cholera spread throughout Haiti between October 2010 and August 2012, the population size initially increased and then fluctuated over time. Selection analysis along internal branches of the phylogeny showed a steady accumulation of synonymous substitutions and a progressive increase of nonsynonymous substitutions over time, suggesting diversification likely was driven by positive selection. Short-term accumulation of nonsynonymous substitutions driven by selection may have significant implications for virulence, transmission dynamics, and even vaccine efficacy.
IMPORTANCE: Cholera, a dehydrating diarrheal disease caused by toxigenic strains of the bacterium Vibrio cholerae, emerged in 2010 in Haiti, a country where there were no available records on cholera over the past 100 years. While devastating in terms of morbidity and mortality, the outbreak provided a unique opportunity to study the evolutionary dynamics of V. cholerae and its environmental presence. The present study expands on previous work and provides an in-depth phylodynamic analysis inferred from genome-wide single nucleotide polymorphisms of clinical and environmental strains from dispersed geographic settings in Haiti over a 2-year period. Our results indicate that even during such a short time scale, V. cholerae in Haiti has undergone evolution and diversification driven by positive selection, which may have implications for understanding the global clinical and epidemiological patterns of the disease. Furthermore, the continued presence of the epidemic strain in Haitian aquatic environments has implications for transmission.
Year: 2014 PMID: 25538191 PMCID: PMC4278535 DOI: 10.1128/mBio.01824-14
Source DB: PubMed Journal: MBio Impact factor: 7.867
FIG 1 Maximum likelihood (ML) tree of V. cholerae environmental and clinical strains. The ML tree was inferred from ProgressiveMauve alignment of four environmentally sampled non-O1/O139 isolate sequences (designated by green coloring and marked with an asterisk), one environmentally sampled nontoxigenic O1 sequence, and 27 reference sequences. The 60 closely related toxigenic O1 biotype El Tor strains from our study are collapsed and represented by the 2010EL-1786 reference strain, which clusters with other members of the El Tor lineage. The four non-O1/O139 strains were dispersed throughout the global phylogeny, indicating a panmictic population of non-O1/O139 strains in the environment. The phylogeny was constructed with RAxML, using the GTR nucleotide substitution model and 1,000 bootstrap replicates. Bootstrap values for branches are not indicated but were all above 90%.
Frequency of isolates by year and diversity/divergence estimates
| Yr | Frequency, UF-EPI | Frequency, Katz et al. (9) | Total | Diversity | Divergence | Within-group hqSNP |
|---|---|---|---|---|---|---|
| 2010 | 9 | 13 | 22 | 0.011 (0.004) | | 0.636 (0.219) |
| 2011 | | 14 | 14 | 0.098 (0.016) | 0.056 (0.008) | 5.868 (0.855) |
| 2012 | 23 | 1 | 24 | 0.073 (0.018) | 0.058 (0.018) | 3.978 (0.866) |
| Total | 32 | 28 | 60 | | | |
The isolates listed were sampled by the University of Florida Emerging Pathogen Institute (UF-EPI).
Four non-O1/O139 isolates are not listed and were not included in the diversity/divergence estimates.
Diversity estimates of the mean evolutionary diversity within year-specific subpopulations. The number of nucleotide substitutions per hqSNP site from mean diversity calculations within subpopulations is shown with the standard error obtained by bootstrap estimates (1,000 replicates) in parentheses.
Divergence estimates and standard errors (in parentheses) of average evolutionary divergence compared to sequences sampled during the first epidemic outbreak in 2010. The number of nucleotide substitutions per hqSNP site from averaging over all sequence pairs within groups is shown.
Mean within-group hqSNP differences and standard errors (in parentheses) by collection year.
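As an illustration of how a within-group mean pairwise difference and its bootstrap standard error can be computed, the following is a minimal sketch. It assumes aligned, equal-length hqSNP strings, counts raw differences (no substitution-model correction), and resamples alignment columns with replacement. The function names and the toy sequences are hypothetical, and the record does not state which software or distance model produced the values in the table.

```python
import itertools
import random
import statistics


def mean_pairwise_differences(seqs, sites=None):
    """Mean number of differing sites over all pairs of aligned sequences.

    `sites` is an optional list of column indices (used for bootstrapping);
    by default every column of the alignment is compared once.
    """
    if sites is None:
        sites = range(len(seqs[0]))
    sites = list(sites)
    pairs = list(itertools.combinations(seqs, 2))
    total = sum(sum(a[i] != b[i] for i in sites) for a, b in pairs)
    return total / len(pairs)


def per_site_diversity(seqs):
    """Mean pairwise differences divided by the number of aligned sites."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def bootstrap_se(seqs, replicates=1000, seed=0):
    """Standard error from resampling alignment columns with replacement."""
    rng = random.Random(seed)
    n_sites = len(seqs[0])
    estimates = []
    for _ in range(replicates):
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        estimates.append(mean_pairwise_differences(seqs, sites))
    return statistics.stdev(estimates)


# Hypothetical toy alignment of three isolates over six SNP sites.
toy = ["ACGTAC", "ACGTAT", "ACGAAT"]
print(mean_pairwise_differences(toy), per_site_diversity(toy), bootstrap_se(toy))
```

Dividing the mean pairwise difference by the number of hqSNP sites gives a per-site value, which is the relationship between the last column of the table and the diversity column under these assumptions.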
FIG 2 DensiTree of posterior distribution of trees from Bayesian phylogenetic analysis of 60 V. cholerae O1 isolates. A posterior distribution of trees was obtained from Bayesian phylogenetic analysis of genome-wide SNPs from 60 toxigenic V. cholerae O1 isolates using the GMRF skygrid model and strict molecular clock as implemented in BEAST 1.8.0. Tip dates for each node were assigned based on date of isolate collection, allowing the phylogeny to be scaled in time. DensiTree provides a visualization of the posterior distribution of trees by illustrating the frequency of node clustering to assess the support for clades and overall topology. Well-supported branches are indicated by solid colors, whereas webs represent little agreement. This is an alternative to presenting one “best” MCC tree.
FIG 3 Effective population size (Ne) estimates from Bayesian phylogenetic analysis and epidemiological case counts for the coinciding period. The blue line and the gray upper and lower bounds represent, respectively, median and 95% highest posterior density (95% HPD) interval estimates of Ne over time. The bottom panel shows the number of cumulative (shaded area) and incident (histogram) cholera cases from October 2010 to June 2013.
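For readers unfamiliar with the summary statistics in FIG 3, the sketch below shows one common way to obtain a median and a highest posterior density interval from a set of posterior samples at a single time point: the HPD interval is taken as the narrowest window that contains the requested fraction of the sorted samples. The function name and the sample values are hypothetical, and this is not a description of how BEAST or its companion tools compute the plotted interval.

```python
import math
import statistics


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples."""
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of k consecutive sorted samples and keep the narrowest.
    best = min(range(n - k + 1), key=lambda i: xs[i + k - 1] - xs[i])
    return xs[best], xs[best + k - 1]


# Hypothetical posterior samples of Ne at one time point.
ne_samples = [120.0, 95.0, 110.0, 130.0, 105.0, 400.0, 115.0, 100.0]
print(statistics.median(ne_samples), hpd_interval(ne_samples, mass=0.75))
```

Unlike an equal-tailed interval, the narrowest-window definition shifts toward the bulk of a skewed posterior, which is why HPD bounds are often preferred for population size estimates.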
FIG 4 Bayesian MCC phylogeny and selection analysis along backbone paths. (A) Bayesian maximum clade credibility (MCC) tree of Haitian strains of V. cholerae with branch lengths scaled in time by enforcing a strict molecular clock. Strain labels are colored to indicate the time of sampling according to the legend in the figure. An asterisk along a branch marks subtending clades supported by a posterior probability of >0.75. An asterisk next to a node sequence name indicates an environmentally sampled isolate. One of the possible backbone paths, representing major lineages propagating over time from root node to most recently sampled sequence, is highlighted in red. Branch labels correspond to nonsynonymous mutations indicated in the table on panel C. (B) Mean nonsynonymous (dN) and synonymous (dS) divergence along backbone paths in a subsample of trees from the posterior distribution (see Materials and Methods) during the cholera epidemic in Haiti are represented by blue and green lines, respectively. Estimates 1 standard deviation above and below the mean dN or dS are represented by broken lines. The y axis represents the number of nucleotide (dN or dS) changes per (synonymous or nonsynonymous) SNP site. The x axis matches the time scale of the epidemic. Estimates were obtained by reconstructing and comparing ancestral sequences along V. cholerae trees sampled from the posterior distribution from the Bayesian analysis. (C) Inferred codon and amino acid changes in the V. cholerae genome (numbered according to the V. cholerae 2010EL-1786 reference strain) along internal branches of the O1 Haitian phylogeny. Branch numbers correspond to numbered branches in the MCC tree given in panel A.
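The distinction between synonymous and nonsynonymous changes in panels B and C can be illustrated with a short sketch that classifies a single codon substitution using the standard genetic code (bacterial translation table 11 assigns the same amino acids to codons). The function name and the example codons are hypothetical and are not taken from panel C. The dN and dS estimates in the figure come from ancestral sequence reconstruction along posterior trees, which this sketch does not attempt.

```python
# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((x, y, z) for x in BASES for y in BASES for z in BASES), AMINO_ACIDS
    )
}


def classify_codon_change(ancestral, derived):
    """Return the amino acid change and whether it is synonymous."""
    aa_from = CODON_TABLE[ancestral.upper()]
    aa_to = CODON_TABLE[derived.upper()]
    kind = "synonymous" if aa_from == aa_to else "nonsynonymous"
    return aa_from, aa_to, kind


# Hypothetical examples: GAA -> GAG keeps glutamate, GAA -> GAT changes it.
print(classify_codon_change("GAA", "GAG"))
print(classify_codon_change("GAA", "GAT"))
```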