Melanie C Melendrez1, Eric D Becraft1, Jason M Wood1, Millie T Olsen1, Donald A Bryant2, John F Heidelberg3, Douglas B Rusch4, Frederick M Cohan5, David M Ward1.
Abstract
Recent studies of bacterial speciation have claimed to support the biological species concept, namely that reduced recombination is required for bacterial populations to diverge into species. This conclusion has been reached from the discovery that ecologically distinct clades show lower rates of recombination than that which occurs among closest relatives. However, these previous studies did not attempt to determine whether the more-rapidly recombining close relatives within the clades studied may also have diversified ecologically, without benefit of sexual isolation. Here we have measured the impact of recombination on ecological diversification within and between two ecologically distinct clades (A and B') of Synechococcus in a hot spring microbial mat in Yellowstone National Park, using a cultivation-free, multi-locus approach. Bacterial artificial chromosome (BAC) libraries were constructed from mat samples collected at 60°C and 65°C. Analysis of multiple linked loci near Synechococcus 16S rRNA genes showed little evidence of recombination between the A and B' lineages, but a record of recombination was apparent within each lineage. Recombination and mutation rates within each lineage were of similar magnitude, but recombination had a somewhat greater impact on sequence diversity than mutation, as also seen in many other bacteria and archaea. Despite recombination within the A and B' lineages, there was evidence of ecological diversification within each lineage. The algorithm Ecotype Simulation identified sequence clusters consistent with ecologically distinct populations (ecotypes), and several hypothesized ecotypes were distinct in their habitat associations and in their adaptations to different microenvironments. We conclude that sexual isolation is more likely to follow ecological divergence than to precede it. Thus, an ecology-based model of speciation appears more appropriate than the biological species concept for bacterial and archaeal diversification.
Keywords: Ecotype Simulation; Synechococcus; cyanobacteria; ecotype; multi-locus sequence typing; population genetics; recombination; speciation
Year: 2016 PMID: 26834710 PMCID: PMC4712262 DOI: 10.3389/fmicb.2015.01540
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 5.640
Figure 1. Positional analysis relative to the . In (C), percent nucleotide identity of recruited sequences with genomic homologs is also plotted, and lines connect mate-pairs.
Results from recombination and mutation rate and ratio analyses for A-like .
| 7 | 50 | 4008 | 183 | 40.86 | 41|13|1.6 | 33.14 | 580.03 | 0.04–1.0 | 33.94 | 20.365 (12.92–30.29) | 0.600 (0.38–0.892) | 2.596 (1.57–4.01) |
| 5 | 69 | 2760 | 222 | 45.67 | 44|16|2.8 | 39.19 | 706.5 | 0.06–0.96 | 42.59 | 24.76 (16.75–34.65) | 0.581 (0.39–0.81) | 3.396 (2.21–4.92) |
| 520 | 119 | 24.48 | 5|9|0 | 23.24 | 509.7 | 0–0.368 | ||||||
| 584 | 30 | 6.17 | 11|1|0 | 3.63 | 18.85 | 0–1.7 | ||||||
| 429 | 0 | |||||||||||
| 420 | 30 | 6.17 | 0|2|0 | 2.11 | 18.58 | 0–0.324 | ||||||
| 629 | 28 | 5.76 | 7|1|0 | 9.13 | 93.59 | 0–1.2 | ||||||
| 607 | 15 | 3.09 | 0|0|0 | 1.10 | 2.67 | 0.0 | ||||||
| 547 | 26 | 5.80 | 0|0|0 | 1.49 | 20.48 | 0.0 | ||||||
CI: 95% credibility interval.
rbsK, CHP, PK, lepB, aroA.
Sum of all single-gene lengths (number of nucleotides). The analysis pertains to a 73-sequence (145 sequences, de-duplicated) 5-locus alignment, except for hisF and dnaG, which pertain to the 7-locus alignment.
Watterson's theta.
Maximum at 4Ner(region); composite likelihood method (LDHat; pairwise) finite sites model (McVean et al., 2002, 2004).
R/θ from LDHat is equivalent to the ρ/θ measure from Clonal Frame and defines the ratio of the rates of recombination (R or ρ) to mutation (θ), illustrating how often recombination occurs relative to mutation (McVean et al., 2002, 2004; Didelot and Falush, 2007; Didelot and Maiden, 2010). For LDHat-pairwise analysis, the R/θ range includes calculations from the composite-likelihood, Rmin and Wakeley's moment methods (McVean et al., 2002, 2004).
Ratio of the probabilities that a given site is altered through recombination (r) and mutation (m) illustrating how important the effect of recombination was in the diversification of the samples relative to mutation (Didelot and Falush, 2007; Didelot and Maiden, 2010).
There were no segregating sites in the sequence dataset.
Clonal Frame analysis is designed for concatenated multi-locus alignments (Didelot and Falush, 2007; Didelot and Maiden, 2010).
avePWD: average pairwise distance (LDHat pairwise).
varPWD: variance in pairwise distance (LDHat pairwise).
Rmin value (LDHat pairwise); minimum number of recombination events describing evidence for recombination in region (McVean et al., 2002, 2004) infinite sites model.
Population scaled 4Ner (recombination estimate) by Wakeley Moment method (McVean et al., 2002, 2004).
The lookup table for LDHat pairwise was not generated from the data but taken from an existing lkgen table provided in the LDHat distribution (lk_n100_t0.01; https://github.com/auton1/LDhat).
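The Watterson's theta values in the table above can be related to the number of segregating sites through the standard estimator, theta_W = S / a_n, where a_n is the sum of 1/i for i from 1 to n - 1. The sketch below is illustrative only: the function name is hypothetical, and reading the first concatenated row as n = 50 sequences with S = 183 segregating sites is an assumption about column headers that did not survive extraction.

```python
def watterson_theta(segregating_sites: int, n_sequences: int) -> float:
    """Watterson's estimator: S divided by the (n-1)th harmonic number."""
    if n_sequences < 2:
        raise ValueError("need at least two sequences")
    a_n = sum(1.0 / i for i in range(1, n_sequences))
    return segregating_sites / a_n


# Assumed reading of the first table row: S = 183, n = 50.
theta = watterson_theta(183, 50)
print(round(theta, 2))
```

Under that assumed reading, the result rounds to the tabulated 40.86, which suggests the per-alignment (not per-site) form of the estimator was reported.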
Results from recombination and mutation rate and ratio analyses for B'-like .
| 4 | 51 | 2448 | 361 | 80.24 | 85|0|22.3 | 66.14 | 905.36 | 0–1.06 | 63.34 | 23.53 (15.18–33.89) | 0.371 (0.24–0.535) | 2.42 (1.58–3.44) |
| 550 | 101 | 22.45 | 0|13|0 | 14.42 | 219.48 | 0–0.579 | ||||||
| 535 | 122 | 27.12 | 15|15|2.9 | 34.05 | 539.83 | 0.44–0.55 | ||||||
| 628 | 62 | 13.78 | 38|11|0.5 | 11.06 | 75.06 | 0.36–2.8 | ||||||
| 16S rRNA/ITS | 733 | 76 | 16.89 | 15|0|0 | 6.725 | 43.46 | 0–0.889 | |||||
CI: 95% credibility interval.
Watterson's theta.
Maximum at 4Ner(region); composite likelihood method (LDHat; pairwise) finite sites model (McVean et al., 2002, 2004).
R/θ from LDHat is equivalent to the ρ/θ measure from Clonal Frame and defines the ratio of the rates of recombination (R or ρ) to mutation (θ), illustrating how often recombination occurs relative to mutation (McVean et al., 2002, 2004; Didelot and Falush, 2007; Didelot and Maiden, 2010). For LDHat-pairwise analysis, the R/θ range includes calculations from the composite-likelihood, Rmin and Wakeley's moment methods (McVean et al., 2002, 2004).
Ratio of the probabilities that a given site is altered through recombination (r) and mutation (m) illustrating how important the effect of recombination was in the diversification of the samples relative to mutation (Didelot and Falush, 2007; Didelot and Maiden, 2010).
Clonal Frame analysis is designed for concatenated multi-locus alignments (Didelot and Falush, 2007; Didelot and Maiden, 2010).
avePWD: average pairwise distance (LDHat pairwise).
varPWD: variance in pairwise distance (LDHat pairwise).
Rmin value (LDHat pairwise); minimum number of recombination events describing evidence for recombination in region (McVean et al., 2002, 2004) infinite sites model.
Population scaled 4Ner (recombination estimate) by Wakeley Moment method (McVean et al., 2002, 2004).
The lookup table for LDHat pairwise was not generated from the data but taken from an existing lkgen table provided in the LDHat distribution (lk_n100_t0.01; https://github.com/auton1/LDhat).
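The footnote on ρ/θ describes a simple ratio of the recombination rate to the mutation rate, and the Clonal Frame figures in the concatenated rows of the two tables above appear to be consistent with it. The sketch below checks this arithmetic. The assignment of values to θ, R and ρ/θ, and the row labels, are assumptions inferred from the footnote order, because the column headers were lost; the function name is hypothetical.

```python
# (theta, R, reported rho/theta) as read from the concatenated-alignment
# rows of the two tables; this column assignment is an assumption.
ROWS = {
    "A-like, 7 loci": (33.94, 20.365, 0.600),
    "A-like, 5 loci": (42.59, 24.76, 0.581),
    "B'-like, 4 loci": (63.34, 23.53, 0.371),
}


def recombination_to_mutation_rate(recomb_rate: float, theta: float) -> float:
    """Ratio of the recombination rate (R or rho) to the mutation rate (theta)."""
    if theta <= 0:
        raise ValueError("theta must be positive")
    return recomb_rate / theta


ratios = {
    name: recombination_to_mutation_rate(r, t)
    for name, (t, r, _reported) in ROWS.items()
}
for name, value in ratios.items():
    print(name, round(value, 3), "reported:", ROWS[name][2])
```

Note that this ratio of rates is distinct from r/m, which weighs the per-site effect of recombination against that of mutation and is therefore larger in all three rows.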
Figure 2. A-like . Putative ecotypes (PEs) demarcated by Ecotype Simulation are indicated by brackets or vertical bars adjacent to each tree. STs within PEs are colored according to the MLSA phylogeny shown in (A), and colors are maintained in both phylogenies (color correspondence is also maintained in Supplemental Data Presentation Figure 14). The number of variants belonging to dominant and subdominant sequence types (STs) is indicated in parentheses. Stars, color-coded according to gene (see inset star legend), demarcate recombination events interpreted from SNP patterns that were confirmed (closed stars) or not confirmed (open stars) by Clonal Frame analysis. Clade splitting events between rbsK and the concatenated phylogeny, where grouped variants within rbsK are split apart into separate PEs in the concatenated phylogeny, are indicated by dashed arrows (clade splitting event of rbsK PE1) and solid arrows (clade splitting event of rbsK PE2). Non-syntenous STs are shaded in gray, and STs that contained a combination of syntenous and non-syntenous sequences are indicated by an asterisk; syntenous STs are neither shaded nor marked with an asterisk. A shared SNP pattern between STs of two distinct PEs is indicated by a bidirectional arrow colored according to gene (yellow = lepB). Bootstrap values are provided for major nodes. The reference genome is indicated by “SynA” (GenBank accession number CP000239).
The number of unique (u) and overlapping (ovl; supported by multiple methods) recombination events recorded by RDP4, Clonal Frame (CF), and single nucleotide polymorphism (SNP) analysis of MLSA datasets.
| MLSA7 | 71 | 11 | 0 | 7 | 24 | 7 | 35 | |
| MLSA5 | 49 | 5 | 0 | 0 | 0 | 0 | 5 | |
| MLSA5 | 145 | 5 | 1 | 8 | 2 | 6 | 8 | |
| MLSA4 | 72 | 4 | 9 | 19 | 11 | 14 | 24 |
All loci were considered in analysis.
Overlapping RDP4 events with CF or SNP analyses were not recorded in CFu or SNPu columns, only the RDP4u column.
See Supplemental Data Table 10.
See Supplemental Data Table 11.
See Supplemental Data Presentation Figure 10.
See Figure 2.
Sum of “u” columns only.
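The rule in the last footnote, that the total counts only the unique ("u") columns, can be checked against the four rows of the table above. The column names used below are an assumed reading inferred from that arithmetic, since the header row did not survive extraction; in particular, the meaning of the first numeric column is not recoverable here and is labelled `n` purely as a placeholder.

```python
# Rows copied from the table above; the column names are an assumed
# reading, since the header row did not survive extraction.
COLUMNS = ("n", "RDP4u", "CFu", "CFovl", "SNPu", "SNPovl", "total")
ROWS = [
    ("MLSA7", 71, 11, 0, 7, 24, 7, 35),
    ("MLSA5", 49, 5, 0, 0, 0, 0, 5),
    ("MLSA5", 145, 5, 1, 8, 2, 6, 8),
    ("MLSA4", 72, 4, 9, 19, 11, 14, 24),
]


def unique_total(row):
    """Sum only the unique-event ('u') columns of one table row."""
    values = dict(zip(COLUMNS, row[1:]))
    return values["RDP4u"] + values["CFu"] + values["SNPu"]


# True for each row whose last entry equals the sum of its 'u' columns.
checks = [unique_total(row) == row[-1] for row in ROWS]
print(checks)
```

Under this reading every row is consistent with the footnote, which supports (but does not prove) the inferred column order.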
Ecotype Simulation and eBURST output for 71 A-like .
| 0.03 | 7 (4–16) | 0.05 (0.01–0.14) | 3.5 (0.33–42) | Yes | 24 | ||
| 0.002 | 2 (2–60) | 0.052 (0.04–66) | 13 (1.2–12.2) | No | 11 | ||
| < 0.0001 | 2 (2–71) | 3113 (<2e−7–>100) | 48,000 (0.7->100) | No | 2 | ||
| 0.005 | 2 (2–6) | 0.09 (0.003–0.6) | 26.9 (1.7–99.8) | No | 6 | ||
| 0.01 | 2 (2–5) | 0.03 (0.001–0.23) | 564 (5.4–>100) | No | 4 | ||
| 0.001 | 3 (2–71) | 1.86 (0.03–99.6) | 22.7 (0.07–100) | No | 7 | ||
| 0.002 | 4 (3–22) | 0.3 (0.07–2.5) | 180 (3.8–>100) | No | 7 | ||
| Concatenation | 0.01 | 13 (9–44) | 0.05 (0.03–0.12) | 1.11 (0.07–100) | 50 | 3 | |
Molecular resolution of hisF was not significant enough for Ecotype Simulation to effectively predict ecotypes.
Calculated from concatenated sequence datasets only.
Not enough 65°C sequences to determine sample specificity of demarcated PEs.
Clonal complexes only apply to concatenated sequence datasets.
Figure 3. (A) rbsK distributions of two MLSA PEs that are demarcated as a single PE in rbsK analysis. (B,C) rbsK (solid line) and psaA (dotted line) distributions of two MLSA PEs connected through genomes of representative isolates. Bars represent standard error (n = 3). PE colors correspond across panels.
Figure 4. eBURST population snapshot of A-like . Clonal complexes are enclosed by solid black lines. PE demarcations from Ecotype Simulation analysis are overlaid, using different colors corresponding to Figure 2A to represent distinct PEs. STs are represented by numbers, and those colored in gray belong to PEs demarcated from a single sequence with a unique ST. The reference genome is indicated by “SynA” next to ST10.
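As background for the clonal complexes in this figure: eBURST-style grouping links sequence types whose allelic profiles differ at a single locus (single-locus variants) and treats each connected set as one complex. The sketch below illustrates only that linking step, using a small union-find; the profiles are hypothetical, the function names are illustrative, and the group definition actually used in the study is not stated in this record. The real eBURST program additionally infers founders, which is not attempted here.

```python
from itertools import combinations


def loci_differing(p, q):
    """Number of loci at which two allelic profiles carry different alleles."""
    return sum(a != b for a, b in zip(p, q))


def clonal_complexes(profiles, max_diff=1):
    """Group STs connected by chains of profiles differing at <= max_diff loci."""
    parent = {st: st for st in profiles}

    def find(st):
        while parent[st] != st:
            parent[st] = parent[parent[st]]  # path halving
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if loci_differing(profiles[a], profiles[b]) <= max_diff:
            parent[find(a)] = find(b)

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical five-locus allelic profiles, not taken from the study.
profiles = {
    "ST1": (1, 1, 1, 1, 1),
    "ST2": (1, 1, 2, 1, 1),  # single-locus variant of ST1
    "ST3": (1, 3, 2, 1, 1),  # single-locus variant of ST2
    "ST4": (4, 4, 4, 4, 4),  # unrelated singleton
}
complexes = clonal_complexes(profiles)
print(complexes)
```

In this toy input ST1 and ST3 differ at two loci, yet they fall into the same complex because both are single-locus variants of ST2; ST4 remains a singleton.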
Figure 5. Single nucleotide polymorphism patterning of A-like . Variants detected only by eBURST (blue), only by Ecotype Simulation (red) or by both (purple) are compared to the shared dominant variant (consensus sequence). (A) Dominant variant ST1 in PEA7 and clonal complex A7-I and (B) subdominant variant ST6 in PE7 and clonal complex A7-III. STs correspond with STs in Figure 2A.
| 1 | 1 | 1 | 1 | 31 | 1 | 1 | 1 | 1 | 1 | 1 | 72 | 82 | 105 | 121 | 172 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 21 | 1 | 1 | 81 | 1011 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 21 | 313 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 220 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 72 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 21 | 1 | 1 | 1 | 1 | 1 | 519 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 32 | |
| 2 | 120 | 2 | 2 | 2 | 182 | 148 | 1513 | |
| 1 | 1 | 1 | 1 | 81 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 21 | 1 | 1 | 1 | 1 | 1 | |
| 2 | 2 | 2 | 31 | 2 | 2 | 2 | 2 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | |
| 3 | 11 | 3 | 3 | 3 | |
| 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 1 | |
| 1 | 1 | 220 | 1 | 1 | |
| 1 | 1 | 1 | 1 | 52 | |
| 1 | 1 | 1 | 41 | 1 | |
The number of BACs within an ST is in parentheses when greater than 1. Superscripts next to the allele number denote number of nucleotide differences compared to the consensus sequence. Corresponding to Figure 4.
DV, dominant variant.
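The superscripts described in the footnote above count nucleotide differences between an allele and the consensus (dominant variant) sequence. A minimal sketch of that count follows, assuming aligned sequences of equal length; the sequences and allele names are hypothetical and are not data from the study.

```python
def differences_from_consensus(variant: str, consensus: str) -> int:
    """Count aligned positions where a variant differs from the consensus."""
    if len(variant) != len(consensus):
        raise ValueError("sequences must be aligned to equal length")
    return sum(v != c for v, c in zip(variant, consensus))


# Hypothetical aligned allele sequences, not data from the study.
consensus = "ATGGCTAAGC"
variants = {"allele 2": "ATGGCTAAGT", "allele 3": "ATGACTTAGC"}
superscripts = {
    name: differences_from_consensus(seq, consensus)
    for name, seq in variants.items()
}
print(superscripts)
```

How gaps or ambiguous bases were treated in the original counts is not stated here, so this sketch simply treats every mismatched character as one difference.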