Matthew L Bendall, Sarah Lr Stevens, Leong-Keat Chan, Stephanie Malfatti, Patrick Schwientek, Julien Tremblay, Wendy Schackwitz, Joel Martin, Amrita Pati, Brian Bushnell, Jeff Froula, Dongwan Kang, Susannah G Tringe, Stefan Bertilsson, Mary A Moran, Ashley Shade, Ryan J Newton, Katherine D McMahon, Rex R Malmstrom.
Abstract
Multiple models describe the formation and evolution of distinct microbial phylogenetic groups. These evolutionary models make different predictions regarding how adaptive alleles spread through populations and how genetic diversity is maintained. Processes predicted by competing evolutionary models, for example, genome-wide selective sweeps vs gene-specific sweeps, could be captured in natural populations using time-series metagenomics if the approach were applied over a sufficiently long time frame. Direct observations of either process would help resolve how distinct microbial groups evolve. Here, from a 9-year metagenomic study of a freshwater lake (2005-2013), we explore changes in single-nucleotide polymorphism (SNP) frequencies and patterns of gene gain and loss in 30 bacterial populations. SNP analyses revealed substantial genetic heterogeneity within these populations, although the degree of heterogeneity varied by >1000-fold among populations. SNP allele frequencies also changed dramatically over time within some populations. Interestingly, nearly all SNP variants were slowly purged over several years from one population of green sulfur bacteria, while at the same time multiple genes either swept through or were lost from this population. These patterns were consistent with a genome-wide selective sweep in progress, a process predicted by the 'ecotype model' of speciation but not previously observed in nature. In contrast, other populations contained large, SNP-free genomic regions that appear to have swept independently through the populations prior to the study without purging diversity elsewhere in the genome. Evidence for both genome-wide and gene-specific sweeps suggests that different models of bacterial speciation may apply to different populations coexisting in the same environment.
Year: 2016 PMID: 26744812 PMCID: PMC4918448 DOI: 10.1038/ismej.2015.241
Source DB: PubMed Journal: ISME J ISSN: 1751-7362 Impact factor: 10.302
Figure 1. 'Sequence-discrete' populations revealed by metagenomic read mapping. (a) An example recruitment plot of 50 000 shotgun reads mapping across the Chlorobium-111 genome at various nucleotide identity levels. Each dot represents a read. (b) Summary of reads mapping at each percentage of nucleotide identity level for all genomes. Each line represents a different genome. A distinct lack of coverage around 95% identity was observed in all genomes. The y axis (percentage of mapped reads) of panel (b) was truncated at 30% to illustrate this coverage discontinuity.
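The summary in panel (b) amounts to binning mapped reads by their nucleotide identity to the reference genome. The following is a minimal sketch of that binning, not the authors' pipeline: the function names, the whole-percent bin width, and the identity values in the usage note are all hypothetical.

```python
from collections import Counter

def identity_histogram(identities):
    """Percentage of mapped reads falling in each whole-percent identity bin.

    `identities` holds one percent-identity value per mapped read
    (hypothetical input; the source does not specify a file format).
    """
    if not identities:
        raise ValueError("no mapped reads supplied")
    counts = Counter(int(value) for value in identities)  # floor to whole percent
    total = len(identities)
    return {pct: 100.0 * n / total for pct, n in sorted(counts.items())}

def percent_in_window(identities, low, high):
    """Percentage of mapped reads whose identity lies in [low, high)."""
    if not identities:
        raise ValueError("no mapped reads supplied")
    hits = sum(1 for value in identities if low <= value < high)
    return 100.0 * hits / len(identities)
```

For example, with ten made-up reads `[99.5] * 6 + [98.2] * 2 + [91.0, 88.4]`, `identity_histogram` places 60% of reads in the 99% bin, and `percent_in_window(reads, 94, 97)` returns 0.0, which is the kind of gap the caption describes around 95% identity.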
Genomes reconstructed from metagenomic-combined assembly
| | Epilimnion | 764 032 | 95 | 917 | 64/74 |
| | Epilimnion | 996 711 | 125 | 1094 | 67/69 |
| | Epilimnion | 1 660 228 | 93 | 1777 | 72/62 |
| | Epilimnion | 1 013 290 | 136 | 1149 | 98/100 |
| | Epilimnion | 990 006 | 133 | 1125 | 57/52 |
| | Epilimnion | 942 700 | 85 | 1111 | 76/99 |
| | Epilimnion | 2 036 179 | 101 | 1943 | 95/100 |
| | Epilimnion | 2 186 907 | 124 | 1998 | 90/100 |
| | Epilimnion | 971 617 | 97 | 1063 | 74/58 |
| | Hypolimnion | 2 314 202 | 74 | 2319 | 92/100 |
| | Hypolimnion | 1 314 366 | 121 | 1475 | 66/52 |
| | Hypolimnion | 2 981 798 | 188 | 2862 | 80/56 |
| | Hypolimnion | 3 073 408 | 152 | 2864 | 77/66 |
| | Hypolimnion | 1 431 993 | 51 | 1439 | 90/82 |
| | Hypolimnion | 1 257 796 | 81 | 1353 | 73/60 |
| | Hypolimnion | 1 496 525 | 68 | 1581 | 57/54 |
| TM7-1225 | Hypolimnion | 915 278 | 14 | 993 | 63/90 |
| | Hypolimnion | 2 299 825 | 136 | 2072 | 68/59 |
| | Hypolimnion | 1 077 715 | 49 | 1131 | 50/46 |
| | Hypolimnion | 2 301 184 | 60 | 2383 | 98/100 |
| | Hypolimnion | 3 124 798 | 188 | 2919 | 94/89 |
| | Hypolimnion | 3 680 027 | 151 | 2965 | 72/59 |
| | Hypolimnion | 845 311 | 113 | 980 | 61/64 |
| | Hypolimnion | 1 808 963 | 100 | 1654 | 72/88 |
| | Hypolimnion | 1 002 927 | 75 | 1180 | 62/65 |
| | Hypolimnion | 3 798 404 | 58 | 3387 | 93/92 |
| | Hypolimnion | 1 149 636 | 85 | 1251 | 67/54 |
| | Hypolimnion | 2 657 023 | 54 | 2637 | 97/95 |
| | Hypolimnion | 2 156 671 | 83 | 2242 | 89/100 |
| | Hypolimnion | 1 315 659 | 42 | 1392 | 76/94 |
a/b = Genome completeness estimated using the approaches of Parks (a) and Rinke (b).
Summary of single-nucleotide polymorphisms (SNPs)
| | 3514 | 4599 | 2914 | 460 | 3 | 136 |
| | 1772 | 1753 | 1378 | 275 | 3 | 91 |
| | 4571 | 2753 | 3627 | 710 | 3 | 231 |
| | 45 | 44 | 18 | 11 | 2 | 14 |
| | 6244 | 6188 | 5039 | 851 | 3 | 231 |
| | 3003 | 3186 | 2223 | 656 | 1 | 123 |
| | 6437 | 3161 | 5257 | 893 | 4 | 283 |
| | 3839 | 1743 | 2924 | 663 | 1 | 223 |
| | 2238 | 2182 | 1659 | 377 | 0 | 84 |
| | 3111 | 1344 | 1498 | 1127 | 22 | 464 |
| | 6451 | 4908 | 3418 | 738 | 1 | 2291 |
| | 8501 | 2851 | 5605 | 2004 | 10 | 881 |
| | 4995 | 1625 | 3187 | 1037 | 1 | 770 |
| | 279 | 195 | 132 | 120 | 1 | 26 |
| | 297 | 236 | 189 | 47 | 1 | 60 |
| | 4269 | 2853 | 2971 | 971 | 4 | 323 |
| TM7-1225 | 3 | 3 | 0 | 1 | 0 | 2 |
| | 1381 | 600 | 951 | 197 | 1 | 232 |
| | 1779 | 1651 | 1153 | 434 | 2 | 190 |
| | 279 | 121 | 154 | 95 | 1 | 29 |
| | 6660 | 2131 | 3908 | 1515 | 14 | 1223 |
| | 4256 | 1157 | 2389 | 1231 | 12 | 623 |
| | 4209 | 4979 | 3400 | 597 | 3 | 209 |
| | 8036 | 4442 | 6254 | 1246 | 4 | 531 |
| | 2943 | 2934 | 2115 | 712 | 2 | 113 |
| | 145 | 38 | 43 | 70 | 3 | 29 |
| | 2111 | 1836 | 1551 | 318 | 2 | 240 |
| | 69 | 26 | 35 | 23 | 0 | 11 |
| | 4146 | 1922 | 2317 | 1180 | 11 | 637 |
| | 2126 | 1616 | 1505 | 477 | 0 | 143 |
Figure 2. Differences in SNP-level heterogeneity among coexisting populations. The number of SNPs found in each sequence-discrete population, normalized to genome size (SNPs per Mbp), varied by three orders of magnitude among populations with similar coverage levels. Although the power to identify low-frequency SNPs increases with greater genome coverage, populations with many SNPs were not necessarily sequenced deeper than those with few SNPs. Two pairs of closely related populations are highlighted to illustrate this point.
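The normalization named in this caption, SNPs per Mbp, is the SNP count divided by the genome size in megabase pairs. A minimal sketch follows; the function names and the two example populations in the usage note are hypothetical and are not values from the study.

```python
def snps_per_mbp(snp_count, genome_size_bp):
    """SNP density normalized to genome size, in SNPs per megabase pair."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return snp_count / (genome_size_bp / 1_000_000)

def fold_difference(density_a, density_b):
    """Ratio of the larger SNP density to the smaller one."""
    low, high = sorted((density_a, density_b))
    if low == 0:
        raise ValueError("fold difference is undefined when a density is zero")
    return high / low
```

With made-up inputs, a population with 4 SNPs in a 2 000 000 bp genome has a density of 2.0 SNPs per Mbp, and one with 6000 SNPs in a 1 500 000 bp genome has 4000.0 SNPs per Mbp, a 2000-fold difference, which is the scale of spread ("three orders of magnitude") the caption reports.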
Figure 3. Temporal dynamics of SNP allele frequencies within different populations. (a, b) Two examples of populations with different SNP dynamics. SNPs are arrayed along the y axis, with each row representing one SNP locus. SNP color indicates allele frequency, that is, the percentage of metagenomic reads supporting the reference allele during each time period. SNPs dominated by a single allele appear either as red (few reads matching reference base) or blue (most reads matching reference base). SNPs are arranged in ascending order along the y axis based on allele frequency in 2005. (c, d) Fraction of SNPs dominated by a single allele (≥95% frequency) in each year. Broad patterns of allele frequencies were determined by combining sequence data for each year.
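The two quantities in this caption can be sketched in a few lines: the allele frequency at a locus (percentage of reads supporting the reference allele) and the per-year fraction of SNPs dominated by a single allele at the 95% threshold. This is an illustrative sketch under stated assumptions, not the authors' code: it assumes biallelic SNPs, treats a locus as dominated when either the reference or the alternate allele reaches the threshold, and uses hypothetical names and input layout.

```python
def reference_allele_frequency(ref_reads, total_reads):
    """Percentage of reads supporting the reference allele (None if uncovered)."""
    if total_reads == 0:
        return None
    return 100.0 * ref_reads / total_reads

def is_dominated(freq_pct, threshold=95.0):
    """True when either allele reaches the threshold (biallelic assumption)."""
    return freq_pct >= threshold or (100.0 - freq_pct) >= threshold

def fraction_dominated(counts_by_year, threshold=95.0):
    """Map each year to the fraction of covered SNP loci dominated by one allele.

    `counts_by_year` maps year -> list of (ref_reads, total_reads) per locus,
    with reads pooled across all samples from that year (hypothetical layout).
    """
    result = {}
    for year, loci in counts_by_year.items():
        freqs = [reference_allele_frequency(ref, total) for ref, total in loci]
        freqs = [f for f in freqs if f is not None]  # skip uncovered loci
        if freqs:
            result[year] = sum(is_dominated(f, threshold) for f in freqs) / len(freqs)
        else:
            result[year] = None
    return result
```

With made-up counts such as `{2005: [(50, 100), (98, 100), (2, 100), (60, 100)], 2013: [(99, 100), (100, 100), (1, 100), (70, 100)]}`, the fraction rises from 0.5 to 0.75. A sustained rise of this kind across years is the pattern a gradual purging of SNP variants would produce.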
Figure 4. Temporal trends in SNP allele frequencies and gene content in a natural Chlorobium population. (a) SNPs are arrayed along the y axis, with each row representing one SNP locus. SNP color indicates allele frequency, that is, the percentage of metagenomic reads supporting the reference allele during each year. (b) Relative abundance of genes gained or lost from Chlorobium-111. A gene frequency of 1 equates to single copy per cell. Gene annotations and locus IDs are listed in Supplementary Table S2. Broad patterns of allele frequencies and gene abundances were determined by combining sequence data for each year.
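A gene frequency in which 1 equates to a single copy per cell can be obtained by dividing a gene's read coverage by a baseline that represents one copy per genome. The sketch below assumes the baseline is the median coverage of a set of single-copy genes. That choice, the function name, and the example coverages are assumptions for illustration; the text here does not state how the baseline was computed.

```python
from statistics import median

def gene_frequency(gene_coverage, single_copy_coverages):
    """Estimated gene copies per cell.

    Divides the gene's mean read coverage by the median coverage of
    single-copy reference genes (an assumed baseline, not confirmed
    by the source).
    """
    if not single_copy_coverages:
        raise ValueError("need at least one single-copy gene coverage")
    baseline = median(single_copy_coverages)
    if baseline <= 0:
        raise ValueError("baseline coverage must be positive")
    return gene_coverage / baseline
```

For instance, with made-up single-copy coverages of 38, 40 and 42, a gene covered at 40 has a frequency of 1.0 (present in every cell), one covered at 10 has 0.25 (present in roughly a quarter of cells), and one with no coverage has 0.0 (absent). Tracking this value per year is one way to see a gene sweeping through, or being lost from, a population.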