Mirna Vázquez-Rosas-Landa, Gabriel Yaxal Ponce-Soto, Jonás A Aguirre-Liguori, Shalabh Thakur, Enrique Scheinvar, Josué Barrera-Redondo, Enrique Ibarra-Laclette, David S Guttman, Luis E Eguiarte, Valeria Souza.
Abstract
BACKGROUND: In bacteria, pan-genomes are the result of an evolutionary "tug of war" between selection and horizontal gene transfer (HGT). High rates of HGT increase the genetic pool and the effective population size (Ne), resulting in open pan-genomes. In contrast, selective pressures can lead to local adaptation by purging the variation introduced by HGT and mutation, resulting in closed pan-genomes and clonal lineages. In this study, we explored both hypotheses, elucidating the pan-genome of Vibrionaceae isolates after a perturbation event in the endangered oasis of Cuatro Ciénegas Basin (CCB), Mexico, and looking for signals of adaptation to the environments in their genomes.
Keywords: Effective population size; Pan-genome; Population genomics; Recombination; Selection; Vibrionaceae
Year: 2020 PMID: 32571204 PMCID: PMC7306931 DOI: 10.1186/s12864-020-06829-y
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Study site, Pozas Rojas in Los Hundidos within Cuatro Ciénegas Basin, Mexico. Sampling sites are marked in yellow. The location of Cuatro Ciénegas is also shown on a map (Pozas Rojas photos were provided by David Jaramillo; the map showing the location of Cuatro Ciénegas Valley was obtained from Google Earth, earth.google.com/web/)
Fig. 2 Core gene phylogeny of the 1254 orthologs. Maximum-likelihood phylogenetic reconstruction of core genes; branch support values are shown. Each square represents the isolation environment, water or sediment, while yellow stars indicate reference strains. The isolation pond is indicated by its number. Clades are distinguished by color. Clades IV and V, which are likely to be exclusive to CCB, are highlighted with an asterisk
Pan-genome metrics of each Vibrionaceae clade isolated from Pozas Rojas, CCB
| Group / Clade | Number of CCB genomes included in each clade | Core | Flexible | Unique | Total number of genes | Heaps' law intercept | Heaps' law alpha |
|---|---|---|---|---|---|---|---|
| Clade I | 3 | 3617 | 346 | 603 | 4566 | 692.8508 | 1.1293 |
| Clade II | 22 | 1746 | 5770 | 1745 | 9261 | 244.2096 | 0.7913 |
| Clade III | 5 | 2672 | 718 | 324 | 3714 | 658.0634 | 1.6625 |
| Clade IV | 5 | 2055 | 1445 | 180 | 3680 | 2726.7580 | 2.0000 |
| Clade V | 4 | 2853 | 1660 | 1332 | 5845 | 1196.2571 | 1.3109 |
| Clade VI | 3 | 2448 | 3476 | 1028 | 4992 | 3295.5770 | 2.0000 |
| Vibrionaceae all Clades | 47 | 1254 | 14,072 | 4795 | 20,121 | 2263.7472 | 0.6621 |
The first column shows the clade ID, the second the number of genomes used in the analysis of each clade, followed by the general pan-genome metrics (core, flexible, unique, and total number of genes); the last two columns show the Heaps' law values obtained. If alpha > 1.0, the pan-genome is considered closed; if alpha < 1.0, it is considered open
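The open/closed rule stated in the note above can be written as a small check. The following is a minimal sketch rather than the authors' pipeline: the function name `classify_pan_genome` and the label returned when alpha equals exactly 1.0 are assumptions (the note does not cover that case), and the alpha values are copied from the table.

```python
# Heaps' law alpha values copied from the pan-genome metrics table.
HEAPS_ALPHA = {
    "Clade I": 1.1293,
    "Clade II": 0.7913,
    "Clade III": 1.6625,
    "Clade IV": 2.0000,
    "Clade V": 1.3109,
    "Clade VI": 2.0000,
    "Vibrionaceae all Clades": 0.6621,
}


def classify_pan_genome(alpha: float) -> str:
    """Apply the table's rule: alpha > 1.0 is closed, alpha < 1.0 is open."""
    if alpha > 1.0:
        return "closed"
    if alpha < 1.0:
        return "open"
    # Assumption: the source does not say how alpha == 1.0 is treated.
    return "boundary"


if __name__ == "__main__":
    for clade, alpha in HEAPS_ALPHA.items():
        print(f"{clade}: alpha = {alpha:.4f} -> {classify_pan_genome(alpha)}")
```

Under this rule, only Clade II and the combined Vibrionaceae set fall on the open side of the threshold.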
Genetic diversity statistics of Vibrionaceae clades isolated from Pozas Rojas, CCB
| Clade | Group | Number of individuals | Number of segregating sites | π | θw | Tajima's D | P-value of Tajima's D |
|---|---|---|---|---|---|---|---|
| Clade I | | 3 | 100,971 | 0.0164894 | 0.0163978 | 0 | 0 |
| Clade II | All individuals | 22 | 103,197 | 0.01148342 | 0.01106029 | 0.15738106 | 0.8582025 |
| Clade II | All individuals in the three larger sub-clades | 14 | 49,946 | 0.00916203 | 0.00613614 | 2.23866585 | 0.02142617 |
| Clade II | Sub-clade G | 4 | 13 | 2.54E-06 | 2.77E-06 | −0.84306779 | 0.77323024 |
| Clade II | Sub-clade D | 6 | 42 | 5.47E-06 | 7.19E-06 | −1.52560731 | 0.02458297 |
| Clade II | Sub-clade A | 4 | 82 | 1.61E-05 | 1.75E-05 | −0.83190864 | 0.8020116 |
| Clade III | | 5 | 40,593 | 0.0051088 | 0.0061293 | −1.27467187 | 0.01772241 |
| Clade IV | | 5 | 209 | 2.86E-05 | 3.46E-05 | −1.31696234 | 0 |
| Clade V | | 4 | 34,843 | 0.00398639 | 0.00434715 | −0.87361739 | 0.56601856 |
| Clade VI | | 3 | 204,388 | 0.04622002 | 0.04621538 | 0 | 0 |
From left to right, the columns show the number of individuals, the number of segregating sites, nucleotide diversity (π), Watterson's theta (θw), Tajima's D, and the p-value of Tajima's D. The values were estimated for all six clades and for sub-clades with 3 or more individuals
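For readers unfamiliar with how these statistics relate, the sketch below computes Watterson's theta and Tajima's D with the standard textbook formulas (Tajima 1989). It is an illustration only, not the software used in the study: the function names are hypothetical, the inputs are whole-alignment totals (the table reports π and θw per site), and the example numbers are a generic worked case, not data from this paper. Because the variance term of D vanishes for three sequences, the sketch refuses samples smaller than four.

```python
from math import sqrt


def watterson_theta(n: int, seg_sites: int) -> float:
    """Watterson's estimator for the whole alignment: S / a1."""
    a1 = sum(1.0 / i for i in range(1, n))
    return seg_sites / a1


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diff: float) -> float:
    """Tajima's D from sample size, segregating sites and mean pairwise differences."""
    if n < 4:
        # For n = 3 the variance constants are zero, so D is undefined.
        raise ValueError("this sketch needs at least 4 sequences")
    if seg_sites == 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n**2 + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Illustrative values only: 10 sequences, 16 segregating sites,
    # 3.888889 mean pairwise differences.
    print(round(watterson_theta(10, 16), 4))
    print(round(tajimas_d(10, 16, 3.888889), 4))
```

A negative D means the mean pairwise diversity is lower than expected from the number of segregating sites, and a positive D means the opposite.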
Estimates of effective population sizes (Ne) of Vibrionaceae clades isolated from Pozas Rojas, CCB, obtained through simulations with Fastsimcoal2 [44, 45], and comparative values from other organisms
| Group / Clade | Sample size | Median value | Range: lower value | Range: upper value | Environment | Reference |
|---|---|---|---|---|---|---|
| Clade I | 3 | 12,822,270 | 10,110,043 | 16,231,765 | Sediment | This work |
| Clade II | ||||||
| Sub-clade A | 4 | 55,938 | 34,079 | 392,104 | Sediment | This work |
| Sub-clade D | 6 | 20,849 | 2795 | 218,603 | Water-Sediment | This work |
| Sub-clade G | 4 | 29,791 | 6174 | 226,658 | Water-Sediment | This work |
| Clade III | 4 | 15,018,880 | 8,970,283 | 22,432,331 | Water-Sediment | This work |
| Clade IV | 4 | 383,067 | 345,564 | 427,557 | Sediment | This work |
| Clade V | 4 | 9,594,874 | 5,894,074 | 12,914,770 | Sediment | This work |
| Clade VI | 3 | 4,141,870 | 2,582,483 | 10,645,019 | Sediment | This work |
| 39,665,437 | – | – | – | [ | ||
| 348,991,354 | – | – | – | [ | ||
| 179,600,000 | – | – | – | [ | ||
| 20,348 | – | – | – | [ | ||
| 266,769 | – | – | – | [ | ||
| 3,998,701 | – | – | – | [ | ||
| 5,332,244 | – | – | – | [ | ||
The first column shows the names of the CCB clades and reference strains used in the calculation; the second column gives the number of strains within each group, followed by the estimated median Ne value and its range. The last two columns display the isolation environment and the reference
Fig. 3 Patterns of recombination events among isolated strains. Heatmap of the frequency of recombination events among different strains; red colors indicate more recombination events between strains, while blue colors indicate few recombination events. Distances were estimated with the Jaccard dissimilarity index
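The Jaccard dissimilarity index named in the caption can be sketched as follows. This is a generic illustration: the excerpt does not say how recombination events were encoded per strain, so the set representation, the event labels, and the function name `jaccard_dissimilarity` are all assumptions.

```python
def jaccard_dissimilarity(a: set, b: set) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


if __name__ == "__main__":
    # Hypothetical event labels, for illustration only.
    strain_a = {"event_1", "event_2", "event_3"}
    strain_b = {"event_2", "event_3", "event_4"}
    print(jaccard_dissimilarity(strain_a, strain_b))  # 2 shared of 4 total -> 0.5
```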
Recombination vs. mutation estimates of Vibrionaceae clades isolated from Pozas Rojas, CCB
| Group / Clade | rho/theta | r/m |
|---|---|---|
| Clade I | 0.1036 | 2.7249 |
| Clade II | 0.1171 | 0.5299 |
| Clade III | 0.1498 | 1.1163 |
| Clade IV | 0.1437 | 0.9090 |
| Clade V | 0.0278 | 0.2825 |
| Clade VI | 0.0074 | 0.0052 |
| 0.0064 | 0.2261 | |
| 0.2889 | 4.0014 | |
| 0.0667 | 0.5659 | |
| 0.0025 | 0.1246 | |
The first column shows the names of the CCB clades and reference strains used in the calculation. The second and third columns show the rho/theta and r/m estimates [49]
Enriched GO terms, estimated with TopGO [50], for the gene families with signals of positive selection
| GO ID | Term | Annotated | Significant | Expected | Fisher test with Bonferroni |
|---|---|---|---|---|---|
| GO:0000902 | cell morphogenesis | 398 | 67 | 26.93 | 0.00020748 |
| GO:0009234 | menaquinone biosynthetic process | 240 | 38 | 16.24 | 0.00150024 |
| GO:0009245 | lipid A biosynthetic process | 240 | 38 | 16.24 | 0.00150024 |
| GO:0008360 | regulation of cell shape | 244 | 37 | 16.51 | 0.0059052 |
| GO:0007156 | homophilic cell adhesion via plasma membrane adhesion molecules | 13 | 7 | 0.88 | 0.0122892 |
| GO:0006304 | DNA modification | 295 | 41 | 19.96 | 0.01596 |
| GO:0009058 | biosynthetic process | 26,775 | 1675 | 1811.62 | 0.017556 |
The first two columns show the enriched GO IDs and their names, the third column the number of annotated genes, the fourth and fifth columns the number of significant genes and the expected number, and the last column the significance after Bonferroni correction
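The columns described above (Annotated, Significant, Expected, and a Bonferroni-corrected Fisher test) follow the usual logic of a GO enrichment test. The sketch below illustrates that logic with a one-sided Fisher exact test computed from the hypergeometric distribution. It is not TopGO's implementation; the function names are hypothetical, and the gene counts in the example are invented because the size of the gene universe is not given in this excerpt.

```python
from math import comb


def expected_count(annotated: int, significant_total: int, universe: int) -> float:
    """Expected number of significant genes carrying the term under no enrichment."""
    return annotated * significant_total / universe


def fisher_enrichment_p(overlap: int, annotated: int,
                        significant_total: int, universe: int) -> float:
    """One-sided Fisher exact p-value: P(X >= overlap) under the hypergeometric."""
    upper = min(annotated, significant_total)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, significant_total - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(universe, significant_total)


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni correction: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Invented toy numbers: 10 genes in total, 4 annotated to the term,
    # 5 significant overall, 3 of them annotated to the term.
    p = fisher_enrichment_p(overlap=3, annotated=4, significant_total=5, universe=10)
    print(expected_count(4, 5, 10), p, bonferroni(p, n_tests=7))
```

A term counts as over-represented when its Significant count exceeds its Expected count and the corrected p-value stays below the chosen threshold.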
Fig. 4 UPGMA of the 598 SNPs associated with the isolation environment. Tip colors represent clade membership; for Clade II, sub-clades are also indicated. Squares represent the isolation environment. Distances were calculated with the bitwise distance function of poppr v2.8.1
GO terms enriched in the genes found to have an association with the isolation environment (water or sediment)
| Genes with signals of recombination or selection | GO ID | Term | Annotated | Significant | Expected | Fisher test with Bonferroni |
|---|---|---|---|---|---|---|
| Recombination | GO:0006066 | alcohol metabolic process | 446 | 8 | 0.55 | 0.000146 |
| Recombination | GO:0006429 | leucyl-tRNA aminoacylation | 41 | 4 | 0.05 | 0.000338 |
| Recombination | GO:0006419 | alanyl-tRNA aminoacylation | 48 | 4 | 0.06 | 0.000643 |
| Recombination | GO:0006265 | DNA topological change | 339 | 6 | 0.42 | 0.006914 |
| Selection | GO:0006814 | sodium ion transport | 685 | 9 | 1.47 | 0.03216 |
The first column indicates whether the genes show signals of recombination or selection, and the next two columns show the enriched GO IDs and their names. The following columns show the number of annotated genes, the number of significant genes, and the expected number; the last column shows the significance after Bonferroni correction