Emanuel M Fonseca, Drew J Duckett, Filipe G Almeida, Megan L Smith, Maria Tereza C Thomé, Bryan C Carstens.
Abstract
Bayesian skyline plots (BSPs) are a useful tool for making inferences about demographic history. For example, researchers typically apply BSPs to test hypotheses regarding how climate changes have influenced intraspecific genetic diversity over time. Like any method, BSPs rest on assumptions that may be violated in some empirical systems (e.g., the absence of population genetic structure), and the naïve analysis of data collected from these systems may lead to spurious results. To address these issues, we introduce P2C2M.Skyline, an R package designed to assess model adequacy for BSPs using posterior predictive simulation. P2C2M.Skyline uses a phylogenetic tree and the log file output from Bayesian Skyline analyses to simulate posterior predictive datasets and then compares this null distribution to statistics calculated from the empirical data to check for model violations. P2C2M.Skyline correctly identified model violations when simulated datasets were generated assuming genetic structure, which is a clear violation of BSP model assumptions. Conversely, P2C2M.Skyline showed low rates of false positives when datasets were simulated under the BSP model. We also evaluated the performance of P2C2M.Skyline in empirical systems, where we detected model violations when DNA sequences from multiple populations were lumped together. P2C2M.Skyline represents a user-friendly and computationally efficient resource for researchers aiming to make inferences from BSPs.
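The comparison between the empirical statistic and the posterior predictive null distribution can be illustrated with a minimal sketch in Python. This is not the P2C2M.Skyline implementation (the package is written in R); the function name `posterior_predictive_pvalue`, the two-tailed rule, and the toy null distribution are assumptions made purely for illustration, and the abstract does not say which summary statistic the package uses.

```python
import random


def posterior_predictive_pvalue(empirical_stat, simulated_stats):
    """Two-tailed empirical p-value of one statistic against a simulated null.

    Hypothetical helper: counts the fraction of simulated values at least as
    extreme as the empirical value in each tail and doubles the smaller tail.
    """
    n = len(simulated_stats)
    if n == 0:
        raise ValueError("need at least one simulated statistic")
    lower = sum(s <= empirical_stat for s in simulated_stats) / n
    upper = sum(s >= empirical_stat for s in simulated_stats) / n
    return min(1.0, 2.0 * min(lower, upper))


# Toy null distribution standing in for statistics computed on
# posterior predictive datasets (values are made up for illustration).
rng = random.Random(1)
null_stats = [rng.gauss(10.0, 2.0) for _ in range(1000)]

p_consistent = posterior_predictive_pvalue(10.0, null_stats)  # near the centre
p_violation = posterior_predictive_pvalue(25.0, null_stats)   # far in the tail

print(p_consistent, p_violation)
```

A small p-value means the empirical statistic falls in the tail of what the fitted model can generate, which is how a poor fit (for example, unmodelled population structure) would show up under this kind of check.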
Year: 2022 PMID: 35877611 PMCID: PMC9312427 DOI: 10.1371/journal.pone.0269438
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.752
Fig 1. Workflow of the P2C2M.Skyline pipeline.
Arrows represent the path of the data from step 1 to step 6. See the P2C2M.Skyline package section in Materials and Methods for more information.
Fig 2. Percentage of simulated datasets with a p-value below 1% and below 5% across the five diversification scenarios.
In each chart, the y-axis shows the percentage of replicates in which the statistical fit of the Bayesian Skyline model is rejected or not rejected under two sampling schemes (10 and 50 individuals).
Results of the Matthews correlation coefficient for the simulated datasets.
False positives represent datasets simulated under the Bayesian Skyline model premises (i.e., constant, expansion, and bottleneck) that P2C2M.Skyline classified as a model violation. In contrast, false negatives represent datasets not simulated under Bayesian Skyline model premises (i.e., two-population models) that P2C2M.Skyline classified as not a model violation.
| Sampling | Significance level | True positives | True negatives | False negatives | False positives | Matthews correlation coefficient (MCC) |
|---|---|---|---|---|---|---|
| 10 and 50 individuals combined | < 0.01 | 588 | 50 | 350 | 12 | 0.21 |
| 10 and 50 individuals combined | < 0.025 | 583 | 99 | 301 | 17 | 0.33 |
| 10 and 50 individuals combined | < 0.05 | 567 | 139 | 261 | 33 | 0.38 |
| 10 and 50 individuals combined | < 0.1 | 523 | 245 | 155 | 77 | 0.50 |
| 10 individuals | < 0.01 | 290 | 11 | 189 | 10 | 0.05 |
| 10 individuals | < 0.025 | 287 | 25 | 175 | 13 | 0.15 |
| 10 individuals | < 0.05 | 282 | 44 | 156 | 18 | 0.23 |
| 10 individuals | < 0.1 | 259 | 101 | 99 | 41 | 0.40 |
| 50 individuals | < 0.01 | 298 | 39 | 161 | 2 | 0.33 |
| 50 individuals | < 0.025 | 296 | 74 | 126 | 4 | 0.48 |
| 50 individuals | < 0.05 | 285 | 95 | 105 | 15 | 0.50 |
| 50 individuals | < 0.1 | 264 | 144 | 56 | 36 | 0.61 |
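The MCC column can be recomputed from the four counts in each row using the standard definition, MCC = (TP·TN − FP·FN) / √((TP+FP)(TP+FN)(TN+FP)(TN+FN)). The sketch below is in Python; the function name `mcc` and the choice of rows are ours, and the counts are copied from the table above.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when any marginal total is zero.
        return 0.0
    return (tp * tn - fp * fn) / denom


# (sampling, significance level) -> (TP, TN, FN, FP), as listed in the table.
rows = {
    ("combined", "< 0.01"): (588, 50, 350, 12),
    ("combined", "< 0.05"): (567, 139, 261, 33),
    ("50 individuals", "< 0.1"): (264, 144, 56, 36),
}

for (sampling, level), (tp, tn, fn, fp) in rows.items():
    print(sampling, level, round(mcc(tp, tn, fp, fn), 2))
# combined < 0.01 0.21
# combined < 0.05 0.38
# 50 individuals < 0.1 0.61
```

The rounded values match the reported MCC for these rows. MCC ranges from −1 to 1, with 0 corresponding to classification no better than chance.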
Results of the P2C2M.Skyline on empirical datasets.
Asterisk indicates datasets with p-value < 0.05.
| Order | Species | Population | Number of samples | Length (bp) | p-value | Skyline result | Reference |
|---|---|---|---|---|---|---|---|
| Anura | | All samples | 82 | 641 | 0.46 | Expansion | This study |
| | | Population 1 | 32 | | 0.12 | Constant | |
| | | Population 2 | 27 | | 0.46 | Constant | |
| | | Population 3 | 23 | | 0.42 | Constant | |
| Anura | | All samples | 86 | 554 | 0.02* | Constant | This study |
| | | Population 1 | 23 | | 0.22 | Constant | |
| | | Population 2 | 63 | | 0.092 | Constant | |
| Anura | | All samples | 25 | 603 | 0.044* | Constant | [ |
| Anura | | All samples | 165 | 603 | 0* | Expansion | [ |
| | | Population 1 | 15 | | 0.006* | Constant | |
| | | Population 2 | 140 | | 0.968 | Expansion | |
| | | Population 3 | 10 | | 0.074 | Constant | |
| Squamata | | All samples | 68 | 837 | 0.986 | Constant | [ |
| | | Population 1 | 18 | | 0.176 | Constant | |
| | | Population 2 | 32 | | 0.774 | Constant | |
| | | Population 3 | 18 | | 0.482 | Constant | |
| Squamata | | All samples | 53 | 679 | 0.002* | Expansion | [ |
| | | Population 1 | 45 | | 0.146 | Constant | |
| | | Population 2 | 8 | | 0.196 | Constant | |
| Passeriformes | | All samples | 44 | 1,041 | 0.07 | Constant | [ |
| Passeriformes | | All samples | 40 | 1,041 | 0.088 | Constant | [ |
| Passeriformes | + | All samples | 84 | 1,041 | 0.01* | Expansion | [ |
| Araneae | | All samples | 203 | 715 | 0.41 | Expansion | [ |
| | | Population 1 | 162 | | 0.028* | Expansion | |
| | | Population 2 | 41 | | 0.036* | Expansion | |