Alexei J Drummond, Marc A Suchard.
Abstract
BACKGROUND: Many data summary statistics have been developed to detect departures from neutral expectations of evolutionary models. However, questions about the neutrality of the evolution of genetic loci within natural populations remain difficult to assess. One critical cause of this difficulty is that most methods for testing neutrality make simplifying assumptions simultaneously about the mutational model and the population size model. Consequently, rejecting the null hypothesis of neutrality under these methods could result from violations of either or both assumptions, making interpretation troublesome.
Year: 2008 PMID: 18976476 PMCID: PMC2645432 DOI: 10.1186/1471-2156-9-68
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Summary statistics used in tests of neutrality
| Summary Statistic | Reference | Description |
| T | - | The total length of all branches of the tree. |
| | - | The difference in age between the most recent common ancestor and the most modern individual. |
| D | [ | A classic summary statistic for testing neutrality. Normalized difference between external branch lengths and total tree length. |
| | [ | A measure of tree imbalance. |
| | [ | The number of internal nodes with exactly two terminal children (the number of cherries). |
| | [ | A measure of tree imbalance. Range is [0, 1]. Larger numbers signify more imbalanced trees. |
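Three of the statistics described in the table above (total tree length, the age difference between the root and the most modern tip, and the number of cherries) are simple enough to sketch directly. The following is a minimal illustration only: the `Node` class, the function names, and the toy tree are assumptions introduced here and are not taken from the original work.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node in a rooted tree (hypothetical helper type).

    branch_length is the length of the branch above this node;
    the root keeps the default of 0.0.
    """
    branch_length: float = 0.0
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def tree_length(node: Node) -> float:
    """Total length of all branches in the subtree rooted at node."""
    return node.branch_length + sum(tree_length(c) for c in node.children)


def tree_height(node: Node) -> float:
    """Largest root-to-tip distance, i.e. the age difference between
    the root and the most modern (farthest) tip."""
    if node.is_leaf():
        return 0.0
    return max(c.branch_length + tree_height(c) for c in node.children)


def count_cherries(node: Node) -> int:
    """Number of internal nodes with exactly two terminal children."""
    if node.is_leaf():
        return 0
    is_cherry = len(node.children) == 2 and all(c.is_leaf() for c in node.children)
    return int(is_cherry) + sum(count_cherries(c) for c in node.children)


# Toy tree: ((A:1, B:1):1, C:2)
cherry = Node(1.0, [Node(1.0), Node(1.0)])
root = Node(0.0, [cherry, Node(2.0)])

print(tree_length(root))     # 5.0
print(tree_height(root))     # 2.0
print(count_cherries(root))  # 1
```

The recursive form keeps each statistic close to its verbal description; for very large genealogies an iterative traversal would avoid Python's recursion limit.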
Bayesian parameter estimates
| Data Set | Demographic Model | log marginal likelihood | Nτ | Exponential growth rate | Substitution rate | Γ shape parameter | | |
| Brown bear | Constant* | -2200 | 113,800 | - | 5.68 × 10⁻⁷ | 0.243 | 41.8 | 153,500 |
| | Exp. growth | -2198 | 127,000 | 5.45 × 10⁻⁶ | 5.95 × 10⁻⁷ | 0.243 | 41.7 | 145,100 |
| HRSV | Constant* | -6068 | 36.3 | - | 0.00242 | 0.900 | 12.4 | 56.1 |
| | Exp. growth | -6070 | 53.0 | 0.0263 | 0.00239 | 0.900 | 12.4 | 55.8 |
| Dengue-4 | Constant | -3960 | 11.2 | - | 0.000976 | 0.167 | 17.3 | 19.7 |
| | Exp. growth* | -3952 | 38.9 | 0.134 | 0.00096 | 0.167 | 17.2 | 19.0 |
| Influenza A | Constant* | -4386 | 4.3 | - | 0.00503 | 0.332 | 5.49 | 19.0 |
| | Exp. growth | -4383 | 7.25 | 0.0681 | 0.00506 | 0.332 | 5.5 | 18.9 |
Posterior parameter estimates from the MCMC analyses. The effective population size is reported only as a product with generation time (Nτ); this compound parameter has units of years for the virus data sets and radiocarbon years for the brown bear data set. Posterior means are reported for all model parameters. For each data set, the demographic model chosen by a Bayes factor is marked (*). Column footnotes: (1) marginal likelihood, (2) exponential growth rate, (3) substitution rate, (4) shape parameter of the Γ-distribution.
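As a rough illustration of how a Bayes factor compares the two demographic models, the log Bayes factor is the difference between the log marginal likelihoods listed in the table above. The sketch below uses only those tabulated values. The selection rule and its `threshold` argument are assumptions made for illustration; the source does not state the cut-off that was applied.

```python
# Log marginal likelihoods copied from the table above.
LOG_MARGINAL_LIKELIHOOD = {
    "Brown bear": {"Constant": -2200.0, "Exp. growth": -2198.0},
    "HRSV": {"Constant": -6068.0, "Exp. growth": -6070.0},
    "Dengue-4": {"Constant": -3960.0, "Exp. growth": -3952.0},
    "Influenza A": {"Constant": -4386.0, "Exp. growth": -4383.0},
}


def log_bayes_factor(log_ml: dict) -> float:
    """Log Bayes factor of exponential growth over constant size."""
    return log_ml["Exp. growth"] - log_ml["Constant"]


def choose_model(log_ml: dict, threshold: float) -> str:
    """Prefer the more complex model only if the log Bayes factor
    exceeds threshold (a hypothetical rule, not from the source)."""
    return "Exp. growth" if log_bayes_factor(log_ml) > threshold else "Constant"


for name, log_ml in LOG_MARGINAL_LIKELIHOOD.items():
    # threshold=5.0 is an illustrative assumption only.
    print(name, log_bayes_factor(log_ml), choose_model(log_ml, threshold=5.0))
```

With the illustrative threshold of 5 log units, the selections happen to coincide with the starred rows of the table, but this should not be read as the criterion the authors used.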
Predictive Probabilities
| Data Set | Demographic Model | | | | | | | MV |
| Brown bear | Constant | 0.205 | 0.164 | 0.128 | 0.307 | 0.844 | 0.900 | 0.219 |
| | Exponential growth | 0.382 | 0.374 | 0.209 | 0.305 | 0.832 | 0.886 | 0.284 |
| HRSV | Constant | 0.045 | 0.034 | 0.044 | 0.835 | 0.851 | 0.865 | 0.330 |
| | Exponential growth | 0.294 | 0.335 | 0.121 | 0.805 | 0.845 | 0.857 | 0.463 |
| Dengue-4 | Constant | 0.036 | 0.004* | 0.001* | 0.434 | 0.401 | 0.581 | 0.027* |
| | Exponential growth | 0.219 | 0.170 | 0.449 | 0.349 | 0.498 | 0.128 | |
| Human influenza A | Constant | 0.040 | 0.101 | 0.951 | 0.392 | 0.393 | | |
| | Exponential growth | 0.085 | 0.381 | 0.001* | 0.916 | 0.427 | 0.438 | 0.018* |
Univariate and multivariate posterior predictive p-values for summary statistics on each of the example data sets. Significant departures from neutrality (univariate: p < α/2 or p > 1 - α/2; multivariate: p < α; α = 0.05) are marked (*). Significant departures on the best fitting model for each data set are in bold.
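The significance rule in the caption above can be written out as a short check. This is a minimal sketch of the stated thresholds only; the function names are assumptions introduced here, and the p-values in the usage lines are taken from the table.

```python
def univariate_significant(p: float, alpha: float = 0.05) -> bool:
    """Two-sided rule: significant if p < alpha/2 or p > 1 - alpha/2."""
    return p < alpha / 2 or p > 1 - alpha / 2


def multivariate_significant(p: float, alpha: float = 0.05) -> bool:
    """One-sided rule: significant if p < alpha."""
    return p < alpha


# Dengue-4, constant model: 0.036 is not starred, 0.004 is starred.
print(univariate_significant(0.036))    # False
print(univariate_significant(0.004))    # True
# Multivariate column: 0.027 is starred, 0.219 is not.
print(multivariate_significant(0.027))  # True
print(multivariate_significant(0.219))  # False
```

Note that the same value can be significant in the multivariate column but not in a univariate one, because the univariate rule splits α across both tails.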
Figure 1. Posterior and predictive distributions of tree length T and D for all four data sets. The dengue-4 data is from an analysis assuming exponential growth, while the other three analyses assumed a constant population size. Human influenza A virus shows the largest departure from neutrality, with the posterior distribution completely disjoint from the predictive distribution.
Figure 2. Posterior and posterior predictive genealogies of human influenza A virus. (A) A sample of two trees from the posterior distribution of the human influenza A virus data set. (B) The two matching trees simulated for the predictive distribution of the human influenza A virus data set. Obvious differences between the posterior and predictive trees are the shorter tree length and absence of deep splits in the posterior trees.
Figure 3. Cumulative distribution of p-values (based on the D statistic) on 100 simulated data sets under a constant population (open circles) and an exponentially growing population (closed circles). The ideal behaviour for p when applied to data simulated from the null distribution would be a uniform distribution (see main text for details). This plot shows that if the true demographic history is a constant population, then p will be a good test of neutrality. However, if the true demographic history is exponential growth, p will be a conservative test, as can be seen by the lack of high p-values in the closed circles.
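The comparison in Figure 3 between an empirical cumulative distribution of p-values and the uniform ideal can be summarised numerically. The sketch below uses the Kolmogorov-Smirnov distance to the uniform distribution, which is a substitute chosen here for illustration and not a measure reported in the source. The two input lists are synthetic stand-ins, not the simulated data sets from the study.

```python
from typing import Sequence


def ks_distance_from_uniform(p_values: Sequence[float]) -> float:
    """Largest gap between the empirical CDF of p_values and the
    Uniform(0, 1) CDF (the one-sample Kolmogorov-Smirnov statistic)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    ps = sorted(p_values)
    n = len(ps)
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted value.
    return max(max(i / n - p, p - (i - 1) / n) for i, p in enumerate(ps, start=1))


# Synthetic illustration: 100 evenly spread p-values (close to uniform) ...
uniform_like = [(i - 0.5) / 100 for i in range(1, 101)]
# ... and the same values shrunk towards zero, so that high p-values are absent.
lacking_high_values = [0.6 * p for p in uniform_like]

print(ks_distance_from_uniform(uniform_like))
print(ks_distance_from_uniform(lacking_high_values))
```

A small distance indicates p-values that track the diagonal of the cumulative plot; a large one indicates a systematic departure such as the missing high p-values described in the caption.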