Jérémie Scire, Joëlle Barido-Sottani, Denise Kühnert, Timothy G Vaughan, Tanja Stadler.
Abstract
The multi-type birth-death model with sampling is a phylodynamic model that enables the quantification of past population dynamics in structured populations based on phylogenetic trees. The BEAST 2 package bdmm implements an algorithm for numerically computing the probability density of a phylogenetic tree given the population dynamic parameters under this model. In the initial release of bdmm, analyses were computationally limited to trees consisting of up to approximately 250 genetic samples. We implemented important algorithmic changes to bdmm which dramatically increased the number of genetic samples that could be analyzed and which improved the numerical robustness and efficiency of the calculations. Including more samples led to improved precision of parameter estimates, particularly for structured models with a high number of inferred parameters. Furthermore, we report on several model extensions to bdmm, inspired by properties common to empirical datasets. We applied this improved algorithm to two partly overlapping datasets of Influenza A virus HA sequences sampled around the world, one with 500 samples and the other with only 175, for comparison. We report and compare the global migration patterns and seasonal dynamics inferred from each dataset. In this way, we show the information that is gained by analyzing the bigger dataset, which became possible with the presented algorithmic changes to bdmm. In summary, bdmm allows for robust, faster, and more general phylodynamic inference on larger datasets.
Keywords: Bayesian inference; phylodynamics; phylogenetics; population structure
Year: 2022 PMID: 36016270 PMCID: PMC9413058 DOI: 10.3390/v14081648
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818
Figure 1. Complete tree (left) and sampled trees (middle and right) obtained from a multi-type birth–death process with two types. The orange and blue dots on the trees represent sampled individuals and are colored according to the type these individuals belong to. A -sampling event happens at time . The grey squares represent degree-2 nodes added to branches crossing this event. -sampling also happens in the present (time T). As seen in the complete tree, the three individuals who were first sampled were not removed from the population upon sampling, whereas the three individuals sampled at time were.
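The process behind Figure 1 can be illustrated with a small event-driven simulation. The sketch below is only a toy under stated assumptions, not the bdmm algorithm: it tracks population counts for two types rather than a tree, and every name in it (`simulate_two_type`, the per-type `birth`, `death`, `sampling`, and `migration` rates, `removal_prob`, and the end-time sampling probability `rho`) is hypothetical and chosen for illustration.

```python
import random

def simulate_two_type(birth, death, sampling, migration, removal_prob, rho, t_end, rng):
    """Toy Gillespie simulation of a two-type birth-death process with sampling.

    All rates are per individual and indexed by type (0 or 1). Only counts
    and sampling events are tracked, not the genealogy.
    """
    counts = [1, 0]   # assumption: start from one individual of type 0
    samples = []      # (time, type, removed_on_sampling)
    t = 0.0
    while sum(counts) > 0:
        # Collect every event that can currently happen, with its total rate.
        events, weights = [], []
        for i in (0, 1):
            for kind, rate in (("birth", birth[i]), ("death", death[i]),
                               ("sample", sampling[i]), ("migrate", migration[i])):
                if counts[i] > 0 and rate > 0:
                    events.append((kind, i))
                    weights.append(counts[i] * rate)
        if not events:
            break
        # Exponential waiting time to the next event.
        t += rng.expovariate(sum(weights))
        if t >= t_end:
            break
        kind, i = rng.choices(events, weights=weights)[0]
        if kind == "birth":
            counts[i] += 1
        elif kind == "death":
            counts[i] -= 1
        elif kind == "migrate":
            counts[i] -= 1
            counts[1 - i] += 1
        else:
            # Sampling through time; the individual may stay in the population.
            removed = rng.random() < removal_prob
            samples.append((t, i, removed))
            if removed:
                counts[i] -= 1
    # Sampling at the end time: each survivor is sampled with probability rho.
    for i in (0, 1):
        for _ in range(counts[i]):
            if rng.random() < rho:
                samples.append((t_end, i, False))
    return counts, samples

# Illustrative rates only; none of these values come from the paper.
final_counts, sample_events = simulate_two_type(
    birth=[1.2, 1.0], death=[0.5, 0.5], sampling=[0.3, 0.3],
    migration=[0.2, 0.2], removal_prob=0.5, rho=0.5, t_end=3.0,
    rng=random.Random(1),
)
```

Setting `removal_prob` below 1 reproduces the situation in the caption where some sampled individuals remain in the population after sampling.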
Distributions from which parameters were sampled for the simulation of trees. All parameters were constant through time.
| Parameter | Distribution |
|---|---|
| | Unif(1, 3) |
| | |
| | Unif(0, 0.5) |
| | Unif(0.05, 0.5) |
| | Unif(0, 1) |
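As a rough illustration of how one parameter set might be drawn for a simulated tree, the sketch below samples from the four uniform distributions that survive in the table. The parameter symbols were lost in this record, so the dictionary keys are hypothetical placeholders, and the row whose distribution is missing is left out.

```python
import random

# Uniform ranges listed in the table. The keys are hypothetical placeholders:
# the parameter symbols did not survive in this record.
UNIFORM_RANGES = {
    "param_a": (1.0, 3.0),
    "param_b": (0.0, 0.5),
    "param_c": (0.05, 0.5),
    "param_d": (0.0, 1.0),
}

def draw_parameters(rng):
    """Draw one value per parameter from its uniform range."""
    return {name: rng.uniform(low, high)
            for name, (low, high) in UNIFORM_RANGES.items()}

rng = random.Random(42)  # fixed seed so the draw is reproducible
params = draw_parameters(rng)
```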
Figure 2. Comparison between the original and the updated implementations of the multi-type birth–death model. (A) Speed comparison. Only successful calculations were taken into account, i.e., calculations where the log probability density was different from . (B) Success in calculating probability density values plotted against tree size. The values presented in this panel correspond to the same set of calculations as the one in panel (A).
Figure 3. Comparison of computation results between the original bdmm and improved bdmm versions. (A) Randomly simulated tree with 10 tips and 2 demes, used for comparison. (B) Log-likelihood values obtained with each bdmm version as a function of (birth rate of orange deme).
Figure A3. Comparisons of likelihood computation results between the original and improved bdmm versions for additional trees. (A,B) Randomly simulated ten-tip tree and log-likelihood computation results against (birth rate of red deme). (C,D) Randomly simulated hundred-tip tree and log-likelihood computation results against (birth rate of red deme). (E,F) Randomly simulated ten-tip tree and log-likelihood computation results against (death rate of red deme).
Figure 4. Maximum Clade Credibility (MCC) trees from analyses of (A) 175 samples and (B) 500 samples.
Figure 5. (A) Seasonal effective reproduction numbers for each sample location, for both datasets. (B) Migration rates inferred for each dataset. N, S, and T refer respectively to North, South, and Tropics. For instance, “Mig. rate N-T” represents the migration rate from the Northern location to the Tropical one.
Inferred parameter values for Influenza A virus analysis under the multi-type birth–death model. For each parameter, the lower and upper bounds of the 95% Highest Posterior Density interval (hpd_low, hpd_high) are given along with the median (m). N, S, and T refer respectively to the North, South, and Tropics. For effective reproduction numbers R, the first subscript is the location while the second one refers to the period of the year. s, w, as, om respectively refer to summer, winter, April–September, and October–March. Thus, for instance, refers to the effective reproduction number of samples from the southern hemisphere during the winter season. The tree height t is given in number of , M is given in , and is given in . The remaining parameters are unitless.
| Parameter | m (175 samples) | hpd_low (175 samples) | hpd_high (175 samples) | m (500 samples) | hpd_low (500 samples) | hpd_high (500 samples) |
|---|---|---|---|---|---|---|
| | 3.342 | 3.048 | 3.643 | 6.645 | 6.313 | 7.031 |
| | 90.464 | 75.836 | 105.207 | 101.582 | 87.197 | 116.768 |
| | 0.354 | 0.199 | 0.556 | 0.505 | 0.263 | 0.815 |
| | 0.971 | 0.927 | 1.01 | 1.001 | 0.98 | 1.02 |
| | 1.048 | 1.026 | 1.071 | 1.005 | 0.992 | 1.017 |
| | 0.991 | 0.969 | 1.013 | 1.01 | 0.998 | 1.022 |
| | 0.558 | 0.335 | 0.783 | 0.774 | 0.679 | 0.861 |
| | 1.08 | 1.037 | 1.123 | 1.027 | 1.002 | 1.051 |
| | 0.475 | 0.196 | 0.869 | 0.304 | 0.147 | 0.524 |
| | 0.871 | 0.245 | 1.923 | 1.064 | 0.3 | 2.422 |
| | 0.894 | 0.253 | 1.965 | 0.838 | 0.264 | 1.744 |
| | 2.183 | 0.877 | 4.174 | 1.568 | 0.635 | 2.973 |
| | 0.561 | 0.205 | 1.08 | 0.694 | 0.332 | 1.248 |
| | 1.001 | 0.273 | 2.261 | 0.887 | 0.275 | 1.93 |
| | 1.005 | 0.279 | 2.173 | 1.13 | 0.368 | 2.468 |
| | 0.292 | 0 | 0.777 | 0.296 | 0 | 0.779 |
| | 0.3 | 0 | 0.784 | 0.292 | 0 | 0.777 |
| | 0.283 | 0 | 0.769 | 0.289 | 0 | 0.767 |
| | 0.024 | 0 | 0.535 | 0.023 | 0 | 0.501 |
| | 0.849 | 0.167 | 1 | 0.863 | 0.176 | 1 |
| | 0.041 | 0 | 0.731 | 0.044 | 0 | 0.67 |
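For readers unfamiliar with the summary statistics in this table, the sketch below shows one common sample-based way to obtain a median and a 95% Highest Posterior Density interval from posterior draws: sort the draws and take the narrowest window that covers 95% of them. It is an illustration only and is not necessarily how the reported values were computed; the function name `hpd_interval` and the synthetic draws are assumptions.

```python
import math
import random
import statistics

def hpd_interval(samples, mass=0.95):
    """Return (low, high) of the shortest interval covering `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of consecutive sorted samples the interval must contain.
    k = max(1, math.ceil(mass * n))
    # Slide a window of k order statistics and keep the narrowest one.
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]

# Synthetic draws standing in for an MCMC posterior sample (not real data).
rng = random.Random(0)
draws = [rng.lognormvariate(0.0, 0.5) for _ in range(5000)]
m = statistics.median(draws)
hpd_low, hpd_high = hpd_interval(draws)
```

For skewed posteriors the HPD interval is generally shorter than the equal-tailed interval, which is one reason it is often reported for rate parameters bounded at zero.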
Fixed parameter values for tree simulation and likelihood computations.
| Parameter | Value |
|---|---|
| | 0.4 |
| | 0.3 |
| | 0.27 |
| | 0.17 |
| | 0.03 |
| | 0.03 |
| | 0.03 |
| | 0.03 |
| Initial root state | 1 |
Prior distributions for parameters of the multi-type birth–death model in the seasonal influenza analysis.
| Parameter | Prior Distribution |
|---|---|
| | LogNormal(0, 1.0) |
| | LogNormal(4.5, 0.15) |
| | LogNormal(0, 2.0) |
| | LogNormal(0, 0.5) |
| | Exp(0.001) truncated on |
| | Beta(10.0, 1.5) |
| | LogNormal(2.0, 1.0) |
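To make the prior table more concrete, the sketch below evaluates log-densities for the two fully specified families in it. It assumes that LogNormal(M, S) denotes the mean and standard deviation of the logarithm, and that Beta(a, b) uses the usual shape parameters; the record itself does not state the parameterization. The truncated exponential is left out because its bounds are missing here. The function names are illustrative.

```python
import math

def lognormal_logpdf(x, m, s):
    """Log-density of a log-normal with log-space mean m and log-space sd s (assumed)."""
    if x <= 0:
        return -math.inf
    z = (math.log(x) - m) / s
    return -math.log(x) - math.log(s) - 0.5 * math.log(2 * math.pi) - 0.5 * z * z

def beta_logpdf(x, a, b):
    """Log-density of a Beta(a, b) distribution on the open interval (0, 1)."""
    if not 0 < x < 1:
        return -math.inf
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return (a - 1) * math.log(x) + (b - 1) * math.log(1 - x) - log_norm

# Example evaluations using prior settings that appear in the table.
lp_lognormal = lognormal_logpdf(1.0, 0.0, 1.0)
lp_beta = beta_logpdf(0.9, 10.0, 1.5)
```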