Robust Phylodynamic Analysis of Genetic Sequencing Data from Structured Populations.

Jérémie Scire^1,2, Joëlle Barido-Sottani^1,2,3, Denise Kühnert^4, Timothy G Vaughan^1,2, Tanja Stadler^1,2.

Abstract

The multi-type birth-death model with sampling is a phylodynamic model which enables the quantification of past population dynamics in structured populations based on phylogenetic trees. The BEAST 2 package bdmm implements an algorithm for numerically computing the probability density of a phylogenetic tree given the population dynamic parameters under this model. In the initial release of bdmm, analyses were computationally limited to trees consisting of up to approximately 250 genetic samples. We implemented important algorithmic changes to bdmm which dramatically increased the number of genetic samples that could be analyzed and which improved the numerical robustness and efficiency of the calculations. Including more samples led to improved precision of parameter estimates, particularly for structured models with a high number of inferred parameters. Furthermore, we report on several model extensions to bdmm, inspired by properties common to empirical datasets. We applied this improved algorithm to two partly overlapping datasets of Influenza A virus HA sequences sampled around the world, one with 500 samples and the other with only 175, for comparison. We report and compare the global migration patterns and seasonal dynamics inferred from each dataset. In this way, we show the information that is gained by analyzing the bigger dataset, which became possible with the presented algorithmic changes to bdmm. In summary, the updated bdmm allows for robust, faster, and more general phylodynamic inference on larger datasets.

Keywords: Bayesian inference; phylodynamics; phylogenetics; population structure

Year: 2022 PMID: 36016270 PMCID: PMC9413058 DOI: 10.3390/v14081648

Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.818

1. Introduction

Genetic sequencing data taken from a measurably evolving population contain fingerprints of past population dynamics [1]. In particular, the phylogeny spanning the sampled genetic data contains information about the mixing pattern of different populations and thus contains information beyond what is encoded in classic occurrence data; see, e.g., Hey and Machado [2] and Stadler and Bonhoeffer [3]. Phylodynamic methods [4,5] aim at quantifying past population dynamic parameters, such as migration rates, from genetic sequencing data. Such tools have been widely used to study the spread of infectious diseases in structured populations; see, e.g., Dudas et al. [6] and Faria et al. [7] for analyses of recent epidemic outbreaks. The Bayesian phylodynamic inference framework BEAST 2 [8] is one of the software frameworks within which such analyses can be carried out. With BEAST 2, tree topologies and the parameters of phylodynamic, molecular clock, and substitution models can be jointly inferred via Markov chain Monte Carlo (MCMC) sampling. Both the host population and the pathogen population may be structured: for example, the host population may be geographically structured, and the pathogen population may consist of a drug-sensitive and a drug-resistant subpopulation. Understanding how these subpopulations interact with one another, whether they are separated by geographic distance, lifestyles of the hosts, or other barriers, is key to understanding how an epidemic spreads. In macroevolution, different species may also be structured into different “subpopulations”, e.g., due to their geographic distribution or to trait variations; see, e.g., Hodges [9]. Phylodynamic tools aim at quantifying the rates at which species migrate or the rates at which traits are gained or lost, as well as the rates of speciation and extinction within the “subpopulations”; see, e.g., Goldberg et al. [10], Mayrose et al. [11], and Goldberg et al. [12].
The phylodynamic analysis of structured populations can be performed using two classes of models, namely coalescent-based and birth–death-based approaches. Each has its own advantages and disadvantages [13,14]. Here, we report major improvements on a multi-type birth–death-based approach. A multi-type birth–death model is a linear birth–death model accounting for structured populations. Under this model, the probability density of a phylogenetic tree can be calculated by numerically integrating a system of differential equations. The use of this model within a phylodynamic setting and the associated computational approach were initially proposed for the analysis of species phylogenies [15] and later for the analysis of pathogen phylogenies [3,13]. The BEAST 2 package bdmm generalizes the assumptions of these two initial approaches [16]. It further allows for co-inferring phylogenetic trees together with the model parameters and thus explicitly takes phylogenetic uncertainty into account. Datasets containing more than 250 genetic sequences could not be analyzed using the original bdmm package [16] due to numerical instability. This limitation was a strong impediment to obtaining reliable results, particularly for the analysis of structured populations, as quantifying the parameters that characterize each subpopulation requires a substantial number of samples from each of them. The instability was due to numerical underflow in the probability density calculations, which meant that probability values extremely close to zero could not be accurately calculated and stored. We were able to solve the numerical instability issue of bdmm, thereby lifting the hard limit on the number of samples that could be analyzed. In addition, the practical usefulness of the bdmm package had previously been restricted by the amount of computation time required to carry out the analyses. Here, we report on significant improvements in computational efficiency.
As a result, bdmm can now handle datasets containing several hundred genetic samples. Finally, we made the multi-type birth–death model more general in several ways: homochronous sampling events can now occur at multiple time points (not only the present), individuals are no longer necessarily removed upon sampling, and the migration rate specification has been made more flexible by allowing for piecewise-constant changes through time. Overall, these model generalizations and implementation improvements enable more reliable and ambitious empirical data analyses. Below, we use the new release of bdmm to quantify the Influenza A virus spread around the globe as a sample application and compare the results obtained with those from the reduced dataset analyzed in Kühnert et al. [16].

2. Methods

2.1. Description of the Extended Multi-Type Birth–Death Model

First, we formally define the multi-type birth–death model on d types [16], including the generalizations introduced in this work. The process starts at time 0 with one individual; this is also called the origin of the process and the origin of the resulting tree. This individual is of type $i \in \{1, \ldots, d\}$ with probability $h_i$ (where $\sum_{i=1}^{d} h_i = 1$). The process ends after T time units (in the present). The time interval $(0, T)$ is partitioned into n intervals through $0 < t_1 < \ldots < t_{n-1} < T$, and we define $t_0 := 0$ and $t_n := T$. Each individual at time t, $t_{k-1} \le t < t_k$, $k \in \{1, \ldots, n\}$, of type $i \in \{1, \ldots, d\}$, gives rise to an additional individual of type $j \in \{1, \ldots, d\}$ with birth rate $\lambda_{ij,k}$, migrates to type j with rate $m_{ij,k}$ (with $m_{ii,k} = 0$), dies with rate $\mu_{i,k}$, and is sampled with rate $\psi_{i,k}$. At time $t_k$, each individual of type i is sampled with probability $\rho_{i,k}$. Upon sampling (either with rate $\psi_{i,k}$ or with probability $\rho_{i,k}$), the individual is removed from the infectious pool with probability $r_{i,k}$. We summarize all birth rates $\lambda_{ij,k}$ in $\lambda$, migration rates $m_{ij,k}$ in $m$, death rates $\mu_{i,k}$ in $\mu$, sampling rates $\psi_{i,k}$ in $\psi$, sampling probabilities $\rho_{i,k}$ in $\rho$, and removal probabilities $r_{i,k}$ in $r$, with $i, j \in \{1, \ldots, d\}$ and $k \in \{1, \ldots, n\}$. The model described in Kühnert et al. [16] is a special case of the above and assumes that migration rates are constant through time (i.e., do not depend on k), removal probabilities are constant through time and across types (i.e., do not depend on k and i), and that $\rho_{i,k} = 0$ for $k < n$ and $i \in \{1, \ldots, d\}$.
This process gives rise to complete trees on sampled and non-sampled individuals, with types being assigned to all branches at all times (Figure 1, left). Following each branching event, one offspring is assigned to be the “left” offspring and the other to be the “right” offspring, with each assignment having a probability of $1/2$. In the figure, we draw the branch with the assignment “left” on the left and the branch with the assignment “right” on the right. Such trees are called oriented trees, and considering oriented trees facilitates the calculation of the probability densities of trees.
Pruning all lineages without sampled descendants leads to the sampled phylogeny (Figure 1, middle and right). The orientation of a branch in the sampled phylogeny is the orientation of the corresponding branch descending from the common branching event in the complete tree. When the sampled phylogeny is annotated with the types along each branch, we refer to it as a branch-typed tree (Figure 1, middle). On the other hand, if we discard these annotations but keep the types of the sampled individuals, we call the resulting object a sample-typed (or tip-typed) tree (Figure 1, right).
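The forward process described above can be illustrated with a small stochastic simulation. The sketch below is a deliberately simplified illustration and not the simulator used in this work: it assumes constant rates (a single time interval), births only within the parent's type, and no scheduled sampling events at fixed times, and it tracks only the number of individuals of each type and the sampling events rather than the full tree. The function name and its arguments are hypothetical.

```python
import random


def simulate_type_counts(birth, death, sampling, migration, removal,
                         start_type, t_end, rng, max_events=100_000):
    """Simulate type counts forward in time; return (counts, samples).

    birth[i], death[i], sampling[i]: per-individual rates for type i.
    migration[i][j]: rate of type change i -> j (diagonal entries ignored).
    removal[i]: probability that a sampled individual of type i is removed.
    Simplifying assumptions: constant rates, within-type births only.
    """
    d = len(birth)
    counts = [0] * d
    counts[start_type] = 1            # the process starts with one individual
    samples = []                      # (time, type) of every sampling event
    t = 0.0
    for _ in range(max_events):
        # Total event rate contributed by each type.
        type_rates = []
        for i in range(d):
            mig_out = sum(migration[i][j] for j in range(d) if j != i)
            per_individual = birth[i] + death[i] + sampling[i] + mig_out
            type_rates.append(per_individual * counts[i])
        total = sum(type_rates)
        if total == 0.0:              # extinct, or no event can occur
            break
        t += rng.expovariate(total)   # waiting time to the next event
        if t >= t_end:                # reached the present
            break
        # Pick the type of the affected individual, then the kind of event.
        i = rng.choices(range(d), weights=type_rates)[0]
        events = ["birth", "death", "sample"]
        weights = [birth[i], death[i], sampling[i]]
        for j in range(d):
            if j != i:
                events.append(j)      # an integer event means "migrate to j"
                weights.append(migration[i][j])
        event = rng.choices(events, weights=weights)[0]
        if event == "birth":
            counts[i] += 1
        elif event == "death":
            counts[i] -= 1
        elif event == "sample":
            samples.append((t, i))
            if rng.random() < removal[i]:
                counts[i] -= 1        # removed from the pool upon sampling
        else:
            counts[i] -= 1
            counts[event] += 1
    return counts, samples
```

With a removal probability of 0 a sampled individual stays in the population and can be sampled again, which corresponds to the "sampled ancestor" nodes of the sampled tree.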

Figure 1

Complete tree (left) and sampled trees (middle and right) obtained from a multi-type birth–death process with two types. The orange and blue dots on the trees represent sampled individuals and are colored according to the type these individuals belong to. A $\rho$-sampling event happens at time $t_1$. The grey squares represent degree-2 nodes added to branches crossing this event. $\rho$-sampling also happens in the present (time T). As seen in the complete tree, the three individuals who were first sampled were not removed from the population upon sampling, whereas the three individuals sampled at time $t_1$ were.

Here, we give an overview of the computation of the probability density of the sampled tree (i.e., the sample-typed or branch-typed tree) given the multi-type birth–death parameters $\lambda$, $\mu$, $m$, $\psi$, $\rho$, $r$, $h$, T. This probability density is obtained by integrating probability densities g from the leaf nodes (or “tips”) backwards along all the edges (or “branches”) to the origin of the tree. Our notation here is based on previous work [16,17], and the probabilities $p$ and $g$ relate to E and D, respectively, in Stadler and Bonhoeffer [3] and Maddison et al. [15]. Every branching event in the sampled tree gives rise to a node of degree 3 (i.e., 3 branches are attached). Every sampling event gives rise to a node of degree 2 (called “sampled ancestor”) or 1 (called “tip”, as defined above). A sampling event at time $t_k$, $k \in \{1, \ldots, n\}$, is referred to as a $\rho$-sampled node. All other nodes corresponding to samples are referred to as $\psi$-sampled nodes. Furthermore, degree-2 nodes are placed at time $t_k$ on all lineages crossing time $t_k$, $k \in \{1, \ldots, n-1\}$, as shown at time $t_1$ in Figure 1. In a branch-typed tree, a node of degree 2 also occurs on a lineage at a time point when a type change occurs. Such type changes may be the result of either migrations or birth events in which one of the descendant subtrees is unsampled (Figure 1, middle). We highlight that in bdmm, we assume that the most recent sampling event happens at time T. This is equivalent to assuming that the sampling effort was terminated directly after the last sample was collected, and it removes the need for users to specify the time between the last sample and the termination of the sampling effort at time T. The derivation of the probability density of a sampled tree under the extended multi-type birth–death model is developed in Appendix A. This probability density, also called a “phylodynamic likelihood”, can be used to estimate the multi-type birth–death parameters defined above when used in a Bayesian phylodynamic framework such as BEAST 2 [8].
Note that unlike other parameters of this model, $h$ is typically not estimated via MCMC sampling. The $h_i$ values can be set according to different rationales: the root type can be fixed to a particular type k ($h_k = 1$ and $h_i = 0$ for $i \neq k$), all types can be equally likely ($h_i = 1/d$), or the values can be set to the equilibrium distribution (derived by Stadler and Bonhoeffer [3]), assuming that the process was already in equilibrium at the time of origin.
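The backward integration outlined above can be illustrated in the simplest possible setting. The sketch below is not the multi-type system of Appendix A. It assumes a single type with constant rates, for which the probability p(τ) that an individual alive τ time units before the present leaves no sampled descendants satisfies the standard equation dp/dτ = μ − (λ + μ + ψ)p + λp², with p(0) = 1 − ρ. For brevity it uses a fixed-step fourth-order Runge–Kutta scheme rather than an adaptive step-size integrator. All function and argument names are illustrative.

```python
def extinction_rhs(p, lam, mu, psi):
    """Right-hand side dp/dtau for the single-type, constant-rate case."""
    return mu - (lam + mu + psi) * p + lam * p * p


def integrate_p(lam, mu, psi, rho, duration, steps=10_000):
    """Integrate p from the present (p = 1 - rho) back over `duration`.

    Uses classical fixed-step RK4; `steps` controls the step size.
    """
    p = 1.0 - rho
    h = duration / steps
    for _ in range(steps):
        k1 = extinction_rhs(p, lam, mu, psi)
        k2 = extinction_rhs(p + 0.5 * h * k1, lam, mu, psi)
        k3 = extinction_rhs(p + 0.5 * h * k2, lam, mu, psi)
        k4 = extinction_rhs(p + h * k3, lam, mu, psi)
        p += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return p


if __name__ == "__main__":
    # Far in the past, p approaches the smaller root of
    # lam * p**2 - (lam + mu + psi) * p + mu = 0.
    print(integrate_p(lam=2.0, mu=1.0, psi=0.5, rho=0.0, duration=20.0))
```

In the multi-type model, one such equation per type is coupled through the migration and cross-type birth terms, and the densities g along the branches are integrated alongside it.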

2.2. Implementation Improvements

The computation of the probability densities of sampled trees under the multi-type birth–death model requires the numerical solving of Ordinary Differential Equations (ODEs) along each tree branch. We were able to significantly improve the robustness of the original bdmm implementation, which suffered from instabilities caused by the underflow of these numerical calculations. Compared to the original implementation, we prevented the underflow by implementing an extended precision floating point representation (EPFP) for storing intermediary calculation results. In addition to this improvement in stability, we improved the efficiency of the probability density calculations by (1) using an adaptive step-size integrator for numerical integration, (2) performing preliminary calculations and storing the results for use during the main calculation step, and (3) distributing calculations among threads running in parallel. Details can be found in Appendix B. The latest release with our updates, bdmm 1.0, is freely available as an open access package of BEAST 2. The source code can be accessed at https://github.com/denisekuehnert/bdmm (accessed on 26 July 2022).
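To illustrate why an extended-range representation prevents underflow, the sketch below stores a non-negative number as a double-precision mantissa together with a separate integer exponent. This is one common way to extend the range of floating point values and is shown under that assumption only; the details of the representation used in bdmm are given in Appendix B and may differ. The class name `ScaledNumber` is hypothetical.

```python
import math


class ScaledNumber:
    """Non-negative value stored as mantissa * 2**exponent.

    The mantissa is renormalised into [0.5, 1) after every operation, and
    the exponent is an unbounded Python integer, so products of many small
    probabilities do not underflow to zero.
    """

    def __init__(self, mantissa, exponent=0):
        m, e = math.frexp(mantissa)   # mantissa == m * 2**e
        self.mantissa = m
        self.exponent = exponent + e if m != 0.0 else 0

    def __mul__(self, other):
        return ScaledNumber(self.mantissa * other.mantissa,
                            self.exponent + other.exponent)

    def log(self):
        """Natural logarithm of the stored value (-inf for zero)."""
        if self.mantissa < 0.0:
            raise ValueError("logarithm of a negative value")
        if self.mantissa == 0.0:
            return -math.inf
        return math.log(self.mantissa) + self.exponent * math.log(2.0)


if __name__ == "__main__":
    plain = 1.0
    scaled = ScaledNumber(1.0)
    for _ in range(100):              # product of 100 factors of 1e-5
        plain *= 1e-5
        scaled = scaled * ScaledNumber(1e-5)
    print(plain)                      # the plain double has underflowed
    print(scaled.log())               # the scaled value keeps a finite log
```

The plain product falls below the smallest representable double and becomes zero, so its logarithm would be reported as negative infinity, whereas the scaled product retains a finite log value.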

3. Results

3.1. Evaluation of Numerical Improvements

We compared the robustness and efficiency of the improved bdmm package against its original version. We considered tree sizes varying between 10 and 1000 samples. For each tree size, we simulated 50 branch-typed and 50 sample-typed trees under the multi-type birth–death model using parameter values drawn at random from the distributions shown in Table A1. These distributions were selected to reflect a wide range of scenarios. For each simulated tree, we measured the time taken to calculate the probability density, given the parameter values under which the tree was simulated, using the updated and the original bdmm implementation. We report here the wall-clock time taken to perform this calculation 5000 times (Figure 2). All computations were performed on a MacBook Pro with a dual-core 2.3 GHz Intel Core i5 processor. The new implementation of bdmm is on average nine times faster than the original (Figure 2A). We assessed robustness by recording how often each implementation returned $-\infty$ for the probability density in log space. We call these calculations “failures”, the most likely cause being underflow. Our new implementation showed no calculation failure for trees containing a thousand samples, while calculations with the original implementation often failed for trees with more than 250 samples (Figure 2B).

Table A1

Distributions from which parameters were sampled for the simulation of trees. All parameters were constant through time.

Parameter	Distribution
λi	Unif(1, 3)
μi	λi× Unif(0, 1)
mi,j	Unif(0, 0.5)
ψi	Unif(0.05, 0.5)
ri	Unif(0, 1)
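As an illustration of how one parameter set could be drawn from the distributions in Table A1, the sketch below samples constant-through-time parameters for a given number of types. The number of types is left as an argument because it is not stated in the table, and the function name and dictionary keys are hypothetical.

```python
import random


def draw_parameters(n_types, rng):
    """Draw one parameter set following the distributions of Table A1."""
    birth = [rng.uniform(1.0, 3.0) for _ in range(n_types)]
    # Death rate is the birth rate times Unif(0, 1), so it never exceeds it.
    death = [b * rng.uniform(0.0, 1.0) for b in birth]
    migration = [
        [0.0 if i == j else rng.uniform(0.0, 0.5) for j in range(n_types)]
        for i in range(n_types)
    ]
    sampling = [rng.uniform(0.05, 0.5) for _ in range(n_types)]
    removal = [rng.uniform(0.0, 1.0) for _ in range(n_types)]
    return {
        "birth": birth,
        "death": death,
        "migration": migration,
        "sampling": sampling,
        "removal": removal,
    }


if __name__ == "__main__":
    params = draw_parameters(n_types=2, rng=random.Random(42))
    for name, value in params.items():
        print(name, value)
```

Passing an explicitly seeded `random.Random` instance keeps each draw reproducible.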

Figure 2

Comparison between the original and the updated implementations of the multi-type birth–death model. (A) Speed comparison. Only successful calculations were taken into account, i.e., calculations where the log probability density was different from $-\infty$. (B) Success in calculating probability density values plotted against tree size. The values presented in this panel correspond to the same set of calculations as the one in panel (A).

3.2. Validation against Original Implementation

To ensure that no errors were introduced into the updated bdmm, we validated the improved implementation against the original bdmm version by comparing the results of likelihood computation on a handful of randomly simulated trees. We used simulated trees with 10 or 100 tips, well below the limit of reliability of the original bdmm version (approximately 250 tips). Details of this procedure can be found in Appendix C. Figure 3 shows one such simulated tree along with the tree likelihood values (i.e., the probability density of the sampled tree given the multi-type birth–death parameters) computed with each bdmm version. The likelihood values agree between the two implementations for all trees and parameters tested (difference in log-likelihoods < $1 \times 10^{-6}$). Figure A3 shows that the same results were obtained with other trees or when varying other parameters. Therefore, we conclude that the results of the full validation, along with the error and bias assessment performed by Kühnert et al. [16] on the original bdmm version, hold true for the improved bdmm implementation we present in this article.

Figure 3

Comparison of computation results between the original bdmm and improved bdmm versions. (A) Randomly simulated tree with 10 tips and 2 demes, used for comparison. (B) Log-likelihood values obtained with each bdmm version as a function of the birth rate $\lambda$ of the orange deme.

Figure A3

Comparisons of likelihood computation results between the original and improved bdmm versions for additional trees. (A,B) Randomly simulated ten-tip tree and log-likelihood computation results against the birth rate $\lambda$ of the red deme. (C,D) Randomly simulated hundred-tip tree and log-likelihood computation results against the birth rate $\lambda$ of the red deme. (E,F) Randomly simulated ten-tip tree and log-likelihood computation results against the death rate $\mu$ of the red deme.

3.3. Influenza A Virus (H3N2) Analysis

As an example of a biological question that can be investigated with the use of bdmm, we analyzed 500 H3N2 influenza virus HA sequences sampled around the globe from 2000 to 2006; we aimed to recover the seasonal dynamics of the global epidemics. The dataset is a random subset of the data analyzed by Vaughan et al. [18], taken from three different regions around the globe: New York (North), New Zealand (South), and Hong Kong (Tropics). The dataset of 980 samples assembled by Vaughan et al. [18] was built with the aim of gathering samples from three locations with relatively similar population sizes, each representative of the northern, southern, or equatorial regions. As a comparison, we performed an identical analysis of the H3N2 influenza dataset with 175 sequences sampled between 2003 and 2006 that was used in Kühnert et al. [16]. This dataset of 175 sequences was also a subset of the data by Vaughan et al. [18], and it also grouped samples from three locations denoted as North (for the northern hemisphere), South (for the southern hemisphere), and Tropics (for tropical regions). Note that the latter dataset came from more geographically spread samples, and thus we did not expect results from both analyses to be perfectly comparable. As we were dealing with pathogen sequence data, we adopted the epidemiological parametrization of the multi-type birth–death model, as detailed in Kühnert et al. [16]. The epidemiological parametrization substitutes birth, death, and sampling rates with effective reproduction numbers within types, the rate at which hosts become noninfectious, and sampling proportions. To study the seasonal dynamics of the global epidemic, we allowed the effective reproduction number to vary through time.
To do so, we subdivided time into six-month intervals (starting April 1st and October 1st), and for each location we constrained the effective reproduction number to take the same value in all intervals corresponding to the same season, i.e., one value shared by all summer seasons and one shared by all winter seasons. We did not test this assumption by estimating a separate value for each six-month period, as we expected the data to carry little information about the additional parameters. For the same reason, the migration rate was not varied through time. Further details on the data analysis configuration can be found in Appendix D. The analysis of the larger dataset (500 samples) allowed for the reconstruction of a phylogenetic tree encompassing a longer time period and therefore gave a more long-term and detailed view of the evolution of the global epidemic (see Figure 4 for the Maximum Clade Credibility trees).
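The seasonal constraint and the epidemiological parametrization described above can be sketched as follows. This is an illustration under stated assumptions, not the analysis configuration of Appendix D: the helper names are hypothetical, April to September is taken to be summer in the North and winter in the South, the example values are the posterior medians of the 500-sample analysis reported in Table A4, and the conversion back to rates uses the standard birth–death skyline convention with removal upon sampling (r = 1), i.e., λ = R·δ, ψ = s·δ, and μ = δ − ψ.

```python
from datetime import date

# Posterior medians from the 500-sample column of Table A4, used only as
# illustrative inputs. Keys are (location, period label).
SEASONAL_R = {
    ("N", "s"): 0.505, ("N", "w"): 1.001,
    ("T", "as"): 1.005, ("T", "om"): 1.010,
    ("S", "s"): 0.774, ("S", "w"): 1.027,
}


def period_label(location, day):
    """Label of the six-month period that `day` falls in, for one location."""
    april_to_september = 4 <= day.month <= 9
    if location == "T":
        return "as" if april_to_september else "om"
    if location == "N":               # assumed: northern summer is Apr-Sep
        return "s" if april_to_september else "w"
    if location == "S":               # assumed: seasons reversed in the South
        return "w" if april_to_september else "s"
    raise ValueError(f"unknown location: {location}")


def reproduction_number(location, day, seasonal_r=SEASONAL_R):
    """Effective reproduction number shared by all periods of one season."""
    return seasonal_r[(location, period_label(location, day))]


def change_points(first, last):
    """All 1 April and 1 October dates strictly between `first` and `last`."""
    points = []
    for year in range(first.year, last.year + 1):
        for month in (4, 10):
            boundary = date(year, month, 1)
            if first < boundary < last:
                points.append(boundary)
    return points


def rates_from_epi(r_e, delta, s):
    """Convert (R_e, becoming-noninfectious rate, sampling proportion) to
    (birth, death, sampling) rates, assuming removal upon sampling (r = 1)."""
    birth = r_e * delta
    sampling = s * delta
    death = delta - sampling
    return birth, death, sampling


if __name__ == "__main__":
    print(change_points(date(2000, 1, 1), date(2002, 1, 1)))
    print(reproduction_number("N", date(2003, 1, 15)))
```

Because every interval of a given season maps to the same dictionary entry, the number of reproduction-number parameters per location stays at two regardless of how many years the tree spans.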

Figure 4

Maximum Clade Credibility (MCC) trees from analyses of (A) 175 samples and (B) 500 samples.

As expected for the tropical location, the effective reproduction numbers for H3N2 influenza A were inferred to be close to one throughout the year in both analyses (Figure 5A). Conversely, strong seasonal variations can be observed in the Northern and Southern hemisphere locations, where the effective reproduction number was close to one in winter and much lower in summer. Inferences from the small and large datasets are mostly in agreement. One subtle difference appears: in the larger dataset, the effective reproduction numbers in the winter seasons and in the tropical location are closer to one, with less variation across estimates. This seems to indicate that the variations between estimates observed with the smaller dataset, which includes samples from 2003 to 2006 (for instance, a median of 0.97 in winter in the North compared to 1.08 in winter in the South; Table A4), are due to stochastic fluctuations, which are averaged out when considering a longer period of transmission dynamics in the larger dataset covering the years 2000–2006.

Figure 5

(A) Seasonal effective reproduction numbers for each sample location, for both datasets. (B) Migration rates inferred for each dataset. N, S, and T refer respectively to North, South, and Tropics. For instance, “Mig. rate N-T” represents the migration rate from the Northern location to the Tropical one.

The precise inference of migration rates is more difficult, as reflected by the significant uncertainty we obtained on the estimates (Figure 5B). However, we still observed that the uncertainty was generally reduced for the inference performed with the larger dataset, as expected. A significant difference between the migration rates inferred from the Southern to Tropical locations arose between the two analyses. With the larger dataset, the estimated rate was much lower than that with the smaller one, and it was more in range with the other migration rate estimates. Detailed results of all the parameter estimates for both analyses are available in Table A4. Most notably, the estimates of the root locations for both datasets are very similar. In both cases, the tropical location is most likely to be the location of the root; however, neither of the two other locations can be entirely excluded.

Table A4

Inferred parameter values for the Influenza A virus analysis under the multi-type birth–death model. For each parameter, the lower and upper bounds of the 95% Highest Posterior Density interval (hpd_low, hpd_high) are given along with the median (m). N, S, and T refer respectively to the North, South, and Tropics. For effective reproduction numbers R, the first subscript is the location while the second one refers to the period of the year. s, w, as, om respectively refer to summer, winter, April–September, and October–March. Thus, for instance, $R_{Sw}$ refers to the effective reproduction number of samples from the southern hemisphere during the winter season. The tree height t is given in years, M is given in year$^{-1}$, and $\delta$ is given in year$^{-1}$. The remaining parameters are unitless.

	175 Samples			500 Samples
	m	hpd_low	hpd_high	m	hpd_low	hpd_high
t	3.342	3.048	3.643	6.645	6.313	7.031
δ	90.464	75.836	105.207	101.582	87.197	116.768
RNs	0.354	0.199	0.556	0.505	0.263	0.815
RNw	0.971	0.927	1.01	1.001	0.98	1.02
RTas	1.048	1.026	1.071	1.005	0.992	1.017
RTom	0.991	0.969	1.013	1.01	0.998	1.022
RSs	0.558	0.335	0.783	0.774	0.679	0.861
RSw	1.08	1.037	1.123	1.027	1.002	1.051
σm	0.475	0.196	0.869	0.304	0.147	0.524
MN,T	0.871	0.245	1.923	1.064	0.3	2.422
MN,S	0.894	0.253	1.965	0.838	0.264	1.744
MT,N	2.183	0.877	4.174	1.568	0.635	2.973
MT,S	0.561	0.205	1.08	0.694	0.332	1.248
MS,N	1.001	0.273	2.261	0.887	0.275	1.93
MS,T	1.005	0.279	2.173	1.13	0.368	2.468
hN	0.292	0	0.777	0.296	0	0.779
hT	0.3	0	0.784	0.292	0	0.777
hS	0.283	0	0.769	0.289	0	0.767
P(root in N)	0.024	0	0.535	0.023	0	0.501
P(root in T)	0.849	0.167	1	0.863	0.176	1
P(root in S)	0.041	0	0.731	0.044	0	0.67

3.4. Properties of bdmm

3.4.1. Identifiability of Parameters

In birth–death models with sampling through time or in the present, only two of the three parameters (birth rate, death rate, and sampling rate or sampling probability) can be jointly estimated [19,20]. Thus, independent prior information needs to be employed to quantify all three parameters. In recent work, the question of identifiability in time-dependent birth–death sampling models has been thoroughly investigated [20,21]. The issue of identifiability in state-dependent birth–death sampling models remains, to our knowledge, largely unanswered. The interactions between migration rates, rates of birth-among-demes, and other multi-type birth–death parameters are not well understood. It is likely that different parameter combinations of the multi-type birth–death model can yield the same likelihood value. Informative priors on some of the birth–death parameters mitigate parameter non-identifiability issues.

3.4.2. Computational Costs

Despite the implementation improvements presented in this manuscript, phylodynamic analyses performed in bdmm are still limited in practice by the number of genetic sequences they can handle. This limitation, unlike the previously existing limitation caused by underflow, is not a hard boundary but rather a soft boundary imposed by the practical constraints of computational analyses. Limits on the complexity of the analyses that can be carried out with the improved version of bdmm derive from the time required to carry out computations and from the complexity of the probability space that must be explored. For instance, each MCMC chain for the 500-sample Influenza A analysis required about 15 days to compute. Keeping the same analysis setup and increasing the number of genetic samples will have a linear effect on the time required to compute the phylodynamic likelihood with bdmm. With our updated bdmm implementation, the core bottleneck is the complexity of exploring tree space, which increases exponentially with additional samples. Due to this complexity, only trees with up to around 1000 samples can be successfully estimated with BEAST 2.

4. Discussion

The multi-type birth–death model with its updated implementation in the bdmm package for BEAST 2 provides a flexible method for taking into account the effect of population structure when performing a phylodynamic genetic sequence analysis. Compared to the original implementation, we now prevent the underflow of numerical calculations and speed up calculations by almost an order of magnitude. The size limit of around 250 samples for datasets that could be handled by bdmm is thus lifted, and significantly larger datasets can now be analyzed. The bottleneck now lies in the exploration of tree space with MCMC rather than in bdmm. We demonstrate this improvement by analyzing two datasets of Influenza A virus H3N2 genetic data from around the globe. One dataset has 500 samples and could not have been analyzed with the original version of bdmm, while the other contains 175 samples and is the dataset originally analyzed in [16]. Overall, we observed that analyzing a dataset with more samples, as expected, gives a more long-term picture of the global transmission patterns and reduces the general uncertainty concerning parameter estimates. With the addition of $\rho$-sampling events in the past, intense sampling efforts limited to short periods of time (leading to many samples being taken nearly simultaneously) can be easily modeled as instantaneous sampling events across the entire population (or subpopulation) rather than as non-instantaneous sampling over small sampling intervals. This simplifies the modeling of intense pathogen sequencing efforts in very short time windows. By allowing the removal probability r (the probability for an individual to be removed from the infectious population upon sampling) to be type-dependent and to vary across time intervals, as well as allowing migration rates between types to vary across time intervals, we further increase the generality and flexibility of the multi-type birth–death model.
A sample bdmm analysis with a $\rho$-sampling event in the past was added to the software package to guide users who may want to set up such an analysis with their own data. We focused on an epidemiological application of bdmm, where we co-infer the phylogenetic trees to take into account the phylogenetic uncertainty. However, the bdmm modeling assumptions are equally applicable to the analysis of macroevolutionary data, in which context bdmm allows for the joint inference of trees with fossil samples under structured models [22]. When using a multi-type birth–death model in the macroevolutionary framework, $\rho$-sampling can be used to model fossil samples originating from the same rock layer. In the context of the exploration of trait-dependent speciation, structured birth–death models such as the binary-state speciation and extinction model (BiSSE) [23,24] have been shown to possibly produce spurious associations between character state and speciation rate when applied to empirical phylogenies [25]. When used in this fashion, users of bdmm should assess the propensity of their dataset analysis for Type I errors through neutral trait simulations, as suggested by Rabosky and Goldberg [25]. In summary, the new release of bdmm overcomes several constraints when analyzing sequencing data in BEAST 2. As it stands, the main constraint is now the efficiency of the BEAST 2 MCMC tree space sampler rather than bdmm itself. We expect the new release of bdmm to become a standard tool for the phylodynamic analysis of sequencing data and phylogenetic trees from structured populations.

Table A2

Fixed parameter values for tree simulation and likelihood computations.

Parameter	Value
λ1	0.4
λ2	0.3
μ1	0.27
μ2	0.17
ψ1	0.03
ψ2	0.03
m1,2	0.03
m2,1	0.03
Initial root state	1

Table A3

Prior distributions for parameters of the multi-type birth–death model in the seasonal influenza analysis.

Parameter	Prior Distribution
R	LogNormal(0, 1.0)
δ	LogNormal(4.5, 0.15)
σm	LogNormal(0, 2.0)
Mi,j	LogNormal(0, 0.5)
s	Exp(0.001) truncated on [0,1]
r	Beta(10.0, 1.5)
T	LogNormal(2.0, 1.0)

1. Species selection maintains self-incompatibility.

2. Confounding asymmetries in evolutionary diversification and character change.

3. A new method for calculating evolutionary substitution rates.

4. Birth-death skyline plot reveals temporal changes of epidemic spread in HIV and hepatitis C virus (HCV).

5. Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7.

6. Sampling through time and phylodynamic inference with coalescent and birth-death models.

Review 7. Phylogenetic and epidemic modeling of rapidly evolving infectious diseases.

8. Unifying Phylogenetic Birth-Death Models in Epidemiology and Macroevolution.

9. Phylodynamics with Migration: A Computational Framework to Quantify Population Structure from Genomic Data.

10. Fundamental Identifiability Limits in Molecular Epidemiology.