Alexander A Fisher, Gabriel W Hassler, Xiang Ji, Guy Baele, Marc A Suchard, Philippe Lemey.
Abstract
Recent advances in Bayesian phylogenetics offer substantial computational savings to accommodate increased genomic sampling that challenges traditional inference methods. In this review, we begin with a brief summary of the Bayesian phylogenetic framework, and then conceptualize a variety of methods to improve posterior approximations via Markov chain Monte Carlo (MCMC) sampling. Specifically, we discuss methods to improve the speed of likelihood calculations, reduce MCMC burn-in, and generate better MCMC proposals. We apply several of these techniques to study the evolution of HIV virulence along a 1536-tip phylogeny and estimate the internal node heights of a 1000-tip SARS-CoV-2 phylogenetic tree in order to illustrate the speed-up of such analyses using current state-of-the-art approaches. We conclude our review with a discussion of promising alternatives to MCMC that approximate the phylogenetic posterior. This article is part of a discussion meeting issue 'Genomic population structures of microbial pathogens'.
Keywords: BEAST; Bayesian phylogenetics; Hamiltonian Monte Carlo; adaptive MCMC; online inference; scalable inference
Year: 2022 PMID: 35989603 PMCID: PMC9393558 DOI: 10.1098/rstb.2021.0242
Source DB: PubMed Journal: Philos Trans R Soc Lond B Biol Sci ISSN: 0962-8436 Impact factor: 6.671
Figure 1. The online addition of 132 SARS-CoV-2 sequences to a 588-tip time-measured tree drawn from the posterior. Appended branches are blue while original branches are black. We omit the timescale since this augmented tree is not sampled from the posterior.
Figure 2. (a) Joint trajectory of two branch-specific clock rates γ1 and γ2 over their joint density in a three-taxon tree with simulated sequence data. Trajectories display 600 posterior samples from the uMH chain and 100 posterior samples from the HMC chain, since uMH takes six times as many steps when controlling for runtime. The strong posterior correlation between γ1 and γ2 results in very poor mixing with uMH, while HMC easily accommodates it. (b) Effective sample size (ESS) per second of BEAST runtime under both HMC and uMH MCMC samplers of branch-specific rates of phenotypic evolution over a 1536-tip HIV-1 tree. HMC results in a median speed-up of ×1000. (c) Trace plot of B.1.177 clade age with node heights sampled under both uMH MCMC and HMC.
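The mixing contrast described for Figure 2(a) can be illustrated with a toy example. The following is a minimal sketch, not the BEAST implementation and not the phylogenetic model: it assumes a bivariate Gaussian with correlation 0.95 as a stand-in for the joint density of γ1 and γ2, reads "uMH" as a one-coordinate-at-a-time random-walk Metropolis-Hastings sampler, and uses a basic leapfrog HMC with an identity mass matrix. All names, step sizes, and chain lengths are illustrative assumptions; only the six-to-one ratio of uMH to HMC steps is borrowed from the caption. The ESS estimate is a simple truncated autocorrelation sum, again chosen for brevity.

```python
import numpy as np

RHO = 0.95  # assumed correlation of the toy target (not taken from the paper)
COV = np.array([[1.0, RHO], [RHO, 1.0]])
PRECISION = np.linalg.inv(COV)


def potential(x):
    """Negative log density of the Gaussian target, up to a constant."""
    return 0.5 * x @ PRECISION @ x


def grad_potential(x):
    """Gradient of the potential with respect to x."""
    return PRECISION @ x


def run_umh(n_samples, step, rng):
    """Random-walk Metropolis-Hastings that updates one coordinate at a time."""
    x = np.zeros(2)
    u = potential(x)
    out = np.empty((n_samples, 2))
    for i in range(n_samples):
        for j in range(2):
            proposal = x.copy()
            proposal[j] += step * rng.standard_normal()
            u_prop = potential(proposal)
            # Accept with probability min(1, exp(u - u_prop)).
            if np.log(rng.uniform()) < u - u_prop:
                x, u = proposal, u_prop
        out[i] = x
    return out


def run_hmc(n_samples, step, n_leapfrog, rng):
    """Basic HMC with leapfrog integration and an identity mass matrix."""
    x = np.zeros(2)
    out = np.empty((n_samples, 2))
    for i in range(n_samples):
        p = rng.standard_normal(2)
        h_old = potential(x) + 0.5 * p @ p
        x_new = x.copy()
        p_new = p - 0.5 * step * grad_potential(x_new)  # initial half step
        for l in range(n_leapfrog):
            x_new = x_new + step * p_new
            if l < n_leapfrog - 1:
                p_new = p_new - step * grad_potential(x_new)
        p_new = p_new - 0.5 * step * grad_potential(x_new)  # final half step
        h_new = potential(x_new) + 0.5 * p_new @ p_new
        # Metropolis correction for the integration error.
        if np.log(rng.uniform()) < h_old - h_new:
            x = x_new
        out[i] = x
    return out


def ess(chain):
    """Crude ESS: n / (1 + 2 * sum of autocorrelations before the first non-positive lag)."""
    x = chain - chain.mean()
    n = len(x)
    var = x @ x / n
    total = 0.0
    for k in range(1, n):
        rho_k = (x[:-k] @ x[k:]) / (n * var)
        if rho_k <= 0:
            break
        total += rho_k
    return n / (1.0 + 2.0 * total)


rng = np.random.default_rng(1)
umh = run_umh(12000, 0.5, rng)     # six times as many steps as HMC
hmc = run_hmc(2000, 0.15, 15, rng)
ess_umh = ess(umh[:, 0])
ess_hmc = ess(hmc[:, 0])
print(f"uMH ESS: {ess_umh:.0f} from {len(umh)} samples")
print(f"HMC ESS: {ess_hmc:.0f} from {len(hmc)} samples")
```

In this toy setting the coordinate-wise sampler can only take small steps relative to the long axis of the correlated density, whereas the gradient-guided HMC trajectory moves along it. The sketch gives uMH six times as many steps and makes no attempt to equalise wall-clock time, so it shows the qualitative effect only and says nothing about the magnitudes reported in the figure.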