Guy Baele (1), Philippe Lemey (1), Andrew Rambaut (2,3), Marc A Suchard (4,5,6)
1. Department of Microbiology and Immunology, Rega Institute, KU Leuven, Leuven, Belgium.
2. Institute of Evolutionary Biology, University of Edinburgh, Edinburgh, UK.
3. Centre for Immunology, Infection and Evolution, University of Edinburgh, Ashworth Laboratories, King's Buildings, Edinburgh, UK.
4. Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA.
5. Department of Biostatistics, School of Public Health, University of California, Los Angeles, CA, USA.
6. Department of Biomathematics, David Geffen School of Medicine, University of California, Los Angeles, CA, USA.
Abstract
MOTIVATION: Advances in sequencing technology continue to deliver increasingly large molecular sequence datasets that are often heavily partitioned in order to accurately model the underlying evolutionary processes. In phylogenetic analyses, partitioning strategies involve estimating conditionally independent models of molecular evolution for different genes and for different positions within those genes. This requires estimating a large number of evolutionary parameters and increases the computational burden of such analyses. The past two decades have also seen the rise of multi-core processors in both the central processing unit (CPU) and graphics processing unit (GPU) markets, enabling massively parallel computations that many software packages do not yet fully exploit for multipartite analyses.
RESULTS: We propose a Markov chain Monte Carlo (MCMC) approach that uses an adaptive multivariate transition kernel to estimate a large number of parameters in parallel, split across partitioned data, by exploiting multi-core processing. Across several real-world examples, we demonstrate that our approach estimates these multipartite parameters more efficiently than standard approaches, which typically use a mixture of univariate transition kernels. In one case, when estimating the relative rate parameter of the non-coding partition in a heterochronous dataset, MCMC integration efficiency improves more than 14-fold.
AVAILABILITY AND IMPLEMENTATION: Our implementation is part of the BEAST code base, a widely used open-source software package for Bayesian phylogenetic inference.
CONTACT: guy.baele@kuleuven.be.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
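To make the idea of an adaptive multivariate transition kernel concrete, the following is a minimal sketch of a generic adaptive Metropolis sampler, in which all parameters are updated jointly by a multivariate normal proposal whose covariance is learned from the chain's own history. It is not the BEAST implementation and does not reproduce the authors' algorithm: the function names, the toy two-parameter target, the initial step size, the burn-in length and the 2.38²/d scaling constant are all assumptions chosen for illustration, and the parallel per-partition likelihood evaluation described in the abstract is not shown.

```python
import math
import random


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def adaptive_metropolis(log_density, start, n_iter, burn_in=500, eps=1e-6, seed=1):
    """Hypothetical helper: joint random-walk updates with a learned covariance."""
    rng = random.Random(seed)
    d = len(start)
    scale = 2.38 ** 2 / d  # assumed scaling constant, common in adaptive Metropolis
    x = list(start)
    log_p = log_density(x)
    n = 1                  # number of states in the running statistics
    mean = list(x)         # running mean of visited states
    m2 = [[0.0] * d for _ in range(d)]  # running sum of deviation outer products
    # Fixed isotropic proposal (variance 0.1, an assumption) until adaptation starts.
    factor = [[math.sqrt(0.1) if i == j else 0.0 for j in range(d)] for i in range(d)]
    samples, accepted = [], 0
    for t in range(n_iter):
        # Draw a correlated multivariate normal step: x + L z with z ~ N(0, I).
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        prop = [x[i] + sum(factor[i][k] * z[k] for k in range(i + 1)) for i in range(d)]
        log_p_prop = log_density(prop)
        # Symmetric proposal, so the Metropolis ratio is the density ratio.
        if rng.random() < math.exp(min(0.0, log_p_prop - log_p)):
            x, log_p = prop, log_p_prop
            accepted += 1
        samples.append(list(x))
        # Welford-style update of the running mean and covariance.
        n += 1
        delta = [x[i] - mean[i] for i in range(d)]
        mean = [mean[i] + delta[i] / n for i in range(d)]
        for i in range(d):
            for j in range(d):
                m2[i][j] += delta[i] * (x[j] - mean[j])
        if t >= burn_in:
            # Adapt: proposal covariance = scale * (empirical covariance + eps * I).
            cov = [[scale * (m2[i][j] / (n - 1) + (eps if i == j else 0.0))
                    for j in range(d)] for i in range(d)]
            factor = cholesky(cov)
    return samples, accepted / n_iter


def log_correlated_normal(x, rho=0.9):
    """Toy target (an assumption): two strongly correlated parameters."""
    dx, dy = x[0] - 1.0, x[1] + 2.0
    return -0.5 * (dx * dx - 2.0 * rho * dx * dy + dy * dy) / (1.0 - rho * rho)


if __name__ == "__main__":
    draws, rate = adaptive_metropolis(log_correlated_normal, [0.0, 0.0], 20000)
    print("acceptance rate:", round(rate, 3))
```

A univariate kernel would move one parameter at a time and take small steps along a strongly correlated ridge such as this toy target, whereas the learned covariance lets a single proposal move along the ridge. That is the general intuition behind preferring a multivariate kernel when parameters are correlated; how much it helps in any real phylogenetic analysis depends on the model and data.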