Massive parallelization boosts big Bayesian multidimensional scaling.

Andrew J Holbrook, Philippe Lemey, Guy Baele, Simon Dellicour, Dirk Brockmann, Andrew Rambaut, Marc A Suchard.

Abstract

Big Bayes is the computationally intensive co-application of big data and large, expressive Bayesian models for the analysis of complex phenomena in scientific inference and statistical learning. As an example, Bayesian multidimensional scaling (MDS) can help scientists learn viral trajectories through space-time, but its computational burden prevents its wider use: crucial MDS model calculations scale quadratically in the number of observations. We partially mitigate this limitation through massive parallelization using multi-core central processing units, instruction-level vectorization and graphics processing units (GPUs). When the MDS model is fitted using Hamiltonian Monte Carlo, GPUs can deliver more than 100-fold speedups over serial calculations and thus extend Bayesian MDS to a big data setting. To illustrate, we employ Bayesian MDS to infer the rate at which different seasonal influenza virus subtypes use worldwide air traffic to spread around the globe. We examine 5392 viral sequences and their associated 14 million pairwise distances arising from the number of commercial airline seats per year between viral sampling locations. To adjust for the shared evolutionary history of the viruses, we implement a phylogenetic extension to the MDS model and learn that subtype H3N2 spreads most effectively, consistent with its epidemic success relative to other seasonal influenza subtypes. Finally, we provide MassiveMDS, an open-source, stand-alone C++ library and rudimentary R package, and discuss program design and high-level implementation with an emphasis on important aspects of computing architecture that become relevant at scale.
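
The quadratic scaling mentioned in the abstract can be illustrated with a small serial sketch. This is not the MassiveMDS implementation: the function names `n_pairs` and `sum_squared_residuals` are hypothetical, and the sum of squared residuals between observed and latent distances is an assumed stand-in for the pairwise term of an MDS likelihood, which the abstract does not spell out. The point is only that the loop visits every unordered pair once, so the work grows as N(N-1)/2.

```python
import math
from itertools import combinations


def n_pairs(n: int) -> int:
    """Number of unordered pairs among n observations: n(n-1)/2."""
    return n * (n - 1) // 2


def sum_squared_residuals(observed, locations):
    """Serial O(N^2) pass over all pairs i < j.

    observed  -- N x N symmetric nested list of observed distances
    locations -- list of N latent coordinate tuples (hypothetical layout)

    Returns the sum of (observed[i][j] - ||x_i - x_j||)^2, an assumed
    stand-in for the pairwise term of an MDS log-likelihood.
    """
    total = 0.0
    for i, j in combinations(range(len(locations)), 2):
        latent = math.dist(locations[i], locations[j])  # Euclidean distance
        total += (observed[i][j] - latent) ** 2
    return total
```

For the 5392 sequences in the abstract, `n_pairs(5392)` gives 14,534,136, which matches the reported 14 million pairwise distances. Each pair's term is independent of the others, which is what makes this kind of sum a natural target for multi-core, SIMD and GPU parallelization.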

Keywords:  Bayesian phylogeography; GPU; Hamiltonian Monte Carlo; Massive parallelization; SIMD

Year: 2020  PMID: 34168419  PMCID: PMC8218718  DOI: 10.1080/10618600.2020.1754226

Source DB: PubMed  Journal: J Comput Graph Stat  ISSN: 1061-8600  Impact factor: 2.302


  4 in total

1.  BAYESIAN MITIGATION OF SPATIAL COARSENING FOR A HAWKES MODEL APPLIED TO GUNFIRE, WILDFIRE AND VIRAL CONTAGION.

Authors:  Andrew J Holbrook; Xiang Ji; Marc A Suchard
Journal:  Ann Appl Stat       Date:  2022-03-28       Impact factor: 1.959

2.  From viral evolution to spatial contagion: a biologically modulated Hawkes model.

Authors:  Andrew J Holbrook; Xiang Ji; Marc A Suchard
Journal:  Bioinformatics       Date:  2022-01-18       Impact factor: 6.937

3.  Scalable Bayesian inference for self-excitatory stochastic processes applied to big American gunfire data.

Authors:  Andrew J Holbrook; Charles E Loeffler; Seth R Flaxman; Marc A Suchard
Journal:  Stat Comput       Date:  2021-01-12       Impact factor: 2.559

4.  Relax, Keep Walking - A Practical Guide to Continuous Phylogeographic Inference with BEAST.

Authors:  Simon Dellicour; Mandev S Gill; Nuno R Faria; Andrew Rambaut; Oliver G Pybus; Marc A Suchard; Philippe Lemey
Journal:  Mol Biol Evol       Date:  2021-07-29       Impact factor: 16.240
