Nextstrain: real-time tracking of pathogen evolution.

James Hadfield, Colin Megill, Sidney M Bell, John Huddleston, Barney Potter, Charlton Callender, Pavel Sagulenko, Trevor Bedford, Richard A Neher.

Abstract

Summary: Understanding the spread and evolution of pathogens is important for effective public health measures and surveillance. Nextstrain consists of a database of viral genomes, a bioinformatics pipeline for phylodynamics analysis, and an interactive visualization platform. Together these present a real-time view into the evolution and spread of a range of viral pathogens of high public health importance. The visualization integrates sequence data with other data types such as geographic information, serology, or host species. Nextstrain compiles our current understanding into a single accessible location, open to health professionals, epidemiologists, virologists and the public alike. Availability and implementation: All code (predominantly JavaScript and Python) is freely available from github.com/nextstrain and the web-application is available at nextstrain.org.

Year:  2018        PMID: 29790939      PMCID: PMC6247931          DOI: 10.1093/bioinformatics/bty407

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.931


1 Introduction

Viral pathogens pose an ever-present danger to global human health, highlighted by recent events such as the West African Ebola epidemic and the ongoing Zika epidemic in the Americas. The rapid evolution of these viruses allows inference of epidemic history from genomic data. Such analyses are often done in isolation, and may lack the spatial or temporal context in which to best interpret the results (Pybus et al.). Furthermore, the results of analyses are rarely made available to the public or health bodies until after publication, which may be too late to aid understanding or effect change in policy. We have developed Nextstrain to visualize outbreaks in as close to real time as possible. Whilst it currently encompasses a selection of viruses, extension to non-viral pathogens is forthcoming.

The regular updating and rapidity of these analyses are crucial to the monitoring and understanding of pathogen epidemiology and evolution. Sequencing times and costs are continually dropping, with on-the-ground sequencing used during recent epidemics (Faria et al.; Quick et al.). Rapid methods by which to analyze, interpret, and disseminate results must complement this speed of sequencing.

Nextstrain consists of data curation, analysis and visualization components. Python scripts maintain a database of available sequences and related metadata, sourced from public repositories such as NCBI (www.ncbi.nlm.nih.gov), GISAID (www.gisaid.org) and ViPR (www.viprbrc.org), as well as GitHub repositories and other sources of genomic data. A suite of tools performs phylodynamic analysis (Volz et al.), including subsampling, alignment, phylogenetic inference, temporal dating of ancestral nodes and discrete trait geographic reconstruction, with inference of the most likely transmission events. This leverages the maximum likelihood analyses implemented in TreeTime (Sagulenko et al.), allowing a full analysis of the entire Ebola epidemic (n = 1581 genomes) in under 2 h on a modern laptop. These scripts separate generic core functionality from a light pathogen-specific layer such that they are easily adapted to different pathogens. Visualization is available through nextstrain.org.

This approach is similar in concept to Nextflu (Neher and Bedford, 2015) but is extended and generalized to different viral pathogens. There is a growing need for surveillance of non-influenza viruses (Tang et al.), and Nextstrain can be extended to most outbreaks with readily accessible genomic data, although we note the potential for recombination or a low mutation rate to confound phylogenetic signal.
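To make the subsampling step concrete, the following is a minimal sketch of one common strategy: keep at most a fixed number of sequences per region and calendar month so that heavily sampled locations do not dominate the tree. It is an illustration only, not the Nextstrain implementation; the function name `subsample`, the record fields `strain`, `region` and `date`, and the `per_group` and `seed` parameters are all assumptions made for this example.

```python
import random
from collections import defaultdict


def subsample(records, per_group=2, seed=0):
    """Keep at most `per_group` records per (region, year-month) group.

    `records` is a list of dicts with hypothetical keys "strain",
    "region" and "date" (ISO format, YYYY-MM-DD). This is an
    illustrative strategy, not the one used by Nextstrain.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    groups = defaultdict(list)
    for rec in records:
        year_month = rec["date"][:7]  # "YYYY-MM"
        groups[(rec["region"], year_month)].append(rec)

    chosen = []
    for key in sorted(groups):  # sorted keys give a stable output order
        members = groups[key]
        if len(members) > per_group:
            members = rng.sample(members, per_group)
        chosen.extend(members)
    return chosen
```

Grouping by both place and time is a design choice in this sketch: it trades some resolution in densely sampled groups for a more even view across the whole outbreak.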

2 Joint temporal and spatial visualization

Conveying understanding of pathogen evolution through space and time involves filtering large amounts of data into forms that can be easily reasoned about. Multiple views into different facets of the data are presented and remain in sync as one interacts with the data. This allows simultaneous interrogation of phylogenetic and spatial relationships, with additional data such as genotype or serotype expressed through colourings (Fig. 1). This is coupled with an interactive time slider to see how the pathogen has evolved and spread over the course of the epidemic. By animating the temporal dimension, a high-level overview of how the entire outbreak unfolded is quickly gained. This approach communicates both the geographical spread of the epidemic and the underlying genomic data that support this geographic reconstruction.
Fig. 1.

Genomic epidemiology of Zika virus as of Oct 2017 (live display at nextstrain.org/zika). The main interface consists of three linked panels: a phylogenetic tree, geographic transmissions and the genetic diversity across the genome.

Maximum likelihood ancestral state reconstruction of discrete traits such as country or region of isolation allows identification of probable transmission events given the sampled data, together with inferred probability distributions of ancestral state at each node. Internal node colours indicate ancestral state and shifts are drawn as links between demes on the map. Confidence is conveyed by matching colour saturation to the confidence of that trait, and by displaying all relevant information when one hovers over the corresponding branch or isolate on the tree. Sampling bias and lack of data can obscure transmission links, and in certain cases we have chosen not to display the inferred states.
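The idea of tying colour saturation to confidence can be sketched as follows. This is a hypothetical mapping written for illustration, not the code used by the Nextstrain web application: the function name `confidence_colour`, the linear mapping, and the `min_saturation` floor (which keeps low-confidence nodes faintly tinted rather than white) are all assumptions.

```python
import colorsys


def confidence_colour(hue, confidence, min_saturation=0.2):
    """Return a hex colour whose saturation grows with `confidence`.

    `hue` is in [0, 1] and identifies the inferred trait value (for
    example, a country); `confidence` is the inferred probability of
    that value, also in [0, 1]. Both the linear mapping and the
    saturation floor are assumptions made for this sketch.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    saturation = min_saturation + (1.0 - min_saturation) * confidence
    r, g, b = colorsys.hsv_to_rgb(hue, saturation, 1.0)
    return "#{:02x}{:02x}{:02x}".format(
        round(r * 255), round(g * 255), round(b * 255)
    )
```

With this mapping, a node whose ancestral state is inferred with certainty is drawn in the fully saturated trait colour, while an uncertain node fades towards a pale tint of the same hue.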

3 Monitoring of evolution and adaptation

Nextstrain tracks and reconstructs mutations across the tree and displays this information as a bar chart of entropy at each position in the genome, as well as showing the mutations inferred to occur on each branch when one hovers over the tree. Selecting a position in the genome with non-zero entropy reveals the distribution of the segregating variant in the phylogeny and on the map. This allows interrogation of genetic change which may be adaptive or may underlie a change in disease dynamics. For many pathogens, the emergence and spread of gain-of-function variants is a grave concern. For instance, China has experienced seasonal epidemics of influenza A/H7N9 over the past five years. Although there are no known human-to-human transmission events, the high mortality rate of 30% (Li et al.) makes mutations that facilitate human-to-human transmission a matter of extreme concern. For example, mutations identified by de Vries et al. are readily visible at nextstrain.org/avian/h7n9. Continual monitoring of such putatively adaptive mutations is critical.
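As an illustration of the per-position diversity measure, the sketch below computes Shannon entropy for each column of a nucleotide alignment. A column where every sequence carries the same base has entropy zero; a column with segregating variants has positive entropy. The function names, the use of the natural logarithm, and the choice to skip gaps and `N` are assumptions of this example rather than details taken from the Nextstrain code.

```python
import math
from collections import Counter


def column_entropy(column, ignore="-N"):
    """Shannon entropy (in nats) of one alignment column.

    Characters listed in `ignore` (gaps and ambiguous bases here)
    are skipped; this handling is an assumption of the sketch.
    """
    counts = Counter(c for c in column.upper() if c not in ignore)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(
        (n / total) * math.log(total / n) for n in counts.values()
    )


def alignment_entropy(sequences):
    """Entropy at every position of equal-length aligned sequences."""
    if len({len(s) for s in sequences}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy("".join(col)) for col in zip(*sequences)]
```

Positions with non-zero values in the returned list correspond to the bars one would select in the diversity panel to see how a variant is distributed across the tree and the map.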

4 A model for public sharing of data

Nextstrain presents a single, continuously updated overview of both endemic viral disease (seasonal influenza, dengue) and emergent viral outbreaks (avian influenza, Zika, Ebola), all based upon the same underlying bioinformatics architecture. This architecture is well positioned to respond to future outbreaks, be they viral or bacterial. Analysis of such outbreaks relies on public sharing of data, and Nextstrain can update automatically as new sequences appear in a range of public databases and repositories. Scientists are justifiably hesitant to cede control of their data, and we try to address these concerns by preventing access to the raw genome sequences, and by clearly indicating the source of each sequence. Derived data, such as phylogenetic trees, metadata and screenshots, are available, and one can append private metadata via CSV files. We believe this strikes a compromise between keeping certain data private and allowing the dissemination of results important to the wider scientific community, thereby encouraging collaboration between scientists. Genomic epidemiology has the potential to inform the public, health organisations and scientists alike, a potential realized by sharing data in real time rather than retrospectively (Croucher and Didelot, 2015).
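Conceptually, appending private metadata from a CSV file amounts to joining extra columns onto the public records by a shared identifier. The sketch below shows one way this could work; it is not the Nextstrain implementation, and the function name `append_metadata`, the `strain` key column, and the rule that public fields are never overwritten are assumptions made for illustration.

```python
import csv
import io


def append_metadata(tips, csv_text, key="strain"):
    """Return copies of `tips` with extra columns joined from CSV text.

    `tips` is a list of dicts describing public isolates; `csv_text`
    holds private metadata with a header row containing `key`. The
    column name "strain" is a hypothetical choice for this sketch.
    Existing public fields are kept as they are.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    extra = {row[key]: row for row in reader}

    merged = []
    for tip in tips:
        combined = dict(tip)  # copy, so the input is left untouched
        for column, value in extra.get(tip[key], {}).items():
            if column != key and column not in combined:
                combined[column] = value
        merged.append(combined)
    return merged
```

Because the join happens on the reader's side and returns new records, the private columns in this sketch never need to leave the user's machine.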

Funding

This work was supported by the Open Science Prize to TB and RAN, by the NSF through DGE-1256082 to SMB, by the ERC through StG-260686 to RAN and by NIH R35 GM119774-01 to TB. TB is a Pew Biomedical Scholar.

Conflict of Interest: none declared.
  10 in total

1.  Epidemiology of human infections with avian influenza A(H7N9) virus in China.

Authors:  Qun Li; Lei Zhou; Minghao Zhou; Zhiping Chen; Furong Li; Huanyu Wu; Nijuan Xiang; Enfu Chen; Fenyang Tang; Dayan Wang; Ling Meng; Zhiheng Hong; Wenxiao Tu; Yang Cao; Leilei Li; Fan Ding; Bo Liu; Mei Wang; Rongheng Xie; Rongbao Gao; Xiaodan Li; Tian Bai; Shumei Zou; Jun He; Jiayu Hu; Yangting Xu; Chengliang Chai; Shiwen Wang; Yongjun Gao; Lianmei Jin; Yanping Zhang; Huiming Luo; Hongjie Yu; Jianfeng He; Qi Li; Xianjun Wang; Lidong Gao; Xinghuo Pang; Guohua Liu; Yansheng Yan; Hui Yuan; Yuelong Shu; Weizhong Yang; Yu Wang; Fan Wu; Timothy M Uyeki; Zijian Feng
Journal:  N Engl J Med       Date:  2013-04-24       Impact factor: 91.245

2.  Establishment and cryptic transmission of Zika virus in Brazil and the Americas.

Authors:  N R Faria; J Quick; I M Claro; J Thézé; J G de Jesus; M Giovanetti; M U G Kraemer; S C Hill; A Black; A C da Costa; L C Franco; S P Silva; C-H Wu; J Raghwani; S Cauchemez; L du Plessis; M P Verotti; W K de Oliveira; E H Carmo; G E Coelho; A C F S Santelli; L C Vinhal; C M Henriques; J T Simpson; M Loose; K G Andersen; N D Grubaugh; S Somasekar; C Y Chiu; J E Muñoz-Medina; C R Gonzalez-Bonilla; C F Arias; L L Lewis-Ximenez; S A Baylis; A O Chieppe; S F Aguiar; C A Fernandes; P S Lemos; B L S Nascimento; H A O Monteiro; I C Siqueira; M G de Queiroz; T R de Souza; J F Bezerra; M R Lemos; G F Pereira; D Loudal; L C Moura; R Dhalia; R F França; T Magalhães; E T Marques; T Jaenisch; G L Wallau; M C de Lima; V Nascimento; E M de Cerqueira; M M de Lima; D L Mascarenhas; J P Moura Neto; A S Levin; T R Tozetto-Mendoza; S N Fonseca; M C Mendes-Correa; F P Milagres; A Segurado; E C Holmes; A Rambaut; T Bedford; M R T Nunes; E C Sabino; L C J Alcantara; N J Loman; O G Pybus
Journal:  Nature       Date:  2017-05-24       Impact factor: 49.962

3.  Evolutionary epidemiology: preparing for an age of genomic plenty.

Authors:  O G Pybus; C Fraser; A Rambaut
Journal:  Philos Trans R Soc Lond B Biol Sci       Date:  2013-02-04       Impact factor: 6.237

4.  The application of genomics to tracing bacterial pathogen transmission.

Authors:  Nicholas J Croucher; Xavier Didelot
Journal:  Curr Opin Microbiol       Date:  2014-11-22       Impact factor: 7.934

5.  nextflu: real-time tracking of seasonal influenza virus evolution in humans.

Authors:  Richard A Neher; Trevor Bedford
Journal:  Bioinformatics       Date:  2015-06-26       Impact factor: 6.937

6.  Three mutations switch H7N9 influenza to human-type receptor specificity.

Authors:  Robert P de Vries; Wenjie Peng; Oliver C Grant; Andrew J Thompson; Xueyong Zhu; Kim M Bouwman; Alba T Torrents de la Pena; Marielle J van Breemen; Iresha N Ambepitiya Wickramasinghe; Cornelis A M de Haan; Wenli Yu; Ryan McBride; Rogier W Sanders; Robert J Woods; Monique H Verheije; Ian A Wilson; James C Paulson
Journal:  PLoS Pathog       Date:  2017-06-15       Impact factor: 6.823

7.  Global epidemiology of non-influenza RNA respiratory viruses: data gaps and a growing need for surveillance.

Authors:  Julian W Tang; Tommy T Lam; Hassan Zaraket; W Ian Lipkin; Steven J Drews; Todd F Hatchette; Jean-Michel Heraud; Marion P Koopmans
Journal:  Lancet Infect Dis       Date:  2017-04-28       Impact factor: 25.071

8.  Viral phylodynamics.

Authors:  Erik M Volz; Katia Koelle; Trevor Bedford
Journal:  PLoS Comput Biol       Date:  2013-03-21       Impact factor: 4.475

9.  TreeTime: Maximum-likelihood phylodynamic analysis.

Authors:  Pavel Sagulenko; Vadim Puller; Richard A Neher
Journal:  Virus Evol       Date:  2018-01-08

10.  Real-time, portable genome sequencing for Ebola surveillance.

Authors:  Joshua Quick; Nicholas J Loman; Sophie Duraffour; Jared T Simpson; Ettore Severi; Lauren Cowley; Joseph Akoi Bore; Raymond Koundouno; Gytis Dudas; Amy Mikhail; Nobila Ouédraogo; Babak Afrough; Amadou Bah; Jonathan Hj Baum; Beate Becker-Ziaja; Jan-Peter Boettcher; Mar Cabeza-Cabrerizo; Alvaro Camino-Sanchez; Lisa L Carter; Juiliane Doerrbecker; Theresa Enkirch; Isabel Graciela García Dorival; Nicole Hetzelt; Julia Hinzmann; Tobias Holm; Liana Eleni Kafetzopoulou; Michel Koropogui; Abigail Kosgey; Eeva Kuisma; Christopher H Logue; Antonio Mazzarelli; Sarah Meisel; Marc Mertens; Janine Michel; Didier Ngabo; Katja Nitzsche; Elisa Pallash; Livia Victoria Patrono; Jasmine Portmann; Johanna Gabriella Repits; Natasha Yasmin Rickett; Andrea Sachse; Katrin Singethan; Inês Vitoriano; Rahel L Yemanaberhan; Elsa G Zekeng; Racine Trina; Alexander Bello; Amadou Alpha Sall; Ousmane Faye; Oumar Faye; N'Faly Magassouba; Cecelia V Williams; Victoria Amburgey; Linda Winona; Emily Davis; Jon Gerlach; Franck Washington; Vanessa Monteil; Marine Jourdain; Marion Bererd; Alimou Camara; Hermann Somlare; Abdoulaye Camara; Marianne Gerard; Guillaume Bado; Bernard Baillet; Déborah Delaune; Koumpingnin Yacouba Nebie; Abdoulaye Diarra; Yacouba Savane; Raymond Bernard Pallawo; Giovanna Jaramillo Gutierrez; Natacha Milhano; Isabelle Roger; Christopher J Williams; Facinet Yattara; Kuiama Lewandowski; Jamie Taylor; Philip Rachwal; Daniel Turner; Georgios Pollakis; Julian A Hiscox; David A Matthews; Matthew K O'Shea; Andrew McD Johnston; Duncan Wilson; Emma Hutley; Erasmus Smit; Antonino Di Caro; Roman Woelfel; Kilian Stoecker; Erna Fleischmann; Martin Gabriel; Simon A Weller; Lamine Koivogui; Boubacar Diallo; Sakoba Keita; Andrew Rambaut; Pierre Formenty; Stephan Gunther; Miles W Carroll
Journal:  Nature       Date:  2016-02-03       Impact factor: 69.504
