
Sampling bias and incorrect rooting make phylogenetic network tracing of SARS-CoV-2 infections unreliable.

Carla Mavian, Sergei Kosakovsky Pond, Simone Marini, Brittany Rife Magalis, Anne-Mieke Vandamme, Simon Dellicour, Samuel V Scarpino, Charlotte Houldcroft, Julian Villabona-Arenas, Taylor K Paisie, Nídia S Trovão, Christina Boucher, Yun Zhang, Richard H Scheuermann, Olivier Gascuel, Tommy Tsan-Yuk Lam, Marc A Suchard, Ana Abecasis, Eduan Wilkinson, Tulio de Oliveira, Ana I Bento, Heiko A Schmidt, Darren Martin, James Hadfield, Nuno Faria, Nathan D Grubaugh, Richard A Neher, Guy Baele, Philippe Lemey, Tanja Stadler, Jan Albert, Keith A Crandall, Thomas Leitner, Alexandros Stamatakis, Mattia Prosperi, Marco Salemi.

Year:  2020        PMID: 32381734      PMCID: PMC7293693          DOI: 10.1073/pnas.2007295117

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205


There is obvious interest in gaining insights into the epidemiology and evolution of the virus that has recently emerged in humans as the cause of the coronavirus disease 2019 (COVID-19) pandemic. The recent paper by Forster et al. (1) analyzed 160 severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) full genomes available (https://www.gisaid.org/) in early March 2020. The central claim is the identification of three main SARS-CoV-2 types, named A, B, and C, circulating in different proportions among Europeans and Americans (types A and C) and East Asians (type B). According to a median-joining network analysis, variant A is proposed to be the ancestral type because it links to the sequence of a coronavirus from bats, used as an outgroup to trace the ancestral origin of the human strains. The authors further suggest that the “ancestral Wuhan B-type virus is immunologically or environmentally adapted to a large section of the East Asian population, and may need to mutate to overcome resistance outside East Asia.”
There are several serious flaws with their findings and interpretation. First, and most obviously, the sequence identity between SARS-CoV-2 and the bat virus is only 96.2%, implying that these viral genomes (which are nearly 30,000 nucleotides long) differ by more than 1,000 mutations. Such a distant outgroup is unlikely to provide a reliable root for the network. Yet, strangely, the branch to the bat virus, in figure 1 of their paper, is only 16 or 17 mutations in length. Indeed, the network seems to be misrooted, because (see their SI Appendix, figure S4) a virus from Wuhan from week 0 (24 December 2019) is portrayed as a descendant of a clade of viruses collected in weeks 1 to 9 (presumably from many places outside China), which makes no evolutionary (2) or epidemiological sense (3).
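The size of the gap between the outgroup and the human strains can be checked with simple arithmetic. The following is a minimal sketch that uses only the figures quoted above (96.2% identity, a genome of roughly 30,000 nucleotides, and a drawn branch of at most 17 mutations). It assumes, for illustration only, that the identity applies uniformly across the full genome length and that each differing site counts as one mutation; the function and constant names are hypothetical.

```python
# Back-of-the-envelope check using only figures quoted in the text.
GENOME_LENGTH = 30_000   # approximate genome length in nucleotides (assumed round value)
IDENTITY = 0.962         # reported identity between SARS-CoV-2 and the bat virus
NETWORK_BRANCH = 17      # longest reported length of the branch to the bat virus

def expected_differences(genome_length: int, identity: float) -> int:
    """Approximate number of differing sites implied by a pairwise identity."""
    return round(genome_length * (1.0 - identity))

diffs = expected_differences(GENOME_LENGTH, IDENTITY)
ratio = diffs / NETWORK_BRANCH

print(f"Differences implied by the identity: about {diffs}")
print(f"That is roughly {ratio:.0f} times the branch length drawn in the network")
```

Under these assumptions the identity implies on the order of a thousand differing sites, far more than the 16 or 17 mutations shown on the branch to the bat virus.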
As for the finding of three main SARS-CoV-2 types, we must underline that finding different lineages in different countries and regions is expected with any RNA virus experiencing founder effects (2). According to Forster et al.’s (1) own analysis, a single synonymous mutation (nucleotide change in a gene that does not result in a modified protein) distinguishes type A from type B, while one nonsynonymous mutation (resulting in a protein with a single amino acid change) separates types A and C, and another one separates types B and C. Given SARS-CoV-2’s fast evolutionary rate, random emergence of new mutations is entirely expected, even in a relatively short timeframe (4). When a viral strain is introduced and spreads in a new population, such random mutations can be propagated through founder effects, without being selected or advantageous. The fact that SARS-CoV-2 sequences show some geographical clustering is not new and is nicely and interactively shown on Nextstrain (5), but this cannot be used as proof of biological differences unless backed by solid experimental data (6). This is particularly true for the work of Forster et al., since their findings are based on a nonrepresentative dataset of 160 genomes, with no significant correlation between prevalence of confirmed cases and number of sequenced strains per country (7, 8). The essential role of representative sampling is well documented in the literature (9), but was not acknowledged by the authors, who, instead, claim that their “network faithfully traces routes of infections for documented [COVID-19] cases,” without taking into consideration missing viral diversity, or evaluating multiple transmission hypotheses that would be consistent with sequence data, or even providing any support for the robustness of the branching pattern in their network. Ultimately, no firm conclusion should be drawn without evaluating the probability of alternative dissemination routes.
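The founder-effect argument can be illustrated with a toy simulation. This is a minimal sketch and not an analysis of real data: it assumes a hypothetical source population in which a neutral variant "B" has frequency 0.5, and regions that are each seeded by a fixed number of randomly drawn founder lineages. No selection is modeled, and all names and parameter values are invented for illustration.

```python
import random

def seed_regions(source_freq, n_regions, n_founders, rng):
    """Return the frequency of a neutral variant in each newly seeded region.

    Each region is founded by n_founders lineages drawn at random from a
    source population where the variant has frequency source_freq. No
    selection is applied, so the regional frequency is simply the fraction
    of founders that happen to carry the variant.
    """
    freqs = []
    for _ in range(n_regions):
        carriers = sum(rng.random() < source_freq for _ in range(n_founders))
        freqs.append(carriers / n_founders)
    return freqs

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Few founders per region: frequencies scatter widely by chance alone.
few = seed_regions(source_freq=0.5, n_regions=10, n_founders=2, rng=rng)
# Many founders per region: frequencies stay close to the source frequency.
many = seed_regions(source_freq=0.5, n_regions=10, n_founders=1000, rng=rng)

print("2 founders per region:   ", few)
print("1000 founders per region:", [round(f, 2) for f in many])
```

With only a couple of founders per region, some regions end up dominated by one variant and some by the other, even though neither variant has any advantage in the model. Geographic clustering of this kind is therefore not, on its own, evidence of adaptation.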
The inappropriate application and interpretation of phylogenetic methods to analyze limited and unevenly sampled datasets calls for restraint in inferring the origin, directionality, and early clades or lineages of SARS-CoV-2. We feel the urgency to reframe the current debate in more rigorous scientific terms, given the dangerous implications of misunderstanding the true dispersal dynamics of SARS-CoV-2 and the COVID-19 pandemic.

1.  Eight challenges in phylodynamic inference.

Authors:  Simon D W Frost; Oliver G Pybus; Julia R Gog; Cecile Viboud; Sebastian Bonhoeffer; Trevor Bedford
Journal:  Epidemics       Date:  2014-09-16       Impact factor: 4.396

2.  Nextstrain: real-time tracking of pathogen evolution.

Authors:  James Hadfield; Colin Megill; Sidney M Bell; John Huddleston; Barney Potter; Charlton Callender; Pavel Sagulenko; Trevor Bedford; Richard A Neher
Journal:  Bioinformatics       Date:  2018-12-01       Impact factor: 6.931

3.  Evolutionary analysis of the dynamics of viral infectious disease.

Authors:  Oliver G Pybus; Andrew Rambaut
Journal:  Nat Rev Genet       Date:  2009-08       Impact factor: 53.242

4.  We shouldn't worry when a virus mutates during disease outbreaks.

Authors:  Nathan D Grubaugh; Mary E Petrone; Edward C Holmes
Journal:  Nat Microbiol       Date:  2020-04       Impact factor: 17.745

5.  The continuing 2019-nCoV epidemic threat of novel coronaviruses to global health - The latest 2019 novel coronavirus outbreak in Wuhan, China.

Authors:  David S Hui; Esam I Azhar; Tariq A Madani; Francine Ntoumi; Richard Kock; Osman Dar; Giuseppe Ippolito; Timothy D Mchugh; Ziad A Memish; Christian Drosten; Alimuddin Zumla; Eskild Petersen
Journal:  Int J Infect Dis       Date:  2020-01-14       Impact factor: 3.623

6.  Phylogenetic network analysis of SARS-CoV-2 genomes.

Authors:  Peter Forster; Lucy Forster; Colin Renfrew; Michael Forster
Journal:  Proc Natl Acad Sci U S A       Date:  2020-04-08       Impact factor: 11.205

7.  A Snapshot of SARS-CoV-2 Genome Availability up to April 2020 and its Implications: Data Analysis.

Authors:  Carla Mavian; Simone Marini; Mattia Prosperi; Marco Salemi
Journal:  JMIR Public Health Surveill       Date:  2020-06-01

