Simon Dellicour, Keith Durkin, Samuel L Hong, Bert Vanmechelen, Joan Martí-Carreras, Mandev S Gill, Cécile Meex, Sébastien Bontems, Emmanuel André, Marius Gilbert, Conor Walker, Nicola De Maio, Nuno R Faria, James Hadfield, Marie-Pierre Hayette, Vincent Bours, Tony Wawina-Bokalanga, Maria Artesi, Guy Baele, Piet Maes.
Abstract
Since the start of the COVID-19 pandemic, an unprecedented number of genomic sequences of SARS-CoV-2 have been generated and shared with the scientific community. The unparalleled volume of available genetic data presents a unique opportunity to gain real-time insights into the virus transmission during the pandemic, but also a daunting computational hurdle if analyzed with gold-standard phylogeographic approaches. To tackle this practical limitation, we here describe and apply a rapid analytical pipeline to analyze the spatiotemporal dispersal history and dynamics of SARS-CoV-2 lineages. As a proof of concept, we focus on the Belgian epidemic, which has had one of the highest spatial densities of available SARS-CoV-2 genomes. Our pipeline has the potential to be quickly applied to other countries or regions, with key benefits in complementing epidemiological analyses in assessing the impact of intervention measures or their progressive easement.
Keywords: COVID-19; SARS-CoV-2; lockdown measures; phylodynamic; phylogenetic clusters; phylogeography
Year: 2021 PMID: 33316043 PMCID: PMC7665608 DOI: 10.1093/molbev/msaa284
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Fig. 1. Time-scaled phylogenetic tree in which we identified Belgian clusters. A cluster is here defined as a phylogenetic clade likely corresponding to a distinct introduction into the study area (Belgium). We delineated these clusters by performing a simplistic discrete phylogeographic reconstruction along the time-scaled phylogenetic tree while only considering two potential ancestral locations: “Belgium” and “non-Belgium”. We identified a minimum of 331 lineage introductions (95% HPD interval = [315–344]), which showcases the relative importance of external introductions considering the number of sequences currently sampled in Belgium (n = 740). On the tree, lineages circulating in Belgium are highlighted in green, and green nodes correspond to the most ancestral node of each Belgian cluster (see also supplementary fig. S1, Supplementary Material online, for a noncircular visualization of the same tree). Alongside the tree, we also report the distribution of cluster sizes (number of sampled sequences in each cluster) as well as the number of sequences sampled through time.
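The two-state cluster delineation described in this caption can be illustrated with a minimal sketch. It assumes that an ancestral location ("Belgium" or "non-Belgium") has already been inferred for every node of the tree; the discrete phylogeographic inference itself is not reproduced here. The tree encoding (`parent`, `state`, `tips`), the toy tree, and the function name `belgian_clusters` are hypothetical and are not taken from the authors' pipeline.

```python
from collections import defaultdict

def belgian_clusters(parent, state, tips):
    """Group Belgian tips by the most ancestral Belgian node above them.

    parent: dict mapping each node to its parent (the root maps to None)
    state:  dict mapping each node to "Belgium" or "non-Belgium"
    tips:   iterable of sampled (tip) node names
    """
    clusters = defaultdict(list)
    for tip in tips:
        if state[tip] != "Belgium":
            continue
        node = tip
        # Climb the tree while the parent is also inferred to be in Belgium.
        while parent.get(node) is not None and state[parent[node]] == "Belgium":
            node = parent[node]
        # 'node' is now the most ancestral node of this Belgian clade.
        clusters[node].append(tip)
    return dict(clusters)

# Toy tree (illustrative only): root -> A, B; A -> t1, t2; B -> t3, t4
parent = {"root": None, "A": "root", "B": "root",
          "t1": "A", "t2": "A", "t3": "B", "t4": "B"}
state = {"root": "non-Belgium", "A": "Belgium", "B": "non-Belgium",
         "t1": "Belgium", "t2": "Belgium", "t3": "Belgium", "t4": "non-Belgium"}
tips = ["t1", "t2", "t3", "t4"]

clusters = belgian_clusters(parent, state, tips)
n_introductions = len(clusters)  # one cluster per inferred introduction
cluster_sizes = sorted(len(members) for members in clusters.values())
```

In this toy tree, each cluster root sits below a "non-Belgium" parent, so the number of clusters equals the number of inferred introduction events. Repeating the count over a sample of posterior trees would give a distribution from which an interval could be summarized.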
Fig. 2. Spatially explicit phylogeographic reconstruction of the dispersal history of SARS-CoV-2 lineages in Belgium. (A) Continuous phylogeographic reconstruction performed along each Belgian clade (cluster) identified by the initial discrete phylogeographic analysis. For each clade, we mapped the maximum clade credibility (MCC) tree and overall 80% highest posterior density (HPD) regions reflecting the uncertainty related to the phylogeographic inference. MCC trees and 80% HPD regions are based on 1,000 trees subsampled from each post burn-in posterior distribution. MCC tree nodes were colored according to their time of occurrence, and 80% HPD regions were computed for successive time layers and then superimposed using the same color scale reflecting time. Continuous phylogeographic reconstructions were only performed along Belgian clades linking at least three sampled sequences for which the geographic origin was known (see the detailed analytical pipeline in Supplementary Information for further detail). In addition to the phylogenetic branches of MCC trees obtained by continuous phylogeographic inference, we also mapped sampled sequences belonging to clades linking fewer than three geo-referenced sequences. Furthermore, when a clade gathers only two geo-referenced sequences, we highlighted the phylogenetic link between these two sequences with a dashed curve connecting them. Subnational province borders are represented by white lines. (B) MCC tree branches occurring before March 18, 2020 (beginning of the lockdown). (C) MCC tree branches occurring after March 18, 2020. See also supplementary figure S2, Supplementary Material online, for a zoomed version of the dispersal history of viral lineages in the Province of Liège, for which we have a particularly dense sampling.
Fig. 3. Evolution of viral lineage dispersal dynamics during the Belgian epidemic. These estimates are based on 1,000 trees subsampled from each post burn-in posterior distribution. Except for the number of phylogenetic branches occurring at each time slice, all estimates were smoothed using a 14-day sliding window. Dark gray surrounding polygons represent 95% credible intervals, and light gray surrounding polygons represent 95% credible intervals re-estimated after subsampling 75% of branches in each of the 1,000 posterior trees. The credible interval based on the subsampling procedure is an indication of the robustness of the estimate. We also report the number of phylogenetic branches occurring per tree at each time slice (blue curve). The number of branches available at each time slice is an additional, yet qualitative, indication of robustness of the estimate for a given time period.
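The smoothing and robustness steps mentioned in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the caption does not say whether the 14-day window is centered or trailing, so a centered window is assumed here, and the function names `sliding_mean` and `subsample_branches` are hypothetical.

```python
import random

def sliding_mean(days, values, window=14):
    """Centered sliding-window mean (assumption: centered, truncated at edges).

    With daily points and inclusive bounds, a 14-day window covers up to
    15 points (the day itself plus 7 days on each side).
    """
    half = window / 2
    smoothed = []
    for d in days:
        inside = [v for t, v in zip(days, values) if abs(t - d) <= half]
        smoothed.append(sum(inside) / len(inside))
    return smoothed

def subsample_branches(branches, fraction=0.75, rng=None):
    """Draw a random subset of branches without replacement."""
    rng = rng or random.Random()
    k = round(fraction * len(branches))
    return rng.sample(branches, k)

# Toy daily series (illustrative only): one estimate per day over 30 days.
days = list(range(30))
values = [float(d) for d in days]
smoothed = sliding_mean(days, values, window=14)

# Toy branch list for one posterior tree; re-estimating a statistic on such
# subsets across all posterior trees would yield the subsampled interval.
branches = list(range(100))
subset = subsample_branches(branches, fraction=0.75, rng=random.Random(1))
```

Comparing the interval obtained from the full set of branches with the one obtained from the 75% subsets shows how sensitive an estimate is to the particular branches available in a given time period.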