
Bayesian phylodynamic inference on the temporal evolution and global transmission of SARS-CoV-2.

Jianguo Li1, Zhen Li2, Xiaogang Cui2, Changxin Wu3.   

Year:  2020        PMID: 32325130      PMCID: PMC7169879          DOI: 10.1016/j.jinf.2020.04.016

Source DB:  PubMed          Journal:  J Infect        ISSN: 0163-4453            Impact factor:   6.072


Dear Editor,

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has caused a global pandemic. Initial transmission of SARS-CoV-2 was largely limited to China during the first two months, whereas global spread is now established, with about 2 million laboratory-confirmed infections and more than 126,000 deaths from 185 countries by April 15, 2020. The genomes of the early SARS-CoV-2 strains exhibited relatively high similarity. However, two key mutations were recently identified, potentially contributing to the sub-lineage classification of SARS-CoV-2. Although the genome structure of SARS-CoV-2 has been well documented, the temporal evolution and global transmission of the virus remain poorly investigated. Here, we retrieved 313 SARS-CoV-2 genomes from the GISAID (www.gisaid.org) database, from which 99 genomes with exact collection dates (before Feb 29, 2020) were selected to infer the origin time and global transmission of SARS-CoV-2 by Bayesian phylodynamic approaches.

To gain insight into the temporal evolutionary dynamics of SARS-CoV-2, we ran Markov Chain Monte Carlo (MCMC) analyses implemented in the BEAST 1.10.4 package on the 99 enrolled SARS-CoV-2 genomes. The Generalized Time Reversible model with a proportion of invariant sites as the site heterogeneity model (GTR+I) was selected as the best-fit nucleotide substitution model by the Akaike Information Criterion (AIC) implemented in jModelTest. The mean evolutionary rate of SARS-CoV-2 was estimated to be 6.14 × 10^-6 subs/site/day (95% HPD: 3.61 × 10^-6 – 8.68 × 10^-6 subs/site/day), corresponding to 2.24 × 10^-3 subs/site/year (95% HPD: 1.32 × 10^-3 – 3.17 × 10^-3 subs/site/year). We summarized the MCMC reconstruction in a Maximum Clade Credibility (MCC) tree using the program TreeAnnotator. From the MCC tree (Fig. 1), the tMRCA (time to the most recent common ancestor) of SARS-CoV-2 was dated to Dec 11, 2019 (95% HPD, Nov 21, 2019 – Dec 24, 2019).
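As a rough illustration of the model-selection step described above, AIC scores each candidate model as 2k − 2 ln L, where k is the number of free parameters and L is the maximized likelihood, and the model with the lowest score is preferred. The sketch below ranks four common substitution models. The log-likelihoods and parameter counts are hypothetical placeholders chosen only for illustration (they are not values from this study or from jModelTest output), and the helper name `aic` is likewise illustrative.

```python
# Sketch of AIC-based model ranking: AIC = 2k - 2*lnL, lower is better.
# All log-likelihoods and parameter counts are HYPOTHETICAL placeholders,
# not values from the study or from jModelTest.

def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion for one fitted model."""
    return 2 * n_params - 2 * log_likelihood

# model name -> (hypothetical lnL, free substitution-model parameters;
# branch-length parameters are omitted here, which is an assumption)
candidates = {
    "JC": (-41250.0, 0),
    "HKY": (-41100.0, 4),
    "GTR": (-41080.0, 8),
    "GTR+I": (-41060.0, 9),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
best_model = min(scores, key=scores.get)

# Print models from best (lowest AIC) to worst.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:6s} AIC = {score:.1f}")
print("best:", best_model)
```

With these placeholder numbers the extra parameter of GTR+I is outweighed by its higher likelihood, which is the kind of trade-off AIC is designed to balance.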
Two major clades were observed in the MCC tree, with a divergence time of Dec 23, 2019 (95% HPD, Dec 18, 2019 – Dec 29, 2019); both clades contain SARS-CoV-2 strains from Wuhan and other regions of China.
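The per-day and per-year evolutionary rates reported above differ only by a unit conversion, which can be checked with a short sketch. It assumes a 365-day year; the constant and function names are illustrative, and only the three per-day values come from the text.

```python
# Convert the reported evolutionary rate from subs/site/day to subs/site/year.
DAYS_PER_YEAR = 365  # assumption: a 365-day year

def per_day_to_per_year(rate_per_day: float) -> float:
    """Scale a per-day substitution rate to a per-year rate."""
    return rate_per_day * DAYS_PER_YEAR

# Mean and 95% HPD bounds as reported in the text (subs/site/day).
rates_per_day = {"mean": 6.14e-6, "hpd_lower": 3.61e-6, "hpd_upper": 8.68e-6}

rates_per_year = {
    label: per_day_to_per_year(rate) for label, rate in rates_per_day.items()
}

for label, rate in rates_per_year.items():
    print(f"{label}: {rate:.2e} subs/site/year")
```

Rounded to three significant figures, the results match the per-year mean and 95% HPD bounds quoted in the letter.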
Fig. 1

Bayesian evolutionary dynamics of SARS-CoV-2. (a) Time-scaled Maximum Clade Credibility (MCC) tree based on MCMC analysis of the 99 SARS-CoV-2 genomes with an Exponential Growth tree prior. The time scale at the bottom of Fig. 1 is shared by (a) and (b). Tree branches are colored by country of collection, with the color key at the lower left. The time to the most recent common ancestor (tMRCA) and the divergence times of clades and sub-clades are labelled on the corresponding nodes, with the 95% HPD in brackets. Clade- and sub-clade-specific mutations are labelled under the divergence times, with non-synonymous mutations in red. (b) Population dynamics of SARS-CoV-2. Viral population dynamics are represented by the relative genetic diversity obtained from the Bayesian Skyline Plot reconstruction of the MCMC analysis. The Y-axis represents relative genetic diversity, which is equivalent to the product of the effective population size (Ne) and the generation length in days (τ). The colored region shows the 95% HPD limits, and the black line represents the median estimate of relative genetic diversity.

The circulating strains of SARS-CoV-2 could be separated into four sub-clades (Fig. 1a). The two sub-clades of Clade 1 diverged on Jan 1, 2020 (95% HPD, Dec 27, 2019 – Jan 5, 2020), while the two sub-clades of Clade 2 diverged on Jan 8, 2020 (95% HPD, Jan 3, 2020 – Jan 13, 2020). With respect to country-specific strains, the circulating strains in the USA came from both clades, the UK and Australian strains came from Clade 1, and the strains in Singapore, Japan, Germany, France and Italy appeared to come from Clade 2 (Fig. 1a, Table S1). To infer the population growth dynamics of SARS-CoV-2, the viral relative genetic diversity was reconstructed by Bayesian Skyline Plot (BSP) analysis.
BSP analysis suggested that SARS-CoV-2 maintained a relatively stable effective population size (Ne) during the first month of the outbreak (Dec 23, 2019 to Jan 22, 2020) (Fig. 1b). A slow but accelerating reduction in Ne was observed from Jan 22, 2020, with a sharp reduction in the lower 95% HPD bound of Ne from Feb 5, 2020. A sharp reduction in Ne suggests the onset of a bottleneck effect in the virus population size. A bottleneck effect indicates that the currently circulating virus strain was trapped, and that more mutations in the virus genome will potentially occur to help the virus escape, resulting in a leap in the virus population. Although the BSP was generated from a limited sample size, the results suggest the possible onset of a bottleneck effect in the population size of SARS-CoV-2, indicating that more infections may occur in the near future owing to increased mutations in the viral genome.

Although SARS-CoV-2 remains relatively stable, thirteen clade- or sub-clade-specific mutations were observed in the present study (Fig. 1a). The mutations at nt 8782 and nt 28144 were clade specific: C8782T and T28144C occurred only in Clade 1, not in Clade 2. Only one viral strain in Clade 1 (EPI_ISL_406592, from Guangdong, China) did not possess C8782T, while all strains in Clade 1 possessed T28144C. The other eleven of the thirteen mutations were sub-clade specific (Fig. 1a). Seven of them were located in Clade 1, among which C29095G and C24034T/T26729C were observed in sub-clades consisting of viral strains from China (outside Wuhan) and the USA, respectively. G28878A and G29742A were observed in a sub-clade of viral strains from Australia and the USA. Four mutations were located in Clade 2, among which C21707T and C28854T were observed in a sub-clade consisting of viral strains from China (outside Wuhan) and the USA. C17373T was observed in a sub-clade of viral strains from China (outside Wuhan), the USA and Singapore.
G26144T was observed in a sub-clade of viral strains from the USA, Taiwan, Australia, Sweden, Italy, and Singapore. Seven of the observed mutations were non-synonymous, changing the translated viral protein: two in the nucleocapsid phosphoprotein (C28854T: Ser-Phe; G28878A: Ser-Asn) and one each in the ORF1ab polyprotein (T18488C: Ile-Thr), surface glycoprotein (C21707T: His-Tyr), ORF3a protein (G26144T: Gly-Val), ORF8 protein (T28144C: Leu-Ser), and ORF10 protein (G29742A: Arg-His). Notably, all four sub-clades possessed at least one non-synonymous mutation (Fig. 1a, Table 1).
Table 1

Clade-/Sub-clade specific mutations of SARS-CoV-2 observed in Maximum Clade Credibility tree.

Mutation | Gene | Type | Amino acid mutation | Collection country/region of the viral strain
--- | --- | --- | --- | ---
C8782T | ORF1a | synonymous | | Clade 1 in Fig. 1a (detailed in Table S1)
C17373T | ORF1b | synonymous | | China (outside Wuhan), USA and Singapore
C18060T | ORF1b | synonymous | | China (outside Wuhan) and USA
C24034T | S | synonymous | |
T26729C | M | synonymous | |
C29095G | N | synonymous | |
T18488C | ORF1b | non-synonymous | Ile-Thr | United Kingdom
C21707T | S | non-synonymous | His-Tyr | China (outside Wuhan) and USA
G26144T | ORF3 | non-synonymous | Gly-Val | USA, Taiwan, Australia, Sweden, Italy, and Singapore
T28144C | ORF8 | non-synonymous | Leu-Ser | Clade 1 in Fig. 1a (detailed in Table S1)
C28854T | N | non-synonymous | Ser-Phe | China (outside Wuhan) and USA
G28878A | N | non-synonymous | Ser-Asn | Australia and USA
G29742A | 3'-UTR | non-synonymous | Arg-His (untranslated) |

Ile, Isoleucine; Thr, Threonine; His, Histidine; Tyr, Tyrosine; Gly, Glycine; Val, Valine; Leu, Leucine; Ser, Serine; Phe, Phenylalanine; Asn, Asparagine; Arg, Arginine. USA, United States of America.
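The mutation labels in Table 1 follow the pattern reference base, genome position, alternate base (for example, C8782T is a C-to-T change at nt 8782). As a minimal sketch, the labels can be parsed and tallied by type as shown below. The rows are transcribed from Table 1; the names `TABLE_1`, `MUTATION_PATTERN` and `parse_mutation` are illustrative and not part of the original analysis.

```python
import re
from collections import Counter

# (mutation, gene, type) rows transcribed from Table 1.
TABLE_1 = [
    ("C8782T", "ORF1a", "synonymous"),
    ("C17373T", "ORF1b", "synonymous"),
    ("C18060T", "ORF1b", "synonymous"),
    ("C24034T", "S", "synonymous"),
    ("T26729C", "M", "synonymous"),
    ("C29095G", "N", "synonymous"),
    ("T18488C", "ORF1b", "non-synonymous"),
    ("C21707T", "S", "non-synonymous"),
    ("G26144T", "ORF3", "non-synonymous"),
    ("T28144C", "ORF8", "non-synonymous"),
    ("C28854T", "N", "non-synonymous"),
    ("G28878A", "N", "non-synonymous"),
    ("G29742A", "3'-UTR", "non-synonymous"),
]

# Reference base, 1-based nucleotide position, alternate base.
MUTATION_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT])$")

def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'C8782T' into (ref, position, alt)."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt

# Count rows per mutation type and list mutations in genome order.
type_counts = Counter(kind for _, _, kind in TABLE_1)
parsed = sorted((parse_mutation(m) for m, _, _ in TABLE_1), key=lambda t: t[1])

print(dict(type_counts))
for ref, pos, alt in parsed:
    print(f"nt {pos}: {ref} -> {alt}")
```

The tally gives six synonymous and seven non-synonymous mutations, in agreement with the thirteen mutations and seven non-synonymous changes described in the text.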

In conclusion, continuous evolution occurred in almost all regions of the SARS-CoV-2 genome, potentially in a country-specific manner. Further efforts to monitor the genomic mutations of SARS-CoV-2 in different countries are recommended.

Declaration of Competing Interest

The authors declare that they have no conflicts of interest.