Rui Dong, Shaojun Pei, Changchuan Yin, Rong Lucy He, Stephen S-T Yau.
Abstract
The severe respiratory disease COVID-19 was first reported in Wuhan, China, in December 2019 and spread from there to many provinces. The pathogen was soon identified as a novel coronavirus named SARS-CoV-2 (formerly 2019-nCoV). As of 2 May 2020, over 3 million COVID-19 cases had been confirmed and 235,290 deaths had been reported globally, and the numbers are still increasing. It is important to understand the phylogenetic relationship between SARS-CoV-2 and known coronaviruses, and to identify its hosts, in order to prevent the next outbreak. In this study, we employ an effective alignment-free approach, the Natural Vector method, to analyze the phylogeny and classify the coronaviruses based on genomic and protein data. Our results show that SARS-CoV-2 is closely related to, but distinct from, the SARS-CoV branch. By analyzing the genetic distances from the SARS-CoV-2 strain to the coronaviruses residing in animal hosts, we establish that the most probable transmission path runs from bats to pangolins to humans.
Keywords: COVID-19; Natural Vector method; SARS-CoV-2; transmission path
Year: 2020 PMID: 32526937 PMCID: PMC7349679 DOI: 10.3390/genes11060637
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.096
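The Natural Vector method named in the abstract maps each sequence to a fixed-length numeric vector, so that sequences can be compared without alignment. The sketch below is a minimal illustration, assuming the commonly used 12-dimensional nucleotide form (count, mean position, and normalized second central moment for each of A, C, G, T) and a Euclidean distance between vectors. The function names `natural_vector` and `nv_distance` are hypothetical, and the exact vector order and moment order used in the study are not stated in this record.

```python
import math

NUCLEOTIDES = "ACGT"


def natural_vector(seq):
    """Return a 12-dimensional natural vector for a nucleotide sequence.

    Layout (an assumption for this sketch): [n_A, n_C, n_G, n_T,
    mu_A, mu_C, mu_G, mu_T, D2_A, D2_C, D2_G, D2_T].
    Characters other than A, C, G, T are not counted, but they still
    contribute to the sequence length n.
    """
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for k in NUCLEOTIDES:
        # 1-based positions at which nucleotide k occurs
        positions = [i + 1 for i, ch in enumerate(seq) if ch == k]
        n_k = len(positions)
        if n_k == 0:
            # Convention for an absent nucleotide: all three entries are zero
            counts.append(0.0)
            means.append(0.0)
            moments.append(0.0)
            continue
        mu_k = sum(positions) / n_k
        # Second central moment, normalized by n_k * n
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n)
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments


def nv_distance(seq_a, seq_b):
    """Euclidean distance between the natural vectors of two sequences."""
    return math.dist(natural_vector(seq_a), natural_vector(seq_b))
```

Because every sequence maps to a vector of the same length, genomes of different lengths can be compared directly, which is what makes the approach alignment-free.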
Figure 1. The genomic structures of SARS-CoV-2, SARS-CoV, bat-CoV and pangolin-CoV. The genomic structures of SARS-CoV-2 (NC_045512), SARS-CoV (NC_004718) and bat-CoV (MN_996532) were drawn according to their annotations in NCBI GenBank. The genomic structure of pangolin-CoV was drawn in [8].
Figure 2. The phylogenetic tree of 95 SARS-CoV-2 and 731 known coronaviruses based on BioNJ and Natural Vector algorithm. Different colors represent different types of coronaviruses that can infect humans.
Figure 3. The phylogenetic tree of 38 coronavirus genomes based on BioNJ and Natural Vector algorithm.
The RMSD and NV distance between 3CL proteinase (6LU7)/spike protein (6VXX) of SARS-CoV-2 and the counterpart proteins of other human coronaviruses.
| PDB-Number | 6LU7 | 3AW0 | 2ZU2 | 5WKJ | 6FV2 |
|---|---|---|---|---|---|
| RMSD | 0.72 | 1.10 | 1.53 | 1.25 | |
| NV-Distance | 22.61 | 117.82 | 140.59 | 118.98 | |

| PDB-Number | 6VXX | 5X58 | 6U7H | 5X5F | 5SZS |
|---|---|---|---|---|---|
| RMSD | 1.74 | 2.21 | 3.20 | 2.71 | |
| NV-Distance | 235.39 | 349.86 | 289.54 | 401.99 | |
Figure 4. The phylogenetic tree of 95 SARS-CoV-2 strains where different colors represent the SARS-CoV-2 strains sampled from different countries.
Figure 5. The phylogenetic BioNJ tree based on the Hausdorff distance between SARS-CoV-2 strains group and 13 possible host groups.
Distance from SARS-CoV-2 group to the coronavirus group of each host.
| Host | Number | Hausdorff Distance | Center Distance | S-Protein Center Distance |
|---|---|---|---|---|
| Pangolin_beta | 3 | 333.89 | 230.11 | 117.39 |
| Civet_beta | 13 | 928.40 | 952.39 | 220.21 |
| Bat_beta | 54 | 2400.72 | 1102.03 | 205.47 |
| Murine_beta | 43 | 2620.43 | 2358.63 | 254.27 |
| Camel_beta | 225 | 2464.57 | 1307.54 | 353.01 |
| Bovine_beta | 33 | 2571.48 | 2377.34 | 317.78 |
| Avian_gamma | 320 | 8788.62 | 2753.43 | 426.91 |
| Bat_alpha | 41 | 3340.69 | 2494.54 | 257.71 |
| Camel_alpha | 25 | 3065.11 | 3044.61 | 405.31 |
| Canine_alpha | 11 | 893.11 | 947.82 | 485.56 |
| Feline_alpha | 31 | 1205.55 | 107.93 | 453.74 |
| Murine_alpha | 2 | 2168.66 | 2125.31 | 482.57 |
| Porcine_alpha | 25 | 4981.11 | 2784.12 | 391.29 |
Hausdorff Distance: Hausdorff distance from the SARS-CoV-2 strains to the coronaviruses found in each host group. Center Distance: center distance from the SARS-CoV-2 strains to the coronaviruses found in each host group. S-Protein Center Distance: center distance from the S protein sequences of the SARS-CoV-2 strains to the S protein sequences of the coronaviruses found in each host group.
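The two group-level distances in the table can be illustrated with a short sketch. It assumes each group is a collection of equal-length numeric vectors (for example, natural vectors of the genomes in that group), that the Hausdorff distance is the standard symmetric one, and that the "center" of a group is the component-wise mean of its vectors. The last point is an assumption, since this record does not define the center. The function names are hypothetical.

```python
import math


def hausdorff_distance(group_a, group_b):
    """Symmetric Hausdorff distance between two non-empty sets of vectors."""

    def directed(src, dst):
        # Largest distance from a point in src to its nearest point in dst
        return max(min(math.dist(a, b) for b in dst) for a in src)

    return max(directed(group_a, group_b), directed(group_b, group_a))


def centroid(group):
    """Component-wise mean of the vectors in a group (assumed 'center')."""
    n = len(group)
    return [sum(column) / n for column in zip(*group)]


def center_distance(group_a, group_b):
    """Euclidean distance between the centroids of two groups."""
    return math.dist(centroid(group_a), centroid(group_b))


# Toy two-dimensional data, not taken from the study
group_a = [(0.0, 0.0), (1.0, 0.0)]
group_b = [(0.0, 3.0), (5.0, 0.0)]
print(hausdorff_distance(group_a, group_b))  # 4.0
print(center_distance(group_a, group_b))     # 2.5
```

The Hausdorff distance is driven by the worst-matched member of either group, so a single outlying genome can inflate it, while the center distance summarizes each group by one point. That difference is one reason the two columns can rank hosts differently.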
Figure 6. The natural graph of 13 possible host sources and SARS-CoV-2 group. The blue arrows represent the first-level relationship while the red ones represent the second-level relationships. First level indicates closer relationship.
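One plausible reading of the first-level and second-level relationships in Figure 6 is a nearest-neighbor graph: each group points to its closest group (level 1) and to its second-closest group (level 2). The sketch below builds such edges from a symmetric distance matrix. This construction is an assumption made for illustration, since the record does not describe how the natural graph was built, and `natural_graph_edges` is a hypothetical name. The distances are toy values, not the study's data.

```python
def natural_graph_edges(names, dist, levels=2):
    """Return directed edges (source, target, level) for each group.

    names: list of group labels.
    dist:  symmetric matrix, dist[i][j] is the distance between
           names[i] and names[j].
    Level 1 is the nearest other group, level 2 the second nearest.
    """
    edges = []
    for i, src in enumerate(names):
        others = [j for j in range(len(names)) if j != i]
        ranked = sorted(others, key=lambda j: dist[i][j])
        for level, j in enumerate(ranked[:levels], start=1):
            edges.append((src, names[j], level))
    return edges


# Toy example with three hypothetical groups
toy_names = ["X", "Y", "Z"]
toy_dist = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 2.0],
    [5.0, 2.0, 0.0],
]
for edge in natural_graph_edges(toy_names, toy_dist):
    print(edge)
```

In this reading the edges are directed, so X pointing to Y at level 1 does not imply that Y points back to X at level 1.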