Hong Zhou, Xing Chen, Tao Hu, Juan Li, Hao Song, Yanran Liu, Peihan Wang, Di Liu, Jing Yang, Edward C Holmes, Alice C Hughes, Yuhai Bi, Weifeng Shi.
Abstract
The unprecedented pandemic of pneumonia caused by a novel coronavirus, SARS-CoV-2, in China and beyond has had major public health impacts on a global scale [1, 2]. Although bats are regarded as the most likely natural hosts for SARS-CoV-2 [3], the origins of the virus remain unclear. Here, we report a novel bat-derived coronavirus, denoted RmYN02, identified from a metagenomic analysis of samples from 227 bats collected from Yunnan Province in China between May and October 2019. Notably, RmYN02 shares 93.3% nucleotide identity with SARS-CoV-2 at the scale of the complete virus genome and 97.2% identity in the 1ab gene, in which it is the closest relative of SARS-CoV-2 reported to date. In contrast, RmYN02 showed low sequence identity (61.3%) to SARS-CoV-2 in the receptor-binding domain (RBD) and might not bind to angiotensin-converting enzyme 2 (ACE2). Critically, and in a similar manner to SARS-CoV-2, RmYN02 was characterized by the insertion of multiple amino acids at the junction site of the S1 and S2 subunits of the spike (S) protein. This provides strong evidence that such insertion events can occur naturally in animal betacoronaviruses.
Keywords: COVID-19; S1/S2 cleavage site; SARS-CoV-2; bat coronavirus; spike protein
Year: 2020 PMID: 32416074 PMCID: PMC7211627 DOI: 10.1016/j.cub.2020.05.023
Source DB: PubMed Journal: Curr Biol ISSN: 0960-9822 Impact factor: 10.834
Table 1. Sequence Identity for SARS-CoV-2 Compared with RmYN02 and Representative Beta-CoV Genomes
Columns 1ab through 10 are gene regions.
| Sequence type | Strain | Complete genome | 1ab | S | RBD | 3a | E | M | 6 | 7a | 7b | 8 | N | 10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Nucleotide | RmYN02 | 93.3% | 97.2% | 71.9% | 61.3% | 96.4% | 98.7% | 94.8% | 96.8% | 96.2% | 92.4% | 45.8% | 97.3% | 99.1% |
| Nucleotide | RaTG13 | 96.1% | 96.5% | 92.9% | 85.3% | 96.3% | 99.6% | 95.4% | 98.4% | 95.6% | 99.2% | 97.0% | 96.9% | 99.1% |
| Nucleotide | ZC45 | 87.6% | 89.0% | 75.1% | 62.1% | 87.8% | 98.7% | 93.4% | 95.2% | 88.8% | 94.7% | 88.5% | 91.1% | 99.1% |
| Nucleotide | ZXC21 | 87.4% | 88.7% | 74.6% | 60.6% | 88.9% | 98.7% | 93.4% | 95.2% | 89.1% | 95.5% | 88.5% | 91.2% | / |
| Nucleotide | pangolin/GD/2019 | – | 90.8% | 89.3% | – | 93.4% | 98.3% | 93.1% | 94.6% | 93.4% | – | 92.1% | 96.1% | – |
| Nucleotide | pangolin/GX/P5L/2017 | 85.2% | 84.7% | 83.2% | 79.9% | 87.0% | 97.4% | 91.3% | 90.9% | 86.6% | 81.8% | 80.6% | 91.0% | 94.0% |
| Nucleotide | SARS-CoV GZ02 | 78.9% | 79.6% | 72.3% | 73.8% | 75.6% | 93.5% | 85.1% | 74.5% | 82.1% | 83.0% | 45.3% | 88.1% | / |
| Amino acid | RmYN02 | N/A | 98.8% | 72.9% | 62.4% | 96.7% | 100.0% | 98.2% | 96.7% | 95.9% | 83.7% | 27.3% | 98.6% | 97.4% |
| Amino acid | RaTG13 | N/A | 98.5% | 97.4% | 89.3% | 97.8% | 100.0% | 98.6% | 100.0% | 97.5% | 97.7% | 95.0% | 99.0% | 97.4% |
| Amino acid | ZC45 | N/A | 95.6% | 80.2% | 63.5% | 90.9% | 100.0% | 98.6% | 93.4% | 87.6% | 93.0% | 94.2% | 94.3% | 97.4% |
| Amino acid | ZXC21 | N/A | 95.2% | 79.6% | 62.9% | 92.0% | 100.0% | 98.6% | 93.4% | 88.4% | 93.0% | 94.2% | 94.3% | / |
| Amino acid | pangolin/GD/2019 | N/A | 97.1% | 90.7% | 97.4% | 97.4% | 100.0% | 98.6% | 96.6% | 97.5% | – | 94.9% | 97.6% | – |
| Amino acid | pangolin/GX/P5L/2017 | N/A | 92.6% | 92.4% | 86.8% | 89.8% | 100.0% | 98.2% | 95.1% | 88.4% | 72.1% | 87.6% | 93.8% | 84.2% |
| Amino acid | SARS-CoV GZ02 | N/A | 86.2% | 76.2% | 74.6% | 73.1% | 94.7% | 89.6% | 68.9% | 85.2% | 79.5% | 29.7% | 90.5% | / |
Sequence identities for RmYN02 compared with the SARS-CoV GZ02 (accession number AY390556); the bat SARS-like coronaviruses RaTG13 (EPI_ISL_402131), ZC45 (MG772933), and ZXC21 (MG772934); and the pangolin SARS-like coronaviruses pangolin/GD/2019 and pangolin/GX/P5L/2017 (EPI_ISL_410540). –, no corresponding values in [6]; /, this open reading frame is not found; N/A, not available.
Pangolin/GD/2019 represents a merger of GD/P1L and GD/P2S, and these values were adapted from [6].
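The record does not state how the percent identities in the table were computed. As a minimal sketch of one common convention, the Python function below takes two sequences that have already been aligned (for example with MAFFT) and reports the share of matching positions among columns where neither sequence has a gap. The function name `percent_identity` and the choice to skip gapped columns are assumptions for illustration; other tools count gaps differently and can give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are skipped entirely. This is one convention
    among several and is not necessarily the one used for the table.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned sequences, not real viral data.
    print(percent_identity("ACGTACGTAC", "ACGTACGTTT"))
```

The same function works on aligned amino acid sequences, since it only compares characters column by column.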
Figure 1. Patterns of Sequence Identity between the Consensus Sequences of SARS-CoV-2 and Representative Beta-CoVs
(A) Whole-genome similarity plot between SARS-CoV-2 and representative viruses listed in Table 1. The analysis was performed using Simplot, with a window size of 1,000 bp and a step size of 100 bp.
(B) Similarity plot in the spike gene (positions 1–1,658) between SARS-CoV-2 and representative viruses listed in Table 1. The analysis was performed using Simplot, with a window size of 150 bp and a step size of 5 bp.
See also Table S3.
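The window and step sizes in the caption describe a sliding-window scan along an alignment. The Python sketch below illustrates the idea with plain percent identity per window; it is not a reimplementation of Simplot, which may apply a distance model that the caption does not specify. The function name `sliding_identity`, the skipping of gapped columns, and the use of the 0-based window midpoint as the x coordinate are assumptions for illustration.

```python
from typing import List, Tuple


def sliding_identity(
    query: str, reference: str, window: int = 1000, step: int = 100
) -> List[Tuple[int, float]]:
    """Return (window midpoint, percent identity) pairs along two aligned sequences.

    Assumptions: inputs are pre-aligned and of equal length, gapped columns
    are skipped, and midpoints are 0-based alignment coordinates.
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")

    points = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        # Keep only columns where both sequences have a residue.
        pairs = [(a, b) for a, b in zip(q, r) if a != "-" and b != "-"]
        if not pairs:
            continue  # window is entirely gaps in at least one sequence
        identity = 100.0 * sum(a == b for a, b in pairs) / len(pairs)
        points.append((start + window // 2, identity))
    return points


if __name__ == "__main__":
    # Toy example with a small window; panel (A) uses window=1000, step=100.
    for midpoint, identity in sliding_identity("A" * 20, "A" * 10 + "C" * 10, 10, 5):
        print(midpoint, identity)
```

A smaller window and step, such as the 150 bp and 5 bp used for the spike gene in panel (B), give finer resolution at the cost of noisier per-window values.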
Figure 2. Homology Modeling of the RBD Structures and Molecular Characterizations of the S1/S2 Cleavage Site of RmYN02 and Representative Beta-CoVs
(A–D) Homology modeling and structural comparison of the RBD structures of RmYN02 and representative beta-CoVs, including (A) RmYN02, (B) RaTG13, (C) pangolin/MP789/2019, and (D) pangolin/GX/P5L/2017. The three-dimensional structures of the RBD from Bat-SL-CoV RmYN02, RaTG13, pangolin/MP789/2019, and pangolin/GX/P5L/2017 were modeled using the Swiss-Model program [13] employing the RBD of SARS-CoV (PDB: 2DD8) as a template. All the core subdomains are colored magenta, and the external subdomains of RmYN02, RaTG13, pangolin/MP789/2019, and pangolin/GX/P5L/2017 are colored cyan, green, orange, and yellow, respectively. The conserved disulfide bond in RaTG13, pangolin/GD, and pangolin/GX is highlighted, while it is missing in RmYN02 due to a sequence deletion.
(E and F) Superimposition of the RBD structure of pangolin/MP789/2019 (E) and RmYN02 (F) with that of SARS-CoV-2. The two deletions located in respective loops in RmYN02 are highlighted using dotted circles.
(G) Molecular characterizations of the RBD of RmYN02 and the representative beta-CoVs.
(H) Molecular characterizations of the cleavage site of RmYN02 and the representative beta-CoVs.
See also Figures S2 and S3 and Table S2.
Figure 3. Phylogenetic Analysis of SARS-CoV-2 and Representative Viruses from the Subgenus Sarbecoronavirus
(A) Phylogenetic tree of the full-length virus genome.
(B) The S gene.
(C) The RBD.
(D) The RdRp.
Phylogenetic analysis was performed using RAxML [21] with 1,000 bootstrap replicates, employing the GTR nucleotide substitution model. RBD is delimited as the gene region 991–1,572 of the spike gene according to [6]. All the trees are midpoint rooted for clarity.
See also Figure S4.
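The RBD delimitation in the caption can be read as a simple coordinate slice. The Python sketch below assumes that positions 991–1,572 are 1-based, inclusive nucleotide coordinates within the spike gene; the caption does not say which reference sequence or alignment the numbering refers to, so that is left to the caller. The helper name `extract_region` is hypothetical.

```python
def extract_region(gene: str, start: int, end: int) -> str:
    """Return the subsequence from start to end, using 1-based inclusive coordinates."""
    if not (1 <= start <= end <= len(gene)):
        raise ValueError("coordinates fall outside the sequence")
    # Convert 1-based inclusive coordinates to a 0-based half-open Python slice.
    return gene[start - 1:end]


if __name__ == "__main__":
    # Placeholder sequence standing in for a spike gene; not real data.
    spike_gene = "ACG" * 1300
    rbd = extract_region(spike_gene, 991, 1572)
    print(len(rbd))       # number of nucleotides in the region
    print(len(rbd) // 3)  # number of complete codons
```

Under this reading the region spans 582 nucleotides, and because position 991 is the first base of a codon when the gene is numbered from its start codon, it covers 194 complete codons.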
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| BetaCoV/bat/Yunnan/RmYN01/2019 | This manuscript | N/A |
| BetaCoV/bat/Yunnan/RmYN02/2019 | This manuscript | N/A |
| Samples are provided in the | This manuscript | N/A |
| RNAlater Stabilization Solution | Invitrogen | Cat#AM7021 |
| RNAiso Plus reagent | TAKARA | Cat#9109 |
| ReverTra Ace qPCR RT Kit | TOYOBO | Cat#FSQ-101 |
| PerfectStart II Probe qPCR SuperMix | TransGen | Cat#AQ711 |
| TransStart Tip Green qPCR SuperMix | TransGen | Cat#AQ141 |
| | AG | Cat#AG11411 |
| RNeasy Mini Kit | QIAGEN | Cat# 74104 |
| Raw and analyzed data | This manuscript | NMDC1001304; China National Microbiological Data Center |
| RmYN01 genome | This manuscript | EPI_ISL_412976/NMDC60013004-01; GISAID/China National Microbiological Data Center |
| RmYN02 genome | This manuscript | EPI_ISL_412977/NMDC60013004-02; GISAID/China National Microbiological Data Center |
| partial sequence of spike gene of RmYN02 | This manuscript | NMDCN0000001; China National Microbiological Data Center |
| partial sequence of RdRp gene of RmYN02 | This manuscript | NMDCN0000002; China National Microbiological Data Center |
| partial sequence of cytochrome b (cytb) gene of Rhinolophus malayanus isolate YN-190625 | This manuscript | NMDCN0000003; China National Microbiological Data Center |
| SARS-CoV-2 reference genome sequences from databases are provided in the | GenBank / GISAID | N/A |
| Primer sequences are provided in the | This manuscript | N/A |
| Fastp v0.20.0 | [ | |
| Bowtie2 v2.3.3.1 | [ | |
| Trinity v2.5.1 | [ | |
| Geneious v11.1.5 | The Biomatters development team | |
| PeHaplo | [ | |
| MAFFT v7.450 | [ | |
| RAxML v8.1.6 | [ | |
| MrBayes v3.2.6 | [ | |
| Simplot v3.5.1 | [ | |
| SWISS-MODEL | [ | |
| Sequencing systems | Illumina | NovaSeq 6000 |