SARS-CoV-2: Structural diversity, phylogeny, and potential animal host identification of spike glycoprotein.

Siarhei Alexander Dabravolski, Yury Kazimirovich Kavalionak

Abstract

To investigate the evolutionary history of the current pandemic outbreak of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we analyzed a total of 137 coronavirus genomes with release dates between January 2019 and 25 March 2020. To identify the potential intermediate host of SARS-CoV-2, we analyzed spike glycoprotein sequences from different animals, with particular emphasis on bats. We performed phylogenetic analysis and structural reconstruction of the spike glycoproteins, followed by alignment and comparison. Our phylogenetic results revealed that SARS-CoV-2 was closely related to two bat betacoronavirus isolates: HKU5-related from Pipistrellus abramus and HKU4-related from Tylonycteris pachypus. In the comparison of the structural models of spike glycoproteins, however, the closest match was a yak betacoronavirus strain, YAK/HY24/CH/2017. Interestingly, a set of unique features has been described for this particular strain. Therefore, our results suggest that the human SARS-CoV-2, responsible for the current outbreak of COVID-19, could also have come from yak as an intermediate host.
© 2020 Wiley Periodicals LLC.

Keywords:  COVID-19; SARS-CoV-2; betacoronavirus; spike glycoprotein

Year: 2020; PMID: 32374452; PMCID: PMC7267556; DOI: 10.1002/jmv.25976

Source DB: PubMed; Journal: J Med Virol; ISSN: 0146-6615; Impact factor: 2.327


INTRODUCTION

The novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2), is associated with the current pandemic outbreak of COVID‐19. The virus emerged in Wuhan (China) and can be transmitted from person to person. Early patients had visited a local seafood market selling various live animals, from where this zoonotic disease was suspected to have spread. To date, there have been 1 812 734 confirmed cases and 113 675 deaths across 202 countries according to WHO (https://www.who.int/). Successful drug and vaccine development requires a deep understanding of the virus phylogeny, evolutionary origin, and the source of zoonotic transmission. Preventing outbreaks of this type in the future should also be a top‐priority task for researchers.
Substantial efforts have been made to identify the animal source of SARS‐CoV‐2. Several recent studies have traced the origin of SARS‐CoV‐2 to bats; however, other intermediate hosts have also been suggested, such as snakes, pangolins, or other mammals and birds. The genome of SARS‐CoV‐2 contains two open reading frames (ORFs), ORF1 and ORF2, encoding two polyproteins, which are responsible for viral genome maintenance after cleavage. There is also a set of so‐called structural proteins, such as the spike glycoprotein, the envelope protein, the membrane protein, and the nucleocapsid. One of these, the spike surface glycoprotein (“spike glycoprotein”), is one of the primary therapeutic targets. This protein plays an important role in binding to receptors on the host cell and in fusion of the host and viral membranes, and it is a target for antibodies (reviewed in Fung and Liu).
In this study, we examined 137 genomes to establish the relationship between spike glycoproteins from different coronaviruses, combining phylogenetic tree analysis (reconstructed from the translated sequences) with a comparison of structural models. The phylogenetic study confirmed the close relatedness of SARS‐CoV‐2 to bat coronaviruses. In contrast, the comparison of the structural models identified a yak (Bos grunniens) betacoronavirus as the closest match.

METHODS

Sequence retrieval and phylogenetic analysis

Recently released complete genome sequences were downloaded from NCBI for the analysis (listed in Table S1). Spike glycoprotein ORFs (nucleotide and protein sequences) were retrieved with NCBI ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/), and conserved domains were checked with CD‐Search (NCBI). Complete translated ORFs were aligned with MUSCLE. Substitution model tests and phylogenetic analysis were carried out in MEGA X. The neighbor‐joining method and the JTT substitution model were selected, assuming an estimated proportion of invariant sites and four gamma‐distributed rate categories to account for rate heterogeneity across sites. The gamma shape parameter was estimated directly from the data. The reliability of internal branches was assessed by bootstrapping (1000 replicates). Two spike glycoproteins from gammacoronaviruses were used as an outgroup.
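
As a rough illustration of the bootstrapping step, the sketch below resamples alignment columns with replacement and checks how often one sequence stays closer to a second than to a third. It is not the MEGA X procedure: a simple p‐distance stands in for the JTT model with gamma rates, no tree is built, and the toy sequences and all function names are hypothetical.

```python
import random

def p_distance(seq_a, seq_b):
    """Fraction of differing residues over aligned positions without gaps."""
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return sum(a != b for a, b in pairs) / len(pairs)

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns) for name, seq in alignment.items()}

def bootstrap_support(alignment, target, near, far, replicates=1000, seed=0):
    """Share of replicates in which `target` is closer to `near` than to `far`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        sample = bootstrap_replicate(alignment, rng)
        d_near = p_distance(sample[target], sample[near])
        d_far = p_distance(sample[target], sample[far])
        if d_near < d_far:
            hits += 1
    return hits / replicates

# Hypothetical aligned protein fragments of equal length (not real spike data)
toy_alignment = {
    "virus_A": "MFVFLVLLPLVSSQCV",
    "virus_B": "MFVFLVLLPLVSGQ-V",
    "virus_C": "MKLLVLAIALASAQCV",
}

support = bootstrap_support(toy_alignment, "virus_A", "virus_B", "virus_C")
print(f"Replicates grouping A with B rather than C: {support:.2f}")
```

In a full analysis, each replicate would be used to rebuild the tree, and the support of an internal branch would be the share of replicate trees that contain it.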

Structure modeling and comparison

Structural models of the spike glycoproteins (Table S1, marked with *) were built using SWISS‐MODEL. Predicted structures were refined with the online tool 3Drefine (http://sysbio.rnet.missouri.edu/3Drefine/) and verified using QMEAN (https://swissmodel.expasy.org/qmean/). The iPBA web server (https://www.dsimb.inserm.fr/dsimb_tools/ipba/index.php) was used to align the PDB structures. The quality of the structure alignments was evaluated using the root mean square deviation (RMSD) and the normalized score. UCSF Chimera was used for structure visualization.
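
For readers unfamiliar with the RMSD metric, the following minimal sketch computes it as the square root of the mean squared distance between paired atoms. It assumes the atoms are already paired and the two structures already superimposed, which iPBA handles internally; the iPBA normalized score is not reproduced here, and the coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes atoms are already paired and the structures superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Hypothetical C-alpha coordinates in angstroms (not taken from the models)
model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]

# Every atom is displaced by exactly 1 angstrom, so the RMSD is 1.0
print(rmsd(model_a, model_b))
```

Lower values indicate closer backbone agreement; identical coordinate sets give an RMSD of 0.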

RESULTS

Domain architecture

Altogether, 137 genomes were analyzed. Our primary focus was on severe acute respiratory syndrome‐related coronavirus genomes (n = 92) released between 2019 and 25 March 2020 (NCBI). Middle East respiratory syndrome‐related coronavirus (hereafter MERS) genomes (n = 18) were used to check the reliability of our in silico approaches. The full set covered coronaviruses of different taxonomic origin, host, and release date (Table S1). Four conserved domains were identified, listed here from the N‐terminus to the C‐terminus: spike glycoprotein N‐terminal domain (pfam16451), spike receptor‐binding domain (pfam09408), coronavirus S1 glycoprotein (pfam01600), and coronavirus S2 glycoprotein (pfam01601) (Figure S1). Surprisingly, none of the analyzed genomes exhibited the domain architecture known from viruses studied earlier (top line in Figures S1 and S2). Alphacoronaviruses and gammacoronaviruses share an architecture of two domains: coronavirus S1 glycoprotein and coronavirus S2 glycoprotein. In contrast, all analyzed betacoronaviruses have a full or partial spike receptor‐binding domain. Interestingly, only betacoronaviruses from human and yak hosts have the spike glycoprotein N‐terminal domain, which is absent in those from all other analyzed hosts.
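
The comparison of domain architectures can be sketched as a small lookup, shown below under stated assumptions. The Pfam accessions and their N‐ to C‐terminal order come from the text above; the function name and the idea of passing a plain list of accessions (rather than parsed CD‐Search output) are illustrative choices, not part of the original workflow.

```python
# Pfam accessions in N- to C-terminal order, as listed in the text
CANONICAL_ORDER = ["pfam16451", "pfam09408", "pfam01600", "pfam01601"]

DOMAIN_NAMES = {
    "pfam16451": "spike glycoprotein N-terminal domain",
    "pfam09408": "spike receptor-binding domain",
    "pfam01600": "coronavirus S1 glycoprotein",
    "pfam01601": "coronavirus S2 glycoprotein",
}

def summarize_architecture(hits):
    """Split the four canonical domains into those found and those missing.

    `hits` is any iterable of Pfam accessions detected in one spike ORF
    (a hypothetical input format chosen for this sketch).
    """
    found = set(hits)
    present = [DOMAIN_NAMES[d] for d in CANONICAL_ORDER if d in found]
    missing = [DOMAIN_NAMES[d] for d in CANONICAL_ORDER if d not in found]
    return present, missing

# The two-domain architecture described for alpha- and gammacoronaviruses
present, missing = summarize_architecture(["pfam01601", "pfam01600"])
print("present:", " > ".join(present))
print("missing:", ", ".join(missing))
```

Reporting the missing domains alongside the present ones makes partial architectures easy to compare across hosts.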

Phylogenetic analysis

To gain insight into the phylogenetic relationships between spike glycoproteins from alphacoronaviruses, betacoronaviruses, and gammacoronaviruses of different hosts, we generated a phylogenetic tree from a multiple alignment of the 137 extracted sequences (Figure S3). The tree was rooted with the outgroup (gammacoronaviruses). As expected, all SARS‐CoV‐2 spike glycoproteins (with humans as a host) were almost identical and formed a distinct cluster. Similarly, all MERS sequences from human hosts were nearly identical and clustered together with camel MERS isolates from Kenya, with MERS isolates from Egyptian camels closely related. Two bat betacoronavirus isolates, HKU5‐ and HKU4‐related (MN611520.1 Pipistrellus abramus and MN611519.1 Tylonycteris pachypus, respectively), were located between the SARS‐CoV‐2 and MERS clusters. Two other branches formed two well‐distinguished clusters of alphacoronavirus and betacoronavirus isolates. Alphacoronaviruses formed host‐dependent subclusters (humans and bats). In contrast, the other betacoronaviruses (HKU1 and OC43) were closely related to the yak isolate.

Comparison of structural models

To better understand the relationships between spike glycoproteins at the structural level, we built protein models with the homology‐based server SWISS‐MODEL (Table S1, marked with *). The models were evaluated and verified (Table S2). Next, the iPBA web server, which aligns structures by comparing local backbone conformations, was used to match the human host‐derived SARS‐CoV‐2 and MERS spike glycoproteins to models derived from other coronaviruses and hosts (Table 1). MERS betacoronavirus was used as a control data set because it is well studied in both hosts (human and camel) and the close genetic relationship between them is well characterized. As expected, human host‐derived MERS showed the highest similarity to that from camels. In contrast, the SARS‐CoV‐2 spike glycoprotein showed the highest similarity to the yak‐derived betacoronavirus (Table 1 and Figure 1B).
Table 1

Comparison of the spike glycoprotein models

| Virus species | Host | SARS‐CoV‐2 (a): normalized score | SARS‐CoV‐2 (a): RMSD | MERS (b): normalized score | MERS (b): RMSD |
| --- | --- | --- | --- | --- | --- |
| Betacoronavirus 1 | Bos grunniens | 196.04 | 2.65 | −62.33 | 2.71 |
| Pipistrellus abramus bat coronavirus HKU5‐related | P abramus | 147.33 | 2.55 | 57.17 | 2.41 |
| Tylonycteris pachypus bat coronavirus HKU4‐related | T pachypus | 141.90 | 1.60 | −60.25 | 2.85 |
| Middle East respiratory syndrome‐related coronavirus | Camelus dromedarius | −81.54 | 2.70 | 606.53 | 0.03 |
| Miniopterus pusillus bat coronavirus HKU8‐related | M pusillus | 50.15 | 2.99 | −125.14 | 3.17 |
| Hipposideros pomona bat coronavirus HKU10‐related | H pomona | −45.27 | 3.01 | −110.03 | 2.92 |
| Avian coronavirus | Tadorna tadornoides | 87.22 | 2.79 | −133.14 | 3.11 |
| Miniopterus schreibersii bat coronavirus 1‐related | M schreibersii | 23.50 | 3.01 | −154.62 | 2.73 |
| Hipposideros pomona bat coronavirus CHB25 | H larvatus | 40.54 | 2.93 | −125.29 | 3.13 |
| Scotophilus kuhlii bat coronavirus 512‐related | S kuhlii | 51.53 | 2.74 | −136.46 | 3.23 |

Abbreviation: RMSD, root mean square deviation.

(a) Severe acute respiratory syndrome‐related coronavirus, host: Homo sapiens.

(b) Middle East respiratory syndrome‐related coronavirus, host: H sapiens.
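
To make the reading of Table 1 explicit, the sketch below loads the table values and picks, for each query, the model with the highest normalized score and the model with the lowest RMSD. The values are transcribed from Table 1; the field and function names are hypothetical, and treating "highest normalized score" as the closest match follows the interpretation used in the text.

```python
from collections import namedtuple

Row = namedtuple("Row", "virus host sars2_score sars2_rmsd mers_score mers_rmsd")

# Values transcribed from Table 1
TABLE_1 = [
    Row("Betacoronavirus 1", "Bos grunniens", 196.04, 2.65, -62.33, 2.71),
    Row("Bat coronavirus HKU5-related", "P abramus", 147.33, 2.55, 57.17, 2.41),
    Row("Bat coronavirus HKU4-related", "T pachypus", 141.90, 1.60, -60.25, 2.85),
    Row("MERS-related coronavirus", "Camelus dromedarius", -81.54, 2.70, 606.53, 0.03),
    Row("Bat coronavirus HKU8-related", "M pusillus", 50.15, 2.99, -125.14, 3.17),
    Row("Bat coronavirus HKU10-related", "H pomona", -45.27, 3.01, -110.03, 2.92),
    Row("Avian coronavirus", "Tadorna tadornoides", 87.22, 2.79, -133.14, 3.11),
    Row("Bat coronavirus 1-related", "M schreibersii", 23.50, 3.01, -154.62, 2.73),
    Row("Bat coronavirus CHB25", "H larvatus", 40.54, 2.93, -125.29, 3.13),
    Row("Bat coronavirus 512-related", "S kuhlii", 51.53, 2.74, -136.46, 3.23),
]

def highest_score(rows, field):
    """Row with the largest value in the given normalized-score column."""
    return max(rows, key=lambda row: getattr(row, field))

def lowest_rmsd(rows, field):
    """Row with the smallest value in the given RMSD column."""
    return min(rows, key=lambda row: getattr(row, field))

for label, score_field, rmsd_field in [
    ("SARS-CoV-2", "sars2_score", "sars2_rmsd"),
    ("MERS", "mers_score", "mers_rmsd"),
]:
    by_score = highest_score(TABLE_1, score_field)
    by_rmsd = lowest_rmsd(TABLE_1, rmsd_field)
    print(f"{label}: highest score -> {by_score.host}; lowest RMSD -> {by_rmsd.host}")
```

With the Table 1 values, the highest normalized score points to Bos grunniens for SARS‐CoV‐2 and to Camelus dromedarius for MERS, whereas the lowest RMSD in the SARS‐CoV‐2 column belongs to the HKU4‐related model (1.60), so the two metrics need not rank the models identically.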

Figure 1

Graphical representation of compared structural models. (A) MERS from the human host (red), MERS from camel host (blue); (B) SARS‐CoV‐2 from the human host (red), betacoronavirus from Bos grunniens (blue); (C) SARS‐CoV‐2 from the human host (red), Pipistrellus abramus bat coronavirus HKU5‐related (blue); (D) SARS‐CoV‐2 from the human host (red), Tylonycteris pachypus bat coronavirus HKU4‐related (blue). MERS, Middle East respiratory syndrome; SARS‐CoV‐2, severe acute respiratory syndrome coronavirus 2.

DISCUSSION

Proper classification and identification of the spike domains remain a matter of discussion. Earlier papers describe the spike glycoprotein as cleaved in the middle into S1 and S2 domains, each further subdivided into N‐terminal and C‐terminal domains. Based on the current Conserved Domain Database (CDD) output, we concluded that the N‐terminal part of S1 corresponds to the spike glycoprotein N‐terminal domain (pfam16451), the C‐terminal part of S1 corresponds to the spike receptor‐binding domain (pfam09408), and the S2 domain contains coronavirus S1 glycoprotein (pfam01600) and coronavirus S2 glycoprotein (pfam01601) as subdomains (Figure S1). The canonical spike glycoprotein thus contains four domains, each with a specific function. Both the spike glycoprotein N‐terminal domain (pfam16451) and the spike receptor‐binding domain (pfam09408) are known to participate in specific receptor binding. The N‐terminal domain binds carcinoembryonic antigen‐related cell adhesion molecule 1 (CEACAM1) in mouse hepatitis coronavirus and binds sugar in porcine transmissible gastroenteritis virus. The spike receptor‐binding domain binds aminopeptidase N or angiotensin‐converting enzyme 2 (ACE2) in coronaviruses. An interplay between coronavirus S1 glycoprotein (pfam01600) and coronavirus S2 glycoprotein (pfam01601) is required for the attachment of spike to susceptible tissues and subsequent fusion.
The phylogenetic data reported above show that the new human‐derived SARS‐CoV‐2 spike glycoproteins cluster with two betacoronaviruses, HKU4‐ and HKU5‐related, derived from the hosts T pachypus and P abramus, respectively. Also, as seen from the phylogenetic tree (Figure S3), these two sequences deviate from the other bat coronavirus sequences, suggesting that these bat coronaviruses are homologous and genetically more similar to human‐derived SARS‐CoV‐2 than to the other bat coronaviruses. In general, these data support phylogenetic results obtained in previous studies based on (a) the whole genome; (b) the nonstructural proteins NS7b and NS8; (c) the spike glycoprotein; and (d) the nucleocapsid protein.
For the comparison of structural alignments, MERS was set as a control data set because of the well‐characterized close relationship between human‐ and camel‐derived strains. In contrast to the phylogenetic results, alignment and comparison of the SARS‐CoV‐2 structural models revealed a close relationship to the yak betacoronavirus (B grunniens) (Table 1), while the bat‐derived HKU4‐ and HKU5‐related betacoronaviruses had lower matching scores. Bovine coronavirus is a zoonotically transmissible infection of domestic and wild ruminants with worldwide spread; it is known to cause severe diarrhea in neonates, dysentery in adults, and respiratory disease in animals of all ages, and it could also infect humans. Our conclusion is also supported by a recent report on ACE2‐spike glycoprotein complexes, which suggests Bovidae as one of the potential intermediate hosts. Interestingly, the identified bovine coronavirus strain (YAK/HY24/CH/2017) has unique amino acid variation in the S gene that represents an uncommon adaptive evolution pathway of unknown biological meaning.
It should be noted that many bat species identified in Europe are natural hosts for many viruses, including coronaviruses and, in particular, SARS‐like coronaviruses. Unexplored natural reservoirs of viruses could pose potential threats to public health. This possibility also raises the question of unique transmission channels specific to each region.

CONCLUSION

In conclusion, the results of our phylogenetic study support the view that the infection originated from bats. Additionally, the comparison of structural models suggests an additional intermediate host, yak (B grunniens), that could transmit bat coronavirus to human hosts. We also wish to emphasize the importance of further investigations into the evolution of the spike glycoprotein. Such work could help to limit current SARS‐CoV‐2 transmission and to prevent zoonotic disease outbreaks of this type in the future.

CONFLICT OF INTERESTS

The authors declare that there are no conflicts of interest.
REFERENCES (10 of 24 listed)

1. Eric F Pettersen; Thomas D Goddard; Conrad C Huang; Gregory S Couch; Daniel M Greenblatt; Elaine C Meng; Thomas E Ferrin. UCSF Chimera--a visualization system for exploratory research and analysis. J Comput Chem. 2004-10.
2. To Sing Fung; Ding Xiang Liu. Human Coronavirus: Host-Pathogen Interaction. Annu Rev Microbiol. 2019-06-21.
3. Kailang Wu; Weikai Li; Guiqing Peng; Fang Li. Crystal structure of NL63 respiratory coronavirus receptor-binding domain complexed with its human receptor. Proc Natl Acad Sci U S A. 2009-11-09.
4. Guiqing Peng; Dawei Sun; Kanagalaghatta R Rajashankar; Zhaohui Qian; Kathryn V Holmes; Fang Li. Crystal structure of mouse coronavirus receptor-binding domain complexed with its murine receptor. Proc Natl Acad Sci U S A. 2011-06-13.
5. Andrea Balboni; Laura Gallina; Alessandra Palladini; Santino Prosperi; Mara Battilani. A real-time PCR assay for bat SARS-like coronavirus detection and its application to Italian greater horseshoe bat faecal sample surveys. ScientificWorldJournal. 2011-11-22.
6. Francesca Rizzo; Kathryn M Edenborough; Roberto Toffoli; Paola Culasso; Simona Zoppi; Alessandro Dondo; Serena Robetto; Sergio Rosati; Angelika Lander; Andreas Kurth; Riccardo Orusa; Luigi Bertolotti; Maria Lucia Mandola. Coronavirus and paramyxovirus in bats from Northwest Italy. BMC Vet Res. 2017-12-22.
7. Andrew Waterhouse; Martino Bertoni; Stefan Bienert; Gabriel Studer; Gerardo Tauriello; Rafal Gumienny; Florian T Heer; Tjaart A P de Beer; Christine Rempfer; Lorenza Bordoli; Rosalba Lepore; Torsten Schwede. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018-07-02.
8. Ping Liu; Wu Chen; Jin-Ping Chen. Viral Metagenomics Revealed Sendai Virus and Coronavirus Infection of Malayan Pangolins (Manis javanica). Viruses. 2019-10-24.
9. Tohru Suzuki; Yoshihiro Otake; Satoko Uchimoto; Ayako Hasebe; Yusuke Goto. Genomic Characterization and Phylogenetic Classification of Bovine Coronaviruses Through Whole Genome Sequence Analysis. Viruses. 2020-02-06.
10. Junwen Luan; Xiaolu Jin; Yue Lu; Leiliang Zhang. SARS-CoV-2 spike protein favors ACE2 from Bovidae and Cricetidae. J Med Virol. 2020-04-10.
