
Genome Composition and Divergence of the Novel Coronavirus (2019-nCoV) Originating in China.

Aiping Wu1, Yousong Peng2, Baoying Huang3, Xiao Ding1, Xianyue Wang1, Peihua Niu3, Jing Meng1, Zhaozhong Zhu2, Zheng Zhang2, Jiangyuan Wang1, Jie Sheng1, Lijun Quan4, Zanxian Xia5, Wenjie Tan6, Genhong Cheng7, Taijiao Jiang8.   

Abstract

An in-depth annotation of the newly discovered coronavirus (2019-nCoV) genome has revealed differences between 2019-nCoV and severe acute respiratory syndrome (SARS) or SARS-like coronaviruses. A systematic comparison identified 380 amino acid substitutions between these coronaviruses, which may have caused functional and pathogenic divergence of 2019-nCoV.
Copyright © 2020 Elsevier Inc. All rights reserved.

Year:  2020        PMID: 32035028      PMCID: PMC7154514          DOI: 10.1016/j.chom.2020.02.001

Source DB:  PubMed          Journal:  Cell Host Microbe        ISSN: 1931-3128            Impact factor:   21.023


Main Text

A novel coronavirus (CoV) named “2019 novel coronavirus” or “2019-nCoV” by the World Health Organization (WHO) is responsible for the recent pneumonia outbreak that started in early December, 2019 in Wuhan City, Hubei Province, China (Huang et al., 2020, Zhou et al., 2020, Zhu et al., 2020). This outbreak is associated with a large seafood and animal market, and investigations are ongoing to determine the origins of the infection. To date, thousands of human infections have been confirmed in China along with many exported cases across the globe (China CDC, 2020). Coronaviruses mainly cause respiratory and gastrointestinal tract infections and are genetically classified into four major genera: Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus (Li, 2016). The former two genera primarily infect mammals, whereas the latter two predominantly infect birds (Tang et al., 2015). Six kinds of human CoVs have been previously identified. These include HCoV-NL63 and HCoV-229E, which belong to the Alphacoronavirus genus; and HCoV-OC43, HCoV-HKU1, severe acute respiratory syndrome coronavirus (SARS-CoV), and Middle East respiratory syndrome coronavirus (MERS-CoV), which belong to the Betacoronavirus genus (Tang et al., 2015). Coronaviruses did not attract worldwide attention until the 2003 SARS pandemic, followed by the 2012 MERS and, most recently, the 2019-nCoV outbreaks (China CDC, 2020, Song et al., 2019). SARS-CoV and MERS-CoV are considered highly pathogenic (Cui et al., 2019), and it is very likely that both SARS-CoV and MERS-CoV were transmitted from bats to palm civets (Guan et al., 2003) or dromedary camels (Drosten et al., 2014), and finally to humans (Cui et al., 2019). The genome of coronaviruses, whose size ranges between approximately 26,000 and 32,000 bases, includes a variable number (from 6 to 11) of open reading frames (ORFs) (Song et al., 2019). 
The first ORF representing approximately 67% of the entire genome encodes 16 non-structural proteins (nsps), while the remaining ORFs encode accessory proteins and structural proteins (Cui et al., 2019). The four major structural proteins are the spike surface glycoprotein (S), small envelope protein (E), matrix protein (M), and nucleocapsid protein (N). The spike surface glycoprotein plays an essential role in binding to receptors on the host cell and determines host tropism (Li, 2016, Zhu et al., 2018). The spike proteins of SARS-CoV and MERS-CoV bind to different host receptors via different receptor-binding domains (RBDs). SARS-CoV uses angiotensin-converting enzyme 2 (ACE2) as one of the main receptors (Ge et al., 2013) with CD209L as an alternative receptor (Jeffers et al., 2004), whereas MERS-CoV uses dipeptidyl peptidase 4 (DPP4, also known as CD26) as the primary receptor. Initial analysis suggested that 2019-nCoV has a close evolutionary association with the SARS-like bat coronaviruses (Zhou et al., 2020). Here, based on the first three determined genomes of the novel coronavirus (2019-nCoV), namely Wuhan/IVDC-HB-01/2019 (GISAID accession ID: EPI_ISL_402119) (HB01), Wuhan/IVDC-HB-04/2019 (EPI_ISL_402120) (HB04), and Wuhan/IVDC-HB-05/2019 (EPI_ISL_402121) (HB05), an in-depth genome annotation of this virus was performed with a comparison to related coronaviruses, including 1,008 human SARS-CoV, 338 bat SARS-like CoV, and 3,131 human MERS-CoV, whose genomes were published before January 12, 2020 (release date: September 12, 2019) from Virus Pathogen Database and Analysis Resource (ViPR) (http://www.viprbrc.org/) and NCBI. Comparison of genomes of these three strains showed that they are almost identical, with only five nucleotide differences in the genome of ~29.8 kb nucleotides (Figure S1). The 2019-nCoV genome was annotated to possess 14 ORFs encoding 27 proteins (Figure 1 A and Tables S1A and S1B). 
The orf1ab and orf1a genes, located at the 5′-terminus of the genome, encode the pp1ab and pp1a proteins, respectively. Together they comprise 15 nsps, including nsp1 to nsp10 and nsp12 to nsp16 (Figure 1A and Table S1B). The 3′-terminus of the genome encodes four structural proteins (S, E, M, and N) and eight accessory proteins (3a, 3b, p6, 7a, 7b, 8b, 9b, and orf14). At the amino acid level, 2019-nCoV is quite similar to SARS-CoV, but there are some notable differences. For example, the 8a protein is present in SARS-CoV and absent in 2019-nCoV; the 8b protein is 84 amino acids in SARS-CoV, but longer in 2019-nCoV, with 121 amino acids; the 3b protein is 154 amino acids in SARS-CoV, but shorter in 2019-nCoV, with only 22 amino acids (Table S1A). Further studies are needed to characterize how these differences affect the functionality and pathogenesis of 2019-nCoV.
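The strain-to-strain comparison described above (counting nucleotide differences between nearly identical genomes) can be illustrated with a minimal Python sketch. It assumes the genomes have already been aligned to equal length; the function names and the toy sequences are hypothetical and are not the authors' pipeline or real genome data.

```python
from itertools import combinations


def count_differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def pairwise_differences(genomes):
    """Map each unordered pair of strain names to its difference count."""
    return {
        (x, y): count_differences(genomes[x], genomes[y])
        for x, y in combinations(sorted(genomes), 2)
    }


# Toy aligned sequences for illustration only (NOT real genome data).
toy_genomes = {
    "strain_a": "ATGGCTAGCT",
    "strain_b": "ATGGCTAGCA",
    "strain_c": "ATGACTAGCT",
}
toy_result = pairwise_differences(toy_genomes)
```

Applied to real data, the input would be the aligned full-length genomes of the three strains rather than these ten-base toy strings.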
Figure 1

Genome composition and phylogenetic tree for 2019-nCoV

(A) Schematic diagram of the genome organization and the encoded proteins of pp1ab and pp1a for the IVDC-HB-01/2019 (HB01) strain. The largest gene, orf1ab, encodes the pp1ab protein, which contains 15 nsps (nsp1-nsp10 and nsp12-nsp16). The pp1a protein, encoded by the orf1a gene, contains 10 nsps (nsp1-nsp10). Structural proteins are encoded by the four structural genes, including spike (S), envelope (E), membrane (M), and nucleocapsid (N) genes. The accessory genes are distributed among the structural genes. The protein-encoding genes of the 2019-nCoV genome were predicted with the online servers GeneMarkS (http://exon.gatech.edu/GeneMark/genemarks.cgi) and ORFfinder (https://www.ncbi.nlm.nih.gov/orffinder/), followed by manual checking.

(B) Phylogenetic relationship based on the whole genome for the HB01 strain and other coronaviruses. All viral strains were classified by the genus and the type, which are presented on the left and right schematic phylogenetic trees, respectively. The four genera of the coronaviruses, including Alphacoronavirus (red), Betacoronavirus (blue), Gammacoronavirus (green), and Deltacoronavirus (violet) are blocked in the left phylogenetic tree. The MERS coronavirus (brown), the SARS-like bat coronavirus (violet), human SARS coronavirus (light blue), and the HB01 strain (red) are highlighted by lines of different colors in the right phylogenetic tree.

(C) Schematic phylogenetic trees of individual genes for the HB01 strain. The coronavirus species are colored in the same way as in (B). The number of strains in each phylogenetic clade is denoted by the area of the circles.
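The gene prediction mentioned in panel (A) can be illustrated, in a much simplified form, by a naive open reading frame scan. The sketch below is an assumption-laden toy: it searches only the forward strand for ATG-to-stop stretches in the three reading frames, and it does not reproduce the models used by GeneMarkS or ORFfinder. The function name, the `min_codons` threshold, and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs, 0-based and half-open, for
    ATG..stop ORFs on the forward strand (stop codon included)."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # position of the first ATG seen in the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it spans enough codons.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)


# Toy sequence for illustration only (NOT taken from a viral genome).
toy_sequence = "CCATGAAATTTTAGGG"
toy_orfs = find_orfs(toy_sequence)
```

Real annotation tools additionally score coding potential, consider both strands, and handle features such as ribosomal frameshifting, which is why the predictions in the figure were checked manually.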

As shown in a phylogenetic tree based on whole genomes (Figures 1B and S2) with the Molecular Evolutionary Genetics Analysis (MEGA) (version 7.0), the 2019-nCoV is in the same Betacoronavirus clade as MERS-CoV, SARS-like bat CoV, and SARS-CoV. The phylogenetic tree falls into two clades.
The Betacoronavirus genus constitutes one clade, while the Alphacoronavirus, Gammacoronavirus, and Deltacoronavirus genera constitute the other clade. The 2019-nCoV is parallel to the SARS-like bat CoVs, while the SARS-CoVs are descended from the SARS-like bat CoVs, indicating that 2019-nCoV is closer to the SARS-like bat CoVs than to the SARS-CoVs in terms of the whole genome sequence. Tables S1C and S1D also show that the genome of 2019-nCoV has the highest similarity with that of a SARS-like bat CoV (MG772933). In comparison, 2019-nCoV is distant from and less related to the MERS-CoVs. In terms of the encoded proteins of pp1ab, pp1a, envelope, matrix, accessory protein 7a, and nucleocapsid genes, phylogenetic analyses showed that the 2019-nCoV is closest to the SARS-like bat CoVs (Figure 1C and Table S1D). Regarding the spike gene, the 2019-nCoV is closest to the bat CoVs, while the 3a and 8b accessory genes are both closest to the SARS-CoVs. Although phylogenetic analyses for the whole genome and individual genes clearly show that the 2019-nCoV is most closely related to SARS-like bat viruses (Figures 1B and 1C), we did not find a single strain of a SARS-like bat virus that harbors all proteins with the most similarity to counterparts of the 2019-nCoV (Figures 1B and 1C). Given the close relationship between 2019-nCoV and SARS-CoVs or SARS-like bat CoVs (Figures 1B and 1C), an examination of the amino acid substitutions in different proteins could shed light on how 2019-nCoV differs structurally and functionally from SARS-CoVs. In total, there were 380 amino acid substitutions between the amino acid sequences of 2019-nCoV (HB01) and the corresponding consensus sequences of SARS and SARS-like viruses (Figure 2 and Tables S1E and S1F). No amino acid substitutions occurred in nonstructural protein 7 (nsp7), nsp13, envelope, matrix, or accessory proteins p6 and 8b (Table S1F). nsp3 and nsp2 contain 102 and 61 amino acid substitutions, respectively.
In addition, 27 amino acid substitutions were found in the spike protein, which has a length of 1,273 amino acids, including six substitutions in the RBD at amino acid region 357-528 and six substitutions in the underpinning subdomain (SD) at amino acid region 569-655. Moreover, four substitutions (Q560L, S570A, F572T, and S575A) in the C-terminus of the receptor-binding subunit S1 domain (Figure 2) are situated in two peptides previously reported to be antigens for SARS-CoV (Guo et al., 2004).
Figure 2

Amino Acid Substitutions of 2019-nCoV against SARS and SARS-like Viruses

All 27 proteins encoded by 2019-nCoV were aligned against those of SARS-CoVs and SARS-like bat CoVs using the FFT-NS-2 algorithm in MAFFT (version v7.407) (the numbers of aligned proteins are listed in Table S1E). An amino acid substitution was defined as a site that is absolutely conserved within the group of SARS and SARS-like CoVs but differs in 2019-nCoV. In total, 380 amino acid substitutions were identified between the amino acid sequences of 2019-nCoV (HB01) and the corresponding consensus sequences of SARS and SARS-like CoVs.
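The substitution definition in this caption can be expressed as a short Python sketch. It assumes the query protein and the reference proteins are already rows of one multiple sequence alignment (for example, MAFFT output) with equal lengths. Skipping columns that contain a gap is an assumption made here, not something the source states, and the function name and toy sequences are hypothetical.

```python
def conserved_substitutions(query, references, gap="-"):
    """Return (column, reference_residue, query_residue) tuples for columns
    where all references share one residue and the query differs.
    Columns are 1-based alignment columns, not protein coordinates."""
    if not references:
        raise ValueError("at least one reference sequence is required")
    if any(len(ref) != len(query) for ref in references):
        raise ValueError("all sequences must come from the same alignment")

    substitutions = []
    for col in range(len(query)):
        ref_residues = {ref[col] for ref in references}
        if len(ref_residues) != 1:
            continue  # site is not absolutely conserved in the reference group
        ref_residue = next(iter(ref_residues))
        query_residue = query[col]
        if gap in (ref_residue, query_residue):
            continue  # assumption: gapped columns are not counted
        if query_residue != ref_residue:
            substitutions.append((col + 1, ref_residue, query_residue))
    return substitutions


# Toy aligned peptides for illustration only (NOT real viral proteins).
toy_references = ["MKTAY", "MKSAY", "MKTAY"]
toy_query = "MRTAF"
toy_substitutions = conserved_substitutions(toy_query, toy_references)
```

In the toy input, column 3 varies among the references (T or S) and is therefore ignored, whereas columns 2 and 5 are conserved in the references and differ in the query.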

Due to very limited knowledge of this novel virus, we are unable to give reasonable explanations for the significant number of amino acid substitutions between the 2019-nCoV and SARS or SARS-like CoVs. For example, no amino acid substitutions were present in the receptor-binding motifs that directly interact with human receptor ACE2 protein in SARS-CoV (Ge et al., 2013), but six mutations occurred in the other region of the RBD. Whether these differences could affect the host tropism and transmission property of the 2019-nCoV compared to SARS-CoV is worthy of future investigation.
  12 in total

1.  Evidence for camel-to-human transmission of MERS coronavirus.

Authors:  Christian Drosten; Paul Kellam; Ziad A Memish
Journal:  N Engl J Med       Date:  2014-10-02       Impact factor: 91.245

2.  CD209L (L-SIGN) is a receptor for severe acute respiratory syndrome coronavirus.

Authors:  Scott A Jeffers; Sonia M Tusell; Laura Gillim-Ross; Erin M Hemmila; Jenna E Achenbach; Gregory J Babcock; William D Thomas; Larissa B Thackray; Mark D Young; Robert J Mason; Donna M Ambrosino; David E Wentworth; James C Demartini; Kathryn V Holmes
Journal:  Proc Natl Acad Sci U S A       Date:  2004-10-20       Impact factor: 11.205

3.  Structure, Function, and Evolution of Coronavirus Spike Proteins.

Authors:  Fang Li
Journal:  Annu Rev Virol       Date:  2016-08-25       Impact factor: 10.431

4.  Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China.

Authors:  Y Guan; B J Zheng; Y Q He; X L Liu; Z X Zhuang; C L Cheung; S W Luo; P H Li; L J Zhang; Y J Guan; K M Butt; K L Wong; K W Chan; W Lim; K F Shortridge; K Y Yuen; J S M Peiris; L L M Poon
Journal:  Science       Date:  2003-09-04       Impact factor: 47.728

5.  Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor.

Authors:  Xing-Yi Ge; Jia-Lu Li; Xing-Lou Yang; Aleksei A Chmura; Guangjian Zhu; Jonathan H Epstein; Jonna K Mazet; Ben Hu; Wei Zhang; Cheng Peng; Yu-Ji Zhang; Chu-Ming Luo; Bing Tan; Ning Wang; Yan Zhu; Gary Crameri; Shu-Yi Zhang; Lin-Fa Wang; Peter Daszak; Zheng-Li Shi
Journal:  Nature       Date:  2013-10-30       Impact factor: 49.962

6.  Origin and evolution of pathogenic coronaviruses.

Authors:  Jie Cui; Fang Li; Zheng-Li Shi
Journal:  Nat Rev Microbiol       Date:  2019-03       Impact factor: 60.633

7.  Predicting the receptor-binding domain usage of the coronavirus based on kmer frequency on spike protein.

Authors:  Zhaozhong Zhu; Zheng Zhang; Wenjun Chen; Zena Cai; Xingyi Ge; Haizhen Zhu; Taijiao Jiang; Wenjie Tan; Yousong Peng
Journal:  Infect Genet Evol       Date:  2018-04-04       Impact factor: 3.342

8.  Inferring the hosts of coronavirus using dual statistical models based on nucleotide composition.

Authors:  Qin Tang; Yulong Song; Mijuan Shi; Yingyin Cheng; Wanting Zhang; Xiao-Qin Xia
Journal:  Sci Rep       Date:  2015-11-26       Impact factor: 4.379

9.  A Novel Coronavirus from Patients with Pneumonia in China, 2019.

Authors:  Na Zhu; Dingyu Zhang; Wenling Wang; Xingwang Li; Bo Yang; Jingdong Song; Xiang Zhao; Baoying Huang; Weifeng Shi; Roujian Lu; Peihua Niu; Faxian Zhan; Xuejun Ma; Dayan Wang; Wenbo Xu; Guizhen Wu; George F Gao; Wenjie Tan
Journal:  N Engl J Med       Date:  2020-01-24       Impact factor: 91.245

10.  SARS corona virus peptides recognized by antibodies in the sera of convalescent cases.

Authors:  Jian-Ping Guo; Martin Petric; William Campbell; Patrick L McGeer
Journal:  Virology       Date:  2004-07-01       Impact factor: 3.616

