Feng Wen (1), Hai Yu (2), Jinyue Guo (3), Yong Li (4), Kaijian Luo (5), Shujian Huang (6)
1. College of Life Science and Engineering, Foshan University, Foshan, 528231 Guangdong, China. Electronic address: wenfengjlu@163.com.
2. Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Shanghai 200241, China; Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou 225009, China.
3. College of Life Science and Engineering, Foshan University, Foshan, 528231 Guangdong, China.
4. College of Animal Science and Technology, Jiangxi Agricultural University, Nanchang 330045 Jiangxi, China.
5. College of Veterinary Medicine, South China Agricultural University, Guangzhou 510642 Guangdong, China.
6. College of Life Science and Engineering, Foshan University, Foshan, 528231 Guangdong, China. Electronic address: 617955368@qq.com.
A recent study in this journal examined the genomes of the novel SARS-like coronavirus (SARS-CoV-2) in China and suggested that SARS-CoV-2 had undergone genetic recombination with SARS-related CoV. By February 14, 2020, a total of 66,576 confirmed cases of COVID-19, the disease caused by SARS-CoV-2, had been reported in China, with 1524 deaths, according to the Chinese CDC (http://2019ncov.chinacdc.cn/2019-nCoV/). Several full genomic sequences of this virus have been released for the study of its evolutionary origin and molecular characteristics [2, 3, 4]. Here, we analyzed the mutations that may have arisen after the virus began spreading among humans, as well as the mutations that may have led to human adaptation.

The BetaCoV sequences were downloaded from the GISAID platform on February 3, 2020. A total of 58 accessions were available, among which BetaCoV/bat/Yunnan/RaTG13/2013 is a known close relative of SARS-CoV-2. Four accessions, namely BetaCov/Italy/INM1/2020, BetaCov/Italy/INM2/2020, BetaCoV/Kanagawa/1/2020, and BetaCoV/USA/IL1/2020, were excluded because their sequences were truncated or contained multiple ambiguous nucleotides. A total of 54 accessions (Supplementary Table 1) isolated from humans were used in the following analysis. The annotated genes of SARS coronavirus (NC_004718.3) were used to define the protein products of SARS-CoV-2. The protein sequences of the ORF1ab, S, E, M, and N genes were translated, and all loci without experimental evidence were excluded.

First, the protein sequences of SARS-CoV-2 were compared with those of RaTG13, human SARS (NC_004718.3), bat SARS (DQ022305.2), and human MERS (NC_019843.3) by calculating the similarity within a sliding window (Fig. 1A). The window size was set to 500 for ORF1ab and S, and to 50 for the E, M, and N proteins because of their short length. SARS-CoV-2 was highly similar to RaTG13 isolated from bats, showing 96% identity based on the whole nucleotide sequences and 83% based on the protein sequences, which suggests a bat zoonotic origin of SARS-CoV-2. ORF1a and the head of S appeared to have diverged from other beta coronaviruses.
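The sliding-window comparison described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the text does not define the similarity measure, so the sketch assumes that the two protein sequences are already aligned to the same length, that "similarity" means the fraction of identical residues in each window, and that gaps count as mismatches. The function name `sliding_window_identity`, the `step` parameter, and the toy sequences are hypothetical.

```python
def sliding_window_identity(seq_a, seq_b, window, step=1):
    """Return the fraction of identical aligned residues in each window.

    Assumptions (not taken from the paper): seq_a and seq_b are already
    aligned to equal length, '-' marks a gap, and a gap in either
    sequence counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")

    # 1 where the aligned residues are identical and not gaps, else 0
    matches = [int(a == b and a != "-") for a, b in zip(seq_a, seq_b)]
    if not matches:
        return []

    # Shrink the window if the protein is shorter than the requested size
    window = min(window, len(matches))

    scores = []
    for start in range(0, len(matches) - window + 1, step):
        scores.append(sum(matches[start:start + window]) / window)
    return scores


# Toy aligned fragments (hypothetical data, differing at one position)
toy_a = "MFVFLVLLPLVSSQ"
toy_b = "MFVFLVLLPLVSNQ"
scores = sliding_window_identity(toy_a, toy_b, window=7, step=7)
print(scores)
```

With real data, a call such as `sliding_window_identity(s_a, s_b, window=500)` would correspond to the window size stated for ORF1ab and S, and `window=50` to the size stated for E, M, and N.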
Fig. 1
(A) The similarity between SARS-CoV-2 and other beta coronaviruses using the sliding window showed that SARS-CoV-2 was similar to bat virus RaTG13. (B) The molecular phylogenetic tree based on protein sequences established the high similarity among SARS-CoV-2 and its near relatives. (C) The mutations that developed after it came to circulate among humans did not include any mutation with high occurrence. (D) The graphs show all of the differences between SARS-CoV-2 and its close relative strains isolated from bats.
The molecular phylogenetic tree (Fig. 1B) was built using the maximum likelihood method with the JTT matrix-based model. It indicated that the protein sequences of SARS-CoV-2 shared over 99% similarity. Twenty-eight viruses shared the same protein sequences and could represent the original strain that circulated in humans. The other viruses had only a few mutations relative to it. This indicates that the virus could have evolved for only a very short time after gaining efficient human-to-human transmissibility, as expected.

Next, we analyzed the mutations that occurred after the virus began infecting humans (Fig. 1C) in order to identify mutations associated with more severe infection. Two accessions from Shenzhen (BetaCoV/Shenzhen/SZTH-001/2020 and BetaCoV/Shenzhen/SZTH-004/2020), which had 5 and 16 mutations, respectively, were excluded because of possible experimental issues. Each mutation occurred only once, so it is possible that all of these mutations occur naturally and are associated with viral survival and infection. Several mutations were clustered in peptides nsp3 and nsp4 of ORF1ab and in the head of S. These results suggest that there has probably been no hyper-variable genomic hotspot in the SARS-CoV-2 population until now.

We compared these results with those of Ceraolo and Giorgi, who reported at least two hyper-variable genomic hotspots based on the Shannon entropy of nucleotide sequences. They used all of the sequences, whereas we merged all fully identical sequences into one for our Shannon entropy calculation. As shown in Fig. 1B, 28 sequences were merged into one in the present study because they had been collected within such a short time that collection time and location could not have produced any large bias. If those identical sequences had been counted individually, any mutations in these 28 sequences would have sharply increased the Shannon entropy. Protein sequences were used in order to exclude unimportant silent mutations.

Finally, the sequences of the earliest SARS-CoV-2 were compared with RaTG13 from bats (Fig. 1D). Fisher's exact test with a post hoc test suggested that nsp1, nsp3, and nsp15 of ORF1ab and gene S had significantly more mutations than other genes, which might facilitate human adaptation and infection.

The S gene encodes the spike glycoprotein, which binds host ACE2 receptors and is required for initiation of infection. A 193-amino acid fragment has been reported to bind ACE2 more efficiently than its unmutated counterpart. The region in which the spike glycoprotein binds to ACE2 had 21 mutations not found in RaTG13, suggesting a role for these mutations in adaptation to human hosts. Peptide nsp1 facilitates viral gene expression and evasion of the host immune response. Peptide nsp3, a papain-like proteinase, has been associated with polyprotein cleavage, viral replication, and antagonism of innate immunity. These two peptides are probably associated with the latent period after infection in humans. Peptide nsp15 acts as a uridylate-specific endoribonuclease. These results collectively suggest that peptides nsp1, nsp3, and nsp15 might have unclear but critical roles in this outbreak of SARS-CoV-2.

To summarize, this study confirmed the relationship of SARS-CoV-2 with other beta coronaviruses at the amino acid level. Hyper-variable genomic hotspots have been established in the SARS-CoV-2 population at the nucleotide level but not at the amino acid level, suggesting that there have been no beneficial mutations. The mutations in nsp1, nsp3, nsp15, and gene S that were identified in this study may be associated with the SARS-CoV-2 epidemic and are worthy of further study.
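The effect of merging identical sequences before computing Shannon entropy, discussed in the comparison with Ceraolo and Giorgi, can be illustrated with a small Python sketch. It is not the authors' code: it assumes pre-aligned sequences of equal length and computes the per-site entropy in bits as H = sum over residues of p * log2(1/p). The function names, the `merge_identical` flag, and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    # p * log2(1/p) summed over the residues observed in the column
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def per_site_entropy(sequences, merge_identical=False):
    """Per-site Shannon entropy for a list of aligned sequences.

    If merge_identical is True, fully identical sequences are collapsed
    into a single representative before the calculation.
    """
    if merge_identical:
        # dict.fromkeys removes duplicates while preserving input order
        sequences = list(dict.fromkeys(sequences))
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    # zip(*sequences) iterates over alignment columns
    return [column_entropy(col) for col in zip(*sequences)]


# Toy data (hypothetical): 28 identical sequences plus two single-site variants
toy = ["MKT"] * 28 + ["MRT", "MKS"]
raw = per_site_entropy(toy)
merged = per_site_entropy(toy, merge_identical=True)
print("all sequences:", raw)
print("identical merged:", merged)
```

In this toy example the two treatments give different per-site values at the variable positions, which shows why the handling of identical sequences has to be stated when entropy-based hotspots are compared across studies.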