
Computational Inference of Selection Underlying the Evolution of the Novel Coronavirus, Severe Acute Respiratory Syndrome Coronavirus 2.

Rachele Cagliani, Diego Forni, Mario Clerici, Manuela Sironi.

Abstract

The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) that recently emerged in China is thought to have a bat origin, as its closest known relative (BatCoV RaTG13) was described previously in horseshoe bats. We analyzed the selective events that accompanied the divergence of SARS-CoV-2 from BatCoV RaTG13. To this end, we applied a population genetics-phylogenetics approach, which leverages within-population variation and divergence from an outgroup. Results indicated that most sites in the viral open reading frames (ORFs) evolved under conditions of strong to moderate purifying selection. The most highly constrained sequences corresponded to some nonstructural proteins (nsps) and to the M protein. Conversely, nsp1 and accessory ORFs, particularly ORF8, had a nonnegligible proportion of codons evolving under conditions of very weak purifying selection or close to selective neutrality. Overall, limited evidence of positive selection was detected. The six bona fide positively selected sites were located in the N protein, in ORF8, and in nsp1. A signal of positive selection was also detected in the receptor-binding motif (RBM) of the spike protein but most likely resulted from a recombination event that involved the BatCoV RaTG13 sequence. In line with previous data, we suggest that the common ancestor of SARS-CoV-2 and BatCoV RaTG13 encoded/encodes an RBM similar to that observed in SARS-CoV-2 itself and in some pangolin viruses. It is presently unknown whether the common ancestor still exists and, if so, which animals it infects. Our data, however, indicate that divergence of SARS-CoV-2 from BatCoV RaTG13 was accompanied by limited episodes of positive selection, suggesting that the common ancestor of the two viruses was poised for human infection.
IMPORTANCE Coronaviruses are dangerous zoonotic pathogens; in the last 2 decades, three coronaviruses have crossed the species barrier and caused human epidemics.
One of these is the recently emerged SARS-CoV-2. We investigated how, since its divergence from a closely related bat virus, natural selection shaped the genome of SARS-CoV-2. We found that distinct coding regions in the SARS-CoV-2 genome evolved under conditions of different degrees of constraint and are consequently more or less prone to tolerate amino acid substitutions. In practical terms, the level of constraint provides indications about which proteins/protein regions are better suited as possible targets for the development of antivirals or vaccines. We also detected limited signals of positive selection in three viral ORFs. However, we warn that, in the absence of knowledge about the chain of events that determined the human spillover, these signals should not be necessarily interpreted as evidence of an adaptation to our species.
Copyright © 2020 American Society for Microbiology.

Keywords:  N protein; Nsp1; ORF8; SARS-CoV-2; positive selection; spike protein; viral evolution

Year:  2020        PMID: 32238584      PMCID: PMC7307108          DOI: 10.1128/JVI.00411-20

Source DB:  PubMed          Journal:  J Virol        ISSN: 0022-538X            Impact factor:   5.103


INTRODUCTION

In December 2019, a human-infecting coronavirus, now referred to as severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (1), emerged in Wuhan, China, causing respiratory disease in a large number of people and being responsible for thousands of deaths (https://www.who.int/emergencies/diseases/novel-coronavirus-2019) (2). After SARS-CoV (severe acute respiratory syndrome coronavirus) and MERS-CoV (Middle East respiratory syndrome coronavirus), SARS-CoV-2 is the third coronavirus to cause a human epidemic in the last 2 decades (3, 4). Coronaviruses (family Coronaviridae, order Nidovirales) have positive-sense, single-stranded RNA genomes which are unusually long and complex compared to those of other RNA viruses. Two-thirds of the coronavirus genome is occupied by two large overlapping open reading frames (ORFs), ORF1a and ORF1b, that are translated into the pp1a and pp1ab polyproteins. These are processed to generate 16 nonstructural proteins (nsp1 to nsp16) (5). The remaining portion of the genome includes ORFs for the structural proteins spike (S), envelope (E), membrane (M), and nucleoprotein (N), as well as a variable number of accessory proteins (3–5). Several coronavirus genera and subgenera are recognized (https://talk.ictvonline.org/ictv-reports/) (1, 6, 7). Whereas MERS-CoV is a member of the Merbecovirus subgenus, phylogenetic analyses indicated that SARS-CoV-2 clusters with SARS-CoV and other bat-derived viruses in the Sarbecovirus subgenus (genus Betacoronavirus) (1, 8, 9). A recent report by the Coronavirus Study Group of the International Committee on Taxonomy of Viruses (ICTV) indicated that SARS-CoV-2 can be assigned to the species Severe acute respiratory syndrome-related coronavirus (1).
Bats host a large diversity of coronaviruses related to SARS-CoV (5, 10, 11), and, in general, these animals are believed to represent the original reservoir of several human-infecting coronaviruses (3, 4). This also seems to be the case for SARS-CoV-2, as analysis of the viral genome indicated that its closest known relative, with an average identity of ∼96%, is a virus (BatCoV RaTG13) identified in horseshoe bats (Rhinolophus affinis) (8). Two other bat-derived coronaviruses (bat-SL-CoVZC45 and bat-SL-CoVZXC21) display high levels of similarity (>70%) to SARS-CoV-2, with various levels of identity along the genome (9, 12, 13). However, because both SARS-CoV and MERS-CoV were transmitted to humans via intermediate hosts (3, 4), it remains unclear whether the Wuhan epidemic was initiated by a spillover from bats or from other animals. Recent data suggested that viruses related to SARS-CoV-2 are found in pangolins (Manis javanica) (14–17), but the role of these animals in fueling the human epidemic remains unclear.
A major determinant of coronavirus host range is represented by the binding affinity between the spike protein and the cognate cellular receptor (18–22). Notably, this was previously shown to be the case for SARS-CoV, which, in analogy to SARS-CoV-2, uses ACE2 (angiotensin-converting enzyme 2) to enter host cells (8, 23). A limited number of amino acid changes in the receptor binding domain (RBD) of SARS-CoV were shown to modulate the binding efficiency to ACE2 from different mammalian species and to contribute to the adaptation of the virus to human cells (24–26). However, the SARS-CoV epidemic was characterized by another signature change in the viral genome; relatively early during the human-to-human transmission chain, SARS-CoV strains acquired a 29-nucleotide deletion which split ORF8, encoding an accessory protein, into two functional ORFs (27).
Together with the observation that ORF8 is evolving quickly in SARS-CoV strains, this finding was taken to imply adaptation to our species (28). The evidence for adaptation was subsequently questioned, and recent data indicated that the 29-nucleotide deletion most likely represents a founder effect, which causes fitness loss irrespective of the host species (4, 29). These data underscore the relevance (and possible pitfalls) of evolutionary analyses in the study of viral species emergence and host shifts. Here, we used available SARS-CoV-2 strains to describe the selective events that accompanied the divergence of this novel human pathogen from its closest known relative (BatCoV RaTG13) (8).

RESULTS AND DISCUSSION

As mentioned above, the closest relative (BatCoV RaTG13) of the novel human-infecting SARS-CoV-2 was identified in bats (8). It is presently unknown whether BatCoV RaTG13 can be transmitted in human populations and if it can infect human cells. Likewise, the reservoir and the animal host that fueled the human transmission of SARS-CoV-2 are presently uncertain. Conversely, ample data now indicate that human-to-human transmission has a role in spreading the SARS-CoV-2 epidemic (30–33) and that, in addition to humans, the virus can infect cells from bats, small carnivores, and pigs (8). We thus set out to determine the selective events that accompanied the divergence of the SARS-CoV-2 lineage from BatCoV RaTG13. In doing so, we do not imply that any such event was primarily responsible for human adaptation, as high efficiency of human infection might instead represent an incidental by-product of adaptation to another host. Based on the alignment of 44 SARS-CoV-2 genomes and the BatCoV RaTG13 sequence, 147 amino acid replacements, unevenly distributed along the genome, were found to separate SARS-CoV-2 from its closest relative. A total of 41 amino acid changes are polymorphic in the SARS-CoV-2 population (Fig. 1A).
FIG 1

Selective patterns of SARS-CoV-2. (A) Similarity plot (generated with SimPlot) of BatCoV RaTG13 relative to SARS-CoV-2 (Wuhan-Hu-1 reference strain, NC_045512.2). Similarity (Kimura distance) was calculated within sliding windows of 250 bp moving with steps of 50 bp. A schematic representation of the SARS-CoV-2 genome is also shown. ORF and nsp (nonstructural protein) names, lengths, and relative positions are in accordance with the annotation for the reference Wuhan-Hu-1 sequence. Box colors indicate the level of amino acid identity between the SARS-CoV-2 and BatCoV RaTG13 sequences. Black triangles indicate amino acid changes that are polymorphic in the analyzed SARS-CoV-2 genomes. Asterisks denote positively selected sites, and their sizes are proportional to the number of selected sites/region. Short ORFs with names in red were not analyzed with gammaMap. (B and C) Violin plots (median, white dot; interquartile range, black bar) of selection coefficients (γ) for the longest (more than 80 codons) ORFs (B) and nsp3 subdomains (C) are shown. Nsp3 domains were retrieved from the SARS-CoV annotation (68).

To investigate the selection patterns acting on SARS-CoV-2 genomes, we applied a method that combines analysis of within-population variation (i.e., variation among SARS-CoV-2 strains) and divergence from an outgroup (BatCoV RaTG13). Specifically, nucleotide alignments were analyzed using gammaMap (34), which estimates selection coefficients (γ) along coding regions and allows the detection of fine-scale differences in selective pressures at specific codons. In practical terms, γ values can be considered a measure of the fitness consequences of new nonsynonymous mutations. The method categorizes selection coefficients into 12 predefined classes ranging from −500 (inviable) to 100 (strongly beneficial).
For gammaMap analysis, we divided the ORF1a and ORF1b alignments into the 16 nsps; because nsp3 is a long, multidomain protein, it was also split into domains. Likewise, the coronavirus S protein includes two functionally distinct units (S1 and S2), which were separately analyzed. Alignments of more than 80 codons were analyzed with gammaMap (Fig. 1A). As previously shown for several other viruses (35–37), we found that most sites evolved under conditions of strong to moderate purifying selection (γ value less than −5). However, the strength of purifying selection varied depending on the region. The strongest constraints were observed for nsp6 to nsp10, for nsp16, and for the M ORF (Fig. 1B). Whereas nsp6 is involved in the formation of the reticulovesicular membrane network where viral RNA replication occurs, nsp7 to nsp10 are small proteins that function as cofactors for viral replicative enzymes, including nsp16, a 2′-O-methyl transferase (38). Conversely, the M ORF encodes a structural protein which is highly abundant in the virion of coronaviruses (39). The M protein interacts with other structural viral proteins and plays an important role in virion morphogenesis (40). Importantly, the M protein is a dominant immunogen for both the humoral and the cellular immune responses (41, 42). The latter features and its high level of constraint suggest that the M protein represents an excellent target for vaccine design. Among the nonaccessory ORFs, the lowest levels of constraint were observed for nsp1 and the acidic domain of nsp3 (Fig. 1B and C). This is in line with data indicating that these regions are quickly evolving in coronaviruses at large (see below) (43, 44). Accessory ORFs, and, in particular, ORF8, had a nonnegligible proportion of codons evolving under conditions of very weak purifying selection or close to selective neutrality.
On one hand, this is in line with the idea that genetic variation in accessory ORFs causes limited fitness consequences, as the above-mentioned case of SARS-CoV ORF8 indicates (4, 29). In fact, gains and losses of accessory proteins have been common during the evolutionary history of coronaviruses and accessory ORFs differ in number and sequence even among coronaviruses belonging to the same genus or subgenus (4). On the other hand, accessory proteins have often been shown to contribute to the modulation of immune responses, as well as to virulence (3, 4). It is thus conceivable that their limited constraint maintains variability in coronavirus accessory ORFs, eventually facilitating rapid adaptation when the environment (e.g., host) changes. We next wished to determine whether positive selection at specific sites also drove the evolution of SARS-CoV-2. We thus estimated codon-wise posterior probabilities for each selection coefficient. Very strong evidence (defined as a posterior probability > 0.80 of γ ≥ 1) of positive selection was detected for seven sites, including six in the S1 region of the spike protein and one in N (Fig. 2). When the posterior probability cutoff was lowered to a less stringent value of 0.50, five additional sites in ORF8 (n = 4) and in nsp1 (n = 1) were identified (Fig. 2). It should be noted that this posterior probability cutoff still represents reasonably strong evidence of positive selection. Using these criteria, positively selected sites were estimated to account for 0.12% of analyzed codons with the 0.5 cutoff (0.07% with the 0.8 cutoff) (34, 45, 46).
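The site-calling rule described above (posterior probability of γ ≥ 1 exceeding a cutoff of 0.80 or 0.50) can be illustrated with a minimal sketch. This is not the gammaMap implementation. The endpoints of the class grid (−500 and 100) come from the text, whereas the intermediate class values, the function names, and the toy posteriors are assumptions made only for illustration.

```python
# Assumed 12-class grid of selection coefficients. Only the endpoints
# (-500 and 100) are stated in the text; intermediate values are assumptions.
GAMMA_CLASSES = [-500, -100, -50, -10, -5, -1, 0, 1, 5, 10, 50, 100]


def make_posterior(masses):
    """Build a 12-entry posterior vector from a {gamma: probability} dict."""
    return [masses.get(g, 0.0) for g in GAMMA_CLASSES]


def prob_positive(posterior):
    """Posterior probability that gamma >= 1 for one codon."""
    if len(posterior) != len(GAMMA_CLASSES):
        raise ValueError("posterior must have one entry per gamma class")
    return sum(p for g, p in zip(GAMMA_CLASSES, posterior) if g >= 1)


def call_sites(posteriors, cutoff):
    """Return 1-based codon positions with P(gamma >= 1) above the cutoff."""
    return [i + 1 for i, post in enumerate(posteriors)
            if prob_positive(post) > cutoff]


# Toy (made-up) posteriors for three codons.
toy = [
    make_posterior({-10: 0.9, 0: 0.1}),          # constrained codon
    make_posterior({0: 0.4, 1: 0.3, 5: 0.3}),    # passes 0.5, not 0.8
    make_posterior({-1: 0.1, 5: 0.45, 10: 0.45}),  # passes both cutoffs
]
print(call_sites(toy, 0.5))
print(call_sites(toy, 0.8))
```

In this toy input, lowering the cutoff from 0.8 to 0.5 adds sites to the call set, which mirrors how the less stringent threshold recovered additional sites in the analysis.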
FIG 2

SARS-CoV-2 positively selected sites. A schematic representation of the nsp1, ORF8, spike (S), and nucleocapsid (N) proteins is presented. Positively selected sites (magenta) and amino acid substitutions between SARS-CoV-2 and BatCoV RaTG13 (red) and between SARS-CoV-2 and pangolin-CoV MP789 (blue) are indicated in the alignments. The location of an insertion (insPRRA) in the spike glycoprotein is also shown. This insertion is predicted to occur in the S1/S2 furin-like cleavage site (69, 70).

The S1 region contains the RBD, and the crystal structure of the SARS-CoV S protein in complex with human ACE2 showed that, in turn, the RBD is formed by two subdomains, a core structure and the receptor-binding motif (RBM, which directly contacts ACE2) (47, 48). The S2 region includes the fusion machinery (49). We performed homology modeling of the SARS-CoV-2 S protein onto the SARS-CoV structure, and we analyzed the distribution of selection coefficients (Fig. 3A). The S2 subunit was characterized by stronger constraint than the S1 portion, and five of six putative positively selected sites were found to be located in the RBM, at the binding interface with ACE2 (Fig. 3A).
FIG 3

Homology modeling of positively selected SARS-CoV-2 proteins. Selected sites are mapped onto the 3D structure of models obtained using SARS-CoV proteins as templates (PDB ID: 6ACG for panel A, 2CJR for panel B, 2HSX for panel C). Coronavirus proteins are colored in hues of blue based on the most likely selection coefficient. Positively selected sites are marked in red. (A) Ribbon representation of the spike glycoprotein model (one monomer is shown) in complex with human ACE2 (green) (48). The binding interface is shown in the enlargement. (B) Ribbon representation of the C-terminal domain of the nucleocapsid protein. (C) Ribbon representation of the N-terminal portion of nsp1. Note that although some sites had the highest posterior probability for γ = 1 (yellow), they were not called as positively selected because the 0.5 threshold was not reached.

Comparing SARS-CoV-2 and BatCoV RaTG13, the RBM stands out as the single most divergent region (Fig. 1A) (8, 16). Very recent evidence indicated that, although the average level of genome similarity is lower than that seen with BatCoV RaTG13, coronaviruses isolated from pangolins have RBMs almost identical to that of SARS-CoV-2 (14–17). This raises the possibility that recombination inflated the estimation of positive selection in the S1 region. A pangolin virus available in GenBank (isolate MP789) has an RBM with high identity to SARS-CoV-2. Thus, using the genome sequences of isolate MP789, SARS-CoV-2, and BatCoV RaTG13, we searched for recombination events using RDP4 (50). No evidence of recombination was detected, but that finding might have been due to the fact that the parental sequence with which BatCoV RaTG13 recombined is presently unsampled.
We thus analyzed synonymous substitutions in the RBM alignment for these viruses, and we found that 41% (n = 37) of such substitutions are shared between SARS-CoV-2 and isolate MP789, whereas only 27% (n = 10) are shared between SARS-CoV-2 and BatCoV RaTG13. Overall, these findings strongly suggest that recombination rather than positive selection shaped the genetic diversity at the RBM, as previously suggested (16). Recombination is known to affect evolutionary inference (51). In this case, because we used BatCoV RaTG13 as an outgroup, the spurious signals were generated by considering the selected sites to represent amino acid replacements that arose and became fixed in the SARS-CoV-2 population, whereas they might represent changes that occurred in the outgroup through recombination. We consider that this was not the case for the other signals that we detected, as all of them were located in regions of high overall similarity between BatCoV RaTG13 and SARS-CoV-2, indicating no evidence of recombination (Fig. 1A).
The positively selected site (A267) in the nucleocapsid protein is located in the C-terminal domain. Homology modeling using the SARS-CoV N protein as a template indicated that A267 is located on an exposed loop on the protein surface (Fig. 3B) (52). The N protein is the most abundant protein in coronavirus-infected cells (53, 54). Its primary function is to package the viral genome into a ribonucleoprotein complex. In addition, the N protein performs nonstructural functions, as it regulates the host cell cycle and the stress response, it acts as a molecular chaperone, and it interferes with the host immune response (53, 54). Because these activities are mediated by interactions with different cellular proteins, the positively selected site might be evolving to establish, maintain, or avoid the binding of different host molecules.
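One way to tally which of two relatives shares synonymous codon states with a focal sequence is sketched below. The text does not specify how shared synonymous substitutions were counted, so the rule used here (codon positions where all three sequences encode the same amino acid but not the same codon) is an assumption, as are the function name and the toy sequences.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def count_shared_synonymous(focal, seq_a, seq_b):
    """Count synonymous codon differences shared between focal and A or B.

    Inputs are in-frame, aligned coding sequences of equal length.
    Codons with gaps or ambiguous bases are skipped.
    """
    if not (len(focal) == len(seq_a) == len(seq_b)) or len(focal) % 3:
        raise ValueError("sequences must be aligned, in frame, and equal length")
    counts = {"shared_with_a": 0, "shared_with_b": 0, "other": 0}
    for i in range(0, len(focal), 3):
        codons = [s[i:i + 3].upper() for s in (focal, seq_a, seq_b)]
        aas = [CODON_TABLE.get(c) for c in codons]
        # Keep only positions that are synonymous across all three sequences
        # and where at least one codon differs.
        if None in aas or len(set(aas)) != 1 or len(set(codons)) == 1:
            continue
        f, a, b = codons
        if f == a:
            counts["shared_with_a"] += 1
        elif f == b:
            counts["shared_with_b"] += 1
        else:
            counts["other"] += 1
    return counts


# Toy (made-up) three-codon alignment: Ala, Lys, Gly in every sequence.
toy_counts = count_shared_synonymous("GCTAAAGGT", "GCTAAGGGT", "GCCAAAGGA")
print(toy_counts)
```

An excess of synonymous states shared with one relative in a restricted region, as opposed to the genome-wide pattern, is the kind of signal that points to recombination rather than selection, because synonymous changes are not expected to be driven by protein-level adaptation.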
Another positively selected site was detected in the nsp1 region, which also displayed relatively weak selective constraint. In SARS-CoV and other betacoronaviruses, nsp1 is a virulence factor and is essential for viral replication at least in the presence of an intact host interferon (IFN) response (55–57). Despite their relevant role for viral fitness in vivo, nsp1 proteins tend to be variable in sequence both within and among coronavirus genera. Detailed analysis of SARS-CoV nsp1 indicated that the protein plays multiple roles during viral infection, including inhibition of host protein synthesis, antagonism of IFN responses, modulation of the calcineurin/NFAT (nuclear factor of activated T cells) pathway, and induction of chemokine secretion (43). Homology modeling using the SARS-CoV nsp1 structure indicated that the positively selected site (E93) is exposed on the protein surface (Fig. 3C). Extensive mutagenesis of SARS-CoV nsp1 showed that exposed charged residues, including the positively selected site, mediate inhibition of gene expression and antiviral signaling (58). Moreover, the N-terminal half of SARS-CoV nsp1 interacts with immunophilins and calcipressins to modulate the calcineurin/NFAT pathway (59). Overall, these observations suggest that the diversity of coronavirus nsp1 proteins is driven by the need to establish interactions with multiple cellular partners and to evade immune surveillance. This is also likely to explain the positive selection signal that we detected. In general, a better understanding of the evolutionary constraints and forces acting on coronavirus nsp1 proteins may be extremely relevant, as the generation of viruses carrying nsp1 mutations was previously proposed as a potential strategy to generate attenuated vaccine strains (57, 60), and inhibitors of cyclophilins were previously reported to be potential antivirals for coronavirus treatment (59).
Finally, all of the selected sites that we identified in ORF8 (F3, I10, A14, and T26) are located in the N-terminal portion of the protein (Fig. 2). The SARS-CoV-2 ORF8 protein displays 30% identity to the intact ORF8 from the SARS-CoV GZ02 strain. It is presently unclear whether the SARS-CoV ORF8 N terminus is cleaved as a signal peptide or inserted into the endoplasmic reticulum membrane (61, 62). Using computational methods to predict signal peptides and transmembrane helices, we found evidence for both in the case of the N terminus of SARS-CoV-2 ORF8 (not shown). Clearly, experimental analyses will be required to determine the function of the N-terminal region of ORF8 and, more generally, the relevance of the selected sites for virus fitness or pathogenicity.
Overall, our analyses indicate that distinct coding regions in the SARS-CoV-2 genome evolve under conditions of different degrees of constraint and are consequently more or less prone to tolerate amino acid substitutions. In practical terms, the level of constraint can provide indications concerning which specific proteins or protein regions are better suited to being possible targets for the development of antivirals or vaccines. Conversely, the currently available knowledge and the analyses reported here allow no inference on the selective events (or lack thereof) that turned SARS-CoV-2 into a human pathogen. Recent analyses paid much attention to changes in the RBM. This is indeed expected to represent a major determinant of host range, and its sequence is highly variable among SARS-CoV-related viruses (as is also evident in the data presented in Fig. 2). Albeit preliminary and necessarily limited to currently sampled genomes, our analyses suggest that recombination had a role in shaping the diversity of the RBMs in these viruses.
Our data also indicate that divergence of SARS-CoV-2 from BatCoV RaTG13 was accompanied by limited episodes of positive selection, suggesting that the common ancestor of the two viruses was poised for human infection. We also emphasize that lack of knowledge about the reservoir host and the chain of events that determined the human spillover prevents us from drawing any conclusion on the selective pressure underlying the limited positive selection events that we detected. These will need to be interpreted in the future, by incorporating epidemiological, biochemical, and additional genetic data. Clearly, a caveat of our analyses lies in the quality and paucity of SARS-CoV-2 genomes, as well as in the limited availability of genomes of other coronaviruses closely related to SARS-CoV-2. Available sequences were obtained using different methods and most likely contain errors. This is unlikely to strongly affect inference of positive selection, as the frequency of all selected sites is high in the SARS-CoV-2 population. Also, the SARS-CoV-2 sequences that we analyzed display limited diversity (with only 41 nonsynonymous polymorphisms, most of them present in one or a few sequences). Thus, although the availability of additional genomes may increase the power to detect selective events and the confidence with which evolutionary patterns are inferred, simply increasing the number of genomes is unlikely to change the bulk of our results. However, sustained viral spread in the human population will necessarily introduce new mutations in the viral population. Thus, data reported here can depict only the situation of the early phases of the human epidemic. Follow-up analyses of the SARS-CoV-2 population will be required to determine the evolutionary trajectories of new mutations and to assess whether and how they affect viral fitness in the human host.

MATERIALS AND METHODS

Sequences and alignments.

Genome sequences were retrieved from the National Center for Biotechnology Information database (NCBI; https://www.ncbi.nlm.nih.gov/). Only complete or almost-complete genome sequences were included in the analysis (Table 1).
TABLE 1

List of analyzed strains

Strain name    GenBank ID
Wuhan-Hu-1    NC_045512.2
2019-nCoV WHU01    MN988668.1
2019-nCoV WHU02    MN988669.1
2019-nCoV_HKU-SZ-005b_2020    MN975262.1
2019-nCoV_HKU-SZ-002a_2020    MN938384.1
SARS-CoV-2/WH-09/human/2020/CHN    MT093631.1
SARS-CoV-2/IQTC01/human/2020/CHN    MT123290.1
HZ-1    MT039873.1
BetaCoV/Wuhan/IPBCAMS-WH-01/2019    MT019529.1
BetaCoV/Wuhan/IPBCAMS-WH-03/2019    MT019531.1
BetaCoV/Wuhan/IPBCAMS-WH-02/2019    MT019530.1
BetaCoV/Wuhan/IPBCAMS-WH-04/2019    MT019532.1
BetaCoV/Wuhan/IPBCAMS-WH-05/2020    MT019533.1
WIV02    MN996527.1
WIV04    MN996528.1
WIV05    MN996529.1
WIV06    MN996530.1
WIV07    MN996531.1
SARS-CoV-2/Yunnan-01/human/2020/CHN    MT049951.1
nCoV-FIN-29-Jan-2020    MT020781.1
SARS0CoV-2/61-TW/human/2020/NPL    MT072688.1
SNU01    MT039890.1
SARS-CoV-2/01/human/2020/SWE    MT093571.1
SARS-CoV-2/NTU01/2020/TWN    MT066175.1
SARS-CoV-2/NTU02/2020/TWN    MT066176.1
2019-nCoV/USA-WA1/2020    MN985325.1
2019-nCoV/USA-AZ1/2020    MN997409.1
2019-nCoV/USA-CA1/2020    MN994467.1
2019-nCoV/USA-CA2/2020    MN994468.1
2019-nCoV/USA-CA3/2020    MT027062.1
2019-nCoV/USA-CA4/2020    MT027063.1
2019-nCoV/USA-CA5/2020    MT027064.1
2019-nCoV/USA-CA6/2020    MT044258.1
2019-nCoV/USA-CA7/2020    MT106052.1
2019-nCoV/USA-CA8/2020    MT106053.1
2019-nCoV/USA-CA9/2020    MT118835.1
2019-nCoV/USA-IL2/2020    MT044257.1
2019-nCoV/USA-IL1/2020    MN988713.1
2019-nCoV/USA-MA1/2020    MT039888.1
2019-nCoV/USA-TX1/2020    MT106054.1
2019-nCoV/USA-WA1-A12/2020    MT020880.1
2019-nCoV/USA-WA1-F6/2020    MT020881.1
2019-nCoV/USA-WI1/2020    MT039887.1
Australia/VIC01/2020    MT007544.1
Bat coronavirus RaTG13    MN996532.1
Pangolin coronavirus isolate MP789    MT084071.1
Bat SARS-like coronavirus isolate bat-SL-CoVZC45    MG772933.1
Bat SARS-like coronavirus isolate bat-SL-CoVZXC21    MG772934.1
SARS-CoV tor2    NC_004718.3
SARS-CoV GZ02    AY390556.1
Bat SARS coronavirus HKU3-1    DQ022305.2
Rhinolophus affinis coronavirus isolate LYRa11    KF569996.1

Alignments were generated using MAFFT (63), setting sequence type as codons.

Population genetics—phylogenetic analysis.

Analyses were performed with gammaMap, which uses intraspecies variation and interspecies diversity to estimate, along coding regions, the distribution of selection coefficients (γ). In this framework, γ is defined as 2PNes, where P is the ploidy, Ne is the effective population size, and s is the fitness advantage of any amino acid-replacing derived allele (34). For the eight longest ORFs in the SARS-CoV-2 genome, the corresponding coding sequence of BatCoV RaTG13 was used as the outgroup. We assumed θ (neutral mutation rate per site), k (transitions/transversions ratio), and T (branch length) to vary within genes following log-normal distributions, whereas p (probability of adjacent codons to share the same selection coefficient) was assumed to follow a log-uniform distribution. For each ORF, we set the neutral frequencies of non-STOP codons (1/61). For selection coefficients, we considered a uniform Dirichlet distribution with the same prior weight for each selection class. For each ORF, we performed 2 runs with 100,000 iterations each and with a thinning interval of 10 iterations. Runs were merged after checking for convergence.
The similarity plot was computed using a Kimura (two-parameter) distance model with SimPlot version 3.5.1 (64). The strip gap option was set at the 50% default value. Similarity scores were calculated in sliding windows of 250 bp moving with a step of 50 bp.
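The sliding-window similarity calculation can be approximated with the short sketch below. It is not SimPlot itself. It assumes that similarity is reported as 1 minus the Kimura two-parameter distance and that alignment columns containing a gap or ambiguous base in either sequence are simply skipped, which is a simplification of the strip gap option. The function names are hypothetical; the window and step sizes are the values given in the text.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences."""
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumed handling)
        n += 1
        if a == b:
            continue
        if (a in PURINES) == (b in PURINES):
            transitions += 1      # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1    # purine<->pyrimidine
    if n == 0:
        raise ValueError("no comparable sites in window")
    p, q = transitions / n, transversions / n
    # d = -1/2 * ln[(1 - 2P - Q) * sqrt(1 - 2Q)]; math.log raises ValueError
    # if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def similarity_windows(seq1, seq2, window=250, step=50):
    """Return (window midpoint, similarity) pairs along the alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    results = []
    for start in range(0, len(seq1) - window + 1, step):
        d = k2p_distance(seq1[start:start + window],
                         seq2[start:start + window])
        results.append((start + window // 2, 1.0 - d))
    return results


# Toy check: one transition in ten sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

Plotting the second element of each pair against the first gives a profile in which dips mark regions of elevated divergence, such as the one the text reports for the RBM.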

Protein 3D structures and homology modeling.

The three-dimensional (3D) structures of SARS-CoV N (PDB identifier [ID]: 2CJR) (65) and S (PDB ID: 6ACG) (48) proteins were obtained from the Protein Data Bank (PDB). Homology modeling analysis was performed through the SWISS-MODEL server (66). The accuracy of the models was examined through the GMQE (Global Model Quality Estimation) and QMEAN (Qualitative Model Energy ANalysis) scores (67). 3D structures were rendered using PyMOL (The PyMOL Molecular Graphics System, Version 1.8.4.0; Schrödinger, LLC).
  69 in total

1.  Severe acute respiratory syndrome coronavirus phylogeny: toward consensus.

Authors:  Alexander E Gorbalenya; Eric J Snijder; Willy J M Spaan
Journal:  J Virol       Date:  2004-08       Impact factor: 5.103

Review 2.  Recombination, reservoirs, and the modular spike: mechanisms of coronavirus cross-species transmission.

Authors:  Rachel L Graham; Ralph S Baric
Journal:  J Virol       Date:  2009-11-11       Impact factor: 5.103

3.  The M, E, and N structural proteins of the severe acute respiratory syndrome coronavirus are required for efficient assembly, trafficking, and release of virus-like particles.

Authors:  Y L Siu; K T Teoh; J Lo; C M Chan; F Kien; N Escriou; S W Tsao; J M Nicholls; R Altmeyer; J S M Peiris; R Bruzzone; B Nal
Journal:  J Virol       Date:  2008-08-27       Impact factor: 5.103

4.  Identifying SARS-CoV-2-related coronaviruses in Malayan pangolins.

Authors:  Tommy Tsan-Yuk Lam; Na Jia; Ya-Wei Zhang; Marcus Ho-Hin Shum; Jia-Fu Jiang; Yi-Gang Tong; Hua-Chen Zhu; Yong-Xia Shi; Xue-Bing Ni; Yun-Shi Liao; Wen-Juan Li; Bao-Gui Jiang; Wei Wei; Ting-Ting Yuan; Kui Zheng; Xiao-Ming Cui; Jie Li; Guang-Qian Pei; Xin Qiang; William Yiu-Man Cheung; Lian-Feng Li; Fang-Fang Sun; Si Qin; Ji-Cheng Huang; Gabriel M Leung; Edward C Holmes; Yan-Ling Hu; Yi Guan; Wu-Chun Cao
Journal:  Nature       Date:  2020-03-26       Impact factor: 49.962

5.  Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus.

Authors:  Kailang Wu; Guiqing Peng; Matthew Wilken; Robert J Geraghty; Fang Li
Journal:  J Biol Chem       Date:  2012-01-30       Impact factor: 5.157

6.  A pneumonia outbreak associated with a new coronavirus of probable bat origin.

Authors:  Peng Zhou; Xing-Lou Yang; Xian-Guang Wang; Ben Hu; Lei Zhang; Wei Zhang; Hao-Rui Si; Yan Zhu; Bei Li; Chao-Lin Huang; Hui-Dong Chen; Jing Chen; Yun Luo; Hua Guo; Ren-Di Jiang; Mei-Qin Liu; Ying Chen; Xu-Rui Shen; Xi Wang; Xiao-Shuang Zheng; Kai Zhao; Quan-Jiao Chen; Fei Deng; Lin-Lin Liu; Bing Yan; Fa-Xian Zhan; Yan-Yi Wang; Geng-Fu Xiao; Zheng-Li Shi
Journal:  Nature       Date:  2020-02-03       Impact factor: 69.504

7.  The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade.

Authors:  B Coutard; C Valle; X de Lamballerie; B Canard; N G Seidah; E Decroly
Journal:  Antiviral Res       Date:  2020-02-10       Impact factor: 5.970

8.  A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster.

Authors:  Jasper Fuk-Woo Chan; Shuofeng Yuan; Kin-Hang Kok; Kelvin Kai-Wang To; Hin Chu; Jin Yang; Fanfan Xing; Jieling Liu; Cyril Chik-Yan Yip; Rosana Wing-Shan Poon; Hoi-Wah Tsoi; Simon Kam-Fai Lo; Kwok-Hung Chan; Vincent Kwok-Man Poon; Wan-Mui Chan; Jonathan Daniel Ip; Jian-Piao Cai; Vincent Chi-Chung Cheng; Honglin Chen; Christopher Kim-Ming Hui; Kwok-Yung Yuen
Journal:  Lancet       Date:  2020-01-24       Impact factor: 79.321

9.  Importation and Human-to-Human Transmission of a Novel Coronavirus in Vietnam.

Authors:  Lan T Phan; Thuong V Nguyen; Quang C Luong; Thinh V Nguyen; Hieu T Nguyen; Hung Q Le; Thuc T Nguyen; Thang M Cao; Quang D Pham
Journal:  N Engl J Med       Date:  2020-01-28       Impact factor: 91.245

10.  Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus.

Authors:  Wenhui Li; Michael J Moore; Natalya Vasilieva; Jianhua Sui; Swee Kee Wong; Michael A Berne; Mohan Somasundaran; John L Sullivan; Katherine Luzuriaga; Thomas C Greenough; Hyeryun Choe; Michael Farzan
Journal:  Nature       Date:  2003-11-27       Impact factor: 49.962
