Literature DB >> 33119313

Molecular Basis of SARS-CoV-2 Infection and Rational Design of Potential Antiviral Agents: Modeling and Simulation Approaches.

Antonio Francés-Monerris^1,2, Cécilia Hognon¹, Tom Miclot^1,3, Cristina García-Iriepa^4,5, Isabel Iriepa^5,6, Alessio Terenzi³, Stéphanie Grandemange⁷, Giampaolo Barone³, Marco Marazzi^4,5, Antonio Monari¹.

Abstract

The emergence in late 2019 of the coronavirus SARS-CoV-2 has resulted in the breakthrough of the COVID-19 pandemic that is presently affecting a growing number of countries. The development of the pandemic has also prompted an unprecedented effort of the scientific community to understand the molecular bases of the virus infection and to propose rational drug design strategies able to alleviate the serious COVID-19 morbidity. In this context, a strong synergy between the structural biophysics and molecular modeling and simulation communities has emerged, resolving at the atomistic level the crucial protein apparatus of the virus and revealing the dynamic aspects of key viral processes. In this Review, we focus on how in silico studies have contributed to the understanding of the SARS-CoV-2 infection mechanism and the proposal of novel and original agents to inhibit the viral key functioning. This Review deals with the SARS-CoV-2 spike protein, including the mode of action that this structural protein uses to entry human cells, as well as with nonstructural viral proteins, focusing the attention on the most studied proteases and also proposing alternative mechanisms involving some of its domains, such as the SARS unique domain. We demonstrate that molecular modeling and simulation represent an effective approach to gather information on key biological processes and thus guide rational molecular design strategies.

Entities: Chemical Disease Gene Mutation Species

Keywords: COVID-19; SARS unique domain; SARS-CoV-2; coronavirus; docking; drug design; free-energy methods; membrane fusion; molecular dynamics; molecular modeling; proteases; spike protein; structural biophysics

Mesh：

Substances：

Year: 2020 PMID： 33119313 PMCID： PMC7640986 DOI： 10.1021/acs.jproteome.0c00779

Source DB: PubMed Journal: J Proteome Res ISSN： 1535-3893 Impact factor: 4.466

Introduction

A new coronavirus, which emerged in China at the end of 2019 and was soon named SARS-CoV-2, was recognized as the causative agent of severe acute respiratory syndrome (SARS) that in some cases evolved into severe pneumonia.[1−3] Later, it was also recognized that SARS-CoV-2 was able to cross the interspecies barrier[4] and was also characterized by a high transmissibility between humans.[5] SARS-CoV-2 has been recognized as the agent responsible for the coronavirus disease 19 (COVID-19), which, after harshly striking Asia and later Europe and the America, has presently evolved into a widespread pandemic. Although the mortality ratio of COVID-19 is relatively low,[6] <2% in younger patients, it increases to 40–60% in subjects older than 70. The high contagion rate of COVID-19 is also favored by the fact that some patients are totally asymptomatic[7,8] and hence actively participate in the virus diffusion. Because of the lack of a vaccine or an efficient antiviral treatment, almost all countries have introduced social distancing measures to palliate the virus diffusion that are significantly affecting both economic and social well-being.[9] The scientific community has been particularly active in answering the call and taking the challenge of characterizing the main mechanisms related to the COVID-19 diffusion, as will be clearly demonstrated in the present Review. In addition to the modeling of the macroscale epidemic diffusion and of the effects of social distancing on its spread,[10−12] a particular focus has also been devoted to the elucidation of the molecular mechanisms related to the SARS-CoV-2, including the infection of the host cell and its reproduction. Indeed, such a characterization is propaedeutic to rational drug design and repurposing strategies that should result in the rapid development of efficient antiviral agents. Much effort has been put into the resolution of the key protein structure of the SARS-CoV-2 capsid as well as of the adducts leading to the interaction with human receptors, mostly performed either via X-ray diffraction or via cryo-electron microscopy (CryoEM).[13−16] Some classes of proteins are particularly important in dictating the global behavior of the virus, its life cycle, and, ultimately, its pathogenicity and infectivity rate.[17,18] The recognition of the host cell and its entry are mediated by the SARS-CoV-2 spike (S) protein.[19,20] The latter is a trimeric multidomain glycoprotein that is able to specifically recognize the human angiotensin converting enzyme 2 (ACE2) via its receptor binding domain (RBD).[21] The conformational changes induced by the RBD/ACE2 interactions ultimately result in the viral and cellular lipid membrane fusion and hence in cell infection. Consequently, the RBD and ACE2 represent ideal targets to counteract the SARS-CoV-2 infection. Furthermore, considering that ACE2 is present in lung cells and is involved in the regulation of blood pressure, ACE2 impairment by the coronavirus is one of the possible causes of severe respiratory stress and possible complications in hypertense patients. After cell entry, the viral RNA is translated into nonfunctional global polyproteins that should be precisely processed to yield the separate final structural proteins (as the S protein) and nonstructural proteins (Nsp’s). This fundamental step is assured by proteases,[22] which are themselves Nsp’s, and because of the strategic role of this step in assuring the maturation and replication of the virus, it also represents a key therapeutic target to inhibit SARS-CoV-2 propagation. More specifically, two main proteases have been identified in SARS-CoV-2, the 3-chymotrypsin-like protease (3CLpro) and the papain-like protease (PLpro).[23,24] Whereas the catalytic activity of both proteases is similar, important structural differences should be pointed out, with their active form being either dimeric or monomeric. Furthermore, some proteases may also interact with ubiquitine-like sites, becoming involved in altering the regulation of the cellular signaling, which hampers the response of the host immune system. The capacity of this coronavirus in eluding the innate immune response of the host is one of the reasons for the high pathogenicity and transmissibility. This capacity can be partially ascribed to the role played by large Nsp’s, which are recognized to interact with the ribosome, altering the translation of the host cell.[25,26] Nsp’s are generally multidomain proteins, and it has been pointed out that a specific domain present in SARS viruses, named the SARS unique domain (SUD), contributes to eluding the immune response by interacting with host mRNA, either in its native conformation or in a guanine quadruplex (G-quadruplex) arrangement.[27,28] Potential drugs able to interfere with such mechanisms could be instrumental in boosting the capacity of the host cell to eliminate the viral material. From the previous brief and largely nonexhaustive summary, it is clear that possible drug targets are diverse, from both a structural and a functional point of view. In this context, the possibilities offered by up-to-date molecular modeling and simulation approaches are of great interest and value. Indeed, high-throughput molecular simulations, and, in particular, all-atom molecular dynamics (MD), allow us to represent complex phenomena taking place between complex systems at an atomic-scale resolution. They have proven to be extremely efficient in modeling the interactions and the related conformational reorganization between proteins, nucleic acids, and lipid membranes.[29−31] Molecular simulations have allowed us to resolve the complex processes related to, among others, enzymatic catalysis[32−35] and DNA lesion production and repair[36−40] and to unravel the key mechanisms of passive and active membrane transporters.[41−46] These successes have also been made possible by the impressive development of computational algorithms, including enhanced sampling and free-energy methods,[47] which nowadays allow us to simulate the behavior of systems of hundreds of thousands of atoms up to the microsecond time scale.[31,48] Furthermore, large-scale simulations based on the molecular modeling approach, providing further insight into the structure and functions of SARS-CoV-2 and hence offering an original strategy to counteract its actions, have recently been reported by Amaro,[49,50] Chodera,[51,52] and collaborators in different reviews and perspectives. In the present Review, we critically analyze some of the most important results obtained by molecular modeling and simulations in understanding the key functions of SARS-CoV-2 proteins and in proposing possible novel antiviral agents, from both a methodological and a biological perspective. This Review also illustrates the success and maturity of computational approaches in offering original responses to complex biological problems. In this respect, although we are aware of the fact that different large-scale simulation projects are still being actively pursued, the effort of the scientific community has already produced a large amount of results and data since the beginning of the pandemic, especially in molecular simulations. This offers a deeper vision of the viral basic mechanisms and also suggests strategies for drug development or repurposing. One of the aims of our Review is hence to provide an arena to critically analyze and contextualize all of those efforts and also resituate them in the perspective of the relationship with the structural biology and pharmaceutical chemistry communities.

SARS-CoV-2 Spike Glycoprotein: Structure and Dynamics of Coronaviruses Trojan Horse

Following the identification of the new pathogen as a bat SARS-type betacoronavirus in January 2020,[53] the structure of the SARS-CoV-2 S protein complexed with the human zinc peptidase ACE2 was rapidly resolved at the atomic level by several research groups.[13,14,19,54−57] Because of its fundamental function, the S protein has become a promising target to fight the SARS-CoV-2 infection. As shown in this section, the ultimate goal is to use this knowledge to rationally design vaccines, neutralizing antibodies, and antiviral agents with the general purpose of blocking S-protein recognition and preventing cellular infection.

Structure, Biology, and Dynamics of the Spike Glycoprotein

Enveloped viruses such as SARS-CoV-2 need to fuse with the host cellular membrane to gain entry into the cell.[58] This step, although thermodynamically favorable, has an important energetic barrier and therefore is catalyzed by viral fusion proteins expressed on the viral envelope, namely, the S proteins. Such proteins also determine the viral host range, that is, the spectrum of cells they can infect, the tissue tropism, and the induction of the host immune response.[20] The viral S transmembrane protein is a class I fusion protein composed of three segments: an ectodomain, a single-pass transmembrane anchor, and a short tail inside the virus (Figure A).[13,20] The ectodomain is constituted by a trimer of S1 subunits, whose main function is to recognize the host receptor, connected to the viral envelope through an S2 subunit, in charge of fusing the viral and host membranes. Prior to the host recognition, the SARS-CoV-2 S protein is in a metastable prefusion state.[13] When a subunit S1 binds to a host ACE2 receptor by means of the specific RBD region, the prefusion trimer of S1 subunits is destabilized, inducing a relatively large conformational change that leads the system to the postfusion state.[20] Afterward, the S1 subunits are released while the S2 subunit produces the fusion of the cellular and viral membranes, allowing the insertion of the viral genome into the host cell.[20] The RBD subunit oscillates between two hinge-like conformations, namely, “up/open” and “down/closed” (see Figure A).[13] Whereas the former conformation is more flexible[59,60] and is able to bind the ACE2 receptor, in the down state, the RBD is hidden or buried, is more rigid, and is unable to recognize the host cell. Interestingly, it has been reported that different from the former SARS-CoV, the down state prevails in the SARS-CoV-2 RBD, contributing to hiding the highly immunogenic RBD surface.[61] It has been suggested that this feature contributes to the evasion of the host immune system surveillance, delaying the innate immune response and giving rise to longer infective periods that increase the spread of the contagion.[61] The partial loss of the potential infectivity of SARS-CoV-2 due to the preference of the RBD down conformation is compensated by a higher affinity of the RBD up state toward ACE2[13,62,63] and by a preactivation furin-sensitive cleavage site, which is absent in the previous SARS-CoV, that can facilitate the crossing of the interspecies barrier.[61] Henderson et al. have reported a fine control of these up/down configurations and discussed the implications in the development of vaccines and in the evolution of the SARS-CoV-2 pathogenicity.[64]

Figure 1

(A) Structure of the SARS-CoV-2 spike protein. S1 = receptor binding subunit, S2 = membrane fusion subunit, RBD = receptor binding domain, TA = transmembrane anchor, IT = intracellular tail. Inspired by ref (61). (B) Negative-stain electron microscopy image of an ACE2 receptor bound to a SARS-CoV-2 spike protein (left) and diagram of the microscopy image (right). Reprinted with permission from ref (13). Copyright 2020 American Association for the Advancement of Science. Recently, Tian and Tao[60] applied a combination of the Markov state model, transition path theory, and machine learning methodologies to analyze two independent MD trajectories of the up and down states performed by Shaw and coworkers.[65] The authors characterized the transition between the two SARS-CoV-2 RBD states and quantified their relative prevalence, revealing a stunning 95.5% preference for the down state, in agreement with experimental observations.[61] Moreover, multiple paths involving different intermediates leading from the down to the up state were identified and quantified in terms of probability.[60] The most probable channel is illustrated in Figure A and has been estimated to have a probability of the 23.7%. On the contrary, Gur et al. have estimated the free-energy landscape connected within the down-to-up transition, highlighting the role of salt bridges and solvent accessibility and identifying an intermediate semiopen state (see Figure B).[66] The existence of the intermediate state also participates in lowering the global free-energy barriers that should be overcome during the transition. The energetic penalty of the rate-limiting step has been estimated to be ∼3kBT. On the contrary, Xu and coworkers[67] have analyzed the distribution of the bending angle between the S1 and S2 subunits, evidencing that an angle of at least 52.5° is required to recognize ACE2.

Figure 2

(A) Most probable transition (23.7%) from a “down” to an “up” state in the SARS-CoV-2 RBD. The color code is blue (down) → red (down) → green (up) → orange (up) → yellow (up). ACE2 is represented in gray. Reprinted with permission from ref (60). Copyright 2020 Informa UK Limited. (B) Energy landscape for the down → up transition of SARS-CoV-2 RBD. Reprinted with permission from ref (66). Copyright 2020 AIP Publishing. Like in other viruses, such as HIV, the S protein of SARS-CoV-2 is extensively glycosylated.[68−70] Glycan shields play an important role in protecting epitopes from antibody neutralization, contributing to the eluding of the host immune response and diminishing the efficiency of therapeutic agents. Therefore, a detailed knowledge of the glycosylation sites and their dynamics is a prerequisite to understand the infection and to develop vaccines and other treatments. Grant et al.[71] have performed extended MD simulations to study the role of the S-protein glycan shield in modulating the antigen exposition and the accessibility of known antibodies to their corresponding antigens. The authors estimate that the glycosylation shields ca. 40% of the surface susceptible to being recognized by antibodies and highlight the RBD region of the S1 subunit as the most accessible and largest antigenic surface of the whole S protein. This result is coherent with the necessary recognition of the ACE2 receptor by the RBD and the consequent formation of the protein complex. Conversely, it has been also suggested that the RBD domain might be O-glycosilated at the Thr323 position,[70,72] an occurrence that could also have implications in the RBD/ACE2 recognition mechanism and in the design of possible inhibitors of this process. The precise role of glycosylation is, however, still debated, which is also due to the inherent difficulties in the experimental structural resolution, and further studies are required to clarify this possibility. In this context, a full-length model of the glycosylated SARS-CoV-2 S protein, in both the open and closed states, was recently built by Amaro and coworkers[73] to interpret, at an atomistic level, the role of glycans on the spike protein structure and dynamics using all-atom MD simulations in the microsecond range. Very recently, the structure–function relationship of the fully glycosylated SARS-CoV-2 spike protein was unveiled by the decomposition of unbiased MD simulations. In this way, spike substructures were identified by an energy criterium to possibly act as antibody binding sites.[74]

Molecular Basis of the Human ACE2 Recognition

As recently reported by Li and coworkers,[61] the binding mode of SARS-CoV-2 S to ACE2 is similar to the one of the SARS-CoV S protein, even though the affinity of SARS-CoV-2 RBD is higher.[13,55,62,63,75,76] The superior affinity of the SARS-CoV-2 RBD for human ACE2 compared with other related coronaviruses, including SARS-CoV, was also confirmed by different molecular modeling and simulation studies,[63,67,77−83] whereas other computational studies reported small or negligible differences,[84−87] in contrast with the latest experimental findings.[13,19,55,61,62] The variability of the results can be explained, in part, by differences in the specific MD setup, the choice of the model systems, and the methods of analysis. A great deal of attention has been devoted to the identification of the key interactions that drive the RBD/ACE2 recognition in both SARS-CoV and SARS-CoV-2 viruses. Coarse-grained[88] and biased MD simulations[89] indicate that the process occurs without any stable intermediate. There is consensus over the fact that noncovalent interactions, in particular, an extended network of multiple hydrogen bonds, dominate the recognition. Other interactions such as π-stacking or water/amino acid bridges have also been deemed relevant.[63,89] García-Iriepa et al. have performed all-atom MD simulations on top of the recently resolved RBD/ACE2 complex,[14] quantifying the most important hydrogen bonds that sustain the binding of the viral RBD to the ACE2 peptidase domain (PD).[89] The most important protein–protein interactions involve amino acids located at the α and β interfaces of the PD, as shown in detail in Figure A. Whereas this feature is conserved in both SARS-CoV and SARS-CoV-2,[19] Linial and coworkers demonstrated, by means of MD simulations, substantial differences between SARS-CoV-2, SARS-CoV, and the human coronavirus HCoV-NL63.[84] Indeed, SARS-CoV-2 has the largest interaction area involving the largest number of RBD/ACE2 contacts,[81] whereas SARS-CoV interacts through fewer amino acids but via stronger hot spots, and HCoV-NL63 RBD recognizes a different region of ACE2, as schematically shown in Figure B.[84]

Figure 3

(A) Most important amino acids involved in the SARS-CoV-2 RBD/ACE2 PD complexation. Amino acids color code: red (negatively charged), blue (positively charged), green (polar), cyan (neutral His ε-protonated). Reprinted from ref (89). (B) Schematic representations of the ACE2 recognition by the RBD of different coronaviruses. Reprinted from ref (84). Copyright 2020 by the authors. Licensee MDPI, Basel, Switzerland. Distributed under the terms and conditions of the Creative Commons Attribution (CCBY) license. Using MD simulations, Spinello et al.[63] have built Pearson-based matrices to evaluate the cross-correlation of the ACE2 residues with the RBD of SARS-CoV-2 and SARS-CoV, respectively. A larger correlation for SARS-CoV-2 has been found, and the pivotal role of some of its mutated residues (e.g., Asn501 and Phe486) in increasing the strength of the interaction between the two protein domains has been evidenced.[63] These results globally support the hypothesis of the stronger binding of SARS-CoV-2 RBD to ACE2, compensating the dominant down conformations of the S1 subunits of the S protein.[61]

SARS-CoV-2 Mutations and Animal ACE2 Receptors

SARS-CoV-2 mutations potentially leading to multiple viral strains and hence to the possible emergence of more pathogenic agents are also actively studied in detail using in silico methodologies. By analyzing 791 viral genomes of different SARS-CoV-2 strains, Alfaro and coworkers[59] found that the S2 subunit fusion is much less affected by mutations than the RBD. This fact seems to indicate that changes in the protein units directly involved in the membrane fusion may produce dysfunctional viruses that are unable to infect host cells. Consequently, the S2 subunit should be considered as a robust target for potential antiviral drugs given its high conservation and low mutation tolerance.[59] Notwithstanding, certain residues of the RBD have also been deemed indispensable for the efficient recognition of human ACE2.[79,90,91] On the contrary, Ou et al.[77] have reported that mutations in the RBD of SARS-CoV-2 observed from strains with different geographical distributions actually enhance the binding to ACE2, indicating that some RBD polymorphs might even be more infectious and hence increase the diffusion of the pandemic. The variability of ACE2 and hence the ability of SARS-CoV-2 to infect other species, cross the interspecies barrier, and develop zoonosis[92] has been computationally addressed by simulating animal ACE2 proteins.[92−96] Pach et al.[94] have studied several mutants of ACE2 belonging to cat, dog, ferret, hamster, mouse, rat, and red squirrel by means of MD simulations. The authors found molecular descriptors that could predict the susceptibility of these animals to being infected, alerting that red squirrels could suffer from SARS-CoV-2 infection. They also predicted that certain mutations in dogs, rats, and mice could lead to effective SARS-CoV-2 infections. These studies are also extremely relevant in the aim of finding or creating suitable animal models for experimentation.[94] Furthermore, the possible influence of distinct ACE2 polymorphs in the infectivity and severity of COVID-19 has been also reported.[97−99]

Rational Design of RBD/ACE2 Inhibitors

The capital function of the viral S protein and the advances achieved in characterizing the molecular bases of the RBD/ACE2 binding have motivated a considerable number of experimental and computational studies, aiming at identifying potential antiviral agents susceptible to inhibiting RBD/ACE2 recognition.[100−102] A commonly followed strategy, necessary also to cope with the societal and sanitary urgency induced by the pandemic, is to repurpose already approved drugs with known pharmacokinetics and toxicological profiles to possibly obtain a significant antiviral effect.[103−107] Several known antivirals, including the famous remdesivir, have been or are currently being tested against SARS-CoV-2,[108] with some of them undergoing clinical trials.[109−111] Whereas the mode of action usually consists of the inhibition of the viral RNA polymerase, proteases, or neuraminidases,[112−115] the anti-influenza drug Arbidol[116] has recently been proposed as an inhibitor of the trimerization at the S2 subunit of the S protein.[117] On the contrary, the efficacy of antimalarial drugs chloroquine and hydroxychloroquine, which have been compassionately administered in severe COVID-19 patients, is being intensely debated.[118−121] Several modes of action have been hypothesized, although the specific molecular mechanism or mechanisms responsible for the in vitro inhibition of SARS-CoV-2 remain unclear.[122] The benefit/risk balance of the combination of hydroxychloroquine with the antibiotic azithromycin has been questioned as well.[123−126] Fantini and coworkers[127] used MD simulations to study the molecular mechanism of action of these combinations of drugs. The authors do not exclude possible synergistic effects: whereas hydroxychloroquine is directed toward the ACE2 protein of the host cell, azithromycin interacts with the SARS-CoV-2 S protein. The binding site, however, is located not at the RBD but at a ganglioside binding site in the lateral regions of the S protein.[127,128] Other studies combining machine learning and molecular docking did not point out remarkable interactions between azithromycin and the viral S protein, although other glycopeptide and macrolactam antibiotics showed inhibition potential.[129] Hence, no clear evidence unequivocally identifying the molecular bases of the proposed therapeutic protocol can, at the moment, be invoked. Flavonoids are a broad family of natural compounds with antioxidant, antibiotic, antinflammatory, anticancer, and antiviral properties.[130,131] Because of their known biological properties, general safety, and broad availability, recent studies have tackled their potential activity to block the SARS-CoV-2 RBD.[132−136] Among the different flavonoids, hesperidin (see the structure in Figure A), a compound present in citrus fruits and other vegetables,[137] has shown the ability to block the RBD/ACE2 interaction (as well as the main viral protease),[138] even motivating large-scale extraction processes for massive production.[139]

Figure 4

(A) Chemical structure of different flavonoids. (B) Relevant binding hot spots of different flavonoids with human ACE2 including binding energies (kcal/mol) and histogram of the distances between the RBD and ACE2-PD centers of mass for (C) diosmin and (D) rutin. Adapted from ref (89). García-Iriepa et al. have studied the potential ability of the flavonoids diosmin, rutin, and naringin (Figure A) to inhibit RBD/ACE2 recognition by means of both docking and all-atom MD simulations.[89] Molecular docking revealed three relevant binding hot spots on the human ACE2 protein, the loop and the PDs at interface α and interface β, with estimated binding affinities of ∼6–9 kcal/mol (Figure B). The destabilization of the RBD/ACE2 complex was highlighted by MD simulations, in particular, by measuring the distance between the center of mass of the RBD and ACE2 PD. Despite the promising binding energies for diosmin, the MD results indicate the poor destabilization of the RBD/ACE2 complex for this flavonoid, given the limited distance distributions with respect to the control (RBD/ACE2 complex in the absence of drug; see Figure C). In contrast, the effect of rutin is more pronounced, yielding larger RBD–ACE2 distances for all three binding hot spots (Figure D).[89] These results evidence that even though docking studies are a first and necessary step for the preliminary assessment of a potential drug activity against SARS-CoV-2, further analyses, including extended MD simulations, are required to validate and quantify the inhibition potentiality. A large number of antibiotics have also been tested by means of docking studies against the viral RBD.[140−143] Plicamycin (also known as mithramycin; see the structure in Figure A) is clinically used as an anticancer agent[144,145] and shows a promising interaction with ACE2, in particular, at interface β, a region that is directly involved in the recognition of the viral RBD (see Figures B and 5B,C).[89]

Figure 5

(A) Chemical structure of plicamycin (mithramycin). (B) Histogram of the distances between the centers of mass of the RBD and the ACE2 PD measured from extended MD simulations. (C) Representative snapshots of the two plicamycin conformations in equilibrium at the interface-β. (D) Potential of mean force (PMF) corresponding to the RBD/ACE2 complexation in the absence and in the presence of plicamycin. (E) Snapshots of the complex in the presence of plicamycin at interface-β, at its free-energy minimum (bound), and as separated structures. Reprinted from ref (89). The RBD/ACE2 complex destabilization was further quantified by determining the binding free energy using a combination of metadynamics[146] and adaptative biased force (eABF),[147] that is, the meta-eABF method.[148,149] The results of García-Iriepa et al. revealed a 2 kcal/mol destabilization of the RBD/ACE2 complex in the presence of the drug as compared with the nondrugged situation. Furthermore, the RBD/ACE2 distance at the free-energy minimum is increased by ∼5 Å with respect to the drug-free system, supporting the mode of action of plicamycin as a possible antiviral against SARS-CoV-2 (Figure D,E).[89] An alternative strategy to inhibit the RBD/ACE2 binding has been recently proposed by Boldrini et al., in this case directed only toward the human receptor.[150] The authors propose to hamper the ACE2 maturation at a post-translational level by designing inhibitors for specific folding intermediates. As a result, the downregulation of ACE2 would limit the number of receptors susceptible to binding the SARS-CoV-2 S protein and stopping the infection. The free-energy profile corresponding to the folding process was computed through biased MD simulations, revealing two intermediate conformations (early and late intermediates; see Figure A). Only the late intermediate was found to be druggable, for which two specific pockets were identified. A database of approved drugs was tested to identify potential blockers of these two pockets (Figure B). Among the 35 hits, the authors identified mefloquine and hydroxychloroquine as compounds showing possible anti-SARS-CoV-2 activity in vitro (Figures C,D).[150] Despite the interest of this strategy for antiviral purposes, care should be taken in examining the possible side effects due to the decrease amount of active ACE2 receptors.

Figure 6

(A) Free-energy profile corresponding to the ACE2 folding. (B) Druggable pockets of the late intermediate state, colored in magenta. (C) Chemical structures of mefloquine and hydroxychloroquine, which showed potential inhibition of the ACE2 late intermediate. (D) Snapshot of mefloquine inside the binding pocket of the ACE2 late intermediate. Reprinted and adapted from ref (150). Distributed under the terms and conditions of the Creative Commons Public Domain Dedication (CC0 1.0). The design of oligopeptides based on ACE2 sequences and able to bind to the RBD has been recently proposed to selectively target the viral material.[151,152] Oligopeptides present several advantages with respect to traditional small drugs: (i) They cover large surfaces of the binding hot spots through multiple contacts, (ii) the interactions are based on protein–protein recognition and therefore are highly specific, (iii) because they are based on human ACE2 sequences, they are unlikely to trigger unwanted immune responses, and (iv) they can be potentially combined with nanoparticle carriers.[153] By using extended MD simulations, Han and Král[151] have studied four peptides based on the ACE2 PD (see Figure A–E), analyzed their stabilities through the root-mean-square deviation (RMSD) values (Figure F), and assessed their ability to bind SARS-CoV-2 RBD by estimating the binding energies (Figure G). The sequences were designed to mimic several regions of ACE2, although the majority of crucial amino acids (up to 15) necessary for the RBD recognition belong to the α1-helix and are present in all prototypes. Globally, all peptides showed very promising results, exhibiting similar binding energies as compared with the full ACE2 PD.[151] A similar peptide was modeled by Pentelute and coworkers[152] and synthesized using automated fast-flow peptide synthesis, whereas its specific binding affinity toward SARS-CoV-2 RBD was quantified experimentally through biolayer interferometry. The peptide binds the viral RBD with a low 49 nM affinity; several modifications that enhance the binding capacity have, however, been proposed on the basis of MD simulations.[154,155]

Figure 7

(A–E) Representation of inhibitors 1–4 and the control c (α1-helix shown in red) complexed with the SARS-CoV-2 RBD (blue). (F) Analysis of the stabilities of 1–4 and c through the root-mean-square deviation (RMSD) values. (G) Analysis of the ability of 1–4 and c to bind SARS-CoV-2 RBD by estimating the binding energies. Reprinted from ref (151). A strategy previously used against SARS-CoV relies on the use of peptides to bind the S2 subunit.[156] Oligopeptides derived from the S2 heptad repeats 1 and 2 (HR1 and HR2, respectively) have been proposed. HR1 and HR2 are known to associate to form a six-helix bundle fusion core that is crucial to allow the cellular/viral membrane fusion. Thus HR2 analogues able to effectively bind HR1 structures would block the membrane fusion and therefore halt viral infection. This strategy has been recently assessed for SARS-CoV-2 by Ling and coworkers.[157] Indeed, MD simulations reveal that the HR2-like peptides are able to bind HR1 with a free binding energy of −33.4 kcal/mol. These results are also coherent with the observed high conservation of the S2 genome between SARS-CoV and SARS-CoV-2. Using a computational model of the complex between the S-protein of SARS-CoV-2 and the human ACE2 receptor, a supercomputer-based molecular docking was recently performed by Smith and Smith,[158] revealing that 77 ligands bind strongly to either the S-protein/ACE2 interface or the isolated S protein.

SARS-CoV-2 Proteases

Although the SARS-CoV-2 genome is rather large, being formed by an RNA strand of ca. 30 000 nucleobases, it encodes a relatively low number of proteins, which can be divided into structural and nonstructural proteins (Figure ).[159] Whereas structural proteins are required to generate viral particles, 16 nonstructural proteins (Nsp1–16) are responsible for the diverse virus functionalities, mainly focusing on replication and survival in the host cell. SARS-CoV-2, such as other betacoronaviruses, translates, once inside the host cell, two overlapping polyproteins (pp1a and pp1ab) that are responsible for encoding all nonstructural and structural proteins. The latter evolve into mature virions inside intermediate compartments formed by the host endoplasmic reticulum and the Golgi apparatus before being excreted to the extracellular region.[160] The viral replication is initiated by the cleavage of both pp1a and pp1ab polyproteins[161,162] to generate all 16 nonstructural proteins.

Figure 8

SARS-CoV-2 genome structure, including structural and nonstructural proteins.

Initially, two proteases encoded within the polyproteins—Nsp5, corresponding to 3-chymotrypsin-like protease (3CLpro), and Nsp3, corresponding to papain-like protease (PLpro)—undergo autocleavage. Once generated, 3CLpro and PLpro cleave the polyprotein, including RNA-dependent RNA polymerase (RdRp), helicase (Hel), exonuclease (exoN), endoribonuclease (NendoU), and an S-adenosyl-methionine-dependent ribose 2′-O-methyltransferase (2′-O-MT). Because of their positions within the polyprotein, PLpro is responsible for the formation of Nsp1–3, whereas 3CLpro accounts for the formation of the other Nsps (Nsp4–16).[163] Although all structural and nonstructural proteins could be of potential interest for drug repurposing (see, e.g., the potential binding mechanism of remdesivir to RdRp)[164] due to their fundamental relevance in the viral replication process, the following sections are dedicated to the modeling and simulation of drugs that possibly inhibit 3CLpro and PLpro functions. Furthermore, it is established that 3CLpro acts as a homodimer, although only one of the monomers contains the active binding site.[165] On the contrary, PLpro, although proposed by crystallography[23] to be accessible in different aggregation states, was shown to biologically act only as a monomer.

3CLpro Structure, Domains, And Catalytic Active Site

As already mentioned, 3CLpro is responsible for the catalytic cleavage of at least 11 polyprotein sites, whereas PLpro cleaves only 3 of them (Figure ). Because 3CLpro cleaves at more sites than PLpro, the former has been the object of more structural and computational studies and is a much better characterized target for anticoronaviral agents.[166−168] SARS-CoV-2 genome structure, including structural and nonstructural proteins. To rationalize the design of potential antiviral agents targeting 3CLpro, it is essential to define the protease structural details such as its catalytic active site, domains, and quaternary structure. In this regard, SARS-CoV-2 3CLpro significantly resembles the analogous SARS-CoV enzyme, sharing a 96% sequence identity.[24] The active 3CLpro homodimer consists of two protomers, each of them composed of fewer than 310 residues (Figure a). Each protomer can be divided in three different domains (Figure b).[165,167,169,170] The dimerization takes place via the interaction between the N-finger and domain II of the protomers and is essential to ensure the catalytic activity.[165,169,171] Indeed, the shape of the substrate binding site is strongly influenced by the dimerization. As for the catalytic site, it consists of a dyad composed of His41 and Cys145 lying in the cleft between domains I and II and bears strong similarities with the one described for the 3CLpro of SARS-CoV.[165,167,169,170]

Figure 9

(a) 3CLpro crystal structure (PDB ID 6LU7). Protomers A and B are depicted in blue and purple, respectively. The inhibitor placed in the binding pocket of each protomer is depicted in van der Waals representation and in yellow. (b) Representation of one protomer, highlighting its three domains, the loop connecting domains II and III, and the N-finger. Because of the well-characterized structure and its appeal as drug target, numerous potential 3CLpro antiviral agents have already been proposed.[24,159,172−174] In this regard, docking studies have been widely used to rapidly screen a large number of lead compounds based on their affinity with the 3CLpro binding pocket,[175−181] whereas MD has mainly been performed to confirm the stability of the selected docking poses for each antiviral.[175,182−185] Some simulation studies have been reported that focus on rationalizing the drug–3CLpro interactions by quantifying the strength of the binding process via free-energy calculations.[173,174,181,186,187] The most used[173,181−183,188−190] 3CLPro crystal structure is the dimer complexed with an inhibitor and resolved by Jin et al.[191] (Figure a); the crystal structure of the apo form has been recently released by Zhang et al.[24]

3CLpro Antivirals Screening through Molecular Docking

Molecular docking has become an increasingly important tool to assist and stimulate drug discovery because it can model the interaction between a small molecule and a protein at the atomic level, allowing a fast and computationally cheap prescreening of drug candidates. It also allows the prediction of the ligand conformation as well as its position and orientation within the binding sites (usually referred to as the pose) and binding affinities. The first studies faced the problem that the crystallographic structure of SARS-CoV-2 3CLpro was not available and hence relied on homology modeling;[175,176] the experimental resolution of the 3CLpro structure boosted a huge increase in molecular docking studies.[176−180,192] When the binding site is known, selecting a reasonably small docking box around the hot spot (i.e., focused docking) reduces the computational time. However, other competitive binding sites can be artificially neglected when using this strategy. Focused docking has been used to perform the structure-based virtual screening of a library of 1000 covalent protease inhibitors and 16 United States Food and Drug Administration (FDA)-approved protease inhibitors. Three compounds, including paritaprevir and simeprevir (Figure ), were identified as potential 3CLpro inhibitors.[179]

Figure 10

Chemical structure of commercial drugs used as potential inhibitors of the SARS-CoV-2 main protease 3CLpro.

Chemical structure of commercial drugs used as potential inhibitors of the SARS-CoV-2 main protease 3CLpro. The focused-docking approach has also been used by Peele et al.[178] to test 24 plant compounds with antiviral properties, 22 FDA-approved antiviral drugs, and 16 antimalarial drugs. Lopinavir (an anti-HIV drug), amodiaquine (an antimalarial drug), and theaflavin digallate (a plant-based phenol derivative) were selected as potential SARS-CoV-2 therapeutics (Figure ) due to the good affinity for the active site of 3CLpro.[178] Although the binding site of 3CLpro is known, blind-docking approaches, targeting the whole protein, have also been considered. In this regard, Yu et al.[177] have studied the possible interactions of ribavirin and remdesivir (antiviral drugs), chloroquine (antimalarial), and luteolin (flavonoid) with 3CLpro (Figure ). Luteolin was found to bind the SARS-CoV-2 3CLpro with high affinity, occupying the same binding site as the standard N3 inhibitor. Chloroquine was also found to stably bind 3CLPro, suggesting that protease inhibition could represent a secondary mechanism of action of this potential drug.[122] González-Paz et al.[192] performed a blind-docking study to predict the binding of the B1a and B1b forms of the wide-range antiparasitic drug ivermectin (Figure ) to 3CLpro, suggesting that ivermectin B1a binds with a higher affinity than ivermectin B1b. Lobo-Galo et al.[180] used a multiscale approach in which blind docking was first used to pinpoint the principal potential binding cavities, most often corresponding to the active site, and this region was subsequently selected for further focused docking. The authors screened thiol-reacting FDA-approved drugs including captopril (antihypertensive drug) and found that disulfiram, used to treat chronic alcoholism, has promising antiviral properties (Figure ). Macchiagodena et al.[181] have applied an interesting computational strategy that includes the combination of molecular docking and ligand generative adversarial network (LIGANN) methods to generate new ligands that match the shape and the chemical characteristics of the binding pocket, thus resulting in a de novo drug design. On the basis of molecular docking calculations, five noncommercially available compounds have been identified that exhibit the highest binding affinity toward 3CLpro (Figure ).

Figure 11

Chemical structure of noncommercial compounds used as potential inhibitors of the SARS-CoV-2 main protease 3CLpro.[181]

Chemical structure of noncommercial compounds used as potential inhibitors of the SARS-CoV-2 main protease 3CLpro.[181] On the one hand, both focused- and blind-docking calculations can be followed by MD simulations (see the next section) or by binding free-energy calculations to further refine the relevance of the docked poses. The latter strategy was adopted for carfilzomib (an approved anticancer drug acting as a proteasome inhibitor) in the interaction with 3CLpro, in which the binding free energy of the most relevant docking pose was found to be −13.8 kcal/mol.[143] On the other hand, instead of preselecting molecular ligand structures based on the biological function or chemical intuition, docking methods can be applied as an extensive virtual screening tool, scanning up to 1.3 billion compounds, as was performed in the identification of potential 3CLpro inhibitors.[193,194]

Molecular Dynamics Simulations to Evaluate the Potential Activity of 3CLpro Antivirals

In addition to the screening provided by docking, further studies including dynamic sampling are essential to evaluate the favorable drug–3CLpro interactions and their stability along the simulation time. Indeed, MD studies were performed to first assess the molecular bases of the SARS-CoV-2 3CLpro catalytic function, including the impact of the enzyme dimerization[195] and the definition of the ligand anchor site within its pocket.[183] Different families of drugs targeting 3CLpro have been studied through dynamics simulation approaches with mainly two aims: (i) identifying the most promising antiviral drugs based on their interaction with 3CLpro, belonging to families of compounds whose inhibition activity against SARS-CoV-2 has not yet been proved, and (ii) rationalizing the interaction patterns and possibly the mechanisms of action for agents whose anti-SARS-CoV-2 activity has already been proposed. The first approach has mainly concerned studies based on drugs with significant activity against other viruses. Examples include the natural phytochemicals α-ketoamide(11r), baicalin, cyanidine 3-glucoside, glabridin, and hypericin.[196−199] A comprehensive MD study of rutin (see Figure A) has also been performed to understand its interaction inside the 3CLpro binding pocket, allowing the proposition of rutin analogues with an enhanced interaction.[187] Regarding natural products, marine compounds have also been proposed as 3CLpro inhibitors and have been studied through modeling approaches, hence identifying the derivatives with the most favorable and stable interactions.[184] Apart from natural products, other research groups have focused on FDA-approved antimicrobial drugs (such as viomycin) or other already known drugs (such as carfilzomib, eravacycline, valrubicin, lopinavir, elbasvir, streptomycin, and oftasceine, among others) to showcase, thanks to MD simulations, those with the highest potential.[188] Regarding the works focused on investigating the 3CLpro inhibitors that already show promising experimental results against SARS-CoV-2, the combination of experimental data and modeling results allows us to determine the binding mechanism and the specific interactions that enhance the stability of the drug inside the 3CLpro pocket. For instance, entecavir and nelfinavir revealed their structural differences and the concomitant influence on the binding affinity and stability inside the 3CLpro pocket,[183] showing that the physical–chemical knowledge derived from simulation studies is quite valuable for future proposals of drugs or derivatives. Another example is the work of Sk et al.,[186] in which, taking advantage of the available crystal structures of 3CLpro in the interaction with two inhibitors (α-ketoamide and Z31792168), the authors disentangled through MD simulations the molecular origin (electrostatic, van der Waals, polar, or nonpolar interactions) of the binding mode providing structural stability to the drug−protein complex.[186] Most of the published studies making use of MD simulations evaluate the stability of the initial binding poses through the analysis of: (i) the RMSD of both the 3CLpro and the drug, (ii) the root-mean-square fluctuation (RMSF), and (iii) the radius of gyration (Rg). These three descriptors are essential in evaluating the pose stability, as the RMSD determines the structural stability, the RMSF reveals the drug effect on the system flexibility and fluctuations, and Rg is indicative of the compactness of the global structure (refs (143, 173, 175, 184, 185, 187, 188, 190, and 200)). In addition, more specific criteria, such as the development of particular hydrogen bonds and hydrophobic interactions, are crucial to discriminate between potential or not efficient drugs because they ultimately determine the position of the drug with respect to the 3CLpro. Indeed, to show inhibitory activity, the ligand should be firmly placed inside the binding pocket, lay close to the catalytic dyad, and possibly interact with the S1, S2, and S4 units, as shown in Figure .[183,184,186−188]

Figure 12

3CLpro binding site defined by the S1, S2, and S4 units. The catalytic dyad composed of His41 and Cys145 is also depicted.

3CLpro binding site defined by the S1, S2, and S4 units. The catalytic dyad composed of His41 and Cys145 is also depicted. For instance, Huynh et al.[183] revealed that the clinically approved entecavir drifts from the starting docking pose to a novel stable location (Figure b). In the initial conformation, the drug establishes hydrogen bonds with residues Thr26, His41, and Gly143 (Figure b), whereas in the final conformation, the drug keeps the interaction with His41 but establishes two novel hydrogen-bonds with Glu166 and Ala189 (Figure b). As a matter of fact, the final drug location corresponds to a secondary pose already obtained by the docking study. This work constitutes a glaring example of the fact that the stability of the docking poses should be, when possible, confirmed by MD simulations because the explicit consideration of the solvent and the inherent protein flexibility, as well as the time evolution, could influence the interactions between the protein and the drug, changing the rank order of the docking poses.

Figure 13

(a) Green points depict the center of mass of the drug during the MD simulation. Hydrogen bonds between the drug and the 3CLpro binding pocket for (b) the starting structure (most stable docking pose) and (c) the pose after MD simulation. Figure adapted from ref (183). Another example is the work published by Mahanta et al. focused on the study of some FDA-approved drugs for antiviral repurposing.[188] In the case of viomycin, the analysis of the hydrogen-bond interactions within the 3CLpro binding site reveals that the drug moves deeply inside the binding pocket, reaching a quite stable position and developing robust interactions at the end of the MD simulation. The comparison of the hydrogen-bond interactions of viomycin placed at the bottom of the pocket with the ones observed for the already reported reference N3 demonstrates that the binding of viomycin is significantly stronger. To better quantify the precise positioning of the drug in the binding pocket and the balance of noncovalent interactions leading to its stability, different analyses can be performed. In particular, the position of the drug and its extension inside the cavity can be analyzed through the evaluation of solvent-accessible surface area (SASA) and, based on this concept, through the contact area between the drug and the 3CLpro binding pocket. As described by Huynh et al.,[183] the comparison of the contact area computed along the MD simulation for two different drugs allowed us to conclude that one of them does not completely fill the 3CLpro binding pocket due to its small size, hence indicating a weaker interaction. In other cases, the calculation of the SASA along the MD simulation has allowed differentiation of the expansion or the shrinking of 3CLpro when binding to different drugs.[182,186,200] Huynh et al. studied the behavior of rutin (Figure A) as a possible 3CLpro inhibitor.[187] After analyzing the MD simulation, it was concluded that hydrophobic interactions between the drug and binding site, essential for drug stabilization, were missing for rutin due to the large number of hydroxyl groups. The authors used this knowledge to design a more hydrophobic analogue of rutin, in which two methyl groups replaced two hydroxyl groups in each sugar moiety. This analogue showed much stronger hydrophobic interactions with the binding site, placing the drug deeply into the 3CLpro pocket. When performing molecular simulations involving many different drugs, the principal component analysis (PCA) is useful to classify the compounds in different groups depending on the evolution of the drug–protein interaction along the simulation time.[182,186] For instance, the PCA based on the MD simulation of five phytochemicals revealed that the three most stable drugs (α-ketoamide(11r), cyanidin 3-glucoside, and baicalin) share similar dynamic and interaction patterns inside the 3CLpro binding site.[182] Going beyond equilibrium MD, free-energy methods can be used to obtain the binding free energy of the drug–protein complex. However, free-energy methods are diverse, and, as usual, high accuracy is invariably accompanied by a high computational cost. As an example, Sk et al.[186] have estimated the binding free energy of 3CLpro with two drugs (α-ketoamide and Z31792168), resorting to the highly approximated but computationally inexpensive molecular mechanics Poisson–Boltzmann surface area (MM-PBSA)[201−208] strategy. The decomposition of the total free energy in its different energetic components increases the details of the description. By analyzing the results presented in Figure a, the authors concluded that the binding is mostly driven by van der Waals (ΔEVdW) and electrostatic (ΔEelec) interactions and nonpolar solvation free energy (ΔGnp), indicating ΔEVdW as the leading contribution. Conversely, the polar solvation free energy (ΔGpol) and the configurational entropy (−TΔS) destabilize the interaction.

Figure 14

(a) Total binding free energy (ΔGbind) and the van der Waals (ΔEVdW), electrostatic (ΔEelec) interactions, polar solvation free energy (ΔGpol), nonpolar solvation free energy (ΔGnp), and the configurational entropy (−TΔS) components for both drugs under study. (b) Contribution of each residue to the total binding free energy for both drugs, α-ketoamide and Z31792168, in the left and right panels, respectively. Figure adapted with permission from ref (186). Copyright 2020 Informa UK Limited. The authors pushed the analysis to understand the contribution of some specific amino acids on the binding free energy by using the molecular mechanics/generalized Born surface area (MM-GBSA) approach,[209−213] which provides a per-residue decomposition of the binding free energy. Their results (Figure b,c) evidence that for α-ketoamide, the number of residues significantly contributing to the binding free energy is larger than that for Z31792168, which is in line with the more negative total binding free energy computed for the former drug (Figure a). Similar binding free-energy analyses have been reported by Wang et al.,[143] in this case focusing on neutral and charged clinically approved drugs, such as carfilzomib, lopinavir, elbasvir, and streptomycin, among others, finally selecting carfilzomib as the most efficient inhibitor. MD simulations coupled to a regression model correlating the binding energy and molecular descriptors have also been proposed to establish the 3CLpro-inhibiting capabilities of the long-debated chloroquine and hydroxychloroquine drugs. The interactions with some active-site residues (Cys145, His41) show that aminoquinoline analogs should result in more suitable choices.[214]

Hybrid Quantum Mechanics/Molecular Mechanics Studies of 3CLpro

If the chemical reactivity plays an important role in protein–substrate systems, as is the case for catalytic activity, then the region of the molecular system undergoing chemical reactions must be described by quantum mechanics (QM) methods, surrounded by the rest of the macromolecular environment, treated by molecular mechanics (MM), to get insights into the reaction free-energy profile. This results in the so-called QM/MM approaches, which have been widely adopted with different flavors in biochemistry and biology in the last several years.[215,216] In the context of the properties of SARS-CoV-2 3CLpro, QM/MM studies revealed the possibility to discriminate between reactive and nonreactive enzyme–substrate complexes.[217] Also, these multiscale simulation methods allowed it to be firmly established that the free-energy landscape is associated not only with the substrate binding but also, more importantly, with different steps of the proteolysis reaction, as described by the corresponding transition states.[218] Moreover, this helped in elucidating how SARS-CoV-2 3CLpro differs in the mechanism of action compared with other cysteine proteases.[219]

Papain-like Protease Catalytic Site and Ubiquitin-like Domains: A Multifunctional Protein

The papain-like protease (PLpro) is recognized as an attractive target for antivirals,[23] although it is much less studied as compared with 3CLpro. Like 3CLpro, PLpro is primarily a cysteine protease (Figure c). PLpro recognizes the tetrapeptide Leu-X-Gly-Gly, a motif that is found at the Nsp 1/2, Nsp 2/3, and Nsp 3/4 borders.[220,221] Even though PLpro and 3CLpro share the same primary function, some in vitro studies have outlined that PLpro has two additional proteolytic functions: the removal of ubiquitin (Ub) and the removal of a Ub-like protein named interferon-stimulated gene product 15 (ISG15).[161,162,222−224] These two additional functions are available due to the presence of two Ub and Ub-like binding subsites on the protein surface (SUb1 and SUb2 in Figure b,d). Hence, the PLpro function is not only restricted to the initiation of the viral replication process through the cysteine protease catalytic activity, as deubiquitinating and deISGylating activities also compete with the response of the immune system.[225−227] Indeed, the infected host cell can stimulate, through ubiquitination and ISGylation, the production of interferon-stimulated gene products as cytokines and chemokines, known for their antiviral properties.[228,229] Indeed, ISGylation can fight the virus through the sequestration and degradation of viral proteins.[230−232] Hence, the activity of PLpro can substantially reduce the secondary immune response, conferring to SARS-type viruses an additional survival mechanism.

Figure 15

(a) Papain-like protease (PLpro, PDB ID: 6W9C). The structures of (b) SUb1, in blue, (c) the active site, in red, and (d) SUb2, in yellow, are highlighted.

(a) Papain-like protease (PLpro, PDB ID: 6W9C). The structures of (b) SUb1, in blue, (c) the active site, in red, and (d) SUb2, in yellow, are highlighted. Structurally, PLpro offers some advantages for the in silico design of anti-SARS-CoV-2 drugs. This enzyme is active as a monomer, and hence no information on aggregation states is required. Moreover, a monomer implies a reduced number of atoms compared with a dimer or a trimer, allowing longer simulation times to be reached, which are usually required to efficiently sample flexible protein structures and their interactions with the proposed drugs. Also, the presence of three different binding domains as possible antiviral targets (the cysteine protease active site, SUb1 and SUb2) makes PLpro a highly attractive multifunctional target. Despite these favorable characteristics, some challenges need to be faced when proposing PLpro inhibitory mechanisms: The protease substrate is composed of a conventional catalytic triad (Cys112–His273–Asp287)[23] that is partially solvent-exposed, with side chains hampering any contact with the catalytic triad. Indeed, a nearby gate has to be accessed by the polypeptide undergoing catalytic cleavage, identified as Leu-X-Gly-Gly subsites (Figure a). On the basis of the prescreening of a library of >50 000 compounds, also followed by synthetic optimization, it was indeed found that drugs can be designed to close such a gate by inducing a loop closure that shuts down catalysis at the active site.[233] SUb1 and SUb2 binding subsites are much less studied and hence are less characterized. Therefore, a deeper understanding of these additional PLpro proteolytic functions is envisaged to improve the specificity of the inhibitor design. From a functional point of view, whereas the deubiquitinating activity is more established, the deISGylating activity is not completely understood,[234,235] although its effects on the immune response during a viral infection are documented.[236] From a structural point of view, when looking at the SARS-CoV-2 PLpro’s X-ray structure, a majority of hydrophobic exposed side chains can be identified at both the SUb1 and SUb2 surfaces. Similar to the previous SARS-CoV PLpro, the presence of SUb1 as a C-terminal domain and SUb2 as a N-terminal domain is confirmed, with both domains being distant from the catalytic site and both being active for deubiquitination and deISGylation.[222] The flexibility and thus specificity of SUb1 and SUb2 subdomains was demonstrated in mouse hepatitis virus (MHV) PLpro by altering the binding of Ub and ISG15 through the change of a single amino acid (Asp to Ala), in turn leading to pathogenic modifications.[234−236] Comparing the pp1ab polyprotein of SARS-CoV and SARS-CoV-2, a moderate 83% sequence similarity is observed in the PLpro region, hence raising the question of whether SARS-CoV PLpro antivirals can be directly used to inhibit SARS-CoV-2 PLpro. In more detail, focusing on SUb1, differences in six amino acids are found (SARS-CoV residues in parentheses), Ser170(Thr), Tyr171(His), Tyr216(Leu), Gln195(Lys), Thr225(Val), and Lys232(Gln), suggesting electrostatic and steric modifications.[163]

PLpro Inhibitors of the Catalytic Active Site

The vast majority of the in silico search for potential antivirals against the action of PLpro is dedicated to the inhibition of the catalytic active site. This is mainly due to the fact that 3CLpro and PLpro share the same basic features of a cysteine protease. Hence, the same antiviral compound could, in principle, interact with the same type of protein subdomain. Nevertheless, we know that 3CLpro and PLpro have different inhibiting subsites as entry gates for the cleavable polypeptides, so this represents a challenging goal. From a methodological point of view, such multitarget studies were usually performed by applying molecular-docking techniques followed, in most of the cases, by virtual screening based on the calculated binding energy. By using these approaches, luteolin was proposed to bind with high affinity to 3CLpro (ca. −5.5 kcal/mol) and even more to PLpro (ca. −7.0 kcal/mol) due to the formation, in the latter case, of three hydrogen bonds with Asn180, Arg105, and Phe185.[177] Evidently, the low computational cost offered by docking can be of high interest when comparing a large set of ligands, including compounds proposed by experimentalists or previously used to treat other infections. For example, Hosseini et al. proposed a set of promising inhibitors showing a possibly stronger effect with respect to the known and highly debated remdesivir, lopinavir, and ritonavir (acting against 3CLpro).[237,238] The authors previewed a higher inhibition effect by docking PLpro with paritaprevir, glecaprevir, velpatasvir, amaryl, trypan blue, raltegravir, ledipasvir, and simeprevir (all of them within a really narrow 0.5 kcal/mol energy difference). This long list also shows how docking techniques are proficient for proposing alternative compounds, although they lack in the identification of a clearly emerging lead compound. A similar approach was employed by Wu et al., considering an even larger set of drugs,[239] divided according to their use: Ribavirin, valganciclovir, and thymidine (antiviral drugs); chloramphenicol, cefamandole, and tigecycline (antibacterial drugs); chlorphenesin carbamate (muscle relaxant drug); and levodropropizine (antitussive drug) were all found to have high binding affinity to PLpro. In a similar study, the focus was instead placed on the different activities of the drugs, and the most suitable target was finally proposed depending on the type of interaction with the protease.[240] In this case, ritonavir, lopinavir, and darunavir (anti-HIV drugs) were considered, and it was finally suggested that darunavir should be more selective for recognizing PLpro instead of 3CLpro. Nevertheless, additional docking studies suggest that darunavir could also be a suitable choice to bind 3CLpro,[241,242] highlighting the relevance of this particular anti-HIV drug as a potential anti-COVID-19 clinical drug. Different from molecular docking, bioinformatic criteria, using network proximity analyses of drug targets, can help to identify candidate molecules, also including potential drug combinations, via the exploration of not only one target but also the complete human interactome. As an example, it was proposed that a combination of mercaptopurine and melatonin can constitute a combination therapy for SARS-CoV-2 by targeting PLpro, ACE2, c-Jun signaling, and anti-inflammatory pathways in a synergistic way.[240] On the opposite side of the spectrum, to have high-value chemical insights about reliable potential antivirals, it is necessary to increase the complexity of the molecular modeling technique. MM-GBSA was used by Huang et al. to assess the interaction of omeprazole, methicillin, and tolazamide with PLpro.[243] Omeprazole was proposed as the best candidate, as strong interactions in inhibiting the Leu-X-Gly-Gly region were identified (Figure a). In particular, a synergy between polar and hydrophobic interactions was highlighted: hydrogen bonds with Ser212, Glu214, and Lys217 and π stacking of omeprazole’s phenyl group with Tyr213 (Figure b).

Figure 16

(a) Omeprazole, methicillin, and tolazamide PLpro binding site. (b) Insights into the interactions between PLpro and omeprazole. Reprinted from ref (243). Distributed under the terms and conditions of the Creative Commons Attribution (CCBY) license. The same MM-GBSA method was also performed on a different set of drugs, and it was found that elbasvir binds stably to RdRp, Hel, and PLpro.[244] Definitively much more accurate, from a physicochemical point of view, are fully atomistic enhanced sampling nonequilibrium simulations such as alchemical transformations, which were employed to measure the dissociation constant of hydroxychloroquine from 3CLpro, PLpro, and RdRp. It was concluded that the drug could act as a mild inhibitor for all three targets and especially for PLpro.[245] The proposed drug interacts with the Leu-X-Gly-Gly inhibition region close to the catalytic site. In more detail, the chosen computational approach is based on the Hamiltonian replica exchange method (HREM) coupled to nonequilibrium alchemical MD simulations (i.e., HREM/NE methodology).[246,247]

PLpro Inhibitors of the Ubiquitin-like Domains

Very few studies focus on the possible inhibition of SUb1 and SUb2 domains, which also happen to be the PLpro region showing the largest sequence differences when comparing SARS-CoV and SARS-CoV-2, hence increasing the difficulty of the task. Some structural insights of the interaction of SARS-CoV-2 PLpro with Lys48-linked diubiquitin (Lys48-Ub2) and ISG15 were proposed by MD studies.[248] Through extensive simulations (1 μs), it was found that mISG15 binds more tightly to the same PLpro subsite than Lys48-Ub2. On the basis of this result, the authors proposed GRL-0617 to be the inhibitor, which was already identified as a noncovalent inhibitor of PLpro in previous SARS viruses.[248] This assumption was based on the observation that the conserved Tyr268 could bind to GRL-0617, as in previously identified PLpros, blocking the entry of the ISG15 C-terminus while approaching the subsite. Interestingly, GRL-0617 does not inhibit PLpro’s involved in Middle East respiratory syndrome (MERS) infections because Tyr268 is replaced by a Thr residue.

Other SARS-CoV-2 Targets

In addition to the previously mentioned SARS-CoV-2 targets, common to diverse coronaviruses, four other biomolecular targets have been proposed to inhibit SARS-CoV-2. These are (i) the SUD,[25,26,239,249−252] (ii) the RdRp,[15,16,239,253,254] (iii) the Hel,[239,255,256] and (iv) the N-terminal RNA binding domain of the nucleocapsid protein.[257,258] Interestingly, the SUD, RdRp, and Hel all belong to coronavirus nonstructural proteins. Apart from the molecular-docking approaches dealing with RdRp[244,253,259−261] and Hel,[239] little interest has been dedicated to MD simulations,[256] and hence we expect those targets to become the objects of further studies. A different role can be assigned to the SUD, which is able to interact with oligo(G) nucleic acids of the host, like G-quadruplexes, to hamper the defensive response of the host, thus favoring viral infection of human cells.[25,250,251] Moreover, the SUD has recently been reported to promote the stabilization of the E3 ubiquitin ligase RCHY1, which degrades the antiviral factor p53.[252] The key role of the SUD in coronavirus activity has motivated Hognon et al.[262] to study its binding with a RNA G-quadruplex to explain the role of the noncanonical nucleic acid structure in the dimerization process. The investigation was performed using a combination of equilibrium and enhanced sampling classic MD simulations together with the evaluation of the 2D free-energy profile (Figure ).[262] Two stable interaction modes between the SUD and a model RNA G-quadruplex have been evidenced, in which the oligonucleotide resides either at the interface between the SUD units (dimeric mode, Figure b) or close to one of the monomers (monomeric mode, Figure c). The 2D free-energy profile has shown not only that the dimeric mode is slightly favored, leading to a binding free energy of ∼6 kcal/mol, but also that the presence of the RNA avoids the opening of the SUD units to lead to an open inactive conformation (Figure d). Overall, these results highlight the importance of the G-quadruplex interaction to stabilize the dimeric conformation of SUD and pave the way for the rational design of efficient therapeutic agents aiming at perturbing this interaction, hence enhancing the host defenses against the virus.

Figure 17

(a) 2D free-energy profile for the interaction between a SUD dimer (SUDA, SUDB) and an RNA G4 structure, including (b) dimeric and (c) monomeric binding modes. (d) SUD conformation. Reprinted from ref (262).

Conclusions

Molecular modeling and simulation techniques have achieved a high degree of maturity that allows their use to answer key biological questions, especially concerning rational drug design development. This assumption has been proven particularly true in the case of the global fight against the COVID-19 pandemic. Indeed, modeling and simulation have complemented experimental studies, offering a clear picture of the molecular bases of the main SARS-CoV-2 protein functions and highlighting possible targets to reach their inhibition. The use of multiscale modeling, in which the level of theory used is carefully tailored to the physical–chemical property that should be determined and which can be progressively increased, has allowed large libraries of existing compounds to be screened. This combined approach is fundamental in speeding up the quest for efficient antiviral treatments that are effectively capable of counteracting the high SARS-CoV-2 transmissibility. Furthermore, key protein targets may be considered, related to either to the cell infection (S protein) or the viral material maturation (proteases). The role of other nonstructural proteins and, in particular, their effect on eluding the host immune response, can also be efficiently tackled by molecular modeling and simulation techniques. If molecular docking was largely popular and mainstream for drug-design purposes, then the necessity to include long-time-scale MD simulations, also coupled to enhanced sampling techniques to obtain free-energy profiles, would have proven its fundamental role. In the present Review, we have focused on classical-based molecular simulations. However, QM or QM/MM approaches are expected to be of great utility in providing information on the specific chemical mechanisms of the protease enzymes or in the development of covalent-based inhibitors. Bioinformatic procedures were also considered for their capacity to estimate multi-target-based or synergic therapeutic protocols. The computational techniques reviewed here are therefore crucial actors of the fight that the scientific community has undertaken against SARS-CoV-2, especially when applied in tight collaboration with structural, molecular, and cellular biology researchers. This has been made possible by the great advancement in the methodological development experienced in the last several years, which has allowed us to treat complex systems and come closer to the real biological conditions while maintaining an atomistic resolution. It has also been facilitated by the mobilization of the computer centers of many countries and research agencies, which has provided an unprecedented computational power to the effort. With this Review, we have aimed at collecting and rationalizing the huge amount of data produced. Because the field is extremely dynamic and constantly evolving, the completeness of this Review cannot be granted; however, we are confident that we have identified some of the most important and relevant studies from either a methodological or an applicative point of view. Clearly, in answering the call against SARS-CoV-2, molecular modeling and simulation have demonstrated once more their status and their role as a fundamental and complementary tool to preview, analyze, and rationalize complex biological phenomena happening in complex environments. As such, they have reinforced their key position in the scientific panorama and in their capacity to tackle fundamental societal issues.

208 in total

1. p53 down-regulates SARS coronavirus replication and is targeted by the SARS-unique domain and PLpro via E3 ubiquitin ligase RCHY1.

Authors: Yue Ma-Lauer; Javier Carbajo-Lozoya; Marco Y Hein; Marcel A Müller; Wen Deng; Jian Lei; Benjamin Meyer; Yuri Kusov; Brigitte von Brunn; Dev Raj Bairad; Sabine Hünten; Christian Drosten; Heiko Hermeking; Heinrich Leonhardt; Matthias Mann; Rolf Hilgenfeld; Albrecht von Brunn
Journal: Proc Natl Acad Sci U S A Date: 2016-08-12 Impact factor: 11.205

2. Potential anti-viral activity of approved repurposed drug against main protease of SARS-CoV-2: an in silico based approach.

Authors: Saurov Mahanta; Purvita Chowdhury; Neelutpal Gogoi; Nabajyoti Goswami; Debajit Borah; Rupesh Kumar; Dipak Chetia; Probodh Borah; Alak K Buragohain; Bhaskarjyoti Gogoi
Journal: J Biomol Struct Dyn Date: 2020-05-25

Review 3. While We Wait for a Vaccine Against SARS-CoV-2, Why Not Think About Available Drugs?

Authors: Francisco J Barrantes
Journal: Front Physiol Date: 2020-07-03 Impact factor: 4.566

4. Structure-based design of antiviral drug candidates targeting the SARS-CoV-2 main protease.

Authors: Wenhao Dai; Bing Zhang; Xia-Ming Jiang; Haixia Su; Jian Li; Yao Zhao; Xiong Xie; Zhenming Jin; Jingjing Peng; Fengjiang Liu; Chunpu Li; You Li; Fang Bai; Haofeng Wang; Xi Cheng; Xiaobo Cen; Shulei Hu; Xiuna Yang; Jiang Wang; Xiang Liu; Gengfu Xiao; Hualiang Jiang; Zihe Rao; Lei-Ke Zhang; Yechun Xu; Haitao Yang; Hong Liu
Journal: Science Date: 2020-04-22 Impact factor: 47.728

5. Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation.

Authors: Daniel Wrapp; Nianshuang Wang; Kizzmekia S Corbett; Jory A Goldsmith; Ching-Lin Hsieh; Olubukola Abiona; Barney S Graham; Jason S McLellan
Journal: Science Date: 2020-02-19 Impact factor: 47.728

Review 6. A systematic review of asymptomatic infections with COVID-19.

Authors: Zhiru Gao; Yinghui Xu; Chao Sun; Xu Wang; Ye Guo; Shi Qiu; Kewei Ma
Journal: J Microbiol Immunol Infect Date: 2020-05-15 Impact factor: 4.399

7. Regulation of IRF-3-dependent innate immunity by the papain-like protease domain of the severe acute respiratory syndrome coronavirus.

Authors: Santhana G Devaraj; Nan Wang; Zhongbin Chen; Zihong Chen; Monica Tseng; Naina Barretto; Rongtuan Lin; Clarence J Peters; Chien-Te K Tseng; Susan C Baker; Kui Li
Journal: J Biol Chem Date: 2007-08-30 Impact factor: 5.157

8. Novel Coronavirus Polymerase and Nucleotidyl-Transferase Structures: Potential to Target New Outbreaks.

Authors: Wen-Fa Zhang; Preyesh Stephen; Jean-François Thériault; Ruixuan Wang; Sheng-Xiang Lin
Journal: J Phys Chem Lett Date: 2020-05-22 Impact factor: 6.475

Review 9. From SARS to MERS: crystallographic studies on coronaviral proteases enable antiviral drug design.

Authors: Rolf Hilgenfeld
Journal: FEBS J Date: 2014-08-11 Impact factor: 5.542

10. Structural variations in human ACE2 may influence its binding with SARS-CoV-2 spike protein.

Authors: Mushtaq Hussain; Nusrat Jabeen; Fozia Raza; Sanya Shabbir; Ayesha A Baig; Anusha Amanullah; Basma Aziz
Journal: J Med Virol Date: 2020-04-15 Impact factor: 20.693

22 in total

1. Meticulous assessment of natural compounds from NPASS database for identifying analogue of GRL0617, the only known inhibitor for SARS-CoV2 papain-like protease (PLpro) using rigorous computational workflow.

Authors: Paritosh Parmar; Priyashi Rao; Abhilasha Sharma; Arpit Shukla; Rakesh M Rawal; Meenu Saraf; Baldev V Patel; Dweipayan Goswami
Journal: Mol Divers Date: 2021-05-18 Impact factor: 3.364

2. Nano-size dependent protein corona formation by SARS-CoV-2 Omicron spike protein over gold nano-colloid and reversible aggregation.

Authors: Kazushige Yokoyama; Theresa Lam; Jack Santariello; Akane Ichiki
Journal: Colloids Surf A Physicochem Eng Asp Date: 2022-04-16 Impact factor: 5.518

3. Nano-size dependence in the adsorption by the SARS-CoV-2 spike protein over gold colloid.

Authors: Kazushige Yokoyama; Akane Ichiki
Journal: Colloids Surf A Physicochem Eng Asp Date: 2021-02-04 Impact factor: 4.539

4. Microsecond MD Simulation and Multiple-Conformation Virtual Screening to Identify Potential Anti-COVID-19 Inhibitors Against SARS-CoV-2 Main Protease.

Authors: Chandrabose Selvaraj; Umesh Panwar; Dhurvas Chandrasekaran Dinesh; Evzen Boura; Poonam Singh; Vikash Kumar Dubey; Sanjeev Kumar Singh
Journal: Front Chem Date: 2021-01-13 Impact factor: 5.221

5. Predicting Potential SARS-COV-2 Drugs-In Depth Drug Database Screening Using Deep Neural Network Framework SSnet, Classical Virtual Screening and Docking.

Authors: Nischal Karki; Niraj Verma; Francesco Trozzi; Peng Tao; Elfi Kraka; Brian Zoltowski
Journal: Int J Mol Sci Date: 2021-02-04 Impact factor: 5.923

6. Allosteric Cross-Talk among Spike's Receptor-Binding Domain Mutations of the SARS-CoV-2 South African Variant Triggers an Effective Hijacking of Human Cell Receptor.

Authors: Angelo Spinello; Andrea Saltalamacchia; Jure Borišek; Alessandra Magistrato
Journal: J Phys Chem Lett Date: 2021-06-23 Impact factor: 6.475

7. New insights into the catalytic mechanism of the SARS-CoV-2 main protease: an ONIOM QM/MM approach.

Authors: Henrique S Fernandes; Sérgio F Sousa; Nuno M F S A Cerqueira
Journal: Mol Divers Date: 2021-06-24 Impact factor: 3.364

8. Development and Evaluation of Peptidomimetic Compounds against SARS-CoV-2 Spike Protein: An in silico and in vitro Study.

Authors: Omid Zarei; Hannah Kleine-Weber; Markus Hoffmann; Maryam Hamzeh-Mivehroud
Journal: Mol Inform Date: 2022-02-01 Impact factor: 4.050

9. Proteomics and Its Application in Pandemic Diseases.

Authors: Suman S Thakur
Journal: J Proteome Res Date: 2020-11-06 Impact factor: 4.466

Review 10. COVID-19: Myths and Reality.

Authors: Larisa V Kordyukova; Andrey V Shanko
Journal: Biochemistry (Mosc) Date: 2021-07 Impact factor: 2.487