Literature DB >> 32678588

SARS-CoV-2 Main Protease: A Molecular Dynamics Study.

Dimas Suárez1, Natalia Díaz1.   

Abstract

Herein, we investigate the structure and flexibility of the hydrated SARS-CoV-2 main protease by means of 2.0 μs molecular dynamics (MD) simulations in explicit solvent. After having performed electrostatic pKa calculations on several X-ray structures, we consider both the native (unbound) configuration of the enzyme and its noncovalent complex with a model peptide, Ace-Ala-Val-Leu-GlnSer-Nme, which mimics the polyprotein sequence recognized at the active site. For each configuration, we also study their monomeric and homodimeric forms. The simulations of the unbound systems show that the relative orientation of domain III is not stable in the monomeric form and provide further details about interdomain motions, protomer-protomer interactions, inter-residue contacts, accessibility at the catalytic site, etc. In the presence of the peptide substrate, the monomeric protease exhibits a stable interdomain arrangement, but the relative orientation between the scissile peptide bond and the catalytic dyad is not favorable for catalysis. By means of comparative analysis, we further assess the catalytic impact of the enzyme dimerization, the actual flexibility of the active site region, and other structural effects induced by substrate binding. Overall, our computational results complement previous crystallographic studies on the SARS-CoV-2 enzyme and, together with other simulation studies, should contribute to outline useful structure-activity relationships.

Entities:  

Year:  2020        PMID: 32678588      PMCID: PMC7409944          DOI: 10.1021/acs.jcim.0c00575

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


Introduction

At the end of 2019, a novel coronavirus was identified in the airway epithelial cells of three patients with pneumonia of unknown cause.[1] The new pathogen, named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2),[2] spread rapidly around the globe, and the associated COVID-19 disease become a worldwide pandemic. Social distancing and quarantine were imposed in numerous countries to stop dissemination, and the development of effective drugs and/or vaccines against the virus was urged. Similar to other coronaviruses, two overlapping polyproteins (replicase polyproteins 1a and 1ab) are produced and processed after SARS-CoV-2 infection to generate multiple functional subunits required for viral replication.[3] The proteolytic processing is accomplished by two internally encoded proteases that hydrolyze the polyproteins at specific sites. One of these proteolytic enzymes, the so-called main protease or 3C-like protease (3CLpro), catalyzes most maturation cleavage events. Thus, 3CLpro is an essential enzyme for viral replication and constitutes one of the best characterized drug targets among coronaviruses. The crystal structure of the 3CLpro protein from SARS-CoV-2 is highly similar to that of other coronaviruses.[4,5] The protein is formed by three domains: domains I (residues 10–99) and II (residues 100–182) have an antiparallel β-barrel structure, while domain III (residues 198–303) forms a compact α-helical domain connected to domain II by a long linker loop. The active site is located in a cleft between domains I and II, and it holds a histidine/cysteine catalytic dyad. In the SARS-CoV-2 main protease, Cys145 acts as a nucleophile during the first step of the hydrolysis reaction assisted by His41 as a base catalyst. The oxyanion hole, which stabilizes the partial negative charge developed at the P1 carbonyl group of the peptide substrate during the hydrolysis of the P1–P1′ bond, is formed by the backbone amido groups of Gly143 and Cys145, both placed in the so-termed oxyanion loop (residues 138–146). The catalytic machinery also includes a number of binding sites, with the S1 site defining the enzyme specificity for a glutamine at the P1 position of the peptide substrate. Domain III is involved in the dimerization of the protease (see below), and the resulting homodimer is supposed to be the active form of the enzyme.[6] The high homology shown by the 3CLpro enzyme in different coronaviruses suggests that this enzyme family exhibits almost identical biochemical and biophysical properties. Particularly, the main proteases from SARS-CoV-2 and SARS-CoV share 96% of sequence identity so that much of the experimental results obtained for the later enzyme could be relevant for the present case.[4] Hence, before outlining our goals in this work, we first summarize some biochemical and structural studies on the SARS-CoV main protease.[7] The critical role of domain III in the 3CLpro dimerization has been settled by fragment deletion experiments performed for the SARS-CoV protease, showing that a truncated enzyme lacking domain III remains as a monomer.[8] The catalytic activity of this truncated form, which has been assayed against a 14-mer peptide substrate, is very weak as only ∼20% of the substrate was cleaved after 20 h.[8] Furthermore, serial truncation experiments have confirmed that the last C-terminal helix in domain III is essential for dimerization and enzyme activity, and more specifically, deletion of residues Gln299 and Arg298 significantly reduces both dimerization and enzyme activity (∼1%–2%).[9,10] Another structural element of 3CLpro that seems essential for the activity of the SARS-CoV main protease is the N-finger constituted by residues 1–7 in domain I. According to Chen et al.,[11] the complete deletion of the N-finger has a minor effect on dimerization, but it abolishes enzymatic activity (<1%) when reacting with a 12-mer peptide. On the other hand, Hsu et al.[9] have found that the first three N-finger residues have only a small influence both on the dimer stability and on the activity of the enzyme (Kd for dimer dissociation changes from 0.28 to 3.4 μM upon deletion while 76% of catalytic activity is retained). However, in contrast with Chen et al., the serial truncation experiments of Hsu et al. have revealed a dramatic effect in the structure and activity of the enzyme after removing Arg4 (Kd = 57.5 μM and 1.3% of activity in the truncated enzyme) and the next residues.[9] Crystallographic structures have shed light onto the relationship between dimerization and enzyme activity. In the homodimer structures, the N-finger of one protomer is squeezed between domains II and III of another protomer. This allows Ser1 and Arg4 from one protomer to shape the substrate-specificity pocket (i.e., the S1 site) and the oxyanion loop in the other protomer.[12] Curiously, in the crystal structures for some monomers (mutants G11A, S139A, and R298A),[13−15] the oxyanion loop is partially folded as a 310-helix that hinders the access to the active site, particularly to the S1 subsite. Therefore, it seems that both protein dimerization and the right placement of the N-finger residues are indispensable for maintaining an open active site. This interpretation has been supported by surface plasmon resonance experiments showing that the full-length enzyme binds a P6–P1 hexapeptide model (Kd ∼ 152 μM), but almost no binding exists to the N-finger deleted protease.[11] From the structural and activity results enumerated above, it seems that the monomeric form of the 3CLpro enzymes has little or no affinity for the peptide substrate, presumably due to the blocking conformation of the oxyanion loop in the active site. However, this view has been challenged by isothermal titration calorimetry (ITC) assays showing that the wild type and several monomeric mutants of SARS-CoV (R298A, R298L, and R298A/Q299A) present comparable binding affinities for a 6-mer peptide substrate.[16] Moreover, substrate binding to the single mutants induces the dimerization of the enzyme, and kinetic assays provided similar kcat values for the single mutants and the wild-type protein. In contrast, dimer stabilization upon substrate binding does not occur for the double mutant, which exhibits null proteolytic activity.[16] Another intriguing fact is that some of the crystal structures obtained for the 3CLpro enzyme present a dimer with one of the monomers displaying a partially collapsed or disordered oxyanion loop, showing thus that the blocking conformation is not exclusive of the monomeric state.[12,15] Moreover, it has been reported that the N214A mutant remains mainly as a dimer without enzymatic activity but structurally very close to the wild-type enzyme (i.e., no collapsed active site is observed in the crystal structure).[17] Therefore, it seems most likely that a complex interplay exists among enzyme dimerization, active site structure, substrate binding, and catalysis. Clearly, some of the questions regarding the actual role of the protease dimerization on the architecture and activity of the SARS-CoV-2 catalytic site are prone to computational examination by means of molecular dynamics (MD) simulations in explicit solvent. Similarly, computational studies can reveal important molecular details of the Michaelis complexes with peptide substrates. Hence, in this work, we examine various configurations of the SARS-CoV-2 3CLpro enzyme differing in the monomer or homodimer state and in the presence or not of a short peptide substrate. All the models are subject to a 2.0 μs MD simulation followed by intensive analysis in order to characterize many structural and dynamic features ranging from the tertiary and quaternary structures to specific inter-residue contacts at the protomer interface or in the active site. By comparing the results obtained for the various configurations, the differences and similarities between the monomeric and dimer forms are discussed in detail, revealing also some effects induced by substrate binding on the interdomain arrangements and the prereactive organization of the catalytic site. In this way, our computational results may complement previous crystallographic studies on the SARS-CoV and SARS-CoV-2 enzymes. Considering also that the spread of the COVID-19 disease has undoubtedly sparked a flurry of computational research on this system, the present results may also contribute to reach a consensus view about the actual flexibility and structure of the active site supported by independent simulation studies. Eventually, the representative structures produced by our simulations could be of interest to undertake further computational work.

Methods

pKa Calculations

The protonation state for the titratable residues at pH 7 were assigned according to structure-based pKa calculations performed with the H++ web server (version 3.1).[18] In these calculations, four high-resolution X-ray structures recently deposited at the Protein Data Bank (PDB) were considered: two apoenzyme structures 6Y84 (1.39 Å)[19]and 6M03 (2.0 Å)[20] and two enzyme–inhibitor complexes 6LU7 (2.16 Å)[5] and 5R82 (1.31 Å)[21] (coordinates of the inhibitor atoms were removed). The pKa calculations were performed for the protease monomers and for the catalytically competent dimeric forms of the apoenzyme structures. In addition, three different dielectric constants were selected for the interior of the protein (εint = 4, 10, and 20) to evaluate the consistency of the results.

Initial Structures and Building of the Enzyme/Peptide Complex

Starting coordinates for the SARS-CoV-2 main protease were taken from the 6LU7 (pH 6.0) PDB structure. The coordinates of the inhibitor were removed prior to the edition and simulation of the native form of the enzyme, and all the crystallographic water molecules were maintained in the models. We also simulated an enzyme/substrate complex as the SARS-CoV-2 main protease performs multiple cleavages at viral polyproteins by recognizing several peptide sequences with an absolutely conserved glutamine at the P1 site (i.e., cleavage occurs at the P1–P1′ peptide bond). Hence, to generate a dynamic model for substrate binding within the 3CLpro active site, we selected the sequence of the cleavage site between the so-called nonstructural protein 4 and the 3CLpro enzyme that are successively encoded in the viral polyproteins (i.e., scissile peptide bond 3263–3264). According to the information prereleased in the UniProt database (codes P0DTC1 and P0DTD1),[22] the -P4-P3-P2-P1–P1′- sequence of this recognition site is -Ala-Val-Leu-GlnSer-. We built this short peptide sequence within the protease active site by using the 6LU7 crystal structure as a template. Thus, the peptidomimetic inhibitor in 6LU7 includes a peptide moiety with the sequence -Ala-Val-Leu- located within the S4-S3-S2 binding sites and a consecutive side chain bound within S1 that resembles a glutamine side chain (see Scheme ). Hence, we kept this part of the inhibitor, and we merely built the Ser backbone chain at the S1′ site using the Chimera program.[23] During the molecular edition of the noncovalent enzyme/pentapeptide complex, the Ser(P1′) side chain, the N-terminal acetyl- and C-terminal N-methyl amide capping groups, and all H atoms in the peptide substrate were added by the tLEaP program included in the AMBER18 suite of programs.[24,25]
Scheme 1

Comparison between Molecular Structures of Peptido-Mimetic Inhibitor and Peptide Substrate Selected for This Work

Molecular Dynamics Settings

Molecular Dynamics (MD) simulations in explicit solvent were run for the monomeric and dimeric forms of the enzyme, either in their free (unbound) state or in complex with a pentapeptide substrate. The coordinates of the solute atoms from the selected X-ray structure were processed with the tLEaP program in order to add the missing H atoms and to assign the molecular mechanics parameters. The systems, which were represented with the ff14SB version[26] of the all-atom AMBER force field,[27] were immersed in an octahedral water box that extended 18 Å (monomer) or 20 Å (dimer) from the protein atoms. The TIP3P potential[28] was used to represent the water molecules, and Na+ counterions[29] were added with the tLEaP program to neutralize the systems. The solvent molecules and counterions were initially relaxed by means of energy minimizations and 100 ps of molecular dynamics (MD) using the SANDER program.[25] Then, the full systems were minimized and heated gradually to 300 K using 60 ps of constant volume (NVT) MD with a 1 fs time step and using the PMEMD program in AMBER18. Subsequently, the density was adjusted by means of 2.0 ns of constant pressure (NPT) MD with a 2 fs time step and using the Monte Carlo barostat as implemented in PMEMD. Langevin dynamics was employed to control the temperature (300 K) with a collision frequency of 2 ps–1. The SHAKE algorithm[30] was selected to constraint all R–H bonds, and periodic boundary conditions were applied to simulate a continuous system at constant pressure (NPT). A nonbonded cutoff of 9.0 Å was used, and the Particle-Mesh-Ewald method[31] was employed to include the contributions of long-range interactions. The production phase of the simulations at the NPT conditions extended up to 2.0 μs, and coordinates were saved every 2.5 ps for analysis. The MD runs with a time step of 2.0 fs employed the GPU accelerated version of the PMEMD code included in AMBER18.[25,32]

Structural Analysis of MD Simulations

The CPPTRAJ software[33] in AMBER18 was used to compute the root mean squared deviation (RMSD) of the protein coordinates with respect to the reference X-ray structure along the MD trajectories. The coordinates of the models were also clustered using CPPTRAJ with the average-linkage clustering algorithm and a sieve of 250 frames. The distance metric between frames was calculated via the best-fit coordinate RMSD using the coordinates of heavy atoms. The molecular surfaces of the whole systems and of the domain/monomer components were computed using the linear combination of pairwise overlaps (LCPO) method[34] as implemented in CPPTRAJ. Secondary structure assignment of the oxyanion loop was done using the 2002 CMBI version of the DSSP program.[35] H-bond and vdW contacts were characterized using a specific software developed locally. H-bonds were identified based on a geometrical criterion (X···Y distance <3.5 Å and X–H···Y angle >120°), whereas hydrophobic interactions were scored by evaluating a dispersion attraction term[36] between pairs of atoms belonging to different hydrophobic groups. The criteria for assessing the occurrence of dispersion interactions between two groups were as follows: (a) The total pairwise dispersion energy is larger than 0.5 kcal/mol in absolute value. (b) The distance between the centers of mass of the two interacting groups is below 12.0 Å. The Chimera visualization system[23] was employed to draw the ribbon/stick models of the systems. Using a locally developed FORTRAN code, the relative orientation of the protein domains and/or of the two protomers in the dimeric form were monitored in terms of the Euler angles (xyx convention) that characterize the relative orientation of two rigid coordinate systems, which are placed at the center of mass of the considered fragments. Each coordinate system is defined by the principal inertia axes, which, in turn, are computed considering the coordinates of the backbone atoms located in the α-helical or β-strand elements within the selected region. To further characterize the structure and shape of the active site region, we computed the radius of accessibility (racc) of various residue side chains and/or H-bond sites following a computational protocol that has been described in previous work.[37] For each atom or group of atoms, racc is defined as the maximum radius of a spherical ligand that can touch the desired target. The MSMS program[38] was used to carry out fast computations of molecular surfaces considering only the protein heavy atoms and probe spheres of varying radius. The racc values were calculated for a subset of 2000 snapshots for each simulation.

Electrostatic Calculations

The electrostatic potentials of the models in solution were computed on selected cluster representatives from the MD trajectories using the APBS software.[39] In the electrostatic Poisson–Boltzmann calculations, only the coordinates of the solute atoms were used, with the atomic charges and radii taken from the ff14SB AMBER representation. The nonlinear PB equation was solved on a cubic lattice by using an iterative finite-difference method. The cubic lattice had a grid spacing of 0.33 Å, and the points at the boundary of the grid were set to the sum of Debye–Hückel potentials. The dielectric boundary was the contact surface between the radii of the solute and the radius (1.4 Å) of a water probe molecule. The electrostatic potential was plotted onto the molecular surface computed by the MSMS program[38] using the Chimera program.

Conformational Entropy

We estimated the conformational entropy (Sconform) of the backbone ϕ and ψ dihedral angles for residues located nearby the active site using the CENCALC program.[40] The entropy calculation relies on the discretization of the time evolution of the selected dihedral angles. To this end, the continuous probability density function (PDF) of each dihedral angle is represented by a von Mises kernel density estimator, which depends on a concentration parameter κ (a κ = 0.50 value was chosen here). By finding the maxima and minima of the PDF, the time series containing the values of the corresponding dihedral angle during the MD simulation is transformed into an array of integer numbers labeling the accessible conformational states. This allows the estimation of the probability mass functions (Pi) of the individual dihedral angles from which the marginal (first-order) conformational entropy of the each dihedral is computed as Typically, μs-length MD simulations are required in order to obtain reasonably converged Sconform values.[41]

Results and Discussion

Table S1 in the Supporting Information collects the computed pKa values for all the titratable residues of the SARS-CoV-2 main protease. The crystal structures used in the calculations were selected according to their state (apoenzyme or inhibited enzyme), their atomic resolution (1.3–2.2 Å), and the experimental pH value (6.0–8.1). The results consistently predict that all lysines, arginines, and aspartic and glutamic acids should be modeled in their charged state at pH 7. In contrast, all cysteines and tyrosines should remain neutral. For the imidazole histidine groups, their pKa values are well below 7.0, suggesting thus that they will be mainly neutral at pH 7. Only the results obtained for His80 at high dielectric constants, more reliable for this solvent-accessible residue, suggest the coexistence of neutral and protonated forms. However, the computed pKa values below 7.0 and the absence of negatively charged residues close to His80 prompted us to consider its neutral state. Then, the most probable neutral state for the six histidines, with the side chain protonated at Nδ or Nε atoms, was selected after visual inspection of their contacts in the crystal structures. This resulted in His41 and His80 being protonated at Nδ, whereas His64, His163, His164, His172, and His246 were protonated at Nε. We also note in passing that pKa calculations on the dimer 3CLpro structures do not introduce any changes in the selected protonation states. In addition, the pKa values obtained for the active site residues, particularly for the catalytic dyad Cys145/His41, support their neutral state within the pH range 6–8 corresponding to the selected crystal structures.

MD Simulations

We examined various protease configurations that involve either the monomeric or the dimeric forms, which, in turn, simulate either the native (unbound) state or the bound state with the Ace-Ala-Val-Leu-GlnSer-Nme peptide substrate (see Table S2 in the Supporting Information). In particular, we run two 2 μs MD trajectories representing the native state that are labeled as M (monomer) and D (dimer). The enzyme–peptide complex was sampled by two MD trajectories labeled as M/Pep and D/2Pep, corresponding to the monomeric and dimeric forms, respectively. In D/2Pep the active site of each protomer A and B accommodates one peptide molecule. The MD results of the various simulations are presented and analyzed in a comparative manner in order to better characterize the structure and flexibility of the protein domains, the amplitude of the interdomain relative motions, the nature and stability of the protomer–protomer contacts, the shape and flexibility of the active site, and the substrate mode of binding. All these results can be also useful to outline more clearly the differences and similarities between the monomeric and dimeric forms of the enzyme in aqueous solution.

Intradomain Structure and Relative Domain Orientation

Changes in the internal geometry and in the relative position of the various protein domains were first analyzed by monitoring the time evolution of the RMSD of the coordinates of the heavy atoms with respect to the crystallographic structures (Figure ; Table S3 presents mean RMSD values). In addition, the superposition of the solid-state and the MD-averaged ribbon models (Figures and 3) reveals some differences in the positioning of specific loops and secondary structure elements.
Figure 1

(a) Time evolution along the M (in red) and M/pep (in blue) trajectories of the root mean squared deviation (RMSD) computed for selected backbone heavy atoms (in Å). (b) Time evolution along the D (in red) and D/2Pep (in blue) trajectories of the RMSD data (in Å).

Figure 2

Different views (90° turned) for a ribbon representation of the average structure obtained from the last 50 ns of the M and M/pep simulations. The average structure was superposed over the 6LU7 X-ray structure (in lighter color) using the backbone coordinates of domain II.

Figure 3

Different views for a ribbon representation of the average structure obtained from the last 50 ns of the D and D/2Pep simulations. The average structures were superposed over the 6LU7 X-ray structure (in lighter color) using the backbone coordinates of domain II.

(a) Time evolution along the M (in red) and M/pep (in blue) trajectories of the root mean squared deviation (RMSD) computed for selected backbone heavy atoms (in Å). (b) Time evolution along the D (in red) and D/2Pep (in blue) trajectories of the RMSD data (in Å). Different views (90° turned) for a ribbon representation of the average structure obtained from the last 50 ns of the M and M/pep simulations. The average structure was superposed over the 6LU7 X-ray structure (in lighter color) using the backbone coordinates of domain II. Different views for a ribbon representation of the average structure obtained from the last 50 ns of the D and D/2Pep simulations. The average structures were superposed over the 6LU7 X-ray structure (in lighter color) using the backbone coordinates of domain II. The rotational motion of the helical domain III with respect to the central β-strand domain II along the M simulation is a remarkable result concerning the overall protein architecture (Figure ). In the coordinates of protomer A in the 6LU7 structure, domain III establishes only a few H-bond contacts with domain II (e.g., Th111···Asp295 H-bond, Arg131···Asp289 salt bridge) and domain I (e.g., Arg4···Gln299 H-bonds), and domains II/III are connected through a 14-long peptide linker (Gly183Asp197). In contrast, the connecting loop between domains I and II comprises only seven residues (Asp92-Pro99) including a central Pro96 residue, and abundant interdomain contacts contribute to define the active site region. In fact, the domain I/II arrangement was quite stable in all of our MD simulations. The relative motion of domain III is well described by the time evolution of the RMSD values for all the backbone atoms and by the center of mass (COM) distance between domains II and III (Figure and Figure S2). To characterize the interdomain orientation regardless of the COM separation, we also computed the Euler angles defined by a reference inertial system placed at domain II and an analogous coordinate system located at domain III, with the resulting ϕ, θ, and ψ angles shown in the form of polar plots in Figure S1. In the M simulation, the two domains slightly depart from each other (COM distance elongates from ∼26 to ∼29 Å) at 200 ns without significant reorientation. During the ∼200–300 ns interval, an ample reorientation occurred in less than 15 ns facilitated by a torsional change at Gly195, and the relative position remained stable during more than 1 μs. However, during the last 500 ns of the M simulation, domain III reorients again with respect to domain II adopting an intermediate pose in terms of the Euler angles (Figure S1), which also departs significantly from the X-ray structure as shown in Figure . To further assess whether or not the varying location of domain III is an intrinsic feature of the monomeric state in aqueous solution, we decided to run a second MD simulation started from a different X-ray structure (6Y84) with the same settings as those employed in the M simulation. For the sake of brevity, the structural details of this simulation, which was run for 1.0 μs, are not reported here. We just comment that a two-step interdomain reorientation occurred again within the 300–500 ns time interval, leading to an average MD structure that resembles the first configuration observed in the M trajectory (Figure S4). Therefore, the independent MD trajectory added support to the results produced by the M simulation. Interestingly, the wide domain II/III rearrangement was not observed in the presence of the peptide substrate (M/Pep simulation), although the COM separation and the reference Euler angles present significant fluctuations, especially in the first half of the simulation, indicating loosened interdomain contacts. The peptide substrate gives direct H-bond contacts with linker residues (e.g., Gln189 and Gln192), which could contribute to reduce the mobility of the II–III linker. On the other hand, the interdomain motions are largely dampened out in the dimer simulations (D and D/2Pep), which are characterized by a persistent interdomain disposition and stable interprotomer contacts (see below). Thus, it turns out that the destabilization of the interdomain orientation would occur only in the monomeric form. Concerning the internal structure of domains I–III, the most remarkable feature is the small structural deviations and low flexibility of the central domain II (Figure and Table S3). Thus, the mean RMSD values are only ∼1.1 ± 0.1 Å for the monomeric simulations (M and M/Pep) and even smaller for the dimer models (∼0.8 ± 0.1 Å) with the only exception being domain IIA in the D simulation (1.52 ± 0.21 Å; see below). The internal stability of domain II is in consonance with a certain buried character as only ∼55%–60% (monomer simulations) or ∼40%–45% (dimers) of its molecular surface is solvent accessible (Table S4). The terminal domains I and III are more hydrophilic, having around 70% and 85% of exposed surfaces, respectively, all along the MD simulations. They exhibit wider internal motions involving their helical and loop elements, which, however, do not induce dramatic changes in the domain structures, with the largest RMSD values being lower than 2.5 Å. In general, the largest intradomain deviations arise in the native trajectories (M and D) as, for example, in domain IIIB in D (RMSD = 2.30 ± 0.16 Å), whose C-terminal helix is largely displaced (Figure ). In this respect, it turns out that the local internal displacements and the actual flexibility in domains I/III are quite variable when comparing among the data of the various simulations and/or of the A/B protomers in the D simulation. This variability suggests that the protein domains may access different conformational states on the μs time scale. Finally, we note that the D/2Pep simulation consistently shows the smaller structural deviations and the lower fluctuations, both in terms of the intradomain RMSD data and the interdomain COM distances/Euler angles, revealing thus a rigidifying effect exerted by substrate binding.

Protomer–Protomer Interactions

As in other 3CLpro enzymes, only the SARS-CoV-2 main protease homodimer is considered to be catalytically active, and accordingly, targeting the interface between both protomers (A and B) might be an alternative therapeutic strategy for the treatment of COVID-19. Hence, we decided to analyze in detail the overall dimer architecture and the inter-residue contacts between the two protomers. According to the protomer–protomer distance in Figure S2, the dimer remains stable during the D and D/2Pep simulations. By means of molecular surface calculations (Table S4), we found that the A···B contact area consists of ∼11% (∼1400 Å2) of the LCPO surface of the separated protomers and also remains quite stable along the D simulation. This contact area in D is similar to that in the X-ray structure (∼1400 Å2 in 6LU7), albeit lower than in the D/2Pep simulation (∼1600 Å2). The two protomers are oriented perpendicular to one another. Their relative orientation, as measured by the Euler angles between the corresponding inertial reference systems, is highly stable according to the MD simulations (Figure S1), which is compatible with internal motions within the respective protomer domains. The stability of the dimer architecture is also evident in the good overall match between the average MD structures and their parent X-ray structures (Figure ). The global RMSD mean values further confirm this agreement as they have moderate values of 2.91 ± 0.12 Å (D) and 1.78 ± 0.16 Å (D/2Pep), which, in turn, point out again the substrate binding effect. In the crystallographic structures, the two protomers give symmetrical contacts (e.g., the Ser1@O···HN@Phe140* and Phe140@NH···O@Ser1* H-bonds have the same interatomic distances; see Table S6). The A···B contacts mainly involve residues Ser1-Gly11 in one protomer and several residues distributed in the three domains of the other protomer (e.g., Ser10···Ser10, Arg4···Lys137, Met6···Val125, and Arg4···Glu290). There are also a few nonpolar contacts between residues in domain II (e.g., Val125···Val125) and between residues in domain III (e.g., Ala285···Leu286). Most of these contacts are well maintained during the D and D/2pep simulations (Table S6). For instance, the Gly11@NH···Oε@Glu14 H-bond presents 100% of abundance and a short interatomic distance (2.9 Å) in the two dimer trajectories. But the percentage of occurrence of other polar interactions, mainly involving Ser1 and Arg4 residues in the N-finger, varies across the simulations and with respect to the crystal structure. Thus, the initial Ser1···Glu166 H-bonds are missing for most part of the simulations (i.e., the highest percentage of occurrence is below 50%). Similarly, the Ser1···His172 contact is weakened in the simulated dimers, and the Ser1···Phe140 interaction is not abundant for the D trajectory. With respect to Arg4, the salt bridge with Glu290 is well preserved but the Lys137(A)···Arg4(B) interaction is clearly destabilized (Table S6A). On the other hand, the simulations also assess the relevance of nonpolar interactions for the stability of the dimer. Thus, the Met6/Tyr126, Pro9/Pro122, Pro9/Val125, and Pro9/Leu115 contacts, which connect the N-finger in one protomer to domain II in the other protomer, have a high abundance and a large scoring energy. Similarly, the Val125/Val125 contact glues domain IIA and IIB in all the simulations, while the hydrophobic packaging of the Ala285 and Leu286 residues in domains IIIA and IIIB also contributes to fix the highly mobile C-terminal domains. This hydrophobic cluster involving residues Ala285/Leu286 is a particularly important interprotomer contact spot. The Ala285···Leu286 average MD distances are clearly shorter for the D/2Pep trajectory (Table S6B), which indicates that the long linker in domain III containing Ala285 and Leu286 is more packed in the dimer in the presence of the peptide substrate. This would explain the increase in the contact area between the protomers along D/2Pep. Domain II in one protomer is also linked to domain III in the other protomer thanks to the Ser123···Arg298 and Ser139···Gln299 H-bonds. The Ser123···Arg298 contact is mainly mediated by a water molecule during the simulations, and it is clearly desestabilized along D/2Pep. In contrast, the presence of the substrate contributes to stabilize the Ser···Gln299 interactions (Table S6A) that connect the oxyanion loop in one protomer to the C-terminal helix in the other protomer.

Structure and Dynamics of Active Site in Unbound Form of Enzyme

The 3CLpro active site is located at a shallow crevice in the I/II interdomain region. The nucleophilic Cys145 belongs to the oxyanion loop, which is an S-shaped loop (Gly138-Gly146) in domain II. The amide nitrogens of Cys145 and Gly143 define the “oxyanion hole”, which binds the carbonyl group of the scissile peptide bond in the substrates. Adjacent to the oxyanion loop, a β-strand segment (His163-Pro168) comprises other residues that play an important role in substrate binding (e.g., His163, Glu166), which, in turn, are close to a loop segment (Gly183-Ala193) placed at the beginning of the domain II/III linker that contributes to border the active site region. Besides Cys145, the catalytic dyad also includes His41 that is supposed to act as a base during the activation of the nucleophile. His41 belongs to a small helix that is placed within a long and highly helical connection loop in domain I. The positioning of His41 is assisted by a network of H-bonds including a His41···Asp187 interaction mediated by a conserved water molecule and a salt bridge between the nearby Arg40 guanidinium and the Asp187 carboxylate at the domain II/III linker. The active site is completed by several interdomain nonpolar interactions among Pro39, Met49, and Leu27 in domain I and His164 and Met165 in domain II that constitute a hydrophobic pocket (site S2) for accommodating the peptide substrate. To describe the structure and flexibility of the active site in the MD simulations of the unbound configurations, we performed first clustering calculations that yield the population and representative structures, which allow us to visualize structural deviations with respect to the X-ray geometries. A more detailed description is provided by the statistical analysis of selected H-bond/vdW contacts and the secondary structure analysis of the oxyanion loop. We also measured the accessibility of the catalytic residues and binding sites in terms of the average accessibility radii (Table ). The main results are displayed in Figures –7, while Tables S7 and S8 collect statistical data on inter-residue contacts. Some details of the clustering calculations and of the Sconform calculations are given in Table S5 and Figure S3.
Table 1

Average Values (in Å) of Radii of Accessibility (racc) at Different Sites from the Last 1.5 μs of MD Simulationsa

 6LU7MM/PepDADBDA/2PepDB/2Pep
Cys145 (Sγ)3.302.68(1.16)2.63(1.21)2.50(1.28)3.60(0.88)3.45(1.03)2.99(1.16)
His41 (imidazol)3.453.95(1.24)3.43(0.78)4.37(0.99)3.78(0.84)3.90(0.67)3.62(0.71)
Cys145 (NH)1.281.21(0.44)1.83(0.43)2.44(0.60)1.32(0.28)1.22(0.31)1.35(0.36)
Gly143 (NH)3.103.94(1.46)2.76(0.79)5.18(1.38)3.47(0.96)3.11(0.82)2.99(0.83)
Glu166 (NH)3.352.16(1.17)3.48(0.71)3.59(0.76)3.34(0.83)2.67(0.62)2.86(0.73)
Glu166 (C=O)4.034.51(1.08)5.59(0.60)5.55(0.74)5.30(0.77)5.07(0.67)5.00(0.84)
His164 (C=O)3.282.77(0.67)3.87(0.65)3.24(0.77)3.31(0.80)4.20(0.49)3.92(0.57)
His163 (imidazol)2.281.63(0.68)2.55(0.31)2.02(0.56)2.26(0.33)2.37(0.17)2.38(0.18)
Met49 (side chain)3.453.82(1.14)5.67(0.52)5.94(0.26)5.74(0.61)5.89(0.34)5.57(0.77)
His164 (side chain)0.681.42(0.52)1.43(0.28)1.26(0.72)1.09(0.21)1.33(0.16)1.33(0.21)
Met165 (side chain)3.353.51(0.83)3.69(0.97)4.12(1.11)3.83(0.83)4.52(0.57)3.96(0.89)
Leu167 (side chain)1.901.98(0.50)2.38(0.47)1.74(0.56)2.14(0.59)2.47(0.37)2.41(0.42)

Standard deviations are given in parentheses. The racc data for the X-ray structure are also included.

Figure 4

(a, b) View of the active site in the M simulation as shown by the superposition of the two most important cluster onto the X-ray structure (shown in lighter colors). (c) Superposition of top five cluster representatives (coil thickness and color intensity are proportional to cluster abundance). (d) Electrostatic potential mapped onto the surface of the two most populated clusters.

Figure 7

(a) View of the active site as shown by the superposition of the most important D/2Pep cluster representative on the X-ray structure (shown in light colors). (b) Closer view of the active site region showing the peptide substrate and the catalytic dyad. (c) Superposition of the cluster representatives.

(a, b) View of the active site in the M simulation as shown by the superposition of the two most important cluster onto the X-ray structure (shown in lighter colors). (c) Superposition of top five cluster representatives (coil thickness and color intensity are proportional to cluster abundance). (d) Electrostatic potential mapped onto the surface of the two most populated clusters. (a) View of the active site region in each protomer (A, B) along the D simulation as shown by the superposition of the top one clusters onto the X-ray structure (shown in lighter colors). (b) Superposition of the cluster representatives (coil thickness and color intensity are proportional to cluster abundance). (c) Electrostatic potential mapped onto the solvent-accessible protein surfaces. Standard deviations are given in parentheses. The racc data for the X-ray structure are also included. (a) View of the active site as shown by the superposition of the most important M/Pep cluster representative on the X-ray structure (shown in light colors). (b) Closer view of the active site region showing the peptide substrate and the catalytic dyad. (c) Superposition of the cluster representatives. (d) Electrostatic potential mapped onto the solvent-accessible protein surfaces. (a) View of the active site as shown by the superposition of the most important D/2Pep cluster representative on the X-ray structure (shown in light colors). (b) Closer view of the active site region showing the peptide substrate and the catalytic dyad. (c) Superposition of the cluster representatives. The active site region remains quite accessible during the MD simulations as expressed in terms of the mean racc values collected in Table . According to this structural index, which measures the size of the largest spherical probes contacting a particular group of protein atoms, the catalytic Cys145 thiol group, the His41 imidazol, and the Gly143 amide group are not sterically blocked during the M/D simulations with racc values above 2.7, 3.7, and 3.5 Å, respectively, while the backbone Cys145 position is partially buried with smaller racc values between 1.3 and 2.0 Å. The majority of the binding spots located at the hydrophobic shallow pocket are well solvent exposed and have racc values quite similar to those in the X-ray structures with the exception of the Glu166 backbone and His163 side chain in the M simulation (see below). Such large accessibility is observed again in the molecular surface drawings (Figure ), which in addition reveal that a local concentration of negative electrostatic potential is another feature of the active-site surface patch. Although the global form of the active region is comparable during the M and D simulations, there are significant differences concerning the amplitude of the fluctuations and deviations with respect to the initial coordinates. For example, clustering analysis considering both backbone and side chain atoms results in 80 different clusters for the M trajectory, the top two clusters having moderate abundances of 16% and 14%, respectively. The same clustering settings yield 33/32 clusters for protomers A/B in the dimer, with the top D/D clusters being more populated (33% for the first cluster and 14/17% for the second cluster; Table S5). When the average-linkage clustering is based on the backbone RMSD values, there are fewer clusters with higher abundances as expected, but the flexibility measurement is somehow modified. Thus, the monomeric M simulation still has many clusters (25), with the two most populated ones accounting for 39% and 16% of the analyzed MD frames. For each protomer in the D simulation, the backbone-only clustering results in seven (D) and four (D) clusters, but the population distribution is asymmetrical given that the top two D clusters have 37% and 33%, whereas the backbone D is quite rigid, with the top one cluster having 89% abundance. In terms of the T-weighed conformational entropy values (Figure S3), the flexibility of the backbone dihedral angles in the vicinity of the active site amounts to −11, −13, and −7 kcal/mol for the M, D, and D catalytic sites, respectively, which seem in consonance with the backbone clustering analysis. Therefore, it is clear that the active site in the monomeric state exhibits a significant flexibility both in terms of backbone and side chain atoms. However, the active site region may also retain some plasticity in the dimer state as shown in the D active site. Inspection of the cluster representatives confirms that the catalytic site in the isolated monomer (Figure c) is more ductile than that in protomers A and B of the dimer (Figure b). Moreover, a close examination of the top two M clusters (Figure ) reveals that the oxyanion loop, the domain II/III linker segment, and the interhelical connecting loop in domain I tend all to be distorted with regard to the X-ray structure. For the oxyanion loop, a short 310 helix conformation is detected in residues Asn142-Ser144, which, in turn, is associated with the placement of the Asn142 side chain over the shallow S1 subsite, blocking access to the Glu166 NH and CO groups. However, the time evolution of the racc indexes indicates that the S1 “collapse” in the M simulation is reversible (Figure S5) given that access to the Glu166@NH and His163 sites fluctuates on the ns time scale between blocked phases with racc around ∼1.0 Å and solvent-exposed phases with racc above 2 Å. The racc plots are indeed correlated with the placement of the Asn142 side chain and the 310 helical distortion at the oxyanion loop. According to secondary structure analyses using the DSSP method, the central residues Asn142-Gly143-Ser144 show 310 helical conformation in ∼43% of the analyzed snapshots, with the backbone chain interchanging between the helical and turn/coil conformations along the M trajectory (Figure S6), which is in consonance with the structural flexibility of the oxyanion loop in this state.
Figure 5

(a) View of the active site region in each protomer (A, B) along the D simulation as shown by the superposition of the top one clusters onto the X-ray structure (shown in lighter colors). (b) Superposition of the cluster representatives (coil thickness and color intensity are proportional to cluster abundance). (c) Electrostatic potential mapped onto the solvent-accessible protein surfaces.

The conformation of the oxyanion loop during the D simulation deserves also particular attention due to its role in substrate binding and catalysis. A dissimilar behavior was observed between the two protomers so that the D active site maintains an overall conformation that is closer to the crystallographic structure and clearly more rigid than that of the other protomer D (Figure ). The major difference arises in the oxyanion loop because it adopts a β-strand conformation around the Phe140 residue during the central part of the D simulation (protomer A; Figure S6) without compromising the accessibility to the important binding sites. Therefore, the secondary structure analysis further confirms the larger flexibility of the oxyanion loop in D and reveals its complex conformational properties, which oscillate between flexible and quasi-static states in the absence of substrate molecules. However, the implications (if any) of this behavior for substrate binding are not clear. The simulations allow us to establish unambiguously the relationship between the oxyanion loop dynamics and specific inter-residue contacts. Among such contacts, the catalytic Cys145 forms a persistent (94%–98%) Cys145@CO···Asn28@NδH interaction in all the simulations except D, in which such backbone···side chain contact is less abundant (54%). Thus, the Asn28···Cys145 interaction is involved in the orientation and flexibility of the C-terminal portion of the oxyanion loop (Asn28 seems also important because of its additional contacts with Cys117 and Gly120 at the domain I/II interface). To analyze the contacts of the Cys145 thiol group, we selected geometric criteria (Sγ···:X distance < 4.0 Å and 90 < SγH···:X angle < 180°) that take into account the larger size and more diffuse electron cloud of the sulfur atom.[42] The most abundant interaction of Cys145@SγH occurs with the backbone carbonyl group of His164, especially in the monomeric state (84% for M, 81% for protomer A and 65% for B in D). When looking at the central region of the oxyanion loop, its conformation is mainly held by means of a Ser144@OγH···Leu141@O H-bond so that its rupture (e.g., in D) leads to a different loop arrangement. Considering the first part of the oxyanion loop, it turns out that the Phe140 side chain largely determines its conformation. This group gives several nonpolar contacts with Tyr126, His163, His172, and Val114 both in the X-ray structures and in the majority of the simulations. However, the destabilization of this hydrophobic clustering in M and D, particularly of the Phe140(phenyl)···His163(imidazol) π–π stacking, would trigger the Phe140 rearrangement that results in the conformational change of the whole oxyanion loop. The network of H-bond contacts around the catalytic His41 residue is also of particular interest. The simulations confirm the presence of water-mediated His41@NH···Wat···Asp187@Oδ and/or His41@NδH···Wat···Asp187@Oδ contacts. Furthermore, the same water molecule connects the His41 and His164 side chains (His41@NδH···Wat···His164@Nδσ). We note, however, that the bridging water molecule exchanges frequently with bulk solvent on the ns time scale without disrupting its structural role. The His41 side chain also forms nonpolar weak contacts with Leu27, Pro39, Met49, and His164 (Tables S7B and S8B). The salt bridge Arg40···Asp187 interaction is also observed in all the simulations and contributes to maintain in the right place the short helix containing His41 and the first part of the II/III connecting loop (residues 189–191).

Substrate Binding

As previously mentioned, the binding of the peptide substrate within the active site exerts a certain rigidifying effect in the interdomain dynamics, with the global structure of the enzyme becoming more compact. Peptide binding can also induce specific effects on the oxyanion loop conformation with respect to the unbound form of the enzyme. Thus, the structural analysis, clustering calculations, and secondary structure assignments point out clearly that the oxyanion loop conformation is nearly frozen in the presence of the substrate and remains relatively close to the X-ray conformation. Hence, the MD simulations suggest that a stable positioning of the oxyanion loop partially induced by the substrate would be required for optimal binding and catalysis. Concerning the enzyme–peptide binding determinants, we note first that the short peptide Ace-Ala(P)-Val(P)-Leu(P)-Gln(P)∼Ser(P1′)-Nme aligns antiparallel to the terminal part of the long II β-strand (Scheme and Figures and 7). The analysis of the M/Pep and D/2Pep simulations shows that the alignment occurs from P4 to P2 thanks to two highly stable (i.e., 90%–100%) H-bond contacts involving the backbone groups of Glu166 in the β-strand and that of the substrate Val(P3) residue. At the N-terminal end of the peptide, some H-bonds also occur with residue Gln192 in the II–III connection loop, although these contacts are less abundant (∼60%). The cleavage sites selected by the SARS-CoV-2 main proteinase includes Leu, Phe, Val, or Met at the P site, all of them well suited for being accommodated at the hydrophobic S subsite of the enzyme. According to the calculated dispersion energy scorings, the Leu(P2) side chain placed at S2 mainly interacts with the side chains of His41, Met49, and Tyr54 in domain I and Met165 in domain II (Table S9).
Scheme 2

Schematic Representation of Enzyme–Substrate Interactions

Average values of heavy-atom separation (Å) and % of abundances are indicated for selected contacts. Some abundances are segregated into protomer A and B (in italics).

Figure 6

(a) View of the active site as shown by the superposition of the most important M/Pep cluster representative on the X-ray structure (shown in light colors). (b) Closer view of the active site region showing the peptide substrate and the catalytic dyad. (c) Superposition of the cluster representatives. (d) Electrostatic potential mapped onto the solvent-accessible protein surfaces.

Schematic Representation of Enzyme–Substrate Interactions

Average values of heavy-atom separation (Å) and % of abundances are indicated for selected contacts. Some abundances are segregated into protomer A and B (in italics). Coronavirus main protease invariably recognizes peptide cleavage sites within polyproteins with a strictly conserved glutamine residue in P1. Taking into account that the hydrolysis reaction catalyzed by this enzyme occurs at the P1–P1′ amide bond, it is clear that the Gln(P1) residue should be correctly placed within the active site. According to our simulations, the backbone amido group of Gln(P1) interacts with the backbone carbonyl group of His164 (60% for M/Pep and ∼96% for D/2Pep), whereas the Gln(P1) backbone carbonyl group is placed within the oxyanion hole defined by the amido groups of Gly143 and Cys145. At this point, we note a remarkable difference between the M/Pep and D/2Pep simulations. For the monomer, both Gly143@NH···Gln(P1)@O and Cys145@NH···Gln(P1)@O H-bonds present a low abundance (i.e., <50%), which suggests that the scissile peptide bond is not well positioned within the active site. In contrast, the Gly143@NH···Gln(P1)@O H-bond is well maintained during the whole D/2Pep trajectory in the active site of protomers A and B. Some differences also arise at other contacts formed by the P1 side chain that build up the specificity of the S1 site. Our simulations confirm that residues Phe140 and His163 are important in defining the S1 subsite via His163@NεH···Gln(P1)@Oε and Phe140@O···Gln(P1)@NεH H-bonds. However, these contacts are clearly more abundant in the two active sites of the dimer than in the monomer state, pointing out that the N-finger of the second protomer contributes to organize the S1 subsite (Scheme and Table S9). Glu166 also participates in the binding of the Gln(P1) side chain but, according to the M/Pep and D/2Pep simulations, the Glu166@Oε···Gln(P1)@NεH H-bond contact is preferentially mediated by a solvent molecule. In the most likely hydrolysis mechanism assisted by the coronavirus main protease, the reactive events would involve the proton transfer from Cys145@SγH to His41@Nε and the nucleophilic attack of the Sγ atom to the carbonyl group of Gln(P1). In this respect, the simulations show that, compared to the native form, the Cys145 side chain shifts to preferentially interact with the His41 side chain when the peptide substrate is bound within the active site. Thus, a catalytically relevant Cys145@SγH···Nε@His41 contact is present along the D/2Pep simulation (80% for A and 81% for B), although it is less abundant for M/Pep (49%). The H-bond network involving His41, His164, and Asp187 and the structural water molecule is well preserved in the presence of the peptide substrate, with the water exchange with bulk solvent being reduced to just one (M/Pep and D/Pep) or no (D/Pep) events. Hence, the relocation of the Cys145 thiol group and the stable positioning of His41 result in an average Sγ···Nε distance and average SγH···Nε angle of 3.9 ± 0.5 Å and 95 ± 43° for M/Pep and 3.5 ± 0.3 Å and 125 ± 39° for D/Pep (equal values for protomers A and B). These values suggest that the homodimer state favors geometrically the activation of the nucleophile by His41. We also measured the average Cys145@Sγ···C@Gln(P1) distance/Sγ···C···O angle, which have values of 5.1 ± 0.8 Å/84 ± 32° in M/pep and 3.7 ± 0.6 Å/83 ± 13° in D/2Pep for protomer A and 4.0 ± 0.7 Å/82 ± 11° in D/2Pep for protomer B. In this case, the shorter Cys145@Sγ···C@Gln(P) distances in the D/2Pep simulation suggest again that the dimer state exhibits an enzyme–substrate orientation favorable for catalysis.

Discussion

Most of the crystallographic and biochemical studies reported to date about the 3CLpro proteases have been devoted to the SARS-CoV enzyme. Although their major conclusions are reasonably expected to be valid for the SARS-CoV-2 protease due to their high degree of homology, there is, of course, a growing research activity aimed specifically at SARS-CoV-2. Nevertheless, detailed molecular descriptions of the structure and dynamics of the native 3CLpro enzyme in aqueous solution and of its binding determinants are still lacking, which is what prompted us (and other researchers) to computationally examine the SARS-CoV-2 protein by means of extensive MD simulations. In this scenario, our results may complement currently available structural data given that several protein configurations were simulated and subject to comparative analysis in order to clarify some questions regarding the monomer structure, the dimer stability, and so on. It must be noted, however, that the significance of the present 2.0 μs MD simulations will depend on their critical comparison with results produced by other theoretical studies. With regard to the tertiary structure of the monomeric 3CLpro protein, a remarkable observation is the domain III rearrangement occurring during the M simulation. Thus, the monomer enzyme can adopt alternative conformations in aqueous solution. Interestingly, the crystallographic structures of three mutants of the SARS-CoV protein (G11A, S139A, and R298A) reveal a similar domain II/domain III poses.[13] These mutations cause the dimer dissociation by disrupting critical contacts at the protomer interface, and thereby, the rotation of domain III has been considered a structural feature of the monomeric enzyme. In the most abundant domain III orientation observed in the M simulation, the alignment of domains II and III is stabilized by a few salt bridges that also involve N-finger residues (Arg4···Asp216, Lys5···Glu290, and Arg131···Asp289). In the SARS-CoV mutants (2PWX structure),[13] the Lys5···Glu290 salt bridge is again observed, while domain III is slightly rotated and establishes different salt bridges (Arg131···Glu240 and Lys137···Asp289). The simulations also gave some clues about the domain III rotation event. At the beginning of the monomer MD simulation, the N-finger, which lacks the Arg4···Glu290* contact characteristic of the dimer state, is anchored at the last α-helix of domain III both by polar (Arg4···Gln299) and nonpolar (Phe3···Phe291) interactions. It turned out that the weakening of the π–π contact preceded the rupture of the Arg4···Gln299 contact, loosening then the N-finger hold onto domain III and enabling the torsional changes at the hinge Gly195 residue for the wide rotation to occur. Hence, it seems reasonable to consider that the N-finger structural instability induces the domain III conformational change. We also note in passing that the domain III rearrangement may be especially unfavorable for the recognition and organization of the long polyprotein sequences processed in vivo by the main SARS-CoV proteases. The MD simulations probe possible effects associated with the binding of the relatively small pentapeptide substrate. In general, we found that the amplitude of intradomain and interdomain motions as well as the global flexibility are reduced in the presence of the substrate as measured in terms of various descriptors (RMSDs, molecular surface, cluster representatives, etc.). The dampening of the domain I and II mobility is well understood because the peptide molecule, which remains perfectly bound to the catalytic site, further interconnects domains I and II by forming a stable network of H-bonds and nonpolar contacts. Similarly, the direct influence of the Gln(P1) residue on the oxyanion loop conformation is equally clear. However, the allosteric mechanism(s) through which substrate binding can affect the conformation of distal enzyme regions and/or enhance dimerization are not obvious. This is apparently the case for the outer wall of a channel[43] passing through the central region of the dimer constituted by the side chains of Ala285 and Leu286 from each of the protomers. The inter-residue contact analysis and the surface calculations indicate that this hydrophobic cluster is significantly more compact in the D/2Pep simulation than in D, which is likely connected with the lower mobility of domain III upon substrate binding. On the other hand, the fact that the domain III rotation did not occur during the monomeric M/Pep simulation confirms again the rigidifying role played by the peptide molecule. More specifically, the residues of the domain II/III linker loop that border the catalytic site (-Gln189-Thr190-Ala191-Gln192-) become less flexible in the presence of the peptide molecule, which, in turn, may hamper the torsional change at the nearby Gly195 acting as the hinge residue for the domain III rearrangement. Therefore, in this way, substrate binding to monomeric proteases could favor the interdomain orientation that is adequate for dimerization, helping thus clarify the cooperative effects between substrate binding and dimerization experimentally observed[16] in the monomeric mutants of SARS-CoV. Clearly, the small impact on dimer stability and catalysis due to the truncation of residues 1–3 in the N-finger of SARS-CoV[9] agrees nicely with our MD simulations in aqueous solution, which reveal how the contacts involving the N-terminal Ser1 in the starting crystallographic structures are largely weakened by water molecules (e.g., Ser1···Glu166 salt bridge and Ser1···Phe140 backbone contact). Hence, we propose that Ser1 plays only a secondary role in controlling the conformation of the oxyanion loop, which contrasts with the critically important Arg4 residue. Experimentally, residue deletion up to Arg4 results in a monomeric SARS-CoV protein with null or very weak enzymatic activity.[9] In fact, our simulations emphasize that Arg4 has a two-fold role. On one hand, it provides the Arg4···Glu290* salt bridge between the two protomers and helps fix the position of domain III in the same protomer through a double H-bond between the N-finger backbone and the Gln299 side chain. On the other hand, its guanidinium group contributes to clutch the oxyanion loop by H-bonding the Lys137* amide group in the other protomer. However, in the native form of the enzyme, the latter interaction and the Ser139···Gln299* contact with the C-terminal helix may be perturbed as seen in the D simulation, conferring some flexibility to the oxyanion loop. Other residues located in the N-coil also contribute to dimer stability, with the Gly11@NH····Oε@Glu14* contacts being especially important in terms of their abundance (100%) and steady interatomic distance (2.9 Å) all along the dimer simulations, which is in consonance with the inability of the Gly11Ala SARS-CoV mutant to dimerize.[13] As mentioned in the Introduction, the absence of catalytic activity in the monomeric SARS-CoV mutants has been ascribed to a presumably collapsed form of the active site, in which a partially 310 helical oxyanion loop would impede access to the binding sites. To our knowledge, such collapsed conformation of residues Ser139-Phe140-Leu141 has been observed both in the monomeric X-ray structures of the SARS-CoV enzyme and in some of its dimer states. In our simulations, the 310 helical twist occurs exclusively during the monomeric M simulation at the Asn142-Ser144 residues adjacent to Cys145. Furthermore, this transition turns out to be reversible (twist and untwist events are observed) and only reduces the accessibility to the Glu166@NH group and His163 side chain. Therefore, the simulations suggest that, in the liquid phase, the active site of the monomeric state would be transiently, but not permanently, in the partially collapsed conformation. The MD analysis also shows that the S1 structural fluctuations associated with the oxyanion loop motions would be the direct consequence of the missing contacts with the N-finger of the protomer. Concerning the dynamics of the dimer unbound state, we find that all the binding sites remain perfectly solvent exposed, and the active site regions tend to be more rigid than in the monomer form. The terminal residues (1–3) of the N-finger still have some conformational freedom, and the oxyanion loop in D shows flexibility around the Phe140 position. Hence, we propose that the S1 site of the 3CLpro enzymes may retain some plasticity in the dimer form, which would be more accentuated in the monomer state reducing, but not abolishing, its peptide binding ability. On the contrary, we expect that the monomer SARS-CoV-2 protein would have a significant affinity for binding the examined peptide as indicated by some highly stable enzyme–substrate contacts along the M/Pep MD trajectory. Therefore, although the M/Pep complex may be not optimal for catalysis (see below) and the M active site exhibits varying oxyanion loop conformations, our simulations give no support to the hypothesis of a collapsed active site to explain the noncatalytic behavior of the monomer state. In fact, they seem more compatible with the report of calorimetric Kd values for the enzyme–peptide complexes involving monomeric mutants of SARS-CoV that are close to that of the wild-type dimer.[16] The present MD simulations allow us to investigate the noncovalent binding between the SARS-CoV-2 enzyme and a model peptide, Ace-Ala-Val-Leu-GlnSer-Nme, reproducing the sequence of the first cleavage site catalyzed by 3CLpro at the polyproteins. The construction of the initial complex was feasible thanks to the crystallographic adduct (6LU7) between SARS-CoV-2 and a peptido-mimetic inhibitor. The remarkable stability of the enzyme–peptide mode of binding all along the (unconstrained) simulations confirms that the active site is readily adapted to accommodate this sequence. As above-discussed, the presence of the peptide molecule influences both globally and locally the structure and dynamics of the systems. In this respect, the alignment of the substrate chain along the binding crevice, together with the accommodation of Gln(P1) and Leu(P2) in the S1 and S2 pockets, firmly locks the scissile peptide bond into the Cys145/Gly143 oxyanion hole. Furthermore, we found that substrate is essential for the side chain of the nucleophilic Cys145 to adopt a suitable orientation for proton transfer toward the His41 imidazol and nucleophilic attack toward the Gln(P1) carbonyl group. Therefore, we note that the catalytic competency of the enzyme should not be judged on the basis of the Cys145···His41 contacts in the native form of the enzyme. The average interatomic distances and angles relating Cys145@Sγ, Gln(P1)@C, and His41@Nε corroborate the nearly prereactive character of the Michaelis complexes in the D/2Pep simulation (e.g., Sγ···C ∼ 3.7–4.0 Å; Sγ···Nε ∼ 3.5 Å). For the monomeric Michaelis complex, however, the measured prereactive contacts are much less favorable (e.g., Sγ···C ∼ 5.1 Å; Sγ···Nε ∼ 3.9 Å). Finally, the question arises about why the monomeric SARS-CoV-2 and SARS-CoV proteases have minimal or null activity against short peptide molecules. In this respect, our results reveal significant differences between the M/Pep and D/2Pep complexes that would undoubtedly favor the catalytic efficiency in the dimer state. On one hand, the larger flexibility of the monomeric active site could result in a certain entropic penalty as peptide binding would stifle the oxyanion loop and the domain II/III linker residues delimiting the binding sites. On the other hand, the relative position between the nucleophile and the Ser(P1′)–Gln(P1) peptide bond seems not adequate for catalysis in the monomeric Michaelis complex. This is most likely due to the above-discussed changes in the S1 site regarding the oxyanion loop position and the lack of the N-finger residues. In addition, other possible effects could positively affect catalysis in the dimer complexes. For example, the proximity of the whole domain III of the other protomer to the active site region could well provide a favorable environment for the electrostatic stabilization of transition states and reactive intermediates. More conclusive evidence about the catalytic impairment of the monomeric proteases could be gained by means of further computational work aimed to determine the full catalytic pathway and free energy profiles.
  31 in total

1.  Antiviral peptides against the main protease of SARS-CoV-2: A molecular docking and dynamics study.

Authors:  Shafi Mahmud; Suvro Biswas; Gobindo Kumar Paul; Mohasana Akter Mita; Shamima Afrose; Md Robiul Hasan; Mst Sharmin Sultana Shimu; Mohammad Abu Raihan Uddin; Md Salah Uddin; Shahriar Zaman; K M Kaderi Kibria; Md Arif Khan; Talha Bin Emran; Md Abu Saleh
Journal:  Arab J Chem       Date:  2021-07-14       Impact factor: 5.165

Review 2.  Molecular dynamics of the viral life cycle: progress and prospects.

Authors:  Peter Eugene Jones; Carolina Pérez-Segura; Alexander J Bryer; Juan R Perilla; Jodi A Hadden-Perilla
Journal:  Curr Opin Virol       Date:  2021-08-28       Impact factor: 7.121

3.  Primer for Designing Main Protease (Mpro) Inhibitors of SARS-CoV-2.

Authors:  Abhishek Thakur; Gaurav Sharma; Vishnu Nayak Badavath; Venkatesan Jayaprakash; Kenneth M Merz; Galia Blum; Orlando Acevedo
Journal:  J Phys Chem Lett       Date:  2022-06-21       Impact factor: 6.888

4.  An intermolecular salt bridge linking substrate binding and P1 substrate specificity switch of arterivirus 3C-like proteases.

Authors:  Qian Chen; Junwei Zhou; Zhixiang Yang; Jiahui Guo; Zimin Liu; Xinyi Sun; Qingshi Jiang; Liurong Fang; Dang Wang; Shaobo Xiao
Journal:  Comput Struct Biotechnol J       Date:  2022-06-30       Impact factor: 6.155

5.  Calendulaglycoside A showing potential activity against SARS-CoV-2 main protease: Molecular docking, molecular dynamics, and SAR studies.

Authors:  Ahmed A Zaki; Ahmed Ashour; Sameh S Elhady; Khaled M Darwish; Ahmed A Al-Karmalawy
Journal:  J Tradit Complement Med       Date:  2021-05-17

6.  Structure-based identification of SARS-CoV-2 main protease inhibitors from anti-viral specific chemical libraries: an exhaustive computational screening approach.

Authors:  Shovonlal Bhowmick; Achintya Saha; Sameh Mohamed Osman; Fatmah Ali Alasmary; Tahani Mazyad Almutairi; Md Ataul Islam
Journal:  Mol Divers       Date:  2021-04-12       Impact factor: 3.364

7.  Prospective Role of Peptide-Based Antiviral Therapy Against the Main Protease of SARS-CoV-2.

Authors:  Shafi Mahmud; Gobindo Kumar Paul; Suvro Biswas; Shamima Afrose; Mohasana Akter Mita; Md Robiul Hasan; Mst Sharmin Sultana Shimu; Alomgir Hossain; Maria Meha Promi; Fahmida Khan Ema; Kumarappan Chidambaram; Balakumar Chandrasekaran; Ali M Alqahtani; Talha Bin Emran; Md Abu Saleh
Journal:  Front Mol Biosci       Date:  2021-05-10

8.  Asymmetric dynamics of dimeric SARS-CoV-2 and SARS-CoV main proteases in an apo form: Molecular dynamics study on fluctuations of active site, catalytic dyad, and hydration water.

Authors:  Shinji Iida; Yoshifumi Fukunishi
Journal:  BBA Adv       Date:  2021-06-20

9.  New insights into the catalytic mechanism of the SARS-CoV-2 main protease: an ONIOM QM/MM approach.

Authors:  Henrique S Fernandes; Sérgio F Sousa; Nuno M F S A Cerqueira
Journal:  Mol Divers       Date:  2021-06-24       Impact factor: 3.364

Review 10.  Molecular Basis of SARS-CoV-2 Infection and Rational Design of Potential Antiviral Agents: Modeling and Simulation Approaches.

Authors:  Antonio Francés-Monerris; Cécilia Hognon; Tom Miclot; Cristina García-Iriepa; Isabel Iriepa; Alessio Terenzi; Stéphanie Grandemange; Giampaolo Barone; Marco Marazzi; Antonio Monari
Journal:  J Proteome Res       Date:  2020-10-29       Impact factor: 4.466

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.