The 3C-like main peptidase 3CL(pro) is a viral polyprotein processing enzyme essential for the viability of the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV). While it is generalized that 3CL(pro) and the structurally related 3C(pro) viral peptidases cleave their substrates via a mechanism similar to that underlying the peptide hydrolysis by chymotrypsin-like serine proteinases (CLSPs), some of the hypothesized key intermediates have not been structurally characterized. Here, we present three crystal structures of SARS 3CL(pro) in complex with each of two members of a new class of peptide-based phthalhydrazide inhibitors. Both inhibitors form an unusual thiiranium ring with the nucleophilic sulfur atom of Cys145, trapping the enzyme's catalytic residues in configurations similar to the intermediate states proposed to exist during the hydrolysis of native substrates. Most significantly, our crystallographic data are consistent with a scenario in which a water molecule, possibly via indirect coordination from the carbonyl oxygen of Thr26, has initiated nucleophilic attack on the enzyme-bound inhibitor. Our data suggest that this structure resembles that of the proposed tetrahedral intermediate during the deacylation step of normal peptidyl cleavage.
The 2002–2003 epidemic of Severe Acute Respiratory Syndrome (SARS) marked the worldwide debut of a newly recognized member of the Coronaviridae viral family, SARS-associated coronavirus (SARS-CoV). This highly contagious yet previously uncharacterized virus likely originated from some animal coronavirus that fortuitously crossed the animal–human species barrier. The disease was eventually brought under control by strict enforcement of medical containment and tight screening of travelers, but not before the world had witnessed over 8000 cases including 774 SARS-related deaths. At present, there is no specific and effective treatment against SARS-CoV.
SARS-CoV is an enveloped, positive single-stranded RNA virus that replicates in the cytoplasm of the host cell. Similar to those of picornaviruses, coronaviral RNA genomes encode not only the capsid proteins required for virion assembly but also the non-structural proteins involved in viral RNA replication, including two large viral polyproteins, replicases pp1a and pp1ab. In these viruses, the polyproteins are processed by virally encoded peptidases via proteolysis. It is thought that the SARS 3C-like main proteinase performs 11 peptide cleavages in the viral polyproteins to generate individual viral proteins that subsequently assemble into functional complexes required for the replication of the viral RNA genome.2., 3., 4.
The crystal structures of four coronaviral 3CLpro enzymes have been reported: those of the transmissible gastroenteritis virus (TGEV), human coronavirus (H-CoV 229E), SARS-CoV, and the mouse hepatitis virus (MHV).5., 6., 7., 8. In all cases, the N-terminal domains I and II are each composed of a chymotrypsin-like β-barrel. The C-terminal domain III is mainly helical and mediates the homodimerization of coronaviral 3CLpro in the crystal structures, an interaction believed to be important for its proteolytic activity in trans.
The catalytic residues are thought to be the His41-Cys145 dyad, not a triad as in the structurally similar CLSPs and cysteine peptidases including the picornaviral 3Cpro enzymes. This hypothesis seems to be supported by various coronaviral 3CLpro crystal structures, which showed that His41 does not interact directly with any acidic residues. However, a water molecule has been consistently observed forming a hydrogen bond (2.6 to 2.9 Å) with Nδ1 of His41. Additionally, this water molecule is stabilized via hydrogen bonds formed with two other residues near the catalytic residues, i.e. His164 and Asp187. The role of this water molecule in the proteolytic reactions catalyzed by 3CLpro enzymes has not been thoroughly investigated.
The hydrolysis of peptide substrates by viral 3Cpro or 3CLpro peptidases is thought to occur in a manner analogous to the proteolysis by CLSPs. In the first half of the reaction, the acylation step, a histidine general base (His41 in SARS 3CLpro numbering) assists the nucleophilic attack on the carbonyl carbon of the scissile bond by the Sγ atom of the cysteine nucleophile (Cys145 in SARS 3CLpro numbering), leading to the formation of the first tetrahedral intermediate (TI1). The ensuing collapse of TI1 and the departure of the C-terminal product give rise to a covalent thioester enzyme-substrate complex. In the second half of the catalysis, the deacylation step, a solvent molecule, activated to a nucleophilic OH¯ ion by His41, attacks the carbonyl carbon of the thioester, forming the second tetrahedral intermediate (TI2), which is followed by the release of the N-terminal product and the regeneration of the catalytic cysteine.
The 3CLpro enzymes have been targeted for drug design against various members of the Coronavirus genus due to the extensive structural conservation in their active sites and the apparent absence of human homologues.
A few non-covalent, competitive inhibitors and peptidic, covalent inhibitors have been visualized in SARS 3CLpro crystal structures.7., 8., 11., 12., 13. The covalent inhibitors studied to date carry a halomethyl ketone, an epoxide or a 1,4 Michael acceptor function as the reactive “warhead”. Such functionalities permanently inactivate the viral peptidase via the formation of a non-hydrolysable covalent linkage to Cys145 as the result of the nucleophilic attack by its Sγ atom. As shown by X-ray crystallography, these inhibitors yield a complex analogous to the thioacyl intermediate formed during proteolysis. Although these crystal structures have provided valuable insights into how the residues in the active site of SARS 3CLpro are organized after the inhibition reaction is completed, little is known about the mechanisms of inhibition at the molecular level or, by analogy, about the details of the acylation step of normal catalysis. Furthermore, none of these crystal structures (including those of the serine peptidases) offered any substantial structural insights into the deacylation of the acyl enzyme intermediate.
We recently designed a new class of covalent inhibitors against 3Cpro (HAV) and 3CLpro (SARS-CoV) based on the phthalhydrazide function.14., 15. Although these compounds were initially designed as non-covalent peptidic inhibitors of the 3Cpro and 3CLpro enzymes, we recently demonstrated the formation of an acylated enzyme in HAV 3Cpro with concomitant elimination of the phthalhydrazide group. Two species of modified enzymes were observed by high resolution X-ray crystallography: (1) a thioacyl-like species similar to those reported in other covalently inhibited 3Cpro and 3CLpro and (2) a species in which the inhibitor is linked to HAV 3Cpro by an episulfide cation ring. In the latter complex, the Sγ of the catalytic Cys172 is directly attached to the carbonyl carbon, leading to the formation of an oxyanion in the active site.
To our knowledge, this is the first structure to show a linkage between the nucleophilic sulfur atom and the sp3 hybridized carbonyl carbon in a 3Cpro or 3CLpro peptidase. While this structure may be an analogue of the tetrahedral intermediate that occurs during acylation (TI1), the relevance of the episulfide ring complex to normal substrate hydrolysis is limited due to the lack of structures corresponding to other catalytic intermediates.
Here we extend our structural studies on the HAV 3C proteinase to another viral peptidase, SARS 3CLpro. The crystal structures of SARS 3CLpro bound to two phthalhydrazide-charged peptidyl inhibitors, acetyl-valyl-(O-benzyloxy)threonyl-leucyl-ketomethyl(cycloglutamine)-phthalhydrazide (inhibitor 1, Ac-VTbLQcmph) and acetyl-leucyl-alanyl-alanyl-ketomethyl(cycloglutamine)-phthalhydrazide (inhibitor 2, Ac-LAAQcmph) are presented. The significance of these structures in relation to the catalytic steps during normal peptide hydrolysis by viral cysteine proteinases is discussed.
Results and Discussion
In vitro inhibition of SARS 3CLpro by inhibitor 1
The inhibition of SARS 3CLpro by inhibitor 1 was characterized using a FRET-based fluorogenic method reported previously. When added to a mixture of protease and substrate, inhibitor 1 inhibited 3CLpro in a competitive fashion with a Kic of 250 (± 50) nM. No kinetic evidence of covalent inhibition was observed when 10 μM 3CLpro was pre-incubated with 100 μM inhibitor 1 (20 mM Bis–Tris (pH 7.0), 2 mM DTT, 37 °C) for 15 to 60 min, nor under solution conditions that mimicked the crystallization buffer.
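The competitive model implied by the reported Kic can be illustrated with a short sketch of the standard Michaelis-Menten rate law for a competitive inhibitor, in which the inhibitor raises the apparent Km by the factor (1 + [I]/Kic) and leaves Vmax unchanged. Only the Kic value (0.25 μM) is taken from the text; the function names and the Vmax, Km and concentration values are hypothetical placeholders, not parameters measured in this study.

```python
def apparent_km(inhibitor, km, kic):
    """Apparent Km in the presence of a competitive inhibitor.

    All concentrations must share one unit (here micromolar).
    """
    return km * (1.0 + inhibitor / kic)


def competitive_rate(substrate, inhibitor, vmax, km, kic):
    """Initial rate under the standard competitive-inhibition model."""
    return vmax * substrate / (apparent_km(inhibitor, km, kic) + substrate)


if __name__ == "__main__":
    KIC_UM = 0.25          # from the text: Kic = 250 nM
    KM_UM = 10.0           # hypothetical Km, for illustration only
    VMAX = 1.0             # hypothetical Vmax (arbitrary units)
    SUBSTRATE_UM = 20.0    # hypothetical substrate concentration

    for inhibitor_um in (0.0, 0.25, 1.0):
        v = competitive_rate(SUBSTRATE_UM, inhibitor_um, VMAX, KM_UM, KIC_UM)
        print(f"[I] = {inhibitor_um:4.2f} uM -> relative rate {v:.3f}")
```

With an inhibitor concentration equal to Kic, the apparent Km doubles, which is the defining behaviour of this model.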
Inhibitors 1 and 2 are covalently attached to SARS 3CLpro in crystals
We examined the chemical nature of the reaction between the inhibitors and SARS 3CLpro by subjecting SARS 3CLpro crystals soaked in inhibitors 1 or 2 to electrospray ionization-mass spectrometry (ESI-MS). For the SARS 3CLpro-inhibitor 1 complex, a mass difference of 616 Da was observed before and after soaking inhibitor 1 into 3CLpro crystals. This corresponds to a covalent adduct between 3CLpro and inhibitor 1 without the phthalhydrazide moiety (616 Da). Similarly, the covalent linkage between inhibitor 2 and 3CLpro was established by a mass difference of 466 Da (unreacted versus inactivated 3CLpro), which is within experimental error of the expected value (the calculated mass of inhibitor 2 after the removal of the phthalhydrazide function is 467 Da). These results are in line with our crystallographic observations (see below).
The discrepancy between the findings of solution studies and those of crystallography may be explained by several hypotheses: (1) while fast, competitive inhibition predominates in an aqueous environment, slow, selective crystallization of covalently inactivated enzyme molecules takes place in the course of a 24–72 h period; (2) the “local concentration effect” in crystal soaking experiments, plus the relatively high concentration of inhibitor used in the crystallographic study, may foster the formation of the covalent linkage between the inhibitors and the enzyme; and (3) the local conformation of the residues in the active site of crystallized SARS 3CLpro may be restricted in some way that facilitates the covalent attachment of the inhibitors to the enzyme.
Structural overview
The three SARS 3CLpro-inhibitor complexes are very similar in overall protein fold to each other and to the unliganded structure (PDB code 2A5A) as evidenced by their r.m.s.d. values (less than 0.3 Å calculated over all 306 Cα atoms; Table 1). Two regions show significant structural divergences: those comprising residues 45-TAEDM-49 and residues 277-NG-278. Residues N277 and G278 are located in a flexible surface loop in the C-terminal helical domain, where residues 276-MNGR-279 form a type II turn. The electron density for N277 was previously noted to be poorly defined and this seems to be attributable to the relatively higher inherent flexibility in this residue. Although the loop containing N277 is situated near the dimerization interface of SARS 3CLpro in the crystal, N277 itself is not directly involved in crystal packing. Residues 45-TAEDM-49 form the outer lid of the S2 pocket and their atoms have been associated with temperature factors significantly higher than those of the neighboring residues in various SARS 3CLpro crystal structures. The significance of the flexibility in these residues in the recognition of the P2 residues of peptidyl substrates is discussed below.
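For readers who wish to reproduce this kind of comparison, the r.m.s.d. over matched Cα positions can be sketched as follows. This is a minimal illustration that assumes the two structures have already been superposed and that the coordinate lists are in the same residue order; it does not perform the superposition itself and is not the program used in this study. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same unit as the input, e.g. Å)
    between two equally long lists of (x, y, z) positions.

    Assumes the two coordinate sets are already superposed and that
    element i of each list refers to the same atom (e.g. the same Cα).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 0.3 Å along z.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    shifted = [(x, y, z + 0.3) for (x, y, z) in reference]
    print(f"r.m.s.d. = {rmsd(reference, shifted):.2f} Å")
```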
Table 1
Alignment statistics of various complexes discussed in this study

| | 2A5I | XXXX | YYYY | ZZZZ |
|---|---|---|---|---|
| 2A5A | 0.28 (a) | 0.24 | 0.28 | 0.23 |
| XXXX | 0.27/1.16 (b) | – | 0.18/1.92 | 0.18/2.13 |
| YYYY | 0.23/0.93 | – | – | 0.16/0.30 |
| ZZZZ | 0.23/1.12 | – | – | – |

(a) r.m.s.d. values (Cα positions) over all protein residues in Å.
(b) r.m.s.d. values for inhibitors alone.
The interactions between the peptidyl portions of the inhibitors and the specificity pockets of SARS 3CLpro
The three refined structures show that the inhibitors are located in the substrate-binding cleft between the two chymotrypsin-like β-barrels. Excluding the interactions inside the S1 pocket, four and three hydrogen bonds were observed between the enzyme and the peptide backbone of inhibitors 1 and 2, respectively (Table 2). This structural organization generally resembles that of the picornaviral 3C-peptidyl inhibitor complexes. In the latter, the peptidyl portions of the inhibitors form a parallel β-sheet with a strand (residues 141-ATYVHK-146 in HAV 3C) of the β-hairpin substructure, and an antiparallel β-sheet with a strand (residues 192-VAGGN-196 in HAV 3C) from the C-terminal β-barrel. In contrast, in the SARS 3CLpro structures, the substrate/inhibitor peptide is antiparallel to both of the two interacting strands, residues 189-QTA-191 and residues 164-HMELP-168, an arrangement similar to that observed in the subtilisin family of serine proteinases. A recent theoretical model also suggested that peptide substrates likely form an antiparallel β-sheet in the active site cleft of caliciviral 3C-like proteinases.
Table 2
Interactions between the tetrapeptidyl inhibitors and SARS 3CL proteinase

| | Thioacyl-like | Episulfide (inhibitor 1) | Episulfide (inhibitor 2) | Deacylating |
|---|---|---|---|---|
| P4 (a) | 21/2 (b) | 20/1 | 30/3 | 30/3 |
| P3 | 16/4 | 12/3 | 8/4 | 8/4 |
| P2 | 7/2 | 10/4 | 12 | 12 |
| P1 | 45/4 | 42/1 | 44 | 45 |
| H bonds | | | | |
| P4O:Gln189NE2 | (2.7) (c) | (2.9) | – | – |
| P3N:Glu166O | (2.8) | (2.9) | (3.0) | (3.0) |
| P3O:Glu166N | (2.8) | (2.9) | (2.9) | (2.9) |
| P1N:His164O | (3.5) | (3.1) | (3.0) | (3.0) |
| P1O:Gly143N | (3.0) | (3.1) | (2.6) | (2.9) |
| P1O:Cys145N | (3.1) | (2.9) | (3.2) | (2.9) |
| P1OE1:His163NE2 | (2.6) | (2.6) | (2.8) | (2.7) |
| P1NE2:Glu166OE2 | (3.0) | (3.0) | (2.9) | (3.0) |
| P1NE2:Phe140O | (2.9) | (3.5) | (3.3) | (3.4) |

(a) Residue positions with respect to scissile bond as defined by Schechter and Berger.
(b) Total number of van der Waals interactions/van der Waals interactions with solvent (less than or equal to 4 Å).
(c) Parentheses indicate the distances in Å.
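The contact counts in Table 2 rest on a simple distance criterion (4 Å or less). A minimal sketch of such a count is given below, assuming that contacts are tallied as atom pairs within the cutoff; any additional filters applied in the actual analysis (for example, excluding hydrogen-bonded pairs or restricting atom types) are not specified in the text and are not modelled here. The function name and coordinates are hypothetical.

```python
import math


def count_contacts(ligand_atoms, partner_atoms, cutoff=4.0):
    """Count ligand/partner atom pairs separated by at most `cutoff` Å.

    Both arguments are sequences of (x, y, z) coordinates in Å.
    This is a plain distance filter, not a full van der Waals analysis.
    """
    count = 0
    for a in ligand_atoms:
        for b in partner_atoms:
            if math.dist(a, b) <= cutoff:
                count += 1
    return count


if __name__ == "__main__":
    # Hypothetical coordinates: one inhibitor atom and three nearby atoms.
    inhibitor = [(0.0, 0.0, 0.0)]
    neighbours = [(3.0, 0.0, 0.0), (4.0, 0.0, 0.0), (4.1, 0.0, 0.0)]
    print(count_contacts(inhibitor, neighbours))
```

Running the same function once against protein atoms and once against solvent atoms would give the two numbers reported per residue position, under the assumptions stated above.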
Upon substrate binding, small shifts were observed in the residues forming one of the two β-strands in α-lytic protease interacting with the peptidyl substrates in the active site. The movement of this β-strand (residues 214-SGGNV-218 in α-lytic protease) causes the specificity pockets of the enzyme to collapse onto the substrate, a mechanism proposed to confer substrate specificity. A similar motion seems at play in the active sites of SARS 3CLpro as well; residues 164-HMELP-168 of a structurally equivalent β-strand to that in α-lytic protease exhibit shifts of 0.3–0.4 Å from their positions in the unliganded 3CLpro structure (PDB code 2A5A) towards the inhibitor. Although inhibitors 1 and 2 consist of different peptidyl sequences, the size of these shifts is comparable among all three complexes described here. This suggests that the driving force of the motion in residues 164-HMELP-168 is the hydrogen bonding formed between these residues and the main chain atoms of the peptide substrates.
The S1 specificity pocket is primarily responsible for discriminating among peptide substrates. Virtually all native coronaviral 3CLpro cleavage sites have a P1-Gln residue. With the exception of the side-chain atoms of Asn142, all atoms forming the S1 pocket show below average conformational changes upon inhibitor binding, suggesting that the screening of prospective substrates at the P1 position is largely based on structural complementarity. Indeed, we found that the side-chain atoms of the P1-cycloglutamine (P1-Glnc) residues of both inhibitors 1 and 2 occupy almost identical positions in the S1 pocket.
Two hydrogen bonds contribute to the specific recognition and stabilization of the P1-Glnc in the S1 pocket: the side-chain carbonyl oxygen accepts a strong hydrogen bond (2.5 to 2.7 Å) from Nε2 of His163 and the amide nitrogen of the pyrrole ring donates a hydrogen bond (∼ 3 Å) to a carboxylate oxygen of Glu166. The S1 pocket and the oxyanion hole in SARS 3CLpro are essentially unchanged for the unliganded enzyme (PDB code 2A5A) and the three complex structures presented here.
In contrast to S1, the S2 pocket, although iso-structural between the unliganded enzyme and the 3CLpro-inhibitor 2 complex, showed significant structural change upon the binding of inhibitor 1. Leucine occurs most frequently at the P2 position in the cognate cleavage sites for coronaviral 3CLpro enzymes. Such preference can be best explained by the induced orderliness in the S2 pocket. The side-chain of Met165, the residue that forms the bottom of the S2 pocket, takes on two alternate conformations (with χ1 angles ∼ 0° and 80°, respectively) in both the native enzyme and 3CLpro-inhibitor 2 complex. The presence of the P2-Leu of inhibitor 1 in the S2 pocket completely eliminates the first conformation (χ1 angle ∼ 0°) of Met165. This is consistent with the structural changes in the S2 pocket observed in the structures of the SARS 3CLpro-azapeptide epoxide complexes: in addition to the restriction in the side-chain placement of Met165, the S2 “lid”, a surface loop consisting of residues 45-TAEDM-49, moves up towards the solvent to accommodate the bulkier P2-Phe of the epoxide inhibitor.
These structural adjustments in the S2 pocket with regard to the incumbent P2 residue indicate that the recognition/binding of the P2 residue in the S2 pocket occurs in an “induced-fit” fashion.
Although the structural differences in the S4 pocket are minor between the complexes with inhibitors 1 and 2, the only hydrogen bond formed between 3CLpro and the carbonyl oxygen of the P4 residues of the inhibitors is lost in the inhibitor 2 complex. Gln189, the residue that forms the mobile lid of the S4 pocket, normally makes a hydrogen bond (2.8–2.9 Å) with the P4 carbonyl oxygen via its Nε2 atom, which is indeed observed in the inhibitor 1 complexes. The distance between these two atoms is 3.9 Å in the inhibitor 2 complex because of a shift in the backbone atoms of the P4 and P3 residues of inhibitor 2. This is possibly due to the fact that inhibitor 2 contains a larger hydrophobic residue at the P4 position (Leu) than the P4-Val of inhibitor 1. Leucine does not fit well inside the normally shallow S4 pocket. Consequently, the acetyl modification of P4-Leu, which resembles the peptide bond between the P5 and P4 residues of a substrate, does not lie in the same volume as that of the (acetyl)-P4-Val of inhibitor 1. It is of interest to mention here that the S4 pocket of SARS 3CLpro does occasionally exhibit inducible conformational change to accommodate larger residues such as the N-terminal benzene moiety in the epoxide inhibitor bound to SARS 3CLpro. In that structure, only one of the two protomers in the asymmetric unit showed an enlarged S4 pocket, indicating that either the S4 pocket of SARS 3CLpro is less susceptible to induced structural adjustment or the binding energy for a fully buried leucine side-chain in inhibitor 2 is not sufficient to drive the corresponding conformational changes in the S4 pocket.
Covalent linkages between the inhibitors and SARS 3CLpro catalytic cysteine
Structures of SARS 3CLpro in complex with inhibitor 1
The SARS 3CLpro-inhibitor 1 complex obtained via the co-crystallization method shows predominantly a thioacyl-like connectivity between Sγ of Cys145 and the Ci atom of the inhibitor (Figure 1 for nomenclature, Table 3 for bond geometry and Figures 2, 3 and 4 for visualization). This structure mirrors those previously published of SARS 3CLpro that has reacted with peptidyl chloromethyl ketone inhibitors, with an epoxide inhibitor and with Michael acceptor inhibitors. In all of these structures, the Cβ–Sγ bond of Cys145 shows a dramatic rotation (∼ 90° in the χ1 angle) from its position in the unliganded enzyme. Similar conformational changes in the catalytic residues of the CLSPs are rarely reported in crystal structures. For example, chloromethyl ketone inhibitors usually form two covalent attachments, one to each member of the His57:Ser195 catalytic pair in their native conformation, possibly through a double displacement mechanism. Interestingly, in the crystal structure of elastase complexed with the non-covalent inhibitor, trifluoroacetal-leucyl-alanyl-p-trifluoromethylphenylanilide (TFLA), the Cβ–Oγ bond of Ser195 does undergo a similar rotation to avoid steric clashes with the trifluoro function of the inhibitor (PDB code 7EST). This rotation in the side-chain of the catalytic serine/cysteine residue not only increases slightly the distance between Nε2 of the assisting general base (His57 of elastase/His41 of 3CLpro) and the nucleophilic atom, but also places the nucleophilic Oγ/Sγ atom in a position less coplanar with the imidazole ring of the histidine residue. Consequently, the hydrogen bond between the catalytic His:Ser/Cys pair is weakened. When the usual analogy is drawn between these inhibitor-enzyme complexes and intermediate states of peptide hydrolysis, an attractive hypothesis arises: the demonstrated innate structural plasticity in the catalytic serine/cysteine residue effectively decreases the backward reaction from the acyl enzyme stage. However, the fact that these conformational changes occur only in inactivated enzymes argues that the non-native conformation is unique to enzyme-inhibitor complexes and may not bear much relevance to the hydrolysis of peptide substrates.
Indeed, the crystal structures of serine proteinase complexes showed that the formation of neither the acyl complex nor the tetrahedral intermediate requires side-chain movement in the catalytic serine of a magnitude similar to that observed in the thioacyl-like complexes with viral 3Cpro or 3CLpro enzymes. In SARS 3CLpro and picornaviral 3Cpro, because of the geometric constraints imposed by the relatively greater rigidity of the oxyanion hole and the catalytic dyad, the additional methylene carbon Ci of inhibitor 1 makes it impossible for Sγ of Cys145 to maintain its native conformation in the thioacyl-like complex. It is noteworthy that two recently reported atomic resolution X-ray crystal structures of α-lytic protease showed a modified version of the aforementioned mechanism by which proteinases may promote the formation of the acyl enzyme over back-protonation of the catalytic Ser195. The authors observed a translational shift in the position of Oγ of the catalytic Ser195 residue that weakens the His57:Ser195 hydrogen bond through a significant decrease in the calculated His57 Nε2-H···Oγ Ser195 angle from the ideal linear value of 180° to 127°.
Figure 1
The chemical formulae for the inhibitors used in this study. Chiral centers are indicated where necessary.
Table 3
Bond geometry of the three covalent linkages
Columns, in order: Thioacyl-like | Episulfide | Deacylating. Rows with fewer than three values apply to only some of the bonding modes.

Bond lengths
Sγ-Ci: 1.81 | 1.83 | 1.81
Sγ-C: 1.80 | 1.83
C=O: 1.23
C-O¯: 1.43 | 1.45
Ci-C: 1.51 | 1.53 | 1.53
C-Cα: 1.50 | 1.52 | 1.55
Ci-OH: 1.45
Cys145N···O: 3.11 | 3.14 | 2.92
Gly143N···O: 3.01 | 3.07 | 2.91
His41Nε2···Sγ: 3.07
His41Nε2···Ci: 2.97

Bond angles
Sγ-Ci-C: 108.9 | 64.0 | 65.4
Cβ-Sγ-Ci: 117.3 | 104.3 | 105.7
O-C-Ci: 123.1 | 113.3 | 108.1
O-C-Cα: 118.4 | 111.3 | 110.9
Ci-C-Cα: 118.5 | 111.4 | 110.9
Sγ-C-O: 123.4 | 113.0
Sγ-C-Ci: 66.3 | 107.4
Sγ-C-Cα: 125.0 | 121.0
Cβ-Sγ-C: 111.5 | 107.5
Sγ-C-Cα: 121.0 | 106.4
Oh-Ci-C: 109.7

Dihedrals (a)
His41…Sγ: 161.5 | 173.5 | 173.3
His41…Ci: 164.6 | 155.0 | 151.0
His41…C: 163.5 | 159.4 | 160.1
His41…Oh: 144.8

Ci, C, Cα, O are atoms of the inhibitors, Cβ and Sγ are atoms of Cys145, Oh is the oxygen atom of the hydrolytic water, all bond lengths are in Å, angles are in °. Covalent bonds are shown as dashes, whereas hydrogen bonds are shown in dotted lines.
(a) Dihedrals are defined as the torsion angles determined by Nδ1, Cε1, Nε2 of His41 and the fourth atom.
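The torsion angles defined in the footnote above measure how close the fourth atom lies to the plane of the His41 imidazole ring (values near 180° indicate near-coplanarity). A minimal sketch of the standard four-point torsion calculation follows; it uses the usual atan2 formulation and is offered as an illustration, not as the software used in this study. The coordinates in the example are hypothetical and serve only to exercise the function.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dihedral(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees, in the range [-180, 180].

    For the dihedrals of Table 3, p1, p2, p3 would be Nδ1, Cε1 and Nε2
    of His41 and p4 the atom of interest (Sγ, Ci, C or Oh).
    """
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)   # normals of the two planes
    b2_len = math.sqrt(_dot(b2, b2))
    y = _dot(b2, _cross(n1, n2)) / b2_len
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


if __name__ == "__main__":
    # Hypothetical points: the fourth atom lies in the plane of the
    # first three, on the opposite side (trans), so |angle| is 180.
    print(abs(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))))
```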
Figure 2
The electron densities of the inhibitors on the |2|Fo|–|Fc||, αcalc map. (a) The thioacyl form of SARS 3CLpro-inhibitor 1 complex; (b) the episulfide (major) and thioacyl mixed forms of SARS 3CLpro-inhibitor 1 complex; (c) the episulfide and the deacylating (major) mixed forms of SARS 3CLpro-inhibitor 2 complex. The protein is shown in cartoon with the carbon atoms colored green. The inhibitors are shown in sticks. Colors grey, purple and cyan are used to distinguish the carbon atoms in the thioacyl, episulfide and the deacylating complexes, respectively. In addition, the carbon atoms in portions of the inhibitors corresponding to the P4-P2 residues of the native substrates are colored grey. Cys145 is colored according to the bonding mode of the inhibitor to which it is linked. Oxygen and nitrogen atoms are colored red and blue, respectively.
Figure 3
The interactions between the inhibitors and SARS 3CLpro. (a) The thioacyl form of SARS 3CLpro-inhibitor 1 complex; (b) the episulfide form of SARS 3CLpro-inhibitor 1 complex; (c) the deacylating form of SARS 3CLpro-inhibitor 2 complex; (d) the episulfide form of SARS 3CLpro-inhibitor 2 complex.
Figure 4
Positive electron density peaks on the |Fo|–|Fc|, αcalc difference maps for (a) the episulfide linkage of SARS 3CLpro-inhibitor 1 complex and (b) the deacylating form of SARS 3CLpro-inhibitor 2 complex overlaid with their corresponding refined structures. The protein and inhibitor structures are colored as in Figure 2. Positive densities on the |2|Fo|–|Fc||, αcalc maps (contoured at 1 σ) are shown in sandy color, whereas those on the |Fo|–|Fc|, αcalc difference maps (contoured at 3.5 σ) are colored red. The water molecule hydrogen bonded to H41 is shown as a red sphere.
The structure of the 3CLpro-inhibitor 1 complex obtained via the soaking method indicates that there is an alternate bonding mode. This is clear in the ||Fo|–|Fc||, αc difference map when refinement was performed using the coordinates of the thioacyl-like complex. As shown in Figure 4(a), even when the occupancies of both inhibitor and Cys145 were set to 1.0, a significant (> 4σ) positive peak was observed near the position of Sγ of the unliganded enzyme. Correspondingly, there is a small negative density peak near the Sγ atom at its inhibited, “thioacyl”-like position. A similar result was reported by us previously for the complexes of HAV 3Cpro with phthalhydrazide inhibitors. In the SARS 3CLpro complex, we have determined that an episulfide cation (thiiranium ring) best explains the electron density near Sγ of Cys145. Subsequent structural refinement has shown that both the negative and positive peaks in the ||Fo|–|Fc||, αc difference map disappear when we set the episulfide bonding mode to be the major species (> 70%). The predominance of the episulfide species is also confirmed in another crystal structure of the same complex acquired using a lower concentration of inhibitor 1 during soaking, in which the ratio between thiomethyl acyl and episulfide bonding modes remains roughly the same after refinement. This suggests that either linkage can be the “terminal” product of the inhibition reaction in this form of SARS 3CLpro crystals. The distinct difference in the distribution of the two bonding modes between the structures of co-crystals and soaked crystals suggests the intermediacy of the episulfide linkage in solution. Additionally, the collapse of the episulfide cation into the thiomethyl ketone linkage may require conformational changes in the 3CLpro enzyme, which are more attainable in aqueous solution than in crystals (Table 3).
In the episulfide linkage, the Ci atom of inhibitor 1 is 3.0 Å away from Nε2 of His41, a distance shorter than that observed in the thioacyl-like bonding mode (> 4 Å). The shorter distance indicates a possible CH···N type hydrogen bond that would help to stabilize the episulfide cation structure. Two lines of argument support this proposal: (1) Ci is an electron-deficient atom due to the positive charge associated with the episulfide ring and the relatively larger electronegativity of Sγ and (2) Nδ1 of His41 donates a strong hydrogen bond to a nearby water (∼ 2.7 Å), which is, in turn, hydrogen bonded to D187 (∼ 2.8 Å to Oδ2), H164 (∼ 2.9 Å to Nδ1) and the main chain nitrogen of His41. Nε2 of His41 is closer to Sγ of Cys145 than to Ci of inhibitor 1 in the thioacyl-like complex, whereas the reverse is true for the episulfide complex.
This is in agreement with the proposed catalytic roles of His41: it acts as a general base to activate Sγ of Cys145 for its nucleophilic attack on the peptide bond, and then as a general acid to protonate the scissile peptide amidenitrogen for the release of the leaving group, the C-terminal product. The imidazole ring of His41 is roughly coplanar with both Sγ of Cys145 and Ci. Moreover, both the Nε2–Sγ and the Nε2–Ci distances in the two complexes are between 3.0 and 3.6 Å, suggesting that nucleophilic attack and leaving group protonation are closely coordinated.
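Distances such as the Nε2–Sγ separation quoted above can in principle be recomputed from deposited coordinates. The following is a minimal sketch, assuming standard fixed-column PDB ATOM records; the two records and their coordinates are invented for illustration (they are not taken from 2Z3C, 2Z3D or 2Z3E), and all helper names are hypothetical.

```python
import math

def make_record(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-column PDB ATOM record (illustrative helper only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def parse_atom(line):
    """Read atom name, residue name, residue number and xyz from an ATOM record."""
    return {
        "name": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }

def find_atom(atoms, resname, resseq, name):
    """Return the first atom matching residue name, residue number and atom name."""
    for atom in atoms:
        if (atom["resname"], atom["resseq"], atom["name"]) == (resname, resseq, name):
            return atom
    raise KeyError(f"{resname}{resseq} {name} not found")

def distance(p, q):
    """Euclidean distance between two xyz tuples, in the units of the input (Å)."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Hypothetical coordinates, chosen so the expected separation is easy to verify.
records = [
    make_record(1, "NE2", "HIS", "A", 41, 10.0, 5.0, 2.0),
    make_record(2, "SG", "CYS", "A", 145, 13.3, 5.0, 2.0),
]
atoms = [parse_atom(r) for r in records]
ne2 = find_atom(atoms, "HIS", 41, "NE2")
sg = find_atom(atoms, "CYS", 145, "SG")
print(f"His41 NE2 to Cys145 SG: {distance(ne2['xyz'], sg['xyz']):.2f} Å")
```

With real coordinate files, the same parsing would be applied to every ATOM and HETATM line, and alternate conformations and occupancies would need to be handled explicitly, which this sketch does not attempt.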
Structure of SARS 3CLpro in complex with inhibitor 2
The |Fo|–|Fc|, αc difference map generated using an episulfide linkage as the refinement model indicates that the thioacyl-like bonding mode, if present at all, does not exist at a level significant enough to be taken into account. Surprisingly, a strong positive electron density peak appeared near the Ci atom of inhibitor 2 (Figure 4(b)). This peak extends into the electron density surrounding the imidazole ring of His41 on the corresponding 2|Fo|–|Fc|, αc map. The continuous electron density led to our hypothesis that this peak represents a possible position of a water molecule, which, with the assistance of His41, has attacked and opened the episulfide ring formed between Sγ of Cys145 and the methyl ketone function of inhibitor 2. Subsequent refinement of the modified structure in the active site of SARS 3CLpro proved successful; the positive peak disappeared and no negative peaks were observed nearby, indicating that our model provides a reasonable explanation for the additional electron densities near Ci on the initial |Fo|–|Fc|, αc map. The final structure shows a distance of ∼ 2.5 Å between the hydroxyl oxygen on atom Ci and Nε2 of His41. This indicates that it is a strong hydrogen bond, considering an overall coordinate error of ∼ 0.15 Å based on maximum likelihood. Although solvent-directed nucleophilic attack during the deacylation stage of peptide hydrolysis by proteinases has been an unchallenged textbook concept, this is the first crystal structure that provides clear visual demonstration of such a critical reaction step. Furthermore, the reaction pathway proposed here parallels the posited mechanism underlying the inhibition of chymotrypsin by chloromethyl ketone inhibitors. In the latter hypothesis, an epoxy ether intermediate, resulting from the internal displacement of the chloride by the oxyanion, is subsequently attacked (thus opened) by Nε2 of the histidine residue of the catalytic triad to form a second covalent linkage between the inhibitor and the His:Ser catalytic pair of chymotrypsin.
Comparison of the three complex structures reported here suggests that the activated water molecule carrying out the attack on the acyl enzyme approaches from the C-terminal side of the scissile bond, i.e. the S′ side. The extrapolated angle of attack (O···C=O), using the coordinates of the hydroxyl oxygen attached to the Ci atom and those of the carbonyl function in the episulfide bonding mode (the bona fide pre-attack scenario), is ∼ 115°. This angle is satisfyingly close to the calculated optimal angle (∼ 109°) for nucleophilic addition onto carbonyl groups. The slightly larger angle in our structure may reflect minor coordinate error, the fact that the attacking hydroxyl oxygen is still ∼ 2.3 Å from the P1 carbonyl carbon and separated from it by the Ci atom, or both. During normal deacylation, the actual line of attack on the P1 carbonyl group by the hydrolytic water from its position observed in our “deacylating” complex structure will inevitably be affected by the bulky Sγ of Cys145. Therefore, the attacking angle of the hydrolytic water on the P1 carbonyl of peptide substrates may change slightly from that estimated in the “deacylating” complex. More importantly, the establishment of C-side entrance of the hydrolytic water and its position relative to His41 and Cys145 as observed in the deacylating complex are in accordance with the hypothesized “minimal energy pathway” mechanism, in which the solvent activation by Nε2 (His41) and re-protonation of Sγ (Cys145) are tightly coordinated. The distance between Nε2 (His41) and Sγ (Cys145) in the deacylating complex is 3.7 Å, indicating that it is a better mimic of tetrahedral intermediate 2 (TI2) than of TI2′. This distance is longer than that between Nε2 (His57) and Oγ (Ser195) in CLSPs, reflecting the larger size of Sγ versus Oγ and the necessity to reduce possible steric hindrance between the hydrolytic water and the catalytic pair during its entrance into the active site from the S′ sites.
In the active sites of CLSPs, the carboxylate of Asp102 (PPE numbering) forms hydrogen bonds with Nδ1 of His57 and Oγ of Ser214 (the S1 residue), whose carbonyl oxygen interacts with the amide nitrogen of the P1 residue via a hydrogen bond (Figure 5(a)). Together with the hydrogen bond between the carboxyl group of Asp102 and the main chain amide of His57, this hydrogen bonding network converges on the catalytic histidine residue and was thought to couple acyl enzyme hydrolysis and product release. An extensive hydrogen bonding network involving the catalytic histidine residue is also observed in the crystal structures of coronaviral 3CLpro. However, coronaviral 3CL proteinases do not have an acidic residue as the third member of the catalytic triad; Asp187 (SARS 3CLpro numbering), the equivalent residue to Asp102 in CLSPs, does not interact directly with the catalytic His41. Instead, a solvent molecule is consistently observed in the vicinity of Nδ1 of His41 in various coronaviral 3CLpro crystal structures (refs. 5–8). In the deacylating complex, this water (WAT22) forms hydrogen bonds with Nδ1 of His41 (2.5 Å), Nδ1 of His164 (2.9 Å), Oδ2 of Asp187 (2.7 Å), and main-chain NH of His41 (3.3 Å) (Figure 5(b)). In addition, the carbonyl oxygen of His164 forms a hydrogen bond with the main-chain NH of the P1-Glnc (3.0 Å). Therefore, the solvent-directed hydrogen bonding network in SARS 3CLpro closely mimics the Ser214···Asp102···His57 interactions in the CLSPs. This suggests that SARS 3CLpro may employ a mechanism similar to that proposed for the CLSPs for the concerted progression from acyl enzyme hydrolysis to product release. Interestingly, the residues in the picornaviral 3Cpro corresponding to Ser214 in the CLSPs are usually aliphatic (e.g. Val192 in HAV 3Cpro). This precludes the possibility of linking peptide substrate binding/dissociation to the catalytic events concerning the scissile bond via the hydrogen-bonding network described above.
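For readers who wish to check an angle of attack (O···C=O) of the kind discussed in this section, the calculation reduces to the angle between two vectors sharing the carbonyl carbon as vertex. The sketch below is illustrative only: the coordinates are hypothetical values constructed to give a 115° angle at a 2.3 Å approach distance, not coordinates from the deposited structures, and the function name is an assumption.

```python
import math

def angle_deg(a, b, c):
    """Return the angle a-b-c in degrees, with b at the vertex."""
    ba = [ai - bi for ai, bi in zip(a, b)]
    bc = [ci - bi for ci, bi in zip(c, b)]
    dot = sum(x * y for x, y in zip(ba, bc))
    norm = math.sqrt(sum(x * x for x in ba)) * math.sqrt(sum(x * x for x in bc))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical geometry (Å): carbonyl carbon at the origin, carbonyl oxygen on x.
carbonyl_c = (0.0, 0.0, 0.0)
carbonyl_o = (1.23, 0.0, 0.0)
theta = math.radians(115.0)
attacking_o = (2.3 * math.cos(theta), 2.3 * math.sin(theta), 0.0)

attack_angle = angle_deg(attacking_o, carbonyl_c, carbonyl_o)
# Compare with the ~109° optimum cited in the text.
print(f"attack angle {attack_angle:.1f}°, deviation {attack_angle - 109.0:+.1f}°")
```

Because the angle depends only on directions, the result is insensitive to the approach distance itself; coordinate error enters through the atomic positions that define the two vectors.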
Figure 5
Similar hydrogen bonding networks converge on the catalytic histidine residue in PPE and SARS 3CLpro. (a) The PPE-BCM7 acyl complex and (b) the SARS 3CLpro-inhibitor 1 thioacyl-like complex. The PPE and SARS 3CLpro residues are distinguished by the color of the carbon atoms as cyan and green, respectively. The carbon atoms in the inhibitors are colored grey. Hydrogen bonds are shown as black broken lines. WAT22 is shown as a red sphere.
The origin of the hydrolytic water
It has been proposed that the solvent molecules hydrogen bonding to the carbonyl oxygen of residue 41 in the CLSPs (Thr41 in PPE) are involved in acyl enzyme hydrolysis (references in Perona et al.). The distances between O of residue 41 and Nε2 of His57 are ∼ 7.0 Å in various CLSP crystal structures, implying that there are likely two water molecules bridging these two residues. This is indeed observed in the structure of the PPE-β-casomorphin (BCM7) acyl enzyme; the main-chain carbonyl oxygen of Thr41 coordinates the hypothetical hydrolytic water (WAT317) through another solvent molecule (Figure 6(a)). This intermediary water (WAT318) forms two strong hydrogen bonds to the main-chain oxygen of Thr41 (2.7 Å) and to WAT317 (2.7 Å), which is at a distance of 2.9 Å from Nε2 of His57. We found a similar disposition of two water molecules in the active site of the thioacyl-like complex of SARS 3CLpro. Thr26, the structurally equivalent residue of Thr41 in PPE, tightly binds a water molecule, WAT58 (∼ 2.6 Å), which interacts with another solvent molecule, WAT81. The distance between WAT81 and Nε2 of His41 in 3CLpro is 4.8 Å, ruling out the possibility that this solvent molecule serves as the hydrolytic water. However, it is important to take into account that the observed position of WAT81 is a consequence of both the conformational change in the side-chain atoms of Cys145 and the intercalation of Ci between His41 and the carbonyl group of the P1 residue in the thioacyl-like complex (Figure 5(b)). Strikingly, although the amino acid identity of the structurally equivalent residues of Thr41 in PPE varies from virus to virus (Thr26 in SARS-CoV, Lys24 in HRV2, and Val28 in HAV), the relative position of their main-chain carbonyl oxygen atoms with regard to the catalytic residues is highly conserved. This reiterated structural resemblance among viral cysteine proteinases and the CLSPs seems to strengthen the notion that their catalytic mechanisms are likely to be quite similar as well.
Figure 6
Similar solvent molecules near the catalytic residues in the active sites of PPE and SARS 3CLpro complexes. (a) The PPE-BCM7 acyl complex and (b) the SARS 3CLpro-inhibitor 1 thioacyl-like complex. Carbon atoms in PPE, BCM7, SARS 3CLpro, and inhibitor 1 are distinguished by the colors yellow, cyan, green and grey, respectively. Hydrogen bonds in the PPE-BCM7 acyl complex and those in the SARS 3CLpro-inhibitor 1 thioacyl-like complex are colored purple and dark teal, respectively. The interacting solvent molecules are shown as spheres and distinguished by the color of their hydrogen bonds.
Materials and Methods
Chemical syntheses of SARS 3CLpro inhibitors
The syntheses of the phthalhydrazide inhibitors used in this study were done as described.
Expression, purification, and crystallization of SARS 3CLpro
SARS 3CLpro protein was expressed in Escherichia coli in its cognate coding sequence with no additional protein tag attached. The purification procedure of SARS 3CLpro essentially followed the one published. The complexes were obtained either through co-crystallizing each of the inhibitors with SARS 3CLpro or via soaking the inhibitors individually into the pre-grown SARS 3CLpro crystals. The crystallization conditions were derived from a previously published condition that is conducive for the growth of crystals in the C2 space group (one molecule/asymmetric unit). Co-crystals of SARS 3CLpro with inhibitor 1 were obtained two to three days after mixing the protein with mother liquor containing the compound (2 to 5 mM). For soaking, crystals were incubated in drops containing either inhibitor 1 or 2 (2–5 mM) overnight before being flash-cooled for data collection at the synchrotron X-ray source.
Mass spectrometry of SARS 3CLpro inactivated by phthalhydrazide inhibitors
Pre-grown SARS 3CLpro crystals were soaked in solutions containing either inhibitor 1 or 2 for ∼ 18 h before being collected in a PCR tube. A large volume (v/v 1:200) of wash buffer (1 mM Tris pH 7.5) was used to wash the crystals four to five times. Samples were loaded onto a C4 ZipTip (Millipore, MA, USA) and eluted with 0.1% formic acid. Electrospray ionization-mass spectrometric (ESI-MS) analyses were performed on eluted protein samples using a Waters (Micromass) Q-TOF Premier with an infusion rate of 0.5–1 ml/min.
X-ray data collection, processing and structure refinement
X-ray diffraction data were collected at beamline 12.3.1 of the Advanced Light Source (ALS). The data were indexed, integrated and scaled with the programs MOSFLM and SCALA. The native SARS 3CLpro structure in the C2 space group (PDB code 2A5A) was chosen as the search model for molecular replacement using the program MOLREP. Refinement was done using cycles of REFMAC 5 and manual adjustment of the model using XtalView. The crystallographic statistics of data collection and model refinement are listed in Table 4.
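As a reading aid for the refinement statistics, the conventional R-factor defined in the Table 4 footnotes (the sum of absolute differences between observed and calculated amplitudes, divided by the sum of observed amplitudes) can be sketched as follows. This is an illustration of the formula only, not the REFMAC 5 implementation; the amplitudes and free-set flags are made-up numbers, and the function names are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R: sum(||Fo| - |Fc||) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def r_work_and_free(f_obs, f_calc, is_free):
    """Split reflections by a free-set flag and return (R_working, R_free)."""
    work_obs = [fo for fo, free in zip(f_obs, is_free) if not free]
    work_calc = [fc for fc, free in zip(f_calc, is_free) if not free]
    free_obs = [fo for fo, free in zip(f_obs, is_free) if free]
    free_calc = [fc for fc, free in zip(f_calc, is_free) if free]
    return r_factor(work_obs, work_calc), r_factor(free_obs, free_calc)

# Made-up amplitudes; the last reflection is flagged as part of the free set.
f_obs = [100.0, 50.0, 25.0, 80.0]
f_calc = [90.0, 55.0, 25.0, 70.0]
is_free = [False, False, False, True]

r_work, r_free = r_work_and_free(f_obs, f_calc, is_free)
print(f"R_working = {r_work:.3f}, R_free = {r_free:.3f}")
```

In practice the free set is a fixed random subset of reflections (5% here, per Table 4) that is excluded from refinement, so R_free serves as a cross-validation statistic.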
Table 4
Crystallographic statistics of data collection and structure refinement

| PDB code | 2Z3E | 2Z3C | 2Z3D |
|---|---|---|---|
| Inhibitor solution | 1, co-crystallized | 1, soaking | 2, soaking |
| Space group | C2 | C2 | C2 |
| a (Å) | 108.26 | 108.56 | 108.27 |
| b (Å) | 82.24 | 81.46 | 81.84 |
| c (Å) | 53.49 | 53.36 | 53.42 |
| α (°), β (°), γ (°) | 90, 104.45, 90 | 90, 104.45, 90 | 90, 104.17, 90 |
| No. molecules/asymmetric unit | 1 | 1 | 1 |
| Vm (Matthews' coefficient)/% solvent content | 3.41/63.9 | 3.38/63.6 | 3.39/63.7 |
| Data collection (ALS Beamline 12.3.1) | | | |
| Resolution range (Å) | 15.47–2.32 (2.45–2.32)ᵃ | 18.37–1.79 (1.89–1.79) | 18.43–2.10 (2.21–2.10) |
| Wavelength (Å) | 0.97946 | 0.97946 | 0.97946 |
| Observations | 45,213 (6602) | 305,138 (33,941) | 135,574 (19,673) |
| Unique reflections | 18,207 (2662) | 41,831 (6038) | 23,907 (3584) |
| I/σ(I) | 8.9 (5.4) | 11.6 (2.7) | 9.7 (5.1) |
| Data completeness (%) | 92.5 (92.8) | 98.8 (97.6) | 91.2 (93.8) |
| Rmergeᵇ | 0.061 (0.186) | 0.078 (0.643) | 0.070 (0.328) |
| Refinement | | | |
| No. reflections used | 16,658 (1090) | 39,711 (2888) | 22,259 (1716) |
| Resolution range (Å) | 15.47–2.32 (2.37–2.32) | 18.37–1.79 (1.84–1.79) | 18.43–2.10 (2.16–2.10) |
| Free set size (%) | 5.0 | 5.0 | 5.0 |
| No. atoms (protein + ligand) | 2652 | 2758 | 2666 |
| No. waters | 227 | 295 | 212 |
| Rworkingᶜ (%) | 18.7 (23.4) | 19.0 (29.8) | 19.8 (25.4) |
| Rfree (%) | 23.1 (28.2) | 23.9 (30.4) | 24.8 (30.7) |
| Mean B value (Å²)ᵈ | 32.66/46.32/36.28 | 31.49/40.81/50.72 | 31.55/44.94/41.88 |
| r.m.s.d. from ideal geometry | | | |
| Bond length (Å) | 0.020 | 0.015 | 0.012 |
| Bond angle (°) | 1.956 | 1.700 | 1.481 |
| Chirality | 0.127 | 0.116 | 0.101 |
| Ramachandran plot (% most favored/% disallowed) | 90.3/0.0 | 90.3/0.0 | 91.0/0.0 |

ᵃ Parentheses indicate values for the highest resolution shell.
ᵇ Rmerge = ∑h∑j|Ihj – ⟨Ih⟩|/∑h∑jIhj, where ⟨Ih⟩ is the weighted mean intensity of the symmetry-related reflections Ihj.
ᶜ Rworking = ∑h||Fo| – |Fc||/∑h|Fo|, where |Fo| and |Fc| represent the observed and calculated structure factor amplitudes, respectively. Rfree is Rworking calculated with the reference set.
ᵈ Average B factors of the complex/tetrapeptidyl inhibitor/solvent molecules.
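The Matthews coefficient and solvent content reported in Table 4 follow from the unit cell and the protein mass. The sketch below shows the arithmetic for the co-crystal (2Z3E) cell. It rests on assumptions that are not stated in the text: a molecular mass of about 33.8 kDa for SARS 3CLpro, four asymmetric units per unit cell for space group C2, and the common approximation that the protein fraction is 1.23/Vm. The function names are hypothetical.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit-cell volume (Å^3) for a monoclinic cell: V = a*b*c*sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews(volume, n_molecules_in_cell, mass_da):
    """Return (Vm in Å^3/Da, fractional solvent content)."""
    vm = volume / (n_molecules_in_cell * mass_da)
    # Common approximation: protein occupies 1.23/Vm of the crystal volume.
    solvent_fraction = 1.0 - 1.23 / vm
    return vm, solvent_fraction

# Cell constants for 2Z3E as listed in Table 4.
volume = monoclinic_cell_volume(108.26, 82.24, 53.49, 104.45)

# Assumptions (not from the text): C2 has 4 asymmetric units per cell,
# one molecule per asymmetric unit, molecular mass ~33,800 Da.
vm, solvent = matthews(volume, n_molecules_in_cell=4 * 1, mass_da=33800.0)
print(f"Vm = {vm:.2f} Å^3/Da, solvent content = {100 * solvent:.1f}%")
```

Under these assumptions the result comes out close to the 3.41 and 63.9% listed for 2Z3E; small differences would be expected if the mass used by the authors differs from the value assumed here.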
Structural analysis and Figure generation
The quality of the final models was assessed using the program PROCHECK. The interactions between the inhibitors and the enzyme were calculated using the program CONTACT. The program LIGPLOT was used to illustrate these interactions. Structures were aligned using the program ALIGN. Figures containing structural models and electron density maps were generated by the program PyMOL.
Kinetic assays of SARS 3CLpro activity
The steady-state proteolytic activity of SARS 3CLpro was determined using a fluorogenic peptide substrate Abz-SVTLQSGY(NO2)R, where Abz is aminobenzoate and Y(NO2) is nitrotyrosine. The standard assay was performed using 50 nM 3CLpro, 20 mM Bis-Tris (pH 7.0), 2 mM DTT in 100 μl at 37 (± 0.1) °C in a small-volume quartz cuvette. For competitive inhibition experiments, the inhibitor concentration was varied from 0.2 to 3 μM. Fluorescence data were empirically corrected for the inner filter effect. Fluorescence was measured using a Cary Eclipse Fluorescence spectrophotometer (Varian Canada, Mississauga, Ontario, Canada) equipped with a circulating water-bath. The reaction was monitored using an excitation wavelength of 320 nm (2.5 nm bandpass) and an emission wavelength of 420 nm (5 nm bandpass). Initial velocities were determined from a least-squares analysis of the linear portion of the progress curves (at least 1 min) using Excel 2003 (Microsoft, Redmond, WA).
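The initial-velocity determination described above amounts to fitting a straight line to the early, linear part of each progress curve. A minimal sketch of that least-squares step is given below; it is not the authors' Excel worksheet. The 60 s window is an assumption based on the "at least 1 min" statement, the progress curve is synthetic, and the function names are hypothetical.

```python
def least_squares_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need at least two paired points")
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0.0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx

def initial_velocity(times, values, linear_window_s=60.0):
    """Slope over the points recorded within the assumed linear window."""
    pairs = [(t, v) for t, v in zip(times, values) if t <= linear_window_s]
    return least_squares_slope([t for t, _ in pairs], [v for _, v in pairs])

# Synthetic progress curve: fluorescence rises at 2 units/s for 90 s, then plateaus.
times = [float(t) for t in range(0, 181, 5)]
signal = [10.0 + 2.0 * min(t, 90.0) for t in times]

v0 = initial_velocity(times, signal)
print(f"initial velocity = {v0:.2f} fluorescence units per second")
```

Restricting the fit to the early window matters because later points, where substrate depletion or product inhibition flattens the curve, would bias the slope downward.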
Protein Data Bank accession codes
The coordinates for the structures of SARS 3CLpro in complex with inhibitors 1 and 2 have been deposited in the RCSB Protein Data Bank and are available under accession codes 2Z3C, 2Z3D and 2Z3E.