
Production of authentic SARS-CoV M(pro) with enhanced activity: application as a novel tag-cleavage endopeptidase for protein overproduction.

Xiaoyu Xue, Haitao Yang, Wei Shen, Qi Zhao, Jun Li, Kailin Yang, Cheng Chen, Yinghua Jin, Mark Bartlam, Zihe Rao.

Abstract

The viral proteases have proven to be the most selective and useful for removing the fusion tags in fusion protein expression systems. As a key enzyme in the viral life-cycle, the main protease (M(pro)) is most attractive for drug design targeting the SARS coronavirus (SARS-CoV), the etiological agent responsible for the outbreak of severe acute respiratory syndrome (SARS) in 2003. In this study, SARS-CoV M(pro) was used to specifically remove the GST tag in a new fusion protein expression system. We report a new method to produce wild-type (WT) SARS-CoV M(pro) with authentic N and C termini, and compare the activity of WT protease with those of three different types of SARS-CoV M(pro) with additional residues at the N or C terminus. Our results show that additional residues at the N terminus, but not at the C terminus, of M(pro) are detrimental to enzyme activity. To explain this, the crystal structures of WT SARS-CoV M(pro) and its complex with a Michael acceptor inhibitor were determined to 1.6 Angstroms and 1.95 Angstroms resolution respectively. These crystal structures reveal that the first residue of this protease is important for sustaining the substrate-binding pocket and inhibitor binding. This study suggests that SARS-CoV M(pro) could serve as a new tag-cleavage endopeptidase for protein overproduction, and the WT SARS-CoV M(pro) is more appropriate for mechanistic characterization and inhibitor design.

Year:  2006        PMID: 17189639      PMCID: PMC7094453          DOI: 10.1016/j.jmb.2006.11.073

Source DB:  PubMed          Journal:  J Mol Biol        ISSN: 0022-2836            Impact factor:   5.469


Introduction

In the age of proteomics, the production of pure protein in a high-throughput manner is required for both structural and functional studies. Especially for protein structure studies, one of the main bottlenecks is to produce adequate quantities of soluble and properly folded recombinant proteins. Fusion protein expression systems have been widely used for this purpose in basic research and in industry. Fusion domains (or small “tags”) are expressed as partners of the passenger proteins and are generally removed after purification, as they might interfere with the function or other biochemical or biophysical characteristics of the protein. Hence, some endopeptidases, such as bovine thrombin, bovine Factor Xa, human rhinovirus 3C protease and tobacco etch virus (TEV) protease, are routinely used to remove fusion domains. Of these, the viral proteases, e.g. human rhinovirus 3C protease and TEV protease, have proven to be the most selective and useful to date. Therefore, the development of novel viral proteases with high specificity and activity will be helpful to proteomics research.
The SARS coronavirus (SARS-CoV) is the etiological agent responsible for the global outbreak of a life-threatening disease that caused approximately 800 deaths worldwide [7–11]. The coronavirus main protease (Mpro), which plays a key role in mediating viral replication and transcription, has been identified as the most attractive target for anti-SARS drug design [12–14]. In 2003, our group published the first crystal structure of SARS-CoV Mpro. Thereafter, several other groups published the structures of Mpro [15–17] and its complex with an aza-peptide epoxide inhibitor. The Mpro can form a homodimer both in the crystal and in solution. Each protomer consists of three domains: domains I and II resemble chymotrypsin, whereas domain III has a globular cluster of five, mostly antiparallel, α-helices.
SARS-CoV Mpro has a catalytic dyad consisting of His41 and Cys145, and its substrate-binding pocket is located in the cleft between domains I and II. The N-terminal residues 1–7 of domain I (or N-finger) of Mpro are considered to have an important role in its proteolytic activity [21–23], although their importance in dimerization has been reported with inconsistent results [21–23]. In particular, mutation of Arg4, which is involved in forming an ion pair, results in a fourfold decrease in activity. Deletion of the N-terminal residues 1–7 results in an almost completely inactive SARS-CoV Mpro. SARS-CoV Mpro has been characterized sufficiently and two particular aspects drew our attention. First, it is highly selective for substrate sequence. Second, it displays a high level of proteolytic activity, although inconsistent kinetic parameters have been reported for SARS-CoV Mpro, with kcat/Km ranging from 20 to 29,000 M−1 s−1 [25–27]. These properties indicate that SARS-CoV Mpro is a promising candidate as an endopeptidase to remove fusion tags. In this study, SARS-CoV Mpro was utilized to remove the fusion domain in overproduction of calbindin D28k (a member of the calmodulin superfamily) from a novel glutathione-S-transferase (GST) fusion protein expression system [28–31]. To obtain the peptidase with high catalytic efficiency, we developed a new method to produce wild-type (WT) SARS-CoV Mpro with authentic N and C termini, and investigated the activity difference among several SARS-CoV Mpro constructs. The crystal structures of WT Mpro and its complex with an inhibitor were solved to explain the mechanism of enhanced activity.

Results

Construction of a new GST fusion protein expression vector (pGSTM)

In order to take advantage of SARS-CoV Mpro as an endopeptidase, a new GST fusion protein expression vector (designated pGSTM) was constructed from another fusion vector pGEX-6p-1 (GE Healthcare). The resulting plasmid retains the whole multiple cloning sites of pGEX-6p-1. However, the linker between the GST gene and the gene of interest was replaced by nucleotides encoding a unique N-terminal autocleavage site of SARS-CoV Mpro consisting of 11 amino acid residues (see Figure 1(a)). Sequencing confirmed that the cleavage site (TSAVLQSGFRK) was inserted correctly.
Figure 1

A new GST fusion protein expression system. (a) The map of the pGSTM vector design. The cleavage site for SARS-CoV Mpro is labeled. (b) Expression and purification of calbindin D28k using the pGSTM expression system. Lane 1, total cell extract for calbindin D28k before induction; lane 2, total cell extract for calbindin D28k after induction overnight; lane 3, supernatant of the cell lysate; lanes 4 and 5, purified calbindin D28k; lane 6, protein molecular mass marker. (c) Schematic plot of WT SARS-CoV Mpro construct designed. (d) Expression and purification of WT Mpro . Lane 1, protein molecular mass marker; lane 2, total cell extract for WT Mpro before induction; lane 3, total cell extract for WT Mpro after induction overnight; lane 4, WT-GPH6 after affinity chromatography; lane 5, WT after cleavage by rhinovirus 3C protease; lane 6, GPLGS-WT; lane 7, GS-WT.

Expression and purification of calbindin D28k using the pGSTM expression system

Calbindin D28k, a member of the calmodulin superfamily, has been proposed to function as an important intracellular Ca2+-buffering protein. We took it as our protein of interest for structural studies. This protein was cloned, expressed and purified using our new protein expression system. After one-step purification, calbindin D28k protein, with five additional residues (SGFRK-) at the N terminus and an authentic C terminus, was analyzed by SDS-PAGE and the purity was shown to be >90% (Figure 1(b)). Around 70–80 mg of calbindin D28k was obtained from 1 l of bacterial culture. The crystallographic analysis of calbindin D28k is underway.

Preparation of WT SARS-CoV Mpro

In an earlier study, wild-type SARS-CoV Mpro with two additional residues (GS) at the N terminus (GS-WT) was used for the determination of kinetic parameters. To avoid any potential effects from the extraneous N-terminal residues on enzyme activity, we developed a new method to produce WT SARS-CoV Mpro. First, the four amino acids AVLQ, which correspond to the P4–P1 sites of the N-terminal autocleavage sequence of SARS-CoV Mpro, were introduced between the GST tag and the first residue of the protease. Thus, the authentic N terminus would become available by autocleavage during protein expression. Second, the eight amino acids GPHHHHHH (abbreviated as GPH6, where GP corresponds to the P1′ and P2′ sites of rhinovirus 3C protease; see Table 1) were added after the last glutamine residue at the C terminus (Figure 1(c)). This was intended to yield an authentic C terminus following cleavage by rhinovirus 3C protease. The approach may appear unreasonable, since SARS-CoV and rhinovirus belong to different families (Coronaviridae and Picornaviridae, respectively), but it was adopted for the following reasons.
Table 1

C-terminal autocleavage site for SARS-CoV Mpro and typical substrate sequence for human rhinovirus 3C protease

| Protease | P6 | P5 | P4 | P3 | P2 | P1 | P1′ | P2′ |
|---|---|---|---|---|---|---|---|---|
| SARS-CoV Mpro (C terminus) | S | G | V | T | F | Q | G | K |
| Human rhinovirus 3C | L | E | V | L | F | Q | G | P |

With an extra domain III, SARS-CoV Mpro differs from rhinovirus 3C protease in many aspects. However, they share several similarities in both cleavage sequence and the substrate-recognition pocket. From the native/complex structures of these two proteases published to date, the S4, S2, S1, S1′ and S2′ subsites are critical for substrate binding to the two proteases. Thus, the corresponding P4, P2, P1, P1′ and P2′ sites were taken into consideration when designing recognition sequences. Table 1 lists the C-terminal autocleavage site for SARS-CoV Mpro, and the substrate sequence for human rhinovirus 3C protease. It shows that these five sites are identical for the two proteases, with the exception of P2′. Hence, we reasoned that rhinovirus 3C protease could recognize the octapeptide substrate SGVTFQ↓GP, whose P1–P6 sites were derived from the C-terminal autoprocessing site of SARS-CoV Mpro. At the same time, replacing lysine with proline at the P2′ site would arrest autocleavage by SARS-CoV Mpro. Our group has recently solved the crystal structure of a mutant of SARS-CoV Mpro in complex with an 11 amino acid residue peptidyl substrate (unpublished results). The complex structure revealed that substitution of P2′ with Pro would result in steric hindrance between P2′ (Pro) and the main chain of SARS-CoV Mpro, hindering substrate binding. On the basis of the above analysis, we hypothesized that rhinovirus 3C protease but not SARS-CoV Mpro could recognize the octapeptide substrate, SGVTFQ↓GP. SDS-PAGE analysis (see Figure 1(d)) showed that after Ni-affinity chromatography, the SARS-CoV Mpro is in the WT-GPH6 form, implying that the N-terminal GST tag was removed efficiently by autocleavage during protein expression in Escherichia coli.
The difference between lane 4 and lane 5 indicates that the GPH6 tag was indeed removed by rhinovirus 3C protease cleavage, confirming that the modified substrate of SARS-CoV Mpro could be recognized by rhinovirus 3C protease. Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) analysis indicated that WT-GPH6 and WT Mpros were in accord with their predicted molecular mass (see Table 2).
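The site-by-site comparison behind Table 1 can be reproduced with a few lines of Python. This is only an illustrative sketch: the names `differing_sites` and `KEY_SITES` are introduced here, and a plain apostrophe stands in for the prime symbol in the site labels.

```python
# Compare the two cleavage sequences from Table 1 position by position.
POSITIONS = ["P6", "P5", "P4", "P3", "P2", "P1", "P1'", "P2'"]
SARS_MPRO_C_TERM = "SGVTFQGK"   # SARS-CoV Mpro C-terminal autocleavage site
HRV_3C = "LEVLFQGP"             # typical human rhinovirus 3C substrate

# Sites whose subsites (S4, S2, S1, S1', S2') the text treats as critical.
KEY_SITES = ("P4", "P2", "P1", "P1'", "P2'")


def differing_sites(seq_a, seq_b, sites=POSITIONS):
    """Return the labels of the requested sites where the two sequences differ."""
    by_label = {label: (a, b) for label, a, b in zip(POSITIONS, seq_a, seq_b)}
    return [label for label in sites if by_label[label][0] != by_label[label][1]]


if __name__ == "__main__":
    print("all differences:", differing_sites(SARS_MPRO_C_TERM, HRV_3C))
    print("key-site differences:",
          differing_sites(SARS_MPRO_C_TERM, HRV_3C, KEY_SITES))
```

Restricting the comparison to the key sites leaves P2′ as the only mismatch, which is the position changed from Lys to Pro in the GPH6 construct.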
Table 2

Molecular mass of WT-GPH6 and WT measured by MALDI-TOF MS

| SARS-CoV Mpro | Predicted (Da) | Measured (Da) |
|---|---|---|
| WT-GPH6 | 34,822 | 34,819 |
| WT | 33,846 | 33,817 |
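As a rough consistency check on Table 2, the difference between the two predicted masses should match the mass of the eight-residue GPH6 tag. The sketch below assumes standard average residue masses (textbook values, not taken from the paper); the helper names are hypothetical.

```python
# Average residue masses in Da (standard textbook values, not from the paper).
AVG_RESIDUE_MASS = {"G": 57.0519, "P": 97.1167, "H": 137.1411}

PREDICTED = {"WT-GPH6": 34822.0, "WT": 33846.0}   # Table 2, Da
MEASURED = {"WT-GPH6": 34819.0, "WT": 33817.0}    # Table 2, Da


def tag_mass(tag):
    """Sum of average residue masses; no water is added because the tag
    extends an existing chain rather than forming a free peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in tag)


def ppm_error(measured, predicted):
    """Relative deviation of the measured mass in parts per million."""
    return (measured - predicted) / predicted * 1e6


if __name__ == "__main__":
    print("GPH6 tag mass (Da):", round(tag_mass("GPHHHHHH"), 1))
    print("predicted difference (Da):", PREDICTED["WT-GPH6"] - PREDICTED["WT"])
    for name in PREDICTED:
        print(name, "ppm error:", round(ppm_error(MEASURED[name], PREDICTED[name])))
```

Under these assumptions the tag mass and the predicted mass difference agree to within about 1 Da.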

Enzyme activity assay of four types of SARS-CoV Mpros

To ascertain whether the additional residues would interfere with the catalytic activity of SARS-CoV Mpro, we determined the kinetic parameters for GPLGS-WT (with five additional residues (GPLGS) at the N terminus of WT), GS-WT, WT-GPH6 and WT SARS-CoV Mpros (see Table 3). The catalytic efficiency of an enzyme is best defined by kcat/Km. Table 3 shows that the WT protease has the highest cleavage efficiency (kcat/Km = 26,500 M−1 s−1) while GPLGS-WT has the lowest (kcat/Km = 167 M−1 s−1). The activity of the WT protease with authentic N and C termini was more than 150-fold greater than that of GPLGS-WT and 20-fold greater than that of GS-WT. Furthermore, our results suggest that increasing the number of additional residues at the N terminus would result in a greater decrease in activity. However, the activity of WT-GPH6 was about one-third of that of the WT enzyme, which suggests that additional residues at the C terminus had less effect on activity.
Table 3

Comparison of enzyme activities of four types of SARS-CoV Mpro

| SARS-CoV Mpro | Km (μM) | kcat (s−1) | kcat/Km (M−1 s−1) |
|---|---|---|---|
| GPLGS-WT | 126 ± 8 | 0.021 ± 0.001 | 167 |
| GS-WT [12] | 129 ± 7 | 0.14 ± 0.01 | 1100 |
| WT-GPH6 | 61.0 ± 2.9 | 0.41 ± 0.02 | 6800 |
| WT | 40.0 ± 0.8 | 1.06 ± 0.04 | 26,500 |
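The catalytic efficiencies in Table 3 follow directly from the listed Km and kcat values once Km is converted from micromolar to molar. A minimal sketch, using only the central values from the table (the function names are introduced here for illustration):

```python
# Km in micromolar and kcat in s^-1, as listed in Table 3 (central values only).
TABLE_3 = {
    "GPLGS-WT": (126.0, 0.021),
    "GS-WT": (129.0, 0.14),
    "WT-GPH6": (61.0, 0.41),
    "WT": (40.0, 1.06),
}


def catalytic_efficiency(km_micromolar, kcat_per_s):
    """Return kcat/Km in M^-1 s^-1 (Km is converted from uM to M)."""
    return kcat_per_s / (km_micromolar * 1e-6)


def fold_over(reference, other):
    """How many times more efficient `reference` is than `other`."""
    return (catalytic_efficiency(*TABLE_3[reference])
            / catalytic_efficiency(*TABLE_3[other]))


if __name__ == "__main__":
    for name, (km, kcat) in TABLE_3.items():
        print(f"{name}: {catalytic_efficiency(km, kcat):.0f} M^-1 s^-1")
    print("WT vs GPLGS-WT:", round(fold_over("WT", "GPLGS-WT")))
    print("WT vs GS-WT:", round(fold_over("WT", "GS-WT")))
```

The recomputed ratios are consistent with the "more than 150-fold" and "20-fold" comparisons quoted in the text.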

Crystal structure of WT SARS-CoV Mpro

In order to shed light on the difference in activity, we determined the crystal structure of the WT SARS-CoV Mpro to 1.6 Å resolution. In our published crystal structure of SARS-CoV Mpro (in the GPLGS-WT form), the crystal belongs to the space group P2₁ and the asymmetric unit contains a dimer (with the two protomers designated A and B). Within the dimer, protomer A was in the active form, while protomer B showed an inactive form, resulting from a partially collapsed S1 subsite. In contrast, the crystal of WT SARS-CoV Mpro belongs to space group C2, and each asymmetric unit contains only one protomer of a typical dimer (with the two protomers designated A* and B*). The two protomers in the dimer are related by a crystallographic 2-fold symmetry axis, and each has a catalytically competent conformation. All residues of the protomer (residues 1–306) were identified from electron density maps. Apart from differences in the S1 subsite, the SARS-CoV Mpro protomer as seen in the new structure is very similar to each member of the dimer in the original structure (see Figure 2). Compared to the latter structure, the protomer in the new crystal form displays an overall rmsd for Cα atoms of 0.6 Å for protomer A and 0.7 Å for protomer B. In the following discussion, we consider the differences between the substrate-binding sites of the two structures in further detail.
Figure 2

Superposition of WT and GPLGS-WT Mpros. The substrate-binding pocket of one protomer is in surface representation. GPLGS-WT is in blue, WT is in magenta.

The S1 site of SARS-CoV Mpro, which has absolute specificity for Gln in the P1 site, consists of the side-chains of His163 and Phe140, and the main-chain atoms of GluA166, Asn142, Gly143 and HisA172. In the S1 site of protomer A in our original structure (GPLGS-WT), the NH group of SerB1 was unable to donate hydrogen bonds simultaneously to the carboxylate group of GluA166 and the main-chain carbonyl group of PheA140, due to the presence of additional residues at the N terminus, although the distance is suitable for hydrogen bond formation. However, in the newly solved WT structure, the amino group (NH2) of SerB*1 in one protomer donates a 3.0 Å hydrogen bond to the carboxylate group of GluA*166 (Figure 3(a)) and a 2.7 Å hydrogen bond to the main-chain carbonyl group of PheA*140, thus stabilizing the S1 pocket. This induces a series of conformational changes. For instance, the NH of GlyA*143, which participates directly in the formation of the oxyanion hole, moves by 0.8 Å towards the active site; the main chain of residues 142–143 moves towards the S1 subsite; and the side-chain of AsnA*142 flips over with a 6 Å shift. Stabilized by SerB*1, protomer A* displays a more catalytically competent conformation in the S1 subsite than protomer A in our original structure. In our original structure, the S1 pocket of protomer B is partly collapsed compared with the WT protomer: no electron density was visible for residues A1 and A2; GluB166 reorientates to interact with the possibly protonated HisB163; PheB140 undergoes a dramatic conformational change, with the phenyl ring moving by as much as 10 Å; and GlyB143 moves about 3 Å towards the active site, leaving no space to accommodate a tetrahedral reaction intermediate (Figure 3(b)).
These structural variations account for the higher activity of the WT protease compared with the GPLGS-WT and GS-WT proteases. We observed also that the GPLGS-WT protease has lower activity than the GS-WT protease. This might result from the additional flexible residues at the N terminus, which are located close to the active site and would hinder substrate binding.
Figure 3

Superposition of the S1 pockets of GPLGS-WT and WT SARS-CoV Mpro (in stereo). (a) Superposition of the S1 pockets in protomer A of GPLGS-WT and that of protomer A* of WT SARS-CoV Mpro. Protomer A* of WT is in blue; protomer A of GPLGS-WT is in yellow; protomer B* of WT is in magenta; protomer B of GPLGS-WT is in red. In the WT structure, the amino group (NH2) of Ser1 in protomer B* donates a 3.0 Å hydrogen bond to the carboxylate group of Glu166 and a 2.7 Å hydrogen bond to the main-chain carbonyl group of Phe140 in protomer A*, stabilizing the S1 pocket. The NH of Gly143 moves 0.8 Å toward the active site; the main chain of residues 142–143 moves toward the S1 subsite; the side-chain of Asn-A*142 flips over with a 6 Å shift compared with protomer A of GPLGS-WT. (b) Superposition of the S1 pockets in protomer B of GPLGS-WT and that of protomer A* of WT SARS-CoV Mpro. Protomer A* of WT is in blue; protomer B of GPLGS-WT is in yellow; protomer B* of WT is in magenta; protomer A of GPLGS-WT is in red. The S1 pocket of protomer B collapses partly with reorientation of Glu166 and residues 140–143. No electron density was visible for residues A1 and A2.

In contrast to the N terminus, the C terminus of the WT protomer (residues 301–306) is located far from the substrate-binding pocket of its partner protomer (∼10 Å). Therefore, the additional residues (GPH6) at the C terminus are expected to have less effect on enzyme activity (see Table 3).

Inhibition assay of SARS-CoV Mpro

In our previous study, we designed an irreversible anti-coronavirus inhibitor (designated as N3, Figure 4(a)) consisting of an α,β-unsaturated ester (one type of Michael acceptor) incorporated with a peptidyl portion. The evaluation of this series of time-dependent inhibitors requires a pseudo second-order rate constant (k3/Ki). Ki and k3 represent the equilibrium binding constant and inactivation rate constant for covalent bond formation, respectively. We assayed the inhibition of N3 against different types of Mpros to determine whether or not the additional residues would affect inhibitor binding. In our preliminary inhibition assays, we observed that N3 could completely inactivate WT and WT-GPH6 proteases after its preincubation with the proteases (fivefold molar excess of the enzyme) for 5 min, but not GPLGS-WT and GS-WT proteases (see Supplementary Data Figure S2). The kinetic parameters listed in Table 4 show that the second-order rate constant of N3 against WT protease (k3/Ki = 18,800 M−1 s−1) is approximately equal to that against WT-GPH6 (k3/Ki = 15,400 M−1 s−1). However, the second-order rate constant of N3 against GS-WT (k3/Ki = 340 M−1 s−1) is decreased by greater than 50-fold. This difference implies that the first residue of SARS-CoV Mpro also plays an important role in inhibitor binding.
Figure 4

Differences between the complex structures of WT and GPLGS-WT. (a) Inhibitor N3. (b) Superposition of the substrate-binding pockets in protomer A of GPLGS-WT and that in protomer A* of WT. In the WT-N3 complex structure, the NH2 group of Ser1 in protomer B* was still hydrogen-bonded to the carboxylate group of Glu166 and the carbonyl group of Phe140 in protomer A*, stabilizing the S1 pocket. In the GPLGS-WT-N3 complex structure, however, the two hydrogen bonds described above were not found. Instead, an ordered water molecule was observed in the S1 pocket. Protomer A* of WT is in blue; protomer A of GPLGS-WT is in yellow; inhibitor N3 (complexed with WT) is in magenta; inhibitor N3 (complexed with GPLGS-WT) is in red; protomer B* of WT is in green; protomer B of GPLGS-WT is in cyan.

Table 4

Enzyme inhibition data of inhibitor N3 against four types of SARS-CoV Mpro

| Protease | Ki (μM) | k3 (s−1) | k3/Ki (M−1 s−1) |
|---|---|---|---|
| GPLGS-WT | N/A | N/A | N/A |
| GS-WT | 9.0 ± 0.8 | 0.0031 ± 0.0005 | 340 ± 27 |
| WT-GPH6 | 2.3 ± 0.1 | 0.034 ± 0.001 | 15,400 ± 1,200 |
| WT | 1.9 ± 0.1 | 0.035 ± 0.002 | 18,800 ± 1,800 |
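To see how the Table 4 constants translate into time-dependent inactivation, the sketch below assumes the usual two-step scheme for irreversible inhibitors, with the inhibitor in large excess so that inactivation is pseudo-first-order with k_obs = k3[I]/(Ki + [I]). That model, the 5 μM inhibitor concentration and the function names are assumptions made for illustration; the paper's assay conditions are not reproduced here.

```python
import math

# Ki in micromolar and k3 in s^-1, as listed in Table 4 (central values only).
TABLE_4 = {
    "GS-WT": (9.0, 0.0031),
    "WT-GPH6": (2.3, 0.034),
    "WT": (1.9, 0.035),
}


def second_order_rate(ki_micromolar, k3_per_s):
    """k3/Ki in M^-1 s^-1 (Ki is converted from uM to M)."""
    return k3_per_s / (ki_micromolar * 1e-6)


def residual_activity(ki_micromolar, k3_per_s, inhibitor_micromolar, seconds):
    """Fraction of enzyme still active, assuming pseudo-first-order
    inactivation with k_obs = k3 * [I] / (Ki + [I])."""
    k_obs = k3_per_s * inhibitor_micromolar / (ki_micromolar + inhibitor_micromolar)
    return math.exp(-k_obs * seconds)


if __name__ == "__main__":
    inhibitor_um = 5.0   # hypothetical concentration, not from the paper
    for name, (ki, k3) in TABLE_4.items():
        left = residual_activity(ki, k3, inhibitor_um, 300.0)
        print(f"{name}: k3/Ki = {second_order_rate(ki, k3):.0f} M^-1 s^-1, "
              f"{left:.1%} active after 5 min")
```

Under these assumed conditions the WT enzyme is almost fully inactivated within 5 min while most of the GS-WT enzyme remains active, in line with the qualitative observation from the preliminary assays.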

The crystal structure of WT SARS-CoV Mpro in complex with a Michael acceptor

We determined the crystal structure of WT SARS-CoV Mpro in complex with inhibitor N3 to 1.9 Å resolution. The substantial difference between the structures of the WT complex and the GPLGS-WT complex reported previously still lies in the S1 subsite. In the GPLGS-WT complex structure, N3 binds to protomers A and B of SARS-CoV Mpro in an identical and normal manner, thus we discuss only protomer A and A*. In protomer A of GPLGS-WT Mpro complexed with N3, the NH group of SerB1 was 3.4 Å from the carboxylate group of GluA166 and 4.7 Å from the main-chain carbonyl group of PheA140, both of which are beyond the distance for formation of a hydrogen bond. However, an ordered water molecule is situated at the bottom of the S1 subsite, connecting N3 and the protease. This water donates two hydrogen bonds, to the carboxylate group of GluA166 and the main-chain carbonyl group of PheA140, and accepts two hydrogen bonds from the lactam and the side-chain of HisA172. Although the water molecule helps to stabilize the inhibitor binding in the S1 pocket, it occupies part of the space of the S1 subsite. Due to steric hindrance, the lactam was not able to insert further into the S1 subsite. In protomer A* of the WT Mpro complexed with N3, no water molecule was found at the bottom of the S1 pocket. As a consequence, the carboxylate group of GluA*166 moves 1.7 Å upwards to form a 2.8 Å hydrogen bond with the NH of the lactam. The competent conformation of the S1 subsite is still maintained via the interaction of NH2 SerB*1 with the carboxylate group of GluA*166 (Figure 4(b)) and the main-chain carbonyl group of PheA*140. These structural data demonstrate that the first Ser1 residue is important also for inhibitor binding, and accounts for the more potent inhibition of N3 against the WT than the GPLGS-WT and GS-WT proteases.

Discussion

Although SARS-CoV is notorious for causing a lethal disease in humans, some positive elements may result from this life-threatening virus. In this study, SARS-CoV Mpro was engineered to serve as a novel endopeptidase to remove fusion tags in recombinant protein overproduction. Table 5 shows the advantages of SARS-CoV Mpro compared with other routinely used proteases in methods for production, substrate specificity and cleavage efficiency. Human thrombin and bovine factor Xa are both extracted from plasma, which requires more complicated procedures, although they possess a high level of cleavage efficiency. In addition, they are not as selective for substrate as viral proteases. WT SARS-CoV Mpro, which can be suitably overexpressed in E. coli, is highly specific for substrate. Furthermore, it has superior activity (kcat/Km = 26,500 M−1 s−1) to rhinovirus and TEV proteases. Therefore, SARS-CoV Mpro is a suitable candidate for site-specific cleavage in fusion protein expression systems. The cleavage efficiency of WT-GPH6 is still high, despite additional residues at the C terminus. Highly purified Mpro in this form could be obtained by a simple one-step purification, as shown by the overloaded SDS-PAGE gel (Figure 1(d)). Owing to the six additional histidine residues at the C terminus, it can be readily separated from the protein of interest by affinity chromatography after removal of the fusion tags. These advantages suggest this form of SARS-CoV Mpro could have important industrial applications.
Table 5

Comparison of the different proteases

| Proteases | Methods for production | Substrate specificity | Cleavage efficiency kcat/Km (M−1 s−1) | References |
|---|---|---|---|---|
| Human thrombin | Extraction from plasma | -XLVPR↓GSX- | 94,600 | 39–41 |
| Bovine Factor Xa | Extraction from plasma | -XIEGR↓X- | 39,000 | 42 |
| Human rhinovirus 3C protease | Recombinant expression | -XLEVLFQ↓GPX- | 920 | 3, 4 |
| TEV protease | Recombinant expression | -XENLYFQ↓G(S)X- | 2620 | 5, 6 |
| SARS-CoV Mpro | Recombinant expression | -XTSAVLQ↓SGFRKX- | 26,500 | |

One disadvantage of recombinant protein expression is that it commonly produces additional amino acid residues at the termini of the wild-type protein. In SARS-CoV Mpro studies, several research groups have reported inconsistent results for the kinetic parameters of this protease, with kcat/Km ranging from 20 to 29,000 M−1 s−1 [25–27]. The slight differences in methods (HPLC and FRET-based methods) and substrates used would not account entirely for this phenomenon. The first published crystal structure of SARS-CoV Mpro (with additional residues at the N terminus) by our group provided some clues that the first residue might play an important role in substrate binding. In order to clarify this point, we designed a new strategy to produce the WT SARS-CoV Mpro. We created a GST fusion product with a tag that can be removed via the autocleavage mechanism of this enzyme for three reasons: (1) autoprocessing will not produce additional residues at the N terminus; (2) SARS-CoV Mpro has been reported to have highly efficient expression in GST fusion systems; and (3) this method could be used to characterize the autocleavage efficiency of SARS-CoV Mpro in vitro. As for point (3), the affinity GST tag was a mimic of the transmembrane domain upstream of SARS-CoV Mpro in polyprotein 1a and 1ab, representing an autocleavage model of the Mpro in vitro. SDS-PAGE analysis shows that the GST tag was removed entirely from the N terminus through autoprocessing during expression of SARS-CoV Mpro in E. coli, exhibiting its high efficiency for autocleavage in vitro. In previous studies, it was hypothesized that the autocleavage of Mpro may occur in proximity to the membrane. Our data suggest that completion of autocleavage might be achieved in the cytoplasm, although the precise whereabouts of the assembly of replicase complex components remains to be identified. Separate reports by Hsu and Lin, in which autocleavage was used to remove tags such as thioredoxin, further support our hypothesis.
Rhinoviruses and coronaviruses belong to the Picornaviridae and the Coronaviridae, respectively. No previous report has shown that any CoV Mpro could efficiently process the substrate of picornavirus 3C proteases, or vice versa. In our study, it is interesting to observe that a single substitution at the P2′ site was sufficient to convert the SARS-CoV main protease substrate into that of rhinovirus 3C protease.
In our study, we demonstrated that the first residue, Ser1, at the N terminus of SARS-CoV Mpro is important for its activity and inhibitor binding. The critical interactions involved in stabilizing the substrate-binding pocket are two hydrogen bonds formed by the free amino group of the first residue of one protomer with two residues, Phe140 and Glu166, comprising the S1 subsite of its partner protomer within the dimer structure. The stabilizing effect of the free amino group is obvious. In our published GPLGS-WT structure, the S1 subsite of protomer B is partially collapsed without stabilization by the first residue of protomer A, suggesting that the S1 pocket would be less stable due to micro-environmental changes. During the crystallization step of the structure determination procedure, a large quantity of WT SARS-CoV Mpro crystals could be produced overnight, and these readily diffracted to very high resolution under the conditions described previously. In contrast, it usually took five to ten days to produce a very limited amount of GPLGS-WT SARS-CoV Mpro or GS-WT SARS-CoV Mpro crystals suitable for diffraction, albeit with comparatively lower resolution. As a consequence, the WT SARS-CoV Mpro reported here will be more suitable for activity assays, inhibitor screening and crystallization. It should accelerate development of anti-coronavirus inhibitors through a structure-assisted approach to drug design.

Materials and Methods

Construction of the pGSTM expression vector

The 654–965 region of the pGEX-6p-1 vector was cloned into the pMD18-T vector with two primers: forward, 5′-TTCGAAGATCGTTTATGTCATAAA-3′; reverse, 5′-GGATCCTTTCCTAAAACCACTCTGCAGAACTGCACTAGTATCCGATTTTGGAGGATG-3′. The 33 nucleotides encoding the specific 11 amino acids TSAVLQSGFRK recognized by SARS-CoV Mpro were introduced by the reverse primer. The recombinant pMD18-T plasmid was double-digested with BstBI and BamHI. The segment of interest was recombined into the pGEX-6p-1 vector, which had also been double-digested with BstBI and BamHI, to construct the new protein expression vector pGSTM.
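The claim that the reverse primer encodes TSAVLQSGFRK can be checked by translating its reverse complement. The following is a minimal sketch, assuming the standard genetic code and that the first base of the reverse complement starts a codon in the GST reading frame; the helper names are introduced here for illustration.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
CODON_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
BASES = "TCAG"

REVERSE_PRIMER = "GGATCCTTTCCTAAAACCACTCTGCAGAACTGCACTAGTATCCGATTTTGGAGGATG"
CLEAVAGE_SITE = "TSAVLQSGFRK"


def reverse_complement(dna):
    """Complement each base, then reverse the strand."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons starting from the first base (frame 0)."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        a, b, c = (BASES.index(base) for base in dna[i:i + 3])
        protein.append(CODON_AA[16 * a + 4 * b + c])
    return "".join(protein)


if __name__ == "__main__":
    coding_strand = reverse_complement(REVERSE_PRIMER)
    peptide = translate(coding_strand)
    print(coding_strand)
    print(peptide, "| contains cleavage site:", CLEAVAGE_SITE in peptide)
```

In this frame the 11-residue cleavage site is followed by Gly-Ser, encoded by the BamHI site (GGATCC) at the 5′ end of the primer.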

Cloning, expression and purification of calbindin D28k by the pGSTM protein expression system

Construction of recombinant plasmid

The gene encoding calbindin D28k was amplified by the polymerase chain reaction (PCR) using the primers: forward, 5′-GGATCCATGGCAGAATCCCACCTG-3′; reverse, 5′-CCGCTCGAGCTAGTTATCCCCAGCACA-3′. The PCR products were ligated into pMD18-T vectors with bacteriophage T4 DNA ligase. After digestion with BamHI and XhoI, the gene of interest was inserted between the BamHI and XhoI sites of the pGSTM vector.

Expression and purification

The resulting recombinant plasmid was transformed into the E. coli strain BL21 (DE3). The cells were cultured in LB medium containing 0.1 mg ml−1 ampicillin. When the absorbance at 600 nm (A600) reached 0.6, IPTG was added to 0.5 mM and the cell culture was incubated at 16 °C for 10 h. After harvesting by centrifugation at 4600g (Beckman JLA-10-5), the pellet was resuspended in PBS (140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4, pH 7.3) and sonicated on ice. The lysate was centrifuged at 27,000g (Beckman JA-25-50) for 30 min and the precipitate was discarded. The supernatant was loaded onto 2 ml GST-glutathione affinity columns (Pharmacia) equilibrated with PBS, and the columns were washed with 30 column volumes of PBS. Then 0.1 mg of SARS-CoV Mpro (GS-WT) was added to the column and incubated at 4 °C for 12 h to remove the GST tag. The protein of interest was collected and analyzed by SDS-PAGE.

Expression and purification of SARS-CoV Mpro

Expression and purification of SARS-CoV Mpro with different numbers of additional residues (5, 2, 0) at the N terminus are described below.

SARS-CoV Mpro (with five additional residues, GPLGS, at the N terminus)

Expression and purification of GPLGS-WT SARS-CoV Mpro has been reported. Briefly, the coding sequence of the SARS-CoV Mpro was inserted into the BamHI and XhoI sites of pGEX-6p-1 plasmid DNA (GE Healthcare). The resulting plasmid was used to transform E. coli BL21 (DE3) cells. The GST fusion protein, GST-SARS-CoV Mpro, was purified by GST-glutathione affinity chromatography, cleaved with GST rhinovirus 3C protease, and the recombinant SARS-CoV Mpro was further purified by anion-exchange chromatography.

SARS-CoV Mpro (with two additional residues, GS, at the N terminus)

Expression and purification of GS-WT SARS-CoV Mpro has been reported. Briefly, the coding sequence was inserted into the BamHI and XhoI sites of the pGEX-4T-1 vector (GE Healthcare). The subsequent procedure was similar to that used for the expression and purification of SARS-CoV Mpro from the pGEX-6p-1 plasmid, except that the GST fusion protein was cleaved with thrombin.

WT SARS-CoV Mpro (without additional residues at the termini)

The coding sequence for SARS-CoV Mpro was amplified by polymerase chain reaction (PCR) using the primers: forward, 5′-CGGGATCCGCGGTACTGCAGAGTGGTTTCAGGAAAATGGCA-3′; reverse, 5′-CCGCTCGAGTTAGTGGTGGTGGTGGTGGTGGGGTCCTTGGAAGGTAACTCC-3′. The 12 nucleotides coding for the four amino acids AVLQ (corresponding to the P4–P1 positions of the autocleavage site at the N terminus of SARS-CoV Mpro; nomenclature for the substrate amino acid residues is Pn, …, P2, P1, P1′, P2′, …, Pn′, where P1–P1′ denotes the hydrolyzed bond, while Sn, …, S2, S1, S1′, S2′, …, Sn′ denote the corresponding enzyme binding sites) were added before the first residue, Ser1. The 24 nucleotides coding for the eight amino acids GPH6 were added at the C terminus by the reverse primer. The PCR products were inserted into the BamHI and XhoI sites of the pGEX-6p-1 plasmid (GE Healthcare). The resulting plasmid was then used to transform E. coli BL21 (DE3) cells. The sequence of the insert was verified by dideoxynucleotide sequencing. Positive clones harboring the recombinant plasmid were grown to an A600 of 0.6 at 37 °C with shaking in LB medium containing 0.1 mg ml−1 ampicillin. Expression of the GST fusion protein was induced by adding IPTG to 0.5 mM, and incubation was continued at 16 °C for 10 h. Cells were then harvested by centrifugation at 4600g (Beckman JLA-10-5), resuspended in lysis buffer (20 mM TrisHCl (pH 8.0), 300 mM NaCl) and sonicated on ice. The lysate was centrifuged at 27,000g (Beckman JA-25-50) for 30 min and the supernatant was collected. The His-tagged protein was purified by Ni-NTA affinity chromatography and concentrated in PreScission Cleavage Buffer (50 mM TrisHCl (pH 7.0), 150 mM NaCl, 1 mM EDTA, 1 mM dithiothreitol). Then 50 μl of 3 mg ml−1 human rhinovirus 3C protease was added to 1 ml of the above 10 mg ml−1 WT-GPH6 solution to cleave the C-terminal His tag, producing a SARS-CoV Mpro with an authentic C terminus. The WT SARS-CoV Mpro was further purified using anion-exchange chromatography.
The protein samples in each step were prepared for SDS-PAGE analysis. WT-GPH6 and WT were analyzed by MALDI-TOF MS. The purified and concentrated WT SARS-CoV Mpro (10 mg ml−1) was stored in 50 mM TrisHCl (pH 7.3), 1 mM EDTA at −80 °C for enzyme activity assays and crystallization.
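The Pn/Pn′ nomenclature used above can be made concrete with a short script that assigns a position label to each residue of the 11-residue recognition sequence TSAVLQSGFRK. This is an illustrative sketch only. The helper name label_sites is hypothetical. The position of the hydrolyzed bond (after Gln, before Ser) follows from the text, where AVLQ occupies P4–P1 immediately before Ser1.

```python
def label_sites(peptide, n_before_bond):
    """Pair each residue with its Pn / Pn' label.

    n_before_bond is the number of residues on the N-terminal side of the
    hydrolyzed bond; the residue just before the bond is P1 and the residue
    just after it is P1'.
    """
    labels = []
    for idx, residue in enumerate(peptide):
        if idx < n_before_bond:
            labels.append(("P%d" % (n_before_bond - idx), residue))
        else:
            labels.append(("P%d'" % (idx - n_before_bond + 1), residue))
    return labels


# TSAVLQ | SGFRK: six residues precede the hydrolyzed Gln-Ser bond.
sites = label_sites("TSAVLQSGFRK", 6)
for label, residue in sites:
    print(label, residue)
```

With this labeling, Gln is P1, Ser is P1′ and Gly is P2′, the position whose substitution is discussed above in connection with rhinovirus 3C protease.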

Enzymatic activity and inhibition assays

Enzyme activity assays of GS-WT SARS-CoV Mpro have been described. Activity assays of GPLGS-WT, WT-GPH6 and WT SARS-CoV Mpro followed a similar protocol. Briefly, the substrate of the N and C-terminal authentic SARS-CoV Mpro was the fluorogenic compound MCA-AVLQSGFR-Lys(Dnp)-Lys-NH2 (greater than 95% purity, GL Biochem Shanghai Ltd, Shanghai, China). The excitation and emission wavelengths of the fluorogenic substrate were 320 nm and 405 nm, respectively. A buffer consisting of 50 mM TrisHCl (pH 7.3), 1 mM EDTA was used for enzyme activity assays at a temperature of 30 °C. The reaction was initiated by adding protease (final concentration of 0.2 μM for WT and WT-GPH6, 2 μM for GPLGS-WT) to a solution containing different final concentrations of the substrate (3.2–40 μM for WT and WT-GPH6, and 6.4–80 μM for GPLGS-WT). The kinetic constants Km and kcat were obtained from a double-reciprocal plot (Supplementary Data Figure S1). Strict kinetic parameters were determined for the inhibition assay.
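For readers unfamiliar with the double-reciprocal (Lineweaver–Burk) analysis, the sketch below shows how Km and kcat can be recovered from a straight-line fit of 1/v against 1/[S], using 1/v = (Km/Vmax)(1/[S]) + 1/Vmax and kcat = Vmax/[E]. It is not the authors' analysis script. The substrate range and the 0.2 μM enzyme concentration echo the assay conditions above, but the Km and kcat values used to generate the rates are hypothetical placeholders, not measured results.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def lineweaver_burk(substrate_uM, rates_uM_per_s, enzyme_uM):
    """Estimate (Km, kcat) from a double-reciprocal fit."""
    inv_s = [1.0 / s for s in substrate_uM]
    inv_v = [1.0 / v for v in rates_uM_per_s]
    slope, intercept = linear_fit(inv_s, inv_v)
    vmax = 1.0 / intercept          # intercept = 1 / Vmax
    km = slope * vmax               # slope = Km / Vmax
    kcat = vmax / enzyme_uM         # kcat = Vmax / [E]
    return km, kcat


# Hypothetical, noise-free data generated from assumed constants.
ASSUMED_KM_UM = 30.0       # placeholder, not a value from the paper
ASSUMED_KCAT_PER_S = 1.5   # placeholder, not a value from the paper
ENZYME_UM = 0.2            # final protease concentration used for WT
substrate = [3.2, 6.4, 10.0, 20.0, 40.0]
rates = [ASSUMED_KCAT_PER_S * ENZYME_UM * s / (ASSUMED_KM_UM + s)
         for s in substrate]

km, kcat = lineweaver_burk(substrate, rates, ENZYME_UM)
print("Km = %.2f uM, kcat = %.2f s^-1" % (km, kcat))
```

With real, noisy data the double-reciprocal transform weights low-substrate points heavily, so a direct non-linear fit of the Michaelis–Menten equation is a common cross-check.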

Crystallization, data collection and structure determination

Crystallization of WT SARS-CoV Mpro was carried out as described. The preparation of the co-crystals of SARS-CoV Mpro in complex with the inhibitor N3 has been reported. A set of WT SARS-CoV Mpro data was collected from a single crystal on beamline BL19-ID of the Advanced Photon Source (APS), Argonne National Lab at a wavelength of 1.00 Å. Data for the SARS-CoV Mpro complex were collected at 100 K in-house on a Rigaku CuKα rotating-anode X-ray generator (MM007) at 40 kV and 20 mA (1.5418 Å) with a Rigaku R-AXIS IV++ image-plate detector. Data were processed, integrated, scaled and merged using HKL2000. The methods used for structure determination were as described. Briefly, the structures were determined by molecular replacement from our native structure of SARS-CoV Mpro (pH 7.6) (PDB ID 1UK3). Data collection and structure refinement statistics are summarized in Table 6.
Table 6

Data collection and refinement statistics

                                  WT SARS-CoV Mpro          WT SARS-CoV Mpro: N3
A. Data collection statistics
Wavelength (Å)                    1.0000                    1.5418
Resolution limit (Å)              50.0–1.50 (1.55–1.50)     50.0–1.95 (2.02–1.95)
Space group                       C2                        C2
Cell parameters
  a (Å)                           108.4                     108.6
  b (Å)                           81.8                      81.2
  c (Å)                           53.6                      53.3
  β (deg.)                        104.7                     104.5
Total reflections                 219,294                   148,615
Unique reflections                63,241                    32,068
Completeness (%)                  87.7 (40.6)               97.9 (94.6)
Redundancy                        3.5 (2.5)                 4.7 (3.7)
Rmerge (a)                        0.038 (0.286)             0.042 (0.277)
σ cutoff                          0                         0
I/σ(I)                            30.4 (2.0)                32.4 (4.2)

B. Refinement statistics
Resolution range (Å)              50.0–1.6                  50.0–1.95
Rwork (b) (%)                     20.1                      20.2
Rfree (%)                         21.4                      22.1
rmsd from ideal geometry
  Bond lengths (Å)                0.009                     0.015
  Bond angles (deg.)              1.59                      1.83
Average B factor (Å2)
  Chain A                         26.1                      37.2
  Solvent                         41.5                      49.5
Ramachandran plot (c)
  Favored (%)                     90.9                      91.7
  Allowed (%)                     7.9                       7.2
  Generously allowed (%)          0.8                       0.8
  Disallowed (%)                  0.4                       0.4

(a) Rmerge = ∑|Ii − ⟨I⟩|/∑|Ii|, where Ii is the intensity of an individual reflection i, and ⟨I⟩ is the average intensity of that reflection.

(b) Rwork = ∑||Fp| − |Fc||/∑|Fp|, where Fc is the calculated and Fp is the observed structure factor amplitude.

(c) Ramachandran plots were generated with the program PROCHECK.
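As a worked illustration of the Rmerge definition in the table footnote, the sketch below computes it for a toy set of repeated intensity measurements. The intensities and Miller indices are invented for illustration and are unrelated to the data sets in Table 6; the function name r_merge is likewise hypothetical.

```python
def r_merge(observations):
    """Compute Rmerge = sum|I_i - <I>| / sum|I_i|.

    observations maps each unique reflection (hkl tuple) to the list of
    its repeated intensity measurements; <I> is the mean for that
    reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator


# Invented measurements for two reflections.
toy = {
    (1, 0, 0): [100.0, 104.0, 96.0],  # mean 100, summed deviation 8
    (0, 2, 0): [50.0, 52.0],          # mean 51, summed deviation 2
}
print(round(r_merge(toy), 4))  # 0.0249
```

Here the summed deviations (8 + 2) are divided by the summed intensities (300 + 102), giving 10/402.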


Protein Data Bank accession numbers

Coordinates and structure factors for WT SARS-CoV Mpro have been deposited in the Protein Data Bank with accession number 2H2Z. Coordinates and structure factors for WT SARS-CoV Mpro in complex with the inhibitor N3 have been deposited in the Protein Data Bank with accession number 2HOB.