Kanchan Anand, Gottfried J Palm, Jeroen R Mesters, Stuart G Siddell, John Ziebuhr, Rolf Hilgenfeld.
Abstract
The key enzyme in coronavirus polyprotein processing is the viral main proteinase, M(pro), a protein with extremely low sequence similarity to other viral and cellular proteinases. Here, the crystal structure of the 33.1 kDa transmissible gastroenteritis (corona)virus M(pro) is reported. The structure was refined to 1.96 Å resolution and revealed three dimers in the asymmetric unit. The mutual arrangement of the protomers in each of the dimers suggests that M(pro) self-processing occurs in trans. The active site, comprised of Cys144 and His41, is part of a chymotrypsin-like fold that is connected by a 16 residue loop to an extra domain featuring a novel alpha-helical fold. Molecular modelling and mutagenesis data implicate the loop in substrate binding and elucidate S1 and S2 subsites suitable to accommodate the side chains of the P1 glutamine and P2 leucine residues of M(pro) substrates. Interactions involving the N-terminus and the alpha-helical domain stabilize the loop in the orientation required for trans-cleavage activity. The study illustrates that RNA viruses have evolved unprecedented variations of the classical chymotrypsin fold.
Year: 2002 PMID: 12093723 PMCID: PMC126080 DOI: 10.1093/emboj/cdf327
Source DB: PubMed Journal: EMBO J ISSN: 0261-4189 Impact factor: 11.598

Fig. 1. Sequence comparison of coronavirus main proteinases. The alignment was produced using CLUSTAL X, version 1.81 (Thompson ), and corrected manually on the basis of the three-dimensional structure of TGEV Mpro. The corresponding sequences of FIPV (strain 79–1146), HCoV (strain 229E), bovine coronavirus (BCoV, isolate LUN), MHV (strain JHM) and IBV (strain Beaudette) were derived from the replicative polyproteins of the respective viruses whose sequences are deposited at the DDBJ/EMBL/GenBank database (accession Nos: FIPV, AF326575; HCoV, X69721; BCoV, AF391542; MHV, M55148; IBV, M95169; TGEV, AJ271965). The β-strands and α-helices as revealed in the TGEV Mpro crystal structure (this study) are shown above the sequence alignment (see also Figures 4 and 5). Black background colour indicates the catalytic cysteine and histidine residues. Grey background colour indicates the key residue of the S1 subsite (TGEV Mpro His162) and its equivalents in other coronavirus main proteinases. Also shown in grey are the phenylalanine and tyrosine residues (TGEV Mpro Phe139 and Tyr160) that are proposed to stabilize the neutral state of His162 (see text for details).

Fig. 2. Stereo view of a representative part of the electron density map. The 2|Fo| – |Fc| electron density map (1.96 Å resolution, contoured at 1σ above the mean) corresponds to Mpro residues 160–162 (Tyr–Met–His), a conserved motif in coronavirus main proteinases. The strong hydrogen bonding interaction between the Tyr160 hydroxyl group and His162 Nδ1 is indicated.

Fig. 3. Stereo depiction of the six molecules (three dimers) of TGEV Mpro in the asymmetric unit. The monomers A–F are shown in different colours; A = red, B = black, C = green, D = orange-red, E = yellow and F = cyan. Note the 2-fold symmetry axes between the monomers in each of the dimers, and between the two lower dimers in the figure (AB and EF). Each of the monomers measures ∼70 Å × 22 Å × 40 Å.

Fig. 4. A MOLSCRIPT diagram (Kraulis, 1991) showing the overall fold of TGEV Mpro (A) with the two β-barrel domains and the α-helical C-terminal domain. β-strands and helices are represented as arrows and cylinders, respectively. The β-barrels of domains I and II are each composed of six-stranded β-sheets (green). Domain III is composed mainly of α-helices (red). The structures of HAV 3Cpro (PDB code: 1HAV) (B) and α-chymotrypsin (4CHA, residues 12–15 and 147–148 are excised) (C) are shown for comparison.

Fig. 5. Topological representation of the secondary structure elements of a TGEV Mpro monomer. α-helices and β-strands are represented as cylinders and arrows, respectively. Numbers indicate the N- and C-terminal residues of the secondary structure elements. Strands bI and cI are adjacent. Cys144 (yellow) and His41 (blue) are shown by circles. The positions of the N- and C-termini are indicated. Also, the presumed localization of the P5–P1 region of a model substrate is shown (blue) (for details, see text and Figure 7).

Fig. 6. Active site of the TGEV Mpro. (A) Difference electron density (|Fo| – |Fc| at 3.0σ above the mean; red) for the oxidized active site Cys144, indicating three oxygen atoms bound to the sulfur. (B) The catalytic Cys144 and His41 residues are shown. The region forming the oxyanion hole (main chain amides of Gly142, Thr143 and Cys144) is highlighted in pink. The water molecule, which occupies a position equivalent to that of the catalytic aspartate of serine proteinases, is shown together with its hydrogen-bonding partners, His41, His163 and Asp186. (C) Superposition of the active site residues of chymotrypsin (shown in red) with the spatially equivalent residues of TGEV Mpro (blue) and HAV 3Cpro (green). The equivalent to the third catalytic residue (Asp102) of chymotrypsin is Asp84 in HAV 3Cpro (side chain oriented differently) and Val84 in TGEV Mpro.
Enzymatic activities of TGEV Mpro mutants
| Plasmid | Oligonucleotides used for cloning or mutagenesis (5′→3′) | Protein | Mpro amino acids | Activity (%)a |
|---|---|---|---|---|
| pMal-Mpro | TCAGGTTTGCGGAAAATGGCAC, AAAAGGATCCTTACTGAAGATTTACACCATACATTTG | Mpro | Ser1–Gln302 | 100 |
| pMal-MproΔ184–302 | TCAGGTTTGCGGAAAATGGCAC, AAAGGATCCTTAACCACCGTACATTTCTCCTTCAAAATT | MproΔ184–302 | Ser1–Gly183 | <0.02 |
| pMal-MproΔ200–302 | TCAGGTTTGCGGAAAATGGCAC, AAAGGATCCTTATGACATGACATTAGTACCTTCCAATTG | MproΔ200–302 | Ser1–Ser199 | 0.4 |
| pMal-MproΔ1–5/Δ200–302 | ATGGCACAGCCTAGTGGTCTTGTA, AAAGGATCCTTATGACATGACATTAGTACCTTCCAATTG | MproΔ1–5/Δ200–302 | Met6–Ser199 | 0.6 |
| pMal-MproΔ1–5 | ATGGCACAGCCTAGTGGTCTTGTA, AAAAGGATCCTTACTGAAGATTTACACCATACATTTG | MproΔ1–5 | Met6–Gln302 | 0.3 |
| pMal-Mpro-H163L | GTATACATGCATCTCTTAGAACTTGGAAATGGCTCGCAT, TCCAAGTTCTAAGAGATGCATGTATACAAAATAGAGAAT | Mpro-H163L (His163→Leu) | Ser1–Gln302 | 98 |
| pMal-Mpro-C144A | AGCTGGTACTGCTGGATCAGTAGGTTATGTGTTAGAA, CTACTGATCCAGCAGTACCAGCTATAAAAGATCCTTT | Mpro-C144A (Cys144→Ala) | Ser1–Gln302 | <0.02 |
The sequence of the 15mer substrate peptide, H2N-VSVNSTLQ↓SGLRKMA-COOH, was derived from the N-terminal Mpro autoprocessing site (the arrow marks the scissile bond between Gln and Ser, the latter being Ser1 of Mpro). The activity of wild-type Mpro (302 residues) was set to 100%; each value is the mean of three experiments, which did not vary by more than 15%.
aProteolytic activities were determined using a peptide-based cleavage assay (Ziebuhr ; see Materials and methods).
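
The primers listed for pMal-Mpro can be checked against the stated protein boundaries (Ser1–Gln302) by translating them. The following is a minimal sketch, not part of the original study: it assumes the standard genetic code, assumes that both primers are in frame from their first nucleotide (the reverse primer after reverse complementation), and uses helper names (`reverse_complement`, `translate`) that are purely illustrative.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate from the first base, stopping at the first stop codon.

    Trailing bases that do not fill a complete codon are ignored.
    """
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Primers copied from the pMal-Mpro row of the table.
forward = "TCAGGTTTGCGGAAAATGGCAC"
reverse = "AAAAGGATCCTTACTGAAGATTTACACCATACATTTG"

n_term = translate(forward)                      # first residues of the insert
c_term = translate(reverse_complement(reverse))  # last residues before the stop

print("N-terminal residues encoded by the forward primer:", n_term)
print("C-terminal residues encoded by the reverse primer:", c_term)
```

Under these assumptions the forward primer encodes a peptide beginning with Ser and matching the C-terminal half of the substrate peptide (SGLRKMA), and the reverse primer encodes a stretch ending in Gln directly before a stop codon, consistent with the Ser1–Gln302 boundaries given in the table.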

Fig. 7. Stereo diagram of a P5–P1 substrate (Asn–Ser–Thr–Leu–Gln, red; corresponding to the TGEV Mpro N-terminal autoprocessing site) modelled into the active site cleft of the TGEV Mpro. Hydrogen bonds are depicted by dotted lines.
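
The P5–P1 numbering used in this caption can be reproduced from the 15mer substrate peptide of the activity assay. The sketch below is illustrative only: it assumes the scissile bond lies between Gln and Ser (inferred from the P1 glutamine and from Mpro beginning at Ser1), and the function name `label_sites` is hypothetical.

```python
# Three-letter codes for the residues that occur in the substrate peptide.
THREE_LETTER = {
    "V": "Val", "S": "Ser", "N": "Asn", "T": "Thr", "L": "Leu", "Q": "Gln",
    "G": "Gly", "R": "Arg", "K": "Lys", "M": "Met", "A": "Ala",
}


def label_sites(peptide, n_before):
    """Label residues around a scissile bond.

    Residues N-terminal to the bond are numbered P1, P2, ... moving away
    from it; residues C-terminal to the bond are numbered P1', P2', ...
    `n_before` is the number of residues preceding the bond.
    """
    labels = {}
    for idx, res in enumerate(peptide):
        if idx < n_before:
            labels[f"P{n_before - idx}"] = res
        else:
            labels[f"P{idx - n_before + 1}'"] = res
    return labels


peptide = "VSVNSTLQSGLRKMA"
# Assumption: cleavage occurs between Gln (P1) and Ser (P1').
cut = peptide.index("QS") + 1
sites = label_sites(peptide, cut)

p5_to_p1 = "-".join(THREE_LETTER[sites[f"P{n}"]] for n in range(5, 0, -1))
print("P5-P1:", p5_to_p1)
print("P1':", THREE_LETTER[sites["P1'"]])
```

With the bond placed there, the P5–P1 residues come out as Asn, Ser, Thr, Leu and Gln, the segment modelled in the figure, with leucine at P2 and glutamine at P1.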

Fig. 8. Intra- and intermolecular contacts of the TGEV Mpro N-terminus. (A) MOLSCRIPT stereo representation of a TGEV Mpro dimer. Molecule A is coloured from blue at the N-terminus, via green (domain II), to red (C-terminus), while molecule B is shown in grey. The catalytic Cys144 and His41 residues are labelled in both monomers. (B) Detailed view of the interactions made by the N-terminal segment (blue) and domains II/III of monomer A as well as domains II/III of monomer B. Residues critically involved in these interactions are designated by the single-letter code and shown in ball-and-stick representation (see text for details). The N- and C-termini of molecule A are indicated.
Summary of X-ray diffraction data from crystals of native and SeMet-substituted Mpro
| | | Peak | Peak | Peak | Edge | Edge | High | High | Low |
|---|---|---|---|---|---|---|---|---|---|
| Beamline | XRDa | BW7Ab | BW7Ab | BW7Ab | BW7Ab | BW7Ab | BW7Ab | BW7Ab | BW7Ab |
| Data sete | Native | P1 | P2 | P3 | E1 | E2 | H1 | H2 | L1 |
| Wavelength (Å)d | 0.99983 | 0.97487 | 0.97845 | 0.97848 | 0.97864 | 0.97874 | 0.95583 | 0.9080 | 1.0022 |
| Resolution (Å) (highest resolution bin)c | 50–1.95 (1.98–1.95) | 30–2.8 | 30–2.8 | 30–2.8 | 30–2.8 | 30–2.8 | 30–2.8 | 30–2.8 | 30–2.8 |
| Completeness (%)c | 98.9 (97.0) | 99.9 | 98.1 | 99.7 | 99.9 | 99.7 | 99.7 | 98.8 | 97.3 |
| Mosaicity (°) | 0.62 | 0.4 | 0.6 | 0.7 | 0.4 | 0.6 | 0.4 | 0.6 | 0.4 |
| Rmerge (%)c,f | 4.2 (22.1) | 10.5 | 11.4 | 10.6 | 8.1 | 8.2 | 8.6 | 7.2 | 8.0 |
| Rrim (%)c,g | 4.6 (27.1) | 12.1 | 13.0 | 12.3 | 9.2 | 8.9 | 10.2 | 7.5 | 10.3 |
| Rpim (%)c,h | 1.8 (15.2) | 6.1 | 6.6 | 6.4 | 4.7 | 4.5 | 5.2 | 3.2 | 5.4 |
| Redundancyc | 5.4 (2.9) | 3.8 | 3.8 | 3.9 | 3.8 | 3.9 | 3.7 | 3.6 | 2.9 |
| | 13.5 (4.0) | 5.4 | 4.7 | 4.8 | 6.1 | 4.1 | 4.1 | 4.9 | 2.5 |
aX-ray diffraction beamline at ELETTRA, Trieste, equipped with a Mar CCD detector.
bWiggler beamline of EMBL at DESY, Hamburg, equipped with a Mar CCD detector.
cHighest resolution bin in parentheses.
dThe inflection point and peak wavelengths were collected in inverse beam mode, whereas the remote wavelengths were collected at the low energy side of the Se edge where there is little anomalous signal and, as a result, no inverse beam data were collected.
eP1, P2, P3 = peak wavelengths 1, 2 and 3; E1, E2 = edge wavelengths 1 and 2 (point of inflection); H1, H2 = high energy remote wavelengths 1 and 2; L1 = low energy remote wavelength.
f$R_{\mathrm{merge}} = 100 \times \sum_{hkl}\sum_i |I_i - \langle I \rangle| / \sum_{hkl}\sum_i I_i$, where $I_i$ is the observed intensity and $\langle I \rangle$ is the average intensity from multiple measurements.
g$R_{\mathrm{rim}} = 100 \times \sum_{hkl} [N/(N-1)]^{1/2} \sum_i |I_i - \langle I \rangle| / \sum_{hkl}\sum_i I_i$, where $N$ is the number of times a given reflection has been measured. This quality indicator corresponds to an $R_{\mathrm{sym}}$ that is independent of the redundancy of the measurements.
h$R_{\mathrm{pim}} = 100 \times \sum_{hkl} [1/(N-1)]^{1/2} \sum_i |I_i - \langle I \rangle| / \sum_{hkl}\sum_i I_i$. This factor provides information about the average precision of the data.
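
The three merging statistics defined in footnotes f–h differ only in the redundancy-dependent weight applied to each reflection. The sketch below shows one way to compute them from repeated intensity measurements. It is a toy illustration, not the data-processing software used in the study: the function name, the input layout (a dictionary keyed by Miller indices) and the choice to skip reflections measured only once are all assumptions.

```python
from math import sqrt


def merging_r_factors(observations):
    """Return (Rmerge, Rrim, Rpim) in percent.

    `observations` maps each reflection (h, k, l) to the list of its
    repeated intensity measurements.
    """
    num_merge = num_rim = num_pim = denom = 0.0
    for intensities in observations.values():
        n = len(intensities)
        if n < 2:
            # Assumption: a single measurement carries no information
            # about spread, so it is left out of all sums.
            continue
        mean = sum(intensities) / n
        spread = sum(abs(i - mean) for i in intensities)
        num_merge += spread
        num_rim += sqrt(n / (n - 1)) * spread   # redundancy-independent
        num_pim += sqrt(1 / (n - 1)) * spread   # precision of the mean
        denom += sum(intensities)
    return tuple(100 * x / denom for x in (num_merge, num_rim, num_pim))


# Hypothetical toy data: two reflections with 2 and 4 measurements.
toy = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 0): [50.0, 55.0, 45.0, 50.0],
}
r_merge, r_rim, r_pim = merging_r_factors(toy)
print(f"Rmerge = {r_merge:.2f}%, Rrim = {r_rim:.2f}%, Rpim = {r_pim:.2f}%")
```

Because the weight for Rrim is at least 1 and the weight for Rpim is at most 1, Rpim ≤ Rmerge ≤ Rrim holds for any data set, which is the ordering seen in every column of the table.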
Phasing statistics, refinement statistics and model quality
| Phasing | |
|---|---|
| FOMa before solvent flattening | 0.48 |
| FOMa after solvent flattening (no averaging) | 0.72 |
| FOMa after solvent flattening (with averaging) | 0.79 |
| Refinement | |
| Resolution (Å) | 50–1.96 |
| | 0.210 |
| | 0.256 |
| No. of non-hydrogen atoms [average | |
| Protein (main chain) | 7198 (46.1) |
| Protein (side chain) | 6613 (47.2) |
| Water | 1006 (50.3) |
| MPD | 48 (67.6) |
| Sulfate | 135 (57.1) |
| Dioxane | 54 (71.7) |
| R.m.s. deviation from ideal geometry | |
| Bonds (Å) | 0.017 |
| Angles (°) | 1.9 |
| Improper dihedral angles (°) | 1.16 |
aFOM = figure of merit.
bR-factor = Σ (|Fo| – k|Fc|)/Σ|Fo|, where k is the scale factor.
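
The R-factor of footnote b can be illustrated with a few lines of code. This is a hedged sketch on made-up numbers: it assumes the conventional reading in which each difference |Fo| – k|Fc| is taken as an absolute value, and it assumes the scale factor k is chosen as Σ|Fo|/Σ|Fc|, which the source does not specify. The function name `r_factor` is illustrative.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fo| - k|Fc| |) / sum(|Fo|) for structure-factor amplitudes.

    Assumption: k scales the calculated amplitudes so that their sum
    matches the sum of the observed amplitudes.
    """
    k = sum(f_obs) / sum(f_calc)
    residual = sum(abs(fo - k * fc) for fo, fc in zip(f_obs, f_calc))
    return residual / sum(f_obs)


# Hypothetical amplitudes, not taken from the structure.
f_obs = [10.0, 20.0, 30.0]
f_calc = [12.0, 18.0, 30.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```

A model whose calculated amplitudes are exactly proportional to the observed ones gives R = 0 under this definition, and larger values indicate poorer agreement.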