Naveen Vankadari1, Vijayasarathy Ketavarapu2, Sasikala Mitnala2, Ravikanth Vishnubotla2, Duvvur Nageshwar Reddy2, Debnath Ghosal3. 1. Monash Biomedicine Discovery Institute, Department of Biochemistry and Molecular Biology, Monash University, Clayton, Victoria 3800, Australia. 2. Institute of Translational Research, Department of Genomics and Molecular Biology, Asian Institute of Gastroenterology, Gachibowli, Hyderabad 500032, Telangana, India. 3. Department of Biochemistry and Pharmacology, Bio21 Molecular Science and Biotechnology Institute, The University of Melbourne, Melbourne, Victoria 3000, Australia.
Abstract
The coronavirus disease 2019 (COVID-19) pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has infected more than 520 million people around the globe, resulting in more than 6.2 million deaths as of May 2022. Understanding the cell entry mechanism of SARS-CoV-2 and its entire repertoire is a high priority for developing improved therapeutics. The SARS-CoV-2 spike glycoprotein (S-protein) engages with host receptor ACE2 for adhesion and serine proteases furin and TMPRSS2 for proteolytic activation and subsequent entry. Recent studies have highlighted the molecular details of furin and S-protein interaction. However, the structural and molecular interplay between TMPRSS2 and S-protein remains enigmatic. Here, using biochemical, structural, computational, and molecular dynamics approaches, we investigated how TMPRSS2 recognizes and activates the S-protein to facilitate viral entry. First, we identified three potential TMPRSS2 cleavage sites in the S2 domain of S-protein (S2', T1, and T2) and reported the structure of TMPRSS2 with its individual catalytic triad. By employing computational modeling and structural analyses, we modeled the macromolecular structure of TMPRSS2 in complex with S-protein, which suggested the mechanism of S-protein processing or cleavage for a new path of viral entry. On the basis of structure-guided drug screening, we also report potential TMPRSS2 inhibitors and their structural interaction in blocking TMPRSS2 activity, which could impede the interaction with the spike protein. These findings reveal the role of TMPRSS2 in the activation of SARS-CoV-2 for its entry and provide insight into possible intervention strategies.
The coronavirus disease 2019
(COVID-19) pandemic, caused by the highly transmissible and virulent
pathogen severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2),
is a global health emergency[1,2] causing severe acute
respiratory distress syndrome (ARDS) in humans. SARS-CoV-2 is an enveloped,
single-stranded positive-sense RNA β-coronavirus[3] belonging to the family Coronaviridae, similar to SARS-CoV-1
and MERS-CoV.[4] To date, it has infected
more than 520 million people resulting in more than 6.2 million deaths
with varied incidences across the globe (www.covid19.who.int) as of
May 2022. In different countries, various levels of virulence and
pathogenicity have been identified, which could be due to the variance
in viral strains and host factors, including host genetic makeup.[5−7] To develop improved therapeutics against SARS-CoV-2, understanding
the molecular mechanism of viral entry and/or hijacking the host system
and its interplay with the different viral and host proteins is imperative.
The observed differences in the rates of infection across the world
also raise an intriguing question of whether functionally relevant
variants in ACE2, Furin, and TMPRSS2 contribute to differences in infection rates.[8,9]

Like other viruses, the entry of a coronavirus or SARS-CoV-2 into
host cells is a vital determinant of viral infectivity and pathogenesis
and also a significant challenge for host immune surveillance and
combat strategies.[10,11] SARS-CoV-2 utilizes spike glycoprotein
(S-protein) for host cell adhesion via host-receptor recognition followed
by membrane fusion.[12,13] The N-terminal S1 subunit of
S-protein contains the receptor binding domain (RBD) and binds to
the peptidase binding domain on receptor angiotensin converting enzyme
2 (ACE2).[9,14] The cryo-electron microscopy (cryo-EM) structure
of S-protein reveals the structural rearrangement of S-protein after
binding to the receptor to allow fusion of the host cell and viral
membranes.[13,15] It is also known that SARS-CoV-2
S-protein becomes activated through different host proteases.[16,17] Evidence suggests that coronavirus S-proteins are cleaved by host
proteases such as furin at the S1/S2 cleavage site, exposing S2 to
further processing by host serine proteases for subsequent viral entry.[16,18] Surface-expressed transmembrane protease serine 2 (TMPRSS2) is implicated
in the activation of influenza A, influenza B, and coronaviruses,
including SARS-CoV-2, to drive efficient pulmonary infection.[9,19] Inhibiting TMPRSS2’s proteolytic activity prevents efficient
viral entry, making it a promising target for antiviral therapies.[20] In SARS-CoV-2, TMPRSS2 is known to attack the
S2 prime (S2′) region and cleave the spike protein facilitating
cell entry. In this regard, specific serine protease inhibitors such
as Camostat and Nafamostat have been effectively used to block
TMPRSS2 as one of the therapeutic options against COVID-19 infection.[21] Recent clinical studies have also shown that
mutations in host TMPRSS2 lower the infection rate in COVID-19 patients.[22] However, the lack of structural and molecular
studies concerning interaction of TMPRSS2 with S-protein limits our
understanding of the key priming action of TMPRSS2 for the entry of
the S-protein into the host cell. This warrants detailed structural
and molecular studies to unravel the molecular mechanism and repertoire
of the coronavirus hijacking the host system, which has potential
therapeutic implications.

Here, using comprehensive structural,
functional, and molecular
modeling–dynamics studies, we report the structural basis of
TMPRSS2-mediated activation of SARS-CoV-2 S-protein for its cellular
entry. Along with known and functionally proven TMPRSS2 cleavage at
the S2′ site, we report two additional potential TMPRSS2 cleavage
sites in spike protein, which potentially contribute to immune evasion
and cell infectivity. The structural studies also unveil the cleavage
fragments of the S protein. Using computational approaches and structural
interactions, we screened several clinically approved protease inhibitors
(Camostat, Upamostat, Nafamostat, and Bromhexine) that could potentially
block TMPRSS2 activity and its further interaction with the S2 subunit
of spike protein. The other two potential TMPRSS2 inhibitors are Ambroxol
and Gabexate. These findings unravel one of the key strategies that
SARS-CoV-2 adopts to infect humans while escaping immune surveillance,
and the findings also provide insight into possible intervention strategies
and development of other therapeutics.

To understand
the structure and functional mechanism of
TMPRSS2, and because no experimental structure was available, we first constructed
the soluble and surface-exposed functional domain structure of TMPRSS2
through de novo molecular modeling by employing two independent servers,
SWISS-MODEL (https://swissmodel.expasy.org/) for homology modeling and I-TASSER for iterative threading-based
prediction (www.zhang-lab.org), by selecting the best sequence- and topology-aligned structure
[Protein Data Bank (PDB) entry 5CE1]. The modeled loops were reconfirmed
and refined for the best fit to avoid steric hindrance clashes and
ensure all residues are placed in Ramachandran favored positions using
Coot (https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/). The
model was validated using the C-score (confidence
score) and TM score (structural similarity), which supported the
correctness of the fold and the confidence of the predicted structure. All amino
acid residues were positioned according to their lowest-energy possible
orientation in the final model, and it superimposes with 5CE1 (hepsin protease)
at a 0.45 Å Cα root-mean-square deviation (RMSD) (Figure S1a–e). The overall structure of
TMPRSS2 (amino acids 145–492) (Figure a) measures 42 Å in length and 24 Å
in diameter, comprising an N-terminal (amino acids 145–243)
activation domain and a C-terminal (amino acids 256–492) proteolytic
or catalytic domain. The N-terminus of TMPRSS2 is oriented toward
the transmembrane of the host cell, while the C-terminus is pointed
outward and is arranged to receive its target peptides and/or proteins
or spike protein. We mapped the catalytic site/pocket of TMPRSS2 using
sequence and structural analysis of different serine proteases. Residues
H296, S441, K432, W461, and Q438 were found to be highly conserved
with other TTSPs (type II transmembrane serine proteases) and act as
prime residues for the catalytic activity. The N-terminus of TMPRSS2
(amino acids 1–145) belongs to the transmembrane region, which
is not considered in this structural determination. We also compared
our TMPRSS2 structure with the recent AI-based prediction from AlphaFold,[23] which was found to be substantially comparable
to our modeled structure as well as a recent unpublished crystal structure
(PDB entry 7MEQ) with an RMSD of 0.32 Å (Figure e). The two minor observed differences are a change
in the one-turn helix to a loop structure at Q317–G323 in the
model and a small bending angle between the N- and C-terminal domains,
which do not have a role in catalytic function.
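The model comparisons above are expressed as Cα RMSD values (0.45 Å against 5CE1, 0.32 Å against 7MEQ). As a minimal sketch of what that number measures, the snippet below computes the RMSD between two lists of paired Cα coordinates that are assumed to be already superimposed and residue-matched. The function name `ca_rmsd` and the toy coordinates are hypothetical and are not taken from the TMPRSS2 model or any PDB entry; the superposition step itself is not shown.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """Return the RMSD between two paired, pre-superimposed coordinate lists.

    Each list holds (x, y, z) tuples, one per C-alpha atom, in the same
    residue order. The result is in the same unit as the input (e.g. Å).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical toy coordinates: every atom is shifted by 0.5 Å along y.
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
reference = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 0.5, 0.0)]

print(f"C-alpha RMSD: {ca_rmsd(model, reference):.2f} Å")
```

In practice the two structures would first be optimally superimposed (for example with a least-squares fit) before this calculation, since RMSD on unaligned coordinates is not meaningful.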
Figure 1
Structures of TMPRSS2
and SARS-CoV-2 spike glycoprotein. (a) Surface
and ribbon model showing the side view of the homology model structure
of TMPRSS2. The substrate or catalytic binding region is labeled on
top, and the unstructured and membrane regions are shown as dashed
lines. (b) Surface and ribbon model showing the side view of SARS-CoV-2
spike glycoprotein. The S1 domain region is shown as a gray surface,
and the internal S2 domain is shown as colored ribbons. Three protomers
of the homotrimer are colored accordingly. (c) Domain arrangement
of the S1 and S2 subunits of spike glycoprotein. Abbreviations: SP,
signal peptide; NTD, N-terminal domain; RBM, receptor binding domain;
FP, fusion peptide; HR1, heptad repeat 1; HR2, heptad repeat 2; TM,
transmembrane domain; CP, cytoplasmic domain. (d) Cleavage of SARS-CoV-2
S-protein by TMPRSS2. The HEK293 cells were transfected with either
an empty vector or the TMPRSS2 gene and incubated for 48 h in the
absence or presence of the furin inhibitor MI-1851 (50 μM each).
Cell lysates were subjected to SDS–PAGE and Western blot analysis
using antibodies against the C-terminal Myc tag. Cleavage of the spike
into S2, S2′, and S1 is shown. Figure adapted with permission
from ref (29). (e)
Overlay of our de novo validated structure of TMPRSS2 (blue) with
AlphaFold-predicted (pink) and X-ray diffraction (green) structures. Catalytic
active sites and N- and C-termini are denoted.
With regard to the structure of the SARS-CoV-2 spike glycoprotein,
the only two available cryo-EM structures (PDB entries 6VSB and 6VXX)[15,24] are incomplete and have several gaps in the built structure. Hence,
we remodeled these structures to fill the missing loops using a previously
published and validated model structure of full-length SARS-CoV-2
spike glycoprotein.[25] The loop regions
connecting the S1 and S2 regions of the spike are highly flexible
and hard to locate even in recent EM structures but play a key role
in virion activation.[26] For enzyme catalytic
activity, understanding the active site is critical. It has been shown
that TMPRSS2 cleaves the spike protein preferentially in the S2′
region (KPSKR815↓SFIED) of the S-protein. With the aid
of structure-guided serine protease cleavage sites, we mapped two
additional potential TMPRSS2 binding or cleaving sites in S-protein
(T1, amino acids 837–845, and T2, amino acids 976–986)
located in the S2 domain region, as found in other serine protease
target sites,[27,28] along with the putative sequence
and structural analysis (Figure S2). The
T1 and T2 binding sites on S2 spike protein were discovered in the
structure of extended loop regions, where T2 is buried in S1 domains
and likely exposed during either receptor binding or post-furin cleavage
(PRRAR685↓SVAS). The T1 cleavage site on the S2
domain of S-protein, which is between the C-terminal region of the
fusion peptide and the N-terminal region of heptad repeat region 1
(HR1) adjacent to the fusion peptide site, is rich in Lys and Arg
residues. On the basis of cryo-EM and a molecular model of S-protein,
the S1 and S2 domains and their inner structural arrangement are shown
in Figure b. Similarly,
the T2 cleavage site on the S2 domain overlaps with the C-terminal
region of HR1 and contains similar residues (Figure c). It is also clear that most serine proteases
target Lys- and Arg-rich sites.

The SARS-CoV-2 spike protein
endogenous cleavage assay demonstrates
that human TMPRSS2 cleaves the spike protein at the S1 and S2′
regions even in the presence of MI-1851 (furin-specific inhibitor)
(Figure d). The TMPRSS2
cleavage produces S1 and S2/S2′ fragments, but it has a strong
preference for the S2′ region for cleavage.[29] Additionally, cleavage at the T1 site has also been reported.
The absence of MI-1851, on the contrary, resulted in a greater cleavage
of the spike, which was primarily driven by the endogenous furin.
Furthermore, recent studies show that COVID-19 patients carrying WT-TMPRSS2
had an infection rate higher than that with the V160M mutation[22,30] and demonstrate the V160M mutation reduced TMPRSS2 processivity.[22] This suggests that WT-TMPRSS2 is one of the
factors that contribute to enhanced virulence.

The functional
and biophysical studies provide evidence that TMPRSS2 potentially
recognizes and cleaves the spike protein at the S2 and S2′
sites (Figure d).
To better understand the structural basis of the interaction of SARS-CoV-2
S-protein and TMPRSS2, molecular unbiased random docking and interaction
studies were performed with the full-length and activated form (post-furin
cleavage) of SARS-CoV-2 spike glycoprotein (S2 domain) and our modeled
and validated TMPRSS2 (amino acids 145–492) as template structures.
We docked these two structures using
two independent servers, ClusPro (https://cluspro.org/login.php) and HADDOCK 2.2 (www.bonvinlab.org/software/haddock2.2/), for cross-validation in the absence of water. The binding free
energies were taken into consideration for selecting the best possible
models. Further validation and refinement were completed by ensuring
that the residues occupied Ramachandran favored positions using Coot
(https://www2.mrc-lmb.cam.ac.uk/personal/pemsley/coot/). The
final docked complex structures were then verified for the absence
of any large conformational changes upon docking. The S-protein/TMPRSS2
complex structure was further validated and screened using different
parameters (cluster size, mode of interaction, buried interface,
ΔG, etc.) obtained from the docking clusters
(Figure S3). The prime interaction between
S-protein and TMPRSS2 is mediated by the classical protease cleavage binding
mode (Figure ). The
interaction encompasses a large buried interface (2140 Å²), consistent with a genuine mode of interaction, and is mediated
via electrostatic energy (−170 kcal), van der Waals energy
(−63 kcal), and hydrogen bonding. The overall complex structure
shows that TMPRSS2 binds to the S2′ site (amino acids 806–814;
a loop region is extended outward and well-exposed) of the SARS-CoV-2
spike glycoprotein homotrimer. The active site amino acids in TMPRSS2
(Q276, E299, K300, P301, K340, K342, E389, K390, L419, S441, Q438,
and W461) mediate major contacts for S-protein interaction (Figure a,b). Consistently,
residues H296, S441, K432, and Q438 are highly conserved among several
serine proteases.[31] This underlines the
structural mechanism of human TMPRSS2 recognizing and cleaving the
SARS-CoV-2 spike protein in the S2′ region.
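The docked complexes are judged in the text on several read-outs at once (buried interface, electrostatic energy, van der Waals energy). As a minimal sketch, the snippet below tabulates the values quoted in this article for the S2′ site (above) and for the T1 and T2 sites (in the paragraphs following Figure 2) and ranks them two ways. The `DockedSite` class and the plain sum of the electrostatic and van der Waals terms are illustrative assumptions only; that sum is not the scoring function of ClusPro or HADDOCK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockedSite:
    """Docking read-outs for one candidate cleavage site, as quoted in the text."""

    site: str
    buried_area: float     # buried surface area, Å^2
    electrostatic: float   # electrostatic energy, kcal (as reported)
    van_der_waals: float   # van der Waals energy, kcal (as reported)

    @property
    def nonbonded_sum(self) -> float:
        # Illustrative assumption: an unweighted sum, NOT a server scoring function.
        return self.electrostatic + self.van_der_waals


SITES = [
    DockedSite("S2'", 2140.0, -170.0, -63.0),
    DockedSite("T1", 1557.0, -250.0, -45.0),
    DockedSite("T2", 2282.0, -129.0, -69.0),
]

# More negative energy first; larger buried area first.
by_energy = sorted(SITES, key=lambda s: s.nonbonded_sum)
by_area = sorted(SITES, key=lambda s: s.buried_area, reverse=True)

print("Ranked by summed non-bonded energy:")
for s in by_energy:
    print(f"  {s.site:4s} {s.nonbonded_sum:7.1f} kcal")

print("Ranked by buried surface area:")
for s in by_area:
    print(f"  {s.site:4s} {s.buried_area:7.1f} Å^2")
```

The two rankings differ (T1 leads on summed energy, T2 on buried area), which illustrates why a single read-out is a weak basis for choosing between docked poses and why several parameters are considered together.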
Figure 2
Structural interaction
between TMPRSS2 and SARS-CoV-2 spike glycoprotein.
(a) S2′, the primary cleavage site: TMPRSS2 interacting with
the S2 domain of S-protein in the S2′ region. (b) Enlarged
view showing the detailed interaction between TMPRSS2 and the S2′
site of spike protein. (c) Key residues and binding orientation topology
at the interaction interface of the S2′ site. Color coding
and labeling are the same in all of the figures. (d) T1, first model
of TMPRSS2 interacting with the S2 domain of spike glycoprotein in
the T1 region. (e) Enlarged view showing the detailed interaction
between TMPRSS2 and the T1 site in spike protein. (f) Key residues
and binding orientation topology at the interaction interface of the
T1 site.
The additional interaction and
cleavage by the TMPRSS2 at the T1
and T2 sites of the S2 domain of spike glycoprotein occur via three
possible modes (Figure S3b). The TMPRSS2
binding to the T1 site of spike protein (cluster 1 of the docked complex)
represents another favorable mode of interaction (Figure d–f and Figure S4a–c), with the highest docking score
of −101 and a large buried surface area of 1557 Å². The interaction is mediated by electrostatic energy (−250
kcal), van der Waals energy (−45 kcal), and hydrogen bonding
(Figure S3). The overall docked complex
structure shows that TMPRSS2 binds to the T1 site (amino acids 835–850)
of the SARS-CoV-2 spike glycoprotein homotrimer and adopts the typical
protease binding mode. The TMPRSS2 catalytic pocket adopts a cup-like
architecture and accommodates a large binding interface. With regard
to interaction of TMPRSS2 with the T1 site (K835–I850) of activated
spike protein (Figure d–f and Figure S4a–c), the
extended and well-exposed loop region (G832–N856) of S-protein
assumes the peptide binding mode with TMPRSS2. The entire “S-shaped”
loop region of the T1 site of the S2 domain passes into the canyon-like
crevice or cup-like structure of the TMPRSS2 catalytic pocket (Figure d–f), and
the interaction is mediated via hydrogen bonding and van der Waals
energy. Among them, S-protein residues K835, Y837, D839, C840, L841,
D843, I844, R847, and R848 are the key interacting residues and are
well positioned with respect to the catalytic pocket of TMPRSS2 for
processing the spike protein.

In addition, we also noticed
the possible interaction of TMPRSS2
with the T2 site [amino acids 975–987, the linker region connecting
α1/α2 and α3/α4 of the S2 domain of the spike
protein (Figure S4d–f)]. This includes
the interaction with the pin/tip regions of the activated/primed S2
domain trimer containing α-helices via hydrogen bonding, electrostatic
energy (−129 kcal), and van der Waals forces (−69 kcal)
with a larger buried surface area of 2282 Å² (Figure S3). Residues D745 and T747 of the α1/α2
helix region and residues N977, D978, L981, R983, D985, and K986 of
the α3/α4 helix region of spike glycoprotein (S2 domain)
are the key interacting residues and well positioned and inserted
into the catalytic pocket of TMPRSS2 (Figure S4d–f). The detailed position and alignment of the amino acids of TMPRSS2
and spike protein involved in the interaction provide a high degree
of confidence in molecular binding (Figure S4d–f). This suggests that T1 and T2 sites are also potential targets
for the TMPRSS2 protease.

To understand the real-time in-solution
behavior of the SARS-CoV-2
S-protein/TMPRSS2 complex, we performed virtual biophysical experiments
using molecular dynamics and simulations using DynOmics (http://dynomics.pitt.edu/) and LARMD (http://chemyang.ccnu.edu.cn/ccb/server/LARMD/). The time course molecular simulations for 10 ns of dynamics were
recorded. Because the main catalytic domain of TMPRSS2, which is the part involved in
interaction with the target spike protein or drugs, is stable and resembles
the crystal structure, we studied the dynamics using only one conformation
or ensemble of TMPRSS2. The structure of spike protein in complex
with TMPRSS2 was solvated in a volume of 125 Å³ with
water molecules using the Desmond Builder (Schrödinger, LLC,
New York, NY). All simulations were performed after energy minimization
and equilibration, and the systems were neutralized with counterions. The simulations
were performed for 10 ns using LARMD at a pressure of 1 atm and a
temperature of 300 K. All analyses were carried out using LARMD analysis
tools. The MD trajectories were analyzed to identify critical interactions
that were formed, retained, and disrupted in the interface between
spike protein and TMPRSS2. The DynOmics program was used to generate
the changes in the B-factor, eigenvectors, and mode
of interaction analysis using default settings. The B-factor profiles (thermal stability factor), RMSD, and domain separation
analysis combined with simulation studies were performed and validated
with Schrödinger molecular dynamics trajectories.

Our extended
biophysical, molecular dynamics, and simulation studies
also principally suggest that the overall SARS-CoV-2 S-protein/TMPRSS2
complex is stable with regard to its interaction and dynamic motion
(Figure and Figure S5). First, the inter-residue contact
map of the SARS-CoV-2 spike/TMPRSS2 complex shows the clear and robust
physical interaction between the molecules even in the dynamic or
in-solution state (Figure a,b). The entire loop region of spike protein at S2 interacting
with TMPRSS2 shows stability even under dynamic motion. The dynamic
state of the complex was allowed to oscillate up to 4.5 Å (intermolecule
lines in Figure a)
(Movie S1). With respect to the B-factor, all residues in the SARS-CoV-2 S-protein/TMPRSS2
complex showed significantly lower B-factor values
of <0.5 Å² at the interaction interface (Figure c), which further
supported the greater physical stability of the complex. In comparison,
considerable B-factor values were observed in the
disordered regions or loops. In addition, the lower overall B-factor of the SARS-CoV-2 S-protein/TMPRSS2
complex is also consistent with
higher thermodynamic stability. Hence, we next sought
to check the domain separation possibilities of SARS-CoV-2 and TMPRSS2
through a biophysical study and time course eigenvectors (domain separation
dynamics). As expected, very low eigenvectors were observed for the
whole complex-forming region (eigenvector score of <0 ± 0.01),
suggesting the high stability of the complex and the least likely
physical separation possibilities (Figure d). In contrast, somewhat higher
eigenvectors were noticed for the S1 domain regions of SARS-CoV-2,
which further supports the separation of S1 domains (Figure d). Small differences in eigenvector
levels of spike monomers could result from differences in interdomain
interaction. Increased eigenvectors are directly linked with a higher
likelihood of domain physical separation or movement from the rest
of the complex. Consistent molecular dynamics results were also obtained
with the post-furin cleaved spike protein[8] and with regard to TMPRSS2 recognition of spike (Figure S5). The time course simulation of the SARS-CoV-2 S-protein/TMPRSS2
complex for 10 ns was recorded, and the mobility scale (Movie S1) also shows that the interaction interface
is less mobile, thus providing a high degree of confidence of complex
formation and physical stability. Furthermore, the mode shape oscillation
profile shows that S1 domains of SARS-CoV-2 have different conformations
but the S2 domains are more stable and exhibit minor changes in conformation,
suggesting the stability of the S2 domains of the S-protein compared
to the S1 domains.
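The B-factor profiles discussed above are read as a per-residue measure of mobility. As a minimal sketch of that link, the snippet below computes the root-mean-square fluctuation (RMSF) of one atom over a set of trajectory frames and converts it with the standard isotropic relation B = (8π²/3)·RMSF². Whether DynOmics or LARMD report exactly this quantity is an assumption; the function names and the two-frame toy trajectory are hypothetical.

```python
import math


def rmsf(positions):
    """Root-mean-square fluctuation of one atom around its mean position.

    `positions` is a list of (x, y, z) tuples, one per trajectory frame,
    assumed to be already aligned to a common reference frame.
    """
    n = len(positions)
    if n == 0:
        raise ValueError("at least one frame is required")
    mean = [sum(p[i] for p in positions) / n for i in range(3)]
    mean_sq_dev = sum(
        sum((p[i] - mean[i]) ** 2 for i in range(3)) for p in positions
    ) / n
    return math.sqrt(mean_sq_dev)


def b_factor(rmsf_value):
    """Isotropic B-factor (Å^2) from an RMSF given in Å."""
    return (8.0 * math.pi ** 2 / 3.0) * rmsf_value ** 2


# Hypothetical toy trajectory: an atom oscillating by ±0.1 Å along x.
trajectory = [(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0)]

fluctuation = rmsf(trajectory)
print(f"RMSF = {fluctuation:.3f} Å, B-factor = {b_factor(fluctuation):.3f} Å^2")
```

Because B scales with the square of the fluctuation, residues that move little at the interface yield disproportionately small B-factors compared with flexible loops.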
Figure 3
(a) Intermolecular connectivity of the SARS-CoV-2 full-length
trimer
and TMPRSS2 complex during the real-time dynamic state and oscillation
of atoms. Each monomer of the spike protein and TMPRSS2 are shown
in different colors in the ball-and-stick model. (b) Ribbon model
showing the mobility scale and stable complex-forming region. Highly
stable and less stable residues are colored blue and red, respectively.
(c) Molecular dynamics simulation studies showing the oscillation
and B-factor (stability factor; the lower the value,
the higher the stability) profiles of the S-protein/TMPRSS2 complex.
The amino acid residue position is shown on the X-axis, and the degree of movement of amino acids as a B-factor is shown on the Y-axis. (d) Domain separation
dynamics of the spike protein/TMPRSS2 complex. Low and studied eigenvectors
for TMPRSS2 and moderate level for spike protein in most regions are
noticeable, indicating the higher stability of the complex. (e) Mode
shape profile of the spike trimer binding to TMPRSS2 showing the region
that has a greater number of conformational shapes.
Currently, COVID-19 patients are undergoing treatment with
a broad
range of antiviral drugs such as remdesivir, arbidol, ritonavir,
and other combinations of drugs, but the disease still warrants more
drugs that aid in different functional aspects.[32−34] Drugs or inhibitors
that could block the function of TMPRSS2 and structurally impede its
interaction with the spike protein could be potential therapeutics. To explore the specificity and validate the precision of
drugs that could fit in the catalytic pocket of TMPRSS2, we performed
structure-guided screening of potential drugs via unbiased random
molecular docking, refinement, and dynamic studies with four potential
TMPRSS2 protease inhibitors (Camostat, Upamostat, Nafamostat, and
Bromhexine hydrochloride) using two independent servers: (i) ClusPro
(https://cluspro.org/login.php) with a refinement interface
and (ii) HADDOCK2.2 (www.bonvinlab.org/software/haddock2.2/).
Before that, both protein and protease inhibitors were prepared for
these studies by ensuring the presence of all hydrogen atoms and water
molecules at least 5 Å around the binding site or catalytic pocket
using the Maestro program package.[35] The
prediction results and drug binding location were assessed and corroborated
on the basis of the solvent accessible surface area (SASA), C-score (confidence score), and Z-score
(clash score) of the binding location and exposed residues of SARS-CoV-2
spike glycoprotein (Figures S6 and S7).
We observed the specific binding location of all drugs directed to
the catalytic pocket of TMPRSS2. Among the possible modes of small-molecule
binding from the SwissDock server, it is evident that, among the
three best possible solutions for each individual drug that could potentially
bind to TMPRSS2, the C2 clusters show higher redundancy
of drug binding and correlate with the TMPRSS2 catalytic pocket (Figure S6). It is also interesting to note that
all four drugs bind to TMPRSS2 with a high affinity of approximately
−9 kcal/mol and a high docking score (Figures S6 and S7).

We next analyzed interactions of
individual drugs with the TMPRSS2
catalytic pocket of active residues Q276, H296, E299, K300, P301,
K340, K342, E389, K390, L419, S441, Q438, and W461 (Figure ). It was interesting to notice
that all four inhibitors (Camostat, Upamostat, Nafamostat, and Bromhexine
hydrochloride) predominantly bind to the specific location of the
TMPRSS2 catalytic pocket, which excluded residues Q276, Q317, and
E395 (Figures and 4). Intriguingly, Nafamostat potentially could bind
in two different modes and adjoining positions in the TMPRSS2 catalytic
pocket. The inhibitor Chemostat prefers the TMPRSS2 catalytic pocket
for interaction with Q276, H296, K342, W461, L419, Q389, and S441
(Figure a). Bromhexine
hydrochloride also docks at the catalytic triad of the TMPRSS2 pocket
and docks into the hydrophobic groove, forming strong interactions
with H296, K342, W461, L419, Q389, and S436 (Figure b). Nafamostat shows promising binding in
two different sites. Mode 1 has a higher affinity to bind and interact
with the residues (H296, E299, K342, S441, Q438, W461, S436, and D435),
whereas in binding mode 2, Nafamostat interacts mainly with the TMPRSS2
residues (H296, E299, S441, Q438, Q317, and E395) (Figure c,d). Similarly, Upamostat
interacts with TMPRSS2 via Q389, E299, E389, and K399 in addition
to the key triad residues (H296, K342, S441, and Q438) (Figure e). The detailed position and
alignment of amino acids of TMPRSS2 and individual drugs involved
in the interaction are shown in Figure .
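The per-inhibitor contact lists given above can be compared with a short tally. This is only a bookkeeping sketch: the residue sets are transcribed as written in this paragraph (the two Nafamostat modes are merged, which is an assumption of the sketch), and the variable names are introduced here for illustration.

```python
from collections import Counter

# Contact residues per inhibitor, transcribed from the paragraph above.
# Nafamostat merges binding modes 1 and 2 (an assumption of this sketch).
contacts = {
    "Camostat": {"Q276", "H296", "K342", "W461", "L419", "Q389", "S441"},
    "Bromhexine hydrochloride": {"H296", "K342", "W461", "L419", "Q389", "S436"},
    "Nafamostat": {"H296", "E299", "K342", "S441", "Q438", "W461", "S436", "D435"}
    | {"H296", "E299", "S441", "Q438", "Q317", "E395"},
    "Upamostat": {"Q389", "E299", "E389", "K399", "H296", "K342", "S441", "Q438"},
}

def by_position(residue):
    """Sort key: the numeric part of a label such as 'H296'."""
    return int(residue[1:])

# Residues listed for every inhibitor.
shared_by_all = sorted(set.intersection(*contacts.values()), key=by_position)

# Number of inhibitors listing each residue.
counts = Counter(res for residues in contacts.values() for res in residues)

print("listed for all four inhibitors:", shared_by_all)
for residue in sorted(counts, key=by_position):
    print(residue, counts[residue])
```

Such a tally reflects only the residues named in the text; contacts visible in the figures but not listed in the prose are not captured.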
Figure 4
Detailed structural view of the interaction between different
potential
protease inhibitors and the TMPRSS2 catalytic site (color coding of
amino acids for TMPRSS2 only). The position and residue names are
labeled accordingly, and the types of interactions between the individual
drug and neighboring amino acids are marked as shown in the legend:
(a) Camostat, (b) Bromhexine, (c and d) Nafamostat binding in two
different modes, and (e) Upamostat.
In this study, we identified and structurally demonstrated the
key regions of the SARS-CoV-2 S-protein involved in interaction with
and processing by human TMPRSS2 to establish the underlying structural
mechanism of TMPRSS2-mediated S-protein processing for viral entry
in human cells. On the basis of available biochemical and structural
data for the target sites of other serine proteases[36,37] along with sequence analysis, we identified two prime TMPRSS2 recognition
sites in the SARS-CoV-2 S-protein (T1, amino acids 837–845;
T2, amino acids 976–986). The two identified sites span a part
of HR1 of the six-helix bundle of S-protein near the fusion peptide.
Recently, it has been reported that regions HR1 and HR2 aid in bringing
the fusion peptide into the proximity of the transmembrane domain,
thereby facilitating membrane fusion.[13,38,39] Docking results of the T1 site showed the maximum
docking score and largest buried area, suggesting that T1 is a better
substrate binding site. We also examined TMPRSS2 recognition sites
located in the three-dimensional structure of S-protein. The first
target site was observed at a position adjoining the furin site
(amino acids 837–848); it was partially buried and shielded
by S1 domains, and the entire loop region was poorly exposed to
solvent. In contrast, T2, the second TMPRSS2 recognition site
(amino acids 976–986) in SARS-CoV-2 S-protein, was observed
to be completely buried inside the spike S1 domains with no
access to the external solvent. This raises an intriguing
question. If both TMPRSS2 sites are buried and shielded inside the
spike glycoprotein by S1 domains, how would TMPRSS2 interact with
spike glycoprotein? Furthermore, the distance from the S1 domain to
the TMPRSS2 binding site measures 105 Å, whereas the membrane-exposed
structure of TMPRSS2 extends only 42 Å from the membrane. This
makes it challenging for TMPRSS2 to recognize its target site on spike
glycoprotein without prior cleavage by furin protease.[8,40] It is reasonable to speculate that furin protease acts first on
the spike glycoprotein in the S1/S2 region and cleaves the spike protein
into S1 (ACE2 and CD26 binding region) and S2 (trimerization domain),
resulting in the complete exposure of the S2 domain and TMPRSS2 recognition
sites as suggested previously.[8,18,40]
To assess the efficiency of binding of TMPRSS2 with identified
cleavage sites T1/S2′ and T2 on the S2 subunit of S-protein,
we tested the clinically proven protease inhibitors Camostat, Nafamostat,
Upamostat, and Bromhexine hydrochloride (BHH). Camostat, used therapeutically
for unrelated clinical conditions, was shown to inhibit influenza
viral replication,[21,41] while Nafamostat is a potent
inhibitor of MERS S-protein-mediated membrane fusion.[42] BHH is a Food and Drug Administration-approved mucolytic
agent and a specific inhibitor of TMPRSS2.[43,44] Upamostat is another serine protease inhibitor under consideration
for clinical trials.[45] All four drugs bind
to the active site of TMPRSS2 with a high degree of precision and
specificity (Figure and Figure S5). We demonstrate that Camostat,
Upamostat, and BHH preferentially bind to a specific location in the
catalytic pocket of TMPRSS2, while Nafamostat binds to three additional
binding residues in two modes. Notably, residues H296,
E299, S441, Q438, and W461 of the TMPRSS2 catalytic pocket are highly
conserved and interact with all four drugs. Furthermore, these potential
TMPRSS2 inhibitors share several polar,
charged, and hydrophobic interactions (Figure S5). The specific binding of these drugs with a high degree
of precision indicates that they could potentially bind and impede
the interaction between the S-protein and TMPRSS2. The TMPRSS2 catalytic pocket
interacts with the linker region connecting
α1/α2 and α3/α4 of the S2 domain. These interactions involve
residues D745 and T747 of the α1/α2 helix region, residues
N977, D978, L981, R983, D985, and K986 of the α3/α4 helix
region of T2, and residues K835, Y837, D839, C840, L841, D843, I844,
R847, and R848 of T1, indicating that the cleavage sites identified
are the regions where TMPRSS2 cleaves the S2 domain of spike protein,
leading to membrane fusion and viral entry into the host cell. Therefore,
this conserved epitope can be targeted for the development of vaccines
and therapeutic drugs. On the basis of these findings, we propose
a model of interaction of TMPRSS2 with the S2 subunit of spike protein
of SARS-CoV-2 (Figure 5).
Figure 5
Proposed model of the TMPRSS2 binding S2 unit of SARS-CoV-2 spike
glycoprotein. The three protomers of spike protein are colored accordingly.
In addition to the recent findings demonstrating a role for host
genetic mutations in altered SARS-CoV-2 virulence,[22] we discovered a number of other genetic mutations in human
TMPRSS2 in our whole-exome sequencing analysis of nearly 60 000
humans (Figure S8a) derived from next-generation
sequencing data from the GTEx portal and the gnomAD v3.1 repository.[46] The exome sequence data were filtered to extract
only putative loss-of-function (pLOF) and missense mutations
occurring in the exome of human TMPRSS2, encoded on chromosome
21. All missense and deleterious mutations (SNPs, alleles,
or variants) were tabulated for further structural and binding analyses.
Among all of these alleles, only eight mutations were found to be
located in the active site pocket of TMPRSS2, where they could be involved
in the interaction with the target SARS-CoV-2 spike protein or any
drug candidate. Furthermore, tissue-wide expression profiling (www.GTExportal.org) of TMPRSS2
reveals that its expression is highest in the prostate,
followed by the colon (Figure S8), which
could explain why men are more susceptible to SARS-CoV-2 infection
than women.
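The two-stage variant filter described above (retain pLOF and missense variants, then flag those in the active-site pocket) can be sketched as follows. The consequence labels treated as pLOF, the record layout, and the toy variants are all assumptions made for illustration and are not data from this study; only the active-site positions are taken from the residue list given earlier in the text.

```python
# Active-site positions as listed in the text (Q276 ... W461).
ACTIVE_SITE = {276, 296, 299, 300, 301, 340, 342, 389, 390, 419, 438, 441, 461}

# Consequence labels treated as pLOF here are an assumption of this sketch.
PLOF = {"stop_gained", "frameshift_variant",
        "splice_donor_variant", "splice_acceptor_variant"}

def filter_variants(variants):
    """Keep pLOF and missense variants, then pick those in the active site."""
    kept = [v for v in variants
            if v["consequence"] == "missense_variant" or v["consequence"] in PLOF]
    in_pocket = [v for v in kept if v["position"] in ACTIVE_SITE]
    return kept, in_pocket

# Hypothetical toy records, invented only to exercise the filter.
toy_variants = [
    {"position": 50, "consequence": "synonymous_variant"},
    {"position": 160, "consequence": "missense_variant"},
    {"position": 296, "consequence": "missense_variant"},
    {"position": 342, "consequence": "synonymous_variant"},
    {"position": 441, "consequence": "stop_gained"},
]

kept, in_pocket = filter_variants(toy_variants)
print(len(kept), "variants kept;", [v["position"] for v in in_pocket], "in the pocket")
```

A real analysis would read the variant table exported from the repository and map each variant to its residue position before applying the same two filters.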