Tao Liang1, Yuan Yuan2, Ran Wang1, Yanzhi Guo1, Menglong Li1, Xuemei Pu1, Chuan Li1. 1. College of Chemistry and College of Computer Science, Sichuan University, No. 29 Jiuyanqiao Wangjiang Road, Chengdu 610064, People's Republic of China. 2. College of Management, Southwest University for Nationalities, No. 16 South Section 4, Yihuan Road, Chengdu 610041, People's Republic of China.
Abstract
It has already been suggested by researchers that there should be multiple intermediate states in the activation process for G-protein-coupled receptors (GPCRs). However, the intermediate states are very short-lived and hardly captured by the experiments, leading to very limited understanding of their structural features and drug efficacies. In this work, a novel joint strategy of targeted molecular dynamics simulation, conventional molecular dynamics simulation, and virtual screening is developed to address the problems. The results from 10 intermediate conformations obtained from the work reveal that the ligand pocket is very unstable and fluctuates between the inactive state and the active one in the case of ligand-free, in particular for ECL2 as a gate-keeper of the ligand-binding. The ligand-binding site could be stable in the active state with a small volume and a completely closed ECL2, only when the G-protein-binding region is fully activated. In addition, the activations of the ligand-binding pocket and G-protein-binding site are relatively independent and exhibit a loose allosteric coupling, which contributes to the existence of multiple intermediate conformations. Interestingly, the screening performance of the agonists does not increase on increasing the overall activity of the intermediate state, but is dependent on the activated extent of the ligand pocket. The receptor is prone to bind the agonist when closing ECL2 and reducing the ligand-binding pocket volume, whereas it is more favorable for binding the antagonist when opening ECL2 and increasing the pocket volume. These observations added to previous studies could help us better understand the activation mechanism of GPCRs and provide valuable information for drug design.
It has already been suggested by researchers that there should be multiple intermediate states in the activation process for G-protein-coupled receptors (GPCRs). However, the intermediate states are very short-lived and hardly captured by the experiments, leading to very limited understanding of their structural features and drug efficacies. In this work, a novel joint strategy of targeted molecular dynamics simulation, conventional molecular dynamics simulation, and virtual screening is developed to address the problems. The results from 10 intermediate conformations obtained from the work reveal that the ligand pocket is very unstable and fluctuates between the inactive state and the active one in the case of ligand-free, in particular for ECL2 as a gate-keeper of the ligand-binding. The ligand-binding site could be stable in the active state with a small volume and a completely closed ECL2, only when the G-protein-binding region is fully activated. In addition, the activations of the ligand-binding pocket and G-protein-binding site are relatively independent and exhibit a loose allosteric coupling, which contributes to the existence of multiple intermediate conformations. Interestingly, the screening performance of the agonists does not increase on increasing the overall activity of the intermediate state, but is dependent on the activated extent of the ligand pocket. The receptor is prone to bind the agonist when closing ECL2 and reducing the ligand-binding pocket volume, whereas it is more favorable for binding the antagonist when opening ECL2 and increasing the pocket volume. These observations added to previous studies could help us better understand the activation mechanism of GPCRs and provide valuable information for drug design.
β2-Adrenergic receptor (β2AR),
as an important receptor of class A G-protein-coupled receptors (GPCRs),
is ubiquitous in the smooth muscle throughout the body and regulates
smooth muscle relaxation in the airways and vasculature.[1−3] Thus, it is also an important target for cardiac and asthma drugs
and has been extensively characterized by many experimental works
and computational ones over several decades. In 2007, Kobilka and
Stevens first resolved the inactive human β2AR crystal
structure bound to a high-affinity inverse agonist carazolol.[4] In the following years, more and more β2AR crystal structures were obtained, and most of them were
bound to inactive ligands,[5,6] thus being in inactive
states. In 2011, Rasmussen resolved the first active crystal structure
of β2AR,[7] which is bound
by a nucleotide-free Gs heterotrimer (Gα, Gβ, and Gγ)
and a high-affinity agonist (BI-167107).As revealed, the activation
of GPCRs from the inactive state to
the active one is accompanied by a series of large conformational
rearrangements. Many biochemical, biophysical, and other studies indicated
that GPCR exists in a dynamic equilibrium between the inactive state
and the active one,[6−9] in which there are multiple intermediate states, and different ligands
would shift the equilibrium toward different directions, in turn influencing
the signal transmission.[10] However, the
process is very fast and the intermediate states are very short-lived,
which are hardly captured by the experiments. Consequently, the dynamic
behavior of GPCRs in the activation process has been hardly characterized
by the experiments, although its importance was well-recognized. Molecular
dynamics (MD) simulation can provide a full atomic level view of protein
structures over time. Thus, it has been successfully applied to study
the dynamics behaviors of various proteins,[11−14] including GPCRs.[15−18] Previous MD studies on GPCRs mainly focused on structures of GPCRs,[19−22] the interaction between various ligands and GPCRs,[23−30] and internal water pathways.[18,31] In addition, the activation
mechanism of GPCRs was also a concern.[16,32−34] However, the activation of GPCRs generally occurs on a microsecond
time scale, which significantly limits classic all-atom MD studies
on the activation process started at the inactive state because of
the power of the computer. Some conventional all-atom MD methods studied
the activation pathway induced by different ligands and the deactivation
process for two A class GPCRs within nanosecond and microsecond timescales,[32,34] which provided valuable information for the dynamics change in some
key residues and regions. In addition, to overcome the limitation
of the simulation time and to bridge the gap between the computational
timescales and the experimental ones, some biased MD methods were
used to investigate the activation process of GPCRs,[35−38] including targeted molecular dynamics (TMD) simulation.[33,39,40] TMD can accelerate the transition
process between two existing states with the aid of an external force
and has been successfully applied to elucidate the large-scale conformational
transition between two states of proteins.[41,42] In our previous work, we also used the TMD method to overcome high
barriers of the large conformational variations within the nanosecond
timescale to study the activation process of β2AR
from the inactive state to the active one.[33] Although some structural features of the intermediate state were
obtained based on one representative conformation in the activation
process, the TMD conformation is not identical to the real space and
hence not appropriate to further study its drug efficacies. As accepted,
virtual screening based on MD conformations is an effective way to
design drugs for target proteins.[28,29,43] It was reported that the virtual screening based
on the MD conformations of GPCRs outperforms those based on crystal
structures because MD conformations incorporate the flexibility of
receptor.[28] However, in the previous studies,
the drug selection of the intermediate states in the activation process
was seldom concerned, disfavoring the designs of diverse drugs targeting
GPCRs.On the whole, although GPCRs have aroused considerable
interests
from experiments and computations, the information concerning the
activation intermediates and their drug efficacies is still very limited
and highly desired for further study. Thus, in this work, we propose
a facile yet efficient strategy to address the above issue for the
representative β2AR through a combination of TMD,
conventional MD (CMD), and virtual screening. First, we utilized TMD
to overcome the high barriers of the large conformational changes
in the activation process and get some initial seeds representing
different intermediate stages. Then, we further performed 150 ns CMD
simulations for each seed and assembled all these equilibrium trajectories
to obtain reasonable intermediate conformations close to the real
activation space. Finally, we studied their selectivity to ligands
with the help of the virtual screening method based on the representative
MD conformations, focusing on what structural features of the receptor
would favor the screening toward the agonists and antagonists.
Results and Discussion
Structural Features for
the Representative
Conformations in the Activation Process
The root-mean-square
deviation (rmsd) value between the inactive crystal structure and
the active one was calculated to be 4.22 Å for all backbone atoms
of the receptor. It can be seen from Figure that the rmsd values of the conformations
derived from the six 150 ns CMD trajectories are in the range between
∼1.5 and ∼5.0 Å, with respect to either the active
crystal structure or the inactive crystal structure, exhibiting significant
differences. The result implies that the CMD trajectories based on
the TMD seeds could capture the large conformation variations in the
activation process. To obtain the representative frames, we further
used the k-means method to cluster all CMD conformations
and obtained 80 categories based on the rmsd values of the backbone
atoms of the receptor. Then, we chose representative groups to further
characterize their structures. The selection is based on the populations
of the groups and considered together with their rmsd values, so that
the chosen groups could be representative for the activation process.
Consequently, 10 groups were obtained. Figure shows the populations of the 10 clusters
and the rmsd values of the representative conformations from the 10
clusters. It can be seen that their rmsd values cover the range between
∼1.5 and ∼4.8 Å, thus being representative for
the activation space.
Figure 1
rmsd distributions for all conformations derived from
the six 150
ns CMD trajectories.
Figure 2
Popularity of conformations for the 10 representative clusters
and the rmsd values (Å) of all backbone atoms to 2RH1 and 3SN6 for the 10 central
conformations selected from the 10 representative clusters.
rmsd distributions for all conformations derived from
the six 150
ns CMD trajectories.Popularity of conformations for the 10 representative clusters
and the rmsd values (Å) of all backbone atoms to 2RH1 and 3SN6 for the 10 central
conformations selected from the 10 representative clusters.To characterize the conformation
change of the ligand-binding pocket
in the activation process, four indicators (pocket volume, distance
between Ser2075.46 and Tyr3167.43 within the
pocket,[33] distance between Phe193ECL2 and the pocket residue Tyr3087.35, and rmsd of ECL2)
were taken into account in terms of the following observations. A
comparison between the inactive crystal structure (PDB ID: 2RH1)[4] and the active crystal structure (PDB ID: 3SN6)[7] already revealed significant differences in their structures.[17,44,45] As known, the agonist is smaller
than the antagonist and could form hydrogen bonding with the receptor,
which necessitates a contraction of the binding pocket in the active
state. Thus, the pocket volume is considered as an indicator. Compared
to the inactive crystal structure, the inward bulge of TM5 centered
around Ser2075.46 was found in the active state for β2AR.[46] Our previous TMD observations[33] indicated that Tyr3167.43 has the
highest flexibility among all residues of the ligand-binding pocket
in the activation process, and Ser2075.46 is second to
it. Consequently, the distance between Ser2075.46 and Tyr3167.43 could act as one indicator to reflect the dynamics variation
of the ligand pocket. ECL2 has been proposed to act as a “gatekeeper”
for ligand binding and to associate with the specificity of ligand
binding.[47,48] In addition, the distance between the residue
Phe193ECL2 in ECL2 and the pocket residue Tyr3087.35 was also a concern because a comparison between the fully active
crystal structure and the inactive one indicated that there is a significant
movement of 2–2.5 Å for the two residues, which was widely
served as one main feature for the formation of the liplike structure
in the ligand-binding pocket.[3] Thus, the
rmsd of ECL2 and the distance of Phe193ECL2···Tyr3087.35 are selected as the other two indicators. The 4 indicators
characterizing the ligand-binding pocket were calculated for the 10
representative conformations from the 10 groups and are listed in Table .
Table 1
Eight Indicators of Ten Intermediate
Conformations, the Inactive Crystal Structure (2RH1), and the Active
One (3SN6)a,b
ligand-binding
pocket
G-protein-binding
region
rmsd_all
rmsd_ECL2
rmsd_NPxxY
rmsd_ICL2
conformation states
ina
act
volume
d_207–316
ina
act
d_193–308
d_TM3–TM6
d_ionic
ina
act
ina
act
2RH1
4.22
641
16.20
0.42
5.86
8.37
11.15
1.22
3.93
all-c1
4.52
4.77
566
16.15
2.75
2.80
19.15
7.12
12.78
0.61
1.54
1.49
4.26
all-c2
2.03
3.06
535
14.63
2.49
2.42
12.33
8.48
13.38
0.73
1.05
1.52
4.62
all-c3
1.88
2.92
449
16.40
2.16
2.13
6.27
7.65
12.22
0.91
1.39
1.37
4.55
all-c4
1.82
3.02
382
16.44
1.55
1.57
5.46
8.43
14.65
0.56
1.28
1.82
4.14
all-c5
2.89
2.58
526
16.16
0.86
0.91
7.47
15.91
12.21
0.93
1.01
2.46
3.73
all-c6
3.87
3.31
534
14.92
1.60
1.57
15.56
13.99
17.05
1.12
1.32
3.21
1.89
all-c7
2.71
2.11
432
14.09
0.70
0.77
9.05
14.60
12.34
0.94
1.07
3.14
3.05
all-c8
3.34
2.80
654
16.23
1.39
1.41
10.70
15.42
18.92
1.19
0.75
3.55
1.40
all-c9
3.04
2.53
335
13.29
1.22
1.29
6.69
16.94
12.01
0.86
1.12
2.48
3.48
all-c10
3.60
1.74
462
13.33
0.85
0.76
3.94
16.20
17.96
1.35
0.36
3.65
0.90
3SN6
4.22
397
13.93
0.42
3.84
15.41
18.97
1.22
3.93
The 10
intermediates are obtained
by clustering based on the rmsd values of the backbone atoms of the
receptor.
rmsd_all, rmsd_ECL2,
rmsd_NPxxY,
and rmsd_ICL2 denote the rmsd values (in Å) for all backbone
atoms of β2AR, ECL2 residues, NPxxY residues, and
ICL2 residues, respectively; ina and act denote the rmsd (in Å)
with respect to the inactive crystal structure (2RH1) and the active
one (3SN6),
respectively; volume (in Å3) denotes the volume of
the ligand-binding pocket; d_207–316 denotes
the distance (in Å) between Ser2075.46 and Tyr3167.43; d_193–308 denotes the distance
(in Å) between Phe193ECL2 and Tyr3087.35; d_TM3–TM6 denotes the distance (in Å)
between TM3 and TM6, measured by the distance between Arg1313.50 and Leu2726.34; and d_ionic denotes
the distance (in Å) between Arg1313.50 and Glu2686.30.
The 10
intermediates are obtained
by clustering based on the rmsd values of the backbone atoms of the
receptor.rmsd_all, rmsd_ECL2,
rmsd_NPxxY,
and rmsd_ICL2 denote the rmsd values (in Å) for all backbone
atoms of β2AR, ECL2 residues, NPxxY residues, and
ICL2 residues, respectively; ina and act denote the rmsd (in Å)
with respect to the inactive crystal structure (2RH1) and the active
one (3SN6),
respectively; volume (in Å3) denotes the volume of
the ligand-binding pocket; d_207–316 denotes
the distance (in Å) between Ser2075.46 and Tyr3167.43; d_193–308 denotes the distance
(in Å) between Phe193ECL2 and Tyr3087.35; d_TM3–TM6 denotes the distance (in Å)
between TM3 and TM6, measured by the distance between Arg1313.50 and Leu2726.34; and d_ionic denotes
the distance (in Å) between Arg1313.50 and Glu2686.30.As accepted,
the variation of the ligand-binding region would pass
to the intracellular G-protein-binding site. The highly conserved
NPxxY has been considered to be closely associated with the G-protein
binding and could serve as an important activation indicator.[49−51] The fully active crystal structure of β2AR-Gs displays
a significant difference in the intracellular loop 2 (ICL2) conformation
with respect to the inactive state because of the interaction between
ICL2 and G-protein in the active state.[52] When the G-protein-binding site adopts an active conformation, the
ionic lock between Arg1313.50 and Glu2686.30 in the cytoplasmic end is broken and the intracellular end of TM6
obviously moves outward, leading to an opening of the G-protein-binding
region in the receptor cytoplasmic face.[53] Thus, we selected the four indicators to characterize the changes
of the G-protein-binding site in the activation process, such as the
distance between TM3 and TM6, the distance between Arg1313.50 and Glu2686.30 (ionic lock), rmsd values of NPxxY motif
and ICL2. Herein, we referred the crystallographically observed G-protein-coupling
conformational state (PDB ID: 3SN6)[7] as the canonical
active state and referred the crystallographically bound with the
inverse agonist (PDB ID: 2RH1)[4] as the standard inactive
state. Figure qualitatively
shows the similarities of the intermediate conformations to the active
crystal structure for the eight activation indicators above, in which
the colors from light to dark denote the activity from low to high.
Figure 3
Diagram
of the activities expressed by colors for the 8 indicators
in the 10 representative conformations, derived from the data in Table . The colors from
deep to light denote the activities from high to low. Deeper the color,
closer to the active state. Lighter the color, closer to the inactive
state. Upper four horizontal bars correspond to the ligand-binding
pocket, whereas the remaining four horizontal bars are associated
with the G-protein-binding region.
Diagram
of the activities expressed by colors for the 8 indicators
in the 10 representative conformations, derived from the data in Table . The colors from
deep to light denote the activities from high to low. Deeper the color,
closer to the active state. Lighter the color, closer to the inactive
state. Upper four horizontal bars correspond to the ligand-binding
pocket, whereas the remaining four horizontal bars are associated
with the G-protein-binding region.For the all-c1 conformation, the total rmsd from the inactive
state
is high, up to 4.52 Å, but its rmsd from the active state is
also large, up to 4.77 Å. The conformation presents large deviations
from the inactive and active states. The volume of the ligand-binding
pocket is 566 Å3, located between 641 Å3 of the inactive 2RH1 and 397 Å3 of the active 3SN6 but closer to that of the inactive 2RH1. The distance between
the Phe193ECL2 residue and the pocket residue Tyr3087.35 is 19.15 Å, much greater than those of the inactive
state and the active one, indicating a significant opening of ECL2
from the pocket. As a result, the rmsd values of ECL2 is the largest
(2.75 Å from 2RH1 and 2.80 Å from 3SN6) among all 10 intermediates, much greater than the
rmsd value (only 0.42 Å) between the inactive 2RH1 and the active 3SN6. For the G-protein-binding
site, the distance between TM3 and TM6 is 7.12 Å, close to that
of 2RH1 (8.37
Å). The distance between Arg1313.50 and Glu2686.30 (viz., ionic lock) is 12.78 Å, which is slightly
larger than that of 2RH1 (11.15 Å) but much smaller than 18.97 Å of the active 3SN6. The rmsd of NPxxY
is 0.61 Å from the inactive 2RH1 and 1.54 Å from the active 3SN6, much closer to
the inactive state. The rmsd values of ICL2 are 1.49 Å from 2RH1 and 4.26 Å
from 3SN6. As
observed above, all eight indicators in the state are almost close
to the inactive 2RH1, as reflected by the light colors in Figure .The conformation all-c2 does not
present a large deviation from
the inactive state, judged from its 2.03 Å value of rmsd. The
rmsd is 3.06 Å from the active 3SN6 state. The value of its pocket volume
(535 Å3) is located between the active state and the
inactive one, displaying an intermediate feature. The distance between
Ser2075.46 and Tyr3167.43 is 14.63 Å, approaching
13.93 Å of the active state. The separation between Phe193ECL2 and Tyr3087.35 is 12.33 Å, indicating
that ECL2 also significantly moves away from the binding pocket. The
rmsd values of ECL2 (2.49 Å from 2RH1 and 2.42 Å from 3SN6) are close to the
all-c1 state. The distance of the ionic lock (8.48 Å) is larger
than that of all-c1 but still closer to the inactive state with respect
to the active one. Similar to all-c1, the NPxxY and ICL2 regions in
the all-c2 conformation match those of the inactive state, judged
from their rmsd values. Overall, most features of the state are similar
to those of the all-c1 state despite the small differences, both close
to the inactive state.For the all-c3 state, its total rmsd
values are 1.88 Å from
the inactive 2RH1 and 2.92 Å from the active 3SN6. The volume of the ligand-binding pocket
is 449 Å3 and close to 397 Å3 of the
active 3SN6.
The distance between Ser2075.46 and Tyr3167.43 within the ligand-binding pocket is 16.40 Å, almost equal to
16.20 Å of the inactive state. The rmsd values of ECL2 are 2.16
Å from the inactive 2RH1 and 2.13 Å from the active 3SN6, closer to the inactive
state. The distance between Phe193ECL2 and the pocket residue
Tyr3087.35 is 6.27 Å, significantly lower than those
of the two conformations above (viz., all-c1 and all-c2) and close
to the value of the inactive 2RH1 (5.86 Å). For the G-protein-binding site, the
distance between TM3 and TM6 is 7.65 Å, approaching that of 2RH1 (8.37 Å). The
distance between Arg1313.50 and Glu2686.30 (ionic
lock) is 12.22 Å, which is slightly higher than that of 2RH1 (11.15 Å) but
much lower than 18.97 Å of the active 3SN6. The rmsd values of NPxxY are 0.91 Å
versus the inactive 2RH1 and 1.39 Å versus the active 3SN6, significantly closer to the inactive
state. The rmsd values of ICL2 are 1.37 Å from 2RH1 and 4.55 from 3SN6. As observed above,
all parameters in the state are almost close to the inactive 2RH1 state with the exception
of the pocket volume, as reflected by the light colors in Figure .Similar to
the all-c3 state, the total rmsd of all-c4 to the inactive 2RH1 is small (1.82 Å),
whereas its rmsd to the active state (3.02 Å) is relatively large.
Its pocket volume (382 Å3) is very close to the active
state (397 Å3). In addition, the rmsd of ECL2, the
distance of Ser2075.46···Tyr3167.43, and the distance of Phe193ECL2...Tyr3087.35 are also close to the inactive state. Compared to the inactive 2RH1, the ionic lock
of the all-c4 state is significantly increased to 14.65 Å, whereas
the distance of TM3–TM6 is almost unchanged. The two indicators
display one inconsistent change. Judging from the rmsd value, the
NPxxY region is also similar to the inactive state. Although ICL2
presents a significant variation with respect to the inactive state,
it is still closer to the inactive state compared to the active state.The total rmsd values of all-c5 are 2.89 Å from 2RH1 and 2.58 Å
from 3SN6. Its
pocket volume is 526 Å3, located between the active
state and the inactive one. It can be seen from Table that the distance of Ser2075.46 and Tyr3167.43, the rmsd value of ECL2, and the distance
of Phe193ECL2 and Tyr3087.35 are very close
to those of the inactive 2RH1. For the G-protein-binding region, NPxxY does not
present a significant variation relative to 2RH1, whereas ICL2 displays
the feature of one intermediate state, evidenced by their rmsd values.
The ionic lock is slightly increased, but still far away from 18.97
Å of the active state. By contrast, the distance of TM3–TM6
is slightly larger than the active state, displaying full activation.
The activation features in the state are significantly enhanced with
respect to the four states above, in particular for the G-protein
region. But, the ligand-binding region is almost in the inactive state.The total rmsd values of all-c6 are large, achieving 3.87 Å
from the inactive 2RH1 and 3.31 Å from the active 3SN6. The pocket volume, the distance of Ser2075.46 and Tyr3167.43, and the rmsd of ECL2 are almost
in one intermediate stage between the inactive state and the active
one, as reflected by Table and Figure . The large distance of Phe193ECL2 and Tyr3087.35 (15.56 Å) indicates that ECL2 significantly deviates from the
binding pocket. The ionic lock is completely broken, similar to the
active state. The distance between TM3 and TM6 also approaches the
active state. ICL2 presents a more significant change than the five
states above, closer to the active state. However, the NPxxY region
is still closer to the inactive state. Overall, the activity of all-c6
is almost in an intermediate state.The total rmsd values of
all-c7 are 2.71 Å versus 2RH1 and 2.11 Å
versus 3SN6.
Although the rmsd relative to the inactive state is smaller than that
of all-c6, the pocket volume and the distance of Ser2075.46 and Tyr3167.43 approach the active state. The distance
of Phe193ECL2 and Tyr3087.35 (9.05 Å) is
also larger than those of the inactive and active states. However,
the rmsd of ECL2 only presents a slight deviation from the inactive
and active crystal structures. Similar to all-c5, the ionic lock does
not present a significant increase, but the distance of TM3 and TM6
is close to the active state. Compared to the two crystal structures,
ICL2 significantly changes, whereas the NPxxY region only exhibits
a slight variation.The total rmsd of all-c8 is 3.34 Å
relative to the inactive
state, larger than its rmsd (2.80) from the active 3SN6. However, the ligand
pocket is almost in the inactive state, as evidenced by the four pocket
indicators in Table and Figure . But,
the G-protein region almost approaches the active state.For
all-c9, its total rmsd is 3.04 Å relative to 2RH1 and is 2.53 Å
relative to 3SN6. Its pocket volume is 335 Å3, smaller than that
of the active 3SN6. The distance of Ser2075.46 and Tyr3167.43 is close to the active site. ECL2 approaches an intermediate state,
as reflected by Figure . Although the distance of the ionic lock only presents a slight
increase with respect to the inactive state, the distance of TM3 and
TM6 is significantly increased and larger than the active state. ICL2
and NPxxY regions are closer to the inactive state, judged from their
rmsd values. Overall, the ligand-binding pocket of the state is almost
in an active state, whereas the three indicators of the G-protein
region present a reverse trend with the exception of the distance
of TM3 and TM6.The all-c10 state presents a large deviation
from the inactive
state (3.60 Å of rmsd) and a small deviation from the active
state (1.74 Å of rmsd). As reflected by Table and Figure , its ligand pocket and G-protein region almost resemble
the active state, exhibiting full activation for the two regions.On the whole, all-cl, all-c2, all-c3, and all-c4 are close to the
inactive state, in particular for the G-protein region. However, some
structural features in these states are still present to be partly
activated, for example, the binding pocket volume of all-c3 and all-c4
and the ionic lock of all-c4. The G-protein region in all-c8 almost
approaches the active state, but its ligand-binding pocket still matches
the inactive state. By contrast, the ligand-binding pockets in all-c7
and all-c9 almost approach the active state, whereas some features
in their G-protein regions are still close to the inactive state,
except for the distance of TM3–TM6. Similarly, for the all-c5
and all-c6 states, some features display activation but some are in
the inactive space or the intermediate state, as reflected by Figure . Thus, these conformations
should be in different intermediate states. Only the all-c10 state
is fully activated, closest to the active 3SN6.Interestingly, the observations
reflect that the activations of
the ligand-binding region and the G-protein-binding region are relatively
independent, exhibiting a loose allosteric coupling. Some spectroscopic
results suggested that the conformations of the ligand-binding pocket
and the cytoplasmic domain of β2AR should not be
tightly allosterically coupled.[10,54] A microsecond timescale
MD simulation to study the deactivation process of β2AR also showed that there is no high correlation in conformations
between the G-protein couple domain and the ligand-binding region.[32] Our results provide further evidences on the
molecular level for the loose allosteric coupling between the two
regions. The independence is mainly attributed to the instability
of the ligand-binding region, which presents obvious fluctuation between
the inactive and active states in the activation process. The ligand-binding
pocket shows high activity only in the all-c10 state, in which the
G-protein-binding region is also highly active. The observation further
confirms that only when the G-protein-binding region is fully activated
and the G-protein is coupled, the ligand-binding site could be stable
in a closed (small volume) and active state.[55]In addition, it is worth to note the ECL2 variation in the
activation
process. Although the rmsd of ECL2 between the active state and the
inactive one is minor (only 0.42 Å), our results indicate that
the distance between Phe193ECL2 and Tyr3087.35 exhibits very large fluctuations from 3.8 Å to about 20 Å
among these states. Except for the fully activated all-c10 state,
the distances of all other nine states are larger than that of the
active crystal structure. The ECL2 in the all-c3, all-c4, all-c5,
and all-c9 states is relatively close to the inactive state (vide Figure a), whereas in the
all-c1, all-c2, all-c6, and all-c8 states, it exhibits large position
deviations (vide Figure b), in particular for all-c1 with the largest position change (19.15
Å). ECL2 in the ligand-bound 2RH1 and 3SN6 structures presents a relatively closed
state compared to the ligand-free states obtained from the work. ECL2
is completely closed only when the receptor is fully activated like
all-c10 and the active 3SN6 engaging the G-protein.
Figure 4
Superimposition of the
10 central conformations derived from the
10 representative clusters on the inactive crystal structure (2RH1) and the active
crystal structure (3SN6) to display ECL2 changes. The ECL2 region and the two residues Phe193ECL2 and Tyr3087.35 are highlighted. (a) all-c3
(light orange), all-c4 (blue), all-c5 (deep pink), all-c7 (deep purple),
all-c9 (limon), and all-c10 (cyan) are superimposed on the inactive 2RH1 (green) and the
active 3SN6 (light
pink). Displacements of ECL2 in the six intermediate conformations
from the two crystal structures are relatively small. (b) all-c2 (light
blue), all-c6 (yellow), all-c8 (orange), and all-c1 (purple) are superimposed
on the two crystal structures. ECL2 in the four intermediate conformations
presents large deviations from the two crystal structures, in particular
for the all-c1 (purple). The region involved in the residues Phe193ECL2 for the four intermediate conformations are highlighted
in the black circle.
Superimposition of the
10 central conformations derived from the
10 representative clusters on the inactive crystal structure (2RH1) and the active
crystal structure (3SN6) to display ECL2 changes. The ECL2 region and the two residues Phe193ECL2 and Tyr3087.35 are highlighted. (a) all-c3
(light orange), all-c4 (blue), all-c5 (deep pink), all-c7 (deep purple),
all-c9 (limon), and all-c10 (cyan) are superimposed on the inactive 2RH1 (green) and the
active 3SN6 (light
pink). Displacements of ECL2 in the six intermediate conformations
from the two crystal structures are relatively small. (b) all-c2 (light
blue), all-c6 (yellow), all-c8 (orange), and all-c1 (purple) are superimposed
on the two crystal structures. ECL2 in the four intermediate conformations
presents large deviations from the two crystal structures, in particular
for the all-c1 (purple). The region involved in the residues Phe193ECL2 for the four intermediate conformations are highlighted
in the black circle.In general, the GPCR could bind the G-protein only when it
is activated
either by the agonist or by self-activation. The previous MD observation
indicated that the fully activated GPCR, coupling the G-protein, would
transition spontaneously from the active state to the inactive one
when removing the G-protein.[32] Crystallographic
GPCR structures bound by the agonist but not coupled by the G-protein
still match the inactive conformation.[8] On the basis of our observations above and the previous findings,
it can be drawn that the discontinuous activation for the G-protein
region and the ligand-binding region should contribute to the multiple
intermediate states existing in the activation process. When the G-protein
domain is activated and coupled by the G-protein, the receptor is
stabilized in a fully active state.
Selectivity
of the Representative Conformations
to Ligands
To study the selection of these intermediate states
to ligands, we established a ligand validation set, which includes
40 known β2AR ligands (20 agonists and 20 antagonists)
and 1440 decoys of actives. They were extracted from the DUD-E database
(total of 1480 molecules). Structures of the 20 agonists and 20 antagonists
are shown in Figures S1 and S2 in the Supporting Information. We docked the ligand set to the 10 representative
conformations.We plot the receiver operating characteristic
(ROC) curve and calculate the area under the ROC curve (AUC) to obtain
the screening performance of each representative conformation, as
shown in Figure .
It can be seen that the 10 states could improve the screening to the
agonists (AUC-agonist > 0.5) with respect to the inactive 2RH1 (AUC-agonist = 0.44).
In addition, the screening to the antagonists is superior to the agonists
for the nine intermediate states, except for the all-c10 state. As
observed above, ECL2 is closer to the ligand-binding pocket for all-c3,
all-c4, all-c5, all-c7, all-c9, and all-c10 states, whereas ECL2 of
all-c1, all-c2, all-c6, and all-c8 states is far away from the pocket.
The virtual screening results show that the performance to identify
the agonist is higher for those states with ECL2 close to the pocket
than those with ECL2 far from the pocket. In addition, it is found
that the all-c3 and all-c4 conformations with a low activity have
agonist affinities similar to the all-c7 and all-c9 conformations
with a relatively high activity. Also, the all-c8 state with a relatively
high activity presents agonist affinity similar to the all-c1 state
with a low activity. As revealed above, the activity in the G-protein
region is low for the all-c4 state, but its pocket is in an active
state with a small volume (382 Å3). Thus, it favors
the agonist binding (AUC-agonist = 0.64). For all-c8, the activity
in the G-protein region is high, but its ligand-bound pocket is large
(654 Å3), leading to poor affinity to the agonist
(AUC-agonist = 0.50). In addition, the states like all-c3, all-c4,
all-c7, all-c9, and all-c10, which have relatively high activities
in the ligand-binding pocket, present to be conducive to screen the
agonists. The observations indicate that the receptor’s ability
to screen the agonists does not increase as the overall activity of
the receptor increases, different from some previous findings that
the higher the activity of the conformation, the stronger the enrichment
to the agonist.[56] The result provides further
support to the observations from some MD and nuclear magnetic resonance
experimental works that the active conformation is not necessarily
the lowest energy state for agonist binding.[10,55,57] Our results clearly show that the activation
extent of the binding pocket is crucial to select the ligands.
Figure 5
Performances
to identify the agonists and antagonists for the two
crystal structures (2RH1 and 3SN6)
and the 10 intermediate conformations derived from the 10 representative
clusters based on the rmsd values of the backbone atoms of the receptor.
Performances
to identify the agonists and antagonists for the two
crystal structures (2RH1 and 3SN6)
and the 10 intermediate conformations derived from the 10 representative
clusters based on the rmsd values of the backbone atoms of the receptor.
Effects
of the Ligand-Binding Pocket on Ligand
Screening
How does the pocket affect ligand binding, and
what type of pocket is suitable for agonist binding? To address the
questions, we clustered all CMD trajectories into 80 categories only
based on the rmsd values of the ligand-binding pocket residues rather
than the total rmsd above. On the basis of the populations of the
clusters and the differences between their rmsd values, 10 representative
clusters were selected. The central conformations of the 10 clusters
were used to perform further ligand screening. Figure S3 and Table
S1 in the Supporting Information show the
four indicators of the ligand pocket for the 10 representative conformations.
It can be seen that the 10 conformations exhibit different features
for the 4 indicators, thus to a large extent representing the ligand-pocket
variation in the activation process. The screening results of the
10 conformations are listed in Table S2. We arranged the 10 representative conformations according to their
pocket sizes from small to large because of the significant effects
of the pocket volume on the ligand screening observed above. It can
be seen from Figure that the performance to distinguish the antagonist from the decoys
increases with increasing pocket volume. However, the screening to
the agonists is different. When the pocket volume is less than about
450 Å3, the performance for the screening agonist
increases as the pocket volume increases because the large pocket
volume should favor the entrance of the agonist. But, when the pocket
volume is larger than about 450 Å3, the performance
to identify the agonist gradually reduces with increasing pocket volume
because the too large pocket disfavors the interaction between the
small-size agonist and the receptor.
Figure 6
Effects of the ligand-binding pocket volumes
on the performance
for screening the agonists and antagonists.
Effects of the ligand-binding pocket volumes
on the performance
for screening the agonists and antagonists.The performance to identify the agonists is the best for
the lb-c7
state with 411 Å3 of the pocket volume, in which the
activity of the pocket is in an intermediate state (vide Figure S3). However, the performance to identify
the antagonists is the highest for lb-c2 with 742 Å3 of the pocket volume. The result may be attributed to the fact that
the antagonist of β2AR is generally bigger than the
agonist. Thus, the pocket with a large volume is more favorable for
binding the antagonist, whereas a relatively small volume (about 400–450
Å3) is more suitable for binding the agonist.As accepted, the distance between the Phe193ECL2 residue
and the pocket residue Tyr3087.35 is associated with the
opening and closing of the lid ECL2. Thus, to gain insight into its
effects on ligand binding, we arranged the order of 10 representative
conformations in terms of the distance from small to large, as shown
in Figure . It can
be seen that the more the opening of ECL2, the better the performance
to identify the antagonists. Different from the antagonist screening,
the performance for screening the agonist increases as the distance
increases, only when the distance is in the range of about 4–9
Å. The reason should be that the increased distance would facilitate
the entrance of the agonist. But, when the distance becomes too large,
for example, greater than 9 Å, the performance to select the
agonist reversely reduces. As reported, Phe193ECL2 and
Tyr3087.35 could form a hydrophobic region to interact
with the phenyl ring of the agonist, in turn capturing the agonist.[46] If the distance is too large, the hydrophobic
interaction would be impaired, even disappeared, thus reducing the
performance for screening the agonist. For the lb-c1 and lb-c2 states,
ECL2 is far away from the ligand-binding pocket, as evidenced by 16.29
and 15.59 Å distances, respectively. Consequently, their selections
to the agonists are low, but the selections to the antagonists are
high. The observation indicates that the receptor is not conducive
to bind the agonist when ECL2 is too far away from the pocket. However,
when ECL2 is gradually closed, it would disfavor the entrance of the
antagonist because of the fact that the antagonist molecule is larger
than the agonist. Thus, when ECL2 is in an open state, it is conducive
for the entrance and binding of the antagonist. In addition, we also
arranged the 10 clusters in the increasing order of the distance between
Ser2075.46 and Tyr3167.43 inside the pocket
(vide Figure ) to
observe the effect of the distance on ligand screening. As reflected
by Figure , there
is no obvious correlation between the screening performance and the
distance, indicating that the distance does not play a significant
role in screening the agonist and antagonist, despite the fact that
the active crystal structure of β2AR shows an obvious
inward bulge at Ser2075.46 compared to the inactive crystal
structure.[46] Previous observations derived
from the interaction fingerprints (IFP) analysis of β2AR and its cocrystallized ligands indicated that full agonists possessing
catechol or a catechol-mimicking moiety could provide more hydrogen
bond donors to form extra hydrogen bonds with the Ser2075.46 residue.[58] However, the antagonist and
agonist used in this work lack catechol or the catechol-mimicking
moiety. Thus, it is difficult for them to form H-bonding with the
Ser2075.46 residue unlike the full agonist, leading to
the result that the distance between Ser2075.46 and Tyr3167.43 does not play an observable role in influencing the receptor
screening to the agonists and antagonists.
Figure 7
Effect of the distance
between Phe193ECL2 and Tyr3087.35 (d_193–308) on the performance
for screening the agonists and antagonists.
Figure 8
Effect of the distance between Ser2075.46 and Tyr3167.43 (d_207–316) inside the ligand-binding
pocket on the performance for screening the agonists and antagonists.
Effect of the distance
between Phe193ECL2 and Tyr3087.35 (d_193–308) on the performance
for screening the agonists and antagonists.Effect of the distance between Ser2075.46 and Tyr3167.43 (d_207–316) inside the ligand-binding
pocket on the performance for screening the agonists and antagonists.In addition, although we did not
calculate and analyze the effect
of the G-protein coupling on the ligand binding in this work, some
observations on the issue were reported by one recent experimental
work.[55] Their pharmacological and biochemical
evidences suggested that the G-protein coupling with the active receptor
would influence the passage of ligands to the orthosteric-binding
site, for example, impeding disassociation of the ligand bound in
the receptor or association of the ligand in the free receptor state.
An early experimental work also reported that the uncoupling G-protein
from β2AR would accelerate the agonist dissociation from the
receptor.[59] Our observations above revealed
that the small volume of the ligand-binding pocket and the small distance
between the gate keeper ECL2 and the pocket in some intermediate states
disfavor the entrance of the large-size antagonist but favor the binding
of the small-size agonists. If the ligand is already bound in the
closed and active pocket, its disassociation is certainly difficult.
Thus, our results are in line with the experimental findings.[55,59] In addition, our observations further indicate that the ligand-binding
pocket frequently fluctuates from the open and inactive conformation
to the closed and active one in the absence of the G-protein coupling.
Thus, it still gives a preference to the agonist in one intermediate
state with the active ligand-binding pocket but the low activity of
the G-protein region, whereas the agonist binding would accelerate
the activation of the intermediate state, in turn favoring the G-protein
coupling.
Conclusions
We combined
the TMD and CMD to obtain 10 representative intermediate
conformations in the activation process of β2AR.
The 10 intermediates exhibit different structural features in the
ligand-binding region and the G-protein-bindingregion. Despite the
small differences in the ligand-binding pocket between the active
crystal structure and the inactive one, our observation indicates
that the ligand pocket is very unstable and fluctuates between the
inactive state and the active one, when the receptor is in the free-ligand
state. The ligand-binding site could be stable in the active state
with a small volume and a completely closed ECL2, only when the G-protein-binding
region is fully activated. The observation provides a molecular evidence
for some experimental observations. In addition, our results further
confirm that the activations of the ligand-binding pocket and G-protein-binding
site are discontinuous and relatively independent, which contribute
to the existence of multiple intermediate conformations in the activation
process.The virtual screening to identify the agonists and
antagonists
for the10 representative conformations indicates that the screening
performance to the agonist does not increase as the structure activity
of the receptor increases, but is dominated by the activation of the
ligand-binding pockets. The receptor is beneficial for binding the
agonist when closing ECL2 (the distance of 6 Å) and increasing the pocket volume
(>500 Å3). However, the distance between the residues
Ser2075.46 and Tyr3167.43 in the pocket does
not present correlation with the screening performance to the agonists
and antagonists. The observations could provide further insights,
adding into previous studies, for a better understanding of the GPCR
activation mechanism and providing valuable information for designing
functionally specific drugs targeting GPCRs.
Materials
and Methods
Workflow
In this work, a novel ensemble
strategy was proposed, as shown in Figure . First, the TMD simulation was used to produce
6 representative initial seeds in the activation process. Then, six
150 ns CMD simulations were performed based on the six seeds. For
all six 150 ns CMD trajectories, the clustering was applied to obtain
the ten main intermediate states in terms of the rmsd of all backbone
atoms of the receptor with respect to the inactive crystal structure.
For the 10 intermediates, their structural features were analyzed,
mainly focusing on the ligand-binding pocket and the G-protein-coupling
region. Finally, the virtual screening was used to explore their selectivity
to the ligands.
Figure 9
Schematic representation of the workflow.
Schematic representation of the workflow.
System Preparation
The inactive crystal
structure (PDB ID: 2RH1)[4] and the active crystal one (PDB ID: 3SN6)[7] of β2AR were used as initial coordinates,
in which all nonreceptor molecules were removed, but the crystal water
inside the receptor was retained. For the ICL3 missed in the two crystal
structures, we used MODELLER V9.10[60] to
rebuild it, owing to its important role in the interaction with G-protein.[49] The receptor structure was inserted into a well-prepared
phospholipids bilayer, palmitoyl oleoyl phosphatidyl choline (POPC),[61] and we removed the lipids whose P atoms fall
within 0.5 Å of the receptor. Then, chloride ions were introduced
to neutralize the receptor charge using columbic potential terms.
Water molecules were added using xleap utility. The rectangle periodic
box was set up so that any solute atom is at least 9 Å from any
of the box edges. Finally, there are about 30 000 water molecules
in each system. The AMBER03[62] force field
was used for the receptor, and the GAFF force field[63] was utilized for POPC. Water was represented by the TIP3P
model.[64] To remove bad contacts in the
initial geometry of the system, a 20 000 step minimization
was performed using a steepest descent method combined with the conjugate
gradient algorithm. After the minimization, the system was heated
from 0 to 300 K within 120 ps. The coordinates and trajectories were
stored for the following TMD and CMD simulations.
TMD Simulation
To obtain the target
structure from the initial structure in the TMD simulation, a restraint
defined in terms of the rmsd to the target structure was applied as
an extra energy term in the force field through the following time-dependent
energy functionwhere N is the number of
atoms, K is the force constant, rmsd(t) is the rmsd of the structure by superposition to the target structure
at time t, and rmsd0(t) is the prescribed rmsd relative to the target structure at time t. With the aid of the applied force, the moving structure
is gradually driven toward the target structure. In this work, the
inactive and active MD coordinates obtained above were used as the
initial and the target structures for the TMD simulation, respectively.
The NPT ensemble was utilized at a temperature of 300 K and a pressure
of 1 bar. The length of all bonds involving a hydrogen atom was constrained
by the SHAKE algorithm. Nonbonding interactions were handled with
a 10 Å atom-based cutoff. The particle-mesh-Ewald method was
applied to fix the long-range electrostatic interactions with a 10
Å nonbonded cutoff. The force constant applied to the backbone
atoms of the receptor was set to be 1 kcal mol–1 Å–2, and the integration step is 2 fs.[33] The simulation time is 6 ns. The trajectories
were saved at an interval of 1 ps. The obtained 6000 conformations
were then clustered into six clusters by the k-means
algorithm based on the rmsd of the backbone atoms of the receptor,
as reflected by Figure S4 in the Supporting Information. In addition, it can be seen from Figure S4 that the TMD simulation achieves an equilibrium within a time of
6 ns. The representative conformation of each cluster was selected
to further conduct the CMD simulation.
CMD Simulation
To get the conformation
close to the real state, a 150 ns CMD simulation was further performed
with the periodic boundary condition in the NPT ensemble at 300 K
for each of the six TMD representative conformations under similar
simulation conditions to the TMD above. For analysis, the trajectories
were saved at an interval of 2 ps in the CMD simulations.All
MD simulations were performed using the sander module of AMBER 12.0
package.[65] All MD results were analyzed
using the analysis module of AMBER 12.0 and VMD[66] as well as some other developed specific trajectory analysis
softwares.
Clustering Analysis
For all six 150
ns CMD trajectories, clustering was carried out using the k-means algorithm[67,68] included in the ptraj program from the AmberTools package. Clustering was
based on the mass-weighted rmsd structural similarity matrix. In this
work, two types of cluster analysis were constructed. One was based
on the rmsd of the backbone atoms of all residues, and the other was
based on the ligand-binding pocket residues (His93, Trp109, Thr110,
Asp113, Val114, Val117, Thr118, Thr195, Tyr199, Ala200, Ser203, Ser204,
Ser207, Trp286, Phe289, Phe290, Asn293, Lys305, Tyr308, Ile309, Asn312,
and Tyr316).[32,37] We chose several top clusters
in the populations with different rmsd values for further structural
analysis and virtual screening, in which the cluster center was selected
as the representative structure.
Virtual
Screening
A set of ligands
with different efficacies was selected to explore the ligand selectivity
of the intermediate states, which includes 40 known β2AR ligands (20 agonists and 20 antagonists) collected from ZINC[69] database, GPCR-ligand[70] database, and PubChem[71] database. Their
structures are shown in Figure S1 and Figure S2 in the Supporting Information. In addition, the decoys
of actives were extracted from the DUD-E[72] database, and the ratio of active ligands to decoys (Nactives/Ndecoys) was kept
at 1:36 in terms of the rules adopted in DUD-E database. Finally,
there are 1480 small molecules in the ligand set, which are docked
to the representative receptor structures. All of the input files
were prepared by AutoDockTools 1.5.6 package. AutoGrid 4.2 was used
to create affinity grids centered on the active site. The box size
is 75 Å × 75 Å × 75 Å with 0.375 Å spacing,
which is enough to cover the ligand-binding site. The dockings with
the rigid receptor and the flexible ligand were performed using AutoDock
4.2,[74] in which 100 separate docking calculations
were carried out for each ligand to ensure the accuracy of the result.
Each docking calculation was composed of 1 000 000 energy
evaluations using the Lamarckian genetic algorithm. The docking pose
with the lowest binding energy was selected as the best binding mode
for further scoring.The ROC[75] plot
is a common tool to evaluate the performance of a structure to discriminate
the actives from the decoys, which is a curve of the true positive
rate (TPR) versus the false positive rate (FPR). They are calculated
in terms of the following equationswhere TP (true positive)
is the number of
actives in the positive class, FN (false negative) is the number of
actives in the negative class, FP (false positive) is the number of
decoys in the positive class, and TN (true negative) is the number
of decoys in the negative class. AUC is the area under the ROC curve,[76] which has been widely used to quantify the screen
performance.[77,78] The larger the AUC value, the
better the screening performance of the structure to the actives.
For example, if the AUC value is equal to 0.5, it represents random
screening. When it is equal to 1, the structure has the strongest
screening ability for the actives in the ligand-training set. The
calculation of the AUC value is involved in the ligand-binding energy
but not equal to the energy. It could reflect the overall affinity
of the structure for one class of actives (agonists or antagonists)
from one ligand set consisting of the agonists, antagonists, and decoys,
rather than the screening performance of the structure to one ligand.
Authors: Olaf Fritze; Sławomir Filipek; Vladimir Kuksa; Krzysztof Palczewski; Klaus Peter Hofmann; Oliver P Ernst Journal: Proc Natl Acad Sci U S A Date: 2003-02-24 Impact factor: 11.205
Authors: Ka Young Chung; Søren G F Rasmussen; Tong Liu; Sheng Li; Brian T DeVree; Pil Seok Chae; Diane Calinski; Brian K Kobilka; Virgil L Woods; Roger K Sunahara Journal: Nature Date: 2011-09-28 Impact factor: 49.962
Authors: Noureldin Saleh; Passainte Ibrahim; Giorgio Saladino; Francesco Luigi Gervasio; Timothy Clark Journal: J Chem Inf Model Date: 2017-05-08 Impact factor: 4.956
Authors: Yinglong Miao; Sara E Nichols; Paul M Gasper; Vincent T Metzger; J Andrew McCammon Journal: Proc Natl Acad Sci U S A Date: 2013-06-18 Impact factor: 11.205