Passive permeability of a drug-like molecule is a critical property assayed early in a drug discovery campaign that informs a medicinal chemist how well a compound can traverse biological membranes, such as gastrointestinal epithelial or restrictive organ barriers, so it can perform a specific therapeutic function. However, the challenge that remains is the development of a method, experimental or computational, which can both determine the permeation rate and provide mechanistic insights into the transport process to help with the rational design of any given molecule. Typically, one of the following three methods are used to measure the membrane permeability: (1) experimental permeation assays acting on either artificial or natural membranes; (2) quantitative structure-permeability relationship models that rely on experimental values of permeability or related pharmacokinetic properties of a range of molecules to infer those for new molecules; and (3) estimation of permeability from the Smoluchowski equation, where free energy and diffusion profiles along the membrane normal are taken as input from large-scale molecular dynamics simulations. While all these methods provide estimates of permeation coefficients, they provide very little information for guiding rational drug design. In this study, we employ a highly parallelizable weighted ensemble (WE) path sampling strategy, empowered by cloud computing techniques, to generate unbiased permeation pathways and permeability coefficients for a set of drug-like molecules across a neat 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine membrane bilayer. Our WE method predicts permeability coefficients that compare well to experimental values from an MDCK-LE cell line and PAMPA assays for a set of drug-like amines of varying size, shape, and flexibility. Our method also yields a series of continuous permeation pathways weighted and ranked by their associated probabilities. Taken together, the ensemble of reactive permeation pathways, along with the estimate of the permeability coefficient, provides a clearer picture of the microscopic underpinnings of small-molecule membrane permeation.
Passive permeability of a drug-like molecule is a critical property assayed early in a drug discovery campaign that informs a medicinal chemist how well a compound can traverse biological membranes, such as gastrointestinal epithelial or restrictive organ barriers, so it can perform a specific therapeutic function. However, the challenge that remains is the development of a method, experimental or computational, which can both determine the permeation rate and provide mechanistic insights into the transport process to help with the rational design of any given molecule. Typically, one of the following three methods are used to measure the membrane permeability: (1) experimental permeation assays acting on either artificial or natural membranes; (2) quantitative structure-permeability relationship models that rely on experimental values of permeability or related pharmacokinetic properties of a range of molecules to infer those for new molecules; and (3) estimation of permeability from the Smoluchowski equation, where free energy and diffusion profiles along the membrane normal are taken as input from large-scale molecular dynamics simulations. While all these methods provide estimates of permeation coefficients, they provide very little information for guiding rational drug design. In this study, we employ a highly parallelizable weighted ensemble (WE) path sampling strategy, empowered by cloud computing techniques, to generate unbiased permeation pathways and permeability coefficients for a set of drug-like molecules across a neat 1-palmitoyl-2-oleoyl-sn-glycero-3-phosphatidylcholine membrane bilayer. Our WE method predicts permeability coefficients that compare well to experimental values from an MDCK-LE cell line and PAMPA assays for a set of drug-like amines of varying size, shape, and flexibility. Our method also yields a series of continuous permeation pathways weighted and ranked by their associated probabilities. Taken together, the ensemble of reactive permeation pathways, along with the estimate of the permeability coefficient, provides a clearer picture of the microscopic underpinnings of small-molecule membrane permeation.
The
ability of a drug candidate to cross (or permeate) lipid membranes
is essential for achieving the required absorption, distribution,
metabolism, excretion, and toxicity (ADME/Tox) profile.[1] Understanding the mechanism by which membrane
permeation can occur is invaluable for the rational improvement of
drug bioavailability. While active transport via transmembrane proteins
can contribute to ADME/Tox properties[2] of
charged endogenous compounds,[3,4] passive diffusion across
lipid bilayers is believed to be the predominant mechanism for membrane
transport of drug candidates in a variety of cell types.Given
the time-consuming and costly nature of in vitro experiments[5−8] for measuring passive membrane permeation, there has been great
interest in theoretical strategies for predicting drug permeability.
The first theoretical model of permeation was based on Overton’s
rule[9] developed over a century ago,[10] which proportionally relates passive membrane
permeability to the oil–water partition coefficient.[11] In this model, permeability is correlated with
the ability to partition into the lipid phase, assuming that permeation
is driven only by a molecule’s inherent lipophilicity. More
sophisticated quantitative structure–permeability relationship
(QSPR) models have also emerged that are built from experimentally
derived permeability measurements, along with a set of input physiochemical
descriptors from a database of training molecules.[12] QSPR models typically rely on properties such as the polar
surface area, molecular weight, hydrogen bond count, and octanol–water
partition coefficient to statistically infer new permeability coefficients
for drug-like compounds.[13] Over the last
few years, advanced machine learning (ML) techniques have also been
applied to predict permeability from either molecular descriptors
or fingerprints. Reviews of ML applications to predict permeability
exist elsewhere[14] and will not be covered
in detail here, yet some examples of ML approaches worth mentioning
include models for Parallel Artificial Membrane Permeability Assay
(PAMPA)-like assays,[5] models for Caco-2
cell permeability,[15] as well as blood–brain
barrier[16] and CNS[17] permeability models. Although ML and QSPR methods can be fast, cheap,
and accurate within the domain of chemical space of the training set,
such models have not provided any mechanistic insights to guide drug
design to favor permeation.In principle, molecular dynamics
(MD) simulations can provide the
most detailed mechanistic insights on membrane permeation. While MD
simulations have captured small-molecule permeation processes that
occur within a microsecond,[18] membrane
permeability of drug-like molecules can be orders of magnitude beyond
this timescale. To access these longer timescales, many simulation
studies have used a method proposed by Marrink and Berendsen[19] that is based on an inhomogeneous solubility-diffusion
(ISD) model in which the permeability coefficient of a small molecule
is estimated from the free energy and diffusion rate profiles across
the membrane. Such studies have used enhanced sampling techniques
that apply either external biasing forces or modified Hamiltonians
[e.g., adaptive biasing force method,[20] free energy perturbation,[21] restrained
potential of mean force (PMF) calculations,[22] constant pH simulations,[23] and umbrella
sampling.][24] A caveat of such studies,
however, is that the results rely on the free energy profile, which
is a function of the chosen reaction coordinate and may miss potential
rate-limiting steps in orthogonal coordinates.[25,26] Alternatively, a few simulation studies have employed methods that
combine short discontinuous trajectories to estimate long-timescale
observables (e.g., milestoning[27,28] and Markov state models[29]). While all the above studies have revealed
some microscopic details of the permeation process, they have not
provided pathways of membrane permeation processes that are both unbiased
in the dynamics and continuous.To bridge the gap between computation
and experiment in a quantitative
manner, we combine the WE path sampling strategy with cloud-based
computing to not only provide direct estimates of permeability coefficients
but also continuous, atomistic permeation pathways of drug-like compounds
across a model lipid bilayer. We focus initially on tacrine, a rigid
small-molecule inhibitor of acetylcholinesterase (zero rotatable bonds),
to evaluate the efficiency of several WE protocols for estimating
permeability coefficients. We then apply the most efficient protocol
to (R)-zacopride and sotalol, which are substantially more complex,
flexible molecules (three and six rotatable bonds, respectively),
for detailed analysis of their membrane permeation mechanisms. All
three compounds are weakly basic and obey Lipinski’s “rule
of five”[30,31] for orally active therapeutics.
Theoretical
Background
In this section, we introduce
a general model to predict permeability using the kinetic rate constant
of membrane crossing. The permeation rate constant can be defined
using the mean-first-passage time (MFPT) of the crossing event, as
will be outlined with the Hill relation below. In principle, however,
this rate constant could be obtained from any kinetic method.
Kinetic
Model of Membrane Permeability
The passive
membrane permeation of a drug-like molecule can be modeled as a two-state,
first-order process, in which there is a large free energy barrier
to desolvate the molecule upon entering the lipid bilayer. During
this process, a molecule diffuses across a lipid bilayer to reach
one of two aqueous compartments, namely, the donor (D) or the acceptor
(A)Here, kD→A is the forward rate constant for passive diffusion
of the molecule
from the donor to the acceptor compartment, while A→D is the reverse rate constant. Assuming that the system is symmetric
(i.e., the volumes of compartment D and A are identical) and CD and CA are the
concentrations of the molecule in the two compartments, the rate of
change in the concentrations (or populations) of the molecule can
be described by the following set of ordinary differential equationswhere v’s are volumetric
permeation ratesAlternatively, these
rates can be measured
by the rate of change in the concentration of the respective species.
For examplewhere dmD is the
amount of molecule (in terms of mass or molar mass), originating from
compartment D, passing through the bilayer, and moving into compartment
A within time dt. Here, S is the
surface area of the membrane, and lD is
the depth of the effective “reaction” volume of compartment
D or the “unstirred layer” as referred by Missner and
Pohl.[11] Instead of those in the bulk water,
molecules in this aqueous layer adjacent to the membrane “react”
with the membrane surface, potentially leading to the initiation of
the permeation process. The detailed balance between the concentrations
of the molecules in bulk water and in this layer should be safely
regarded as rapidly maintained by diffusion in a relatively uniform
aqueous environment—a process that is much more rapid than
permeation through a lipid membrane.[18]Although the volumetric permeation rates in eq provide one route to estimate the kinetics
of molecular transport across the membrane, the most common quantification
is the membrane permeability coefficient, Pm, typically measured in logarithmic units of cm/s. This coefficient
is based on Fick’s first law of diffusion and linearly connects
the net flux of the molecule across the membrane at steady state, Jm, to the difference in concentrations of the
molecule in compartments D and ABy imposing the homogenous solubility-diffusion (HSD) model,[18] where the membrane is in a quasi-steady state,
it can be readily demonstrated that the proportionality constant from eq is related to physical
properties of the permeant-membrane system. If the concentration change
is assumed to be linear across the membrane, the permeability coefficient
equals DK/h, where D is the diffusion constant of the molecule inside the membrane, K is the oil–water partition coefficient, and h is the membrane thickness. The relationship between Pm and DK has been experimentally
verified over 6 log units,[32] with a few
notable exceptions that are likely attributed to active transport.In addition to eq , the membrane flux can also be written as the difference between
the molecular influx, JD→A, and
outflux, JA→D, through the membraneBoth the influx and outflux represent the amount of molecules
that
passes through a surface area, S, within a given
amount of time. Mathematically, this can be expressed asBy comparing eqs and 8, one can see that the in-/outflux and
the volumetric permeation rates are relatedSubstituting eqs , 7, and 9 into eq and assuming unequal concentrations
in the two compartments, that is, CD ≠ CA, we arrive atFurthermore, either (1) under the assumption of a symmetric system,
such as the case in the permeation of a small molecule through a neat
membrane, where one can set kD→A = kA→D and lD = lA, or (2) under the steady-state
consideration of the WE strategy, where CA ≡ 0 (see below for details), the form of the permeability
coefficient in eq can be reduced toIn our model, by
default, we consider the size of the unstirred
layer, lD, to half the length of the aqueous
part of the simulation box along the lipid bilayer normal. A more
refined value of lD can be determined
by the size of the free energy basin of donor compartment D,[33] but preliminary tests suggest that the permeability
coefficient is somewhat insensitive to the exact value of lD, especially since permeability coefficients
are usually expressed in log units (see Figure S2 for details). While eq is derived here for a two-state permeation process,
the same equation can be derived from a multistate model, where states
D and A are connected by a series of intermediates inside the lipid
bilayer to yield apparent rate constants for the overall process, kD→A or kA→D, as described by Parisio et al.[33]
Calculation
of Permeation Rate Constants Using Weighted Ensemble
Simulations
The weighted ensemble path sampling strategy[34,35] enables simulations of processes that are orders of magnitude longer
than the simulations themselves.[36−39] This greatly enhanced sampling
results from an iterative resampling procedure (at fixed time intervals
τ) that replicates trajectories to occupy less-visited regions
of configurational space—typically defined by a progress coordinate
toward the target state (also known as a reaction coordinate, set
of order parameters, or collective variables) that is divided into
bins (Figure A). The
WE resampling strategy is unbiased in two senses: (1) the underlying
dynamics is not altered by the resampling and (2) the trajectories
are assigned statistical weights that are rigorously tracked, such
that the resampling procedure is statistically unbiased. The former
allows us to potentially obtain an ensemble of continuous pathways
for the process of interest, and the latter allows us to evaluate
unbiased estimates of steady-state averages, such as the rate constants
into any arbitrary state.[40,41] Thus, the WE strategy
provides an ideal framework for direct simulations of drug membrane-permeability
pathways and calculations of permeability coefficients.
Figure 1
Basic weighted
ensemble protocol and system setup. (A) Illustration
of the WE protocol for a membrane permeability simulation in which
a one-dimensional progress coordinate z (eq ) is divided into bins,
and iterative rounds of dynamics propagation and a resampling procedure
are performed with the goal of providing even coverage along the coordinate.
As seen in the upper left, two trajectories (solid dots) originating from the left-most bin each occupy a previously empty
bin after N rounds of dynamics (curved arrows). The resampling procedure then replicates the trajectories in these
newly occupied bins to maintain a target number of two trajectories
per bin. (B) Simulation workflow used by permeability floe to directly
calculate permeability coefficients, including one round of WE resampling
(using WESTPA) and dynamics propagation for each WE iteration (see Figure S1 for further details). (C) Snapshot
of the simulation system from a trajectory of a “rule of five”
permeate, sotalol, crossing the periodic membrane. All water molecules
have been removed for clarity. Lz is the z-component of the simulation box, while z′ is the distance from the center of mass of the
permeate and the center of mass of the membrane (straight
blue lines).
Basic weighted
ensemble protocol and system setup. (A) Illustration
of the WE protocol for a membrane permeability simulation in which
a one-dimensional progress coordinate z (eq ) is divided into bins,
and iterative rounds of dynamics propagation and a resampling procedure
are performed with the goal of providing even coverage along the coordinate.
As seen in the upper left, two trajectories (solid dots) originating from the left-most bin each occupy a previously empty
bin after N rounds of dynamics (curved arrows). The resampling procedure then replicates the trajectories in these
newly occupied bins to maintain a target number of two trajectories
per bin. (B) Simulation workflow used by permeability floe to directly
calculate permeability coefficients, including one round of WE resampling
(using WESTPA) and dynamics propagation for each WE iteration (see Figure S1 for further details). (C) Snapshot
of the simulation system from a trajectory of a “rule of five”
permeate, sotalol, crossing the periodic membrane. All water molecules
have been removed for clarity. Lz is the z-component of the simulation box, while z′ is the distance from the center of mass of the
permeate and the center of mass of the membrane (straight
blue lines).According to the Hill
relation, the permeation rate constant, or
steady-state probability flux, into a target state of interest (compartment
A) is exactly the inverse MFPT[35,42]For drug membrane permeation, fD→ASS is the steady-state
(SS) probability flux arriving at state A from D, while pD is the fraction of trajectories that are more recently
in state D than in state A.Within the WE framework, steady-state
fluxes are mimicked by introducing
an initial state, a target state, and a “recycling”
condition, where a trajectory will be returned to the initial state
once it enters the target state to prevent re-entry (i.e., first passage).
In the context of the permeation system, the initial and the final
states naturally correspond to compartments A and D. In this way,
the weight of the recycled trajectory can be recorded as a probability
flux into the target state. The time average of instantaneous fluxes
arriving at state A from D, f̂D→A, in a WE simulation provides an estimate of the steady-state fluxThe second equality comes from the recycling condition, which
implies
that p̂D, the instantaneous fraction
of trajectories more recently in D than in A, is identically one.
In practice, taking the time average using pre-steady-state data could
result in a statistical bias toward events with short barrier-crossing
times. To address this issue, we employed the rate from event durations
(RED) scheme to apply a correction using the event-duration distribution
to eq , to mitigate
such bias.[43]With regards to membrane
permeability, the probabilities from eq are proportional to
the concentrations by a constant. Given the total concentration, C0 = mtotal/Vwhere VD = VA = V due to the symmetry of
the system and mtotal = mD + ∑m + mA is the total amount of the permeant. By assuming mD ≫ ∑m, we have p̃D ≈ pD, and it can be seen from inspection that the rate constants in eq are the same as eq , and eq still holds true if populations
of a molecule in compartment D and A are represented by probabilities.
This procedure is like the transition rate-based counting method,[18] but instead of simply counting the number of
crossing events in a simulation, the probabilistic weight of each
crossing event is also considered.Ultimately, the expression
that we used to estimate the permeability
coefficient from WE steady-state fluxes can be derived by combining eqs and 13
Methods
System Preparation and
MD Equilibration
The final system
used in the WE simulations was prepared using several individually
constructed molecular systems that were pieced together in the following
way. First, ParmEd[44] was used to generate
input files containing Open Force Field Parsley v1.3.1a1 force field[45] parameters for the drug-like molecule as well
as Amber LIPID17[46,47] parameters for the POPC membrane.
Next, an initial solvated POPC membrane configuration was generated
using CHARMM-GUI v2.0[48] with the Membrane
Builder[49] input generator module and 50
POPC molecules per leaflet in a solution of TIP3P[50] water molecules. The solvated membrane was equilibrated
using a slightly modified set of CHARMM-GUI scripts such that the
OpenMM 7.5 MD[200] engine could be used with
a single GTX1080 GPU to equilibrate the system for 0.5 μs in
the NPT ensemble. For each drug-like molecule, a graph representation
of the molecule was converted to a three-dimensional structure using
OEChem Toolkit 3.2.0.0[51,52] followed by the generation of
a diverse set of conformers using Omega Toolkit 4.1.2.0.[53−55] The top 20 conformers ranked by Omega were each randomly oriented
in compartment D relative to the pre-equilibrated lipid bilayer and
solvated by a 2 nm layer of water using PACKMOL[56] at a density of 1 gm/cm3. The 2D chemical structures
were visualized using Picto 4.5.3.0,[57] and
3D molecular structures including the snapshots and movies shown in
this study were rendered using VMD 1.9.4.[58]Each of the 20 solvated drug-membrane systems were subjected
to energy minimization until convergence using the L-BFGS method and
then equilibrated in the NPT ensemble in two stages. In the first
stage, each system was gradually heated from 0 to 308 K over 0.01
ns, while applying a weak harmonic restraining potential with a force
constant of 2 kcal/mol/Å2 to all atoms of the drug-like
molecule. In the second stage, the force constant was gradually reduced
from 2.0 to 0.1 kcal/mol/Å2 over 0.06 ns of equilibration.
The resulting 20 equilibrated systems were used as starting conformations
for WE simulations.
Weighted Ensemble Simulations
WE
simulations were run
in the following manner using a Python API of the WESTPA 2.0 software
package.[59] To maintain non-equilibrium
steady state conditions, trajectories that reached a target state
of compartment A (i.e., z = 3.0 nm, see Figure ) were “recycled”,
starting a new trajectory from the initial state (compartment). A
one-dimensional progress coordinate was divided into bins using two
different schemes: a manual, fixed binning scheme and the minimal
adaptive binning schemes(MAB) scheme[60] (see
below). A resampling interval of 0.1 ns was applied with a target
number of 5 trajectories per bin. The simulations required 50 ns,
which corresponds to 500 WE iterations, to reach reasonable convergence
of the permeability coefficient.
Progress Coordinate and State Definitions
An effective
progress coordinate for a WE simulation captures the slowest motion
that is relevant to the rare-event process of interest. For membrane
permeation, an effective progress coordinate is the distance between
the center of mass of the molecule () to that of the lipid bilayer () along the unit vector normal to the bilayer surface, ẑGiven a lipid bilayer with a width
of ∼40 Å, a molecule with z′ < −20 Å is in compartment D, a molecule within −20
Å ≤ z′ ≤ 20
Å is inside the membrane with z′ = 0,
indicating the center of the membrane, and a molecule with z′ > 20 Å is in compartment A
(Figure A). A change
in z′ from anywhere smaller than
−20
Å to +30 Å is considered a membrane crossing event. Note
that the extra 10 Å beyond +20 Å accounts for the interfacial
solvation layer that will have different properties compared to bulk
water.The progress coordinate defined in eq works well in an ideal setup, where there
is a single lipid bilayer sandwiched between two aqueous compartments
of infinite volumes. However, correction is needed to account for
the periodic boundary conditions imposed by the simulation protocol.
This correction can be written aswhere Lz is the
length of the simulation box along ẑ. The corrected
progress coordinate, z, is guaranteed to change sign
correctly from negative to positive when the molecule crosses any
lipid bilayer, which can be in either the unit simulation cell or
its periodic neighbor (Figure C).
Binning Schemes
The fixed binning
scheme separates
the water compartments (z < −20 Å
or z > 20 Å) into fixed sized bins of 2 Å
and the membrane region (−20 Å < z < 20 Å) into 0.5 Å-wide bins. We used 30 dynamic linear
bins for all our simulation using the MAB scheme.[60]
Dynamics Propagation
Dynamics were
propagated using
the OpenMM 7.5 MD engine in the NPT ensemble. As required for WE simulations,
a stochastic thermostat was used to maintain the temperature at 308
K, that is, a Langevin thermostat with a collision frequency of 1
ps–1. Constant pressure was maintained at 1 atm
by anisotropically coupling the Monte Carlo membrane barostat to the
system with 30 fs between attempts to adjust the system volume. A
1 nm cutoff was used for short-range nonbonded interactions, while
the particle mesh Ewald (PME) method[61] was
applied for the treatment of both long-range electrostatics and Lennard-Jones
interactions. To enable a 2-fs timestep, the SHAKE[62] or SETTLE[63] algorithm was used
to constrain the lengths of bonds to hydrogens. The trajectories were
processed and analyzed using MDTraj.[64]
Reweighting Trajectories for a Steady State
To accelerate
convergence of the WE simulation to a steady state, trajectory weights
were adjusted using a WE steady-state (WESS) reweighting procedure
that makes use of rates among “arbitrarily” defined
bins.[35]To reweight trajectories
for a steady state, we applied the WESS reweighting procedure (see
Bhatt et al.[35] for full details). Briefly,
the WESS procedure uses the rates k of transition between bins (i and j) to solve for the steady-state (SS) bin populations pSS using the standard SS relation ∑pSSk = ∑pSSk. Trajectory weights are scaled iteratively at every fixed
resampling interval τw to match the p values. In the present study, the bins
used for the reweighting procedure are the bins used for the manual,
fixed binning scheme and reweighting was performed every τw = 5 ns.
Estimating Uncertainties
A Monte
Carlo bootstrap procedure
was used to estimate uncertainties in calculated permeability coefficients.[34] In short, given a simulation with a total of N iterations, Monte Carlo bootstrapping was performed on
the original WE dataset of fluxes into state A accumulated from individual
iterations, {f̂D→A(|i ∈ 1, 2, ..., m}, to generate 1000 subsamples.
The reported rate constant is the mean among the subsamples (eq ), and the associated
uncertainty is the 95% confidence interval.
Free Energy
Profiles
Free energy profiles have been
generated as a function of the WE progress coordinate z and “auxiliary” coordinates that are orthogonal to
the progress coordinate by symmetrizing the probability distribution
as a function of the z-coordinate such that the resulting
two sets of non-equilibrium steady state trajectories in opposite
directions could be combined to form a set of equilibrium trajectories.[65] The auxiliary coordinates are the following:The cosine
of the angle relative to ẑ, which is defined through
the vector product of the
unit electric dipole moment of the molecule and ẑ, quantifies the relative orientation of the molecule through the
membrane.The number
of hydrophobic contacts is
defined to be the number of aliphatic atoms of the lipid tails within
the 10 Å distance of any hydrophobic atom of the drug molecule,
following Rogers and Geissler in their studies of lipid insertion.[25] The hydrophobic atoms used in this study are
highlighted in Figure S6.The number of hydrogen bonds between
the drug and the membrane was identified using the Baker-Hubbard definition[66] as implemented in the MDTraj package. According
to this definition, any donating NH or OH is assumed to be in a hydrogen
bond with any accepting N or O if the bond angle is greater than 120°
and the bond distance is less than 2.5 Å.The end-to-end distance of each molecule
was calculated based on the largest separated atoms of the molecule
identified in the longest axis. The atoms used for each drug-like
molecule are highlighted in Figure S6.Auxiliary coordinates are separate from
the WE progress
coordinate and were calculated after the simulations were completed.
If the auxiliary coordinates are orthogonal to the progress coordinate,
the sampling along these auxiliary coordinates is equivalent to that
of unbiased, conventional “brute force” MD. If the auxiliary
coordinates are not orthogonal to the progress coordinate, the WE
strategy would enhance the sampling of these coordinates. Even if
the sampling of auxiliary coordinates is not extensive, the results
of our permeability simulations would not be as impacted as the results
of force-biasing methods by the choice of progress coordinate. In
addition, auxiliary coordinates can serve as useful tools for gaining
mechanistic insights into how a molecule crosses (or does not cross)
the membrane and which permeation pathways would lead to the permeability
estimated from the simulations.
Cloud Computing in Orion
Scientific computing in OpenEye’s
Orion cloud platform is patterned after the concept of flow-based
programming (FBP).[67] In FBP, a series of
compute kernels are defined to perform actions on input data and pass
(transformed) output data to subsequent kernels. FBP is designed around
the idea that information will flow from one kernel to another through
connections known as ports, with complex operations occurring in each
kernel process. Compute kernels within Orion’s FBP vocabulary
are known as “Cubes”. Cubes are connected to each other
through a series of ports, whereby data can flow from one Cube to
another in the form of one of several strongly typed data structures.
Within the Orion FBP framework, finalized workflows with a set of
logically connected cubes are known as “Floes”. All
Cubes and Floes are written in Python 3, where a Cube is an instance
of a Cube class and Floes are in many ways the equivalent of a Python
script. In Orion, all compute nodes are sourced from Amazon Web Services
(AWS). Each Cube runs on its own AWS instance that itself runs in
an isolated computing unit called a Docker container. The Orion platform
handles all sourcing and scheduling of Cubes onto AWS instances. Floes
are uploaded as Python-like packages, which can depend on other Python
packages sourced from Anaconda or public pip repositories.
OpenEye
Permeability Floe
The OpenEye Permeability
floe in Orion contains a series of Cubes that each performs one of
the following functions: system preparation, MD equilibration, WE
simulation, and permeability analysis of the membrane-permeate system
(see Figure B for
an example of the flow relationship diagram of the compute kernels
and above sections for details). The Simulation Manager and Segment
Runner Cube, respectively, handle most of the WE logic and the MD
propagation and, therefore, are connected to each other in a cyclic
fashion to enable bidirectional communication between the WE driver
and the MD engine.The Permeability Simulation Floe is hosted
on the Orion platform where all the simulation setup and actual computation
(including system preparation and WE/MD simulation) took place. The
Floe takes a molecule of interest as input, which can be either a
graph representation (2D) of a molecule sketched into the molecular
editor or a pre-generated 3D structure of a molecule. If a 2D molecule
was given as input, the Floe will automatically convert it to a 3D
structure using the OEChem Toolkit. The stereochemistry of the molecule
is automatically handled by the Omega Toolkit, which will respect
pre-defined stereochemistry if such information is provided in the
molecular graph. The Floe also exposes various parameters for system
preparation and the WE simulation including the option to turn on/off
the MAB scheme for adaptive binning[60] or
the bin probability reweighting for faster convergence to the steady
state[35] (Figure S3). The input molecules and the WE protocols used in this study were
set using the floe’s graphical user interface (GUI).Finally, the WE simulation is automatically parallelized and performed
on CPUs or GPUs from either spot or non-spot AWS instances sourced
by the Orion computing platform. Typically, the simulation will be
automatically scaled-up by Orion to several hundreds of GPUs or thousands
of CPUs per WE iteration. Permeability simulations are analyzed on-the-fly,
and a simulation report describing important features of the reactive
trajectory data is generated upon completion (Figure S4). The report provides the time-evolution of the
permeability coefficient with a 90% confidence interval, probability
distribution as a function of the progress coordinate z, and visualization of trajectories—both fully downloadable
trajectories as well as visual schematics of membrane crossing—that
successfully reached the acceptor (A) compartment with their associated
probabilities.
Results
Here, we present the results
from fully automated permeability
simulations performed on three “rule of five” molecules
(tacrine, zacopride, and sotalol) using the OpenEye Permeability Floe
package in the Orion cloud computing environment. These drug-like
molecules are weakly basic primary or secondary amines that vary in
size, shape, and number of rotatable bonds.
Evaluation of WE Protocols
To determine an effective
WE protocol for simulating the membrane permeability for a drug-like
molecule, we focused on tacrine, the simplest compound in this study
with zero rotatable bonds (Figure A). In particular, we assessed the advantages of applying
an adaptive binning scheme (MAB scheme) and WE steady-state (WESS)
reweighting procedure by testing four WE protocols on GPUs: (1) standard
WE with a manual, fixed binning scheme, (2) WE with the MAB scheme,
(3) WE with the WESS reweighting procedure, and (4) WE with the MAB
scheme and WESS reweighting (see Methods for
full details). We also ran a WE simulation using protocol #4 on CPU
cores rather than GPUs to perform a cost-benefit analysis of using
GPUs over CPU cores. All the WE protocols yielded permeability coefficients
(log Pm: −6.95––3.23)
that are in reasonable agreement with the value measured by MCDK-LE
(−4.64)[68] and PAMPA (−5.02
± 0.2),[69]Figure B and Table ).
Figure 2
Estimated permeability coefficients of tacrine. (A) Chemical
structure
of tacrine. (B) Average (log) permeability coefficients (with 95%
confidence intervals) calculated from each of the four WE simulations
with different protocols [see the legend in panel (C)]. The gray dashed
lines indicate the observed values from MDCK-LE and PAMPA experiments.
Dashed arrows indicate the raised permeability estimates by applying
WESS reweighting to Protocol 1 and 3 after the simulations were completed
(50 ns). (C) Time evolution of estimated logarithm of the permeability
coefficients (cm/s) for tacrine from four WE simulations, using a
fixed binning scheme with (regular, blue) or without
WESS (orange), and the MAB scheme with (green) or without WESS (purple), respectively. The solid
lines indicate the mean values of the estimates, and the shaded areas
indicate 95% confidence intervals. The dashed gray line indicates
the permeability coefficient measured by an MDCK-LE experiment.[68] The molecular time is represented as Nτ, where N is the number of WE iterations
and τ is the fixed time interval (100 ps) of each WE iteration.
(D) Snapshots of the tacrine molecule (cyan) passing
through the lipid bilayer (gray) at selected molecular
times. Water molecules in close contact with the molecule are highlighted
in magenta.
Table 1
Efficiencies of Different WE Protocols
in Predicting the Membrane Permeability (Log Pm) of Tacrine
compound
MAB
WESS
platform
total simulation
time (μs)
wall clock time (Orion
days)
predicted log Pm (cm/s)
tacrine
GPU
23.0
15.5
–5.54 ± 0.13 (−4.32 ± 0.11)a
×
GPU
23.0
16.2
–3.23 ± 0.09
×
GPU
8.2
7.1
–6.96 ± 0.16 (−3.46 ± 0.23)a
×
×
GPU
8.1
7.9
–4.27 ± 0.24
×
×
CPU
5.6
11.4
–5.20 ± 0.28
Permeability
estimates in parentheses
were calculated by applying the WESS reweighting procedure after the
simulation was finished.
Estimated permeability coefficients of tacrine. (A) Chemical
structure
of tacrine. (B) Average (log) permeability coefficients (with 95%
confidence intervals) calculated from each of the four WE simulations
with different protocols [see the legend in panel (C)]. The gray dashed
lines indicate the observed values from MDCK-LE and PAMPA experiments.
Dashed arrows indicate the raised permeability estimates by applying
WESS reweighting to Protocol 1 and 3 after the simulations were completed
(50 ns). (C) Time evolution of estimated logarithm of the permeability
coefficients (cm/s) for tacrine from four WE simulations, using a
fixed binning scheme with (regular, blue) or without
WESS (orange), and the MAB scheme with (green) or without WESS (purple), respectively. The solid
lines indicate the mean values of the estimates, and the shaded areas
indicate 95% confidence intervals. The dashed gray line indicates
the permeability coefficient measured by an MDCK-LE experiment.[68] The molecular time is represented as Nτ, where N is the number of WE iterations
and τ is the fixed time interval (100 ps) of each WE iteration.
(D) Snapshots of the tacrine molecule (cyan) passing
through the lipid bilayer (gray) at selected molecular
times. Water molecules in close contact with the molecule are highlighted
in magenta.Permeability
estimates in parentheses
were calculated by applying the WESS reweighting procedure after the
simulation was finished.Relative to WE simulation with a manual binning scheme and no reweighting,
use of the MAB scheme reduces the required total simulation time by
∼65% (by comparing Protocols 3 and 4) and use of the WESS reweighting
procedure reduces the required total simulation time by ∼60%
(by comparing Protocols 1 and 2, where Protocol 2 reached the final
estimate of Protocol 1 at around 20 ns. See Figure C and Table ). The combined use of the MAB scheme and WESS reweighting
procedure reduces the required total simulation time by roughly threefold.
Reasonably converged permeability coefficients were obtained within
50 ns of molecular time, which is defined as Nτ,
where N is the number of WE iterations and τ
is the fixed time interval (100 ps) of each WE iteration. Without
the application of the WESS reweighting scheme, the permeabilities
estimated from the simulations using the fixed binning (Protocol 1)
and the adaptive binning (Protocol 3) differ by about 1.5 log units.
In contrast to protocols that applied reweighting, the results from
the different binning schemes (Protocols 2 and 4) differ by only one
log unit (Table ).
The smaller difference between estimated permeability coefficients
in the latter case is due to reweighting to a steady state. Once the
reweighting procedure was applied after 50 ns of simulation using
Protocols 1 and 3, the estimated permeability coefficients became
comparable to those from Protocols 2 and 4 (Figure B). The fact that these successful membrane-crossing
trajectories are orders of magnitude shorter than the mean first-passage
time for membrane permeation indicates that the membrane-crossing
events are indeed rare events. Said another way, the trajectories
include solely the relatively fast transitions with low probability
between stable states, leaving out the long waiting times in the initial
stable state for the low-probability transitions.In addition,
we calculated the evolution of the progress coordinate
(eq ) of trajectories
that successfully reached the target state in compartment A. This
estimate is based on thousands of crossing events forked from about
10 independent events using standard (regular) WE
with a fixed binning scheme (Figure S4C). One of the top-weighted, successful trajectories (probabilistic
weight: 6.0 × 10–6) was extracted for visualization
at the atomic level (Figure D and Movie S1). With the MAB scheme,
we were able to obtain comparable permeability estimates using much
fewer bins compared to a fixed binning scheme (34 bins with MAB vs
100+ with fixed bins). Because of the reduced number of bins, MAB-based
simulations require about 3 times less aggregate simulation time compared
to fixed binning as shown in Table (∼8 μs for MAB vs ∼23 μs
for fixed bins). To do so, we can analyze the fixed binning data at
around 8 μs aggregate time, which for Protocol 1 (fixed binning;
no reweighting) would yield a permeability estimate of −6.55
on the log scale at iteration 194. This value is comparable to the
permeability coefficient (−6.95) estimated by Protocol 3 at
iteration 500 (MAB; no reweighting); however, the (log) permeability
estimate from Protocol 2 (fixed binning, with reweighting) is still
−5.56 at iteration 195. The fixed binning estimates with only
8 μs of aggregate simulation time are nearly 2 log units away
from the MDCK-LE experiments (−4.64), whereas MAB with reweighting
can produce a more accurate permeability estimate (−4.27) in
a much shorter amount of aggregate simulation time. Nonetheless, the
large lower CI bound for simulation with the MAB and WESS protocol
was a result of scattered incoming fluxes into the target state due
to significant shorter aggregate time as compared to other protocols
(see total simulation time in Table ).
Mechanisms of Membrane Permeation for Tacrine,
Sotalol, and
Zacopride
To evaluate the free energy profile of the molecule
along the z-coordinate from the non-equilibrium steady-state
ensembles generated by our simulations, we symmetrized the resulting
probability distribution as a function of the z-coordinate
(see Methods; for the original, unsymmetrized
probability distribution for tacrine, see Figure S5). Using tacrine as an example, the free energy profile in Figure A shows that the
largest barrier of the permeation process of the molecule was located
at around z = 0 Å, that is, the center of the
membrane. ∼30 ns was needed for the free energy profile to
converge to the final profile determined by the total 50 ns simulation.
There is a smaller barrier at the membrane surface (z = ±20 Å), followed by an energy minimum near the center
of each leaflet (z = ±10 Å).
Figure 3
Free energy
profile along the lipid normal (z)
for tacrine using the fixed binning and reweighting WE protocol (WESS).
(A) Free energy profile (in units of kBT) of tacrine along the bilayer normal, z. The probability distribution along z, P(z), was extracted from the
simulation at different molecular times and symmetrized (see Methods; Figure S5B for
the unsymmetrized distribution). (B) Free energy profile along z and the cosine of the angle of the molecule (dipole moment)
with respect to z (top left), hydrophobic
contacts between the molecule and the membrane (top right), the number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The solid black
line represents the top weighted trajectory (statistical
weight: 6.0 × 10–6) and the purple
star indicates the start location. The approximate range
of the membrane region is indicated by black dashed lines (−20 Å < z < 20 Å). For
all 2D distributions, the probabilities are symmetrized across the
membrane to obtain the free energy profiles.
Free energy
profile along the lipid normal (z)
for tacrine using the fixed binning and reweighting WE protocol (WESS).
(A) Free energy profile (in units of kBT) of tacrine along the bilayer normal, z. The probability distribution along z, P(z), was extracted from the
simulation at different molecular times and symmetrized (see Methods; Figure S5B for
the unsymmetrized distribution). (B) Free energy profile along z and the cosine of the angle of the molecule (dipole moment)
with respect to z (top left), hydrophobic
contacts between the molecule and the membrane (top right), the number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The solid black
line represents the top weighted trajectory (statistical
weight: 6.0 × 10–6) and the purple
star indicates the start location. The approximate range
of the membrane region is indicated by black dashed lines (−20 Å < z < 20 Å). For
all 2D distributions, the probabilities are symmetrized across the
membrane to obtain the free energy profiles.We also calculated four auxiliary progress coordinates, namely,
the cosine of the angle of the unit electric dipole moment to ẑ, the number of hydrophobic contacts, the number of
hydrogen bonds between the molecule and the membrane, and the end-to-end
distance of the molecule to evaluate molecular orientation, hydrophobic
interactions, and hydrogen bond structure with respect to the main
progress coordinate, z, for tacrine (Figure B), zacopride (Figure C), and sotalol (Figure C). Interestingly, all three
molecules passed through the membrane with a 60–120° angle
(±30° with respect to ẑ), with a relatively
constrained angle near the center and interface of the membrane, especially
for zacopride and sotalol. All three molecules formed increasing numbers
of hydrophobic contacts with the membrane as the molecules approach
the center of the membrane and losing them when exiting from the center
(Figure S6). All three molecules formed
more hydrogen bonds near the headgroup region than in the center of
the membrane. Tacrine, as expected, did not undergo a large conformational
change crossing the membrane according to the end-to-end distances
of the molecule (7.2–7.6 Å) from the simulation using
Protocol 2 (fixed binning + WESS), but it was observed “flipping,”
as previously predicted, as a model for membrane permeation.[29] Similar energy barriers and bottleneck regions
can be observed for tacrine from the simulation using the adaptive
binning protocol (MAB + WESS, Figure S7). Owing to rigid nature of the molecule, zacopride seems to have
crossed the membrane adopting a mostly extended form (end-to-end distance
∼10.7 Å), whereas sotalol seems to be relatively more
flexible inside the membrane interior (end-to-end distance between
9 and 12 Å).
Figure 4
Mechanistic analysis of zacopride permeation. (A) Chemical
structure
of zacopride. (B) Snapshots of the zacopride molecule (cyan) passing through the lipid bilayer (gray) at selected
molecular times (see the legend in Figure D). (C) Free energy profile (in units of kBT) along the bilayer normal, z, the cosine of the angle of the molecule with respect
to z (top left), hydrophobic contacts
between the molecule and the membrane (top right),
the number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The black line represents the top weighted trajectory (probabilistic weight: 2.2
× 10–5), with purple dots indicating
the location of the snapshots in panel B and a purple star indicating the start location. The approximate range of the membrane
region is indicated by black dashed lines (−20
Å < z < 20 Å). For all 2D distributions,
the probabilities are symmetrized across the membrane to obtain the
free energy profiles.
Figure 5
Mechanistic analysis
of sotalol permeation. (A) Chemical structure
of sotalol. (B) Snapshots of the sotalol molecule (cyan) passing through the lipid bilayer (gray) at selected
molecular times (see the legend in Figure D). (C) Free energy profile (in units of kBT) along the bilayer normal, z, and the cosine of the angle of molecule with respect
to z (top left), number of hydrophobic
contacts between the molecule and the membrane (top right), number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The black line represents the top weighted trajectory (probabilistic weight: 9.3
× 10–6), with purple dots indicating
the location of the snapshots in panel (B) and a purple star indicating the start location. The approximate range of the membrane
region is indicated by black dashed lines (−20
Å < z < 20 Å). For all 2D distributions,
the probabilities are symmetrized across the membrane to obtain the
free energy profiles.
Mechanistic analysis of zacopride permeation. (A) Chemical
structure
of zacopride. (B) Snapshots of the zacopride molecule (cyan) passing through the lipid bilayer (gray) at selected
molecular times (see the legend in Figure D). (C) Free energy profile (in units of kBT) along the bilayer normal, z, the cosine of the angle of the molecule with respect
to z (top left), hydrophobic contacts
between the molecule and the membrane (top right),
the number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The black line represents the top weighted trajectory (probabilistic weight: 2.2
× 10–5), with purple dots indicating
the location of the snapshots in panel B and a purple star indicating the start location. The approximate range of the membrane
region is indicated by black dashed lines (−20
Å < z < 20 Å). For all 2D distributions,
the probabilities are symmetrized across the membrane to obtain the
free energy profiles.Mechanistic analysis
of sotalol permeation. (A) Chemical structure
of sotalol. (B) Snapshots of the sotalol molecule (cyan) passing through the lipid bilayer (gray) at selected
molecular times (see the legend in Figure D). (C) Free energy profile (in units of kBT) along the bilayer normal, z, and the cosine of the angle of molecule with respect
to z (top left), number of hydrophobic
contacts between the molecule and the membrane (top right), number of hydrogen bonds between the molecule and the membrane
(bottom left), and the end-to-end distance of the
molecule (bottom right, blue: <5kBT, red: >5kBT). The black line represents the top weighted trajectory (probabilistic weight: 9.3
× 10–6), with purple dots indicating
the location of the snapshots in panel (B) and a purple star indicating the start location. The approximate range of the membrane
region is indicated by black dashed lines (−20
Å < z < 20 Å). For all 2D distributions,
the probabilities are symmetrized across the membrane to obtain the
free energy profiles.Overall, these 2D free
energy profiles suggest that our WE simulations
with a single (and simple) progress coordinate were able to sample
a variety of conformations that allowed the molecules to pass through
the membrane. The distributions shown here also reinforce the idea
that choice in reaction coordinates may influence the interpretation
of a rate-limiting step, which could be a problem for methods based
on the ISD model. Depending on whether hydrogen bond count to the
membrane, end-to-end distance, or the number of hydrophobic contacts
were chosen as a reaction coordinate, one might believe a rate-limiting
step that occurred at different positions within z, for example, in the center of the membrane for hydrophobic contacts
or near the membrane–water interface for the hydrogen bond
count.We also extracted the trajectory with the highest probability
for
each molecule (see Figures D, 4B, and 5B for critical snapshots and see Movies S1–3 for full trajectories). Overall,
the molecule adopted pathways consistent with our population-level
observations above. Additionally, we observed water (<5 molecules
at once) deep in the membrane (near z = 0), which
was either due to the drug “dragging” water across the
bilayer (sotalol), or by distorting the curvature of the outer leaflet
of the membrane to bring water toward the bilayer center (zacopride).
The water has been captured in both the visualized snapshots of each
of the molecules as well as the movies of the top weighted simulations
of drug molecules crossing the membrane.
Efficiency of WE Simulations
To estimate the efficiency
of WE simulations relative to conventional MD simulations, we make
two comparisons. First, we compare the total computing time that would
be required in Orion using conventional MD simulations to generate
a single permeation event given the corresponding MFPT estimated by
our WE simulations (see eq ). Second, we make a similar comparison to Anton3 from D.
E. Shaw Research, which is currently known as the fastest MD simulation
engine in the world. As shown in Table , conventional simulations in Orion would require 1.5–177
years to generate a single permeation event depending on the molecule,
which is roughly equivalent to 22 weeks to over 7 years on the latest
Anton3. Our WE simulations generated the first permeation event in
1.1 (tacrine), 10.7 (sotalol), or 7.5 (zacopride) days and a total
of 872 (tacrine), 66 (sotalol), or 56 (zacopride) permeation events
were observed within a 7.9 (tacrine), 12.7 (sotalol), and 11.7 (zacopride)
day period. Although many of the pathways for these events are correlated,
sharing common trajectory segments, reasonably converged permeation
coefficients were obtained.
Table 2
Comparison of Wall-Clock
Times Required
by Weighted Ensemble MD Simulations Relative to Standard MD Simulations
for Generating a Single Drug Membrane Crossing Eventa
weighted
Ensemble MD
standard
MDb
Pm estimation
a single
crossing event
a single
crossing event
compound (no. of atoms)
predicated MFPT (ms)
total simulation time (μs)
wall clock time (Orion days)
simulation time (ns)
wall clock time
(Orion days)
wall clock time (Orion yearsc)
wall clock time
(Anton3 yearsd)
tacrinee (29)
4.6
8.1
7.9
1.4
1.1
1.5
0.06
sotalol (38)
52.1
6.3
12.7
3.7
10.7
16.5
0.71
zacopride (41)
559.2
6.0
11.7
3.0
7.5
177.3
7.7
Wall-clock times are reported for
the Orion cloud-computing platform on Amazon Web Services and the
Anton3 special-purpose MD supercomputer, and they were estimated using
the mean first passage time (MFPT) for the permeation process determined
by weighted ensemble MD. The wall-clock time is also reported for
the generation of a reasonably converged estimate of the permeability
coefficient Pm, which is not practical
to estimate using standard MD.
Expected times based on the MFPTs
predicted by the WE simulations.
Using 100 GPUs at a time on Orion
at the speed of 8600 ns per day.
Assuming 200 μs per day for
a similar sized system (∼33k atoms), as reported using the
64-node Anton3 performance data for DHFR and ApoA1[73] from D. E. Shaw Research.
Data from the tacrine run using
MAB and WESS.
Wall-clock times are reported for
the Orion cloud-computing platform on Amazon Web Services and the
Anton3 special-purpose MD supercomputer, and they were estimated using
the mean first passage time (MFPT) for the permeation process determined
by weighted ensemble MD. The wall-clock time is also reported for
the generation of a reasonably converged estimate of the permeability
coefficient Pm, which is not practical
to estimate using standard MD.Expected times based on the MFPTs
predicted by the WE simulations.Using 100 GPUs at a time on Orion
at the speed of 8600 ns per day.Assuming 200 μs per day for
a similar sized system (∼33k atoms), as reported using the
64-node Anton3 performance data for DHFR and ApoA1[73] from D. E. Shaw Research.Data from the tacrine run using
MAB and WESS.
Discussion
Despite our use of a model membrane system, our estimated permeability
coefficients from WE simulations for the three molecules (tacrine:
−4.27 ± 0.24, zacopride: −6.35 ± 0.22, sotalol:
−5.32 ± 0.22) are in reasonable agreement with experimentally
measured values (tacrine: −4.64[68] or −5.03 ± 0.2,[69] zacopride:
−5.23,[68] sotalol: −6.02,[68] −5.58 or −6.74;[70] see Figures and S1 for detail). Absolute agreement
of our calculated permeability coefficients with experiment would
not, however, be expected due to several complexities of real cell
membranes that are lacking in our simulation setup, particularly,
for cell-line assays such as MDCK and CaCo-2. These complexities include
the presence of multiple lipid species, (especially cholesterol),
transmembrane proteins, and membrane rafts. Such membrane-associated
structures can be particularly important for modeling the permeation
process of more complicated cell types like those involved in the
blood–brain barrier. Additionally, paracellular transport exists
as a potential method for drug delivery. In future efforts, we will
expand our model membrane (currently, 50 POPC lipids per leaflet)
to accommodate larger molecules that fall outside the spectrum of
Lipinski’s “rule of five”, such as peptides,
natural products, or de novo designed biologics. In addition, we will
improve on the diversity of pathways for membrane permeation by including
orthogonal dimensions to the progress coordinate that can distinguish
between different conformations of the molecule or the lipid bilayer,
for example, radius of gyration of the drug or the extent of membrane
curvature. These efforts will greatly aid the simulation of membrane
permeation for a modern lead series with more than 10 rotatable bonds,
including molecular glues and macrocycles that are highly flexible
with large molecular weights (>kDa).
Figure 6
Membrane permeabilities
(Log Pm) calculated
using various WE simulation protocols for tacrine, sotalol, and zacopride.
Uncertainties represent standard deviations, which are evaluated as
1/4th the difference between the 95% CI upper bound and the lower
bound. Experimentally measured values are shown in gray [MDCK-LE:
Dickson et al. (2019)],[68] PAMPA for tacrine:
Katt et al. (2016),[69] and PAMPA for sotalol:
Liu et al. (2012).[70] See also Table S1 in the Supporting Information.
Membrane permeabilities
(Log Pm) calculated
using various WE simulation protocols for tacrine, sotalol, and zacopride.
Uncertainties represent standard deviations, which are evaluated as
1/4th the difference between the 95% CI upper bound and the lower
bound. Experimentally measured values are shown in gray [MDCK-LE:
Dickson et al. (2019)],[68] PAMPA for tacrine:
Katt et al. (2016),[69] and PAMPA for sotalol:
Liu et al. (2012).[70] See also Table S1 in the Supporting Information.Additional concerns with the current work could
be either the small
number of molecules used for model development or the minimal dynamic
range of the permeability data itself (Log Pm varies between about −4 and −6). Both issues
should be addressed. Regarding the number of compounds, it is true
that three molecules alone are not enough to provide the statistical
insight needed to reliably judge the predictive power of the model
for any given molecular species. Even so, for the molecules presented
here, our model was able to predict permeability coefficients within
about log unit of experimental MDCK-LE and PAMPA measurements. Such
a discrepancy with respect to experiment can be considered small,
especially compared to free energy-based methods, where permeability
coefficients may be several log units away from a reference value.[20] Furthermore, one may also note that the agreement
between two permeability estimates for the same molecule using the
same experimental technique may not be in perfect agreement. In fact,
reported PAMPA data for sotalol using either the top-to-bottom (TtoB)
or the bottom-to-top (BtoT) protocol[70] provided
permeability coefficients that differed by more than a log unit (see Figure ). Note that the
apparent discrepancy between 95% CIs in Figure C and standard deviations in Figure is because the original CIs
and standard deviations were defined in the linear scale and converted
to the log scale for visual representation.As shown by Rogers
and Geissler,[25] the
choice of using z as the reaction coordinate for
the free energy-based methods could have an enormous impact on the
perceived free energy barrier, which may contribute to multi-log unit
differences in the absolute comparison to permeability experiments
mentioned above. While a reaction (progress) coordinate is typically
defined for the WE strategy, the computed rates are independent of
the chosen coordinate. Therefore, even though the efficiency of the
computation can be reduced by a poor choice of reaction coordinate,
the results are much less sensitive to it compared to free energy-based
methods. In the most extreme case, when a reaction coordinate is suboptimal,
WE simulations can surmount barriers along orthogonal coordinates
in a “brute force” manner, whereas free energy-based
methods have no means of recovering. Even if there is no enhanced
sampling of auxiliary coordinates, the predicted permeability coefficients
derived from permeation rates are in reasonable agreement with the
experiment. Meanwhile, permeation rates computed from free energy
methods depend exponentially on the barrier height, which is in turn
sensitive to the choice of the progress coordinate. That said, as
a future direction, we will apply a committor analysis technique[71] to characterize the mechanism of drug membrane
permeation processes. We will also apply a history-augmented Markov
state model analysis[72] instead of the WESS
reweighting procedure to further enhance the efficiency of obtaining
steady-state observables.While this work advances the methodology
for computational prediction
of permeability coefficient, several limitations prevent using the
current algorithm for high-throughput screening. First, even though
the amount of computation is a fraction of that required of brute
force MD simulations (see Table ), the computational cost is still too large for a
virtual screen where billions of molecules can be routinely analyzed
to find a lead candidate. Second, the amount of required wall-clock
time per permeability estimate is roughly one week with our current
WE setup employing the MAB scheme for adaptive binning and the WESS
reweighting scheme. One week per compound is outside the time budget
of a typical virtual screen since many orders of magnitude more compounds
could be completed within the same period using a faster, albeit less
accurate, computational technique. Still, we could imagine our WE
method being used in later stages of the drug development cycle, primarily
in the lead optimization phase where more expensive computational
techniques are routinely applied. For instance, it is widely accepted
that the high attrition rate for small-molecule drug candidates is
largely due to ADME/Tox liabilities. Therefore, a method like ours
that provides detailed mechanistic insights into the membrane permeation
process could be employed at the lead optimization stage to reduce
such high drug candidate attrition.The OpenEye Permeability
Floe used for our WE simulations is designed
to be a user-friendly cloud application that can assist modelers in
designing drugs for increased bioavailability. The cloud-based Floe
is system-agnostic, enabling users to run and analyze simulations
on any workstation with almost no hardware requirement aside from
Internet connection. In our present study, the simulations required
one to two weeks to finish depending on the WE protocol. This wall-clock
time can be further reduced by horizontal scaling, that is, recruiting
more nodes or scaling groups. Each WE iteration typically requires
a few hundred trajectory segments in total, and each trajectory segment
only requires 2 min of wall clock time to complete with a single GPU
of the Amazon EC2 G4 instance types. Ideally, if each trajectory segment
was assigned to a dedicated node, each WE iteration would complete
in only 2 min, and a 500-iteration simulation would complete in only
16 h with a significantly increased demand of computing resources.
A future direction is to further optimize the balance of the required
computation and run time. The Floe also features a GUI that facilitates
the set up for a permeability simulation with their drug-like molecule
of interest. The user can increase or decrease the simulation length
for analysis if needed, and an automatic detection of convergence
in the estimated permeability coefficient is under investigation that
will trigger the floe’s termination. As mentioned above, the
WE strategy not only yields a direct calculation of the passive permeability
coefficient but also provides fully continuous trajectories revealing
how molecules cross through the membrane.
Authors: Yudong Qiu; Daniel G A Smith; Simon Boothroyd; Hyesu Jang; David F Hahn; Jeffrey Wagner; Caitlin C Bannan; Trevor Gokey; Victoria T Lim; Chaya D Stern; Andrea Rizzi; Bryon Tjanaka; Gary Tresadern; Xavier Lucas; Michael R Shirts; Michael K Gilson; John D Chodera; Christopher I Bayly; David L Mobley; Lee-Ping Wang Journal: J Chem Theory Comput Date: 2021-09-22 Impact factor: 6.578
Authors: Ernesto Suárez; Steven Lettieri; Matthew C Zwier; Carsen A Stringer; Sundar Raman Subramanian; Lillian T Chong; Daniel M Zuckerman Journal: J Chem Theory Comput Date: 2014-03-03 Impact factor: 6.006