Literature DB >> 30834317

Drug-Membrane Permeability across Chemical Space.

Roberto Menichetti¹, Kiran H Kanekal¹, Tristan Bereau¹.

Abstract

Unraveling the relation between the chemical structure of small druglike compounds and their rate of passive permeation across lipid membranes is of fundamental importance for pharmaceutical applications. The elucidation of a comprehensive structure-permeability relationship expressed in terms of a few molecular descriptors is unfortunately hampered by the overwhelming number of possible compounds. In this work, we reduce a priori the size and diversity of chemical space to solve an analogous-but smoothed out-structure-property relationship problem. This is achieved by relying on a physics-based coarse-grained model that reduces the size of chemical space, enabling a comprehensive exploration of this space with greatly reduced computational cost. We perform high-throughput coarse-grained (HTCG) simulations to derive a permeability surface in terms of two simple molecular descriptors-bulk partitioning free energy and pK a. The surface is constructed by exhaustively simulating all coarse-grained compounds that are representative of small organic molecules (ranging from 30 to 160 Da) in a high-throughput scheme. We provide results for acidic, basic, and zwitterionic compounds. Connecting back to the atomic resolution, the HTCG predictions for more than 500 000 compounds allow us to establish a clear connection between specific chemical groups and the resulting permeability coefficient, enabling for the first time an inverse design procedure. Our results have profound implications for drug synthesis: the predominance of commonly employed chemical moieties narrows down the range of permeabilities.

Entities: CellLine Chemical Disease Species

Year: 2019 PMID： 30834317 PMCID： PMC6396385 DOI： 10.1021/acscentsci.8b00718

Source DB: PubMed Journal: ACS Cent Sci ISSN： 2374-7943 Impact factor: 14.553

Introduction

The passive permeation of small molecules across lipid membranes offers not only physicochemical insight but also crucial pharmaceutical information about drug–membrane thermodynamics.[1] It probes the time scale of translocation due to a concentration gradient of the drug, without active cellular mechanisms (Figure ). A detailed understanding of the underlying structure–property relationships between drug chemistry and passive-permeation thermodynamics, though of great interest for drug development, is still lacking.

Figure 1

From left to right: coarse-graining reduces the size of chemical space, such that many small molecules of similar size and hydrophobicity are mapped to the same representation.[5] For each molecule, we model its passive translocation across a lipid bilayer (water not shown for clarity). The thermodynamics of the system is characterized by the potential of mean force (PMF), evaluating both the neutral and charged species, shifted according to the compound’s pKa. The major dependence of the PMF on the water/octanol partitioning and the pKa motivates these as molecular descriptors to construct a permeability surface (eq ). These two molecular descriptors, also highlighted in red, are experimental quantities directly fed into the physics-based simulations to yield a parameter-free estimation of the permeability coefficient. Structure–property relationships are often tackled by means of high-throughput screening experiments: (i) a large number of compounds are probed with respect to the property of interest by individual measurements or calculations; and (ii) the relationship between structure and property is empirically learned by means of statistical algorithms.[2−4] While structure–property relationships thus formally rely on both the breadth and quality of the data, as well as the accuracy of the statistical model, the common bottleneck in the pharmaceutical sciences often arises from the former. Even though in vivo techniques probe drug–membrane interactions in all the intricacies of the cellular environment, the experimental cost and complexity make them poorly suited for high-throughput screening.[6] It is instead the development of in vitro techniques that have helped in expanding passive-permeation databases.[7,8] Unfortunately, limited aggregate data has been made publicly available thus far. The resulting statistical models—such as quantitative structure–property relationships (QSPRs) or machine learning—typically rely on 102–104 data points only.[9−11] The question follows: how representative can these samples be, when the size of drug chemical space is estimated at 1060?[12] The tendency of these statistical models to depend significantly on individual outliers strongly suggests overfitting—these models lack transferability across chemical space.[10] The small-data-set problem is typically aggravated by the compounds’ poor diversity.[13] As a complementary approach to experimental measurements, physics-based modeling provides a robust strategy to predict passive permeation in silico.[10,14] The inhomogeneous solubility–diffusion model[15,16] considers the concentration gradient of a solute molecule across an interface to yield a permeability coefficient, P.[17] This results in a spatial integral normal to the interface, z, of the potential of mean force (PMF), G(z), and local diffusivity, D(z):with β = 1/kBT. Equation highlights the two key parameters that contribute to the rate of passive permeation of a compound: its hydrophobicity, quantified by the PMF, together with the local diffusivity. Practically, G(z) and D(z) are commonly extracted from enhanced-sampling classical molecular dynamics simulations. Grounding the problem within the statistical mechanics of a concentration flux diffusing through an interface combined with conformational sampling from physically motivated force fields can offer unprecedented insight. We stress that current experimental techniques have yet to resolve G(z)—computer simulations thus remain the gold standard to estimate eq . Unfortunately, adequate conformational sampling remains computationally daunting at the atomistic level, even for a small rigid molecule crossing a single-component lipid membrane: roughly 105 CPU-hours per compound limit this strategy to up to ∼10 different molecules per study.[18−21] When combined with the overwhelming size of chemical space, these figures hinder short-term prospects of running atomistic simulations at high throughput, thereby hampering the elucidation of the underlying structure–property relationships. Structure–property relationships effectively project down chemical complexity on a few molecular descriptors that map to the property of interest.[22,23] Inferring these maps typically relies on a statistical analysis over many measurements, identifying a smooth (i.e., low-dimensional) connection between structure and property. In this work we propose an alternative strategy: rather than smoothing this connection a posteriori, we enforce it a priori. We still rely on physics-based models but reduce their resolution to efficiently interpolate across chemistry, while ensuring accurate thermodynamics by construction. This enables a high-throughput approach for two reasons: (i) the reduced representation significantly speeds up every simulation, and (ii) the interpolation across chemistry effectively reduces the size of chemical space. Solving the structure–property relationship problem for the reduced model proves significantly more tractable, and allows the identification of a permeability surface as a function of simple molecular descriptors. We further connect back to the original problem by means of a large-scale analysis of our predictions. The above-mentioned reduced models, better known as coarse-grained (CG) models, lump together several atoms into a bead.[24,25] While defined in terms of fewer degrees of freedom, coarse-graining remains physics-based and can be combined with rigorous free-energy calculations. Here, we rely on the CG Martini model, which has shown useful to simulating a wide variety of biomolecular systems.[26−28] In Martini, a small set of bead types encodes how small organic fragments partition between solvents of different polarity, thus ensuring robust thermodynamics at complex interfaces,[28] while cutting down the computational costs by 3 orders of magnitude.[29] The parametrization of a molecule at the CG level thus consists of a collection of Martini beads, each representing a specific chemical group. Constructing a CG molecule can be streamlined into a systematic procedure,[30] so as to emulate the molecule’s overall shape and hydrophobicity. The small set of bead types leads to a degeneracy in the representation: many molecules of similar shapes and hydrophobicity map to the same CG parametrization (Figure ). Such a many-to-one mapping generates a significant reduction in the size of chemical space—further lowering the computational investment by an additional 103–104. The few bead types involved lead to a dramatic reduction in the combinatorial explosion of chemistry, easing the construction of all CG small molecules up to a certain size (Figure ).[5] This addresses the poor-diversity issues that synthetic databases typically face, facilitating a representative coverage of subsets of chemical space projected primarily along size and hydrophobicity. Recently, these properties allowed us to predict the PMF of drug–membrane partitioning for an unprecedented 511 427 small molecules—several orders of magnitude beyond what was previously available.[5] In terms of accuracy we showed that the CG model achieves a mean absolute error of 0.8 kcal/mol to predict bulk water/octanol partitioning free energies,[30] translating to a 1.4 kcal/mol error along a PMF.[5] We further stress that these errors are evaluated across a significant subset of the chemistry of small organic molecules, while atomistic results are too scarce to make such estimates. For the present work, our error estimates roughly translate to an accuracy of 1 log10 unit in the permeability coefficient, validated across an extensive set of structurally distinct compounds against both atomistic simulations and experimental measurements (see the Supporting Information (SI)). In this work, we use high-throughput coarse-grained (HTCG) simulations to cover a subset of chemical space both efficiently and broadly. Unlike conventional high-throughput screening protocols that require an arbitrary selection of compounds, we consider all coarse-grained representations up to a threshold size, mapping to most small organic molecules ranging from 30 to 160 Da. This comprehensive exploration allows us to systematically investigate the effect of hydrophobicity and pKa on the permeation rate (Figure ), unlike previous studies limited to a handful of compounds.[18−21] Our methodology offers a unique approach to construct a two-dimensional surface describing the permeability of a small molecule across a lipid membrane (Figure ). The molecular descriptors are here motivated by the physics of the permeation process, i.e., the interplay between diffusivity and solubility (eq ). Because the diffusivity was shown to be rather insensitive to chemical detail,[18] we focus on the potential of mean force, G(z). We have recently shown that the key features of G(z) can be reconstructed simply from the bulk partitioning free energy.[5] By further accounting for the contribution of different protonation states, we also express the permeability surface in terms of its acid dissociation constant in water, pKa. In the following we focus on acidic and basic compounds, while the SI further discusses zwitterions. These surfaces allow for a rapid, simulation-free prediction of drug permeability starting from key molecular properties. The accuracy is roughly on par with explicit CG simulations because of compensating errors between the two methods. Extracting permeability surfaces from the CG simulations allows us to connect back to the original structure–property relationship problem. Our analysis of over 500 000 small molecules mapping to the investigated CG representations unveils the role played by representative functional groups in the permeability coefficient, enabling inverse molecular design. The link drawn here has profound implications for drug synthesis: favoring the incorporation of certain chemical groups (e.g., carboxylic groups) will reduce the range of accessible permeabilities of the final compound.

Results and Discussion

While drug permeation is known to depend on lipid composition,[21] in this work we only consider a single-component bilayer made of 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC). The permeability coefficient, P, is readily estimated from the PMF and diffusivity profile (eq ). The PMFs are extracted from HTCG simulations of all CG representations made of one and two beads, mapping to a representative subset of small organic molecules in the range 30–160 Da.[5] For compounds capable of (de)protonating, we also model the corresponding charged species. For convenience, we distinguish the pKa of a chemical group as being either acidic (apKa) or basic (bpKa), which quantifies the propensity of a neutral compound to deprotonate or protonate, respectively. The effective permeability coefficient is constructed by a combination of the two PMFs (Figure ), shifted according to the compound’s pKa in water (see the Methods section).[31,32] The diffusivity profile is estimated from reference atomistic simulations.[18]

Permeability Surfaces

Figure displays the computed drug–membrane permeability as a function of two drug parameters: its pKa in water and water/membrane partitioning free energy, ΔW→M. The latter corresponds to the free-energy difference between insertion in bulk water and the membrane-bilayer midplane. Though we have shown this quantity to correlate extremely well with the experimentally accessible water/octanol partitioning free energy, ΔW→Ol, ΔW→M displays enhanced transferability across CG molecular sizes.[5] Indeed, HTCG simulations of single-bead or two-bead CG compounds lead to identical permeability surfaces, except for the range of ΔW→M covered (compare Figure S4 with Figure ).

Figure 2

Permeability surfaces (log10 scale) calculated from HTCG simulations as a function of two small-molecule descriptors: the (a) basic or (b) acidic pKa in water and the water/membrane partitioning free energy, ΔW→M. Cooler (warmer) colors correspond to faster (slower) permeating molecules. The intersection between the two surfaces corresponds to compounds that effectively always remain neutral. Green ●, yellow ★, and orange ■ correspond to deviations from atomistic simulations within 0.5, 1.3, and 2.2 log units, respectively (SI). Figure displays smooth permeability surfaces as a function of the drug’s acidic and basic pKa value in water. The log10 scale of the permeability surfaces indicates the wide time scale variations these molecular parameters exert on the thermodynamic process. For both panels, the horizontal behavior indicates that larger permeabilities are obtained toward the left—more hydrophobic compounds—while polar molecules experience more difficulties crossing the lipid bilayer, leading to a drastic reduction in P. The effect is compounded by (de)protonation: Figure b across the vertical axis describes the effect of the compound’s apKa in water onto P. Extremely strongly acidic molecules (apKa ≲ 2) effectively remain charged across the membrane interface, leading to prohibitively large free energies along the PMF, such that their rate of permeation is strongly suppressed. Increasing apKa shows a significant increase in P, up to apKa ≈ 7, beyond which P plateaus. This stabilization is due to the competition between neutral and charged PMFs, where the charged PMF is shifted to increasingly larger values, and therefore never contributes significantly compared to the more attractive neutral PMF. Of particular interest are the strong acids (2 ≲ apKa ≲ 7), which neutralize upon entering the membrane, effectively enhancing the permeability coefficient as compared to a compound that remains charged across the interface. An approximately symmetric behavior can be observed when switching from acidic to basic compounds (Figure a). The impact of both apKa and bpKa on the permeability coefficient becomes even more pronounced in the case of zwitterions (Figure S5), where high permeation rates are only obtained for compounds containing both weak acidic and basic chemical groups. The permeability surface also displays a comparison against atomistic simulations[18,19,31] for several compounds (symbols in Figure ). These points provide a validation of our methodology—we report a mean absolute error of 1.0 log10 unit across the two molecular descriptors—with additional information included in the SI (also against experimental data). Most importantly, the few data points highlight the extremely limited exploration of chemical space using in silico simulations at an atomistic resolution.

Functional-Group Localization on the Permeability Surfaces

To better elucidate how the chemical structure impacts the permeability coefficient, we consider a large database of small organic molecules from combinatorial chemistry: the generated database (GDB).[33,34] It consists of a large set of stable molecules up to 10 heavy atoms made of the chemical elements C, O, N, and F, saturated with H. We pointed out how transferable coarse-grained models effectively reduce the size of chemical space by lumping many molecules into one coarse-grained representation.[5] This allows us to associate the above-mentioned one- and two-bead CG permeability results to 5 × 105 molecules. The distinction made between compounds that reduce to CG molecules made of a single bead (“unimers”) from those made of two beads (“dimers”) effectively amounts to a segregation between molecular weights.[5] We populate the permeability surfaces with these compounds—projecting them onto the two molecular descriptors: pKa and water/octanol partitioning free energy ΔW→Ol. By coarse-graining every single compound, we establish a map between chemical structure and its CG thermodynamic property. Figure displays the chemical-space coverage of GDB compounds onto the molecular descriptors. For all panels, we have colored the points in terms of the permeability calculated using HTCG simulations. Top and bottom panels distinguish between bpKa and apKa, while left and right denote unimers and dimers, respectively. We first note that the cloud of points is not uniformly distributed, but is instead centered around zero in ΔW→Ol. An increase in the molecular weight of the compound (left to right in Figure ) opens up new regions of chemical space, as we observe a significant broadening of the distribution along the water/octanol axis. This naturally arises due to the extensivity of the water/octanol partitioning, the more complex combinatorics of atoms involved, and the additional presence of five-membered rings.

Figure 3

Chemical-space coverage of GDB projected onto pKa and water/octanol partitioning free energies, ΔW→Ol. Basic and acidic pKa are shown in panels a and b, and c and d, respectively. Panels (a,c) and (b,d) describe the coverage corresponding to coarse-grained unimers and dimers, respectively. Regions highlighted in light blue display several representative chemical groups. Substitutions denoted by “?” correspond to either H or a substitution starting with an alkyl or aryl carbon, while “?*” only corresponds to substitutions that begin with an alkyl carbon. (e) Our analysis clusters molecules containing both a predominant functional group (blue), and one or several substitutions (black), of which only a few possibilities are shown. Unlike bulk partitioning, the pKa of a compound is not significantly impacted by aggregate behavior, but is instead dominated by one or a few specific chemical groups capable of (de)protonating. As such, we investigated the presence of chemical groups representative of a subset of chemical space. The regions in blue highlight a chemical group that is predominant, appearing in at least 50% of the molecules in that subset. Detailed statistics pertaining to the frequency of specific functional groups in each of the blue regions are provided in the SI. The localization of chemical groups remains largely similar from unimers to dimers (e.g., carboxylic group). Our high-throughput analysis offers an intuitive visualization of the link between chemistry and permeabilities via the pKa. Figure reflects that oxygen-containing functional groups are generally more likely to be proton donors, whereas nitrogen-containing functional groups can serve as either proton donors or acceptors.[35] At low apKa values, we mainly see carboxylic groups transitioning to nitrogen-containing functional groups (e.g., oxime derivatives) as we increase the apKa. Contrastingly, the bpKa chemical coverage displays no predominant oxygen-containing functional groups. Notable exceptions are the zwitterionic amino-acid-like compounds and certain aromatic heterocyclic compounds shown in Figure , which have both a low apKa and a high bpKa. These functional groups largely contribute to the chemical coverage of zwitterions (Figure S5b).

Linking Functional Groups and the Permeability Surface Enables Molecular Design

Figure enables a robust ad hoc method for both direct and inverse molecular design. The direct route amounts to estimating the permeability coefficient given a chemical structure. Figure simply requires an estimate for the two molecular descriptors, pKa and ΔW→Ol, from either experiments or prediction algorithms.[32,36] More interestingly, our results allow us to focus on specific regions of chemical space compatible with a desired permeability coefficient. We effectively reduce the high dimensionality of chemical space by projecting down onto our molecular descriptors and identifying key scaffolds. Figure offers a simple route at an inverse design procedure. For example, if designing a small molecule of 3–5 heavy atoms (i.e., mapping to a CG unimer) that requires a log10P of −1.0, Figure c suggests molecules containing either a terminal hydroxyl group or an oxime group. Indeed, small alcohols such as propanol and butanol match this target (Figure S8), although we are not aware of relevant experimental studies containing small oxime derivatives. Interestingly, we can also predict how small chemical changes will affect permeability: a change that impacts hydrophobicity (e.g., through heteroatom substitutions) will smoothly shift the compound horizontally on the surface. On the other hand, the introduction of new (de)protonatable groups might lead to large jumps on the surface, dictated by the strongest acid or base present in the molecule. The different behavior across the horizontal and vertical axes is due to the extensive and intensive characters of the descriptors, respectively. Critically, Figure shows remarkable transferability outside the range of compounds used in the screening. For example, while salicylate is made up of 10 heavy atoms, its aromatic ring leads to a four-bead representation. CG simulations using this parametrization result in log10P = −4.21 (Figure S6 and Table S1), deviating only one log10 unit from the atomistic results (highlighted as one of the symbols in Figure ).[18] Alternatively, we can easily read off the permeability from the surface: the carboxylic group is the main contributor for its descriptors apKa = 2.8 and ΔW→Ol = −2.7 kcal/mol (Figure ). This results in a simulation-free prediction for log10P of −3.72, less than two log units away from the atomistic results. The discrepancy between the four-bead representation and the dimer surface we rely on is the main source of errors: we have observed a systematic shift between ΔW→Ol and ΔW→M as a function of the number of CG beads.[5] An even more challenging test case involved ibuprofen (206 Da, significantly outside our range of molecular weights), for which both CG simulations and the surface prediction yield an accuracy within 1 log10 unit within the atomistic results (symbol in Figure , and Figure S6 and Table S1). We verified this consistent accuracy between explicit CG simulations and simulation-free surface predictions across two dozen small molecules—both in and out of the range of molecular weights considered (Figure S7 and Table S1). Although one would expect higher accuracy from explicit simulations, we observe compensating errors between the discretization of partitioning free energies and the smoothing of the surface. The transferability beyond the initial molecular weight considered speaks to the robustness of our physics-based approach. This feature contrasts radically with statistical methods that fit experimental data, such as QSPRs: the transferability of a QSPR model hinges upon potential biases in the training data set. Given the small data set sizes available from experiments and the wider range of molecular weights, QSPR models tend to be limited to chemistries very close to those used in training.[37,38] On the other hand, the HTCG method systematically spans a wide region of chemical compound space without resorting to parameter tuning, offering accurate predictions even beyond the range of molecular weight considered.

Impact of Functional-Group Localization on Bioavailability

The projection of the GDB database onto the two molecular descriptors provides a low-dimensional representation of chemical-space coverage. Interestingly, this helps compare its breadth and variety with other databases. In particular, we focus on ChEMBL: a database of synthesized compounds.[39] We prune ChEMBL to only retain compounds roughly compatible in size with the compounds in GDB (up to 10 heavy atoms), as well as H, C, O, N, and F elements only. Figure displays the coverage of both GDB and ChEMBL onto the molecular descriptors. Here again, Figure a,b distinguishes acidic and basic ionizing groups. We first note that ChEMBL displays a much smaller number of data points, illustrating the minuscule ratio of stable compounds that have been synthesized.[12] Overall the two databases cover remarkably similar regions of this chemical surface. However, a projection of the distributions along the individual axes indicates a statistically significant difference for apKa: synthesized compounds strikingly over-represent compounds with low apKa values (from 2 to 4). We find a significant over-representation of carboxylic groups in ChEMBL: 90% of the compounds in the range 0 < apKa < 6 contain such a group. This well-known bias in drug design[40] can readily be rationalized: Synthesizing compounds that include carboxylic groups will offer relatively strong acidity as well as an improved ability to hydrogen bond—a dominant interaction in most biomolecular processes. Our results introduce further implications: the over-representation of carboxylic groups will effectively narrow down the range of permeability coefficients. This limitation will be further compounded by the necessity of a drug candidate to show high aqueous solubilities, and the delicate interplay existing between these two properties,[41] overall affecting the compounds’ bioavailability.

Figure 4

Comparison of the chemical-space coverage of the combinatorial GDB and synthetic ChEMBL databases, projected onto (a) basic or (b) acidic pKa and water/octanol partitioning free energy, ΔW→Ol. The coverages are further projected down along a single variable on the sides. Note the significant differences between the GDB and ChEMBL distributions along the apKa in panel b.

Conclusions

We present the prediction of membrane-permeability coefficients for an unprecedented number and chemical range of small organic molecules across a single-component DOPC lipid bilayer. Rather than tackling the original structure–property relationship problem head-on, we work with physics-based reduced models that smoothly interpolate across chemistry, thereby reducing the size of chemical space. Critically, we do not arbitrarily select compounds to be screened, but instead systematically consider all coarse-grained representations that map to small organic molecules ranging from 30 to 160 Da. Coarse-grained permeability predictions were extensively validated against both atomistic simulations and experimental measurements for structurally diverse compounds. The high-throughput coarse-grained (HTCG) simulation approach used here compounds more efficient conformational sampling and reduction in chemical space, offering an overall speedup of ∼106 compared to atomistic simulations. This enables a systematic exploration of the link between chemical structure and permeability coefficient. To this end we construct a smooth surface as a function of two molecular descriptors: the pKa and water/membrane partitioning free energy, ΔW→M. The many orders of magnitude covered by the surface indicate the significant impact of the small molecule’s chemistry onto the thermodynamic process. The surfaces illustrate how strong acids and bases limit the loss of permeability for charged compounds. Having solved the reduced structure–property mapping allows us to connect back to the original, higher-dimensional problem. We identify dominant functional groups representative of chemical regions in the permeability surface. The identification of functional groups linking to the permeability coefficient effectively provides robust structure–property relationships for drug–membrane permeation, and the means to perform inverse molecular design. Finally, we show how the apparent bias of synthetic databases toward carboxylic groups can have deleterious effects on the accessible range of permeability coefficients, and thus on bioavailability. All in all, our HTCG approach offers a complementary approach to in vitro high-throughput screening, providing much larger numbers of compounds (510 000 in this study) than currently available in public databases. The much larger data set size will help statistical models (e.g., QSPRs) reach improved transferability. In analogy to rapidly growing interests in generating in silico databases of electronic properties,[42,43] we expect HTCG to have a broad impact in efficiently mapping the relevant low-dimensional surfaces that link chemical structure to thermodynamic properties.

Methods

Molecular Dynamics Simulations

Molecular dynamics simulations in this work were performed in GROMACS 4.6.6[44] and with the Martini force field,[26−28] relying on the standard simulation parameters.[45] The integration time step was δt = 0.02 τ, where τ is the model’s natural unit of time dictated by the units of energy , mass , and length , . Sampling from the NPT ensemble at P = 1 bar and T = 300 K was obtained by means of a Parrinello–Rahman barostat[46] and a stochastic velocity rescaling thermostat,[47] with coupling constants τ = 12 τ and τ = τ, respectively. We relied on the INSANE building tool[48] to generate a membrane of ≈36 nm2 containing N = 128 DOPC lipids (64 per layer), N′ = 1890 water molecules, N″ = 190 antifreeze particles,[27] and enough counterions to neutralize the box. The system was subsequently minimized, heated up, and equilibrated. The potential of mean force G(z) of each compound was determined by means of umbrella sampling.[49] We employed 24 simulation windows with harmonic biasing potentials (k = 240 kcal/(mol nm2)) centered every 0.1 nm along the normal to the bilayer midplane. In each of them, two solute molecules were placed in the membrane to increase sampling and alleviate leaflet-area asymmetry.[31,50] The total production time for each umbrella simulation was 1.2 × 105 τ. We then estimated the free-energy profiles by means of the weighted histogram analysis method.[51−53]

Permeability Coefficients

The permeability coefficient is obtained from the potential of mean force G(z) and local diffusivity D(z) in the resistivity R(z) = exp[βG(z)]/D(z) (see eq ). For compounds with multiple protonation states, both neutral and charged species contribute to the total flux, leading to the total resistivity RT given by[18]RT(z)−1 = RN(z)−1 + RC(z)−1, where RN and RC are the resistivities of the neutral and charged species, respectively. In calculating these quantities in the case of a single (de)protonation reaction, one has to offset the corresponding PMFs GN(z) and GC(z) by the free-energy difference for the acid/base reaction in bulk water,[31] as shown inFigure , where we systematically consider neutral pH = 7.4. Beyond the distinction between acid and base, we consider both neutral and charged species (Figure ): (i) a neutral acid deprotonates into a charged conjugate base (acidic pKa or apKa), and (ii) a neutral base protonates into a charged conjugate acid (basic pKa or bpKa). The extension to zwitterions, in which two consecutive protonation and deprotonation reactions occur in different chemical groups leaving the molecule globally neutral, is discussed in the SI. Estimation of the local diffusivity, D(z), using the CG simulations is a priori problematic given the tendency of these models to inconsistently accelerate the dynamics.[54] On the other hand, atomistic simulations showed that the diffusivity across a DOPC bilayer was virtually independent of the chemistry of the solute.[18] We used this profile in the present calculations. We stress that the local diffusivity only provides a logarithmic correction to log10P (see eq ), and therefore has limited impact—a variation well within 1 log10 unit depending on the diffusivity profile. More details can be found in sections S2 and S6 of the SI. We obtained the permeability surfaces presented in Figure and Figure S4 by first determining the PMF G(z) for all possible neutral combinations of one and two CG beads, 119 in total. For each of them we then determined G(z) for its charged counterparts, amounting to a total of 232 additional compounds. All PMF calculations required less than 105 CPU hours, on par with the typical computational time needed to run a single compound at an atomistic resolution.[10] At the CG level, protonating (deprotonating) a neutral chemical group amounts to replacing the bead type with a positive (negative) charge. We assume that the (de)protonation reaction always occurs in the chemical fragment represented by the more polar bead, and select the bead accordingly. In section S3 of the SI, we justify this approach by analyzing the pKa distribution for various CG bead types. By combining neutral and charged PMFs, we calculated the permeability coefficient of every compound as a function of the bpKa (apKa) every 0.2 pKa unit, and projected the results on the (ΔW→M, pKa) plane. The data consisted of a discrete set of permeabilities densely covering the partitioning free-energy axis located at the ΔW→M of each CG compound, and were finally interpolated on a grid with Gaussian weights resulting in the surfaces shown in Figure and Figure S4.

Chemical-Space Coverage

Prediction of the water/octanol partitioning on both chemical databases considered in this work, GDB[33,34] and ChEMBL,[39] was performed by means of the neural network ALOGPS.[36] apKa and bpKa predictions of neutral compounds were provided by the calculator plug-in of CHEMAXON MARVIN.[32] The mean absolute errors associated with the two prediction algorithms are 0.36 kcal/mol[36] and 0.86 units,[55] respectively. The aggregate predictions of water/octanol partitioning and pKa on both databases required roughly 102 CPU hours. Functional groups were identified using the CHECKMOL package.[56] Using the AUTO-MARTINI scheme, 511 427 molecules were coarse-grained.[30] AUTO-MARTINI automatically determines the coarse-grained force field in two steps: (i) the CG mapping is optimized according to Martini-based heuristic rules, and (ii) interactions are set by determining a type for each bead, selected from chemical properties of the encapsulated atoms, especially water/octanol partitioning, net charge, and hydrogen-bonding.

13 in total

1. Inserting Small Molecules across Membrane Mixtures: Insight from the Potential of Mean Force.

Authors: Alessia Centi; Arghya Dutta; Sapun H Parekh; Tristan Bereau
Journal: Biophys J Date: 2020-02-04 Impact factor: 4.033

2. CLiB - a novel cardiolipin-binder isolated via data-driven and in vitro screening.

Authors: Isabel Kleinwächter; Bernadette Mohr; Aljoscha Joppe; Nadja Hellmann; Tristan Bereau; Heinz D Osiewacz; Dirk Schneider
Journal: RSC Chem Biol Date: 2022-06-10

3. Computational and Experimental Approaches to Investigate Lipid Nanoparticles as Drug and Gene Delivery Systems.

Authors: Chun Chan; Shi Du; Yizhou Dong; Xiaolin Cheng
Journal: Curr Top Med Chem Date: 2021 Impact factor: 3.295

4. Genetic validation of Aspergillus fumigatus phosphoglucomutase as a viable therapeutic target in invasive aspergillosis.

Authors: Kaizhou Yan; Mathew Stanley; Bartosz Kowalski; Olawale G Raimi; Andrew T Ferenbach; Pingzhen Wei; Wenxia Fang; Daan M F van Aalten
Journal: J Biol Chem Date: 2022-04-30 Impact factor: 5.486

10. The importance of intramolecular hydrogen bonds on the translocation of the small drug piracetam through a lipid bilayer.

Authors: João T S Coimbra; Ralph Feghali; Rui P Ribeiro; Maria J Ramos; Pedro A Fernandes
Journal: RSC Adv Date: 2021-01-04 Impact factor: 3.361