Literature DB >> 24460383

Theoretical prediction of hydrogen-bond basicity pKBHX using quantum chemical topology descriptors.

Abstract

Hydrogen bonding plays an important role in the interaction of biological molecules and their local environment. Hydrogen-bond strengths have been described in terms of basicities by several different scales. The pKBHX scale has been developed with the interests of medicinal chemists in mind. The scale uses equilibrium constants of acid···base complexes to describe basicity and is therefore linked to Gibbs free energy. Site specific data for polyfunctional bases are also available. The pKBHX scale applies to all hydrogen-bond donors (HBDs) where the HBD functional group is either OH, NH, or NH+. It has been found that pKBHX can be described in terms of a descriptor defined by quantum chemical topology, ΔE(H), which is the change in atomic energy of the hydrogen atom upon complexation. Essentially the computed energy of the HBD hydrogen atom correlates with a set of 41 HBAs for five common HBDs, water (r2=0.96), methanol (r2=0.95), 4-fluorophenol (r2=0.91), serine (r2=0.93), and methylamine (r2=0.97). The connection between experiment and computation was strengthened with the finding that there is no relationship between ΔE(H) and pKBHX when hydrogen fluoride was used as the HBD. Using the methanol model, pKBHX predictions were made for an external set of bases yielding r2=0.90. Furthermore, the basicities of polyfunctional bases correlate with ΔE(H), giving r2=0.93. This model is promising for the future of computation in fragment-based drug design. Not only has a model been established that links computation to experiment, but the model may also be extrapolated to predict external experimental pKBHX values.

Entities: Chemical Disease Gene Species

Mesh：

Substances：
Organic Chemicals
Water

Year: 2014 PMID： 24460383 PMCID： PMC4004274 DOI： 10.1021/ci400657c

Source DB: PubMed Journal: J Chem Inf Model ISSN： 1549-9596 Impact factor: 4.956

Introduction

Recently, fragment-based drug design has emerged as a popular method for screening small chemical structures against a known drug target.[1−4] Modifications to the lead compound serve to improve the target selectivity and administration/absorption, distribution, metabolism, excretion/elimination, toxicology (ADMET) profile. The resulting modifications aim to increase the chance of discovering a novel drug candidate. The probability of discovering a novel drug candidate is increased if the modifications to the lead compound are informed. One concept broadly applied to help medicinal chemists make informed modifications to lead compounds is that of bioisosterism.[5−9] A bioisostere is a substituent or a fragment of an active compound with similar physical or chemical properties.[5] The concept of bioisosterism can be applied during the lead optimization process. Bioisosteric modifications can be made to lead compounds with the aim being to improve the ADMET profile of the lead. Only the properties important to the biological activity in each specific lead optimization need to be enhanced. However, the properties that are enhanced to improve the ADMET profile of one lead might unfortunately deliver contrasting results in the case of a different lead. Therefore, successful modifications of a lead are unique and specific to the lead itself. The unique combination of modifications that are specific to each successful lead optimization is often unknown. As the specific modifications are unknown for each lead, a number of broad properties have been outlined[7] that can be modified practically in the lead optimization process. These properties relate to both the drug molecule and its environment, and serve as a starting point when considering bioisosteric substitutions. The properties are size, shape, electronic distribution, lipid solubility, water solubility, pKa, chemical reactivity, and hydrogen-bonding capacity. Hydrogen bonding is an important noncovalent interaction between a biological molecule and its local environment.[10−13] Hydrogen bonding may influence binding affinity[14−16] and solvation,[17,18] membrane permeability,[19] and adsorption onto surfaces.[20] Hydrogen bonding influences both the distribution of a drug to its target and the binding of a drug to its target. After the drug is administered and absorbed into the body, it must be delivered to the target. As the drug is absorbed into the body, it interacts with the biological fluids that transport the drug around the body. Any exposed polar residues of the drug are capable of forming hydrogen bonds with the biological fluids as the drug becomes dissolved. The drug must then shed the solvent to bind the target. The hydrogen-bonding network involved in the displacement of the solvent with the target influences the thermodynamics of a drug interaction. The Gibbs energy of a drug interaction can be improved by focusing on hydrogen bonding. There is a favorable enthalpy change as the drug forms hydrogen bonds with the target. However, there is also an unfavorable enthalpy change as hydrogen bonds are broken as the drug sheds the layer of solvent upon binding. There is a favorable entropy change as the layer of solvent is shed upon binding. During binding, there is, however, an entropic penalty due to the loss of translational and rotational freedom. Favorable enthalpy and entropy changes must be maximized while minimizing unfavorable changes to improve the binding profile of a lead compound. One way in which this may be achieved is to modify the lead in such a way that the strongest hydrogen bonds are formed between the drug and its respective target. Knowledge of relative hydrogen-bond strengths is therefore important to medicinal chemists in lead optimization.

Background

Quantum Chemical Topology (QCT)

Quantum chemical topology is a method that extracts chemical information from wave functions by using[21,22] the language of dynamical systems. QCT generalizes the original quantum theory of atoms in molecules (QTAIM)[23,24] and builds on the idea of partitioning quantum mechanical property densities by means of a (gradient) vector field. This approach defines finite volumes in 3D space that express the partitioning of the quantum system. In this Article, we only use topological properties retrieved from the electron density (for details and definitions, see section 3). Bond critical points (BCPs), also denoted (3,–1), are featured in this study, together with the quantum mechanical functions evaluated at them. These local properties are complemented by global properties, which are obtained by 3D integration over the volume of a topological atom. These concepts have been reviewed many times before.

Hydrogen-Bond Basicity Scales

Despite the importance of hydrogen bonding in biological systems, there is a general lack of understanding of the relative strengths of hydrogen-bond basicity.[25] The first scales of hydrogen-bond basicity were set up by Taft and co-workers,[26,27] who investigated linear free energy relationships (LFER) between the reference acid 4-fluorophenol and a series of OH hydrogen-bond donors (HBDs). The HBDs were hydrogen bonded to a series of bases in 1:1 complexes. The equilibrium constants of these complexes were used to define the relative basicities of the hydrogen-bond acceptors (HBAs). The corresponding scale was called pKHB. However, the pKHB scale contains only 117 basicity values. Second, due to the nature of the LFERs, pKHB values are limited to (−OH) HBDs. When the LFERs were repeated using 5-fluoroindole instead of 4-fluorophenol as the reference acid, family-dependent behavior was observed.[28] Therefore, the pKHB scale covers only a limited proportion of all possible hydrogen bonds that may be formed, and has been made little use of by medicinal chemists. The log KBH scale, established by Abraham and co-workers,[29] was constructed using 34 reference acids complexed to a series of bases in a 1:1 ratio. The log KBH scale was intended to establish an overall generality of hydrogen-bond basicity. The log KBH values were obtained from 34 LFER equations in the form of eq 1:where the index i refers to the base, A represents the acid, LA and DA are the gradients, and log K is the intercept, respectively, of the 34 straight line equations. The 34 LFER equations were constrained to intersect the log KBH axis at a “magic point” of −1.1 log units. This constraint allowed an equivalent but more convenient linear transform of the log KBH scale to be set up. The linear transform of the log KBH scale was called the β2H scale. Adding the 1.1 log K units to the log KBH scale, and dividing by 4.636, the β2H scale conveniently ranges from 0 to 1. Therefore, a β2H value of 0 represents no hydrogen bonding, and a β2H value of 1 represents the strongest hydrogen bonding. Once the constants LA and DA in eq 1 are known for a particular acid, secondary log KBH values not included in the original data can be obtained. However, due to the statistical treatment of 34 such log KBH values to obtain β2H values, the determination of secondary β2H values not included in the original data set is not straightforward. Many organic bases and potential drug molecules have multiple hydrogen-bonding sites. These bases with multiple hydrogen-bonding sites are known as polyfunctional hydrogen bond bases. As the β2H scale of hydrogen-bond basicity was considered to be the most general, it was adapted to account for polyfunctional hydrogen-bond bases. The resulting scale was called the ∑β2H scale.[30] The ∑β2H scale is based on equilibrium constants, but values are not directly calculated from equilibrium constants. Instead, the ∑β2H values are based on the general solvation equation in the form of eq 2: In this equation, SP relates to a set of solute water–solvent partition coefficients in a given system, R2 is the excess molar refraction, π2H is the dipolarity or polarizability, and α2H and β2H are the hydrogen-bond acidity and hydrogen-bond basicity. The variable V is McGowan’s characteristic molecular volume (cm3 mol–1/100).[31] The parameters of eq 2 can be determined experimentally with the exception of the acidity or basicity terms. The solutes were restricted either to give an acidity term of 0 or to monoacidic solutes for which the acidity term is known.[18] Hence, the only unknown term is the overall basicity. The ∑β2H values were estimated applying β2H values to eq 2 and back-calculating to return successive ∑β2H values in an iterative process until a self-consistent set of equations and therefore ∑β2H were obtained. It was therefore claimed that the ∑β2H scale of overall hydrogen-bond basicity was more useful than the β2H scale when considering biological properties.[30] Despite being the most quoted scale of hydrogen-bond basicity,[25,32] and despite the authors’ claim that it is more useful than 1:1 scales in the interpretation of biochemical and physiochemical properties,[30] there are problems in the interpretation of ∑β2H values. The ∑β2H scale assigns only one ∑β2H value to the whole molecule, and is based on partition coefficients rather than binding. Although medicinal chemists may make use of ∑β2H values to predict permeability, the ∑β2H scale offers no thermodynamic binding information. It is clear that there is a need for a measure of hydrogen-bond basicity that targets the needs of medicinal chemists. The requirement for this scale is that it should combine the simplicity of the pKHB scale and its thermodynamic relevance with the analysis of polyfunctional bases as in the ∑β2H scale. The pKBHX database[25] seems to meet these requirements. The pKBHX scale is essentially an extension of the pKBH scale in that it is based on equilibrium constants of 1:1 complexes in the solvent CCl4 where 4-fluorophenol is the reference acid. The pKBHX values are therefore related to Gibbs energy and have thermodynamic relevance. However, the pKBHX scale also accounts for polyfunctional bases. An important difference between the pKBHX scale, on one hand, and the pKBH and ∑β2H scales on the other hand, is the experimental method used to calculate the equilibrium constants. The difference in experimental methods allows polyfunctional bases to be included in the analysis. Fourier transform infrared (FTIR) spectrometry is used in the determination of the pKBHX scale rather than 19F NMR, UV, and dispersive IR techniques,[25] which are mostly used for pKBH and ∑β2H. The advantage of using FTIR is that multiple significant hydrogen-bond acceptor sites of polyfunctional bases can be analyzed.[25] During the formation of a hydrogen bond, the stretching vibration of the XH bond is shifted to a lower wavenumber, giving rise to peaks in a FTIR spectrum. Each peak represents a hydrogen-bond site. The change in the vibrational frequency of the XH bond can be translated into a pKBHX value through family-dependent pKBHX–IR shift relationships.[33−36] The family-dependent pKBHX–IR relationships correlate the change in IR vibrational frequency of the XH bond to pKBHX values for a family of HBAs and a reference HBD in 1:1 complexes. The FTIR spectroscopy method allows equilibrium concentrations of these 1:1 complexes to be measured in a certain frequency range, therefore allowing a pKBHX value to be obtained. This allows a pKBHX value to be assigned to each HBA site separately. It is clear that the pKBHX scale is able to analyze polyfunctional bases as well as the ∑β2H scale. An important difference, however, is that pKBHX values are thermodynamically significant because each basic site is considered separately. On the other hand, the ∑β2H scale does not consider each hydrogen-bond acceptor site individually. This feature is illustrated by the fact that the pKBHX database contains 1338 pKBHX values taken from 1164 bases. This large data set is advantageous to medicinal chemists as most drugs contain several hydrogen-bond acceptor sites consisting of mainly O, N, S, F, Cl, and sp2 hydridized carbons. It is therefore a possibility that the pKBHX scale could be used as a descriptor characterizing hydrogen bonding from which bioisosteric replacements could be found. The ∑β2H scale considers the entire molecule to be thought of as the hydrogen-bond acceptor, whereas the pKBHX scale considers each individual hydrogen-bond acceptor site independently. This has important implications for fragment-based drug design. The ∑β2H scale depends on the entire molecule being intact to return a ∑β2H value. The pKBHX scale, however, allows for each hydrogen-bond acceptor site to be assigned a pKBHX value. Therefore, a fragmented drug-like molecule may be assigned pKBHX values but not ∑β2H values.

Computational Predictions of Hydrogen-Bond Basicity Values

There have been a number of computational models used to predict hydrogen-bond basicity. The minimum value of the electrostatic potential at the hydrogen-bond acceptor atom[37,38] has been used to obtain a family-dependent model for the hydrogen-bond basicity parameter β used in linear solvation energy relationships.[39] A more general model displaying a reasonable family-independent correlation, r2 = 0.81, between log KHB and the minimum value of the electrostatic potential has also been established.[40] Basicity is typically more difficult than acidity to model due to the variety of different HBA sites. By considering the nitrogen HBAs separately, the correlation can be improved when correlating pKHB to the minimum of the electrostatic potential,[41] and refined further by also including the interaction energy, giving r2 = 0.99. Platts[42] developed a model to predict hydrogen-bond basicity based on DFT calculations on a series of bases taken from Abraham’s list of ∑β2H values.[43] Theoretical properties were calculated for both the isolated base and the base complexed with HF. Once again, strong family-dependent relationships were observed. The atomic multipole moments of nitrogens occurring in the isolated bases gave the best correlations, r2 = 0.786. The combination of the minimum electrostatic potential at the nucleus of the HBA and an atomic charge (according to QCT) of the isolated base gave a better correlation, yielding r2 = 0.887 for oxygen HBAs. However, with the use of a HF probe, improvements were made with the oxygen model yielding r2 = 0.939. The binding energy calculated from the supramolecular complex correlated reasonably well, r2 = 0.901, to give a good family-independent model. It could be argued that the ∑β2H scale might not be the best measure of basicity for the medicinal chemist to predict. This is due to the lack of HBA site-specific information and thermodynamic information offered by the scale. A recent study repeated the supramolecular approach of Platts, this time using phenol as the HBD, and found[43] good general correlations (r2 = 0.94) between log Kβ values and the hydrogen-bond binding energy for nitrogen and oxygen HBAs. Platts later refined his initial model correlating free energy of formation with pKHB values.[44] Fits of r2 = 0.969 were observed for a family-independent model using HF as the HBD. The final refinement to this model was made by combining the minimum of the electrostatic potential at the HBA and properties of the hydrogen-bond critical point (BCP),[45] leading to r2 = 0.97. Devereux et al. suggested[46] this model to be too expensive computationally for use in QCT-aided drug design.[47] These authors constructed a model[46] by combining the minimum, median, and mean of the electrostatic potential at the HBA with the minimum of the local ionization energy on the HBA surface and correlating with β2H values. They observed family-dependent models with non-nitrogen bases producing fits of r2 = 0.931, while nitrogen bases gave r2 = 0.871. However, the results were found to be less accurate than Platts’ model,[42] which correlated properties of hydrogen-bonded complexes with ∑β2H values. It would therefore seem that the most accurate models used to predict hydrogen-bond basicity must be constructed from properties of the hydrogen-bonded complex. Also, by using β2H values, the model is limited to the 1:1 hydrogen-bonded complexes only. However, in drug···target interactions, 1(HBA):n(HBD) complexes are likely to occur, and an improvement to the model could be made by correlating to a measure of hydrogen-bond basicity that accounts for all major hydrogen-bond acceptor sites of a particular molecule. The previous models are limited by 1:1 ratios. However, drug design warrants the inclusion of polyfunctional bases and therefore needs a model that goes beyond the 1:1 ratio. Such a refined model must keep the accuracy of Platts’ model by considering properties of the hydrogen-bonded complex, while taking into account polyfunctional hydrogen-bond acceptors. A good candidate to fulfill the above requirement is the pKBHX scale. The advantage of using pKBHX values over ∑β2H values is that a pKBHX value is listed for each hydrogen-bond acceptor site. This important feature allows for the correlation between a single value of a particular property of a specific hydrogen bond and a single value relating to the basicity of that hydrogen bond.

Method and Computational Details

A training set of 41 bases was selected from the pKBHX database.[25] The bases were chosen to include a wide range of pKBHX values for nitrogen, oxygen, and sulfur acceptor atoms. The strength of the nitrogen bases ranges from weaker nitrile and aniline acceptors, to stronger pyridine and amines. The oxygen bases include weaker phenol and alcohols, and stronger carbonyl and amide acceptors. A small set of sulfide and thiol sulfur acceptors also formed part of the training set. The distribution of the pKBHX values within the training set was chosen to roughly match that of the entire pKBHX database.[25] Hydrogen-bond complexes (1:1) were formed between the 41 training bases and 6 possible HBDs. The latter consist of 4 OH donors (water, methanol, 4-fluorophenol, and serine), complemented by methylamine and hydrogen fluoride. The pKBHX scale was set up using 4-fluorophenol as the reference HBD, and is thought to be applicable to all OH, NH+, and NH donors.[25] Therefore, the chosen HBDs in this study, with the exception of hydrogen fluoride, should give complexes that the pKBHX scale can be applied to. DFT calculations were performed using Gaussian 03.[48] A set of hydrogen-bonded complexes were set up, based on those investigated by Platts.[42] They were generated by placing a hydrogen atom of the HBD approximately 2.0 Å away from the primary HBA site. The essence of the hydrogen-bond system is then conveniently denoted by D–H···A, where H···A is the hydrogen bond and D–H is the covalent HBD bond. The hydrogen-bonded complexes were geometry optimized at the B3LYP/6-311++G(2d,p) level of theory. Properties of the complexes were obtained and correlated with their respective pKBHX values. The properties include the hydrogen-bond length r(A···H), the change in the HBD bond length Δr(H–D), and the hydrogen-bond energy ΔE, the details of which are explained below. Note that the symbol Δ denotes the difference of the complex and the constituent monomers forming it. Atomic properties are obtained by a volume integral over an atomic basin Ω. The kinetic energy densities G(r) and K(r), and the atomic energy E(Ω), are defined in eqs 3, 4, and 5:where G(r) is one type of kinetic energy density, while K(r) is another type. The symbol ∫dτ′ refers to the integration over the spatial and spin coordinates of all electrons except one, dr denotes a three-dimensional integration over the basin of the topological atom Ω, while R represents the virial ratio correction factor.[23] It is an important property of a topological atom that its kinetic energy is well-defined (which is not true for an arbitrary subspace). Indeed, G(Ω) = K(Ω). The atomic charges are defined in eqs 6 and 7:where Z is the charge of the nucleus inside the topological atom. All atomic integrations used to analyze atomic properties were calculated with the AIMAll suite.[49] The atomic properties used here are ΔE(H), or the change (going from the monomers to the complex) in the HBD energy, and Δq(H), or the change in charge of the hydrogen atom in the HBD. Bond properties were evaluated and analyzed only at BCPs associated with H···A hydrogen bonds and BCPs of the covalent HBD bonds (D–H), both in the hydrogen-bond system D–H···A. Properties analyzed at the hydrogen BCP are the electron density ρ(A···H), the Laplacian of the electron density ∇2ρ(A···H), the kinetic energy density at the BCP G(A···H), and the kinetic energy density G(r) at the BCP, divided by the electron density G/ρ(A···H). Properties of the HBD critical point are the change in the electron density Δρ(H–D), and the change in the Laplacian of the electron density Δ∇2ρ(H–D). Properties of the BCPs were obtained from the MORPHY98 program.[50,51] This program also generated Figure 1, which shows an atomic basin of a HBD and all BCPs of a hydrogen-bonded complex.

Figure 1

A QCT visualization of the acetonitrile–methylamine complex. The purple dots are the bond critical points (BCPs), and the atomic basin of the hydrogen in the hydrogen bond is marked in wireframe (where the electron density has been cut off at ρ = 10–3 au at the outer surfaces). This atomic basin is integrated over to obtain ΔE(H). Basis set superposition error (BSSE) calculations were carried out to accurately estimate the hydrogen-bond binding energy for the water series only. As the BSSE of hydrogen-bond energy turned out to be very small, BSSE calculations were not repeated for complexes without water. The method used to perform the BSSE calculations is that of Handy and co-workers.[52] Equations 8, 9, and 10 show the details:Equation 8 shows how to calculate the hydrogen-bond energy without considering the BSSE. Equation 9 calculates the BSSE of the hydrogen-bond energy using the counterpoise correction method.[53] In eq 9, the asterisk refers to the geometry of the HBA and HBD in the optimized complex, and CP stands for counterpoise and refers to the energy of the HBA and HBD in the presence of ghost HBD and HBA Gaussian functions, respectively. Equation 10 calculates the counterpoise corrected hydrogen-bond binding energy from eqs 8 and 9. Criticism of the B3LYP functional led to several investigations showing how the MPWB1K functional consistently performs better for weak interactions such as hydrogen bonding.[54−56] To investigate this criticism in the context of the current work, a further set of calculations using the MPWB1K/6-311++G(2d,p) level was performed for the training set of bases where water was used as a HBD. To account for polyfunctional bases, calculations were performed at the B3LYP/6-311++G(2d,p) level on a set of 11 bases with 2 hydrogen-bond acceptor sites giving 22 acceptor sites in total. Initially, 22 1:1 complexes were generated. The properties calculated from the 1:1 complexes have been compared to properties calculated from the same set of bases in 2:1 complexes. The HBD chosen for these calculations was methanol.

Results and Discussion

Table 1 list the names of the 41 bases (i.e., HBAs) and their corresponding pKBHX values.

Table 1

Training Set of 41 Bases Used as Hydrogen-Bond Acceptors (HBAs) and Their Corresponding pKBHX Values

hydrogen-bond acceptor	pK_BHX
3-chloropyridine	1.31
4-methylpyridine	2.07
acetamide	2.06
acetone	1.18
acetonitrile	0.91
acrylonitrile	0.70
aniline	0.46
chloroacetonitrile	0.39
dimethyl sulfide	0.12
dimethylamine	2.26
ethanol	1.02
ethyl thiol	–0.16
ethylamine	2.17
formamide	1.75
MeSCN	0.73
methanol	0.82
methyl acetate	1.00
methylamine	2.2
methylformate	0.65
N-methylaniline	0.26
phenol	–0.07
pyridine	1.86
pyrrolidine	2.59
t-butylamine	2.23
tetrahydropyran	1.23
2,6-dimethylaniline	0.47
3-chloroaniline	0.13
3-fluoroaniline	0.20
3-methylphenol	0.01
4-methylphenol	0.03
dimethyldisulfide	–0.49
ethylmethylsulfide	0.18
p-toluidine	0.56
4-fluorophenol	–0.12
4-bromo-N,N-dimethylaniline	–0.42
oxydibenzene	–0.80
diethyl disulfide	–0.40
4-aminopyridine	2.56
N,N,N-trimethyl ammoniopropanamidate	3.59
triethylarsine oxide	4.89
trimethylamine oxide	5.46

Each of the HBAs listed in Table 1 has been used to form a hydrogen-bonded complex with each of the six HBDs, which are water, methanol, 4-fluorophenol, methylamine, serine, and HF. Examples of these complexes are given in Figure 2, except for HF, illustrating each possible element (S, O, and N) as an acceptor.

Figure 2

Five examples of hydrogen-bonded complexes in their optimized geometry: (A) water···pyridine, (B) methanol···phenol, (C) 4-fluorophenol···dimethyl sulfide, (D) methylamine···acetamide, and (E) serine···dimethyl amine. Table 2 summarizes the correlations of calculated properties (energetic, geometric, atomic, and BCP) with the experimental pKBHX values. With the exception of the hydrogen fluoride set, the quantity ΔE(H) consistently outperforms all other properties, giving r2 values of 0.96 for water, 0.95 for methanol, 0.91 for 4-fluorophenol, 0.93 for serine, and 0.97 for methylamine. Particular attention was given to the electronic energy of the hydrogen bond, ΔE. However, poor correlations across the board of OH and NH donors were observed. The BSSE corrected energy offered no improvement to the correlations. However, the plot of ΔE against pKBHX where hydrogen fluoride was used as the probe gave a correlation of r2 = 0.60. Although 0.60 is a weak correlation, it is significantly better than any of the values obtained for OH and NH donors (which are generally about 0.30 with the exception of 4-fluorophenol with r2 = 0.52). Lamarche and Platts[44] obtained a value of r2 = 0.91 for the correlation between computed ΔE and Taft’s pKHB values using hydrogen fluoride as the HBD. Taking into account the stronger correlations of ΔE with pKBHX and the observations of Lamarche and Platts when hydrogen fluoride is used as the HBD, it could be said that any relationship between ΔE and basicity is highly questionable due to the use of hydrogen fluoride as a probe. Indeed, the computed ΔE of a hydrogen fluoride complex correlates reasonably well to basicity values, but a link cannot be made between theory and experiment as the basicity scales are not applicable to hydrogen fluoride complexes.

Table 2

Correlation Coefficients r2 of Calculated Properties (Energetic, Geometric, Atomic, and BCP) with the Experimental pKBHX Values of the Data Set Listed in Table 1a

	hydrogen-bond donor
property	water	methanol	4-fluorophenol	serine	methylamine	hydrogen fluoride
r(A···H)	0.46	0.49	0.49	0.46	0.52	0.41
Δr(H–O)	0.80	0.81	0.80	0.76	0.87	0.38
ΔE	0.32 (0.91)	0.33	0.52	0.29	0.05	0.60
Δq(H)	0.58	0.56	0.57	0.64	0.81	0.00
ΔE(H)	0.96 (0.97)	0.95	0.91	0.93	0.97	0.04
ρ(A···H)	0.81	0.86	0.88	0.79	0.90	0.74
∇²ρ(A···H)	0.64	0.68	0.55	0.58	0.83	0.10
Δρ(H–D)	0.81	0.82	0.81	0.78	0.83	0.73
Δ∇² ρ(H–D)	0.59	0.51	0.65	0.62	0.66	0.65
G(A···H)	0.80	0.84	0.83	0.79	0.83	0.71
G/ρ(A···H)	0.09	0.09	0.06	0.08	0.11	0.01

The ΔE and ΔE(H) values in brackets in the water set are taken from MPWB1K calculations. A slightly reduced data set of 35 HBAs was used for MPWB1K calculations. The data set is as in Table 1, minus 3-chloropyridine, dimethyl sulfide, ethyl thiol, ethylmethylsulfide, 4-fluorophenol, and diethyldisulfide. These HBAs were omitted due to computational timing constraints. As a direct comparison, r2 = 0.97 for ΔE(H) when the reduced data set is used for B3LYP calculations. The MPWB1K density functional has been shown to perform better than B3LYP when computing the interaction energies of hydrogen-bonded complexes.[54−56] Table 2 reveals that hydrogen-bond binding (interaction) energy correlates with pKBHX for MPWB1K calculations but not B3LYP calculations. However, as the binding energies of the complexes used here do not correlate with pKBHX values as strongly as ΔE(H), it is adequate to use B3LYP. Furthermore, Table 2 shows how ΔE(H) calculated from a MPWB1K wave function performs equally as well as ΔE(H) calculated from a B3LYP wave function when correlating ΔE(H) to pKBHX. Correlation of the QCT ΔE(H) with pKBHX values for the bases listed in Table 2 with HBDs: (A) water, pKBHX = 197.34ΔE(H) – 3.749, r2 = 0.96, se = 0.30, F = 829.49 (“se” = standard error); (B) methanol, pKBHX = 180.94ΔE(H) – 3.7074, r2 = 0.95, se = 0.31, F = 740.62; (C) 4-fluorophenol, pKBHX = 170.44ΔE(H) – 3.8052, r2 = 0.91, se = 0.42, F = 391.90; (D) serine, pKBHX = 179.19ΔE(H) – 1.6701, r2 = 0.93, se = 0.37, F = 522.46; (E) methylamine, pKBHX = 194.73ΔE(H) – 2.3995, r2 = 0.97, se = 0.26, F = 1089.93. Figure 3 shows the plots of ΔE(H) against pKBHX for water, methanol, 4-fluorophenol, serine, and methylamine. However, pKBHX values did not correlate with ΔE(H) when HF was used as the HBD, as made from the very low correlation coefficient r2 = 0.04. It is claimed[25] that the pKBHX scale is applicable to hydrogen-bonded complexes provided the HBD’s functional group is either OH, NH, or NH+. Therefore, hydrogen fluoride was chosen as a control probe because hydrogen-bonded complexes, where hydrogen fluoride is the HBD, have no link to the experimental pKBHX scale. The relationship between ΔE(H) and pKBHX values gains credibility as it holds only for HBDs that the pKBHX scale may be applied to. Therefore, a strong link between theory and experiment has been established through the partitioned atomic energy of the HBD. Essentially, the energy assigned to a theoretically derived atomic basin is related to a molecular free energy quantity that is derived from experiment. It is also astonishing that ΔE(H) is able to predict the pKBHX of an external data set.

Figure 3

Correlation of the QCT ΔE(H) with pKBHX values for the bases listed in Table 2 with HBDs: (A) water, pKBHX = 197.34ΔE(H) – 3.749, r2 = 0.96, se = 0.30, F = 829.49 (“se” = standard error); (B) methanol, pKBHX = 180.94ΔE(H) – 3.7074, r2 = 0.95, se = 0.31, F = 740.62; (C) 4-fluorophenol, pKBHX = 170.44ΔE(H) – 3.8052, r2 = 0.91, se = 0.42, F = 391.90; (D) serine, pKBHX = 179.19ΔE(H) – 1.6701, r2 = 0.93, se = 0.37, F = 522.46; (E) methylamine, pKBHX = 194.73ΔE(H) – 2.3995, r2 = 0.97, se = 0.26, F = 1089.93.

An external data set of 41 compounds has been chosen to demonstrate that the QCT property ΔE(H) can predict pKBHX values of HBAs chosen from the pKBHX database. The HBAs that form the external data set are listed in Table 3.

Table 3

External Test Set of 41 Bases Used as Hydrogen-Bond Acceptors, and Their pKBHX Values

hydrogen-bond acceptor	pK_BHX
4-chloropyridine	1.54
4-ethylpyridine	2.07
benzaldehyde	0.78
benzylamine	1.84
butan-2-one	1.22
cyanamide	1.56
dimethylformamide	2.1
isopropanethiol	–0.1
isopropylamine	2.2
propanol	1
propionitrile	0.93
propiophenone	1.04
trichloro acetonitrile	–0.26
2,4,6-trimethylpyridine	2.29
2-aminopyridine	2.12
2-chloropyridine	1.05
3-aminopyridine	2.2
4-chloro-N-methylaniline	0.05
5-bromopyrimidine	0.59
ammonia	1.74
anisole	–0.07
benzyl methylsulfide	–0.02
isoquinoline	1.94
N-methylbenzamide	2.03
N-methylpropanamide	2.24
1,3,5-triazine	0.32
4,6-dimethylpyrimidine	1.47
4-chlorobutyronitrile	0.83
cyanicbromide	0.19
cyclooctanone	1.45
diisopropylether	1.11
ethyl chloroacetate	0.67
ethylenesulfite	0.87
isoxazole	0.81
N-formylmorpholine	1.93
phenylcyanate	0.77
phthalazine	1.97
pyridazine	1.65
pyrrolidine-1-carbonitrile	1.66
progesterone	1.75
diphenylphosphinic chloride	2.17

Figure 4 plots the predicted versus experimental pKBHX values for the HBAs listed in Table 3. Predicted pKBHX values are calculated from eq 11:which corresponds to the least-squares line in panel B in Figure 3. This panel plots ΔE(H) against pKBHX, where methanol is the HBD, complexed to the HBAs of Table 2. Methanol was chosen as the probe to use in the model for three reasons: (i) its simplicity, (ii) its use in obtaining secondary LFER derived experimental values, and (iii) the strong correlation of r2 = 0.95 between the ΔE(H) of these complexes and their pKBHX values. Although the correlation between ΔE(H) and pKBHX for the methylamine set (r2 = 0.97) is stronger than the methanol set, it is challenging to model specific hydrogen bonds with methylamine due to the presence of an additional hydrogen attached to the nitrogen as compared to OH donors that have a single hydrogen. This additional hydrogen may cause the molecule to rotate during optimization and form interactions other than the desired one, a characteristic often observed in this work due to no geometrical constraints placed upon the optimization. Therefore, it is easier to describe pKBHX in terms of ΔE(H) on water or methanol HBDs. An F-test was performed, and the variances between the water and methanol models were not found to be significantly different. Following, a paired t test assuming equal variances was performed. Again, the water and methanol models were not found to be significantly different. Therefore, there is no statistical advantage to choosing methanol as the probe over water. Water would in fact be computationally less expensive than methanol. However, methanol has been chosen due to its usefulness for obtaining secondary pKBHX values from LFERs. The advantage that methanol can be used in such a way justifies the slightly lengthier calculations. Therefore, by using methanol as the probe, the computation is kept in close proximity to the experimental procedure.

Figure 4

Correlation of the predicted pKBHX values with experimental pKBHX values for the bases listed in Table 3. The common HBD is methanol. Predictions are based on the straight line equation for methanol complexes (Figure 3B). Experimental = 0.8503 × predicted – 0.0011; r2 = 0.90, se = 0.75, F = 1.80. Figure 4 shows a relatively strong correlation, r2 = 0.90, for the plot of predicted versus experimental pKBHX values. This demonstrates impressive extrapolation of the original model. The only constraint put on the compounds in Table 3 is that the HBA must be either an oxygen, a nitrogen, or a sulfur atom. Not only does the model based on ΔE(H) show impressive external predictability, but also good generality. This is because all HBAs can be grouped together in one data set. There seems to be no need to separate the HBAs into atomic groups (i.e., families) nor build individual models for each HBA type. The results discussed so far are for 1:1 HBD:HBA complexes where the bases have only one major HBA site. The attractiveness of the pKBHX scale to the medicinal chemist lies in its inclusion of data for the more realistic cases of polyfunctional bases with two or more major HBA sites. Table 4 shows a list of 11 HBAs, each with two major HBA sites (so leading to 22 entries). Two approaches were used to test the validity of the ΔE(H) model for polyfunctional bases. The first approach simply involved generating a 1:1 complex for each HBA site. The second approach aimed to mimic experiment in which the bases are surrounded by an excess of acid allowing each HBA site to reach equilibrium with a HBD. The way in which the experiment was mimicked consisted of generating 2:1 HBD:HBA complexes where both HBA sites are occupied in the same calculation. Figure 5 shows the results of the two approaches. The first approach in which 1:1 complexes were generated marginally outperforms the second approach in which 2:1 complexes were computed. This may be surprising as 2:1 complexes are closer to the experimental approach. However, the fact that 1:1 complexes slightly outperform 2:1 complexes is beneficial to the computational chemist because 1:1 complexes are computationally less expensive than the 2:1 complexes.

Table 4

Set of 11 Polyfunctional Bases Used as HBAs and Their pKBHX Valuesa

hydrogen-bond acceptor	pK_BHX
methyl nicotinate (N)	1.44
methyl nicotinate (O)	0.51
3-benzoylpyridine (N)	1.42
3-benzoylpyridine (O)	0.68
4-acetylpyridine (N)	1.41
4-acetylpyridine (O)	0.78
ethyl 4-cyanobenzoate (N)	0.66
ethyl 4-cyanobenzoate (O)	0.53
S-cotinine (N)	1.62
S-cotinine (O)	2.16
4-acetylbenzonitrile (N)	0.65
4-acetylbenzonitrile (O)	0.6
3-acetylpyridine (N)	1.39
3-acetylpyridine (O)	0.9
N,N-diethylnicotinamide (N)	1.63
N,N-diethylnicotinamide (O)	1.98
4-cyanopyridine (Pyr)	0.92
4-cyanopyridine (Nit)	0.47
3-cyanopyridine (Pyr)	0.82
3-cyanopyridine (Nit)	0.53
2-cyanopyridine (Pyr)	0.48
2-cyanopyridine (Nit)	0.61

Figure 5

Correlation between the QCT ΔE(H) value and the pKBHX values for the 11 polyfunctional bases (each with two HBA sites) listed in Table 4 with the HBD methanol common to all. (A) 1:1 complex, (B) 2:1 complex. (A) pKBHX = 178.72ΔE(H) – 3.9521, r2 = 0.93, se = 1.39, F = 279.50; (B) pKBHX = 166.86ΔE(H) – 3.4887, r2 = 0.92, se = 0.15, F = 224.50.

The symbol in brackets indicates the HBA site where (N) is a nitrogen atom, (O) is an oxygen atom, (Nit) is a nitrogen atom on a nitrile functional group, and (Pyr) is a nitrogen atom on a pyridine functional group. Correlation between the QCT ΔE(H) value and the pKBHX values for the 11 polyfunctional bases (each with two HBA sites) listed in Table 4 with the HBD methanol common to all. (A) 1:1 complex, (B) 2:1 complex. (A) pKBHX = 178.72ΔE(H) – 3.9521, r2 = 0.93, se = 1.39, F = 279.50; (B) pKBHX = 166.86ΔE(H) – 3.4887, r2 = 0.92, se = 0.15, F = 224.50.

Conclusions

It has been shown that the energy, E(H), assigned to a theoretically derived atomic basin of a molecule in the gas phase is related to a molecular free energy quantity that has been derived from experiment, pKBHX. Furthermore, the relationship breaks down when hydrogen fluoride is used as the hydrogen-bond donor. Therefore, the relationship between computation and experiment is only observed with computations that use hydrogen-bond donors that the experimental scale is applicable to. It has also been shown that pKBHX values outside the initial data set are accurately predicted by the model set up with the initial data set. Therefore, ΔE(H) is able to predict the pKBHX values of an external data set. The attraction of the pKBHX scale is that a pKBHX value is listed for each hydrogen-bond acceptor site on polyfunctional bases. It has also been found that ΔE(H) relates to pKBHX values of polyfunctional bases. This is an important observation when considering site-specific thermodynamic data and could help to improve computational drug design and development.

24 in total

1. Theoretical prediction of the hydrogen-bond basicity pK(HB).

Authors: Olivier Lamarche; James A Platts
Journal: Chemistry Date: 2002-01-18 Impact factor: 5.236

Review 2. Application of hydrogen bonding calculations in property based drug design.

Authors: Michael H Abraham; Adam Ibrahim; Andreas M Zissimos; Yuan H Zhao; John Comer; Derek P Reynolds
Journal: Drug Discov Today Date: 2002-10-15 Impact factor: 7.851

3. Atomic properties of amino acids: computed atom types as a guide for future force-field design.

Authors: Paul L Popelier; Fiona M Aicken
Journal: Chemphyschem Date: 2003-08-18 Impact factor: 3.102

4. Benchmark Databases for Nonbonded Interactions and Their Use To Test Density Functional Theory.

Authors: Yan Zhao; Donald G Truhlar
Journal: J Chem Theory Comput Date: 2005-05 Impact factor: 6.006

5. Design of density functionals that are broadly accurate for thermochemistry, thermochemical kinetics, and nonbonded interactions.

Authors: Yan Zhao; Donald G Truhlar
Journal: J Phys Chem A Date: 2005-06-30 Impact factor: 2.781

6. Fragment-based drug discovery.

Authors: Wendy A Warr
Journal: J Comput Aided Mol Des Date: 2009-06-24 Impact factor: 3.686

7. Theoretical prediction of hydrogen bond strength for use in molecular modeling.

Authors: Monika Nocker; Sandra Handschuh; Christofer Tautermann; Klaus R Liedl
Journal: J Chem Inf Model Date: 2009-09 Impact factor: 4.956

Review 8. Three-dimensional pharmacophore methods in drug discovery.

Authors: Andrew R Leach; Valerie J Gillet; Richard A Lewis; Robin Taylor
Journal: J Med Chem Date: 2010-01-28 Impact factor: 7.446

Review 9. Helix capping.

Authors: R Aurora; G D Rose
Journal: Protein Sci Date: 1998-01 Impact factor: 6.725

10. A theoretical evaluation of the pKHB and DeltaH[symbol: see text]HB hydrogen-bond scales of nitrogen bases.

Authors: François Besseau; Jérôme Graton; Michel Berthelot
Journal: Chemistry Date: 2008 Impact factor: 5.236

3 in total

1. Charting Hydrogen Bond Anisotropy.

Authors: Diogo Santos-Martins; Stefano Forli
Journal: J Chem Theory Comput Date: 2020-03-10 Impact factor: 6.006

2. Modeling biophysical and biological properties from the characteristics of the molecular electron density, electron localization and delocalization matrices, and the electrostatic potential.

Authors: Chérif F Matta
Journal: J Comput Chem Date: 2014-04-29 Impact factor: 3.376

3. Machine learning models for hydrogen bond donor and acceptor strengths using large and diverse training data generated by first-principles interaction free energies.

Authors: Christoph A Bauer; Gisbert Schneider; Andreas H Göller
Journal: J Cheminform Date: 2019-09-11 Impact factor: 5.514

3 in total