Jimmy C Kromann, Frej Larsen, Hadeel Moustafa, Jan H Jensen.
Abstract
The PM6 semiempirical method and the dispersion and hydrogen bond-corrected PM6-D3H+ method are used together with the SMD and COSMO continuum solvation models to predict pKa values of pyridines, alcohols, phenols, benzoic acids, carboxylic acids, and amines using isodesmic reactions, and the predictions are compared to published ab initio results. The pKa values of pyridines, alcohols, phenols, and benzoic acids considered in this study can generally be predicted with PM6 and ab initio methods to within the same overall accuracy, with average mean absolute differences (MADs) of 0.6-0.7 pH units. For carboxylic acids, the accuracy (0.7-1.0 pH units) is also comparable to ab initio results if a single outlier is removed. For primary, secondary, and tertiary amines the accuracy is, respectively, similar (0.5-0.6), slightly worse (0.5-1.0), and worse (1.0-2.5), provided that di- and tri-ethylamine are used as reference molecules for secondary and tertiary amines. When applied to a drug-like molecule where an empirical pKa predictor exhibits a large (4.9 pH unit) error, we find that the errors for PM6-based predictions are roughly the same in magnitude but opposite in sign. As a result, most of the PM6-based methods predict the correct protonation state at physiological pH, while the empirical predictor does not. The computational cost is around 2-5 min per conformer per core processor, making PM6-based pKa prediction computationally efficient enough to be used for high-throughput screening using on the order of 100 core processors.
Keywords: Drug design; Electronic structure; Semiempirical methods; pKa prediction
Year: 2016 PMID: 27602298 PMCID: PMC4991863 DOI: 10.7717/peerj.2335
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
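The isodesmic approach named in the abstract predicts a pKa relative to a reference molecule with a known pKa: the free energy change of the proton-exchange reaction between target and reference is converted into a pKa shift. The following is a minimal sketch of that arithmetic, not the authors' code. It assumes free energies in kcal/mol and T = 298.15 K; the function names and the example numbers are hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def reaction_free_energy(g_target_acid, g_target_base, g_ref_acid, g_ref_base):
    """Free energy (kcal/mol) of  HA + Ref- -> A- + HRef.

    For bases the same expression applies to  BH+ + Ref -> B + RefH+.
    All four inputs are solution-phase free energies (assumed already computed).
    """
    return (g_target_base + g_ref_acid) - (g_target_acid + g_ref_base)


def isodesmic_pka(pka_ref, delta_g, temperature=298.15):
    """pKa of the target from the reference pKa and the reaction free energy."""
    return pka_ref + delta_g / (R_KCAL * temperature * math.log(10))


def fraction_protonated(pka, ph=7.4):
    """Henderson-Hasselbalch fraction of the protonated form at a given pH."""
    return 1.0 / (1.0 + 10 ** (ph - pka))


# Hypothetical example: reference pKa 10.7 and a reaction free energy of -2.0 kcal/mol
pka = isodesmic_pka(10.7, -2.0)
print(f"predicted pKa = {pka:.1f}, protonated at pH 7.4: {fraction_protonated(pka):.2f}")
```

At room temperature about 1.36 kcal/mol of reaction free energy corresponds to one pKa unit, which is why errors of a few kcal/mol in the computed free energies can flip the predicted protonation state at physiological pH.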
Figure 3. The structure of compound 1, heliotridane, and benzylpyrrolidine.
List of molecules and experimental pKa values used for Table 2.
The first entry for each functional group is the reference used to compute the pKa values and the corresponding reference pKa value. The pKa values are taken from Sastre et al. (2012).
| Pyridines | pKa | Alcohols | pKa | Carboxylic acids | pKa |
|---|---|---|---|---|---|
| Pyridine | 5.2 | Ethanol | 15.9 | Acetic acid | 4.8 |
| 2-Methylpyridine | 6.0 | Methanol | 15.5 | Formic | 3.8 |
| 3-Methylpyridine | 5.7 | Propanol | 16.1 | Benzoic | 4.2 |
| 4-Methylpyridine | 6.0 | i-Propanol | 17.1 | Hexanoic | 4.8 |
| 2,3-Dimethylpyridine | 6.6 | 2-Butanol | 17.6 | Propanoic | 4.9 |
| 2,4-Dimethylpyridine | 7.0 | tert-Butanol | 19.2 | Pentanoic | 4.9 |
| 3-Fluoropyridine | 3.0 | | | Trimethylacetic | 5.1 |
| 3-Cyanopyridine | 1.5 | | | | |
Mean absolute differences (MADs) and maximum absolute difference (Max AD) of predicted pKa values relative to experimental values for the molecules listed in Table 1.
CBS-4B3*, B3LYP, and M05-2X refer to predictions made by Sastre et al. (2012) using a modified CBS-4B3 composite method and the SMD solvation method, B3LYP/6-311++G(d,p)/SMD and M05-2X/6-311++G(d,p)/SMD, respectively.
| | CBS-4B3 | B3LYP/SMD | M05-2X/SMD | PM6-D3H+/SMD | PM6-D3H+/SMD | PM6/SMD | PM6/COSMO |
|---|---|---|---|---|---|---|---|
| MAD | 0.2 | 0.4 | 0.3 | 1.2 | 1.2 | 1.3 | 0.7 |
| Max AD | 0.6 | 0.8 | 0.7 | 3.9 | 4.0 | 4.1 | 1.9 |
| MAD | 0.2 | 0.4 | 0.3 | 0.5 | 0.6 | 0.6 | 0.6 |
| Max AD | 0.6 | 0.8 | 0.7 | 1.2 | 1.4 | 1.4 | 1.4 |
| MAD | 0.7 | 0.7 | 0.6 | 1.4 | 1.3 | 1.2 | 1.0 |
| Max AD | 1.1 | 1.5 | 1.3 | 3.5 | 3.3 | 3.3 | 2.3 |
| MAD | 0.5 | 0.6 | 0.6 | 0.2 | 0.3 | 0.3 | 0.4 |
| Max AD | 0.8 | 1.0 | 1.0 | 0.4 | 0.4 | 0.5 | 1.0 |
| MAD | 1.3 | 1.0 | 1.3 | 0.7 | 0.8 | 0.8 | 0.8 |
| Max AD | 2.8 | 2.3 | 2.9 | 1.7 | 1.9 | 1.8 | 1.9 |
| MAD | 0.6 | 0.9 | 0.9 | 1.3 | 1.2 | 1.2 | 1.3 |
| Max AD | 1.7 | 2.2 | 2.1 | 2.4 | 2.5 | 2.4 | 2.4 |
| MAD | 0.4 | 0.5 | 0.3 | 0.3 | 0.3 | 0.3 | 0.3 |
| Max AD | 1.1 | 1.4 | 0.7 | 0.7 | 0.7 | 0.7 | 0.7 |
Notes:
Indicates that the rigid rotor, harmonic oscillator free energy term is neglected.
Indicates MAD and Max AD computed for primary amines only.
Statistics for the predicted pKa values in Table 2 (labeled “Sastre”) and the amines in Table 4 plus the primary amines in Table 2 (labeled “Amines”).
Outliers were identified using the Modified Thompson τ method and removed prior to analysis. “std err” is the standard error of the estimate, F is the Fisher statistic, n the degrees of freedom, and τ cutoff is the cutoff used to determine outliers. The Sastre set has 36 data points and 34 degrees of freedom including outliers, while the Amine set has 18 data points and 16 degrees of freedom including outliers.
| | CBS-4B3 | B3LYP/SMD | M05-2X/SMD |
|---|---|---|---|
| Slope | 1.044 ± 0.033 | 0.991 ± 0.033 | 1.030 ± 0.036 |
| Intercept | −0.19 ± 0.30 | 0.19 ± 0.30 | −0.05 ± 0.33 |
| r² (std err) | 0.970 (0.7) | 0.968 (0.8) | 0.963 (0.8) |
| F (n) | 1006 (31) | 925 (31) | 817 (31) |
| τ cutoff | 1.4 | 1.5 | 1.6 |
Note:
Indicates that the rigid rotor, harmonic oscillator free energy term is neglected.
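For a straight-line fit with one predictor, the F statistic follows from the coefficient of determination and the residual degrees of freedom, which gives a quick consistency check on the regression entries above. The sketch below assumes that standard relation and takes 0.970 as r² and 31 as the degrees of freedom for the CBS-4B3 column; the function name is hypothetical.

```python
def f_statistic(r_squared, dof):
    """F statistic of a one-predictor linear regression.

    Uses F = r^2 / (1 - r^2) * dof, where dof is the residual degrees of freedom.
    """
    return r_squared / (1.0 - r_squared) * dof


# CBS-4B3 column: r^2 = 0.970 with 31 degrees of freedom
print(round(f_statistic(0.970, 31)))
```

The result is close to, but not identical with, the tabulated value of 1006, as expected when r² is only available to three decimal places.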
Figure 1. Plot of (A) ab initio and (B) semiempirical pKa predictions for the molecules in Table 1 and (C) semiempirical pKa predictions for the primary amines in Table 1 and the amines in Table 4.
Outliers are identified using the Modified Thompson τ method.
Predicted pKa values for the secondary and tertiary amines shown in Fig. 2, using di- and tri-ethylamine as a reference, respectively.
In the case of piperazine and DABCO the pKa value corresponds to the singly protonated species.
| | Exp | PM6-D3H+/SMD | PM6-D3H+/SMD | PM6/SMD | PM6/COSMO |
|---|---|---|---|---|---|
| Diethylamine | 11.1 | ||||
| Morpholine | 8.4 | 7.3 | 7.8 | 7.2 | 7.9 |
| Piperidine | 11.2 | 10.9 | 11.3 | 10.8 | 10.9 |
| Piperazine | 9.8 | 8.8 | 9.0 | 8.4 | 9.1 |
| Pyrrolidine | 11.3 | 11.3 | 11.1 | 10.6 | 11.3 |
| Diallylamine | 9.3 | 8.0 | 8.7 | 7.9 | 8.3 |
| Diisopropylamine | 11.0 | 12.6 | 12.4 | 11.7 | 11.4 |
| MAD | | 0.9 | 0.6 | 1.0 | 0.5 |
| Max AD | | 1.6 | 1.4 | 1.4 | 1.0 |
| Tri-ethylamine | 10.7 | ||||
| N-methyl morpholine | 7.4 | 4.9 | 5.8 | 4.6 | 7.4 |
| Quinuclidine | 11.0 | 8.1 | 8.7 | 7.5 | 9.4 |
| DABCO | 8.8 | 5.1 | 5.6 | 4.3 | 6.7 |
| N-Ethylpyrrolidine | 10.4 | 9.0 | 9.5 | 8.6 | 10.4 |
| Triallylamine | 8.3 | 4.8 | 6.9 | 5.2 | 6.9 |
| Diisopropylmethylamine | 10.5 | 11.8 | 12.4 | 11.3 | 11.5 |
| MAD | | 2.5 | 1.9 | 2.7 | 1.0 |
| Max AD | | 3.7 | 3.2 | 4.5 | 2.1 |
Note:
Indicates that the rigid rotor, harmonic oscillator free energy term is neglected.
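The MAD and Max AD rows can be reproduced from the tabulated values. The sketch below does this for the PM6/COSMO predictions of the six secondary amines, using the experimental and predicted pKa values listed above; the function name is hypothetical. Because the inputs are already rounded to one decimal, recomputed statistics for other columns may differ from the tabulated ones by about 0.1.

```python
def mad_and_max_ad(predicted, experimental):
    """Return (mean absolute difference, maximum absolute difference)."""
    diffs = [abs(p - e) for p, e in zip(predicted, experimental)]
    return sum(diffs) / len(diffs), max(diffs)


# Secondary amines: morpholine, piperidine, piperazine,
# pyrrolidine, diallylamine, diisopropylamine
exp_pka = [8.4, 11.2, 9.8, 11.3, 9.3, 11.0]
pm6_cosmo = [7.9, 10.9, 9.1, 11.3, 8.3, 11.4]

mad, max_ad = mad_and_max_ad(pm6_cosmo, exp_pka)
print(f"MAD = {mad:.1f}, Max AD = {max_ad:.1f}")
```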
Figure 2. Depiction of the secondary and tertiary amines used in this study.
Predicted pKa values for compound 1 shown in Fig. 3, using tri-ethylamine, heliotridane, and benzylpyrrolidine as a reference, respectively.
The pKa values of heliotridane and benzylpyrrolidine are taken from Morgenthaler et al. (2007). Note that the latter is estimated and not measured experimentally.
| | pKaref | PM6-D3H+/SMD | PM6-D3H+/SMD | PM6/SMD | PM6/COSMO |
|---|---|---|---|---|---|
| Tri-ethylamine | 10.7 | −4.3 | −3.6 | 5.9 | −0.2 |
| Benzylpyrrolidine | 8.9 | −1.9 | −1.5 | 7.8 | 0.1 |
| Heliotridane | 11.4 | −1.6 | −1.8 | 8.7 | 0.7 |
Note:
Indicates that the rigid rotor, harmonic oscillator free energy term is neglected.