| Literature DB >> 21033677 |
Haiying Yu1, Ralph Kühne, Ralf-Uwe Ebert, Gerrit Schüürmann.
Abstract
For 1143 organic compounds comprising 580 oxygen acids and 563 nitrogen bases that cover more than 17 orders of experimental pK(a) (from -5.00 to 12.23), the pK(a) prediction performances of ACD, SPARC, and two calibrations of a semiempirical quantum chemical (QC) AM1 approach have been analyzed. The overall root-mean-square errors (rms) for the acids are 0.41, 0.58 (0.42 without ortho-substituted phenols with intramolecular H-bonding), and 0.55 and for the bases are 0.65, 0.70, 1.17, and 1.27 for ACD, SPARC, and both QC methods, respectively. Method-specific performances are discussed in detail for six acid subsets (phenols and aromatic and aliphatic carboxylic acids with different substitution patterns) and nine base subsets (anilines, primary, secondary and tertiary amines, meta/para-substituted and ortho-substituted pyridines, pyrimidines, imidazoles, and quinolines). The results demonstrate an overall better performance for acids than for bases but also a substantial variation across subsets. For the overall best-performing ACD, rms ranges from 0.12 to 1.11 and 0.40 to 1.21 pK(a) units for the acid and base subsets, respectively. With regard to the squared correlation coefficient r², the results are 0.86 to 0.96 (acids) and 0.79 to 0.95 (bases) for ACD, 0.77 to 0.95 (acids) and 0.85 to 0.97 (bases) for SPARC, and 0.64 to 0.87 (acids) and 0.43 to 0.83 (bases) for the QC methods, respectively. Attention is paid to structural and method-specific causes for observed pitfalls. The significant subset dependence of the prediction performances suggests a consensus modeling approach.Entities:
Year: 2010 PMID: 21033677 DOI: 10.1021/ci100306k
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956