Yipin Lu, Shankara Anand, William Shirley, Peter Gedeck, Brian P Kelley, Suzanne Skolnik, Stephane Rodde, Mai Nguyen, Mika Lindvall, Weiping Jia.
Abstract
The acid-base dissociation constant, pKa, is a key parameter to define the ionization state of a compound and directly affects its biopharmaceutical profile. In this study, we developed a novel approach for pKa prediction using rooted topological torsion fingerprints in combination with five machine learning (ML) methods: random forest, partial least squares, extreme gradient boosting, lasso regression, and support vector regression. With a large and diverse set of 14 499 experimental pKa values, pKa models were developed for aliphatic amines. The models demonstrated consistently good prediction statistics and were able to generate accurate prospective predictions as validated with an external test set of 726 pKa values (RMSE 0.45, MAE 0.33, and R² 0.84 by the top model). The factors that may affect prediction accuracy and model applicability were carefully assessed. The results demonstrated that rooted topological torsion fingerprints coupled with ML methods provide a promising approach for developing accurate pKa prediction models.
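The three statistics quoted for the external test set (RMSE, MAE, and R²) can be computed from paired experimental and predicted pKa values. The following is a minimal sketch in plain Python, not the authors' code. The function name `regression_metrics` and the four data points are hypothetical, and R² is assumed here to be the coefficient of determination; the abstract does not say whether a squared correlation coefficient was used instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (rmse, mae, r2) for paired experimental and predicted values.

    Assumption: r2 is the coefficient of determination, 1 - SS_res / SS_tot.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Root-mean-square error and mean absolute error.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    # Total sum of squares around the mean of the experimental values.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("r2 is undefined when all experimental values are equal")

    r2 = 1.0 - ss_res / ss_tot
    return rmse, mae, r2


# Hypothetical pKa values, for illustration only (not data from the study).
experimental = [9.0, 10.0, 8.0, 11.0]
predicted = [9.5, 9.5, 8.5, 10.5]

rmse, mae, r2 = regression_metrics(experimental, predicted)
print(f"RMSE={rmse:.2f}  MAE={mae:.2f}  R2={r2:.2f}")
```

RMSE penalizes large individual errors more heavily than MAE, so an RMSE noticeably above the MAE, as in the reported 0.45 versus 0.33, is consistent with a minority of compounds having larger prediction errors.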
Year: 2019 PMID: 31647238 DOI: 10.1021/acs.jcim.9b00498
Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956