Stacking Gaussian processes to improve pKa predictions in the SAMPL7 challenge.

Abstract

Accurate predictions of acid dissociation constants are essential to rational molecular design in the pharmaceutical industry and elsewhere. There has been much interest in developing new machine learning methods that can produce fast and accurate pKa predictions for arbitrary species, as well as estimates of prediction uncertainty. Previously, as part of the SAMPL6 community-wide blind challenge, Bannan et al. approached the problem of predicting pKa values by using Gaussian process regression to predict microscopic pKa values, from which macroscopic pKa values can be analytically computed (Bannan et al. in J Comput-Aided Mol Des 32:1165-1177). While this method can make reasonably quick and accurate predictions using a small training set, accuracy was limited because the training set did not cover a sufficiently broad range of chemical space (e.g., it did not include polyprotic acids). Here, to address this issue, we construct a deep Gaussian process (GP) model that can include more features without invoking the curse of dimensionality. We trained both a standard GP and a deep GP model using a database of approximately 3500 small molecules curated from public sources, filtered by similarity to targets. We tested the models on both the SAMPL6 and the more recent SAMPL7 challenge, which presented a similar lack of overlap in ionizable sites and/or environments between the test set and the previous training set. The results show that while the deep GP model made only minor improvements over the standard GP model for SAMPL6 predictions, it made significant improvements over the standard GP model in SAMPL7 macroscopic predictions, achieving a mean absolute error (MAE) of 1.5 pKa units.
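
The abstract notes that macroscopic pKa values can be computed analytically from microscopic ones and that accuracy is reported as an MAE. The sketch below illustrates the textbook relations for the simplest case, a single deprotonation step involving several tautomers; it is not taken from the paper and may differ from the procedure used by Bannan et al. or by the authors. The function names and the example pKa values are hypothetical.

```python
import math


def macro_pka_branching(micro_pkas):
    """One protonation state loses a proton to form several tautomers.

    The microscopic dissociation constants add: K_macro = sum(k_i).
    """
    return -math.log10(sum(10.0 ** (-pk) for pk in micro_pkas))


def macro_pka_merging(micro_pkas):
    """Several tautomers each lose a proton to form one common product.

    The reciprocals add: 1 / K_macro = sum(1 / k_i).
    """
    return math.log10(sum(10.0 ** pk for pk in micro_pkas))


def mean_absolute_error(predicted, experimental):
    """Mean absolute error between paired predicted and experimental values."""
    if len(predicted) != len(experimental):
        raise ValueError("inputs must have the same length")
    return sum(abs(p - e) for p, e in zip(predicted, experimental)) / len(predicted)


# Hypothetical diprotic acid with two equivalent sites (illustrative values only).
pka1 = macro_pka_branching([4.0, 4.0])  # H2A -> two HA- tautomers
pka2 = macro_pka_merging([5.0, 5.0])    # two HA- tautomers -> A2-
```

For two equivalent sites, the macroscopic values differ from the microscopic ones by the statistical factor log10(2), about 0.30: the first macroscopic pKa is lower than the microscopic value and the second is higher.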

Entities: Chemical

Keywords: Acid dissociation constants; Computational drug design; Gaussian process models; Machine learning; Physicochemical properties; SAMPL7 physical property prediction

Substances: Solvents

Year: 2021 PMID: 34363562 PMCID: PMC9478567 DOI: 10.1007/s10822-021-00411-8

Source DB: PubMed Journal: J Comput Aided Mol Des ISSN: 0920-654X Impact factor: 4.179

19 in total

1. Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation.

Authors: Araz Jakalian; David B Jack; Christopher I Bayly
Journal: J Comput Chem Date: 2002-12 Impact factor: 3.376

2. Predicting pK(a) by molecular tree structured fingerprints and PLS.

Authors: Li Xing; Robert C Glen; Robert D Clark
Journal: J Chem Inf Comput Sci Date: 2003 May-Jun

3. Extended-connectivity fingerprints.

Authors: David Rogers; Mathew Hahn
Journal: J Chem Inf Model Date: 2010-05-24 Impact factor: 4.956

4. Generation of a set of simple, interpretable ADMET rules of thumb.

Authors: M Paul Gleeson
Journal: J Med Chem Date: 2008-01-31 Impact factor: 7.446

5. Comparison of the accuracy of experimental and predicted pKa values of basic and acidic compounds.

Authors: Luca Settimo; Krista Bellman; Ronald M A Knegtel
Journal: Pharm Res Date: 2013-11-19 Impact factor: 4.200

6. Multiconformation, Density Functional Theory-Based pKa Prediction in Application to Large, Flexible Organic Molecules with Diverse Functional Groups.

Authors: Art D Bochevarov; Mark A Watson; Jeremy R Greenwood; Dean M Philipp
Journal: J Chem Theory Comput Date: 2016-11-29 Impact factor: 6.006

7. Assessing the accuracy of octanol-water partition coefficient predictions in the SAMPL6 Part II log P Challenge.

Authors: Mehtap Işık; Teresa Danielle Bergazin; Thomas Fox; Andrea Rizzi; John D Chodera; David L Mobley
Journal: J Comput Aided Mol Des Date: 2020-02-27 Impact factor: 3.686

8. SAMPL6 challenge results from pKa predictions based on a general Gaussian process model.

Authors: Caitlin C Bannan; David L Mobley; A Geoffrey Skillman
Journal: J Comput Aided Mol Des Date: 2018-10-15 Impact factor: 3.686

9. Online chemical modeling environment (OCHEM): web platform for data storage, model development and publishing of chemical information.

Authors: Iurii Sushko; Sergii Novotarskyi; Robert Körner; Anil Kumar Pandey; Matthias Rupp; Wolfram Teetz; Stefan Brandmaier; Ahmed Abdelaziz; Volodymyr V Prokopenko; Vsevolod Y Tanchuk; Roberto Todeschini; Alexandre Varnek; Gilles Marcou; Peter Ertl; Vladimir Potemkin; Maria Grishina; Johann Gasteiger; Christof Schwab; Igor I Baskin; Vladimir A Palyulin; Eugene V Radchenko; William J Welsh; Vladyslav Kholodovych; Dmitriy Chekmarev; Artem Cherkasov; Joao Aires-de-Sousa; Qing-You Zhang; Andreas Bender; Florian Nigsch; Luc Patiny; Antony Williams; Valery Tkachenko; Igor V Tetko
Journal: J Comput Aided Mol Des Date: 2011-06-10 Impact factor: 3.686

10. Structure property relationships of N-acylsulfonamides and related bioisosteres.

Authors: Karol R Francisco; Carmine Varricchio; Thomas J Paniak; Marisa C Kozlowski; Andrea Brancale; Carlo Ballatore
Journal: Eur J Med Chem Date: 2021-03-28 Impact factor: 7.088

1 in total

1. Stacking Gaussian processes to improve pKa predictions in the SAMPL7 challenge.

Authors: Robert M Raddi; Vincent A Voelz
Journal: J Comput Aided Mol Des Date: 2021-08-07 Impact factor: 4.179