
Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

Timon Sebastian Schroeter, Anton Schwaighofer, Sebastian Mika, Antonius Ter Laak, Detlev Suelzle, Ursula Ganzer, Nikolaus Heinrich, Klaus-Robert Müller.

Abstract

We investigate the use of different machine learning methods to construct models for aqueous solubility. The models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules from Bayer Schering Pharma. For each method, we also consider an appropriate way to obtain error bars, in order to estimate the domain of applicability (DOA) of each model. We investigate error bars from a Bayesian model (Gaussian Process, GP), an ensemble-based approach (Random Forest), and approaches based on the Mahalanobis distance to the training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation and on an external validation set of 536 molecules) and in terms of how faithfully the individual error bars represent the actual prediction error.
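The distance-based error bars and the error-bar evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Mahalanobis distance is computed, so the choice to measure distance to the training-set mean, the small `ridge` term, and the function names `mahalanobis_to_training` and `fraction_within_error_bars` are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_to_training(X_train, X_query, ridge=1e-6):
    """Mahalanobis distance of each query row to the training-set mean.

    Assumption: distance is taken to the training mean; other variants
    (e.g. distance to nearest training points) are equally plausible.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_query = np.atleast_2d(np.asarray(X_query, dtype=float))
    mean = X_train.mean(axis=0)
    cov = np.atleast_2d(np.cov(X_train, rowvar=False))
    # Small diagonal term (an assumption) keeps the covariance invertible.
    cov = cov + ridge * np.eye(cov.shape[0])
    diff = X_query - mean
    # Solve cov * z = diff^T instead of forming the explicit inverse.
    solved = np.linalg.solve(cov, diff.T).T
    return np.sqrt(np.einsum("ij,ij->i", diff, solved))


def fraction_within_error_bars(y_true, y_pred, sigma, k=1.0):
    """Share of predictions whose absolute error is at most k * sigma."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred) <= k * sigma))
```

A larger distance suggests that a query compound lies further from the training data, so its prediction may be less reliable. If the error bars behaved like Gaussian standard deviations, roughly 68% of the errors would be expected to fall within one `sigma`; comparing that figure with the observed fraction is one simple way to check how well error bars represent the actual error.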

Year: 2007; PMID: 18060505; DOI: 10.1007/s10822-007-9160-9

Source DB: PubMed; Journal: J Comput Aided Mol Des; ISSN: 0920-654X; Impact factor: 3.686


