Estimating the domain of applicability for machine learning QSAR models: a study on aqueous solubility of drug discovery molecules.

Timon Sebastian Schroeter, Anton Schwaighofer, Sebastian Mika, Antonius Ter Laak, Detlev Suelzle, Ursula Ganzer, Nikolaus Heinrich, Klaus-Robert Müller.

Abstract

We investigate the use of different machine learning methods to construct models for aqueous solubility. The models are based on about 4000 compounds, including an in-house set of 632 drug discovery molecules from Bayer Schering Pharma. For each method, we also consider an appropriate way to obtain error bars, in order to estimate the domain of applicability (DOA) of each model. Specifically, we investigate error bars from a Bayesian model (Gaussian Process, GP), an ensemble-based approach (Random Forest), and approaches based on the Mahalanobis distance to the training data (for Support Vector Machine and Ridge Regression models). We evaluate all approaches in terms of their prediction accuracy (in cross-validation and on an external validation set of 536 molecules) and how faithfully the individual error bars represent the actual prediction error.
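
One of the error-bar approaches named in the abstract relies on the Mahalanobis distance from a query compound to the training data. The sketch below is a minimal, illustrative Python version and not the authors' implementation: the function names (`mean_and_covariance`, `solve`, `mahalanobis`) and the two-descriptor toy data are assumptions made here, and it measures the distance to the training-set mean under the sample covariance, which is only one common way to define such a distance.

```python
from statistics import fmean


def mean_and_covariance(rows):
    """Column means and sample covariance matrix of descriptor vectors."""
    n, d = len(rows), len(rows[0])
    mu = [fmean(r[j] for r in rows) for j in range(d)]
    cov = [[sum((r[i] - mu[i]) * (r[j] - mu[j]) for r in rows) / (n - 1)
            for j in range(d)] for i in range(d)]
    return mu, cov


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting.

    Raises ZeroDivisionError if the matrix is singular.
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def mahalanobis(x, mu, cov):
    """sqrt((x - mu)^T cov^{-1} (x - mu)), computed without an explicit inverse."""
    diff = [xi - mi for xi, mi in zip(x, mu)]
    z = solve(cov, diff)
    return sum(d * zi for d, zi in zip(diff, z)) ** 0.5


# Hypothetical training set: four compounds, two descriptors each.
train = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]
mu, cov = mean_and_covariance(train)

print(mahalanobis([1.0, 1.0], mu, cov))  # query at the centre of the training data
print(mahalanobis([3.0, 1.0], mu, cov))  # query outside the training range
```

A larger distance suggests the query compound lies further from the training data, where a model's prediction is presumably less reliable. How such a distance is converted into an error bar is not specified in the abstract.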

Year:  2007        PMID: 17632688     DOI: 10.1007/s10822-007-9125-z

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


  19 in total

1.  Estimation of aqueous solubility for a diverse set of organic compounds based on molecular topology

Journal:  J Chem Inf Comput Sci       Date:  2000-05

2.  Simultaneous prediction of aqueous solubility and octanol/water partition coefficient based on descriptors derived from molecular structure.

Authors:  D J Livingstone; M G Ford; J J Huuskonen; D W Salt
Journal:  J Comput Aided Mol Des       Date:  2001-08       Impact factor: 3.686

3.  Estimation of aqueous solubility of chemical compounds using E-state indices.

Authors:  I V Tetko; V Y Tanchuk; T N Kasheva; A E Villa
Journal:  J Chem Inf Comput Sci       Date:  2001 Nov-Dec

4.  A consensus neural network-based technique for discriminating soluble and poorly soluble compounds.

Authors:  David T Manallack; Benjamin G Tehan; Emanuela Gancia; Brian D Hudson; Martyn G Ford; David J Livingstone; David C Whitley; Will R Pitt
Journal:  J Chem Inf Comput Sci       Date:  2003 Mar-Apr

5.  Screening for dihydrofolate reductase inhibitors using MOLPRINT 2D, a fast fragment-based method employing the naïve Bayesian classifier: limitations of the descriptor and the importance of balanced chemistry in training and test sets.

Authors:  Andreas Bender; Hamse Y Mussa; Robert C Glen
Journal:  J Biomol Screen       Date:  2005-09-16

6.  Model selection based on structural similarity-method description and application to water solubility prediction.

Authors:  Ralph Kühne; Ralf-Uwe Ebert; Gerrit Schüürmann
Journal:  J Chem Inf Model       Date:  2006 Mar-Apr       Impact factor: 4.956

7.  In silico prediction of buffer solubility based on quantum-mechanical and HQSAR- and topology-based descriptors.

Authors:  Andreas H Göller; Matthias Hennemann; Jörg Keldenich; Timothy Clark
Journal:  J Chem Inf Model       Date:  2006 Mar-Apr       Impact factor: 4.956

8.  Can we estimate the accuracy of ADME-Tox predictions? (Review)

Authors:  Igor V Tetko; Pierre Bruneau; Hans-Werner Mewes; Douglas C Rohrer; Gennadiy I Poda
Journal:  Drug Discov Today       Date:  2006-08       Impact factor: 7.851

9.  An introduction to kernel-based learning algorithms.

Authors:  K R Müller; S Mika; G Rätsch; K Tsuda; B Schölkopf
Journal:  IEEE Trans Neural Netw       Date:  2001

10.  Assessment of prediction confidence and domain extrapolation of two structure-activity relationship models for predicting estrogen receptor binding activity.

Authors:  Weida Tong; Qian Xie; Huixiao Hong; Leming Shi; Hong Fang; Roger Perkins
Journal:  Environ Health Perspect       Date:  2004-08       Impact factor: 9.031

  7 in total

1.  Automatic QSAR modeling of ADME properties: blood-brain barrier penetration and aqueous solubility.

Authors:  Olga Obrezanova; Joelle M R Gola; Edmund J Champness; Matthew D Segall
Journal:  J Comput Aided Mol Des       Date:  2008-02-14       Impact factor: 3.686

2.  ADME properties evaluation in drug discovery: in silico prediction of blood-brain partitioning.

Authors:  Lu Zhu; Junnan Zhao; Yanmin Zhang; Weineng Zhou; Linfeng Yin; Yuchen Wang; Yuanrong Fan; Yadong Chen; Haichun Liu
Journal:  Mol Divers       Date:  2018-08-06       Impact factor: 2.943

3.  Deep architectures and deep learning in chemoinformatics: the prediction of aqueous solubility for drug-like molecules.

Authors:  Alessandro Lusci; Gianluca Pollastri; Pierre Baldi
Journal:  J Chem Inf Model       Date:  2013-07-02       Impact factor: 4.956

4.  Machine learning for flow batteries: opportunities and challenges. (Review)

Authors:  Tianyu Li; Changkun Zhang; Xianfeng Li
Journal:  Chem Sci       Date:  2022-04-07       Impact factor: 9.969

5.  Pushing the limits of solubility prediction via quality-oriented data selection.

Authors:  Murat Cihan Sorkun; J M Vianney A Koelman; Süleyman Er
Journal:  iScience       Date:  2020-12-17

6.  Can small drugs predict the intrinsic aqueous solubility of 'beyond Rule of 5' big drugs?

Authors:  Alex Avdeef; Manfred Kansy
Journal:  ADMET DMPK       Date:  2020-04-25

7.  Prediction of aqueous intrinsic solubility of druglike molecules using Random Forest regression trained with Wiki-pS0 database.

Authors:  Alex Avdeef
Journal:  ADMET DMPK       Date:  2020-03-04