Representing descriptors derived from multiple conformations as uncertain features for machine learning.

Ulf Norinder, Henrik Boström.

Abstract

Uncertainty was introduced into the chemical descriptors of 11 datasets by conformational analysis in order to incorporate three-dimensional information and to investigate the resulting predictive performance of a state-of-the-art machine learning method, random forests, for binary classification tasks. A number of strategies for handling uncertainty in random forests were evaluated. The study showed that, when incorporating three-dimensional information as uncertainty into chemical descriptors, the use of uniform probability distributions over the range of possible values, in conjunction with fractional distribution of compounds, clearly outperforms the use of normal distributions as well as sampling from both normal and uniform distributions. The main conclusion of this study is that, even when distributions of uncertain values are provided, the random forest method can generate models that are almost as accurate from the expected values of these distributions alone. Hence, when using random forests, there seems to be little advantage in using the more elaborate methods of incorporating uncertainty in chemical descriptors rather than replacing the distributions with single-point values. The results also show that random forest models with similar performance can be generated using three-dimensional descriptor information derived from single (lowest-energy or Corina-derived) conformations.
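The abstract does not give implementation details, so the following is only a minimal sketch of the two representations it compares, under stated assumptions: a descriptor computed over several conformations is summarised as a uniform distribution over its observed range, a compound is split fractionally at a tree node in proportion to the share of that range falling below the split threshold, and the single-point alternative uses the expected value of the uniform distribution (its midpoint). The function names and the example values are hypothetical and are not taken from the paper.

```python
def uniform_interval(values):
    """Summarise per-conformer descriptor values as a uniform range (lo, hi)."""
    return min(values), max(values)


def expected_value(interval):
    """Expected value of a uniform distribution over (lo, hi): the midpoint."""
    lo, hi = interval
    return (lo + hi) / 2.0


def fraction_left(interval, threshold):
    """Share of a compound's weight sent to the 'descriptor <= threshold' branch.

    Assumes a uniform distribution over the interval; a degenerate interval
    (single conformer or identical values) is routed entirely to one branch.
    """
    lo, hi = interval
    if hi == lo:
        return 1.0 if lo <= threshold else 0.0
    frac = (threshold - lo) / (hi - lo)
    return min(1.0, max(0.0, frac))


# Hypothetical descriptor values for three conformers of one compound.
conformer_values = [2.0, 3.5, 4.0]
interval = uniform_interval(conformer_values)

# Single-point representation: replace the distribution by its expected value.
point = expected_value(interval)

# Fractional distribution: weight sent left at a hypothetical split threshold.
left_weight = fraction_left(interval, threshold=2.5)
right_weight = 1.0 - left_weight
```

In this sketch the single-point model would see only `point`, whereas a forest using fractional distribution would pass `left_weight` and `right_weight` of the compound down the two branches. The abstract reports that the first, simpler option is almost as accurate as the second.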

Year: 2013; PMID: 23479283; DOI: 10.1007/s00894-013-1806-z

Source DB: PubMed; Journal: J Mol Model; ISSN: 0948-5023; Impact factor: 1.810

