

Three useful dimensions for domain applicability in QSAR models using random forest.

Abstract

One popular metric for estimating the accuracy of prospective quantitative structure-activity relationship (QSAR) predictions is based on the similarity of the compound being predicted to compounds in the training set from which the QSAR model was built. More recent work in the field has indicated that other parameters might be equally or more important than similarity. Here we make use of two additional parameters: the variation of prediction among random forest trees (less variation among trees indicates more accurate prediction) and the prediction itself (certain ranges of activity are intrinsically easier to predict than others). The accuracy of prediction for a QSAR model, as measured by the root-mean-square error, can be estimated by cross-validation on the training set at the time of model-building and stored as a three-dimensional array of bins. This is an obvious extension of the one-dimensional array of bins we previously proposed for similarity to the training set [Sheridan et al. J. Chem. Inf. Comput. Sci. 2004, 44, 1912-1928]. We show that using these three parameters simultaneously adds much more discrimination in prediction accuracy than any single parameter. This approach can be applied to any QSAR method that produces an ensemble of models. We also show that the root-mean-square errors produced by cross-validation are predictive of root-mean-square errors of compounds tested after the model was built.
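The three-dimensional array of bins described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_rmse_bins`, `estimate_rmse`), the bin edges, and the toy numbers are all assumptions made for illustration, and the abstract does not say how the bins were actually chosen. The inputs are assumed to be cross-validated predictions on the training set, each paired with its similarity to the training set and the standard deviation of the prediction across the trees of the forest.

```python
import numpy as np


def _bin(values, edges):
    # Use only the interior edges so out-of-range values fall into the
    # first or last bin; indices run from 0 to len(edges) - 2.
    return np.digitize(values, np.asarray(edges)[1:-1])


def build_rmse_bins(similarity, tree_sd, prediction, observed, edges):
    """Return (rmse, count) arrays indexed by [similarity, tree_sd, prediction] bin.

    `edges` is a tuple of three edge lists (hypothetical, user-chosen).
    Bins with no cross-validated compounds get NaN.
    """
    prediction = np.asarray(prediction, dtype=float)
    observed = np.asarray(observed, dtype=float)
    shape = tuple(len(e) - 1 for e in edges)
    idx = (
        _bin(similarity, edges[0]),
        _bin(tree_sd, edges[1]),
        _bin(prediction, edges[2]),
    )
    sq_sum = np.zeros(shape)
    count = np.zeros(shape, dtype=int)
    # Unbuffered accumulation, so repeated bin indices are all counted.
    np.add.at(sq_sum, idx, (prediction - observed) ** 2)
    np.add.at(count, idx, 1)
    rmse = np.full(shape, np.nan)
    filled = count > 0
    rmse[filled] = np.sqrt(sq_sum[filled] / count[filled])
    return rmse, count


def estimate_rmse(rmse, edges, similarity, tree_sd, prediction):
    """Look up the stored cross-validation RMSE for a new prediction."""
    return rmse[
        _bin(similarity, edges[0]),
        _bin(tree_sd, edges[1]),
        _bin(prediction, edges[2]),
    ]


# Toy data (made up): three cross-validated compounds.
edges = ([0.0, 0.5, 1.0], [0.0, 1.0, 2.0], [0.0, 5.0, 10.0])
rmse, count = build_rmse_bins(
    similarity=[0.2, 0.2, 0.8],
    tree_sd=[0.5, 0.5, 1.5],
    prediction=[2.0, 3.0, 7.0],
    observed=[1.0, 5.0, 7.0],
    edges=edges,
)
expected_error = estimate_rmse(rmse, edges, similarity=0.3, tree_sd=0.2, prediction=4.0)
```

In this sketch a new compound is assigned the RMSE of the bin it falls into, so a sparsely populated or empty bin yields an unreliable or missing (NaN) estimate; how to handle such bins is a design choice the abstract does not specify.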


Year: 2012 PMID: 22385389 DOI: 10.1021/ci300004n

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Cited

24 in total

1. QSAR model based on weighted MCS trees approach for the representation of molecule data sets.

Authors: Bernardo Palacios-Bejarano; Gonzalo Cerruela García; Irene Luque Ruiz; Miguel Ángel Gómez-Nieto
Journal: J Comput Aided Mol Des Date: 2013-02-06 Impact factor: 3.686

2. Chlorophenol sorption on multi-walled carbon nanotubes: DFT modeling and structure-property relationship analysis.

Authors: Marquita Watkins; Natalia Sizochenko; Quentarius Moore; Marek Golebiowski; Danuta Leszczynska; Jerzy Leszczynski
Journal: J Mol Model Date: 2017-01-24 Impact factor: 1.810

3. Introduction to the BioChemical Library (BCL): An Application-Based Open-Source Toolkit for Integrated Cheminformatics and Machine Learning in Computer-Aided Drug Discovery.

Authors: Benjamin P Brown; Oanh Vu; Alexander R Geanes; Sandeepkumar Kothiwale; Mariusz Butkiewicz; Edward W Lowe; Ralf Mueller; Richard Pape; Jeffrey Mendenhall; Jens Meiler
Journal: Front Pharmacol Date: 2022-02-21 Impact factor: 5.810

4. Discovery of potent, selective multidrug and toxin extrusion transporter 1 (MATE1, SLC47A1) inhibitors through prescription drug profiling and computational modeling.

Authors: Matthias B Wittwer; Arik A Zur; Natalia Khuri; Yasuto Kido; Alan Kosaka; Xuexiang Zhang; Kari M Morrissey; Andrej Sali; Yong Huang; Kathleen M Giacomini
Journal: J Med Chem Date: 2013-01-22 Impact factor: 7.446

5. QSAR with experimental and predictive distributions: an information theoretic approach for assessing model quality.

Authors: David J Wood; Lars Carlsson; Martin Eklund; Ulf Norinder; Jonna Stålring
Journal: J Comput Aided Mol Des Date: 2013-03-16 Impact factor: 3.686

6. Assigning confidence to molecular property prediction.

Authors: AkshatKumar Nigam; Robert Pollice; Matthew F D Hurley; Riley J Hickman; Matteo Aldeghi; Naruki Yoshikawa; Seyone Chithrananda; Vincent A Voelz; Alán Aspuru-Guzik
Journal: Expert Opin Drug Discov Date: 2021-06-15 Impact factor: 7.050

7. A novel artificial intelligence protocol to investigate potential leads for Parkinson's disease.

Authors: Zhi-Dong Chen; Lu Zhao; Hsin-Yi Chen; Jia-Ning Gong; Xu Chen; Calvin Yu-Chian Chen
Journal: RSC Adv Date: 2020-06-16 Impact factor: 4.036

8. Prediction of blood:air and fat:air partition coefficients of volatile organic compounds for the interpretation of data in breath gas analysis.

Authors: Christian Kramer; Paweł Mochalski; Karl Unterkofler; Agapios Agapiou; Veronika Ruzsanyi; Klaus R Liedl
Journal: J Breath Res Date: 2016-01-27 Impact factor: 3.262

9. QSAR workbench: automating QSAR modeling to drive compound design.

Authors: Richard Cox; Darren V S Green; Christopher N Luscombe; Noj Malcolm; Stephen D Pickett
Journal: J Comput Aided Mol Des Date: 2013-04-25 Impact factor: 3.686

10. An in silico platform for predicting, screening and designing of antihypertensive peptides.

Authors: Ravi Kumar; Kumardeep Chaudhary; Jagat Singh Chauhan; Gandharva Nagpal; Rahul Kumar; Minakshi Sharma; Gajendra P S Raghava
Journal: Sci Rep Date: 2015-07-27 Impact factor: 4.379