k Nearest neighbors QSAR modeling as a variational problem: theory and applications.

Peter Itskowitz, Alexander Tropsha.

Abstract

Variable selection k Nearest Neighbor (kNN) QSAR is a popular nonlinear methodology for building correlation models between chemical descriptors of compounds and biological activities. The models are built by finding a subspace of the original descriptor space where the activity of each compound in the data set is most accurately predicted as the averaged activity of its k nearest neighbors in this subspace. We have formulated the problem of searching for the optimized kNN QSAR models with the highest predictive power as a variational problem. We have investigated the relative contribution of several model parameters such as the selection of variables, the number (k) of nearest neighbors, and the shape of the weighting function used to evaluate the contributions of k nearest neighbor compound activities to the predicted activity of each compound. We have derived the expression for the weighting function which maximizes the model performance. This optimization methodology was applied to several experimental data sets divided into training and test sets. We report a significant improvement of both the leave-one-out cross-validated R² (q²) for the training sets and the predictive R² of the test sets in all cases. Depending on the data set, the average improvements in the prediction accuracy (prediction R²) for the test sets ranged between 1.1% and 94% and for the training sets (q²) between 3.5% and 118%. We also describe a modified computational procedure for model building based on the use of relational databases to store descriptors and calculate compounds' similarities, which simplifies calculations and increases their efficiency.
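
The procedure summarized in the abstract (predict each compound's activity from its k nearest neighbors in a chosen descriptor subspace, then score the subspace by leave-one-out q²) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `knn_predict` and `loo_q2`, the Euclidean distance, the `beta` parameter, and the exponential weighting are all assumptions made for illustration, and the optimal weighting function derived in the paper is not reproduced here.

```python
import numpy as np


def knn_predict(X, y, query, k, exclude=None, beta=1.0):
    """Predict the activity of `query` as a weighted mean of its k nearest neighbors.

    X: (n, d) descriptor matrix, y: (n,) activities.
    exclude: index of a compound to leave out (used for leave-one-out).
    """
    d = np.linalg.norm(X - query, axis=1)  # Euclidean distance (assumed metric)
    if exclude is not None:
        d[exclude] = np.inf  # the left-out compound can never be its own neighbor
    idx = np.argsort(d)[:k]
    # Assumed weighting: exponential decay with distance. Shifting by the
    # smallest distance keeps at least one weight equal to 1, so the
    # normalizing sum cannot underflow to zero.
    w = np.exp(-beta * (d[idx] - d[idx].min()))
    return float(np.sum(w * y[idx]) / np.sum(w))


def loo_q2(X, y, k, columns, beta=1.0):
    """Leave-one-out cross-validated R^2 (q^2) in the subspace given by `columns`."""
    Xs = X[:, columns]
    preds = np.array(
        [knn_predict(Xs, y, Xs[i], k, exclude=i, beta=beta) for i in range(len(y))]
    )
    press = np.sum((y - preds) ** 2)      # predictive residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
    return 1.0 - press / ss_tot


# Synthetic data (not from the paper): activity depends only on descriptor 0;
# descriptor 1 is unrelated noise.
rng = np.random.default_rng(0)
X = np.column_stack([np.linspace(0.0, 1.0, 30), rng.random(30)])
y = 2.0 * X[:, 0]

q2_relevant = loo_q2(X, y, k=3, columns=[0])
q2_noise = loo_q2(X, y, k=3, columns=[1])
```

On this toy data the subspace containing the relevant descriptor scores a much higher q² than the noise-only subspace, which is the signal a variable-selection search would exploit. A real search would compare many candidate subspaces, values of k, and weighting shapes rather than two hand-picked column sets.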

Year:  2005        PMID: 15921467     DOI: 10.1021/ci049628+

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


