
Rational selection of training and test sets for the development of validated QSAR models.

Alexander Golbraikh, Min Shen, Zhiyan Xiao, Yun-De Xiao, Kuo-Hsiung Lee, Alexander Tropsha.

Abstract

Quantitative structure-activity relationship (QSAR) models are increasingly used to screen chemical databases and/or virtual chemical libraries for potentially bioactive molecules. This trend makes rigorous model validation essential to ensure that the models have acceptable predictive power. Using the k-nearest-neighbors (kNN) variable selection QSAR method to analyze several datasets, we recently demonstrated that the widely accepted leave-one-out (LOO) cross-validated R² (q²) is an inadequate measure of the predictive ability of a model [Golbraikh, A., Tropsha, A. Beware of q²! J. Mol. Graphics Mod. 20, 269-276 (2002)]. Herein, we provide additional evidence that there is no correlation between the value of q² for the training set and the accuracy of prediction (R²) for the test set, and we argue that this observation is a general property of any QSAR model developed with LOO cross-validation. We suggest that external validation using rationally selected training and test sets provides a means to establish a reliable QSAR model. We propose several approaches to dividing experimental datasets into training and test sets and apply them in QSAR studies of 48 functionalized amino acid anticonvulsants and a series of 157 epipodophyllotoxin derivatives with antitumor activity. Finally, we formulate a set of general criteria for evaluating the predictive power of QSAR models.
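The distinction the abstract draws between q² on the training set and R² on an external test set can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the toy descriptors, the plain (non-variable-selection) kNN regressor, the activity-ranked split in `activity_ranked_split`, and the choice to compute external R² as the squared Pearson correlation between observed and predicted activities. The only fixed definition used is q² = 1 - PRESS / SS_total for leave-one-out predictions.

```python
import math

def knn_predict(train_X, train_y, x, k=2):
    """Predict activity as the mean activity of the k nearest training compounds."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

def q2_loo(X, y, k=2):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total on one dataset."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(X)):
        # Hold compound i out, predict it from all the others.
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        press += (y[i] - knn_predict(rest_X, rest_y, X[i], k)) ** 2
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total

def r2_external(observed, predicted):
    """Squared Pearson correlation between observed and predicted activities
    (an assumed definition of the external R2)."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    cov = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    var_o = sum((o - mo) ** 2 for o in observed)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)

def activity_ranked_split(y, every=3):
    """Hypothetical split: rank compounds by activity, send every `every`-th to the test set."""
    order = sorted(range(len(y)), key=lambda i: y[i])
    test = order[1::every]
    test_set = set(test)
    train = [i for i in order if i not in test_set]
    return train, test

# Toy data (invented): 12 compounds, 2 descriptors, activity linear in the descriptors.
X = [[0.5 * i, 0.1 * (i % 4)] for i in range(12)]
y = [2.0 * a + b for a, b in X]

train_idx, test_idx = activity_ranked_split(y)
train_X = [X[i] for i in train_idx]
train_y = [y[i] for i in train_idx]
test_y = [y[i] for i in test_idx]
test_pred = [knn_predict(train_X, train_y, X[i]) for i in test_idx]

q2 = q2_loo(train_X, train_y)          # internal, training set only
r2 = r2_external(test_y, test_pred)    # external, held-out compounds
print(f"q2 (LOO, training set) = {q2:.3f}")
print(f"R2 (external test set) = {r2:.3f}")
```

The two statistics are computed from disjoint information: q² never sees the test compounds, which is why a high q² alone does not establish external predictive power.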

Year:  2003        PMID: 13677490     DOI: 10.1023/a:1025386326946

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


