Rank order entropy: why one metric is not enough.

Margaret R McLellan, M Dominic Ryan, Curt M Breneman.

Abstract

The use of Quantitative Structure-Activity Relationship (QSAR) models to address problems in drug discovery has a mixed history, generally resulting from the misapplication of models that were either poorly constructed or used outside their domains of applicability. This situation has motivated the development of a variety of model performance metrics (r², PRESS r², F-tests, etc.) designed to increase user confidence in the validity of QSAR predictions. In a typical workflow, QSAR models are created and validated on training sets of molecules using methods such as leave-one-out or many-fold cross-validation that attempt to assess their internal consistency. However, few current validation methods directly address the stability of QSAR predictions in response to changes in the information content of the training set. Since the main purpose of QSAR is to quickly and accurately estimate a property of interest for an untested set of molecules, it makes sense to have a means at hand to correctly set user expectations of model performance. In fact, the numerical value of a molecular prediction is often less important to the end user than the rank order of that set of molecules according to their predicted end point values. Consequently, a means of characterizing the stability of predicted rank order is an important component of predictive QSAR. Unfortunately, none of the many validation metrics currently available directly measures the stability of rank order prediction, making the development of an additional metric that can quantify model stability a high priority. To address this need, this work examines the stability of QSAR rank order models created from representative data sets, descriptor sets, and modeling methods. The models were assessed using Kendall tau as a rank order metric, with the Shannon entropy evaluated on that metric as a means of quantifying rank order stability.
Random removal of data from the training set, also known as Data Truncation Analysis (DTA), was used to systematically reduce the information content of each training set while examining both rank order performance and rank order stability in the face of training set data loss. This process is termed a "rank order entropy" (ROE) evaluation. The premise of DTA ROE model evaluation is that the response of a model to incremental loss of training information indicates the quality and sufficiency of its training set, learning method, and descriptor types to cover a particular domain of applicability. By analogy with information theory, an unstable rank order model displays a high level of implicit entropy, while a QSAR rank order model that remains nearly unchanged during training set reductions shows low entropy. In this work, the ROE metric was applied to 71 data sets of different sizes and was found to reveal more information about the behavior of the models than traditional metrics alone. Stable, or consistently performing, models did not necessarily predict rank order well. Models that performed well in rank order did not necessarily perform well in traditional metrics. Ultimately, the ROE metrics suggested that some commonly used QSAR models should be discarded. ROE evaluation helps to discern which combinations of data set, descriptor set, and modeling method lead to usable models in prioritization schemes, and it provides confidence in the use of a particular model within a specific domain of applicability.
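
The abstract describes the ROE procedure only in outline, so the following is a minimal sketch of how a DTA-style evaluation might look, not the authors' implementation. Several details are assumptions made for illustration: the function names (`kendall_tau`, `shannon_entropy`, `fit_line`, `rank_order_entropy`) are hypothetical, a one-descriptor least-squares line stands in for a real QSAR learner, Kendall tau is computed in its simple tau-a form against a held-out test set, and the entropy is taken over a 10-bin histogram of the tau values collected across random truncations.

```python
import math
import random
from collections import Counter


def kendall_tau(xs, ys):
    """Kendall tau-a: (concordant - discordant pairs) / total pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (s > 0) - (s < 0)
    return score / (n * (n - 1) / 2)


def shannon_entropy(values, n_bins=10, lo=-1.0, hi=1.0):
    """Shannon entropy (bits) of values placed in equal-width bins.

    The bin count and range are illustrative assumptions.
    """
    width = (hi - lo) / n_bins
    counts = Counter(min(int((v - lo) / width), n_bins - 1) for v in values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (stand-in for a QSAR learner)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx if sxx else 0.0
    return a, my - a * mx


def rank_order_entropy(train, test, keep_fraction, n_repeats=50, seed=0):
    """Return (mean tau, entropy of tau) over random training truncations.

    train and test are lists of (descriptor, activity) pairs.
    """
    rng = random.Random(seed)
    n_keep = max(2, int(round(keep_fraction * len(train))))
    test_x = [x for x, _ in test]
    test_y = [y for _, y in test]
    taus = []
    for _ in range(n_repeats):
        # Data truncation: keep a random subset of the training set.
        subset = rng.sample(train, n_keep)
        a, b = fit_line([x for x, _ in subset], [y for _, y in subset])
        preds = [a * x + b for x in test_x]
        # Rank order performance of this truncated model.
        taus.append(kendall_tau(preds, test_y))
    # Mean tau measures performance; entropy of tau measures stability.
    return sum(taus) / len(taus), shannon_entropy(taus)
```

Under these assumptions, calling `rank_order_entropy(train, test, keep_fraction)` for a series of decreasing `keep_fraction` values gives one performance value and one stability value per truncation level. The two numbers are reported separately on purpose, since the abstract notes that stable models do not necessarily rank well and well-ranking models are not necessarily stable.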

Year:  2011        PMID: 21875058      PMCID: PMC3428235          DOI: 10.1021/ci200170k

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


