
R-NN curves: an intuitive approach to outlier detection using a distance based method.

Rajarshi Guha, Debojyoti Dutta, Peter C Jurs, Ting Chen.

Abstract

Libraries of chemical structures are used in a variety of cheminformatics tasks, such as virtual screening and QSAR modeling, and are generally characterized using molecular descriptors. When working with libraries, it is useful to understand the distribution of compounds in the space defined by a set of descriptors. We present a simple approach to analyzing the spatial distribution of the compounds in a library in general, and to outlier detection in particular, based on counts of neighbors within a series of increasing radii. The resultant curves, termed R-NN curves, appear to follow a logistic model for any given descriptor space, which we justify theoretically for the 2D case. The method can be applied to data sets of arbitrary dimensions. The R-NN curves provide a visual method to easily detect compounds lying in a sparse region of a given descriptor space. We also present a method to numerically characterize the R-NN curves, thus allowing identification of outliers in a single plot.
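The neighbor-counting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `rnn_curves` and `half_radius`, the Euclidean metric, the evenly spaced radius grid, and the "radius at which 50% of the other points are neighbors" summary are all assumptions made here for illustration, and the synthetic data stands in for a real descriptor matrix.

```python
import numpy as np

def rnn_curves(X, n_radii=50):
    """For each point, the percentage of the other points lying within radius R,
    evaluated on a grid of increasing R (grid and metric are assumptions)."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distance matrix, shape (n, n)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Evenly spaced radii, excluding 0 and ending at the largest pairwise distance
    radii = np.linspace(0.0, dist.max(), n_radii + 1)[1:]
    # counts[i, k] = neighbors of point i within radii[k]; subtract 1 to drop the point itself
    counts = (dist[:, :, None] <= radii[None, None, :]).sum(axis=1) - 1
    return radii, 100.0 * counts / (n - 1)

def half_radius(radii, curves):
    """Hypothetical one-number summary of each curve: the smallest grid radius
    at which a point has at least 50% of the data set as neighbors."""
    first_idx = (curves >= 50.0).argmax(axis=1)
    return radii[first_idx]

# Synthetic 2D "descriptor space": a dense cluster plus one distant point (index 200)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(size=(200, 2)), [[10.0, 10.0]]])

radii, curves = rnn_curves(X)
hr = half_radius(radii, curves)
outlier_index = int(np.argmax(hr))  # point whose curve rises latest
```

Plotting each row of `curves` against `radii` gives one curve per compound. In this sketch a point in a dense region rises early, while a point in a sparse region stays near zero until the radius is large enough to reach the bulk of the data, which is the visual cue the abstract describes.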

Year:  2006        PMID: 16859303     DOI: 10.1021/ci060013h

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  3 in total

1.  Counting clusters using R-NN curves.

Authors:  Rajarshi Guha; Debojyoti Dutta; David J Wild; Ting Chen
Journal:  J Chem Inf Model       Date:  2007-06-30       Impact factor: 4.956

2.  Chemoinformatic analysis of combinatorial libraries, drugs, natural products, and molecular libraries small molecule repository.

Authors:  Narender Singh; Rajarshi Guha; Marc A Giulianotti; Clemencia Pinilla; Richard A Houghten; Jose L Medina-Franco
Journal:  J Chem Inf Model       Date:  2009-04       Impact factor: 4.956

3.  High-throughput screening of argan oil composition and authenticity using benchtop 1H NMR.

Authors:  Yvonne Gunning; Alex J Jackson; Joshua Colmer; Fouad Taous; Mark Philo; Rachel M Brignall; Tibari El Ghali; Marianne Defernez; E Kate Kemsley
Journal:  Magn Reson Chem       Date:  2020-04-06       Impact factor: 2.447
