Similarity search profiling reveals effects of fingerprint scaling in virtual screening.

Ling Xue, Florence L Stahura, Jürgen Bajorath.

Abstract

Fingerprint scaling is a method to increase the performance of similarity search calculations. It is based on the detection of bit patterns in keyed fingerprints that are signatures of specific compound classes. Application of scaling factors to consensus bits that are mostly set on emphasizes signature bit patterns during similarity searching and has been shown to improve search results for different fingerprints. Similarity search profiling has recently been introduced as a method to analyze similarity search calculations. Profiles separately monitor correctly identified hits and other detected database compounds as a function of similarity threshold values and make it possible to estimate whether virtual screening calculations can be successful or to evaluate why they fail. This similarity search profile technique has been applied here to study fingerprint scaling in detail and to better understand the effects that are responsible for its performance. In particular, we have focused on the qualitative and quantitative analysis of similarity search profiles under scaling conditions. To this end, we have carried out systematic similarity search calculations for 23 biological activity classes over a wide range of scaling factors in a compound database containing approximately 1.3 million molecules and monitored these calculations in similarity search profiles. Analysis of these profiles confirmed increases in hit rates as a consequence of scaling and revealed that scaling influences similarity search calculations in different ways. Based on scaled similarity search profiles, compound sets could be divided into different categories. In a number of cases, increases in search performance under scaling conditions were due to a more significant relative increase in correctly identified hits than in detected false positives. This was also consistent with the finding that preferred similarity threshold values increased due to fingerprint scaling, which was well illustrated by similarity search profiling.
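The two ideas in the abstract, scaling of consensus bits and profiling of hits against other detected compounds per similarity threshold, can be illustrated with a small sketch. This is not the authors' implementation: the 0.9 consensus cutoff, the weighted Tanimoto form used to apply the scaling factor, the function names, and the toy 6-bit fingerprints are all assumptions made for illustration.

```python
# Minimal sketch of fingerprint scaling and similarity search profiling.
# Assumptions (not from the abstract): consensus cutoff, weighted Tanimoto
# as the way scaling enters the similarity, and all names and toy data.

def consensus_bits(fingerprints, min_fraction=0.9):
    """Indices of bits set on in at least min_fraction of the class."""
    n = len(fingerprints)
    n_bits = len(fingerprints[0])
    return [i for i in range(n_bits)
            if sum(fp[i] for fp in fingerprints) / n >= min_fraction]

def scaling_weights(n_bits, consensus, factor):
    """Weight `factor` on consensus bits, 1.0 on every other bit."""
    consensus = set(consensus)
    return [float(factor) if i in consensus else 1.0 for i in range(n_bits)]

def weighted_tanimoto(a, b, weights):
    """Tanimoto coefficient where each bit counts with its weight."""
    common = sum(w for x, y, w in zip(a, b, weights) if x and y)
    union = sum(w for x, y, w in zip(a, b, weights) if x or y)
    return common / union if union else 0.0

def search_profile(query, actives, others, weights, thresholds):
    """For each threshold: (threshold, hits found, other compounds found)."""
    rows = []
    for t in thresholds:
        hits = sum(weighted_tanimoto(query, fp, weights) >= t for fp in actives)
        rest = sum(weighted_tanimoto(query, fp, weights) >= t for fp in others)
        rows.append((t, hits, rest))
    return rows

# Toy data: three actives sharing bits 0 and 1, one unrelated compound.
actives = [[1, 1, 0, 1, 0, 0], [1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 1, 0]]
others = [[0, 0, 1, 0, 0, 1]]
query = [1, 1, 0, 0, 0, 1]

bits = consensus_bits(actives)
unscaled = scaling_weights(6, bits, factor=1)
scaled = scaling_weights(6, bits, factor=3)

profile_unscaled = search_profile(query, actives, others, unscaled, [0.5, 0.7])
profile_scaled = search_profile(query, actives, others, scaled, [0.5, 0.7])
```

In this toy case the actives drop out of the unscaled profile once the threshold rises above 0.5, while the scaled profile still recovers all three at 0.7 without picking up the unrelated compound, which mirrors the reported shift toward higher preferred threshold values under scaling.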

Year:  2004        PMID: 15554672     DOI: 10.1021/ci0400819

Source DB:  PubMed          Journal:  J Chem Inf Comput Sci        ISSN: 0095-2338


  14 in total

1.  Bounds and algorithms for fast exact searches of chemical fingerprints in linear and sublinear time.

Authors:  S Joshua Swamidass; Pierre Baldi
Journal:  J Chem Inf Model       Date:  2007-02-28       Impact factor: 4.956

2.  Lossless compression of chemical fingerprints using integer entropy codes improves storage and retrieval.

Authors:  Pierre Baldi; Ryan W Benz; Daniel S Hirschberg; S Joshua Swamidass
Journal:  J Chem Inf Model       Date:  2007-10-30       Impact factor: 4.956

3.  FINDSITE: a combined evolution/structure-based approach to protein function prediction. (Review)

Authors:  Jeffrey Skolnick; Michal Brylinski
Journal:  Brief Bioinform       Date:  2009-03-26       Impact factor: 11.622

4.  Integration-mediated prediction enrichment of quantitative model for Hsp90 inhibitors as anti-cancer agents: 3D-QSAR study.

Authors:  Kuldeep K Roy; Supriya Singh; Anil K Saxena
Journal:  Mol Divers       Date:  2010-08-26       Impact factor: 2.943

5.  Comparison of structure-based and threading-based approaches to protein functional annotation.

Authors:  Michal Brylinski; Jeffrey Skolnick
Journal:  Proteins       Date:  2010-01

6.  Hashing algorithms and data structures for rapid searches of fingerprint vectors.

Authors:  Ramzi Nasr; Daniel S Hirschberg; Pierre Baldi
Journal:  J Chem Inf Model       Date:  2010-08-23       Impact factor: 4.956

7.  When is chemical similarity significant? The statistical distribution of chemical similarity scores and its extreme values.

Authors:  Pierre Baldi; Ramzi Nasr
Journal:  J Chem Inf Model       Date:  2010-07-26       Impact factor: 4.956

8.  Speeding up chemical searches using the inverted index: the convergence of chemoinformatics and text search methods.

Authors:  Ramzi Nasr; Rares Vernica; Chen Li; Pierre Baldi
Journal:  J Chem Inf Model       Date:  2012-04-10       Impact factor: 4.956

9.  Large scale study of multiple-molecule queries.

Authors:  Ramzi J Nasr; S Joshua Swamidass; Pierre F Baldi
Journal:  J Cheminform       Date:  2009-06-04       Impact factor: 5.514

10.  BLASTing small molecules--statistics and extreme statistics of chemical similarity scores.

Authors:  Pierre Baldi; Ryan W Benz
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937
