Performance of similarity measures in 2D fragment-based similarity searching: comparison of structural descriptors and similarity coefficients.

Xin Chen, Charles H Reynolds

Abstract

2D fragment-based similarity searching is one of the most popular techniques for searching large databases of chemical structures and has been widely applied in drug discovery. However, its performance, especially its effectiveness in retrieving active structural analogues, has not been adequately studied. We report a series of computational experiments in which we systematically studied the influence of structural descriptors and similarity coefficients on the effectiveness of similarity searching. The study was conducted using two large public data sets, NCI anti-AIDS and MDDR. Four sets of 2D linear fragment descriptors, based on the original definitions of atom pairs and atom sequences, were compared. The effect of using the Tanimoto coefficient and the Euclidean distance was studied as a function of descriptor set. The results clearly indicate that, in terms of hit rate, the Tanimoto coefficient is superior to the Euclidean distance in 2D fragment-based similarity searching, while atom sequences demonstrate the best overall performance among the structural descriptors studied.
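For readers unfamiliar with the two measures compared in the abstract, the following is a minimal sketch of how they are commonly computed on binary fragment fingerprints. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say whether fragment presence or fragment counts were used, the fragment labels are hypothetical placeholders, and the function names `tanimoto`, `euclidean`, and `rank_by_tanimoto` are introduced here for illustration only.

```python
from math import sqrt

# Assumption: each molecule is represented as a set of fragment keys
# (binary presence/absence). The labels below are hypothetical and are
# not the atom-pair or atom-sequence definitions used in the paper.


def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: shared fragments / fragments in either molecule."""
    if not a and not b:
        return 1.0  # convention assumed here for two empty fingerprints
    common = len(a & b)
    return common / (len(a) + len(b) - common)


def euclidean(a: set, b: set) -> float:
    """Euclidean distance between binary fingerprints.

    For 0/1 vectors this is the square root of the number of fragments
    present in exactly one of the two molecules.
    """
    return sqrt(len(a ^ b))


def rank_by_tanimoto(query: set, database: dict) -> list:
    """Return database keys sorted from most to least similar to the query."""
    return sorted(database, key=lambda name: tanimoto(query, database[name]), reverse=True)


query = {"C-C", "C-N", "C-O", "N-C-C"}
candidate = {"C-C", "C-N", "C-S", "N-C-C", "C-C-S"}

print(tanimoto(query, candidate))   # 3 shared / 6 total = 0.5
print(euclidean(query, candidate))  # sqrt(3) differing fragments
```

Note that the Tanimoto coefficient is a similarity (higher means more alike) while the Euclidean distance is a dissimilarity (lower means more alike), so a ranked search sorts in opposite directions for the two measures.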

Year:  2002        PMID: 12444738     DOI: 10.1021/ci025531g

Source DB:  PubMed          Journal:  J Chem Inf Comput Sci        ISSN: 0095-2338


  57 in total

1.  Comparison of correlation vector methods for ligand-based similarity searching.

Authors:  Uli Fechner; Lutz Franke; Steffen Renner; Petra Schneider; Gisbert Schneider
Journal:  J Comput Aided Mol Des       Date:  2003-10       Impact factor: 3.686

Review 2.  Methods for Similarity-based Virtual Screening.

Authors:  Thomas G Kristensen; Jesper Nielsen; Christian N S Pedersen
Journal:  Comput Struct Biotechnol J       Date:  2013-03-03       Impact factor: 7.271

3.  ChemMine. A compound mining database for chemical genomics.

Authors:  Thomas Girke; Li-Chang Cheng; Natasha Raikhel
Journal:  Plant Physiol       Date:  2005-06       Impact factor: 8.340

Review 4.  Molecular similarity and diversity in chemoinformatics: from theory to applications.

Authors:  Ana G Maldonado; J P Doucet; Michel Petitjean; Bo-Tao Fan
Journal:  Mol Divers       Date:  2006-02       Impact factor: 2.943

5.  Partner-matching for the automated identification of reproducible ICA components from fMRI datasets: algorithm and validation.

Authors:  Zhishun Wang; Bradley S Peterson
Journal:  Hum Brain Mapp       Date:  2008-08       Impact factor: 5.038

6.  Analysis and use of fragment-occurrence data in similarity-based virtual screening.

Authors:  Shereena M Arif; John D Holliday; Peter Willett
Journal:  J Comput Aided Mol Des       Date:  2009-06-18       Impact factor: 3.686

7.  Accelerated similarity searching and clustering of large compound sets by geometric embedding and locality sensitive hashing.

Authors:  Yiqun Cao; Tao Jiang; Thomas Girke
Journal:  Bioinformatics       Date:  2010-02-23       Impact factor: 6.937

8.  Pre-docking filter for protein and ligand 3D structures.

Authors:  Alisa Wilantho; Sissades Tongsima; Ekachai Jenwitheesuk
Journal:  Bioinformation       Date:  2008-12-31

9.  ChemmineR: a compound mining framework for R.

Authors:  Yiqun Cao; Anna Charisi; Li-Chang Cheng; Tao Jiang; Thomas Girke
Journal:  Bioinformatics       Date:  2008-07-02       Impact factor: 6.937

10.  A maximum common substructure-based algorithm for searching and predicting drug-like compounds.

Authors:  Yiqun Cao; Tao Jiang; Thomas Girke
Journal:  Bioinformatics       Date:  2008-07-01       Impact factor: 6.937
