Martin Whittle, Peter Willett, Werner Klaffke, Paula van Noort.
Abstract
Similarity searches using combinations of seven different similarity coefficients and six different representations have been carried out on the Dictionary of Natural Products database. The objective was to discover whether any special methods of searching apply to this database, which is very different in nature from the many synthetic databases that have been the subject of previous studies of similarity searching. Search effectiveness was assessed by a recall analysis of the search outputs from sets of pharmacologically active target structures. The different target sets produce exceptional but contradictory results for the Russell-Rao and Forbes coefficients, which have been shown to be due to a dependence on molecular size; these are the coefficients of choice in the case of large and small structures, respectively. Rankings from these results have been combined using a data fusion scheme, and some small gains in performance were normally obtained by using substructural fingerprints and molecular holograms in combination with the Squared Euclidean or Tanimoto coefficients.
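The size dependence described above can be illustrated with a minimal sketch. It assumes binary fingerprints stored as sets of "on" bit positions and uses the standard textbook forms of four of the coefficients named in the abstract; the function names, the fingerprint length `n`, and the sum-of-ranks fusion rule are illustrative assumptions and are not taken from the paper, which does not specify its fusion scheme in the abstract.

```python
from typing import Dict, List, Set, Tuple


def counts(x: Set[int], y: Set[int], n: int) -> Tuple[int, int, int, int]:
    """Return (a, b, c, d) for two binary fingerprints of length n.

    a = bits set in both, b = bits set only in x,
    c = bits set only in y, d = bits set in neither.
    """
    a = len(x & y)
    b = len(x - y)
    c = len(y - x)
    d = n - a - b - c
    return a, b, c, d


def tanimoto(x: Set[int], y: Set[int], n: int) -> float:
    a, b, c, _ = counts(x, y, n)
    return a / (a + b + c) if (a + b + c) else 0.0


def russell_rao(x: Set[int], y: Set[int], n: int) -> float:
    # Common bits divided by the full fingerprint length.
    a, _, _, _ = counts(x, y, n)
    return a / n


def forbes(x: Set[int], y: Set[int], n: int) -> float:
    # Common bits relative to the product of the two bit counts.
    a, b, c, _ = counts(x, y, n)
    denom = (a + b) * (a + c)
    return n * a / denom if denom else 0.0


def squared_euclidean(x: Set[int], y: Set[int], n: int) -> int:
    # For binary vectors this is a distance: the number of differing bits.
    _, b, c, _ = counts(x, y, n)
    return b + c


def rank_positions(scores: Dict[str, float]) -> Dict[str, int]:
    """Rank 1 = highest similarity score; ties are broken by name."""
    ordered = sorted(scores, key=lambda k: (-scores[k], k))
    return {name: i + 1 for i, name in enumerate(ordered)}


def fuse_by_rank_sum(score_lists: List[Dict[str, float]]) -> List[str]:
    """Assumed fusion rule: order molecules by the sum of their rank positions."""
    totals: Dict[str, int] = {}
    for scores in score_lists:
        for name, rank in rank_positions(scores).items():
            totals[name] = totals.get(name, 0) + rank
    return sorted(totals, key=lambda k: (totals[k], k))
```

As a toy example with `n = 16`, take a target `{1, 2, 3, 4}`, a small molecule `{1, 2}`, and a large molecule `{1, ..., 8}`. Both candidates have the same Tanimoto similarity to the target (0.5), but Russell-Rao scores the large molecule higher (0.25 against 0.125) while Forbes scores the small one higher (4.0 against 2.0), which is the kind of opposite size bias the abstract reports.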
Year: 2003 PMID: 12653508 DOI: 10.1021/ci025591m
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338