Comparison of nonbinary similarity coefficients for similarity searching, clustering and compound selection.

Aysha Al Khalifa, Maciej Haranczyk, John Holliday.

Abstract

Several recent studies have compared the relative performance of a selection of similarity coefficients when applied to chemical databases represented by binary fingerprints. Considerable variation in performance, when used for (dis)similarity-based techniques, such as similarity searching, database clustering, and dissimilarity-based compound selection, has been reported, the reasons for which are closely related to molecular size. For many of these similarity coefficients, an alternative form can be derived which is applicable to sets of nonbinary data, such as calculated or measured physicochemical properties, or counts of substructural fragments. Here we report on several studies which have been undertaken to investigate the relative performance of twelve coefficients when applied to nonbinary data using such (dis)similarity-based techniques. Results suggest that no single coefficient is appropriate for all methodologies investigated and that the size bias detected with binary data is not as apparent when the data and, hence, coefficient are nonbinary in nature.
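To make the binary versus nonbinary distinction concrete, the sketch below uses the Tanimoto coefficient as an illustration. This choice is an assumption: the abstract does not name the twelve coefficients studied. For binary fingerprints the coefficient is c / (a + b - c), where a and b count the bits set in each molecule and c counts the bits set in both. Its usual nonbinary form replaces those counts with sums of products, so it can be applied to fragment counts or property values. The function names and the zero-vector convention are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def tanimoto_binary(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto coefficient for two equal-length 0/1 fingerprints."""
    bits_a = sum(a)                              # bits set in a
    bits_b = sum(b)                              # bits set in b
    common = sum(x & y for x, y in zip(a, b))    # bits set in both
    denom = bits_a + bits_b - common
    # Assumption: two all-zero fingerprints are given similarity 0.0.
    return common / denom if denom else 0.0


def tanimoto_nonbinary(x: Sequence[float], y: Sequence[float]) -> float:
    """Nonbinary (continuous) form: sum(xy) / (sum(x^2) + sum(y^2) - sum(xy))."""
    sxy = sum(p * q for p, q in zip(x, y))
    sxx = sum(p * p for p in x)
    syy = sum(q * q for q in y)
    denom = sxx + syy - sxy
    # Assumption: two all-zero vectors are given similarity 0.0.
    return sxy / denom if denom else 0.0


if __name__ == "__main__":
    # On 0/1 data the two forms agree.
    fp1, fp2 = [1, 1, 0, 1], [1, 0, 0, 1]
    print(tanimoto_binary(fp1, fp2), tanimoto_nonbinary(fp1, fp2))

    # Fragment counts (made-up values) can only use the nonbinary form.
    counts1, counts2 = [2, 0, 1], [1, 1, 1]
    print(tanimoto_nonbinary(counts1, counts2))
```

On 0/1 input the nonbinary form reduces to the binary one, because x*x equals x for each bit. That is why it can be regarded as an alternative form of the same coefficient rather than a different measure.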

Year:  2009        PMID: 19405526     DOI: 10.1021/ci8004644

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  14 in total

1.  Exploring chemical space for drug discovery using the chemical universe database. (Review)

Authors:  Jean-Louis Reymond; Mahendra Awale
Journal:  ACS Chem Neurosci       Date:  2012-04-25       Impact factor: 4.418

2.  A rank weighted classification for plasma proteomic profiles based on case-based reasoning.

Authors:  Amy M Kwon
Journal:  BMC Med Inform Decis Mak       Date:  2018-05-31       Impact factor: 2.796

3.  Visualisation of the chemical space of fragments, lead-like and drug-like molecules in PubChem.

Authors:  Ruud van Deursen; Lorenz C Blum; Jean-Louis Reymond
Journal:  J Comput Aided Mol Des       Date:  2011-05-27       Impact factor: 3.686

4.  Visualisation and subsets of the chemical universe database GDB-13 for virtual screening.

Authors:  Lorenz C Blum; Ruud van Deursen; Jean-Louis Reymond
Journal:  J Comput Aided Mol Des       Date:  2011-05-27       Impact factor: 3.686

5.  Stereoselective virtual screening of the ZINC database using atom pair 3D-fingerprints.

Authors:  Mahendra Awale; Xian Jin; Jean-Louis Reymond
Journal:  J Cheminform       Date:  2015-02-10       Impact factor: 5.514

6.  Understanding the foundations of the structural similarities between marketed drugs and endogenous human metabolites.

Authors:  Steve O'Hagan; Douglas B Kell
Journal:  Front Pharmacol       Date:  2015-05-13       Impact factor: 5.810

7.  A generalizable definition of chemical similarity for read-across.

Authors:  Matteo Floris; Alberto Manganaro; Orazio Nicolotti; Ricardo Medda; Giuseppe Felice Mangiatordi; Emilio Benfenati
Journal:  J Cheminform       Date:  2014-10-18       Impact factor: 5.514

8.  Analysis of drug-endogenous human metabolite similarities in terms of their maximum common substructures.

Authors:  Steve O'Hagan; Douglas B Kell
Journal:  J Cheminform       Date:  2017-03-09       Impact factor: 5.514

9.  Extended many-item similarity indices for sets of nucleotide and protein sequences.

Authors:  Dávid Bajusz; Ramón Alain Miranda-Quintana; Anita Rácz; Károly Héberger
Journal:  Comput Struct Biotechnol J       Date:  2021-06-16       Impact factor: 7.271

10.  A Comparison Study on Similarity and Dissimilarity Measures in Clustering Continuous Data.

Authors:  Ali Seyed Shirkhorshidi; Saeed Aghabozorgi; Teh Ying Wah
Journal:  PLoS One       Date:  2015-12-11       Impact factor: 3.240
