Comparison of similarity coefficients for clustering and compound selection.

Abstract

Recent studies into the use of a selection of similarity coefficients, when applied to searches of chemical databases represented by binary fingerprints, have shown considerable variation in their retrieval performance and in the sets of compounds being retrieved. The main factor influencing performance is the density distribution of the bitstrings for the active class, a feature which is closely related to molecular size. If this is the case when these coefficients are applied to similarity searches, then we would expect considerable variation in performance when applied to dissimilarity methods, namely clustering and compound selection. Here we report on several studies which have been undertaken to investigate the relative performance of 13 association and correlation coefficients, which have been shown to exhibit complementary performance in similarity searches, when applied to hierarchical and nonhierarchical clustering methods and to a compound selection methodology. Results suggest that the correlation coefficients perform consistently well for clustering and compound selection, as does the Baroni-Urbani/Buser association coefficient. Surprisingly, these often outperform the Tanimoto coefficient, while the Simple Match (effectively the complement of the Squared Euclidean Distance) performs very poorly.
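As a rough illustration of the coefficients named in the abstract, the sketch below computes the Tanimoto, Simple Match, and Baroni-Urbani/Buser coefficients for two binary fingerprints. The formulas are the standard textbook definitions in terms of the contingency counts a, b, c, d. They are not quoted from the abstract, and the function names and example fingerprints are hypothetical. The last line checks the abstract's remark that Simple Match is effectively the complement of the Squared Euclidean Distance, assuming the distance is normalised by the fingerprint length.

```python
from math import sqrt


def contingency(x, y):
    """Count (a, b, c, d) for two equal-length binary fingerprints.

    a: bits set in both, b: set only in x, c: set only in y, d: set in neither.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("fingerprints must be non-empty and of equal length")
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if q and not p)
    d = len(x) - a - b - c
    return a, b, c, d


def tanimoto(x, y):
    a, b, c, _ = contingency(x, y)
    # Returning 0.0 for two all-zero fingerprints is a convention chosen here.
    return a / (a + b + c) if (a + b + c) else 0.0


def simple_match(x, y):
    a, b, c, d = contingency(x, y)
    return (a + d) / (a + b + c + d)


def baroni_urbani_buser(x, y):
    a, b, c, d = contingency(x, y)
    root = sqrt(a * d)
    denom = root + a + b + c
    # Same all-zero convention as in tanimoto (an assumption of this sketch).
    return (root + a) / denom if denom else 0.0


def squared_euclidean(x, y):
    # For 0/1 vectors the squared Euclidean distance is the number of
    # mismatching positions.
    _, b, c, _ = contingency(x, y)
    return b + c


# Hypothetical 8-bit fingerprints: a=2, b=1, c=1, d=4.
x = [1, 1, 0, 0, 1, 0, 0, 0]
y = [1, 0, 0, 0, 1, 1, 0, 0]
n = len(x)

print(tanimoto(x, y))                    # 0.5
print(simple_match(x, y))                # 0.75
print(baroni_urbani_buser(x, y))         # approximately 0.707
print(1 - squared_euclidean(x, y) / n)   # 0.75, equal to simple_match
```

Because Simple Match counts shared off-bits (d) as agreement while Tanimoto ignores them, the two can diverge sharply on sparse fingerprints, where d dominates the counts.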

Year: 2008 PMID: 18293953 DOI: 10.1021/ci700413a

Source DB: PubMed Journal: J Chem Inf Model ISSN: 1549-9596 Impact factor: 4.956

Cited

10 in total

1. Characterizing the diversity and biological relevance of the MLPCN assay manifold and screening set.

Authors: Jintao Zhang; Gerald H Lushington; Jun Huan
Journal: J Chem Inf Model Date: 2011-05-13 Impact factor: 4.956

2. Chemical informatics and the drug discovery knowledge pyramid.

Authors: Gerald H Lushington; Yinghua Dong; Bhargav Theertham
Journal: Comb Chem High Throughput Screen Date: 2013-12 Impact factor: 1.339

3. DPRESS: Localizing estimates of predictive uncertainty.

Authors: Robert D Clark
Journal: J Cheminform Date: 2009-07-14 Impact factor: 5.514

4. WebCSD: the online portal to the Cambridge Structural Database.

Authors: Ian R Thomas; Ian J Bruno; Jason C Cole; Clare F Macrae; Elna Pidcock; Peter A Wood
Journal: J Appl Crystallogr Date: 2010-02-12 Impact factor: 3.304

5. A generalizable definition of chemical similarity for read-across.

Authors: Matteo Floris; Alberto Manganaro; Orazio Nicolotti; Ricardo Medda; Giuseppe Felice Mangiatordi; Emilio Benfenati
Journal: J Cheminform Date: 2014-10-18 Impact factor: 5.514

6. Measuring similarity between gene interaction profiles.

Authors: Joëlle Barido-Sottani; Samuel D Chapman; Evsey Kosman; Arcady R Mushegian
Journal: BMC Bioinformatics Date: 2019-08-22 Impact factor: 3.169

7. In Silico Strategies in Tuberculosis Drug Discovery. (Review)

Authors: Stephani Joy Y Macalino; Junie B Billones; Voltaire G Organo; Maria Constancia O Carrillo
Journal: Molecules Date: 2020-02-04 Impact factor: 4.411

8. scMelody: An Enhanced Consensus-Based Clustering Model for Single-Cell Methylation Data by Reconstructing Cell-to-Cell Similarity. (Review)

Authors: Qi Tian; Jianxiao Zou; Jianxiong Tang; Liang Liang; Xiaohong Cao; Shicai Fan
Journal: Front Bioeng Biotechnol Date: 2022-02-23

9. Comparison of profile similarity measures for genetic interaction networks.

Authors: Raamesh Deshpande; Benjamin Vandersluis; Chad L Myers
Journal: PLoS One Date: 2013-07-10 Impact factor: 3.240

10. Calculating Glycoprotein Similarities From Mass Spectrometric Data. (Review)

Authors: William E Hackett; Joseph Zaia
Journal: Mol Cell Proteomics Date: 2021-01-06 Impact factor: 5.911
