Extended continuous similarity indices: theory and application for QSAR descriptor selection.

Anita Rácz, Timothy B Dunn, Dávid Bajusz, Taewon D Kim, Ramón Alain Miranda-Quintana, Károly Héberger.

Abstract

Extended (or n-ary) similarity indices have been recently proposed to extend the comparative analysis of binary strings. Going beyond the traditional notion of pairwise comparisons, these novel indices allow comparing any number of objects at the same time. This results in a remarkable efficiency gain with respect to other approaches, since now we can compare N molecules in O(N) instead of the common quadratic O(N²) timescale. This favorable scaling has motivated the application of these indices to diversity selection, clustering, phylogenetic analysis, chemical space visualization, and post-processing of molecular dynamics simulations. However, the current formulation of the n-ary indices is limited to vectors with binary or categorical inputs. Here, we present the further generalization of this formalism so it can be applied to numerical data, i.e. to vectors with continuous components. We discuss several ways to achieve this extension and present their analytical properties. As a practical example, we apply this formalism to the problem of feature selection in QSAR and prove that the extended continuous similarity indices provide a convenient way to discern between several sets of descriptors.
© 2022. The Author(s), under exclusive licence to Springer Nature Switzerland AG.
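
The scaling argument in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the unweighted Jaccard-type ratio, and the default threshold are assumptions made for illustration, and the published indices use weighting schemes that are not reproduced here. The first function averages Tanimoto similarity over all pairs of binary rows, which needs O(N²) comparisons; the second accumulates per-column sums in a single pass over the N objects and then classifies each column by how far its sum lies from an even split. Inputs to the second function are assumed to be already scaled to [0, 1]; how the paper performs that scaling for continuous data is not stated in this record.

```python
from itertools import combinations


def mean_pairwise_tanimoto(rows):
    """Average Tanimoto similarity over all pairs of binary rows (O(N^2) pairs)."""
    total, count = 0.0, 0
    for a, b in combinations(rows, 2):
        both = sum(1 for x, y in zip(a, b) if x and y)
        either = sum(1 for x, y in zip(a, b) if x or y)
        total += both / either if either else 1.0
        count += 1
    return total / count


def column_sum_similarity(rows, threshold=None):
    """Simplified, unweighted n-ary Jaccard-type score built from column sums.

    Illustrative assumption, not the published formula: a column counts as
    'similar' when |2 * column_sum - N| exceeds the threshold, and as
    'dissimilar' otherwise. Similar columns dominated by zeros are ignored,
    mirroring how the pairwise Tanimoto index ignores shared absences.
    """
    n = len(rows)
    if threshold is None:
        threshold = n % 2  # assumed default: smallest meaningful cutoff
    col_sums = [0.0] * len(rows[0])
    for row in rows:  # single pass over the N objects
        for j, value in enumerate(row):
            col_sums[j] += value
    one_like = dissimilar = 0
    for s in col_sums:
        if abs(2 * s - n) > threshold:
            if 2 * s > n:
                one_like += 1
        else:
            dissimilar += 1
    denom = one_like + dissimilar
    return one_like / denom if denom else 1.0
```

The column-sum pass touches each object once, so its cost grows linearly with N for a fixed number of columns, whereas the pairwise loop grows with N(N-1)/2. The two scores are not expected to agree numerically; the sketch only contrasts how the work scales.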

Keywords:  Descriptors; Extended similarity; QSAR; Similarity

Year: 2022; PMID: 35288838; DOI: 10.1007/s10822-022-00444-7

Source DB: PubMed; Journal: J Comput Aided Mol Des; ISSN: 0920-654X; Impact factor: 3.686


