
How diverse are diversity assessment methods? A comparative analysis and benchmarking of molecular descriptor space.

Alexios Koutsoukas, Shardul Paricharak, Warren R J D Galloway, David R Spring, Adriaan P Ijzerman, Robert C Glen, David Marcus, Andreas Bender.

Abstract

Chemical diversity selection is a widely applied approach for choosing structurally diverse subsets of molecules, often with the objective of maximizing the number of hits in biological screening. While many methods exist in the area, few systematic comparisons using current descriptors have been published, particularly with the objective of assessing diversity in bioactivity space; the current study aims to address this shortage. In this work, 13 widely used molecular descriptors were compared, including fingerprint-based descriptors (ECFP4, FCFP4, MACCS keys), pharmacophore-based descriptors (TAT, TAD, TGT, TGD, GpiDAPH3), shape-based descriptors (rapid overlay of chemical structures (ROCS) and principal moments of inertia (PMI)), a connectivity-matrix-based descriptor (BCUT), physicochemical-property-based descriptors (prop2D), and a more recently introduced molecular descriptor type (namely, "Bayes Affinity Fingerprints"). We assessed both how similarly the descriptors behave in assessing the diversity of chemical libraries and their ability to select compounds from libraries that are diverse in bioactivity space, a property of much practical relevance in screening library design. This is particularly relevant given that many future targets to be screened are not known in advance, yet the library should still maximize the likelihood of containing bioactive matter for future screening campaigns. Overall, our results showed that descriptors based on atom topology (i.e., fingerprint-based descriptors and pharmacophore-based descriptors) correlate well in rank-ordering compounds, both within and between descriptor types. On the other hand, shape-based descriptors such as ROCS and PMI showed weak correlation with the other descriptors utilized in this study, demonstrating significantly different behavior.
We then applied eight of the molecular descriptors compared in this study to select a diverse subset of compounds (4%) from an initial population of 2587 compounds, covering the 25 largest human activity classes from ChEMBL, and measured the coverage of activity classes by the subsets. Here, it was found that "Bayes Affinity Fingerprints" achieved an average coverage of 92% of activity classes. Using ECFP4, GpiDAPH3, TGT, and random sampling, 91%, 84%, 84%, and 84% of the activity classes were represented in the selected compounds, respectively, followed by BCUT, prop2D, MACCS, and PMI (in order of decreasing performance). In addition, we were able to show that there is no visible correlation between compound diversity in PMI space and in bioactivity space, despite the frequent use of PMI plots to this end. To summarize, in this work, we assessed which descriptors select compounds with high coverage of bioactivity space and can hence be used for diverse compound selection for biological screening. In cases where multiple descriptors are to be used for diversity selection, this work describes which descriptors behave complementarily and can hence be used jointly to focus on different aspects of diversity in chemical space.
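
The coverage experiment described in the abstract can be illustrated with a small sketch. The abstract does not state which selection algorithm was used, so the greedy MaxMin picker below is an assumption, as are the Tanimoto distance on sets of "on" bits, the toy fingerprints, and the function names `tanimoto_distance`, `maxmin_select`, and `class_coverage`. It shows only the general idea: pick a diverse subset in descriptor space, then count the fraction of activity classes represented in that subset.

```python
import random


def tanimoto_distance(a, b):
    """Return 1 - Tanimoto similarity for two sets of 'on' fingerprint bits."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(fps, k, seed_index=0):
    """Greedy MaxMin selection (an assumed algorithm, not taken from the paper).

    Starting from seed_index, repeatedly add the compound whose distance
    to its nearest already-selected compound is largest.
    """
    if not 0 < k <= len(fps):
        raise ValueError("k must be between 1 and the number of compounds")
    selected = [seed_index]
    chosen = {seed_index}
    # min_dist[i] = distance from compound i to its nearest selected compound
    min_dist = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    while len(selected) < k:
        candidates = (i for i in range(len(fps)) if i not in chosen)
        nxt = max(candidates, key=lambda i: min_dist[i])
        selected.append(nxt)
        chosen.add(nxt)
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[nxt]))
    return selected


def class_coverage(selected, labels):
    """Fraction of all activity classes represented among the selected compounds."""
    return len({labels[i] for i in selected}) / len(set(labels))


# Toy data (hypothetical): seven compounds in three activity classes.
fps = [
    frozenset({1, 2, 3}), frozenset({1, 2, 4}), frozenset({1, 3, 4}),  # class A
    frozenset({10, 11, 12}), frozenset({10, 11, 13}),                  # class B
    frozenset({20, 21, 22}), frozenset({20, 22, 23}),                  # class C
]
labels = ["A", "A", "A", "B", "B", "C", "C"]

diverse = maxmin_select(fps, k=3)
diverse_coverage = class_coverage(diverse, labels)

# Random-sampling baseline, analogous to the baseline mentioned in the abstract.
baseline = random.Random(0).sample(range(len(fps)), 3)
baseline_coverage = class_coverage(baseline, labels)

print("MaxMin picks:", diverse, "coverage:", diverse_coverage)
print("Random picks:", baseline, "coverage:", baseline_coverage)
```

In this toy setting the fingerprints of different classes share no bits, so a diverse pick of three compounds necessarily spans all three classes. Real descriptors and activity data give no such guarantee, which is what the coverage percentages in the abstract quantify.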

Year: 2013   PMID: 24289493   DOI: 10.1021/ci400469u

Source DB: PubMed   Journal: J Chem Inf Model   ISSN: 1549-9596   Impact factor: 4.956

