Large-scale similarity search profiling of ChEMBL compound data sets.

Kathrin Heikamp, Jürgen Bajorath.

Abstract

A large-scale similarity search investigation has been carried out on 266 well-defined compound activity classes extracted from the ChEMBL database. The analysis was performed using two widely applied two-dimensional (2D) fingerprints that mark opposite ends of the current performance spectrum of these types of fingerprints, i.e., MACCS structural keys and the extended connectivity fingerprint with bond diameter four (ECFP4). For each fingerprint, three nearest neighbor search strategies were applied. On the basis of these search calculations, a similarity search profile of the ChEMBL database was generated. Overall, the fingerprint search campaign was surprisingly successful. In 203 of 266 test cases (∼76%), a compound recovery rate of at least 50% was observed with at least the better performing fingerprint and one search strategy. The similarity search profile also revealed several general trends. For example, fingerprint searching was often characterized by an early enrichment of active compounds in database selection sets. In addition, compound activity classes have been categorized according to different similarity search performance levels, which helps to put the results of benchmark calculations into perspective. Therefore, a compendium of activity classes falling into different search performance categories is provided. On the basis of our large-scale investigation, the performance range of state-of-the-art 2D fingerprinting has been delineated for compound data sets directed against a wide spectrum of pharmaceutical targets.
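The search setup described in the abstract (fingerprint comparison, nearest neighbor scoring against a set of reference compounds, and compound recovery rate in a selection set) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes fingerprints are represented as sets of "on" bit positions, that similarity is measured with the Tanimoto coefficient, and that a k-nearest-neighbor score is the mean similarity to the k most similar reference compounds (k = 1 gives a 1-NN search). The function names, the toy fingerprints, and the choice of k are all hypothetical; real MACCS or ECFP4 fingerprints would come from a cheminformatics toolkit.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a fingerprint is the set of bit positions that are switched on.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared bits divided by bits set in either."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def knn_score(candidate: Fingerprint,
              references: Sequence[Fingerprint], k: int) -> float:
    """Mean similarity of a candidate to its k most similar references."""
    sims = sorted((tanimoto(candidate, r) for r in references), reverse=True)
    top = sims[:k]
    return sum(top) / len(top) if top else 0.0


def recovery_rate(database: Sequence[Fingerprint],
                  is_active: Sequence[bool],
                  references: Sequence[Fingerprint],
                  k: int, n_selected: int) -> float:
    """Fraction of all database actives found in the top n_selected ranks."""
    order: List[int] = sorted(
        range(len(database)),
        key=lambda i: knn_score(database[i], references, k),
        reverse=True,
    )
    hits = sum(1 for i in order[:n_selected] if is_active[i])
    total = sum(is_active)
    return hits / total if total else 0.0


# Hypothetical toy data: two reference actives, four database compounds.
references = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
database = [
    frozenset({1, 2, 3, 5}),   # active
    frozenset({2, 3, 4, 6}),   # active
    frozenset({7, 8, 9}),      # inactive
    frozenset({1, 8, 9, 10}),  # inactive
]
is_active = [True, True, False, False]

rate = recovery_rate(database, is_active, references, k=1, n_selected=2)
print(f"Recovery rate in the top 2: {rate:.0%}")
```

In this sketch, a test case would count as successful under the abstract's criterion if `recovery_rate` reaches at least 0.5 for the chosen selection set size; the 203 of 266 classes reported correspond to 203 / 266 ≈ 76%.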

Year: 2011   PMID: 21728295   DOI: 10.1021/ci200199u

Source DB: PubMed   Journal: J Chem Inf Model   ISSN: 1549-9596   Impact factor: 4.956
