A hierarchical clustering approach for large compound libraries.

Alexander Böcker, Swetlana Derksen, Elena Schmidt, Andreas Teckentrup, Gisbert Schneider.

Abstract

A modified version of the k-means clustering algorithm was developed that is able to analyze large compound libraries. A distance threshold determined by plotting the sum of radii of leaf clusters was used as a termination criterion for the clustering process. Hierarchical trees were constructed that can be used to obtain an overview of the data distribution and inherent cluster structure. The approach is also applicable to ligand-based virtual screening with the aim of generating preferred screening collections or focused compound libraries. Retrospective analysis of two activity classes was performed: inhibitors of caspase 1 [interleukin 1 (IL1) cleaving enzyme, ICE] and glucocorticoid receptor ligands. The MDL Drug Data Report (MDDR) and Collection of Bioactive Reference Analogues (COBRA) databases served as the compound pool, for which binary trees were produced. Molecules were encoded by all Molecular Operating Environment 2D descriptors and topological pharmacophore atom types. Individual clusters were assessed for their purity and enrichment of actives belonging to the two ligand classes. Significant enrichment was observed in individual branches of the cluster tree. After clustering a combined database of MDDR, COBRA, and the SPECS catalog, it was possible to retrieve MDDR ICE inhibitors with new scaffolds using COBRA ICE inhibitors as seeds. A Java implementation of the clustering method is available via the Internet (http://www.modlab.de).
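
The abstract does not spell out the algorithm, so the following Python sketch is only one possible reading of it and is not the authors' Java implementation. It assumes a divisive scheme in which each cluster is split in two by plain k-means (k = 2) until its radius falls below a distance threshold, which yields a binary tree. The radius definition (largest distance from the centroid), the function names, and the enrichment formula (active fraction in a cluster divided by the active fraction in the whole library) are all assumptions made for illustration.

```python
import math
import random


def centroid(points):
    dim = len(points[0])
    return [sum(p[i] for p in points) / len(points) for i in range(dim)]


def radius(points):
    """Assumed definition: largest Euclidean distance from the centroid."""
    c = centroid(points)
    return max(math.dist(c, p) for p in points)


def two_means(points, rng, iterations=20):
    """Split points into two groups with plain k-means (k = 2)."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iterations):
        new_groups = ([], [])
        for p in points:
            nearer = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            new_groups[nearer].append(p)
        if not new_groups[0] or not new_groups[1]:
            break  # degenerate split: keep the last valid one
        groups = new_groups
        centers = [centroid(g) for g in groups]
    return groups


def build_tree(points, max_radius, rng):
    """Recursively bisect until every leaf has radius <= max_radius."""
    node = {"members": points, "children": []}
    if len(points) > 1 and radius(points) > max_radius:
        left, right = two_means(points, rng)
        if left and right:
            node["children"] = [
                build_tree(left, max_radius, rng),
                build_tree(right, max_radius, rng),
            ]
    return node


def leaves(node):
    if not node["children"]:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


def sum_of_leaf_radii(tree):
    """Quantity that would be plotted against the threshold to choose it."""
    return sum(radius(leaf["members"]) for leaf in leaves(tree))


def enrichment(actives_in_cluster, cluster_size, actives_total, library_size):
    """Assumed definition: active fraction in cluster / active fraction overall."""
    return (actives_in_cluster / cluster_size) / (actives_total / library_size)


if __name__ == "__main__":
    # Synthetic 2D points stand in for descriptor vectors (hypothetical data).
    rng = random.Random(0)
    data = [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 5), (0, 5)]
        for _ in range(30)
    ]
    for threshold in (4.0, 2.0, 1.0, 0.5):
        tree = build_tree(data, threshold, random.Random(1))
        print(threshold, len(leaves(tree)), round(sum_of_leaf_radii(tree), 2))
```

Scanning several thresholds, as in the loop at the end, gives the kind of curve from which a termination threshold could be read off. Real descriptor vectors would be much higher dimensional and would normally be scaled before distances are computed.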

Year:  2005        PMID: 16045274     DOI: 10.1021/ci0500029

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  5 in total

1.  The development of a knowledge base for basic active structures: an example case of dopamine agonists.

Authors:  Takashi Okada; Masumi Yamakawa; Norihito Ohmori; Sachio Mori; Hiroshi Horikawa; Taketo Hayashi; Satoshi Fujishima
Journal:  Chem Cent J       Date:  2010-01-23       Impact factor: 4.215

2.  Machine Learning-Enabled Pipeline for Large-Scale Virtual Drug Screening.

Authors:  Aayush Gupta; Huan-Xiang Zhou
Journal:  J Chem Inf Model       Date:  2021-08-17       Impact factor: 6.162

3.  CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering.

Authors:  Cheng Zhang; Lin Tao; Chu Qin; Peng Zhang; Shangying Chen; Xian Zeng; Feng Xu; Zhe Chen; Sheng Yong Yang; Yu Zong Chen
Journal:  Nucleic Acids Res       Date:  2014-11-20       Impact factor: 16.971

4.  Clustered distribution of natural product leads of drugs in the chemical space as influenced by the privileged target-sites.

Authors:  Lin Tao; Feng Zhu; Chu Qin; Cheng Zhang; Shangying Chen; Peng Zhang; Cunlong Zhang; Chunyan Tan; Chunmei Gao; Zhe Chen; Yuyang Jiang; Yu Zong Chen
Journal:  Sci Rep       Date:  2015-03-20       Impact factor: 4.379

5.  Commercial SARS-CoV-2 Targeted, Protease Inhibitor Focused and Protein-Protein Interaction Inhibitor Focused Molecular Libraries for Virtual Screening and Drug Design. [Review]

Authors:  Sebastjan Kralj; Marko Jukič; Urban Bren
Journal:  Int J Mol Sci       Date:  2021-12-30       Impact factor: 5.923
