A fast clustering algorithm for analyzing highly similar compounds of very large libraries.

Weizhong Li.

Abstract

As a result of recent developments in high-throughput screening for drug discovery, the number of available screening compounds has been growing rapidly. Chemical vendors provide millions of compounds; however, these compounds are highly redundant. Clustering analysis, a technique that groups similar compounds into families, can be used to analyze such redundancy. Many available clustering methods focus on accurate classification of compounds; they are slow and not suitable for very large compound libraries. This paper describes a fast clustering method based on an incremental clustering algorithm and the 2D fingerprints of compounds. The method can cluster a very large data set with millions of compounds in hours on a single computer. A program implementing this method, called cd-hit-fp, is available from http://chemspace.org.
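
The abstract names the ingredients (incremental clustering over 2D fingerprints) but gives no implementation details. The following is a minimal sketch of one common greedy incremental scheme, not the cd-hit-fp code itself. It assumes, purely for illustration, that fingerprints are stored as Python integers used as bitsets, that similarity is the Tanimoto coefficient, and that a compound joins the first cluster whose representative meets a similarity threshold. The function names, the threshold values, and the toy data are all hypothetical.

```python
from typing import List, Sequence


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient of two fingerprints stored as integer bitsets."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # assumption: two empty fingerprints count as identical
    return bin(a & b).count("1") / union


def incremental_cluster(
    fingerprints: Sequence[int], threshold: float = 0.8
) -> List[List[int]]:
    """Greedy one-pass clustering; returns clusters as lists of input indices.

    The first index in each cluster is that cluster's representative.
    """
    clusters: List[List[int]] = []
    for i, fp in enumerate(fingerprints):
        for cluster in clusters:
            representative = fingerprints[cluster[0]]
            if tanimoto(fp, representative) >= threshold:
                cluster.append(i)  # redundant with an existing representative
                break
        else:
            clusters.append([i])  # no match: this compound starts a new cluster
    return clusters


if __name__ == "__main__":
    # Hypothetical 8-bit fingerprints, for illustration only.
    toy = [0b11110000, 0b11110001, 0b00001111, 0b00001110]
    print(incremental_cluster(toy, threshold=0.75))  # [[0, 1], [2, 3]]
```

Each compound is compared only against cluster representatives rather than against every other compound, which is the general reason incremental schemes of this kind avoid building a full pairwise similarity matrix.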

Year:  2006        PMID: 16995722     DOI: 10.1021/ci0600859

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  7 in total

1.  Counting clusters using R-NN curves.

Authors:  Rajarshi Guha; Debojyoti Dutta; David J Wild; Ting Chen
Journal:  J Chem Inf Model       Date:  2007-06-30       Impact factor: 4.956

2.  3-D clustering: a tool for high throughput docking.

Authors:  John P Priestle
Journal:  J Mol Model       Date:  2008-12-16       Impact factor: 1.810

3.  Structure-based drug design of a new chemical class of small molecules active against influenza A nucleoprotein in vitro and in vivo.

Authors:  Peter Fedichev; Roman Timakhov; Tim Pyrkov; Evgeny Getmantsev; Andrey Vinnik
Journal:  PLoS Curr       Date:  2011-08-07

4.  Discovery of a small-molecule antiviral targeting the HIV-1 matrix protein.

Authors:  Isaac Zentner; Luz-Jeannette Sierra; Lina Maciunas; Andrei Vinnik; Peter Fedichev; Marie K Mankowski; Roger G Ptak; Julio Martín-García; Simon Cocklin
Journal:  Bioorg Med Chem Lett       Date:  2012-11-29       Impact factor: 2.823

5.  Transcription Factor DLX5 As a New Target for Promising Antitumor Agents.

Authors:  R A Timakhov; P O Fedichev; A A Vinnik; J R Testa; O O Favorova
Journal:  Acta Naturae       Date:  2011-07       Impact factor: 1.845

6.  CFam: a chemical families database based on iterative selection of functional seeds and seed-directed compound clustering.

Authors:  Cheng Zhang; Lin Tao; Chu Qin; Peng Zhang; Shangying Chen; Xian Zeng; Feng Xu; Zhe Chen; Sheng Yong Yang; Yu Zong Chen
Journal:  Nucleic Acids Res       Date:  2014-11-20       Impact factor: 16.971

7.  Identification and Preclinical Pharmacology of the γ-Secretase Modulator BMS-869780.

Authors:  Jeremy H Toyn; Lorin A Thompson; Kimberley A Lentz; Jere E Meredith; Catherine R Burton; Sethu Sankaranararyanan; Valerie Guss; Tracey Hall; Lawrence G Iben; Carol M Krause; Rudy Krause; Xu-Alan Lin; Maria Pierdomenico; Craig Polson; Alan S Robertson; R Rex Denton; James E Grace; John Morrison; Joseph Raybon; Xiaoliang Zhuo; Kimberly Snow; Ramesh Padmanabha; Michele Agler; Kim Esposito; David Harden; Margaret Prack; Sam Varma; Victoria Wong; Yingjie Zhu; Tatyana Zvyaga; Samuel Gerritz; Lawrence R Marcin; Mendi A Higgins; Jianliang Shi; Cong Wei; Joseph L Cantone; Dieter M Drexler; John E Macor; Richard E Olson; Michael K Ahlijanian; Charles F Albright
Journal:  Int J Alzheimers Dis       Date:  2014-07-08
