Sergei V Trepalin, Vadim A Gerasimenko, Andrey V Kozyukov, Nikolay Ph Savchuk, Andrey A Ivaschenko.
Abstract
Several modifications were introduced into the previously described Centroid diversity sorting algorithm, which uses the cosine similarity metric. The modified algorithm is suitable for working with large databases on personal computers: diversity sorting of a database of more than one million records takes less than 9 h (Pentium III, 800 MHz). The problem of selecting new compounds to add to an existing collection, so as to maximize the diversity of the collection, is examined. The article also describes a new algorithm for the selection of heterocyclic compounds.
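The centroid idea mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' modified algorithm, whose details the abstract does not give. It assumes only the generally known property that, for unit-length vectors, the sum of cosine similarities between a candidate and all selected compounds equals the dot product of the candidate with the sum ("centroid") vector of the selected set. The function names `normalize` and `centroid_diversity_sort`, the `existing` parameter, and the list-of-floats descriptor format are hypothetical choices made for illustration.

```python
import math


def normalize(vector):
    """Scale a descriptor vector to unit length (needed for cosine similarity)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        raise ValueError("zero vector has no direction")
    return [x / norm for x in vector]


def centroid_diversity_sort(fingerprints, existing=()):
    """Greedy diversity ordering (illustrative sketch, not the published method).

    At each step, pick the candidate whose summed cosine similarity to
    everything already selected is smallest. That sum is computed as one
    dot product with the running centroid (sum) vector.

    `existing` is an assumed, optional set of descriptors for compounds
    already in a collection; they seed the centroid but are not reordered.
    Returns the indices of `fingerprints` in selection order.
    """
    unit = [normalize(v) for v in fingerprints]
    if not unit:
        return []

    centroid = [0.0] * len(unit[0])
    for v in existing:
        for k, x in enumerate(normalize(v)):
            centroid[k] += x

    remaining = list(range(len(unit)))
    order = []
    while remaining:
        # Dot product with the centroid sum = total cosine similarity
        # to all compounds selected so far (ties go to the lowest index).
        best = min(
            remaining,
            key=lambda i: sum(a * b for a, b in zip(unit[i], centroid)),
        )
        order.append(best)
        remaining.remove(best)
        for k, x in enumerate(unit[best]):
            centroid[k] += x
    return order
```

Calling `centroid_diversity_sort(candidates, existing=collection)` would rank new candidates by how little they resemble the current collection, which is one plausible reading of the selection problem the abstract raises. This sketch rescans all remaining candidates at every step, so its cost grows quadratically with database size; the speed-ups that make million-record sorting practical are specific to the article and are not reproduced here.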
Year: 2002 PMID: 11911694 DOI: 10.1021/ci0100649
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338