Jeffrey W Godden, Ling Xue, Douglas B Kitchen, Florence L Stahura, E James Schermerhorn, Jürgen Bajorath.
Abstract
A method termed Median Partitioning (MP) has been developed to select diverse sets of molecules from large compound pools. Unlike many other methods for subset selection, the MP approach does not depend on pairwise comparison of molecules and can therefore be applied to very large compound collections. The only time-limiting step is the calculation of molecular descriptors for database compounds. MP employs arrays of property descriptors with little correlation to divide large compound pools into partitions from which representative molecules can be selected. In each of n subsequent steps, a population of molecules is divided into subpopulations above and below the median value of a property descriptor until the desired number of 2^n partitions is obtained. For descriptor evaluation and selection, an entropy formulation was embedded in a genetic algorithm. MP has been applied here to generate a subset of the Available Chemicals Directory, and the results have been compared with cell-based partitioning.
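The partitioning step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that one median per descriptor is taken over the whole compound pool (recomputing the median inside each subpopulation would be another reading of the abstract), that values equal to the median fall into the lower half, and that the first member of each partition serves as its representative. The names `median_partition` and `pick_representatives` and the toy descriptor values are hypothetical.

```python
from collections import defaultdict
from statistics import median


def median_partition(descriptors):
    """Group molecules into at most 2**n partitions using n descriptors.

    descriptors: list of rows, one row of n descriptor values per
    molecule (hypothetical input layout).
    Returns a dict mapping an n-bit code to a list of row indices.
    """
    n = len(descriptors[0])
    # Assumption: one median per descriptor, computed over the whole pool.
    medians = [median(row[j] for row in descriptors) for j in range(n)]
    partitions = defaultdict(list)
    for i, row in enumerate(descriptors):
        # Bit j is 1 if the value is above the median, otherwise 0.
        code = tuple(int(row[j] > medians[j]) for j in range(n))
        partitions[code].append(i)
    return dict(partitions)


def pick_representatives(partitions):
    # Assumption: the first member stands in for its partition; the
    # abstract does not state how representatives are chosen.
    return {code: members[0] for code, members in partitions.items()}


# Toy pool: four molecules, two made-up descriptor values each.
pool = [
    [1.0, 10.0],
    [2.0, 40.0],
    [3.0, 20.0],
    [4.0, 30.0],
]
parts = median_partition(pool)
reps = pick_representatives(parts)
```

Each molecule is compared only against the n medians, never against another molecule, which is why no pairwise comparison is needed. With n = 2 descriptors the toy pool falls into 2^2 = 4 partitions.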
Year: 2002 PMID: 12132890 DOI: 10.1021/ci0203693
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338