L Xue, J Bajorath.
Abstract
We have evaluated combinations of 111 descriptors that were calculated from two-dimensional representations of molecules to classify 455 compounds belonging to seven biological activity classes using a method based on principal component analysis. The analysis was facilitated by application of a genetic algorithm. Using scoring functions that related the number of compounds in pure classes (i.e., compounds with the same biological activity), singletons, and mixed classes, effective descriptor sets were identified. A combination of only four molecular descriptors accounting for aromatic character, hydrogen bond acceptors, estimated polar van der Waals surface area, and a single structural key gave overall best results. At this performance level, approximately 91% of the compounds occurred in pure classes and mixed classes were absent. The results indicate that combinations of only a few critical descriptors are preferred to partition compounds according to their biological activity, at least in the test cases studied here.
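The bookkeeping behind such a scoring function can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the scoring formula itself. The function name `summarize_partition`, the input format (one partition label and one activity label per compound), and the working definitions (a singleton is a partition holding one compound, a pure class holds two or more compounds of the same activity, a mixed class holds compounds of different activities) are assumptions made for illustration only.

```python
from collections import defaultdict


def summarize_partition(cell_ids, activities):
    """Count compounds falling into pure classes, singletons, and mixed classes.

    cell_ids   -- hypothetical partition label assigned to each compound
    activities -- biological activity class of each compound (same order)
    """
    cell_ids = list(cell_ids)
    activities = list(activities)
    if len(cell_ids) != len(activities):
        raise ValueError("cell_ids and activities must have the same length")

    # Group the activity labels of all compounds that share a partition.
    cells = defaultdict(list)
    for cell, activity in zip(cell_ids, activities):
        cells[cell].append(activity)

    counts = {"pure": 0, "singleton": 0, "mixed": 0}
    for members in cells.values():
        if len(members) == 1:
            counts["singleton"] += 1             # one compound alone
        elif len(set(members)) == 1:
            counts["pure"] += len(members)       # all share one activity
        else:
            counts["mixed"] += len(members)      # activities differ

    total = len(activities)
    # Fraction of compounds in pure classes (assumed summary statistic).
    counts["pure_fraction"] = counts["pure"] / total if total else 0.0
    return counts


# Toy data, invented for illustration: six compounds in three partitions.
example = summarize_partition(
    ["a", "a", "b", "c", "c", "c"],
    ["x", "x", "y", "x", "z", "z"],
)
```

In this toy input, partition "a" is a pure class, "b" is a singleton, and "c" is a mixed class. A genetic algorithm could use counts of this kind as a fitness signal when comparing descriptor combinations, although the exact weighting used in the study is not stated in the abstract.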
Year: 2000 PMID: 10850786 DOI: 10.1021/ci000322m
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338