L Xue, J W Godden, J Bajorath.
Abstract
Combinations of 65 preferred 1D/2D molecular descriptors and 143 single structural keys were evaluated for their performance in compound classification focused on biological activity. The analysis was based on principal component analysis of descriptor combinations and facilitated by use of a genetic algorithm and different scoring functions. In these calculations, several descriptor combinations with greater than 95% prediction accuracy were identified. A set of 40 preferred structural keys was incorporated into a small binary fingerprint designed to search databases for compounds with biological activity similar to query molecules. The performance of mini-fingerprints was tested by systematic similarity search calculations in a database consisting of compounds belonging to seven biological activity classes, which had not been used to select effective descriptors. In these blind test calculations, mini-fingerprints correctly identified approximately 54% of compounds sharing similar biological activity, with 1% false positives. Thus, although the design of mini-fingerprints is conceptually simple, they perform well in activity-oriented similarity searching.
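
The following is a minimal sketch of how a similarity search over small binary fingerprints might look; it is not the authors' implementation. It assumes that each fingerprint is a 40-bit vector in which bit i records the presence of structural key i, and that similarity is measured with the Tanimoto coefficient, a common choice that the abstract does not specify. The function names, the 0.5 threshold, and the example compounds are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

N_BITS = 40  # assumed fingerprint length, matching the 40 structural keys


def make_fingerprint(present_keys: Iterable[int], n_bits: int = N_BITS) -> int:
    """Pack the indices of structural keys found in a molecule into an int bit vector."""
    fp = 0
    for key in present_keys:
        if not 0 <= key < n_bits:
            raise ValueError(f"key index {key} outside 0..{n_bits - 1}")
        fp |= 1 << key
    return fp


def tanimoto(a: int, b: int) -> float:
    """Tanimoto coefficient: bits set in both / bits set in either (assumed metric)."""
    common = bin(a & b).count("1")
    total = bin(a | b).count("1")
    return common / total if total else 0.0


def similarity_search(
    query: int, database: Dict[str, int], threshold: float
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs at or above the threshold, best score first."""
    scored = ((name, tanimoto(query, fp)) for name, fp in database.items())
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical toy data: key indices are illustrative only.
query = make_fingerprint([0, 3, 7, 12])
database = {
    "cmpd_a": make_fingerprint([0, 3, 7, 12]),
    "cmpd_b": make_fingerprint([0, 3, 7, 20]),
    "cmpd_c": make_fingerprint([25, 30]),
}
hits = similarity_search(query, database, threshold=0.5)
```

In this toy example, `cmpd_a` matches the query exactly, `cmpd_b` shares three of five distinct keys, and `cmpd_c` shares none, so only the first two pass the threshold. Counting how many retrieved compounds belong to the query's activity class, and how many do not, would give hit and false-positive rates of the kind reported above.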
Year: 2000 PMID: 11045818 DOI: 10.1021/ci000327j
Source DB: PubMed Journal: J Chem Inf Comput Sci ISSN: 0095-2338