Random reduction in fingerprint bit density improves compound recall in search calculations using complex reference molecules.

Yuan Wang, Hanna Geppert, Jürgen Bajorath.

Abstract

Fingerprints are bit string representations of molecular structure and properties and are widely used to search databases for active molecules. It is well appreciated that molecular complexity and size effects lead to systematic errors in fingerprint similarity searching. For example, several studies have highlighted that large compounds are preferentially recognized, irrespective of their activity, when complex molecules are used as templates for fingerprint calculations. To systematically study the complexity relationships between reference and database molecules that are relevant for practical fingerprint similarity searching, we designed sets of active molecules of increasing fingerprint bit density relative to average database compounds and potential hits, and carried out systematic similarity search trials. We find that the more complex the reference molecules are, the lower the search performance becomes. However, a major finding is that randomly deleting bits that are set on in the fingerprints of complex reference molecules generally improves compound recall, although these random bit density reductions also cause a loss of chemical information content. These results suggest a general search strategy for fingerprints that are sensitive to complexity effects when optimized active compounds are used as reference molecules.
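The random bit-deletion step described in the abstract can be illustrated with a minimal sketch. It assumes fingerprints are stored as sets of on-bit positions and compared with the Tanimoto coefficient. The abstract names neither the fingerprint type nor the similarity metric, so these choices, the function names, the 50% deletion fraction, and the toy reference fingerprint are all hypothetical and not the authors' implementation.

```python
import random


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit positions.

    Assumption: the abstract does not name a similarity metric; Tanimoto is
    used here only because it is a common choice for bit-string fingerprints.
    """
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def reduce_bit_density(fp, fraction, rng):
    """Return a copy of fp with the given fraction of its on bits randomly deleted.

    Hypothetical helper: the deletion fraction is a free parameter here and
    is not taken from the paper.
    """
    n_keep = round(len(fp) * (1.0 - fraction))
    # sorted() gives random.sample a sequence and makes the draw reproducible
    return set(rng.sample(sorted(fp), n_keep))


rng = random.Random(0)                 # fixed seed so the sketch is repeatable
reference = set(range(60))             # toy "complex" reference with 60 bits on
reduced = reduce_bit_density(reference, 0.5, rng)
```

In a search trial, `reduced` would replace `reference` as the query when ranking database fingerprints by `tanimoto`. Repeating the deletion with different seeds gives a sense of how much the ranking depends on which bits happen to be removed.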

Year:  2008        PMID: 18466274     DOI: 10.1111/j.1747-0285.2008.00664.x

Source DB:  PubMed          Journal:  Chem Biol Drug Des        ISSN: 1747-0277            Impact factor:   2.817


  3 in total

1.  Freely available compound data sets and software tools for chemoinformatics and computational medicinal chemistry applications.

Authors:  Ye Hu; Jürgen Bajorath
Journal:  F1000Res       Date:  2012-08-14

2.  Follow up: Compound data sets and software tools for chemoinformatics and medicinal chemistry applications: update and data transfer.

Authors:  Ye Hu; Jürgen Bajorath
Journal:  F1000Res       Date:  2014-03-11

3.  Introducing a Chemically Intuitive Core-Substituent Fingerprint Designed to Explore Structural Requirements for Effective Similarity Searching and Machine Learning.

Authors:  Tiago Janela; Kosuke Takeuchi; Jürgen Bajorath
Journal:  Molecules       Date:  2022-04-04       Impact factor: 4.411
