Chemical descriptors with distinct levels of information content and varying sensitivity to differences between selected compound databases identified by SE-DSE analysis.

Jeffrey W Godden, Jürgen Bajorath.

Abstract

Analysis of the variability of molecular descriptors in large compound databases has recently been carried out using both the Shannon entropy (SE) and differential Shannon entropy (DSE) concepts that reduce descriptor distributions to their information content (SE analysis) and detect intrinsic differences between descriptor settings in compound databases (DSE analysis). Here it is shown that a combination of SE and DSE calculations, termed SE-DSE analysis, makes it possible to identify molecular descriptors most sensitive to systematic differences in databases consisting of synthetic, drug-like, and natural molecules. Descriptors with consistently high information content are detected, and database-specific differences are quantified. Different sets of only very few descriptors were found to be most responsive to principal differences between synthetic, natural, and drug-like molecules. Descriptors with DSE values furthest away from zero are likely to best distinguish between compounds with different characteristics. SE-DSE analysis also reveals that a number of descriptors are not sensitive to compound class-specific features, despite their complexity and consistently high information content.
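As a rough illustration of the two quantities the abstract combines, the sketch below computes the Shannon entropy of a binned descriptor distribution and a differential value between two databases. The abstract does not state the formulas, so the definitions used here are assumptions: SE is the base-2 entropy of the histogram, and DSE is taken as the entropy of the pooled histogram minus the mean of the two individual entropies. The function names, the equal-width binning, and the toy values are all hypothetical.

```python
import math
from typing import List, Sequence


def histogram(values: Sequence[float], lo: float, hi: float, bins: int) -> List[int]:
    """Count values into `bins` equal-width bins spanning [lo, hi]."""
    counts = [0] * bins
    if hi == lo:
        # Degenerate range: every value falls into the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi lands in the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def shannon_entropy(counts: Sequence[int]) -> float:
    """Base-2 Shannon entropy of a histogram (empty bins contribute zero)."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def differential_entropy(a: Sequence[float], b: Sequence[float], bins: int = 4) -> float:
    """ASSUMED definition: SE(pooled) - (SE(a) + SE(b)) / 2, on shared bins."""
    lo = min(min(a), min(b))
    hi = max(max(a), max(b))
    h_a = histogram(a, lo, hi, bins)
    h_b = histogram(b, lo, hi, bins)
    h_ab = [x + y for x, y in zip(h_a, h_b)]
    return shannon_entropy(h_ab) - (shannon_entropy(h_a) + shannon_entropy(h_b)) / 2


# Toy descriptor values for two hypothetical databases.
same = differential_entropy([0, 1, 2, 3], [0, 1, 2, 3], bins=4)
disjoint = differential_entropy([0, 0, 0, 0], [1, 1, 1, 1], bins=2)
```

Under this assumed definition, two databases with identical descriptor distributions give a value of zero, while databases whose values fall into disjoint bins give a value well away from zero, which matches the abstract's remark that descriptors with DSE values furthest from zero are the most likely to discriminate.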

Year:  2002        PMID: 11855971     DOI: 10.1021/ci0103065

Source DB:  PubMed          Journal:  J Chem Inf Comput Sci        ISSN: 0095-2338


  6 in total

1.  Global analysis of large-scale chemical and biological experiments. (Review)

Authors:  David E Root; Brian P Kelley; Brent R Stockwell
Journal:  Curr Opin Drug Discov Devel       Date:  2002-05

2.  IMMAN: free software for information theory-based chemometric analysis.

Authors:  Ricardo W Pino Urias; Stephen J Barigye; Yovani Marrero-Ponce; César R García-Jacas; José R Valdes-Martiní; Facundo Perez-Gimenez
Journal:  Mol Divers       Date:  2015-01-26       Impact factor: 2.943

3.  Accurate prediction of personalized olfactory perception from large-scale chemoinformatic features.

Authors:  Hongyang Li; Bharat Panwar; Gilbert S Omenn; Yuanfang Guan
Journal:  Gigascience       Date:  2018-02-01       Impact factor: 6.524

4.  ProtDCal: A program to compute general-purpose-numerical descriptors for sequences and 3D-structures of proteins.

Authors:  Yasser B Ruiz-Blanco; Waldo Paz; James Green; Yovani Marrero-Ponce
Journal:  BMC Bioinformatics       Date:  2015-05-16       Impact factor: 3.169

5.  Exploring general-purpose protein features for distinguishing enzymes and non-enzymes within the twilight zone.

Authors:  Yasser B Ruiz-Blanco; Guillermin Agüero-Chapin; Enrique García-Hernández; Orlando Álvarez; Agostinho Antunes; James Green
Journal:  BMC Bioinformatics       Date:  2017-07-21       Impact factor: 3.169

6.  Database fingerprint (DFP): an approach to represent molecular databases.

Authors:  Eli Fernández-de Gortari; César R García-Jacas; Karina Martinez-Mayorga; José L Medina-Franco
Journal:  J Cheminform       Date:  2017-02-06       Impact factor: 5.514
