
Chemical data visualization and analysis with incremental generative topographic mapping: big data challenge.

Héléna A Gaspar, Igor I Baskin, Gilles Marcou, Dragos Horvath, Alexandre Varnek.

Abstract

This paper addresses the analysis and visualization, in two-dimensional space, of data sets containing millions of compounds using the incremental version of generative topographic mapping (iGTM). The iGTM algorithm, implemented in the in-house ISIDA-GTM program, was applied to a database of more than 2 million compounds combining the data sets of 36 chemical suppliers and the NCI collection, encoded either by MOE descriptors or by MACCS keys. Taking advantage of the probabilistic nature of GTM, several approaches to data analysis were proposed. Chemical space coverage was evaluated using the normalized Shannon entropy. Different views of the data (property landscapes) were obtained by mapping various physical and chemical properties (molecular weight, aqueous solubility, LogP, etc.) onto the iGTM map. Superposing these views helped to identify the regions of chemical space populated by compounds with desirable physicochemical profiles, and the suppliers providing them. The similarity of data sets in the latent space was assessed by applying several metrics (Euclidean distance, Tanimoto and Bhattacharyya coefficients) to data probability distributions based on cumulated responsibility vectors. As a complementary approach, data sets were compared by treating them as individual objects on a meta-GTM map built on cumulated responsibility vectors or on property landscapes produced with iGTM. We believe that the iGTM methodology described in this article represents a fast and reliable way to analyze and visualize large chemical databases.
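The coverage and similarity measures named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are the standard textbook ones and are assumptions rather than the ISIDA-GTM implementation: the entropy is divided by the logarithm of the number of map nodes, the Tanimoto coefficient uses its continuous (dot-product) form, and all function names and the toy responsibility values are hypothetical.

```python
import math


def cumulated_responsibilities(resp_rows):
    """Sum per-compound responsibility vectors node by node, then
    normalize so the result is a probability distribution over map nodes.
    resp_rows: one list of node responsibilities per compound (assumed layout)."""
    n_nodes = len(resp_rows[0])
    totals = [sum(row[k] for row in resp_rows) for k in range(n_nodes)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def normalized_shannon_entropy(p):
    """Shannon entropy divided by log(number of nodes), giving a value in
    [0, 1]: 0 when all mass sits on one node, 1 when the map is uniformly
    covered. Requires at least two nodes."""
    h = -sum(x * math.log(x) for x in p if x > 0)
    return h / math.log(len(p))


def euclidean_distance(p, q):
    """Euclidean distance between two node distributions (0 = identical)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def tanimoto_coefficient(p, q):
    """Continuous Tanimoto coefficient (assumed form): p.q / (p.p + q.q - p.q)."""
    dot = sum(a * b for a, b in zip(p, q))
    return dot / (sum(a * a for a in p) + sum(b * b for b in q) - dot)


def bhattacharyya_coefficient(p, q):
    """Bhattacharyya coefficient: sum over nodes of sqrt(p_k * q_k) (1 = identical)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Toy example: two hypothetical libraries on a 4-node map.
library_a = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.7, 0.1, 0.1]]
library_b = [[0.25, 0.25, 0.25, 0.25], [0.25, 0.25, 0.25, 0.25]]

p_a = cumulated_responsibilities(library_a)
p_b = cumulated_responsibilities(library_b)

print("coverage A:", normalized_shannon_entropy(p_a))
print("coverage B:", normalized_shannon_entropy(p_b))
print("Euclidean:", euclidean_distance(p_a, p_b))
print("Tanimoto:", tanimoto_coefficient(p_a, p_b))
print("Bhattacharyya:", bhattacharyya_coefficient(p_a, p_b))
```

In this toy setting library B spreads evenly over the map and so has the higher coverage score, while library A concentrates on two nodes. On a real map the vectors would have one entry per GTM node and would be accumulated over all compounds of a supplier.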

Year:  2014        PMID: 25423612     DOI: 10.1021/ci500575y

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956

