
Using Graph Indices for the Analysis and Comparison of Chemical Datasets.

Denis Fourches, Alexander Tropsha.

Abstract

In cheminformatics, compounds are represented as points in a multidimensional space of chemical descriptors. When all pairs of points found within a certain distance threshold in the original high-dimensional chemistry space are connected by distance-labeled edges, the resulting data structure can be defined as a Dataset Graph (DG). We show that, as in the conventional description of organic molecules, many graph indices can also be computed for DGs. We demonstrate that chemical datasets can be effectively characterized and compared by computing simple graph indices such as the average vertex degree or the Randić connectivity index. This approach is used to characterize and quantify the similarity between different datasets or subsets of the same dataset (e.g., training, test, and external validation sets used in QSAR modeling). The freely available ADDAGRA program has been implemented to build and visualize DGs. The approach proposed and discussed in this report could be further explored and utilized for different cheminformatics applications such as dataset diversification by acquiring external compounds, dataset processing prior to QSAR modeling, or (dis)similarity modeling of multiple datasets studied in chemical genomics applications.
Copyright © 2013 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
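
Illustrative sketch (not part of the original record): the abstract describes connecting all pairs of compounds that lie within a distance threshold and then computing graph indices on the result. The short Python example below shows one way this could look. It is not the ADDAGRA implementation; the Euclidean metric, the inclusive threshold comparison, the use of unweighted vertex degrees, the toy descriptor values, and all function names are assumptions made here for illustration.

```python
import math
from itertools import combinations


def build_dataset_graph(points, threshold):
    """Connect every pair of points whose distance is <= threshold.

    Assumption: Euclidean distance in descriptor space.
    Returns a dict mapping (i, j) with i < j to the distance label.
    """
    edges = {}
    for i, j in combinations(range(len(points)), 2):
        d = math.dist(points[i], points[j])
        if d <= threshold:
            edges[(i, j)] = d
    return edges


def vertex_degrees(n_vertices, edges):
    """Count the edges incident to each vertex (unweighted degree)."""
    deg = [0] * n_vertices
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg


def average_vertex_degree(n_vertices, edges):
    """Mean degree; each edge contributes to two vertices."""
    return 2 * len(edges) / n_vertices if n_vertices else 0.0


def randic_index(n_vertices, edges):
    """Randic connectivity index: sum over edges of 1/sqrt(deg(u)*deg(v))."""
    deg = vertex_degrees(n_vertices, edges)
    return sum(1.0 / math.sqrt(deg[i] * deg[j]) for i, j in edges)


# Hypothetical toy "dataset": four compounds in a 2-D descriptor space.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
edges = build_dataset_graph(points, threshold=1.5)

print(sorted(edges))                             # edges of the dataset graph
print(average_vertex_degree(len(points), edges))
print(randic_index(len(points), edges))
```

In this toy case the first three points form a triangle and the fourth stays isolated, so a dense cluster and an outlier show up directly in the degree-based indices. Comparing such indices across two datasets built with the same threshold is the kind of comparison the abstract refers to.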

Keywords:  ADDAGRA; Chemical dataset graph; Graph indices; QSAR

Year:  2013        PMID: 27480235     DOI: 10.1002/minf.201300076

Source DB:  PubMed          Journal:  Mol Inform        ISSN: 1868-1743            Impact factor:   3.353


  10 in total

1.  Design of chemical space networks using a Tanimoto similarity variant based upon maximum common substructures.

Authors:  Bijun Zhang; Martin Vogt; Gerald M Maggiora; Jürgen Bajorath
Journal:  J Comput Aided Mol Des       Date:  2015-09-29       Impact factor: 3.686

2.  Comprehensive characterization of the Published Kinase Inhibitor Set.

Authors:  Jonathan M Elkins; Vita Fedele; Marta Szklarz; Kamal R Abdul Azeez; Eidarus Salah; Jowita Mikolajczyk; Sergei Romanov; Nikolai Sepetov; Xi-Ping Huang; Bryan L Roth; Ayman Al Haj Zen; Denis Fourches; Eugene Muratov; Alex Tropsha; Joel Morris; Beverly A Teicher; Mark Kunkel; Eric Polley; Karen E Lackey; Francis L Atkinson; John P Overington; Paul Bamborough; Susanne Müller; Daniel J Price; Timothy M Willson; David H Drewry; Stefan Knapp; William J Zuercher
Journal:  Nat Biotechnol       Date:  2015-10-26       Impact factor: 54.908

3.  Design and characterization of chemical space networks for different compound data sets.

Authors:  Magdalena Zwierzyna; Martin Vogt; Gerald M Maggiora; Jürgen Bajorath
Journal:  J Comput Aided Mol Des       Date:  2014-12-03       Impact factor: 3.686

4.  Comparison of bioactive chemical space networks generated using substructure- and fingerprint-based measures of molecular similarity.

Authors:  Bijun Zhang; Martin Vogt; Gerald M Maggiora; Jürgen Bajorath
Journal:  J Comput Aided Mol Des       Date:  2015-06-07       Impact factor: 3.686

5.  Exploring drug space with ChemMaps.com.

Authors:  Alexandre Borrel; Nicole C Kleinstreuer; Denis Fourches
Journal:  Bioinformatics       Date:  2018-11-01       Impact factor: 6.937

6.  Lessons learned from the design of chemical space networks and opportunities for new applications.

Authors:  Martin Vogt; Dagmar Stumpfe; Gerald M Maggiora; Jürgen Bajorath
Journal:  J Comput Aided Mol Des       Date:  2016-03-05       Impact factor: 3.686

7.  Data set modelability by QSAR.

Authors:  Alexander Golbraikh; Eugene Muratov; Denis Fourches; Alexander Tropsha
Journal:  J Chem Inf Model       Date:  2014-01-08       Impact factor: 4.956

8.  Machine learning in chemoinformatics and drug discovery (Review).

Authors:  Yu-Chen Lo; Stefano E Rensi; Wen Torng; Russ B Altman
Journal:  Drug Discov Today       Date:  2018-05-08       Impact factor: 7.851

9.  An automated framework for QSAR model building.

Authors:  Samina Kausar; Andre O Falcao
Journal:  J Cheminform       Date:  2018-01-16       Impact factor: 5.514

10.  Analysis and Comparison of Vector Space and Metric Space Representations in QSAR Modeling.

Authors:  Samina Kausar; Andre O Falcao
Journal:  Molecules       Date:  2019-04-30       Impact factor: 4.411

