Surrogate data--a secure way to share corporate data.

Igor V Tetko, Ruben Abagyan, Tudor I Oprea.

Abstract

The privacy of chemical structures is of paramount importance for the industrial sector, in particular for the pharmaceutical industry. At the same time, companies hold large amounts of physico-chemical and biological data that could be shared to improve our molecular understanding of pharmacokinetic and toxicological properties. Such sharing could improve the predictive power of models and shorten drug development time, particularly in the early phases of drug discovery. The current study provides theoretical limits on the information required to reverse-engineer molecules from generated descriptors and demonstrates that the information content of molecules can be less than one bit per atom. Thus, in theory, a single descriptor can be enough to completely disclose a molecular structure. Instead of sharing descriptors, we propose sharing surrogate data, which is simply the sharing of reliably predicted molecules. Surrogate data can provide the same information as the original set. We consider the practical application of this idea to the prediction of the lipophilicity of chemical compounds and demonstrate that surrogate and real (original) data provide similar prediction ability. Thus, our proposed strategy makes it possible to share not only descriptors but also complete collections of surrogate molecules without the danger of disclosing the underlying molecular structures.
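
The claim that a single descriptor can disclose a whole structure can be illustrated with a toy sketch. This is not the encoding analysed in the paper. The function names, the use of a SMILES string as the structure representation, and the 1.0 and 0.5 bit-per-atom figures are assumptions chosen for illustration (the abstract states only "less than one bit per atom").

```python
import math


def bits_required(n_atoms: int, bits_per_atom: float) -> int:
    """Total bits needed to identify a molecule at a given information density."""
    return math.ceil(n_atoms * bits_per_atom)


def smiles_to_descriptor(smiles: str) -> int:
    """Pack an ASCII SMILES string into one integer 'descriptor' (toy encoding)."""
    return int.from_bytes(smiles.encode("ascii"), "big")


def descriptor_to_smiles(value: int) -> str:
    """Recover the SMILES string from the integer produced above."""
    n_bytes = (value.bit_length() + 7) // 8
    return value.to_bytes(n_bytes, "big").decode("ascii")


# Hypothetical example: aspirin (13 heavy atoms) written as SMILES.
aspirin = "CC(=O)Oc1ccccc1C(=O)O"
descriptor = smiles_to_descriptor(aspirin)

# The single number round-trips to the full structure string.
recovered = descriptor_to_smiles(descriptor)

# At an assumed density of 1 bit per atom, 13 atoms need only 13 bits,
# far fewer than the 64 bits of one double-precision value.
budget = bits_required(13, 1.0)
```

The byte-wise packing here is far less compact than the limit quoted in the abstract, but it shows the underlying risk: any numeric descriptor with enough precision can act as a lossless carrier of the structure.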

Year:  2005        PMID: 16267691     DOI: 10.1007/s10822-005-9013-3

Source DB:  PubMed          Journal:  J Comput Aided Mol Des        ISSN: 0920-654X            Impact factor:   3.686


  10 in total

1.  New diversity calculations algorithms used for compound selection.

Authors:  Sergei V Trepalin; Vadim A Gerasimenko; Andrey V Kozyukov; Nikolay Ph Savchuk; Andrey A Ivaschenko
Journal:  J Chem Inf Comput Sci       Date:  2002 Mar-Apr

2.  Prediction of n-octanol/water partition coefficients from PHYSPROP database using artificial neural networks and E-state indices.

Authors:  I V Tetko; V Y Tanchuk; A E Villa
Journal:  J Chem Inf Comput Sci       Date:  2001 Sep-Oct

3.  The price of innovation: new estimates of drug development costs.

Authors:  Joseph A DiMasi; Ronald W Hansen; Henry G Grabowski
Journal:  J Health Econ       Date:  2003-03       Impact factor: 3.883

4.  An electrotopological-state index for atoms in molecules.

Authors:  L B Kier; L H Hall
Journal:  Pharm Res       Date:  1990-08       Impact factor: 4.200

5.  Neural network studies. 4. Introduction to associative neural networks.

Authors:  Igor V Tetko
Journal:  J Chem Inf Comput Sci       Date:  2002 May-Jun

6.  ZINC--a free database of commercially available compounds for virtual screening.

Authors:  John J Irwin; Brian K Shoichet
Journal:  J Chem Inf Model       Date:  2005 Jan-Feb       Impact factor: 4.956

7.  Application of ALOGPS to predict 1-octanol/water distribution coefficients, logP, and logD, of AstraZeneca in-house database.

Authors:  Igor V Tetko; Pierre Bruneau
Journal:  J Pharm Sci       Date:  2004-12       Impact factor: 3.534

8.  Application of ALOGPS 2.1 to predict log D distribution coefficient for Pfizer proprietary compounds.

Authors:  Igor V Tetko; Gennadiy I Poda
Journal:  J Med Chem       Date:  2004-11-04       Impact factor: 7.446

9.  Application of a pruning algorithm to optimize artificial neural networks for pharmaceutical fingerprinting.

Authors:  I V Tetko; A E Villa; T I Aksenova; W L Zielinski; J Brower; E R Collantes; W J Welsh
Journal:  J Chem Inf Comput Sci       Date:  1998 Jul-Aug

10.  Modeling of ion complexation and extraction using substructural molecular fragments.

Authors: 
Journal:  J Chem Inf Comput Sci       Date:  2000-05
  9 in total

1.  Descriptor collision and confusion: toward the design of descriptors to mask chemical structures.

Authors:  Cristian Bologa; Tharun Kumar Allu; Marius Olah; Michael A Kappler; Tudor I Oprea
Journal:  J Comput Aided Mol Des       Date:  2005-12-02       Impact factor: 3.686

2.  Bigger data, collaborative tools and the future of predictive drug discovery.

Authors:  Sean Ekins; Alex M Clark; S Joshua Swamidass; Nadia Litterman; Antony J Williams
Journal:  J Comput Aided Mol Des       Date:  2014-06-19       Impact factor: 3.686

3.  Enhancing Carbon Acid pKa Prediction by Augmentation of Sparse Experimental Datasets with Accurate AIBL (QM) Derived Values.

Authors:  Jeffrey Plante; Beth A Caine; Paul L A Popelier
Journal:  Molecules       Date:  2021-02-17       Impact factor: 4.411

4.  Artificial intelligence to deep learning: machine intelligence approach for drug discovery. (Review)

Authors:  Rohan Gupta; Devesh Srivastava; Mehar Sahu; Swati Tiwari; Rashmi K Ambasta; Pravir Kumar
Journal:  Mol Divers       Date:  2021-04-12       Impact factor: 3.364

5.  eTOXlab, an open source modeling framework for implementing predictive models in production environments.

Authors:  Pau Carrió; Oriol López; Ferran Sanz; Manuel Pastor
Journal:  J Cheminform       Date:  2015-03-11       Impact factor: 5.514

6.  Building attention and edge message passing neural networks for bioactivity and physical-chemical property prediction.

Authors:  M Withnall; E Lindelöf; O Engkvist; H Chen
Journal:  J Cheminform       Date:  2020-01-08       Impact factor: 5.514

7.  A survey of quantitative descriptions of molecular structure.

Authors:  Rajarshi Guha; Egon Willighagen
Journal:  Curr Top Med Chem       Date:  2012       Impact factor: 3.295

8.  BIGCHEM: Challenges and Opportunities for Big Data Analysis in Chemistry.

Authors:  Igor V Tetko; Ola Engkvist; Uwe Koch; Jean-Louis Reymond; Hongming Chen
Journal:  Mol Inform       Date:  2016-07-28       Impact factor: 3.353

9.  Development of an Infrastructure for the Prediction of Biological Endpoints in Industrial Environments. Lessons Learned at the eTOX Project.

Authors:  Manuel Pastor; Jordi Quintana; Ferran Sanz
Journal:  Front Pharmacol       Date:  2018-10-11       Impact factor: 5.810

