A general optimization protocol for molecular property prediction using a deep learning network.

Jen-Hao Chen, Yufeng Jane Tseng.

Abstract

The key to generating the best deep learning model for predicting molecular properties is to test and apply various optimization methods. While individual optimization methods from past works outside the pharmaceutical domain have each succeeded in improving model performance, greater improvement may be achieved when specific combinations of these methods and practices are applied. In this work, three high-performance optimization methods from the literature that have been shown to dramatically improve model performance in other fields are applied and discussed, resulting in a general procedure for generating optimized CNN models for different molecular properties. The three techniques are a dynamic batch size strategy for different enumeration ratios of the SMILES representation of compounds; Bayesian optimization for selecting the hyperparameters of a model; and feature learning using chemical features obtained by a feedforward neural network, which are concatenated with the learned molecular feature vector. A total of seven different molecular properties (water solubility, lipophilicity, hydration energy, electronic properties, blood-brain barrier permeability and inhibition) are used. We demonstrate how each of the three techniques affects the model and how the best model generally benefits from Bayesian optimization combined with dynamic batch size tuning.
© The Author(s) 2021. Published by Oxford University Press.
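
Two of the techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how batch size depends on the enumeration ratio, so the linear scaling rule in `dynamic_batch_size` is a hypothetical assumption, as are all function names, the single ReLU layer and the example weights. The sketch only shows the general shape of passing chemical features through a feedforward layer and concatenating the result with a learned molecular feature vector.

```python
def dense_relu(x, weights, bias):
    """One feedforward layer: ReLU(W x + b), written in plain Python."""
    return [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def combine_features(learned_vec, chem_features, weights, bias):
    """Concatenate the learned molecular vector with the chemical
    features after they pass through one feedforward layer."""
    return list(learned_vec) + dense_relu(chem_features, weights, bias)


def dynamic_batch_size(base_batch, enum_ratio, max_batch=1024):
    """Hypothetical rule (an assumption, not taken from the paper):
    scale the batch size linearly with the SMILES enumeration ratio,
    capped at max_batch."""
    return min(max_batch, base_batch * max(1, enum_ratio))


# Illustrative values only; real vectors would come from a trained CNN
# and from computed chemical descriptors.
learned_vec = [0.2, -0.1, 0.7]
chem_features = [1.0, 2.0]
weights = [[0.5, 0.25], [-1.0, 0.1]]
bias = [0.0, 0.1]

combined = combine_features(learned_vec, chem_features, weights, bias)
batch = dynamic_batch_size(base_batch=32, enum_ratio=4)
```

In this sketch the combined vector has the length of the learned vector plus the width of the feedforward layer, and it would be fed to the prediction head of the model. Bayesian optimization of hyperparameters is omitted here because it depends on the choice of surrogate model and search library.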

Keywords:  CNN; deep learning; drug discovery; optimization

Year: 2022    PMID: 34498673    PMCID: PMC8769690    DOI: 10.1093/bib/bbab367

Source DB: PubMed    Journal: Brief Bioinform    ISSN: 1467-5463    Impact factor: 11.622


