
Three-Dimensional Convolutional Neural Networks and a Cross-Docked Data Set for Structure-Based Drug Design.

Paul G Francoeur, Tomohide Masuda, Jocelyn Sunseri, Andrew Jia, Richard B Iovanisci, Ian Snyder, David R Koes.

Abstract

One of the main challenges in drug discovery is predicting protein-ligand binding affinity. Recently, machine learning approaches have made substantial progress on this task. However, current methods of model evaluation are overly optimistic in measuring generalization to new targets, and there does not exist a standard data set of sufficient size to compare performance between models. We present a new data set for structure-based machine learning, the CrossDocked2020 set, with 22.5 million poses of ligands docked into multiple similar binding pockets across the Protein Data Bank, and perform a comprehensive evaluation of grid-based convolutional neural network (CNN) models on this data set. We also demonstrate how the partitioning of the training data and test data can impact the results of models trained with the PDBbind data set, how performance improves by adding more lower-quality training data, and how training with docked poses imparts pose sensitivity to the predicted affinity of a complex. Our best performing model, an ensemble of five densely connected CNNs, achieves a root mean squared error of 1.42 and Pearson R of 0.612 on the affinity prediction task, an AUC of 0.956 at binding pose classification, and a 68.4% accuracy at pose selection on the CrossDocked2020 set. By providing data splits for clustered cross-validation and the raw data for the CrossDocked2020 set, we establish the first standardized data set for training machine learning models to recognize ligands in noncognate target structures while also greatly expanding the number of poses available for training. In order to facilitate community adoption of this data set for benchmarking protein-ligand binding affinity prediction, we provide our models, weights, and the CrossDocked2020 set at https://github.com/gnina/models.
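The clustered cross-validation and the affinity metrics (root mean squared error, Pearson R) mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how clusters are defined or how folds are built, so everything here is an assumption for illustration: the function names, the cluster labels, the greedy balancing rule, and the fold count are hypothetical and are not the authors' procedure.

```python
import math
from collections import defaultdict


def clustered_folds(cluster_of, n_folds=3):
    """Assign whole clusters to folds so that no cluster is split across folds.

    cluster_of: dict mapping a complex ID to a cluster label (hypothetical labels).
    Returns a dict mapping each complex ID to a fold index.
    """
    members = defaultdict(list)
    for complex_id, cluster in cluster_of.items():
        members[cluster].append(complex_id)

    fold_sizes = [0] * n_folds
    fold_of = {}
    # Greedy balancing (an assumption): largest clusters first,
    # each placed into the currently smallest fold.
    for cluster in sorted(members, key=lambda c: (-len(members[c]), str(c))):
        fold = fold_sizes.index(min(fold_sizes))
        for complex_id in members[cluster]:
            fold_of[complex_id] = fold
        fold_sizes[fold] += len(members[cluster])
    return fold_of


def rmse(pred, true):
    """Root mean squared error between predicted and reference values."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true))


def pearson_r(pred, true):
    """Pearson correlation coefficient between predicted and reference values."""
    n = len(true)
    mean_p, mean_t = sum(pred) / n, sum(true) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, true))
    sd_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    sd_t = math.sqrt(sum((t - mean_t) ** 2 for t in true))
    return cov / (sd_p * sd_t)


# Toy usage with made-up IDs and cluster labels.
cluster_of = {"a1": "kinase", "a2": "kinase", "b1": "protease", "c1": "gpcr"}
folds = clustered_folds(cluster_of, n_folds=3)
```

Keeping every member of a cluster in the same fold is what prevents near-identical targets from appearing in both training and test data, which is the source of the over-optimistic generalization estimates the abstract describes.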


Year: 2020; PMID: 32865404; PMCID: PMC8902699; DOI: 10.1021/acs.jcim.0c00411

Source DB: PubMed; Journal: J Chem Inf Model; ISSN: 1549-9596; Impact factor: 4.956


