Kei Terayama (1), Hiroaki Iwata (2), Mitsugu Araki (3), Yasushi Okuno (3,4), Koji Tsuda (1,5,6)
1. Department of Computational Biology and Medical Science, Graduate School of Frontier Sciences, The University of Tokyo, Chiba 277-8561, Japan.
2. Foundation for Biomedical Research and Innovation, Hyogo 650-0047, Japan.
3. RIKEN Advanced Institute for Computational Science, Hyogo 650-0047, Japan.
4. Department of Biomedical Data Intelligence, Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan.
5. Center for Materials Research by Information Integration, NIMS, Ibaraki 305-0047, Japan.
6. RIKEN Center for Advanced Intelligence Project, Tokyo 103-0027, Japan.
Abstract
Motivation: Fast and accurate prediction of protein-ligand binding structures is indispensable for structure-based drug design and for accurate estimation of the binding free energy of drug candidate molecules in drug discovery. Recently, accurate pose prediction methods based on short Molecular Dynamics (MD) simulations, such as MM-PBSA and MM-GBSA, have been used to select among generated docking poses. Because the molecular structures obtained from an MD simulation depend on the initial conditions, averaging over multiple runs started with different initial velocities improves the accuracy of binding pose prediction.

Results: This paper shows that a machine learning method called Best Arm Identification can optimally control the number of MD runs for each binding pose, which allows a correct binding pose to be identified with a minimum number of total runs. Our experiment using three proteins and eight inhibitors showed that the computational cost can be reduced substantially without sacrificing accuracy. The method can be applied to control all kinds of molecular simulations so as to obtain the best results under restricted computational resources.

Availability and implementation: Code and data are available on GitHub at https://github.com/tsudalab/bpbi.

Contact: terayama@cbms.k.u-tokyo.ac.jp or tsuda@k.u-tokyo.ac.jp.

Supplementary information: Supplementary data are available at Bioinformatics online.
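To make the idea of controlling the number of MD runs per pose concrete, the following is a minimal Python sketch of a fixed-budget best arm identification loop. It uses a simple optimistic index (sample mean plus an exploration bonus, in the spirit of UCB-E) and is not claimed to be the algorithm or code in the authors' repository. The names `identify_best_pose`, `run_md`, and `exploration`, as well as the energy values in the usage example, are hypothetical; `run_md` stands in for one short MD run followed by a scoring step, and a lower score is assumed to be better.

```python
import math
import random


def identify_best_pose(run_md, n_poses, budget, exploration=1.0):
    """Spend `budget` runs across poses; return (best_pose, counts, means).

    run_md(pose) is assumed to return one noisy score for that pose,
    where lower is better (e.g. an energy estimate from one short MD run).
    """
    if budget < n_poses:
        raise ValueError("budget must allow at least one run per pose")

    counts = [0] * n_poses   # number of runs spent on each pose
    sums = [0.0] * n_poses   # running sum of scores for each pose

    def pull(pose):
        sums[pose] += run_md(pose)
        counts[pose] += 1

    def index(pose):
        # Optimistic index: a low mean score is good, and poses that
        # have been tried only a few times receive a larger bonus.
        mean = sums[pose] / counts[pose]
        return -mean + math.sqrt(exploration / counts[pose])

    # One initial run per pose so that every mean is defined.
    for pose in range(n_poses):
        pull(pose)

    # Spend the remaining budget on whichever pose looks most promising.
    for _ in range(budget - n_poses):
        pull(max(range(n_poses), key=index))

    means = [s / c for s, c in zip(sums, counts)]
    best = min(range(n_poses), key=means.__getitem__)
    return best, counts, means


if __name__ == "__main__":
    rng = random.Random(0)
    # Made-up "true" scores for four docking poses; pose 1 is the best.
    true_score = [-30.0, -42.0, -35.0, -28.0]

    def simulate(pose):
        # Stand-in for an MD run: the true score plus Gaussian noise.
        return rng.gauss(true_score[pose], 3.0)

    best, counts, means = identify_best_pose(
        simulate, n_poses=4, budget=40, exploration=25.0
    )
    print("selected pose:", best)
    print("runs per pose:", counts)
```

The point of the sketch is the allocation pattern: poses whose average score is clearly poor receive few runs, while the budget concentrates on poses that remain plausible candidates. The `exploration` value trades off between the two and would need tuning to the noise level of the actual scoring method.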