Marcin Kowiel (1,2), Dariusz Brzezinski (2,3), Przemyslaw J Porebski (2,4), Ivan G Shabalin (2,4), Mariusz Jaskolski (1,5), Wladek Minor (2,4)
1. Center for Biocrystallographic Research, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Poznan, Poland.
2. Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA, USA.
3. Institute of Computing Science, Poznan University of Technology, Poznan, Poland.
4. Center for Structural Genomics of Infectious Diseases (CSGID), University of Virginia, Charlottesville, VA, USA.
5. Department of Crystallography, Faculty of Chemistry, A. Mickiewicz University, Poznan, Poland.
Abstract
Motivation: The correct identification of ligands in crystal structures of protein complexes is the cornerstone of structure-guided drug design. However, cognitive bias can sometimes mislead investigators into modeling fictitious compounds without solid support from the electron density maps. Ligand identification can be aided by automatic methods, but existing approaches are based on time-consuming iterative fitting.
Results: Here we report a new machine learning algorithm called CheckMyBlob that identifies ligands from experimental electron density maps. In benchmark tests on portfolios of up to 219 931 ligand binding sites containing the 200 most popular ligands found in the Protein Data Bank, CheckMyBlob markedly outperforms the existing automatic methods for ligand identification, in some cases doubling the recognition rates, while requiring significantly less time. Our work shows that machine learning can improve the automation of structure modeling and significantly accelerate the drug screening process of macromolecule-ligand complexes.
Availability and implementation: Code and data are available on GitHub at https://github.com/dabrze/CheckMyBlob.
Supplementary information: Supplementary data are available at Bioinformatics online.