
Machine Learning in Computer-Aided Synthesis Planning.

Connor W Coley, William H Green, Klavs F Jensen.

Abstract

Computer-aided synthesis planning (CASP) is focused on the goal of accelerating the process by which chemists decide how to synthesize small molecule compounds. The ideal CASP program would take a molecular structure as input and output a sorted list of detailed reaction schemes that each connect that target to purchasable starting materials via a series of chemically feasible reaction steps. Early work in this field relied on expert-crafted reaction rules and heuristics to describe possible retrosynthetic disconnections and selectivity rules but suffered from incompleteness, infeasible suggestions, and human bias. With the relatively recent availability of large reaction corpora (such as the United States Patent and Trademark Office (USPTO), Reaxys, and SciFinder databases), consisting of millions of tabulated reaction examples, it is now possible to construct and validate purely data-driven approaches to synthesis planning. As a result, synthesis planning has been opened to machine learning techniques, and the field is advancing rapidly. In this Account, we focus on two critical aspects of CASP and recent machine learning approaches to both challenges. First, we discuss the problem of retrosynthetic planning, which requires a recommender system to propose synthetic disconnections starting from a target molecule. We describe how the search strategy, necessary to overcome the exponential growth of the search space with increasing number of reaction steps, can be assisted through a learned synthetic complexity metric. We also describe how the recursive expansion can be performed by a straightforward nearest neighbor model that makes clever use of reaction data to generate high quality retrosynthetic disconnections. Second, we discuss the problem of anticipating the products of chemical reactions, which can be used to validate proposed reactions in a computer-generated synthesis plan (i.e., reduce false positives) to increase the likelihood of experimental success. 
While we introduce this task in the context of reaction validation, its utility extends to the prediction of side products and impurities, among other applications. We describe neural network-based approaches that we and others have developed for this forward prediction task that can be trained on previously published experimental data. Machine learning and artificial intelligence have revolutionized a number of disciplines, not limited to image recognition, dictation, translation, content recommendation, advertising, and autonomous driving. While there is a rich history of using machine learning for structure-activity models in chemistry, it is only now that it is being successfully applied more broadly to organic synthesis and synthesis design. As reported in this Account, machine learning is rapidly transforming CASP, but there are several remaining challenges and opportunities, many pertaining to the availability and standardization of both data and evaluation metrics, which must be addressed by the community at large.
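The nearest neighbor idea for retrosynthetic expansion mentioned above can be illustrated with a toy sketch. The abstract does not specify the molecular representation or the similarity measure, so the Tanimoto similarity, the hand-written feature sets, and the names `tanimoto`, `rank_precedents`, and `PRECEDENTS` below are all assumptions made for illustration, not the authors' implementation. The sketch only shows the general pattern: compare a target to the products of known reactions and propose the disconnections of the most similar precedents.

```python
from typing import FrozenSet, List, Tuple

Features = FrozenSet[str]


def tanimoto(a: Features, b: Features) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


# Hypothetical precedent table: (features of a known product, disconnection label).
# A real system would derive fingerprints from a reaction corpus instead.
PRECEDENTS: List[Tuple[Features, str]] = [
    (frozenset({"amide", "aryl", "alkyl"}), "amide coupling"),
    (frozenset({"biaryl", "aryl", "halide"}), "Suzuki coupling"),
    (frozenset({"ester", "alkyl"}), "esterification"),
]


def rank_precedents(
    target: Features,
    precedents: List[Tuple[Features, str]],
    k: int = 2,
) -> List[Tuple[float, str]]:
    """Return the k precedents whose products are most similar to the target."""
    scored = [(tanimoto(target, feats), label) for feats, label in precedents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy target described by hand-picked (hypothetical) structural features.
target = frozenset({"amide", "aryl", "ester"})
top = rank_precedents(target, PRECEDENTS)
for score, label in top:
    print(f"{label}: {score:.2f}")
```

In a planner, each suggested disconnection would be applied to the target to generate precursors, and the same lookup would be repeated recursively on those precursors until purchasable starting materials are reached.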


Year: 2018    PMID: 29715002    DOI: 10.1021/acs.accounts.8b00087

Source DB: PubMed    Journal: Acc Chem Res    ISSN: 0001-4842    Impact factor: 22.384


