Development of a novel fingerprint for chemical reactions and its application to large-scale reaction classification and similarity.

Nadine Schneider, Daniel M Lowe, Roger A Sayle, Gregory A Landrum.

Abstract

Fingerprint methods applied to molecules have proven to be useful for similarity determination and as inputs to machine-learning models. Here, we present the development of a new fingerprint for chemical reactions and validate its usefulness in building machine-learning models and in similarity assessment. Our final fingerprint is constructed as the difference of the atom-pair fingerprints of products and reactants and includes agents via calculated physicochemical properties. We validated the fingerprints on a large data set of reactions text-mined from granted United States patents from the last 40 years that have been classified using a substructure-based expert system. We applied machine learning to build a 50-class predictive model for reaction-type classification that correctly predicts 97% of the reactions in an external test set. Impressive accuracies were also observed when applying the classifier to reactions from an in-house electronic laboratory notebook. The performance of the novel fingerprint for assessing reaction similarity was evaluated by a cluster analysis that recovered 48 out of 50 of the reaction classes with a median F-score of 0.63 for the clusters. The data sets used for training and primary validation as well as all Python scripts required to reproduce the analysis are provided in the Supporting Information.
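
The "difference of fingerprints" idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the feature labels below are made-up placeholders standing in for real atom-pair features (which would normally come from a cheminformatics toolkit), the function names `count_features` and `difference_fingerprint` are hypothetical, and the agent contribution via physicochemical properties is not shown.

```python
from collections import Counter


def count_features(molecules):
    """Sum feature counts over molecules, each given as a list of feature labels."""
    total = Counter()
    for features in molecules:
        total.update(features)
    return total


def difference_fingerprint(reactants, products):
    """Return product feature counts minus reactant feature counts.

    Features whose counts cancel out are dropped, so only the features
    that change during the reaction remain.
    """
    diff = count_features(products)
    diff.subtract(count_features(reactants))
    return {feature: count for feature, count in diff.items() if count != 0}


# Toy esterification (acid + alcohol -> ester, water omitted).
# The labels are illustrative placeholders, not real atom-pair codes.
reactants = [["C-C", "C=O", "C-OH"], ["C-C", "C-OH"]]
products = [["C-C", "C-C", "C=O", "C-O-C"]]

fp = difference_fingerprint(reactants, products)
print(fp)
```

In this toy case the unchanged features cancel, leaving a positive count for the feature gained in the product and a negative count for the feature lost from the reactants.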

Year:  2015        PMID: 25541888     DOI: 10.1021/ci5006614

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


