CNN-based two-branch multi-scale feature extraction network for retrosynthesis prediction.

Feng Yang, Juan Liu, Qiang Zhang, Zhihui Yang, Xiaolei Zhang.

Abstract

BACKGROUND: Retrosynthesis prediction is the task of deducing reactants from reaction products, and it is of great importance for designing synthesis routes to target products. To build prediction models, product molecules are generally represented with descriptors such as the simplified molecular input line entry system (SMILES) or molecular fingerprints. However, most existing models use only one molecular descriptor and treat it as a whole rather than mining multi-scale features from it, so they cannot fully exploit the information carried by molecules and their descriptors.
RESULTS: We propose a novel model to address these concerns. First, we build a new convolutional neural network (CNN) based feature extraction network that extracts multi-scale features from the molecular descriptors by using several filters of different sizes. Then, we use a two-branch feature extraction layer to fuse the multi-scale features of several molecular descriptors and perform retrosynthesis prediction without expert knowledge. In a comparison with other models on the benchmark USPTO-50k chemical dataset, our model surpasses the state-of-the-art model by 7.4%, 10.8%, 11.7% and 12.2% in top-1, top-3, top-5 and top-10 accuracy, respectively. Because compounds in metabolic reactions are much more difficult to featurize than those in chemical reactions, there is no related work on bioretrosynthesis prediction; we therefore further test the feasibility of our model on this task using the well-known MetaNetX metabolic dataset, achieving top-1, top-3, top-5 and top-10 accuracies of 45.2%, 67.0%, 73.6% and 82.2%, respectively.
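The multi-scale, two-branch idea described in RESULTS can be illustrated with a minimal sketch. It is not the authors' network: the filters are fixed averaging kernels standing in for learned weights, the kernel sizes (2, 3, 5) are arbitrary, and the function names and toy inputs (`conv1d_valid`, `multi_scale_features`, `two_branch_features`, the character-code SMILES encoding) are hypothetical. It only shows how filters of different widths over two descriptors might yield one fused feature vector.

```python
from typing import List, Sequence


def conv1d_valid(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Slide `kernel` over `signal` without padding and return the dot products."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def multi_scale_features(signal: Sequence[float],
                         kernel_sizes: Sequence[int] = (2, 3, 5)) -> List[float]:
    """Return one globally max-pooled feature per filter width.

    Assumption: each filter is a fixed averaging kernel, used here only as a
    stand-in for weights that a real CNN would learn. The signal must be at
    least as long as the largest kernel.
    """
    features = []
    for k in kernel_sizes:
        kernel = [1.0 / k] * k
        responses = conv1d_valid(signal, kernel)
        features.append(max(responses))  # global max pooling over positions
    return features


def two_branch_features(fingerprint: Sequence[float],
                        smiles_encoding: Sequence[float],
                        kernel_sizes: Sequence[int] = (2, 3, 5)) -> List[float]:
    """Fuse the two descriptor branches by concatenating their features."""
    return (multi_scale_features(fingerprint, kernel_sizes)
            + multi_scale_features(smiles_encoding, kernel_sizes))


# Toy inputs (hypothetical, not taken from the paper).
fingerprint = [0, 1, 1, 0, 1, 0, 0, 1]                     # fingerprint-like bit vector
smiles_encoding = [float(ord(c) % 7) for c in "CC(=O)O"]   # crude per-character code
fused = two_branch_features(fingerprint, smiles_encoding)
print(len(fused))  # one feature per (branch, kernel size) pair
```

Concatenation is only one possible fusion choice; the abstract does not say how the two branches are combined.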
CONCLUSION: The comparison on USPTO-50k indicates that our proposed model surpasses the existing state-of-the-art model. The evaluation on the MetaNetX dataset indicates that models used for retrosynthesis prediction can also be used for bioretrosynthesis prediction.
© 2022. The Author(s).
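The top-1, top-3, top-5 and top-10 accuracies quoted in the abstract are top-k metrics: a prediction counts as correct if the true reactant set appears among the model's k highest-ranked candidates. The sketch below shows one common way to compute this, assuming candidates and ground truth are compared by exact string match (for example on canonical SMILES). The function name `top_k_accuracy` and the toy labels are hypothetical, and the abstract does not give the authors' exact evaluation protocol.

```python
from typing import Sequence


def top_k_accuracy(ranked_predictions: Sequence[Sequence[str]],
                   targets: Sequence[str],
                   k: int) -> float:
    """Fraction of examples whose target is among the first k ranked candidates."""
    hits = sum(
        1
        for candidates, target in zip(ranked_predictions, targets)
        if target in candidates[:k]
    )
    return hits / len(targets)


# Toy ranked candidate lists (hypothetical labels, best candidate first).
ranked = [
    ["A", "B", "C"],
    ["B", "A", "C"],
    ["C", "B", "A"],
    ["B", "C", "A"],
]
truth = ["A", "A", "A", "A"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked, truth, k))
# prints:
# 1 0.25
# 2 0.5
# 3 1.0
```

Top-k accuracy never decreases as k grows, which matches the pattern of the figures reported for MetaNetX (45.2% at top-1 up to 82.2% at top-10).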

Keywords:  Convolutional neural network; Machine learning; Multi-scale features; Retrosynthesis prediction

Year:  2022        PMID: 36056300      PMCID: PMC9440582          DOI: 10.1186/s12859-022-04904-7

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.307


