Molecular Structure Extraction from Documents Using Deep Learning.

Joshua Staker, Kyle Marshall, Robert Abel, Carolyn M McQuaw.

Abstract

Chemical structure extraction from documents remains a hard problem because of both false positive identification of structures during segmentation and errors in the predicted structures. Current approaches rely on handcrafted rules and subroutines that perform reasonably well generally but still routinely encounter situations where recognition rates are not yet satisfactory and systematic improvement is challenging. Complications impacting the performance of current approaches include the diversity in visual styles used by various software to render structures, the frequent use of ad hoc annotations, and other challenges related to image quality, including resolution and noise. We present end-to-end deep learning solutions for both segmenting molecular structures from documents and predicting chemical structures from the segmented images. This deep-learning-based approach does not require any handcrafted features, is learned directly from data, and is robust against variations in image quality and style. Using the deep learning approach described herein, we show that it is possible to perform well on both segmentation and prediction of low-resolution images containing moderately sized molecules found in journal articles and patents.

Year:  2019        PMID: 30758950     DOI: 10.1021/acs.jcim.8b00669

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  6 in total

1.  SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer.

Authors:  Zhanpeng Xu; Jianhua Li; Zhaopeng Yang; Shiliang Li; Honglin Li
Journal:  J Cheminform       Date:  2022-07-01       Impact factor: 8.489

2.  RanDepict: Random chemical structure depiction generator.

Authors:  Henning Otto Brinkhaus; Kohulan Rajan; Achim Zielesny; Christoph Steinbeck
Journal:  J Cheminform       Date:  2022-06-06       Impact factor: 8.489

3.  Molecular representations in AI-driven drug discovery: a review and practical guide. [Review]

Authors:  Laurianne David; Amol Thakkar; Rocío Mercado; Ola Engkvist
Journal:  J Cheminform       Date:  2020-09-17       Impact factor: 5.514

4.  DECIMER: towards deep learning for chemical image recognition.

Authors:  Kohulan Rajan; Achim Zielesny; Christoph Steinbeck
Journal:  J Cheminform       Date:  2020-10-27       Impact factor: 5.514

5.  Review of techniques and models used in optical chemical structure recognition in images and scanned documents. [Review]

Authors:  Fidan Musazade; Narmin Jamalova; Jamaladdin Hasanov
Journal:  J Cheminform       Date:  2022-09-09       Impact factor: 8.489

6.  Combining Machine Learning and Computational Chemistry for Predictive Insights Into Chemical Systems.

Authors:  John A Keith; Valentin Vassilev-Galindo; Bingqing Cheng; Stefan Chmiela; Michael Gastegger; Klaus-Robert Müller; Alexandre Tkatchenko
Journal:  Chem Rev       Date:  2021-07-07       Impact factor: 60.622
