
ChemSchematicResolver: A Toolkit to Decode 2D Chemical Diagrams with Labels and R-Groups into Annotated Chemical Named Entities.

Edward J Beard, Jacqueline M Cole.

Abstract

The number of journal articles in the scientific domain has grown to the point where researchers can no longer capitalize on all findings in their discipline. These articles store information in several ways, including figures that describe important results. In organic chemistry, these figures often present chemical schematic diagrams that graphically define the structures of carbon-based compounds. These diagrams are intuitive for an expert to comprehend, but they are not designed for machines. This work presents ChemSchematicResolver, a software tool that identifies chemical schematic diagrams within the figures of a document, resolves any R-group substituents within them, and converts the resulting diagrams to a machine-readable format in a high-throughput, autonomous fashion. The tool includes a new algorithm for identifying relevant diagrams and a mechanism that combines these data with contextual information from the rest of the document to create highly relational databases. It supports a variety of general R-group structures, a first for any open-source chemical schematic diagram extraction tool. It is presented alongside a self-generated evaluation set, on which the most important assessment metric, precision, reached 83-100% for all assessed areas. The ChemSchematicResolver tool is released under the MIT license and is available to download from www.chemschematicresolver.org.

Year:  2020        PMID: 32212690     DOI: 10.1021/acs.jcim.0c00042

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  2 in total

1.  DECIMER-hand-drawn molecule images dataset.

Authors:  Henning Otto Brinkhaus; Achim Zielesny; Christoph Steinbeck; Kohulan Rajan
Journal:  J Cheminform       Date:  2022-06-09       Impact factor: 8.489

2.  Review of techniques and models used in optical chemical structure recognition in images and scanned documents. (Review)

Authors:  Fidan Musazade; Narmin Jamalova; Jamaladdin Hasanov
Journal:  J Cheminform       Date:  2022-09-09       Impact factor: 8.489
