CLiDE Pro: the latest generation of CLiDE, a tool for optical chemical structure recognition.

Aniko T Valko, A Peter Johnson.

Abstract

We present CLiDE Pro, the latest product of the long-term CLiDE project, which develops tools for the automatic extraction of chemical information from the literature. CLiDE Pro extracts chemical structure and generic structure information from electronic images of molecules available online and from pages of scanned chemical documents. The information is extracted in three phases: first, the image is segmented into text and graphical regions; next, the graphical regions are analyzed and, where possible, the connection tables are reconstructed; finally, any generic structures are interpreted by matching R-groups found in the structure diagrams with those located in the text. The program has been tested on a large set of images of chemical structures originating from various sources. The results demonstrate good performance in the reconstruction of connection tables, with few errors in the interpretation of the individual drawing features found in the structure diagrams. The full test set is presented for use in the validation of other similar systems.
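As an illustration of the third phase only, the following is a minimal sketch of how R-group labels found in a diagram might be paired with definitions found in the surrounding text. It is not CLiDE Pro's implementation: the function names, the `R1 = Me, Et` definition syntax, and the regular expression are all assumptions made for this example.

```python
import re

# Hypothetical pattern for text definitions such as "R1 = Me, Et".
# This syntax is an assumption, not the grammar used by CLiDE Pro.
R_GROUP_DEF = re.compile(r"\b(R\d*)\s*=\s*([^;\n]+)")


def parse_r_group_definitions(text):
    """Map each R-group label in the text to its list of substituents."""
    definitions = {}
    for label, values in R_GROUP_DEF.findall(text):
        definitions[label] = [v.strip() for v in values.split(",") if v.strip()]
    return definitions


def match_r_groups(diagram_labels, text):
    """Pair labels found in a structure diagram with definitions from the text.

    Returns (matched, unmatched): matched maps label -> substituents,
    unmatched lists diagram labels that have no definition in the text.
    """
    definitions = parse_r_group_definitions(text)
    matched = {lbl: definitions[lbl] for lbl in diagram_labels if lbl in definitions}
    unmatched = [lbl for lbl in diagram_labels if lbl not in definitions]
    return matched, unmatched
```

For example, calling `match_r_groups(["R1", "R3"], "R1 = Me, Et; R2 = H, OH")` pairs `R1` with its two substituents and reports `R3` as unmatched, so a generic structure with an undefined label can be flagged rather than silently expanded.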

Year:  2009        PMID: 19298076     DOI: 10.1021/ci800449t

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  12 in total

1.  Silver threads.

Authors:  Wendy A Warr
Journal:  J Comput Aided Mol Des       Date:  2011-12-09       Impact factor: 3.686

2.  Many InChIs and quite some feat.

Authors:  Wendy A Warr
Journal:  J Comput Aided Mol Des       Date:  2015-06-17       Impact factor: 3.686

3.  SwinOCSR: end-to-end optical chemical structure recognition using a Swin Transformer.

Authors:  Zhanpeng Xu; Jianhua Li; Zhaopeng Yang; Shiliang Li; Honglin Li
Journal:  J Cheminform       Date:  2022-07-01       Impact factor: 8.489

4.  DECIMER-hand-drawn molecule images dataset.

Authors:  Henning Otto Brinkhaus; Achim Zielesny; Christoph Steinbeck; Kohulan Rajan
Journal:  J Cheminform       Date:  2022-06-09       Impact factor: 8.489

5.  Tunable machine vision-based strategy for automated annotation of chemical databases.

Authors:  Jungkap Park; Gus R Rosania; Kazuhiro Saitou
Journal:  J Chem Inf Model       Date:  2009-08       Impact factor: 4.956

6.  SCRIPDB: a portal for easy access to syntheses, chemicals and reactions in patents.

Authors:  Abraham Heifets; Igor Jurisica
Journal:  Nucleic Acids Res       Date:  2011-11-08       Impact factor: 16.971

7.  Machines first, humans second: on the importance of algorithmic interpretation of open chemistry data.

Authors:  Alex M Clark; Antony J Williams; Sean Ekins
Journal:  J Cheminform       Date:  2015-03-22       Impact factor: 5.514

8.  The creation and characterisation of a National Compound Collection: the Royal Society of Chemistry pilot.

Authors:  David M Andrews; Laura M Broad; Paul J Edwards; David N A Fox; Timothy Gallagher; Stephen L Garland; Richard Kidd; Joseph B Sweeney
Journal:  Chem Sci       Date:  2016-02-23       Impact factor: 9.825

9.  ChemEx: information extraction system for chemical data curation.

Authors:  Atima Tharatipyakul; Somrak Numnark; Duangdao Wichadakul; Supawadee Ingsriswang
Journal:  BMC Bioinformatics       Date:  2012-12-13       Impact factor: 3.169

10.  SureChEMBL: a large-scale, chemically annotated patent document database.

Authors:  George Papadatos; Mark Davies; Nathan Dedman; Jon Chambers; Anna Gaulton; James Siddle; Richard Koks; Sean A Irvine; Joe Pettersson; Nicko Goncharoff; Anne Hersey; John P Overington
Journal:  Nucleic Acids Res       Date:  2015-11-17       Impact factor: 16.971
