Automated annotation of chemical names in the literature with tunable accuracy.

Jun D Zhang, Lewis Y Geer, Evan E Bolton, Stephen H Bryant.

Abstract

BACKGROUND: A significant portion of the biomedical and chemical literature refers to small molecules. Accurately identifying and annotating the compound names that are relevant to the topic of a given publication can establish links between scientific publications and various chemical and life science databases. Manual annotation is the preferred method for this work because well-trained indexers can understand a paper's topic as well as recognize key terms. However, considering the hundreds of thousands of new papers published annually, an automatic annotation system with high precision and relevance can be a useful complement to manual annotation.
RESULTS: An automated chemical name annotation system, MeSH Automated Annotations (MAA), was developed to annotate small molecule names in scientific abstracts with tunable accuracy. This system aims to reproduce the MeSH term annotations that indexers would create for biomedical and chemical literature. When automated free-text matching was compared with manual indexing for 26 thousand MEDLINE abstracts, more than 40% of the annotations were false-positive (FP) cases. To reduce the FP rate, MAA incorporates several filters that remove "incorrect" annotations caused by nonspecific, partial, and low-relevance chemical names. In part, relevance was measured by the position of the chemical name in the text. Tunable accuracy was obtained by expanding or restricting the sections of the text scanned for chemical names. The best precision obtained was 96%, with a 28% recall rate. The best performance of MAA, as measured with the F statistic, was 66%, which compares favorably with other chemical name annotation systems.
CONCLUSIONS: Accurate chemical name annotation can help researchers not only identify important chemical names in abstracts, but also match unindexed and unstructured abstracts to chemical records. The current work was tested against MEDLINE, but the algorithm is not specific to this corpus, and it may be applicable to papers from chemical physics, materials science, polymer science, and environmental science, as well as to patents, biological assay descriptions, and other textual data.
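
The trade-off described in RESULTS, where scanning fewer sections of the text raises precision at the cost of recall, can be illustrated with a minimal sketch. This is not the MAA implementation: the three-name dictionary, the `title`/`body` section split, the toy document, and the gold annotations are all hypothetical, and the F statistic is assumed here to be the balanced F1 score, which the abstract does not state.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary; the real system targets MeSH terms.
CHEMICAL_NAMES: Set[str] = {"aspirin", "ibuprofen", "caffeine"}


def annotate(sections: Dict[str, str], scanned: List[str]) -> Set[str]:
    """Return dictionary names found as whole words in the scanned sections."""
    found: Set[str] = set()
    for section in scanned:
        text = sections.get(section, "").lower()
        words = set(re.findall(r"[a-z0-9]+", text))
        found |= CHEMICAL_NAMES & words
    return found


def precision_recall_f1(predicted: Set[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Score predicted annotations against manual (gold) annotations."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    # Assumption: balanced F1, the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical abstract and indexer annotations.
doc = {
    "title": "Aspirin and cardiovascular risk",
    "body": "Patients received aspirin or ibuprofen. "
            "Caffeine intake was recorded as a covariate.",
}
gold = {"aspirin", "ibuprofen"}

narrow = annotate(doc, ["title"])          # title only: fewer, more relevant hits
wide = annotate(doc, ["title", "body"])    # whole text: more hits, more false positives

print("narrow", precision_recall_f1(narrow, gold))
print("wide", precision_recall_f1(wide, gold))
```

In this toy case, scanning only the title misses one gold annotation but returns no false positives, while scanning the whole text recovers every gold annotation and also picks up a low-relevance name. Which setting scores better on F1 depends entirely on the data.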

Year: 2011 PMID: 22107874 PMCID: PMC3281788 DOI: 10.1186/1758-2946-3-52

Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
