

NERChem: adapting NERBio to chemical patents via full-token features and named entity feature with chemical sub-class composition.

Richard Tzong-Han Tsai, Yu-Cheng Hsiao, Po-Ting Lai.

Abstract

Chemical patents contain detailed information on novel chemical compounds that is valuable to the chemical and pharmaceutical industries. In this paper, we introduce NERChem, a system that recognizes chemical named entity mentions in chemical patents. NERChem is based on the conditional random fields (CRF) model. Our approach incorporates (1) class composition, which combines chemical classes whose naming conventions are similar; (2) BioNE features, which distinguish chemical mentions from other biomedical NE mentions in the patents; and (3) full-token word features, which resolve the tokenization granularity problem. We evaluated our approach on the BioCreative V CHEMDNER-patent corpus and achieved an F-score of 87.17% in the Chemical Entity Mention in Patents (CEMP) task and a sensitivity of 98.58% in the Chemical Passage Detection (CPD) task, ranking alongside the top systems. Database URL: Our NERChem web-based system is publicly available at iisrserv.csie.ncu.edu.tw/nerchem.
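As a rough illustration of two of the ideas named in the abstract, the sketch below shows how class composition and full-token word features might be prepared as input to a CRF tagger. It is not the NERChem implementation: the grouping in `COMPOSED_CLASS`, the composed class names, the tokenizer regex, and the feature names are all assumptions made for this example. Only the sub-class names follow the CHEMDNER annotation scheme.

```python
import re

# Hypothetical grouping of CHEMDNER sub-classes into composed classes.
# Which classes are merged here is an assumption for illustration only.
COMPOSED_CLASS = {
    "SYSTEMATIC": "SYS_LIKE",
    "FORMULA": "SYS_LIKE",
    "TRIVIAL": "NAME_LIKE",
    "FAMILY": "NAME_LIKE",
    "ABBREVIATION": "SHORT_LIKE",
    "IDENTIFIER": "SHORT_LIKE",
}

def compose_label(bio_label):
    """Map a fine-grained BIO label such as 'B-FORMULA' to its composed class."""
    if bio_label == "O":
        return "O"
    prefix, sub_class = bio_label.split("-", 1)
    # Sub-classes without a mapping are kept unchanged.
    return f"{prefix}-{COMPOSED_CLASS.get(sub_class, sub_class)}"

# Fine-grained tokenization: runs of letters, runs of digits, or one symbol.
FINE_TOKEN = re.compile(r"[A-Za-z]+|\d+|[^\sA-Za-z\d]")

def token_features(sentence):
    """Return one feature dict per fine-grained token.

    Each dict also carries the whitespace-delimited token the piece came
    from, so the model can still see the coarse token after the split.
    """
    features = []
    for full_token in sentence.split():
        for piece in FINE_TOKEN.findall(full_token):
            features.append({
                "word": piece.lower(),
                "full_token": full_token.lower(),  # full-token word feature
                "is_digit": piece.isdigit(),
            })
    return features
```

With this sketch, `token_features("2-chloroethanol was added")` splits the first word into three pieces that all share the same `full_token` value, which is one way of giving a finely tokenized sequence access to coarser word identity.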


Year: 2016 PMID: 31414701 PMCID: PMC5091336 DOI: 10.1093/database/baw135

Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
