
CHEMDNER system with mixed conditional random fields and multi-scale word clustering.

Yanan Lu, Donghong Ji, Xiaoyuan Yao, Xiaomei Wei, Xiaohui Liang.

Abstract

BACKGROUND: Chemical compound and drug name recognition plays an important role in chemical text mining and is the basis for automatic relation extraction and event identification in chemical information processing. A high-performance named entity recognition system for chemical compound and drug names is therefore necessary.
METHODS: We developed a CHEMDNER system based on mixed conditional random fields (CRF) with word clustering for chemical compound and drug name recognition. For the word clustering, we used Brown's hierarchical algorithm and the Skip-gram model based on deep learning, both applied to a large collection of PubMed articles, including titles and abstracts.
RESULTS: The system achieved the highest F-score, 88.20%, for the CDI task and the second-highest F-score, 87.11%, for the CEM task in BioCreative IV. Multi-scale clustering based on deep learning further improved performance, to an F-score of 88.71% for CDI and 88.06% for CEM.
CONCLUSIONS: The mixed CRF model represents both the internal complexity and the external contexts of the entities, and it is integrated with word clustering to capture domain knowledge from PubMed articles, including titles and abstracts. This domain knowledge helps to sustain entity recognition performance even without fine-grained linguistic features or manually designed rules.
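
The abstract does not spell out how word clusters enter the CRF, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes that "multi-scale" means exposing each token's cluster identity at several granularities: Brown cluster paths truncated to several prefix lengths, and embedding-based cluster ids for several cluster counts. The lookup tables, prefix lengths, cluster counts, and feature names are all hypothetical; in practice the tables would be produced by running Brown clustering and Skip-gram training over a large corpus.

```python
from typing import Dict, List

# Hypothetical Brown cluster paths (bit strings from the cluster hierarchy).
# Words with a longer shared prefix sit closer together in the hierarchy.
BROWN_PATHS: Dict[str, str] = {
    "aspirin": "110100",
    "ibuprofen": "110101",
    "inhibits": "0110",
}

# Hypothetical cluster ids obtained by clustering Skip-gram vectors at two
# granularities (the cluster counts 100 and 1000 are assumptions).
EMBEDDING_CLUSTERS: Dict[int, Dict[str, int]] = {
    100: {"aspirin": 17, "ibuprofen": 17, "inhibits": 42},
    1000: {"aspirin": 311, "ibuprofen": 305, "inhibits": 870},
}

# Assumed prefix lengths; short prefixes give coarse classes, long ones fine.
PREFIX_LENGTHS = (2, 4, 6)


def token_features(tokens: List[str], i: int) -> Dict[str, str]:
    """Build a feature dict for token i, including multi-scale cluster features."""
    word = tokens[i].lower()
    feats: Dict[str, str] = {"word": word}

    # Brown cluster features: one feature per prefix length.
    path = BROWN_PATHS.get(word)
    if path is not None:
        for n in PREFIX_LENGTHS:
            feats[f"brown_{n}"] = path[:n]

    # Embedding cluster features: one feature per granularity.
    for k, table in EMBEDDING_CLUSTERS.items():
        if word in table:
            feats[f"emb_k{k}"] = str(table[word])

    # Minimal external context: the neighbouring words.
    if i > 0:
        feats["prev_word"] = tokens[i - 1].lower()
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1].lower()
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, str]]:
    """Feature dicts for every token in a sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for feats in sentence_features(["Aspirin", "inhibits", "COX"]):
        print(feats)
```

A list of per-token feature dicts like this is the input shape that dict-based CRF toolkits typically expect. In this toy table, "aspirin" and "ibuprofen" share the coarse features but differ at the finest scale, which is the kind of graded similarity that cluster features are meant to give the model for words that are rare or unseen in the annotated training data.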


Keywords:  chemical named entity recognition; deep learning; mixed conditional random fields; word clustering

Year:  2015        PMID: 25810775      PMCID: PMC4331694          DOI: 10.1186/1758-2946-7-S1-S4

Source DB:  PubMed          Journal:  J Cheminform        ISSN: 1758-2946            Impact factor:   5.514


