Boosting performance of gene mention tagging system by hybrid methods.

Lishuang Li, Wenting Fan, Degen Huang, Yanzhong Dang, Jing Sun.

Abstract

Named entity recognition (NER) in biomedical literature is an active research problem in natural language processing (NLP). To achieve higher performance, a hybrid experimental framework is presented for the gene mention tagging task. Six classifiers are first constructed with four toolkits (CRF++, YamCha, Maximum Entropy (ME) and MALLET) using different training methods and feature sets, and are then combined with each of three hybrid methods: a simple set operation method, a voting method and a two-layer stacking method. Experiments on the BioCreative II GM task corpus show that, without any post-processing, the three hybrid methods achieve F-measures of 87.40%, 87.31% and 87.70%, respectively, each higher than that of any single classifier. Our best hybrid method (two-layer stacking) achieves an F-measure of 88.42% after post-processing, outperforming most state-of-the-art systems. We also discuss how the number, performance and divergence of the single classifiers affect the ensemble system in each hybrid method, and analyze why our hybrid models improve performance.
Copyright © 2011 Elsevier Inc. All rights reserved.
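
The abstract names the combination strategies but does not specify them, so the following is only a minimal sketch of how two of them (set operations and voting) and the F-measure could look in Python. Everything here is an assumption made for illustration: the function names, the representation of mentions as `(start, end)` spans, the BIO tag scheme, and the tie-breaking rule. It is not the authors' implementation, and the two-layer stacking method is not shown.

```python
from collections import Counter


def combine_by_sets(span_sets, mode="union"):
    """Combine gene-mention span sets from several classifiers (hypothetical helper).

    Union tends to favor recall; intersection tends to favor precision.
    """
    span_sets = [set(s) for s in span_sets]
    if not span_sets:
        return set()
    if mode == "union":
        return set().union(*span_sets)
    if mode == "intersection":
        return set.intersection(*span_sets)
    raise ValueError(f"unknown mode: {mode}")


def majority_vote(tag_sequences):
    """Token-level plurality vote over aligned tag sequences.

    Ties are broken in favor of the earliest-listed classifier (an assumption).
    The voted sequence is not guaranteed to be a valid BIO sequence.
    """
    voted = []
    for tags in zip(*tag_sequences):
        counts = Counter(tags)
        best = max(counts.values())
        voted.append(next(t for t in tags if counts[t] == best))
    return voted


def f_measure(predicted, gold):
    """Balanced F-measure (harmonic mean of precision and recall) over exact spans."""
    true_positives = len(set(predicted) & set(gold))
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Toy inputs: three classifiers' spans and tag sequences for one sentence.
spans = [{(0, 2), (5, 6)}, {(0, 2), (8, 9)}, {(0, 2), (5, 6)}]
tags = [["B", "I", "O"], ["B", "O", "O"], ["O", "I", "O"]]

union_spans = combine_by_sets(spans, "union")
common_spans = combine_by_sets(spans, "intersection")
voted_tags = majority_vote(tags)
```

In this sketch, scoring a combined span set against gold annotations would be a call such as `f_measure(union_spans, gold_spans)`, where `gold_spans` is a hypothetical set of reference spans.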

Year:  2011        PMID: 22056694     DOI: 10.1016/j.jbi.2011.10.004

Source DB:  PubMed          Journal:  J Biomed Inform        ISSN: 1532-0464            Impact factor:   6.317


  5 in total

1.  Assessing the impact of case sensitivity and term information gain on biomedical concept recognition.

Authors:  Tudor Groza; Karin Verspoor
Journal:  PLoS One       Date:  2015-03-19       Impact factor: 3.240

2.  A multistage gene normalization system integrating multiple effective methods.

Authors:  Lishuang Li; Shanshan Liu; Lihua Li; Wenting Fan; Degen Huang; Huiwei Zhou
Journal:  PLoS One       Date:  2013-12-12       Impact factor: 3.240

3.  Supervised segmentation of phenotype descriptions for the human skeletal phenome using hybrid methods.

Authors:  Tudor Groza; Jane Hunter; Andreas Zankl
Journal:  BMC Bioinformatics       Date:  2012-10-15       Impact factor: 3.169

4.  Recognizing scientific artifacts in biomedical literature.

Authors:  Tudor Groza; Hamed Hassanzadeh; Jane Hunter
Journal:  Biomed Inform Insights       Date:  2013-04-02

5.  Mining skeletal phenotype descriptions from scientific literature.

Authors:  Tudor Groza; Jane Hunter; Andreas Zankl
Journal:  PLoS One       Date:  2013-02-08       Impact factor: 3.240
