Cost-aware active learning for named entity recognition in clinical text.

Qiang Wei, Yukun Chen, Mandana Salimi, Joshua C Denny, Qiaozhu Mei, Thomas A Lasko, Qingxia Chen, Stephen Wu, Amy Franklin, Trevor Cohen, Hua Xu.

Abstract

OBJECTIVE: Active Learning (AL) attempts to reduce annotation cost (ie, time) by selecting the most informative examples for annotation. Most approaches tacitly (and unrealistically) assume that the cost for annotating each sample is identical. This study introduces a cost-aware AL method, which simultaneously models both the annotation cost and the informativeness of the samples and evaluates both via simulation and user studies.
MATERIALS AND METHODS: We designed a novel, cost-aware AL algorithm (Cost-CAUSE) for annotating clinical named entities; we first utilized lexical and syntactic features to estimate annotation cost, then we incorporated this cost measure into an existing AL algorithm. Using the 2010 i2b2/VA data set, we then conducted a simulation study comparing Cost-CAUSE with noncost-aware AL methods, and a user study comparing Cost-CAUSE with passive learning.
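The general idea of weighing informativeness against estimated annotation cost can be sketched as follows. This is not the Cost-CAUSE algorithm itself, whose details are not given in this abstract. It is a minimal benefit-per-cost ranking in which the `Candidate` fields, the linear form of `estimated_cost`, and all coefficient values are hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    sentence_id: str
    n_tokens: int          # lexical feature (assumed): sentence length
    n_noun_phrases: int    # syntactic feature (assumed): candidate entity spans
    uncertainty: float     # informativeness score from the current model


def estimated_cost(c, base=2.0, per_token=0.3, per_np=1.5):
    """Hypothetical linear cost model: predicted annotation time in seconds.

    The coefficients are illustrative placeholders, not values from the study.
    """
    return base + per_token * c.n_tokens + per_np * c.n_noun_phrases


def rank_by_benefit_per_cost(candidates):
    """Order candidates by informativeness per estimated second, best first."""
    return sorted(
        candidates,
        key=lambda c: c.uncertainty / estimated_cost(c),
        reverse=True,
    )


# Illustrative pool; these numbers are made up for the example.
pool = [
    Candidate("s1", n_tokens=40, n_noun_phrases=6, uncertainty=0.9),
    Candidate("s2", n_tokens=8, n_noun_phrases=1, uncertainty=0.5),
    Candidate("s3", n_tokens=15, n_noun_phrases=2, uncertainty=0.2),
]
ranked_ids = [c.sentence_id for c in rank_by_benefit_per_cost(pool)]
```

In this toy pool the long, highly uncertain sentence `s1` is ranked below the short sentence `s2`, because its estimated cost outweighs its higher uncertainty. A cost-blind uncertainty sampler would pick `s1` first.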
RESULTS: Our cost model fit empirical annotation data well, and Cost-CAUSE increased the simulation area under the learning curve (ALC) scores by up to 5.6% and 4.9%, compared with random sampling and alternate AL methods. Moreover, in a user annotation task, Cost-CAUSE outperformed passive learning on the ALC score and reduced annotation time by 20.5%-30.2%.
DISCUSSION: Although AL has proven effective in simulations, our user study shows that a real-world environment is far more complex. Other factors have a noticeable effect on the AL method, such as the annotation accuracy of users, the tiredness of users, and even the physical and mental condition of users.
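For readers unfamiliar with the ALC score, one common way to compute an area under a learning curve is the trapezoidal rule over (annotation time, model score) points, normalized by the time span. The abstract does not state the exact definition used in the study, so the sketch below is an assumption. The function name and the sample curves are hypothetical and are not data from the paper.

```python
def area_under_learning_curve(times, scores):
    """Trapezoidal area under a learning curve, normalized by the time span.

    times  -- strictly increasing annotation times (e.g. minutes)
    scores -- model performance (e.g. F-measure) measured at each time
    """
    if len(times) != len(scores) or len(times) < 2:
        raise ValueError("need at least two matching (time, score) points")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        if width <= 0:
            raise ValueError("times must be strictly increasing")
        # Area of the trapezoid between two consecutive measurements.
        area += width * (scores[i] + scores[i - 1]) / 2.0
    return area / (times[-1] - times[0])


# Made-up curves for illustration only.
minutes = [0, 10, 20, 30]
passive_alc = area_under_learning_curve(minutes, [0.0, 0.4, 0.6, 0.7])
active_alc = area_under_learning_curve(minutes, [0.0, 0.5, 0.7, 0.75])
```

With annotation time on the horizontal axis, a method whose curve rises faster earns a higher ALC even if both methods reach similar final scores, which is why the measure suits cost-sensitive comparisons.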
CONCLUSION: Cost-CAUSE saves significant annotation cost compared to random sampling.
© The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For permissions, please email: journals.permissions@oup.com.

Keywords:  active learning; electronic health records; named entity recognition; user study; natural language processing

Year:  2019        PMID: 31294792      PMCID: PMC6798575          DOI: 10.1093/jamia/ocz102

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


  7 in total

1.  2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text.

Authors:  Özlem Uzuner; Brett R South; Shuying Shen; Scott L DuVall
Journal:  J Am Med Inform Assoc       Date:  2011-06-16       Impact factor: 4.497

2.  Active learning reduces annotation time for clinical concept extraction.

Authors:  Mahnoosh Kholghi; Laurianne Sitbon; Guido Zuccon; Anthony Nguyen
Journal:  Int J Med Inform       Date:  2017-08-05       Impact factor: 4.046

3.  A study of active learning methods for named entity recognition in clinical text.

Authors:  Yukun Chen; Thomas A Lasko; Qiaozhu Mei; Joshua C Denny; Hua Xu
Journal:  J Biomed Inform       Date:  2015-09-15       Impact factor: 6.317

4.  Active learning: a step towards automating medical concept extraction.

Authors:  Mahnoosh Kholghi; Laurianne Sitbon; Guido Zuccon; Anthony Nguyen
Journal:  J Am Med Inform Assoc       Date:  2015-08-07       Impact factor: 4.497

5.  What do we mean by prediction in language comprehension?

Authors:  Gina R Kuperberg; T Florian Jaeger
Journal:  Lang Cogn Neurosci       Date:  2015-11-13       Impact factor: 2.331

6.  Clinical information extraction applications: A literature review. (Review)

Authors:  Yanshan Wang; Liwei Wang; Majid Rastegar-Mojarad; Sungrim Moon; Feichen Shen; Naveed Afzal; Sijia Liu; Yuqun Zeng; Saeed Mehrabi; Sunghwan Sohn; Hongfang Liu
Journal:  J Biomed Inform       Date:  2017-11-21       Impact factor: 6.317

7.  An active learning-enabled annotation system for clinical named entity recognition.

Authors:  Yukun Chen; Thomas A Lasko; Qiaozhu Mei; Qingxia Chen; Sungrim Moon; Jingqi Wang; Ky Nguyen; Tolulola Dawodu; Trevor Cohen; Joshua C Denny; Hua Xu
Journal:  BMC Med Inform Decis Mak       Date:  2017-07-05       Impact factor: 2.796
