Aggregating and Predicting Sequence Labels from Crowd Annotations.

An T Nguyen, Byron C Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease.

Abstract

Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.
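To make the aggregation task (1) concrete, the following is a minimal sketch of per-token majority voting over several annotators' label sequences. It is not the Hidden Markov Model variant proposed in the paper. The function name `majority_vote`, the BIO tag set, and the example annotations are all hypothetical and serve only to show the shape of the input and of the consensus output.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(annotations: Sequence[Sequence[str]]) -> List[str]:
    """Aggregate per-token labels from several annotators by majority vote.

    Each inner sequence holds one annotator's labels for the same tokens.
    Ties go to the label that appears first among the annotators.
    """
    if not annotations:
        return []
    length = len(annotations[0])
    if any(len(a) != length for a in annotations):
        raise ValueError("all annotators must label the same number of tokens")
    consensus = []
    for token_labels in zip(*annotations):
        # Counter keeps first-seen order, so most_common(1) resolves ties
        # toward the earliest annotator's label.
        consensus.append(Counter(token_labels).most_common(1)[0][0])
    return consensus


# Hypothetical example: three annotators tag the same five-token sentence.
crowd = [
    ["B-PER", "I-PER", "O", "O", "B-LOC"],
    ["B-PER", "O", "O", "O", "B-LOC"],
    ["B-PER", "I-PER", "O", "B-LOC", "O"],
]
print(majority_vote(crowd))  # ['B-PER', 'I-PER', 'O', 'O', 'B-LOC']
```

Voting treats every token independently, so it ignores dependencies between neighboring labels and weights all annotators equally. A sequence model such as an HMM can take those label-to-label dependencies into account.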

Year: 2017    PMID: 29093611    PMCID: PMC5662012    DOI: 10.18653/v1/P17-1028

Source DB: PubMed    Journal: Proc Conf Assoc Comput Linguist Meet    ISSN: 0736-587X


