| Literature DB >> 29093611 |
An T. Nguyen, Byron C. Wallace, Junyi Jessy Li, Ani Nenkova, Matthew Lease.
Abstract
Despite sequences being core to NLP, scant work has considered how to handle noisy sequence labels from multiple annotators for the same text. Given such annotations, we consider two complementary tasks: (1) aggregating sequential crowd labels to infer a best single set of consensus annotations; and (2) using crowd annotations as training data for a model that can predict sequences in unannotated text. For aggregation, we propose a novel Hidden Markov Model variant. To predict sequences in unannotated text, we propose a neural approach using Long Short-Term Memory. We evaluate a suite of methods across two different applications and text genres: Named-Entity Recognition in news articles and Information Extraction from biomedical abstracts. Results show improvement over strong baselines. Our source code and data are available online.
Year: 2017 PMID: 29093611 PMCID: PMC5662012 DOI: 10.18653/v1/P17-1028
Source DB: PubMed Journal: Proc Conf Assoc Comput Linguist Meet ISSN: 0736-587X
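The abstract contrasts its HMM-based aggregation against standard baselines. As context, the simplest such baseline is token-level majority voting over annotators' label sequences; the sketch below shows that baseline only (the paper's HMM variant additionally models annotator reliability and label transitions, which this illustration omits). The function name and data layout are assumptions for illustration, not from the paper.

```python
from collections import Counter

def majority_vote(crowd_labels):
    """Aggregate per-token labels from multiple annotators by majority vote.

    crowd_labels: a list of label sequences, one per annotator, all of
    the same length (e.g. BIO tags for NER). Returns one consensus
    sequence. This is the token-level baseline that sequence-aware
    aggregation methods (such as an HMM variant) aim to improve on.
    """
    n_tokens = len(crowd_labels[0])
    consensus = []
    for t in range(n_tokens):
        # Count each annotator's vote for token position t.
        votes = Counter(seq[t] for seq in crowd_labels)
        consensus.append(votes.most_common(1)[0][0])
    return consensus

# Example: three annotators label a two-token span; two of three
# mark the first token as the start of a person entity.
labels = [["B-PER", "O"], ["B-PER", "O"], ["O", "O"]]
print(majority_vote(labels))  # → ['B-PER', 'O']
```

Note that per-token voting can produce inconsistent tag sequences (e.g. an `I-PER` with no preceding `B-PER`), which is one motivation for modeling the whole sequence jointly, as the paper's HMM approach does.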