
Towards identifying drug side effects from social media using active learning and crowd sourcing.

Sophie Burkhardt, Julia Siekiera, Josua Glodde, Miguel A Andrade-Navarro, Stefan Kramer.

Abstract

MOTIVATION: Social media is a largely untapped source of information on drug side effects. Twitter in particular is widely used to report on everyday events and personal ailments. However, labeling this noisy data is difficult because labeled training data is sparse and automatic labeling is error-prone. Crowd sourcing can provide more reliable labels in such a scenario, but it is comparatively expensive because workers have to be paid. To remedy this, semi-supervised active learning may reduce the amount of labeled data needed and focus the manual labeling process on the most informative data.
RESULTS: We extracted data from Twitter using the public API. We then used Amazon Mechanical Turk in combination with a state-of-the-art semi-supervised active learning method to label tweets with their associated drugs and side effects in two stages. Our results show that this method is an effective way of discovering side effects in tweets, improving the F-measure from 53% to 67% compared to a one-stage workflow. Additionally, we show that the active learning scheme reduces the labeling cost in comparison to a non-active baseline.
AVAILABILITY: Code and data will be published on https://github.com/kramerlab.
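
The abstract does not say which query strategy the active learning method uses. As a generic illustration only, and not the authors' method, the following minimal sketch shows pool-based uncertainty sampling: the unlabeled items whose predicted probabilities are closest to 0.5 are sent for manual labeling first. The function names, tweet ids, and probabilities are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy in bits of a Bernoulli prediction with positive-class probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_for_labeling(probabilities: dict[str, float], budget: int) -> list[str]:
    """Return the ids of the `budget` most uncertain unlabeled items."""
    ranked = sorted(
        probabilities,
        key=lambda item_id: binary_entropy(probabilities[item_id]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical classifier outputs: probability that a tweet mentions a side effect.
pool = {"tweet_a": 0.97, "tweet_b": 0.52, "tweet_c": 0.10, "tweet_d": 0.35}

# With a budget of two labels, the two predictions nearest 0.5 are chosen.
to_label = select_for_labeling(pool, budget=2)
print(to_label)
```

In a crowd sourcing setting, the selected ids would be the ones submitted to paid workers, so confident predictions such as `tweet_a` do not consume labeling budget.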

Year: 2020
PMID: 31797607
Source DB: PubMed
Journal: Pac Symp Biocomput
ISSN: 2335-6928


