Karen O'Connor, Pranoti Pimpalkhute, Azadeh Nikfarjam, Rachel Ginn, Karen L Smith, Graciela Gonzalez.
Abstract
Recent research has shown that Twitter data analytics can have broad implications for public health research. However, its value for pharmacovigilance has received little study, with health-related forums and community support groups preferred for the task. We present a systematic study of tweets collected for 74 drugs to assess their value as sources of potential signals for adverse drug reactions (ADRs). We created an annotated corpus of 10,822 tweets. Each tweet was annotated for the presence or absence of ADR mentions, with the span and Unified Medical Language System (UMLS) concept ID noted for each ADR present. Using Cohen's kappa, we calculated the inter-annotator agreement (IAA) for the binary annotations to be 0.69. To demonstrate the utility of the corpus, we attempted a lexicon-based approach for concept extraction, with promising success (54.1% precision, 62.1% recall, and 57.8% F-measure). A subset of the corpus is freely available at: http://diego.asu.edu/downloads.
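For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how Cohen's kappa and the F-measure are conventionally computed. It is not the authors' code: the function names and the toy annotator labels are hypothetical, and the F-measure is assumed to be the balanced F1 (the harmonic mean of precision and recall), which is consistent with the reported figures.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def f_measure(precision, recall):
    """Balanced F-measure (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = tweet mentions an ADR, 0 = no ADR mention.
annotator_1 = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohen_kappa(annotator_1, annotator_2)

# Precision and recall as reported in the abstract.
f1_percent = round(100 * f_measure(0.541, 0.621), 1)
print(f"toy kappa = {kappa:.3f}, F-measure = {f1_percent}%")
```

Applied to the reported precision and recall, the harmonic mean reproduces the 57.8% F-measure given in the abstract; the kappa value printed here describes only the toy labels, not the corpus.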
Year: 2014 PMID: 25954400 PMCID: PMC4419871
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076