Azadeh Nikfarjam, Abeed Sarker, Karen O'Connor, Rachel Ginn, Graciela Gonzalez.
Abstract
OBJECTIVE: Social media is becoming increasingly popular as a platform for sharing personal health-related information. This information can be utilized for public health monitoring tasks, particularly for pharmacovigilance, via the use of natural language processing (NLP) techniques. However, the language in social media is highly informal, and user-expressed medical concepts are often nontechnical, descriptive, and challenging to extract. There has been limited progress in addressing these challenges, and thus far, advanced machine learning-based NLP techniques have been underutilized. Our objective is to design a machine learning-based approach to extract mentions of adverse drug reactions (ADRs) from highly informal text in social media.
Keywords: ADR; adverse drug reaction; deep learning word embeddings; machine learning; natural language processing; pharmacovigilance; social media mining
Year: 2015 PMID: 25755127 PMCID: PMC4457113 DOI: 10.1093/jamia/ocu041
Source DB: PubMed Journal: J Am Med Inform Assoc ISSN: 1067-5027 Impact factor: 4.497
Figure 1. Examples of user-posted drug reviews in Twitter (a) and DailyStrength (b).
Examples of the clusters learned in an unsupervised manner, with a subset of the words in each cluster; c_i is an integer between 0 and 149
| Cluster# | Semantic category | Examples of clustered words |
|---|---|---|
| c1 | Drug | Abilify, Adderall, Ambien, Ativan, aspirin, citalopram, Effexor, Paxil, … |
| c2 | Signs/Symptoms | hangover, headache, rash, hive, … |
| c3 | Signs/Symptoms | anxiety, depression, disorder, ocd, mania, stabilizer, … |
| c4 | Drug dosage | 1000 mg, 100 mg, .10, 10 mg, 600 mg, 0.25, .05, … |
| c5 | Treatment | anti-depressant, antidepressant, drug, med, medication, medicine, treat, … |
| c6 | Family member | brother, dad, daughter, father, husband, mom, mother, son, wife, … |
| c7 | Date | 1992, 2011, 2012, 23rd, 8th, April, Aug, August, December, … |
The “Semantic category” titles are manually assigned and are not used in the system.
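The table above suggests how a token can be turned into a categorical feature: look up the ID of the cluster it belongs to. The following is a minimal sketch of that idea, not the system's implementation. The `WORD_TO_CLUSTER` mapping, the helper names, the `UNK` fallback, and the one-token context window are all assumptions made for illustration; only the example words and their cluster labels come from the table.

```python
# Hypothetical word-to-cluster lookup, populated with a few words from the table above.
WORD_TO_CLUSTER = {
    "abilify": "c1", "paxil": "c1",
    "headache": "c2", "rash": "c2",
    "anxiety": "c3", "depression": "c3",
    "medication": "c5", "med": "c5",
    "husband": "c6", "mom": "c6",
}


def cluster_feature(token, unknown="UNK"):
    """Return the cluster ID for a token, or a placeholder if the token is unseen."""
    return WORD_TO_CLUSTER.get(token.lower(), unknown)


def window_cluster_features(tokens, index, size=1):
    """Cluster IDs for the token at `index` and its neighbours within `size` positions."""
    features = {}
    for offset in range(-size, size + 1):
        pos = index + offset
        if 0 <= pos < len(tokens):  # skip positions outside the sentence
            features[f"cluster[{offset:+d}]"] = cluster_feature(tokens[pos])
    return features


tokens = "Paxil gave my husband a headache".split()
print(window_cluster_features(tokens, 5))
```

Because the lookup returns only an opaque ID, the manually assigned category names play no role, which matches the note that they are not used in the system.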
Figure 2. Calculated features representing three CRF classification instances.
Number of user reviews and annotation details in train/test sets for DailyStrength (DS) and Twitter corpora
| Dataset | No. of user posts | No. of sentences | No. of tokens | No. of ADR mentions | No. of indication mentions |
|---|---|---|---|---|---|
| DS train set | 4720 | 6676 | 66 728 | 2193 | 1532 |
| DS test set | 1559 | 2166 | 22 147 | 750 | 454 |
| Twitter train set | 1340 | 2434 | 28 706 | 845 | 117 |
| Twitter test set | 444 | 813 | 9526 | 277 | 41 |
Comparison of ADR classification precision (P), recall (R), and F-measure (F) of ADRMine with embedding cluster features (ADRMine_WITH_CLUSTER) and the baseline systems on two different corpora: DS and Twitter
| Method | DS P | DS R | DS F | Twitter P | Twitter R | Twitter F |
|---|---|---|---|---|---|---|
| MetaMap_ADR_LEXICON | 0.470 | 0.392 | 0.428 | 0.394 | 0.309 | 0.347 |
| MetaMap_SEMANTIC_TYPE | 0.289 | 0.484 | 0.362 | 0.230 | 0.403 | 0.293 |
| Lexicon-based | 0.577 | 0.724 | 0.642 | 0.561 | 0.610 | 0.585 |
| SVM | | 0.671 | 0.760 | 0.778 | 0.495 | 0.605 |
| ADRMine_WITHOUT_CLUSTER | 0.874 | 0.723 | 0.791 | | 0.549 | 0.647 |
| ADRMine_WITH_CLUSTER | 0.860 | | | 0.765 | | |
The highest values in each column are highlighted in bold.
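The F column can be checked against the P and R columns. The sketch below assumes the balanced form of the F-measure (the harmonic mean of precision and recall); the source only says "F-measure", so that choice and the function name `f_measure` are assumptions.

```python
def f_measure(precision, recall):
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both scores are zero
    return 2 * precision * recall / (precision + recall)


# Lexicon-based baseline on DS, with P and R taken from the table above.
print(round(f_measure(0.577, 0.724), 3))  # 0.642, matching the reported F
```

Since the table reports P and R rounded to three decimals, recomputing F for other rows can differ from the reported value in the last digit.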
The effectiveness of different CRF feature groups; the full feature set (All) includes: context, lexicon, POS, negation, and embedding clusters (cluster)
| CRF Features | DS P | DS R | DS F | Twitter P | Twitter R | Twitter F |
|---|---|---|---|---|---|---|
| All | 0.856 | 0.776 | 0.814 | 0.765 | | |
| All – lexicon | 0.852 | 0.781 | 0.815 | 0.765 | 0.646 | 0.701 |
| All – POS | 0.853 | 0.776 | 0.812 | 0.754 | 0.653 | 0.700 |
| All – negation | 0.854 | 0.769 | 0.810 | 0.752 | 0.646 | 0.695* |
| All – context | 0.811 | 0.665 | 0.731* | 0.624 | 0.498 | 0.554* |
| All – cluster | 0.851 | 0.745 | 0.794* | | 0.549 | 0.647* |
| Context + cluster | | | | 0.746 | 0.628 | 0.682* |
Statistically significant changes (p < 0.05), when compared with All feature set, are marked with asterisks.
The highest values in each column are highlighted in bold.
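One way to read the ablation rows is as a change in F-measure relative to the All feature set. The sketch below does this for the DS F values that appear in the table; the dictionary and function names are illustrative, and the arithmetic says nothing about statistical significance, which the asterisks report separately.

```python
# DS F-measure values copied from the ablation table above.
DS_F = {
    "All": 0.814,
    "All - lexicon": 0.815,
    "All - POS": 0.812,
    "All - negation": 0.810,
    "All - context": 0.731,
    "All - cluster": 0.794,
}


def f_change(scores, baseline="All"):
    """Difference in F between each ablated feature set and the baseline set."""
    base = scores[baseline]
    return {
        name: round(value - base, 3)
        for name, value in scores.items()
        if name != baseline
    }


changes = f_change(DS_F)
# List the ablations from the largest drop to the smallest.
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name}: {delta:+.3f}")
```

On these numbers, removing the context features costs the most on DS, followed by removing the embedding clusters.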
Figure 3. The impact of embedding clusters on precision, recall (a), and F-measure (b), when training the CRF on variable training set sizes and testing on the same test set.
Figure 4. Examples of successfully extracted concepts using ADRMine.
Figure 5. Examples of concepts that could only be extracted after adding the embedding cluster features to ADRMine. These concepts are starred and other extracted concepts are highlighted.
Figure 6. Analysis of false positives and false negatives produced by the ADR extraction approach.