Ioannis Korkontzelos, Azadeh Nikfarjam, Matthew Shardlow, Abeed Sarker, Sophia Ananiadou, Graciela H Gonzalez.
Abstract
OBJECTIVE: The abundance of text available in social media and health-related forums, along with the rich expression of public opinion, has recently attracted the interest of the public health community in using these sources for pharmacovigilance. Based on the intuition that patients post about Adverse Drug Reactions (ADRs) expressing negative sentiments, we investigate the effect of sentiment analysis features in locating ADR mentions.
Keywords: Adverse drug reactions; Sentiment analysis; Social media; Text mining
Year: 2016 PMID: 27363901 PMCID: PMC4981644 DOI: 10.1016/j.jbi.2016.06.007
Source DB: PubMed Journal: J Biomed Inform ISSN: 1532-0464 Impact factor: 6.317
Examples of tweets about drugs.
| # | Example tweet |
|---|---|
| A | |
| B | |
| C | |
| D | |
| E | |
Numbers of ADR and indication mentions in the DailyStrength and Twitter corpora, total numbers of messages, and numbers of messages broken down by the mentions they contain. Percentages (%) of messages are shown in parentheses.
| Corpus | ADR mentions | Ind. mentions | Messages | Messages containing ADRs | Messages containing Ind. | Messages containing both | Messages containing none |
|---|---|---|---|---|---|---|---|
| DailyStrength | 1500 | 1068 | 4720 | 1500 (31.8) | 1068 (22.6) | 232 (4.9) | 2384 (50.5) |
| Twitter | 651 | 101 | 1339 | 651 (48.6) | 101 (7.5) | 53 (4.0) | 640 (47.8) |
| DailyStrength | 752 | 454 | 1559 | 533 (34.2) | 322 (20.7) | 71 (4.6) | 775 (49.7) |
| Twitter | 277 | 38 | 443 | 236 (53.3) | 33 (7.5) | 18 (4.1) | 192 (43.3) |
Feature groups used for experimentation in this section.
| Feature groups | Types of included features |
|---|---|
| Token | |
| PoS | Parts-of-speech |
| Character | Character |
| Negation | isNegated |
| Heuristics | isAllCaps, isPunctuation, isElongated |
| TW | TW clusters |
| Lex. | Token, bigram and non-contiguous |
| All SA features | |
| DN & min. sent. pos. | Drug name & minimum sentiment relative position |
| All features | All SA features, DN & min. sent. pos. |
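The heuristic features listed above (isAllCaps, isPunctuation, isElongated) are simple token-level flags. The sketch below shows one plausible way to compute them. The exact definitions are assumptions and are not taken from this record: here a token counts as elongated when any character repeats three or more times in a row, and as all-caps when it has at least two characters and every cased character is uppercase. The function names are hypothetical.

```python
import re
import string

# Assumption: "elongated" means some character repeated 3+ times in a row.
_ELONGATED = re.compile(r"(.)\1{2,}")


def is_all_caps(token: str) -> bool:
    """True for tokens such as 'ADR' (length > 1, all cased chars uppercase)."""
    return len(token) > 1 and token.isupper()


def is_punctuation(token: str) -> bool:
    """True when the token consists solely of ASCII punctuation characters."""
    return len(token) > 0 and all(ch in string.punctuation for ch in token)


def is_elongated(token: str) -> bool:
    """True for tokens such as 'sooooo' or '!!!'."""
    return _ELONGATED.search(token) is not None


def heuristic_features(token: str) -> dict:
    """Collect the three heuristic flags for a single token."""
    return {
        "isAllCaps": is_all_caps(token),
        "isPunctuation": is_punctuation(token),
        "isElongated": is_elongated(token),
    }
```

For example, `heuristic_features("SOOOO")` sets isAllCaps and isElongated but not isPunctuation, while an ordinary word such as "tired" sets none of the flags.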
ADR extraction performance percentages (on DailyStrength and Twitter) when testing different feature sets.
| Features | DailyStrength P | DailyStrength R | DailyStrength F | Twitter P | Twitter R | Twitter F |
|---|---|---|---|---|---|---|
| ADRMine (baseline) | 86.34 | 78.40 | 82.18 | 76.51 | 68.23 | 72.14 |
| | 86.25 | 76.93 | 81.32 | 74.38 | 64.98 | 69.36 |
| Character | 85.40 | 77.20 | 81.09 | 78.70 | 65.34 | 71.40 |
| PoS | 85.02 | 77.20∗ | 80.92 | 75.95 | 64.98 | 70.04 |
| Negation | 86.38 | 78.67 | 82.34 | 76.35 | 66.43 | 71.04 |
| Heuristics | 86.41 | 78.00 | 81.99 | 76.92 | 68.59 | 72.52 |
| TW | 85.55 | 78.13 | 81.67 | 74.49 | 65.34 | 69.62 |
| Hu&Liu Lex. | 86.26 | 77.87 | 81.85 | 77.05 | 67.87 | 72.17 |
| Subjectivity Lex. | 85.86 | 77.73 | 81.60 | 75.61 | 67.15 | 71.13 |
| NRC Lex. | 86.32 | 78.27 | 82.10 | 74.27 | 64.62 | 69.11 |
| NRC# Lex. | 85.74 | 76.13 | 80.65 | 76.09 | 63.18 | 69.03 |
| S140 Lex. | 79.48 | 65.70 | 71.94 | | | |
| All Lex. | 86.38 | 76.93 | 81.38 | 78.30 | 66.43 | 71.88 |
| All SA features | 83.82 | 77.33 | 80.44 | | | |
| DN & min. sent. pos. | 86.39 | 77.87 | 81.91 | 74.60 | 66.79 | 70.48 |
| All features | 83.36 | 77.13 | 79.58 | | | |
The contents of each feature set are presented in Table 3. Statistically significant improvements over the baseline are marked with an asterisk (∗). Statistical significance was computed using the two-tailed McNemar’s Q at a significance level of 0.05.
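McNemar's test compares two systems evaluated on the same items by looking only at the items on which they disagree. The following is a minimal sketch, assuming the chi-square approximation with one degree of freedom and no continuity correction; the exact variant used by the authors is not stated here, and the function name and inputs are hypothetical.

```python
import math


def mcnemar_test(baseline_correct, system_correct):
    """Two-tailed McNemar's test on paired correct/incorrect outcomes.

    Both arguments are equal-length sequences of booleans, one entry per
    test item. Returns (statistic, p_value).
    """
    if len(baseline_correct) != len(system_correct):
        raise ValueError("both systems must be scored on the same items")
    # b: baseline correct, system wrong; c: baseline wrong, system correct.
    b = sum(1 for x, y in zip(baseline_correct, system_correct) if x and not y)
    c = sum(1 for x, y in zip(baseline_correct, system_correct) if not x and y)
    if b + c == 0:
        return 0.0, 1.0  # the systems never disagree
    statistic = (b - c) ** 2 / (b + c)
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value
```

A difference would be reported as significant at the 0.05 level when the returned p-value is below 0.05.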
Fig. 1. Evaluation for the DailyStrength part of the corpus using parts of the training data.
Fig. 2. Evaluation for the Twitter part of the corpus using parts of the training data.
ADR extraction performance percentages (on DailyStrength and Twitter) when testing different feature sets. Stratified 10 × 10-fold cross-validation results.
| Features | DailyStrength P | DailyStrength R | DailyStrength F | Twitter P | Twitter R | Twitter F |
|---|---|---|---|---|---|---|
| ADRMine (baseline) | 83.62 | 75.96 | 79.57 | 75.51 | 60.29 | 66.91 |
| | 83.91 | 76.55‡ | 80.03‡ | 75.85 | 60.32 | 67.07 |
| Character | 83.07 | 76.96‡ | 79.87∗ | 76.05 | 62.32‡ | 68.39‡ |
| PoS | 83.31 | 76.81‡ | 79.89‡ | 74.30 | 60.36 | 66.47∗ |
| Negation | 83.71 | 75.93 | 79.59 | 75.35 | 60.16 | 66.77 |
| Heuristics | 83.67 | 76.03 | 79.62 | 75.58 | 60.29 | 66.94 |
| TW | 83.21 | 76.71‡ | 79.80 | 75.90 | 60.99† | 67.50† |
| Hu&Liu Lex. | 83.55 | 76.10 | 79.62 | 75.66 | 60.46 | 67.08 |
| Subjectivity Lex. | 83.58 | 76.01 | 79.58 | 75.61 | 60.53 | 67.09 |
| NRC Lex. | 83.46 | 75.98 | 79.51 | 75.49 | 60.60 | 67.10 |
| NRC# Lex. | 83.47 | 75.56 | 79.29 | 75.92 | 60.92∗ | 67.48∗ |
| S140 Lex. | 83.48 | 76.03 | 79.55 | 75.78 | 60.84∗ | 67.37∗ |
| All Lex. | 83.22 | 75.98 | 79.40 | 76.04 | 61.50‡ | 67.88‡ |
| All SA features | 83.04 | | | | | |
| DN & min. sent. pos. | 83.63 | 75.97 | 79.58 | 75.44 | 60.38 | 66.95 |
| All features | 83.01 | 76.90 | | | | |
Statistically significant improvements over the baseline are marked with an asterisk (∗), dagger (†) and double dagger (‡) for significance levels of 0.05, 0.01 and 0.005, respectively. Since the cross-validation folds are common to all experiments, the two-tailed matched-samples t-test was used for computing statistical significance.
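Because every experiment is scored on the same folds, the per-fold scores of two systems form matched pairs, and the test statistic is computed on the per-fold differences. The sketch below computes only the t statistic and its degrees of freedom using the standard library; the function name and the sample scores are hypothetical, and a p-value would still have to be looked up from the t distribution (for instance with `scipy.stats.ttest_rel`, if SciPy is available).

```python
import math
import statistics


def paired_t_statistic(baseline_scores, system_scores):
    """Matched-samples t statistic for per-fold scores of two systems.

    Returns (t, degrees_of_freedom). A positive t means the system
    scored higher than the baseline on average.
    """
    if len(baseline_scores) != len(system_scores):
        raise ValueError("scores must come from the same folds")
    diffs = [s - b for b, s in zip(baseline_scores, system_scores)]
    n = len(diffs)
    if n < 2:
        raise ValueError("at least two paired folds are required")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1
```

With ten repetitions of 10-fold cross-validation there would be 100 paired scores per comparison, assuming each fold of each repetition contributes one pair.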
Prediction numbers and (within parentheses) percentages of ADR or indication mentions in DailyStrength (DS) and Twitter messages by the baseline system and the best performing systems for each corpus.
| Corpus | Features | ADR mentions predicted as ADR | ADR mentions predicted as Ind. | ADR mentions predicted as None | Ind. mentions predicted as Ind. | Ind. mentions predicted as ADR | Ind. mentions predicted as None | No mentions predicted as ADR | No mentions predicted as Ind. |
|---|---|---|---|---|---|---|---|---|---|
| DS | ADRMine (baseline) | 598 (79.5) | 38 (5.1) | 116 (15.4) | 306 (67.4) | 45 (9.9) | 103 (22.7) | 44 | 35 |
| DS | S140 Lex. | 597 (79.4) | 34 (4.5) | 121 (16.1) | 305 (67.2) | 44 (9.7) | 105 (23.1) | 40 (−10) | 30 (−14) |
| Twitter | ADRMine (baseline) | 189 (68.2) | 0 (0.0) | 88 (31.8) | 13 (34.2) | 6 (15.8) | 19 (50.0) | 50 | 1 |
| Twitter | All features | 192 (69.3) | 0 (0.0) | 85 (30.7) | 15 (39.5) | 7 (18.4) | 16 (42.1) | 41 (−18) | 4 (+300) |
For unannotated text (last two columns), the parentheses show the percentage increase or decrease in comparison to the relevant baseline.
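The increase or decrease reported for unannotated text is a relative change with respect to the baseline count. A minimal sketch of that arithmetic follows; the function name is hypothetical.

```python
def relative_change_percent(baseline_count: int, system_count: int) -> float:
    """Percentage change of system_count relative to baseline_count."""
    if baseline_count == 0:
        raise ValueError("relative change is undefined for a zero baseline")
    return 100.0 * (system_count - baseline_count) / baseline_count
```

For the Twitter rows, going from 50 to 41 false ADR predictions gives −18%, and going from 1 to 4 false indication predictions gives +300%.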
Prediction numbers and (within parentheses) percentages of DailyStrength (DS) and Twitter messages that contain ADR or indication mentions by the baseline system and the best performing systems for each corpus.
| Corpus | Features | Containing ADRs: predicted ADRs | Containing ADRs: predicted Ind. | Containing ADRs: predicted Both | Containing ADRs: predicted None | Containing Ind.: predicted Ind. | Containing Ind.: predicted ADRs | Containing Ind.: predicted Both | Containing Ind.: predicted None |
|---|---|---|---|---|---|---|---|---|---|
| DS | ADRMine (baseline) | 442 (82.9) | 22 (4.1) | 67 (12.6) | 69 (13.0) | 178 (55.3) | 101 (31.4) | 66 (20.5) | 43 (13.4) |
| DS | S140 Lex. | 444 (83.3) | 22 (4.1) | 69 (13.0) | 67 (12.6) | 177 (55.0) | 101 (31.4) | 70 (21.7) | 44 (13.7) |
| Twitter | ADRMine (baseline) | 176 (74.6) | 3 (1.3) | 6 (2.5) | 57 (24.2) | 7 (21.2) | 16 (48.5) | 6 (18.2) | 10 (30.3) |
| Twitter | All features | 178 (75.4) | 4 (1.7) | 9 (3.8) | 54 (22.9) | 6 (18.2) | 17 (51.5) | 9 (27.3) | 10 (30.3) |
Prediction numbers and (within parentheses) percentages of DailyStrength (DS) and Twitter messages that contain both ADR and indication mentions, or no mentions at all, by the baseline system and the best performing systems for each corpus.
| Corpus | Features | Containing both: predicted Both | Containing both: predicted ADRs | Containing both: predicted Ind. | Containing both: predicted None | Containing none: predicted None | Containing none: predicted ADRs | Containing none: predicted Ind. | Containing none: predicted Both |
|---|---|---|---|---|---|---|---|---|---|
| DS | ADRMine (baseline) | 48 (67.6) | 17 (23.9) | 6 (8.5) | 0 (0.0) | 748 (96.5) | 18 (2.3) | 9 (1.2) | 0 (0.0) |
| DS | S140 Lex. | 52 (73.2) | 13 (18.3) | 6 (8.5) | 0 (0.0) | 756 (97.6) | 12 (1.6) | 7 (0.9) | 0 (0.0) |
| Twitter | ADRMine (baseline) | 6 (33.3) | 7 (38.9) | 3 (16.7) | 2 (11.1) | 170 (88.5) | 21 (10.9) | 1 (0.5) | 0 (0.0) |
| Twitter | All features | 8 (44.4) | 5 (27.8) | 3 (16.7) | 2 (11.1) | 173 (90.1) | 17 (8.9) | 2 (1.0) | 0 (0.0) |
ADR extraction performance (on DailyStrength and Twitter) when testing the isDrugName feature.
| Features | DailyStrength P | DailyStrength R | DailyStrength F | Twitter P | Twitter R | Twitter F |
|---|---|---|---|---|---|---|
| All SA features + isDrugName | 83.24 | 76.80 | 79.89 | | | |
| All features + isDrugName | 83.79 | 77.20 | 80.36 | | | |
Statistically significant improvements over the baseline are marked with an asterisk (∗). Statistical significance was computed using the two-tailed McNemar’s Q at a 95% confidence level.
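The three values reported per row in the performance tables are consistent with precision, recall and their harmonic mean (the balanced F-score); this reading is an assumption inferred from the arithmetic, since the column labels are not preserved in this record. A short sketch that reproduces the third value from the first two, with a hypothetical function name:

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F-score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Applied to the isDrugName rows, `f_score(83.24, 76.80)` and `f_score(83.79, 77.20)` round to 79.89 and 80.36, matching the third value in each row.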
Examples of messages whose ADR mentions were not predicted correctly.
| # | Example |
|---|---|
| A | |
| B | |
| C | |
| D | |
| E | |