Rainer Winnenburg, Nigam H Shah.
Abstract
BACKGROUND: Identification of associations between marketed drugs and adverse events from the biomedical literature assists drug safety monitoring efforts. Assessing the significance of such literature-derived associations and determining the granularity at which they should be captured remain a challenge. Here, we assess how defining a selection of adverse event terms from MeSH, based on information content, can improve the detection of adverse events for drugs and drug classes.
Keywords: Enrichment analysis; Information retrieval; MeSH indexing; Postmarketing pharmacovigilance
Year: 2016 PMID: 27333889 PMCID: PMC4918084 DOI: 10.1186/s12859-016-1080-z
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Overview of our approach for using generalized enrichment analysis to detect adverse drug events
Fig. 2 Defining an abstraction level based on information content (IC) of terms. Blue boxes denote terms that are considered under the given abstraction layer (here, an IC range from 4.0 to 7.5); green circles denote terms that are used to represent these term aggregations; crossed-out terms are ignored. Values in blue circles denote the information content (IC) of the term. Examples: Left box: Aneurysm, Ruptured and Aneurysm are aggregated into and represented by Vascular Diseases. Center box: Cardiovascular Abnormalities and Congenital Abnormalities are considered, but Cardiovascular Diseases is ignored (too general). Demyelinating Diseases will not be considered under this abstraction layer because there is no ancestor term into which they could be aggregated. Tree numbers in square brackets indicate hierarchical relations between terms according to the MeSH tree hierarchy
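The aggregation rule described in the Fig. 2 caption can be sketched in a few lines. This is only one possible reading of the caption, not the authors' implementation: it assumes each term has a single dot-separated tree number, that a term is represented by its most general ancestor (or itself) whose IC lies inside the chosen range, and that terms without such an ancestor are dropped. The function names, tree numbers, and IC values below are hypothetical.

```python
from typing import Optional


def ancestors_or_self(tree_number: str) -> list[str]:
    """List tree numbers from the most general ancestor down to the term itself."""
    parts = tree_number.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


def representative(tree_number: str, ic: dict[str, float],
                   ic_min: float, ic_max: float) -> Optional[str]:
    """Return the most general ancestor-or-self whose IC is within [ic_min, ic_max].

    Returns None when no term on the path falls inside the range, which
    corresponds to a term that is ignored at this abstraction level.
    """
    for candidate in ancestors_or_self(tree_number):
        value = ic.get(candidate)
        if value is not None and ic_min <= value <= ic_max:
            return candidate
    return None


# Made-up tree numbers and IC values, for illustration only.
toy_ic = {
    "T01": 3.2,              # too general for the range below
    "T01.100": 5.1,          # inside the range, acts as representative
    "T01.100.200": 7.0,
    "T01.100.200.300": 9.4,  # too specific, aggregated upwards
}

mapped = representative("T01.100.200.300", toy_ic, 4.0, 7.5)
ignored = representative("T01", toy_ic, 4.0, 7.5)
```

In this toy hierarchy, the two specific terms collapse into `T01.100`, while `T01` itself is ignored as too general. MeSH terms can carry several tree numbers, so a fuller version would need to resolve one representative per tree position.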
Fig. 3 Differences between term-by-term enrichment analysis (a) and conditional enrichment analysis (b). See main text for details
2 × 2 contingency table used for calculating associations between drugs and adverse events (AE) with proportional reporting ratio (PRR)
| | With this AE | Without this AE |
|---|---|---|
| Articles mentioning the drug | a | b |
| Articles not mentioning the drug | c | d |
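
With the cell labels a, b, c and d from the contingency table, the proportional reporting ratio is conventionally computed as PRR = [a / (a + b)] / [c / (c + d)]. The following minimal sketch applies that standard definition; the function name and the example counts are hypothetical and do not come from the study.

```python
def prr(a: int, b: int, c: int, d: int) -> float:
    """Proportional reporting ratio from the four cells of the 2 x 2 table.

    a: articles mentioning the drug, with the AE
    b: articles mentioning the drug, without the AE
    c: articles not mentioning the drug, with the AE
    d: articles not mentioning the drug, without the AE
    """
    with_drug = a + b
    without_drug = c + d
    if with_drug == 0 or without_drug == 0 or c == 0:
        raise ValueError("PRR is undefined for this table")
    return (a / with_drug) / (c / without_drug)


# Hypothetical counts, for illustration only.
example_prr = prr(a=20, b=80, c=100, d=9800)
```

A value above 1 means the adverse event is mentioned proportionally more often in articles about the drug than in all other articles; the thresholds 1, 1.5 and 5 used later in the results are cut-offs on this ratio.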
Mapping of drugs and drug classes in the OMOP reference set to ATC
| Aggregation | Drugs (ATC5) | | Drug classes (ATC4) | |
|---|---|---|---|---|
| Outcome | Positive | Negative | Positive | Negative |
| Acute kidney injury | 23 | 59 | 19 | 68 |
| Acute liver injury | 79 | 33 | 68 | 43 |
| Acute myocardial infarction | 33 | 61 | 18 | 72 |
| GI bleed | 24 | 62 | 19 | 73 |
| Total | 159 | 215 | 124 | 256 |
Positive and negative controls mapped to drugs and drug classes in ATC for the four adverse event outcomes in the OMOP reference set
Adverse drug event outcomes in OMOP mapped to MeSH terms at different abstraction levels
| Aggregation | ORIGINAL | LOW (IC 7–10) | MEDIUM (IC 4.5–7) | HIGH (IC 1–5.5) | MeSH 2nd level |
|---|---|---|---|---|---|
| Acute kidney injury | Acute kidney injury, Kidney tubular necrosis, acute | Acute kidney injury D058186 (5th level) IC: 9.34 | Kidney Diseases D007674 (3rd level) IC: 5.73 | Urologic Diseases D014570 (2nd level) IC: 5.16 | Urologic Diseases D014570 (2nd level) IC: 5.16 |
| Acute Liver injury | Drug-induced liver injury, Drug-induced liver injury, chronic | Drug-Induced Liver Injury D056486 (3rd level) IC: 9.87 | Liver Diseases D008107 (2nd level) IC: 5.64 | Digestive System Diseases D004066 (1st level) IC: 4.01 | Liver Diseases D008107 (2nd level) IC: 5.64 |
| Acute Myocardial Infarction | Myocardial infarction, Anterior wall MI, Inferior wall MI, Myocardial stunning, Shock, cardiogenic | Myocardial infarction D009203 (4th level) IC: 7.22 | Myocardial Ischemia D017202 (3rd level) IC: 5.94 | Heart Diseases D006331 (2nd level) IC: 4.61 | Heart Diseases D006331 (2nd level) IC: 4.61 |
| GI bleed | GI Hemorrhage, Hematemesis, Melena, Peptic ulcer hemorrhage | Gastrointestinal Hemorrhage D006471 (3rd level) IC: 9.00 | Gastrointestinal Diseases D005767 (2nd level) IC: 4.87 | Digestive System Diseases D004066 (1st level) IC: 4.01 | Gastrointestinal Diseases D005767 (2nd level) IC: 4.87 |
Representation of ADE information from the literature at different abstraction levels
| Aggregation | ORIGINAL (MeSH terms as extracted from MEDLINE) | LOW (IC 7–10) | MEDIUM (IC 4.5–7) | HIGH (IC 1–5.5) | MeSH 2nd level |
|---|---|---|---|---|---|
| # unique candidate ADE pairs | 105354 | 87415 | 49696 | 41705 | 51226 |
| # unique higher-level AE terms | 3057 | 607 | 156 | 99 | 297 |
| # unique AE terms covered | 3057 | 2428 | 2760 | 2875 | 3039 |
| # unique drugs with AE associations | 1609 | 1590 | 1602 | 1608 | 1608 |
| # unique drug classes (ATC 4) with AE associations | 565 | 565 | 564 | 565 | 565 |
Fig. 4 Performance of selected signal detection methods for single drugs on the OMOP reference set. Performance is measured for each of the four AE outcomes in terms of AUC, which summarizes all achievable combinations of true positive and false positive rates
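As a reminder of what the AUC in Fig. 4 captures, the sketch below computes it through its rank interpretation: the probability that a randomly chosen positive control receives a stronger signal score than a randomly chosen negative control, with ties counted as one half. The function name and scores are hypothetical, and the sketch assumes that higher scores mean stronger signals (p-values would have to be negated first). It is not the evaluation code used in the study.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative control")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0   # positive control ranked above negative control
            elif p == n:
                wins += 0.5   # ties contribute half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical signal scores, for illustration only.
toy_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Here eight of the nine positive-negative pairs are ordered correctly, so the toy AUC is 8/9. The quadratic loop is fine for reference sets of a few hundred controls; a rank-based formulation would scale better.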
Overall performance measured on all single drugs and outcomes in OMOP gold standard
| GEA | 7–10 | | | 4.5–7 | | | 1–5.5 | | | 2nd level | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Threshold | 0.1 | 0.05 | 0.005 | 0.1 | 0.05 | 0.005 | 0.1 | 0.05 | 0.005 | 0.1 | 0.05 | 0.005 |
| TP | | 119 | 113 | 112 | 110 | 100 | 116 | 113 | 104 | 116 | 115 | 102 |
| TN | 182 | 188 | | 191 | 195 | | 159 | 164 | 178 | 186 | 193 | 198 |
| FP | 33 | 27 | | 24 | 20 | | 56 | 51 | 37 | 29 | 22 | 17 |
| FN | | 40 | 46 | 47 | 49 | 59 | 43 | 46 | 55 | 43 | 44 | 57 |
| Precision | 0.79 | 0.82 | | 0.82 | 0.85 | 0.91 | 0.67 | 0.69 | 0.74 | 0.80 | 0.84 | 0.86 |
| Recall | | 0.75 | 0.71 | 0.70 | 0.69 | 0.63 | 0.73 | 0.71 | 0.65 | 0.73 | 0.72 | 0.64 |
| F1-Measure | 0.77 | 0.78 | | 0.76 | 0.76 | 0.74 | 0.70 | 0.70 | 0.69 | 0.76 | | 0.73 |
| PRR | 7–10 | | | 4.5–7 | | | 1–5.5 | | | 2nd level | | |
| Threshold | 1 | 1.5 | 5 | 1 | 1.5 | 5 | 1 | 1.5 | 5 | 1 | 1.5 | 5 |
| TP | | 96 | 48 | 110 | 93 | 39 | 111 | 83 | 6 | 109 | 90 | 27 |
| TN | 126 | 144 | 197 | 150 | 172 | 208 | 152 | 176 | | 166 | 184 | |
| FP | 89 | 71 | 18 | 65 | 43 | 7 | 63 | 39 | | 49 | 31 | |
| FN | | 63 | 111 | 49 | 66 | 120 | 48 | 76 | 153 | 50 | 69 | 132 |
| Precision | 0.56 | 0.57 | 0.73 | 0.63 | 0.68 | 0.85 | 0.64 | 0.68 | 0.60 | 0.69 | 0.74 | |
| Recall | | 0.60 | 0.30 | 0.69 | 0.58 | 0.25 | | 0.52 | 0.04 | 0.69 | 0.57 | 0.17 |
| F1-Measure | 0.62 | 0.59 | 0.43 | 0.66 | 0.63 | 0.38 | 0.67 | 0.59 | 0.07 | | 0.64 | 0.28 |
Performance with GEA (top) and PRR (bottom) using different abstraction levels and thresholds. Bold numbers indicate best performance throughout all configurations (individually for GEA and PRR-based configurations)
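The precision, recall and F1 rows follow directly from the TP, FP and FN counts. The short sketch below uses the standard definitions and checks them against one set of counts that appears in the GEA part of the table (TP = 119, FP = 27, FN = 40); the function name is hypothetical.

```python
def classification_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from true positive, false positive and false negative counts."""
    if tp == 0:
        raise ValueError("metrics are undefined or zero without true positives")
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the GEA part of the table above.
precision, recall, f1 = classification_metrics(tp=119, fp=27, fn=40)
rounded = (round(precision, 2), round(recall, 2), round(f1, 2))
# rounded == (0.82, 0.75, 0.78), matching the reported values
```

Because the reference set contains 159 positive and 215 negative drug controls, TP + FN = 159 and TN + FP = 215 hold in every column, which is a convenient consistency check on the table.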
Adverse event signals for drugs detected by GEA and PRR
| | GEA IC 7–10 | GEA IC 4.5–6 | GEA IC 1–5.5 | PRR 2nd | PRR IC 7–10 | PRR |
|---|---|---|---|---|---|---|
| <0.1 | 43702 (22073) | 10585 (7634) | 12875 (6157) | 30547 | 61108 | >1 |
| <0.05 | 35953 (17140) | 9249 (6840) | 11453 (5356) | 24834 | 52377 | >1.5 |
| <0.005 | 23059 ( | 6867 (4931) | 8594 (3623) | 10810 | 26953 | >5 |
Adverse event signals for drugs (ATC ingredient names) detected by GEA and PRR using different thresholds at different aggregation levels. The number in bold indicates the unique adverse event association signals found using the optimal configuration (GEA at IC 7–10 with p < 0.005) as determined by the overall performance assessment