
Learning signals of adverse drug-drug interactions from the unstructured text of electronic health records.

Srinivasan V Iyer, Paea Lependu, Rave Harpaz, Anna Bauer-Mehren, Nigam H Shah.

Abstract

Drug-drug interactions (DDIs) account for 30% of all adverse drug reactions, which are the fourth leading cause of death in the US. Current methods for post-marketing surveillance primarily use spontaneous reporting systems for learning DDI signals and validate their signals using the structured portions of Electronic Health Records (EHRs). We demonstrate a fast, annotation-based approach that uses standard odds ratios to identify signals of DDIs directly from the textual portion of EHRs and which, to our knowledge, is the first effort of its kind. We developed a gold standard of 1,120 DDIs spanning 14 adverse events and 1,164 drugs. Our evaluations on this gold standard using millions of clinical notes from the Stanford Hospital confirm that identifying DDI signals from clinical text is feasible (AUROC=81.5%). We conclude that the text in EHRs contains valuable information for learning DDI signals and has enormous utility in drug surveillance and clinical decision support.


Year: 2013 PMID: 24303305 PMCID: PMC3814491

Source DB: PubMed Journal: AMIA Jt Summits Transl Sci Proc

Introduction

More than 400,000 preventable adverse drug reactions occur every year [1], each costing over $3500 with an increased hospital stay of over 3 days [2], so their discovery and prevention would address one of the leading causes of death in the US. We are also witnessing a rise in polypharmacy, the use of multiple concomitant drugs to treat medical conditions, with many people taking 3 or more drugs. In fact, one study estimates that 29.4% of elderly patients [3] are on 6 or more drugs. Drug interactions that lead to adverse reactions are a potentially avoidable [4] consequence of this practice, accounting for more than 30% of all adverse drug reactions [5], and their early detection is vital [6]. New drugs are usually tested for interactions with existing drugs before market approval using in vivo and in vitro methods [7]. However, owing to the sheer number of ways in which drugs can interact [8], it is infeasible and expensive to test for every kind of interaction. Also, many drug interactions manifest only after a certain period of exposure, and it takes several exposures for rare drug interactions to occur [9]. Therefore, post-marketing surveillance is necessary to detect unanticipated interactions that occur when the drug is in use in the general population. The US Food and Drug Administration (FDA) enables such surveillance through the active monitoring of spontaneous reporting systems (SRSs) such as the Adverse Event Reporting System (AERS) and, similarly, the World Health Organization’s VigiBase. Several studies [10–13] have successfully inferred drug interactions from these sources, overcoming problems of reporting biases [14] and duplicate reporting [15]. Electronic health records (EHRs) complement these existing SRSs, providing a source of observational data without the same biases.

Initiatives like the Observational Medical Outcomes Partnership (OMOP) in the US and the Exploring and Understanding Adverse Drug Reactions (EU-ADR) project in Europe are focusing on building EHR-based surveillance systems. These projects mainly utilize the structured diagnosis and prescription data of EHRs to identify single-drug adverse reactions. Most efforts aimed at finding drug interactions use reported sources for signal detection and use EHRs as a means of validation. For instance, Tatonetti et al. [10] found 171 new drug interactions from AERS and used the EHRs at Stanford to validate them. Another study by Duke et al. [16] mined MEDLINE abstracts for hypothesis generation and validated the signals on EHRs. However, in addition to structured data, EHRs contain rich information in the unstructured notes and reports taken by doctors, nurses and other practitioners. By ignoring the unstructured text, we could be missing a substantial portion of adverse events [17]. Many studies [18] have shown that coded information such as ICD-9 codes is inadequate for accurately building patient cohorts and that there is a considerable advantage [19] in using the unstructured clinical text of EHRs. We argue that such an advantage would also extend to drug safety signal detection. Indeed, there is already some work [20,21] demonstrating the discovery of adverse event profiles for single drugs using unstructured notes. Therefore, given increasing adoption of and access to medical records for research, we expect efforts to shift more toward directly mining EHRs for signal generation, with increased attention on the use of unstructured data [22]. In this paper, we apply data mining methods to the textual portion of EHRs to learn signals of drug-drug interactions. To our knowledge, this is the first study of its kind.

Methods

Preparation of Gold Standard

To ensure reliability, we limit our study to the 1,164 drug ingredients common to three sources: the UMLS RxNORM ontology (4,993 ingredients), DrugBank [23] (6,711 drugs) and the Anatomical Therapeutic Chemical Classification (ATC) ontology (4,406 drugs). We use known interactions from DrugBank and the Medi-Span® Drug Therapy Monitoring System™ (Wolters Kluwer Health, Indianapolis, IN) as positive interactions in our gold standard. Each of these databases provides a textual monograph describing the interaction. The short monographs follow a basic template, so we use regular expressions to extract drug-drug-event relations that indicate known interactions, and then validate them manually. Our set of positive interactions consists of 591 different drug ingredients and 14 distinct events. For estimating false discovery, we simulate negative examples by generating random drug-drug-event tuples and removing any known interactions according to DrugBank, Medi-Span or Drugs.com [24]. We also remove tuples for which the adverse event is an indication (according to Medi-Span, DrugBank, Drugs.com, UMLS and SIDER [25]) for either drug individually.
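
The negative-example step can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the data structures (sets of tuples), and the `max_attempts` guard are assumptions made for illustration, and the toy identifiers in the usage example are hypothetical.

```python
import random

def common_ingredients(*sources):
    """Return the ingredients present in every source vocabulary."""
    return set.intersection(*(set(s) for s in sources))

def sample_negatives(drugs, events, known, indications, n, seed=0, max_attempts=100_000):
    """Draw n random (drug pair, event) tuples that are not known interactions.

    known:       set of (frozenset({drug1, drug2}), event) for known interactions
    indications: set of (drug, event) pairs where the event is an indication
                 for that drug; tuples involving such a pair are rejected
    """
    rng = random.Random(seed)
    drugs, events = sorted(drugs), sorted(events)
    negatives = set()
    for _ in range(max_attempts):
        if len(negatives) == n:
            return negatives
        d1, d2 = rng.sample(drugs, 2)          # two distinct drugs
        ev = rng.choice(events)
        candidate = (frozenset((d1, d2)), ev)
        if candidate in known:
            continue                            # drop known interactions
        if (d1, ev) in indications or (d2, ev) in indications:
            continue                            # drop events that are indications
        negatives.add(candidate)
    raise RuntimeError("could not draw enough negative examples")

# Toy usage with hypothetical identifiers
drugs = common_ingredients({"A", "B", "C", "D", "X"}, {"A", "B", "C", "D"}, {"A", "B", "C", "D", "Y"})
known = {(frozenset(("A", "B")), "event1")}
indications = {("C", "event2")}
negatives = sample_negatives(drugs, {"event1", "event2"}, known, indications, n=5, seed=1)
```

Storing each drug pair as a `frozenset` makes the tuple order-independent, so (A, B, event) and (B, A, event) count as the same candidate.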

Annotation of Electronic Health Records

We use the Stanford Translational Research Integrated Database Environment (STRIDE) dataset comprising 9,078,736 textual notes corresponding to 1,044,979 patients. In STRIDE, 565,898 patients (53% female) have at least one drug or event from our selection mentioned in their records, and 857 of the 1,164 drugs appear at least once in our dataset. As described in our previous work [26], we define drug and event concepts as sets of terms derived from biomedical ontologies. For drugs, we include trade names and other forms of the drug from the RxNORM ontology. To improve precision, we remove terms that occur in common English usage and then curate the remaining terms manually. The average size of the set of terms used to identify a drug is 8.3 terms. We then use a fast text annotator to tag clinical notes with these concepts and order the tagged notes by timestamp, thus forming a set of patient timelines. The tool also takes into account negation and family history contextual cues to reduce false attribution of concepts. We focus our study on 14 adverse events, selected based on existing literature (a list published by Trifiro et al.), their presence in our gold standard, their prevalence in STRIDE, and our ability to successfully detect their presence from EHRs.
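
As a rough illustration of term-set annotation and timeline construction, the following Python sketch tags notes by whole-word matching and sorts the resulting mentions by timestamp. It is a simplification under stated assumptions: the actual annotator is not described in enough detail here to reproduce, negation and family-history handling are omitted entirely, and the function names and toy term sets are hypothetical.

```python
import re
from collections import defaultdict

def build_matchers(concept_terms):
    """Compile one case-insensitive, whole-word pattern per concept.

    concept_terms: {concept: set of terms (e.g. ingredient and trade names)}
    """
    matchers = {}
    for concept, terms in concept_terms.items():
        # Longer terms first so multi-word names win over their prefixes.
        alternatives = "|".join(re.escape(t) for t in sorted(terms, key=len, reverse=True))
        matchers[concept] = re.compile(r"\b(?:%s)\b" % alternatives, re.IGNORECASE)
    return matchers

def build_timelines(notes, matchers):
    """Turn (patient_id, timestamp, text) notes into per-patient timelines.

    Returns {patient_id: [(timestamp, concept), ...]} sorted by timestamp.
    """
    timelines = defaultdict(list)
    for patient_id, timestamp, text in notes:
        for concept, pattern in matchers.items():
            if pattern.search(text):
                timelines[patient_id].append((timestamp, concept))
    return {pid: sorted(mentions) for pid, mentions in timelines.items()}

# Toy usage: timestamps are plain integers for simplicity
matchers = build_matchers({
    "warfarin": {"warfarin", "coumadin"},
    "bleeding": {"bleeding", "hemorrhage"},
})
notes = [
    ("p1", 2, "Pt reports bleeding gums."),
    ("p1", 1, "Started Coumadin 5 mg daily."),
]
timelines = build_timelines(notes, matchers)
```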

Identification of DDI signals

In traditional drug safety surveillance with single drugs, odds ratios (OR) are computed using a 2-by-2 contingency table to detect signals [Figure 2]. For our method, the exposed group represents patients who have taken both drugs and the comparison group represents patients who have taken at most one of the two drugs. Based on the ordering of the drug and event mentions, we classify each patient into one of the cells of the contingency table [Figure 3]. Any drugs that appear after the first occurrence of the event are ignored.

Figure 2.

2×2 contingency table for a drug-drug-event association

Figure 3.

Assignment of patients to various cells in the 2×2 contingency table. The portion of the timeline after the first occurrence of the event is ignored. D=drug, E=event.
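
The assignment rule can be sketched in Python as follows. The cell labels are an assumption: a and b follow the convention implied by the prevalence formula a/(a+b) used in Table 1 (exposed patients with and without the event), while c and d are taken to be the corresponding cells of the comparison group. The function name and timeline format are hypothetical.

```python
def assign_cell(timeline, drug1, drug2, event):
    """Classify one patient timeline into a cell of the 2x2 table.

    timeline: list of (timestamp, concept) tuples sorted by timestamp.
    Returns 'a' (both drugs, event), 'b' (both drugs, no event),
            'c' (at most one drug, event) or 'd' (at most one drug, no event).
    """
    drugs_seen = set()
    had_event = False
    for _, concept in timeline:
        if concept == event:
            had_event = True
            break                      # ignore everything after the first event
        if concept in (drug1, drug2):
            drugs_seen.add(concept)
    exposed = drugs_seen == {drug1, drug2}
    if exposed:
        return "a" if had_event else "b"
    return "c" if had_event else "d"

# Toy usage
both_then_event = [(1, "D1"), (2, "D2"), (3, "E")]
second_drug_after_event = [(1, "D1"), (2, "E"), (3, "D2")]
cells = (assign_cell(both_then_event, "D1", "D2", "E"),
         assign_cell(second_drug_after_event, "D1", "D2", "E"))
```

In the second timeline the patient counts as unexposed because the second drug is first mentioned only after the event.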

We use the ratio of the odds of the adverse event in the exposed group to the odds in the comparison group as a measure of the strength of the interaction. We calculate 95% confidence intervals for the odds ratio and signal an interaction if the lower bound of the confidence interval is greater than a threshold, which is picked using ROC curves depending on the desired sensitivity and specificity. To reduce the effect of confounding, similar to approaches used in SRSs [29], we use propensity score matching (PSM) to match at most 10 control patients to every case patient, and then generate an adjusted OR. In addition to matching on average age at the time of exposure, gender, and race, we also match on note count, drug count and disease count, which serve as proxies for the overall health of the patient [22].
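
A minimal Python sketch of the unadjusted signal test follows. The interval method is an assumption: the text does not say how the confidence interval is computed, so the sketch uses the common log-based (Woolf) interval, with cells a, b for the exposed group (event, no event) and c, d for the comparison group. Propensity score matching is not shown, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a log-based (Woolf) confidence interval.

    a, b: exposed patients with / without the event
    c, d: comparison patients with / without the event
    z:    normal quantile; 1.96 gives an approximate 95% interval
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the log-based interval")
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return odds_ratio, math.exp(log_or - z * se_log), math.exp(log_or + z * se_log)

def is_signal(a, b, c, d, threshold=1.0):
    """Flag an interaction when the lower CI bound exceeds the threshold."""
    _, lower, _ = odds_ratio_ci(a, b, c, d)
    return lower > threshold

# Toy counts (not taken from the study)
odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)
```

Using the lower bound rather than the point estimate means that a large odds ratio based on few patients is not flagged unless the interval is also tight enough.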

Results

Prevalence of known interactions

DrugBank contains 10,906 distinct drug interactions, of which 7,400 corresponded to our list of drugs. Similarly, of 40,475 drug interactions from Medi-Span, 15,621 corresponded to our list of drugs. Together, these formed a set of 19,356 interactions, which resulted in 6,845 distinct drug-drug-event tuples constituting the positive examples in our gold standard. We calculated ORs, using STRIDE data, for 560 true examples with sufficient support, and also for an equal number (560) of random negative examples [see Methods]. We calculated the prevalence of each event among patients on drug combinations representing true interactions from the 2×2 contingency table [see Table 1]. This gold standard of 1,120 interactions, together with the prevalence information for the 560 true interactions, forms a unique resource that can be used to test the performance of other methods for mining DDIs and to prioritize known interactions [see Discussion].

Table 1.

The drug combinations with the highest event prevalence (Prev. = a/(a+b)) in STRIDE, for each event.

Adverse Event (#Patients)	Drug1	Drug2	a	b	Prev. (%)
Parkinsonian Symptoms (3541)	levodopa	lorazepam	176	235	42.82
Cardiac Arrhythmias (88555)	potassium chloride	lisinopril	1091	1615	40.32
Neutropenia (14322)	paclitaxel	trastuzumab	140	567	19.8
Bradycardia (22906)	amiodarone	metoprolol	796	3671	17.82
Hypoglycemia (11150)	glipizide	lisinopril	367	2160	14.52
Acute Renal Failure (32197)	hydrochlorothiazide	ibuprofen	884	8375	9.55
Hyperkalemia (4973)	potassium chloride	spironolactone	349	3471	9.14
Hyperglycemia (19189)	prednisone	salmeterol	379	4612	7.59
Nephrotoxicity (1460)	fluconazole	tacrolimus	85	1208	6.57
Pancytopenia (8718)	mercaptopurine	azathioprine	15	278	5.12
Hypokalemia (8405)	prednisone	salmeterol	222	4982	4.27
Serotonin Syndrome (674)	tramadol	duloxetine	57	1301	4.2
QT prolongation (1260)	amiodarone	ciprofloxacin	46	2487	1.82
Rhabdomyolysis (1378)	ciprofloxacin	simvastatin	50	5184	0.96
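
As a worked check of the prevalence formula in the caption, the short Python sketch below recomputes Prev. = a/(a+b) for two rows of Table 1. The function name is hypothetical; the counts are copied from the table.

```python
def prevalence_percent(a, b):
    """Event prevalence among exposed patients, as a percentage: 100 * a / (a + b)."""
    return 100.0 * a / (a + b)

# Rows taken from Table 1
levodopa_lorazepam = round(prevalence_percent(176, 235), 2)      # Parkinsonian Symptoms
amiodarone_metoprolol = round(prevalence_percent(796, 3671), 2)  # Bradycardia
```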

Evaluation

Using an odds ratio [see Methods] as a measure of interaction, and taking the lower bound of the 95% confidence interval to account for variance, we obtained a sensitivity of 37.86% at a specificity of 86.79% and a positive predictive value (PPV) of 74.13%. We found that the performance of the method was below average for Acute Renal Failure, Nephrotoxicity, Hypokalemia and Hyperglycemia [see Discussion]. On removing these events, and following adjustment by PSM, our specificity improved to 96.56% with a PPV of 91.71% (threshold=1.5) [see Table 2]. Figure 4 shows a Receiver Operating Characteristic (ROC) curve with all possible values of sensitivity and specificity that can be achieved by varying the cutoff threshold. Overall performance suggests that the unstructured text contains relevant data for drug interaction detection, achieving an area under the ROC curve of 81.5% for signaling known DDIs.

Table 2.

Performance at selected values of the threshold

Threshold	TP	TN	FP	FN	Sensitivity(%)	Specificity(%)	PPV(%)
1.5	177	450	16	289	37.98	96.56	91.71
1.3	205	442	24	261	43.99	94.85	89.52
1.0	244	423	43	222	52.36	90.77	85.02
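
The columns of Table 2 are related by the standard confusion-matrix definitions, which the following Python sketch makes explicit. The function name is hypothetical; the counts in the usage example are the threshold = 1.5 row of the table.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Return sensitivity, specificity and PPV as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)   # true positives among all true interactions
    specificity = 100.0 * tn / (tn + fp)   # true negatives among all negative examples
    ppv = 100.0 * tp / (tp + fp)           # true positives among all signaled interactions
    return sensitivity, specificity, ppv

# Threshold = 1.5 row of Table 2
sensitivity, specificity, ppv = confusion_metrics(tp=177, tn=450, fp=16, fn=289)
```

The sensitivity and PPV reproduce the tabulated 37.98 and 91.71 when rounded to two decimals. The specificity evaluates to roughly 96.566, which matches the tabulated 96.56 if that value was truncated rather than rounded.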

Figure 4.

Receiver Operating Characteristic (ROC) curve showing sensitivity and specificity levels that can be achieved by varying the threshold.
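
A ROC curve of this kind can be traced by sweeping the threshold over the observed scores, here taken to be the lower confidence bounds of the odds ratios. The Python sketch below is an illustration under assumptions: the function names are hypothetical, the toy scores are invented, and the sweep uses "score >= threshold" at each observed score, which visits the same operating points as a strict comparison against thresholds placed just below each score.

```python
def roc_points(scores, labels):
    """Return ROC points (false positive rate, true positive rate).

    scores: one score per example (e.g. lower CI bound of the odds ratio)
    labels: 1 for known interactions, 0 for negative examples
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as points sorted by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy scores: one negative example outranks one positive example
points = roc_points([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
auc = area_under_curve(points)
```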

Discussion

Early identification of DDIs is of paramount importance. Most existing methods use SRS databases to learn interactions, and use coded information present in EHRs for validation and prioritization. We have shown that it is possible to detect signals of DDIs directly from the unstructured text of EHRs using text mining methods and odds ratios. Since our methods are easy to implement and run rapidly, we could deploy them as an active monitoring tool for detecting unknown interactions for new and existing drugs, thus serving as an important step forward for Phase IV surveillance of drugs and meaningful use of EHRs. Unlike SRSs, EHRs have good longitudinal coverage of patient history and a larger number of measured covariates, and are less affected by reporting and publicity biases; thus, they can provide a more accurate measure of the prevalence of a particular drug interaction in the real world. Augmenting existing drug interaction databases with this prevalence information [30] would in itself be a major step forward in prioritizing interaction alerts in Computerized Physician Order Entry (CPOE) systems, where at present 49% to 96% of all alerts are overridden [31] owing to alert fatigue. Furthermore, this prevalence information could help choose between drugs used in combination therapy. For example, a study found that use of Tacrolimus with statin therapy is safer than use of Cyclosporine A and reduces the risk of rhabdomyolysis [32]. We tested our method on a gold standard comprising known drug interactions as positive examples, and randomly generated interactions as negative examples. Most interaction studies [33,34] use similar techniques to build a gold standard, and this aids in comparison of results. Some events did not perform well in this setting, and this could be due in part to the gold standard or to the methods themselves. The databases from which the known interactions are derived are not exhaustive and this may introduce certain biases.

In future work, we propose to make a distinction between interactions where the event is unrelated to the drugs, and interactions in which either of the drugs is already known to be associated with the event. It is possible that these simple methods work for some kinds of interactions and more sophisticated methods are required for others. Our work has several limitations. Since our workflow relies on text annotation methods to recognize concepts, its performance depends on the accuracy of concept recognition in EHRs. In this study, we use frequency-based methods [35] and manual curation to remove ambiguous terms corresponding to concepts, at the risk of reducing our sensitivity. However, we have found that some concepts are hard to detect in EHRs even after such filtering. Another limitation of the current method is that it fails to identify DDIs that are dependent on the dosage of the drugs in question. Resolving these issues requires more advanced natural language processing methods [36], which increase the computation time required. Also, we do not put a restriction on the time window in which drugs interact, and drug exposures spaced far apart in time may cause several false associations. Lastly, our current methods may suffer from confounding by other variables, which we aim to resolve in future work by adjusting for confounding by co-morbidities and co-prescriptions.

Conclusion

In this paper, we demonstrate the feasibility of using the textual portion of EHRs for learning signals of DDIs and for estimating the prevalence of existing interactions. In the process, we created a gold standard of DDIs that may be useful for detailed characterization of future methods. Owing to the use of simple and fast text mining methods, we are able to learn DDI signals on a database containing millions of text notes without the use of extensive computing infrastructure. To the best of our knowledge, this is the first study to use the textual notes from a clinical data warehouse to generate hypotheses about drug interactions and examine the prevalence of DDIs.

30 in total

1. Detecting drug-drug interactions using a database for spontaneous adverse drug reactions: an example with diuretics and non-steroidal anti-inflammatory drugs.

Authors: E P van Puijenbroek; A C Egberts; E R Heerdink; H G Leufkens
Journal: Eur J Clin Pharmacol Date: 2000-12 Impact factor: 2.953

Review 2. Drug interaction studies: study design, data analysis, and implications for dosing and labeling.

Authors: S-M Huang; R Temple; D C Throckmorton; L J Lesko
Journal: Clin Pharmacol Ther Date: 2007-02 Impact factor: 6.875

3. Drug-drug interactions - a preventable patient safety issue?

Authors: Johanna Strandell; Andrew Bate; Marie Lindquist; I Ralph Edwards
Journal: Br J Clin Pharmacol Date: 2007-07-17 Impact factor: 4.335

4. Estimating the extent of reporting to FDA: a case study of statin-associated rhabdomyolysis.

Authors: Mara McAdams; Judy Staffa; Gerald Dal Pan
Journal: Pharmacoepidemiol Drug Saf Date: 2008-03 Impact factor: 2.890

5. Automated concept-level information extraction to reduce the need for custom software and rules development.

Authors: Leonard W D'Avolio; Thien M Nguyen; Sergey Goryachev; Louis D Fiore
Journal: J Am Med Inform Assoc Date: 2011-06-22 Impact factor: 4.497

6. A novel signal detection algorithm for identifying hidden drug-drug interactions in adverse event reports.

Authors: Nicholas P Tatonetti; Guy Haskin Fernald; Russ B Altman
Journal: J Am Med Inform Assoc Date: 2011-06-14 Impact factor: 4.497

7. A Comprehensive Analysis of Five Million UMLS Metathesaurus Terms Using Eighteen Million MEDLINE Citations.

Authors: Rong Xu; Mark A Musen; Nigam H Shah
Journal: AMIA Annu Symp Proc Date: 2010-11-13

8. Data-driven prediction of drug effects and interactions.

Authors: Nicholas P Tatonetti; Patrick P Ye; Roxana Daneshjou; Russ B Altman
Journal: Sci Transl Med Date: 2012-03-14 Impact factor: 17.956

9. Using temporal patterns in medical records to discern adverse drug events from indications.

Authors: Yi Liu; Paea Lependu; Srinivasan Iyer; Nigam H Shah
Journal: AMIA Jt Summits Transl Sci Proc Date: 2012-03-19

10. A side effect resource to capture phenotypic effects of drugs.

Authors: Michael Kuhn; Monica Campillos; Ivica Letunic; Lars Juhl Jensen; Peer Bork
Journal: Mol Syst Biol Date: 2010-01-19 Impact factor: 11.429
