Discovering adverse drug events combining spontaneous reports with electronic medical records: a case study of conventional DMARDs and biologics for rheumatoid arthritis.

Liwei Wang1,2, Majid Rastegar-Mojarad2, Sijia Liu2,3, Huaji Zhang4, Hongfang Liu2.   

Abstract

The use of multiple data sources is preferred in the surveillance of adverse drug events (ADEs) because of the shortcomings of relying on a single source. In this study, we propose a framework in which the ADEs associated with drugs of interest are systematically discovered from the FDA's Adverse Event Reporting System (AERS) and then validated by mining unstructured clinical notes from Electronic Medical Records (EMRs). This framework has two features. First, a higher priority is given to clinical practice during signal detection and validation. Second, normalization by natural language processing (NLP) facilitates interoperation between the AERS data mining set (AERS-DM) and the EMR. To demonstrate this methodology, we investigated potential ADEs associated with drugs (at the class level) for rheumatoid arthritis (RA) patients. The results demonstrated the feasibility and sufficient accuracy of the framework. The framework can serve as the interface between the informatics domain and the medical domain to facilitate ADE discovery.

Year:  2017        PMID: 28815115      PMCID: PMC5543355     

Source DB:  PubMed          Journal:  AMIA Jt Summits Transl Sci Proc


Introduction

Adverse drug events (ADEs), defined as any undesirable effect of a drug beyond its anticipated therapeutic effects occurring during clinical use[1], are an important public health concern. Although randomized clinical trials (RCTs) are considered the gold standard for identifying safety issues before marketing, they have limitations inherent to their experimental design, including insufficient patient numbers, homogeneous populations, short trial periods, and the exclusion of patients with comorbid diseases. It is therefore well accepted that pre-marketing RCTs may not detect all types of ADEs related to a particular drug in clinical practice. In post-marketing surveillance, the FDA’s Adverse Event Reporting System (AERS) has become an important resource. However, signals from AERS data may include false positives, where an association between a drug and an ADE is incorrectly identified, as well as false negatives, where a true association is missed. Other data sources have therefore been studied for ADE detection; in particular, the secondary use of Electronic Medical Records (EMRs) for further validation or comparison of ADEs has received much attention. EMRs contain rich information in unstructured clinical notes that should not be overlooked[2]. Recently, natural language processing (NLP) has been used to extract drug-ADE pairs for signal detection with the χ² test[3]. The efficacy of mining EMRs for drug-ADE relationships has also been demonstrated[4]. To show that combining AERS with EMRs can improve the accuracy of ADE signal detection, an approach was proposed that produces a highly selective ranked set of candidate ADEs from both AERS and EMRs based on proportionality analysis[5]. That approach discovers ADEs systematically and applies to very general scenarios.
In this study, we propose a framework in which the ADEs associated with drugs of interest are discovered from FDA AERS and then validated by mining unstructured clinical notes, with clinical priorities applied to cohort selection and result analysis. To demonstrate the methodology, we investigate potential ADEs associated with drugs (at the class level) for rheumatoid arthritis (RA) patients.

Background

Rheumatoid arthritis (RA) is the most common type of arthritis in adults in the United States[6]. Conventional disease-modifying anti-rheumatic drugs (DMARDs), including methotrexate (MTX), sulfasalazine, and leflunomide, have been the cornerstone of RA treatment. More recently, biological agents (biologics) such as etanercept have brought major therapeutic advances in treating RA patients[7]. In clinical practice, the safety of medications for RA patients is an important issue. Many studies have examined ADEs associated with DMARDs, biologics, or their combination through randomized controlled trials (RCTs)[8], clinical trials[9], systematic reviews[10], meta-analyses[11], and chart reviews[12]. Because RCTs and other clinical trials cannot reveal all potential ADEs owing to their experimental limits, post-marketing surveillance has become an important means of evaluating drug safety. The FDA Adverse Event Reporting System has been used to discover ADEs associated with biologics for RA, mainly targeting specific ADEs, e.g., ischaemic colitis[13], T-cell non-Hodgkin’s lymphomas[13], neurological events[14], and pneumocystis[15]. In one study, several data sources were used to compare the magnitude of serious adverse events (SAEs) observed in post-marketing reports of tocilizumab (TCZ), one of the biologics for RA patients[16]. However, ADEs were not systematically discovered in that study: the ADEs of interest included only serious hepatic events, gastrointestinal perforation, and cardiovascular events (myocardial infarction and stroke). In this study, we aim to systematically discover ADEs associated with two drug classes (conventional DMARDs and biologics) based on the framework we propose.

Materials and Methods

Figure 1 shows the framework for ADE mining from FDA’s AERS and EMRs, which includes three steps: preprocessing, signal detection, and validation. In preprocessing, NLP was applied to clinical notes in the EMRs. During signal detection, the drugs of interest were first identified in the AERS data mining set (AERS-DM), and then data mining algorithms such as the reporting odds ratio (ROR) were applied to generate potential ADE signals. For the EMRs, the drugs of interest, the cohort taking those drugs, and the outcomes were identified, and outcomes recorded before drug use were removed. Lastly, the overlap between ADE signals and outcomes from the EMRs was further investigated to discover potential ADEs. The details are given below.
Figure 1: Framework for ADE mining from FDA AERS and EMRs
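
The EMR branch of the framework removes outcomes recorded before drug use. A minimal Python sketch of that temporal filter follows. The record layout, the function name `outcomes_after_first_exposure`, the placeholder concept identifiers, and the choice to keep outcomes dated on the same day as the first prescription are all assumptions made for illustration; they are not details given by the study.

```python
from datetime import date


def outcomes_after_first_exposure(prescriptions, outcomes, drugs_of_interest):
    """Keep each patient's outcomes dated on or after their first exposure.

    prescriptions: iterable of (patient_id, drug_name, prescription_date)
    outcomes:      iterable of (patient_id, concept_id, note_date)
    Patients with no prescription for a drug of interest contribute nothing.
    """
    # Earliest prescription date of any drug of interest, per patient.
    first_exposure = {}
    for patient_id, drug, rx_date in prescriptions:
        if drug in drugs_of_interest:
            previous = first_exposure.get(patient_id)
            if previous is None or rx_date < previous:
                first_exposure[patient_id] = rx_date

    # Drop outcomes that predate the first exposure (assumed same-day = kept).
    kept = []
    for patient_id, concept_id, note_date in outcomes:
        start = first_exposure.get(patient_id)
        if start is not None and note_date >= start:
            kept.append((patient_id, concept_id, note_date))
    return kept


# Toy records with made-up identifiers, not data from the study.
prescriptions = [
    ("p1", "methotrexate", date(2005, 3, 1)),
    ("p1", "methotrexate", date(2004, 6, 15)),
    ("p2", "ibuprofen", date(2006, 1, 1)),
]
outcomes = [
    ("p1", "CUI_A", date(2004, 1, 10)),  # before first exposure: dropped
    ("p1", "CUI_B", date(2006, 2, 2)),   # after first exposure: kept
    ("p2", "CUI_B", date(2007, 1, 1)),   # never exposed: dropped
]
kept = outcomes_after_first_exposure(prescriptions, outcomes, {"methotrexate"})
```

Because no upper limit is placed on the time between exposure and outcome, a filter of this shape also retains late-onset events.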

Data Sources

The FDA’s AERS is a database supporting post-marketing safety surveillance for drug and therapeutic biologic products[17]. However, this database contains redundant data, and drugs can be registered under arbitrary names, including trade names, abbreviations, and even typographical errors. To support complex downstream analysis, we previously produced a normalized, knowledge-enhanced data mining set based on AERS, i.e., AERS-DM[18]. Three steps were conducted: de-duplication, drug normalization, and data aggregation. First, redundant reports were removed as suggested by the FDA; this procedure removed multiple reports of the same event. Second, FAERS drug names, along with administration route and dose information, were normalized using the natural language processing (NLP) tool MedEx[19] to RxNorm, a standardized nomenclature for clinical drugs and drug delivery devices[20]. Meanwhile, adverse event terms were mapped to Medical Dictionary for Regulatory Activities (MedDRA) preferred term (PT) codes and classified into MedDRA System Organ Classes (SOCs)[21]. Third, adverse events were aggregated according to MedDRA SOC and PT codes, and drugs were aggregated based on National Drug File–Reference Terminology (NDF-RT) classification information through RxNorm[22]. We processed FAERS data from 2004 through 2011 into AERS-DM, which contains 37,029,228 ADE records. In total, 74% of FAERS unique drug names were normalized to 14,489 unique RxNorm concepts, of which 10,221 (71%) were classified in NDF-RT. The AERS-DM datasets can be downloaded from http://informatics.mayo.edu/adepedia/index.php/Download.
The EMR clinical notes in our study come from a cohort of Employee and Community Health (ECH) patients receiving their primary care at Mayo Clinic over a period of 15 years (1998–2013). This cohort includes 138,000 patients and covers both inpatient and outpatient settings. Problems (outcomes) in those notes are generally itemized entries written as either phrases (e.g., Allergic rhinitis/vasomotor rhinitis) or short sentences (e.g., Her asthma appeared to be very mild). In this study, we chose sections related to diagnosis and lab tests for ADE detection.

Preprocessing

To align with the meaningful use requirement, the CORE Problem List Subset was created to better implement Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT) in EMRs[23]. The CORE Problem List Subset offers good coverage of frequently used terms in problem lists[23]. In a previous study[24], we assessed the coverage of SNOMED CT for codifying problem lists in narrative format by extracting itemized entries from clinical notes[25] and normalizing them to Unified Medical Language System (UMLS) concepts. In this study, we applied the same methodology but kept only UMLS concepts that can be mapped to CORE Problem List Subset codes (the August 2015 version of the CORE Problem List Subset of SNOMED CT was used). MedXN was then used to normalize medications in this cohort to RxNorm codes[26].

Signal detection

DMARDs and biologics are two drug classes for the treatment of RA. The DMARDs studied were methotrexate, leflunomide, hydroxychloroquine and sulfasalazine. The biologics studied were abatacept, adalimumab, anakinra, certolizumab, etanercept, golimumab, infliximab, rituximab, tocilizumab and tofacitinib. RxNorm codes of these generic ingredients were used to extract records from AERS-DM, and records were limited to those with RA as the drug indication. The reporting odds ratio (ROR) was used to detect associations between each drug class (DMARDs or biologics) and ADEs. The ROR is calculated from a 2×2 contingency table[27, 28], where a is the number of reports with the drug class and the ADE, b is the number of reports with the drug class and without the ADE, c is the number of reports with drugs outside this drug class and with the ADE, and d is the number of reports with drugs outside this drug class and without the ADE. In this analysis, the lower bound of the 95% confidence interval of the ROR was used[29]. The R package PhViD 1.0.6 was used for signal detection[30].
From the normalized clinical notes in the EMRs, outcomes were obtained in four steps. First, synonyms of RA from UMLS, i.e., rheumatoid arthritis or polyarthritis rheumatic, were used to identify RA patients. Associated medications, prescription dates, and diagnosis dates were also extracted. Second, patient cohorts were identified with clinical priorities in mind, based on the drugs of interest and their indications. To study the DMARDs class, we defined the cohort as RA patients who took any drug in the DMARDs class and no biologics. According to the 2015 American College of Rheumatology Guideline for the Treatment of Rheumatoid Arthritis, conventional DMARDs are usually used for early RA patients, while biologics are often used for moderate or high disease activity, with or without DMARDs[31]. To study the biologics class, we defined another cohort as RA patients who took any drug in the biologics class, regardless of whether a drug in the DMARDs class was used in combination. This mirrors the situation in AERS-DM, where data mining of biologics for RA did not consider whether DMARDs were used in combination. Two different cohorts were therefore used for the two drug classes. Third, the outcomes of patients in each cohort were identified. Fourth, outcomes recorded before the administration of the drugs of interest were removed to obtain possible ADE signals, i.e., possible consequences of those drugs.
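
The ROR calculation on the 2×2 table can be sketched in Python as below. This is an illustration under the standard textbook definition, ROR = (a·d)/(b·c), with a Wald-type 95% confidence interval on the log scale; it is not the PhViD implementation used in the study. The counts are hypothetical, and flagging a signal when the lower bound exceeds 1 is a common convention assumed here, since the text does not state its cutoff.

```python
import math


def reporting_odds_ratio(a, b, c, d, z=1.96):
    """Return (ROR, lower, upper) for a 2x2 table of report counts.

    a: drug class and ADE           b: drug class, no ADE
    c: other drugs and ADE          d: other drugs, no ADE
    The interval is the usual normal approximation on log(ROR).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this estimate")
    ror = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(ror) - z * se_log)
    upper = math.exp(math.log(ror) + z * se_log)
    return ror, lower, upper


# Hypothetical counts, not taken from AERS-DM.
ror, lower, upper = reporting_odds_ratio(a=30, b=970, c=100, d=8900)

# Assumed convention: a signal requires the lower 95% bound to exceed 1.
is_signal = lower > 1.0
```

Using the lower bound rather than the point estimate makes the criterion more conservative for drug-event pairs with few reports, because small cell counts widen the interval.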

Validation

After obtaining signals associated with DMARDs and biologics from AERS-DM, MedDRA PT codes were mapped to 2012AB UMLS concepts. The overlapping signals for the two drug classes were further analyzed by mapping PT terms to System Organ Class (SOC) terms. For each drug class, we manually compared the overlapping signals against package inserts to filter out confirmed ADEs, and then filtered out complications and other confounding factors to reveal potential ADEs. As examples, we report the top overlapping outcomes associated with biologics and DMARDs, chosen according to three criteria: ROR greater than 2, more than 5 reports in AERS-DM, and incidence in the EMR greater than 5%.
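
The overlap and thresholding steps can be sketched in Python as follows. The sketch assumes both sources have already been mapped to a shared concept identifier and uses the three thresholds stated above as defaults. The function name, the data layout, and the toy values are hypothetical, and incidence is taken here to mean the number of cohort patients with the outcome divided by cohort size.

```python
def top_overlapping_signals(aers_signals, emr_case_counts, cohort_size,
                            min_ror=2.0, min_reports=5, min_incidence=0.05):
    """Intersect AERS signals with EMR outcomes and apply the thresholds.

    aers_signals:    dict concept_id -> (report_count, ror)
    emr_case_counts: dict concept_id -> number of cohort patients with outcome
    Returns rows (concept_id, cases, incidence, reports, ror), sorted by
    incidence in descending order.
    """
    rows = []
    # Only concepts present in both sources count as overlapping signals.
    for concept_id in aers_signals.keys() & emr_case_counts.keys():
        reports, ror = aers_signals[concept_id]
        cases = emr_case_counts[concept_id]
        incidence = cases / cohort_size
        if ror > min_ror and reports > min_reports and incidence > min_incidence:
            rows.append((concept_id, cases, incidence, reports, ror))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Toy inputs with made-up identifiers and counts.
aers = {
    "CUI_A": (30, 2.9),   # passes every threshold
    "CUI_B": (4, 3.5),    # too few reports
    "CUI_C": (50, 1.4),   # ROR too low
    "CUI_D": (80, 4.0),   # absent from the EMR outcomes
}
emr = {"CUI_A": 40, "CUI_B": 30, "CUI_C": 25, "CUI_E": 60}
top = top_overlapping_signals(aers, emr, cohort_size=200)
```

The manual review against package inserts and known complications happens after this step and is not something a filter of this kind can replace.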

Results

In total, there were 497 unique patients with a diagnosis of RA (or a synonym) who took only DMARDs, and 365 unique patients with a diagnosis of RA (or a synonym) who took biologics, regardless of whether DMARDs were co-administered. Table 1 shows signals from AERS-DM and outcomes from clinical notes. More signals found in both AERS-DM and clinical notes were detected for biologics (152) than for DMARDs (147).
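Table 1: Signal detection from AERS-DM and clinical notes

| Drug class | Clinical notes: No. of patients | Clinical notes: No. of outcomes | Clinical notes: No. of outcomes overlapping with AERS-DM (%) | AERS-DM: No. of signals | AERS-DM: No. mapping to UMLS | AERS-DM: No. of signals overlapping with clinical notes (%) |
|---|---|---|---|---|---|---|
| DMARDs | 497 | 2,688 | 147 (5.5%) | 1,311 | 1,311 | 147 (11.2%) |
| Biologics | 365 | 2,595 | 152 (5.9%) | 1,450 | 1,448 | 152 (10.5%) |
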
The overlapping signals for the two drug classes were further analyzed by mapping PT terms to System Organ Class (SOC) terms. Table 2 shows the number of PT terms (signals) for DMARDs and biologics mapping to each SOC. Potential ADEs associated with biologics involved more SOCs than those associated with DMARDs, and the same six SOCs ranked highest for both drug classes.
Table 2: Number of PT terms associated with DMARDs and biologics mapping to SOC

| System Organ Class (SOC) | DMARDs | Biologics |
|---|---|---|
| Respiratory, thoracic and mediastinal disorders | 19 | 19 |
| Infections and infestations | 20 | 17 |
| Musculoskeletal and connective tissue disorders | 16 | 17 |
| Skin and subcutaneous tissue disorders | 15 | 17 |
| Nervous system disorders | 12 | 15 |
| Surgical and medical procedures | 10 | 13 |
| Injury, poisoning and procedural complications | 6 | 7 |
| Reproductive system and breast disorders | 3 | 7 |
| Blood and lymphatic system disorders | 0 | 6 |
| Gastrointestinal disorders | 7 | 6 |
| Investigations | 2 | 5 |
| Renal and urinary disorders | 3 | 5 |
| Eye disorders | 4 | 3 |
| Hepatobiliary disorders | 2 | 3 |
| Immune system disorders | 0 | 3 |
| Metabolism and nutrition disorders | 2 | 2 |
| Neoplasms benign, malignant and unspecified (incl cysts and polyps) | 8 | 2 |
| Cardiac disorders | 1 | 1 |
| Ear and labyrinth disorders | 0 | 1 |
| General disorders and administration site conditions | 2 | 1 |
| Psychiatric disorders | 0 | 1 |
| Vascular disorders | 3 | 1 |
| Immune system disorders | 8 | 0 |

For each drug class, we manually compared the overlapping signals with confirmed ADEs from package inserts. Table 3 shows the results of this analysis. Signals were divided into four categories: first, confirmed ADEs or signs of ADEs in package inserts, such as “vasculitis” for biologics; second, complications of RA, such as “osteoporosis”; third, treatments, such as “appendectomy”; and fourth, potential ADEs, such as “hyperkeratosis” for biologics.
Table 3: Analysis of overlapping signals for each drug class

| Drug class | Confirmed ADEs | Complications | Treatments | Potential ADEs | Total |
|---|---|---|---|---|---|
| DMARDs | 58 (39.5%) | 21 (14.3%) | 10 (6.8%) | 58 (39.5%) | 147 |
| Biologics | 72 (47.4%) | 27 (17.8%) | 11 (7.2%) | 42 (27.6%) | 152 |

The top potential ADEs associated with biologics and DMARDs were chosen according to three criteria: ROR greater than 2, more than 5 reports in AERS-DM, and incidence in the EMR greater than 5%. Table 4 and Table 5 show the top potential ADEs for DMARDs and biologics, respectively, together with the case number and percentage from clinical notes, the report number from AERS-DM, and the ROR.
Table 4: Top potential ADEs for DMARDs. Bold and italic indicate confirmed ADEs or ADE signs, italic indicates complications of RA, and bold indicates possible ADEs.

| Signal | UMLS code | Case number from clinical notes (%) | Report number from AERS-DM | ROR |
|---|---|---|---|---|
| Endometrial cancer | C0476089 | 128 (25.8%) | 30 | 2.87 |
| Rhinorrhea | C1260880 | 111 (22.3%) | 355 | 2.81 |
| Productive cough | C0239134 | 100 (20.1%) | 346 | 2.83 |
| Bladder neoplasm | C0496930 | 71 (14.3%) | 17 | 2.60 |
| Gastroduodenal ulcer | C0030920 | 54 (10.9%) | 7 | 3.07 |
| Amyotrophic lateral sclerosis | C0002736 | 44 (8.9%) | 26 | 2.71 |
| Sinus congestion | C0152029 | 40 (8.0%) | 237 | 4.87 |
| Sjogren’s syndrome | C1527336 | 36 (7.2%) | 72 | 5.54 |
| Respiratory tract congestion | C0242073 | 36 (7.2%) | 230 | 6.59 |
| Bunion | C0006386 | 34 (6.8%) | 79 | 6.41 |
| Sinus headache | C0037195 | 32 (6.4%) | 110 | 3.40 |
| Antinuclear antibody positive | C0151480 | 32 (6.4%) | 92 | 3.01 |
| Metatarsalgia | C0025587 | 31 (6.2%) | 7 | 3.57 |
| Rash pruritic | C0033771 | 27 (5.4%) | 330 | 2.20 |
| Red blood cell sedimentation rate increased | C0151632 | 26 (5.2%) | 228 | 4.07 |

In Table 4, there are 15 signals above the thresholds for DMARDs. Seven signals (46%) are confirmed as ADEs or signs of ADEs in package inserts, shown in bold and italic. Four signals (27%) are identified as complications of RA, shown in italic. The remaining four signals (27%), “endometrial cancer”, “bladder neoplasm”, “Sjogren’s syndrome”, and “amyotrophic lateral sclerosis”, could be possible ADEs following DMARDs that cannot be found in package inserts. In Table 5, there are 18 signals above the thresholds for biologics. Twelve signals (67%) are confirmed as ADEs or signs of ADEs in package inserts, shown in bold and italic. Two signals (11%) are identified as complications of RA, shown in italic. Two signals (11%), “steroid therapy” and “laparoscopy”, can be excluded from ADEs, since they are treatments rather than undesirable effects. The remaining two signals (11%), “Sjogren’s syndrome” and “amyotrophic lateral sclerosis”, could be possible ADEs following biologics that cannot be found in package inserts.
Table 5: Top potential ADEs for biologics. Bold and italic indicate confirmed ADEs or ADE signs, italic indicates complications of RA, and bold indicates possible ADEs.

| Signal | UMLS code | Case number from clinical notes (%) | Report number from AERS-DM | ROR |
|---|---|---|---|---|
| Endometrial cancer | C0476089 | 101 (27.7%) | 51 | 2.49 |
| Rhinorrhea | C1260880 | 82 (22.5%) | 887 | 3.64 |
| Productive cough | C0239134 | 79 (21.6%) | 684 | 2.87 |
| Bladder neoplasm | C0496930 | 66 (18.1%) | 28 | 2.18 |
| Sjogren’s syndrome | C1527336 | 41 (11.2%) | 106 | 4.17 |
| Sinus congestion | C0152029 | 36 (9.9%) | 516 | 5.54 |
| Respiratory tract congestion | C0242073 | 31 (8.5%) | 574 | 8.80 |
| Amyotrophic lateral sclerosis | C0002736 | 30 (8.2%) | 47 | 2.50 |
| Sinus headache | C0037195 | 29 (7.9%) | 244 | 3.90 |
| Steroid therapy | C0149783 | 28 (7.7%) | 8 | 3.63 |
| Rash pruritic | C0033771 | 25 (6.8%) | 195 | 2.91 |
| Oral herpes | C0019345 | 23 (6.3%) | 235 | 3.21 |
| Foot operation | C0188413 | 23 (6.3%) | 195 | 10.63 |
| Squamous cell carcinoma | C0007137 | 21 (5.8%) | 275 | 3.28 |
| Wound | C0033119 | 20 (5.5%) | 233 | 2.69 |
| Laparoscopy | C0031150 | 20 (5.5%) | 7 | 6.82 |
| Bunion | C0006386 | 20 (5.5%) | 161 | 6.85 |
| Pneumonia primary atypical | C1412002 | 19 (5.2%) | 55 | 2.01 |

Discussion

In this study, we demonstrated the framework by exploring potential ADEs associated with drugs for RA patients. ADEs associated with the DMARDs and biologics drug classes for RA patients were first systematically mined from AERS-DM. Cohorts of RA patients on each drug class were then carefully selected according to clinical guidelines. Outcomes following drug use were then extracted from unstructured EMRs, and the overlaps between the signals and the outcomes of RA patients on these drugs were further analyzed to identify potential ADEs.
RA is a systemic autoimmune disease characterized by chronic inflammation that results in a destructive polyarthritis, and many complications may follow it. We therefore took the features of RA fully into account to exclude possible complications from the overlaps between signals from AERS-DM and outcomes from EMRs.
Because regimens vary among institutions, some drugs used in one institution may not be used in another, and EMR data from only a single institution, i.e., Mayo Clinic, were used in this study. To avoid omitting information on drugs and indications, our method does not aim to screen whole databases as was done in the previous study[5]. Instead, as a framework interfacing the informatics domain and the medical domain, it employs more refined strategies based on the drugs of interest and their indications. In the future, we will develop a more general methodology once EMR data from multiple institutions, such as Optum lab, can be obtained.
Some adverse events occur shortly after drug use, within several minutes to several hours. Others occur only after several days, weeks, months or even years of exposure[4]. Therefore, when extracting outcomes, we did not limit the time of outcome occurrence after drug use, which allows detection of late-onset events. However, it may be interesting to examine differences in the time of outcome occurrence in the future.
During the result analysis, we found that potential ADEs such as “endometrial cancer” and “bladder neoplasm” for conventional DMARDs could also be natural consequences of RA. Because RA is a systemic autoimmune disease, patients with RA are at an increased risk for cancer[32]. In the future, we will integrate a case-control study design into the framework based on EMR data to further discriminate such potential ADEs from comorbidities associated with the indications of the drugs of interest.

Conclusions

We proposed a framework for discovering potential ADEs associated with drugs combining both FDA AERS and EMRs. This framework has two features. First, more priority was given to clinical practice. Second, the normalization by NLP facilitated the interoperation between AERS-DM and EMRs. The results demonstrated the feasibility and sufficient accuracy of the framework. The framework can serve as the interface between the informatics domain and the medical domain to facilitate ADE discovery.
  29 in total

1.  Use of proportional reporting ratios (PRRs) for signal generation from spontaneous adverse drug reaction reports.

Authors:  S J Evans; P C Waller; S Davis
Journal:  Pharmacoepidemiol Drug Saf       Date:  2001 Oct-Nov       Impact factor: 2.890

2.  The Unified Medical Language System (UMLS): integrating biomedical terminology.

Authors:  Olivier Bodenreider
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

3.  Neurological events with tumour necrosis factor alpha inhibitors reported to the Food and Drug Administration Adverse Event Reporting System.

Authors:  P Deepak; D J Stobaugh; M Sherid; H Sifuentes; E D Ehrenpreis
Journal:  Aliment Pharmacol Ther       Date:  2013-06-26       Impact factor: 8.171

4.  MedEx: a medication information extraction system for clinical narratives.

Authors:  Hua Xu; Shane P Stenner; Son Doan; Kevin B Johnson; Lemuel R Waitman; Joshua C Denny
Journal:  J Am Med Inform Assoc       Date:  2010 Jan-Feb       Impact factor: 4.497

Review 5.  Adverse drug reactions.

Authors:  M Pirmohamed; A M Breckenridge; N R Kitteringham; B K Park
Journal:  BMJ       Date:  1998-04-25

6.  Combing signals from spontaneous reports and electronic health records for detection of adverse drug reactions.

Authors:  Rave Harpaz; Santiago Vilar; William Dumouchel; Hojjat Salmasian; Krystl Haerian; Nigam H Shah; Herbert S Chase; Carol Friedman
Journal:  J Am Med Inform Assoc       Date:  2012-10-31       Impact factor: 4.497

7.  'Global trigger tool' shows that adverse events in hospitals may be ten times greater than previously measured.

Authors:  David C Classen; Roger Resar; Frances Griffin; Frank Federico; Terri Frankel; Nancy Kimmel; John C Whittington; Allan Frankel; Andrew Seger; Brent C James
Journal:  Health Aff (Millwood)       Date:  2011-04       Impact factor: 6.301

Review 8.  2015 American College of Rheumatology Guideline for the Treatment of Rheumatoid Arthritis.

Authors:  Jasvinder A Singh; Kenneth G Saag; S Louis Bridges; Elie A Akl; Raveendhara R Bannuru; Matthew C Sullivan; Elizaveta Vaysbrot; Christine McNaughton; Mikala Osani; Robert H Shmerling; Jeffrey R Curtis; Daniel E Furst; Deborah Parks; Arthur Kavanaugh; James O'Dell; Charles King; Amye Leong; Eric L Matteson; John T Schousboe; Barbara Drevlow; Seth Ginsberg; James Grober; E William St Clair; Elizabeth Tindall; Amy S Miller; Timothy McAlindon
Journal:  Arthritis Rheumatol       Date:  2015-11-06       Impact factor: 10.995

9.  Tocilizumab in rheumatoid arthritis: a case study of safety evaluations of a large postmarketing data set from multiple data sources.

Authors:  Jeffrey R Curtis; Susana Perez-Gutthann; Samy Suissa; Pavel Napalkov; Natasha Singh; Liz Thompson; Benjamin Porter-Brown
Journal:  Semin Arthritis Rheum       Date:  2014-07-27       Impact factor: 5.532

10.  Certolizumab pegol plus methotrexate 5-year results from the rheumatoid arthritis prevention of structural damage (RAPID) 2 randomized controlled trial and long-term extension in rheumatoid arthritis patients.

Authors:  Josef S Smolen; Ronald van Vollenhoven; Arthur Kavanaugh; Vibeke Strand; Jiri Vencovsky; Michael Schiff; Robert Landewé; Boulos Haraoui; Catherine Arendt; Irina Mountian; David Carter; Désirée van der Heijde
Journal:  Arthritis Res Ther       Date:  2015-09-10       Impact factor: 5.156

