Commonality of drug-associated adverse events detected by 4 commonly used data mining algorithms.

Toshiyuki Sakaeda, Kaori Kadoyama, Keiko Minami, Yasushi Okuno.

Abstract

OBJECTIVES: Data mining algorithms have been developed for the quantitative detection of drug-associated adverse events (signals) from a large database on spontaneously reported adverse events. In the present study, the commonality of signals detected by 4 commonly used data mining algorithms was examined.
METHODS: A total of 2,231,029 reports were retrieved from the public release of the US Food and Drug Administration Adverse Event Reporting System database between 2004 and 2009. The deletion of duplicated submissions and revision of arbitrary drug names resulted in a reduction in the number of reports to 1,644,220. Associations with adverse events were analyzed for 16 unrelated drugs, using the proportional reporting ratio (PRR), reporting odds ratio (ROR), information component (IC), and empirical Bayes geometric mean (EBGM).
RESULTS: All EBGM-based signals were included in the PRR-based signals as well as IC- or ROR-based ones, and PRR- and IC-based signals were included in ROR-based ones. The PRR scores of PRR-based signals were significantly larger for 15 of 16 drugs when adverse events were also detected as signals by the EBGM method, as were the IC scores of IC-based signals for all drugs; however, no such effect was observed in the ROR scores of ROR-based signals.
CONCLUSIONS: The EBGM method was the most conservative among the 4 methods examined, which suggested its better suitability for pharmacoepidemiological studies. Further examinations should be performed on the reproducibility of clinical observations, especially for EBGM-based signals.

Keywords:  Adverse Event Reporting System; FAERS; adverse event; data mining; database; empirical Bayes geometric mean; information component; proportional reporting ratio; reporting odds ratio; signal; signal detection

Year:  2014        PMID: 24688309      PMCID: PMC3970098          DOI: 10.7150/ijms.7967

Source DB:  PubMed          Journal:  Int J Med Sci        ISSN: 1449-1907            Impact factor:   3.738


Introduction

The US Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS, formerly AERS) is a database that contains information on adverse event and medication error reports submitted to the FDA 1-3. In addition to manufacturers, health care professionals and the general public can submit reports. The FAERS structure adheres to the International Safety Reporting Guidance issued by the International Conference on Harmonisation, ICH E2B, and adverse events are coded to terms in the Medical Dictionary for Regulatory Activities (MedDRA) terminology 4. The original system was initiated in 1969; however, reporting markedly increased following the last major revision in 1997 5, 6. To date, the FAERS contains more than 4 million reports and is the largest repository of spontaneously reported adverse events in the world 5, 6. The FDA releases data to the general public, and this has allowed us to conduct pharmacoepidemiological studies and/or pharmacovigilance analyses. Data mining algorithms have been developed for the quantitative detection of signals 7-11, where a signal indicates an association between a drug and an adverse event, i.e., a drug-associated adverse event. These algorithms include the proportional reporting ratio (PRR) 12, reporting odds ratio (ROR) 13, information component (IC) given by a Bayesian confidence propagation neural network 14, and empirical Bayes geometric mean (EBGM) 15. Associations with adverse events of interest were previously analyzed for 16 drugs using reports in the FAERS database between 2004 and 2009 16-22. Whether an adverse event is detected as a signal has been shown to depend on the algorithm: of the 4 methods, the ROR method provided the highest number of signals, while the EBGM method provided the lowest 23. In the present study, the commonality of PRR-, ROR-, IC-, and EBGM-based signals was examined.

Methods

Data were retrieved from the public release of the FAERS database from the first quarter of 2004 through to the end of 2009. The total number of reports obtained was 2,231,029. Duplicated reports were deleted and arbitrary drug names were revised, which reduced the number of reports from 2,231,029 to 1,644,220. Signal scores, i.e., the PRR, ROR, IC, and EBGM values, were calculated for 16 unrelated drugs to assess associations with adverse events. The drugs comprised 2 antimicrobials (colistin and tigecycline), 4 HMG-CoA reductase inhibitors (statins) (pravastatin, simvastatin, atorvastatin, and rosuvastatin), 2 proton pump inhibitors (PPIs) (omeprazole and esomeprazole), warfarin, 2 antiplatelets (aspirin and clopidogrel), and 5 anticancer agents (cisplatin, carboplatin, oxaliplatin, 5-fluorouracil, and capecitabine). The associations of these drugs with adverse events have already been published 16-22. All values reported are the mean ± standard deviation (SD). The unpaired Student's t-test/Welch's test or the Mann-Whitney U test was used for two-group comparisons of the values. P values of less than 0.05 were considered significant.
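The Methods name the four signal scores but do not give their formulas or detection criteria in this text. As a minimal sketch, assuming the textbook 2x2 contingency-table definitions of the PRR and ROR and a simplified, unshrunk point estimate of the IC, the scores for one drug-event pair could be computed as follows. The class name `TwoByTwo`, the function names, and the counts are hypothetical; the EBGM is omitted because it requires fitting an empirical Bayes prior to the whole database.

```python
import math
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Report counts for one drug-event pair (hypothetical layout)."""
    a: int  # reports with the drug and the event
    b: int  # reports with the drug, without the event
    c: int  # reports without the drug, with the event
    d: int  # reports with neither the drug nor the event


def prr(t: TwoByTwo) -> float:
    """Proportional reporting ratio: event proportion with vs. without the drug."""
    return (t.a / (t.a + t.b)) / (t.c / (t.c + t.d))


def ror(t: TwoByTwo) -> float:
    """Reporting odds ratio: odds of the event with vs. without the drug."""
    return (t.a * t.d) / (t.b * t.c)


def ror_ci95(t: TwoByTwo) -> tuple[float, float]:
    """Approximate 95% confidence interval of the ROR on the log scale."""
    se = math.sqrt(1 / t.a + 1 / t.b + 1 / t.c + 1 / t.d)
    log_ror = math.log(ror(t))
    return math.exp(log_ror - 1.96 * se), math.exp(log_ror + 1.96 * se)


def ic_point(t: TwoByTwo) -> float:
    """Simplified IC: log2 of observed over expected count (no Bayesian prior)."""
    n = t.a + t.b + t.c + t.d
    expected = (t.a + t.b) * (t.a + t.c) / n
    return math.log2(t.a / expected)


# Hypothetical counts for illustration only; not taken from FAERS.
table = TwoByTwo(a=20, b=980, c=200, d=98800)
print(f"PRR = {prr(table):.2f}")
print(f"ROR = {ror(table):.2f}, 95% CI = {ror_ci95(table)}")
print(f"IC  = {ic_point(table):.2f}")
```

Whether a pair counts as a signal then depends on a threshold applied to each score (for example, on the lower confidence bound); those thresholds are not stated here, so none is hard-coded in the sketch.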

Results

Figure 1 shows the relationship among the PRR-, ROR-, IC-, and EBGM-based signals, which was commonly observed for all 16 drugs. All EBGM-based signals were included in the PRR-based signals as well as in the IC- and ROR-based ones. The PRR- and IC-based signals were included in the ROR-based ones. Therefore, ROR-based signals could be stratified into 5 groups: signals detected by the ROR only, signals detected by the ROR and PRR, signals detected by the ROR and IC, signals detected by the ROR, PRR, and IC, and signals detected by all 4 methods. Table 1 lists the numbers of signals in the 5 groups. The ratio of the number of EBGM-based signals to the number of signals detected by the ROR only varied from 3.9% with omeprazole to 57.3% with oxaliplatin. The ratio of the number of EBGM-based signals to the total number of ROR-based signals varied from 1.7% with omeprazole to 20.5% with oxaliplatin.
Figure 1

Commonality of signals detected by 4 commonly used data mining algorithms. PRR: proportional reporting ratio; ROR: reporting odds ratio; IC: information component; EBGM: empirical Bayes geometric mean. ROR-based signals were stratified into 5 groups: signals detected by the ROR only, signals detected by the ROR and PRR, signals detected by the ROR and IC, signals detected by the ROR, PRR, and IC, and signals detected by all 4 methods. The numbers of signals in the 5 groups are listed in Table 1.
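The nesting described for Figure 1 can be expressed with set operations. The sketch below is illustrative only: the adverse-event labels and the function name `stratify` are hypothetical, and only the inclusion pattern (EBGM within both PRR and IC, and PRR and IC within ROR) follows the text.

```python
def stratify(ror, prr, ic, ebgm):
    """Split ROR-based signals into the 5 groups, assuming the nesting holds."""
    # EBGM signals must lie in both PRR and IC; PRR and IC must lie in ROR.
    if not (ebgm <= (prr & ic) and (prr | ic) <= ror):
        raise ValueError("signal sets are not nested as described")
    return {
        "ROR only": ror - prr - ic,
        "ROR&PRR": prr - ic,
        "ROR&IC": ic - prr,
        "ROR&PRR&IC": (prr & ic) - ebgm,
        "ROR&PRR&IC&EBGM": set(ebgm),
    }


# Hypothetical adverse-event labels for one drug.
ebgm_signals = {"AE1"}
prr_signals = {"AE1", "AE2", "AE3"}
ic_signals = {"AE1", "AE3", "AE4"}
ror_signals = {"AE1", "AE2", "AE3", "AE4", "AE5"}

groups = stratify(ror_signals, prr_signals, ic_signals, ebgm_signals)
for name, events in groups.items():
    print(name, sorted(events))
```

Because the five groups are disjoint and together cover every ROR-based signal, their sizes sum to the total number of ROR-based signals.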

Table 1

Numbers of signals in the 5 groups.

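| Drug | ROR only | ROR&PRR | ROR&IC | ROR&PRR&IC | ROR&PRR&IC&EBGM |
|---|---|---|---|---|---|
| Cisplatin | 356 | 98 | 49 | 206 | 175 |
| Carboplatin | 321 | 77 | 80 | 188 | 144 |
| Oxaliplatin | 262 | 64 | 60 | 196 | 150 |
| Colistin | 166 | 18 | 1 | 30 | 23 |
| 5-Fluorouracil | 341 | 82 | 62 | 218 | 161 |
| Capecitabine | 340 | 67 | 51 | 198 | 146 |
| Pravastatin | 358 | 58 | 125 | 141 | 19 |
| Simvastatin | 284 | 61 | 268 | 101 | 30 |
| Atorvastatin | 304 | 65 | 295 | 164 | 55 |
| Rosuvastatin | 295 | 42 | 97 | 122 | 63 |
| Tigecycline | 155 | 18 | 2 | 29 | 44 |
| Omeprazole | 361 | 87 | 244 | 112 | 14 |
| Esomeprazole | 348 | 78 | 201 | 99 | 17 |
| Warfarin | 248 | 62 | 157 | 159 | 110 |
| Aspirin | 385 | 86 | 115 | 162 | 100 |
| Clopidogrel | 287 | 75 | 185 | 187 | 104 |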

PRR: proportional reporting ratio; ROR: reporting odds ratio; IC: information component; EBGM: empirical Bayes geometric mean.

ROR-based signals were stratified into 5 groups: signals detected by the ROR only, signals detected by the ROR and PRR, signals detected by the ROR and IC, signals detected by the ROR, PRR, and IC, and signals detected by all 4 methods.
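The per-method totals and the ratios quoted in the Results follow directly from the five group counts. The sketch below recomputes them for three drugs, with counts as read from Table 1; the dictionary name `TABLE1` and the function name `summarize` are hypothetical.

```python
# Counts as read from Table 1, in the order:
# (ROR only, ROR&PRR, ROR&IC, ROR&PRR&IC, ROR&PRR&IC&EBGM)
TABLE1 = {
    "Omeprazole": (361, 87, 244, 112, 14),
    "Oxaliplatin": (262, 64, 60, 196, 150),
    "Cisplatin": (356, 98, 49, 206, 175),
}


def summarize(counts):
    """Derive per-method signal totals and EBGM ratios from the 5 groups."""
    ror_only, ror_prr, ror_ic, ror_prr_ic, all_four = counts
    total_ror = sum(counts)  # every signal is an ROR-based signal
    return {
        "ROR": total_ror,
        "PRR": ror_prr + ror_prr_ic + all_four,
        "IC": ror_ic + ror_prr_ic + all_four,
        "EBGM": all_four,
        "EBGM / ROR only (%)": round(100 * all_four / ror_only, 1),
        "EBGM / ROR (%)": round(100 * all_four / total_ror, 1),
    }


for drug, counts in TABLE1.items():
    print(drug, summarize(counts))
```

For cisplatin, for example, the derived totals (884 ROR-based, 479 PRR-based, and 430 IC-based signals) agree with the N values listed in Tables 3, 2, and 4, respectively.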

Table 2 lists the PRR scores of PRR-based signals. Since PRR-based signals could be divided into 2 groups based on whether adverse events were also detected as signals by the EBGM method (Figure 1), the effect of additional detection by the EBGM method on PRR scores was examined. As shown in Table 2, the scores were significantly larger for 15 of 16 drugs when adverse events were also detected as signals by the EBGM method. Tables 3 and 4 show data on the ROR and IC, respectively. The effect of additional detection by the EBGM method found for PRR scores was not observed for ROR scores, whereas the IC scores of IC-based signals behaved like the PRR scores of PRR-based signals: they were significantly larger for all 16 drugs when adverse events were also detected by the EBGM method.
Table 2

PRR scores of PRR-based signals (the signals detected by the PRR method).

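Values are the mean ± SD.

| Drug | All: N | All: PRR | Detected by EBGM: N | Detected by EBGM: PRR | Not detected by EBGM: N | Not detected by EBGM: PRR | p |
|---|---|---|---|---|---|---|---|
| Cisplatin | 479 | 8.03 ± 11.29 | 175 | 12.90 ± 16.73 | 304 | 5.23 ± 4.36 | < 0.001 |
| Carboplatin | 409 | 6.80 ± 8.32 | 144 | 10.57 ± 12.25 | 265 | 4.76 ± 3.69 | < 0.001 |
| Oxaliplatin | 410 | 7.72 ± 11.47 | 150 | 11.69 ± 17.16 | 260 | 5.43 ± 4.90 | < 0.001 |
| Colistin | 71 | 29.30 ± 83.82 | 23 | 77.31 ± 136.92 | 48 | 6.29 ± 4.66 | < 0.001 |
| 5-Fluorouracil | 461 | 7.52 ± 10.03 | 161 | 11.61 ± 14.90 | 300 | 5.33 ± 4.72 | < 0.001 |
| Capecitabine | 411 | 8.09 ± 13.06 | 146 | 12.07 ± 20.26 | 265 | 5.90 ± 5.09 | < 0.001 |
| Pravastatin | 218 | 4.70 ± 4.26 | 19 | 10.48 ± 8.61 | 199 | 4.15 ± 3.11 | < 0.001 |
| Simvastatin | 192 | 4.50 ± 4.81 | 30 | 8.99 ± 10.33 | 162 | 3.66 ± 1.94 | < 0.001 |
| Atorvastatin | 284 | 3.76 ± 1.93 | 55 | 4.41 ± 1.99 | 229 | 3.61 ± 1.89 | < 0.001 |
| Rosuvastatin | 227 | 5.20 ± 5.77 | 63 | 8.50 ± 9.37 | 164 | 3.94 ± 2.65 | < 0.001 |
| Tigecycline | 91 | 37.88 ± 114.30 | 44 | 72.09 ± 158.16 | 47 | 5.85 ± 3.57 | < 0.001 |
| Omeprazole | 213 | 4.69 ± 5.05 | 14 | 12.29 ± 15.28 | 199 | 4.16 ± 2.77 | 0.003 |
| Esomeprazole | 194 | 4.65 ± 3.83 | 17 | 7.19 ± 9.50 | 177 | 4.41 ± 2.68 | 0.513 |
| Warfarin | 331 | 5.28 ± 4.95 | 110 | 7.46 ± 7.38 | 221 | 4.19 ± 2.47 | < 0.001 |
| Aspirin | 348 | 5.56 ± 4.93 | 100 | 8.05 ± 7.39 | 248 | 4.56 ± 2.96 | < 0.001 |
| Clopidogrel | 366 | 4.85 ± 3.79 | 104 | 6.77 ± 5.44 | 262 | 4.08 ± 2.52 | < 0.001 |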

PRR-based signals were divided into 2 groups based on whether adverse events were also detected by the EBGM method.
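The Methods list the unpaired Student's t-test/Welch's test or the Mann-Whitney U test for two-group comparisons, without stating which test was applied to which drug, so the p values in Table 2 cannot be reproduced from summary statistics alone. Purely to illustrate the mechanics of one of the named tests, the sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom from a mean, SD, and N per group, using the esomeprazole row as read from Table 2. The function name `welch_t` is hypothetical.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and approximate degrees of freedom from summaries."""
    v1 = sd1 ** 2 / n1  # squared standard error of group 1
    v2 = sd2 ** 2 / n2  # squared standard error of group 2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Esomeprazole, PRR scores: detected by EBGM vs. not detected by EBGM.
t_stat, df = welch_t(7.19, 9.50, 17, 4.41, 2.68, 177)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A rank-based test such as the Mann-Whitney U test needs the individual scores rather than summaries, which is one reason a result recomputed this way need not match the published p value.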

Table 3

ROR scores of ROR-based signals (the signals detected by the ROR method).

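Values are the mean ± SD.

| Drug | All: N | All: ROR | Detected by EBGM: N | Detected by EBGM: ROR | Not detected by EBGM: N | Not detected by EBGM: ROR | p |
|---|---|---|---|---|---|---|---|
| Cisplatin | 884 | 15.75 ± 34.12 | 175 | 13.92 ± 20.63 | 709 | 16.20 ± 36.69 | 0.002 |
| Carboplatin | 810 | 14.95 ± 43.93 | 144 | 11.07 ± 14.11 | 666 | 15.78 ± 47.96 | 0.001 |
| Oxaliplatin | 732 | 12.32 ± 31.94 | 150 | 12.41 ± 20.31 | 582 | 12.29 ± 34.32 | < 0.001 |
| Colistin | 238 | 57.84 ± 165.03 | 23 | 78.97 ± 141.67 | 215 | 55.58 ± 167.47 | 0.028 |
| 5-Fluorouracil | 864 | 14.89 ± 37.82 | 161 | 12.34 ± 18.38 | 703 | 15.47 ± 40.99 | 0.001 |
| Capecitabine | 802 | 17.16 ± 54.77 | 146 | 13.10 ± 25.20 | 656 | 18.06 ± 59.35 | 0.097 |
| Pravastatin | 701 | 10.00 ± 23.37 | 19 | 10.92 ± 9.30 | 682 | 9.97 ± 23.64 | 0.019 |
| Simvastatin | 744 | 5.37 ± 7.17 | 30 | 11.03 ± 16.14 | 714 | 5.13 ± 6.45 | < 0.001 |
| Atorvastatin | 883 | 5.14 ± 8.66 | 55 | 4.61 ± 2.24 | 828 | 5.18 ± 8.92 | < 0.001 |
| Rosuvastatin | 619 | 11.87 ± 27.18 | 63 | 8.93 ± 10.68 | 556 | 12.21 ± 28.44 | 0.074 |
| Tigecycline | 248 | 70.05 ± 381.27 | 44 | 74.82 ± 170.86 | 204 | 69.03 ± 413.14 | 0.008 |
| Omeprazole | 818 | 6.39 ± 11.04 | 14 | 16.92 ± 26.68 | 804 | 6.20 ± 10.51 | 0.003 |
| Esomeprazole | 743 | 6.83 ± 10.03 | 17 | 8.05 ± 11.77 | 726 | 6.80 ± 9.99 | 0.308 |
| Warfarin | 736 | 7.81 ± 13.74 | 110 | 8.36 ± 10.06 | 626 | 7.72 ± 14.30 | < 0.001 |
| Aspirin | 848 | 11.86 ± 35.85 | 100 | 8.38 ± 8.38 | 748 | 12.32 ± 38.02 | 0.033 |
| Clopidogrel | 838 | 6.20 ± 9.01 | 104 | 7.19 ± 6.26 | 734 | 6.06 ± 9.33 | < 0.001 |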

ROR-based signals were divided into 2 groups based on whether adverse events were also detected by the EBGM method.

Table 4

IC scores of IC-based signals (the signals detected by the IC method).

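Values are the mean ± SD.

| Drug | All: N | All: IC | Detected by EBGM: N | Detected by EBGM: IC | Not detected by EBGM: N | Not detected by EBGM: IC | p |
|---|---|---|---|---|---|---|---|
| Cisplatin | 430 | 1.64 ± 0.67 | 175 | 2.22 ± 0.55 | 255 | 1.24 ± 0.39 | < 0.001 |
| Carboplatin | 412 | 1.51 ± 0.66 | 144 | 2.15 ± 0.53 | 268 | 1.16 ± 0.42 | < 0.001 |
| Oxaliplatin | 406 | 1.60 ± 0.69 | 150 | 2.22 ± 0.62 | 256 | 1.23 ± 0.41 | < 0.001 |
| Colistin | 54 | 1.82 ± 0.52 | 23 | 2.25 ± 0.47 | 31 | 1.51 ± 0.28 | < 0.001 |
| 5-Fluorouracil | 441 | 1.62 ± 0.70 | 161 | 2.32 ± 0.54 | 280 | 1.22 ± 0.40 | < 0.001 |
| Capecitabine | 395 | 1.66 ± 0.70 | 146 | 2.31 ± 0.63 | 249 | 1.28 ± 0.41 | < 0.001 |
| Pravastatin | 285 | 1.03 ± 0.48 | 19 | 1.98 ± 0.28 | 266 | 0.96 ± 0.41 | < 0.001 |
| Simvastatin | 399 | 0.81 ± 0.50 | 30 | 1.96 ± 0.51 | 369 | 0.72 ± 0.36 | < 0.001 |
| Atorvastatin | 514 | 0.92 ± 0.52 | 55 | 1.88 ± 0.41 | 459 | 0.80 ± 0.41 | < 0.001 |
| Rosuvastatin | 282 | 1.27 ± 0.68 | 63 | 2.18 ± 0.60 | 219 | 1.00 ± 0.42 | < 0.001 |
| Tigecycline | 75 | 2.05 ± 0.68 | 44 | 2.44 ± 0.58 | 31 | 1.50 ± 0.34 | < 0.001 |
| Omeprazole | 370 | 0.80 ± 0.50 | 14 | 1.96 ± 0.44 | 356 | 0.75 ± 0.44 | < 0.001 |
| Esomeprazole | 317 | 0.84 ± 0.48 | 17 | 1.78 ± 0.37 | 300 | 0.79 ± 0.43 | < 0.001 |
| Warfarin | 426 | 1.28 ± 0.76 | 110 | 2.19 ± 0.71 | 316 | 0.97 ± 0.47 | < 0.001 |
| Aspirin | 377 | 1.34 ± 0.68 | 100 | 2.18 ± 0.50 | 277 | 1.04 ± 0.45 | < 0.001 |
| Clopidogrel | 476 | 1.20 ± 0.66 | 104 | 2.08 ± 0.56 | 372 | 0.95 ± 0.45 | < 0.001 |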

IC-based signals were divided into 2 groups based on whether adverse events were also detected by the EBGM method.

Discussion

Several studies have compared data mining algorithms 13, 24-29; however, as Bate and Evans recently concluded 7, different algorithms have slightly different properties, such that one may be preferable in a particular application. If used for pharmacovigilance, data mining algorithms should be assessed from the standpoint of early and timely signal detection 30-33. Although few studies have published comparative data, Chen et al. recently compared the timing of early signal detection with the PRR, ROR, IC, and EBGM using the FAERS database, and concluded that the ROR performed better 30. We previously reported that the ROR method provided the highest number of signals, while the EBGM method provided the lowest 23. The difference in the number of signals can be attributed either to a higher rate of false positives or to a lower ability to detect signals.

In the present study, the commonality of signals was clarified, as shown in Figure 1. The EBGM method was shown to be the most conservative among the 4 methods, which suggested that it was suitable for pharmacoepidemiological studies. In contrast, the ROR method was shown to be the most comprehensive, indicating its usefulness for pharmacovigilance. These results were consistent with the findings of Chen et al. 30. These 4 data mining algorithms were used in our previous studies 16-22, and adverse events were listed as drug-associated when at least 1 of the 4 indices met the criteria. However, the results shown in Figure 1 demonstrated that such lists are identical to those obtained with the ROR method alone, which suggested that care should be taken in interpreting data when signals are not detected by the EBGM method.

Based on the number of signals, the 16 drugs could be classified into 4 groups. Group 1 included the 2 antimicrobials, which were characterized by a lower number of signals. The total number of co-occurrences was only 1,491 with colistin and 1,906 with tigecycline. These were markedly fewer than those of the other 14 drugs, which ranged from 33,197 with oxaliplatin to 220,194 with atorvastatin. The lower number of signals can be explained by comparatively infrequent use and, therefore, a smaller number of reports in the database. This is not related to the reliability of the signals. Group 2 included the 4 statins and 2 PPIs, characterized by a lower number of EBGM-based signals, and group 3 included warfarin and the 2 antiplatelets, characterized by a higher number of EBGM-based signals. Group 4 included the 5 anticancer agents, characterized by a much higher number of EBGM-based signals. The total number of ROR-based signals was similar among drugs in groups 2-4, ranging from 619 with rosuvastatin to 884 with cisplatin. The ROR method detects more signals, including false positives, than the EBGM method. The difference observed in the ratio of EBGM-based to ROR-based signals may reflect whether adverse events are generally found.

A pilot study performed by Hochberg et al. in 2009 concerning drug-versus-drug comparisons revealed that the rank-order of adverse event rates in the FAERS database was consistent with the results of published studies 34, which encouraged the use of the database for comparisons. In other investigations, the number of reports with or without normalization by usage or sales during the corresponding period was used to compare drugs 35; however, adverse events are underreported, which may lead to incorrect conclusions 36-38. Signal scores have also been considered inappropriate for determining the rank-order of drugs in terms of risk; however, few studies have been published to date. In the present study, the EBGM method was shown to be the most conservative among the 4 methods; therefore, it is important to confirm whether this method can provide important information similar to that in well-organized clinical studies.
References (34 in total)

1.  The reporting odds ratio versus the proportional reporting ratio: 'deuce'.

Authors:  Patrick Waller; Eugène van Puijenbroek; Antoine Egberts; Stephen Evans
Journal:  Pharmacoepidemiol Drug Saf       Date:  2004-08       Impact factor: 2.890

Review 2.  Perspectives on the use of data mining in pharmaco-vigilance.

Authors:  June Almenoff; Joseph M Tonning; A Lawrence Gould; Ana Szarfman; Manfred Hauben; Rita Ouellet-Hellstrom; Robert Ball; Ken Hornbuckle; Louisa Walsh; Chuen Yee; Susan T Sacks; Nancy Yuen; Vaishali Patadia; Michael Blum; Mike Johnston; Charles Gerrits; Harry Seifert; Karol Lacroix
Journal:  Drug Saf       Date:  2005       Impact factor: 5.606

3.  Adverse drug event surveillance and drug withdrawals in the United States, 1969-2002: the importance of reporting suspected reactions.

Authors:  Diane K Wysowski; Lynette Swartz
Journal:  Arch Intern Med       Date:  2005-06-27

4.  Criteria revision and performance comparison of three methods of signal detection applied to the spontaneous reporting database of a pharmaceutical manufacturer.

Authors:  Yasuyuki Matsushita; Yasufumi Kuroda; Shinpei Niwa; Satoshi Sonehara; Chikuma Hamada; Isao Yoshimura
Journal:  Drug Saf       Date:  2007       Impact factor: 5.606

5.  Drug-versus-drug adverse event rate comparisons: a pilot study based on data from the US FDA Adverse Event Reporting System.

Authors:  Alan M Hochberg; Ronald K Pearson; Donald J O'Hara; Stephanie J Reisinger
Journal:  Drug Saf       Date:  2009       Impact factor: 5.606

6.  Serious adverse drug events reported to the Food and Drug Administration, 1998-2005.

Authors:  Thomas J Moore; Michael R Cohen; Curt D Furberg
Journal:  Arch Intern Med       Date:  2007-09-10

7.  A Bayesian neural network method for adverse drug reaction signal generation.

Authors:  A Bate; M Lindquist; I R Edwards; S Olsson; R Orre; A Lansner; R M De Freitas
Journal:  Eur J Clin Pharmacol       Date:  1998-06       Impact factor: 2.953

8.  Update on adverse drug events associated with parenteral iron.

Authors:  Glenn M Chertow; Phillip D Mason; Odd Vaage-Nilsen; Jarl Ahlmén
Journal:  Nephrol Dial Transplant       Date:  2005-11-11       Impact factor: 5.992

9.  Was the thrombotic risk of rofecoxib predictable from the French Pharmacovigilance Database before 30 September 2004?

Authors:  A Sommet; S Grolleau; H Bagheri; M Lapeyre-Mestre; J L Montastruc
Journal:  Eur J Clin Pharmacol       Date:  2008-05-29       Impact factor: 2.953

Review 10.  Novel statistical tools for monitoring the safety of marketed drugs.

Authors:  J S Almenoff; E N Pattishall; T G Gibbs; W DuMouchel; S J W Evans; N Yuen
Journal:  Clin Pharmacol Ther       Date:  2007-05-30       Impact factor: 6.875
