Literature DB >> 32349656

Predicting potential adverse events using safety data from marketed drugs.

Chathuri Daluwatte1, Peter Schotland2, David G Strauss1, Keith K Burkhart1, Rebecca Racz3.   

Abstract

BACKGROUND: While clinical trials are considered the gold standard for detecting adverse events, often these trials are not sufficiently powered to detect difficult to observe adverse events. We developed a preliminary approach to predict 135 adverse events using post-market safety data from marketed drugs. Adverse event information available from FDA product labels and scientific literature for drugs that have the same activity at one or more of the same targets, structural and target similarities, and the duration of post market experience were used as features for a classifier algorithm. The proposed method was studied using 54 drugs and a probabilistic approach of performance evaluation using bootstrapping with 10,000 iterations.
RESULTS: Out of 135 adverse events, 53 had high probability of having high positive predictive value. Cross validation showed that 32% of the model-predicted safety label changes occurred within four to nine years of approval (median: six years).
CONCLUSIONS: This approach predicts 53 serious adverse events with high positive predictive values where well-characterized target-event relationships exist. Adverse events with well-defined target-event associations were better predicted compared to adverse events that may be idiosyncratic or related to secondary target effects that were poorly captured. Further enhancement of this model with additional features, such as target prediction and drug binding data, may increase accuracy.

Entities:  

Keywords:  Adverse reaction; Classifier; Computational biology; Pharmacovigilance

Mesh:

Year:  2020        PMID: 32349656      PMCID: PMC7191698          DOI: 10.1186/s12859-020-3509-7

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

The Food and Drug Administration’s (FDA) proposed process modernization to support new drug development involves establishing a unified post-market safety surveillance framework to monitor the benefits and risks of drugs across their lifecycles [1]. While clinical trials are considered the gold standard for detecting and labeling adverse events, these trials are not sufficiently powered to detect less common adverse events. Additionally, some adverse events emerge when a drug is used in clinical practice outside of the specified inclusion/exclusion criteria. Some adverse events may have high prevalence in specific subpopulations who were not enrolled in the clinical trials or subgroups who cannot be identified based on information collected from patients in the trials. For example, a substantially increased risk of Stevens-Johnson syndrome in patients positive for the HLA-B*1502 allele taking carbamazepine was not identified until decades after approval [2]. In addition, concomitant medications (drug-drug interactions) and comorbidities may also contribute to adverse events, and these interactions are not always adequately present or captured in clinical trials. Therefore, post-market safety surveillance is crucial. FDA uses the FDA Adverse Event Reporting System (FAERS) [3] and the Sentinel Initiative [4] to obtain information about adverse events occurring after drug approval. In 2017, over 1.8 million adverse event cases were reported to the FDA, including nearly 907,000 serious reports and over 164,000 fatal cases [5]. While traditional pharmacovigilance relies on data mining systems, these methods have reporting biases and require manual review of cases to determine reporting accuracy. Recently, there has been a strong interest in developing prediction algorithms to assist in post-market surveillance to overcome such weaknesses and make post-market pharmacovigilance more efficient. Adverse event information from a variety of sources such as FAERS, literature, genomic data, and social media has been used to both evaluate adverse events and make predictions. For example, FAERS and similar post-market databases have demonstrated utility in adverse event prediction; Xu and Wang showed FAERS, combined with literature, had great utility in detecting safety signals [6]. Others have used chemical structure as the basis for adverse event predictions. Vilar and colleagues used molecular fingerprint similarity to drugs with a known association with rhabdomyolysis to further support and prioritize rhabdomyolysis signals found in FAERS [7]. Another unique option has been to use social media reports to identify new adverse events for drugs before they are reported to regulatory agencies or in peer-reviewed literature; Yang and colleagues used a partially supervised classification method to identify reports of adverse events on the discussion forum for Medhelp [3]. Other sources of information for adverse event prediction and detection include electronic health records, drug labels and even bioassay data [8-10]. Additionally, a wide variety of algorithms have been used to make adverse event predictions, including logistic regression models, support vector machine, and ensemble methods [8, 11, 12]. Many of these models have experienced varying degrees of success but overall demonstrate the great potential of developing an adverse event prediction model using a classifier. However, many of these methodologies have focused on predicting a specific adverse event (e.g. cardiovascular events) or drug class (e.g. oncology drugs) [12-14]. Algorithms that can predict a wide variety of adverse events for multiple drug classes are important to enhance post-market safety surveillance. We have previously developed a genetic algorithm to predict approximately 900 adverse events using FDA product labels and FAERS data [15]. In this study, we build on this algorithm to predict 135 adverse events of high priority to regulatory review using safety data from marketed drugs with one or more shared molecular targets. We hypothesize that drugs that have similar modes of action at the same targets will have a similar adverse event profile because of shared structural features and likely target binding characteristics. We additionally expect adverse events that are more closely associated with drug targets (such as serotonin syndrome) to be well-predicted via this methodology. Some idiosyncratic reactions may also be captured well because the shared structural features likely play a role in these reactions where the targets and actions have not yet been fully characterized.

Results

Inclusion and exclusion criteria resulted in 54 test drugs and 213 unique comparator drugs, leading to 287 test-comparator drug combinations. The 54 test drugs used in this study had one to 37 comparator drugs, with one and two comparators being most frequent, as identified by DrugBank (Fig. 1a), and were on the market four to nine years (Fig. 1b). Tanimoto similarity scores between test drugs and comparator drugs ranged between 0.02 and 1, with 0.51 being the mean and 0.5 being the mode. Eighteen test drug-comparator associations included a biologic, as defined by a − 1 Tanimoto score (Fig. 1c). Target cosine similarity scores between test drugs and comparator drugs ranged between 0 and 1, with 0.45 being the mean and 1 being the mode (Fig. 1d). Seventy-nine comparator drugs were approved before 1982, while the most recently approved comparator drug had five years of time in market (Fig. 1e). The 54 test drugs are known to bind to 126 targets based on DrugBank data (summarized in Supplemental Table 1).
Fig. 1

Characteristics of test drugs, comparator drugs and test-comparator drug combinations. a) Distribution of number of comparator drugs for test drug. b) Distribution of time on market for test drugs. c) Tanimoto score distribution for test-comparator drug combinations. d) Target similarity score distribution for test-comparator drug combinations. e) Distribution of time on market for comparator drugs

Characteristics of test drugs, comparator drugs and test-comparator drug combinations. a) Distribution of number of comparator drugs for test drug. b) Distribution of time on market for test drugs. c) Tanimoto score distribution for test-comparator drug combinations. d) Target similarity score distribution for test-comparator drug combinations. e) Distribution of time on market for comparator drugs The prevalence of the 135 adverse events considered in this study is summarized in Fig. 2. The overall prevalence of adverse events was higher in the comparator drugs.
Fig. 2

Prevalence of adverse events within comparator drugs and test drugs

Prevalence of adverse events within comparator drugs and test drugs Prediction models were not made for 26 adverse events that were not observed or observed only in one test drug label (accident, anaphylactoid reaction, aplastic anaemia, apnoea, atrioventricular block, azotaemia, cardiomyopathy, cerebral infarction, coagulopathy, colitis, colitis ulcerative, Crohn’s disease, dermatitis bullous, dermatitis exfoliative, gastric ulcer, granulocytopenia, hepatic necrosis, hypokinesia, injury, myopathy, oliguria, respiratory depression, road traffic accident, skin ulcer, thrombosis, and ulcer). Results at varying thresholds (the minimum percentage of comparator drugs which are predicted positive for an adverse event to result in a positive prediction) for the safety label change evaluation and the number of adverse events with left-skewed positive predictive value, which demonstrated a high probability for high positive predictive value, are summarized in Table 1. Based on these results, we selected 70% as the optimum threshold. This resulted in the highest number of adverse events with high positive predictive values along with a high percentage of predicted safety label changes that were also issued by FDA (32%). All performance histograms at 70% threshold for each adverse event are provided in supplementary materials. Positive predictive value histograms of two well-predicted (i.e. left-skewed histograms) adverse events (febrile neutropenia and hypertension) and two poorly-predicted (i.e. right-skewed histograms) adverse events (bacterial infection and haemorrhage) are shown in Fig. 3.
Table 1

Performance of the algorithm when the threshold to make a positive prediction was varied

ThresholdFDA-issued safety label changes that were correctly predicted (%)Predicted safety label changes that were also FDA-issued (%)Number of adverse events with a high positive predictive value
0431311
10391419
30321828
50182842
60172949
70133253
90113448
Fig. 3

Left-skewed positive predictive value histograms demonstrated well-predicted adverse events, as shown in a) Febrile Neutropenia and b) Hypertension. Right-skewed positive predictive value histograms demonstrated poorly-predicted adverse events, as shown in c) Bacterial Infection and d) Haemorrhage

Performance of the algorithm when the threshold to make a positive prediction was varied Left-skewed positive predictive value histograms demonstrated well-predicted adverse events, as shown in a) Febrile Neutropenia and b) Hypertension. Right-skewed positive predictive value histograms demonstrated poorly-predicted adverse events, as shown in c) Bacterial Infection and d) Haemorrhage Fifty-three adverse events showed 100% as the positive predictive value mode, with the median between 50 and 100, 25% quantile between 0 and 100, and 75% quantile at 100%, which suggests left-skewed distributions. By having a left-skewed distribution for positive predictive value, these adverse events were considered well-predicted, which suggests high probability of having high positive predictive value (Table 2). Additionally, these adverse events had a sensitivity mode between 0 and 100%, specificity mode of 100%, and negative predictive value mode of 50–100%.
Table 2

Performance and prevalence of adverse events that were well-predicted by the algorithm

Adverse EventMedian (25th – 75th quantile) ModePrevalence (%)
SensitivitySpecificityPositive Predictive ValueNegative Predictive ValueComparator DrugsTest Drugs
AGRANULOCYTOSIS50 (50–100) 100100 (100–100) 100100 (100–100) 10091 (82–91) 912311
ANAEMIA25 (17–33) 20100 (100–100) 100100 (100–100) 10055 (45–64) 554542
ANAPHYLACTIC REACTION25 (20–33) 25100 (100–100) 100100 (100–100) 10064 (56–73) 733735
BONE MARROW FAILURE100 (0–100) 100100 (90–100) 100100 (0–100) 10091 (91–91) 91145
BRONCHITIS33 (25–50) 50100 (100–100) 100100 (100–100) 10082 (73–90) 822720
CEREBRAL HAEMORRHAGE33 (0–50) 0100 (90–100) 100100 (0–100) 10082 (82–91) 911415
CHOLESTASIS50 (50–100) 100100 (100–100) 100100 (100–100) 10091 (82–91) 9189
CONFUSIONAL STATE25 (20–33) 25100 (100–100) 100100 (100–100) 10070 (60–80) 735033
DEEP VEIN THROMBOSIS50 (33–100) 50100 (100–100) 100100 (100–100) 10090 (82–91) 911113
DIABETES MELLITUS50 (0–100) 090 (90–100) 10050 (0–100) 10091 (82–91) 91279
DIPLOPIA100 (100–100) 100100 (100–100) 100100 (100–100) 10091 (91–100) 91174
EXTRAPYRAMIDAL DISORDER100 (100–100) 100100 (100–100) 100100 (100–100) 10091 (91–91) 91214
FEBRILE NEUTROPENIA50 (33–100) 50100 (100–100) 100100 (100–100) 10090 (82–91) 91715
GASTROINTESTINAL HAEMORRHAGE33 (25–50) 25100 (100–100) 100100 (100–100) 10073 (64–82) 732731
HEPATIC FAILURE33 (25–50) 33100 (100–100) 100100 (100–100) 10073 (64–82) 732925
HEPATOTOXICITY25 (0–33) 090 (88–100) 10050 (0–100) 10073 (67–82) 732527
HYPERCHOLESTEROLAEMIA33 (0–50) 090 (89–100) 10050 (0–100) 10082 (80–91) 912215
HYPERGLYCAEMIA25 (17–40) 0100 (89–100) 100100 (50–100) 10078 (67–82) 803627
HYPERKINESIA50 (50–100) 100100 (100–100) 100100 (100–100) 10090 (82–91) 913613
HYPERSENSITIVITY20 (14–29) 17100 (80–100) 100100 (50–100) 10044 (33–56) 506758
HYPERTENSION50 (33–67) 50100 (86–100) 100100 (67–100) 10075 (62–83) 675740
HYPOGLYCAEMIA50 (25–100) 50100 (90–100) 100100 (50–100) 10090 (82–91) 912715
IMPAIRED HEALING100 (50–100) 100100 (100–100) 100100 (100–100) 10091 (90–91) 9127
INFECTION50 (33–67) 5089 (86–100) 10067 (50–100) 10080 (73–89) 1004327
INSOMNIA20 (14–25) 17100 (100–100) 100100 (100–100) 10055 (45–64) 555849
INTERSTITIAL LUNG DISEASE50 (33–67) 50100 (89–100) 100100 (50–100) 10089 (80–91) 901218
JAUNDICE33 (0–50) 090 (89–100) 10050 (0–100) 10082 (80–91) 914015
LARYNGEAL OEDEMA100 (67–100) 10088 (79–100) 10050 (33–100) 10091 (82–91) 9187
LEUKOPENIA25 (25–33) 25100 (100–100) 100100 (100–100) 10073 (64–82) 734429
MYOCARDIAL INFARCTION25 (0–40) 0100 (88–100) 100100 (0–100) 10073 (67–82) 804429
NEUROLEPTIC MALIGNANT SYNDROME100 (100–100) 100100 (100–100) 100100 (100–100) 10091 (91–100) 91164
NEUROPATHY PERIPHERAL40 (25–50) 5086 (80–100) 10067 (50–100) 10067 (56–78) 676042
NEUTROPENIA50 (33–60) 50100 (100–100) 100100 (100–100) 10075 (67–82) 782338
OEDEMA20 (14–33) 083 (75–100) 10067 (50–100) 10044 (33–55) 507158
PANCREATITIS50 (25–60) 5090 (86–100) 10075 (50–100) 10080 (70–89) 783329
PANCYTOPENIA50 (25–67) 50100 (90–100) 100100 (50–100) 10090 (82–91) 912115
PNEUMONIA50 (25–67) 5089 (86–100) 10050 (33–100) 10086 (78–90) 1003722
PROTEINURIA50 (33–50) 50100 (100–100) 100100 (100–100) 10089 (82–91) 911215
PULMONARY EMBOLISM50 (25–100) 50100 (90–100) 100100 (50–100) 10090 (82–91) 911813
PULMONARY OEDEMA50 (0–100) 090 (90–100) 10050 (0–100) 10090 (82–91) 91179
RENAL FAILURE50 (33–67) 50100 (100–100) 100100 (100–100) 10089 (80–91) 912818
RENAL IMPAIRMENT50 (33–100) 50100 (100–100) 100100 (100–100) 10091 (82–91) 912711
SEIZURE33 (0–50) 089 (86–100) 10050 (0–100) 10078 (70–88) 804927
SEPSIS50 (33–67) 50100 (100–100) 100100 (100–100) 10088 (78–90) 1002024
SEROTONIN SYNDROME100 (50–100) 100100 (100–100) 100100 (100–100) 10091 (91–100) 100137
STEVENS-JOHNSON SYNDROME33 (20–50) 33100 (90–100) 100100 (50–100) 10080 (73–89) 822824
STOMATITIS33 (0–50) 089 (86–100) 10050 (0–100) 10082 (75–90) 802822
SUICIDAL BEHAVIOUR25 (20–33) 33100 (100–100) 100100 (100–100) 10082 (73–82) 822422
SUPRAVENTRICULAR TACHYCARDIA33 (25–50) 33100 (100–100) 100100 (100–100) 10080 (73–90) 822724
TACHYCARDIA40 (25–50) 5090 (86–100) 10075 (50–100) 10078 (67–88) 786131
THROMBOCYTOPENIA40 (25–50) 5088 (83–100) 10067 (50–100) 10073 (62–80) 675836
UPPER RESPIRATORY TRACT INFECTION20 (17–33) 20100 (86–100) 100100 (50–100) 10060 (50–70) 602844
URINARY TRACT INFECTION25 (20–40) 25100 (88–100) 100100 (67–100) 10067 (56–75) 703240
Performance and prevalence of adverse events that were well-predicted by the algorithm Fifty-six adverse events had positive predictive values mode between 0 and 33%, which suggested right-skewed distribution and thus were considered poorly-predicted (Table 3). While the positive predictive value was low, all these adverse events did have high specificity (mode: 76–100%) and negative predictive value (mode: 55–91%). Two adverse events, bacterial infection and fungal infection, additionally had high sensitivity (mode: 100%) (Table 3).
Table 3

Performance and prevalence of adverse events that were poorly-predicted by the algorithm

Adverse EventMedian (25th – 75th quantile) ModePrevalence (%)
SensitivitySpecificityPositive Predictive ValueNegative Predictive ValueComparator DrugsTest Drugs
ACUTE KIDNEY INJURY0 (0–0) 088 (86–89) 880 (0–0) 073 (64–82) 732327
AGGRESSION0 (0–0) 078 (67–89) 890 (0–0) 082 (82–91) 912215
AMNESIA0 (0–0) 080 (79–89) 800 (0–0) 091 (82–91) 91207
ANGINA PECTORIS0 (0–50) 090 (88–100) 1000 (0–100) 082 (73–91) 823316
ANGIOEDEMA0 (0–0) 086 (83–88) 830 (0–0) 055 (45–64) 554545
ARRHYTHMIA0 (0–0) 090 (89–90) 900 (0–0) 082 (82–91) 914113
BACTERIAL INFECTION75 (31–100) 10080 (78–90) 8033 (20–33) 3391 (82–91) 91311
BLINDNESS0 (0–0) 080 (75–80) 800 (0–0) 091 (82–91) 91811
BRADYCARDIA0 (0–0) 090 (89–90) 900 (0–0) 091 (82–91) 913211
CANDIDA INFECTION0 (0–0) 090 (82–90) 900 (0–0) 091 (91–91) 9185
CARDIAC ARREST0 (0–0) 090 (78–90) 900 (0–0) 091 (82–91) 91219
CARDIAC FAILURE0 (0–33) 089 (88–90) 1000 (0–67) 073 (64–82) 734025
CATARACT0 (0–0) 078 (70–80) 780 (0–0) 091 (82–91) 911611
CELLULITIS0 (0–0) 090 (89–90) 900 (0–0) 091 (82–91) 91129
CEREBROVASCULAR ACCIDENT0 (0–33) 089 (88–100) 890 (0–100) 080 (73–82) 823522
CONJUNCTIVITIS0 (0–0) 080 (70–80) 800 (0–0) 082 (82–91) 912913
DEAFNESS0 (0–0) 080 (70–80) 800 (0–0) 091 (82–91) 91157
DELIRIUM0 (0–0) 075 (70–80) 760 (0–0) 082 (82–91) 911413
DELUSION0 (0–0) 080 (72–80) 800 (0–0) 091 (91–91) 91134
DISORIENTATION0 (0–0) 080 (80–85) 800 (0–0) 091 (91–91) 91145
DRUG REACTION WITH EOSINOPHILIA AND SYSTEMIC SYMPTOMS0 (0–0) 080 (78–89) 780 (0–0) 091 (82–91) 91411
DYSGEUSIA0 (0–0) 089 (86–90) 890 (0–0) 080 (73–89) 823120
ELECTROCARDIOGRAM QT PROLONGED0 (0–0) 089 (89–90) 900 (0–0) 082 (80–91) 822015
EMBOLISM0 (0–0) 090 (90–90) 900 (0–0) 091 (90–91) 9165
EOSINOPHILIA0 (0–0) 090 (89–90) 900 (0–0) 091 (82–91) 91249
ERYTHEMA MULTIFORME0 (0–0) 080 (78–80) 800 (0–0) 091 (82–91) 91259
FALL0 (0–0) 078 (67–89) 900 (0–0) 082 (82–91) 91013
FRACTURE0 (0–0) 090 (90–90) 900 (0–0) 091 (90–91) 9175
FUNGAL INFECTION100 (50–100) 10080 (78–89) 8033 (22–50) 3391 (82–91) 9169
GLAUCOMA0 (0–0) 080 (80–90) 800 (0–0) 091 (91–91) 91145
HAEMATOMA0 (0–0) 089 (88–89) 890 (0–0) 082 (73–82) 821620
HAEMOLYTIC ANAEMIA0 (0–0) 090 (90–90) 900 (0–0) 091 (91–91) 91154
HAEMORRHAGE17 (0–33) 086 (75–88) 10033 (0–50) 064 (56–73) 604436
HALLUCINATION0 (0–0) 089 (89–90) 900 (0–0) 082 (80–91) 913315
HEPATITIS0 (0–25) 088 (83–90) 890 (0–50) 078 (70–82) 804324
HOSTILITY0 (0–0) 080 (78–88) 800 (0–0) 091 (82–91) 91169
MEMORY IMPAIRMENT0 (0–0) 079 (70–80) 800 (0–0) 082 (82–91) 911115
MYOSITIS0 (0–0) 080 (80–90) 800 (0–0) 091 (91–91) 9154
PARALYSIS0 (0–0) 080 (78–90) 800 (0–0) 091 (82–91) 91107
PARANOIA0 (0–0) 080 (80–90) 850 (0–0) 091 (91–91) 91114
PHOTOSENSITIVITY REACTION0 (0–0) 090 (90–90) 900 (0–0) 091 (90–91) 91345
RECTAL HAEMORRHAGE0 (0–0) 080 (80–90) 900 (0–0) 091 (91–91) 91155
RESPIRATORY FAILURE0 (0–0) 089 (89–90) 900 (0–0) 082 (80–91) 911615
RHABDOMYOLYSIS0 (0–0) 089 (80–90) 900 (0–0) 091 (82–91) 911911
SLEEP DISORDER0 (0–0) 080 (80–90) 800 (0–0) 091 (91–91) 91134
SUDDEN DEATH0 (0–0) 080 (70–80) 800 (0–0) 091 (91–91) 91144
THROMBOPHLEBITIS0 (0–0) 090 (90–90) 900 (0–0) 091 (91–91) 91124
TINNITUS0 (0–0) 089 (89–90) 900 (0–0) 082 (80–91) 914115
TOXIC EPIDERMAL NECROLYSIS0 (0–50) 090 (89–100) 1000 (0–100) 084 (80–91) 912515
URTICARIA25 (0–40) 088 (80–90) 10050 (0–75) 070 (60–80) 706433
VAGINAL HAEMORRHAGE0 (0–0) 090 (90–90) 900 (0–0) 091 (91–91) 9194
VASCULITIS0 (0–0) 090 (90–90) 900 (0–0) 091 (90–91) 91155
VENTRICULAR ARRHYTHMIA0 (0–0) 088 (74–90) 900 (0–0) 091 (82–91) 91297
VISION BLURRED0 (0–0) 088 (83–89) 890 (0–0) 082 (73–82) 82022
VISUAL IMPAIRMENT33 (0–50) 090 (89–100) 10050 (0–100) 089 (80–91) 913815
WEIGHT INCREASED0 (0–0) 079 (70–88) 900 (0–0) 082 (73–91) 824318
Performance and prevalence of adverse events that were poorly-predicted by the algorithm

Discussion

In this study we developed a preliminary approach to predict 135 adverse events of high priority to regulatory review using post-market safety data from marketed drugs that have the same activity at one or more of the same targets. We identified 53 adverse events that were well-predicted with this approach and chose to use a threshold which optimizes positive predictive value. These adverse events had varying sensitivity, but high specificity and negative predictive value. A model with high positive predictive value but low sensitivity will miss some true adverse events, but this was deemed acceptable for this study. In discussions about optimizing either positive predictive value or sensitivity in this study, it was deemed more important to identify adverse events that are most likely to be true and save time and effort sifting through false positives. In practice, a balance between sensitivity and positive predictive value would likely be optimal in conjunction with a manual review of predictions. Adverse event predictions based on molecular targets have multiple applications. We may be able to identify difficult to observe events that are not commonly seen in clinical trials to statistical significance. Predicted adverse events may be able to augment post-marketing surveillance activities by providing a list of adverse events to monitor. If an adverse event is discovered during pre-market evaluation or post-market utilization, examination of other drugs with similar pharmacologic mechanism and activity may help evaluate causality of the event and determine if further studies are necessary based on information from all comparators, not necessarily limited to those with the same indication. Particularly, examination of secondary targets may be useful, as this may explain the emergence of an adverse event or why a particular drug is at lower risk for adverse events traditionally labeled as a class adverse event. While the preliminary approach presented here is considered a tool for hypothesis generation, further evaluation and refinement will determine if it is useful in regulatory safety review. The method reported in this study matches safety data based on drug activity at one or more of the same known targets. This may limit the predictive ability, as some adverse events may be idiosyncratic or be associated with unknown secondary targets, and thus the mechanisms responsible for the event have not yet been identified. Associations may still be identified, however, if overlapping structural features capture this unknown shared idiosyncratic activity. This method can be expanded to match a drug not only based on drug activity at one or more of the same targets, but also considering other features which characterize the drug activity, such as Anatomical Therapeutic Chemical (ATC) codes or binding strength (Ki). ATC codes, developed by the World Health Organization, may provide insight into drugs that are related by mechanism or therapeutic use [16]. Binding strength to targets of interest, which may be obtained from literature or databases such as the Psychoactive Drug Screening Program [17] or ChEMBL [18], may provide further classification of target similarity by identifying comparator drugs that bind to targets of interest at a similar order of magnitude. The model also does not capture drug dose that may be needed to produce the required target activity. Fifty-six adverse events were predicted with low positive predictive value. Therefore, a positive prediction for these adverse events should be carefully reviewed by experts before reaching a conclusion. In practice, expert review augments this by assessment of FDA Adverse Event Reporting System (FAERS) reports, literature, and more recently evaluations using insurance claims and electronic health data. Reviewers may examine predictions made by this algorithm by reviewing literature and other databases to identify plausible mechanisms for the drug eliciting the reaction, or review cases in FAERS and electronic health records. More detail about evaluation of safety signals at the FDA can be found in Szarfman et al. [19]. Analysis of the poor-performing adverse events in this study identified several clinical patterns: hemorrhage (including “haemorrhage”, “haematoma”, and “rectal haemorrhage”), infection (including “cellulitis”, “fungal infection”, and “bacterial infection”), and psychiatric (including “paranoia”, “delirium”, and “hallucination”) adverse events were among the worst-performing events by positive predictive value. Many of these adverse events may be idiosyncratic or related to unknown secondary target effects, and therefore it is difficult to predict an adverse event based on the known drug targets. This study may have been limited by the known targets that are available in DrugBank, as DrugBank may not contain all known secondary targets for all drugs. To better capture adverse events that may be related to secondary drug targets, target prediction for the test drugs and comparator drugs may be incorporated to better match comparator drugs to test drugs. DrugBank contains limited target predictions, so another source would be used. This study had several limitations. First, the current version of Embase only allows users to extract manually curated adverse events by date for one drug at a time, which makes this process time-intensive for a large set of test drugs and their comparators and thus limited the number of drugs used in this study. We tried to address this limitation by using a probabilistic approach of performance evaluation using bootstrapping. Creating a tool to automate extraction of these adverse events may alleviate the manual burden. Additionally, text-mining FDA labels for adverse events is most accurate when used on a structured document, and thus we elected to use test drugs that had labels available in SPL format. While an assessment of the text-mining for 20 labels showed positive predictive value, sensitivity, and F-score at approximately 90% (unpublished data, Racz et al., 2018), we anticipate larger text-mining errors. This assessment identified patterns in the text-mining algorithm that may lead to errors, and the query is currently being updated to improve performance. Finally, several adverse events were not observed or observed with low prevalence in the test drug set. Further analysis of these adverse events identified some events that may be associated with targets that were not substantially analyzed. This includes events such as “respiratory depression”, which is particularly associated with drugs such as benzodiazepines and opioids and their related receptors [20], and “hypokinesia”, which may be associated with dopamine receptors [21]. Other adverse events, such as “anaphylactoid reaction” and “apnea”, may be reported interchangeably with other MedDRA Preferred Terms, such as “anaphylactic reaction” and “sleep apnea”, respectively; therefore, these terms may be reported in lower frequency. To better capture this, we may consider alternative groupings or adding additional terms to complete a mechanistically-related grouping.

Conclusions

This classifier algorithm predicts significant adverse events that are of high priority for regulatory monitoring, some of which may be difficult to observe in clinical trials. The prediction algorithm uses evidence of adverse events available through FDA product labels and scientific literature for drugs that have the same activity at one or more of the same targets along with structural and target similarities and the duration of post-market experience. For this study, we prioritized achieving high positive predictive value for the adverse event prediction. The model achieved high positive predictive value on 53 out of 135 adverse events, including several adverse events with well-characterized target relationships. We found that 32% of the model predicted safety label changes were FDA-issued within four to nine years after approval.

Methods

Selection of adverse events for evaluation

This methodology predicts 135 adverse events identified by FDA medical experts and reviewers to be of high priority to regulatory review and the pharmacovigilance efforts of the Office of Surveillance and Epidemiology. High priority was determined by FDA pharmacovigilance experts as events that are serious, may be life-threatening or debilitating, or represent frequent events that result in the need for safety label changes. These 135 adverse events were derived using 167 MedDRA Preferred Terms, grouped by mechanistic similarity according to FDA medical experts. For example, “pancreatitis” and “pancreatitis acute” are mechanistically similar and may be reported interchangeably, thus they were captured as one adverse event, “pancreatitis”. The 135 adverse events and the 167 MedDRA Preferred Terms used to define them are listed in Table 4. MedDRA is the Medical Dictionary for Regulatory Activities and is the international medical terminology developed under the auspices of the International Council for Harmonization of Technical Requirements for Pharmaceuticals for Human Use [22]. MedDRA Preferred Terms are medical concepts for symptoms, signs, diagnoses, indications, investigations, procedures, and medical, social, or family history. The FDA Adverse Event Reporting System (FAERS) currently codes reported adverse events as MedDRA Preferred Terms, and all terms from other sources were converted to MedDRA Preferred Terms as described below.
Table 4

Adverse events defined using MedDRA Preferred Terms. The bolded MedDRA Preferred Term is used to name the adverse event, while all MedDRA Preferred Terms grouped together were used to define that adverse event

Adverse Event
ACCIDENTCONFUSIONAL STATEHALLUCINATIONPULMONARY OEDEMA
ACUTE KIDNEY INJURYCONJUNCTIVITISHEPATIC FAILURERECTAL HAEMORRHAGE
AGGRESSIONCROHN’S DISEASEHEPATIC NECROSISRENAL FAILURE
AGRANULOCYTOSISDEAFNESSHEPATITISRENAL IMPAIRMENT
AMNESIADEEP VEIN THROMBOSISHOSTILITYRESPIRATORY DEPRESSION
ANAEMIADELIRIUMHYPERSENSITIVITYRHABDOMYOLYSIS
ANAPHYLACTOID REACTIONDELUSIONHYPERTENSIONROAD TRAFFIC ACCIDENT
ANGINA PECTORISDERMATITIS BULLOUSHYPOGLYCAEMIASEROTONIN SYNDROME
ANGIOEDEMADERMATITIS EXFOLIATIVEABASIASKIN ULCER
APLASTIC ANAEMIADIABETES MELLITUSIMPAIRED HEALINGSLEEP DISORDER
APNOEADIPLOPIAINFECTIONSTOMATITIS
ARRHYTHMIADISORIENTATIONINJURYSUDDEN DEATH
ATRIOVENTRICULAR BLOCKDYSGEUSIAINSOMNIATACHYCARDIA
AZOTAEMIAEMBOLISMINTERSTITIAL LUNG DISEASETHROMBOCYTOPENIA
BACTERIAL INFECTIONEOSINOPHILIALARYNGEAL OEDEMATHROMBOPHLEBITIS
BLINDNESSERYTHEMA MULTIFORMELEUKOPENIATHROMBOSIS
BONE MARROW FAILURECOLITIS ULCERATIVEMEMORY IMPAIRMENTTINNITUS
BRADYCARDIAFALLMYOPATHYTOXIC EPIDERMAL NECROLYSIS
BRONCHITISFEBRILE NEUTROPENIAMYOSITISULCER
CANDIDA INFECTIONFRACTURENEUTROPENIAVISION BLURRED
CARDIAC ARRESTFUNGAL INFECTIONOLIGURIAURINARY TRACT INFECTION
CARDIOMYOPATHYGLAUCOMAPANCYTOPENIAURTICARIA
CATARACTGRANULOCYTOPENIAPARALYSISVAGINAL HAEMORRHAGE
CELLULITISHAEMATOMAPARANOIAVASCULITIS
GASTROINTESTINAL HAEMORRHAGENEUROLEPTIC MALIGNANT SYNDROMEPHOTOSENSITIVITY REACTIONUPPER RESPIRATORY TRACT INFECTION
CHOLESTASISHAEMOLYTIC ANAEMIAPNEUMONIAWEIGHT INCREASED
COAGULOPATHYHAEMORRHAGEPROTEINURIASEIZURE, EPILEPSY
COLITISCEREBRAL INFARCTIONPULMONARY EMBOLISMSEPSIS, SEPTIC SHOCK
STEVENS-JOHNSON SYNDROMEEXTRAPYRAMIDAL DISORDERJAUNDICE, JAUNDICE CHOLESTATICRESPIRATORY ARREST, RESPIRATORY FAILURE
ANAPHYLACTIC REACTION, ANAPHYLACTIC SHOCKNEUROPATHY PERIPHERAL, PARAESTHESIAHYPERGLYCAEMIA, BLOOD GLUCOSE INCREASEDOEDEMA PERIPHERAL, OEDEMA
CARDIAC FAILURE CONGESTIVE, CARDIAC FAILURETORSADE DE POINTES, ELECTROCARDIOGRAM QT PROLONGEDMYOCARDIAL INFARCTION, ACUTE MYOCARDIAL INFARCTIONATRIAL FIBRILLATION, SUPRAVENTRICULAR TACHYCARDIA
CEREBRAL HAEMORRHAGE, HAEMORRHAGE INTRACRANIAL, CEREBELLAR HAEMORRHAGESUICIDAL BEHAVIOUR, COMPLETED SUICIDE, SUICIDE ATTEMPT, SUICIDAL IDEATIONVENTRICULAR FIBRILLATION, VENTRICULAR ARRHYTHMIA, VENTRICULAR EXTRASYSTOLES, VENTRICULAR TACHYCARDIAHYPERKINESIA, TARDIVE DYSKINESIA, DYSKINESIA, AKATHISIA, DYSTONIA, HYPERTONIA
HYPERCHOLESTEROLAEMIA, HYPERLIPIDAEMIAHEPATOTOXICITY, LIVER INJURYPANCREATITIS, PANCREATITIS ACUTEGASTRIC ULCER, PEPTIC ULCER
CEREBROVASCULAR ACCIDENT, TRANSIENT ISCHAEMIC ATTACKVISUAL ACUITY REDUCED, VISUAL IMPAIRMENT, VISUAL FIELD DEFECTDRUG REACTION WITH EOSINOPHILIA AND SYSTEMIC SYMPTOMS
Adverse events defined using MedDRA Preferred Terms. The bolded MedDRA Preferred Term is used to name the adverse event, while all MedDRA Preferred Terms grouped together were used to define that adverse event

Dataset

Drug set selection

Selection of test drugs

Fifty-four drugs approved by FDA between 2008 and 2013 were chosen for this analysis. Analyses were based on available Structured Product Labeling for products and required both an original label and a subsequent version of the label for this assessment. As Structured Product Labeling began in 2006, 2008 was selected to allow time for the requirement to be adequately implemented. The year 2013 was selected as the upper bound to allow at least four years of post-market experience to 2017, which is the median time for a regulatory action on a safety event (e.g. updating a drug label) [23]. Of the drugs approved between 2008 and 2013, drugs were included as long as there was at least one other U.S. marketed drug with the same pharmacological activity at one or more of the same known targets. Additional inclusion criteria were systemic exposure (e.g. not ophthalmic only) and multiple doses (i.e. drugs with single dose administration were excluded) due to an increased likelihood of multiple and significant adverse events.

Selection of comparator drugs

Comparator drugs, defined as drugs that have the same activity (i.e. agonist or antagonist) at one or more of the same targets as the test drug, were chosen using DrugBank [24]. Test and comparator drug targets were identified if the drug had “pharmacological action” at the target (i.e. the column “pharmacological action” in DrugBank must read “yes” as opposed to “no” or “unknown”) and must have a defined action column in DrugBank (i.e. “antagonist” or “agonist”) at the target. Additionally, the comparator drugs must have been approved in the United States and thus have an FDA product label available.

Features for classifier algorithm

Adverse Events from FDA drug labels

Adverse events were obtained from two versions of the test drug label: the originally-approved FDA product label (between 2008 and 2013) and the drug label as of 2017. The adverse events from the 2017 FDA product label were text-mined using Linguamatics I2E (OnDemand Release, Linguamatics Limited, Cambridge, United Kingdom). Adverse events were extracted as MedDRA Preferred Terms from the Boxed Warnings, Warnings and Precautions, and Adverse Reactions sections. The adverse events from the original product label were manually extracted and translated to MedDRA Preferred Terms by a medical expert from the Boxed Warnings, Warnings and Precautions, and Adverse Reactions sections. Manual curation was employed as Linguamatics OnDemand text-mines the current product label only. Comparator drug adverse events were text-mined using Linguamatics I2E (Enterprise Release, Linguamatics Limited, Cambridge, United Kingdom). Adverse events were extracted as MedDRA Preferred Terms from Boxed Warnings, Warnings and Precautions, and Adverse Reactions sections. For each comparator drug, the FDA product label in use at the time of the respective test drug approval was used as the source for text-mining (e.g.: if a test drug was approved on November 1, 2010, the comparator drug labels that were in use on November 1, 2010 were mined). For each drug label and adverse event, the presence or absence of a MedDRA Preferred Term was indicated by “1” or “0”, respectively. The classifiers were trained on and performance was analyzed using test drug label data from 2017. To assess the algorithm’s ability to predict future safety label changes at the approval date (described in detail in “Classifier” below), the difference between drug label data from 2017 and the label at approval (2008–2013) was used.

Adverse events from scientific literature

Adverse events from scientific literature were mined using Embase Biomedical Database (Elsevier B. V, Amsterdam, The Netherlands), a biomedical database covering journals and conference abstracts [25]. A team of Embase indexers manually curate all adverse events from all full-text articles and associate each adverse event with the related drug. These drugs and adverse events are documented in Emtree terms, Elsevier’s controlled terminology. Therefore, each drug in Embase has hundreds to thousands of adverse events associated with it, and each adverse event-drug association has a curated reference. Adverse events reported for all comparator drugs before their respective test drug’s approval date were searched for in Embase. The list of adverse events documented by Elsevier as Emtree terms for each comparator drug was exported and manually matched to MedDRA Preferred Terms.

Comparator drug duration in market

Comparator time in market was included as a feature. The longer a drug has been marketed, the more adverse events, particularly difficult to observe adverse events, are identified and evaluated for labeling. The duration in market for comparator drugs was determined from the Orange Book [26]. Drugs that were approved before 1982 have an approval date listed as “Approved Prior to Jan 1, 1982”; the duration in market for these drugs was imputed to be 36 years (1982 to 2017).

Structural similarity

Structural similarity was included as a feature as it was hypothesized that the more structurally similar a comparator drug was to a test drug, the more likely they were to share pharmacology, including unknown secondary pharmacology that was not included in this analysis and may contribute to similar idiosyncratic reactions. Structural similarities of each test drug to its respective comparator drugs were determined using Tanimoto scores. Simplified Molecular Input Line Entry System (SMILES) structures for all test and comparator drugs were imported into the Tanimoto Matrix workflow in the KNIME Analytics Platform (version 3.3.2) [27]. Structures were then converted to MACCS 166-bit fingerprints, and structural similarity between the test drug and the respective comparator drug was determined. For biologics where similarity score was not available, − 1 was imputed as Tanimoto score.

Target similarity

Target similarity, or how closely the target profile of each comparator aligned with that of the test drug, was included as a feature as it was hypothesized that the more targets a comparator shares with a test drug, the more likely it is that a comparator and test drug share adverse events. The set of known pharmacological targets for each test drug and corresponding comparator drugs was extracted from DrugBank [24]. Target similarities of each test drug with its comparator drugs were determined using target-based cosine similarity scores. A trivalent drug-by-target matrix was then constructed such that for each drug-target pair an entry of “1” indicates drug-target activation, an entry of “-1” indicates drug-target inhibition, and an entry of “0” indicates no pharmacological activity. Cosine similarities the test drug has with its comparator drugs were then computed as follows:

Classifier

Five features were defined for each comparator-test drug -adverse event association: 1) presence or absence of an adverse event in FDA drug label for the comparator drug; 2) presence or absence of an adverse event in scientific literature for comparator drug; 3) structural similarity between comparator drug and test drug; 4) target similarity between comparator drug and test drug; and 5) duration the comparator drug was on the market (Fig. 4), all of which are independent of each other. These features were used to train a Naïve Bayes classifier, using presence or absence of an adverse event in the 2017 FDA drug label for the test drug as the training label (see section Adverse Events from FDA Drug Labels for details). Given the wide range of prevalence of presence of an adverse event, we anticipated the contribution of prevelance of presence of an adverse event to model prediction would be high. Therefore a Naïve Bayes classifier was chosen in order to take into account both prior probability (i.e. prevelance of presence of an adverse event) and likelihood for presence of an adverse event. All statistical calculations were conducted in R version 3.2.2 (R Foundation for Statistical Computing, Vienna, Austria) and the Naïve Bayes classifier from package e1071 was used [28] (see supplemental materials for code).
Fig. 4

Flow diagram of experimental methods

Flow diagram of experimental methods Due to the limited number of drugs available for testing and the high dimensionality of prediction (135 adverse events), 10,000 bootstrapping steps were conducted by selecting a random set of 44 drugs to train the Naïve Bayes classifier, while leaving 10 drugs for testing at each iteration (i.e. 10,000/ ). A prediction was made by each comparator drug-test drug association for an adverse event of interest. Therefore, since a single test drug can have multiple comparator drugs, there may be multiple predictions for one test drug for each adverse event of interest. To remediate this, if the percentage of comparator drug-test drug combinations that predicted the adverse event of interest was above a predefined threshold, the adverse event was considered a positive prediction for the test drug. Performance was calculated while varying the threshold (0, 10, 30, 50, 60, 70, 90%) above which the percentage of comparator drug-test drug combinations predicted the adverse event of interest to identify the optimum threshold. As 10,000 bootstrapping steps were performed, the most frequent value (mode), median, 25th and 75th quantiles for each of the performance metrics (sensitivity, specificity, positive predictive value and negative predictive value) were calculated to assess the predictive ability for each adverse event. Performance metric histograms for each adverse event are provided in the supplemental materials. We chose to optimize positive predictive value, as false positives may be more costly in terms of additional studies and regulatory review compared to false negatives. Adverse events with a distribution for positive predictive value that was left-skewed (defined as a mode positive predictive value > 75%) were considered well-predicted. Leave-one-out cross validation was performed to evaluate safety label changes. Predictions were evaluated as follows:

Evaluation of false positive predictions

Positive predictions that were made by the Naïve Bayes classifier that were not on the respective 2017 drug label were classified as “false positives”. To further evaluate if these predictions may be early signals not yet on the label, the case count and Proportional Reporting Ratio (PRR) were identified for each drug-adverse event pair from the FDA Adverse Event Reporting System using OpenFDA [29, 30]. Data from June 30, 1989 to January 1, 2018 was used in this analysis. Additional file 1. “Supplemental Materials” contains histograms of the performance for each adverse event; “Supplemental Table 1” contains all targets represented in this study. Additional file 2. Contains Naïve Bayes code, files necessary to run code, and output files obtained to perform analysis described in the paper.
  22 in total

1.  The FDA's sentinel initiative--A comprehensive approach to medical product surveillance.

Authors:  R Ball; M Robb; S A Anderson; G Dal Pan
Journal:  Clin Pharmacol Ther       Date:  2016-01-12       Impact factor: 6.875

2.  Automatic signal extraction, prioritizing and filtering approaches in detecting post-marketing cardiovascular events associated with targeted cancer drugs from the FDA Adverse Event Reporting System (FAERS).

Authors:  Rong Xu; Quanqiu Wang
Journal:  J Biomed Inform       Date:  2013-10-28       Impact factor: 6.317

3.  Predicting adverse drug reactions using publicly available PubChem BioAssay data.

Authors:  Y Pouliot; A P Chiang; A J Butte
Journal:  Clin Pharmacol Ther       Date:  2011-05-25       Impact factor: 6.875

Review 4.  The Pharmacology and Toxicology of the 'Holy Trinity'.

Authors:  Joseph T Horsfall; Jon E Sprague
Journal:  Basic Clin Pharmacol Toxicol       Date:  2016-09-26       Impact factor: 4.080

5.  Enhanced GABA Transmission Drives Bradykinesia Following Loss of Dopamine D2 Receptor Signaling.

Authors:  Julia C Lemos; Danielle M Friend; Alanna R Kaplan; Jung Hoon Shin; Marcelo Rubinstein; Alexxai V Kravitz; Veronica A Alvarez
Journal:  Neuron       Date:  2016-05-18       Impact factor: 17.173

6.  Using electronic health care records for drug safety signal detection: a comparative evaluation of statistical methods.

Authors:  Martijn J Schuemie; Preciosa M Coloma; Huub Straatman; Ron M C Herings; Gianluca Trifirò; Justin Neil Matthews; David Prieto-Merino; Mariam Molokhia; Lars Pedersen; Rosa Gini; Francesco Innocenti; Giampiero Mazzaglia; Gino Picelli; Lorenza Scotti; Johan van der Lei; Miriam C J M Sturkenboom
Journal:  Med Care       Date:  2012-10       Impact factor: 2.983

7.  Postmarket Safety Events Among Novel Therapeutics Approved by the US Food and Drug Administration Between 2001 and 2010.

Authors:  Nicholas S Downing; Nilay D Shah; Jenerius A Aminawung; Alison M Pease; Jean-David Zeitoun; Harlan M Krumholz; Joseph S Ross
Journal:  JAMA       Date:  2017-05-09       Impact factor: 56.272

8.  Multivariate models for prediction of human skin sensitization hazard.

Authors:  Judy Strickland; Qingda Zang; Michael Paris; David M Lehmann; David Allen; Neepa Choksi; Joanna Matheson; Abigail Jacobs; Warren Casey; Nicole Kleinstreuer
Journal:  J Appl Toxicol       Date:  2016-08-02       Impact factor: 3.446

9.  Target-Adverse Event Profiles to Augment Pharmacovigilance: A Pilot Study With Six New Molecular Entities.

Authors:  Peter Schotland; Rebecca Racz; David Jackson; Robert Levin; David G Strauss; Keith Burkhart
Journal:  CPT Pharmacometrics Syst Pharmacol       Date:  2018-10-24

10.  OpenFDA: an innovative platform providing access to a wealth of FDA's publicly available data.

Authors:  Taha A Kass-Hout; Zhiheng Xu; Matthew Mohebbi; Hans Nelsen; Adam Baker; Jonathan Levine; Elaine Johanson; Roselie A Bright
Journal:  J Am Med Inform Assoc       Date:  2015-12-07       Impact factor: 4.497

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.