Muhammad Imran, Aasia Bhatti, David M King, Magnus Lerch, Jürgen Dietrich, Guy Doron, Katrin Manlik.
Abstract
INTRODUCTION: Signal validation in pharmacovigilance is the process of evaluating data to decide whether evidence is sufficient to justify further assessment of a detected signal. During the signal validation process, safety experts in our organization are required to review signals of disproportionate reporting (SDRs) and classify them into one of six predefined categories.
Year: 2022 PMID: 35579820 PMCID: PMC9114067 DOI: 10.1007/s40264-022-01159-2
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.228
Fig. 1 Overall distribution of validated signals of disproportionate reporting over various categories in the historic signal validation data extracted for phase I and II of the experiment. ADR adverse drug reaction
Case data attributes extracted from the spontaneous reporting database, and signal of disproportionate reporting data attributes extracted from the signal detection data mart
| Data source | Attribute level | Attributes | ICH E2B(R3) reference<sup>a</sup> |
|---|---|---|---|
| Case data from safety database | Case attributes | Report type; Country of incidence; Case medically confirmed | C.1.3; E.i.9; E.i.8 |
| Case data from safety database | Patient attributes | Age group; Gender; Ethnicity; Pregnancy | D.2.3; D.5; Not available; Not available |
| Case data from safety database | Product attributes | Medicinal product name (suspect products)<sup>b</sup>; List of indications (1–3) as preferred terms | G.k.2.1.1b/G.k.2.1.2b; G.k.7.r.2b |
| Case data from safety database | Event attributes | Event preferred term<sup>b</sup>; Event seriousness; Event outcome | E.i.2.1b; E.i.3.2; E.i.7 |
| Case data from safety database | PEC attributes | Time to onset of event; Dechallenge; Rechallenge; Event listedness; Reporter causality; Company causality | G.k.9.i.3.1; G.k.8 and E.i.7; G.k.9.i.4; Not available; G.k.9.i.2.r.1 and r.3; G.k.9.i.2.r.1 and r.3 |
| SDR data from signal detection data mart | SDR attributes | Medicinal product name (suspect product of interest)<sup>b</sup>; Event preferred term<sup>b</sup>; Flags: DME flag (as per company-specific DME list), listed flag (as per company core data sheet), trend flag (indicating an increased period frequency) | Not available |
| SDR data from signal detection data mart | Case counts | Case counts for this PEC, each (a) cumulative, (b) for the current period<sup>c</sup>, and (c) for the prior period<sup>c</sup>: total number of cases (all report types); number of cases with report type spontaneous or literature; number of cases with report type study or published report from study; number of serious cases; number of fatal cases; case frequency<sup>d</sup> | Not available |
| SDR data from signal detection data mart | SDR validation attributes | SDR validation outcome<sup>e</sup> | Not available |
DME designated medical event, ICH International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use, PEC product–event combination, SDR signal of disproportionate reporting
<sup>a</sup> E2B(R3) data types and values were not explicitly retrieved and used as specified by the ICH. The E2B(R3) reference is listed only for clarity and attribute identification
<sup>b</sup> Attributes used for linking the two datasets
<sup>c</sup> The signal detection periodicity is monthly. “Current period” refers to a 1-month look-back into the previous month. “Prior period” refers to a look-back into the month in which an SDR for the same PEC was last validated
<sup>d</sup> Number of cases for this PEC divided by the number of cases for the product
<sup>e</sup> Signal validation classification assigned to the SDR by a safety expert in the past
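The case frequency in footnote d is a simple ratio, and a short sketch may make it concrete. The case records, product names, and the `case_frequency` helper below are hypothetical and only illustrate the definition; they are not the authors' data mart logic.

```python
# Hypothetical case records: (case id, product, event preferred term).
cases = [
    (1, "Product A", "Headache"),
    (2, "Product A", "Headache"),
    (3, "Product A", "Nausea"),
    (4, "Product A", "Rash"),
    (5, "Product B", "Headache"),
]


def case_frequency(cases):
    """Return {(product, event): cases for the PEC / cases for the product}."""
    product_cases = {}  # product -> set of case ids
    pec_cases = {}      # (product, event) -> set of case ids
    for case_id, product, event in cases:
        product_cases.setdefault(product, set()).add(case_id)
        pec_cases.setdefault((product, event), set()).add(case_id)
    return {
        pec: len(ids) / len(product_cases[pec[0]])
        for pec, ids in pec_cases.items()
    }


frequency = case_frequency(cases)
print(frequency[("Product A", "Headache")])  # 2 of the 4 Product A cases
```

Sets of case ids are used so that a case reported with the same event more than once would be counted only once; whether the original system deduplicates this way is an assumption.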
Example of how features were engineered from the Individual Case Safety Report data for the Rechallenge attribute by creating two features (total and percent) for each available Rechallenge value (yes, no, unknown).
ICSR data:

| Case number | Product | Event | Rechallenge |
|---|---|---|---|
| 1 | 3 | 2 | Yes |
| 2 | 3 | 2 | Yes |
| 3 | 3 | 2 | No |
| 4 | 3 | 2 | No |
| 5 | 3 | 2 | Unknown |
ICSR Individual Case Safety Report
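The aggregation described in the caption can be sketched as follows, using the five ICSR rows from the table. The feature names (`RECHALLENGE_yes_total` and so on) and the 0 to 100 percent scale are assumptions modeled on the naming seen elsewhere in the paper; the authors' actual pipeline is not shown in the source.

```python
from collections import Counter, defaultdict

# The five ICSR rows from the table: (case number, product, event, rechallenge).
icsr_rows = [
    (1, 3, 2, "Yes"),
    (2, 3, 2, "Yes"),
    (3, 3, 2, "No"),
    (4, 3, 2, "No"),
    (5, 3, 2, "Unknown"),
]

RECHALLENGE_VALUES = ("yes", "no", "unknown")


def rechallenge_features(rows):
    """Build total and percent features per product-event combination (PEC)."""
    counts = defaultdict(Counter)
    for _case_number, product, event, rechallenge in rows:
        counts[(product, event)][rechallenge.lower()] += 1

    features = {}
    for pec, counter in counts.items():
        n_cases = sum(counter.values())
        row = {}
        for value in RECHALLENGE_VALUES:
            # Hypothetical feature names; percent is on a 0-100 scale.
            row[f"RECHALLENGE_{value}_total"] = counter[value]
            row[f"RECHALLENGE_{value}_percent"] = 100.0 * counter[value] / n_cases
        features[pec] = row
    return features


features = rechallenge_features(icsr_rows)
for name, value in features[(3, 2)].items():
    print(name, value)
```

For product 3 and event 2 this yields one row with six features: totals of 2, 2, and 1 and percentages of 40, 40, and 20 for yes, no, and unknown.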
Fig. 2 Overall scheme of the data and model for phase I of the experiment showing the two splits in the data to evaluate the behavior of the model in each of the two groups of signal of disproportionate reporting (SDR) data
Fig. 3 Normalized confusion matrix for SDR validation classifications in phase I of the experiment. a Confusion matrix for model A: SDRs with at least one prior validation; 26% of SDRs belonged to this group. b Confusion matrix for model B: SDRs with no prior validation; 74% of SDRs belonged to this group. Values and color scale range from 0.00 (0% of true class) to 1.00 (100% of true class). Results are based on the 30% test datasets for model A and model B. ADR adverse drug reaction, predicted label signal validation prediction by ML model, SDR signal of disproportionate reporting, true label signal validation outcome determined by safety expert, XGB eXtreme Gradient Boosting model
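A normalization "by true class", as the caption describes, divides each row of the raw confusion matrix by its row total. The sketch below uses made-up counts for three classes purely to show the arithmetic; the counts and the `normalize_rows` helper are not taken from the paper.

```python
def normalize_rows(matrix):
    """Divide each row (true class) by its total so that each row sums to 1."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # A true class with no test records stays an all-zero row.
        normalized.append([count / total if total else 0.0 for count in row])
    return normalized


# Hypothetical raw counts: rows are true labels, columns are predicted labels.
counts = [
    [8, 2, 0],
    [1, 45, 4],
    [0, 3, 3],
]

normalized = normalize_rows(counts)
for row in normalized:
    print([round(value, 2) for value in row])
```

With this normalization, each diagonal entry is the recall of the corresponding class, which is why the diagonal of such a plot can be read directly against a per-class recall column.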
Test set distribution and model performance metrics for model A and model B in phase I of the experiment
Model A: 386 (26%) SDRs with one or more prior validations. Model B: 1519 (74%) SDRs without prior validation.

| SDR validation class | Model A precision | Model A recall | Model A F1 score | Model A test records (73) | Model B precision | Model B recall | Model B F1 score | Model B test records (525) |
|---|---|---|---|---|---|---|---|---|
| No signal—confounding by indication | 0.80 | 1.00 | 0.89 | 10.96% | 0.76 | 0.75 | 0.75 | 10.48% |
| No signal—listed/expected adr | 0.75 | 0.92 | 0.83 | 17.81% | 0.79 | 0.71 | 0.75 | 18.48% |
| No signal—medical judgment | 0.90 | 0.84 | 0.87 | 61.64% | 0.87 | 0.90 | 0.88 | 64.19% |
| No signal—no adr | 0.75 | 0.50 | 0.60 | 8.22% | 0.68 | 0.66 | 0.67 | 6.10% |
| No signal—recently investigated | 0.00 | 0.00 | 0.00 | 0.00% | 1.00 | 0.25 | 0.40 | 0.76% |
| Signal | 0.00 | 0.00 | 0.00 | 1.37% | 0.00 | 0.00 | 0.00 | 0.00% |
| Accuracy | | | 0.84 | | | | 0.83 | |
| Macro average | 0.53 | 0.54 | 0.53 | | 0.68 | 0.54 | 0.58 | |
| Weighted average | 0.84 | 0.84 | 0.83 | | 0.83 | 0.83 | 0.83 | |
Results are based on the 30% test datasets for model A and model B
ADR adverse drug reaction, SDR signal of disproportionate reporting
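The gap between the macro and weighted averages in this table follows from how each is computed: the macro average treats all six classes equally, while the weighted average weights each class by its share of test records. The sketch below recomputes both for the model A F1 scores from the rounded per-class values in the table; class names are shortened (the "No signal" prefix is dropped) and the helper functions are illustrative.

```python
# Model A per-class (F1 score, share of test records in %), copied from the table.
# Class names are shortened: the "No signal" prefix is omitted.
model_a = {
    "confounding by indication": (0.89, 10.96),
    "listed/expected adr": (0.83, 17.81),
    "medical judgment": (0.87, 61.64),
    "no adr": (0.60, 8.22),
    "recently investigated": (0.00, 0.00),
    "signal": (0.00, 1.37),
}


def macro_average(scores):
    """Unweighted mean: every class counts equally, however rare."""
    return sum(scores) / len(scores)


def weighted_average(scores, weights):
    """Mean weighted by each class's share of the test records."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


f1_scores = [f1 for f1, _share in model_a.values()]
shares = [share for _f1, share in model_a.values()]

macro_f1 = macro_average(f1_scores)
weighted_f1 = weighted_average(f1_scores, shares)
print(f"macro F1 = {macro_f1:.2f}, weighted F1 = {weighted_f1:.2f}")
```

This prints a macro F1 of 0.53 and a weighted F1 of 0.83, matching the table. The two rare classes with an F1 of 0.00 pull the macro average down sharply but barely affect the weighted average, which is dominated by the "medical judgment" class.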
Fig. 4 Comparison of the overall feature importance for model A and model B in phase I of the experiment. a Plot for SDRs with one or more prior validations. b Plot for SDRs with no prior validations. The comparison between the two plots shows that the machine learning model benefits from the availability of prior validation features. When the model does not have prior validation information, it leverages features computed from case data. The length of each bar depicts the magnitude of a feature's impact on the model output. The colors within a bar indicate the class or classes for which the feature contributed. However, this plot does not indicate the direction of impact, i.e., whether the impact of the feature is positive or negative. The figure was produced using the TreeExplainer of the SHAP package [28]. Results are based on the 30% test datasets for model A and model B. ADR adverse drug reaction, SDR signal of disproportionate reporting
Example for a signal validation prediction for one SDR in month 2 of phase II of the experiment showing the information presented to safety experts
| Product | Event | Signal validation prediction | Confidence score | Top three highest impact features | Probabilities for other signal validation classes |
|---|---|---|---|---|---|
| 2 | 8 | No signal—medical judgment | 0.972 | PROD_N_PERIOD; TREND_FLAG_new; OUTCOME_not_recovered_not_resolved_percent | No signal—confounding by indication: 0.01; No signal—listed/expected adr: 0.01; No signal—no adr: 0.005; Signal: 0.002; No signal—recently investigated: 0.001 |
These additional columns were embedded in a signal validation report containing all SDR information from the signal detection system with one line per SDR
OUTCOME_not_recovered_not_resolved_percent: percentage of cases with the event outcome “not recovered/resolved” among all cases with this PEC; PEC: product–event combination; PROD_N_PERIOD: number of new cases for the product in the latest signal detection period; SDR: signal of disproportionate reporting; TREND_FLAG_new: trend flag “new”, indicating that this PEC was identified as an SDR for the first time
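The columns in this example could be derived from a model's class probabilities and per-feature impact values roughly as sketched below. The probabilities are taken from the table (class names shortened, "No signal" prefix dropped); the impact values, the two extra feature names, and the `summarize_prediction` helper are hypothetical. Treating the confidence score as the probability of the predicted class is an assumption, although it is consistent with the six values in the table summing to 1.

```python
# Class probabilities for the SDR in the table (class names shortened).
probabilities = {
    "medical judgment": 0.972,
    "confounding by indication": 0.01,
    "listed/expected adr": 0.01,
    "no adr": 0.005,
    "signal": 0.002,
    "recently investigated": 0.001,
}

# Hypothetical per-feature impact values (for example, SHAP values for the
# predicted class). The last two feature names are invented for illustration.
feature_impacts = {
    "PROD_N_PERIOD": 0.41,
    "TREND_FLAG_new": -0.33,
    "OUTCOME_not_recovered_not_resolved_percent": 0.27,
    "SERIOUS_percent": 0.05,
    "AGE_GROUP_adult_total": -0.02,
}


def summarize_prediction(probabilities, feature_impacts, top_n=3):
    """Pick the most probable class and the features with the largest impact."""
    predicted = max(probabilities, key=probabilities.get)
    others = {c: p for c, p in probabilities.items() if c != predicted}
    # Rank by absolute value: a strongly negative impact is still high impact.
    top_features = sorted(
        feature_impacts, key=lambda name: abs(feature_impacts[name]), reverse=True
    )[:top_n]
    return {
        "prediction": predicted,
        "confidence": probabilities[predicted],
        "top_features": top_features,
        "other_probabilities": others,
    }


summary = summarize_prediction(probabilities, feature_impacts)
print(summary["prediction"], summary["confidence"])
print(summary["top_features"])
```

Ranking by absolute value mirrors the point made for the feature importance plot: magnitude is reported, while the direction of impact is not.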
Accuracy of signal validation predictions by medicinal product over 3 subsequent months in phase II of the experiment
| Product | Month 1: Match<sup>a</sup> | Month 1: No match | Month 1: Total | Month 1: Accuracy | Month 2: Match | Month 2: No match | Month 2: Total | Month 2: Accuracy | Month 3: Match | Month 3: No match | Month 3: Total | Month 3: Accuracy | Total of SDRs | Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 18 | 1 | 19 | 94.7% | 10 | 2 | 12 | 83.3% | 14 | 2 | 16 | 87.5% | 47 | 89.4% |
| 2 | 4 | 0 | 4 | 100.0% | 2 | 1 | 3 | 66.7% | 0 | 0 | 0 | 0.0% | 7 | 85.7% |
| 3 | 2 | 1 | 3 | 66.7% | 7 | 1 | 8 | 87.5% | 2 | 0 | 2 | 100.0% | 13 | 84.6% |
| 4 | 3 | 0 | 3 | 100.0% | 3 | 0 | 3 | 100.0% | 2 | 0 | 2 | 100.0% | 8 | 100.0% |
| 5 | 7 | 3 | 10 | 70.0% | 9 | 3 | 12 | 75.0% | 12 | 4 | 16 | 75.0% | 38 | 73.7% |
| 6 | 3 | 1 | 4 | 75.0% | 9 | 1 | 10 | 90.0% | 5 | 1 | 6 | 83.3% | 20 | 85.0% |
| Total | 37 | 6 | 43 | 86.0% | 40 | 8 | 48 | 83.3% | 35 | 7 | 42 | 83.3% | 133 | 84.2% |
SDR signal of disproportionate reporting
<sup>a</sup> Match: prediction by the machine learning model matched the signal validation outcome determined by the safety expert
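The overall accuracy per product in the last column is pooled over all SDRs of the 3 months rather than averaged across the monthly percentages. The sketch below recomputes it from the match and no-match counts in the table; the `pooled_accuracy` helper is illustrative.

```python
# (match, no match) counts for months 1 to 3 per product, copied from the table.
monthly_counts = {
    1: [(18, 1), (10, 2), (14, 2)],
    2: [(4, 0), (2, 1), (0, 0)],
    3: [(2, 1), (7, 1), (2, 0)],
    4: [(3, 0), (3, 0), (2, 0)],
    5: [(7, 3), (9, 3), (12, 4)],
    6: [(3, 1), (9, 1), (5, 1)],
}


def pooled_accuracy(months):
    """Matches divided by all SDRs across months; None if there are no SDRs."""
    matches = sum(match for match, _no_match in months)
    total = sum(match + no_match for match, no_match in months)
    return matches / total if total else None


for product, months in monthly_counts.items():
    print(product, f"{100 * pooled_accuracy(months):.1f}%")

all_months = [pair for months in monthly_counts.values() for pair in months]
print("Total", f"{100 * pooled_accuracy(all_months):.1f}%")
```

Pooling means a month without SDRs, such as month 3 for product 2, does not affect the product's overall accuracy: 6 matches out of 7 SDRs give 85.7%.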
Accuracy of signal validation predictions by novelty of signal of disproportionate reporting in phase II of the experiment
| Novelty of SDR | Match<sup>a</sup> | No match | Total | Accuracy |
|---|---|---|---|---|
| New SDR<sup>b</sup> | 31 | 12 | 43 | 72.1% |
| Recurring SDR<sup>c</sup> | 81 | 9 | 90 | 90.0% |
| Total | 112 | 21 | 133 | 84.2% |
ML machine learning, PEC product–event combination, SDR signal of disproportionate reporting
<sup>a</sup> Prediction by the ML model matched the signal validation outcome determined by safety expert
<sup>b</sup> SDR for a specific PEC that was identified for the first time by the signal detection system
<sup>c</sup> SDR for a specific PEC that had already been identified one or more times but meets predefined re-signaling criteria
Accuracy of signal validation predictions by signal of disproportionate reporting validation class in phase II of the experiment
| SDR validation class | Match<sup>a</sup> | No match | Total | Accuracy |
|---|---|---|---|---|
| No signal—confounding by indication | 9 | 2 | 11 | 81.8% |
| No signal—listed/expected adr | 5 | 8 | 13 | 38.5% |
| No signal—medical judgment | 87 | 7 | 94 | 92.6% |
| No signal—no adr | 11 | 3 | 14 | 78.6% |
| No signal—recently investigated | 0 | 1 | 1 | 0.0% |
| Signal | 0 | 0 | 0 | NA |
| Total | 112 | 21 | 133 | 84.2% |
NA not applicable, SDR signal of disproportionate reporting
<sup>a</sup> Prediction by the machine learning model matched the signal validation outcome determined by safety experts
This experiment demonstrated that signal validation in pharmacovigilance can be supported by a machine learning (ML)-based prevalidation step to improve process efficiency and consistency. Medical review by safety experts remains an essential part of the signal validation process, but this can be performed faster and more consistently when augmented by ML predictions.

Model explainability plays a major role in gaining trust and acceptance of ML outputs in pharmacovigilance. SHapley Additive exPlanations (SHAP) analysis was used to improve model explainability.