Oeystein Kjoersvik, Andrew Bate.
Abstract
Effective identification of previously implausible safety signals is a core component of successful pharmacovigilance. Timely, reliable, and efficient data ingestion and related processing are critical to this. The term 'black swan events' was coined by Taleb to describe events with three attributes: unpredictability, severe and widespread consequences, and retrospective bias. These rare events are not well understood at their emergence but are often rationalized in retrospect as predictable. Pharmacovigilance strives to rapidly respond to potential black swan events associated with medicine or vaccine use. Machine learning (ML) is increasingly being explored in data ingestion tasks. In contrast to rule-based automation approaches, ML can use historical data (i.e., 'training data') to effectively predict emerging data patterns and support effective data intake, processing, and organisation. At first sight, this reliance on previous data might be considered a limitation when building ML models for effective data ingestion in systems that look to focus on the identification of potential black swan events. We argue that, first, some apparent black swan events, although unexpected medically, will exhibit data attributes similar to those of other safety data and not prove algorithmically unpredictable, and, second, standard and emerging ML approaches can still be robust to such data outliers with proper awareness and consideration in ML system design and with the incorporation of specific mitigatory and support strategies. We argue that effective approaches to managing data on potential black swan events are essential for trust and outline several strategies to address data on potential black swan events during data ingestion.
Year: 2022 PMID: 35579807 PMCID: PMC9112242 DOI: 10.1007/s40264-022-01169-0
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.228
Fig. 1 ICSR process overview and how an ML-enabled ICSR process could approach the challenge of preventing loss of data fidelity. HCP healthcare practitioner, ICSR individual case safety report, ML machine learning, PV pharmacovigilance
Examples of different drug or vaccine–adverse event pairs that could be considered pharmacovigilance black swan events and how these exemplars can inform future machine learning data ingestion strategies
| Black swan event | Safety reporting of exemplar | Safety reporting? | Data outlier? (a) | Medical outlier? (b) | Implications for future ML use for data ingestion |
|---|---|---|---|---|---|
| RotaShield vaccine and intussusception | Emerging safety issue in real-world use with plausibility at the time [ | Yes | Yes (pair) (c) | No | The only data outlier is unexpectedly frequent reporting of vaccine–AE pairs. ML algorithms should work if statistical reporting patterns are similar in character to how previous safety issues were reported as they emerged. |
| Cisapride and QT prolongation | QT prolongation is a known frequent ADR to drugs. Safety reports of this drug–AE pair accumulated, triggering clinical review. Contemporaneous medical knowledge and other data made an association seem implausible, and it was refuted by some [ | Yes | Yes (pair) (c) | Yes | The only data outlier is unexpectedly frequent reporting of the drug–AE pair combination. ML algorithms should work if statistical reporting patterns are similar in character to how previous safety issues were reported as they emerged. If the link between a drug/vaccine and an AE is medically implausible, data should still be available with high fidelity because the reported AE is well known and captured in PV systems. |
| Practolol and sclerosing peritonitis as part of oculomucocutaneous syndrome | A novel, previously unknown ADR received non-specific AE reporting [ | Yes | Yes | Yes | Presents as both a data outlier and a medical outlier. Monitoring of ML is essential, as subtle changes in reporting patterns may be indicative of an emerging new type of issue (d). |
| Silicone breast implants and autoimmune-like disorders | Delayed or lack of timely reporting into PV systems [ | No | NA | Yes | A data outlier only in the sense of absence of data in the safety system. Requires awareness that ML-based systems cannot mitigate effectively for lack of data. Necessitates the ability to conduct surveillance in other data streams, particularly without relying on prior suspicion of an issue. |
Although out of the scope of this manuscript, further ML deployment in signal detection and analysis might also facilitate a more effective medical review of the safety report (e.g., by case clustering)
ADR adverse drug reaction, AE adverse event, ML machine learning, NA not applicable, PV pharmacovigilance
(a) Ingested data listed on safety reports at the time of identification were significantly different from data on reports seen previously
(b) Unexpected because contemporaneous medical knowledge at the time of identification seemed to render an association unlikely
(c) Data outlier solely in terms of unexpected reporting of the drug/vaccine–event pair but not in other ways
(d) Note that signals of potential subgroup risks, new information on a signal (e.g., occurrence at a different dosage), and other local differences in reporting could also present as data outliers, a subset of which could also be potential black swan events
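The "pair" outliers in the table above (footnote c) are unusual only in how often a given drug/vaccine–event pair is reported. As a minimal sketch of how an ingestion monitor might surface such pairs for human review, the Python below compares each pair's share of incoming reports with its smoothed historical share. The function name `flag_pair_outliers`, the thresholds, and the example pair names are all illustrative assumptions, not part of the article, and this is not a validated disproportionality method.

```python
from collections import Counter

def flag_pair_outliers(historical, incoming, ratio_threshold=3.0, min_count=3):
    """Return drug/vaccine-event pairs reported unexpectedly often.

    historical, incoming: lists of (product, event) tuples, one per report.
    ratio_threshold and min_count are hypothetical tuning values.
    """
    hist_counts = Counter(historical)
    new_counts = Counter(incoming)
    hist_total = len(historical)
    new_total = len(incoming)

    flagged = {}
    for pair, count in new_counts.items():
        if count < min_count:
            continue  # too few reports to judge
        new_share = count / new_total
        # Add-one smoothing gives never-before-seen pairs a finite baseline.
        hist_share = (hist_counts[pair] + 1) / (hist_total + 1)
        ratio = new_share / hist_share
        if ratio >= ratio_threshold:
            flagged[pair] = ratio
    return flagged

# Hypothetical data: one pair is rare historically but frequent in new reports.
historical = ([("drug_a", "nausea")] * 50
              + [("drug_b", "headache")] * 49
              + [("vaccine_x", "intussusception")] * 1)
incoming = ([("drug_a", "nausea")] * 10
            + [("vaccine_x", "intussusception")] * 10)

flagged = flag_pair_outliers(historical, incoming)
print(sorted(flagged))
```

A check of this kind only flags pairs for review; it cannot help in the silicone breast implant scenario, where the reports never reach the safety system.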
Examples of using machine learning in the data ingestion process where inappropriate deployment could complicate downstream identification of an emerging signal
| ML task | Purpose of ML task within overall PV data ingestion process | Potential consequence for a black swan event if ML is inappropriately deployed during data ingestion |
|---|---|---|
| Privacy-preserving algorithms to ensure that personalized data are not collected and shared further | PV data without PII are made available for analysis [ | In the removal of PII, additional data that would be important or insightful for PV activity are removed |
| Identification of similar safety reports through learned unexpectedness of similar attributes (or combinations of attributes) from previous reporting patterns | Clustered unexpectedly similar case reports indicative of higher probability of being duplicate safety reports [ | A sudden influx of similar reports that are systematically different from those seen before is flagged as duplicates of the same instance rather than as a novel emerging issue |
| ML used to find likely data errors by identifying unrealistic data outliers, and to highlight and/or propose adjustments for such errors | ML used to look for likely erroneous data elements as part of the quality assurance process for safety data; for example, recording an extreme dose and dosage form combination [ | True, potentially important data outliers would be less visible at the analysis stage or even wrongly adjusted |
| Extract data of relevance from extensive data input with much extraneous information, e.g., audio reporting of an adverse safety event or a chatbot transcript [ | ML extracts relevant PV information from communications for safety reports [ | Critical safety information articulated in original communication not appropriately captured |
ML machine learning, PII personally identifiable information, PV pharmacovigilance
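One way to reduce the risk in the data-error row above, where a true outlier is hidden or wrongly adjusted, is to have the quality check flag a suspicious value while always keeping the value exactly as reported. The sketch below illustrates that design choice in Python. The `DoseField` class, the `ingest_dose` function, and the `typical_max_mg` parameter are hypothetical names introduced for illustration and do not come from the article.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DoseField:
    reported_mg: float  # stored exactly as reported; never overwritten
    flags: List[str] = field(default_factory=list)

    @property
    def needs_review(self) -> bool:
        # Any flag routes the record to a human reviewer.
        return bool(self.flags)

def ingest_dose(reported_mg: float, typical_max_mg: float) -> DoseField:
    """Flag an unusually high dose without changing the reported value.

    typical_max_mg is an assumed reference limit for the product.
    """
    record = DoseField(reported_mg=reported_mg)
    if reported_mg > typical_max_mg:
        record.flags.append("dose_above_typical_range")
    return record

# Hypothetical usage: an extreme dose is flagged but preserved.
extreme = ingest_dose(reported_mg=5000.0, typical_max_mg=500.0)
usual = ingest_dose(reported_mg=100.0, typical_max_mg=500.0)
print(extreme.reported_mg, extreme.flags, usual.needs_review)
```

Because the reported value is never modified, a downstream analyst can still see a genuine extreme exposure even when the check suspects an error.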
| ‘Black swan events’ are unexpected, severe situations that, in retrospect, can seem predictable. A subset of safety risks seen in pharmacovigilance are such black swan events. |
| Adequate data ingestion is core to safety. Machine learning (ML) during pharmacovigilance data ingestion should make both data intake and processing more effective and enable signal detection and management, or—at an absolute minimum—not hinder it. |
| Routine use of ML in pharmacovigilance data ingestion requires considering the potential for black swan events. It needs to support both adequate ingestion of familiar and common reporting patterns or attributes and unexpected changes in reporting that might signify a black swan event. |
| There are many manifestations of potential black swan events in pharmacovigilance data, but ML can be anticipated to support or enable the adequate ingestion of data on these events if ML best practice and—when needed—mitigatory strategies are employed. |