Ramani Routray, Niki Tetarenko, Claire Abu-Assal, Ruta Mockute, Bruno Assuncao, Hanqing Chen, Shenghua Bao, Karolina Danysz, Sameen Desai, Salvatore Cicirello, Van Willis, Sharon Hensley Alford, Vivek Krishnamurthy, Edward Mingle.
Abstract
INTRODUCTION: Identification of adverse events and determination of their seriousness ensure timely detection of potential patient safety concerns. Adverse event seriousness is a key factor in defining reporting timelines, and its assessment is often performed manually by pharmacovigilance experts. The dramatic increase in the volume of safety reports necessitates exploration of scalable solutions that also meet reporting timeline requirements.
Year: 2020 PMID: 31605285 PMCID: PMC6965337 DOI: 10.1007/s40264-019-00869-4
Source DB: PubMed Journal: Drug Saf ISSN: 0114-5916 Impact factor: 5.606
Breakdown of report statistics for the training set
| Characteristic | Value |
|---|---|
| Total number of cases | 22,932 (12,207 PM, 7,512 SD, 3,213 ML) |
| Total number of documents | 26,256 |
| By type | |
| Post-marketing | 13,083 |
| Solicited | 10,098 |
| Medical literature | 3,075 |
| AE seriousness pairs, serious/non-serious | 48,118/25,076 |
| By seriousness classification | |
| Hospitalization | 26,019 |
| Important medical event | 15,149 |
| Death | 6,955 |
| Disability | 13 |
| Congenital anomaly | 0 |
| Required intervention (devices) | 0 |
| Life threatening | 48 |
| Number of therapy areas covered | 3 (oncology, hematology, immunology) |
| Number of suspect drugs covered | PM = 23, SD = 237, ML = 32 |
| Number of unique adverse events covered<sup>a</sup> | PM (14,330), SD (8,294), ML (3,590) |
AE adverse event, ML medical literature, PM spontaneous reports, SD solicited reports
<sup>a</sup> Calculated using reported term
Fig. 1 Study design. A stratified sample of 20,000 cases was derived from 2 years of safety data. Three neural networks were trained using 90% of the stratified sample and each was tested against the remaining 10% of the sample as depicted in the neural network architecture. IME important medical event, LLT lowest level term, MedDRA Medical Dictionary for Regulatory Activities, PT preferred term
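The 90%/10% split described in the Fig. 1 caption can be illustrated with a minimal sketch. The caption does not say which variable was used for stratification, so the grouping by source type (PM, SD, ML) below is an assumption, as are the function name `stratified_split`, the toy records, and the seed.

```python
import random
from collections import defaultdict


def stratified_split(cases, key, train_fraction=0.9, seed=0):
    """Split cases into train/test sets while preserving the mix of strata."""
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for case in cases:
        by_stratum[key(case)].append(case)

    train, test = [], []
    for stratum in sorted(by_stratum):
        members = by_stratum[stratum]
        rng.shuffle(members)  # randomize within each stratum
        cut = round(len(members) * train_fraction)
        train.extend(members[:cut])
        test.extend(members[cut:])
    return train, test


# Hypothetical toy data, not the study data: 100 PM, 60 SD, and 40 ML cases.
sources = ["PM"] * 100 + ["SD"] * 60 + ["ML"] * 40
cases = [{"id": i, "source": s} for i, s in enumerate(sources)]

train, test = stratified_split(cases, key=lambda c: c["source"])
print(len(train), len(test))
```

Splitting inside each stratum keeps the proportion of each source type the same in the training and test sets, which a single global shuffle would only achieve approximately.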
Breakdown of test set report statistics
| Data type | PM | SD | ML |
|---|---|---|---|
| Total number of reports | 1,324 | 1,045 | 347 |
| AE seriousness pairs, serious/non-serious<sup>a</sup> | 763/1,565 | 2,615/660 | 811/668 |
| By seriousness classification | | | |
| Hospitalization | 207 | 1,837 | 184 |
| Important medical event | 485 | 330 | 507 |
| Death | 71 | 448 | 120 |
| Disability | 0 | 0 | 0 |
| Congenital anomaly | 0 | 0 | 0 |
| Required intervention (devices) | 0 | 0 | 0 |
| Life threatening | 0 | 0 | 0 |
AE adverse event, ML medical literature, PM spontaneous reports, SD solicited reports
<sup>a</sup> Calculated using the Medical Dictionary for Regulatory Activities code
Fig. 2 Model architectures. Neural network architectures for the (a) binary seriousness classifier, (b) seriousness category classifier, and (c) seriousness term annotator. B-SER beginning of seriousness term, CRF conditional random field, IME important medical event, LSTM long short-term memory, O other
Fig. 3 Acceptable quality level (AQL) process for pharmacovigilance (PV) neural networks. This process depicts the framework for the validation of neural networks leveraging the AQL method. It was customized to accommodate the inherent needs of PV. The validation process begins once the developer generates the results of a neural network and creates an Excel output of the true positives (TPs), false positives (FPs), false negatives (FNs), and true negatives (TNs). If the F1 score or accuracy is below the 75% threshold, the PV subject matter expert (SME) reviews 100% of the FP and FN results and reports any trends in errors and the results of the review to the developer for further training. If the F1 score or accuracy is above 75%, the PV SME reviews the TP results to ensure the neural network is performing at the F1 score or accuracy claimed. For our purposes, if the number of TPs was less than 150, the PV SME would perform a 100% review of TPs to ensure the system result matches the safety database entry and is indeed a TP, as this was within the work capacity of the team. If there were more than 150 TPs, the PV SME would randomize the TPs, select the appropriate AQL sample of TPs, and then review the results. In both instances, if the TP error rate was ≤ 4%, the neural network was deemed to have passed; if not, it was sent back to the developer for further training.
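The decision flow in the Fig. 3 caption can be summarized as a small sketch. The function names and return labels are hypothetical. The caption does not say how a score of exactly 75% or a count of exactly 150 TPs is handled, so the boundary treatment below is an assumption, and the AQL sample size selection is not specified in the caption and is left out.

```python
def aql_review_plan(score, n_true_positives, threshold=0.75, full_review_limit=150):
    """Return which results the PV SME reviews, following the Fig. 3 caption.

    `score` is the F1 score or accuracy as a fraction (0.83 for 83%).
    """
    if score < threshold:
        # Below threshold: review every FP and FN and report error trends
        # to the developer for further training.
        return "review_all_fp_fn"
    # Assumption: a score of exactly 75% is treated like a score above it.
    if n_true_positives < full_review_limit:
        return "review_all_tp"
    # Assumption: exactly 150 TPs is handled like more than 150 TPs.
    return "review_aql_sample_of_tp"


def aql_verdict(n_reviewed, n_errors, max_error_rate=0.04):
    """Pass the network if the TP error rate is at most 4%, else retrain."""
    if n_reviewed <= 0:
        raise ValueError("n_reviewed must be positive")
    return "pass" if n_errors / n_reviewed <= max_error_rate else "retrain"


# Hypothetical example: accuracy of 83% with 120 TPs, 4 of which were wrong.
print(aql_review_plan(0.83, 120))
print(aql_verdict(n_reviewed=120, n_errors=4))
```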
Performance of neural networks
| Source data type | Seriousness classification (accuracy) | Seriousness categorization (F1 score) | Annotation of seriousness category terms (F1 score) |
|---|---|---|---|
| Post-marketing | 83.0% (precision = 0.95, recall = 0.74) | Death: 0.78 (precision = 0.88, recall = 0.70); Hospitalization: 0.79 (precision = 0.84, recall = 0.74); IME: 0.76 (precision = 0.81, recall = 0.72) | NC |
| Solicited reports | 92.9% (precision = 0.87, recall = 0.87) | NC | 0.90 (precision = 0.88, recall = 0.91) |
| Medical literature | 86.3% (precision = 0.83, recall = 0.82) | NC | 0.75 (precision = 0.62, recall = 0.96) |
IME important medical event, NC not calculated
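The F1 scores in the table above combine precision and recall as their harmonic mean, F1 = 2PR / (P + R). The following sketch recomputes the post-marketing category scores from the published precision and recall values; the function name `f1_score` and the dictionary layout are illustrative only.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) for the post-marketing category classifier.
reported = {
    "Death": (0.88, 0.70, 0.78),
    "Hospitalization": (0.84, 0.74, 0.79),
    "IME": (0.81, 0.72, 0.76),
}

for category, (precision, recall, reported_f1) in reported.items():
    recomputed = round(f1_score(precision, recall), 2)
    print(f"{category}: recomputed {recomputed:.2f}, reported {reported_f1:.2f}")
```

Because the published precision and recall are themselves rounded to two decimals, a recomputed value can differ from the reported one in the last digit. For example, the solicited-report annotator (precision = 0.88, recall = 0.91) recomputes to about 0.89 against a reported 0.90.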
Analysis of alternate algorithm performance on post-marketing data
| Algorithm | Seriousness classification (accuracy) | Seriousness categorization (F1 score) |
|---|---|---|
| Random forests | 81.2% (precision = 0.89, recall = 0.71) | Death: 0.59 (precision = 0.53, recall = 0.66); Hospitalization: 0.74 (precision = 0.78, recall = 0.70); IME: 0.76 (precision = 0.84, recall = 0.69) |
| Support-vector machine | 82.3% (precision = 0.94, recall = 0.72) | Death: 0.80 (precision = 0.92, recall = 0.71); Hospitalization: 0.75 (precision = 0.79, recall = 0.71); IME: 0.82 (precision = 0.87, recall = 0.77) |
IME important medical event
Example analysis by seriousness models
| Model inputs | Model outputs |
|---|---|
| Binary seriousness classifier: AE: arrhythmia = serious; AE: cardiac arrest = serious; Case = serious | |
| Seriousness category classifier: AE: arrhythmia = hospitalization, IME; AE: cardiac arrest = death | |
| Annotator: Patient was | |
AE adverse event, LLT lowest level term, MedDRA Medical Dictionary for Regulatory Activities, PT preferred term, IME important medical event
| Volume, complexity, and time constraints of adverse event reporting are overwhelming the pharmacovigilance workforce. New solutions are needed to support these activities to meet global regulatory timelines. |
| We developed several augmented intelligence approaches to support the correct identification and classification of seriousness, a key factor in adverse event reporting, in various document types. |
| Our deep learning models were trained using an extensive data set that captured deep institutional pharmacovigilance practitioner knowledge. |