Lichy Han (1), Robert Ball (2), Carol A Pamer (2), Russ B Altman (3, 4), Scott Proestel (2).
1. Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA.
2. Office of Surveillance and Epidemiology, Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, USA.
3. Department of Genetics, Stanford University.
4. Department of Bioengineering, Stanford University.
Abstract
OBJECTIVE: The US Food and Drug Administration (FDA) receives over a million adverse event reports associated with medication use every year, so a system is needed to help FDA safety evaluators identify the reports most likely to demonstrate causal relationships to the suspect medications. We combined text mining with machine learning to construct and evaluate a system that identifies medication-related adverse event reports.

METHODS: FDA safety evaluators assessed 326 reports for medication-related causality. We engineered features from these reports and constructed random forest, L1 regularized logistic regression, and support vector machine models. We evaluated model accuracy and further assessed utility by generating report rankings that represented a prioritized report review process.

RESULTS: Our random forest model showed the best performance in report ranking and accuracy, with an area under the receiver operating characteristic curve of 0.66. The generated report ordering assigns a higher rank to reports with a higher probability of medication-related causality and is significantly correlated with a perfect report ordering, with a Kendall's tau of 0.24 (P = .002).

CONCLUSION: Our models produced prioritized report orderings that enable FDA safety evaluators to focus on reports that are more likely to contain valuable medication-related adverse event information. Applying our models to all FDA adverse event reports has the potential to streamline the manual review process and greatly reduce reviewer workload.

Published by Oxford University Press on behalf of the American Medical Informatics Association 2017. This work is written by US Government employees and is in the public domain in the United States.
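The two evaluation measures named in the abstract, the area under the receiver operating characteristic curve and Kendall's tau against an ideal ordering, can be illustrated with a minimal sketch. This is not the authors' pipeline: the labels and scores are invented toy values, the function names are hypothetical, and the use of the tie-corrected tau-b variant against binary evaluator labels is an assumption about how a "perfect report ordering" might be compared.

```python
from itertools import combinations


def rank_reports(scores):
    """Return report indices ordered from highest to lowest predicted probability."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def roc_auc(labels, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kendall_tau_b(x, y):
    """Kendall's tau-b, which corrects for ties in either variable."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from every count
        elif dx == 0:
            ties_x_only += 1
        elif dy == 0:
            ties_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = ((untied + ties_x_only) * (untied + ties_y_only)) ** 0.5
    return (concordant - discordant) / denom if denom else 0.0


# Toy data (assumed, not from the study): 1 = evaluator judged the report
# medication-related, 0 = not; scores stand in for model probabilities.
labels = [1, 0, 1, 0, 1, 0]
scores = [0.4, 0.8, 0.9, 0.1, 0.7, 0.3]

review_order = rank_reports(scores)   # reports to review first come first
auc = roc_auc(labels, scores)
tau = kendall_tau_b(scores, labels)
```

Because the labels are binary, many report pairs are tied on the label, which is why a tie-corrected statistic is used here; with a different definition of the ideal ordering the value of tau would differ.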
Keywords:
drug-related side effects and adverse reactions; supervised machine learning