Seyed M Miran (1), Stuart J Nelson (2), Doug Redd (2), Qing Zeng-Treitler (2).
(1) Biomedical Informatics Center, School of Medicine and Health Sciences, George Washington University, Washington, D.C., USA. Electronic address: miran@gwu.edu.
(2) Biomedical Informatics Center, School of Medicine and Health Sciences, George Washington University, Washington, D.C., USA.
Abstract
BACKGROUND: The data quality of electronic health records (EHR) has been a topic of increasing interest to clinical and health services researchers. One indicator of possible errors in the data is a large change in the frequency of observations in chronic illnesses. In this study, we built a stacked multivariate LSTM model to predict an acceptable range for the frequency of observations and demonstrated its utility.
METHODS: We applied the LSTM approach to a large EHR dataset with over 400 million total encounters. We computed sensitivity and specificity for predicting whether the frequency of an observation in a given week is an aberrant signal.
RESULTS: Compared with the simple frequency monitoring approach, our proposed multivariate LSTM approach increased the sensitivity of finding aberrant signals in 6 randomly selected diagnostic codes from 75% to 88% and the specificity from 68% to 91%. We also experimented with two different LSTM algorithms, namely direct multi-step and recursive multi-step. Both models detected the aberrant signals, but the recursive multi-step algorithm performed better.
CONCLUSIONS: Simply monitoring the frequency trend, as is common practice in systems that monitor data quality, cannot distinguish between fluctuations caused by seasonal disease changes, seasonal patient visits, or a change in data sources. Our study demonstrated the ability of stacked multivariate LSTM models to recognize true data quality issues and to separate them from fluctuations with other causes, including seasonal changes and outbreaks.
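The evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the one-step predictor below is a stand-in (a window mean) for the trained LSTM, and the function names, the 20% tolerance band, and the toy counts are all assumptions made for illustration. The sketch shows the recursive multi-step idea (each prediction is fed back as the newest input), flags weeks whose observed frequency falls outside an acceptable range around the prediction, and computes sensitivity and specificity against known labels. A direct multi-step model would instead emit the whole horizon in one pass and is not shown.

```python
from typing import Callable, List, Sequence, Tuple


def recursive_forecast(history: Sequence[float],
                       one_step: Callable[[Sequence[float]], float],
                       horizon: int) -> List[float]:
    """Recursive multi-step: feed each prediction back in as the newest input."""
    window = list(history)
    preds: List[float] = []
    for _ in range(horizon):
        nxt = one_step(window)
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the window forward by one week
    return preds


def flag_aberrant(observed: Sequence[float],
                  predicted: Sequence[float],
                  tolerance: float) -> List[bool]:
    """Flag weeks whose observed count is outside predicted +/- tolerance * predicted.

    The relative band is a hypothetical choice of "acceptable range".
    """
    return [abs(o - p) > tolerance * p for o, p in zip(observed, predicted)]


def sensitivity_specificity(flags: Sequence[bool],
                            truth: Sequence[bool]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(f and t for f, t in zip(flags, truth))
    fn = sum((not f) and t for f, t in zip(flags, truth))
    tn = sum((not f) and (not t) for f, t in zip(flags, truth))
    fp = sum(f and (not t) for f, t in zip(flags, truth))
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


def mean_of_window(window: Sequence[float]) -> float:
    """Stand-in one-step predictor; a trained LSTM would take its place."""
    return sum(window) / len(window)


# Toy weekly counts for one diagnostic code (made-up numbers).
history = [100.0, 102.0, 98.0, 100.0]
predicted = recursive_forecast(history, mean_of_window, horizon=3)
observed = [101.0, 60.0, 100.0]
flags = flag_aberrant(observed, predicted, tolerance=0.2)
truth = [False, True, True]  # hypothetical ground-truth labels
sens, spec = sensitivity_specificity(flags, truth)
```

In this toy run only the second week is flagged, so one of the two truly aberrant weeks is missed. The recursive scheme reuses a single one-step model for any horizon, at the cost of letting prediction errors compound from week to week.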