Rawan AlSaad, Qutaibah Malluhi, Ibrahim Janahi, Sabri Boughorbel.
Abstract
BACKGROUND: Predictive modeling with longitudinal electronic health record (EHR) data offers great promise for accelerating personalized medicine and better informing clinical decision-making. Recently, deep learning models have achieved state-of-the-art performance on many healthcare prediction tasks. However, deep models lack interpretability, which is integral to successful decision-making and can lead to better patient care. In this paper, we build upon the contextual decomposition (CD) method, an algorithm for producing importance scores from long short-term memory networks (LSTMs). We extend the method to bidirectional LSTMs (BiLSTMs) and use it to predict future clinical outcomes from patients' historical EHR visits.
Keywords: Deep learning; Electronic health record; Interpretability; Predictive models
Year: 2019 PMID: 31703676 PMCID: PMC6842261 DOI: 10.1186/s12911-019-0951-4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Basic statistics of the cohort
| Characteristic | Category | Cases | Controls |
|---|---|---|---|
| # of patients | | 6159 | 4912 |
| # of visits | | 62962 | 42182 |
| # of diagnoses | | 128877 | 77038 |
| Avg. # of visits per patient | | 10.2 | 8.6 |
| Avg. # of codes in a visit | | 2.0 | 1.8 |
| Gender | Female | 2395 | 2278 |
| | Male | 3764 | 2634 |
| Race | African American | 2222 | 926 |
| | Asian | 56 | 56 |
| | Biracial | 83 | 43 |
| | Caucasian | 2361 | 2805 |
| | Hispanic | 602 | 454 |
| | Native American | 22 | 15 |
| | Pacific Islander | 8 | 2 |
| | Unknown | 805 | 611 |
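The two averages in the cohort table can be checked against the raw counts. The sketch below is only a sanity check. It assumes that the diagnosis count is the total number of diagnosis codes across all visits, so that codes per visit is diagnoses divided by visits. The names `cohort` and `derived_averages` are illustrative and do not come from the paper.

```python
# Counts copied from the cohort table. Interpreting "# of diagnoses" as the
# total number of diagnosis codes over all visits is an assumption.
cohort = {
    "cases": {"patients": 6159, "visits": 62962, "diagnoses": 128877},
    "controls": {"patients": 4912, "visits": 42182, "diagnoses": 77038},
}


def derived_averages(counts):
    """Return (visits per patient, codes per visit), rounded to one decimal."""
    visits_per_patient = counts["visits"] / counts["patients"]
    codes_per_visit = counts["diagnoses"] / counts["visits"]
    return round(visits_per_patient, 1), round(codes_per_visit, 1)


for group, counts in cohort.items():
    print(group, derived_averages(counts))
# cases (10.2, 2.0)
# controls (8.6, 1.8)
```

Under that assumption, both derived values agree with the reported averages for cases and controls.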
Average AUC of models trained on asthma dataset for the task of school-age asthma prediction
| Model | AUC (95% CI) |
|---|---|
| LSTM | 0.831 (0.824-0.838) |
| BiLSTM | 0.819 (0.811-0.827) |
| Logistic Regression | 0.702 (0.692-0.712) |
Fig. 1 Validation of contextual decomposition for LSTM and BiLSTM for the class c=1. The attribution is correct if the highest contribution among all visits is assigned to the artificial visit. The prediction curves indicate the prediction accuracy for class c=1, which also represents the upper bound for the attribution accuracy
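The correctness criterion in the Fig. 1 caption can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes each patient is represented by a list of per-visit CD scores plus the index of the artificial visit, and the function names and example scores are hypothetical.

```python
def attribution_is_correct(cd_scores, artificial_index):
    """True if the artificial visit receives the highest CD score."""
    top_visit = max(range(len(cd_scores)), key=lambda i: cd_scores[i])
    return top_visit == artificial_index


def attribution_accuracy(patients):
    """Fraction of patients whose top-scoring visit is the artificial one.

    `patients` is a list of (cd_scores, artificial_index) pairs.
    """
    hits = sum(attribution_is_correct(scores, idx) for scores, idx in patients)
    return hits / len(patients)


# Hypothetical scores for three patients, for illustration only.
example_patients = [
    ([0.1, 0.9, 0.2], 1),        # artificial visit scores highest: correct
    ([0.4, 0.3, 0.2], 2),        # a different visit scores highest: incorrect
    ([-0.1, 0.0, 0.5, 0.2], 2),  # correct
]
print(attribution_accuracy(example_patients))
```

Per the caption, the prediction accuracy for class c=1 is an upper bound on this attribution accuracy.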
Fig. 2 Evaluation of the agreement between CD scores and importance scores generated from logistic regression coefficients. The matching is correct if the visit with the highest LSTM/BiLSTM CD attribution matches one of the top three visits, which are generated using logistic regression coefficients
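The matching rule in the Fig. 2 caption can be illustrated as follows. The sketch assumes that a per-visit importance score has already been derived from the logistic regression coefficients. How that score is computed is not stated in this excerpt, so `lr_scores` is simply taken as given. All names and values are hypothetical.

```python
def top_k_indices(scores, k):
    """Indices of the k highest scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def matching_is_correct(cd_scores, lr_scores, k=3):
    """True if the top CD visit is among the top-k logistic regression visits."""
    best_cd_visit = top_k_indices(cd_scores, 1)[0]
    return best_cd_visit in top_k_indices(lr_scores, k)


# Hypothetical per-visit scores for one patient with four visits.
cd = [0.1, 0.7, 0.2, 0.05]
lr = [0.3, 0.5, 0.1, 0.6]
print(matching_is_correct(cd, lr))  # True: visit 1 is in the LR top three
```

Averaging `matching_is_correct` over patients would give an agreement rate of the kind the figure reports, under the same assumptions.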
Fig. 3 CD scores for individual visits produced from LSTM and BiLSTM models trained for the task of predicting school-age asthma. Red is positive, white is neutral and blue is negative. The squares represent patient EHR time-ordered visits, and the label of each square indicates the visit number appended by the date of the visit. The upper row is the LSTM CD attributions and the lower row is the BiLSTM CD attributions
Fig. 4 Most predictive subset of visits using CD-based scores highlighted in yellow. Example for a patient where the relative contributions of the subset of visits produced from LSTM and BiLSTM are similar
Fig. 5 Most predictive subset of visits using CD-based scores. Example for a patient where BiLSTM produces a better interpretation than LSTM
Top scoring patterns of length 1 visit, produced by the contextual decomposition of LSTM and BiLSTM models on the asthma data
| Rank | ICD Codes | Frequency % | ICD Codes | Frequency % |
|---|---|---|---|---|
| 1 | 493.9 Asthma Unspecified | 40% | 493.9 Asthma Unspecified | 34% |
| 2 | 493.9, 786.0 Asthma Unspecified, Dyspnea and Respiratory Abnormalities | 13% | 786.2 Cough | 15% |
| 3 | 786.0 Dyspnea and Respiratory Abnormalities | 11% | 493.9, 786.0 Asthma Unspecified, Dyspnea and Respiratory Abnormalities | 21% |
| 4 | 493.9, 786.2 Asthma Unspecified, Cough | 10% | 786.0 Dyspnea and Respiratory Abnormalities | 10% |
| 5 | 465.9, 493.9 Acute Upper Respiratory Infections of Unspecified Site, Asthma Unspecified | 9% | 493.9, 786.2 Asthma Unspecified, Cough | 9% |
| 6 | 493.0 Extrinsic Asthma | 4% | 465.9, 493.9 Acute Upper Respiratory Infections of Unspecified Site, Asthma Unspecified | 8% |
| 7 | 486, 493.9 Pneumonia, Asthma Unspecified | 4% | 465.9, 786.2 Acute Upper Respiratory Infections of Unspecified Site, Cough | 5% |
| 8 | 465.9, 493.9, 786.2 Acute Upper Respiratory Infections of Unspecified Site, Asthma Unspecified, Cough | 3% | 486, 493.9 Pneumonia, Asthma Unspecified | 3% |
| 9 | 382.9, 493.9 Unspecified Otitis Media, Asthma Unspecified | 3% | 486, 493.9 Pneumonia, Asthma Unspecified | 3% |
| 10 | 493.0, 493.9 Extrinsic Asthma, Asthma Unspecified | 3% | V67.9 Unspecified Follow-Up Examination | 3% |
Top scoring patterns of length 2 visit, produced by the contextual decomposition of LSTM and BiLSTM models on the asthma data
| Rank | ICD Codes | Frequency % | ICD Codes | Frequency % |
|---|---|---|---|---|
| 1 | [493.9], [493.9] [Asthma Unspecified], [Asthma Unspecified] | 13% | [493.9], [493.9] [Asthma Unspecified], [Asthma Unspecified] | 11% |
| 2 | [493.9, 786.0], [493.9] [Asthma Unspecified, Dyspnea and Respiratory Abnormalities], [Asthma Unspecified] | 2% | [493.9, 786.0], [493.9] [Asthma Unspecified, Dyspnea and Respiratory Abnormalities], [Asthma Unspecified] | 2% |
| 3 | [493.9], [493.9, 786.0] [Asthma Unspecified], [Asthma Unspecified, Dyspnea and Respiratory Abnormalities] | 2% | [493.9], [493.9, 786.0] [Asthma Unspecified], [Asthma Unspecified, Dyspnea and Respiratory Abnormalities] | 2% |
| 4 | [493.9], [V20.2] [Asthma Unspecified], [Routine Infant or Child Health Check] | 2% | [493.9], [V20.2] [Asthma Unspecified], [Routine Infant or Child Health Check] | 2% |
| 5 | [493.9, 786.2], [493.9] [Asthma Unspecified, Cough], [Asthma Unspecified] | 2% | [493.9, 786.2], [493.9] [Asthma Unspecified, Cough], [Asthma Unspecified] | 1% |