Subendhu Rongali, Adam J Rose, David D McManus, Adarsha S Bajracharya, Alok Kapoor, Edgard Granillo, Hong Yu.
Abstract
BACKGROUND: Scalable and accurate health outcome prediction using electronic health record (EHR) data has recently attracted considerable research attention. Previous machine learning models mostly ignore relations between different types of clinical data (ie, laboratory components, International Classification of Diseases codes, and medications).
Keywords: ablation; neural networks; patient mortality; predictive modeling
Year: 2020 PMID: 32202503 PMCID: PMC7136840 DOI: 10.2196/16374
Source DB: PubMed Journal: J Med Internet Res ISSN: 1438-8871 Impact factor: 5.428
Patient demographic information (N=7537).
| Characteristic | Values |
| | |
| Mean | 74.74 |
| Median | 66.00 |
| | |
| Male | 4190 (55.59) |
| Female | 3347 (44.41) |
| | |
| White | 5644 (74.88) |
| Black | 867 (11.50) |
| Hispanic | 277 (3.68) |
| Asian | 226 (3.00) |
| Other/unknown | 523 (6.94) |
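The parenthesized values in the demographic table are consistent with percentages of the full cohort (N=7537). The short Python sketch below recomputes them from the counts as a sanity check. The reading of the parentheses as percentages is an inference from the arithmetic, and the function name `percent_of_cohort` is chosen here for illustration.

```python
# Counts copied from the demographic table; N is the cohort size from the caption.
N = 7537
sex_counts = {"Male": 4190, "Female": 3347}
race_counts = {
    "White": 5644,
    "Black": 867,
    "Hispanic": 277,
    "Asian": 226,
    "Other/unknown": 523,
}


def percent_of_cohort(count, total=N):
    """Return count as a percentage of total, rounded to 2 decimals."""
    return round(100 * count / total, 2)


# Recompute the percentage for every row and print it next to its count.
percentages = {
    label: percent_of_cohort(count)
    for label, count in {**sex_counts, **race_counts}.items()
}
for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```

Each group of counts also sums to the cohort size, so no patient is missing from either breakdown.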
Figure 1. Our model architecture. LSTM: long short-term memory.
Figure 2. Model for constructing the encounter vector. ReLU: rectified linear unit; ICD: International Classification of Diseases.
Figure 3. The correlational neural network for our 3 views. ICD: International Classification of Diseases.
Area under the receiver operating characteristic curve scores for different models.
| Method | Area under the receiver operating characteristic curve, mean (SD) |
| Logistic regression | 0.82 (0.0103) |
| RETAIN [a] (only ICD [b]) | 0.82 (0.0924) |
| TaRETAIN- [c] | 0.82 (0.0118) |
| TaRETAIN- | 0.82 (0.0919) |
| RETAIN (all codes) | 0.86 (0.0105) |
| Long short-term memory with only ICD codes | 0.83 (0.0104) |
| CLOUT [d] (only autoencoder) | 0.80 (0.0116) |
| CLOUT (only latent space) | 0.81 (0.0082) |
| CLOUT (simple concatenation) | 0.88 (0.0096) |
| CLOUT (autoencoder concatenation) | 0.88 (0.0107) |
| CLOUT (latent space concatenation) | |
[a] RETAIN: Reverse Time Attention model.
[b] ICD: International Classification of Diseases.
[c] TaRETAIN: time-aware RETAIN.
[d] CLOUT: L(STM) Outcome prediction using Comprehensive features relations.
[e] Best performing model.
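As a reminder of what the reported metric measures, the following Python sketch computes the area under the receiver operating characteristic curve as the probability that a randomly chosen positive case is scored above a randomly chosen negative case, then summarizes several runs as mean (SD). The labels and scores are made-up toy data, and the assumptions that the SD is taken over repeated runs and is a population SD are ours; the table does not state either.

```python
from statistics import mean, pstdev


def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy runs (labels, predicted scores); not data from the study.
runs = [
    ([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]),
    ([0, 1, 0, 1], [0.2, 0.9, 0.6, 0.7]),
]
per_run = [auroc(labels, scores) for labels, scores in runs]

# Summarize as mean (SD); pstdev is the population SD (an assumption here).
summary = f"{mean(per_run):.2f} ({pstdev(per_run):.4f})"
print(per_run, summary)
```

The pairwise form is quadratic in the number of cases, which is fine for illustration; a rank-based implementation would be preferable on a cohort of several thousand patients.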
Figure 4. The area under the receiver operating characteristic curves for various models. RETAIN: Reverse Time Attention model; CLOUT: L(STM) Outcome prediction using Comprehensive feature relations.
Precision, recall, and F-scores for top CLOUT [a] models.
| Method and class | Precision | Recall | F-score |
| | | | |
| 0 | 0.85 | 0.82 | 0.83 |
| 1 | 0.71 | 0.76 | 0.73 |
| Average | 0.80 | 0.79 | 0.80 |
| | | | |
| 0 | 0.85 | 0.85 | 0.85 |
| 1 | 0.74 | 0.74 | 0.74 |
| Average | 0.81 | 0.81 | 0.81 |
| | | | |
| 0 | 0.84 | 0.88 | 0.86 |
| 1 | 0.78 | 0.72 | 0.72 |
| Average | 0.82 | 0.82 | 0.82 |
[a] CLOUT: L(STM) Outcome prediction using Comprehensive features relations.
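The F-score column can be related to the precision and recall columns through the usual harmonic mean, F = 2PR / (P + R). The Python sketch below applies that formula to the per-class rows of the first group in the table. It assumes the reported F-score is the balanced F1 measure, which the table does not state explicitly, and it works from the rounded values shown, so it can only reproduce the reported figures approximately.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both precision and recall are zero
    return 2 * precision * recall / (precision + recall)


# (class, precision, recall) for the first group of rows in the table.
first_group = [(0, 0.85, 0.82), (1, 0.71, 0.76)]

recomputed = {
    cls: round(f_score(p, r), 2) for cls, p, r in first_group
}
for cls, value in recomputed.items():
    print(f"class {cls}: F-score {value:.2f}")
```

When precision and recall are equal, as in the second group of rows, the harmonic mean equals that common value.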
Pearson correlation coefficients for agreement between physicians and models.
| Agreement | Physician 1 | Physician 2 | Physician 3 | Physician 4 | Physician 5 | Mean (SD) |
| | | | | | | |
| Physician 1 | 1.00 | 0.81 | 0.56 | 0.61 | 0.88 | 0.72 (0.13) |
| Physician 2 | 0.81 | 1.00 | 0.87 | 0.65 | 0.86 | 0.80 (0.09) |
| Physician 3 | 0.56 | 0.87 | 1.00 | 0.49 | 0.69 | 0.65 (0.14) |
| Physician 4 | 0.61 | 0.65 | 0.49 | 1.00 | 0.61 | 0.59 (0.06) |
| Physician 5 | 0.88 | 0.86 | 0.69 | 0.61 | 1.00 | 0.76 (0.11) |
| | | | | | | |
| Logistic regression | 0.60 | 0.63 | 0.53 | 0.32 | 0.52 | 0.52 (0.11) |
| RETAIN [a] | 0.65 | 0.72 | 0.61 | 0.30 | 0.58 | 0.57 (0.14) |
| CLOUT [b] (only autoencoder) | −0.07 | 0.13 | 0.21 | | 0.17 | 0.20 (0.20) |
| CLOUT (only latent space) | 0.42 | 0.77 | | 0.35 | 0.53 | 0.54 (0.15) |
| CLOUT (simple concatenation) | 0.52 | 0.64 | 0.70 | 0.19 | | 0.54 (0.19) |
| CLOUT (autoencoder concatenation) | 0.54 | 0.70 | 0.64 | 0.14 | 0.62 | 0.53 (0.20) |
| CLOUT (latent space concatenation) | | | 0.59 | 0.18 | | |
[a] RETAIN: Reverse Time Attention model.
[b] CLOUT: L(STM) Outcome prediction using Comprehensive features relations.
[c] Italicization signifies highest physician-model agreement in the column.
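To make the agreement table easier to interpret, the Python sketch below shows how a Pearson correlation coefficient is computed between two raters and how a row's Mean (SD) entry can be recomputed from its cells. The rating vectors are hypothetical toy data. The row summaries appear consistent with a population SD, with the self-agreement of 1.00 excluded for physician rows; that is an inference from the numbers, not something the table states.

```python
from math import sqrt
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical risk ratings from one physician and one model (toy data).
physician = [1, 2, 3, 4, 5]
model = [2, 4, 6, 8, 10]
r_toy = pearson_r(physician, model)

# Row values copied from the table.
logistic_row = [0.60, 0.63, 0.53, 0.32, 0.52]
physician1_others = [0.81, 0.56, 0.61, 0.88]  # self-agreement 1.00 excluded

logistic_summary = (round(mean(logistic_row), 2), round(pstdev(logistic_row), 2))
physician1_sd = round(pstdev(physician1_others), 2)
print(r_toy, logistic_summary, physician1_sd)
```

For the logistic regression row this reproduces the tabulated 0.52 (0.11), and for Physician 1 it reproduces the tabulated SD of 0.13.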