| Literature DB >> 33008368 |
Divneet Mandair1, Premanand Tiwari2, Steven Simon3, Kathryn L Colborn4, Michael A Rosenberg5,6.
Abstract
BACKGROUND: With the burden of cardiovascular disease increasing, substantial research has focused on the development of prediction tools. We compare deep learning and machine learning models against a baseline logistic regression that uses only 'known' risk factors to predict incident myocardial infarction (MI) from harmonized EHR data.
Keywords: Electronic health records; Machine learning; Myocardial infarction
Year: 2020 PMID: 33008368 PMCID: PMC7532582 DOI: 10.1186/s12911-020-01268-x
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
UCHealth Population by MI diagnosis
| | No MI | 6-month Incident MI |
|---|---|---|
| n | 2.25 million | 20,591 |
| Age, years (mean ± SD) | 43.32 ± 22.56 | 70.36 ± 14.02 |
| Female | 1,228,689 (54.5%) | 7880 (37.7%) |
| Hypertension | 376,371 (16.72%) | 12,314 (59.8%) |
| Coronary artery disease | 59,199 (2.63%) | 6700 (32.53%) |
| Mitral valve disease | 30,251 (1.34%) | 1668 (8.1%) |
| Heart failure | 44,999 (2%) | 3038 (14.75%) |
| Type II diabetes mellitus | 131,059 (5.82%) | 5432 (26.38%) |
| Obesity | 127,575 (5.67%) | 2343 (11.37%) |
| Chronic kidney disease | 42,759 (1.9%) | 2128 (10.33%) |
Legend: Baseline demographics and relative frequency of typical MI risk factors. Diagnoses based on presence of a diagnosis code (ICD-10; ICD-9) for each. Hypertension: I10.x; 401.x. Coronary artery disease: I25.1; 414.01. Mitral valve disease: I34.2, I34.0; 394.0, 424.0. Heart failure: I50.9; 428.0. Type II diabetes mellitus: E11.9; 250.00. Obesity: E66.9; 278.0. Chronic kidney disease: N18.9; 585.9
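The legend's code-list approach to phenotyping can be sketched as a prefix match over diagnosis codes. This is an illustrative pandas sketch, not the authors' pipeline; the column names (`patient_id`, `icd_code`) and the subset of code lists are assumptions.

```python
import pandas as pd

# Hypothetical code lists mirroring the table legend (subset, for illustration).
# Prefix matching lets 'I10' also capture subcodes such as 'I10.1'.
RISK_FACTOR_CODES = {
    "hypertension": ("I10", "401"),
    "coronary_artery_disease": ("I25.1", "414.01"),
    "heart_failure": ("I50.9", "428.0"),
    "type2_diabetes": ("E11.9", "250.00"),
}

def flag_risk_factors(diagnoses: pd.DataFrame) -> pd.DataFrame:
    """Return one row per patient with a boolean column per risk factor.

    `diagnoses` is assumed to have columns: patient_id, icd_code.
    """
    patients = diagnoses["patient_id"].unique()
    out = {}
    for factor, codes in RISK_FACTOR_CODES.items():
        hit = diagnoses["icd_code"].astype(str).str.startswith(codes)
        positive_ids = diagnoses.loc[hit, "patient_id"].unique()
        out[factor] = pd.Series(patients).isin(positive_ids).values
    return pd.DataFrame(out, index=patients)

# Toy example: three patients with a handful of diagnosis codes.
dx = pd.DataFrame({
    "patient_id": [1, 1, 2, 3],
    "icd_code": ["I10.1", "E11.9", "428.0", "I25.1"],
})
flags = flag_risk_factors(dx)
```

Each cell of `flags` records whether the patient has at least one matching diagnosis code for that risk factor.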
Comparison of Resampling Strategies
| Strategy | F1 score | AUC | Training time |
|---|---|---|---|
| Oversampling: random | 0.105 | 0.816 | 22 min |
| Oversampling: SMOTE | 0.109 | 0.786 | 34 min |
| Undersampling: random | 0.091 | 0.839 | 2 min |
| Undersampling: cluster centroid | 0.057 | 0.78 | 78 min |
| None | 0.01 | 0.51 | 3 min |
Sampling comparison from deep learning model
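Random undersampling, the fastest strategy above and the one with the best AUC, simply downsamples the majority (no-MI) class to match the minority-class size. A minimal NumPy sketch, assuming a binary 0/1 label vector (not the authors' implementation):

```python
import numpy as np

def random_undersample(X, y, seed=0):
    """Balance a binary dataset by randomly downsampling the majority
    class to the minority-class size."""
    rng = np.random.default_rng(seed)
    minority = int(np.bincount(y).argmin())       # label of the rarer class
    idx_min = np.flatnonzero(y == minority)
    idx_maj = np.flatnonzero(y != minority)
    keep_maj = rng.choice(idx_maj, size=idx_min.size, replace=False)
    keep = np.concatenate([idx_min, keep_maj])
    rng.shuffle(keep)
    return X[keep], y[keep]

# Toy data: ~1% positive class, echoing the rarity of incident MI above.
X = np.arange(2000, dtype=float).reshape(1000, 2)
y = np.zeros(1000, dtype=int)
y[:10] = 1
Xb, yb = random_undersample(X, y)   # balanced: 10 positives, 10 negatives
```

SMOTE and cluster-centroid undersampling (as compared in the table) are available in the `imbalanced-learn` package if synthetic or prototype-based resampling is needed instead.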
Comparison of machine learning approaches
| Model | F1 score | AUC | Training time |
|---|---|---|---|
| Naïve Bayes | 0.060 | 0.73 | 1 min |
| Logistic regression with L2 regularization | 0.084 | 0.829 | 1 min |
| Logistic regression with no regularization | 0.06 | 0.79 | 1 min |
| RF | 0.084 | 0.765 | 3 min |
| Shallow NN | 0.101 | 0.83 | 1 min |
| Deep NN | 0.092 | 0.835 | 2 min |
| GBM | 0.077 | 0.83 | 9 min |
Comparison of various models using the random undersampling technique and all features. F1 and AUC calculated from each model applied to the held-out testing set (20%); training time is for fitting on the training set (80%)
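The evaluation protocol in the caption (80/20 split, F1 and AUC on the held-out set) can be sketched with scikit-learn. Synthetic data and an L2-regularized logistic regression stand in for the harmonized EHR features and the table's second row; this is illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the feature matrix: outcome depends on two features.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 5))
p = 1.0 / (1.0 + np.exp(-(X[:, 0] + X[:, 1])))
y = (rng.random(n) < p).astype(int)

# 80% train / 20% held-out test, matching the caption.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# L2-regularized logistic regression (scikit-learn's default penalty).
model = LogisticRegression(penalty="l2", C=1.0).fit(X_tr, y_tr)

# Metrics reported in the table: F1 on hard labels, AUC on probabilities.
f1 = f1_score(y_te, model.predict(X_te))
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
```

Note that F1 uses thresholded predictions while AUC is threshold-free, which is why the table's rankings differ between the two metrics.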
Fig. 1 Performance of the optimal model. a Precision-recall curve and confusion matrix for the optimally performing DNN. b ROC curve for the DNN model
Fig. 2 Calibration curve for the optimal model. a Calibration plot for the DNN, showing a wide discrepancy between the actual observed distribution of outcomes and the predicted probabilities from the model. b Distribution of predicted probabilities for cases vs. controls, showing good discrimination
Fig. 3 Calibration curve for the comparison model. a Calibration plot for the logistic model using only known risk factors, showing a similar discrepancy between the actual observed distribution of outcomes and the predicted probabilities. b Distribution of predicted probabilities for cases vs. controls, again with good discrimination
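Calibration plots of the kind shown in Figs. 2-3 bin the predicted probabilities and compare each bin's mean prediction with the observed event fraction. A sketch using scikit-learn's `calibration_curve`, with deliberately over-confident synthetic scores to produce the sort of gap the figures describe:

```python
import numpy as np
from sklearn.calibration import calibration_curve

# Outcomes drawn from their true probabilities, then scores distorted to be
# over-confident (pushed toward 0 and 1), creating a calibration gap.
rng = np.random.default_rng(1)
p = rng.uniform(0.0, 1.0, 5000)
y = (rng.random(5000) < p).astype(int)
p_overconfident = np.clip(p * 1.5 - 0.25, 0.0, 1.0)

# frac_pos: observed event fraction per bin; mean_pred: mean score per bin.
frac_pos, mean_pred = calibration_curve(y, p_overconfident, n_bins=10)
```

Plotting `frac_pos` against `mean_pred` and comparing with the diagonal reproduces the calibration panels; a well-calibrated model tracks the diagonal, while the distorted scores here deviate from it at the extremes.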