Logan Ryan, Jenish Maharjan, Samson Mataraso, Gina Barnes, Jana Hoffman, Qingqing Mao, Jacob Calvert, Ritankar Das.
Abstract
Background: Pulmonary embolisms (PE) are life-threatening medical events, and early identification of patients experiencing a PE is essential to optimizing patient outcomes. Current tools for risk stratification of PE patients are limited and unable to predict PE events before their occurrence. Objective: We developed a machine learning algorithm (MLA) designed to identify patients at risk of PE before the clinical detection of onset in an inpatient population.
Keywords: anticoagulants; artificial intelligence; clinical; decision support systems; thrombosis
Year: 2022 PMID: 35506114 PMCID: PMC9052977 DOI: 10.1002/pul2.12013
Source DB: PubMed Journal: Pulm Circ ISSN: 2045-8932 Impact factor: 2.886
Demographic and clinical characteristics of the study sample used for training and testing the machine learning models
| Characteristic | Category | Non-PE encounters, n (%) | PE encounters, n (%) |
|---|---|---|---|
| Age | 40–49 | 10542 (17.48) | 37 (11.97) |
| | 50–59 | 15090 (25.03) | 73 (23.62) |
| | 60–69 | 17540 (29.09) | 108 (34.95) |
| | 70–79 | 9994 (16.57) | 63 (20.39) |
| | 80+ | 7131 (11.83) | 28 (9.06) |
| Sex | Male | 30942 (51.32) | 152 (49.19) |
| | Female | 29355 (48.68) | 157 (50.81) |
| Ethnicity | Hispanic | 6933 (11.50) | 22 (7.12) |
| | Not Hispanic | 53364 (88.50) | 287 (92.88) |
| Race | White | 34678 (57.51) | 193 (62.46) |
| | Black | 5645 (9.36) | 38 (12.30) |
| | Asian | 8832 (14.65) | 42 (13.59) |
| | Pacific Islander | 843 (1.40) | 4 (1.29) |
| | Native American | 276 (0.46) | 0 (0.00) |
| | Other/unknown | 10023 (16.62) | 32 (10.36) |
| Comorbidities | HIV/AIDS | 1472 (2.44) | 9 (2.91) |
| | Renal disease | 5205 (8.63) | 15 (4.85) |
| | Liver disease | 6397 (10.61) | 33 (10.68) |
| | Prior organ transplant | 6405 (10.62) | 19 (6.15) |
| | Diabetes | 14059 (23.32) | 68 (22.01) |
| | COPD | 5037 (8.35) | 42 (13.59) |
| | History of cancer | 22354 (37.07) | 168 (54.37) |
| | Alcohol use disorder | 2601 (4.31) | 13 (4.21) |
| | Pneumonia | 5474 (9.08) | 34 (11.00) |
| | Heart failure | 7216 (11.97) | 60 (19.42) |
| | Myocardial infarction | 2161 (3.58) | 8 (2.59) |
| | Previous VTE | 3369 (5.59) | 133 (43.04) |
| | Recent/active pregnancy (within 2 months) | 7276 (12.07) | 44 (14.24) |
| | Recent surgery (within 1 month) | 1794 (2.98) | 5 (1.64) |
Abbreviations: COPD, chronic obstructive pulmonary disease; HIV/AIDS, human immunodeficiency virus/acquired immunodeficiency syndrome; PE, pulmonary embolism; VTE, venous thromboembolism.
Figure 1. Receiver operating characteristic (ROC) curves for the XGBoost, neural network, and logistic regression machine learning models. AUROC, area under the receiver operating characteristic curve; LR, logistic regression; NN, neural network; XGB, XGBoost.
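As a reading aid for the ROC curves, the AUROC can be interpreted as the probability that a randomly chosen positive (PE) encounter receives a higher model score than a randomly chosen negative one. The following is a minimal sketch of that pairwise formulation; the function name `auroc` and the toy labels and scores are hypothetical and are not taken from the study, whose evaluation code is not shown here.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (1 = PE encounter, 0 = non-PE encounter).
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = auroc(toy_labels, toy_scores)
print(toy_auc)
```

The pairwise loop is quadratic in the number of encounters, so it is only suitable for illustration; a rank-based computation would be preferable at the scale of the study sample.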
Comparison of performance metrics of the three machine learning models
| Model | AUROC | Sensitivity | Specificity | LR+ | LR− | DOR |
|---|---|---|---|---|---|---|
| XGBoost | 85% | 81% | 77% | 3.48 | 0.25 | 13.80 |
| Neural network | 74% | 81% | 48% | 1.56 | 0.40 | 3.89 |
| Logistic regression | 67% | 81% | 45% | 1.47 | 0.43 | 3.43 |
Abbreviations: AUROC, area under the receiver operating characteristic curve; DOR, diagnostic odds ratio; LR, likelihood ratio.
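The likelihood ratios and diagnostic odds ratio in the table follow from sensitivity and specificity by the standard definitions LR+ = sensitivity / (1 − specificity), LR− = (1 − sensitivity) / specificity, and DOR = LR+ / LR−. The sketch below applies these definitions to the rounded percentages reported above; the helper name `diagnostic_ratios` is hypothetical. Because the inputs are rounded, the outputs land close to, but not exactly on, the tabulated values, which were presumably computed from unrounded figures.

```python
def diagnostic_ratios(sensitivity, specificity):
    """Return (LR+, LR-, DOR) from sensitivity and specificity in [0, 1]."""
    lr_pos = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    lr_neg = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = lr_pos / lr_neg                        # diagnostic odds ratio
    return lr_pos, lr_neg, dor


# Rounded sensitivity and specificity as reported in the table.
reported = {
    "XGBoost": (0.81, 0.77),
    "Neural network": (0.81, 0.48),
    "Logistic regression": (0.81, 0.45),
}

for name, (sens, spec) in reported.items():
    lr_pos, lr_neg, dor = diagnostic_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}, DOR = {dor:.2f}")
```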
Figure 2. SHAP summary plot for the XGBoost model. The x axis shows the SHAP value for each feature. The color of a point indicates the feature value: red is a high value and blue is a low value. The y axis lists feature names in descending order of importance to the model's decision‐making process. Superscripts in the feature names denote the hour of the patient's stay at which the measure was recorded. Delta symbols (Δ) mark features that capture the hourly change in a measure, with superscripts denoting the hours under consideration; for example, ΔUrine Output is the change in urine output between the first hour and the second hour of the 3‐h input period. DVT, deep vein thrombosis; GCS, Glasgow coma scale; SHAP, SHapley Additive exPlanations; SysABP, systolic arterial blood pressure.
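The hourly and delta features described in the Figure 2 caption could be assembled roughly as follows. This is only a sketch under stated assumptions: the function name `hourly_features`, the `name^hour` key format, the sign convention (later hour minus earlier hour), and the example urine output values are all hypothetical, since the study's feature pipeline is not given here.

```python
def hourly_features(name, hourly_values):
    """Build per-hour features and consecutive-hour deltas for one measure.

    hourly_values holds one value per hour of the input period
    (three values for a 3-h window).
    """
    features = {}
    # One feature per hour, keyed by the hour at which it was recorded.
    for hour, value in enumerate(hourly_values, start=1):
        features[f"{name}^{hour}"] = value
    # Delta features: change between consecutive hours
    # (assumed convention: later value minus earlier value).
    for hour in range(1, len(hourly_values)):
        earlier = hourly_values[hour - 1]
        later = hourly_values[hour]
        features[f"Δ{name}^{hour},{hour + 1}"] = later - earlier
    return features


# Hypothetical urine output readings over a 3-h input period.
example = hourly_features("Urine Output", [50.0, 40.0, 65.0])
for key, value in example.items():
    print(key, value)
```

With a 3-h window this yields three hourly features and two delta features per measure.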