Jan Chrusciel, François Girardon, Lucien Roquette, David Laplanche, Antoine Duclos, Stéphane Sanchez.
Abstract
OBJECTIVE: This study aimed to assess how much machine learning-based predictions of hospital length of stay (LOS) improve when clinical signs written in free text are taken into account, compared with the traditional approach of considering only structured information such as age, gender and major ICD diagnosis.
Keywords: Data mining; Emergency department; Health services research; Length of stay
Year: 2021 PMID: 34922532 PMCID: PMC8684269 DOI: 10.1186/s12911-021-01722-4
Source DB: PubMed Journal: BMC Med Inform Decis Mak ISSN: 1472-6947 Impact factor: 2.796
Patient characteristics for the study to assess the prediction of hospital length of stay (LOS) using unstructured data at the emergency department (ED)
| Characteristic | Total |
|---|---|
| n | 5,006 |
| Age—mean ± SD | 64.3 ± 26.3 |
| Age category—n (%) | |
| < 18 | 494 (9.9) |
| ≥ 18 | 4,512 (90.1) |
| Gender—n (%) | |
| Male | 2,333 (46.6) |
| Female | 2,673 (53.4) |
| Emergency LOS (hours)—Median (Q1–Q3) | 7.2 (4.8–9.6) |
| Total (ED + hospital) LOS (days)—Median (Q1–Q3) | 6.1 (3.7–11.0) |
| Intensive care patients—n (%) | 378 (7.6) |
| Most frequent diagnoses—n (%) | |
| Pneumonia (J189) | 212 (4.2) |
| Altered general health (R53+0) | 188 (3.8) |
| Shortness of breath (R060) | 174 (3.5) |
| Abdominal pain (R104) | 122 (2.4) |
| Femoral bone fracture (S7200) | 121 (2.4) |
| Most frequent concepts—n (%) | |
| Pain | 4,921 (98.3) |
| Blood pressure | 4,568 (91.3) |
| Capillary | 3,521 (70.3) |
| Abdomen | 2,155 (43.0) |
| Face | 2,046 (40.9) |
| Type of hospital stay, n (%) | |
| Pulmonology | 871 (17.4) |
| Digestive system | 761 (15.2) |
| Cardiovascular medicine (except cardiovascular catheterization) | 503 (10.0) |
| Trauma and orthopaedics | 467 (9.3) |
| Diseases of the nervous system (including stroke) | 465 (9.3) |
| Urology, nephrology | 332 (6.6) |
| Rheumatology | 313 (6.3) |
| Endocrinology | 195 (3.9) |
| Hematology | 193 (3.9) |
| Diagnostic or therapeutic catheterization | 161 (3.2) |
| Dermatology | 134 (2.7) |
| ENT, stomatology | 128 (2.6) |
| Toxicology, alcohol-related disease | 122 (2.4) |
| Psychiatry | 107 (2.1) |
| Multidisciplinary stays and known disease follow-up | 89 (1.8) |
| Obstetrics | 51 (1.0) |
| Infectiology | 44 (0.9) |
| Gynecology | 38 (0.8) |
| Other (chronic pain, ophthalmology, complex trauma, burn injury) | 32 (0.6) |
Model performance for the “structured-data only” and “unstructured-data added” feature sets
| Metric | Structured | Unstructured | Difference (pts) | All features |
|---|---|---|---|---|
| Recall | 77.3% | 77.1% | −0.19 | 76.6% |
| Specificity | 70.4% | 72.7% | 2.31 | 71.1% |
| Precision | 74.2% | 75.7% | 1.48 | 74.4% |
| Accuracy | 74.1% | 75.0% | 1.0 | 75.0% |
| F1 Score | 75.7% | 76.4% | 0.68 | 75.5% |
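As a quick consistency check on the table above, the F1 score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The sketch below is illustrative only: it uses the rounded percentages from the table as inputs, and the names `reported` and `f1_score` are introduced here rather than taken from the study, so small rounding differences from the published values are expected.

```python
# Rounded values copied from the table above, expressed as fractions.
# The dictionary layout and helper name are assumptions made for this sketch.
reported = {
    "Structured": {"precision": 0.742, "recall": 0.773, "f1": 0.757},
    "Unstructured": {"precision": 0.757, "recall": 0.771, "f1": 0.764},
    "All features": {"precision": 0.744, "recall": 0.766, "f1": 0.755},
}


def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F1 for every feature set from its precision and recall.
recomputed = {
    name: f1_score(m["precision"], m["recall"]) for name, m in reported.items()
}

for name, m in reported.items():
    print(f"{name}: reported F1 = {m['f1']:.1%}, recomputed F1 = {recomputed[name]:.1%}")
```

With these inputs the recomputed values agree with the reported F1 scores to one decimal place, which suggests the three columns are internally consistent.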
Fig. 1 Feature importance for the unstructured data model to predict hospital length of stay (LOS)
Fig. 2 Feature importance for the structured data model to predict hospital length of stay (LOS)
Model performance of structured and unstructured data to predict hospital length of stay (LOS) when trained on intensive care unit stays
| Metric | Structured data | Unstructured data | Difference (points) | All features |
|---|---|---|---|---|
| Recall | 77.6% | 75.9% | −1.72 | 84.5% |
| Specificity | 66.7% | 77.8% | 11.11 | 72.2% |
| Precision | 88.2% | 91.7% | 3.43 | 90.7% |
| Accuracy | 75.0% | 76.3% | 1.32 | 76.3% |
| F1 score | 82.6% | 83.1% | 0.49 | 87.5% |
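For readers less familiar with the five metrics in these tables, the sketch below shows how each one is derived from a binary confusion matrix. The source does not report confusion matrices, so the counts used here are hypothetical: they were chosen only because they happen to reproduce the structured-data column of the table above after rounding, and the `ConfusionMatrix` class is an illustrative helper, not part of the study's code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (e.g. long stay vs. short stay)."""

    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def recall(self) -> float:
        # Share of actual positives that were predicted positive.
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Share of actual negatives that were predicted negative.
        return self.tn / (self.tn + self.fp)

    @property
    def precision(self) -> float:
        # Share of predicted positives that were actually positive.
        return self.tp / (self.tp + self.fp)

    @property
    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    @property
    def f1(self) -> float:
        # Harmonic mean of precision and recall, written in terms of counts.
        return 2 * self.tp / (2 * self.tp + self.fp + self.fn)


# HYPOTHETICAL counts, not taken from the source.
cm = ConfusionMatrix(tp=45, fp=6, tn=12, fn=13)

for metric in ("recall", "specificity", "precision", "accuracy", "f1"):
    print(f"{metric}: {getattr(cm, metric):.1%}")
```

Because the intensive care subset is small, a change of a single patient in such a matrix moves specificity by several points, which is worth keeping in mind when reading the difference column.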