Khalid Alghatani, Nariman Ammar, Abdelmounaam Rezgui, Arash Shaban-Nejad.
Abstract
BACKGROUND: Patient monitoring is vital in all stages of care. In particular, intensive care unit (ICU) patient monitoring has the potential to reduce complications and morbidity and to improve the quality of medical services in the ICU by enabling hospitals to deliver higher-quality, cost-effective patient care.
Keywords: ICU patient monitoring; clinical intelligence; intensive care unit (ICU); machine learning; predictive model; vital signs measurements
Year: 2021 PMID: 33949961 PMCID: PMC8135024 DOI: 10.2196/21347
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Intelligent Remote Patient Monitoring (IRPM) framework. IICUPM: intelligent intensive care unit patient monitoring.
Figure 2. Data extraction pipeline from the Medical Information Mart for Intensive Care (MIMIC) database. ICU: intensive care unit.
Descriptive statistics for outcome variables in the two models.
| Model | Operationalization | Values |
| In-hospital mortality (binary classification) | 0: survival; 1: nonsurvival | 0: 88.103%; 1: 11.897% |
| Length of stay (LOS), binary classification | 0: LOS≤2.636 days; 1: LOS>2.636 days | 0: 50%; 1: 50% |
| Length of stay (LOS), regression | Number of days in intensive care unit | Mean 4.74959 (SD 6.49982) |
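The even 50%/50% split of the binary length of stay outcome is what a cut at the cohort median would produce. The sketch below assumes that 2.636 days is such a median threshold; the function name `binarize_los` and the sample stays are hypothetical and are not taken from the study.

```python
from statistics import median


def binarize_los(los_days, threshold=None):
    """Return (labels, threshold); label 1 means LOS > threshold, else 0.

    If no threshold is given, the sample median is used. This is an
    assumption about how a 50%/50% split could be obtained, not the
    authors' code.
    """
    if threshold is None:
        threshold = median(los_days)
    labels = [1 if days > threshold else 0 for days in los_days]
    return labels, threshold


# Hypothetical ICU stays in days; these are not values from the study.
stays = [0.9, 1.5, 2.1, 3.0, 4.8, 12.4]
labels, cut = binarize_los(stays)
print(cut, labels)
```

Passing `threshold=2.636` explicitly would reproduce the operationalization in the table on new data without recomputing the median.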
Descriptive statistics for baseline model predictors (N=44,626).
| Input variables | Measurement | Value |
| HeartRate_mean | Heart rate (beats/minute), mean (SD) | 85.99 (15.59) |
| sysbp_mean | Arterial systolic blood pressure (mmHg) mean (SD) | 118.75 (16.90) |
| diasbp_mean | Arterial diastolic blood pressure (mmHg), mean (SD) | 60.47 (10.89) |
| RespRate_mean | Respiratory rate (breaths/minute), mean (SD) | 18.93 (4.05) |
| Tempc_mean | Body temperature (°C), mean (SD) | 36.84 (0.62) |
| Spo2_mean | Peripheral oxygen saturation (%), mean | 97.27 |
| Glucose_mean | Blood glucose (mg/dL), mean (SD) | 138.74 (41.86) |
| Age | Age (years), mean (SD) | 64.35 (16.87) |
| GenderM | Male population, n (%) | 25,241 (56.56) |
| GenderF | Female population, n (%) | 19,385 (43.44) |
| Height | Patient height (cm), mean (SD) | 160.66 (11.76) |
| Weight | Patient weight (kg), mean (SD) | 80.45 (23.47) |
Pearson correlation coefficients among vital signs of the baseline model.
| Variable | Heart rate | Systolic BPa | Diastolic BP | Respiration rate | Body temperature | SpO2b | Glucose |
| Heart rate | 1 | –0.104 | 0.211 | 0.326 | 0.268 | –0.099 | 0.063 |
| Systolic BP | –0.104 | 1 | 0.524 | –0.032 | 0.065 | 0.045 | 0.063 |
| Diastolic BP | 0.211 | 0.524 | 1 | 0.0257 | 0.065 | –0.0148 | 0.0142 |
| Respiration rate | 0.326 | –0.032 | 0.0257 | 1 | 0.118 | –0.259 | 0.069 |
| Body temperature | 0.268 | 0.065 | 0.0335 | 0.118 | 1 | 0.051 | –0.022 |
| SpO2 | –0.099 | 0.045 | –0.0148 | –0.259 | 0.051 | 1 | –0.048 |
| Glucose | 0.063 | 0.078 | 0.0142 | 0.069 | –0.022 | –0.048 | 1 |
aBP: blood pressure.
bSpO2: oxygen saturation.
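As an illustration of how such a matrix is obtained, the following minimal sketch computes pairwise Pearson coefficients with the standard library only. The helper names and the sample readings are hypothetical. Because the Pearson coefficient is symmetric in its two arguments, mirrored cells of the matrix must agree, and the last helper checks exactly that.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict of pairwise coefficients for a dict of named columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


def asymmetric_pairs(matrix, tol=1e-9):
    """List variable pairs whose mirrored cells disagree by more than tol."""
    names = list(matrix)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if abs(matrix[a][b] - matrix[b][a]) > tol
    ]


# Hypothetical readings for four patients (not data from the study).
vitals = {
    "heart_rate": [72.0, 85.0, 98.0, 110.0],
    "systolic_bp": [125.0, 118.0, 121.0, 104.0],
    "resp_rate": [14.0, 18.0, 17.0, 24.0],
}
matrix = correlation_matrix(vitals)
print(asymmetric_pairs(matrix))
```

Applied to the printed table above, a check of this kind would flag two pairs whose mirrored cells differ: systolic BP with glucose (0.063 vs 0.078) and diastolic BP with body temperature (0.065 vs 0.0335).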
Figure 3. Feature engineering pipeline in the quantiles approach. MIMIC: Medical Information Mart for Intensive Care; ICU: intensive care unit; PDF: probability density function; PPF: percent point function.
Figure 4. Distribution of a sample patient observation before and after applying the quantiles approach.
Figure 5. Distribution of a sample patient observation before and after applying the quantiles approach (continued from Figure 4).
Sample data from an individual patient before applying the quantiles approach.
| Feature | Operationalization | Mean (SD) |
| HeartRate_mean | Mean heart rate (beats/minute) | 98.92 (27.89) |
| sysbp_mean | Mean systolic blood pressure (mmHg) | 107.8 (21.26) |
| diasbp_mean | Mean diastolic blood pressure (mmHg) | 56.88 (10.00) |
| resprate_mean | Mean respiration rate (breaths/minute) | 17.29 (3.33) |
| tempc_mean | Mean body temperature (°C) | 37.08 (0.33) |
| spo2_mean | Mean oxygen saturation (%) | 97.86 (3.57) |
| glucose_mean | Mean glucose level (mg/dL) | 206.0 (73.26) |
Sample patient data after applying the quantiles approach.
| Feature | Operationalization | Value |
| HeartRate_mean_mod | Mean of modified heart rate (beats/minute) | 103.59 |
| sysbp_mean_mod | Mean of modified arterial systolic blood pressure (mmHg) | 109.34 |
| diasbp_mean_mod | Mean of modified arterial diastolic blood pressure (mmHg) | 57.03 |
| resprate_mean_mod | Mean of modified respiratory rate (breaths/minute) | 16.36 |
| tempc_mean_mod | Mean of modified body temperature (°C) | 37.12 |
| spo2_mean_mod | Mean of modified peripheral oxygen saturation (%) | 89.00 |
| glucose_mean_mod | Mean of modified blood glucose level (mg/dL) | 214.46 |
| heartRate_std_mod | SD of modified heart rate (beats/minute) | 35.76 |
| sysbp_std_mod | SD of modified arterial systolic blood pressure (mmHg) | 30.83 |
| diasbp_std_mod | SD of modified arterial diastolic blood pressure (mmHg) | 14.36 |
| resprate_std_mod | SD of modified respiratory rate (breaths/minute) | 5.49 |
| tempc_std_mod | SD of modified body temperature (°C) | 0.43 |
| spo2_std_mod | SD of modified peripheral oxygen saturation (%) | 8.74 |
| glucose_std_mod | SD of modified blood glucose level (mg/dL) | 78.60 |
| HeartRateQuantPer | First and fourth quantiles percent of heart rate | 0.5522 |
| SystolicQuantPer | First and fourth quantiles percent of arterial systolic blood pressure | 0.4266 |
| DiastolicQuantPer | First and fourth quantiles percent of arterial diastolic blood pressure | 0.4400 |
| RespRateQuantPer | First and fourth quantiles percent of respiratory rate | 0.3384 |
| TempCQuantPer | First and fourth quantiles percent of body temperature | 0.5384 |
| SPO2QuantPer | First and fourth quantiles percent of peripheral oxygen saturation | 0.0689 |
| GlucoseQuantPer | First and fourth quantiles percent of blood glucose level | 0.8125 |
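The table lists three kinds of engineered features per vital sign: a modified mean, a modified SD, and a first-and-fourth-quantile percent. This excerpt does not define how the modified values are derived, so the sketch below rests on a loudly stated assumption: the modified statistics are taken over only those observations that fall in the first or fourth quarter of a reference distribution, and the percent feature is the share of observations in those two quarters. The function name, the reference data, and the readings are all hypothetical.

```python
from statistics import mean, quantiles, stdev


def quantile_features(observations, reference):
    """Tail-focused features for one vital sign of one patient.

    ASSUMPTION (not confirmed by the source): "modified" statistics use
    only observations at or below the first quartile cut point or at or
    above the third quartile cut point of `reference`.
    """
    q1, _, q3 = quantiles(reference, n=4)  # three cut points: Q1, median, Q3
    tails = [v for v in observations if v <= q1 or v >= q3]
    return {
        "quant_per": len(tails) / len(observations),
        "mean_mod": mean(tails) if tails else mean(observations),
        "std_mod": stdev(tails) if len(tails) > 1 else 0.0,
    }


# Hypothetical reference distribution and patient readings.
reference = list(range(1, 101))
readings = [10, 20, 30, 40, 50, 60, 80, 90]
features = quantile_features(readings, reference)
print(features)
```

Under this reading, a patient whose readings sit mostly in the tails gets a high percent value, which is one way extreme observations could carry more weight than in a plain mean.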
Vital sign data after applying the quantiles approach for the entire patient population.
| Feature | Operationalization | Value |
| HeartRate_mean_mod | Mean of modified heart rate (beats/minute) | 86.55 (15.8469) |
| sysbp_mean_mod | Mean of modified arterial systolic blood pressure (mmHg) | 119.06 (16.865) |
| diasbp_mean_mod | Mean of modified arterial diastolic blood pressure (mmHg) | 61.2201 (11.4944) |
| resprate_mean_mod | Mean of modified respiratory rate (breaths/minute) | 19.22 (4.1363) |
| tempc_mean_mod | Mean of modified body temperature (°C) | 36.82 (0.67382) |
| spo2_mean_mod | Mean of modified peripheral oxygen saturation (%) | 96.00 (5.28098) |
| glucose_mean_mod | Mean of modified blood glucose level (mg/dL) | 144.50 (48.3843) |
| heartRate_std_mod | SD of modified heart rate (beats/minute) | 11.33 (6.02761) |
| sysbp_std_mod | SD of modified arterial systolic blood pressure (mmHg) | 19.22 (7.64726) |
| diasbp_std_mod | SD of modified arterial diastolic blood pressure (mmHg) | 13.21 (6.06014) |
| resprate_std_mod | SD of modified respiratory rate (breaths/minute) | 4.96 (2.05444) |
| tempc_std_mod | SD of modified body temperature (°C) | 0.61 (0.35567) |
| spo2_std_mod | SD of modified peripheral oxygen saturation (%) | 2.53 (2.18251) |
| glucose_std_mod | SD of modified blood glucose level (mg/dL) | 34.69 (32.2924) |
| HeartRateQuantPer | First and fourth quantiles percent of heart rate | 51.63 |
| SystolicQuantPer | First and fourth quantiles percent of arterial systolic blood pressure | 50.49 |
| DiastolicQuantPer | First and fourth quantiles percent of arterial diastolic blood pressure | 47.47 |
| RespRateQuantPer | First and fourth quantiles percent of respiratory rate | 49.02 |
| TempCQuantPer | First and fourth quantiles percent of body temperature | 56.57 |
| SPO2QuantPer | First and fourth quantiles percent of peripheral oxygen saturation | 46.26 |
| GlucoseQuantPer | First and fourth quantiles percent of blood glucose level | 57.04 |
Pearson correlation coefficients among the mean vital signs for a sample patient using the statistical model.
| Variable | Heart rate | Systolic BPa | Diastolic BP | Respiration rate | Body temperature | SpO2b | Glucose |
| Heart rate | 1 | –0.103 | 0.183 | 0.316 | 0.236 | –0.065 | 0.053 |
| Systolic BP | –0.103 | 1 | 0.504 | –0.034 | 0.057 | 0.056 | 0.069 |
| Diastolic BP | 0.183 | 0.504 | 1 | 0.030 | 0.031 | 0.028 | 0.029 |
| Respiration rate | 0.316 | –0.034 | 0.030 | 1 | 0.128 | –0.095 | 0.064 |
| Body temperature | 0.236 | 0.057 | 0.031 | 0.128 | 1 | 0.016 | –0.028 |
| SpO2 | –0.065 | 0.056 | 0.028 | –0.095 | 0.016 | 1 | –0.028 |
| Glucose | 0.053 | 0.069 | 0.029 | 0.064 | –0.028 | –0.028 | 1 |
aBP: blood pressure.
bSpO2: oxygen saturation.
Mortality model results for six algorithms using different performance metrics.
| Method and algorithm | Training set accuracy, mean (SD) | Test set accuracy (95% CI) | Test set sensitivity (95% CI) | Test set specificity (95% CI) | Test set NPVa (95% CI) | Test set PPVb (95% CI) |
| Baseline | | | | | | |
| LRc | 0.8826 (0.0058) | 0.8806 (0.874-0.890) | 0.0331 (0.033-0.034) | 0.9979 (0.991-1.009) | 0.8817 (0.875-0.891) | 0.6923 (0.688-0.700) |
| LDAd | 0.8817 (0.0058) | 0.8788 (0.873-0.888) | 0.0523 (0.052-0.053) | 0.9932 (0.986-1.004) | 0.8833 (0.877-0.893) | 0.5182 (0.515-0.524) |
| RFe | 0.8846 (0.0061) | 0.8854 (0.879-0.895) | 0.1127 (0.112-0.114) | 0.9923 (0.985-1.003) | 0.8898 (0.884-0.899) | 0.6710 (0.666-0.679) |
| kNNf | 0.8765 (0.0054) | 0.8760 (0.870-0.886) | 0.0854 (0.085-0.087) | 0.9855 (0.978-0.996) | 0.8861 (0.880-0.896) | 0.4496 (0.447-0.455) |
| SVMg | 0.8837 (0.0058) | 0.8808 (0.875-0.890) | 0.0272 (0.027-0.028) | 0.9989 (0.992-1.010) | 0.8811 (0.875-0.891) | 0.7872 (0.782-0.796) |
| XGBh | 0.8842 (0.0061) | 0.8815 (0.875-0.891) | 0.1429 (0.142-0.145) | 0.9837 (0.977-0.994) | 0.8923 (0.886-0.902) | 0.5495 (0.546-0.556) |
| Quantiles approach | | | | | | |
| LR | 0.8838 (0.0063) | 0.8815 (0.875-0.891) | 0.0545 (0.054-0.055) | 0.9960 (0.989-1.007) | 0.8838 (0.878-0.893) | 0.6548 (0.650-0.662) |
| LDA | 0.8821 (0.0067) | 0.8814 (0.875-0.891) | 0.0935 (0.093-0.095) | 0.9905 (0.983-1.001) | 0.8875 (0.881-0.897) | 0.5772 (0.573-0.584) |
| RF | 0.8859 (0.0064) | 0.8861 (0.880-0.896) | 0.0891 (0.089-0.090) | 0.9964 (0.989-1.007) | 0.8876 (0.881-0.897) | 0.7756 (0.770-0.784) |
| kNN | 0.8802 (0.0060) | 0.8764 (0.870-0.886) | 0.0589 (0.059-0.060) | 0.9895 (0.982-1.000) | 0.8836 (0.877-0.893) | 0.4395 (0.437-0.445) |
| SVM | 0.8851 (0.0058) | 0.8820 (0.876-0.892) | 0.0449 (0.045-0.046) | 0.9816 (0.991-1.009) | 0.8829 (0.877-0.893) | 0.7439 (0.739-0.752) |
| XGB | 0.8844 (0.0061) | 0.8822 (0.875-0.891) | 0.1643 (0.164-0.167) | 0.9816 (0.975-0.992) | 0.8945 (0.888-0.904) | 0.5533 (0.550-0.560) |
aNPV: negative predictive value.
bPPV: positive predictive value.
cLR: logistic regression.
dLDA: linear discriminant analysis.
eRF: random forest.
fkNN: k-nearest neighbor.
gSVM: support vector machine.
hXGB: extreme gradient boosting.
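The metrics in this table all derive from the four cells of a binary confusion matrix. The minimal sketch below (function name and data are hypothetical) computes them and uses an imbalanced toy cohort to show how accuracy near 0.88 can coexist with very low sensitivity when most patients belong to the negative class.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, PPV, and NPV from 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # share of true positives detected
        "specificity": tn / (tn + fp),  # share of true negatives detected
        "ppv": tp / (tp + fp),          # precision of positive predictions
        "npv": tn / (tn + fn),          # precision of negative predictions
    }


# Hypothetical cohort: 12 positives out of 100, classifier rarely says 1.
y_true = [1] * 12 + [0] * 88
y_pred = [1] + [0] * 11 + [1] + [0] * 87
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this toy case only 1 of 12 positives is detected, yet accuracy stays at 0.88 because the 87 correctly classified negatives dominate the count. The sketch does not guard against empty denominators, which would need handling for a classifier that never predicts one of the classes.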
Figure 6. Comparison of the mortality model results using the quantiles approach on the training set (left) and the test set (right). LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
Mortality model performance based on area under the receiver operating characteristic curve (AUROC).
| Method and algorithm | Training set AUROC, mean (SD) | Test set AUROC |
| Baseline | | |
| LRa | 0.702047 (0.015652) | 0.69313 |
| LDAb | 0.701731 (0.016077) | 0.69247 |
| RFc | 0.764875 (0.009214) | 0.76725 |
| kNNd | 0.629262 (0.008944) | 0.63173 |
| SVMe | 0.653269 (0.011730) | 0.66800 |
| XGBf | 0.771187 (0.012094) | 0.76971 |
| Quantiles approach | | |
| LR | 0.727331 (0.014217) | 0.72810 |
| LDA | 0.725909 (0.014758) | 0.72622 |
| RF | 0.783696 (0.010503) | 0.78292 |
| kNN | 0.631649 (0.010416) | 0.64087 |
| SVM | 0.719253 (0.008940) | 0.72333 |
| XGB | 0.788908 (0.010665) | 0.79036 |
aLR: logistic regression.
bLDA: linear discriminant analysis.
cRF: random forest.
dkNN: k-nearest neighbor.
eSVM: support vector machine.
fXGB: extreme gradient boosting.
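AUROC can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the function name and the scores are hypothetical, and the pairwise loop is chosen for clarity rather than speed.

```python
def auroc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts, over all (positive, negative) pairs, how often the positive
    case is scored higher; ties contribute 0.5.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for four patients.
labels = [0, 0, 1, 1]
risks = [0.1, 0.4, 0.35, 0.8]
print(auroc(labels, risks))
```

Unlike accuracy, this measure does not depend on a single decision threshold, which is why it is less affected by class imbalance in the outcome.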
Figure 7. Comparison of receiver operating characteristic curves in the mortality model using the baseline (left) and the quantiles approach (right). LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
Length of stay model results for six algorithms using different performance metrics.
| Method and algorithm | Training set accuracy, mean (SD) | Test set accuracy (95% CI) | Test set sensitivity (95% CI) | Test set specificity (95% CI) | Test set NPVa (95% CI) | Test set PPVb (95% CI) |
| Baseline | | | | | | |
| LRc | 0.5787 (0.01) | 0.5715 (0.57-0.58) | 0.56 (0.554-0.564) | 0.59 (0.58-0.59) | 0.57 (0.563-0.573) | 0.58 (0.57-0.58) |
| LDAd | 0.5787 (0.01) | 0.5710 (0.57-0.58) | 0.56 (0.55-0.56) | 0.59 (0.58-0.59) | 0.57 (0.56-0.57) | 0.58 (0.57-0.58) |
| RFe | 0.6205 (0.01) | 0.6193 (0.62-0.63) | 0.61 (0.60-0.61) | 0.63 (0.63-0.64) | 0.61 (0.61-0.62) | 0.63 (0.62-0.63) |
| kNNf | 0.5639 (0.01) | 0.5713 (0.57-0.58) | 0.52 (0.520-0.529) | 0.62 (0.616-0.627) | 0.56 (0.559-0.569) | 0.58 (0.58-0.59) |
| SVMg | 0.6228 (0.01) | 0.6141 (0.61-0.62) | 0.56 (0.56-0.57) | 0.67 (0.66-0.68) | 0.60 (0.60-0.61) | 0.63 (0.63-0.64) |
| XGBh | 0.6303 (0.01) | 0.6130 (0.61-0.62) | 0.58 (0.58-0.59) | 0.64 (0.64-0.65) | 0.60 (0.60-0.61) | 0.62 (0.62-0.63) |
| Quantiles approach | | | | | | |
| LR | 0.6126 (0.01) | 0.6131 (0.61-0.62) | 0.59 (0.59-0.60) | 0.63 (0.629-0.640) | 0.61 (0.60-0.61) | 0.62 (0.62-0.63) |
| LDA | 0.6131 (0.01) | 0.6130 (0.61-0.62) | 0.59 (0.59-0.60) | 0.64 (0.63-0.64) | 0.61 (0.60-0.61) | 0.62 (0.62-0.63) |
| RF | 0.6511 (0.01) | 0.6461 (0.64-0.65) | 0.64 (0.63-0.66) | 0.66 (0.65-0.66) | 0.64 (0.64-0.65) | 0.65 (0.65-0.66) |
| kNN | 0.5748 (0.01) | 0.5768 (0.57-0.58) | 0.4865 (0.483-0.49) | 0.6681 (0.66-0.68) | 0.56 (0.56-0.57) | 0.60 (0.59-0.60) |
| SVM | 0.6466 (0.01) | 0.6386 (0.63-0.65) | 0.5939 (0.59-0.60) | 0.68 (0.68-0.69) | 0.63 (0.62-0.63) | 0.66 (0.65-0.66) |
| XGB | 0.6496 (0.01) | 0.6284 (0.62-0.64) | 0.61 (0.60-0.62) | 0.65 (0.64-0.66) | 0.62 (0.62-0.63) | 0.64 (0.63-0.64) |
aNPV: negative predictive value.
bPPV: positive predictive value.
cLR: logistic regression.
dLDA: linear discriminant analysis.
eRF: random forest.
fkNN: k-nearest neighbor.
gSVM: support vector machine.
hXGB: extreme gradient boosting.
Figure 8. Comparison of the length of stay model results using the quantiles approach on the training set (left) and the test set (right). LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
Performance of the length of stay model results based on the area under the receiver operating characteristic curve (AUROC).
| Method and algorithm | Training set AUROC, mean (SD) | Test set AUROC |
| Baseline | | |
| LRa | 0.612883 (0.006047) | 0.60833 |
| LDAb | 0.612776 (0.006058) | 0.60837 |
| RFc | 0.664959 (0.006147) | 0.66325 |
| kNNd | 0.583710 (0.006401) | 0.59110 |
| SVMe | 0.665992 (0.006041) | 0.66118 |
| XGBf | 0.677454 (0.007311) | 0.66586 |
| Quantiles approach | | |
| LR | 0.654390 (0.012180) | 0.65407 |
| LDA | 0.654178 (0.012102) | 0.65384 |
| RF | 0.705115 (0.010004) | 0.69782 |
| kNN | 0.598228 (0.007539) | 0.60507 |
| SVM | 0.694473 (0.009834) | 0.69272 |
| XGB | 0.704889 (0.011338) | 0.69693 |
aLR: logistic regression.
bLDA: linear discriminant analysis.
cRF: random forest.
dkNN: k-nearest neighbor.
eSVM: support vector machine.
fXGB: extreme gradient boosting.
Figure 9. Comparison of receiver operating characteristic curves in the length of stay model using the baseline (left) and quantiles (right) approaches. LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
Regression error values of the length of stay model using the baseline and quantile approaches.
| Method | MAEa | RMSEb |
| Baseline | | |
| MLRc | 3.509 | 6.029 |
| SVRd | 2.857 | 6.214 |
| Quantiles approach | | |
| MLR | 3.446 | 5.961 |
| SVR | 2.810 | 6.137 |
aMAE: mean absolute error.
bRMSE: root mean square error.
cMLR: multivariate linear regression.
dSVR: support vector regression.
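The two error measures weight mistakes differently: MAE averages absolute errors, while RMSE squares them first and therefore penalizes a few large misses more heavily. The sketch below uses hypothetical predictions (not study data) to show how one model can have the lower MAE and the higher RMSE at the same time, the same pattern as SVR versus MLR in the table above.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


# Hypothetical lengths of stay in days and two hypothetical predictors.
actual = [2.0, 4.0, 10.0]
model_a = [3.0, 3.0, 4.0]  # close on short stays, badly off on the long one
model_b = [5.0, 7.0, 7.0]  # uniformly off by 3 days

print(mae(actual, model_a), rmse(actual, model_a))
print(mae(actual, model_b), rmse(actual, model_b))
```

Here `model_a` wins on MAE and loses on RMSE because of its single 6-day miss. Whether that explains the SVR and MLR figures in the study cannot be confirmed from the table alone.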
Figure 10. Probability calibration curves of the mortality model for the six classification algorithms. LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
Figure 11. Probability calibration curves of the length of stay model for the six classification algorithms. LR: logistic regression; LDA: linear discriminant analysis; RF: random forest; KNN: k-nearest neighbor; SVM: support vector machine; XGB: extreme gradient boosting.
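A probability calibration curve compares predicted probabilities with observed outcome frequencies. The following minimal sketch shows one common way to build such a curve, using equal-width probability bins; the binning scheme, the function name, and the data are assumptions for illustration and may differ from what the authors used.

```python
def calibration_curve(y_true, y_prob, n_bins=5):
    """Return (mean predicted probability, observed positive rate) per bin.

    Probabilities are grouped into n_bins equal-width bins on [0, 1];
    empty bins are skipped.
    """
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, y_prob):
        index = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in last bin
        bins[index].append((label, prob))
    curve = []
    for members in bins:
        if members:
            mean_pred = sum(prob for _, prob in members) / len(members)
            frac_pos = sum(label for label, _ in members) / len(members)
            curve.append((mean_pred, frac_pos))
    return curve


# Hypothetical outcomes and predicted probabilities for six patients.
outcomes = [0, 0, 1, 0, 1, 1]
probs = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
curve = calibration_curve(outcomes, probs, n_bins=2)
print(curve)
```

A well-calibrated model yields points close to the diagonal, where the mean predicted probability in each bin matches the observed positive rate.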
Mortality model results for six algorithms using different performance metrics.
| Method and algorithm | Training set accuracy, mean (SD) | Test set accuracy (95% CI) | Test set sensitivity (95% CI) | Test set specificity (95% CI) | Test set NPVa (95% CI) | Test set PPVb (95% CI) |
| | | | | | | |
| LRc | 0.88263 (0.0058) | 0.87636 (0.870-0.886) | 0.0620 (0.062-0.063) | 0.9963 (0.989-1.007) | 0.8781 (0.872-0.888) | 0.7160 (0.711-0.724) |
| LDAd | 0.88171 (0.0058) | 0.87663 (0.870-0.886) | 0.0974 (0.097-0.099) | 0.9914 (0.984-1.002) | 0.8817 (0.875-0.891) | 0.6275 (0.623-0.635) |
| RFe | 0.88458 (0.0061) | 0.88145 (0.875-0.891) | 0.0952 (0.095-0.097) | 0.9973 (0.990-1.008) | 0.8820 (0.876-0.892) | 0.8396 (0.834-0.849) |
| kNNf | 0.87645 (0.0054) | 0.87196 (0.866-0.881) | 0.0620 (0.062-0.063) | 0.9913 (0.984-1.002) | 0.8776 (0.871-0.887) | 0.5132 (0.510-0.519) |
| SVMg | 0.88365 (0.0058) | 0.87581 (0.870-0.885) | 0.0428 (0.042-0.044) | 0.9985 (0.991-1.009) | 0.8762 (0.875-0.891) | 0.8163 (0.811-0.825) |
| XGBh | 0.88422 (0.0061) | 0.87966 (0.873-0.889) | 0.1670 (0.166-0.169) | 0.9846 (0.978-0.995) | 0.8891 (0.883-0.899) | 0.6166 (0.612-0.624) |
| | | | | | | |
| LR | 0.88380 (0.0063) | 0.88150 (0.875-0.891) | 0.0545 (0.054-0.055) | 0.9960 (0.989-1.007) | 0.8838 (0.878-0.893) | 0.6548 (0.650-0.662) |
| LDA | 0.88210 (0.0067) | 0.88141 (0.875-0.891) | 0.0935 (0.093-0.095) | 0.9905 (0.983-1.001) | 0.8875 (0.881-0.897) | 0.5772 (0.573-0.584) |
| RF | 0.88586 (0.0064) | 0.88608 (0.880-0.896) | 0.0891 (0.089-0.090) | 0.9964 (0.989-1.007) | 0.8876 (0.881-0.897) | 0.7756 (0.770-0.784) |
| kNN | 0.88018 (0.0060) | 0.87640 (0.870-0.886) | 0.0589 (0.059-0.060) | 0.9895 (0.982-1.000) | 0.8836 (0.877-0.893) | 0.4395 (0.437-0.445) |
| SVM | 0.88511 (0.0058) | 0.88195 (0.876-0.892) | 0.0449 (0.045-0.046) | 0.9816 (0.991-1.009) | 0.8829 (0.877-0.893) | 0.7439 (0.739-0.752) |
| XGB | 0.88443 (0.0061) | 0.88222 (0.875-0.891) | 0.1643 (0.164-0.167) | 0.9816 (0.975-0.992) | 0.8945 (0.888-0.904) | 0.5533 (0.550-0.560) |
aNPV: negative predictive value.
bPPV: positive predictive value.
cLR: logistic regression.
dLDA: linear discriminant analysis.
eRF: random forest.
fkNN: k-nearest neighbor.
gSVM: support vector machine.
hXGB: extreme gradient boosting.
Length of stay model results for six algorithms using different performance metrics.
| Method and algorithm | Training set accuracy, mean (SD) | Test set accuracy (95% CI) | Test set sensitivity (95% CI) | Test set specificity (95% CI) | Test set NPVa (95% CI) | Test set PPVb (95% CI) |
| | | | | | | |
| LRc | 0.61262 (0.0117) | 0.61312 (0.609-0.620) | 0.5983 (0.594-0.605) | 0.6267 (0.622-0.634) | 0.6292 (0.625-0.636) | 0.5957 (0.592-0.602) |
| LDAd | 0.61307 (0.0112) | 0.61216 (0.608-0.619) | 0.5951 (0.591-0.602) | 0.6277 (0.624-0.635) | 0.6277 (0.624-0.635) | 0.5951 (0.591-0.602) |
| RFe | 0.65108 (0.0081) | 0.64778 (0.643-0.655) | 0.6400 (0.636-0.647) | 0.6550 (0.651-0.662) | 0.6643 (0.660-0.672) | 0.6304 (0.626-0.637) |
| kNNf | 0.57483 (0.0104) | 0.58740 (0.583-0.594) | 0.4941 (0.491-0.500) | 0.6731 (0.669-0.681) | 0.5913 (0.587-0.598) | 0.5816 (0.578-0.588) |
| SVMg | 0.64659 (0.0088) | 0.64379 (0.639-0.651) | 0.5946 (0.591-0.601) | 0.6890 (0.684-0.697) | 0.6489 (0.645-0.656) | 0.6374 (0.633-0.645) |
| XGBh | 0.64961 (0.0076) | 0.63540 (0.631-0.642) | 0.6132 (0.609-0.620) | 0.6557 (0.651-0.663) | 0.6483 (0.644-0.656) | 0.6209 (0.617-0.628) |
| | | | | | | |
| LR | 0.61262 (0.0117) | 0.61307 (0.609-0.620) | 0.5930 (0.589-0.600) | 0.6332 (0.629-0.640) | 0.6058 (0.602-0.613) | 0.6208 (0.617-0.628) |
| LDA | 0.61307 (0.0112) | 0.61298 (0.609-0.620) | 0.5909 (0.587-0.598) | 0.6352 (0.631-0.642) | 0.6053 (0.601-0.612) | 0.6212 (0.617-0.628) |
| RF | 0.65108 (0.0081) | 0.64614 (0.642-0.653) | 0.6374 (0.633-0.645) | 0.6549 (0.650-0.662) | 0.6408 (0.636-0.648) | 0.6516 (0.647-0.659) |
| kNN | 0.57483 (0.0104) | 0.57677 (0.573-0.583) | 0.4865 (0.483-0.492) | 0.6681 (0.664-0.676) | 0.5624 (0.559-0.569) | 0.5974 (0.593-0.604) |
| SVM | 0.64659 (0.0088) | 0.63861 (0.634-0.646) | 0.5939 (0.590-0.601) | 0.6838 (0.679-0.691) | 0.6245 (0.620-0.632) | 0.6553 (0.651-0.663) |
| XGB | 0.64961 (0.0076) | 0.62839 (0.624-0.635) | 0.6085 (0.604-0.615) | 0.6484 (0.644-0.656) | 0.6206 (0.617-0.628) | 0.6367 (0.632-0.644) |
aNPV: negative predictive value.
bPPV: positive predictive value.
cLR: logistic regression.
dLDA: linear discriminant analysis.
eRF: random forest.
fkNN: k-nearest neighbor.
gSVM: support vector machine.
hXGB: extreme gradient boosting.
Model results including and excluding the age feature.
| Model | Accuracy (%) | AUROCa | PPVb (95% CI) |
| Mortality model | | | |
| Without age | 88.536 | 0.76740 | 0.7468 (0.742-0.755) |
| With age | 88.608 | 0.78292 | 0.7756 (0.770-0.784) |
| Length of stay model | | | |
| Without age | 64.390 | 0.69433 | 0.6487 (0.644-0.656) |
| With age | 64.614 | 0.69782 | 0.6516 (0.647-0.659) |
aAUROC: area under the receiver operating characteristic curve.
bPPV: positive predictive value.