Janmajay Singh, Masahiro Sato, Tomoko Ohkuma.
Abstract
BACKGROUND: Missing data in electronic health records is inevitable and considered to be nonrandom. Several studies have found that features indicating missing patterns (missingness) encode useful information about a patient's health and advocate for their inclusion in clinical prediction models. But their effectiveness has not been comprehensively evaluated.
Keywords: electronic health records; hospital mortality; informative missingness; machine learning; missing data; sepsis
Year: 2021 PMID: 34889756 PMCID: PMC8701717 DOI: 10.2196/25022
Source DB: PubMed Journal: JMIR Med Inform
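The "masking" features compared in the tables below can be illustrated with a minimal sketch. It assumes missing measurements are represented as `None`, that the mask is 1 for observed entries and 0 for missing ones, and that missing values are filled with a constant. The function names `add_missingness_mask` and `concat_features`, the fill value, and the mask convention are all hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List, Optional, Tuple


def add_missingness_mask(
    rows: List[List[Optional[float]]], fill_value: float = 0.0
) -> Tuple[List[List[float]], List[List[int]]]:
    """Return (imputed values, binary mask).

    Assumption: mask is 1 where a value was observed and 0 where it was missing.
    Missing values are replaced by fill_value (a placeholder imputation).
    """
    imputed: List[List[float]] = []
    mask: List[List[int]] = []
    for row in rows:
        imputed.append([fill_value if v is None else v for v in row])
        mask.append([0 if v is None else 1 for v in row])
    return imputed, mask


def concat_features(
    imputed: List[List[float]], mask: List[List[int]]
) -> List[List[float]]:
    """Append the mask columns to the imputed values ("masking" input)."""
    return [vals + [float(m) for m in flags] for vals, flags in zip(imputed, mask)]


# Two patients, two variables each; None marks a measurement that was not taken.
rows = [[36.6, None], [None, 80.0]]
imputed, mask = add_missingness_mask(rows)
features_with_mask = concat_features(imputed, mask)
```

Under this sketch, a "no masking" model would receive only `imputed`, while a "masking" model would receive `features_with_mask`.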
Results of model discrimination and calibration for all task settings on the test data. These correspond to internal validation for the PhysioNet 2012 Challenge and external validation for the PhysioNet 2019 Challenge.
| Task | Masking (AUROC^a), mean (SD) | Masking (Brier), mean (SD) | No masking (AUROC), mean (SD) | No masking (Brier), mean (SD) |
| P12^b mortality | 0.842 (0.82-0.86) | 0.093 (0.087-0.100) | 0.830 (0.81-0.85) | 0.095 (0.088-0.101) |
| P12 LOS^c | 0.814 (0.79-0.84) | 0.054 (0.049-0.060) | 0.737 (0.71-0.77) | 0.064 (0.058-0.070) |
| P19^d sepsis-overall | 0.907 (0.90-0.92) | 0.039 (0.036-0.041) | 0.889 (0.88-0.90) | 0.045 (0.043-0.048) |
| P19 sepsis-frequent | 0.757 (0.74-0.77) | 0.014 (0.013-0.014) | 0.766 (0.75-0.78) | 0.014 (0.013-0.015) |
^a AUROC: area under the curve of the receiver operating characteristic.
^b P12: PhysioNet 2012 Challenge.
^c LOS: length of stay.
^d P19: PhysioNet 2019 Challenge.
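For readers unfamiliar with the two metrics reported in these tables, the following is a minimal sketch of how AUROC (discrimination) and the Brier score (calibration) can be computed from binary labels and predicted probabilities. It uses the pairwise-ranking definition of AUROC and the mean squared error definition of the Brier score; the function names are hypothetical, and this is not the evaluation code used in the study.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties between a positive and a negative count as half a win.
    This O(n_pos * n_neg) loop is meant for illustration, not large datasets.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: four patients, two with the outcome.
labels = [0, 0, 1, 1]
probs = [0.1, 0.4, 0.35, 0.8]
toy_auroc = auroc(labels, probs)
toy_brier = brier_score(labels, probs)
```

Higher AUROC indicates better discrimination, while a lower Brier score indicates better-calibrated probabilities, which is how the "masking" and "no masking" columns are compared.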
Figure 1. Receiver operating characteristic (ROC) curve, precision-recall (PR) curve, and calibration plot for the PhysioNet 2012 Challenge mortality classification task.
Figure 3. Receiver operating characteristic (ROC) curve, precision-recall (PR) curve, and calibration plot for the PhysioNet 2019 Challenge sepsis-overall classification task.
Subgroup analysis results for the PhysioNet 2012 Challenge mortality classification task.
| Subgroup | #Samples | Masking (AUROC^a), mean (SD) | Masking (Brier), mean (SD) | No masking (AUROC), mean (SD) | No masking (Brier), mean (SD) |
| Age (years) | | | | | |
| ≤35 | 268 | 0.847 (0.74-0.93) | 0.057 (0.037-0.079) | 0.852 (0.75-0.94) | 0.059 (0.040-0.079) |
| 35-45 | 309 | 0.906 (0.84-0.96) | 0.048 (0.031-0.066) | 0.880 (0.80-0.95) | 0.054 (0.037-0.072) |
| 45-55 | 569 | 0.878 (0.82-0.93) | 0.064 (0.050-0.078) | 0.885 (0.83-0.93) | 0.064 (0.052-0.077) |
| 55-65 | 708 | 0.859 (0.82-0.90) | 0.074 (0.060-0.090) | 0.848 (0.80-0.89) | 0.076 (0.063-0.090) |
| 65-75 | 845 | 0.830 (0.79-0.87) | 0.094 (0.079-0.109) | 0.822 (0.78-0.86) | 0.094 (0.080-0.108) |
| >75 | 1294 | 0.801 (0.77-0.83) | 0.135 (0.121-0.149) | 0.786 (0.75-0.82) | 0.135 (0.123-0.149) |
| ICU^b type | | | | | |
| Coronary care unit | 587 | 0.806 (0.75-0.86) | 0.087 (0.069-0.106) | 0.807 (0.74-0.86) | 0.086 (0.070-0.104) |
| Cardiac surgery unit | 780 | 0.862 (0.79-0.92) | 0.035 (0.025-0.046) | 0.845 (0.76-0.92) | 0.037 (0.028-0.048) |
| Surgical ICU | 1192 | 0.852 (0.82-0.88) | 0.094 (0.082-0.107) | 0.843 (0.81-0.87) | 0.095 (0.083-0.106) |
| Medical ICU | 1434 | 0.801 (0.77-0.83) | 0.128 (0.115-0.140) | 0.787 (0.76-0.82) | 0.129 (0.117-0.141) |
^a AUROC: area under the curve of the receiver operating characteristic.
^b ICU: intensive care unit.
Subgroup analysis results for the PhysioNet 2019 Challenge sepsis-overall classification task. A total of 6095 patients did not have intensive care unit type specified, and thus, they were not considered for the corresponding analysis.
| Subgroup | #Samples | Masking (AUROC^a), mean (SD) | Masking (Brier), mean (SD) | No masking (AUROC), mean (SD) | No masking (Brier), mean (SD) |
| Age (years) | | | | | |
| ≤35 | 1742 | 0.904 (0.86-0.94) | 0.037 (0.029-0.045) | 0.893 (0.85-0.93) | 0.044 (0.035-0.052) |
| 35-45 | 1949 | 0.911 (0.88-0.94) | 0.041 (0.033-0.049) | 0.910 (0.88-0.94) | 0.046 (0.038-0.055) |
| 45-55 | 3334 | 0.920 (0.90-0.94) | 0.032 (0.026-0.037) | 0.900 (0.87-0.93) | 0.037 (0.032-0.043) |
| 55-65 | 4581 | 0.897 (0.87-0.92) | 0.042 (0.037-0.048) | 0.886 (0.86-0.91) | 0.048 (0.042-0.053) |
| 65-75 | 4768 | 0.917 (0.90-0.94) | 0.039 (0.034-0.043) | 0.877 (0.85-0.90) | 0.049 (0.043-0.054) |
| >75 | 3626 | 0.896 (0.87-0.92) | 0.040 (0.034-0.046) | 0.888 (0.86-0.91) | 0.045 (0.039-0.051) |
| ICU^b type | | | | | |
| Medical ICU | 6923 | 0.895 (0.88-0.91) | 0.044 (0.040-0.048) | 0.882 (0.86-0.90) | 0.049 (0.045-0.053) |
| Surgical ICU | 6982 | 0.903 (0.89-0.92) | 0.041 (0.037-0.045) | 0.882 (0.86-0.90) | 0.050 (0.046-0.055) |
^a AUROC: area under the curve of the receiver operating characteristic.
^b ICU: intensive care unit.
Figure 4. Temporal evaluation for the PhysioNet 2019 Challenge sepsis-frequent task; records corresponding to sepsis are labeled S=1, while the remainder are labeled S=0: (A) the probability of a false-positive prediction (S=0) drops because, after 90 hours, only patients with sepsis remain in the data; (B) the model learns this cohort characteristic, resulting in perfect predictive value after 90 hours. ICU: intensive care unit.
Subgroup analysis results for the PhysioNet 2012 Challenge length of stay classification task.
| Subgroup | #Samples | Masking (AUROC^a), mean (SD) | Masking (Brier), mean (SD) | No masking (AUROC), mean (SD) | No masking (Brier), mean (SD) |
| Age (years) | | | | | |
| ≤35 | 268 | 0.862 (0.80-0.92) | 0.081 (0.055-0.109) | 0.707 (0.61-0.80) | 0.108 (0.079-0.138) |
| 35-45 | 309 | 0.820 (0.71-0.91) | 0.060 (0.040-0.081) | 0.721 (0.62-0.82) | 0.079 (0.057-0.104) |
| 45-55 | 569 | 0.800 (0.72-0.88) | 0.057 (0.042-0.073) | 0.712 (0.63-0.79) | 0.064 (0.048-0.081) |
| 55-65 | 708 | 0.797 (0.71-0.87) | 0.045 (0.033-0.059) | 0.790 (0.72-0.86) | 0.054 (0.042-0.068) |
| 65-75 | 845 | 0.803 (0.72-0.87) | 0.047 (0.035-0.060) | 0.712 (0.64-0.78) | 0.053 (0.042-0.065) |
| >75 | 1294 | 0.814 (0.77-0.86) | 0.056 (0.046-0.067) | 0.747 (0.69-0.80) | 0.062 (0.052-0.073) |
| ICU^b type | | | | | |
| Coronary care unit | 587 | 0.791 (0.73-0.85) | 0.086 (0.068-0.105) | 0.763 (0.71-0.82) | 0.095 (0.078-0.112) |
| Cardiac surgery unit | 780 | 0.890 (0.77-0.98) | 0.013 (0.006-0.020) | 0.759 (0.60-0.90) | 0.018 (0.011-0.025) |
| Surgical ICU | 1192 | 0.812 (0.75-0.87) | 0.046 (0.036-0.056) | 0.710 (0.64-0.77) | 0.056 (0.036-0.056) |
| Medical ICU | 1434 | 0.776 (0.73-0.82) | 0.071 (0.060-0.083) | 0.682 (0.63-0.73) | 0.082 (0.071-0.094) |
^a AUROC: area under the curve of the receiver operating characteristic.
^b ICU: intensive care unit.