Shuangxia Ren1, Jill A Zupetic2,3, Mohammadreza Tabary2,3, Rebecca DeSensi2,3, Mehdi Nouraie2,3, Xinghua Lu1,4, Richard D Boyce1,4, Janet S Lee5,6.
Abstract
We created an online calculator using machine learning (ML) algorithms to impute the partial pressure of oxygen (PaO2)/fraction of delivered oxygen (FiO2) ratio using the non-invasive peripheral saturation of oxygen (SpO2) and compared the accuracy of the ML models we developed to published equations. We generated three ML algorithms (neural network, regression, and kernel-based methods) using seven clinical variable features (N = 9900 ICU events) and subsequently three features (N = 20,198 ICU events) as input into the models. Data from mechanically ventilated ICU patients were obtained from the publicly available Medical Information Mart for Intensive Care (MIMIC III) database and used for analysis. Compared to seven features, three features (SpO2, FiO2 and PEEP) were sufficient to impute PaO2 from the SpO2. Any of the ML models enabled imputation of PaO2 from the SpO2 with lower error and showed greater accuracy in predicting PaO2/FiO2 ≤ 150 compared to the previously published log-linear and non-linear equations. To address potential hidden hypoxemia that occurs more frequently in Black patients, we conducted a sensitivity analysis and showed that the ML models outperformed published equations in both Black and White patients. Imputation using data from an independent validation cohort of ICU patients (N = 133) showed greater accuracy with ML models.
Year: 2022 PMID: 35581469 PMCID: PMC9114384 DOI: 10.1038/s41598-022-12419-7
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.996
Figure 1. Overview of the experimental study design.
Correlation coefficients between PF ratios and variables.
| | SF ratio | PEEP | MAP | Temperature | Vasopressor administration | TV |
|---|---|---|---|---|---|---|
| PF ratio | 0.44 | −0.31 | 0.06 | −0.06 | −0.04 | 0.02 |
Correlation coefficients (r) were calculated between measured PF ratios and the six other measured variables (SF ratio = SpO2/FiO2, PEEP, MAP, temperature, vasopressor administration, and TV). The variable with the strongest correlation coefficient was chosen for the 3-feature model.
PF ratio PaO2/FiO2, SF ratio SpO2/FiO2, TV tidal volume, PEEP positive end-expiratory pressure, MAP mean arterial pressure.
Subject characteristics based on three features.
| Characteristic | Value |
|---|---|
| ICU events, n | 20,198 |
| Female sex, n (%) | 8084 (40.0) |
| Age in years, mean (± SD)a | 64.0 (± 16.2) |
| PaO2/FiO2, mean (± SD) | 310.4 (± 184.4) |
| ICU events by PaO2/FiO2 category, n | 20,198 |
| PaO2/FiO2 > 300, n | 8996 |
| PaO2/FiO2 = 201–300, n | 5226 |
| PaO2/FiO2 = 101–200, n | 4448 |
| PaO2/FiO2 < 100, n | 1528 |
| Unique patients by number of measurements, n | 17,818 |
| 1 measurement, n | 16,065 |
| 2 measurements, n | 1367 |
| 3 measurements, n | 262 |
| 4 measurements, n | 77 |
| 5 measurements, n | 29 |
| 6 measurements, n | 14 |
| 7 measurements, n | 4 |
The 3-feature models captured 20,198 ICU events from 17,818 unique patients. Variables included in the 3-feature machine learning models are SpO2, FiO2, and PEEP.
ICU intensive care unit.
a For subjects older than 89 years, the age was assigned as 90 years.
RMSE and BIC of the 3-feature machine learning models regression tasks compared to published methods.
| | Entire dataset 2 (20,198 events) | | Subset 2 (SpO2 < 97%) (3280 events) | |
|---|---|---|---|---|
| | RMSE | BIC | RMSE | BIC |
| Neural network | 84.7 | 17,952.7 | 67.5 | 2778.9 |
| Linear regression | 88.8 | 18,144.3 | 68.0 | 2783.5 |
| Support vector regression | 85.9 | 18,013.6 | 70.3 | 2805.0 |
| Log-linear | 117.7 | NA | 72.2 | NA |
| Non-linear | 91.8 | NA | 81.2 | NA |
The RMSE and BIC for the 3-feature machine learning models were calculated for the entire dataset (20,198 ICU events) and a subset of the dataset with SpO2 < 97% (3280 ICU events) and compared to the published log-linear and non-linear models.
BIC Bayesian information criterion, RMSE root-mean-square error.
RMSE: An estimate of the difference between the values predicted by a model and the values observed. A lower RMSE indicates a smaller difference between predicted and observed values.
BIC: A criterion for choosing between models that rewards goodness of fit and penalizes the number of parameters. The model with the lowest BIC is preferred.
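The two definitions above can be made concrete with a minimal Python sketch. The function names `rmse` and `bic_gaussian` are hypothetical, and the BIC form used here (n·ln(RSS/n) + k·ln(n), which assumes Gaussian errors and drops an additive constant) is an assumption; this record does not state which formulation or parameter counts the authors used.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

def bic_gaussian(observed, predicted, n_params):
    """BIC under an assumed Gaussian error model (additive constant dropped).

    n_params is the number of fitted parameters; it is an input here
    because the source does not report it.
    """
    n = len(observed)
    rss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(rss / n) + n_params * math.log(n)

# Four illustrative cases taken from the example table of measured and
# imputed PaO2 values in this record (not a replication of the study).
measured_pao2 = [113, 190, 217, 304]
neural_network = [115.3, 203.0, 226.8, 259.3]
non_linear = [82, 167, 167, 167]

print(round(rmse(measured_pao2, neural_network), 1))
print(round(rmse(measured_pao2, non_linear), 1))
```

With the fit held fixed, the penalty term makes BIC grow as `n_params` grows, which is why a lower BIC favors the more parsimonious model.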
Prediction performance of machine learning classification models based on three features.
| | Entire dataset 2 (20,198 events) | | | | | Subset 2 (SpO2 < 97%) (3280 events) | | | | |
|---|---|---|---|---|---|---|---|---|---|---|
| | Neural network | Logistic regression | SVM | Log-linear | Non-linear | Neural network | Logistic regression | SVM | Log-linear | Non-linear |
| Total, No. | 20,198 | 20,198 | 20,198 | 20,198 | 20,198 | 3280 | 3280 | 3280 | 3280 | 3280 |
| Sensitivity | 0.96 | 0.97 | 0.98 | 0.84 | 0.93 | 0.80 | 0.87 | 0.83 | 0.85 | 0.58 |
| Specificity | 0.39 | 0.26 | 0.33 | 0.56 | 0.49 | 0.76 | 0.59 | 0.69 | 0.59 | 0.89 |
| Positive LR | 1.59 | 1.32 | 1.46 | 1.90 | 1.83 | 3.37 | 2.13 | 2.75 | 2.09 | 5.16 |
| Negative LR | 0.09 | 0.10 | 0.07 | 0.29 | 0.15 | 0.27 | 0.23 | 0.25 | 0.25 | 0.47 |
| Diagnostic OR | 17.12 | 13.16 | 19.68 | 6.49 | 12.61 | 12.53 | 9.46 | 10.96 | 8.44 | 10.94 |
| AUROC | 0.83 | 0.81 | 0.74 | NA | NA | 0.85 | 0.83 | 0.84 | NA | NA |
| F1 | 0.92 | 0.92 | 0.92 | 0.87 | 0.91 | 0.81 | 0.80 | 0.81 | 0.79 | 0.70 |
| BIC | −4612.6 | −4440.7 | −4446.0 | NA | NA | −591.8 | −567.0 | −580.0 | NA | NA |
Prediction performance statistics were calculated for the machine learning models based on three features and compared to the Log-linear and Non-linear methods for the entire dataset (20,198 ICU events; entire dataset 2) and for a subset of the events where SpO2 < 97% (3280 events; subset 2). Variables included in the 3-feature machine learning models are SpO2, FiO2, and PEEP.
SVM support vector machine, AUROC area under the receiver operating characteristic curve, BIC Bayesian information criterion, LR likelihood ratio, OR odds ratio.
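The classification statistics reported above all derive from a 2 × 2 confusion matrix. The following is a minimal sketch of the standard formulas; the function name `diagnostic_metrics` and the example counts are hypothetical, and which PaO2/FiO2 class the study treated as "positive" is not stated in this record.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic statistics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives and
    true negatives. A zero in a denominator (for example a specificity of
    exactly 1) raises ZeroDivisionError in this simple sketch.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    positive_lr = sensitivity / (1 - specificity)
    negative_lr = (1 - sensitivity) / specificity
    diagnostic_or = positive_lr / negative_lr
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "positive_lr": positive_lr,
        "negative_lr": negative_lr,
        "diagnostic_or": diagnostic_or,
        "f1": f1,
    }

# Hypothetical counts, chosen only to illustrate the arithmetic.
example = diagnostic_metrics(tp=80, fp=30, fn=20, tn=70)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Because the table reports sensitivity and specificity rounded to two decimals, likelihood ratios recomputed from those rounded values may differ slightly from the tabulated ones.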
Figure 2. Precision-recall curves of machine learning models in Dataset 2 and Subset 2 using 3 features. The precision-recall curves, where improved performance is demonstrated if the curve is closer to the upper right-hand corner or has the highest area under the curve (AUC), are shown for the 3 machine learning models for (A) the entire Dataset 2 (N = 20,198 ICU events) and (B) Subset 2 where SpO2 < 97% (N = 3280 ICU events). Data were obtained from the MIMIC-III database v1.4 (https://mimic.physionet.org).
RMSE of the 3-feature machine learning models regression task compared to the published non-linear equation.
| N = 133 | Neural network | SVR | Regression | Non-linear |
|---|---|---|---|---|
| RMSE (adjusted R2) | 65.0 (0.35) | 64.9 (0.35) | 74.1 (0.16) | 67.1 (0.31) |
The PaO2 was imputed using an online calculator of the three machine learning models using SpO2, PEEP, and FiO2 from a validation cohort of 133 mechanically ventilated ICU patients. Subsequently, the RMSE and adjusted R2 for the 3-feature machine learning models were calculated and compared to the published non-linear equation. A lower RMSE and higher adjusted R2 indicate higher accuracy.
SVR support vector regression, RMSE root-mean-square error.
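For readers unfamiliar with adjusted R2, a minimal sketch of the usual formula follows. The function name `adjusted_r2` and the toy data are hypothetical, and the number of predictors the authors used in the adjustment for each model is not given in this record, so it is left as a parameter.

```python
def adjusted_r2(observed, predicted, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1).

    n is the number of observations and p (n_predictors) the number of
    predictors; requires n > p + 1 and non-constant observed values.
    """
    n = len(observed)
    mean_observed = sum(observed) / n
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    r2 = 1 - ss_residual / ss_total
    return 1 - (1 - r2) * (n - 1) / (n - n_predictors - 1)

# Toy values for illustration only (not study data).
toy_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(round(adjusted_r2(toy_observed, toy_predicted, n_predictors=1), 4))
```

The adjustment lowers R2 more as predictors are added without a matching gain in fit, so it allows a fairer comparison between models with different numbers of inputs.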
Comparison of four models applied to four example cases from different categories of PaO2 (< 150, 150–200, 200–300, > 300).
| PaO2 | SpO2 | FiO2 | PEEP | Neural network-imputed | Regression-imputed | SVR-imputed | Nonlinear-imputed |
|---|---|---|---|---|---|---|---|
| 113 | 96 | 40 | 5 | 115.3 | 136.7 | 101.6 | 82 |
| 190 | 100 | 60 | 5 | 203.0 | 186.2 | 188.4 | 167 |
| 217 | 100 | 90 | 5 | 226.8 | 220.1 | 194.2 | 167 |
| 304 | 100 | 100 | 5 | 259.3 | 231.4 | 260.5 | 167 |
PaO2 partial pressure of oxygen, FiO2 fraction of inspired oxygen, SpO2 peripheral saturation of oxygen, PEEP positive end-expiratory pressure, SVR support vector regression.
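This record does not reproduce the published non-linear equation. As an illustration only, the sketch below assumes it has the form of the Ellis inversion of the Severinghaus oxyhemoglobin dissociation equation, a widely cited closed-form way to impute PaO2 from SpO2. The function names are hypothetical. Under that assumption, an SpO2 of 96% gives roughly 82 mmHg, which appears consistent with the first row of the table above.

```python
import math

def severinghaus_spo2(pao2_mmhg):
    """Severinghaus approximation: fractional saturation from PaO2 in mmHg."""
    return 1.0 / (23400.0 / (pao2_mmhg ** 3 + 150.0 * pao2_mmhg) + 1.0)

def ellis_pao2(spo2_fraction):
    """Invert the Severinghaus equation for PaO2 (mmHg), after Ellis.

    Solves P^3 + 150 P = 2a with a = 11700 / (1/S - 1) by Cardano's
    formula: P = cbrt(a + b) + cbrt(a - b), where b = sqrt(50^3 + a^2).
    """
    if not 0.0 < spo2_fraction < 1.0:
        raise ValueError("SpO2 must be a fraction strictly between 0 and 1")
    a = 11700.0 / (1.0 / spo2_fraction - 1.0)
    b = math.sqrt(50.0 ** 3 + a * a)
    # a - b is negative; b - a equals 50^3 / (a + b), which avoids
    # subtracting two nearly equal large numbers.
    return (a + b) ** (1.0 / 3.0) - (50.0 ** 3 / (a + b)) ** (1.0 / 3.0)

pao2 = ellis_pao2(0.96)          # SpO2 of 96%
pf_ratio = pao2 / 0.40           # FiO2 of 40% expressed as a fraction
print(round(pao2, 1), round(pf_ratio, 1))
```

The inversion is undefined at an SpO2 of exactly 100%, where the dissociation curve is flat. The record does not say how the published equation was applied to the rows with SpO2 = 100, so no handling for that case is assumed here.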