Kai-Chih Pai, Wen-Cheng Chao, Yu-Len Huang, Ruey-Kai Sheu, Lun-Chi Chen, Min-Shian Wang, Shau-Hung Lin, Yu-Yi Yu, Chieh-Liang Wu, Ming-Cheng Chan.
Abstract
Objective: The aim of this study was to develop an artificial intelligence-based model that detects acute respiratory distress syndrome (ARDS) from clinical data and chest X-ray (CXR) data.

Method: A convolutional neural network (CNN) was first trained on an external image dataset by transfer learning to extract image features, and its last layer was then fine-tuned to estimate the probability of ARDS. Three machine learning algorithms, eXtreme Gradient Boosting (XGB), random forest (RF), and logistic regression (LR), were trained on the clinical data to estimate the probability of ARDS. Finally, ensemble-weighted methods were proposed that combine the image model and the clinical data model into a single probability of ARDS. A feature importance analysis was performed to identify the clinical features that contribute most to detecting ARDS, and gradient-weighted class activation mapping (Grad-CAM) was used to visualize the image regions that drive the CNN's decisions.
Keywords: Acute respiratory distress syndrome; artificial intelligence; chest X-ray; clinical data; ensemble-weighted model; machine learning
Year: 2022 PMID: 35990108 PMCID: PMC9386858 DOI: 10.1177/20552076221120317
Source DB: PubMed Journal: Digit Health ISSN: 2055-2076
Figure 1. Patient selection flowchart.
Figure 2. Model framework overview.
Clinical features.
| Feature type | Unit (measurement) | Feature type | Unit (measurement) |
|---|---|---|---|
| Vital signs | | Laboratory data | |
| Temperature | °C | PO2-A | mmHg |
| SBP | mmHg | Procalcitonin | ng/mL |
| DBP | mmHg | PCO2-A | mmHg |
| Pulse rate | bpm | Fluids | |
| Respiratory rate | breaths/min | Urine output | mL |
| SpO2 | % | | |
| Ventilatory parameters | | | |
| FiO2 | % | | |
| Positive end-expiratory pressure | cmH2O | | |
| Total respiratory rate | breaths/min | | |
| Tidal volume | cc/kg | | |
| Minute volume | L/min | | |
| Mean airway pressure | cmH2O | | |
Figure 3. Examples of chest X-rays (CXRs) from the original dataset and after histogram equalization preprocessing.
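Histogram equalization, the preprocessing step named in Figure 3, can be illustrated with a short sketch. The variant the authors used is not stated here, so this assumes plain global equalization of an 8-bit grayscale image stored as nested lists; the function name `equalize_histogram` is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Globally equalize a grayscale image given as rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]

    # Count how often each gray level occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of gray levels.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is no contrast to stretch.
        return [row[:] for row in image]

    # Look-up table that makes the output CDF approximately linear.
    lut = [
        round(max(c - cdf_min, 0) / (total - cdf_min) * (levels - 1))
        for c in cdf
    ]
    return [[lut[p] for p in row] for row in image]


# Low-contrast patch: gray levels span only 50 to 54.
patch = [[50, 50, 52], [52, 54, 54]]
equalized = equalize_histogram(patch)
```

After equalization the patch spans the full 0 to 255 range while the ordering of intensities is preserved, which is the contrast-stretching effect the figure illustrates.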
Figure 4. Original image and five enhanced images using data augmentation.
Figure 5. Example of input image (a) and extracted lung region image (b) cropped by the segmentation model (c), resulting in the reshaped image (d).
Figure 6. Workflow of the convolutional neural network (CNN) model with transfer learning.
Demographic characteristics.
| Characteristic | All (N = 1577) | Non-ARDS (n = 1194) | ARDS (n = 383) | p-value |
|---|---|---|---|---|
| | | | | |
| Age (years) | 66.1 ± 15.8 | 66.1 ± 15.7 | 66.3 ± 16.3 | 0.823 |
| Male, n (%) | 1017 (64.5%) | 762 (63.8%) | 255 (66.6%) | 0.357 |
| BMI (kg/m2) | 24.3 ± 4.9 | 24.2 ± 4.7 | 24.4 ± 5.4 | 0.623 |
| | | | | |
| Emergency room | 1577 (100.0%) | 1194 (100.0%) | 383 (100.0%) | 1.000 |
| | | | | |
| Medical | 927 (58.8%) | 631 (52.8%) | 296 (77.3%) | <0.001 |
| Surgical | 650 (41.2%) | 563 (47.2%) | 87 (22.7%) | |
| | | | | |
| APACHE II score | 25.2 ± 6.1 | 24.2 ± 5.7 | 28.3 ± 6.3 | <0.001 |
| SOFA score, Day 1 | 8.6 ± 3.6 | 7.9 ± 3.5 | 10.3 ± 3.4 | <0.001 |
| SOFA score, Day 3 | 7.4 ± 4.0 | 6.6 ± 3.7 | 9.3 ± 4.0 | <0.001 |
| SOFA score, Day 7 | 6.8 ± 4.0 | 6.3 ± 3.9 | 7.6 ± 4.2 | <0.001 |
| | | | | |
| Cardiovascular disease | 444 (28.2%) | 344 (28.8%) | 100 (26.1%) | 0.338 |
| Cerebrovascular disease | 454 (28.8%) | 376 (31.5%) | 78 (20.4%) | <0.001 |
| Dementia | 101 (6.4%) | 77 (6.4%) | 24 (6.3%) | 0.994 |
| Chronic pulmonary disease | 272 (17.2%) | 212 (17.8%) | 60 (15.7%) | 0.387 |
| Rheumatic disease | 84 (5.3%) | 49 (4.1%) | 35 (9.1%) | <0.001 |
| Hepatic disease | 269 (17.1%) | 196 (16.4%) | 73 (19.1%) | 0.263 |
| Diabetes mellitus | 548 (34.7%) | 395 (33.1%) | 153 (39.9%) | 0.017 |
| Renal disease | 484 (30.7%) | 346 (29.0%) | 138 (36.0%) | 0.011 |
| Malignancy | 480 (30.4%) | 326 (27.3%) | 154 (40.2%) | <0.001 |
| Charlson Comorbidity Index (CCI) | 2.2 ± 1.4 | 2.1 ± 1.4 | 2.3 ± 1.5 | 0.034 |
| | | | | |
| ICU length of stay (days) | 14.1 ± 14.0 | 13.4 ± 14.2 | 16.3 ± 12.9 | <0.001 |
| Hospital length of stay (days) | 31.6 ± 27.1 | 31.8 ± 28.3 | 31.0 ± 23.1 | 0.571 |
| Ventilator days | 11.5 ± 12.7 | 10.6 ± 12.4 | 14.4 ± 13.1 | <0.001 |
| Hospital mortality, n (%) | 433 (27.5%) | 282 (23.6%) | 151 (39.4%) | <0.001 |
ARDS: acute respiratory distress syndrome; BMI: body mass index; APACHE II: Acute Physiology and Chronic Health Evaluation II; SOFA: Sequential Organ Failure Assessment; ICU: intensive care unit.
Figure 7. Training accuracy, validation accuracy, training loss, and validation loss of the convolutional neural network (CNN) model.
ARDS classification results based on clinical data.
| Data type | Classifier | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|
| Clinical data | XGB | 0.848 ± 0.03 | 0.809 ± 0.03 | 0.861 ± 0.03 | 0.910 ± 0.02 |
| Clinical data | RF | 0.840 ± 0.03 | 0.791 ± 0.03 | 0.855 ± 0.04 | 0.910 ± 0.02 |
| Clinical data | LR | 0.832 ± 0.02 | 0.791 ± 0.05 | 0.845 ± 0.02 | 0.902 ± 0.02 |
| Clinical data | Ensemble (Average probability) | 0.871 ± 0.02 | 0.676 ± 0.02 | 0.934 ± 0.02 | 0.912 ± 0.02 |
| Clinical data | Ensemble (Maximum probability) | 0.821 ± 0.03 | 0.849 ± 0.04 | 0.812 ± 0.03 | 0.912 ± 0.02 |
ARDS: acute respiratory distress syndrome; XGB: eXtreme Gradient Boosting; RF: random forest; LR: logistic regression; AUC: area under the curve.
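The accuracy, sensitivity, and specificity columns follow from a confusion matrix at a fixed probability threshold. A minimal sketch, assuming a threshold of 0.5 (the cutoff used in the study is not given here) and a hypothetical helper name `classification_metrics`; the labels and probabilities are invented for illustration:

```python
def classification_metrics(y_true, y_prob, threshold=0.5):
    """Return accuracy, sensitivity, and specificity for binary labels (1 = ARDS)."""
    tp = fp = tn = fn = 0
    for truth, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if truth == 1 and predicted == 1:
            tp += 1
        elif truth == 1 and predicted == 0:
            fn += 1
        elif truth == 0 and predicted == 1:
            fp += 1
        else:
            tn += 1

    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else float("nan"),
        # Sensitivity: share of true ARDS cases that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of non-ARDS cases that were correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Invented example: 3 ARDS and 5 non-ARDS patients.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probabilities = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.05]
metrics = classification_metrics(labels, probabilities)
```

Lowering the threshold trades specificity for sensitivity, which is why the table reports both alongside the threshold-free AUC.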
Figure 8. Receiver operating characteristic (ROC) curves demonstrating the performance of the machine learning models and convolutional neural network (CNN) models for ARDS classification: (a) three machine learning models using clinical data; (b) two CNN models. Note. ARDS: acute respiratory distress syndrome; AUC: area under the curve.
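The AUC summarizes ranking quality independently of any threshold. One way to compute it, sketched here with a hypothetical `roc_auc` helper and invented scores, is as the probability that a randomly chosen ARDS case receives a higher score than a randomly chosen non-ARDS case, with ties counted as one half. The quadratic pairwise loop is written for clarity, not speed.

```python
def roc_auc(y_true, y_score):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for t, s in zip(y_true, y_score) if t == 1]
    negatives = [s for t, s in zip(y_true, y_score) if t == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for pos in positives:
        for neg in negatives:
            if pos > neg:
                wins += 1.0   # correctly ranked pair
            elif pos == neg:
                wins += 0.5   # tie counts as half
    return wins / (len(positives) * len(negatives))


# Invented example: 14 of the 15 positive/negative pairs are ranked correctly.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1, 0.3, 0.05]
auc = roc_auc(labels, scores)
```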
ARDS classification results based on CXRs.
| Data type | Classifier | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|
| Original image | CNN | 0.743 ± 0.02 | 0.783 ± 0.05 | 0.729 ± 0.02 | 0.835 ± 0.01 |
| Segmented and reshaped image | CNN | 0.791 ± 0.03 | 0.760 ± 0.04 | 0.802 ± 0.04 | 0.854 ± 0.02 |
ARDS: acute respiratory distress syndrome; CXRs: chest X-rays; CNN: convolutional neural network; AUC: area under the curve.
ARDS classification results from two AI models combining clinical data and CXRs.
| Classifier | Ensemble method | Accuracy | Sensitivity | Specificity | AUC |
|---|---|---|---|---|---|
| XGB + CNN | Ensemble (Average probability) | 0.859 ± 0.02 | 0.846 ± 0.02 | 0.863 ± 0.02 | 0.920 ± 0.02 |
| XGB + CNN | Ensemble (Maximum probability) | 0.777 ± 0.03 | 0.922 ± 0.02 | 0.731 ± 0.03 | 0.909 ± 0.02 |
| RF + CNN | Ensemble (Average probability) | 0.852 ± 0.02 | 0.836 ± 0.01 | 0.857 ± 0.03 | 0.919 ± 0.02 |
| RF + CNN | Ensemble (Maximum probability) | 0.763 ± 0.03 | 0.914 ± 0.02 | 0.715 ± 0.04 | 0.906 ± 0.02 |
| LR + CNN | Ensemble (Average probability) | 0.854 ± 0.02 | 0.849 ± 0.01 | 0.856 ± 0.03 | 0.920 ± 0.02 |
| LR + CNN | Ensemble (Maximum probability) | 0.769 ± 0.03 | 0.927 ± 0.04 | 0.718 ± 0.04 | 0.911 ± 0.02 |
| XGB + RF + LR + CNN | Ensemble (Average probability) | 0.855 ± 0.03 | 0.830 ± 0.02 | 0.863 ± 0.03 | 0.925 ± 0.02 |
| XGB + RF + LR + CNN | Ensemble (Maximum probability) | 0.749 ± 0.04 | 0.935 ± 0.04 | 0.689 ± 0.04 | 0.915 ± 0.02 |
ARDS: acute respiratory distress syndrome; AI: artificial intelligence; CXRs: chest X-rays; XGB: eXtreme Gradient Boosting; RF: random forest; LR: logistic regression; AUC: area under the curve.
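The two ensemble rules in the table can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, the probabilities are invented, and the weights default to equal because the weighting scheme is not given here.

```python
def ensemble_average(model_probs, weights=None):
    """Weighted average of per-patient probabilities across models.

    model_probs: one list of probabilities per model, all of the same length.
    weights: optional non-negative weight per model (equal weights if omitted).
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total_weight = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, patient_probs)) / total_weight
        for patient_probs in zip(*model_probs)
    ]


def ensemble_maximum(model_probs):
    """Per-patient maximum probability across models."""
    return [max(patient_probs) for patient_probs in zip(*model_probs)]


# Invented probabilities for three patients from a clinical model and an image model.
clinical_probs = [0.2, 0.9, 0.4]
image_probs = [0.6, 0.7, 0.1]

averaged = ensemble_average([clinical_probs, image_probs])
maximum = ensemble_maximum([clinical_probs, image_probs])
```

Because the maximum is never below a non-negatively weighted average, the maximum rule flags at least as many patients at any given threshold. That is consistent with the pattern in the table, where the maximum-probability rows show higher sensitivity and lower specificity than the average-probability rows.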
Figure 9. Receiver operating characteristic (ROC) curves demonstrating the performance of two ensemble-weighted models: (a) average probability; (b) maximum probability. Note. XGB: eXtreme Gradient Boosting; CNN: convolutional neural network; AUC: area under the curve.
Figure 10. Feature importance (a) and summary plot (b) of SHAP values.
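SHAP values are built on Shapley values, which attribute a prediction to each feature by averaging that feature's marginal contribution over all subsets of the other features. The sketch below computes exact Shapley values for a toy three-feature model. Everything in it is an assumption for illustration: the model, its coefficients, and the patient and baseline values are invented, and SHAP tooling normally uses faster estimators than this exhaustive enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, patient, baseline):
    """Exact Shapley values of `model` at `patient` relative to `baseline`."""
    n = len(patient)

    def value(subset):
        # Features in `subset` take the patient's value, the rest the baseline.
        x = [patient[j] if j in subset else baseline[j] for j in range(n)]
        return model(x)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = set(subset)
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def toy_model(x):
    # Invented score: two additive terms plus one interaction term.
    return 0.4 * x[0] + 0.2 * x[1] + 0.3 * x[0] * x[2]


phi = shapley_values(toy_model, patient=[1.0, 2.0, 1.0], baseline=[0.0, 0.0, 0.0])
```

The attributions sum to the difference between the model output for the patient and for the baseline, and the interaction term is split evenly between the two features involved.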
Figure 11. Comparison of acute respiratory distress syndrome (ARDS) classification models based on original and segmented images from two cases: (a) color visualization of a false negative on original images; (b) color visualization of a true positive on segmented images.
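The abstract names Grad-CAM as the method used to explain the CNN, so the color visualizations are presumably Grad-CAM heat maps. For reference, the standard Grad-CAM definition is given below. It comes from the general method, not from this study's implementation. Here $A^k$ is the $k$-th feature map of the chosen convolutional layer, $y^c$ is the score for class $c$ before the softmax, and $Z$ is the number of spatial positions in the feature map.

```latex
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A^k_{ij}},
\qquad
L^c_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c A^k \right)
```

The ReLU keeps only the regions that push the class score up, which is why the heat map highlights areas supporting the predicted class.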
Figure 12. Venn diagrams representing the classification effectiveness of the XGBoost and convolutional neural network (CNN) classifiers: (a) true positives; (b) true negatives.