Wandong Hong1, Yajing Lu1, Xiaoying Zhou2, Shengchun Jin2, Jingyi Pan2, Qingyi Lin2, Shaopeng Yang2, Zarrin Basharat3, Maddalena Zippi4, Hemant Goyal5.
Abstract
Background and Aims: This study aimed to develop an interpretable random forest model for predicting severe acute pancreatitis (SAP).
Keywords: LIME plot; acute pancreatitis; artificial intelligence; nomogram; predictor; random forest
Year: 2022 PMID: 35755843 PMCID: PMC9226542 DOI: 10.3389/fcimb.2022.893294
Source DB: PubMed Journal: Front Cell Infect Microbiol ISSN: 2235-2988 Impact factor: 6.073
Comparison of clinical and laboratory findings between the training set and the test set.
| Variable | Training set (n = 487) | Test set (n = 161) | P-value |
|---|---|---|---|
| Age, years (IQR) | 47 (37-61) | 49 (36-64) | 0.501 |
| Male sex, N (%) | 301 (61.81) | 103 (63.98) | 0.623 |
| Duration of symptoms, days (mean ± SD) | 1.83 ± 0.80 | 1.78 ± 0.78 | 0.515 |
| BMI, kg/m² (IQR) | 23.5 (21.1-26.3) | 23.9 (21.5-21.5) | 0.573 |
| SIRS, N (%) | 191 (39.22) | 65 (40.37) | 0.795 |
| Biliary etiology, N (%) | 207 (42.51) | 68 (42.24) | 0.584 |
| Laboratory findings | | | |
| Hematocrit (l/l) | 0.42 (0.38-0.46) | 0.42 (0.38-0.46) | 0.693 |
| Platelets, ×10^9/L (IQR) | 199 (161-233) | 195 (157-233) | 0.472 |
| Prothrombin time, s (IQR) | 13.8 (13.1-14.6) | 13.7 (13.0-14.5) | 0.278 |
| Albumin, g/L (IQR) | 36.3 (32.6-39.9) | 36.4 (34.0-39.8) | 0.191 |
| Total bilirubin, μmol/L (IQR) | 20 (14-31) | 20 (13-32) | 0.916 |
| ALT, U/L (IQR) | 43 (19-119) | 31 (19-82) | 0.055 |
| AST, U/L (IQR) | 39 (22-88) | 28 (19-71) | 0.012 |
| Glucose, mmol/L (IQR) | 7.9 (6.3-10.5) | 8.4 (6.7-11.3) | 0.128 |
| Serum creatinine, μmol/L (IQR) | 64 (54-77) | 64 (55-76) | 0.882 |
| BUN, mmol/L (IQR) | 4.8 (3.7-6.1) | 4.9 (4.0-6.2) | 0.346 |
| Total cholesterol, mmol/L (IQR) | 4.79 (3.8-6.2) | 4.8 (3.8-6.1) | 0.970 |
| HDL, mmol/L (IQR) | 1.0 (0.7-1.3) | 1.0 (0.8-1.3) | 0.461 |
| LDL, mmol/L (IQR) | 2.5 (1.9-3.2) | 2.2 (1.8-3.0) | 0.100 |
| Triglyceride, mmol/L (IQR) | 1.3 (0.8-3.4) | 1.3 (0.8-3.6) | 0.995 |
| Serum calcium, mmol/L (IQR) | 2.7 (2.1-2.3) | 2.2 (2.1-2.3) | 0.051 |
| C-reactive protein, mg/L (IQR) | 35.0 (11.7-90.0) | 29.4 (8.7-85.3) | 0.415 |
| Pleural effusion, N (%) | 89 (18.28) | 35 (21.74) | 0.333 |
| Patients with SAP, N (%) | 49 (10.1) | 16 (9.9) | 0.0092 |
| Length of hospital stay, days (IQR) | 10 (7-13) | 11 (7-15) | 0.964 |
| Hospital mortality, N (%) | 9 (1.85) | 1 (0.62) | 0.274 |
Data were mean ± standard deviation, or numbers and percentages, or median (25th–75th percentile), as appropriate. N, number; IQR, interquartile range; BMI, body mass index; SIRS, systemic inflammatory response syndrome; ALT, alanine aminotransferase; AST, aspartate aminotransferase; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol.
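The record does not state which tests produced the P-values. As a minimal sketch, assuming the categorical rows were compared with a Pearson chi-square test without continuity correction (an assumption, not something the source confirms), the P-value for a 2×2 comparison can be checked with the standard library alone. The function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the chi-square statistic and its P-value for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Counts taken from the table: (events in training, events in test).
# Assumed test choice: Pearson chi-square, uncorrected.
rows = {
    "Male sex": (301, 103),
    "SIRS": (191, 65),
}
N_TRAIN, N_TEST = 487, 161

for name, (train_yes, test_yes) in rows.items():
    chi2, p = chi_square_2x2(train_yes, N_TRAIN - train_yes,
                             test_yes, N_TEST - test_yes)
    print(f"{name}: chi2 = {chi2:.3f}, p = {p:.3f}")
```

Under this assumption the two rows come out close to the tabulated values of 0.623 and 0.795. Continuous rows would need a different test (for example a rank-based test for the median/IQR rows), which this sketch does not cover.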
Comparison of clinical and laboratory findings between patients with and without SAP in the training sample (487 patients).
| Variable | No-SAP (n = 438) | SAP (n = 49) | P-value |
|---|---|---|---|
| Age, years (IQR) | 46 (37-60) | 51 (38-66) | 0.115 |
| Male sex, N (%) | 270 (61.6) | 31 (63.3) | 0.825 |
| Duration of symptoms, days (mean ± SD) | 1.8 ± 0.8 | 1.9 ± 0.8 | 0.799 |
| BMI, kg/m² (IQR) | 23.4 (20.9-26.3) | 24.4 (22.1-26.6) | 0.083 |
| SIRS, N (%) | 157 (35.8) | 34 (69.4) | <0.001 |
| Biliary etiology, N (%) | 190 (43.4) | 17 (34.7) | 0.243 |
| Laboratory findings | | | |
| Hematocrit (l/l) | 0.42 (0.38-0.45) | 0.44 (0.41-0.49) | 0.007 |
| Platelets, ×10^9/L (IQR) | 202 (167-234) | 184 (135-208) | 0.005 |
| Prothrombin time, s (IQR) | 13.8 (13.1-14.6) | 14.6 (13.6-15.3) | 0.004 |
| Albumin, g/L (IQR) | 37.1 (33.3-39.3) | 30.4 (27.5-33.9) | <0.001 |
| Total bilirubin, μmol/L (IQR) | 20 (14-31) | 20 (15-28) | 0.631 |
| ALT, U/L (IQR) | 43 (18-121) | 48 (24-77) | 0.868 |
| AST, U/L (IQR) | 36 (21-88) | 60 (41-85) | 0.005 |
| Glucose, mmol/L (IQR) | 7.7 (6.2-10.0) | 10.2 (8.2-14.4) | <0.001 |
| Serum creatinine, μmol/L (IQR) | 63 (54-76) | 81 (59-154) | <0.001 |
| BUN, mmol/L (IQR) | 4.6 (3.6-5.9) | 7.3 (5.1-11.4) | <0.001 |
| Total cholesterol, N (row %) | | | 0.001 |
| <160 mg/dL | 203 (95.75) | 9 (4.25) | |
| 160-240 mg/dL | 131 (83.97) | 25 (16.03) | |
| >240 mg/dL | 104 (87.39) | 15 (12.61) | |
| HDL, mmol/L (IQR) | 1.0 (0.8-1.3) | 0.6 (0.4-1.0) | <0.001 |
| LDL, mmol/L (IQR) | 2.6 (2.0-3.3) | 1.6 (1.3-2.4) | <0.001 |
| Triglyceride, mmol/L (IQR) | 1.3 (0.8-3.3) | 2.4 (1.3-7.2) | <0.001 |
| Serum calcium, mmol/L (IQR) | 2.2 (2.1-2.3) | 1.9 (1.6-2.1) | <0.001 |
| C-reactive protein, mg/L (IQR) | 30.5 (10.4-87.8) | 80.0 (28.4-90.0) | 0.003 |
| Pleural effusion, N (%) | 59 (13.47) | 30 (61.22) | <0.001 |
Data were mean ± standard deviation, or numbers and percentages, or median (25th–75th percentile), as appropriate. N, number; IQR, interquartile range; BMI, body mass index; SIRS, systemic inflammatory response syndrome; ALT, alanine aminotransferase; AST, aspartate aminotransferase; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol.
Figure 1. Nomogram predicting the probability of SAP. To obtain the nomogram-predicted probability, the patient's value on each variable axis was located and a vertical line was drawn to the points axis to determine how many points that value contributed. The points for all variables were summed, and the sum was located on the total points axis to read off the probability of SAP.
Figure 2. Variable importance plot of the RF for SAP.
Figure 3. ROC curves for the RF and LR models, for tenfold cross-validation on the training set.
Figure 4. Precision-recall curves for the RF and LR models, for tenfold cross-validation on the training set.
Figure 5. Calibration plots for the RF and LR models, for tenfold cross-validation on the training set.
Figure 6. ROC curves for the RF model, LR model, and BISAP score applied to the test set.
Figure 7. Precision-recall curves for the (A) RF model, (B) LR model, and (C) BISAP score applied to the test set.
Diagnostic values of various models of SAP.
| Variable | Cut-off value | Sensitivity | Specificity | LR+ | LR- | Accuracy |
|---|---|---|---|---|---|---|
| RF model | 0.13 | 93.8% | 82.8% | 5.44 | 0.08 | 83.9% |
| LR model | 0.08 | 93.8% | 79.3% | 4.53 | 0.08 | 80.8% |
| BISAP score | 2 | 68.8% | 78.6% | 3.22 | 0.40 | 77.6% |
LR+, Positive likelihood ratio; LR-, negative likelihood ratio.
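The likelihood ratios and accuracy in this table follow directly from sensitivity and specificity. The sketch below shows the arithmetic. The confusion-matrix counts are not given in the record; they are inferred here (an assumption) from the reported percentages and the test set composition of 16 SAP and 145 non-SAP patients. The function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute standard diagnostic measures from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": sensitivity / (1 - specificity),   # positive likelihood ratio
        "lr_neg": (1 - sensitivity) / specificity,   # negative likelihood ratio
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }

# Inferred counts (assumption): (TP, FN, TN, FP) chosen to match the
# reported sensitivity and specificity on 16 SAP and 145 non-SAP patients.
inferred_counts = {
    "RF model": (15, 1, 120, 25),
    "LR model": (15, 1, 115, 30),
    "BISAP score": (11, 5, 114, 31),
}

for model, counts in inferred_counts.items():
    m = diagnostic_metrics(*counts)
    print(
        f"{model}: sens = {m['sensitivity']:.1%}, spec = {m['specificity']:.1%}, "
        f"LR+ = {m['lr_pos']:.2f}, LR- = {m['lr_neg']:.2f}, "
        f"accuracy = {m['accuracy']:.1%}"
    )
```

With these inferred counts the computed values agree with the tabulated ones to the precision shown. Computing the ratios from the rounded percentages instead gives slightly different second decimals.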
Figure 8. LIME plots for the individualized predictions of two typical cases, showing the main features contributing to each model prediction. The length of each color bar represents the size of the feature's contribution. The first case (case 49) is a non-SAP patient who was correctly classified by the RF model, with a predicted probability of 0.97 for non-SAP. This patient had creatinine=86 μmol/L, BUN=7.1 mmol/L, no pleural effusion, LDL=1.82 mmol/L, albumin=36.5 g/L, total cholesterol=3.24 mmol/L, HDL=0.79 mmol/L, glucose=8.4 mmol/L, prothrombin time=15.2 s, hematocrit=0.465, platelets=206×10^9/L, AST=76 U/L, calcium=2.43 mmol/L, triglyceride=0.96 mmol/L, no SIRS, and CRP=5 mg/L. The second case (case 51) is an SAP patient who was correctly classified by the RF model, with a predicted probability of 0.82 for SAP. This patient had creatinine=260 μmol/L, BUN=16.6 mmol/L, glucose=23.2 mmol/L, HDL=0.47 mmol/L, no pleural effusion, albumin=26.5 g/L, calcium=0.83 mmol/L, triglyceride=25.6 mmol/L, LDL=1.87 mmol/L, hematocrit=0.39, prothrombin time=15.7 s, AST=155 U/L, SIRS, platelets=243×10^9/L, CRP=76.1 mg/L, and total cholesterol=10.54 mmol/L.