| Literature DB >> 32853226 |
Tengyang Wang, Guanghua Liu, Hongye Lin.
Abstract
Kawasaki disease is the leading cause of pediatric acquired heart disease. Coronary artery abnormalities are the main complication of Kawasaki disease. Kawasaki disease patients with intravenous immunoglobulin resistance are at a greater risk of developing coronary artery abnormalities. Several scoring models have been established to predict resistance to intravenous immunoglobulin, but clinicians usually do not apply those models in patients because of their poor performance. To find a better model, we retrospectively collected data including 753 observations and 82 variables. A total of 644 observations were included in the analysis, and 124 of the patients observed were intravenous immunoglobulin resistant (19.25%). We considered 7 different linear and nonlinear machine learning algorithms, including logistic regression (L1- and L2-regularized), decision tree, random forest, AdaBoost, gradient boosting machine (GBM), and lightGBM, to predict the class of intravenous immunoglobulin resistance (binary classification). Data from patients who were discharged before September 1, 2018 were included in the training set (n = 497), while all the data collected from September 1, 2018 onward were included in the test set (n = 147). We used the area under the ROC curve, accuracy, sensitivity, and specificity to evaluate the performance of each model. The GBM had the best performance (area under the ROC curve 0.7423, accuracy 0.8844, sensitivity 0.3043, specificity 0.9919). Additionally, the feature importance was evaluated with SHapley Additive exPlanation (SHAP) values, and the clinical utility was assessed with decision curve analysis. We also compared our model with the Kobayashi score, Egami score, Formosa score and Kawamura score. Our machine learning model outperformed all four of the aforementioned scoring models. Our study demonstrates a novel and robust machine learning method to predict intravenous immunoglobulin resistance in Kawasaki disease patients.
We believe this approach could be implemented in an electronic health record system as a form of clinical decision support in the near future.
Year: 2020 PMID: 32853226 PMCID: PMC7451628 DOI: 10.1371/journal.pone.0237321
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Demographic and clinical features of patients.
| Categories | Variables | Data |
|---|---|---|
| Demographics | Age in months (mean ± SD) | 19.4 ± 16.8 |
| | Male | 400 (62.1%) |
| | Female | 244 (37.9%) |
| Season of onset | Spring | 128 (19.9%) |
| | Summer | 234 (36.3%) |
| | Autumn | 151 (23.4%) |
| | Winter | 151 (23.4%) |
| Clinical features | Rash | 526 (81.7%) |
| | Erythema of oral mucosa | 534 (82.9%) |
| | Strawberry tongue | 430 (66.8%) |
| | Cervical lymphadenopathy | 373 (57.9%) |
| | Edema of the hands and feet | 312 (48.4%) |
| | Periungual desquamation | 78 (12.1%) |
| | Total days with fever (mean ± SD) | 7.4 ± 3.0 |
| KD type | Typical | 374 (58.1%) |
| | Atypical | 270 (41.9%) |
| IVIG response | Respond | 520 (80.8%) |
| | Resistant | 124 (19.2%) |
Abbreviations: SD stands for Standard Deviation.
Fig 1Schematic of patient enrollment and development of the machine learning model.
A full record contains 82 features. There were 644 remaining records after the data cleaning process. The sizes of the training set and test set are 497 and 147, respectively. The hyperparameter tuning process used 20% of the records in the training set. Machine learning models with optimal hyperparameters were trained on the other 80% of the records. Records in the test set (n = 147) were used to evaluate the trained models.
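The split described above can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the data here are synthetic stand-ins for the study's 644 records, and the hyperparameter values are taken from the figure captions rather than from any published script.

```python
# Sketch of the enrollment/split pipeline described above, using synthetic
# data in place of the study's records (all values here are illustrative).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(644, 82))               # 644 cleaned records, 82 features
y = (rng.random(644) < 0.19).astype(int)     # ~19% IVIG-resistant

# Chronological split as in the paper: first 497 records train, last 147 test.
X_train, X_test = X[:497], X[497:]
y_train, y_test = y[:497], y[497:]

# 20% of the training set is held out for hyperparameter tuning; the
# remaining 80% is used to fit the model with the chosen settings.
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)

model = GradientBoostingClassifier(n_estimators=32, random_state=0)
model.fit(X_fit, y_fit)
print(X_fit.shape, X_val.shape, X_test.shape)
```

With `test_size=0.2`, scikit-learn rounds the validation fold up, so the 497 training records split into 397 for fitting and 100 for tuning.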
Variables considered in the model.
| Categories | Variables |
|---|---|
| Demographics | Sex, Age in months |
| Clinical features | Days of illness prior to hospitalization, Days of illness at diagnosis, KD type, Rash, Erythema of oral mucosa, Strawberry tongue, Conjunctival injection, Cervical lymphadenopathy |
| Echocardiography | Left main coronary artery diameter, Proximal right coronary artery (RCA) diameter, Left main coronary artery Z_Score (Japan), Proximal_RCA Z_Score (Japan), Left main coronary artery Z_Score (AHA), Proximal_RCA Z_Score (AHA) |
| Blood biochemistry | Serum potassium, Triglyceride, Blood urea nitrogen, Creatinine, Serum calcium, Alkaline phosphatase, Serum total protein, Serum albumin |
| Complete blood count | White blood cell count, Eosinophilic granulocyte count, Basophil count, Erythrocyte mean corpuscular volume, Mean corpuscular hemoglobin, Mean corpuscular hemoglobin concentration, Mean platelet volume, Plateletcrit, Percentage of monocytes, Percentage of eosinophils, Percentage of basophils, Neutrophil count, Hematocrit, Platelet distribution width, Percentage of lymphocytes, Percentage of neutrophils |
| Inflammatory markers | C-reactive protein |
1: Variables used in Kobayashi score.
2: Variables used in Egami score.
3: Variables used in Kawamura score.
4: Variables used in Formosa score.
Fig 2ROC curves and AUC values: An analysis of the predictive capacity for discrimination between IVIG-resistant and non-IVIG-resistant KD patients.
A: Logistic regression with L1 regularization (AUC 0.6725). B: Logistic regression with L2 regularization (AUC 0.6879). C: Decision tree with maximum depth = 12 (AUC 0.6367). D: Random forest with 16 estimators (AUC 0.6488). E: AdaBoost with 64 estimators (AUC 0.6704). F: GBM with 32 estimators (AUC 0.7423). G: lightGBM with maximum depth = 5 (AUC 0.6560).
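A per-model AUC comparison of this kind can be sketched with scikit-learn as below. This is an assumed reconstruction on synthetic data: the hyperparameters mirror the figure caption, but the dataset and resulting AUCs are stand-ins, and lightGBM is omitted to keep the sketch within scikit-learn.

```python
# Hedged sketch: compute test-set ROC AUC for several classifiers, in the
# spirit of Fig 2. Data are synthetic; models use caption hyperparameters.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import (RandomForestClassifier, AdaBoostClassifier,
                              GradientBoostingClassifier)
from sklearn.metrics import roc_auc_score

X, y = make_classification(n_samples=644, n_features=82, weights=[0.81],
                           random_state=0)
X_tr, X_te, y_tr, y_te = X[:497], X[497:], y[:497], y[497:]

models = {
    "logit_l1": LogisticRegression(penalty="l1", solver="liblinear"),
    "logit_l2": LogisticRegression(penalty="l2", max_iter=1000),
    "DT": DecisionTreeClassifier(max_depth=12, random_state=0),
    "RF": RandomForestClassifier(n_estimators=16, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=64, random_state=0),
    "GBM": GradientBoostingClassifier(n_estimators=32, random_state=0),
}
# AUC is computed from predicted probabilities, not hard class labels.
aucs = {name: roc_auc_score(y_te, m.fit(X_tr, y_tr).predict_proba(X_te)[:, 1])
        for name, m in models.items()}
for name, auc in aucs.items():
    print(f"{name}: {auc:.4f}")
```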
Model performances in AUC, accuracy, sensitivity, and specificity.
| model | logit_l1 | logit_l2 | DT | RF | AdaBoost | GBM | lightGBM |
|---|---|---|---|---|---|---|---|
| AUC | 0.6725 | 0.6879 | 0.6367 | 0.6488 | 0.6704 | 0.7423 | 0.6560 |
| accuracy | 0.7007 | 0.7415 | 0.7143 | 0.8367 | 0.7959 | 0.8844 | 0.7619 |
| sensitivity | 0.3478 | 0.2609 | 0.0435 | 0.3044 | 0.3043 | 0.3043 | 0.2174 |
| specificity | 0.7661 | 0.8306 | 0.7581 | 0.9839 | 0.8871 | 0.9919 | 0.8629 |
| hyperparameter value | 1 | 3 | 12 | 16 | 64 | 32 | 5 |
Abbreviations: Logit l1 and logit l2 represent logistic regression with L1 and L2 regularizations, respectively; DT stands for decision tree; RF stands for random forest; GBM stands for gradient boosting machine.
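The metrics in the table above all follow from a 2×2 confusion matrix. As a worked check, the GBM's reported test-set figures (accuracy 0.8844, sensitivity 0.3043, specificity 0.9919 on n = 147 with 23 resistant patients) are consistent with the counts used below:

```python
# Accuracy, sensitivity, and specificity from confusion-matrix counts.
def binary_metrics(tp, fp, tn, fn):
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity

# Counts implied by the abstract's GBM test-set metrics (n = 147):
# 7 of 23 resistant patients detected, 123 of 124 responders correct.
acc, sens, spec = binary_metrics(tp=7, fp=1, tn=123, fn=16)
print(round(acc, 4), round(sens, 4), round(spec, 4))  # 0.8844 0.3043 0.9919
```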
Fig 3Decision curves for predicting IVIG resistance in KD patients by machine learning models.
The x-axis indicates the threshold probability for the outcome of IVIG resistance among KD patients without additional initial treatment. The y-axis indicates the net benefit. Two extreme strategies, intervention for all and intervention for none, were added as references.
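The net benefit plotted on the y-axis has a standard closed form, sketched below on toy data. This is a generic decision-curve calculation (NB = TP/n − FP/n · pt/(1 − pt)), assumed rather than taken from the authors' analysis code.

```python
# Minimal sketch of the net-benefit calculation behind a decision curve.
import numpy as np

def net_benefit(y_true, y_prob, pt):
    """Net benefit of treating every patient with predicted risk >= pt."""
    n = len(y_true)
    treat = y_prob >= pt
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * pt / (1 - pt)

def treat_all(y_true, pt):
    """Reference line: intervene on everyone regardless of prediction."""
    prev = np.mean(y_true)
    return prev - (1 - prev) * pt / (1 - pt)

# Toy outcomes and predicted risks, purely illustrative.
y = np.array([1, 0, 0, 1, 0, 0, 0, 1, 0, 0])
p = np.array([0.9, 0.2, 0.4, 0.7, 0.1, 0.3, 0.2, 0.6, 0.5, 0.1])
for pt in (0.1, 0.3, 0.5):
    print(pt, round(net_benefit(y, p, pt), 3), round(treat_all(y, pt), 3))
```

The "intervention for none" reference is simply a net benefit of zero at every threshold.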
Fig 4SHAP force plot.
Features push the model output away from the base value (the average model output over the training dataset). Features pushing the prediction higher are shown in red, and those pushing it lower are shown in blue.
Fig 5SHAP values and importance.
A: The SHAP values of the 20 most important features for every sample, sorted in descending order of importance. B: Feature importance, measured by the mean absolute Shapley value, in descending order.
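The mean-absolute-SHAP ranking in panel B can be illustrated without the `shap` package by using a linear model, where SHAP values have the closed form φⱼ = wⱼ(xⱼ − E[xⱼ]). The weights and data below are hypothetical, chosen only to show the ranking mechanics.

```python
# Mean-|SHAP| feature importance, as in Fig 5B, for a linear model where
# SHAP values are exactly w_j * (x_j - mean_j). Data and weights synthetic.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
w = np.array([0.1, -2.0, 0.5, 0.0, 1.0])   # hypothetical linear weights

phi = (X - X.mean(axis=0)) * w              # per-sample SHAP values
importance = np.abs(phi).mean(axis=0)       # Fig 5B style: mean |SHAP|
ranking = np.argsort(importance)[::-1]      # most important feature first
print(ranking)
```

With comparable feature scales, the ranking simply tracks |wⱼ|, so feature 1 (weight −2.0) comes out on top.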
Model performances comparison.
| model | accuracy | sensitivity | specificity | PPV | NPV | AUC |
|---|---|---|---|---|---|---|
| Kobayashi | 0.5782 | 0.1304 | 0.6613 | 0.0667 | 0.8039 | 0.5700 |
| Egami | 0.6735 | 0.1304 | 0.7742 | 0.0968 | 0.8276 | 0.6520 |
| Formosa | 0.7211 | 0.4348 | 0.7742 | 0.2632 | 0.8807 | 0.5070 |
| Kawamura | 0.5646 | 0.4782 | 0.5806 | 0.1746 | 0.8571 | 0.5050 |
| GBM | 0.8844 | 0.3043 | 0.9919 | 0.8750 | 0.8848 | 0.7423 |
Performance of Kobayashi score, Egami score, Formosa score and Kawamura score in accuracy, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV) and AUC. The GBM model we trained is listed for comparison.
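PPV and NPV also follow directly from confusion-matrix counts. Using the same GBM test-set counts implied by the abstract (tp = 7, fp = 1, tn = 123, fn = 16), the values below match the table up to rounding:

```python
# Positive and negative predictive values from confusion-matrix counts.
def ppv_npv(tp, fp, tn, fn):
    ppv = tp / (tp + fp)   # of those flagged resistant, how many truly are
    npv = tn / (tn + fn)   # of those flagged responders, how many truly are
    return ppv, npv

ppv, npv = ppv_npv(tp=7, fp=1, tn=123, fn=16)
print(round(ppv, 4), round(npv, 4))  # 0.875 and ~0.8849 (table: 0.8848)
```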
Fig 6Decision curves of the GBM model, Kobayashi score, Egami score, Formosa score and Kawamura score.