
Predicting in-hospital mortality in ICU patients with sepsis using gradient boosting decision tree.

Ke Li¹, Qinwen Shi¹, Siru Liu², Yilin Xie¹, Jialin Liu³,⁴.

Abstract

Sepsis is a leading cause of mortality in the intensive care unit. Early prediction of sepsis can reduce the overall mortality rate and the cost of sepsis treatment. Some studies have predicted mortality and the development of sepsis using machine learning models. However, there is a gap between the creation of different machine learning algorithms and their implementation in clinical practice. This study utilized data from the Medical Information Mart for Intensive Care III. We established and compared gradient boosting decision tree (GBDT), logistic regression (LR), k-nearest neighbor (KNN), random forest (RF), and support vector machine (SVM) models. A total of 3937 sepsis patients were included, with 34.3% mortality in the Medical Information Mart for Intensive Care III group. In our comparison of the 5 machine learning models, the GBDT model showed the best performance, with the highest area under the receiver operating characteristic curve (0.992), precision (94.8%), recall (91.7%), accuracy (95.4%), and F1 score (0.933). The RF, SVM, and KNN models showed better performance (area under the receiver operating characteristic curve: 0.980, 0.898, and 0.877, respectively) than the LR model (0.876). The GBDT model showed better performance than the other machine learning models (LR, KNN, RF, and SVM) in predicting the mortality of patients with sepsis in the intensive care unit. This could be used to develop a clinical decision support system in the future.


Year: 2021 PMID: 34106618 PMCID: PMC8133100 DOI: 10.1097/MD.0000000000025813

Source DB: PubMed Journal: Medicine (Baltimore) ISSN: 0025-7974 Impact factor: 1.889

Introduction

Sepsis is a life-threatening organ dysfunction caused by a dysregulated host response to infection. It is not only a common and potentially life-threatening condition but also a major global health issue. More than an estimated 30 million people develop sepsis every year worldwide, potentially leading to 6 million deaths. Sepsis is one of the most burdensome diseases worldwide because of high treatment costs and excessively lengthy hospital stays. However, early diagnosis, accurate identification of risk factors, and appropriate treatment reduce the overall mortality rate and improve patient outcomes. Sepsis is difficult to diagnose early because the sources of infection vary among patients and host responses differ. Early and timely detection of sepsis has therefore always been a focus of research.

Some studies have shown that machine learning can be used to build prognostic models for both mortality and sepsis development. To predict septic mortality, previous studies have employed big data and machine learning models such as stochastic gradient boosting, support vector machine (SVM), naive Bayes, logistic regression (LR), and random forest (RF). Machine learning helps to analyze complex data automatically and produces significant results. Machine learning-based approaches have the potential for increased sensitivity and specificity when trained on sepsis patient data. However, the potential of some techniques, including ensemble algorithms, to improve prediction outcomes has not yet been addressed. It is also necessary to find methods for generating accurate predictions. To address these issues, we conducted an exploratory study to evaluate the efficiency of different classification algorithms in predicting death in adult patients with sepsis. In this study, we compared the gradient boosting decision tree (GBDT) model with other machine learning approaches (LR, k-nearest neighbor, RF, and SVM) using the prediction of sepsis in-hospital mortality as the use case.

Methods

Dataset

This study used the Medical Information Mart for Intensive Care III (MIMIC-III) v1.4. MIMIC-III is a large, freely available database comprising de-identified health-related data associated with 53,423 distinct hospital admissions of adult patients admitted to critical care units at the Beth Israel Deaconess Medical Center in Boston between 2001 and 2012.

Ethics statement

Because our study used an open-access database, no further local institutional review board approval was required. Data analysis and model development procedures followed the MIMIC-III guidelines and regulations.

Sepsis definition

Septic patients were identified from records in the database using the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes for sepsis. The codes included 003.1 (salmonella septicemia), 022.3 (anthrax septicemia), 038.0 to 038.9 (subcodes of septicemia), and 054.5 (herpetic septicemia). Two diagnostic codes that came into effect in October 2002 were also included: 995.91 (systemic inflammatory response syndrome caused by the infectious process without organ dysfunction) and 995.92 (systemic inflammatory response syndrome caused by the infectious process with organ dysfunction).

Data extraction and imputation

We developed SQL scripts to query the MIMIC-III database for all adult patients (≥18 years) with sepsis, defined by the ICD-9-CM codes listed above. We used indicators recorded during each patient's first ICU admission to predict in-hospital mortality, and we extracted all data for sepsis patients from the first 48 hours after ICU admission. The extracted data included patient age, sex, ethnicity, length of hospital stay, Glasgow coma scale, percutaneous oxygen saturation, vital signs, and laboratory values. We excluded patients with more than 30% missing data. For each variable with less than 30% missing values, we replaced the missing values with the mean of the corresponding group.
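The exclusion and imputation rules above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the field names (`died`, `heart_rate`, `lactate`, `gcs`, `spo2`), the toy values, and the choice of outcome as the grouping variable are hypothetical, since the text does not say which groups the means were computed in.

```python
from statistics import mean

MAX_MISSING = 0.30  # cutoff taken from the 30% rule described in the text


def drop_sparse_rows(rows, features, max_missing=MAX_MISSING):
    """Keep only patients whose share of missing (None) features is <= max_missing."""
    kept = []
    for row in rows:
        n_missing = sum(1 for f in features if row.get(f) is None)
        if n_missing / len(features) <= max_missing:
            kept.append(row)
    return kept


def impute_group_means(rows, features, group_key):
    """Replace each None with the mean of the observed values in the same group."""
    means = {}
    for f in features:
        by_group = {}
        for row in rows:
            if row[f] is not None:
                by_group.setdefault(row[group_key], []).append(row[f])
        means[f] = {g: mean(vals) for g, vals in by_group.items()}
    filled = []
    for row in rows:
        new_row = dict(row)
        for f in features:
            if new_row[f] is None:
                # Stays None if the whole group has no observed value.
                new_row[f] = means[f].get(new_row[group_key])
        filled.append(new_row)
    return filled


# Hypothetical toy records, not data from the study.
features = ["heart_rate", "lactate", "gcs", "spo2"]
rows = [
    {"died": 1, "heart_rate": 110.0, "lactate": 4.0, "gcs": 9.0, "spo2": 92.0},
    {"died": 1, "heart_rate": None, "lactate": 6.0, "gcs": 7.0, "spo2": 90.0},
    {"died": 0, "heart_rate": 80.0, "lactate": None, "gcs": 15.0, "spo2": 98.0},
    {"died": 0, "heart_rate": 90.0, "lactate": 2.0, "gcs": 14.0, "spo2": 97.0},
    {"died": 0, "heart_rate": None, "lactate": None, "gcs": 15.0, "spo2": 99.0},
]

kept = drop_sparse_rows(rows, features)          # last record has 50% missing
filled = impute_group_means(kept, features, "died")
```

In this toy run the last record is dropped because half of its features are missing, and the remaining gaps are filled with the mean of the same outcome group.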

Prediction model

In this study, 5 prediction models (GBDT, LR, k-nearest neighbor [KNN], RF, and SVM) were established and compared.

GBDT is an algorithm that combines decision trees with ensemble learning techniques. Its basic idea is to combine a series of weak base classifiers into a strong classifier. In the learning process, each new regression tree is fitted to the residuals of the current model to reduce the loss function, until the residuals fall below a certain threshold or the number of regression trees reaches a preset limit. The advantages of GBDT are good training performance, less overfitting, and flexible handling of various data types, including continuous and discrete values.

LR is a statistical method for analyzing datasets, and it is also a supervised machine learning algorithm developed for classification problems. It is one of the most widely used methods in health sciences research, especially in epidemiology. Some studies have shown that LR is effective in analyzing mortality factors and predicting mortality.

KNN is an algorithm that stores all available instances and classifies new instances based on a similarity measure (such as a distance function). It has been widely used in classification and regression problems owing to its simple implementation and outstanding performance.

SVM was derived from the statistical learning theory proposed by Cortes and Vapnik. SVM maps the original dataset from the input space to a high-dimensional feature space, thus simplifying the classification problem in the feature space. Its main advantage is that it can use kernel functions to incorporate expert knowledge about the problem, minimizing both model complexity and prediction error.

RF is an ensemble supervised machine learning algorithm that uses decision trees as base classifiers. RF produces many classifiers and combines their results through majority voting.

In this study, we used the Scikit-learn toolkit to train and test the models. We conducted 5-fold cross-validation using only the encounters allocated to the training set. For the cross-validation results, a paired t test was used to measure the significance of differences between the models.
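The residual-fitting loop described for GBDT can be illustrated with a toy sketch. It is not the Scikit-learn implementation used in the study: it assumes a single numeric feature, depth-1 regression trees (stumps), squared-error loss, and hypothetical values for `max_trees`, `learning_rate`, and `tol`.

```python
def fit_stump(xs, residuals):
    """Find the one-threshold split on xs that minimizes squared error on residuals.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbdt(xs, ys, max_trees=50, learning_rate=0.5, tol=1e-3):
    """Add stumps fitted to residuals until residuals are small or max_trees is hit."""
    base = sum(ys) / len(ys)          # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(max_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        if max(abs(r) for r in residuals) < tol:   # stopping rule 1: small residuals
            break
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps             # stopping rule 2: max_trees reached


def predict_gbdt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


# Hypothetical toy data: the target switches from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = fit_gbdt(xs, ys)
```

Each stump corrects part of what the previous ones left unexplained, so the toy model stops early once the largest residual drops below `tol`, well before `max_trees` is reached.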

Results

A total of 3937 patients were included, with 34.3% in-hospital mortality in the MIMIC-III v1.4 database (Fig. 1). The main characteristics of the patients with sepsis are shown in Table 1. Compared with those who survived, patients who died were older (68.9 ± 14.9 vs 65.5 ± 16.7 years; P < .01).

Figure 1

The flowchart for including patients in the study.

Table 1

Patient demographic information.

Variable	Death (n = 1352)	Survival (n = 2585)	P value
Gender
Female	578 (42.8%)	1147 (44.4%)	.344
Male	774 (57.2%)	1438 (55.6%)	.344
Age (y) (mean, SD)	68.9 ± 14.9	65.5 ± 16.7	<.01
Ethnicity
Caucasian	950 (70.3%)	1894 (73.3%)	.047
Hispanic	37 (2.7%)	90 (3.5%)	.218
African American	109 (8.1%)	246 (9.5%)	.143
Other	256 (18.9%)	355 (13.7%)	<.01
ICU days (mean, SD)	17.4 ± 18.1	18.2 ± 16.5	.176

Death = death of septic patients during hospitalization.

The results of the 5 machine learning methods in 5-fold cross-validation are shown in Tables 2 and 3. The metrics included the area under the receiver operating characteristic curve (AUROC), precision, recall, accuracy, and F1 score. Accuracy was defined as the number of correctly predicted observations divided by the total number of observations. Precision was calculated by dividing the number of correctly predicted positive observations by the number of predicted positive observations. Recall is the proportion of correctly predicted positive observations among all observations that are actually positive. The F1 score is the harmonic mean of precision and recall, representing the balance between these 2 values.
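For concreteness, these metrics can be computed from the four cells of a confusion matrix, with F1 taken as the harmonic mean of precision and recall (its standard definition). The counts below are hypothetical and are not taken from the study; the sketch also assumes none of the denominators is zero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp/fp/fn/tn = true positives, false positives, false negatives, true negatives.
    Assumes tp + fp > 0 and tp + fn > 0.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total          # correct predictions / all observations
    precision = tp / (tp + fp)            # correct positives / predicted positives
    recall = tp / (tp + fn)               # correct positives / actual positives
    f1 = 2 * precision * recall / (precision + recall)   # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, with "positive" meaning in-hospital death.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=170)
```

With these counts, precision is 0.90 and recall is 0.75, and the F1 score falls between the two, closer to the lower value.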

Table 2

Comparison of performance of the 5 models.

	LR	KNN	SVM	RF	GBDT
AUC	0.876	0.877	0.898	0.980	0.992
Precision	0.723	0.806	0.828	0.931	0.948
Recall	0.776	0.624	0.749	0.885	0.917
Accuracy	0.821	0.819	0.860	0.938	0.954
F1 score	0.715	0.702	0.780	0.907	0.933

Table 3

Comparison of AUROC and F1 among the different models.

	AUROC	F1	P value (AUROC)	P value (F1)
GBDT	0.992 (0.989–0.994)	0.933 (0.929–0.938)	<0.01 vs LR; <0.01 vs KNN; <0.01 vs RF; <0.01 vs SVM	<0.01 vs LR; <0.01 vs KNN; <0.01 vs RF; <0.01 vs SVM
LR	0.876 (0.864–0.885)	0.715 (0.704–0.723)	0.774 vs KNN; <0.01 vs RF; 0.012 vs SVM	0.354 vs KNN; <0.01 vs RF; <0.01 vs SVM
KNN	0.877 (0.871–0.885)	0.702 (0.665–0.730)	<0.01 vs RF; 0.010 vs SVM	<0.01 vs RF; <0.01 vs SVM
RF	0.980 (0.978–0.984)	0.907 (0.896–0.930)	<0.01 vs SVM	<0.01 vs SVM
SVM	0.898 (0.880–0.914)	0.780 (0.771–0.801)

The AUROC ranged from 0.876 to 0.992 for the 5 predictive models. GBDT showed the largest AUROC (0.992) and the highest precision (94.8%), recall (91.7%), accuracy (95.4%), and F1 score (0.933). LR showed the lowest AUROC (0.876) and precision (0.723), whereas KNN showed the lowest recall (0.624). The receiver operating characteristic curves of these predictive models are shown in Figure 2. GBDT ranks the individual variables based on their relative influence, and the top 10 variables are presented in Figure 3.
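As a reminder of what the AUROC measures, it equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below computes it directly from that definition; the labels and scores are hypothetical and the quadratic pairwise loop is meant for illustration, not for large datasets.

```python
def auroc(labels, scores):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (death), 0 for the negative class.
    scores: predicted risk, higher meaning more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for 3 deaths and 3 survivors.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
area = auroc(labels, scores)   # 8 of 9 pairs are ordered correctly
```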

Figure 2

Comparison of the ROC curve of the 5 models. ROC = receiver operating characteristic curve.

Figure 3

Top-10 variable importance of GBDT. GBDT = gradient boosting decision tree, GCS = Glasgow coma scale, max2 = parameter maximum in 48 hours of admission, Mean1 = average of parameters within 24 hours of admission, mean2 = average of parameters within 48 hours of admission, min2 = parameter minimum in 48 hours of admission, PTT = partial thromboplastin time.

Figure 2 compares the AUROC of the 5 models for predicting death in patients with sepsis. The AUROC and F1 score of GBDT were higher than those of the other models. RF, SVM, and KNN had higher AUROCs than LR but lower AUROCs than GBDT, although the difference between KNN and LR was not significant (P = .774). GBDT differed significantly from each of the other models (P < .01).
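A paired t test on cross-validation results compares two models fold by fold. The sketch below shows the arithmetic for 5 folds; the per-fold AUROC values are hypothetical (the per-fold results are not reported here), and the two-sided critical value for 4 degrees of freedom at the .01 level (4.604) is taken from standard t tables.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees of freedom) for paired per-fold scores of two models."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)   # sample SD of the differences / sqrt(n)
    return mean(diffs) / standard_error, n - 1


# Hypothetical per-fold AUROCs for two models over the same 5 folds.
model_a = [0.990, 0.993, 0.991, 0.994, 0.992]
model_b = [0.978, 0.982, 0.981, 0.979, 0.980]

t_value, df = paired_t_statistic(model_a, model_b)

T_CRITICAL_DF4_P01 = 4.604                    # two-sided, alpha = .01, df = 4
significant = abs(t_value) > T_CRITICAL_DF4_P01
```

Because the same folds are used for both models, the test works on the per-fold differences rather than on the two sets of scores separately. In practice a library routine such as `scipy.stats.ttest_rel` would return the exact P value.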

Discussion

In this study, there was no significant difference in sex between patients with sepsis who died and those who survived (P = .344). There was a significant difference in age between the death and survival groups (P < .01). There was no significant difference in the number of days in the hospital between patients with sepsis who died and those who survived (P = .176). The results showed that GBDT had the largest AUROC (0.992) and the highest precision (0.948), recall (0.917), accuracy (0.954), and F1 score (0.933) for predicting death in patients with sepsis, outperforming the other models. This is because GBDT is based on the tree model and inherits its advantages: it is robust to outliers and noise, is little affected by uncorrelated features, and handles missing values well. A tree model is a decision support tool that uses a tree-like diagram or model to represent a decision and its possible consequences, including chance event outcomes, costs of resources, and utility.

This study has some limitations. First, the study was performed at a single institution; the performance of machine learning techniques might be different when applied to a sample from different institutions with a different distribution of covariates. Second, this study mainly involved Caucasians (72.2%), African Americans (9.0%), and Hispanics (3.2%), so the results need to be further verified in other ethnic groups. Third, although the Sequential Organ Failure Assessment (SOFA) criterion is the latest definition of sepsis, using SOFA as a criterion for sepsis may lead to some bias due to missing data in the MIMIC-III database. If death occurs during the assessment period, data from some patients, many of whom have high scores, will be missing, leading to survival bias. SOFA criteria may also lead to delayed diagnosis and intervention in cases of severe infection, and some authors have reported that the use of SOFA criteria requires further exploration. Finally, the sepsis standard (ICD-9) used in this study is an imperfect characterization of sepsis. Nevertheless, we believe it is useful in developing sepsis prediction tools, as evidenced by the improvements in sepsis-related clinical outcomes using a sepsis prediction algorithm trained on the same standard.

Conclusions

In this study, we established and evaluated a GBDT prediction model for death in patients with sepsis in the ICU. Among the 5 machine learning models compared, GBDT showed the best performance, with the highest AUROC and F1 score. These results indicate that GBDT is an effective algorithm for predicting death in patients with sepsis. In future studies, we intend to verify the performance of the GBDT model in hospitals with different demographic and clinical characteristics, as well as in non-intensive care units. The model could also be used to develop a clinical decision support system.

Author contributions

Jialin Liu and Ke Li conceived the study. Qinwen Shi, Siru Liu, Jialin Liu, Ke Li, and Yilin Xie performed the analysis, interpreted the results, and drafted the manuscript. All authors have revised the manuscript accordingly. All authors read and approved the final manuscript. Conceptualization: Jialin Liu, Ke Li. Data curation: Ke Li, Jialin Liu, Qinwen Shi, Siru Liu, Yilin Xie. Formal analysis: Jialin Liu, Ke Li, Qinwen Shi. Funding acquisition: Ke Li, Jialin Liu. Investigation: Ke Li, Jialin Liu, Qinwen Shi, Siru Liu, Yilin Xie. Methodology: Jialin Liu, Ke Li, Qinwen Shi, Siru Liu. Project administration: Jialin Liu. Resources: Jialin Liu, Ke Li. Supervision: Jialin Liu, Ke Li. Validation: Jialin Liu. Writing – original draft: Jialin Liu, Ke Li, Qinwen Shi, Siru Liu, Yilin Xie. Writing – review & editing: Jialin Liu, Ke Li, Qinwen Shi, Siru Liu, Yilin Xie.

