
Machine learning approach to predict acute kidney injury after liver surgery.

Jun-Feng Dong¹, Qiang Xue², Ting Chen³, Yuan-Yu Zhao¹, Hong Fu¹, Wen-Yuan Guo¹, Jun-Song Ji⁴.

Abstract

BACKGROUND: Acute kidney injury (AKI) after surgery appears to increase the risk of death in patients with liver cancer. In recent years, machine learning algorithms have been shown to offer higher discriminative efficiency than classical statistical analysis.
AIM: To develop prediction models for AKI after liver cancer resection using machine learning techniques.
METHODS: We screened a total of 2450 patients who had undergone primary hepatocellular carcinoma resection at Changzheng Hospital, Shanghai City, China, from January 1, 2015 to August 31, 2020. The AKI definition used was consistent with the Kidney Disease: Improving Global Outcomes (KDIGO) criteria. We included in our analysis preoperative data such as demographic characteristics, laboratory findings, comorbidities, and medication, as well as perioperative data such as duration of surgery. Computerized algorithms used for model development included logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and decision tree (DT). Features were also ranked by importance according to their contribution to model development.
RESULTS: AKI events occurred in 296 patients (12.1%) within 7 d after surgery. Among the original models based on machine learning techniques, the RF algorithm had optimal discrimination with an area under the curve value of 0.92, compared to 0.87 for XGBoost, 0.90 for DT, 0.90 for SVM, and 0.85 for LR. The RF algorithm also had the highest concordance-index (0.86) and the lowest Brier score (0.076). The variable that contributed the most in the RF algorithm was age, followed by cholesterol, and surgery time.
CONCLUSION: Machine learning algorithms are highly effective in discriminating patients at high risk of developing AKI. The successful application of machine learning models may help guide clinical decisions and help improve the long-term prognosis of patients.
©The Author(s) 2021. Published by Baishideng Publishing Group Inc. All rights reserved.


Keywords: Acute kidney injury; Liver cancer; Machine learning; Prediction; Surgery

Year: 2021 PMID: 35071556 PMCID: PMC8717516 DOI: 10.12998/wjcc.v9.i36.11255

Source DB: PubMed Journal: World J Clin Cases ISSN: 2307-8960 Impact factor: 1.337

Core Tip: Acute kidney injury (AKI) is a relatively common complication after liver surgery and has a negative impact on long-term patient prognosis. Early detection and timely intervention are key in order to minimize the negative impact of AKI. Machine learning has become increasingly better integrated with clinical medicine. In our retrospective study, we established a real-time prediction model based on machine learning algorithms. The final models showed high power to discriminate AKI events.

INTRODUCTION

Liver surgery-associated acute kidney injury (LSA-AKI) is a relatively common postoperative complication in patients with liver cancer. LSA-AKI impairs postoperative recovery and increases long-term patient mortality[1]. The incidence of AKI has been reported to be between 15% and 50% in patients with liver cancer who undergo surgery[2]. However, in clinical practice, AKI events are often underdiagnosed[3]. Many studies have investigated AKI-associated risk factors, and several classical scoring systems for AKI have emerged[4,5]. Nevertheless, potential non-linear relationships among the variables, and between the variables and the outcome, can compromise the predictive performance of such models. Moreover, traditional multivariable linear analysis methods limit the number of potentially clinically significant variables that can be included[6]. In contrast, machine learning techniques are restricted neither to linear relationships nor in the number of variables included in the analysis, and may therefore offer better predictive performance. Machine learning comprises computer algorithm-based techniques that can efficiently process clinical data to solve classification or regression problems[7,8]. With the continuous expansion of artificial intelligence (AI) techniques, machine learning and clinical medicine increasingly overlap, as illustrated by numerous studies combining the two[9]. In clinical medicine, machine learning has demonstrated its value in analyzing postoperative complications and long-term outcomes owing to its powerful data processing capabilities[10-13]. For example, machine learning performed better than traditional regression models at screening patients at high risk of sepsis[14]. Moreover, in prior prospective evaluations of AKI events, a machine learning-based AKI predictor outperformed physicians' predictions[15].
Machine learning has also made progress in critical care medicine[16], and has been shown to be valuable in the emergency department[17] and in iconography[18]. In the era of big data, the combination of machine learning and electronic medical records can provide more advanced technical support for the clinical management of AKI patients[19]. AKI predictive models based on big data and artificial intelligence are potentially reliable tools to individually and prospectively monitor the condition of each patient and help support clinical decisions accordingly[20,21]. In our research, machine learning algorithms were used to develop the LSA-AKI models, with appropriate validation and evaluation of the models' performance.

MATERIALS AND METHODS

Study population

A total of 2450 patients who had undergone primary hepatocellular carcinoma resection at Changzheng Hospital, Shanghai City, China, from January 1, 2015 to August 31, 2020 were screened (Figure 1). The study was approved by the Ethics Committee of Navy Medical University, with a waiver of informed consent.

Figure 1

Patient selection and analysis. A total of 3218 patients who underwent liver cancer resection were initially considered. Of these, 768 were excluded based on the exclusion criteria, and the remaining 2450 patients were included in the study (data set). The data set was divided into a training set and a test set. First, the model was fitted to the training set and its parameters were tuned. Then, the model was validated on the test set.

Data collection

AKI was defined according to the 2012 Kidney Disease: Improving Global Outcomes (KDIGO) criteria as either: (1) An increase in serum creatinine of more than 50% within 7 d after surgery; or (2) An increase in serum creatinine of more than 0.3 mg/dL within 48 h after surgery. The preoperative serum creatinine served as the baseline value. We included in our analysis preoperative data such as demographic characteristics, laboratory findings, comorbidities, and medication, as well as perioperative data such as duration of surgery. The baseline characteristics included age, gender, and dyslipidemia. Data on tumor characteristics such as alpha-fetoprotein (AFP) and tumor size were also collected. Laboratory measurements included hemoglobin, serum creatinine, and cholesterol. Perioperative variables included the use of blood products and surgery duration.
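The creatinine-based definition above can be illustrated with a minimal sketch. It assumes that meeting either criterion is sufficient to label a patient as AKI (as in the KDIGO guideline), and that postoperative readings arrive as (hours after surgery, creatinine in mg/dL) pairs. The function name and the data layout are hypothetical and are not taken from the study.

```python
from typing import List, Tuple


def has_aki(baseline_scr: float, postop: List[Tuple[float, float]]) -> bool:
    """Return True if any postoperative reading meets either criterion.

    baseline_scr: preoperative serum creatinine (mg/dL).
    postop: (hours_after_surgery, serum_creatinine_mg_dl) pairs (assumed layout).
    """
    for hours, scr in postop:
        # Relative criterion: more than 50% above baseline within 7 d (168 h).
        if hours <= 7 * 24 and scr > 1.5 * baseline_scr:
            return True
        # Absolute criterion: more than 0.3 mg/dL above baseline within 48 h.
        if hours <= 48 and scr - baseline_scr > 0.3:
            return True
    return False


# Illustrative readings only, not study data.
print(has_aki(0.9, [(24, 1.25)]))   # absolute rise of 0.35 mg/dL within 48 h
print(has_aki(0.9, [(100, 1.25)]))  # same rise, but too late and below +50%
```

The second call shows why the time windows matter: the same creatinine value counts at 24 h but not at 100 h.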

Statistical analysis

Python version 3.6 and Scikit-learn package (https://github.com/scikit-learn/scikit-learn) were used for development of the model. Patients were randomly assigned to the training and the test sets at a ratio of 7:3. The training set was used for model development and optimization, while the test set was used for model validation and evaluation.
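A random 7:3 assignment of this kind can be sketched as follows. The text does not state which utility or random seed was used, so this standard-library version with a hypothetical `split_7_3` helper and an arbitrary seed only illustrates the idea and is not the study's code.

```python
import random


def split_7_3(patient_ids, seed=0):
    """Shuffle the IDs with a fixed seed and cut the list at 70%."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # seed value is an arbitrary assumption
    cut = len(ids) * 7 // 10          # integer arithmetic avoids float rounding
    return ids[:cut], ids[cut:]


train_ids, test_ids = split_7_3(range(2450))
print(len(train_ids), len(test_ids))  # 1715 735
```

With 2450 patients the cut falls at 1715, which matches the training and test set sizes reported in Table 1.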

Machine learning techniques

We used several mature machine learning algorithms for modelling: logistic regression (LR), support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and decision tree (DT). The LR model estimates regression coefficients by maximum likelihood and uses them to calculate the probability of the observed endpoint. The DT, RF, and XGBoost techniques are tree-based algorithms, in which tree-like models combine successive splits on the input variables to reach a prediction (Figure 2). Feature importance was ranked according to the mean decrease in the Gini index[22]. SVM, a binary classifier introduced by Vapnik[23], assigns labelled samples to one side of a separating hyperplane according to the input variable characteristics[24]. In this study, we used the five machine learning algorithms described above to predict whether a patient developed AKI within 7 d after liver cancer resection.

Figure 2

Tree-like algorithm. Tree-like modelling can help analysis to reach the best prediction decision. Classification results for acute kidney injury (AKI) and non-AKI are shown in blue and orange, respectively. The smaller the Gini index, the darker the color. BMI: Body mass index; WBC: White blood cell; HGB: Hemoglobin.
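The Gini index used for node coloring and for ranking features can be made concrete with a short sketch. It uses the standard definition of Gini impurity, and the decrease produced by a split is the parent impurity minus the size-weighted impurity of the two children. The function names and the toy labels are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity decrease achieved by splitting parent into left and right."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Toy node with 4 AKI (1) and 4 non-AKI (0) samples; values are illustrative.
parent = [1, 1, 1, 1, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 1]
print(gini(parent))                        # 0.5
print(gini_decrease(parent, left, right))  # 0.125
```

Averaging such decreases over all splits that use a given variable, across all trees, gives the mean decrease in the Gini index on which an importance ranking is based.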

Performance evaluation

The area under the curve (AUC) in the receiver operating characteristic curve was applied to show the RF model performance. The greater the AUC, the better the predictive performance. Additionally, the concordance index (C-index)[25] and the Brier score (BS)[26] were measured to gauge the model’s discriminatory ability. A high C-index and a low BS suggest superior predictive performance. The optimal hyperparameters were identified in a 10-fold cross-validation to avoid the overfitting pitfall during model development.
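Two of the metrics named above can be written out in a few lines. This is a minimal sketch with hypothetical function names and toy predictions: the AUC is computed as the fraction of (AKI, non-AKI) pairs in which the AKI patient receives the higher predicted risk, with ties counted as one half, and the Brier score is the mean squared difference between predicted probability and observed outcome.

```python
def auc(y_true, y_score):
    """AUC as the probability that a positive case outranks a negative case."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Toy labels and predicted risks, for illustration only.
y = [0, 0, 1, 1]
p = [0.1, 0.4, 0.35, 0.8]
print(auc(y, p))          # 0.75
print(brier_score(y, p))  # approximately 0.158
```

In the toy data, three of the four positive-negative pairs are ordered correctly, which gives an AUC of 0.75. A model with perfect, confident predictions would reach an AUC of 1 and a Brier score of 0.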

RESULTS

Patient characteristics

A total of 2450 cases were included in our analysis. The age of the population was 54 ± 10.5 years (mean ± SD). The majority were men, accounting for 81.3% (1992/2450) of the population. Tumor-associated information included tumor size (ranging from 0.8 cm to 8.3 cm) and AFP, a tumor marker specific to liver cancer (ranging from 483 to 43203). Dyslipidemia was present in 23.9% (586/2450) of the patients and diabetes mellitus in 7.8% (190/2450); of the patients with diabetes, 48.4% (92/190) were currently receiving insulin. Oral beta blockers had been prescribed to 13.2% (324/2450) of the patients, and 8.1% (198/2450) were on aspirin. Table 1 shows the baseline characteristics in the training and the test sets, and confirms that there were no statistically significant differences between the two sets.

Table 1

Patient characteristics

Variables	Training set	Test set	P value
Patient population, n	1715	735
Age (yr)	55 (45-65)	54 (44-66)	0.323
Male, n (%)	1390 (81.0)	602 (81.9)	0.307
BMI (kg/m²)	24.6 (17.1-29.8)	24.9 (17.3-28.9)	0.956
Tumor size (cm)	4.5 (0.9-7.8)	4.8 (0.8-8.3)	0.283
AFP	8301 (489-35203)	8842 (503-43203)	0.058
WBC (× 10³/µL)	7.3 (3.5-13.8)	7.5 (3.3-15.8)	0.128
Hemoglobin (g/dL)	13.0 (10.8-15.6)	12.7 (10.5-16.5)	0.460
PLT (× 10³/µL)	168 (102-245)	175 (113-260)	0.156
Creatinine (mg/dL)	0.92 (0.71-1.16)	0.90 (0.70-1.15)	0.128
ALB (g/dL)	3.8 (3.3-4.4)	3.7 (3.2-4.3)	0.603
AST (IU/L)	36.1 (6.3-163.5)	42.4 (5.8-173.4)	0.096
Diabetes mellitus, n (%)	109 (6.4)	81 (11.0)	0.098
Dyslipidemia, n (%)	395 (23.0)	191 (26.0)	0.063
ALT (IU/L)	39.8 (8.3-178.5)	42.3 (6.5-169.8)	0.132
Glucose (mg/dL)	11.8 (5.8-18.3)	12.5 (6.3-19.8)	0.285
Cholesterol (mg/dL)	162.2 (135.8-198.3)	168.0 (130.0-198.3)	0.323
PRBC (units)	0.5 (0.0-3.0)	0.8 (0.0-3.0)	0.112
Crystalloid (mL)	2318.8 (1500-3500)	2218 (1500-4000)	0.994
Surgery time (min)	278 (198-363)	285 (202-387)	0.856
Beta blockers, n (%)	257 (15.0)	67 (9.1)	0.155
Aspirin, n (%)	152 (8.9)	46 (6.3)	0.183
RAAS blocker, n (%)	91 (5.3)	61 (8.3)	0.360
Insulin, n (%)	48 (2.8)	44 (6.0)	0.059
Systolic blood pressure	113 (88-154.8)	118 (95-165.5)	0.658
Diastolic blood pressure	75 (55-84)	77 (58-89)	0.537
Mean arterial pressure	93 (71-119)	108 (68-121)	0.437

PLT: Platelet; AFP: Alpha-fetoprotein; WBC: White blood cell; BMI: Body mass index; ALB: Albumin; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; PRBC: Packed red blood cell; RAAS: Renin-angiotensin-aldosterone system.


AKI morbidity

Serum creatinine fluctuations were continuously monitored after the operation and were compared with the preoperative baseline values. Our results indicate that an LSA-AKI event occurred in 296 patients (12.1%) within 7 d after surgery. The incidence of AKI in the training set and test set was 11.5% (198/1715) and 13.3% (98/735), respectively.
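The reported incidences can be checked directly from the counts given in the text. The snippet below is only an arithmetic check; the helper name `pct` is hypothetical.

```python
def pct(events, total):
    """Percentage rounded to one decimal place."""
    return round(100 * events / total, 1)


overall = pct(296, 2450)   # whole cohort
training = pct(198, 1715)  # training set
test = pct(98, 735)        # test set
print(overall, training, test)  # 12.1 11.5 13.3

# The two subsets add up to the whole cohort.
print(198 + 98 == 296, 1715 + 735 == 2450)  # True True
```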

Measures of effectiveness

The LR, SVM, RF, XGBoost, and DT models were developed to predict postoperative AKI events. Table 2 and Figure 3 show the performance of the five machine learning models. The RF technique had the largest AUC (0.92), whereas the LR technique had the smallest (0.85). Table 2 also shows the C-index and the BS of the five models. As expected, the machine learning models had a high C-index and a small BS for the outcome of interest, AKI. In particular, the RF model had the highest C-index (0.86, shared with SVM) and the lowest BS (0.076).

Table 2

Model performance (Concordance-index, Brier score, and area under the curve)

Machine learning models	Concordance-index	Brier score	AUC
Logistic regression	0.84	0.078	0.85
Support vector machine	0.86	0.083	0.90
Random forest	0.86	0.076	0.92
Extreme gradient boosting	0.80	0.083	0.87
Decision tree	0.83	0.085	0.90

AUC: Area under the curve.

Figure 3

Areas under the receiver operating characteristic curve. LR: Logistic regression; SVM: Support vector machine; RF: Random forest; XGboost: Extreme gradient boosting; DT: Decision tree; AUC: Area under the curve.

Tree structure

Figure 2 depicts a tree-like algorithm processing variables to classify the sample. Each sample flowed through the tree and was split at each node according to the value of one variable. Samples in the training set continued to branch out according to the classification results. Each node in the decision tree was assigned an entropy value and a Gini index. In the random forest, the final prediction was determined by the majority vote of the individual decision trees, with the importance of each variable ranked according to the Gini index.
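The majority-vote step described above can be sketched as follows. The per-tree predictions are made-up values and the function names are hypothetical. Note also that some implementations, scikit-learn's random forest among them, average the trees' predicted probabilities rather than counting hard votes, so this sketch shows the principle rather than any particular library's behavior.

```python
from collections import Counter


def majority_vote(tree_predictions):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_predictions).most_common(1)[0][0]


def positive_fraction(tree_predictions):
    """Share of trees voting for the positive class (1 = AKI)."""
    return sum(tree_predictions) / len(tree_predictions)


# Hypothetical votes from five trees for one patient (1 = AKI, 0 = non-AKI).
votes = [1, 0, 1, 1, 0]
print(majority_vote(votes))      # 1
print(positive_fraction(votes))  # 0.6
```

Using an odd number of trees avoids ties in the binary case. The positive fraction can serve as a rough risk estimate when a probability rather than a label is needed.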

Importance rank

The variable importance ranking of the RF algorithm is shown in Figure 4, which lists the 18 most important variables. Variables were ranked according to the mean decrease in the Gini index. The top five contributing variables to the model were age, cholesterol, surgery time, serum creatinine, and platelet count.

Figure 4

Ranked variable values of the random forest algorithm. PLT: Platelet; AFP: Alpha-fetoprotein; WBC: White blood cell; BMI: Body mass index; CR: Creatinine clearance; HB: Hemoglobin; ALB: Albumin; ALT: Alanine aminotransferase; AST: Aspartate aminotransferase; SBP: Systolic blood pressure; DM: Diabetes mellitus.

DISCUSSION

Early detection and timely intervention are key to efficient treatment of AKI events[27]. Therefore, it is a clinical priority to develop risk assessment systems to screen the high-risk population so that timely and effective interventions can be conducted. However, due to the multifactorial nature and the multilinear relationships of LSA-AKI, previous risk scores have been inefficient in predicting AKI episodes[28]. In addition, such risk scores were commonly developed from a small set of preoperative clinical variables. Nevertheless, other factors, including intraoperative events such as surgery duration and body fluid loss, may also actively impact the development of LSA-AKI. With the advent of big data, machine learning holds great potential in the field of AKI research owing to its strong data processing capabilities[19]. Therefore, machine learning models may be powerful tools for AKI risk stratification and prediction[20]. A clinical decision support system based on machine learning has many advantages, such as helping save clinicians' time and energy, increasing the efficiency of diagnosis and treatment, and improving real-time monitoring of patients' conditions[29].
In this retrospective study, we developed, validated, and evaluated several LSA-AKI machine learning models based on preoperative and intraoperative features. It is important to note that we included intraoperative variables in the models to better reflect the real physiological conditions during liver surgery. Existing risk scores for predicting AKI events after liver surgery include the Kalisvaart Score[30] and the Park Score[31]. These risk scoring systems were developed with traditional regression analysis methods, with AUC values ranging from 0.70 to 0.85. In our study, the prediction models established by a machine learning approach had a high discriminatory power, with AUC values ranging from 0.85 to 0.92. The RF classifier had the largest AUC (0.92), whereas the LR classifier had the smallest (0.85). These models, derived from machine learning algorithms, showed an apparent improvement in LSA-AKI discrimination ability compared with that of the Kalisvaart and the Park Scores. The first report of machine learning on LSA-AKI indicated that XGBoost achieved a high AUC for predicting LSA-AKI events [0.90, 95% confidence interval (CI): 0.86-0.93], whereas the AUC of LR analysis was 0.61 (95%CI: 0.56-0.66)[6]. These results suggest that the traditional regression model does not perform better than machine learning models in predictive analysis, which may result from its linear assumption during data analysis[6].
Figure 4 lists the factors involved in the development of the RF model and the contribution ranking of the related variables. These ranked variables may be potential risk factors for the development of LSA-AKI events. It is worth noting that the ranking did not include some previously known risk factors, such as intraoperative urine output. In addition, several factors previously thought to be unrelated to AKI development, such as AFP, appear to be relevant. These findings might prompt new research ideas and a better understanding of AKI events.
There are also some limitations in our study. First, this was a single-center retrospective study. Due to the relatively small sample size and the lack of external validation, our results may not be generalizable. Second, including all variables in the process of data collection is a very challenging task, and therefore some potentially relevant factors may have been ignored. Finally, most of the input features were entered manually. We are still working on developing a real-time automated electronic health record algorithm that could collect perioperative patient information from a variety of data sources. With these new technologies, predictive models based on machine learning may have the potential to change clinical practice.

CONCLUSION

LSA-AKI is a postoperative complication with high incidence in patients with liver cancer. LSA-AKI has a negative impact on the postoperative recovery of patients and results in increased long-term mortality. As LSA-AKI is associated with a variety of factors, and given the complex nonlinear relationships among variables and outcomes, it is challenging for traditional regression analysis to predict its occurrence. In recent years, the intersection of machine learning and clinical medicine has allowed early detection of AKI. Our model, based on machine learning approaches, may be helpful for screening patients at high risk of AKI, ultimately helping to guide clinical decisions and facilitate prospective interventions for high-risk individuals. Future research should attempt to further improve the prediction of LSA-AKI by combining AKI biomarkers such as IL-18, NGAL, and KIM-1[32] with machine learning.

ARTICLE HIGHLIGHTS

Research background

Recently, machine learning has proven helpful in the interpretation of medical results and has potential for helping guide diagnosis and treatment, ultimately improving patient outcomes.

Research motivation

Machine learning methods to predict acute kidney injury (AKI) events remain largely unexplored.

Research objectives

We aimed to develop prediction models for AKI after liver cancer resection based on machine learning techniques.

Research methods

A total of 2450 patients who had undergone primary hepatocellular carcinoma resection at Changzheng Hospital, Shanghai City, China, from January 1, 2015 to August 31, 2020 were screened. Patients were randomly assigned to the training and the test sets at a ratio of 7:3. The training set was used for model development and optimization, while the test set was used for model validation and evaluation.

Research results

AKI events occurred in 296 patients (12.1%) after surgery. Among the original models based on machine learning techniques, the random forest (RF) algorithm had optimal discrimination with an area under the curve value of 0.92, compared to 0.87 for extreme gradient boosting, 0.90 for decision tree, 0.90 for support vector machine, and 0.85 for logistic regression. The RF algorithm also had the highest concordance-index (0.86) and the lowest Brier score (0.076). The variables that contributed the most in the RF algorithm were age, cholesterol, and surgery time.

Research conclusions

Machine learning technology can accurately predict AKI after hepatectomy.

Research perspectives

In the era of personalized medicine, our model based on machine learning can discriminate patients at high risk for AKI, thus helping guide clinical decisions and facilitating prospective interventions for high-risk individuals.

32 in total

1. Relation of aortic wall thickness and distensibility to cardiovascular risk factors (from the Multi-Ethnic Study of Atherosclerosis [MESA]).

Authors: Ashkan A Malayeri; Shunsuke Natori; Hossein Bahrami; Alain G Bertoni; Richard Kronmal; João A C Lima; David A Bluemke
Journal: Am J Cardiol Date: 2008-05-24 Impact factor: 2.778

2. Beyond Biomarkers: Machine Learning in Diagnosing Acute Kidney Injury.

Authors: Bruce A Molitoris
Journal: Mayo Clin Proc Date: 2019-05 Impact factor: 7.616

3. Preoperative estimated glomerular filtration rate and RIFLE-classified postoperative acute kidney injury predict length of stay post-coronary bypass surgery in an Australian setting.

Authors: E M Moore; J A Simpson; A Tobin; J Santamaria
Journal: Anaesth Intensive Care Date: 2010-01 Impact factor: 1.669

4. Cardiovascular Event Prediction by Machine Learning: The Multi-Ethnic Study of Atherosclerosis.

Authors: Bharath Ambale-Venkatesh; Xiaoying Yang; Colin O Wu; Kiang Liu; W Gregory Hundley; Robyn McClelland; Antoinette S Gomes; Aaron R Folsom; Steven Shea; Eliseo Guallar; David A Bluemke; João A C Lima
Journal: Circ Res Date: 2017-08-09 Impact factor: 17.367

5. Electronic Medical Record-Based Predictive Model for Acute Kidney Injury in an Acute Care Hospital.

Authors: Olga Laszczyńska; Milton Severo; Ana Azevedo
Journal: Stud Health Technol Inform Date: 2016

6. Three immunomarker support vector machines-based prognostic classifiers for stage IB non-small-cell lung cancer.

Authors: Zhi-Hua Zhu; Bing-Yu Sun; Yun Ma; Jian-Yong Shao; Hao Long; Xu Zhang; Jian-Hua Fu; Lan-Jun Zhang; Xiao-Dong Su; Qiu-Liang Wu; Peng Ling; Ming Chen; Ze-Ming Xie; Yi Hu; Tie-Hua Rong
Journal: J Clin Oncol Date: 2009-02-02 Impact factor: 44.544

7. Reduced ascending aortic strain and distensibility: earliest manifestations of vascular aging in humans.

Authors: Alban Redheuil; Wen-Chung Yu; Colin O Wu; Elie Mousseaux; Alain de Cesare; Raymond Yan; Nadjia Kachenoura; David Bluemke; Joao A C Lima
Journal: Hypertension Date: 2010-01-11 Impact factor: 10.190

8. A new model to predict acute kidney injury requiring renal replacement therapy after cardiac surgery.

Authors: Neesh Pannu; Michelle Graham; Scott Klarenbach; Steven Meyer; Teresa Kieser; Brenda Hemmelgarn; Feng Ye; Matthew James
Journal: CMAJ Date: 2016-06-13 Impact factor: 8.262

9. Enhancing the prediction of acute kidney injury risk after percutaneous coronary intervention using machine learning techniques: A retrospective cohort study.

Authors: Chenxi Huang; Karthik Murugiah; Shiwani Mahajan; Shu-Xia Li; Sanket S Dhruva; Julian S Haimovich; Yongfei Wang; Wade L Schulz; Jeffrey M Testani; Francis P Wilson; Carlos I Mena; Frederick A Masoudi; John S Rumsfeld; John A Spertus; Bobak J Mortazavi; Harlan M Krumholz
Journal: PLoS Med Date: 2018-11-27 Impact factor: 11.069

10. Diagnostics, Risk Factors, Treatment and Outcomes of Acute Kidney Injury in a New Paradigm.

Authors: Charat Thongprayoon; Panupong Hansrivijit; Karthik Kovvuru; Swetha R Kanduri; Aldo Torres-Ortiz; Prakrati Acharya; Maria L Gonzalez-Suarez; Wisit Kaewput; Tarun Bathini; Wisit Cheungpasitporn
Journal: J Clin Med Date: 2020-04-13 Impact factor: 4.241