
Improving Risk Identification of Adverse Outcomes in Chronic Heart Failure Using SMOTE+ENN and Machine Learning.

Ke Wang^1,2,3, Jing Tian^4, Chu Zheng^1,3, Hong Yang^1,3, Jia Ren^1, Chenhao Li^1,3, Qinghua Han^4, Yanbo Zhang^1,3.

Abstract

PURPOSE: This study sought to develop models that reliably identify adverse outcomes in patients with heart failure (HF) and to find strong factors that affect prognosis. PATIENTS AND METHODS: A total of 5004 qualifying cases were selected, among which 498 cases had adverse outcomes and 4506 cases were discharged after improvement. The study subjects were hospitalized patients diagnosed with HF at a regional cardiovascular hospital and the cardiology department of a medical university hospital in Shanxi Province of China between January 2014 and June 2019. The synthetic minority oversampling technique combined with edited nearest neighbors (SMOTE+ENN) was used to pre-process the unbalanced data. Traditional logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), random forest (RF), and extreme gradient boosting (XGBoost) were used to build risk identification models, and each model was repeated 100 times. Model discrimination and calibration were estimated using F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score. The best performing of the five models was used to identify the risk of adverse outcomes and evaluate the influencing factors.
RESULTS: SME-XGBoost (XGBoost trained on data pretreated with SMOTE+ENN) was the best performing model, with a mean F1-score of 0.3673 (95% confidence interval [CI]: 0.3633-0.3712), AUROC of 0.8010 (95% CI: 0.7974-0.8046), and Brier score of 0.1769 (95% CI: 0.1748-0.1789). Age, N-terminal pro-B-type natriuretic peptide, and pulmonary disease were among the most significant factors of adverse outcomes in patients with HF.
CONCLUSION: The combination of SMOTE+ENN and advanced machine learning methods effectively improved the discrimination efficacy of adverse outcomes in HF patients, accurately stratified patients at risk of adverse outcomes, and found the top factors of adverse outcomes. These models and factors emphasize the importance of health status data in determining adverse outcomes in patients with HF.


Keywords: SHAP; SMOTE+ENN; XGBoost; heart failure; machine learning

Year: 2021 PMID: 34149290 PMCID: PMC8206455 DOI: 10.2147/RMHP.S310295

Source DB: PubMed Journal: Risk Manag Healthc Policy ISSN: 1179-1594

Introduction

Heart failure (HF) is the leading cause of death in most countries in the world.1 According to reports, one in every eight deaths in the United States is due to HF.2 Recent data show that the prevalence of HF increases as the population ages, the cardiovascular risk profile of the population deteriorates, and survival rates for patients with acute cardiovascular disease improve.3,4 HF puts a heavy burden on society through the extensive use of healthcare resources. Accurately identifying the risk of adverse outcomes in HF is therefore of vital importance to patients, the medical system, and society as a whole. Thanks to the digitization of medical information, particularly the introduction of electronic medical records (EMR) and the phenomenon of big data,5 researchers now have access to massive amounts of data. Moreover, the rise of machine learning (ML) algorithms6–8 offers researchers powerful new tools. Many researchers are currently focusing on risk identification using ML; however, it has not yet achieved high accuracy for the identification of HF-related events.9 The reasons can be summarized as follows: first, medical data often show severe class imbalance, but many studies have ignored this problem, leading to predictions biased toward the majority class; second, the variable screening methods of many studies are outdated, and the influence of variables is not considered comprehensively; third, some studies have not improved model selection and parameter optimization despite the availability of advanced ML models and parameter optimization methods.
Accordingly, our aim was to use ML methods to address the limitations of previously proposed models, especially in the processing of unbalanced data, and eventually to establish an ML model that can reliably identify the risk of adverse outcomes in HF patients and find strong influencing factors, so as to provide a basis for patients, doctors, and clinical researchers to initiate subsequent treatment and intervention measures.

Patients and Methods

Study Population

The patients for this study were enrolled according to inclusion and exclusion criteria from two medical centers in Shanxi Province of China between January 2014 and June 2019. The data were obtained using the case report form of chronic heart failure (CHF-CRF) developed by our research group according to the case record content and HF guidelines.10 The CHF-CRF included the patient’s demographics, medical history, physical status and vitals, currently applied medical therapy, electrocardiogram, echocardiography, and laboratory parameters. The inclusion criteria were 1) aged ≥18 years; 2) diagnosed with HF according to the guideline for the diagnosis and treatment of HF in China (2018)11; 3) classified as New York Heart Association (NYHA) class II–IV; and 4) received HF treatment while in the hospital. Patients who had an acute cardiovascular event within 2 months prior to admission, or who were unable or refused to participate in the project for some reason, were excluded.

Data Preprocessing and Feature Selection

Some variables (also called features in ML) in this study had missing values in differing proportions. Referring to relevant studies on missing value processing,12–14 the variables with a missing percentage of no more than 30% were retained and imputed with the missForest method.15,16 The quantitative data were normalized, and the multi-categorical variables were processed by one-hot encoding.17 After initial screening by a single-factor method, recursive feature elimination (RFE) based on random forest (RF) with fivefold cross-validation (CV) was used to screen the overall features. The main idea of RFE is to repeatedly build the model, select the best feature, set the selected feature aside, and then repeat this process on the remaining features until all features have been traversed.
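The two transformations named above can be sketched in pure Python as follows. This is only an illustration: the source does not say which normalization was applied, so min-max scaling is an assumption, and the function names and sample values are hypothetical.

```python
def min_max_scale(values):
    """Rescale numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot(categories):
    """Return (sorted level names, one 0/1 indicator row per input value)."""
    levels = sorted(set(categories))
    rows = [[1 if c == level else 0 for level in levels] for c in categories]
    return levels, rows


# Hypothetical toy values, not taken from the study data.
ages = [59.0, 67.0, 76.0, 81.0]
rales = ["No", "Moist rales", "No", "Dry rales"]

scaled_ages = min_max_scale(ages)
levels, encoded = one_hot(rales)
```

Each encoded row contains exactly one 1, so a multi-categorical variable with three levels becomes three binary columns.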

Model Development

In addition to several commonly used supervised learning algorithms, namely logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), and random forest (RF),18 we introduced the extreme gradient boosting (XGBoost) algorithm, which has attracted a lot of attention in recent years owing to its computational speed, generalization ability, and high predictive performance.19,20 According to whether adverse outcomes occurred, the 5004 patients were divided into a training set, a verification set, and a test set in a 3:1:1 ratio by stratified random sampling. The training verification set (training set + verification set) and the verification set were pretreated using the synthetic minority oversampling technique combined with edited nearest neighbors (SMOTE+ENN). We used grid search with fivefold CV to optimize the hyperparameters of the ML models on the original verification set and the pretreated verification set, respectively, and then trained the ML models with the optimal hyperparameters on the original training verification set and the pretreated training verification set (details in ). Finally, the performance of each model was evaluated and compared on the test set. To obtain a more robust performance estimate, avoid reporting biased results, and limit overfitting, we repeated the holdout method 100 times with different random seeds and computed the average performance over these 100 repetitions21 (Figure 1).
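A stratified 3:1:1 holdout repeated over different seeds might look like the sketch below. It uses only the class counts reported in the paper (498 and 4506); the function name `stratified_split` and the rounding rule are assumptions, not the authors' code.

```python
import random


def stratified_split(labels, ratios=(3, 1, 1), seed=0):
    """Split sample indices into len(ratios) parts, preserving class proportions."""
    rng = random.Random(seed)
    parts = [[] for _ in ratios]
    total = sum(ratios)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        start, cum = 0, 0
        for k, r in enumerate(ratios):
            cum += r
            end = round(len(idx) * cum / total)  # cumulative cut point for this class
            parts[k].extend(idx[start:end])
            start = end
    return parts


# Class counts reported in the paper: 498 adverse outcomes, 4506 improved.
labels = [1] * 498 + [0] * 4506

# One split per random seed, mirroring the 100 repetitions of the holdout.
splits = [stratified_split(labels, seed=s) for s in range(100)]
train, valid, test = splits[0]
```

Because each class is shuffled and cut separately, every part keeps roughly the 1:9 ratio of adverse outcomes to improved cases.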

Figure 1

Architecture of the system.

SMOTE+ENN is a comprehensive sampling method proposed by Batista et al in 2004,22 which combines SMOTE with Wilson’s Edited Nearest Neighbor Rule (ENN).23 SMOTE is an over-sampling method whose main idea is to form new minority class examples by interpolating between several minority class examples that lie together. Although it can effectively improve the classification accuracy of the model, it can also generate noise samples and boundary samples. To create better defined class clusters, ENN is used as a data cleaning method that removes any example whose class label differs from the class of at least two of its three nearest neighbors. Since some majority class examples might invade the minority class space and vice versa, SMOTE+ENN reduces the possibility of overfitting introduced by synthetic examples.22
The KNN method is a popular classification method in data mining and statistics because of its simple implementation and significant classification performance.24 The idea is that if the majority of the k most similar samples (ie, the nearest neighbors in the feature space) of a sample belong to a certain category, the sample also belongs to this category, where k is usually not greater than 20. In the KNN algorithm, the selected neighbors are all objects that have been correctly classified. This method determines the category of the sample to be classified based only on the category of the nearest sample or samples.
SVM is one of the most important methods in ML and is broadly applied to image recognition and image processing.25 It classifies data according to the distance between classes in a high-dimensional space, and can satisfactorily solve problems of small sample size, nonlinearity, and high-dimensional data recognition and classification. The SVM looks for an optimal hyperplane that divides the samples observed in a multi-dimensional space into two classes. This optimal hyperplane separates the two categories with the greatest possible distance from the nearest points. The points on the margin boundary that determine the margin are the support vectors, and the separating hyperplane lies in the middle of the margin.
The RF algorithm is a scheme proposed by Breiman in 2001 for building a predictor ensemble with a set of decision trees that grow in randomly selected subspaces of the data.26 The ensemble is not just simple bagging;27 it combines the idea of bagging with feature selection. The RF classifier consists of a combination of tree classifiers, where each classifier is generated using a random vector sampled independently of the input vector, and each tree casts a vote for the most popular class to classify the input vector. Numerous studies conducted worldwide have shown that RF algorithms perform very well in classification and prediction in various fields.28
Tree boosting29 is a highly effective and widely used ML method. XGBoost is an ensemble learning algorithm based on gradient boosting theory; it is a scalable end-to-end tree boosting system proposed by Chen and Guestrin in 2016.30 Owing to its good scalability and high efficiency on large data sets, it has been widely used by data scientists and has obtained state-of-the-art results in many ML challenges in recent years. Compared with the traditional gradient boosting decision tree, XGBoost has further improved the loss function, regularization, and parallelization,31 and has achieved good results in many application scenarios for classification and regression problems.
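The two ideas behind SMOTE+ENN described in this section, interpolation between minority neighbors and removal of examples that disagree with most of their three nearest neighbors, can be sketched in pure Python as below. This is a simplified illustration on hypothetical 2-D toy data, not the imblearn implementation used in the study; all function names are assumptions, and `math.dist` requires Python 3.8 or later.

```python
import math
import random


def k_nearest(point, pool, k):
    """Indices of the k points in pool closest to point (Euclidean distance)."""
    order = sorted(range(len(pool)), key=lambda j: math.dist(point, pool[j]))
    return order[:k]


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        others = minority[:i] + minority[i + 1:]
        neighbor = others[rng.choice(k_nearest(minority[i], others, k))]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(minority[i], neighbor)))
    return synthetic


def enn(points, labels, k=3):
    """Drop every example whose label disagrees with most of its k neighbors."""
    keep = []
    for i, (p, y) in enumerate(zip(points, labels)):
        others = points[:i] + points[i + 1:]
        other_labels = labels[:i] + labels[i + 1:]
        votes = [other_labels[j] for j in k_nearest(p, others, k)]
        if sum(v != y for v in votes) < (k + 1) // 2:  # fewer than 2 of 3 disagree
            keep.append(i)
    return [points[i] for i in keep], [labels[i] for i in keep]


# Hypothetical toy data: 3 minority points and 6 majority points,
# one of which, (0.1, 0.1), sits inside the minority cluster.
minority = [(0.0, 0.0), (0.2, 0.0), (0.0, 0.2)]
majority = [(1.0, 1.0), (1.2, 1.0), (1.0, 1.2), (1.2, 1.2), (1.1, 1.1), (0.1, 0.1)]

new_points = smote(minority, n_new=3, k=2)
points = minority + new_points + majority
labels = [1] * (len(minority) + len(new_points)) + [0] * len(majority)
clean_points, clean_labels = enn(points, labels, k=3)
```

In this toy run the oversampling step doubles the minority class, and the cleaning step then removes the single majority point that lies inside the minority cluster.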

Performance Evaluation

Multiple evaluation indexes such as F1-score, the area under the receiver-operating characteristic curve (AUROC), and Brier score32 were used to comprehensively evaluate the discrimination and calibration of ML models (details in ).
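For reference, the F1-score and Brier score can be computed from their standard definitions as in the following sketch. The toy labels and probabilities are hypothetical, and the 0.5 classification threshold is an assumption for illustration only.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical toy predictions: 2 events among 10 patients.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.8, 0.3, 0.6, 0.2, 0.1, 0.1, 0.2, 0.3, 0.1, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

f1 = f1_score(y_true, y_pred)
brier = brier_score(y_true, y_prob)

# A model that always predicts the majority class never finds a true positive,
# so its F1-score is 0 regardless of how accurate it looks overall.
f1_all_negative = f1_score(y_true, [0] * len(y_true))
```

A lower Brier score indicates better calibrated probabilities, whereas a higher F1-score indicates better detection of the positive class.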

Model Interpretation and Feature Importance

We used the best-performing of the five ML models to assess the importance of each variable. Moreover, we implemented SHapley Additive exPlanations (SHAP), a recent approach to explaining the output of an ML model, to illustrate the individual feature-level impacts. In brief, SHAP is an additive feature attribution method that explains the output of the tree ensemble as a sum of individual feature contributions and is relatively consistent with human intuition.33
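The additive attribution idea can be illustrated with an exact, brute-force Shapley computation on a toy model. This sketch is not the SHAP library's tree algorithm; it enumerates all feature orderings against a single reference patient, and the model, feature values, and names are hypothetical.

```python
from itertools import permutations


def shapley_values(predict, x, baseline):
    """Exact Shapley values: average marginal contribution over all feature orders."""
    n = len(x)
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = list(baseline)
        prev = predict(current)
        for j in order:
            current[j] = x[j]  # switch feature j from the baseline to its real value
            now = predict(current)
            phi[j] += now - prev
            prev = now
    return [v / len(orders) for v in phi]


# Hypothetical risk score with an interaction term; not the paper's model.
def toy_model(features):
    age, ntprobnp, albumin = features
    return 0.02 * age + 0.5 * ntprobnp - 0.1 * albumin + 0.01 * age * ntprobnp


patient = [80.0, 3.0, 35.0]
reference = [65.0, 1.0, 43.0]
phi = shapley_values(toy_model, patient, reference)
```

The contributions in `phi` sum to the difference between the patient's score and the reference score, which is the additivity property that makes such attributions easy to read feature by feature.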

Software Packages

All operations were implemented in Python 3.6.5, and various Python modules were used to conduct the analysis. The GridSearchCV from sklearn.model_selection was used for grid search with 5-fold cross-validation. The SMOTEENN from imblearn.combine was used for SMOTE+ENN. The LogisticRegression from sklearn.linear_model was used for Logistic regression. The KNeighborsClassifier from sklearn.neighbors was used for KNN. The SVC from sklearn.svm was used for SVM. The RandomForestClassifier from sklearn.ensemble was used for RF. The XGBClassifier from xgboost.sklearn was used for XGBoost.

Results

Patient Characteristics

A total of 5004 inpatients were included in this study, including 3292 males (65.79%), with an average age of 65.73 ± 11.58 years old and 1712 females (34.21%), with an average age of 70.80 ± 10.32 years old. Among these patients, 498 patients had adverse outcomes (deterioration or death), 4506 patients improved and were discharged, and the ratio of the two types of patients was 1:9.05, which represents an imbalanced data set.

Variables Selected

After feature selection by single factor and the RFE-RF with fivefold CV, the final optimal number of features was 44 (Figure 2, Table 1) (details in ).

Figure 2

Results of feature screening by RFE-RF with fivefold CV.

Table 1

Risk Factors Selected for Adverse Outcomes in Patients with HF

Variable	Adverse outcome: No	Adverse outcome: Yes	P value	Variable	Adverse outcome: No	Adverse outcome: Yes	P value
Age (years)	67.0(59.0–76.0)	76.0(68.0–81.0)	<0.001	HDLC (mmol/L)	1.0(0.8–1.1)	1.0(0.9–1.2)	0.004
DP (mmHg)	130(120–140)	130(118–150)	0.029	LDLC (mmol/L)	2.4(1.9–2.9)	2.3(1.8–2.9)	0.008
SP (mmHg)	80(70–85)	76(70–84)	<0.001	BUN (mmol/L)	6.0(4.9–7.6)	7.0(5.4–9.41)	<0.001
Height (cm)	167.0(160.0–171.0)	165.0(160.0–170.0)	0.013	CR (μmol/L)	78.0(66.0–92.9)	91.2(74.9–115.6)	<0.001
Weight (kg)	69.0(60.0–75.0)	65.0(55.0–71.0)	<0.001	UA (μmol/L)	365.0(297.0–443.0)	403.0(324.0–502.1)	<0.001
BMI (kg/m²)	24.9(22.5–27.2)	23.4(21.1–25.9)	<0.001	K.1 (mmol/L)	4.1(3.8–4.3)	4.1(3.8–4.4)	0.007
WBC (10⁹/L)	6.6(5.5–7.9)	6.9(5.7–8.4)	0.003	NA (mmol/L)	140.0(138.0–142.0)	139.3(137.0–141.2)	<0.001
RBC (10¹²/L)	4.4(4.0–4.8)	4.2(3.8–4.6)	<0.001	CL (mmol/L)	104.0(101.8–107.0)	102.2(99.4–105.0)	<0.001
RDW (%)	13.8(13.3–14.5)	14.4(13.7–15.3)	<0.001	CYSC (mg/L)	1.1(0.9–1.3)	1.27(1.04–1.6)	<0.001
HGB (g/L)	137.0(125.0–149.0)	130.0(117.0–143.0)	<0.001	NTPROBNP	869.8(324.8–2427.7)	3072.1(1324.3–6324.1)	<0.001
NEU (10⁹/L)	4.2(3.3–5.3)	4.7(3.6–5.9)	<0.001	SG	1.0(1.0–1.0)	1.0(1.0–1.0)	0.007
N (%)	63.5(57.1–70.0)	68.5(62.3–75.1)	<0.001	Heartrate	70(62–82)	78.5(67–92)	<0.001
ALT (U/L)	19.0(13.4–29.0)	17.0(11.8–28.0)	<0.001	QRS (ms)	96(88–108)	102(90–122)	<0.001
ALB (g/L)	43.6(40–46.9)	40.8(37.0–43.8)	<0.001	QTC (ms)	431(406–462)	447(420–478)	<0.001
TBIL (μmol/L)	14.5(11.0–19.6)	15.3(11.3–21.7)	0.006	LA (mm)	38.4(36.0–42.0)	41.0(38.0–46.0)	<0.001
DBIL (μmol/L)	3.5(2.4–5.2)	4.8(3.1–6.6)	<0.001	RA (mm)	35.0(31.0–40.0)	37.8(33.0–45.0)	<0.001
x.GT (U/L)	27.0(18.1–43.7)	33.0(20.0–56.0)	<0.001	RA1 (mm)	43.0(39.0–47.0)	45.0(40.0–50.0)	<0.001
GLU (mmol/L)	5.1(4.5–6.2)	5.3(4.6–6.8)	<0.001	LVDD (mm)	52.0(47.0–58.0)	55.0(49.0–61.0)	<0.001
TG (mmol/L)	1.4(1.0–1.9)	1.2(0.9–1.6)	<0.001	EF (%)	53.0(41.0–62.0)	45.0(35.0–56.3)	<0.001
Healthcare			<0.001	NYHA			<0.001
Urban employee	2270(50.4%)	263(52.8%)		I	18(0.4%)	0(0.0%)
Urban residents	599(13.3%)	56(11.2%)		II	2025(44.9%)	96(19.3%)
Rural cooperative	1160(25.7%)	103(20.7%)		III	1696(37.6%)	193(38.8%)
Poverty relief	6(0.1%)	0(0.0%)		IV	767(17.0%)	209(42.0%)
Full public	24(0.5%)	11(2.2%)		Pulmonary			<0.001
Self-paying	142(3.2%)	31(6.2%)		No	3968(88.1%)	327(65.7%)
Other	305(6.8%)	34(6.8%)		Yes	538(11.9%)	171(34.3%)
Lung Rales			<0.001	PVS1AI			<0.001
No	3648(81.0%)	285(57.2%)		No	2507(55.6%)	179(35.9%)
Moist rales	830(18.4%)	205(41.2%)		Little	1718(38.1%)	246(49.4%)
Dry rales	28(0.6%)	8(1.6%)		Moderate	246(5.5%)	59(11.8%)
Infection			<0.001	Massive	35(0.8%)	14(2.8%)
No	4129(91.6%)	376(75.5%)
Yes	377(8.4%)	122(24.5%)

Note: Values are median (interquartile range) or n (%).


Outcomes of the ML Models

Among the evaluated ML models, SME-XGBoost yielded the highest F1-score and AUROC. The Brier score was also relatively low (Table 2). Therefore, SME-XGBoost was used as the optimal model for further study.

Table 2

Results of ML Models for the Unbalanced Data and the Data After Pretreatment with SMOTE+ENN (SME) [Mean (95% CI)]

Models	F1-Score	AUROC	Brier Score
LR	0.0000 (0.0000,0.0000)	0.7583 (0.7542,0.7624)	0.7583 (0.7542,0.7624)
KNN	0.0375 (0.0322,0.0429)	0.6721 (0.6675,0.6768)	0.0904 (0.0898,0.0909)
SVM	0.0000 (0.0000,0.0000)	0.7218 (0.7117,0.7318)	0.0869 (0.0865,0.0873)
RF	0.0000 (0.0000,0.0000)	0.7993 (0.7957,0.8030)	0.0796 (0.0793,0.0798)
XGBoost	0.3515 (0.3458,0.3572)	0.7918 (0.7879,0.7957)	0.1733 (0.1728,0.1737)
SME-LR	0.2914 (0.2891,0.2936)	0.7819 (0.7784,0.7853)	0.2801 (0.2782,0.2820)
SME-KNN	0.2667 (0.2631,0.2703)	0.6481 (0.6437,0.6525)	0.3256 (0.3230,0.3283)
SME-SVM	0.1976 (0.1922,0.2030)	0.6963 (0.6925,0.7001)	0.1632 (0.1615,0.1650)
SME-RF	0.3606 (0.3567,0.3645)	0.7983 (0.7947,0.8019)	0.1577 (0.1565,0.1588)
SME-XGBoost^b	0.3673 (0.3633,0.3712)	0.8010 (0.7974,0.8046)	0.1769 (0.1748,0.1789)
P value^a	<0.001	<0.001	<0.001

Notes: ^a P value is the result of one-way analysis of variance for the three indicators of the models. ^b After multiple comparisons by least-significant difference (LSD), SME-XGBoost is significantly different from the other models.
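Summaries of the form mean (95% CI) over repeated runs can be produced as in the sketch below. The source does not state how the intervals were computed, so the normal approximation used here is an assumption, and the AUROC values are hypothetical.

```python
import math
import statistics


def mean_ci95(values):
    """Mean and normal-approximation 95% CI: mean +/- 1.96 * sd / sqrt(n)."""
    m = statistics.mean(values)
    half = 1.96 * statistics.stdev(values) / math.sqrt(len(values))
    return m, m - half, m + half


# Hypothetical AUROC values from repeated holdout runs (not the study's raw results).
aurocs = [0.79, 0.81, 0.80, 0.82, 0.78, 0.80, 0.81, 0.79]
mean_auc, lower, upper = mean_ci95(aurocs)
```

With 100 repetitions the interval narrows roughly with the square root of the number of runs, which is one reason the intervals in Table 2 are tight.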


Categorization of Prediction Score and Risk Distributions

The best performing SME-XGBoost model was used to identify the risk of adverse outcomes in the test set. The Brier score of the model was 0.1769, indicating that the final model was well calibrated and could accurately identify patients with adverse outcomes. The patients were separated into two groups, low and high prediction scores, using the maximal Youden’s index as an optimal cut-off value (0.3739) (Figure 3A). At this cut-off, the prediction score was associated with a sensitivity and specificity of 0.798 and 0.690, respectively. The distribution plots of the patient risk sequence identified by the model showed a certain aggregation of patients who had adverse outcomes (Figure 3B), indicating that the model accurately stratified patients at low or high risk.
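Youden's index is J = sensitivity + specificity - 1, so the reported cut-off corresponds to J = 0.798 + 0.690 - 1 = 0.488. A minimal sketch of choosing the cut-off that maximizes J is given below; the scores and labels are hypothetical and the function name is an assumption.

```python
def youden_cutoff(y_true, y_score):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = None
    for cutoff in sorted(set(y_score)):
        # A patient is flagged as high risk when the score reaches the cutoff.
        tp = sum(1 for t, s in zip(y_true, y_score) if t == 1 and s >= cutoff)
        tn = sum(1 for t, s in zip(y_true, y_score) if t == 0 and s < cutoff)
        sens, spec = tp / positives, tn / negatives
        if best is None or sens + spec - 1 > best[0]:
            best = (sens + spec - 1, cutoff, sens, spec)
    return best[1:]


# Hypothetical prediction scores for 4 events and 6 non-events.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.6, 0.3, 0.8, 0.4, 0.35, 0.2, 0.1, 0.05]
cutoff, sensitivity, specificity = youden_cutoff(y_true, y_score)
```

Scanning only the observed scores is sufficient, because sensitivity and specificity change only when the cut-off crosses one of them.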

Figure 3

Categorization threshold of prediction score (A) and prediction distributions of adverse outcomes in patients with HF (B).

The SHAP plot can give physicians an intuitive understanding of the key features in the model, and it visually displays the top 20 risk factors (Figure 4). Older age; higher values of N-terminal pro-B-type natriuretic peptide (NT-proBNP), direct bilirubin (DBIL), QRS duration, creatinine (CR), heart rate, glucose (GLU), red blood cell volume distribution width (RDW), anteroposterior diameter of the right atrium (RA), and diastolic pressure (DP); and lower values of albumin (ALB), urine-specific gravity (SG), systolic pressure, red blood cells (RBC), and chloride ion concentration (CL) were associated with a higher risk probability of adverse outcomes in patients with HF. In addition, pulmonary disease (PUMONARY), a high New York Heart Association (NYHA) clinical classification, and pulmonary aortic valve regurgitation (PVSIAI-1) were also associated with a higher risk of adverse outcomes.

Figure 4

SHAP summary plots for the risk of adverse outcomes in patients with HF. The importance ranking of the top 20 risk factors with stability and interpretation using the SME-XGBoost model. The SHAP value (x-axis) is a unified index reflecting the impact of a feature in the model. In each feature importance row, the attributions of all patients to the outcome are plotted as colored dots, in which a red dot represents a high risk value and a blue dot represents a low risk value.

Discussion

HF damages quality of life more than almost any other chronic disease.4 Accurate identification of prognostic risks is fundamental to patient-centered care, both in selecting treatment strategies and in informing patients as a foundation for shared decision making.32 Although published reports abound with different models identifying the risk of either mortality or hospitalization in patients with HF,34 the present study extends this knowledge in several important ways.
First, most standard algorithms assume or expect balanced class distributions or equal misclassification costs. When presented with imbalanced data sets, these algorithms fail to properly represent the distributive characteristics of the data and thus provide unfavorable accuracies across the classes of the data.35 Unfortunately, in the field of biomedicine, unbalanced data are ubiquitous, as the number of healthy people for whom medical data have been collected is often much larger than that of unhealthy ones. This poses new challenges in exploring disease risk identification models. If the problem of class imbalance is ignored, a risk identification model built on an imbalanced data set tends to achieve a high accuracy rate for the majority class while ignoring the minority class. In practice, the F1-score of such models is very close or even equal to 0, which indicates that the ability of the model to identify true positive outcomes is very poor; this was confirmed in our study (Table 2). Studies have shown that for several base classifiers, a balanced data set provides improved overall classification performance compared with an imbalanced data set.36,37 Thus, it is essential to use an effective preprocessing method to deal with imbalance before modeling so as to improve the accuracy of the model.38 In some reports, SMOTE is described as a typical oversampling technique that can effectively deal with imbalanced data. However, it introduces noise and other problems that affect classification accuracy.39 Our study extends this knowledge in an effective way. We used SMOTE+ENN to preprocess the data. In addition to addressing the data imbalance, this method also mitigated the tendency of the SMOTE algorithm to produce overlapping data and noise. The performance of each model constructed on the data processed by SMOTE+ENN improved significantly in the study, particularly for the F1-score, an indicator that reflects the detection rate of positive events. These results show that SMOTE+ENN can effectively address the classification bias caused by unbalanced data and provide a reference for future classification and prediction research on imbalanced data.
Second, most of the previous models were developed using traditional statistical approaches, whereas newer alternatives, such as ML-based models, have remained underused.40 Advanced statistical tools and ML methods can improve on the risk identification ability of traditional statistical techniques in various ways.41 In our study, in addition to the advanced ML model, other ML techniques that have been shown to effectively improve the performance of risk identification models were also used, such as missing value imputation based on missForest, feature selection based on RFECV, and hyperparameter optimization based on GridSearchCV. Among the evaluated models, SME-XGBoost demonstrated the best performance, and this algorithm was used to evaluate the impact factors. XGBoost combined with SMOTE+ENN forms the foundation for future testing of clinical utility with more accurate risk stratification of patients’ care and outcomes.
Third, this study found that models constructed from data collected with the CHF-CRF can accurately identify the risk of adverse outcomes. If combined with rigorous clinical trials, better risk identification results can be obtained, which is the next step in our research.
Fourth, although many ML models can provide the importance of variables, they have difficulty explaining whether variables increase or decrease the occurrence of outcomes. Meanwhile, the lack of intuitive understanding of ML models among clinicians is one of the major obstacles to the implementation of ML in the medical field.42 In our study, we employed ML methods to account for feature importance in specific domains, applied a visual interpretation of the importance of each feature, and compared the accuracy of different ML models in identifying the risk of adverse outcomes in patients with HF. The study ultimately included 44 variables. The majority of them are routinely assessed during the management of HF and are therefore readily available from the EMR. We found that age, systolic pressure, creatinine, NYHA class, and NT-proBNP were important factors of adverse outcomes, which is consistent with the results of a recent systematic review of 117 HF predictive models.43 The importance of these factors has also been confirmed in other studies.32,44,45 However, to the best of our knowledge, several highly important factors of adverse outcomes in the present study, such as pulmonary disease, albumin, DBIL, QRS, SG, and CL, were not reported in previous studies. This suggests that these factors deserve more attention in the future, and it provides a new basis for future studies of the prognosis of HF. In addition, some investigators found that sex, sodium, diabetes, blood urea nitrogen, hemoglobin, ejection fraction, angiotensin-converting enzyme inhibitor treatment, and left ventricular systolic dysfunction had a significant impact on adverse outcomes in patients with HF,40,42,45 but these factors did not show a strong influence in this study.

Limitations and Development

First, this was a retrospective study without follow-up of patients, and all patient information was collected in Shanxi Province, meaning the data may carry a certain bias. In future work, we will expand the scope of data collection, make full use of the advantages of EMR information, and carry out patient follow-up to incorporate a time factor. Meanwhile, we will collect more data from different hospitals and regions, and use data from different regions for external validation of this model. Second, the information collected in this study was structured data. Further research is needed to mine unstructured information and to add imaging information, biomarkers, environmental factors, lifestyle habits, and other factors to improve prediction. Third, this research addressed the problem of data imbalance at the data level. The next step is to combine this with approaches at the algorithm level. Fourth, although this study achieved good results, there is still room for further improvement. With the rapid development of artificial intelligence, deep learning has been applied to the construction of medical models. Future research will introduce deep learning to predict the prognosis of HF and combine more extensive data and information to conduct research at different levels.

Conclusions

Combining SMOTE+ENN and advanced ML methods effectively improved the risk identification of adverse outcomes in patients with HF, and accurately stratified patients at risk of adverse outcomes. This method can be used to address the problem of class imbalance in medical data modeling in the future. Moreover, the ML model and SHAP plot can provide intuitive explanations of what led to a patient’s predicted risk, thus helping clinicians better understand the decision-making process for disease severity assessment. The features can provide a reference for intervention, and the models can be used by clinicians as an important tool for identifying high-risk patients.

22 in total

1. Age affects the prognostic impact of diabetes in chronic heart failure.

Authors: Filipe Manuel Cunha; Joana Pereira; Ana Ribeiro; Marta Amorim; Sérgio Silva; José Paulo Araújo; Adelino Leite-Moreira; Paulo Bettencourt; Patrícia Lourenço
Journal: Acta Diabetol Date: 2018-01-08 Impact factor: 4.280

2. Efficient kNN Classification With Different Numbers of Nearest Neighbors.

Authors: Shichao Zhang; Xuelong Li; Ming Zong; Xiaofeng Zhu; Ruili Wang
Journal: IEEE Trans Neural Netw Learn Syst Date: 2017-04-12 Impact factor: 10.451

3. [Chinese guidelines for the diagnosis and treatment of heart failure 2018].

Authors:
Journal: Zhonghua Xin Xue Guan Bing Za Zhi Date: 2018-10-24

4. 2016 ACC/AHA/HFSA Focused Update on New Pharmacological Therapy for Heart Failure: An Update of the 2013 ACCF/AHA Guideline for the Management of Heart Failure: A Report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Heart Failure Society of America.

Authors: Clyde W Yancy; Mariell Jessup; Biykem Bozkurt; Javed Butler; Donald E Casey; Monica M Colvin; Mark H Drazner; Gerasimos Filippatos; Gregg C Fonarow; Michael M Givertz; Steven M Hollenberg; JoAnn Lindenfeld; Frederick A Masoudi; Patrick E McBride; Pamela N Peterson; Lynne Warner Stevenson; Cheryl Westlake
Journal: J Am Coll Cardiol Date: 2016-05-20 Impact factor: 24.094

5. Machine Learning Prediction of Mortality and Hospitalization in Heart Failure With Preserved Ejection Fraction.

Authors: Suveen Angraal; Bobak J Mortazavi; Aakriti Gupta; Rohan Khera; Tariq Ahmad; Nihar R Desai; Daniel L Jacoby; Frederick A Masoudi; John A Spertus; Harlan M Krumholz
Journal: JACC Heart Fail Date: 2019-10-09 Impact factor: 12.035

6. Predicting diabetes-related hospitalizations based on electronic health records.

Authors: Theodora S Brisimi; Tingting Xu; Taiyao Wang; Wuyang Dai; Ioannis Ch Paschalidis
Journal: Stat Methods Med Res Date: 2018-11-25 Impact factor: 3.021

7. Predictors of cause-specific hospital readmission in patients with heart failure.

Authors: Zaruhi V Babayan; Robert L McNamara; Nagaprasad Nagajothi; Edward K Kasper; Haroutune K Armenian; Neil R Powe; Kenneth L Baughman; João A C Lima
Journal: Clin Cardiol Date: 2003-09 Impact factor: 2.882

8. When and how should multiple imputation be used for handling missing data in randomised clinical trials - a practical guide with flowcharts.

Authors: Janus Christian Jakobsen; Christian Gluud; Jørn Wetterslev; Per Winkel
Journal: BMC Med Res Methodol Date: 2017-12-06 Impact factor: 4.615

9. Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone.

Authors: Davide Chicco; Giuseppe Jurman
Journal: BMC Med Inform Decis Mak Date: 2020-02-03 Impact factor: 2.796

10. Development and Internal Validation of Machine Learning Algorithms for Preoperative Survival Prediction of Extremity Metastatic Disease.

Authors: Quirina C B S Thio; Aditya V Karhade; Bas JJ Bindels; Paul T Ogink; Jos A M Bramer; Marco L Ferrone; Santiago Lozano Calderón; Kevin A Raskin; Joseph H Schwab
Journal: Clin Orthop Relat Res Date: 2020-02 Impact factor: 4.755

1 in total

1. Using random forest algorithm for glomerular and tubular injury diagnosis.

Authors: Wenzhu Song; Xiaoshuang Zhou; Qi Duan; Qian Wang; Yaheng Li; Aizhong Li; Wenjing Zhou; Lin Sun; Lixia Qiu; Rongshan Li; Yafeng Li
Journal: Front Med (Lausanne) Date: 2022-07-28
