Anemia or other comorbidities? using machine learning to reveal deeper insights into the drivers of acute coronary syndromes in hospital admitted patients.

Faisal Alsayegh¹, Moh A Alkhamis², Fatima Ali¹, Sreeja Attur¹, Nicholas M Fountain-Jones^3,4, Mohammad Zubaid¹.

Abstract

Acute coronary syndromes (ACS) are a leading cause of death worldwide, yet the diagnosis and treatment of this group of diseases represent a significant challenge for clinicians. The epidemiology of ACS is extremely complex, and the relationship between ACS and patient risk factors is typically non-linear and highly variable across the patient lifespan. Here, we aim to uncover deeper insights into the factors that shape ACS outcomes in hospitals across four Arabian Gulf countries. Further, because anemia is one of the most commonly observed comorbidities, we explored its role in the prognosis of the most prevalent ACS in-hospital outcomes (mortality, heart failure, and bleeding) in the region. We used a robust multi-algorithm interpretable machine learning (ML) pipeline and 20 relevant risk factors to fit predictive models to 4,044 patients presenting with ACS between 2012 and 2013. We found that in-hospital heart failure, followed by anemia, was the most important predictor of mortality. However, anemia was the most important predictor for both in-hospital heart failure and bleeding. For all in-hospital outcomes, anemia had remarkably non-linear relationships with both ACS outcomes and patients' baseline characteristics. With minimal statistical assumptions, our ML models had reasonable predictive performance (AUCs > 0.75) and substantially outperformed commonly used statistical and risk stratification methods. Moreover, our pipeline was able to elucidate the ACS risk of individual patients based on their unique risk factors. Fully interpretable ML approaches are rarely used in clinical settings, particularly in the Middle East, but have the potential to improve clinicians' prognostic efforts and guide policymakers in reducing the health and economic burdens of ACS worldwide.


Year: 2022 PMID: 35073375 PMCID: PMC8786175 DOI: 10.1371/journal.pone.0262997

Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240

Introduction

Cardiovascular diseases are responsible for one-third of deaths worldwide, with projected mortalities of up to 7.8 million in 2025 [1]. Besides the costs of treatment and intervention programs, premature deaths due to cardiovascular disease cause substantial global economic losses through lost productivity [2]. Effective primary prevention is often difficult due to the complexity of cardiovascular disease epidemiology and the dynamic nature of risk profiles, which are rapidly changing in response to increasing urbanization, globalization, and shifts in demography [3,4]. This is particularly true for acute coronary syndromes (ACS), an important category of cardiovascular disease that includes unstable angina and myocardial infarction. The complexity of the epidemiology of ACS poses a significant challenge to the prognostic capacities of primary and secondary care clinicians, leading to a higher frequency of negative in-hospital outcomes.

Large population-level studies provide critical insights on ACS risk factors; however, non-linear relationships and complex interactions between ACS risk factors make inference and prediction difficult. Machine learning (ML) algorithms can capture these complex relationships to build powerful predictive models that have provided important insights into the clinical epidemiology of cardiovascular diseases generally [5,6]. However, ML models are often considered ‘black boxes’ and can be difficult to interpret. Interrogating these ‘black box’ models with advances in interpretable machine learning can help gain mechanistic insights into predictions in a variety of systems (Molnar, 2018; Fountain-Jones et al.). Interpretable machine learning methods, however, are rarely used to help predict and interpret ACS risk.

ACS diagnosis and treatment are significant challenges for clinicians due to the substantial overlap in symptoms between ACS patients and non-patients [7]. Despite the availability of many ACS diagnostic tools (e.g., coronary angiography, cardiac markers, and electrocardiography), nearly two to five percent of true ACS cases are wrongly discharged from the emergency room due to false indications of non-cardiac disease [8,9]. This diagnostic error is a leading cause of ACS mortalities worldwide, with severe public health and economic implications [9]. However, the risk stratification approach [10] has significantly helped clinicians improve their diagnostic and prognostic efforts for ACS events over the past few decades. Risk stratification is defined as a formal procedure for predicting ACS events according to the individual patient's risk at the time of presentation [11], and includes many tools such as Thrombolysis in Myocardial Infarction (TIMI) [12], the Evaluation of Methods of Management of Acute Coronary Syndrome (EMMACE) [13,14], and the Global Registry of Acute Coronary Events (GRACE) [15-17]. These tools rely on calculations from data on the patient's presenting symptoms, historical information available at the time of presentation, and laboratory results [12].

Anemia is one of the most commonly observed comorbidities, with an estimated worldwide prevalence of 10%-43% in patients with ACS. Anemia can aggravate ACS outcomes due to the twin effect of the decreased overall oxygen content of the blood, leading to ischemic myocardial tissue, and the subsequently increased cardiac output needed to sustain a sufficient oxygen supply [18]. Therefore, anemia is an independent predictor of adverse cardiovascular events in patients with ACS [18]. There is some evidence that anemia is strongly associated with severe hemorrhagic complications and short- and long-term mortality in ACS [19]. Further, older ACS patients with chronic anemia typically suffer from other comorbidities such as chronic heart failure and kidney disease when compared to their counterparts with normal hemoglobin levels [20]. Anemia can also compromise or delay interventional procedures such as coronary angiography and percutaneous coronary intervention (PCI), leading to potential cardiac complications [21,22]. While research on the relationship between anemia and ACS outcomes is limited in the Middle East, the Gulf Registry of Acute Coronary Events–II (GULF RACE II) study found that nearly 28% of ACS patients were anemic at the time of admission [23]. Additionally, most of their anemic patients suffered from multiple in-hospital ACS-related complications [23].

The inherent limitations of population-based studies from registries or clinical trials may partially pose an obstacle to improving ACS diagnostic and prognostic performance. For example, inferences from such studies may not realistically reflect all patients with ACS or represent populations with special risk factors [24]. While risk stratification tools such as major adverse cardiovascular event (MACE) or TIMI scores can tailor personalized interventions based on individual patient-level predictions, they mainly depend on regression scoring systems that primarily assume linear relationships between the outcome and its predictors on a population level [11,12]. Additionally, traditional statistical linear models are susceptible to overfitting and tend to underperform with large datasets collected by registries, partially due to the high correlations between variables [25]. In contrast, ML algorithms require minimal statistical assumptions, can explore large data sets, and can accommodate thousands of variables of different varieties (e.g., genomic data, medical images). ML algorithms can also efficiently and robustly quantify complex interactions between variables, providing the ability to infer novel insights into the clinical epidemiology of ACS. Further, ML algorithms can outperform traditional statistical methods in individual-level predictions, rendering them well suited to improving clinical performance [5,6]. Yet, ML algorithms have not been widely adopted in clinical practice, particularly in countries in the Middle East where ACS is common.

Here, we apply a newly developed multi-algorithm ML ensemble pipeline to the Gulf locals with ACS events (Gulf COAST) registry to identify which factors shaped the risk of different in-hospital ACS events among four Gulf countries. More specifically, we used patient characteristics data to build interpretable predictive risk models for three in-hospital ACS events (heart failure, bleeding, and mortality) to identify and compare their unique requirements for onset in a clinical setting. Further, we explore the role of patients’ initial hemoglobin values upon admission (i.e., admission anemia) and their interactions with other relevant factors for each selected ACS event. Moreover, we extend and evaluate our models in the context of in-hospital individual-level prognosis to address the utility and limitations of interpretable ML models compared to traditional risk stratification methods.

Methods

Data source

We retrieved our data from the Gulf locals with acute coronary syndromes events (Gulf COAST) registry, which comprises 4,044 records of patients admitted with a diagnosis of ACS to 29 hospitals between January 2012 and January 2013 in Bahrain, Kuwait, Oman, and the United Arab Emirates. A detailed description of the design and implementation of the registry is available elsewhere [26]. We used the World Health Organization (WHO) definition of anemia in adults (males <13 g/dl, females <12 g/dl) in our study [27]. We selected variables (hereafter ‘features’) that were found to be significantly associated with ACS events, including variables capturing patient demographics, past medical history, medical status upon admission, and in-hospital ACS outcomes (Table 1) [28-30]. We used infarction/reinfarction, percutaneous coronary intervention (PCI), heart failure, stroke, bleeding, and mortality as independent outcomes for our predictive models (6 ML models in total). We also reused these in-hospital outcomes as independent features for predicting the risk of each of the other ACS events (e.g., using in-hospital bleeding as an independent predictor of mortality).
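
The anemia definition above can be written as a small rule. The following Python sketch is illustrative only: the study's analyses were run in R, and the function name `is_anemic` and the `"male"`/`"female"` coding are assumptions made for this example, not fields of the registry.

```python
# Illustrative sketch of the WHO adult anemia thresholds quoted in the text:
# males < 13 g/dl, females < 12 g/dl. Names and coding are hypothetical.
WHO_THRESHOLDS_G_DL = {"male": 13.0, "female": 12.0}


def is_anemic(hemoglobin_g_dl: float, sex: str) -> bool:
    """Return True when admission hemoglobin is below the sex-specific threshold."""
    return hemoglobin_g_dl < WHO_THRESHOLDS_G_DL[sex.lower()]


# The same hemoglobin value can be anemic for one sex and not the other.
example_male = is_anemic(12.5, "male")
example_female = is_anemic(12.5, "female")
```

Note that the models themselves used the continuous initial hemoglobin value as a feature; a binary flag like this one corresponds to the "Anemia (at admission)" row of Table 1.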

Table 1

Baseline characteristics of Gulf-COAST patients.

Study population	n = 4,044 (%)
Demographics and hemoglobin on admission
Country
UAE	691 (17.14)
Kuwait	1,230 (30.51)
Oman	1,481 (36.74)
Bahrain	629 (15.60)
Sex (Female)	1,354 (33.59)
Age, mean ± SD (years)	60.33 ± 12.69
Anemia (at admission)	1,713 (42.36)
Initial hemoglobin, mean ± SD (g/dl)	13.26 ± 2.06
Smoking	1,593 (39.52)
Alcohol consumption	126 (3.13)
Past Medical History
Hypertension	2,612 (64.80)
Dyslipidemia	2,277 (56.49)
Diabetes mellitus	2,179 (54.06)
Previous history of CVD^a	2,354 (58.40)
Stroke or TIA^b	290 (7.19)
Chronic renal failure	292 (7.24)
Cancer	44 (1.09)
In-hospital outcomes
Infarction/reinfarction	71 (1.76)
PCI^c	14 (0.35)
Heart Failure	521 (12.92)
Stroke	36 (0.89)
Bleeding	119 (2.95)
Mortality	167 (4.14)
Length of hospital stay, mean ± SD (days)	5.90 ± 7.36
Prevalence of In-hospital ACS^d per subtypes
LBBB MI	3/30 (10.00)
STEMI	436/997 (43.73)
NSTEMI	763/1,916 (39.82)
Unstable Angina	351/1,101 (31.88)

^a Cardiovascular Disease; ^b Transient Ischemic Attack; ^c Percutaneous Coronary Intervention; ^d Acute Coronary Syndrome.

Data processing

We used the ML pipeline proposed by Fountain-Jones et al., which constructs predictive models and compares four popular supervised algorithms: random forest (RF), support vector machine (SVM), gradient boosting (GBM), and logistic regression (LR). ML algorithms construct classification models using different approaches (see Fountain-Jones et al.), and comparing performance across algorithms is important for optimizing predictive performance. We excluded features with the largest mean absolute correlation (ρ > 0.9) and applied the ‘Boruta’ R package to further reduce the features to just those relevant for prediction, boosting the performance of our ML algorithms [31]. We controlled for class imbalance using a down-sampling procedure that randomly down-samples the majority class to match its frequency to that of the minority class (i.e., patients discharged with ACS events). We then randomly partitioned the dataset into training (80%) and testing (20%) sets and used K-fold cross-validation (K = 10) to train the ML algorithms. All of our statistical analyses were conducted in the R software environment [32].
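
The resampling steps described above (down-sampling, an 80/20 partition, and 10-fold cross-validation) can be sketched as follows. This is a minimal Python illustration, not the authors' R pipeline: the function names are hypothetical, and the order of operations (down-sampling the whole dataset before splitting) is an assumption, since the text does not say whether down-sampling was applied before the split or inside each fold. The event count (167 deaths among 4,044 patients) is taken from Table 1.

```python
import random


def downsample(labels, seed=0):
    """Return row indices with the majority class randomly cut to the minority size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    return sorted(minority + rng.sample(majority, len(minority)))


def train_test_split(indices, test_fraction=0.2, seed=0):
    """Shuffle the indices and hold out a fraction of them for testing."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def k_folds(indices, k=10):
    """Yield (train, validation) index lists; each row is validated exactly once."""
    folds = [indices[i::k] for i in range(k)]
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


# Toy labels mirroring the mortality outcome counts in Table 1.
labels = [1] * 167 + [0] * (4044 - 167)
balanced = downsample(labels)
train_idx, test_idx = train_test_split(balanced)
folds = list(k_folds(train_idx, k=10))
```

After down-sampling, the balanced set holds 334 rows (167 per class), which shows how much data this strategy discards when events are rare.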

Model training and evaluation

We trained our ML algorithms using the complete set of features for each ACS in-hospital event (Table 1). We ran the GBM, SVM, and LR algorithms using the ‘caret’ R package, while we used the ‘randomForest’ R package to run the RF algorithm [33-35]. We estimated the performance parameters of each model, including the area under the receiver operating characteristic (ROC) curve (AUC), accuracy (Acc), specificity (Sp), and sensitivity (Se), using the 10-fold cross-validation approach. These parameters were calculated using the average confusion matrix across all folds of the cross-validation. We used the 10-fold cross-validation procedure to avoid overfitting due to the use of the same data for training and validation, as well as to prevent artificial inflation of the accuracy. Default grid parameter settings were used in the training process of all algorithms. We then compared the estimated validation parameters of each model using the testing dataset to select the best-performing algorithm for predicting the probability of an in-hospital ACS event.
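
To make the performance parameters concrete, the sketch below computes accuracy, sensitivity, and specificity from a confusion matrix, and the AUC as a rank statistic. It is a hedged Python illustration with made-up labels and scores; the function names and the 0.5 classification cut-off are assumptions, and the study itself obtained these values from R packages.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, true negatives, false positives, false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def summary_metrics(y_true, y_pred):
    """Accuracy, sensitivity (event recall), and specificity (non-event recall)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def auc(y_true, scores):
    """Probability that a random event outranks a random non-event (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = in-hospital event, scores = predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 cut-off
metrics = summary_metrics(y_true, y_pred)
model_auc = auc(y_true, scores)
```

In this toy case the accuracy is 0.75 while the AUC is 0.9375. The AUC measures how well the scores rank events above non-events, so it is not the same quantity as the share of cases classified correctly at a given cut-off.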

Model interpretation

We interrogated our best-performing models in terms of feature importance, partial dependence, feature interaction strength, and the relationships between features and the ACS events for randomly selected individual patients. We used Breiman’s permutation procedure, as implemented in the ‘iml’ R package, to compute feature importance [33,36]. This method quantifies the expected loss in predictive performance (i.e., how well the algorithm classifies the occurrence of patients’ ACS events) for a pair of observations compared to the full model when the values of a specific feature have been switched [33,37]. Thus, a feature is deemed unimportant when the permutation procedure does not affect model performance. We used partial dependence (PD) plots and centred individual conditional expectation (ICE) plots to estimate the global effect of each important feature on the response and its individual effect on each observation, respectively [38]. Feature interaction strength was quantified using Friedman’s H-statistic, which accounts for the portion of variance explained by the interaction through a partial dependence decomposition procedure [39]. Finally, following a game theory approach, we calculated Shapley values (ϕ) from the final selected models. This approach quantifies individual-level predictions for randomly selected patients and the contribution of each feature to those predictions [40].
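
Two of the interpretation tools named above, permutation feature importance and centred ICE with its partial dependence average, can be sketched on a toy model. Everything here is hypothetical: `toy_risk` is an invented stand-in for a fitted classifier, the six patient rows are made up, and importance is expressed as the increase in classification error after shuffling a feature, which is one of several conventions. The study itself used the ‘iml’ R package.

```python
import random


def toy_risk(row):
    """Hypothetical model: high risk when hemoglobin <= 10 g/dl; age is ignored."""
    return 0.9 if row["hemoglobin"] <= 10 else 0.1


def classification_error(model, rows, labels):
    """Share of rows misclassified at an assumed 0.5 cut-off."""
    preds = [1 if model(r) >= 0.5 else 0 for r in rows]
    return sum(p != y for p, y in zip(preds, labels)) / len(labels)


def permutation_importance(model, rows, labels, feature, seed=0):
    """Increase in classification error after shuffling one feature's column."""
    rng = random.Random(seed)
    values = [r[feature] for r in rows]
    rng.shuffle(values)
    permuted = [{**r, feature: v} for r, v in zip(rows, values)]
    return classification_error(model, permuted, labels) - classification_error(model, rows, labels)


def centred_ice(model, rows, feature, grid):
    """One curve per patient: predictions along the grid, anchored at grid[0]."""
    curves = []
    for r in rows:
        preds = [model({**r, feature: g}) for g in grid]
        curves.append([p - preds[0] for p in preds])
    return curves


def partial_dependence(curves):
    """Point-wise average of the individual curves."""
    return [sum(col) / len(col) for col in zip(*curves)]


rows = [{"hemoglobin": h, "age": a}
        for h, a in [(8, 80), (9, 60), (13, 70), (14, 50), (15, 65), (10, 77)]]
labels = [1 if toy_risk(r) >= 0.5 else 0 for r in rows]
age_importance = permutation_importance(toy_risk, rows, labels, "age")
hb_importance = permutation_importance(toy_risk, rows, labels, "hemoglobin")
curves = centred_ice(toy_risk, rows, "hemoglobin", grid=[8, 10, 12, 14])
pd_curve = partial_dependence(curves)
```

Because the toy model ignores age, shuffling age leaves the error unchanged and its importance is zero, which matches the rule that a feature is unimportant when permutation does not affect performance. The partial dependence curve drops once hemoglobin rises above 10 g/dl, mimicking the threshold pattern reported in the Results.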

Results

For both the in-hospital mortality and bleeding models, the RF algorithm slightly outperformed the other algorithms in terms of AUC, reaching AUCs of 0.85 and 0.76 for death and bleeding events, respectively (Table 2). In contrast, the GBM algorithm slightly outperformed the other algorithms for heart failure (AUC = 0.77; Table 2). The LR model consistently had the poorest discrimination (AUCs = 0.54–0.69; Table 2) and leaned toward random prediction, particularly for heart failure and bleeding. For all in-hospital event models, LR AUCs were notably lower than those of the other ML algorithms (Table 2). Overall, the mortality model had the highest performance parameters when compared to both the heart failure and bleeding models (Table 2). Our models were not able to accurately predict PCI, stroke, or infarction/reinfarction.

Table 2

Cross-validation summary results for RF, GBM, SVM, and LR models.

Model	AUC^a ± SE^b	Accuracy (%) ± SE	Specificity (%) ± SE	Sensitivity (%) ± SE
Mortality
RF^c	0.85 ± 0.02	79.18 ± 1.17	79.21 ± 1.19	74.15 ± 3.58
GBM^d	0.81 ± 0.01	78.72 ± 2.12	78.73 ± 2.14	75.86 ± 2.95
SVM^e	0.82 ± 0.00	76.29 ± 1.70	72.28 ± 1.75	79.54 ± 2.91
LR^f	0.69 ± 0.00	60.19 ± 2.35	55.29 ± 1.99	69.71 ± 2.34
Heart Failure
RF	0.68 ± 0.00	62.24 ± 1.39	62.18 ± 1.42	65.92 ± 1.25
GBM	0.77 ± 0.01	68.76 ± 1.69	69.67 ± 1.74	70.66 ± 1.71
SVM	0.71 ± 0.02	62.55 ± 1.76	62.46 ± 1.82	67.50 ± 1.59
LR	0.58 ± 0.10	63.89 ± 2.66	61.79 ± 1.34	58.15 ± 1.88
Bleeding
RF	0.76 ± 0.01	70.57 ± 3.21	72.11 ± 3.23	57.81 ± 5.06
GBM	0.65 ± 0.02	64.58 ± 3.14	64.60 ± 3.16	58.64 ± 4.58
SVM	0.63 ± 0.00	77.06 ± 2.79	77.09 ± 1.75	66.87 ± 1.75
LR	0.54 ± 0.00	66.11 ± 4.33	55.78 ± 3.99	56.72 ± 3.83

^a Area Under the Curve; ^b Standard error; ^c Random Forest; ^d Gradient Boosting; ^e Support Vector Machine; ^f Logistic Regression. Model highlighted in gray is the best performing model.

Our ML approach revealed that in-hospital heart failure, followed by initial hemoglobin values at admission and age, were the most important features for predicting the risk of mortality (Fig 1A). Notably, hemoglobin was the most important predictor for both in-hospital heart failure and bleeding events (Fig 1B and 1C). In-hospital infarction and bleeding were the second and third most important predictors for heart failure, respectively (Fig 1B). In contrast, age and heart failure were the second and third most important predictors for bleeding, respectively (Fig 1C).

Fig 1

Important features that contribute to the prediction of three in-hospital ACS-related outcomes, including (A) mortality, (B) heart failure, and (C) bleeding. A classification error loss function (“ce”) was used to calculate feature importance. Black dots indicate the median “ce”. CVD: Cardiovascular disease. TIA: Transient ischemic attack.

PD plots consistently showed that patients with initial hemoglobin values of 10 g/dl or less were more likely to be discharged dead, to have heart failure, or to develop bleeding (Fig 2B–2D). Our PD plots also show that patients with in-hospital heart failure (Fig 2A) and those aged over 75 years (Fig 2G) are more likely to be discharged dead. Further, patients having in-hospital infarction (Fig 2E) and bleeding (Fig 2H) are more likely to experience in-hospital heart failure. However, in the bleeding model the effect of age was inconclusive, with no distinct trends (Fig 2F). Yet, patients having in-hospital heart failure are more likely to experience bleeding events (Fig 2I).

Fig 2

Centred Individual Conditional Expectation (ICE) plots for the top three important features that contribute to the prediction of three in-hospital ACS outcomes, including (A, D, G) mortality, (B, E, H) heart failure, and (C, F, I) bleeding. The plots show the relationship between the predicted risk of ACS outcomes and each corresponding feature. The black lines indicate the predicted risk in each patient, while the red line indicates the partial dependence calculated as the average risk across all patients.

Initial hemoglobin on admission had the strongest interactions with the other features in shaping the risk of in-hospital deaths and heart failure events (Fig 3A and 3D). However, chronic renal failure had the strongest overall interaction strength among the features shaping the risk of bleeding (Fig 3G). The interaction between admission hemoglobin and age (Fig 3B) was the strongest for predicting the risk of death from an ACS event: the majority of patients aged above 75 years with hemoglobin values close to or less than 10 g/dl were at high risk of death from an ACS event (Fig 3C). For the heart failure model, the interaction between admission hemoglobin and sex was the strongest (Fig 3E). Further, notable declines in risk were inferred for both males and females at hemoglobin values greater than 10 g/dl, but with a sharp increase at hemoglobin values ≅ 20 10 g/dl (Fig 3F). However, the risk of heart failure in males at lower hemoglobin values was slightly higher than in females (Fig 3F). For the bleeding model, the interaction between hemoglobin and chronic renal failure was the strongest (Fig 3H): anemic patients with chronic renal failure were more likely to experience bleeding events (Fig 3I).

Fig 3

Feature interaction plots calculated using Friedman’s H-statistic.

(A-C) indicate the mortality model; (D-F) indicate the heart failure model; (G-I) indicate the bleeding model. The three plots on the top (A, D, G) show the overall interaction strength of each feature with the other features. Plots (B & E) demonstrate the overall interaction strength of hemoglobin with the other features, while plot (H) demonstrates the overall interaction strength of chronic renal failure with the other features. Partial dependence plots at the bottom (C, F, I) represent the top interactions that shaped the risk of the three acute coronary syndrome outcomes. (C) Interaction between age and initial hemoglobin for the mortality model. The heat matrix corresponds to the risk of death, in which lighter shades of red indicate lower risks of death and darker shades of red indicate higher risks of death. The bar on the right indicates (y hat) the relative risk of death with all other feature combinations marginalized. (F) Interaction between sex and initial hemoglobin for the heart failure model. (I) Interaction between chronic renal failure and initial hemoglobin for the bleeding model. Red and green partial dependence curves represent different sexes (F) and the status of chronic renal failure (I), while the y-axis indicates the predicted risk of heart failure or bleeding. CVD: Cardiovascular disease. TIA: Transient ischemic attack.

The game-theoretic approach we used provided more insight into how the best-performing model predicted ACS outcomes at an individual patient level (Fig 4). For the selected patients with hemoglobin values of 10 g/dl or less, the models predicted a high likelihood of death, heart failure, or bleeding (probabilities > 0.8; Fig 4A–4C). Conversely, for the selected patients with hemoglobin values higher than 10 g/dl, the models predicted a low likelihood of death, heart failure, or bleeding (probabilities = 0.0; Fig 4D–4F).

Fig 4

Value contributions for the respective risk of acute coronary syndromes (ACS) based on Shapley Values (φ) for Six Individual Patients.

(A) a patient discharged dead; (B) a patient with in-hospital heart failure; (C) a patient with in-hospital bleeding; (D) a patient discharged alive; (E) a patient discharged alive with no in-hospital heart failure; and (F) a patient discharged alive with no in-hospital bleeding. Red bars indicate positive outcomes, and blue bars indicate negative outcomes. A positive Shapley value indicates that the feature increased the risk of the ACS outcome, whereas a negative value indicates that the feature decreased the risk of the ACS outcome. The values next to each feature name indicate the observed value of that feature for that patient. *CVD: Cardiovascular disease; *TIA: Transient ischemic attack.
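
The sign convention for Shapley values described in this caption can be illustrated with an exact calculation on a toy model. This Python sketch is not the authors' implementation: `toy_risk`, the patient, and the reference profile are all invented, and "removing" a feature is approximated by substituting a single reference patient's value, whereas the ‘iml’ R package estimates Shapley values by sampling from the data.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values; absent features take the baseline patient's value."""
    features = list(instance)
    n = len(features)

    def value(subset):
        row = {f: (instance[f] if f in subset else baseline[f]) for f in features}
        return model(row)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        phi[f] = total
    return phi


def toy_risk(row):
    """Hypothetical risk score with one interaction term; not the fitted model."""
    risk = 0.05
    if row["hemoglobin"] <= 10:
        risk += 0.4
    if row["heart_failure"]:
        risk += 0.3
    if row["hemoglobin"] <= 10 and row["age"] > 75:
        risk += 0.2
    return risk


patient = {"hemoglobin": 9.0, "heart_failure": 1, "age": 80}      # invented
reference = {"hemoglobin": 14.0, "heart_failure": 0, "age": 60}   # invented
phi = shapley_values(toy_risk, patient, reference)
```

Here all three values are positive, so each feature pushes this patient's predicted risk upward relative to the reference patient. The contributions sum to the difference between the two predictions, and the interaction between low hemoglobin and older age is split evenly between the two features involved.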

Discussion

Using our integrated ML pipeline and the Gulf COAST data, we uncovered deeper insights into the factors that shape the outcome of ACS in the Gulf countries. We also revealed the unique and complex role of anemia on admission in the prognosis of different ACS outcomes. Overall, we found that the initial hemoglobin values at admission were among the most important variables shaping the risk of ACS in-hospital outcomes. Notably, the relationship between admission anemia and other baseline characteristics was non-linear in shaping the risk of in-hospital events. These interpretable insights generated by our ML approach can not only improve clinicians’ prognostic efforts but also assist with reducing the public health and economic implications of this important cardiovascular disease.

Our ML algorithms consistently identified admission hemoglobin values and in-hospital cardiovascular disease events as the most important predictors for the risk of in-hospital mortality, heart failure, and bleeding (Fig 1). Our results were consistent with past studies in terms of the critical role of anemia in shaping the risk of ACS in-hospital outcomes [41,42]. We found that hemoglobin values on admission equal to or less than 10 g/dl increase the risk of in-hospital death, heart failure, and bleeding (Fig 2B–2D). This finding has also been observed in previous studies, since anemia causes hypo-oxygenation of major organs, including the heart, leading to compromised cardiac function and sterile inflammation, which accelerates atherosclerosis and promotes thrombosis [43]. Further, cICE plots (Fig 2D–2F) and feature interaction plots (Fig 3) show that the relationship between anemia and the risk of several ACS outcomes is non-linear and far more complex [20,42]. These results indicate that anemia on admission has both a direct and an indirect role in the prognosis of ACS and that the combination of anemia and other baseline characteristics shaped the risk of in-hospital outcomes.

While we were unable to quantify a distinct relationship between the risk of mortality and bleeding with age (Fig 2F), the individual interaction between mortality and age showed that patients aged over 75 years with an initial hemoglobin value less than 10 g/dl are more likely to die from ACS-related complications (i.e., darker shades of red tend to accumulate across the spectrum of hemoglobin values; Fig 3C). This also confirms the complex non-linear relationship between hemoglobin and age in shaping the risk of ACS events [44]. Furthermore, our models demonstrate that males are more likely to experience heart failure than females if initial hemoglobin values were less than 10 g/dl (Fig 3F), which agrees with the notion that sex is an important modifier of and contributor to the development of heart failure [44]. We also show that anemic patients with chronic renal failure are more likely to develop a bleeding event (Fig 3I). This could be because patients with chronic renal failure suffer from severe albuminuria, which is an important facilitator of bleeding events [45]. These findings are expected since anemia is often associated with other comorbidities such as infections, chronic inflammatory conditions, chronic renal failure, and neoplastic diseases [42].

Yet, unlike past studies, we found a slight non-linear increase in the risk of ACS outcomes at hemoglobin thresholds greater than 15 g/dl, particularly in the mortality and bleeding models (Fig 2C and 2D). This U-shaped trend is notably distinct in the individual interactions between hemoglobin on one side and age, sex, and chronic renal failure on the other (Fig 3C, 3F and 3I). These findings support the notion that polycythemia, together with other risk factors, shapes the risk of in-hospital ACS events in the higher dimensional space [46,47]. Indeed, past studies confirmed that polycythemia could cause both thrombosis and bleeding events in the same patients with ACS [48,49]. Additionally, polycythemia can cause ischemia, which leads to the development of arterial or venous thrombosis, myocardial infarction, or heart failure [49].

One important limitation of the Gulf COAST registry is the population size, and therefore the generalizability of our findings might be biased toward the population included in our analyses. Yet, many of our ML inferences agree with the findings of the GULF RACE II study in terms of the role of anemia in ACS patients, which included these three Gulf countries [23]. Nevertheless, ML predictive models are mainly meant to reveal complex relationships in the available data that might guide and improve future prognostic efforts for ACS events in the same population where the data have been collected. While our predictive models have not been validated on recent ACS registries, the use of the k-fold cross-validation approach reduces the chances of overfitting and strengthens the validity of their subsequent inference. The inability to fit valid predictive models for other ACS-related outcomes such as infarction/reinfarction, PCI, and stroke is another limitation of the present study, since the number of cases was substantially lower than for the other selected outcomes (Table 1). Indeed, the prevalence of PCI in the COAST registry was very low (0.35%; Table 1); this is due to the limited availability of facilities for such procedures in the selected countries, and because most of the selected patients did not need that procedure [26]. Future studies should focus on evaluating the impact of hemoglobin level on other common ACS events when sufficient data are available.

It is worth noting that the number of mortalities (n = 167; Table 1) was substantially lower than the number of heart failure events (n = 521; Table 1), yet the predictive power of our mortality model was remarkably higher than that of the heart failure model (Table 2). This suggests that our ML analytical pipeline is less sensitive to the number of event outcomes in the dataset than to the selected features or to the way the features were coded and calibrated [50]. Thus, our selected features were better predictors of mortality than of other ACS events, and future efforts should attempt to either add other relevant features or calibrate the selected features to improve the performance of these predictive models.

The complexity of ACS epidemiology coupled with the increasing size of registry data, as well as the highly non-linear relationships between admission anemia, other baseline characteristics, and ACS in-hospital events, highlight the strength of our ML analytical pipeline. Our selected ML algorithms were shown to outperform commonly used algorithms such as logistic regression, as well as risk stratification tools like the TIMI, EMMACE, and GRACE models, due to their flexibility in quantifying non-linear relationships with minimal underlying statistical assumptions [5,6,51]. Past ACS studies that used similar ML algorithms were mainly focused on comparing their predictive power (i.e., a black-box approach) to traditional risk stratification tools rather than on their interpretability in a clinical setting [6,51]. Hence, providing an interpretable predictive model will further help to improve in-hospital decision making and, ultimately, the overall prognosis of ACS. Thus, our study represents the first attempt to implement an interpretable ML pipeline focused on unveiling complex relationships in the higher dimensional space to improve clinicians’ ACS prognostic efforts, particularly in the Middle East.

Further, we illustrated the applicability of Shapley values to elucidate, at finer scales, what each model means in terms of the predicted risk of different ACS events (e.g., why a specific patient developed an ACS outcome while another did not). This intuitive attribute can be used to improve clinicians’ in-hospital prognoses and subsequently reduce the implications of different ACS outcomes. For example, for a randomly selected patient who had been discharged dead (Fig 4A), having low initial hemoglobin values (< 10 g/dl) and bleeding put that patient at high risk of in-hospital death (probability > 0.8). Conversely, the other selected patient, who was discharged alive (Fig 4D), completely lacked such risk factors. Thus, patients with a Shapley profile similar to that of the patient who died (Fig 4A) should be targeted with rigorous interventions to reduce the risk of in-hospital mortality or other ACS events (Fig 4B and 4C).

Finally, future studies of ML applications in clinical settings should explore such methods for resource allocation within health care systems [52]. For example, length of stay is an important outcome that requires substantial resources when the duration of the patient's stay in the hospital is long. Therefore, the ML model’s predictive ability can help guide the mobilization of clinical resources to targeted patients who are expected to stay longer due to their clinical profile.

Conclusion

This study represents a unique attempt to implement an interpretable ML pipeline focused on revealing the complex relationship between ACS events and the role of anemia in predicting multiple ACS outcomes. We showed that in-hospital heart failure followed by anemia was the most important predictor of mortality, whereas anemia was the most important predictor of in-hospital heart failure and bleeding, and that anemia had remarkably non-linear relationships with both ACS outcomes and patients’ baseline characteristics. We demonstrated how our ML pipeline outperformed commonly used statistical and risk stratification methods due to its minimal statistical assumptions and its ability to elucidate, at a finer scale, the predicted risk of each individual patient based on their unique risk factors. To the authors’ knowledge, a fully interpretable ML pipeline has not yet been widely implemented in clinical settings, particularly in the Middle East. Therefore, our ML models can improve clinicians’ prognostic efforts and be used to guide policymakers in reducing the burdens of ACS on public health and the economy worldwide.

23 Jun 2021

PONE-D-21-13062

Machine Learning Reveals Deeper Insights into the Role of Anemia in the Outcome of Acute Coronary Syndrome

PLOS ONE Dear Dr. Ali, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Please submit your revised manuscript by Aug 07 2021 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.
Please include the following items when submitting your revised manuscript: A rebuttal letter that responds to each point raised by the academic editor and reviewer(s). You should upload this letter as a separate file labeled 'Response to Reviewers'. A marked-up copy of your manuscript that highlights changes made to the original version. You should upload this as a separate file labeled 'Revised Manuscript with Track Changes'. An unmarked version of your revised paper without tracked changes. You should upload this as a separate file labeled 'Manuscript'. If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Guidelines for resubmitting your figure files are available below the reviewer comments at the end of this letter. If applicable, we recommend that you deposit your laboratory protocols in protocols.io to enhance the reproducibility of your results. Protocols.io assigns your protocol its own identifier (DOI) so that it can be cited independently in the future. For instructions see: http://journals.plos.org/plosone/s/submission-guidelines#loc-laboratory-protocols . Additionally, PLOS ONE offers an option for publishing peer-reviewed Lab Protocol articles, which describe protocols hosted on protocols.io. Read more information on sharing protocols at https://plos.org/protocols?utm_medium=editorial-email&utm_source=authorletters&utm_campaign=protocols . We look forward to receiving your revised manuscript. Kind regards, Andreas Zirlik, MD Academic Editor PLOS ONE Journal Requirements: When submitting your revision, we need you to address these additional requirements. 1. Please ensure that your manuscript meets PLOS ONE's style requirements, including those for file naming. The PLOS ONE style templates can be found at and https://journals.plos.org/plosone/s/file?id=ba62/PLOSOne_formatting_sample_title_authors_affiliations.pdf 2. 
Please include the data sources used in the Data availability statement and Methods section. Please also indicate in the Data availability statement whether you are able to openly share the code used, and if so, where others can access this. 3. Please ensure that you include a title page within your main document. We do appreciate that you have a title page document uploaded as a separate file, however, as per our author guidelines (http://journals.plos.org/plosone/s/submission-guidelines#loc-title-page) we do require this to be part of the manuscript file itself and not uploaded separately. Could you therefore please include the title page into the beginning of your manuscript file itself, listing all authors and affiliations. 4. Thank you for stating the following in the Financial Disclosure section: "Gulf COAST is an investigator-initiated study, financially supported by AstraZeneca and sponsored and overseen by Kuwait University (Project Code XX02/11). Neither Kuwait University nor AstraZeneca had any role in study design, data collection, data analysis or writing of the manuscript." We note that you received funding from a commercial source: AstraZeneca. Please provide an amended Competing Interests Statement that explicitly states this commercial funder, along with any other relevant declarations relating to employment, consultancy, patents, products in development, marketed products, etc. Within this Competing Interests Statement, please confirm that this does not alter your adherence to all PLOS ONE policies on sharing data and materials by including the following statement: "This does not alter our adherence to PLOS ONE policies on sharing data and materials.” (as detailed online in our guide for authors http://journals.plos.org/plosone/s/competing-interests). If there are restrictions on sharing of data and/or materials, please state these. Please note that we cannot proceed with consideration of your article until this information has been declared. 
Please include your amended Competing Interests Statement within your cover letter. We will change the online submission form on your behalf. Please know it is PLOS ONE policy for corresponding authors to declare, on behalf of all authors, all potential competing interests for the purposes of transparency. PLOS defines a competing interest as anything that interferes with, or could reasonably be perceived as interfering with, the full and objective presentation, peer review, editorial decision-making, or publication of research or non-research articles submitted to one of the journals. Competing interests can be financial or non-financial, professional, or personal. Competing interests can arise in relationship to an organization or another person. Please follow this link to our website for more details on competing interests: http://journals.plos.org/plosone/s/competing-interests 5. Please amend your list of authors on the manuscript to ensure that each author is linked to an affiliation. Authors’ affiliations should reflect the institution where the work was done (if authors moved subsequently, you can also list the new affiliation stating “current affiliation:….” as necessary). [Note: HTML markup is below. Please do not edit.] Reviewers' comments: Reviewer's Responses to Questions Comments to the Author 1. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented. Reviewer #1: No Reviewer #2: Partly ********** 2. Has the statistical analysis been performed appropriately and rigorously? Reviewer #1: No Reviewer #2: I Don't Know ********** 3. Have the authors made all data underlying the findings in their manuscript fully available? 
The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified. Reviewer #1: No Reviewer #2: No ********** 4. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here. Reviewer #1: No Reviewer #2: Yes ********** 5. Review Comments to the Author Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters) Reviewer #1: The Authors present the following aims of their study: 1) To identify factors associated with in-hospital adverse events post ACS 2) To explore the role of haemoglobin values at admission and their interaction with other relevant factors 3) To explore the role of patients’ initial hemoglobin values at admission, and their interactions with other relevant factors for each selected ACS adverse event. 4) To extend and evaluate the created models in the context of in-hospital individual level prognosis. 
The authors found in their analysis that the strongest predictor for the in-hospital adverse events death, heart failure and bleeding was length of hospital stay. Haemoglobin was the second strongest predictor with strong interaction with age for mortality and sex for bleeding and heart failure. Shapley values for the contribution of each predictor in 6 selected patients confirm the explanatory strength of length of stay and haemoglobin. The individual contribution of various predictors to explain short term prognosis after ACS is definitely an important topic to study. It might improve patient management and provide valuable insights into the pathophysiology of adverse events. Furthermore, machine learning approaches are quite new to cardiovascular risk prediction and need to find their optimal place in the field. However, the authors are advised to completely rethink the presented analysis. By their machine learning approach the authors found that length of hospitalization and haemoglobin at admission were the most important features for predicting the risk of mortality, heart failure, and bleeding. They further report that “risk of all ACS outcomes increases when patients stay in the hospital for more than 10 days“. This is one of the major limitations of the study: Length of hospital stay is an outcome parameter rather than a predictor and should not be used as a predicting parameter for events during the same hospitalization (the authors count their outcome parameters death, heart failure, bleeding only as long as the patient is still in hospital). In case of death, this event ends hospitalization and thus has a direct influence on its predictive parameter length of stay. Similarly, patients experience their in-hospital heart failure and in-hospital bleeding before the value of their predicting variable “length of hospital stay” can be known. 
In addition, in-hospital heart failure and bleeding have a direct influence on length of stay: In most patients they would prolong hospitalization so that length of stay is a function of heart failure and bleeding and not vice versa. This is in my opinion misconception that also invalidates further calculations which are based on length of stay being among the three strongest predictors. Therefore, I do believe that this manuscript cannot be accepted for publication in the Journal. Further comments: The authors need to elaborate more on the rationale of the research question and the use of machine learning for their specific research question. The authors should put more emphasis in explaining the clinical impact of their results and the use of machine learning tools. The authors should revise the manuscript – including language - to improve readability. The manuscript text is interrupted by a table an various figure legends, which hampers understanding such a complex analysis. Furthermore, I did not have access to supplemetary table 1. Figure 2: The authors state that the risk for death, heart failure, and bleeding markedly increases after 10 days of hospital stay. I can confirm this statement for heart failure and bleeding. However, for mortality it looks like a steady rather than an abrupt increase. The authors state that the risk of heart failure increases at the age of 50 years and above. However, in figure 2H it looks as if the risk of heart failure does not increase below 60 years and markedly increase around 70 years and older. The Authors state that patients with Hb levels <10g/dl are at a higher risk of death, heart failure, or bleeding. The risk of heart failure, however, increases already at higher haemoglobin levels ( <13g/dl?). See also Figures 3 F and 3 I. The authors should comment on these discrepant findings. I believe figure 2 would benefit from using the same scale on all y-axes for better comparison. 
Reviewer #2: In the submitted study entitled „Machine Learning Reveals Deeper Insights into the Role of Anemia in the Outcome of Acute Coronary Syndrome” the authors used a multi-algorithm machine learning ensemble pipeline on data of 4044 patients from the Gulf-COAST registry which included patients with the diagnosis of ACS between January 2012 and January 2013. If outcomes or complications of a patient presenting with ACS could be anticipated on patient characteristics from admission or the clinical course, a more patient tailored approach could lead to better outcomes. Therefore the approach to use a machine learning algorithm on the existing registry seems intriguing. However there are some major concerns regarding the submitted study. One major concern is that the study seems to lack a specific hypothesis and scientific question and therefor is more a report of multiple incoherent findings rather than a comprehensive paper. In the abstract and the conclusion the authors heavily emphasize their findings on anemia and ACS and claim that they “revealed the unique and complex role of anemia on admission in the prognosis of different ACS outcomes”. However the majority of the presented data are on the role of length of hospital stay and age. Furthermore the findings on anemia are not reported or discussed in sufficient depth. In regard to Figure 2 the authors state “[…] patients with initial hemoglobin values of equal or less than 10 g/dl are more likely to be discharge dead, had heart failure or develop bleeding”. While this seems to be true for mortality (Fig 2 D), the curve in Figure 2 E for heart failure seems to be u-shaped with a high risk at an Hb around 10 but lower risk at Hb 5-9 and 11-15, which is not addressed in the manuscript. Another major concern is with the underlying data which were analyzed. In table 1 the number of Patients receiving PCI is reported which is 0.35 % of the 4044 Patients with ACS. 
This rate of PCIs in an ACS collective seems surprisingly low. Also the type of ACS is not further evaluated in the manuscript. I think it would enhance the quality of the manuscript to evaluate the impact of the different hemoglobin levels in the different types of ACS. In addition it would be important to get information on how many values actually were present in regard to the single characteristics and outcomes. In addition there are some minor questions: - Line 142: “Heart failure was the third most important feature for predicting ACS mortalities […]” Does heart failure in this context mean heart failure as a preexisting condition or a complication? - Line 205-207 The authors discuss that length of stay is probably dependent on complications instead of being an independent factor toward predicting the risks of ACS. I agree with the authors but I think that raises the question why length of stay is addressed as a risk factor and not as an outcome. I think it would make more sense to analyze it in that way. - Figure I In Figure I the Abbreviation CRF is used without being explained in the manuscript - Figure 3 B, E, H Please explain more detailed what those graphs are actually showing. Overall I think the manuscript should be revised de novo with a more comprehensible story. ********** 6. PLOS authors have the option to publish the peer review history of their article (what does this mean?). If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy. Reviewer #1: No Reviewer #2: No [NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. 
Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.] While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step. 1 Oct 2021 Reviewer #1: 1- However, the authors are advised to completely rethink the presented analysis. By their machine learning approach the authors found that length of hospitalization and haemoglobin at admission were the most important features for predicting the risk of mortality, heart failure, and bleeding. They further report that “risk of all ACS outcomes increases when patients stay in the hospital for more than 10 days“. This is one of the major limitations of the study: Length of hospital stay is an outcome parameter rather than a predictor and should not be used as a predicting parameter for events during the same hospitalization (the authors count their outcome parameters death, heart failure, bleeding only as long as the patient is still in hospital). In case of death, this event ends hospitalization and thus has a direct influence on its predictive parameter length of stay. Similarly, patients experience their in-hospital heart failure and in-hospital bleeding before the value of their predicting variable “length of hospital stay” can be known. 
In addition, in-hospital heart failure and bleeding have a direct influence on length of stay: In most patients they would prolong hospitalization so that length of stay is a function of heart failure and bleeding and not vice versa. This is in my opinion misconception that also invalidates further calculations which are based on length of stay being among the three strongest predictors. Therefore, I do believe that this manuscript cannot be accepted for publication in the Journal. - We agree with the reviewer’s point in terms of excluding length of stay from the analysis. Hence, we have repeated the whole analysis and revised all the results and figures. While removing length of stay slightly dropped the performance of all selected machine learning algorithms, the initial hemoglobin levels remained as the strongest predictor and interacting variable with other predictors. Therefore, we hope that the reviewer now sees the revised version acceptable for publication and thank them for their valuable time in reviewing the manuscript. 2- The authors need to elaborate more on the rationale of the research question and the use of machine learning for their specific research question. - We elaborated extensively in both introduction (lines 45-74, 101-106 & 107-116) and discussion sections (lines 326-329, & 347-359) for the rational of using ML for our specific research question as suggested by the reviewer. 3- The authors should put more emphasis in explaining the clinical impact of their results and the use of machine learning tools. - We extensively revisited the explanation of the clinical impact of our results derived from ML Model as suggested by the reviewer. 4- The authors should revise the manuscript – including language - to improve readability. The manuscript text is interrupted by a table an various figure legends, which hampers understanding such a complex analysis. Furthermore, I did not have access to supplemetary table 1. 
- We extensively revised the language of the manuscript as suggested by the reviewer. However, the presence of tables and figure legends in the body of the manuscript is due to following the journal’s strict submission guidelines. We also apologize for mis-referencing supplementary table 1, as it was referring to table 1. 5- Figure 2: The authors state that the risk for death, heart failure, and bleeding markedly increases after 10 days of hospital stay. I can confirm this statement for heart failure and bleeding. However, for mortality it looks like a steady rather than an abrupt increase. - This point is no longer valid after removing length of stay from the analysis as suggested by the reviewer. 6- The authors state that the risk of heart failure increases at the age of 50 years and above. However, in figure 2H it looks as if the risk of heart failure does not increase below 60 years and markedly increase around 70 years and older. - After removing length of stay from the analysis the results slightly changed, and we commented extensively on this point as suggested by the reviewer (lines 299-305). 7- The Authors state that patients with Hb levels <10g/dl are at a higher risk of death, heart failure, or bleeding. The risk of heart failure, however, increases already at higher haemoglobin levels ( <13g/dl?). See also Figures 3 F and 3 I. The authors should comment on these discrepant findings. - We extensively commented on this as suggested by the reviewer (Lines 313-322). 8- I believe figure 2 would benefit from using the same scale on all y-axes for better comparison. - We attempted to unify the y-axes for all figures as suggested by the reviewer. However, some of the resulting figures looked distorted with large empty spaces, as the probability thresholds differ between important features. Yet, the x-axis is unified among the same features.
Reviewer #2 (note that revisions starts after the new figures below): 1- One major concern is that the study seems to lack a specific hypothesis and scientific question and therefor is more a report of multiple incoherent findings rather than a comprehensive paper. In the abstract and the conclusion the authors heavily emphasize their findings on anemia and ACS and claim that they “revealed the unique and complex role of anemia on admission in the prognosis of different ACS outcomes”. However the majority of the presented data are on the role of length of hospital stay and age. - We made substantial revisions throughout the manuscript in this regard. We made the paper more oriented toward how machine learning is more effective than classical risk stratification tools used by most clinicians in revealing complex relationships between different risk factors and ACS in-hospital events. Then how our selected ML models can produce rich outputs that is more interpretable than past used ML models. Finally, we showed how our ML untangled the complex relationship between initial hemoglobin values at admission and different common in-hospital ACS events with an extensive and elaborated discussion. We hope that the current version is more acceptable to the reviewer and we thank them for their time in reviewing the manuscript and provide useful suggestions. 2- Furthermore the findings on anemia are not reported or discussed in sufficient depth. In regard to Figure 2 the authors state “[…] patients with initial hemoglobin values of equal or less than 10 g/dl are more likely to be discharge dead, had heart failure or develop bleeding”. While this seems to be true for mortality (Fig 2 D), the curve in Figure 2 E for heart failure seems to be u-shaped with a high risk at an Hb around 10 but lower risk at Hb 5-9 and 11-15, which is not addressed in the manuscript. - We extensively commented on this as suggested by the reviewer (lines 313-322). 
3- Another major concern is with the underlying data which were analyzed. In table 1 the number of Patients receiving PCI is reported which is 0.35 % of the 4044 Patients with ACS. This rate of PCIs in an ACS collective seems surprisingly low. Also the type of ACS is not further evaluated in the manuscript. I think it would enhance the quality of the manuscript to evaluate the impact of the different hemoglobin levels in the different types of ACS. In addition it would be important to get information on how many values actually were present in regard to the single characteristics and outcomes. - We did comment on this in the discussion section as suggested by the reviewer (lines 331-340). Further, the evaluation of other types of ACS would make the content of the manuscript complicated and distracting to the reader. Yet, we commented on this in the discussion section. Thus, we think that the current version of the manuscript is already rich and extensive. 4- Line 142: “Heart failure was the third most important feature for predicting ACS mortalities […]” Does heart failure in this context mean heart failure as a preexisting condition or a complication? - It is in-hospital heart failure, which is different from prior CVD. We clarified this and made it distinct throughout the manuscript. 5- Line 205-207: The authors discuss that length of stay is probably dependent on complications instead of being an independent factor toward predicting the risks of ACS. I agree with the authors but I think that raises the question why length of stay is addressed as a risk factor and not as an outcome. I think it would make more sense to analyze it in that way. - We removed length of stay from the models as suggested by reviewer 1 and repeated the whole analysis. Yet, we agree with the reviewer’s notion of setting length of stay as an outcome. However, the manuscript is already rich in content after the substantial revision.
Thus, building a model for length of stay might complicate the results and distract the reader. Yet, we commented on this in the discussion section (Lines 367-372). 6- Figure I In Figure I the Abbreviation CRF is used without being explained in the manuscript - The abbreviation CRF has been spelled out in all of the figures as suggested by the reviewer. 7- Figure 3 B, E, H; Please explain more detailed what those graphs are actually showing. - We described and interpreted these figures as suggested by the reviewer. Submitted filename: Response_to_Reviewers.docx

17 Nov 2021

PONE-D-21-13062R1

Anemia or other comorbidities? Using machine learning to reveal deeper insights into the drivers of in-hospital Acute coronary syndromes

PLOS ONE Dear Dr. Alkhamis, Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we feel that it has merit but does not fully meet PLOS ONE’s publication criteria as it currently stands. Therefore, we invite you to submit a revised version of the manuscript that addresses the points raised during the review process. Most of the points are formal points that need to be addressed. Please submit your revised manuscript by Jan 01 2022 11:59PM. If you will need more time than this to complete your revisions, please reply to this message or contact the journal office at plosone@plos.org. When you're ready to submit your revision, log on to https://www.editorialmanager.com/pone/ and select the 'Submissions Needing Revision' folder to locate your manuscript file.

We look forward to receiving your revised manuscript. Kind regards, Andreas Zirlik, MD Academic Editor PLOS ONE Journal Requirements: Please review your reference list to ensure that it is complete and correct. If you have cited papers that have been retracted, please include the rationale for doing so in the manuscript text, or remove these references and replace them with relevant current references. Any changes to the reference list should be mentioned in the rebuttal letter that accompanies your revised manuscript.
If you need to cite a retracted article, indicate the article’s retracted status in the References list and also include a citation and full reference for the retraction notice.

Reviewers' comments:

Reviewer's Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass the “Comments to the Author” section, enter your conflict of interest statement in the “Confidential to Editor” section, and submit your "Accept" recommendation.

Reviewer #2: All comments have been addressed

2. Is the manuscript technically sound, and do the data support the conclusions? The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #2: Yes

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #2: I Don't Know

4. Have the authors made all data underlying the findings in their manuscript fully available? The PLOS Data policy requires authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exception (please refer to the Data Availability Statement in the manuscript PDF file). The data should be provided as part of the manuscript or its supporting information, or deposited to a public repository. For example, in addition to summary statistics, the data points behind means, medians and variance measures should be available. If there are restrictions on publicly sharing data—e.g. participant privacy or use of data from a third party—those must be specified.

Reviewer #2: Yes

5. Is the manuscript presented in an intelligible fashion and written in standard English? PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors here.

Reviewer #2: Yes

6. Review Comments to the Author. Please use the space provided to explain your answers to the questions above. You may also include additional comments for the author, including concerns about dual publication, research ethics, or publication ethics. (Please upload your review as an attachment if it exceeds 20,000 characters)

Reviewer #2: In the revised manuscript the authors made substantial changes to both the analysis (by excluding length of stay and rerunning the analysis) and the writing of the text. Overall the applied changes do significantly enhance the quality of the manuscript. There are only some minor comments which I would like to make:

1) The new title of the manuscript is conflicting. “Using machine learning to reveal deeper insights into the drivers of in-hospital Acute coronary syndromes” indicates that the examined population did suffer from ACS during a hospitalization rather than being admitted for ACS. Please reframe to avoid confusion.

2) Line 72, 93 and 349: You refer to a risk stratification tool called MACE. To my knowledge MACE is a composite of major adverse cardiac events and not a risk stratification tool. In the reference that you added there is no risk stratification tool called “MACE” described. Please clarify what you are referring to.

3) In the abstract you claim that “anemia was the most important predictor of mortality, heart failure, and bleeding […]” however in your results you say that “in-hospital heart failure followed by initial hemoglobin values at admission and age, were the most important features for predicting the risk of mortality”. Please correct that in the abstract.

4) You explained the low rates of PCI in your ACS-registry (0.35%) with accessibility and lack of necessity in most patients. Please report the rates of the subtypes of ACS in your baseline characteristics as it is very important to understand the quality of care the examined patients received, in regard to generalizability of your findings.

5) In table 1 the reference-letter for the explanation of CVD in the figure legend is missing. Typically CVD also is the abbreviation for cardiovascular disease. If you really mean coronary vascular disease = coronary artery disease the correct abbreviation would be CAD.

7. PLOS authors have the option to publish the peer review history of their article. If published, this will include your full peer review and any attached files. If you choose “no”, your identity will remain anonymous but your review may still be made public. Do you want your identity to be public for this peer review? For information about this choice, including consent withdrawal, please see our Privacy Policy.

Reviewer #2: No

[NOTE: If reviewer comments were submitted as an attachment file, they will be attached to this email and accessible via the submission site. Please log into your account, locate the manuscript record, and check for the action link "View Attachments". If this link does not appear, there are no attachment files.]

While revising your submission, please upload your figure files to the Preflight Analysis and Conversion Engine (PACE) digital diagnostic tool, https://pacev2.apexcovantage.com/. PACE helps ensure that figures meet PLOS requirements. To use PACE, you must first register as a user. Registration is free. Then, login and navigate to the UPLOAD tab, where you will find detailed instructions on how to use the tool. If you encounter any issues or have any questions when using PACE, please email PLOS at figures@plos.org. Please note that Supporting Information files do not need this step.

23 Nov 2021

Reviewer #2: In the revised manuscript, the authors made substantial changes to both the analysis (by excluding length of stay and rerunning the analysis) and the writing of the text. Overall the applied changes do significantly enhance the quality of the manuscript. There are only some minor comments which I would like to make:

- We thank the reviewer for their valuable time and comments, which substantially improved the quality of the manuscript, and we hope that the minor requested revisions were fulfilled as suggested.

1) The new title of the manuscript is conflicting. “Using machine learning to reveal deeper insights into the drivers of in-hospital Acute coronary syndromes” indicates that the examined population did suffer from ACS during a hospitalization rather than being admitted for ACS. Please reframe to avoid confusion.

- We fixed the title as suggested by the reviewer, and we welcome additional suggestions to improve the title of the manuscript.

2) Line 72, 93 and 349: You refer to a risk stratification tool called MACE. To my knowledge MACE is a composite of major adverse cardiac events and not a risk stratification tool. In the reference that you added there is no risk stratification tool called “MACE” described. Please clarify what you are referring to.

- We agree with the reviewer’s comment. To avoid this confusion, we replaced it with other common risk stratification tools, such as EMMACE and GRACE, with their relevant citations as examples.

3) In the abstract, you claim that “anemia was the most important predictor of mortality, heart failure, and bleeding […]” however, in your results, you say that “in-hospital heart failure followed by initial hemoglobin values at admission and age, were the most important features for predicting the risk of mortality”. Please correct that in the abstract.

- We fixed the sentence in the abstract as suggested by the reviewer.

4) You explained the low rates of PCI in your ACS registry (0.35%) with accessibility and lack of necessity in most patients. Please report the rates of the subtypes of ACS in your baseline characteristics as it is very important to understand the quality of care the examined patients received in regard to the generalizability of your findings.

- It is very important to emphasize that the PCI reported in table 1 is an outcome PCI. In other words, it is the PCI rate for patients who suffered recurrent infarction as a complication during their hospital stay. The reported rate of 0.35% therefore corresponds to 14 patients, and the relevant denominator is the 71 patients who suffered re-infarction during their hospital stay, that is, 14 of 71 (19.7%). That said, we added the ACS subtypes in table 1 as suggested by the reviewer.

5) In table 1, the reference letter for the explanation of CVD in the figure legend is missing. Typically CVD also is the abbreviation for cardiovascular disease. If you really mean coronary vascular disease = coronary artery disease, the correct abbreviation would be CAD.

- We agree with the reviewer’s comment. Here, we were using CVD as an abbreviation for cardiovascular disease, and we therefore fixed it in the captions of table 1, as well as of figures 1, 3 and 4.

Submitted filename: Response_to_Reviewers.docx

11 Jan 2022

Anemia or other comorbidities? Using machine learning to reveal deeper insights into the drivers of acute coronary syndromes in hospital admitted patients

PONE-D-21-13062R2

Dear Dr. Alkhamis,

We’re pleased to inform you that your manuscript has been judged scientifically suitable for publication and will be formally accepted for publication once it meets all outstanding technical requirements. Within one week, you’ll receive an e-mail detailing the required amendments.
When these have been addressed, you’ll receive a formal acceptance letter and your manuscript will be scheduled for publication.

An invoice for payment will follow shortly after the formal acceptance. To ensure an efficient process, please log into Editorial Manager at http://www.editorialmanager.com/pone/, click the 'Update My Information' link at the top of the page, and double check that your user information is up-to-date. If you have any billing related questions, please contact our Author Billing department directly at authorbilling@plos.org.

If your institution or institutions have a press office, please notify them about your upcoming paper to help maximize its impact. If they’ll be preparing press materials, please inform our press team as soon as possible -- no later than 48 hours after receiving the formal acceptance. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information, please contact onepress@plos.org.

Kind regards,
Andreas Zirlik, MD
Academic Editor
PLOS ONE

Additional Editor Comments (optional):

Reviewers' comments:

13 Jan 2022

PONE-D-21-13062R2

Anemia or other comorbidities? Using machine learning to reveal deeper insights into the drivers of acute coronary syndromes in hospital admitted patients

Dear Dr. Alkhamis:

I'm pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE. Congratulations! Your manuscript is now with our production department.

If your institution or institutions have a press office, please let them know about your upcoming paper now to help maximize its impact. If they'll be preparing press materials, please inform our press team within the next 48 hours. Your manuscript will remain under strict press embargo until 2 pm Eastern Time on the date of publication. For more information please contact onepress@plos.org.

If we can help with anything else, please email us at plosone@plos.org.

Thank you for submitting your work to PLOS ONE and supporting open access.

Kind regards,
PLOS ONE Editorial Office Staff
on behalf of
Univ. Prof. Dr. Andreas Zirlik
Academic Editor
PLOS ONE
