
Systematic review and meta-analysis of mortality risk prediction models in adult cardiac surgery.

Shubhra Sinha, Arnaldo Dimagli, Lauren Dixon, Mario Gaudino, Massimo Caputo, Hunaid A Vohra, Gianni Angelini, Umberto Benedetto.

Abstract

OBJECTIVES: The most used mortality risk prediction models in cardiac surgery are the European System for Cardiac Operative Risk Evaluation (ES) and Society of Thoracic Surgeons (STS) score. There is no agreement on which score should be considered more accurate nor which score should be utilized in each population subgroup. We sought to provide a thorough quantitative assessment of these 2 models.
METHODS: We performed a systematic literature review and captured information on discrimination, as quantified by the area under the receiver operator curve (AUC), and calibration, as quantified by the ratio of observed-to-expected mortality (O:E). We performed random effects meta-analysis of the performance of the individual models as well as pairwise comparisons and subgroup analysis by procedure type, time and continent.
RESULTS: The ES2 {AUC 0.783 [95% confidence interval (CI) 0.765-0.800]; O:E 1.102 (95% CI 0.943-1.289)} and STS [AUC 0.757 (95% CI 0.727-0.785); O:E 1.111 (95% CI 0.853-1.447)] showed good overall discrimination and calibration. There was no significant difference in the discrimination of the 2 models (difference in AUC -0.016; 95% CI -0.034 to -0.002; P = 0.09). However, the calibration of ES2 showed significant geographical variations (P < 0.001) and a trend towards miscalibration with time (P=0.057). This was not seen with STS.
CONCLUSIONS: ES2 and STS are reliable predictors of short-term mortality following adult cardiac surgery in the populations from which they were derived. STS may have broader applications when comparing outcomes across continents as compared to ES2.
REGISTRATION: Prospero (https://www.crd.york.ac.uk/PROSPERO/) CRD42020220983.
© The Author(s) 2021. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery.

Keywords:  Cardiac surgery; European System for Cardiac Operative Risk Evaluation; Mortality; Prediction; Society of Thoracic Surgeons

Year:  2021        PMID: 34041539      PMCID: PMC8557799          DOI: 10.1093/icvts/ivab151

Source DB:  PubMed          Journal:  Interact Cardiovasc Thorac Surg        ISSN: 1569-9285


INTRODUCTION

Cardiac surgery carries an inherent risk of perioperative mortality and morbidity. This varies considerably depending on the patients’ characteristics, baseline pathology and planned surgical intervention. Prediction models have been created [1-6] to quantify this risk. These models are utilized when counselling patients, discussing patients within the multi-disciplinary team, for benchmarking performance and more recently in guidelines for the management of aortic stenosis and deciding between surgical or transcatheter treatments [7, 8]. Present models predominantly quantify the risk of death in the short term. The most cited models are the European System for Cardiac Operative Risk Evaluation (ES) [1, 2, 9] and the Society of Thoracic Surgeons (STS) score [10, 11]. There is no guidance at present on which is the optimum score to utilize in a given clinical or research setting and concerns have arisen regarding the degree of applicability of a specific model to a localized population given the heterogenous populations from which they were originally derived. This leaves clinicians with the difficult decision of choosing which model to utilize when reporting and comparing outcomes. The relative performance of these models is thus the focus of this systematic review. We aim to build on previous work by using dedicated statistical methods to evaluate the comparative discrimination and calibration of the ES2 and STS not only in the wider cardiac surgery spectrum but also as they are applied to specific subgroups of the population. We believe that this is the most thorough comparison of these models.

METHODS

The data and scripts that support the findings of this study are available from the corresponding author upon reasonable request.

Systematic review

We report on the original papers and subsequent external validations available and draw comparisons between the models’ discriminatory power, as defined by the area under the receiver operator curve (AUC) or C-statistic, and their calibration, as defined by the ratio of the observed-to-expected mortality (O:E) within 30 days of the operation or the same hospital admission. Longer-term follow-up data were not included in the analysis to allow parity among studies and with the originally published papers on STS and ES2. The systematic literature review and meta-analysis followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses [12] and Meta-analysis Of Observational Studies in Epidemiology principles [13]. Our librarian conducted a literature search, restricting articles to those translatable into English and referencing adults only, using the described search string (Supplementary Material, Table S1). We also hand-searched the reference lists of the papers identified but did not contact the authors. Excluded papers and the rationale for exclusion have been noted (Fig. 1 and Supplementary Material, Table S2). If studies performed subgroup analysis such that the AUC or predicted mortality was not available for the whole dataset, then the subgroups were treated as independent populations. Institutes reporting on multiple occasions but utilizing different populations of patients were also treated as independent populations. The search was last updated on 29 October 2020. Papers were screened and data extracted independently by 3 reviewers (SS/AD/LD). Outliers and studies with a high risk of bias were included in the primary analysis following discussion between 2 authors (SS/UB). SS/UB had full access to all the data in the study and take responsibility for its integrity and the data analysis.
The data extraction items were based on the CHARMS checklist [14] and the risk of bias was assessed using the PROBAST tool [15, 16] (Prospero ID: CRD42020220983).
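As an illustration of the calibration measure defined above, the following minimal Python sketch computes O:E from study-level summary data. The function name `oe_ratio` is hypothetical, and the confidence interval uses an assumed large-sample approximation, var(ln O:E) ≈ (1 − observed risk)/deaths, which treats the expected mortality as fixed; it is not necessarily the exact method used by the authors' R scripts.

```python
import math

def oe_ratio(deaths, n_patients, expected_pct):
    """Observed-to-expected mortality ratio with an approximate 95% CI.

    expected_pct is the mean predicted mortality in percent (e.g. 3.95).
    The CI is built on the log scale using the assumed approximation
    var(ln O:E) ~ (1 - observed_risk) / deaths.
    """
    observed_risk = deaths / n_patients
    expected_risk = expected_pct / 100.0
    oe = observed_risk / expected_risk
    se_log = math.sqrt((1.0 - observed_risk) / deaths)
    lower = math.exp(math.log(oe) - 1.96 * se_log)
    upper = math.exp(math.log(oe) + 1.96 * se_log)
    return oe, lower, upper

# Summary data as reported for the ES2 derivation cohort:
# 873 deaths among 22 381 patients, expected mortality 3.95%.
oe, lower, upper = oe_ratio(873, 22381, 3.95)
print(round(oe, 2), round(lower, 2), round(upper, 2))
```

An O:E below 1 means fewer deaths were observed than predicted (risk over-predicted); an O:E above 1 means the model under-predicted risk.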
Figure 1:

Preferred Reporting Items for Systematic Reviews and Meta-Analyses flowchart.

Databases searched: MEDLINE (1946 to present), CINAHL (1981 to present), Embase (1974 to present) and EmCare (1946 to present). Preferred Reporting Items for Systematic Reviews and Meta-Analyses diagram: Fig. 1. Risk of bias assessment: Supplementary Material, Table S3. Low risk of bias: 17 papers. Uncertain risk of bias: 2 papers. High risk of bias: 24 papers.

Statistical analysis

Data were extracted as frequency and percentage for categorical variables and mean and standard deviation for continuous variables. The outcomes were AUC and O:E. Two separate analyses were conducted. First, we reviewed each score in turn and provided pooled estimates of AUC and O:E for comparison in accordance with previously published guidance [16-18]. It was assumed that variation in these parameters across studies was prone to between-study heterogeneity, due to the varied case-mix of populations studied, and thus, a random effects model was utilized [17]. The standard error of the AUC was calculated using Newcombe Method 4 [19]:

$$\mathrm{SE}(\hat{c}) = \sqrt{\frac{\hat{c}(1-\hat{c})\left[1 + n^{*}\frac{1-\hat{c}}{2-\hat{c}} + m^{*}\frac{\hat{c}}{1+\hat{c}}\right]}{mn}}$$

where ĉ is the estimated AUC, n is the number of observed events, m is the number of non-events and m* = n* = [1/2 (m + n)] − 1. Analysis was conducted using R (version 4.0.3). Meta-analysis models were formed using the R packages ‘metamisc’ [17] and ‘metafor’ [20] and results were displayed as forest plots. We report the 95% prediction interval (PI), which takes into account the between-study heterogeneity [17].

Second, for studies reporting both ES2 and STS, we established pooled estimates of discrimination (AUC) and calibration (O:E) for each model and compared the confidence intervals (CIs). A lack of overlap between the CIs was taken to indicate a marked difference in performance. The differences in AUCs and the standard error of the difference in AUCs [6, 21] were calculated per paper and utilized in a meta-analysis with the ‘metafor’ [20] package. We also conducted stratified analysis by operation, continent and time. All ES2 papers were published after 2011; however, we separated the papers into studies solely reporting on patients operated on in or after 2010 (‘post-2010’) and those that contained data on patients operated on prior to 2010 (‘pre-2010’), on whom the authors had retrospectively calculated the ES2. We repeated the main comparisons stratifying by risk of bias (Supplementary Material, Figs. S1–S4).
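The standard error calculation named above (Newcombe Method 4) can be sketched in a few lines of Python. This is an illustrative implementation of the published formula, in which the variance of the AUC is ĉ(1 − ĉ)[1 + n*(1 − ĉ)/(2 − ĉ) + m*ĉ/(1 + ĉ)]/(mn) with m* = n* = (m + n)/2 − 1; the function name `newcombe_se_auc` is hypothetical, and the example inputs are summary values of the kind listed in Table 1 rather than output from the authors' scripts.

```python
import math

def newcombe_se_auc(c_hat, n_events, n_nonevents):
    """Standard error of an AUC (C-statistic) by Newcombe Method 4.

    c_hat       estimated AUC
    n_events    number of observed events (deaths), n in the text
    n_nonevents number of non-events (survivors), m in the text
    """
    # m* = n* = (m + n)/2 - 1
    star = 0.5 * (n_events + n_nonevents) - 1.0
    bracket = (1.0
               + star * (1.0 - c_hat) / (2.0 - c_hat)
               + star * c_hat / (1.0 + c_hat))
    variance = c_hat * (1.0 - c_hat) * bracket / (n_events * n_nonevents)
    return math.sqrt(variance)

# Example: AUC 0.81 with 736 deaths among 23 740 patients.
se = newcombe_se_auc(0.81, 736, 23740 - 736)
print(f"SE(AUC) = {se:.4f}")
```

Because only the AUC, the number of events and the number of non-events are needed, this approach allows a standard error to be attached to studies that report an AUC without its confidence interval.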
The presence of small-study effects was assessed by visual inspection of the funnel plots (Supplementary Material, Figs. S5 and S6). Statistical heterogeneity was tested using Cochran's Q test, and the extent of statistical inconsistency was measured with I², which describes the percentage of the variability in effect estimates due to heterogeneity rather than sampling error (chance).
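The quantities described in this section (Cochran's Q, I², a random-effects pooled estimate and a prediction interval) can be illustrated with a short Python sketch. It uses the DerSimonian-Laird moment estimator for the between-study variance, which is a simpler stand-in chosen for illustration and not necessarily the estimator used by ‘metamisc’ or ‘metafor’ in the study. The function name, the toy inputs and the use of a normal critical value (1.96) for the prediction interval are all assumptions; a t quantile with k − 2 degrees of freedom is the more usual choice for the PI. Inputs are assumed to be on whatever scale the analysis is run on (for example logit AUC or log O:E).

```python
import math

def dersimonian_laird(estimates, std_errors, crit=1.96):
    """Random-effects pooling with Cochran's Q, I2 and an approximate PI.

    Requires at least 2 studies. crit is the critical value used for both
    the confidence interval and the (approximate) prediction interval.
    """
    k = len(estimates)
    w = [1.0 / se ** 2 for se in std_errors]            # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, estimates)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, estimates))  # Cochran's Q
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)                  # between-study variance
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, estimates)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    half_ci = crit * se_pooled
    half_pi = crit * math.sqrt(tau2 + se_pooled ** 2)   # adds heterogeneity
    return {
        "pooled": pooled, "q": q, "i2": i2, "tau2": tau2,
        "ci": (pooled - half_ci, pooled + half_ci),
        "pi": (pooled - half_pi, pooled + half_pi),
    }

# Toy example with three hypothetical studies on a transformed scale.
result = dersimonian_laird([0.1, 0.3, 0.5], [0.1, 0.1, 0.1])
print(result)
```

The prediction interval is wider than the confidence interval whenever the between-study variance is greater than zero, which is why the PIs reported in the Results extend well beyond the CIs.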

RESULTS

Study characteristics

A total of 41 studies published between 2004 and 2020 were included in the final analysis. The study characteristics are summarized in Table 1. They contained a heterogenous mix of patients, procedures and locations, commonly found in these studies [6, 22, 23]. Twenty studies reported on all operations performed [2, 24–42], 11 reported on aortic valve replacements with or without coronary artery bypass grafts (CABG) [43-53], 8 on CABG only [54-61], 2 on mitral valve repair/replacement [62, 63], 2 on unspecified valvular operations [64, 65] and 1 on thoracic aortic [66] operations. A total of 23 were based in Europe [2, 24, 25, 28, 31, 35–39, 42, 46, 48–50, 53–57, 59, 62, 67], 5 in North America (NA) [32, 41, 44, 58, 63], 4 in South America (SA) [26, 30, 34, 47], 8 in Asia [27, 29, 33, 51, 60, 64–66] and 3 in New Zealand (NZ) [40, 52, 61].
Table 1:

Overview of study characteristics

Author, year | Country | Study period | Sample size | Missing data | Age (years), mean ± SD | Male (%) | Urgency (%) | Case mix (%) | Observed mortality, % (n) | Expected mortality | O:E | AUC
Basraon et al., 2011 [44] | USA, 1 centre; RS | 1997–2008 | 537 | NR | 70 ± 10 | 100 | Emergency 0.1% | AVR (56% also CABG) | 5.9 (32) | STS 3.6% | STS 1.64 | STS 0.73
Poullis et al., 2014 [24]; patients <70 years | Liverpool, UK; RS | 2006–2010 | 2437 | RF presumed absent | Median 60, SD 4.1 | 79.5 | Urgent 17.8% | CABG 68.2%; AVR 53.4% | 1.6 (39) | ES2 2.5% | ES2 0.64 | ES2 0.80
Poullis et al., 2014 [24]; patients ≥70 years | Liverpool, UK; RS | 2006–2010 | 2147 | RF presumed absent | Median 76.4, SD 4.6 | 65.8 | Urgent 21.8% | CABG 31.8%; AVR 46.6% | 4.3 (92) | ES2 5.0% | ES2 0.86 | ES2 0.75
Nashef et al., 2012 [2] | 43 European countries, 154 centres; PS | May–July 2010 | 22 381 | <1% | 64.7 ± 12.5 | 69.1 | Urgent 18.5%; Emergency 4.3%; Salvage 0.5% | CABG 46.7%; Valves 46.3% | 3.9 (873) | ES2 3.95% | ES2 0.99 |
Grant et al., 2012 [35] | UK Database; RS | 2010–2011 | 23 740 | Imputation | 67.1 ± 11.8 | 72.3 | Urgent 28.7%; Emergency 2.9%; Salvage 0.3% | CABG 52.5%; Valves 21%; AVR + CABG 10%; Aortic 4.3% | 3.1 (736) | ES2 3.4% | ES2 0.92 | ES2 0.81
Chalmers et al., 2013 [36] | Liverpool, UK; RS | 2006–2010 | 5576 | RF presumed absent | Median 69.3, SD 10 | 73.9 | Urgent 28.3% | CABG 52.2%; AVR + CABG 9.3%; Isolated valves 20.7%; Aortic 6.2% | 2.2 (101) | ES2 2.0 | ES2 1.1 | ES2 0.79
Di Dedda et al., 2013 [37] | Italy, 1 centre; RS | 2010–2011 | 1090 | NR | 64.5 ± 13.5 | 68.3 | Urgent 2.2%; Emergency 1.7% | CABG 34.1%; Isolated valves 37.2%; Aortic 7.8% | 3.75 (41) | ES2 3.1% | ES2 1.2 | ES2 0.81
Howell et al., 2013 [38]; high-risk patients (ES > 10) | Netherlands and Birmingham; RS | 2006–2011 | 933 | Nil | Median 74.3, SD 7.7 | 57.5 | Urgent 50.2%; Emergency 9.2%; Salvage 0.3% | CABG 48.8%; 2 procedures 32.6%; 3 procedures 18.5% | 9.7 (90) | ES2 9.3% | ES2 1.04 | ES2 0.67
Biancari et al., 2012 [54] | Finland, 1 centre; RS | 2006–2011 | 1027 | Excluded prior to analysis | 67 ± 9.4 | 77.8 | Urgent 45.9%; Emergency 8.8% | Isolated CABG | 3.7 (38) | ES2 4.5% | ES2 0.82 | ES2 0.852
Hogervorst et al., 2018 [55] | Netherlands, 1 centre; RS | 2012–2014 | 2296 | Nil | Median 71, SD 9.6 | 71.2 | Emergency 11.4% | CABG 46.1%; OPCAB 6.1% | 2.4 (55) | ES2 1.6% | ES2 1.5 | ES2 0.871
Provenchère et al., 2017 [39]; octogenarians | France, 1 centre; RS | 2006–2012 | 7161 | NR | 63 ± 14 | 68 | Urgent 5.7% | CABG 37%; Valves 57.7% | 5.67 (406) | ES2 5.17% | ES2 1.1 | ES2 0.80
Singh et al., 2019 [40] | NZ, 1 centre; PS | 2014–2017 | 1666 | NR | 65 ± 11 | 76 | Urgent 32.3% | Aortic 9.4%; CABG 56% | 1.56 (26) | ES2 2.97% | ES2 0.53 | ES2 0.831
Ad et al., 2007 [41] | USA, 1 centre; female patients; RS | 2001–2004 | 692 of 3125 | NR | 65.8 | 0 | NR | Isolated CABG | 2.9 (20) | STS 2.6% | STS 1.1 | STS 0.82
Ad et al., 2007 [41] | USA, 1 centre; male patients; RS | 2001–2004 | 2433 of 3125 | NR | 62.6 | 100 | NR | Isolated CABG | 1.5 (37) | STS 2.1% | STS 0.71 | STS 0.85
Barili et al., 2013 [46] | Italy, 3 centres; PS | 2006–2012 | 1758 | <1%; multiple imputation | 69.8 ± 13.2 | 55 | Urgent 2%; Emergency 0% | Isolated AVR | 1.4 (25) | ES2 1.88%; STS 2.0% | ES2 0.74; STS 0.7 | ES2 0.81; STS 0.85
Barili et al., 2014 [42]; elective | Italy, 3 centres; PS | 2006–2012 | 12 201 of 13 871 | <1%; multiple imputation | 67.3 ± 11.8 | 68 | NR | CABG 51%; AVR 39%; MVR 26%; 2+ procedures 34% | 1.7 (210) | ES2 2.5% | ES2 0.68 | ES2 0.80
Barili et al., 2014 [42]; non-elective | Italy, 3 centres; PS | 2006–2012 | 1670 of 13 871 | <1%; multiple imputation | 68.1 ± 11.4 | 74 | NR | CABG 73%; AVR 17%; MVR 14%; 2+ procedures 25% | 8.1 (125) | ES2 6.2% | ES2 1.3 | ES2 0.82
Carnero-Alcázar et al., 2013 [25] | Spain, 1 centre; PS | 2005–2010 | 3798 of 4780 | Excluded patients with missing data | 67 ± 10.15 | 62.3 | Emergency 4.63% | CABG 32.4% | 5.7 (215) | ES2 4.46% | ES2 1.27 | ES2 0.85
Borracci et al., 2014 [26] | Argentina, 1 centre; PS | 2012–2013 | 503 | NR | 66.4 ± 10.3 | 74.8 | Urgent or emergency 15.9% | CABG 54.3%; Valve 27%; Valve + CABG 11.7% | 4.17 (21) | ES2 3.18% | ES2 1.31 | ES2 0.856
Carosella et al., 2014 [47] | Argentina, 4 centres; RS | 2008–2012 | 250 | NR | 68.6 ± 13.3 | 63.2 | Urgent 7.6% | Isolated AVR 67.2%; AVR + CABG 32.8% | 3.6 (9) | ES2 1.64% | ES2 2.20 | ES2 0.76
Chan et al., 2014 [63] | Canada, 1 centre; RS | 2001–2011 | 1154 | NR | 63.3 | 58.8 | NR | MVR: 73.7% repair, 26.3% replacement | 1 (11) | ES2 3.0%; STS 2.3% | ES2 0.33; STS 0.42 | ES2 0.67; STS 0.74
Nishida et al., 2014 [66] | Japan, 1 centre; RS | 1993–2013 | 461 | NR | 63.5 ± 0.7 | 65 | Emergency 35.4% | Thoracic aortic surgery | 7.2 (33) | ES2 7.4% | ES2 0.97 | ES2 0.770
Paparella et al., 2014 [56] | Italy, 7 centres; RS | 2011–2012 | 6293 | 1.6%; replaced with mean values | 67.3 ± 11.2 | 65.9 | Urgent 15.1; Emergency 3.9% | Isolated CABG | 4.9 (305) | ES2 4.4% | ES2 1.10 | ES2 0.83
Spiliopoulos et al., 2014 [53] | Germany, 1 centre; RS | 1999–2005 | 222 | NR | 66.16 | 72.7 | NR | AVR + CABG | 6.3 (14) | ES2 3.99% | ES2 1.58 | ES2 0.77
Garcia-Valentin et al., 2016 [67] | Spain, 20 centres; RS | 2012–2013 | 4034 | Nil | 66.6 ± 12.3 | 63.8 | Urgent 39.2%; Emergency 4.5% | CABG 25.4% | 6.5 (262) | ES2 5.7% | ES2 1.14 | ES2 0.79
Kar et al., 2017 [27] | India, 1 centre; RS | 2011–2012 | 911 | Excluded prior to analysis (61) | 49.37 ± 13.4 | 66.5 | Urgent 13.5%; Emergency 4.7% | No OPCAB; CABG 47.8%; Valve 46.8%; Valve + CABG 5.4% | 5.7 (52) | ES2 2.9% | ES2 1.97 | ES2 0.76
Kirmani et al., 2013 [28] | Liverpool, UK; RS | 2001–2010 | 14 432 | RF presumed absent | 65.3 ± 11 | 72.4 | Urgent 16.5%; Emergency 2.2% | CABG 61.7%; Valve 26.3%; Valve + CABG 12% | 3.1 (447) | ES2 2.44%; STS 2.40% | ES2 1.27; STS 1.29 | ES2 0.816; STS 0.810
Borde et al., 2013 [29] | India, 1 centre; PS | 2011–2012 | 498 | Excluded prior to analysis (39) | 60.48 ± 7.51 | 80.1 | Emergency 1.6% | CABG 86.5%; AVR 5.2% | 1.6 (8) | ES2 2.01%; STS 1.6% | ES2 0.80; STS 1.0 | ES2 0.69; STS 0.65
Kunt et al., 2013 [57] | Turkey, 1 centre; RS | 2004–2012 | 428 | Nil | 74.5 ± 3.9 | 65 | Emergency 3.7% | Isolated CABG | 7.9 (34) | ES2 1.7%; STS 5.8% | ES2 4.65; STS 1.36 | ES2 0.72; STS 0.62
Laurent et al., 2013 [48] | France, 1 centre; PS | 2009–2011 | 314 | Nil | 73.4 ± 9.7 (29% ≥80 years) | 59 | Emergency 3% | Severe AS | 5.7 (18) | ES2 2.3%; STS 2.8% | ES2 2.48; STS 2.04 | ES2 0.77; STS 0.73
Luc et al., 2017 [58]; patients >80 years | Canada, 1 centre; RS | 2002–2008 | 304 | RF presumed absent | 82.1 | 74.3 | Emergency 3.9% | Isolated CABG | 2 (6) | ES2 4%; STS 3% | ES2 0.50; STS 0.67 | ES2 0.794; STS 0.671
Luc et al., 2017 [58]; patients ≤80 years | Canada, 1 centre; RS | 2002–2008 | 608 | RF presumed absent | 63.8 | 84.9 | Emergency 2.6% | Isolated CABG | 1 (6) | ES2 2%; STS 1% | ES2 0.50; STS 1.0 | ES2 0.845; STS 0.829
Vilca Mejia et al., 2020 [30] | Brazil, 11 centres; RS | 2013–2017 | 5222 | Imputation | 60.6 ± 12 | 63.6 | Urgent 29%; Emergency 59.6% | CABG 60.2%; AVR 22.3%; Aortic 0.82% | 7.64 (399) | ES2 3.1%; STS 1.0% | ES2 2.46; STS 7.64 | ES2 0.763; STS 0.766
Nilsson et al., 2004 [59] | Sweden, 1 centre; RS | 1996–2001 | 4497 | NR | 66.4 ± 9.3 | 77 | Urgent 25.1%; Emergency 7.2%; Salvage 1% | Isolated CABG | 1.89 (85) | STS 1.89% | STS 1.0 | STS 0.71
Osnabrugge et al., 2014 [32] | USA, multicentre; RS | 2003–2012 | 50 588 | RF presumed absent | 64.7 ± 11.2 | 71.1 | NR | CABG 80.8%; AVR 8.1% | 2.1 (1071) | ES2 3.1%; STS 2.7% | ES2 0.68; STS 0.78 | ES2 0.77; STS 0.81
Qadir et al., 2014 [60] | Pakistan, 1 centre; RS | 2006–2010 | 2004 | RF presumed absent | 58.3 ± 9.6 | 82.7 | Urgent 11.1%; Emergency 11.1%; Salvage 5.6% | Isolated CABG | 3.8 (76) | ES2 3.72% | ES2 1.02 | ES2 0.836
Rabbani et al., 2014 [64] | Pakistan, 1 centre; RS | 2006–2013 | 576 (STS: 490) | RF presumed absent | 47.36 ± 15.5 | 53.5 | NR | Valve replacement surgery ± CABG | 5.7 (28) | ES2 4.94%; STS 2.13% | ES2 1.15; STS 2.68 | ES2 0.816; STS 0.812
Shapira-Daniels et al., 2020 [33] | Israel, 1 centre; RS | 2008–2015 | 1279 | NR | 64 ± 12 | 73 | Urgent 47%; Emergent/salvage 1% | CABG 62%; AVR 17% | 1.95 (25) | ES2 3.31%; STS 3.12% | ES2 0.59; STS 0.63 | ES2 0.81; STS 0.83
Tiveron et al., 2015 [34] | Brazil, 1 centre; PS | 2011–2013 | 562 | NR | NR | NR | NR | CABG 65.5%; Valve 28.5%; Valve + CABG 6% | 4.6 (26) | ES2 1.3%; STS 3.7% | ES2 3.54; STS 1.24 | ES2 0.704; STS 0.649
Tralhão et al., 2015 [49]; patients >80 years | Portugal, 1 centre; RS | 2003–2010 | 106 | RF presumed absent | 83.1 ± 2.2 | 36.8 | Urgent 9.4%; Emergency 0% | Isolated AVR | 5.7 (6) | ES2 4.4%; STS 4.0% | ES2 1.30; STS 1.43 | ES2 0.792; STS 0.702
Wang et al., 2013 [65] | China, 1 centre; RS | 2006–2011 | 3479 | Imputation | 50 ± 12.4 | 46.2 | NR | Valve surgery only | 3.2 (112) | ES2 2.52%; STS 3.28% | ES2 1.28; STS 0.98 | ES2 0.693; STS 0.706
Wang et al., 2014 [61] | NZ, 1 centre; RS | 2010–2012 | 818 | NR | 64.5 ± 10.0 | 79.8 | NR | Isolated CABG | 1.6 (13) | ES2 1.6%; STS 2.3% | ES2 1.0; STS 0.70 | ES2 0.642; STS 0.641
Wang et al., 2015 [52] | NZ, 1 centre; RS | 2005–2012 | 620 | NR | 64.8 ± 15.5 | 65.5 | Urgent 50.6%; Emergency 0.3% | AVR ± CABG | 2.9 (18) | ES2 3.8%; STS 2.8% | ES2 0.76; STS 1.04 | ES2 0.711; STS 0.684
Wendt et al., 2014 [50] | Germany, 1 centre; RS | 1999–2012 | 1066 | Nil | 68.3 ± 11.5 | 53.8 | NR | AVR ± CABG | 4.2 (45) | ES2 3.2%; STS 4.8% | ES2 1.31; STS 0.88 | ES2 0.724; STS 0.726
Yamaoka et al., 2016 [51] | Japan, 1 centre; RS | 2002–2013 | 406 | NR | 71.6 ± 9.9 | 53 | Urgent/emergency 2% | AVR ± CABG | 3.4 (14) | ES2 3.1%; STS 4.9% | ES2 1.09; STS 0.69 | ES2 0.704; STS 0.781

Distinct patient populations within a study (shown in bold in the original table) are listed after the author or country. AUC: area under the receiver operator curve; AVR: aortic valve replacement; CABG: coronary artery bypass graft; ES: European System for Cardiac Operative Risk Evaluation; MVR: mitral valve repair/replacement; NR: not reported; NZ: New Zealand; O:E: observed-to-expected mortality; PS: prospective; RF: risk factor; RS: retrospective; SD: standard deviation; STS: Society of Thoracic Surgeons.

The necessary data could be derived from 39 studies [2, 24–30, 32–34, 36–40, 42, 46–58, 60–68] (42 independent populations; 190 378 patients, 6254 deaths) on ES2 and 21 studies [28–30, 32–34, 41, 44, 46, 48–52, 57–59, 63–65] (23 independent populations; 92 291 patients; 2477 deaths) on the STS score, with 18 papers [28–30, 32–34, 46, 48–52, 57, 58, 61, 63–65] (19 independent populations; 84 132 patients; 3455 deaths) comparing ES2 and STS.

Individual model performance

European System for Cardiac Operative Risk Evaluation 2 in individual studies

The ES2 showed good discrimination (AUC = 0.782; 95% CI: 0.763–0.800; 95% PI: 0.646–0.875) and calibration (O:E = 1.118; 95% CI: 0.950–1.317; 95% PI: 0.430–2.912) (Fig. 2/Table 2). There was no significant difference in AUC between studies at high and low risks of bias (Supplementary Material, Figs. S1 and S2), between continents nor between studies reporting on patients operated on before and after 2010 (Supplementary Material, Fig. S7).
Figure 2:

Forest plots of meta-analysis of European System for Cardiac Operative Risk Evaluation 2. (A) Area under the receiver operator curve. (B) Observed-to-expected ratio.

Table 2:

Tabulated results of meta-analyses

Prediction model | Parameter measured | Number of studies | Summary | 95% CI | 95% PI | I²
Individual model performance
ES2 | Discrimination (AUC) | 40 | 0.782 | 0.763 to 0.800 | 0.646 to 0.875 | 95.4
ES2 | Calibration (O:E) | 40 | 1.118 | 0.950 to 1.317 | 0.430 to 2.912 | 97.0
STS | Discrimination (AUC) | 23 | 0.757 | 0.727 to 0.785 | 0.651 to 0.839 | 56.4
STS | Calibration (O:E) | 23 | 1.111 | 0.853 to 1.447 | 0.318 to 3.889 | 96.8

AUC: area under the receiver operator curve; CI: confidence interval; ES2: European System for Cardiac Operative Risk Evaluation 2; O:E: observed-to-expected mortality ratio; PI: prediction interval; STS: Society of Thoracic Surgeons.

We found that ES2 calibration varied significantly between continents (P < 0.0001). ES2 overestimated risk in NA (O:E = 0.515; 95% CI: 0.312–0.718) and NZ (O:E = 0.680; 95% CI: 0.429–0.931) and underestimated risk in SA (O:E = 2.279; 95% CI: 1.403–3.155). ES2 had a trend towards risk underestimation in ‘post-2010’ studies (O:E = 1.368; 95% CI: 1.004–1.732) compared to ‘pre-2010’ studies (O:E = 0.991; 95% CI: 0.854–1.128) (P = 0.057) (Table 3/Supplementary Material, Fig. S8). There was statistical evidence of an association between AUC and O:E and the type of operation (P < 0.0001), largely driven by 1 mitral study (Table 3).
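The direction of miscalibration reported above follows directly from where the O:E confidence interval sits relative to 1. The short Python sketch below applies that reading to the pooled ES2 subgroup intervals quoted in the text; the function name `calibration_direction` and its labels are hypothetical and used only for illustration.

```python
def calibration_direction(ci_low, ci_high):
    """Classify calibration from the 95% CI of a pooled O:E ratio."""
    if ci_high < 1.0:
        # Fewer deaths observed than expected
        return "over-predicts risk"
    if ci_low > 1.0:
        # More deaths observed than expected
        return "under-predicts risk"
    return "CI includes 1"

# Pooled ES2 O:E confidence intervals by subgroup, as quoted in the text.
es2_oe_ci = {
    "North America": (0.312, 0.718),
    "New Zealand": (0.429, 0.931),
    "South America": (1.403, 3.155),
    "Pre-2010": (0.854, 1.128),
    "Post-2010": (1.004, 1.732),
}

for subgroup, (low, high) in es2_oe_ci.items():
    print(f"{subgroup}: {calibration_direction(low, high)}")
```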
Table 3:

Subgroup analysis of European System for Cardiac Operative Risk Evaluation 2


Subgroup | Number of studies | Summary | CI | I²
Discrimination (AUC)
Summary estimate | 40 | 0.782 | 0.763–0.800 | 95.4
Subgroup analysis
By operation (all studies: P < 0.0001; excluding MVR: P = 0.07)
AVR ± CABG | 7 | 0.742 | 0.718–0.766 | 64.5
CABG | 7 | 0.789 | 0.730–0.848 | 97.4
MVR | 1 | 0.670 | 0.648–0.692 |
Valve | 2 | 0.759 | 0.639–0.879 | 90.5
Mixed | 22 | 0.790 | 0.768–0.813 | 95.8
Aortic | 1 | 0.759 | 0.739–0.879 |
By continent (P = 0.557)
Europe | 21 | 0.793 | 0.771–0.815 | 95.6
North America | 4 | 0.770 | 0.697–0.842 | 97.6
South America | 4 | 0.771 | 0.708–0.835 | 95.3
Asia | 8 | 0.763 | 0.4723–0.803 | 94.6
NZ | 3 | 0.729 | 0.620–0.837 | 98.9
Studies containing patients operated on prior to 2010 (P = 0.397)
Pre-2010 | 28 | 0.772 | 0.751–0.793 | 95.3
Post-2010 | 12 | 0.790 | 0.754–0.827 | 97
Calibration (O:E)
Summary estimate | 40 | 1.118 | 0.950–1.317 | 97.0
Subgroup analysis
By operation (all studies: P < 0.0001; excluding MVR: P = 0.55)
AVR ± CABG | 7 | 1.335 | 0.950–1.721 | 58.2
CABG | 7 | 1.267 | 0.449–2.086 | 84.7
MVR | 1 | 0.318 | 0.131–0.515 |
Valve | 2 | 1.249 | 1.046–1.452 | 0
Mixed | 22 | 1.126 | 0.918–1.334 | 95.6
Aortic | 1 | 0.967 | 0.649–1.285 |
By continent (P < 0.0001)
Europe | 21 | 1.099 | 0.987–1.211 | 87.2
North America | 5 | 0.515 | 0.312–0.718 | 80.6
South America | 4 | 2.279 | 1.403–3.155 | 83.1
Asia | 8 | 1.087 | 0.824–1.350 | 78.3
NZ | 3 | 0.680 | 0.429–0.931 | 40.8
Studies containing patients operated on prior to 2010 (P = 0.057)
Pre-2010 | 28 | 0.991 | 0.854–1.128 | 91
Post-2010 | 12 | 1.368 | 1.004–1.732 | 95.1

AUC: area under the receiver operator curve; AVR: aortic valve replacement; CABG: coronary artery bypass graft; CI: confidence interval; MVR: mitral valve repair/replacement; NZ: New Zealand; O:E: observed-to-expected mortality ratio.


Society of Thoracic Surgeons in individual studies

STS demonstrated good discrimination (AUC = 0.757; 95% CI: 0.727–0.785; 95% PI: 0.651–0.839) and calibration (O:E = 1.111; 95% CI: 0.853–1.447; 95% PI: 0.318–3.889; Fig. 3/Table 2). There was a statistically significant association between AUC and the continent of the study (P = 0.03; Table 4/Supplementary Material, Fig. S9), with the lower limit of the CI falling noticeably below 0.7 for SA (0.731; 95% CI: 0.627–0.834) and NZ (0.667; 95% CI: 0.532–0.801). There was strong statistical evidence of an association between calibration and operation (P = 0.0018), largely driven by 1 mitral study (Table 4). There were no significant differences in STS calibration between continents (P = 0.42) or over time (P = 0.37; Table 4).
Figure 3:

Forest plots of meta-analysis of Society of Thoracic Surgeons score. (A) Area under the receiver operator curve. (B) Observed-to-expected ratio.

Table 4:

Subgroup analysis of Society of Thoracic Surgeons


Subgroup | Number of studies | Summary | CI | I²
Discrimination (AUC)
Summary estimate | 23 | 0.757 | 0.727 to 0.785 | 56.4
Subgroup analysis
By operation (all studies: P = 0.22; excluding MVR: P = 0.13)
AVR ± CABG | 6 | 0.728 | 0.667 to 0.789 | 0
CABG | 7 | 0.745 | 0.772 to 0.821 | 51
MVR | 1 | 0.740 | 0.533 to 0.947 |
Valve | 2 | 0.749 | 0.647 to 0.851 | 58.9
Mixed | 7 | 0.797 | 0.772 to 0.821 | 48.6
Aortic | 0 | | |
By continent (P = 0.03)
Europe | 6 | 0.751 | 0.684 to 0.818 | 66.6
North America | 7 | 0.809 | 0.792 to 0.827 | 0
South America | 2 | 0.731 | 0.627 to 0.836 | 55
Asia | 6 | 0.758 | 0.699 to 0.817 | 6
NZ | 2 | 0.667 | 0.532 to 0.801 | 0
Studies containing patients operated on prior to 2010 (P = 0.21)
Pre-2010 | 19 | 0.773 | 0.742 to 0.805 | 40.6
Post-2010 | 4 | 0.714 | 0.628 to 0.801 | 25.4
Calibration (O:E)
Summary estimate | 23 | 1.111 | 0.853 to 1.447 | 96.8
Subgroup analysis
By operation (all studies: P = 0.0018; excluding MVR: P = 0.36)
AVR ± CABG | 6 | 1.171 | 0.788 to 1.555 | 65.1
CABG | 7 | 0.913 | 0.726 to 1.100 | 41.5
MVR | 1 | 0.414 | 0.171 to 0.658 |
Valve | 2 | 1.763 | 0.102 to 3.425 | 91.3
Mixed | 7 | 1.888 | 0.024 to 3.752 | 98.5
Aortic | 0 | | |
By continent (P = 0.42)
Europe | 6 | 1.056 | 0.832 to 1.279 | 77.9
North America | 7 | 0.847 | 0.573 to 1.122 | 71
South America | 2 | 4.440 | −1.823 to 10.702 | 99.5
Asia | 6 | 1.230 | 0.640 to 1.820 | 80.8
NZ | 2 | 0.832 | 0.499 to 1.166 | 21.3
Studies containing patients operated on prior to 2010 (P = 0.37)
Pre-2010 | 19 | 0.987 | 0.815 to 1.159 | 85.1
Post-2010 | 4 | 2.639 | −0.622 to 5.901 | 99

AUC: area under the receiver operator curve; AVR: aortic valve replacement; CABG: coronary artery bypass graft; CI: confidence interval; MVR: mitral valve repair/replacement; NZ: New Zealand; O:E: observed-to-expected mortality ratio.


European System for Cardiac Operative Risk Evaluation 2 versus Society of Thoracic Surgeons in comparative studies

Discrimination was similar for ES2 [AUC: 0.756 (95% CI: 0.728–0.783)] and STS [AUC: 0.752 (95% CI: 0.720–0.781)], and the difference in AUC was not statistically significant [−0.016 (95% CI: −0.033 to 0.002); P = 0.09; Table 2/Fig. 4]. The pooled estimates of the O:E for ES2 (1.124; 95% CI: 0.804–1.710) and STS (1.116; 95% CI: 0.812–1.535) were also similar, with overlapping CIs.
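A reported difference and its 95% CI can be checked for consistency with the accompanying P value. The sketch below back-calculates the standard error from the CI width and derives a two-sided P value, assuming the CI is symmetric and based on a normal approximation; the helper name `p_from_ci` is hypothetical. Because the published figures are rounded, only approximate agreement with the reported P value should be expected.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate SE, z and two-sided P from an estimate and its 95% CI.

    Assumes a symmetric, normal-approximation CI (an assumption, not
    something stated in the paper).
    """
    se = (upper - lower) / (2 * z_crit)   # CI width spans 2 * 1.96 * SE
    z = estimate / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p

# Difference in AUC and its 95% CI as reported in the comparative analysis
se, z, p = p_from_ci(-0.016, -0.033, 0.002)
print(f"SE = {se:.4f}, z = {z:.2f}, two-sided P = {p:.3f}")
```

With these inputs the CI crosses zero and the derived P value lies between 0.05 and 0.10, in line with a difference that does not reach statistical significance.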
Figure 4:

Difference in discrimination of European System for Cardiac Operative Risk Evaluation 2 and Society of Thoracic Surgeons score. TE: difference in C-statistic; seTE: standard error of difference in C-statistic.


DISCUSSION

We compared the performance of the 2 most used mortality prediction models in adult cardiac surgery, the ES2 and STS scores, using measures of discrimination (AUC) and calibration (O:E). Discrimination is a model’s ability to differentiate between those likely and unlikely to experience an event in a given population. Calibration describes how closely the risk predicted by the model agrees with the risk actually observed. Both should be optimized to have a truly efficient model. Our results build on findings from 3 previous meta-analyses [6, 22, 23] by providing a dedicated statistical technique to quantitatively assess calibration in addition to discrimination and by performing extended subgroup analysis.
The most notable finding of our study was that, whilst the ES2 and STS performed well across the whole population, there was significant variation in the performance of ES2 between continents. It worked well in the continent from which it was derived (i.e. Europe) but over-predicted risk in North America and New Zealand and under-predicted risk in South America. The availability of the coefficients for ES2 in the public domain may explain why its performance is more widely reported and why there are substantially more papers from Europe. There was a tendency for ES2 to under-predict risk in papers with patients operated on solely after 2010.
In contrast, the STS score showed good and stable performance in all continents and across both time periods studied. The STS score regression coefficients are not in the public domain, and the score utilizes far more variables to provide procedure-specific outcome calculations of morbidity and mortality. Consequently, the STS score performance was reported far less frequently. A key difference between the models is that STS is recalibrated annually to ensure the O:E ratio remains around 1 [10, 11]. Analysis of papers providing direct comparisons of calibration of the 2 models suggested a non-significant difference between them.
The same predominance of European papers was not seen here and this may account for the discrepancy in our findings. It would have been interesting to evaluate the calibration of these models using the calibration slope or calibration-in-the-large; however, these are often not reported. The Hosmer–Lemeshow statistic is one of the most widely reported statistics regarding model calibration but does not lend itself to statistical comparison between studies. Over time, the risk profile of patients has increased but operative mortality has decreased, and ES has been shown to suffer from poor calibration, especially in those at highest risk [69–73]. The lack of availability of individual patient-level data limited our ability to analyse differential model performance in high- and low-risk populations. Further review of these population subgroups would be of clinical importance. Clinicians need to balance the superior performance of the STS with the relative parsimony and ease of use of ES2. Our findings suggest that ES2 and STS can be used in the populations from which they are derived but that STS may offer advantages when performing comparative research across continents.
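The distinction between discrimination and calibration drawn in this Discussion can be made concrete with a small calculation. The sketch below computes the AUC (as the proportion of death/survivor pairs in which the death had the higher predicted risk) and the O:E ratio for a hypothetical cohort of 8 patients; the function names and all data are invented for illustration and do not come from any included study.

```python
def auc(pred, outcome):
    """AUC as the probability that a patient who died was assigned a higher
    predicted risk than a patient who survived (ties count as 0.5)."""
    died = [p for p, y in zip(pred, outcome) if y == 1]
    survived = [p for p, y in zip(pred, outcome) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in died for b in survived)
    return wins / (len(died) * len(survived))

def observed_to_expected(pred, outcome):
    """O:E ratio: observed deaths divided by the sum of predicted risks."""
    return sum(outcome) / sum(pred)

# Hypothetical predicted mortality risks and outcomes (1 = died)
pred = [0.01, 0.02, 0.03, 0.05, 0.08, 0.10, 0.20, 0.40]
outcome = [0, 0, 0, 0, 1, 0, 0, 1]

print(f"AUC = {auc(pred, outcome):.3f}, O:E = {observed_to_expected(pred, outcome):.3f}")

# Halving every predicted risk keeps the ranking, so the AUC is unchanged,
# but the model now under-predicts and the O:E ratio doubles.
halved = [p / 2 for p in pred]
print(f"AUC = {auc(halved, outcome):.3f}, O:E = {observed_to_expected(halved, outcome):.3f}")
```

The second call shows why both measures are needed: a model can rank patients equally well in two populations while systematically over- or under-predicting risk in one of them.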

Limitations

Bias may have been introduced into the study as we only reviewed articles in English. Abstracts and unpublished works could not be included, and their exclusion may have resulted in publication bias. Small study effects and significant heterogeneity could not be eliminated despite performing meta-regression, subgroup and sensitivity analyses. We were only able to include studies from which the AUC and O:E ratio could be derived, and a large study [74] was excluded for this reason. Reclassification metrics have been shown to be a good estimate of model discrimination [75]; however, they were not reported in these studies and the lack of individual patient-level data made their derivation impossible. The ES2 and STS calibration demonstrated statistically significant differences by type of operation, which were driven by a single study on mitral operations. Most studies evaluated either a mixed population, aortic valve replacements ± CABG or isolated CABG. There were few studies with dedicated performance measures on mitral valve, aortic or off-pump CABG operations, and so the utility of these scoring systems in these subgroups could not be evaluated accurately. With the increasing number of ‘prophylactic’ aortic aneurysm operations being conducted and the emergence of transcatheter mitral interventions, the validation of existing risk prediction models in these populations will become increasingly relevant. Some interventional cardiologists have reported the use of these scoring systems in the prediction of risk in their patients and this is partially reflected in the latest guidelines [7]. We did not review the accuracy of these models in patients undergoing interventional procedures and so cannot comment on their applicability in this setting.

CONCLUSIONS

The results of this meta-analysis validate the use of either ES2 or STS in the prediction of mortality following adult cardiac surgery, especially in the continent from which they were derived. Both scores show good discrimination throughout the populations studied. The STS may be better calibrated when evaluating outcomes across European and North American centres. Future research should focus on analysis of large databases of individual patient-level data to corroborate these findings.

SUPPLEMENTARY MATERIAL

Supplementary material is available at ICVTS online.