Tom van den Bosch, Anne-Loes K Warps, Michael P M de Nerée Tot Babberich, Christina Stamm, Bart F Geerts, Louis Vermeulen, Michel W J M Wouters, Jan Willem T Dekker, Rob A E M Tollenaar, Pieter J Tanis, Daniël M Miedema.
Abstract
Importance: Quality improvement programs for colorectal cancer surgery have been introduced with benchmarking based on quality indicators, such as mortality. Detailed (pre)operative characteristics may offer relevant information for proper case-mix correction.

Objective: To investigate the added value of machine learning to predict quality indicators for colorectal cancer surgery and identify previously unrecognized predictors of 30-day mortality based on a large, nationwide colorectal cancer registry that collected extensive data on comorbidities.

Design, Setting, and Participants: All patients who underwent resection for primary colorectal cancer registered in the Dutch ColoRectal Audit between January 1, 2011, and December 31, 2016, were included. Multiple machine learning models (multivariable logistic regression, elastic net regression, support vector machine, random forest, and gradient boosting) were made to predict quality indicators. Model performance was compared with conventionally used scores. Risk factors were identified by logistic regression analyses and Shapley additive explanations (ie, SHAP values). Statistical analysis was performed between March 1 and September 30, 2020.

Main Outcomes and Measures: The primary outcome of this cohort study was 30-day mortality. Prediction models were trained on a training set by performing 5-fold cross-validation, and outcomes were measured by the area under the receiver operating characteristic curve on the test set. Machine learning was further used to identify risk factors, measured by odds ratios and SHAP values.
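The 5-fold cross-validation mentioned above can be illustrated with a minimal sketch. The function names `k_fold_indices` and `train_validation_splits`, the seed, and the sample count are hypothetical and chosen for illustration. The split here is a plain shuffled one, and the abstract does not say whether the study stratified folds by outcome, so this should not be read as the authors' procedure.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Split range(n_samples) into k shuffled folds whose sizes differ by at most 1."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold i takes every k-th shuffled index, starting at position i.
    return [indices[i::k] for i in range(k)]

def train_validation_splits(n_samples, k=5, seed=0):
    """Yield (train_indices, validation_indices), using each fold once for validation."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation

# Hypothetical toy size; in practice this would be the number of training-set patients.
splits = list(train_validation_splits(n_samples=23, k=5))
for train, validation in splits:
    print(len(train), len(validation))
```

Each sample lands in exactly one validation fold, so every training-set patient contributes to validation once while the held-out test set stays untouched.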
Year: 2021 PMID: 33900400 PMCID: PMC8076964 DOI: 10.1001/jamanetworkopen.2021.7737
Source DB: PubMed Journal: JAMA Netw Open ISSN: 2574-3805
Figure 1. Patient Inclusion Criteria
AUC Scores for All Outcome Measures
| Outcome measure | Best machine learning model | DCRA case-mix regression model | ASA score | POSPOM | CCI |
|---|---|---|---|---|---|
| Mortality | 0.82 (0.79-0.85) | 0.81 (0.78-0.84) | 0.74 (0.71-0.77) | 0.73 (0.70-0.77) | 0.66 (0.63-0.70) |
| Mortality, P value | NA | .01 | 1.1 × 10⁻¹⁰ | 1.4 × 10⁻¹⁰ | 6.0 × 10⁻¹⁷ |
| Complicated course | 0.68 (0.67-0.69) | NA | NA | NA | NA |
| Prolonged length of stay | 0.71 (0.69-0.73) | NA | NA | NA | NA |
| Readmission | 0.63 (0.61-0.65) | NA | NA | NA | NA |
| ICU admission | 0.74 (0.72-0.75) | NA | NA | NA | NA |
Abbreviations: ASA, American Society of Anesthesiology; AUC, area under the receiver operating characteristic curve; CCI, Charlson Comorbidity Index; DCRA, Dutch ColoRectal Audit; ICU, intensive care unit; NA, not applicable; POSPOM, preoperative score to predict postoperative mortality.
Reported P values are in comparison with the best machine learning model and were calculated using the test of DeLong et al.[34]
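As a reading aid for the AUC values in the table, the sketch below computes the AUC as the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it, with ties counted as one half. The function name `auc` and the toy scores and labels are hypothetical and not taken from the study. The pairwise loop is quadratic and intended only for illustration, and the DeLong comparison test is not implemented here.

```python
def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted risks and outcomes (1 = event, 0 = no event).
toy_scores = [0.9, 0.4, 0.35, 0.6, 0.5, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(toy_scores, toy_labels)
print(round(toy_auc, 3))
```

In the toy data, 8 of the 9 positive-negative pairs are ordered correctly, so the AUC is 8/9. A value of 0.5 corresponds to random ordering and 1.0 to perfect separation.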
Figure 2. Receiver Operating Characteristic Plot for 30-Day Mortality
Accuracy of 30-day mortality prediction for the best performing machine learning (ML) model (elastic net regression), case-mix logistic regression (LR) model, the preoperative score to predict postoperative mortality (POSPOM), American Society of Anesthesiology (ASA) score, and Charlson Comorbidity Index (CCI).
Figure 3. Significant Predictors of 30-Day Mortality
Logistic regression model of 30-day mortality for 62 501 patients. All regression coefficients with P < .05 are translated to odds ratios. For categorical variables, references are shown on the right axis. To convert creatinine to mg/dL, divide by 88.4. BMI indicates body mass index (calculated as weight in kilograms divided by height in meters squared); COPD, chronic obstructive pulmonary disease. (a) Reference, 60-70 years. (b) Reference, 18.5-25. (c) Reference, American Society of Anesthesiology (ASA) score I. (d) Reference, elective setting. (e) Reference, open approach. (f) Reference, low anterior resection or sigmoid resection.
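The conversions named in this caption are simple enough to sketch directly. The snippet assumes the standard relationship that a logistic regression coefficient exponentiates to an odds ratio. The function names and the example numbers are hypothetical and do not come from the study's fitted model.

```python
import math

def odds_ratio(coefficient):
    """Translate a logistic regression coefficient (log odds) to an odds ratio."""
    return math.exp(coefficient)

def creatinine_umol_to_mg_dl(creatinine_umol_per_l):
    """Convert creatinine from µmol/L to mg/dL by dividing by 88.4, as in the caption."""
    return creatinine_umol_per_l / 88.4

def body_mass_index(weight_kg, height_m):
    """BMI: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2

# Illustrative values only, not results from the paper.
example_or = odds_ratio(math.log(2))              # a coefficient of ln(2) doubles the odds
example_creatinine = creatinine_umol_to_mg_dl(88.4)
example_bmi = body_mass_index(70, 1.75)
print(example_or, example_creatinine, round(example_bmi, 1))
```

An odds ratio above 1 for a category means higher odds of 30-day mortality than in the reference category listed in the caption footnotes.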
Figure 4. Variables That Demonstrated the Greatest Association With Prediction of 30-Day Mortality
Top 30 Shapley additive explanation (SHAP) feature values of the gradient-boosting model for prediction of 30-day mortality. SHAP values were calculated per variable for all patients in the test set. Distributions of SHAP values for patients are shown in blue (patients who are positive for a variable) and orange (patients who are negative for a variable). SHAP values were ranked by the mean of the absolute value across all patients in the test set. ASA indicates American Society of Anesthesiology; BMI, body mass index (calculated as weight in kilograms divided by height in meters squared); COPD, chronic obstructive pulmonary disease; and MDT, multidisciplinary team.
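The ranking by mean absolute SHAP value described in this caption can be sketched on a toy example. The code computes exact Shapley values by enumerating feature subsets against a single all-zero baseline, which is an assumption made for illustration. The study applied SHAP to a gradient-boosting model, and the enumeration below is not that method. The linear `toy_model`, the feature names, and the patient rows are hypothetical.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, x, baseline):
    """Exact Shapley values for one sample; absent features take the baseline value."""
    n = len(x)

    def value(present):
        return model([x[i] if i in present else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi[i] += weight * (value(present | {i}) - value(present))
    return phi

def rank_by_mean_abs(shap_rows, names):
    """Rank features by the mean absolute SHAP value across all samples."""
    n_rows = len(shap_rows)
    mean_abs = [sum(abs(row[j]) for row in shap_rows) / n_rows for j in range(len(names))]
    return sorted(zip(names, mean_abs), key=lambda pair: pair[1], reverse=True)

def toy_model(z):
    # Hypothetical linear risk score, standing in for a fitted model.
    return 0.5 * z[0] + 2.0 * z[1] - 1.0 * z[2]

names = ["feature_a", "feature_b", "feature_c"]
patients = [[1, 0, 1], [1, 1, 0], [0, 1, 1]]
baseline = [0, 0, 0]
shap_rows = [shapley_values(toy_model, x, baseline) for x in patients]
ranking = rank_by_mean_abs(shap_rows, names)
print([name for name, _ in ranking])
```

For each patient, the Shapley values sum to the difference between the model output for that patient and the output at the baseline, which is why per-patient values can be aggregated into a global ranking of this kind.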