Kevin A Chen (1), Matthew E Berginski (2), Chirag S Desai (1), Jose G Guillem (1), Jonathan Stem (1), Shawn M Gomez (2, 3), Muneera R Kapadia (4).
1. Division of Gastrointestinal Surgery, Department of Surgery, University of North Carolina, 101 Manning Drive, Burnett Womack Building, Suite 4038, Chapel Hill, NC, 27599, USA.
2. Department of Pharmacology, University of North Carolina, 120 Mason Farm Rd, Genetic Medicine Building, Chapel Hill, NC, 27599, USA.
3. Joint Department of Biomedical Engineering, University of North Carolina, 10202C Mary Ellen Jones Building, Chapel Hill, NC, 27599, USA.
4. Division of Gastrointestinal Surgery, Department of Surgery, University of North Carolina, 101 Manning Drive, Burnett Womack Building, Suite 4038, Chapel Hill, NC, 27599, USA. muneera_kapadia@med.unc.edu.
Abstract
BACKGROUND: Procedure-specific complications can have devastating consequences. Machine learning-based tools have the potential to outperform traditional statistical modeling in predicting the risk of these complications and guiding decision-making. We sought to develop deep neural network (NN) models, a type of machine learning, and compare them with logistic regression (LR) for predicting anastomotic leak after colectomy, bile leak after hepatectomy, and pancreatic fistula after pancreaticoduodenectomy (PD).
METHODS: The colectomy, hepatectomy, and PD National Surgical Quality Improvement Program (NSQIP) databases were analyzed. Each dataset was split into training, validation, and testing sets in a 60/20/20 ratio, with fivefold cross-validation. Models were created using NN and LR for each outcome. Models were evaluated primarily with area under the receiver operating characteristic curve (AUROC).
RESULTS: A total of 197,488 patients were included for colectomy, 25,403 for hepatectomy, and 23,333 for PD. For anastomotic leak, AUROC for NN was 0.676 (95% CI 0.666-0.687), compared with 0.633 (95% CI 0.620-0.647) for LR. For bile leak, AUROC for NN was 0.750 (95% CI 0.739-0.761), compared with 0.722 (95% CI 0.698-0.746) for LR. For pancreatic fistula, AUROC for NN was 0.746 (95% CI 0.733-0.760), compared with 0.713 (95% CI 0.703-0.723) for LR. Variables related to intra-operative information, such as surgical approach, biliary reconstruction, and pancreatic gland texture, were highly important for model predictions.
DISCUSSION: Machine learning showed a marginal advantage over traditional statistical techniques in predicting procedure-specific outcomes. However, models that included intra-operative information performed better than those that did not, suggesting that NSQIP procedure-targeted datasets may be strengthened by including relevant intra-operative information.
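The evaluation described in METHODS (a 60/20/20 split and AUROC as the primary metric) can be illustrated with a minimal, standard-library Python sketch. It is not the authors' pipeline. The function names `split_60_20_20`, `auroc`, and `bootstrap_ci` are hypothetical, and the percentile bootstrap used for the 95% interval is an assumption, since the abstract does not state how its confidence intervals were computed.

```python
import random

def split_60_20_20(n_rows, seed=0):
    """Shuffle row indices and split them 60/20/20 into train/validation/test."""
    idx = list(range(n_rows))
    random.Random(seed).shuffle(idx)
    n_train = n_rows * 6 // 10
    n_val = n_rows * 2 // 10
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney) identity; tied scores share an average rank."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative case")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUROC (an assumed method, not the paper's)."""
    auroc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

In use, a model (NN or LR) would be fit on the training indices, tuned on the validation indices, and `auroc` and `bootstrap_ci` would then be applied to its predicted probabilities on the held-out test indices only.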