Marisa A Bartz-Kurycki (1), Charles Green (1), Kathryn T Anderson (1), Adam C Alder (2), Brian T Bucher (3), Robert A Cina (4), Ramin Jamshidi (5), Robert T Russell (6), Regan F Williams (7), KuoJen Tsao (8).
1. McGovern Medical School at the University of Texas Health Science Center at Houston, 6431 Fannin St, Houston, TX, 77030, USA.
2. Children's Medical Center of Dallas, 1935 Medical District Dr, Dallas, TX, 75235, USA.
3. University of Utah School of Medicine, 30 N 1900 E, Salt Lake City, UT, 84132, USA.
4. Medical University of South Carolina, 180 Calhoun St, Charleston, SC, 29401, USA.
5. Phoenix Children's Hospital, 1919 E Thomas Rd, Phoenix, AZ, 85016, USA.
6. Children's Hospital of Alabama, University of Alabama at Birmingham, 1600 7th Ave. S., Birmingham, AL, 35233, USA.
7. University of Tennessee Health Science Center, 910 Madison Ave, Memphis, TN, 38163, USA.
8. McGovern Medical School at the University of Texas Health Science Center at Houston, 6431 Fannin St, Houston, TX, 77030, USA. Electronic address: KuoJen.Tsao@uth.tmc.edu.
Abstract
BACKGROUND: Machine learning can elucidate complex relationships and provide insight into important variables in large datasets. This study aimed to develop an accurate model to predict neonatal surgical site infections (SSI) using different statistical methods. METHODS: Neonatal data from the 2012-2015 National Surgical Quality Improvement Program-Pediatric were used for model development and validation. The primary outcome was any SSI. Four models were compared: full multiple logistic regression (LR), a priori clinical LR, random forest classification (RFC), and a hybrid LR model that combined clinical knowledge with significant variables from the RFC to maximize predictive power. RESULTS: A total of 16,842 patients (median age 18 days, IQR 3-58) were included. 542 SSIs (4%) were identified. Several covariates were identified as significant by more than one model. Area under the curve was similar across models (full model 0.65, clinical model 0.67, RFC 0.68, hybrid LR 0.67); however, the hybrid model used the fewest variables (18). CONCLUSIONS: The hybrid model had predictive performance similar to that of the other models while using fewer and more clinically relevant variables. Machine-learning algorithms can identify important novel characteristics, which enhance clinical prediction models.
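The models above are compared by area under the curve (AUC). As a reading aid, the following minimal Python sketch shows one standard way to compute AUC from predicted risks: the fraction of (infected, non-infected) patient pairs in which the infected patient received the higher predicted risk, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations; they are not the study's data or the authors' code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    scores higher than a randomly chosen negative case (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = SSI, 0 = no SSI; scores are predicted risks.
labels = [0, 0, 1, 0, 1, 0, 1, 0]
scores = [0.02, 0.05, 0.04, 0.01, 0.10, 0.03, 0.03, 0.06]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.2f}")
```

Under this definition, an AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so the reported values of 0.65 to 0.68 indicate modest discrimination for every model.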