Robert J Huang1, Nicole Sung-Eun Kwon1, Yutaka Tomizawa2, Alyssa Y Choi3, Tina Hernandez-Boussard4, Joo Ha Hwang1. 1. Division of Gastroenterology and Hepatology, Stanford University School of Medicine, Stanford, CA. 2. Division of Gastroenterology, University of Washington, Seattle, WA. 3. Division of Gastroenterology and Hepatology, University of California Irvine, Irvine, CA. 4. Department of Medicine, Biomedical Informatics Research, Stanford University, Stanford, CA.
Abstract
PURPOSE: Noncardia gastric cancer (NCGC) is a leading cause of global cancer mortality and is often diagnosed at advanced stages. Development of NCGC risk models within electronic health records (EHR) may allow for improved cancer prevention. There has been much recent interest in the use of machine learning (ML) for cancer prediction, but few studies have compared ML with classical statistical models for NCGC risk prediction. METHODS: We trained models using logistic regression (LR) and four commonly used ML algorithms to distinguish NCGC cases from age-/sex-matched controls in two EHR systems: Stanford University and the University of Washington (UW). The LR model contained well-established NCGC risk factors (intestinal metaplasia histology, prior Helicobacter pylori infection, race, ethnicity, nativity status, smoking history, anemia), whereas the ML models agnostically selected variables from the EHR. Models were developed and internally validated in the Stanford data, and externally validated in the UW data. Model hyperparameters were tuned using cross-validation. Model performance was compared by accuracy, sensitivity, and specificity. RESULTS: In internal validation, LR performed with comparable accuracy (0.732; 95% CI, 0.698 to 0.764), sensitivity (0.697; 95% CI, 0.647 to 0.744), and specificity (0.767; 95% CI, 0.720 to 0.809) to penalized lasso, support vector machine, K-nearest neighbor, and random forest models. In external validation, LR continued to demonstrate high accuracy, sensitivity, and specificity. Although K-nearest neighbor demonstrated higher accuracy and specificity, this was offset by significantly lower sensitivity. No ML model consistently outperformed LR across evaluation criteria. CONCLUSION: Drawing data from two independent EHRs, we found that LR based on established risk factors demonstrated performance comparable to that of optimized ML algorithms.
This study demonstrates that classical models built on robust, hand-chosen predictor variables may not be inferior to data-driven models for NCGC risk prediction.
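The study design described above can be sketched in code. The following is a minimal illustration, not the authors' implementation: it uses synthetic data in place of EHR records, restricts LR to a hand-chosen feature subset (standing in for the seven established risk factors), lets the ML comparators use all available features, tunes their hyperparameters by cross-validation, and reports accuracy, sensitivity, and specificity for each model. All dataset parameters and hyperparameter grids here are illustrative assumptions.

```python
# Hedged sketch of the comparison design (LR on fixed predictors vs.
# cross-validation-tuned ML models), using synthetic data as a stand-in
# for EHR case/control records. Not the authors' code.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix

# Synthetic cohort: shuffle=False keeps the informative columns first,
# so the first 7 columns can play the role of "established risk factors".
X, y = make_classification(n_samples=1000, n_features=30, n_informative=10,
                           shuffle=False, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

def evaluate(model, cols):
    """Fit on the training split and report test accuracy/sensitivity/specificity."""
    model.fit(X_train[:, cols], y_train)
    tn, fp, fn, tp = confusion_matrix(y_test, model.predict(X_test[:, cols])).ravel()
    return {"accuracy": (tp + tn) / (tp + tn + fp + fn),
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp)}

# Classical model: LR restricted to the 7 hand-chosen predictors.
risk_factor_cols = slice(0, 7)
results = {"logistic_regression":
           evaluate(LogisticRegression(max_iter=1000), risk_factor_cols)}

# ML comparators: agnostic access to all 30 features; hyperparameters
# tuned by 5-fold cross-validation (grids are illustrative).
ml_models = {
    "random_forest": GridSearchCV(RandomForestClassifier(random_state=0),
                                  {"n_estimators": [50, 100]}, cv=5),
    "knn": GridSearchCV(KNeighborsClassifier(),
                        {"n_neighbors": [5, 15, 25]}, cv=5),
    "svm": GridSearchCV(SVC(), {"C": [0.1, 1, 10]}, cv=5),
}
for name, model in ml_models.items():
    results[name] = evaluate(model, slice(None))

for name, m in results.items():
    print(f"{name}: acc={m['accuracy']:.3f} "
          f"sens={m['sensitivity']:.3f} spec={m['specificity']:.3f}")
```

In the paper's setting the same three metrics would then be recomputed on the external (UW) cohort to check that the internal-validation ranking holds.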