Ben Boursi (1,2,3,4), Ronac Mamtani (5,6,7), Wei-Ting Hwang (5,6), Kevin Haynes (5,6), Yu-Xiao Yang (8,5,6)
1. Division of Gastroenterology, Perelman School of Medicine at the University of Pennsylvania, 733 Blockley Hall, 423 Guardian Drive, Philadelphia, PA, 19104-6021, USA. bben217@gmail.com.
2. Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.
3. Department of Biostatistics and Epidemiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.
4. Tel-Aviv University, 69978, Tel Aviv, Israel.
5. Center for Clinical Epidemiology and Biostatistics, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.
6. Department of Biostatistics and Epidemiology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.
7. Division of Hematology/Oncology, Perelman School of Medicine at the University of Pennsylvania, Philadelphia, PA, 19104, USA.
8. Division of Gastroenterology, Perelman School of Medicine at the University of Pennsylvania, 733 Blockley Hall, 423 Guardian Drive, Philadelphia, PA, 19104-6021, USA.
Abstract
BACKGROUND: Current risk scores for colorectal cancer (CRC) are based on demographic and behavioral factors and have limited predictive value.
AIM: To develop a novel risk prediction model for sporadic CRC using clinical and laboratory data in electronic medical records.
METHODS: We conducted a nested case-control study in a UK primary care database. Cases were individuals aged 50-85 with a diagnostic code for CRC. Each case was matched with four controls using incidence density sampling. CRC predictors were examined using univariate conditional logistic regression. Variables with a p value < 0.25 in the univariate analysis were further evaluated in multivariate models using backward elimination. Discrimination was assessed using the area under the receiver operating characteristic curve (AUC). Calibration was evaluated using McFadden's R². The net reclassification index (NRI) associated with incorporating laboratory results was calculated. Results were internally validated.
RESULTS: A model similar to existing CRC prediction models, which included age, sex, height, obesity, ever smoking, alcohol dependence, and previous screening colonoscopy, had an AUC of 0.58 (0.57-0.59) with poor goodness of fit. A laboratory-based model including hematocrit, MCV, lymphocytes, and neutrophil-lymphocyte ratio (NLR) had an AUC of 0.76 (0.76-0.77), a McFadden's R² of 0.21, and an NRI of 47.6%. A combined model including sex, hemoglobin, MCV, white blood cells, platelets, NLR, and oral hypoglycemic use had an AUC of 0.80 (0.79-0.81), a McFadden's R² of 0.27, and an NRI of 60.7%. Similar results were obtained in an internal validation set.
CONCLUSION: A laboratory-based risk model had good predictive power for sporadic CRC risk.
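The three performance measures named in the abstract (AUC, McFadden's R², and NRI) can be illustrated with a short Python sketch. This is not the authors' analysis code. It assumes predicted probabilities are already available, ignores the matched-set structure that conditional logistic regression accounts for, and uses the category-free form of the NRI because the abstract does not state which variant was applied. All function names and the toy data are hypothetical.

```python
import math


def auc(scores, labels):
    """Chance that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def log_likelihood(probs, labels):
    """Bernoulli log-likelihood of the labels under the predicted probabilities."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p)
               for p, y in zip(probs, labels))


def mcfadden_r2(probs, labels):
    """1 - LL(model) / LL(null); the null model predicts the overall case fraction."""
    base = sum(labels) / len(labels)
    ll_null = log_likelihood([base] * len(labels), labels)
    return 1.0 - log_likelihood(probs, labels) / ll_null


def continuous_nri(old_probs, new_probs, labels):
    """Category-free NRI (assumed variant): net upward movement among cases
    minus net upward movement among controls."""
    def net_up(y):
        pairs = [(o, n) for o, n, lab in zip(old_probs, new_probs, labels)
                 if lab == y]
        up = sum(n > o for o, n in pairs)
        down = sum(n < o for o, n in pairs)
        return (up - down) / len(pairs)
    return net_up(1) - net_up(0)


# Hypothetical toy data: 2 cases, 4 controls, predictions from two models.
labels = [1, 1, 0, 0, 0, 0]
old = [0.30, 0.20, 0.25, 0.20, 0.15, 0.10]  # e.g. a demographic-only model
new = [0.70, 0.40, 0.30, 0.10, 0.10, 0.05]  # e.g. a model adding laboratory values

print("AUC old/new:", auc(old, labels), auc(new, labels))
print("McFadden R2 old/new:", mcfadden_r2(old, labels), mcfadden_r2(new, labels))
print("Continuous NRI:", continuous_nri(old, new, labels))
```

In this sketch a higher AUC reflects better ranking of cases above controls, a higher McFadden's R² reflects a larger gain in log-likelihood over the null model, and a positive NRI indicates that the second model moves predicted risk in the right direction more often than the first.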