Eberechukwu Onukwugha (1), Ran Qi (2), Jinani Jayasekera (3), Shujia Zhou (2).
(1) Department of Pharmaceutical Health Services Research, University of Maryland School of Pharmacy, 220 Arch Street, Baltimore, MD, 21201, USA. eonukwug@rx.umaryland.edu.
(2) Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Catonsville, MD, USA.
(3) Department of Pharmaceutical Health Services Research, University of Maryland School of Pharmacy, 220 Arch Street, Baltimore, MD, 21201, USA.
Abstract
BACKGROUND: Prognostic classification approaches are commonly used in clinical practice to predict health outcomes. However, there has been limited focus on use of the general approach for predicting costs. We applied a grouping algorithm designed for large-scale data sets and multiple prognostic factors to investigate whether it improves cost prediction among older Medicare beneficiaries diagnosed with prostate cancer.

METHODS: We analysed the linked Surveillance, Epidemiology and End Results (SEER)-Medicare data, which included data from 2000 through 2009 for men diagnosed with incident prostate cancer between 2000 and 2007. We split the survival data into two data sets (D0 and D1) of equal size. We trained the classifier of the Grouping Algorithm for Cancer Data (GACD) on D0 and tested it on D1. The prognostic factors included cancer stage, age, race and performance status proxies. We calculated the average difference between observed D1 costs and predicted D1 costs at 5 years post-diagnosis with and without the GACD.

RESULTS: The sample included 110,843 men with prostate cancer. The median age of the sample was 74 years, and 10% were African American. The average difference (mean absolute error [MAE]) per person between the real and predicted total 5-year cost was US$41,525 (MAE US$41,790; 95% confidence interval [CI] US$41,421-42,158) with the GACD and US$43,113 (MAE US$43,639; 95% CI US$43,062-44,217) without the GACD. The 5-year cost prediction without grouping resulted in a sample overestimate of US$79,544,508.

CONCLUSION: The grouping algorithm developed for complex, large-scale data improves the prediction of 5-year costs. The prediction accuracy could be improved by utilization of a richer set of prognostic factors and refinement of categorical specifications.
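The evaluation design described in METHODS (split the data into D0 and D1, fit on D0, compare predicted with observed costs on D1, with and without grouping) can be illustrated with a minimal Python sketch. This is not the GACD: the abstract does not specify the algorithm, so the sketch assumes a much simpler stand-in that predicts the mean training cost within each combination of prognostic factors, and uses the overall training mean as the ungrouped baseline. All function names, field names (`stage`, `age_band`, `cost`) and values are hypothetical and synthetic.

```python
import random
from collections import defaultdict
from statistics import mean


def split_half(records, seed=0):
    """Shuffle and split records into D0 and D1 (equal size when the count is even)."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


def fit_group_means(train, factors):
    """Return (mean cost per factor combination, overall mean cost) from training data."""
    by_group = defaultdict(list)
    for r in train:
        by_group[tuple(r[f] for f in factors)].append(r["cost"])
    overall = mean(r["cost"] for r in train)
    return {g: mean(costs) for g, costs in by_group.items()}, overall


def predict(record, group_means, overall, factors):
    """Predict with the group mean; fall back to the overall mean for unseen groups."""
    return group_means.get(tuple(record[f] for f in factors), overall)


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted costs."""
    return mean(abs(o - p) for o, p in zip(observed, predicted))


if __name__ == "__main__":
    # Synthetic records only; these are not SEER-Medicare data.
    rng = random.Random(1)
    base = {"local": 20000.0, "distant": 60000.0}
    records = [
        {"stage": s, "age_band": a, "cost": base[s] + rng.uniform(-5000, 5000)}
        for s in base for a in ("66-74", "75+") for _ in range(50)
    ]
    factors = ("stage", "age_band")
    d0, d1 = split_half(records)
    groups, overall = fit_group_means(d0, factors)
    observed = [r["cost"] for r in d1]
    grouped = [predict(r, groups, overall, factors) for r in d1]
    ungrouped = [overall for _ in d1]
    print("MAE with grouping:   ", mean_absolute_error(observed, grouped))
    print("MAE without grouping:", mean_absolute_error(observed, ungrouped))
```

The comparison of the two MAE values mirrors the "with and without the GACD" contrast in RESULTS; a real analysis would also need the confidence intervals reported there, which this sketch does not compute.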