Heather T Gold¹, Huong T Do. ¹Department of Public Health, Weill Medical College of Cornell University, 411 E. 69th Street, New York, NY 10021, USA.
Abstract
OBJECTIVE: To test the validity of three published algorithms designed to identify incident breast cancer cases using recent inpatient, outpatient, and physician insurance claims data.
DATA: The Surveillance, Epidemiology, and End Results (SEER) registry data linked with Medicare physician, hospital, and outpatient claims data for breast cancer cases diagnosed from 1995 to 1998 and a 5 percent control sample of Medicare beneficiaries in SEER areas.
STUDY DESIGN: We evaluate the sensitivity and specificity of three algorithms applied to new data compared with original reported results. Algorithms use health insurance diagnosis and procedure claims codes to classify breast cancer cases, with SEER as the reference standard. We compare algorithms by age, stage, race, and SEER region, and explore via logistic regression whether adding demographic variables improves algorithm performance.
PRINCIPAL FINDINGS: The sensitivity of two of three algorithms is significantly lower when applied to newer data, compared with sensitivity calculated during algorithm development (59 and 77.4 percent versus 90 and 80.2 percent, p<.00001). Sensitivity decreases as age increases, and false negative rates are higher for cases with in situ, metastatic, and unknown stage disease compared with localized or regional breast cancer. Substantial variation also exists by SEER registry. There was potential for improvement in algorithm performance when adding age, region, and race to an indicator variable for whether the algorithm determined a subject to be a breast cancer case (p<.00001).
CONCLUSIONS: Differential sensitivity of the algorithms by SEER region and age likely reflects variation in practice patterns, because the algorithms rely on administrative procedure codes. Depending on the algorithm, 3-5 percent of subjects overall are misclassified in 1998. Misclassification disproportionately affects older women and those diagnosed with in situ, metastatic, or unknown-stage disease. Algorithms should be applied cautiously to insurance claims databases to assess health care utilization outside SEER-Medicare populations because of uneven misclassification of subgroups that may be understudied already.
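The validation measures named in the abstract (sensitivity, specificity, and overall misclassification against a registry reference standard, plus sensitivity within subgroups) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the record layout, and the toy records are all hypothetical and chosen only to show the arithmetic. The toy records are invented and do not reproduce any result from the study.

```python
from collections import defaultdict

def validation_metrics(records):
    """Compare an algorithm's flag with a reference standard.

    records: iterable of (group, flagged, is_case) tuples, where `flagged`
    is the claims algorithm's classification and `is_case` is the
    reference standard (the registry, in the abstract's design).
    """
    tp = fp = tn = fn = 0
    for _group, flagged, is_case in records:
        if is_case and flagged:
            tp += 1          # true positive
        elif is_case:
            fn += 1          # false negative: registry case the algorithm missed
        elif flagged:
            fp += 1          # false positive: non-case the algorithm flagged
        else:
            tn += 1          # true negative
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else None,
        "specificity": tn / (tn + fp) if (tn + fp) else None,
        "misclassified": (fp + fn) / total if total else None,
    }

def sensitivity_by_group(records):
    """Sensitivity within each subgroup (e.g. age band, stage, region)."""
    counts = defaultdict(lambda: [0, 0])  # group -> [true positives, reference cases]
    for group, flagged, is_case in records:
        if is_case:
            counts[group][1] += 1
            if flagged:
                counts[group][0] += 1
    return {group: tp / n for group, (tp, n) in counts.items()}

# Invented toy records: (age band, algorithm flag, reference-standard case).
toy_records = [
    ("65-74", True, True),
    ("65-74", True, True),
    ("65-74", False, True),
    ("65-74", False, False),
    ("75+", True, True),
    ("75+", False, True),
    ("75+", False, False),
    ("75+", True, False),
]

overall = validation_metrics(toy_records)
by_age = sensitivity_by_group(toy_records)
print(overall)
print(by_age)
```

Computing sensitivity per subgroup rather than only overall is what exposes the kind of uneven misclassification the abstract describes, since a single pooled figure can hide a subgroup in which the algorithm misses many more cases.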