Importance: Cancer registries are important real-world data sources consisting of data abstraction from the medical record; however, patients with unknown or missing data are underrepresented in studies that use such data sources.

Objective: To assess the prevalence of missing data and its association with overall survival among patients with cancer.

Design, Setting, and Participants: In this retrospective cohort study, all variables within the National Cancer Database were reviewed for missing or unknown values for patients with the 3 most common cancers in the US who received diagnoses from January 1, 2006, to December 31, 2015. The prevalence of patient records with missing data and the association with overall survival were assessed. Data analysis was performed from February to August 2020.

Exposures: Any missing data field within a patient record among 63 variables of interest from more than 130 total variables in the National Cancer Database.

Main Outcomes and Measures: Prevalence of missing data in the medical records of patients with cancer and associated 2-year overall survival.

Results: A total of 1 198 749 patients with non-small cell lung cancer (mean [SD] age, 68.5 [10.9] years; 628 811 men [52.5%]), 2 120 775 patients with breast cancer (mean [SD] age, 61.0 [13.3] years; 2 101 758 women [99.1%]), and 1 158 635 patients with prostate cancer (mean [SD] age, 65.2 [9.0] years; 100% men) were included in the analysis. Among those with non-small cell lung cancer, 851 295 patients (71.0%) were missing data for variables of interest; 2-year overall survival was 33.2% for patients with missing data and 51.6% for patients with complete data (P < .001). Among those with breast cancer, 1 161 096 patients (54.7%) were missing data for variables of interest; 2-year overall survival was 93.2% for patients with missing data and 93.9% for patients with complete data (P < .001). Among those with prostate cancer, 460 167 patients (39.7%) were missing data for variables of interest; 2-year overall survival was 91.0% for patients with missing data and 95.6% for patients with complete data (P < .001).

Conclusions and Relevance: This study found that within a large cancer registry-based real-world data source, there was a high prevalence of missing data that were unable to be ascertained from the medical record. The prevalence of missing data among patients with cancer was associated with heterogeneous differences in overall survival. Improvements in documentation and data quality are necessary to make optimal use of real-world data for clinical advancements.
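The prevalence figures and survival differences reported in the Results can be rechecked with simple arithmetic. The following is a minimal sketch, not the authors' analysis code: it uses only the counts and percentages stated in the abstract, and the names `cohorts` and `summarize` are hypothetical, chosen for illustration. It assumes that prevalence is the number of patients with missing data divided by the cohort total, and that the survival gap is the plain difference between the two reported 2-year overall survival percentages.

```python
# Values copied from the abstract:
# (total patients, patients with missing data,
#  2-year overall survival % with missing data,
#  2-year overall survival % with complete data)
cohorts = {
    "non-small cell lung": (1_198_749, 851_295, 33.2, 51.6),
    "breast": (2_120_775, 1_161_096, 93.2, 93.9),
    "prostate": (1_158_635, 460_167, 91.0, 95.6),
}


def summarize(total, missing, os_missing, os_complete):
    """Return (prevalence of missing data in %, survival gap in percentage points)."""
    prevalence = round(100 * missing / total, 1)
    os_gap = round(os_complete - os_missing, 1)
    return prevalence, os_gap


# Hypothetical summary table keyed by cancer type
summary = {name: summarize(*values) for name, values in cohorts.items()}

for name, (prevalence, os_gap) in summary.items():
    print(f"{name}: {prevalence}% missing, {os_gap} percentage-point survival gap")
```

The recomputed prevalences match the reported 71.0%, 54.7%, and 39.7%. The unadjusted gaps (18.4, 0.7, and 4.6 percentage points) illustrate the heterogeneity noted in the Conclusions; they are descriptive differences only and say nothing about the adjusted analyses or the reported P values.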