BACKGROUND: Routine data from clinical charts are indispensable for retrospective and prospective observational studies and clinical trials, yet their reproducibility is often not assessed. We developed a prostate cancer-specific database for clinical annotations and evaluated data reproducibility.

METHODS: For men with prostate cancer who had clinical-grade paired tumor-normal sequencing at a comprehensive cancer center, we performed team-based retrospective data collection from the electronic medical record using a defined source hierarchy. We developed an open-source R package for data processing. Using blinded repeat annotation by a reference medical oncologist, we assessed data completeness, reproducibility of team-based annotations, and the impact of measurement error on bias in survival analyses.

RESULTS: Data elements on demographics, diagnosis and staging, disease state at the time of procuring a genomically characterized sample, and clinical outcomes were piloted and then abstracted for 2261 patients (with 2631 samples). Completeness of data elements was generally high. Compared with the repeat annotation by a medical oncologist blinded to the database (100 patients/samples), reproducibility of annotations was high; T stage, metastasis date, and presence and date of castration resistance had lower reproducibility. The impact of measurement error on estimates for strong prognostic factors was modest.

CONCLUSIONS: With a prostate cancer-specific data dictionary and quality control measures, manual clinical annotations by a multidisciplinary team can be scalable and reproducible. The data dictionary and the R package for reproducible data processing are freely available to increase data quality and efficiency in clinical prostate cancer research.