Shirley V Wang (1), Sebastian Schneeweiss (2), Marc L Berger (3), Jeffrey Brown (4), Frank de Vries (5), Ian Douglas (6), Joshua J Gagne (2), Rosa Gini (7), Olaf Klungel (8), C Daniel Mullins (9), Michael D Nguyen (10), Jeremy A Rassen (11), Liam Smeeth (6), Miriam Sturkenboom (12).
1. Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, MA, USA; Department of Medicine, Harvard Medical School, MA, USA. Electronic address: swang1@bwh.harvard.edu.
2. Division of Pharmacoepidemiology and Pharmacoeconomics, Brigham and Women's Hospital, MA, USA; Department of Medicine, Harvard Medical School, MA, USA.
3. Pfizer, NY, USA.
4. Department of Population Medicine, Harvard Medical School, MA, USA.
5. Department of Clinical Pharmacy, Maastricht UMC+, The Netherlands.
6. London School of Hygiene and Tropical Medicine, England, UK.
7. Agenzia regionale di sanità della Toscana, Florence, Italy.
8. Division of Pharmacoepidemiology & Clinical Pharmacology, Utrecht University, Utrecht, The Netherlands.
9. Pharmaceutical Health Services Research Department, University of Maryland School of Pharmacy, MD, USA.
10. FDA Center for Drug Evaluation and Research, USA.
11. Aetion, Inc., NY, USA.
12. Erasmus University Medical Center Rotterdam, The Netherlands.
Abstract
PURPOSE: Defining a study population and creating an analytic dataset from longitudinal healthcare databases involve many decisions. Our objective was to catalogue the scientific decisions underpinning study execution that should be reported to facilitate replication and enable assessment of the validity of studies conducted in large healthcare databases.

METHODS: We reviewed the key investigator decisions required to operate a sample of macros and software tools designed to create and analyze analytic cohorts from longitudinal streams of healthcare data. A panel of academic, regulatory, and industry experts in healthcare database analytics discussed and added to this list.

CONCLUSION: Decision-makers increasingly seek evidence generated from large healthcare encounter and reimbursement databases. Varied terminology is used around the world for the same concepts. Agreeing on terminology, and on which parameters from a large catalogue are most essential to report for replicable research, would improve transparency and facilitate assessment of validity. At a minimum, reporting for a database study should clearly state the operational definitions of the key temporal anchors and their relation to each other when creating the analytic dataset, accompanied by an attrition table and a design diagram. Greater transparency about the operational study parameters used to create analytic datasets from longitudinal healthcare databases could substantially improve the reproducibility, rigor, and confidence in real-world evidence generated from those databases.
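To make the minimum reporting items above concrete, the following is a minimal sketch of how an attrition table might be produced while building an analytic cohort. It is an illustration only, not the method of any tool reviewed by the authors. The record fields (enrollment_start, index_date, prior_outcome_date), the 180-day washout, and the two inclusion criteria are all hypothetical assumptions chosen for the example; a real study would report its own temporal anchors and criteria.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Callable, List, Optional, Tuple


@dataclass
class Patient:
    # All field names are hypothetical, chosen only for this sketch.
    patient_id: str
    enrollment_start: date                     # start of observable time
    index_date: date                           # cohort entry date (a temporal anchor)
    prior_outcome_date: Optional[date] = None  # earliest recorded outcome, if any


WASHOUT_DAYS = 180  # assumed length of the baseline window before the index date


def has_washout(p: Patient) -> bool:
    """Keep patients enrolled for at least WASHOUT_DAYS before the index date."""
    return p.index_date - p.enrollment_start >= timedelta(days=WASHOUT_DAYS)


def no_prior_outcome(p: Patient) -> bool:
    """Keep patients with no outcome recorded before the index date."""
    return p.prior_outcome_date is None or p.prior_outcome_date >= p.index_date


def attrition_table(
    patients: List[Patient],
    criteria: List[Tuple[str, Callable[[Patient], bool]]],
) -> List[Tuple[str, int, int]]:
    """Apply criteria in order; return rows of (step, remaining, excluded)."""
    rows = [("Initial population", len(patients), 0)]
    remaining = list(patients)
    for label, keep in criteria:
        kept = [p for p in remaining if keep(p)]
        rows.append((label, len(kept), len(remaining) - len(kept)))
        remaining = kept
    return rows


# Toy data, invented for illustration only.
patients = [
    Patient("A", date(2015, 1, 1), date(2016, 1, 1)),
    Patient("B", date(2015, 12, 1), date(2016, 1, 1)),                    # short enrollment
    Patient("C", date(2015, 1, 1), date(2016, 6, 1), date(2016, 3, 1)),   # outcome before index
    Patient("D", date(2014, 1, 1), date(2016, 1, 1), date(2016, 2, 1)),   # outcome after index
]

table = attrition_table(
    patients,
    [
        (f">= {WASHOUT_DAYS} days enrollment before index date", has_washout),
        ("No outcome before index date", no_prior_outcome),
    ],
)

for step, n_remaining, n_excluded in table:
    print(f"{step}: remaining={n_remaining}, excluded={n_excluded}")
```

Because each criterion is applied in a fixed, named order, the resulting rows record both the operational definition and the number of patients lost at each step, which is the information a reader would need to attempt a replication.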
Authors: Emily Feld; Joanna Harton; Neal J Meropol; Blythe J S Adamson; Aaron Cohen; Ravi B Parikh; Matthew D Galsky; Vivek Narayan; John Christodouleas; David J Vaughn; Rebecca A Hubbard; Ronac Mamtani Journal: Eur Urol Date: 2019-07-28 Impact factor: 20.096
Authors: Antoine Perpoil; Gael Grimandi; Stéphane Birklé; Jean-François Simonet; Anne Chiffoleau; François Bocquet Journal: Int J Environ Res Public Health Date: 2020-12-29 Impact factor: 3.390
Authors: Charles E Leonard; Colleen M Brensinger; Ghadeer K Dawwas; Rajat Deo; Warren B Bilker; Samantha E Soprano; Neil Dhopeshwarkar; James H Flory; Zachary T Bloomgarden; Joshua J Gagne; Christina L Aquilante; Stephen E Kimmel; Sean Hennessy Journal: Cardiovasc Diabetol Date: 2020-02-25 Impact factor: 9.951
Authors: Sallie-Anne Pearson; Nicole Pratt; Juliana de Oliveira Costa; Helga Zoega; Tracey-Lea Laba; Christopher Etherton-Beer; Frank M Sanfilippo; Alice Morgan; Lisa Kalisch Ellett; Claudia Bruno; Erin Kelty; Maarten IJzerman; David B Preen; Claire M Vajdic; David Henry Journal: Int J Environ Res Public Health Date: 2021-12-18 Impact factor: 3.390