Richard Williams (1), Evangelos Kontopantelis (2), Iain Buchan (3), Niels Peek (4)

1. MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, University of Manchester, Manchester, UK. Electronic address: richard.williams2@manchester.ac.uk.
2. MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR School for Primary Care Research, University of Manchester, Manchester, UK.
3. MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, University of Manchester, Manchester, UK; NIHR Manchester Biomedical Research Centre, University of Manchester, Manchester, UK.
4. MRC Health eResearch Centre, University of Manchester, Manchester, UK; NIHR Greater Manchester Primary Care Patient Safety Translational Research Centre, University of Manchester, Manchester, UK.
Abstract
INTRODUCTION: The construction of reliable, reusable clinical code sets is essential when re-using Electronic Health Record (EHR) data for research. Yet code set definitions are rarely transparent and their sharing is almost non-existent. There is a lack of methodological standards for the management (construction, sharing, revision and reuse) of clinical code sets, which needs to be addressed to ensure the reliability and credibility of studies that use code sets.

OBJECTIVE: To review methodological literature on the management of sets of clinical codes used in research on clinical databases and to provide a list of best practice recommendations for future studies and software tools.

METHODS: We performed an exhaustive search for methodological papers about clinical code set engineering for re-using EHR data in research. This was supplemented with papers identified by snowball sampling. In addition, a list of e-phenotyping systems was constructed by merging references from several systematic reviews on this topic, and the processes adopted by those systems for code set management were reviewed.

RESULTS: Thirty methodological papers were reviewed. Common approaches included: creating an initial list of synonyms for the condition of interest (n=20); making use of the hierarchical nature of coding terminologies during searching (n=23); reviewing sets with clinician input (n=20); and reusing and updating an existing code set (n=20). Several open-source software tools (n=3) were discovered.

DISCUSSION: There is a need for software tools that enable users to easily and quickly create, revise, extend, review and share code sets, and we provide a list of recommendations for their design and implementation.

CONCLUSION: Research re-using EHR data could be improved through the further development, more widespread use and routine reporting of the methods by which clinical codes were selected.