Development of a repository of computable phenotype definitions using the clinical quality language.

Pascal S Brandt¹, Jennifer A Pacheco², Luke V Rasmussen².

Abstract

OBJECTIVE: The objective of this study is to create a repository of computable, technology-agnostic phenotype definitions for the purposes of analysis and automatic cohort identification.
MATERIALS AND METHODS: We selected phenotype definitions from PheKB and excluded definitions that did not use structured data or were not used in published research. We translated these definitions into the Clinical Quality Language (CQL) and Fast Healthcare Interoperability Resources (FHIR) and validated them using code review and automated tests.
RESULTS: A total of 33 phenotype definitions met our inclusion criteria. We developed 40 CQL libraries, 231 value sets, and 347 test cases. To support these test cases, a total of 1624 FHIR resources were created as test data.
DISCUSSION AND CONCLUSION: Although a number of challenges were encountered while translating the phenotypes into structured form, such as requiring specialized knowledge, or imprecise, ambiguous, and conflicting language, we have created a repository and a development environment that can be used for future research on computable phenotypes.

Keywords: CQL; EHR-driven phenotyping; FHIR; cohort identification

Year: 2021 PMID: 34926996 PMCID: PMC8672934 DOI: 10.1093/jamiaopen/ooab094

Source DB: PubMed Journal: JAMIA Open ISSN: 2574-2531

INTRODUCTION

Sets of criteria that are used to identify cohorts of patients for clinical research are referred to as phenotype definitions, or phenotypes for brevity. To identify patient cohorts, phenotype definitions must be executed by implementers, often by manually translating textual descriptions of selection criteria into executable code. The heterogeneity of both the representation of the logic and the data model that underlies it is a key barrier to evaluating phenotype implementations to gain insight into the process of phenotype development. Systems such as Informatics for Integrating Biology and the Bedside (i2b2) and the Observational Health Data Sciences and Informatics (OHDSI) platform provide computable phenotypes that are bound to their respective data models. These computable representations are automatically translated into queries for a specific database system when the phenotype is used for cohort identification. However, i2b2 and OHDSI phenotypes cannot be directly shared between platforms, nor are they comparable without some translation. Within the electronic clinical quality measure space, Clinical Quality Language (CQL; https://cql.hl7.org/) is used to represent similar criteria but is technology-agnostic, meaning that it is not coupled to any specific software implementation. Furthermore, CQL has been shown to be a feasible logical expression language for representing clinically validated phenotype definitions. It supports a wide range of Boolean, temporal, aggregate, and other operations. The language is data model independent, but a data model must be selected when writing CQL. It works out of the box with Fast Healthcare Interoperability Resources (FHIR; https://hl7.org/FHIR/), which is widely used and has recently become the legally required standard for clinical data exchange in the United States.
Additionally, the Common Data Model Harmonization project provides mappings from FHIR to many other common healthcare data models, maximizing potential utilization of CQL-based phenotype definitions.

OBJECTIVES

In this work, we developed a database of phenotype definitions represented in a structured, unambiguous, computable, technology-agnostic, standardized format. This representation allows automated computational analysis and cohort identification against data platforms that support FHIR and CQL, or for which another mapping exists. We additionally provide a suite of test cases that, together with the provided testing configuration, can be used to validate the correctness of the phenotype definitions. We provide these phenotypes to the clinical research informatics community, and to informaticians and research data analysts in particular, as an initial repository of computable, technology-agnostic phenotype definitions, which we hope we and others will extend over time using the same methods, tools, and development environment.

MATERIALS AND METHODS

Data set

We selected phenotype definitions from the PheKB phenotype repository, as it is the most mature and widely used phenotype repository in the United States. PheKB was initiated in 2012 and has had phenotypes contributed from many projects and collaborative groups, most notably the electronic Medical Records and Genomics Network. The repository is continuously growing, with over 100 phenotypes in various stages of development. We extracted all public phenotype definitions available on May 22, 2020. From the list of publicly available phenotypes, we automatically selected those with a status of “Final.” We then reviewed each definition in this collection and excluded any phenotype entry that (1) did not make use of structured data [ie, was entirely natural language processing (NLP)-based], (2) was a generic repository for submitting data, or (3) was not used in a published research study. These criteria were chosen to ensure that our final data set consisted only of completed and clinically validated phenotype definitions. For each phenotype that met these criteria, we downloaded all files linked to the phenotype definition. Source code for this step of the process is available on GitHub (https://github.com/PheMA/phekb-export).
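The selection logic described above can be sketched in a few lines of Python. This is only an illustration and is not the code in the phekb-export repository: the record type, its field names, and the example entries are all hypothetical, and the boolean flags stand in for judgments that were made by manual review.

```python
from dataclasses import dataclass


# Hypothetical record for one repository entry. The field names are
# illustrative assumptions and do not mirror the PheKB export format.
@dataclass
class PhenotypeEntry:
    name: str
    status: str
    uses_structured_data: bool  # False when the definition is entirely NLP-based
    is_data_placeholder: bool   # True for generic data-submission entries
    has_publication: bool       # True when used in a published research study


def meets_inclusion_criteria(entry: PhenotypeEntry) -> bool:
    """Apply the status filter, then the three exclusion criteria."""
    return (
        entry.status == "Final"
        and entry.uses_structured_data
        and not entry.is_data_placeholder
        and entry.has_publication
    )


# Made-up entries, for illustration only.
entries = [
    PhenotypeEntry("Example A", "Final", True, False, True),
    PhenotypeEntry("Example B", "Final", False, False, True),          # NLP only
    PhenotypeEntry("Example C", "In development", True, False, True),  # not final
    PhenotypeEntry("Example D", "Final", True, True, False),           # placeholder
    PhenotypeEntry("Example E", "Final", True, False, False),          # unpublished
]

selected = [e.name for e in entries if meets_inclusion_criteria(e)]
print(selected)
```

Only the first entry survives all four checks, which mirrors how each criterion narrows the candidate list independently.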

Translation

Two authors (PSB and LVR) independently translated each of the phenotype definitions using the available metadata and artifacts downloaded from PheKB. One author was primarily responsible for the translation of each phenotype into CQL and FHIR, but the authors were not entirely blinded. Group discussion amongst all authors was used to confirm interpretation of phenotype definitions that were ambiguous. Standard terminologies such as the International Classification of Diseases versions 9 (ICD-9) and 10 (ICD-10), Current Procedural Terminology (CPT), Logical Observation Identifiers Names and Codes (LOINC), and RxNorm were used for structured data and were represented using FHIR ValueSet resources. These terminologies were usually specified in the phenotype definitions, but where they were not, we used the recommended default terminologies from the FHIR standard. We developed an open-source tool to translate value sets in various formats into FHIR resources, and built an interface to allow web-based interaction with the tool (https://github.com/PheMA/terminology-manager). The tool converts Comma Separated Values files, as well as concept sets exported from the OHDSI platform, into ValueSet resources. It also supports searching, inspecting, and importing value sets directly from the Value Set Authority Center (VSAC), using the VSAC FHIR server. For each phenotype, we created a single CQL library that contained the logic required to identify matching patients. Logic shared between phenotypes was authored in shared libraries that were imported using the CQL include operator. The shared libraries were iteratively refined and expanded (with already developed phenotypes refactored to incorporate new library functions) during the course of the project. We did not implement NLP logic, as there is no widely accepted standard representation or implementation of this type of logic.
To our knowledge, there is currently no way to natively express NLP constructs using FHIR or CQL, although this is an active area of research. We adopted a number of conventions for the standards-based representation. First, in this work, we only represent phenotype cases, and not controls, suspected cases, or subtypes. Case definitions usually contain the most, and most varied, criteria; thus, they serve as a good basis for analysis or extension. We adopted the convention of creating a CQL statement in each library called “Case,” which represents the entry point for evaluating the phenotype definition. Additionally, unless explicitly stated otherwise, we modeled drugs based on their RxNorm ingredient name and lab values based on the highest ranked appropriate LOINC code.
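To make the value set translation step more concrete, the following Python sketch converts a small CSV listing into a FHIR ValueSet resource expressed as a dictionary. It is not the terminology-manager tool itself: the CSV column names (system, code, display), the function name, and the example value set are assumptions made for illustration. The code system URIs are the ones FHIR publishes for these terminologies.

```python
import csv
import io
import json

# Code system URIs as published in the FHIR terminology documentation.
SYSTEM_URIS = {
    "ICD-10-CM": "http://hl7.org/fhir/sid/icd-10-cm",
    "LOINC": "http://loinc.org",
    "RxNorm": "http://www.nlm.nih.gov/research/umls/rxnorm",
}


def csv_to_value_set(csv_text, name):
    """Group CSV rows by code system and build a FHIR ValueSet as a dict.

    The column names 'system', 'code', and 'display' are assumed for this
    sketch; real exports may use a different layout.
    """
    concepts_by_system = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        system_uri = SYSTEM_URIS[row["system"]]
        concepts_by_system.setdefault(system_uri, []).append(
            {"code": row["code"], "display": row["display"]}
        )
    return {
        "resourceType": "ValueSet",
        "name": name,
        "status": "draft",
        "compose": {
            "include": [
                {"system": system_uri, "concept": concepts}
                for system_uri, concepts in concepts_by_system.items()
            ]
        },
    }


example_csv = """system,code,display
ICD-10-CM,E11.9,Type 2 diabetes mellitus without complications
ICD-10-CM,E11.65,Type 2 diabetes mellitus with hyperglycemia
LOINC,2345-7,Glucose [Mass/volume] in Serum or Plasma
"""

value_set = csv_to_value_set(example_csv, "ExampleDiabetesCodes")
print(json.dumps(value_set, indent=2))
```

Grouping codes by system keeps each entry of compose.include tied to a single terminology, which is how a ValueSet distinguishes, for example, ICD-10-CM codes from LOINC codes.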

Development environment

We made use of several open-source tools during the phenotype translation process and published this development environment on GitHub (https://github.com/PheMA/phekb-phenotypes). We used Visual Studio Code (https://code.visualstudio.com) as our primary CQL authoring environment, and for syntax highlighting we used the language-cql plugin (https://github.com/Jonnokc/Clinical-Quality-Language). To translate CQL into the equivalent machine-readable representation, known as the Expression Logical Model (ELM), we used the reference implementation of the CQL to ELM translator (https://github.com/cqframework/clinical_quality_language). For testing, we used the CQL Testing Framework (CTF) developed by the Agency for Healthcare Research and Quality (https://github.com/AHRQ-CDS/CQL-Testing-Framework). The CTF provides a mechanism to specify test data, which are materialized as FHIR resources, using a simple YAML file. The CTF also provides a configurable test runner, which can run a specific CQL library against the test data generated by the YAML specification, and assert that the results match what is expected. This allows authors to carefully create test data to make sure phenotypes correctly identify potentially tricky edge and corner cases.

Validation

We used 2 methods to ensure that our CQL phenotype definitions were correctly translated from the artifacts available in PheKB. First, each phenotype was translated by a single author, and then verified using a code review process. The primary author created a pull request on GitHub (a way of isolating code for a given purpose, in this case representing a single phenotype definition), and the second author reviewed the code to make sure it accurately represented the phenotype definition as described in PheKB. Secondly, we used an approach from software engineering called test-driven development to ensure that our translations of phenotype logic and value sets were correct. We made use of the CTF to implement this approach. In addition to allowing the CQL author to express both test cases and FHIR data using YAML, the CTF integrates with the Mocha JavaScript testing framework (https://mochajs.org) in order to evaluate phenotype logic using the given data to assert that results produced are correct. This evaluation is done using the ELM representation of the phenotype, and the open-source JavaScript CQL engine (https://github.com/cqframework/cql-execution). All tests were run automatically on each code commit to ensure no regressions were introduced. The full development and validation pipeline is shown in Figure 1.
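The test-driven approach can be illustrated with a small Python analogue. The actual pipeline expresses test data in YAML and runs it through the CTF and Mocha against the ELM representation; the sketch below swaps in plain Python to show only the underlying idea of pairing hand-built patient data with an expected "Case" result. The phenotype rule, the value set, and the test cases are hypothetical and are not taken from any PheKB definition.

```python
from datetime import date
from typing import List, Tuple

# Hypothetical value set of qualifying diagnosis codes.
DIAGNOSIS_CODES = {"E11.9", "E11.65"}


def is_case(conditions: List[Tuple[str, date]]) -> bool:
    """Toy 'Case' rule: qualifying diagnoses on at least two distinct dates."""
    qualifying_dates = {day for code, day in conditions if code in DIAGNOSIS_CODES}
    return len(qualifying_dates) >= 2


# Each test case pairs hand-built patient data with the expected result,
# including edge cases that a loosely worded definition might get wrong.
TEST_CASES = [
    ("two diagnoses on different dates",
     [("E11.9", date(2020, 1, 5)), ("E11.65", date(2020, 3, 1))], True),
    ("two diagnoses on the same date",
     [("E11.9", date(2020, 1, 5)), ("E11.65", date(2020, 1, 5))], False),
    ("unrelated code only",
     [("I10", date(2020, 1, 5)), ("I10", date(2020, 2, 5))], False),
    ("no data", [], False),
]


def run_tests() -> List[str]:
    """Return the names of failing test cases (empty when all pass)."""
    return [name for name, data, expected in TEST_CASES if is_case(data) != expected]


failures = run_tests()
print("all passed" if not failures else "failed: " + ", ".join(failures))
```

Running such a suite on every commit is what lets a refactoring of shared logic be checked against all previously written cases.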

Figure 1.

Phenotype definition development and verification pipeline.

RESULTS

At the time of our extraction, there were a total of 71 publicly available phenotype definitions in PheKB with a status of “Final.” We excluded 2 definitions that were not actually phenotypes. One was used as a placeholder to publish new value sets, and one was the description of a risk model. We excluded 3 more that used only NLP criteria. Finally, from the remaining phenotypes, we identified only those with associated publications. This selection process resulted in 33 total phenotype definitions and is illustrated in Figure 2.

Figure 2.

Phenotype definition selection process.

We created a total of 40 CQL libraries (one for each phenotype and 7 helper libraries), totaling 3327 lines of CQL code. A total of 231 value sets were assembled, of which 216 were manually created and 15 were imported from VSAC. These value sets consist of 17,948 individual codes, of which 13,340 are unique. Additionally, 347 test cases were written that collectively contain 2044 test assertions. To support these test cases, 347 patients, 96 encounters, 101 procedures, 335 medication orders, 385 conditions, and 360 observations were manually created as FHIR resources using the CTF.
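The reported counts are internally consistent, which can be confirmed with a short arithmetic check in Python. All figures below are copied from the text and the abstract; the only derived quantity is the number of definitions excluded for lacking an associated publication, which is implied by the other counts rather than stated.

```python
# Selection flow reported in the Results.
final_status = 71
not_phenotypes = 2
nlp_only = 3
included = 33

remaining_after_review = final_status - not_phenotypes - nlp_only
# Implied by the reported counts; this figure is not stated in the text.
excluded_without_publication = remaining_after_review - included

# Library and value set counts.
phenotype_libraries = 33
helper_libraries = 7
total_libraries = phenotype_libraries + helper_libraries

value_sets = {"manually created": 216, "imported from VSAC": 15}
total_value_sets = sum(value_sets.values())

# FHIR resources created as test data.
fhir_resources = {
    "patients": 347,
    "encounters": 96,
    "procedures": 101,
    "medication orders": 335,
    "conditions": 385,
    "observations": 360,
}
total_resources = sum(fhir_resources.values())

print(remaining_after_review, excluded_without_publication)
print(total_libraries, total_value_sets, total_resources)
```

The totals match the 40 CQL libraries, 231 value sets, and 1624 FHIR resources given in the abstract.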

DISCUSSION

While building this repository, we observed numerous advantages to using popular healthcare-specific standards such as FHIR and CQL. Advantages include convenient conceptual models, mechanisms for verification, and the availability of tools, documentation, and expertise to provide assistance during development. This repository will provide a collection of diverse, computable, verified, and standardized phenotype definitions that will aid automated analysis. In addition, we note other benefits beyond our primary objective. First, the methods and tools we used are documented within the repository and can be adopted by other researchers and developers. Also, our use of a logical representation that is technology and data model independent may facilitate automated execution by allowing implementation sites to implement their own data providers for existing phenotype definitions. Similarly, the use of common standard terminologies may enable automatic mapping during local execution. During this work, we also experienced firsthand many of the challenges that face phenotype implementers. We encountered numerous occurrences of ambiguity, underspecification, and imprecise language. For example, the Clopidogrel Poor Metabolizers phenotype uses the phrase “within 30 days,” but does not specify whether the interval boundary should be inclusive or not, or whether the 30 days before the event, after the event, or both should be considered. In each case, the primary CQL developer had to confer with the other authors in order to determine the exact semantics of the phenotype definition. Even then, we would occasionally rely on subjective decisions regarding the intent. This resulted in a considerable slowdown in implementation. We note that a benefit of having translated the narrative definition to CQL is that the up-front investment in time has removed the ambiguities for all subsequent users.
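The “within 30 days” ambiguity is easy to demonstrate. The Python sketch below is an illustration and not the repository's CQL: it evaluates a single event, placed exactly 30 days before an index date, under each combination of boundary handling and direction. The function name, its parameters, and the dates are hypothetical.

```python
from datetime import date, timedelta


def within_30_days(event, index, *, inclusive, direction):
    """Check one reading of 'within 30 days' of an index date.

    direction is 'before', 'after', or 'either'. inclusive controls whether
    an event exactly 30 days away counts. An event on the index date itself
    is treated as matching every direction, which is itself a choice.
    """
    delta = (event - index).days
    in_range = abs(delta) <= 30 if inclusive else abs(delta) < 30
    if direction == "before":
        return in_range and delta <= 0
    if direction == "after":
        return in_range and delta >= 0
    if direction == "either":
        return in_range
    raise ValueError("unknown direction: " + direction)


index_date = date(2020, 6, 1)
event_date = index_date - timedelta(days=30)  # exactly 30 days earlier

results = {
    (inclusive, direction): within_30_days(
        event_date, index_date, inclusive=inclusive, direction=direction
    )
    for inclusive in (True, False)
    for direction in ("before", "after", "either")
}
for reading, matched in results.items():
    print(reading, matched)
```

The same patient record matches under two of the six readings and fails under the other four, so two sites implementing the same narrative definition could legitimately produce different cohorts.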
Additionally, some phenotype definitions relied heavily on domain or tribal knowledge not specified within the definition itself. This makes it difficult for non-clinicians or healthcare outsiders to replicate research or use existing phenotype definitions for new research. For example, the High-Density Lipoproteins phenotype requires that a cohort member have at least one “random glucose test,” but does not specify how these tests are to be identified. We also encountered contradictory criteria definitions; for example, the Bone Scan Utilization phenotype requires that a cohort member be both >35 and ≥35 years old. The creation of this repository demonstrates a step forward for these phenotypes. Although a formal representation may not eliminate all these issues, it would require phenotype authors to be more precise at the definition phase, which would reduce the cognitive load on implementers. Although we did not formally track the amount of time to implement each phenotype, authoring a formal definition in CQL and providing confirmatory tests does require an additional investment in time and resources for phenotype authors. Furthermore, it requires specialized informatics knowledge that has its own learning curve. However, we believe that the reduction in ambiguity benefits all phenotype authors (informaticians and research data analysts), and that when a phenotype is planned for reuse or broader dissemination, the time spent by the phenotype author in formalizing its representation using CQL has a cumulative payback each time the phenotype is reused. This work demonstrates the realization of previous desiderata for computable phenotypes, including supporting human and computable formats, set operations and relational algebra, using well-defined temporal relationships, using standard terminologies, and supporting standards-compliant interfaces to external software.
Given the benefits of a concrete, unambiguous phenotype definition, we hope that phenotype repository managers will encourage the inclusion of computable definitions, in addition to providing APIs to allow integration with their repositories. We acknowledge the following limitations of this work. First, given the subjective nature of interpreting narrative phenotype definitions, we cannot guarantee fidelity to the intended definition. The only way to determine semantic correctness would have been to reach out to the original phenotype definition authors, who may not have a definitive answer (given the time elapsed since some phenotypes were authored). Additionally, our CQL-based phenotype definitions were not clinically validated on actual datasets, although they are derived from clinically valid phenotype definitions.

CONCLUSION

This repository of structured phenotype definitions provides clear definitions of phenotype algorithms, represented in a format that facilitates automation of cohort identification within supported data platforms. We believe that the provided data set and development environment can be a resource for clinical informatics practitioners and researchers who want to study phenotype definitions or identify cohorts of patients for biomedical knowledge discovery. In the future, we plan to evaluate the phenotype definitions in the translated dataset. This includes evaluating a single phenotype definition (represented using the standard proposed in this work) at 3 large academic medical centers, with performance being evaluated using manual chart review.

FUNDING

This work was supported by the Fulbright Foreign Student Program and the South African National Research Foundation (to PSB), and in part by NIH grant R01GM105688 and by the NHGRI through grant U01HG011169 (Northwestern University; to LVR and JAP).

AUTHOR CONTRIBUTIONS

All authors helped select the set of phenotypes to translate. PSB and LVR developed the representation standard, the development environment, and translated the PheKB phenotypes into the standard format. JAP was consulted to help resolve ambiguities during translation. PSB wrote the first draft of the manuscript, and all authors helped refine and edit the final version.

CONFLICT OF INTEREST STATEMENT

PSB is a consultant for Commure, Inc. LVR and JAP have no competing interests to disclose.

DATA AVAILABILITY

The raw data underlying this article are available in PheKB at https://phekb.org/, and the phenotypes are available on GitHub at https://github.com/PheMA/phekb-phenotypes.
