
Supporting research, protecting data: one institution's approach to clinical data warehouse governance.

Kellie M Walters1, Anna Jojic1, Emily R Pfaff2, Marie Rape1, Donald C Spencer3, Nicholas J Shaheen4, Brent Lamm3, Timothy S Carey5.   

Abstract

Institutions must decide how to manage the use of clinical data to support research while ensuring appropriate protections are in place. Questions about data use and sharing often go beyond what the Health Insurance Portability and Accountability Act of 1996 (HIPAA) considers. In this article, we describe our institution's governance model and approach. Common questions we consider include (1) Is a request limited to the minimum data necessary to carry the research forward? (2) What plans are there for sharing data externally?, and (3) What impact will the proposed use of data have on patients and the institution? In 2020, 302 of the 319 requests reviewed were approved. The majority of requests were approved in less than 2 weeks, with few or no stipulations. For the remaining requests, the governance committee works with researchers to find solutions to meet their needs while also addressing our collective goal of protecting patients.
© The Author(s) 2021. Published by Oxford University Press on behalf of the American Medical Informatics Association.

Keywords:  EHR data; clinical data warehouse; clinical informatics; data governance; data privacy

Year:  2022        PMID: 34871428      PMCID: PMC8922173          DOI: 10.1093/jamia/ocab259

Source DB:  PubMed          Journal:  J Am Med Inform Assoc        ISSN: 1067-5027            Impact factor:   4.497


INTRODUCTION

Data collected as part of usual clinical care are a powerful resource for research. Researchers can leverage these data to find patients potentially eligible for a trial, conduct secondary data analyses, or follow clinical outcomes of study participants. The Health Insurance Portability and Accountability Act of 1996 (HIPAA) allows for such uses of these data, provided patients sign a HIPAA authorization form or an Institutional Review Board (IRB) grants a waiver of HIPAA authorization. While this ensures the appropriate legal protections are in place, concerns about data release and use do not end there. Data brokers and clinical leaders face the following questions: How much, and what type, of data is appropriate to share with an external party? How will patients react to a study recruitment letter related to their medical history, as identified from the electronic health record? Does disclosure of these data present a risk to institutional reputation? Different stakeholders have different reactions to these questions, and data brokers must carefully balance the benefits of data use with potential drawbacks. A recent systematic review on data access and use in clinical data warehouses found a lack of in-depth information on data governance and criteria used for reviewing requests. This article seeks to fill that void by describing the governance approach at the University of North Carolina at Chapel Hill for research uses of the Carolina Data Warehouse for Health (CDW-H), the central repository for electronic health record (EHR) data for UNC Health.

Institutional context

The University of North Carolina at Chapel Hill and UNC Health work in close partnership to carry out their combined research mission. UNC-Chapel Hill, a large Research 1 University, is home to schools of medicine, public health, pharmacy, nursing, and dentistry. UNC Health encompasses the academic medical center in Chapel Hill and community practices and hospitals throughout the state of North Carolina. The CDW-H, UNC Health’s institutional EHR data warehouse, was established in 2009. The CDW-H is used for operational, quality improvement, and research activities. This article addresses the research component, which is jointly governed by the UNC School of Medicine and UNC Health through the CDW-H Oversight and Operations Committees. The North Carolina Translational and Clinical Sciences Institute (NC TraCS), UNC’s Clinical and Translational Science Award (CTSA) hub, housed in the School of Medicine, is charged with the stewardship of the data request and approval process.

MATERIALS AND METHODS

Data request process

In order to ensure appropriate data protections and compliance with study protocols, investigators are not permitted to directly access the CDW-H. Instead, the CDW-H research program works on a request model, with NC TraCS serving as the single point of entry for all research requests. All requests go through a standard intake and review process (see Figure 1).

Figure 1. Carolina Data Warehouse for Health request process. All research requests for data go through this intake and approval process. Of note is the dual review pathway. Most requests are approved through administrative (expedited) review, which includes a step to ensure the request aligns with the IRB protocol. Requests that are controversial or represent additional risk, such as large data sharing projects or requests that involve recruitment of children, are reviewed by the CDW-H governance committees.

The data request form (see Supplementary Appendix A) includes questions about inclusion/exclusion criteria, data elements requested, planned use of the data, and data sharing plans (if any). The structured nature of the data request form ensures that there is adequate information to define the scope of the project and conduct governance review. Requests are fulfilled by CDW-H honest brokers, designated analysts within NC TraCS or approved departments who are trained in querying healthcare data and navigating regulatory issues. The honest broker is not a member of the study team and does not participate in the research or analyses. CDW-H supports requests for fully identified, HIPAA limited, and deidentified EHR data. Requests fall into 3 categories:

(1) Study recruitment: With appropriate approvals, study teams can request datasets of patients that potentially meet their study criteria and may receive approval to contact these patients via mail, phone, patient portal, or in clinic.

(2) Longitudinal EHR data for patients enrolled in trials, cohort studies, or registries: After a patient is enrolled in a study, information about that patient’s medical history and future clinical outcomes may be requested, supplementing the data collected by the study team.

(3) Secondary data analyses: With waivers of both consent and HIPAA authorization in place, researchers may request datasets without patient contact. Such studies allow investigators to analyze trends and outcomes in real-world data.

Governance structure

CDW-H governance comprises the CDW-H Oversight and Operations Committees. The CDW-H Operations Committee reviews all data requests through an administrative pathway or full committee meeting. The CDW-H Oversight Committee sets policy for the CDW-H and reviews precedent-setting requests and appeals. Both Committees benefit from interdisciplinary membership rosters that include clinician scientists, Office of Human Research Ethics (IRB) leadership, informatics researchers, public health researchers, legal counsel, privacy office staff, and patient representatives. Members include representatives from both the University and Health System. An administrative review process was developed in response to an increased volume of requests and recognition that many requests did not require extensive discussion. In this expedited review, a staff member and the CDW-H Operations Chair assess whether the request meets HIPAA and IRB requirements. Examples of requests eligible for administrative review include many datasets for secondary analyses if data will remain within UNC, recruitment lists composed of adults, and data regarding patients who have consented to participate in the associated study. The CDW-H review process is not intended to replace or circumvent the IRB process. Rather, the reviews complement one another and often evaluate different issues.
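The dual review pathway described above can be illustrated with a small triage sketch. This is only an illustration under stated assumptions: the field names, the `review_pathway` function, and the rule ordering are hypothetical and are not taken from the CDW-H intake system. In practice the decision rests on staff and Chair judgment (the text says "many", not all, secondary analyses qualify), so the sketch defaults to committee review whenever no expedited rule clearly applies.

```python
from dataclasses import dataclass


@dataclass
class DataRequest:
    # All field names are hypothetical, chosen for this sketch only.
    purpose: str                  # "secondary_analysis", "recruitment", or "enrolled_followup"
    shares_data_externally: bool  # data would leave the institution
    involves_children: bool       # recruitment list includes minors
    patients_consented: bool      # patients already consented to the study


def review_pathway(req: DataRequest) -> str:
    """Return "administrative" or "committee" for a request (illustrative rules)."""
    # Requests that carry additional risk go to the full committee.
    if req.shares_data_externally or req.involves_children:
        return "committee"
    # Secondary analyses whose data remain inside the institution.
    if req.purpose == "secondary_analysis":
        return "administrative"
    # Recruitment lists of adults (children were routed above).
    if req.purpose == "recruitment":
        return "administrative"
    # Data on patients who consented to the associated study.
    if req.purpose == "enrolled_followup" and req.patients_consented:
        return "administrative"
    # Anything else gets full discussion by default.
    return "committee"
```

A real intake process would record many more attributes; the point of the sketch is only that expedited review is an allow-list, with committee review as the fallback.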

Governance approach

The primary goal of CDW-H governance is to determine how best to safely use clinical data for research to ultimately improve patient health. The challenge is that it can be difficult to define what uses are appropriate. While HIPAA provides us with the legal guardrails we must operate within, we have learned that questions about data disclosure and sharing often go beyond what HIPAA considers. UNC’s interdisciplinary governance committees exist to address this exact challenge. All requests must receive appropriate IRB review. The CDW-H review process includes a step to ensure the request and protocol align; if they do not, the discrepancy must be addressed before the request can be approved. Beyond IRB and HIPAA requirements, governance often considers the following:

Is the request limited to the minimum necessary data?
As machine learning and analyses of large cohorts become more common, researchers are asking for more and more data: more patients, more years of data, or more data elements for each patient. However, HIPAA requires that only the minimum information necessary to complete a task be disclosed; therefore, the Committee may require that such requests be narrowed in scope, or that the scope of the request be well justified (by, for example, a consultation with a statistician).

Are data sharing plans justified and in compliance with legal requirements?
Projects requiring the sharing of data outside the institution are becoming more common. This represents additional inadvertent disclosure risk, so the Committee must weigh the value of the data sharing against that risk. Key considerations include what data will be shared, what plans there are for data reuse, how shared data will be stored, and whether a data sharing agreement will be in place. When sharing data, a study team may also request approval for data linkage. Linking CDW-H data with, for example, claims data has the benefit of filling in gaps in EHR data, reducing missing data bias. The Committee considers how data will be linked, what identifiers (if any) need to be shared to support the linkage, and whether the linked data itself presents risk of reidentification.

What impact could this request have on patients?
Ever present on the Committee’s mind is the impact a request may have on patients. For example, CDW-H data is frequently used to support recruitment, and the Committee recognizes that patients may have concerns if they receive a recruitment letter based on information in their medical records. UNC created template recruitment language to help address this concern. The Committee also advises on methods to prevent negative impact on patients, for example by requesting that recruitment lists of pregnant people exclude patients with diagnosis or procedure codes signifying miscarriage.

What impact could this request have on our institution?
The Committee considers how a request may reflect on UNC. A common concern when sharing EHR data is the possibility that data could be misused for competitive purposes. Consider a project that allows researchers to access data from multiple health systems; the pooled data could enable comparing rates of post-surgical complications among competing institutions. The Committee may require that UNC’s name not be disclosed in the combined dataset.

The above is not an exhaustive list but does represent many of the most common issues the Committee considers. The Committee judges each case individually with the goal of finding common ground with the study team in an attempt to “get to yes.” If immediate approval is not possible, the Committee may respond by coaching the study team to make modifications in order to meet the research needs of their project, while also meeting the goal of protecting patients. Table 1 outlines common scenarios the Committee has faced and examples of their responses.
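The recruitment safeguard described above (excluding patients whose records carry certain diagnosis or procedure codes before a recruitment list is released) amounts to a simple filter. The sketch below is a hedged illustration: the record layout, the function name `filter_recruitment_list`, and the code values are hypothetical placeholders, not the CDW-H schema or any real code set.

```python
def filter_recruitment_list(candidates, exclusion_codes):
    """Drop candidates whose record contains any excluded code.

    candidates: list of dicts with hypothetical keys "patient_id" and "codes".
    exclusion_codes: iterable of code strings the reviewers asked to exclude.
    """
    excluded = set(exclusion_codes)
    # Keep a candidate only if none of their codes is in the excluded set.
    return [c for c in candidates if excluded.isdisjoint(c["codes"])]


# Placeholder data; "EXCL-1" stands in for whatever codes reviewers specify.
candidates = [
    {"patient_id": "p1", "codes": {"DX-A", "DX-B"}},
    {"patient_id": "p2", "codes": {"DX-A", "EXCL-1"}},
    {"patient_id": "p3", "codes": set()},
]
recruitment_list = filter_recruitment_list(candidates, ["EXCL-1", "EXCL-2"])
kept_ids = [c["patient_id"] for c in recruitment_list]
```

Here `kept_ids` contains `p1` and `p3`; `p2` is removed because one of its codes is on the exclusion list.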
Table 1. Common request scenarios and examples of the Committee’s past responses

Scenario: Sharing fully identified dataset with outside institution
Example responses:
- Ask research team to justify the requested data (eg, “What analytical purpose does exact street address serve?”)
- Suggest alternative variables to achieve a similar goal (eg, providing census tract instead of full address)
- Evaluate options to avoid release of identifiers unless necessary (eg, date shifting where exact dates are not required)

Scenario: Cohort size or control group size appears excessively large (eg, 100 controls for each case)
Example responses:
- Request justification for cohort or control group size
- Recommend (or require) a consult with CTSA biostatistics service
- Add inclusion criteria when appropriate (eg, only include patients with at least 3 encounters in the study period)

Scenario: Cohort definition for a recruitment dataset is very broad, while recruitment goal is small (eg, a list of 500 000 patients in order to recruit 25 participants)
Example responses:
- Educate researcher that CDW-H is more appropriately used to recruit more narrowly defined populations
- Recommend consult with CTSA recruitment service

Scenario: Cohort definition targets sensitive recruitment population (eg, teenagers with suicidal ideation)
Example responses:
- Recruitment criteria may be narrowed by, for example, requiring a specified diagnosis code to appear multiple times on a patient’s record, rather than once, or requiring chart review after receipt of dataset but prior to patient contact
- Amendments to recruitment materials may be required to ensure language is benign and unlikely to cause distress

Scenario: Request to link data with an external dataset, such as claims data or EHR data from another institution
Example responses:
- Recommend a linkage methodology that does not require sharing identifiers (ie, privacy preserving record linkage) [22,23]
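The last row of Table 1 recommends linkage that does not require sharing identifiers. One common family of approaches is hash-based: each site turns normalized identifiers into a keyed hash and only the hashes are compared. The sketch below is a minimal illustration of that idea and is not the tool or method the Committee recommends; the function names, the choice of fields, and the key handling are assumptions, and production tools add further safeguards that are not shown here.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Lowercase and strip accents, spaces, and punctuation before hashing."""
    text = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in text if ch.isalnum()).lower()


def linkage_token(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Keyed SHA-256 hash of the normalized identifiers (hypothetical field choice)."""
    message = "|".join(normalize(p) for p in (first, last, dob)).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


# Both sites use the same secret key, exchanged out of band (placeholder value).
key = b"example-shared-secret"

# Each site builds its tokens locally; only tokens and local study IDs are shared.
ehr_tokens = {linkage_token(key, "Ann", "Lee", "1980-01-02"): "ehr-001"}
claims_tokens = {linkage_token(key, " ann", "LEE", "1980/01/02"): "claims-9"}

# Records are linked where the tokens match exactly.
matches = {
    ehr_tokens[t]: claims_tokens[t]
    for t in ehr_tokens.keys() & claims_tokens.keys()
}
```

Because the hash is exact, any difference that survives normalization (a misspelled name, a wrong birth date) prevents a match, which is one reason reviewers would still ask how a proposed linkage handles data quality.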

RESULTS

We aim for a governance process that is compliant, efficient, and supportive of research. In practice, this means that data requests: (1) are reviewed in a timely manner, (2) align with the corresponding IRB protocol, and (3) comply with institutional policy and federal law. To measure success, we reviewed the outcomes of data requests received in 2020, as shown in Table 2.

Table 2. CDW-H governance outcomes

Summary of requests | Requests (#) | Requests (%)
Data requests reviewed | 319 | 100%
Data requests approved | 302 | 94.7%
Requests reviewed via administrative pathway | 265 | 83.1%
Requests reviewed via full committee pathway | 54 | 16.9%

Administrative pathway time from submission to approval (a) | Admin. reviewed requests (#) | Admin. reviewed requests (%)
0–2 days | 98 | 37.0%
3–14 days | 94 | 35.5%
15–28 days | 34 | 12.8%
29 days and over | 29 | 10.9%
Not approved (b) | 10 | 3.8%

Administrative pathway action (requests may receive stipulations in multiple categories) | Admin. reviewed requests (#) | Admin. reviewed requests (%)
Approve without stipulations | 202 | 76.2%
Regulatory-related changes required (eg, IRB protocol modification) | 57 | 21.5%
Other (eg, clarification needed to understand if data will be shared) | 8 | 3.0%

Committee pathway time from submission to approval (a) | Comm. reviewed requests (#) | Comm. reviewed requests (%)
0–28 days | 14 | 25.9%
29–60 days | 14 | 25.9%
61 days and over | 19 | 35.2%
Not approved (b) | 7 | 13.0%

Committee pathway action (requests may receive stipulations in multiple categories) | Comm. reviewed requests (#) | Comm. reviewed requests (%)
Approve without stipulations | 14 | 25.9%
Regulatory-related changes required (eg, IRB protocol modification) | 14 | 25.9%
Data sharing agreement required and/or modification to data sharing plans required | 26 | 48.1%
Scope modification or justification required | 4 | 7.4%
Other (eg, add inclusion criteria to increase specificity; add criteria to ensure appropriate guardian for child is contacted) | 2 | 3.7%

(a) Request approval may be dependent on factors outside of CDW-H governance’s control. For example, approval may depend on the study team making an IRB protocol modification.

(b) In some cases, a request may be reviewed and receive stipulations, but the study team will choose not to proceed with the required changes. These requests are marked as not approved.

Of the 319 requests reviewed, 302 (94.7%) were approved. Most requests (265, 83.1%) were reviewed via administrative review. Seventy-two percent (192) of these requests received approval in less than 2 weeks. Fifty-seven requests (21.5%) required modification to the data request, IRB protocol, or both in order to ensure alignment across the documents and compliance with requirements. Fifty-four requests (16.9%) were reviewed by the CDW-H Operations Committee. About half of these requests received approval within 2 months. The committee process is lengthier because requests tend to be complex and require multiple consultations. Most committee-reviewed requests receive stipulations, which must be addressed before approval. Common stipulations include modifications to ensure request and IRB protocol alignment or execution of a data sharing agreement. In rare instances, the CDW-H Operations Committee escalates requests to the CDW-H Oversight Committee. Three requests from 2020 were escalated and ultimately approved.
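The timing figures quoted above can be re-derived from the counts in Table 2. The short calculation below is a worked check, not part of the original analysis; the variable names are ours, and the counts are copied from the table.

```python
# Counts copied from Table 2 (administrative pathway, time to approval).
admin_counts = {
    "0-2 days": 98,
    "3-14 days": 94,
    "15-28 days": 34,
    "29 days and over": 29,
    "Not approved": 10,
}
# Counts copied from Table 2 (committee pathway, time to approval).
committee_counts = {
    "0-28 days": 14,
    "29-60 days": 14,
    "61 days and over": 19,
    "Not approved": 7,
}

admin_total = sum(admin_counts.values())
committee_total = sum(committee_counts.values())

# Administrative requests approved within 2 weeks (0-14 days).
admin_within_2_weeks = admin_counts["0-2 days"] + admin_counts["3-14 days"]
admin_share = round(100 * admin_within_2_weeks / admin_total, 1)

# Committee requests approved within about 2 months (0-60 days).
committee_within_60_days = committee_counts["0-28 days"] + committee_counts["29-60 days"]
committee_share = round(100 * committee_within_60_days / committee_total, 1)

# Requests not approved across both pathways.
not_approved = admin_counts["Not approved"] + committee_counts["Not approved"]
```

The totals come to 265 and 54, matching the pathway counts in the table; 192 of 265 administrative requests (72.5%) were approved within 2 weeks, 28 of 54 committee requests (51.9%) within 60 days, and the 17 requests not approved account for the gap between 319 reviewed and 302 approved.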

DISCUSSION

Data governance practices will necessarily vary by institution. Our governance approach has commonalities with the spectrum of criteria and procedures described in Pavlenko et al, including requirements for human subjects protection training, IRB approval, and data sharing agreements, and attention to patients’ perspectives and institutional reputation when reviewing requests. Notably, however, the spectrum in Pavlenko et al does not include an explicit focus on whether a request is limited to the minimum necessary data. As interest and capacity for working with larger datasets grow, we expect more institutions will need to address this issue.

The success and longevity of CDW-H governance are attributable to a combination of factors. The bedrock of CDW-H governance is strong institutional support from UNC Health and UNC-Chapel Hill and close relationships among stakeholders within both organizations. Importantly, CDW-H governance’s purview is narrow, limited to the review of requests for CDW-H data. The Committee intentionally avoids commenting on the protocol or scientific merit of a project. Having a single point of entry helps to reduce confusion and ensures requests are reviewed consistently.

The CDW-H governance process and committees have proven capable of adapting, and this will be critical to continued success. Trends in clinical informatics, including increased interest in data sharing, a growing appetite for larger analytical datasets, and heightened interest in analyzing clinical notes, will present new challenges for our governance system as we seek to meet the needs of researchers while also protecting patients and data.

CONCLUSION

Our governance process has proven effective and efficient for UNC over the past decade. The Committees are a valuable resource for the University and Health System. They help ensure clinical data are provisioned appropriately and researchers are educated about the benefits and sensitivities of working with clinical data. Though many data governance challenges lie ahead, our past experience demonstrates that this system is a robust one that is able to address a dynamic clinical research environment.

FUNDING

This work was supported by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through Grant Award Number UL1TR002489. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

AUTHOR CONTRIBUTIONS

Manuscript drafting: KMW, AJ, ERP, and TSC. Initial implementation of governance processes: TSC, BL, DCS, and MR. Ongoing leadership of governance processes: AJ, BL, ERP, NJS, DCS, MR, and KMW. Manuscript revisions and final approval: TSC, AJ, BL, ERP, NJS, DCS, MR, and KMW.

Supplementary material

Supplementary material is available at Journal of the American Medical Informatics Association online.