Suzy Gallier1, Gary Price2, Hina Pandya2, Gillian McCarmack2, Chris James2, Bob Ruane2, Laura Forty2, Benjamin L Crosby2, Catherine Atkin2, Ralph Evans3, Kevin W Dunn4, Eliot Marston5, Clark Crawford6, Martin Levermore7,8, Shekha Modhwadia3, John Attwood9, Stephen Perks10, Rima Doal1, Georgios Gkoutos11, Richard Dormer12, Andy Rosser13, Hilary Fanning14, Elizabeth Sapey15,16.
Abstract
INTRODUCTION: Health Data Research UK designated seven UK-based Hubs to facilitate health data use for research. PIONEER is the Hub in Acute Care. PIONEER delivered workshops where patients/public citizens agreed key principles to guide access to unconsented, anonymised, routinely collected health data. These were used to inform the protocol.
Keywords: health care; information management; information systems; medical informatics; record systems
Year: 2021 PMID: 33849921 PMCID: PMC8051388 DOI: 10.1136/bmjhci-2020-100294
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
A summary of related data sources
| Name | Country | Subject areas | Update period | Description |
| HES | UK | All healthcare | Daily (A&E is quarterly) | High-level, does not include physiological measurements outside of classifications in main diagnoses |
| MIMIC | USA | Intensive care | Static | Deidentified, highly detailed data covering the period 2001–2012 |
| WHO Global Health Observatory | Global | All healthcare | Variable | High-level count data, on a global scale, does not go to the level of individual patients |
| Global Health Data Exchange | Global | All healthcare | Variable | Catalogue of existing datasets, generating novel data is outside of its scope |
| NIHR Health Informatics Collaborative | UK | Thematic | | Open only to members of the HIC for collaboration; does not provide a TRE |
| PIONEER | UK | Acute care including preceding and subsequent health contacts | On demand | Datasets tailored to specific use cases, updated on demand and available via a secure TRE. Individual patient level data that includes medications, physiological measurements, images over 20 years. Individually linked data from primary care, ambulance and secondary care. |
This summary should not be considered comprehensive, but it highlights the differences between currently available datasets and PIONEER. HES, Hospital Episode Statistics (24); MIMIC Critical Care Dataset (25); Global Health Observatory (26); Global Health Data Exchange (27); NIHR, National Institute for Health Research, Health Informatics Collaborative (28).
TRE, trusted research environment.
Figure 1. The process of pseudonymising health data in PIONEER using a salt code. Data partners hold identifiable health data, illustrated here by the date and time of birth. This is extracted from the record in an identifiable form and then pseudonymised using a salt code. The date and time of birth cannot be calculated from the resulting hash. The data is then transferred by secure and encrypted pathways to University Hospitals Birmingham NHS Foundation Trust (UHB). At UHB, a proportion of records will be made reidentifiable for QA/QC purposes, after which the data remains in the hashed format. If that data is requested, the hash will be transformed into an age in years or into appropriate age bins, as determined on a case-by-case basis and as approved by the Data Trust Committee (DTC).
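The salted pseudonymisation step described in figure 1 can be illustrated with a minimal Python sketch. The source does not name the hash function or the salt format, so the use of HMAC-SHA256 here is an assumption, and the names `pseudonymise`, `age_in_years` and `age_bin`, the example salt and the 10-year bin width are all hypothetical. The sketch also does not model how PIONEER maps a stored hash back to an age, which the caption does not describe.

```python
import hashlib
import hmac
from datetime import date, datetime


def pseudonymise(birth: datetime, salt: bytes) -> str:
    """Return a keyed digest of a date and time of birth.

    Assumption: HMAC-SHA256 keyed with the salt. The source only says
    that a "salt code" is used to produce a hash.
    """
    message = birth.isoformat().encode("utf-8")
    return hmac.new(salt, message, hashlib.sha256).hexdigest()


def age_in_years(birth: date, on: date) -> int:
    """Whole years between a date of birth and a reference date."""
    years = on.year - birth.year
    # Subtract one year if the birthday has not yet occurred in `on`'s year.
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1
    return years


def age_bin(age: int, width: int = 10) -> str:
    """Coarsen an age into a bin label such as '70-79' (width is hypothetical)."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Illustrative values only; a real salt would be a protected secret.
salt = b"example-salt-not-a-real-secret"
birth = datetime(1950, 3, 14, 8, 30)
token = pseudonymise(birth, salt)
```

The same input and salt always give the same token, so records can still be linked, while anyone without the salt cannot recompute the token from a guessed date of birth. Releasing an age bin rather than an exact age further reduces the chance of identifying an individual.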
Figure 2. The data staging and access process. Data can enter PIONEER from internal (UHB) and external data providers. At stage 1, data has been cleansed, normalised and sent to UHB in a pseudonymised form. At stage 2, the pseudonymised data is checked, cleansed and undergoes QA/QC by the PIONEER team. From this data, a metadata catalogue is formed, providing high-level data descriptions, which describe the kind of data PIONEER holds. The metadata catalogue is available for researchers to browse. At stage 3, the pseudonymised data is moved to the secure cloud, only accessible by PIONEER team members for further QA/QC processes. If a data request is received, it is reviewed by the Data Trust Committee (DTC) and PIONEER team. The PIONEER team perform a due diligence check and assess the request in terms of risk (see table 1). If the DTC do not support data access, no data access occurs. If the DTC support data access, a data licensing agreement is formed and the exact cut of data required is anonymised, extracted and staged in a bespoke trusted research environment (TRE). At stage 4, this staged and anonymised data can then be accessed by the researchers using specific log on processes and only approved data can leave the TRE (which would be aggregate data and not individual data lines).
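The two gates in the figure 2 caption (DTC support plus a licensing agreement before staging, and aggregate-only export from the TRE) can be expressed as a short Python sketch. This is an illustration of the decision logic as worded in the caption, not PIONEER's actual software; every class, field and function name below is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    """The four stages named in the figure caption."""
    RECEIVED_PSEUDONYMISED = 1   # cleansed, normalised, sent to UHB
    QA_AND_CATALOGUED = 2        # QA/QC by PIONEER team, metadata catalogue built
    SECURE_CLOUD = 3             # PIONEER-only access, further QA/QC
    TRE_ACCESS = 4               # anonymised cut available to researchers


@dataclass
class DataRequest:
    requester: str
    dtc_supports: bool            # outcome of Data Trust Committee review
    licence_agreed: bool = False  # data licensing agreement in place


@dataclass
class Output:
    name: str
    is_aggregate: bool  # aggregate results rather than individual data lines
    approved: bool      # approved for release from the TRE


def may_stage_in_tre(request: DataRequest) -> bool:
    """Data is staged only if the DTC supports access and a licence exists."""
    return request.dtc_supports and request.licence_agreed


def may_leave_tre(output: Output) -> bool:
    """Only approved, aggregate outputs may leave the TRE."""
    return output.is_aggregate and output.approved


request = DataRequest("example-team", dtc_supports=True, licence_agreed=True)
table = Output("admissions_by_month", is_aggregate=True, approved=True)
rows = Output("patient_level_extract", is_aggregate=False, approved=True)
```

Keeping the two checks as separate functions mirrors the caption: the first gate controls what enters the TRE, the second controls what leaves it.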
An overview of the Data Request Form and data access considerations
| Heading | Requirements |
| The project: Technical summary | Project title, aims, scientific rationale and background in technical language. |
| The project: Lay summary | Project title, aims, scientific rationale and background in lay language. |
| Patient and public involvement | To describe the patient and public involvement and engagement (PPIE) work completed so far and to offer the opportunity for PIONEER supported PPIE. |
| Expected value of the project to the NHS and general public | To describe how the project is likely to lead to patient and public benefit. |
| Data requirements and analysis plan | This includes an exact description of the data fields required, whether aggregate or individual data lines are needed, and the justification for this. PIONEER offers the ability to perform analysis on behalf of the researchers. If the researchers wish to perform their own analysis, a description of techniques and tooling required is requested. |
| Security | Data access is only permitted with a data licensing agreement, but if necessary, anonymised data can be sent to an external trusted research environment (TRE). The security arrangements for this TRE will be listed and reviewed by the PIONEER team, including whether specific ISO standards are met. |
| Expertise | Listed to ensure the researchers have the relevant training and expertise to conduct the analysis. |
| Dissemination plan | PIONEER supports open access of data outputs to ensure insights benefit as many people as possible. |
| Human rights violations and significant harm | All data requestors undergo a due diligence assessment to determine evidence of serious human rights violations, arms manufacturing or trade and tobacco industry involvement. |
| Controversies and data breaches | All data requestors undergo a due diligence assessment to determine evidence of data breaches, falsified scientific reports or involvement in significant controversies including financial irregularities and health and safety fines. |
| Benefit | Designated as clear benefit to NHS patients or society, potential benefit or no potential benefit. |
| Data | Designated as data which is aggregated or highly unlikely to lead to patient identification, data which may lead to patient identification, or data with a realistic potential for patient identification. |
| Security | Designated as providing evidence of data security that meets all requirements, meets most requirements with additional support, or does not meet data security requirements. |
| Potential reputational risk to PIONEER | Designated as low, moderate or high based on due diligence check. |
| Previous dealings | To determine the outcome of previous data access activities. |
| Overall assessment of risk | Low, moderate or high with a recommendation to support or not support data access. |
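The risk designations in the table above could be recorded in a small structured form, sketched here in Python. The table does not say how individual designations combine into the overall assessment, so this sketch does not compute one: the overall risk and the recommendation are supplied by the reviewers. All class and field names are hypothetical, and only a subset of the table's rows is modelled.

```python
from dataclasses import dataclass
from enum import Enum


class Benefit(Enum):
    CLEAR = "clear benefit"
    POTENTIAL = "potential benefit"
    NONE = "no potential benefit"


class Risk(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


@dataclass(frozen=True)
class RiskAssessment:
    benefit: Benefit
    reputational_risk: Risk
    overall_risk: Risk      # entered by reviewers, not derived here
    support_access: bool    # recommendation to support or not support

    def summary(self) -> str:
        verdict = "support" if self.support_access else "do not support"
        return (
            f"Overall risk: {self.overall_risk.value}; "
            f"recommendation: {verdict} data access"
        )


assessment = RiskAssessment(Benefit.CLEAR, Risk.LOW, Risk.LOW, True)
```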