Emily Griffiths, Rebecca M Joseph, George Tilston, Sarah Thew, Zoher Kapacee, William Dixon, Niels Peek.
Abstract
OBJECTIVE: How health researchers find secondary data to analyse is unclear. We sought to describe the approaches that UK organisations take to help researchers find data and to assess the findability of health data that are available for research.
Keywords: information management; medical informatics; record systems
Year: 2022 PMID: 35193857 PMCID: PMC8867248 DOI: 10.1136/bmjhci-2021-100325
Source DB: PubMed Journal: BMJ Health Care Inform ISSN: 2632-1009
List of public sector organisations that took part in the surveys
| Repository | Description | URL |
| Health Data Finder for Research | Health Data Finder is a metadata catalogue aiming to inform potential users about health datasets that are available for use in research | |
| UK Data Service* | The UK Data Service enables access to a range of datasets, primarily in the field of social and economic research; funded by the Economic and Social Research Council (ESRC) | |
| Consumer Data Research Centre (CDRC)* | The CDRC enables access to routinely collected consumer data; funded by the ESRC | |
| Urban Big Data Centre (UBDC)* | The UBDC enables access to urban-related data; funded by the ESRC | |
| Administrative Data Research Network (ADRN)* | The ADRN was a service funded by the ESRC to enable secure access to datasets | |
| Electronic Data Research and Innovation Service (eDRIS) | eDRIS is a service coordinating access to the national Scottish health datasets | |
| Health Informatics Centre—Trusted Research Environment (University of Dundee) | A data safe haven run as part of the University of Dundee, affiliated with National Health Service (NHS) Tayside and NHS Fife; the service coordinates access to local health datasets | |
| NHS Greater Glasgow and Clyde Safe Haven | A data safe haven and data service coordinating access to local health datasets | |
| CALIBER (University College London) | A platform for sharing data and methodologies; linked primary care, secondary care (hospital admissions), mortality and cancer registry data | |
| Her Majesty’s Revenue and Customs (HMRC) Data Lab* | A service providing secure access to deidentified HMRC data | |
| Connected Health Cities (CHC) North East and North Cumbria | CHC is a programme in the North of England which aims to use local health data and technology to improve health services; North East and North Cumbria are developing infrastructure to connect local hospitals with their trustworthy research environment, which will include development of a metadata catalogue | |
| CHC Connected Yorkshire | Connected Yorkshire is based across Leeds, Sheffield and Bradford and works with the established Born in Bradford cohort; the dataset information described in this paper relates to the Born in Bradford study | |
*Not primarily health organisations.
Description of target UK e-cohorts assessed for findability in direct and indirect searches
| E-cohort | URL | Responsible organisation | Description | Number of 2018 search results: direct (indirect) | Number of 2021 search results: direct (indirect) |
| Clinical Practice Research Datalink | | MHRA (Medicines and Healthcare products Regulatory Agency)/National Institute for Health Research | Primary care research dataset with linkage to additional datasets | Bing 9 (1) | Bing 14 (8) |
| The Health Improvement Network | | Cegedim* | Primary care research dataset | Bing 3 (0) | Bing 1 (2) |
| QResearch | | The University of Oxford; EMIS (Egton Medical Information Systems)* | Primary care research dataset | Bing 5 (0) | Bing 0 (3) |
| ResearchOne | | TPP SystmOne* | Primary care research dataset | Bing 4 (0) | Bing 0 (1) |
| Consultations in Primary Care Archive | | Keele University | Primary care research dataset | Bing 0 (1) | Bing 0 (0) |
| Hospital Episode Statistics | | National Health Service (NHS) Digital | Secondary care dataset | Bing 7 (3) | Bing 6 (3) |
| Salford Integrated Record | | Salford Royal NHS Foundation Trust | Integrated primary and secondary care dataset | Bing 0 (0) | Bing 0 (0) |
| Prescribing Information System | | NHS Scotland | National prescribing dataset | Bing 0 (1) | Bing 0 (1) |
| SAIL databank | | Swansea University; NHS Wales; Health and Care Research Wales | Linked health and other routinely collected datasets | Bing 4 (0) | Bing 1 (6) |
| NHS Lothian Research Safe Haven/The University of Edinburgh | | NHS Lothian, University of Edinburgh, Edinburgh Napier University, Queen Margaret University | Service coordinating access to linked health datasets across the Lothian region, Scotland | Bing 0 (2) | Bing 0 (0) |
| Grampian Data Safe Haven | | NHS Grampian and the University of Aberdeen | Service coordinating access to linked health datasets across the Grampian region, Scotland | Bing 0 (1) | Bing 0 (0) |
| Health Informatics Centre—Trusted Research Environment (University of Dundee) | | University of Dundee | Service coordinating access to linked health datasets across the Tayside region, Scotland | Bing 2 (1) | Bing 1 (0) |
| NHS Greater Glasgow and Clyde Safe Haven | | NHS Greater Glasgow and Clyde and the Robertson Centre for Biostatistics, University of Glasgow | Service coordinating access to linked health datasets across the Greater Glasgow and Clyde region, Scotland | Bing 2 (2) | Bing 0 (0) |
Most of the organisations are public sector.
*Commercial organisations.
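The search-result cells above pack three values into one string: the search engine, the number of direct results, and the number of indirect results in parentheses. A minimal sketch of how such cells could be split for analysis follows. The function names and the three sample rows are illustrative choices, not part of the study's method; the cell values are copied from the table.

```python
import re

# Pattern for cells such as "Bing 9 (1)": engine, direct count, (indirect count).
CELL = re.compile(r"^(?P<engine>\w+)\s+(?P<direct>\d+)\s+\((?P<indirect>\d+)\)$")


def parse_cell(cell):
    """Split a result cell into (engine, direct, indirect)."""
    match = CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognised cell format: {cell!r}")
    return match["engine"], int(match["direct"]), int(match["indirect"])


def found_directly(cell):
    """True when at least one direct search result was recorded."""
    return parse_cell(cell)[1] > 0


# Three rows copied from the table: (2018 cell, 2021 cell).
rows = {
    "Clinical Practice Research Datalink": ("Bing 9 (1)", "Bing 14 (8)"),
    "QResearch": ("Bing 5 (0)", "Bing 0 (3)"),
    "Salford Integrated Record": ("Bing 0 (0)", "Bing 0 (0)"),
}

for name, (cell_2018, cell_2021) in rows.items():
    print(name, found_directly(cell_2018), found_directly(cell_2021))
```

Keeping the parser strict (it raises on any unexpected format) makes it easier to spot cells that were damaged or entered inconsistently.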
Figure 1. Internet search process: looking for health datasets via two popular, general search engines (1) and via catalogues (2).
Catalogues of UK-based e-cohorts found through general internet search engines in 2018 and the number of target e-cohorts within them in 2018 and 2021
| Catalogue | Web link (correct in March 2018 at the time of searching) | Number of targets found (2018) | Number of targets found (2021) |
| Health Data Finder for Research | | 2 | NA |
| Children and young people’s health data catalogue 2009 | | 0 | NA |
| NHS Digital: Data and information | | 1 | 1 |
| Perinatal mental health: national datasets | | 1 | 1 |
| NHS England Data Catalogue | | 1 | 1 |
| National Data Catalogue Scotland | | 1 | 1 |
| Asthma UK Data Catalogue | | 1 | 5 |
| Urban Big Data Centre Health and social care data | | 0 | 0 |
| Social Services Improvement Agency Data Catalogue | | 0 | 0 |
Catalogues no longer accessible in 2021 are marked as NA.
Assessment of findability within catalogues, including whether the catalogue listed target e-cohorts from table 2 (see figure 1)
| When found | Catalogue name | Target e-cohorts listed | Searchability | Metadata | Unique and persistent identifier |
| Found in 2018 but not in 2021 | Health Data Finder for Research | Clinical Practice Research Datalink (CPRD) | Can filter | Dataset and field level | No |
| Found in 2018 but not in 2021 | Children and young people’s health data catalogue 2009 | – | Downloadable file | Dataset level | No |
| Found in 2018 and 2021 | NHS Digital: Data and information | HES | Search bar; Can filter | Dataset level | No |
| Found in 2018 and 2021 | NHS England Data Catalogue | HES | Search bar; Can filter | Dataset and field level | No |
| Found in 2018 and 2021 | Perinatal mental health: national datasets | HES | Downloadable file | Dataset and field level | No |
| Found in 2018 and 2021 | Asthma UK Data Catalogue | HES | Search bar; Dropdown list | Dataset level | No |
| Found in 2018 and 2021 | Urban Big Data Centre Health and social care data | – | Dropdown list | Dataset level | No |
| Found in 2018 and 2021 | Social Services Improvement Agency Data Catalogue | – | Downloadable file | Dataset and field level | No |
| Found in 2018 and 2021 | National Data Catalogue Scotland | PIS | a-z listing | Dataset level | No |
| Not found in 2018, found in 2021 | DataCat (University of Liverpool) | – | Search bar; Can filter | Dataset level | Yes |
| Not found in 2018, found in 2021 | ORDA (University of Sheffield) | – | Search bar; Can filter | Dataset level | Yes |
| Not found in 2018, found in 2021 | UK Data Archive | – | Search bar; Can filter | Dataset level | Yes |
| Not found in 2018, found in 2021 | University of Lancaster | – | Search bar; Can filter | Dataset level | Yes |
| Not found in 2018, found in 2021 | Mauro Data Mapper/Oxford Metadata Catalogue | – | Dropdown list | Dataset and field level | Yes |
| Not found in 2018, found in 2021 | Zenodo | – | Search bar; Can filter | Dataset level | Yes |
| Not found in 2018, found in 2021 | Health Innovation Gateway | CPRD | Search bar; Can filter; Dropdowns; Highlight new datasets | Dataset level | Yes |
| Not found in 2018, found in 2021 | Social Care Wales | – | Search bar; filter; show all | Dataset level | No |
| Not found in 2018, found in 2021 | ONS Secure Research Service | HES | Spreadsheet | Dataset level | No |
Catalogues found in 2018 were revisited in 2021; two were inaccessible, and the other eight were unchanged in terms of metadata detail and presence of identifiers.
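The identifier column of the catalogue assessment can be summarised per group of rows. The sketch below tallies, for each group in the table, how many catalogues do and do not offer a unique and persistent identifier. The tuple layout and the function name are assumptions made for illustration; the values are transcribed from the table rows as listed.

```python
from collections import Counter

# (group, catalogue name, has unique and persistent identifier),
# transcribed from the findability table.
CATALOGUES = [
    ("Found in 2018 but not in 2021", "Health Data Finder for Research", False),
    ("Found in 2018 but not in 2021", "Children and young people's health data catalogue 2009", False),
    ("Found in 2018 and 2021", "NHS Digital: Data and information", False),
    ("Found in 2018 and 2021", "NHS England Data Catalogue", False),
    ("Found in 2018 and 2021", "Perinatal mental health: national datasets", False),
    ("Found in 2018 and 2021", "Asthma UK Data Catalogue", False),
    ("Found in 2018 and 2021", "Urban Big Data Centre Health and social care data", False),
    ("Found in 2018 and 2021", "Social Services Improvement Agency Data Catalogue", False),
    ("Found in 2018 and 2021", "National Data Catalogue Scotland", False),
    ("Not found in 2018, found in 2021", "DataCat (University of Liverpool)", True),
    ("Not found in 2018, found in 2021", "ORDA (University of Sheffield)", True),
    ("Not found in 2018, found in 2021", "UK Data Archive", True),
    ("Not found in 2018, found in 2021", "University of Lancaster", True),
    ("Not found in 2018, found in 2021", "Mauro Data Mapper/Oxford Metadata Catalogue", True),
    ("Not found in 2018, found in 2021", "Zenodo", True),
    ("Not found in 2018, found in 2021", "Health Innovation Gateway", True),
    ("Not found in 2018, found in 2021", "Social Care Wales", False),
    ("Not found in 2018, found in 2021", "ONS Secure Research Service", False),
]


def identifier_counts(rows):
    """Count catalogues with and without an identifier in each group."""
    counts = {}
    for group, _name, has_identifier in rows:
        counts.setdefault(group, Counter())[has_identifier] += 1
    return counts


for group, tally in identifier_counts(CATALOGUES).items():
    print(f"{group}: {tally[True]} with identifier, {tally[False]} without")
```

On the rows as transcribed, identifiers appear only among the catalogues first found in 2021.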