Kerina Helen Jones1, Sharon Heys1, Karen S Tingay1, Paul Jackson2, Chris Dibben3.
Abstract
INTRODUCTION: Administrative data arising via the operation of public service delivery systems hold great benefits for citizens and society by enabling new research questions to be addressed, providing they can be made available in a safe, socially acceptable way. In recognition of this potential, the UK Administrative Data Research Network was established in 2013 to enable new research for public benefit. However, there are considerable challenges to be overcome for effective data use, and many of these are common to administrative data enterprises in general. Using this network as a practical case study, we set out to explore the issues and propose how to share the 'good', suggest solutions to the 'bad', and improve the 'clunky' issues, to lead to improvements in administrative data use.
Keywords: Administrative data research; data access; data linkage
Year: 2018 PMID: 32935024 PMCID: PMC7479922 DOI: 10.23889/ijpds.v4i1.587
Source DB: PubMed Journal: Int J Popul Data Sci ISSN: 2399-4908
1. This entails groupings of partners taking forward the creation of new linked datasets [14].
2. The UK Digital Economy Act (2017) extends the opportunity for data sharing by government departments: http://www.legislation.gov.uk/ukpga/2017/30/contents/enacted.
3. The 5Ps plan is a set of principles devised by Paul Jackson (ADS Strategic Data Negotiator) and stands for personality, prospectus, pathway, partnership and planning the service.
4. This is where data are brought together for research, but are deleted when no longer required for the study. This, and theming, are expanded upon in the discussion on existing workstreams.
Increasing dialogue between data providers, ADRCs and researchers to promote the value of data sharing for research, and to provide assurance of risk mitigation. Placing the focus on research themes¹ rather than on individual government departments. Adopting the principles of the Digital Economy Act² and implementing the 5Ps plan, plus the 6th P for ‘define the product’³.
Streamlining network approvals processes and allowing researchers to attend the peer review panel to address queries upfront. Documenting and sharing experiences of going through approval processes to identify common issues and inform others. Providing case studies to illustrate consent requirements.
Building confidence in network trustworthiness by including the 6th P (product) in the ‘5Ps’ plan, which would be attractive to data providers, and including this in the network prospectus. Developing a set of flexible principles enabling researchers to self-check their proposed outputs with reference to experts. Providing common training for checkers and training for researchers.
Increasing dialogue with data providers to emphasise the value of good quality datasets and accurate metadata. Documenting and sharing solutions to tricky issues. Acknowledging the intellectual effort needed, creating a persistent asset listing for datasets, and a dataset citation index to track dataset usage.
Clarifying expectations and documenting the roles of researcher support staff. Increasing connections with outside networks for mutual support, ideas, training, funding and collaboration. Conveying to funders that greater timing flexibility is needed for data-intensive research to allow for unknown delays in data delivery to researchers.
Encouraging data reuse and requiring good reasons before supporting a project unwilling to allow reuse. Moving away from create-and-destroy, and building transparency into data retention models, including the levels of control data providers wish to retain in the reuse of their data, with class approval for similar projects. Building awareness among data providers and funders of the value of data retention, with due regard to risk mitigation.
5. Please note that in the survey Q9 ‘Disclosure control in release of results’ followed ‘Availability of analysis tools’, as it was set out to approximate the order of the data use pathway. After the survey, the question responses were grouped into 6 areas, resulting in a slight change of order: placing Q9 into C) Controls on access and disclosure, since they are similar in topic. As the new order is used through the remainder of the paper to the recommendations, the question numbers in the table have been set out accordingly.
| Step | Topic | Description |
|---|---|---|
| 1 | Identifying potential datasets | Gaining awareness of datasets of interest, their locations and their data custodians |
| 2 | Acquiring datasets | Legal, technical and procedural processes for transferring datasets |
| 3 | Obtaining data provider permissions | The types of permissions required and how to apply |
| 4 | Regulatory approval processes | Navigating and securing lawful and ethical approvals |
| 5 | Peer-review approvals | The requirements of network and funder peer-review panels |
| 6 | Obtaining consent to link data | Understanding when consent to link is required and how to go about gaining it |
| 7 | Accessing data | The processes by which data are accessed |
| 8 | Disclosure control in data access | The measures applied to mitigate risk in data accessed for research |
| 9⁵ | Disclosure control in release of results | The measures applied to mitigate risk in results released for dissemination |
| 10 | Data formats | Dealing with differences in data formats and compatibility |
| 11 | Data quality | Issues of completeness and accuracy |
| 12 | Linkage quality | Reliability of the linkage process |
| 13 | Metadata | Dataset descriptors and documentation (for locating and using data) |
| 14 | Support available to researchers | How to provide effective support to researchers |
| 15 | Acquiring analysis skills | The range of skills needed for data querying and manipulation |
| 16 | Availability of analysis tools | Ensuring a range of tools are available to data users |
| 17 | Reuse of administrative data | Processes for enabling the reuse of data, as opposed to one-off uses |
| 18 | Data retention and archiving | Having a suitable process to retain and archive data beyond the project life-span |
In terms of the other topics identified, i.e. the need for better communications across the ADRN and more attention to public engagement, a cross-network directors’ update was introduced, and the public engagement work of the ADRCs in Scotland and Wales was acknowledged, along with a need to extend the work.
| | Area | Key message | Actions |
|---|---|---|---|
| A) | Data acquisition pathway (recommendations 1-3) | Need for a more streamlined and definite process for data acquisition, with good information and data documentation | A programme of workshops involving a wide range of stakeholders was initiated in November 2017. Each workshop sought to agree on a research area and to develop datasets to match this. Four main themes were agreed upon: world of work; data for children; growing old; and productive society [ |
| B) | Approval processes (recommendations 4-6) | Concern was raised over duplication within the approval processes and the need for clear guidance | The network peer-review approvals panel undertook a self-assessment exercise. The panel considered the benefits of researchers attending meetings and decided that follow-up outside meetings would be more effective. This approvals panel is independent of the ADRN and so makes decisions on its own operation. The themed partnership approach will meet the recommendations on clarifying consent, as theme partners will take on the role of licensing authorities, defining the conditions for reuse of the data they create. |
| C) | Controls on access and disclosure (recommendations 7-9) | Secure settings and disclosure control are valued within ADRN, and emphasis should be put on facilitating access cross-nationally | The series of stakeholder workshops, with a focus on research themes, is enabling the development of a more standardised process for access to data, with more predictable timescales. ADRC Scotland is leading on a programme to increase the availability of safe settings (safe pods) for accessing data. A common outcome from all the workshops is a focus on enriching data for longitudinal studies. Increased training in SDC is being planned. |
| D) | Data and metadata (recommendations 10-13) | Emphasis on documentation of data and good metadata is needed | The theme partners are working to develop datasets for each research theme. As the datasets are produced, they are documented and metadata is developed. Data quality reports will be shared with data providers and they will be fully involved in the testing of the dataset for use. From time to time, guides are commissioned to provide an overview of the data and its background, and datasets will be curated with persistent identifiers. |
| E) | Researcher support (recommendations 14-16) | Need for more consistent support and communication with researchers | Researcher support staff are based at each ADRC and the teams are coordinated across the ADRN by the ADS. This is seen as a good service; however, it is recognised that each ADRC has its own local procedures, which may be, at least partly, the cause of the identified inconsistencies. Research funders are being included in the themed workshops so they have fuller knowledge of timescales in working with administrative data, and the need to build in flexibility. |
| F) | Data reuse & retention (recommendations 17-18) | The ADRN should move towards a reuse model | ADRN policy has been changed, moving away from create-and-destroy to data reuse for research. The themed approach concentrates on the delivery of curated datasets which are functionally anonymised and made available for reuse in research by accredited researchers. Projects have been clustered into themes and used as exemplars as part of discussions during acquisition and development of the datasets. Clearly defined stewardship of retained data will be needed, with an asset registration number for each dataset, and agreed arrangements for archiving. |