Catrin Tudur Smith, Sarah Nevitt, Duncan Appelbe, Richard Appleton, Pete Dixon, Janet Harrison, Anthony Marson, Paula Williamson, Elizabeth Tremain.
Abstract
BACKGROUND: Demands are increasingly being made for clinical trialists to actively share individual participant data (IPD) collected from clinical trials using responsible methods that protect the confidentiality and privacy of clinical trial participants. Clinical trialists, particularly those receiving public funding, are often concerned about the additional time and money that data-sharing activities will require, but few published empirical data are available to help inform these decisions. We sought to evaluate the activity and resources required to prepare anonymised IPD from a clinical trial in anticipation of a future data-sharing request.
Keywords: Anonymisation; Clinical trial; Cost; Data sharing; IPD; Individual participant data; Transparency
Year: 2017 PMID: 28712359 PMCID: PMC5512949 DOI: 10.1186/s13063-017-2067-4
Source DB: PubMed Journal: Trials ISSN: 1745-6215 Impact factor: 2.279
Fig. 1 Steps involved in the anonymisation process. ¥The unique patient identifier code and date of randomisation can provide valuable information about the sequence and pattern of randomisation. Recoded data should be supplemented by complete flow of trial participants, highlighting any randomisation errors. *Steps 4 to 7 could be performed in any order. PII personally identifiable information, CRF case report form
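The first stage of de-identification described in this record (IDs, personally identifiable information and dates) can be illustrated with a minimal Python sketch. It is not the authors' procedure: the field names (`patient_id`, `name`, `date_of_birth`, `randomisation_date`, `visit_date`), the `P0001`-style codes and the choice to replace calendar dates with days since randomisation are all assumptions made for illustration.

```python
import random
from datetime import date


def build_id_map(patient_ids, seed=None):
    """Map each original patient ID to a new code assigned in shuffled order.

    Shuffling means the new codes carry no information about the order
    in which participants were randomised (hypothetical scheme).
    """
    unique_ids = sorted(set(patient_ids))
    random.Random(seed).shuffle(unique_ids)
    return {orig: f"P{n:04d}" for n, orig in enumerate(unique_ids, start=1)}


def deidentify(records, direct_identifiers=("name", "date_of_birth"), seed=None):
    """Return copies of the records with IDs recoded, PII dropped, dates made relative."""
    id_map = build_id_map((r["patient_id"] for r in records), seed)
    dropped = set(direct_identifiers) | {"patient_id", "randomisation_date", "visit_date"}
    cleaned = []
    for r in records:
        # Keep only fields that are neither direct identifiers nor raw dates/IDs.
        row = {k: v for k, v in r.items() if k not in dropped}
        row["patient_code"] = id_map[r["patient_id"]]
        # Assumption: calendar dates are replaced by days since randomisation.
        row["visit_day"] = (r["visit_date"] - r["randomisation_date"]).days
        cleaned.append(row)
    return cleaned
```

Because recoding of this kind discards the sequence and pattern of randomisation, the caption's advice to supply a complete flow of trial participants alongside the recoded data still applies. Free-text variables are not handled here and would need separate review and redaction.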
Suggested content of the data pack ready for sharing
| Suggested content of data pack | Description |
|---|---|
| Anonymised data | Electronic data collected for each patient in the trial in a format that can be recognised by a wide range of statistical software (e.g. SAS, Stata, R). The use of “StatTransfer” or another similar product may be useful for this purpose |
| Blank CRF | Blank CRFs with descriptions of the data collected. These could be annotated to provide a map of the data variables within the dataset, or provided as blank CRFs along with dataset specifications |
| Dataset specifications | Meta-data describing the datasets e.g. data-freeze date, variable labels, variable descriptions, formats, anonymisation method applied to each variable and summary of amendments made during the trial e.g. changing data definitions, adding/removing variables |
| Protocol | Trial protocol, including all amendments |
| Statistical analysis plan | Methods of analysis and procedures for data handling used in the final statistical analysis (this is useful if researchers want to replicate published analyses to facilitate their understanding of the dataset) |
| Analysis programs | Programmes used for generating and analysing data used in the final analysis report (this is useful if researchers want to replicate published analyses to facilitate their understanding of the dataset) |
| Clinical study report (CSR) (or equivalent) if applicable | Report of efficacy and safety data from the trial that forms the basis of submissions to regulatory authorities e.g. EMA |
CRF case report form, EMA European Medicines Agency
Time required to prepare the data pack for the SANAD trial
| Step of process | Role | Tasks | Time (hours) |
|---|---|---|---|
| Getting access to the data and documentation | Statistician | • Requesting access and liaising with TM and IS | 2.5 |
| | Information systems | • Working out how to open Access database (old version) | 3 |
| | Trial manager | • Locating data files and documentation | 5 |
| De-identification | Statistician | • First stage of de-identification (IDs, personally identifiable information and dates), preparation of variable list and identification of free-text variables | 32 |
| Final data pack | Statistician | • Pull all relevant files and documentation together and transfer to separate secure folder | 1.5 |
| Quality control check | Statistician | • Independent statistician to understand the datasets and check through to ensure that relevant data have been de-identified and check any remaining text variables are suitably redacted to protect patient privacy | 6 |
| Total | Statistician | | 42 |
| | Information systems | | 3 |
| | Trial manager | | 5 |
| | Overall | | 50 |
TM trial manager, IS information systems, GP General Practitioner, CRF case report form
Time required to prepare the data pack for the MENDS trial
| Step of process | Role | Tasks | Time (hours) |
|---|---|---|---|
| Getting access to the data and documentation | Statistician | • Requesting access | 1.5 |
| | Information systems | • Setting up access to SQL server tables | 6 |
| De-identification | Statistician | • First stage of anonymisation (IDs, PII and dates), preparation of variable list and identification of free-text variables | 26 |
| Final data pack | Statistician | • Pull all relevant files and documentation together and transfer to separate secure folder | 1 |
| Quality control check | Statistician | • Independent statistician to check the datasets to ensure that relevant data have been anonymised and check any remaining text variables are suitably redacted to protect patient privacy | 5 |
| Total | Statistician | | 33.5 |
| | Information systems | | 6 |
| | Overall | | 39.5 |
PII personally identifiable information
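The per-role and overall totals in the two time tables can be reproduced from the individual task rows. The short Python sketch below simply re-adds the hours listed above; the tuple layout and function names are illustrative choices, not part of the original study.

```python
from collections import defaultdict

# (trial, role, hours) for each task row in the SANAD and MENDS tables above.
TASKS = [
    ("SANAD", "Statistician", 2.5),         # requesting access, liaising with TM and IS
    ("SANAD", "Information systems", 3),    # opening the old Access database
    ("SANAD", "Trial manager", 5),          # locating data files and documentation
    ("SANAD", "Statistician", 32),          # first stage of de-identification
    ("SANAD", "Statistician", 1.5),         # assembling the final data pack
    ("SANAD", "Statistician", 6),           # independent quality control check
    ("MENDS", "Statistician", 1.5),         # requesting access
    ("MENDS", "Information systems", 6),    # setting up access to SQL server tables
    ("MENDS", "Statistician", 26),          # first stage of anonymisation
    ("MENDS", "Statistician", 1),           # assembling the final data pack
    ("MENDS", "Statistician", 5),           # independent quality control check
]


def hours_by_role(tasks):
    """Sum hours per role within each trial."""
    totals = defaultdict(lambda: defaultdict(float))
    for trial, role, hours in tasks:
        totals[trial][role] += hours
    return {trial: dict(roles) for trial, roles in totals.items()}


def overall_hours(tasks):
    """Sum hours across all roles within each trial."""
    return {trial: sum(roles.values()) for trial, roles in hours_by_role(tasks).items()}


if __name__ == "__main__":
    for trial, roles in hours_by_role(TASKS).items():
        print(trial, roles, "overall:", overall_hours(TASKS)[trial])
```

In both trials the statistician accounts for most of the time (42 of 50 hours for SANAD, 33.5 of 39.5 hours for MENDS), with the first stage of de-identification as the largest single task.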
Estimated cost of data pack preparation
| Staff role^a | Approximate salary (£) | SANAD cost (£) | MENDS cost (£) |
|---|---|---|---|
| Senior statistician | 56,482 | 911 | 717 |
| Junior statistician | 31,342 | 506 | 398 |
| Senior IS staff | 44,620 | 51 | 103 |
| Junior IS staff | 37,394 | 43 | 86 |
| Senior trial coordinator | 44,620 | 86 | 0 |
| Junior trial coordinator | 31,342 | 60 | 0 |
| Archiving | 34,233 | 93 | 93 |
| Total directly incurred staff (£) | | 1750 | 1397 |
| Estimate of full economic cost | | 1435 | 1143 |
| Total project cost | | 3185 | 2540 |
IS information systems. ^a Assumed a 50:50 split in contribution between senior and junior staff where applicable
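The totals in the cost table follow from simple addition: the directly incurred staff cost is the sum of the staff rows, and the total project cost adds the estimate of full economic cost to it. The Python sketch below re-derives those totals from the table values. The full economic cost figures are taken as given, since this record does not state how they were calculated, and the variable and function names are illustrative only.

```python
# Staff cost rows from the table above: role -> (SANAD cost, MENDS cost), in GBP.
STAFF_COSTS = {
    "Senior statistician": (911, 717),
    "Junior statistician": (506, 398),
    "Senior IS staff": (51, 103),
    "Junior IS staff": (43, 86),
    "Senior trial coordinator": (86, 0),
    "Junior trial coordinator": (60, 0),
    "Archiving": (93, 93),
}

# Taken directly from the table; the derivation is not given in this record.
FULL_ECONOMIC_COST = {"SANAD": 1435, "MENDS": 1143}

TRIALS = ("SANAD", "MENDS")


def directly_incurred(staff_costs):
    """Sum the staff cost rows for each trial."""
    return {
        trial: sum(costs[i] for costs in staff_costs.values())
        for i, trial in enumerate(TRIALS)
    }


def total_project_cost(staff_costs, full_economic_cost):
    """Directly incurred staff cost plus the full economic cost estimate."""
    direct = directly_incurred(staff_costs)
    return {trial: direct[trial] + full_economic_cost[trial] for trial in TRIALS}


if __name__ == "__main__":
    print(directly_incurred(STAFF_COSTS))
    print(total_project_cost(STAFF_COSTS, FULL_ECONOMIC_COST))
```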