Rebecca Sudlow, Janice Branson, Tim Friede, David Morgan, Caroline Whately-Smith.
Abstract
BACKGROUND: Access to patient level datasets from clinical trial sponsors continues to be an important topic for the Pharmaceutical Industry as well as academic institutions and researchers. How to make access to patient level data actually happen raises many questions from the perspective of the researcher.
Keywords: Clinical trial data; Data sharing; Data transparency; Patient level data
Year: 2016 PMID: 27410386 PMCID: PMC4943504 DOI: 10.1186/s12874-016-0171-x
Source DB: PubMed Journal: BMC Med Res Methodol ISSN: 1471-2288 Impact factor: 4.615
Fig. 1 Key steps in the request and access of patient level data. Each step is referenced and described in more detail in the text
Summary of the EFPIA/PhRMA Principles for Responsible Clinical Trial Data Sharing
| Principle | Commitment |
|---|---|
| 1. Enhancing data sharing with researchers | On request from qualified medical and scientific researchers, companies will provide protocols, reports and patient-level clinical trial data for medicines that have been approved in both the EU and US. |
| | Each company will establish a scientific review board that will include scientists and/or healthcare professionals who are not employees of the company. |
| | Access will be consistent with patient informed consent and safeguarding privacy. |
| 2. Enhancing public access to clinical study information | Companies will make available synopses of CSRs submitted to US and European regulatory authorities from 1 Jan 2014. |
| 3. Sharing results with patients who participate in clinical trials | Companies will work with regulators to adopt mechanisms for providing a factual summary of clinical trial results and make the summaries available to research participants. |
| 4. Certifying procedures for sharing clinical trial information | Companies will certify on a publicly available web site that they have established policies and procedures to implement these data sharing commitments. |
| 5. Reaffirming commitments to publish clinical trial results | Results from all phase 3 clinical trials and any clinical trial results of significant medical importance should be submitted for publication, whether positive or negative, including results from discontinued development programs. |
Typical End to End Process for requesting Patient Level Data (PLD)
| Step | Considerations |
|---|---|
| 1. Develop PLD research proposal | What studies have been conducted? |
| | Who is the Data Holder? |
| | Do they have a data access policy and is this study available? |
| | Could access to the CSR help inform the research proposal development? |
| | Does the Data Holder have a specific template for the research proposal? |
| 2. Submission and review of research proposal | Who will review it? Independent review panel or within the Data Holder organisation? |
| | Data Holder’s expectations for sharing data |
| | Researcher’s expectations for accessing data |
| 3. Dataset preparation for external sharing | How will the data be shared? (open, secure system or other?) |
| | Data Sharing Agreement review and sign-off |
| | Data de-identification principles |
| 4. Analysis | Data package contents |
| | Opportunities to ask questions |
| 5. Reporting / Publication | Data Holder’s expectations regarding publication of the results. |
| | Referencing the data source |
| | Any commitments to share the results/manuscript prior to publication? |
Key components of a research request form
| Proposal component | Additional notes |
|---|---|
| Name and affiliation of the lead researcher | |
| Statement of the Scientific Goals of the Research | |
| Synopsis of Research Proposal | Lay version may also be needed |
| Statistical Analysis Plan (SAP) | Including endpoints to be evaluated, analytic methods to be used and methods to control for bias in post-hoc or data-driven analyses. Should also state whether specific populations are to be analysed, e.g. effects of treatment in special patient groups. |
| Studies for which data is requested | Use unique study ID if known and database version required (if the study has been conducted over a long period of time and has been analysed at different follow-up time points) |
| | Include all studies to be combined, including those obtained from other sources (e.g. studies completed by your own institution). |
| Name and affiliation of other members of the research team | The team should include a professionally qualified statistician, or confirmation should be provided that the proposed research team has the relevant statistical expertise to perform and take responsibility for all statistical analyses. |
| | Some Data Holders require CVs or other information as reference. |
| Conflicts of interest | Both real and potential |
| Source of Funding | This is the funding source for the researcher. |
| | Currently Data Holders do not require payment for the preparation and access to the patient level datasets. |
Overview of the different patient level data access models and their pros and cons from the Researcher’s and Data Holder’s perspectives
| Pros | Cons |
|---|---|
| OPEN access: | |
| Researchers: | Researchers: |
| No knowledge of who else is accessing the data. Potential for overlapping or repeated research questions arising from the same dataset, increasing the chance of errors and inflating the type 1 error rate. | |
| Data Holder: | |
| No traceability regarding who has accessed the data, whether they are qualified in statistical analysis and how they have consequently used the data. | |
| As pre-specification of analysis is not needed nor monitored, there is a risk for data dredging and over-interpretation of findings. | |
| High internal costs if all trials are required to be anonymized and posted prospectively some of which may never be accessed. | |
| Resource-intensive. | |
| Direct Sharing | |
| Researcher: | Researcher: |
| Easier to merge and combine data from a variety of sources. | Potential impact on research credibility if collaboration by Researcher is seen as not truly independent from the Data Holder. |
| Increased opportunity to address data and analysis questions with the Data Holders’ study personnel. | |
| Data Holder: | Data Holder: |
| Controlled Access | |
| Researcher: | Researcher: |
| Data Holder: | Commitments to publish/share their results with the Data Holder may be required. |
| Third Party Analysis | |
| Researcher: | Researcher: |
| Able to focus his/her time on interpreting the analyses rather than on data manipulation and programming. | Analysis work may be convoluted, as a lot of interaction will be needed between the third party and the researcher. |
| Data Holder: | Data Holder: |
| Analysis not considered to be independent as Data Holder owns the contract with the third party. | |
| Relatively low resource intensity, but cost could be high. | |
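The type 1 error concern listed under open access can be illustrated with a small calculation. The sketch below is not taken from the paper: it assumes that each analysis is an independent test at the same significance level, which is a simplification, since repeated analyses of one dataset are usually correlated. The function name `familywise_error_rate` and the 0.05 level are illustrative choices only.

```python
def familywise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests analyses.

    Assumes the analyses are independent tests, each run at level alpha.
    This is a simplifying assumption; correlated analyses of the same
    dataset will generally give a different (often smaller) figure.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Illustrative numbers of unplanned analyses run on one shared dataset.
    for k in (1, 5, 10, 20):
        rate = familywise_error_rate(0.05, k)
        print(f"{k:>2} analyses at alpha=0.05: {rate:.3f}")
```

Under these assumptions, ten uncoordinated analyses at the 5 % level already carry roughly a 40 % chance of at least one spurious finding, which is one reason pre-specification of analyses is listed as a consideration in the access models above.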
Possible Data Package Contents
| Item | Further details |
|---|---|
| Anonymized Raw datasets | Dataset content reflects the information as recorded on the case report form. These are usually split into a number of raw datasets reflecting the different types of data that have been collected, e.g. adverse events, laboratory assessments, disease specific measurements. Only the datasets required for the research may be provided by some Data Holders. |
| Anonymized Analysis-ready datasets | These datasets will have been derived from the raw datasets and will reflect the additional programming that needs to be applied for the data to be analysis ready. This could be the synthesis of different datapoints to create a single efficacy assessment (e.g. ACR score in RA or a time to disease progression) and could also include derivations and assumptions as a result of missing data. They will also identify the original analysis populations (e.g. ITT, Per Protocol) that were defined in the Statistical Analysis Plan. Researchers should understand the differences between these populations so they can be used appropriately. |
| Protocol (including any amendments) | The protocol describes the clinical study design, assessment schedule and planned statistical analysis in detail. Small amounts of text may be subject to redaction if they are considered to be commercially confidential. |
| Annotated Case Report Form | This document provides the link between the data points that were recorded by the investigator onto the paper or electronic case report form and the variable name and dataset location where they are held within the database. This is a key document to help the researcher navigate the database. |
| Statistical Analysis Plan | This document is written by a statistician prior to the study data being available for analysis. It is a comprehensive outline of the statistical endpoints to be derived and analysis methodology to be used. The 1 to 2 pages of statistical detail from the protocol are expanded into a document that can be 10–20 pages in length. |
| Dataset specifications | This (alongside the Statistical Analysis Plan) will provide a map of the dataset structure and data variable locations |
| Clinical Study Report | The CSR will be subject to some redactions in order to preserve patients’ anonymity and in some cases to protect commercially confidential information. The patient level data listings will not be included. |
| Optional: SAS Programs | In situations where the analysis ready datasets cannot be found, sponsors may choose to share the SAS programs that were used to create the derived datasets and analysis results. Note that copies of SAS programs may not be executable on other systems without some editing. |
| | In certain cases the SAS code outlining the statistical models used may be shared in order to help the researcher navigate the data and original modelling approach. |
| Optional: SAS Logs | Limited value as the original SAS program may not be executable on other computer systems or outside of the Data Holder’s standard SAS macro calls. |
Different scenarios for communication between Data Holder and Researcher
| Communication between data holder and researcher | Purpose | How to implement / alternatives | Consequences if not possible |
|---|---|---|---|
| While researcher is putting together the research proposal | Researcher fully understanding which data have been collected, study design etc. | Possibility for Researchers to raise questions directly regarding data collected on the data sharing company sites. | Higher number of research proposals needing to be rejected or resubmitted following initial review. |
| During research to clarify understanding of study’s SAP, dataset specifications | Enable the Researcher to understand data and the analysis already conducted. | Ensure complete documentation is provided by data holder to eliminate this as much as possible. | Lack of knowledge of data collected potentially leading to inappropriate analysis. |
| Sharing the completed analysis, interpretation and proposed publications | Sponsor is aware prior to publication of any difference in interpretation of results. | Sponsor requests to be informed in advance of publication. The alternative is to take a risk and deal with receiving information in parallel to it being in the public domain. | Differing results based on analyses of anonymized data could lead to different interpretation and raise either justifiable or unnecessary concerns in scientific and public domains. |