Donna R Rivera1, Henry J Henk2, Elizabeth Garrett-Mayer3, Jennifer B Christian4, Andrew J Belli5, Suanna S Bruinooge3, Janet L Espirito6, Connor Sweetnam7, Monika A Izano7, Yanina Natanzon8, Nicholas J Robert6, Mark S Walker8, Aaron B Cohen9, Marley Boyd6, Lindsey Enewold10, Eric Hansen5, Rebecca Honnold11, Lawrence Kushi12, Pallavi S Mishra Kalyani1, Ruth Pe Benito11, Lori C Sakoda12, Elad Sharon10, Olga Tymejczyk9, Emily Valice12, Joseph Wagner4, Laura Lasiter13, Jeff D Allen13.
Abstract
The purpose of this study was to evaluate the potential collective opportunities and challenges of transforming real-world data (RWD) to real-world evidence for clinical effectiveness by focusing on aligning analytic definitions of oncology end points. Patients treated with a qualifying therapy for advanced non-small cell lung cancer in the frontline setting meeting broad eligibility criteria were included to reflect the real-world population. Although a trend toward improved outcomes in patients receiving PD-(L)1 therapy over standard chemotherapy was observed in RWD analyses, the magnitude and consistency of treatment effect were more heterogeneous than previously observed in controlled clinical trials. The study design and analysis process highlighted the identification of pertinent methodological issues and potential innovative approaches that could inform the development of high-quality RWD studies.
Year: 2021 PMID: 34664259 PMCID: PMC9298732 DOI: 10.1002/cpt.2453
Source DB: PubMed Journal: Clin Pharmacol Ther ISSN: 0009-9236 Impact factor: 6.903
Harmonized definitions employed in the pilot project
| Term | Harmonized definition | Decision impact | |
|---|---|---|---|
| Population | |||
| Advanced NSCLC | All data sources had the ability to identify patients diagnosed with NSCLC. Evidence of advanced disease was defined as either stage IIIB, IIIC, or IV NSCLC at initial diagnosis or early‐stage (stages I, II, and IIIA) NSCLC with a recurrence or progression. | Including patients diagnosed with early‐stage (stages I, II, and IIIA) NSCLC who had a recurrence or progression to advanced or metastatic status improved sample size for analysis but created a less homogeneous population of both newly diagnosed and previously treated patients (vs. only patients with newly diagnosed lung cancer). | |
| Frontline | Patients were required to have no evidence of treatment in the 180 days before the date of diagnosis and evidence of an eligible treatment within 120 days after diagnosis. | Patients with delays to treatment initiation would not be included. | |
| Histologic subtype | Histology was not required for inclusion. | Histology was not universally collected, although subanalysis was feasible. Results reflected overall aNSCLC trends but were less specific to a histology subtype. | |
| Eligibility criteria | The study population was not limited to those meeting eligibility criteria common for inclusion in a clinical trial (e.g., kidney function, performance status). | Data on organ function and performance status at or prior to treatment initiation were often unavailable or difficult to ascertain in RWD sources, although subanalysis was feasible. The population may be less like the RCT population(s). | |
| Regimens | |||
| Drugs | The following medications were included representing traditional chemotherapy or IO given after the date of diagnosis: cisplatin/carboplatin, oxaliplatin, or nedaplatin with pemetrexed, paclitaxel, nab‐paclitaxel, or gemcitabine; atezolizumab, nivolumab, or pembrolizumab. Oral agents were not included. | Regimens are subject to misclassification, particularly in the doublet chemotherapy cohort. Patients starting on a PD‐(L)1 therapy should not be ALK or EGFR positive. | |
| Frontline (first line regimen) assignment | Frontline regimen was defined as all administered agents received within 30 days following the day of first infusion. | Misclassification or omission of patients with delays to full treatment initiation in the first 30 days was possible. This would not impact the PD‐(L)1 monotherapy cohort, as additional therapy would not be expected. | |
| End points | |||
| rwOS | Length of time from the date of treatment initiation to the date of death, end of follow‐up, or end of study. | Date of initiation may bias toward slightly shorter event times compared with clinical trials, which can use date of randomization or enrollment instead. Missing events, on average, tend to make survival outcomes look better than in trials, especially if missingness is not independent of timing of death events. | |
| rwTTNT | Length of time from the date of treatment initiation to the date of the next systemic treatment. When subsequent treatment is not received (e.g., continuing current treatment or disenrollment not due to confirmed death), patients were censored at their last known activity. | Missingness for subsequent treatment, including receiving treatment outside the system of capture, is a limitation. This measure is also affected by the clinical guideline recommendations for administration of treatment cycles, which can vary by regimen and have to be evaluated for comparability prior to the study to ensure appropriate interpretation. | |
| rwTTD | Length of time from the date of treatment initiation to the date of treatment discontinuation. The study treatment discontinuation date was defined as the last administration or noncancelled order of a drug contained within the regimen. Discontinuation was defined as having a subsequent systemic therapy after the initial regimen, having a gap of more than 120 days with no systemic therapy following the last administration, or having a date of death while on the initial regimen. Patients without a discontinuation were censored at the end of follow‐up. | At the patient level, TTD is associated with PFS across therapeutic classes. | |
| rwTTP | Progression was omitted because claims‐based algorithms are inadequate and, among the EHRs, progression events are not consistently captured in structured data. Unlike in clinical trials, there is no uniform criterion (e.g., RECIST) in the off‐protocol setting for determination of disease progression. | As TTP and PFS are accepted outcomes in clinical trials, comparisons of these outcomes to randomized trials of similar regimens were limited by the data available. | |
| Analysis | |||
| Estimation | Kaplan‐Meier estimation was used to describe distribution of end points for each dataset for each regimen, and for estimating key time points (e.g., 6‐month, 12‐month event rates) with confidence intervals. | ||
| Comparisons | Proportional hazards regression, adjusting for prognostic factors available to all groups. | ||
| Additional analyses | For OS, censor all events at 24 months and re‐estimate HRs for treatment effect, adjusted for other prognostic variables. | ||
ALK, anaplastic lymphoma kinase; aNSCLC, advanced non‐small cell lung cancer; EHR, electronic health record; HR, hazard ratio; IO, immuno‐oncology; NSCLC, non‐small cell lung cancer; OS, overall survival; PFS, progression‐free survival; RCT, randomized controlled trial; RECIST, Response Evaluation Criteria in Solid Tumors; rwOS, real‐world overall survival; rwTTD, real‐world time to treatment discontinuation; rwTTNT, real‐world time to next treatment line; rwTTP, real‐world time to treatment progression.
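The rwTTD rule in the table can be read as a small decision procedure. The following is a minimal sketch of one possible reading, not the code used by the participating groups. The `PatientRecord` type and its field names are hypothetical, and two interpretive choices are assumptions: death "while on the initial regimen" is taken to mean death within 120 days of the last administration, and the event date is always the last administration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence, Tuple

GAP_DAYS = 120  # gap threshold stated in the harmonized rwTTD definition


@dataclass
class PatientRecord:
    """Hypothetical per-patient input; field names are illustrative only."""
    regimen_admin_dates: Sequence[date]  # administrations or noncancelled orders of frontline drugs
    next_therapy_start: Optional[date]   # start of a subsequent systemic therapy, if any
    death_date: Optional[date]           # confirmed death, if any
    follow_up_end: date                  # last date on which the source observes this patient


def rw_ttd(p: PatientRecord) -> Tuple[int, bool]:
    """Return (days since treatment initiation, discontinuation event flag)."""
    start = min(p.regimen_admin_dates)
    last_admin = max(p.regimen_admin_dates)

    # Rule 1: a subsequent systemic therapy was recorded.
    switched = p.next_therapy_start is not None
    # Rule 2: more than 120 observed days with no systemic therapy.
    long_gap = (p.follow_up_end - last_admin).days > GAP_DAYS
    # Rule 3 (assumption): death within the gap window counts as death on regimen.
    died_on_regimen = (
        p.death_date is not None
        and (p.death_date - last_admin).days <= GAP_DAYS
    )

    if switched or long_gap or died_on_regimen:
        # Discontinuation date is the last administration of the regimen.
        return (last_admin - start).days, True
    # No discontinuation observed: censor at end of follow-up.
    return (p.follow_up_end - start).days, False
```

Writing the rule down in this form makes the ambiguous cases visible (for example, which date to use when death is the discontinuation event), which is the kind of detail a shared protocol would need to settle.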
Figure 1. Cohort construction, including data from all data sources. aNSCLC, advanced non‐small cell lung cancer; EHR, electronic health record; PDC, platinum doublet chemotherapy; PD‐(L)1, programmed cell death protein 1/programmed death‐ligand 1; RWD, real‐world data.
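The frontline eligibility windows and the 30-day regimen assignment from the harmonized definitions can be sketched as follows. This is an illustrative reading only: the function name and input layout are hypothetical, the administration list is assumed to contain all systemic anticancer therapy for the patient, and the "first infusion" is assumed to be the first eligible agent given within 120 days after diagnosis.

```python
from datetime import date, timedelta
from typing import FrozenSet, List, Optional, Tuple

# Agents named in the harmonized drug definition (oral agents excluded).
ELIGIBLE_AGENTS = frozenset({
    "cisplatin", "carboplatin", "oxaliplatin", "nedaplatin",
    "pemetrexed", "paclitaxel", "nab-paclitaxel", "gemcitabine",
    "atezolizumab", "nivolumab", "pembrolizumab",
})


def frontline_regimen(
    diagnosis: date,
    administrations: List[Tuple[date, str]],
) -> Optional[FrozenSet[str]]:
    """Return the frontline regimen as a set of agents, or None if ineligible."""
    # Exclude patients with any treatment in the 180 days before diagnosis.
    lookback_start = diagnosis - timedelta(days=180)
    if any(lookback_start <= d < diagnosis for d, _ in administrations):
        return None

    # Require an eligible agent within 120 days after diagnosis.
    window_end = diagnosis + timedelta(days=120)
    eligible_dates = [
        d for d, agent in administrations
        if diagnosis <= d <= window_end and agent in ELIGIBLE_AGENTS
    ]
    if not eligible_dates:
        return None

    # Regimen = all agents received within 30 days of the first infusion.
    first_infusion = min(eligible_dates)
    regimen_end = first_infusion + timedelta(days=30)
    return frozenset(
        agent for d, agent in administrations
        if first_infusion <= d <= regimen_end
    )
```

Under this sketch, a patient who starts carboplatin plus pemetrexed and adds pembrolizumab on day 36 is assigned to the chemotherapy doublet, which illustrates the misclassification risk noted in the table for patients with delays to full treatment initiation.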
Figure 2. Characteristics of treatment groups within participating data sources. Numbers in table represent percent of patients in each category. Coloring ranges from bright green (0%) to bright orange (100%) to highlight areas of differences across data sources for the same treatment.
Figure 3. Kaplan‐Meier curves for overall survival (OS) for (a) PDC and (b) PD‐(L)1; hazard ratios with 95% confidence intervals (vertical bars) comparing PD‐(L)1 to PDC for (c) OS, (d) time to treatment discontinuation (TTD), and (e) time to next treatment (TTNT). PDC, platinum doublet chemotherapy; rwOS, real‐world overall survival; rwTTD, real‐world time to treatment discontinuation; rwTTNT, real‐world time to next treatment.
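For readers unfamiliar with the estimator behind these curves, the following is a minimal pure-Python sketch of the Kaplan‐Meier product-limit estimate, together with administrative censoring at a fixed horizon (such as the 24-month censoring described for the additional OS analysis). The function names are illustrative, confidence intervals are omitted, and this is not the software used in the pilot.

```python
from typing import List, Sequence, Tuple


def kaplan_meier(times: Sequence[float], events: Sequence[bool]) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all subjects sharing this time (events and censorings).
        while i < len(data) and data[i][0] == t:
            n_events += 1 if data[i][1] else 0
            n_removed += 1
            i += 1
        if n_events:
            surv *= 1.0 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_removed
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Step-function lookup of S(t), e.g. at 6 or 12 months."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


def censor_at(times: Sequence[float], events: Sequence[bool], horizon: float):
    """Administratively censor follow-up at `horizon` (e.g. 24 months)."""
    new_times = [min(t, horizon) for t in times]
    new_events = [bool(e) and t <= horizon for t, e in zip(times, events)]
    return new_times, new_events
```

For example, with times `[2, 3, 3, 5, 8]` and event flags `[True, True, False, True, False]`, the estimate steps to 0.8, 0.6, and 0.3 at times 2, 3, and 5. Events with missing death dates would enter as censored observations, which is how incomplete mortality capture can make survival look better than it is.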
Recommendations from the RWE Pilot 2.0 for developing a common RWD protocol to achieve consistency and increase reproducibility using a format that minimizes ambiguity or subjectivity in interpretation of definitions or analysis approaches
| Recommendation/Sub‐recommendation | Description |
|---|---|
| Defining the eligibility criteria | Shared variables that are commonly available across data sources should be used for defining patient inclusion in the study. In the RWE Pilot 2.0, cancer diagnosis (including stage and cancer type) and treatment receipt (platinum doublet and immunotherapy) were the primary criteria. Given that the goal was to make real‐world inferences, the eligibility criteria were based on the population of interest for generalizability. For example, if the goal was to assess treatment differences in patients with advanced age, the age range would be limited to adequately address that question in this study. |
| Collaborative common RWD protocol | The collaborative protocol should establish a list of core common required data elements; describe common variables that are available in a variety of formats and require translation (e.g., age groups; gender and race categories); include definitions (e.g., exposure and end points); and specify standardized reporting formats agreed on prior to study initiation. Include a standard reporting template complete with table and figure drafts to create understanding around the intended results to be generated. |
| Define core common key data elements | Establishing a core set of data elements with standard definitions enables greater comparability. Variables may have varying levels of availability in RWD, and their relevance for inclusion as a required variable depends on the relation to the study question. Structured data, such as age and sex, are minimal common data elements that are typically readily available across independent data sources and requisite for analysis. However, other data elements demand thoughtful consideration and transparency, such as (i) variables available in different formats (e.g., PD‐L1 biomarker +/− indicator vs. expression), (ii) variables requiring derivation (e.g., ICD codes vs. laboratory values in the definition of reduced organ function), or (iii) variables requiring extraction from unstructured data (e.g., status of advanced at initial diagnosis vs. progression after initial diagnosis). |
| Align clinical variables and laboratory values | Key clinical and analytic variables should be identified and aligned as needed, and it should be determined whether strict variable definitions are required for inclusion criteria or if variations are acceptable. Variance in measurement can affect subsequent outcome calculations. For example, kidney function or genomic testing may be extracted from structured or unstructured data, where a source could have data ranging from the actual lab values to markers of function (e.g., laboratory tests for organ function, CrCl, ICD‐9/10 codes indicating dysfunction) or from indicators of testing to specific testing results (e.g., PD‐L1 test completed to expression percentage). In areas where variation is accepted, sensitivity analyses that examine the variance are useful to guide appropriate interpretation. Implement a well‐developed common protocol for all RWD studies. |
| Data quality assessment | A template for quantitative evaluation of data distributions, quality, and missingness may improve understanding of data availability and support interpretation. However, careful evaluation by a representative team that has deep knowledge of the data curation, extraction, and provenance is necessary. The use of quality indicators for data or consensus on problematic missingness for key covariates may inform the study design. |
| End point selection | Commonly used end points in clinical trials may not be practical or replicable in RWD. As an example, rwTTP and rwPFS were not included in Pilot 2.0. Measuring rwTTP and rwPFS is challenging: claims‐based algorithms are limited, relying on proxy measures for progression, and consensus definitions among EHR data sources were prohibitively difficult to establish because of differences in capture and reporting. While uniform criteria (e.g., RECIST) allow protocol‐directed establishment of progression in clinical trials, progression outcomes are not consistently captured in RWD because there is currently no comparable capability in the off‐protocol setting for determination of disease progression. Additional end points, rwTTNT and rwTTD, are more readily accessible in RWD. While survival outcomes (rwOS) are easier to define and measure in most RWD sources, sources are often missing mortality information on a large fraction of patients, which affects estimation of rwOS parameters (e.g., median rwOS) and substantially limits interpretation, while incurring additional biases due to missing data. Linking to additional data sources that include more complete mortality data could improve end point ascertainment and should be done if feasible to make estimates based on rwOS more accurate and more comparable with other studies, such as clinical trials. |
| Defining event times and censoring | When evaluating end points, there is a need to merge clinical applicability with analytical feasibility. For example, in defining rwTTD, groups had to align on the appropriate time period without treatment receipt that would be considered a discontinuation. An additional step in the process would be evaluating the potential to share software code between groups for replicability and additional validation. |
| Statistical analysis plan | The statistical analysis plan (SAP) must be written comprehensively, with sufficient detail to reduce the risk of deviations in the methods used and in the characterization of variables in models or tests. In conjunction with the SAP, it is instrumental that the protocol include table and figure templates to ensure that all groups have the same understanding of the intended results to be generated and of the models required, reducing variance in interpretation. Developing tables within the shared research protocol allowed groups to consider subtle differences in modeling that would not have surfaced had the tables not been developed in advance. |
| Addressing missing data and potential biases | Approaches for quantifying and accounting for missing data in analyses should be considered in the protocol to maintain study integrity while minimizing biases in the interpretation of results. |
| Assess sample size | Because the number of patients in RWD sources is often based on retrospective data availability, study planning for RWD studies may not consider sample size and the power to detect clinically relevant effects. Even so, it is important to ensure that the sample size is sufficiently large to be able to derive meaningful inferences. If the study is underpowered, modeling may be infeasible or hypothesis tests may tend to yield “insignificant” findings with wide confidence intervals, leading to potentially misleading results. In contrast, if the RWD source provides a very large sample, the study may be overpowered and there will be a tendency to over‐interpret statistically significant findings. Statistically significant |
| Cautious inference | Even with careful attention to adjustment for population differences, selection bias, unmeasured confounding, and cohort effects are inherent and may not be fully accounted for in a study; these limitations of RWD need to be appropriately addressed in the interpretation of results, inferences, and conclusions of RWE studies. In our study, while there were no obvious differences in the patient characteristics included in Pilot 2.0 across treatment cohorts, the clinical standard of care was likely to differ for the PDC population before and after FDA approval for PD‐(L)1 therapies. Similarly, comparisons of results from RWE studies to results from clinical trials need to be cautious given underlying differences in patients treated in clinical trials vs. those available in RWD sources; these differences are expected due to limited adult clinical trial participation in patients with cancer (3–5%) and strict trial eligibility criteria. The ability to expand eligibility criteria to better understand use in a real‐world population is a strength of RWD but is, in turn, a limitation in comparative efforts due to the aforementioned selection bias. |
| Diverse multidisciplinary research team | Perhaps the most pivotal part of the process in an RWD study is developing a multidisciplinary team, including clinicians, biostatisticians, epidemiologists, and data scientists, to ensure that studies are clinically relevant with appropriate methods utilized to optimally account for potential biases arising from the observational nature of RWD. Teams are encouraged to include patient stakeholders and diverse representation in the conversation, as this is most effectively accomplished as a team science effort. |
CrCl, creatinine clearance; EHR, electronic health record; ICD, International Classification of Disease; RECIST, Response Evaluation Criteria in Solid Tumors; RWD, real‐world data; RWE, real‐world evidence; rwOS, real‐world overall survival; rwPFS, real‐world progression‐free survival; rwTTD, real‐world time to treatment discontinuation; rwTTNT, real‐world time to next treatment line; rwTTP, real‐world time to treatment progression.
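The sample size recommendation can be made concrete with a rough power calculation. The sketch below uses Schoenfeld's approximation for the number of events needed by a two-sided log-rank or proportional hazards comparison. It is offered only as an illustration of the kind of check a protocol could include; it is not a method described in the pilot, and the function name and default values are assumptions.

```python
import math
from statistics import NormalDist


def required_events(
    hazard_ratio: float,
    alpha: float = 0.05,      # two-sided type I error (assumed default)
    power: float = 0.80,      # target power (assumed default)
    allocation: float = 0.5,  # fraction of patients in one treatment group
) -> int:
    """Approximate number of events needed to detect `hazard_ratio`.

    Schoenfeld's approximation:
        d = (z_{1-alpha/2} + z_{power})^2 / (p * (1 - p) * ln(HR)^2)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    d = (z_alpha + z_beta) ** 2 / (
        allocation * (1 - allocation) * math.log(hazard_ratio) ** 2
    )
    return math.ceil(d)
```

With equal-sized groups, detecting a hazard ratio of 0.7 at 80% power needs roughly 247 events under this approximation. Because the calculation counts events rather than patients, a source with incomplete mortality capture may be underpowered for rwOS even when its cohort looks large.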