Khader Shameer1, Youyi Zhang1, Andrzej Prokop2, Sreenath Nampally1, Imran Khan A N3, Jim Weatherall3, Renee Bailey Iacona4, Faisal M Khan1. 1. Data Science & Artificial Intelligence, BioPharmaceuticals R&D, AstraZeneca, Gaithersburg, MD. 2. Oncology Biometrics, Oncology R&D, AstraZeneca, Warsaw, Poland. 3. Data Science & Artificial Intelligence, BioPharmaceuticals R&D, AstraZeneca, Macclesfield, United Kingdom. 4. Oncology Biometrics, Oncology R&D, AstraZeneca, Gaithersburg, MD.
Abstract
PURPOSE: Overall survival (OS) is the gold standard end point for establishing clinical benefits in phase III oncology trials. However, these trials are associated with low success rates, largely driven by failure to meet the primary end point. Surrogate end points such as progression-free survival (PFS) are increasingly being used as indicators of biologic drug activity and to inform early go/no-go decisions in oncology drug development. We developed OSPred, a digital health aid that combines actual clinical data and machine intelligence approaches to visualize correlation trends between early (PFS-based) and late (OS) end points and provide support for shared decision making in the drug development pipeline. METHODS: OSPred is based on a trial-level data set of 81 reports (35 anticancer drugs with various mechanisms of action; 156 observations) in non-small-cell lung cancer (NSCLC). OSPred was developed using R Shiny, with packages ggplot2, metafor, boot, dplyr, and mvtnorm, to analyze and visualize correlation results and predict OS hazard ratio (HR OS) on the basis of user-inputted PFS-based data, namely, HR PFS, or the odds ratio of PFS at 4 (OR PFS4) or 6 (OR PFS6) months. RESULTS: The three main features of the tool are as follows: prediction of HR OS on the basis of user-inputted early end point values; visualization of comparisons of the user's investigational drug with other drugs in the NSCLC setting, including by specific MoA; and creation of a probability density chart, providing point prediction and CIs for HR OS. A working version of the tool for download is linked. CONCLUSION: The OSPred tool offers interactive visualization of clinical trial end point correlations with reference to a large pool of historical NSCLC studies. Its focused capability has the potential to digitally transform and accelerate data-driven decision making as part of the drug development process.
The costs associated with bringing new therapeutic agents to market are high, ranging from hundreds of millions to billions of US dollars (USD).[1,2] Anticancer agents are associated with the highest costs of all therapeutic areas, with a median research and development cost of $780 million USD for bringing a new anticancer drug to market (2018 estimates, on the basis of a sample of 10 drugs).[2] A substantial proportion of these costs can be attributed to the conduct of pivotal phase III trials; estimates showed a mean cost of $45.4 million USD per trial for anticancer agents approved from 2015 to 2016.[3] The design and type of end point chosen for pivotal phase III trials also have an important impact on cost, with trials using a clinical end point costing $64.7 million USD versus $24.0 million USD for those using a surrogate end point.[3] The failure rate of pivotal phase III trials is high in all therapeutic areas,[4] especially for anticancer agents.[5] An analysis by Wong et al[5] found that the probability of success of anticancer agents proceeding from phase I to approval was 3.4% overall, with the probability of success for proceeding from phase I to II, phase II to III, and phase III to approval being 57.6%, 32.7%, and 35.5%, respectively. 
The low success rate of oncology phase III trials is largely driven by a failure to meet the primary efficacy end point, which accounts for approximately 50% of failures.[6-9] Given the high human and financial costs associated with these trial failures, there is a clear need to better predict the outcomes of phase III trials of anticancer drugs to inform early go/no-go decisions by clinical teams.
In trials of novel anticancer agents, overall survival (OS) is considered the gold standard end point for establishing clinical benefits.[10,11] However, there are several challenges with using OS as an end point, including the need for large study populations (eg, to detect smaller treatment effects or to see differential results more quickly) and the long duration of patient follow-up required for this end point to mature.[11,12] In addition, OS data can potentially be confounded by crossover, subsequent therapies, and noncancer death.[11,12] Progression-free survival (PFS) and objective response rate (ORR) are increasingly being used as surrogate end points for OS, for preliminary anticancer assessments, and to inform early go/no-go decisions in oncology drug development (eg, the decision to initiate phase III trials), as they can permit the use of smaller patient cohorts and often mature more quickly compared with OS.[11,13] Moreover, unlike OS, study treatment crossover or subsequent lines of therapy do not have a notable impact on these end points.[11] In addition to being increasingly used as primary outcome measures, PFS and ORR are also often used as intermediate end points (assessed at interim analyses) in phase III clinical trials, to determine whether to stop the trial for futility or continue the trial for assessment of OS.[14]
The development of data-driven tools to augment shared decision making has become increasingly common in the medical field.[15] In recent years, there have been calls for their more widespread use to incorporate the patient perspective and increase trial participation[16] and to address issues associated with the conduct of randomized clinical trials during pandemics.[17] Several tools are now available to support decision making in a range of clinical settings across different cancer types.[18-23] On the basis of large data sets collected to answer specific questions, data-driven tools rely on machine learning, which consists of training and evaluating algorithms to predict outcomes. These tools often use innovative methods of data visualization to enhance their usability and accessibility for clinicians.
Curating large volumes of data from historical oncology trials to model the correlation trends between early and late end points could provide a useful decision aid for clinical teams by enabling prediction of late clinical trial outcomes, such as OS, from early surrogate end points, in future trials. This could reduce the human cost of clinical trial failure and help to derisk financial investment in novel anticancer therapies, by forecasting OS outcomes to inform early decisions by data monitoring committees on whether to initiate a phase III trial or stop it for futility. It could also help to build confidence in early end points, supporting their recognition by regulatory bodies such as the US Food and Drug Administration (FDA) and the European Medicines Agency for accelerated approval of novel therapies. In addition, it could reduce the risk of having to withdraw new drugs, or specific indications for new drugs, after regulatory approvals on the basis of surrogate end points. For example, indications for nivolumab in patients with small-cell lung cancer (SCLC) and pembrolizumab in patients with metastatic SCLC have recently been withdrawn from the US market.
This occurred after the drugs had been granted accelerated approval by the FDA on the basis of surrogate end points (ORR and durability of response) from early-stage trials, which did not translate into OS benefit in confirmatory trials.[24,25] A data-driven tool to support decision making might have helped to predict whether these surrogate end points would result in OS benefit.
Here, we describe the development of an interactive dashboard, on the basis of a robust, trial-level data set in the non–small-cell lung cancer (NSCLC) setting, to analyze and visualize correlation trends between the early end points (hazard ratio of PFS [HR PFS] and odds ratio of PFS at 4 months and at 6 months [OR PFS4 and OR PFS6, respectively]) and the late end point of HR OS in clinical trials of anticancer agents with various mechanisms of action (MoAs). Our aim is to facilitate shared decision making in the drug development pipeline by enabling rapid testing, prediction, and validation of hypotheses after randomized phase II trials and during interim analyses of phase III trials.
METHODS
Compilation of a Trial-Level Data Set: Literature Search and Data Extraction
As reported elsewhere,[26] we compiled a trial-level data set collected from historical phase II-IV randomized controlled trials of anticancer agents in the NSCLC setting. Relevant trials were identified by means of a systematic literature review.[26] Briefly, the data set was collected from historical trial reports accessed from several public data sources (including Citeline, Trialtrove, ClinicalTrials.gov, and PubMed) and an AstraZeneca internal database (Fig 1). Aggregate treatment effect estimates were extracted from the identified trial reports, including the reported HRs for OS and PFS. In addition, PFS4 and PFS6 data were extracted from the reports by mining the reported Kaplan-Meier curves (using the WebPlotDigitizer tool) and used to calculate odds ratios (ORs) for PFS4 and PFS6 (OR PFS4 and OR PFS6, respectively).
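The OR calculation from landmark PFS rates can be illustrated with a minimal Python sketch (OSPred itself is written in R). It assumes that the OR compares the odds of being progression-free at the landmark time in the investigational arm with the odds in the control arm; this orientation, the function names, and the input values are illustrative assumptions and are not taken from the OSPred source.

```python
def odds(p: float) -> float:
    """Convert a probability strictly between 0 and 1 into odds."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    return p / (1.0 - p)


def pfs_odds_ratio(pfs_treatment: float, pfs_control: float) -> float:
    """Odds ratio of being progression-free at a landmark time (eg, 6 months).

    Assumption: treatment odds divided by control odds, so that values
    above 1 favor the investigational arm. The orientation used by the
    authors is not stated in the text.
    """
    return odds(pfs_treatment) / odds(pfs_control)


# Hypothetical Kaplan-Meier read-offs at 6 months (not from any real trial).
or_pfs6 = pfs_odds_ratio(pfs_treatment=0.50, pfs_control=0.40)
print(round(or_pfs6, 3))
```

With these hypothetical inputs, the odds are 1.0 in the investigational arm and about 0.667 in the control arm, giving an OR of about 1.5.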
FIG 1.
Clinical trial curation and end points data extraction workflow. AZ, AstraZeneca; CSR, clinical study reports; OS, overall survival; PFS, progression-free survival; PFS 4/6, progression-free survival at 4 and 6 months.
Overall, the search strategy yielded 81 industry-wide trial reports (in both first and subsequent lines of therapy), representing 35 anticancer drugs and 156 observations. These studies investigated anticancer drugs covering 15 different MoAs, with the most represented MoAs being epidermal growth factor receptor (EGFR) pathway inhibition (25 trials), programmed cell death 1/programmed cell death ligand 1 (PD-1/PD-L1) pathway inhibition (18 trials), vascular endothelial growth factor receptor (VEGFR) pathway inhibition (13 trials), and DNA damage response (six trials); these four major subsets were used for downstream analysis by MoA.[26]
Data Modeling
Meta-regression analysis incorporating both fixed and random effects was implemented to measure the correlation between early and late end points, borrowing strength across the various MoA subsets mentioned above. The technical details of this analysis are published elsewhere.[26] The early end points considered in the model were HR PFS, OR PFS4, and OR PFS6; the late end point was HR OS.
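To make the idea of a trial-level meta-regression concrete, the following Python sketch fits an inverse-variance weighted regression of log HR OS on log HR PFS. It is a deliberate simplification: it contains no random effects and no MoA terms, whereas the published model is a mixed-effects meta-regression fitted in R. All function names and data values are hypothetical.

```python
import math


def weighted_meta_regression(x, y, variances):
    """Inverse-variance weighted least squares fit of y = a + b * x.

    Simplifying assumption: between-trial heterogeneity is ignored,
    so each trial is weighted only by 1 / (sampling variance of y).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar)
              for w, xi, yi in zip(weights, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict_hr_os(hr_pfs, intercept, slope):
    """Back-transform the fitted log-scale prediction to an HR OS."""
    return math.exp(intercept + slope * math.log(hr_pfs))


# Hypothetical trial-level data (not from the OSPred data set).
hr_pfs = [0.50, 0.70, 0.90, 1.10]
hr_os = [0.75, 0.85, 0.95, 1.02]
var_log_hr_os = [0.020, 0.030, 0.025, 0.040]

log_hr_pfs = [math.log(h) for h in hr_pfs]
log_hr_os = [math.log(h) for h in hr_os]
intercept, slope = weighted_meta_regression(log_hr_pfs, log_hr_os, var_log_hr_os)
print(predict_hr_os(0.60, intercept, slope))
```

Working on the log scale keeps the relationship approximately linear and ensures that back-transformed predictions of HR OS are always positive.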
Developing the OSPred Tool for Data Visualization
An interactive analysis tool, OSPred, was developed using the R Shiny software,[27] with required R packages ggplot2,[28] metafor,[29] boot,[30] dplyr,[31] and mvtnorm,[32] to analyze and visualize the correlation results from historical trials (on the basis of the abovementioned analysis) and to predict HR OS values (late end point) on the basis of user-defined input of HR or OR PFS (early end points; Fig 2).
FIG 2.
Development of the OSPred digital dashboard. AZ, AstraZeneca; HR, hazard ratio; NSCLC, non–small-cell lung cancer; OR, odds ratio; OS, overall survival; PFS, progression-free survival; REML, restricted maximum likelihood.
The five R packages had different functions within the OSPred tool. The dplyr package was used for data preprocessing. dplyr is a grammar of data manipulation, providing a consistent set of verbs to help solve common data manipulation challenges. Its functions include adding new variables that are functions of existing variables, picking variables on the basis of their names and cases on the basis of their values, reducing multiple values down to a single summary, and changing the ordering of rows.[31] As part of the OSPred tool, the function of the mvtnorm package was to compute multivariate normal and t probabilities, quantiles, random deviates, and densities[32] after HRs and ORs had been calculated. The metafor package was then used for the meta-regression analysis, as previously described.[26] metafor allows the user to calculate various effect sizes and outcome measures frequently used in meta-analyses, including risk differences, risk ratios, and odds ratios for 2 × 2 table data; incidence rate ratios and differences for two-group person-time data; and raw and standardized mean differences and response ratios (ratios of means). The package provides a variety of models and analysis approaches, including fixed-, random-, and mixed-effects models using inverse-variance methods, and functions for creating a variety of meta-analytic plots and figures.[29] The boot package was then used for bootstrapping the algorithm, by assigning measures of accuracy (bias, variance, CIs, prediction error, etc) to sample predicted clinical outcomes; in our scenario, this was HR OS. 
boot provides functions and data sets for bootstrapping[30] from the book Bootstrap Methods and Their Application.[33] In the OSPred tool, the function of the ggplot2 package was to provide data visualization through the creation of graphics. ggplot2 is based on the book The Grammar of Graphics.[34] Once data are inputted and the aesthetic mapping is chosen, the user can select which graphical primitives to use (eg, layers, scales, faceting specifications, and coordinate systems).[28] Finally, Shiny was used to facilitate the building of the interactive web application (ie, the dashboard of the OSPred tool, on the basis of these analyses). Shiny is an open-source R package that functions through automatic reactive binding between inputs and outputs and extensive prebuilt widgets.[27]
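The bootstrapping step can be illustrated with a minimal Python sketch of a percentile bootstrap for a predicted HR OS. It resamples trials with replacement, refits a simple unweighted regression on the log scale, and reads the interval off the resampled predictions. This is an assumed, simplified stand-in for the R boot workflow; the function names and data values are hypothetical.

```python
import math
import random


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def bootstrap_prediction(x, y, x_new, n_resamples=2000, seed=0):
    """Percentile bootstrap 95% interval for the fitted value at x_new."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(x)
    predictions = []
    while len(predictions) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # slope is undefined when all resampled x are equal
        ys = [y[i] for i in idx]
        a, b = fit_line(xs, ys)
        predictions.append(a + b * x_new)
    predictions.sort()
    lower = predictions[int(0.025 * n_resamples)]
    upper = predictions[int(0.975 * n_resamples) - 1]
    return lower, upper


# Hypothetical trial-level hazard ratios (not from the OSPred data set).
hr_pfs = [0.45, 0.60, 0.72, 0.85, 0.95, 1.10]
hr_os = [0.70, 0.82, 0.80, 0.93, 0.98, 1.05]
log_x = [math.log(h) for h in hr_pfs]
log_y = [math.log(h) for h in hr_os]

lower, upper = bootstrap_prediction(log_x, log_y, x_new=math.log(0.65))
print(math.exp(lower), math.exp(upper))  # interval for HR OS at HR PFS = 0.65
```

Resampling whole trials keeps each early and late end point pair together, so the interval reflects how strongly the fitted line depends on which historical trials are included.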
RESULTS
The three main features of the OSPred tool are as follows: prediction of late end point (HR OS) on the basis of early end points (data inputted by the user); visualization of comparisons of the end user's investigational drug with other drugs in the NSCLC setting, including by specific MoA; and creation of a probability density chart, providing point prediction and CIs for HR OS. The OSPred dashboard, or user interface, is shown in Figure 3.
FIG 3.
OSPred dashboard for interactive analysis of early-to-late end points in clinical trials of NSCLC. In the regression plot, the purple circles represent the historical data inputted into the model (in this example, for the specific MoA selected), and the point prediction of HR OS is highlighted by the green X. In the probability density chart, the point prediction for HR OS is represented by the black line and the red line represents the CIs for HR OS; if the point prediction is less than or equal to the predefined threshold value, then the trial is considered to be successful. HR, hazard ratio; NSCLC, non–small-cell lung cancer; MoA, mechanism of action; OR, odds ratio; OS, overall survival; PD-1, programmed cell death 1; PD-L1, programmed cell death ligand 1; PFS, progression-free survival; PFS 4/6, progression-free survival at 4 and 6 months.
The user input in the OSPred platform involves inputting a numerical value for HR PFS or OR PFS at different time points (eg, at interim analyses and/or other predetermined time points) and selecting the MoA of the investigational drug. A set of buttons allows the user to select the PFS type (ie, HR PFS, OR PFS4, or OR PFS6) and input a value. A dropdown list on the input panel allows users to choose the MoA of the investigational drug if they wish to make a prediction on the basis of MoA. 
The MoAs that can be selected are (1) inhibition of the PD-1/PD-L1 pathway, (2) inhibition of the EGFR pathway, (3) inhibition of the VEGFR pathway, and (4) DNA damage response.
On the basis of the user-defined input, key statistical metrics are displayed, including Spearman's rank correlation (range, –1 to 1), R² (the amount of variance of the outcome accounted for by the regression [%]), I² (the residual heterogeneity/amount of unaccounted variability in the regression [%]), P value (MoA-specific and overall), and the predicted true value for HR OS (with 95% CIs).
Two charts are also included in the display: the left-hand chart illustrates the regression plot for HR OS over the user-defined early end point and the point prediction of HR OS on the basis of user-defined input (highlighted by the green X), and the right-hand chart (bell chart) illustrates the predicted distribution of HR OS with the point predicted value and CIs.
A link to a working version of the tool is provided in ref. 35, from which end users can download the code and data to install the tool locally. Although end users can download the tool to input their own data, these data will not contribute to the wider tool, which is not hosted on a public platform accessible to end users and can only be updated with new data by the development team.
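The predicted distribution of HR OS and the threshold rule described for the probability density chart can be sketched in Python as follows. The sketch assumes that the prediction is normally distributed on the log scale with a known standard error; that assumption, the function name, the default threshold of 1.0, and the extra probability output are illustrative and are not taken from the OSPred source.

```python
import math
from statistics import NormalDist


def summarize_prediction(log_hr_mean, log_hr_se, threshold=1.0, level=0.95):
    """Summarize a predicted HR OS under an assumed log-normal distribution."""
    dist = NormalDist(mu=log_hr_mean, sigma=log_hr_se)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for 95%
    point = math.exp(log_hr_mean)
    lower = math.exp(log_hr_mean - z * log_hr_se)
    upper = math.exp(log_hr_mean + z * log_hr_se)
    return {
        "hr_os": point,
        "ci": (lower, upper),
        # Hypothetical extra output: probability that HR OS <= threshold.
        "prob_below_threshold": dist.cdf(math.log(threshold)),
        # Rule from the figure legend: success if point <= threshold.
        "success": point <= threshold,
    }


# Hypothetical prediction: HR OS of 0.80 with a log-scale SE of 0.12.
summary = summarize_prediction(math.log(0.80), 0.12, threshold=1.0)
print(summary)
```

Reporting the probability of falling below the threshold alongside the point prediction shows how much of the predicted distribution supports the success call, which a single point estimate cannot convey.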
DISCUSSION
The OSPred tool offers interactive visualization of clinical trial end point correlations with reference to a large pool of historical NSCLC trials. It also provides an early indication of the potential longer-term outcomes that may be achieved in late-stage clinical trials of targeted investigational agents (ie, prediction of HR OS) on the basis of user-defined inputs.
Briefly, the trial-level meta-analysis on which our tool is based showed low-to-moderate correlations between treatment effects for early end points (on the basis of HR or OR PFS) and HR OS across trials of agents with various MoAs and moderate correlations between treatment effects for HR PFS and HR OS across all trials and in the PD-1/PD-L1 and EGFR trial subsets.[26] Limitations of this analysis, which include a lack of stratification by stage of disease, nature of the control arm, prognostic factors (eg, performance status or nodal volume), length of follow-up, or line of therapy (all of which could have confounded the results), have implications for the reliability of the tool. The tool could be improved by allowing for further stratification according to these trial-related variables.
Our tool has been applied to this large set of historical trial reports,[26] and our intent is that the results can be referred to by clinical teams to inform shared decision making in later stages of the drug development process. For example, the tool could aid decisions to initiate phase III trials by providing an early indication of potential long-term outcomes (ie, HR OS) on the basis of available data for end points collected from phase II trials (eg, PFS6). It could be used in a similar manner at interim analyses of phase III trials to aid decisions on whether to continue follow-up for the OS end point or stop the trial for futility. 
OSPred's focused capability has the potential to digitally transform and accelerate decision making with data-driven insights into the drug development process.
Importantly, the OSPred tool can provide outputs with specific relevance to the MoA of the investigational agent, which is achieved by only using historical trials of agents with the same MoA to inform predictions. This is important to consider, as evidence from the trial-level meta-analysis carried out to inform the development of the OSPred tool suggests that the correlation between the PFS-based early end points and HR OS may vary according to the MoA of the investigational agent.[26] The strength of correlation between PFS and OS end points also varies across published meta-analyses that examine trials of investigational agents with specific MoAs.[36,37] A key limitation of the tool is that the number of historical trials used to inform the model was small for two of the MoAs investigated (ie, VEGFR inhibition and DNA damage response). In the future, we hope to incorporate a greater number of trials into the machine learning algorithm to allow more robust predictions for these MoAs and other new MoAs by the tool.
The results of the same trial-level meta-regression analysis also suggest that, for PD-1/PD-L1 checkpoint inhibitors and EGFR inhibitors, an OR PFS4/6 in favor of the investigational product might be associated with an HR OS similarly in favor of the investigational product.[26] Currently, PFS6 is often used as the key end point in randomized phase II trials to support initiation of phase III trials and accelerate approval of novel therapies,[13,38] whereas PFS4 is only based on extrapolations. Therefore, demonstrating that PFS4 can also predict the late end point of OS could encourage clinical trial investigators to change the key end point in phase II trials from PFS6 to PFS4, leading to earlier results and potentially changing how clinical trials are conducted. 
However, it is important to consider that a full safety assessment would require a longer time window.
Previous meta-analyses have demonstrated that R², the amount of variance between PFS-based end points and HR OS accounted for by the regression, is low in oncology trials, as a result of the substantial degree of heterogeneity reported in these studies.[26,39-42] This means that some of the variability in HR OS cannot be explained by variability in PFS-based end points. Therefore, there is a need to identify alternative early end points, beyond the traditional end points assessed with RECIST criteria, that might predict HR OS more accurately. Although the OSPred tool includes a wider range of data than these previous meta-analyses, it uses the same early end points, and a limitation of the tool is that it does not take into account any trial-to-trial variability (eg, whether trials allowed for crossover and whether there were differences by length of follow-up, stage of disease, prognostic factors, nature of the control arm, or line of therapy). Indeed, a recent analysis sought to address the issue of subsequent therapies and crossovers by evaluating second PFS (time from random assignment to progression on first subsequent therapy), reporting that second PFS (r = 0.67) had a better correlation with OS than ORR (r = 0.12) or PFS (r = 0.21).[43]
In the future, we hope to incorporate additional trial-level data into the platform on a yearly basis to allow clinical teams to easily analyze and visualize the correlation trends between novel early end points and HR OS. We also hope to expand the tool to better inform future trial designs, in the form of early end points beyond PFS and of additional solid tumor indications such as breast cancer and other types of lung cancers (eg, SCLC). A next step for this tool might also be to increase the source data, potentially to incorporate large-scale patient-level data, which would enable more granular outputs and answers. 
For example, a larger, more detailed data set could enable the tool to stratify HR OS predictions according to a patient's prognostic outcome or stage of disease. Equally, given the significant time and effort taken to compile the data set for this work, comprehensive clinical trial reporting and inclusion of different outcome-related metrics (eg, PFS4 and PFS6) in a format that more easily lends itself to this sort of analysis would further expand the development of tools like OSPred, which, in turn, would increase the realization of data-driven decision making in drug discovery, development, and repositioning. Beyond the clinical development setting, the tool could be used for precision oncology in clinical practice. For example, if an individual patient responds poorly to a treatment for which a high correlation between early PFS-based end points and OS was demonstrated through the tool, treatment could be terminated or switched.
In conclusion, the OSPred tool offers interactive visualization of clinical trial end point correlations with reference to a large pool of historical NSCLC studies. Its focused capability has the potential to digitally transform and accelerate data-driven decision making as part of the drug development process. In the future, the tool will be expanded to include early end points beyond PFS and additional solid tumor indications and to encompass a greater variety of stratification factors. Widespread use of the tool by oncology clinical teams could reduce the human and financial costs of clinical trial failure by aiding decisions on whether to initiate or continue phase III trials. It could also help to build confidence in early end points, supporting their recognition by regulatory bodies worldwide for accelerated approval of novel anticancer therapies.