Pharmacometric modeling and machine learning analyses of prognostic and predictive factors in the JAVELIN Gastric 100 phase III trial of avelumab.

Nadia Terranova¹, Jonathan French², Haiqing Dai³, Matthew Wiens², Akash Khandelwal⁴, Ana Ruiz-Garcia⁵, Juliane Manitz³, Anja von Heydebreck⁴, Mary Ruisi³, Kevin Chin³, Pascal Girard¹, Karthik Venkatakrishnan³.

Abstract

Avelumab (anti-PD-L1) is an approved anticancer treatment for several indications. The JAVELIN Gastric 100 phase III trial did not meet its primary objective of demonstrating superior overall survival (OS) with avelumab maintenance versus continued chemotherapy in patients with advanced gastric cancer/gastroesophageal junction cancer; however, the OS rate was numerically higher with avelumab at timepoints after 12 months. Machine learning (random forests, SIDEScreen, and variable-importance assessments) was used to build models to identify prognostic/predictive factors associated with long-term OS and tumor growth dynamics (TGDs). Baseline, re-baseline, and longitudinal variables were evaluated as covariates in a parametric time-to-event model for OS and Gompertzian population model for TGD. The final OS model incorporated a treatment effect on the log-logistic shape parameter but did not identify a treatment effect on OS or TGD. Variables identified as prognostic for longer OS included older age; higher gamma-glutamyl transferase (GGT) or albumin; absence of peritoneal carcinomatosis; lower neutrophil-lymphocyte ratio, lactate dehydrogenase, or C-reactive protein (CRP); response to induction chemotherapy; and Eastern Cooperative Oncology Group performance status of 0. Among baseline and time-varying covariates, the largest effects were found for GGT and CRP, respectively. Liver metastasis at re-baseline predicted higher tumor growth. Tumor size after induction chemotherapy was associated with number of metastatic sites and stable disease (vs. response). Asian region did not impact OS or TGD. Overall, an innovative workflow supporting pharmacometric modeling of OS and TGD was established. Consistent with the primary trial analysis, no treatment effect was identified. However, potential prognostic factors were identified.

© 2021 The healthcare business of Merck KGaA, Darmstadt, Germany and Metrum Research Group. CPT: Pharmacometrics & Systems Pharmacology published by Wiley Periodicals LLC on behalf of American Society for Clinical Pharmacology and Therapeutics.

Year: 2022 PMID: 34971492 PMCID: PMC8923733 DOI: 10.1002/psp4.12754

Source DB: PubMed Journal: CPT Pharmacometrics Syst Pharmacol ISSN: 2163-8306

WHAT IS THE CURRENT KNOWLEDGE ON THE TOPIC? Immune checkpoint inhibitors have limited antitumor activity in patients with gastric cancer or gastroesophageal junction cancer (GC/GEJC), but a subset of patients obtain clinical benefits. Pharmacometric modeling can identify patient and disease factors associated with outcome, potentially enabling more efficient selection of therapy.

WHAT QUESTION DID THIS STUDY ADDRESS? This analysis aimed to identify prognostic or predictive factors in patients with advanced GC/GEJC who received maintenance therapy with avelumab or continued chemotherapy in the JAVELIN Gastric 100 trial. Machine learning was used to develop improved models of overall survival (OS) and tumor growth dynamics (TGD).

WHAT DOES THIS STUDY ADD TO OUR KNOWLEDGE? Advanced models of OS and TGD were generated, and potentially important baseline and longitudinal covariates that are prognostic for OS and TGD were identified.

HOW MIGHT THIS CHANGE DRUG DISCOVERY, DEVELOPMENT, AND/OR THERAPEUTICS? This study presents an innovative workflow for modeling OS and TGD in the presence of covariate sets during immune checkpoint inhibitor therapy, which may be applicable to disease settings beyond GC/GEJC.

INTRODUCTION

Avelumab, an anti–PD‐L1 immune checkpoint inhibitor, has been approved in various countries as monotherapy for metastatic Merkel cell carcinoma and advanced urothelial carcinoma (first‐line maintenance and second‐line therapy) and in combination with axitinib for advanced renal cell carcinoma. JAVELIN Gastric 100 was a phase III trial that compared maintenance avelumab treatment versus continuation of chemotherapy in patients with advanced gastric cancer or gastroesophageal junction cancer (GC/GEJC) that had not progressed with first‐line induction chemotherapy (Figure 1). This trial did not meet its primary objective of demonstrating superior overall survival (OS) in the avelumab arm. However, in Kaplan‐Meier analyses, OS curves crossed after ~12 months, with numerically higher OS rates observed in the chemotherapy arm before 12 months and in the avelumab arm after 12 months. These results raised questions about whether a subpopulation exists that could benefit from avelumab.

FIGURE 1

JAVELIN Gastric 100 trial overview. (a) Study design schematic. (b) Primary analysis of OS. Borrowed with permission from Moehler M, et al. Phase III trial of avelumab maintenance after first‐line induction chemotherapy versus continuation of chemotherapy in patients with gastric cancers: results from JAVELIN Gastric 100. J Clin Oncol. 2021;39(9):966–977. https://ascopubs.org/doi/full/10.1200/JCO.20.00892 © American Society of Clinical Oncology. 1L, first line; 5‐FU, 5‐fluorouracil; BOR, best overall response; BSC, best supportive care; CI, confidence interval; GC/GEJC, gastric/gastroesophageal junction cancer; HER2, human epidermal growth factor receptor 2; HR, hazard ratio; IV, intravenous; OS, overall survival; PD, progressive disease; PFS, progression‐free survival; PRO, patient‐reported outcome; Q2W, every 2 weeks; QOL, quality of life; R, randomization; RECIST, Response Evaluation Criteria in Solid Tumors.

Machine learning (ML) methods are commonly used to identify potential prognostic and predictive factors from trial data, with increasing application in the context of pharmacometric covariate modeling. The current analysis aimed to develop pharmacometric models that applied ML approaches to JAVELIN Gastric 100 trial data, with the purpose of identifying prognostic or predictive factors for OS and tumor growth dynamics (TGDs; i.e., changes in tumor size over time).

MATERIALS AND METHODS

JAVELIN Gastric 100

The design and outcomes of the JAVELIN Gastric 100 trial have been reported previously. In brief, the study enrolled 805 patients with unresectable, human epidermal growth factor receptor 2–negative (ERBB2–negative), locally advanced or metastatic GC or GEJC who received 12 weeks of first‐line induction chemotherapy with oxaliplatin + a fluoropyrimidine. Subsequently, 499 patients without progressive disease (complete or partial response or stable disease) were randomized 1:1 to receive avelumab 10 mg/kg every 2 weeks (n = 249) or continued chemotherapy (n = 250), stratified by region (Asia vs. non‐Asia). All patients received best supportive care (BSC); patients in the chemotherapy arm considered ineligible for further chemotherapy received BSC only (n = 19). The primary end point was OS (defined as time from randomization to death). When referring to study data, “baseline” refers to values obtained before the 12‐week induction period and “re‐baseline” refers to values obtained at randomization (after induction).

Analysis overview

Analyses focused on efficacy outcomes (OS and TGD) for all randomized patients. Patient and disease characteristics at baseline or re‐baseline were evaluated as potential predictors. Longitudinal values of laboratory biomarkers (i.e., those assessed over time) were considered for inclusion in the ML models. For time‐invariant values, re‐baseline values were included as covariates. For time‐varying values, change from baseline to re‐baseline and individual‐predicted post‐randomization annual rates of change (model described later) were included as covariates in the ML model, whereas actual profiles of those identified as relevant were considered in subsequent parametric modeling. In this analysis, we screened 32 time‐invariant covariates and 19 time‐varying covariates. This yielded a total of 89 and 52 covariates entering the ML models for OS and TGD, respectively. Modeling methodology and software are described in the Supplementary Methods.

Stability assessment of covariate candidates

Both time‐invariant and longitudinal covariates were included in the ML models. To assess longitudinal covariate stability over time, a linear mixed effect model was fitted to post‐randomization values (Supplementary Methods). A covariate was considered time invariant if 50% or more of the patients in each treatment arm had no meaningful trend over time, where a meaningful trend was defined as an individual‐specific annual rate of change larger than the estimated residual error standard deviation. For time‐invariant biomarkers, re‐baseline values were used as covariates in subsequent models. For time‐varying variables, baseline, change from baseline to re‐baseline, and individual‐specific annual rate of change were used as covariates (features) in subsequent ML models. Time‐invariant demographic covariates used baseline values.
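The stability rule described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the example slopes are hypothetical, and comparing the magnitude (absolute value) of each slope with the residual standard deviation is an assumption, because the text says only "larger than".

```python
def fraction_with_trend(slopes, residual_sd):
    """Share of patients whose annual rate of change exceeds the residual SD.

    Assumption: the comparison uses the absolute value of the slope.
    """
    return sum(abs(s) > residual_sd for s in slopes) / len(slopes)


def is_time_invariant(slopes_by_arm, residual_sd):
    """True if, in every arm, at least 50% of patients show no meaningful trend."""
    return all(
        1.0 - fraction_with_trend(slopes, residual_sd) >= 0.5
        for slopes in slopes_by_arm.values()
    )


# Hypothetical individual slopes (units per year) for one laboratory value.
example = {
    "avelumab": [0.1, -0.2, 3.0, 0.05],
    "chemotherapy": [0.0, 0.3, -2.5, 4.0],
}
print(is_time_invariant(example, residual_sd=1.0))
```

In the analysis itself the individual slopes came from the linear mixed effect model; here they are simply supplied as lists.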

Identification of prognostic and predictive markers for OS

To evaluate the relationship between candidate covariates and OS, exploratory graphical analysis was performed. Prognostic markers were identified using random forests (RFs) with tuned parameters. Missing data were imputed to the median for continuous covariates and the mode for discrete covariates, stratified by treatment arm. No imputation was performed for time‐varying covariates. Predictive markers were identified using methods appropriate for exploratory subgroup discovery. In particular, the SIDEScreen procedure (a two‐stage algorithm for identifying combinations of predictors, i.e., regions in the covariate space) was used to identify subpopulations showing significant treatment effects. For prognostic models, covariate importance was assessed using a combination of the Boruta method, the permutation method, and a random splits method. The assessment criteria aligned with the overall hypothesis‐generating strategy; therefore, the general approach was to include covariates that at least one method deemed important, while discarding those that no method selected. The Boruta method is a strategy in RF models in which randomly permuted copies of the original covariates are introduced as candidate predictors in addition to the original covariates. Because the permuted variables have no relationship to the outcome, the original covariates that are selected more frequently than the permuted covariate are considered to be important predictors. The Boruta method classifies covariates into three categories: confirmed, tentative, and rejected. If not rejected by the Boruta method, a covariate was considered important and included in the parametric model if the importance determined by either the permutation or random splits method was greater than or equal to 25% of the most important variable identified (relative importance). If the covariate was rejected by the Boruta method, then to be included in the parametric model, the relative importance determined by the permutation and random splits methods needed to be 10% or greater for both methods and 25% or greater for one method. For further evaluations, this process was repeated, excluding the slopes of time‐varying covariates to assess variables present only at baseline or re‐baseline. For evaluating predictive markers, the SIDEScreen procedure had a built‐in measure of covariate importance.
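The inclusion rule for prognostic covariates can be written as a short decision function. The sketch below is illustrative only: the function name and argument names are hypothetical, and relative importance is assumed to be expressed as a fraction of the most important variable (0 to 1).

```python
def include_in_parametric_model(boruta_status, rel_permutation, rel_random_splits):
    """Apply the stated thresholds to one covariate.

    boruta_status: "confirmed", "tentative", or "rejected".
    rel_*: importance relative to the most important variable (0 to 1).
    """
    if boruta_status not in {"confirmed", "tentative", "rejected"}:
        raise ValueError(f"unknown Boruta status: {boruta_status!r}")
    highest = max(rel_permutation, rel_random_splits)
    lowest = min(rel_permutation, rel_random_splits)
    if boruta_status != "rejected":
        # Not rejected: either method at or above 25% is sufficient.
        return highest >= 0.25
    # Rejected: both methods at or above 10% and one at or above 25%.
    return lowest >= 0.10 and highest >= 0.25


print(include_in_parametric_model("tentative", 0.30, 0.02))  # meets the 25% rule
print(include_in_parametric_model("rejected", 0.30, 0.05))   # fails the 10% rule
```

The asymmetry is deliberate in the described approach: a covariate rejected by Boruta needs support from both of the other methods, whereas a covariate that Boruta does not reject needs support from only one.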

Development of a parametric time‐to‐event model

A parametric time‐to‐event (TTE) model for OS was constructed with re‐baseline and time‐varying factors. Four structural models for the baseline survival distribution were evaluated: exponential, Weibull, log‐logistic, and Gompertz. Time‐invariant and re‐baseline covariates were evaluated for whether they modified the scale parameter of each model, and time‐varying covariates were evaluated for whether they modified the baseline hazard proportionally. Two choices for how covariates entered the model were considered: nonparallel hazard functions to reproduce the crossing of survival curves through covariate effects on the scale parameter, and ensuring a continuous cumulative hazard function by using proportional hazards effects for time‐varying covariates. Equations describing such relationships are reported in the Supplementary Methods. Covariate effects on the shape and scale parameters were assumed to be log‐linear. Nonlinear associations were evaluated if model diagnostics indicated that the linear association was not appropriate. A treatment effect on the shape parameter was also explored based on model diagnostic and exploratory plots. Uninformative priors were used in the Bayesian modeling. Credible intervals were derived from percentiles of the posterior distributions of parameters or quantities of interest. Clinical trial simulations were used to predict landmark survival, accounting for variability in the study population and uncertainty in model parameters. One thousand clinical trials were simulated by resampling the patient population and sampling the parameters from the posterior distribution. The OS simulations accounted for administrative censoring at the end of the trial but not patient discontinuations. Arms of equal size to the original arms were created. Differences in survival probability were computed for landmarks of 12, 18, and 24 months.
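To illustrate why a covariate effect on the shape parameter can produce crossing survival curves without changing median OS, the sketch below uses the textbook log‐logistic parameterization, S(t) = 1 / (1 + (t/scale)^shape), in which the scale equals the median. The parameterization actually used is given in the Supplementary Methods and may differ; the arm labels and parameter values here are hypothetical.

```python
def loglogistic_survival(t, scale, shape):
    """Survival probability at time t > 0; the median survival equals `scale`."""
    return 1.0 / (1.0 + (t / scale) ** shape)


def loglogistic_hazard(t, scale, shape):
    """Hazard at time t > 0; rises then falls when shape > 1."""
    z = (t / scale) ** shape
    return (shape / t) * z / (1.0 + z)


# Two hypothetical arms with the same median (1.2 years) but different shapes.
arm_a = {"scale": 1.2, "shape": 1.5}
arm_b = {"scale": 1.2, "shape": 2.0}

for years in (0.5, 1.2, 2.4):
    s_a = loglogistic_survival(years, **arm_a)
    s_b = loglogistic_survival(years, **arm_b)
    print(f"t={years:.1f} y  S_A={s_a:.3f}  S_B={s_b:.3f}")
```

With equal scale parameters the two curves meet at the common median: the arm with the larger shape has higher survival before that time and lower survival after it.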

Identification of prognostic and predictive markers for TGD

The fundamental approach for identifying prognostic and predictive markers for TGD was to use ML (RFs and variable importance metrics) approaches on the empirical Bayes estimates (EBEs) of structural parameters in a tumor dynamic model to identify important covariates via three steps. First, a nonlinear mixed‐effects model was constructed for longitudinally measured TGD data. Different tumor dynamic models using ordinary differential equations were explored. The base structural model did not include any drug‐induced decay process; thus, the net growth characterized by a constant rate of growth accounted for natural tumor growth and any tumor inhibition by therapy. Model evaluations utilized standard procedures in population pharmacology modeling. In the second step, RF methods were used to identify prognostic and predictive factors associated with TGD parameters from a set of 52 covariates. The outcome modeled was individual‐predicted (EBE) parameters from the TGD model. An RF model for each outcome was used. Variables were considered prognostic on a random effect if the estimated Shapley importance was greater than 5% of the interquartile range (IQR) of the random effect. The predictive importance of each covariate was assessed using a differential effect of the Shapley values. Each covariate was discretized into quartiles if there were more than four distinct values. Shapley values for each quartile within each arm were averaged to calculate a differential treatment effect of the covariate in each quartile. If the greatest differential treatment effect across the four quartiles was greater than 10% of the IQR, the covariate was considered predictive. Finally, the base model was updated to include the factors identified in the ML step, resulting in the “final” TGD model. Both scientific understanding of the mechanistic effect of covariates and visual inspection of the relationship between the sampled EBEs and covariates guided selection of parameter‐covariate relationships.
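The two Shapley‐based thresholds (5% of the IQR for prognostic factors, 10% of the IQR for predictive factors) can be sketched as follows. Several details are assumptions made for illustration and are not taken from the paper: "Shapley importance" is taken as the mean absolute Shapley value, the differential treatment effect is taken as the absolute difference between arm‐wise mean Shapley values, and all function names and data are hypothetical. Shapley values are supplied as plain lists; computing them from an RF is outside the scope of this sketch.

```python
from statistics import mean, quantiles


def iqr(values):
    """Interquartile range using the default quantile method of `statistics`."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def is_prognostic(shap_values, random_effects, threshold=0.05):
    """Mean |Shapley value| greater than 5% of the IQR of the random effect."""
    return mean(abs(v) for v in shap_values) > threshold * iqr(random_effects)


def bin_labels(covariate):
    """Quartile bins if more than four distinct values, else the raw values."""
    if len(set(covariate)) <= 4:
        return list(covariate)
    cuts = quantiles(covariate, n=4)
    return [sum(x > c for c in cuts) for x in covariate]


def max_differential_effect(covariate, shap_values, arms, arm_a, arm_b):
    """Largest between-arm difference in mean Shapley value across bins."""
    labels = bin_labels(covariate)
    diffs = []
    for label in set(labels):
        in_a = [s for l, s, g in zip(labels, shap_values, arms) if l == label and g == arm_a]
        in_b = [s for l, s, g in zip(labels, shap_values, arms) if l == label and g == arm_b]
        if in_a and in_b:
            diffs.append(abs(mean(in_a) - mean(in_b)))
    return max(diffs, default=0.0)


def is_predictive(covariate, shap_values, arms, random_effects, arm_a, arm_b, threshold=0.10):
    """Greatest differential effect greater than 10% of the IQR."""
    effect = max_differential_effect(covariate, shap_values, arms, arm_a, arm_b)
    return effect > threshold * iqr(random_effects)
```

A covariate can pass the prognostic check without passing the predictive one, which is the pattern reported for the TGD analysis.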

RESULTS

Identification of prognostic and predictive markers for OS and development of a parametric TTE model

Of 23 longitudinal laboratory values assessed, six were determined to be time invariant and 17 time varying (Table S1). Although large changes were recorded at the last assessment for many patients, these time points had minimal effects on the cumulative hazard in subsequent modeling. When stratifying by event time quartile, changes in the laboratory values from baseline were much larger in the first quartile than in the fourth quartile, suggesting that changes in laboratory values were associated with OS. Variables concluded to be time‐varying (at least one arm containing >50% of patients with more than one standard deviation change per year) were albumin (ALB), alkaline phosphatase (ALP), alanine aminotransferase, aspartate aminotransferase (AST), C‐reactive protein (CRP), gamma‐glutamyl transferase (GGT), lactate dehydrogenase (LDH), hemoglobin, lymphocytes, lymphocyte‐leukocyte ratio, neutrophils, platelets, leukocytes, estimated glomerular filtration rate (eGFR), platelet‐lymphocyte ratio, neutrophil‐lymphocyte ratio, and inflammation index.

ML covariate screening

Important covariates identified by ML as prognostic were mostly the estimated slopes from time‐varying laboratory values. Among variables observable at re‐baseline, Eastern Cooperative Oncology Group performance status (ECOG PS) and age were identified as important (Figure 2a and Table S2). To focus on time‐invariant covariates observable before maintenance therapy, the importance of variables available only at re‐baseline was also considered. Additional variables were shown to be important; however, the importance of re‐baseline ECOG PS and age remained, suggesting that the selection was consistent (Figure 2b). The importance of the identified variables was confirmed by visual inspection of Kaplan‐Meier curves for OS, given the stratification of variables into quartiles. The important time‐varying covariates included in the parametric model were ALB, ALP, AST, CRP, inflammation index, LDH, lymphocyte‐leukocyte ratio, neutrophil‐lymphocyte ratio, and platelet‐lymphocyte ratio. The important time‐invariant covariates included in the parametric model were age, re‐baseline ECOG PS, baseline sum of longest diameters (SLDs), creatine kinase (CK), days since diagnosis, eGFR, GGT, heart rate, prior gastrectomy, systolic blood pressure, triglycerides, and tumor response at re‐baseline.

FIGURE 2

ML variable importance for OS. (a) Relative variable importance per methodology. Covariates included in this plot are filtered to those covariates that the Boruta method does not reject, or those covariates where the mean of the relative importance of the other two methods is greater than or equal to 0.05. (b) Variable importance for those covariates measurable at re‐baseline or baseline, relative to the most important covariate for each method. Covariates included in this plot are filtered to those covariates that the Boruta method does not reject, or those covariates where the mean of the relative importance of the other two methods is greater than or equal to 0.05. In both (a) and (b), the vertical bars show the 10% and 25% relative importance thresholds. *Variables meeting the criteria for consideration in the parametric TTE model. ECOG, Eastern Cooperative Oncology Group; ML, machine learning; MSI, microsatellite instability; OS, overall survival; SLD, sum of the longest diameter of target lesions.

Several of the time‐varying laboratory value slopes were highly correlated. In particular, absolute correlations between the slopes of platelet‐lymphocyte ratio, neutrophil‐lymphocyte ratio, lymphocyte‐leukocyte ratio, and inflammation index were greater than 0.70. Thus, based on clinical considerations, only the neutrophil‐lymphocyte ratio was retained for the parametric model. The SIDEScreen algorithm analysis did not identify any potentially predictive covariates. Initially, simple subgroup analysis suggested treatment‐modifying effects, but these did not withstand adjustment for multiplicity. As an additional method to assess predictive factors, prognostic factors were assessed separately for each arm; however, identified variables were consistent. Although the order of variable importance among the three data sets (avelumab arm, chemotherapy arm, and both arms combined) differed, no difference in ranking was large enough to suggest a predictive effect.

Parametric TTE model

The log‐logistic model provided the best fit to the OS data. Exploration of the baseline hazard function showed that the log‐logistic model best fit the nonmonotonic estimated hazard curve for trial arms, both combined and individually. Maximum likelihood fits for the log‐logistic model suggested no treatment effect on median OS. Because the nonparametric kernel hazard estimator, with an Epanechnikov kernel and local optimization of the bandwidth, suggested different shapes of the hazard function for treatment type, a parameter for this effect was also included. ML screening identified both time‐invariant (on‐scale parameter/accelerated failure time: treatment, age, CK, days since diagnosis, re‐baseline ECOG PS, eGFR, heart rate, GGT, peritoneal carcinomatosis, prior gastrectomy, systolic blood pressure, SLD at baseline, triglycerides, and tumor response at re‐baseline) and time‐varying (on‐hazard/proportional hazards using longitudinal values: ALP, AST, CRP, LDH, ALB, and neutrophil‐lymphocyte ratio) covariates (Table S3). The visual predictive check (VPC) for the model showed close alignment to observed Kaplan‐Meier curves (Figure S1) and hazard ratio (Figure S2). In a substantial fraction of the posterior draws, both the mean hazard function and survival curves exhibited crossing of curves during follow‐up, replicating observed OS data. Different shape parameters and covariate distributions contributed to the crossing of the survival curves. Asian versus non‐Asian region was not identified as a covariate; VPCs stratified by region are shown in Figure S3. Estimated parameters suggested that the effect of treatment on median OS (i.e., effect on scale parameter) was negligible because the point estimate was close to zero and the credible intervals extended to both positive and negative effects (Figure 3). However, the credible interval for the treatment effect on the shape parameter excluded zero; thus, the probability of a non‐negligible difference in the shape of the hazard function was high. The magnitude of effects on OS of covariates identified as prognostic via ML were evaluated further. Specifically, the effects of covariates were scaled to the effect on median OS between the 5th and 95th percentiles of continuous covariates, or a reference value of zero or false for categorical covariates. Among time‐invariant covariates, the largest relative effect was for GGT, with a median difference of 17% longer OS (1.42 vs. 1.21 years; 95th percentile to median). Among time‐varying covariates, the largest effect was for CRP, with median hazard ratios of 0.699 for a CRP of 0.50 versus 2.0 mg/L (5th percentile to median) and 2.01 for a CRP of 34.9 versus 2.0 mg/L (95th percentile to median). Reference and comparison values for the time‐varying covariates were derived from observed data at re‐baseline.

FIGURE 3

Estimated covariate effects on OS. (a) Estimated covariate effects on median survival for baseline, re‐baseline, and stable covariates (dots) and 70% (thick line) and 95% (thin line) credible intervals for each time‐invariant covariate effect parameter estimated in the model. Larger estimates correspond to longer survival. For continuous covariates, the 2.5th and 97.5th percentiles are compared with the median. For treatment, the reference is chemotherapy. For the tumor response covariate, the effects are compared with patients who had neither stable disease nor response. For re‐baseline ECOG, the effect is compared with a re‐baseline ECOG score of 0. For prior gastrectomy, the baseline reference is no prior gastrectomy. For peritoneal carcinomatosis, the reference is no peritoneal carcinomatosis. The dashed lines are at zero, for no effect, and the bounds at ±15% for the posterior median to classify the variable as having a meaningful effect. (b) Estimated covariate effects on the hazard ratio for time‐varying covariates (dots) and 70% (thick line) and 95% (thin line) credible intervals for each time‐varying (longitudinal data) covariate effect parameter estimated in the model. Smaller estimates correspond with longer survival. The percentiles for reference and comparison values were calculated using the values of the covariate at re‐baseline only. The dashed lines are at one, for no effect, and the bounds ±1.2 for the posterior median to classify the variable as having a meaningful effect. CR, complete response; ECOG, Eastern Cooperative Oncology Group; OS, overall survival; PR, partial response; SD, stable disease; SLD, sum of the longest diameter of target lesions.

Time‐varying covariates were evaluated using percentiles at re‐baseline for comparison. Important variables included neutrophil‐lymphocyte ratio, LDH, CRP, albumin, peritoneal carcinomatosis, and age (Figure 3). Model fit was also assessed through stratified VPCs for time‐invariant covariates. A possible lack of model fit was seen only in the VPC for heart rate, whereas the others showed the observed data within the 95% credible interval range for predicted OS curves. When evaluating the effect size and importance for time‐varying covariates, trajectories were meaningful, and extreme laboratory values and events were correlated. The effect for CRP and neutrophil‐lymphocyte ratio is shown in Figure 4 and Figure S4, respectively. The rapid changes in laboratory values close to event times were also demonstrated by stratifying by first versus fourth quartile of observed event time and assessing median trajectory. The first‐quartile (shorter OS) had more rapid changes in laboratory values than the fourth‐quartile (longer OS) group. Changes were similar between treatment arms.

FIGURE 4

Analysis of C‐reactive protein (CRP) in relation to overall survival. (a) Estimated survival curves for smoothed 10th percentile, median, and 90th percentile CRP levels over time. All other variables held constant at reference levels. Increases in CRP are estimated to have increased risk. Bands are 95% credible intervals. (b) Observed data for CRP. The first and fourth quartiles of known event times are compared to show the trajectories of CRP for short‐term versus long‐term survivors.

The predicted effect of avelumab on long‐term OS was evaluated through clinical trial simulation. The probability of longer OS in the avelumab arm increased over time, from 47% at 12 months after randomization to 76% at 24 months. Indeed, estimated differences between the avelumab and chemotherapy arms were predicted to increase over time (median difference: 12 months, −0.40%, 95% confidence interval [CI], −0.13 to 0.12; 18 months, 3.2%, 95% CI, −0.088 to 0.16; and 24 months, 4.0%, 95% CI, −0.060 to 0.15); however, all included the null value and were well below the 10% difference considered clinically meaningful (Figure 5 and Figure S5; Table S4).
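The landmark summaries reported from the clinical trial simulations (probability of longer OS, median difference, and 2.5th to 97.5th percentile interval) can each be computed from a vector of simulated between‐arm differences in survival probability. The sketch below is a minimal illustration with hypothetical values and a hypothetical function name; it does not reproduce the simulations themselves.

```python
from statistics import median, quantiles


def summarize_landmark_difference(differences):
    """Summarize simulated differences in survival probability at one landmark.

    differences: one value per simulated trial (avelumab minus chemotherapy).
    """
    cuts = quantiles(differences, n=40)  # 39 cut points in steps of 2.5%
    return {
        "prob_avelumab_longer": sum(d > 0 for d in differences) / len(differences),
        "median_difference": median(differences),
        "interval_95": (cuts[0], cuts[-1]),  # 2.5th and 97.5th percentiles
    }


# Hypothetical differences from ten simulated trials (proportions, not percent).
simulated = [-0.02, -0.01, 0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08]
print(summarize_landmark_difference(simulated))
```

A probability above 50% alongside an interval that spans zero is the pattern described in the text: the avelumab arm is more likely than not to have higher landmark survival, but a difference of zero cannot be excluded.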

FIGURE 5

The posterior distributions of survival differences between the avelumab arm and the chemotherapy/best supportive care arm. The probability of the avelumab arm having longer survival increased with time, but there was non‐negligible probability of the avelumab arm having shorter survival at all landmark times.

The goodness of fit and stability of different models for TGD were evaluated. The Gompertzian equation was the most parsimonious model for providing a good description of the data. The base model generally predicted the central tendency in TGD data across patients. During the estimation process, a high correlation was observed between interindividual variability (IIV) on Kg and Kd (rate constants of growth and deceleration/decline, respectively; the deceleration rate relates to natural death of tumor cells). Thus, Kd was reformulated as a parameter proportional to Kg with a slope and intercept, which allowed the simplification of the dimensionality of the omega structure, restricting the correlation between Kd and Kg to 1. The relationship between the two estimated random effects (Kg‐Kd and baseline tumor size) suggested minimal correlation; thus, the two univariate models would not lead to nonidentification of interactions between the random effects. The RF had modest predictive power (19% variance explained) on the baseline TGD random effect and weaker predictive power (4% variance explained) on the growth‐deceleration random effect, but this was sufficient to compute variable importance scores. All variables considered in the model were assessed with Shapley values, the permutation algorithm, and the random splits algorithm. The last two methods did not suggest any covariates that were not already captured with Shapley values (Figure S6). No treatment‐covariate interactions were included in the final model because each of the differential treatment effects was much less than 10% of the IQR. Covariate effects on the random effects of the growth‐deceleration rate were liver metastasis and time since diagnosis. For the random effect of baseline TGD, covariates identified were number of metastatic sites and tumor response at re‐baseline. Identified prognostic and predictive markers on TGD dynamics by ML were incorporated into the base model (Table S5) as described in Supplementary Methods to generate the “final” TGD model (Table S6). The objective function value for the “final” TGD model was ~121 points lower than for the base TGD model. The effect of liver metastasis is illustrated in Figure S7. The mean percentage change in TGD at the end of the study (Table S7) was calculated by averaging shrinkage (<0%) and growth (>0%). Consistent with the OS model, Asian versus non‐Asian region was not identified as a covariate for the TGD model (Figure S8). Parameter estimates from the final model and median and mean values from the nonparametric bootstrap appeared to agree for Kg and tumor size at baseline and covariate effects on tumor baseline (Table 1). However, estimates for covariate effects on Kg/Kd seemed to present very high uncertainty, with a point estimate that differed between final model estimate and median bootstrap analysis. Stratified VPCs were generated using the final model parameters to determine if observed data were consistent with model‐simulated median, 2.5th, and 97.5th percentiles, which indicated acceptable performance of the model in all covariate groups (Supplementary Figures). Similar to the base model, the low percentage of patients remaining on trial after ~300 days (<20%) resulted in overprediction of tumor size profiles, with observed percentiles being below simulated trends.

TABLE 1

Final structural tumor growth dynamics model parameters and final model variance parameters

Parameter	Symbol	Description	Final model: estimate	Final model: shrinkage	Bootstrap: median	Bootstrap: mean	Bootstrap: RSE (%)	Bootstrap: 95% CI
Structural model parameters
Kg, 1/year	ϴ₁	Tumor growth rate	0.271	NA	0.288	0.293	11.9	0.249, 0.378
BASE, mm	ϴ₂	Baseline tumor size	27.0	NA	26.8	26.8	5.43	23.9, 29.6
Slope, 1/mm	ϴ₃	Slope of proportionality between Kd and Kg	−18.8	NA	−7.30	−12.6	115	−50.2, −0.814
Intercept, 1/year*mm	ϴ₄	Intercept	5.18	NA	2.21	3.73	111	0.415, 14.3
Kd, 1/year*mm	ϴ₃*Kg+ ϴ₄	Tumor deceleration rate	0.0852	NA	0.116	0.117	21.4	0.0756, 0.170
Covariate effect parameters
Kg_Tdiag	ϴ₅	Time from diagnosis effect on Kg‐Kd	−0.00291	NA	−0.00404	−0.00857	150	−0.0408, 0.0104
Kg_Livermet	ϴ₆	Liver metastasis effect on Kg‐Kd	0.0164	NA	0.0381	0.101	111	0.00351, 0.387
BASE_Nmet	ϴ₇	No. of metastatic sites effect on BASE	0.143	NA	0.146	0.146	15.7	0.104, 0.193
BASE_Rnon‐PD/PR	ϴ₈	Re‐baseline non‐PD/PR effect on BASE	−0.0769	NA	−0.0778	−0.0660	167	−0.249, 0.181
BASE_RSD	ϴ₉	Re‐baseline stable disease effect on BASE	0.644	NA	0.651	0.659	20.5	0.423, 0.941
Interindividual variance parameters
IIV‐Kd‐Kg	Ω_(1,1)		0.00163 [CV% = 4.04]	32.7	0.00865	0.0936	131	0.000220, 0.382
IIV‐BASE	Ω_(2,2)		0.659 [CV% = 96.6]	1.85	0.658	0.659	5.91	0.585, 0.735
IIV‐Kd	Ω_(1,1)*ϴ₃²		0.576 [CV% = 88.3]	100	0.509	0.522	30.7	0.204, 0.880
Interindividual covariance parameters
Kd‐Kg	Ω_1,3		−0.031 [Corr = 1.00]	NA	0.0658	0.154	92.6	0.0109, 0.429
Residual variance
Proportional	Σ_(1,1)		0.0505 [CV% = 22.5]	14.8	0.0486	0.0489	13.5	0.0365, 0.0624

Kd was estimated using Kg, slope, and intercept as follows: Kd = slope*Kg+intercept.

The CI was determined from the 2.5th and 97.5th percentiles of the nonparametric bootstrap (n = 1000) estimates.

Abbreviations: CI, confidence interval; Corr, correlation coefficient; CV, coefficient of variation; IIV, interindividual variance; NA, not applicable; PD, progressive disease; PR, partial response; RSE, relative standard error.

CV% of omegas = $\sqrt{\exp(\Omega) - 1} \times 100$.

CV% of sigma = $\sqrt{\Sigma} \times 100$.
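As a quick arithmetic check on Table 1, the derived quantities can be recomputed from the reported point estimates. The sketch below is illustrative only. The function names are hypothetical, and the log-normal CV% formula, the proportional-error CV% formula, and the relation IIV-Kd = Ω(1,1)·ϴ₃² are assumptions chosen because they reproduce the tabulated values. Only Kd = slope*Kg + intercept is stated explicitly in the table notes.

```python
import math

# Point estimates copied from the "Final model" column of Table 1.
KG = 0.271          # tumor growth rate (theta1)
SLOPE = -18.8       # slope linking Kd to Kg (theta3)
INTERCEPT = 5.18    # intercept of the Kd-Kg relation (theta4)
OMEGA_11 = 0.00163  # interindividual variance, Kd-Kg
OMEGA_22 = 0.659    # interindividual variance, BASE
SIGMA_11 = 0.0505   # proportional residual variance


def kd_from_kg(kg, slope=SLOPE, intercept=INTERCEPT):
    """Kd = slope * Kg + intercept, as given in the table notes."""
    return slope * kg + intercept


def cv_percent_lognormal(variance):
    """Assumed convention: CV% = 100 * sqrt(exp(variance) - 1)."""
    return 100.0 * math.sqrt(math.exp(variance) - 1.0)


def cv_percent_proportional(variance):
    """Assumed convention: CV% = 100 * sqrt(variance)."""
    return 100.0 * math.sqrt(variance)


kd = kd_from_kg(KG)
# Assumption: the variance of Kd follows from the linear link, Var = Omega11 * slope^2.
iiv_kd = OMEGA_11 * SLOPE ** 2

print(f"Kd            = {kd:.4f}")
print(f"IIV-Kd        = {iiv_kd:.3f}")
print(f"CV% (Kd-Kg)   = {cv_percent_lognormal(OMEGA_11):.2f}")
print(f"CV% (BASE)    = {cv_percent_lognormal(OMEGA_22):.1f}")
print(f"CV% (Kd)      = {cv_percent_lognormal(iiv_kd):.1f}")
print(f"CV% (residual)= {cv_percent_proportional(SIGMA_11):.1f}")
```

Under these assumptions the script returns Kd ≈ 0.0852 and CV% values of about 4.04, 96.6, 88.3, and 22.5, matching the bracketed entries in Table 1.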

Covariate effects and uncertainty in parameter estimates were presented using forest plots, with ratios and 95% CIs constructed using 1000 bootstrap parameter sets over the reference value from the model fit for each fixed covariate effect (Figure S14). Additionally, a forest plot displaying the relative change in tumor shrinkage was constructed using the same method. The reference patient had: three metastatic sites, no liver metastasis, and treatment 53 days after diagnosis. The time duration for the tumor shrinkage calculation was 133 days, which was the approximate time during maintenance when 50% or more of the initial patient population remained in the study. Consistent with previous observations, forest plots showed high uncertainty on Kg/Kd effects. Figure 6 shows simulated tumor profiles over time for the reference patient compared with patients having selected characteristics.
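The bootstrap summaries described here (median, mean, RSE, percentile CI, and ratios to a reference value) can be sketched as follows. This is a minimal illustration, not the study's code. The helper names `percentile` and `bootstrap_summary` are hypothetical, the RSE is assumed to be the bootstrap standard deviation divided by the absolute bootstrap mean, and the input values are synthetic draws rather than trial results.

```python
import random
import statistics


def percentile(sorted_values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    rank = (len(sorted_values) - 1) * p / 100.0
    lo = int(rank)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = rank - lo
    return sorted_values[lo] + frac * (sorted_values[hi] - sorted_values[lo])


def bootstrap_summary(estimates, reference):
    """Summarize bootstrap estimates of one parameter.

    `reference` is the point estimate from the model fit; the ratio CI is
    the percentile CI divided by that reference, as used for a forest plot.
    """
    ordered = sorted(estimates)
    mean = statistics.fmean(ordered)
    low, high = percentile(ordered, 2.5), percentile(ordered, 97.5)
    # sorted() keeps the ratio interval ordered even for a negative reference.
    ratio_ci = tuple(sorted((low / reference, high / reference)))
    return {
        "median": statistics.median(ordered),
        "mean": mean,
        # Assumption: RSE% = 100 * SD / |mean| of the bootstrap estimates.
        "rse_percent": 100.0 * statistics.stdev(ordered) / abs(mean),
        "ci95": (low, high),
        "ratio_ci95": ratio_ci,
    }


# Synthetic stand-in for 1000 bootstrap estimates of a covariate effect.
rng = random.Random(0)
synthetic = [rng.gauss(0.146, 0.023) for _ in range(1000)]
summary = bootstrap_summary(synthetic, reference=0.143)
for key, value in summary.items():
    print(key, value)
```

A ratio interval that is wide or that spans 1 by a large margin corresponds to the kind of high uncertainty reported for the Kg/Kd effects.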

FIGURE 6

Simulated tumor profiles over time for identified covariate effects. Blue: Reference patient with three metastatic sites, no liver metastasis, time since diagnosis of 53 days, and re‐baseline response of complete response. Red: Simulated scenarios as described in each panel. The 95% prediction intervals using the nonparametric bootstrap results are presented. CR, complete response; PR, partial response; SD, stable disease; SLD, sum of the longest diameter of target lesions.

DISCUSSION

This longitudinal pharmacometric analysis did not identify any significant treatment effects of avelumab versus chemotherapy in the maintenance treatment of advanced GC/GEJC, consistent with the primary analysis of the JAVELIN Gastric 100 trial. Disease models of OS and TGD were developed by integrating covariates efficiently informed by ML methods, and covariates potentially prognostic of OS and TGD were identified. However, no predictive factors associated with OS or TGD during avelumab treatment were found. The analyses presented provide an example of incorporating ML approaches into a traditional pharmacometric workflow. Specifically, separate RF models were used to identify prognostic factors for OS and tumor size end points, which were then added to parametric TTE and population TGD models, respectively. Supplementing ML with parametric methods resulted in more‐interpretable final models than the RF alone, particularly for noisy time‐varying covariates and given the multistage trial design (induction and maintenance phases). Furthermore, in comparison to parametric modeling alone and/or using stepwise regression or hypothesis testing, the ML approach was faster and was performed using a single pass over the data. One potential limitation of this workflow arises in the translation of the nonlinear and interacting effects inherent in ML models into parametric forms. We started with linear effects and used diagnostic plots to guide refinement of the model. Alternative approaches could also be considered to guide the initial choice of covariate‐effect relationships, such as using partial dependence or accumulated local effect plots. Most parameters selected by ML exhibited large effects in the parametric model, and those with a smaller effect may also be relevant for future clinical consideration. 
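A partial dependence curve, one of the alternatives mentioned above for choosing the initial shape of a covariate effect, can be computed in a few lines. The sketch below is generic and is not the analysis code used in the study. `toy_model`, the covariate keys `crp` and `age`, and all numbers are hypothetical stand-ins for a fitted RF and its inputs.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, float]


def partial_dependence(
    predict: Callable[[Row], float],
    rows: Sequence[Row],
    feature: str,
    grid: Sequence[float],
) -> List[Tuple[float, float]]:
    """Average prediction when `feature` is forced to each grid value."""
    if not rows:
        raise ValueError("rows must not be empty")
    curve = []
    for value in grid:
        total = 0.0
        for row in rows:
            modified = dict(row)       # copy so the input rows stay untouched
            modified[feature] = value  # overwrite only the feature of interest
            total += predict(modified)
        curve.append((value, total / len(rows)))
    return curve


def toy_model(row: Row) -> float:
    """Hypothetical risk score: flat below a CRP threshold, linear above it."""
    return max(0.0, row["crp"] - 5.0) + 0.1 * row["age"]


patients = [
    {"crp": 2.0, "age": 50.0},
    {"crp": 8.0, "age": 60.0},
    {"crp": 12.0, "age": 70.0},
]
pd_curve = partial_dependence(toy_model, patients, "crp", [0.0, 5.0, 10.0, 15.0])
for value, average in pd_curve:
    print(f"crp={value:5.1f}  average prediction={average:.2f}")
```

In this toy case the curve is flat up to the threshold and rises afterwards, which would argue against starting from a purely linear covariate effect in the parametric model.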
For several parameters selected by ML, the effect on the estimated mean was small; these covariates affected the tail or variance of event times rather than the mean, median, or other measures of central tendency. However, considering the number of covariates screened, the possibility of misspecification arising from random variability in the data, especially for smaller covariate effects, cannot be ruled out. Accordingly, the results presented here should be considered hypothesis generating. Nevertheless, the longitudinal models developed using covariates for OS and TGD provide a quantitative framework that can be leveraged as a disease model for GC in the maintenance setting. In conjunction, the parametric model estimated linear relationships, and no model misspecification was evident based on diagnostic plots. Strong, very nonlinear relationships could have been important predictors in ML models for a small subset of patients with extreme values and would have manifested as tail effects. The assessment of which covariates were stable over time was based on models that evaluated linear trends in time. Although this approach can identify linear and monotonic nonlinear trends, it is possible that nonmonotonic longitudinal trends were missed. Based on the study design and relatively sparse collection of data (median number of observations ranged from 4 to 8 across 19 time‐varying covariates), the ability to detect nonmonotonic longitudinal trends was limited. Future applications of this methodology should consider the possibility of identifying nonmonotonic longitudinal trends. Time‐invariant (older age, higher GGT levels, absence of peritoneal carcinomatosis, complete or partial response at re‐baseline, and re‐baseline ECOG PS of 0) and time‐varying (lower neutrophil‐lymphocyte ratio, lower LDH, lower CRP, and higher albumin) covariates predicting longer OS were identified. 
Age, CRP, LDH, and neutrophil‐lymphocyte ratio have been reported previously as strong prognostic biomarkers in patients with solid tumors. In contrast to our results, GGT has been reported previously as a marker for poor prognosis in patients with GC. Clinical trial simulations suggested a benefit with avelumab treatment at milestone survival times greater than 1 year; however, these differences were estimated to be less than 10% and were not considered clinically meaningful, and the probability of exceeding the 10% threshold was small at all landmark times considered. Predicted differences beyond the median OS (e.g., 2 years) were driven primarily through the treatment effect on the log‐logistic shape parameter. In the log‐logistic model, the shape parameter influences the variance and, hence, the tails of the survival distribution. The inclusion of a treatment effect on the shape parameter was necessary to characterize the data, even after incorporating the effects of time‐varying covariates. This further suggests that no factor included in the parametric model was sufficient to identify a subset of patients likely to survive longer with avelumab versus chemotherapy. Tumor growth inhibition was also modeled with a combination of parametric models and ML. The analysis was limited by the modest change in tumor size during maintenance treatment and the high percentage of patients who discontinued before median tumor shrinkage in the population data set was observed. Furthermore, because of the shared IIV for tumor growth and deceleration rate in the TGD model, interpretation of some identified effects was complex. The negative slope relating Kg and Kd indicates that effects reducing Kg will increase Kd, with an overall effect on tumor shrinkage that is greater than expected compared with an isolated effect on Kg. Tumor size during maintenance was stable. 
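The role of the log‐logistic shape parameter described above can be illustrated with the standard survival function S(t) = 1 / (1 + (t/scale)^shape), for which the median equals the scale regardless of shape. The sketch below assumes this common parameterization, which may differ from the one used in the study software, and the scale and shape values are hypothetical rather than estimates from the trial.

```python
def loglogistic_survival(t, scale, shape):
    """Log-logistic survival probability at time t (scale = median)."""
    if t < 0 or scale <= 0 or shape <= 0:
        raise ValueError("t must be >= 0 and scale, shape must be > 0")
    return 1.0 / (1.0 + (t / scale) ** shape)


SCALE_MONTHS = 12.0        # hypothetical median survival, same in both cases
SHAPES = (2.0, 1.5)        # hypothetical shape values for two groups
TIMES = (6.0, 12.0, 24.0)  # before, at, and after the median

for shape in SHAPES:
    probabilities = [loglogistic_survival(t, SCALE_MONTHS, shape) for t in TIMES]
    formatted = ", ".join(f"S({t:g})={p:.3f}" for t, p in zip(TIMES, probabilities))
    print(f"shape={shape}: {formatted}")
```

With a common scale, both curves pass through 0.5 at the median, while the smaller shape gives lower survival before the median and higher survival after it. A group difference carried only by the shape parameter therefore appears in the tails rather than in the median.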
Although time since diagnosis was identified by ML as a covariate on the tumor growth rate constant, simulations from the resulting parametric TGD model that included all ML‐identified covariates did not reveal a meaningful association, indicating that the effect was likely not clinically relevant. Liver metastasis was the baseline characteristic identified by ML as a predictor of the tumor growth and deceleration rate constants. The reduced tumor shrinkage associated with the presence of liver metastasis, which was more marked in the avelumab arm, is consistent with previous reports of reduced efficacy with immunotherapy in patients with liver metastases. Number of metastatic sites and stable disease at re‐baseline were associated with baseline tumor size. JAVELIN Gastric 100 was a multiregional clinical trial, including countries in the Eastern Asian region (Japan, Republic of Korea, Taiwan, and Thailand) where GC has its highest prevalence. An important finding was that Asian versus non‐Asian region was not identified as a covariate in OS or TGD models. Assessment of conservation of disease‐related intrinsic and extrinsic factors is an important consideration when applying International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use guidelines E5 and E17 for ethnic sensitivity assessment to support Asia‐inclusive clinical development strategies. The results of our analyses indicate the lack of discernible differences in disease progression or outcomes between Asian and non‐Asian populations and are valuable to inform the design of Asia‐inclusive trials in GC in the postinduction setting. In conclusion, our analyses established an innovative workflow supporting ML‐enabled pharmacometric modeling of OS and TGD. 
No significant treatment effect on OS was found within the JAVELIN Gastric 100 population, consistent with the primary analysis; thus, no subpopulation for which avelumab was superior to chemotherapy was identified. However, a disease model for GC in the postinduction/maintenance setting was developed, and potential prognostic factors for both OS and TGD were identified. These require further confirmation but may inform future studies in this setting.

CONFLICTS OF INTEREST

N. Terranova is an employee of Merck Institute of Pharmacometrics, Lausanne, Switzerland, an affiliate of Merck KGaA, Darmstadt, Germany. J. French reports personal fees from the healthcare business of Merck KGaA, Darmstadt, Germany during the conduct of the study. H. Dai is an employee of EMD Serono. M. Wiens reports personal fees from the healthcare business of Merck KGaA, Darmstadt, Germany during the conduct of the study. A. Khandelwal is an employee of the healthcare business of Merck KGaA, Darmstadt, Germany. A. Ruiz‐Garcia reports personal fees from the healthcare business of Merck KGaA, Darmstadt, Germany during the conduct of the study. J. Manitz is an employee of EMD Serono. A. von Heydebreck is an employee of and reports stock ownership in the healthcare business of Merck KGaA, Darmstadt, Germany. M. Ruisi was an employee of EMD Serono when the study was conducted. K. Chin was an employee of EMD Serono when the study was conducted. P. Girard is an employee of Merck Institute of Pharmacometrics, Lausanne, Switzerland, an affiliate of Merck KGaA, Darmstadt, Germany. K. Venkatakrishnan is an employee of EMD Serono. As an Associate Editor for CPT: Pharmacometrics & Systems Pharmacology, Jonathan French was not engaged in the review or decision process for this paper.

AUTHOR CONTRIBUTIONS

N.T., H.D., J.M., M.W., K.V., J.F., A.R.‐G., and A.vH. wrote the manuscript. N.T., M.R., H.D., J.M., P.G., M.W., K.V., J.F., A.R.‐G., A.K., A.vH., and K.C. designed the research. N.T., M.R., H.D., M.W., K.V., J.F., A.R.‐G., A.vH., and K.C. performed the research. N.T., M.R., H.D., J.M., M.W., J.F., A.R.‐G., A.K., A.vH., and K.C. analyzed the data.
