Luke W Patten, Patrick Blatchford, Matthew Strand, Alexander M Kaizer.
Abstract
The consistency of reporting results for patient-derived xenograft (PDX) studies is an area of concern. The PDX method commonly starts by implanting a derivative of a human tumor into a mouse, then comparing tumor growth under different treatment conditions. Currently, a wide array of statistical methods (e.g., t-test, regression, chi-squared test) is used to analyze these data, and the choice of method ultimately depends on the outcome chosen (e.g., tumor volume, relative growth, categorical growth). In this simulation study, we provide empirical evidence for the outcome selection process by comparing the performance of both commonly used outcomes and novel variations of common outcomes used in PDX studies. Data were simulated to mimic tumor growth under multiple scenarios, then each outcome of interest was evaluated for 10 000 iterations. Comparisons between different outcomes were made with respect to average bias, variance, type-1 error, and power. A total of 18 continuous, categorical, and time-to-event outcomes were evaluated, with ultimately 2 outcomes outperforming the others: final tumor volume and change in tumor volume from baseline. Notably, the novel variations of the tumor growth inhibition index (TGII), a commonly used outcome in PDX studies, were found to perform poorly in several scenarios, with inflated type-1 error rates and relatively large bias. Finally, all outcomes of interest were applied to a real-world dataset.
Keywords: TGII; mouse model; oncology; outcome selection; patient-derived xenograft (PDX); statistical analysis; translational science
Year: 2022 PMID: 35699330 PMCID: PMC9240739 DOI: 10.1002/ame2.12250
Source DB: PubMed Journal: Animal Model Exp Med ISSN: 2576-2095
Summary of outcomes presented with mathematical formula; notation defined in footnote
| Outcome type | Outcome | Notation |
|---|---|---|
| Continuous “individual” | Final volume | |
| | Final difference | |
| | Final ratio | |
| | AUC (all times) | |
| | AUC (basic) | |
| Continuous “relative” | TGII (group‐level) | |
| | TGII (random pairs) | |
| | TGII (matched pairs #1) | |
| | TGII (matched pairs #2) | |
| | TGII (common denominator) | |
| | Relative difference (group‐level) | |
| | Relative difference (random pairs) | |
| | Relative difference (matched pairs #1) | |
| | Relative difference (matched pairs #2) | |
| | Relative difference (common difference) | |
Note: Y = tumor volume, n = number of observations per group, k = number of nearest neighbors used for matching, i and j = individual observations, t = treated group, c = control group. The k‐nearest neighbors are based on the starting tumor volumes: the k control tumors with the closest starting volumes are matched to each treated tumor.
Abbreviations: AUC, area under the curve; TGII, tumor growth inhibition index.
FIGURE 1 Schematic illustrating how to calculate “individual” outcomes with single tumor data over time (A), calculation of baseline/final measures (B), area under the curve (AUC) for all times (C), and basic AUC using baseline/final measures (D)
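The “individual” outcomes in this schematic can be sketched for a single tumor as follows. The formulas in the code are assumptions, because the notation column of the outcome table is not available here: final difference is taken as final minus baseline volume, final ratio as final divided by baseline, AUC (all times) as the trapezoidal area over every measurement, and AUC (basic) as the trapezoid through the baseline and final measurements only. The function name and the example measurements are hypothetical.

```python
def individual_outcomes(times, volumes):
    """Per-tumor outcomes from one growth curve (assumed definitions)."""
    if len(times) != len(volumes) or len(times) < 2:
        raise ValueError("need at least two paired time/volume measurements")
    baseline, final = volumes[0], volumes[-1]
    # Trapezoidal rule over every consecutive pair of measurements.
    auc_all = sum(
        (times[i + 1] - times[i]) * (volumes[i] + volumes[i + 1]) / 2
        for i in range(len(times) - 1)
    )
    # Single trapezoid that uses only the first and last measurements.
    auc_basic = (times[-1] - times[0]) * (baseline + final) / 2
    return {
        "final_volume": final,
        "final_difference": final - baseline,  # assumed: change from baseline
        "final_ratio": final / baseline,       # assumed: fold change from baseline
        "auc_all": auc_all,
        "auc_basic": auc_basic,
    }


# Hypothetical measurements: days since baseline and tumor volume.
outcomes = individual_outcomes([0, 7, 14, 21], [100.0, 150.0, 250.0, 400.0])
```

Each per-tumor value would then be compared between the treated and control groups, for example as a mean difference.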
FIGURE 2 Schematic illustrating how to calculate select “relative” outcomes with three treatment and three control tumors (A), baseline/final changes by group level summary (B), matched pairs for highlighted treatment tumor with k = 2 (C), and outcomes based on common differences and denominators (D)
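The matched-pairs step relies on k-nearest-neighbors matching on starting tumor volume. A minimal sketch of that matching is given below, assuming absolute distance in starting volume, matching with replacement (a control tumor may be matched to several treated tumors), and ties broken by list order. None of these details are confirmed by the text, and the function name and volumes are hypothetical.

```python
def match_k_nearest_controls(treated_baseline, control_baseline, k):
    """For each treated tumor, return the indices of the k control tumors
    whose starting volumes are closest to its own starting volume."""
    if not 1 <= k <= len(control_baseline):
        raise ValueError("k must be between 1 and the number of control tumors")
    matches = []
    for y_t in treated_baseline:
        # Rank control tumors by absolute distance in starting volume;
        # sorted() is stable, so ties keep their original list order.
        ranked = sorted(
            range(len(control_baseline)),
            key=lambda j: abs(control_baseline[j] - y_t),
        )
        matches.append(ranked[:k])
    return matches


# Three treated and three control tumors, k = 2 (hypothetical volumes).
treated_start = [100.0, 150.0, 210.0]
control_start = [95.0, 160.0, 200.0]
matched = match_k_nearest_controls(treated_start, control_start, k=2)
```

Here the first treated tumor (starting volume 100) is matched to control tumors 0 and 1, whose starting volumes of 95 and 160 are the two closest.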
General summary of performance across 8 simulated scenarios with good (+) results across all scenarios or poor (−) results
| Outcome | Type‐1 error | Power | Bias | Variance |
|---|---|---|---|---|
| Final volume | + | + | + | + |
| Final difference | + | + | + | + |
| Final ratio | + | + | + | − |
| AUC (all) | + | − | + | − |
| AUC (basic) | + | + | + | + |
| TGII | − | − | − | − |
| Relative difference | − | + | + | + |
| Categorical | + | − | Not evaluated | Not evaluated |
| Time to event | + | − | Not evaluated | Not evaluated |
Good (+) type‐1 error rates were defined as less than 6% false positives (i.e., less than 1 percentage point above the nominal 5% level). Good power was defined separately for each scenario as being within 10% of the highest observed power, where the highest observed power was taken from the outcomes with stable type‐1 error rates. Both good bias and good variance were defined as having less than 3% average relative error, having the 95% confidence interval covering 0% error, and having the confidence interval spanning less than 10%. To be represented as good (+) in this table, the outcome must be “good” in all 8 simulated scenarios. Minus signs (−) denote “poor” results, meaning that the outcome failed the “good” criteria in at least one scenario.
The 2 categorical outcomes (binary and the 4‐category RECIST) had type‐1 error rates between 0% and 3% (expecting 5%), which may be a result of extremely poor power for these outcomes.
Summarized for all 5 estimators (more specific results can be found in Supporting Information Tables S1–S3).
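The type‐1 error and power criteria above can be sketched for a single scenario as follows. Two readings are assumptions: “stable” type‐1 error is taken to mean the same below-6% rule, and “within 10%” is read as 10 percentage points rather than a relative 10%. The outcome labels and rates are hypothetical illustrations, not results from the study.

```python
def classify_scenario(results, type1_cap=0.06, power_margin=0.10):
    """Label each outcome good/poor on type-1 error and power for one scenario.

    results maps an outcome label to (type1_error_rate, power), both in [0, 1].
    """
    # Outcomes whose false-positive rate stays under the cap (assumed "stable").
    stable = {name for name, (type1, _) in results.items() if type1 < type1_cap}
    if not stable:
        raise ValueError("no outcome has a stable type-1 error rate")
    # The reference power comes only from the stable outcomes.
    best_power = max(results[name][1] for name in stable)
    return {
        name: {
            "type1_good": name in stable,
            # Assumed reading: within 10 percentage points of the reference.
            "power_good": power >= best_power - power_margin,
        }
        for name, (_, power) in results.items()
    }


# Hypothetical rates for three unnamed outcomes in one scenario.
labels = classify_scenario({
    "outcome A": (0.050, 0.82),
    "outcome B": (0.090, 0.95),
    "outcome C": (0.020, 0.30),
})
```

Under the all-scenarios rule, an outcome would earn a plus sign in the summary table only if the corresponding flag is true in each of the 8 scenarios.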
FIGURE 3 The tumor volumes of the 8 tumors in each treatment group. (A) All measurements taken across the span of the study. (B) Only the first and last measurements for each tumor
Results from univariate testing for all 18 outcomes on real‐world data
| Outcome | Estimate | p value |
|---|---|---|
| Final volume | −1.30 | .996 |
| Final difference | 3.29 | .987 |
| Final ratio | 0.40 | .752 |
| AUC (all times) | −353.18 | .932 |
| AUC (basic) | 1176.58 | .810 |
| TGII (group‐level) | 1.01 | .988 |
| TGII (random pairs) | 0.54 | .359 |
| TGII (matched pairs #1) | 0.16 | .691 |
| TGII (matched pairs #2) | 0.56 | .337 |
| TGII (common denominator) | 0.01 | .984 |
| Relative difference (group‐level) | 3.29 | .988 |
| Relative difference (random pairs) | 3.29 | .987 |
| Relative difference (matched pairs #1) | 52.28 | .748 |
| Relative difference (matched pairs #2) | 52.28 | .748 |
| Relative difference (common difference) | 3.29 | .984 |
| Binary | – | 1.000 |
| Categorical (RECIST) | – | 1.000 |
| Time to event of tumor doubling | 0.12 | .010 |
First, the 5 estimates for the “individual” continuous outcomes are mean differences between treatment groups, where a positive estimate represents larger tumors in the treated group compared with the control. Second, the 5 estimates for TGII directly represent the estimand of interest; values greater than 1 represent larger tumor growth in the treated group compared with the control. Third, the 5 estimates for relative difference are mean differences between treatment groups, which is also the estimand of interest; positive values represent larger tumor growth in the treated group compared with the control. Fourth, estimates are not provided for the categorical outcomes because all tumors had the same fate in both treatment groups. Lastly, the estimate for the time‐to‐event outcome is a hazard ratio for the treatment group compared with the control group; a value less than 1 represents a lower hazard of tumor doubling in the treated group than in the control group.
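To make the reading of the two group‐level “relative” estimates concrete, the sketch below assumes that group‐level TGII is the ratio of mean growth (final minus baseline) in the treated group to mean growth in the control group, and that the group‐level relative difference is the difference of those two means. These definitions are assumptions chosen to agree with the interpretation above (TGII above 1, or a positive relative difference, means more growth in the treated group); the formulas themselves are not reproduced in the text. The function name and volumes are hypothetical.

```python
from statistics import mean


def group_level_relative_outcomes(treated, control):
    """treated/control: lists of (baseline, final) tumor volumes per mouse."""
    # Mean growth from baseline to final measurement in each group.
    growth_t = mean(final - baseline for baseline, final in treated)
    growth_c = mean(final - baseline for baseline, final in control)
    return {
        "tgii": growth_t / growth_c,                 # assumed: ratio of mean growth
        "relative_difference": growth_t - growth_c,  # assumed: difference of mean growth
    }


# Hypothetical volumes in which treated tumors grow half as much as controls.
est = group_level_relative_outcomes(
    treated=[(100.0, 200.0), (120.0, 260.0), (90.0, 150.0)],
    control=[(100.0, 300.0), (110.0, 330.0), (95.0, 275.0)],
)
```

In this example the TGII is 0.5 and the relative difference is −100, both indicating less growth in the treated group. Under these assumed definitions the ratio is undefined when the mean control growth is zero.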