Nils Ternès, Federico Rotolo, Georg Heinze, Stefan Michiels.
Abstract
Stratified medicine seeks to identify biomarkers or parsimonious gene signatures that distinguish patients who will benefit most from a targeted treatment. We evaluated 12 approaches in high-dimensional Cox models in randomized clinical trials: penalization of the biomarker main effects and biomarker-by-treatment interactions (full-lasso, three kinds of adaptive lasso, ridge+lasso, and group-lasso); dimensionality reduction of the main effect matrix via linear combinations, using principal components analysis (PCA+lasso) or partial least squares (PLS+lasso); penalization of modified covariates or of the arm-specific biomarker effects (two-I model); gradient boosting; and a univariate approach with control of multiple testing. We compared these methods via simulations, evaluating their selection abilities in null and alternative scenarios. We varied the number of biomarkers, of nonnull main effects, and of true biomarker-by-treatment interactions. We also proposed a novel measure evaluating the interaction strength of the developed gene signatures. In the null scenarios, the group-lasso, two-I model, and gradient boosting performed poorly in the presence of nonnull main effects, and performed well in alternative scenarios, also with high interaction strength. The adaptive lasso with grouped weights was too conservative. The modified covariates, PCA+lasso, PLS+lasso, and ridge+lasso performed moderately. The full-lasso and adaptive lassos performed well, with the exception of the full-lasso in the presence of only nonnull main effects. The univariate approach performed poorly in alternative scenarios. We also illustrate the methods using gene expression data from 614 breast cancer patients treated with adjuvant chemotherapy.
Keywords: Biomarker-by-treatment interactions; High-dimensional; Precision medicine; Stratified medicine; Survival; Variable selection
Year: 2016 PMID: 27862181 PMCID: PMC5763402 DOI: 10.1002/bimj.201500234
Source DB: PubMed Journal: Biom J ISSN: 0323-3847 Impact factor: 2.207
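Several of the penalized approaches named in the abstract act on a design matrix that holds the treatment indicator, the biomarker main effects, and the biomarker-by-treatment interaction terms. The following is a minimal sketch of how such a matrix could be assembled. The function name `interaction_design` and the ±0.5 treatment coding are assumptions made for illustration, not details stated in this record.

```python
def interaction_design(biomarkers, treatment):
    """Build rows [t, x_1..x_p, x_1*t..x_p*t], one per patient.

    biomarkers: list of per-patient biomarker lists (length p each)
    treatment:  list of per-patient treatment codes (assumed -0.5 / +0.5)
    """
    rows = []
    for x, t in zip(biomarkers, treatment):
        main_effects = list(x)
        interactions = [xj * t for xj in x]  # biomarker-by-treatment products
        rows.append([t] + main_effects + interactions)
    return rows


# Two hypothetical patients with p = 2 biomarkers each.
X = [[2.0, -1.0], [1.0, 4.0]]
T = [-0.5, 0.5]
design = interaction_design(X, T)
```

With p biomarkers the matrix has 1 + 2p columns, so a lasso-type penalty on the last p columns is what performs the selection of treatment-effect modifiers.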
Design of the simulation study
| Two hundred fifty repetitions per scenario | | Median survival time (years) | | Hazard ratio | | Average censoring probability | |
|---|---|---|---|---|---|---|---|
| (1a) | Complete null | 1.0 | 1.0 | 1.0 | 1.0 | 0.10 | 0.11 |
| (2a) | Treatment effect only | 1.0 | 2.0 | 1.0 | 1.0 | 0.10 | 0.31 |
| (3a) | 10 prognostic markers | 1.0 | 1.0 | 0.5 | 0.5 | 0.30 | 0.30 |
| (4a) | One treatment modifier | 1.0 | 1.0 | 1.0 | 0.5 | 0.10 | 0.15 |
| (5a) | 10 treatment modifiers | 1.0 | 1.0 | 1.0 | 0.5 | 0.10 | 0.29 |
| (6a) | 10 treatment modifiers + | 1.0 | 1.0 | 1.0 | 0.5 | 0.30 | 0.35 |
| | 10 prognostic markers | | | 0.5 | 0.5 | | |
| (1b) | Complete null | 1.0 | 1.0 | 1.0 | 1.0 | 0.11 | 0.10 |
| (2b) | Treatment effect only | 1.0 | 2.0 | 1.0 | 1.0 | 0.11 | 0.31 |
| (3b) | 20 prognostic markers | 1.0 | 1.0 | 0.5 | 0.5 | 0.35 | 0.35 |
| (4b) | One treatment modifier | 1.0 | 1.0 | 1.0 | 0.5 | 0.11 | 0.15 |
| (5b) | 20 treatment modifiers | 1.0 | 1.0 | 1.0 | 0.5 | 0.11 | 0.35 |
| (6b) | 20 treatment modifiers + | 1.0 | 1.0 | 1.0 | 0.5 | 0.35 | 0.39 |
| | 20 prognostic markers | | | 0.5 | 0.5 | | |
| (1c) | Complete null | 1.0 | 1.0 | 1.0 | 1.0 | 0.65 | 0.66 |
| (2c) | Treatment effect only | 1.0 | 2.0 | 1.0 | 1.0 | 0.65 | 0.81 |
| (3c) | 10 prognostic markers | 1.0 | 1.0 | exp ( | exp ( | 0.60 | 0.61 |
| (4c) | One treatment modifier | 1.0 | 1.0 | 1.0 | exp ( | 0.66 | 0.64 |
| (5c) | 10 treatment modifiers | 1.0 | 1.0 | 1.0 | exp ( | 0.66 | 0.59 |
| (6c) | 10 treatment modifiers + | 1.0 | 1.0 | 1.0 | exp ( | 0.61 | 0.58 |
| | 10 prognostic markers | | | exp ( | exp ( | | |
: control arm, : experimental arm, X: biomarker, randomly drawn from U(−0.5, −0.1), randomly drawn from U(−0.7, −0.1).
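To make the link between the median survival times and the hazard ratios in the design table concrete, here is a minimal sketch that assumes exponentially distributed survival times. That distributional choice, the helper names, and the reading of the two median-survival columns as control and experimental arm are all assumptions, since this record does not state them. Censoring is left out.

```python
import math
import random


def exponential_rate(median_years):
    """Hazard rate of an exponential distribution with the given median."""
    return math.log(2) / median_years


def median_from_rate(rate):
    """Median of an exponential distribution with the given hazard rate."""
    return math.log(2) / rate


def simulate_time(median_control, hazard_ratio, rng):
    """Draw one survival time for a patient whose hazard is `hazard_ratio`
    times the control-arm baseline hazard (exponential model assumed)."""
    return rng.expovariate(exponential_rate(median_control) * hazard_ratio)


# A hazard ratio of 0.5 relative to a 1-year control median doubles the median.
treated_median = median_from_rate(exponential_rate(1.0) * 0.5)  # approximately 2.0

rng = random.Random(2016)
one_draw = simulate_time(1.0, 0.5, rng)
```

Under this assumption, a median of 1.0 year in one arm and 2.0 years in the other, as in the "Treatment effect only" rows, corresponds to a treatment hazard ratio of 0.5.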
Proportion of models selecting at least one biomarker‐by‐treatment interaction for all the methods among 250 replications
| | | Univariate | Modified covariates | PCA+lasso | PLS+lasso | Ridge+lasso | Group‐lasso | Two‐I model | Full‐lasso | Alasso (sw) | Alasso (gw) | Alasso (aspw) | Gradient boosting |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Null scenarios | Scenario 1a | 0.07 | 0.39 | 0.38 | 0.36 | 0.39 | 0.48 | 0.41 | 0.01 | 0.14 | 0.00 | 0.42 | 0.68 |
| | Scenario 2a | 0.06 | 0.35 | 0.43 | 0.38 | 0.39 | 0.56 | 0.44 | 0.01 | 0.12 | 0.00 | 0.37 | 0.66 |
| | Scenario 3a | 0.06 | 0.37 | 0.24 | 0.41 | 0.47 | 1.00 | 1.00 | 0.88 | 0.20 | 0.00 | 0.32 | 1.00 |
| | Scenario 1b | 0.06 | 0.38 | 0.35 | 0.32 | 0.38 | 0.52 | 0.40 | 0.01 | 0.12 | 0.00 | 0.36 | 0.68 |
| | Scenario 2b | 0.04 | 0.41 | 0.43 | 0.43 | 0.38 | 0.52 | 0.38 | 0.02 | 0.16 | 0.00 | 0.38 | 0.69 |
| | Scenario 3b | 0.08 | 0.45 | 0.27 | 0.42 | 0.58 | 1.00 | 1.00 | 0.98 | 0.32 | 0.00 | 0.55 | 1.00 |
| | Scenario 1c | 0.05 | 0.56 | 0.57 | 0.50 | 0.56 | 0.56 | 0.40 | 0.03 | 0.25 | 0.00 | 0.51 | 0.73 |
| | Scenario 2c | 0.04 | 0.55 | 0.56 | 0.61 | 0.56 | 0.46 | 0.37 | 0.00 | 0.13 | 0.00 | 0.44 | 0.65 |
| | Scenario 3c | 0.06 | 0.53 | 0.49 | 0.58 | 0.60 | 1.00 | 1.00 | 0.51 | 0.63 | 0.00 | 0.73 | 1.00 |
| Alternative scenarios | Scenario 4a | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.15 | 1.00 | 1.00 |
| | Scenario 5a | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 0.66 | 1.00 | 1.00 |
| | Scenario 6a | 0.59 | 0.90 | 0.80 | 0.94 | 0.99 | 1.00 | 1.00 | 1.00 | 1.00 | 0.55 | 1.00 | 1.00 |
| | Scenario 4b | 1.00 | 1.00 | 1.00 | 0.99 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 0.19 | 1.00 | 1.00 |
| | Scenario 5b | 1.00 | 1.00 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 0.98 | 1.00 | 0.78 | 1.00 | 1.00 |
| | Scenario 6b | 0.43 | 0.78 | 0.64 | 0.90 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 | 0.28 | 1.00 | 1.00 |
| | Scenario 4c | 0.27 | 0.66 | 0.70 | 0.65 | 0.68 | 0.72 | 0.76 | 0.20 | 0.48 | 0.00 | 0.65 | 0.76 |
| | Scenario 5c | 0.74 | 0.94 | 0.93 | 0.89 | 0.98 | 0.99 | 1.00 | 0.73 | 0.96 | 0.10 | 0.97 | 0.99 |
| | Scenario 6c | 0.51 | 0.86 | 0.83 | 0.82 | 0.91 | 1.00 | 1.00 | 0.97 | 0.99 | 0.14 | 0.99 | 1.00 |
Null scenarios: type‐I error or FDR. Alternative scenarios: power.
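Each entry of the table above is a proportion over replications: the share of fitted models in which at least one biomarker-by-treatment interaction was selected. A minimal sketch of that summary follows. The function name and the toy input are hypothetical.

```python
def proportion_with_interaction(selected_per_replication):
    """Share of replications whose model kept at least one interaction.

    selected_per_replication: one collection of selected interaction
    names per replication (an empty collection means none selected).
    """
    n = len(selected_per_replication)
    hits = sum(1 for selected in selected_per_replication if len(selected) > 0)
    return hits / n


# Four hypothetical replications, two of which select something.
toy = [set(), {"g1"}, set(), {"g2", "g3"}]
share = proportion_with_interaction(toy)  # 0.5
```

In a null scenario this share estimates how often a method raises a false alarm; in an alternative scenario it estimates power, as the table note states.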
Selection performance of the methods in alternative scenarios
| | | Univariate | Modified covariates | PCA+lasso | PLS+lasso | Ridge+lasso | Group‐lasso | Two‐I model | Full‐lasso | Alasso (sw) | Alasso (gw) | Alasso (aspw) | Gradient boosting |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Scenario 4a | Selected biomarkers | 4 | 14 | 13 | 14 | 14 | 24 | 18 | 2 | 2 | 0 | 3 | 7 |
| | TP / FP (pFP) | 1 / 3(0) | 1 / 13(0) | 1 / 12(0) | 1 / 13(0) | 1 / 13(0) | 1 / 23(0) | 1 / 18(0) | 1 / 1(0) | 1 / 1(0) | 0 / 0(0) | 1 / 2(0) | 1 / 6(0) |
| | AUPRC | 1.00 | 0.98 | 0.98 | 0.95 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | |
| Scenario 5a | Selected biomarkers | 14 | 43 | 42 | 49 | 49 | 100 | 78 | 20 | 20 | 3 | 18 | 37 |
| | TP / FP (pFP) | 6 / 8(0) | 9 / 34(0) | 9 / 33(0) | 9 / 40(0) | 9 / 40(0) | 10 / 90(0) | 9 / 69(0) | 9 / 11(0) | 9 / 11(0) | 2 / 1(0) | 9 / 9(0) | 10 / 27(0) |
| | AUPRC | 0.53 | 0.63 | 0.61 | 0.64 | 0.68 | 0.71 | 0.78 | 0.78 | 0.78 | 0.81 | 0.68 | |
| Scenario 6a | Selected biomarkers | 2 | 25 | 15 | 29 | 37 | 109 | 99 | 23 | 14 | 1 | 13 | 38 |
| | TP / FP (pFP) | 1 / 1(0) | 4 / 21(1) | 4 / 11(0) | 6 / 23(1) | 7 / 30(1) | 10 / 99(10) | 9 / 90(0) | 9 / 14(0) | 8 / 7(0) | 1 / 0(0) | 8 / 5(0) | 10 / 29(1) |
| | AUPRC | 0.27 | 0.25 | 0.37 | 0.38 | 0.43 | 0.21 | 0.75 | 0.69 | 0.71 | 0.71 | 0.62 | |
| Scenario 4b | Selected biomarkers | 4 | 13 | 13 | 14 | 14 | 25 | 18 | 2 | 2 | 0 | 2 | 7 |
| | TP / FP (pFP) | 1 / 3(0) | 1 / 12(0) | 1 / 12(0) | 1 / 13(0) | 1 / 13(0) | 1 / 24(0) | 1 / 17(0) | 1 / 1(0) | 1 / 1(0) | 0 / 0(0) | 1 / 1(0) | 1 / 6(0) |
| | AUPRC | 1.00 | 0.99 | 0.98 | 0.94 | 0.98 | 1.00 | 0.98 | 0.99 | 0.98 | 0.99 | 0.99 | |
| Scenario 5b | Selected biomarkers | 19 | 55 | 50 | 64 | 62 | 127 | 101 | 26 | 31 | 6 | 26 | 46 |
| | TP / FP (pFP) | 8 / 11(0) | 14 / 41(0) | 14 / 36(0) | 16 / 48(0) | 16 / 46(0) | 20 / 107(0) | 18 / 83(0) | 13 / 13(0) | 15 / 16(0) | 4 / 2(0) | 15 / 12(0) | 16 / 31(0) |
| | AUPRC | 0.42 | 0.45 | 0.45 | 0.49 | 0.51 | 0.53 | 0.63 | 0.63 | 0.62 | 0.65 | 0.51 | |
| Scenario 6b | Selected biomarkers | 1 | 20 | 12 | 30 | 39 | 124 | 110 | 28 | 14 | 1 | 13 | 35 |
| | TP / FP (pFP) | 1 / 1(0) | 4 / 16(1) | 4 / 8(0) | 7 / 22(1) | 9 / 30(1) | 19 / 106(20) | 17 / 93(1) | 13 / 15(1) | 8 / 6(0) | 0 / 0(0) | 8 / 6(0) | 13 / 22(1) |
| | AUPRC | 0.19 | 0.16 | 0.26 | 0.27 | 0.28 | 0.17 | 0.54 | 0.48 | 0.47 | 0.48 | 0.35 | |
| Scenario 4c | Selected biomarkers | 1 | 8 | 8 | 6 | 8 | 12 | 14 | 0 | 1 | 0 | 2 | 4 |
| | TP / FP (pFP) | 0 / 0(0) | 0 / 8(0) | 0 / 8(0) | 0 / 6(0) | 0 / 7(0) | 1 / 12(0) | 1 / 13(0) | 0 / 0(0) | 0 / 1(0) | 0 / 0(0) | 0 / 2(0) | 0 / 4(0) |
| | AUPRC | 0.36 | 0.34 | 0.34 | 0.25 | 0.34 | 0.44 | 0.33 | 0.35 | 0.25 | 0.37 | 0.40 | |
| Scenario 5c | Selected biomarkers | 3 | 28 | 27 | 23 | 32 | 51 | 51 | 4 | 9 | 0 | 8 | 21 |
| | TP / FP (pFP) | 1 / 1(0) | 5 / 23(0) | 5 / 22(0) | 4 / 19(0) | 5 / 27(0) | 7 / 44(0) | 7 / 44(0) | 2 / 2(0) | 4 / 6(0) | 0 / 0(0) | 3 / 5(0) | 5 / 16(0) |
| | AUPRC | 0.27 | 0.29 | 0.32 | 0.27 | 0.33 | 0.44 | 0.33 | 0.35 | 0.27 | 0.35 | 0.35 | |
| Scenario 6c | Selected biomarkers | 2 | 22 | 17 | 19 | 27 | 73 | 75 | 6 | 8 | 0 | 7 | 27 |
| | TP / FP (pFP) | 1 / 1(0) | 3 / 19(0) | 3 / 14(0) | 3 / 16(0) | 4 / 22(1) | 7 / 66(8) | 7 / 68(3) | 3 / 4(0) | 3 / 5(0) | 0 / 0(0) | 3 / 4(0) | 5 / 22(0) |
| | AUPRC | 0.21 | 0.22 | 0.26 | 0.24 | 0.29 | 0.19 | 0.32 | 0.33 | 0.26 | 0.34 | 0.31 | |
TP: true positive, FP: false positive, pFP: prognostic false positive, AUPRC: area under the precision‐recall curve.
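The TP / FP (pFP) entries can be read as counts over the set of biomarkers whose interaction a method selected. The sketch below assumes that a prognostic false positive is a false positive that falls among the purely prognostic biomarkers, so that pFP is counted within FP. This reading fits the table, where TP + FP roughly equals the number of selected biomarkers, but it is an assumption. The function and variable names are hypothetical.

```python
def selection_counts(selected, true_modifiers, prognostic):
    """Return (TP, FP, pFP) for one fitted model.

    selected:       biomarkers whose interaction term was selected
    true_modifiers: biomarkers with a true treatment interaction
    prognostic:     biomarkers with a main effect only (assumed definition)
    """
    selected = set(selected)
    true_modifiers = set(true_modifiers)
    false_positives = selected - true_modifiers
    tp = len(selected & true_modifiers)
    fp = len(false_positives)
    pfp = len(false_positives & set(prognostic))  # subset of the FPs
    return tp, fp, pfp


# Hypothetical model: 2 true modifiers found, 3 false positives,
# one of which is a purely prognostic biomarker.
counts = selection_counts(
    selected=["m1", "m2", "p1", "n1", "n2"],
    true_modifiers=["m1", "m2", "m3"],
    prognostic=["p1", "p2"],
)
```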
Figure 1. False negative rate (FNR) against the false discovery rate (FDR) in alternative scenarios. Average quantities across 250 replications.
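The two rates plotted in Figure 1 can be derived from the selection counts of a single model. The sketch below uses the common definitions FDR = FP / (TP + FP) and FNR = FN / (TP + FN). Whether the figure uses exactly these definitions, and the convention of an FDR of 0 when nothing is selected, are assumptions.

```python
def fdr_fnr(tp, fp, n_true):
    """Return (FDR, FNR) for one model.

    tp, fp: true and false positives among the selected interactions
    n_true: number of true treatment-effect modifiers (must be > 0)
    """
    n_selected = tp + fp
    fdr = fp / n_selected if n_selected > 0 else 0.0  # assumed convention
    fnr = (n_true - tp) / n_true  # missed modifiers over all true modifiers
    return fdr, fnr


# Hypothetical model: 9 of 10 true modifiers selected, plus 11 false positives.
fdr, fnr = fdr_fnr(tp=9, fp=11, n_true=10)
```

Averaging these per-model rates over replications is not the same as computing them from averaged counts, so values derived from rounded table entries would only approximate the plotted points.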
Figure 2. Difference in arm‐specific C‐statistics (ΔC‐statistics) in alternative scenarios in the training and validation set. Vertical lines represent the reduction in ΔC‐statistic from the training set to the validation set. Average quantities across 250 replications.
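A difference in arm-specific C-statistics can be sketched by computing a concordance index separately in each arm for the signature score and subtracting. The version below uses Harrell's C with ties in the score counted as one half. The choice of estimator, the sign convention (experimental minus control), and all names are assumptions rather than the authors' stated definition.

```python
def harrell_c(times, events, scores):
    """Harrell's concordance index; a higher score should mean earlier event.

    Raises ZeroDivisionError if no usable pair exists.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when i has an observed event before time j.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / usable


def delta_c(times, events, scores, arms):
    """C-statistic in the experimental arm (1) minus that in the control arm (0)."""
    def arm_c(arm):
        idx = [k for k, a in enumerate(arms) if a == arm]
        return harrell_c(
            [times[k] for k in idx],
            [events[k] for k in idx],
            [scores[k] for k in idx],
        )
    return arm_c(1) - arm_c(0)


# Hypothetical data: the score orders events perfectly in arm 1 (C = 1)
# and carries no information in arm 0 (all scores tied, C = 0.5).
times = [1, 2, 3, 1, 2, 3]
events = [1, 1, 1, 1, 1, 0]
scores = [3, 2, 1, 1, 1, 1]
arms = [1, 1, 1, 0, 0, 0]
difference = delta_c(times, events, scores, arms)  # 0.5
```

A signature with no interaction strength would discriminate equally well (or equally poorly) in both arms, giving a difference near zero.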
Selected treatment‐effect modifiers and interaction strength of the developed signature in the breast cancer application
| | Number of selected biomarkers | ΔC‐statistics |
|---|---|---|
| | 4 | 0.10 |
| | 21 | 0.09 |
| | 13 | 0.12 |
| | 20 | 0.01 |
| | 39 | 0.04 |
| | 4 | 0.06 |
| | 34 | 0.12 |
| | 0 | 0 |
| | 1 | 0.06 |
| | 0 | 0 |
| | 2 | 0.14 |
| | 8 | 0.18 |