Joske Ubels, Pieter Sonneveld, Erik H van Beers, Annemiek Broijl, Martin H van Vliet, Jeroen de Ridder.
Abstract
Many cancer treatments are associated with serious side effects, while they often benefit only a subset of patients. Therefore, there is an urgent clinical need for tools that can aid in selecting the right treatment at diagnosis. Here we introduce simulated treatment learning (STL), which enables prediction of a patient's treatment benefit. STL uses the idea that patients who received different treatments, but have similar genetic tumor profiles, can be used to model their response to the alternative treatment. We apply STL to two multiple myeloma gene expression datasets, containing different treatments (bortezomib and lenalidomide). We find that STL can predict treatment benefit for both: a twofold progression-free survival (PFS) benefit is observed for bortezomib in 19.8% of the patients and a threefold PFS benefit for lenalidomide in 31.1% of the patients. This demonstrates that STL can derive clinically actionable gene expression signatures that enable a more personalized approach to treatment.
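The core idea in the abstract, comparing a patient with genetically similar patients who received the other treatment, can be illustrated with a minimal sketch. This is not the authors' GESTURE algorithm: the function name `neighbour_benefit`, the Euclidean distance, the choice of `k`, and the toy cohort are all assumptions made for illustration, and censoring of survival times is ignored.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two expression vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighbour_benefit(patient, cohort, k=2):
    """PFS of `patient` minus the mean PFS of the k most similar patients
    who received the other treatment (hypothetical helper, ignores censoring)."""
    others = [p for p in cohort if p["arm"] != patient["arm"]]
    nearest = sorted(others, key=lambda p: euclidean(p["expr"], patient["expr"]))[:k]
    mean_other = sum(p["pfs"] for p in nearest) / len(nearest)
    return patient["pfs"] - mean_other

# Invented toy data: two-gene expression profiles and PFS in months.
cohort = [
    {"id": "p1", "arm": "bortezomib", "expr": [1.0, 0.2], "pfs": 30.0},
    {"id": "p2", "arm": "control", "expr": [1.1, 0.1], "pfs": 12.0},
    {"id": "p3", "arm": "control", "expr": [0.9, 0.3], "pfs": 14.0},
    {"id": "p4", "arm": "control", "expr": [5.0, 4.0], "pfs": 40.0},
]

print(neighbour_benefit(cohort[0], cohort, k=2))
```

In this toy cohort the two nearest control patients to `p1` are `p2` and `p3`; the dissimilar patient `p4` does not enter the comparison, which is the point of restricting the comparison to similar tumor profiles.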
Year: 2018 PMID: 30054467 PMCID: PMC6063966 DOI: 10.1038/s41467-018-05348-5
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Fig. 1 Illustration of the difference between prognostic and predictive classifiers and an overview of the approach. a Example of the Kaplan–Meier curve for a prognostic classifier. b Example of the Kaplan–Meier curve for a predictive classifier. c Division of dataset into training and test sets. D1–D3 are all used once to validate the classifier trained on the remaining two-thirds of data. d Flow of the GESTURE algorithm. In step 1 the prototypes with a longer than expected survival difference are identified on fold A. In step 2 the number of prototypes and corresponding decision boundary used in the classifier are optimized on fold B. In step 3 the performance of the classifier on fold C across all repeats is used to select the combination of gene sets to be used in the final classifier. In step 4 a classifier for these gene sets is defined on all training data. This classifier will be validated on the fold D not included in the training data.
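The data division described in panels c and d can be sketched as follows. This is a rough illustration only: the helper names `split_folds` and `outer_splits`, the random seeds, and the assumption that folds A, B and C are equal-sized parts of the training data are not taken from the source.

```python
import random

def split_folds(ids, n_folds, seed):
    """Shuffle ids reproducibly and deal them into n_folds near-equal folds."""
    rng = random.Random(seed)
    shuffled = list(ids)
    rng.shuffle(shuffled)
    return [shuffled[i::n_folds] for i in range(n_folds)]

def outer_splits(ids, seed=0):
    """Yield (train, test) pairs in which each of D1-D3 is the test set once
    and the remaining two-thirds of the data form the training set."""
    folds = split_folds(ids, 3, seed)
    for i, test in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

patient_ids = list(range(30))  # placeholder patient identifiers
for n, (train, test) in enumerate(outer_splits(patient_ids), start=1):
    # Assumed inner split: fold A (find prototypes), fold B (tune the
    # decision boundary), fold C (select gene sets), as in panel d.
    fold_a, fold_b, fold_c = split_folds(train, 3, seed=n)
    print(f"D{n}: train={len(train)} test={len(test)} "
          f"A={len(fold_a)} B={len(fold_b)} C={len(fold_c)}")
```

Keeping the held-out fold entirely outside the inner A/B/C split is what allows it to serve as an independent validation of the final classifier.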
Classification accuracy in cross-validation and HR in independent validation for the classifiers trained on labels based on the top 25% surviving bortezomib patients and the bottom 25% non-bortezomib patients
| | Classification accuracy | Validation HR | p value |
|---|---|---|---|
| | 0.58 (std. dev.: 0.07) | 0.96 (95% CI: 0.57–1.60) | 0.86 |
| | 0.68 (std. dev.: 0.03) | 0.95 (95% CI: 0.54–1.68) | 0.87 |
| | 0.81 (std. dev.: 0.06) | 0.81 (95% CI: 0.31–2.13) | 0.67 |
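The final column of the table above is consistent with two-sided p values for the validation HR. As a hedged check, the sketch below recovers an approximate Wald p value from each reported HR and 95% CI. The source does not state which test was used, so the Wald assumption and the function name `wald_p_from_ci` are illustrative, and small differences from the reported values are expected because the published figures are rounded.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p value from a hazard ratio and its 95% CI.

    The standard error of log(HR) is recovered from the CI width on the log
    scale; this assumes the CI is symmetric around log(HR).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

rows = [
    (0.96, 0.57, 1.60, 0.86),
    (0.95, 0.54, 1.68, 0.87),
    (0.81, 0.31, 2.13, 0.67),
]
for hr, lo, hi, reported in rows:
    print(f"HR {hr}: recomputed p = {wald_p_from_ci(hr, lo, hi):.2f}, reported {reported}")
```

All three confidence intervals span 1, which matches p values far above 0.05: none of these classifiers shows a significant treatment effect in validation.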
Classification accuracy in cross-validation and HR in independent validation for the classifiers trained on labels selected from randomly generated classifications with a significant HR under 0.5
| | Classification accuracy | Validation HR | p value |
|---|---|---|---|
| | 0.50 (std. dev.: 0.02) | 0.81 (95% CI: 0.49–1.35) | 0.42 |
| | 0.66 (std. dev.: 0.02) | 0.81 (95% CI: 0.50–1.41) | 0.51 |
| | 0.83 (std. dev.: 0.06) | 1.10 (95% CI: 0.52–2.34) | 0.80 |
Fig. 2 Overview of the bortezomib classifier results and comparison to known markers. a Kaplan–Meier curves for the entire bortezomib dataset, showing an HR of 0.74 (95% CI: 0.61–0.90, p = 0.0029, n = 910) between the treatment arms. b Kaplan–Meier curves of the combined classifications into a “benefit” and “no benefit” class of D1–D3. An HR of 0.50 (95% CI: 0.32–0.76, p = 0.0012, n = 180) is found between the treatment arms in the “benefit” class and an HR of 0.78 (95% CI: 0.63–0.98, p = 0.03, n = 730) in the “no benefit” class. These results show that our method identifies a subgroup, comprising 19.8% of the population (n = 180 out of 910 total), that benefits substantially more from bortezomib treatment than the population as a whole; in the entire population an HR of 0.74 (95% CI: 0.61–0.90, p = 0.0029, n = 910) is found. c The HR found in the “benefit” class (y-axis) when different operating points (x-axis) are used, compared with known predictive and prognostic markers. The gray dotted line indicates the HR found in the entire dataset, without classification. d Relationships between the 31 genes in common between the D1–D3 classifiers. Node size corresponds to how much more often a gene was observed in the selected gene sets than expected. Green nodes indicate that the gene is associated with a p < 0.05. Relationships are inferred from literature with the GeneMANIA[41] algorithm. A purple edge indicates the genes are co-expressed, a green edge indicates a genetic interaction, a red edge a physical interaction, an orange edge a shared protein domain, a dark blue edge indicates colocalization and a light blue edge shows that both genes are annotated to the same pathway.
Fig. 3 Overview of the lenalidomide classifier results. a Kaplan–Meier curves for the entire lenalidomide dataset, showing an HR of 0.59 (95% CI: 0.41–0.84, p = 0.0042, n = 662) between the treatment arms. b Kaplan–Meier curves of the combined classifications into a “benefit” and “no benefit” class of D1–D3. An HR of 0.36 (95% CI: 0.18–0.71, p = 0.0031, n = 206) is found between the treatment arms in the “benefit” class and an HR of 0.71 (95% CI: 0.46–1.10, p = 0.13, n = 456) in the “no benefit” class.
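The subgroup percentages quoted in the abstract follow directly from the class sizes reported in the captions of Figs. 2 and 3. A small worked check, in which the helper name `benefit_fraction` is hypothetical and the counts are the ones stated in the captions:

```python
def benefit_fraction(n_benefit, n_no_benefit):
    """Return (percentage of patients in the benefit class, total patients)."""
    total = n_benefit + n_no_benefit
    return 100.0 * n_benefit / total, total

class_sizes = {
    "bortezomib": (180, 730),    # Fig. 2b: benefit, no benefit
    "lenalidomide": (206, 456),  # Fig. 3b: benefit, no benefit
}
for drug, (n_benefit, n_no_benefit) in class_sizes.items():
    pct, total = benefit_fraction(n_benefit, n_no_benefit)
    print(f"{drug}: {n_benefit}/{total} = {pct:.1f}%")
# bortezomib: 180/910 = 19.8%
# lenalidomide: 206/662 = 31.1%
```

The class sizes also sum to the dataset sizes given in panel a of each figure (910 and 662), so every patient is assigned to exactly one class.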