Kaixian Yu, Qing-Xiang Amy Sang, Pei-Yau Lung, Winston Tan, Ty Lively, Cedric Sheffield, Mayassa J Bou-Dargham, Jun S Liu, Jinfeng Zhang.
Abstract
Choosing the optimal chemotherapy regimen is still an unmet medical need for breast cancer patients. In this study, we reanalyzed data from seven independent data sets totaling 1079 breast cancer patients. The patients were treated with one of three commonly used types of neoadjuvant chemotherapy: anthracycline alone, anthracycline plus paclitaxel, or anthracycline plus docetaxel. We developed random forest models with variable selection, using both genetic and clinical variables, to predict a patient's response, with pCR (pathological complete response) as the measure of response. The models were then used to reassign an optimal regimen to each patient to maximize the chance of pCR. An independent validation was performed in which each independent study was left out during model building and later used for validation. The expected pCR rates of our method are significantly higher than the rates of the best treatments for all seven independent studies. A validation study on 21 breast cancer cell lines showed that our predictions agree with their drug-sensitivity profiles. In conclusion, the new strategy, called PRES (Personalized REgimen Selection), may significantly increase response rates for breast cancer patients, especially those with HER2-negative and ER-negative tumors, who will receive one of the widely accepted chemotherapy regimens.
Year: 2017 PMID: 28256629 PMCID: PMC5335706 DOI: 10.1038/srep43294
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
GEO data sets used in the study and number of patients in each data set.
| GEO accession number | Anthracycline (A) | Paclitaxel and Anthracycline (TA) | Docetaxel and Anthracycline (TxA) | Total |
|---|---|---|---|---|
| GSE20194 | 4 (0) | 257 (20.6%) | 8 (12.5%) | 269 (20.1%) |
| GSE20271 | 85 (8.2%) | 91 (20.9%) | — | 176 (14.8%) |
| GSE22093 | 50 (10%) | — | — | 50 (10%) |
| GSE23988 | — | — | 61 (32.8%) | 61 (32.8%) |
| GSE25055 | — | 290 (18.3%) | — | 290 (18.3%) |
| GSE25065 | — | 92 (20.7%) | 88 (26.1%) | 180 (23.3%) |
| GSE42822 | — | — | 53 (37.7%) | 53 (37.7%) |
| Total | 139 (8.6%) | 730 (19.7%) | 210 (30.5%) | 1079 (20.4%) |
Values in parentheses are the percentage of patients with pCR among the patients in the corresponding regimen group. The remaining patients had residual disease (RD). Every patient was placed in one of three regimen groups based on the treatment received: anthracycline alone (A), paclitaxel and anthracycline (TA), or docetaxel and anthracycline (TxA).
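The pooled rates in the Total row can be cross-checked against the per-study entries. The following is a sanity-check sketch, not part of the original analysis: it assumes responder counts can be recovered by rounding group size times the reported percentage, and the names `cohorts` and `pooled_rate` are illustrative.

```python
# (n_patients, reported pCR %) per study and regimen, copied from the table.
cohorts = {
    "A":   [(4, 0.0), (85, 8.2), (50, 10.0)],
    "TA":  [(257, 20.6), (91, 20.9), (290, 18.3), (92, 20.7)],
    "TxA": [(8, 12.5), (61, 32.8), (88, 26.1), (53, 37.7)],
}

def pooled_rate(groups):
    """Return (total patients, pooled pCR % rounded to one decimal)."""
    # Assumption: responder counts are recovered by rounding n * pct / 100.
    responders = sum(round(n * pct / 100) for n, pct in groups)
    total = sum(n for n, _ in groups)
    return total, round(100 * responders / total, 1)

for regimen, groups in cohorts.items():
    print(regimen, pooled_rate(groups))

# Pool every regimen group to check the overall rate.
all_groups = [g for groups in cohorts.values() for g in groups]
print("All", pooled_rate(all_groups))
```

Under that rounding assumption, the pooled values agree with the Total row of the table (8.6%, 19.7%, 30.5%, and 20.4% overall).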
Figure 1. Schematic illustration of our strategy for developing personalized treatment from multiple patient cohorts with different treatments.
All the patients received one of three therapies. Each group has responders and non-responders. From each treatment group, we identify biomarkers and build predictive models for selecting responders to the corresponding treatment. The three models are validated through cross-validation, where each patient is evaluated using a model trained without that patient's information. To assess the overall performance of the three sets of biomarkers and corresponding models, all the patients are evaluated by all three models, and the therapy with the highest predicted probability of pathological complete response (pCR) is assigned to the patient. The expected probability of pCR is calculated and compared with the actual pCR rate, which can be either the average pCR rate of the three regimens or the highest pCR rate of the three regimens. Here every patient is assigned a therapy for comparison purposes, since in reality every patient received one of the three therapies. In practice, patients who are predicted not to respond well to any of the therapies may opt out of all three and try a new therapy instead. Note that the numbers of colored human figures have no actual meaning. In reality, some patients respond to more than one regimen. The responders in this figure represent those who have the best response to the corresponding regimen.
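The assignment step described in this caption can be sketched as follows. This is a minimal illustration, not the authors' implementation: the probabilities are made up, `assign_regimens` is a hypothetical helper, and the expected pCR rate is assumed here to be the mean of each patient's highest predicted probability.

```python
def assign_regimens(predicted):
    """predicted: list of dicts mapping regimen -> predicted pCR probability.

    Returns (assignments, expected_pcr_rate), where each patient is assigned
    the regimen with the highest predicted probability.
    """
    assignments = []
    expected = 0.0
    for probs in predicted:
        best = max(probs, key=probs.get)  # regimen with the highest probability
        assignments.append(best)
        expected += probs[best]
    # Assumption: expected pCR rate = average of the chosen probabilities.
    return assignments, expected / len(predicted)

# Made-up predictions for three hypothetical patients.
patients = [
    {"A": 0.10, "TA": 0.30, "TxA": 0.25},
    {"A": 0.05, "TA": 0.15, "TxA": 0.45},
    {"A": 0.20, "TA": 0.10, "TxA": 0.15},
]
assignments, expected_rate = assign_regimens(patients)
print(assignments, round(expected_rate, 3))
```

In a real pipeline, each patient's three probabilities would come from the per-regimen models, each trained without that patient's data.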
Genes selected for the three regimens. Multiple probes are selected for some genes (e.g. NFIB and H2AFZ).
| Probe Set | Symbol | Description | Chromosome | pCR Status* |
|---|---|---|---|---|
| Anthracycline (A) regimen | ||||
| 218066_at | SLC12A7 | solute carrier family 12 (potassium/chloride transporter), member 7 | 5 | − |
| 210164_at | GZMB | Granzyme B (Granzyme 2, Cytotoxic T-Lymphocyte-Associated Serine Esterase 1) | 14 | + |
| 213211_s_at | TAF6L | TAF6-Like RNA Polymerase II, P300/CBP-Associated Factor (PCAF)-Associated Factor, 65 kDa | 11 | − |
| 214567_s_at | XCL2 | Chemokine (C Motif) Ligand 2 | 1 | + |
| Paclitaxel and anthracycline (TA) regimen | ||||
| 213033_s_at | NFIB | Nuclear Factor I/B | 9 | + |
| 219051_x_at | METRN | Meteorin, Glial Cell Differentiation Regulator | 16 | − |
| 209289_at | NFIB | Nuclear Factor I/B | 9 | + |
| 205225_at | ESR1 | Estrogen Receptor 1 | 6 | − |
| 220425_x_at | ROPN1B | Rhophilin Associated Tail Protein 1B | 3 | + |
| 213032_at | NFIB | Nuclear Factor I/B | 9 | + |
| 204822_at | TTK | TTK Protein Kinase | 6 | + |
| 221253_s_at | TXNDC5 | Thioredoxin Domain Containing 5 (Endoplasmic Reticulum) | 6 | + |
| 208712_at | CCND1 | Cyclin D1 | 11 | − |
| 221872_at | RARRES1 | Retinoic Acid Receptor Responder (Tazarotene Induced) 1 | 3 | + |
| 203693_s_at | E2F3 | E2F transcription factor 3 | 6 | + |
| 204825_at | MELK | Maternal embryonic leucine zipper kinase | 9 | + |
| 206754_s_at | CYP2B7P | Cytochrome P450, Family 2, Subfamily B, Polypeptide 7, Pseudogene | 19 | + |
| Docetaxel and anthracycline (TxA) regimen | ||||
| 203554_x_at | PTTG1 | pituitary tumor-transforming 1 | 5 | + |
| 202107_s_at | MCM2 | minichromosome maintenance complex component 2 | 3 | + |
| 200934_at | DEK | DEK Proto-Oncogene | 6 | + |
| 200853_at | H2AFZ | H2A histone family, member Z | 4 | + |
| 210052_s_at | TPX2 | TPX2, microtubule-associated, homolog (Xenopus laevis) | 20 | + |
| 202825_at | SLC25A4 | Solute Carrier Family 25 (Mitochondrial Carrier; Adenine Nucleotide Translocator), Member 4 | 4 | − |
| 201930_at | MCM6 | minichromosome maintenance complex component 6 | 2 | + |
| 202427_s_at | BRP44 | brain protein 44 | 1 | + |
| 218437_s_at | LZTFL1 | Leucine Zipper Transcription Factor-Like 1 | 3 | − |
| 212695_at | CRY2 | Cryptochrome Circadian Clock 2 | 11 | − |
| 201853_s_at | CDC25B | cell division cycle 25 homolog B (S. pombe) | 20 | + |
| 201695_s_at | PNP | purine nucleoside phosphorylase | 14 | + |
| 208079_s_at | AURKA | Serine/Threonine-Protein Kinase Aurora-A | 20 | + |
| 204159_at | CDKN2C | Cyclin-Dependent Kinase Inhibitor 2 C | 1 | + |
| 202633_at | TOPBP1 | DNA Topoisomerase II-Beta-Binding Protein 1 | 3 | + |
| 207618_s_at | BCS1L | BC1 (Ubiquinol-Cytochrome C Reductase) Synthesis-Like | 2 | − |
| 212055_at | C18orf10 | Tubulin Polyglutamylase Complex Subunit 2 | 18 | + |
| 202951_at | STK38 | Serine/Threonine Kinase 38 | 6 | + |
| 201896_s_at | PSRC1 | Proline and Serine Rich Coiled-Coil 1 | 1 | + |
| 214435_x_at | RALA | V-Ral Simian Leukemia Viral Oncogene Homolog A (Ras Related) | 7 | + |
| 208920_at | SRI | Calcium Binding Protein Amplified In Multidrug-Resistant Cells | 7 | − |
| 204767_s_at | FEN1 | Flap Structure-Specific Endonuclease 1 | 11 | + |
| 210648_x_at | SNX3 | Sorting Nexin 3 | 6 | + |
| 216248_s_at | NR4A2 | Nuclear Receptor Subfamily 4 Group A Member 2 | 2 | − |
| 204900_x_at | SAP30 | Sin3A Associated Protein 30 kDa | 4 | + |
| 204822_at | TTK | Phosphotyrosine Picked Threonine-Protein Kinase | 6 | + |
| 214456_x_at | SAA1/2 | Serum Amyloid A1/2 | 11 | + |
| 203418_at | CCNA2 | Cyclin A2 | 4 | + |
| 207175_at | ADIPOQ | Adiponectin, C1Q And Collagen Domain Containing | 3 | + |
| 221599_at | C11orf67 | Adipogenesis Associated, Mth938 Domain Containing | 11 | − |
*pCR status: “+”, gene expression up-regulated in pCR cases; “−”, gene expression down-regulated in pCR cases.
Figure 2. Observed pCR rates vs. predicted probabilities.
The predicted probabilities for each regimen are divided into 5 equal-length intervals (x-axis). For each interval and each regimen (5 × 3 combinations), the observed pCR rate is calculated by dividing the number of pCR patients by the total number of patients in that interval-regimen combination. In general, the predicted probabilities correlate strongly with the observed pCR rates. However, they differ significantly for some regimens and probability intervals. The three points at each interval, one per regimen, are spread around the interval midpoint for visual clarity. The bars show confidence intervals, and the sizes of the points are proportional to the number of patients in that particular group.
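The binning behind this figure can be sketched for a single regimen as below. The caption does not state the range the intervals cover, so this sketch assumes 5 equal-length intervals on [0, 1]; `calibration_bins` and the toy data are illustrative only.

```python
def calibration_bins(probs, outcomes, n_bins=5):
    """Group predictions into n_bins equal-length intervals on [0, 1].

    Returns a list of (lower, upper, n_patients, observed_rate), where
    observed_rate is None for an interval with no patients.
    """
    counts = [0] * n_bins
    events = [0] * n_bins
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes to the last bin
        counts[idx] += 1
        events[idx] += y  # y is 1 for pCR, 0 for residual disease
    result = []
    for i in range(n_bins):
        rate = events[i] / counts[i] if counts[i] else None
        result.append((i / n_bins, (i + 1) / n_bins, counts[i], rate))
    return result

# Toy predictions and outcomes for one regimen (not from the study).
probs = [0.05, 0.15, 0.25, 0.35, 0.45, 0.90, 1.00]
outcomes = [0, 0, 1, 0, 1, 1, 1]
for lower, upper, n, rate in calibration_bins(probs, outcomes):
    print(f"[{lower:.1f}, {upper:.1f}): n={n}, observed pCR rate={rate}")
```

Running the same function once per regimen gives the 5 × 3 interval-regimen combinations mentioned in the caption; the per-bin patient counts correspond to the point sizes.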
PRES assignment and expected pCR for each study in the independent validation, where each independent data set being tested was left out when training the models.
| Study | TA | TxA | A | pCR rate (%) |
|---|---|---|---|---|
| 20194 original | 257 (20.6%) | 8 (12.5%) | 4 (0) | 20.1 |
| 20271 original | 91 (20.9%) | — | 85 (8.2%) | 14.8 |
| 22093 original | — | — | 50 (10%) | 10 |
| 23988 original | — | 61 (32.8%) | — | 32.8 |
| 25055 original | 290 (18.3%) | — | — | 18.3 |
| 25065 original | 92 (20.7%) | 88 (26.1%) | — | 23.3 |
| 42822 original | — | 53 (37.7%) | — | 37.7 |
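The leave-one-study-out scheme named in the table caption can be sketched as a simple split generator. This is an illustration under assumptions: `leave_one_study_out` is a hypothetical helper, patients are represented by placeholder strings, and model training is omitted.

```python
def leave_one_study_out(records):
    """records: list of (study_id, patient) pairs.

    Yields (held_out_study, train_patients, test_patients), holding out
    each study in turn so no patient from it is used for training.
    """
    studies = sorted({study for study, _ in records})
    for held_out in studies:
        train = [p for s, p in records if s != held_out]
        test = [p for s, p in records if s == held_out]
        yield held_out, train, test

# Placeholder patients; a real record would carry expression and clinical data.
records = [
    ("GSE20194", "p1"),
    ("GSE20194", "p2"),
    ("GSE20271", "p3"),
    ("GSE22093", "p4"),
]
splits = list(leave_one_study_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

For each split, the per-regimen models would be fit on the training patients and used to reassign regimens for the held-out study.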
Figure 3. The expected number of pCR and number of patients assigned to each regimen for the whole dataset and different subpopulations.
Numbers within the bars are the numbers of patients assigned to the corresponding regimens. Numbers in parentheses are the pCR rates for the corresponding regimens. In each sub-figure, the bars on the left show numbers from the original assignment and those on the right show numbers produced by PRES. (a) All patients; (b) HER2-negative; (c) HER2-negative and ER-negative.
Figure 4. The boxplot for predicted probabilities of paclitaxel-sensitive and resistant groups.
The predicted probabilities of pCR for paclitaxel-sensitive cell lines are significantly higher (p-value = 0.0108) than those of the paclitaxel-resistant cell lines.