Leslie C Timpe, Dian Li, Ten-Yang Yen, Judi Wong, Roger Yen, Bruce A Macher, Alexandra Piryatinska.
Abstract
Approximately 20 drugs have been approved by the FDA for breast cancer treatment, yet predictive biomarkers are known for only a few of these. The identification of additional biomarkers would be useful both for drugs currently approved for breast cancer treatment and for new drug development. Using glycoprotein expression data collected via mass spectrometry, in conjunction with statistical models constructed by elastic net or lasso regression, we quantitatively modeled the responses of breast cancer cell lines to ~90 drugs. Lasso and elastic net regression identified HER2 as a predictor protein for lapatinib, afatinib, gefitinib and erlotinib, which target HER2 or the EGF receptor, thus providing an internal control for the approach. Two additional protein datasets and two RNA datasets were also tested as sources of predictors for modeling drug sensitivity. Protein expression measured by mass spectrometry gave models with higher coefficients of determination than did reverse phase protein array (RPPA) predictor data. Further, cross-validation of the elastic net models shows that, for many drugs, the prediction error is lower when the predictor data come from proteins rather than from mRNA expression measured on microarrays. Drugs that could be modeled effectively include PI3K inhibitors, Akt inhibitors, paclitaxel and docetaxel, rapamycin, everolimus and temsirolimus, gemcitabine and vinorelbine. Strikingly, this modeling approach with protein predictors often succeeds for drugs that are targeted agents, even when the nominal target is not in the dataset.
Keywords: Breast cancer cell lines; Chemotherapy; Elastic net regression; Lasso regression; Mass spectrometry; Statistical modeling; Targeted agents
Year: 2015 PMID: 26516301 PMCID: PMC4621756 DOI: 10.4172/jpb.1000370
Source DB: PubMed Journal: J Proteomics Bioinform ISSN: 0974-276X
Figure 1. The regression model. One or more predictor variables are from the glycoprotein or other dataset.
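For orientation, the elastic net estimate is commonly written as the penalized least-squares problem below. This is the standard textbook form and is not necessarily the exact parameterization the authors used. The symbols are notation assumed here: $y_i$ is the drug sensitivity of cell line $i$, $x_{ij}$ is the expression of predictor $j$ in that cell line, $\lambda \ge 0$ sets the overall penalty strength, and $\alpha \in [0, 1]$ mixes the two penalties.

```latex
(\hat{\beta}_0, \hat{\beta}) = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{2N} \sum_{i=1}^{N} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j \Bigr)^{2}
  + \lambda \Bigl[ \alpha\,\lVert \beta \rVert_1 + \tfrac{1-\alpha}{2}\,\lVert \beta \rVert_2^{2} \Bigr]
```

With $\alpha = 1$ only the $\ell_1$ term remains, which is the lasso; intermediate values of $\alpha$ give the elastic net.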
Figure 2. Drug sensitivity as a function of HER2 expression. Each point corresponds to a cell line. A. Gefitinib. B. Lapatinib. Sensitivity (vertical axis) is the negative common logarithm of GI50, the drug concentration that inhibits proliferation by 50% (ref. 4). The horizontal axis is the common logarithm of the spectral counts, after adding 1 to each value. Red symbols: cell lines that overexpress HER2. Blue symbols: drug-sensitive cell lines that do not overexpress HER2.
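The two axis transformations described in the Figure 2 caption can be sketched in a few lines of Python. The function names are hypothetical, and the concentration unit of GI50 is not stated in the caption, so none is assumed here.

```python
import math


def sensitivity(gi50: float) -> float:
    """Negative common logarithm of GI50; larger values mean a more sensitive cell line."""
    if gi50 <= 0:
        raise ValueError("GI50 must be a positive concentration")
    return -math.log10(gi50)


def log_spectral_count(count: float) -> float:
    """Common logarithm of a spectral count after adding 1, so a count of 0 maps to 0."""
    if count < 0:
        raise ValueError("spectral counts cannot be negative")
    return math.log10(count + 1)
```

Adding 1 before taking the logarithm keeps proteins with zero spectral counts on the plot, since the logarithm of 0 is undefined.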
Figure 3. Comparison of predicted with observed sensitivities for afatinib (BIBW2992). Observed values of the drug sensitivities are plotted on the horizontal axes. A. Fitted values from elastic net modeling are plotted on the vertical axis. The predictors are HER2, SLC7A5, BST2, LAMB1, CTSB, CDH13, TCN1, SUSD2 and A2ML1. B. Lasso model. The four predictor variables from the lasso model are HER2, SLC7A5, BST2 and A2ML1. The fitted values (vertical axis) were constructed with these predictors using ordinary least squares regression. Red symbols: cell lines that overexpress HER2. Blue symbols: drug-sensitive cell lines that do not overexpress HER2.
Figure 4. Frequency distributions of coefficients of determination (R²) for all single-predictor models and all three-predictor models. For each drug the pool of candidate predictors was identified by lasso regression (Supplementary Information Table 4). The best (lowest MSE) one- and three-predictor models were identified using the Leaps and Bounds algorithm [23]. The coefficients of determination were found using ordinary least squares regression.
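The coefficient of determination for a single-predictor model comes from an ordinary least squares fit of sensitivity on one protein's expression. A minimal sketch of that calculation in plain Python follows; the function name is hypothetical, and this is an illustration of the textbook formula, not the authors' code.

```python
from statistics import mean


def r_squared_single_predictor(x, y):
    """R^2 of the least-squares line y = intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or ss_tot == 0:
        raise ValueError("x and y must each vary across observations")
    # Closed-form OLS estimates for one predictor plus an intercept.
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return 1 - ss_res / ss_tot
```

As a toy check with invented numbers, `r_squared_single_predictor([0, 1, 2, 3], [1, 1, 2, 2])` evaluates to 0.8 up to floating-point rounding.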
Table 1. The top twelve single-predictor models for the glycoprotein dataset.
| Drug | Accession Number | Gene Name | R² |
|---|---|---|---|
| Lapatinib | P04626 | HER2 | 0.76 |
| Sigma AKT1,2 | P48960 | CD97 | 0.70 |
| Rapamycin | O14672 | ADAM10 | 0.69 |
| Gefitinib | Q01650 | SLC7A5 | 0.67 |
| GSK2141795 | Q8IWA5 | SLC44A2 | 0.66 |
| Erlotinib | Q01650 | SLC7A5 | 0.66 |
| GSK2126458 | P50897 | PPT1 | 0.65 |
| Ispinesib | P08195 | SLC3A2 | 0.63 |
| GSK1120212 | P08648 | ITGA5 | 0.60 |
| Vorinostat | Q07954 | LRP1 | 0.59 |
| GSK1059615 | P12830 | CDH1 | 0.55 |
| AG1478 | Q01650 | SLC7A5 | 0.53 |
Table 2. Top-performing drugs in cross-validation.
| glycoproteins | R | RNA array | R | RNA seq | R | RPPA | R | MRM | R |
|---|---|---|---|---|---|---|---|---|---|
| AKT inhibitor | 0.86 | Cisplatin | 0.58 | Disulfiram | 0.66 | Lapatinib | 0.79 | Lapatinib | 0.82 |
| Gefitinib | 0.79 | AKT inhibitor | 0.56 | AKT inhibitor | 0.62 | Erlotinib | 0.79 | AKT inhibitor | 0.77 |
| GSK1059868 | 0.57 | TCS 2312 | 0.55 | OlomoucineII | 0.59 | BIBW2992 | 0.68 | Rapamycin | 0.62 |
| GSK2126458 | 0.56 | GSK2119563 | 0.46 | Bosutinib | 0.58 | CPT 11 | 0.67 | Docetaxel | 0.58 |
| Erlotinib | 0.54 | GSK2126458 | 0.44 | GSK1059868 | 0.57 | AKT inhibitor | 0.67 | AG1478 | 0.57 |
| BEZ235 | 0.53 | Erlotinib | 0.42 | GSK461364 | 0.54 | AZD6244 | 0.58 | BIBW2992 | 0.56 |
| Rapamycin | 0.53 | CGC 11047 | 0.4 | GSK2141795 | 0.53 | Everolimus | 0.56 | GSK1070916 | 0.52 |
| GSK2119563 | 0.44 | Fascaplysin | 0.39 | GSK1120212 | 0.51 | NU6102 | 0.54 | PF 3814735 | 0.52 |
| Vorinostat | 0.41 | GSK923295 | 0.36 | Etoposide | 0.48 | LBH589 | 0.49 | Sunitinib | 0.48 |
| Lapatinib | 0.35 | Etoposide | 0.3 | PF 4691502 | 0.48 | Triciribine | 0.47 | PF 4691502 | 0.43 |
| Ispinesib | 0.29 | LBH589 | 0.3 | Gemcitabine | 0.47 | GSK461364 | 0.45 | GSK2119563 | 0.43 |
| AZD6244 | 0.29 | AS 252424 | 0.25 | AZD6244 | 0.39 | GSK2126458 | 0.43 | Tykerb:IGF1R | 0.4 |
Figure 5. Dendrogram of datasets. The cross-validation data, some of which are displayed in Table 2, were used to create a distance matrix, which was then clustered, using the dist() and hclust() functions in R.
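The two steps named in the Figure 5 caption, building a distance matrix and then clustering it, can be sketched in plain Python. This is an analogue rather than the authors' R code: it assumes Euclidean distance and complete linkage, which are the defaults of dist() and hclust() in R, and the caption does not say whether the defaults were used. The function names and the labels in the usage note are hypothetical.

```python
import math


def euclidean_distance_matrix(rows):
    """Symmetric matrix of pairwise Euclidean distances between equal-length rows."""
    n = len(rows)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist_ij = math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            d[i][j] = d[j][i] = dist_ij
    return d


def complete_linkage(dist, labels):
    """Agglomerative clustering; returns the merges as (cluster_a, cluster_b, height)."""
    index = {label: k for k, label in enumerate(labels)}
    clusters = [(label,) for label in labels]  # each cluster is a tuple of labels
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: the largest distance between any two members.
                h = max(dist[index[p]][index[q]]
                        for p in clusters[a] for q in clusters[b])
                if best is None or h < best[0]:
                    best = (h, a, b)
        h, a, b = best
        merges.append((clusters[a], clusters[b], h))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return merges
```

With three invented profiles labeled A, B and C at `[0, 0]`, `[0, 1]` and `[5, 0]`, the first merge joins A and B at height 1.0, and C joins last. The merge heights are what a dendrogram draws on its vertical axis.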