Matthew Ruffalo1, Petar Stojanov1, Venkata Krishna Pillutla1, Rohan Varma2, Ziv Bar-Joseph3,4.
Abstract
BACKGROUND: Translating in vitro results to clinical tests is a major challenge in systems biology. Here we present a new multi-task learning framework that integrates thousands of cell line expression experiments to reconstruct drug-specific response networks in cancer.
Keywords: LINCS; Machine learning; TCGA
Year: 2017 PMID: 29017547 PMCID: PMC5635550 DOI: 10.1186/s12918-017-0471-8
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Fig. 1 Overview of the multi-task learning method. RNA-Seq data from drug response experiments in different cell lines or cancer types (top) is used to select pathways linking source proteins to DE genes in general protein-protein and protein-DNA interaction networks (second row). Reconstructed networks are constrained by encouraging pathways that are shared across different cancer types leading to a general network (third row) that captures the common pathways activated during the response. In addition to the general network, cell type specific networks are also identified (bottom) and these can help identify tissue specific proteins and explain differences in response of certain cancer types when treated with the same drug.
Comparison of different gene and network analysis methods for the reconstruction of drug response networks
| Control | MT | Cell 1 | Cell 2 | MT-Diff | Diff-Cell1 | Diff-Cell2 |
|---|---|---|---|---|---|---|
| CGC | 13.33 | 8.5 | 8.66 | 6.33 | 4.66 | 6.5 |
| GO | 72.33 | 33.66 | 41.5 | 47.66 | 43.16 | 29.5 |
| Oncogenic | 7.66 | 3 | 3.33 | 10.33 | 10.33 | 4.83 |
| Breast & Prostate | ||||||
| CGC | 14.66 | 8.33 | 10 | 2.66 | 3.33 | 1.83 |
| GO | 77.5 | 70.66 | 64.83 | 18.16 | 25.66 | 18.33 |
| Oncogenic | 8.66 | 4.33 | 5.16 | 2.66 | 2.5 | 2.5 |
| Prostate | ||||||
| CGC | 15 | 10.16 | 10.33 | 3.33 | 2.33 | 3.83 |
| GO | 82.33 | 85.83 | 88.66 | 23 | 26.83 | 18.5 |
| Oncogenic | 11 | 8.33 | 7.66 | 3 | 4.5 | 3.16 |
Values for each gene set and learning method denote the average number of genes (across six drugs) selected by each method which are also contained in the corresponding validation set. MT: multi-task, Cell 1, Cell 2: single task analysis (cell type based) for the two cells. “Diff” columns show genes selected only by differential expression (DE); MT-Diff: DE set for the two cell types, selecting genes that are differentially expressed in both cells, Diff-Cell1/2: cell type specific DE set
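The table entries described above can be reproduced in spirit with a short calculation: for each drug, count the selected genes that also appear in a validation set, then average those counts over drugs. The following is a minimal sketch under stated assumptions; the function name `mean_validation_overlap`, the drug labels, and the placeholder gene symbols are hypothetical and are not taken from the paper.

```python
from statistics import mean


def mean_validation_overlap(selected_by_drug, validation_set):
    """Average, over drugs, of how many selected genes are in the validation set."""
    validation = set(validation_set)
    return mean(len(set(genes) & validation) for genes in selected_by_drug.values())


# Hypothetical toy data: drug names and some gene symbols are placeholders.
selected = {
    "drug_a": ["TP53", "MYC", "GENE_X"],
    "drug_b": ["TP53", "GENE_Y"],
}
cancer_gene_set = {"TP53", "MYC", "KRAS"}
print(mean_validation_overlap(selected, cancer_gene_set))
```

Averaging over drugs yields fractional values, which is why the table reports non-integer gene counts such as 13.33.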
Results for breast cancer, prostate cancer and melanoma
| Gene set | MTL | Breast | Prostate | Melanoma |
|---|---|---|---|---|
| CGC | 28.66 | 22.33 | 22.66 | 21.16 |
| GO | 222 | 179.66 | 209.66 | 189.66 |
| Oncogenic | 14.16 | 9.3 | 14.83 | 4 |
Values for each gene set and learning method denote the average number of genes (across six drugs) selected by each method which are also contained in the corresponding validation set
Fig. 2 A merged network for the output of multi-task learning using data from breast cancer (lightest shade), prostate cancer (medium shade), and melanoma (darkest shade). Top nodes (red shades): Sources. These proteins are either known to interact with the drugs we tested or determined to be sources using the correlation analysis between drug expression response and KO response as described in Methods. Middle nodes (blue shades): Signaling proteins. These proteins are determined to belong to key pathways connecting sources and TFs. Bottom nodes (green shades): TFs. These proteins regulate a large subset of the DE genes in the different cell types following treatment with the drugs being tested. Note that while sources tend to be cell type specific, most signaling and TF proteins are shared between two or all three cell types, indicating that several of the response pathways may be shared between the different cancer types.
Recurrent Genes for Breast Cancer, Prostate Cancer and Melanoma
| Tumor type | Gene | Potential role |
|---|---|---|
| Breast cancer | | Co-activator with Her2 of estrogen receptor (ER) in |
| | | Histone deacetylase involved in the oncogenic tumorigenesis of breast cancer. |
| | | Inhibition of cell death in breast cell lines |
| | | 20q amplified breast cancer |
| | | Associated with increased breast cancer risks |
| | | Androgen-induced gene that promotes proliferation activity of breast cancer cells. |
| Prostate cancer | | Tumoral Prostate Shows Different Expression Pattern |
| | | Blockade of PDGF signaling induced apoptosis in metastatic PCa cells. |
| | | Deletion of MAP3K7 at 6q12-22 is associated with early PSA recurrence |
| | | PCa cell lines have high p62/ levels required for cell survival |
| | | Alien interacts with the human androgen receptor and inhibits prostate cancer cell growth |
| | | Dysregulated expression of plays an important role in prostate cancer progression |
| Melanoma | | In vitro IFN- combination treatment had more potent apoptotic effects |
| | | Loss of TTP represents a key event in the establishment of melanomas |
| | | Overexpression in BL6 murine melanoma cells inhibits the proliferative capacity in vivo |
| | | Diagnostic Immunohistochemical Marker for Synovial Sarcoma |
| | | One of only two DNA repair and replication proteins which are prognostic for melanoma |
Fig. 3 Tissue-specific pathways for prostate cancer. Tissue-specific prostate genes are shown as ellipses and other genes interacting with them are shown as squares. Red: sources; cyan: intermediate nodes; green: target nodes. CUL2 (ranked 14th) and PTPN11 (ranked 30th) were also on our list of prostate-specific genes.
Fig. 4 P-values for survival models fit using mRNA expression of genes in four sets: genes identified by the multi-task learning method for each drug, COSMIC cancer genes, all genes present in mRNA expression data, and single-task genes. For COSMIC, all genes, and single-task genes, 100 random subsets of available genes are chosen; each random subset contains the same number of genes as the multi-task set for a specific drug. Models are fit to a random training set chosen from 80% of patients, risk scores are calculated for training set and validation set samples, and the median risk in the training set is used as a threshold to divide validation set samples into two groups. P-values are computed from the difference in survival between the two groups of validation set samples. a shows results for paclitaxel, b shows docetaxel, c shows doxorubicin.
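The evaluation steps in this caption can be sketched as follows: draw size-matched random gene subsets, split validation samples at the training-set median risk, and test the survival difference between the two groups. This is only an illustrative sketch. The caption does not name the test used, so the log-rank test here is an assumption, and risk scores are taken as given rather than fit by a survival model. All function names are hypothetical.

```python
import math
import random
from statistics import median


def size_matched_subsets(background_genes, n_genes, n_subsets=100, seed=0):
    """Draw random gene subsets, each the same size as the multi-task gene set."""
    rng = random.Random(seed)
    pool = sorted(background_genes)
    return [rng.sample(pool, n_genes) for _ in range(n_subsets)]


def split_by_training_median(train_risk, valid_risk):
    """Mark a validation sample high-risk (True) if it exceeds the training median."""
    threshold = median(train_risk)
    return [r > threshold for r in valid_risk]


def logrank_p_value(times, events, in_group):
    """Two-group log-rank test (assumed test); events[i] is 1 for death, 0 for censoring."""
    observed = expected = variance = 0.0
    for t in sorted({ti for ti, e in zip(times, events) if e}):
        at_risk = [i for i, ti in enumerate(times) if ti >= t]
        n = len(at_risk)
        n1 = sum(1 for i in at_risk if in_group[i])
        d = sum(1 for i in at_risk if times[i] == t and events[i])
        d1 = sum(1 for i in at_risk if times[i] == t and events[i] and in_group[i])
        observed += d1
        expected += d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Upper tail of a chi-square distribution with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))
```

In use, the risk scores would come from a survival model fit on the 80% training split for each gene set, and the resulting p-values for the 100 random subsets would be compared with the p-value for the multi-task gene set.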
Fig. 5 Kaplan-Meier survival curves for the survival analysis described in the “Survival analysis using gene sets from the multi-task framework” section.