Noam Auslander, Allon Wagner, Matthew Oberhardt, Eytan Ruppin.
Abstract
Altered cellular metabolism is an important characteristic and driver of cancer. Surprisingly, however, we find here that aggregating individual gene expression using canonical metabolic pathways fails to enhance the classification of noncancerous vs. cancerous tissues and the prediction of cancer patient survival. This supports the notion that metabolic alterations in cancer rewire cellular metabolism through unconventional pathways. Here we present MCF (Metabolic classifier and feature generator), which incorporates gene expression measurements into a human metabolic network to infer new cancer-mediated pathway compositions that enhance cancer vs. adjacent noncancerous tissue classification across five different cancer types. MCF outperforms standard classifiers based on individual gene expression and on canonical, human-curated metabolic pathways. It successfully builds robust classifiers integrating different datasets of the same cancer type. Reassuringly, the MCF pathways identified lead to metabolites known to be associated with the corresponding cancer types. Aggregating gene expression through MCF pathways leads to markedly better predictions of breast cancer patients' survival in an independent cohort than using the canonical human metabolic pathways (C-index = 0.69 vs. 0.52, respectively). Notably, the survival predictive power of individual MCF pathways strongly correlates with their power in predicting cancer vs. noncancerous samples. The more predictive composite pathways identified via MCF are hence more likely to capture key metabolic alterations occurring in cancer than the canonical pathways characterizing healthy human metabolism.
Year: 2016 PMID: 27673682 PMCID: PMC5038951 DOI: 10.1371/journal.pcbi.1005125
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Overview of the MCF algorithm.
Fig 2. Comparing the performance of MCF to MGE-SVM across integrated cancer-type datasets.
(A) A bar plot of the AUC (area under the ROC curve) obtained over the combined datasets of the same cancer type using five-fold cross-validation for the MGE-SVM (red bars) and MCF (blue bars) classifiers. Error bars represent one standard deviation, and p-values are from a one-sided, paired-sample t-test on the AUCs of the five folds. (B) and (C) show the receiver operating characteristic (ROC) curves obtained in the classification of the lung and breast cancer combined datasets, respectively.
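The fold-wise comparison described in the Fig 2 caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `auc` and `paired_t_statistic` and the per-fold AUC values are hypothetical, and the AUC is computed with the standard rank-based (pairwise) definition.

```python
from math import sqrt
from statistics import mean, stdev

def auc(labels, scores):
    """AUC as the probability that a random positive sample outscores
    a random negative sample (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def paired_t_statistic(a, b):
    """t statistic of the paired differences a[i] - b[i];
    the degrees of freedom are len(a) - 1."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical per-fold AUCs, for illustration only (not values from the paper).
mcf_auc = [0.95, 0.93, 0.96, 0.94, 0.97]
mge_auc = [0.90, 0.91, 0.92, 0.89, 0.93]

t = paired_t_statistic(mcf_auc, mge_auc)
print(f"paired t statistic over 5 folds: {t:.2f} (df = 4)")
```

The one-sided p-value is the upper-tail probability of this statistic under a t distribution with 4 degrees of freedom. With SciPy available, `scipy.stats.ttest_rel(mcf_auc, mge_auc, alternative="greater")` should give the statistic and p-value directly.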
The target Ts metabolites that MCF selected when it chooses ATP as a seed (↑ denotes increased formation from ATP in cancer and ↓ denotes decreased formation from ATP in cancer, compared to the noncancerous tissue counterpart; see Methods).
The table shows one instance of each selected target, although in some cases the same target metabolite was identified in multiple compartments (e.g. UDP in the cytosol and in the mitochondria).
| Prostate | Breast | Colon | Head & neck | Lung |
|---|---|---|---|---|
| ↑ 3alpha,7alpha,12alpha-Trihydroxy-5beta-cholestanoyl-CoA(S) | ↑ dADP | ↓ O-Acetylcarnitine | ↑ CTP | ↑ Hydroxy-methylglutaryl-CoA |
| ↑ 3alpha,7alpha-Dihydroxy-5beta-cholest-24-enoyl-CoA | ↑ Oxidized thioredoxin | ↑ 5-Phospho-beta-D-ribosylamine | ↑ dATP | ↑ Spermine |
| ↓ 3alpha,7alpha,26-Trihydroxy-5beta-cholestane | ↓ Hydrogen peroxide | ↑ Spermine | ↑ dCTP | ↑ D-Mannose 1-phosphate |
| ↓ 3alpha,7alpha,12alpha-Trihydroxy-5beta-cholestan-26-al | ↓ L-Threonate | ↑ Fumarate | ↑ dGTP | ↑ Deoxycytidine |
| ↓ 7alpha-Dihydroxy-5beta-cholestan-26-al | ↓ Hydrogen peroxide | ↑ GMP | ↑ dITP | ↑ Diphosphate |
| ↓ 3alpha,7alpha,12alpha,26-Tetrahydroxy-5beta-cholestane | ↓ Iodine | ↓ retinoyl glucuronide | ↑ dTTP | ↑ UDP-D-glucuronate |
| ↑ 5-Amino-1-(5-Phospho-D-ribosyl)imidazole-4-carboxamide | ↓ UDP | ↑ Phosphoenolpyruvate | | |
| ↑ Leukotriene B4 | ↓ Oxalate | | | |
Fig 3. MCF pathway utilization predicts the survival of breast cancer patients, while canonical pathways show no such signal.
(A) and (B) show the Kaplan-Meier survival curves for patients predicted by MCF and by canonical pathways, respectively, to have the best and worst prognosis (top and bottom 10% of patients' scores, respectively; see Methods). (C) A scatter plot showing the correlation between the classification accuracy achieved using each individual MCF pathway in the combined breast cancer data from TCGA and GEO, where the pathways were identified (x-axis), and the C-index obtained using each such pathway in predicting patients' survival on the unseen METABRIC data (y-axis). (D) The canonical pathway enrichment of the reactions participating in the MCF composite pathways predictive of survival. The dashed line represents a significance threshold of 0.05 (corrected for multiple hypothesis testing).
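For readers unfamiliar with the C-index reported for survival prediction (0.69 for MCF vs. 0.52 for canonical pathways), the sketch below computes Harrell's concordance index in plain Python. It is an illustration under stated assumptions: the exact C-index variant used in the paper is not given here, the function name `concordance_index` is hypothetical, and the four-patient data are made up.

```python
def concordance_index(times, events, risk):
    """Harrell's C: among comparable patient pairs, the fraction in which
    the patient with the higher risk score has the earlier observed event.
    events[i] is 1 for an observed death and 0 for a censored patient."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i has an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied risk scores count as half
    return concordant / comparable

# Made-up data for illustration only (not from the paper).
times = [5, 8, 12, 20]         # follow-up time
events = [1, 1, 0, 1]          # 1 = event observed, 0 = censored
risk = [0.9, 0.3, 0.5, 0.1]    # higher score = worse predicted prognosis

c = concordance_index(times, events, risk)
print(f"C-index: {c:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is why a C-index of 0.52 indicates essentially no survival signal.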
Summary of the datasets utilized in this work for five cancer types.
N and C denote the numbers of normal and cancerous samples in the data, respectively.
| Cancer type | TCGA designation | TCGA sample count (N/C) | GEO accession | GEO sample count (N/C) |
|---|---|---|---|---|
| Prostate | PRAD | 487/52 | GSE32448 | 40/40 |
| Lung adenocarcinoma | LUAD | 58/490 | GSE19804 | 60/60 |
| Colon | COAD | 41/273 | GSE32323 | 17/17 |
| Head & neck | HNSC | 43/498 | GSE6631 | 22/22 |
| Breast | BRCA | 111/1098 | GSE10780 | 140/42 |