Sushant Patkar1, Assaf Magen1, Roded Sharan2, Sridhar Hannenhalli1.
Abstract
Guilt-by-association codifies the empirical observation that a gene's function is informed by its neighborhood in a biological network. This would imply that when a gene's network context is altered, for instance in a disease condition, so could the gene's function. Although context-specific changes in biological networks have been explored, the potential changes they may induce on the functional roles of genes are yet to be characterized. Here we analyze, for the first time, the network-induced potential functional changes in breast cancer. Using transcriptomic samples for 1047 breast tumors and 110 healthy breast tissues from TCGA, we derive sample-specific protein interaction networks and assign sample-specific functions to genes via a diffusion strategy. Testing for significant changes in the inferred functions between normal and cancer samples, we find several functions to have significantly gained or lost genes in cancer, not due to differential expression of genes known to perform the function, but rather due to changes in the network topology. Our predicted functional changes are supported by mutational and copy number profiles in breast cancers. Our diffusion-based functional assignment provides a novel characterization of a tumor that is complementary to the standard approach based on functional annotation alone. Importantly, this characterization is effective in predicting patient survival, as well as in predicting several known histopathological subtypes of breast cancer.
Year: 2017 PMID: 29190299 PMCID: PMC5708603 DOI: 10.1371/journal.pcbi.1005793
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Overall approach.
The reference gene is depicted by a black circle. The initial static global PIN is projected onto normal and cancer samples based on gene expression, and each function (red and green) is diffused through each PIN. In this case, the reference gene is assigned the green function in normal and the red function in cancer, i.e., the gene gained the red function and lost the green function in cancer.
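The idea in Fig 1 can be illustrated with a small sketch. The projection rule (keep an interaction only when both genes are expressed), the random walk with restart, its parameters, and the toy network are all assumptions made for illustration; the caption does not specify the actual diffusion procedure.

```python
# Sketch only: the projection rule and diffusion scheme are illustrative
# assumptions, not the paper's exact procedure.

def project_network(edges, expressed):
    """Keep only interactions whose two genes are both expressed in the sample."""
    return [(u, v) for u, v in edges if u in expressed and v in expressed]

def diffuse(edges, seeds, restart=0.5, iterations=50):
    """Random walk with restart from the genes annotated with one function."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    nodes = sorted(neighbors)
    active = [s for s in seeds if s in neighbors]
    if not active:
        # No annotated gene is present in this projected network.
        return {n: 0.0 for n in nodes}
    start = {n: (1.0 / len(active) if n in active else 0.0) for n in nodes}
    score = dict(start)
    for _ in range(iterations):
        spread = {n: 0.0 for n in nodes}
        for n in nodes:
            share = score[n] / len(neighbors[n])
            for m in neighbors[n]:
                spread[m] += share
        score = {n: restart * start[n] + (1 - restart) * spread[n] for n in nodes}
    return score

# Hypothetical toy network: reference gene "g" touches a green and a red module.
global_pin = [("g", "a"), ("a", "b"), ("g", "x"), ("x", "y")]
green_seeds, red_seeds = {"a", "b"}, {"x", "y"}

def assigned_function(expressed):
    """Assign to "g" whichever function reaches it with the higher score."""
    pin = project_network(global_pin, expressed)
    green = diffuse(pin, green_seeds).get("g", 0.0)
    red = diffuse(pin, red_seeds).get("g", 0.0)
    return "green" if green > red else "red"

normal_call = assigned_function({"g", "a", "b"})  # red module not expressed
cancer_call = assigned_function({"g", "x", "y"})  # green module not expressed
```

In this toy case the reference gene is expressed in both samples, yet its assigned function switches because its expressed neighborhood changes, which is the situation the caption describes.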
Functions ranked based on functional variability of genes.
Top 10 gained (green) and lost (red) functions are shown, along with Δ, Δ divided (normalized) by the number of genes annotated by the function, the corresponding values after sample shuffling, and the log fold change, which is the log ratio of the average number of expressed genes annotated by f in cancer and normal samples.
| GO ID | Description | Δ | Normalized Δ | Δ (sample shuffling) | Normalized Δ (sample shuffling) | log fold change |
|---|---|---|---|---|---|---|
| GO:0048661 | | 893 | 15.13 | 11 | 0.18 | -0.04 |
| GO:0048010 | | 744 | 10.19 | -28 | -0.38 | -0.01 |
| GO:0051279 | | 740 | 13.21 | 0 | 0 | -0.03 |
| GO:1901983 | | 723 | 12.05 | -10 | -0.16 | -0.04 |
| GO:0000910 | | 527 | 6.84 | -8 | -0.10 | -0.02 |
| GO:0010676 | | 523 | 8.43 | -22 | -0.35 | -0.05 |
| GO:0051291 | | 508 | 5.90 | 12 | 0.13 | -0.03 |
| GO:0042552 | | 394 | 6.67 | -6 | -0.10 | -0.03 |
| GO:2000756 | | 369 | 6.47 | -10 | -0.17 | -0.03 |
| GO:0016575 | | 333 | 5.64 | -17 | -0.28 | -0.01 |
| GO:0006334 | | -310 | -3.13 | -3 | -0.03 | 0.04 |
| GO:0051148 | | -127 | -2.49 | -15 | -0.29 | -0.06 |
| GO:0007032 | | -75 | -1.27 | -2 | -0.03 | -0.007 |
| GO:0018022 | | -65 | -0.91 | -10 | -0.14 | 0.002 |
| GO:0007052 | | -64 | -1.05 | -15 | -0.24 | 0.005 |
| GO:0019886 | | -56 | -0.62 | 2 | 0.02 | 0.003 |
| GO:0016236 | | -53 | -0.71 | -13 | -0.17 | -0.01 |
| GO:2000117 | | -52 | -0.61 | -4 | -0.04 | 0.005 |
| GO:0051437 | | -51 | -0.68 | 3 | 0.04 | 0.006 |
| GO:0031145 | | -43 | -0.57 | 7 | 0.09 | 0.006 |
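The two derived columns can be computed as in the following minimal sketch. The input numbers are hypothetical, and the use of the natural logarithm is an assumption, since the caption does not state the base.

```python
import math

def normalized_delta(delta, n_annotated):
    """Delta divided by the number of genes annotated by the function."""
    return delta / n_annotated

def log_fold_change(expressed_cancer, expressed_normal):
    """Log ratio of the mean number of expressed annotated genes, cancer over normal.
    Natural log is an assumption; the caption does not give the base."""
    mean_cancer = sum(expressed_cancer) / len(expressed_cancer)
    mean_normal = sum(expressed_normal) / len(expressed_normal)
    return math.log(mean_cancer / mean_normal)

# Hypothetical function f with 60 annotated genes and a net gain of 120.
norm = normalized_delta(120, 60)

# Hypothetical per-sample counts of expressed genes annotated by f.
lfc = log_fold_change([44, 46, 48], [50, 50, 50])
```

A large normalized Δ alongside a log fold change close to zero is the pattern the table highlights: the change in assigned genes is not explained by how many annotated genes are expressed.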
Functions ranked based on expression variability of genes.
Top 10 (green) and bottom 10 (red) functions are shown, ranked by the log fold change of expression-based activity (that is, the number of annotated genes present in the corresponding projected PIN), along with Δ, Δ divided (normalized) by the total number of genes annotated by the function, the corresponding values after sample shuffling, and the log fold change.
| GO ID | Description | Δ | Normalized Δ | Δ (sample shuffling) | Normalized Δ (sample shuffling) | log fold change |
|---|---|---|---|---|---|---|
| GO:0006342 | | 38 | 0.74 | -1 | -0.01 | 0.08 |
| GO:0006334 | | -310 | -3.13 | -3 | -0.03 | 0.06 |
| GO:0045814 | negative regulation of gene expression, epigenetic | 17 | 0.25 | -3 | -0.04 | 0.04 |
| GO:0034728 | | 4 | 0.03 | -3 | -0.02 | 0.03 |
| GO:1990138 | | -10 | -0.20 | -7 | -0.14 | 0.03 |
| GO:0016458 | gene silencing | 7 | 0.04 | -6 | -0.03 | 0.03 |
| GO:0031060 | | 19 | 0.37 | -15 | -0.29 | 0.03 |
| GO:0065004 | | 22 | 0.14 | -11 | -0.07 | 0.03 |
| GO:0031047 | | 4 | 0.04 | -7 | -0.07 | 0.02 |
| GO:0071824 | | 8 | 0.04 | -3 | -0.01 | 0.02 |
| GO:1901379 | | 14 | 0.25 | -3 | -0.05 | -0.12 |
| GO:0043266 | | 10 | 0.12 | 1 | 0.01 | -0.12 |
| GO:1904064 | | 309 | 5.15 | 0 | 0 | -0.10 |
| GO:0019229 | | 206 | 3.32 | -1 | -0.01 | -0.10 |
| GO:0001508 | | 2 | 0.03 | 1 | 0.01 | -0.09 |
| GO:0048871 | multicellular organismal homeostasis | -9 | -0.08 | -2 | -0.01 | -0.09 |
| GO:0051148 | | -127 | -2.49 | -15 | -0.29 | -0.09 |
| GO:0034764 | positive regulation of transmembrane transport | 3 | 0.03 | -7 | -0.07 | -0.09 |
| GO:0034767 | positive regulation of ion transmembrane transport | 205 | 2.38 | 5 | 0.05 | -0.09 |
| GO:0050891 | multicellular organismal water homeostasis | -31 | -0.57 | -5 | -0.09 | -0.09 |
Links between functional loss and mutation and deletion CNV.
The Fisher test contingency table showing the distribution of functions with elevated missense mutation frequency (columns 2 and 3) and deletion CNV rates (columns 4 and 5) between lost and gained functions. Mut(f) = 1 denotes significantly higher missense mutation frequency among the genes contributing to functional gain. CNV(f) = 1 has an analogous interpretation for deletion CNV.
| 23 | 709 | 26 | 706 | |
| 27 | 390 | 7 | 410 |
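A Fisher exact test on a 2×2 table like the ones above can be sketched with the standard library alone. The counts are taken from the table in the order printed; because the row labels are not shown, the code refers to rows only by position, and the two-sided p-value convention (summing all tables no more probable than the observed one) is an assumption about how the test was run.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].
    Returns (sample odds ratio, p-value)."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    p_value = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                  if p <= p_obs * (1 + 1e-7))
    odds_ratio = (a * d) / (b * c) if b * c else float("inf")
    return odds_ratio, min(1.0, p_value)

# Missense mutation columns (first row, then second row, as printed).
odds_mut, p_mut = fisher_exact_two_sided(23, 709, 27, 390)

# Deletion CNV columns (first row, then second row, as printed).
odds_cnv, p_cnv = fisher_exact_two_sided(26, 706, 7, 410)
```

An odds ratio below 1 means the first row is depleted for flagged functions relative to the second row, and above 1 means it is enriched.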
Change in functional activity and association with patient survival.
Fisher test contingency table testing whether functional loss or gain is associated with patient survival; β indicates the association of tumor-specific functional activity with survival risk.
| β<0 & p-value ≤ 0.05 | p-value > 0.05 | β>0 & p-value ≤ 0.05 | p-value > 0.05 | |
|---|---|---|---|---|
| 87 | 639 | 6 | 639 | |
| 24 | 373 | 20 | 373 |
Prediction accuracies using diffusion-based functional profiles and annotation-based functional profiles, quantified by AUC-ROC.
The following table displays the AUC estimates of the 7 independent classifiers trained with two different feature sets (diffusion-based functional activity and annotation-based functional activity) for each clinical indicator.
| Clinical Indicator | AUC-Diffusion | AUC-Annotation (Control) | P-value (AUC-Diffusion > AUC-Control) |
|---|---|---|---|
| | 0.87 (95% CI = 0.867–0.882) | 0.81 (95% CI = 0.803–0.823) | <2.2e-16 |
| | 0.73 (95% CI = 0.72–0.746) | 0.65 (95% CI = 0.644–0.67) | <2.2e-16 |
| | 0.74 (95% CI = 0.741–0.756) | 0.73 (95% CI = 0.722–0.739) | <2.2e-16 |
| | 0.73 (95% CI = 0.727–0.742) | 0.75 (95% CI = 0.731–0.748) | 1 |
| | 0.87 (95% CI = 0.866–0.882) | 0.81 (95% CI = 0.803–0.823) | <2.2e-16 |
| | 0.88 (95% CI = 0.882–0.895) | 0.81 (95% CI = 0.811–0.828) | <2.2e-16 |
| | 0.74 (95% CI = 0.736–0.75) | 0.72 (95% CI = 0.719–0.735) | <2.2e-16 |
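For readers unfamiliar with the metric, AUC-ROC can be computed directly as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below uses hypothetical labels and classifier scores; it does not reproduce the classifiers or the confidence intervals reported in the table.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Tied scores count as half a correct pair."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical clinical indicator (1 = subtype present) for six samples.
labels = [1, 1, 1, 0, 0, 0]

# Hypothetical scores from classifiers trained on the two feature sets.
diffusion_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
annotation_scores = [0.9, 0.4, 0.3, 0.5, 0.6, 0.2]

auc_diffusion = auc_roc(labels, diffusion_scores)
auc_annotation = auc_roc(labels, annotation_scores)
```

With these toy scores the diffusion-style classifier ranks 8 of 9 pairs correctly and the annotation-style classifier ranks 5 of 9 correctly.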
Fig 2. Clustering breast cancer samples based on their functional activity profile.
Kaplan-Meier survival curves of patients grouped in the 10 clusters show significant survival differences.
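A Kaplan-Meier curve for one cluster can be estimated as in the sketch below. The follow-up times and event indicators are hypothetical, and the sketch covers only the estimator, not the clustering or the test for survival differences between clusters.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.
    events[i] is 1 for an observed death and 0 for a censored patient."""
    survival, curve = 1.0, []
    for t in sorted(set(times)):
        # Patients still under observation just before time t.
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
    return curve

# Hypothetical follow-up for five patients in one cluster.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients lower the number at risk at later times without producing a step in the curve, which is why the estimate drops only at times 2, 3, and 5 here.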
Fig 3. Diffusion-based functional heterogeneity across clinical subtypes.
The following figure displays the log ratio between the average numbers of genes assigned to each function by diffusion (represented by columns) across samples annotated with a subtype (represented by rows) versus the rest of the samples.
Contingency table.
The table generated after diffusion-based assignment of a function to gene g in each tumor and normal sample.
| | Assigned | Not assigned | Total |
|---|---|---|---|
| Cancer | | | |
| Normal | | | |
| Total | | | |
Contingency table.
The following table is generated to determine if elevated missense (respectively nonsense) mutation frequencies are enriched among functions with net gain (respectively net loss).