Davide S Sardina, Giovanni Micale, Alfredo Ferro, Alfredo Pulvirenti, Rosalba Giugno.
Abstract
BACKGROUND: The analysis of tissue-specific protein interaction networks and their functional enrichment in pathological and normal tissues provides insights into the etiology of diseases. The Pan-cancer proteomic project, in The Cancer Genome Atlas, collects protein expression data in human cancers and is a reference resource for the functional study of cancers. However, established protocols to infer interaction networks from protein expression data are still missing.
Keywords: Network algorithm; Network inference; Protein expression; Protein interaction network
Year: 2018 PMID: 30066650 PMCID: PMC6069689 DOI: 10.1186/s12859-018-2183-5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 Inference Network Based on iRefIndex Analysis (INBIA) pipeline. We selected 14 inference methods and applied them to the 16 RPPA datasets to obtain PPI predictions (a). Networks are inferred following two approaches: (i) the predictions are compared with the gold standard, iRefIndex, to obtain true positive (TP), false positive (FP), true negative (TN) and false negative (FN) counts, from which the F-measure, a weighted combination of precision and recall, is computed (b). The best method for each cancer type is selected and its associated network is returned (c); (ii) for each cancer type, an ensemble network is created by enumerating all possible PPIs among the genes associated with TCPA proteins and assigning each PPI a score, computed over the ensemble of best methods (BM), equal to the percentage of BMs within the ensemble that predicted that PPI (d). The methods named M1, M2, …, M14 correspond to those reported in Additional file 1: Table S2
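The two scoring steps described in the caption can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of a PPI as an unordered pair of gene symbols, and the default `beta=1.0` (which gives the balanced F1 form of the F-measure) are assumptions made for illustration, and the ensemble score is returned as a fraction rather than a percentage.

```python
from itertools import combinations


def edge(a, b):
    """Represent an undirected PPI as an order-independent key."""
    return frozenset((a, b))


def f_measure(predicted, gold, beta=1.0):
    """F-measure of a predicted edge set against a gold-standard edge set.

    beta=1.0 (an assumption) weights precision and recall equally.
    """
    tp = len(predicted & gold)   # predicted and present in the gold standard
    fp = len(predicted - gold)   # predicted but absent from the gold standard
    fn = len(gold - predicted)   # in the gold standard but not predicted
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def ensemble_scores(proteins, best_method_networks):
    """Score every possible pair by the fraction of best-method networks containing it."""
    n = len(best_method_networks)
    return {
        edge(a, b): sum(edge(a, b) in net for net in best_method_networks) / n
        for a, b in combinations(sorted(proteins), 2)
    }


# Toy data, chosen only to exercise the functions.
gold = {edge("AKT1", "MTOR"), edge("EGFR", "ERBB2")}
predicted = {edge("AKT1", "MTOR"), edge("AKT1", "EGFR")}
networks = [{edge("AKT1", "MTOR")}, predicted]
scores = ensemble_scores({"AKT1", "MTOR", "EGFR"}, networks)
```

In this toy case one of two predictions is correct and one of two gold-standard edges is recovered, so precision and recall are both 0.5; the pair AKT1-MTOR appears in both toy networks and AKT1-EGFR in only one.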
Inference methods used to retrieve predictions on PPI from cancer tissues grouped by category
| Category | Method name |
|---|---|
| Correlation | Spearman correlation [SPEARMAN] |
| | Pearson correlation [PEARSON] |
| | TOM Similarity [WGCNA] |
| Partial Correlation | Simple partial correlation [SPC] |
| | GeneNet shrunken [GENENET] |
| | Graphical lasso [GLASSO] |
| Regression | Partial least squares regression [PLS] |
| | Ridge regression [RIDGE] |
| | Lasso regression [LASSO] |
| | Elastic net regression [ELASTICNET] |
| Mutual Information | ARACNE additive [ARACNEA] |
| | ARACNE multiplicative [ARACNEM] |
| | Context likelihood of relatedness [CLR] |
| | Maximum relevance minimum redundancy [MRNET] |
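As an illustration of the simplest category in the table, the following Python sketch builds a network by linking two proteins when the absolute Spearman correlation of their expression profiles reaches a cutoff. The cutoff value, the function names and the toy data are hypothetical; the source does not state how edges are selected from correlation values, so this shows only the general idea of correlation-based inference.

```python
from itertools import combinations
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation computed as the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def correlation_network(expression, threshold=0.8):
    """expression maps a protein name to its values across samples.

    threshold is a hypothetical cutoff on |rho|, chosen for illustration.
    """
    edges = set()
    for a, b in combinations(sorted(expression), 2):
        if abs(spearman(expression[a], expression[b])) >= threshold:
            edges.add(frozenset((a, b)))
    return edges


# Toy expression profiles over four samples.
toy = {"A": [1, 2, 3, 4], "B": [2, 4, 6, 9], "C": [4, 1, 3, 2]}
network = correlation_network(toy)
```

Here A and B are perfectly rank-correlated and end up connected, while C has a rank correlation of -0.4 with both and stays isolated at the chosen cutoff.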
Cancer types and best performing inference network methods with maximum F-measure value
| Cancer Type | INBIA | PERA |
|---|---|---|
| BLCA | CLR (0.188) | ELASTICNET (0.179) |
| BRCA | GLASSO (0.186) | PLS (0.179) |
| COAD | CLR (0.182) | ARACNE (0.166) |
| GBM | PLS (0.196) | PLS (0.191) |
| HNSC | PLS (0.184) | PLS (0.178) |
| KIRC | PLS (0.210) | PLS (0.180) |
| LGG | PLS (0.193) | PLS (0.194) |
| LUAD | CLR (0.187) | SPEARMAN (0.188) |
| LUSC | CLR (0.184) | SPEARMAN (0.184) |
| OV | GLASSO (0.191) | PLS (0.174) |
| PRAD | MRNET (0.191) | WGCNA (0.168) |
| READ | MRNET (0.186) | CLR (0.166) |
| SKCM | MRNET (0.188) | WGCNA (0.166) |
| STAD | PLS (0.179) | PLS (0.165) |
| THCA | PLS (0.196) | PLS (0.174) |
| UCEC | PLS (0.189) | WGCNA (0.178) |
Fig. 2 Network prediction quality based on tissue specificity. Precision-recall curves of INBIA (orange line) and PERA (blue line) in predicting tissue-specific PPIs. Each plot refers to a specific cancer type. The performances were computed by considering the ensemble scores generated from the best methods of INBIA and PERA, with the TissueNet counterparts as ground truth (see Additional file 1: Table S5)
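A precision-recall curve of this kind can be traced by sweeping a cutoff over the ensemble scores and comparing the retained PPIs with the ground truth at each step. The sketch below is a minimal illustration under that assumption; the function name, the tuple representation of PPIs and the toy values are hypothetical and do not come from the source.

```python
def precision_recall_points(scores, truth):
    """Return (threshold, precision, recall) for each distinct score, highest first.

    scores maps a PPI to its ensemble score; truth is the set of
    ground-truth PPIs. A PPI counts as predicted when score >= threshold.
    """
    points = []
    for t in sorted(set(scores.values()), reverse=True):
        predicted = {ppi for ppi, s in scores.items() if s >= t}
        tp = len(predicted & truth)
        precision = tp / len(predicted)  # predicted is never empty here
        recall = tp / len(truth) if truth else 0.0
        points.append((t, precision, recall))
    return points


# Toy scores and ground truth, for illustration only.
toy_scores = {("A", "B"): 1.0, ("A", "C"): 0.5, ("B", "C"): 0.0}
toy_truth = {("A", "B"), ("B", "C")}
curve = precision_recall_points(toy_scores, toy_truth)
```

Lowering the threshold can only add predictions, so recall never decreases along the returned list, while precision may move in either direction.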
Comparisons with Negatome and TissueNet by considering cancer types and normal counterparts (see Additional file 1: Table S5). The overlap is reported as a percentage of the total number of unique predictions per cancer/tissue. For each tissue, the first row reports the overlap percentage for INBIA and the second row that for PERA
| Cancer type | Method | Negatome (%) | TissueNet (%) |
|---|---|---|---|
| BLCA | INBIA | | 51.565 |
| | PERA | 1.463 | 44.390 |
| BRCA | INBIA | | 47.654 |
| | PERA | 0.830 | 29.461 |
| COAD | INBIA | | 44.901 |
| | PERA | 0.806 | 34.274 |
| | | | 30.323 |
| GBM | INBIA | | 36.452 |
| | | | 28.064 |
| | | | 25.484 |
| | | | 19.636 |
| | PERA | 0.727 | 26.909 |
| | | | 20.727 |
| | | | 15.636 |
| HNSC | INBIA | | 43.706 |
| | PERA | 0.820 | 31.967 |
| KIRC | INBIA | | 48.231 |
| | PERA | 2.137 | 40.171 |
| LGG | INBIA | | 32.812 |
| | PERA | 1.020 | 25.510 |
| LUAD | INBIA | | 46.373 |
| | PERA | 2.367 | 41.420 |
| LUSC | INBIA | | 46.418 |
| | PERA | 3.297 | 43.407 |
| OV | INBIA | | 25.942 |
| | PERA | 1.060 | 22.261 |
| PRAD | INBIA | | 39.458 |
| | PERA | 0.395 | 32.411 |
| READ | INBIA | | 43.769 |
| | PERA | 1.431 | 38.998 |
| SKCM | INBIA | | 54.018 |
| | PERA | 0.704 | 42.958 |
| STAD | INBIA | | 43.218 |
| | PERA | 0.772 | 35.135 |
| THCA | INBIA | | 26.769 |
| | | | 40.308 |
| | PERA | 0.763 | 19.466 |
| | | | 26.336 |
| UCEC | INBIA | | 48.615 |
| | PERA | 1.463 | 37.073 |
Text highlighted in italic refers to our method (INBIA), the second one to PERA
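The overlap percentages in the table are expressed relative to the number of unique predictions. A minimal Python sketch of that arithmetic follows; the function name and the toy pairs are hypothetical, and PPIs are treated as unordered pairs so that (A, B) and (B, A) count once.

```python
def overlap_percentage(predictions, reference):
    """Percentage of unique predicted PPIs that also occur in a reference set."""
    unique = {frozenset(p) for p in predictions}  # collapse duplicates and orientation
    ref = {frozenset(r) for r in reference}
    if not unique:
        return 0.0
    return 100.0 * len(unique & ref) / len(unique)


# Toy example: four unique predictions, one of which is in the reference.
toy_predictions = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "D"), ("B", "D")]
toy_reference = [("A", "B"), ("X", "Y")]
pct = overlap_percentage(toy_predictions, toy_reference)
```

With a reference of non-interacting pairs such as Negatome a low percentage is desirable, whereas with a tissue-specific interaction resource such as TissueNet a high percentage is desirable.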
Fig. 3 Comparison of predicted PPI classes in GIANT. The plots compare the best methods of INBIA and PERA for each cancer type in terms of the relative amount of predicted PPIs that fall within each GIANT class. PPIs in C1-C2 represent functionally related pairs in the same tissue (C1) or in multiple tissues (C2); conversely, PPIs in C3-C4 are likely functionally unrelated pairs. For HNSC no overlaps were found, meaning that neither method predicts the interactions provided by GIANT
Common over-represented motifs found in INBIA and PERA networks inferred from best methods. For each motif, the number of occurrences and relative p-values in INBIA and PERA networks are reported (CD = Cell Death, CP = Cell Proliferation, P = Phosphorylation, S = Signaling)
| Motif | Cancer type | INBIA | PERA |
|---|---|---|---|
| | THCA | 6256 (0.034) | 7 (2.66E-04) |
| | LGG | 10477 (0.039) | 30 (3.86E-06) |
| | THCA | 4230 (0.042) | 3 (0.001) |
| | LGG | 6086 (0.047) | 24 (2.60E-06) |
| | OV | 4159 (0.048) | 60 (3.28E-06) |
| | THCA | 2898 (0.047) | 2 (0.003051681) |
| | THCA | 7039 (0.038) | 7 (1.79E-05) |
Fig. 4 Gene set enrichment analysis. The plot represents the most enriched gene sets for each cancer type. For each type, the predictions of the best methods of INBIA and PERA were retrieved and the gene symbols corresponding to the predicted PPIs were enriched by using MSigDB. The x-axis contains the hallmark gene sets for each method, while the y-axis reports the number k of gene symbols that overlap with the hallmark gene sets, which are reported in the legend with different colors
COSMIC mutation analysis by considering genes related to PPIs predicted from best methods in our and PERA approaches and the relative amount of somatic and germline mutated genes
| Cancer type | Somatic mutation | Germline mutation |
|---|---|---|
| BLCA | ERBB3,NOTCH1,TSC1 | – |
| BRCA | | |
| | AKT1,BRCA2,CCND1,CDH1,CDKN1B,ERBB2,ESR1,GATA3,NOTCH1,PIK3CA,TP53 | BRCA2,CHEK2,TP53 |
| COAD | ERBB3 | – |
| GBM | PIK3CA,PIK3R1 | – |
| HNSC | ERBB3,MTOR,NOTCH1,TSC2 | – |
| KIRC | | |
| | MET,MTOR,TSC1,TSC2,VHL | |
| LGG | BRAF,EGFR,PTEN,RAF1,TP53 | ATM,PTEN,TP53 |
| LUAD | | |
| | – | – |
| LUSC | | |
| | TP53 | – |
| OV | | |
| | AKT1,BRAF,CCNE1,CTNNB1,ERBB2,MAPK1,PIK3R1 | – |
| PRAD | | |
| | AR,BRAF,RAF1 | – |
| READ | | |
| | AKT1,BRAF,CTNNB1,MAP2K1,PIK3CA,PIK3R1,SMAD3,SMAD4,SRC,TP53 | – |
| SKCM | ERBB3,NOTCH1 | – |
| STAD | BRAF,CDH1,ERBB2,ERBB3,PIK3CA | CDH1 |
| THCA | | |
| | BRAF,NRAS | |
| UCEC | | |
| | MTOR,PTEN,SRC,YWHAE | PTEN |
For each cancer type, the tuple in italics refers to our method (INBIA) and the second one to PERA. When only one record is present for a cancer type, the two approaches achieved the same results