Sanghamitra Bandyopadhyay, Sumanta Ray, Anirban Mukhopadhyay, Ujjwal Maulik.
Abstract
BACKGROUND: Detecting protein complexes within protein-protein interaction (PPI) networks is a major step toward the analysis of biological processes and pathways. Identification and characterization of protein complexes in PPI networks is an ongoing challenge. Several high-throughput experimental techniques provide a substantial number of PPIs, which are widely used to compile the PPI network of a species.
Keywords: Disorders; Gene Ontology; Multiobjective optimization; Protein complex; Protein–protein interactions
Year: 2015 PMID: 26257820 PMCID: PMC4529733 DOI: 10.1186/s13015-015-0056-2
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
Fig. 1 Illustration of the mutation procedure. a A subgraph in which yellow nodes represent the chromosome and green nodes are its first neighbors. b The randomly selected nodes are colored red. Two operations, insertion and deletion, are performed with equal probability. c, d The resulting chromosomes after the insertion and deletion operations, respectively.
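The mutation step described in the caption can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a chromosome is a set of node ids, that the network is given as an adjacency dictionary, that insertion draws from the first neighbors, and that deletion draws from the chromosome itself. The names `mutate`, `graph`, and `rng` are hypothetical.

```python
import random


def mutate(chromosome, graph, rng=random):
    """Return a mutated copy of `chromosome` (a set of node ids).

    `graph` maps each node id to the set of its adjacent node ids.
    Assumption: insertion and deletion are each chosen with probability 0.5.
    """
    mutated = set(chromosome)

    # First neighbors: nodes adjacent to the chromosome but not part of it.
    neighbors = set()
    for node in mutated:
        neighbors |= graph[node]
    neighbors -= mutated

    if rng.random() < 0.5:
        # Insertion: add one randomly selected first neighbor, if any exist.
        if neighbors:
            mutated.add(rng.choice(sorted(neighbors)))
    else:
        # Deletion: drop one randomly selected member, keeping at least one node.
        if len(mutated) > 1:
            mutated.remove(rng.choice(sorted(mutated)))
    return mutated


if __name__ == "__main__":
    # Toy network (hypothetical): A-B, A-C, B-D.
    toy_graph = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A"}, "D": {"B"}}
    print(mutate({"A", "B"}, toy_graph, random.Random(0)))
```

Sorting before `rng.choice` only makes the draw reproducible for a seeded generator. Whether the original procedure guards against empty chromosomes or disconnected results is not stated in the caption, so the guards here are assumptions.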
Summary of the human PPI network data sets used here
| Data set | #Proteins | #Interactions | Avg. degree | Max. degree | Density | Clustering coefficient | #Connected components | Network diameter | Avg. number of neighbors |
|---|---|---|---|---|---|---|---|---|---|
| HPRD | 9,589 | 39,240 | 7.924 | 271 | 0.001 | 0.102 | 262 | 14 | 7.703 |
Comparison of performance of different algorithms with respect to sensitivity, PPV and accuracy
| Method | Sensitivity | PPV | Accuracy |
|---|---|---|---|
| MCODE | 0.2134 | 0.5274 | 0.3347 |
| ClusterONE | 0.1414 | 0.4562 | 0.2540 |
| Affinity_propagation | 0.1837 | 0.4443 | 0.2857 |
| CORE | 0.315 | 0.353 | 0.333 |
| RNSC | 0.379 | 0.469 | 0.4216 |
| MCL-Caw | 0.331 | 0.241 | 0.279 |
| PEWCC | 0.435 | 0.469 | 0.451 |
| Proposed_Method | 0.3061 | 0.6928 | 0.4605 |
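For several rows of the table above, the Accuracy value coincides with the geometric mean of Sensitivity and PPV, a definition commonly used when benchmarking complex detection. The sketch below recomputes that quantity for three rows as a sanity check. That the authors used exactly this formula is an assumption based on the numbers, and the function name `geometric_accuracy` is hypothetical.

```python
from math import sqrt


def geometric_accuracy(sensitivity, ppv):
    """Geometric mean of sensitivity and positive predictive value."""
    return sqrt(sensitivity * ppv)


# (sensitivity, PPV) pairs copied from the table above.
rows = {
    "ClusterONE": (0.1414, 0.4562),
    "RNSC": (0.379, 0.469),
    "Proposed_Method": (0.3061, 0.6928),
}

for method, (sn, ppv) in rows.items():
    print(f"{method}: {geometric_accuracy(sn, ppv):.4f}")
```

Not every row matches to four decimals (MCODE and MCL-Caw differ slightly), which may reflect rounding of the reported inputs.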
Fig. 2 Precision vs. recall curves for all methods at different threshold values (t).
AUC score of different methods
| | MCODE | ClusterONE | Affinity_Propagation | COACH | RNSC | MCL-Caw | PEWCC | Proposed method |
|---|---|---|---|---|---|---|---|---|
| AUC | 0.1096 | 0.0590 | 0.2714 | 0.3572 | 0.2277 | 0.0841 | 0.6595 | 0.7244 |
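One common way to obtain an AUC score from a precision vs. recall curve is the trapezoidal rule over the (recall, precision) points measured at each threshold. The sketch below assumes this is the kind of area being reported, which the record does not confirm. The function name `pr_auc` and the sample points are hypothetical and are not data from the paper.

```python
def pr_auc(points):
    """Area under a precision-recall curve by the trapezoidal rule.

    `points` is an iterable of (recall, precision) pairs, one per threshold.
    """
    ordered = sorted(points)  # integrate along increasing recall
    area = 0.0
    for (r0, p0), (r1, p1) in zip(ordered, ordered[1:]):
        area += (r1 - r0) * (p0 + p1) / 2.0
    return area


if __name__ == "__main__":
    # Hypothetical curve, for illustration only.
    curve = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
    print(pr_auc(curve))
```

The result depends on how many thresholds are sampled and on how the curve is extended to recall 0 and 1, so scores computed this way are only comparable when all methods use the same thresholds.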
Predicted protein complexes, their GO-terms, p-values, and KEGG pathways
| Sl. No. | Cluster id | Predicted complex (% of proteins covered) | Matched proteins | GO-term (BP) | GO-term (CC) | GO-term (MF) | KEGG pathway |
|---|---|---|---|---|---|---|---|
| 1 | 5 | CTNNB1–DVL1–DVL3–PPM1A complex (80), HSPB1–PPA1–PPA1–SETDB1–TP53–WIPI1 complex (66.67), EEF1A1–MDH2–WARS complex (66.67), transforming growth factor–SMAD complex (66.67) | DVL2 CTNNB1 PPM1A; SMAD3 TP53 PPA1 PPA1; YWHAG EEF1A1; EEF1A1 SMAD2 | Enzyme linked receptor protein signaling pathway (GO:0007167) (5.9E–11) | Cytosol (GO:0005829) (3.4E–11) | Enzyme binding (GO:0019899) (1.5E–12) | Colorectal cancer (2.9E–11), Pathways in cancer (1.2E–10) |
| 2 | 6 | CDH1–CTNNB1–PTPN1 complex (50), transforming growth factor–SMAD complex (66.67), HSPB1–PPA1–PPA1–SETDB1–TP53–WIPI1 complex (66.67) | DVL2 SMAD3 TP53 YWHAG | Response to organic substance (GO:0010033) (4.2E–14) | Cytosol (GO:0005829) (2.5E–25) | Protein kinase activity (GO:0004672) (1.8E–10) | Pathways in cancer (6.1E–19) |
| 3 | 8 | MARCKS–NMT1–TP53 complex (66.67), transforming growth factor–SMAD complex (75), CTNNB1–DVL1–DVL3–PPM1A complex (50) | CTNNB1 SMAD3 TP53 YWHAG TP53 SMAD2 | Protein amino acid phosphorylation (GO:0006468) (4.1E–14) | Cytosol (GO:0005829) (3.8E–23) | Protein serine/threonine kinase activity (GO:0004674) (1.7E–11) | Chronic myeloid leukemia (5.7E–16) |
| 4 | 11 | p300–MDM2–p53 protein complex (63.64), CDH1–CTNNB1–PTPN1 complex (60) | HDAC1 TP53 ESR1 TP53 CTNNB1 | Positive regulation of nitrogen compound metabolic process (GO:0051173) (4.1E–20) | Nucleoplasm (GO:0005654) (2.6E–18) | Transcription factor binding (GO:0008134) (1.5E–15) | Prostate cancer (2.0E–12) |
| 5 | 15 | p300–MDM2–p53 protein complex (72.73), CASP8–MAPK1–MAPK3–PEA15–RPS6KA3 complex (60), CDH1–CTNNB1–PTPN1 complex (75), EP300–HOXB6–HOXB7 complex (66.67) | EP300 HDAC1 TP53 ESR1 LCK HSP90AA1 MAPK1 MAPK3 EP300 TP53 EP300 TP53 FOXO1 SMAD4 EP300 SMAD2 CTNNB1 PTPN1 HGS EP300 EEF1A1 SMAD2 | Positive regulation of macromolecule metabolic process (GO:0010604) (4.5E–34) | Nucleoplasm (GO:0005654) (3.1E–34) | Transcription regulator activity (GO:0030528) (7.6E–26) | Chronic myeloid leukemia (9.1E–26) |
| 6 | 20 | Smad protein complex (53.33), FOXO–SMAD complex (60), transforming growth factor–SMAD complex (75), CDH1–CTNNB1–PTPN1 complex (75) | SMAD3 SMAD2 LCK HSP90AA1 TP53 YWHAG CTNNB1 SMAD2 | Positive regulation of cellular biosynthetic process (GO:0031328) (6.8E–23) | Cytosol (GO:0005829) (3.9E–15) | Enzyme binding (GO:0019899) (1.5E–20) | Adherens junction (1.2E–16) |
| 7 | 25 | EEF1A1–MDH2–WARS complex (66.67), CTNNB1–DVL1–DVL3–PPM1A complex (66.67) | DVL2 CTNNB1 PPM1A SMAD3 TP53 PPA1 PPA1 YWHAG EEF1A1 EEF1A1 SMAD2 | Enzyme linked receptor protein signaling pathway (GO:0007167) (2.5E–15) | Cytosol (GO:0005829) (6.1E–20) | SMAD binding (GO:0046332) (2.2E–13) | Pathways in cancer (3.5E–21) |
| 8 | 26 | CDK5R2–CHN1–ERBB2 complex (66.67), EEF1A1–MDH2–WARS complex (66.67) | SMAD3 TP53 YWHAG SMAD2 | Response to organic substance (GO:0010033) (4.9E–12) | Cytosol (GO:0005829) (3.9E–20) | Enzyme binding (GO:0019899) (8.0E–16) | Colorectal cancer (6.4E–14) |
| 9 | 36 | SMAD1–SMAD4–ECSIT2 containing complex (80), CDH1–CTNNB1–PTPN1 complex (75), p300–MDM2–p53 protein complex (81.82) | ‘BRCA1’ ‘CTNNB1’ ‘ECSIT’ ‘EP300’ ‘ESR1’ ‘HDAC1’ ‘MDM2’ ‘PTPN1’ ‘RB1’ ‘SMAD1’ ‘SMAD4’ ‘SP1’ ‘TP53’ ‘UBE2Z’ | Positive regulation of cellular biosynthetic process (GO:0031328) (7.8E–37) | Nuclear lumen (GO:0031981) (3.0E–27) | Transcription regulator activity (GO:0030528) (1.7E–30) | Pathways in cancer (2.6E–30) |
| 10 | 39 | FOXO–SMAD complex (60), EEF1A1–MDH2–WARS complex (66.67), CDH1–CTNNB1–PTPN1 complex (75) | ‘CDH2’ ‘CTNNB1’ ‘EEF1A1’ ‘FOXO1’ ‘FOXO3’ ‘SMAD1’ ‘SMAD4’ ‘YWHAG’ | Cell cycle (GO:0007049) (1.2E–30) | Nucleoplasm (GO:0005654) (8.8E–59) | Transcription factor binding (GO:0008134) (1.5E–25) | Cell cycle (1.6E–18) |
| 11 | 40 | ERBB2IP–ZFYVE9 containing complex, p300–MDM2–p53 protein complex (72.73), CDK7–cyclin H complex (60) | ‘AR’ ‘BRCA1’ ‘CDK2’ ‘EP300’ ‘ERBB2IP’ ‘ESR1’ ‘HDAC1’ ‘MDM2’ ‘MNAT1’ ‘SP1’ ‘TBP’ ‘TP53’ ‘UBE2I’ | Positive regulation of macromolecule metabolic process (GO:0010604) (1.6E–29) | Nuclear lumen (GO:0031981) (1.3E–19) | Transcription activator activity (GO:0016563) (3.2E–14) | Pathways in cancer (3.3E–16) |
| 12 | 10 | CDK7–cyclin H complex (80), CASP8–MAPK1–MAPK3–PEA15–RPS6KA3 complex (80), ITPR1–STARD13–TXNDC4 complex (66.67), PIN1–TP53 complex (66.67) | ‘AR’ ‘CASP8’ ‘CDK2’ ‘CDK7’ ‘HMGB1’ ‘ITPR1’ ‘MAPK1’ ‘MAPK3’ ‘MDM2’ ‘MNAT1’ ‘PEA15’ ‘PIN1’ ‘TP53’ ‘TP73’ | Positive regulation of gene expression (GO:0010628) (1.0E–23) | Transcription factor complex (GO:0005667) (9.8E–13) | Enzyme binding (GO:0019899) (9.7E–21) | Pathways in cancer (7.0E–26) |
| 13 | 51 | HSPA1A–TP53 complex (100), ATXN1–C1orf94–DAZAP2–RBPMS–UBQLN4 containing complex (66.67), p300–MDM2–p53 protein complex (80) | ‘ATXN1’ ‘BAT2’ ‘BRCA1’ ‘DAZAP2’ ‘EP300’ ‘ERBB2IP’ ‘ESR1’ ‘HDAC1’ ‘HSPA1A’ ‘MDM2’ ‘SP1’ ‘TBP’ ‘TP53’ ‘UBE2I’ ‘UBQLN4’ | Positive regulation of cellular biosynthetic process (GO:0031328) (4.7E–32) | Nucleoplasm (GO:0005654) (1.7E–28) | Transcription regulator activity (GO:0030528) (3.2E–20) | Pathways in cancer (4.2E–24) |
| 14 | 3 | CDH1–CTNNB1–PTPN1 complex (75), EEF1A1–MDH2–WARS complex (66.67) | ‘CDH2’ ‘CTNNB1’ ‘EEF1A1’ ‘PTPN1’ ‘YWHAG’ | Positive regulation of macromolecule metabolic process (GO:0010604) (8.7E–28) | Nucleoplasm (GO:0005654) (2.0E–18) | Transcription factor binding (GO:0008134) (1.3E–18) | Neurotrophin signaling pathway (2.3E–12) |
| 15 | 38 | COP9 signalosome (CSN) (70), mutant p53/NF-Y protein (mutp53/NF-Y) complex (66.67), TP53–TP73 complex (66.67), HSPB1–PPA1–PPA1–SETDB1–TP53–WIPI1 complex (66.67) | ‘COPS3’ ‘COPS4’ ‘COPS5’ ‘COPS6’ ‘COPS7A’ ‘COPS8’ ‘CREBBP’ ‘EP300’ ‘GPS1’ ‘NFYA’ ‘PPA1’ ‘SMAD3’ ‘SP1’ ‘TP53’ ‘TP73’ ‘WT1’ | Positive regulation of nitrogen compound metabolic process (GO:0051173) (6.9E–22) | Cytosol (GO:0005829) (1.4E–10) | Enzyme binding (GO:0019899) (2.8E–18) | Pathways in cancer (7.0E–20) |
| 16 | 21 | MDM2–PML–PML–SUMO1–SUZ12 complex (80), SUMO1 activation complex (66.67), p300–MDM2–p53 protein complex (80) | ‘AR’ ‘BRCA1’ ‘DAXX’ ‘EP300’ ‘ESR1’ ‘HDAC1’ ‘MDM2’ ‘PIAS1’ ‘PML’ ‘RB1’ ‘SP1’ ‘SUMO1’ ‘TP53’ ‘UBE2I’ | Positive regulation of transcription (5.7E–34) | Organelle lumen (GO:0043233) (8.3E–34) | Transcription regulator activity (GO:0030528) (4.9E–39) | Pathways in cancer (3.9E–21) |
| 17 | 30 | CDH1–CTNNB1–PTPN1 complex (75), PKD1–PKD2 complex (66.67), EEF1A1–MDH2–WARS complex (66.67) | ‘CDH2’ ‘CTNNB1’ ‘EEF1A1’ ‘PKD1’ ‘PTPN1’ ‘YWHAG’ | Positive regulation of nitrogen compound metabolic process (GO:0051173) (1.0E–17) | Cell projection (GO:0042995) (9.0E–8) | Enzyme binding (GO:0019899) (2.1E–13) | Adherens junction (1.5E–18) |
| 18 | 13 | TP53–TP73 complex (66.67), p300–MDM2–p53 protein complex (72.73) | ‘BRCA1’ ‘CREBBP’ ‘EP300’ ‘ESR1’ ‘HDAC1’ ‘MDM2’ ‘SP1’ ‘TBP’ ‘TP53’ ‘TP73’ ‘WT1’ | Positive regulation of cellular biosynthetic process (GO:0031328) (3.1E–25) | Nuclear lumen (7.3E–24) | Transcription regulator activity (GO:0030528) (2.2E–24) | Prostate cancer (8.5E–9) |
Fig. 3 Bar diagram showing the proportion of proteins in predicted clusters that are involved in some real protein complexes.
Fig. 4 Bar diagram showing the involvement of predicted complexes in 22 primary disease classes.
Fig. 5 Bipartite network showing the association between predicted complexes and 16 disease classes. Red nodes represent predicted complexes and yellow diamond-shaped nodes denote disease classes. The size of each complex node reflects the number of associated disorders involving that complex. Edge width indicates the number of associated disorders shared by the complex and the disease class that the edge links.
Some useful metrics of disorder associated genes
| Disease category | No. of associated genes | No. of interactions in human PPI network | Density | Semantic similarity |
|---|---|---|---|---|
| Bone | 180 | 94 | 0.0163 | 0.5676 |
| Cancer | 869 | 725 | 0.0019 | 0.8931 |
| Cardiovascular | 332 | 457 | 0.0083 | 0.7374 |
| Connective_tissue | 149 | 75 | 0.0068 | 0.7146 |
| Dermatological | 270 | 154 | 0.0042 | 0.6381 |
| Developmental | 143 | 219 | 0.0216 | 0.7589 |
| Ear, nose, throat | 177 | 183 | 0.0117 | 0.8277 |
| Endocrine | 305 | 572 | 0.0123 | 0.8432 |
| Gastrointestinal | 110 | 52 | 0.0087 | 0.5819 |
| Hematological | 389 | 755 | 0.0100 | 0.6449 |
| Immunological | 290 | 658 | 0.0157 | 0.8159 |
| Metabolic | 638 | 2,088 | 0.0103 | 0.7458 |
| Muscular | 294 | 1,183 | 0.0275 | 0.6421 |
| Neurological | 917 | 898 | 0.0021 | 0.7811 |
| Nutritional | 51 | 82 | 0.0643 | 0.6536 |
| Ophthalmological | 563 | 555 | 0.0037 | 0.7035 |
| Psychiatric | 75 | 49 | 0.0177 | 0.7032 |
| Renal | 170 | 131 | 0.0091 | 0.6545 |
| Respiratory | 79 | 1 | 0.00032 | 0.7023 |
| Skeletal | 278 | 186 | 0.0048 | 0.6679 |
| Unclassified | 64 | 6 | 0.0030 | 0.5080 |
| Multiple | 721 | 427 | 0.0016 | 0.8019 |
Fig. 6 Bipartite network showing direct associations between predicted complexes and disorders. Large red nodes represent predicted complexes and small nodes denote individual disorders. Disorder nodes are colored according to their disease class.