Mitra Mirzarezaee, Babak N Araabi, Mehdi Sadeghi.
Abstract
BACKGROUND: Biological networks are understood to have a modular organization, which is the source of their observed complexity. Analysis of networks and motifs has shown that two types of hubs, party hubs and date hubs, are responsible for this complexity. Party hubs are local coordinators because of their high co-expression with their partners, whereas date hubs display low co-expression and are assumed to be global connectors. However, there is no agreement on these concepts in the related literature, with different studies reporting results on different data sets. We investigated whether there is a relation between the biological features of Saccharomyces cerevisiae proteins and their roles as non-hubs, intermediately connected proteins, party hubs, and date hubs. We propose a classifier that separates these four classes.
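The distinction between the four classes can be illustrated with a minimal sketch. It assumes that a protein's role is decided from its degree in the interaction network and from its average Pearson correlation of expression with its partners. The function names and all three cutoffs (`ic_min_degree`, `hub_min_degree`, `pcc_cutoff`) are hypothetical placeholders and are not values taken from the paper.

```python
import numpy as np

def average_pcc(protein_profile, partner_profiles):
    """Mean Pearson correlation between one expression profile and each partner's."""
    protein = np.asarray(protein_profile, dtype=float)
    pccs = [
        np.corrcoef(protein, np.asarray(p, dtype=float))[0, 1]
        for p in partner_profiles
    ]
    return float(np.mean(pccs))

def label_protein(degree, avg_pcc, ic_min_degree=4, hub_min_degree=8, pcc_cutoff=0.5):
    """Assign NH, IC, PH, or DH; every cutoff here is a hypothetical placeholder."""
    if degree < ic_min_degree:
        return "NH"  # non-hub: few interaction partners
    if degree < hub_min_degree:
        return "IC"  # intermediately connected
    # Hubs: high co-expression with partners suggests a party hub, low a date hub.
    return "PH" if avg_pcc >= pcc_cutoff else "DH"

# Toy expression profiles (made up for illustration only).
profile = [1.0, 2.0, 3.0, 4.0]
partners = [[2.0, 4.0, 6.0, 8.0], [1.0, 2.0, 3.0, 5.0]]
avg = average_pcc(profile, partners)
print(label_protein(degree=10, avg_pcc=avg))
```

With real data the cutoffs would have to come from the chosen hub definition, which the abstract notes is not agreed upon in the literature.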
Year: 2010 PMID: 21167069 PMCID: PMC3018396 DOI: 10.1186/1752-0509-4-172
Source DB: PubMed Journal: BMC Syst Biol ISSN: 1752-0509
Distribution of the four classes of proteins in the S. cerevisiae PIN
| Class Label | Number (Percentage) of Proteins |
|---|---|
| Non-Hub (NH) | 4796 (81.4) |
| Intermediately Connected (IC) | 575 (09.8) |
| Party Hub (PH) | 195 (03.3) |
| Date Hub (DH) | 322 (05.5) |
| Total | 5,888 (100) |
Base classifiers comparison based on different feature-sets
| Feature set | KNN | Bayes with | Bayes with |
|---|---|---|---|
| Amino Acid compositions | 26.0 (11.5) | 25.0 (50.0) | 33.6 (15.0) |
| Dipeptides | 31.7 (21.8) | 31.5 (25.1) | 43.9 (36.8) |
| PairsComp1Gap | 31.5 (23.7) | 31.6 (27.2) | 43.0 (29.4) |
| PairsComp2Gaps | 30.7 (21.0) | 31.8 (25.7) | 45.9 (34.4) |
| Haralick Features | 26.9 (03.5) | 26.6 (06.1) | 26.0 (07.8) |
| 48 physicochemical prop. | 26.6 (10.3) | 25.0 (50.0) | 29.4 (13.5) |
| Biological Process level 1 | 31.8 (16.8) | 27.8 (16.4) | 34.3 (18.8) |
| Biological Process level 2 | 33.0 (22.2) | 33.5 (18.0) | 30.9 (14.1) |
| Cellular level 1 | 32.7 (25.8) | 27.0 (19.0) | 35.4 (29.5) |
| Cellular level 2 | 31.4 (28.5) | 31.0 (20.2) | 28.2 (14.5) |
| Functional Process level 1 | 30.0 (10.9) | 27.3 (08.4) | 27.6 (11.6) |
| Functional Process level 2 | 28.2 (15.8) | 30.5 (17.9) | 28.3 (15.7) |
| Domains | 56.0 (60.0) | 67.1 (58.9) | 63.7 (55.5) |
| Repeated Domains | 57.0 (59.3) | 66.6 (57.7) | 65.7 (57.1) |
| Disordered Regions | 26.5 (07.5) | 25.2 (-4.0) | 27.2 (13.3) |
| PSSM-20 | 26.2 (08.4) | 26.0 (08.4) | 26.0 (09.3) |
| PSSM-400 | 37.0 (35.8) | 42.6 (42.1) | 54.1 (47.9) |
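Several of the sequence-derived feature sets in this table can be sketched directly from a protein sequence. The example below is a minimal illustration that assumes "Amino Acid compositions" means the 20 residue frequencies, "Dipeptides" means the 400 adjacent-pair frequencies, and the "PairsComp" sets mean the same pair frequencies with one or two residues skipped between the pair. That reading of the gapped sets, and the function names, are assumptions rather than definitions from the paper.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]  # 400 ordered pairs

def amino_acid_composition(seq):
    """20-dimensional vector of residue frequencies."""
    seq = seq.upper()
    if not seq:
        return [0.0] * len(AMINO_ACIDS)
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]

def gapped_pair_composition(seq, gap=0):
    """400-dimensional vector of pair frequencies; gap=0 gives plain dipeptides."""
    seq = seq.upper()
    counts = dict.fromkeys(PAIRS, 0)
    step = gap + 1
    total = 0
    for i in range(len(seq) - step):
        key = seq[i] + seq[i + step]
        if key in counts:  # skip pairs containing non-standard residues
            counts[key] += 1
            total += 1
    return [counts[p] / total if total else 0.0 for p in PAIRS]

aac = amino_acid_composition("ACAC")
dipeptides = gapped_pair_composition("ACAC", gap=0)
one_gap = gapped_pair_composition("ACAC", gap=1)
print(len(aac), len(dipeptides), len(one_gap))
```

Each vector could then be passed to any of the base classifiers as a fixed-length feature set.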
Fusion of feature-sets with Gaussian Bayes classification
| Feature sets | Average CCR | Class | NH | IC | PH | DH |
|---|---|---|---|---|---|---|
| All Features | 68.3 (62.3) | NH | | 06.7 | 01.0 | 02.6 |
| | | IC | 34.4 | | 03.2 | 04.5 |
| | | PH | 14.9 | 13.5 | | 08.1 |
| | | DH | 22.1 | 14.7 | 01.1 | |
| Domains | 67.0 (58.9) | NH | | 03.4 | 01.9 | 03.8 |
| | | IC | 36.3 | | 04.5 | 11.5 |
| | | PH | 18.9 | 02.7 | | 06.8 |
| | | DH | 20.0 | 08.4 | 13.7 | |
| Domain | 66.8 (58.0) | NH | | 03.5 | 03.5 | 03.8 |
| RepDomains | | IC | 35.7 | | 06.4 | 10.8 |
| | | PH | 14.9 | 04.1 | | 08.1 |
| | | DH | 16.8 | 10.5 | 14.7 | |
| Domains | 70.9 (64.8) | NH | | 04.4 | 02.5 | 02.9 |
| RepDomains PSSM-400 | | IC | 35.0 | | 06.4 | 05.1 |
| | | PH | 10.8 | 05.4 | | 08.1 |
| | | DH | 14.7 | 07.4 | 13.7 | |
| Domains | 71.7 (65.4) | NH | | 04.0 | 01.5 | 03.1 |
| PSSM-400 | | IC | 35.0 | | 04.5 | 07.0 |
| | | PH | 12.2 | 06.8 | | 06.8 |
| | | DH | 16.4 | 06.3 | 09.5 | |
| Domains | 74.0 (67.1) | NH | | 04.2 | 01.6 | 03.0 |
| PSSM-400 | | IC | 31.8 | | 05.1 | 05.7 |
| Cellular1 | | PH | 12.2 | 05.4 | | 06.8 |
| | | DH | 14.7 | 06.3 | 06.3 | |
| Domains | 74.7 (69.3) | NH | | 04.4 | 01.5 | 02.5 |
| PSSM-400 | | IC | 31.8 | | 05.1 | 05.1 |
| Cellular1 | | PH | 09.5 | 08.1 | | 06.8 |
| CompPair2Gaps | | DH | 15.8 | 05.3 | 05.3 | |
| Domains | 74.9 (69.9) | NH | | 04.1 | 01.5 | 02.2 |
| PSSM-400 | | IC | 31.2 | | 05.1 | 04.5 |
| Cellular1 | | PH | 13.5 | 06.8 | | 05.4 |
| CompPair2Gaps | | DH | 14.7 | 07.4 | 04.2 | |
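The first reported figure for each feature combination appears to be the mean of the per-class correct classification rates. The sketch below checks this for the "All Features" block, under two assumptions that are not stated in the table itself: each class row sums to 100%, and the three listed percentages are the off-diagonal (misclassified) entries in NH, IC, PH, DH column order, with the unlisted diagonal entry left as NaN.

```python
import numpy as np

classes = ["NH", "IC", "PH", "DH"]

# Misclassification percentages from the "All Features" block.
# Diagonal entries are not listed in the table, so they are NaN here (assumption:
# the listed values occupy the off-diagonal positions in class order).
off_diagonal = np.array([
    [np.nan, 6.7, 1.0, 2.6],    # NH row
    [34.4, np.nan, 3.2, 4.5],   # IC row
    [14.9, 13.5, np.nan, 8.1],  # PH row
    [22.1, 14.7, 1.1, np.nan],  # DH row
])

# If every row sums to 100%, the per-class correct rate is the remainder.
per_class_ccr = 100.0 - np.nansum(off_diagonal, axis=1)
average_ccr = per_class_ccr.mean()

for name, ccr in zip(classes, per_class_ccr):
    print(f"{name}: {ccr:.1f}")
print(f"average: {average_ccr:.1f}")
```

Under these assumptions the mean comes out at 68.3, which agrees with the first figure reported for "All Features". Averaging per class rather than per protein keeps the large NH class from dominating the score.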
Minimum Risk extension of base classifiers on different feature-sets
| Feature set | Min Risk KNN | Min Risk Bayes | Min Risk Bayes |
|---|---|---|---|
| Amino Acid compositions | 30.2 (14.2) | 29.4 (50.0) | 28.4 (14.8) |
| Dipeptides | 34.8 (26.2) | 43.5 (25.1) | 45.0 (34.0) |
| PairsComp1Gap | 35.7 (26.9) | 41.8 (27.2) | 42.7 (29.0) |
| PairsComp2Gaps | 37.0 (29.6) | 41.1 (25.7) | 45.6 (33.6) |
| Haralick Features | 27.9 (07.6) | 28.4 (06.1) | 26.3 (06.1) |
| 48 physicochemical prop. | 29.5 (13.5) | 29.0 (50.0) | 29.0 (15.5) |
| Biological Process level 1 | 32.0 (19.2) | 30.4 (16.3) | 30.4 (14.9) |
| Biological Process level 2 | 34.1 (25.2) | 35.7 (18.0) | 37.7 (16.5) |
| Cellular level 1 | 34.2 (23.8) | 35.4 (19.0) | 36.6 (34.2) |
| Cellular level 2 | 35.8 (30.3) | 34.0 (20.2) | 36.6 (31.8) |
| Functional Process level 1 | 29.6 (17.3) | 28.1 (08.4) | 31.6 (14.5) |
| Functional Process level 2 | 29.6 (21.9) | 33.2 (17.9) | 30.6 (15.8) |
| Domains | 60.5 (57.9) | 67.5 (58.9) | 67.8 (57.1) |
| Repeated Domains | 59.6 (57.0) | 67.4 (57.7) | 63.9 (55.7) |
| Disordered Regions | 27.0 (07.2) | 26.1 (02.6) | 27.7 (08.9) |
| PSSM-20 | 26.8 (06.1) | 26.3 (08.4) | 25.6 (01.7) |
| PSSM-400 | 43.8 (33.2) | 49.8 (42.1) | 54.0 (48.0) |
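A minimum-risk extension generally replaces the "pick the most probable class" rule with "pick the class with the lowest expected loss". The following is a minimal sketch of that textbook decision rule, not the authors' implementation. The posterior values and the loss matrix are hypothetical and chosen only to show how unequal costs can move a decision away from the majority class.

```python
import numpy as np

def min_risk_decision(posteriors, loss):
    """Return (chosen class index, risks).

    loss[i, j] is the cost of deciding class i when the true class is j,
    so risks[i] = sum_j loss[i, j] * P(class j | x).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    loss = np.asarray(loss, dtype=float)
    risks = loss @ posteriors
    return int(np.argmin(risks)), risks

classes = ["NH", "IC", "PH", "DH"]
posteriors = [0.6, 0.1, 0.2, 0.1]  # hypothetical P(class | x)

# Zero-one loss reproduces the usual maximum-posterior decision.
zero_one = 1.0 - np.eye(4)
plain_choice, _ = min_risk_decision(posteriors, zero_one)

# Hypothetical loss: calling a true hub (PH or DH) a non-hub costs five times more.
weighted = zero_one.copy()
weighted[0, 2] = 5.0
weighted[0, 3] = 5.0
risk_choice, risks = min_risk_decision(posteriors, weighted)

print(classes[plain_choice], classes[risk_choice])
```

With these illustrative numbers the plain rule picks NH while the weighted rule picks PH, which is the kind of shift toward minority classes that a loss matrix is meant to produce on imbalanced data.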
MDM Bayes classification with different number of PDFs for the best feature-set
| 73.0 (70.5) | |
| 73.7 (70.7) | |
| 73.3 (72.1) | |
| 72.1 (70.6) |
Comparison of Minimum Risk classifiers on best fused features
| Classifier | Average CCR | Class | NH | IC | PH | DH |
|---|---|---|---|---|---|---|
| Bayes with Gaussian PDF | 77.0 (69.4) | NH | | 03.7 | 01.8 | 03.6 |
| | | IC | 29.3 | | 06.4 | 08.9 |
| | | PH | 08.1 | 04.0 | | 08.1 |
| | | DH | 07.4 | 05.3 | 05.3 | |
| Bayes with MDM PDF | 74.4 (69.6) | NH | | 03.1 | 02.6 | 02.5 |
| | | IC | 31.2 | | 08.3 | 10.8 |
| | | PH | 10.8 | 0.0 | | 06.8 |
| | | DH | 11.6 | 03.2 | 11.6 | |
Predicted labels from both Min Risk Bayes classifiers with Gaussian and MDM models
| Model | NHs | ICs | PHs | DHs | Average CCR | |
|---|---|---|---|---|---|---|
| Gaussian | 1400 | 164 | 89 | 113 | 77.0 | 69.4 |
| MDM | 1384 | 111 | 127 | 144 | 74.4 | 69.6 |
| True Labels | 1440 | 157 | 74 | 95 | - | - |
PH/DH/NH prediction results in S. cerevisiae
| Comparison | Sensitivity | Specificity | PPV | NPV |
|---|---|---|---|---|
| NHs vs. Others | 90.8 | 81.9 | 95.7 | 66.9 |
| ICs vs. Others | 55.4 | 96.2 | 58.4 | 95.7 |
| PHs vs. Others | 79.6 | 97.6 | 59 | 99.1 |
| DHs vs. Others | 82.1 | 95.7 | 52 | 98.9 |
| PH+DH vs. Others | 87.5 | 93.6 | 59.2 | 98.6 |
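The four figures in each one-vs-rest row can be cross-checked against one another. The sketch below assumes they are, in order, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV), all in percent. It also assumes the class sizes of the test set are the true-label counts given in the predicted-labels table (1440 NHs against 157 + 74 + 95 = 326 others). Both assumptions are inferred from the numbers rather than stated in the source.

```python
def predictive_values(sensitivity, specificity, n_pos, n_neg):
    """Derive PPV and NPV from sensitivity, specificity, and class sizes."""
    tp = sensitivity * n_pos  # true positives
    fn = n_pos - tp           # false negatives
    tn = specificity * n_neg  # true negatives
    fp = n_neg - tn           # false positives
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv

# "NHs vs. Others" row: 90.8 and 81.9 read as sensitivity and specificity.
n_nh = 1440
n_others = 157 + 74 + 95
ppv, npv = predictive_values(0.908, 0.819, n_nh, n_others)
print(f"PPV = {100 * ppv:.1f}%, NPV = {100 * npv:.1f}%")
```

Under these assumptions the derived values land within about 0.1 percentage point of the last two figures in the "NHs vs. Others" row (95.7 and 66.9), which supports the assumed column order.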
Figure 1. ROC curves and AROC values for separating NHs, ICs, PHs, and DHs.