Eileen Marie Hanna, Nazar Zaki.
Abstract
BACKGROUND: Developing suitable methods for the identification of protein complexes remains an active research area. It is important because it allows a better understanding of cellular functions and malfunctions, which in turn can lead to more effective treatments for diseases. In this context, various computational approaches have been introduced to complement high-throughput experimental methods, which typically involve large datasets, are expensive in terms of time and cost, and are often subject to spurious interactions.
Year: 2014 PMID: 24944073 PMCID: PMC4230023 DOI: 10.1186/1471-2105-15-204
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Examples of bridge, fjord and shore proteins in protein-protein interaction networks.
Figure 2. A yeast protein-protein interaction sub-network. The nodes coloured in yellow correspond to essential proteins identified by the ProRank algorithm.
Figure 3. The complexes detected by the original ProRank algorithm when applied to the sub-network in Figure 2.
Figure 4. The complexes detected by the ProRank algorithm with the complex-overlap assumption when applied to the sub-network in Figure 2.
Figure 5. A schema of the steps that make up the ProRank+ algorithm.
The properties of the five datasets used in the experimental study
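| Dataset | Proteins | Interactions | Network density | Average degree |
|---|---|---|---|---|
| Collins | 1,622 | 9,074 | 0.007 | 11.189 |
| Krogan core | 2,708 | 7,123 | 0.002 | 5.261 |
| Krogan extended | 3,672 | 14,317 | 0.002 | 7.798 |
| Gavin | 1,855 | 7,669 | 0.004 | 8.268 |
| BioGRID | 5,640 | 59,748 | 0.004 | 21.187 |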
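The third and fourth values in each row are consistent with the network density and the average number of neighbours derived from the first two values. The following is a minimal sketch of that arithmetic, assuming the first two values are the number of proteins N and the number of interactions E of an undirected network without self-loops, so that density is 2E / (N(N-1)) and average degree is 2E / N. The function name `network_stats` is illustrative and does not come from the paper.

```python
def network_stats(num_proteins, num_interactions):
    """Return (density, average degree) of an undirected simple graph.

    Assumption: density = 2E / (N * (N - 1)) and average degree = 2E / N.
    """
    n, e = num_proteins, num_interactions
    density = 2 * e / (n * (n - 1))
    avg_degree = 2 * e / n
    return density, avg_degree


# (proteins, interactions) pairs taken from the table rows
rows = [(1622, 9074), (2708, 7123), (3672, 14317), (1855, 7669), (5640, 59748)]

for n, e in rows:
    density, avg_degree = network_stats(n, e)
    print(f"{n:>5} {e:>6} {density:.3f} {avg_degree:.3f}")
```

Rounded to three decimals, the computed values can be compared directly with the last two numeric columns of the table.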
Figure 6. ProRank+ compared to ProRank, MCL, MCODE, CMC, AP, ClusterONE, RNSC, RRW, and CFinder on the four weighted yeast datasets: Collins, Krogan core, Krogan extended and Gavin. The comparisons are in terms of (a) the number of clusters that match the reference complexes, (b) the geometric accuracy (Acc), which combines the clustering-wise sensitivity (S) and the clustering-wise positive predictive value (PPV), and (c) the maximum matching ratio (MMR).
Figure 7. ProRank+ compared to ProRank, MCL, MCODE, AP, ClusterONE, RNSC, and RRW on the unweighted BioGRID dataset. The comparisons are in terms of (a) the number of clusters that match the reference complexes, and (b) the geometric accuracy (Acc), which combines the clustering-wise sensitivity (S) and the clustering-wise positive predictive value (PPV), together with the maximum matching ratio (MMR).
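The geometric accuracy named in these captions can be illustrated with a short sketch. It assumes the usual definitions based on a contingency table T, where T[i][j] is the number of proteins shared by reference complex i and predicted cluster j: S is the sum over complexes of their best overlap divided by the total complex size, PPV is the sum over clusters of their best overlap divided by the total overlap, and Acc is the geometric mean of the two. The function name and the toy protein sets are hypothetical, and the excerpt itself does not spell out these formulas.

```python
from math import sqrt


def geometric_accuracy(reference, predicted):
    """Return (S, PPV, Acc) for two non-empty lists of protein sets."""
    # t[i][j] = proteins shared by reference complex i and predicted cluster j
    t = [[len(r & p) for p in predicted] for r in reference]

    # Clustering-wise sensitivity: best overlap per complex over total complex size
    s = sum(max(row) for row in t) / sum(len(r) for r in reference)

    # Clustering-wise PPV: best overlap per cluster over the total overlap
    col_max = [max(row[j] for row in t) for j in range(len(predicted))]
    total = sum(sum(row) for row in t)
    ppv = sum(col_max) / total if total else 0.0

    return s, ppv, sqrt(s * ppv)


# Toy data (hypothetical protein identifiers)
reference = [{"A", "B", "C", "D"}, {"E", "F", "G"}]
predicted = [{"A", "B", "C"}, {"D", "E", "F", "G"}]

s, ppv, acc = geometric_accuracy(reference, predicted)
print(f"S={s:.3f} PPV={ppv:.3f} Acc={acc:.3f}")
```

Because Acc is a geometric mean, a clustering cannot score well by maximising only one of S or PPV, for example by placing every protein in a single cluster.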
Testing ProRank+ on small complexes
| 428 | 91 | 0.875 | 0.935 | 0.433 | |
| 229 | 34 | 0.667 | 0.816 | 0.163 | |
| 260 | 78 | 0.75 | 0.769 | 0.217 | |
| 534 | 57 | 0.897 | 0.947 | 0.293 | |
| 823 | 78 | 0.882 | 0.9 | 0.351 |
The results are in terms of (a) the number of clusters that match the reference complexes, (b) the geometric accuracy (Acc) which reflects the clustering-wise sensitivity (S) and the clustering-wise positive predictive value (PPV), and (c) the maximum matching ratio (MMR).
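Criteria (a) and (c) can be sketched in a few lines. This example rests on assumptions not stated in this excerpt: an overlap score of |A ∩ B|² / (|A| · |B|) between a reference complex A and a predicted cluster B, a match threshold of 0.25, and MMR defined as the total weight of a maximum-weight one-to-one matching between reference complexes and predicted clusters, divided by the number of reference complexes. The matching is found by brute force, which is only practical for very small inputs. All function names are illustrative.

```python
from itertools import permutations


def overlap_score(a, b):
    """Assumed overlap score: |A ∩ B|^2 / (|A| * |B|)."""
    shared = len(a & b)
    return shared * shared / (len(a) * len(b))


def count_matched_clusters(reference, predicted, threshold=0.25):
    """Count predicted clusters whose score with any reference complex reaches the threshold."""
    return sum(
        any(overlap_score(r, p) >= threshold for r in reference)
        for p in predicted
    )


def max_matching_ratio(reference, predicted):
    """Brute-force maximum-weight bipartite matching, normalised by the number of reference complexes."""
    w = [[overlap_score(r, p) for p in predicted] for r in reference]
    n, m = len(reference), len(predicted)
    if n <= m:
        # Assign each reference complex to a distinct predicted cluster
        best = max(
            sum(w[i][cols[i]] for i in range(n))
            for cols in permutations(range(m), n)
        )
    else:
        # Fewer clusters than complexes: assign each cluster to a distinct complex
        best = max(
            sum(w[rows[j]][j] for j in range(m))
            for rows in permutations(range(n), m)
        )
    return best / n


# Toy data (hypothetical protein identifiers)
reference = [{"A", "B", "C", "D"}, {"E", "F", "G"}]
predicted = [{"A", "B", "C"}, {"D", "E", "F", "G"}]

print(count_matched_clusters(reference, predicted))
print(max_matching_ratio(reference, predicted))
```

For realistic numbers of complexes, the permutation search would need to be replaced by a polynomial-time assignment algorithm; the brute-force version is kept here only to make the definition explicit.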
Selected complexes detected by ProRank+ when tested on a human protein-protein interaction dataset
| {CCT3, CCT2, CCT8, CCT6A, CCT4, CCT7, CCT5, TCP1} | 100% | |
| {RPL32, RPS17, RPSA, RPL10A, RPL12, SLC25A5, RPL7, RPL18, RPL15, RPL21, RPS6, RPS4X, RPL19, RPL14, RPL4, RPS27L, RPS23, RPS26, RPS16, RPL7A, RPS24, RPS13, RPS15A, RPS8, RPS3A, FAU, RPL11, RPL6, RPL9, RPL5, RPS27, RPL17, RPS2, RPS25, RPS20, NOP56, RPS15, RPL23A, RPS10, RPL10L, RPLP0P6, RPS28, RPS5, RPS9, RPL23, RPL18A, RPS3, RPL37A, RPL31, RPL10, RPL8, RPS11, RPL36, RPS19, RPL30, RPL24, RPS21, RPL27, RPS12, RPL29, RPS29, RPS7, RPL22, RPLP0, RPS14, RPL3, RPLP2, RPL27A, RPL13, RPS18, RPS27A} | 81.48% | |
| {PSMD8, PSMB2, PSMC3, PSMC4, PSMA4, PSMA1, PSMD1, PSMD7, PSMA2, PSMB6, PSMB7, PSMD3, PSMB1, PSMC1, PSMC5, PSMC2, PSMB4, PSMA6, PSMD6, PSMD14, PSMD12, PSMD11, PSMD13, PSMA7, PSMC6, PSMA5, PSMB3, PSMB5, PSMA8, PSMD2} | 83.33% | |
| {SMARCA4, SMARCC1, ARID1A, SMARCE1, SMARCC2, SMARCA2, SMARCB1} | 60% |