Buwen Cao, Shuguang Deng, Hua Qin, Pingjian Ding, Shaopeng Chen, Guanghui Li.
Abstract
High-throughput technology has generated large-scale protein interaction data, which are crucial to our understanding of biological organisms. Many complex-identification algorithms have been developed to determine protein complexes. However, these methods are only suitable for dense protein interaction networks, because their performance decreases rapidly when they are applied to sparse protein–protein interaction (PPI) networks. In this study, based on penalized matrix decomposition (PMD), a novel method of penalized matrix decomposition for the identification of protein complexes (i.e., PMDpc) was developed to detect protein complexes in the human protein interaction network. This method mainly consists of three steps. First, the adjacency matrix of the protein interaction network is normalized. Second, the normalized matrix is decomposed into three factor matrices. The PMDpc method can detect protein complexes in sparse PPI networks by imposing appropriate constraints on the factor matrices. Finally, the results of our method are compared with those of other methods on the human PPI network. Experimental results show that our method not only outperforms classical algorithms, such as CFinder, ClusterONE, RRW, HC-PIN, and PCE-FR, but also achieves an ideal overall performance in terms of a composite score consisting of F-measure, accuracy (ACC), and the maximum matching ratio (MMR).
Keywords: clustering; penalized matrix decomposition; protein complex; protein–protein interaction (PPI)
Year: 2018 PMID: 29914123 PMCID: PMC6100434 DOI: 10.3390/molecules23061460
Source DB: PubMed Journal: Molecules ISSN: 1420-3049 Impact factor: 4.411
Figure 1. Values of F-measure for different values of k ∈ (0, 2500], in increments of 100, on the HPRD dataset.
Results of the six protein complex identification algorithms on the HPRD dataset.
| Algorithms | Number | Precision | Recall | F-Measure | ACC | Sep | MMR | MCC |
|---|---|---|---|---|---|---|---|---|
| CFinder | 49 | 0.959 | 0.143 | 0.249 | 0.184 | 0.165 | 0.017 | 0.327 |
| ClusterONE | 755 | 0.295 | 0.186 | 0.229 | 0.333 | 0.209 | 0.084 | 0.391 |
| RRW | 167 | 0.671 | 0.190 | 0.296 | 0.236 | 0.231 | 0.034 | 0.209 |
| HC-PIN | 99 | 0.646 | 0.140 | 0.230 | 0.256 | 0.233 | 0.024 | 0.196 |
| PCE-FR | 274 | 0.534 | 0.178 | 0.267 | 0.279 | 0.169 | 0.029 | 0.035 |
| PMDpc | 118 | 0.451 | 0.356 | 0.398 | 0.362 | 0.777 | 0.010 | 0.343 |
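The F-measure column can be cross-checked against the precision and recall columns. The sketch below assumes the standard definition of F-measure as the harmonic mean of precision and recall, which the record does not spell out, and assumes that the unlabeled last row belongs to PMDpc. The helper name `f_measure` is illustrative.

```python
# (precision, recall, reported F-measure) copied from the table above.
rows = {
    "CFinder": (0.959, 0.143, 0.249),
    "ClusterONE": (0.295, 0.186, 0.229),
    "RRW": (0.671, 0.190, 0.296),
    "HC-PIN": (0.646, 0.140, 0.230),
    "PCE-FR": (0.534, 0.178, 0.267),
    "PMDpc": (0.451, 0.356, 0.398),  # assumed label for the unlabeled row
}


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute F-measure from the rounded precision and recall values.
recomputed = {
    name: round(f_measure(p, r), 3) for name, (p, r, _) in rows.items()
}

for name, (_, _, reported) in rows.items():
    print(f"{name:11s} reported={reported:.3f} recomputed={recomputed[name]:.3f}")
```

Because the inputs are already rounded to three decimals, a recomputed value can differ from the reported one in the last digit.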
Figure 2. Comparison of the results of the six algorithms on the HPRD dataset using the CHPC2012 gold-standard dataset. Columns correspond to the following algorithms, from left to right: CFinder, ClusterONE, HC-PIN, PCE-FR, and PMDpc. The colors within a column denote the individual components of the algorithm's composite score (cyan = F-measure, blue = ACC, and purple = MMR). The total height of each column is the value of the composite score for a specific algorithm on a specific dataset. A larger score indicates a better clustering result.
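The stacked columns suggest that the composite score is the plain sum of its three components. The sketch below makes that assumption explicit (an unweighted sum of F-measure, ACC, and MMR), uses the values from the HPRD results table, and again assumes that the unlabeled row is PMDpc.

```python
# (F-measure, ACC, MMR) per algorithm, copied from the HPRD results table.
components = {
    "CFinder": (0.249, 0.184, 0.017),
    "ClusterONE": (0.229, 0.333, 0.084),
    "RRW": (0.296, 0.236, 0.034),
    "HC-PIN": (0.230, 0.256, 0.024),
    "PCE-FR": (0.267, 0.279, 0.029),
    "PMDpc": (0.398, 0.362, 0.010),  # assumed label for the unlabeled row
}

# Assumed composite score: unweighted sum of the three stacked components.
composite = {name: round(sum(parts), 3) for name, parts in components.items()}

# Rank algorithms from highest to lowest composite score.
ranking = sorted(composite, key=composite.get, reverse=True)

for name in ranking:
    print(f"{name:11s} {composite[name]:.3f}")
```

Under this assumption PMDpc has the highest composite score, which agrees with the claim in the abstract.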
Figure 3. Graphical description of the decomposition. The matrix is decomposed into two base matrices, U and V, and a diagonal matrix.
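The normalize-then-decompose pipeline described in the abstract and in Figure 3 can be illustrated with a small sketch. This is not the authors' implementation. It assumes symmetric degree normalization, a fixed soft-threshold `lam` in place of explicit L1 constraints, greedy rank-one deflation, and reading each complex off the nonzero support of a factor column. All function names, the symbol `d` for the diagonal entries, and the toy network are hypothetical.

```python
import numpy as np


def normalize_adjacency(A):
    """Symmetric degree normalization D^{-1/2} A D^{-1/2} (assumed choice)."""
    deg = A.sum(axis=1).astype(float)
    inv_sqrt = np.zeros_like(deg)
    inv_sqrt[deg > 0] = 1.0 / np.sqrt(deg[deg > 0])
    return A * inv_sqrt[:, None] * inv_sqrt[None, :]


def soft_threshold(x, lam):
    """Shrink entries toward zero; entries below lam in magnitude become 0."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def unit(x):
    """Scale x to unit L2 norm (zero vectors are returned unchanged)."""
    n = np.linalg.norm(x)
    return x / n if n > 0 else x


def rank1_pmd(X, lam=0.1, n_iter=100):
    """One sparse factor triple (u, d, v) such that X ~ d * u v^T."""
    # Deterministic start: the column of X with the largest norm.
    j = int(np.argmax(np.linalg.norm(X, axis=0)))
    v = unit(X[:, j])
    for _ in range(n_iter):
        u = unit(soft_threshold(X @ v, lam))
        v = unit(soft_threshold(X.T @ u, lam))
    return u, float(u @ X @ v), v


def pmd(X, k=2, lam=0.1):
    """Greedy rank-k decomposition X ~ U diag(d) V^T by deflation."""
    R = X.astype(float).copy()
    us, ds, vs = [], [], []
    for _ in range(k):
        u, d, v = rank1_pmd(R, lam)
        us.append(u)
        ds.append(d)
        vs.append(v)
        R = R - d * np.outer(u, v)  # remove the component just found
    return np.array(us).T, np.array(ds), np.array(vs).T


# Toy network: two disjoint triangles, nodes {0, 1, 2} and {3, 4, 5}.
A = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    A[a, b] = A[b, a] = 1.0

U, D, V = pmd(normalize_adjacency(A), k=2, lam=0.1)

# Assumed read-out: one candidate complex per column, from its nonzero support.
complexes = [
    set(np.flatnonzero(np.abs(U[:, j]) > 1e-8).tolist())
    for j in range(U.shape[1])
]
print(complexes)
```

On this toy input each factor column is supported on exactly one triangle. Real PPI networks are connected and noisy, so the threshold and the number of factors `k` would need tuning.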