Wei Peng, Jianxin Wang, Fangxiang Wu, Pan Yi.
Abstract
The growing volume of protein-protein interaction (PPI) data for different species makes it possible to identify common subnetworks (conserved protein complexes) across species by locally aligning their PPI networks, which helps in studying biological evolution. Local alignment algorithms compare the PPI networks of different species at both the protein sequence level and the network structure level. For computational and biological reasons, it is hard to find common subnetworks with strictly similar topology in two input PPI networks. Consequently, some methods introduce less strict criteria for topological similarity. However, those methods ignore the differences between the two input networks and apply equally lenient criteria to both. In this work, a new dividing-and-matching-based method, UEDAMAlign, is proposed to detect conserved protein complexes. The method first uses known protein complexes or computational methods to divide one of the two input PPI networks into subnetworks, and then maps the proteins in these subnetworks to the other PPI network to obtain their homologous proteins. After that, UEDAMAlign applies unequally lenient criteria to the two input networks to find common connected components among the proteins in the subnetworks and their homologous proteins in the other network. We carry out network alignments between S. cerevisiae and D. melanogaster, and between H. sapiens and D. melanogaster. UEDAMAlign is compared with six existing methods. The experimental results show that UEDAMAlign outperforms the existing methods in recovering conserved protein complexes that both match known protein complexes well and have similar functions.
Keywords: Conserved protein complexes; Local network alignment; Network alignment; PPI networks
Year: 2015 PMID: 26136815 PMCID: PMC4487215 DOI: 10.1186/s13015-015-0053-5
Source DB: PubMed Journal: Algorithms Mol Biol ISSN: 1748-7188 Impact factor: 1.405
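The dividing-and-matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it performs a single pass with strict connectivity in both networks and does not model the unequally lenient criteria. The function names, the edge-list representation, and the `homologs` dictionary are all assumptions made for illustration.

```python
from collections import defaultdict

def connected_components(nodes, edges):
    """Connected components of the subgraph induced by `nodes`."""
    nodes = set(nodes)
    adj = defaultdict(set)
    for u, v in edges:
        if u in nodes and v in nodes:
            adj[u].add(v)
            adj[v].add(u)
    seen, comps = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            u = stack.pop()
            comp.add(u)
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps

def match_subnetwork(subnet, homologs, edges_a, edges_b):
    """Map one subnetwork of network A onto network B (hypothetical helper).

    Returns pairs (proteins in A, connected homologous proteins in B).
    Simplification: the A-side set is not re-checked for connectivity
    after filtering, and no lenient criteria are applied.
    """
    results = []
    for comp_a in connected_components(subnet, edges_a):
        # Homologous proteins of this component in the other network.
        mapped = {h for p in comp_a for h in homologs.get(p, ())}
        for comp_b in connected_components(mapped, edges_b):
            kept_a = {p for p in comp_a if comp_b & set(homologs.get(p, ()))}
            if len(kept_a) >= 2 and len(comp_b) >= 2:
                results.append((kept_a, comp_b))
    return results

# Toy example: a3 has a homolog (b9) that is isolated in network B.
edges_a = [("a1", "a2"), ("a2", "a3")]
edges_b = [("b1", "b2")]
homologs = {"a1": ["b1"], "a2": ["b2"], "a3": ["b9"]}
print(match_subnetwork({"a1", "a2", "a3"}, homologs, edges_a, edges_b))
```

In the toy example only a1 and a2 are kept, because their homologs b1 and b2 interact in the second network while the homolog of a3 does not connect to them.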
Figure 1. Eight cases (a–h) of connectivity in conserved protein complexes from two different PPI networks when UEDAMAlign adopts the same lenient criteria as AlignNemo to extend a pair of homologous proteins. Nodes of different colors come from different PPI networks. Solid lines connecting nodes of different colors represent known homologous mappings. Dotted lines represent artificial homologous mappings produced by an unbalanced Bi-random walk algorithm. Solid lines connecting nodes of the same color represent interactions.
Figure 2. Eleven cases (a–k) of connectivity in conserved protein complexes from two different PPI networks when parameters l and r are set to 2 and 3 while extending a pair of homologous proteins. Nodes of different colors come from different PPI networks. Solid lines connecting nodes of different colors represent known homologous mappings. Dotted lines represent artificial homologous mappings produced by an unbalanced Bi-random walk algorithm. Solid lines connecting nodes of the same color represent interactions.
Basic information on the results of the different methods (yeast-fly)
| Method | Conserved pairs | Yeast: distinct complexes (size ≥2) | Yeast: avg size | Fly: distinct complexes (size ≥2) | Fly: avg size |
|---|---|---|---|---|---|
| UEDAMAlignCFinder (k=4) | 129 | 129 | 7.48 | 129 | 10.72 |
| UEDAMAlignCMC | 128 | 128 | 9.65 | 128 | 12.89 |
| UEDAMAlignCoach | 725 | 725 | 5.84 | 723 | 4.32 |
| UEDAMAlignknowncomplex | 148 | 148 | 3.92 | 146 | 5.12 |
| UEDAMAlignMCL | 862 | 862 | 3.16 | 861 | 3.23 |
| AlignMCL | 933 | 915 | 3.22 | 927 | 3.79 |
| Match-and-Split | 27 | 27 | 4.63 | 27 | 6.85 |
| Mawish | 41 | 41 | 2.34 | 40 | 3.55 |
| NetworkBlast | 191 | 179 | 9.12 | 191 | 10.86 |
| Produles | 95 | 46 | 4.09 | 46 | 4.39 |
Comparison of different methods in terms of how well their predictions match known protein complexes
| Methods | PC | MPC | MKC | Recall | Precision | F-measure | CR | PM |
|---|---|---|---|---|---|---|---|---|
| Yeast-fly | ||||||||
| UEDAMAlignCFinder (k = 4) | 129 | 59 | 66 | 0.1471 | 0.4574 | 0.2226 | 0.1891 | 2 |
| UEDAMAlignCMC | 128 | 58 | 73 | 0.1476 | 0.4531 | 0.2226 | 0.2068 | 0 |
| UEDAMAlignCoach | 725 | 207 | 129 | 0.4259 | 0.2855 | 0.3419 | 0.3057 | 4 |
| UEDAMAlignknowncomplex | 148 | 145 | 172 | 0.3806 | 0.9797 | 0.5482 | 0.3432 | 45 |
| UEDAMAlignMCL | 862 | 159 | 137 | 0.3698 | 0.1845 | 0.2461 | 0.2401 | 9 |
| AlignMCL | 915 | 151 | 162 | 0.3804 | 0.165 | 0.2302 | 0.2479 | 9 |
| Match-and-Split | 27 | 12 | 20 | 0.03 | 0.4444 | 0.0562 | 0.0641 | 2 |
| Mawish | 41 | 16 | 26 | 0.0402 | 0.3902 | 0.0729 | 0.0318 | 1 |
| NetworkBlast | 179 | 9 | 10 | 0.0221 | 0.0503 | 0.0307 | 0.0391 | 0 |
| Produles | 46 | 29 | 26 | 0.0706 | 0.6304 | 0.1269 | 0.0573 | 3 |
| Human-fly | ||||||||
| UEDAMAlignCFinder (k = 4) | 238 | 80 | 187 | 0.0531 | 0.3361 | 0.0917 | 0.106 | 3 |
| UEDAMAlignCMC | 404 | 67 | 182 | 0.0447 | 0.1658 | 0.0705 | 0.1585 | 2 |
| UEDAMAlignCoach | 1,538 | 428 | 493 | 0.2765 | 0.2783 | 0.2774 | 0.2983 | 10 |
| UEDAMAlignknowncomplex | 515 | 508 | 821 | 0.3908 | 0.9864 | 0.5598 | 0.4242 | 158 |
| UEDAMAlignMCL | 1,453 | 171 | 322 | 0.117 | 0.1177 | 0.1173 | 0.2008 | 9 |
| AlignMCL | 1,094 | 144 | 305 | 0.0992 | 0.1316 | 0.1131 | 0.1697 | 7 |
| Match-and-Split | 53 | 23 | 73 | 0.0147 | 0.434 | 0.0285 | 0.0558 | 3 |
| Mawish | 61 | 28 | 70 | 0.0178 | 0.459 | 0.0343 | 0.0333 | 1 |
| NetworkBlast | 164 | 45 | 107 | 0.029 | 0.2744 | 0.0525 | 0.0897 | 0 |
| Produles | 99 | 35 | 77 | 0.0223 | 0.3535 | 0.0419 | 0.0461 | 5 |
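The reported Precision and F-measure values can be cross-checked from the other columns. The sketch below assumes that PC counts predicted complexes and MPC counts predicted complexes matching a known complex (the abbreviations are not defined in this section), so that Precision = MPC / PC, and that F-measure is the harmonic mean of Precision and Recall. Both relations reproduce the tabulated numbers for the rows tried here; Recall is taken from the table as given.

```python
def precision(mpc, pc):
    """Fraction of predicted complexes that match a known complex (assumed definition)."""
    return mpc / pc

def f_measure(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0

# (PC, MPC, Recall) copied from the yeast-fly rows of the table above.
rows = {
    "UEDAMAlignCFinder (k = 4)": (129, 59, 0.1471),
    "UEDAMAlignknowncomplex": (148, 145, 0.3806),
    "AlignMCL": (915, 151, 0.3804),
}

for name, (pc, mpc, recall) in rows.items():
    p = precision(mpc, pc)
    print(f"{name}: precision={p:.4f}, F-measure={f_measure(p, recall):.4f}")
```

For these three rows the computed values agree with the table to four decimal places (for example, 145 / 148 ≈ 0.9797 and an F-measure of about 0.5482 for UEDAMAlignknowncomplex).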
Comparison in terms of biological relevance between each pair of conserved protein complexes predicted by each method
| Methods | Yeast-fly | | | | |
|---|---|---|---|---|---|
| UEDAMAlignCFinder (k = 4) | 129 | 3.96 | 5.3766 | 3.4259 | 3.7503 |
| UEDAMAlignCMC | 128 | 3.5266 | 4.9468 | 2.9469 | 3.3061 |
| UEDAMAlignCoach | 725 | 3.4729 | 4.4809 | 2.5707 | 3.1565 |
| UEDAMAlignKnownComplex | 148 | 4.5041 | 7.0767 | 3.7421 | 3.9779 |
| UEDAMAlignMCL | 862 | 2.3539 | 3.2412 | 1.4475 | 2.3063 |
| AlignMCL | 933 | 2.2563 | 2.9469 | 1.255 | 2.2319 |
| Match-and-Split | 27 | 4.069 | 5.7868 | 3.3512 | 3.614 |
| Mawish | 41 | 4.4942 | 5.9584 | 3.7828 | 4.2566 |
| NetworkBlast | 191 | 2.2865 | 2.8698 | 1.8388 | 2.198 |
| Produles | 95 | 3.4301 | 6.3427 | 2.525 | 2.8541 |
Basic information on the results of the different methods based on AlignNemo's dataset (yeast-fly)
| Methods | Conserved pairs | Yeast: distinct complexes (size ≥2) | Yeast: avg size | Fly: distinct complexes (size ≥2) | Fly: avg size |
|---|---|---|---|---|---|
| UEDAMAlignCFinder (k = 4) | 126 | 126 | 8.02 | 126 | 17.13 |
| UEDAMAlignCMC | 127 | 127 | 10.57 | 127 | 23.39 |
| UEDAMAlignCoach | 1,019 | 1,019 | 9.34 | 1,019 | 18.6 |
| UEDAMAlignknowncomplex | 160 | 160 | 4.04 | 156 | 8.65 |
| UEDAMAlignMCL | 697 | 697 | 6.26 | 696 | 5.35 |
| AlignNemo | 248 | 243 | 9.27 | 246 | 10.06 |
| AlignMCL | 684 | 523 | 3.63 | 630 | 12.92 |
Comparison of different methods in terms of how well their predictions match known protein complexes, based on AlignNemo's dataset
| Methods | PC | MPC | MKC | Recall | Precision | F-measure | CR | PM |
|---|---|---|---|---|---|---|---|---|
| Yeast-fly | ||||||||
| UEDAMAlignCFinder (k = 4) | 126 | 62 | 66 | 0.1535 | 0.4921 | 0.234 | 0.1682 | 2 |
| UEDAMAlignCMC | 127 | 57 | 74 | 0.1458 | 0.4488 | 0.2201 | 0.2208 | 0 |
| UEDAMAlignCoach | 1,019 | 190 | 134 | 0.4095 | 0.1865 | 0.2562 | 0.288 | 0 |
| UEDAMAlignknowncomplex | 160 | 158 | 184 | 0.4136 | 0.9875 | 0.583 | 0.3745 | 47 |
| UEDAMAlignMCL | 697 | 113 | 115 | 0.2783 | 0.1621 | 0.2049 | 0.2365 | 4 |
| AlignNemo | 243 | 77 | 53 | 0.1782 | 0.3169 | 0.2281 | 0.1755 | 0 |
| AlignMCL | 523 | 95 | 97 | 0.234 | 0.1816 | 0.2045 | 0.2224 | 5 |
| Human-fly | ||||||||
| UEDAMAlignCFinder (k = 4) | 116 | 42 | 67 | 0.0264 | 0.3621 | 0.0493 | 0.0451 | 1 |
| UEDAMAlignCMC | 288 | 62 | 101 | 0.0394 | 0.2153 | 0.0666 | 0.1251 | 1 |
| UEDAMAlignCoach | 2,978 | 432 | 281 | 0.2449 | 0.1451 | 0.1822 | 0.1945 | 0 |
| UEDAMAlignknowncomplex | 333 | 326 | 552 | 0.235 | 0.979 | 0.3791 | 0.2634 | 34 |
| UEDAMAlignMCL | 679 | 103 | 219 | 0.0688 | 0.1517 | 0.0947 | 0.1459 | 2 |
| AlignNemo | 114 | 31 | 48 | 0.0194 | 0.2719 | 0.0363 | 0.0628 | 0 |
| AlignMCL | 732 | 97 | 254 | 0.0666 | 0.1325 | 0.0887 | 0.2012 | 2 |
Comparison in terms of biological relevance between each pair of conserved protein complexes predicted by each method, based on AlignNemo's dataset
| Methods | Yeast-fly | | | | |
|---|---|---|---|---|---|
| UEDAMAlignCFinder (k = 4) | 126 | 2.669 | 4.9984 | 1.8629 | 2.5173 |
| UEDAMAlignCMC | 127 | 2.3109 | 4.669 | 1.661 | 2.223 |
| UEDAMAlignCoach | 1,019 | 2.0566 | 3.784 | 1.5193 | 2.0422 |
| UEDAMAlignKnownComplex | 160 | 2.7962 | 7.16 | 1.7741 | 2.4475 |
| UEDAMAlignMCL | 697 | 1.9411 | 3.0208 | 1.3032 | 1.6191 |
| AlignNemo | 248 | 1.7501 | 3.5919 | 0.916 | 1.3803 |
| AlignMCL | 683 | 1.2522 | 2.283 | 1.019 | 1.4451 |
Comparison of the performance of UEDAMAlignKnownComplex and UEDAMAlignCoach for various values of parameters l and r, in terms of how well their predictions match known protein complexes
| Methods | PC | MPC | MKC | Recall | Precision | F-measure | CR | PM |
|---|---|---|---|---|---|---|---|---|
| Yeast-fly | ||||||||
| | 725 | 207 | 129 | 0.4259 | 0.2855 | 0.3419 | 0.3057 | 4 |
| | 762 | 214 | 144 | 0.4477 | 0.2808 | 0.3452 | 0.3078 | 4 |
| | 785 | 217 | 142 | 0.4493 | 0.2764 | 0.3423 | 0.3078 | 4 |
| | 785 | 218 | 144 | 0.4523 | 0.2777 | 0.3441 | 0.3078 | 4 |
| | 148 | 145 | 172 | 0.3806 | 0.9797 | 0.5482 | 0.3432 | 45 |
| | 149 | 146 | 173 | 0.3832 | 0.9799 | 0.5509 | 0.3453 | 46 |
| | 148 | 145 | 172 | 0.3806 | 0.9797 | 0.5482 | 0.3432 | 45 |
| | 149 | 146 | 173 | 0.3832 | 0.9799 | 0.5509 | 0.3453 | 46 |
| Human-fly | ||||||||
| | 1,538 | 428 | 493 | 0.2765 | 0.2783 | 0.2774 | 0.2983 | 10 |
| | 1,420 | 410 | 474 | 0.2647 | 0.2887 | 0.2762 | 0.2963 | 9 |
| | 1,421 | 401 | 469 | 0.2595 | 0.2822 | 0.2704 | 0.2965 | 9 |
| | 1,430 | 406 | 473 | 0.2626 | 0.2839 | 0.2728 | 0.2965 | 9 |
| | 515 | 508 | 821 | 0.3908 | 0.9864 | 0.5598 | 0.4242 | 158 |
| | 521 | 514 | 826 | 0.3951 | 0.9866 | 0.5642 | 0.4269 | 158 |
| | 522 | 515 | 827 | 0.3958 | 0.9866 | 0.5650 | 0.4270 | 158 |
| | 524 | 517 | 829 | 0.3974 | 0.9866 | 0.5666 | 0.4274 | 158 |
Comparison in terms of biological relevance between each pair of conserved protein complexes predicted by UEDAMAlignKnownComplex and UEDAMAlignCoach for various values of parameters l and r
| Methods | Yeast-fly | | | | |
|---|---|---|---|---|---|
| | 725 | 3.4729 | 4.4809 | 2.5707 | 3.1565 |
| | 762 | 3.4142 | 4.5086 | 2.6173 | 3.1450 |
| | 785 | 3.4591 | 4.4847 | 2.6750 | 3.2003 |
| | 785 | 3.4140 | 4.4826 | 2.6646 | 3.1738 |
| | 148 | 4.5041 | 7.0767 | 3.7421 | 3.9779 |
| | 149 | 4.3174 | 7.0879 | 3.6295 | 3.8850 |
| | 148 | 4.4140 | 7.0767 | 3.7057 | 3.9537 |
| | 149 | 4.3174 | 7.0879 | 3.6295 | 3.8850 |