Karel Kalecky, Young-Rae Cho.
Abstract
Motivation: Cross-species analysis of large-scale protein-protein interaction (PPI) networks has played a significant role in understanding the principles driving the evolution of cellular organizations and functions. Recently, network alignment algorithms have been proposed to predict conserved interactions and functions of proteins. These approaches are based on the notion that orthologous proteins across species are similar in sequence and that the topology of PPIs between orthologs is often conserved. However, achieving both high accuracy and scalability in network alignment remains a challenge.
Year: 2018 PMID: 29949962 PMCID: PMC6022567 DOI: 10.1093/bioinformatics/bty288
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Summary of algorithms for comparison
| Algorithm | BLAST data | Algorithm parameters | Weights | One-to-one | Enforced coverage | Type |
|---|---|---|---|---|---|---|
| AlignMCL | | | Yes | No | No | Local |
| AlignNemo | | | Yes | No | No | Local |
| CUFID | | | Yes | Yes | No | Global |
| HubAlign | | | No | Yes | Yes | Global |
| IsoRankN | | -K 10 --thresh 1 | No | No | No | Global |
| MAGNA++ | | -p 15000 -n 2000 -a 0.5 -t 4 -m S3 | No | Yes | Yes | Global |
| MI-GRAAL | | -p 19 | No | Yes | Yes | Global |
| NETAL | | -b 0.5 -c 0.5 | No | Yes | No | Global |
| NetCoffee | | | No | Yes | No | Global |
| NetworkBLAST | | beta 0.9; blast_th 1 | Yes | No | No | Local |
| PINALOG | | | No | Yes | No | Local |
| SANA | | -s3 0.5 -sequence 0.5 | No | Yes | Yes | Global |
| SMETANA | | | Yes | No | No | Global |
| WAVE | | | No | Yes | Yes | Global |
| | | | No | No | No | Global |
| | | unweighted inter-network | No | No | No | Global |
| | | | Yes | No | No | Global |
Note: Enforced coverage means that all proteins from the smaller network need to be aligned.
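The "One-to-one" and "Enforced coverage" properties in the table above can be checked on any alignment output. The following is a minimal sketch, assuming an alignment is given as a list of `(node_a, node_b)` pairs; the function names and the input layout are illustrative assumptions, not part of any of the listed tools.

```python
def is_one_to_one(pairs):
    """True if no node of either network appears in more than one aligned pair."""
    left = [a for a, _ in pairs]
    right = [b for _, b in pairs]
    return len(set(left)) == len(left) and len(set(right)) == len(right)


def has_enforced_coverage(pairs, nodes_a, nodes_b):
    """True if every node of the smaller network appears in some aligned pair."""
    if len(nodes_a) <= len(nodes_b):
        smaller, aligned = set(nodes_a), {a for a, _ in pairs}
    else:
        smaller, aligned = set(nodes_b), {b for _, b in pairs}
    return smaller <= aligned


# Toy data (hypothetical identifiers).
nodes_a = {"a1", "a2"}
nodes_b = {"b1", "b2", "b3"}
alignment = [("a1", "b1"), ("a2", "b2")]
print(is_one_to_one(alignment), has_enforced_coverage(alignment, nodes_a, nodes_b))
```

An aligner marked "No" in both columns may map one protein to several partners and may leave proteins of the smaller network unaligned, so both checks can fail on its output.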
Overview and classification of evaluation measures
| | Abbr. | Measure | Calculation |
|---|---|---|---|
| Informative measures | AP | Aligned Pairs | # of aligned pairs |
| | Cov | Coverage | # of unique aligned genes |
| | CE | Conserved edges | # of edges from one network that are aligned to an edge in the other network |
| | LCCC | Largest conserved connected component | # of edges in the largest connected component assembled from conserved edges |
| Annotation-based measures | KO | KO functionally correct alignments | # of aligned pairs where both genes share KO functional annotation |
| | GO | GO semantically correct alignments | Sum of ratios of GO annotations shared by aligned pairs (i.e. sum of Jaccard indices) |
| Ground-truth-based measures | EN | Discovered ENSEMBL orthologs | # of ENSEMBL orthologs found among aligned pairs |
| | IP | Discovered InParanoid orthologs | # of InParanoid orthologs found among aligned pairs |
| Combined topological measures | CE-F | Conserved edges of functionally correct pairs | # of edges from one network that are aligned to an edge in the other network with both aligned gene pairs being functionally correct |
| | CE-O | Conserved edges of known orthologs | # of edges from one network that are aligned to an edge in the other network with both aligned gene pairs being among ENSEMBL or InParanoid orthologs |
| | LCCC-F | Largest conserved connected component from functionally correct pairs | # of edges in the largest connected component assembled from conserved edges between functionally correctly aligned pairs |
| | LCCC-O | Largest conserved connected component from known orthologs | # of edges in the largest connected component assembled from conserved edges between aligned pairs also present among ENSEMBL or InParanoid orthologs |
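To make the calculations in the table above concrete, here is a minimal sketch of AP, Cov, CE, LCCC and the GO score on toy data. Several details are assumptions rather than the authors' implementation: conserved edges are counted from the side of the first network only, the "largest" component is chosen by edge count, pairs with no GO annotation on either side contribute zero, and all function names and identifiers are hypothetical.

```python
from collections import defaultdict


def aligned_pairs(pairs):
    """AP: number of distinct aligned pairs."""
    return len(set(pairs))


def coverage(pairs):
    """Cov: number of unique aligned genes, counted over both networks."""
    return len({a for a, _ in pairs}) + len({b for _, b in pairs})


def conserved_edges(pairs, edges_a, edges_b):
    """CE (assumed: network A's side): edges of A aligned to some edge of B."""
    partners = defaultdict(set)
    for a, b in pairs:
        partners[a].add(b)
    b_edges = {frozenset(e) for e in edges_b}  # undirected edges of B
    conserved = set()
    for u, v in edges_a:
        if any(frozenset((x, y)) in b_edges
               for x in partners.get(u, ()) for y in partners.get(v, ())):
            conserved.add(tuple(sorted((u, v))))
    return conserved


def lccc(conserved):
    """LCCC: edge count of the largest component built from conserved edges."""
    adj = defaultdict(set)
    for u, v in conserved:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], {start}
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    comp.add(nxt)
                    stack.append(nxt)
        best = max(best, sum(1 for u, _ in conserved if u in comp))
    return best


def go_score(pairs, go_a, go_b):
    """GO: sum of Jaccard indices of the GO term sets of each aligned pair."""
    total = 0.0
    for a, b in pairs:
        terms_a, terms_b = go_a.get(a, set()), go_b.get(b, set())
        union = terms_a | terms_b
        if union:
            total += len(terms_a & terms_b) / len(union)
    return total


# Toy data (hypothetical identifiers and annotation terms).
edges_a = [("a1", "a2"), ("a2", "a3"), ("a4", "a5")]
edges_b = [("b1", "b2"), ("b2", "b3")]
pairs = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]
ce = conserved_edges(pairs, edges_a, edges_b)
print(aligned_pairs(pairs), coverage(pairs), len(ce), lccc(ce))  # 4 8 2 2
```

The combined measures (CE-F, CE-O, LCCC-F, LCCC-O) follow the same pattern: filter `pairs` down to the functionally correct pairs or known orthologs first, then call `conserved_edges` and `lccc` on the filtered list.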
Alignment overview with informative measures
| Aligner | Weighted | AP | Cov | CE | LCCC | cKO | cGO | cEN | cIP |
|---|---|---|---|---|---|---|---|---|---|
| AlignMCL | No | 7711 | 7064 | 19 528 | 7716 | 4.39 | 3.34 | 2.80 | 6.00 |
| AlignMCL | Yes | 7711 | 7064 | 19 528 | 7716 | 4.39 | 3.34 | 2.80 | 6.00 |
| AlignNemo | No | 4776 | 2826 | 9385 | 3936 | 4.37 | 3.24 | 3.09 | 6.46 |
| AlignNemo | Yes | 4081 | 2515 | 8127 | 3425 | 4.43 | 3.28 | 3.21 | 6.38 |
| CUFID | No | 5654 | 11 308 | 13 892 | 6664 | 4.54 | 4.34 | 4.03 | 5.30 |
| CUFID | Yes | 5250 | 10 500 | 14 744 | 7070 | 4.14 | 3.78 | 3.69 | 4.85 |
| HubAlign | No | 5926 | 11 852 | 50 016 | 24 978 | 5.19 | 4.36 | 4.50 | 5.99 |
| IsoRankN | No | 2964 | 5065 | 8711 | 3945 | 2.18 | 2.80 | 1.93 | 2.53 |
| MAGNA++ | No | 5933 | 11 866 | 3388 | 1601 | 11.19 | 6.10 | 9.10 | 13.93 |
| MI-GRAAL | No | ||||||||
| NETAL | No | 3100 | 6200 | 456 | 129 | ∞ | 7.74 | 3100 | ∞ |
| NetCoffee | No | 2310 | 4620 | 3742 | 1654 | 2.94 | 2.85 | 2.62 | 3.68 |
| NetworkBLAST | No | 7904 | 3185 | 12 552 | 5224 | 6.68 | 3.47 | 4.46 | 9.81 |
| NetworkBLAST | Yes | 4008 | 2195 | 10 404 | 4432 | 5.37 | 3.47 | 4.06 | 7.68 |
| PINALOG | No | 5317 | 10 634 | 32 792 | 16 285 | 4.19 | 4.01 | 3.76 | 4.70 |
| PrimAlign | No | 3801 | 4883 | 16 518 | 6971 | 2.31 | 2.86 | 1.71 | 3.07 |
| PrimAlign | Yes | 3752 | 4843 | 16 408 | 6934 | 2.29 | 2.86 | 1.70 | 3.04 |
| SANA | No | 5933 | 11 866 | 43 524 | 21 738 | 9.11 | 5.14 | 7.43 | 11.03 |
| SMETANA | No | 3487 | 5384 | 18 770 | 8981 | 2.83 | 2.97 | 2.31 | 3.46 |
| SMETANA | Yes | 3854 | 5718 | 14 417 | 6320 | 2.86 | 3.03 | 2.28 | 3.55 |
| WAVE | No | 5933 | 11 866 | 32 500 | 16 138 | 10.85 | 5.48 | 9.90 | 12.68 |
Fig. 1. Results of human-yeast alignment. Each chart shows one evaluation measure (as detailed in Table 2). PrimAlign and its modifications were run to output the same number of aligned pairs as other algorithms for direct comparison. The red cross marks denote the level of aligned pairs for PrimAlign using the default threshold (dark red for U, light red for W; they mostly overlap). U: unweighted networks; W: weighted networks.
Statistical comparison of PrimAlign with the others
| Aligner | Weighted | KO | GO | EN | IP | CE-F | CE-O | LCCC-F | LCCC-O |
|---|---|---|---|---|---|---|---|---|---|
| AlignMCL | No | ||||||||
| AlignMCL | Yes | ||||||||
| AlignNemo | No | ||||||||
| AlignNemo | Yes | ||||||||
| CUFID | No | ||||||||
| CUFID | Yes | ||||||||
| HubAlign | No | ||||||||
| IsoRankN | No | ||||||||
| MAGNA++ | No | ||||||||
| MI-GRAAL | No | ||||||||
| NETAL | No | ||||||||
| NetCoffee | No | ||||||||
| NetworkBLAST | No | ||||||||
| NetworkBLAST | Yes | ||||||||
| PINALOG | No | ||||||||
| SANA | No | ||||||||
| SMETANA | No | ||||||||
| SMETANA | Yes | ||||||||
| WAVE | No |
Notes: Statistical comparison of differences in evaluation measures between PrimAlign with unweighted networks as input and the other algorithms for human-yeast alignment. (empty field) P > 0.05; P ≤ 0.05; P ≤ 0.01; P ≤ 0.001. Green = improvement.
Fig. 2. Results of tests with 30 synthetic networks. Comparing precision, recall and topological measures of PrimAlign with SANA and AlignMCL at their size of aligned node pairs.
Fig. 3. Example of P-R curve. P-R curve of PrimAlign for the 1st synthetic network compared with SANA and AlignMCL. The result of PrimAlign for the default threshold is highlighted.
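A P-R curve like the one described in Fig. 3 can be traced by ranking scored pairs and evaluating every score cutoff against a ground-truth mapping. The sketch below assumes precision is the fraction of reported pairs that are correct and recall is the fraction of true pairs that are recovered. These definitions, the input layout, and the function name are assumptions for illustration and are not taken from the paper.

```python
def pr_curve(scored_pairs, truth):
    """Return (score cutoff, precision, recall) for each prefix of the ranking.

    scored_pairs: list of ((node_a, node_b), score) tuples (assumed layout).
    truth: set of correct (node_a, node_b) pairs.
    """
    if not truth:
        raise ValueError("ground truth must contain at least one pair")
    ranked = sorted(scored_pairs, key=lambda item: item[1], reverse=True)
    points, correct = [], 0
    for k, (pair, score) in enumerate(ranked, start=1):
        if pair in truth:
            correct += 1
        # Keeping the k best-scoring pairs is the same as thresholding at `score`.
        points.append((score, correct / k, correct / len(truth)))
    return points


# Toy data (hypothetical identifiers and scores).
scored = [(("a1", "b1"), 0.9), (("a2", "b3"), 0.8), (("a2", "b2"), 0.6)]
truth = {("a1", "b1"), ("a2", "b2"), ("a3", "b3")}
for cutoff, precision, recall in pr_curve(scored, truth):
    print(f"{cutoff:.1f}  P={precision:.2f}  R={recall:.2f}")
```

Lowering the cutoff admits more pairs, so recall can only grow while precision typically falls. A single default threshold corresponds to one point on this curve.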