Apurva Badkas, Sébastien De Landtsheer, Thomas Sauter.
Abstract
Protein-protein interaction network (PPIN) analysis is a widely used method to study the contextual role of proteins of interest, to predict novel disease genes and disease or functional modules, and to identify novel drug targets. PPIN-based analysis uses both generic and context-specific networks. Multiple contextualization methodologies have been described, such as shortest-path algorithms, neighborhood-based methods, and diffusion/propagation algorithms. This review discusses these methods, provides intuitive representations of PPIN contextualization, and examines how the quality of such context-specific networks could be improved by considering additional sources of evidence. As a heuristic, we observe that tasks such as identifying disease genes, drug targets, and protein complexes should consider local neighborhoods, while uncovering disease mechanisms and discovering disease pathways would benefit from diffusion-based construction.
Keywords: Context-specific network; Diffusion; Neighborhood; Protein-protein interaction network
Year: 2022 PMID: 35832626 PMCID: PMC9251778 DOI: 10.1016/j.csbj.2022.06.040
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Some applications of PPINs mentioned in the literature.
| Authors | Type | Application | Comments | Ref |
|---|---|---|---|---|
| Tomkins and Manzoni | Review | PPIN analysis of Parkinson's disease | Illustrates the different uses of PPIN analysis, including exploration of the neighbourhood of a single gene, disease gene prioritization, exploration of novel functions, disease candidates and pathways, and comparative studies with other neurodegenerative diseases | |
| Vinayagam | Method | Predicted novel cancer-associated genes | Applied concepts from control theory to PPIN analysis | |
| Cheng | Method | Predicted hundreds of drug-disease associations | Method based on network proximity of disease proteins and drug targets in a PPIN | |
| Cheng | Method | Predicted drug combinations | Network proximity applied to prediction of drug combinations. Approach validated for a combination of anti-hypertensives | |
| Chautard | Review | Identifying drug targets | Analysis of drug targets in a PPIN identifies their characteristics and can guide drug design | |
| Choobdar | Review | Identifying protein communities and functional and disease modules | The DREAM challenge showcases various approaches for identifying modules based on network topology, including in PPINs | |
| Maron | Method | Patient-specific subnetwork identification and disease sub-typing | Differences between types of cardiomyopathy (hypertrophic and dilated) detected based on patient-specific networks | |
| Vavouraki | Method | Disease stratification and exploration of molecular mechanisms | PPIN-based study of Hereditary Spastic Paraplegia | |
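The network-proximity idea listed in the table (drug targets lying close to disease proteins in a PPIN) can be illustrated with a small sketch. It uses one common formulation, the average distance from each drug target to its nearest disease protein. Whether this matches the exact measure used in the cited studies is not stated here, so treat it as an assumption. The toy graph, the node names, and the function names `bfs_distances` and `closest_proximity` are all hypothetical. Published analyses typically also compare the raw distance against randomized node sets, which is omitted here.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency map from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def bfs_distances(adj, source):
    """Unweighted shortest-path distance from `source` to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist

def closest_proximity(adj, disease_proteins, drug_targets):
    """Mean, over mapped drug targets, of the distance to the nearest disease protein.

    Targets missing from the PPIN or disconnected from all disease proteins are
    skipped; returns None if no target can be scored.
    """
    total, counted = 0, 0
    for target in drug_targets:
        if target not in adj:
            continue
        dist = bfs_distances(adj, target)
        reachable = [dist[p] for p in disease_proteins if p in dist]
        if reachable:
            total += min(reachable)
            counted += 1
    return total / counted if counted else None

# Toy PPIN (hypothetical names): D1, D2 are disease proteins; T1, T2 are drug targets.
adj = build_adjacency([("T1", "D1"), ("D1", "D2"), ("D2", "M"), ("M", "T2")])
disease = {"D1", "D2"}
prox_drug_a = closest_proximity(adj, disease, ["T1"])
prox_drug_b = closest_proximity(adj, disease, ["T2"])
print(prox_drug_a, prox_drug_b)
```

A smaller value means the drug's targets sit closer to the disease proteins; in this toy network the drug targeting `T1` is closer to the disease proteins than the drug targeting `T2`.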
Fig. 1. Flowchart of the major steps discussed in the manuscript to obtain a context-specific network. The inputs are a set of proteins representing a specific context and a generic PPIN. Two main contextualization methods, neighbourhood-based and diffusion-based, are elaborated. We also discuss additional options for curation using different data sources. The contextualized network can then be subjected to further analysis, such as identification of important nodes and clustering.
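The diffusion-based route named in Fig. 1 can be sketched with a random walk with restart, one widely used propagation scheme. This is a minimal illustration under stated assumptions, not the specific algorithm of any tool discussed in the review. The toy path graph, the restart probability of 0.3, and the function name `random_walk_with_restart` are all hypothetical choices.

```python
def build_adjacency(edges):
    """Undirected adjacency map from an edge list (every node has degree >= 1)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def random_walk_with_restart(adj, seeds, restart=0.3, tol=1e-12, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until the L1 change < tol.

    W spreads each node's probability equally over its neighbours; p0 is uniform
    over the seeds that are present in the network.
    """
    mapped = [s for s in seeds if s in adj]
    if not mapped:
        raise ValueError("no seed maps to the network")
    p0 = {n: (1.0 / len(mapped) if n in mapped else 0.0) for n in adj}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in adj}
        for u, neighbours in adj.items():
            share = (1.0 - restart) * p[u] / len(neighbours)
            for v in neighbours:
                nxt[v] += share
        delta = sum(abs(nxt[n] - p[n]) for n in adj)
        p = nxt
        if delta < tol:
            break
    return p

# Toy PPIN: a simple chain A-B-C-D-E with A as the only seed (hypothetical names).
adj = build_adjacency([("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")])
scores = random_walk_with_restart(adj, ["A"])
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

A contextualized network would then be obtained by keeping the top-ranked nodes and the PPIN edges among them; where to place that cut-off is a modelling decision that this sketch leaves open.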
Fig. 2. Different components of contextualized PPIN construction.
Some PPIN databases.
| Database | Interactions* | Type | Organisms | Comments |
|---|---|---|---|---|
| HPRD | 41,327 | Primary | 1 ( | Manually curated from literature. Last updated in 2010 |
| APID | 667,805 | Secondary | >400 | Experimentally validated interactions. Last updated in 2021. Collection of interactions from IntAct, HPRD, BioGRID, DIP and BioPlex |
| BioGRID | 841,206 | Primary | 81 | Lists physical and genetic interactions for various organisms. Contains a 'multi-validated' dataset of high-confidence interactions, based on the presence of multiple lines of evidence for a given interaction. Updated monthly |
| IntAct | 362,712** | Primary | 16 | Experimentally obtained data, curated data from literature |
| BioPlex | ~120,000 (HEK293T); ~71,000 (HCT116) | Primary | 2 human cell lines | Experimentally obtained Affinity-Purification Mass Spectrometry (AP-MS) data |
| STRING | 11,938,498 | Secondary/Predictive | 14,094 | Physical and functional interactions obtained from experiments, computational predictions, text-mining and other databases. Provides confidence scores associated with each interaction |
| HIPPIE | 783,182 | Secondary | 1 ( | Provides confidence scores and functional annotation for experimentally verified interactions. Last updated April 2022 |
| HINT | 119,526 | Secondary | 12 | Manually curated high-throughput experimental data, curated from 8 different databases |
| GeneMANIA | 11,749,785^ | Secondary | 9 | Physical and functional interactions. Can be used as a curation tool, e.g., for adding missing members to a network, and as a tool for functional annotation and interpretation |
*As obtained from the database website, May 2022.
** For human, only intra-species physical interactions between proteins are considered here.
^All human interactions.
+Non-redundant physical, ++Non-redundant genetic.
Physical and functional interactions.
Fig. 3. Illustration of network building methods. Simply connecting seed nodes (a) after mapping them to a generic PPIN (a step in which some seed nodes may be lost) keeps the number of nodes constant and leads to a modest increase in the number of edges. Connecting nodes via shortest paths (b) increases the number of both nodes and edges; whether one or several shortest paths are considered affects the size of the final network. A steep increase in both nodes and edges may be expected when neighbourhood- or diffusion-based network building is applied (c), especially if the seed genes include hub nodes. The upper bound on network size is the entire generic PPIN used in the process. Network building can be viewed as a bottom-up construction (d) or a top-down contextualization (e). In bottom-up construction, starting from the nodes of interest, one builds up the connections between them based on evidence available in the generic PPIN, or adds new nodes and edges to the starting nodes to understand how they influence, and are influenced by, their interaction partners. In top-down contextualization, one starts with a generic PPIN and trims away nodes and edges that may not be expressed in specific tissues, cell types, or disease contexts. In either case, multiple criteria can be used to reach a final network. A constructed or contextualized network can then be appended or pruned (f). For example, for a network constructed by connecting the seeds via shortest paths, one would need to consider interactions among the newly added nodes, which increases the number of edges while keeping the number of nodes constant.
Alternatively, given evidence of expression, one may include additional nodes, and thus additional edges, in the network. Conversely, one may prune the network based on additional criteria, such as removing peripheral nodes (reducing the number of nodes and edges) or removing edges that lack experimental support. The loss of an edge may or may not reduce the number of nodes. While these steps may seem trivial, they affect the size and topology of the network and have a major effect on the predictions and conclusions of network-based analyses.
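The three construction strategies of Fig. 3 (a-c) and the appending step (f) can be compared on a toy network. The sketch below is illustrative only: the edge list, the node names, and all function names are hypothetical, seed `X` is deliberately absent from the PPIN to mimic seed loss during mapping, and only one shortest path per seed pair is used.

```python
from collections import deque

# Toy generic PPIN (hypothetical names). Seed "X" does not map to the network.
EDGES = [("S1", "P"), ("P", "S2"), ("S2", "Q"), ("Q", "S3"), ("S1", "R"),
         ("R", "S3"), ("P", "Q"), ("S2", "H"), ("H", "N1"), ("S3", "S4")]
SEEDS = ["S1", "S2", "S3", "S4", "X"]

def build_adjacency(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def induced_edges(adj, nodes):
    """All PPIN edges whose two endpoints both lie in `nodes`."""
    nodes = set(nodes)
    return {frozenset((u, v)) for u in nodes for v in adj.get(u, ()) if v in nodes}

def shortest_path(adj, source, target):
    """One shortest path found by breadth-first search, or None if unreachable."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nb in sorted(adj[node]):  # sorted for a deterministic choice of path
            if nb not in parent:
                parent[nb] = node
                queue.append(nb)
    return None

def seed_network(adj, seeds):
    """(a) Mapped seeds only, with the PPIN edges directly between them."""
    nodes = {s for s in seeds if s in adj}
    return nodes, induced_edges(adj, nodes)

def shortest_path_network(adj, seeds):
    """(b) Join each pair of mapped seeds by one shortest path; keep path edges only."""
    mapped = sorted(s for s in seeds if s in adj)
    nodes, edges = set(mapped), set()
    for i, s in enumerate(mapped):
        for t in mapped[i + 1:]:
            path = shortest_path(adj, s, t)
            if path:
                nodes.update(path)
                edges.update(frozenset(pair) for pair in zip(path, path[1:]))
    return nodes, edges

def neighbourhood_network(adj, seeds):
    """(c) Mapped seeds plus all first neighbours, with every PPIN edge among them."""
    nodes = {s for s in seeds if s in adj}
    for s in list(nodes):
        nodes |= adj[s]
    return nodes, induced_edges(adj, nodes)

adj = build_adjacency(EDGES)
seed_nodes, seed_edges = seed_network(adj, SEEDS)
sp_nodes, sp_edges = shortest_path_network(adj, SEEDS)
sp_appended_edges = induced_edges(adj, sp_nodes)  # (f) add edges among connector nodes
nb_nodes, nb_edges = neighbourhood_network(adj, SEEDS)

for name, nodes, edges in [("seeds only", seed_nodes, seed_edges),
                           ("shortest paths", sp_nodes, sp_edges),
                           ("shortest paths, appended", sp_nodes, sp_appended_edges),
                           ("first neighbourhood", nb_nodes, nb_edges)]:
    print(f"{name}: {len(nodes)} nodes, {len(edges)} edges")
```

In this toy case the appended shortest-path network gains an edge between two connector nodes without gaining any node, and the neighbourhood network is the largest of the three, in line with the qualitative trend described in the caption.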