NetworkPrioritizer: a versatile tool for network-based prioritization of candidate disease genes or other molecules.

Tim Kacprowski, Nadezhda T Doncheva, Mario Albrecht.

Abstract

SUMMARY: The prioritization of candidate disease genes is often based on integrated datasets and their network representation, with genes as nodes connected by edges for biological relationships. However, the majority of prioritization methods do not allow for a straightforward integration of the user's own input data. Therefore, we developed the Cytoscape plugin NetworkPrioritizer, which specifically supports the integrative network-based prioritization of candidate disease genes or other molecules. Our versatile software tool computes a number of important centrality measures to rank nodes based on their relevance for network connectivity and provides different methods to aggregate and compare rankings.

AVAILABILITY: NetworkPrioritizer and the online documentation are freely available at http://www.networkprioritizer.de

Year: 2013 PMID: 23595661 PMCID: PMC3661055 DOI: 10.1093/bioinformatics/btt164

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 INTRODUCTION

An important objective of medical bioinformatics is to elucidate the genetic foundations of human diseases. To this end, it is crucial to identify genes that might predispose to or cause specific diseases. To rank candidate genes, e.g. from a genome-wide association study, according to their disease relevance, the existing plethora of computational prioritization methods exploits the available biomedical knowledge. Many methods combine multiple genotypic and phenotypic data sources, e.g. gene expression, protein interactions and overlapping disease characteristics (Doncheva ). Integrated information on biological and molecular relationships and interactions is naturally represented as networks. The biological connections between known disease genes and the remaining genes in a network are of particular interest, as they can point to new disease genes according to the guilt-by-association principle.

The majority of prioritization methods are available only as web services (Tranchevent ). Since these require the upload of the user’s input data, they do not allow for the analysis of confidential data. Furthermore, most web services rely on pre-defined background data. For example, GeneWanderer ranks candidate genes based on their distance to disease genes in a pre-defined protein–protein interaction network. GeneDistiller and ENDEAVOUR combine multiple data sources, but do not allow users to include their own data. Additionally, the rank aggregation used by ENDEAVOUR cannot be modified by the user.

Existing Cytoscape plugins for prioritization tasks are also subject to major limitations. The plugin iCTNet (Wang ) queries only a specific database to construct networks and does not support a straightforward integration of the user's own data. The plugins cytoHubba (Lin ) and GPEC (Le and Kwon, 2012) rank network nodes using their close neighborhood and random walks in the network, respectively. However, neither one supports multiple rankings or further analysis of the rankings. The plugin NetworkAnalyzer (Assenov ; Doncheva ) and the Java application CentiBiN (Junker ) feature a large set of centrality measures, but they cannot compute the measures for a user-defined set of seed nodes or for weighted networks.

Here, we present NetworkPrioritizer, a novel Cytoscape plugin for the integrative network-based prioritization of candidate genes or other molecules. It comprises two main functionalities. First, it facilitates the estimation of the relevance of network nodes, e.g. candidate genes, with regard to a set of seed nodes, e.g. known disease genes. Second, our plugin allows for the user-guided aggregation and comparison of multiple node rankings derived according to different relevance measures. Users can supply their own data and tailor the network analysis as well as the rank aggregation to their needs.

2 SOFTWARE FEATURES

2.1 Relevance measures and ranking

NetworkPrioritizer can rank nodes in any user-imported Cytoscape network. Each ranking is based on the relevance of nodes for the network connectivity. This relevance is estimated by a number of centrality measures such as shortest path betweenness, shortest path closeness, random walk betweenness, random walk receiver closeness and random walk transmitter closeness (Borgatti, 2005) (see web site). Closeness quantifies the path distance between a node and the rest of the network. Betweenness measures the influence of a node on the network paths connecting other nodes. Since these measures are applicable only to undirected networks, the edge directions are ignored in directed networks. NetworkPrioritizer can handle unweighted and weighted networks, with a user-adjustable effect of the edge weights on the computed centralities (Opsahl ) (Fig. 1a). A particular feature of NetworkPrioritizer is the computation of the centrality measures for a set of seed nodes, which can be imported from a text file or selected in the network view.
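To make the idea of a seed-focused, weight-aware closeness concrete, the following Python sketch may help. It is not the plugin's implementation: the function names, the closeness formula (number of reachable seeds divided by the summed path cost to them) and the edge cost `1 / weight**alpha` are assumptions chosen for illustration, with `alpha` standing in for the user-adjustable effect of edge weights (`alpha = 0` ignores weights entirely).

```python
import heapq

def build_undirected(edges):
    """Build {node: {neighbor: weight}}; edge directions are ignored."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, {})[v] = w
        adj.setdefault(v, {})[u] = w
    return adj

def shortest_path_costs(adj, source, alpha=1.0):
    """Dijkstra from `source`. Weights are treated as tie strengths, so the
    cost of an edge is 1 / weight**alpha (an assumed, illustrative choice)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + 1.0 / (w ** alpha)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def seed_closeness(adj, seeds, alpha=1.0):
    """Hypothetical seed-focused closeness: reachable seeds (other than the
    node itself) divided by the total path cost to those seeds."""
    total = {v: 0.0 for v in adj}
    reached = {v: 0 for v in adj}
    for s in seeds:
        for v, d in shortest_path_costs(adj, s, alpha).items():
            if v != s:
                total[v] += d
                reached[v] += 1
    return {v: (reached[v] / total[v] if reached[v] else 0.0) for v in adj}

# Toy network: A is the only seed (e.g. a known disease gene).
edges = [("A", "B", 1), ("B", "C", 4), ("C", "D", 1), ("B", "E", 1)]
adj = build_undirected(edges)
scores = seed_closeness(adj, seeds={"A"}, alpha=1.0)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

In this toy network, the strong B-C edge pulls C closer to the seed when `alpha = 1`, whereas with `alpha = 0` C and E are equally close to A.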

Fig. 1.

Two important user-interface elements of NetworkPrioritizer. (a) In the Preferences dialog, the user can adjust settings for the network analysis and for the rank aggregation. (b) The Ranking Manager allows the user to inspect, compare, aggregate, export and import rankings.

2.2 Rank aggregation

The Ranking Manager of NetworkPrioritizer provides different methods to aggregate and compare multiple rankings (Fig. 1b). In this context, the rankings to aggregate are called primary rankings. Weighted Borda Fuse (WBF) is a generalization of the popular Borda count aggregation method (Saari, 1999), which works as follows: in each primary ranking, each node receives a score equal to the number of nodes ranked lower in that ranking. In the aggregated ranking, the nodes are ranked according to the sum of their scores. WBF additionally allows weighting the contribution of each primary ranking to the aggregated score. Weighted AddScore Fuse (WASF) calculates the weighted sum of scores for each node in the primary rankings and awards a higher rank the larger this sum is. Since both WBF and WASF are consensus-based aggregation methods, they can be used to identify candidate genes that attain high ranks in all primary rankings. If the primary rankings are based on comparable scores, i.e. scores on similar scales, WASF is more distinctive and thus more accurate than WBF. MaxRank Fuse performs aggregation by assigning each node the highest rank achieved in any primary ranking. Thus, a candidate with a high rank in a single primary ranking obtains a high rank in the aggregation. Rank aggregation can result in ties if two or more nodes receive the same rank. NetworkPrioritizer can leave ties unresolved or break them arbitrarily.

Furthermore, the Ranking Manager provides two common measures of ranking distance, the Spearman footrule and the Kendall tau (Dwork ). The Spearman footrule is the sum, over all nodes, of the absolute difference between the ranks of a node in the two compared rankings. The Kendall tau distance between two rankings is the number of node pairs that are ordered differently in the two rankings. Rank lists and rank list distances can be imported from, or exported to, plain text files for further analysis (see web site for file format details).
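The aggregation methods and distance measures described in this section can be sketched in a few lines of Python. This is only an illustration, not the plugin's code: all function names are hypothetical, primary rankings are assumed to be dictionaries mapping every node to a score (higher is better), tied nodes are assumed to share the best rank position, and the Kendall tau distance is computed in its usual pairwise form, i.e. the number of node pairs ordered differently in the two rankings.

```python
from itertools import combinations

def to_ranks(scores):
    """Rank positions (1 = best); tied scores share the best position."""
    ordered = sorted(scores.values(), reverse=True)
    return {n: ordered.index(s) + 1 for n, s in scores.items()}

def borda_scores(scores):
    """Borda score: number of nodes ranked strictly lower."""
    return {n: sum(1 for other in scores.values() if other < s)
            for n, s in scores.items()}

def weighted_borda_fuse(primaries, weights):
    """WBF: weighted sum of Borda scores over all primary rankings."""
    bordas = [borda_scores(p) for p in primaries]
    return {n: sum(w * b[n] for w, b in zip(weights, bordas))
            for n in primaries[0]}

def weighted_addscore_fuse(primaries, weights):
    """WASF: weighted sum of the raw scores over all primary rankings."""
    return {n: sum(w * p[n] for w, p in zip(weights, primaries))
            for n in primaries[0]}

def maxrank_fuse(primaries):
    """MaxRank: best (numerically smallest) rank position in any primary."""
    ranks = [to_ranks(p) for p in primaries]
    return {n: min(r[n] for r in ranks) for n in primaries[0]}

def spearman_footrule(r1, r2):
    """Sum over all nodes of the absolute rank difference."""
    return sum(abs(r1[n] - r2[n]) for n in r1)

def kendall_tau_distance(r1, r2):
    """Number of node pairs whose relative order differs."""
    return sum(1 for a, b in combinations(r1, 2)
               if (r1[a] - r1[b]) * (r2[a] - r2[b]) < 0)

# Two toy primary rankings over three candidate genes.
p1 = {"g1": 0.9, "g2": 0.5, "g3": 0.1}
p2 = {"g1": 0.3, "g2": 0.6, "g3": 0.4}

print(weighted_borda_fuse([p1, p2], [1, 1]))     # aggregated Borda scores
print(weighted_addscore_fuse([p1, p2], [1, 1]))  # aggregated raw scores
print(maxrank_fuse([p1, p2]))                    # best rank per node
print(spearman_footrule(to_ranks(p1), to_ranks(p2)))
print(kendall_tau_distance(to_ranks(p1), to_ranks(p2)))
```

With these toy scores, WBF places g2 first while WASF places g1 first, which illustrates how the score-based WASF can separate candidates that the purely rank-based WBF treats differently. MaxRank Fuse assigns rank 1 to both g1 and g2, an instance of the ties mentioned above.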

2.3 Batch functionality

To facilitate the prioritization of nodes in multiple networks, NetworkPrioritizer provides batch functionality. First, NetworkPrioritizer computes all centrality measures for each network and saves the resulting primary rankings to plain text files. Second, the primary rankings are re-imported and aggregated for each network separately.

3 CASE STUDY

A network of both protein–protein interactions and functional similarity links was compiled from BioMyn (Ramírez ) and FunSimMat (Schlicker ), respectively, for proteins encoded by genes in genomic loci associated with Crohn’s disease (Franke ). Proteins associated with inflammatory bowel disease (IBD), or Crohn’s disease as a subtype of IBD, were used as seed nodes for the network analysis (see web site). The 10 top-ranked proteins function in the ‘immune system process’, ‘response to stress’, ‘signal transduction’ and ‘homeostatic process’ according to their Gene Ontology annotation. Since these processes are closely related to IBD (Zhu and Li, 2012), the proteins are promising candidates for further experimental studies.

4 CONCLUSIONS

NetworkPrioritizer is a versatile Cytoscape plugin that enables the ranking of individual network nodes based on their relevance for connecting a set of seed nodes to the rest of the network. The plugin computes centrality measures for unweighted and weighted networks and provides rank aggregation methods and ranking distance calculations. With its modular and extensible software design, NetworkPrioritizer is a very useful tool for integrative network-based prioritization of, e.g. candidate disease genes.

Funding: Part of this study was financially supported by the BMBF through the German National Genome Research Network (NGFN) and the Greifswald Approach to Individualized Medicine (GANI_MED). The research was also conducted in the context of the DFG-funded Cluster of Excellence for Multimodal Computing and Interaction (MMCI).

Conflict of Interest: none declared.

REFERENCES (12 in total)

1. Exploration of biological network centralities with CentiBiN.

Authors: Björn H Junker; Dirk Koschützki; Falk Schreiber
Journal: BMC Bioinformatics Date: 2006-04-21 Impact factor: 3.169

2. Topological analysis and interactive visualization of biological networks and protein structures.

Authors: Nadezhda T Doncheva; Yassen Assenov; Francisco S Domingues; Mario Albrecht
Journal: Nat Protoc Date: 2012-03-15 Impact factor: 13.491

3. GPEC: a Cytoscape plug-in for random walk-based gene prioritization and biomedical evidence collection.

Authors: Duc-Hau Le; Yung-Keun Kwon
Journal: Comput Biol Chem Date: 2012-03-03 Impact factor: 2.877

4. Computing topological parameters of biological networks.

Authors: Yassen Assenov; Fidel Ramírez; Sven-Eric Schelhorn; Thomas Lengauer; Mario Albrecht
Journal: Bioinformatics Date: 2007-11-15 Impact factor: 6.937

5. A guide to web tools to prioritize candidate genes.

Authors: Léon-Charles Tranchevent; Francisco Bonachela Capdevila; Daniela Nitsch; Bart De Moor; Patrick De Causmaecker; Yves Moreau
Journal: Brief Bioinform Date: 2010-03-21 Impact factor: 11.622

6. Oxidative stress and redox signaling mechanisms of inflammatory bowel disease: updated experimental and clinical evidence.

Authors: Hong Zhu; Y Robert Li
Journal: Exp Biol Med (Maywood) Date: 2012-03-22

7. Improving disease gene prioritization using the semantic similarity of Gene Ontology terms.

Authors: Andreas Schlicker; Thomas Lengauer; Mario Albrecht
Journal: Bioinformatics Date: 2010-09-15 Impact factor: 6.937

8. Genome-wide meta-analysis increases to 71 the number of confirmed Crohn's disease susceptibility loci.

Authors: Andre Franke; Dermot P B McGovern; Jeffrey C Barrett; Kai Wang; Graham L Radford-Smith; Tariq Ahmad; Charlie W Lees; Tobias Balschun; James Lee; Rebecca Roberts; Carl A Anderson; Joshua C Bis; Suzanne Bumpstead; David Ellinghaus; Eleonora M Festen; Michel Georges; Todd Green; Talin Haritunians; Luke Jostins; Anna Latiano; Christopher G Mathew; Grant W Montgomery; Natalie J Prescott; Soumya Raychaudhuri; Jerome I Rotter; Philip Schumm; Yashoda Sharma; Lisa A Simms; Kent D Taylor; David Whiteman; Cisca Wijmenga; Robert N Baldassano; Murray Barclay; Theodore M Bayless; Stephan Brand; Carsten Büning; Albert Cohen; Jean-Frederick Colombel; Mario Cottone; Laura Stronati; Ted Denson; Martine De Vos; Renata D'Inca; Marla Dubinsky; Cathryn Edwards; Tim Florin; Denis Franchimont; Richard Gearry; Jürgen Glas; Andre Van Gossum; Stephen L Guthery; Jonas Halfvarson; Hein W Verspaget; Jean-Pierre Hugot; Amir Karban; Debby Laukens; Ian Lawrance; Marc Lemann; Arie Levine; Cecile Libioulle; Edouard Louis; Craig Mowat; William Newman; Julián Panés; Anne Phillips; Deborah D Proctor; Miguel Regueiro; Richard Russell; Paul Rutgeerts; Jeremy Sanderson; Miquel Sans; Frank Seibold; A Hillary Steinhart; Pieter C F Stokkers; Leif Torkvist; Gerd Kullak-Ublick; David Wilson; Thomas Walters; Stephan R Targan; Steven R Brant; John D Rioux; Mauro D'Amato; Rinse K Weersma; Subra Kugathasan; Anne M Griffiths; John C Mansfield; Severine Vermeire; Richard H Duerr; Mark S Silverberg; Jack Satsangi; Stefan Schreiber; Judy H Cho; Vito Annese; Hakon Hakonarson; Mark J Daly; Miles Parkes
Journal: Nat Genet Date: 2010-12 Impact factor: 38.330

9. Novel search method for the discovery of functional relationships.

Authors: Fidel Ramírez; Glenn Lawyer; Mario Albrecht
Journal: Bioinformatics Date: 2011-12-16 Impact factor: 6.937

10. Hubba: hub objects analyzer--a framework of interactome hubs identification for network biology.

Authors: Chung-Yen Lin; Chia-Hao Chin; Hsin-Hung Wu; Shu-Hwa Chen; Chin-Wen Ho; Ming-Tat Ko
Journal: Nucleic Acids Res Date: 2008-05-24 Impact factor: 16.971
