
cyNeo4j: connecting Neo4j and Cytoscape.

Georg Summer, Thomas Kelder, Keiichiro Ono, Marijana Radonjic, Stephane Heymans, Barry Demchak.

Abstract

We developed cyNeo4j, a Cytoscape app that links Cytoscape and Neo4j databases to make use of the performance and storage capacities Neo4j offers. We implemented a Neo4j NetworkAnalyzer, a ForceAtlas2 layout and a Cypher component to demonstrate the possibilities that a distributed setup of Cytoscape and Neo4j offers.
AVAILABILITY AND IMPLEMENTATION: The app is available from the Cytoscape App Store at http://apps.cytoscape.org/apps/cyneo4j, the Neo4j plugins are available at www.github.com/gsummer/cyneo4j-parent, and the community and commercial editions of Neo4j can be found at http://www.neo4j.com.
CONTACT: georg.summer@gmail.com.
© The Author 2015. Published by Oxford University Press.

Year:  2015        PMID: 26272981      PMCID: PMC4653389          DOI: 10.1093/bioinformatics/btv460

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Network biology facilitates the understanding of complex biological systems by organizing, analyzing and visualizing knowledge and experimental data in networks. Built upon graph theory, network biology gives researchers access to decades' worth of research in the form of sophisticated graph algorithms. Software applications like Cytoscape (Saito et al., 2012; Shannon et al., 2003) and Gephi (Bastian et al., 2009) provide visualization and analysis methods to data scientists, with Cytoscape being widely used in the life sciences. As networks become larger and more complex, the computational performance necessary to analyze them increases drastically. Moving the computation from desktop environments like Cytoscape and Gephi to powerful servers is a common way to cope with the increasing demand for computation.

We present cyNeo4j, a Cytoscape app that links Cytoscape on the desktop to a server environment using a Neo4j database. Neo4j (www.neo4j.com) is a Java-based database designed to store and query graphs. It falls into the category of NoSQL databases, as it departs from the relational model used in traditional databases. Neo4j ensures transaction reliability through ACID compliance and provides a SQL-inspired query language called Cypher; its community edition is free to use and open source. Additionally, Neo4j servers can be extended with plugins to add more complex algorithms than the built-in ones.

CyNeo4j supports two such plugins, which showcase the performance increase that can be achieved using Neo4j and a powerful computational backend. As of version 1.1, cyNeo4j supports a plugin that provides a set of network layout algorithms and a plugin that implements the Cytoscape NetworkAnalyzer. We briefly explain how Cytoscape users can enrich their workflows with cyNeo4j and how app and algorithm developers can benefit from it.

2 cyNeo4j for Cytoscape users

A prerequisite for cyNeo4j is a running Neo4j server. Neo4j provides thorough documentation on setting up the server, and tutorials are available on the cyNeo4j website. The server can run on the same computer as Cytoscape or, ideally, on a computationally more powerful one. The first task for a Cytoscape user is to connect to a Neo4j database. After the connection is established and validated by cyNeo4j, the app discovers all algorithms that are available on the server and supported by the app itself.

There are two typical use cases for cyNeo4j: either the network to be analyzed is available locally in Cytoscape, or it is stored in the running Neo4j server. In the first case, the user can upload a network from Cytoscape to the Neo4j server and then run algorithms on it both locally and on the Neo4j server. This allows for an interactive workflow that uses the computational strength of the Neo4j server without interrupting the normal workflow in Cytoscape. Figure 1 shows the results of a benchmark for the NetworkAnalyzer (Assenov et al., 2007) functionality present in Cytoscape. The Neo4j implementation cuts the waiting time for the network statistics by a factor of 4 on a subset of the STRING network (Szklarczyk et al., 2014) with 4436 nodes and 93 286 edges when run on the same computer, while producing the same statistical results (disregarding rounding behaviour). A Dell XPS 2015 (8 GB RAM, SSD, Intel Core i7-5500U 2.4 GHz) was used to compute both the Cytoscape and cyNeo4j NetworkAnalyzer results.

The second use case envisions a network already stored on the Neo4j server. Such a network can be larger than one that is feasible to work with in Cytoscape. While it might not be possible to hold the whole network locally, Neo4j is a fully fledged database, and parts of the network can be extracted using the Cypher query language for targeted analysis. Algorithms in Neo4j can still be executed on the larger network and the results then studied in smaller chunks accessed through Cypher.
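
As an illustration of extracting part of a server-side network with Cypher, the following minimal Python sketch builds (but does not send) an HTTP request carrying one Cypher statement. It is not taken from cyNeo4j. The endpoint path `/db/data/transaction/commit`, the `{name}` parameter syntax, the server address, the `name` property and the example query are all assumptions for illustration and may differ between Neo4j versions and data models.

```python
import json
from urllib.request import Request


def build_cypher_request(base_url, statement, parameters=None):
    """Build (but do not send) a POST request carrying one Cypher statement."""
    # Assumption: the server exposes a transactional Cypher endpoint at this path.
    url = base_url.rstrip("/") + "/db/data/transaction/commit"
    payload = {
        "statements": [
            {"statement": statement, "parameters": parameters or {}}
        ]
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json",
                 "Accept": "application/json"},
        method="POST",
    )


# Hypothetical query: one named node together with its direct neighbours.
# The {name} placeholder style is an assumption; newer servers use $name.
NEIGHBOURHOOD_QUERY = (
    "MATCH (n)-[r]-(m) WHERE n.name = {name} "
    "RETURN n, r, m LIMIT {limit}"
)

request = build_cypher_request(
    "http://localhost:7474",          # assumed local server address
    NEIGHBOURHOOD_QUERY,
    {"name": "TP53", "limit": 100},   # hypothetical node name and row limit
)
```

Passing `request` to `urllib.request.urlopen` would submit the query to a running server; the sketch stops short of that so that it can be read and checked without one.
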
Currently, two example algorithms are implemented as Neo4j plugins to showcase the cyNeo4j app.

Fig. 1. NetworkAnalyzer benchmark: The NetworkAnalyzer plugin implemented in Neo4j and called from cyNeo4j reduces the computation time by a factor of 4 compared with the implementation in Cytoscape. Using multiple threads reduces the computation time further. The benchmark network is a subset of STRING with 4436 nodes and 93 286 edges.

The NetworkAnalyzer plugin for Neo4j calculates network statistics (e.g. betweenness centrality and average shortest path length) similar to those of the NetworkAnalyzer shipped with Cytoscape. The resulting statistics are added as properties to the local Cytoscape network and can be used for visualization using the standard Cytoscape VizMapper functionality. Additionally, the statistics can be saved in the Neo4j network. The ForceAtlas2 (Jacomy et al., 2014) plugin brings a layout algorithm for large graphs from Gephi to Cytoscape via Neo4j. CyNeo4j allows the user to execute this layout with the same parameters as in Gephi and in a similar iterative and interactive fashion.
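
To make the kind of statistic concrete, the sketch below computes node degrees and the average shortest path length of a small unweighted, undirected toy graph using breadth-first search. It is a plain Python illustration of the definitions only; it is not the plugin's code, and the function names and toy graph are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances from source to every reachable node (unweighted BFS)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def average_shortest_path_length(adjacency):
    """Mean hop distance over all ordered pairs of distinct, connected nodes."""
    total, pairs = 0, 0
    for source in adjacency:
        for target, distance in bfs_distances(adjacency, source).items():
            if target != source:
                total += distance
                pairs += 1
    return total / pairs if pairs else 0.0


def degrees(adjacency):
    """Number of neighbours of each node."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Toy undirected path graph: A - B - C - D
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

toy_degrees = degrees(toy)
toy_average = average_shortest_path_length(toy)
```

Running one breadth-first search per node is quadratic or worse in the network size, which gives a feel for why such statistics become slow on large networks and benefit from a stronger backend.
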

3 cyNeo4j for developers

This section gives a short overview of how to implement algorithms as Neo4j plugins and how to integrate them into cyNeo4j.

3.1 Implementation of algorithms in Neo4j

On the server side, plugins have access to the complete Neo4j Java API, including a set of graph algorithms tailored for Neo4j and the query language Cypher. Plugins can vary widely in how they use Neo4j capabilities: the Neo4j NetworkAnalyzer plugin depends heavily on the shortest path and centrality algorithms of Neo4j, whereas the ForceAtlas2 plugin only uses the node and edge retrieval functionality of the API. Third-party libraries can also be used to add functionality. The standard plugin interface of Neo4j allows plugins to return sets of basic variable types (numerics and strings).

3.2 Extension of cyNeo4j with new algorithms

CyNeo4j uses the Cypher and Plugin REST API through HTTP to communicate with Neo4j. New algorithms can be added to Neo4j as plugins, which can be accessed through the REST API of the server, making them easy to reuse in web applications or data analysis environments like R. New Neo4j plugins need to be added to cyNeo4j to allow proper discovery upon connecting to the Neo4j server, integration into the Cytoscape UI and interpretation of the algorithm results. The integration into cyNeo4j also allows for an iterative execution of an algorithm by sending multiple requests to the server, which enables the user to interrupt the execution or observe intermediate results, as shown with the ForceAtlas2 plugin. Algorithms can be executed asynchronously so as not to block Cytoscape during long-running calculations. In this case, the user has to determine when the calculation is done and has to retrieve the results manually.
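
The iterative pattern can be sketched as a client-side loop that issues one request per batch of iterations and checks between requests whether the user asked to stop. The Python sketch below is an assumption-laden illustration, not cyNeo4j code: the URL layout `/db/data/ext/<plugin>/graphdb/<method>`, the plugin and method names, and the helper functions are all hypothetical, and the server is replaced by a stand-in function so that the loop can run without a Neo4j instance.

```python
def plugin_url(base_url, plugin_name, method_name):
    """URL of a server plugin method (assumed layout, see lead-in)."""
    return "{}/db/data/ext/{}/graphdb/{}".format(
        base_url.rstrip("/"), plugin_name, method_name
    )


def run_iteratively(run_batch, max_batches, should_stop, on_progress=None):
    """Call run_batch() until max_batches is reached or should_stop() is true.

    Returns the number of batches that were actually executed.
    """
    completed = 0
    while completed < max_batches and not should_stop():
        result = run_batch()          # one request to the server per batch
        completed += 1
        if on_progress is not None:
            on_progress(completed, result)  # e.g. redraw intermediate layout
    return completed


# Stand-in for the HTTP call: records the URL instead of contacting a server.
calls = []


def fake_layout_batch():
    # Plugin and method names below are hypothetical placeholders.
    calls.append(plugin_url("http://localhost:7474", "LayoutPlugin", "step"))
    return {"iterations": 10}


def user_pressed_stop():
    # Simulates a user interrupting after three batches.
    return len(calls) >= 3


batches_done = run_iteratively(
    fake_layout_batch, max_batches=10, should_stop=user_pressed_stop
)
```

Keeping each request short is the design choice that makes interruption possible: the client regains control after every batch, whereas a single long request could only be waited for or abandoned.
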

4 Conclusion

We developed cyNeo4j to connect Cytoscape and Neo4j, allowing us to speed up network analysis algorithms and to use the Cypher query language to navigate and explore networks too large for typical desktop computers.

Funding

The project was supported through the Google Summer of Code 2014 and the European Union (FP7-HEALTH-2010), MEDIA, large-scale integrating project grants.

Conflict of Interest: none declared.

References

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  Computing topological parameters of biological networks.

Authors:  Yassen Assenov; Fidel Ramírez; Sven-Eric Schelhorn; Thomas Lengauer; Mario Albrecht
Journal:  Bioinformatics       Date:  2007-11-15       Impact factor: 6.937

3.  A travel guide to Cytoscape plugins.

Authors:  Rintaro Saito; Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Samad Lotia; Alexander R Pico; Gary D Bader; Trey Ideker
Journal:  Nat Methods       Date:  2012-11-06       Impact factor: 28.547

4.  STRING v10: protein-protein interaction networks, integrated over the tree of life.

Authors:  Damian Szklarczyk; Andrea Franceschini; Stefan Wyder; Kristoffer Forslund; Davide Heller; Jaime Huerta-Cepas; Milan Simonovic; Alexander Roth; Alberto Santos; Kalliopi P Tsafou; Michael Kuhn; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

5.  ForceAtlas2, a continuous graph layout algorithm for handy network visualization designed for the Gephi software.

Authors:  Mathieu Jacomy; Tommaso Venturini; Sebastien Heymann; Mathieu Bastian
Journal:  PLoS One       Date:  2014-06-10       Impact factor: 3.240
