
Cyrface: An interface from Cytoscape to R that provides a user interface to R packages.

Emanuel Gonçalves, Franz Mirlach, Julio Saez-Rodriguez.

Abstract

There is an increasing number of software packages to analyse biological experimental data in the R environment. In particular, Bioconductor, a repository of curated R packages, is one of the most comprehensive resources for bioinformatics and biostatistics. The use of these packages is increasing, but it requires a basic understanding of the R language, as well as the syntax of the specific package used. The availability of graphical user interfaces for these packages would flatten the learning curve and broaden their application. Here, we present a Cytoscape app termed Cyrface that allows Cytoscape apps to connect to any function and package developed in R. Cyrface can be used to run R packages from within the Cytoscape environment, making use of a graphical user interface. Moreover, it can link R packages with the capabilities of Cytoscape and its apps, in particular network visualization and analysis. Cyrface's utility has been demonstrated for two Bioconductor packages (CellNOptR and DrugVsDisease), and here we further illustrate its usage by implementing a workflow of data analysis and visualization. Download links, installation instructions and user guides can be accessed from Cyrface's homepage (http://www.ebi.ac.uk/saezrodriguez/cyrface/) and from the Cytoscape app store (http://apps.cytoscape.org/apps/cyrface).


Year:  2013        PMID: 24715956      PMCID: PMC3962008          DOI: 10.12688/f1000research.2-192.v2

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


Introduction

The availability of high-throughput experimental data has led to the development of multiple computational methods to analyse these data. One of the most used environments is the statistical programming language R [1]. Multiple R packages for computational biology and bioinformatics are available in various resources such as the Comprehensive R Archive Network (CRAN). Furthermore, Bioconductor [2] provides a large collection of curated packages, developed in R, to analyse biological data. These packages are subject to stringent quality control in terms of functionality and documentation. Bioconductor is an open-source project hosting 824 active and curated software packages as of May 2014. For those not familiar with programming, learning R and using its packages can be a time-consuming task. Therefore, intuitive graphical interfaces that simplify their use can enhance the usability of these R packages.

Cytoscape [3, 4] is a Java open-source framework with an intuitive graphical interface devoted to the visualization and analysis of networks. It is arguably one of the most used tools in bioinformatics, and has a variety of user-developed extensions to solve numerous computational biology problems. These extensions are termed plug-ins (1.x and 2.x) or apps (3.x), depending on which version of Cytoscape is being used.

Here, we present Cyrface, an app for Cytoscape that provides an interface between any R package and Cytoscape. Cyrface is designed to integrate the major strengths of the R and Cytoscape environments by providing a general Java to R interface. By linking these two environments, Cyrface allows one to use Cytoscape as a graphical user interface for R packages. It also enables Cytoscape apps to access the wealth of methods implemented in R.

Workflow management systems such as Taverna [5] and Galaxy [6–8] can also call R packages from a graphical user interface (GUI). Taverna is a standalone Java open-source tool for the general development and execution of workflows. Galaxy is an open-source web platform to assemble workflows based on genomic experimental data analysis. Thus, Cyrface complements Taverna and Galaxy by providing GUIs for R within a different environment with complementary features.

RCytoscape [9] is another tool that links R and Cytoscape. It is a Bioconductor R package that establishes a connection between R and Java. The fundamental difference between RCytoscape and Cyrface is that RCytoscape supports the connection from R to Java, whereas Cyrface allows a connection from Java to R. A typical use of RCytoscape would be to handle experimental data in R and transfer the biological network to Cytoscape while controlling it from within R. Hence, RCytoscape and Cyrface provide complementary features.

This paper is structured as follows. Firstly, we provide a description of the implementation of Cyrface. Then, to illustrate the usefulness of Cyrface, we show two existing apps that make use of it, CytoCopteR [10] and DrugVsDisease (DvD) [11], and we demonstrate an implementation of a simplified version of the DataRail [12] workflow. Finally, we discuss on-going and future developments.

Implementation

Cyrface is a Java open-source framework developed to establish the connection between Cytoscape and R. Interaction between these two different environments (invoking R within Java) is not natively supported by Java. To achieve this, Cyrface uses the external libraries RCaller [13] and Rserve [14]. To support the communication between Java and R, RCaller uses an R package called Runiversal that converts R objects into an XML format, thus allowing them to be read by Java. Rserve, in contrast, establishes a TCP/IP server that allows programs written in various languages to connect to an R session and access its features. Rserve is currently being used by several mature projects, among them the Taverna workflow management system [5]. The Rserve and RCaller libraries are integrated in Cyrface through the RserveHandler and RCallerHandler Java classes, respectively. Both classes extend the abstract class RHandler, which contains the signatures of all the methods necessary to establish and maintain a connection with R. Figure 1 depicts the hierarchical structure of these classes and the connection points between these two different environments.
Figure 1.

Diagram of the Cyrface interaction layer with R.

Within the grey box the class hierarchy of the classes responsible for establishing the connection between Cytoscape and R is represented. RHandler is an abstract Java class that is extended by RserveHandler and RCallerHandler classes that add support to Rserve and RCaller libraries, respectively. The connection from Java to R can be achieved using either RserveHandler or RCallerHandler classes, or other classes that successfully extend RHandler.
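
The class hierarchy described above can be sketched in a few lines of Java. This is only an illustration of the pattern (an abstract handler with interchangeable backends): the text does not list the actual method signatures of RHandler, so the class names with the Sketch suffix, the methods connect, execute and disconnect, and the stubbed backends that merely echo the command are all assumptions made for this example. No real Rserve or RCaller call is made.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the abstract RHandler; method names are illustrative. */
abstract class RHandlerSketch {
    private boolean connected = false;

    /** Opens a session with the backend and marks the handler as usable. */
    final void connect() {
        openSession();
        connected = true;
    }

    /** Evaluates one R command; refuses to run before connect() was called. */
    final String execute(String command) {
        if (!connected) {
            throw new IllegalStateException("call connect() before execute()");
        }
        return evaluate(command);
    }

    final void disconnect() {
        closeSession();
        connected = false;
    }

    final boolean isConnected() {
        return connected;
    }

    // Each backend supplies its own way of reaching R.
    protected abstract void openSession();
    protected abstract String evaluate(String command);
    protected abstract void closeSession();
}

/** Stub for an Rserve-style backend; a real one would talk to a TCP/IP server. */
class RserveHandlerSketch extends RHandlerSketch {
    final List<String> log = new ArrayList<>();

    @Override protected void openSession() { log.add("open"); }

    @Override protected String evaluate(String command) {
        log.add(command);
        return "rserve> " + command; // echo only; nothing is sent to R
    }

    @Override protected void closeSession() { log.add("close"); }
}

/** Stub for an RCaller-style backend; a real one would exchange XML with R. */
class RCallerHandlerSketch extends RHandlerSketch {
    @Override protected void openSession() { }
    @Override protected String evaluate(String command) { return "rcaller> " + command; }
    @Override protected void closeSession() { }
}

class HandlerDemo {
    public static void main(String[] args) {
        RHandlerSketch[] backends = { new RserveHandlerSketch(), new RCallerHandlerSketch() };
        for (RHandlerSketch handler : backends) {
            handler.connect();
            System.out.println(handler.execute("summary(iris)"));
            handler.disconnect();
        }
    }
}
```

Because calling code depends only on the abstract type, a further backend can be added by extending the abstract class, which is the extensibility the figure legend refers to.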

Cyrface's software architecture is designed to allow the integration of other Java libraries that facilitate the connection between Java and R. This structure allows one to take advantage of the particular strengths of different libraries and to adapt to particular requirements of the users, for instance executing R commands automatically without first having to initiate an R session manually.

Cyrface also supports Cytoscape features such as the Command Line. The Command Line offers users the ability to script basic commands in Cytoscape, such as importing, displaying or modifying networks, through a simple command line or script file. A useful feature of the Command Line is the ability to perform repetitive tasks automatically. By supporting this tool, Cyrface lets users integrate methods developed in R into their scripts together with common Cytoscape features. The Command Tool Dialog window can be used to dynamically execute the necessary R commands, which can be useful, for example, when debugging a script. On Cyrface's homepage, we provide an illustrative example that uses the Command Line Dialog tool to plot some features of a publicly available data-set termed iris [13], using the well-known plotting library ggplot [14]. The iris data-set is widely used in the field of pattern recognition and machine learning and is subdivided into different classes, where each class defines the type of the iris plant. This example demonstrates how Cyrface can, within Cytoscape, perform any task in the R environment and collect the respective output.
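
To make the iris example more concrete, the following Java sketch assembles the kind of R script such an example might send to R. It is not the script from Cyrface's homepage: the class IrisPlotScript, its build method and the choice of a scatter plot are assumptions for illustration, and the text does not specify how a script is handed to the Command Line. The R calls themselves (library, ggplot, aes, geom_point, ggsave from the ggplot2 package, and the column names of the built-in iris data-set) are standard.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that builds an R script plotting two iris measurements. */
final class IrisPlotScript {
    // Numeric columns of R's built-in iris data-set; Species is the class label.
    private static final List<String> NUMERIC_COLUMNS = Arrays.asList(
            "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width");

    private IrisPlotScript() { }

    static String build(String xColumn, String yColumn, String outputFile) {
        if (!NUMERIC_COLUMNS.contains(xColumn) || !NUMERIC_COLUMNS.contains(yColumn)) {
            throw new IllegalArgumentException("unknown iris column");
        }
        // Keep the file name from breaking out of the R string literal.
        if (outputFile.contains("\"") || outputFile.contains("\\")) {
            throw new IllegalArgumentException("unsupported character in file name");
        }
        return String.join("\n",
                "library(ggplot2)",
                "p <- ggplot(iris, aes(x = " + xColumn + ", y = " + yColumn
                        + ", colour = Species)) + geom_point()",
                "ggsave(\"" + outputFile + "\", plot = p)");
    }
}

class IrisPlotDemo {
    public static void main(String[] args) {
        // Prints the three-line R script; sending it to R is left to the handler in use.
        System.out.println(IrisPlotScript.build("Petal.Length", "Petal.Width", "iris.png"));
    }
}
```

Restricting the columns to a fixed list is a design choice of this sketch: it keeps arbitrary text from being spliced into the R code.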

Results and discussion

A typical use of Cyrface is to provide a graphical user interface to R packages within Cytoscape. Cyrface is currently being used by two Cytoscape plug-ins, CytoCopteR [10] and DvD [11]. CytoCopteR [10] provides a simple step-by-step interface allowing users without any experience in R to use the CellNOptR (www.cellnopt.org) package and handle the input and output networks in Cytoscape. CellNOptR is an open-source software package that provides methods for building predictive logic models from signalling networks using experimental measurements of activation of proteins upon perturbation. DvD [11], Drug vs Disease, is an R package that provides a workflow for the comparison of drug and disease gene expression profiles. It provides dynamic access to databases, such as ArrayExpress [15], to compare drug and disease signatures and generate drug-repurposing hypotheses.

CytoCopteR and DvD are two examples of how Cyrface captures the strengths of both environments. On one side, R provides a wealth of bioinformatics and biostatistics packages with very comprehensive resources such as Bioconductor and CRAN. On the other side, Cytoscape offers a user-friendly graphical interface for network visualisation and analysis, complemented with a variety of plug-ins or apps addressing different computational biological problems. Cyrface links these two environments by providing a way to develop user-friendly graphical interfaces for R packages by embedding them within Cytoscape.

As another illustrative example, we implemented in Cyrface a simplified version of the DataRail [12] workflow. This example is designed to illustrate how one can use methods already available in R and build a graphical user interface in Cytoscape to access them. DataRail is an open-source MATLAB toolbox that handles experimental data in a tabular format and provides methods to maximize and extract information using internal or external tools. Experimental data are stored in a format termed Minimum Information for Data Analysis in Systems Biology (MIDAS). This is a tabular format that specifies the layout of experimental data files [12]. A typical use of DataRail is to import, store and process the input information from instruments using the MIDAS format, and export it to other MIDAS-compliant software, such as CellNOptR. The DataRail workflow implemented in Cyrface is structured in several sequential steps that allow users to import, normalise and visualise experimental data-sets stored in the MIDAS format (Figure 2). The workflow was tested using an in silico generated data-set and a signalling network from [16].
Figure 2.

The Cyrface implementation of the DataRail [12] workflow.

The rounded rectangles represent the MIDAS files containing the experimental data at a given state. Hexagon nodes represent functions such as load or normalise. Green identifies steps that were successfully executed and grey identifies those that were not run yet. The data-set shown represents the normalised values of the protein activity state of a set of proteins (columns) under different stimulatory conditions (rows).
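
The structure shown in the figure, a chain of functions where each step is either executed (green) or not yet run (grey), can be sketched as follows. This is a toy model rather than Cyrface's implementation: the class WorkflowSketch, its methods, and the use of a plain numeric matrix in place of a MIDAS file are assumptions, and the min-max scaling stands in for whatever normalisation DataRail actually applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical sequential workflow whose steps track whether they have run. */
final class WorkflowSketch {
    enum Status { NOT_RUN, DONE } // grey and green nodes in the figure

    private static final class Step {
        final String name;
        final UnaryOperator<double[][]> function;
        Status status = Status.NOT_RUN;

        Step(String name, UnaryOperator<double[][]> function) {
            this.name = name;
            this.function = function;
        }
    }

    private final List<Step> steps = new ArrayList<>();

    WorkflowSketch add(String name, UnaryOperator<double[][]> function) {
        steps.add(new Step(name, function));
        return this;
    }

    /** Runs the first stepsToRun steps in order, feeding each output to the next step. */
    double[][] run(double[][] input, int stepsToRun) {
        double[][] data = input;
        for (int i = 0; i < Math.min(stepsToRun, steps.size()); i++) {
            Step step = steps.get(i);
            data = step.function.apply(data);
            step.status = Status.DONE;
        }
        return data;
    }

    String describe() {
        StringBuilder sb = new StringBuilder();
        for (Step step : steps) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(step.name).append('=').append(step.status);
        }
        return sb.toString();
    }

    /** Illustrative normalisation: rescales all values to the range [0, 1]. */
    static double[][] minMaxNormalise(double[][] data) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double[] row : data) {
            for (double value : row) {
                min = Math.min(min, value);
                max = Math.max(max, value);
            }
        }
        double range = max - min;
        double[][] scaled = new double[data.length][];
        for (int i = 0; i < data.length; i++) {
            scaled[i] = new double[data[i].length];
            for (int j = 0; j < data[i].length; j++) {
                // A constant matrix has no range; map it to zeros.
                scaled[i][j] = range == 0 ? 0.0 : (data[i][j] - min) / range;
            }
        }
        return scaled;
    }
}

class WorkflowDemo {
    public static void main(String[] args) {
        WorkflowSketch workflow = new WorkflowSketch()
                .add("load", d -> d)
                .add("normalise", WorkflowSketch::minMaxNormalise)
                .add("plot", d -> d);
        workflow.run(new double[][] {{0, 5}, {10, 5}}, 2);
        System.out.println(workflow.describe());
    }
}
```

Running only the first two steps leaves the third in the not-run state, mirroring a partially executed workflow in the figure.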

An extension to the workflow was subsequently added to support the model training function from the CellNOptR package [10]. The CellNOptR training function maximises the fit between the experimental data and the corresponding prior-knowledge network by generating and optimising a logic model. Thereby, through an intuitive graphical interface, users are able to select a biological network and assess the quality of its fit to a corresponding set of experimental data. This extension illustrates how one can in principle embed any R package in such a workflow, but it does not replace the CytoCopteR app as a complete interface for CellNOptR. The workflow supports the SIF network format, which is supported by Cytoscape, and also the Systems Biology Markup Language Qualitative Models (SBML Qual) format [16]. SBML Qual is an extension of the SBML format that has been proposed as a standard representation for logic and qualitative models of biological networks. Support for importing models stored in SBML Qual format is achieved using the jSBML library [17] and the respective SBML Qual package. Supplementary material 1 provides a step-by-step tutorial on how to use the workflow.
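
For readers unfamiliar with SIF, the format places a source node, an interaction type and one or more target nodes on each line. The Java sketch below reads such text into a list of edges. It is a simplified reader written for this illustration, not the importer used by Cyrface or Cytoscape: the class SifSketch is hypothetical, tokens are split on any whitespace (so node names containing spaces are not handled), and the node names and the 1 and -1 interaction labels in the demo are made-up sample data.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical minimal reader for SIF text: source, interaction, target(s) per line. */
final class SifSketch {
    static final class Edge {
        final String source;
        final String interaction;
        final String target;

        Edge(String source, String interaction, String target) {
            this.source = source;
            this.interaction = interaction;
            this.target = target;
        }

        @Override public String toString() {
            return source + " -(" + interaction + ")-> " + target;
        }
    }

    private SifSketch() { }

    static List<Edge> parse(String sifText) {
        List<Edge> edges = new ArrayList<>();
        for (String line : sifText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // skip blank lines
            }
            String[] tokens = trimmed.split("\\s+");
            if (tokens.length == 1) {
                continue; // a lone name declares an unconnected node
            }
            if (tokens.length == 2) {
                throw new IllegalArgumentException("missing target in line: " + trimmed);
            }
            // One line may list several targets for the same source and interaction.
            for (int i = 2; i < tokens.length; i++) {
                edges.add(new Edge(tokens[0], tokens[1], tokens[i]));
            }
        }
        return edges;
    }
}

class SifDemo {
    public static void main(String[] args) {
        String sif = "EGF 1 EGFR\nEGFR 1 Ras Raf\n\nAkt -1 Raf\n";
        for (SifSketch.Edge edge : SifSketch.parse(sif)) {
            System.out.println(edge);
        }
    }
}
```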

Conclusions

Here, we present Cyrface, a bioinformatics Java library that provides a general interaction between Cytoscape and R. Cyrface offers a way to combine a friendly graphical interface within the Cytoscape environment with any R package. A GUI should benefit beginners and occasional users; as well as being useful for training and illustration purposes, it extends the accessibility of the tool to those not familiar with the R command line interface. Moreover, Cyrface complements other libraries such as Rserve since (i) it is capable of using Rserve, RCaller or any other existing Java library to query R, and (ii) it provides a tailored implementation for Cytoscape, with interfaces that are suited to Cytoscape features, such as support for the Command Dialog tool.

Cyrface's homepage (see Software Details section) contains the link to download Cyrface and user-guide instructions. A few examples demonstrating the usefulness of the tool and the different supported libraries are also shown and explained there. The source code of Cyrface is publicly available on its GitHub webpage (see Software Details section).

Future work on Cyrface will include providing connections to Cytoscape.js, improving the DataRail workflow, and developing and testing further features, such as support for access to remote Rserve servers.

A common scenario in an interdisciplinary field such as network biology is one where, on one side, there is an expert on a certain biological question who has data to address it and, on the other side, a computational scientist who develops algorithms but is less familiar with the experiments. To help bridge this gap, tools like Cyrface make it easier to encapsulate sophisticated algorithms developed in R in a user-friendly interface within the Cytoscape framework, enabling non-experts to apply these algorithms.

Software details

Homepage: http://www.ebi.ac.uk/saezrodriguez/cyrface/
Software available from: http://apps.cytoscape.org/apps/cyrface
Latest source code: https://github.com/EmanuelGoncalves/cyrface
Source code as at the time of publication: https://github.com/F1000Research/cyrface
Archived source code as at the time of publication: http://www.dx.doi.org/10.5281/zenodo.10153 [18]
License: GNU General Public License version 3.0 (GPLv3)

My main concerns from the first version were taken into account and the paper subsequently corrected. I have no further comments.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Cyrface is a welcome addition to the Cytoscape ecosystem; nicely complementary to RCytoscape. My only reservation is one which applies to my own work (the aforementioned RCytoscape), indeed as well as all parts of the Cytoscape ecosystem. My reservation has two parts:

First: network biology is in its infancy and as such experimental data are woefully incomplete. Molecular interactions are stochastic, contingent and often very short-lived yet it is exactly these molecular interactions that we need to understand in order to predict and control cellular activity in health and disease.

Second: it is (and may remain) unusual to find researchers, much less clinicians, who are adept at both programming and biomedicine. These two disciplines seem to select for, and then reinforce, different styles of thinking. Therefore progress in this field (call it network biology, or systems biology, or integrative biomedicine) requires hybrid teams: some who are very strong in biological and/or clinical sensibilities and some who are strong in computation and data analysis.
Such a hybrid team, at its best, stays together long enough for mutual understanding and communication to emerge, as in the "trading language" which emerged in the world of particle physics in and around linear accelerators in the 60's (see Peter Galison's "Image and Logic").

I worry about the following scenario for Cyrface: a capable programmer hooks up the latest and greatest Bioconductor package to Cytoscape, exposes as best they can the parameterizations offered by that package, and turns the tool over to their collaborating biologist. Experimental data is loaded and analyses or simulations run. Puzzles and inconclusive results will inevitably emerge, requiring detailed knowledge of both the strengths and weaknesses of the Bioc package. With good luck, perseverance and good data, this small working team may in time settle on a satisfactory Cyrface tool which can be reused without the constant intervention of the programmer. This will last until new data is acquired, upsetting the equilibrium, and the hybrid style of work, and the back-and-forth between biologist and programmer, begins anew.

I say that I worry about this scenario. It may be exactly the intended use of Cyrface; the problem it is intended to solve. But this essentially sociological characteristic (requirement?) of Cyrface is not described in the paper. I think that those of us who create bioinformatics software tend to avoid being explicit about this, and I think that this (the social & collaborative requirements of bioinformatic research) deserves a lot more attention.

If indeed network biology, as I claim, is in its infancy, then it may be helpful if the ecosystem of Cytoscape-related tools is considered from this perspective. I suspect that the conclusions we might all (mostly) agree upon are:

- User-friendly exploration of data-rich networks in a web browser (as with cytoscape.js) will become increasingly popular.
- That user-friendliness often competes with analytical nuance and close scrutiny; biologically and clinically useful results become less likely.
- Cyrface's connection of Cytoscape to R is a great step in the right direction, marrying as it does user-friendliness with some new analytical power in a way that is nicely complementary to Cytoscape java plugins and Cytoscape access to web services.

Thus, Cyrface is a good step in the right direction. The next steps, it seems to me, include:

- Providing easy connections to R (python, C++) analyses for cytoscape.js.
- A standard mechanism whereby scripts (R, python, Ruby, Perl), upon execution, can start up a Cytoscape or cytoscape.js session, customize it with networks, functions, buttons and menus, and with both public and laboratory data. As a generalization of Cyrface, this mechanism would encourage the rapid expansion of Cytoscape capabilities.

These possible next steps carry on in the spirit of Cyrface, RCytoscape, and Cytoscape3 apps, and will promote the creation of, and sharing of, custom network analyses, shared tools, and lead to fruitful collaborations across the hybrid community of biologists, physicians and programmers.

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

The paper is well written; however, many items require clarification. My main criticism is the initial need to install several packages (RCaller, Rserve, Runiversal, Java, etc.), plus Cytoscape and the CommandTool plugin. Since I did not know if this would break my personal configuration or if I would be able to uninstall it, I was not able to perform the whole installation myself. It would be great if the authors could provide a virtual machine image with all the software preloaded so that one can try it out of the box without installation.
I know by experience that Cytoscape plugins tend to work only with a single version of Cytoscape, so a clear list of all the required versions in the paper itself would be very useful. The number of packages we are dealing with is very confusing. I am also afraid that with such a large number of dependencies, the program may break after any update.

In the Implementation section, the authors mention the "iris dataset". It would be useful to define what this is. It is also mentioned in the documentation but it is still unclear what the authors are referring to. The documentation shows an example where the ggplot2 package is used to plot "petal". Could the authors please define what this is?

Figure 2: It would be useful to define more accurately what type of data we are looking at here (e.g. gene expression?).

In the discussion, several R packages are mentioned (DvD, Cytocopter). What are their links with the present software other than the fact that they run under R? Is it really possible to interact with them from Cytoscape? If so, the proper documentation or a tutorial should be provided. It would also be useful if an actual example using DvD alongside interaction data could be shown.

I do not understand the example with DataRail. The usefulness of the example given is not clear to me since it seems that no interaction data is given (for me this is the main purpose of Cytoscape). A useful example for most users might be:
- A network of protein-protein interaction (PPI) data in Cytoscape.
- Some expression data in R (an exprs object for instance).
An example of a question might be "How to superimpose the expression data and generate a proper network attribute from it?"

I have read this submission. I believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.

In this manuscript, the authors describe the Cytoscape plugin Cyrface.
Cyrface consists of two components: 1) a Java API, which is already being used by the Cytoscape plugins CytoCopteR and DrugvsDisease (both developed by the same group), and 2) a graphical user interface that connects R to Cytoscape. As a proof-of-concept of what kind of applications can be built on top of this interface, the plugin also supports the MIDAS and SBML-Qual formats. The article is well written and the tool is useful to the community. However, we recommend the following changes to the article to make it more appealing to potential future users:

- As the new version of Cytoscape (3.x) is becoming more widely used by the community, the authors should explicitly state that they are targeting version 2.8 with this framework. This will reduce the confusion for users who are not as familiar with Cytoscape.
- The tutorial in the supplementary materials helps to understand the general use case for this plug-in, but the lack of downloadable “sample” files for this example will make it harder for users to learn how to use the DataRail pipeline. We think it is important to provide example files that people can use for reproducing the figures in the manuscript.
- The Cyrface interaction layer with R looks helpful for programmers, but can the authors comment on how these classes are different from the default Java implementations of the RServe clients, e.g. http://rforge.net/RServe? This will help clarify why people should use Cyrface for their next project.
- The command line interface (commandTool) appears to be useful, but it seems that it is only capable of running commands in an isolated environment, with each command having its own session. If this is the case, can the authors comment on what the advantage of running R commands from the commandTool is compared to opening a terminal window and running commands directly from an R shell? Are users able to, for example, pass node/edge attribute fields to the corresponding R commands?
- It looks like the current implementation does not support setting an RServe location other than localhost. Although not necessary, giving users the option to set a different RServe address within the plug-in would further lower the barrier for users who are not experienced with R, since they could then use a pre-installed Rserve hosted on a different machine.

We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.

Emanuel Gonçalves
Saez-Rodriguez Group
EMBL-EBI
emanuel@ebi.ac.uk
May 12, 2014
  14 in total

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  Flexible informatics for linking experimental data to mathematical models via DataRail.

Authors:  Julio Saez-Rodriguez; Arthur Goldsipe; Jeremy Muhlich; Leonidas G Alexopoulos; Bjorn Millard; Douglas A Lauffenburger; Peter K Sorger
Journal:  Bioinformatics       Date:  2008-01-24       Impact factor: 6.937

3.  Bioconductor: open software development for computational biology and bioinformatics.

Authors:  Robert C Gentleman; Vincent J Carey; Douglas M Bates; Ben Bolstad; Marcel Dettling; Sandrine Dudoit; Byron Ellis; Laurent Gautier; Yongchao Ge; Jeff Gentry; Kurt Hornik; Torsten Hothorn; Wolfgang Huber; Stefano Iacus; Rafael Irizarry; Friedrich Leisch; Cheng Li; Martin Maechler; Anthony J Rossini; Gunther Sawitzki; Colin Smith; Gordon Smyth; Luke Tierney; Jean Y H Yang; Jianhua Zhang
Journal:  Genome Biol       Date:  2004-09-15       Impact factor: 13.583

4.  The Taverna workflow suite: designing and executing workflows of Web Services on the desktop, web or in the cloud.

Authors:  Katherine Wolstencroft; Robert Haines; Donal Fellows; Alan Williams; David Withers; Stuart Owen; Stian Soiland-Reyes; Ian Dunlop; Aleksandra Nenadic; Paul Fisher; Jiten Bhagat; Khalid Belhajjame; Finn Bacall; Alex Hardisty; Abraham Nieva de la Hidalga; Maria P Balcazar Vargas; Shoaib Sufi; Carole Goble
Journal:  Nucleic Acids Res       Date:  2013-05-02       Impact factor: 16.971

5.  Galaxy: a comprehensive approach for supporting accessible, reproducible, and transparent computational research in the life sciences.

Authors:  Jeremy Goecks; Anton Nekrutenko; James Taylor
Journal:  Genome Biol       Date:  2010-08-25       Impact factor: 13.583

6.  Cytoscape 2.8: new features for data integration and network visualization.

Authors:  Michael E Smoot; Keiichiro Ono; Johannes Ruscheinski; Peng-Liang Wang; Trey Ideker
Journal:  Bioinformatics       Date:  2010-12-12       Impact factor: 6.937

7.  JSBML: a flexible Java library for working with SBML.

Authors:  Andreas Dräger; Nicolas Rodriguez; Marine Dumousseau; Alexander Dörr; Clemens Wrzodek; Nicolas Le Novère; Andreas Zell; Michael Hucka
Journal:  Bioinformatics       Date:  2011-06-22       Impact factor: 6.937

8.  CellNOptR: a flexible toolkit to train protein signaling networks to data using multiple logic formalisms.

Authors:  Camille Terfve; Thomas Cokelaer; David Henriques; Aidan MacNamara; Emanuel Goncalves; Melody K Morris; Martijn van Iersel; Douglas A Lauffenburger; Julio Saez-Rodriguez
Journal:  BMC Syst Biol       Date:  2012-10-18

9.  DvD: An R/Cytoscape pipeline for drug repurposing using public repositories of gene expression data.

Authors:  Clare Pacini; Francesco Iorio; Emanuel Gonçalves; Murat Iskar; Thomas Klabunde; Peer Bork; Julio Saez-Rodriguez
Journal:  Bioinformatics       Date:  2012-11-04       Impact factor: 6.937

10.  SBML qualitative models: a model representation format and infrastructure to foster interactions between qualitative modelling formalisms and tools.

Authors:  Claudine Chaouiya; Duncan Bérenguier; Sarah M Keating; Aurélien Naldi; Martijn P van Iersel; Nicolas Rodriguez; Andreas Dräger; Finja Büchel; Thomas Cokelaer; Bryan Kowal; Benjamin Wicks; Emanuel Gonçalves; Julien Dorier; Michel Page; Pedro T Monteiro; Axel von Kamp; Ioannis Xenarios; Hidde de Jong; Michael Hucka; Steffen Klamt; Denis Thieffry; Nicolas Le Novère; Julio Saez-Rodriguez; Tomáš Helikar
Journal:  BMC Syst Biol       Date:  2013-12-10
