
BiologicalNetworks: visualization and analysis tool for systems biology.

Michael Baitaluk, Mayya Sedova, Animesh Ray, Amarnath Gupta.

Abstract

Systems level investigation of genomic scale information requires the development of truly integrated databases dealing with heterogeneous data, which can be queried for simple properties of genes or other database objects as well as for complex network level properties, for the analysis and modelling of complex biological processes. Towards that goal, we recently constructed PathSys, a data integration platform for systems biology, which provides dynamic integration over a diverse set of databases [Baitaluk et al. (2006) BMC Bioinformatics 7, 55]. Here we describe a server, BiologicalNetworks, which provides visualization, analysis services and an information management framework over PathSys. The server allows easy retrieval, construction and visualization of complex biological networks, including genome-scale integrated networks of protein-protein, protein-DNA and genetic interactions. Most importantly, BiologicalNetworks addresses the need for systematic presentation and analysis of high-throughput expression data by mapping and analysis of expression profiles of genes or proteins simultaneously on to regulatory, metabolic and cellular networks. BiologicalNetworks Server is available at http://brak.sdsc.edu/pub/BiologicalNetworks.

Year:  2006        PMID: 16845051      PMCID: PMC1538788          DOI: 10.1093/nar/gkl308

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Networks of molecular interactions are widely studied to reveal the complex roles played by genes, gene products and the cellular environments in biological processes. In these networks (or graphs), the nodes represent genes or gene products and the edges represent specific interactions. In a protein–DNA network, an edge may represent the binding of a transcription factor to a promoter region, while in a protein–protein physical interaction network it might represent recorded evidence of co-immunoprecipitation or a two-hybrid interaction. The nodes of the network are typically associated with additional information about the genes (or gene products), such as positions in the chromosome (or localization sites), or their Gene Ontology (GO) classification. A number of specialized and publicly accessible databases are available, which contain data about the nodes [the SGD database (1)] and the interactions [BIND database (2)], and some databases contain information about both [such as KEGG (3)]. In addition, individual researchers often publish their data as part of their publications or project web sites. A number of tools have been developed by different groups for assimilating, visualizing, analyzing and modeling these molecular interaction network data, of which the most notable are Cytoscape (4), Osprey (5), PathwayAssist (6), Pathways Database System (7), GeneGO and VisANT (8,9). Yet another direction of effort is the storage and analysis of large-scale gene and protein expression data (10–14). While some of these tools store and display expression data, they do not allow query or analysis of such data or integration of novel gene expression data with existing network models.
Ideally, an analysis tool for molecular interaction networks should enable a user to import, efficiently store, effectively retrieve and perform analysis on single genes, gene families, patterns of molecular interactions, as well as on the global structure of the network. The tool needs to be sufficiently flexible for both micro-scale and macro-scale analysis using heterogeneous data, and able to extract data from a large number of disparate databases; it should allow one to construct interaction networks by curation as well as computation [e.g. using algorithms that convert a time-series microarray dataset into an influence network (15)]; it should enable users to retrieve different interaction graphs through on-demand queries and construct new graphs by assembling them in a variety of ways. It should also allow the incorporation of novel datasets locally, such as the user's own microarray expression data, and/or overlay these on biological networks to explore novel relationships among genes. These critical needs are the minimal requirements of a systems level analysis of biological pathways.
Previously we reported a general-purpose scalable warehouse of biological information, PathSys (16). PathSys is a comprehensive data warehouse resulting from the integration of molecular interaction data with other graph-structured data, such as ontologies, e.g. GO (17), and taxonomies, e.g. the enzyme classification system and the functional classification of yeast proteins (3), and with state data such as gene expression profiles. It draws on over 20 curated and publicly contributed data sources, as well as biological experimental and PubMed data, for eight representative organisms (Saccharomyces cerevisiae, Drosophila melanogaster, etc.; for the full list see the website). It contains more than 100 000 events of regulation, interaction and modification among genes, proteins, cell processes and small molecules.
Here we present BiologicalNetworks, the web-based query tool built on top of PathSys, and show how it enables a user to derive novel biological insights at the single gene level and functional relationships at the systems level.

Data integration

PathSys's data integration model achieves the following:
Integration of object and property types from over 20 databases (for a full list see the website), thus creating a controlled vocabulary (ontology) of object/attribute types.
Integration of nomenclature for genes/proteins. Naming conventions can differ between datasets; the server-side parser translates between standard nomenclatures, and an automated reconciliation procedure assigns multiple names as synonyms to the same ORF.
Integration of different types of networks. BiologicalNetworks supports an arbitrary number of interaction types. Users can upload different types of interactions by specifying the evidence codes supported in BiologicalNetworks (see the software documentation for a full list of interaction types).
To illustrate the novelty and capabilities of BiologicalNetworks, Table 1 compares it against Cytoscape and VisANT. A limitation of Cytoscape and VisANT is that their query capability is exactly the graph-data manipulation and filtering capability visible on the interface. Thus, these visual integration tools cannot take an arbitrary combination of operations in an arbitrary order and still have the system retrieve the specified data in a fashion that optimizes memory and disk operations. We have circumvented this limitation of visual integration by a database-level integration method that uses a query evaluation engine implementing a query algebra.
Table 1

Comparison of BiologicalNetworks against Cytoscape and VisANT

| Feature | BiologicalNetworks | Cytoscape | VisANT |
|---|---|---|---|
| Graph manipulation | Developed in house | Based on yFiles package graph engine | Developed in house |
| Project workspace | Project workspace; data sharing through user/account/user privileges mechanism | Not available | Project workspace could be shared by e-mail |
| Data representation | Generic data model having three types of nodes (primary, connector and graph nodes representing modularity) and Node/Attribute type hierarchies | Ternary relations; no modularity | Ternary relations; modularity presented |
| Input | Local file, database load | .sif formatted file | Database load |
| Output | Local (tab delimited, xml, SBML, BN project) file; database edit/update; image printing | Local file; image printing | .visML file |
| Data integration | Data integration engine performing data and property types integration, thus creating biological data and properties ontologies | GO database | SGD, KEGG, GO are integrated |
| Filtering | Filtering by any combination of Attribute/Node types from Attribute/Node type hierarchies | Flexible filters with different attributes of node and edge | Several ‘select’ filters available |
| Search | Analytical search tools; keyword search; build/expand pathways; find direct interactions; find covering pathways (all shortest paths); find common targets/regulators; find intersections with curated pathways | Search node name on the graph | Search by keyword and node name on the graph |
| Network operations | Various layouts, network intersection/union/subtraction, statistics, search for cycles, network comparison (Network BLAST) | Various layouts, several plug-ins for network operations available | Relaxing layout and statistical tool available |
| Microarray data | Import/export microarray data; expression patterns; clustering analysis (different clustering algorithms); visual display (static and dynamic time display) of gene expression on the pathways; building pathways from expression values; building correlation (e.g. Pearson correlation) networks; GO terms overrepresentation analysis (Fisher's test) on expression clusters, networks or groups of genes | Several plug-ins available | Not available |
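The nomenclature integration described in this section (several names from different sources assigned as synonyms to one ORF) can be pictured with a small lookup index. The following is only a minimal sketch under stated assumptions: the record layout, the function names `build_synonym_index` and `resolve`, and the toy records are all illustrative and are not taken from PathSys or BiologicalNetworks.

```python
from collections import defaultdict

# Toy records for illustration only: (source database, name used there, ORF identifier).
RECORDS = [
    ("SGD", "CDC28", "YBR160W"),
    ("BIND", "CDK1", "YBR160W"),
    ("KEGG", "YBR160W", "YBR160W"),
    ("SGD", "GAL4", "YPL248C"),
    ("BIND", "GAL81", "YPL248C"),
]


def build_synonym_index(records):
    """Return (ORF -> set of names, lower-cased name -> ORF). Hypothetical helper."""
    synonyms = defaultdict(set)
    name_to_orf = {}
    for _source, name, orf in records:
        synonyms[orf].add(name)
        name_to_orf[name.lower()] = orf
    return dict(synonyms), name_to_orf


def resolve(name, name_to_orf):
    """Map any known name (case-insensitive) to its ORF, or None if unknown."""
    return name_to_orf.get(name.strip().lower())


SYNONYMS, NAME_TO_ORF = build_synonym_index(RECORDS)
```

With such an index, `resolve("cdk1", NAME_TO_ORF)` and `resolve("CDC28", NAME_TO_ORF)` lead to the same ORF, so interactions uploaded under either name would attach to one node. A real reconciliation procedure would also have to handle ambiguous names shared by several ORFs, which this sketch ignores.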

Data representation

Pathways are represented as a graph with three types of nodes. The nodes of the first type are reserved for genes, proteins, small molecules, cellular processes, etc. The nodes of the second type (controls) represent events of functional regulation, chemical reactions or protein–protein interactions; they can have physical meaning, may denote general associations, or can represent shared characteristics between components. The nodes of the third type represent complex objects, such as macromolecular complexes, functional groups, pathways, etc. In this case components are made up of subcomponents, being compound or modular, and the connections between modular components (or modules) exist along with interconnections between their subcomponents. Interactions in BiologicalNetworks can also be defined as successively higher-level connections between groups of proteins, complexes, pathways or sub-networks. Most importantly, a classification scheme for Property Types (about 2000 unique Property Types and 25 000 nodes of the Property Types tree) allows BiologicalNetworks to represent detailed micro-level information. For example, a protein or process is not merely recorded as localized in the ‘nucleus’; that term is the starting point of finer subdivisions, such as ‘within the nuclear membrane’, ending in ‘outer surface of the nuclear membrane’, where it may be a ‘component’ or simply ‘peripherally associated’ (Figure 1). Such a data model and integration environment align well with the data representations of existing databases, such as BIND (2), KEGG (3), TransPath (18) and eMaze (19), and significantly extend the concepts of other tools, such as Cytoscape and VisANT. To aid biological understanding, interaction networks and protein complexes can be viewed within the context of GO (17) annotations or KEGG (3) pathway assignments.
Figure 1

BiologicalNetworks data representation and querying.
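The three node types and the hierarchical property values described above can be sketched as a few small classes. This is an illustrative model only: the class names, field names and the slash-separated localization path are assumptions made for the example and do not reproduce the actual PathSys schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PrimaryNode:
    """First node type: a gene, protein, small molecule, cellular process, etc."""
    name: str
    node_type: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class ConnectorNode:
    """Second node type (a 'control'): regulation event, reaction or interaction."""
    control_type: str
    inputs: List[str]
    outputs: List[str]
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class GraphNode:
    """Third node type: a complex object (complex, functional group, pathway)."""
    name: str
    members: List[str]
    connectors: List[ConnectorNode] = field(default_factory=list)


def localized_in(node, compartment):
    """True if the node's hierarchical localization path passes through compartment."""
    path = node.properties.get("localization", "")
    return compartment in path.split("/")


# Hypothetical objects; the names carry no biological claim.
tf = PrimaryNode("TF_A", "protein",
                 {"localization": "nucleus/nuclear membrane/outer surface"})
target = PrimaryNode("gene_B", "gene")
binding = ConnectorNode("promoter binding", inputs=["TF_A"], outputs=["gene_B"],
                        properties={"evidence": "hypothetical"})
module = GraphNode("toy pathway", members=["TF_A", "gene_B"], connectors=[binding])
```

Because the connector is itself a node with a property dictionary, annotations such as evidence or kinetic parameters can be attached to the interaction rather than to either participant. A query such as `localized_in(tf, "nucleus")` matches any node annotated at or below that level of the hierarchy.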

Interactions carry not only information about the relevant literature, but also about the experimental system used, together with a rich array of details on the evidence and classification of the biological properties. For example, a ‘genetic interaction’ between two genes may have information on the wild type/mutant forms, ‘phenotype’ (invasiveness, etc.), mutant ‘allele’, the number of gene copies, etc. Such a rich degree of annotation should play a vital role in understanding the nature of the interaction (see manual for description). An important goal of systems biology is to generate dynamical models of molecular interaction networks (20). To enable this capability we have stored kinetic parameters as properties of reactions, reactants and products. This was possible because all three of these objects are represented as nodes in the interaction graph. It allows the user to represent the process graphs in SBML (21) format for dynamical simulation using a variety of computational methods. To accompany BiologicalNetworks, we have developed a preliminary standard for exchanging files that have visual markup and annotation of network layouts. Users of BiologicalNetworks can input several basic data types, including data in standardized network and interaction data exchange formats, such as PSI-MI (22), BioPAX and SBML (21).

Data analysis

Once a network dataset has been imported or loaded into BiologicalNetworks, the genes or proteins within it can be queried for other known and predicted interactions from the PathSys database. Additionally, a repository of curated pathways (∼100 for S. cerevisiae and several hundred for other organisms) is available for analysis. Imported interactions and components define a network ‘workspace’, which can be annotated and saved for sharing within user groups working with BiologicalNetworks. To enable data analysis, the following tools are available:
Search: finds and displays a list of objects based on a name or a keyword.
Expand: searches the database and displays objects functionally linked to a selected node or set of nodes. By alternating the expand and filtering options, users can browse through the database, building their favorite pathways.
Build pathways: finds a set of links between two or more nodes by searching for the shortest path in the total network of all links in the database. This tool assists in finding regulatory paths between all selected objects.
Find common targets/regulators: searches for common targets or regulators of a group of molecules. This tool, like Build pathways, can find functional links between proteins in lists imported from other programs (e.g. gene expression clusters).
Find intersection with curated pathways: searches a group of nodes for other known and predicted interactions from the PathSys repository of curated pathways.
BiologicalNetworks also provides an advanced querying facility for retrieving data of interest by querying Node and Property types. A user-friendly querying interface allows the user to build a query from any logical combination of conditions on both the Node and Property trees (Figure 1, see User's Tutorial for details). Networks can also be analyzed for graph topological properties, such as degree distributions, path lengths, shortest paths or clustering coefficients.
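Several of the tools listed above correspond to standard graph operations. The sketch below shows one way they could look on a toy directed network: a breadth-first search for a shortest path, set intersections for common targets and regulators, and an out-degree distribution. The edge list and all function names are hypothetical, and the code is not the query algebra used by the server; in particular it returns a single shortest path rather than all of them.

```python
from collections import Counter, deque

# Hypothetical directed regulatory edges (regulator -> target).
EDGES = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"),
         ("D", "E"), ("F", "D"), ("F", "C")]


def build_adjacency(edges):
    """Map every node to the set of its direct targets."""
    adj = {}
    for src, dst in edges:
        adj.setdefault(src, set()).add(dst)
        adj.setdefault(dst, set())
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if start not in adj or goal not in adj:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for a deterministic result
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def common_targets(adj, regulators):
    """Nodes that are direct targets of every listed regulator."""
    sets = [adj.get(r, set()) for r in regulators]
    return set.intersection(*sets) if sets else set()


def common_regulators(adj, targets):
    """Nodes that directly regulate every listed target."""
    if not targets:
        return set()
    return {r for r, outs in adj.items() if all(t in outs for t in targets)}


def out_degree_distribution(adj):
    """Map out-degree -> number of nodes with that out-degree."""
    return dict(Counter(len(outs) for outs in adj.values()))


ADJ = build_adjacency(EDGES)
```

On this toy network, `shortest_path(ADJ, "A", "E")` walks A, B, D, E, while `common_targets(ADJ, ["A", "F"])` and `common_regulators(ADJ, ["C", "D"])` each reduce to a single node. Breadth-first search is enough here because all edges are treated as having equal weight; weighted evidence scores would call for a different algorithm.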

Microarray data analysis

Expression data are easily imported through the Import Expression Data Wizard with a minimal amount of data preprocessing. BiologicalNetworks can interpret files of several types, including the Tab Delimited (Stanford) Multiple Sample format, the Affymetrix file format, the TIGR format and the GenePix file format. The Expression Experiment Viewer is designed to display a graphical representation of processed gene expression data. It provides a workspace and a suite of algorithms for data analysis, sorting and searching, clustering and normalization, etc. These give the user flexibility in creating meaningful views of the expression data. Results of the clustering analysis are represented in the form of tables and heat maps, and graphically as expression graphs. These viewers appear as a subtree under the Analysis Result within the main Project Properties tree. The functions available from the Microarray submenu and the Microarray Experiment Manager menu bar allow the user to: open an expression experiment in the form of a table and heat map; sort the experiment by a particular sample; visually display expression data on an existing pathway diagram in different shades of green/red depending on the fold change of expression; build pathways from expression values; build correlation networks (e.g. Pearson correlation); and run GO terms overrepresentation analysis (Fisher's test) on expression clusters, networks or groups of genes. In Figure 2 a sample pathway incorporating expression data from a microarray experiment has been assembled. On the left panel is a hierarchical tree of the analysis workspace, where different types of microarray data as well as analyses and associated results can be accessed.
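Building a correlation network from expression values, as mentioned among the microarray functions, can be sketched as follows: compute the Pearson correlation for every pair of gene profiles and keep the pairs whose absolute correlation reaches a cut-off. The profiles, the 0.9 threshold and the function names are assumptions chosen for the example; the source does not state which threshold or implementation the server uses.

```python
from itertools import combinations
from math import sqrt

# Hypothetical expression profiles (gene -> values across four samples).
PROFILES = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [1.0, -1.0, 1.0, -1.0],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def correlation_network(profiles, threshold=0.9):
    """Return edges (gene, gene, r) for pairs with |r| >= threshold."""
    edges = []
    for g, h in combinations(sorted(profiles), 2):
        r = pearson(profiles[g], profiles[h])
        if abs(r) >= threshold:
            edges.append((g, h, round(r, 3)))
    return edges


NETWORK = correlation_network(PROFILES)
```

Here g1, g2 and g3 end up connected (g3 with negative sign), while g4 stays isolated because its correlation with the others is weak. Keeping the sign of r on each edge makes it possible to distinguish co-expressed from anti-correlated gene pairs when the network is drawn.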
Figure 2

Microarray data analysis in BiologicalNetworks.
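The GO term over-representation analysis (Fisher's test) listed among the microarray functions can be illustrated with the one-sided form of Fisher's exact test, which equals the upper tail of the hypergeometric distribution. This is a sketch under assumptions: the source does not say whether the test is one- or two-sided or how multiple testing is handled, and the term label, gene names and function names below are invented for the example.

```python
from math import comb


def fisher_overrepresentation(k, n, K, N):
    """One-sided p-value P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the cluster, k: cluster genes annotated with the term.
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def go_enrichment(cluster, background, annotations):
    """Return term -> p-value; annotations maps a term to its set of genes."""
    background = set(background)
    cluster = set(cluster) & background
    results = {}
    for term, genes in annotations.items():
        annotated = set(genes) & background
        k = len(cluster & annotated)
        results[term] = fisher_overrepresentation(
            k, len(cluster), len(annotated), len(background))
    return results


# Hypothetical data: 20 background genes, 5 annotated, a cluster of 5 with 3 hits.
BACKGROUND = [f"g{i}" for i in range(1, 21)]
ANNOTATIONS = {"term_A": {"g1", "g2", "g3", "g4", "g5"}}
CLUSTER = ["g1", "g2", "g3", "g10", "g11"]
P_VALUES = go_enrichment(CLUSTER, BACKGROUND, ANNOTATIONS)
```

For the toy input the p-value is 1126/15504, the probability of drawing at least three annotated genes by chance in a cluster of five. In practice many terms are tested at once, so a multiple-testing correction would normally follow; that step is left out of the sketch.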

On the pathway diagram, genes that are up-regulated in a particular experiment are shown in shades of red, while genes that are down-regulated are shown in shades of green; if no match is found, gray is used. Using these data it is now possible to provide further annotations of edge properties, such as positive or negative regulation, or to provide new edges between nodes.
In conclusion, the BiologicalNetworks web server allows a systems level analysis of genomic scale information as well as single object queries over a variety of databases for integrative views of biological function and for hypothesis generation.
References (10 of 22 listed)

1.  The TRANSPATH signal transduction database: a knowledge base on signal transduction networks.

Authors:  F Schacherer; C Choi; U Götze; M Krull; S Pistor; E Wingender
Journal:  Bioinformatics       Date:  2001-11       Impact factor: 6.937

2.  PathFinder: reconstruction and dynamic visualization of metabolic pathways.

Authors:  Alexander Goesmann; Martin Haubrock; Folker Meyer; Jörn Kalinowski; Robert Giegerich
Journal:  Bioinformatics       Date:  2002-01       Impact factor: 6.937

3.  The systems biology markup language (SBML): a medium for representation and exchange of biochemical network models.

Authors:  M Hucka; A Finney; H M Sauro; H Bolouri; J C Doyle; H Kitano; A P Arkin; B J Bornstein; D Bray; A Cornish-Bowden; A A Cuellar; S Dronov; E D Gilles; M Ginkel; V Gor; I I Goryanin; W J Hedley; T C Hodgman; J-H Hofmeyr; P J Hunter; N S Juty; J L Kasberger; A Kremling; U Kummer; N Le Novère; L M Loew; D Lucio; P Mendes; E Minch; E D Mjolsness; Y Nakayama; M R Nelson; P F Nielsen; T Sakurada; J C Schaff; B E Shapiro; T S Shimizu; H D Spence; J Stelling; K Takahashi; M Tomita; J Wagner; J Wang
Journal:  Bioinformatics       Date:  2003-03-01       Impact factor: 6.937

4.  Pathways database system: an integrated system for biological pathways.

Authors:  L Krishnamurthy; J Nadeau; G Ozsoyoglu; M Ozsoyoglu; G Schaeffer; M Tasan; W Xu
Journal:  Bioinformatics       Date:  2003-05-22       Impact factor: 6.937

5.  The Gene Ontology (GO) database and informatics resource.

Authors:  M A Harris; J Clark; A Ireland; J Lomax; M Ashburner; R Foulger; K Eilbeck; S Lewis; B Marshall; C Mungall; J Richter; G M Rubin; J A Blake; C Bult; M Dolan; H Drabkin; J T Eppig; D P Hill; L Ni; M Ringwald; R Balakrishnan; J M Cherry; K R Christie; M C Costanzo; S S Dwight; S Engel; D G Fisk; J E Hirschman; E L Hong; R S Nash; A Sethuraman; C L Theesfeld; D Botstein; K Dolinski; B Feierbach; T Berardini; S Mundodi; S Y Rhee; R Apweiler; D Barrell; E Camon; E Dimmer; V Lee; R Chisholm; P Gaudet; W Kibbe; R Kishore; E M Schwarz; P Sternberg; M Gwinn; L Hannick; J Wortman; M Berriman; V Wood; N de la Cruz; P Tonellato; P Jaiswal; T Seigfried; R White
Journal:  Nucleic Acids Res       Date:  2004-01-01       Impact factor: 16.971

6.  Pathway studio--the analysis and navigation of molecular networks.

Authors:  Alexander Nikitin; Sergei Egorov; Nikolai Daraselia; Ilya Mazo
Journal:  Bioinformatics       Date:  2003-11-01       Impact factor: 6.937

7.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

8.  The HUPO PSI's molecular interaction format--a community standard for the representation of protein interaction data.

Authors:  Henning Hermjakob; Luisa Montecchi-Palazzi; Gary Bader; Jérôme Wojcik; Lukasz Salwinski; Arnaud Ceol; Susan Moore; Sandra Orchard; Ugis Sarkans; Christian von Mering; Bernd Roechert; Sylvain Poux; Eva Jung; Henning Mersch; Paul Kersey; Michael Lappe; Yixue Li; Rong Zeng; Debashis Rana; Macha Nikolski; Holger Husi; Christine Brun; K Shanker; Seth G N Grant; Chris Sander; Peer Bork; Weimin Zhu; Akhilesh Pandey; Alvis Brazma; Bernard Jacq; Marc Vidal; David Sherman; Pierre Legrain; Gianni Cesareni; Ioannis Xenarios; David Eisenberg; Boris Steipe; Chris Hogue; Rolf Apweiler
Journal:  Nat Biotechnol       Date:  2004-02       Impact factor: 54.908

9.  Osprey: a network visualization system.

Authors:  Bobby-Joe Breitkreutz; Chris Stark; Mike Tyers
Journal:  Genome Biol       Date:  2003-02-27       Impact factor: 13.583

10.  PathMAPA: a tool for displaying gene expression and performing statistical tests on metabolic pathways at multiple levels for Arabidopsis.

Authors:  Deyun Pan; Ning Sun; Kei-Hoi Cheung; Zhong Guan; Ligeng Ma; Matthew Holford; Xingwang Deng; Hongyu Zhao
Journal:  BMC Bioinformatics       Date:  2003-11-07       Impact factor: 3.169
