
BicOverlapper 2.0: visual analysis for gene expression.

Rodrigo Santamaría, Roberto Therón, Luis Quintales.

Abstract

MOTIVATION: Systems biology demands several points of view to reach a more comprehensive understanding of biological problems. This usually means taking into account different data about the problem at hand, but it also means looking at the same data from different perspectives. This multifaceted aspect of systems biology often requires several tools, and it is often hard to integrate them seamlessly, which would help the analyst hold an interactive discourse with the data.
RESULTS: Focusing on expression profiling, BicOverlapper 2.0 visualizes the most relevant aspects of the analysis, including expression data, profiling analysis results and functional annotation. It also integrates several state-of-the-art numerical methods, such as differential expression analysis, gene set enrichment or biclustering.
AVAILABILITY AND IMPLEMENTATION: BicOverlapper 2.0 is available at: http://vis.usal.es/bicoverlapper2
© The Author 2014. Published by Oxford University Press.

Year:  2014        PMID: 24590442      PMCID: PMC4058931          DOI: 10.1093/bioinformatics/btu120

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

BicOverlapper 1.0 (Santamaría et al., 2008) focused on visualizing complex gene expression analysis results from biclustering algorithms. Based on Venn-like diagrams and overlapping visualization layers, it successfully conveyed biclusters. As the authors and third-party users worked with BicOverlapper, several new requirements arose, and it has evolved to support other analysis techniques and additional steps of the analysis process. Other tools in the field have evolved similarly. For example, Expander has extended microarray data analysis with relational and functional information (Ulitsky et al., 2010). Hierarchical Clustering Explorer, although originally designed for general use, added new methods for bioinformatics analysis (Seo et al., 2006). Treeview (Saldanha, 2004) is developing toward a new version that will address high-throughput biology needs (see https://www.princeton.edu/∼abarysh/treeview/).

2 APPROACH

During the design of BicOverlapper 2.0, we focused on a high level of interaction and a visual analytics approach (Thomas and Cook, 2005). Another important design principle was the simplification of installation and interfaces. Finally, following the original ‘overlapping’ philosophy, we designed linked visualizations and an agglomerative use of standard numerical analyses. For example, differential expression analysis compares two experimental conditions, but BicOverlapper 2.0 can compare several combinations of experimental conditions at once and then visualize the relationships between the differentially expressed groups.
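The idea of running several pairwise comparisons at once and then relating the resulting groups can be sketched in a few lines. The following is only an illustration, not the tool's implementation: it replaces the limma statistics with a plain difference of means, and the function names, the `min_diff` cutoff and the toy gene names are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def upregulated(expr, cond_a, cond_b, min_diff=1.0):
    """Genes whose mean expression in cond_a exceeds cond_b by min_diff.

    Assumption: a simple mean difference stands in for a proper
    differential expression test such as limma.
    """
    return {
        gene
        for gene, by_cond in expr.items()
        if mean(by_cond[cond_a]) - mean(by_cond[cond_b]) >= min_diff
    }


def all_comparisons(expr, conditions, min_diff=1.0):
    """Run every pairwise comparison in both directions."""
    groups = {}
    for a, b in combinations(conditions, 2):
        groups[f"{a} versus {b}"] = upregulated(expr, a, b, min_diff)
        groups[f"{b} versus {a}"] = upregulated(expr, b, a, min_diff)
    return groups


def group_overlaps(groups):
    """Non-empty intersections between every pair of groups."""
    overlaps = {}
    for n1, n2 in combinations(groups, 2):
        common = groups[n1] & groups[n2]
        if common:
            overlaps[(n1, n2)] = common
    return overlaps


# Toy data (hypothetical): gene -> condition -> replicate values.
expr = {
    "geneA": {"early": [3.0, 3.2], "mid": [3.1, 2.9], "late": [1.0, 1.2]},
    "geneB": {"early": [1.0, 1.1], "mid": [1.0, 0.9], "late": [1.1, 1.0]},
    "geneC": {"early": [0.5, 0.7], "mid": [2.4, 2.6], "late": [0.6, 0.4]},
}
groups = all_comparisons(expr, ["early", "mid", "late"])
shared = group_overlaps(groups)
```

Here `shared` holds the genes that appear in more than one comparison, which is the kind of relationship that overlapping groups are meant to expose visually.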

3 METHODS

The tool is implemented as two interconnected layers: visualization and analysis.

The analysis layer depends on R/Bioconductor and uses several packages and ad hoc scripts. Data retrieval from Gene Expression Omnibus (GEO) and ArrayExpress is supported by their corresponding packages (Davis and Meltzer, 2007; Kauffmann et al., 2009), although it requires high bandwidth and not all experiments are supported. Data analysis includes the following:

Differential expression with limma (Smyth, 2005). In addition to one-to-one comparisons, BicOverlapper can perform multiple comparisons at once, visualized as intersecting differentially expressed groups. This reduces analysis time and lets the differences between the comparisons be inspected.

Gene set enrichment analysis, implemented via GSEAlm (Oron et al., 2008). Enriched gene sets are visualized as overlapping groups.

Biclustering, computed as in the previous version with the biclust package (Kaiser). The Iterative Signature Algorithm (ISA) is now also available through the isa2 package.

Correlation networks. This is a simple yet powerful method to find groups. Genes with low overall expression variation are filtered out, and the rest are linked if their profile distance falls below a threshold expressed in standard deviations. The resulting network is visualized as a force-directed layout, where nodes can be colored by the expression under selected conditions.

The visualization layer is developed in Java and communicates with the analysis layer via rJava (Urbanek, 2007). This layer contains several visualization techniques, with implementations based on Prefuse (Heer) (networks, scatterplots), Processing (Reas and Fry, 2007) (overlapper, heatmap) and plain Java (parallel coordinates, word clouds).
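The correlation network step (filter low-variation genes, then link genes with close profiles) can be illustrated as follows. This is a sketch under explicit assumptions, not the tool's code: it uses Euclidean distance between profiles, and it reads the threshold as the mean pairwise distance minus `k` standard deviations of the pairwise distances. The names `correlation_network`, `min_sd` and `k`, and the toy profiles, are hypothetical.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def correlation_network(profiles, min_sd=0.5, k=1.0):
    """Return (nodes, edges) for a distance-based gene network.

    Assumptions: Euclidean profile distance, and a link cutoff of
    mean(distances) - k * sd(distances).
    """
    # 1. Drop genes with low overall expression variation.
    kept = {g: p for g, p in profiles.items() if pstdev(p) >= min_sd}
    nodes = sorted(kept)

    # 2. Distance between every pair of remaining profiles.
    distances = {
        (a, b): dist(kept[a], kept[b]) for a, b in combinations(nodes, 2)
    }
    if len(distances) < 2:
        return nodes, []

    # 3. Link pairs that are unusually close to each other.
    values = list(distances.values())
    cutoff = mean(values) - k * pstdev(values)
    edges = [pair for pair, d in distances.items() if d < cutoff]
    return nodes, edges


# Toy expression profiles (hypothetical), one value per condition.
profiles = {
    "g1": [0, 2, 4, 2],
    "g2": [0, 2, 4, 2.2],
    "g3": [4, 2, 0, 2],
    "g4": [4, 2.1, 0, 2],
    "flat": [1, 1, 1.1, 1],
}
nodes, edges = correlation_network(profiles)
```

In this toy input the nearly constant gene is filtered out and the two pairs of similar profiles are linked. The resulting node and edge lists are what a force-directed layout would take as input.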

4 RESULTS

To involve biology specialists in bioinformatics analyses, we need simpler and highly interactive tools. For example, Figure 1 was generated just by clicking two menu options and selecting one visual item and gene/condition labels, in a process that takes no more than 5 min (see Supplementary Video at http://vis.usal.es/bicoverlapper2/docs/tour.mp4). Underneath, this requires the seamless connection of different steps: expression data loading, computation of distribution statistics, three differential expression analyses (for up- and downregulation), gene annotation retrieval and the visualization of four interactive representations.
Fig. 1.

Yeast gene expression profile along three cell cycles, from experiment GSE3431 (Tu et al., 2005). Each cell cycle is divided into three time intervals (early, mid and late). Differential expression for every combination of such intervals is computed and visualized as overlapping groups. Thirty-six genes upregulated at early and mid intervals have been selected (intersection between ‘early versus late’ and ‘mid versus late’ groups at the bottom left); their expression profiles are shown in parallel coordinates and heatmap visualizations. Finally, the functional annotations, stacked by term, are shown as a word cloud, indicating, for example, that 9 of the 36 genes are related to metabolic and oxidation–reduction processes.

Figure 1 provides a considerable amount of information about the experiment. First, parallel coordinates (Inselberg, 2009) indicate with boxplots that the data are normalized, although probably skewed towards upregulation. Second, differential expression groups, displayed as Venn diagrams, present a large overlap for genes upregulated at mid and early timepoints with respect to late timepoints. These intersecting genes have a clear pattern under heatmap and parallel coordinates and include nine genes related to the Gene Ontology (GO) term ‘oxidation–reduction process’ and five related to ‘fatty acid beta-oxidation’.
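The per-term tally behind a word cloud like the one in Figure 1 amounts to counting how many of the selected genes carry each annotation. A minimal sketch, assuming annotations are available as a gene-to-terms mapping; the function name, gene names and annotations below are hypothetical and are not taken from GSE3431.

```python
from collections import Counter


def term_counts(selected_genes, annotations):
    """Count, for each term, how many selected genes are annotated with it."""
    counts = Counter()
    for gene in selected_genes:
        # set() so a term listed twice for one gene is counted once;
        # genes without annotations contribute nothing.
        counts.update(set(annotations.get(gene, ())))
    return counts


# Toy annotations (hypothetical).
annotations = {
    "geneA": ["oxidation-reduction process", "fatty acid beta-oxidation"],
    "geneB": ["oxidation-reduction process"],
    "geneC": ["metabolic process"],
}
counts = term_counts(["geneA", "geneB", "geneC", "geneD"], annotations)
```

Each count would then set the size of the corresponding term in the word cloud.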

5 CONCLUSION

BicOverlapper is a simple-to-use, highly visual and interactive tool for gene expression analysis. Easily and without programming knowledge, the user can get an overall view of several expression aspects, from raw data to analysis results and functional annotations. This may significantly reduce the analysis time and improve the analytical discourse with the data. For the future, we are working on support for high-throughput data, especially RNA-Seq, and on comprehensive report and image generation.

Funding: This work was supported by the Spanish Government, under the Ministerio de Economía y Competitividad-MINECO (projects BFU2011-28804 and Consolider-Ingenio CSD007-00015) and by the Ministerio de Ciencia e Innovación - MICINN (project FI2010-16234).

Conflict of Interest: none declared.
REFERENCES

1.  Java Treeview--extensible visualization of microarray data.

Authors:  Alok J Saldanha
Journal:  Bioinformatics       Date:  2004-06-04       Impact factor: 6.937

2.  Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes.

Authors:  Benjamin P Tu; Andrzej Kudlicki; Maga Rowicka; Steven L McKnight
Journal:  Science       Date:  2005-10-27       Impact factor: 47.728

3.  BicOverlapper: a tool for bicluster visualization.

Authors:  Rodrigo Santamaría; Roberto Therón; Luis Quintales
Journal:  Bioinformatics       Date:  2008-03-05       Impact factor: 6.937

4.  GEOquery: a bridge between the Gene Expression Omnibus (GEO) and BioConductor.

Authors:  Sean Davis; Paul S Meltzer
Journal:  Bioinformatics       Date:  2007-05-12       Impact factor: 6.937

5.  An interactive power analysis tool for microarray hypothesis testing and generation.

Authors:  Jinwook Seo; Heather Gordish-Dressman; Eric P Hoffman
Journal:  Bioinformatics       Date:  2006-01-17       Impact factor: 6.937

6.  Expander: from expression microarrays to networks and functions.

Authors:  Igor Ulitsky; Adi Maron-Katz; Seagull Shavit; Dorit Sagir; Chaim Linhart; Ran Elkon; Amos Tanay; Roded Sharan; Yosef Shiloh; Ron Shamir
Journal:  Nat Protoc       Date:  2010-01-28       Impact factor: 13.491

7.  Importing ArrayExpress datasets into R/Bioconductor.

Authors:  Audrey Kauffmann; Tim F Rayner; Helen Parkinson; Misha Kapushesky; Margus Lukk; Alvis Brazma; Wolfgang Huber
Journal:  Bioinformatics       Date:  2009-06-08       Impact factor: 6.937

8.  Gene set enrichment analysis using linear models and diagnostics.

Authors:  Assaf P Oron; Zhen Jiang; Robert Gentleman
Journal:  Bioinformatics       Date:  2008-09-11       Impact factor: 6.937
