SmartR: an open-source platform for interactive visual analytics for translational research data.

Sascha Herzinger1, Wei Gu1, Venkata Satagopam1, Serge Eifes1,2, Kavita Rege1, Adriano Barbosa-Silva1, Reinhard Schneider1.   

Abstract

SUMMARY: In translational research, efficient knowledge exchange between the different fields of expertise is crucial. An open platform that can store a multitude of data types, such as clinical, pre-clinical or OMICS data, and that offers strong visual analytical capabilities will significantly accelerate scientific progress by making data more accessible and hypothesis generation easier. The open data warehouse tranSMART can store a variety of data types and has a growing user community that includes both academic institutions and pharmaceutical companies. tranSMART, however, currently lacks interactive and dynamic visual analytics and does not permit any post-processing interaction or exploration. For this reason, we developed SmartR, a plugin for tranSMART that not only equips the platform with several dynamic visual analytical workflows, but also provides its own framework for the addition of new custom workflows. Modern web technologies such as D3.js and AngularJS were used to build a set of standard visualizations that were substantially enhanced with dynamic elements.
AVAILABILITY AND IMPLEMENTATION: The source code is licensed under the Apache 2.0 License and is freely available on GitHub (https://github.com/transmart/SmartR).
CONTACT: reinhard.schneider@uni.lu.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28334291      PMCID: PMC5870773          DOI: 10.1093/bioinformatics/btx137

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Translational research can be described as an ‘interdisciplinary branch of the biomedical field supported by three main pillars: benchside, bedside and community’ (Cohrs ). One of the most difficult, yet most important, tasks in this field is proper communication and knowledge exchange between the different fields of expertise. An information system that integrates all levels of data (pre-clinical, clinical, OMICS, etc.) generated in research and that provides an interactive interface to explore, visualize and analyze those data will substantially increase the efficiency of knowledge exchange and hypothesis generation. In the context of the eTRIKS (European Translational Information & Knowledge Management Services) consortium (https://www.etriks.org/, 2017), academia and pharma seek to combine their interdisciplinary knowledge to provide secure data environments and open source tools that help to answer important biological questions and enable new discoveries within existing studies. The tranSMART platform (Athey ) addresses these requirements: it supports a multitude of data types, has a well-established community and provides APIs that make connections with a variety of other services possible. Detailed reasons for this choice and comparisons to other existing analytics platforms are described elsewhere (Satagopam ). A major problem in tranSMART at present is the lack of interactive visual-analytical functionality, which is essential for a collaborative knowledge management platform. Currently, analytical workflows are restricted to displaying static images generated by the statistical programming language R. The static nature of this approach makes it very difficult to apply any post-processing analysis or to explore further, for example by selecting a certain feature for closer investigation.
When confronted with this problem, one might first attempt to use existing web-visualization libraries like Highcharts (http://www.highcharts.com/, 2017), Plotly.js (Plotly Technologies Inc., 2015), or one of the many BioJS (Gómez ) components. Browsing existing heat map implementations or other basic visualizations, one can see that, although visually appealing, they integrate little beyond the most basic statistics. Because of the analytical limitations of the web browser and the absence of an analytical engine, they cannot re-compute initial input values and therefore often choose not to display many statistics in the first place, in order to preserve their dynamism. Another approach is to combine custom visualizations with an analytical engine, as shown in DIVE (Rysavy ), BRAVIZ (Angulo ), HitWalker2 (Bottomly ), or Shiny (https://shiny.rstudio.com/, 2017). Using custom visualizations with a supporting analysis component enables the researcher to explore the data iteratively with each analysis step, in contrast to hypothesis-driven research. To make this methodology available to translational researchers, we developed SmartR, a new, highly modular analytical framework for tranSMART that equips the platform with interactive and dynamic visualization capabilities, built using recent web technologies.

2 Materials and methods

The tranSMART platform uses Grails (https://grails.org/, 2017) as its web framework, which provides a plugin architecture of its own. It was therefore a natural choice to use Grails for the back-end of our plugin as well. This gives the plugin direct access to internal services and APIs, which ensures consistent database access across the different Oracle and Postgres versions of the platform and helps to keep maintenance low. To back the visualizations with non-trivial statistics, for instance clustering information, it was necessary to properly integrate a language for statistical computation, such as R (R Development Core Team, 2011), into Grails. Because Grails uses Groovy, which integrates well with most standard Java libraries, we could use the Java client for Rserve, ‘a TCP/IP server which allows other programs to use facilities of R’ (Urbanek, 2003). In other words, the back-end of our application has direct read and write access to the R session. As a base for the front-end, we decided to use the framework AngularJS (https://angularjs.org/, 2017) to enforce an MVC (Model-View-Controller) structure for each workflow. Besides the usual advantages with regard to maintenance and testability, enforcing a specific structure keeps the workflows similar to one another, even when multiple developers with different levels of experience contribute. This was a high-priority goal from the outset, because it would allow the formation of a small community that could contribute its own ideas and requirements to the plugin. Another reason for building upon an almost completely decoupled framework, rather than integrating SmartR directly into tranSMART, is that the platform’s code base changes regularly and carries a long list of partially outdated dependencies. The visualizations are implemented as AngularJS directives, which enables arbitrary placement of the plots in HTML.
Technically, most JavaScript visualization libraries can be used within such a directive, but we focused on the low-level library D3.js, ‘a JavaScript library for manipulating documents based on data’ (Bostock ). While the coding effort to create even basic visualizations is quite high, D3.js gives the developer a high level of freedom for customization and creativity. This allowed us to implement features that we found useful and that were not provided by other visualization libraries. An example of this is the dynamic heat map that we created.

3 Results

The framework itself equips the tranSMART platform with a new analytical engine that is testable, maintainable, and expandable. We also provide a series of prebuilt, commonly used visual-analytical workflows. In the following, we focus on one of these workflows, the interactive heat map, as an example to illustrate the interactive and dynamic nature of the platform. Videos, screenshots and links to public test servers for all created visual analytics can be found in the Supplementary Material. For several decades, heat maps have been a common tool for analyzing gene expression data, but displaying non-static heat maps with the limited resources of a web browser is a challenge. The SmartR heat map provides a solution by implementing a lazy-loading approach, in which initially only the 100 most significant genes according to user-defined ranking criteria are displayed. This reduction of displayed data lets us treat the individual fields of the heat map as movable dynamic elements, rather than as a static image. Doing so enables several useful features, such as the possibility to change the clustering on the fly, to select various color sets for different data types and for accessibility (color blindness), or to sort rows and columns of the heat map. Another feature is the possibility to ‘expand’ the heat map by overlaying non-array, one-dimensional data types (see Fig. 1), e.g. phenotypic data like ‘Age’ (numerical) or ‘Tumor Type—T0’ (categorical). In this way one can directly relate clusters or single samples to certain user-defined groups within the selected cohort(s), leading to a much better understanding of the data across different data types. All displayed genes can be directly linked to external annotation databases like GeneCards and the EMBL-EBI database. This function allows the user to link the findings to much broader knowledge bases with a single click.
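The lazy-loading idea described above can be illustrated with a minimal, language-neutral sketch in Python. This is not SmartR's code, which runs in R and JavaScript. The ranking criterion in SmartR is user-defined; population variance is used here purely as an assumed example, and the gene and sample names are hypothetical placeholders. The sketch ranks all genes, keeps only the top `n`, and flattens the selected rows into one record per heat map cell, which is the form in which each field can then be handled as an individual element.

```python
from statistics import pvariance


def top_n_genes(expression, n=100, score=pvariance):
    """Return the n gene IDs with the highest score, best first.

    `expression` maps a gene ID to its list of values (one per sample).
    `score` stands in for the user-defined ranking criterion (assumption).
    """
    ranked = sorted(expression, key=lambda gene: score(expression[gene]), reverse=True)
    return ranked[:n]


def heatmap_cells(expression, genes, samples):
    """Flatten the selected rows into one record per heat map cell."""
    return [
        {"gene": gene, "sample": sample, "value": expression[gene][i]}
        for gene in genes
        for i, sample in enumerate(samples)
    ]


# Hypothetical toy data: three genes measured in three samples.
samples = ["S1", "S2", "S3"]
expression = {
    "GENE_A": [1.0, 1.0, 1.0],   # no variation across samples
    "GENE_B": [0.0, 5.0, 10.0],  # strongest variation
    "GENE_C": [2.0, 3.0, 4.0],   # mild variation
}

selected = top_n_genes(expression, n=2)
cells = heatmap_cells(expression, selected, samples)
print(selected)
print(len(cells))
```

Because only the selected rows are turned into cells, the number of elements the browser has to manage grows with `n` times the number of samples rather than with the full size of the expression matrix.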
A similar link to external knowledge becomes available when we apply a clustering to the heat map and click on one of the resulting dendrogram nodes. This gathers all genes in the respective sub-tree and triggers a KEGG pathway enrichment analysis via external tools like BioCompendium (http://biocompendium.embl.de/, 2017), which allows us to link a cluster directly to a possibly related KEGG pathway. Besides the interactive heat map, we also provide a few other commonly used analyses: correlation analysis, where the user can select regions on the correlation plot and instantly get an updated analysis, as well as box plots, volcano plots, and line graphs for visualizing longitudinal-like data. We could not cover all possible analyses in our implementation, but we would like to emphasize that SmartR provides not only a list of pre-built analyses but also a framework for easy implementation of customized workflows. To truly grasp the dynamic nature of this approach, we highly recommend watching the related videos available in the Supplementary Material.
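Gathering the genes below a clicked dendrogram node amounts to collecting the leaves of a sub-tree. The following Python sketch shows that traversal under stated assumptions: the `Node` class, the `genes_below` helper, and the gene names are hypothetical and are not taken from the SmartR code base, and the subsequent call to an enrichment service is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """A dendrogram node: leaves carry a gene ID, inner nodes carry children."""
    gene: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def genes_below(node):
    """Collect the gene IDs of all leaves in the sub-tree rooted at `node`."""
    if not node.children:
        # A leaf contributes its own gene (if it has one).
        return [node.gene] if node.gene is not None else []
    genes = []
    for child in node.children:
        genes.extend(genes_below(child))
    return genes


# Hypothetical clustering of four placeholder genes into two clusters.
left = Node(children=[Node(gene="GENE_A"), Node(gene="GENE_B")])
right = Node(children=[Node(gene="GENE_C"), Node(gene="GENE_D")])
root = Node(children=[left, right])

# "Clicking" the left inner node yields the gene list for that cluster.
clicked = genes_below(left)
print(clicked)
```

The resulting list is what would be handed to an enrichment tool; clicking the root would return every gene currently shown in the heat map.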
Fig. 1. The SmartR heat map. Shown is the interactive heat map in tranSMART, based on the breast cancer mRNA data of the GEO study GSE4382 (Sorlie )
References

1.  D³: Data-Driven Documents.

Authors:  Michael Bostock; Vadim Ogievetsky; Jeffrey Heer
Journal:  IEEE Trans Vis Comput Graph       Date:  2011-12       Impact factor: 4.579

2.  BioJS: an open source JavaScript framework for biological data visualization.

Authors:  John Gómez; Leyla J García; Gustavo A Salazar; Jose Villaveces; Swanand Gore; Alexander García; Maria J Martín; Guillaume Launay; Rafael Alcántara; Noemi Del-Toro; Marine Dumousseau; Sandra Orchard; Sameer Velankar; Henning Hermjakob; Chenggong Zong; Peipei Ping; Manuel Corpas; Rafael C Jiménez
Journal:  Bioinformatics       Date:  2013-02-23       Impact factor: 6.937

3.  DIVE: a graph-based visual-analytics framework for big data.

Authors:  Steven J Rysavy; Dennis Bromley; Valerie Daggett
Journal:  IEEE Comput Graph Appl       Date:  2014 Mar-Apr       Impact factor: 2.088

4.  A Multi-facetted Visual Analytics Tool for Exploratory Analysis of Human Brain and Function Datasets.

Authors:  Diego A Angulo; Cyril Schneider; James H Oliver; Nathalie Charpak; Jose T Hernandez
Journal:  Front Neuroinform       Date:  2016-08-23       Impact factor: 4.081

5.  Repeated observation of breast tumor subtypes in independent gene expression data sets.

Authors:  Therese Sorlie; Robert Tibshirani; Joel Parker; Trevor Hastie; J S Marron; Andrew Nobel; Shibing Deng; Hilde Johnsen; Robert Pesich; Stephanie Geisler; Janos Demeter; Charles M Perou; Per E Lønning; Patrick O Brown; Anne-Lise Børresen-Dale; David Botstein
Journal:  Proc Natl Acad Sci U S A       Date:  2003-06-26       Impact factor: 12.779

6.  tranSMART: An Open Source and Community-Driven Informatics and Data Sharing Platform for Clinical and Translational Research.

Authors:  Brian D Athey; Michael Braxenthaler; Magali Haas; Yike Guo
Journal:  AMIA Jt Summits Transl Sci Proc       Date:  2013-03-18

7.  HitWalker2: visual analytics for precision medicine and beyond.

Authors:  Daniel Bottomly; Shannon K McWeeney; Beth Wilmot
Journal:  Bioinformatics       Date:  2015-12-26       Impact factor: 6.937

8.  Integration and Visualization of Translational Medicine Data for Better Understanding of Human Diseases.

Authors:  Venkata Satagopam; Wei Gu; Serge Eifes; Piotr Gawron; Marek Ostaszewski; Stephan Gebel; Adriano Barbosa-Silva; Rudi Balling; Reinhard Schneider
Journal:  Big Data       Date:  2016-06       Impact factor: 2.128
