
Exploring and visualizing multidimensional data in translational research platforms.

William Dunn, Anita Burgun, Marie-Odile Krebs, Bastien Rance.   

Abstract

The unprecedented advances in technology and scientific research over the past few years have provided the scientific community with new and more complex forms of data. Large data sets collected from single groups or cross-institution consortiums containing hundreds of omic and clinical variables corresponding to thousands of patients are becoming increasingly commonplace in the research setting. Before any core analyses are performed, visualization often plays a key role in the initial phases of research, especially for projects where no initial hypotheses are dominant. Proper visualization of data at a high level facilitates researchers' ability to find trends, identify outliers and perform quality checks. In addition, research has uncovered the important role of visualization in data analysis and its implied benefits in facilitating our understanding of disease and ultimately improving patient care. In this work, we present a review of the current landscape of existing tools designed to facilitate the visualization of multidimensional data in translational research platforms. Specifically, we reviewed the biomedical literature for translational platforms allowing the visualization and exploration of clinical and omics data, and identified 11 platforms: cBioPortal, interactive genomics patient stratification explorer, Igloo-Plot, The Georgetown Database of Cancer Plus, tranSMART, an unnamed data-cube-based model supporting heterogeneous data, Papilio, Caleydo Domino, Qlucore Omics, Oracle Health Sciences Translational Research Center and OmicsOffice® powered by TIBCO Spotfire. In a health sector continuously witnessing an increase in data from multifarious sources, visualization tools used to better grasp these data will grow in importance, and we believe our work will be useful in guiding investigators in similar situations.
© The Author 2016. Published by Oxford University Press.


Keywords:  data analytics; high-dimensional data; omics; translational research; visualization


Year:  2017        PMID: 27585944      PMCID: PMC5862238          DOI: 10.1093/bib/bbw080

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


Introduction

Background

The continued digitization of our world, along with recent advances in technology, is providing researchers with data at an unprecedented rate in a variety of fields such as molecular biology, business and government [1]. Big data in general typically faces challenges summarized as the five Vs (the sheer volume of data, the velocity at which data are received and sent, the variety of formats and types, questions of veracity and the ability to turn raw data into valuable information), and medical research data are no exception. The technological advances that have followed in the wake of the next-generation sequencing (NGS) experiments at the turn of the 21st century [2] have given rise to the production of ‘big data’ at a scale never seen before. As a result of this recent abundance of data, some have proposed that fundamental paradigms in a variety of domains, especially molecular biology, have shifted to data-driven analysis and visualization leveraging computational power and computer science [3, 4].

Growing need for multidimensional visualizations in health research

In a research environment focused increasingly on high throughput, a common challenge is the comprehensive visualization of data, an important step for any extensive exploration of the data. In Heer et al. [1], apart from providing a thorough review of emerging visualization techniques for big data, the authors outlined several benefits of quality visualization: facilitating our ability to see patterns, trends and outliers; improving comprehension, memory and decision-making; and adding aesthetic appeal to engage a wider audience in data exploration and analysis. In health care or clinical research settings, visual analytics is especially useful in studying parameters across patients when no clear hypotheses are immediately available [5]. Whereas traditional analysis of heterogeneous or multidimensional cohort data with partial overlap usually involves limiting attention to certain subsets (inevitably leading to loss of the overall sense of relationships between different modalities), a thorough visualization can provide a more complete picture, ultimately allowing a more comprehensive study of the data that improves hypothesis generation and the research workflow [6]. As a result, systematic organization of research data can facilitate translational science and jump-start drug discovery [7], contribute to patient stratification and personalized medicine [8] and ultimately improve the quality of health care [9].

Driving motivation for the review

Quality visualization can be applied to any of the numerous domains where big data has recently affected the health-care arena such as, among others, managing cost, supporting quality improvement, monitoring patients for clinical deterioration and improving treatment efficiency in emergency care [10-13]. In clinical research, multidimensional data can be used to help segment patients or elucidate disease pathways. This has most notably been seen in oncology, with large data sets containing various genomics and clinical data for thousands of cancer patients such as The Cancer Genome Atlas (TCGA [14]) or the International Cancer Genome Consortium [15]. However, multi-omics research has extended into a wide variety of fields such as dementia and Alzheimer's disease (Alzheimer's Disease Neuroimaging Initiative [16]), autism spectrum disorder (National Database for Autism Research [17]), psychiatric diseases (Psychiatric Genomics Consortium [18]) and rare diseases (RD-Connect [19]). To better explore and take advantage of these rich, diverse data sets, researchers need efficient visualization that allows experts to seamlessly explore heterogeneous data on demand.

Multidimensional visualization basics

While basic statistics visualizations such as histograms, bar charts, line graphs or scatter plots typically suffice for one- or two-dimensional data, complex multidimensional data pose more challenges to researchers. The central question is usually how to better grasp the rich multivariable data and their relations contained in data sets with hundreds or thousands of patients or variables. A variety of techniques ranging from simple box plots to complex radial tree layout diagrams [20] exist to better visualize multiple variables of a multidimensional data set. We have provided a brief sampling of these techniques based on several variables from a local study in Figure 1. For example, interactive, filterable, dynamic pivot tables can allow for a variety of visualizations for multidimensional data. Correlation matrices using multiple scatter plots provide additional insight into the interaction between variables. In addition, heatmaps are commonly used for multidimensional data, especially in genetic research with expression, pathway or molecular abundance data; they involve a matrix where each cell is colored according to a gradient and are often clustered by samples [22]. Heatmaps and other visualizations are available in a wide variety of software such as R, Matlab® and SAS®, as well as to users without programming knowledge through programs with intuitive user interfaces (e.g. ClustVis [23], HemI [24]).
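As a minimal sketch of the numbers behind the correlation matrix mentioned above, the following Python snippet computes pairwise Pearson coefficients, which could then be mapped to a color gradient as in a heatmap. The function names and the toy values are assumptions made for illustration; they are not taken from the study data shown in Figure 1.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {(name_i, name_j): r} for every pair of named columns."""
    names = list(columns)
    return {(i, j): pearson(columns[i], columns[j]) for i in names for j in names}

# Made-up toy values; the variable names only echo the naming style of Figure 1.
toy = {
    "Var1": [1.0, 2.0, 3.0, 4.0],
    "Var2": [2.0, 4.0, 6.0, 8.0],   # rises with Var1
    "Var3": [4.0, 3.0, 2.0, 1.0],   # falls as Var1 rises
}
matrix = correlation_matrix(toy)
```

Each cell of `matrix` lies between -1 and 1, so it can be mapped directly onto a diverging color scale. Constant columns would need separate handling because their variance is zero.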
Figure 1.

A sampling of commonly used visualization techniques for multidimensional data, using a subset of our data set compiled from three groups of patients. Var1, Var2 and Var3 are neurocognitive dimensions, Var4 and Var5 are psychopathological dimensions and Var6 is a global genetic index. Specific visualizations used are (A) dynamic pivot table (using R ‘rpivotTable’ package), (B) correlation matrix (using R ‘PerformanceAnalytics’ package), (C) heatmap clustered by rows and columns (using R ‘gplots’ package), (D) 3D scatterplot using color and size (using R ‘scatterplot3d’ package) and (E) parallel coordinates showing all data (using d3 JavaScript library ‘d3.parcoords.js’ [21]). A colour version of this figure is available at BIB online: https://academic.oup.com/bib.

Another increasingly common technique for visualizing the relationships between variables in multidimensional data sets is parallel coordinates. Here, vertical axes corresponding to each variable scaled to a common height are placed next to each other and connected with lines representing different samples [25]. This technique has been enhanced by tools such as scatter plot matrix overlay [26], proximity-based shading [27] and clustering methods that eliminate overplotting [28]. One particular application of parallel coordinate visualization in current research is Dynamics Visualization based on Parallel Coordinates, which uses multidimensional methods to visualize complex and dynamic biochemical networks to better understand disease mechanism and ultimately to derive effective treatment strategies [29]. In many cases, multidimensional visualizations can be combined with each other. For example, visualizations can be constructed to provide elegant high-level representations of large multi-omics studies containing billions of data points arising from multiple genetic experiments and clinical and demographic data from hundreds of patients [30-32].
For instance, OmicCircos [33] is an R package that produces circular plots capable of integrating expression, copy number variations (CNV) and protein fusions as well as visualizations of statistics that compare data across these sources. This allows researchers a high-level view that may facilitate the understanding of complex diseases such as cancer or psychiatric diseases. Two other interesting R packages that integrate multi-omics with visualizations are coMET [34], which incorporates epigenetic results and other types of genomic data such as expression profiles, and caOmicsV [35], which also provides several options for viewing various genomic data side by side with other phenotypic data.
The field of data visualization is immense. Dedicated tools and libraries have been developed and are available through a rising number of open-source and fee-based platforms. For example, many scientists rely on various programming languages or statistics packages with data-visualization capabilities such as R [36] or Python Matplotlib [37]. More and more researchers are turning to JavaScript graphics libraries to enhance visualization with dynamic capabilities. Such libraries include Highcharts [38], Chart.js [39], Dygraphs [40], JavaScript InfoVis Toolkit [41] and D3.js (Data-Driven Documents [42]) (for a comprehensive overview and side-by-side comparison of these libraries see [43]). In sum, impressive techniques have been developed to answer the clear need for strong data visualization in health-care research. However, such tools and techniques are not easily accessible to clinician or biologist end users. R packages or Python libraries are easy to leverage for a bioinformatician, but the knowledge gap is often too wide for biologists and clinicians without a background in bioinformatics or biostatistics. A common challenge is finding these visualizations seamlessly incorporated within a translational research platform without the need for complicated backend programming.
Such systems would open the door to all members of the clinical research team, not only those with programming backgrounds, a common theme in contemporary translational bioinformatics [44]. In this work, we will review the tools available to researchers and clinicians that fill this gap and provide intuitive visualization solutions for multidimensional clinical and omics data to advance health science and translational research.
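To make the parallel-coordinates layout described earlier in this section concrete, the sketch below computes only the geometry: each variable is min-max scaled to a common axis height, and each sample becomes a polyline across the axes. The function names and sample records are hypothetical, and the actual drawing (for example with Matplotlib or D3.js) is deliberately left out.

```python
def scale_to_unit(values):
    """Min-max scale one variable so its axis spans the common height [0, 1]."""
    low, high = min(values), max(values)
    if high == low:                      # constant variable: place it mid-axis
        return [0.5 for _ in values]
    return [(v - low) / (high - low) for v in values]

def parallel_coordinate_lines(samples, variables):
    """Return one polyline per sample as a list of (axis_index, height) points."""
    scaled = {var: scale_to_unit([s[var] for s in samples]) for var in variables}
    return [
        [(axis, scaled[var][row]) for axis, var in enumerate(variables)]
        for row in range(len(samples))
    ]

# Hypothetical samples; the names and values are illustrative only.
samples = [
    {"age": 20, "score": 10.0, "index": 0.2},
    {"age": 40, "score": 30.0, "index": 0.2},
    {"age": 60, "score": 20.0, "index": 0.2},
]
lines = parallel_coordinate_lines(samples, ["age", "score", "index"])
```

Because every axis is rescaled independently, variables with very different units can share one plot. The order of `variables` determines which relationships are visually adjacent.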

Materials and methods

Literature review methods

Our literature review can be seen as a follow-up to our previous article reviewing translational research platforms integrating heterogeneous data [45]. In the current project, we searched for systems (i) that accept a variety of data types (and at least clinical and omics data), (ii) that feature data visualization functionalities and (iii) that provide researchers with data analysis or statistical functionalities. We are interested in characterizing a comprehensive current landscape of tools that can be used in translational research to provide visualizations for multidimensional medical research data with easy-to-use graphical user interfaces. Therefore, we have strived to include a wide variety of tools with slightly different dedicated domains, structures, capacities and availability. The first three platforms that met these inclusion criteria came from the previous review [cBioPortal, The Georgetown Database of Cancer (G-DOC) Plus and tranSMART]. We then searched the scientific literature available through PubMed® [46] using Medical Subject Headings terms and free-text search, and subsequently identified 367 articles potentially describing visualization for heterogeneous data (PubMed queries and literature search details are available in Supplementary Table S1). We identified three new platforms through this step, and one from citations in one of the corresponding publications. To completely cover the field of translational platforms, we decided to also include commercial products in our review. We identified candidates through Google® search and discussion with colleagues. The web search and discussions led to the addition of one open-source platform and three commercial products meeting the inclusion criteria.
Overall, 11 platforms with advanced visualization capacities were included in the review: cBioPortal, interactive genomics patient stratification explorer (iGPSe), Igloo-Plot, G-DOC Plus, tranSMART, an unnamed data-cube-based model supporting heterogeneous data, Papilio, Caleydo Domino, Qlucore Omics, Oracle Health Sciences Translational Research Center and OmicsOffice powered by TIBCO Spotfire. The first eight programs are open source, whereas the last three are commercial products. We next identified the main features of each program, analyzed along five major axes: general information, licensing, information content supported, visualization and data exploration. This information was based on publicly available resources (i.e. original articles published in PubMed describing the systems and dedicated Web sites) and direct correspondence with authors of the original papers or representatives for commercial products. In addition, we include our personal experience using the programs where available (based on using the five in-use open-source programs cBioPortal, Igloo-Plot, G-DOC Plus, tranSMART and Caleydo Domino as well as demo versions of Qlucore Omics and OmicsOffice).

Results

Overview of multi-visualization tools

Our search identified several flexible analytic tools or software programs with easy-to-use front-end graphical user interfaces (GUI) that have been developed to help researchers visualize complex data without needing deep data analytics or programming backgrounds. Tables 1 and 2 summarize general information, licensing, information content supported, visualization and data exploration features for each system. The text below summarizes the systems in general with particular focus on visualization.
Table 1.

Summary of visualization programs for multidimensional data that can be applied to user-provided datasets. For each tool reviewed, we evaluated a number of features organized by the various categories: General Information, Licensing. PoC = Proof of Concept.

| Category | Item | cBioPortal | iGPSe | Igloo-Plot | tranSMART | G-DOC Plus | Data-cube-based model supporting heterogeneous data | Papilio | Caleydo Domino | Qlucore Omics Explorer | Oracle Health Sciences Translational Research Center | OmicsOffice® powered by TIBCO Spotfire |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Availability group | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Commercial | Commercial | Commercial |
| General information | PMID or article reference | 22588877 | 25000928 | 24444495 | 25717408 | 27130330 | 25248201 | (Steenwijk et al. 2010) [8] | 26356916 | NA | NA | NA |
| General information | Initial release year | 2012 | 2014 | 2014 | 2012 | 2016 | 2014 | 2010 | 2014 | 2007 | 2011 | 1996 |
| General information | URL | cbioportal.org | osumo.org/#process | metagenomics.atc.tcs.com/IglooPlot | transmartfoundation.org | gdoc.georgetown.edu | NA | NA | caleydo.org/tools/domino | qlucore.com | oracle.com/us/industries/health-sciences/hs-cohort-explorerds-1672120.pdf | cambridgesoft.com/ensemble/spotfire/OmicsOffice/ |
| General information | Reference | github.com/cBioPortal/cbioportal | osumo.org | metagenomics.atc.tcs.com/IglooPlot/walkthrough.html | wiki.transmartfoundation.org | NA | NA | NA | github.com/Caleydo/org.caleydo.view.domino | qlucore.com/documentation | oracle.com/us/industries/health-sciences/hs-cohort-explorerds-1672120.pdf | scistore.cambridgesoft.com/ScistoreProductPage.aspx?ItemID=8541 |
| General information | Data housing | MySQL | Apache server | Internal memory from loaded data | Any Relational Database Management System (e.g. Oracle, PostgreSQL) | Oracle | Internal C++ data structures from data | SQLite | Internal memory from loaded data | Internal memory from loaded data | SAS Cloud or on premises (MySQL) | Cloud or on premises (Oracle or SQL) |
| General information | Principal frontend and/or backend programming languages | Java and Spring in backend, JavaScript with libraries such as D3 and jQuery in front end | JavaScript, d3.js, R | perkTk | Grails, Java | Groovy & Grails, Adobe Flex, JavaScript | C++, using a framework based on OpenGL and Qt4 | C++ | Java, OpenGL/JOGL | C++ | Oracle ADF/Java EE on the front end, with hooks into Oracle BI. The backend is Oracle stack data and middle tiers so Oracle DB, Oracle BIFS, Oracle Weblogic in a Java 2EE environment | .NET/C# with code in IronPython, R, and in some cases C/C++ |
| General information | Current status | In use | PoC | In use | In use | In use | PoC | PoC | In use | In use | In use | In use |
| General information | Dedicated domain | Exploration of large-scale cancer genomics sets | Integrative genomics based cancer patient stratification | General visualization of multidimensional datasets | Hypothesis generation, hypothesis validation, and cohort discovery in translational research | Integrative analysis of various data types to uncover disease mechanisms | Exploration of heterogeneous data in clinical cohorts | Exploration of heterogeneous data in clinical cohorts | General visualization of multidimensional datasets | Visualization, exploration, and analysis of bioinformatics data | Data aggregation, integration, data cleaning for clinical cohort studies | Start to end genomics data analysis |
| Licensing | Software availability | Open source (GNU Affero General Public License, version 3) | Open source | Open source | Open source | Open source | NA | NA | Open source (BSD License) | Fee-based | Fee-based | Fee-based |
| Licensing | Client-side interface | Web browser | Web browser | Stand-alone for Linux or Windows | Web browser | Web browser | Stand-alone | Stand-alone (Trolltech Qt interface) | Web browser or stand-alone | Stand-alone | Web browser | Stand-alone |
| Licensing | User mailing list or support | Yes | No | No | Yes | Yes | No | No | No | Yes | Yes | Yes |
Table 2.

Summary of visualization programs for multidimensional data that can be applied to user-provided datasets. For each tool reviewed, we evaluated a number of features organized by the various categories: Information content supported, visualization, data-exploration. PoC = Proof of Concept. ANOVA = analysis of variance.

cBioPortal

cBioPortal, originally developed at the Memorial Sloan-Kettering Cancer Center, provides an interactive platform to visualize the data for over 120 different cancer studies [47, 48]. In a typical workflow, a researcher will select a cancer study, select data type priority such as mutation and copy number alteration data, enter a list of genes of interest and then visualize various graphics summarizing the data slice. For example, researchers can investigate the frequency of specific mutations at each gene for the study, see scatter plots and box plots showing interaction between genomic events from different platforms and explore survival analyses where available. Advanced visualization features include an interactive Cytoscape graph that allows users to explore genes of interest within the larger network context and a MutationMapper graphic that allows interactive exploration of population-wide genetic events linked to tables and three-dimensional (3D) visualizations. Some notable advantages of the tool are that it allows for easy integration with Integrative Genomics Viewer (IGV [49]) for more detailed genetic exploration and also provides a convenient REST-based web API (Application Programming Interface) that gives researchers an even wider range of analysis options. While the public online version is based on TCGA data sets, users can customize their instances by editing the code available through GitHub [50].
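The kind of per-gene summary described in this workflow can be illustrated with a short Python sketch that computes, for a list of genes of interest, the fraction of samples carrying at least one recorded event. This is not cBioPortal code and does not use its API; the function name and the event records are made up for illustration.

```python
from collections import defaultdict

def alteration_frequency(events, genes, n_samples):
    """Fraction of samples with at least one recorded event in each queried gene."""
    altered = defaultdict(set)
    for sample_id, gene in events:
        if gene in genes:
            altered[gene].add(sample_id)   # count each sample once per gene
    return {gene: len(altered[gene]) / n_samples for gene in genes}

# Hypothetical (sample, gene) events; not real study data.
events = [
    ("S1", "TP53"), ("S1", "TP53"), ("S2", "TP53"),
    ("S3", "KRAS"), ("S4", "EGFR"),
]
freq = alteration_frequency(events, ["TP53", "KRAS", "BRCA1"], n_samples=4)
```

Using a set per gene means a sample with several events in the same gene is counted once, which is the usual convention when reporting the share of altered cases.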

iGPSe

iGPSe is a proof-of-concept visual analytic system designed to allow users to perform complicated feature selection, clustering and subgroup comparison of genomic and clinical data without the need for deep programming or scripting knowledge [8]. Users begin by loading mRNA, microRNA (miRNA) and clinical data, as well as lists of genes of interest. The clustering analysis section allows users to select clustering parameters and visualize results with heatmaps, silhouette plots and interactive sparsity graphs. The final section, integrative patient stratification, contains interactive parallel sets based on clustering analysis linked to survival plots that allow real-time survival comparison of mRNA or miRNA clusters [51]. The principal advantage of this software is that, while applicable to other fields, it was developed with the input of domain experts in oncology to seamlessly integrate relevant features such as the various clustering algorithms, options to refine clusters and use of interactive summary pages.
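Silhouette plots such as those mentioned above display one value per sample, commonly defined as s(i) = (b - a) / max(a, b), where a is the mean distance to the sample's own cluster and b the mean distance to the nearest other cluster. The sketch below is a generic illustration of that definition with Euclidean distance; it is not iGPSe code, and the points and labels are invented. It assumes at least two clusters.

```python
import math

def silhouette_values(points, labels):
    """Silhouette value s(i) = (b - a) / max(a, b) for every point."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    def mean_dist(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:                       # singleton cluster: s(i) is taken as 0
            values.append(0.0)
            continue
        a = mean_dist(i, own)             # cohesion within the point's own cluster
        b = min(mean_dist(i, members)     # separation from the nearest other cluster
                for other, members in clusters.items() if other != label)
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values

# Invented 2D points forming two well-separated pairs.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = silhouette_values(points, [0, 0, 1, 1])   # sensible grouping
bad = silhouette_values(points, [0, 1, 0, 1])    # grouping that splits each pair
```

Values near 1 indicate samples that sit well inside their cluster, while negative values flag samples that are closer to another cluster, which is what makes the plot useful for refining clusters.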

Igloo-Plot

Igloo-Plot is an interactive visualization tool for multidimensional data in general developed by TATA Consultancy Services [52, 53]. Users download the application, upload their data according to predefined data formats and are presented with several normalization, statistical analysis, clustering [54] and data visualization options. Options allowing for the selection of subgroups of samples or features are available through user-provided regular expressions. Principal visualization features include line graphs displaying variation across variables to aid in the normalization steps as well as the characteristic semicircular, or ‘igloo’, plot that facilitates the identification of clusters within the data and the identification of markers that define the clusters.
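Selecting subgroups of features through a user-provided regular expression can be sketched in a few lines of Python. The snippet is a generic illustration using the standard `re` module, not Igloo-Plot's implementation; the feature labels and patterns are hypothetical.

```python
import re

def select_by_pattern(names, pattern):
    """Return the names that match a user-supplied regular expression."""
    regex = re.compile(pattern)
    return [name for name in names if regex.search(name)]

# Hypothetical feature labels.
features = ["gene_TP53", "gene_KRAS", "clin_age", "clin_sex", "gene_EGFR"]
gene_features = select_by_pattern(features, r"^gene_")
clinical_or_kras = select_by_pattern(features, r"^clin_|KRAS$")
```

The same function works for sample identifiers, so one pattern language covers both rows and columns of the data matrix.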

G-DOC Plus

G-DOC Plus is an updated version of the original G-DOC data management platform designed in 2011 to integrate structured clinical research with high-throughput data to advance precision medicine, translational research and population genetics [55, 56]. General visualization features include survival curves, Venn diagrams and heatmaps as well as those more specific for high-throughput analyses such as tools to visualize copy number instability, interaction networks and 3D representations of molecular targets. A principal feature of G-DOC Plus is its plug-in-based structure, which furthers its commitment to stay up-to-date with emerging omic technologies; the current version supports a wide variety of formats to accept mRNA, copy number variation, metabolite mass spectrometry and whole genome sequencing data. As of the date of manuscript drafting, G-DOC Plus allows users to explore data for >10 000 patients from over 50 public data sets from a wide variety of domains such as pediatric and adult oncology and wound healing. Data can also be loaded with the assistance of the support team by following a detailed data loading standard operating procedure.

TranSMART

TranSMART is a rapidly growing, web-based, robust research management and analysis platform designed to integrate disparate data sources and close the gap between basic science and clinical practice. It is based on an N-tier architecture (data, business and presentation tiers in this case) and a Java schema, and is currently used by >100 organizations around the world. It features a simple user interface involving drag-and-drop movements that allows for an interactive analysis of a wide variety of data (demographic, diagnosis, medication, genetic, etc.) [57, 58] (Figure 2). The default installation provides a wide variety of basic, noninteractive, R-based plotting options such as scatterplots, bar charts and histograms, as well as more complex waterfall plots, Manhattan plots and frequency plots for genomic analysis. TranSMART benefits from a growing worldwide community dedicated to improving its data processing and analytic features as well as its visualization features. For example, one project in our group involves expanding the visualization capabilities of SmartR, a Grails plug-in designed to improve the visual analytics of tranSMART through advanced visualization libraries such as d3.js [59].
Figure 2

Overview of tranSMART. In a typical workflow, users define subsets of patients based on a drag and drop method of variables from the right column to the appropriate boxes (A). In this example, the summary statistics view (B) shows age difference between patients with genotypes (subsets 1 and 2, respectively) in a candidate gene. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
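The workflow in Figure 2, defining two patient subsets and comparing a variable between them, can be approximated in plain Python as follows. This is only a sketch of the idea, not tranSMART code; the record fields, genotype labels and values are hypothetical.

```python
from statistics import mean, stdev

def define_subset(patients, **criteria):
    """Keep patients whose attributes equal every given criterion."""
    return [p for p in patients if all(p.get(k) == v for k, v in criteria.items())]

def summarize(subset, variable):
    """Basic summary statistics for one numeric variable (needs >= 2 patients)."""
    values = [p[variable] for p in subset]
    return {"n": len(values), "mean": mean(values), "sd": stdev(values)}

# Hypothetical records; genotype labels and ages are invented.
patients = [
    {"id": 1, "genotype": "AA", "age": 30},
    {"id": 2, "genotype": "AA", "age": 40},
    {"id": 3, "genotype": "AB", "age": 50},
    {"id": 4, "genotype": "AB", "age": 70},
]
subset_1 = summarize(define_subset(patients, genotype="AA"), "age")
subset_2 = summarize(define_subset(patients, genotype="AB"), "age")
age_difference = subset_2["mean"] - subset_1["mean"]
```

Passing several keyword criteria to `define_subset` combines them with a logical AND, which mirrors dropping more than one variable into the same subset box.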


Data-cube-based model supporting heterogeneous data

The next tool in which we were interested was a proof of concept developed by Angelelli et al. [6] based on a data-cube-based model and designed for the visual exploration and analysis of large heterogeneous medical cohort studies. This software allows researchers to upload various data sets such as radiology results and cognitive scoring, slice patient groups based on specific features and then visualize how the data correlate with each other. The principal visualization component consists of a multiple-view dashboard featuring scatterplots, histograms and a 3D brain atlas color-coded by fiber bundle. These visualizations are all coordinated with each other based on interactive drag-and-drop or highlighting functions that allow users to select variables or data points of interest. The main advantages of this system are the flexibility of accepting incomplete, partially overlapping data reflective of real-world situations as well as the structure of the data storage, which allows fast, flexible calculations describing the relationships between different pieces of data.
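Partially overlapping data can be handled by restricting each comparison to the patients who have values for both measurements. The sketch below illustrates that idea with a simple feature-to-patient mapping. It is an assumption-laden toy and not the data-cube structure of Angelelli et al.; the feature names, patient identifiers and values are invented.

```python
def shared_patients(cube, feature_a, feature_b):
    """Patients that have a value for both features (pairwise-complete cases)."""
    return sorted(set(cube.get(feature_a, {})) & set(cube.get(feature_b, {})))

def paired_values(cube, feature_a, feature_b):
    """Aligned (a, b) value pairs restricted to the overlapping patients."""
    return [(cube[feature_a][p], cube[feature_b][p])
            for p in shared_patients(cube, feature_a, feature_b)]

# Hypothetical sparse store: feature -> {patient_id: value}.
cube = {
    "cognitive_score": {"P1": 27, "P2": 30, "P4": 22},
    "lesion_volume":   {"P2": 1.4, "P3": 0.9, "P4": 3.1},
}
pairs = paired_values(cube, "cognitive_score", "lesion_volume")
```

Only P2 and P4 appear in both measurements, so only they contribute to a scatterplot or correlation of the two features, while P1 and P3 remain available for other comparisons.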

Papilio

Papilio is another interactive tool that leverages visual analytics to explore heterogeneous medical cohort data, guide medical researchers and facilitate hypothesis generation, especially when no evident hypotheses are initially favored. After loading data, a first module called PrePap prepares the data. Next, the visualization module, VisPap, offers an interactive data exploration environment where users interact with a dashboard showing scatterplots, parallel coordinates and line diagrams, all coordinated so as to maintain relationships and dependencies of data. Users also have the ability to visualize statistical analyses such as confidence-weighted principal component ellipses overlaid onto the data. Its principal features include a thorough image-processing pipeline that prepares raw images for downstream analysis as well as its robust conceptual framework based on domains, features and mappers that enhances the flexibility of the database while maintaining relationships between data.

Caleydo Domino

Domino is a flexible data-visualization tool that improves the extraction, manipulation and comparison of interconnected heterogeneous subsets of multidimensional data sets in general [60, 61]. Users position draggable blocks in a workspace to rapidly assemble complex coordinated graphical schema representing the data and relationships between subsets. The software features a wide variety of simple and complex visualizations to incorporate into the schema, ranging from histograms and scatterplots to parallel coordinate plots, mosaic plots and Sankey diagrams [62] (Figure 3). Two principal features are an intuitive GUI, with placeholders and live previews that indicate possible drop locations and possible visualizations to use, and a library of innovative visualization techniques such as flexible linked axes (‘Flexible linked axes for multivariate data visualization’) and StratomeX, used for interactive visualization in cancer subtype analysis [64] (Figure 4).
Figure 3

A demonstration of Caleydo Domino exploring multiple tabular data sets from a music data set containing song and musician information. This figure displays the main user interface of the program, where users can drag and position data subsets and choose which calculations or visualizations to use to explore data and relationships between data [63]. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.

Figure 4

A demonstration of StratomeX exploring multiple tabular data sets from the TCGA clear cell renal carcinoma data set. This figure displays the main user interface of the program, where users can drag and position data subsets and choose which calculations or visualizations to use to explore data and relationships between data. Above, users can visualize the relation between patients with subtypes based on two different genomic clustering experiments [65]. A colour version of this figure is available at BIB online: https://academic.oup.com/bib.
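The relation between two patient stratifications, as shown in Figure 4, rests on counting how many patients each pair of groups has in common. A minimal Python sketch of that count follows; it is not Caleydo or StratomeX code, and the cluster labels and patient identifiers are hypothetical.

```python
from collections import Counter

def stratification_overlap(clustering_a, clustering_b):
    """Count patients shared by each pair of groups from two stratifications."""
    shared = set(clustering_a) & set(clustering_b)   # patients present in both
    return Counter((clustering_a[p], clustering_b[p]) for p in shared)

# Hypothetical cluster assignments from two different clustering experiments.
mrna = {"P1": "m1", "P2": "m1", "P3": "m2", "P4": "m2", "P5": "m2"}
methylation = {"P1": "c1", "P2": "c2", "P3": "c2", "P4": "c2", "P6": "c1"}
overlap = stratification_overlap(mrna, methylation)
```

Each count could set the width of a band connecting two groups in a Sankey-style or parallel-sets view. Patients missing from either clustering (P5 and P6 here) are left out of the counts.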


Qlucore Omics

Because we believe it is important to survey the widest variety of visualizations used to promote translational research with multidimensional data sets, we also reviewed available commercial solutions. The first is Qlucore Omics, a platform started in 2007 in Lund, Sweden, and optimized for exploring biological data sets through interactive analysis and visualization features [66]. Data are loaded using a wizard, then preprocessed and analyzed in a GUI workspace where users can select data and the specific graphics and analyses to perform. The wide assortment of supported visualizations ranges from scatterplots and histograms to heatmaps and network visualizations, all based on data and parameters selected from a toolbar. Users can additionally annotate data by features or statistical results, specify particular data or data slices to be plotted and synchronize visualizations, for example by color codes, to meet specific requirements. Like most commercial products, the software comes with complete documentation, support and comprehensive tutorials. An advantage of this program is the sheer number of features available, including calculations ranging from simple t-test statistics to advanced machine learning classifier builders.

Oracle Health Sciences Translational Research Center

Oracle Health Sciences Translational Research Center (TRC) provides a standardized industrial architecture that helps store, integrate and analyze multi-omic and clinical data and is specifically designed to facilitate biomarker discovery, validation and application to clinical care [67]. The software's top-layer component is a cohort explorer used to identify and stratify clinical cohorts based on various normalization and filtering criteria. A principal advantage of the system is that it contains a rich omics data bank compiled from a large number of public studies, which helps fit the project at hand into the context of up-to-date literature and promotes cross-study omics data analysis. Of note, while the TRC supports direct integration with statistical and visualization software or even natural language processing functionality for test reports, these features are not included in the basic system package.

OmicsOffice® powered by TIBCO Spotfire

Our final commercial product to review is OmicsOffice, a comprehensive genomics data analysis tool backed by the TIBCO Spotfire data visualization and analytics software [68, 69]. Users work almost entirely within the GUI environment to perform genomic experiments and analyze data, with almost no data preprocessing required from start to finish. Visualization is based on a coordinated dashboard view where users can see all graphs and data and choose which data are displayed in real time using mouse-guided data slicing features. Visualization techniques run the gamut from interactive bar and pie charts to pathway viewers and volcano plots for genomic results. OmicsOffice recognizes a wide range of proprietary omics data formats and includes workflows for integrating and running group comparisons on cross-platform data. Among the benefits of the program are the comprehensive, peer-reviewed 'click and go' analytic pipelines for specific experiments such as quantitative polymerase chain reaction (qPCR), microarrays and NGS, which take in raw data and produce full reports containing publication-ready graphics and information on quality control.

Discussion

In this manuscript, we have provided a detailed review investigating current visualization tools for multidimensional, big clinical research data sets used to promote translational research. We believe thorough visualization that integrates diverse data sources will become increasingly relevant in an environment where digitalization of the health field continues to accelerate.

Limitations

For the purpose of this review, we limited the scope to platforms controlled by intuitive graphical user interfaces that were flexible in receiving user-provided data. However, one related area that could have implications for visualization in translational research in general is that of tools developed to investigate fixed input data sets, usually arising from large multi-institutional research studies comprising various data from hundreds or thousands of patients. We also discuss further techniques that have been used to visualize data in the medical field, beyond those used in the translational research applications described above.

Heterogeneity of the reviewed platforms

The use cases covered by the different platforms are heterogeneous (general cohort exploration, genomics analysis, general translational research and so forth). However, most of the systems could be used for a variety of applications leveraging similar data. Although the analytical capacities of platforms are complex to compare because of their difference in scope, we believe that the visualization features are relevant to explore together. In addition, we believe it was necessary to include visualizations from a variety of use cases to include the most comprehensive picture of contemporary visualization trends for exploration of heterogeneous health-related data sets.

Tools designed to visualize data for specific data sets

Data visualization has been shown to be especially helpful in oncology research, where visualization is crucial for understanding certain genomic events, verifying data quality and identifying important aspects of cancer development (see [21] for a thorough review). For example, NetGestalt [70] allows for multi-omic exploration of the colorectal cancer TCGA data set and canEvolve [71] allows for integrated exploration of multiple TCGA studies. Note that while the current version of cBioPortal is dedicated primarily to the TCGA cancer data sets, we decided to keep this platform in our review because of its code availability and its strong presence in the translational research community. In addition, SysBioCube is an integrative data analysis platform designed by the US Army Medical Research group to study posttraumatic stress disorder [72], and Data Portal is a tool for interactive exploration of cognitive and radiological data for pediatric patients [73]. These tools allow researchers to intuitively explore rich data sets to uncover important biological pathways, regulation networks or drug targets.

Additional visualization techniques used in health research

A thorough review of emerging innovative visualization techniques for high-dimensional, complex data, based on the innumerable ways of mapping data variables to visual features such as position, size, shape and color, is presented by Heer et al. [1]. For example, in visualizing time series data, various methods such as stacked graphs or index graphs showing percentage of change based on a selected point are available. Various techniques have been proposed to convert time data and events into optimal formats to facilitate quick interactive visualization [74, 75]. KNAVE-II is a tool designed to analyze and visualize time-oriented clinical data, whose principal feature is the ability to classify and characterize raw time data using a predefined knowledge base [76]. In addition, a growing number of methods exist to represent spatial data, such as color encoding (choropleth maps), overlaying graduated symbols or size distortion (cartograms). Spatial representation and cartography are also used in various medical research domains including brain function mapping [77], exploration of topographical distribution of skin molecules [78], identification of splice events in neurexins [79] and of course the more traditional domain of epidemiology [80]. Finally, a number of graph methods have been used to visualize the relation between the different points in a network, such as force-directed layouts, arc diagrams and, as discussed previously, matrix views. In medical research, network visualization is especially useful in exploration of genetic or proteomic information and molecular pathways [81, 82], and several tools exist to facilitate this process [83, 84].
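To make the index graph idea concrete, the following minimal Python sketch re-expresses a series as percentage change relative to a user-selected reference point. The function name `index_series` and the sample values are hypothetical illustrations and are not taken from any of the reviewed tools.

```python
def index_series(values, base_index):
    """Return each value as percentage change relative to values[base_index]."""
    base = values[base_index]
    if base == 0:
        raise ValueError("the reference value must be non-zero")
    return [100.0 * (v - base) / base for v in values]


# Hypothetical laboratory measurement recorded at five consecutive visits.
measurements = [4.0, 5.0, 6.0, 3.0, 4.0]

# Re-index the series on the first visit; an interactive tool would recompute
# this whenever the user selects a different reference point.
print(index_series(measurements, base_index=0))
```

Plotting the re-indexed values for several patients on one axis makes their relative trajectories comparable even when their absolute levels differ.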

Desiderata

During our search for contemporary multidimensional data visualization tools in the scientific domain, and in additional searches spanning other domains where big data also poses challenges and opportunities, such as data journalism, security and human–machine interfaces, we noticed several recurring themes. Going forward, we believe that tools for multidimensional data visualization could be enhanced by adding capabilities for patient slicing, coordinated views, interactivity, flexibility, scalability and statistical power. We briefly describe each feature below.

Patient slicing, grouping or clustering

Multidimensional data sets with large numbers of samples or features are typically difficult for humans to grasp fully without some type of synthesis. As a result, various dimension reduction techniques such as principal component analysis (PCA) [85], self-organizing maps [86] and locally linear embedding [87] have been proposed to simplify the data to only the most salient features. In addition, at the individual patient level, especially in studies with hundreds or thousands of patients, it is important to be able to select only relevant samples according to features or clusters of similar samples. This was important for our project, which consisted of data from a wide variety of sources, and helped us, for example, separate out the effects of methylation (epigenetic) and genetic mutations on the risk of transition to psychosis.
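As an illustration of dimension reduction, the sketch below approximates the first principal component of a small sample-by-feature matrix using power iteration on the covariance matrix. This is a simplified stand-in chosen to keep the example dependency-free; it is not the implementation used by any reviewed platform, and a real analysis would normally rely on an established statistics library. The function name and the data are hypothetical.

```python
import math
import random


def first_principal_component(rows, iterations=200, seed=0):
    """Approximate the first principal component by power iteration.

    rows: list of samples, each a list of numeric features of equal length.
    Returns (unit-length component, score of each sample on that component).
    """
    n, d = len(rows), len(rows[0])
    # Center every feature on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Repeatedly multiply a random start vector by the covariance matrix and
    # renormalize; it converges towards the dominant eigenvector.
    rng = random.Random(seed)
    v = [rng.random() for _ in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            raise ValueError("data have no variance")
        v = [x / norm for x in w]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Hypothetical patients described by three numeric features.
patients = [
    [1.0, 2.1, 0.5],
    [2.0, 3.9, 0.4],
    [3.0, 6.2, 0.6],
    [4.0, 7.8, 0.5],
]
component, scores = first_principal_component(patients)
print(component)
print(scores)
```

The scores give each patient a single coordinate along the direction of greatest variance, which can then be plotted or used to select groups of similar samples.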

Coordinated or linked views

Moreover, visualization tools for multidimensional visualization are enhanced with multiple coordinated views, allowing users to see the same data set from different perspectives at once. This enables flexible exploration of various nuanced hypotheses with interactive data selection, or ‘brushing’, and can be applicable in a variety of domains outside of medicine from international politics to baseball [88]. Two interesting examples are PRISMA, which allows users to see uploaded data represented by treemaps, scatterplots and parallel coordinates, all coordinated with each other in terms of color, filter and selection [89], and SEURAT, which combines linked views with exploratory analyses for microarray data visualization [90].
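The coordination behind linked views can be sketched as a shared selection that every view listens to. The Python example below is a minimal, GUI-free illustration under that assumption; the class names `SharedSelection` and `View` and the patient identifiers are hypothetical and do not correspond to the architecture of PRISMA, SEURAT or any other tool mentioned here.

```python
class SharedSelection:
    """Holds the currently brushed sample IDs and notifies registered views."""

    def __init__(self):
        self._selected = set()
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def brush(self, sample_ids):
        # Replace the selection and push it to every linked view.
        self._selected = set(sample_ids)
        for callback in self._listeners:
            callback(frozenset(self._selected))


class View:
    """A stand-in for one plot; it only tracks which samples to highlight."""

    def __init__(self, name, data, selection):
        self.name = name
        self.data = data  # maps sample ID to the value shown in this view
        self.highlighted = {}
        selection.subscribe(self.on_selection)

    def on_selection(self, selected):
        self.highlighted = {sid: value for sid, value in self.data.items()
                            if sid in selected}


selection = SharedSelection()
age_view = View("age histogram", {"P1": 20, "P2": 35, "P3": 50}, selection)
expression_view = View("expression scatterplot",
                       {"P1": 0.1, "P2": 0.9, "P3": 0.4}, selection)

# Brushing two patients in any one view highlights them in all views.
selection.brush(["P1", "P3"])
print(age_view.highlighted)
print(expression_view.highlighted)
```

Keeping the selection in one shared object, rather than in each view, is what lets color, filter and selection stay consistent across views.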

Interactivity

Often going hand in hand with patient slicing or coordinated views, interaction is a key aspect of visualization tools that facilitates flexible searching and localizing of interesting features in a data set through intuitive commands [91]. Many of the popular visualization platforms mentioned in the introduction consist of or support user interaction ranging from tooltips on mouse hover/touch to triggering the reordering of data or other complex actions.

Flexibility

Like many research groups, we are constantly changing the types and formats of data we collect, based both on changes within the scientific community and on the types of patients that enter our research center. This 'variability' issue is likely the most important challenge in analyzing big data [92]. It is, thus, important that tools be flexible enough to accept data types from a wide range of sources. We also understand that this may pose a limit, as measures that increase flexibility by widening acceptable parameters or formats may force us to decrease the level of specificity and, thus, detail for a data source.

Scalability

Given the increasing volume of data generated every day by high-throughput experiments and technologies, another feature typically required for successful translational research is scalability [93]. In addition, it is important for visualizations to be able to transition efficiently through scales of magnitude while keeping an appropriate data granularity. For example, features should be implemented that support 'drilling down' to find specific information about outliers from high-level visualizations [5].

Statistical power

In our study, it was important not only to group or cluster patients but also to understand or measure the strength of the clusters or the differences between them. It is, thus, important that any program we use be backed by a powerful statistics package. Much progress has been made in this domain in the past few years, allowing statistics packages such as R to be easily integrated into third-party software such as Web sites ('embedded scientific computing'; see OpenCPU [94], rApache [95]).
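One common way to quantify how well separated a set of clusters is, offered here as an illustrative choice rather than a method named in this review, is the silhouette coefficient: for each sample, the difference between its mean distance to the nearest other cluster (b) and its mean distance to its own cluster (a), divided by the larger of the two. The sketch below is a minimal Python version using Euclidean distance; the function name and the data are hypothetical.

```python
import math


def silhouette_scores(points, labels):
    """Return the silhouette coefficient of every point (needs >= 2 clusters)."""

    def dist(p, q):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))

    # Group point indices by cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("at least two clusters are required")

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # usual convention for single-member clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items() if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return scores


# Hypothetical two-feature measurements for four patients in two clusters.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels = [0, 0, 1, 1]
scores = silhouette_scores(points, labels)
print(sum(scores) / len(scores))  # close to 1 for well-separated clusters
```

Values near 1 indicate compact, well-separated clusters, values near 0 indicate overlapping clusters and negative values suggest that samples may have been assigned to the wrong cluster.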

Conclusion

In this work, we have presented a comprehensive review of the current tools in use for visualization of complex, multidimensional data sets. As medical research shifts increasingly toward a more data-driven approach, the need to comprehensively visualize multivariate data will continue to grow, especially in health-care research settings. We believe our work will serve a wide variety of investigators performing similar research. Thorough multidimensional visualization offers several benefits with potential implications for understanding disease and ultimately improving patient care. Translational research platforms in the clinical domain provide an ideal setting for a wide range of multidimensional visualization applications. In this work, we summarize the existing landscape of these types of tools as well as provide our input on points to consider in advancing their development.

Table 2.

Summary of visualization programs for multidimensional data that can be applied to user-provided datasets. For each tool reviewed, we evaluated a number of features organized by the various categories: Information content supported, visualization, data-exploration. PoC = Proof of Concept.

Category | Subcategory | Item | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Freely available | Commercial | Commercial | Commercial
General | General information | Name of the platform | cBioPortal | iGPSe | Igloo-Plot | tranSMART | G-DOC Plus | Data-cube-based model supporting heterogeneous data | Papilio | Caleydo Domino | Qlucore Omics Explorer | Oracle Health Sciences Translational Research Center | OmicsOffice® powered by TIBCO Spotfire
Information content supported | Clinical | Demographics | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Information content supported | Clinical | Diagnosis | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Information content supported | Clinical | Biology | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Information content supported | Clinical | Survival | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Information content supported | Clinical | Imaging | No | No | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes
Information content supported | Omics | Gene mutation | Yes | No | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
Information content supported | Omics | mRNA | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes
Information content supported | Omics | Other | Methylation, protein and phosphoprotein data | miRNA | NA | NA | NA | NA | NA | NA | Methylation, protein expression, flow cytometry | Methylation | RNA sequence, chromatin immunoprecipitation sequence, qPCR
Information content supported | Other | Any type of raw or processed data that corresponds in a one to one relation to a sample | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No
Visualization | High dimensional | Heatmap | Yes (through IGV) | Yes | Yes | Yes | Yes | No | No | Yes | Yes | Yes | Yes
Visualization | High dimensional | Correlation matrix | No | No | Yes | Yes | No | Yes | No | Yes | No | No | Yes
Visualization | High dimensional | Parallel coordinates | No | No | No | No | No | No | Yes | Yes | No | No | Yes
Visualization | High dimensional | Other | OncoPrinter | Parallel sets, silhouette plot, Sankey plot, force-directed graphs | NA | Waterfall plot, PCA plot, Haploview, Manhattan plot, Forest plot, Frequency plot for aCGH | Biological network and pathways viewers (Reactome, Cytoscape), integrated genome browser (JBrowse) | NA | Scatterplots color coded by patient type overlaid with PCA ellipses | Parallel sets, Sankey diagrams, and more novel graphics | Sample PCA, variable PCA | Requires business intelligence layer for visualization | Pathway viewer, 3D scatterplot, map chart, treemap
Visualization | Low dimensional | Timeline/line chart | No | No | No | Yes | No | No | Yes | No | Yes | Yes | Yes
Visualization | Low dimensional | Histograms | Yes | No | No | Yes | No | Yes | No | Yes | Yes | Yes | Yes
Visualization | Low dimensional | Scatterplots | Yes | No | No | Yes | No | Yes | Yes | Yes | Yes | No | Yes
Visualization | Low dimensional | Kaplan–Meier survival plot | Yes | Yes | No | Yes | Yes | No | No | Yes | Yes | No | No
Visualization | Low dimensional | Bar charts/box and whisker | Yes | No | No | Yes | Yes | No | No | Yes | Yes | Yes | Yes
Visualization | Low dimensional | Pie charts | No | Yes | No | Yes | No | No | No | Yes | No | Yes | Yes
Visualization | Low dimensional | Other | MutationMapper, volcano plot | NA | Novel semi-circle plotting approach based on correlation and Hooke's law | NA | Interactive 3D molecular viewer, chromosome and CNV visualizations, Venn diagram | Atlas view representing areas of brain implicated in analyses | NA | NA | NA | NA | Volcano plot
Visualization | Coordination | Linked views | No | Yes | No | No | No | Yes | Yes | Yes | Yes | Yes | Yes
Data-exploration | Statistics and data mining | Statistics | Survival log-rank test, Cytoscape graph viewer for genetic networks | Log-rank test, P value, k-means, spectral clustering and community detection | Class discovery within data | Logistic regression, correlation, t-test, χ², Fisher test, ANOVA, basic summary statistics, hierarchical clustering, k-means clustering | PCA, differential expression analysis, hierarchical clustering, group comparisons | Correlation statistics between radiology results and cognitive testing, multivariate statistics, multilinear regression, as well as any type of statistics provided calculated by R in future versions | Basic statistics such as finding differences in measures between two groups. Confidence-weighted principal component ellipses | NA | T-test, ANOVA, linear regression, quadratic regression, rank regression, classifier building and training: SVM, RT, kNN | Integrated with programming languages such as R for statistics beyond simple group counts | Line similarity, regression modeling, wide range of parametric and nonparametric statistical tests, functional gene analysis, data classification

ANOVA = analysis of variance.
