Venkata Satagopam, Wei Gu, Serge Eifes, Piotr Gawron, Marek Ostaszewski, Stephan Gebel, Adriano Barbosa-Silva, Rudi Balling, Reinhard Schneider.
Abstract
Translational medicine turns the results of basic life science research into new tools and methods for the clinical environment, for example, new diagnostics or therapies. Nowadays, the process of translation is supported by large amounts of heterogeneous data, ranging from medical data to a whole range of -omics data. This is both a great opportunity and a great challenge, as translational medicine big data is difficult to integrate and analyze, and requires the involvement of biomedical experts for the data processing. We show here that visualization and interoperable workflows, combining multiple complex steps, can address at least parts of the challenge. In this article, we present an integrated workflow for the exploration, analysis, and interpretation of translational medicine data in the context of human health. Three Web services (tranSMART, a Galaxy Server, and a MINERVA platform) are combined into one big data pipeline. Native visualization capabilities enable biomedical experts to get a comprehensive overview of, and control over, the separate steps of the workflow. The capabilities of tranSMART enable flexible filtering of multidimensional integrated data sets to create subsets suitable for downstream processing. A Galaxy Server offers visually aided construction of analytical pipelines, using existing or custom components. A MINERVA platform supports the exploration of health and disease-related mechanisms in a contextualized analytical visualization system. We demonstrate the utility of our workflow by illustrating its successive steps using an existing data set, for which we propose a filtering scheme, an analytical pipeline, and a corresponding visualization of analytical results. The workflow is available as a sandbox environment, where readers can work with the described setup themselves. Overall, our work shows how visualization and interfacing of big data processing services facilitate exploration, analysis, and interpretation of translational medicine data.
Keywords: big data analytics; big data infrastructure design; data acquisition and cleaning; data integration; data mining; disease map
Year: 2016 PMID: 27441714 PMCID: PMC4932659 DOI: 10.1089/big.2015.0057
Source DB: PubMed Journal: Big Data ISSN: 2167-6461 Impact factor: 2.128

A workflow for big data analytics in translational medicine. Clinical and “omics” data are integrated in the tranSMART database, allowing their exploration and the selection of relevant subsets for downstream analysis. The selected data set is automatically transferred to the Galaxy Server as a source for user-defined analytical pipelines. Finally, the results of the analysis are automatically transferred to an associated knowledge repository hosted on the MINERVA platform (here, the PD map) and displayed on the visualized molecular interaction networks. PD, Parkinson's disease.

Cohort/subset definition based on the variables displayed in the data tree. Two distinct subsets are defined based on the variables “disease state” and “gender.” The left panel shows the data tree in the tranSMART data set explorer, here for GEO study GSE7621 following curation and loading into tranSMART. The data leaves correspond to the low- and high-dimensional data variable names. GEO, Gene Expression Omnibus.
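
The subset definition described in this caption can be sketched in a few lines of Python. This is only an illustration of the filtering logic, not the tranSMART interface or API: the `Sample` record, the `select_subset` helper, the sample IDs, and the variable values ("PD", "control", "male", "female") are all hypothetical and are not taken from GSE7621.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical sample record with the two variables named in the caption."""
    sample_id: str
    disease_state: str  # assumed values: "PD" or "control"
    gender: str         # assumed values: "male" or "female"


def select_subset(samples, disease_state, gender):
    """Return the IDs of samples matching both variable values."""
    return [
        s.sample_id
        for s in samples
        if s.disease_state == disease_state and s.gender == gender
    ]


# Made-up records for illustration only.
samples = [
    Sample("S1", "PD", "male"),
    Sample("S2", "PD", "female"),
    Sample("S3", "control", "male"),
    Sample("S4", "control", "female"),
    Sample("S5", "PD", "male"),
]

# Two distinct subsets, each defined by a (disease state, gender) pair.
subset_1 = select_subset(samples, "PD", "male")
subset_2 = select_subset(samples, "control", "male")
```

Defining each subset as a conjunction of variable values keeps the two cohorts disjoint by construction whenever they differ in at least one value.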

Visually constructed data flow in the Galaxy Server comparing two cohorts from tranSMART.
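
As a rough idea of what a comparison of two cohorts can compute, the sketch below derives a per-gene log2 fold change. It is not the Galaxy pipeline from the article, whose individual steps are not listed here. It assumes that expression values are already on a log2 scale, so that a difference of cohort means is a log2 fold change. The function name, gene names, and values are made up.

```python
from statistics import mean


def log2_fold_changes(cohort_a, cohort_b):
    """Per-gene difference of mean log2 expression (cohort_a minus cohort_b).

    Each cohort maps a gene name to a list of log2 expression values,
    one per sample. Only genes present in both cohorts are compared.
    """
    shared = cohort_a.keys() & cohort_b.keys()
    return {
        gene: mean(cohort_a[gene]) - mean(cohort_b[gene])
        for gene in sorted(shared)
    }


# Made-up log2 expression values for illustration only.
patients = {"GENE_A": [8.0, 9.0], "GENE_B": [5.0, 5.0], "GENE_C": [7.0, 7.0]}
controls = {"GENE_A": [7.0, 8.0], "GENE_B": [6.5, 6.5], "GENE_C": [7.0, 7.0]}

changes = log2_fold_changes(patients, controls)
```

A positive value means higher mean expression in the first cohort. A real analysis would also need a statistical test and a multiple-testing correction, which this sketch omits.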

Data visualization and analysis using the PD map. (A) Differential gene expression data comparing postmortem brain tissues from male PD patients versus controls are displayed on the PD map (green, upregulated; red, downregulated). Pathways and processes of conspicuous areas (colored circle) could be identified using the pathway and compartment layout view of the PD map. Detailed views of deregulated genes that encode proteins involved in dopamine metabolism, secretion, and recycling (B), of the mitochondrial electron transport chain, in particular elements of complex I (C), and of microglia activation (D).
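
The color convention in this caption (green for upregulated, red for downregulated) can be expressed as a simple mapping from a log2 fold change to an overlay color. This is a sketch, not MINERVA's own overlay logic: the `overlay_color` function and the cutoff of 1.0 are assumptions made for illustration.

```python
def overlay_color(log2_fc, threshold=1.0):
    """Map a log2 fold change to a color name, or None if it is not highlighted.

    The threshold is a hypothetical cutoff; the caption does not state one.
    """
    if log2_fc >= threshold:
        return "green"  # upregulated
    if log2_fc <= -threshold:
        return "red"    # downregulated
    return None


# Made-up fold changes for illustration only.
colors = {
    gene: overlay_color(fc)
    for gene, fc in {"GENE_A": 1.0, "GENE_B": -1.5, "GENE_C": 0.0}.items()
}
```

Returning `None` for small changes leaves those elements uncolored, so only genes past the cutoff stand out on the map.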