Sascha Herzinger, Valentin Grouès, Wei Gu, Venkata Satagopam, Peter Banda, Christophe Trefois, Reinhard Schneider.
Abstract
Background: Translational research platforms share the aim of promoting a deeper understanding of stored data by providing visualization and analysis tools for data exploration and hypothesis generation. However, such tools are usually platform bound and are not easily reusable by other systems. Furthermore, they rarely address access restriction issues when direct data transfer is not permitted. In this article, we present an analytical service that works in tandem with a visualization library to address these problems.

Findings: Using a combination of existing technologies and a platform-specific data abstraction layer, we developed a service that is capable of providing existing web-based data warehouses and repositories with platform-independent visual analytical capabilities. The design of this service also allows for federated data analysis by eliminating the need to move the data directly to the researcher. Instead, all operations are based on statistics and interactive charts without direct access to the dataset.

Conclusions: The software presented in this article has the potential to help translational researchers achieve a better understanding of a given dataset and quickly generate new hypotheses. Furthermore, it provides a framework that can be used to share and reuse explorative analysis tools within the community.
Year: 2018 PMID: 30165440 PMCID: PMC6143733 DOI: 10.1093/gigascience/giy109
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Figure 1: The Fractalis stack. Shown is a schematic view of the three major components: the visual component, which resides in the web browser and interacts with the user; the central server component, which manages the application states and handles job distribution; and the remote server component, which can be deployed remotely and handles the majority of the application workload.
Figure 2: Fractalis in Ada. Shown is a self-hosted instance of Ada using Fractalis to display several statistics for the selected dataset. Notable is the native look of Fractalis within the existing UI, making the integration almost completely invisible to the user.
Figure 3: Fractalis pipeline demonstration. Shown are four Fractalis charts that show statistics based on The Cancer Genome Atlas (TCGA) colon adenocarcinoma (COAD) dataset. From left to right, top to bottom: a volcano plot using results of the R package DESeq2; an MA plot using results of the R package DESeq2; box plots and a one-way ANOVA group test; and a survival plot using the Kaplan-Meier estimator. The first three plots compare early-stage cancer with late-stage cancer. The last plot compares high read count of hsa-mir-1269a with low read count of hsa-mir-1269a.
Fractalis distributed pipeline benchmark
| Worker location | Ping, ms | 1st meas., ms | 2nd meas., ms | 3rd meas., ms | Avg., ms |
|---|---|---|---|---|---|
| Intranet | <1 | 92 | 84 | 88 | 88 |
| London, UK (Google Cloud) | 20 | 222 | 200 | 199 | 207 |
| South Carolina, USA (Google Cloud) | 102 | 794 | 794 | 789 | 792 |
The table shows the time elapsed between submitting an analysis and receiving the results. All results include the time needed to prepare the data for analysis, the computation of the correlation statistics, the sending of the results, and the latency/overhead introduced by the communication between the service components. The Ping column shows the base latency, measured by pinging the server from our location in Luxembourg.
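
The averages in the table can be recomputed from the three measurements, and the cost of a remote worker can be related to its ping. The Python sketch below is illustrative only: the function names are hypothetical, and the "ping equivalents" ratio is a derived figure that the source does not report. The intranet ping is given only as "<1 ms", so no ratio is computed for it.

```python
from statistics import mean

# Measurements copied from the benchmark table, in milliseconds.
measurements_ms = {
    "Intranet": [92, 84, 88],
    "London, UK (Google Cloud)": [222, 200, 199],
    "South Carolina, USA (Google Cloud)": [794, 794, 789],
}

# Ping values from the table. The intranet ping is reported as "<1",
# so it is left out rather than replaced by a guessed number.
ping_ms = {
    "London, UK (Google Cloud)": 20,
    "South Carolina, USA (Google Cloud)": 102,
}


def average_ms(samples):
    """Arithmetic mean of the samples, rounded to whole milliseconds."""
    return round(mean(samples))


averages = {loc: average_ms(s) for loc, s in measurements_ms.items()}


def added_latency_ms(location, baseline="Intranet"):
    """Average time added by a remote worker relative to the intranet worker."""
    return averages[location] - averages[baseline]


def ping_equivalents(location):
    """Added latency expressed as a multiple of the measured ping.

    This ratio is an assumption-laden reading of the table, not a
    figure stated by the authors.
    """
    return added_latency_ms(location) / ping_ms[location]


if __name__ == "__main__":
    for loc, avg in averages.items():
        print(f"{loc}: average {avg} ms")
    for loc in ping_ms:
        print(f"{loc}: +{added_latency_ms(loc)} ms, "
              f"{ping_equivalents(loc):.2f} ping equivalents")
```

Under these assumptions, the remote workers add 119 ms (London) and 704 ms (South Carolina) over the intranet average, roughly six to seven times the respective ping. This is in line with the table note that communication between the service components contributes to the measured time.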