
Collaborative Data Analytics with DataHub.

Anant Bhardwaj, David Karger, Harihar Subramanyam, Amol Deshpande, Sam Madden, Eugene Wu, Aaron Elmore, Aditya Parameswaran, Rebecca Zhang.

Abstract

While many solutions have been proposed for storing and analyzing large volumes of data, all of them offer limited support for collaborative data analytics, especially given that many individuals and teams are simultaneously analyzing, modifying, and exchanging datasets, employing a number of heterogeneous tools or languages for data analysis, and writing scripts to clean, preprocess, or query data. We demonstrate DataHub, a unified platform for loading, storing, querying, collaboratively analyzing, interactively visualizing, and sharing datasets, and for interfacing with external applications. We will demonstrate the following aspects of the DataHub platform: (a) flexible data storage, sharing, and native versioning capabilities: multiple conference attendees can concurrently update the database, browse the different versions, and inspect conflicts; (b) an app ecosystem that hosts apps for various data-processing activities: conference attendees will be able to effortlessly ingest, query, and visualize data using our existing apps; (c) Thrift-based data serialization that permits data analysis in any combination of 20+ languages, with DataHub as the common data store: conference attendees will be able to analyze datasets in R, Python, and MATLAB, while the inputs and the results remain stored in DataHub. In particular, conference attendees will be able to use the DataHub notebook, an IPython-based notebook for analyzing data and storing the results of data analysis.

Year:  2015        PMID: 26844007      PMCID: PMC4734646          DOI: 10.14778/2824032.2824100

Source DB:  PubMed          Journal:  Proceedings VLDB Endowment        ISSN: 2150-8097


  2 in total

1.  Privacy-aware sharing and collaborative analysis of personal wellness data: Process model, domain ontology, software system and user trial.

Authors:  Lauri Tuovinen; Alan F Smeaton
Journal:  PLoS One       Date:  2022-04-07       Impact factor: 3.240

2.  CWcollab: Presenting multimedia with web-based context-aware collaboration.

Authors:  Chunxu Tang; Beinan Wang; C Y Roger Chen; Huijun Wu
Journal:  Entertain Comput       Date:  2022-07-11