Edward S Dove, Yann Joly, Anne-Marie Tassé, Bartha M Knoppers.
Abstract
The biggest challenge in twenty-first century data-intensive genomic science is developing vast computer infrastructure and advanced software tools to perform comprehensive analyses of genomic data sets for biomedical research and clinical practice. Researchers are increasingly turning to cloud computing both as a solution to integrate data from genomics, systems biology and biomedical data mining and as an approach to analyze data to solve biomedical problems. Although cloud computing provides several benefits such as lower costs and greater efficiency, it also raises legal and ethical issues. In this article, we discuss three key 'points to consider' (data control; data security, confidentiality and transfer; and accountability) based on a preliminary review of several publicly available cloud service providers' Terms of Service. These 'points to consider' should be borne in mind by genomic research organizations when negotiating legal arrangements to store genomic data on a large commercial cloud service provider's servers. Diligent genomic cloud computing means leveraging security standards and evaluation processes to protect data, and it entails many of the same good practices that researchers should always consider in securing their local infrastructure.
Year: 2014 PMID: 25248396 PMCID: PMC4592072 DOI: 10.1038/ejhg.2014.196
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
Figure 1. Contrast between traditional bioinformatics workflow and new cloud-based workflow. The traditional bioinformatics workflow is characterized by researchers downloading or uploading genomic and health-related data to local on-site storage (eg, computers) for processing, analysis and obtaining results. The results are then uploaded to repositories for publishing. This process is typically slower and redundant, and it necessitates high IT capital expenditure. Indeed, the traditional practice of genome analysis requires researchers to spend weeks to months downloading hundreds of terabytes of data from a central repository before computations can begin. By contrast, the new cloud computing bioinformatics model eliminates the need for researchers to download the data to their own computers. Instead, it is characterized by a one-stop workflow where the compute (eg, standard and custom pipelines, workflow tools) is brought to the data. Particularly in Infrastructure as a Service cloud computing, researchers can upload their analytic software into a cloud, run the software, and download the compiled results in a secure fashion. Platform as a Service and Software as a Service cloud computing can also provide a one-stop bioinformatics workflow, albeit with fewer raw computing resources made available to researchers.