
Secure large-scale genome data storage and query.

Luyao Chen, Md Momin Aziz, Noman Mohammed, Xiaoqian Jiang.

Abstract

BACKGROUND AND OBJECTIVE: Cloud computing plays a vital role in big data science with its scalable and cost-efficient architecture. Large-scale genome data storage and computation would benefit from the latest cloud computing infrastructure, both to save cost and to speed up discoveries. However, due to privacy and security concerns, data owners are often disinclined to put sensitive data in a public cloud environment without enforcing some protective measures. An ideal solution is to develop a secure genome database that supports encrypted data deposition and query.
METHODS: Making such a system fast and scalable enough to handle real-world demands while also providing data security is nevertheless a challenging task. In this paper, we propose a novel mechanism to support secure count queries on an open-source graph database (Neo4j) and evaluate its performance on a real-world dataset of 735,317 Single Nucleotide Polymorphisms (SNPs). In particular, we propose a new tree indexing method that offers constant time complexity (proportional to the tree depth), addressing what was the bottleneck of existing approaches.
RESULTS: The proposed method significantly improves the runtime of query execution compared to existing techniques. It takes less than one minute to execute an arbitrary count query on a dataset of 212 GB, while the best-known algorithm takes around 7 min.
CONCLUSIONS: The outlined framework and experimental results show the applicability of a graph database for securely storing large-scale genome data in an untrusted environment. Furthermore, the underlying cryptosystem and security assumptions are well suited to such use cases and can be generalized in future work.
Copyright © 2018 Elsevier B.V. All rights reserved.
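
The abstract's idea of a tree index whose lookup cost depends on the tree depth can be illustrated with a small sketch. This is not the authors' scheme: it is a plain, unencrypted prefix tree, it answers only prefix-style count queries, and it does not use Neo4j. The names `Node`, `build_index`, `count_prefix`, and the toy genotype records are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Node:
    # Number of records whose genotype sequence passes through this node.
    count: int = 0
    # Child nodes keyed by the genotype at the next SNP position.
    children: Dict[str, "Node"] = field(default_factory=dict)


def build_index(records: Iterable[List[str]]) -> Node:
    """Build a prefix tree over genotype sequences, storing a count per node."""
    root = Node()
    for genotypes in records:
        node = root
        node.count += 1  # every record passes through the root
        for g in genotypes:
            node = node.children.setdefault(g, Node())
            node.count += 1
    return root


def count_prefix(root: Node, prefix: List[str]) -> int:
    """Count records starting with `prefix`; visits at most len(prefix) nodes."""
    node = root
    for g in prefix:
        node = node.children.get(g)
        if node is None:
            return 0  # no record has this prefix
    return node.count


# Toy data (assumed): each record lists genotypes at three SNP positions.
records = [
    ["AG", "CT", "TT"],
    ["AG", "CT", "CC"],
    ["AG", "CC", "TT"],
    ["GG", "CT", "TT"],
]
index = build_index(records)
print(count_prefix(index, ["AG", "CT"]))
```

Because a count is stored at every node, a query walks one root-to-node path and never scans the records, so its cost is bounded by the tree depth rather than by the number of stored genomes. A secure variant would additionally have to protect the node labels and counts, which this sketch leaves out.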

Keywords:  Genome data storage Neo4j; Graph database; Homomorphic encryption; Secure computation on genome data; Secure genome data storage

Year: 2018; PMID: 30337067; PMCID: PMC6196742; DOI: 10.1016/j.cmpb.2018.08.007

Source DB: PubMed; Journal: Comput Methods Programs Biomed; ISSN: 0169-2607; Impact factor: 5.428


