Jan Henrik Ziegeldorf, Jan Pennekamp, David Hellmanns, Felix Schwinger, Ike Kunze, Martin Henze, Jens Hiller, Roman Matzutt, Klaus Wehrle.
Abstract
BACKGROUND: Whole genome sequencing has become fast, accurate, and cheap, paving the way towards the large-scale collection and processing of human genome data. Unfortunately, this dawning genome era not only promises tremendous advances in biomedical research but also creates unprecedented privacy risks for the many. Storing and processing large genome datasets through cloud services greatly aggravates these concerns. Current research efforts thus investigate the use of strong cryptographic methods and protocols to implement privacy-preserving genomic computations.
Keywords: Bloom filters; Homomorphic encryption; Secure outsourcing
Year: 2017 PMID: 28786361 PMCID: PMC5547447 DOI: 10.1186/s12920-017-0277-y
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 The problem scenario of Track 3 in the iDASH competition: A researcher aims to securely outsource expensive genome analysis to the cloud using homomorphic encryption
Fig. 2 General overview of our approach: The data owner holds a database with genomes of Patients P which she encodes row-wise as Bloom filters (Step 1) and encrypts and uploads them to the cloud (Step 2) in the preprocessing phase. In the online phase, the data owner encodes, encrypts (only in FHE-BLOOM), and uploads her query. In the encrypted domain, the cloud matches the query to each database record (Step 3) and aggregates the results (Step 4) without ever learning the data in clear by utilizing the homomorphic properties of the chosen encryption scheme. The final results are returned to the data owner, who decrypts (Step 5) and postprocesses the results (Step 6) to obtain a list of patients that match her query
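The encoding and matching steps of the overview can be illustrated in plaintext. The following is a minimal sketch that leaves out all encryption (Steps 2 and 5) and is not the authors' implementation: the hash construction, the parameters `l` and `k`, the SNP string format, and the rule "a record matches if every set query bit is also set in the record" are assumptions made for illustration only.

```python
import hashlib


def bloom_positions(item: str, l: int, k: int) -> list[int]:
    """Derive k bit positions for an item using salted SHA-256 (illustrative choice)."""
    return [
        int.from_bytes(hashlib.sha256(f"{i}:{item}".encode()).digest(), "big") % l
        for i in range(k)
    ]


def encode(snps, l: int, k: int) -> list[int]:
    """Step 1 (assumed form): encode one patient's SNPs as a Bloom filter bit list."""
    bits = [0] * l
    for snp in snps:
        for pos in bloom_positions(snp, l, k):
            bits[pos] = 1
    return bits


def match_score(record_bits, query_bits) -> int:
    """Step 3 (assumed form): count the query bits that are also set in the record.

    An inner product uses only additions and multiplications, which is the kind
    of operation a homomorphic scheme could evaluate on encrypted data.
    """
    return sum(r * q for r, q in zip(record_bits, query_bits))


def matching_patients(database: dict, query_snps, l: int = 1024, k: int = 7) -> list:
    """Steps 1, 3, 4 and 6 in the clear: return IDs of patients matching the query."""
    query_bits = encode(query_snps, l, k)
    target = sum(query_bits)  # a full match sets every query bit
    return [
        pid
        for pid, snps in database.items()
        if match_score(encode(snps, l, k), query_bits) == target
    ]


# Hypothetical toy database; the SNP identifiers are made up.
database = {"P1": ["rs1:A", "rs2:G", "rs3:T"], "P2": ["rs1:A", "rs2:C"]}
result = matching_patients(database, ["rs1:A", "rs2:G"])
```

Because Bloom filters have no false negatives, `P1` is always reported here. False positives remain possible and become less likely as the filter size `l` grows.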
Complexity analysis of FHE-BLOOM and PHE-BLOOM: Setup overheads are similar in both approaches and grow linearly in the number of patients n and the Bloom filter size l, which is proportional to the number of SNPs m, i.e., $l = -m \log(p) / (\log 2)^2$
| Approach | DB setup (Client) Time | DB setup (Client) Comm. | Query (Cloud) Time | Query (Client) Time | Query (Client) Comm. |
|---|---|---|---|---|---|
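The sizing formula from the caption above can be evaluated directly. The sketch below assumes that log denotes the natural logarithm and that p is the target false positive probability, as in the standard Bloom filter sizing result. The function names are hypothetical, and the hash count k = (l/m) ln 2 is the textbook optimum rather than a value stated in this record.

```python
import math


def bloom_filter_size(m: int, p: float) -> int:
    """Bits l needed for m items at false positive probability p: l = -m ln(p) / (ln 2)^2."""
    if m <= 0 or not 0.0 < p < 1.0:
        raise ValueError("m must be positive and p must lie strictly between 0 and 1")
    return math.ceil(-m * math.log(p) / math.log(2) ** 2)


def optimal_hash_count(m: int, l: int) -> int:
    """Textbook optimal number of hash functions, k = (l / m) ln 2 (an assumption here)."""
    return max(1, round(l / m * math.log(2)))


size_10k = bloom_filter_size(10_000, 0.01)
size_100k = bloom_filter_size(100_000, 0.01)
hashes = optimal_hash_count(10_000, size_10k)
```

Increasing m by a factor of ten increases l by the same factor, which matches the linear growth in m described above. Halving p adds a fixed number of bits per item, so l grows linearly as p decreases exponentially.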
Competition benchmarks and test cases: i) database setup (preparing, encrypting, and uploading the database), ii) query processing in the cloud (matching query and database in the encrypted domain), iii) query overheads on the client (pre- and postprocessing the query, including overheads for up- and download), and iv) total query overheads
| Setting | n | m | DB Setup (Client) Time | DB Setup (Client) Mem. | DB Setup (Client) Comm. | Query (Cloud) Time | Query (Cloud) Mem. | Query (Client) Time | Query (Client) Mem. | Query (Total) Time | Query (Total) Comm. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| FHE-BLOOM | | | | | | | | | | | |
| Test 1 | 1 | 10000 | 5.73 | 91.78 | 26.81 | 3.258 | 86.95 | 7.532 | 91.12 | 10.790 | 27.10 |
| Test 2 | 1 | 100000 | 35.24 | 105.77 | 265.44 | 21.075 | 86.81 | 32.938 | 105.65 | 54.013 | 265.74 |
| Test 3 | 50 | 100000 | 1452.78 | 157.43 | 13264.54 | 273.922 | 92.98 | 34.385 | 105.53 | 308.307 | 287.71 |
| PHE-BLOOM | | | | | | | | | | | |
| Test 1 | 1 | 10000 | 76.77 | 126.06 | 53.29 | 0.008 | 240.40 | 0.001 | 126.06 | 0.009 | 0.03 |
| Test 2 | 1 | 100000 | 752.61 | 128.24 | 533.00 | 0.073 | 2081.71 | 0.002 | 128.24 | 0.075 | 0.25 |
| Test 3 | 50 | 100000 | 822.76 | 143.85 | 533.03 | 0.073 | 2081.70 | 0.002 | 143.85 | 0.075 | 0.25 |
Time is measured in seconds; memory and communication are measured in MB
Fig. 3 Query time of FHE-BLOOM grows linearly in the number of patients n
Fig. 4 Query time of FHE-BLOOM grows linearly in the number of SNPs m (note the non-linear x-axis)
Fig. 5 Query time of FHE-BLOOM grows linearly with exponentially decreasing p (note the logarithmic x-axis)
Fig. 6 Query time of PHE-BLOOM grows linearly in ⌈n/s⌉
Fig. 7 Query time of PHE-BLOOM increases linearly in the maximum number of SNPs m (note the non-linear x-axis)
Fig. 8 Query time of PHE-BLOOM grows linearly with exponentially decreasing p (note the logarithmic x-axis)