Hao Chen, Ran Gilad-Bachrach, Kyoohyung Han, Zhicong Huang, Amir Jalali, Kim Laine, Kristin Lauter.
Abstract
BACKGROUND: One of the tasks in the 2017 iDASH secure genome analysis competition was to enable training of logistic regression models over encrypted genomic data. More precisely, given a list of approximately 1500 patient records, each with 18 binary features containing information on specific mutations, the idea was for the data holder to encrypt the records using homomorphic encryption and send them to an untrusted cloud for storage. The cloud could then homomorphically apply a training algorithm to the encrypted data to obtain an encrypted logistic regression model, which could be sent back to the data holder for decryption. In this way, the data holder could outsource the training process without revealing either her sensitive data or the trained model to the cloud.
Keywords: Cryptography; Homomorphic encryption; Logistic regression
Year: 2018 PMID: 30309350 PMCID: PMC6180402 DOI: 10.1186/s12920-018-0397-z
Source DB: PubMed Journal: BMC Med Genomics ISSN: 1755-8794 Impact factor: 3.063
Fig. 1 Linear minimax approximation of the sigmoid: $f(x) = 0.5 + 0.125x$
Fig. 2 Degree-3 minimax approximation of the sigmoid: $f(x) = 0.5 + 0.197x - 0.004x^3$
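To make the two captions concrete, the following minimal sketch evaluates both polynomials against the exact sigmoid. The interval [-5, 5], the grid size, and all function names are assumptions chosen for illustration; the captions do not state the interval on which the approximations were fitted.

```python
import math


def sigmoid(x):
    """Exact logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_deg1(x):
    """Linear approximation from the Fig. 1 caption."""
    return 0.5 + 0.125 * x


def sigmoid_deg3(x):
    """Degree-3 approximation from the Fig. 2 caption."""
    return 0.5 + 0.197 * x - 0.004 * x ** 3


def max_abs_error(approx, lo=-5.0, hi=5.0, steps=1001):
    """Largest |approx(x) - sigmoid(x)| on an evenly spaced grid.

    The interval [lo, hi] is an assumption, not taken from the source.
    """
    worst = 0.0
    for i in range(steps):
        x = lo + (hi - lo) * i / (steps - 1)
        worst = max(worst, abs(approx(x) - sigmoid(x)))
    return worst


err_deg1 = max_abs_error(sigmoid_deg1)
err_deg3 = max_abs_error(sigmoid_deg3)
print(f"max error, degree 1: {err_deg1:.4f}")
print(f"max error, degree 3: {err_deg3:.4f}")
```

Both polynomials use only additions and multiplications, which is what makes them candidates for evaluation under homomorphic encryption; the sketch only compares their accuracy in the clear.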
Running 10-fold cross-validation on the iDASH dataset with 1579 samples and 18 selected genotypes
| Training method | # iterations | Avg. training time | Avg. AUC (encrypted) | Avg. AUC (unencrypted) |
|---|---|---|---|---|
| GD + | 36 | 115.33 h | 0.690 | 0.690 |
| 1-Bit GD + | 36 | 14.90 h | 0.668 | 0.690 |
The first average AUC value is obtained by running the training algorithm on encrypted data using SEAL. The second AUC value is obtained by running the same algorithm on unencrypted data using MATLAB.
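As a rough illustration of what an unencrypted baseline of this kind might look like, here is a minimal sketch of batch gradient descent for logistic regression in which the sigmoid is replaced by the degree-3 polynomial $0.5 + 0.197x - 0.004x^3$. It is not the authors' code: the synthetic data, the learning rate, the bias term, and every function name are assumptions, and the sketch does not reproduce the 1-bit variant or the cross-validation protocol reported in the table.

```python
import random


def sigmoid_deg3(x):
    """Degree-3 polynomial stand-in for the sigmoid."""
    return 0.5 + 0.197 * x - 0.004 * x ** 3


def train_gd(X, y, iterations=36, lr=0.1):
    """Batch gradient descent; returns [bias, w_1, ..., w_d].

    lr is a hypothetical learning rate, not a value from the source.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(iterations):
        grad = [0.0] * (d + 1)
        for row, label in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], row))
            err = sigmoid_deg3(z) - label
            grad[0] += err
            for j, xj in enumerate(row):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in data: 300 records with 18 binary features each.
random.seed(0)
n_samples, n_features = 300, 18
true_w = [random.uniform(-1, 1) for _ in range(n_features)]
X = [[random.randint(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
threshold = sum(true_w) / 2
y = [1 if sum(w * x for w, x in zip(true_w, row)) > threshold else 0 for row in X]

weights = train_gd(X, y)
scores = [weights[0] + sum(w * x for w, x in zip(weights[1:], row)) for row in X]
train_auc = auc(scores, y)
print(f"training AUC on synthetic data: {train_auc:.3f}")
```

The AUC here is computed on the training data itself, purely to check that the update rule moves the weights in a useful direction; it is not comparable to the cross-validated values in the table.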
Running 10-fold cross-validation on compressed MNIST dataset with 1500 samples and 196 features
| Training method | # iterations | Avg. training time | Avg. AUC (encrypted) | Avg. AUC (unencrypted) |
|---|---|---|---|---|
| GD + | 10 | 48.76 h | 0.974 | 0.977 |
| 1-Bit GD + | 10 | 27.10 h | 0.974 | 0.978 |
The first average AUC value is obtained by running the training algorithm on encrypted data using SEAL. The second AUC value is obtained by running the same algorithm on unencrypted data using MATLAB.