Miran Kim, Yongsoo Song, Shuang Wang, Yuhou Xia, Xiaoqian Jiang.
Abstract
BACKGROUND: Learning a model without accessing raw data has intrigued security and machine learning researchers for years. In an ideal setting, sensitive data would be encrypted, stored on a commercial cloud, and analyzed without ever being decrypted, which preserves privacy. Homomorphic encryption is a promising candidate for secure data outsourcing, but supporting real-world machine learning tasks with it remains very challenging. Existing frameworks can handle only simplified cases with low-degree polynomials, such as the linear means classifier and linear discriminative analysis.
Keywords: gradient descent; homomorphic encryption; logistic regression; machine learning
Year: 2018 PMID: 29666041 PMCID: PMC5930176 DOI: 10.2196/medinform.8805
Source DB: PubMed Journal: JMIR Med Inform
Figure 1. Two secure models: (a) secure storage and computation outsourcing and (b) secure model outsourcing.
Research works in secure analysis.
| Reference | Problem | Techniques |
| Graepel et al | Linear means classifier/discriminative analysis | Homomorphic encryption |
| Bos et al | Prediction using learned logistic regression model | Homomorphic encryption |
| Dowlin et al | Prediction using learned neural networks | Homomorphic encryption |
| Aono et al | Logistic regression | Additive homomorphic encryption |
| Mohassel et al | Logistic regression | Multiparty computation |
| This work | Logistic regression | Homomorphic encryption |
Figure 2. Graphs of (a) sigmoid function and Taylor polynomials and (b) sigmoid function and least squares approximations.
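The contrast drawn in Figure 2 can be reproduced with a short plaintext sketch. It compares the degree-3 Taylor polynomial of the sigmoid with least squares polynomial fits of degree 3 and 7. The fitting interval [-8, 8], the sampling grid, and all function names are assumptions made for illustration and are not taken from the source; the fit is a discrete least squares fit on sampled points, not necessarily the authors' procedure.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def taylor3(x):
    # Degree-3 Taylor polynomial of the sigmoid around x = 0.
    return 0.5 + x / 4.0 - x ** 3 / 48.0


def solve(a, b):
    # Solve a * x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def least_squares_coeffs(degree, radius=8.0, samples=2001):
    # Fit in the scaled variable t = x / radius (assumed interval [-radius, radius])
    # so that the normal equations stay well conditioned.
    ts = [-1.0 + 2.0 * i / (samples - 1) for i in range(samples)]
    ys = [sigmoid(radius * t) for t in ts]
    a = [[sum(t ** (i + j) for t in ts) for j in range(degree + 1)]
         for i in range(degree + 1)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(degree + 1)]
    return solve(a, b)


def make_poly(coeffs, radius=8.0):
    # Return a function x -> sum_k coeffs[k] * (x / radius)^k (Horner's rule).
    def poly(x):
        t = x / radius
        result = 0.0
        for c in reversed(coeffs):
            result = result * t + c
        return result
    return poly


def max_abs_error(approx, radius=8.0, samples=801):
    xs = [-radius + 2.0 * radius * i / (samples - 1) for i in range(samples)]
    return max(abs(approx(x) - sigmoid(x)) for x in xs)


coeffs3 = least_squares_coeffs(3)
coeffs7 = least_squares_coeffs(7)
ls3 = make_poly(coeffs3)
ls7 = make_poly(coeffs7)

print("max error, Taylor degree 3:       ", max_abs_error(taylor3))
print("max error, least squares degree 3:", max_abs_error(ls3))
print("max error, least squares degree 7:", max_abs_error(ls7))
```

The Taylor polynomial is accurate only near zero and diverges toward the ends of the interval, whereas the least squares fits spread their error across the whole interval. This is the usual reason to prefer a least squares approximation when the inputs to the sigmoid are not guaranteed to be small.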
Description of datasets.
| Dataset | Number of observations | Number of features |
| Edinburgh Myocardial Infarction | 1253 | 10 |
| Low Birth Weight Study | 189 | 10 |
| Nhanes III | 15,649 | 16 |
| Prostate Cancer Study | 379 | 10 |
| Umaru Impact Study | 575 | 9 |
Experimental results of our homomorphic encryption-based logistic regression algorithm.
| Dataset | Degree of sigmoid approximation | Encryption (sec) | Evaluation (min) | Decryption (sec) | Storage (GB) |
| Edinburgh Myocardial Infarction | 3 | 12 | 131 | 6.3 | 0.69 |
| Edinburgh Myocardial Infarction | 7 | 12 | 116 | 6.0 | 0.71 |
| Low Birth Weight Study | 3 | 11 | 101 | 4.9 | 0.67 |
| Low Birth Weight Study | 7 | 11 | 100 | 4.5 | 0.70 |
| Nhanes III | 3 | 21 | 265 | 12 | 1.15 |
| Nhanes III | 7 | 21 | 240 | 13 | 1.17 |
| Prostate Cancer Study | 3 | 11 | 119 | 4.4 | 0.68 |
| Prostate Cancer Study | 7 | 11 | 100 | 4.5 | 0.70 |
| Umaru Impact Study | 3 | 10 | 109 | 5.1 | 0.61 |
| Umaru Impact Study | 7 | 10 | 94 | 4.3 | 0.63 |
Figure 3. Average AUC of encrypted logistic regression. FPR: false positive rate; TPR: true positive rate.
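As a reminder of what the AUC summarizes, the following minimal sketch computes it directly from labels and predicted scores using the standard rank-based definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name and the toy data are illustrative assumptions, not values from the study.

```python
def auc(labels, scores):
    # Rank-based AUC: fraction of (positive, negative) pairs in which the
    # positive example is scored higher; ties contribute 0.5.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical data): three of the four positive/negative
# pairs are ordered correctly.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(toy_labels, toy_scores))
```

The pairwise loop is quadratic in the number of observations, which is fine for datasets of the sizes listed here; a sort-based version would be preferable for much larger data.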
Comparison of encrypted/unencrypted logistic regression. AUC: area under the receiver operating characteristic curve; MSE: mean squared error; NMSE: normalized mean squared error.
| Dataset | Iteration number | Degree of sigmoid approximation | Encrypted accuracy | Encrypted AUC | Unencrypted accuracy | Unencrypted AUC | MSE | NMSE |
| Edinburgh Myocardial Infarction | 25 | 3 | 86.03% | 0.956 | 88.43% | 0.956 | 0.0259 | 0.0261 |
| Edinburgh Myocardial Infarction | 20 | 7 | 86.19% | 0.954 | 86.19% | 0.954 | 0.0007 | 0.0012 |
| Low Birth Weight Study | 25 | 3 | 69.30% | 0.665 | 68.25% | 0.668 | 0.0083 | 0.0698 |
| Low Birth Weight Study | 20 | 7 | 69.29% | 0.678 | 69.29% | 0.678 | 0.0003 | 0.0049 |
| Nhanes III | 25 | 3 | 79.23% | 0.732 | 79.26% | 0.751 | 0.0033 | 0.0269 |
| Nhanes III | 20 | 7 | 79.23% | 0.737 | 79.23% | 0.737 | 0.0002 | 0.0034 |
| Prostate Cancer Study | 25 | 3 | 68.85% | 0.742 | 68.86% | 0.750 | 0.0085 | 0.0449 |
| Prostate Cancer Study | 20 | 7 | 69.12% | 0.750 | 69.12% | 0.752 | 0.0002 | 0.0018 |
| Umaru Impact Study | 25 | 3 | 74.43% | 0.585 | 74.43% | 0.587 | 0.0074 | 0.0829 |
| Umaru Impact Study | 20 | 7 | 75.43% | 0.617 | 74.43% | 0.619 | 0.0004 | 0.0077 |
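The MSE and NMSE columns compare the model learned under encryption with the model learned in the clear. A minimal sketch of how such metrics could be computed from two coefficient vectors follows. The definitions used here are assumptions, since this record does not spell them out: MSE is taken as the mean squared difference between the two coefficient vectors, and NMSE as that MSE divided by the mean of the squared unencrypted coefficients. The function names and sample vectors are hypothetical.

```python
def mse(encrypted_coeffs, plain_coeffs):
    # Mean squared difference between the two coefficient vectors.
    if len(encrypted_coeffs) != len(plain_coeffs):
        raise ValueError("coefficient vectors must have the same length")
    d = len(plain_coeffs)
    return sum((e - p) ** 2 for e, p in zip(encrypted_coeffs, plain_coeffs)) / d


def nmse(encrypted_coeffs, plain_coeffs):
    # Assumed normalization: divide by the mean square of the
    # unencrypted (reference) coefficients.
    d = len(plain_coeffs)
    reference_power = sum(p * p for p in plain_coeffs) / d
    return mse(encrypted_coeffs, plain_coeffs) / reference_power


# Hypothetical coefficient vectors, for illustration only.
beta_encrypted = [0.5, -1.0, 2.0]
beta_plain = [0.5, -1.0, 1.0]
print(mse(beta_encrypted, beta_plain), nmse(beta_encrypted, beta_plain))
```

Under these definitions, NMSE is scale free, so it allows comparison across datasets whose coefficients differ in magnitude, while MSE reports the raw discrepancy.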