Alexander Ziller, Dmitrii Usynin, Rickmer Braren, Marcus Makowski, Daniel Rueckert, Georgios Kaissis.
Abstract
The successful training of deep learning models for diagnostic deployment in medical imaging applications requires large volumes of data. Such data cannot be procured without consideration for patient privacy, mandated both by legal regulations and by the ethical requirements of the medical profession. Differential privacy (DP) enables the provision of information-theoretic privacy guarantees to patients and can be implemented in deep neural network training through the differentially private stochastic gradient descent (DP-SGD) algorithm. We here present deepee, a free and open-source framework for differentially private deep learning for use with the PyTorch deep learning framework. Our framework is based on parallelised execution of neural network operations to obtain and modify the per-sample gradients. The process is abstracted via a data structure that maintains shared memory references to the neural network weights, preserving memory efficiency. We furthermore offer specialised data loading procedures and privacy budget accounting based on the Gaussian Differential Privacy framework, as well as automated modification of user-supplied neural network architectures to ensure DP-conformity of their layers. We benchmark our framework's computational performance against other open-source DP frameworks and evaluate it on the paediatric pneumonia dataset (an image classification task) and on the Medical Segmentation Decathlon Liver dataset (a medical image segmentation task). We find that neural network training with rigorous privacy guarantees is possible while maintaining acceptable classification performance and excellent segmentation performance. Our framework compares favourably to related work with respect to memory consumption and computational performance. Our work presents an open-source software framework for differentially private deep learning, which we demonstrate in medical imaging analysis tasks.
It serves to further the use of privacy-enhancing techniques in medicine and beyond, assisting researchers and practitioners in addressing the numerous outstanding challenges on the way to their widespread implementation.
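The core DP-SGD step described in the abstract clips each per-sample gradient to a fixed norm, sums the clipped gradients, and adds calibrated Gaussian noise before averaging. The sketch below illustrates this aggregation on plain Python lists; it is not deepee's API, and the function name `dp_sgd_step` and its parameters are hypothetical, chosen only to mirror the standard algorithm.

```python
import math
import random

def dp_sgd_step(per_sample_grads, clip_norm, noise_multiplier, rng):
    """One illustrative DP-SGD aggregation step (not deepee's actual API):
    clip each per-sample gradient to clip_norm in L2 norm, sum them,
    add Gaussian noise scaled to the clipping bound, and average."""
    batch = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    summed = [0.0] * dim
    for g in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in g))
        # Scale down gradients whose norm exceeds clip_norm; leave others intact.
        scale = min(1.0, clip_norm / (norm + 1e-12))
        for i in range(dim):
            summed[i] += g[i] * scale
    # Noise standard deviation is proportional to the per-sample sensitivity.
    sigma = noise_multiplier * clip_norm
    return [(s + rng.gauss(0.0, sigma)) / batch for s in summed]

rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4]]  # first gradient has L2 norm 5, second 0.5
noisy_avg = dp_sgd_step(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)
# with noise_multiplier=0 this reduces to averaging the clipped gradients
```

In a real training loop the per-sample gradients come from the backward pass; deepee obtains them via parallelised execution over the batch, which is the expensive part this toy sketch omits.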
Year: 2021 PMID: 34188157 PMCID: PMC8242021 DOI: 10.1038/s41598-021-93030-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
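The abstract mentions privacy accounting based on the Gaussian Differential Privacy framework. A minimal sketch of that accounting, assuming the central-limit-theorem approximation of Bu et al. for subsampled DP-SGD and the (ε, δ) duality of Dong et al., can be written with only the standard library; the function names here are illustrative, not deepee's API.

```python
import math

def Phi(x):
    """Standard normal CDF, computed via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def delta_for_epsilon(mu, eps):
    """delta(eps) achieved by a mu-GDP mechanism (Dong et al. duality):
    delta = Phi(-eps/mu + mu/2) - exp(eps) * Phi(-eps/mu - mu/2)."""
    return Phi(-eps / mu + mu / 2) - math.exp(eps) * Phi(-eps / mu - mu / 2)

def mu_clt(sample_rate, steps, noise_multiplier):
    """CLT approximation of the total GDP parameter for DP-SGD
    (Bu et al.): mu = p * sqrt(T * (exp(1/sigma^2) - 1))."""
    return sample_rate * math.sqrt(
        steps * (math.exp(1.0 / noise_multiplier ** 2) - 1.0)
    )

# Hypothetical hyperparameters for illustration only.
mu = mu_clt(sample_rate=0.01, steps=1000, noise_multiplier=1.1)
delta = delta_for_epsilon(mu, eps=1.0)
```

Given a target δ, the corresponding ε can be found by numerically inverting `delta_for_epsilon`, e.g. by bisection, which is how GDP budgets are typically reported as (ε, δ) pairs.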
Classification performance (measured as mean receiver operating characteristic area under the curve (ROC-AUC)) on the paediatric chest radiography binary classification dataset.
| Model | ROC-AUC | GDP | RDP |
|---|---|---|---|
| Non-private | 0.960 [0.946 to 0.971] | – | – |
| Private | 0.848 [0.814 to 0.881] | 0.52 | 0.64 |
| Private (relaxed) | 0.882 [0.868 to 0.899] | 2.69 | 2.81 |
Ranges in square brackets. The non-private model significantly outperformed the private model in both the strict and the relaxed privacy setting, while the private model trained with relaxed privacy guarantees significantly outperformed the private model with strict guarantees.
Segmentation performance (measured by the mean Dice coefficient) on the liver semantic segmentation dataset.
| Model | Dice coefficient | GDP | RDP |
|---|---|---|---|
| Non-private | 0.950 [0.948 to 0.951] | – | – |
| Private | 0.943 [0.941 to 0.945] | 0.12 | 0.35 |
Ranges in square brackets. The privately trained and the non-privately trained models performed on par despite the provision of stringent privacy guarantees in the privately trained setting.
Computational performance (median time to process 25 batches of 32 examples, in seconds, over 5 repetitions) and mean peak memory consumption (one batch of 32 examples, in MiB, over 6 repetitions) of the compared frameworks for the classification and segmentation benchmarks.
| Task | | Opacus | |
|---|---|---|---|
| Classification (time) | 38.82 s [38.67 to 39.08] | 16.39 s [16.29 to 16.69] | 73.11 s [72.41 to 75.40] |
| Classification (memory) | 6366 MiB [6201 to 6448] | 7014 MiB [6816 to 7213] | 2044 MiB [1992 to 2102] |
| Segmentation (time) | 70.89 s [70.41 to 71.01] | 78.47 s [78.08 to 79.86] | 97.89 s [97.26 to 99.16] |
| Segmentation (memory) | 9770 MiB [9508 to 9829] | 9909 MiB [9812 to 10112] | 2085 MiB [1890 to 2205] |
| Segmentation, Transposed Conv. (time) | 47.27 s [45.12 to 51.15] | – | 64.68 s [62.76 to 66.32] |
| Segmentation, Transposed Conv. (memory) | 12014 MiB [11598 to 12249] | – | 1537 MiB [1399 to 1620] |
Ranges in square brackets. The Segmentation (Transposed Conv.) rows show framework performance on a U-Net architecture using transposed convolutions; Opacus is incompatible with this layer type.