Shuang Wang, Xiaoqian Jiang, Haixu Tang, Xiaofeng Wang, Diyue Bu, Knox Carey, Stephanie Om Dyke, Dov Fox, Chao Jiang, Kristin Lauter, Bradley Malin, Heidi Sofia, Amalio Telenti, Lei Wang, Wenhao Wang, Lucila Ohno-Machado.
Abstract
The human genome can reveal sensitive information and is potentially re-identifiable, which raises privacy and security concerns about sharing such data on a wide scale. In 2016, we organized the third Critical Assessment of Data Privacy and Protection competition as a community effort to bring together biomedical informaticists, computer privacy and security researchers, and scholars in ethical, legal, and social implications (ELSI) to assess the latest advances in privacy-preserving techniques for protecting human genomic data. Teams were asked to develop novel protection methods for emerging genome privacy challenges in three scenarios: Track (1) data sharing through the Beacon service of the Global Alliance for Genomics and Health; Track (2) collaborative discovery of similar genomes between two institutions; and Track (3) data outsourcing to public cloud services. The latter two tracks represent continuing themes from our 2015 competition, while the first was new and a response to a recently established vulnerability. The winning strategy for Track 1 mitigated the privacy risk by hiding approximately 11% of the variation in the database while permitting around 160,000 queries, a significant improvement over the baseline. The winning strategies in Tracks 2 and 3 showed significant progress over the previous competition by achieving multiple orders of magnitude performance improvement in terms of computational runtime and memory requirements. The outcomes suggest that applying highly optimized privacy-preserving and secure computation techniques to safeguard genomic data sharing and analysis is useful. However, the results also indicate that further efforts are needed to refine these techniques into practical solutions.
Year: 2017 PMID: 29263842 PMCID: PMC5677972 DOI: 10.1038/s41525-017-0036-1
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Fig. 1 Performance of Track 1 in terms of detection power vs. the number of Beacon queries for the top two entries: Vanderbilt University (center) and University of Manitoba (right), as well as our baseline (left). The error rate is defined as the number of correct responses over the total number of queries issued by a malicious user
Results for competition Track 2 (secure collaboration), where “Accuracy@k” is defined as the average number of correctly identified top k results over 5 runs using databases with 500 patient records
| Team | Run time (s), Top 1 | Run time (s), Top 3 | Run time (s), Top 5 | Accuracy@k, Top 1 | Accuracy@k, Top 3 | Accuracy@k, Top 5 |
|---|---|---|---|---|---|---|
| IBM Team 1 | | | | | | 4 |
| Indiana University at Bloomington | 209.03 | 273.14 | 337.79 | | | 4 |
| University of Manitoba | 22.65 | 22.99 | 22.88 | 0 | 2 | 2 |
| Cybernetica AS | 80.97 | 67.47 | 64.64 | | 1 | 1 |
| University of Maryland | 12.93 | 21 | 30.4 | | 0.67 | 2.3 |
| RWTH Aachen University | 5700 | >6300 | >6300 | | | |
The bold values indicate the best performance among teams
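As a rough illustration of the Accuracy@k definition in the Track 2 caption, the sketch below counts how many of the true top k matches a method returns in each run and averages that count over the runs. The function name `accuracy_at_k`, the patient identifiers, and the sample runs are hypothetical; they are not taken from the competition data or its evaluation scripts.

```python
from typing import Sequence


def accuracy_at_k(runs: Sequence[Sequence[str]], truth: Sequence[str], k: int) -> float:
    """Average count of correctly identified top-k results over all runs.

    Assumption: `truth` is the ground-truth ranking of the most similar records,
    and each entry of `runs` is the ranking a method returned in one run.
    """
    true_top_k = set(truth[:k])
    hits_per_run = [len(set(run[:k]) & true_top_k) for run in runs]
    return sum(hits_per_run) / len(hits_per_run)


# Hypothetical example: three runs scored against a made-up ground truth.
truth = ["p1", "p2", "p3", "p4", "p5"]
runs = [
    ["p1", "p2", "p9", "p4", "p5"],  # 2 of the true top 3 found
    ["p3", "p1", "p2", "p8", "p7"],  # 3 of the true top 3 found
    ["p7", "p8", "p1", "p2", "p3"],  # 1 of the true top 3 found
]
print(accuracy_at_k(runs, truth, k=3))  # prints 2.0
```

Under this reading, a Top 5 value such as 2.3 would mean that, on average, between two and three of the five true matches were recovered per run.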
A summary of the results for Track 3 (secure outsourcing)
| Team | Data encryption time (s) | Encrypted data size (MB) | Secure computing time (s) | Result decryption time (s) | Total time (s) for computing, result decryption and transfer |
|---|---|---|---|---|---|
| Microsoft | 1.86 | 24.00 | 3.09 | 0.02 | 3.63 |
| RWTH Aachen University | 34.90 | 255.00 | 15.28 | 0.68 | 16.32 |
| EPFL | 137.60 | 147.00 | 6.79 | 9.28 | 19.26 |
| Seoul National University | 51.02 | 10.00 | 21.10 | 0.005 | 25.11 |
| IBM Team 2 | 478.10 | 1660.00 | 959.10 | 200.70 | 1178.2 |
| Waseda University | 109.72 | 5447.82 | 8937.51 | 0.058 | 8938.81 |
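
The last column combines computing, result decryption, and transfer, so a transfer time can be backed out by subtraction. The sketch below does this for two rows of the table, assuming the total is a plain sum of the three components (the source does not state this explicitly). `implied_transfer_time` is a hypothetical helper introduced only for illustration.

```python
def implied_transfer_time(compute_s: float, decrypt_s: float, total_s: float) -> float:
    """Transfer time implied by the table, assuming total = compute + decrypt + transfer."""
    return total_s - compute_s - decrypt_s


# (secure computing time, result decryption time, total time) in seconds,
# copied from the Track 3 table.
rows = {
    "Microsoft": (3.09, 0.02, 3.63),
    "Seoul National University": (21.10, 0.005, 25.11),
}

for team, (compute_s, decrypt_s, total_s) in rows.items():
    transfer_s = implied_transfer_time(compute_s, decrypt_s, total_s)
    print(f"{team}: about {transfer_s:.2f} s implied transfer time")
```

If the additive assumption holds, the top entry spent roughly half a second on transfer, a small share of its 3.63 s total.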