Yongtai Liu, Zhiyu Wan, Weiyi Xia, Murat Kantarcioglu, Yevgeniy Vorobeychik, Ellen Wright Clayton, Abel Kho, David Carrell, Bradley A Malin.
Abstract
As the quantity and detail of association studies between clinical phenotypes and genotypes grow, there is a push to make summary statistics widely available. Genome-wide summary statistics have been shown to be vulnerable to inference of a targeted individual's presence. In this paper, we show that presence attacks are feasible with phenome-wide summary statistics as well. We use data from three healthcare organizations and an online resource that publishes summary statistics. We introduce a novel attack that achieves over 80% recall and precision within a population of 16,346, where 8,173 individuals are targets. However, the feasibility of the attack depends on the attacker's knowledge about 1) the targeted individual and 2) the reference dataset. Within a population of over 2 million, where 8,173 individuals are targets, our attack achieves 31% recall and 17% precision. As a result, it is plausible that phenomic summary statistics can be shared with an acceptable level of privacy risk.
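To make the reported recall and precision figures concrete, the following minimal sketch backs out the approximate counts they imply. It is an illustration, not the authors' code: the function name `implied_counts` is hypothetical, the inputs are the rounded percentages quoted in the abstract (with "over 80%" treated as exactly 0.80), and the resulting counts are therefore approximations rather than reported results.

```python
def implied_counts(num_targets, recall, precision):
    """Approximate confusion counts implied by a recall/precision pair.

    Hypothetical helper for illustration only. It uses the standard
    definitions recall = TP / (TP + FN) and precision = TP / (TP + FP).
    """
    true_pos = recall * num_targets      # targets the attack correctly flags
    flagged = true_pos / precision       # everyone the attack flags as present
    false_pos = flagged - true_pos       # non-targets wrongly flagged
    false_neg = num_targets - true_pos   # targets the attack misses
    return {
        "true_pos": round(true_pos),
        "flagged": round(flagged),
        "false_pos": round(false_pos),
        "false_neg": round(false_neg),
    }


# Assumed inputs: the rounded figures quoted in the abstract.
small_population = implied_counts(8173, recall=0.80, precision=0.80)
large_population = implied_counts(8173, recall=0.31, precision=0.17)

if __name__ == "__main__":
    print("population of 16,346:", small_population)
    print("population of over 2 million:", large_population)
```

Under these assumptions, the larger setting corresponds to roughly 2,500 correctly identified targets against roughly 12,400 wrongly flagged individuals, which is one way to read the abstract's conclusion that the risk may be acceptable when the attacker's reference population is large.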
Year: 2018 PMID: 30815118 PMCID: PMC6371366
Source DB: PubMed Journal: AMIA Annu Symp Proc ISSN: 1559-4076