
Privacy-Preserving and Efficient Verification of the Outcome in Genome-Wide Association Studies.

Anisa Halimi, Leonard Dervishi, Erman Ayday, Apostolos Pyrgelis, Juan Ramón Troncoso-Pastoriza, Jean-Pierre Hubaux, Xiaoqian Jiang, Jaideep Vaidya.

Abstract

Providing provenance in scientific workflows is essential for reproducibility and auditability. In this work, we propose a framework that verifies the correctness of the aggregate statistics obtained from a genome-wide association study (GWAS) conducted by a researcher while protecting the privacy of the individuals in the researcher's dataset. In a GWAS, the researcher's goal is to identify point mutations (variants) that are highly associated with a given phenotype. The researcher publishes the workflow of the conducted study, its output, and associated metadata. They keep the research dataset private while providing, as part of the metadata, a partial noisy dataset that achieves local differential privacy. To check the correctness of the workflow output, a verifier uses the workflow, its metadata, and the results of another GWAS (conducted using publicly available datasets) to distinguish correct statistics from incorrect ones. For evaluation, we use real genomic data and show that the correctness of the workflow output can be verified with high accuracy even when the aggregate statistics of only a small number of variants are provided. We also quantify the privacy leakage due to the provided workflow and its associated metadata and show that the provided metadata does not increase the existing privacy risk due to sharing of the research results. Thus, our results show that the workflow output (i.e., the research results) can be verified with high confidence in a privacy-preserving way. We believe that this work is a valuable step towards providing provenance in a privacy-preserving way while giving users guarantees about the correctness of the results.
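As an illustration of how a partial noisy dataset could satisfy local differential privacy, the sketch below perturbs each genotype independently with k-ary randomized response. The abstract does not state which mechanism the framework uses, so the mechanism itself, the 0/1/2 minor-allele encoding, and the names `keep_probability`, `randomize_genotype`, and `noisy_dataset` are all assumptions made for this example.

```python
import math
import random

# Assumed encoding: minor-allele count of one variant for one individual.
GENOTYPES = (0, 1, 2)


def keep_probability(epsilon: float, k: int = len(GENOTYPES)) -> float:
    """Probability of reporting the true value under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + k - 1)


def randomize_genotype(value: int, epsilon: float, rng: random.Random) -> int:
    """Report the true genotype with probability p, otherwise one of the
    other genotypes chosen uniformly at random."""
    if value not in GENOTYPES:
        raise ValueError(f"unexpected genotype: {value}")
    if rng.random() < keep_probability(epsilon):
        return value
    return rng.choice([g for g in GENOTYPES if g != value])


def noisy_dataset(rows, epsilon: float, seed: int = 0):
    """Perturb every genotype in a list of per-individual rows independently.

    Hypothetical helper: each value is epsilon-locally-private on its own;
    the budget across many variants of one individual is not accounted for here.
    """
    rng = random.Random(seed)
    return [[randomize_genotype(g, epsilon, rng) for g in row] for row in rows]
```

With this mechanism the ratio between the probability of reporting the true genotype and the probability of reporting any particular other genotype is exactly e^ε, which is the local differential privacy bound for a single value. A smaller ε gives more noise and stronger privacy, at the cost of less accurate statistics for the verifier.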

Keywords:  Privacy; genome-wide association studies; provenance; verifiable computation; workflows

Year:  2022        PMID: 36212774      PMCID: PMC9536480          DOI: 10.56553/popets-2022-0094

Source DB:  PubMed          Journal:  Proc Priv Enhanc Technol        ISSN: 2299-0984

