
Privacy-Preserving Data Exploration in Genome-Wide Association Studies.

Aaron Johnson, Vitaly Shmatikov

Abstract

Genome-wide association studies (GWAS) have become a popular method for analyzing sets of DNA sequences in order to discover the genetic basis of disease. Unfortunately, statistics published as the result of GWAS can be used to identify individuals participating in the study. To prevent privacy breaches, even previously published results have been removed from public databases, impeding researchers' access to the data and hindering collaborative research. Existing techniques for privacy-preserving GWAS focus on answering specific questions, such as correlations between a given pair of SNPs (single-nucleotide polymorphisms, i.e. DNA sequence variations). This does not fit the typical GWAS process, where the analyst may not know in advance which SNPs to consider, which statistical tests to use, or how many SNPs are significant for a given dataset. We present a set of practical, privacy-preserving data mining algorithms for GWAS datasets. Our framework supports exploratory data analysis, where the analyst does not know a priori how many and which SNPs to consider. We develop privacy-preserving algorithms for computing the number and location of SNPs that are significantly associated with the disease, the significance of any statistical test between a given SNP and the disease, any measure of correlation between SNPs, and the block structure of correlations. We evaluate our algorithms on real-world datasets and demonstrate that they produce significantly more accurate results than prior techniques while guaranteeing differential privacy.
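
To make the differential privacy guarantee mentioned in the abstract more concrete, the following is a minimal sketch of the standard Laplace mechanism applied to a single-SNP minor-allele count. It is not one of the paper's algorithms, which this record does not describe in detail. The function name `private_minor_allele_count`, the 0/1/2 genotype encoding, and the choice of query are assumptions made for illustration only.

```python
import random


def private_minor_allele_count(genotypes, epsilon, rng=None):
    """Release a noisy minor-allele count for one SNP (illustrative only).

    genotypes: one value per participant, each 0, 1, or 2 copies of the
               minor allele (an assumed encoding, not taken from the paper).
    epsilon:   differential privacy budget spent on this single release.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()

    true_count = sum(genotypes)

    # Replacing one participant changes the count by at most 2,
    # so the query has sensitivity 2.
    sensitivity = 2.0
    scale = sensitivity / epsilon

    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


if __name__ == "__main__":
    cohort = [0, 1, 2, 1, 0, 2]  # hypothetical genotypes, true count = 6
    print(private_minor_allele_count(cohort, epsilon=1.0, rng=random.Random(0)))
```

Smaller values of `epsilon` give stronger privacy and noisier answers. Each released statistic consumes part of the privacy budget, which is one reason exploratory analysis over many SNPs is harder than answering a single fixed query. A production release would also need a sampler hardened against floating-point leakage, which this sketch does not attempt.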

Keywords:  Differential privacy; genome-wide association studies

Year:  2013        PMID: 26691928      PMCID: PMC4681528          DOI: 10.1145/2487575.2487687

Source DB:  PubMed          Journal:  KDD        ISSN: 2154-817X


