
Privacy-preserving chi-squared test of independence for small samples.

Yuichi Sei, Akihiko Ohsuga.

Abstract

BACKGROUND: The importance of privacy protection in analyses of personal data, such as genome-wide association studies (GWAS), has grown in recent years. GWAS focus on identifying single-nucleotide polymorphisms (SNPs) associated with certain diseases, such as cancer and diabetes, and the chi-squared (χ2) hypothesis test of independence can be used for this identification. However, recent studies have shown that publishing the results of χ2 tests of SNPs or personal data can lead to privacy violations. Several studies have proposed anonymization methods for χ2 testing with ε-differential privacy, which is the cryptographic community's de facto privacy metric. However, existing methods either can be applied only to 2×2 or 2×3 contingency tables or have low accuracy for small numbers of samples. In many cases, such as the analysis of COVID-19 in its early propagation stage, it is difficult to collect a large number of highly sensitive samples.
RESULTS: We propose a novel anonymization method, RandChiDist, which anonymizes χ2 testing for small samples. We prove that RandChiDist satisfies differential privacy. We also evaluate it experimentally using synthetic datasets and two real genomic datasets. Among the existing and baseline methods that can control the Type I error rate, RandChiDist achieved the fewest Type II errors.
CONCLUSIONS: We propose a new differentially private method, named RandChiDist, for anonymizing χ2 values for an I×J contingency table with a small number of samples. The experimental results show that RandChiDist outperforms existing methods for small numbers of samples.
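
For readers unfamiliar with the quantity being anonymized, the following is a minimal sketch of the standard (non-private) Pearson χ2 statistic for an I×J contingency table. It is not RandChiDist, whose mechanism is not described in this abstract; the function names `chi_squared_statistic` and `degrees_of_freedom` are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def chi_squared_statistic(table: Sequence[Sequence[float]]) -> float:
    """Pearson chi-squared statistic for an I x J table of observed counts.

    Illustrative only: this is the plain, non-private statistic,
    not the anonymized value produced by RandChiDist.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    if n == 0:
        raise ValueError("table must contain at least one observation")

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the null hypothesis of independence.
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("every row and column needs a non-zero total")
            statistic += (observed - expected) ** 2 / expected
    return statistic


def degrees_of_freedom(table: Sequence[Sequence[float]]) -> int:
    """Degrees of freedom of the test: (I - 1) * (J - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)
```

The statistic is compared against a χ2 distribution with (I − 1)(J − 1) degrees of freedom. With small samples the expected counts are small, so even modest noise added for privacy can change the test outcome, which is the setting the abstract targets.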


Keywords:  Chi-squared testing; Differential privacy; Privacy-preserving data mining

Year:  2021        PMID: 33482874     DOI: 10.1186/s13040-021-00238-x

Source DB:  PubMed          Journal:  BioData Min        ISSN: 1756-0381            Impact factor:   2.522

