iDASH secure genome analysis competition 2017.

XiaoFeng Wang, Haixu Tang, Shuang Wang, Xiaoqian Jiang, Wenhao Wang, Diyue Bu, Lei Wang, Yicheng Jiang, Chenghong Wang.

Year:  2018        PMID: 30309344      PMCID: PMC6180354          DOI: 10.1186/s12920-018-0396-0

Source DB:  PubMed          Journal:  BMC Med Genomics        ISSN: 1755-8794            Impact factor:   3.063


Year 2017 marks the 4th anniversary of the first iDASH Secure Genome Analysis Competition [1], launched jointly by the University of California San Diego and Indiana University Bloomington. The past 4 years have witnessed continued progress in genomic and biomedical technologies, whose influence permeates our daily life and redefines our perception of privacy. For example, when law enforcement identified the Golden State Killer through a distant relative’s DNA on GEDmatch [2], the kinship inference attack reported 3 years ago [3] became reality. The genome privacy community still faces the same old devils: protecting privacy during genomic data sharing and genome analysis. They simply look more real every day, which puts great pressure on this young community to come up with practical, usable solutions.

Seeking such practical privacy solutions has always been at the center of the competition, as set by the organizers 4 years ago. This year, three carefully designed competition tracks serve that purpose: de-duplication for GA4GH (Track 1), SGX-based whole genome variation search (Track 2), and homomorphic encryption (HME) based logistic regression model learning (Track 3). Each track is rooted either in a real privacy problem haunting an already deployed system or in new challenges that emerge from innovative applications of new computing techniques to secure genomic data sharing or analysis.

Specifically, in collaboration with the Global Alliance for Genomics and Health (GA4GH), Track 1 looks for new privacy-preserving record linkage (PPRL) solutions for removing duplicated health records maintained by multiple data owners. Such a solution can be applied on top of the existing European ENCCA unified patient identifier framework to facilitate record deduplication in GA4GH.
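As a rough illustration of the record-linkage setting in Track 1, the sketch below uses keyed hashing (HMAC) of normalized identifiers, a common simplified baseline for exact-match linkage. It is not the method used by any competition team, and it assumes the data owners share a secret key that the linking party does not know. The function names `linkage_token` and `find_duplicates`, the chosen identifier fields, and the site labels are all hypothetical.

```python
import hashlib
import hmac


def linkage_token(secret_key, first_name, last_name, birth_date):
    """Return a keyed hash of normalized identifiers (hypothetical field choice).

    Each data owner computes this locally, so raw identifiers never leave the site.
    """
    normalized = "|".join(
        part.strip().lower() for part in (first_name, last_name, birth_date)
    )
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def find_duplicates(tokens_by_site):
    """Map each token seen at more than one site to the list of sites holding it.

    The linking party only sees opaque tokens, not names or birth dates.
    """
    sites_by_token = {}
    for site, tokens in tokens_by_site.items():
        for token in set(tokens):  # ignore repeats within a single site
            sites_by_token.setdefault(token, []).append(site)
    return {t: s for t, s in sites_by_token.items() if len(s) > 1}


if __name__ == "__main__":
    key = b"key-shared-by-data-owners"  # assumption: distributed out of band
    site_a = [linkage_token(key, " Alice ", "SMITH", "1980-01-02")]
    site_b = [linkage_token(key, "alice", "smith", "1980-01-02")]
    print(find_duplicates({"site_A": site_a, "site_B": site_b}))
```

Exact-match tokens of this kind do not tolerate typos or missing fields, and they offer weaker guarantees than the cryptographic protocols discussed in this issue; the sketch only shows the shape of the problem.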
Track 2 seeks new answers to a long-standing genome privacy problem: how to perform a large-scale Genome-Wide Association Study (GWAS) on an untrusted public cloud. This time, however, the attempts leverage Intel’s Software Guard Extensions (SGX), a new hardware-supported trusted execution environment (TEE), to move secure analysis closer to practical use. Track 3 answers the new demand for training a machine learning model (logistic regression) on encrypted genomic data through a homomorphic encryption (HME) scheme when the computation must be conducted in an untrusted environment.

Altogether, these three tracks attracted 65 participating teams from 19 countries across North America, Europe, and Asia. Among them, 19 teams from 23 organizations submitted their final results before the deadline. In the end, a joint team from IBM/INRIA and ENS de Lyon/Cornell Tech/Bar-Ilan University won Track 1, CEA France won Track 2, and Seoul National University/UC San Diego won Track 3.

This special issue of BMC Medical Genomics highlights some of the most intriguing techniques reported during the competition.

Carpov et al. [4] describe their winning SGX-based secure GWAS solution, which has two key components: (1) genome data compression and encryption and (2) computation of the top K most significant SNPs. The Rust programming language was used to enable massively parallel computation, which allows the application to scale well with large input VCF files. The whole processing pipeline took about 1 min for about 30 GB of input.

Laud et al. [5] report the results of using secure multiparty computation to efficiently solve the privacy-preserving record linkage problem in large databases. The solution is built upon a commercial platform named Sharemind. Deduplicating 10 million records across 1000 different databases took about 30 min on computing servers connected by a 100 Mbit/s network.

Kim et al. [6] present a winning solution for homomorphic encryption based secure logistic regression. This solution is built upon an efficient approximate arithmetic homomorphic encryption library named HEAAN [7]. The experimental results show that a logistic regression training task over a dataset with 1,579 samples and 18 features can be finished within 6 min.

Chen et al. [8] present another novel solution to the iDASH homomorphic encryption based secure logistic regression task. They apply a multi-bit plaintext space in fully homomorphic encryption together with fixed-point number encoding, and combine bootstrapping in fully homomorphic encryption with a scaling operation in fixed-point arithmetic. They also use a minimax polynomial approximation to the sigmoid function and a 1-bit gradient descent method to reduce plaintext growth during training. Their training over encrypted data took 0.4–3.2 h per iteration of gradient descent.

Bonte et al. [9] discuss an alternative solution for secure logistic regression training over homomorphically encrypted data. The key idea is a simplified fixed Hessian method with much lower multiplicative complexity, which can be evaluated efficiently and iteratively under homomorphic operations.

All these new techniques showcase the achievements of this year’s competition. Some of them have already demonstrated potential for practical use, particularly the deduplication techniques for GA4GH. Others report exciting progress that can lead to breakthroughs in genome privacy, including truly scalable, privacy-preserving genome analysis on an untrusted cloud based upon the new SGX hardware, and the feasibility of training classification models over fully encrypted data. All such techniques and findings will contribute to genome-privacy research and move the science in this emerging domain forward.
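The Track 3 solutions above replace the sigmoid with a polynomial because homomorphic encryption schemes natively evaluate only additions and multiplications. The plaintext sketch below, which involves no encryption at all, shows how that substitution fits into gradient descent for logistic regression. It uses a degree-3 Taylor expansion around 0, which is not the minimax approximation of Chen et al. [8] or the method of any other team. The function names, learning rate, iteration count, and toy data are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    """Exact logistic function, shown only for comparison."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_poly(x):
    """Degree-3 Taylor expansion of the sigmoid around 0.

    Uses only additions and multiplications; accurate only for small |x|.
    """
    return 0.5 + x / 4.0 - x ** 3 / 48.0


def train(features, labels, learning_rate=0.5, iterations=10):
    """Batch gradient descent with the polynomial in place of the sigmoid.

    Few iterations are used on purpose: the Taylor polynomial degrades as
    |w.x| grows, so this sketch assumes features scaled to [-1, 1].
    """
    n, d = len(features), len(features[0])
    weights = [0.0] * d
    for _ in range(iterations):
        gradient = [0.0] * d
        for x, y in zip(features, labels):
            z = sum(w * xj for w, xj in zip(weights, x))
            error = sigmoid_poly(z) - y
            for j in range(d):
                gradient[j] += error * x[j]
        weights = [w - learning_rate * g / n for w, g in zip(weights, gradient)]
    return weights


def predict(weights, x):
    """Classify as 1 when the linear score is positive."""
    return 1 if sum(w * xj for w, xj in zip(weights, x)) > 0 else 0


if __name__ == "__main__":
    # Toy data: first column is a constant bias term, second is one feature.
    X = [[1.0, -1.0], [1.0, -0.5], [1.0, 0.5], [1.0, 1.0]]
    y = [0, 0, 1, 1]
    w = train(X, y)
    print(w, [predict(w, x) for x in X])
```

In an encrypted setting each multiplication consumes part of a limited budget, which is why low-degree approximations and a small number of iterations matter in the solutions described above.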
