Liina Kamm, Dan Bogdanov, Sven Laur, Jaak Vilo.
Abstract
MOTIVATION: Increased availability of various genotyping techniques has initiated a race for finding genetic markers that can be used in diagnostics and personalized medicine. Although many genetic risk factors are known, key causes of common diseases with complex heritage patterns are still unknown. Identification of such complex traits requires a targeted study over a large collection of data. Ideally, such studies bring together data from many biobanks. However, data aggregation on such a large scale raises many privacy issues.
Year: 2013 PMID: 23413435 PMCID: PMC3605601 DOI: 10.1093/bioinformatics/btt066
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Players A and B use a 3-out-of-3 additive secret sharing scheme to split two 32-bit integer values x and y into shares. The shares are sent to three servers, which use the homomorphic property of the scheme to securely compute the sum of x and y. The shares of the sum are sent to player C, who reconstructs the result.
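The scheme described in Fig. 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that shares are added modulo 2^32 (a natural choice for 32-bit values), and the names `share`, `add_shares` and `reconstruct` are hypothetical.

```python
import secrets

MODULUS = 2 ** 32  # assumption: 32-bit values, arithmetic modulo 2^32


def share(value, parties=3):
    """Split `value` into `parties` additive shares that sum to it mod 2^32."""
    # All but the last share are uniformly random; the last one fixes the sum.
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def add_shares(shares_a, shares_b):
    """Each server adds the two shares it holds; no communication is needed."""
    return [(a + b) % MODULUS for a, b in zip(shares_a, shares_b)]


def reconstruct(shares):
    """Combine all shares to recover the secret value."""
    return sum(shares) % MODULUS


# Players A and B share their inputs among the three servers.
x, y = 1234567, 7654321
x_shares = share(x)
y_shares = share(y)

# Server i holds x_shares[i] and y_shares[i] and computes its share of x + y.
sum_shares = add_shares(x_shares, y_shares)

# Player C collects the three result shares and reconstructs the sum.
result = reconstruct(sum_shares)
```

Because the scheme is 3-out-of-3, all three shares are required for reconstruction; any one or two shares alone are uniformly distributed and reveal nothing about the value.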
Fig. 2. Secure GWAS consists of three major stages: data acquisition, formation of case–control groups and statistical testing. Panel (A) depicts how these three stages are linked. Data are gathered and sent in securely coded shares to be stored. For statistical tests, case and control information is securely coded and applied to the securely stored data so that statistical analyses can be carried out. Panel (B) describes two alternative scenarios that can be used for secure storage of genotype and phenotype data. Scenario 1 depicts a situation where genotype data are entered into secure storage by the wetlab and phenotype data are entered by the donors themselves. Scenario 2 depicts a case where different gene banks send selected genotype and phenotype data to secure storage so that they can run joint analyses on more data. Panel (C) describes how case and control groups can be formed. In the simplest case, researchers have unrestricted access to phenotype data and can thus form case and control groups by themselves. In more complex settings, researchers do not have rights to access phenotype data, and hosts must use secure multi-party computations to construct case and control groups based on inclusion criteria.
Contingency table for the standard test
| Group | Allele A | Allele B |
|---|---|---|
| Cases | | |
| Controls | | |
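A 2×2 allele-count table of this shape is the usual input to a Pearson chi-square test of association. The sketch below assumes that this is the test meant here and computes the statistic on plain (non-secret-shared) counts; the function name `chi_square_2x2` and the example counts are hypothetical.

```python
def chi_square_2x2(case_a, case_b, control_a, control_b):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    Rows are cases/controls, columns are counts of allele A / allele B.
    """
    n = case_a + case_b + control_a + control_b
    row_cases = case_a + case_b
    row_controls = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denominator = row_cases * row_controls * col_a * col_b
    if denominator == 0:
        raise ValueError("every row and column total must be non-zero")
    # Closed form for 2x2 tables: N * (ad - bc)^2 / (product of the margins).
    return n * (case_a * control_b - case_b * control_a) ** 2 / denominator


# Made-up counts, for illustration only.
statistic = chi_square_2x2(case_a=10, case_b=20, control_a=20, control_b=10)
```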
Contingency table for the Cochran–Armitage test for trend
| Group | Allele AA | Allele AB | Allele BB | Total |
|---|---|---|---|---|
| Cases | | | | |
| Controls | | | | |
| Total | | | | |
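As a rough illustration of how a genotype table like this one feeds the Cochran–Armitage test for trend, the sketch below computes the textbook statistic on plain counts. It assumes additive weights (0, 1, 2) for the genotypes AA, AB and BB, which is a common default and not confirmed by this record; the function name and the example counts are hypothetical.

```python
def cochran_armitage(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend statistic (chi-square form, 1 degree of freedom).

    `cases` and `controls` hold genotype counts in the order (AA, AB, BB).
    """
    total_cases = sum(cases)
    total_controls = sum(controls)
    n = total_cases + total_controls
    columns = [r + s for r, s in zip(cases, controls)]  # per-genotype totals

    # Trend statistic T = sum_i w_i * (cases_i * S - controls_i * R).
    t = sum(
        w * (r * total_controls - s * total_cases)
        for w, r, s in zip(weights, cases, controls)
    )

    # Variance of T under the null hypothesis of no association.
    squares = sum(w * w * c * (n - c) for w, c in zip(weights, columns))
    cross = sum(
        weights[i] * weights[j] * columns[i] * columns[j]
        for i in range(len(columns))
        for j in range(i + 1, len(columns))
    )
    variance = (total_cases * total_controls / n) * (squares - 2 * cross)
    if variance == 0:
        raise ValueError("degenerate table: variance of the statistic is zero")
    return t * t / variance


# Made-up counts, for illustration only.
trend_statistic = cochran_armitage(cases=(10, 20, 30), controls=(30, 20, 10))
```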
Performance results for data upload and filtering
| Number of donors | Upload | Filtering |
|---|---|---|
| 270 donors | 12.0 min | 0.51 s |
| 540 donors | 17.3 min | 0.59 s |
| 810 donors | 23.2 min | 0.63 s |
| 1080 donors | 29.4 min | 0.68 s |
Performance results for three different frequency analyses
| Number of donors | Cochran–Armitage | TDT | |
|---|---|---|---|
| 270 donors | | | |
| 540 donors | | | |
| 810 donors | | | |
| 1080 donors | | | |
| 1080 donors (non-secure) | 14 s | 35 s | 11 s |
Performance results for four test evaluation methods
| Number of donors | Cochran–Armitage | TDT | | |
|---|---|---|---|---|
| 270 donors | | | | |
| 540 donors | | | | |
| 810 donors | | | | |
| 1080 donors | | | | |
| 1080 donors (non-secure) | 21 ms | 20 ms | 49 ms | 11 ms |