Peter M Visscher, William G Hill.
Abstract
It was recently shown using experimental data that, under certain conditions, it is possible to determine whether a person with known genotypes at a number of markers was part of a sample for which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is approximately proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression analysis methods for the analysis of data. We show that these methods have similar statistical properties and have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.
Year: 2009 PMID: 19798439 PMCID: PMC2746319 DOI: 10.1371/journal.pgen.1000628
Source DB: PubMed Journal: PLoS Genet ISSN: 1553-7390 Impact factor: 5.917
Simulation results (m = 50,000 SNPs; type-I error rate = 0.05; 1000 simulations).
| | | | Linear regression | | | Homer et al. | |
| Proband in test? | | | | P( | P( | P( | P( |
| NO | ∞ | 100 | 0.000 | 0.055 | 1.000 | 0.000 | 1.000 |
| NO | ∞ | 1000 | 0.002 | 0.064 | 1.000 | 0.002 | 0.486 |
| NO | ∞ | 10000 | 0.000 | 0.056 | 0.731 | 0.016 | 0.133 |
| NO | 100 | 100 | 0.001 | 0.061 | 1.000 | 0.057 | 0.039 |
| NO | 100 | 1000 | 0.005 | 0.065 | 0.678 | 0.994 | 0.000 |
| NO | 100 | 10000 | 0.041 | 0.052 | 0.079 | 0.999 | 0.000 |
| NO | 1000 | 100 | −0.000 | 0.047 | 1.000 | 0.000 | 0.997 |
| NO | 1000 | 1000 | 0.014 | 0.069 | 0.999 | 0.060 | 0.047 |
| NO | 1000 | 10000 | 0.002 | 0.057 | 0.185 | 0.404 | 0.000 |
| NO | 10000 | 100 | 0.002 | 0.067 | 1.000 | 0.000 | 0.999 |
| NO | 10000 | 1000 | 0.001 | 0.065 | 1.000 | 0.001 | 0.408 |
| NO | 10000 | 10000 | −0.002 | 0.053 | 0.472 | 0.048 | 0.051 |
| YES | ∞ | 100 | 0.999 | 1.000 | 0.048 | 1.000 | 0.000 |
| YES | ∞ | 1000 | 1.003 | 1.000 | 0.051 | 0.996 | 0.000 |
| YES | ∞ | 10000 | 0.997 | 0.709 | 0.053 | 0.396 | 0.000 |
| YES | 100 | 100 | 1.004 | 1.000 | 0.064 | 1.000 | 0.000 |
| YES | 100 | 1000 | 0.999 | 0.686 | 0.060 | 1.000 | 0.000 |
| YES | 100 | 10000 | 0.974 | 0.078 | 0.063 | 0.998 | 0.000 |
| YES | 1000 | 100 | 0.999 | 1.000 | 0.058 | 1.000 | 0.000 |
| YES | 1000 | 1000 | 1.002 | 1.000 | 0.063 | 0.992 | 0.000 |
| YES | 1000 | 10000 | 1.015 | 0.190 | 0.053 | 0.625 | 0.000 |
| YES | 10000 | 100 | 1.000 | 1.000 | 0.063 | 1.000 | 0.000 |
| YES | 10000 | 1000 | 0.999 | 1.000 | 0.059 | 0.993 | 0.000 |
| YES | 10000 | 10000 | 0.998 | 0.475 | 0.067 | 0.375 | 0.000 |
D refers to the Homer et al. test statistic.
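The dependence of power on the ratio of SNPs to sample size can be illustrated with a small simulation. The following is a minimal sketch under simplifying assumptions that are not taken from the source: independent SNPs in Hardy-Weinberg equilibrium, population allele frequencies known exactly (loosely analogous to the rows with ∞ in the table), a uniform allele-frequency spectrum, a regression through the origin of the proband's deviation from the population frequency on the test sample's deviation, and a one-sided z-test at the 0.05 level. The function names `membership_slope` and `rejection_rate` are hypothetical, and this is not necessarily the exact estimator used by the authors.

```python
import numpy as np


def membership_slope(in_sample, m, n, rng):
    """Slope and standard error from regressing (y - p) on (p_hat - p) over m SNPs.

    Illustrative assumptions: independent SNPs in Hardy-Weinberg equilibrium
    and population allele frequencies p known without error.
    """
    p = rng.uniform(0.1, 0.9, size=m)      # hypothetical allele-frequency spectrum
    proband = rng.binomial(2, p)           # proband allele counts (0, 1 or 2)
    if in_sample:
        # The proband's alleles are among the 2n alleles of the test sample.
        count = proband + rng.binomial(2 * (n - 1), p)
    else:
        count = rng.binomial(2 * n, p)
    w = np.sqrt(p * (1 - p))               # scale so every SNP has equal variance
    x = (count / (2 * n) - p) / w          # test-sample deviation from population
    d = (proband / 2 - p) / w              # proband deviation from population
    b = (x @ d) / (x @ x)                  # least-squares slope through the origin
    resid = d - b * x
    se = np.sqrt((resid @ resid) / (m - 1) / (x @ x))
    return b, se


def rejection_rate(in_sample, m, n, reps, rng, z_crit=1.645):
    """Fraction of replicates in which b = 0 is rejected by a one-sided z-test."""
    hits = 0
    for _ in range(reps):
        b, se = membership_slope(in_sample, m, n, rng)
        hits += int(b / se > z_crit)
    return hits / reps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = 50_000
    for n in (100, 1000, 10000):
        power = rejection_rate(True, m, n, 100, rng)
        alpha = rejection_rate(False, m, n, 100, rng)
        print(f"n={n:>5}  m/n={m / n:>5.0f}  power={power:.2f}  type-I={alpha:.2f}")
```

Under these assumptions the expected slope is 1 when the proband is in the test sample and 0 otherwise, and the standard error of the slope is roughly sqrt(n/m), so the ability to separate the two cases depends on m/n. Scaling each SNP by sqrt(p(1 - p)) is a design choice made here to keep the residual variance constant across SNPs.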