Ailith Pirie, Angela Wood, Michael Lush, Jonathan Tyrer, Paul D P Pharoah.
Abstract
BACKGROUND: The detection of bias due to cryptic population structure is an important step in the evaluation of findings of genetic association studies. The standard method of measuring this bias in a genetic association study is to compare the observed median association test statistic to the expected median test statistic. This ratio is inflated in the presence of cryptic population structure. However, inflation may also be caused by the properties of the association test itself, particularly in the analysis of rare variants. Using simulated data, we compared the properties of the three most commonly used association tests (the likelihood ratio test, the Wald test and the score test) when testing rare variants for association.
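To make the difference between the three tests concrete, the following is a minimal sketch rather than the authors' simulation code. It assumes a variant coded as carrier versus non-carrier (reasonable when only heterozygotes are observed), for which the three 1-degree-of-freedom statistics of a logistic regression have closed forms on the 2×2 table: the score test is the Pearson chi-square, the likelihood ratio test is the G statistic, and the Wald test is the squared log odds ratio over its variance. The function name `association_tests_2x2` and the choice to report a Wald statistic of 0 when a cell is empty are assumptions of this sketch.

```python
import math


def association_tests_2x2(a, b, c, d):
    """Return (lrt, wald, score) 1-df statistics for a 2x2 table.

    Table layout (an assumption of this sketch):
        a = carrier cases,     b = non-carrier cases
        c = carrier controls,  d = non-carrier controls
    All four margins are assumed to be non-zero.
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))

    # Score test: Pearson chi-square for the 2x2 table.
    score = n * (a * d - b * c) ** 2 / (rows[0] * rows[1] * cols[0] * cols[1])

    # Likelihood ratio test: G = 2 * sum(O * ln(O / E)), with 0 * ln(0) taken as 0.
    lrt = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:
                e = rows[i] * cols[j] / n
                lrt += 2.0 * o * math.log(o / e)

    # Wald test: (log odds ratio)^2 / var(log odds ratio).
    if min(a, b, c, d) == 0:
        # Assumption: the log odds ratio is infinite and its standard error
        # diverges, so the statistic is reported as 0 here.
        wald = 0.0
    else:
        log_or = math.log((a * d) / (b * c))
        wald = log_or ** 2 / (1 / a + 1 / b + 1 / c + 1 / d)

    return lrt, wald, score
```

For a sparse table such as `association_tests_2x2(5, 95, 0, 100)`, the Wald statistic is the smallest of the three and the likelihood ratio statistic the largest, which is the kind of divergence the abstract refers to for rare variants.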
Year: 2015 PMID: 25888290 PMCID: PMC4339749 DOI: 10.1186/s12859-015-0496-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. The level of inflation of the median test statistic for variants, with increasing numbers of heterozygotes, in a case–control analysis using the likelihood ratio test, the Wald test and the score test. a) The over-dispersion ratio at the median test statistic in a case–control analysis for a sample size of 5,000 with up to 50 heterozygotes. b) The over-dispersion ratio at the median test statistic in a case–control analysis for a sample size of 10,000 with up to 50 heterozygotes.
Figure 2. The level of inflation in the test statistic evaluated at the mean is used to smooth out the variation in the median test statistic caused by the small number of contingencies. We consider how the over-dispersion ratio varies as the frequency of the variant increases. a) The over-dispersion ratio evaluated at the mean test statistic in a case–control analysis of 5,000 samples with variants with up to 50 heterozygotes. b) The over-dispersion ratio evaluated at the mean test statistic in a case–control analysis of 10,000 samples with variants with up to 50 heterozygotes.
Figure 3. The levels of inflation (λ) measured at the median test statistic using the likelihood ratio test, Wald test and score test on rare-variant-focussed datasets. There is evidence that the test statistics of the Wald test are under-inflated, whereas the test statistics for the likelihood ratio test and the score test are over-inflated.
The level of inflation in the mean and median test statistics of the association tests
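| Test | All variants, λ (median) | All variants, λ (mean) | <20 heterozygotes, λ (median) | <20 heterozygotes, λ (mean) | ≥20 heterozygotes, λ (median) | ≥20 heterozygotes, λ (mean) |
|---|---|---|---|---|---|---|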
| LRT | 1.905 (1.903-1.906) | 1.172 (1.171-1.172) | 3.047 (3.047-3.047) | 1.270 (1.270-1.271) | 0.992 (0.991-0.992) | 1.002 (1.001-1.002) |
| Wald test | 0.311 (0.310-0.311) | 0.580 (0.580-0.580) | 0.017 (0.017-0.017) | 0.343 (0.343-0.343) | 0.990 (0.989-0.991) | 0.987 (0.987-0.988) |
| Score test | 1.902 (1.900-1.904) | 1.004 (1.004-1.004) | 2.198 (2.198-2.198) | 1.008 (1.008-1.009) | 0.991 (0.990-0.992) | 0.997 (0.996-0.997) |
The level of inflation in the test statistics of the likelihood ratio test, Wald test and score test, measured at the median test statistic and at the mean test statistic. The inflation factor (λ) is calculated by comparing the observed test statistic to the expected test statistic at a given point in the χ² distribution. Inflation was measured for datasets including all variants, only variants with fewer than 20 heterozygotes in the sample, and only variants with at least 20 heterozygotes in the sample. Each value of λ was averaged across 1000 simulated datasets. All intervals are 95% confidence intervals. A normal level of inflation is indicated by λ = 1.
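The calculation described in this caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the test statistics are taken to follow a χ² distribution with 1 degree of freedom under the null (so the expected mean is 1 and the expected median is about 0.455), and the 95% interval is a normal approximation around the average of the per-dataset values, since the source does not say how its intervals were computed. The names `inflation_factors` and `summarise` are hypothetical.

```python
from statistics import NormalDist, mean, median, stdev

# If Z ~ N(0, 1) then Z^2 ~ chi-square(1 df), so the chi-square median
# is the squared 75th percentile of the standard normal (about 0.455).
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2
CHI2_1DF_MEAN = 1.0  # the mean of a chi-square equals its degrees of freedom


def inflation_factors(test_statistics):
    """Return (lambda_median, lambda_mean) for one dataset of 1-df statistics."""
    lambda_median = median(test_statistics) / CHI2_1DF_MEDIAN
    lambda_mean = mean(test_statistics) / CHI2_1DF_MEAN
    return lambda_median, lambda_mean


def summarise(lambdas, level=0.95):
    """Average per-dataset lambdas; return (centre, lower, upper).

    Assumption: a normal-approximation interval for the mean of the lambdas.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    centre = mean(lambdas)
    half_width = z * stdev(lambdas) / len(lambdas) ** 0.5
    return centre, centre - half_width, centre + half_width
```

In use, `inflation_factors` would be applied to the test statistics of each simulated dataset, and `summarise` to the resulting list of λ values for each test and each variant subset.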