Abstract
In case-control genetic association studies, cases are subjects with the disease and controls are subjects without the disease. At the time of case-control data collection, information about secondary phenotypes is also collected. In addition to studies of primary diseases, there has been some interest in studying genetic variants associated with secondary phenotypes. In genetic association studies, the deviation from Hardy-Weinberg proportions (HWP) of each genetic marker is assessed as an initial quality check to identify questionable genotypes. Generally, HWP tests are performed on the controls for the primary disease or secondary phenotype. However, when the disease or phenotype of interest is common, the controls do not represent the general population. Therefore, using only controls for testing HWP can result in a highly inflated type I error rate for disease- and/or phenotype-associated variants. Recently, two approaches, the likelihood ratio test (LRT) approach and the mixture HWP (mHWP) exact test, were proposed for testing HWP in samples from case-control studies. Here, we show that when the study of the primary disease is frequency-matched on the secondary phenotype, these two approaches result in inflated type I error rates and could lead to the removal from further analysis of potentially causal genetic variants associated with the primary disease and/or secondary phenotype. Therefore, we propose alternative approaches for assessing HWP, which extend the LRT and mHWP approaches to account for frequency matching. The goal is to retain more (possibly causative) single-nucleotide polymorphisms (SNPs) in the sample for further analysis. Our simulation results showed that both extended approaches could control type I error probabilities. We also applied the proposed approaches to test HWP for SNPs from a genome-wide association study of lung cancer that was frequency-matched on smoking status and found that the proposed approaches retain more genetic variants for association studies.
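As a point of reference for the quality check described in the abstract, the following is a minimal sketch of the standard Pearson chi-square test for deviation from HWP, applied to the genotype counts of a single group (for example, controls only). It is not the LRT or mHWP procedure, nor either of the extended approaches; the function name `hwp_chisq_test` and the example counts are illustrative assumptions.

```python
import math


def hwp_chisq_test(n_aa, n_ab, n_bb):
    """Pearson chi-square test (1 df) for deviation from HWP.

    n_aa, n_ab, n_bb are observed genotype counts (both alleles must be present).
    Returns (chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    # Genotype counts expected under Hardy-Weinberg proportions.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts: the first set matches HWP exactly, the second has a heterozygote deficit.
print(hwp_chisq_test(360, 480, 160))
print(hwp_chisq_test(400, 400, 200))
```

A marker would typically be flagged when the p-value falls below a chosen significance level, such as the 0.05 and 0.0001 levels examined in the simulation tables.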
Year: 2011 PMID: 22110703 PMCID: PMC3215743 DOI: 10.1371/journal.pone.0027642
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Parameters for simulation studies.
| Factors | Prevalence | Coefficients of logistic models |
| SNP1 | | |
| SNP2 | | |
| SNP3 | | |
| SNP4 | | |
| Sex | 50% (Male) | |
| Ethnicity | 75% (Caucasian) | |
| Age 0-30 | 36% | |
| Age 31-50 | 39% | |
| Secondary Trait | NA | |
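To illustrate why a logistic disease model of the kind parameterized above can push controls away from HWP when the disease is common, here is a minimal sketch that computes the exact genotype distribution among disease-free subjects. The intercepts and the log odds ratio are hypothetical values chosen for illustration (the table's coefficient values are not reproduced here), and the single-SNP model ignores the covariates and the secondary trait.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def control_genotype_freqs(maf, intercept, log_or):
    """Genotype frequencies (0, 1, 2 minor alleles) among disease-free subjects.

    Assumes the general population is in HWP and that disease risk follows
    logit P(D = 1 | g) = intercept + log_or * g, with g the minor allele count.
    """
    population = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    unaffected = [
        population[g] * (1.0 - expit(intercept + log_or * g)) for g in range(3)
    ]
    total = sum(unaffected)
    return [u / total for u in unaffected]


def hw_disequilibrium(freqs):
    """D = P(two minor alleles) - (minor allele frequency)^2; zero under HWP."""
    q = freqs[2] + freqs[1] / 2.0
    return freqs[2] - q * q


# Hypothetical settings: MAF 40%, per-allele log odds ratio 0.5.
rare = hw_disequilibrium(control_genotype_freqs(0.4, -4.0, 0.5))   # low baseline risk
common = hw_disequilibrium(control_genotype_freqs(0.4, 0.0, 0.5))  # 50% baseline risk
null = hw_disequilibrium(control_genotype_freqs(0.4, 0.0, 0.0))    # SNP not associated
print(rare, common, null)
```

Under these assumed values the departure from HWP among controls is larger in magnitude for the common disease than for the rare one, and it vanishes when the SNP has no effect, which is the pattern the controls-only test is sensitive to.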
Estimated type I error probability for test of deviation from HWP of SNP1, a causal SNP to both primary disease and secondary phenotype (MAF = 40%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.629540 | 0.575270 | 0.554130 | 0.562720 | 0.053358 | 0.041523 | 0.035780 | 0.038847 |
| | | 0.215320 | 0.207040 | 0.206470 | 0.201280 | 0.003199 | 0.003054 | 0.002981 | 0.002936 |
| | | 0.050372 | 0.048842 | 0.049259 | 0.049460 | 0.000067 | 0.000060 | 0.000111 | 0.000088 |
| | | 0.210240 | 0.206540 | 0.205710 | 0.205040 | 0.003146 | 0.002987 | 0.002762 | 0.002758 |
| mHWP_t | | 0.633310 | 0.584310 | 0.536340 | 0.484760 | 0.052694 | 0.027412 | 0.010742 | 0.006781 |
| | | 0.216690 | 0.215840 | 0.205520 | 0.169840 | 0.003171 | 0.002029 | 0.000832 | 0.000479 |
| | | 0.050907 | 0.053288 | 0.052970 | 0.046346 | 0.000067 | 0.000056 | 0.000050 | 0.000016 |
| | | 0.212230 | 0.216690 | 0.220330 | 0.216660 | 0.003372 | 0.002963 | 0.002298 | 0.001488 |
| LRT_d | | 0.082900 | 0.176400 | 0.179690 | 0.121400 | 0.000366 | 0.001921 | 0.002132 | 0.001113 |
| | | 0.062015 | 0.118910 | 0.137840 | 0.106030 | 0.000193 | 0.000867 | 0.001054 | 0.000592 |
| | | 0.056323 | 0.081140 | 0.096205 | 0.082865 | 0.000109 | 0.000253 | 0.000700 | 0.000413 |
| | | 0.050907 | 0.059385 | 0.064534 | 0.061732 | 0.000212 | 0.000144 | 0.000289 | 0.000109 |
| mHWP_d | | 0.086345 | 0.181400 | 0.184410 | 0.125150 | 0.000306 | 0.001731 | 0.001913 | 0.000965 |
| | | 0.067275 | 0.125500 | 0.144800 | 0.112180 | 0.000058 | 0.000449 | 0.000555 | 0.000413 |
| | | 0.054004 | 0.077626 | 0.092293 | 0.079165 | 0.000099 | 0.000235 | 0.000621 | 0.000393 |
| | | 0.055775 | 0.064159 | 0.069567 | 0.066528 | 0.000116 | 0.000054 | 0.000153 | 0.000054 |
| eLRT | | 0.050846 | 0.050180 | 0.049620 | 0.049718 | 0.000068 | 0.000059 | 0.000122 | 0.000038 |
| | | 0.048648 | 0.050629 | 0.049565 | 0.050288 | 0.000102 | 0.000162 | 0.000109 | 0.000130 |
| | | 0.049669 | 0.049533 | 0.049771 | 0.050028 | 0.000108 | 0.000089 | 0.000062 | 0.000078 |
| | | 0.049458 | 0.048902 | 0.050458 | 0.049812 | 0.000179 | 0.000074 | 0.000111 | 0.000061 |
| emHWP | | 0.055737 | 0.049401 | 0.037640 | 0.022980 | 0.000040 | 0.000006 | 0.000012 | 0.000010 |
| | | 0.051877 | 0.045782 | 0.033138 | 0.020467 | 0.000014 | 0.000019 | 0.000021 | <0.000001 |
| | | 0.052006 | 0.053254 | 0.044670 | 0.029965 | 0.000082 | 0.000011 | 0.000008 | <0.000001 |
| | | 0.054092 | 0.053385 | 0.054638 | 0.054220 | 0.000138 | 0.000041 | 0.000107 | 0.000019 |
*Simulation studies were based on 1,000,000 replicates, each replicate with 2,000 cases in terms of primary disease and 2,000 controls frequency-matched on secondary phenotype.
MAF: minor allele frequency.
LRT_t: LRT approach, using presence and absence of secondary phenotype as cases and controls.
mHWP_t: mHWP exact test, using presence and absence of secondary phenotype as cases and controls.
LRT_d: LRT approach, using presence and absence of primary disease as cases and controls.
mHWP_d: mHWP exact test, using presence and absence of primary disease as cases and controls.
eLRT: extended LRT approach.
emHWP: extended mHWP exact test.
: prevalence of primary disease in general population.
: prevalence of secondary phenotype in general population.
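The entries in these tables are rejection proportions over simulated replicates. The sketch below shows that estimation step in its simplest form, assuming a simple random sample drawn from a population in HWP and using the ordinary chi-square HWP test; it does not reproduce the case-control sampling, the frequency matching, or any of the six approaches compared here. The function names, sample size, and replicate count are illustrative assumptions.

```python
import math
import random


def hwp_pvalue(n_hom_major, n_het, n_hom_minor):
    """P-value of the 1-df Pearson chi-square test for deviation from HWP."""
    n = n_hom_major + n_het + n_hom_minor
    p = (2 * n_hom_major + n_het) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic sample: nothing to test
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_hom_major, n_het, n_hom_minor)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return math.erfc(math.sqrt(stat / 2.0))


def estimate_type1_error(maf, n_subjects, n_replicates, alpha, seed=0):
    """Proportion of replicates rejecting HWP when the null hypothesis is true."""
    rng = random.Random(seed)
    weights = [(1 - maf) ** 2, 2 * maf * (1 - maf), maf ** 2]
    rejections = 0
    for _ in range(n_replicates):
        # Draw genotypes (minor allele counts) for one replicate under HWP.
        draws = rng.choices((0, 1, 2), weights=weights, k=n_subjects)
        counts = [draws.count(g) for g in (0, 1, 2)]
        if hwp_pvalue(*counts) < alpha:
            rejections += 1
    return rejections / n_replicates


# Small illustrative run; the study itself reports 1,000,000 replicates.
print(estimate_type1_error(maf=0.4, n_subjects=500, n_replicates=2000, alpha=0.05))
```

An estimate close to the nominal level indicates that the test controls type I error under the simulated sampling scheme, whereas estimates well above it, as in several rows above, indicate inflation.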
Estimated type I error probability for test of deviation from HWP of SNP2, a causal SNP to primary disease but unassociated with secondary phenotype (MAF = 40%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.627260 | 0.573700 | 0.550220 | 0.559390 | 0.053796 | 0.040312 | 0.034674 | 0.037743 |
| | | 0.210050 | 0.198150 | 0.200780 | 0.196990 | 0.003259 | 0.002515 | 0.002577 | 0.002789 |
| | | 0.050007 | 0.049852 | 0.049675 | 0.051325 | 0.000071 | 0.000113 | 0.000062 | 0.000160 |
| | | 0.216220 | 0.219080 | 0.217210 | 0.215410 | 0.003184 | 0.003390 | 0.003172 | 0.002850 |
| mHWP_t | | 0.631460 | 0.584970 | 0.536990 | 0.488360 | 0.053770 | 0.026639 | 0.010651 | 0.007235 |
| | | 0.210800 | 0.208130 | 0.200800 | 0.168160 | 0.003330 | 0.001737 | 0.000933 | 0.000374 |
| | | 0.050602 | 0.054912 | 0.054477 | 0.048824 | 0.000073 | 0.000057 | 0.000016 | 0.000005 |
| | | 0.218030 | 0.229660 | 0.232740 | 0.228650 | 0.003316 | 0.003497 | 0.002618 | 0.001366 |
| LRT_d | | 0.050558 | 0.050690 | 0.050830 | 0.051459 | 0.000092 | 0.000117 | 0.000148 | 0.000103 |
| | | 0.051329 | 0.055468 | 0.056119 | 0.052864 | 0.000089 | 0.000129 | 0.000150 | 0.000111 |
| | | 0.050079 | 0.054422 | 0.055539 | 0.053985 | 0.000067 | 0.000120 | 0.000057 | 0.000145 |
| | | 0.050483 | 0.051177 | 0.051220 | 0.050447 | 0.000094 | 0.000136 | 0.000103 | 0.000107 |
| mHWP_d | | 0.054121 | 0.054870 | 0.054884 | 0.055146 | 0.000067 | 0.000117 | 0.000097 | 0.000100 |
| | | 0.056680 | 0.061018 | 0.061881 | 0.057905 | 0.000077 | 0.000068 | 0.000089 | 0.000040 |
| | | 0.048408 | 0.052943 | 0.053900 | 0.052227 | 0.000067 | 0.000145 | 0.000057 | 0.000132 |
| | | 0.055468 | 0.056702 | 0.056614 | 0.056278 | 0.000024 | 0.000100 | 0.000055 | 0.000081 |
| eLRT | | 0.050108 | 0.049843 | 0.049292 | 0.050494 | 0.000096 | 0.000125 | 0.000095 | 0.000046 |
| | | 0.050273 | 0.050039 | 0.049661 | 0.050889 | 0.000065 | 0.000133 | 0.000101 | 0.000102 |
| | | 0.049889 | 0.050504 | 0.049880 | 0.050395 | 0.000076 | 0.000065 | 0.000070 | 0.000105 |
| | | 0.049888 | 0.049224 | 0.051004 | 0.050025 | 0.000068 | 0.000139 | 0.000087 | 0.000115 |
| emHWP | | 0.055240 | 0.050562 | 0.038818 | 0.025086 | 0.000070 | 0.000032 | 0.000012 | 0.000012 |
| | | 0.053956 | 0.046384 | 0.034750 | 0.022099 | 0.000047 | 0.000008 | 0.000012 | 0.000008 |
| | | 0.052390 | 0.055258 | 0.046097 | 0.032279 | 0.000076 | 0.000041 | 0.000020 | <0.000001 |
| | | 0.055004 | 0.054376 | 0.055794 | 0.054795 | 0.000002 | 0.000085 | 0.000046 | 0.000102 |
Estimated type I error probability for test of deviation from HWP of SNP3, a causal SNP to secondary phenotype but unassociated with primary disease (MAF = 40%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.050237 | 0.049513 | 0.050059 | 0.049689 | 0.000102 | 0.000110 | 0.000117 | 0.000056 |
| | | 0.049339 | 0.049678 | 0.049446 | 0.048962 | 0.000108 | 0.000116 | 0.000108 | 0.000139 |
| | | 0.049916 | 0.050404 | 0.049220 | 0.049619 | 0.000098 | 0.000158 | 0.000111 | 0.000080 |
| | | 0.049775 | 0.050679 | 0.049866 | 0.050453 | 0.000059 | 0.000130 | 0.000118 | 0.000171 |
| mHWP_t | | 0.052723 | 0.053571 | 0.046324 | 0.032064 | 0.000102 | 0.000053 | <0.000001 | <0.000001 |
| | | 0.050644 | 0.054787 | 0.049998 | 0.038074 | 0.000108 | 0.000015 | 0.000037 | <0.000001 |
| | | 0.050812 | 0.055255 | 0.053091 | 0.045548 | 0.000122 | 0.000111 | 0.000042 | 0.000004 |
| | | 0.049322 | 0.053758 | 0.054543 | 0.053817 | 0.000059 | 0.000132 | 0.000078 | 0.000069 |
| LRT_d | | 0.091228 | 0.203020 | 0.202380 | 0.132960 | 0.000471 | 0.002732 | 0.003004 | 0.001147 |
| | | 0.072172 | 0.158300 | 0.184610 | 0.131600 | 0.000233 | 0.001944 | 0.002358 | 0.001177 |
| | | 0.058820 | 0.103760 | 0.127970 | 0.109320 | 0.000192 | 0.000643 | 0.001143 | 0.000612 |
| | | 0.051530 | 0.063688 | 0.073503 | 0.071796 | 0.000130 | 0.000196 | 0.000318 | 0.000314 |
| mHWP_d | | 0.093969 | 0.208080 | 0.207090 | 0.136520 | 0.000448 | 0.002430 | 0.002673 | 0.001069 |
| | | 0.077520 | 0.166330 | 0.193820 | 0.138970 | 0.000144 | 0.001229 | 0.001499 | 0.000816 |
| | | 0.056407 | 0.099694 | 0.123440 | 0.104840 | 0.000180 | 0.000568 | 0.001051 | 0.000553 |
| | | 0.056720 | 0.069087 | 0.078565 | 0.077589 | 0.000108 | 0.000100 | 0.000160 | 0.000221 |
| eLRT | | 0.050286 | 0.049335 | 0.050350 | 0.049391 | 0.000137 | 0.000111 | 0.000196 | 0.000111 |
| | | 0.049295 | 0.049636 | 0.049703 | 0.048950 | 0.000097 | 0.000118 | 0.000107 | 0.000123 |
| | | 0.049719 | 0.049851 | 0.049578 | 0.049965 | 0.000152 | 0.000104 | 0.000117 | 0.000093 |
| | | 0.049215 | 0.049359 | 0.050344 | 0.050839 | 0.000178 | 0.000141 | 0.000057 | 0.000179 |
| emHWP | | 0.054900 | 0.049287 | 0.038968 | 0.023448 | 0.000078 | 0.000036 | 0.000020 | <0.000001 |
| | | 0.052976 | 0.045988 | 0.033970 | 0.019914 | 0.000056 | <0.000001 | 0.000024 | 0.000003 |
| | | 0.052207 | 0.054182 | 0.045483 | 0.030691 | 0.000152 | 0.000060 | 0.000022 | <0.000001 |
| | | 0.053854 | 0.054086 | 0.055279 | 0.055419 | 0.000089 | 0.000065 | 0.000033 | 0.000095 |
Estimated type I error probability for test of deviation from HWP of SNP4, a SNP unassociated with secondary phenotype and primary disease (MAF = 40%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.049895 | 0.051331 | 0.049151 | 0.049731 | 0.000081 | 0.000101 | 0.000142 | 0.000115 |
| | | 0.050845 | 0.050014 | 0.048887 | 0.050541 | 0.000056 | 0.000080 | 0.000100 | 0.000104 |
| | | 0.049367 | 0.048765 | 0.051168 | 0.049342 | 0.000081 | 0.000124 | 0.000131 | 0.000173 |
| | | 0.050228 | 0.049878 | 0.049212 | 0.050788 | 0.000049 | 0.000024 | 0.000086 | 0.000070 |
| mHWP_t | | 0.052873 | 0.056271 | 0.046639 | 0.033368 | 0.000089 | 0.000064 | 0.000024 | <0.000001 |
| | | 0.052488 | 0.055169 | 0.050527 | 0.040863 | 0.000060 | 0.000067 | 0.000033 | 0.000035 |
| | | 0.050102 | 0.053512 | 0.055504 | 0.047180 | 0.000081 | 0.000106 | 0.000077 | 0.000014 |
| | | 0.049741 | 0.052617 | 0.054543 | 0.054514 | 0.000052 | 0.000022 | 0.000065 | 0.000039 |
| LRT_d | | 0.051678 | 0.052132 | 0.050314 | 0.049756 | 0.000103 | 0.000075 | 0.000064 | 0.000098 |
| | | 0.050903 | 0.049942 | 0.048878 | 0.051188 | 0.000067 | 0.000130 | 0.000204 | 0.000140 |
| | | 0.049440 | 0.048856 | 0.050964 | 0.049971 | 0.000087 | 0.000101 | 0.000135 | 0.000160 |
| | | 0.050011 | 0.049740 | 0.050224 | 0.052106 | 0.000078 | 0.000043 | 0.000090 | 0.000070 |
| mHWP_d | | 0.054502 | 0.055479 | 0.053625 | 0.053484 | 0.000093 | 0.000073 | 0.000052 | 0.000084 |
| | | 0.055727 | 0.055445 | 0.054290 | 0.056204 | 0.000029 | 0.000059 | 0.000101 | 0.000110 |
| | | 0.047628 | 0.047015 | 0.049317 | 0.047875 | 0.000062 | 0.000109 | 0.000135 | 0.000156 |
| | | 0.055520 | 0.055276 | 0.055206 | 0.057024 | 0.000052 | 0.000027 | 0.000032 | 0.000055 |
| eLRT | | 0.051447 | 0.051662 | 0.050660 | 0.049422 | 0.000048 | 0.000111 | 0.000135 | 0.000088 |
| | | 0.050599 | 0.049412 | 0.048554 | 0.051584 | 0.000061 | 0.000074 | 0.000095 | 0.000073 |
| | | 0.049071 | 0.048852 | 0.050214 | 0.050171 | 0.000082 | 0.000111 | 0.000170 | 0.000114 |
| | | 0.049997 | 0.049947 | 0.049675 | 0.051680 | 0.000105 | 0.000076 | 0.000127 | 0.000121 |
| emHWP | | 0.056339 | 0.052359 | 0.040427 | 0.025242 | 0.000017 | 0.000029 | 0.000013 | 0.000007 |
| | | 0.054430 | 0.046910 | 0.034919 | 0.022838 | 0.000014 | 0.000025 | 0.000014 | <0.000001 |
| | | 0.052086 | 0.053611 | 0.046387 | 0.031905 | 0.000082 | 0.000072 | 0.000036 | 0.000014 |
| | | 0.054686 | 0.055224 | 0.054902 | 0.056712 | 0.000065 | 0.000041 | 0.000034 | 0.000065 |
Estimated type I error probability for test of deviation from HWP of SNP1, a causal SNP to both primary disease and secondary phenotype (MAF = 10%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.162010 | 0.153420 | 0.150040 | 0.152840 | 0.001597 | 0.001321 | 0.001332 | 0.001540 |
| | | 0.078159 | 0.072453 | 0.071508 | 0.072241 | 0.000309 | 0.000347 | 0.000336 | 0.000353 |
| | | 0.051000 | 0.049919 | 0.050700 | 0.049936 | 0.000118 | 0.000068 | 0.000063 | 0.000080 |
| | | 0.071504 | 0.072624 | 0.070306 | 0.072665 | 0.000174 | 0.000309 | 0.000361 | 0.000241 |
| mHWP_t | | 0.149180 | 0.135750 | 0.108590 | 0.078171 | 0.001100 | 0.000328 | 0.000048 | 0.000010 |
| | | 0.070424 | 0.066110 | 0.056082 | 0.040867 | 0.000250 | 0.000092 | 0.000010 | 0.000011 |
| | | 0.049444 | 0.051274 | 0.049291 | 0.040294 | 0.000123 | 0.000038 | 0.000022 | 0.000005 |
| | | 0.076373 | 0.082008 | 0.081561 | 0.081048 | 0.000240 | 0.000318 | 0.000295 | 0.000101 |
| LRT_d | | 0.056718 | 0.068336 | 0.067064 | 0.058557 | 0.000248 | 0.000260 | 0.000204 | 0.000139 |
| | | 0.055615 | 0.058761 | 0.059443 | 0.056363 | 0.000104 | 0.000120 | 0.000239 | 0.000162 |
| | | 0.051958 | 0.054015 | 0.055546 | 0.052000 | 0.000128 | 0.000075 | 0.000121 | 0.000079 |
| | | 0.052333 | 0.052450 | 0.051962 | 0.053095 | 0.000102 | 0.000166 | 0.000200 | 0.000119 |
| mHWP_d | | 0.053442 | 0.063423 | 0.062436 | 0.054951 | 0.000164 | 0.000132 | 0.000063 | 0.000096 |
| | | 0.052441 | 0.053680 | 0.053728 | 0.051717 | 0.000022 | 0.000047 | 0.000054 | 0.000050 |
| | | 0.045016 | 0.045112 | 0.046540 | 0.043593 | 0.000077 | 0.000032 | 0.000077 | 0.000079 |
| | | 0.051805 | 0.051895 | 0.050645 | 0.051822 | 0.000053 | 0.000086 | 0.000084 | 0.000052 |
| eLRT | | 0.052115 | 0.052281 | 0.054407 | 0.075478 | 0.000311 | 0.000155 | 0.000511 | 0.001738 |
| | | 0.052444 | 0.049599 | 0.052305 | 0.059940 | 0.000113 | 0.000149 | 0.000202 | 0.000639 |
| | | 0.051163 | 0.049736 | 0.050396 | 0.056037 | 0.000155 | 0.000060 | 0.000073 | 0.000300 |
| | | 0.054020 | 0.052456 | 0.051402 | 0.058749 | 0.000111 | 0.000155 | 0.000187 | 0.000327 |
| emHWP | | 0.049613 | 0.040647 | 0.027971 | 0.016156 | 0.000065 | 0.000012 | 0.000013 | <0.000001 |
| | | 0.048224 | 0.035411 | 0.024398 | 0.013660 | 0.000016 | 0.000002 | <0.000001 | <0.000001 |
| | | 0.050066 | 0.048976 | 0.038241 | 0.024400 | 0.000115 | 0.000024 | <0.000001 | <0.000001 |
| | | 0.052258 | 0.051763 | 0.049974 | 0.051433 | 0.000071 | 0.000102 | 0.000107 | 0.000059 |
Estimated type I error probability for test of deviation from HWP of SNP2, a causal SNP to primary disease but unassociated with secondary phenotype (MAF = 10%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.161340 | 0.154220 | 0.148260 | 0.148780 | 0.001358 | 0.001472 | 0.001255 | 0.001499 |
| | | 0.076030 | 0.074891 | 0.074848 | 0.073556 | 0.000307 | 0.000159 | 0.000388 | 0.000311 |
| | | 0.051226 | 0.050695 | 0.049906 | 0.051283 | 0.000158 | 0.000184 | 0.000166 | 0.000090 |
| | | 0.069704 | 0.071344 | 0.067986 | 0.070402 | 0.000182 | 0.000189 | 0.000176 | 0.000303 |
| mHWP_t | | 0.146860 | 0.141050 | 0.117140 | 0.085517 | 0.000991 | 0.000629 | 0.000140 | 0.000036 |
| | | 0.066561 | 0.068085 | 0.061517 | 0.045970 | 0.000224 | 0.000075 | 0.000052 | 0.000000 |
| | | 0.048454 | 0.051398 | 0.049869 | 0.044307 | 0.000104 | 0.000113 | 0.000036 | 0.000026 |
| | | 0.073637 | 0.079299 | 0.077750 | 0.078682 | 0.000257 | 0.000214 | 0.000161 | 0.000201 |
| LRT_d | | 0.052950 | 0.050386 | 0.050286 | 0.049250 | 0.000114 | 0.000102 | 0.000200 | 0.000075 |
| | | 0.050115 | 0.050408 | 0.050887 | 0.050550 | 0.000118 | 0.000105 | 0.000078 | 0.000099 |
| | | 0.051150 | 0.050612 | 0.051208 | 0.050205 | 0.000119 | 0.000088 | 0.000198 | 0.000122 |
| | | 0.051033 | 0.051331 | 0.049037 | 0.050975 | 0.000095 | 0.000080 | 0.000077 | 0.000127 |
| mHWP_d | | 0.052819 | 0.049958 | 0.050091 | 0.048436 | 0.000069 | 0.000092 | 0.000117 | 0.000029 |
| | | 0.050449 | 0.051010 | 0.052295 | 0.051333 | 0.000013 | 0.000060 | 0.000025 | 0.000030 |
| | | 0.047014 | 0.046632 | 0.046919 | 0.046702 | 0.000088 | 0.000086 | 0.000134 | 0.000111 |
| | | 0.052191 | 0.051853 | 0.050738 | 0.052515 | 0.000054 | 0.000060 | 0.000059 | 0.000088 |
| eLRT | | 0.051798 | 0.050953 | 0.053997 | 0.060978 | 0.000122 | 0.000115 | 0.000093 | 0.001070 |
| | | 0.049308 | 0.051280 | 0.052182 | 0.057070 | 0.000130 | 0.000166 | 0.000082 | 0.000243 |
| | | 0.051033 | 0.050921 | 0.050215 | 0.052377 | 0.000125 | 0.000134 | 0.000189 | 0.000223 |
| | | 0.052914 | 0.051716 | 0.049788 | 0.054130 | 0.000081 | 0.000079 | 0.000095 | 0.000312 |
| emHWP | | 0.051345 | 0.043123 | 0.033298 | 0.018485 | 0.000030 | 0.000009 | 0.000007 | <0.000001 |
| | | 0.046360 | 0.039182 | 0.029185 | 0.016736 | 0.000023 | 0.000018 | <0.000001 | <0.000001 |
| | | 0.050293 | 0.050347 | 0.040255 | 0.027978 | 0.000091 | 0.000057 | <0.000001 | 0.000041 |
| | | 0.051471 | 0.050799 | 0.049526 | 0.051332 | 0.000034 | 0.000060 | 0.000051 | 0.000072 |
Estimated type I error probability for test of deviation from HWP of SNP3, a causal SNP to secondary phenotype but unassociated with primary disease (MAF = 10%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.050153 | 0.050133 | 0.050911 | 0.052248 | 0.000080 | 0.000135 | 0.000112 | 0.000056 |
| | | 0.049309 | 0.050903 | 0.049561 | 0.051465 | 0.000120 | 0.000082 | 0.000117 | 0.000119 |
| | | 0.051165 | 0.049958 | 0.050187 | 0.051265 | 0.000108 | 0.000064 | 0.000067 | 0.000091 |
| | | 0.050114 | 0.050480 | 0.051380 | 0.050936 | 0.000130 | 0.000077 | 0.000128 | 0.000052 |
| mHWP_t | | 0.050368 | 0.048035 | 0.037602 | 0.024191 | 0.000043 | 0.000019 | 0.000016 | 0.000013 |
| | | 0.048566 | 0.050854 | 0.042486 | 0.031576 | 0.000099 | 0.000009 | <0.000001 | 0.000002 |
| | | 0.049050 | 0.050853 | 0.048492 | 0.041053 | 0.000122 | 0.000040 | 0.000034 | 0.000007 |
| | | 0.046458 | 0.049772 | 0.051727 | 0.049122 | 0.000094 | 0.000052 | 0.000084 | 0.000010 |
| LRT_d | | 0.058316 | 0.073940 | 0.072316 | 0.061562 | 0.000135 | 0.000253 | 0.000289 | 0.000271 |
| | | 0.054009 | 0.067947 | 0.066766 | 0.060957 | 0.000112 | 0.000250 | 0.000186 | 0.000199 |
| | | 0.053243 | 0.058409 | 0.061204 | 0.057918 | 0.000116 | 0.000142 | 0.000220 | 0.000099 |
| | | 0.051326 | 0.053231 | 0.054928 | 0.053418 | 0.000124 | 0.000075 | 0.000089 | 0.000086 |
| mHWP_d | | 0.054332 | 0.068308 | 0.066194 | 0.057182 | 0.000055 | 0.000132 | 0.000114 | 0.000135 |
| | | 0.051343 | 0.062151 | 0.061383 | 0.056378 | 0.000032 | 0.000063 | 0.000031 | 0.000086 |
| | | 0.045976 | 0.048570 | 0.050252 | 0.047917 | 0.000066 | 0.000104 | 0.000110 | 0.000072 |
| | | 0.050590 | 0.051410 | 0.053220 | 0.051586 | 0.000039 | 0.000015 | 0.000039 | 0.000050 |
| eLRT | | 0.051076 | 0.050677 | 0.054545 | 0.057184 | 0.000142 | 0.000101 | 0.000154 | 0.000071 |
| | | 0.049723 | 0.050855 | 0.051065 | 0.055910 | 0.000178 | 0.000066 | 0.000060 | 0.000084 |
| | | 0.051668 | 0.050298 | 0.050811 | 0.054281 | 0.000129 | 0.000053 | 0.000117 | 0.000073 |
| | | 0.051233 | 0.051140 | 0.051973 | 0.056249 | 0.000035 | 0.000061 | 0.000173 | 0.000159 |
| emHWP | | 0.049861 | 0.040333 | 0.028715 | 0.015904 | 0.000039 | 0.000021 | 0.000006 | <0.000001 |
| | | 0.047309 | 0.038727 | 0.026424 | 0.014201 | 0.000034 | 0.000002 | <0.000001 | <0.000001 |
| | | 0.050091 | 0.048795 | 0.039083 | 0.025750 | 0.000115 | 0.000026 | 0.000025 | <0.000001 |
| | | 0.050428 | 0.051245 | 0.051873 | 0.051315 | 0.000025 | 0.000028 | 0.000076 | 0.000043 |
Estimated type I error probability for test of deviation from HWP of SNP4, a SNP unassociated with secondary phenotype and primary disease (MAF = 10%), at 0.05 and 0.0001 significance levels in simulation studies* using different approaches for HWP testing.
| Approaches | | Significance level 0.05 | | | | Significance level 0.0001 | | | |
| | | 0.1 | 0.3 | 0.5 | 0.7 | 0.1 | 0.3 | 0.5 | 0.7 |
| LRT_t | | 0.049955 | 0.050560 | 0.050754 | 0.051529 | 0.000103 | 0.000092 | 0.000086 | 0.000119 |
| | | 0.051255 | 0.050142 | 0.050185 | 0.050944 | 0.000117 | 0.000074 | 0.000141 | 0.000090 |
| | | 0.050236 | 0.050098 | 0.050188 | 0.051949 | 0.000147 | 0.000081 | 0.000078 | 0.000095 |
| | | 0.050851 | 0.050228 | 0.050880 | 0.051135 | 0.000037 | 0.000165 | 0.000099 | 0.000092 |
| mHWP_t | | 0.048855 | 0.050711 | 0.042215 | 0.028967 | 0.000066 | 0.000045 | 0.000006 | <0.000001 |
| | | 0.049976 | 0.050795 | 0.046092 | 0.035414 | 0.000089 | 0.000044 | 0.000016 | 0.000009 |
| | | 0.047727 | 0.050971 | 0.049961 | 0.043757 | 0.000117 | 0.000043 | 0.000008 | <0.000001 |
| | | 0.045951 | 0.049392 | 0.051668 | 0.050732 | 0.000025 | 0.000169 | 0.000070 | 0.000046 |
| LRT_d | | 0.051172 | 0.049296 | 0.050029 | 0.050813 | 0.000096 | 0.000145 | 0.000066 | 0.000052 |
| | | 0.052033 | 0.050256 | 0.050089 | 0.050230 | 0.000070 | 0.000084 | 0.000031 | 0.000142 |
| | | 0.051063 | 0.049871 | 0.049610 | 0.051423 | 0.000151 | 0.000073 | 0.000099 | 0.000086 |
| | | 0.051395 | 0.050905 | 0.051639 | 0.050943 | 0.000083 | 0.000135 | 0.000115 | 0.000121 |
| mHWP_d | | 0.050153 | 0.047861 | 0.048359 | 0.049300 | 0.000076 | 0.000108 | 0.000074 | 0.000052 |
| | | 0.051980 | 0.050553 | 0.050566 | 0.050233 | 0.000033 | 0.000014 | 0.000026 | 0.000069 |
| | | 0.045581 | 0.044546 | 0.044106 | 0.045819 | 0.000101 | 0.000062 | 0.000115 | 0.000065 |
| | | 0.051886 | 0.050960 | 0.051851 | 0.051027 | 0.000031 | 0.000083 | 0.000074 | 0.000079 |
| eLRT | | 0.051528 | 0.051132 | 0.052837 | 0.055981 | 0.000132 | 0.000197 | 0.000075 | 0.000086 |
| | | 0.052157 | 0.049806 | 0.051373 | 0.053434 | 0.000089 | 0.000055 | 0.000124 | 0.000068 |
| | | 0.051175 | 0.050076 | 0.049524 | 0.052920 | 0.000138 | 0.000089 | 0.000126 | 0.000059 |
| | | 0.051641 | 0.051034 | 0.052126 | 0.051752 | 0.000098 | 0.000104 | 0.000099 | 0.000110 |
| emHWP | | 0.050713 | 0.044745 | 0.032836 | 0.020672 | 0.000013 | 0.000041 | <0.000001 | 0.000009 |
| | | 0.050183 | 0.041807 | 0.029977 | 0.017740 | 0.000007 | 0.000016 | <0.000001 | <0.000001 |
| | | 0.049308 | 0.049678 | 0.039879 | 0.029402 | 0.000125 | 0.000048 | 0.000036 | <0.000001 |
| | | 0.051104 | 0.050868 | 0.052230 | 0.051118 | 0.000030 | 0.000077 | 0.000064 | 0.000061 |
Numbers of SNPs rejected using different approaches for testing HWP in the case-control genetic association study of lung cancer frequency-matched on smoking behavior.
| | Number of rejections | | | |
| Significance | LRT_t | LRT_d | eLRT | emHWP |
| | 3445 | 3049 | 2949 | 2320 |
| | 1778 | 1405 | 1374 | 1057 |
| | 1501 | 1130 | 1116 | 847 |
| | 1121 | 812 | 798 | 637 |
| | 1031 | 743 | 734 | 580 |
| | 891 | 617 | 608 | 478 |
| | 825 | 568 | 568 | 453 |
| | 730 | 502 | 490 | 419 |