Santiago Rodriguez, Tom R Gaunt, Ian N M Day.
Abstract
Mendelian randomization (MR) permits causal inference between exposures and a disease. It can be compared with randomized controlled trials. Whereas in a randomized controlled trial the randomization occurs at entry into the trial, in MR the randomization occurs during gamete formation and conception. Several factors, including time since conception and sampling variation, are relevant to the interpretation of an MR test. Particularly important is consideration of the "missingness" of genotypes, which can arise by chance, from genotyping errors, or from clinical ascertainment. Testing for Hardy-Weinberg equilibrium (HWE) is a genetic approach that permits evaluation of missingness. In this paper, the authors demonstrate evidence of nonconformity with HWE in real data. They also perform simulations to characterize the sensitivity of HWE tests to missingness. Unresolved missingness could lead to a false rejection of causality in an MR investigation of trait-disease association. These results indicate that large-scale studies, very high quality genotyping data, and detailed knowledge of the life-course genetics of the alleles/genotypes studied will largely mitigate this risk. The authors also present a Web program (http://www.oege.org/software/hwe-mr-calc.shtml) for estimating possible missingness and an approach to evaluating missingness under different genetic models.
Year: 2009 PMID: 19126586 PMCID: PMC2640163 DOI: 10.1093/aje/kwn359
Source DB: PubMed Journal: Am J Epidemiol ISSN: 0002-9262 Impact factor: 4.897
Figure 1. Mendelian randomization and randomized controlled trials. Adapted from the paper by Hingorani and Humphries (12). HWE, Hardy-Weinberg equilibrium.
Measures of Dispersion for Sample Sizes of 5,000, 20,000, and 50,000 and for Allele Frequencies (p and q) Ranging From 0.05 to 0.95
| Sample Size | p | q | σ | 95% Confidence Interval, Lower Bound | 95% Confidence Interval, Upper Bound | μ | σ/μ |
| 5,000 | 0.05 | 0.95 | 3.53 | −6.92 | 6.92 | 250 | 0.0141 |
| 5,000 | 0.20 | 0.80 | 13.86 | −27.16 | 27.16 | 1,000 | 0.0139 |
| 5,000 | 0.50 | 0.50 | 30.62 | −60.01 | 60.01 | 2,500 | 0.0122 |
| 5,000 | 0.80 | 0.20 | 33.94 | −66.52 | 66.52 | 4,000 | 0.0085 |
| 5,000 | 0.95 | 0.05 | 20.98 | −41.11 | 41.11 | 4,750 | 0.0044 |
| 20,000 | 0.05 | 0.95 | 7.06 | −13.84 | 13.84 | 1,000 | 0.0071 |
| 20,000 | 0.20 | 0.80 | 27.71 | −54.32 | 54.32 | 4,000 | 0.0069 |
| 20,000 | 0.50 | 0.50 | 61.24 | −120.02 | 120.02 | 10,000 | 0.0061 |
| 20,000 | 0.80 | 0.20 | 67.88 | −133.05 | 133.05 | 16,000 | 0.0042 |
| 20,000 | 0.95 | 0.05 | 41.95 | −82.22 | 82.22 | 19,000 | 0.0022 |
| 50,000 | 0.05 | 0.95 | 11.17 | −21.89 | 21.89 | 2,500 | 0.0045 |
| 50,000 | 0.20 | 0.80 | 43.82 | −85.88 | 85.88 | 10,000 | 0.0044 |
| 50,000 | 0.50 | 0.50 | 96.82 | −189.78 | 189.78 | 25,000 | 0.0039 |
| 50,000 | 0.80 | 0.20 | 107.33 | −210.37 | 210.37 | 40,000 | 0.0027 |
| 50,000 | 0.95 | 0.05 | 66.33 | −130.01 | 130.01 | 47,500 | 0.0014 |
σ: standard deviation of the observed number of subjects in the sample.
μ: mean of the observed number of subjects in the sample.
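The dispersion values in the table above can be reproduced with a short calculation. The sketch below is an inference from the tabulated numbers, not a formula stated in this record: it assumes σ is the binomial standard deviation of a homozygote count with probability p², that the interval is ±1.96σ, and that μ = n·p. The function name `dispersion` is hypothetical.

```python
import math


def dispersion(n, p, z=1.96):
    """Return (sigma, lower, upper, mu, sigma/mu) for sample size n and allele frequency p.

    Assumptions (inferred from the tabulated values, not stated in the source):
      sigma = sqrt(n * p^2 * (1 - p^2))   binomial SD of a homozygote count
      mu    = n * p
    """
    sigma = math.sqrt(n * p**2 * (1 - p**2))
    half_width = z * sigma
    mu = n * p
    return sigma, -half_width, half_width, mu, sigma / mu


# Print one row per combination of sample size and allele frequency.
for n in (5_000, 20_000, 50_000):
    for p in (0.05, 0.20, 0.50, 0.80, 0.95):
        sigma, lower, upper, mu, ratio = dispersion(n, p)
        print(f"{n:>6} {p:.2f} {1 - p:.2f} {sigma:7.2f} "
              f"{lower:8.2f} {upper:7.2f} {mu:7.0f} {ratio:.4f}")
```

Under these assumptions, σ/μ shrinks as the sample size grows, which is the pattern the table reports.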
Deviation From Hardy-Weinberg Equilibrium for Sample Sizes of 5,000, 20,000, and 50,000 and for Allele Frequencies Ranging From 0.05 to 0.95, After Subtracting From the Homozygote 1 Group a Number Equivalent to 1.96 Standard Deviations
| Sample Size | ptrue | Hz1 True | Het True | Hz2 True | pfalse | Hz1 Exp | Het Exp | Hz2 Exp | Hz1 Obs | No Call | χ2 |
| 5,000 | 0.05 | 13 | 475 | 4,513 | 0.0487 | 12 | 462 | 4,519 | 6 | 7 | 3.65 |
| 5,000 | 0.20 | 200 | 1,600 | 3,200 | 0.1956 | 190 | 1,565 | 3,217 | 173 | 27 | 2.48 |
| 5,000 | 0.50 | 1,250 | 2,500 | 1,250 | 0.4939 | 1,205 | 2,470 | 1,265 | 1,190 | 60 | 0.75 |
| 5,000 | 0.80 | 3,200 | 1,600 | 200 | 0.7973 | 3,136 | 1,595 | 203 | 3,133 | 67 | 0.06 |
| 5,000 | 0.95 | 4,513 | 475 | 13 | 0.9496 | 4,471 | 475 | 13 | 4,471 | 41 | 0.00 |
| 20,000 | 0.05 | 50 | 1,900 | 18,050 | 0.0493 | 49 | 1,875 | 18,063 | 36 | 14 | 3.55 |
| 20,000 | 0.20 | 800 | 6,400 | 12,800 | 0.1978 | 781 | 6,330 | 12,835 | 746 | 54 | 2.42 |
| 20,000 | 0.50 | 5,000 | 10,000 | 5,000 | 0.4970 | 4,910 | 9,940 | 5,030 | 4,880 | 120 | 0.73 |
| 20,000 | 0.80 | 12,800 | 6,400 | 800 | 0.7987 | 12,672 | 6,389 | 805 | 12,667 | 133 | 0.06 |
| 20,000 | 0.95 | 18,050 | 1,900 | 50 | 0.9498 | 17,968 | 1,900 | 50 | 17,968 | 82 | 0.00 |
| 50,000 | 0.05 | 125 | 4,750 | 45,125 | 0.0496 | 123 | 4,710 | 45,145 | 103 | 22 | 3.52 |
| 50,000 | 0.20 | 2,000 | 16,000 | 32,000 | 0.1986 | 1,969 | 15,890 | 32,055 | 1,914 | 86 | 2.40 |
| 50,000 | 0.50 | 12,500 | 25,000 | 12,500 | 0.4981 | 12,358 | 24,905 | 12,548 | 12,310 | 190 | 0.73 |
| 50,000 | 0.80 | 32,000 | 16,000 | 2,000 | 0.7992 | 31,798 | 15,983 | 2,008 | 31,790 | 210 | 0.06 |
| 50,000 | 0.95 | 45,125 | 4,750 | 125 | 0.9499 | 44,995 | 4,749 | 125 | 44,995 | 130 | 0.00 |
Abbreviations: Exp, expected; Het, heterozygote; Hz, homozygote; Obs, observed.
ptrue are the allele frequencies used to start the simulations. They were also used to estimate the true frequencies of homozygotes (Hz1 True and Hz2 True) and heterozygotes (Het True) assuming Hardy-Weinberg equilibrium.
pfalse are the allele frequencies observed after subtracting 1.96 standard deviations from the homozygote 1 group.
Hz1 Exp, Het Exp, and Hz2 Exp are the expected values of the 3 genotype groups computed from pfalse and qfalse (1 − pfalse).
Hz1 Obs are the observed values for each genotype group after subtracting 1.96 standard deviations from the homozygote 1 group (which equates to the number presented in the “No Call” column). The observed values of the other 2 genotype groups used to compute the deviations from Hardy-Weinberg equilibrium (Het Obs and Hz2 Obs) are equal to Het True and Hz2 True, respectively.
Hardy-Weinberg equilibrium χ2 value.
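The footnotes above describe the simulation step by step, and it can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names are hypothetical, counts are kept unrounded, and the standard deviation of the homozygote 1 count is assumed to be the binomial value sqrt(n·p²·(1 − p²)).

```python
import math


def hwe_chi2(hz1, het, hz2):
    """Return (p, chi2) for observed genotype counts.

    p is the allele frequency estimated from the counts; chi2 is the
    Pearson goodness-of-fit statistic against Hardy-Weinberg proportions.
    """
    n = hz1 + het + hz2
    p = (2 * hz1 + het) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (hz1, het, hz2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return p, chi2


def drop_hz1(n, p_true, z=1.96):
    """Remove z standard deviations of subjects from the homozygote 1 group.

    Returns (no_call, p_false, chi2). The binomial SD used here is an
    assumption inferred from the tabulated values.
    """
    q_true = 1 - p_true
    hz1 = n * p_true**2
    het = 2 * n * p_true * q_true
    hz2 = n * q_true**2
    no_call = z * math.sqrt(n * p_true**2 * (1 - p_true**2))
    p_false, chi2 = hwe_chi2(hz1 - no_call, het, hz2)
    return no_call, p_false, chi2


no_call, p_false, chi2 = drop_hz1(5_000, 0.05)
print(f"no call = {no_call:.1f}, pfalse = {p_false:.4f}, chi2 = {chi2:.2f}")
```

Because the number removed scales with the square root of n while the genotype counts scale with n, the χ2 value for a given allele frequency changes little across sample sizes.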
Deviation From Hardy-Weinberg Equilibrium for Sample Sizes of 5,000, 20,000, and 50,000 and for Allele Frequencies Ranging From 0.05 to 0.95, After Subtracting From the Heterozygote Group a Number Equivalent to 1.96 Standard Deviations
| Sample Size | ptrue | Hz1 True | Het True | Hz2 True | pfalse | Hz1 Exp | Het Exp | Hz2 Exp | Het Obs | No Call | χ2 |
| 5,000 | 0.05 | 13 | 475 | 4,513 | 0.0463 | 11 | 438 | 4,511 | 434 | 41 | 0.36 |
| 5,000 | 0.20 | 200 | 1,600 | 3,200 | 0.1961 | 190 | 1,556 | 3,190 | 1,535 | 65 | 0.86 |
| 5,000 | 0.50 | 1,250 | 2,500 | 1,250 | 0.5000 | 1,233 | 2,465 | 1,233 | 2,431 | 69 | 0.97 |
| 5,000 | 0.80 | 3,200 | 1,600 | 200 | 0.8039 | 3,190 | 1,556 | 190 | 1,535 | 65 | 0.86 |
| 5,000 | 0.95 | 4,513 | 475 | 13 | 0.9537 | 4,511 | 438 | 11 | 434 | 41 | 0.36 |
| 20,000 | 0.05 | 50 | 1,900 | 18,050 | 0.0482 | 46 | 1,826 | 18,046 | 1,819 | 81 | 0.34 |
| 20,000 | 0.20 | 800 | 6,400 | 12,800 | 0.1980 | 779 | 6,312 | 12,779 | 6,271 | 129 | 0.85 |
| 20,000 | 0.50 | 5,000 | 10,000 | 5,000 | 0.5000 | 4,965 | 9,931 | 4,965 | 9,861 | 139 | 0.97 |
| 20,000 | 0.80 | 12,800 | 6,400 | 800 | 0.8020 | 12,779 | 6,312 | 779 | 6,271 | 129 | 0.85 |
| 20,000 | 0.95 | 18,050 | 1,900 | 50 | 0.9518 | 18,046 | 1,826 | 46 | 1,819 | 81 | 0.34 |
| 50,000 | 0.05 | 125 | 4,750 | 45,125 | 0.0488 | 119 | 4,634 | 45,119 | 4,621 | 129 | 0.34 |
| 50,000 | 0.20 | 2,000 | 16,000 | 32,000 | 0.1988 | 1,967 | 15,861 | 31,967 | 15,796 | 204 | 0.84 |
| 50,000 | 0.50 | 12,500 | 25,000 | 12,500 | 0.5000 | 12,445 | 24,890 | 12,445 | 24,781 | 219 | 0.96 |
| 50,000 | 0.80 | 32,000 | 16,000 | 2,000 | 0.8012 | 31,967 | 15,861 | 1,967 | 15,796 | 204 | 0.84 |
| 50,000 | 0.95 | 45,125 | 4,750 | 125 | 0.9512 | 45,119 | 4,634 | 119 | 4,621 | 129 | 0.34 |
Abbreviations: Exp, expected; Het, heterozygote; Hz, homozygote; Obs, observed.
ptrue are the allele frequencies used to start the simulations. They were also used to estimate the true frequencies of homozygotes (Hz1 True and Hz2 True) and heterozygotes (Het True) assuming Hardy-Weinberg equilibrium.
pfalse are the allele frequencies observed after subtracting 1.96 standard deviations from the heterozygote group.
Hz1 Exp, Het Exp, and Hz2 Exp are the expected values of the 3 genotype groups computed from pfalse and qfalse (1 − pfalse).
Het Obs are the observed values for each genotype group after subtracting 1.96 standard deviations from the heterozygote group (which equates to the number presented in the “No Call” column). The observed values of the other 2 genotype groups used to compute the deviations from Hardy-Weinberg equilibrium (Hz1 Obs and Hz2 Obs) are equal to Hz1 True and Hz2 True, respectively.
Hardy-Weinberg equilibrium χ2 value.
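The heterozygote case differs only in which group loses subjects and in the standard deviation used. In the sketch below (hypothetical function name, unrounded counts), the standard deviation of the heterozygote count is assumed to be the binomial value sqrt(n·2pq·(1 − 2pq)); this is inferred from the "No Call" column rather than stated in this record.

```python
import math


def drop_het(n, p_true, z=1.96):
    """Remove z standard deviations of subjects from the heterozygote group.

    Returns (no_call, p_false, chi2), where chi2 is the Hardy-Weinberg
    goodness-of-fit statistic computed on the remaining counts.
    """
    q_true = 1 - p_true
    hz1 = n * p_true**2
    het = 2 * n * p_true * q_true
    hz2 = n * q_true**2

    # Assumed binomial SD of the heterozygote count.
    f_het = 2 * p_true * q_true
    no_call = z * math.sqrt(n * f_het * (1 - f_het))
    het_obs = het - no_call

    # Re-estimate the allele frequency from the depleted sample.
    n_obs = hz1 + het_obs + hz2
    p_false = (2 * hz1 + het_obs) / (2 * n_obs)
    q_false = 1 - p_false

    expected = (n_obs * p_false**2,
                2 * n_obs * p_false * q_false,
                n_obs * q_false**2)
    observed = (hz1, het_obs, hz2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return no_call, p_false, chi2


for p in (0.05, 0.50, 0.95):
    no_call, p_false, chi2 = drop_het(5_000, p)
    print(f"p = {p:.2f}: no call = {no_call:.1f}, "
          f"pfalse = {p_false:.4f}, chi2 = {chi2:.2f}")
```

Removing heterozygotes treats the two alleles symmetrically, so allele frequencies p and 1 − p give the same χ2, as the paired rows of the table show.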
Hardy-Weinberg χ2 Values (in Parentheses) for Particular Combinations of Allele Frequency, Gain/Loss of Homozygotes, and Sample Size
| Allele Frequency | Gain/Loss, % | Sample Size |
| 0.05 | ±5 | 5,000 (0.03), 100,000 (0.57) |
| 0.05 (0.03), 0.50 (0.81) | ±5 | 5,000 |
| 0.05 | ±5 (0.03), ±20 (0.46) | 5,000 |
Hardy-Weinberg χ2 Values (in Parentheses) for Particular Combinations of Allele Frequency, Gain/Loss of Heterozygotes, and Sample Size
| Allele Frequency | Gain/Loss, % | Sample Size |
| 0.05 | ±5 | 5,000 (0.12), 100,000 (2.36) |
| 0.05 (0.12), 0.50 (3.14) | ±5 | 5,000 |
| 0.05 | ±5 (0.12), ±20 (2.23) | 5,000 |
Figure 2. Q-Q plots comparing the expected and observed χ2 values for controls from the Wellcome Trust Case Control Consortium (WTCCC) (6) and one of the 7 case collections (cases of bipolar disorder). The Q-Q plots (which are similar) for all of the other 6 case collections can be seen in Web Figure 2 (http://aje.oxfordjournals.org/). Also shown is a Q-Q plot comparing the expected and observed χ2 values for 30 studies of the apolipoprotein E (APOE) polymorphism taken from the literature and a Q-Q plot comparing the expected and observed χ2 values for Framingham Study (7) prostate cancer cases. A) Q-Q plot for WTCCC controls; B) Q-Q plot for WTCCC bipolar disorder cases; C) Q-Q plot for the 30 APOE studies; D) Q-Q plot for the Framingham prostate cancer cases.
Effect on Mendelian Randomization of Both Deviations From Hardy-Weinberg Equilibrium and Additions of Missing Persons to Conform to Perfect Hardy-Weinberg Equilibrium
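| Original Study: MR Probability | Original Study: HWE χ2 | Addition of Missing Persons (Mean BMI): MR Probability | Addition of Missing Persons (Mean BMI): HWE χ2 | Addition of Missing Persons (BMI ≥ 75th Percentile): MR Probability | Addition of Missing Persons (BMI ≥ 75th Percentile): HWE χ2 | Addition of Missing Persons (BMI ≥ 95th Percentile): MR Probability | Addition of Missing Persons (BMI ≥ 95th Percentile): HWE χ2 |
| 0.0697 | 3.32 | 0.0477 | 0 | 2.91 × 10^−16 | 0 | 7.54 × 10^−30 | 0 |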
Abbreviations: HWE, Hardy-Weinberg equilibrium; MR, Mendelian randomization.
Analyses were carried out in a study of 10,000 persons with information on a genetic marker, an intermediate trait, and a disease outcome.
Addition of 507 homozygote 1 subjects with a diseased status and a mean body mass index (weight (kg)/height (m)²) equal to that of subjects in the original study.
Addition of 507 homozygote 1 subjects with a diseased status and a body mass index greater than or equal to the highest 75th percentile of body mass index for subjects in the original study.
Addition of 507 homozygote 1 subjects with a diseased status and a body mass index greater than or equal to the highest 95th percentile of body mass index for subjects in the original study.
Figure 3. Illustration of output from the Web tool (see Materials and Methods section and http://www.oege.org/software/hwe-mr-calc.shtml) developed for Hardy-Weinberg equilibrium (HWE) analysis with estimations of possible ascertainment bias. In this example, the user input was genotype group counts of 5,236 (commoner homozygotes), 4,050 (heterozygotes), and 714 (rarer homozygotes). The output χ2 value of 3.32 approaches but does not reach statistical significance (at P < 0.05). The table of solutions for perfect HWE (χ2 = 0) shows, in red, the count that would be necessary if one of the 3 groups were accordingly adjusted. The differences from observed counts identify the possible missingnesses or excesses, which can then be considered in a subsequent genotype-phenotype or Mendelian randomization analysis. Hz, homozygotes; SNPs, single nucleotide polymorphisms. (TRG, SR, and INMD represent the authors’ initials.)
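The calculation described in the Figure 3 caption can be approximated offline. The sketch below is not the Web tool's code; it is a minimal illustration that assumes a 1-degree-of-freedom Pearson χ2 test and uses the standard identity that genotype counts are in exact HWE when Het² = 4 · Hz1 · Hz2. The function names `hwe_test` and `perfect_hwe_solutions` are hypothetical.

```python
import math


def hwe_test(hz1, het, hz2):
    """Return (chi2, p_value) for a Hardy-Weinberg goodness-of-fit test (1 df)."""
    n = hz1 + het + hz2
    p = (2 * hz1 + het) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (hz1, het, hz2)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def perfect_hwe_solutions(hz1, het, hz2):
    """Count each group would need, holding the other two fixed, for chi2 = 0.

    Uses the identity het^2 = 4 * hz1 * hz2, which holds in exact HWE.
    """
    return {
        "hz1": het**2 / (4 * hz2),
        "het": 2 * math.sqrt(hz1 * hz2),
        "hz2": het**2 / (4 * hz1),
    }


counts = (5236, 4050, 714)  # example input quoted in the Figure 3 caption
chi2, p_value = hwe_test(*counts)
print(f"chi2 = {chi2:.2f}, P = {p_value:.3f}")

for group, needed in perfect_hwe_solutions(*counts).items():
    observed = dict(zip(("hz1", "het", "hz2"), counts))[group]
    print(f"{group}: needed {needed:.0f}, difference {needed - observed:+.0f}")
```

For the caption's example, adjusting only the commoner homozygote group calls for roughly 507 additional subjects, which matches the number of homozygote 1 subjects added in the Mendelian randomization table footnotes.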
Figure 4. Steps needed to utilize Hardy-Weinberg equilibrium (HWE) testing for possible biologic ascertainment in genotype-phenotype or Mendelian randomization (MR) studies. HWD, Hardy-Weinberg disequilibrium.