Duncan C Thomas, David V Conti, James Baurley, Frederik Nijhout, Michael Reed, Cornelia M Ulrich.
Abstract
Candidate gene studies are generally motivated by some form of pathway reasoning in the selection of genes to be studied, but seldom has the logic of the approach been carried through to the analysis. Marginal effects of polymorphisms in the selected genes, and occasionally pairwise gene–gene or gene–environment interactions, are often presented, but a unified approach to modelling the entire pathway has been lacking. In this review, a variety of approaches to this problem is considered, focusing on hypothesis-driven rather than purely exploratory methods. Empirical modelling strategies are based on hierarchical models that allow prior knowledge about the structure of the pathway and the various reactions to be included as ‘prior covariates’. By contrast, mechanistic models aim to describe the reactions through a system of differential equations with rate parameters that can vary between individuals, based on their genotypes. Some ways of combining the two approaches are suggested, and Bayesian model averaging methods for dealing with uncertainty about the true model form in either framework are discussed. Biomarker measurements can be incorporated into such analyses, and two-phase sampling designs stratified on some combination of disease, genes and exposures can be an efficient way of obtaining data that would be too expensive or difficult to obtain on a full candidate gene sample. The review concludes with some thoughts about potential uses of pathways in genome-wide association studies.
Year: 2009 PMID: 21072972 PMCID: PMC2999471 DOI: 10.1186/1479-7364-4-1-21
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Figure 1. Biochemical diagram of folate metabolism (reproduced with permission from Reed [14]). AICART, aminoimidazolecarboxamide ribonucleotide transferase; BHMT, betaine-homocysteine methyltransferase; CBS, cystathionine β-synthase; DHFR, dihydrofolate reductase; DNMT, DNA-methyltransferase; dTMP, thymidine monophosphate; FTD, 10-formyltetrahydrofolate dehydrogenase; FTS, 10-formyltetrahydrofolate synthase; GAR, glycinamide ribotide; G-NMT, glycine N-methyltransferase; HCOOH, formic acid; H2C=O, formaldehyde; HCY, homocysteine; MAT, methionine adenosyl transferase; MS, methionine synthase; MTCH, 5,10-methylenetetrahydrofolate cyclohydrolase; MTD, 5,10-methylenetetrahydrofolate dehydrogenase; MTHFR, 5,10-methylenetetrahydrofolate reductase; NE, non-enzymatic; PGT, phosphoribosyl glycinamide transferase; SAH, S-adenosylhomocysteine; SAHH, SAH hydrolase; SAM, S-adenosylmethionine; SHMT, serine hydroxymethyltransferase; THF, tetrahydrofolate; 5m-THF, 5-methylTHF; 5,10-CH2-THF, 5,10-methyleneTHF; 10f-THF, 10-formylTHF; TS, thymidylate synthase.
Marginal odds ratios (ORs) for the association of each gene with disease under various choices of reaction rates or intermediate metabolite concentrations as the causal risk factor (ORs are expressed relative to the low enzyme activity rate genotype)
| Genes | Simulated causal intermediate variable (β = 2 per SD) | | | |
|---|---|---|---|---|
| | Homocysteine concentration | Pyrimidine synthesis | Purine synthesis | DNA methylation |
| | 1.012 | 0.963 | 0.978 | 0.988 |
| | 0.910*** | 0.437*** | 1.129*** | 1.103*** |
| | 0.753*** | 2.451*** | 0.540*** | 1.659*** |
| | 0.793*** | 1.805*** | 0.532*** | 1.372*** |
| | 1.059 | 0.863** | 0.200*** | 0.950 |
| | 1.044 | 0.989 | 0.972 | 0.963 |
| | 0.969 | 1.009 | 0.713*** | 1.077* |
| | 1.048 | 0.899*** | 1.709*** | 0.957 |
| | 1.256*** | 0.558*** | 0.592*** | 0.639*** |
| | 1.298*** | 1.073* | 1.153*** | 0.573*** |
| | 1.197*** | 0.790*** | 0.736*** | 0.815*** |
| | 0.428*** | 1.097** | 1.108** | 0.564*** |
| | 2.753*** | 1.073* | 1.028 | 0.925* |
| | 0.998 | 1.013 | 1.014 | 0.999 |
| 1. Intracellular folate | 0.790*** | 1.783*** | 1.961*** | 1.543*** |
| 2. Methionine intake | 3.819*** | 1.226*** | 1.112** | 1.342*** |
* p < 0.05; ** p < 0.01; *** p < 0.001
AICART, aminoimidazolecarboxamide ribonucleotide transferase; CBS, cystathionine β-synthase; DHFR, dihydrofolate reductase; FTD, 10-formyltetrahydrofolate dehydrogenase; FTS, 10-formyltetrahydrofolate synthase; MAT, methionine adenosyl transferase; MS, methionine synthase; MTCH, 5,10-methylenetetrahydrofolate cyclohydrolase; MTD, 5,10-methylenetetrahydrofolate dehydrogenase; MTHFR, 5,10-methylenetetrahydrofolate reductase; PGT, phosphoribosyl glycinamide transferase; SAHH, S-adenosyl-homocysteine hydrolase; SHMT, serine hydroxymethyltransferase; TS, thymidylate synthase.
Multiple stepwise logistic regression models, including only main effects or main effects and G × G/G × E interaction terms for four different choices of the causal variable (gene names are given in Table 1; E1 = intracellular folate concentration; E2 = methionine intake)
| Simulated causal risk factor | |||||||
|---|---|---|---|---|---|---|---|
| Homocysteine concentration | Pyrimidine synthesis | Purine synthesis | DNA methylation | ||||
+p < 0.05; ++p < 0.01; +++p < 0.001 for positive associations; -, - -, - - - denote corresponding levels of significance for negative associations. AICART, aminoimidazolecarboxamide ribonucleotide transferase; DNMT, DNA methyltransferase; TS, thymidylate synthase.
Summary of hierarchical modelling fits (parameter estimates [SEs]) for selected genetic effects (βG), prior covariates (Z'π) and prior correlations (σ²A) for simulation with homocysteine concentration as the causal variable
| | No prior | N(0, σ²) | N(Z'π, σ²) | N(0, σ²e^(Z'ψ)) | N(Z'π, σ²e^(Z'ψ)) | N(Z'π, σ²(I - ρA)⁻¹) |
|---|---|---|---|---|---|---|
| G3: | -0.370 (0.113) | -0.341 (0.111) | -0.352 (0.109) | -0.327 (0.112) | -0.340 (0.109) | -0.106 (0.114) |
| G4: | -0.258 (0.133) | -0.229 (0.123) | -0.245 (0.124) | -0.216 (0.120) | -0.207 (0.117) | -0.304 (0.127) |
| G9: | 0.300 (0.121) | 0.282 (0.112) | 0.272 (0.112) | 0.155 (0.109) | 0.336 (0.079) | 0.128 (0.112) |
| G10: | 0.335 (0.116) | 0.301 (0.110) | 0.313 (0.113) | 0.293 (0.110) | 0.245 (0.116) | 0.198 (0.114) |
| G11: | 0.240 (0.112) | 0.206 (0.106) | 0.219 (0.114) | 0.106 (0.095) | 0.190 (0.102) | 0.060 (0.112) |
| G12: | -0.809 (0.131) | -0.735 (0.126) | -0.760 (0.135) | -0.752 (0.128) | -0.760 (0.125) | -0.648 (0.127) |
| G13: | 1.492 (0.134) | 1.360 (0.123) | 1.417 (0.129) | 1.400 (0.129) | 1.419 (0.133) | 1.149 (0.123) |
| π1: folate (mean) | | | -0.020 (0.638) | | -0.668 (0.192) | -0.641 (0.453) |
| π2: methionine (mean) | | | 0.115 (0.480) | | -0.266 (0.151) | 0.203 (0.377) |
| π3: connections (mean) | | | 0.000 (0.092) | | 0.113 (0.027) | 0.022 (0.030) |
| ψ1: folate (variance) | | | | 0.171 (0.337) | -0.312 (1.536) | |
| ψ2: methionine (variance) | | | | -0.181 (0.327) | -0.913 (1.259) | |
| ψ3: connections (variance) | | | | 0.472 (0.185) | 0.901 (0.390) | |
| σβ = SD(β\| | | 0.492 (0.109) | 0.658 (0.143) | 1.175 (0.491) | 2.362 (2.714) | 0.877 (0.285) |
| σπ = SD(π) | | | 0.864 (0.385) | | 0.858 (0.356) | 0.889 (0.403) |
| σψ = SD(ψ) | | | | 0.342 (0.059) | 1.350 (1.010) | |
| ρ = corr(β\| | | | | | | 0.597 (0.170) |
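The effect of a second-level prior such as N(0, σ²) or N(Z'π, σ²) on a single first-level estimate can be illustrated with a conjugate normal-normal calculation. The following is only a minimal sketch: it assumes the prior mean and prior SD are known, whereas the fits in the table estimate them jointly with the genetic effects, so the table values will not be reproduced exactly. The function name `shrink` is hypothetical.

```python
def shrink(beta_hat, se, prior_mean, prior_sd):
    """Posterior mean and SD of beta for the model
    beta_hat ~ N(beta, se^2), beta ~ N(prior_mean, prior_sd^2),
    with prior_mean and prior_sd treated as known (an assumption)."""
    # Weight given to the data; the remainder goes to the prior mean.
    w = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    post_mean = w * beta_hat + (1.0 - w) * prior_mean
    # Posterior variance is se^2 * prior_sd^2 / (se^2 + prior_sd^2) = w * se^2.
    post_sd = (w * se ** 2) ** 0.5
    return post_mean, post_sd


# Illustration with the 'No prior' estimate for G3 and the prior SD
# reported for the N(0, sigma^2) model.
mean_g3, sd_g3 = shrink(beta_hat=-0.370, se=0.113, prior_mean=0.0, prior_sd=0.492)
print(round(mean_g3, 3), round(sd_g3, 3))
```

The estimate moves toward the prior mean and its SD decreases, which is the qualitative pattern seen when moving from the 'No prior' column to the columns with priors.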
Figure 2. Schematic representation of a simplified one-compartment model.
Results of Markov chain Monte Carlo fitting of single-compartment models with homocysteine as an unobserved intermediate metabolite, created at a rate depending on SAHH (λ) and removed at a rate depending on CBS (μ), applied to the simulation taking homocysteine concentration as the causal variable
| Model | ln(β) | λ mean | λ SD | μ mean | μ SD |
|---|---|---|---|---|---|
| λ, μ fixed | 1.63 (0.13) | 0.152 (0.030) | 0 | 0.226 (0.031) | 0 |
| λ, μ random | 1.55 (0.13) | 0.178 (0.029) | 0.079 (0.024) | 0.256 (0.037) | 0.082 (0.021) |
| λ, μ fixed: ln(μ0/λ0) | 2.77 (0.13) | 0 | 0 | 0.765 (0.037) | 0 |
| γs random: ln(μ0/λ0) ln[ | 2.99 (0.06) | 0 | 0 | 1.022 (0.002) | 0.007 (0.001) |
| | 1.22 (0.07) | 0.209 (0.016) | 0 | 0.273 (0.014) | 0 |
| | 3.16 (0.06) | 0.161 (0.011) | 0 | 0.215 (0.010) | 0 |
CBS, cystathionine β-synthase; SAHH, S-adenosylhomocysteine hydrolase.
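A single-compartment model with linear kinetics, in which the metabolite is created at rate λ and removed at rate μ per unit concentration, has the steady-state concentration λ/μ. The sketch below illustrates this under stated assumptions: the log-linear genotype effect on each rate, the parameter values, and the names `rate`, `steady_state` and `simulate` are hypothetical and are not taken from the fitted models in the table.

```python
import math


def rate(baseline, gamma, genotype):
    """Genotype-specific rate, assuming a log-linear genotype effect
    (hypothetical parameterisation): baseline * exp(gamma * genotype)."""
    return baseline * math.exp(gamma * genotype)


def steady_state(lam, mu):
    """Steady state of dH/dt = lam - mu * H."""
    return lam / mu


def simulate(lam, mu, h0=0.0, dt=0.01, t_end=100.0):
    """Forward-Euler integration of dH/dt = lam - mu * H from H(0) = h0."""
    h = h0
    for _ in range(int(t_end / dt)):
        h += dt * (lam - mu * h)
    return h


# Illustrative values only: creation rate modified by one genotype,
# removal rate modified by another.
lam = rate(baseline=1.0, gamma=0.2, genotype=1)
mu = rate(baseline=0.5, gamma=0.3, genotype=0)
print(steady_state(lam, mu), simulate(lam, mu))
```

In this sketch only the ratio λ/μ determines the long-term concentration, so a higher creation rate and a lower removal rate shift it in the same direction.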
Summary of hierarchical modelling fits for selected genetic effects (βG), prior covariates (Z'π) and prior standard deviations (σβ and σπ) for simulation with different intermediates as the causal variable, using the Z matrix derived from independent data from the same simulation model (see text).
| Simulated causal variable | ||||
|---|---|---|---|---|
| Homocysteine concentration | Pyrimidine synthesis ( | Purine synthesis ( | DNA methylation ( | |
| 0.06 (0.10) | 0.15 (0.11) | 0.12 (0.10) | ||
| -0.26 (0.14) | ||||
| 0.01 (0.05) | 0.05 (0.18) | -0.04 (0.16) | ||
| 0.13 (0.12) | 0.06 (0.14) | 0.11 (0.12) | ||
| -0.12 (0.13) | ||||
| -0.04 (0.10) | 0.13 (0.11) | |||
| -0.18 (0.10) | -0.13 (0.10) | |||
| - | 0.22 (0.13) | |||
| 0.07 (0.12) | 0.06 (0.12) | 0.01 (0.11) | ||
| π1: homocysteine | -0.16 (0.63) | -0.02 (0.70) | 0.00 (0.57) | |
| π2: v | -0.18 (0.57) | 0.08 (0.62) | 0.17 (0.54) | |
| π3: v | 0.11 (0.59) | -0.35 (0.61) | -0.28 (0.55) | |
| π4: v | -0.01 (0.69) | 0.27 (0.69) | -0.01 (0.70) | 1.01 (0.73) |
| σβ = SD(β| | 0.49 (0.12) | 0.48 (0.11) | 0.52 (0.11) | 0.47 (0.11) |
| σπ = SD(π) | 1.14 (0.48) | 1.14 (0.47) | 1.33 (0.58) | 0.99 (0.40) |
Bolded entries have posterior credibility intervals that exclude zero
AICART, aminoimidazolecarboxamide ribonucleotide transferase; CBS, cystathionine β-synthase; DNMT, DNA methyltransferase; FTD, 10-formyltetrahydrofolate dehydrogenase; FTS, 10-formyltetrahydrofolate synthase; MS, methionine synthase; MTCH, 5,10-methylenetetrahydrofolate cyclohydrolase; MTD, 5,10-methylenetetrahydrofolate dehydrogenase; MTHFR, 5,10-methylenetetrahydrofolate reductase; PGT, phosphoribosyl glycinamide transferase; SAHH, S-adenosylhomocysteine hydrolase; SHMT, serine hydroxymethyltransferase; TS, thymidylate synthase.
Mendelian randomization estimates of the effect of homocysteine on disease risk
| Analysis | |||
|---|---|---|---|
| Direct: | -- | -- | 2.57 (0.22) |
| 0.216 (0.079) | 0.112 (0.142) | 0.52 (0.68) | |
| -0.633 (0.088) | -0.801 (0.175) | 1.27 (0.33) | |
| 0.917 (0.074) | 0.995 (0.166) | 1.09 (0.20) | |
| -- | 1.32 (0.16) | ||
| 1.33 (0.05) | 1.28 (0.20) | ||
| -- | 1.31 (0.20) | ||
| -- | -0.04 (0.95) | 1.92 (0.15) |
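For the rows that report three values, the third appears to be the ratio of the second coefficient to the first, with a standard error from a first-order delta-method approximation. The sketch below assumes (the column headings are not shown, so this is an assumption) that the first value is a gene-homocysteine coefficient and the second a gene-disease coefficient, and that the two estimates are uncorrelated. It takes the first coefficient of the second such row as -0.633, the value consistent with the reported ratio. The function name `wald_ratio` is hypothetical.

```python
import math


def wald_ratio(beta_gx, se_gx, beta_gy, se_gy):
    """Ratio estimate beta_gy / beta_gx with a first-order delta-method SE,
    assuming the two coefficient estimates are uncorrelated."""
    ratio = beta_gy / beta_gx
    se = abs(ratio) * math.sqrt((se_gy / beta_gy) ** 2 + (se_gx / beta_gx) ** 2)
    return ratio, se


# (first coefficient, SE, second coefficient, SE) from three table rows.
rows = [
    (0.216, 0.079, 0.112, 0.142),
    (-0.633, 0.088, -0.801, 0.175),
    (0.917, 0.074, 0.995, 0.166),
]
for row in rows:
    ratio, se = wald_ratio(*row)
    print(f"{ratio:.2f} ({se:.2f})")
# prints:
# 0.52 (0.68)
# 1.27 (0.33)
# 1.09 (0.20)
```

The ratio is unstable when the first coefficient is close to zero relative to its SE, which is reflected in the wide SE of the first row.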
Figure 3. Hierarchical clustering of folate genes based on 184 GO terms.
Figure 4. Top-ranking topologies without incorporating priors: left, genes only; right, genes and exposures. With no priors, the two topologies have posterior probabilities of 3.9 per cent and 2.3 per cent, respectively. Using a topology derived by hierarchical clustering of the A matrix from simulated data, the top-ranked gene-only topology was identical to that shown on the left, with a posterior probability of 9.5 per cent. Using the GO topology shown in Figure 3, the same genes were included, but reordered as (((MTHFR, SAHH), MTD), SHMT), with a posterior probability of 6.4 per cent.
Estimated log relative risk per unit change of true long-term homocysteine concentrations, treated as a latent variable in a single compartment linear-kinetics model; data simulated assuming homocysteine is the causal variable.
| Sampling | Subsample | Biomarker(s) measured: Homocysteine | Biomarker(s) measured: TS enzyme | Biomarker(s) measured: Both |
|---|---|---|---|---|
| Random | 80 | 1.92 (0.21) | - | 2.68 (0.46) |
| | | - | 2.71 (0.43) | -0.04 (0.20) |
| | 200 | 1.82 (0.15) | - | 2.28 (0.21) |
| | | - | 3.26 (0.54) | -0.04 (0.14) |
| Stratified by G, E, and Y | 8 × 10 = 80 | 1.77 (0.25) | - | 2.62 (0.94) |
| | | - | 2.47 (0.64) | -0.05 (0.28) |
| | 8 × 25 = 200 | 1.82 (0.16) | - | 2.03 (0.20) |
| | | - | 2.93 (0.39) | -0.14 (0.14) |

The simulated coefficients are 2.0 for homocysteine and 0 for TS.