Joong-Ho Won, Georg Ehret, Aravinda Chakravarti, Richard A Olshen.
Abstract
Though recently they have fallen into some disrepute, genome-wide association studies (GWAS) have been formulated and applied to understanding essential hypertension. The principal goal here is to use data gathered in a GWAS to gauge the extent to which single nucleotide polymorphisms (SNPs) and their interactions with other features can be combined to predict mean arterial blood pressure (MAP) in 3138 pre-menopausal and naturally post-menopausal white women. More precisely, we quantify the extent to which data as described permit prediction of MAP beyond what is possible from traditional risk factors such as blood cholesterol levels and glucose levels. Of course, these traditional risk factors are genetic, though typically not explicitly so. In all, there were 44 such risk factors/clinical variables measured and 377,790 SNPs genotyped. Data for the women we studied are from first-visit measurements taken as part of the Atherosclerosis Risk in Communities (ARIC) study. We begin by assessing non-SNP features in their abilities to predict MAP, employing a novel regression technique with two stages: first the discovery of main effects and next the discovery of their interactions. The long list of SNPs genotyped is reduced to a manageable list for combining with non-SNP features in prediction. We adapted Efron's local false discovery rate to produce this reduced list. Selected non-SNP and SNP features and their interactions are used to predict MAP using adaptive linear regression. We quantify quality of prediction by an estimated coefficient of determination (R²). We compare the accuracy of prediction with and without information from SNPs.
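For orientation, the standard two-group form of Efron's local false discovery rate can be written as below. This is the textbook definition only, not the adapted (marginal) version used by the authors, whose details are not given in this abstract. The symbols \pi_0 (null proportion), f_0 and f_1 (null and non-null densities of a per-SNP test statistic z_j), and the cutoff q are notation assumed here for illustration.

```latex
\[
f(z) = \pi_0 f_0(z) + (1-\pi_0)\, f_1(z), \qquad
\mathrm{fdr}(z) = \frac{\pi_0 f_0(z)}{f(z)}, \qquad
\text{retain SNP } j \text{ if } \mathrm{fdr}(z_j) \le q .
\]
```

Under this reading, a larger q (for example 0.5 rather than 0.2) admits more SNPs into the reduced list, at the price of a higher expected share of null SNPs among them.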
Year: 2011 PMID: 22140480 PMCID: PMC3227593 DOI: 10.1371/journal.pone.0027891
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Table 1. Non-SNP features (main effects, listed first) and their interactions (listed last, with feature codes joined by “:”) chosen by our adaptive prediction algorithm.
| Code | Feature | Avg coef, cutoff 0.01 | CV count, cutoff 0.01 | Avg coef, cutoff 0.05 | CV count, cutoff 0.05 | Avg coef, cutoff 0.1 | CV count, cutoff 0.1 | Avg coef, cutoff 0.2 | CV count, cutoff 0.2 |
| ANTA07A | Waist girth (cm) | 2.97E−02 | 2 | 2.53E−02 | 4 | 2.88E−02 | 5 | 1.63E−02 | 7 |
| APASIU01 | Apolipoprotein A1 (mg/L) | NA | NA | 2.53E−04 | 4 | 2.17E−04 | 1 | NA | NA |
| APBSIU01 | Apolipoprotein B (mg/L) | 3.37E−05 | 3 | 8.38E−05 | 2 | NA | NA | NA | NA |
| BMI01 | Body mass index (kg/m²) | 3.23E−02 | 2 | 1.69E−02 | 2 | 2.74E−02 | 1 | 6.15E−02 | 1 |
| CENTERID.B | Field center | −3.51E−01 | 5 | −7.11E−01 | 7 | −2.35E−01 | 2 | NA | NA |
| CENTERID.D | Field center | −1.95E+00 | 10 | −2.18E+00 | 10 | −1.74E+00 | 8 | −1.31E+00 | 1 |
| CHOLMD02.1 | Meds that secondarily lower cholesterol | 8.14E+00 | 10 | 8.45E+00 | 10 | 8.20E+00 | 10 | 7.54E+00 | 10 |
| CIGT01.2 | Cigarette smoking status (% never) | 6.01E−01 | 6 | 8.89E−01 | 7 | 4.49E−01 | 4 | 6.28E−01 | 1 |
| CIGT01.3 | Cigarette smoking status (% never) | 9.73E−01 | 10 | 1.26E+00 | 9 | 8.92E−01 | 6 | 8.96E−01 | 2 |
| CIGTYR01 | Cigarette years of smoking | −6.29E−04 | 9 | −7.00E−04 | 8 | −5.54E−04 | 6 | NA | NA |
| ETHANL03 | Usual ethanol intake (g/week) | 5.60E−03 | 6 | 6.21E−03 | 4 | 2.28E−03 | 3 | NA | NA |
| INSSIU01 | Insulin (pmol/L) | 7.81E−04 | 2 | 6.11E−04 | 4 | 3.49E−04 | 4 | NA | NA |
| TCHSIU01 | Total cholesterol (mmol/L) | 5.84E−01 | 1 | 5.88E−01 | 2 | 5.80E−01 | 5 | 5.31E−01 | 10 |
| TRGSIU01 | Total triglycerides (mmol/L) | 3.58E−01 | 5 | 3.82E−01 | 9 | 3.11E−01 | 9 | 1.34E−01 | 4 |
| V1AGE01 | Age at first visit | 1.01E−01 | 9 | 1.31E−01 | 10 | 1.24E−01 | 10 | 1.03E−01 | 3 |
| WSTHPR01 | Waist-to-hip ratio | 1.73E+00 | 3 | 1.94E+00 | 2 | 4.54E+00 | 2 | 5.61E+00 | 2 |
| ANTA07A:TCHSIU01 | | 8.10E−03 | 5 | 7.32E−03 | 4 | 8.17E−03 | 1 | NA | NA |
| BMI01:TCHSIU01 | | 3.13E−02 | 2 | 2.29E−02 | 1 | NA | NA | NA | NA |
| BMI01:TRGSIU01 | | 1.12E−02 | 1 | NA | NA | NA | NA | NA | NA |
| CHOLMD021:ANTA07A | | NA | NA | NA | NA | NA | NA | 7.32E−03 | 3 |
| CHOLMD021:ANTA07A:TCHSIU01 | | 1.05E−03 | 4 | NA | NA | 1.04E−03 | 1 | NA | NA |
| ERHA21:ANTA07A | | 7.04E−04 | 10 | 8.83E−04 | 10 | 7.56E−04 | 9 | 1.09E−03 | 4 |
| ERHA21:APBSIU01 | | 4.69E−06 | 1 | 1.10E−05 | 1 | NA | NA | NA | NA |
| ERHA21:BMI01 | | 2.38E−03 | 8 | 2.39E−03 | 10 | 3.17E−03 | 10 | 4.24E−03 | 9 |
| ERHA21:BMI01:V1AGE01 | | 7.31E−05 | 1 | NA | NA | NA | NA | NA | NA |
| ERHA21:CENTERIDB | | −3.66E−03 | 2 | NA | NA | NA | NA | NA | NA |
| ERHA21:CIGT013 | | NA | NA | NA | NA | 1.33E−03 | 1 | NA | NA |
| ERHA21:INSSIU01 | | 5.44E−05 | 1 | NA | NA | 4.75E−05 | 1 | 3.09E−05 | 1 |
| ERHA21:TCHSIU01 | | 5.37E−03 | 4 | 5.86E−03 | 6 | 9.13E−03 | 4 | NA | NA |
| ERHA21:TRGSIU01 | | 4.71E−03 | 4 | 2.57E−03 | 3 | 6.12E−03 | 1 | NA | NA |
| ERHA21:V1AGE01 | | 9.25E−04 | 5 | 1.50E−03 | 1 | 1.25E−04 | 1 | NA | NA |
| ERHA21:WSTHPR01 | | 5.09E−02 | 1 | NA | NA | NA | NA | NA | NA |
| INSSIU01:CIGT012 | | 3.32E−03 | 2 | 3.42E−03 | 1 | NA | NA | NA | NA |
| INSSIU01:TCHSIU01 | | NA | NA | 9.67E−04 | 1 | NA | NA | NA | NA |
For each cutoff fraction of occurrence in the bootstrapped CART, the average coefficient and the number of times the corresponding feature was selected over the 10-fold cross validation are presented. Results are shown for medication-adjusted mean arterial blood pressure.
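The two quantities in this note can be illustrated with a minimal sketch. It assumes that each bootstrap resample yields the set of features used by its fitted CART tree, and that a feature is retained when its fraction of occurrence is at least the cutoff. The function names, the `>=` rule, and the averaging over only the folds that selected a feature are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def occurrence_fractions(trees_features):
    """Fraction of bootstrap trees in which each feature appears.

    trees_features: one collection of feature codes per bootstrap tree
    (hypothetical input format).
    """
    n_trees = len(trees_features)
    counts = Counter(f for feats in trees_features for f in set(feats))
    return {f: c / n_trees for f, c in counts.items()}


def select_by_cutoff(trees_features, cutoff):
    """Keep features whose occurrence fraction is at least `cutoff` (assumed rule)."""
    fracs = occurrence_fractions(trees_features)
    return sorted(f for f, frac in fracs.items() if frac >= cutoff)


def summarize_folds(fold_coefs):
    """Average coefficient and selection count per feature across CV folds.

    fold_coefs: one dict {feature: coefficient} per fold, containing only the
    features selected in that fold. Returns {feature: (avg_coef, cv_count)}.
    """
    totals = {}
    for coefs in fold_coefs:
        for feature, coef in coefs.items():
            total, count = totals.get(feature, (0.0, 0))
            totals[feature] = (total + coef, count + 1)
    return {f: (total / count, count) for f, (total, count) in totals.items()}
```

With ten folds, the second element returned by `summarize_folds` corresponds to the "CV count" columns, and a feature never selected at a given cutoff would simply be absent, matching the "NA" entries.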
Table 2. Coefficient of determination (R²) estimated using 10-fold cross validation of our adaptive prediction algorithm.
| Method | mFDR cutoff | BP | SNP effects | R², cutoff 0.01 | (SE) | R², cutoff 0.05 | (SE) | R², cutoff 0.1 | (SE) | R², cutoff 0.2 | (SE) |
| Non-SNP first | 0.2 | adjusted | non-SNP | 0.270 | (0.014) | 0.271 | (0.014) | 0.261 | (0.015) | 0.244 | (0.015) |
| Non-SNP first | 0.2 | adjusted | main | −0.007 | (0.003) | −0.009 | (0.004) | −0.003 | (0.003) | −0.005 | (0.002) |
| Non-SNP first | 0.2 | adjusted | inter | 0.001 | (0.002) | −0.003 | (0.004) | 0.004 | (0.003) | 0.001 | (0.002) |
| Non-SNP first | 0.2 | unadjusted | non-SNP | 0.146 | (0.011) | 0.141 | (0.011) | 0.143 | (0.011) | 0.130 | (0.012) |
| Non-SNP first | 0.2 | unadjusted | main | −0.002 | (0.002) | 0.004 | (0.003) | 0.002 | (0.002) | 0.000 | (0.003) |
| Non-SNP first | 0.2 | unadjusted | inter | −0.001 | (0.003) | 0.004 | (0.003) | −0.007 | (0.003) | −0.012 | (0.005) |
| Non-SNP first | 0.5 | adjusted | non-SNP | 0.273 | (0.014) | 0.270 | (0.014) | 0.260 | (0.015) | 0.244 | (0.014) |
| Non-SNP first | 0.5 | adjusted | main | 0.220 | (0.020) | 0.230 | (0.019) | 0.232 | (0.015) | 0.194 | (0.012) |
| Non-SNP first | 0.5 | adjusted | inter | 0.248 | (0.013) | 0.250 | (0.013) | 0.238 | (0.014) | 0.227 | (0.012) |
| Non-SNP first | 0.5 | unadjusted | non-SNP | 0.170 | (0.011) | 0.170 | (0.011) | 0.171 | (0.011) | 0.151 | (0.013) |
| Non-SNP first | 0.5 | unadjusted | main | 0.128 | (0.010) | 0.133 | (0.011) | 0.128 | (0.013) | 0.110 | (0.012) |
| Non-SNP first | 0.5 | unadjusted | inter | 0.170 | (0.011) | 0.171 | (0.011) | 0.172 | (0.011) | 0.147 | (0.014) |
| SNP first | 0.2 | adjusted | SNP only | 0.133 | (0.010) | 0.132 | (0.010) | 0.133 | (0.010) | 0.134 | (0.010) |
| SNP first | 0.2 | adjusted | non-SNP main | 0.268 | (0.013) | 0.264 | (0.012) | 0.264 | (0.012) | 0.260 | (0.013) |
| SNP first | 0.2 | adjusted | non-SNP inter | 0.271 | (0.014) | 0.268 | (0.014) | 0.261 | (0.016) | 0.249 | (0.014) |
| Non-SNP first+candidate SNPs | 0.2 | adjusted | non-SNP | 0.270 | (0.014) | 0.271 | (0.014) | 0.261 | (0.015) | 0.244 | (0.015) |
| Non-SNP first+candidate SNPs | 0.2 | adjusted | SNP main | 0.263 | (0.014) | 0.264 | (0.014) | 0.259 | (0.016) | 0.241 | (0.016) |
| Non-SNP first+candidate SNPs | 0.2 | adjusted | SNP inter | 0.269 | (0.014) | 0.269 | (0.014) | 0.261 | (0.015) | 0.248 | (0.015) |
“Non-SNP first”: the non-SNP features were selected first, and the main effects of SNPs were then chosen at marginal false discovery rate (mFDR) cutoffs of 0.2 and 0.5.
“SNP first”: the main effects of SNPs were selected first at the marginal false discovery rate cutoff of 0.2, and non-SNP effects were included afterwards.
“candidate SNPs”: the non-SNP features were selected first, and the 26 candidate SNPs were included together with the main effects of SNPs chosen at the marginal false discovery rate cutoff of 0.2.
In the column “BP”, “adjusted” denotes results for medication-adjusted mean arterial blood pressure and “unadjusted” denotes results for unadjusted blood pressure; results are given for each cutoff fraction of occurrence in the bootstrapped CART.
For each method and mFDR cutoff, the first row presents the baseline R²; “main” and “inter” refer to the increase or decrease in R² from “non-SNP.” Standard errors of the individual R² values over the ten folds are given in parentheses.
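The cross-validated R² and its fold-wise standard error described in these notes can be sketched as follows. This is a simplified illustration under stated assumptions: a single predictor fitted by ordinary least squares stands in for the adaptive linear regression, the fold assignment is a seeded random shuffle, and the standard error is taken as the standard deviation of the per-fold R² values divided by the square root of the number of folds. All function and parameter names are hypothetical.

```python
import math
import random


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Closed-form simple linear regression; returns (slope, intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def cv_r_squared(x, y, k=10, seed=0):
    """k-fold held-out R²; returns (mean, standard error, per-fold scores)."""
    idx = list(range(len(x)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k roughly equal, disjoint folds
    scores = []
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        slope, intercept = fit_line([x[i] for i in train], [y[i] for i in train])
        preds = [intercept + slope * x[i] for i in test]
        scores.append(r_squared([y[i] for i in test], preds))
    mean = sum(scores) / k
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (k - 1))
    return mean, sd / math.sqrt(k), scores
```

Because R² is computed on held-out data, it can decrease when features are added, which is how a negative "main" or "inter" entry can arise: the model with SNP terms predicts the held-out folds slightly worse than the non-SNP baseline.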
Table 3. List of candidate SNPs used for the analysis presented in the last set of rows of Table 2.
| SNP_rs_ID | Chr | physical_start | Gene Symbol | SBP association | DBP association | HTN association | algorithm select? |
| rs12046278 | 1 | 10722163 | Y | Y | |||
| rs13401889 | 2 | 190618803 | Y | ||||
| rs7571613 | 2 | 190513906 | LOC653447 | Y | |||
| rs13423988 | 2 | 68764769 | |||||
| rs17806132 | 2 | 190416531 | PMS1 | Y | Y | Y | |
| rs305489 | 3 | 11986162 | Y | Y | |||
| rs7640747 | 3 | 37571808 | ITGA9 | Y | |||
| rs448378 | 3 | 170583592 | MDS1 | Y | Y | ||
| rs9815354 | 3 | 41887654 | ULK4 | Y | |||
| rs899364 | 8 | 11366953 | Y | Y | |||
| rs2736376 | 8 | 11155174 | Y | ||||
| rs7016759 | 8 | 49574968 | Y | ||||
| rs1910252 | 8 | 49569914 | Y | ||||
| rs11775334 | 8 | 10109029 | MSRA | Y | Y | ||
| rs1004467 | 10 | 104584496 | CYP17A1 | Y | |||
| rs11014166 | 10 | 18748803 | CACNB2 | Y | Y | Y | |
| rs381815 | 11 | 16858843 | PLEKHA7 | Y | |||
| rs11024074 | 11 | 16873794 | PLEKHA7 | Y | |||
| rs11612893 | 12 | 129290571 | Y | ||||
| rs2681472 | 12 | 88533089 | ATP2B1 | Y | Y | ||
| rs2681492 | 12 | 88537219 | ATP2B1 | Y | Y | ||
| rs2384550 | 12 | 113837113 | Y | ||||
| rs278126 | 12 | 118620099 | CIT | Y | |||
| rs3184504 | 12 | 110368990 | LNK | Y | Y | Y | |
| rs6495122 | 15 | 72912697 | Y | ||||
| rs16982520 | 20 | 57192114 | Y |
In columns 5–7, the entry “Y” indicates that the corresponding SNP's association with the corresponding BP trait was previously identified. The last column shows which of these SNPs were selected by our adaptive prediction algorithm.