Li Yao, Wenjun Zhong, Zhumin Zhang, Matthew J Maenner, Corinne D Engelman.
Abstract
The aim of this study was to detect the effect of interactions between single-nucleotide polymorphisms (SNPs) on the incidence of heart disease. For this purpose, 2912 subjects with 350,160 SNPs from the Framingham Heart Study (FHS) were analyzed. PLINK was used for quality control and to select the 10,000 most significant SNPs. A classification tree algorithm, Generalized, Unbiased, Interaction Detection and Estimation (GUIDE), was employed to build a classification tree to detect SNP-by-SNP interactions among the selected 10,000 SNPs. The classes generated by GUIDE were reexamined with a generalized estimating equations (GEE) model, using the empirical variance to account for potential familial correlation. Overall, 17 classes were generated based on the splitting criteria in GUIDE. The prevalence of coronary heart disease (CHD) in class 16 (determined by SNPs rs1894035, rs7955732, rs2212596, and rs1417507) was the lowest (0.23%). Compared to class 16, all other classes except class 288 (prevalence of 1.2%) had a significantly greater risk when analyzed using the GEE model. This suggests that the interactions of SNPs on these node paths are significant.
Year: 2009 PMID: 20018079 PMCID: PMC2795986 DOI: 10.1186/1753-6561-3-s7-s83
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Descriptive statistics of selected traits by generation: range, percentage distribution or mean, and standard deviation at baseline
| Traitᵃ | Generation 1: Range | Generation 1: Mean/% | Generation 1: SD | Generation 2: Range | Generation 2: Mean/% | Generation 2: SD | Generation comparison | Overall: Range | Overall: Mean/% | Overall: SD |
|---|---|---|---|---|---|---|---|---|---|---|
| Age | 29-54 | 34.87 | 3.79 | 5-59 | 33.72 | 9.26 | t = 2.31ᵇ | 5-59 | 33.86 | 8.78 |
| Body Mass Index | 16.7-36.0 | 23.81 | 3.27 | 13.5-51.1 | 24.94 | 4.10 | t = -4.95ᶜ | 13.5-51.1 | 24.80 | 4.03 |
| SBP | 90-160 | 123.01 | 13.45 | 78-200 | 119.13 | 14.24 | t = 4.86ᶜ | 78-200 | 119.60 | 14.20 |
| DBP | 50-105 | 77.85 | 9.35 | 48-120 | 77.16 | 9.97 | t = 1.23 | 48-120 | 77.25 | 9.90 |
| Cholesterol | 129-339 | 191.91 | 37.28 | 101-388 | 190.13 | 36.23 | t = 0.68 | 101-388 | 190.26 | 36.30 |
| Cigarettes | 0-50 | 7.58 | 10.71 | 0-88 | 7.60 | 12.09 | t = -0.03 | 0-88 | 7.60 | 12.00 |
| Smoke | 0-2 | | | 0-2 | | | χ² = 25.97ᶜ | 0-2 | | |
| Smoke = 0 | | 44.09% | | | 41.27% | | | | 41.46% | |
| Smoke = 1 | | 5.38% | | | 20.01% | | | | 19.01% | |
| Smoke = 2 | | 50.54% | | | 38.72% | | | | 39.52% | |
| Diabetes | 0-1 | | | 0-1 | | | χ² = 2.24 | 0-1 | | |
| Diabetes = 0 | | 92.13% | | | 89.59% | | | | 89.90% | |
| Diabetes = 1 | | 7.87% | | | 10.41% | | | | 10.10% | |
ᵃ SBP, systolic blood pressure (mm Hg); DBP, diastolic blood pressure (mm Hg); Cholesterol, fasting total cholesterol (mg/dl); Cigarettes, number of cigarettes smoked per day; Smoke, smoking status (0, never; 1, former; 2, current); Diabetes, diabetes status (0, no; 1, yes).
ᵇ p < 0.05
ᶜ p < 0.001
Figure 1. Classification tree generated by GUIDE. For each SNP, an additive genetic model with respect to the minor allele was used. At each intermediate node, a case went to the left child node if and only if the condition was satisfied. The classification tree had 17 terminal nodes (yellow or green), each labeled with its node number. Yellow nodes were classified as controls (coded as 1) and green nodes as cases (coded as 2). There were 228 cases in the sample. The predicted class and the number of errors divided by the number of cases are given beneath each terminal node. The tree shown was pruned by ten-fold cross-validation and has the smallest misclassification cost.
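The additive coding and the left/right routing rule described in the caption can be sketched as follows. This is a minimal illustration, not GUIDE itself: the `Split` class, the `<=` form of the node condition, the threshold value, and the example genotypes are all hypothetical. Only the minor alleles (C for rs1894035, G for rs7955732) are taken from the SNP table.

```python
from dataclasses import dataclass


def additive_code(genotype: str, minor_allele: str) -> int:
    """Return the number of minor-allele copies (0, 1, or 2) in a genotype."""
    if len(genotype) != 2:
        raise ValueError("expected a two-letter genotype such as 'CT'")
    return genotype.count(minor_allele)


@dataclass(frozen=True)
class Split:
    """Hypothetical intermediate node: a case goes left iff code <= threshold."""
    snp: str
    threshold: int  # assumed cut point on the 0/1/2 additive scale

    def goes_left(self, codes: dict) -> bool:
        return codes[self.snp] <= self.threshold


# Minor alleles as listed in the SNP table; the genotypes below are invented.
minor = {"rs1894035": "C", "rs7955732": "G"}
subject = {"rs1894035": "CT", "rs7955732": "TT"}

codes = {snp: additive_code(gt, minor[snp]) for snp, gt in subject.items()}
split = Split(snp="rs1894035", threshold=0)

print(codes)
print("goes left:", split.goes_left(codes))
```

A path from the root to a terminal node is then a conjunction of such conditions on several SNPs, which is why each terminal class can be read as a candidate SNP-by-SNP interaction.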
Functions of 16 SNPs that determined the splits in the classification tree
| SNP | Alleles | MAFᵇ | Chromosome | Position | Gene | Function | p-valueᶜ |
|---|---|---|---|---|---|---|---|
| rs2741302 | Aᵃ/C | 0.179582 | 2 | 232,936,077 | | | 0.0245 |
| rs41009 | Aᵃ/G | 0.134071 | 2 | 8,012,609 | | | 0.0009 |
| rs10023987 | Cᵃ/T | 0.196187 | 4 | 98,109,876 | | | 0.0068 |
| rs10804990 | A/Gᵃ | 0.381304 | 4 | 6,970,562 | TBC1D14 (TBC1 domain family member 14) | intron | 0.0253 |
| rs1417507 | Tᵃ/C | 0.276753 | 6 | 50,324,791 | | | 0.0146 |
| rs2146333 | Aᵃ/C | 0.188192 | 6 | 36,564,996 | KCTD20 (potassium channel tetramerization domain containing 20) | 3'UTR | 0.01703 |
| rs9275765 | Aᵃ/T | 0.215867 | 6 | 32,797,302 | | | |
| rs2191806 | Cᵃ/G | 0.182042 | 7 | 78,581,862 | MAGI2 (membrane associated guanylate kinase, WW and PDZ domain containing 2) | intron | 0.0169 |
| rs17517421 | Aᵃ/G | 0.113776 | 8 | 18,410,095 | | | 0.0007 |
| rs10833833 | Aᵃ/G | 0.123616 | 11 | 3,355,238 | ZNF195 (zinc finger protein 195) | intron | 0.0092 |
| rs1894035 | Cᵃ/T | 0.311808 | 12 | 50,932,021 | KRT86 (keratin 86) | intron | 0.02598 |
| rs7955732 | Gᵃ/T | 0.403444 | 12 | 71,011,262 | TRHDE (thyrotropin-releasing hormone degrading enzyme) | intron | 0.02056 |
| rs8011590 | Cᵃ/G | 0.083025 | 14 | 106,109,086 | | | 0.01665 |
| rs9924619 | Cᵃ/G | 0.115006 | 16 | 28,924,029 | | | 0.00072 |
| rs17763533 | Cᵃ/T | 0.214637 | 17 | 41,273,970 | | | 0.01929 |
| rs2212596 | Aᵃ/C | 0.49754 | 21 | 38,890,404 | ERG (v-ets erythroblastosis virus E26 oncogene homolog, avian) | intron | 0.01421 |
ᵃ Minor allele
ᵇ MAF, minor allele frequency
ᶜ p-values from single-marker analysis by PLINK
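The single-marker p-values above were the basis for keeping the 10,000 most significant SNPs. The ranking idea can be sketched as below, assuming that "most significant" means "smallest p-value". The helper `top_k_snps` is a hypothetical name, the five p-values are copied from the table, and PLINK's own procedure is not reproduced here.

```python
import heapq

# Single-marker p-values for five SNPs, copied from the table above.
p_values = {
    "rs41009": 0.0009,
    "rs10023987": 0.0068,
    "rs17517421": 0.0007,
    "rs10833833": 0.0092,
    "rs9924619": 0.00072,
}


def top_k_snps(p_by_snp: dict, k: int) -> list:
    """Return the k SNP identifiers with the smallest p-values, best first."""
    return heapq.nsmallest(k, p_by_snp, key=p_by_snp.get)


# The study kept k = 10,000 of 350,160 SNPs; k = 3 keeps the toy example small.
selected = top_k_snps(p_values, 3)
print(selected)
```

Using `heapq.nsmallest` avoids sorting the full list, which matters little for five SNPs but is convenient when ranking several hundred thousand.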
Prevalence of CHD in each of 17 classes identified by GUIDEᵃ
| Class | No. subjects | No. CHD (%) | p-valueᵇ |
|---|---|---|---|
| 16 | 433 | 1 (0.23%) | Reference |
| 288 | 489 | 6 (1.2%) | 0.12 |
| 136 | 367 | 6 (1.6%) | <0.0001 |
| 48 | 335 | 7 (2.1%) | 0.04 |
| 289 | 106 | 10 (9%) | 0.0004 |
| 137 | 56 | 8 (14%) | <0.0001 |
| 5 | 235 | 36 (15%) | <0.0001 |
| 49 | 50 | 8 (16%) | <0.0001 |
| 37 | 73 | 13 (18%) | <0.0001 |
| 25 | 42 | 8 (19%) | <0.0001 |
| 145 | 37 | 7 (19%) | <0.0001 |
| 7 | 297 | 58 (20%) | <0.0001 |
| 35 | 43 | 9 (21%) | <0.0001 |
| 13 | 65 | 14 (22%) | <0.0001 |
| 69 | 30 | 7 (23%) | <0.0001 |
| 73 | 38 | 9 (24%) | <0.0001 |
| 19 | 56 | 15 (27%) | <0.0001 |
| Overall | 2752 | 222 (8%) | --- |
ᵃ Note: The GEE model deleted 160 subjects with missing SNPs, while GUIDE kept all the missing values and treated them as zero.
ᵇ p-value from the GEE model for each class compared to the reference class (16).
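As a consistency check on the table, the per-class prevalences and the overall row can be recomputed from the raw counts. The sketch below also computes a crude prevalence ratio against class 16. That ratio is only an unadjusted illustration and is not the GEE estimate, which accounts for familial correlation. The function and variable names are hypothetical.

```python
# (class, number of subjects, number with CHD), copied from the table above.
classes = [
    (16, 433, 1), (288, 489, 6), (136, 367, 6), (48, 335, 7),
    (289, 106, 10), (137, 56, 8), (5, 235, 36), (49, 50, 8),
    (37, 73, 13), (25, 42, 8), (145, 37, 7), (7, 297, 58),
    (35, 43, 9), (13, 65, 14), (69, 30, 7), (73, 38, 9), (19, 56, 15),
]


def prevalence(n_chd: int, n_subjects: int) -> float:
    """Proportion of subjects in a class with CHD."""
    return n_chd / n_subjects


total_subjects = sum(n for _, n, _ in classes)
total_chd = sum(c for _, _, c in classes)

# Class 16 is the reference class in the table.
ref_prev = prevalence(1, 433)

for label, n, c in classes:
    prev = prevalence(c, n)
    crude_ratio = prev / ref_prev  # unadjusted; not the GEE result
    print(f"class {label:>3}: {100 * prev:5.2f}%  crude ratio {crude_ratio:6.1f}")

print(f"overall: {total_chd}/{total_subjects} = {100 * total_chd / total_subjects:.1f}%")
```

The totals match the "Overall" row (2752 subjects, 222 with CHD), and 2752 equals the 2912 analyzed subjects minus the 160 that the GEE model dropped for missing SNPs.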