Seoae Cho, Haseong Kim, Sohee Oh, Kyunga Kim, Taesung Park.
Abstract
The current trend in genome-wide association studies is to identify regions where the true disease-causing genes may lie by evaluating thousands of single-nucleotide polymorphisms (SNPs) across the whole genome. However, many challenges exist in detecting disease-causing genes among the thousands of SNPs. Examples include multicollinearity and multiple testing issues, especially when a large number of correlated SNPs are simultaneously tested. Multicollinearity often occurs when predictor variables in a multiple regression model are highly correlated, and it can cause imprecise estimation of association. In this study, we propose a simple stepwise procedure that identifies disease-causing SNPs simultaneously by employing elastic-net regularization, a variable selection method that allows one to address multicollinearity. At Step 1, a single-marker association analysis was conducted to screen SNPs. At Step 2, multiple-marker association was scanned using elastic-net regularization. The proposed approach was applied to the rheumatoid arthritis (RA) case-control data set of Genetic Analysis Workshop 16. While the SNPs selected at the screening step are located mostly on chromosome 6, the elastic-net approach identified putative RA-related SNPs on other chromosomes in an increased proportion. For some of those putative RA-related SNPs, we identified interactions with sex, a well-known factor affecting RA susceptibility.
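The two-step procedure described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which single-marker test was used, so Step 1 here uses a correlation-based score statistic (N·r²) as a stand-in, and Step 2 fits an elastic-net penalized logistic regression by proximal gradient descent. The function names, the penalty parameters `lam` and `alpha`, and the toy genotypes are all hypothetical.

```python
import math

def trend_statistic(genotypes, status):
    """N * r^2 between genotype codes (0/1/2) and case status (0/1)."""
    n = len(status)
    mean_g = sum(genotypes) / n
    mean_y = sum(status) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, status))
    var_g = sum((g - mean_g) ** 2 for g in genotypes)
    var_y = sum((y - mean_y) ** 2 for y in status)
    if var_g == 0 or var_y == 0:
        return 0.0  # monomorphic SNP or single-class sample: no signal
    return n * cov * cov / (var_g * var_y)

def screen_snps(snp_columns, status, top_k):
    """Step 1: indices of the top_k SNPs ranked by the single-marker statistic."""
    stats = [trend_statistic(col, status) for col in snp_columns]
    ranked = sorted(range(len(stats)), key=lambda j: stats[j], reverse=True)
    return sorted(ranked[:top_k])

def soft_threshold(z, t):
    """Proximal operator of t * |z|; shrinks z and sets small values to zero."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def fit_elastic_net_logistic(rows, status, lam=0.05, alpha=0.5,
                             step=0.1, n_iter=500):
    """Step 2: minimise mean log-loss + lam*(alpha*|b|_1 + (1-alpha)/2*|b|_2^2)."""
    n, p = len(rows), len(rows[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for x, y in zip(rows, status):
            eta = intercept + sum(b * v for b, v in zip(beta, x))
            resid = 1.0 / (1.0 + math.exp(-eta)) - y
            grad0 += resid / n
            for j in range(p):
                grad[j] += resid * x[j] / n
        intercept -= step * grad0  # the intercept is not penalised
        for j in range(p):
            # gradient step on log-loss plus ridge part, then L1 shrinkage
            z = beta[j] - step * (grad[j] + lam * (1.0 - alpha) * beta[j])
            beta[j] = soft_threshold(z, step * lam * alpha)
    return intercept, beta

# Hypothetical toy data: 20 cases followed by 20 controls, three SNPs.
status = [1] * 20 + [0] * 20
null_pattern = [i % 3 for i in range(20)]
snps = [
    [2] * 10 + [1] * 8 + [0] * 2 + [0] * 12 + [1] * 6 + [2] * 2,  # associated
    [2] * 8 + [1] * 10 + [0] * 2 + [0] * 10 + [1] * 8 + [2] * 2,  # correlated with the first
    null_pattern + null_pattern,  # identical in cases and controls
]

screened = screen_snps(snps, status, top_k=2)
rows = [[snps[j][i] for j in screened] for i in range(len(status))]
intercept, coefs = fit_elastic_net_logistic(rows, status)
print(screened, coefs)
```

The two associated SNPs are deliberately correlated to mimic the multicollinearity the abstract mentions. The ridge part of the penalty is what lets correlated predictors share an effect, while the L1 part can set coefficients exactly to zero. For real genome-wide data an established penalized-regression package would be the practical choice.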
Year: 2009; PMID: 20018015; PMCID: PMC2795922; DOI: 10.1186/1753-6561-3-s7-s25
Source DB: PubMed; Journal: BMC Proc; ISSN: 1753-6561
Figure 1. Genome-wide scan for RA-SNP association. The p-values < 0.05 from single-SNP association tests are plotted on a -log10 scale against the chromosomal positions of the corresponding 48,366 SNPs. Blue and light blue distinguish adjacent chromosomes. Red indicates potential RA-related SNPs that were identified by fitting the penalized logistic regression model (M1) via elastic-net using the top 3000 of those 48,366 SNPs.
RA-related SNPs identified with ten largest main effects via the elastic-net method (M1)
| SNP | Chromosome (a) | Coefficient (b) |
|---|---|---|
| Top 1000 | | |
| rs6903608 | 6 | -0.3413 |
| rs2395185 | 6 | 0.3285 |
| rs11686264 | 2 | -0.3284 |
| rs6981223 | 8 | -0.31 |
| rs10948693 | 6 | -0.2813 |
| rs9727917 | 1 | -0.2806 |
| rs2440468 | 16 | -0.2736 |
| rs4499874 | 5 | 0.2714 |
| rs9275595 | 6 | 0.2641 |
| rs7970893 | 12 | -0.2492 |
| Top 2000 | | |
| rs2395175 | 6 | 0.2522 |
| rs6903608 | 6 | -0.2299 |
| rs10094729 | 8 | -0.166 |
| rs2101613 | 10 | -0.1613 |
| rs6910071 | 6 | 0.1529 |
| rs660895 | 6 | 0.1522 |
| rs9277554 | 6 | -0.1468 |
| rs12203592 | 6 | -0.1401 |
| rs2578240 | 9 | 0.1356 |
| rs9275572 | 6 | -0.1353 |
| Top 3000 | | |
| rs2395175 | 6 | 0.3532 |
| rs660895 | 6 | 0.2302 |
| rs9275572 | 6 | -0.219 |
| rs10094729 | 8 | -0.1972 |
| rs6903608 | 6 | -0.1889 |
| rs3873444 | 6 | -0.1403 |
| rs7970893 | 12 | -0.1321 |
| rs234592 | 14 | -0.1316 |
| rs10789176 | 1 | -0.125 |
| rs9275601 | 6 | -0.1221 |
(a) Chromosome where the SNP is located.
(b) Coefficient representing the size and direction of the SNP main effect.
Figure 2. Distributions of the top 3000 screened SNPs vs. the 398 potential RA-related SNPs across chromosomes. For each chromosome, blue bars represent the number of SNPs that were selected among the top 3000 SNPs via single-SNP association tests at Step 1, and red bars represent the number of potential RA-related SNPs that were identified at Step 2 by fitting the penalized logistic regression model (M1) via elastic-net using the top 3000 screened SNPs.
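The per-chromosome comparison in Figure 2 comes down to tallying SNPs by chromosome and comparing proportions between two sets. As a small sketch, the helper below (a hypothetical name, not from the paper) computes those proportions. It is applied only to the ten Top 3000 rows of the main-effects table above, since the full lists of 3000 screened and 398 selected SNPs are not reproduced here.

```python
from collections import Counter

def chromosome_proportions(chromosomes):
    """Fraction of SNPs falling on each chromosome."""
    counts = Counter(chromosomes)
    total = len(chromosomes)
    return {chrom: counts[chrom] / total for chrom in sorted(counts)}

# Chromosomes of the ten largest main effects under "Top 3000" (M1).
top3000_top_ten = [6, 6, 6, 8, 6, 6, 12, 14, 1, 6]

proportions = chromosome_proportions(top3000_top_ten)
print(proportions)
```

Running the same function on the screened set and on the elastic-net-selected set, then comparing the two dictionaries chromosome by chromosome, would give the kind of contrast the figure shows.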
RA-related SNPs identified with sex-by-SNP interaction via the elastic-net method (M2)
| SNP | Chromosome (a) | Coefficient (b) |
|---|---|---|
| Top 1000 | | |
| rs2858870 | 6 | -0.329 |
| rs9727917 | 1 | -0.2572 |
| rs10184573 | 2 | -0.2347 |
| rs10514911 | 17 | 0.2314 |
| rs11703151 | 22 | 0.2077 |
| Top 2000 | | |
| rs6903608 | 6 | -0.538 |
| rs1217675 | 8 | 0.5188 |
| rs560271 | 17 | -0.5169 |
| rs201119 | 10 | -0.4943 |
| rs2044750 | 18 | -0.4812 |
| Top 3000 | | |
| rs3873444 | 6 | -0.3573 |
| rs948195 | 11 | -0.3233 |
| rs2579088 | 12 | 0.3063 |
| rs13277113 | 8 | 0.303 |
| rs12407970 | 1 | 0.2787 |
(a) Chromosome where the SNP is located.
(b) Coefficient representing the size and direction of the SNP main effect.
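A model with sex-by-SNP interaction terms, such as M2, needs a design matrix that contains the product of sex and each SNP genotype alongside the main effects. The exact form of M2 is not given in this record, so the sketch below assumes a common layout: one sex column, one column per SNP, and one sex-by-SNP product column per SNP. The 0/1 coding of sex and the function name are illustrative assumptions.

```python
def build_interaction_design(genotype_rows, sex):
    """Return one row per subject: [sex, g_1..g_p, sex*g_1..sex*g_p]."""
    design = []
    for genotypes, s in zip(genotype_rows, sex):
        main_effects = list(genotypes)
        interactions = [s * g for g in genotypes]  # sex-by-SNP products
        design.append([s] + main_effects + interactions)
    return design

# Hypothetical toy input: three subjects, two SNPs coded 0/1/2, sex coded 0/1.
genotype_rows = [[0, 2], [1, 1], [2, 0]]
sex = [0, 1, 1]

design = build_interaction_design(genotype_rows, sex)
for row in design:
    print(row)
```

With this layout, a penalized fit can shrink the interaction columns separately from the main effects, so a SNP can be retained through its interaction with sex even when its main-effect coefficient is shrunk to zero.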