Vitara Pungpapong, Libo Wang, Yanzhu Lin, Dabao Zhang, Min Zhang.
Abstract
Next-generation sequencing technologies enable us to explore rare functional variants. However, most current statistical techniques are underpowered to capture signals of rare variants in genome-wide association studies. We propose a supervised coalescing of single-nucleotide polymorphisms (SNPs) to obtain gene-based markers that can stably reveal possible genetic effects related to rare alleles. We use a newly developed empirical Bayes variable selection algorithm to identify associations between studied traits and genetic markers. Using our novel method, we analyzed the three continuous phenotypes in the GAW17 data set across 200 replicates, with intriguing results.
Year: 2011 PMID: 22373502 PMCID: PMC3287887 DOI: 10.1186/1753-6561-5-S9-S5
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Identified genes and covariates in at least 5 out of 200 replicates for Q1
| Gene/covariate | Average coefficient estimate | SD^a | Frequency |
|---|---|---|---|
| Age | 0.01667 | 0.00154 | 200 |
| Smoke^b | 0.49877 | 0.06437 | 200 |
| | 0.78316 | 0.10969 | 200 |
| | 0.65308 | 0.16401 | 53 |
| | 0.79018 | 0.30581 | 12 |
| | 0.87993 | 0.28302 | 6 |
a The average coefficient estimate and its standard deviation are calculated over the replicates in which the component has a nonzero coefficient.
b Smoke is coded as 1 for smokers and 0 for nonsmokers.
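The summary columns in this table (frequency of selection, and the average and standard deviation over nonzero replicates only) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `summarize_selection`, the array layout (one row per replicate, one column per component, with 0 meaning "not selected"), and the use of the sample standard deviation (`ddof=1`) are all assumptions, and the demo numbers are made up.

```python
import numpy as np


def summarize_selection(coefs, names, min_freq=5):
    """Summarize selected components across replicates.

    coefs: array of shape (n_replicates, n_components); an entry of 0
           means the component was not selected in that replicate
           (assumed layout, not taken from the paper).
    names: component labels, one per column.
    min_freq: keep only components selected in at least this many replicates.
    Returns a list of (name, mean, sd, frequency) tuples.
    """
    coefs = np.asarray(coefs, dtype=float)
    rows = []
    for j, name in enumerate(names):
        column = coefs[:, j]
        nonzero = column[column != 0.0]  # replicates where the component was selected
        freq = int(nonzero.size)
        if freq < min_freq:
            continue
        mean = float(nonzero.mean())
        # Sample SD is an assumption; the source does not state which SD is used.
        sd = float(nonzero.std(ddof=1)) if freq > 1 else float("nan")
        rows.append((name, mean, sd, freq))
    return rows


# Toy demo with synthetic coefficients (3 replicates, 2 hypothetical components).
demo = [[0.5, 0.0],
        [0.7, 0.0],
        [0.6, 1.0]]
for row in summarize_selection(demo, ["x", "y"], min_freq=2):
    print(row)
```

With `min_freq=5` and 200 replicates, the same function would apply the "at least 5 out of 200 replicates" filter used in the table captions.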
Figure 1. Identified SNPs in at least 5 out of 200 replicates for Q1. The x-axis indicates the chromosomal position of each SNP. The y-axis represents the frequency at which SNPs were identified as having nonzero effects across 200 replicates. Red dots represent true positives, and blue dots represent false positives.
Figure 2. Identified SNPs for Q1 within ARNT, KDR, and FLT1. The frequencies of identified SNPs within these three genetic regions are shown. The x-axes indicate the chromosomal position of each SNP. The y-axes represent the frequency at which SNPs were identified as having nonzero effects across 200 replicates. Red dots represent true positives, and blue dots represent false positives.
Identified genes and covariates in at least 5 out of 200 replicates for Q2
| Gene/covariate | Average coefficient estimate | SD^a | Frequency |
|---|---|---|---|
| | 1.35707 | 0.31121 | 12 |
| | 0.99105 | 0.17755 | 7 |
a The average coefficient estimate and its standard deviation are calculated over the replicates in which the component has a nonzero coefficient.
Figure 3. Identified SNPs in at least 5 out of 200 replicates for Q2. The x-axis indicates the chromosomal position of each SNP. The y-axis represents the frequency at which SNPs were identified as having nonzero effects across 200 replicates. Red dots represent true positives, and blue dots represent false positives.
Identified genes and covariates in at least 5 out of 200 replicates for Q4
| Gene/covariate | Average coefficient estimate | SD^a | Frequency |
|---|---|---|---|
| Age | -0.04591 | 0.00064 | 200 |
| Smoke^b | -0.36779 | 0.04127 | 200 |
| Sex^c | 0.22870 | 0.03260 | 199 |
a The average coefficient estimate and its standard deviation are calculated over the replicates in which the component has a nonzero coefficient.
b Smoke is coded as 1 for smokers and 0 for nonsmokers.
c Sex is coded as 1 for males and 0 for females.