Alex C Lam, Michael Schouten, Yurii S Aulchenko, Chris S Haley, Dirk-Jan de Koning.
Abstract
We applied a simple and efficient two-step method to analyze a family-based association study of gene expression quantitative trait loci (eQTL) in a mixed model framework. The two-step method produces results very similar to those of the full mixed model method while being substantially faster. Using the Genetic Analysis Workshop 15 (GAW15) Problem 1 data, we demonstrated the value of data filtering for reducing the number of tests and controlling the number of false positives. Specifically, we showed that removing non-expressed genes by filtering on expression variability reduced the number of tests by nearly 50%. Furthermore, we demonstrated that filtering on genotype counts substantially reduced spurious detection. Finally, we restricted our analysis to markers and transcripts that were closely located. We found five times more signals in close proximity (cis-) to transcripts than in our genome-wide analysis. Our results suggest that careful pre-filtering and partitioning of data are crucial for controlling false positives and allowing detection of genuine effects in genetic analysis of gene expression.
Year: 2007 PMID: 18466488 PMCID: PMC2367564 DOI: 10.1186/1753-6561-1-s1-s144
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. Comparison of the two-step and the full mixed model methods. Transformed (-log10) p-values of 10,000 samples are plotted.
Figure 2. Variability and expression level of expression traits. Frequency distribution of the interquartile range (IQR) of transcripts. The red dashed line indicates an IQR of 0.1. The inset histogram shows the expression level, as measured by the 75% quantile of each expression trait. All expression traits with an IQR under 0.1 are also lowly expressed, with a maximum expression level of 3.3 in log intensity (as indicated by the blue line).
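The variability filter described in this caption can be illustrated with a minimal sketch. It assumes the expression data are held as a NumPy array of log intensities with one row per transcript, and that a transcript is kept only when its IQR is strictly above the 0.1 cut-off; the function name `iqr_filter` and the toy values are hypothetical and are not taken from the study.

```python
import numpy as np

def iqr_filter(expr, threshold=0.1):
    """Return a boolean mask marking transcripts whose IQR exceeds `threshold`.

    expr: 2-D array of log intensities, rows = transcripts, columns = individuals.
    """
    # Percentiles are computed per transcript (along the columns).
    q75, q25 = np.percentile(expr, [75, 25], axis=1)
    return (q75 - q25) > threshold

# Toy data (hypothetical values, not from GAW15).
expr = np.array([
    [3.00, 3.02, 3.01, 3.03, 3.00],  # nearly flat, low expression
    [6.00, 7.50, 8.20, 6.90, 9.10],  # clearly variable
])

keep = iqr_filter(expr)   # mask of transcripts to retain
filtered = expr[keep]     # only the variable transcript remains
```

Whether the boundary value itself is kept or dropped is an assumption here; the caption only states that traits with an IQR under 0.1 were lowly expressed.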
Figure 3. Heritability of expression traits. IQR filtering mostly removed expression traits with low heritability.
Relationship between the minor genotype count and number of significant associations without the filtering of SNPs on genotype counts
| Minor genotype count | No. SNPs | No. hits | Max. no. hits by a single SNP | Avg. no. hits per SNP |
|---|---|---|---|---|
| 1 | 103 | 1054 | 200 | 10.23 |
| 2 | 55 | 333 | 147 | 6.05 |
| 3–6 | 166 | 508 | 48 | 3.06 |
| 7–10 | 56 | 107 | 12 | 1.91 |
| 11–15 | 52 | 85 | 9 | 1.63 |
| 16–20 | 42 | 56 | 4 | 1.33 |
| 21–30 | 45 | 65 | 5 | 1.44 |
| >30 | 51 | 74 | 6 | 1.45 |
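As a rough illustration of the table above, the sketch below recomputes the average number of hits per SNP from the tabulated counts and shows one way a filter on genotype counts might be written. It assumes that "minor genotype count" means the number of individuals in the smallest observed genotype class; the function names, the example genotype vectors, and the cut-off of 10 are hypothetical and are not the values used in the study.

```python
from collections import Counter

# (minor genotype count bin, no. SNPs, no. hits) copied from the table above.
rows = [
    ("1", 103, 1054), ("2", 55, 333), ("3-6", 166, 508), ("7-10", 56, 107),
    ("11-15", 52, 85), ("16-20", 42, 56), ("21-30", 45, 65), (">30", 51, 74),
]
# Average hits per SNP is simply hits divided by SNPs in each bin.
avg_hits = {label: round(hits / snps, 2) for label, snps, hits in rows}

def minor_genotype_count(genotypes):
    """Size of the smallest observed genotype class (missing calls ignored)."""
    counts = Counter(g for g in genotypes if g is not None)
    return min(counts.values())

def passes_filter(genotypes, min_count):
    """Keep a SNP only if every observed genotype class has >= min_count individuals."""
    return minor_genotype_count(genotypes) >= min_count

# Hypothetical SNPs: snp_a has a single individual in genotype class 4/4.
snp_a = ["1/1"] * 40 + ["1/4"] * 15 + ["4/4"] * 1
snp_b = ["1/1"] * 20 + ["1/4"] * 25 + ["4/4"] * 11

kept = [name for name, g in (("snp_a", snp_a), ("snp_b", snp_b))
        if passes_filter(g, min_count=10)]  # 10 is an illustrative cut-off
```

The recomputed averages agree with the last column of the table, and the example shows why SNPs such as `snp_a`, where one individual defines a whole genotype class, are the ones most exposed to outlier-driven associations.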
Figure 4. Scatter plot of the expression trait residuals of probeset 208835_s_at after Step 1. The spurious p-value of 2.6 × 10^-5 is caused by an outlier in genotype class 4/4.