| Literature DB >> 26027500 |
Jae Hoon Sul1, Towfique Raj2, Simone de Jong3, Paul I W de Bakker4, Soumya Raychaudhuri5, Roel A Ophoff6, Barbara E Stranger7, Eleazar Eskin8, Buhm Han9.
Abstract
In studies of expression quantitative trait loci (eQTLs), it is of increasing interest to identify eGenes, the genes whose expression levels are associated with variation at a particular genetic variant. Detecting eGenes is important for follow-up analyses and prioritization because genes are the main entities in biological processes. To detect eGenes, one typically focuses on the genetic variant with the minimum p value among all variants in cis with a gene and corrects for multiple testing to obtain a gene-level p value. For performing multiple-testing correction, a permutation test is widely used. Because of growing sample sizes of eQTL studies, however, the permutation test has become a computational bottleneck in eQTL studies. In this paper, we propose an efficient approach for correcting for multiple testing and assess eGene p values by utilizing a multivariate normal distribution. Our approach properly takes into account the linkage-disequilibrium structure among variants, and its time complexity is independent of sample size. By applying our small-sample correction techniques, our method achieves high accuracy in both small and large studies. We have shown that our method consistently produces extremely accurate p values (accuracy > 98%) for three human eQTL datasets with different sample sizes and SNP densities: the Genotype-Tissue Expression pilot dataset, the multi-region brain dataset, and the HapMap 3 dataset.Entities:
Mesh:
Year: 2015 PMID: 26027500 PMCID: PMC4457958 DOI: 10.1016/j.ajhg.2015.04.012
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.025