Multiple testing in genome-wide association studies via hidden Markov models.

Zhi Wei, Wenguang Sun, Kai Wang, Hakon Hakonarson.

Abstract

MOTIVATION: Genome-wide association studies (GWAS) interrogate common genetic variation across the entire human genome in an unbiased manner and hold promise in identifying genetic variants with moderate or weak effect sizes. However, conventional testing procedures, which are mostly P-value based, ignore the dependency and therefore suffer from loss of efficiency. The goal of this article is to exploit the dependency information among adjacent single nucleotide polymorphisms (SNPs) to improve the screening efficiency in GWAS.
RESULTS: We propose to model the linear block dependency in the SNP data using hidden Markov models (HMMs). A compound decision-theoretic framework for testing HMM-dependent hypotheses is developed. We propose a powerful data-driven procedure [pooled local index of significance (PLIS)] that controls the false discovery rate (FDR) at the nominal level. PLIS is shown to be optimal in the sense that it has the smallest false negative rate (FNR) among all valid FDR procedures. By re-ranking significance for all SNPs with dependency considered, PLIS gains higher power than conventional P-value based methods. Simulation results demonstrate that PLIS dominates conventional FDR procedures in detecting disease-associated SNPs. Our method is applied to the analysis of SNP data from a GWAS of type 1 diabetes. Compared with the Benjamini-Hochberg (BH) procedure, PLIS yields more accurate results and better reproducibility of findings.
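The abstract does not spell out the computation, so the following is only a minimal sketch of the general idea, not the authors' implementation or the R package. It assumes a two-state HMM (state 0 = null, state 1 = associated) over per-SNP z-scores with Gaussian emissions and known parameters; in practice those parameters would have to be estimated. The function names (`lis_two_state_hmm`, `lis_step_up`), the Gaussian alternative, and all numeric values are hypothetical. The local index of significance is taken here to be the posterior probability that a SNP is null given all observations, and the rejection rule is a step-up rule on the running mean of the sorted values. Reading "pooled" as concatenating the values from separately fitted chains (for example, one per chromosome) before thresholding is also an assumption.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def lis_two_state_hmm(z, trans, init, alt_mean, alt_sd):
    """Posterior P(state_i = 0 | all z) via scaled forward-backward.

    Assumed model (hypothetical): null emissions are N(0, 1), non-null
    emissions are N(alt_mean, alt_sd^2); trans[r][s] = P(next = s | current = r).
    """
    n = len(z)
    emit = [(normal_pdf(x, 0.0, 1.0), normal_pdf(x, alt_mean, alt_sd)) for x in z]

    # Forward pass, rescaled at each step to avoid numerical underflow.
    alpha = []
    prev = None
    for i in range(n):
        if i == 0:
            a = [init[s] * emit[0][s] for s in (0, 1)]
        else:
            a = [emit[i][s] * sum(prev[r] * trans[r][s] for r in (0, 1))
                 for s in (0, 1)]
        total = a[0] + a[1]
        prev = [a[0] / total, a[1] / total]
        alpha.append(prev)

    # Backward pass, also rescaled.
    beta = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        b = [sum(trans[s][r] * emit[i + 1][r] * beta[i + 1][r] for r in (0, 1))
             for s in (0, 1)]
        total = b[0] + b[1]
        beta[i] = [b[0] / total, b[1] / total]

    # Combine: posterior probability of the null state at each position.
    lis = []
    for i in range(n):
        g0 = alpha[i][0] * beta[i][0]
        g1 = alpha[i][1] * beta[i][1]
        lis.append(g0 / (g0 + g1))
    return lis


def lis_step_up(lis, fdr_level):
    """Reject the k smallest values, with k the largest rank whose running
    mean of sorted values is <= fdr_level. Returns rejected indices, sorted."""
    order = sorted(range(len(lis)), key=lambda i: lis[i])
    running_sum = 0.0
    k = 0
    for rank, idx in enumerate(order, start=1):
        running_sum += lis[idx]
        if running_sum / rank <= fdr_level:
            k = rank
    return sorted(order[:k])


# Toy usage with made-up z-scores: a short run of large values in the middle.
z_scores = [0.1, -0.3, 3.5, 4.0, 3.8, 0.2]
lis_values = lis_two_state_hmm(
    z_scores,
    trans=[[0.95, 0.05], [0.20, 0.80]],  # hypothetical transition matrix
    init=[0.9, 0.1],                     # hypothetical initial distribution
    alt_mean=3.0,
    alt_sd=1.0,
)
rejected = lis_step_up(lis_values, fdr_level=0.05)
```

Because each SNP's index depends on its neighbours through the forward and backward passes, a moderately large z-score inside a run of signals can rank above an isolated one of the same size, which is one way the rankings can differ from P-value rankings.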
CONCLUSION: The genomic rankings based on our procedure are substantially different from the rankings based on the P-values. By integrating information from adjacent locations, the PLIS rankings benefit from an increased signal-to-noise ratio; hence our procedure often has higher statistical power and better reproducibility. It provides a promising direction in large-scale GWAS.
AVAILABILITY: An R package, PLIS, has been developed to implement the PLIS procedure. Source code is available upon request and will be available on CRAN (http://cran.r-project.org/).
CONTACT: zhiwei@njit.edu
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Year: 2009   PMID: 19654115   DOI: 10.1093/bioinformatics/btp476

Source DB: PubMed   Journal: Bioinformatics   ISSN: 1367-4803   Impact factor: 6.937

