Yangqing Deng, Wei Pan.
Abstract
Because sharing genomic data on a large scale raises issues of practicality and confidentiality, typically only meta- or mega-analyzed genome-wide association study (GWAS) summary data, not individual-level data, are publicly available. Reanalyses of such GWAS summary data for a wide range of applications have become increasingly common and useful; they often require an external reference panel with individual-level genotypic data to infer linkage disequilibrium (LD) among genetic variants. However, with a small sample size of only hundreds, as for the most popular 1000 Genomes Project European sample, estimation errors for LD are not negligible and often lead to dramatically increased numbers of false positives in subsequent analyses of GWAS summary data. To alleviate the problem in the context of association testing for a group of SNPs, we propose an alternative estimator of the covariance matrix, based on an idea similar to multiple imputation. We use numerical examples based on both simulated and real data to demonstrate the severe problem with the use of the 1000 Genomes Project reference panels and the improved performance of our new approach.
Keywords: 1000 Genomes Project; COJO analysis; Wald test; gene-based testing; multiple SNPs; multiple imputation; type I error
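The inflation of false positives described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' proposed estimator; it only shows the problem under simplifying assumptions that are not from the paper: Gaussian surrogates stand in for genotypes, the true LD follows an AR(1) pattern, and the function names and default parameters (`wald_stat`, `simulate_mean_stats`, `p`, `n_ref`, `rho`) are hypothetical. Under the null, the joint statistic z' R^{-1} z averages about p when the true LD matrix R is used, and noticeably more when R is estimated from a panel of only 100 samples.

```python
import numpy as np


def wald_stat(z, ld):
    """Joint Wald-type statistic z' R^{-1} z for a group of SNPs."""
    return float(z @ np.linalg.solve(ld, z))


def simulate_mean_stats(p=20, n_ref=100, rho=0.5, n_rep=500, seed=0):
    """Mean null statistic using the true LD vs. LD estimated from a small panel."""
    rng = np.random.default_rng(seed)
    idx = np.arange(p)
    # Assumed AR(1) LD structure: corr(i, j) = rho^|i - j|.
    true_ld = rho ** np.abs(idx[:, None] - idx[None, :])
    chol = np.linalg.cholesky(true_ld)

    stats_true, stats_ref = [], []
    for _ in range(n_rep):
        # Null z-scores whose correlation matrix equals the true LD.
        z = chol @ rng.standard_normal(p)
        # Gaussian surrogate for a reference panel of n_ref individuals.
        panel = rng.standard_normal((n_ref, p)) @ chol.T
        ref_ld = np.corrcoef(panel, rowvar=False)
        stats_true.append(wald_stat(z, true_ld))
        stats_ref.append(wald_stat(z, ref_ld))
    return float(np.mean(stats_true)), float(np.mean(stats_ref))


if __name__ == "__main__":
    mean_true, mean_ref = simulate_mean_stats()
    print(f"mean statistic with true LD:      {mean_true:.2f}")
    print(f"mean statistic with reference LD: {mean_ref:.2f}")
```

Comparing the statistic based on the reference panel against a chi-square distribution with p degrees of freedom would therefore reject too often, which is the type I error inflation that motivates an alternative covariance estimator.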
Year: 2018 PMID: 29674520 PMCID: PMC5972416 DOI: 10.1534/genetics.118.300813
Source DB: PubMed Journal: Genetics ISSN: 0016-6731 Impact factor: 4.562