| Literature DB >> 30982610 |
Zilin Li1, Xihao Li1, Yaowu Liu1, Jincheng Shen2, Han Chen3, Hufeng Zhou1, Alanna C Morrison4, Eric Boerwinkle5, Xihong Lin6.
Abstract
Whole-genome sequencing (WGS) studies are being widely conducted in order to identify rare variants associated with human diseases and disease-related traits. Classical single-marker association analyses for rare variants have limited power, and variant-set-based analyses are commonly used by researchers for analyzing rare variants. However, existing variant-set-based approaches need to pre-specify genetic regions for analysis; hence, they are not directly applicable to WGS data because of the large number of intergenic and intron regions that consist of a massive number of non-coding variants. The commonly used sliding-window method requires the pre-specification of fixed window sizes, which are often unknown as a priori, are difficult to specify in practice, and are subject to limitations given that the sizes of genetic-association regions are likely to vary across the genome and phenotypes. We propose a computationally efficient and dynamic scan-statistic method (Scan the Genome [SCANG]) for analyzing WGS data; this method flexibly detects the sizes and the locations of rare-variant association regions without the need to specify a prior, fixed window size. The proposed method controls for the genome-wise type I error rate and accounts for the linkage disequilibrium among genetic variants. It allows the detected sizes of rare-variant association regions to vary across the genome. Through extensive simulated studies that consider a wide variety of scenarios, we show that SCANG substantially outperforms several alternative methods for detecting rare-variant-associations while controlling for the genome-wise type I error rates. We illustrate SCANG by analyzing the WGS lipids data from the Atherosclerosis Risk in Communities (ARIC) study.Entities:
Keywords: dynamic windows; family-wise error rate; genome-wise error rate; power; rare variant analysis; scan statistics; whole-genome sequencing association studies
Mesh:
Year: 2019 PMID: 30982610 PMCID: PMC6507043 DOI: 10.1016/j.ajhg.2019.03.002
Source DB: PubMed Journal: Am J Hum Genet ISSN: 0002-9297 Impact factor: 11.025