Taebeom Kim, Peng Wei.
Abstract
With the rapidly decreasing cost of next-generation sequencing technology, a large number of whole genomes have been sequenced, enabling researchers to survey rare variants in the protein-coding and regulatory regions of the genome. However, identifying functional variants associated with complex diseases from whole genome sequencing (WGS) data remains a daunting task because there are millions of candidate variants but only moderate sample sizes. We propose incorporating Encyclopedia of DNA Elements (ENCODE) information into the association analysis of WGS data to boost statistical power. We use RegulomeDB and PolyPhen2 scores as external weights in existing rare-variant association tests. We demonstrate the proposed framework using the WGS data and blood pressure phenotype from the San Antonio Family Studies provided by the Genetic Analysis Workshop 19. We identified a genome-wide significant locus in the gene SNUPN on chromosome 15 that harbors a rare nonsynonymous variant; this locus was not detected by benchmark methods that did not incorporate biological information, including the T5 burden test and the sequence kernel association test (SKAT).
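The idea of using external annotation scores as variant weights can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes unrelated individuals (the San Antonio Family Studies data are family-based, so a real analysis would need to account for relatedness), it assumes that "T5" means collapsing variants with minor allele frequency below 5%, and it takes the per-variant weights as given, because the abstract does not say how RegulomeDB or PolyPhen2 scores are converted to weights. The function names are hypothetical. The sketch computes only the test statistics, not p values.

```python
def minor_allele_freq(column):
    """Minor allele frequency of one variant from 0/1/2 minor-allele counts."""
    return sum(column) / (2 * len(column))


def weighted_burden(genotypes, weights, maf_cutoff=0.05):
    """Per-individual burden score: weighted sum of minor-allele counts,
    restricted to variants with MAF below maf_cutoff (assumed T5-style rule).

    genotypes: list of rows, one per individual, each a list of 0/1/2 counts.
    weights:   one external weight per variant (assumed to be derived from
               annotation scores; the mapping is not specified here).
    """
    n_variants = len(weights)
    columns = [[row[j] for row in genotypes] for j in range(n_variants)]
    rare = [j for j in range(n_variants)
            if minor_allele_freq(columns[j]) < maf_cutoff]
    return [sum(weights[j] * row[j] for j in rare) for row in genotypes]


def skat_statistic(genotypes, phenotype, weights):
    """SKAT-style score statistic Q = sum_j w_j * S_j**2, where
    S_j = sum_i G_ij * (y_i - mean(y)).

    Uses an intercept-only null model (no covariates), which is a
    simplifying assumption of this sketch.
    """
    mean_y = sum(phenotype) / len(phenotype)
    residuals = [y - mean_y for y in phenotype]
    q = 0.0
    for j, w in enumerate(weights):
        s_j = sum(row[j] * r for row, r in zip(genotypes, residuals))
        q += w * s_j ** 2
    return q
```

In this sketch, a variant with a larger external weight contributes more to both the burden score and Q, which is how annotation evidence could raise power for functional variants while down-weighting likely neutral ones.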
Year: 2016 PMID: 27980646 PMCID: PMC5133533 DOI: 10.1186/s12919-016-0040-y
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Fig. 1. Manhattan plots of −log10(p value) for regSKAT and regT5 for the odd-numbered chromosomes from 1 to 21. The −log10(p value) of SKAT, uwSKAT, T5, or MB was also plotted if it was greater than 5. The red line corresponds to the genome-wide significance threshold, while the blue line corresponds to a p value of 1e-05.
Fig. 2. Functional annotation of the top windows on chromosome 15 and phenotype histogram for the genome-wide significant window. Panels a and b: bar plots showing the frequency of functional categories in the sliding windows that had at least 1 test with a p value of <1e-05 within each of the SKAT and T5 frameworks. The x-axis corresponds to the center variant position in a sliding window. Dots in each window show . Panel c: histogram showing the distribution of systolic blood pressure (SBP) in 789 individuals. Carriers of the variants in the genome-wide significant window centered at variant chr15:75912109 are highlighted. Dotted lines indicate the 10th, 50th, and 90th percentiles of the observed SBP.