Peng Wei, Xiaoming Liu, Yun-Xin Fu.
Abstract
Next-generation sequencing has opened up new avenues for the genetic study of complex traits. However, because of the small number of observations for any given rare allele and high sequencing error, it is a challenge to identify functional rare variants associated with the phenotype of interest. Recent research shows that grouping variants by gene and incorporating computationally predicted functions of variants may provide higher statistical power. On the other hand, many algorithms are available for predicting the damaging effects of nonsynonymous variants. Here, we use the simulated mini-exome data of Genetic Analysis Workshop 17 to study and compare the effects of incorporating the functional predictions of single-nucleotide polymorphisms using two popular algorithms, SIFT and PolyPhen-2, into a gene-based association test. We also propose a simple mixture model that can effectively combine test results based on different functional prediction algorithms.
Year: 2011 PMID: 22373178 PMCID: PMC3287855 DOI: 10.1186/1753-6561-5-S9-S20
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Figure 1. SIFT scores versus PolyPhen-2 scores. (a) (1 − SIFT score) plotted against PolyPhen-2 score. The red dashed lines correspond to the thresholds for predicting deleterious variants: 0.95 for SIFT and 0.2 for PolyPhen-2. The blue solid line corresponds to the LOESS curve (locally weighted scatterplot smoothing). (b) SIFT-based VT test p-values plotted against PolyPhen-2-based VT test p-values. Red plus signs correspond to genes that had tied rank 1 (posterior probabilities of association equal to 1) by the mixture model combining both tests. (c) Enlarged section of part b. (d) SIFT-based VT test z-values plotted against PolyPhen-2-based VT test z-values. Red plus signs correspond to genes that had tied rank 1 by the mixture model combining both tests. (e) Raw versus recalibrated PolyPhen-2 scores; the solid line is the identity line. (f) Raw versus recalibrated PolyPhen-2 score-based VT test p-values.
Figure 2. Venn diagrams for the top 100 genes. Top 100 genes found by (a) PolyPhen-2, SIFT, and binary-weight-based VT tests and (b) unweighted, SIFT, and binary-weight-based VT tests.
Top ten genes ranked by SIFT-based VT test p-value
| Gene | SIFT | PolyPhen-2 | Binary | Unweighted | Number of SNPs | Number of nonsynonymous SNPs |
|---|---|---|---|---|---|---|
| | 0.0002 | 0.0001 | 0.0007 | 0.0003 | 34 | 23 |
| | 0.0002 | 0.0005 | 0.0002 | 0.0004 | 22 | 15 |
| | 0.0003 | 0.0002 | 0.0009 | 0.0032 | 39 | 30 |
| | 0.0003 | 0.0003 | 0.0016 | 0.0004 | 30 | 20 |
| | 0.0003 | 0.0007 | 0.0002 | 0.0002 | 35 | 20 |
| | 0.0004 | 0.0003 | 0.0004 | 0.0066 | 18 | 6 |
| | 0.0005 | 0.0005 | 0.0021 | 0.0119 | 15 | 7 |
| | 0.0007 | 0.0144 | 0.0011 | 0.0010 | 36 | 16 |
| | 0.0009 | 0.0006 | 0.0040 | 0.0006 | 10 | 6 |
| | 0.0009 | 0.0008 | 0.0015 | 0.0005 | 45 | 29 |
All genes had tied rank 1 by the mixture model combining both SIFT-based and PolyPhen-2-based VT test p-values. P-values were obtained from 10,000 permutations.
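The p-values above come from permutation-based VT (variable-threshold) tests with different variant weights. As a rough illustration of how a weighted variable-threshold statistic and its permutation p-value can be computed, here is a minimal Python sketch. It is not the authors' implementation: the function names, the toy data, the one-sided statistic floored at zero, and the (hits + 1) / (n_perm + 1) estimator are all assumptions made for illustration, and the weights merely stand in for functional scores such as those derived from SIFT or PolyPhen-2.

```python
import math
import random


def vt_statistic(genotypes, phenotypes, weights):
    """Simplified weighted variable-threshold statistic (one-sided).

    genotypes:  one row per individual, one minor-allele count (0/1/2) per variant.
    phenotypes: one quantitative trait value per individual.
    weights:    one non-negative weight per variant (hypothetical functional scores).
    """
    n_variants = len(weights)
    mean_y = sum(phenotypes) / len(phenotypes)
    centered = [y - mean_y for y in phenotypes]
    # Total minor-allele count of each variant in the sample.
    counts = [sum(row[v] for row in genotypes) for v in range(n_variants)]

    best = 0.0
    # Try every observed allele count as the rarity threshold; keep the largest z
    # (the result is floored at 0 because `best` starts at 0).
    for threshold in sorted({c for c in counts if c > 0}):
        included = [v for v in range(n_variants) if 0 < counts[v] <= threshold]
        num = 0.0
        den = 0.0
        for row, y_c in zip(genotypes, centered):
            for v in included:
                term = weights[v] * row[v]
                num += term * y_c
                den += term * term
        if den > 0:
            best = max(best, num / math.sqrt(den))
    return best


def permutation_p_value(genotypes, phenotypes, weights, n_perm=1000, seed=0):
    """Permutation p-value: shuffle phenotypes and recompute the statistic."""
    rng = random.Random(seed)
    observed = vt_statistic(genotypes, phenotypes, weights)
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if vt_statistic(genotypes, shuffled, weights) >= observed:
            hits += 1
    # The +1 correction keeps the estimate strictly positive (an assumption here).
    return (hits + 1) / (n_perm + 1)


# Toy data (made up): 20 individuals, 3 variants. The first three individuals
# carry the two rarest variants and have the elevated trait value.
genotypes = [[0, 0, 0] for _ in range(20)]
genotypes[0][0] = genotypes[1][0] = 1   # variant 0: two carriers
genotypes[2][1] = 1                     # variant 1: one carrier
for i in range(3, 9):
    genotypes[i][2] = 1                 # variant 2: six carriers
phenotypes = [5.0] * 3 + [0.0] * 17
weights = [1.0, 1.0, 0.2]               # hypothetical functional weights

p_value = permutation_p_value(genotypes, phenotypes, weights, n_perm=200, seed=1)
```

Setting all weights to 1 gives an unweighted version of the same statistic, and setting them to 0 or 1 gives a binary-weighted one, which is one way to read the column headings of the table. The smallest p-value this estimator can return is 1 / (n_perm + 1), so resolving values near 0.0002 requires on the order of 10,000 permutations.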