Jeesun Jung, Jessica Dantzer, Yunlong Liu.
Abstract
Advances in sequencing technologies have made it easier to identify rare variants that are responsible for complex disease. However, statistical methods that can handle the vast amount of data generated and that can interpret the complicated relationship between disease and these variants have lagged. We applied a zero-inflated Poisson regression model to take into account the excess of zeros caused by the extremely low frequency of the 24,487 exonic variants in the Genetic Analysis Workshop 17 (GAW17) data. We grouped the 697 subjects in the data set as Europeans, Asians, and Africans based on principal components analysis and computed the total number of rare variants per gene for each individual. We then analyzed these collapsed variants under the assumption that rare variants are enriched in people affected by a disease compared with unaffected people. We also tested the hypothesis with quantitative traits Q1, Q2, and Q4. Analyses performed on the combined 697 individuals and on each ethnic group yielded different results. For the combined population analysis, we found that UGT1A1, which was not part of the simulation model, was associated with disease liability and that FLT1, which was a causal locus in the simulation model, was associated with Q1. Of the causal loci in the simulation models, FLT1 and KDR were associated with Q1 and VNN1 was correlated with Q2. No genes were significantly associated with Q4. These results show the feasibility and capability of our new statistical model to detect multiple rare variants influencing disease risk.
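The collapsing step and the zero-inflated Poisson (ZIP) likelihood described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the 0.01 rare-variant cutoff, the log link, and the constant zero-inflation probability `pi` are all assumptions made for the example.

```python
import math

def collapse_rare_variants(genotypes, maf, threshold=0.01):
    """Sum minor-allele counts over the rare SNPs of one gene, per subject.

    genotypes: one list per subject with 0/1/2 minor-allele counts per SNP
    maf: minor allele frequency of each SNP (same order as the genotype columns)
    threshold: hypothetical cutoff below which a SNP is treated as rare
    """
    rare = [j for j, f in enumerate(maf) if f < threshold]
    return [sum(row[j] for j in rare) for row in genotypes]

def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson with rate lam > 0 and
    extra-zero probability pi (0 <= pi < 1)."""
    poisson = math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))
    extra_zero = pi if y == 0 else 0.0
    return extra_zero + (1.0 - pi) * poisson

def zip_loglik(counts, x, b0, b1, pi):
    """Log-likelihood of collapsed counts given a covariate x (for example,
    affection status), assuming lam_i = exp(b0 + b1 * x_i) and constant pi."""
    total = 0.0
    for y, xi in zip(counts, x):
        lam = math.exp(b0 + b1 * xi)
        total += math.log(zip_pmf(y, lam, pi))
    return total
```

With `pi = 0` the model reduces to ordinary Poisson regression; a larger `pi` absorbs the many subjects who carry no rare variant in the gene.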
Year: 2011 PMID: 22373445 PMCID: PMC3287826 DOI: 10.1186/1753-6561-5-S9-S103
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Minor allele frequencies of SNPs within two randomly selected genes
| Gene | Chromosome | SNP | Function | Minor allele | Minor allele frequency, All* | Minor allele frequency, Europeans | Minor allele frequency, Asians | Minor allele frequency, Africans |
|---|---|---|---|---|---|---|---|---|
| | 1 | C1S8678 | Nonsynonymous | A | 0.341 | 0.032 | 0.579 | 0.982 |
| | | C1S8682 | Synonymous | C | 0.001 | 0 | 0 | 0.005 |
| | | C1S8684 | Synonymous | A | 0.311 | 0.923 | 0.106 | 0.964 |
| | | C1S8686 | Nonsynonymous | C | 0.325 | 0.942 | 0.134 | 0.959 |
| | 18 | C18S2642 | Synonymous | C | 0.006 | 0 | 0.022 | 0 |
| | | C18S2643 | Synonymous | A | 0.009 | 0 | 0.034 | 0 |
| | | C18S2648 | Synonymous | T | 0.006 | 0 | 0.028 | 0 |
* All 697 individuals were used to calculate the minor allele frequency.
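Because the minor allele is defined in the pooled sample, its frequency within a single ethnic group can exceed 0.5, as in several rows of the table above. A small sketch of the calculation, assuming genotypes coded as 0, 1, or 2 copies of the pooled minor allele (the function names and toy data are hypothetical):

```python
def allele_frequency(genotypes):
    """Frequency of the counted allele, given 0/1/2 copies per individual."""
    return sum(genotypes) / (2 * len(genotypes))

def frequencies_by_group(genotypes, groups):
    """Allele frequency in the pooled sample ("All") and within each group."""
    by_group = {}
    for g, label in zip(genotypes, groups):
        by_group.setdefault(label, []).append(g)
    result = {label: allele_frequency(vals) for label, vals in by_group.items()}
    result["All"] = allele_frequency(genotypes)
    return result

# Toy data: the allele is rare in group "A" but fixed in group "B".
toy_genotypes = [0, 0, 0, 0, 0, 1, 2, 2]
toy_groups = ["A", "A", "A", "A", "A", "A", "B", "B"]
toy_freqs = frequencies_by_group(toy_genotypes, toy_groups)
```

In the toy data the allele is the minor allele overall and in group "A", yet it is the only allele observed in group "B".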
Genes associated with disease liability, Q1, and Q2 for the pooled populations at the significance levels of 0.05 and 1.56 × 10⁻⁵
| Trait | All SNPs (nonsynonymous and synonymous): Gene | All SNPs: Replicates significant at 0.05 (%)a | All SNPs: Replicates significant at 1.56 × 10⁻⁵ (%)b | Nonsynonymous SNPs: Gene | Nonsynonymous SNPs: Replicates significant at 0.05 (%)a | Nonsynonymous SNPs: Replicates significant at 1.56 × 10⁻⁵ (%)b |
|---|---|---|---|---|---|---|
| Disease liability (case-control) | 77.5 | 68.5 | 72.5 | 36.5 | ||
| 84 | 79.5 | 74 | 66.5 | |||
| 85.5 | 76.5 | 74 | 64 | |||
| 83.5 | 71 | 84 | 77.5 | |||
| 72.5 | 55 | 72.5 | 5 | |||
| 77.5 | 72.5 | |||||
| 77 | 69.5 | |||||
| 70 | 65 | |||||
| 71.5 | 55.5 | |||||
| Q1 | 62.5 | 45 | 96.5 | 97.5 | ||
| 100 | 92 | 69.5 | 0 | |||
| 64.5 | 42.5 | |||||
| Q2 | 66.5 | 29 | 62 | 0 | ||
| 62.5 | 41.5 | |||||
| 71.5 | 51.5 | |||||
Listed genes were significant in at least 70% of replicates. Genes in boldface are the true causal genes in the simulation model.
a Percentage of replicates in which the gene was significantly associated with disease liability, Q1, or Q2 at the significance level of 0.05.
b Percentage of replicates in which the gene was significantly associated with disease liability, Q1, or Q2 at the significance level of 1.56 × 10⁻⁵.
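The percentages defined in footnotes a and b are simple proportions over simulation replicates. The sketch below assumes one p-value per replicate for a given gene; the helper names and toy p-values are hypothetical. It also illustrates an inference that is not stated here: 1.56 × 10⁻⁵ is the threshold a Bonferroni correction of 0.05 would give for about 3,205 tests.

```python
def percent_significant(p_values, alpha):
    """Percentage of replicates whose p-value falls below alpha."""
    hits = sum(1 for p in p_values if p < alpha)
    return 100.0 * hits / len(p_values)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests

# Toy p-values for one gene across four replicates (made up for illustration).
toy_p = [0.2, 0.01, 1e-6, 0.04]
at_nominal = percent_significant(toy_p, 0.05)
at_strict = percent_significant(toy_p, 1.56e-5)
assumed_threshold = bonferroni_threshold(0.05, 3205)  # 3205 tests is an assumption
```

The same proportion, computed only for genes that are causal in the simulation model, is what the power tables report.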
Genes significantly associated with disease liability for each ethnic group at the significance levels of 0.05 and 1.56 × 10⁻⁵
| Africans: Gene | Africans: Replicates significant at 0.05 (%)a | Africans: Replicates significant at 1.56 × 10⁻⁵ (%)b | Europeans: Gene | Europeans: Replicates significant at 0.05 (%)a | Europeans: Replicates significant at 1.56 × 10⁻⁵ (%)b | Asians: Gene | Asians: Replicates significant at 0.05 (%)a | Asians: Replicates significant at 1.56 × 10⁻⁵ (%)b |
|---|---|---|---|---|---|---|---|---|
| 84 | 71 | 72 | 57 | 95.5 | 95.5 | |||
| 77 | 50.5 | 71 | 38 | 72.5 | 64 | |||
| 73 | 68 | 74 | 65.5 | 74.5 | 53.5 | |||
| 71 | 59.5 | 83 | 22 | 74.5 | 69.5 | |||
| 71.5 | 51.5 | 86.5 | 82.5 | 87.5 | 67.5 | |||
| 86.5 | 74 | 89.5 | 47 | |||||
| 84.5 | 79.5 | |||||||
Listed genes were significant in at least 70% of replicates. None of the genes were true causal genes in the simulation model. Analysis is for all SNPs.
a Percentage of replicates in which the gene was significantly associated with disease liability at the significance level of 0.05.
b Percentage of replicates in which the gene was significantly associated with disease liability at the significance level of 1.56 × 10⁻⁵.
Power to detect causal genes (out of 35 genes) in the GAW17 simulation models using the combined sample
| Trait | All SNPs (nonsynonymous and synonymous): Gene | All SNPs: Power calculated at significance level 0.05 (%) | All SNPs: Power calculated at significance level 1.56 × 10⁻⁵ (%) | Nonsynonymous SNPs: Gene | Nonsynonymous SNPs: Power calculated at significance level 0.05 (%) | Nonsynonymous SNPs: Power calculated at significance level 1.56 × 10⁻⁵ (%) |
|---|---|---|---|---|---|---|
| Disease liability | 100 | 92 | 54.5 | 33.5 | ||
| FLT1 | 96.5 | 97.5 | ||||
| Q1 | 100 | 92 | 96.5 | 97.5 | ||
| 56.5 | 0 | 54.5 | 0 | |||
| Q2 | 46.5 | 0 | 62 | 0 | ||
Power to detect causal genes in the GAW17 simulation model as associated with disease liability using each ethnic group
| Africans: Gene | Africans: Power calculated at significance level 0.05 (%) | Africans: Power calculated at significance level 1.56 × 10⁻⁵ (%) | Europeans: Gene | Europeans: Power calculated at significance level 0.05 (%) | Europeans: Power calculated at significance level 1.56 × 10⁻⁵ (%) | Asians: Gene | Asians: Power calculated at significance level 0.05 (%) | Asians: Power calculated at significance level 1.56 × 10⁻⁵ (%) |
|---|---|---|---|---|---|---|---|---|
| 58 | 32 | 66.5 | 51.5 | 43 | 33 | |||
| 42 | 29.5 | |||||||
Analysis is for all SNPs.