G Bryce Christensen, Christophe G Lambert.
Abstract
To enable the assessment of compound heterozygosity, we propose a simple approach for incorporating genotype phase in a rare variant collapsing procedure for the analysis of DNA sequence data. When multiple variants are identified within a gene, knowing the phase of each variant may provide additional statistical power to detect associations with phenotypes that follow a recessive or additive inheritance pattern. We begin by phasing all marker data; then, we collapse nonsynonymous single-nucleotide polymorphisms within genes on each phased haplotype, resulting in a single diploid genotype for each gene, which represents whether one or both haplotypes carry a nonsynonymous variant allele. A recessive or additive association test can then be used to assess the relationship between the collapsed genotype and the phenotype of interest. We apply this approach to the unrelated individuals data from Genetic Analysis Workshop 17 and compare the results of the additive test with a dominant test in which phase is not informative. Analysis of the first phenotype replicate shows that the FLT1 gene is significantly associated with both Q1 and the binary affection status phenotype. This association was detected by both the additive and dominant tests, although the additive phase-informed test resulted in a smaller p-value. No false-positive results were detected in the first phenotype replicate. Analysis of the average values of all phenotype replicates correctly identified five other genes important to the simulation, but with an increase in false-positive rates. The accuracy of our method is contingent on correct phase determination.
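The collapsing step described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the input layout (two lists of 0/1 alleles per individual, restricted to the nonsynonymous SNPs of one gene) and the function names `collapse_phased` and `collapse_unphased` are assumptions made for the example, and the phasing itself is taken as already done.

```python
from typing import List, Tuple

# Assumed layout: one individual = two phased haplotypes for a single gene,
# each a list of 0/1 alleles (1 = nonsynonymous variant allele present).
Haplotypes = Tuple[List[int], List[int]]


def collapse_phased(haplotypes: Haplotypes) -> int:
    """Count haplotypes that carry at least one variant allele.

    Returns 0 (A/A), 1 (B/A), or 2 (B/B), usable as an additive genotype.
    """
    hap_a, hap_b = haplotypes
    return int(any(hap_a)) + int(any(hap_b))


def collapse_unphased(haplotypes: Haplotypes) -> int:
    """Dominant coding that ignores phase: 1 if any variant allele is present."""
    hap_a, hap_b = haplotypes
    return int(any(hap_a) or any(hap_b))


# Two individuals who are heterozygous at the same two SNPs.
cis = ([1, 1], [0, 0])    # both variants on one haplotype
trans = ([1, 0], [0, 1])  # one variant on each haplotype (compound heterozygote)

for label, person in (("cis", cis), ("trans", trans)):
    print(label, collapse_phased(person), collapse_unphased(person))
```

The two individuals are indistinguishable without phase, but the phased coding separates the compound heterozygote, whose two haplotypes both carry a variant, from the carrier of two variants on a single haplotype.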
Year: 2011 PMID: 22373223 PMCID: PMC3287937 DOI: 10.1186/1753-6561-5-S9-S95
Source DB: PubMed Journal: BMC Proc ISSN: 1753-6561
Results of association testing on the first phenotype replicate
| Phenotype | Gene (additive model, Beagle) | p-value | Gene (additive model, fastPHASE) | p-value | Gene (dominant model, no phase) | p-value |
|---|---|---|---|---|---|---|
| Affected | | 2.21 × 10⁻⁶ | | 1.04 × 10⁻⁶ | | 3.75 × 10⁻⁶ |
| | | 1.25 × 10⁻⁴ | | 1.49 × 10⁻⁴ | | 2.30 × 10⁻⁴ |
| | | 2.30 × 10⁻⁴ | | 2.30 × 10⁻⁴ | | 4.64 × 10⁻⁴ |
| Q1 | | 1.70 × 10⁻²⁰ | | 7.24 × 10⁻²⁰ | | 3.73 × 10⁻¹⁸ |
| | | 3.63 × 10⁻⁴ | | 3.63 × 10⁻⁴ | | 1.37 × 10⁻⁴ |
| | | 7.05 × 10⁻⁴ | | 7.15 × 10⁻⁴ | | 1.03 × 10⁻³ |
| Q2 | | 2.32 × 10⁻⁴ | | 3.84 × 10⁻⁴ | | 8.67 × 10⁻⁴ |
| | | 3.84 × 10⁻⁴ | | 4.06 × 10⁻⁴ | | 8.87 × 10⁻⁴ |
| | | 4.97 × 10⁻⁴ | | 4.97 × 10⁻⁴ | | 1.17 × 10⁻³ |
| Q4 | | 1.18 × 10⁻⁴ | | 4.35 × 10⁻⁴ | | 1.18 × 10⁻⁴ |
| | | 2.81 × 10⁻⁴ | | 2.81 × 10⁻⁴ | | 2.81 × 10⁻⁴ |
| | | 4.16 × 10⁻⁴ | | 4.36 × 10⁻⁴ | | 4.00 × 10⁻⁴ |
The three genes with the smallest p-values are listed for each test. Genes involved in the GAW17 phenotype simulation are marked with an asterisk.
Genotype counts for selected genes
| Gene | Method | Group | A/A | B/A | B/B |
|---|---|---|---|---|---|
| | Beagle | Case | 142 | 63 | 4 |
| | | Control | 412 | 74 | 2 |
| | fastPHASE | Case | 142 | 64 | 3 |
| | | Control | 412 | 76 | 0 |
| | Beagle | Case | 136 | 58 | 15 |
| | | Control | 348 | 123 | 17 |
| | fastPHASE | Case | 136 | 59 | 14 |
| | | Control | 348 | 123 | 17 |
A/A indicates that no nonsynonymous variants are present, B/A indicates that nonsynonymous variants were observed on only one haplotype, and B/B indicates that nonsynonymous variants were observed on both haplotypes. Counts are based on two phasing methods, Beagle and fastPHASE.
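To make the additive and dominant codings concrete, the sketch below runs a Cochran-Armitage trend test on the first case and control rows of the genotype counts table (Beagle phasing, counts in the order A/A, B/A, B/B). The choice of test is an assumption: the text says only that a recessive or additive association test can be used, and the p-values reported in the other tables may come from a different procedure, so the output is not expected to reproduce them. The function name `trend_test` and the score vectors are hypothetical.

```python
import math
from typing import Sequence, Tuple


def trend_test(cases: Sequence[int], controls: Sequence[int],
               scores: Sequence[float] = (0, 1, 2)) -> Tuple[float, float]:
    """Cochran-Armitage trend test on genotype counts.

    Returns (chi2, p), where chi2 = N * r^2 has 1 degree of freedom and
    r is the correlation between genotype score and case status.
    """
    totals = [a + b for a, b in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    mean_x = sum(s * t for s, t in zip(scores, totals)) / n
    mean_y = n_cases / n
    # Centered sums of squares and cross-products.
    sxy = sum(s * c for s, c in zip(scores, cases)) - n * mean_x * mean_y
    sxx = sum(s * s * t for s, t in zip(scores, totals)) - n * mean_x ** 2
    syy = n_cases - n * mean_y ** 2
    chi2 = n * sxy ** 2 / (sxx * syy)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Counts copied from the table: A/A, B/A, B/B.
cases = (142, 63, 4)
controls = (412, 74, 2)

add_chi2, add_p = trend_test(cases, controls, scores=(0, 1, 2))  # additive
dom_chi2, dom_p = trend_test(cases, controls, scores=(0, 1, 1))  # dominant
print(f"additive: chi2={add_chi2:.2f}, p={add_p:.3g}")
print(f"dominant: chi2={dom_chi2:.2f}, p={dom_p:.3g}")
```

With scores (0, 1, 1) the B/A and B/B columns are pooled, which is the coding in which phase carries no information; with scores (0, 1, 2) the B/B column contributes separately, which is where phase matters.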
Results of association testing for averaged phenotypes across all 200 simulation replicates
| Phenotype | Gene (additive model, Beagle) | p-value | Gene (additive model, fastPHASE) | p-value | Gene (dominant model, no phase) | p-value |
|---|---|---|---|---|---|---|
| Affected | | 4.70 × 10⁻⁸ | | 1.08 × 10⁻⁷ | | 2.90 × 10⁻⁷ |
| | | 3.75 × 10⁻⁵ | | 5.44 × 10⁻⁵ | | 5.44 × 10⁻⁵ |
| | | 5.44 × 10⁻⁵ | | 9.43 × 10⁻⁵ | | 9.43 × 10⁻⁵ |
| Q1 | | 7.43 × 10⁻⁶¹ | | 9.61 × 10⁻⁵⁷ | | 2.04 × 10⁻⁵⁸ |
| | | 7.22 × 10⁻⁸ | | 3.02 × 10⁻⁷ | | 7.22 × 10⁻⁸ |
| | | 1.45 × 10⁻⁶ | | 1.12 × 10⁻⁶ | | 2.26 × 10⁻⁶ |
| Q2 | | 1.98 × 10⁻¹⁶ | | 9.42 × 10⁻¹⁷ | | 5.89 × 10⁻¹³ |
| | | 7.64 × 10⁻⁷ | | 7.64 × 10⁻⁷ | | 7.64 × 10⁻⁷ |
| | | 1.26 × 10⁻⁶ | | 1.25 × 10⁻⁶ | | 7.95 × 10⁻⁷ |
| Q4 | | 3.35 × 10⁻⁴ | | 1.38 × 10⁻⁴ | | 6.22 × 10⁻⁵ |
| | | 3.62 × 10⁻⁴ | | 2.25 × 10⁻⁴ | | 1.11 × 10⁻⁴ |
| | | 5.79 × 10⁻⁴ | | 3.35 × 10⁻⁴ | | 2.22 × 10⁻⁴ |
Tests were run using the average phenotype values from the 200 simulation replicates. The three genes with the smallest p-values are listed for each test. Genes involved in the GAW17 phenotype simulation are marked with an asterisk.
Genes found by each analysis approach
| Trait | First replicate: additive model, Beagle | First replicate: additive model, fastPHASE | First replicate: dominant model (no phase) | Average of all 200 replicates: additive model, Beagle | Average of all 200 replicates: additive model, fastPHASE | Average of all 200 replicates: dominant model (no phase) |
|---|---|---|---|---|---|---|
| Affected | | | | | | |
| Q1 | | | | | | |
| Q2 | None | None | None | | | |
| Q4 | None | None | None | None | None | None |
Given the Bonferroni significance threshold for 2,196 tests (p < 2.27 × 10⁻⁵), we show the genes identified as significant for each phenotype using each analysis approach. Genes involved in the GAW17 phenotype simulation are marked with an asterisk.
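The threshold quoted above can be checked with a short calculation. The family-wise error rate of 0.05 is an assumption (the caption gives only the number of tests and the resulting threshold, which is consistent with 0.05 / 2,196), and `is_significant` is a hypothetical helper. The example p-values are the two smallest for the Affected phenotype under the additive model with Beagle in the first-replicate table.

```python
ALPHA = 0.05      # assumed family-wise error rate
N_TESTS = 2196    # number of tests stated in the caption

# Bonferroni correction: divide the error rate by the number of tests.
threshold = ALPHA / N_TESTS


def is_significant(p_value: float) -> bool:
    """True if the p-value passes the Bonferroni-corrected threshold."""
    return p_value < threshold


print(f"threshold = {threshold:.3e}")
for p in (2.21e-6, 1.25e-4):
    print(p, is_significant(p))
```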