Abra Brisbin, Gregory D Jenkins, Katarzyna A Ellsworth, Liewei Wang, Brooke L Fridley.
Abstract
Aggregating information across multiple variants in a gene or region can improve power for rare variant association testing. Power is maximized when the aggregation region contains many causal variants and few neutral variants. In this paper, we present a method for localizing the association signal within a region using a sliding-window approach to rare variant association testing. We first introduce a novel method for analysis of rare variants, the Difference in Minor Allele Frequency test (DMAF), which allows combined analysis of common and rare variants and makes no assumptions about the direction of effects. In whole-region analyses of simulated data with risk and protective variants, DMAF and other methods that pool data across individuals outperformed methods that pool data across variants. We then implement a sliding-window version of DMAF, using a step-down permutation approach to control type I error when testing multiple windows. In simulations, the sliding-window DMAF improved power to detect a causal sub-region compared with applying DMAF to the whole region. Sliding-window DMAF was also effective in localizing the causal sub-region. We also applied the DMAF sliding-window approach to test for an association between response to the drug gemcitabine and variants in the gene FKBP5, sequenced in 91 lymphoblastoid cell lines derived from white non-Hispanic individuals. The sliding-window test procedure detected an association in a sub-region spanning an exon and two introns when rare and common variants were analyzed together.
Keywords: multiple testing; rare variants; region-based analysis
Year: 2012 PMID: 22973297 PMCID: PMC3434438 DOI: 10.3389/fgene.2012.00173
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
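The abstract describes DMAF as a test on differences in minor allele frequency that is insensitive to the direction of effect. The excerpt does not give the formula, so the following is only a minimal sketch under stated assumptions: a binary case/control phenotype, genotypes coded as 0/1/2 minor-allele counts, and a statistic equal to the sum over variants of the absolute (`power=1`) or squared (`power=2`) difference in allele frequency, which is one plausible reading of the "DMAF abs" and "DMAF sq" labels in the tables. The function names are hypothetical.

```python
import random

def allele_freqs(genotypes):
    """Per-variant allele frequency; each row holds one subject's 0/1/2 counts."""
    n = len(genotypes)
    m = len(genotypes[0])
    return [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]

def dmaf_stat(genotypes, is_case, power=1):
    """Assumed statistic: sum over variants of |freq_case - freq_control| ** power.

    Taking the absolute or squared difference means risk and protective
    variants both increase the statistic instead of cancelling out.
    """
    cases = [g for g, c in zip(genotypes, is_case) if c]
    controls = [g for g, c in zip(genotypes, is_case) if not c]
    f_case = allele_freqs(cases)
    f_ctrl = allele_freqs(controls)
    return sum(abs(a - u) ** power for a, u in zip(f_case, f_ctrl))

def permutation_p(genotypes, is_case, n_perm=999, power=1, seed=0):
    """Permutation p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = dmaf_stat(genotypes, is_case, power)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # group sizes are preserved by shuffling
        if dmaf_stat(genotypes, labels, power) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)
```

Because frequencies are computed per group across subjects before anything is summed over variants, this sketch pools across subjects rather than across variants, matching the "Subjects" entry for DMAF in the methods table.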
Summary of models used in Simulation study I and II.
| Simulation study | Model | Type of causal variants | MAF of causal variants |
|---|---|---|---|
| I (Entire region) | A | Risk | ≤0.05 |
| I (Entire region) | B | Risk | ≤0.04 |
| I (Entire region) | C | Risk | ≤0.06 |
| I (Entire region) | D | Risk and Protective | ≤0.05 |
| I (Entire region) | E | Risk | ≤0.05, 0.10 |
| I (Entire region) | F | Risk and Protective | ≤0.05, 0.10 |
| II (Sub-Region) | G | Risk | ≤0.05 |
| II (Sub-Region) | H | Risk and Protective | ≤0.05 |
Number of risk (protective) variants per region in each simulation model.
| Model | Region 1 (262 variants) | Region 2 (237 variants) | Region 3 (233 variants) |
|---|---|---|---|
| A | 100 (0) | 96 (0) | 100 (0) |
| B | 97 (0) | 95 (0) | 98 (0) |
| C | 105 (0) | 97 (0) | 102 (0) |
| D | 100 (100) | 96 (95) | 100 (99) |
| E | 101 (0) | 97 (0) | 101 (0) |
| F | 101 (100) | 97 (95) | 101 (99) |
| G | 67 (0) | 64 (0) | 67 (0) |
| H | 34 (33) | 32 (32) | 34 (34) |
Rare variant association methods.
| Method | First author, reference | Protective | Pooling | Implementation |
|---|---|---|---|---|
| DMAF | Brisbin | Y | Subjects | |
| CMC | Li (Li and Leal, | N | SNVs, then Subjects | SNVs with MAF ≤ 0.01 collapsed (default); variants with MAF > 0.01 analyzed with Hotelling T² |
| RVT1 | Morris (Morris and Zeggini, | N | SNVs | Logistic regression |
| KBAC | Liu (Liu and Leal, | N | SNVs | Default |
| WSS | Madsen (Madsen and Browning, | N | SNVs | Empirical |
| VT | Price (Price et al., | N | SNVs | 10,000 permutations, variant weights = 1 |
| Hotel | Hotelling (Hotelling, | Y | Subjects | Blocks of 10 SNVs were analyzed with |
| aSum | Han (Han and Pan, | Y | SNVs | Empirical |
| C-alpha | Neale (Neale et al., | Y | Subjects | Empirical |
“Protective” column indicates whether the method is designed to accommodate protective variants; “pooling” indicates the dimension across which information is pooled.
Type I error rates for rare variant association methods.
| Method | Type I error rate |
|---|---|
| DMAF sq rare | 0.067 |
| DMAF abs rare | 0.057 |
| DMAF abs all | 0.053 |
| C-alpha rare | 0.067 |
| C-alpha all | 0.053 |
| Hotel | 0.020 |
| CMC | 0.067 |
| aSum | 0.053 |
| KBAC | 0.030 |
| RVT1 | 0.043 |
| VT | 0.040 |
| WSS | 0.037 |
Type I error rate calculated for null simulations across regions 1, 2, and 3 at a nominal α = 0.05.
Figure 1. Power at empirical type I error = 0.05 for each method and region. Panels (A–F) show results from the corresponding models A–F.
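The type I error table and the Figure 1 caption rely on two related quantities: the fraction of null simulations rejected at a nominal α, and power evaluated at a cutoff calibrated so that the empirical type I error equals α. The excerpt does not state how the calibration was done, so the sketch below is an assumption: it uses the α-quantile of the null p-values as the cutoff. All function names are hypothetical.

```python
def empirical_type1_error(null_p_values, alpha=0.05):
    """Fraction of null simulations with p-value at or below the nominal alpha."""
    return sum(p <= alpha for p in null_p_values) / len(null_p_values)

def empirical_threshold(null_p_values, alpha=0.05):
    """P-value cutoff that rejects about a fraction alpha of null simulations.

    Assumption: the cutoff is the k-th smallest null p-value, with
    k = floor(alpha * n). Ties among null p-values are ignored.
    """
    ordered = sorted(null_p_values)
    k = int(alpha * len(ordered))
    return ordered[k - 1] if k > 0 else 0.0

def power_at_empirical_alpha(null_p_values, alt_p_values, alpha=0.05):
    """Fraction of alternative simulations rejected at the calibrated cutoff."""
    cutoff = empirical_threshold(null_p_values, alpha)
    return sum(p <= cutoff for p in alt_p_values) / len(alt_p_values)
```

Calibrating the cutoff this way puts methods with conservative error rates (such as Hotel at 0.020) and mildly inflated ones (such as CMC at 0.067) on an equal footing before their power is compared.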
Difference in power of each method between models D and A and models F and E.
| Region | Models | DMAF sq rare | DMAF abs rare | DMAF abs all | C-alpha rare | C-alpha all | Hotel | CMC | aSum | KBAC | RVT1 | VT | WSS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | D–A | −0.09 | −0.24 | −0.76 | −0.23 | −0.73 | −0.63 | −0.43 | |||||
| 1 | F–E | −0.25 | −0.09 | −0.05 | −0.1 | ||||||||
| 2 | D–A | −0.02 | −0.11 | −0.04 | 0 | −0.07 | −0.9 | −0.28 | −0.87 | −0.86 | −0.68 | ||
| 2 | F–E | −0.91 | −0.15 | −0.91 | −0.89 | −0.58 | |||||||
| 3 | D–A | −0.81 | −0.29 | −0.8 | −0.85 | −0.41 | |||||||
| 3 | F–E | −0.04 | −0.56 | −0.18 | −0.59 | −0.62 | −0.44 |
Positive values (bold) indicate increased power when protective variants are added to the simulations.
Figure 2. Power vs. sliding window size. Solid lines depict power for each window size; dashed lines indicate power for analysis of the whole region without sliding windows, for (A) model G and (B) model H. Power is for a nominal α = 0.05.
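The sliding-window analysis tests many overlapping windows, and the abstract states that a step-down permutation approach controls type I error across them. The exact procedure is not given in this excerpt, so the sketch below substitutes the standard Westfall-Young step-down maxT adjustment as a stand-in. It assumes larger statistics are more significant and that each permutation supplies one statistic per window. `window_starts` and `stepdown_maxt` are hypothetical names.

```python
def window_starts(n_variants, size, step):
    """Start indices of sliding windows of `size` variants, moved by `step`."""
    return list(range(0, max(n_variants - size, 0) + 1, step))

def stepdown_maxt(observed, permuted):
    """Step-down maxT adjusted p-values (Westfall-Young style).

    observed: one statistic per window (larger = more significant).
    permuted: list of permutations, each a list with one statistic per window.
    """
    m = len(observed)
    # Windows ordered from most to least significant.
    order = sorted(range(m), key=lambda j: observed[j], reverse=True)
    counts = [0] * m
    for perm in permuted:
        running_max = float("-inf")
        # Successive maxima over the windows ranked k, k+1, ..., m-1.
        for k in range(m - 1, -1, -1):
            j = order[k]
            running_max = max(running_max, perm[j])
            if running_max >= observed[j]:
                counts[k] += 1
    n_perm = len(permuted)
    adjusted = [0.0] * m
    previous = 0.0
    for k in range(m):
        p = (counts[k] + 1) / (n_perm + 1)
        previous = max(previous, p)  # enforce monotone adjusted p-values
        adjusted[order[k]] = previous
    return adjusted
```

In use, each window from `window_starts` would be scored on the observed data and again on every label permutation, with the same permutations shared across windows so that the correlation between overlapping windows is preserved.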
Fraction of simulations in which the window or set of windows with the most significant .
| Window size | Region 1, Model G | Region 1, Model H | Region 2, Model G | Region 2, Model H | Region 3, Model G | Region 3, Model H |
|---|---|---|---|---|---|---|
| 10 | 0.735 | 0.614 | 0.942 | 0.650 | 0.848 | 0.906 |
| 20 | 0.704 | 0.585 | 0.959 | 0.715 | 0.859 | 0.948 |
| 30 | 0.559 | 0.548 | 0.931 | 0.728 | 0.882 | 0.940 |
| 40 | 0.419 | 0.486 | 0.868 | 0.723 | 0.866 | 0.920 |
| 50 | 0.378 | 0.422 | 0.643 | 0.718 | 0.800 | 0.860 |
| 60 | 0.349 | 0.367 | 0.554 | 0.621 | 0.755 | 0.775 |
| 70 | 0.395 | 0.373 | 0.485 | 0.546 | 0.688 | 0.667 |
| 80 | 0.484 | 0.569 | 0.297 | 0.158 | 0.615 | 0.569 |
.
| Method | Rare variants | All variants |
|---|---|---|
| DMAFsq | 0.511 | |
| DMAFabs | 0.439 | |
| VT | NA | 0.129 |
| C-alpha | 0.686 | 0.252 |
| C-alpha* | 0.675 | |
| aSum | 0.300 | 0.146 |
| CMC | 0.700 | |
| Hotel | 0.757 | |
| KBAC | 0.414 | 0.301 |
| RVT1 | 0.303 | 0.168 |
| WSS | 0.472 | 0.449 |
P-values less than 0.05 are in bold. *Indicates modified version based on heterogeneity of odds ratios.
Figure 3. Corrected −log10 P-values. Colors differentiate lengths of windows for analysis of (A) all variants and (B) only rare variants. “Position” is kb from start of gene.