Tamar Sofer, Jiwon Lee, Nuzulul Kurniansyah, Deepti Jain, Cecelia A Laurie, Stephanie M Gogarten, Matthew P Conomos, Ben Heavner, Yao Hu, Charles Kooperberg, Jeffrey Haessler, Ramachandran S Vasan, L Adrienne Cupples, Brandon J Coombes, Amanda Seyerle, Sina A Gharib, Han Chen, Jeffrey R O'Connell, Man Zhang, Daniel J Gottlieb, Bruce M Psaty, W T Longstreth, Jerome I Rotter, Kent D Taylor, Stephen S Rich, Xiuqing Guo, Eric Boerwinkle, Alanna C Morrison, James S Pankow, Andrew D Johnson, Nathan Pankratz, Alex P Reiner, Susan Redline, Nicholas L Smith, Kenneth M Rice, Elizabeth D Schifano.
Abstract
Whole-genome sequencing (WGS) and whole-exome sequencing studies have become increasingly available and are being used to identify rare genetic variants associated with health and disease outcomes. Investigators routinely use mixed models to account for genetic relatedness or other clustering variables (e.g., family or household) when testing genetic associations. However, no existing test of the association of a rare variant with a binary outcome in correlated data controls the type 1 error when there are (1) few individuals harboring the rare allele, (2) a small proportion of cases relative to controls, and (3) covariates to adjust for. Here, we address all three issues by developing a framework for testing rare variant association with a binary trait in individuals harboring at least one risk allele. In this framework, we estimate outcome probabilities under the null hypothesis and then use them, within the individuals with at least one risk allele, to test variant associations. We extend the BinomiRare test, which was previously proposed for independent observations, develop the Conway-Maxwell-Poisson (CMP) test, and study the properties of both tests in simulations. We show that the BinomiRare test always controls the type 1 error, while the CMP test sometimes does not. We then use the BinomiRare test to test the association of rare genetic variants in target genes with small-vessel disease (SVD) stroke, short sleep, and venous thromboembolism (VTE) in whole-genome sequence data from the Trans-Omics for Precision Medicine (TOPMed) program.
Year: 2021 PMID: 34337551 PMCID: PMC8321319 DOI: 10.1016/j.xhgg.2021.100040
Source DB: PubMed Journal: HGG Adv ISSN: 2666-2477
Figure 1. Step 1 of testing genetic associations using the proposed framework
A null model of association between the binary outcomes and covariates of interest is fitted, accounting for genetic relationship. Then, estimated conditional outcome probabilities are extracted to be used in the testing step.
Figure 2. Step 2 of testing genetic associations using the proposed framework
Based on the estimated outcome probabilities, variants are inspected one at a time. For a given variant, individuals harboring the rare allele are identified, and the null hypothesis is tested by assessing whether n, the number of these individuals who have the outcome, is consistent with their outcome probabilities under the null model.
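The testing step can be illustrated with a minimal Python sketch. It assumes the null-model outcome probabilities of the carriers are already available and treats the number of carriers with the outcome as a sum of independent Bernoulli draws (a Poisson-binomial count). The function names, the two-sided convention (summing the probabilities of all counts no more likely than the observed one), and the mid-p correction (subtracting half the probability of the observed count) are assumptions made for illustration, not a statement of the authors' implementation.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p_i) draws."""
    pmf = [1.0]  # distribution of the count before any individual is added
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)   # this carrier is a control
            nxt[k + 1] += mass * p       # this carrier is a case
        pmf = nxt
    return pmf


def carrier_test(null_probs, n_cases):
    """Two-sided exact p value and mid-p value (assumed conventions).

    null_probs: null-model outcome probabilities of the carriers.
    n_cases:    observed number of carriers who have the outcome.
    """
    pmf = poisson_binomial_pmf(null_probs)
    observed = pmf[n_cases]
    # Sum the mass of every count that is no more likely than the observed one;
    # the small relative tolerance guards against floating-point ties.
    p_value = min(1.0, sum(m for m in pmf if m <= observed * (1 + 1e-9)))
    mid_p = p_value - 0.5 * observed
    return p_value, mid_p


# Toy example: two carriers, each with null outcome probability 0.5.
print(carrier_test([0.5, 0.5], 2))
```

Because the calculation uses only the carriers' probabilities, its cost depends on the number of carriers rather than on the full sample size.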
Estimated type 1 error rates of BinomiRare and CMP tests in simulations with related individuals
| | | Estimated type 1 error by p value threshold | | | | | |
|---|---|---|---|---|---|---|---|
| MAF | exp( | 10⁻² | 10⁻³ | 10⁻⁴ | 10⁻² | 10⁻³ | 10⁻⁴ |
| 0.001 | 0.01 | 3.38E–03 | 2.08E–04 | 1.25E–05 | 3.78E–03 | 2.55E–04 | 1.88E–05 |
| 0.001 | 0.05 | 5.61E–03 | 4.45E–04 | 2.76E–05 | 5.63E–03 | 4.4E–04 | 3.24E–05 |
| 0.001 | 0.5 | 5.78E–03 | 3.22E–04 | 1.57E–05 | 5.44E–03 | 2.94E–04 | 1.26E–05 |
| 0.01 | 0.01 | 6.22E–03 | 4.62E–04 | 3.25E–05 | 6.36E–03 | 4.64E–04 | 3.35E–05 |
| 0.01 | 0.05 | 7.70E–03 | 6.58E–04 | 4.42E–05 | 7.83E–03 | 6.58E–04 | 5.1E–05 |
| 0.01 | 0.5 | 8.68E–03 | 8.28E–04 | 6.67E–05 | 8.21E–03 | 7.3E–04 | 6.56E–05 |
| 0.02 | 0.01 | 6.17E–03 | 4.27E–04 | 2.73E–05 | 6.41E–03 | 4.71E–04 | 3.38E–05 |
| 0.02 | 0.05 | 7.53E–03 | 6.27E–04 | 5.34E–05 | 7.52E–03 | 6.15E–04 | 5.27E–05 |
| 0.02 | 0.5 | 7.90E–03 | 6.87E–04 | 6.51E–05 | 7.56E–03 | 6.34E–04 | 5.55E–05 |
| 0.05 | 0.01 | 4.99E–03 | 2.88E–04 | 1.84E–05 | 5.21E–03 | 2.97E–04 | 1.48E–05 |
| 0.05 | 0.05 | 5.94E–03 | 4.49E–04 | 3.10E–05 | 5.93E–03 | 4.26E–04 | 2.73E–05 |
| 0.05 | 0.5 | 6.02E–03 | 4.69E–04 | 3.91E–05 | 5.67E–03 | 4.16E–04 | 3.34E–05 |
| 0.001 | 0.01 | 5.75E–02 | 6.59E–03 | 4.18E–04 | 6.01E–02 | 6.54E–03 | 4.41E–04 |
| 0.001 | 0.05 | 4.50E–02 | 4.44E–03 | 2.83E–04 | 4.22E–02 | 3.89E–03 | 2.26E–04 |
| 0.001 | 0.5 | 3.44E–02 | 9.29E–04 | 6.28E–06 | 3.34E–02 | 7.78E–04 | 8.21E–06 |
| 0.01 | 0.01 | 2.34E–02 | 2.19E–03 | 1.70E–04 | 2.25E–02 | 2.09E–03 | 1.68E–04 |
| 0.01 | 0.05 | 1.62E–02 | 1.55E–03 | 1.37E–04 | 1.53E–02 | 1.44E–03 | 1.20E–04 |
| 0.01 | 0.5 | 9.30E–03 | 7.53E–04 | 4.41E–05 | 8.71E–03 | 6.74E–04 | 4.65E–05 |
| 0.02 | 0.01 | 1.76E–02 | 1.50E–03 | 1.11E–04 | 1.69E–02 | 1.44E–03 | 1.16E–04 |
| 0.02 | 0.05 | 1.11E–02 | 1.10E–03 | 1.02E–04 | 1.02E–02 | 9.77E–04 | 8.58E–05 |
| 0.02 | 0.5 | 7.86E–03 | 6.30E–04 | 5.49E–05 | 7.45E–03 | 5.76E–04 | 4.36E–05 |
| 0.05 | 0.01 | 1.06E–02 | 7.29E–04 | 4.14E–05 | 1.01E–02 | 7.19E–04 | 4.04E–05 |
| 0.05 | 0.05 | 6.53E–03 | 4.94E–04 | 3.44E–05 | 6.38E–03 | 4.52E–04 | 3.38E–05 |
| 0.05 | 0.5 | 5.80E–03 | 4.37E–04 | 3.30E–05 | 5.46E–03 | 3.83E–04 | 2.99E–05 |
Settings in which the type 1 error was not controlled are defined as those in which the estimated type 1 error rate exceeds the upper bound of a 95% confidence interval around the expected type 1 error rate, based on a binomial distribution whose parameters are the p value threshold and the number of simulations used.
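The criterion above can be sketched in a few lines of Python. The sketch assumes a central two-sided 95% interval, so the upper bound is the 97.5% quantile of the binomial count divided by the number of simulations. The number of simulations is not given in this section, so the value of 100,000 below is purely hypothetical, and the function names are illustrative.

```python
import math


def binomial_quantile(n, p, q=0.975):
    """Smallest k such that P(X <= k) >= q for X ~ Binomial(n, p)."""
    log_p, log_1mp = math.log(p), math.log1p(-p)
    cdf = 0.0
    for k in range(n + 1):
        # Log of the binomial probability mass at k, computed via log-gamma
        # to avoid overflow in the binomial coefficient.
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1)
                   + k * log_p + (n - k) * log_1mp)
        cdf += math.exp(log_pmf)
        if cdf >= q:
            return k
    return n


def is_not_controlled(observed_rate, threshold, n_sims):
    """True if the observed rate exceeds the upper 95% CI bound."""
    upper_rate = binomial_quantile(n_sims, threshold) / n_sims
    return observed_rate > upper_rate


# Hypothetical number of simulations (an assumption, not from the source).
N_SIMS = 100_000
print(is_not_controlled(5.75e-2, 1e-2, N_SIMS))  # a rate well above 0.01
print(is_not_controlled(3.38e-3, 1e-2, N_SIMS))  # a rate below 0.01
```

With more simulations the interval narrows, so the same observed rate may be flagged under one simulation count and not under another.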
Characteristics of the TOPMed datasets and variants considered for association testing
| | SVD stroke | Short sleep | VTE |
|---|---|---|---|
| No. of individuals in the analysis | 5,358 | 20,021 | 11,627 |
| No. of cases | 692 (12.9%) | 2,408 (12%) | 3,793 (32.6%) |
| No. of controls | 4,666 (87.1%) | 17,613 (88%) | 7,834 (67.4%) |
| Gene of interest | | | |
| No. of potentially functional non-monomorphic variants identified | 122 | 58 | 142 |
| No. of variants further passing TOPMed quality filters | 117 | 49 | 132 |
| No. of variants further having more than 2 and fewer than 300 individuals with the rare allele | 20 | 9 | 25 |
| No. of variants with estimated power > 0.5 at the 0.05 p value threshold | 3 | 1 | 4 |
Results from association analysis of rare genetic variants within monogenic disease genes of interest
| rsID | Variant | BinomiRare p value | BinomiRare mid-p value | No. of individuals with the rare allele | No. of individuals with the rare allele and the outcome | Estimated power (OR = 2) | ClinVar interpretation | CADD PHRED | FATHMM-XF coding |
|---|---|---|---|---|---|---|---|---|---|
| rs115582213 | chr-19-15162524-C-T | 0.04 | 0.03 | 87 | 17 | 0.7 | benign/likely benign | 25.4 | 0.66 |
| rs112197217 | chr-19-15179425-G-T | 0.53 | 0.49 | 166 | 23 | 0.91 | benign/likely benign | 21 | 0.42 |
| rs11670799 | chr-19-15188240-G-A | 0.81 | 0.77 | 180 | 23 | 0.94 | benign/likely benign | 28.8 | 0.68 |
| rs121912617 | chr-12-26122364-G-T | 0.04 | 0.03 | 127 | 38 | 0.98 | not available | 27.5 | 0.66 |
| rs6026 | chr-1-169528054-C-T | 0.37 | 0.34 | 115 | 31 | 0.94 | benign/likely benign | 25.7 | 0.75 |
| rs6034 | chr-1-169529782-G-C | 1.00 | 0.94 | 46 | 16 | 0.57 | conflicting interpretations | 21.3 | 0.56 |
| rs78958618 | chr-1-169542985-G-A | 0.67 | 0.63 | 130 | 32 | 0.94 | benign | 15.18 | 0.11 |
| rs9332485 | chr-1-169586344-C-T | 0.37 | 0.34 | 222 | 55 | 1 | benign/likely benign | 22.5 | 0.23 |
Genetic variants presented are those that passed the functional annotation and statistical power filters. For each variant, we provide its BinomiRare p value and mid-p value, the number of individuals with the rare allele, the number of individuals with both the rare allele and the outcome, the estimated power computed assuming an effect size of OR = 2 and a p value threshold of 0.05, the pathogenicity interpretation from ClinVar, the CADD score, and the FATHMM-XF coding score.
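One way such a power estimate could be approximated is sketched below in Python. This is an illustrative assumption rather than the authors' documented procedure: each carrier's null outcome probability is shifted by the assumed odds ratio, the rejection region is found from the exact two-sided p value under the null, and power is the probability of that region under the shifted probabilities. All function names are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """PMF of the number of successes among independent Bernoulli(p_i) draws."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf


def two_sided_p(pmf, k):
    """Sum of the mass of all counts no more likely than count k (assumed convention)."""
    observed = pmf[k]
    return min(1.0, sum(m for m in pmf if m <= observed * (1 + 1e-9)))


def shift_by_odds_ratio(p, odds_ratio):
    """Outcome probability after multiplying the odds p / (1 - p) by odds_ratio."""
    return odds_ratio * p / (1.0 - p + odds_ratio * p)


def estimated_power(null_probs, odds_ratio=2.0, alpha=0.05):
    """Probability, under the alternative, that the exact test rejects at level alpha."""
    null_pmf = poisson_binomial_pmf(null_probs)
    alt_probs = [shift_by_odds_ratio(p, odds_ratio) for p in null_probs]
    alt_pmf = poisson_binomial_pmf(alt_probs)
    return sum(alt_pmf[k] for k in range(len(null_pmf))
               if two_sided_p(null_pmf, k) <= alpha)


# Toy example: 30 carriers, each with a null outcome probability of 0.1.
toy_probs = [0.1] * 30
print(estimated_power(toy_probs, odds_ratio=2.0, alpha=0.05))
```

Setting `odds_ratio=1.0` returns the attained size of the test, which for a discrete exact test can be well below the nominal threshold.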