Guanjie Chen, Ao Yuan, Yanxun Zhou, Amy R Bentley, Jie Zhou, Weiping Chen, Daniel Shriner, Adebowale Adeyemo, Charles N Rotimi.
Abstract
Advances in technology and reduced costs are facilitating large-scale sequencing of genes and exomes as well as entire genomes. Recently, we described a haplotype-based approach called SCARVA¹ that enables simultaneous association analysis of rare and common variants in disease etiology. Here, we describe an extension of SCARVA that evaluates individual markers instead of haplotypes. This modified method (SCARVAsnp) is implemented in four stages. First, all common variants in a pre-specified region (e.g., a gene) are evaluated individually. Second, a union procedure is used to combine all rare variants (RVs) in the index region, and the ratio of the log likelihood with one RV excluded to the log likelihood of a model with all the collapsed RVs is calculated. On the basis of previously reported simulation studies,¹ a likelihood ratio ≥1.3 is considered statistically significant. Third, the direction of the association of the removed RV is determined by evaluating the change in λ values with the inclusion and exclusion of that RV. Lastly, significant common and rare variants, along with covariates, are included in a final regression model to evaluate the association between the trait and variants in that region. We apply the method to simulated and real data sets to show that it is simple to use, computationally efficient, and able to accurately identify both common and rare risk variants. This method overcomes several limitations of existing methods. For example, SCARVAsnp limits loss of statistical power by not including variants that are not associated with the trait of interest in the final model. Also, SCARVAsnp takes into consideration the direction of association by effectively modelling positively and negatively associated variants.
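The second and third stages described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a quantitative trait, a single 0/1 collapsed indicator as the only predictor (no covariates), and a Gaussian log likelihood from a least-squares fit. The function names (`collapse_union`, `fit_collapsed`, `leave_one_out`), the toy data, and the rule that an RV is called positive when the collapsed coefficient λ drops after removing it are all assumptions made for illustration. The ratio is only meaningful here because every log likelihood in the toy data is negative.

```python
import math

RATIO_THRESHOLD = 1.3  # cutoff quoted in the abstract


def collapse_union(genotypes, exclude=None):
    """Union-collapse RVs: 1 if a person carries any included RV, else 0."""
    return [
        int(any(g > 0 for j, g in enumerate(row) if j != exclude))
        for row in genotypes
    ]


def fit_collapsed(x, y):
    """Least-squares fit of trait y on a 0/1 indicator x.

    Returns (lam, loglik): the coefficient of the collapsed term and the
    Gaussian log likelihood at the maximum-likelihood estimates.
    Assumes both groups are non-empty and the residual variance is > 0.
    """
    n = len(y)
    carriers = [yi for xi, yi in zip(x, y) if xi == 1]
    others = [yi for xi, yi in zip(x, y) if xi == 0]
    mean1 = sum(carriers) / len(carriers)
    mean0 = sum(others) / len(others)
    rss = sum((yi - mean1) ** 2 for yi in carriers)
    rss += sum((yi - mean0) ** 2 for yi in others)
    sigma2 = rss / n
    loglik = -0.5 * n * (math.log(2 * math.pi * sigma2) + 1)
    return mean1 - mean0, loglik


def leave_one_out(genotypes, y):
    """Log-likelihood ratio and assumed direction for each RV in turn."""
    lam_all, ll_all = fit_collapsed(collapse_union(genotypes), y)
    results = []
    for j in range(len(genotypes[0])):
        lam_j, ll_j = fit_collapsed(collapse_union(genotypes, exclude=j), y)
        ratio = ll_j / ll_all  # log lik. without RV j over log lik. with all RVs
        results.append({
            "rv": j,
            "ratio": ratio,
            "significant": ratio >= RATIO_THRESHOLD,
            # Assumed rule: lambda falls when a positively associated RV is removed.
            "direction": "+" if lam_all > lam_j else "-",
        })
    return results


# Hypothetical toy data: 12 individuals (rows) by 3 rare variants (columns).
genotypes = [
    [1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1],
    [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0],
    [0, 0, 0], [0, 0, 0], [0, 0, 0],
]
trait = [2.1, 1.9, 2.2, 1.8, 0.0, 0.5, -0.5, 0.4, -0.4, 0.3, -0.3, 0.0]

results = leave_one_out(genotypes, trait)
for r in results:
    print(r["rv"], round(r["ratio"], 3), r["significant"], r["direction"])
```

In this toy data the first two variants are carried by individuals with high trait values and the third by an individual with a typical value, so removing either of the first two worsens the fit while removing the third improves it. Because a ratio of log likelihoods depends on the scale of the trait, the 1.3 cutoff should be read in the context of the authors' own simulations rather than as a property of this sketch.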
Keywords: complex traits; rare and common variants
Year: 2012 PMID: 22904618 PMCID: PMC3418150 DOI: 10.4137/BBI.S9966
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Figure 1. Power (Y-axis) and non-centrality parameters (X-axis) for different values of δ (from left to right, panels represent δ of 0.05, 0.025, and 0.01).
Note: The degrees of freedom (df) are indicated by the color of the lines.
Association analysis of the simulated sequence data set using SCARVAsnp.
| Type of variant | Sig. CVs/RVs | Single (P/ratio) | Joint analysis (β/λ) | Joint analysis (P) |
|---|---|---|---|---|
| Common | | < 0.0001 | 0.46 (0.02) | < 0.0001 |
| Common | | < 0.0001 | 0.47 (0.02) | < 0.0001 |
| Rare (+) | | 1.38 | | |
| Rare (+) | | 1.30 | | |
| Rare (+) | | 2.36 | 0.61 (0.04) | < 0.0001 |
| Rare (−) | | 1.92 | | |
| Rare (−) | | 1.69 | −0.57 (0.05) | < 0.0001 |
Notes: Sig. CVs/RVs—statistically significant common and rare variants.
Total number of CVs analyzed;
total number of RVs analyzed.
Rare (+): Positively-associated RVs. Rare (−): Negatively-associated RVs. β: regression coefficients for CVs. λ: regression coefficients for collapsed RV terms.
Results of the DHS sequence data for 3 lipid genes using SCARVAsnp.
| Genes | SNPs | MAF | Sig. CVs/RVs | P/ratio | Joint model (P) |
|---|---|---|---|---|---|
| | ANG3-008357 | 0.40 | Common | 5.44 × 10⁻¹⁰ | 2.17 × 10⁻⁷ |
| | ANG3-005424 | 0.01 | Rare (−) | 2.56 | |
| | ANG3-005308 | 0.02 | Rare (−) | 5.21 | 2.86 × 10⁻⁷ |
| | ANG3-004520 | 0.01 | Rare (+) | 1.86 | 1.99 × 10⁻² |
| | ANG4-010707 | 0.06 | Common | 1.50 × 10⁻⁷ | 2.75 × 10⁻⁵ |
| | ANG4-009155 | 0.28 | Common | 5.09 × 10⁻³ | 1.54 × 10⁻² |
| | ANG4-006052 | 0.03 | Rare (−) | 3.88 | |
| | ANG4-009191 | 0.03 | Rare (−) | 4.44 | |
| | ANG4-001175 | 0.04 | Rare (−) | 4.08 | |
| | ANG4-010620 | 0.04 | Rare (−) | 3.98 | |
| | ANG4-006175 | 0.04 | Rare (−) | 3.61 | 1.00 × 10⁻²⁰ |
| | ANG5-014661 | 0.01 | Rare (−) | 1.69 | |
| | ANG5-011617 | 0.02 | Rare (−) | 2.67 | |
| | ANG5-022751 | 0.02 | Rare (−) | 1.58 | |
| | ANG5-012530 | 0.04 | Rare (−) | 1.32 | 2.20 × 10⁻⁴ |
| | ANG5-026244 | 0.01 | Rare (+) | 3.22 | |
| | ANG5-012581 | 0.01 | Rare (+) | 2.59 | |
| | ANG5-017106 | 0.03 | Rare (+) | 5.58 | 6.66 × 10⁻⁶ |
Notes:
Total number of CVs analyzed;
total number of RVs analyzed.
Rare (+): Positively-associated RVs. Rare (−): Negatively-associated RVs.
Abbreviations: DHS, Dallas Heart Study; Sig. CVs/RVs, statistically significant common and rare variants; P/ratio, P values for CVs or ratio values for RVs.
Figure 2. The distribution of the ratio of the log likelihood values from SCARVAsnp (blue line), and the distribution of scores from SCORE-TEST (red line) for rare variants in ANGPT4 in the Dallas Heart Study (DHS).
Results of simulated and DHS data comparing SCARVAsnp, SCORE-TEST, and SKAT.
| Data sets | Total # of SNPs | SCARVAsnp | SCORE-test (T5) | SKAT |
|---|---|---|---|---|
| Simulated | | | | 2.02 × 10⁻⁶⁶ |
| | | Rare (+) < 0.0001 | 0.000044 | |
| | | Rare (−) < 0.0001 | | |
| | | ANG3_008357 = 2.17 × 10⁻⁷ | | 3.28 × 10⁻⁷ |
| | | Rare (+) = 1.99 × 10⁻² | 0.008470 | |
| | | Rare (−) = 2.86 × 10⁻⁷ | | |
| | | ANG4_010707 = 2.75 × 10⁻⁵ | | 3.78 × 10⁻²³ |
| | | ANG4_009155 = 1.54 × 10⁻² | | |
| | | Rare (+) = N/A | 0.000001 | |
| | | Rare (−) = 1.00 × 10⁻²⁰ | | |
| | | | | 2.01 × 10⁻⁷ |
| | | Rare (+) = 6.66 × 10⁻⁶ | 0.015066 | |
| | | Rare (−) = 2.20 × 10⁻⁴ | | |
Notes:
Total number of CVs analyzed;
total number of RVs analyzed;
P-value for the set of RVs with MAF < 5%;
P-value for the set of CVs and RVs.
Abbreviation: DHS, Dallas Heart Study.