Jaeyoon Chung, Gyungah R Jun, Josée Dupuis, Lindsay A Farrer.
Abstract
Complex diseases are usually associated with multiple correlated phenotypes, and the analysis of composite scores or disease status may not fully capture the complexity (or multidimensionality). Joint analysis of multiple disease-related phenotypes in genetic tests could potentially increase power to detect association of a disease with common SNPs (or genes). Gene-based tests are designed to identify genes containing multiple risk variants that individually are weakly associated with a univariate trait. We combined three multivariate association tests (O'Brien method, TATES, and MultiPhen) with two gene-based association tests (GATES and VEGAS) and compared performance (type I error and power) of six multivariate gene-based methods using simulated data. Data (n = 2000) for genetic sequence and correlated phenotypes were simulated by varying causal variant proportions and phenotype correlations for various scenarios. These simulations showed that two multivariate association tests (TATES and MultiPhen, but not O'Brien) paired with VEGAS have inflated type I error in all scenarios, while the three multivariate association tests paired with GATES have correct type I error. MultiPhen paired with GATES has higher power than competing methods if the correlations among phenotypes are low (r < 0.57). We applied these gene-based association methods to a GWAS dataset from the Alzheimer's Disease Genetics Consortium containing three neuropathological traits related to Alzheimer disease (neuritic plaque, neurofibrillary tangles, and cerebral amyloid angiopathy) measured in 3500 autopsied brains. Gene-level significant evidence (P < 2.7 × 10−6) was identified in a region containing three contiguous genes (TRAPPC12, TRAPPC12-AS1, ADI1) using O'Brien and VEGAS. Gene-wide significant associations were not observed in univariate gene-based tests.
Year: 2019 PMID: 30683923 PMCID: PMC6461986 DOI: 10.1038/s41431-018-0327-8
Source DB: PubMed Journal: Eur J Hum Genet ISSN: 1018-4813 Impact factor: 4.246
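The gene-level significance threshold quoted in the abstract (P < 2.7 × 10−6) is consistent with a Bonferroni correction, although the abstract does not say how it was derived. The short Python sketch below assumes a family-wise α of 0.05 (an assumption, not a value stated here) and back-calculates how many gene-level tests such a threshold would imply.

```python
# Assumption: the threshold is a Bonferroni correction of a family-wise
# alpha of 0.05. The source quotes only the threshold itself.
FAMILY_WISE_ALPHA = 0.05
GENE_LEVEL_THRESHOLD = 2.7e-6  # value quoted in the abstract


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Return the number of tests m for which alpha / m equals the threshold."""
    return alpha / threshold


implied_genes = implied_number_of_tests(FAMILY_WISE_ALPHA, GENE_LEVEL_THRESHOLD)
print(f"Implied number of gene-level tests: {implied_genes:.0f}")
```

Under that assumption the threshold corresponds to somewhat fewer than 19,000 gene-level tests.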
Description of the methods for multivariate and gene-based association testing
| Type | Method name | Input | Other requirements | Output |
|---|---|---|---|---|
| Multivariate association test | O’Brien | Genome-wide association summary statistics | Genome-wide summary statistics, other than subset of genome | SNP-level summary statistics of |
| | TATES | SNP-level association summary statistics | Individual-level phenotype data or correlation structure of phenotype data | SNP-level summary statistics of |
| | MultiPhen | Individual-level genetic and phenotypic data | Missing data in any genetic or phenotypic data will reduce sample size for actual association tests | SNP-level summary statistics of |
| Gene-based association test | VEGAS | SNP-level association summary statistics | Individual-level genotypes for computing LD | |
| | GATES | SNP-level association summary statistics | Individual-level genotypes for computing LD | |
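GATES is usually described as an extension of the Simes procedure in which SNP counts are replaced by effective numbers of independent tests estimated from LD. The Python sketch below illustrates that combination rule only. The function name and the `effective_counts` argument are hypothetical, and estimating the effective numbers from genotype LD is left to the caller rather than implemented here.

```python
from typing import Optional, Sequence


def simes_gene_pvalue(
    snp_pvalues: Sequence[float],
    effective_counts: Optional[Sequence[float]] = None,
) -> float:
    """Combine SNP-level p-values into one gene-level p-value.

    effective_counts[j - 1] is a caller-supplied (hypothetical) effective
    number of independent tests among the j most significant SNPs. When it
    is omitted, SNPs are treated as independent and the counts are
    1, 2, ..., m, which reduces to the plain Simes test.
    """
    ordered = sorted(snp_pvalues)
    m = len(ordered)
    if m == 0:
        raise ValueError("at least one SNP p-value is required")
    if effective_counts is None:
        effective_counts = list(range(1, m + 1))
    if len(effective_counts) != m:
        raise ValueError("effective_counts needs one entry per SNP")
    total_effective = effective_counts[-1]
    # Scale the j-th smallest p-value by (total effective tests) /
    # (effective tests among the top j SNPs) and keep the smallest result.
    candidates = [
        total_effective * p / m_j for p, m_j in zip(ordered, effective_counts)
    ]
    return min(1.0, min(candidates))


# Toy p-values for three SNPs in one gene.
independent = simes_gene_pvalue([0.01, 0.5, 0.9])
fully_linked = simes_gene_pvalue([0.01, 0.5, 0.9], effective_counts=[1.0, 1.0, 1.0])
print(independent, fully_linked)
```

With independent SNPs the smallest p-value is penalized for all three tests, whereas with SNPs in complete LD (one effective test) it passes through unchanged.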
Type I error rate of multivariate association methods with a gene-based association method, VEGAS
| Factor loading ( | Proportion of independent SNPs (%) | VEGAS | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| | | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen |
| 0.15 | 0–40 | 0.00 | 0.00 | 0.02 | 0 | 0.0005 | 0.001 | 0 | 0.0005 | 0.0002 |
| | 40–60 | 0.00 | 0.00 | 0.02 | 4.1 × 10−5 | 0.0004 | 0.003 | 1.4 × 10−5 | 0.0003 | 0.0005 |
| | 60–100 | 0.00 | 0.00 | 0.02 | 4.8 × 10−5 | 0.0002 | 0.003 | 0 | 0.0002 | 0.0007 |
| 0.35 | 0–40 | 0.00 | 0.00 | 0.02 | 0.0002 | 0.0006 | 0.002 | 0 | 0.0006 | 0.0002 |
| | 40–60 | 0.01 | 0.00 | 0.02 | 0.0002 | 0.0004 | 0.003 | 3.3 × 10−5 | 0.0004 | 0.0004 |
| | 60–100 | 0.01 | 0.00 | 0.02 | 0.0003 | 0.0003 | 0.003 | 5.7 × 10−5 | 0.0003 | 0.0006 |
| 0.55 | 0–40 | 0.01 | 0.00 | 0.02 | 0.002 | 0.0010 | 0.003 | 0.0001 | 0.0008 | 0.0002 |
| | 40–60 | 0.01 | 0.00 | 0.02 | 0.002 | 0.0005 | 0.003 | 0.0002 | 0.0004 | 0.0003 |
| | 60–100 | 0.02 | 0.00 | 0.02 | 0.001 | 0.0004 | 0.003 | 0.0002 | 0.0004 | 0.0006 |
| 0.75 | 0–40 | 0.06 | 0.00 | 0.02 | 0.01 | 0.0008 | 0.002 | 0.0015 | 0.0005 | 0.0006 |
| | 40–60 | 0.12 | 0.00 | 0.02 | 0.02 | 0.0007 | 0.003 | 0.0032 | 0.0006 | 0.0003 |
| | 60–100 | 0.19 | 0.00 | 0.02 | 0.03 | 0.0003 | 0.005 | 0.0071 | 0.0003 | 0.0009 |
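For readers unfamiliar with VEGAS, its core idea can be sketched as follows: sum the per-SNP chi-square statistics in a gene and compare that sum with sums simulated from correlated normal variables whose correlation matrix is the LD matrix. The Python code below is a simplified illustration under that description, not the VEGAS software itself. The function names, the toy LD matrix, and the add-one correction to the empirical p-value are all assumptions made here.

```python
import math
import random
from typing import List, Sequence


def cholesky_lower(matrix: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular Cholesky factor of a positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def vegas_like_pvalue(
    snp_chi2: Sequence[float],
    ld_corr: Sequence[Sequence[float]],
    n_sim: int = 2000,
    seed: int = 1,
) -> float:
    """Empirical gene-level p-value for the sum of SNP chi-square statistics."""
    rng = random.Random(seed)
    lower = cholesky_lower(ld_corr)
    m = len(snp_chi2)
    observed = sum(snp_chi2)
    exceed = 0
    for _ in range(n_sim):
        noise = [rng.gauss(0.0, 1.0) for _ in range(m)]
        # Correlated null Z-scores with covariance equal to the LD matrix.
        z = [sum(lower[i][k] * noise[k] for k in range(i + 1)) for i in range(m)]
        if sum(v * v for v in z) >= observed:
            exceed += 1
    # Add-one correction (an assumption here) so the estimate is never zero.
    return (exceed + 1) / (n_sim + 1)


toy_ld = [[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]]  # hypothetical LD
p_weak = vegas_like_pvalue([0.1, 0.2, 0.1], toy_ld)
p_strong = vegas_like_pvalue([30.0, 30.0, 30.0], toy_ld)
print(p_weak, p_strong)
```

Because the p-value is a proportion of simulated draws, it cannot be resolved below roughly 1/n_sim, so very small p-values require very many simulations.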
Type I error rate of multivariate association methods with a gene-based association method, GATES
| Factor Loading ( | Proportion of independent SNPs (%) | GATES | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| | | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen |
| 0.15 | 0–40 | 0.000 | 0.006 | 0.010 | 0.0000 | 0.0004 | 0.0000 | 0.0000 | 0.0001 | 0.0001 |
| | 40–60 | 0.000 | 0.003 | 0.000 | 0.0000 | 0.0004 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 60–100 | 0.000 | 0.002 | 0.000 | 0.0000 | 0.0002 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 0.35 | 0–40 | 0.000 | 0.006 | 0.010 | 0.0000 | 0.0008 | 0.0000 | 0.0000 | 0.0001 | 0.0001 |
| | 40–60 | 0.000 | 0.003 | 0.000 | 0.0000 | 0.0003 | 0.0010 | 0.0000 | 0.0000 | 0.0001 |
| | 60–100 | 0.000 | 0.002 | 0.000 | 0.0000 | 0.0001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 0.55 | 0–40 | 0.002 | 0.006 | 0.010 | 0.0000 | 0.0009 | 0.0000 | 0.0000 | 0.0001 | 0.0001 |
| | 40–60 | 0.001 | 0.004 | 0.010 | 0.0000 | 0.0004 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| | 60–100 | 0.001 | 0.002 | 0.000 | 0.0000 | 0.0002 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| 0.75 | 0–40 | 0.010 | 0.008 | 0.010 | 0.0000 | 0.0005 | 0.0010 | 0.0000 | 0.0000 | 0.0000 |
| | 40–60 | 0.010 | 0.005 | 0.000 | 0.0000 | 0.0005 | 0.0000 | 0.0000 | 0.0001 | 0.0000 |
| | 60–100 | 0.000 | 0.002 | 0.000 | 0.0000 | 0.0001 | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
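An empirical type I error rate is the fraction of null simulation replicates whose p-value falls below the nominal α, and empirical power is the same fraction computed on replicates simulated under the alternative. The Python sketch below shows that calculation together with a rough binomial interval for judging whether an observed rate is compatible with the nominal level. The number of replicates and the evenly spaced toy p-values are hypothetical; the record above does not give the replicate count.

```python
import math
from typing import Sequence, Tuple


def empirical_rate(pvalues: Sequence[float], alpha: float) -> float:
    """Fraction of replicates with a p-value below alpha."""
    return sum(p < alpha for p in pvalues) / len(pvalues)


def nominal_rate_interval(
    alpha: float, n_replicates: int, z: float = 1.96
) -> Tuple[float, float]:
    """Approximate 95% range for the empirical rate of a well-calibrated test."""
    se = math.sqrt(alpha * (1.0 - alpha) / n_replicates)
    return max(0.0, alpha - z * se), alpha + z * se


# Hypothetical null p-values: 1000 evenly spaced values standing in for
# replicates from a perfectly calibrated test.
null_pvalues = [(i + 0.5) / 1000 for i in range(1000)]
rate = empirical_rate(null_pvalues, alpha=0.01)
low, high = nominal_rate_interval(alpha=0.01, n_replicates=1000)
print(rate, (low, high))
```

An empirical rate well above the upper bound would suggest inflation, and one well below the lower bound would suggest a conservative test.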
Fig. 1 Power comparisons of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods (VEGAS and GATES) in various scenarios by varying the proportion of independent SNPs in a gene. (a) Causal variant percentage = 15%; (b) causal variant percentage = 5%. Empirical power was calculated at an α level of 0.0001.
Power of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods (VEGAS and GATES) for phenotypes in the same or different directions
| Proportion of independent SNPs (%) | VEGAS | GATES | ||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| α = 0.0001 | ||||||||||||||||||
| | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen |
| (a) All three phenotypes are correlated with each other in the same direction (+++) | ||||||||||||||||||
| 0–40 | 0.81 | 0.82 | 0.80 | 0.65 | 0.69 | 0.70 | 0.52 | 0.58 | 0.60 | 0.63 | 0.74 | 0.76 | 0.47 | 0.59 | 0.64 | 0.37 | 0.48 | 0.55 |
| 40–60 | 0.62 | 0.60 | 0.58 | 0.40 | 0.43 | 0.42 | 0.26 | 0.31 | 0.31 | 0.32 | 0.45 | 0.47 | 0.19 | 0.28 | 0.32 | 0.12 | 0.18 | 0.23 |
| 60–100 | 0.49 | 0.45 | 0.42 | 0.26 | 0.28 | 0.26 | 0.14 | 0.17 | 0.17 | 0.14 | 0.25 | 0.26 | 0.06 | 0.12 | 0.15 | 0.03 | 0.06 | 0.09 |
| (b) One of the phenotypes is correlated with others in an opposite direction (+−−) | ||||||||||||||||||
| 0–40 | 0.00 | 0.81 | 0.80 | 0.00 | 0.69 | 0.69 | 0.00 | 0.58 | 0.60 | 0.00 | 0.74 | 0.75 | 0.00 | 0.59 | 0.63 | 0.00 | 0.47 | 0.54 |
| 40–60 | 0.00 | 0.60 | 0.58 | 0.00 | 0.43 | 0.42 | 0.00 | 0.30 | 0.31 | 0.00 | 0.45 | 0.46 | 0.00 | 0.28 | 0.32 | 0.00 | 0.18 | 0.23 |
| 60–100 | 0.00 | 0.43 | 0.41 | 0.00 | 0.27 | 0.25 | 0.00 | 0.17 | 0.16 | 0.00 | 0.24 | 0.25 | 0.00 | 0.12 | 0.14 | 0.00 | 0.06 | 0.09 |
The factor loading and percentage of causal variants among the variants in a gene were fixed at 0.55 and 15%, respectively
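One way to see why a directional combination loses power in panel (b) is to look at a common formulation of the O'Brien statistic, T = 1ᵀΣ⁻¹z / sqrt(1ᵀΣ⁻¹1), where z holds the per-phenotype Z-scores for a SNP and Σ is their correlation matrix. Whether this exact formulation matches the one used for the table is an assumption. The Z-scores, the correlation of 0.3, and the helper names in the Python sketch below are illustrative only.

```python
import math
from typing import List, Sequence


def solve_linear(matrix: Sequence[Sequence[float]], rhs: Sequence[float]) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def obrien_statistic(z_scores: Sequence[float], corr: Sequence[Sequence[float]]) -> float:
    """Linear combination of Z-scores, standard normal under the null."""
    weights = solve_linear(corr, [1.0] * len(z_scores))  # Sigma^{-1} 1
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(weights))  # sum(weights) = 1' Sigma^{-1} 1


def two_sided_pvalue(t: float) -> float:
    return math.erfc(abs(t) / math.sqrt(2.0))


# Toy inputs: same-direction (+++) versus one reversed phenotype (+--).
corr_same = [[1.0, 0.3, 0.3], [0.3, 1.0, 0.3], [0.3, 0.3, 1.0]]
corr_opp = [[1.0, -0.3, -0.3], [-0.3, 1.0, 0.3], [-0.3, 0.3, 1.0]]
t_same = obrien_statistic([3.0, 3.0, 3.0], corr_same)
t_opp = obrien_statistic([3.0, -3.0, -3.0], corr_opp)
print(t_same, two_sided_pvalue(t_same))
print(t_opp, two_sided_pvalue(t_opp))
```

In this toy case the per-phenotype evidence is equally strong in both settings, but the opposite-signed Z-scores largely cancel in the weighted sum, so the combined statistic is far smaller.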
Power of multivariate association methods (O’Brien, TATES, and MultiPhen) with gene-based association methods ((a) VEGAS and (b) GATES)
| # of phenotypes affected by causal variants | Proportion of independent SNPs (%) | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|
| | | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen | O’Brien | TATES | MultiPhen |
| (a) VEGAS | ||||||||||
| 1 | 0–40 | 0.20 | 0.65 | 0.68 | 0.08 | 0.52 | 0.54 | 0.03 | 0.42 | 0.44 |
| | 40–60 | 0.08 | 0.38 | 0.41 | 0.02 | 0.24 | 0.41 | 0.01 | 0.24 | 0.18 |
| | 60–100 | 0.55 | 0.23 | 0.26 | 0.01 | 0.12 | 0.26 | 0.00 | 0.12 | 0.08 |
| 2 | 0–40 | 0.57 | 0.76 | 0.79 | 0.37 | 0.63 | 0.68 | 0.25 | 0.52 | 0.58 |
| | 40–60 | 0.34 | 0.52 | 0.56 | 0.15 | 0.36 | 0.41 | 0.08 | 0.25 | 0.30 |
| | 60–100 | 0.23 | 0.35 | 0.39 | 0.08 | 0.21 | 0.24 | 0.03 | 0.13 | 0.16 |
| 3 | 0–40 | 0.81 | 0.82 | 0.80 | 0.65 | 0.69 | 0.70 | 0.52 | 0.58 | 0.60 |
| | 40–60 | 0.62 | 0.60 | 0.58 | 0.40 | 0.43 | 0.42 | 0.26 | 0.31 | 0.31 |
| | 60–100 | 0.49 | 0.45 | 0.42 | 0.26 | 0.28 | 0.26 | 0.14 | 0.17 | 0.17 |
| (b) GATES | ||||||||||
| 1 | 0–40 | 0.05 | 0.58 | 0.60 | 0.02 | 0.45 | 0.46 | 0.01 | 0.36 | 0.37 |
| | 40–60 | 0.01 | 0.28 | 0.30 | 0.00 | 0.17 | 0.18 | 0.00 | 0.11 | 0.12 |
| | 60–100 | 0.00 | 0.13 | 0.13 | 0.00 | 0.06 | 0.07 | 0.00 | 0.03 | 0.04 |
| 2 | 0–40 | 0.32 | 0.68 | 0.74 | 0.20 | 0.54 | 0.62 | 0.13 | 0.44 | 0.53 |
| | 40–60 | 0.10 | 0.38 | 0.45 | 0.04 | 0.24 | 0.31 | 0.02 | 0.15 | 0.22 |
| | 60–100 | 0.03 | 0.19 | 0.24 | 0.01 | 0.93 | 0.14 | 0.00 | 0.05 | 0.08 |
| 3 | 0–40 | 0.63 | 0.74 | 0.76 | 0.47 | 0.59 | 0.64 | 0.37 | 0.48 | 0.55 |
| | 40–60 | 0.32 | 0.45 | 0.47 | 0.19 | 0.28 | 0.32 | 0.12 | 0.18 | 0.23 |
| | 60–100 | 0.14 | 0.25 | 0.26 | 0.06 | 0.12 | 0.15 | 0.03 | 0.06 | 0.09 |
The factor loading and percentage of causal variants among the variants in a gene were fixed at 0.55 and 15%, respectively
Associations (P values) of known AD genes from the analysis of neuropathological phenotypes using multivariate gene-based association methods
| Gene | CH | Start | Stop | Eff. SNPsᵃ (%) | VEGASᵇ univariate NP | VEGAS univariate NFT | VEGAS univariate CAA | VEGAS multivariate NP+NFT+CAA (O’Brien) | GATES univariate NP | GATES univariate NFT | GATES univariate CAA | GATES multivariate NP+NFT+CAA (O’Brien) | GATES multivariate NP+NFT+CAA (TATES) | GATES multivariate NP+NFT+CAA (MultiPhen) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 1 | 207,669,473 | 207,815,110 | 90.6/422 (21.5%) | 0.66 | 0.27 | 0.17 | 0.37 | 0.91 | 0.63 | 0.27 | 0.81 | 0.69 | 0.97 |
| | 2 | 127,805,607 | 127,864,864 | | 3.2 × 10−3 | 1.1 × 10−3 | 0.66 | 7.5 × 10−4 | 7.9 × 10−4 | 3.0 × 10−4 | 0.79 | 1.8 × 10−3 | 5.3 × 10−4 | 0.11 |
| | 2 | 234,054,795 | 234,116,549 | 91.6/279 (32.8%) | 0.47 | 0.07 | 0.12 | 0.54 | 0.99 | 0.60 | 0.43 | 0.55 | 0.83 | 0.38 |
| | 3 | 9,908,394 | 9,921,938 | 47.8/146 (32.7%) | 0.45 | 0.57 | 0.58 | 0.34 | 0.76 | 0.06 | 0.59 | 0.57 | 0.14 | 0.70 |
| | 5 | 88,014,058 | 88,199,922 | 88.8/338 (26.3%) | 0.09 | 0.06 | 0.44 | 0.34 | 0.06 | 0.18 | 0.83 | 0.65 | 0.14 | 0.91 |
| | 5 | 139,712,428 | 139,726,188 | 38.8/91 (42.6%) | 0.76 | 0.96 | 0.05 | 0.30 | 0.62 | 0.96 | 0.15 | 0.43 | 0.31 | 0.26 |
| | 6 | 32,485,151 | 32,498,006 | 874.5/1847 (47.3%) | 0.12 | 0.08 | 0.63 | 0.14 | 2.8 × 10−3 | 0.02 | 1.00 | 0.32 | 7.4 × 10−3 | 0.06 |
| | 6 | 47,445,525 | 47,594,999 | 77.6/472 (16.4%) | 0.24 | 0.46 | 0.31 | 0.27 | 0.48 | 0.67 | 0.64 | 0.66 | 0.71 | 0.63 |
| | 7 | 99,998,495 | 100,026,302 | 24.5/90 (27.2%) | 0.68 | 0.37 | 0.73 | 0.77 | 0.66 | 0.11 | 0.78 | 0.90 | 0.26 | 0.50 |
| | 7 | 131,808,091 | 132,333,447 | 377.7/1435 (26.3%) | 0.81 | 0.80 | 0.85 | 0.84 | 0.65 | 0.76 | 0.92 | 0.94 | 0.96 | 0.31 |
| | 7 | 143,088,205 | 143,105,985 | 59.1/158 (37.4%) | 0.24 | 0.13 | 0.09 | 0.27 | 0.39 | 0.21 | 0.21 | 0.37 | 0.27 | 0.21 |
| | 8 | 27,168,999 | 27,316,908 | 120.8/500 (24.2%) | 0.10 | 0.63 | 2.9 × 10−3 | 0.07 | 0.31 | 0.80 | 0.01 | 0.13 | 0.03 | 0.05 |
| | 8 | 27,454,434 | 27,472,328 | 60.8/178 (34.2%) | 0.36 | 0.47 | 0.42 | 0.19 | 0.49 | 0.35 | 0.63 | 0.29 | 0.47 | 0.53 |
| | 8 | 30,643,126 | 30,670,352 | 30.3/145 (20.9%) | 0.82 | 0.70 | 0.99 | 0.87 | 0.89 | 0.80 | 0.96 | 0.82 | 0.96 | 0.09 |
| | 10 | 11,502,509 | 11,653,679 | 104.5/456 (22.9%) | 0.40 | 0.12 | 0.85 | 0.86 | 0.55 | 0.54 | 0.97 | 0.53 | 0.79 | 0.13 |
| | 11 | 47,487,489 | 47,574,792 | 47.5/159 (29.9%) | 0.97 | 0.39 | 0.54 | 0.69 | 0.99 | 0.02 | 0.70 | 0.28 | 0.04 | 0.51 |
| | 11 | 59,939,080 | 59,950,674 | 31.4/129 (24.4%) | 0.24 | 0.14 | 0.97 | 0.25 | 0.11 | 0.46 | 0.93 | 0.46 | 0.28 | 0.83 |
| | 11 | 85,668,214 | 85,780,923 | | 0.02 | 2.4 × 10−3 | 0.03 | 8.6 × 10−4 | 0.10 | 5.2 × 10−3 | 0.04 | 2.5 × 10−3 | 0.01 | 0.08 |
| | 11 | 121,322,912 | 121,504,471 | 105.2/359 (29.3%) | 0.20 | 0.08 | 0.92 | 0.51 | 0.22 | 0.13 | 0.96 | 0.08 | 0.34 | 0.18 |
| | 14 | 53,323,989 | 53,417,815 | 72.2/260 (27.8%) | 0.10 | 0.16 | 0.69 | 0.37 | 0.13 | 0.26 | 0.78 | 0.09 | 0.31 | 0.30 |
| | 14 | 92,788,925 | 92,967,825 | 210.7/766 (27.5%) | 0.73 | 0.06 | 0.56 | 0.54 | 0.86 | 0.30 | 0.49 | 0.69 | 0.45 | 0.42 |
| | 17 | 43,971,748 | 44,105,700 | 94.2/919 (10.3%) | 0.33 | 0.39 | 0.35 | 0.15 | 0.28 | 0.90 | 0.15 | 0.22 | 0.30 | 0.06 |
| | 17 | 56,378,592 | 56,406,152 | | 0.50 | 0.68 | 0.29 | 0.40 | 0.76 | 0.52 | 0.18 | 0.14 | 0.36 | 0.05 |
| | 19 | 1,040,102 | 1,065,571 | 116.1/339 (34.3%) | 3.4 × 10−3 | 1.00 | 0.06 | 0.05 | 3.9 × 10−3 | 1.00 | 0.06 | 0.01 | 8.7 × 10−3 | |
| | 19 | 3,359,616 | 3,463,603 | 159.0/432 (36.8%) | 0.58 | 0.84 | 0.54 | 0.38 | 0.85 | 0.94 | 0.82 | 0.58 | 0.95 | 0.46 |
| | 19 | 45,409,039 | 45,412,650 | | <1.0 × 10−6 | <1.0 × 10−6 | <1.0 × 10−6 | <1.0 × 10−6 | 8.2 × 10−45 | 2.9 × 10−42 | 1.7 × 10−19 | 1.6 × 10−68 | 2.1 × 10−44 | 2.5 × 10−17 |
| | 20 | 54,987,168 | 55,034,396 | | 0.07 | 0.02 | 0.10 | 4.4 × 10−3 | 0.25 | 0.08 | 0.33 | 0.04 | 0.19 | 0.01 |
ᵃ Eff. SNPs indicates the proportion of independent SNPs out of the total number of SNPs in a gene range. The total SNPs were selected within 10 kb of both ends of the defined gene range. Genomic coordinates are based on 1000 Genomes build 37 (hg19). Eff. SNPs attaining a P value in a multivariate test that was at least one order of magnitude more significant than results for any of the univariate tests are italicized.
ᵇ VEGAS computes P values using a permutation test and does not report empirical P values below 1 × 10−6.
Novel associations (P values) from the analysis of neuropathological phenotypes using multivariate gene-based association methods
| Gene | CH | Start | Stop | Eff. SNPsᵃ (%) | VEGAS univariate NP | VEGAS univariate NFT | VEGAS univariate CAA | VEGAS multivariate NP+NFT+CAA (O’Brien) | GATES univariate NP | GATES univariate NFT | GATES univariate CAA | GATES multivariate NP+NFT+CAA (O’Brien) | GATES multivariate NP+NFT+CAA (TATES) | GATES multivariate NP+NFT+CAA (MultiPhen) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | 2 | 3,383,446 | 3,483,342 | 131.2/509 (25.8%) | 0.09 | 4.0 × 10−5 | 0.5 | <1.0 × 10−6 | 0.06 | 2.5 × 10−5 | 0.04 | 6.4 × 10−5 | 6.4 × 10−5 | 0.03 |
| | 2 | 3,481,242 | 3,482,409 | 68.7/232 (29.6%) | 2.0 × 10−3 | 3.9 × 10−5 | 5.0 × 10−3 | <1.0 × 10−6 | 0.05 | 1.3 × 10−5 | 0.02 | 3.4 × 10−5 | 3.4 × 10−5 | 0.01 |
| | 2 | 3,501,690 | 3,523,350 | 52.8/215 (24.6%) | 2.6 × 10−3 | 1.6 × 10−5 | 7.4 × 10−4 | <1.0 × 10−6 | 0.03 | 1.0 × 10−5 | 0.01 | 2.6 × 10−5 | 2.5 × 10−5 | 6.1 × 10−3 |
| | 7 | 18,126,572 | 19,036,993 | 611.5/1973 (31.0%) | 0.40 | 0.14 | 0.61 | 0.24 | 0.56 | 0.01 | 5.2 × 10−3 | 6.1 × 10−5 | 0.01 | 3.5 × 10−3 |
| | 12 | 53,038,342 | 53,045,959 | 69.3/247 (28.0%) | 3.7 × 10−4 | 0.11 | 0.01 | 3.3 × 10−5 | 9.5 × 10−3 | 0.21 | 0.17 | 1.5 × 10−3 | 0.02 | 0.03 |
| | 14 | 76,044,940 | 76,114,512 | 81.9/255 (32.1%) | 0.02 | 0.01 | 1.3 × 10−3 | 5.8 × 10−5 | 0.15 | 0.11 | 0.02 | 4.2 × 10−3 | 0.04 | 9.4 × 10−3 |
| | 15 | 41,474,926 | 41,522,895 | 55.1/322 (17.1%) | 9.0 × 10−4 | 6.6 × 10−3 | 0.02 | 7.9 × 10−5 | 0.01 | 0.03 | 0.15 | 4.9 × 10−4 | 0.03 | 0.11 |
ᵃ Eff. SNPs indicates the proportion of independent SNPs out of the total number of SNPs in a gene range. The total SNPs were selected within 10 kb of both ends of the defined gene range. Genomic coordinates are based on 1000 Genomes build 37 (hg19).