Yamin Deng, Tao He, Ruiling Fang, Shaoyu Li, Hongyan Cao, Yuehua Cui.
Abstract
Genome-wide association studies focusing on a single phenotype have been broadly conducted to identify genetic variants associated with a complex disease. The commonly applied single-variant analysis is limited because it does not consider the complex interactions between variants, which motivated the development of association analyses focusing on genes or gene sets. Moreover, when multiple correlated phenotypes are available, methods based on a multi-trait analysis can improve the association power. However, most currently available multi-trait analyses are single-variant analyses; they therefore have limited power when disease variants function as a group in a gene or a gene set. In this work, we propose a genome-wide gene-based multi-trait analysis method that treats genes as testing units. For a given phenotype, we adopt a rapid and powerful kernel-based testing method that evaluates the joint effect of multiple variants within a gene. The joint effect, either linear or nonlinear, is captured through kernel functions. Given a series of candidate kernel functions, we propose an omnibus test strategy to integrate the test results based on the different candidate kernels. A p-value combination method is then applied to integrate dependent p-values to assess the association between a gene and multiple correlated phenotypes. Simulation studies show reasonable type I error control and excellent power of the proposed method compared to its counterparts. We further show the utility of the method by applying it to two data sets, the Human Liver Cohort and the Alzheimer Disease Neuroimaging Initiative data set, in which novel genes are identified. Our method has broad applications in other fields in which the interest is to evaluate the joint effect (linear or nonlinear) of a set of variants.
Keywords: gene-based association; kernel function; multi-trait; nonlinear effect; p-value combination
Year: 2020 PMID: 32508874 PMCID: PMC7248273 DOI: 10.3389/fgene.2020.00437
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
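The pipeline outlined in the abstract (a kernel-based gene-level test per phenotype, an omnibus step over candidate kernels, then a combination of dependent p-values across traits) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the authors' implementation: the statistic Q = r'Kr with a permutation p-value stands in for the paper's kernel test, the Cauchy combination stands in for both the omnibus and the multi-trait combination steps (the record does not name the methods used), and the function names, the Gaussian bandwidth, and the simulated data are hypothetical.

```python
import numpy as np

def linear_kernel(G):
    """Linear kernel: K[i, j] = inner product of genotype vectors i and j."""
    return G @ G.T

def gaussian_kernel(G, bandwidth=None):
    """Gaussian kernel: K[i, j] = exp(-||g_i - g_j||^2 / bandwidth)."""
    sq = np.sum(G ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (G @ G.T), 0.0)
    if bandwidth is None:
        bandwidth = G.shape[1]  # assumption: scale by the number of variants
    return np.exp(-d2 / bandwidth)

def kernel_score_pvalue(y, K, n_perm=999, rng=None):
    """Permutation p-value for Q = r' K r, where r is the centered phenotype."""
    rng = np.random.default_rng(rng)
    r = y - y.mean()
    q_obs = r @ K @ r
    exceed = 0
    for _ in range(n_perm):
        rp = rng.permutation(r)  # break the genotype-phenotype link
        if rp @ K @ rp >= q_obs:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)

def cauchy_combine(pvals):
    """Cauchy combination of p-values; used here as a stand-in for dependent tests."""
    p = np.clip(np.asarray(pvals, dtype=float), 1e-15, 1 - 1e-15)
    t = np.mean(np.tan((0.5 - p) * np.pi))
    return 0.5 - np.arctan(t) / np.pi

# Hypothetical data: one gene with 8 variants, two correlated traits.
rng = np.random.default_rng(0)
n, m = 100, 8
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
shared = rng.normal(size=n)  # shared noise makes the traits correlated
traits = [
    0.5 * G[:, 0] * G[:, 1] + shared + rng.normal(size=n),  # nonlinear effect
    0.5 * G[:, 0] + shared + rng.normal(size=n),            # linear effect
]
kernels = [linear_kernel(G), gaussian_kernel(G)]

# Omnibus over kernels for each trait, then combination across traits.
p_trait = [
    cauchy_combine([kernel_score_pvalue(y, K, n_perm=199, rng=1) for K in kernels])
    for y in traits
]
p_gene = cauchy_combine(p_trait)
print(p_trait, p_gene)
```

Permutation keeps the sketch free of distributional approximations at the cost of speed; a genome-wide scan would more plausibly rely on an analytic null distribution for Q.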
The type I error rate of different methods under different settings.
| 100 | 0.3 | 0.059 | 0.037 | 0.052 |
| 100 | 0.8 | 0.045 | 0.052 | 0.041 |
| 200 | 0.3 | 0.050 | 0.061 | 0.038 |
| 200 | 0.8 | 0.048 | 0.049 | 0.032 |
| 400 | 0.3 | 0.048 | 0.064 | 0.052 |
| 400 | 0.8 | 0.051 | 0.061 | 0.061 |
| 100 | 0.3 | 0.044 | 0.052 | 0.046 |
| 100 | 0.8 | 0.049 | 0.038 | 0.044 |
| 200 | 0.3 | 0.061 | 0.041 | 0.046 |
| 200 | 0.8 | 0.041 | 0.067 | 0.043 |
| 400 | 0.3 | 0.051 | 0.057 | 0.035 |
| 400 | 0.8 | 0.047 | 0.050 | 0.037 |
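Empirical type I error rates such as those tabulated above are Monte Carlo estimates, so some scatter around the nominal level is expected. The sketch below shows how that scatter could be quantified with a binomial standard error. The number of simulation replicates is not given in this record, so the value 1000 used in the example is purely an assumption, as is the function name.

```python
import math

def type1_error_summary(rejections, n_reps, alpha=0.05):
    """Empirical rejection rate and the approximate 95% Monte Carlo band around alpha."""
    rate = rejections / n_reps
    # Standard error of a binomial proportion when the true rate equals alpha.
    se = math.sqrt(alpha * (1.0 - alpha) / n_reps)
    band = (alpha - 1.96 * se, alpha + 1.96 * se)
    return rate, se, band

# Hypothetical example: 59 rejections in an assumed 1000 null replicates.
rate, se, band = type1_error_summary(59, 1000)
print(rate, round(se, 4), band)
```

With fewer replicates the band widens, so whether a given entry counts as "reasonable" control depends on the replicate count actually used in the simulations.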
FIGURE 1. The testing power of different methods under the four scenarios with p = 0.3.
FIGURE 2. The testing power of different methods under the four scenarios with p = 0.8.
FIGURE 3. The Q–Q plot of the observed −log10(p-value) versus the expected −log10(p-value) for the six enzyme traits based on the multi-trait analysis.
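For readers unfamiliar with this type of plot, the coordinates of a Q–Q plot of −log10(p-values) can be computed as in the following sketch. It assumes the common plotting positions (i − 0.5)/n for the expected uniform quantiles; the paper may have used a different convention, and the function name is hypothetical.

```python
import numpy as np

def qq_points(pvals):
    """Return (expected, observed) -log10 p-values, paired from most to least significant."""
    p = np.sort(np.asarray(pvals, dtype=float))  # ascending p-values
    n = p.size
    # Expected quantiles of Uniform(0, 1) under the null hypothesis.
    expected = -np.log10((np.arange(1, n + 1) - 0.5) / n)
    observed = -np.log10(p)
    return expected, observed

# Toy p-values, for illustration only.
expected, observed = qq_points([0.5, 0.01, 0.2])
print(expected, observed)
```

Points lying along the diagonal indicate p-values consistent with the null distribution, while an upward departure in the upper tail suggests true associations.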
List of top genes and the p-values with different methods in the Human Liver Cohort study.
| 80 | 6 | 1.11E−05 | 0.1227 | 0.1048 | |
| 58 | 21 | 1.29E−05 | 0.0072 | 0.1003 | |
| 42 | 19 | 4.22E−05 | 0.0425 | 0.1022 | |
| 150 | 1 | 5.53E−05 | 0.0789 | 0.0926 |
FIGURE 4. The Q–Q plots of the observed −log10(p-value) versus the expected −log10(p-value) for the five cortical regions based on the multi-trait analysis.
List of top genes and the p-values with different methods in the Alzheimer Disease Neuroimaging Initiative study.
| 731 | 10 | 3.45E−06 | 0.0004 | 0.2572 | |
| 320 | 3 | 6.60E−06 | 0.0238 | 0.4595 | |
| 2,457 | 11 | 8.37E−06 | 0.1373 | 0.0165 | |
| 89 | 12 | 9.64E−06 | 0.6580 | 0.1698 | |
| 2,234 | 1 | 1.03E−05 | 0.1887 | 0.1364 | |
| 170 | 14 | 1.83E−05 | 0.5421 | 0.2648 | |
| 468 | 2 | 2.25E−05 | 0.0017 | 0.0077 | |
| 1,444 | 15 | 2.29E−05 | 0.0003 | 0.3606 | |
| 200 | 16 | 2.30E−05 | 0.0213 | 0.0093 | |
| 153 | 9 | 3.45E−05 | 0.0015 | 0.1364 | |
| 663 | 5 | 3.69E−05 | 0.1254 | 0.0232 | |
| 772 | 3 | 4.10E−05 | 0.1855 | 0.0036 |