Melissa G Naylor, Xihong Lin, Scott T Weiss, Benjamin A Raby, Christoph Lange.
Abstract
BACKGROUND: Discovering associations between genetic markers and gene expression levels can provide insight into gene regulation and, potentially, mechanisms of disease. Such analyses typically involve a linkage or association analysis in which expression data are used as phenotypes. This approach leads to a large number of multiple comparisons and may therefore lack power. We assess the potential of applying canonical correlation analysis to partitioned genomewide data as a method for discovering regulatory variants.
Year: 2010 PMID: 20485529 PMCID: PMC2869348 DOI: 10.1371/journal.pone.0010395
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Type one error.
| Sample Size | Type I Error |
| 30 | 0.124 |
| 40 | 0.068 |
| 50 | 0.057 |
| 60 | 0.054 |
| 100 | 0.047 |
| 200 | 0.047 |
| 400 | 0.049 |
| 500 | 0.049 |
| 1000 | 0.048 |
Type one error of Bartlett's test for correlation between 20 SNPs and three gene expression traits simulated under the null hypothesis of no correlation.
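For orientation, the statistic behind this table can be sketched as follows. This assumes the usual textbook form of Bartlett's chi-square approximation for testing that all canonical correlations are zero; the source does not spell out the exact form used, and the function name and arguments below are illustrative only.

```python
import math


def bartlett_cca_statistic(canonical_correlations, n, p, q):
    """Bartlett's chi-square approximation for H0: no correlation between two sets.

    canonical_correlations: sample canonical correlations between the two sets
    n: number of subjects
    p, q: number of variables in each set (e.g. 20 SNPs and 3 expression traits)
    Returns (statistic, degrees_of_freedom). Assumed textbook form, not
    necessarily the exact variant used by the authors.
    """
    # log of Wilks' lambda: sum of log(1 - r_i^2) over the canonical correlations
    log_wilks = sum(math.log(1.0 - r * r) for r in canonical_correlations)
    statistic = -(n - 1 - (p + q + 1) / 2.0) * log_wilks
    return statistic, p * q


# 20 SNPs and 3 traits give 20 * 3 = 60 degrees of freedom.
stat, df = bartlett_cca_statistic([0.5, 0.3, 0.1], n=400, p=20, q=3)
```

The statistic is compared with a chi-square distribution on `p * q` degrees of freedom. Because that approximation is asymptotic, it is consistent with the inflated type I error seen in the table at sample sizes of 30 and 40.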
Power.
| Sample Size | Genetic Model | Analysis | Heritability | | | | | | | | |
| | | | 0 | 0.03 | 0.06 | 0.09 | 0.12 | 0.15 | 0.18 | 0.21 | 0.24 |
| 400 | additive | CCA | 4.9 | 54 | 91 | 99 | 100 | 100 | 100 | 100 | 100 |
| 400 | additive | regression | 0.2 | 56 | 95 | 100 | 100 | 100 | 100 | 100 | 100 |
| 400 | recessive | CCA | 5.0 | 18 | 35 | 49 | 58 | 65 | 68 | 72 | 74 |
| 400 | recessive | regression | 0.2 | 12 | 34 | 50 | 60 | 66 | 71 | 75 | 77 |
| 200 | additive | CCA | 4.5 | 25 | 53 | 77 | 92 | 97 | 99 | 100 | 100 |
| 200 | additive | regression | 0.2 | 19 | 57 | 84 | 96 | 99 | 100 | 100 | 100 |
| 200 | recessive | CCA | 4.9 | 11 | 18 | 25 | 34 | 40 | 47 | 52 | 56 |
| 200 | recessive | regression | 0.2 | 4 | 14 | 24 | 35 | 44 | 52 | 57 | 60 |
Estimated percent power of Bartlett's test and univariate regression after Bonferroni correction. (Genetic model is not applicable when heritability is zero because no genetic effect is simulated.)
Figure 1. Histograms of the rank of the SNP of interest in simulations.
Each panel represents a distinct set of simulation conditions: number of subjects, genetic model, and heritability. For each simulated dataset in which there was a significant correlation between SNPs and gene expression traits (p < 0.05 for Bartlett's test), SNPs were ranked according to the magnitude of their coefficients in the top canonical variate. Each panel is a histogram of ranks of the SNP of interest. The first column shows that the rank is uniformly distributed under the null hypothesis. The axis labels on the upper left plot apply to all plots.
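The ranking step described in this caption can be sketched as below. It is a minimal illustration, assuming the coefficients of the top canonical variate are already available as a list; the function name and the tie-handling rule are hypothetical choices, not taken from the source.

```python
def rank_of_snp(coefficients, snp_index):
    """Rank (1 = largest) of one SNP by absolute coefficient in the top canonical variate.

    coefficients: SNP coefficients of the top canonical variate (hypothetical input)
    snp_index: position of the SNP of interest in that list
    """
    target = abs(coefficients[snp_index])
    # Count SNPs with a strictly larger magnitude; tied SNPs share the better rank.
    return 1 + sum(1 for c in coefficients if abs(c) > target)


# The sign is ignored: -0.9 has the largest magnitude, so it ranks first.
example_rank = rank_of_snp([0.1, -0.9, 0.4], snp_index=1)
```

Collecting this rank over the simulated datasets with a significant Bartlett's test would give the values plotted in each histogram.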
Figure 2. Power for three methods of determining a significant finding.
The red line shows power to detect a correlation in the simulated region via Bartlett's test with p < 0.05. The blue line shows power to detect a correlation and have the SNP of interest be among the five top-ranking SNPs (based on magnitude of SNP coefficients). The green line shows power to detect a correlation and have the SNP of interest be the top-ranking SNP. The black line shows power to detect an association between the SNP of interest and at least one of the three genes using univariate regression.
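The three CCA-based criteria in this caption amount to counting simulated datasets that satisfy progressively stricter conditions. A rough sketch is given below, assuming each simulation yields a Bartlett's test p-value and the rank of the SNP of interest; the function name, the input format, and the default `alpha=0.05` are assumptions made for illustration.

```python
def power_by_criterion(results, alpha=0.05, top_k=5):
    """Estimate power under the three CCA-based criteria.

    results: iterable of (bartlett_p_value, rank_of_snp_of_interest), one pair
             per simulated dataset (hypothetical input format)
    alpha: significance threshold for Bartlett's test (assumed value)
    top_k: how many top-ranked SNPs count as a hit for the middle criterion
    """
    results = list(results)
    if not results:
        raise ValueError("at least one simulated dataset is required")
    n = len(results)
    # Only datasets with a significant Bartlett's test can count as detections.
    significant = [(p, r) for p, r in results if p < alpha]
    return {
        "any_correlation": len(significant) / n,
        "top_k": sum(1 for _, r in significant if r <= top_k) / n,
        "top_1": sum(1 for _, r in significant if r == 1) / n,
    }


power = power_by_criterion([(0.01, 1), (0.01, 3), (0.01, 10), (0.5, 1)])
```

Each criterion is a subset of the previous one, so the three estimates can only decrease from the first to the last.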
Canonical correlation analysis of Childhood Asthma Management Study with maximum number of SNPs (ms) equal to 20 and 50.
| | Univariate Regression | All Tests, ms = 20 | All Tests, ms = 50 | Tests with >1 Probe, ms = 20 | Tests with >1 Probe, ms = 50 |
| Total number of tests | 1759512 | 97773 | 42667 | 156 | 1488 |
| Number of tests significant at the 0.05 level after Bonferroni correction | 1749 | 412 | 177 | 6 (10) | 14 (17) |
| Number of SNPs in a significant CCA test with non-zero coefficient in the top canonical variate | | 6434 | 5669 | 151 | 463 |
| Number of SNPs significant in both a univariate regression and in a CCA test | | 908 | 575 | 23 (25) | 53 (62) |
| Number of SNPs significant in a univariate regression and ranked number one in a significant CCA test | | 177 | 70 | 6 (9) | 10 (13) |
| Number of SNP-probe pairs significant in a univariate regression and ranked number one probe and SNP in a significant CCA test | | 174 | 70 | 6 (8) | 10 (13) |
| Number of CCA tests with 3 probes | | 107 | 1138 | | |
| Number of CCA tests with 2 probes | | 49 | 350 | | |
| Number of CCA tests with 1 probe | | 97617 | 41179 | | |
Where two numbers are listed, the second is adjusted only for the number of tests with more than one probe.
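The test counts in this table can be cross-checked with a little arithmetic, and the same counts give the Bonferroni per-test thresholds. The sketch below uses only numbers from the table; the dictionary layout and names are illustrative, and the 0.05 level is the one named in the table row on significance.

```python
# Probe-count breakdown of CCA tests, keyed by the maximum number of SNPs (ms).
probe_counts = {
    20: {"3 probes": 107, "2 probes": 49, "1 probe": 97617},
    50: {"3 probes": 1138, "2 probes": 350, "1 probe": 41179},
}

# Totals reported in the "Total number of tests" row.
reported_all_tests = {20: 97773, 50: 42667}
reported_multi_probe = {20: 156, 50: 1488}

for ms, counts in probe_counts.items():
    # All tests = tests with 1, 2, or 3 probes.
    assert sum(counts.values()) == reported_all_tests[ms]
    # Tests with more than one probe = tests with 2 or 3 probes.
    assert counts["3 probes"] + counts["2 probes"] == reported_multi_probe[ms]

# Bonferroni per-test threshold at the 0.05 level for each total test count.
totals = [1759512, 97773, 42667, 156, 1488]
thresholds = {n: 0.05 / n for n in totals}
```

The threshold for 156 tests is roughly 600 times less stringent than the one for 97773 tests. That is why the second number in a pair such as "6 (10)" is never smaller than the first.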