Abstract
In genome-wide association studies (GWAS), haplotype analyses of SNP data are neglected in favour of single-point analysis of associations. In a recent GWAS, we found that none of the known candidate genes for intramuscular fat (IMF) had been identified. In this study, data from the GWAS for these candidate genes were re-analysed as haplotypes. First, we confirmed that the methodology would find evidence for association between haplotypes in candidate genes of the calpain-calpastatin complex and musculus longissimus lumborum peak force (LLPF), because these genes had been confirmed through single-point analysis in the GWAS. Then, for intramuscular fat percent (IMF), we found significant partial haplotype substitution effects for the genes ADIPOQ and CXCR4, as well as suggestive associations to the genes CEBPA, FASN, and CAPN1. Haplotypes for these genes explained 80% more of the phenotypic variance than the best single SNP. For some genes, the analyses suggested that there was more than one causative mutation, or confirmed that some causative mutations are limited to particular subgroups of a species. Fitting the SNPs and their interactions simultaneously explained a similar amount of the phenotypic variance to the haplotype analyses. Haplotype analysis is a neglected part of the suite of tools used to analyse GWAS data; it would be a useful method to extract more information from these data sets and may contribute to reducing the missing heritability problem.
Year: 2011 PMID: 22216329 PMCID: PMC3247274 DOI: 10.1371/journal.pone.0029601
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Comparison of linkage disequilibrium (LD) measures for the genes in the study.
A. D′ values are filled black circles, r² values are open black circles. Least squares fitted regression lines of LD on length of haplotype (D′ solid line, r² dashed line) are not statistically significant and the slopes are b < −1×10⁻⁵. This is evidence that the length differences between haplotypes are not important in accounting for LD between SNPs in this sample of genes. Values are means of LD estimates for each breed, not calculated from a sample of mixed breed individuals. B. Plot of D′ against r². Most of the comparisons between pairs of SNPs show high D′ and low r² values, a typical result for cattle at this distance between SNPs. High D′ values can indicate a reduced number of haplotypes or classes of haplotypes that are missing. r² values are useful in describing how well the genotypes at one SNP predict the genotypes at the other SNP.
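The two LD measures compared in this figure can be computed from haplotype and allele frequencies. The sketch below uses the standard definitions of D, D′ and r² for a pair of biallelic SNPs. The function name `ld_measures` and the example frequencies are illustrative assumptions, not values or code from the study.

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, D', r2) for two biallelic SNPs.

    p_ab: frequency of the haplotype carrying the reference allele at both SNPs
    p_a:  frequency of the reference allele at SNP 1
    p_b:  frequency of the reference allele at SNP 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both SNPs must be polymorphic")
    d = p_ab - p_a * p_b
    # D' scales |D| by the largest value it could take given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    # r2 is the squared correlation between alleles at the two SNPs
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies: every copy of the rare allele at SNP 2 sits on the
# same SNP 1 background, so one of the four possible haplotypes is missing.
d, d_prime, r2 = ld_measures(p_ab=0.10, p_a=0.50, p_b=0.10)
print(f"D={d:.3f}  D'={d_prime:.2f}  r2={r2:.3f}")
```

With these made-up frequencies D′ is 1 while r² is about 0.11, which is the pattern the caption describes: a missing haplotype class gives a high D′ even when one SNP predicts the other poorly.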
SNPs of the calpain-calpastatin gene haplotypes associated as single point associations to LLPF in the GWAS.
| SNP | A | B | Bta | Position (bp) | % variance | b* | s.e. | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ARS-BFGL-NGS-43901 | A | C | 7 | 97492911 | 0.0 | −0.026 | 0.040 | 0.5171 |
| ARS-USMARC-670 | A | G | 7 | 97524770 | 0.1 | 0.061 | 0.037 | 0.0952 |
| ARS-USMARC-116 | A | G | 7 | 97561407 | 0.1 | 0.065 | 0.040 | 0.1059 |
| ARS-BFGL-NGS-13350 | A | G | 10 | 37625930 | 0.1 | 0.026 | 0.038 | 0.4841 |
| Hapmap47063-BTA-62293 | A | G | 10 | 37647411 | 0.0 | 0.032 | 0.038 | 0.3931 |
| ARS-BFGL-BAC-12264 | A | G | 10 | 37675399 | 0.0 | 0.007 | 0.056 | 0.9204 |
| ARS-BFGL-NGS-21416 | A | G | 29 | 45202710 | 0.1 | 0.046 | 0.046 | 0.3176 |
| CAPN1_1 | C | G | 29 | 45221190 | 1.1 | 0.179 | 0.040 | 8.8e-06 |
| CAPN1_2 | A | G | 29 | 45239821 | 0.9 | −0.138 | 0.043 | 0.0012 |
*Regressions were performed on number of copies of the B allele.
b: regression coefficient of LLPF regressed on number of B allele copies.
s.e.: standard error of b.
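The footnotes describe a regression of the phenotype on the number of copies of the B allele. A minimal sketch of that idea follows, assuming a simple linear regression with an intercept only: any fixed effects or covariates used in the GWAS are not described in this section and are omitted. The function names, the toy data, and the normal approximation used for P are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def two_sided_p(b, se):
    """Two-sided P for b / se under a normal approximation (an assumption)."""
    z = abs(b / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


def regress_on_allele_count(copies, phenotype):
    """Regress phenotype on B allele count (0, 1 or 2); return (b, se, P)."""
    n = len(copies)
    mean_x = sum(copies) / n
    mean_y = sum(phenotype) / n
    sxx = sum((x - mean_x) ** 2 for x in copies)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(copies, phenotype))
    b = sxy / sxx                      # change in phenotype per B allele copy
    intercept = mean_y - b * mean_x
    rss = sum((y - intercept - b * x) ** 2 for x, y in zip(copies, phenotype))
    se = math.sqrt(rss / (n - 2) / sxx)  # standard error of b
    return b, se, two_sided_p(b, se)


# Toy data (hypothetical): B allele counts and phenotypes for six animals
copies = [0, 0, 1, 1, 2, 2]
phenotype = [1.0, 1.2, 1.5, 1.4, 2.1, 1.9]
b, se, p = regress_on_allele_count(copies, phenotype)
print(f"b={b:.3f}  s.e.={se:.3f}  P={p:.2e}")

# Rough consistency check against a tabulated row (CAPN1_1: b=0.179, s.e.=0.040)
print(f"approximate P for CAPN1_1: {two_sided_p(0.179, 0.040):.1e}")
```

The normal approximation gives a P value for CAPN1_1 of the same order of magnitude as the tabulated 8.8e-06; an exact match is not expected because the table values are rounded and the test statistic used in the study is not stated here.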
Calpain-calpastatin gene haplotypes associated to LLPF.
| Haplotype | % variance | b* | s.e. | P |
| --- | --- | --- | --- | --- |
| excluding MHF<0.05 | | | | |
| h222 | 3.4 | −0.238 | 0.125 | 0.0584 |
| h221 | | −0.302 | 0.124 | 0.0151 |
| h122 | | −0.241 | 0.113 | 0.0323 |
| h121 | | −0.154 | 0.110 | 0.1596 |
| h112 | | −0.418 | 0.114 | 0.0002 |
| excluding h112 | 1.5 | | | |
| h222 | | 0.136 | 0.074 | 0.0682 |
| h221 | | 0.067 | 0.075 | 0.3753 |
| h122 | | 0.121 | 0.056 | 0.0297 |
| h121 | | 0.197 | 0.056 | 0.0005 |
| only h112 | | | | |
| h112 | 2.2 | −0.206 | 0.047 | 1.15e-05 |
| | | | | |
| excluding MHF<0.05 | | | | |
| h222 | 1.6 | 0.397 | 0.138 | 0.0042 |
| h212 | | 0.360 | 0.149 | 0.0161 |
| h211 | | 0.221 | 0.127 | 0.0819 |
| h122 | | 0.211 | 0.132 | 0.1122 |
| h121 | | 0.338 | 0.147 | 0.0216 |
| only h222 | | | | |
| h222 | 0.9 | 0.162 | 0.075 | 0.0319 |
| | | | | |
| excluding MHF<0.05 | | | | |
| h222 | 0.4 | −0.186 | 0.202 | 0.3561 |
| h221 | | −0.151 | 0.191 | 0.4292 |
| h211 | | −0.193 | 0.191 | 0.3135 |
| h122 | | −0.243 | 0.202 | 0.2298 |
| h121 | | −0.208 | 0.190 | 0.2751 |
| h211 analysed by breed | | | | |
| ANG | 3.1 | −0.005 | 0.079 | 0.9476 |
| HFD | | −0.088 | 0.107 | 0.4151 |
| MGY | | −0.184 | 0.154 | 0.2327 |
| SHN | | −0.084 | 0.135 | 0.5362 |
| BEL | | 0.052 | 0.095 | 0.5850 |
| SGT | | −0.049 | 0.122 | 0.6861 |
| BRM | | 2.029 | 0.471 | 1.8e-05 |
*b: regression of LLPF on number of copies of the haplotype.
s.e.: standard error of b.
h111 is the haplotype of all the A alleles (AAA) while h222 is the haplotype of all the B alleles (BBB); see Table 1 for the code of A and B alleles.
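The haplotype labels follow directly from the A and B allele coding: 1 for the A allele and 2 for the B allele at each SNP, read in chromosome order. The sketch below shows one way to build such labels and to count copies of a haplotype in an animal, which is the predictor used in the regressions. It assumes phased haplotypes are already available (the phasing method is not described in this section), and the function names are hypothetical. The allele codes are those listed for the three Bta 29 SNPs in the single point table.

```python
def haplotype_code(alleles, a_alleles, b_alleles):
    """Label one phased haplotype as 'h' plus a 1 (A allele) or 2 (B allele) per SNP."""
    digits = []
    for observed, a, b in zip(alleles, a_alleles, b_alleles):
        if observed == a:
            digits.append("1")
        elif observed == b:
            digits.append("2")
        else:
            raise ValueError(f"allele {observed!r} is neither {a!r} nor {b!r}")
    return "h" + "".join(digits)


def haplotype_copies(diplotype, code):
    """Number of copies (0, 1 or 2) of a haplotype carried by one animal."""
    return sum(1 for hap in diplotype if hap == code)


# A and B alleles of ARS-BFGL-NGS-21416, CAPN1_1 and CAPN1_2 (from the SNP table)
A_ALLELES = ("A", "C", "A")
B_ALLELES = ("G", "G", "G")

# Hypothetical animal carrying one all-A haplotype and one mixed haplotype
animal = (
    haplotype_code(("A", "C", "A"), A_ALLELES, B_ALLELES),
    haplotype_code(("A", "G", "G"), A_ALLELES, B_ALLELES),
)
print(animal, haplotype_copies(animal, "h122"))
```

The resulting copy counts can be used as the predictor in the same kind of regression as for single SNPs, with one count per haplotype.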
Single point SNP associations of candidate genes for IMF in the GWAS.
| SNP | A | B | Bta | Position (bp) | % variance | b | s.e. | P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ARS-BFGL-NGS-26946 | A | G | 1 | 82201457 | 0.0 | 0.049 | 0.103 | 0.6316 |
| Hapmap43250-BTA-37524 | A | G | 1 | 82245379 | 1.4 | −0.956 | 0.378 | 0.0117 |
| BTB-00035080 | A | G | 1 | 82271202 | 1.4 | −0.642 | 0.274 | 0.0191 |
| ARS-BFGL-NGS-117383 | A | G | 2 | 63905821 | 0.0 | 0.331 | 0.413 | 0.4239 |
| Hapmap55796-rs29011172 | A | T | 2 | 63947669 | 1.1 | −0.427 | 0.135 | 0.0016 |
| ARS-BFGL-NGS-119079 | A | G | 2 | 63998173 | 0.9 | 0.297 | 0.116 | 0.0107 |
| ARS-BFGL-NGS-105692 | A | G | 18 | 43119331 | 0.0 | −0.076 | 0.176 | 0.6715 |
| ARS-BFGL-NGS-21339 | A | G | 18 | 43150185 | 1.4 | 0.265 | 0.101 | 0.0092 |
| BTA-43268-no-rs | A | G | 18 | 43170819 | 0.1 | −0.188 | 0.123 | 0.1273 |
| ARS-BFGL-NGS-21416 | A | G | 29 | 45202710 | 0.1 | −0.019 | 0.104 | 0.8625 |
| CAPN1_1 | C | G | 29 | 45221190 | 0.2 | −0.192 | 0.091 | 0.0348 |
| CAPN1_2 | A | G | 29 | 45239821 | 0.0 | −0.008 | 0.099 | 0.9204 |
This list consists of all the genes with at least 1 SNP with P<0.05 to IMF; the full list is in the supplementary online material.
Note that CXCR4 is the closest gene to the significant SNPs, but these SNPs are not located within the gene itself.
Haplotype associations of candidate genes for IMF in the GWAS.
| Haplotype | % variance | b | s.e. | P |
| --- | --- | --- | --- | --- |
| h222 | 2.4 | −0.985 | 0.238 | 3.8e-05 |
| h122 | | −0.879 | 0.224 | 9.2e-05 |
| | | | | |
| h222 | 1.7 | −0.307 | 0.120 | 0.0105 |
| h221 | | −0.518 | 0.148 | 5.0e-04 |
| h122 | | −0.485 | 0.291 | 0.0960 |
| | | | | |
| h212 | 1.0 | 0.337 | 0.119 | 0.0048 |
| | | | | |
| h221 | 1.9 | 0.192 | 0.284 | 0.4997 |
| h212 | | −0.292 | 0.097 | 0.0028 |
| h211 | | −0.126 | 0.163 | 0.4413 |
| h122 | | 0.026 | 0.212 | 0.9007 |
| h121 | | −0.374 | 0.231 | 0.1054 |
| | | | | |
| h212 | 1.5 | −0.236 | 0.078 | 0.0027 |
| | | | | |
| h222 | 1.8 | 0.302 | 0.224 | 0.1781 |
| h221 | | 0.224 | 0.205 | 0.2737 |
| h212 | | 0.051 | 0.205 | 0.8018 |
| h121 | | 0.338 | 0.264 | 0.2004 |
| h112 | | 0.519 | 0.229 | 0.0238 |
| | | | | |
| h221 | 1.3 | −0.102 | 0.091 | 0.2623 |
| h212 | | −0.287 | 0.100 | 0.0042 |
| | | | | |
| h212 | 1.2 | −0.239 | 0.090 | 0.0081 |
| | | | | |
| h222 | 1.0 | 0.168 | 0.232 | 0.4674 |
| h221 | | 0.156 | 0.230 | 0.4985 |
| h122 | | −0.102 | 0.208 | 0.6230 |
| h121 | | 0.080 | 0.203 | 0.6936 |
| h112 | | 0.186 | 0.210 | 0.3762 |
| | | | | |
| h222 | 1.0 | 0.003 | 0.135 | 0.9847 |
| h221 | | −0.007 | 0.138 | 0.9564 |
| h122 | | −0.263 | 0.102 | 0.0100 |
| h121 | | −0.074 | 0.102 | 0.4684 |
This list consists of all the genes with at least 1 haplotype with P<0.05 to IMF; the full list is in the supplementary online material.
Note that CXCR4 is the closest gene to the significant SNPs, but these SNPs are not located within the gene itself.
Figure 2. Plot of −logP values for SNPs compared to haplotypes for candidate genes for IMF.
The SNPs are numbered 1, 2, and 3 in order along the chromosome; in the haplotypes, 1 = A allele and 2 = B allele at each SNP. Haplotypes were fitted simultaneously. Note that CXCR4 is the closest gene to the significant SNPs, but these SNPs are not located within the gene itself.
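The −logP scale used in this figure turns small P values into large positive numbers, so stronger evidence plots higher. A minimal sketch follows, assuming base 10 logarithms (the caption does not state the base). The two P values are the smallest single point P and the smallest haplotype P listed in the IMF tables; the function name is hypothetical.

```python
import math


def minus_log10_p(p):
    """Convert a P value to the -log10 scale (base 10 is an assumption)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P must be in (0, 1]")
    return -math.log10(p)


smallest_snp_p = 0.0016         # smallest single point P in the IMF SNP table
smallest_haplotype_p = 3.8e-05  # smallest P in the IMF haplotype table

print(f"SNP: {minus_log10_p(smallest_snp_p):.2f}")
print(f"haplotype: {minus_log10_p(smallest_haplotype_p):.2f}")
```

On this scale the haplotype value is roughly 4.4 against roughly 2.8 for the SNP, a difference of more than one and a half orders of magnitude in P.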
Percent of phenotypic variance for IMF explained by haplotypes compared to SNPs.
| Gene | | | | | |
| --- | --- | --- | --- | --- | --- |
| Total variance | | | | | |
| 3 SNPs summed | 2.8 | 0.3 | 2.0 | 1.5 | 1.2 |
| 3 SNPs simultaneous | 2.2 | 0.7 | 1.7 | 1.5 | 1.4 |
| 3 SNPs plus interactions | 2.6 | 0.9 | 1.8 | 1.6 | 1.6 |
| 3-SNP haplotypes | 2.4 | 1.0 | 1.7 | 1.9 | 1.8 |
variance of each SNP estimated individually then summed across SNPs.
simultaneous estimate.
Note that CXCR4 is the closest gene to the significant SNPs, but these SNPs are not located within the gene itself.
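The rows of this table differ in how the variance is estimated: per SNP and then summed, from all SNPs fitted together, or with interaction terms added. The sketch below illustrates why these can disagree, using ordinary least squares R² on a toy data set with two correlated SNPs. It is a simplified illustration, not the model used in the study: the data are invented, only two SNPs are used, and `fit_r2` is a hypothetical helper that solves the normal equations directly.

```python
def fit_r2(columns, y):
    """R^2 of an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations: (X'X | X'y)
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * v for r, v in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for r in range(k):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[c])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(y) / n
    ss_total = sum((v - mean_y) ** 2 for v in y)
    ss_resid = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_resid / ss_total


# Toy data (hypothetical): B allele counts at two correlated SNPs, and a
# phenotype that is exactly the sum of the two counts.
snp1 = [0, 0, 1, 1, 2, 2]
snp2 = [0, 1, 1, 1, 2, 2]
y = [a + b for a, b in zip(snp1, snp2)]
product = [a * b for a, b in zip(snp1, snp2)]  # interaction term

summed = fit_r2([snp1], y) + fit_r2([snp2], y)
simultaneous = fit_r2([snp1, snp2], y)
with_interaction = fit_r2([snp1, snp2, product], y)
print(f"summed={summed:.2f}  simultaneous={simultaneous:.2f}  "
      f"plus interaction={with_interaction:.2f}")
```

In this toy case the summed figure exceeds 1 because the variance shared by the two correlated SNPs is counted twice, while the simultaneous fit cannot exceed 1. The table shows that with real data the summed value can fall on either side of the simultaneous one.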