Huiyu Wang, Ignacy Misztal, Ignacio Aguilar, Andres Legarra, Rohan L Fernando, Zulma Vitezica, Ron Okimoto, Terry Wing, Rachel Hawken, William M Muir.
Abstract
The purpose of this study was to compare results obtained from various methodologies for genome-wide association studies, when applied to real data, in terms of the number and commonality of regions identified and the genetic variance they explain, computational speed, and possible pitfalls in the interpretation of results. The methodologies were two iteratively reweighted single-step genomic BLUP procedures (ssGWAS1 and ssGWAS2), a single-marker model (CGWAS), and BayesB. The ssGWAS methods utilize genomic breeding values (GEBVs) based on combined pedigree, genomic, and phenotypic information, while CGWAS and BayesB utilize only phenotypes from genotyped animals or pseudo-phenotypes. In this study, ssGWAS was performed by converting GEBVs to SNP marker effects. Unequal marker variances were used to calculate weights for a new genomic relationship matrix, and the SNP weights were refined iteratively. The data were body weights at 6 weeks of age for 274,776 broiler chickens, of which 4553 were genotyped using a 60k SNP chip. Genomic regions were compared based on the genetic variance explained by local SNP regions (20 SNPs). After 3 iterations, the noise was greatly reduced for ssGWAS1 and the results were similar to those of CGWAS, with 4 of the top 10 regions in common. In contrast, for BayesB, the plot was dominated by a single region explaining 23.1% of the genetic variance. This same region was found by ssGWAS1 with the same rank, but the amount of genetic variation attributed to the region was only 3%. These findings emphasize the need for caution when comparing and interpreting results from various methods, and highlight that detected associations, and their strength, depend strongly on the methodology and the details of implementation. BayesB appears to overly shrink regions to zero, while overestimating the amount of genetic variation attributed to the remaining SNP effects. The real world is most likely a compromise between methods and remains to be determined.
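The iterative refinement of SNP weights mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each SNP weight is proportional to the squared SNP effect times the heterozygosity term 2p(1 - p), and that weights are rescaled to average 1; both choices are assumptions made here for illustration. The function name `update_snp_weights` and the toy inputs are hypothetical, and the preceding conversion of GEBVs to SNP effects is not shown.

```python
def update_snp_weights(snp_effects, allele_freqs):
    """One reweighting step for a list of SNPs.

    Assumption (not taken from the record above): the raw weight of SNP i is
    u_i**2 * 2 * p_i * (1 - p_i), and weights are rescaled so that they sum
    to the number of SNPs, i.e. the average weight is 1.
    """
    if len(snp_effects) != len(allele_freqs):
        raise ValueError("snp_effects and allele_freqs must have equal length")

    raw = [u * u * 2.0 * p * (1.0 - p) for u, p in zip(snp_effects, allele_freqs)]
    total = sum(raw)
    if total == 0.0:
        # No signal at all: fall back to equal weights.
        return [1.0] * len(raw)

    scale = len(raw) / total
    return [w * scale for w in raw]


# Toy example with four SNPs (hypothetical values).
effects = [0.10, -0.02, 0.00, 0.30]
freqs = [0.5, 0.2, 0.4, 0.1]
weights = update_snp_weights(effects, freqs)
```

In an iterative scheme, the returned weights would feed a weighted genomic relationship matrix, SNP effects would be re-estimated, and the step would be repeated for a fixed number of rounds.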
Keywords: BayesB; association mapping; body weight; broiler chicken; genome-wide association; ssGWAS
Year: 2014 PMID: 24904635 PMCID: PMC4033036 DOI: 10.3389/fgene.2014.00134
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Correlations of EBV obtained from regular BLUP and GEBVs.
| Method | Correlation with EBV |
| ssGWAS2/1 | 0.91 |
| ssGWAS2/2 | 0.90 |
| ssGWAS2/3 | 0.88 |
| ssGWAS2/4 | 0.87 |
| ssGWAS2/5 | 0.85 |
| BayesB | 0.90 |
| CGWAS | 0.71 |
GEBVs = genomic breeding values.
Single-step genomic analyses (ssGWAS), BayesB, and classical genome wide association (CGWAS).
ssGWAS2/1 = the first iteration of Scenario 2 (ssGWAS2) in ssGBLUP, which is equivalent to the first iteration of Scenario 1 (ssGWAS1/1).
BayesB with π = 0.9.
Comparison of accuracies of EBV obtained from regular BLUP and GEBVs.
| Method | Accuracy |
| EBV | 0.34 |
| ssGWAS2/1 | 0.44 |
| ssGWAS2/2 | 0.52 |
| ssGWAS2/3 | 0.52 |
| ssGWAS2/4 | 0.51 |
| ssGWAS2/5 | 0.50 |
GEBVs = genomic breeding values.
ssGWAS = single-step genomic association analyses.
ssGWAS2/1 = the first iteration of Scenario 2 (ssGWAS2) in ssGBLUP, which is equivalent to ssGWAS1/1.
Figure 1. Proportion of genetic variance explained by each 20-SNP region under Scenario 1 (ssGWAS1) of extended single-step genomic BLUP (ssGBLUP). (A) The first iteration (ssGWAS1/1). (B) The third iteration (ssGWAS1/3). (C) The fifth iteration (ssGWAS1/5). The x-axis represents the location of each 20-SNP region. The y-axis represents the proportion of genetic variance explained by each region.
Figure 4. Proportion of genetic variance explained by each 20-SNP region using BayesB with π = 0.9 implemented by GenSel. The x-axis represents the location of each 20-SNP region. The y-axis represents the proportion of genetic variance explained by each region.
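Rankings of top 10 regions.
| Iteration | Region 1 | Region 2 | Region 3 | Region 4 | Region 5 | Region 6 | Region 7 | Region 8 | Region 9 | Region 10 |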
| ssGWAS1/2 | 1 | 3 | 2 | 12 | 4 | 9 | 7 | 10 | 5 | 6 |
| ssGWAS1/3 | 1 | 3 | 2 | 21 | 4 | 11 | 7 | 15 | 8 | 6 |
| ssGWAS1/4 | 1 | 3 | 2 | 32 | 4 | 14 | 10 | 21 | 9 | 6 |
| ssGWAS1/5 | 1 | 2 | 4 | 36 | 3 | 14 | 19 | 18 | 10 | 6 |
| ssGWAS2/2 | 1 | 9 | 6 | 2 | 16 | 20 | 19 | 8 | 7 | 5 |
| ssGWAS2/3 | 1 | 110 | 62 | 29 | 8 | 233 | 57 | 31 | 21 | 16 |
| ssGWAS2/4 | 1 | 351 | 256 | 72 | 3 | 575 | 126 | 58 | 22 | 35 |
| ssGWAS2/5 | 1 | 479 | 472 | 100 | 2 | 766 | 179 | 86 | 25 | 50 |
Each region consists of 20 SNPs; in total there are 2031 regions across the whole genome.
ssGWAS = single-step genomic association analyses.
ssGWAS1/1 = the first iteration of Scenario 1 (ssGWAS1) in ssGBLUP, which is equivalent to ssGWAS2/1.
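A ranking table of this kind can be built by ranking all regions within each run and then looking up, for the top regions of a reference run, their ranks in another run. The sketch below is illustrative only: the function names and toy values are hypothetical, ties are broken by region index, and the inputs are assumed to be per-region shares of genetic variance listed in the same region order for both runs.

```python
def ranks_descending(values):
    """Return the rank of each element, where rank 1 is the largest value."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0] * len(values)
    for position, idx in enumerate(order, start=1):
        ranks[idx] = position
    return ranks


def follow_top_regions(reference, other, k=10):
    """For the top-k regions of `reference`, return (reference rank, other rank)."""
    if len(reference) != len(other):
        raise ValueError("both runs must cover the same regions")
    ref_ranks = ranks_descending(reference)
    other_ranks = ranks_descending(other)
    top = sorted(range(len(reference)), key=lambda i: ref_ranks[i])[:k]
    return [(ref_ranks[i], other_ranks[i]) for i in top]


def shared_top_regions(reference, other, k=10):
    """Count the regions that are in the top k of both runs."""
    pairs = follow_top_regions(reference, other, k)
    return sum(1 for _, other_rank in pairs if other_rank <= k)


# Toy example with four regions (hypothetical variance shares).
run_a = [5.0, 1.0, 3.0, 2.0]
run_b = [1.0, 5.0, 4.0, 0.0]
pairs = follow_top_regions(run_a, run_b, k=2)
common = shared_top_regions(run_a, run_b, k=2)
```

Here `pairs` lists how the two best regions of `run_a` rank in `run_b`, and `common` counts the regions that both runs place in their top two.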
Figure 2. Proportion of genetic variance explained by each 20-SNP region under Scenario 2 (ssGWAS2) of extended single-step genomic BLUP (ssGBLUP). (A) The first iteration (ssGWAS2/1). (B) The third iteration (ssGWAS2/3). (C) The fifth iteration (ssGWAS2/5). The x-axis represents the location of each 20-SNP region. The y-axis represents the proportion of genetic variance explained by each region.
Figure 3. Proportion of genetic variance explained by each 20-SNP region using classical genome-wide association studies (CGWAS) implemented by WOMBAT. The x-axis represents the location of each 20-SNP region. The y-axis represents the proportion of genetic variance explained by each region.
Rankings of the top 10 regions among different methods.
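Top 10 regions ranked by CGWAS
| Rank in CGWAS | chr | gVar(%) CGWAS | Rank in ssGWAS1/3 | gVar(%) ssGWAS1/3 | Rank in ssGWAS2/3 | gVar(%) ssGWAS2/3 | Rank in BayesB | gVar(%) BayesB |
| 1 | 6 | 3.07 | 2 | 1.29 | 62 | 0.38 | 2 | 2.35 |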
| 2 | 6 | 2.9 | 3 | 0.91 | 110 | 0.26 | 3 | 1.89 |
| 3 | 6 | 1.3 | 4 | 0.78 | 8 | 0.84 | 40 | 0.25 |
| 4 | 6 | 0.98 | 360 | 0.09 | 810 | 0.01 | 322 | 0.06 |
| 5 | 6 | 0.79 | 278 | 0.11 | 565 | 0.02 | 27 | 0.32 |
| 6 | 27 | 0.79 | 1 | 2.53 | 1 | 5.65 | 1 | 23.06 |
| 7 | 6 | 0.6 | 668 | 0.04 | 1216 | <0.01 | 1646 | 0 |
| 8 | 7 | 0.48 | 314 | 0.1 | 927 | <0.01 | 99 | 0.14 |
| 9 | 12 | 0.48 | 855 | 0.03 | 925 | <0.01 | 387 | 0.05 |
| 10 | 4 | 0.45 | 274 | 0.11 | 903 | <0.01 | 173 | 0.09 |
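| Total | | 11.84 | | 5.99 | | 7.16 | | 28.21 |
Top 10 regions ranked by BayesB
| Rank in BayesB | chr | gVar(%) BayesB | Rank in ssGWAS1/3 | gVar(%) ssGWAS1/3 | Rank in ssGWAS2/3 | gVar(%) ssGWAS2/3 | Rank in CGWAS | gVar(%) CGWAS |
| 1 | 27 | 23.06 | 1 | 2.53 | 1 | 5.65 | 6 | 0.79 |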
| 2 | 6 | 2.35 | 2 | 1.29 | 62 | 0.38 | 1 | 3.07 |
| 3 | 6 | 1.89 | 3 | 0.91 | 110 | 0.26 | 2 | 2.9 |
| 4 | 11 | 1.39 | 15 | 0.43 | 31 | 0.55 | 279 | 0.08 |
| 5 | 2 | 1.03 | 42 | 0.28 | 63 | 0.38 | 656 | 0.04 |
| 6 | 3 | 1 | 144 | 0.16 | 166 | 0.18 | 11 | 0.43 |
| 7 | 4 | 0.73 | 9 | 0.53 | 105 | 0.27 | 450 | 0.06 |
| 8 | 5 | 0.68 | 6 | 0.59 | 16 | 0.72 | 423 | 0.06 |
| 9 | 2 | 0.59 | 7 | 0.56 | 57 | 0.39 | 32 | 0.29 |
| 10 | 2 | 0.54 | 264 | 0.11 | 119 | 0.24 | 53 | 0.22 |
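| Total | | 33.26 | | 7.39 | | 9.02 | | 7.94 |
Top 10 regions ranked by ssGWAS1/3
| Rank in ssGWAS1/3 | chr | gVar(%) ssGWAS1/3 | Rank in ssGWAS2/3 | gVar(%) ssGWAS2/3 | Rank in CGWAS | gVar(%) CGWAS | Rank in BayesB | gVar(%) BayesB |
| 1 | 27 | 2.53 | 1 | 5.65 | 6 | 0.79 | 1 | 23.06 |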
| 2 | 6 | 1.29 | 62 | 0.38 | 1 | 3.07 | 2 | 2.35 |
| 3 | 6 | 0.91 | 110 | 0.26 | 2 | 2.9 | 3 | 1.89 |
| 4 | 6 | 0.78 | 8 | 0.84 | 3 | 1.3 | 40 | 0.25 |
| 5 | 10 | 0.72 | 54 | 0.41 | 59 | 0.22 | 93 | 0.15 |
| 6 | 5 | 0.59 | 16 | 0.72 | 423 | 0.06 | 8 | 0.68 |
| 7 | 2 | 0.56 | 57 | 0.39 | 32 | 0.29 | 9 | 0.59 |
| 8 | 1 | 0.54 | 21 | 0.67 | 76 | 0.19 | 23 | 0.35 |
| 9 | 4 | 0.53 | 105 | 0.27 | 450 | 0.06 | 7 | 0.73 |
| 10 | 12 | 0.5 | 13 | 0.77 | 357 | 0.07 | 31 | 0.27 |
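| Total | | 8.95 | | 10.36 | | 8.95 | | 30.32 |
Top 10 regions ranked by ssGWAS2/3
| Rank in ssGWAS2/3 | chr | gVar(%) ssGWAS2/3 | Rank in ssGWAS1/3 | gVar(%) ssGWAS1/3 | Rank in CGWAS | gVar(%) CGWAS | Rank in BayesB | gVar(%) BayesB |
| 1 | 27 | 5.65 | 1 | 2.53 | 6 | 0.79 | 1 | 23.06 |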
| 2 | 6 | 2.06 | 16 | 0.43 | 98 | 0.16 | 56 | 0.2 |
| 3 | 2 | 1.23 | 20 | 0.39 | 125 | 0.14 | 29 | 0.31 |
| 4 | 3 | 1.02 | 19 | 0.4 | 26 | 0.32 | 11 | 0.54 |
| 5 | 10 | 0.95 | 365 | 0.08 | 1063 | 0.02 | 77 | 0.17 |
| 6 | 2 | 0.92 | 370 | 0.08 | 573 | 0.05 | 155 | 0.1 |
| 7 | 14 | 0.85 | 82 | 0.21 | 606 | 0.05 | 41 | 0.25 |
| 8 | 6 | 0.84 | 4 | 0.78 | 3 | 1.3 | 40 | 0.25 |
| 9 | 2 | 0.83 | 13 | 0.45 | 123 | 0.14 | 14 | 0.41 |
| 10 | 12 | 0.83 | 152 | 0.15 | 555 | 0.05 | 118 | 0.13 |
| Total | | 15.18 | | 5.50 | | 3.02 | | 25.42 |
The third iteration of both scenarios (ssGWAS1/3 and ssGWAS2/3) in single-step genomic BLUP (ssGBLUP), BayesB, and classical genome wide association studies (CGWAS).
chr = chromosome number.
gVar(%) = percentage of genetic variance explained by each 20-SNP region.
Rank = ranking of each region within the given method.
Total = sum of gVar(%) of 10 regions of each method.
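The gVar(%) quantity defined in these notes can be illustrated with a short sketch. It rests on simplifying assumptions that are not stated in this record: windows are consecutive and non-overlapping, each SNP contributes 2p(1 - p)u² to the genetic variance, and contributions are summed as if SNPs were independent (linkage disequilibrium is ignored). The function name `window_variance_shares` and the toy inputs are hypothetical.

```python
def window_variance_shares(snp_effects, allele_freqs, window=20):
    """Percentage of total SNP variance captured by each consecutive window.

    Assumptions (for illustration only): non-overlapping windows, and a
    per-SNP variance of 2 * p * (1 - p) * u**2 summed as if SNPs were
    independent.
    """
    if len(snp_effects) != len(allele_freqs):
        raise ValueError("snp_effects and allele_freqs must have equal length")
    if window < 1:
        raise ValueError("window must be a positive integer")

    per_snp = [2.0 * p * (1.0 - p) * u * u for u, p in zip(snp_effects, allele_freqs)]
    total = sum(per_snp)
    shares = []
    for start in range(0, len(per_snp), window):
        chunk = sum(per_snp[start:start + window])
        shares.append(100.0 * chunk / total if total > 0.0 else 0.0)
    return shares


# Toy example: six SNPs grouped into windows of two (hypothetical values).
effects = [1.0, 0.0, 0.0, 0.0, 1.0, 1.0]
freqs = [0.5] * 6
shares = window_variance_shares(effects, freqs, window=2)
```

With these toy inputs the three windows hold one third, none, and two thirds of the variance. Summing the ten largest shares gives a figure of the same kind as the Total row of the table.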