David Habier, Rohan L Fernando, Kadir Kizilkaya, Dorian J Garrick.
Abstract
BACKGROUND: Two Bayesian methods, BayesCπ and BayesDπ, were developed for genomic prediction to address the drawback of BayesA and BayesB regarding the impact of prior hyperparameters and to treat the prior probability π that a SNP has zero effect as unknown. The methods were compared in terms of inference of the number of QTL and accuracy of genomic estimated breeding values (GEBVs), using simulated scenarios and real data from North American Holstein bulls.
Year: 2011 PMID: 21605355 PMCID: PMC3144464 DOI: 10.1186/1471-2105-12-186
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
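The abstract describes π as the prior probability that a SNP has zero effect. A minimal sketch of drawing SNP effects from such a mixture prior is given here. The normal distribution for non-zero effects, the fixed `effect_sd`, the value π = 0.95 and the function name are illustrative assumptions, not details taken from this record.

```python
import random


def sample_snp_effects(k, pi, effect_sd=1.0, seed=0):
    """Draw k SNP effects: zero with probability pi, otherwise N(0, effect_sd^2).

    The normal "slab" with a fixed standard deviation is an assumption made
    for illustration only.
    """
    rng = random.Random(seed)
    effects = []
    for _ in range(k):
        if rng.random() < pi:
            effects.append(0.0)  # SNP has zero effect
        else:
            effects.append(rng.gauss(0.0, effect_sd))  # SNP has a non-zero effect
    return effects


effects = sample_snp_effects(k=2000, pi=0.95)
n_nonzero = sum(1 for e in effects if e != 0.0)
print(n_nonzero, "of", len(effects), "SNPs have non-zero effects")
```

With K = 2,000 and π = 0.95, about (1-π)K = 100 SNPs are expected to have non-zero effects in any one draw.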
Model configurations in which SNP 1 has non-zero effect for an example using three SNPs in the analysis
| SNP effect | Model 1 | Model 2 | Model 3 | Model 4 |
|---|---|---|---|---|
| SNP 1 | ≠0 | ≠0 | ≠0 | ≠0 |
| SNP 2 | ≠0 | 0 | ≠0 | 0 |
| SNP 3 | ≠0 | ≠0 | 0 | 0 |
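The four configurations in the table can be reproduced by enumerating every inclusion pattern for three SNPs and keeping those in which SNP 1 is included. The sketch below does this. The function name is hypothetical, and the enumeration order need not match the model numbering in the table.

```python
from itertools import product


def configurations_with_snp_included(n_snps, fixed_snp=0):
    """Return all inclusion patterns in which SNP `fixed_snp` has a non-zero effect.

    Each pattern is a tuple of booleans, one per SNP (True = non-zero effect).
    """
    configs = []
    for pattern in product([True, False], repeat=n_snps):
        if pattern[fixed_snp]:
            configs.append(pattern)
    return configs


models = configurations_with_snp_included(3)
for number, pattern in enumerate(models, start=1):
    labels = ["nonzero" if included else "zero" for included in pattern]
    print(f"Configuration {number}: {labels}")
```

With SNP 1 fixed in the model, the remaining two SNPs can each be in or out, giving 2 × 2 = 4 configurations.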
Figure 1. Cumulative distribution functions, F(x), of the distributions used to sample QTL effects: Gamma with shape 0.4 and scale 1.66 and absolute standard normal.
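The two distributions named in the caption can be compared numerically by sampling from each and evaluating the empirical distribution function. The sketch below uses only the Python standard library. It interprets "absolute standard normal" as |Z| with Z ~ N(0, 1); the sample size, seed and evaluation points are arbitrary choices.

```python
import random


def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    return sum(1 for s in samples if s <= x) / len(samples)


rng = random.Random(1)
n_samples = 50_000

# random.gammavariate(alpha, beta) takes the shape (alpha) and scale (beta).
gamma_effects = [rng.gammavariate(0.4, 1.66) for _ in range(n_samples)]
# Absolute value of a standard normal draw.
abs_normal_effects = [abs(rng.gauss(0.0, 1.0)) for _ in range(n_samples)]

for x in (0.1, 0.5, 1.0, 2.0):
    f_gamma = empirical_cdf(gamma_effects, x)
    f_normal = empirical_cdf(abs_normal_effects, x)
    print(f"x = {x}: F_gamma = {f_gamma:.3f}, F_abs_normal = {f_normal:.3f}")
```

The Gamma distribution with shape 0.4 puts much more mass near zero than the absolute normal, so under it most sampled QTL effects are small and a few are large.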
Posterior mean of (1-π) multiplied by K = 2,000 loci used in the analyses (se) according to the Bayesian method, number of QTL, distribution of QTL effects and training data size.
| Method | No. of QTL | Gamma, 1,000 | Gamma, 4,000 | Normal, 1,000 | Normal, 4,000 |
|---|---|---|---|---|---|
| BayesCπ | 10 | 7 (1) | 7 (0.8) | 13 (0.9) | 12 (0.8) |
| | 200 | 69 (5) | 86 (3) | 236 (13) | 204 (3) |
| | 1,000 | 312 (40) | 315 (8) | 1,230 (91) | 1,007 (19) |
| BayesDπ | 10 | 165 (11) | 59 (3) | 229 (9) | 81 (4) |
| | 200 | 645 (22) | 343 (7) | 952 (24) | 564 (6) |
| | 1,000 | 984 (39) | 747 (10) | 1,169 (36) | 1,227 (12) |

Column headings give the QTL effect distribution and the training data size. The starting value for π was 0.5. Results are based on 24 replicates of the ideal simulation in which all loci were in linkage equilibrium and both SNPs and QTL were modeled.
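The tabulated quantity is the posterior mean of (1-π) multiplied by K, so each entry can be converted back to an implied π or compared with the simulated number of QTL. The sketch below does both for the BayesCπ entries with normally distributed QTL effects and 4,000 training records (12, 204 and 1,007 for 10, 200 and 1,000 QTL). The function names are hypothetical.

```python
K_LOCI = 2000  # number of loci used in the analyses, as stated in the caption


def implied_pi(n_loci_in_model, k=K_LOCI):
    """Convert a value of (1 - pi) * K back to pi."""
    return 1.0 - n_loci_in_model / k


def loci_in_model(pi, k=K_LOCI):
    """Convert pi to the corresponding value of (1 - pi) * K."""
    return (1.0 - pi) * k


# (simulated number of QTL, tabulated posterior mean of (1 - pi) * K)
bayes_c_pi_normal_4000 = [(10, 12), (200, 204), (1000, 1007)]

for n_qtl, estimate in bayes_c_pi_normal_4000:
    ratio = estimate / n_qtl
    print(f"QTL = {n_qtl}: estimate = {estimate}, "
          f"implied pi = {implied_pi(estimate):.4f}, estimate / QTL = {ratio:.3f}")
```

A ratio close to 1 means the posterior mean of (1-π)K is close to the simulated number of QTL for that scenario.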
Posterior mean of (1-π) multiplied by K = 2,000 SNPs used in the analyses (se) obtained by BayesDπ according to the number of QTL, distribution of QTL effects and training data size.
| No. of QTL | Gamma, 1,000 | Gamma, 4,000 | Normal, 1,000 | Normal, 4,000 |
|---|---|---|---|---|
| 10 | 243 (14) | 253 (14) | 375 (23) | 395 (21) |
| 20 | 278 (24) | 293 (25) | 546 (31) | 538 (29) |
| 40 | 461 (30) | 465 (26) | 779 (31) | 771 (19) |

Column headings give the QTL effect distribution and the training data size. Results are based on 24 replicates of the realistic simulation in which heritability was 0.9, loci were in linkage disequilibrium, and only SNPs were modeled.
Posterior mean of (1-π) multiplied by K = 2,000 SNPs used in the analyses (se) obtained by BayesCπ according to the heritability (h²), number of QTL, distribution of QTL effects and training data size.
| h² | No. of QTL | Gamma, 1,000 | Gamma, 4,000 | Normal, 1,000 | Normal, 4,000 |
|---|---|---|---|---|---|
| 0.9 | 10 | 52 (5) | 99 (9) | 73 (5) | 147 (7) |
| | 20 | 65 (6) | 127 (11) | 112 (7) | 210 (10) |
| | 40 | 115 (11) | 198 (13) | 202 (19) | 343 (17) |
| 0.2 | 10 | 421 (137) | 37 (5) | 532 (115) | 54 (5) |
| | 20 | 654 (140) | 62 (8) | 917 (131) | 133 (35) |
| | 40 | 1006 (97) | 174 (57) | 1178 (42) | 434 (109) |
| 0.03 | 10 | 1083 (80) | 933 (130) | 1045 (59) | 1081 (108) |
| | 20 | 1162 (69) | 1103 (58) | 1035 (50) | 1099 (62) |
| | 40 | 1043 (83) | 1206 (42) | 1149 (54) | 1331 (39) |

Column headings give the QTL effect distribution and the training data size. Starting value for π was 0.5. Results are based on 24 replicates of the realistic simulation in which loci were in linkage disequilibrium and only SNPs were modeled.
GEBV accuracy of 113 Holstein Friesian bulls born between 1953 and 1994 according to the Bayesian method, quantitative trait and number of Holstein Friesian bulls born between 1995 and 2004 used for training.
| Trait | Training data size | P-BLUP | G-BLUP | BayesA | BayesB (π = 0.99) | BayesCπ | BayesDπ |
|---|---|---|---|---|---|---|---|
| Milk yield | 1,000 | 0.15 | 0.38 | 0.39 | 0.22 | 0.35 | 0.38 |
| | 4,000 | 0.24 | 0.46 | 0.46 | 0.41 | 0.43 | 0.46 |
| | 6,500 | 0.10 | 0.48 | 0.48 | 0.40 | 0.43 | 0.47 |
| Fat yield | 1,000 | -0.05 | 0.41 | 0.48 | 0.51 | 0.48 | 0.47 |
| | 4,000 | 0.04 | 0.49 | 0.54 | 0.55 | 0.58 | 0.56 |
| | 6,500 | -0.15 | 0.51 | 0.56 | 0.52 | 0.54 | 0.57 |
| Protein yield | 1,000 | 0.02 | 0.13 | 0.14 | 0.05 | 0.14 | 0.13 |
| | 4,000 | 0.03 | 0.17 | 0.17 | 0.13 | 0.17 | 0.16 |
| | 6,500 | -0.02 | 0.21 | 0.22 | 0.17 | 0.21 | 0.20 |
| Somatic cell score | 1,000 | 0.03 | 0.04 | 0.06 | 0.06 | 0.06 | 0.05 |
| | 4,000 | -0.11 | 0.14 | 0.18 | 0.12 | 0.15 | 0.16 |
| | 6,500 | -0.04 | 0.17 | 0.17 | 0.12 | 0.14 | 0.14 |
Starting value of π in BayesCπ was 0.5.
Posterior mean and standard deviation (SD) of (1-π), multiplied by K = 40,764 SNPs used in the analyses, obtained by BayesCπ (starting value of π was 0.5) and BayesDπ, and average number of SNPs fitted by BayesB with π = 0.99, with standard error (se), according to the quantitative trait and the number of Holstein Friesian bulls used for training.
| Trait | Training data size | BayesB (π = 0.99), average no. of SNPs (se) | BayesCπ, mean | BayesCπ, SD | BayesDπ, mean | BayesDπ, SD |
|---|---|---|---|---|---|---|
| Milk yield | 1,000 | 402 (1.5) | 2,119 | 545 | 13,982 | 1,793 |
| | 4,000 | 436 (1.6) | 2,315 | 398 | 13,329 | 896 |
| | 6,500 | 518 (1.6) | 2,555 | 326 | 14,768 | 750 |
| Fat yield | 1,000 | 401 (1.1) | 562 | 201 | 13,533 | 1,752 |
| | 4,000 | 441 (1.3) | 1,488 | 210 | 13,513 | 895 |
| | 6,500 | 504 (1.3) | 2,058 | 229 | 13,703 | 631 |
| Protein yield | 1,000 | 403 (0.9) | 10,986 | 3,970 | 14,430 | 2,201 |
| | 4,000 | 438 (1.1) | 9,500 | 1,756 | 13,512 | 774 |
| | 6,500 | 514 (1.1) | 5,503 | 970 | 14,496 | 694 |
| Somatic cell score | 1,000 | 398 (1.2) | 5,644 | 3,105 | 12,962 | 1,948 |
| | 4,000 | 428 (1.3) | 3,624 | 1,043 | 13,941 | 954 |
| | 6,500 | 466 (1.3) | 2,723 | 508 | 13,464 | 741 |
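As a point of reference for the BayesB figures, with π fixed at 0.99 and K = 40,764 SNPs the number of SNPs expected in the model a priori is (1-π)K. The sketch below computes that value and sets it beside the reported BayesB averages for milk yield (402, 436 and 518). The comparison is offered only as a reading aid and is not an analysis from the source.

```python
K_SNPS = 40_764   # SNPs used in the analyses, from the caption
PI_BAYESB = 0.99  # fixed value of pi used for BayesB, from the caption

# Expected number of SNPs with non-zero effect under the prior alone.
prior_expected = (1.0 - PI_BAYESB) * K_SNPS

# Reported average number of SNPs fitted by BayesB for milk yield,
# keyed by training data size.
reported_milk_yield = {1000: 402, 4000: 436, 6500: 518}

print(f"Prior expectation: {prior_expected:.2f} SNPs")
for training_size, fitted in reported_milk_yield.items():
    difference = fitted - prior_expected
    print(f"Training size {training_size}: fitted = {fitted}, "
          f"difference from prior expectation = {difference:+.2f}")
```

For the smallest training set the reported average sits close to the prior expectation of about 408 SNPs, and it rises above that value as the training data size increases.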
Figure 2. Histogram of model frequencies > 0.1 obtained by BayesB with .