Ross K Shepherd, Theo H E Meuwissen, John A Woolliams.
Abstract
BACKGROUND: The information provided by dense genome-wide markers using high throughput technology is of considerable potential in human disease studies and livestock breeding programs. Genome-wide association studies relate individual single nucleotide polymorphisms (SNP) from dense SNP panels to individual measurements of complex traits, with the underlying assumption being that any association is caused by linkage disequilibrium (LD) between SNP and quantitative trait loci (QTL) affecting the trait. Often SNP are in genomic regions of no trait variation. Whole genome Bayesian models are an effective way of incorporating this and other important prior information into modelling. However, a full Bayesian analysis is often not feasible due to the large computational time involved.
Year: 2010 PMID: 20969788 PMCID: PMC3098088 DOI: 10.1186/1471-2105-11-529
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. 48 QTL effects and 5726 SNP posterior probabilities of being in LD with QTL. Average effect of an allelic substitution in the training data set (▲) plotted against genomic location for each of the 48 QTL. Also the SNP posterior probability (+) of being in LD with at least one QTL plotted against genomic location for each of the 5726 SNP. The QTL effects are in absolute values.
Correlation and regression coefficient of TBV on GEBV for various generations of the validation data.
| Generations | Initial h² | GS-BLUP | ICE and emBayesB | | | | |
|---|---|---|---|---|---|---|---|
| All | 0.1 | 0.75 | 0.87 (0.89) | 0.79 (1.22) | 0.77 (1.49) | 0.87 (0.89) | 0.88 (1.13) |
| All | 0.3 | 0.75 (0.85) | 0.85 (0.86) | 0.79 (0.92) | 0.87 (1.15) | 0.85 (0.86) | 0.88 (1.05) |
| All | 0.5 | 0.71 (0.69) | 0.81 (0.79) | 0.76 (0.78) | 0.87 (1.00) | 0.80 (0.78) | 0.88 (0.99) |
| All | 0.7 | 0.66 (0.55) | 0.74 (0.67) | 0.72 (0.65) | 0.77 (0.75) | 0.74 (0.67) | 0.87 (0.91) |
| All | 0.9 | 0.57 (0.38) | 0.58 (0.43) | 0.55 (0.35) | 0.57 (0.38) | 0.58 (0.43) | 0.87 (0.90) |
| 1 | 0.5 | 0.74 (0.71) | 0.84 (0.82) | 0.78 (0.78) | 0.87 (1.00) | 0.84 (0.82) | 0.88 (0.99) |
| 2 | 0.5 | 0.73 (0.71) | 0.81 (0.81) | 0.78 (0.81) | 0.87 (1.03) | 0.80 (0.80) | 0.88 (1.02) |
| 3 | 0.5 | 0.68 (0.62) | 0.77 (0.71) | 0.74 (0.72) | 0.85 (0.92) | 0.77 (0.70) | 0.86 (0.92) |
Correlation between TBV and GEBV, and the regression coefficient (in brackets) of TBV on GEBV for each generation, and for all three generations combined, of the validation data for GS-BLUP, ICE and emBayesB. Unless indicated otherwise, the initial parameter estimates are γ = 0.01, and where is the total phenotypic variance in the training data. The true heritability was 0.3.
Generations of the validation data used.
Initial heritability assumed.
Parameters in brackets are fixed at the initial values.
Parameters in brackets are all estimated.
All three generations combined.
Correlation between TBV and GEBV.
Regression coefficient (in brackets) of TBV on GEBV.
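The two statistics reported in the table can be computed as in the following minimal sketch. The function name `corr_and_slope` and the toy values are hypothetical and are not taken from the paper; the sketch only illustrates the standard Pearson correlation and the least-squares slope with TBV as the response and GEBV as the predictor.

```python
import math


def corr_and_slope(tbv, gebv):
    """Return (Pearson correlation, slope of the regression of TBV on GEBV)."""
    n = len(tbv)
    mean_t = sum(tbv) / n
    mean_g = sum(gebv) / n
    # Centred sums of squares and cross-products.
    s_tg = sum((t - mean_t) * (g - mean_g) for t, g in zip(tbv, gebv))
    s_tt = sum((t - mean_t) ** 2 for t in tbv)
    s_gg = sum((g - mean_g) ** 2 for g in gebv)
    r = s_tg / math.sqrt(s_tt * s_gg)
    b = s_tg / s_gg  # TBV is the response, GEBV the predictor
    return r, b


# Hypothetical toy data: GEBV ranks animals perfectly but is shrunk by half.
tbv = [1.0, 2.0, 3.0, 4.0]
gebv = [0.5, 1.0, 1.5, 2.0]
r, b = corr_and_slope(tbv, gebv)
print(r, b)
```

A slope close to 1 indicates that the GEBV are on the same scale as the TBV. A slope above 1 means the GEBV are less variable than the TBV they predict, and a slope below 1 means they are more variable.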
Figure 2. Genetic variation explained by each of the 48 QTL and 5726 SNP posterior probabilities. Percentage of the total genetic variance in the training data set explained by each QTL (▲) plotted against genomic location for each of the 48 QTL. Also the SNP posterior probability (+) of being in LD with at least one QTL plotted against genomic location for each of the 5726 SNP.
Figure 3. Graphical illustration of how a posterior probability is calculated for a SNP. Graphs of the mixture prior p(g), conditional likelihood h(g|G, σ) and conditional posterior distribution p(g|gy) as given in equations (A1) and (A2) of Additional file 1 for and n = 500. The Dirac delta function is replaced by a DE with λ = 1000, i.e. a spike at 0. The posterior probability γ of SNP j being in LD with QTL is calculated from equation (A3) by numerical integration. Figures A and B show the distributions when G is 0.19 and 0.11 respectively, where G is the conditional maximum likelihood estimate of g.
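The calculation described in this caption can be sketched numerically as follows. This is an illustration under stated assumptions, not the paper's equations (A1) to (A3), which are not reproduced here: the prior is taken to be a two-component mixture of a double exponential (DE) "slab" with weight γ and a narrow DE spike (λ = 1000) with weight 1 − γ, and the conditional likelihood of G given g is taken to be normal with mean g and standard deviation σ/√n. The values γ = 0.05 and slab rate 10 are hypothetical; only n = 500, σ = 1, the spike rate 1000 and the two values of G come from the captions.

```python
import math


def de_pdf(g, lam):
    """Double exponential (Laplace) density with rate lam, centred at 0."""
    return 0.5 * lam * math.exp(-lam * abs(g))


def normal_pdf(x, mean, sd):
    """Normal density with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def integrate(f, lo, hi, steps):
    """Composite trapezoidal rule on an evenly spaced grid."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


def posterior_prob(G, gamma, lam_slab, lam_spike, sigma, n, steps=100_000):
    """Posterior weight of the slab component given the cML estimate G.

    Assumes G | g ~ Normal(g, sigma^2 / n); this likelihood form is an
    assumption made for illustration.
    """
    sd = sigma / math.sqrt(n)
    # Marginal density of G under each prior component (integrating out g).
    m_slab = integrate(
        lambda g: de_pdf(g, lam_slab) * normal_pdf(G, g, sd), -1.0, 1.0, steps)
    m_spike = integrate(
        lambda g: de_pdf(g, lam_spike) * normal_pdf(G, g, sd), -1.0, 1.0, steps)
    return gamma * m_slab / (gamma * m_slab + (1.0 - gamma) * m_spike)


# gamma and lam_slab are hypothetical; the remaining values follow the captions.
p_a = posterior_prob(0.19, gamma=0.05, lam_slab=10.0, lam_spike=1000.0, sigma=1.0, n=500)
p_b = posterior_prob(0.11, gamma=0.05, lam_slab=10.0, lam_spike=1000.0, sigma=1.0, n=500)
print(p_a, p_b)
```

Under these assumptions the larger estimate (G = 0.19) receives a much higher posterior probability than the smaller one (G = 0.11), because it lies several standard errors from zero and is therefore poorly explained by the spike.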
Figure 4. Shrinkage of the cML estimate using the posterior mean or the MAP estimate. Plots of the analytical formulae for the conditional posterior mean E(g|,y) (Figures 4A and 4C) as given in Appendix 1 of [13] and the MAP estimate of g (Figures 4B and 4D) as given by equation (7) with the posterior probability γ given by equation (A4). All plots have G (the conditional ML estimate of g) on the horizontal axis. G is in σ units as all plots use σ = 1.