Gota Morota, Masanori Koyama, Guilherme J M Rosa, Kent A Weigel, Daniel Gianola.
Abstract
BACKGROUND: Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel.
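To make the idea of a distance-based kernel concrete, here is a minimal sketch of a Gaussian kernel computed from genotype codes. It assumes the common form K(i, j) = exp(-θ d²), where d is the Euclidean distance between two genotype vectors; the abstract does not spell out the formula, and the function name `gaussian_kernel` and the toy genotypes are hypothetical.

```python
import math


def gaussian_kernel(genotypes, theta):
    """Return K with K[i][j] = exp(-theta * squared Euclidean distance).

    Assumed form of the Gaussian kernel; theta is the bandwidth parameter.
    """
    n = len(genotypes)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(genotypes[i], genotypes[j]))
            kernel[i][j] = math.exp(-theta * sq_dist)
    return kernel


# Hypothetical toy data: 3 individuals, 4 SNPs coded 0/1/2.
toy_genotypes = [[0, 1, 2, 1], [0, 1, 2, 1], [2, 1, 0, 0]]
K = gaussian_kernel(toy_genotypes, theta=0.1)
```

Under this form the diagonal is always 1, and a smaller θ pushes the off-diagonal elements toward 1, which matches the pattern of the Gaussian rows in the tables.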
Year: 2013 PMID: 23763755 PMCID: PMC3706293 DOI: 10.1186/1297-9686-45-17
Source DB: PubMed Journal: Genet Sel Evol ISSN: 0999-193X Impact factor: 4.297
Example of diffusion on a graph
| 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |
| 0.1 | 0.8 | 0.1 | 0.2 | 0.6 | 0.2 | 0 | 0.2 | 0.8 |
| 0.17 | 0.66 | 0.17 | 0.28 | 0.44 | 0.28 | 0.04 | 0.28 | 0.68 |
| 0.219 | 0.562 | 0.219 | 0.312 | 0.376 | 0.312 | 0.171 | 0.330 | 0.498 |
| 0.331 | 0.336 | 0.331 | 0.333 | 0.333 | 0.333 | 0.324 | 0.333 | 0.342 |
The genotype codes are (0, 1, 2) and the diffusion rates are (0.1, 0.2). Each entry gives the influence of one genotype on another as it diffuses over time.
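The arithmetic behind the table can be sketched as a discrete diffusion step. This sketch assumes that the three genotypes form a path 0 - 1 - 2 and that at each step every genotype exchanges a fraction (the diffusion rate) of the difference with each neighbor. That update rule is inferred from the numbers, not stated in the source, and the function name `diffuse` is hypothetical.

```python
def diffuse(state, rate, steps):
    """Discrete diffusion on the path graph 0 - 1 - 2 (assumed structure).

    Returns the list of states, starting with the initial one.
    """
    neighbors = {0: [1], 1: [0, 2], 2: [1]}
    history = [list(state)]
    for _ in range(steps):
        current = history[-1]
        nxt = [
            current[v] + rate * sum(current[u] - current[v] for u in neighbors[v])
            for v in range(3)
        ]
        history.append(nxt)
    return history


# Start with all influence on genotype 1 and diffuse at rate 0.1.
rows = diffuse([0.0, 1.0, 0.0], rate=0.1, steps=3)
for row in rows:
    print([round(value, 3) for value in row])
```

With these assumptions the printed rows agree with the first three columns of the first four table rows. Starting from `[0.0, 0.0, 1.0]` with `rate=0.2` gives (0, 0.2, 0.8) and then (0.04, 0.28, 0.68), as in the last three columns.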
Figure 1. A SNP grid graph. A SNP grid graph with 3 genotypes (p = 3). It has 3^3 = 27 vertices.
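The vertex count in the caption can be checked by enumerating the graph. In this sketch each vertex is one configuration of p genotype codes, and two vertices are joined when they differ at a single position by one genotype step (0 to 1, or 1 to 2). The edge rule is an assumption based on the usual definition of a grid graph, and `snp_grid_graph` is a hypothetical name.

```python
from itertools import product


def snp_grid_graph(p, codes=(0, 1, 2)):
    """Build the vertices and edges of a grid graph over p positions.

    Assumed edge rule: configurations are adjacent when exactly one
    position differs, and it differs by one genotype step.
    """
    vertices = list(product(codes, repeat=p))
    edges = set()
    for v in vertices:
        for k in range(p):
            up = v[k] + 1
            if up in codes:
                w = v[:k] + (up,) + v[k + 1:]
                edges.add((v, w))  # stored once, from lower to higher code
    return vertices, edges


vertices, edges = snp_grid_graph(3)
print(len(vertices), len(edges))
```

The number of vertices grows as 3^p, so explicit enumeration like this is only practical for very small p.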
Figure 2. Histograms of lower triangular elements of four diffusion kernels. Histograms of lower triangular elements of four diffusion kernels based on four different bandwidth parameters (θ).
Averages of kernel elements and their predictive correlations for the Holstein data
| Kernel | Bandwidth parameter | Average diagonal | Average off-diagonal | Predictive correlation | MSE |
|---|---|---|---|---|---|
| Diffusion | 10 | 0.369 (0.369) | 0.138 (0.134) | 0.727 (0.726) | 215.93 (216.61) |
| | 11 | 0.693 (0.693) | 0.483 (0.477) | | |
| | 11.5 | 0.801 (0.801) | 0.644 (0.639) | 0.739 (0.732) | 207.93 (212.97) |
| | 12 | 0.874 (0.874) | 0.765 (0.762) | 0.739 (0.728) | 210.54 (215.08) |
| | 13 | 0.952 (0.952) | 0.907 (0.906) | 0.734 (0.725) | 211.50 (217.61) |
| | 14 | 0.982 (0.982) | 0.966 (0.965) | 0.729 (0.723) | 214.29 (218.70) |
| Gaussian | 5×10^-5 | 1 (1) | 0.237 (0.225) | 0.721 (0.702) | 220.675 (233.21) |
| | 2×10^-5 | 1 (1) | 0.551 (0.542) | 0.736 (0.733) | 213.41 (213.95) |
| | 1×10^-5 | 1 (1) | 0.749 (0.742) | | |
| | 5×10^-6 | 1 (1) | 0.866 (0.861) | 0.736 (0.729) | 210.24 (214.47) |
| | 3×10^-6 | 1 (1) | 0.917 (0.914) | 0.734 (0.726) | 211.51 (216.42) |
| | 1×10^-6 | 1 (1) | 0.971 (0.971) | 0.729 (0.724) | 214.37 (217.93) |
| G1∗ | NA | 0.992 (1.009) | -0.000126 (-0.000128) | 0.729 (0.722) | 214.36 (219.27) |
| G2∗ | NA | 0.894 (0.909) | -0.000113 (-0.00012) | 0.730 (0.723) | 213.64 (218.31) |
Averages of diagonal and off-diagonal kernel elements, predictive correlation, and mean-squared error of prediction (MSE) for the diffusion, Gaussian, and two additive genomic relationship kernels (G1∗ and G2∗) with different values of the bandwidth parameter for the Holstein data. Values in parentheses were obtained by combining the SNP grid and the hypercube kernels using the same bandwidth parameter. G1∗ and G2∗ do not involve bandwidth parameters. The best prediction within the same kernel is underlined.
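The two kernel summaries reported in the table can be computed as follows. This is a small sketch assuming the kernel is stored as a square, symmetric list of lists, so that the lower triangle covers every off-diagonal pair once; the name `kernel_averages` and the toy matrix are hypothetical.

```python
def kernel_averages(kernel):
    """Return (mean of diagonal, mean of lower-triangular off-diagonal elements)."""
    n = len(kernel)
    diagonal = [kernel[i][i] for i in range(n)]
    lower = [kernel[i][j] for i in range(n) for j in range(i)]
    return sum(diagonal) / n, sum(lower) / len(lower)


# Hypothetical 3 x 3 symmetric kernel, for illustration only.
toy_kernel = [
    [1.0, 0.5, 0.2],
    [0.5, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
avg_diag, avg_off = kernel_averages(toy_kernel)
```

The same `lower` list is what one would bin to draw a histogram of lower triangular kernel elements.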
Averages of kernel elements and their predictive correlations for the wheat data
| Kernel | Bandwidth parameter | Average diagonal | Average off-diagonal | Predictive correlation | MSE |
|---|---|---|---|---|---|
| Diffusion | 3 | 1 | 0.136 | | 0.685 |
| | 3.25 | 1 | 0.289 | 0.580 | 0.673 |
| | 3.5 | 1 | 0.466 | 0.577 | |
| | 4 | 1 | 0.752 | 0.547 | 0.704 |
| | 5 | 1 | 0.962 | 0.522 | 0.721 |
| Gaussian | 0.005 | 1 | 0.134 | | |
| | 0.003 | 1 | 0.290 | 0.579 | 0.697 |
| | 0.002 | 1 | 0.434 | 0.562 | 0.697 |
| | 0.001 | 1 | 0.655 | 0.558 | 0.703 |
| | 0.0005 | 1 | 0.809 | 0.556 | 0.673 |
| G1 | NA | 2 | -0.003 | 0.518 | 0.709 |
| G2 | NA | 2 | -0.003 | 0.521 | 0.708 |
Average of diagonal and off-diagonal kernel elements, predictive correlation, and mean-squared error of prediction (MSE) for the diffusion, Gaussian, and two additive genomic relationship kernels at different values of the bandwidth parameter for the wheat data. The predictive correlation and the MSE were obtained from a 10-fold cross-validation. Additive genomic relationship kernels (G1 and G2) do not involve bandwidth parameters. The best prediction within the same kernel is underlined.
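The 10-fold cross-validation mentioned in the caption can be sketched as follows. The caption does not say how folds were formed or how the fold results were combined, so the random split and the averaging over folds are assumptions. The simple linear regression used as the predictor is a hypothetical stand-in for the kernel model, and all names and data are illustrative.

```python
import math
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle 0..n-1 and deal the indices into k roughly equal folds."""
    order = list(range(n))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def pearson(a, b):
    """Pearson correlation between two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cross_validate(y, fit_predict, k=10):
    """Average the per-fold predictive correlation and MSE (assumed scheme).

    fit_predict(train_idx, test_idx) is a hypothetical hook that trains on
    train_idx and returns predictions for test_idx.
    """
    correlations, errors = [], []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        predicted = fit_predict(train_idx, test_idx)
        observed = [y[i] for i in test_idx]
        correlations.append(pearson(observed, predicted))
        errors.append(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))
    return sum(correlations) / k, sum(errors) / k


# Toy data (not from the paper): a linear signal plus a small repeating offset.
x = [float(i) for i in range(50)]
y = [2.0 * i + ((i * 7) % 5 - 2) for i in range(50)]


def linear_fit_predict(train_idx, test_idx):
    """Stand-in predictor: least-squares line fitted on the training fold."""
    xs = [x[i] for i in train_idx]
    ys = [y[i] for i in train_idx]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys)) / sum(
        (a - mean_x) ** 2 for a in xs
    )
    return [mean_y + slope * (x[i] - mean_x) for i in test_idx]


mean_corr, mean_mse = cross_validate(y, linear_fit_predict, k=10)
```

Swapping `linear_fit_predict` for a function that fits a kernel model on the training individuals would give the two quantities reported in the last two columns of the table.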