Abelardo Montesinos-López1, Osval Antonio Montesinos-López2, José Cricelio Montesinos-López3, Carlos Alberto Flores-Cortes4, Roberto de la Rosa5, José Crossa6,7.
Abstract
The primary objective of this paper is to provide a guide to implementing Bayesian generalized kernel regression methods for genomic prediction in the statistical software R. Such methods are quite efficient for capturing complex non-linear patterns that conventional linear regression models cannot capture. Furthermore, these methods are also powerful for leveraging environmental covariates, for example in genotype × environment (G×E) prediction. In this study we describe the building process of seven kernel methods: linear, polynomial, sigmoid, Gaussian, exponential, arc-cosine 1 (AK1) and arc-cosine L (AKL). Additionally, we provide illustrative examples of implementing exact kernel methods for genomic prediction under single-environment, multi-environment and multi-trait frameworks, as well as of implementing sparse kernel methods under a multi-environment framework. These examples are followed by a discussion of the strengths and limitations of kernel methods and, subsequently, by conclusions about the main contributions of this paper.
Year: 2021 PMID: 33649571 PMCID: PMC8115678 DOI: 10.1038/s41437-021-00412-1
Source DB: PubMed Journal: Heredity (Edinb) ISSN: 0018-067X Impact factor: 3.821
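The seven kernels named in the abstract can all be built directly from a marker matrix. The paper implements them in R; the sketch below shows the same constructions in Python, with illustrative hyperparameter defaults (`gamma`, `degree`, `coef0`, `levels` are assumptions, not the paper's settings).

```python
import numpy as np

def kernels(X, gamma=None, degree=3, coef0=1.0, levels=3):
    """Build the seven kernel matrices from a marker matrix X (lines x markers).
    Hyperparameter defaults are illustrative choices, not the paper's."""
    G = X @ X.T                                  # linear (genomic) kernel
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    # Squared Euclidean distances from the Gram matrix
    sq = np.diag(G)[:, None] + np.diag(G)[None, :] - 2 * G
    sq = np.maximum(sq, 0.0)
    K = {
        "Linear":      G,
        "Polynomial":  (gamma * G + coef0) ** degree,
        "Sigmoid":     np.tanh(gamma * G + coef0),
        "Gaussian":    np.exp(-gamma * sq),
        "Exponential": np.exp(-gamma * np.sqrt(sq)),
    }
    # Arc-cosine kernel of order 1 (AK1), Cho & Saul (2009):
    # k(x, z) = (1/pi) ||x|| ||z|| (sin t + (pi - t) cos t), t = angle(x, z)
    norms = np.sqrt(np.diag(G))
    nc = np.outer(norms, norms)
    cos_t = np.clip(G / np.maximum(nc, 1e-12), -1.0, 1.0)
    t = np.arccos(cos_t)
    AK = (1.0 / np.pi) * nc * (np.sin(t) + (np.pi - t) * np.cos(t))
    K["AK1"] = AK
    # AKL: apply the same angular map recursively `levels` times in total
    for _ in range(levels - 1):
        d = np.sqrt(np.diag(AK))
        nc = np.outer(d, d)
        cos_t = np.clip(AK / np.maximum(nc, 1e-12), -1.0, 1.0)
        t = np.arccos(cos_t)
        AK = (1.0 / np.pi) * nc * (np.sin(t) + (np.pi - t) * np.cos(t))
    K["AKL"] = AK
    return K
```

Any of the resulting matrices can be plugged into a kernel regression model in place of the usual genomic relationship matrix.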
Single-environment prediction performance of seven kernels under ten-fold cross-validation with the data set Data_Wheat_2019.RData.
| Kernel | MSE | SE_MSE | Cor | SE_Cor | Time |
|---|---|---|---|---|---|
| Linear | 5.279 | 0.497 | 0.642 | 0.048 | 10.140 |
| Polynomial | 4.974 | 0.587 | 0.658 | 0.054 | 10.510 |
| Gaussian | 4.881 | 0.540 | 0.668 | 0.047 | 12.440 |
| AK1 | 5.149 | 0.522 | 0.651 | 0.047 | 10.600 |
| | | | | | 9.970 |
| Exponential | 4.980 | 0.521 | 0.663 | 0.048 | 9.780 |
MSE denotes the mean square error of prediction (SE_MSE is the standard error of MSE) and Cor is the average Pearson’s correlation (SE_Cor is the standard error of Cor). Time indicates the implementation time in seconds.
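The MSE, SE_MSE, Cor and SE_Cor columns in these tables follow standard ten-fold cross-validation bookkeeping. The paper fits Bayesian kernel models in R; as a minimal stand-in, the sketch below uses kernel ridge regression with a fixed regularizer `lam` (an assumption — a Bayesian fit would instead estimate variance components by MCMC) to show how the fold-level statistics are computed from any precomputed kernel matrix.

```python
import numpy as np

def cv10_kernel(K, y, lam=1.0, seed=1):
    """Ten-fold cross-validation for kernel regression on a precomputed
    n x n kernel K. Kernel ridge with fixed `lam` stands in for the
    Bayesian kernel models of the paper; the fold logic is the same.
    Returns mean MSE and Pearson correlation with their standard errors."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, 10)
    mse, cor = [], []
    for test in folds:
        train = np.setdiff1d(idx, test)
        Ktt = K[np.ix_(train, train)]
        # Closed-form kernel ridge solution on the training fold
        alpha = np.linalg.solve(Ktt + lam * np.eye(len(train)), y[train])
        yhat = K[np.ix_(test, train)] @ alpha
        mse.append(np.mean((y[test] - yhat) ** 2))
        cor.append(np.corrcoef(y[test], yhat)[0, 1])
    mse, cor = np.array(mse), np.array(cor)
    return {"MSE": mse.mean(), "SE_MSE": mse.std(ddof=1) / np.sqrt(10),
            "Cor": cor.mean(), "SE_Cor": cor.std(ddof=1) / np.sqrt(10)}
```

The same loop works unchanged for any of the seven kernels, since only the matrix `K` differs between rows of the tables.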
Multi-environment prediction performance of seven kernels under a ten-fold cross-validation with data Data_Wheat_2019.RData.
| Kernel | MSE | SE_MSE | Cor | SE_Cor | Time |
|---|---|---|---|---|---|
| Linear | 2.916 | 0.240 | 0.988 | 0.001 | 33.560 |
| Polynomial | 2.753 | 0.235 | 0.989 | 0.001 | 36.400 |
| Sigmoid | 3.148 | 0.262 | 0.987 | 0.001 | 35.090 |
| Gaussian | 2.723 | 0.219 | 0.989 | 0.001 | 40.450 |
| AK1 | 2.819 | 0.230 | 0.988 | 0.001 | 40.730 |
| AKL | 2.670 | 0.225 | 0.989 | 0.001 | 76.290 |
| Exponential | 2.919 | 0.232 | 0.988 | 0.001 | 31.390 |
MSE denotes the mean square error of prediction (SE_MSE is the standard error of MSE) and Cor is the average Pearson’s correlation (SE_Cor is the standard error of Cor). Time indicates the implementation time in seconds.
Multi-trait prediction performance of seven kernels under a ten-fold cross-validation with data Data_Wheat_2019.RData.
| Kernel | Trait | MSE | SE_MSE | Cor | SE_Cor |
|---|---|---|---|---|---|
| Linear | DTHD | 2.686 | 0.355 | 0.848 | 0.027 |
| Linear | GRYLD | 0.317 | 0.038 | 0.541 | 0.064 |
| Polynomial | DTHD | 2.626 | 0.380 | 0.850 | 0.029 |
| Polynomial | GRYLD | 0.314 | 0.039 | 0.544 | 0.062 |
| Sigmoid | DTHD | 2.877 | 0.376 | 0.837 | 0.029 |
| Sigmoid | GRYLD | 0.326 | 0.038 | 0.526 | 0.065 |
| Gaussian | DTHD | 2.533 | 0.349 | 0.855 | 0.028 |
| AK1 | DTHD | 2.571 | 0.345 | 0.854 | 0.027 |
| AKL | GRYLD | 0.315 | 0.036 | 0.539 | 0.060 |
| Exponential | DTHD | 2.551 | 0.342 | 0.852 | 0.028 |
| Exponential | GRYLD | 0.345 | 0.041 | 0.480 | 0.068 |
MSE denotes the mean square error of prediction and SE_MSE its standard error; Cor is the average Pearson’s correlation and SE_Cor its standard error.
Multi-environment prediction performance of seven kernels with an ordinal response variable under ten-fold cross-validation with the data set Data_Wheat_2019.RData.
| Kernel | PCCC | SE_PCCC | Kappa | SE_Kappa | Time |
|---|---|---|---|---|---|
| Linear | 0.671 | 0.025 | 0.454 | 0.044 | 62.72 |
| Polynomial | 0.680 | 0.026 | 0.468 | 0.049 | 69.98 |
| Sigmoid | 0.664 | 0.025 | 0.443 | 0.045 | 71.25 |
| AK1 | 0.664 | 0.027 | 0.441 | 0.050 | 79.83 |
| AKL | 0.673 | 0.025 | 0.450 | 0.048 | 65.60 |
| Exponential | 0.596 | 0.035 | 0.281 | 0.075 | 72.76 |
PCCC denotes the proportion of cases correctly classified (SE_PCCC is the standard error of PCCC) and Kappa is the average kappa coefficient (SE_Kappa is the standard error of Kappa). Time indicates the implementation time in seconds.
Prediction performance in terms of mean square error (MSE) and its standard error (SE_MSE), as well as Pearson’s correlation (Cor) and its standard error (SE_Cor), under seven approximate (sparse) kernel methods implemented with the method proposed by Cuevas et al. (2020).
| m | Kernel | MSE | SE_MSE | Cor | SE_Cor | Time |
|---|---|---|---|---|---|---|
| 10 | Linear | 0.348 | 0.045 | 0.908 | 0.026 | 13.79 |
| 10 | Polynomial | 0.349 | 0.044 | 0.911 | 0.024 | 15.98 |
| 10 | ||||||
| 10 | ||||||
| 10 | AK1 | 0.344 | 0.028 | 0.915 | 0.021 | 16.53 |
| 10 | AKL | 0.314 | 0.035 | 0.921 | 0.020 | 15.2 |
| 10 | Exponential | 0.336 | 0.027 | 0.918 | 0.018 | 15.56 |
| 15 | Linear | 0.318 | 0.036 | 0.917 | 0.022 | 17.72 |
| 15 | Polynomial | 0.344 | 0.031 | 0.909 | 0.021 | 18.92 |
| 15 | Sigmoid | 0.338 | 0.035 | 0.913 | 0.023 | 19.13 |
| 15 | Gaussian | 0.308 | 0.035 | 0.922 | 0.021 | 20.44 |
| 15 | AK1 | 0.332 | 0.034 | 0.915 | 0.022 | 20.14 |
| 15 | AKL | 0.336 | 0.023 | 0.919 | 0.018 | 21.16 |
| 15 | Exponential | 0.353 | 0.033 | 0.912 | 0.021 | 19.58 |
| 20 | Linear | 0.316 | 0.034 | 0.918 | 0.021 | 18.31 |
| 20 | Polynomial | 0.346 | 0.036 | 0.910 | 0.025 | 18.33 |
| 20 | ||||||
| 20 | ||||||
| 20 | AK1 | 0.345 | 0.025 | 0.916 | 0.020 | 15.35 |
| 20 | AKL | 0.323 | 0.028 | 0.918 | 0.021 | 15.25 |
| 20 | Exponential | 0.346 | 0.035 | 0.914 | 0.023 | 16.64 |
| 35 | Linear | 0.303 | 0.026 | 0.924 | 0.018 | 19.39 |
| 35 | Polynomial | 0.316 | 0.027 | 0.921 | 0.020 | 20.35 |
| 35 | Sigmoid | 0.292 | 0.025 | 0.926 | 0.019 | 22.24 |
| 35 | Gaussian | 0.327 | 0.027 | 0.920 | 0.020 | 22.91 |
| 35 | AK1 | 0.333 | 0.025 | 0.918 | 0.020 | 23.5 |
| 35 | AKL | 0.324 | 0.028 | 0.921 | 0.019 | 21.61 |
| 35 | Exponential | 0.330 | 0.024 | 0.920 | 0.020 | 22.89 |
| 40 | Linear | 0.307 | 0.026 | 0.923 | 0.019 | 24.03 |
| 40 | Polynomial | 0.326 | 0.026 | 0.919 | 0.020 | 24.14 |
| 40 | Sigmoid | 0.306 | 0.026 | 0.923 | 0.019 | 22.25 |
| 40 | Gaussian | 0.322 | 0.025 | 0.921 | 0.019 | 21.29 |
| 40 | AK1 | 0.334 | 0.024 | 0.919 | 0.019 | 25.25 |
| 40 | AKL | 0.333 | 0.023 | 0.920 | 0.019 | 23.69 |
| 40 | Exponential | 0.335 | 0.023 | 0.919 | 0.019 | 20.63 |
Ten-fold cross-validation was implemented and prediction performance is reported only for the testing set. Five values of m, the number of lines used to build the approximation, were used: 10, 15, 20, 35 and 40 (all data). Time indicates the implementation time in seconds. The data set used for this example was Data_Toy_EYT with trait GY as the response variable.
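The sparse approximation behind the table above compresses the full n × n kernel through a subset of m "anchor" lines. The function below is an illustrative Nyström-type sketch of that general idea, not the exact algorithm of Cuevas et al. (2020); the random anchor choice and the `jitter` stabilizer are assumptions.

```python
import numpy as np

def sparse_kernel(K, m, seed=0, jitter=1e-8):
    """Nystrom-type sparse approximation of an n x n kernel matrix K
    built from m randomly chosen anchor lines:
        K_approx = K[:, S] @ inv(K[S, S]) @ K[S, :].
    Downstream linear algebra then costs O(n m^2) instead of O(n^3).
    `jitter` stabilizes the inversion of the anchor block."""
    n = K.shape[0]
    S = np.random.default_rng(seed).choice(n, size=m, replace=False)
    C = K[:, S]                                 # n x m cross-kernel
    W = K[np.ix_(S, S)] + jitter * np.eye(m)    # m x m anchor block
    return C @ np.linalg.solve(W, C.T)
```

As m approaches n the approximation becomes exact, which matches the table: m = 40 uses all lines, and the smaller m values trade a little accuracy for cheaper computation.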
| | M1 | M2 | M3 | M4 | M5 |
|---|---|---|---|---|---|
| GID304660 | −0.997 | −0.997 | −0.997 | −0.997 | −0.997 |
| GID6175067 | 0.427 | 0.427 | 0.427 | 0.427 | 0.427 |
| GID6332122 | −0.421 | −2.421 | −2.421 | −2.421 | −2.421 |
| GID6341870 | 0.427 | 0.427 | 0.427 | 0.427 | 0.427 |
| GID6931427 | 0.427 | 0.427 | 0.427 | 0.427 | 0.427 |
| GID7460318 | 0.427 | 0.427 | 0.427 | 0.427 | 0.427 |
| | GID304660 | GID6175067 | GID6332122 | GID6341870 | GID6931427 | GID7460318 |
|---|---|---|---|---|---|---|
| GID304660 | 0.994 | −0.426 | 2.414 | −0.426 | −0.426 | −0.426 |
| GID6175067 | −0.426 | 0.182 | −1.034 | 0.182 | 0.182 | 0.182 |
| GID6332122 | 2.414 | −1.034 | 5.861 | −1.034 | −1.034 | −1.034 |
| GID6341870 | −0.426 | 0.182 | −1.034 | 0.182 | 0.182 | 0.182 |
| GID6931427 | −0.426 | 0.182 | −1.034 | 0.182 | 0.182 | 0.182 |
| GID7460318 | −0.426 | 0.182 | −1.034 | 0.182 | 0.182 | 0.182 |
| | [,1] | [,2] | [,3] |
|---|---|---|---|
| [1,] | −0.597 | −0.780 | −0.186 |
| [2,] | −0.671 | 0.495 | −0.552 |
| [3,] | −0.129 | −0.004 | 0.009 |
| [4,] | −0.261 | −0.016 | 0.076 |
| [5,] | −0.295 | −0.047 | −0.018 |
| [6,] | −0.133 | 0.023 | 0.001 |
| [7,] | −0.206 | −0.054 | 0.031 |
| [8,] | −0.215 | −0.056 | −0.001 |
| [9,] | −0.710 | 0.188 | 0.678 |
| [10,] | −0.134 | −0.028 | −0.004 |