Abelardo Montesinos-López, Osval A Montesinos-López, José Crossa, Juan Burgueño, Kent M Eskridge, Esteban Falconi-Castillo, Xinyao He, Pawan Singh, Karen Cichy.
Abstract
Genomic tools allow the study of the whole genome and facilitate the study of genotype-environment combinations and their relationship with phenotype. However, most genomic prediction models developed so far are appropriate for Gaussian phenotypes. Appropriate genomic prediction models are therefore needed for count data: the conventional regression models for counts assume a large sample size (n) and a small number of parameters (p), so they cannot be used for genomic-enabled prediction, where the number of parameters (p) is larger than the sample size (n). Here, we propose a Bayesian mixed-negative binomial (BMNB) genomic regression model for counts that takes into account genotype by environment (G × E) interaction. We also provide all the full conditional distributions needed to implement a Gibbs sampler. We evaluated the proposed model using a simulated data set and a real wheat data set from the International Maize and Wheat Improvement Center (CIMMYT) and collaborators. Results indicate that our BMNB model provides a viable option for analyzing count data.
Keywords: Bayesian model; GenPred; Gibbs sampler; count data; genome enabled prediction; genomic selection; shared data resource
Year: 2016 PMID: 26921298 PMCID: PMC4856070 DOI: 10.1534/g3.116.028118
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Plot of the mean count vs. the variance of Model NB as a function of the scale parameter. Good approximations are obtained when the mean and variance are very similar; in the plot, they should follow the diagonal along which the mean equals the variance.
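To make the relationship in Figure 1 concrete, the following sketch tabulates the variance of a negative binomial count for increasing values of the scale parameter. It assumes the common mean-scale parameterization in which the variance is mu + mu²/r; the symbols `mu` and `r`, the function name, and the mean count of 5 are illustrative choices, not notation or values taken from the figure.

```python
def nb_variance(mu: float, r: float) -> float:
    """Variance of a negative binomial count with mean mu and scale r.

    Assumes the parameterization Var(y) = mu + mu**2 / r, so the variance
    approaches the mean (the Poisson case) as r grows.
    """
    if mu < 0 or r <= 0:
        raise ValueError("mu must be non-negative and r must be positive")
    return mu + mu * mu / r


mu = 5.0  # hypothetical mean count
for r in (1.0, 10.0, 100.0, 1000.0):
    # The gap between variance and mean shrinks as the scale parameter grows.
    print(f"r={r:>7.1f}  mean={mu:.2f}  variance={nb_variance(mu, r):.3f}")
```

Under this assumption, a large scale parameter places the point close to the diagonal, which is the situation the caption describes as a good approximation.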
Table 1. Posterior mean and posterior SD of the Bayesian method with four sample sizes (n) for Model NB
| Scenario | Parameter | True | Mean | SD | Mean | SD | Mean | SD | Mean | SD |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | | 1.5 | 1.48 | 0.36 | 1.49 | 0.27 | 1.54 | 0.23 | 1.55 | 0.21 |
| | | −1 | −0.98 | 0.26 | −0.99 | 0.25 | −1.08 | 0.25 | −1.02 | 0.19 |
| | | 1 | 1.00 | 0.27 | 0.99 | 0.22 | 0.99 | 0.27 | 0.95 | 0.22 |
| | | 5 | 5.08 | 0.92 | 5.08 | 0.52 | 5.02 | 0.47 | 5.03 | 0.33 |
| | | 0.5 | 0.54 | 0.20 | 0.59 | 0.18 | 0.58 | 0.18 | 0.59 | 0.22 |
| | | 0.5 | 0.50 | 0.13 | 0.52 | 0.14 | 0.53 | 0.11 | 0.51 | 0.11 |
| 2 | | 1.5 | 1.48 | 0.50 | 1.46 | 0.50 | 1.56 | 0.61 | 1.47 | 0.50 |
| | | −1 | −1.06 | 0.23 | −1.00 | 0.20 | −1.01 | 0.22 | −1.03 | 0.19 |
| | | 1 | 0.95 | 0.24 | 1.03 | 0.22 | 0.99 | 0.20 | 0.97 | 0.20 |
| | | 5 | 5.10 | 0.81 | 4.99 | 0.59 | 5.04 | 0.35 | 5.03 | 0.20 |
| | | 0.5 | 0.54 | 0.18 | 0.57 | 0.22 | 0.58 | 0.19 | 0.53 | 0.18 |
| | | 0.5 | 0.50 | 0.12 | 0.51 | 0.14 | 0.53 | 0.13 | 0.51 | 0.10 |
Table 2. Scenarios proposed to fit the real data set with Models NB, Pois, Normal and LN
| Scenario | E (main) | L (main) | G (main) | R(E) (nested) | EL (interaction) | EG (interaction) |
|---|---|---|---|---|---|---|
| S1 | X | X | X | |||
| S2 | X | X | X | |||
| S3 | X | X | X | X | ||
| S4 | X | X | X | X | ||
E, environment; R, blocks; L, lines; G, lines taking into account markers; EL and EG, interaction effects of E and L, and of E and G; R(E), blocks nested in the environment.
Table 3. Estimated beta coefficients, variance components, and posterior predictive checks for the four scenarios (S1, S2, S3, S4) for each proposed model
| Parameter | S1 Mean | S1 SD | S2 Mean | S2 SD | S3 Mean | S3 SD | S4 Mean | S4 SD |
|---|---|---|---|---|---|---|---|---|
| | −0.93 | 0.60 | −1.05 | 0.61 | −2.52 | 0.71 | −2.38 | 0.99 |
| | −0.83 | 0.71 | −1.16 | 0.66 | −2.27 | 0.58 | −2.73 | 1.00 |
| | −0.03 | 0.48 | −0.15 | 0.56 | −1.69 | 0.85 | −1.96 | 0.78 |
| | −0.09 | 0.52 | −0.06 | 0.65 | −0.02 | 0.54 | −0.25 | 0.67 |
| | 0.05 | 0.51 | 0.08 | 0.60 | 0.10 | 0.53 | −0.13 | 0.66 |
| | −0.20 | 0.62 | 0.05 | 0.70 | −0.27 | 0.47 | 0.09 | 0.67 |
| | −0.05 | 0.61 | 0.20 | 0.66 | −0.15 | 0.46 | 0.21 | 0.65 |
| | 0.07 | 0.42 | 0.11 | 0.61 | 0.11 | 0.61 | 0.32 | 0.50 |
| | −0.14 | 0.41 | −0.10 | 0.59 | −0.10 | 0.60 | 0.11 | 0.48 |
| | 0.43 | 0.05 | 1.37 | 0.17 | 0.34 | 0.05 | 1.03 | 0.15 |
| | – | – | – | – | 0.38 | 0.03 | 1.04 | 0.10 |
| | 2.80 | 0.12 | 2.81 | 0.12 | 11.87 | 1.12 | 11.55 | 1.17 |
| Loglik | −1526.65 | | −1526.88 | | −1268.83 | | −1275.25 | |
| Cor | 0.69 | | 0.69 | | 0.90 | | 0.89 | |
| MSEP | 2.13 | | 2.12 | | 0.75 | | 0.77 | |
| | −7.14 | 0.22 | −7.21 | 0.39 | −6.69 | 0.11 | −6.80 | 0.33 |
| | −7.08 | 0.13 | −7.17 | 0.11 | −7.07 | 0.16 | −7.27 | 0.19 |
| | −5.97 | 0.43 | −6.46 | 0.29 | −5.88 | 0.16 | −6.66 | 0.28 |
| | 0.12 | 0.17 | 0.07 | 0.29 | −0.25 | 0.11 | −0.34 | 0.23 |
| | 0.27 | 0.17 | 0.23 | 0.29 | −0.13 | 0.11 | −0.22 | 0.23 |
| | 0.06 | 0.14 | 0.03 | 0.15 | 0.14 | 0.15 | 0.13 | 0.17 |
| | 0.22 | 0.14 | 0.18 | 0.15 | 0.25 | 0.15 | 0.24 | 0.17 |
| | 0.04 | 0.34 | 0.41 | 0.21 | −0.09 | 0.13 | 0.51 | 0.19 |
| | −0.20 | 0.33 | 0.17 | 0.21 | −0.31 | 0.13 | 0.28 | 0.19 |
| | 0.44 | 0.05 | 1.46 | 0.17 | 0.35 | 0.05 | 1.03 | 0.14 |
| | – | – | – | – | 0.38 | 0.03 | 1.05 | |
| | 1000.00 | | 1000.00 | | 1000.00 | | 1000.00 | |
| Loglik | −1477.63 | | −1477.52 | | −1228.73 | | −1234.97 | |
| Cor | 0.66 | | 0.66 | | 0.90 | | 0.89 | |
| MSEP | 1.87 | | 1.86 | | 0.74 | | 0.76 | |
| | −12.30 | 5.86 | 7.90 | 4.36 | 13.70 | 3.69 | 9.22 | 3.11 |
| | −12.20 | 5.80 | 7.93 | 4.41 | 13.60 | 3.73 | 9.11 | 3.16 |
| | −10.40 | 5.87 | 9.66 | 4.36 | 15.50 | 3.69 | 10.94 | 3.10 |
| | 0.96 | 0.16 | 1.42 | 0.35 | 0.72 | 0.18 | 1.58 | 0.40 |
| | – | – | – | – | 1.33 | 0.18 | 1.13 | 0.34 |
| | 2.75 | 0.14 | 2.91 | 0.15 | 1.67 | 0.11 | 2.23 | 0.17 |
| Loglik | −1918.00 | | −1957.00 | | −1542.00 | | −1747.00 | |
| Cor | 0.60 | | 0.56 | | 0.83 | | 0.71 | |
| MSEP | 2.41 | | 2.60 | | 1.07 | | 1.68 | |
| | −3.95 | 0.51 | −6.34 | 3.33 | 1.41 | 0.48 | 3.32 | 1.31 |
| | −3.95 | 0.48 | −6.33 | 3.32 | 1.41 | 0.49 | 3.32 | 1.29 |
| | −3.51 | 0.49 | −5.85 | 3.33 | 1.86 | 0.49 | 3.79 | 1.31 |
| | 0.09 | 0.01 | 0.15 | 0.03 | 0.07 | 0.01 | 0.16 | 0.03 |
| | – | – | – | – | 0.08 | 0.01 | 0.05 | 0.02 |
| | 0.17 | 0.01 | 0.181 | 0.009 | 0.11 | 0.01 | 0.15 | 0.01 |
| Loglik | −484.00 | | −518.00 | | −125.00 | | −354.00 | |
| Cor | 0.71 | | 0.68 | | 0.88 | | 0.79 | |
| MSEP | 2.50 | | 2.63 | | 1.25 | | 1.97 | |
The beta coefficients corresponding to effects of environments (β1, β2, β3) are given for models Normal and LN only. Mean, posterior mean; SD, posterior SD.
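The two predictive checks reported in the Cor and MSEP rows can be sketched as follows. This is a minimal illustration that assumes Cor is the Pearson correlation between observed and predicted counts and MSEP is the mean squared error of prediction; the table gives only the abbreviations, and the function names and the small data vectors are hypothetical.

```python
import math


def pearson_cor(observed, predicted):
    """Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def msep(observed, predicted):
    """Mean squared error of prediction (average squared difference)."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical observed counts and model predictions, for illustration only.
observed = [0, 2, 1, 4, 3]
predicted = [0.5, 1.5, 1.0, 3.0, 3.5]
print(pearson_cor(observed, predicted), msep(observed, predicted))
```

A higher Cor and a lower MSEP both indicate predictions closer to the observed counts, which is how the scenarios are compared in the tables.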
Table 4. Estimated posterior predictive checks with cross-validation for Models NB, Pois, Normal and LN
| Scenario | Statistic | Batan 2012 Cor | Batan 2012 MSEP | Batan 2014 Cor | Batan 2014 MSEP | Ecuador 2014 Cor | Ecuador 2014 MSEP |
|---|---|---|---|---|---|---|---|
| S1 | Mean | 0.43 (3) | 0.98 (3.5) | 0.43 (3.5) | 1.39 (2) | 0.18 (3) | 11.733 (4) |
| | SD | 0.33 | 0.72 | 0.33 | 1.35 | 0.40 | 9.471 |
| S2 | Mean | 0.42 (4) | 0.98 (3.5) | 0.43 (3.5) | 1.38 (1) | 0.20 (2) | 11.222 (2) |
| | SD | 0.33 | 0.72 | 0.33 | 1.36 | 0.37 | 8.614 |
| S3 | Mean | 0.54 (2) | 0.49 (1) | 0.52 (2) | 1.48 (3) | 0.22 (1) | 8.645 (1) |
| | SD | 0.28 | 0.38 | 0.29 | 2.32 | 0.39 | 5.688 |
| S4 | Mean | 0.56 (1) | 0.61 (2) | 0.56 (1) | 1.85 (4) | 0.12 (4) | 11.343 (3) |
| | SD | 0.24 | 0.44 | 0.22 | 2.68 | 0.41 | 8.154 |
| S1 | Mean | 0.43 (3) | 0.98 (3.5) | 0.43 (3.5) | 1.39 (2) | 0.18 (3) | 11.733 (4) |
| | SD | 0.33 | 0.72 | 0.33 | 1.35 | 0.40 | 9.471 |
| S2 | Mean | 0.42 (4) | 0.98 (3.5) | 0.43 (3.5) | 1.38 (1) | 0.20 (2) | 11.222 (2) |
| | SD | 0.33 | 0.72 | 0.33 | 1.36 | 0.37 | 8.614 |
| S3 | Mean | 0.54 (2) | 0.48 (1) | 0.52 (2) | 1.48 (3) | 0.22 (1) | 8.645 (1) |
| | SD | 0.28 | 0.38 | 0.29 | 2.32 | 0.39 | 5.688 |
| S4 | Mean | 0.56 (1) | 0.61 (2) | 0.56 (1) | 1.85 (4) | 0.12 (4) | 11.343 (3) |
| | SD | 0.24 | 0.44 | 0.22 | 2.68 | 0.41 | 8.154 |
| S1 | Mean | 0.36 (1) | 1.10 (4) | 0.37 (1.5) | 1.79 (1) | 0.15 (1.5) | 7.425 (2) |
| | SD | 0.28 | 0.88 | 0.39 | 1.70 | 0.32 | 4.151 |
| S2 | Mean | 0.34 (2) | 0.99 (2) | 0.33 (3) | 2.01 (3) | 0.07 (3) | 7.454 (3) |
| | SD | 0.33 | 0.65 | 0.44 | 2.46 | 0.33 | 4.339 |
| S3 | Mean | 0.33 (3) | 0.81 (1) | 0.37 (1.5) | 1.96 (2) | 0.15 (1.5) | 7.318 (1) |
| | SD | 0.30 | 0.46 | 0.40 | 2.99 | 0.29 | 4.159 |
| S4 | Mean | 0.27 (4) | 1.03 (3) | 0.24 (4) | 2.37 (4) | 0.04 (4) | 8.482 (4) |
| | SD | 0.34 | 0.73 | 0.45 | 3.42 | 0.24 | 4.326 |
| S1 | Mean | 0.51 (2) | 0.66 (2.5) | 0.46 (1) | 1.60 (1) | 0.15 (1.5) | 8.10 (4) |
| | SD | 0.21 | 0.42 | 0.31 | 2.35 | 0.38 | 5.11 |
| S2 | Mean | 0.51 (2) | 0.66 (2.5) | 0.43 (3.5) | 1.78 (2) | 0.09 (3.5) | 7.82 (2) |
| | SD | 0.22 | 0.39 | 0.35 | 2.82 | 0.46 | 5.31 |
| S3 | Mean | 0.51 (2) | 0.64 (1) | 0.45 (2) | 1.871 (3) | 0.15 (1.5) | 7.76 (1) |
| | SD | 0.21 | 0.45 | 0.31 | 3.16 | 0.37 | 5.21 |
| S4 | Mean | 0.43 (4) | 0.72 (4) | 0.43 (3.5) | 1.95 (4) | 0.09 (3.5) | 8.04 (3) |
| | SD | 0.25 | 0.42 | 0.33 | 3.15 | 0.41 | 5.18 |
The numbers in parentheses denote the ranking of the four scenarios for each posterior predictive check.
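The Mean and SD rows above summarize each predictive check over the cross-validation partitions. The sketch below shows one way such a summary could be produced with 10 folds. It is an illustration under stated assumptions: the random fold assignment, the seed, the use of the sample standard deviation, and the names `k_fold_indices` and `summarize_folds` are hypothetical and are not taken from the paper.

```python
import random
import statistics


def k_fold_indices(n_obs, k=10, seed=0):
    """Randomly partition the indices 0..n_obs-1 into k near-equal folds."""
    indices = list(range(n_obs))
    random.Random(seed).shuffle(indices)
    # Striding through the shuffled list gives folds whose sizes differ by at most 1.
    return [indices[i::k] for i in range(k)]


def summarize_folds(per_fold_values):
    """Mean and sample SD of a predictive check computed once per fold."""
    return statistics.mean(per_fold_values), statistics.stdev(per_fold_values)


folds = k_fold_indices(n_obs=50, k=10, seed=1)
for fold_id, test_idx in enumerate(folds):
    # In a real analysis the model would be fitted on all other indices
    # and the check (Cor or MSEP) evaluated on test_idx.
    print(fold_id, sorted(test_idx))
```

Each observation is held out exactly once, so every per-fold check is computed on data the fitted model has not seen.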
Table 5. Rank averages for the four scenarios for each model resulting from the 10-fold cross-validation implemented
| Scenario | Batan 2012 | Batan 2014 | Ecuador 2014 | Batan 2012 | Batan 2014 | Ecuador 2014 |
|---|---|---|---|---|---|---|
| S1 | 3.25 | 2.75 | 3.5 | 2.5 | 1.25 | 1.75 |
| S2 | 3.75 | 2.25 | 2 | 2 | 3 | 3 |
| S3 | 1.5 | 2.5 | 1 | 2 | 1.75 | 1.75 |
| S4 | 1.5 | 2.5 | 3.5 | 3.5 | 4 | 4 |
| S1 | 3.25 | 2.75 | 3.5 | 2.25 | 1 | 2.75 |
| S2 | 3.75 | 2.25 | 2 | 2.25 | 2.75 | 2.75 |
| S3 | 1.5 | 2.5 | 1 | 1.5 | 2.5 | 1.25 |
| S4 | 1.5 | 2.5 | 3.5 | 4 | 3.75 | 3.25 |
Each average was obtained as the mean of the rankings given in Table 4 for the two posterior predictive checks (Cor and MSEP) in each scenario.
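As a worked check of this averaging rule, the sketch below ranks the four scenarios on each check and then averages the two ranks. The input values are the Batan 2012 means from the first block of Table 4. The function name `tied_ranks`, and the conventions that a higher Cor and a lower MSEP rank better and that tied values share the average of their ranks, are assumptions inferred from the rankings shown in parentheses in that table.

```python
def tied_ranks(values, higher_is_better):
    """Rank values from best (1) to worst; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # average of the 1-based positions in the run
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


# Batan 2012 means for S1..S4, first block of Table 4.
cor = [0.43, 0.42, 0.54, 0.56]
msep = [0.98, 0.98, 0.49, 0.61]

cor_rank = tied_ranks(cor, higher_is_better=True)     # [3.0, 4.0, 2.0, 1.0]
msep_rank = tied_ranks(msep, higher_is_better=False)  # [3.5, 3.5, 1.0, 2.0]
rank_avg = [(c + m) / 2 for c, m in zip(cor_rank, msep_rank)]
print(rank_avg)  # [3.25, 3.75, 1.5, 1.5]
```

The result reproduces the first Batan 2012 column of the rank-average table (3.25, 3.75, 1.5, 1.5).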