Abstract
Allocating resources between population size and replication affects both genetic gain through phenotypic selection and quantitative trait loci detection power and effect estimation accuracy for marker-assisted selection (MAS). It is well known that because alleles are replicated across individuals in quantitative trait loci mapping and MAS, more resources should be allocated to increasing population size compared with phenotypic selection. Genomic selection is a form of MAS using all marker information simultaneously to predict individual genetic values for complex traits and has widely been found superior to MAS. No studies have explicitly investigated how resource allocation decisions affect success of genomic selection. My objective was to study the effect of resource allocation on response to MAS and genomic selection in a single biparental population of doubled haploid lines by using computer simulation. Simulation results were compared with previously derived formulas for the calculation of prediction accuracy under different levels of heritability and population size. Response of prediction accuracy to resource allocation strategies differed between genomic selection models (ridge regression best linear unbiased prediction [RR-BLUP], BayesCπ) and multiple linear regression using ordinary least-squares estimation (OLS), leading to different optimal resource allocation choices between OLS and RR-BLUP. For OLS, it was always advantageous to maximize population size at the expense of replication, but a high degree of flexibility was observed for RR-BLUP. Prediction accuracy of doubled haploid lines included in the training set was much greater than of those excluded from the training set, so there was little benefit to phenotyping only a subset of the lines genotyped. 
Finally, observed prediction accuracies in the simulation compared well to calculated prediction accuracies, indicating these theoretical formulas are useful for making resource allocation decisions.
Keywords: GenPred; Shared data resources; genomic selection; plant breeding
Year: 2013 PMID: 23450123 PMCID: PMC3583455 DOI: 10.1534/g3.112.004911
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1 Prediction accuracy (rA) as a function of replication number and population size for each of three statistical models. (A) Heritability of single plot measurements set to 0.20. (B) Heritability of single plot measurements set to 0.60.
Figure 2 Prediction accuracy (rA) for two statistical models as affected by tradeoffs between replication (r) and population size (n) for various levels of relative genotyping costs (C) expressed in field plot equivalents. Total budget is set to 500 field plot equivalents.
Figure 3 Comparison between calculated prediction accuracy (dashed lines) and observed prediction accuracy (solid lines) in simulations.
Population size and number of replications achieving the greatest prediction accuracy according to theoretical calculations and simulations for different resource allocation situations
| B – C – h² | Theoretical Optimal Strategy (n:r) | Observed Optimal Strategy (n:r) | Agreement |
|---|---|---|---|
| 250 – 0 – 0.20 | 250:1 | 250:1 | Yes |
| 250 – 0.5 – 0.20 | 100:2 | 100:2 | Yes |
| 250 – 1 – 0.20 | 83:2 | 83:2 | Yes |
| 250 – 0 – 0.60 | 250:1 | 250:1 | Yes |
| 250 – 0.5 – 0.60 | 167:1 | 167:1 | Yes |
| 250 – 1 – 0.60 | 125:1 (0.773) | 83:2 (0.777) | No |
| 500 – 0 – 0.20 | 500:1 | 500:1 | Yes |
| 500 – 0.5 – 0.20 | 200:2 | 200:2 | Yes |
| 500 – 1 – 0.20 | 167:2 (0.704) | 125:3 (0.710) | No |
| 500 – 0 – 0.60 | 500:1 | 500:1 | Yes |
| 500 – 0.5 – 0.60 | 333:1 | 333:1 | Yes |
| 500 – 1 – 0.60 | 250:1 (0.844) | 167:2 (0.848) | No |
Predictions in the simulations were made with the RR-BLUP model. The column titled “Agreement” indicates whether the theoretical calculations agreed with what was observed in the simulation. B, total budget in field plot equivalents; C, genotyping cost in field plot equivalents; h², heritability of single plot measurements; RR-BLUP, ridge regression best linear unbiased prediction.
Prediction accuracies observed in the simulation are displayed in parentheses for the instances in which the theoretical and observed optimal resource allocation strategies do not agree.
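The population sizes listed in the table above are consistent with a simple budget constraint in which every line costs C field plot equivalents to genotype plus r plots to phenotype, so that n ≈ B / (C + r). The following Python sketch illustrates that tradeoff together with the standard entry-mean heritability formula. Both function names are hypothetical, the budget rule is inferred from the tabulated values rather than stated in the source, and the heritability formula assumes that single-plot heritability equals genetic variance over genetic plus error variance, with no genotype-by-environment term.

```python
def population_size(budget, genotyping_cost, reps):
    """Lines affordable when each line costs genotyping_cost + reps plot equivalents.

    Assumption: n = B / (C + r), rounded to the nearest integer.
    """
    return round(budget / (genotyping_cost + reps))


def entry_mean_heritability(plot_h2, reps):
    """Heritability of a line mean over `reps` plots, given single-plot heritability.

    Assumption: error variance is divided by the number of replications.
    """
    return plot_h2 / (plot_h2 + (1.0 - plot_h2) / reps)


# Tradeoff for B = 500, C = 1, single-plot heritability 0.20
for reps in range(1, 5):
    n = population_size(500, 1.0, reps)
    h2_mean = entry_mean_heritability(0.20, reps)
    print(f"r={reps}  n={n}  entry-mean h2={h2_mean:.3f}")
```

Each added replication raises the heritability of line means but shrinks the population, which is the tension the optimal n:r strategies in the table resolve.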
Figure 4 Prediction accuracy (rA) for each relative genotyping cost and resource allocation strategy across generations of random mating (Cycle). Population sizes corresponding to each level of r can be observed in Figure 2. Total budget was set to 500 field plot equivalents. (A) Heritability of single plot measurements set to 0.20. (B) Heritability of single plot measurements set to 0.60. Average standard error of prediction accuracies was 0.005 and ranged from 0.002 to 0.007.
Response and prediction accuracy of several resource allocation and phenotyping strategies for a budget of 250 total plot units, relative genotyping cost of 0.50, and plot heritability of 0.20 (i.e., B = 250, C = 0.50, h² = 0.20)
| %Pheno. | r | n | No. of DHs Phenotyped |  | Prediction Accuracy of Phenotyped DHs | Prediction Accuracy of Nonphenotyped DHs | Cycle 1 Mean | Cycle 2 Mean | Cycle 3 Mean | Standard Error of Cycle 3 Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1 | 250 | 125 | 2.15 | 0.66 | 0.58 | 1.28 | 1.66 | 1.90 | 0.021 |
| 50 | 2 | 167 | 84 | 1.99 | 0.70 | 0.59 | 1.22 | 1.63 | 1.92 | 0.020 |
| 50 | 3 | 125 | 62 | 1.86 | 0.74 | 0.59 | 1.20 | 1.64 | 1.92 | 0.019 |
| 50 | 4 | 100 | 50 | 1.75 | 0.76 | 0.58 | 1.13 | 1.56 | 1.89 | 0.020 |
| 75 | 2 | 125 | 94 | 1.86 | 0.72 | 0.61 | 1.26 | 1.70 | 2.00 | 0.018 |
| 75 | 3 | 91 | 68 | 1.71 | 0.75 | 0.61 | 1.21 | 1.67 | 1.96 | 0.019 |
| 75 | 4 | 71 | 53 | 1.59 | 0.77 | 0.58 | 1.13 | 1.60 | 1.90 | 0.020 |
| 100 | 3 | 71 | 71 | 1.59 | 0.75 | 0.62 | 1.17 | 1.65 | 1.97 | 0.019 |
| 100 | 4 | 56 | 56 | 1.46 | 0.77 | 0.61 | 1.12 | 1.59 | 1.90 | 0.019 |
Predictions were made with RR-BLUP. Resource allocation strategy with highest cycle 3 mean is indicated by asterisk (*) in Cycle 3 mean column. All resource allocation strategies producing genetic gain not significantly different from the greatest genetic gain observed are underlined. Units are in C0 genetic SDs. RR-BLUP, ridge regression best linear unbiased prediction; GEBVs, genomic estimated breeding values.
Percentage of DH population phenotyped. DH lines not phenotyped were genotyped, and the genomic selection model was applied to calculate their GEBVs.
Number of DH lines phenotyped (i.e., %Pheno × n rounded to nearest whole integer).
Prediction accuracy of DH lines that were phenotyped and included in the model training dataset.
Prediction accuracy of DH lines that were not phenotyped.
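The population sizes and numbers of phenotyped lines in the table above can be reproduced by extending the budget constraint so that all n lines are genotyped but only a fraction p of them is phenotyped in r plots each: B ≈ n × C + p × n × r. The sketch below is a hedged reconstruction inferred from the tabulated values, not a rule stated in the source, and the function name is hypothetical. It relies on Python's built-in `round`, which rounds exact halves to the nearest even integer; that behavior happens to match the tabulated counts (for example, 62.5 becomes 62).

```python
def partial_phenotyping_plan(budget, genotyping_cost, reps, frac_phenotyped):
    """Return (lines genotyped, lines phenotyped) for a fixed budget.

    Assumption: budget = n * C + (p * n) * r, so n = B / (C + p * r).
    """
    cost_per_line = genotyping_cost + frac_phenotyped * reps
    n = round(budget / cost_per_line)
    n_pheno = round(frac_phenotyped * n)
    return n, n_pheno


# B = 250, C = 0.50: phenotype half of the lines in two replications
print(partial_phenotyping_plan(250, 0.5, 2, 0.50))
```

Lowering the phenotyped fraction frees budget for a larger genotyped population, at the cost of predicting more lines that never entered the training set.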
Response and prediction accuracy of several resource allocation and phenotyping strategies for a budget of 250 total plot units, relative genotyping cost of 1, and plot heritability of 0.20 (i.e., B = 250, C = 1, h² = 0.20)
| %Pheno. | r | n | No. of DHs Phenotyped |  | Prediction Accuracy of Phenotyped DHs | Prediction Accuracy of Nonphenotyped DHs | Cycle 1 Mean | Cycle 2 Mean | Cycle 3 Mean | Standard Error of Cycle 3 Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1 | 167 | 84 | 1.99 | 0.61 | 0.51 | 1.05 | 1.44 | 1.64 | 0.021 |
| 50 | 2 | 125 | 62 | 1.86 | 0.68 | 0.54 | 1.11 | 1.52 | 1.76 | 0.021 |
| 50 | 3 | 100 | 50 | 1.75 | 0.73 | 0.56 | 1.09 | 1.53 | 1.78 | 0.020 |
| 50 | 4 | 83 | 42 | 1.67 | 0.75 | 0.55 | 1.08 | 1.51 | 1.78 | 0.020 |
| 75 | 1 | 143 | 107 | 1.92 | 0.62 | 0.54 | 1.11 | 1.52 | 1.73 | 0.020 |
| 75 | 2 | 100 | 75 | 1.75 | 0.70 | 0.58 | 1.16 | 1.60 | 1.86 | 0.019 |
| 75 | 3 | 77 | 58 | 1.63 | 0.73 | 0.57 | 1.10 | 1.55 | 1.81 | 0.019 |
| 75 | 4 | 62 | 46 | 1.52 | 0.76 | 0.57 | 1.06 | 1.53 | 1.80 | 0.019 |
| 100 | 1 | 125 | 125 | 1.86 | 0.64 | 0.56 | 1.16 | 1.59 | 1.81 | 0.020 |
| 100 | 2 | 83 | 83 | 1.67 | 0.71 | 0.60 | 1.15 | 1.61 | 1.90 | 0.019 |
| 100 | 4 | 50 | 50 | 1.40 | 0.77 | 0.59 | 1.04 | 1.52 | 1.84 | 0.019 |
Predictions were made with RR-BLUP. Resource allocation strategy with highest cycle 3 mean is indicated by asterisk (*) in Cycle 3 mean column. All resource allocation strategies producing genetic gain not significantly different from the greatest genetic gain observed are underlined. Units are in C0 genetic SDs. RR-BLUP, ridge regression best linear unbiased prediction; GEBVs, genomic estimated breeding values.
Percentage of DH population phenotyped. DH lines not phenotyped were genotyped, and the genomic selection model was applied to calculate their GEBVs.
Number of DH lines phenotyped (i.e., %Pheno × n rounded to nearest whole integer).
Prediction accuracy of DH lines that were phenotyped and included in the model training dataset.
Prediction accuracy of DH lines that were not phenotyped.
Response and prediction accuracy of several resource allocation and phenotyping strategies for a budget of 500 total plot units, relative genotyping cost of 0.50, and plot heritability of 0.20 (i.e., B = 500, C = 0.50, h² = 0.20)
| %Pheno. | r | n | No. of DHs Phenotyped |  | Prediction Accuracy of Phenotyped DHs | Prediction Accuracy of Nonphenotyped DHs | Cycle 1 Mean | Cycle 2 Mean | Cycle 3 Mean | Standard Error of Cycle 3 Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1 | 500 | 250 | 2.42 | 0.72 | 0.68 | 1.61 | 1.99 | 2.33 | 0.019 |
| 50 | 2 | 333 | 166 | 2.27 | 0.77 | 0.70 | 1.62 | 2.03 | 2.39 | 0.018 |
| 50 | 3 | 250 | 125 | 2.15 | 0.80 | 0.71 | 1.58 | 2.01 | 2.39 | 0.018 |
| 50 | 4 | 200 | 100 | 2.06 | 0.81 | 0.70 | 1.51 | 1.94 | 2.32 | 0.020 |
| 75 | 1 | 400 | 300 | 2.34 | 0.74 | 0.71 | 1.63 | 2.04 | 2.39 | 0.018 |
| 75 | 2 | 250 | 188 | 2.15 | 0.78 | 0.73 | 1.57 | 2.01 | 2.39 | 0.017 |
| 75 | 3 | 182 | 136 | 2.02 | 0.80 | 0.72 | 1.52 | 1.97 | 2.36 | 0.018 |
| 75 | 4 | 143 | 107 | 1.92 | 0.82 | 0.72 | 1.50 | 1.96 | 2.36 | 0.018 |
| 100 | 3 | 143 | 143 | 1.92 | 0.81 | 0.73 | 1.50 | 1.98 | 2.38 | 0.017 |
| 100 | 4 | 111 | 111 | 1.80 | 0.82 | 0.72 | 1.45 | 1.92 | 2.33 | 0.017 |
Predictions were made with RR-BLUP. Resource allocation strategy with highest cycle 3 mean is indicated by asterisk (*) in Cycle 3 mean column. All resource allocation strategies producing genetic gain not significantly different from the greatest genetic gain observed are underlined. Units are in C0 genetic SDs. RR-BLUP, ridge regression best linear unbiased prediction; GEBVs, genomic estimated breeding values.
Percentage of DH population phenotyped. DH lines not phenotyped were genotyped, and the genomic selection model was applied to calculate their GEBVs.
Number of DH lines phenotyped (i.e., %Pheno × n rounded to nearest whole integer).
Prediction accuracy of DH lines that were phenotyped and included in the model training dataset.
Prediction accuracy of DH lines that were not phenotyped.
Response and prediction accuracy of several resource allocation and phenotyping strategies for a budget of 500 total plot units, relative genotyping cost of 1, and plot heritability of 0.20 (i.e., B = 500, C = 1, h² = 0.20)
| %Pheno. | r | n | No. of DHs Phenotyped |  | Prediction Accuracy of Phenotyped DHs | Prediction Accuracy of Nonphenotyped DHs | Cycle 1 Mean | Cycle 2 Mean | Cycle 3 Mean | Standard Error of Cycle 3 Mean |
|---|---|---|---|---|---|---|---|---|---|---|
| 50 | 1 | 333 | 166 | 2.27 | 0.68 | 0.61 | 1.42 | 1.80 | 2.09 | 0.021 |
| 50 | 2 | 250 | 125 | 2.15 | 0.75 | 0.66 | 1.46 | 1.88 | 2.21 | 0.019 |
| 50 | 3 | 200 | 100 | 2.06 | 0.78 | 0.68 | 1.44 | 1.88 | 2.21 | 0.019 |
| 50 | 4 | 167 | 84 | 1.99 | 0.80 | 0.68 | 1.44 | 1.87 | 2.25 | 0.018 |
| 75 | 1 | 286 | 214 | 2.21 | 0.71 | 0.66 | 1.48 | 1.89 | 2.20 | 0.021 |
| 75 | 4 | 125 | 94 | 1.86 | 0.81 | 0.69 | 1.41 | 1.87 | 2.26 | 0.017 |
Predictions were made with RR-BLUP. Resource allocation strategy with highest cycle 3 mean is indicated by asterisk (*) in Cycle 3 mean column. All resource allocation strategies producing genetic gain not significantly different from the greatest genetic gain observed are underlined. Units are in C0 genetic SDs. RR-BLUP, ridge regression best linear unbiased prediction; GEBVs, genomic estimated breeding values.
Percentage of DH population phenotyped. DH lines not phenotyped were genotyped, and the genomic selection model was applied to calculate their GEBVs.
Number of DH lines phenotyped (i.e., %Pheno × n rounded to nearest whole integer).
Prediction accuracy of DH lines that were phenotyped and included in the model training dataset.
Prediction accuracy of DH lines that were not phenotyped.