Vikas Belamkar, Mary J Guttieri, Waseem Hussain, Diego Jarquín, Ibrahim El-Basyoni, Jesse Poland, Aaron J Lorenz, P Stephen Baenziger.
Abstract
Genomic prediction (GP) is now routinely performed in crop plants to predict unobserved phenotypes. The use of predicted phenotypes to make selections is an active area of research. Here, we evaluate GP for predicting grain yield and compare genomic and phenotypic selection by tracking the lines advanced in the breeding program. We examined four independent nurseries of F3:6 and F3:7 lines trialed at 6 to 10 locations each year. Yield was analyzed using mixed models that accounted for experimental design and spatial variation. Genotyping-by-sequencing provided nearly 27,000 high-quality SNPs. Average genomic predictive ability ranged from 0.23 to 0.55; it was estimated for each year by randomly masking lines as missing in steps of 10% from 10 to 90% and using the remaining lines from the same year, together with lines from other years, as the training set. The predictive ability estimated for a new year using the other years ranged from 0.17 to 0.28. Further, we tracked lines advanced based on phenotype from each of the four F3:6 nurseries. Lines with both above-average genomic estimated breeding value (GEBV) and above-average phenotypic value (BLUP) were retained for more years than lines with either above-average GEBV or above-average BLUP alone. The number of lines selected for advancement was substantially greater when predictions were made with 50% of the lines from the testing year added to the training set. Hence, evaluation of only 50% of the lines yearly seems possible. This study provides insights to assess and integrate genomic selection in breeding programs of autogamous crops.
Keywords: GenPred; Genomic selection; Triticum aestivum; genomic best linear unbiased prediction; genomic prediction; genotyping-by-sequencing; shared data resources; spatial variation
Year: 2018 PMID: 29945967 PMCID: PMC6071594 DOI: 10.1534/g3.118.200415
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Description of prediction scenarios and composition of training and test set using 2012 as an example
| Prediction scenario | Cross-validation scheme | Training set | Test set | Size of training set | Size of test set |
|---|---|---|---|---|---|
| Predictions of 2012 F3:6 nursery | NA10 | 2013+2014+2015+90% of the lines in 2012 | Rest 10% of the lines in 2012 | 1,072 | 28 |
| | NA20 | 2013+2014+2015+80% of the lines in 2012 | Rest 20% of the lines in 2012 | 1,044 | 56 |
| | NA50 | 2013+2014+2015+50% of the lines in 2012 | Rest 50% of the lines in 2012 | 960 | 140 |
| | NA100 | 2013+2014+2015 | 100% of the lines in 2012 | 820 | 280 |
| Predicting 2012 F3:6 lines advanced to F3:7 in 2013 | — | 2013+2014+2015 | 100% of the lines in 2012 (subset GEBV of 2012 57 F3:6 lines advanced to F3:7 in 2013) | 820 | 280 |
| Predicting 2012 F3:6 lines advanced to F3:7 in 2013 skipping the year of F3:7 (2013) in training set | — | 2014+2015 | 100% of the lines in 2012 and 2013 (subset GEBV of 2012 57 F3:6 lines advanced to F3:7 in 2013) | 540 | 560 |
| Tracking lines advanced in the breeding program from 2012 F3:6 | — | 2013+2014+2015 | 100% of the lines in 2012 | 820 | 280 |
| Tracking lines advanced in the breeding program from 2012 F3:6 including 50% of the lines from the same nursery in training set | — | 2013+2014+2015+50% of the lines in 2012 | Rest 50% of the lines in 2012 | 960 | 140 |
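
The composition of the training and test sets in the cross-validation rows can be illustrated with a minimal sketch. It assumes 280 lines in the 2012 nursery and 820 lines in the other three years combined (both read off the NA100 row). The line identifiers, the function name `make_split`, and the random seed are hypothetical and are not taken from the study.

```python
import random


def make_split(test_year_lines, other_year_lines, masked_fraction, seed=0):
    """Mask a random fraction of the testing-year lines.

    The masked lines form the test set; the training set is every line from
    the other years plus the unmasked lines of the testing year.
    """
    rng = random.Random(seed)
    n_masked = round(len(test_year_lines) * masked_fraction)
    masked = set(rng.sample(test_year_lines, n_masked))
    test_set = [line for line in test_year_lines if line in masked]
    kept = [line for line in test_year_lines if line not in masked]
    training_set = other_year_lines + kept
    return training_set, test_set


# Placeholder identifiers (assumption): 280 lines in 2012, 820 in 2013-2015.
lines_2012 = [f"2012_{i}" for i in range(280)]
lines_other = [f"other_{i}" for i in range(820)]

sizes = {}
for label, fraction in [("NA10", 0.1), ("NA20", 0.2), ("NA50", 0.5), ("NA100", 1.0)]:
    training_set, test_set = make_split(lines_2012, lines_other, fraction)
    sizes[label] = (len(training_set), len(test_set))
    print(label, len(training_set), len(test_set))
```

With these assumed nursery sizes, the printed training and test set sizes match the corresponding rows of the table.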
Figure 1. The maximum realized kinship coefficient (MRKC) between lines grown in a testing year and lines in the remaining years (training set). For each test line i, the MRKC was calculated as max(U_ij) over all training lines j, where U_ij is the realized kinship coefficient between line i in the test set and line j in the training set. The MRKC value therefore represents the kinship between a specific line in the test set and all lines in the training set. The kinship of each of the preliminary yield trials (PYT 2012-2015) with the rest of the PYTs (training set), and the kinship between lines advanced from the PYT to the advanced yield trial (AYT) each year (for example, AYT 2013) and lines in the training set (PYT 2013, 2014, and 2015), are shown in a and b, respectively. The average MRKC estimated across all lines grown in each PYT and AYT is written above each violin plot.
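
The MRKC calculation described in this caption can be sketched as follows. The kinship values are invented for illustration, and the function name `max_realized_kinship` is hypothetical; how the realized kinship matrix itself was computed is not stated here, so the sketch takes it as given.

```python
from statistics import mean


def max_realized_kinship(kinship_rows):
    """Return one MRKC per test line.

    kinship_rows[i][j] is the realized kinship between test line i and
    training line j; the MRKC of line i is the maximum of row i.
    """
    return [max(row) for row in kinship_rows]


# Toy kinship block (assumption): 2 test lines x 3 training lines.
U = [
    [0.10, 0.42, -0.05],
    [0.30, 0.12, 0.08],
]

mrkc = max_realized_kinship(U)
average_mrkc = mean(mrkc)  # the per-nursery average shown above each violin plot
print(mrkc, average_mrkc)
```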
Figure 2. Predictive ability (PA) estimated for grain yield using various cross-validation scenarios (NA10 to NA90) and for an entire new preliminary yield trial nursery (NA100) in each of the four years, 2012 to 2015 (a to d). Each cross-validation scenario was performed 10 times; the mean PA is written above each box plot and is also marked with an asterisk within the box plot.
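
As a rough sketch of how a mean PA over repeated runs might be computed, the snippet below assumes that PA is the Pearson correlation between the observed values (BLUPs) and the GEBVs of the masked lines. This is a common definition, but it is an assumption here. The function names and the toy numbers are hypothetical.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = mean(x), mean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def mean_predictive_ability(replicates):
    """Average PA over replicates, each an (observed, predicted) pair of lists."""
    return mean(pearson(observed, predicted) for observed, predicted in replicates)


# Two toy replicates (assumption); the study repeated each scenario 10 times.
replicates = [
    ([4.1, 3.8, 4.5, 3.9], [4.0, 3.9, 4.3, 4.1]),
    ([4.2, 3.7, 4.4, 4.0], [4.1, 4.0, 4.2, 3.9]),
]
print(mean_predictive_ability(replicates))
```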
Figure 3. Comparison of phenotypic and genomic selection. The comparison is made using 57 lines advanced from the preliminary yield trial (F3:6) grown in 2012-2015 to the advanced yield trial (F3:7) grown in 2013-2016. The suffix “2”, separated by an underscore, denotes scenarios in which the genomic estimated breeding values of the preliminary yield trials (F3:6) are estimated with the following (F3:7) year excluded from the training set. Correlation coefficients are noted on the bar plots.
Figure 4. Comparison of adjusted observed grain yield values (best linear unbiased predictors, BLUPs) and genomic estimated breeding values (GEBVs) for lines tested in the 2012 (a) and 2013 (b) preliminary yield trials (PYTs). The lines advanced in each year from the 2012 and 2013 PYT nursery until 2016 are highlighted with different colors and shapes. The two vertical lines represent the mean of the BLUPs of the PYT nursery and the mean of the BLUPs of the 57 PYT lines selected for advancement to the advanced yield trial nursery. The two horizontal lines indicate the mean (solid) and the 75th percentile (dashed) of the GEBVs.
Figure 5. Comparison of adjusted observed grain yield (best linear unbiased predictors, BLUPs) and genomic breeding values estimated using all lines in other years and 50% of the lines randomly selected from the 2012 (a) and 2013 (b) preliminary yield trials (PYTs). The lines advanced in each year from the 2012 and 2013 PYT nursery until 2016 are highlighted with different colors and shapes. The two vertical lines represent the mean of the BLUPs of the PYT nursery and the mean of the BLUPs of the 57 PYT lines selected for advancement to the advanced yield trial nursery. The two horizontal lines indicate the mean (solid) and the 75th percentile (dashed) of the genomic estimated breeding values.
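
The BLUP-versus-GEBV comparison in Figures 4 and 5 amounts to sorting lines by whether each value exceeds its nursery mean. A minimal sketch of that grouping is given below; the line names, the values, and the function name `classify_lines` are hypothetical, and the 75th-percentile threshold shown in the figures is left out for brevity.

```python
from statistics import mean


def classify_lines(blup, gebv):
    """Group lines by whether BLUP and GEBV exceed their respective means.

    blup and gebv are dicts mapping the same line names to values.
    """
    blup_mean = mean(blup.values())
    gebv_mean = mean(gebv.values())
    groups = {"both": [], "blup_only": [], "gebv_only": [], "neither": []}
    for line in blup:
        high_blup = blup[line] > blup_mean
        high_gebv = gebv[line] > gebv_mean
        if high_blup and high_gebv:
            groups["both"].append(line)
        elif high_blup:
            groups["blup_only"].append(line)
        elif high_gebv:
            groups["gebv_only"].append(line)
        else:
            groups["neither"].append(line)
    return groups


# Toy values (assumption), not data from the study.
blup = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}
gebv = {"A": 3.5, "B": 2.0, "C": 3.2, "D": 2.1}
groups = classify_lines(blup, gebv)
print(groups)
```

Lines in the "both" group correspond to those the abstract reports as being retained for more years.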