Md Abdullah Al Bari, Ping Zheng, Indalecio Viera, Hannah Worral, Stephen Szwiec, Yu Ma, Dorrie Main, Clarice J Coyne, Rebecca J McGee, Nonoy Bandillo.
Abstract
Phenotypic evaluation and efficient utilization of germplasm collections can be time-intensive, laborious, and expensive. However, with the plummeting costs of next-generation sequencing and the addition of genomic selection to the plant breeder's toolbox, we can now tap the genetic diversity within large germplasm collections more efficiently. In this study, we evaluated the potential of genomic prediction to enhance selection of accessions from the USDA Pea Germplasm Collection, using a set of 482 pea (Pisum sativum L.) accessions genotyped with 30,600 single nucleotide polymorphism (SNP) markers and phenotyped for seed yield and yield-related components. Genomic prediction models and several factors affecting predictive ability were evaluated in a series of cross-validation schemes across complex traits. Different genomic prediction models gave similar results: predictive ability across traits ranged from 0.23 to 0.60, and no single model performed best across all traits. Increasing the training population size improved predictive ability for most traits, including seed yield. Predictive ability increased with the number of markers and then reached a plateau, presumably because of extensive linkage disequilibrium in the pea genome. Accounting for population structure effects did not significantly boost predictive ability, but we observed a slight improvement for seed yield. Applying the best genomic prediction model (e.g., RR-BLUP), we then examined the distribution of genomic estimated breeding values (GEBV) of genotyped but nonphenotyped accessions and the reliability of those GEBV. The distribution of GEBV suggested that none of the nonphenotyped accessions were expected to perform outside the range of the phenotyped accessions. Desirable breeding values with higher reliability can be used to identify and screen favorable germplasm accessions.
Expanding the training set and incorporating additional orthogonal information (e.g., transcriptomics, metabolomics, physiological traits, etc.) into the genomic prediction framework can enhance prediction accuracy.
Keywords: genomic prediction; genomic selection; germplasm accessions; next-generation sequencing; pea (Pisum sativum L); reliability criteria
Year: 2021 PMID: 35003202 PMCID: PMC8740293 DOI: 10.3389/fgene.2021.707754
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Heritability and summary statistics for seed yield and other agronomic traits.
| Trait | Mean | Range | SD | CV (%) | | |
|---|---|---|---|---|---|---|
| DFF (days) | 71 | 60–84 | 4.8 | 6.7 | 0.90 | 0.80 |
| NoSeedsPod (Nos.) | 5.7 | 4.4–6.9 | 0.5 | 8.5 | 0.84 | 0.66 |
| PH (cm) | 74 | 37.6–108.3 | 11.5 | 15.5 | 0.81 | 0.68 |
| PodsPlant (Nos.) | 18 | 15–23 | 1.5 | 8.3 | 0.50 | 0.27 |
| DM (days) | 104 | 99–112 | 2.4 | 2.3 | 0.51 | 0.38 |
| SeedYield (kg ha−1) | 2,918 | 1,734–4,463 | 451 | 15.4 | 0.67 | 0.46 |
DFF is days to first flowering; NoSeedsPod is the number of seeds per pod; PH is plant height; PodsPlant is the number of pods per plant; DM is days to physiological maturity; SeedYield is seed yield per hectare; SD is the standard deviation; CV is the coefficient of variation; H2 is heritability in the broad sense.
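As a quick consistency check on the summary statistics, the CV column can be recomputed from the displayed means and standard deviations as CV = 100 × SD / mean. The sketch below copies the values from the table; the function name `cv_percent` is only illustrative. Small differences from the reported CV are to be expected, presumably because the displayed means and SDs are rounded.

```python
# (trait, mean, SD, reported CV%) copied from the summary-statistics table
rows = [
    ("DFF", 71, 4.8, 6.7),
    ("NoSeedsPod", 5.7, 0.5, 8.5),
    ("PH", 74, 11.5, 15.5),
    ("PodsPlant", 18, 1.5, 8.3),
    ("DM", 104, 2.4, 2.3),
    ("SeedYield", 2918, 451, 15.4),
]


def cv_percent(mean, sd):
    """Coefficient of variation expressed as a percentage of the mean."""
    return 100.0 * sd / mean


for trait, mean, sd, reported in rows:
    computed = cv_percent(mean, sd)
    print(f"{trait:<11} computed CV = {computed:5.1f}%  reported = {reported}%")
```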
Predictive ability for seed yield and agronomic traits using five genomic prediction models.
| Traits | RR-BLUP | PLSR | RF | BayesCπ | RKHS |
|---|---|---|---|---|---|
| DFF (days) | 0.60 (0.57–0.63) | 0.57 (0.53–0.61) | 0.55 (0.52–0.58) | 0.59 (0.55–0.63) | 0.54 (0.50–0.58) |
| NoSeedsPod | 0.42 (0.37–0.48) | 0.41 (0.36–0.46) | 0.40 (0.35–0.45) | 0.42 (0.38–0.46) | 0.40 (0.34–0.48) |
| PH (cm) | 0.39 (0.33–0.44) | 0.42 (0.38–0.48) | 0.45 (0.40–0.50) | 0.45 (0.41–0.48) | 0.43 (0.39–0.48) |
| PodsPlant | 0.28 (0.22–0.33) | 0.25 (0.20–0.31) | 0.28 (0.22–0.34) | 0.23 (0.17–0.29) | 0.28 (0.23–0.34) |
| DM (days) | 0.42 (0.36–0.47) | 0.44 (0.39–0.50) | 0.41 (0.35–0.46) | 0.47 (0.43–0.50) | 0.45 (0.40–0.48) |
| SeedYield (kg ha−1) | 0.38 (0.34–0.42) | 0.31 (0.27–0.36) | 0.39 (0.35–0.44) | 0.35 (0.31–0.39) | 0.42 (0.37–0.48) |
DFF is days to first flowering; NoSeedsPod is the number of seeds per pod; PH is plant height in cm; PodsPlant is the number of pods per plant; DM is days to physiological maturity; values in parentheses are ranges of predictive ability.
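To make the notion of predictive ability concrete, the following is a minimal sketch of k-fold cross-validation with a ridge-regression marker model on simulated data. It assumes predictive ability is the Pearson correlation between predicted and observed phenotypes in held-out folds, which this record does not spell out. The ridge penalty `lam` is fixed by hand here, whereas RR-BLUP estimates the equivalent shrinkage from variance components, so this is a simplified stand-in rather than the authors' pipeline. All function names and the simulated data are hypothetical.

```python
import numpy as np


def ridge_marker_effects(X, y, lam):
    """Solve (X'X + lam*I) b = X'y for marker effects b (X and y centered)."""
    m = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(m), X.T @ y)


def cv_predictive_ability(X, y, lam=1.0, k=5, seed=0):
    """Mean correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    cors = []
    for test_idx in folds:
        train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
        # Center with training-set means only, so the test fold stays unseen
        mu_x = X[train_idx].mean(axis=0)
        mu_y = y[train_idx].mean()
        b = ridge_marker_effects(X[train_idx] - mu_x, y[train_idx] - mu_y, lam)
        pred = mu_y + (X[test_idx] - mu_x) @ b
        cors.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(cors))


# Simulated example (NOT the pea data): 200 lines, 50 markers coded 0/1/2
rng = np.random.default_rng(42)
n, m = 200, 50
X = rng.integers(0, 3, size=(n, m)).astype(float)
true_effects = rng.normal(0.0, 1.0, m)
y = X @ true_effects + rng.normal(0.0, 1.0, n)

ability = cv_predictive_ability(X, y, lam=1.0, k=5)
print(f"Mean predictive ability over 5 folds: {ability:.2f}")
```

The simulated trait is far more heritable than the field traits reported above, so the correlation it yields says nothing about the values in the table. The point is only the mechanics of fitting on training folds and correlating predictions with held-out observations.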
FIGURE 1. Predictive ability with increasing training population size using the RR-BLUP model. DFF is days to first flowering, DM is days to physiological maturity, NoSeedsPod is the number of seeds per pod, PH is plant height in cm, PodsPlant is the number of pods per plant, and SeedYield is seed yield in kg ha−1.
Predictive ability within and across subpopulations using RR-BLUP and all SNP markers.
| Sub pops | DFF | NoSeedsPod | PH | PodsPlant | DM | SeedYield |
|---|---|---|---|---|---|---|
| Sub pop 5 (51) | 0.27 | 0.26 | 0.08 | -0.01 | 0.02 | 0.18 |
| Sub pop 7 (58) | 0.34 | 0.40 | 0.22 | 0.12 | -0.01 | 0.01 |
| Sub pop 8 (41) | 0.68 | 0.35 | 0.33 | 0.07 | 0.43 | 0.37 |
| SP- | 0.50 | 0.45 | 0.47 | 0.25 | 0.51 | 0.34 |
| SP+ | 0.53 | 0.35 | 0.42 | 0.25 | 0.48 | 0.45 |
| SP PC10 | 0.51 | 0.41 | 0.44 | 0.18 | 0.20 | 0.43 |
| Var exp (R2) | 0.13 | 0.09 | 0.19 | 0.15 | 0.15 | 0.17 |
DFF is days to first flowering, NoSeedsPod is the number of seeds per pod, PH is plant height, PodsPlant is the number of pods per plant, and DM is days to physiological maturity. SP- does not account for population structure, SP+ accounts for population structure in the model, and SP PC10 accounts for population structure with 10 principal components (PCs). Var exp (R2) is the variance explained by population structure after fitting a regression model. Numbers in parentheses are the number of entries in each subpopulation.
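One way to obtain a "variance explained by population structure" figure of the kind reported in the last row is to regress the phenotype on leading principal components of the marker matrix and take the R2 of that fit. The sketch below does this on simulated data with two subpopulations. The footnote only says a regression model was fitted, so using 10 PCs as the structure covariates is an assumption borrowed from the SP PC10 row. The function names and the simulation are hypothetical.

```python
import numpy as np


def top_pcs(X, n_pcs=10):
    """Scores of the first n_pcs principal components of the marker matrix."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_pcs] * S[:n_pcs]


def variance_explained(pcs, y):
    """R2 from an ordinary least-squares regression of y on the PCs."""
    Z = np.column_stack([np.ones(len(y)), pcs])  # intercept + PC scores
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ coef
    return float(1.0 - resid.var() / y.var())


# Simulated example (NOT the pea data): two subpopulations that differ in
# allele frequencies and in mean phenotype
rng = np.random.default_rng(1)
n_per_group, m = 60, 300
p1 = rng.uniform(0.1, 0.9, m)
p2 = rng.uniform(0.1, 0.9, m)
X = np.vstack([
    rng.binomial(2, p1, size=(n_per_group, m)),
    rng.binomial(2, p2, size=(n_per_group, m)),
]).astype(float)
group = np.repeat([0.0, 1.0], n_per_group)
y = 5.0 * group + rng.normal(0.0, 1.0, 2 * n_per_group)

pcs = top_pcs(X, n_pcs=10)
r2 = variance_explained(pcs, y)
print(f"R2 of phenotype on 10 PCs: {r2:.2f}")
```

The simulated structure is deliberately strong, so the R2 here is much larger than the 0.09 to 0.19 range in the table. A low R2, as reported above, is consistent with the observation that correcting for structure changed predictive ability only slightly.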
FIGURE 2. Distribution of phenotyped and predicted nonphenotyped accessions within the USDA pea germplasm collection for seed yield and plant height.
FIGURE 3. Reliability criteria for nonphenotyped lines: the top 50 genomic estimated breeding values are in blue, the bottom 50 are in red, and intermediates are in green. (A) Reliability estimates for seed yield (kg ha−1), (B) days to first flowering, (C) plant height, (D) number of seeds per plant.