Avjinder S Kaler, Larry C Purcell, Timothy Beissinger, Jason D Gillman.
Abstract
BACKGROUND: Genomic selection is a powerful tool in plant breeding. A prediction model is built from a training set with both marker and phenotype data, and the resulting genomic estimated breeding values (GEBVs) serve as predictions of breeding values in a target set that has only genotype data. There is, however, limited information on how the accuracy of genomic prediction can be optimized. The objective of this study was to evaluate 11 genomic prediction models across species in terms of prediction accuracy for two traits per species with different heritabilities, using several subsets of markers and training population proportions. The species studied were maize (Zea mays L.), soybean (Glycine max L.), and rice (Oryza sativa L.), which vary in linkage disequilibrium (LD) decay rates and have contrasting genetic architectures.
Keywords: Bayes B; Genomic estimated breeding values; Genomic selection/prediction; Maize (Zea mays L.); Rice (Oryza sativa L.); Soybean (Glycine max L.)
Year: 2022 PMID: 35219296 PMCID: PMC8881851 DOI: 10.1186/s12870-022-03479-y
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Descriptive statistics and broad sense heritability of canopy wilting (CW) and carbon isotope ratio (δ13C) in soybean, panicles per plant (PPP) and seeds per plant (SPP) in rice, and days to tasseling (DT) and ear height (EH) in maize
| Statistic | Soybean (a): CW (%) | Soybean (a): δ13C | Rice (b): PPP | Rice (b): SPP | Maize (c): DT (days) | Maize (c): EH (cm) |
|---|---|---|---|---|---|---|
| Mean | 16.99 | −29.06 | 3.24 | 4.86 | 67.58 | 61.38 |
| Standard deviation | 6.46 | 0.27 | 0.41 | 0.34 | 5.75 | 20.27 |
| Minimum | 7.5 | −29.81 | 2.23 | 3.44 | 54.5 | 8 |
| Maximum | 45.63 | −28.37 | 4.12 | 5.63 | 85 | 136 |
| Range | 38.13 | 1.46 | 1.89 | 2.19 | 30.5 | 128 |
| Sample size (n) | 346 | 346 | 352 | 352 | 279 | 279 |
| Broad sense heritability (%) | 80 | 60 | 80 | 55 | 85 | 65 |
(a) CW data from Kaler et al. [18], and δ13C data from Kaler et al. [19]
(b) Data from Zhao et al. [21]
(c) Data from Wallace et al. [20]
Number of markers in each subset. Subsets were selected by two methods: (1) linkage disequilibrium between markers, at correlation thresholds of r ≥ 0.90 (LD_90), 0.80 (LD_80), 0.70 (LD_70), 0.60 (LD_60), or 0.50 (LD_50); and (2) significance of SNP markers for the respective trait, at P-values of 0.50 (SNP_5), 0.10 (SNP_1), or 0.05 (SNP_05), plus the non-significant markers (SNP_NS). The traits evaluated were canopy wilting (CW) and carbon isotope ratio (δ13C) for soybean, panicles per plant (PPP) and seeds per plant (SPP) for rice, and days to tasseling (DT) and ear height (EH) for maize
| Crop | Trait | Complete | LD_90 | LD_80 | LD_70 | LD_60 | LD_50 | SNP_5 | SNP_1 | SNP_05 | SNP_NS |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Soybean | CW | 31,260 | 18,971 | 17,650 | 15,944 | 14,458 | 13,045 | 16,819 | 4106 | 2111 | 29,138 |
| Soybean | δ13C | 31,260 | 18,971 | 17,650 | 15,944 | 14,458 | 13,045 | 14,238 | 2174 | 919 | 30,332 |
| Rice | PPP | 34,848 | 28,390 | 26,808 | 25,437 | 23,910 | 22,107 | 15,983 | 2337 | 1043 | 33,804 |
| Rice | SPP | 34,848 | 28,390 | 26,808 | 25,437 | 23,910 | 22,107 | 13,530 | 1554 | 674 | 34,169 |
| Maize | DT | 48,833 | 42,605 | 40,951 | 39,421 | 37,824 | 36,050 | 23,836 | 4277 | 2070 | 46,763 |
| Maize | EH | 48,833 | 42,605 | 40,951 | 39,421 | 37,824 | 36,050 | 23,813 | 4257 | 2121 | 46,707 |
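The two selection rules behind these marker subsets can be illustrated with a minimal sketch. It is not the authors' procedure: this record does not say how LD blocks were formed or how the association P-values were computed. The sketch assumes greedy pruning on the absolute pairwise Pearson correlation for the LD subsets and takes the P-values as precomputed input. All function names and the toy data are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ld_prune(markers, r_threshold):
    """Greedy LD pruning (assumed rule, for illustration only).

    markers: list of genotype vectors, one per marker (e.g. 0/1/2 codes),
    all polymorphic. A marker is kept only if its |r| with every marker
    kept so far is below r_threshold. Returns the kept marker indices.
    """
    kept = []
    for j, column in enumerate(markers):
        if all(abs(pearson_r(column, markers[k])) < r_threshold for k in kept):
            kept.append(j)
    return kept


def select_significant(p_values, alpha):
    """Indices of markers whose association P-value is at most alpha."""
    return [j for j, p in enumerate(p_values) if p <= alpha]


def select_nonsignificant(p_values, alpha=0.05):
    """Indices of markers whose association P-value exceeds alpha."""
    return [j for j, p in enumerate(p_values) if p > alpha]


# Toy data: 4 markers scored on 6 individuals (hypothetical values).
toy_markers = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],  # identical to marker 0, so r = 1
    [2, 0, 1, 1, 0, 2],  # uncorrelated with marker 0
    [0, 1, 2, 0, 2, 2],  # r with marker 0 is about 0.91
]
toy_p_values = [0.001, 0.20, 0.04, 0.70]  # hypothetical precomputed P-values

ld_90 = ld_prune(toy_markers, 0.90)
ld_95 = ld_prune(toy_markers, 0.95)
snp_05 = select_significant(toy_p_values, 0.05)
snp_5 = select_significant(toy_p_values, 0.50)
snp_ns = select_nonsignificant(toy_p_values, 0.05)
```

Under this assumed rule, a lower correlation threshold removes more markers, which matches the direction of the counts in the table (LD_50 is the smallest LD-based subset for every crop).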
Marker-based narrow sense heritability (h², %) for canopy wilting (CW) and carbon isotope ratio (δ13C) in soybean, seeds per plant (SPP) and panicles per plant (PPP) in rice, and days to tasseling (DT) and ear height (EH) in maize, using 10 subsets of markers† in different training-to-testing proportions (TPS)
| Crop | Trait | TPS (%) | Com | SNP_5 | SNP_1 | SNP_05 | SNP_NS | LD_90 | LD_80 | LD_70 | LD_60 | LD_50 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Soybean | CW | 90 | 74 | 76 | 77 | 78 | 76 | 75 | 76 | 76 | 75 | 75 |
| Soybean | CW | 70 | 59 | 71 | 71 | 75 | 71 | 64 | 68 | 69 | 65 | 65 |
| Soybean | CW | 50 | 79 | 79 | 80 | 79 | 79 | 78 | 79 | 79 | 78 | 78 |
| Soybean | δ13C | 90 | 27 | 36 | 38 | 41 | 36 | 27 | 28 | 28 | 28 | 27 |
| Soybean | δ13C | 70 | 38 | 46 | 45 | 48 | 46 | 38 | 39 | 39 | 38 | 37 |
| Soybean | δ13C | 50 | 36 | 43 | 44 | 41 | 43 | 36 | 37 | 37 | 36 | 35 |
| Rice | SPP | 90 | 36 | 49 | 51 | 52 | 27 | 34 | 34 | 34 | 33 | 34 |
| Rice | SPP | 70 | 25 | 33 | 38 | 38 | 21 | 24 | 24 | 25 | 24 | 24 |
| Rice | SPP | 50 | 28 | 42 | 38 | 43 | 23 | 28 | 27 | 27 | 26 | 26 |
| Rice | PPP | 90 | 61 | 78 | 81 | 80 | 57 | 61 | 61 | 61 | 61 | 61 |
| Rice | PPP | 70 | 70 | 88 | 88 | 87 | 66 | 71 | 71 | 71 | 71 | 72 |
| Rice | PPP | 50 | 77 | 84 | 82 | 77 | 74 | 76 | 76 | 76 | 76 | 76 |
| Maize | DT | 90 | 80 | 85 | 85 | 83 | 76 | 81 | 81 | 81 | 81 | 81 |
| Maize | DT | 70 | 63 | 69 | 70 | 68 | 60 | 63 | 63 | 64 | 64 | 64 |
| Maize | DT | 50 | 97 | 99 | 99 | 98 | 88 | 98 | 98 | 98 | 98 | 98 |
| Maize | EH | 90 | 60 | 74 | 79 | 74 | 52 | 61 | 62 | 62 | 62 | 62 |
| Maize | EH | 70 | 52 | 64 | 63 | 61 | 44 | 52 | 52 | 52 | 53 | 52 |
| Maize | EH | 50 | 70 | 77 | 79 | 71 | 59 | 69 | 70 | 69 | 69 | 69 |
†Subsets of markers included a complete set (Com); SNP markers significant at P-values of 0.50 (SNP_5), 0.10 (SNP_1), or 0.05 (SNP_05); non-significant SNP markers (SNP_NS); and subsets based on linkage disequilibrium, where the correlation coefficient between markers in an LD block was ≥ 0.90 (LD_90), 0.80 (LD_80), 0.70 (LD_70), 0.60 (LD_60), or 0.50 (LD_50)
Fig. 1 Prediction accuracies of 11 genomic prediction models for panicles per plant (PPP) and seeds per plant (SPP) in rice, days to tasseling (DT) and ear height (EH) in maize, and canopy wilting (CW) and carbon isotope ratio (δ13C) in soybean, using different subsets of markers selected by two methods (linkage disequilibrium between markers and significant markers at different P-values), with cross validation for three training-to-testing proportions (90:10%, 70:30%, and 50:50%)
Fig. 2 Prediction accuracies of 11 genomic prediction models for canopy wilting (CW) and carbon isotope ratio (δ13C) in soybean, panicles per plant (PPP) and seeds per plant (SPP) in rice, and days to tasseling (DT) and ear height (EH) in maize, using a subset of significant markers at P < 0.05, with cross validation for three training-to-testing proportions (90:10%, 70:30%, and 50:50%)
Fig. 3 Effect of marker density in different subsets of markers, selected by two methods (linkage disequilibrium between markers and significant markers at different P-values), on the prediction accuracy of the BayesB model for three traits, canopy wilting (CW) in soybean, seeds per plant (SPP) in rice, and ear height (EH) in maize, using cross validation with a training-to-testing proportion of 90:10%
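The cross-validation scheme named in these captions can be sketched as follows, under several assumptions that this record does not confirm. Prediction accuracy is taken to be the Pearson correlation between predicted and observed phenotypes in the testing set, which is a common convention. Ridge regression on marker codes stands in for a genomic prediction model and is not claimed to be one of the 11 models evaluated. The data are simulated, and every name and parameter value (`cv_accuracy`, `lam`, `n_reps`, the seed) is hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Ridge regression on centered data; returns (intercept, marker effects)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Normal equations with ridge penalty: (Xc'Xc + lam * I) beta = Xc'yc
    A = [[sum(Xc[i][j] * Xc[i][k] for i in range(n)) + (lam if j == k else 0.0)
          for k in range(p)] for j in range(p)]
    b = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    beta = solve(A, b)
    intercept = y_mean - sum(means[j] * beta[j] for j in range(p))
    return intercept, beta


def cv_accuracy(X, y, train_fraction, n_reps=5, lam=1.0, seed=0):
    """Mean test-set correlation over repeated random training/testing splits."""
    rng = random.Random(seed)
    n = len(y)
    n_train = round(n * train_fraction)
    accuracies = []
    for _ in range(n_reps):
        idx = list(range(n))
        rng.shuffle(idx)
        train, test = idx[:n_train], idx[n_train:]
        intercept, beta = ridge_fit([X[i] for i in train], [y[i] for i in train], lam)
        predicted = [intercept + sum(g * e for g, e in zip(X[i], beta)) for i in test]
        accuracies.append(pearson_r(predicted, [y[i] for i in test]))
    return sum(accuracies) / len(accuracies)


# Simulated data (hypothetical): 100 individuals, 20 markers coded 0/1/2.
sim = random.Random(42)
effects = [sim.gauss(0, 1) for _ in range(20)]
genotypes = [[sim.choice([0, 1, 2]) for _ in range(20)] for _ in range(100)]
phenotypes = [sum(g * e for g, e in zip(row, effects)) + sim.gauss(0, 1)
              for row in genotypes]

# Training fractions matching the 90:10, 70:30, and 50:50 proportions.
accuracy_by_split = {f: cv_accuracy(genotypes, phenotypes, f) for f in (0.9, 0.7, 0.5)}
```

Swapping `ridge_fit` for another model and restricting `genotypes` to a marker subset would give the kind of model-by-subset-by-proportion comparison the figures summarize. The numbers produced here come from simulated data and say nothing about the study's results.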