Dongdong Li1, Zhenxiang Xu2, Riliang Gu2, Pingxi Wang1, Jialiang Xu1, Dengxiang Du3, Junjie Fu1, Jianhua Wang2, Hongwei Zhang1, Guoying Wang1.
Abstract
Genomic prediction (GP) across different populations and environments should be enhanced to increase the efficiency of crop breeding. In this study, four populations were constructed and genotyped with DNA chips containing 55,000 SNPs. These populations were testcrossed to a common tester, generating four hybrid populations, and the yields of the four hybrid populations were evaluated in three environments. Using real data, we demonstrated that the prediction accuracies of GP across structured hybrid populations were lower than those of within-population GP. Including relatives of the validation population in the training population could drastically increase the prediction accuracies of GP across structured hybrid populations. G × E models (including main and genotype-by-environment effects) performed better than single-environment (within-environment) and across-environment (main effect only) GP models in the structured hybrid population, especially in the environment where yield had higher heritability. Implementing G × E models in two cross-validation schemes indicated that, to increase the prediction accuracy for a new hybrid line, it would be better to field-test the hybrid line in at least one environment. Our results should be helpful for designing training populations and planning field testing in hybrid breeding.
Keywords: genomic prediction; genotype by environment; hybrid prediction; maize; yield per plant
Year: 2021 PMID: 34207722 PMCID: PMC8227059 DOI: 10.3390/plants10061174
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
Summary of basic statistics of yields of the four populations.
| Population | N | Mean (Kg) | SD (Kg) | Minimum (Kg) | Maximum (Kg) | Range (Kg) | CV (%) |
|---|---|---|---|---|---|---|---|
| Population 1 | 475 | 753.71 | 43.55 | 641.95 | 947.32 | 305.37 | 5.78 |
| Population 2 | 72 | 815.52 | 54.11 | 691.36 | 959.49 | 268.12 | 6.63 |
| Population 3 | 60 | 656.97 | 34.87 | 587.86 | 745.70 | 157.83 | 5.31 |
| Population 4 | 68 | 687.40 | 58.63 | 539.14 | 840.69 | 301.55 | 8.53 |
Note: N is the population size, SD is standard deviation, and CV is coefficient of variation. Population 1 to population 4 are introduced in detail in the Materials and Methods.
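As a quick consistency check on the summary table, the range and CV columns can be recomputed from the tabulated mean, SD, minimum, and maximum. The sketch below assumes CV is defined as 100 × SD / mean; the function names are illustrative only. Because the inputs are already rounded to two decimals, the recomputed values may differ from the table in the last digit.

```python
# Summary statistics copied from the table above: (mean, SD, minimum, maximum).
stats = {
    "Population 1": (753.71, 43.55, 641.95, 947.32),
    "Population 2": (815.52, 54.11, 691.36, 959.49),
    "Population 3": (656.97, 34.87, 587.86, 745.70),
    "Population 4": (687.40, 58.63, 539.14, 840.69),
}


def cv_percent(mean, sd):
    """Coefficient of variation, expressed as a percentage of the mean."""
    return 100.0 * sd / mean


def value_range(minimum, maximum):
    """Difference between the largest and smallest observed value."""
    return maximum - minimum


for name, (mean, sd, lo, hi) in stats.items():
    print(f"{name}: CV = {cv_percent(mean, sd):.2f}%, range = {value_range(lo, hi):.2f}")
```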
Variance dissection of yield of the four hybrid populations.
| Population | Source | DF | SS | MS | F | p | H² | R² (%) |
|---|---|---|---|---|---|---|---|---|
| Population 1 | Rep/Env | 3 | 122,714.40 | 40,904.80 | 11.24 | 0.00 | 0.63 | 91.17 |
| | Genotype | 474 | 5,459,187.50 | 11,517.27 | 3.17 | 0.00 | | |
| | Environment | 2 | 39,391,468.00 | 19,695,734.00 | 5413.62 | 0.00 | | |
| | G by E | 935 | 4,392,113.50 | 4697.45 | 1.29 | 0.00 | | |
| | Error | 1314 | 4,780,569.00 | 3638.18 | | | | |
| Population 2 | Rep/Env | 3 | 24,789.26 | 8263.09 | 2.23 | 0.09 | 0.66 | 85.47 |
| | Genotype | 71 | 957,095.81 | 13,480.22 | 3.63 | 0.00 | | |
| | Environment | 2 | 2,448,691.25 | 1,224,345.63 | 329.97 | 0.00 | | |
| | G by E | 130 | 651,743.31 | 5013.41 | 1.35 | 0.03 | | |
| | Error | 187 | 693,851.19 | 3710.43 | | | | |
| Population 3 | Rep/Env | 3 | 3349.01 | 1116.34 | 0.34 | 0.80 | 0.62 | 84.52 |
| | Genotype | 59 | 548,701.06 | 9300.02 | 2.83 | 0.00 | | |
| | Environment | 2 | 1,797,173.38 | 898,586.69 | 273.06 | 0.00 | | |
| | G by E | 118 | 431,993.81 | 3660.96 | 1.11 | 0.27 | | |
| | Error | 156 | 513,373.16 | 3290.85 | | | | |
| Population 4 | Rep/Env | 3 | 15,394.80 | 5131.60 | 1.56 | 0.20 | 0.74 | 88.79 |
| | Genotype | 67 | 1,310,953.50 | 19,566.47 | 5.95 | 0.00 | | |
| | Environment | 2 | 2,757,321.25 | 1,378,660.63 | 419.16 | 0.00 | | |
| | G by E | 133 | 762,263.31 | 5731.30 | 1.74 | 0.00 | | |
| | Error | 186 | 611,768.25 | 3289.08 | | | | |
Note: DF is degrees of freedom; SS is sum of squares; MS is mean square; F is the F value of the F-test; p is the p value of the F-test; H² is broad-sense heritability; R² is the multiple R-squared of the fitted linear model. Rep/Env is the replicate effect nested in each environment.
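The relationship between the ANOVA columns can be illustrated with the Population 1 rows. The sketch below assumes that each mean square is SS / DF and that every effect is tested against the residual error mean square; the tabulated values are consistent with that assumption, although the source does not state the test denominators explicitly. The function names are illustrative.

```python
# Mean squares for Population 1, copied from the ANOVA table above.
MS_ERROR = 3638.18
mean_squares = {
    "Rep/Env": 40904.80,
    "Genotype": 11517.27,
    "Environment": 19695734.00,
    "G by E": 4697.45,
}


def mean_square(ss, df):
    """Mean square: sum of squares divided by its degrees of freedom."""
    return ss / df


def f_ratio(ms_effect, ms_error):
    """F statistic, assuming the effect is tested against the error mean square."""
    return ms_effect / ms_error


# Genotype row: SS = 5,459,187.50 on 474 degrees of freedom.
print(f"Genotype MS = {mean_square(5459187.50, 474):.2f}")

for source, ms in mean_squares.items():
    print(f"{source}: F = {f_ratio(ms, MS_ERROR):.2f}")
```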
Figure 1. Principal component (PC) analysis and genetic similarity among the four populations. (a) PC analysis of the four populations on the basis of 18,702 SNP markers; (b) genetic similarity heatmap showing the genetic relatedness among the four populations.
Figure 2. GP across different populations. (a) Prediction accuracy (PA) of within-population GP, which was performed with five-fold cross-validation (CV) repeated 100 times. A and AD are the GBLUP models including only the additive effect and additive plus dominance effects, respectively; Pop1-Pop4 are populations 1–4, and "all pops" indicates that all populations were used together to perform within-population GP; (b) PA of the one-to-one prediction scheme. The lower name on the x-axis is the training population, and the upper name is the validation population; (c) PA of the three-to-one prediction scheme. The population on the x-axis is the validation population, and the remaining three populations were used together as the training population.
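The within-population scheme in panel (a) can be sketched as an additive GBLUP with five-fold cross-validation, where PA is the Pearson correlation between predicted and observed values in each held-out fold. This is a minimal illustration under stated assumptions, not the authors' pipeline: the genotypes and phenotypes are simulated, the relationship matrix uses the VanRaden form, and the heritability `h2` is fixed rather than estimated from the data. All function names are hypothetical.

```python
import numpy as np


def vanraden_grm(M):
    """Additive genomic relationship matrix from a lines x SNPs matrix coded 0/1/2."""
    p = M.mean(axis=0) / 2.0                 # allele frequency of each marker
    Z = M - 2.0 * p                          # centre each marker column
    return Z @ Z.T / (2.0 * np.sum(p * (1.0 - p)))


def gblup_predict(G, y, train, test, h2=0.5):
    """Predict values of the test lines from the training phenotypes.

    h2 is an assumed heritability; (1 - h2) / h2 acts as the shrinkage parameter.
    """
    lam = (1.0 - h2) / h2
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]           # relationships among training lines
    G_vt = G[np.ix_(test, train)]            # relationships between test and training lines
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G_vt @ alpha


def five_fold_accuracy(G, y, rng, k=5):
    """Mean Pearson correlation between predictions and observations over k folds."""
    idx = rng.permutation(len(y))
    accuracies = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)      # all lines not in the held-out fold
        pred = gblup_predict(G, y, train, fold)
        accuracies.append(np.corrcoef(pred, y[fold])[0, 1])
    return float(np.mean(accuracies))


# Simulated data (assumption): 120 lines, 300 SNPs, additive effects plus noise.
rng = np.random.default_rng(0)
n, m = 120, 300
M = rng.binomial(2, 0.3, size=(n, m)).astype(float)
g = M @ rng.normal(0.0, 1.0, m)
y = g + rng.normal(0.0, g.std(), n)

G = vanraden_grm(M)
print("five-fold PA:", round(five_fold_accuracy(G, y, rng), 3))
```

The one-to-one and three-to-one schemes in panels (b) and (c) would reuse `gblup_predict` with `train` and `test` set to the indices of whole populations instead of random folds.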
Figure 3. PA of the single-environment (SE), across-environment (AE), and G × E models for CV1 (a) and CV2 (b). The data inside and outside the brackets are the training and validation datasets, respectively.
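The two cross-validation schemes can be pictured as masks over a lines × environments grid. The sketch below assumes the commonly used definitions, which this record does not spell out: in CV1 a validation line is unobserved in every environment (a new, untested hybrid), while in CV2 a validation line is hidden in one environment but remains observed in the others. The grid size (68 lines, 3 environments) mirrors Population 4; the function names and the 20% validation fraction are illustrative.

```python
import random


def cv1_mask(n_lines, n_envs, frac, rng):
    """CV1: selected lines are hidden in all environments (True = validation cell)."""
    hidden = set(rng.sample(range(n_lines), int(round(frac * n_lines))))
    return [[i in hidden for _ in range(n_envs)] for i in range(n_lines)]


def cv2_mask(n_lines, n_envs, frac, rng):
    """CV2: selected lines are hidden in exactly one environment each."""
    hidden = rng.sample(range(n_lines), int(round(frac * n_lines)))
    mask = [[False] * n_envs for _ in range(n_lines)]
    for i in hidden:
        mask[i][rng.randrange(n_envs)] = True   # the other environments stay observed
    return mask


rng = random.Random(42)
m1 = cv1_mask(68, 3, 0.2, rng)
m2 = cv2_mask(68, 3, 0.2, rng)

print("CV1 validation cells:", sum(map(sum, m1)))
print("CV2 validation cells:", sum(map(sum, m2)))
```

Under these definitions, a model evaluated with CV2 can borrow information from the same line's records in the other environments, which CV1 does not allow.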