Rui Guo, Thanda Dhliwayo, Edna K Mageto, Natalia Palacios-Rojas, Michael Lee, Diansi Yu, Yanye Ruan, Ao Zhang, Felix San Vicente, Michael Olsen, Jose Crossa, Boddupalli M Prasanna, Lijun Zhang, Xuecai Zhang.
Abstract
Enriching the kernel zinc (Zn) concentration of maize is one of the most effective ways to address Zn deficiency in low- and middle-income countries where maize is the major staple food; 17% of the global population is affected by Zn deficiency. Genomic selection (GS) has been shown to be an effective approach to accelerate genetic gains in plant breeding. In the present study, an association-mapping panel and two maize doubled-haploid (DH) populations, genotyped with both genotyping-by-sequencing (GBS) and repeat amplification sequencing (rAmpSeq) markers, were used to estimate the genomic prediction accuracy of kernel Zn concentration in maize. Results showed that the prediction accuracy in the two DH populations was higher than that in the association-mapping panel using the same set of markers. Within the same population, the prediction accuracy estimated with the GBS markers was significantly higher than that estimated with the rAmpSeq markers. The maximum prediction accuracy with minimum standard error was observed when half of the genotypes were included in the training set and 3,000 and 500 markers were used for prediction in the association-mapping panel and the DH populations, respectively. Appropriate levels of minor allele frequency (MAF) and missing rate should be selected to achieve good prediction accuracy and to reduce the computational burden by balancing the number of markers against marker quality. Developing a training set with broad phenotypic variation may improve prediction accuracy. The transferability of the GS models across populations was also assessed: the prediction accuracies for a few population pairs were above or close to 0.20, which indicates that prediction accuracies across years and populations need to be assessed in further studies using a larger breeding dataset with a closer relationship between the training and prediction sets. GS outperformed marker-assisted selection (MAS) in predicting kernel Zn concentration in maize; whether to implement GS alone or to implement MAS and GS stepwise for improving kernel Zn concentration requires further research. The results of this study provide valuable information for understanding how to implement GS for improving kernel Zn concentration in maize.
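To make the notion of "prediction accuracy" concrete, the following is a minimal sketch of fivefold cross-validated genomic prediction. The abstract does not name the statistical model, so ridge regression on marker codes is used here as an assumed stand-in; the function names, the 0/2 marker coding, the penalty `lam`, and the simulated data are all illustrative and are not taken from the study. Accuracy is computed as the Pearson correlation between predicted and observed phenotypes in each held-out fold.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on marker codes (assumed model, not the paper's)."""
    mu = y_train.mean()
    col_means = X_train.mean(axis=0)
    Xc = X_train - col_means
    # Dual form: solve an n x n system, which is cheap when markers >> lines.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y_train)), y_train - mu)
    marker_effects = Xc.T @ alpha
    return mu + (X_test - col_means) @ marker_effects

def cv_accuracy(X, y, k=5, lam=1.0, seed=0):
    """Mean Pearson correlation between predictions and observations over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    accs = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        pred = ridge_predict(X[train_idx], y[train_idx], X[test_idx], lam)
        accs.append(np.corrcoef(pred, y[test_idx])[0, 1])
    return float(np.mean(accs))

# Simulated inbred lines (hypothetical data): markers coded 0 or 2.
rng = np.random.default_rng(1)
n_lines, n_markers = 200, 300
X = 2.0 * rng.integers(0, 2, size=(n_lines, n_markers))
effects = rng.normal(0.0, 1.0, n_markers)
g = X @ effects                                   # simulated genetic values
y = g + rng.normal(0.0, 0.5 * g.std(), n_lines)   # add non-genetic noise

acc = cv_accuracy(X, y, k=5, lam=100.0)
print(f"mean fivefold prediction accuracy: {acc:.2f}")
```

In practice the penalty would be estimated from the data (for example through variance components in a mixed model) rather than fixed by hand as it is here.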
Keywords: GBS; genomic selection; kernel Zn concentration; maize; rAmpSeq
Year: 2020 PMID: 32457778 PMCID: PMC7225839 DOI: 10.3389/fpls.2020.00534
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Basic information on the three populations (the DTMA association mapping panel and two DH populations, DH1 and DH2), including population size, names of the parents of the DH populations, mean, minimum, maximum, and standard deviation of kernel Zn concentration in each population, heritability (h2), number of locations, and number of replications.
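| Population | Population size | Parent 1 | Parent 2 | Mean | Minimum | Maximum | Standard deviation | h2 | Number of locations | Number of replications |
|---|---|---|---|---|---|---|---|---|---|---|
| DTMA | 236 | | | 27.11 | 18.35 | 39.53 | 3.41 | 0.84 | 3 | 6 |
| DH1 | 108 | CML503 | CLWN201 | 24.59 | 16.87 | 36.45 | 4.01 | 0.75 | 3 | 4 |
| DH2 | 143 | CML465 | CML451 | 25.59 | 18.38 | 37.93 | 3.50 | 0.62 | 3 | 4 |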
FIGURE 1. Phenotype distribution of maize kernel Zn concentration of: (A) all the inbred lines across the three populations used in the present study; (B) all the inbred lines in the DTMA panel; (C) all the inbred lines in the DH1 population; (D) all the inbred lines in the DH2 population.
FIGURE 2. Distribution of MAF and missing rate across all three populations before and after filtering: (A) MAF distribution of the GBS marker dataset before filtering; (B) MAF distribution of the GBS marker dataset after filtering; (C) MAF distribution of the rAmpSeq marker dataset before filtering; (D) MAF distribution of the rAmpSeq marker dataset after filtering; (E) missing rate distribution of the GBS marker dataset before filtering; (F) missing rate distribution of the GBS marker dataset after filtering.
FIGURE 3. Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population estimated with the GBS and rAmpSeq marker datasets.
FIGURE 4. Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population, when the training population size was set from 10 to 90% of the total genotypes, with an interval of 10%: (A) in the DTMA panel estimated with GBS markers; (B) in the DTMA panel estimated with rAmpSeq markers; (C) in the DH1 population estimated with GBS markers; (D) in the DH1 population estimated with rAmpSeq markers; (E) in the DH2 population estimated with GBS markers; (F) in the DH2 population estimated with rAmpSeq markers.
FIGURE 5. Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population, with the number of markers varying from 10 to all markers: (A) in the GBS marker dataset; (B) in the rAmpSeq dataset.
Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population, estimated from GBS marker datasets filtered at different missing rate and MAF thresholds.
| Missing rate | MAF | DTMA: number of markers | DTMA: prediction accuracy | DH1: number of markers | DH1: prediction accuracy | DH2: number of markers | DH2: prediction accuracy |
|---|---|---|---|---|---|---|---|
| 0% | 0.10 | 9656 | 0.39 | 14318 | 0.66 | 504 | 0.66 |
| 0% | 0.20 | 5681 | 0.37 | 12294 | 0.65 | 495 | 0.66 |
| 0% | 0.30 | 3314 | 0.35 | 6961 | 0.65 | 445 | 0.66 |
| 0% | 0.40 | 1570 | 0.31 | 4330 | 0.44 | 329 | 0.51 |
| 20% | 0.10 | 201258 | 0.39 | 64617 | 0.65 | 45440 | 0.66 |
| 20% | 0.20 | 129155 | 0.42 | 55241 | 0.65 | 44080 | 0.65 |
| 20% | 0.30 | 79223 | 0.42 | 31658 | 0.67 | 39934 | 0.65 |
| 20% | 0.40 | 37999 | 0.41 | 19029 | 0.43 | 29699 | 0.57 |
| 40% | 0.10 | 252221 | 0.43 | 94925 | 0.66 | 65411 | 0.66 |
| 40% | 0.20 | 162792 | 0.43 | 80983 | 0.63 | 62977 | 0.66 |
| 40% | 0.30 | 100366 | 0.43 | 47682 | 0.66 | 56738 | 0.67 |
| 40% | 0.40 | 48312 | 0.42 | 26995 | 0.46 | 40970 | 0.58 |
| 60% | 0.10 | 275811 | 0.42 | 120842 | 0.63 | 80595 | 0.65 |
| 60% | 0.20 | 178326 | 0.42 | 103032 | 0.65 | 76640 | 0.66 |
| 60% | 0.30 | 109965 | 0.43 | 62590 | 0.64 | 68022 | 0.65 |
| 60% | 0.40 | 52972 | 0.43 | 35050 | 0.57 | 48562 | 0.58 |
| 80% | 0.10 | 285127 | 0.43 | 137892 | 0.64 | 89870 | 0.67 |
| 80% | 0.20 | 184501 | 0.42 | 116838 | 0.64 | 84845 | 0.65 |
| 80% | 0.30 | 113833 | 0.41 | 71836 | 0.65 | 74818 | 0.65 |
| 80% | 0.40 | 54893 | 0.41 | 39930 | 0.57 | 52811 | 0.56 |
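The filtering behind the table above can be sketched as follows. This is an illustration only: the genotype coding (allele dosage 0/1/2 with `np.nan` for a missing call), the function name `filter_markers`, and the reading of the thresholds as a maximum missing rate and a minimum MAF are assumptions, and the toy matrix is invented for the example.

```python
import numpy as np

def filter_markers(G, max_missing=0.2, min_maf=0.1):
    """Return (filtered matrix, boolean keep-mask) for a lines x markers matrix.

    G holds allele dosages coded 0/1/2, with np.nan for missing calls
    (an assumed coding, not one stated in the source).
    """
    G = np.asarray(G, dtype=float)
    n_called = (~np.isnan(G)).sum(axis=0)
    missing_rate = 1.0 - n_called / G.shape[0]
    # Frequency of the counted allele among non-missing calls.
    # Markers with no calls get frequency 0 and fail the MAF test.
    freq = np.nansum(G, axis=0) / (2.0 * np.maximum(n_called, 1))
    maf = np.minimum(freq, 1.0 - freq)
    keep = (missing_rate <= max_missing) & (maf >= min_maf)
    return G[:, keep], keep

nan = np.nan
# Toy genotype matrix: 4 lines x 4 markers (hypothetical values).
G = np.array([
    [0, 2, 0, nan],
    [2, 2, 0, nan],
    [0, 2, 2, 0],
    [2, nan, 0, 2],
])

# Strict on missing data, lenient on MAF.
G_strict, keep_strict = filter_markers(G, max_missing=0.2, min_maf=0.1)
# Lenient on missing data, stricter on MAF.
G_loose, keep_loose = filter_markers(G, max_missing=0.6, min_maf=0.3)
print(keep_strict, keep_loose)
```

Relaxing the missing rate threshold keeps more markers, while raising the MAF threshold removes more, which matches the direction of the marker counts in the table.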
Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population, estimated from rAmpSeq marker datasets filtered at different MAF thresholds.
| Population | MAF | Number of markers | Prediction accuracy |
|---|---|---|---|
| DTMA | 0.10 | 4847 | 0.35 |
| DTMA | 0.20 | 2960 | 0.35 |
| DTMA | 0.30 | 1731 | 0.31 |
| DTMA | 0.40 | 811 | 0.27 |
| DH1 | 0.10 | 3722 | 0.61 |
| DH1 | 0.20 | 3098 | 0.61 |
| DH1 | 0.30 | 1723 | 0.62 |
| DH1 | 0.40 | 1077 | 0.48 |
| DH2 | 0.10 | 2588 | 0.50 |
| DH2 | 0.20 | 2392 | 0.49 |
| DH2 | 0.30 | 2096 | 0.52 |
| DH2 | 0.40 | 1429 | 0.45 |
FIGURE 6. Genomic prediction accuracies of kernel Zn concentration in the DTMA panel, DH1 population, and DH2 population, when the training population was formed by sampling the same percentage of genotypes with random selection (Random), with selection from the bottom tail (Bottom), with selection from the top tail (Top), with selection from the middle part (Middle), and with selection from the two tails (Two tails). The training population ranged from 10 to 90% of the total genotypes, with an interval of 20%: (A) in the DTMA panel estimated with GBS markers; (B) in the DTMA panel estimated with rAmpSeq markers; (C) in the DH1 population estimated with GBS markers; (D) in the DH1 population estimated with rAmpSeq markers; (E) in the DH2 population estimated with GBS markers; (F) in the DH2 population estimated with rAmpSeq markers.
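The five sampling schemes named in the Figure 6 caption can be sketched as index selections on the ranked phenotypes. The function name `select_training`, the even split between the two tails, the centring of the middle window, and the example Zn values are assumptions made for illustration; the source does not specify these details.

```python
import numpy as np

def select_training(y, fraction, strategy="random", seed=0):
    """Return sorted indices of the lines chosen for the training set."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    n_train = int(round(fraction * n))
    order = np.argsort(y)  # indices ranked from lowest to highest phenotype
    if strategy == "random":
        idx = np.random.default_rng(seed).choice(n, size=n_train, replace=False)
    elif strategy == "bottom":
        idx = order[:n_train]
    elif strategy == "top":
        idx = order[n - n_train:]
    elif strategy == "middle":
        start = (n - n_train) // 2          # centre the window (assumed)
        idx = order[start:start + n_train]
    elif strategy == "two_tails":
        n_low = n_train // 2                # even split between tails (assumed)
        n_high = n_train - n_low
        idx = np.concatenate([order[:n_low], order[n - n_high:]])
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return np.sort(idx)

# Hypothetical kernel Zn values for ten lines.
zn = np.array([24.6, 18.4, 31.2, 27.1, 36.5, 22.0, 29.9, 25.6, 33.0, 20.3])

for name in ["random", "bottom", "top", "middle", "two_tails"]:
    train = select_training(zn, fraction=0.4, strategy=name)
    print(name, train.tolist())
```

The lines not returned by `select_training` would form the prediction set, so tail-based schemes train on extreme phenotypes and predict the remainder.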
Genomic prediction accuracies between pairwise populations estimated from the GBS and rAmpSeq marker datasets.
| Training population | Prediction population | GBS | rAmpSeq |
|---|---|---|---|
| DTMA | DH1 | 0.30 | 0.19 |
| DTMA | DH2 | −0.12 | −0.13 |
| DH1 | DTMA | 0.05 | 0.07 |
| DH1 | DH2 | 0.05 | 0.15 |
| DH2 | DTMA | −0.06 | −0.04 |
| DH2 | DH1 | −0.02 | 0.24 |
Comparison of the prediction accuracy of GS and MAS, estimated with a fivefold cross-validation scheme within each of the three populations. The prediction accuracies of GS were estimated from the filtered GBS dataset, and the prediction accuracies of MAS were estimated from the significantly associated SNPs.
| DTMA | 0.61 | 0.15 | 0.40 | 0.09 | 0.53 | −0.16 | 0.22 | 0.12 |
| DH1 | 0.88 | 0.31 | 0.64 | 0.12 | 0.82 | −0.02 | 0.49 | 0.16 |
| DH2 | 0.85 | 0.42 | 0.65 | 0.09 | 0.76 | 0.01 | 0.42 | 0.14 |