Ahmad H Sallam, Emily Conley, Dzianis Prakapenka, Yang Da, James A Anderson.
Abstract
The use of haplotypes may improve the accuracy of genomic prediction over single SNPs because haplotypes can better capture linkage disequilibrium and genomic similarity in different lines and may capture local high-order allelic interactions. Additionally, prediction accuracy could be improved by portraying population structure in the calibration set. A set of 383 advanced lines and cultivars that represent the diversity of the University of Minnesota wheat breeding program was phenotyped for yield, test weight, and protein content and genotyped using the Illumina 90K SNP Assay. Population structure was confirmed using single SNPs. Haplotype blocks of 5, 10, 15, and 20 adjacent markers were constructed for all chromosomes. A multi-allelic haplotype prediction algorithm was implemented and compared with single SNPs using both k-fold cross validation and stratified sampling optimization. After confirming population structure, the stratified sampling improved the predictive ability compared with k-fold cross validation for yield and protein content, but reduced the predictive ability for test weight. In all cases, haplotype predictions outperformed single SNPs. Haplotypes of 15 adjacent markers showed the best improvement in accuracy for all traits; however, this was more pronounced in yield and protein content. The combined use of haplotypes of 15 adjacent markers and training population optimization significantly improved the predictive ability for yield and protein content by 14.3% (four percentage points) and 16.8% (seven percentage points), respectively, compared with using single SNPs and k-fold cross validation. These results emphasize the effectiveness of using haplotypes in genomic selection to increase genetic gain in self-fertilized crops.
Keywords: GenPred; Shared data resources; genomic selection; haplotype prediction; plant breeding; quantitative trait loci; training population optimization; wheat
Year: 2020 PMID: 32371453 PMCID: PMC7341132 DOI: 10.1534/g3.120.401165
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Estimated genetic variance, residual variance, and broad-sense heritability (H) for yield, test weight, and protein content in the Minnesota wheat genomic selection panel
| Trait | Genetic variance | Residual variance | H |
|---|---|---|---|
| Yield (kg/ha) | 33737 | 168275 | 0.29 |
| Test weight (kg/hL) | 1.20 | 1.19 | 0.67 |
| Protein (%) | 0.28 | 0.27 | 0.68 |
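The H values in this table are not simply genetic variance divided by total variance: for test weight, 1.20 / (1.20 + 1.19) is about 0.50, not 0.67. One reading that approximately reproduces the reported column is an entry-mean heritability, in which the residual variance is divided by a number of replicates or environments r. The sketch below assumes that formula and r = 2. Neither is stated in this record, so both are assumptions, and the function name is hypothetical.

```python
def entry_mean_heritability(var_g, var_e, r):
    """Entry-mean heritability: var_g / (var_g + var_e / r).

    r is an ASSUMED number of replicates/environments, not given in the source.
    """
    return var_g / (var_g + var_e / r)


# (genetic variance, residual variance, reported H) copied from the table
table = {
    "Yield (kg/ha)": (33737, 168275, 0.29),
    "Test weight (kg/hL)": (1.20, 1.19, 0.67),
    "Protein (%)": (0.28, 0.27, 0.68),
}

R_ASSUMED = 2  # assumption chosen because it roughly matches the reported H

estimates = {
    trait: entry_mean_heritability(var_g, var_e, R_ASSUMED)
    for trait, (var_g, var_e, _reported) in table.items()
}

for trait, h in estimates.items():
    print(f"{trait}: computed H = {h:.3f}, reported H = {table[trait][2]}")
```

Under this assumption the computed values fall within about 0.01 of the reported ones. The variance components are themselves rounded, so an exact match should not be expected.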
Genetic relationship (A) between and within clusters, and average predictive ability, when using the cluster in two training populations to predict another cluster, for yield, test weight, and protein based on single markers. The last four columns report predictive ability.
| Clusters | Number of individuals | A between clusters | A within cluster | Yield | Test weight | Protein | Ave. across traits |
|---|---|---|---|---|---|---|---|
| Cluster 1 | 176 | −0.12 ± 0.001 | 0.13 ± 0.002 | 0.32 | 0.38 | 0.28 | 0.33 |
| Cluster 2 | 89 | −0.16 ± 0.001 | 0.51 ± 0.005 | 0.29 | 0.34 | 0.19 | 0.27 |
| Cluster 3 | 118 | −0.14 ± 0.001 | 0.28 ± 0.003 | 0.28 | 0.31 | 0.23 | 0.27 |
| Average for each trait | | | | 0.30 | 0.34 | 0.23 | |
Figure 1. Population stratification of the Minnesota wheat genomic selection (MN-WGS) panel of 383 wheat lines inferred from K-means clustering in which three clusters were identified and visualized on principal component analysis. Cluster 1 is shown in blue, cluster 2 in red, and cluster 3 in green.
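The three clusters are the strata that a stratified sampling scheme can draw from. The record does not describe the exact optimization procedure, so the following is only a minimal sketch of one common form: sampling the training set from each cluster in proportion to its size. The function name, the line IDs, the 80% training fraction, and the seed are all hypothetical; only the cluster sizes (176, 89, 118) come from the source.

```python
import random


def stratified_split(cluster_of, train_fraction, seed=0):
    """Split line IDs into training and validation sets, cluster by cluster.

    cluster_of: dict mapping line ID -> cluster label.
    Each cluster contributes round(train_fraction * cluster size) training lines.
    """
    rng = random.Random(seed)
    by_cluster = {}
    for line_id, cluster in cluster_of.items():
        by_cluster.setdefault(cluster, []).append(line_id)

    train, valid = [], []
    for cluster in sorted(by_cluster):
        members = sorted(by_cluster[cluster])  # fixed order before shuffling
        rng.shuffle(members)
        n_train = round(train_fraction * len(members))
        train.extend(members[:n_train])
        valid.extend(members[n_train:])
    return train, valid


# Hypothetical line IDs; only the cluster sizes are taken from the source.
cluster_sizes = {1: 176, 2: 89, 3: 118}
cluster_of = {
    f"c{cluster}_line{i}": cluster
    for cluster, n in cluster_sizes.items()
    for i in range(n)
}

train, valid = stratified_split(cluster_of, train_fraction=0.8)
print(len(train), len(valid))
```

By contrast, plain k-fold cross validation assigns lines to folds at random without regard to cluster, so a small cluster can be under-represented in a given training fold.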
Figure 2. Heatmap for the additive genetic relationship matrix displaying genetic relatedness among lines in the MN-WGS panel with the corresponding clusters identified using K-means clustering.
Figure 3. The predictive ability for yield, test weight, and protein content using single markers, haplotype blocks of 5 adjacent markers (Haploblock-5), haplotype blocks of 10 adjacent markers (Haploblock-10), haplotype blocks of 15 adjacent markers (Haploblock-15), and haplotype blocks of 20 adjacent markers (Haploblock-20). The two validation methods used are k-fold cross validation and the stratified sampling optimization. A star over the error bar indicates a significant difference in the predictive ability between the haplotype and single markers for the same validation method.
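To illustrate what a block of adjacent markers means in practice, the sketch below groups consecutive markers on one chromosome into fixed-size blocks and gives every distinct haplotype in a block its own integer allele code, which is what makes the coding multi-allelic. This covers only block construction, not the prediction model, and it is not taken from the authors' software. It assumes fully inbred lines with one 0/1 allele per marker; the function name and the toy genotypes are hypothetical.

```python
def build_haplotype_blocks(genotypes, block_size):
    """Code blocks of adjacent markers as multi-allelic haplotypes.

    genotypes: dict mapping line name -> list of 0/1 alleles for one chromosome,
               with all lists the same length (inbred lines assumed).
    Returns (block_alleles, coded):
      block_alleles[b] maps each distinct haplotype in block b to an integer code,
      coded[line][b] is the code carried by that line in block b.
    If the marker count is not a multiple of block_size, the last block is shorter.
    """
    n_markers = len(next(iter(genotypes.values())))
    starts = range(0, n_markers, block_size)
    block_alleles = [dict() for _ in starts]
    coded = {}
    for line in sorted(genotypes):
        alleles = genotypes[line]
        codes = []
        for b, start in enumerate(starts):
            haplotype = tuple(alleles[start:start + block_size])
            # A haplotype not seen before in this block gets the next free code.
            code = block_alleles[b].setdefault(haplotype, len(block_alleles[b]))
            codes.append(code)
        coded[line] = codes
    return block_alleles, coded


# Toy data: three hypothetical lines, six markers, blocks of three markers.
toy = {
    "lineA": [0, 0, 1, 1, 0, 1],
    "lineB": [0, 0, 1, 0, 0, 1],
    "lineC": [1, 0, 1, 1, 0, 1],
}
block_alleles, coded = build_haplotype_blocks(toy, block_size=3)
print(coded)
```

In the toy data, lineA and lineB share a haplotype in the first block but differ in the second, a similarity pattern that single-marker coding would spread across separate columns. Calling the function with `block_size` set to 5, 10, 15, or 20 would correspond to the Haploblock-5 to Haploblock-20 settings named in the caption.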