Elsa Sverrisdóttir¹, Ea Høegh Riis Sundmark¹, Heidi Øllegaard Johnsen¹, Hanne Grethe Kirk², Torben Asp³, Luc Janss⁴, Glenn Bryan⁵, Kåre Lehmann Nielsen¹.
Abstract
Genomic selection (GS) is becoming increasingly applicable to crops as genotyping costs continue to decrease, making it an attractive alternative to traditional selective breeding based on observed phenotypes. With genome-wide molecular markers, selection based on predictions from genotypes can be made in the absence of direct phenotyping. The reliability of predictions depends strongly on the number of individuals used for training the predictive algorithms, particularly in a highly genetically diverse organism such as potato; however, the relationship between the individuals also has an enormous impact on prediction accuracy. Here we have studied genomic prediction in three different panels of potato cultivars, varying in size, design, and phenotypic profile. We have developed genomic prediction models for two important agronomic traits of potato, dry matter content and chipping quality. We used genotyping-by-sequencing to genotype 1,146 individuals and generated genomic prediction models from 167,637 markers to calculate genomic estimated breeding values with genomic best linear unbiased prediction (GBLUP). Cross-validated prediction correlations of 0.75-0.83 and 0.39-0.79 were obtained for dry matter content and chipping quality, respectively, when combining the three populations. These prediction accuracies were similar to those obtained when predicting performance within each panel. In contrast, but not unexpectedly, prediction correlations across populations were generally lower: 0.37-0.71 and 0.28-0.48 for dry matter content and chipping quality, respectively. These predictions are not limited by the number of markers included, since similar prediction accuracies could be obtained when using only 7,800 markers (<5% of the full set).
Our results suggest that predictions across breeding populations in tetraploid potato are presently unreliable, but that individual prediction models within populations can be combined in an additive fashion to obtain high quality prediction models relevant for several breeding populations.
Keywords: Solanum tuberosum; chipping quality; dry matter; genomic prediction; genomic selection; potato breeding
Year: 2018 PMID: 30131817 PMCID: PMC6090097 DOI: 10.3389/fpls.2018.01118
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
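The GBLUP cross-validation summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline: the VanRaden-style relationship matrix for tetraploid allele dosages (0 to 4), the fixed heritability `h2` (in practice variance components would be estimated, for example by REML), and all function names are assumptions made for illustration only.

```python
import numpy as np

def genomic_relationship(dosage, ploidy=4):
    """VanRaden-style relationship matrix (assumed form) from an
    (individuals x markers) matrix of allele dosages in 0..ploidy."""
    p = dosage.mean(axis=0) / ploidy           # allele frequency per marker
    Z = dosage - ploidy * p                    # centre each marker column
    scale = ploidy * np.sum(p * (1.0 - p))     # expected marker variance
    return Z @ Z.T / scale

def gblup_predict(G, y, train, test, h2=0.5):
    """Predict genomic estimated breeding values for `test` individuals.
    h2 is a fixed, assumed heritability, not an estimated one."""
    lam = (1.0 - h2) / h2                      # residual / genetic variance
    mu = y[train].mean()
    G_tt = G[np.ix_(train, train)]
    alpha = np.linalg.solve(G_tt + lam * np.eye(len(train)), y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

def cross_validated_correlation(G, y, k=5, repeats=50, h2=0.5, seed=0):
    """Mean correlation between predictions and phenotypes over
    `repeats` random k-fold partitions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    cors = []
    for _ in range(repeats):
        folds = np.array_split(rng.permutation(n), k)
        pred = np.empty(n)
        for fold in folds:
            train = np.setdiff1d(np.arange(n), fold)
            pred[fold] = gblup_predict(G, y, train, fold, h2)
        cors.append(np.corrcoef(pred, y)[0, 1])
    return float(np.mean(cors))

if __name__ == "__main__":
    # Simulated data only; these are not the potato panels from the study.
    rng = np.random.default_rng(42)
    dosage = rng.binomial(4, 0.3, size=(100, 200)).astype(float)
    y = dosage @ rng.normal(size=200) + rng.normal(scale=5.0, size=100)
    G = genomic_relationship(dosage)
    print(cross_validated_correlation(G, y, k=5, repeats=10))
```

Predicting across populations, as opposed to within them, would correspond to passing the indices of one panel as `train` and those of another as `test` instead of random folds.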
Figure 1. Density histograms depicting the phenotype distributions for the MASPOT population (gray), Test panel DK (yellow), and Test panel UK (blue). Dry matter content (A) was measured as a percentage, while chipping quality (B) was determined as an assessment of frying color on a scale from 1 (poor) to 9 (best).
Figure 2. Principal component analysis (PCA) of the genomic relationship matrix constructed from genotypes at 167,637 SNP markers (A) and 7,800 markers (B) for the three populations, MASPOT (gray), Test panel DK (yellow), and Test panel UK (blue). The first principal component (PC1) is plotted against the second principal component (PC2) in the top panels and against the third principal component (PC3) in the bottom panels. Plotted in various colors and connected with lines are five individuals that were genotyped in both Test panel DK and Test panel UK. The three components account for 19.8, 13.4, and 8.6% of the explained variance, respectively, for the 167,637 markers, and 19.7, 11.4, and 9.2%, respectively, for the 7,800 marker set.
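A PCA of a genomic relationship matrix, as shown in Figure 2, can be sketched through an eigendecomposition. The double-centring step and the function name `pca_of_relationship_matrix` are assumptions for illustration; the caption does not state how the authors computed the components.

```python
import numpy as np

def pca_of_relationship_matrix(G, n_components=3):
    """Principal component scores and explained-variance fractions
    from a symmetric (individuals x individuals) relationship matrix."""
    n = G.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centring matrix
    Gc = J @ G @ J                             # double-centre G (assumed step)
    eigvals, eigvecs = np.linalg.eigh(Gc)      # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)   # drop tiny negative noise
    eigvecs = eigvecs[:, order]
    scores = eigvecs[:, :n_components] * np.sqrt(eigvals[:n_components])
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, explained
```

Plotting `scores[:, 0]` against `scores[:, 1]` and `scores[:, 2]`, colored by panel, would give the kind of layout the caption describes; `explained` holds the fraction of variance attributed to each component.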
Mean prediction correlations and bias found with GBLUP over 50 repeats with 167,637 markers, using the three populations separately and combined for modeling.
| MASPOT [755] | 0.67 [1.41] | 0.62 [1.55] | 0.75 [0.99] | |
| Test panel DK [80] | 0.71 [1.91] | 0.63 [2.85] | 0.83 [1.08] | |
| Test panel UK [290] | 0.57 [1.64] | 0.37 [2.20] | 0.75 [1.36] | |
| MASPOT [524] | 0.35 [1.31] | 0.30 [0.31] | 0.57 [0.99] | |
| Test panel DK [39] | 0.48 [1.76] | 0.42 [0.63] | 0.49 [1.30] | |
| Test panel UK [290] | 0.43 [2.04] | 0.28 [3.79] | 0.78 [1.47] | |
Training populations are listed horizontally and predicted populations vertically. Bias is given in brackets after each correlation; the number of phenotypes available in each case is given in brackets after the population name. Bold lettering indicates within-population predictions, where the same population was used as training and test population with 5-fold cross-validation.
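The footnote does not define how bias was computed. One common convention in genomic prediction, assumed here and not confirmed by the text, is the slope of the regression of observed phenotypes on predicted values, where a slope of 1 indicates predictions on the same scale as the observations. Under that assumption, the two reported statistics could be obtained as in this sketch (the function name is hypothetical):

```python
import numpy as np

def prediction_correlation_and_bias(observed, predicted):
    """Return (Pearson correlation, regression slope of observed on predicted).
    The slope is an ASSUMED definition of 'bias'; the source does not define it."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    r = np.corrcoef(observed, predicted)[0, 1]
    slope = np.cov(observed, predicted, ddof=1)[0, 1] / np.var(predicted, ddof=1)
    return float(r), float(slope)
```

With this definition, a slope above 1 means the predictions vary less than the observed phenotypes, and a slope below 1 means they vary more.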
Figure 3. Predictions of dry matter content using the combined population (left) or using within-population predictions (right). Green: Predictions of MASPOT population. Gray: Predictions of Test panel UK. Yellow: Predictions of Test panel DK. (A) Combined population. (B) MASPOT model. (C) Test panel UK model. (D) Test panel DK model.
Figure 4. Predictions of chipping quality using the combined population (left) or using within-population predictions (right). Green: Predictions of MASPOT population. Gray: Predictions of Test panel UK. Yellow: Predictions of Test panel DK. (A) Combined population. (B) MASPOT model. (C) Test panel UK model. (D) Test panel DK model.
Mean prediction correlations and bias found with GBLUP over 50 repeats with 7,800 cherry-picked markers, using the three populations separately and combined for modeling.
| MASPOT [755] | 0.67 [0.87] | 0.59 [1.16] | 0.74 [0.93] | |
| Test panel DK [90] | 0.72 [1.47] | 0.66 [1.74] | 0.79 [0.99] | |
| Test panel UK [289] | 0.46 [1.18] | 0.44 [1.13] | 0.71 [1.16] | |
| MASPOT [524] | 0.37 [0.75] | 0.30 [0.21] | 0.49 [0.63] | |
| Test panel DK [42] | 0.53 [1.09] | 0.37 [0.25] | 0.39 [0.41] | |
| Test panel UK [289] | 0.44 [3.04] | 0.28 [3.04] | 0.78 [1.39] | |
Training populations are listed horizontally and predicted populations vertically. Bias is given in brackets after each correlation; the number of phenotypes available in each case is given in brackets after the population name. Bold lettering indicates within-population predictions, where the same population was used as training and test population with 5-fold cross-validation.
Mean prediction correlations and bias with standard deviations found with GBLUP over 50 repeats for each of 10 different sets of 7,800 randomly selected markers, using the three populations separately and combined for modeling.
| MASPOT [755] | 0.65 ± 0.01 [1.34 ± 0.03] | 0.57 ± 0.02 [1.38 ± 0.08] | 0.74 ± 0.01 [0.95 ± 0.01] | |
| Test panel DK [78–81] | 0.70 ± 0.03 [1.86 ± 0.11] | 0.57 ± 0.06 [2.35 ± 0.27] | 0.81 ± 0.01 [1.03 ± 0.02] | |
| Test panel UK [290] | 0.49 ± 0.04 [1.30 ± 0.13] | 0.32 ± 0.04 [1.53 ± 0.17] | 0.68 ± 0.02 [1.15 ± 0.04] | |
| MASPOT [524] | 0.33 ± 0.02 [1.21 ± 0.10] | 0.29 ± 0.02 [0.43 ± 0.09] | 0.51 ± 0.02 [0.80 ± 0.03] | |
| Test panel DK [38–39] | 0.43 ± 0.07 [1.38 ± 0.28] | 0.36 ± 0.10 [0.73 ± 0.26] | 0.49 ± 0.05 [1.01 ± 0.16] | |
| Test panel UK [290] | 0.33 ± 0.03 [1.79 ± 0.50] | 0.24 ± 0.05 [3.35 ± 1.32] | 0.76 ± 0.02 [1.45 ± 0.03] | |
Training populations are listed horizontally and predicted populations vertically. Bias is given in brackets after each correlation; the number of phenotypes available in each case is given in brackets after the population name. Bold lettering indicates within-population predictions, where the same population was used as training and test population with 5-fold cross-validation.
Mean prediction correlations and bias with standard deviations found with GBLUP over 50 repeats with 167,637 markers for each of 10 different subsamples of either 39 or 80 individuals, using the three populations separately and combined for modeling.
| MASPOT [80] | 0.66 ± 0.05 [1.52 ± 0.14] | 0.50 ± 0.08 [2.63 ± 0.45] | 0.71 ± 0.07 [1.05 ± 0.06] | |
| Test panel DK [80] | 0.67 ± 0.04 [2.78 ± 0.22] | 0.53 ± 0.10 [5.21 ± 0.69] | 0.82 ± 0.00 [1.11 ± 0.02] | |
| Test panel UK [80] | 0.36 ± 0.08 [1.86 ± 0.47] | 0.42 ± 0.11 [2.71 ± 0.68] | 0.60 ± 0.06 [1.73 ± 0.25] | |
| MASPOT [39] | 0.36 ± 0.13 [1.71 ± 0.55] | 0.24 ± 0.20 [1.03 ± 0.87] | 0.37 ± 0.13 [1.04 ± 0.32] | |
| Test panel DK [39] | 0.34 ± 0.13 [2.23 ± 0.96] | 0.40 ± 0.17 [2.52 ± 1.09] | 0.52 ± 0.08 [1.82 ± 0.28] | |
| Test panel UK [39] | 0.25 ± 0.22 [5.32 ± 4.98] | 0.29 ± 0.19 [5.85 ± 4.33] | 0.44 ± 0.13 [2.77 ± 0.78] | |
Training populations are listed horizontally and predicted populations vertically. Bias is given in brackets after each correlation; the number of phenotypes available in each case is given in brackets after the population name. Bold lettering indicates within-population predictions, where the same population was used as training and test population with 5-fold cross-validation. No subsampling was performed in Test panel DK, as it is the smallest population; hence no standard deviations are presented.