Nadim Tayeh1, Anthony Klein1, Marie-Christine Le Paslier2, Françoise Jacquin1, Hervé Houtin1, Céline Rond1, Marianne Chabert-Martinello1, Jean-Bernard Magnin-Robert1, Pascal Marget1, Grégoire Aubert1, Judith Burstin1.
Abstract
Pea is an important food and feed crop and a valuable component of low-input farming systems. Improving resistance to biotic and abiotic stresses is a major breeding target to enhance yield potential and regularity. Genomic selection (GS) has lately emerged as a promising technique to increase the accuracy and gain of marker-based selection. It uses genome-wide molecular marker data to predict the breeding values of candidate lines for selection. A collection of 339 genetic resource accessions (CRB339) was subjected to high-density genotyping using the GenoPea 13.2K SNP Array. Genomic prediction accuracy was evaluated for thousand seed weight (TSW), the number of seeds per plant (NSeed), and the date of flowering (BegFlo). Mean cross-environment prediction accuracies reached 0.83 for TSW, 0.68 for NSeed, and 0.65 for BegFlo. For each trait, the statistical method, the marker density, and/or the training population size and composition used for prediction were varied to investigate their effects on prediction accuracy: the effect was large for the size and composition of the training population but limited for the statistical method and marker density. Maximizing the relatedness between individuals in the training and test sets, through the CDmean-based method, significantly improved prediction accuracies. A cross-population cross-validation experiment was further conducted using the CRB339 collection as the training set and nine recombinant inbred line (RIL) populations as the test set. Prediction quality was high, with mean Q2 of 0.44 for TSW and 0.59 for BegFlo. Results are discussed in the light of current efforts to develop GS strategies in pea.
Keywords: GenoPea 13.2K SNP Array; genomic selection; marker density; pea (Pisum sativum L.); prediction accuracy; training set
Year: 2015 PMID: 26635819 PMCID: PMC4648083 DOI: 10.3389/fpls.2015.00941
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Distribution of the SNP markers used to genotype the CRB339 collection on the pea consensus genetic map. Information on the genetic positions of 9723 SNP markers is available and presented (see Tayeh et al., in press; and Supplementary Table 3).
Figure 2. Genetic dissimilarity amongst the 339 pea accessions composing the genetic resource collection. Two-dimensional networks were constructed using the R package network (Butts, 2008). Each accession is represented by a dot. Pairs of accessions with a genomic relationship coefficient >0.2 are linked. Major classes for the use type (A), the population type (B), the sowing type (C), and the geographic origin (D) are color-coded. Uncolored dots correspond to accessions that were not assigned to any of the classes.
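The linking rule in the Figure 2 caption (connect two accessions when their genomic relationship coefficient exceeds 0.2) can be illustrated with a minimal Python sketch. It assumes a VanRaden-type genomic relationship matrix computed from 0/1/2 marker codes; the record does not say which estimator the authors used, and the original networks were drawn with the R package network, so the function names and the toy genotype matrix below are purely illustrative.

```python
import numpy as np

def vanraden_grm(M):
    """Genomic relationship matrix from a lines x markers matrix coded 0/1/2.

    Assumption: VanRaden-type estimator G = Z Z' / (2 * sum(p * (1 - p))),
    which may differ from the estimator used in the study.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0            # allele frequency per marker
    keep = (p > 0.0) & (p < 1.0)        # drop monomorphic markers
    Z = M[:, keep] - 2.0 * p[keep]      # center each marker at 2p
    denom = 2.0 * np.sum(p[keep] * (1.0 - p[keep]))
    return Z @ Z.T / denom

def linked_pairs(G, threshold=0.2):
    """Return index pairs (i, j), i < j, whose relationship exceeds the threshold."""
    i, j = np.triu_indices(G.shape[0], k=1)
    mask = G[i, j] > threshold
    return list(zip(i[mask].tolist(), j[mask].tolist()))

if __name__ == "__main__":
    # Hypothetical toy genotypes: lines 0 and 1 are identical, as are lines 2 and 3.
    toy = np.array([[2, 2, 0, 0],
                    [2, 2, 0, 0],
                    [0, 0, 2, 2],
                    [0, 0, 2, 2]])
    G = vanraden_grm(toy)
    print(linked_pairs(G, threshold=0.2))
```

The resulting edge list could be passed to any graph-drawing library to obtain a layout comparable in spirit to the published networks.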
Prediction accuracy of TSW, NSeed, and BegFlo within the CRB339 collection.
| TSW | Mean | 0.93 | 0.86 | 0.82 | 0.86 | 0.86 | 0.86 | 0.83 | 0.79 | 0.82 | 0.83 | 0.83 |
| | stdev | 0.02 | 0.02 | 0.03 | 0.03 | 0.03 | 0.03 | 0.03 | 0.04 | 0.03 | 0.03 | 0.03 |
| NSeed | Mean | 0.69 | 0.73 | 0.66 | 0.73 | 0.73 | 0.72 | 0.67 | 0.64 | 0.67 | 0.68 | 0.67 |
| | stdev | 0.06 | 0.04 | 0.05 | 0.04 | 0.04 | 0.04 | 0.05 | 0.06 | 0.06 | 0.05 | 0.05 |
| BegFlo | Mean | 0.79 | 0.78 | 0.76 | 0.78 | 0.78 | 0.79 | 0.64 | 0.63 | 0.64 | 0.64 | 0.65 |
| | stdev | 0.05 | 0.04 | 0.04 | 0.04 | 0.04 | 0.04 | 0.05 | 0.06 | 0.05 | 0.05 | 0.05 |
Mean and standard deviation (stdev), across 200 repetitions, of the phenotypic prediction accuracy and of within- and across-year genomic prediction accuracies are provided for each trait.
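To make the table note concrete, here is a minimal Python sketch of how a mean and standard deviation of prediction accuracy across repeated random splits might be computed. Several points are assumptions, not details from this record: ridge regression stands in for the five statistical methods, accuracy is taken as the Pearson correlation between predicted and observed values in the held-out lines, the 80/20 split and the penalty `lam` are arbitrary, and the data are simulated.

```python
import numpy as np

def ridge_predict(X_train, y_train, X_test, lam=1.0):
    """Ridge regression on centered markers (stand-in model, not the authors')."""
    mu_x = X_train.mean(axis=0)
    mu_y = y_train.mean()
    Xc = X_train - mu_x
    # Dual form: solve an n x n system, convenient when markers outnumber lines.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(K.shape[0]), y_train - mu_y)
    beta = Xc.T @ alpha                      # marker effects
    return mu_y + (X_test - mu_x) @ beta

def repeated_cv_accuracy(X, y, n_reps=200, test_frac=0.2, lam=1.0, seed=0):
    """Mean and stdev of test-set correlation over repeated random splits."""
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_frac * n))
    accs = []
    for _ in range(n_reps):
        perm = rng.permutation(n)
        test, train = perm[:n_test], perm[n_test:]
        y_hat = ridge_predict(X[train], y[train], X[test], lam)
        accs.append(np.corrcoef(y_hat, y[test])[0, 1])
    accs = np.asarray(accs)
    return accs.mean(), accs.std(ddof=1)

def simulate(n_lines=150, n_markers=60, h2=0.8, seed=1):
    """Simulated inbred lines (markers coded 0/2) with additive effects; illustrative only."""
    rng = np.random.default_rng(seed)
    X = 2.0 * rng.integers(0, 2, size=(n_lines, n_markers))
    g = X @ rng.normal(size=n_markers)       # true genetic values
    noise_sd = np.sqrt(g.var() * (1.0 - h2) / h2)
    return X, g + rng.normal(scale=noise_sd, size=n_lines)

if __name__ == "__main__":
    X, y = simulate()
    mean_acc, sd_acc = repeated_cv_accuracy(X, y, n_reps=200, lam=15.0)
    print(f"accuracy: mean={mean_acc:.2f}, stdev={sd_acc:.2f}")
```

Across-year accuracies would follow the same pattern, with the model trained on phenotypes from one year and scored against phenotypes of the held-out lines from another.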
Figure 3. Effect of the calibration set size (A) and marker density (B) on the prediction quality of TSW, NSeed, and BegFlo of the CRB339 collection. Nine different tests per parameter were conducted. For each trait and each parameter, means and standard deviations of Q2 resulting from five statistical methods (see color codes) were used to construct the corresponding plot.
Figure 4. Improvement of the prediction quality of TSW, NSeed, and BegFlo within the CRB339 collection with training sets sampled using the CDmean-based method. For each trait, mean Q2 values obtained with the kPLSR method for nine different training set (train) sizes are shown. Two methods to select the training sets were applied: random sampling and CDmean-based sampling (Rincent et al., 2012). Fifty repetitions were made in each case. Data from this figure were obtained using a variance ratio of 0.01 to calculate CD in the CDmean-based method.
Figure 5. Illustration of the quality of genomic predictions of TSW (A) and BegFlo (B) of RILs from nine populations based on models trained on 339, 250, 150, or 50 accessions from the CRB339 collection. Five statistical methods were applied and genotypic information from 9700 SNP markers was considered. Mean Q2 and standard deviations used to construct this figure were obtained from cross-validations with 50 repetitions.