Marty J Faville, Siva Ganesh, Mingshu Cao, M Z Zulfi Jahufer, Timothy P Bilton, H Sydney Easton, Douglas L Ryan, Jason A K Trethewey, M Philip Rolston, Andrew G Griffiths, Roger Moraga, Casey Flay, Jana Schmidt, Rachel Tan, Brent A Barrett.
Abstract
KEY MESSAGE: Genomic prediction models for multi-year dry matter yield, via genotyping-by-sequencing in a composite training set, demonstrate potential for genetic gain improvement through within-half sibling family selection.

Perennial ryegrass (Lolium perenne L.) is a key source of nutrition for ruminant livestock in temperate environments worldwide. Higher seasonal and annual yield of herbage dry matter (DMY) is a principal breeding objective but the historical realised rate of genetic gain for DMY is modest. Genomic selection was investigated as a tool to enhance the rate of genetic gain. Genotyping-by-sequencing (GBS) was undertaken in a multi-population (MP) training set of five populations, phenotyped as half-sibling (HS) families in five environments over 2 years for mean herbage accumulation (HA), a measure of DMY potential. GBS using the ApeKI enzyme yielded 1.02 million single-nucleotide polymorphism (SNP) markers from a training set of n = 517. MP-based genomic prediction models for HA were effective in all five populations, cross-validation-predictive ability (PA) ranging from 0.07 to 0.43, by trait and target population, and 0.40-0.52 for days-to-heading. Best linear unbiased predictor (BLUP)-based prediction methods, including GBLUP with either a standard or a recently developed (KGD) relatedness estimation, were marginally superior or equal to ridge regression and random forest computational approaches. PA was principally an outcome of SNP modelling genetic relationships between training and validation sets, which may limit application for long-term genomic selection, due to PA decay. However, simulation using data from the training experiment indicated a twofold increase in genetic gain for HA, when applying a prediction model with moderate PA in a single selection cycle, by combining among-HS family selection, based on phenotype, with within-HS family selection using genomic prediction.
Year: 2017 PMID: 29264625 PMCID: PMC5814531 DOI: 10.1007/s00122-017-3030-1
Source DB: PubMed Journal: Theor Appl Genet ISSN: 0040-5752 Impact factor: 5.699
SNP datasets produced for assessment of genomic selection statistical models
| SNP set | Missing data per SNP site (%) | Total number of SNPs | Models tested |
|---|---|---|---|
| 1 | 50 | 1,023,011 | KGD, GBLUP, RF |
| 2 | 10 | 249,546 | GBLUP, RR, RF |
| 3 | 1 | 43,966 | GBLUP, RR, RF |
RR ridge regression, RF random forest, KGD GBLUP using KGD-generated genomic relationship matrix
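The three SNP sets differ only in how much missing data is tolerated per SNP site. A minimal sketch of such a filter follows, assuming a genotype matrix with individuals as rows, SNPs as columns and `None` marking a missing call. The function name, the data layout and the inclusive threshold are illustrative assumptions, not details taken from the study.

```python
def filter_snps_by_missingness(genotypes, max_missing):
    """Return indices of SNP columns whose missing-call fraction is <= max_missing.

    genotypes   : list of rows (one per individual); None marks a missing call.
    max_missing : tolerated fraction of missing calls per SNP site (e.g. 0.10).
    The inclusive comparison is an assumption made for this illustration.
    """
    n_individuals = len(genotypes)
    n_snps = len(genotypes[0])
    kept = []
    for j in range(n_snps):
        n_missing = sum(1 for row in genotypes if row[j] is None)
        if n_missing / n_individuals <= max_missing:
            kept.append(j)
    return kept


# Toy data (hypothetical): 4 individuals x 3 SNPs, allele counts 0/1/2.
toy_genotypes = [
    [0, None, 1],
    [1, None, None],
    [2, 0, None],
    [0, 1, None],
]

for threshold in (0.50, 0.10, 0.01):
    print(threshold, filter_snps_by_missingness(toy_genotypes, threshold))
```

Looser thresholds keep more SNP sites, which is the pattern seen in the table: 50% tolerance retains about 1.02 million SNPs and 1% tolerance about 44 thousand.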
Fig. 1 Multi-dimensional scaling ordination plot for a ryegrass training set made up of individuals (n = 566) from five breeding populations (Pop I–V). Six repeats of a control DNA sample (one per 96-plex GBS library created) are represented by purple dots near the centre of the image
Genotypic ($\sigma_g^2$; for the multi-population training set, MP) or additive ($\sigma_a^2$; for individual populations), genotype-by-harvest ($\sigma_{gh}^2$), genotype-by-year ($\sigma_{gy}^2$), genotype-by-treatment ($\sigma_{gt}^2$), genotype-by-site ($\sigma_{gl}^2$) and experimental error ($\sigma_\varepsilon^2$) variance components and their associated standard errors (± SE), and $R$ (repeatability; for the MP training set) or $h_n^2$ (narrow-sense heritability; for individual populations), estimated for HA (g DM per plot) and DTH (number of days after 25 October) traits
| Trait | Training set | Mean | Range | $\sigma_g^2$ or $\sigma_a^2$ | $\sigma_{gh}^2$ | $\sigma_{gy}^2$ | $\sigma_{gt}^2$ | $\sigma_{gl}^2$ | $\sigma_\varepsilon^2$ | $R$ or $h_n^2$ |
|---|---|---|---|---|---|---|---|---|---|---|
| Rua STD | MP | 25.2 | 13.5–36.8 | 13.6 ± 1.8 | 2.0 ± 0.7 | 19.7 ± 1.6 | – | – | 91.8 ± 1.5 | 0.47 |
| | Pop I | 21.2 | 11.8–30.2 | 15.7 ± 3.72 | ns | 10.7 ± 2.67 | – | – | 57.1 ± 2.93 | 0.65 |
| | Pop II | 32.2 | 22.9–39.9 | 10.3 ± 2.41 | ns | 8.0 ± 1.95 | – | – | 66.1 ± 2.58 | 0.67 |
| | Pop III | 25.3 | 15.4–36.5 | 15.1 ± 3.07 | ns | 11.7 ± 2.25 | – | – | 77.6 ± 2.85 | 0.60 |
| | Pop IV | 22.1 | 14.9–30.3 | 12.5 ± 3.60 | ns | 10.1 ± 3.45 | – | – | 55.9 ± 4.82 | 0.61 |
| | Pop V | 24.9 | 13.9–37.9 | 26.7 ± 4.95 | ns | 11.5 ± 2.50 | – | – | 73.0 ± 3.17 | 0.73 |
| Rua SEV | MP | 29.6 | 17.3–48.9 | 30.3 ± 3.2 | ns | 12.0 ± 1.9 | – | – | 101.1 ± 2.1 | 0.67 |
| | Pop I | 26.7 | 15.4–41.1 | 41.4 ± 8.14 | ns | ns | – | – | 94.2 ± 7.29 | 0.84 |
| | Pop II | 39.8 | 32.0–52.4 | 21.9 ± 4.10 | ns | ns | – | – | 76.8 ± 4.33 | 0.77 |
| | Pop III | 30.6 | 20.2–41.7 | 30.1 ± 4.94 | ns | ns | – | – | 92.6 ± 4.95 | 0.80 |
| | Pop IV | 26.1 | 16.0–47.0 | 42.7 ± 10 | ns | ns | – | – | 60.6 ± 6.71 | 0.89 |
| | Pop V | 26.1 | 16.1–37.9 | 32.5 ± 5.93 | ns | 6.6 ± 3.03 | – | – | 84.1 ± 5.13 | 0.76 |
| Aor STD | MP | 58.9 | 53.0–63.4 | 4.6 ± 2.2 | ns | 11.1 ± 2.8 | – | – | 172.4 ± 3.5 | 0.23 |
| | Pop I | 52.2 | 44.6–58.7 | 10.9 ± 4.28 | ns | 8.9 ± 4.14 | – | – | 80.5 ± 5.23 | 0.55 |
| | Pop II | 52.4 | 48.0–57.1 | 5.5 ± 2.05 | ns | ns | – | – | 85.7 ± 4.24 | 0.54 |
| | Pop III | 51.4 | 44.6–60.6 | 8.6 ± 3.19 | ns | 12.7 ± 3.76 | – | – | 97.6 ± 4.40 | 0.42 |
| | Pop IV | 54.4 | 49.5–59.2 | 8.1 ± 3.19 | ns | ns | – | – | 103.4 ± 9.50 | 0.59 |
| | Pop V | 57.1 | 49.0–64.4 | 8.0 ± 3.6 | 10.0 ± 4.70 | 11.9 ± 5.00 | – | – | 117.3 ± 6.50 | 0.34 |
| Aor SEV | MP | 38.4 | 24.0–52.1 | 28.7 ± 3.0 | 10.2 ± 2.2 | ns | – | – | 128.3 ± 2.8 | 0.73 |
| | Pop I | 38.2 | 17.4–53.5 | 63.8 ± 10.25 | 22.1 ± 5.03 | ns | – | – | 75.9 ± 4.82 | 0.85 |
| | Pop II | 41.9 | 30.6–54.3 | 35.2 ± 5.60 | 11.6 ± 5.10 | ns | – | – | 101.0 ± 5.20 | 0.79 |
| | Pop III | 39.9 | 23.6–55.4 | 39.0 ± 5.73 | 14.8 ± 3.71 | ns | – | – | 83.0 ± 4.08 | 0.80 |
| | Pop IV | 39.6 | 27.5–49.0 | 25.4 ± 5.26 | ns | ns | – | – | 57.6 ± 5.09 | 0.89 |
| | Pop V | 32.5 | 21.7–46.8 | 45.4 ± 7.24 | 28.2 ± 5.18 | ns | – | – | 85.2 ± 4.59 | 0.76 |
| Lin STD | MP | 43.7 | 35.7–52.2 | 10.5 ± 2.0 | ns | 11.4 ± 2.4 | – | – | 124.3 ± 3.0 | 0.45 |
| | Pop I | 46.4 | 39.7–51.8 | 11.6 ± 4.43 | ns | 12.0 ± 5.33 | – | – | 83.1 ± 6.77 | 0.52 |
| | Pop II | 40.8 | 33.6–45.8 | 12.2 ± 2.95 | ns | ns | – | – | 69.3 ± 4.17 | 0.76 |
| | Pop III | 43.8 | 35.8–52.7 | 9.9 ± 3.05 | ns | 5.8 ± 2.88 | – | – | 71.5 ± 3.95 | 0.59 |
| | Pop IV | – | – | ns | ns | ns | – | – | 63.2 ± 11.19 | – |
| | Pop V | 48.1 | 42.6–54.1 | 12.8 ± 3.68 | ns | 8.3 ± 4.05 | – | – | 78.1 ± 5.10 | 0.60 |
| Rua STD + SEV | MP | 28.2 | 16.2–42.5 | 14.3 ± 2.0 | ns | 11.6 ± 1.1 | 11.7 ± 1.3 | – | 98.3 ± 2.2 | 0.50 |
| | Pop I | 23.5 | 12.7–33.4 | 12.5 ± 4.20 | ns | ns | 16.0 ± 3.80 | – | 81.7 ± 3.20 | 0.54 |
| | Pop II | – | – | ns | ns | 2.8 ± 1.21 | 18.6 ± 3.63 | – | 82.0 ± 2.50 | – |
| | Pop III | 28.1 | 18.1–40.3 | 16.3 ± 3.22 | ns | 4.3 ± 1.40 | 5.7 ± 1.74 | – | 96.1 ± 2.73 | 0.68 |
| | Pop IV | – | – | ns | ns | 2.9 ± 1.91 | 37.5 ± 8.02 | – | 71.9 ± 3.41 | – |
| | Pop V | 26.8 | 16.6–38.9 | 17.0 ± 4.93 | ns | 4.9 ± 1.64 | 19.6 ± 3.94 | – | 92.9 ± 3.03 | 0.53 |
| Comb STD | MP | – | – | ns | 3.1 ± 0.6 | 11.4 ± 1.0 | – | 22.7 ± 1.6 | 119.8 ± 1.3 | – |
| | Pop I | 35.3 | 27.9–42.2 | 4.9 ± 2.19 | ns | 4.1 ± 1.46 | – | 11.2 ± 2.28 | 77.9 ± 2.79 | 0.40 |
| | Pop II | 37.0 | 32.6–42.5 | 2.9 ± 1.35 | ns | 2.9 ± 0.97 | – | 7.6 ± 1.46 | 73.4 ± 2.02 | 0.35 |
| | Pop III | 37.8 | 32.7–43.1 | 4.2 ± 1.61 | ns | ns | – | 9.3 ± 2.01 | 167.6 ± 3.82 | 0.40 |
| | Pop IV | – | – | ns | ns | 4.5 ± 2.18 | – | 12.0 ± 3.15 | 75.1 ± 4.11 | – |
| | Pop V | 38.5 | 31.7–46.8 | 8.6 ± 2.65 | ns | 5.1 ± 1.42 | – | 10.7 ± 2.02 | 84.6 ± 2.50 | 0.53 |
| Comb SEV | MP | 33.0 | 21.0–45.4 | 13.5 ± 2.4 | 2.6 ± 0.8 | 2.7 ± 1.0 | – | 20.0 ± 2.1 | 117.3 ± 1.72 | 0.47 |
| | Pop I | 32.4 | 18.2–43.6 | 29.1 ± 7.62 | ns | ns | – | 24.2 ± 5.93 | 107.2 ± 5.14 | 0.66 |
| | Pop II | 38.5 | 30.5–47.9 | 10.8 ± 3.44 | ns | ns | – | 12.7 ± 3.44 | 126.2 ± 3.42 | 0.52 |
| | Pop III | 33.3 | 24.6–41.3 | 33.3 ± 4.31 | ns | ns | – | 14.7 ± 3.63 | 133.0 ± 4.53 | 0.75 |
| | Pop IV | 33.0 | 26.7–41.5 | 14.1 ± 5.22 | ns | ns | – | 7.6 ± 3.20 | 132.8 ± 7.70 | 0.65 |
| | Pop V | 28.5 | 19.0–40.0 | 20.0 ± 4.6 | ns | ns | – | 11.1 ± 3.12 | 100.4 ± 3.82 | 0.71 |
| Comb STD + SEV | MP | – | – | ns | 0.8 ± 0.3 | 5.5 ± 0.6 | 11.5 ± 1.0 | 19.9 ± 1.3 | 124.4 ± 1.1 | – |
| | Pop I | 35.8 | 24.9–44.6 | 7.6 ± 3.32 | ns | 2.0 ± 0.93 | 14.8 ± 2.8 | 11.9 ± 2.25 | 79.4 ± 2.30 | 0.37 |
| | Pop II | | | | | | | | | |
| | Pop III | 35.7 | 29.4–41.6 | 4.8 ± 1.78 | ns | 2.0 ± 0.70 | 4.9 ± 1.21 | 9.4 ± 1.46 | 89.6 ± 1.80 | 0.39 |
| | Pop IV | – | – | ns | ns | 2.3 ± 1.31 | 8.2 ± 2.30 | 8.4 ± 2.25 | 71.6 ± 3.05 | – |
| | Pop V | 36.6 | 28.9–45.0 | 10.0 ± 2.73 | ns | 2.5 ± 0.86 | 6.3 ± 1.48 | 8.5 ± 1.58 | 85.8 ± 2.00 | 0.55 |
| Comb DTH | MP | 16.2 | 11.1–21.2 | 6.1 ± 0.8 | – | – | – | 0.2 ± 0.02 | 18.5 ± 0.7 | 0.66 |
| | Pop I | – | – | ns | – | – | – | 5.2 ± 1.97 | 12.8 ± 1.59 | – |
| | Pop II | 17.0 | 10.6–22.0 | 6.0 ± 1.26 | – | – | – | 1.7 ± 0.78 | 8.3 ± 0.77 | 0.73 |
| | Pop III | 16.7 | 12.5–21.6 | 4.4 ± 1.27 | – | – | – | ns | 17.1 ± 1.34 | 0.61 |
| | Pop IV | 16.7 | 6.9–17.3 | 13.3 ± 3.71 | – | – | – | ns | 21.6 ± 4.08 | 0.79 |
| | Pop V | 24.2 | 21.1–26.6 | 2.3 ± 1.12 | – | – | – | ns | 17.3 ± 1.71 | 0.44 |
ns indicates a variance component was recorded but was not statistically significant (P > 0.05). Results are given for analysis of populations as a MP training set and as individual populations (Pop I–V). Where there was no significant (P > 0.05) $\sigma_g^2$ or $\sigma_a^2$ for a training set, results are not shown (including all training sets for the trait Aor STD + SEV)
STD standard grazing management, SEV severe summer grazing management, Comb data combined across locations and/or treatments
Fig. 4 Cross-validation predictive ability (mean of n = 25 twofold cross-validations) in individual populations (Pop I–V) for seven HA traits and DTH. For each population, predictive ability is based on genomic prediction models developed from a multi-population (MP) training set, using the KGD statistical method. Prediction accuracies by KGD in the full MP training set (Fig. 3) are shown here (black symbol) for comparison. Error bars are SE
Fig. 2 Decline in linkage disequilibrium, measured as r against distance in base pairs (bp), for the five populations making up the multi-population (MP) perennial ryegrass training set. The lines are non-linear regression models estimated for each of Pop I–V. These are based on pairwise r values between all SNPs mapped to the same scaffold, for all chromosomes within the population (Supplementary Figure S3)
Fig. 3 Mean (n = 5) tenfold cross-validation predictive ability in a multi-population training set (MP) for seven HA traits and DTH, determined using four statistical models and assessed as (a) Pearson’s correlation between observed phenotype (BLUP) and GEBV and (b) slope of the regression of GEBVs on BLUPs. RR ridge regression, RF random forest, KGD GBLUP using KGD genomic relationship matrix. RR models used 249,546 SNPs (the largest SNP dataset this method could handle computationally), while GBLUP, KGD and RF used 1,023,011 SNPs. Differences between any two statistical methods for each trait that exceed the LSD bar are significant at P < 0.05. In (b), statistical methods with a slope of regression of GEBVs on BLUP-adjusted means ≈ 1 are regarded as providing unbiased estimates of BLUPs. Lines on the plots are not meant to imply continuity between the points but to illustrate differences amongst the statistical methods clearly
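The two accuracy measures named in this caption can be illustrated with a short sketch: predictive ability as the Pearson correlation between BLUPs and GEBVs, and bias as the slope of the least-squares regression of GEBVs on BLUPs. The function names and the toy values are hypothetical. The sketch covers a single validation fold and does not reproduce the authors' cross-validation pipeline.

```python
from math import sqrt


def _centred_sums(blups, gebvs):
    """Sums of squares and cross-products about the means."""
    n = len(blups)
    mean_b = sum(blups) / n
    mean_g = sum(gebvs) / n
    s_bg = sum((b - mean_b) * (g - mean_g) for b, g in zip(blups, gebvs))
    s_bb = sum((b - mean_b) ** 2 for b in blups)
    s_gg = sum((g - mean_g) ** 2 for g in gebvs)
    return s_bg, s_bb, s_gg


def predictive_ability(blups, gebvs):
    """Pearson correlation between observed BLUPs and predicted GEBVs."""
    s_bg, s_bb, s_gg = _centred_sums(blups, gebvs)
    return s_bg / sqrt(s_bb * s_gg)


def regression_slope(blups, gebvs):
    """Slope of the regression of GEBVs (response) on BLUPs (predictor)."""
    s_bg, s_bb, _ = _centred_sums(blups, gebvs)
    return s_bg / s_bb


# Hypothetical validation-fold values, for illustration only.
toy_blups = [1.0, 2.0, 3.0, 4.0]
toy_gebvs = [0.5, 1.0, 1.5, 2.0]

print(predictive_ability(toy_blups, toy_gebvs))
print(regression_slope(toy_blups, toy_gebvs))
```

In this toy case the GEBVs rank the individuals perfectly but are shrunk towards the mean, so the correlation is high while the slope falls well below 1. That is why the caption reports the two measures separately.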
Analysis of variance results considering two measures of genomic predictive ability, the Pearson’s correlation between BLUPs and GEBVs and the slope of the regression of GEBVs on BLUPs, with statistical method and trait as factors
| Accuracy measure | Source | df | SS | MS | F | P |
|---|---|---|---|---|---|---|
| Pearson correlation | Method | 3 | 0.0039 | 0.0013 | 6.50 | 0.0004 |
| | Trait | 7 | 2.1105 | 0.3015 | 1505.95 | < 0.0001 |
| | Method × trait | 21 | 0.0535 | 0.0026 | 12.72 | < 0.0001 |
| | Residual | 128 | 0.0256 | 0.0002 | | |
| Regression slope | Method | 3 | 4.9230 | 1.6409 | 404.69 | < 0.0001 |
| | Trait | 7 | 1.8710 | 0.2673 | 65.92 | < 0.0001 |
| | Method × trait | 21 | 0.7570 | 0.0360 | 8.89 | < 0.0001 |
| | Residual | 128 | 0.5190 | 0.0041 | | |
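As a reading aid, the F values for the regression-slope rows can be recomputed from the tabulated sums of squares and degrees of freedom. The sketch assumes each effect is tested against the residual mean square, as in a fixed-effects two-way ANOVA. The printed values are consistent with that assumption, but the source does not describe the model. The function name is illustrative.

```python
def f_ratio(ss_effect, df_effect, ss_residual, df_residual):
    """F statistic: effect mean square divided by residual mean square."""
    ms_effect = ss_effect / df_effect
    ms_residual = ss_residual / df_residual
    return ms_effect / ms_residual


# (SS, df) for the regression-slope rows of the table above.
slope_rows = {
    "Method": (4.9230, 3),
    "Trait": (1.8710, 7),
    "Method x trait": (0.7570, 21),
}
ss_residual, df_residual = 0.5190, 128

for source, (ss, df) in slope_rows.items():
    print(f"{source}: F = {f_ratio(ss, df, ss_residual, df_residual):.2f}")
```

The recomputed ratios agree with the tabulated F values to within the rounding of the printed SS. The degrees of freedom are also self-consistent: 4 methods × 8 traits × 5 replicates gives 160 observations, and 3 + 7 + 21 + 128 = 159.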
Mean (n = 10) cross-validation predictive ability in a multi-population (MP) training set using three different SNP sets for seven HA traits and Comb DTH
| Trait | RF | GBLUP | RR | KGD | |||||
|---|---|---|---|---|---|---|---|---|---|
| | Set 1 | Set 2 | Set 3 | Set 1 | Set 2 | Set 3 | Set 1 | Set 2 | Set 3 |
| Aor STD | 0.23 (0.007) | 0.23 (0.008) | 0.23 (0.003) | 0.23 (0.006) | 0.22 (0.006) | 0.21 (0.006) | 0.22 (0.009) | 0.19 (0.007) | 0.22 (0.005) |
| Aor SEV | 0.26 (0.004) | 0.26 (0.004) | 0.24 (0.005) | 0.24 (0.004) | 0.24 (0.005) | 0.23 (0.006) | 0.21 (0.005) | 0.23 (0.005) | 0.24 (0.004) |
| Lin STD | 0.11 (0.006) | 0.14 (0.005) | 0.15 (0.010) | 0.10 (0.003) | 0.10 (0.003) | 0.07 (0.009) | 0.13 (0.006) | 0.15 (0.006) | 0.10 (0.011) |
| Rua STD | 0.26 (0.007) | 0.25 (0.007) | 0.25 (0.004) | 0.26 (0.005) | 0.27 (0.006) | 0.26 (0.005) | 0.28 (0.006) | 0.30 (0.007) | 0.26 (0.005) |
| Rua SEV | 0.32 (0.008) | 0.31 (0.006) | 0.30 (0.006) | 0.31 (0.006) | 0.31 (0.005) | 0.31 (0.005) | 0.28 (0.010) | 0.30 (0.008) | 0.30 (0.007) |
| Rua STD + SEV | 0.40 (0.006) | 0.40 (0.005) | 0.39 (0.004) | 0.43 (0.006) | 0.44 (0.007) | 0.44 (0.007) | 0.44 (0.007) | 0.44 (0.008) | 0.43 (0.006) |
| Comb SEV | 0.36 (0.007) | 0.35 (0.006) | 0.34 (0.005) | 0.35 (0.007) | 0.36 (0.007) | 0.36 (0.007) | 0.33 (0.011) | 0.35 (0.010) | 0.36 (0.006) |
| Comb DTH | 0.47 (0.004) | 0.47 (0.004) | 0.46 (0.005) | 0.52 (0.002) | 0.52 (0.003) | 0.52 (0.003) | 0.51 (0.002) | 0.51 (0.003) | 0.52 (0.005) |
Set 1 = 1,023,011 SNPs (50% missing data per SNP site). Set 2 = 249,546 SNPs (10% missing data per SNP site). Set 3 = 43,966 SNPs (1% missing data per SNP site)
RR ridge regression, RF random forest, KGD GBLUP using KGD genomic relationship matrix
Fig. 5 Predicted rates of genetic gain ($\Delta G_c$) for HA based on data from the evaluation of 108 half-sibling (HS) families of perennial ryegrass Pop II in the Rua STD environment. HS family selection (HSF) is compared with among-HS family phenotypic selection and within-family genomic selection (APWFgs-HS). $\Delta G_c$ is estimated at six levels of genomic-predictive ability (r), including the value estimated by cross-validation for Rua STD HA in Pop II (0.27, indicated by arrow). Three within-HS family selection intensities ($k_w$ = 2.06, 2.27 and 2.67, equivalent to selecting the top 5, 3 and 1% of individuals, respectively) are tested at each r. Among-HS family selection intensity is fixed at $k_f$ = 1.40 (equivalent to selecting top 20% of HS families) for all HSF and APWFgs-HS scenarios. The solid horizontal line indicates $\Delta G_c$ for HSF selection (6.02) and the dotted line shows two times that rate (12.04)
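The selection intensities in this caption correspond to the standard truncation-selection formula, i = φ(z) / p, where p is the proportion selected, z the matching standard-normal truncation point and φ the normal density. The sketch below assumes a normally distributed selection criterion and a large candidate population. It is offered as a reading aid rather than as the authors' calculation, and the function name is illustrative.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardised selection intensity under truncation selection.

    Assumes a normally distributed criterion and a large population:
    intensity = pdf(z) / p, with z the truncation point for proportion p.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z = std_normal.inv_cdf(1.0 - proportion_selected)
    return std_normal.pdf(z) / proportion_selected


scenarios = [
    ("among families, top 20%", 0.20),
    ("within family, top 5%", 0.05),
    ("within family, top 3%", 0.03),
    ("within family, top 1%", 0.01),
]
for label, p in scenarios:
    print(f"{label}: {selection_intensity(p):.2f}")
```

Rounded to two decimals, these proportions give 1.40, 2.06, 2.27 and 2.67, matching the among-family and within-family intensities quoted in the caption.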