Nanna Hellum Nielsen1, Ahmed Jahoor1,2, Jens Due Jensen1, Jihad Orabi1, Fabio Cericola3, Vahid Edriss1, Just Jensen3.
Abstract
Genomic selection was recently introduced in plant breeding. The objective of this study was to develop genomic prediction for important seed quality parameters in spring barley, with the aim of predicting breeding values without expensive phenotyping of large sets of lines. A total of 309 advanced spring barley lines, tested at two locations with three replicates each, were phenotyped, and each line was genotyped with the Illumina iSelect 9K barley chip. The population originated from two different breeding sets, which were phenotyped in two different years. The phenotypes considered were seed size, protein content, protein yield, test weight and ergosterol content. A leave-one-out cross-validation strategy revealed high prediction accuracies ranging between 0.40 and 0.83. Prediction across breeding sets resulted in reduced accuracies compared to the leave-one-out strategy. Furthermore, predicting across full and half-sib families resulted in reduced prediction accuracies. Additionally, predictions were performed using reduced marker sets and reduced training population sets. In conclusion, using fewer than 200 lines in the training set can result in low prediction accuracy, and the accuracy will then be highly dependent on the family structure of the selected training set. However, the results also indicate that relatively small training sets (200 lines) are sufficient for genomic prediction in commercial barley breeding. In addition, our results indicate that a marker set of at least 1,000 markers is needed to decrease the risk of low prediction accuracy for some traits or some families.
Year: 2016 PMID: 27783639 PMCID: PMC5082657 DOI: 10.1371/journal.pone.0164494
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Overview of all phenotypes.
Based on a population of 309 F6 lines.
| Abbreviation | Phenotype explanation | Method |
|---|---|---|
| f2.8 | Weight (g) of seeds with size > 2.8 mm | Laboratory screening machine: SORTIMAT (Baumann Saatzuchtbedarf) |
| f2.5 | Weight (g) of seeds with size > 2.5 mm and < 2.8 mm | |
| f2.2 | Weight (g) of seeds with size > 2.2 mm and < 2.5 mm | |
| SSW | Standardized seed weight | |
| Protein | Protein (TS %) | Grain Analyser for grain and flour: Infratec 1241 (FOSS) |
| TW | Test weight (kg · hl⁻¹) | |
| Ergosterol | Ergosterol content (%) | |
| PY | Protein yield (hkg · ha⁻¹) | |
Fig 1. PCA.
Relation between the 309 barley lines based on 3,540 SNP markers.
Fig 2. Heatmap.
Relationship between the 309 barley lines based on 3,540 SNP markers.
Summary of phenotypic data.
Mean ± standard deviation (SD) of each trait, computed over 2,158 samples representing 309 lines in three replicates at two locations (Dyngby and Holeby) in two years (2014 and 2015).
| Measurement group | Phenotype | All samples | 2014, Dyngby | 2014, Holeby | 2015, Dyngby | 2015, Holeby |
|---|---|---|---|---|---|---|
| Fraction mass | f2.8 | 82.2 ± 9.1 | 87.5 ± 6.2 | 87.6 ± 4.9 | 77.8 ± 9.0 | 78.8 ± 9.3 |
| Fraction mass | f2.5 | 14.4 ± 7.4 | 10.0 ± 4.8 | 9.8 ± 4.3 | 17.6 ± 6.8 | 17.7 ± 7.9 |
| Fraction mass | f2.2 | 2.5 ± 1.7 | 1.9 ± 1.3 | 1.5 ± 0.6 | 3.7 ± 12.0 | 2.6 ± 1.3 |
| NIT measurements | Protein | 9.2 ± 0.9 | 10.3 ± 0.5 | 9.4 ± 0.5 | 9.2 ± 0.6 | 8.3 ± 0.4 |
| NIT measurements | PY | 8.3 ± 1.0 | 9.6 ± 0.6 | 8.3 ± 0.4 | 8.4 ± 0.7 | 7.4 ± 0.4 |
| NIT measurements | TW | 68.3 ± 2.0 | 66.8 ± 1.7 | 68.4 ± 2.9 | 68.5 ± 1.4 | 68.9 ± 1.4 |
| NIT measurements | Ergosterol | 13.2 ± 3.0 | 13.8 ± 2.7 | 9.0 ± 2.3 | 14.6 ± 2.1 | 14.0 ± 1.9 |
f2.8 = weight (g) of seed size fraction > 2.8 mm, f2.5 = weight (g) of seed size fraction > 2.5 mm and < 2.8 mm, f2.2 = weight (g) of seed size fraction > 2.2 mm and < 2.5 mm, Protein = protein content (TS %), PY = protein yield (hkg · ha⁻¹), TW = test weight (kg · hl⁻¹), Ergosterol = ergosterol content (%)
Fig 3. Distribution of seed size parameters among the 309 barley lines.
a: Weight (g) of f2.8, b: Weight (g) of f2.5, c: Weight (g) of f2.2, d: standardized seed size of f2.5 and f2.2.
Fig 4. Distribution of NIT parameters among the 309 barley lines.
a: Protein (TS %), b: PY = protein yield (hkg · ha⁻¹), c: TW = test weight (kg · hl⁻¹), d: Ergosterol content (%)
Heritability, maximum prediction accuracy and variance component estimates.
| Phenotype | h² | Max acc. | Variance estimate (g) | Variance estimate (l) | Variance estimate (c) | Variance estimate (e) |
|---|---|---|---|---|---|---|
| SSW | 0.51 ± 0.06 | 0.72 | 0.14·10⁻¹ | 0.09·10⁻¹ | 0.06·10⁻¹ | 0.09·10⁻¹ |
| Protein | 0.21 ± 0.03 | 0.46 | 0.18·10⁻¹ | 0.18·10⁻¹ | 0.55·10⁻¹ | 1.45·10⁻¹ |
| PY | 0.26 ± 0.02 | 0.51 | 0.18·10⁻¹ | na | 0.67·10⁻¹ | 2.18·10⁻¹ |
| TW | 0.51 ± 0.04 | 0.72 | 0.52 | 0.13 | 0.30 | 1.30 |
| Ergosterol | 0.61 ± 0.06 | 0.79 | 1.16 | 0.50 | 0.39 | 0.39 |
SSW = standardized seed weight, Protein = protein content (TS %), PY = protein yield (hkg · ha⁻¹), TW = test weight (kg · hl⁻¹), Ergosterol = ergosterol content (%)
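The source does not state how the "Max acc." column was obtained. A common convention in genomic prediction is to take the square root of the heritability as the upper bound on the correlation between predictions and phenotypes. The following is a minimal sketch under that assumption, using only the h² and "Max acc." values from the table; the function name `max_accuracy` is hypothetical.

```python
import math

# (h2, reported maximum accuracy) copied from the heritability table.
reported = {
    "SSW": (0.51, 0.72),
    "Protein": (0.21, 0.46),
    "PY": (0.26, 0.51),
    "TW": (0.51, 0.72),
    "Ergosterol": (0.61, 0.79),
}


def max_accuracy(h2: float) -> float:
    """Assumed upper bound on prediction accuracy: sqrt(h2).

    This formula is an assumption; the source does not give it.
    """
    return math.sqrt(h2)


# Absolute difference between sqrt(h2) and the reported value, per trait.
differences = {
    trait: abs(max_accuracy(h2) - acc) for trait, (h2, acc) in reported.items()
}

for trait, diff in differences.items():
    print(f"{trait}: sqrt(h2) differs from reported Max acc. by {diff:.3f}")
```

Under this assumption every trait agrees with the reported value to within 0.01, which is plausible if the published h² values were rounded before printing.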
Prediction accuracies (acc.) using a leave-one-out (LOO) and a leave-family-out (LFO) cross-validation strategy.
LFO was based on families including more than three lines, resulting in 45 families in total. Standard deviations are the standard deviation of the correlation coefficient.
| Phenotype | LOO Acc. | LOO Bias | LFO Acc. | LFO Bias |
|---|---|---|---|---|
| SSW | 0.68 ± 0.04 | 0.98 ± 0.06 | 0.63 ± 0.03 | 0.98 ± 0.05 |
| Protein | 0.40 ± 0.05 | 0.96 ± 0.12 | 0.26 ± 0.04 | 0.68 ± 0.10 |
| PY | 0.46 ± 0.05 | 1.04 ± 0.11 | 0.33 ± 0.04 | 0.87 ± 0.10 |
| TW | 0.63 ± 0.04 | 0.98 ± 0.07 | 0.56 ± 0.03 | 0.92 ± 0.06 |
| Ergosterol | 0.83 ± 0.03 | 1.02 ± 0.04 | 0.79 ± 0.03 | 1.02 ± 0.03 |
SSW = standardized seed weight, Protein = protein content (TS %), PY = protein yield (hkg · ha⁻¹), TW = test weight (kg · hl⁻¹), Ergosterol = ergosterol content (%)
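The table reports an accuracy and a bias for each trait, with the spread given as the standard deviation of the correlation coefficient. The source does not define these quantities, so the sketch below rests on three labeled assumptions: accuracy is the Pearson correlation between predicted and observed values, bias is the slope of the regression of observed on predicted values (1.0 meaning no inflation or deflation), and the standard deviation of a correlation r from n lines is approximated by sqrt((1 - r²) / (n - 2)). All function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def _centered_sums(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return the centered sums of squares Sxx, Syy and cross-products Sxy."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxx, syy, sxy


def accuracy(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Assumed definition: Pearson correlation of predicted and observed."""
    sxx, syy, sxy = _centered_sums(predicted, observed)
    return sxy / math.sqrt(sxx * syy)


def bias(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Assumed definition: slope of observed regressed on predicted."""
    sxx, _, sxy = _centered_sums(predicted, observed)
    return sxy / sxx


def sd_of_correlation(r: float, n: int) -> float:
    """Assumed approximation for the standard deviation of r from n pairs."""
    return math.sqrt((1.0 - r * r) / (n - 2))


# LOO accuracies from the table; n = 309 lines as stated in the source.
loo_accuracy = {"SSW": 0.68, "Protein": 0.40, "PY": 0.46, "TW": 0.63, "Ergosterol": 0.83}
loo_sd = {trait: round(sd_of_correlation(r, 309), 2) for trait, r in loo_accuracy.items()}
print(loo_sd)
```

With n = 309 this approximation gives the same two-decimal values as the LOO accuracy column. It does not reproduce every LFO value (for example, it gives about 0.04 for an accuracy of 0.63, whereas the table reports 0.03), so the authors' exact procedure may differ.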
Prediction accuracies (acc.) using a leave-set-out (LSO) cross-validation strategy.
LSO predictions are compared with leave-random-set-out (LSOrandom) predictions, using random sets of the same sizes as the progeny sets (S2014 and S2015). Standard deviations are the standard deviation of the correlation coefficient.
| Training population | Phenotype | LSO Acc. | LSO Bias | LSOrandom Acc. | LSOrandom Bias |
|---|---|---|---|---|---|
| S2015 | SSW | 0.50 ± 0.08 | 0.57 ± 0.09 | 0.60 ± 0.07 | 0.96 ± 0.17 |
| S2015 | Protein | 0.22 ± 0.09 | 0.82 ± 0.3 | 0.22 ± 0.07 | 0.96 ± 0.24 |
| S2015 | PY | 0.19 ± 0.09 | 0.69 ± 0.3 | 0.22 ± 0.05 | 1.08 ± 0.20 |
| S2015 | TW | 0.49 ± 0.08 | 0.90 ± 0.15 | 0.40 ± 0.05 | 0.95 ± 0.06 |
| S2015 | Ergosterol | 0.79 ± 0.06 | 1.29 ± 0.06 | 0.76 ± 0.04 | 1.06 ± 0.13 |
| S2014 | SSW | 0.52 ± 0.06 | 1.27 ± 0.15 | 0.63 ± 0.05 | 1.01 ± 0.17 |
| S2014 | Protein | 0.31 ± 0.07 | 0.87 ± 0.19 | 0.20 ± 0.02 | 0.94 ± 0.21 |
| S2014 | PY | 0.22 ± 0.07 | 0.71 ± 0.23 | 0.19 ± 0.02 | 0.98 ± 0.23 |
| S2014 | TW | 0.50 ± 0.06 | 0.94 ± 0.12 | 0.34 ± 0.03 | 0.94 ± 0.08 |
| S2014 | Ergosterol | 0.72 ± 0.06 | 1.29 ± 0.09 | 0.77 ± 0.04 | 1.14 ± 0.15 |
SSW = standardized seed weight, Protein = protein content (TS %), PY = protein yield (hkg · ha⁻¹), TW = test weight (kg · hl⁻¹), Ergosterol = ergosterol content (%)
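The cross-validation strategies named in the tables (leave-one-out, leave-family-out, leave-set-out and leave-random-set-out) differ only in how lines are split into training and validation sets. The sketch below illustrates one way such splits could be generated. It is not the authors' code: the function names, the toy labels, and the choice to keep lines from small families in the training set without ever validating on them are all assumptions.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple

Fold = Tuple[List[int], List[int]]  # (training indices, validation indices)


def leave_one_out(n_lines: int) -> Iterator[Fold]:
    """LOO: each line is predicted once from all remaining lines."""
    for held_out in range(n_lines):
        training = [i for i in range(n_lines) if i != held_out]
        yield training, [held_out]


def leave_group_out(groups: Sequence[Hashable], min_size: int = 1) -> Iterator[Fold]:
    """Hold out one whole group at a time (a family for LFO, a set for LSO).

    Groups smaller than min_size are never held out (assumption).
    """
    members: Dict[Hashable, List[int]] = defaultdict(list)
    for index, label in enumerate(groups):
        members[label].append(index)
    for validation in members.values():
        if len(validation) < min_size:
            continue
        held_out = set(validation)
        training = [i for i in range(len(groups)) if i not in held_out]
        yield training, validation


def leave_random_set_out(groups: Sequence[Hashable], seed: int = 0) -> Iterator[Fold]:
    """LSOrandom: shuffle the set labels so random sets keep the original sizes."""
    shuffled = list(groups)
    random.Random(seed).shuffle(shuffled)
    yield from leave_group_out(shuffled)


# Toy labels for 11 hypothetical lines (not data from the study).
families = ["A"] * 4 + ["B"] * 2 + ["C"] * 5
breeding_sets = ["S2014"] * 6 + ["S2015"] * 5

loo_folds = list(leave_one_out(len(families)))
lfo_folds = list(leave_group_out(families, min_size=4))  # families with more than three lines
lso_folds = list(leave_group_out(breeding_sets))
random_folds = list(leave_random_set_out(breeding_sets, seed=1))
```

In the toy example, family "B" has only two lines and is therefore never held out, mirroring the restriction to families with more than three lines.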
Fig 5. Prediction accuracies with reduced training population size.
a: Test weight (TW) (kg · hl⁻¹) and b: SSW (standardized seed weight). Predictions were done for randomly selected line sets. The analysis was repeated 100 times for each step using a leave-one-out strategy. Curves were fit by loess.
Fig 6. Prediction accuracies with reduced marker numbers.
a: Test weight (TW) (kg · hl⁻¹) and b: SSW (standardized seed weight). Predictions were done for randomly selected marker sets. The analysis was repeated 100 times for each step using a leave-one-out strategy. Curves were fit by loess.
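The reduced-training-set and reduced-marker analyses both rest on repeatedly drawing random subsets (of lines or of markers) at each step. A minimal sketch of that sampling step follows, assuming sampling without replacement and a fixed seed for reproducibility. The helper `random_subsets` is hypothetical, and the subset sizes shown (200 lines, 1,000 markers) are taken from the abstract only as illustrative steps; the actual step sizes are not given in the source.

```python
import random
from typing import List


def random_subsets(n_items: int, subset_size: int,
                   n_repeats: int = 100, seed: int = 0) -> List[List[int]]:
    """Draw n_repeats random subsets of item indices without replacement."""
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_items), subset_size))
            for _ in range(n_repeats)]


# 309 lines and 3,540 SNP markers are the totals reported in the source.
line_subsets = random_subsets(n_items=309, subset_size=200)
marker_subsets = random_subsets(n_items=3540, subset_size=1000)

print(len(line_subsets), len(line_subsets[0]))      # repeats, lines per subset
print(len(marker_subsets), len(marker_subsets[0]))  # repeats, markers per subset
```

Each subset would then be passed to the prediction model and evaluated with leave-one-out cross-validation, and the resulting accuracies summarized per step.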