Qi Bin Kwong, Chee Keng Teh, Ai Ling Ong, Fook Tim Chew, Sean Mayes, Harikrishna Kulaveerasingam, Martti Tammi, Suat Hui Yeoh, David Ross Appleton, Jennifer Ann Harikrishna.
Abstract
BACKGROUND: Genomic selection (GS) uses genome-wide markers in an attempt to accelerate genetic gain in breeding programs of both animals and plants. This approach is particularly useful for perennial crops such as oil palm, which have long breeding cycles, and for which the optimal method for GS is still under debate. In this study, we evaluated the effect of different marker systems and modeling methods for implementing GS in an introgressed dura family of 112 individuals derived from a Deli dura x Nigerian dura cross (Deli x Nigerian). This family is an important breeding source for developing new mother palms with superior oil yield and bunch characters. The traits of interest selected for this study were fruit-to-bunch (F/B), shell-to-fruit (S/F), kernel-to-fruit (K/F), mesocarp-to-fruit (M/F), oil per palm (O/P) and oil-to-dry mesocarp (O/DM). The marker systems evaluated were simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs). RR-BLUP, Bayesian A, B, Cπ, LASSO, Ridge Regression and two machine learning methods (SVM and Random Forest) were used to evaluate GS accuracy for these traits.
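As an illustration only, the kind of evaluation summarized above can be sketched as k-fold cross-validation of a ridge-regression marker model, a close relative of RR-BLUP. The abstract does not state how accuracy was computed; the sketch assumes the common convention of the Pearson correlation between predicted and observed phenotypes. The simulated genotypes, the penalty `lam`, the number of folds and all function names are hypothetical and are not taken from the study.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Ridge regression on centered data; returns (intercept, marker effects)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Dual form: solve an n x n system, cheaper when markers outnumber individuals.
    K = Xc @ Xc.T
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - y_mean)
    beta = Xc.T @ alpha
    return y_mean - x_mean @ beta, beta

def cv_accuracy(X, y, lam=1.0, k=5, seed=0):
    """Mean Pearson correlation between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    accs = []
    for test in np.array_split(idx, k):
        train = np.setdiff1d(idx, test)
        b0, beta = ridge_fit(X[train], y[train], lam)
        pred = b0 + X[test] @ beta
        accs.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(accs))

# Simulated example (assumption): 112 individuals, 500 biallelic markers coded 0/1/2,
# 20 of which carry a true effect.
rng = np.random.default_rng(42)
n, p = 112, 500
X = rng.integers(0, 3, size=(n, p)).astype(float)
effects = np.zeros(p)
effects[:20] = rng.normal(size=20)
y = X @ effects + rng.normal(size=n)

acc = cv_accuracy(X, y, lam=float(p), k=5)
print(f"mean cross-validated accuracy: {acc:.2f}")
```

The penalty is fixed here for brevity; in practice it would be estimated from the data, for example from variance components or by nested cross-validation.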
Keywords: Complex traits; Genomic prediction; Machine learning; Marker-assisted selection; Perennial crop; Predictive modeling; SNP; SSR
Year: 2017 PMID: 29228905 PMCID: PMC5725918 DOI: 10.1186/s12863-017-0576-5
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Fig. 1. Plot representing phenotypic distribution and correlation for the traits F/B (%), S/F (%), K/F (%), M/F (%), O/P (kg/palm/year) and O/DM (%) in the Deli x Nigerian family. The diagonal of the plot shows the histograms and distributions of the observed phenotype values. The lower off-diagonal panels are scatterplots between traits, whereas the upper off-diagonal panels give the correlation values between traits. Significant correlations are tagged with an asterisk (*)
Fig. 2. Bar plot showing the genomic heritabilities for the traits F/B, S/F, K/F, M/F, O/P and O/DM in the Deli x Nigerian family. S/F and O/DM had the highest heritability, followed by M/F, K/F, F/B and O/P
Mean accuracy of traits based on different SSR-based GS methods
| GS Method | F/B | S/F | K/F | M/F | O/P | O/DM | Mean |
|---|---|---|---|---|---|---|---|
| RR-BLUP | 0.28 | 0.25 | 0.17 | 0.14 | 0.17 | 0.14 | 0.19 |
| BA | 0.29 | 0.24 | 0.19 | 0.15 | 0.17 | 0.17 | 0.20 |
| BB | 0.29 | 0.24 | 0.21 | 0.14 | 0.16 | 0.16 | 0.20 |
| BC | 0.29 | 0.23 | 0.20 | 0.14 | 0.18 | 0.18 | 0.20 |
| BL | 0.28 | 0.24 | 0.17 | 0.14 | 0.17 | 0.16 | 0.19 |
| BRR | 0.29 | 0.23 | 0.18 | 0.13 | 0.18 | 0.19 | 0.20 |
| SVM | 0.30 | 0.30 | 0.14 | 0.22 | 0.28 | 0.22 | 0.24 |
| RF | 0.19 | 0.25 | 0.24 | 0.34 | 0.14 | 0.28 | 0.24 |
| Mean | 0.28 | 0.25 | 0.19 | 0.18 | 0.18 | 0.19 |  |
BL – Bayes Lasso, BRR – Bayes Ridge Regression, BA – Bayes A, BB – Bayes B, BC – Bayes Cπ, SVM – support vector machine, RF – random forest
Mean accuracy of traits based on different SNP-based GS methods
| GS Method | F/B | S/F | K/F | M/F | O/P | O/DM | Mean |
|---|---|---|---|---|---|---|---|
| RR-BLUP | 0.31 | 0.40 | 0.18 | 0.30 | 0.21 | 0.42 | 0.30 |
| BA | 0.33 | 0.40 | 0.20 | 0.30 | 0.20 | 0.43 | 0.31 |
| BB | 0.34 | 0.40 | 0.19 | 0.30 | 0.20 | 0.43 | 0.31 |
| BC | 0.33 | 0.40 | 0.20 | 0.29 | 0.20 | 0.42 | 0.31 |
| BL | 0.30 | 0.39 | 0.19 | 0.28 | 0.18 | 0.42 | 0.29 |
| BRR | 0.32 | 0.40 | 0.20 | 0.29 | 0.20 | 0.42 | 0.30 |
| SVM | 0.32 | 0.47 | 0.28 | 0.26 | 0.27 | 0.39 | 0.33 |
| RF | 0.24 | 0.23 | 0.27 | 0.37 | 0.30 | 0.47 | 0.31 |
| Mean | 0.31 | 0.39 | 0.21 | 0.30 | 0.22 | 0.43 |  |
BL – Bayes Lasso, BRR – Bayes Ridge Regression, BA – Bayes A, BB – Bayes B, BC – Bayes Cπ, SVM – support vector machine, RF – random forest
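The two accuracy tables can be compared programmatically. The following sketch transcribes the reported values and looks up the best-performing method per trait and the per-trait mean across methods for each marker system. The variable and function names are hypothetical, and the comparison is a reading aid rather than part of the published analysis.

```python
TRAITS = ["F/B", "S/F", "K/F", "M/F", "O/P", "O/DM"]

# Mean accuracies transcribed from the SSR-based table.
SSR = {
    "RR-BLUP": [0.28, 0.25, 0.17, 0.14, 0.17, 0.14],
    "BA":      [0.29, 0.24, 0.19, 0.15, 0.17, 0.17],
    "BB":      [0.29, 0.24, 0.21, 0.14, 0.16, 0.16],
    "BC":      [0.29, 0.23, 0.20, 0.14, 0.18, 0.18],
    "BL":      [0.28, 0.24, 0.17, 0.14, 0.17, 0.16],
    "BRR":     [0.29, 0.23, 0.18, 0.13, 0.18, 0.19],
    "SVM":     [0.30, 0.30, 0.14, 0.22, 0.28, 0.22],
    "RF":      [0.19, 0.25, 0.24, 0.34, 0.14, 0.28],
}

# Mean accuracies transcribed from the SNP-based table.
SNP = {
    "RR-BLUP": [0.31, 0.40, 0.18, 0.30, 0.21, 0.42],
    "BA":      [0.33, 0.40, 0.20, 0.30, 0.20, 0.43],
    "BB":      [0.34, 0.40, 0.19, 0.30, 0.20, 0.43],
    "BC":      [0.33, 0.40, 0.20, 0.29, 0.20, 0.42],
    "BL":      [0.30, 0.39, 0.19, 0.28, 0.18, 0.42],
    "BRR":     [0.32, 0.40, 0.20, 0.29, 0.20, 0.42],
    "SVM":     [0.32, 0.47, 0.28, 0.26, 0.27, 0.39],
    "RF":      [0.24, 0.23, 0.27, 0.37, 0.30, 0.47],
}

def best_method(table, trait):
    """Return the method with the highest reported accuracy for a trait."""
    j = TRAITS.index(trait)
    return max(table, key=lambda method: table[method][j])

def trait_mean(table, trait):
    """Average the reported accuracy for a trait across all methods."""
    j = TRAITS.index(trait)
    return sum(row[j] for row in table.values()) / len(table)

for trait in TRAITS:
    gain = trait_mean(SNP, trait) - trait_mean(SSR, trait)
    print(f"{trait}: best SSR = {best_method(SSR, trait)}, "
          f"best SNP = {best_method(SNP, trait)}, "
          f"SNP minus SSR mean = {gain:+.2f}")
```

On the transcribed values, the SNP-based mean exceeds the SSR-based mean for every trait, in line with the Mean rows of the two tables.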
Fig. 3. Regression boxplot illustrating predicted vs. observed trait values for F/B, S/F, K/F, M/F, O/P and O/DM, using the best GS method for each trait. The observed trait values were split into three classes. The prediction accuracy is shown in the top-left corner of each plot