Lun Li1, Yan Long2, Libin Zhang1, Jessica Dalton-Morgan3, Jacqueline Batley3, Longjiang Yu4, Jinling Meng5, Maoteng Li4.
Abstract
Predicting the flowering time (FT) trait in Brassica napus from genome-wide markers and detecting the underlying genetic factors are important not only for oilseed producers around the world but also for the other crop industries in the rotation system in China. In previous studies, the low density and mixture of biomarkers used obstructed both genomic selection in B. napus and comprehensive mapping of FT-related loci. In this study, a high-density genome-wide SNP set was genotyped in a doubled-haploid population of B. napus. We first performed genomic prediction of FT in B. napus using SNPs across the genome in ten environments from three geographic regions with eight existing genomic prediction models. All the models achieved comparably high accuracies, verifying the feasibility of genomic prediction in B. napus. Next, we performed large-scale mapping of FT-related loci across the three regions and found 437 associated SNPs, some of which represented known FT genes, such as AP1 and PHYE. The genes tagged by the associated SNPs were enriched in biological processes involved in flower formation. Epistasis analysis revealed significant interactions between detected loci, even among some known FT-related genes. Together, the results show that our large-scale, high-density genotype data are of great practical and scientific value for B. napus. To the best of our knowledge, this is the first evaluation of genomic selection models in B. napus based on a high-density SNP dataset and the first large-scale mapping of FT loci.
Year: 2015 PMID: 25790019 PMCID: PMC4366152 DOI: 10.1371/journal.pone.0119425
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
The performance of various genome-based trait prediction methods applied to flowering time in multiple environments.
| Environment | RR-BLUP | RKHS | Bayesian LASSO | BayesA | BayesB | Random Forest | SVM (linear kernel) | SVM (Gaussian kernel) |
|---|---|---|---|---|---|---|---|---|
| E7 | | 0.704 | 0.694 | 0.690 | 0.692 | 0.630 | 0.653 | 0.702 |
| N3 | 0.660 | 0.645 | 0.661 | 0.675 | 0.669 | 0.596 | | 0.671 |
| N4 | 0.620 | 0.595 | 0.631 | 0.641 | | 0.614 | 0.586 | 0.644 |
| N6 | 0.623 | 0.634 | | 0.623 | 0.624 | 0.618 | 0.551 | 0.628 |
| N7 | 0.711 | | 0.704 | 0.705 | 0.713 | 0.674 | 0.686 | 0.700 |
| S3 | 0.622 | | 0.620 | | 0.638 | 0.609 | 0.537 | 0.634 |
| S4 | 0.750 | 0.748 | 0.748 | 0.749 | 0.749 | 0.714 | 0.715 | |
| S5 | 0.398 | 0.402 | 0.403 | 0.408 | 0.400 | | 0.297 | 0.428 |
| S6 | 0.667 | 0.680 | 0.675 | 0.683 | 0.669 | 0.622 | 0.610 | |
| S7 | 0.620 | 0.624 | 0.617 | 0.632 | 0.633 | 0.586 | 0.614 | |
| Average | 0.638 | 0.639 | 0.639 | 0.645 | 0.644 | 0.611 | 0.593 | 0.651 |
The best prediction model for each environment in the data set is in bold. Performance was evaluated via 10 runs of 10-fold cross-validation, and prediction accuracy was the mean Pearson correlation coefficient between predicted and observed FT.
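The evaluation scheme in the note above (10 runs of 10-fold cross-validation, scored by the Pearson correlation between predicted and observed FT) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the data are synthetic, all function names are hypothetical, the model is plain ridge regression with a fixed penalty used as a stand-in for RR-BLUP, and pooling the out-of-fold predictions within each run before computing the correlation is an assumption, since the source does not say whether correlations were computed per fold or per run.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def ridge_fit_predict(X_tr, y_tr, X_te, lam=1.0):
    """Ridge regression on marker codes (assumed stand-in for RR-BLUP;
    lam is fixed here rather than estimated from variance components)."""
    p = len(X_tr[0])
    mu = sum(y_tr) / len(y_tr)
    A = [[sum(row[i] * row[j] for row in X_tr) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(row[i] * (y - mu) for row, y in zip(X_tr, y_tr)) for i in range(p)]
    beta = solve(A, b)
    return [mu + sum(w * g for w, g in zip(beta, row)) for row in X_te]

def repeated_cv_accuracy(X, y, fit_predict, runs=10, folds=10, seed=0):
    """Return (mean, SD) over runs of the Pearson r between observed values
    and pooled out-of-fold predictions."""
    rng = random.Random(seed)
    n = len(y)
    per_run = []
    for _ in range(runs):
        idx = list(range(n))
        rng.shuffle(idx)                      # new random fold split per run
        pred = [0.0] * n
        for f in range(folds):
            test = idx[f::folds]
            held_out = set(test)
            train = [i for i in idx if i not in held_out]
            yhat = fit_predict([X[i] for i in train], [y[i] for i in train],
                               [X[i] for i in test])
            for i, v in zip(test, yhat):
                pred[i] = v
        per_run.append(pearson(pred, y))
    mean = sum(per_run) / runs
    sd = math.sqrt(sum((r - mean) ** 2 for r in per_run) / (runs - 1))
    return mean, sd

# Synthetic doubled-haploid-like data: 100 lines, 15 markers coded -1/1.
rng = random.Random(1)
X = [[rng.choice((-1, 1)) for _ in range(15)] for _ in range(100)]
effects = [rng.gauss(0, 1) for _ in range(15)]
y = [sum(e * g for e, g in zip(effects, row)) + rng.gauss(0, 1) for row in X]

mean_r, sd_r = repeated_cv_accuracy(X, y, ridge_fit_predict)
print(f"accuracy: {mean_r:.3f} +/- {sd_r:.3f}")
```

Because `repeated_cv_accuracy` takes the model as a callable, any of the eight methods in the table could in principle be plugged in the same way, which keeps the fold splits identical across models.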
Finally, for the subsequent association study, genomic estimated breeding values (GEBVs) were predicted separately for each geographic region using RKHS with year as a covariate. RKHS, as implemented in the R package ‘rrBLUP’, is capable of handling multiple environments.
Fig 1. Comparison of the accuracies achieved by eight existing genomic selection models in each environment.
Each node indicates the mean accuracy over 10 runs of 10-fold cross-validation, and the ranges represent ± one standard deviation. Prediction accuracy was calculated as the Pearson correlation coefficient between predicted and observed FT.
Fig 2. Pairwise linkage disequilibrium (r²) of genomic markers in each chromosome.
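For readers unfamiliar with the statistic in Fig 2, the pairwise r² between two biallelic markers can be sketched as below. This is a minimal illustration under stated assumptions: lines are fully homozygous (as in a doubled-haploid population), alleles are coded 0/1, and the function names and toy genotypes are hypothetical rather than taken from the study.

```python
def ld_r2(a, b):
    """r^2 between two biallelic markers coded 0/1 across homozygous lines.

    r^2 = D^2 / (pA * (1 - pA) * pB * (1 - pB)), where D = pAB - pA * pB.
    Returns NaN if either marker is monomorphic.
    """
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = pab - pa * pb
    denom = pa * (1 - pa) * pb * (1 - pb)
    return d * d / denom if denom > 0 else float("nan")

def pairwise_ld(markers):
    """Symmetric matrix of r^2 values for a list of marker vectors."""
    m = len(markers)
    out = [[1.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(i + 1, m):
            out[i][j] = out[j][i] = ld_r2(markers[i], markers[j])
    return out

# Toy genotypes for eight lines (hypothetical values).
m1 = [0, 0, 1, 1, 0, 1, 1, 0]
m2 = [1, 1, 0, 0, 1, 0, 0, 1]   # complement of m1: complete LD
m3 = [0, 1, 0, 1, 0, 1, 0, 1]   # independent of m1 in this sample

ld = pairwise_ld([m1, m2, m3])
```

With 0/1 coding of homozygous lines, this r² equals the squared Pearson correlation between the two marker vectors.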
Associated SNPs tagging known FT genes.
| SNP | Chromosome | Coordinate | Homologs | Comments |
|---|---|---|---|---|
| | C5 | 7700765 | ref\|NP_177074.1\| Floral homeotic protein APETALA 1 [Arabidopsis thaliana] | |
| | unassigned C genome | 270314394 | ref\|NP_176814.1\| calmodulin 4 [Arabidopsis thaliana] | |
| | A1 | 7194790 | ref\|NP_193547.4\| phytochrome E [Arabidopsis thaliana] | |
| | A1 | 4708098 | ref\|NP_195034.2\| AGC (cAMP-dependent, cGMP-dependent and protein kinase C) kinase family protein [Arabidopsis thaliana] | |
| | A2 | 17249324 | ref\|NP_177058.1\| transcription factor bHLH49 [Arabidopsis thaliana] | |
| | A3 | 29135636 | ref\|NP_001190621.1\| 14-3-3-like protein GF14 kappa [Arabidopsis thaliana] | |
| | unassigned C genome | 349256190 | ref\|NP_568567.1\| Dof zinc finger protein DOF5.2 [Arabidopsis thaliana] | |
| | C7 | 18411249 | ref\|NP_194185.1\| MADS-box protein AGL24 [Arabidopsis thaliana] | |
| | A3 | 20996981 | - | Located near |
| | A1 | 2984710 | - | Located near |
| | A6 | 22522149 | ref\|NP_178291.1\| receptor-like kinase TMK3 [Arabidopsis thaliana] | |
Fig 3. SNPs located in previously found QTLs.
Five linkage groups are shown as lines, and the short black lines represent the QTLs. The blue triangles mark the SNPs located within the confidence intervals of the QTLs.
Fig 4. Illustration of genotype effects of associated SNPs on FT and functional clusters of genes tagged by detected SNPs.
(a) Lines with allele ‘A’ at UQnapus0052 tend to flower earlier than those with allele ‘B’ (t-test P-values from 3.47 × 10⁻⁹ to 3.95 × 10⁻¹). (b) At UQnapus0097, by contrast, lines with allele ‘B’ are more likely to flower earlier (t-test P-values from 1.23 × 10⁻⁷ to 1.44 × 10⁻¹). (c) Functional clusters with enrichment score > 1.3 (corresponding to a p-value of 0.05).
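The correspondence between an enrichment score of 1.3 and a p-value of 0.05 in panel (c) follows from a negative log10 transform. The sketch below assumes the score of a cluster is the negative log10 of the geometric mean of its member terms' p-values, a common definition in functional annotation clustering; the source does not state the exact formula, and the function name and example p-values are hypothetical.

```python
import math

def enrichment_score(p_values):
    """-log10 of the geometric mean of the p-values in one cluster
    (assumed definition; not confirmed by the source)."""
    mean_log10 = sum(math.log10(p) for p in p_values) / len(p_values)
    return -mean_log10

# Score equivalent to a single p-value of 0.05.
threshold = -math.log10(0.05)

# A hypothetical cluster of three annotation terms.
score = enrichment_score([0.01, 0.03, 0.04])
passes = score > threshold
```

Under this definition, a cluster passes the 1.3 cut-off exactly when the geometric mean of its p-values is below about 0.05.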