Ning Gao, Jiaqi Li, Jinlong He, Guang Xiao, Yuanyu Luo, Hao Zhang, Zanmou Chen, Zhe Zhang.
Abstract
BACKGROUND: In recent years, with the development of high-throughput sequencing technology and the commercial availability of genotyping bead chips, increasing attention has been directed towards the use of abundant genetic markers in animal and plant breeding programs, human disease risk prediction, and personalized medicine. Several approaches to genomic prediction have been developed and are widely used, but their accuracy still leaves room for improvement. In this study, an improved Bayesian approach, termed BayesBπ, is proposed; it differs from the original BayesB in how priors are assigned. An effective method for calculating the locus-specific π by converting p-values from association tests between SNPs and trait phenotypes is given and systematically validated using a German Holstein dairy cattle population. Furthermore, the new method is applied to a loblolly pine (Pinus taeda) dataset.
Year: 2015 PMID: 26466667 PMCID: PMC4606514 DOI: 10.1186/s12863-015-0278-9
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Descriptive statistics of trait phenotypes
| Datasets | Traits^a | N | Min. | Mean | Max. | S.D. | CV% |
|---|---|---|---|---|---|---|---|
| Dairy cattle | MY | 5024 | −3.383 | 0.000 | 3.319 | 1.000 | – |
| | MFP | 5024 | −3.569 | 0.000 | 4.281 | 1.000 | – |
| | SCS | 5024 | −4.462 | 0.000 | 3.469 | 1.000 | – |
| Loblolly Pine | HT | 927 | −287.700 | 20.300 | 226.10 | 73.315 | 361.158 |
| | HTLC | 927 | −94.110 | 3.304 | 89.080 | 24.976 | 755.932 |
| | BHLC | 927 | −1.578 | 0.092 | 1.573 | 0.507 | 551.087 |
| | DBH | 927 | −5.439 | 0.294 | 1.349 | 4.150 | 1411.565 |
| | CWAL | 927 | −91.190 | 2.443 | 130.800 | 27.326 | 1118.543 |
| | CWAC | 927 | −140.600 | 2.276 | 157.000 | 42.033 | 1846.793 |
| | BD | 927 | −0.608 | −0.004 | 1.739 | 0.249 | −6225.000 |
| | BA | 927 | −24.560 | −0.261 | 21.140 | 7.315 | −2802.682 |
| | Rootnum_bin | 927 | −0.779 | 0.107 | 0.602 | 0.258 | 241.121 |
| | Rootnum | 927 | −2.422 | 0.321 | 4.368 | 0.960 | 299.065 |
| | Rust_bin | 927 | −0.482 | −0.014 | 0.822 | 0.399 | −2850.000 |
| | Rust_gall_vol | 927 | −1.175 | −0.022 | 5.212 | 1.132 | −5145.454 |
| | Stiffness | 927 | −3.244 | 0.095 | 6.082 | 1.225 | 1289.474 |
| | Lignin | 927 | −3.644 | 0.050 | 4.073 | 1.200 | 2400.000 |
| | LateWood | 927 | −4.544 | 0.090 | 4.878 | 1.571 | 1745.556 |
| | Density | 927 | −10.290 | −0.053 | 17.610 | 2.498 | −4713.208 |
| | C5C6 | 927 | −8.102 | −0.049 | 9.057 | 2.649 | −5406.122 |
^a MY, milk yield; MFP, milk fat percentage; SCS, somatic cell score; HT, total stem height; HTLC, total height to the base of the live crown; BHLC, basal height of the live crown; DBH, stem diameter; CWAL, crown width along the planting beds; CWAC, crown width across the planting beds; BD, average branch diameter; BA, average branch angle; Rootnum_bin, presence or absence of roots; Rootnum, root number; Rust_bin, presence or absence of rust; Rust_gall_vol, gall volume; Lignin, lignin content; LateWood, latewood percentage; Density, wood specific gravity; C5C6, C5C6 content. In the dairy cattle population, phenotypes were rescaled to standard normal distributions.
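The CV% column is not defined in the text. A minimal sketch, assuming the usual definition CV% = 100 × S.D. / mean, reproduces the tabulated values and shows why the loblolly pine CVs are so large or negative: the trait means are close to zero. The function name `cv_percent` is illustrative only.

```python
def cv_percent(mean: float, sd: float) -> float:
    """Coefficient of variation in percent, assumed here to be 100 * SD / mean."""
    if mean == 0:
        # A zero mean leaves the CV undefined (the dairy cattle rows show a dash).
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * sd / mean


# (mean, S.D.) pairs copied from the descriptive statistics table
traits = {
    "HT": (20.300, 73.315),
    "BD": (-0.004, 0.249),
}

cv_by_trait = {name: round(cv_percent(m, s), 3) for name, (m, s) in traits.items()}
print(cv_by_trait)
```

Under this assumption, HT gives 361.158 and BD gives −6225.000, matching the table. A mean near zero inflates the ratio, and a negative mean flips its sign.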
Fig. 1. Distribution of p-values and locus-specific π for three traits across the genome in the dairy cattle population. The rows show milk fat percentage (FP), milk yield (MY), and somatic cell score (SCS), respectively. The four columns show the distribution of ω, the density of ω, the distribution of locus-specific π, and the density of locus-specific π, respectively, where ω = −log10(p-value). The p-values are derived from single-marker ANOVA for all markers. The logarithmic transformation of the p-values is applied for convenient visualization and for later use. The locus-specific π is derived from the ANOVA p-values via formula (4). Since π is the proportion of markers with no effect, 1 − π is taken as the probability that a marker has an effect. For milk yield and milk fat percentage, the cluster on chromosome 14 corresponds to the genomic segment containing the DGAT1 gene. For somatic cell score, no cluster is observed because the trait lacks major genes. The distributions of the locus-specific π are consistent with prior knowledge of the genetic architectures of these traits. The plots were drawn in R (http://www.r-project.org/).
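The transformation ω = −log10(p-value) used in the caption can be sketched as follows. This covers only the log transformation; formula (4), which maps p-values to the locus-specific π, is not reproduced here. The p-values in the example are hypothetical and not taken from the study.

```python
import math


def omega(p_value: float) -> float:
    """Return ω = -log10(p) for a single-marker p-value in (0, 1]."""
    if not 0.0 < p_value <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p_value)


# Hypothetical p-values for three markers (illustration only)
example_p_values = [0.5, 0.01, 1e-8]
omegas = [omega(p) for p in example_p_values]
print(omegas)
```

Smaller p-values give larger ω, so strongly associated markers, such as those near DGAT1 for milk traits, stand out as peaks on the ω scale.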
Accuracy of genomic prediction of three traits in the German cattle population, r(EBVs, GEBVs)
| Traits | Subpopulation | GBLUP | BayesB | BayesBπ | BayesCπ |
|---|---|---|---|---|---|
| MY | 200 | | 0.385 ± 0.018 | 0.382 ± 0.016 | 0.128 ± 0.016 |
| | 500 | 0.547 ± 0.007 | 0.547 ± 0.012 | | 0.324 ± 0.010 |
| | 1000 | 0.620 ± 0.005 | | | 0.560 ± 0.006 |
| | 2000 | 0.693 ± 0.003 | | 0.716 ± 0.002 | 0.718 ± 0.002 |
| | Mean | 0.574 ± 0.006 | 0.579 ± 0.009 | | 0.432 ± 0.008 |
| MFP | 200 | 0.353 ± 0.012 | | 0.544 ± 0.018 | 0.112 ± 0.012 |
| | 500 | 0.467 ± 0.008 | 0.629 ± 0.011 | | 0.332 ± 0.005 |
| | 1000 | 0.594 ± 0.004 | 0.709 ± 0.007 | | 0.709 ± 0.007 |
| | 2000 | 0.698 ± 0.003 | | 0.799 ± 0.002 | 0.799 ± 0.001 |
| | Mean | 0.528 ± 0.007 | 0.678 ± 0.010 | | 0.488 ± 0.006 |
| SCS | 200 | | 0.292 ± 0.015 | 0.290 ± 0.018 | 0.161 ± 0.017 |
| | 500 | | 0.440 ± 0.011 | 0.465 ± 0.009 | 0.265 ± 0.006 |
| | 1000 | 0.568 ± 0.004 | 0.570 ± 0.006 | | 0.535 ± 0.005 |
| | 2000 | | 0.647 ± 0.002 | 0.647 ± 0.002 | 0.646 ± 0.002 |
| | Mean | 0.508 ± 0.009 | 0.487 ± 0.008 | | 0.402 ± 0.008 |
The highest accuracies (mean ± SE) among methods in each scenario (each subpopulation within a trait) are in bold. For each trait, accuracies are averaged across subpopulations to assess the overall performance of each method (the “Mean” rows). For example, the overall performance of GBLUP for MY is the mean of its prediction accuracies for this trait across subpopulations 200, 500, 1000, and 2000.
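As a rough illustration of the two quantities in this table, the sketch below computes a Pearson correlation (assumed here to be the r in r(EBVs, GEBVs)) and recomputes one “Mean” entry from the per-subpopulation accuracies. The EBV and GEBV vectors are hypothetical; the four GBLUP accuracies for MFP are copied from the table.

```python
from math import sqrt
from statistics import mean


def pearson_r(x: list[float], y: list[float]) -> float:
    """Pearson correlation between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical EBVs and GEBVs for five validation animals (illustration only)
ebvs = [0.2, -1.1, 0.7, 1.5, -0.4]
gebvs = [0.1, -0.6, 0.9, 1.0, -0.8]
accuracy = pearson_r(ebvs, gebvs)

# GBLUP accuracies for MFP in subpopulations 200, 500, 1000, 2000 (from the table)
gblup_mfp = [0.353, 0.467, 0.594, 0.698]
overall_gblup_mfp = round(mean(gblup_mfp), 3)
print(overall_gblup_mfp)
```

The recomputed average is 0.528, which agrees with the “Mean” entry for GBLUP under MFP.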
Fig. 2. Impact of population size on genomic prediction accuracy. The prediction accuracies of each method in each subpopulation are averaged across the three traits to assess the overall performance of each method at each subpopulation size. For example, the accuracies of GBLUP in subpopulation 200 are averaged across the three traits to obtain its overall performance at this population size.
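The averaging behind this figure can be sketched for one method. The example uses the BayesCπ accuracies from the dairy cattle accuracy table (traits in the order MY, MFP, SCS) and averages them within each subpopulation size. The variable names are illustrative, and the standard errors are ignored in this simplification.

```python
from statistics import mean

# BayesCπ accuracies per subpopulation size, in trait order MY, MFP, SCS
bayes_c_pi = {
    200: [0.128, 0.112, 0.161],
    500: [0.324, 0.332, 0.265],
    1000: [0.560, 0.709, 0.535],
    2000: [0.718, 0.799, 0.646],
}

# Average across the three traits within each subpopulation size
overall_by_size = {size: round(mean(acc), 3) for size, acc in bayes_c_pi.items()}
print(overall_by_size)
```

For this method the averaged accuracy rises from about 0.134 at size 200 to about 0.721 at size 2000, the kind of trend with population size that the figure summarizes.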
Accuracy of genomic prediction for 17 traits in the loblolly pine population, r(deregressed phenotypes, GEBVs)
| Trait category | Traits | GBLUP | BayesB | BayesBπ | BayesCπ |
|---|---|---|---|---|---|
| Growth | HT | | 0.351 ± 0.003 | 0.363 ± 0.002 | 0.374 ± 0.002 |
| | HTLC | | 0.449 ± 0.002 | 0.448 ± 0.002 | 0.449 ± 0.001 |
| | BHLC | | 0.468 ± 0.007 | 0.479 ± 0.007 | 0.487 ± 0.002 |
| | DBH | | 0.436 ± 0.003 | 0.446 ± 0.003 | 0.458 ± 0.002 |
| Development | CWAL | 0.381 ± 0.003 | 0.386 ± 0.003 | | 0.382 ± 0.002 |
| | CWAC | 0.468 ± 0.002 | 0.468 ± 0.002 | | 0.469 ± 0.002 |
| | BD | 0.262 ± 0.004 | 0.263 ± 0.004 | | 0.264 ± 0.003 |
| | BA | | 0.497 ± 0.002 | 0.500 ± 0.003 | |
| | Rootnum_bin | 0.277 ± 0.003 | 0.272 ± 0.004 | | 0.275 ± 0.002 |
| | Rootnum | | 0.245 ± 0.003 | 0.253 ± 0.003 | 0.261 ± 0.002 |
| Disease resistance | Rust_bin | 0.306 ± 0.004 | | 0.353 ± 0.004 | 0.32 ± 0.003 |
| | Rust_gall_vol | 0.259 ± 0.005 | | 0.292 ± 0.006 | 0.267 ± 0.004 |
| Wood quality | Stiffness | | 0.401 ± 0.003 | 0.410 ± 0.003 | 0.422 ± 0.002 |
| | Lignin | | 0.173 ± 0.005 | 0.176 ± 0.005 | 0.178 ± 0.003 |
| | LateWood | 0.254 ± 0.003 | 0.254 ± 0.003 | | 0.253 ± 0.002 |
| | Density | | 0.226 ± 0.003 | 0.234 ± 0.003 | |
| | C5C6 | | 0.247 ± 0.004 | 0.257 ± 0.004 | 0.262 ± 0.003 |
| Mean accuracy | – | 0.345 ± 0.003 | 0.343 ± 0.004 | | 0.345 ± 0.002 |
The highest accuracies (mean ± SE) among methods for each trait are in bold.