Jaroslav Klápště1, Heidi S Dungey1, Emily J Telfer1, Mari Suontama1,2, Natalie J Graham1, Yongjun Li1,3, Russell McKinley1.
Abstract
Multivariate analysis using mixed models allows for the exploration of genetic correlations between traits. Additionally, the transition to a genomic-based approach is simplified by substituting classic pedigrees with a marker-based relationship matrix. Multivariate analysis also enables the investigation of correlated responses to selection, trait integration and modularity in different kinds of populations. This study investigated a strategy for the construction of a marker-based relationship matrix that prioritized markers using Partial Least Squares (PLS). The efficiency of this strategy was found to depend on the correlation structure between the investigated traits. In terms of accuracy, we found no benefit of this strategy compared with the all-marker-based multivariate model for the primary trait of diameter at breast height (DBH) in a radiata pine (Pinus radiata) population, possibly due to the presence of strong and well-estimated correlations with other highly heritable traits. Conversely, we did see a benefit in a shining gum (Eucalyptus nitens) population, where the primary trait had low or only moderate genetic correlations with other low or moderately heritable traits. Marker selection in multivariate analysis can therefore be an efficient strategy to improve prediction accuracy for low-heritability traits, due to improved precision in poorly estimated low or moderate genetic correlations. Additionally, our study identified genetic diversity as a factor contributing to the efficiency of marker selection in multivariate approaches, due to the higher precision of genetic correlation estimates.
Keywords: Eucalyptus nitens; PLS; Pinus radiata; genomic prediction; multivariate mixed model; variable selection
Year: 2020 PMID: 33193595 PMCID: PMC7662070 DOI: 10.3389/fgene.2020.499094
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
Heritability estimates and their 95% confidence limits using variance components inferred from the sib-ship reconstruction-based univariate model (BLUP) in E. nitens and from using the pedigree-based univariate model (BLUP) in P. radiata as well as marker-based univariate models (GBLUP).
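| Trait | E. nitens BLUP | E. nitens GBLUP | P. radiata BLUP | P. radiata GBLUP |
| --- | --- | --- | --- | --- |
| TS | 0.242 (0.147–0.338) | 0.539 (0.389–0.689) | NA | NA |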
| WD | 0.282 (0.193–0.371) | 0.559 (0.420–0.699) | 0.588 (0.292–0.884) | 0.529 (0.400–0.658) |
| DBH | 0.138 (0.030–0.245) | 0.089 (−0.049–0.228) | 0.134 (0.024–0.244) | 0.131 (0.052–0.210) |
| ST1 | 0.210 (0.107–0.313) | 0.394 (0.229–0.559) | NA | NA |
| ST2 | 0.093 (−0.001–0.187) | 0.199 (0.044–0.354) | NA | NA |
| GS1 | 0.248 (0.139–0.357) | 0.309 (0.149–0.469) | NA | NA |
| GS2 | 0.211 (0.103–0.319) | 0.318 (0.154–0.481) | NA | NA |
| ST9 | NA | NA | 0.046 (−0.010–0.102) | 0.126 (0.034–0.218) |
| BR9 | NA | NA | 0.128 (0.019–0.237) | 0.177 (0.073–0.282) |
| PME | NA | NA | 0.224 (0.055–0.393) | 0.397 (0.250–0.544) |
NA represents the case where data were not available for a particular species and trait.
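The limits in the table above are symmetric around each estimate, which is consistent with a Wald-type interval (estimate ± z × standard error). The following is a minimal sketch under that assumption. The function names, the two-component heritability formula, and the variance values are illustrative and are not taken from the study, whose models may include further variance components.

```python
from math import isclose

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def heritability(var_additive: float, var_residual: float) -> float:
    """Narrow-sense heritability from two variance components (simplified)."""
    return var_additive / (var_additive + var_residual)


def wald_interval(estimate: float, std_error: float) -> tuple[float, float]:
    """Symmetric 95% limits: estimate -/+ z * SE."""
    half_width = Z_95 * std_error
    return estimate - half_width, estimate + half_width


def implied_std_error(lower: float, upper: float) -> float:
    """Back out the SE implied by a symmetric 95% interval."""
    return (upper - lower) / (2 * Z_95)


# Hypothetical variance components, chosen only for illustration.
h2 = heritability(var_additive=2.0, var_residual=6.0)  # 0.25

# TS row, first value column of the table: 0.242 (0.147-0.338).
se_ts = implied_std_error(0.147, 0.338)
lower, upper = wald_interval(0.242, se_ts)
```

Recomputing the interval from the implied standard error returns limits that agree with the published ones to within rounding, which supports (but does not prove) the Wald-type reading.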
Figure 1. Genetic correlations using variance components and covariances inferred from use of a sib-ship reconstruction-based multivariate model (MVBLUP) (below diagonals) and marker-based relationship matrix (MVGBLUP) (above diagonals) in the E. nitens population (left plot) and using variance components and covariances inferred from use of a pedigree-based multivariate model (MVBLUP) (below diagonals) and marker-based relationship matrix (MVGBLUP) (above diagonals) in the P. radiata population (right plot).
Figure 2. Correlation networks between traits investigated in the E. nitens (left) and P. radiata (right) populations based on genetic correlations estimated in multivariate model using marker-based relationship matrix. Solid lines represent positive correlations and dashed lines represent negative genetic correlations; the thickness of the lines represents magnitude of correlations.
Number of markers selected in different scenarios using only positive (upper part) or both positive and negative (bottom part) marker loadings obtained from PLS-CA procedure.
| Pos | C1 | 970 | 1,940 | 2,909 | 3,879 | 4,849 | 5,864 | 11,728 | 17,591 | 23,455 | 29,318 |
| Pos | C2 | 1,824 | 3,513 | 4,999 | 6,292 | 7,348 | 11,364 | 21,448 | 30,668 | 38,650 | 45,014 |
| Pos | C3 | 2,574 | 4,704 | 6,371 | 7,634 | 8,510 | 15,856 | 28,529 | 38,725 | 46,448 | 51,793 |
| Pos | C4 | 3,318 | 5,776 | 7,456 | 8,555 | 9,180 | 20,773 | 35,762 | 45,963 | 52,179 | 55,740 |
| Pos | C5 | 3,898 | 6,492 | 8,049 | 8,992 | 9,419 | 24,128 | 39,697 | 49,288 | 54,523 | 57,198 |
| Pos | C6 | 4,515 | 7,188 | 8,631 | 9,312 | 9,567 | NA | NA | NA | NA | NA |
| Pos | C7 | 4,997 | 7,632 | 8,896 | 9,452 | 9,627 | NA | NA | NA | NA | NA |
| Pos + Neg | C1 | 1,904 | 3,840 | 5,825 | 7,792 | 9,659 | 10,574 | 22,848 | 35,438 | 47,511 | 58,636 |
| Pos + Neg | C2 | 3,377 | 6,131 | 8,103 | 9,282 | 9,697 | 19,871 | 37,712 | 49,848 | 56,706 | 58,636 |
| Pos + Neg | C3 | 4,578 | 7,502 | 9,048 | 9,612 | 9,697 | 28,337 | 46,493 | 55,238 | 58,271 | 58,636 |
| Pos + Neg | C4 | 5,558 | 8,314 | 9,418 | 9,680 | 9,697 | 34,662 | 51,277 | 57,188 | 58,542 | 58,636 |
| Pos + Neg | C5 | 6,303 | 8,801 | 9,562 | 9,694 | 9,697 | 40,064 | 54,329 | 58,047 | 58,618 | 58,636 |
| Pos + Neg | C6 | 6,946 | 9,108 | 9,639 | 9,696 | 9,697 | NA | NA | NA | NA | NA |
| Pos + Neg | C7 | 7,425 | 9,300 | 9,665 | 9,697 | 9,697 | NA | NA | NA | NA | NA |
NA indicates a scenario that is not applicable to a particular species.
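The way such marker counts can grow with the number of latent variables may be easier to see in code. The sketch below is not the authors' procedure: it uses PLS-SVD (a single SVD of the marker-trait cross-covariance matrix, without the deflation step of PLS-CA), and the rule of keeping a top fraction of loadings per latent variable is a hypothetical choice. That choice is motivated only by the C1 counts for positive loadings, which are close to 10%, 20%, 30%, 40% and 50% of 9,697 and of 58,636 markers. All function names and the simulated data are assumptions.

```python
import numpy as np


def pls_marker_weights(markers, traits, n_components):
    """Marker weight vectors for the first latent variables (PLS-SVD)."""
    x = markers - markers.mean(axis=0)
    y = (traits - traits.mean(axis=0)) / traits.std(axis=0)
    # Left singular vectors of X'Y give one weight per marker and component.
    # Note: the sign of each singular vector is arbitrary.
    u, _, _ = np.linalg.svd(x.T @ y, full_matrices=False)
    return u[:, :n_components]  # shape: (n_markers, n_components)


def select_markers(weights, fraction, both_signs):
    """Union over latent variables of the markers with the most extreme loadings."""
    n_markers = weights.shape[0]
    n_keep = max(1, int(round(fraction * n_markers)))
    selected = set()
    for column in weights.T:
        order = np.argsort(column)                     # ascending loadings
        selected.update(order[-n_keep:].tolist())      # most positive
        if both_signs:
            selected.update(order[:n_keep].tolist())   # most negative
    return np.array(sorted(selected))


# Simulated data: 60 individuals, 200 markers coded 0/1/2, 3 traits.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(60, 200)).astype(float)
traits = rng.normal(size=(60, 3))

counts_pos = [
    len(select_markers(pls_marker_weights(markers, traits, k), 0.1, False))
    for k in (1, 2, 3)
]
counts_both = [
    len(select_markers(pls_marker_weights(markers, traits, k), 0.1, True))
    for k in (1, 2, 3)
]
```

Because the selection is a union across latent variables, the counts can only stay equal or grow as more latent variables are considered, and they saturate at the total number of markers. The number of latent variables is bounded by the number of traits, which is consistent with seven scenarios for one species and five for the other.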
Figure 3. Correlations between marker-based relationship matrices built from markers selected on the basis of positive loadings only and the marker-based relationship matrix built from all markers in the E. nitens (upper left) and P. radiata (bottom left) populations, and correlations between marker-based relationship matrices built from markers selected on the basis of both positive and negative loadings and the marker-based relationship matrix built from all markers in the E. nitens (upper right) and P. radiata (bottom right) populations. Each line represents the scenario for a different number of latent variables considered in marker selection (e.g., C1: only the first latent variable is considered; C2: only the first two latent variables are considered, etc.).
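A comparison of this kind can be sketched as follows. The source does not state how the relationship matrices were computed or which matrix elements entered the correlation, so the sketch assumes VanRaden's first method for the relationship matrix and correlates the upper triangle including the diagonal. The function names, the random marker subset, and the simulated genotypes are illustrative assumptions.

```python
import numpy as np


def vanraden_g(genotypes):
    """Genomic relationship matrix from 0/1/2 allele counts (VanRaden method 1)."""
    freq = genotypes.mean(axis=0) / 2.0          # allele frequency per marker
    centred = genotypes - 2.0 * freq             # subtract expected allele count
    scale = 2.0 * np.sum(freq * (1.0 - freq))    # sum of 2pq over markers
    return centred @ centred.T / scale


def matrix_correlation(a, b):
    """Pearson correlation between the upper triangles (with diagonal) of a and b."""
    idx = np.triu_indices_from(a)
    return float(np.corrcoef(a[idx], b[idx])[0, 1])


# Simulated data: 50 individuals, 500 markers coded 0/1/2.
rng = np.random.default_rng(1)
genotypes = rng.integers(0, 3, size=(50, 500)).astype(float)

# Hypothetical marker subset standing in for a PLS-based selection.
subset = rng.choice(500, size=100, replace=False)

g_all = vanraden_g(genotypes)
g_subset = vanraden_g(genotypes[:, subset])
corr_subset = matrix_correlation(g_subset, g_all)
```

As the selected subset approaches the full marker set, the two matrices converge and the correlation approaches 1, which is the pattern one would expect along each line of the figure.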
Prediction accuracies and their standard deviations (in parentheses) obtained from multivariate mixed models in the E. nitens population when using a relationship matrix derived from sib-ship reconstruction (MVBLUP), a marker-based relationship matrix using all markers (MVGBLUP), a marker-based relationship matrix using selected SNPs having only positive loadings (MVGBLUP1), or a marker-based relationship matrix using selected SNPs having both positive and negative loadings (MVGBLUP2).
| TS | 0.737 (0.039) | 0.656 (0.069) | 0.754 (0.034) | 0.665 (0.071) | 0.650NS (0.047) | 0.642NS (0.059) |
| WD | 0.782 (0.060) | 0.764 (0.054) | 0.658 (0.068) | 0.768 (0.049) | 0.759NS (0.053) | 0.766NS (0.035) |
| DBH | 0.246 (0.132) | 0.183 (0.117) | 0.541 (0.251) | 0.529 (0.336) | 0.576NS (0.241) | 0.595 |
| ST1 | 0.613 (0.056) | 0.523 (0.098) | 0.621 (0.072) | 0.545 (0.085) | 0.525NS (0.074) | 0.523NS (0.078) |
| ST2 | 0.571 (0.140) | 0.448 (0.131) | 0.582 (0.137) | 0.442 (0.134) | 0.434NS (0.137) | 0.414NS (0.107) |
| GS1 | 0.683 (0.045) | 0.558 (0.071) | 0.720 (0.062) | 0.609 (0.082) | 0.604NS (0.072) | 0.604NS (0.085) |
| GS2 | 0.603 (0.068) | 0.547 (0.076) | 0.737 (0.068) | 0.651 (0.081) | 0.650NS (0.065) | 0.660NS (0.073) |
Predicted EBV/GEBVs were correlated with EBVs estimated when using the multivariate mixed model using either documented pedigree or relationships inferred from sib-ship reconstruction.
** represents a statistically significant test, while NS represents a statistically non-significant test, at α level 0.05.
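Accuracies of the kind reported above are correlations between predicted and reference breeding values, summarised by a mean and a standard deviation. A minimal sketch of that summary follows, assuming the spread comes from repeated validation folds. The source does not describe the validation design, so the fold structure, function names and numbers here are hypothetical.

```python
from statistics import mean, stdev


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def accuracy_summary(folds):
    """Mean and SD of per-fold correlations between predicted and reference values."""
    accuracies = [pearson(predicted, reference) for predicted, reference in folds]
    return mean(accuracies), stdev(accuracies)


# Hypothetical values for two validation folds: (predicted GEBVs, reference EBVs).
folds = [
    ([0.1, 0.4, 0.3, 0.9], [0.2, 0.5, 0.1, 1.0]),
    ([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]),
]
mean_accuracy, sd_accuracy = accuracy_summary(folds)
```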
Prediction accuracies and their standard deviations (in parentheses) obtained from multivariate mixed models in the P. radiata population when using the documented pedigree (MVBLUP), a marker-based relationship matrix using all markers (MVGBLUP), a marker-based relationship matrix using selected SNPs having only positive loadings (MVGBLUP1), or a marker-based relationship matrix using selected SNPs having both positive and negative loadings (MVGBLUP2).
| BR9 | 0.653 (0.088) | 0.550 (0.121) | 0.679 (0.095) | 0.570 (0.136) | 0.586NS (0.134) | 0.589NS (0.123) |
| DBH | 0.441 (0.103) | 0.388 (0.133) | 0.573 (0.069) | 0.611 (0.062) | 0.616NS (0.058) | 0.626NS (0.048) |
| ST9 | 0.638 (0.147) | 0.415 (0.148) | 0.646 (0.126) | 0.435 (0.135) | 0.446NS (0.149) | 0.436NS (0.119) |
| WD | 0.642 (0.043) | 0.645 (0.056) | 0.610 (0.045) | 0.618 (0.064) | 0.627NS (0.064) | 0.631NS (0.044) |
| PME | 0.565 (0.118) | 0.554 (0.119) | 0.553 (0.109) | 0.530 (0.116) | 0.542NS (0.113) | 0.543NS (0.108) |
** represents a statistically significant test, while NS represents a statistically non-significant test, at α level 0.05.
Figure 4. Boxplots of prediction accuracies for each tested scenario in the E. nitens (left plot) and P. radiata (right plot) populations. The red line represents the prediction accuracy for the primary trait (DBH).
Correlations between prediction accuracy and Deviance Information Criterion (DIC) and between prediction accuracy and number of selected markers.
| TS | −0.952 | 0.849 | −0.951 | 0.923 | NA | NA | NA | NA |
| WD | −0.702 | 0.544 | −0.332 | 0.274 | −0.650 | 0.551 | −0.557 | 0.364 |
| DBH | −0.559 | 0.467 | −0.409 | 0.358 | −0.664 | 0.841 | −0.583 | 0.439 |
| ST1 | −0.955 | 0.910 | −0.902 | 0.855 | NA | NA | NA | NA |
| ST2 | −0.582 | 0.657 | −0.455 | 0.504 | NA | NA | NA | NA |
| GS1 | −0.906 | 0.777 | −0.756 | 0.701 | NA | NA | NA | NA |
| GS2 | −0.905 | 0.816 | −0.635 | 0.600 | NA | NA | NA | NA |
| ST9 | NA | NA | NA | NA | −0.223 | 0.029 | 0.147 | 0.246 |
| BR9 | NA | NA | NA | NA | 0.115 | −0.235 | 0.623 | −0.613 |
| PME | NA | NA | NA | NA | −0.721 | 0.251 | −0.194 | 0.082 |
NA represents the case where no data were available for a particular species and trait.