Jhonathan P R Dos Santos, Samuel B Fernandes, Scott McCoy, Roberto Lozano, Patrick J Brown, Andrew D B Leakey, Edward S Buckler, Antonio A F Garcia, Michael A Gore.
Abstract
The ability to connect genetic information between traits over time allows Bayesian networks to offer a powerful probabilistic framework for constructing genomic prediction models. In this study, we phenotyped a diversity panel of 869 biomass sorghum (Sorghum bicolor (L.) Moench) lines, which had been genotyped with 100,435 SNP markers, for plant height (PH) with biweekly measurements from 30 to 120 days after planting (DAP) and for end-of-season dry biomass yield (DBY) in four environments. We evaluated five genomic prediction models: Bayesian network (BN), pleiotropic Bayesian network (PBN), dynamic Bayesian network (DBN), multi-trait GBLUP (MTr-GBLUP), and multi-time GBLUP (MTi-GBLUP) models. In fivefold cross-validation, prediction accuracies ranged from 0.46 (PBN) to 0.49 (MTr-GBLUP) for DBY and from 0.47 (DBN, DAP120) to 0.75 (MTi-GBLUP, DAP60) for PH. Forward-chaining cross-validation further improved prediction accuracies of the DBN, MTi-GBLUP, and MTr-GBLUP models for PH (training slice: 30-45 DAP) by 36.4-52.4% relative to the BN and PBN models. Coincidence indices (target: biomass, secondary: PH) and a coincidence index based on lines (PH time series) showed that the ranking of lines by PH changed minimally after 45 DAP. These results suggest that a two-level indirect selection method for PH at harvest (first-level target trait) and DBY (second-level target trait) could be conducted earlier in the season, based on the ranking of lines by PH at 45 DAP (secondary trait). With the advance of high-throughput phenotyping technologies, our proposed two-level indirect selection framework could be valuable for enhancing genetic gain per unit of time when selecting on developmental traits.
Keywords: Bayesian networks; GenPred; Genomic Prediction; Shared Data Resources; biomass sorghum; genomic prediction; indirect selection; probabilistic programming
Year: 2020 PMID: 31852730 PMCID: PMC7003104 DOI: 10.1534/g3.119.400759
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Overview of the Bayesian models. Bayesian network (BN), pleiotropic Bayesian network (PBN), and dynamic Bayesian network (DBN) probabilistic graphical models. The models are defined in terms of: k, the number of time points; p, the number of artificial bins; the number of lines within a time point; the adjusted means for a given line evaluated at a given time point, which can be for trait 1 (Tr1) or trait 2 (Tr2); the population means; the row vector with artificial bins; the column vectors with artificial bin effects; the pleiotropic bin effects; the standardized pleiotropic bin effect; the pleiotropic mean hyperparameters; the bin effects between the current and previous time points; the pleiotropic standard deviation hyperparameters; and the standard deviations.
Figure 2. Summary statistics of evaluated phenotypes. Correlations between adjusted means, heritabilities, and distributions of adjusted means.
Prediction accuracies obtained from the fivefold cross-validation scheme by training the Bayesian network (BN), pleiotropic Bayesian network (PBN), dynamic Bayesian network (DBN), multi-time GBLUP (MTi-GBLUP), and multi-trait GBLUP (MTr-GBLUP) models to predict dry biomass yield (DBY) collected at harvest and plant height (PH) measured at different days after planting (DAP). For each Bayesian model, the standard deviation of the prediction accuracy is reported in parentheses.
| Trait | BN | PBN | DBN | MTi-GBLUP | MTr-GBLUP |
|---|---|---|---|---|---|
| DBY | 0.47 (0.021) | 0.46 (0.009) | — | — | 0.49 |
| PH-30 | 0.54 (0.021) | 0.53 (0.020) | 0.49 (0.022) | 0.57 | 0.57 |
| PH-45 | 0.59 (0.016) | 0.58 (0.018) | 0.57 (0.016) | 0.62 | 0.62 |
| PH-60 | 0.72 (0.013) | 0.71 (0.015) | 0.50 (0.016) | 0.75 | 0.74 |
| PH-75 | 0.69 (0.015) | 0.69 (0.015) | 0.53 (0.013) | 0.73 | 0.73 |
| PH-90 | 0.67 (0.015) | 0.67 (0.016) | 0.52 (0.013) | 0.71 | 0.71 |
| PH-105 | 0.66 (0.016) | 0.65 (0.016) | 0.52 (0.013) | 0.70 | 0.70 |
| PH-120 | 0.61 (0.018) | 0.60 (0.018) | 0.47 (0.014) | 0.66 | 0.66 |
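The fivefold cross-validation behind this table can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes prediction accuracy is the Pearson correlation between observed adjusted means and predictions for held-out lines (the text here does not define it), and `fit_predict` is a hypothetical callback standing in for any of the five models.

```python
import math
import random


def kfold_indices(n_lines, k=5, seed=0):
    """Randomly partition line indices 0..n_lines-1 into k near-equal folds."""
    idx = list(range(n_lines))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cv_accuracy(observed, fit_predict, k=5, seed=0):
    """Average, over folds, the correlation between observed and predicted values.

    fit_predict(train_idx, test_idx) is a hypothetical callback: it trains on
    the lines in train_idx and returns predictions for test_idx, in order.
    Using Pearson correlation as "accuracy" is an assumption of this sketch.
    """
    folds = kfold_indices(len(observed), k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(observed)) if i not in held_out]
        predictions = fit_predict(train_idx, test_idx)
        accuracies.append(pearson([observed[i] for i in test_idx], predictions))
    return sum(accuracies) / k
```

With the 869 lines of the panel, `kfold_indices(869)` yields four folds of 174 lines and one of 173, so each line is held out exactly once.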
Figure 3. Genomic prediction of plant height with forward-chaining cross-validation. Prediction accuracies were obtained from prediction models exploiting single (Bayesian network and pleiotropic Bayesian network) or multiple time points (dynamic Bayesian network, multi-time GBLUP, and multi-trait GBLUP). The horizontal axis represents the slice (:) of the time interval used for training the models with multiple time points, and the vertical axis represents the testing data. The '*' symbol denotes the days after planting (DAP) time point used to obtain the adjusted means.
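The forward-chaining scheme in this figure can be illustrated with a short sketch that enumerates growing training slices and the later time points left for testing. The time points are those listed for plant height; the function name, the `min_train=2` default (so the first slice is 30:45 DAP), and the choice to test on every later time point are assumptions made for illustration.

```python
# Plant height measurement times, in days after planting (DAP).
DAP = [30, 45, 60, 75, 90, 105, 120]


def forward_chaining_splits(time_points, min_train=2):
    """Return (train, test) pairs where training always precedes testing.

    Each split trains on the first `end` time points and tests on all later
    ones, so no information from the future leaks into training.
    """
    splits = []
    for end in range(min_train, len(time_points)):
        splits.append((time_points[:end], time_points[end:]))
    return splits


for train, test in forward_chaining_splits(DAP):
    print(f"train slice {train[0]}:{train[-1]} DAP -> test on {test}")
```

Unlike the fivefold scheme, the split here is along the time axis, which is what allows an early slice such as 30:45 DAP to be evaluated as a predictor of later plant height.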
Figure 4. The calculated coincidence index (CI) at multiple developmental stages. The CIs for selecting the top 20% for dry biomass yield at specific developmental stages were calculated using as reference the adjusted mean values obtained by training the Bayesian network (BN), pleiotropic Bayesian network (PBN), and dynamic Bayesian network (DBN) models with the plant height time series. The CIs from the multi-time GBLUP (MTi-GBLUP) and multi-trait GBLUP (MTr-GBLUP) models were not plotted because their point estimates do not include confidence intervals (point estimates are reported in the text). The '*' symbol denotes the time point estimates used to obtain the adjusted means as expected values for indirect selection. For the DBN model that leveraged multiple time points, the '*' symbol denotes the last time point used for training, with the earlier time points also considered in the model.
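As a rough illustration of the idea behind a coincidence index, the sketch below computes the share of lines in the top 20% for a target trait that are also in the top 20% for a secondary trait. This plain overlap proportion is an assumption; the exact formula used in the study is not given here and may, for example, correct for chance agreement. The function names and toy values are hypothetical.

```python
def top_fraction(values, fraction=0.2):
    """Return the set of line IDs ranked in the top `fraction` by value.

    `values` maps line ID -> adjusted mean (or predicted value).
    """
    n_selected = max(1, round(len(values) * fraction))
    ranked = sorted(values, key=values.get, reverse=True)
    return set(ranked[:n_selected])


def coincidence_index(target, secondary, fraction=0.2):
    """Proportion of lines selected on the target trait that are also
    selected on the secondary trait (simple overlap version, an assumption)."""
    selected_target = top_fraction(target, fraction)
    selected_secondary = top_fraction(secondary, fraction)
    return len(selected_target & selected_secondary) / len(selected_target)


# Toy data for ten hypothetical lines: biomass as target, early height as secondary.
biomass = {f"line{i}": float(i) for i in range(10)}
height_45 = dict(biomass, line8=0.5, line7=8.5)  # line7 overtakes line8
ci = coincidence_index(biomass, height_45)
```

In the toy data, selecting two of ten lines on each trait shares one line, so the index is one half; an index near 1 would mean early height picks nearly the same lines as end-of-season biomass.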