Annika Perry, Witold Wachowiak, Joan Beaton, Glenn Iason, Joan Cottrell, Stephen Cavers.
Abstract
In tree species, genomic prediction offers the potential to forecast mature trait values in early growth stages, if robust marker-trait associations can be identified. Here we apply a novel multispecies approach using genotypes from a new genotyping array, based on 20,795 single nucleotide polymorphisms (SNPs) from three closely related pine species (Pinus sylvestris, Pinus uncinata and Pinus mugo), to test for associations with growth and phenology data from a common garden study. Predictive models constructed using significantly associated SNPs were then tested and applied to an independent multisite field trial of P. sylvestris and the capability to predict trait values was evaluated. One hundred and eighteen SNPs showed significant associations with the traits in the pine species. Common SNPs (MAF > 0.05) associated with bud set were only found in genes putatively involved in growth and development, whereas those associated with growth and budburst were also located in genes putatively involved in response to environment and, to a lesser extent, reproduction. At one of the two independent sites, the model we developed produced highly significant correlations between predicted values and observed height data (YA, height 2020: r = 0.376, p < 0.001). Predicted values estimated with our budburst model were weakly but positively correlated with duration of budburst at one of the sites (GS, 2015: r = 0.204, p = 0.034; 2018: r = 0.205, p = 0.034–0.037) and negatively associated with budburst timing at the other (YA: r = −0.202, p = 0.046). Genomic prediction resulted in the selection of sets of trees whose mean height was taller than the average for each site. Our results provide tentative support for the capability of prediction models to forecast trait values in trees, while highlighting the need for caution in applying them to trees grown in different environments.
Keywords: SNP array; Scots pine; common garden trial; genetic variation; local adaptation; marker–trait association; predictive model; quantitative trait
Year: 2022 PMID: 35233251 PMCID: PMC8867712 DOI: 10.1111/eva.13345
Source DB: PubMed Journal: Evol Appl ISSN: 1752-4571 Impact factor: 5.183
FIGURE 1 Plant material, datasets and analyses used in the study. MU: P. mugo; SY: P. sylvestris and UN: P. uncinata
FIGURE 2 Geographical location of sampled pine populations across Europe (map on left: association trial) and Scotland (map on right: independent trial). Pine species: MU, P. mugo; SY, P. sylvestris and UN, P. uncinata. Independent trial map shows genotyped P. sylvestris (SY‐G) and other P. sylvestris populations included in the trial but not genotyped (SY). Multi‐site field sites in independent trial: GS, Glensaugh and YA, Yair. Association trial map: Europe base map credit: Natural Earth, Esri France. Independent trial map contains OS data © Crown Copyright and database right 2020
Pearson's correlation coefficient (r) and associated significance values for comparison of predicted and actual values for each trait, both with and without a minor allele frequency (MAF) filter, when using prediction models constructed with single nucleotide polymorphisms (SNPs) significantly associated with each trait (Budburst; Growth), random sets of SNPs (10 sets of randomly selected SNPs for each model, with 95% confidence intervals reported) or all polymorphic SNPs
| Training trait | SNP set | Species datasets | N SNPs (MAF: No/Yes) | r (MAF: No) | r (MAF: Yes) |
|---|---|---|---|---|---|
| Predictive models: Budburst | | | | | |
| BB2011 | Budburst | a | 25/17 | 0.40*** | 0.23** |
| | Budburst | b | 15/11 | 0.41*** | 0.30*** |
| | Budburst | c | 13/9 | 0.38*** | 0.13 |
| | Random | NA | 16/11 | 0.08 ± 0.05 | 0.07 ± 0.05 |
| | All SNPs | NA | 15,019/7712 | 0.57*** | 0.57*** |
| Predictive models: Growth | | | | | |
| H2013 | Growth | b | 14/11 | 0.26*** | 0.25** |
| | Growth | c | 7/4 | 0.20** | 0.19* |
| | Random | NA | 14/7 | 0.15 ± 0.06 | 0.17 ± 0.05 |
| | All SNPs | NA | 15,019/7712 | 0.49*** | 0.48*** |
| I2013 | Growth | b | 14/11 | 0.19* | 0.14 |
| | Growth | c | 7/4 | 0.19* | 0.09 |
| | Random | NA | 14/7 | 0.09 ± 0.06 | 0.09 ± 0.06 |
| | All SNPs | NA | 15,019/7712 | 0.35*** | 0.35*** |
Species datasets, single nucleotide polymorphisms (SNPs) identified as significantly associated with the trait in: (a) all species' datasets (i.e. MU‐SY‐UN, MU‐UN and SY); (b) just datasets containing SY (i.e. MU‐SY‐UN and SY); (c) just SY. All models trained using a subset of the SY dataset and validated using the remaining SY trees.
MAF: No = no minor allele frequency filter applied; Yes = only common (MAF > 0.05) SNPs included. MAF was calculated using the datasets from which the SNPs were originally identified as being associated with each trait.
Significance values: *p: 0.01–0.05; **p: 0.001–0.01; ***p < 0.001.
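The MAF filter described in the table notes can be illustrated with a minimal sketch. It assumes biallelic SNPs with diploid genotypes coded as allele counts (0, 1 or 2) and missing calls stored as `None`. The function names and the example calls are hypothetical and are not taken from the study's pipeline.

```python
from typing import Optional, Sequence


def minor_allele_frequency(genotypes: Sequence[Optional[int]]) -> float:
    """Return the minor allele frequency for one biallelic SNP.

    Assumption: each genotype is the count of one allele (0, 1 or 2)
    in a diploid tree, and None marks a missing call.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no genotype calls for this SNP")
    freq = sum(called) / (2 * len(called))  # frequency of the counted allele
    return min(freq, 1.0 - freq)


def is_common(genotypes: Sequence[Optional[int]], threshold: float = 0.05) -> bool:
    """True if the SNP passes the MAF > 0.05 filter."""
    return minor_allele_frequency(genotypes) > threshold


# Hypothetical calls for 20 trees at two SNPs.
snp_a = [0] * 14 + [1] * 4 + [2] * 2  # 8 copies among 40 alleles
snp_b = [0] * 19 + [1]                # 1 copy among 40 alleles
```

Under these assumptions `snp_a` would be retained as a common SNP and `snp_b` would be treated as rare. Because the filter depends on the trees used to compute the frequency, the same SNP can be common in one dataset and rare in another.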
Counts for each type of SNP in individual species and shared among species
| Species set | CR < 80 | Mono | Poly |
|---|---|---|---|
| Individual species | | | |
| SY | 9 | 5767 | 15,019 |
| MU | 4884 | 4639 | 11,272 |
| UN | 288 | 5297 | 15,210 |
| Shared among species | | | |
| SY and MU | 6 | 3700 | 9910 |
| SY and UN | 0 | 4161 | 13,654 |
| MU and UN | 242 | 4170 | 10,430 |
| SY and MU and UN | 0 | 3446 | 9583 |
Species: SY, P. sylvestris; MU, P. mugo; UN, P. uncinata. SNP type: CR < 80, call rate <80%; Mono, monomorphic; Poly, polymorphic.
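The three SNP types can be sketched as a simple classification rule. The sketch assumes that the call-rate check is applied first, so that each SNP falls into exactly one category per species; this is consistent with the individual-species counts each summing to the 20,795 SNPs on the array, but the order of the checks is an assumption. The genotype encoding (strings such as "AA" or "AG", `None` for a failed call) and the function name are hypothetical.

```python
from typing import Optional, Sequence


def classify_snp(calls: Sequence[Optional[str]], min_call_rate: float = 0.80) -> str:
    """Assign one SNP to 'CR < 80', 'Mono' or 'Poly' within one species.

    Assumption: calls holds one genotype string per tree (e.g. "AA", "AG"),
    with None for a failed call.
    """
    if not calls:
        raise ValueError("no trees genotyped")
    called = [c for c in calls if c is not None]
    if len(called) / len(calls) < min_call_rate:
        return "CR < 80"
    # Collect every allele observed across the successful calls.
    alleles = {allele for genotype in called for allele in genotype}
    return "Mono" if len(alleles) == 1 else "Poly"


# Individual-species counts from the table: (CR < 80, Mono, Poly).
table_counts = {
    "SY": (9, 5767, 15019),
    "MU": (4884, 4639, 11272),
    "UN": (288, 5297, 15210),
}
totals = {species: sum(counts) for species, counts in table_counts.items()}
```

Each entry of `totals` equals 20,795, which is the check that motivates treating the three categories as mutually exclusive.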
Total number of single nucleotide polymorphisms (SNPs) associated with phenology and growth traits in the three pine species identified from a mixed linear model (MLM) in TASSEL and a multi‐locus mixed model (MLMM) in R
| Trait | Species | MLM: Common | MLM: Rare | MLMM: Common | MLMM: Rare |
|---|---|---|---|---|---|
| Phenology | | | | | |
| BB2011 | MU‐UN | 9 (3/1) | 25 | 4 | |
| | SY | 7 (6/2) | 11 | 3 | |
| | MU‐SY‐UN | 4 (2/1) | 19 | 3 | |
| BS2010 | SY | 4 | 14 | | |
| | MU‐SY‐UN | (0/1) | | | |
| Growth | | | | | |
| H2011 | SY | 1 | | | |
| | MU‐SY‐UN | 1 | 1 | | |
| H2012 | MU‐SY‐UN | 1 (1/0) | | | |
| H2013 | SY | 2 | | | |
| | MU‐SY‐UN | 4 | 1 | | |
| I2012 | MU‐SY‐UN | 1 | | | |
| I2013 | MU‐UN | 6 (5/0) | 1 | 1 | |
| | SY | 2 (1/0) | 20 | 4 | |
| | MU‐SY‐UN | 6 (3/0) | 4 | 1 | 1 |
Single nucleotide polymorphisms (SNPs) identified in analyses with a minor allele frequency (MAF) filter (excluding MAF < 0.05) are in parentheses: common SNPs identified both with and without a MAF filter are to the left of the forward slash; SNPs identified only with a MAF filter are to the right of the forward slash.
Species codes: MU, P. mugo; SY, P. sylvestris; UN, P. uncinata. Trait codes: budburst (BB); bud set (BS); height (H); increment (I). Common: SNPs with MAF > 0.05; Rare: SNPs with MAF < 0.05.
FIGURE 3 Contribution of putative function groups (G&D: growth and development; R: reproduction; RtE: response to environment) coded for by genes containing single nucleotide polymorphisms significantly associated with each trait (bud set, budburst, height and increment) identified when no MAF filter was applied and as a percentage of the total number of proteins identified for each trait for each species' dataset (MU: P. mugo; SY: P. sylvestris and UN: P. uncinata). Proteins which were uncharacterized, for which no known function in plants was found or for which only cellular processes could be identified are categorized ‘NA.’ Total for each trait may be higher than 100% as there may be more than one putative function assigned to a single protein. MAF: minor allele frequency (MAF > 0.05: common; MAF < 0.05: rare)
Pearson's correlation coefficient (r) and associated significance values for comparison of predicted and observed values for each trait
| Observed trait | Year | Final predictive models: GS | Final predictive models: YA | Predictive model using all SNPs: GS | Predictive model using all SNPs: YA | Predictive model using all SNPs trained with SY from Scotland: GS | Predictive model using all SNPs trained with SY from Scotland: YA |
|---|---|---|---|---|---|---|---|
| Predictive model: Budburst (training trait: BB2011) | | | | | | | |
| Duration | 2015 | 0.204* | 0.080 | −0.005 | 0.088 | −0.031 | −0.051 |
| | 2016 | 0.083 | 0.149 | −0.112 | 0.016 | −0.157 | −0.021 |
| | 2017 | 0.070 | −0.005 | −0.131 | −0.030 | −0.038 | −0.055 |
| | 2018 | 0.205* | −0.080 | −0.150 | −0.105 | 0.087 | 0.072 |
| | 2019 | 0.152 | 0.125 | 0.099 | 0.188 | 0.053 | 0.016 |
| Timing | 2015 | 0.130 | 0.034 | −0.167 | 0.167 | −0.151 | 0.089 |
| | 2016 | 0.071 | −0.004 | −0.101 | 0.004 | −0.110 | −0.069 |
| | 2017 | 0.069 | −0.202* | −0.093 | −0.047 | −0.068 | −0.012 |
| | 2018 | 0.112 | −0.168 | −0.177 | 0.029 | −0.048 | 0.062 |
| | 2019 | 0.125 | −0.037 | 0.111 | 0.134 | 0.046 | 0.038 |
| Predictive model: Growth (training trait: H2013) | | | | | | | |
| Height | 2008 | −0.020 | 0.023 | 0.093 | 0.002 | −0.056 | 0.011 |
| | 2020 | 0.104 | 0.376*** | 0.034 | 0.144 | 0.039 | 0.118 |
| Increment | 2015 | 0.022 | NA | 0.060 | NA | 0.124 | NA |
| | 2016 | 0.173 | 0.299** | 0.149 | 0.158 | 0.190* | 0.142 |
| | 2017 | 0.121 | 0.312** | −0.012 | 0.175 | −0.049 | 0.149 |
| | 2018 | 0.012 | 0.329*** | 0.030 | 0.138 | 0.028 | 0.075 |
| | 2019 | 0.123 | 0.205* | 0.065 | 0.111 | −0.022 | 0.107 |
| | 2020 | −0.001 | 0.262** | −0.058 | 0.110 | −0.142 | 0.078 |
Predicted values estimated by final predictive models for growth and budburst constructed using single nucleotide polymorphisms (SNPs) significantly associated with the traits and assessed for their performance in an internal test. Predictive models constructed using all available SNPs (no MAF filter applied, N SNPs = 15,019) trained using the full SY dataset and also trained with only SY trees from Scotland. Duration: time taken for each tree to progress from stage 4 to stage 6. Timing: time taken to reach stage 6 of budburst. Description of each budburst stage is given in Table S4.
Significance values: *p: 0.01–0.05; **p: 0.001–0.01; ***p < 0.001.
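The comparison of predicted and observed values can be sketched with a plain implementation of Pearson's r together with the star notation used in the table. This is an illustration only: the function names are hypothetical, the p-value is taken as an input rather than computed, and the handling of p-values that fall exactly on a threshold is an assumption, since the notes do not specify it.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def significance_stars(p: float) -> str:
    """Map a p-value to the star notation used in the table notes."""
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Hypothetical predicted and observed values for four trees.
predicted = [1.0, 2.0, 3.0, 4.0]
observed = [1.0, 3.0, 2.0, 4.0]
r = pearson_r(predicted, observed)
```

With this mapping, the p = 0.034 reported in the abstract for budburst duration at GS in 2015 receives a single star, matching the 0.204* entry in the table.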
FIGURE 4 Correlations of observed height measured in 2020 at age 13 against predicted values using the final predictive model for growth for trees in an independent trial at Glensaugh (GS, correlation not significant) and Yair (YA)
FIGURE 5 Height at 13 years (measured before the growing season started in 2020) of 10 trees at Yair (YA) and Glensaugh (GS) selected using different methods. Genomic: genomic selection to identify the predicted 10 tallest trees using values from the final predictive model for growth (single nucleotide polymorphisms (SNPs) identified in both SY and MU‐SY‐UN, no minor allele frequency filter applied, N SNPs = 14). Phenotype: phenotype selection, where the 10 tallest trees at each site prior to the start of the second growing season were selected. The dotted line represents the mean height of trees at each site
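The selection comparison in Figure 5 can be sketched as ranking trees by predicted value, taking the top n, and comparing their mean observed height with the site mean. The function names, tree identifiers and values below are hypothetical, and the sketch makes no claim about how the study computed predicted values.

```python
from statistics import mean
from typing import Dict, List


def select_top(predicted: Dict[str, float], n: int = 10) -> List[str]:
    """Return the IDs of the n trees with the highest predicted values."""
    return sorted(predicted, key=predicted.get, reverse=True)[:n]


def selection_gain(predicted: Dict[str, float],
                   observed: Dict[str, float],
                   n: int = 10) -> float:
    """Mean observed height of the selected trees minus the site mean."""
    chosen = select_top(predicted, n)
    return mean(observed[tree] for tree in chosen) - mean(observed.values())


# Hypothetical predicted values and observed heights for four trees at one site.
predicted_height = {"t1": 3.0, "t2": 1.0, "t3": 2.0, "t4": 0.5}
observed_height = {"t1": 400.0, "t2": 300.0, "t3": 350.0, "t4": 310.0}
gain = selection_gain(predicted_height, observed_height, n=2)
```

A positive gain corresponds to a selected set whose mean height lies above the dotted site-mean line in the figure.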