Thomas Keep, Jean-Paul Sampoux, José Luis Blanco-Pastor, Klaus J Dehmer, Matthew J Hegarty, Thomas Ledauphin, Isabelle Litrico, Hilde Muylle, Isabel Roldán-Ruiz, Anna M Roschanski, Tom Ruttink, Fabien Surault, Evelin Willner, Philippe Barre.
Abstract
The natural genetic diversity of agricultural species is an essential genetic resource for breeding programs aiming to improve their ecosystem and production services. A large natural ecotype diversity is usually available for most grassland species. This could be used to recombine natural climatic adaptations and agronomic value to create improved populations of grassland species adapted to future regional climates. However, describing natural genetic resources can be long and costly. Molecular markers may provide useful information to help with this task. This opportunity was investigated for Lolium perenne L., using a set of 385 accessions from the natural diversity of this species collected across Europe and provided by genebanks of several countries. For each of these populations, genotyping provided the allele frequencies of 189,781 SNP markers. GWAS were implemented for over 30 agronomic and/or putatively adaptive traits recorded in three climatically contrasted locations (France, Belgium, Germany). Significant associations were detected for hundreds of markers despite a strong confounding effect of the genetic background; most of them pertained to phenology traits. It is likely that genetic variability in these traits has made an important contribution to environmental adaptation and ecotype differentiation. Genomic prediction models calibrated using natural diversity were found to be highly effective in describing natural populations for almost all traits, as well as commercial synthetic populations for some important traits such as disease resistance, spring growth or phenological traits. These results will certainly be valuable information to support the use of natural genetic resources of other species.
Keywords: GWAS; GenPred; Shared data resources; association study; forage species; genebank; genomic prediction; natural diversity
Year: 2020 PMID: 32727925 PMCID: PMC7466994 DOI: 10.1534/g3.120.401491
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Spatial distribution of the sites of origin of the 385 perennial ryegrass natural populations under study and of the locations of the three common gardens in which the populations were phenotyped. The 1989-2010 norm of isothermality, i.e., mean diurnal temperature range over annual temperature range (WorldClim derived bioclimatic variable bio3), is displayed as map background.
Brief presentation of the traits
| Trait family | Trait description | Trait name |
|---|---|---|
| Vigor after sowing | Days from sowing to emergence | DES_po15 |
| | Vigor after sowing | VAS_lu15, VAS_me15, VAS_po15 |
| | Regularity after sowing | RAS_lu15, RAS_me15, RAS_po15 |
| Morphology of plants and sward density | Leaf lamina width | LMW_po16, LMW_me16, LMW_lu17 |
| | Growth habit | GRH_avg |
| | Sward density | DVG_04_lu17 |
| Phenology | Percentage of plants heading in first year | HFY_lu15, HFY_po15 |
| | Spike emergence (heading) date in GDD | HEA_lu16, HEA_po16, HEA_lu17, HEA_po17 |
| | Aftermath heading (second wave of fertile elongating stems after the first spring wave has been cut) | AHD_lu16, AHD_lu17, AHD_me16, AHD_me17, AHD_po16, AHD_po17 |
| Investment in sexual reproduction | Density of elongated fertile stems | DST_lu17, DST_po17 |
| | Straw height | HST_lu17 |
| | Spike length | LSP_lu17 |
| | Spikelet length | LSL_lu17 |
| | Spikelet count | NSL_lu17 |
| Dynamics of vegetative spring growth | Canopy height at 300 GDD | CHs300_lu16, CHs300_lu17, CHs300_po16, CHs300_po17, CHs300_me16, CHs300_me17 |
| | Canopy height at 500 GDD | CHs500_lu16, CHs500_lu17, CHs500_po16, CHs500_po17, CHs500_me16, CHs500_me17 |
| | Canopy height at 300 GDD | CH300h_lu16, CH300h_lu17, CH300h_po16, CH300h_po17 |
| | Canopy height at 400 GDD | CH400h_lu16, CH400h_lu17, CH400h_po16, CH400h_po17 |
| Summer and autumn growth | Summer maximum canopy height | SMH_lu16, SMH_me16, SMH_me17, SMH_po17 |
| | Summer growth rate | SGR_lu16, SGR_me16, SGR_me17, SGR_po17 |
| | Autumn maximum canopy height | AMH_me17, AMH_po17 |
| | Autumn growth rate | AGR_me17, AGR_po17 |
| Dynamics of regrowth after cutting | Vigor after cutting | VAC_lu16, VAC_lu17, VAC_po17 |
| Abiotic stresses | Drought stress symptoms | DRO_lu16, DRO_po16 |
| | Winter damage | WID_po16, WID_po17 |
| Biotic stresses - Disease damages | Helminthosporium (Drechslera siccans) susceptibility | DHE_01_lu16, DHE_07_lu16, DHE_04_lu17 |
| | Black rust (Puccinia graminis) susceptibility | DRB_lu1516 |
| | Susceptibility to indeterminate diseases | DIS_lu15, DIS_lu16, DIS_lu17, DIS_me16, DIS_me17, DIS_po15, DIS_po17 |
| Dynamics of persistency over successive trial years | Persistency throughout summer | SCD_su15_lu, SCD_su16_lu, SCD_su17_lu, SCD_su17_me |
| | Persistency throughout winter | SCD_wi1516_lu, SCD_wi1617_lu, SCD_wi1718_lu, SCD_wi1617_me, SCD_wi1516_po, SCD_wi1617_po |
| | Persistency throughout the trial duration | SCD_15to18_lu, SCD_15to17_po, SCD_16to17_me |
| Biochemistry of aerial biomass | Lignin content | ADL_04_me17, ADL_10_me17 |
| | Acid-Detergent-Fiber content | ADF_04_me17, ADF_10_me17 |
| | Neutral-Detergent-Fiber content | NDF_04_me17, NDF_10_me17 |
| | Crude protein content | PRT_04_me17, PRT_10_me17 |
| | Water-soluble-carbohydrate content | WSC_04_me17, WSC_10_me17 |
| | Neutral-Detergent-Fiber degradability | DNDF_04_me17, DNDF_10_me17 |
| | Organic matter digestibility | OMD_04_me17, OMD_10_me17 |
| | Nitrogen content of sunlit leaf lamina | NLI_lu16 |
| | Isotopic composition of 13C (δ13C) | 13C_lu16 |
In trait names, the part after the underscore gives the location (PO for Poel, LU for Lusignan and ME for Melle) and the year (avg for average; for example, 15-18 for 2015-2018).
Dates were converted into growing degree days (GDD) with a base temperature of 0°C, starting from the first day after which daily minimum temperature and incident shortwave global radiation no longer fall below 0°C and 60 W m-2, respectively (i.e., from the start of vegetative spring growth).
In addition, new variables, indicated by res at the beginning of the variable name, were obtained by removing the effect of average heading date (HEA_avg).
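The GDD conversion described in the footnote above can be sketched as follows. This is only an illustration under stated assumptions: the function name `growing_degree_days` and its parameters are hypothetical, the daily contribution is taken as the daily mean temperature above the base (the record does not say how daily values were computed), and the input series is assumed to run from winter up to the measurement date, so that the last day below either threshold marks the start of spring growth.

```python
import numpy as np

def growing_degree_days(t_mean, t_min, radiation,
                        base=0.0, t_min_floor=0.0, rad_floor=60.0):
    """Cumulative GDD per day (hypothetical helper, not the authors' code).

    t_mean, t_min: daily mean and minimum temperature (deg C)
    radiation: daily incident shortwave global radiation (W m-2)
    """
    t_mean = np.asarray(t_mean, dtype=float)
    t_min = np.asarray(t_min, dtype=float)
    radiation = np.asarray(radiation, dtype=float)

    # Days on which at least one variable is still below its threshold.
    below = (t_min < t_min_floor) | (radiation < rad_floor)
    # Accumulation starts on the day after the last such day.
    start = int(np.flatnonzero(below).max()) + 1 if below.any() else 0

    # Daily contribution: degrees above the base temperature, never negative.
    daily = np.clip(t_mean - base, 0.0, None)
    daily[:start] = 0.0
    return np.cumsum(daily)
```

A date is then expressed in GDD by reading the cumulative value at the index of that day.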
Figure 2. Histograms, for all available traits, of the Mantel correlation values between phenotypic and geographic Euclidean distances and between phenotypic and genetic Euclidean distances. Genetic Euclidean distances were computed using the allele frequencies of all available SNP markers. The red dots give the one-tailed p-value associated with each Mantel correlation value. The red horizontal line indicates the 5% p-value threshold below which the traits represented by the dots are considered to have phenotypic pairwise distances significantly correlated with geographic or genetic pairwise distances.
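For readers unfamiliar with the Mantel correlation used in Figure 2, the following is a minimal permutation-based sketch. The function names `euclidean_distances` and `mantel`, the number of permutations and the seed are assumptions made for illustration; the record does not state which software or settings the authors used.

```python
import numpy as np

def euclidean_distances(x):
    """Pairwise Euclidean distances between the rows of x."""
    x = np.asarray(x, dtype=float)
    return np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Mantel correlation and one-tailed permutation p-value (illustrative)."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    iu = np.triu_indices_from(a, k=1)  # each pair of populations once
    r_obs = np.corrcoef(a[iu], b[iu])[0, 1]

    rng = np.random.default_rng(seed)
    n = a.shape[0]
    count = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Permute rows and columns of one matrix together.
        r = np.corrcoef(a[np.ix_(p, p)][iu], b[iu])[0, 1]
        count += int(r >= r_obs)
    # One-tailed p-value: share of permutations at least as correlated.
    return r_obs, (count + 1) / (n_perm + 1)
```

Here `dist_a` would be the phenotypic distance matrix for one trait and `dist_b` the geographic or genetic distance matrix over the same populations.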
Figure 3. Heat map representing the q-value of associations between traits and SNP markers, with the significant SNPs positioned along the different chromosomes (Chr1-Chr7 and unmapped from Byrne). The traits on the abscissa are those with which at least one SNP marker was significantly associated at a 10% q-value threshold, and the positioned SNP markers are those associated with at least one trait (q-value < 10%). The two black arrows designate the remarkable regions on chromosomes 4 and 7.
Figure 4. Contribution to phenotypic variance of the most significant SNPs detected by GWAS (which accounted for kinship), computed using either a model that accounted for kinship (red bars) or a model that did not account for kinship (green bars). The contribution to phenotypic variance of all complementary significant SNPs detected by GWAS (q-value < 10%) is also displayed (blue bars). The information is only given for traits for which at least one significant SNP was detected by GWAS (q-value < 10%). The numeric values above the blue bars indicate how many complementary SNPs (each significantly improving the multiple regression model) were detected for the given trait.
Figure 5. Venn diagrams displaying the number of significant SNP markers detected by GWAS (q-value < 10%) in different environments for the same trait: heading date (HEA) on the left and aftermath heading (AHD) on the right.
Figure 6. Scatterplot of the significance (-log10(p-value)) of the association test between heading date measured in different environments and a SNP marker (position 39260 in scaffold 1700_ref0031287) located within the LpVRN2 vernalization response gene, against the average daily minimum temperature (in °C) over the preceding winter period in the environment where heading date was recorded. Environments ranked from coldest to warmest average daily minimum temperature are the following: PO-2016, PO-2017, LU-2017, and LU-2016.
Figure 7. Scatterplot of the predictive ability of the genomic prediction model against the correlation between pairwise phenotypic and genetic Euclidean distances computed using all available SNPs. Each dot represents a trait. The dots are colored according to the number of SNP markers detected as being significantly linked to the trait (q-value threshold of 0.1) in a genome-wide association study (GWAS). The value of the Pearson correlation between the two presented variables is given in the upper left corner. The genomic prediction models included all available SNP markers. To evaluate the predictive ability, 100 random calibration sets of all but 50 natural populations were used to predict the 50 remaining natural populations.
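The evaluation scheme in the Figure 7 caption (repeated random calibration sets, each predicting 50 held-out populations) can be sketched as below. Several points are assumptions, not taken from the record: ridge regression on allele frequencies stands in for the genomic prediction model, predictive ability is taken as the Pearson correlation between predicted and observed values, and the function name `predictive_ability` and the `ridge` penalty are hypothetical.

```python
import numpy as np

def predictive_ability(X, y, n_test=50, n_rep=100, ridge=1.0, seed=0):
    """Predictive ability over random calibration/test splits (illustrative).

    X: populations x SNP matrix of allele frequencies
    y: one phenotypic value per population
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    scores = []
    for _ in range(n_rep):
        perm = rng.permutation(X.shape[0])
        test, train = perm[:n_test], perm[n_test:]

        # Center with calibration-set statistics only.
        mu_x, mu_y = X[train].mean(axis=0), y[train].mean()
        Xtr, Xte = X[train] - mu_x, X[test] - mu_x

        # Ridge regression in its dual form, convenient when SNPs >> populations.
        K = Xtr @ Xtr.T
        alpha = np.linalg.solve(K + ridge * np.eye(len(train)), y[train] - mu_y)
        pred = Xte @ Xtr.T @ alpha + mu_y

        scores.append(np.corrcoef(pred, y[test])[0, 1])
    return np.array(scores)
```

The mean of the returned array gives one predictive-ability value per trait, and its minimum and maximum give the spread across random calibration sets.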
Figure 8. Performance of the (a) spatial and (b) genetic distance optimization methods for choosing the populations to be included in the calibration set used to build a genomic prediction model, against the number of populations included in the calibration set. The boxplots represent the variability of the performance between the different traits. To calculate the performance of an optimized calibration set, the minimum (MinPA) and maximum (MaxPA) predictive abilities over 100 randomly selected calibration sets of the same size were computed. The performance of an optimized calibration set was then calculated as (OptPA − MinPA) / (MaxPA − MinPA), with OptPA being the predictive ability obtained with the optimized calibration set. All available SNP markers were included in prediction models.
Figure 9. Histogram of the number of traits for which the minimum number of SNPs required to reach 95% of the maximum predictive ability of the best genomic prediction (using any number of SNPs) falls within the shown range. To reach the 95% threshold, the SNPs were either (a) randomly chosen or (b) chosen from the most to the least significant one among a non-redundant set detected in a preliminary GWAS using the calibration set of populations. The calibration set of all but 50 populations was selected by optimizing genetic distances, and the 50 remaining populations were used to test the models.
Figure 10. Scatterplot of Moran’s index against (a) the number of SNP markers detected by GWAS (q-value < 10%) and (b) the observed predictive abilities of genomic prediction models for the different phenotypic traits analyzed in the study. Dots are colored according to the trait H2 (a broad-sense-heritability-like indicator).
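Moran's index in Figure 10 measures spatial autocorrelation of a trait across sites of origin. A minimal sketch is given below; the inverse-distance weighting and the function name `morans_i` are assumptions, since the record does not specify the spatial weights the authors used.

```python
import numpy as np

def morans_i(values, coords):
    """Moran's I with inverse-distance weights (illustrative assumption)."""
    x = np.asarray(values, dtype=float)
    xy = np.asarray(coords, dtype=float)

    # Pairwise distances between sites; zero distances get zero weight.
    d = np.linalg.norm(xy[:, None, :] - xy[None, :, :], axis=-1)
    w = np.zeros_like(d)
    off = d > 0
    w[off] = 1.0 / d[off]

    z = x - x.mean()
    # I = (n / sum of weights) * (weighted cross-products / total variance)
    return len(x) / w.sum() * (z @ w @ z) / (z @ z)
```

Values well above zero indicate that nearby populations have similar trait values; values below zero indicate that neighbors tend to differ.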
Figure 11. Scatterplot of predicted and true (a) longitude and (b) latitude (WGS84, decimal degrees) of the sites of origin of perennial ryegrass populations. Leave-one-out cross-validation was used to evaluate the genomic prediction models. All available SNP markers were used. Predictive abilities are assessed by r (lower right corner).