Literature DB >> 30097471

Genomic Signatures of Adaptation to a Precipitation Gradient in Nigerian Sorghum.

Marcus O Olatoye1, Zhenbin Hu1, Fanna Maina1, Geoffrey P Morris2.   

Abstract

Evolution of plants under climatic gradients may lead to clinal adaptation. Understanding the genomic basis of clinal adaptation in crops species could facilitate breeding for climate resilience. We investigated signatures of clinal adaptation in the cereal crop sorghum (Sorghum bicolor L. [Moench]) to the precipitation gradient in West Africa using a panel (n = 607) of sorghum accessions from diverse agroclimatic zones of Nigeria. Significant correlations were observed between common-garden phenotypes of three putative climate-adaptive traits (flowering time, plant height, and panicle length) and climatic variables. The panel was characterized at >400,000 single nucleotide polymorphisms (SNPs) using genotyping-by-sequencing (GBS). Redundancy analysis indicated that a small proportion of SNP variation can be explained by climate (1%), space (1%), and climate collinear with space (3%). Discriminant analysis of principal components identified three genetic groups that are distributed differently along the precipitation gradient. Genome-wide association studies were conducted with phenotypes and three climatic variables (annual mean precipitation, precipitation in the driest quarter, and annual mean temperature). There was no overall enrichment of associations near a priori candidate genes implicated in flowering time, height, and inflorescence architecture in cereals, but several significant associations were found near a priori candidates including photoperiodic flowering regulators SbCN12 and Ma6 Together, the findings suggest that a small (3%) but significant proportion of nucleotide variation in Nigerian sorghum landraces reflects clinal adaptation along the West African precipitation gradient.
Copyright © 2018 Olatoye et al.

Entities:  

Keywords:  Crop evolution; West Africa; climate adaptation; flowering time; genome-wide association studies

Mesh:

Year:  2018        PMID: 30097471      PMCID: PMC6169398          DOI: 10.1534/g3.118.200551

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Adaptation to environmental gradients can lead to evolution of clines, a continuous form of local adaptation (Slatkin 1973; Savolainen ; Yoder ). Climatic gradients that may shape trait adaptation include ultraviolet radiation, photoperiodicity, temperature, and precipitation (Hancock ; Haussmann ; Bastide ). In the model plant Arabidopsis, clinal variation across latitude has been observed for seed dormancy, cold tolerance, height, and flowering time (Zhen and Ungerer 2008; Samis ; Kronholm ; Debieu ). Likewise, in crop species, diffusion from tropical to temperate zones has led to clinal adaptation in flowering time (Ducrocq ; Buckler ; Wu ; Kloosterman ) and cold tolerance (Comadran ; Ma ). Local adaptation of traditional varieties has played a key role in smallholder crop production under adverse climatic conditions and low agricultural inputs (Vasconcelos ). Locally-adapted crop landraces possess alleles that can be beneficial for the development of improved varieties to ensure food security under stressful climates (Zeven 1998; Soler ; Lasky ). Understanding genetic diversity, population structure, and genotype-phenotype associations in crop landraces can guide germplasm conservation and breeding (Djè ; Soler ; Dwivedi ). Recent advances in genotyping technology have facilitated studies of genomic diversity in crops, including studies of local and clinal adaptation using population and quantitative trait genomics (Myles ; Morrell ; Savolainen ). Population genomics methods based on genome-wide patterns of nucleotide variation can identify loci with signatures of selection (Siol ). These have been used to identify genomic targets of adaptation in many crops including maize (Gore ; Hufford ), rice (Meyer ; Li ), and sorghum (Morris ; Mace ). In sorghum, quantitative trait genomics approaches using mixed linear model have identified genomic regions associated with adaptive traits and climatic variables (Morris ; Zhang ; Lasky ). Sorghum (Sorghum bicolor L. [Moench]) is an essential staple cereal crop in dryland regions of the world (National Research Council 1996). It has adapted to a wide variety of climatic gradients and has abundant phenotypic variation for flowering time, plant morphology, and inflorescence morphology (Morris ; Zhang ; Lasky ). Globally, the morphological types (botanical races) of sorghum are distributed according to precipitation zones, with open-panicle guinea types predominant in humid regions, semi-compact caudatum types predominant in semi-arid regions, and compact-panicle durra types predominant in arid regions (Doggett 1988; Morris ). In West Africa, sorghum is found across a steep north-south precipitation gradient, ranging from semiarid grasslands bordering the Sahara Desert in the north (Sahelian zone), through subhumid savannah (Sudanian zone), to humid forest zones in the south (Guinean zone). These regions have been subject to major droughts for several millennia (Shanahan ) and increased drought under climate change is expected to reduce sorghum yields in this region (Lobell ). The West African country of Nigeria is Africa’s most populous nation and its largest sorghum producer, with 5-10 million Mg of grain production per year (Nzeka and Akhidenor 2018). Sorghum is the major cereal in the northern Sudano-Sahelian region of Nigeria, which is characterized by prolonged dry seasons and short rainy seasons (National Research Council 1996). Sorghum, as a non-centric crop, has multiple centers of diversity and two of these overlap with the boundaries of Nigeria (Harlan 1971, 1992). The genetic diversity of Nigerian sorghum is poorly characterized compared to other African sorghum germplasm (Rao ; Deu ; Barro-Kondombo ; Leiser ). Identifying genomic regions underlying adaptation in Nigerian sorghum germplasm could facilitate the identification of adaptive traits and genetic diversity relevant to crop improvement. Given that sorghum is distributed across the precipitation gradient in Nigeria, we hypothesized that Nigerian sorghum germplasm has been shaped by clinal adaptation. Under this hypothesis, we expect precipitation variables to be associated with both phenotype (putative climate-adaptive traits) and genotype (population structure and SNPs). Further, we expect that trait-associated and climate-associated genome regions will colocalize with genes involved in putative climate-adaptive traits. We investigated these predictions in a large panel of georeferenced Nigerian genebank accessions, which were previously phenotyped and which we genotyped at high-density using GBS. We characterized patterns of association among climatic, phenotypic, and genotypic variables and tested colocalization of associated genomic regions. Overall, the patterns are consistent with a small contribution of clinal adaptation shaping genomic variation in Nigerian sorghum.

Materials and Methods

Plant materials

Seeds for 553 Nigerian accessions were obtained from the USDA National Plant Germplasm System (NPGS) (https://www.ars-grin.gov/). Seedlings were raised in a greenhouse for two weeks and 50 mg of fresh leaf tissue was collected from each accession into 96-well plates. A control well was left empty on each plate. Leaf tissue was lyophilized (Labconco Freeze Dryer, Kansas City, MO, USA) for two days and then ground using 96-well plate plant tissue grinder (Retsch Mixer Mill, Haan, Germany). Genomic DNA was extracted using BioSprint 96 DNA Plant Kit (QIAGEN, Valencia CA, USA), quantified using Quant-iTTM PicoGreen dsDNA Assay Kit (ThermoFisher Scientific, Waltham MA, USA), then normalized to 10 ng/μl.

Genotyping-by-sequencing

GBS was conducted on 553 Nigerian accessions using methods previously described (Elshire ; Morris ). Briefly, individual DNA samples were digested using ApeKI restriction enzyme (NEB R0643L) followed by ligation of barcode and common adapters ligation using T4 DNA ligase (NEB M0202L). Ligated libraries were pooled (96-plex libraries) then amplified by polymerase chain reaction (PCR). Purification of libraries was performed using QIAquick PCR purification kit (QIAGEN, Valencia CA, USA). Library size distribution was obtained using a Bioanalyzer (Agilent Technologies 2100, Santa Clara CA, USA). Four 96-plex libraries were pooled to generate 384-plex sequencing libraries. Libraries were sequenced using single end 100-cycle sequencing using Illumina HiSeq2500 (Illumina, San Diego CA, USA) at the University of Kansas Medical Center, Kansas City MO, USA. Sequence reads for Nigerian germplasm were combined with published sequence reads obtained for 1943 accessions (Lasky ). The published sequence data were composed of globally diverse sorghum landraces with major representation by accessions from Africa and Asia. They were obtained from the United States NPGS-GRIN and the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT) gene banks. From these 1943 georeferenced global accessions, sequence information from 158 Nigerian accessions was obtained and combined with the Nigerian NPGS set. Duplicated accessions and accessions with sorghum conversion (SC) numbers (i.e., with introgressions for early maturity and semi-dwarf genes) in the NPGS database were removed from the Nigerian germplasm. Thus, 607 Nigerian accessions (of which 443 were georeferenced; Figure 1A) and 1785 georeferenced global accessions were used for downstream analysis (Files S1 and S2). Reads were aligned to the sorghum reference genome v3.0 (McCormick ) using Burrow Wheeler Alignment algorithm (Li and Durbin 2009). SNP calling was performed using TASSEL 5.0 GBS pipeline (Glaubitz ). The SNPs were filtered for < 20% missingness, then missing data were imputed using BEAGLE 4.0 (Browning and Browning 2013).
Figure 1

Geographic distribution of Nigerian accessions across precipitation zones. (A) Geographic distribution of georeferenced Nigerian accessions with known botanical race. (B) Panicle length variation across precipitation zones (size of circles commensurate to length of panicle in cm). (C) plant height variation across precipitation zones (size of circles commensurate to height of plant in cm). (D) flowering time variation across precipitation zones (size of circles commensurate to days until anthesis, 1 = earliest flowering and 6 = photoperiod sensitive). Background map is color-coded according to average annual precipitation (in mm).

Geographic distribution of Nigerian accessions across precipitation zones. (A) Geographic distribution of georeferenced Nigerian accessions with known botanical race. (B) Panicle length variation across precipitation zones (size of circles commensurate to length of panicle in cm). (C) plant height variation across precipitation zones (size of circles commensurate to height of plant in cm). (D) flowering time variation across precipitation zones (size of circles commensurate to days until anthesis, 1 = earliest flowering and 6 = photoperiod sensitive). Background map is color-coded according to average annual precipitation (in mm).

Climate and phenotype data

Climate data (average from 1960 to 1990) were obtained from WorldClim 1.4 using the Raster package in R (Hijmans ) based on the coordinate (latitude and longitude) for each of the 443 georeferenced Nigerian accessions (File S1) and 1785 global accessions (File S2). As proxies for precipitation gradients that are hypothesized to affect Nigerian sorghum we investigated “annual precipitation” and “precipitation in the driest quarter”. Common garden passport data for flowering time, plant height, and panicle length for the Nigerian accessions were obtained from the USDA-NPGS Germplasm Resource Information Network database (https://www.ars-grin.gov/). The passport data were based on evaluations in one or more common garden experiments in tropical latitudes (Puerto Rico and St. Croix, 17-18°N), so best linear unbiased predictors (BLUPs) were estimated for each trait for each accession (File S1). A term for common garden was fit as random in the BLUP estimation model using lmer function in LME4 package in R (Bates ) as follows:where is the vector of phenotypic observation of the i accession, is the j common garden where i accession was evaluated, and is the residual or error term. Pearson correlations were calculated between BLUPs of three adaptive traits (flowering time, panicle length, and plant height), and environmental factors (latitude, temperature and precipitation). To reduce the influence of outlier sites with exceptional climatic variables, precipitation values of three geographical locations where sorghum is not commonly cultivated were removed. Analysis of variance (ANOVA) and Tukey HSD test in R were performed to identify precipitation differences among sites of origin for different genetic groups or botanical races.

Redundancy analysis

Redundancy analysis (RDA) was performed separately for global and Nigerian germplasm sets using the varpart function in the R vegan package (Oksanen ). A multivariate model was fit using the genomic data (431,698 SNPs for the global accession and 279,689 SNPs for the Nigerian accessions, filtered for monomorphic and singleton markers) as response variable. Ten WorldClim 1.4 climatic variables (annual mean temperature, mean temperature wettest quarter, mean temperature driest quarter, mean temperature warmest quarter, mean temperature coldest quarter, annual precipitation, precipitation wettest quarter, precipitation driest quarter, precipitation in the warmest quarter, and precipitation in the coldest quarter) and geographical variables (latitude and longitude, which we refer to as “space”) were fitted as predictor terms. The “space” term is included to account for isolation-by-distance (Lasky ). To test the significance of the proportion of variation explained by climate collinear with space in the Nigerian germplasm, the proportion of variation explained was compared to the distribution from 1000 permuted data sets. In each stage of the permutation, individuals (genotypes) were randomized and RDA regression fitted and repeated 1000 times.

Population structure and linkage disequilibrium analyses

Discriminant analysis of principal components (DAPC) was conducted with the find clusters function in Adegenet package in R (Jombart 2008; Jombart ). Population differentiation (FST) between DAPC groups was estimated using –weir-fst-pop parameter (Weir and Cockerham’s FST) in VCFtools (Danecek ). While nucleotide diversity within DAPC groups was estimated using –window-pi (1kb) in VCFtools. LD decay analysis for each DAPC group was performed by PopLDdecay (BGI-shenzhen 2017). For comparison with Nigerian germplasm, West African accessions were identified from the published global GBS data (Lasky ). The published GBS data were composed of global accessions from 55 countries, predominantly representing landraces from sub-Saharan Africa and Asia. In the text, “global” refers to all accessions (including Nigerian and other West African accessions), unless otherwise noted that Nigerian or West African accessions have been removed. Linkage disequilibrium decay for the genomic data for Nigerian, West African, and global germplasm was estimated by PopLDdecay (BGI-shenzhen 2017), with minor allele frequency parameter set at 0.05 and smoothing by the spline function in R. Principal component analysis (PCA) was performed using SNPrelate package in R (Zheng ) with LD pruning threshold parameter set to 0.5 and minor allele frequency parameter set to 0.05. Neighbor-joining analysis was performed using TASSEL 5.0 and visualized in APE (Analyses of Phylogenetics and Evolution) package in R (Paradis ). Population differentiation (FST) between Nigerian, West African and global germplasm was evaluated using –weir-fst-pop parameter (Weir and Cockerham’s FST) in VCFtools (Danecek ). While nucleotide diversity within each germplasm, their inbreeding coefficients, and observed heterozygosity were estimated using –window-pi (1kb window), –het, and –hardy respectively, in VCFtools.

Genome-wide association studies

Genome-wide association studies (GWAS) were performed using BLUPs of traits (panicle length, n = 330; plant height, n = 332; and flowering time, n = 412). After filtering the Nigerian data for a minimum minor allele frequency (MAF) of 0.03, a total of 149,342 SNPs were used in the GWAS analysis. First, a multi-locus mixed linear model (MLMM) (Segura ) with a fixed population term (Q) and a random polygenic term (K) was used to perform GWAS for the phenotypic traits. PCA components (first three PCs) used for Q term were estimated using TASSEL 5.0 (Bradbury ) and kinship matrix used for the polygenic term was derived from GAPIT (Lipka ). Bonferroni correction of 2.6e-07 (α/number of markers, where α = 0.05) was used to determine the cut-off threshold for the phenotypic associations. A set of a priori candidate genes was compiled from Phytozome including known sorghum genes, and sorghum homologs of rice and maize genes known to be involved in inflorescence morphology, maturity, and plant height (n = 169; File S3).

Genome scans

Three environmental variables (annual precipitation, precipitation in the driest quarter, and annual mean temperature) were used as proxies for the precipitation gradient (n = 443). A GLM, which does not include population structure and kinship terms, was used to perform an association scan for climatic variables to reduce false negatives (Bergelson and Roux 2010; Lasky ). The top 1% outliers of the environmental associations were selected for enrichment analysis. Enrichment of a priori genes near association peak were performed using a chi-square test. Windows of 100 kb were used as conservative regions for colocalization between SNPs and a priori genes since LD decayed to background levels at > 100 kb. The genome wide Tajima’s D across 100 kb windows was tested using VCF tools in global germplasm (DGlobal), West African germplasm (DWestAfrica), and Nigerian germplasm (DNigeria). Enrichment analysis for a priori genes was performed by testing whether the DNigeria 100 kb windows were significantly enriched for our a priori candidate genes relative to a set of random genes derived from the sorghum genome version 3 gff3 gene file from Phytozome (Goodstein ; McCormick ) for 1000 whole genome permutations.

Data availability

Raw sequencing data are available from the NCBI Sequence Read Archive under project accession SRP132525 SNP genotype, phenotype, and geographic data are available at Dryad (doi:10.5061/dryad.g0141g7). All data are publicly available. File S1 contains detailed descriptions of Nigerian accessions, their passport data, georeference information, the BLUPs of phenotypes, climatic data, and DAPC groups. File S2 contains detailed descriptions of global accessions, their georeference information, and climatic data. File S3 contains a priori candidate genes list and literature sources. File S4 contains ANOVA and Tukey test results for race by precipitation analysis. File S5 contains detailed descriptions of a priori candidate genes associated with significant SNPs for MLMM and GLM GWAS results for the phenotypes. File S6 contains detailed descriptions of a priori candidate genes associated with outlier SNPs for GLM of environmental variables. File S7 contains detailed descriptions of a priori candidate genes associated with Nigerian germplasm Tajima’s D (DNigeria) windows. Supplemental material available at Figshare: https://doi.org/10.25387/g3.6942986.

Results

Trait and environment correlations

The georeferenced sorghum accessions from Nigeria originated across a wide precipitation gradient (Figure 1A). Annual precipitation at the locations of origin ranges from < 500 mm/year for the northern Nigerian accessions (Sahelian zone) to > 3500 mm/year for the southern Nigerian accessions (Guinean zone). Analysis of variance indicates a significant difference in annual precipitation at sites of origin among the botanical races (P-value < 0.001). Significant differences were found between guinea and caudatum types (P-value < 0.05) and between guinea and durra-caudatum types (P-value < 0.01) (File S4) in precipitation differences among sites of origin. The correlation of annual precipitation with common garden phenotypes of georeferenced Nigerian germplasm was investigated for three traits (panicle length, plant height, and flowering time) (Figure 1B-D). For comparison, we also considered correlations with annual mean temperature and latitude. Significant positive relationships with annual mean precipitation were observed for panicle length (Figure 2A; r = 0.21, P-value < 0.001) and plant height (Figure 2B; r = 0.22, P-value < 0.001) but not flowering time (Figure 2C). By contrast, annual mean temperature had no correlation with panicle length (Figure 2D) or plant height (Figure 2F) but a significant negative relationship with flowering time (Figure 2F; r = - 0.19; P-value < 0.001). Latitude had negative relationships with panicle length (Figure 2G; r = - 0.19, P-value < 0.001) and plant height (Figure 2H; r = - 0.26, P-value < 0.001), but no relationship with flowering time (Figure 2I). Among the traits, flowering time had significant positive relationships with both panicle length and plant height (Fig. S1; r = 0.32 and 0.41, respectively; P-values < 0.001). Among the environmental variables, annual mean precipitation had a significant negative relationship with latitude (Fig. S1; r = -0.86, P-value < 0.001) and a weaker but significant negative correlation with annual mean temperature (Fig. S1; r = -0.23, P-value < 0.001).
Figure 2

Relationships between traits and environmental variables in georeferenced germplasm. Pearson correlations between three environmental variables, annual mean precipitation (in mm) (A-C), annual temperature (in °C) (D-F), and latitude (G-I), vs. three phenotypic variables, panicle length (A, D, G), plant height (B, E, H), and flowering time (C, F, I) (n = 443). Phenotypes are BLUPs from common garden experiments. Significant correlation values are noted at 0.05, 0.01 and 0.001 (*, **, and ***).

Relationships between traits and environmental variables in georeferenced germplasm. Pearson correlations between three environmental variables, annual mean precipitation (in mm) (A-C), annual temperature (in °C) (D-F), and latitude (G-I), vs. three phenotypic variables, panicle length (A, D, G), plant height (B, E, H), and flowering time (C, F, I) (n = 443). Phenotypes are BLUPs from common garden experiments. Significant correlation values are noted at 0.05, 0.01 and 0.001 (*, **, and ***).

Genome-wide nucleotide variation

To investigate genomic variation we developed a data set consisting of 431,698 SNPs genotyped across 2392 accessions (Nigeria, West Africa, and global). Most of the SNP variation in the three panels was rare; about 51% of the Nigerian genomic data were composed of SNPs with minor allele frequencies (MAF) < 0.01, 46% of the West African SNPs have MAF < 0.01, 36% of the global reference SNPs have MAF < 0.01, and 37% of global SNPs have MAF < 0.01 (Fig. S2A). The mean observed heterozygosity across loci in each of the germplasm is 0.02 (2%) (Fig. S2B). SNP density was higher in sub-telomeric regions and lower in sub-centromeric regions (Fig. S3). In the Nigerian germplasm (n = 607 accessions; Figure 1A), 279,689 SNPs were retained after removing monomorphic markers, singletons, and doubletons from the initial 431,698 SNPs. This corresponds to an average of 1 SNP per 2.7 kb. In the West African germplasm (n = 325 accessions), 311,786 SNPs were obtained after removing monomorphic markers, singletons, and doubletons. The nucleotide diversity of Nigerian germplasm was similar to the West African and global germplasm. The average nucleotide diversity for global germplasm (πglobal) and West African germplasm (πWestAfrica) was 4.5 × 10−4. The nucleotide diversity in Nigerian germplasm (πNigeria) was somewhat lower at 4.0 × 10−4. The average inbreeding coefficients were 0.83, 0.82, and 0.80 for global germplasm, West African germplasm, and Nigerian germplasm, respectively (all significantly different from each other at P-value < 0.01) (Fig. S2C). The FST between Nigerian and West African germplasm was 0.007 (Fig. S4A) and FST between Nigeria and global germplasm was 0.07 (Fig. S4B). The genome-wide average rate of linkage disequilibrium decay differed among the panels. Of the three germplasm sets, the global germplasm had the fastest LD decay rate by reducing to r2 = 0.1 at 20 kb (Fig. S5). LD decayed to half its initial value at 12 kb and r2 = 0.1 at 50 kb in the Nigerian germplasm. The West African germplasm had a slowest LD decay rate among the three sets, with r2 = 0.1 at 100 kb. To estimate the proportion of SNP variation that has been shaped by climate vs. geographic distance (space) we carried out a redundancy analysis, a form of multivariate regression. The proportion of SNP variation explained by climate and space in global germplasm was substantially greater than in Nigerian germplasm. In global germplasm, the proportion of variation explained by the 10 climate variables, space (latitude and longitude), and their combination together are 4%, 8%, and 5%, respectively (Figure 3A). By contrast, in Nigerian germplasm, climate and geographic variables explained a smaller proportion of the total SNP variation (Figure 3B); climate and space alone each explained 1% of the SNP variation, while climate collinear with space explained 3% of the SNP variation. The proportion of variation explained by climate collinear with space was significantly greater (P < 0.001) than the null distribution from geographically permuted data (Figure 3C).
Figure 3

Redundancy analysis of SNP variation explained by climatic and spatial variables. Multivariate redundancy analysis showing the proportion of genotypic variation explained by climate variables (precipitation and temperature) and space (latitude and longitude) in (A) global germplasm and (B) Nigerian germplasm. (C) Density plot showing the distribution of proportion of variation explained by climate collinear with space obtained for 1000 permutations. The green line indicates the position of the proportion of variation explained by climate collinear with space and the red line indicates the 95th percentile of the null distribution obtained by permutations.

Redundancy analysis of SNP variation explained by climatic and spatial variables. Multivariate redundancy analysis showing the proportion of genotypic variation explained by climate variables (precipitation and temperature) and space (latitude and longitude) in (A) global germplasm and (B) Nigerian germplasm. (C) Density plot showing the distribution of proportion of variation explained by climate collinear with space obtained for 1000 permutations. The green line indicates the position of the proportion of variation explained by climate collinear with space and the red line indicates the 95th percentile of the null distribution obtained by permutations.

Population structure analysis of the Nigerian germplasm

To characterize the genetic structure of Nigerian germplasm in relation to global sorghum diversity, we conducted PCA and DAPC. In the PCA, the first PC explained about 6% of the variation while the second PC explained about 4% of the variation. The Nigerian accessions formed mostly separate clusters relative to the global accessions. The West African and Nigerian germplasm clustered together in most cases (Fig. S6A). Neighbor joining analysis also showed that Nigerian accessions and West African accessions cluster together, separately from the rest of the global germplasm (Fig. S7A). Clustering by botanical race was also observed in the Nigerian germplasm (Fig. S6B and Fig. S7B). DAPC analysis identified three genetic groups (Figure 4A–C). The DAPC groups were genetically differentiated from each other as follows: Group 1 vs. Group 2 (FST of 0.21), Group 1 vs. Group 3 (FST of 0.18), and Group 2 vs. Group 3 (FST of 0.22). Accessions in Group 2 originate from locations with higher precipitation than the accessions in Group 1 (P-value < 0.001) and Group 3 (P-value < 0.001) (Figure 4D). The average nucleotide diversity in 1kb windows for groups 1, 2, and 3 are 4.4 × 10−4, 3.3 × 10−4, and 3.8 × 10−4 respectively. Linkage disequilibrium (r2) level decayed to 0.1 at 30 kb in in Group 1, 80 kb in Group 2, and 90 kb in Group 3 (Fig. S8). Caudatum types are more predominant in Group 1, Guinea types are more predominant in Group 2, and Durra types are more predominant in Group 3 (Table 1).
Figure 4

Genetic grouping of Nigerian germplasm in relation to the precipitation gradient. (A) Three genetic groups identified by the discriminant analysis of principal components (DAPC), (B) neighbor joining tree showing genetic relatedness among Nigerian accessions belonging to the three DAPC groups, (C) distribution of the georeferenced accessions based on their DAPC genetic groups across precipitation zones in Nigeria, and (D) density plots of precipitation distribution of georeferenced accessions based on their DAPC genetic groups. Group 2 accessions originate from locations with higher precipitation than Group 1 (P-value < 0.001) and Group 3 (P-value < 0.001).

Table 1

Distribution of sorghum botanical races among DAPC groups

Botanical RaceGroup 1Group 2Group 3
Bicolor631
Caudatum3976
Durra413
Guinea121790
Bicolor Caudatum953
Bicolor Durra710
Bicolor Guinea2132
Caudatum Durra281157
Guinea Caudatum554
Guinea Durra111
Genetic grouping of Nigerian germplasm in relation to the precipitation gradient. (A) Three genetic groups identified by the discriminant analysis of principal components (DAPC), (B) neighbor joining tree showing genetic relatedness among Nigerian accessions belonging to the three DAPC groups, (C) distribution of the georeferenced accessions based on their DAPC genetic groups across precipitation zones in Nigeria, and (D) density plots of precipitation distribution of georeferenced accessions based on their DAPC genetic groups. Group 2 accessions originate from locations with higher precipitation than Group 1 (P-value < 0.001) and Group 3 (P-value < 0.001). In order to characterize variation among genetic groups identified by DAPC, ANOVA, and Tukey test were performed for putative adaptive traits and environmental variables. Significant difference in the distribution of precipitation and temperature gradient were found between DAPC Groups 1 and 3 (P-value < 0.01) and Groups 2 and 3 (P-value < 0.01). Also, there were significant differences between Groups 2 and 1 (P-value < 0.001) and Groups 2 and 3 (P-value < 0.001). We also found significant differences for putative adaptive traits between DAPC groups. For panicle length, all DAPC groups comparisons were different at P-value < 0.01. Significant differences were found for flowering time distribution between Groups 1 and 2 (P-value < 0.001). For plant height, significant differences were found between Groups 1 and 2 (P-value < 0.001) and Groups 2 and 3 (P-value < 0.001).

Genome-wide association studies of phenotypes

To identify genomic regions associated with phenotypic variation, we conducted MLMM GWAS for panicle length, plant height, and flowering time using BLUPs of common-garden phenotypes. Several genomic regions associated with the traits were identified and two a priori candidate genes fell within 100 kb (Figure 5, File S5). For panicle length no associations were significant at the Bonferroni threshold (Figure 5A). For plant height, a single significant association was observed on chromosome 3 (Figure 5B). The single significant association for plant height (S3_62675143, MAF = 0.29) colocalized with photoperiodic flowering gene SbCN12 (Sobic.003G295300, 73 kb away) (Yang ). For flowering time, nine significant associations were found on chromosomes 3, 6, 7, 9, and 10 (Figure 5C). The most significant flowering time association (S6_799609, MAF = 0.09) colocalized with the known sorghum flowering time and photoperiod sensitivity gene Maturity6 (Ma6/Ghd7, Sobic.006G004400, 99 kb from gene) (Murphy ). With a naive model (GLM GWAS), nominally significant associations were found on all chromosomes for panicle length, plant height, and flowering time, and a large number of these colocalized with a priori candidate genes (Fig. S9).
Figure 5

Genome wide association studies of phenotypes for putative adaptation traits. Genome-wide SNP associations for multi-locus mixed model with a fixed population term and a random polygenic term for (A) panicle length, (B) plant height, and (C) flowering time in Nigerian germplasm (n = 329, 331, and 411 respectively). Vertical dashed lines represent position of a priori candidate genes (Ma6 and SbCN12) and found within 150 kb region of the significant association. Horizontal broken red lines signify the Bonferroni correction threshold (P-value < 0.05).

Genome wide association studies of phenotypes for putative adaptation traits. Genome-wide SNP associations for multi-locus mixed model with a fixed population term and a random polygenic term for (A) panicle length, (B) plant height, and (C) flowering time in Nigerian germplasm (n = 329, 331, and 411 respectively). Vertical dashed lines represent position of a priori candidate genes (Ma6 and SbCN12) and found within 150 kb region of the significant association. Horizontal broken red lines signify the Bonferroni correction threshold (P-value < 0.05).

Genome scans for adaptation

Associations with environmental variables were used to investigate possible genomic signature of climate adaptation. GLM outliers (top 1% of associations) were identified on all chromosomes for “annual precipitation”, “precipitation in the driest quarter,” and “annual temperature” (Figure 6 A-D and File S6). Genome-wide, 15% and 6% of a priori candidate genes were localized within 100 kb of a 1% outlier SNP for “annual precipitation” and “precipitation in the driest quarter”, respectively (vs. 17% and 16% of all genes). Therefore, there is no enrichment of a priori candidate genes near environmental association outliers. A few of the a priori candidate genes that localize near 1% outlier SNPs are as follows. The SNP S9_54870238 (MAF = 0.36; 99th percentile) associated with annual precipitation was 90 kb away from the regulator of photoperiodic flowering SbCN8 (Centroradialis8, Sobic.009G199800). S9_8022437 (MAF = 0.49; 99th percentile) associated with annual precipitation is 94 kb from the sorghum ortholog (Sobic.009G069700) of maize barren inflorescence4 (bif4) (Galli ). S3_4891237 and S3_4750963 (MAF = 0.31, 98th percentile and MAF = 0.04, 99th percentile) associated with annual precipitation and precipitation in the driest quarter were 100 kb from Sbra2 (Sobic.003G052900), the sorghum ortholog of maize inflorescence gene ramosa2 (ra2) (Brown ).
Figure 6

Genome wide association studies of climatic variables. Genome-wide SNP associations for general linear model of (A) annual mean precipitation, (B) precipitation in the driest quarter, (C) and annual mean temperature. Horizontal broken lines indicate the cutoff point of top 1% outliers. Vertical broken lines indicate the position of a priori candidate genes related to inflorescence morphology (orange), plant height (blue), and flowering time (purple), which are within 100 kb from associated SNPs.

Genome wide association studies of climatic variables. Genome-wide SNP associations for general linear model of (A) annual mean precipitation, (B) precipitation in the driest quarter, (C) and annual mean temperature. Horizontal broken lines indicate the cutoff point of top 1% outliers. Vertical broken lines indicate the position of a priori candidate genes related to inflorescence morphology (orange), plant height (blue), and flowering time (purple), which are within 100 kb from associated SNPs. Genome-wide pattern of Tajima’s D in the Nigerian germplasm (DNigeria) across 1 kb windows ranged between -2.0 to 4.0 (Fig. S10 and Fig. S11). The average Tajima’s D value in the Nigerian germplasm (DNigeria) was -0.2 while the average genome wide Tajima’s D (across 1 kb windows) in the global germplasm (DGlobal) and West African germplasm (DWestAfrica) were 0.1 and 0.2, respectively. Positive DNigeria windows were significantly enriched for a priori candidate genes compared to the expectation under a null distribution (Fig. S12 and File S7). Some of the DNigeria windows (Fig. S10) contain genes that control for flowering time and inflorescence development (File S7). For instance, the sorghum flowering time gene Maturity6 (Ma6, Sobic.006G004400) colocalized with the genome wide DNigeria scan window at 0.697 Mb (Tajima’s D = 1.48, 89th percentile) on chromosome 6. Maturity1 (Ma1, Sobic.006G057866) colocalized with the genome wide DNigeria scan window (Tajima’s D = 2.9; in the 98th percentile) around 40 Mb on chromosome 6. The sorghum ortholog of branched silkless1 (Sobic.002G411000), a maize spikelet meristem identity gene (Chuck ), colocalized with the genome wide DNigeria scan window (Tajima’s D = 2.2 in the 94th percentile) around 75.9 Mb on chromosome 2.

Discussion

Evidence for clinal adaptation to the precipitation gradient

Genome-wide studies of nucleotide variation can provide insights into patterns of genetic variation in crop landraces and the role of clinal adaptation in shaping this variation (Meyer ). Overall, we found several lines of evidence that sorghum phenotypic variation across Nigeria has been shaped by clinal adaptation to precipitation. For panicle length, common garden variation in the Nigerian sorghum germplasm was correlated with annual precipitation (Figure 2, Fig. S1). Sorghum accessions originating from lower latitudes that have high precipitation had longer panicles than accessions originating from higher latitudes that have less precipitation (Figure 2A, 2G). The long panicle morphology is associated with open and lax primary branches, which is a key feature of guinea race which are predominant in humid to sub-humid regions of Nigeria (Figure 1A-B) and West Africa more generally (Deu ; Barro-Kondombo ; Lasky ). This open panicle morphology is thought to allow airflow, reducing mold infection under high humidity (Doggett 1988), though this model has not been formally tested in diverse germplasm. For plant height, sorghum accessions from lower latitudes associated with high precipitation were taller than sorghum accessions from higher latitudes associated with less precipitation (Figure 2B, 2H). This pattern is consistent with cross-species ecological studies of plant height, which identified precipitation as the best environmental predictor of within-species latitudinal variation of height (Moles ). Given that higher latitudes in Nigeria have lower rainfall, reduced plant height and panicle length in dry regions may be an adaptation to increase yield stability under reduced water availability, as has been observed in West African pearl millet (Vigouroux ). In our study, common garden variation in flowering time is associated with temperature but not precipitation at the location of origin. The negative relationship between flowering time and annual mean temperature (Figure 2F) suggests that sorghum in hot climates may flower early as an escape from high temperature and resulting water limitation (Tuinstra ). The significant proportion of nucleotide variation explained by climate collinear with space (3%; Figure 3) is consistent with clinal adaptation of sorghum in Nigeria. Redundancy analysis indicated that climate collinear with space explained more SNP variation in the Nigerian germplasm than either of climate and space (isolation-by-distance) alone. The finding that climate collinear with space explained more SNP variation than either climate or space alone is consistent with findings in global sorghum germplasm (Lasky ) and regional germplasm in wild soybean (Glycine soja) (Leamy ) and barley (Hordeum vulgare) landraces (Abebe ). However, the proportion of SNP variation explained by climate collinear with space we observed in this study was much lower than what was observed in wild soybean and barley landraces (6–34% and 29–61%, respectively). Methodological differences that may contribute to the lower proportion of SNP variation explained in this study are the greater number of environmental variables used in the soybean and barley studies, and the use of ascertained SNPs in the soybean study. Genome-wide nucleotide variation in Nigeria is also structured according to precipitation zones. The DAPC analysis identified genetic groups within the sorghum botanical races. These genetic groups showed differences in precipitation distributions (Figure 4C-D, Table 1). Group 1 was associated with the lowest annual mean precipitation, and composed predominantly of caudatum, caudatum intermediates and bicolor types and prevalent at higher latitudes in northeastern Nigeria characterized with lower annual precipitation. The northeastern part of Nigeria was classified as part of the center of diversity of caudatum and caudatum-durra (Harlan 1992). Group 2 was associated with higher annual precipitation distribution and more prevalent at lower latitudes. Most of the accessions in this group belong to the guinea and guinea intermediate racial types. Group 3 was predominantly made up of durra and caudatum-durra intermediates. Notably, there was a complete absence of the guinea race from this group. Consistent with the model that botanical races in sorghum are differentially adapted to precipitation-based agroclimatic zones (Harlan 1992; Morris ), we found differences in botanical race distribution in precipitation zones (File S7). Precipitation distribution of guinea accessions was significantly different from precipitation distribution of caudatum (P-value < 0.05) and caudatum-durra races (P-value < 0.01). Despite some evidence of clinal adaptation, we should note that the redundancy analysis (Figure 3) and theoretical considerations (Pavlidis ; Meirmans 2015) suggest that bulk of the nucleotide variation observed in the Nigerian germplasm is neutral. In addition, the small proportion of phenotypic and genomic variation explained by the precipitation gradient (Figure 3B-C; Figure 2) suggest a modest role of clinal adaptation in shaping phenotypic diversity at the geographic scale of modern Nigeria. However, more detailed studies of panicle and vegetative morphology may reveal traits with stronger climate associations. Other factors that may have shaped the observed diversity patterns which we did not investigate include seed sharing based on ethnolinguistic grouping (Soler ; Labeyrie ) and historical processes of domestication and diffusion (Kimber ; Morris ). Consistent with a major role of cultural and historical factors shaping diversity at a Nigeria-wide scale, in the cases where multiple accessions were collected from single locations, multiple botanical races and genetic group were observed (Figure 1A, File S1).

Identifying putative loci underlying adaptation

The combination of phenotypic association, environmental association, and selection scans can provide multiple lines of evidence for the involvement of particular loci in adaptation (Meyer ). In the Nigerian sorghum germplasm, we observed some cases where a priori candidate genes related to adaptive traits colocalized with GWAS signals (Figure 5-6; File S5-S6). The colocalization of two sorghum photoperiod sensitivity genes (SbCN12 and Ma6) with plant height and flowering time associations suggests that photoperiod sensitivity contributes to plant height and flowering time adaptation in Nigerian sorghum germplasm. This is consistent with previous studies that identified SbCN12 and Ma6/Ghd7 as major genes underlying natural variation in photoperiodic flowering in sorghum (Murphy ; Yang ; Bouchet ). In sub-tropical latitudes, like the common gardens used by the USDA genebank (∼17° N), photoperiod sensitive sorghums are expected to have longer vegetative growth and attain greater heights than in tropical latitudes due to longer days at the higher latitudes (Murphy ). Given that strong association peaks for both environmental variables (annual precipitation and precipitation in the driest quarter; Figure 6) were found near SbCN8, this gene may be a candidate for clinal adaptation to the precipitation gradient in West Africa. The colocalization of inflorescence genes (ra2 and bif4) with associations for annual precipitation (Figure 6A; File S5-S6) is consistent with an adaptive role of inflorescence morphology across the precipitation gradient (Harlan 1992; Morris ). Accounting for population structure in GWAS models when mapping phenotypes that are correlated with population structure can lead to false negatives (Bergelson and Roux 2010; Lasky ). Notably there were no significant associations for panicle length after accounting for population structure (MLM; Figure 5A). This finding is consistent with the expectation that climate-associated traits will be confounded with population structure and there will be little power to map the genetic basis of these traits using mixed model association (Brachi ). Genetic dissection of panicle morphology and other putative clinal adaptation traits should be more effective with a multi-parent mapping strategy that breaks up confounding population structure, such as nested association mapping (Buckler ; Bouchet ). Long-term stable clinal adaptation is expected to be reflected in genomic signals of balancing selection (Novembre and Rienzo 2009; Yoder ). Consistent with the phenotypic evidence of clinal trait adaptation (Figure 2A, 2B, 2F), there was evidence of balancing selection from genome-wide enrichment of DNigeria windows at a priori candidate genes for flowering time, plant height, and inflorescence morphology (Fig. S11 and Fig. S12). Further functional studies of candidate genes will be needed to establish if these candidate genes have a role in climate adaptation (Kesari ; Romero Navarro ). Resequencing of georeferenced germplasm should facilitate the identification of putative functional variants underlying clinal adaptation (Gore ; Meyer ). Demographic effects due to historical processes of domestication and differentiation affect the pattern of diversity under neutrality (Molina ; Meyer and Purugganan 2013). While it is suspected that sorghum has a complex domestication history based on morphology (Harlan 1992), population structure (Deu ), and shattering gene haplotypes (Shattering1) (Olsen 2012), the demographic scenarios have not been formally described or evaluated. Therefore, further study on demographic history in sorghum will be valuable to identify which genomic outlier loci are most likely to represent signatures of selection. If variation at genes involved in putative adaptive traits (flowering time, plant height, and panicle length) underlie clinal variation, then we expect significant enrichment of the a priori genes with environment associations (Rellstab ). However, there was no enrichment of a priori candidate genes near associations with “precipitation in the driest quarter” or “annual mean precipitation”. An alternative to the single-trait GWAS and colocalization approach used in this study would be a multi-trait GWAS approach (Korte ; Zhou and Stephens 2014) simultaneously considering traits and environmental variables. However, biological interpretation of the synthetic or “eigen” traits may remain a challenge until more corroborating functional genetic data are available (Banerjee ).

Resources for genomic-enabled breeding of clinal adaptation

The Nigerian germplasm harbors abundant nucleotide polymorphism (90% of the global nucleotide polymorphism based on the πNigeria vs. πglobal), consistent with West Africa as a center of diversity for sorghum (Harlan 1992). The high diversity could be a result of ancestral diversity in the Nigerian sorghum, and/or gene flow and diffusion from other regions of Africa (Westengen ). Population structure and neighbor-joining tree analysis showed that majority of the Nigerian accessions and West African accessions clustered together (Fig. S6 and Fig. S7). The Nigerian germplasm was 10 times less differentiated from West African germplasm compared to the rest of global sorghum germplasm (FST Nigeria-West Africa = 0.007 and FST Nigeria-Global = 0.07). Low level of differentiation among regional germplasm in sorghum has been attributed to gene flow from human migration and agricultural trade (Menkir ). Characterization of LD patterns is critical for interpretation of genome scans since the local extent of LD decay determines resolution of mapping and long-range LD creates spurious associations (Myles ). LD decay rate in Nigerian germplasm (half of maximum value at 12 kb) was slower than LD decay estimated in the global sorghum association panel (half of maximum value at 1 kb) (Morris ). The global sorghum association panel may capture more historical recombination than the Nigerian germplasm because of its greater geographic diversity. The reduced long-range LD decay in the Nigerian germplasm compared to the West African germplasm (Fig. S5) may be due to the smaller geographic scale, which should reduce long range LD due to isolation-by-distance (Brachi ). Given the observed LD decay rates, the mapping resolution for genome-wide scans in the Nigerian germplasm is expected to be less than in global sorghum panels but greater than in West African regional panels. Overall, the modest LD decay rate and high genetic diversity in the Nigerian germplasm make it suitable for genome-wide association studies. The application of genomics for crop improvement and plant genetic resource management is still in the early stage for most national agricultural research systems in sub-Saharan Africa, including Nigeria and neighboring countries (Ezeaku and Gupta 2004; Leiser ; Yohannes ). The genomic resources developed in this study represent a step toward genomics-enabled breeding and germplasm management for sorghum landraces in Nigeria. The resources developed include a genome-wide catalog of SNP variation, a description of geographic population structure, and estimates of genetic properties including nucleotide diversity and LD decay. The genomic signatures of clinal adaptation identified in this study, if validated in managed stress and multi-environment mapping studies (Cooper ; Lasky ), could be another resource to facilitate genomics-enabled breeding for climate-resilience in West Africa.
  73 in total

1.  Atlantic forcing of persistent drought in West Africa.

Authors:  T M Shanahan; J T Overpeck; K J Anchukaitis; J W Beck; J E Cole; D L Dettman; J A Peck; C A Scholz; J W King
Journal:  Science       Date:  2009-04-17       Impact factor: 47.728

2.  Agricultural origins: centers and noncenters.

Authors:  J R Harlan
Journal:  Science       Date:  1971-10-29       Impact factor: 47.728

3.  A critical assessment of storytelling: gene ontology categories and the importance of validating genomic scans.

Authors:  Pavlos Pavlidis; Jeffrey D Jensen; Wolfgang Stephan; Alexandros Stamatakis
Journal:  Mol Biol Evol       Date:  2012-05-21       Impact factor: 16.240

Review 4.  Ecological genomics of local adaptation.

Authors:  Outi Savolainen; Martin Lascoux; Juha Merilä
Journal:  Nat Rev Genet       Date:  2013-11       Impact factor: 53.242

5.  adegenet: a R package for the multivariate analysis of genetic markers.

Authors:  Thibaut Jombart
Journal:  Bioinformatics       Date:  2008-04-08       Impact factor: 6.937

6.  Population genomic and genome-wide association studies of agroclimatic traits in sorghum.

Authors:  Geoffrey P Morris; Punna Ramu; Santosh P Deshpande; C Thomas Hash; Trushar Shah; Hari D Upadhyaya; Oscar Riera-Lizarazu; Patrick J Brown; Charlotte B Acharya; Sharon E Mitchell; James Harriman; Jeffrey C Glaubitz; Edward S Buckler; Stephen Kresovich
Journal:  Proc Natl Acad Sci U S A       Date:  2012-12-24       Impact factor: 11.205

7.  Intron-mediated alternative splicing of Arabidopsis P5CS1 and its association with natural variation in proline and climate adaptation.

Authors:  Ravi Kesari; Jesse R Lasky; Joji Grace Villamor; David L Des Marais; Ying-Jiun C Chen; Tzu-Wen Liu; Wendar Lin; Thomas E Juenger; Paul E Verslues
Journal:  Proc Natl Acad Sci U S A       Date:  2012-05-21       Impact factor: 11.205

8.  Ethnolinguistic structuring of sorghum genetic diversity in Africa and the role of local seed systems.

Authors:  Ola T Westengen; Mark Atam Okongo; Leo Onek; Trygve Berg; Hari Upadhyaya; Siri Birkeland; Siri Dharma Kaur Khalsa; Kristoffer H Ring; Nils C Stenseth; Anne K Brysting
Journal:  Proc Natl Acad Sci U S A       Date:  2014-09-15       Impact factor: 11.205

9.  Adaptations to climate-mediated selective pressures in humans.

Authors:  Angela M Hancock; David B Witonsky; Gorka Alkorta-Aranburu; Cynthia M Beall; Amha Gebremedhin; Rem Sukernik; Gerd Utermann; Jonathan K Pritchard; Graham Coop; Anna Di Rienzo
Journal:  PLoS Genet       Date:  2011-04-21       Impact factor: 5.917

10.  Increased Power To Dissect Adaptive Traits in Global Sorghum Diversity Using a Nested Association Mapping Population.

Authors:  Sophie Bouchet; Marcus O Olatoye; Sandeep R Marla; Ramasamy Perumal; Tesfaye Tesso; Jianming Yu; Mitch Tuinstra; Geoffrey P Morris
Journal:  Genetics       Date:  2017-06       Impact factor: 4.562

View more
  5 in total

1.  Genome-wide association studies identify putative pleiotropic locus mediating drought tolerance in sorghum.

Authors:  Fanna Maina; Abdou Harou; Falalou Hamidou; Geoffrey P Morris
Journal:  Plant Direct       Date:  2022-06-16

2.  The History of Lentil (Lens culinaris subsp. culinaris) Domestication and Spread as Revealed by Genotyping-by-Sequencing of Wild and Landrace Accessions.

Authors:  Marta Liber; Isabel Duarte; Ana Teresa Maia; Hugo R Oliveira
Journal:  Front Plant Sci       Date:  2021-03-25       Impact factor: 5.753

Review 3.  Progress and challenges in sorghum biotechnology, a multipurpose feedstock for the bioeconomy.

Authors:  Tallyta N Silva; Jason B Thomas; Jeff Dahlberg; Seung Y Rhee; Jenny C Mortimer
Journal:  J Exp Bot       Date:  2022-01-27       Impact factor: 6.992

Review 4.  Sorghum genetic, genomic, and breeding resources.

Authors:  Zhanguo Xin; Mingli Wang; Hugo E Cuevas; Junping Chen; Melanie Harrison; N Ace Pugh; Geoffrey Morris
Journal:  Planta       Date:  2021-11-05       Impact factor: 4.116

5.  Dissecting Adaptive Traits with Nested Association Mapping: Genetic Architecture of Inflorescence Morphology in Sorghum.

Authors:  Marcus O Olatoye; Sandeep R Marla; Zhenbin Hu; Sophie Bouchet; Ramasamy Perumal; Geoffrey P Morris
Journal:  G3 (Bethesda)       Date:  2020-05-04       Impact factor: 3.154

  5 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.