Madhav Bhatta1, Alexey Morgounov2, Vikas Belamkar3, P Stephen Baenziger4. 1. Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, USA. madhav.bhatta@huskers.unl.edu. 2. International Maize and Wheat Improvement Center (CIMMYT), 06511 Emek, Ankara, Turkey. a.morgounov@cgiar.org. 3. Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, USA. vikas.belamkar@unl.edu. 4. Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, NE 68583, USA. pbaenziger1@unl.edu.
Abstract
Synthetic hexaploid wheat (SHW; 2n = 6x = 42, AABBDD, Triticum aestivum L.) is produced from an interspecific cross between durum wheat (2n = 4x = 28, AABB, T. turgidum L.) and goat grass (2n = 2x = 14, DD, Aegilops tauschii Coss.) and is reported to have significant novel alleles-controlling biotic and abiotic stresses resistance. A genome-wide association study (GWAS) was conducted to unravel these loci [marker⁻trait associations (MTAs)] using 35,648 genotyping-by-sequencing-derived single nucleotide polymorphisms in 123 SHWs. We identified 90 novel MTAs (45, 11, and 34 on the A, B, and D genomes, respectively) and haplotype blocks associated with grain yield and yield-related traits including root traits under drought stress. The phenotypic variance explained by the MTAs ranged from 1.1% to 32.3%. Most of the MTAs (120 out of 194) identified were found in genes, and of these 45 MTAs were in genes annotated as having a potential role in drought stress. This result provides further evidence for the reliability of MTAs identified. The large number of MTAs (53) identified especially on the D-genome demonstrate the potential of SHWs for elucidating the genetic architecture of complex traits and provide an opportunity for further improvement of wheat under rapidly changing climatic conditions.
Synthetic hexaploid wheat (SHW; 2n = 6x = 42, AABBDD, Triticum aestivum L.) is produced from an interspecific cross between durum wheat (2n = 4x = 28, AABB, T. turgidum L.) and goat grass (2n = 2x = 14, DD, Aegilops tauschii Coss.) and is reported to have significant novel alleles-controlling biotic and abiotic stresses resistance. A genome-wide association study (GWAS) was conducted to unravel these loci [marker⁻trait associations (MTAs)] using 35,648 genotyping-by-sequencing-derived single nucleotide polymorphisms in 123 SHWs. We identified 90 novel MTAs (45, 11, and 34 on the A, B, and D genomes, respectively) and haplotype blocks associated with grain yield and yield-related traits including root traits under drought stress. The phenotypic variance explained by the MTAs ranged from 1.1% to 32.3%. Most of the MTAs (120 out of 194) identified were found in genes, and of these 45 MTAs were in genes annotated as having a potential role in drought stress. This result provides further evidence for the reliability of MTAs identified. The large number of MTAs (53) identified especially on the D-genome demonstrate the potential of SHWs for elucidating the genetic architecture of complex traits and provide an opportunity for further improvement of wheat under rapidly changing climatic conditions.
Entities:
Keywords:
D-genome; bread wheat; complex traits; durum wheat; genes; genotyping-by-sequencing; haplotype block; marker–trait association; root traits; single nucleotide polymorphism
Drought is one of the most important abiotic stresses that reduce crop productivity and is expected to increase with the change in climate [1]. Erratic rainfall patterns caused by climate change may aggravate drought stress and will have a major impact on agriculture [2,3]. The most prominent example of the impact of drought stress on agriculture was the 2012 drought stress in the United States, where moderate to extreme drought stress occurred across the central agricultural states that resulted in crop harvest failure for corn (Zea mays L.), sorghum (Sorghum bicolor L.), and soybean (Glycine max L.), and the agriculture loss due to drought was estimated to be $30 billion [4]. To cope with the challenges of drought stress, plant breeders have been focusing on improving drought tolerance since several decades [2,3,5]. However, the drought tolerance is a complex phenomenon as most of the traits associated with drought tolerance are polygenic in nature, and understating the genetic architecture of drought tolerance is still underway [3] including in wheat (Triticum sps.) [2]. Wheat is one of the most important staple cereal crops mainly grown under rainfed conditions [3,6] and is expected to suffer from drought stress [3]. Therefore, breeding for drought tolerance and identifying genomic regions and underlying candidate genes associated with drought tolerance are important for wheat improvement.Bread wheat (T. aestivum L.) has limited genetic and phenotypic diversity available for breeding for drought tolerance [2]. This is mainly due to the genetic bottleneck experienced during its origin and subsequent domestication [7,8]. Diversity can be increased through the production of synthetic hexaploid wheat (SHW) and its utilization in breeding programs [9,10,11]. Synthetic hexaploid wheat (2n = 6x = 42, AABBDD) is produced from an interspecific cross between durum wheat (2n = 4x = 28, AABB, T. turgidum L.) and goat grass (2n = 2x = 14, DD, Aegilops tauschii Coss.). The SHWs are reported to have significant genetic variation for biotic [12,13] and abiotic stresses resistance [2,10,14]. However, previous studies focused mainly on biotic stresses including leaf rust (incited by Puccinia triticina) [13,15,16], stem rust (incited by P. graminis) [15,16], stripe rust (incited by P. striiformis) [12,15,16], Fusarium head blight (incited by Fusarium graminearum) [13], yellow spot (incited by Pyrenophora tritici-repentis) [15,16], septoria nodorum (incited by Parastagonospora nodorum) [15,16], Septoria tritici blotch (incited by Mycosphaerella graminicola) [13,15], cereal cyst nematode (incited by Heterodera avenae) [15], crown rot (incited by F. pseudograminearum) [16], and root-lesion nematode (incited by Pratylenchus thornei and P. neglectus) [15]. Therefore, exploiting genetic variation under abiotic stresses such as drought is needed to further utilize the potential of SHWs.About 800 quantitative trait loci (QTLs) and marker–trait associations (MTAs) have been reported for drought tolerant traits (agronomic, physiological, root, and yield-related traits) using bi-parental mapping (~691 QTLs) and genome-wide association studies (GWASs; ~109 MTAs) in wheat [3]. However, only 68 QTLs are major QTLs that explain more than 19% of phenotypic variation [3]. This study was conducted to identify novel genomic regions associated with grain yield (GY) and yield-related traits using GWAS performed using 35,648 genotyping-by-sequencing (GBS)-derived single nucleotide polymorphisms (SNPs) in 123 SHWs grown under two drought-stressed growing seasons (2016 and 2017) in Konya, Turkey. Subsequently, the underlying genes for the MTAs identified were investigated for their potential role in drought stress using the functional annotations. To the best of our knowledge, this is the first report on GWAS on GY and yield-related traits under drought stress in SHWs. The results from this study will be a valuable resource for the genetic improvement of GY and yield-related traits in drought stress, introgression of desirable genes from SHWs into elite wheat germplasm, genomic selection, and marker-assisted selection in the breeding program.
2. Results and Discussion
2.1. Weather Conditions
The mean monthly air temperatures were similar at Konya in both growing seasons with 13 °C in 2015–2016 and 12 °C in 2016–2017 compared to the 25-year mean monthly air temperature (11 °C) in Turkey (Table 1). The total rainfall during 2016–2017 (243 mm) was slightly higher than that during 2015–2016 (222.4 mm) growing season. Total rainfalls during the wheat-growing season (September–July) were 48.9% lower in 2015–2016 and 44.2% lower in 2016–2017 compared to the 25-year mean total rainfall (435.1 mm) in Turkey. Although winter wheat water requirements are higher from mid-March to mid-June (from the spring tillering period to the mid-grain filling period), rainfalls were lower in both years compared to the 25-year mean rainfall. The plants were exposed to drought stress from tillering through grain filling. Hence, the results from the present study can be used to understand the effects and genetics of drought in SHW.
Table 1
Mean monthly temperatures and total monthly rainfalls in two growing seasons (2016 and 2017) and 25-year averaged data in Konya, Turkey.
Month
Konya, 2015–2016
Konya, 2016–2017
Turkey, 1991–2015
Konya, 2015–2016
Konya, 2016–2017
Turkey, 1991–2015
Temperature (temp) (°C) a
Temp (°C)
Temp (°C) b
Rainfall (mm) c
Rainfall (mm)
Rainfall (mm) d
September
22.8
17.1
19.0
35.8
11.2
23.1
October
15.3
13.2
13.7
34.4
0.0
48.3
November
7.5
4.9
7.0
5.8
16.6
58.0
December
−0.1
0.5
2.1
8.0
26.8
73.0
January
1.6
0.2
0.1
37.0
9.0
65.6
February
4.9
3.4
1.3
0.4
69.2
60.0
March
8.5
8.2
5.3
37.8
31.0
61.6
April
15.2
12.7
10.4
9.6
33.2
62.7
May
18.4
16.7
15.2
38.4
41.2
54.6
June
23.7
24.4
19.5
15.0
4.8
34.7
July
26.6
27.7
22.9
0.2
0.0
15.1
Total/average
13.1
11.7
10.6
222.4
243.0
435.1
a Source: Bahri Dagdas International Agricultural Research Institute. b Source: http://sdwebx.worldbank.org/climateportal/index.cfm?page=country_historical_climate&ThisCCode=TUR. c Source: Bahri Dagdas International Agricultural Research Institute. d Source: http://sdwebx.worldbank.org/climateportal/index.cfm?page=country_historical.
2.2. Phenotypic Variation for Yield and Yield-Related Traits
A combined analysis of variance (ANOVA) across years identified significant cross-over genotype × year interaction for all traits except for flag leaf width (FLW) and stem diameter (STMDIA) (Table S4). Therefore, analysis of variance was computed for both years separately and the results indicated that the SHWs showed significant variation for GY and yield-related traits in each year (Table 2). For instance, GY ranged from 200 g⋅m−2 to 341 g⋅m−2 with an average yield of 259 g⋅m−2 in 2016 and from 241 g⋅m−2 to 392 g⋅m−2 with an average yield of 290 g⋅m−2 in 2017 (Table 2). The large variation among the traits in each year can be attributed to the collection of diverse accessions of SHWs from different countries and different genetic backgrounds [11,14].
Table 2
Phenotypic variation for grain yield and yield-related traits with best linear unbiased predictor values, range, percentage of coefficient of variation (CV), and broad sense heritability (H) of 123 synthetic hexaploid wheat grown in two seasons (2016 and 2017) in Konya, Turkey.
Trait
2016
2017
Mean
Range
CV
H2
Mean
Range
CV
H2
Grain yield (g⋅m−2)
259
200–341
9.7
0.32
290
241–392
9.9
0.56
Harvest index
0.4
0.24–0.66
10.9
0.63
0.34
0.27–0.41
6.3
0.64
Biomass weight (g⋅m−2)
671
537–827
9.1
0.39
865
684–1098
8.9
0.63
Thousand kernel weight (g)
32.1
24–42
10.5
0.75
41
33–50
8
0.90
Grain volume weight (Kg⋅hL−1)
65.6
52–77
7.2
0.91
74
68–77
2.3
0.76
Awn length (cm)
6
2.3–8.6
24.3
0.61
5.6
0.5–8.0
28.3
0.95
Flag leaf length (cm)
22.4
21.8–22.8
0.8
0.91
12
9.9–16.4
7.6
0.53
Flag leaf width (cm)
1
0.96–1.13
2.8
0.67
1
0.9–1.3
6.1
0.49
Flag leaf area (cm2)
18.9
17.6–19.7
2.2
0.85
10.1
7.7–14
11.6
0.52
Stem diameter (mm)
2.9
2.4–3.5
6.9
0.57
2.9
2.5–4.0
7.4
0.63
Root length (cm)
393
392–395
0.20
0.6
192.2
72–375
20
0.31
Broad-sense heritability (H) ranged from low to high (Table 2). Low to moderate Hwas observed for GY (0.32–0.56), biomass weight (BMWT; 0.39–0.63), FLW (0.49–0.67), and root length (RTLN; 0.31–0.60); moderate Hwas observed for harvest index (HI; 0.63–0.64) and STMDIA (0.57–0.63); moderate to high Hwas observed for flag leaf length (FLLN; 0.53–0.91), flag leaf area (FLA; 0.52–0.85) and awn length (AWNLN; 0.61–0.95); and high H was observed for thousand kernel weight (TKW; 0.75–0.90) and grain volume weight (GVWT; 0.76–0.91), indicating the genetic instability of these traits across years under drought stress. Similar H for most of these traits have been observed in previous studies [17,18,19,20,21,22,23,24,25,26].
2.3. Principal Component Analysis and Phenotypic Correlation
Principal component (PC) bi-plot analysis showed the association between GY and yield-related traits based on correlation matrix (Figure 1). The first two PCs that explained 43.4% (2016) and 44.9% (2017) of variation better explained the relationship between traits in the two-dimensional space. In the PC biplot, we observed two distinct groupings. The first one comprised of GY, HI, BMWT, TKW, GVWT, AWNLN, and RTLN whereas the second one had FLLN, FLW, and FLA (Figure 1). The traits grouping with GY are the more important traits for improving GY in drought-stressed conditions. The association observed in the PC biplot was supported by the significant correlations of GY with BMWT, HI, TKW, and GVW in both years (Figure S1). Similar correlations for these traits were observed in the previous studies [18,27,28,29,30,31,32].
Figure 1
Principal component bi-plot analysis of 123 drought-stressed synthetic hexaploid wheat grown in two seasons (2016 and 2017) in Konya, Turkey. AWLN, awn length; BMWT, biomass weight; FLA, flag leaf area; FLLN, flag leaf length; FLW, flag leaf width; GVWT, grain volume weight; GY, grain yield; HI, harvest index; RTLN, root length; STMDIA, stem diameter; and TKW, thousand kernel weight.
2.4. Population Structure and Genome-Wide Association Study
Population structure analysis of 123 SHWs was performed using 35,648 SNPs [filtered for minor allele frequency (MAF) > 0.05 and missing data < 20%] using the Bayesian clustering algorithm implemented in Structure software and the results showed that these lines were divided into three subgroups (Figure S2 and Table S1). The details of the population structure and genetic diversity of these SHWs have been previously reported in Bhatta et al. [11].The GWAS identified novel genomic regions for GY and yield-related traits and the MTA explained the high phenotypic variance. The Fixed and random model Circulating Probability Unification algorithm (FarmCPU), with kinship, population structure (Q) or PC, best linear unbiased predictors (BLUPs) for each trait, and 35,648 GBS-derived SNPs was used to identify MTAs. The GBS-derived SNPs were well distributed across each of the chromosome (Figure S3). We identified 194 MTAs distributed across 21 chromosomes for GY and yield-related traits with phenotypic variance explained (PVE) ranging from 1.1% to 32.3% (Figure 2 and Figure S4, Table S5). The highest number of MTAs was observed for GY (29), followed by STMDIA (23), FLA (20), and TKW (20) while the lowest MTAs were observed for HI (10) (Figure 2). Of the 194 MTAs, 75 MTAs were detected on the A genome, with 66 MTAs on the B genome, and 53 MTAs on the D-genome. The highest MTAs were present on chromosome 7A (26 MTAs) and the lowest MTAs were present on chromosome 3D (four MTAs) (Figure 2). Most of the MTAs identified in the present study were year-specific, suggesting the influence of genotype × environment interaction on the phenotype of the traits measured in two years. However, 120 of the 194 significant SNPs were in 83 genes and 45 of these MTAs were present within genes and their annotations suggested their potential role in drought stress. This result further provided confidence that the MTAs identified in the study are likely reliable MTAs (Table S5).
Figure 2
Significant markers trait associations identified on each chromosome for grain yield and yield-related traits obtained from the genome-wide association study of 123 synthetic hexaploid wheats grown in 2016 and 2017 in Konya, Turkey.
2.4.1. Grain Yield
The 29 MTAs for GY were observed in 29 different genomic regions on seven chromosomes including 1B, 2B, 3A, 3D, 5B, 7A, and 7B with PVE ranging from 7.6% to 17.9% (Figure 2, and Table S5). Earlier studies have reported QTLs/MTAs for GY on wheat chromosomes 1B [5,17,19,33], 2B [17,19,29,33,34], 3A [17,30,33,34], 3D [33], 5B [5,28,30,31,33,35], 7A [20,27,30,33], and 7B [17,30,33]. However, it is difficult to align our findings with previous studies due to the use of different marker systems [90K SNP, short sequence repeat (SSR), diversity arrays technology (DART) marker vs. GBS-derived SNP marker], absence of precise location information in published studies, or the use of a different version of the reference wheat genome in previous studies than the International Wheat Genome Sequencing Consortium (IWGSC) RefSeq v1.0. However, identification of several MTAs on the same chromosome as earlier studies provided increased confidence on these associations.The present study identified four major haplotype blocks (from 19 bp to 433 kb) on chromosome 7A with two to six SNPs associated with GY in 2016 (Figure 3). First haplotype block consisted of six MTAs within the 433 kb range, second haplotype block consisted of four MTAs within the 81 bp range, third haplotype block consisted of two MTAs within the 19 bp range, and fourth haplotype block consisted of three MTAs within the 314 kb range. The PVE on GY by the first, second, third, and fourth haplotype blocks were 17.2%, 24.6%, 21.9%, and 8.2%, respectively.
Figure 3
Linkage disequilibrium (LD) values (R2) and haplotype blocks with significant marker–trait associations (MTAs; ≥2) observed (A) on chromosome 7A for GY, (B) on chromosome 3A for GY, (C) on chromosome 3A for BMWT, (D) on chromosome 3B for STMDIA, (E) on chromosome 1A for FLA, (F), on chromosome 6B for FLA, (G) on chromosome 7D for FLA, and (H) on chromosome 6D for RTLN and phenotypic variance explained (PVE) by each haplotype block. Dark red color represents the strong LD whereas light red color represents the weak LD between pairs of MTAs.
One MTA (S7A_112977027; 112977027 bp) present in-between the second (537 kb away) and third (837 kb) haplotype blocks was within the gene, TraesCS7A01G158200.1, and PVE on GY was 12.8% (Table S5). This gene was annotated as a member of sentrin-specific protease of Ubiquitin-like Protease 1 (Ulp1) gene family (Table 3). The Ulp1 is a small ubiquitin related modifier (SUMO)-specific protease that affects several important biological processes in plants including response to abiotic stress [21]. It has been shown to play a role in drought tolerance in Arabidopsis (Arabidopsis thaliana) [36] and rice (Oryza sativa) [22,37]. This makes this MTA interesting and a stronger candidate for future functional validation studies.
Table 3
List of significant markers associated with GY and yield-related traits, favorable alleles (underlined), SNP effects, and drought-related putative genes from genome-wide association study of 123 drought stressed synthetic hexaploid wheats grown in 2016 in Konya, Turkey.
P-loop containing nucleoside triphosphate hydrolases superfamily protein
FLA
S6B_120860110
4.01
G/A
0.17
9.3
TraesCS6B01G125800
Cytochrome P450 family protein, expressed
FLA
S6B_120860130
4.01
A/T
−0.17
9.3
TraesCS6B01G125900
Cytochrome P450 family protein, expressed
STMDIA
S1D_431523575
6.58
A/G
−0.06
10.3
TraesCS1D01G341500
Disease resistance protein (NBS-LRR class) family
STMDIA
S3D_10133372
9.83
G/T
−0.11
8.6
TraesCS3D01G028500.1
Leucine-rich repeat receptor-like protein kinase family protein
STMDIA
S6A_94238211
6.9
T/G
0.06
7.5
TraesCS6A01G122200.1
Protein kinase, putative
RTLN
S5B_669373985
4.62
T/C
0.27
6.9
TraesCS5B01G502200
GRAM domain-containing protein/ABA-responsive
RTLN
S5B_669374027
4.62
T/C
0.27
6.9
TraesCS5B01G502200
GRAM domain-containing protein/ABA-responsive
RTLN
S6D_431108774
4.01
A/G
−0.27
5.8
TraesCS6D01G332800.1
Protein DETOXIFICATION
RTLN
S7A_94404310
4.01
G/A
0.52
7.5
TraesCS7A01G143200.2
Phosphatase 2C family protein
PVE: phenotypic variance explained; GY, grain yield; HI, harvest index; BMWT, biomass weight; TKW, thousand kernel weight; GVWT, grain volume weight; AWNLN, awn length; FLLN, flag leaf length; FLW, flag leaf width; FLA, flag leaf area; STMDIA, stem diameter; RTLN, root length. a S+chromosome_chromosome position in bp.
Another major haplotype block (18 kb) of three MTAs was observed on chromosome 3A in 2017 (Figure 3) and the PVE on GY by this haplotype block was 13.1% (Figure 3). The chromosome 3A is known to be an important chromosome that contains useful QTLs for GY and yield-related traits [17,18,19,20,21,22,30,31,32,33,34,35,36,37,38] and the haplotype block identified will have a significance in the crop improvement program. All three MTAs present in this haplotype block of chromosome 3A were found in the gene, TraesCS3A01G047300 (Table S5), which was annotated as a member of the F-box gene family (Table 4). These three SNPs were indicated as having a moderate impact on the protein as they resulted in a missense mutation and caused an amino acid change. Such changes may alter the function of the protein [39], which makes this F-box gene a strong candidate for future functional characterization studies under drought tolerance in wheat. The F-box proteins are known to regulate many important biological processes, such as embryogenesis, floral development, plant growth and development, biotic and abiotic stresses, hormonal responses, and senescence [39]. Two other MTAs observed on chromosome 3A and 3D were present within genes (F-box family protein: TraesCS3A01G445100 and disease resistance protein RPM1: TraesCS3D01G002700) that have been previously reported to be involved in drought tolerance [39,40] (Table 4).
Table 4
List of significant markers associated with GY and yield-related traits, favorable alleles (underlined), SNP effects, and drought-related putative genes obtained from genome-wide association study of 123 drought stressed synthetic hexaploid wheats grown in 2017 in Konya, Turkey.
Trait
SNP a
−log10 (p)
Alleles
SNP Effect
PVE (%)
Gene-ID
Annotation
GY
S3A_25012018
4.81
A/G
−20.02
12.7
TraesCS3A01G047300
F-box-domain-containing protein
GY
S3D_1203058
4.12
T/G
14.32
12.8
TraesCS3D01G002700
Disease resistance protein RPM1
BMWT
S3A_25012018
6.08
A/G
−59.44
14.4
TraesCS3A01G047300
F-box-domain-containing protein
TKW
S4B_11905230
8.94
C/G
−1.11
3.9
TraesCS4B01G016200.1
LOB domain-containing protein, putative
TKW
S4B_637722874
5.17
T/C
0.86
2.0
TraesCS4B01G344200.1
Zinc finger (C3HC4-type RING finger) family protein
GVWT
S1A_522189599
4.11
A/G
−0.55
2.5
TraesCS1A01G334800
Cytochrome P450
GVWT
S4A_73454791
5.64
C/T
−0.63
5.5
TraesCS4A01G074200.2
Microtubule associated protein family protein, putative, expressed
AWNLN
S5B_43896804
7.31
C/T
−1.13
6.0
TraesCS5B01G038700
F-box family protein
AWNLN
S6B_643657
10.55
C/T
−1.04
5.7
TraesCS6B01G001000
F-box family protein
FLLN
S1B_631203243
5.26
A/G
−0.21
9.8
TraesCS1B01G400600.1
Rp1-like protein
FLLN
S2B_140752747
4.15
G/C
0.29
1.6
TraesCS2B01G167500.1
Cytochrome P450, putative
FLLN
S2D_642055122
4.12
T/C
0.25
5.9
TraesCS2D01G579800
protein kinase family protein
FLLN
S4A_612662321
5.3
C/T
−0.23
5.3
TraesCS4A01G325200
F-box family protein
FLLN
S6D_463762312
5.72
G/A
0.25
6.0
TraesCS6D01G386300
Cytochrome P450, putative
FLW
S1A_516732460
6.99
A/G
−0.03
9.5
TraesCS1A01G326700.1
Citrate-binding protein
FLW
S6B_26200560
7.13
C/A
0.03
12.3
TraesCS6B01G042800
F-box family protein
FLA
S1A_516732460
6.9
A/G
−0.42
8.0
TraesCS1A01G326700.1
Citrate-binding protein
FLA
S2A_764065400
4.18
G/T
−0.19
3.8
TraesCS2A01G563200
NBS-LRR resistance-like protein
STMDIA
S6B_610963076
5.7
T/G
0.06
7.9
TraesCS6B01G346900-TraesCS6B01G347000
NBS-LRR disease resistance protein and F-box protein-like
RTLN
S2D_620326979
4.22
T/C
192.21
9.9
TraesCS2D01G541000.1
Disease resistance protein RPM1
PVE: phenotypic variance explained; GY, grain yield; HI, harvest index; BMWT, biomass weight; TKW, thousand kernel weight; GVWT, grain volume weight; AWNLN, awn length; FLLN, flag leaf length; FLW, flag leaf width; FLA, flag leaf area; STMDIA, stem diameter; RTLN, root length. a S+chromosome_chromosome position in bp.
The GY haplotype blocks and other MTAs identified in the present study have not been mapped to date and four MTAs were in the genes, of which functional annotations suggested that they are likely involved in drought tolerance. This result implied that haplotype blocks observed on chromosome 3A (3 MTAs) and 7A (16 MTAs), and one MTA on chromosome 3D (1) for GY are novel and may potentially be used in a marker-assisted breeding program, focusing on improving drought tolerance in wheat after validating them in different populations and environments.
2.4.2. Harvest Index
A total of 10 SNPs significantly associated with HI were identified on chromosomes 1D, 2A, 2D, 3A, 3D, 5B, 6B, 6D, and 7B (Figure 2) with PVE ranging from 2.2% to 18.7% (Table S5). Previous studies have reported QTLs/MTAs responsible for HI on chromosomes 2D [27], 3A [27,30], 6B [41], and 7B [30]. To the best of our knowledge, the six MTAs identified for HI on chromosomes 1D, 2A, 3D, 5B, and 6D have not been reported and they are potentially novel MTAs responsible for HI.Six MTAs for HI detected on chromosomes 2A, 3A, 6B, 6D, and 7B were found in genes and two of these genes have annotations suggesting their involvement in drought stress (Table 3 and Table S5). The two genes are WRKY transcription factor (TraesCS3A01G343700) found on chromosome 3A and cytochrome P450 (TraesCS6D01G170900.1) found on chromosome 6D. The role of WRKY transcription factor is well known in abiotic stresses including drought tolerance [42,43]. The cytochrome P450 genes are a large superfamily of enzymes and are involved in many metabolic pathways including drought tolerance in rice [44] and Arabidopsis [45,46]. The multi-trait marker associated with GY and HI was located on chromosome 5B (S5B_598463062) with PVE ranging from 15.9% to 18.7% (Table S5).
2.4.3. Biomass Weight
The 15 MTAs responsible for BMWT were identified on chromosomes 1D, 2B, 3A, 4A, 6D, and 7B (Figure 2) with PVE ranging from 4.9% to 14.4% (Table S5). Previous studies have reported QTLs/MTAs responsible for BMWT on chromosomes 1D [30,41], 2B [27], 6D [27], and 7B [30]. The four MTAs identified for BMWT on chromosome 3A and 4A have not been reported and they are potentially novel MTAs responsible for BMWT.A novel haplotype block (38 kb) of three SNPs on chromosome 3A associated with BMWT was identified in 2017 (Figure 3) with PVE by the haplotype block of 11.7%. This MTA (S3A_25012018) was also associated with GY and PVE ranged from 12.7% (GY) to 14.4% (BMWT).All three MTAs present in this haplotype block were within genes (Table S5) and one of the genes had annotations suggesting its involvement in drought tolerance was an F-box family protein (TraesCS3A01G047300) (Table 4) [39]. Excluding MTAs on haplotype block, eight MTAs for BMWT detected on chromosomes 1D, 2B, 6D, and 7B were found in genes (Table S5) and two of the genes had annotations suggesting its involvement in drought stress (Table 3 and Table 4). The genes associated with two MTAs are F-box family protein (TraesCS7B01G242600) [39] and protein DETOXIFICATION containing multi-antimicrobial extrusion protein (MatE) (TraesCS1D01G357500) [23].
2.4.4. Thousand Kernel Weight
A total of 20 MTAs responsible for TKW were detected in 19 different genomic regions on chromosomes 1A, 2A, 2B, 2D, 3A, 3B, 4A, 4B, 4D, 5B, 6D, 7B, and 7D (Figure 2) with PVE ranging from 1.6% to 22.2% (Table S5). Earlier studies have reported QTLs/MTAs for TKW on chromosomes 1A [19,24,29,30], 2A [20], 2B [20,29,30], 2D [19], 3A [24,25,29], 3B [20,26], 4B [5], 5B [24], 7B [30], and 7D [20]. In the present study, only one MTA (S2D_7309581) responsible for TKW was detected on chromosome 2D in both years and assumed to be a stable MTA, because this MTA was detected despite significant genotype x year interaction. The five MTAs identified for TKW on chromosomes 4A, 4D, and 6D have not been previously reported and they are potentially novel MTAs responsible for TKW.Twelve MTAs responsible for TKW detected on chromosomes 2A, 2B, 3A, 3B, 4A, 4B, 5B, and 6D were found in genes (Table S5) and five of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with MTAs involved in drought tolerance are F-box family protein (chromosome 3A; TraesCS3A01G047300) [39], protein kinase family protein (chromosome 4A; TraesCS4A01G347600 and chromosome 6D; TraesCS6D01G360800) [47], cytochrome P450 family protein (chromosome 4D; TraesCS4D01G364700) [44,45,46], and zinc finger (C3HC4-type RING finger) family protein (chromosome 4B; TraesCS4B01G344200.1) [48,49,50]. The SNP S4D_509427923 was annotated as a missense variant and thus may have a moderate impact on the protein function (Table S5).
2.4.5. Grain Volume Weight
Thirteen MTAs responsible for GVWT were identified on chromosomes 1A, 2A, 2B, 2D, 3A, 4A, 5A, 6A, and 7A (Figure 2) with PVE ranging from 1.3% to 16.2% (Table S5). Earlier studies have reported QTLs/MTAs for GVWT on chromosomes 1A [33], 2A [33,51], 2B [33,51], 2D [33,51], 5A [33] and 7A [33,51]. Four MTAs identified for GVWT on chromosomes 3A, 4A, and 6A have not been previously reported and they are potentially novel MTAs responsible for GVWT.Eight MTAs responsible for GVWT detected on chromosomes 1A, 2A, 2B, 4A, 6A, and 7A were found in genes (Table S5) and two of these genes had annotations suggesting their involvement in drought stress (Table 4). The genes associated with two MTAs involved in drought tolerance are cytochrome P450 (TraesCS1A01G334800) [44,45,46] on chromosome 1A and microtubule-associated protein family protein (TraesCS4A01G074200.2) [52] on chromosome 4A.
2.4.6. Awn Length
Twenty MTAs responsible for AWNLN were observed on chromosomes 1D, 2A, 2B, 3B, 4A, 4B, 4D, 5A, 5B, 5D, 6B, and 7A (Figure 2) with PVE ranging from 1.1% to 20.1% (Table S5). Earlier studies have reported QTLs/MTAs for AWNLN on chromosomes 2A [53,54], 4A [54], 4B [54], 5A [54], and 6B [53,54]. The nine MTAs identified for AWNLN on chromosomes 2B, 3B, 4D, 5B, 5D, and 7A have not been previously reported and they are potentially novel MTAs responsible for AWNLN.Eleven MTAs responsible for AWNLN detected on chromosomes 1D, 2A, 4D, 5A, 5B, 6B, and 7A were found in genes (Table S5) and four of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with four MTAs involved in drought tolerance are 60S ribosomal protein L18a (chromosome 4D; TraesCS4D01G290700) [55], guanine nucleotide exchange family protein (chromosome 5A; TraesCS5A01G361300) [56], and F-box family protein (chromosome 5B; TraesCS5B01G038700 and chromosome 6B; TraesCS6B01G001000) [39]. It has been reported that the putative 60S ribosomal protein L18a is an upregulated transcript in response to drought stress in ears and silks during the flowering stage in maize [55].
2.4.7. Flag Leaf Length
Thirteen MTAs responsible for FLLN were detected on chromosomes 1B, 1D, 2A, 2B, 2D, 4A, 6D, and 7B (Figure 2) with PVE ranging from 1.58% to 32.3% (Table S5). Previous studies have reported QTLs for FLLN on chromosomes 1B [57,58], 2B [57,58,59,60], 2D [57,61], 4A [57,58,59], and 7B [61]. The four MTAs identified for FLLN on chromosomes 1D, 2A, and 6D have not been previously reported and they are potentially novel MTAs responsible for FLLN.Eleven MTAs responsible for FLLN detected on chromosomes 1B, 1D, 2B, 2D, 4A, 6D, and 7B were found in genes (Table S5) and four of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with four MTAs involved in drought stress are F-box family protein (chromosome 4A: TraesCS4A01G325200) [39], cytochrome P450 (chromosome 2B; TraesCS2B01G167500 and chromosome 6D; TraesCS6D01G386300) [44,45,46], and Rp1-like protein (chromosome 1B; TraesCS1B01G400600) [40].
2.4.8. Flag Leaf Width
Sixteen MTAs responsible for FLW were detected on chromosomes 1A, 1B, 1D, 2B, 2D, 4B, 6B, and 6D (Figure 2) with PVE ranging from 1.6% to 15.2% (Table S5). Previous studies have found QTLs for FLW on chromosomes 1B [57,60,61], 1D [57,59], 2B [57,59], 2D [57,59,61], 4B [59,60], and 6B [57,58,59]. The two MTAs identified for FLW on chromosomes 1A and 6D have not been previously reported and they are potentially novel MTAs responsible for FLW.Thirteen MTAs responsible for FLW detected on chromosomes 1A, 1B, 1D, 2D, 4B, 6B, and 6D were found in genes (Table S5) and three of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with three MTAs involved in drought stress are citrate-binding protein (chromosome 1A; TraesCS1A01G326700) [62], F-box family protein (chromosome 6B; TraesCS6B01G042800) [39], and mitochondrial transcription termination factor-like (chromosome 6D; TraesCS6D01G040100) [63]. The SNPs S1A_516732460 and S6D_16376439 were annotated as a missense variant and thus may impact the function of the proteins that are annotated as citrate-binding protein and mitochondrial transcription termination factor-like protein, respectively (Table S5).
2.4.9. Flag Leaf Area
Twenty MTAs responsible for FLW were detected on chromosomes 1A, 1B, 1D, 2A, 2D, 4D, 5A, 6B, and 7D (Figure S2) with PVE ranging from 8.1% to 23.1% (Table S5). Previous studies have reported QTLs for FLA on chromosomes 1B [58,59], 1D [57,59,61], 2A [57,59,61], 2D [57,59,61], 4D [58], 5A [57,58,60,61], 6B [58], and 7D [61]. The four MTAs identified for FLA on chromosome 1A have not been previously reported and they are potentially novel MTAs responsible for FLA.Three novel haplotype blocks were observed for FLA on chromosomes 1A (two MTAs), 6B (two MTAs) and 7D (three MTAs) with PVE by these haplotype block ranging from 5.5% to 8.6% (Figure 3). Fourteen MTAs responsible for FLA detected on chromosomes 1A, 1B, 1D, 2A, 4D, 5A, 6B, and 7D were found in genes (Table S5) and five of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with five MTAs involved in drought stress are citrate-binding protein (TraesCS1A01G326700), P-loop containing nucleoside triphosphate hydrolases superfamily protein (TraesCS1D01G197200) [64], cytochrome P450 (TraesCS6B01G125900) [44,45,46], and NBS-LRR resistance-like protein [40].The multi-trait marker associated with FLW and FLA was located on chromosome 1A (S1A_516732460) with PVE ranging from 8.0% to 9.5% (Table S4). Another multi-trait marker associated with FLLN and FLA was located on chromosome 2A (S2A_29874199) with PVE ranging from 23.1% to 32.3% (Table S4). The multi-trait MTA indicates that the related candidate gene may affect multiple traits.
2.4.10. Stem Diameter
In the present study, 23 MTAs responsible for STMDIA were identified on chromosomes 1A, 1D, 2B, 2D, 3A, 3B, 3D, 4D, 5A, 5B, 6A, 6B, 6D, 7A, and 7B (Figure 2) with PVE ranging from 2.7% to 28.8% (Table S5). Earlier study has identified one minor QTL (QSd-3B) for STMDIA on chromosome 3B that explained 8.7% of the phenotypic variance [65]. It means that, 19 MTAs detected on chromosomes 1A, 1D, 2B, 2D, 3A, 3D, 4D, 5A, 5B, 6A, 6B, 6D, 7A, and 7B except chromosome 3B in the present study may potentially be a novel MTAs controlling STMDIA under drought stress.Four MTAs were detected on chromosome 3B and two of them (1 bp apart) were observed in one haplotype block in 2016 with PVE was 9.2% (Figure 3). Fifteen MTAs for STMDIA detected on chromosomes 1D, 2B, 3A, 3B, 6B, 6D, 7A, and 7B were found in genes (Table S5) and four of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with four MTAs involved in drought stress are leucine-rich repeat receptor-like protein kinase family protein (TraesCS3D01G028500) [66], protein kinase (TraesCS6A01G122200.1) [47], disease resistance protein (NBS-LRR class) family (TraesCS1D01G341500 and TraesCS6B01G346900) [40], and F-box protein family (TraesCS6B01G347000) [39].
2.4.11. Root Length
RTLN is one of the most important traits under drought stress. We have measured RTLN 3–4 days after anthesis (Zadoks 60 growth stage) under the drought-stressed field condition using WinRhizo® (WinRhizo reg. 2009c, Regent Instruments Inc., Quebec City, QC, Canada). This trait is very unique compared to previous studies where they focused on the roots of seedlings [67,68,69,70] rather than direct field-based measurements (labor intensive, time consuming, and expensive). Identification of QTL governing RTLN is very important in wheat, especially for the wheat grown under drought stress. Limited information is available on QTL related to root traits in wheat [67,69,70,71,72].In the present study, 15 MTAs responsible for RTLN were identified on chromosomes 2B, 2D, 3B, 5B, 6A, 6D, and 7A (Figure 2) with PVE ranging from 5.3% to 18.5% (Table S5). Earlier studies have reported QTLs for RTLN on chromosomes 2B [68], 3B [69], 5B [67,69], 6A [68,69], and 6D [67,70] in hexaploid wheat and on chromosomes 2B [71,72], 3B [72], 6A [72], and 7A [72] in tetraploid wheat. The MTA identified for RTLN on chromosome 2D has not been previously reported and it is potentially novel MTAs responsible for RTLN under drought stress. Furthermore, previous studies identified very few QTLs for RTLN on the D-genome of wheat [67,70]. Therefore, the MTAs (eight MTAs) for RTLN detected on the D-genome of SHWs in the present study are potentially novel.Seven out of eight MTAs responsible for RTLN were present on chromosome 6D. Two haplotype blocks (the haplotype block1 with a size of 64 kb and the haplotype block2 with a size of 5kb) were identified from five out of seven MTAs for RTLN on chromosome 6D with PVE ranging from 5.0% to 11.8% (Figure 3). One SNP (S6D_435300571) present in the haplotype block2 was found in the gene (TraesCS6D01G332800) with PVE of 13.0%. The gene associated with this SNP is protein detoxification gene-containing multi-antimicrobial extrusion protein (MatE) (Table 3) and has been reported to be expressed mainly in the root than shoots under drought stress [73]. For instance, MatE family genes such as HvAACT1 in barley [74] and TaMate in wheat [75], encode proteins that are primarily localized to root epidermis cells [74] and required for external resistance [23]. In the present study, this gene was also significantly associated with BMWT on chromosome 1D. This result implied that that this gene plays an important role for RTLN and BMWT in drought-stressed conditions.Excluding MTAs on haplotype block, eight MTAs for RTLN detected on chromosomes 2D, 3B, 5B, 6A, and 7A were found in genes (Table S5) and four of these genes had annotations suggesting their involvement in drought stress (Table 3 and Table 4). The genes associated with four MTAs involved in drought stress are GRAM domain-containing protein/ABA-responsive (TraesCS5B01G502200) [76,77,78,79,80,81], phosphatase 2C family protein (TraesCS7A01G143200.2) [82], and disease resistance protein RPM1 (TraesCS2D01G541000.1) [40]. The SNPs S7A_94404310 was annotated as a missense variant and thus may have a moderate impact on the protein function (Table S5).
2.5. Potential Candidate Gene Annotations Affecting Yield and Yield-related Traits under Drought Stress
This study identified ~194 MTAs present on different chromosomes and associated with multiple traits. These 62 MTAs were associated with either the same trait in multiple years (MTA stability in different environments) or multiple traits within the same year or across years (suggesting epistasis) (Table S5). Additionally, ~45 of the MTAs were present in genes with annotations relevant to the respective trait under drought stress (Table 3 and Table 4). Interestingly, we noticed MTAs associated with the same or related traits were located within genes that had the exact same annotation (Figure 4; and Table S6). For instance, some of the MTAs for GY (2 MTAs), BMWT (2), TKW (1), AWNLN (2), FLLN (1), FLW (1), and STMDIA (1) were located within genes annotated as F-box family protein (Table S6). Similarly, the genes annotated as cytochrome P450 harbored MTAs for HI (1), TKW (1), FLA (2), GVWT (1), and FLLN (2). Additional examples are provided in Figure 4 and Table S6 in Supplementary Materials. This result indicated the likely gene families that are important for GY and yield-related traits under drought stress.
Figure 4
Potential candidate gene functions harboring SNPs affecting yield and yield-related traits under drought stress. The count of marker–trait associations (for either single or multiple traits) located within genes that have the same gene annotation is shown. AWLN, awn length; BMWT, biomass weight; FLA, flag leaf area; FLLN, flag leaf length; FLW, flag leaf width; GVWT, grain volume weight; GY, grain yield; HI, harvest index; RTLN, root length; STMDIA, stem diameter; and TKW, thousand kernel weight.
3. Materials and Methods
3.1. Site Description
A field experiment was conducted during two growing seasons (2016 and 2017) under drought stressed conditions (rainfed) at the research farm located at the Bahri Dagdas International Agricultural Research Institute in Konya, Turkey (37°51′15.894″ N, 32°34′3.936″ E; Elevation = 1021 m). This site was characterized by a low precipitation (below 300 mm), low humidity, and slightly alkaline clay loam soil [83].
3.2. Plant Materials and Experimental Design
One hundred twenty-three SHWs developed from two introgression programs were used (Table S1). The first group was developed by Kyoto University, Japan from one spring durum (‘Langdon’) parent crossed with 14 different Ae. tauschii accessions resulting in 14 different lines (Table S2). The remaining 109 lines were the second group of synthetics that were developed by International Maize and Wheat Improvement Center (CIMMYT) from the six durum parents crossed with 11 different Ae. tauschii accessions mainly from the Caspian Sea Basin area. Initially, 13 crosses among six winter durum wheats were involved in the creation of 13 different winter-type synthetics. However, due to the partial sterility, segregation, and continuous selection in the early generation, 109 lines were selected as unique lines because of their differences in phenotype [14] and their kinship values [11]. The synthetic genotypes used in the present study are unique and have been developed recently and tested for their agronomic traits [14], genetic diversity, and population structure [11]. The detailed information of these SHWs were provided in Bhatta et al. [11].The experimental design was an augmented design (plot size: 1.2 m × 5 m) with replicated checks (‘Gerek’ and ‘Karahan’) in the 2016 growing season and modified alpha-lattice design (plot size: 1.2 m × 5 m) including 123 SHWs and replicated checks (‘Gerek’ and ‘Karahan’) with two replications in the 2017 growing season. The SHWs were planted on 20 September in 2015 and harvested on 18 July 2016 for the 2016 growing season, whereas the SHWs were planted on 15 September in 2016 and harvested on 21 July 2017 for the 2017 growing season.
3.3. Trait Measurements
The GY was obtained by harvesting four middle rows of 1.008 m2 (i.e., 84 m × 120 m) and reported in g⋅m−2. The HI, BMWT, TKW, GVWT, AWNLN, FLLN, FLW, and FLA (0.8 × FLLN × FLW) were measured using previously reported protocols [18,32,59,61]. The STMDIA was measured from five randomly selected plants per plot using a digital Vernier caliper at the second internode from the soil surface at physiological maturity. The RTLN was measured from randomly selected three plants per plot after 3–4 days of flowering (Zadoks 60 growth stage) using WinRhizo software (WinRhizo reg. 2009c, Regent Instruments Inc., Quebec City, QC, Canada).
3.4. Phenotypic Data Analysis
A combined ANOVA was performed using the following model:
where yijklm is the GY and yield-related trait; μ is overall mean; Yri is the effect of ith year; R(Yr)ji is the effect of jth replication within the ith year; B(R(Yr))kji is the effect of kth incomplete block within jth replication of ith environment; Cl is the lth checks; Gmkji is the effect of mth genotypes (new variable, where check is coded as 0 and entry is coded is 1 and the genotype is taken as a new variable × entry) within the kth incomplete block of jth replication in the ith year; GXYrmi is the interaction effect of mth genotype and ith year; and eijklmn is the residual. In the combined ANOVA, year and check were assumed as fixed effects, whereas genotype, genotype × year interaction, replication nested within a year, and incomplete block nested within replications were assumed as random effects.Individual analyses of variance were performed because most of the traits had highly significant genotype × year interaction and therefore will be discussed hereafter. An augmented design was analyzed using the following model for the estimation of BLUPs in the year 2016:
where yijkl is the trait, μ is the overall mean; Bi is the effect of ith incomplete block; Cj is the jth check; Gk(i) (new variable, where check is coded as 0 and entry is coded as 1 and genotype is taken as a new variable × entry) is the effect of kth genotypes within the ith block; eijkl is the residual. In ANOVA calculated for the 2016 datasets, the check was assumed as a fixed effect, whereas genotype and incomplete block were assumed as random effects.An alpha (α) lattice design with two replications was analyzed using the following model for the estimation of BLUPs in the year 2017:
where yijk is the trait, μ is the overall mean; Ri is the effect of ith replication; B(R)ji is the effect of jth block within the ith replication; Ck is the kth checks; Glji (new variable, where check is coded as 0 and entry is coded as 1 and the genotype is taken as a new variable × entry) is the effect of kth genotypes within jth incomplete block of ith replication; eijkl is the residual. In ANOVA calculated for the 2017 datasets, the check was assumed as a fixed effect, whereas genotype, replication, and incomplete block nested within replication were assumed as random effects.All phenotypic data were analyzed using PROC MIXED in SAS 9.4 (SAS Institute Inc., Cary, NC) [84] using the restricted maximum likelihood (REML) approach unless mentioned otherwise.Broad-sense heritability for each trait in each year was calculated based on entry mean basis using Equations (4–6) for 2016, 2017, and combine0d experiments, respectively:
where σ2g, σ2yr, σ2gxyr, and σ2e are the variance components for genotype, year, genotype × year, and error, respectively, and n and r are the numbers of years and replications, respectively.Pearson’s correlation of GY and yield-related traits was calculated based on BLUPs for each trait in each year using PROC CORR in SAS. The PC biplot analysis (PCA-biplot) was performed based on the correlation matrix to avoid any variation due to the different scales of the measured variables using ‘factoextra’ package in R software [85].
3.5. Genotyping and SNP Discovery
Genomic DNA was extracted from two to three fresh young leaves of 14-day-old seedlings using BioSprint 96 Plant Kits (Qiagen, Hombrechtikon, Switzerland), as described in Bhatta et al. [11]. The GBS libraries were constructed in 96-plex following digestion with two restriction enzymes, PstI and MspI [86] and pooled libraries were sequenced using Illumina, Inc. (San Diego, CA, USA) next-generation sequencing platforms at the Wheat Genetics Resource Center at Kansas State University (Manhattan, KS, USA). SNP calling was performed using TASSEL v. 5.2.40 GBS v2 Pipeline (available online: https://bitbucket.org/tasseladmin/tassel-5-source/wiki/Tassel5GBSv2Pipeline) [87] with physical alignment to the Chinese Spring genome sequence (RefSeq v1.0) made available by the IWGSC [88] using default settings with the one exception that the number of times for a GBS tag to be present and included for SNP calling was changed from the default value of 1 to 5 to increase the stringency in SNP calling. The identified SNPs with MAF less than 5% and missing data more than 20% were removed from the analysis. All lines had missing data less than 20% and none of them were dropped due to missing percentage-filtering criterion. The GBS-derived SNPs are provided in Table S3 in Supplementary Materials.
3.6. Population Structure and Genome-Wide Association Study Analysis
Population structure of 123 genotypes was assessed using the Bayesian clustering algorithm in the program STRUCTURE v 2.3.4 (available online: https://web.stanford.edu/group/pritchardlab/structure_software/release_versions/v2.3.4/html/structure.html) [89] and principal component (PC) analysis using TASSEL (available online: http://www.maizegenetics.net/tassel) [90], as described in Bhatta et al. [11].Many GWASs were previously performed using the mixed linear model (MLM), where the population structure (Q) or PC was set as a fixed effect and kinship (K) as a random effect to control false positives [91,92]. However, the MLM may lead to confounding between population structure, kinship, and quantitative trait nucleotides (QTNs) that results in false negatives due to model overfitting [78]. Recently, the multilocus mixed model (MLMM), which tests multiple markers simultaneously by fitting pseudo QTNs, in addition to testing markers in stepwise MLM, has been proposed, which is advantageous over conventional GLM and MLM testing one marker at a time [78]. One of the examples of recently popular GWAS analysis algorithm that is based on MLMM is FarmCPU [78,79]. The FarmCPU uses a fixed effect model (FEM) and a random effect model (REM) iteratively to remove the confounding between testing markers and kinship that results in false negatives, prevents model overfitting, and control false positives simultaneously [78]. Therefore, GWAS was performed on the adjusted BLUPs for each trait in each year to identify SNPs associated with GY and yield-related traits in SHWs using FarmCPU with population structures (Q1 and Q2) or first three principal components (PC1, PC2, and PC3) as covariates by looking at the model fit using Quantile-Quantile (Q-Q) plots and FarmCPU-calculated kinship [78] implemented in MVP R software package (available online: https://github.com/XiaoleiLiuBio/MVP). A uniform suggestive genome-wide significance threshold level of p-value = 9.99 × 10-5 (−log10p = 4.00) was selected for MTAs considering the deviation of the observed test statistics values from the expected test statistics values in the Q-Q plots [28,80] from the two-year results of the present study.
3.7. Haplotype Block Analysis
Haplotype blocks with linkage disequilibrium (LD) values (squared correlation coefficient between locus allele frequency; R2 > 0.2) in adjacent regions (<500 kb) of significant MTAs were visualized and plotted using default parameters (Hardy–Weinberg p-value cut off at 1% and MAF > 0.001) of Haploview software (available online: https://www.broadinstitute.org/haploview/haploview) [81]. PVE by each haplotype block on the trait of interest was calculated using multiple regression analysis that accounted for the population structure by removing the haplotype allele of less than 5% in SAS using PROC REG.
3.8. Putative Candidate Gene Analysis
The genes underlying the MTAs and subsequently their annotations were retrieved using a Perl script and the IWGSC RefSeq v1.0 annotations [88] provided for the Chinese Spring. The underlying genes were further examined for their association with GY and yield-related traits under drought stress using previously published literature. Additionally, the SnpEff program (available online: http://snpeff.sourceforge.net/) was used for SNP annotation and predicting the effects of SNPs on the protein function. The MTAs present within genes or flanked (5 kb) by genes were investigated [93].
4. Conclusions
The present study showed SHWs have large amounts of genetic variation for GY and yield-related traits. The GWAS in 123 SHWs using 35,648 SNPs identified several novel (90 MTAs: 45 MTAs on the A genome, 11 on the B genome, and 34 on the D-genome) genomic regions or haplotype blocks associated with GY and yield-related traits in drought-stressed conditions. Most of the MTAs (120 MTAs) were present in genes ad several of them (45 MTAs) were annotated with functions related to drought stress. This provided further evidence for the reliability of the MTAs identified. We also identified MTAs on different chromosomes associated with multiple traits but within genes having the same annotation. This resulted in the identification of candidate genes belonging to the same gene family that likely have a major role in affecting GY and yield-related traits under drought stress in SHWs. The large number of MTAs, especially on the D-genome (53 MTAs with 34 MTAs being novel) identified in the present study, demonstrate the potential of SHWs for elucidating the genetic architecture of complex traits and provide an opportunity for further improvement of wheat under rapidly growing drought-stressed environment worldwide.
Authors: Peter J Bradbury; Zhiwu Zhang; Dallas E Kroon; Terry M Casstevens; Yogesh Ramdoss; Edward S Buckler Journal: Bioinformatics Date: 2007-06-22 Impact factor: 6.937
Authors: R Suzuky Pinto; Matthew P Reynolds; Ky L Mathews; C Lynne McIntyre; Juan-Jose Olivares-Villegas; Scott C Chapman Journal: Theor Appl Genet Date: 2010-06-04 Impact factor: 5.699
Authors: Jeffrey C Glaubitz; Terry M Casstevens; Fei Lu; James Harriman; Robert J Elshire; Qi Sun; Edward S Buckler Journal: PLoS One Date: 2014-02-28 Impact factor: 3.240