Linkage disequilibrium, SNP frequency change due to selection, and association mapping in popcorn chromosome regions containing QTLs for quality traits.

Geísa Pinheiro Paes1, José Marcelo Soriano Viana1, Fabyano Fonseca E Silva2, Gabriel Borges Mundim1.   

Abstract

The objectives of this study were to assess linkage disequilibrium (LD) and selection-induced changes in single nucleotide polymorphism (SNP) frequency, and to perform association mapping in popcorn chromosome regions containing quantitative trait loci (QTLs) for quality traits. Seven tropical and two temperate popcorn populations were genotyped for 96 SNPs chosen in chromosome regions containing QTLs for quality traits. The populations were phenotyped for expansion volume, 100-kernel weight, kernel sphericity, and kernel density. The LD statistics were the difference between the observed and expected haplotype frequencies (D), the proportion of D relative to the expected maximum value in the population, and the square of the correlation between the values of alleles at two loci. Association mapping was based on least squares and Bayesian approaches. In the tropical populations, D-values greater than 0.10 were observed for SNPs separated by 100-150 Mb, while most of the D-values in the temperate populations were less than 0.05. Selection for expansion volume indirectly led to increase in LD values, population differentiation, and significant changes in SNP frequency. Some associations were observed for expansion volume and the other quality traits. The candidate genes are involved with starch, storage protein, lipid, and cell wall polysaccharides synthesis.

Year:  2016        PMID: 27007903      PMCID: PMC4807383          DOI: 10.1590/1678-4685-GMB-2015-0126

Source DB:  PubMed          Journal:  Genet Mol Biol        ISSN: 1415-4757            Impact factor:   1.771


Introduction

Linkage disequilibrium (LD) or gametic phase disequilibrium is the difference between haplotype frequency products (P(AB).P(ab) – P(Ab).P(aB)) (Kempthorne, 1957). Because this difference corresponds to the covariance between values of alleles at two loci (Weir, 2008), LD is commonly defined as the non-random association of alleles at different loci. LD between molecular markers and genes, the basis of quantitative trait locus (QTL) mapping, association mapping, and genomic selection, is due to or affected by selection, mutation, population admixture, genetic drift, outcrossing, inbreeding, and recombination (Gupta). With respect to biallelic markers, the most common statistics to measure LD in a population are the difference between the observed and expected (under linkage equilibrium) haplotype frequencies (D), the proportion of D relative to the expected maximum value in the population (D'), and the square of the correlation between the values of alleles at two loci (r2) (Flint-Garcia). Association mapping refers not only to the identification of QTLs, but also to the identification of candidate genes based on statistically significant associations between markers and phenotype. Its main advantages relative to QTL mapping are the use of breeding populations instead of populations derived by crossing two inbred or pure lines and more precise identification of candidate genes (Flint-Garcia). However, association mapping is only capable of identifying effects of alleles present at reasonably high frequency in a population. In addition, the efficiency of association mapping is significantly influenced by relatedness and population structure, which can generate spurious associations, that is, associations between an unlinked marker and a QTL (Weir, 2010). The association mapping methodologies are the candidate gene approach and the genome-wide association study (GWAS) (Rafalski, 2010).
Both methods have been successfully used to determine the genetic basis of important complex traits and to identify some of the key genes. In maize (Zea mays L.), LD analyses and association studies have been performed using inbred line panels. The LD analysis performed by Van Inghelandt was based on 1,537 inbreds genotyped for 359 simple sequence repeat (SSR) loci and 8,244 single nucleotide polymorphisms (SNPs). Considering only linked markers, LD under low (SSR) and high (SNP) marker densities was comparable for Flint and Lancaster heterotic pools. For Stiff Stalk Synthetic (SSS) and Iodent heterotic pools, the average LD based on SNPs was 45 to 52% greater than that based on SSR markers. Truntzler assessed LD in a panel of 314 dent inbreds genotyped for 979 SNPs. They observed an r2 value of 0.20 for SNPs at a spacing of 200 kb. Based on a panel of 240 inbreds genotyped for 29,619 SNPs, Thirunavukkarasu estimated r2 values ranging from 0.21 to 0.25. LD blocks were observed on all chromosomes, with the LD decay occurring over regions of 200-300 kb. Association mapping in maize has been effective for identifying candidate genes for complex traits such as pathogen resistance, root development, drought tolerance, chilling tolerance, oil biosynthesis, plant architecture, kernel composition, flowering, and metabolic processes. Using simulated and field data from five plant species including maize, Stich and Melchinger (2009) and Yang compared association mapping methods. They concluded that a mixed-model approach using a kinship matrix to correct for relatedness was the best method. This approach outperformed a model controlling for relatedness and population structure because the spurious associations could not be completely controlled by population structure. Thirunavukkarasu assessed 240 inbreds under water-stressed and well-watered environments.
They measured anthesis-to-silking interval, grain yield, 100-kernel weight, and four ear traits, and carried out association mapping based on 29,619 high-quality SNPs. Fifty and 70 SNPs were strongly associated with tolerance to water stress under stressed and well-watered environments, respectively. Significant SNPs were identified mainly on chromosomes 5 and 3 under the water-stressed environment and on chromosomes 10, 1, and 7 under well-watered conditions. Thirty-one of the SNPs detected under water-stressed conditions were situated near drought-tolerance genes. To our knowledge, little information is available on LD and SNP frequency changes due to selection in maize and special maize breeding populations, nor have QTLs been identified in such populations by association mapping. Thus, our objectives were to assess LD and SNP frequency changes due to selection, and to perform association mapping in popcorn (Zea mays L. ssp. everta (Sturtev.) Zhuk.) chromosome regions containing QTLs for quality traits.

Materials and Methods

Populations

The populations employed in this study were Viçosa, Viçosa cycles 1 (c1) and 4 (c4) (obtained from Viçosa after one and four half-sib selection cycles, respectively), Viçosa cycle 2 (c2) fsf (derived from Viçosa after two full-sib selection cycles), Viçosa S4 (generated from four inbred progeny selection cycles applied in the Viçosa population), Beija-Flor c1 and Beija-Flor c4 (obtained from Beija-Flor c1 after three half-sib selection cycles), and UFV MP-1 and UFV MP-2 (derived from hybrids P622 and P625, respectively, developed by the Agricultural Alumni Seed Improvement Association, Romney, IN, USA). The first seven populations, representing tropical germplasm, were cultivated during the 2012-2013 growing season in an experimental field at the Federal University of Viçosa (UFV), Minas Gerais, Brazil. The populations UFV MP-1 and UFV MP-2, representing temperate germplasm, were cultivated in 20-L pots in a greenhouse at UFV, in 2014. Leaf samples of 100-150 young plants were collected from each population for genotyping. The populations derived from Viçosa and Beija-Flor c1 were obtained by progeny and plant-within-progeny selection for expansion volume. In trials of non-inbred progeny, 196 progeny were assessed using a 14 x 14 lattice design with two replications, at the UFV experimental station located in Coimbra, Minas Gerais. The 20 superior half-sib families were recombined using one male row to four female rows. In the recombination plots with the 20 superior full-sib families, at least one female per family was crossed with a male from another progeny, providing 380 full-sib families. In the half-sib progeny recombination plots, 196 plants were selected, providing the half-sib families for the next cycle. The 196 full-sib families for the next cycle were selected based on the expansion volume of the female parent. 
The inbred progeny were assessed in an experimental field at UFV using an incomplete block design with replications only for the controls (commercial hybrids and populations). Each incomplete block consisted of 10 progeny and the controls. The trials included 344 S1 progeny, 309 S2 progeny, 277 S3 progeny, and 268 S4 progeny. In each progeny, three to five plants were selfed. The progeny for the next cycle were obtained by selecting the best families and then the superior selfed plants. Populations Viçosa S1 to Viçosa S4 were obtained by recombining all assessed inbred progeny. The progeny tests and recombination plots were conducted during the 1998-1999 to 2007-2008 growing seasons. To assess expansion volume, we used a hot air popcorn popper (1,200 W) or a 27-L microwave oven (900 W), and samples of 30 g per plot and 10 g per plant.

Genotyping

DNA was extracted using the Wizard Genomic DNA Purification kit according to the manufacturer's protocol with modifications. A Qubit 2.0 fluorometer (Life Technologies, Carlsbad, CA, USA) and a NanoVue spectrophotometer (GE Healthcare BioSciences Corporation, Piscataway, NJ, USA) were used to assess DNA quantity and purity level, respectively. Individuals were genotyped from 50 ng/μL DNA samples using GoldenGate assays (Illumina, San Diego, CA, USA). Genotyping was performed on an Illumina BeadXpress. Individuals were genotyped for 96 SNPs located in chromosome regions containing QTLs for the following popcorn quality traits: expansion volume, flake volume, unpopped kernel number, and flake size (Table 1). The SNPs were selected from the maize 56-kb SNP50 array (56,110 SNPs from ~19,000 genes) on the basis of locations of the SSR primers flanking the QTLs mapped by Li (2007, 2008, 2009), Babu, and Lu, and by using information in the Maize Genetics and Genomics (MaizeGDB) and National Center for Biotechnology Information (NCBI) databases. Two SNPs did not map to any assembly. The number of genotyped plants ranged from 38 to 113. Genotypes were assigned using Illumina GenomeStudio (version 2011.1), with the GC score specified as 0.25. The average distance between adjacent SNPs was 9.1 Mb, and within bins, 464 kb.
Table 1

Name and location of the true and simulated SNPs.

Name | Chr. | Position (bp) | Bin | Simulated SNP | Chr. | Position (bp)
PUT-163a-91054912-4739 | - | - | - | - | - | -
PUT-163a-16922676-1070 | - | - | - | - | - | -
SYN6001 | 1 | 2208377 | 1.01 | 1 | 1 | 4464052
SYN27251 | 1 | 2432669 | 1.01 | 2 | 1 | 4918980
PUT-163a-5499487-2275 | 1 | 2536526 | 1.01 | 3 | 1 | 6747184
SYN38927 | 1 | 2691547 | 1.01 | 4 | 1 | 6822691
SYN6413 | 1 | 7961412 | 1.01 | 5 | 1 | 7271821
PZE-101014003 | 1 | 7993526 | 1.01 | 6 | 1 | 8098952
PZE-101014266 | 1 | 8095694 | 1.01 | 7 | 1 | 8662055
SYN11901 | 1 | 8510819 | 1.01 | 8 | 1 | 9201172
SYN11909 | 1 | 8512237 | 1.01 | 9 | 1 | 10247073
PZA-000175002 | 1 | 8553473 | 1.01 | 10 | 1 | 12041260
SYN20196 | 1 | 15512478 | 1.02 | 11 | 1 | 14985200
PUT-163a-16922676-1073 | 1 | 46070067 | 1.03 | 12 | 1 | 44959476
SYN11221 | 1 | 66980686 | 1.04 | 13 | 1 | 67180473
SYN11222 | 1 | 66981108 | 1.04 | 14 | 1 | 68864563
SYN17701 | 1 | 67062198 | 1.04 | 15 | 1 | 68909790
PZE-101083826 | 1 | 72067065 | 1.04 | 16 | 1 | 69218117
SYN38509 | 1 | 72105795 | 1.04 | 17 | 1 | 70928139
SYN38510 | 1 | 72105859 | 1.04 | 18 | 1 | 72494781
PZE-101120556 | 1 | 148525047 | 1.05 | 19 | 1 | 146847931
PZE-101120639 | 1 | 148566496 | 1.05 | 20 | 1 | 148129135
PZE-101120645 | 1 | 148566793 | 1.05 | 21 | 1 | 149660797
PZE-101131103 | 1 | 168256682 | 1.05 | 22 | 1 | 167602997
PZE-101131114 | 1 | 168257319 | 1.05 | 23 | 1 | 167714661
PZE-101131166 | 1 | 168429768 | 1.05 | 24 | 1 | 168981018
PZE-101162300 | 1 | 205544018 | 1.07 | 25 | 1 | 204597382
SYN422 | 1 | 236291257 | 1.08 | 26 | 1 | 234385544
SYN423 | 1 | 236296165 | 1.08 | 27 | 1 | 236155502
ZM011097-0676 | 1 | 236296417 | 1.08 | 28 | 1 | 236592209
SYNGENTA11666 | 2 | 205583015 | 2.08 | 29 | 2 | 0
PZE-102158721 | 2 | 206039441 | 2.08 | 30 | 2 | 855530
PUT-163a-148951348-515 | 3 | 5789819 | 3.02 | 31 | 3 | 5524536
SYNGENTA17024 | 3 | 5854017 | 3.02 | 32 | 3 | 5754583
PZE-103010658 | 3 | 5854416 | 3.02 | 33 | 3 | 7669980
SYN33444 | 3 | 15137742 | 3.04 | 34 | 3 | 14800309
SYN33443 | 3 | 15222626 | 3.04 | 35 | 3 | 15194865
SYN33442 | 3 | 15222864 | 3.04 | 36 | 3 | 16334091
PZE-103160210 | 3 | 211405876 | 3.08 | 37 | 3 | 209104813
PZE-103160218 | 3 | 211408703 | 3.08 | 38 | 3 | 210855530
PZE-103160227 | 3 | 211410592 | 3.08 | 39 | 3 | 211377457
SYN33394 | 3 | 215456783 | 3.08 | 40 | 3 | 213475311
PZE-103165953 | 3 | 215462316 | 3.08 | 41 | 3 | 215009460
PUT-163a-149100944-925 | 3 | 215513343 | 3.08 | 42 | 3 | 216145401
PZE-104008299 | 4 | 5595386 | 4.02 | 43 | 4 | 0
PZE-104033459 | 4 | 41873008 | 4.05 | 44 | 4 | 34923515
PZE-104033791 | 4 | 42284265 | 4.05 | 45 | 4 | 36303978
PZE-104033817 | 4 | 42288134 | 4.05 | 46 | 4 | 38019829
PZE-104033826 | 4 | 42306869 | 4.05 | 47 | 4 | 39186188
SYN22745 | 4 | 154716125 | 4.06 | 48 | 4 | 148089966
PZE-104080384 | 4 | 154716758 | 4.06 | 49 | 4 | 148380829
SYN509 | 5 | 4219552 | 5.01 | 50 | 5 | 0
SYN524 | 5 | 4254273 | 5.01 | 51 | 5 | 1575760
SYN526 | 5 | 4254700 | 5.01 | 52 | 5 | 1580048
PZE-105018859 | 5 | 8560599 | 5.02 | 53 | 5 | 4536728
SYN4651 | 6 | 147905068 | 6.05 | 54 | 6 | 146819366
SYN4646 | 6 | 147909387 | 6.05 | 55 | 6 | 148468338
SYN4642 | 6 | 147909507 | 6.05 | 56 | 6 | 148928894
PZE-106100715 | 6 | 153502018 | 6.05 | 57 | 6 | 151316528
PZE-106100720 | 6 | 153502756 | 6.05 | 58 | 6 | 152790375
PZE-106100728 | 6 | 153508407 | 6.05 | 59 | 6 | 153651642
PZE-106115889 | 6 | 161854167 | 6.07 | 60 | 6 | 159940903
PZE-106116148 | 6 | 161924948 | 6.07 | 61 | 6 | 161828934
PZE-106116156 | 6 | 161990361 | 6.07 | 62 | 6 | 162886520
SYN12692 | 6 | 164180276 | 6.07 | 63 | 6 | 164783844
PZA02688.2 | 6 | 164183687 | 6.07 | 64 | 6 | 166620102
SYN12698 | 6 | 164186726 | 6.07 | 65 | 6 | 167632980
PZE-107083430 | 7 | 125723001 | 7.02 | 66 | 7 | 0
PZE-107083429 | 7 | 125723685 | 7.02 | 67 | 7 | 1462112
PZE-107105783 | 7 | 157912189 | 7.04 | 68 | 7 | 14959671
SYN36108 | 7 | 157914433 | 7.04 | 69 | 7 | 16157837
PZE-107105855 | 7 | 157934803 | 7.04 | 70 | 7 | 17505219
PZE-108004863 | 8 | 4967925 | 8.01 | 71 | 8 | 0
PZE-108004875 | 8 | 4969812 | 8.01 | 72 | 8 | 1878784
PZE-108004908 | 8 | 5025189 | 8.01 | 73 | 8 | 2059971
PZE-108052599 | 8 | 92730672 | 8.03 | 74 | 8 | 86063553
PZE-108052600 | 8 | 92730702 | 8.03 | 75 | 8 | 87675003
PZE-108052603 | 8 | 92732462 | 8.03 | 76 | 8 | 87804291
PZE-108134983 | 8 | 173931575 | 8.09 | 77 | 8 | 166943466
SYN20808 | 8 | 173953651 | 8.09 | 78 | 8 | 167344955
SYN20806 | 8 | 173973257 | 8.09 | 79 | 8 | 168464798
PZE-108135203 | 8 | 174257043 | 8.09 | 80 | 8 | 170052048
SYN17231 | 10 | 4990421 | 10.01 | 81 | 9 | 1098717
SYN17233 | 10 | 4992947 | 10.01 | 82 | 9 | 2942947
PZE-110006423 | 10 | 4993061 | 10.01 | 83 | 9 | 4581985
PZE-110007091 | 10 | 5485252 | 10.02 | 84 | 9 | 5060295
PZE-110007194 | 10 | 5537112 | 10.02 | 85 | 9 | 6864235
SYN16757 | 10 | 5592617 | 10.02 | 86 | 9 | 7174065
PZE-110012640 | 10 | 11088851 | 10.02 | 87 | 9 | 9114563
PZE-110012671 | 10 | 11184622 | 10.02 | 88 | 9 | 10769264
PZE-110012682 | 10 | 11195007 | 10.02 | 89 | 9 | 12219177
PZE-110045755 | 10 | 86439624 | 10.03 | 90 | 9 | 85800064
SYN37480 | 10 | 86528026 | 10.03 | 91 | 9 | 87218643
SYN16982 | 10 | 114875299 | 10.04 | 92 | 9 | 113242081
SYN16979 | 10 | 114876217 | 10.04 | 93 | 9 | 114440567
PZE-110060686 | 10 | 114892724 | 10.04 | 94 | 9 | 116157425

Phenotyping

Expansion volume was assessed in a 27-L microwave oven (900 W) using samples of 10 or 30 g per plant. To provide an estimate of error variance for expansion volume, two measurements were obtained for most plants in the temperate populations. Hundred-kernel weight was measured with an electronic scale. Average kernel sphericity was calculated as the ratio of geometric mean diameter (cube root of the product of length, width, and depth) to kernel length (as measured with a digital caliper [0.005-mm precision]) of 10 randomly selected kernels per plant (Tian). To determine kernel density, 50 kernels were weighed and placed in a 100-mL beaker (1.0-mL precision) containing 50 mL of 90% aqueous ethanol. Kernel volume was obtained by subtracting 50 mL from the final volume (Vyn and Tollenaar, 1998). The number of phenotyped plants ranged from 43 to 108.
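The two kernel calculations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: the function names, argument names, and example measurements are hypothetical, and the only formulas used are those given in the text (geometric mean diameter divided by kernel length, and sample mass divided by the volume displaced from the initial 50 mL of ethanol).

```python
from statistics import mean


def kernel_sphericity(length_mm, width_mm, depth_mm):
    """Sphericity of one kernel: geometric mean diameter / kernel length."""
    geometric_mean_diameter = (length_mm * width_mm * depth_mm) ** (1.0 / 3.0)
    return geometric_mean_diameter / length_mm


def average_sphericity(kernels):
    """Average sphericity over (length, width, depth) tuples for one plant."""
    return mean(kernel_sphericity(*dims) for dims in kernels)


def kernel_density(sample_mass_g, final_volume_ml, initial_volume_ml=50.0):
    """Density (g/mL): mass of the kernel sample / volume it displaces."""
    kernel_volume_ml = final_volume_ml - initial_volume_ml
    if kernel_volume_ml <= 0:
        raise ValueError("final volume must exceed the initial ethanol volume")
    return sample_mass_g / kernel_volume_ml


# Hypothetical measurements, for illustration only (not data from the study).
example_kernels = [(8.0, 6.0, 4.5), (7.5, 6.2, 4.8)]
print(round(average_sphericity(example_kernels), 3))
print(round(kernel_density(9.1, 57.0), 3))
```

With a 1.0-mL beaker graduation and a displaced volume of only a few millilitres, the volume reading is likely to dominate the uncertainty of the density estimate.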

Data simulation

Because no reference was available for interpreting the LD analysis results for the popcorn populations, we also analyzed two simulated populations. Simulated population 1 (Pop1) was a second-generation composite obtained by crossing two populations in linkage equilibrium. This population was in LD only for linked markers and/or genes. The second simulated population (Pop2) was obtained from Pop1 after 10 cycles of random crosses assuming sample sizes of 100 and 300. The effective population sizes were 200 and 600, respectively. The program used for simulating genotypes and phenotypes, REALbreeding, was developed by the second author using REALbasic software (Viana). In the simulation process, we tried to reproduce the same distribution of SNPs observed in the popcorn populations. We simulated 1,170 SNPs on nine chromosomes, of which 94 were selected and analyzed (Table 1). The average distance between adjacent SNPs was 9.1 Mb. Nineteen QTLs (candidate genes) and 81 minor genes affecting the expansion volume trait were randomly distributed along the nine chromosomes. Based on user input, which included minimum and maximum genotypic values for homozygotes, degree of dominance (d/a), direction of dominance, and broad-sense heritability, the REALbreeding program provided the phenotypic values of each genotyped individual. The phenotypic values were computed from the true population mean, additive and dominance values, and error effects sampled from a normal distribution. The error variance was computed from the broad-sense heritability. The minimum and maximum genotypic values of homozygotes were 5 and 50 mL/g. We also defined bidirectional dominance (-1.2 ≤ (d/a)i ≤ 1.2) and used a heritability of 50%. The proportion of the phenotypic variance explained by each QTL was set to 2.4%.
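The step from genotypic values to phenotypic values can be illustrated with a short sketch. It is not the REALbreeding code, which is not shown in the source. It assumes only the standard definition of broad-sense heritability, H2 = Vg / (Vg + Ve), so that Ve = Vg (1 - H2) / H2, and it treats the genotypic value as already containing the population mean and the additive and dominance values. All function names and the example values are hypothetical.

```python
import random
from statistics import pvariance


def error_variance(genotypic_variance, broad_sense_heritability):
    """Solve H2 = Vg / (Vg + Ve) for the error variance Ve."""
    if not 0.0 < broad_sense_heritability <= 1.0:
        raise ValueError("heritability must be in (0, 1]")
    return (genotypic_variance * (1.0 - broad_sense_heritability)
            / broad_sense_heritability)


def simulate_phenotypes(genotypic_values, broad_sense_heritability, seed=None):
    """Add normally distributed error effects to each genotypic value."""
    rng = random.Random(seed)
    var_e = error_variance(pvariance(genotypic_values), broad_sense_heritability)
    sd_e = var_e ** 0.5
    return [g + rng.gauss(0.0, sd_e) for g in genotypic_values]


# Hypothetical genotypic values for expansion volume (mL/g), illustration only.
genotypic_values = [28.0, 31.5, 35.0, 33.2, 30.1, 36.4]
phenotypes = simulate_phenotypes(genotypic_values, 0.5, seed=1)
print([round(p, 1) for p in phenotypes])
```

With a heritability of 50%, as used in the simulation, the error variance equals the genotypic variance.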

Statistical analyses

Missing genotypes were imputed with Beagle 3.3.2 (Browning and Browning, 2007). PowerMarker 3.25 (Liu and Muse, 2005) was used to compute SNP frequencies, gene diversity (expected heterozygosity), and LD statistics and to perform Hardy-Weinberg equilibrium tests and association mapping based on analysis of variance (equivalent to a least-squares regression analysis). The fixation index (FST) was computed using GenAlEx 6.5 (Peakall and Smouse, 2006). For the population structure analysis, we used the Structure software (Falush). SAS (SAS Institute, 2007) was used to compare population means and to compute phenotypic correlations. We used the R packages MCMCpack (Martin) and boa (Smith, 2007) for a Bayesian GWAS. A SNP was considered to be non-polymorphic when the minor allele frequency (maf) was less than 1%. Only SNPs in Hardy-Weinberg equilibrium, as assessed using a chi-square test at the 5% significance level, were used for the LD analysis. The LD measures were D, D', and r2. The significance of a SNP frequency change was based on Waples (1989) assuming a 0.05% level of significance. For the population structure analysis, the burn-in period and the number of Markov chain Monte Carlo (MCMC) replications consisted of 5,000 and 25,000 iterations, respectively, and the number of assumed populations (K) was varied from 2 to 10. We ran the analysis under the no-admixture model with correlated frequencies. The most probable K value was determined based on the inferred plateau method (Viana). The least-squares association mapping used a Benjamini-Hochberg false discovery rate (FDR) of 5% (Benjamini and Hochberg, 1995). For the Bayesian GWAS, the burn-in period, number of MCMC replications, and sampling interval were 50,000, 100,000, and five, respectively. Significant SNP effects were identified using 95% highest posterior density (HPD) intervals.
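As an illustration of the three LD measures, the sketch below computes D, D', and r2 for one pair of biallelic loci from the four haplotype frequencies, using the definitions given in the Introduction. It assumes the haplotype frequencies are already known. With unphased genotype data they have to be estimated first, and how PowerMarker does this is not described here. The function and variable names are hypothetical.

```python
def ld_statistics(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, |D'|, r2) for two biallelic loci from haplotype frequencies."""
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-8:
        raise ValueError("haplotype frequencies must sum to 1")

    p_A = p_AB + p_Ab  # frequency of allele A at the first locus
    p_B = p_AB + p_aB  # frequency of allele B at the second locus

    # Difference between haplotype frequency products; equals p_AB - p_A * p_B.
    d = p_AB * p_ab - p_Ab * p_aB

    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # Squared correlation between the values of alleles at the two loci.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Hypothetical haplotype frequencies, for illustration only.
d, d_prime, r2 = ld_statistics(0.4, 0.1, 0.1, 0.4)
print(round(d, 4), round(d_prime, 4), round(r2, 4))
```

The sketch returns the signed D together with the absolute value of D', which corresponds to the way the averages are reported as absolute values in Table 2.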

Candidate gene analysis

SNP sequences in FASTA format were obtained from the NCBI Database of Single Nucleotide Polymorphisms and used to perform BLAST searches against the 'B73' RefGen_v2 reference genome at the MaizeGDB. Information on gene products, expression, and ontology (biological process, molecular function, and cellular component) was obtained using the MaizeCyc database, the Maize eFP browser, and the Gramene database. To identify candidate genes, we searched up to 1 Mb upstream and downstream of each SNP region.
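The 1-Mb search window can be pictured as a simple interval overlap query. The sketch below is a hypothetical illustration, not the database workflow used in the study: the `Gene` record, the function name, the gene identifiers, and the gene coordinates are all made up, and only the SNP position (chromosome 6, 147,905,068 bp, as listed for SYN4651) comes from the tables.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    gene_id: str
    chrom: int
    start: int
    end: int


def genes_near_snp(snp_chrom, snp_pos, genes, window_bp=1_000_000):
    """Return genes on the SNP's chromosome overlapping [pos - window, pos + window]."""
    lo = max(0, snp_pos - window_bp)
    hi = snp_pos + window_bp
    return [g for g in genes
            if g.chrom == snp_chrom and g.end >= lo and g.start <= hi]


# Hypothetical annotation records, for illustration only.
annotation = [
    Gene("gene_A", 6, 147_100_000, 147_103_000),  # about 0.8 Mb upstream
    Gene("gene_B", 6, 148_900_000, 148_910_000),  # starts just inside the window
    Gene("gene_C", 6, 150_000_000, 150_002_000),  # outside the window
    Gene("gene_D", 1, 147_900_000, 147_904_000),  # different chromosome
]
hits = genes_near_snp(6, 147_905_068, annotation)
print([g.gene_id for g in hits])
```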

Results

LD analysis

The percentage of polymorphic SNPs in the popcorn populations ranged from 56.0 in Beija-Flor c4 to 93.0 in Viçosa c1 (Table 2), but the number of SNPs in Hardy-Weinberg equilibrium was the factor that negatively affected the LD analysis. The percentage of polymorphic SNPs in Hardy-Weinberg equilibrium ranged from 18.5 in Beija-Flor c4 to 65.5 in Viçosa. Expected heterozygosity ranged from 0.29 in UFV MP-1 to 0.39 in Viçosa c2 fsf. The minimum and maximum average D and r2 values were observed in Beija-Flor c4 and Viçosa c2 fsf, respectively. The lowest and highest average D' values were observed in UFV MP-2 and Viçosa c2 fsf, respectively. In the simulated populations, the number of polymorphic SNPs agreed with the value expected at the 5% significance level (at least 91), the expected heterozygosity approached the maximum value, and 10 generations of random mating decreased LD values. Also as expected, average LD values for linked SNPs were greater than those for linked and unlinked SNPs. This decrease occurred only in 50% of the cases for the popcorn populations. In the tropical popcorn populations, D-values greater than 0.10 were observed for SNPs separated by 100-150 Mb. Most of the D-values relative to the temperate populations were less than 0.05 (Figure 1). For the simulated populations, SNPs separated by more than 50 Mb generally exhibited a D-value less than 0.05, and SNPs separated by less than 10 Mb generally showed a D-value greater than 0.10.
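The two filters that determine Np and Ne (minor allele frequency of at least 1%, then a chi-square test of Hardy-Weinberg equilibrium at the 5% level) can be sketched as follows. This is a textbook one-degree-of-freedom goodness-of-fit test written for illustration. The exact test implemented in PowerMarker may differ in detail, and the function names and genotype counts are hypothetical.

```python
from math import erfc, sqrt


def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """Minor allele frequency from the three genotype counts."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_AA + n_Aa) / (2 * n)
    return min(p, 1 - p)


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic and p-value (1 df) for Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotyped individuals")
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(chi2 / 2.0))
    return chi2, p_value


def keep_for_ld(n_AA, n_Aa, n_aa, maf_min=0.01, alpha=0.05):
    """True if the SNP is polymorphic and HWE is not rejected at level alpha."""
    if minor_allele_frequency(n_AA, n_Aa, n_aa) < maf_min:
        return False
    _, p_value = hwe_chi_square(n_AA, n_Aa, n_aa)
    return p_value >= alpha


# Hypothetical genotype counts, for illustration only.
print(keep_for_ld(25, 50, 25))   # genotype counts match HWE expectations
print(keep_for_ld(40, 20, 40))   # strong heterozygote deficit
```

A heterozygote deficit such as the second example is what inbreeding or selection among related progeny would tend to produce, which is consistent with the drop in Ne across the selected populations.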
Table 2

Population, number of genotyped individuals (Ng), number of polymorphic SNPs (Np), number of SNPs in Hardy-Weinberg equilibrium (Ne), average expected heterozygosity (He), and average absolute values of the LD measures by chromosome and for all SNPs1.

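Population | Ng | Np | Ne | He | D | D' | r2 | D¹ | D'¹ | r2¹
Viçosa | 99 | 87 | 57 | 0.3049 | 0.0491 | 0.8059 | 0.1949 | 0.0401 | 0.7742 | 0.1660
Viçosa c1 | 73 | 89 | 44 | 0.3063 | 0.0298 | 0.7455 | 0.1461 | 0.0282 | 0.7624 | 0.1507
Viçosa c4 | 112 | 79 | 28 | 0.3739 | 0.0540 | 0.9107 | 0.3190 | 0.0650 | 0.9325 | 0.4008
Viçosa c2 fsf | 113 | 76 | 24 | 0.3910 | 0.0741 | 0.9712 | 0.4128 | 0.0800 | 0.9842 | 0.4764
Viçosa S4 | 112 | 78 | 30 | 0.3833 | 0.0646 | 0.8014 | 0.3362 | 0.0666 | 0.7484 | 0.3552
Beija-Flor c1 | 107 | 82 | 31 | 0.3750 | 0.0520 | 0.8035 | 0.2593 | 0.0501 | 0.8134 | 0.2552
Beija-Flor c4 | 38 | 54 | 10 | 0.4300 | 0.0076 | 0.8108 | 0.0176 | 0.0061 | 0.7614 | 0.0207
UFV MP-1 | 95 | 66 | 37 | 0.2879 | 0.0268 | 0.7913 | 0.2005 | 0.0275 | 0.7732 | 0.1992
UFV MP-2 | 95 | 61 | 31 | 0.3141 | 0.0169 | 0.6975 | 0.1278 | 0.0165 | 0.6765 | 0.1312
Pop1 | 100 | 96 | 93 | 0.4869 | 0.0670 | 0.3392 | 0.1311 | 0.0272 | 0.1389 | 0.0296
Pop1 | 300 | 96 | 91 | 0.4858 | 0.0647 | 0.3233 | 0.1268 | 0.0195 | 0.0979 | 0.0232
Pop2 | 100 | 96 | 94 | 0.4792 | 0.0454 | 0.2450 | 0.0682 | 0.0252 | 0.1373 | 0.0214
Pop2 | 300 | 96 | 95 | 0.4774 | 0.0396 | 0.2107 | 0.0614 | 0.0174 | 0.0943 | 0.0140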
Figure 1

Relationship between the absolute D-value and distance (Mb) in the populations Viçosa (a), Viçosa c1 (b), Viçosa c4 (c), Viçosa c2 fsf (d), Viçosa S4 (e), Beija-Flor c1 (f), UFV MP-1(g), UFV MP-2 (h), Pop1, sample size 100 (i), and Pop2, sample size 100 (j).

Efficiency of selection

According to a t-test at the 5% significance level, the selection process used on non-inbred and inbred progeny caused, with one exception, an increase in the mean expansion volume of the base populations (Viçosa and Beija-Flor c1) and an indirect decrease in 100-kernel weight (Table 3). Compared with tropical populations, temperate populations had a greater expansion volume and a lower 100-kernel weight, with lower phenotypic variance for both traits. The tropical and temperate populations had equivalent kernel sphericities and densities. The simulated populations showed the same mean and phenotypic variance regardless of sample size. Estimates of phenotypic correlations for expansion volume and kernel traits included some significant values (p < 0.05, t-test), but were characterized by intermediate (0.4) to low (0.2) magnitudes, especially for tropical populations. The sign of the estimates also varied with the population.
Table 3

Population, number of phenotyped individuals (N), and minimum, average, maximum and variance for expansion volume (mL/g), 100-kernel weight (g), kernel sphericity, and kernel density (g/mL)

Population | N | Expansion volume: Min. | Av. | Max. | Var. | 100-kernel weight: Min. | Av. | Max. | Var. | Kernel sphericity: Min. | Av. | Max. | Var. | Kernel density: Min. | Av. | Max. | Var.
Viçosa | 93 | 9.0 | 26.9 | 50.0 | 48.33 | 15.5 | 21.3 | 28.8 | 9.14 | 0.60 | 0.75 | 0.93 | 0.003 | 0.89 | 1.33 | 1.63 | 0.012
Viçosa c1 | 43 | 18.0 | 30.9¹ | 47.0 | 53.76 | 9.9 | 18.4¹ | 25.1 | 12.61 | 0.65 | 0.79 | 0.96 | 0.007 | 0.68 | 1.29 | 1.62 | 0.040
Viçosa c4 | 108 | 15.0 | 31.0¹ | 50.0 | 51.87 | 11.4 | 18.7¹ | 28.7 | 9.80 | 0.62 | 0.75 | 0.92 | 0.003 | 0.92 | 1.32 | 1.95 | 0.023
Viçosa c2 fsf | 89 | 10.0 | 30.5¹ | 48.0 | 65.17 | 13.4 | 19.2¹ | 26.2 | 8.99 | 0.63 | 0.74 | 0.93 | 0.004 | 1.01 | 1.32 | 1.91 | 0.020
Viçosa S4 | 98 | 15.0 | 28.6 | 44.0 | 43.44 | 11.6 | 19.7¹ | 28.9 | 9.29 | 0.63 | 0.74 | 0.96 | 0.004 | 1.04 | 1.32 | 2.02 | 0.023
Beija-Flor c1 | 91 | 6.0 | 26.7 | 45.0 | 68.39 | 11.6 | 20.0 | 30.3 | 14.42 | 0.61 | 0.73 | 0.90 | 0.003 | 1.05 | 1.34 | 2.13 | 0.022
Beija-Flor c4 | 103 | 10.0 | 31.3² | 51.0 | 61.70 | 10.1 | 18.1² | 30.0 | 9.33 | 0.61 | 0.72 | 0.85 | 0.003 | 0.97 | 1.34 | 1.99 | 0.029
UFV MP-1 | 97 | 23.4 | 41.1 | 50.0 | 26.05 | 6.1 | 15.1 | 19.2 | 5.91 | 0.62 | 0.74 | 0.86 | 0.001 | 1.04 | 1.40 | 2.31 | 0.038
UFV MP-2 | 89 | 20.0 | 33.7 | 49.3 | 24.67 | 8.4 | 12.7 | 17.8 | 2.93 | 0.70 | 0.78 | 0.94 | 0.002 | 1.00 | 1.26 | 1.82 | 0.021
Pop1 | 100 | 22.9 | 33.4 | 43.2 | 16.84 | - | - | - | - | - | - | - | - | - | - | - | -
Pop1 | 300 | 22.9 | 33.7 | 44.8 | 17.02 | - | - | - | - | - | - | - | - | - | - | - | -
Pop2 | 100 | 24.5 | 33.6 | 43.1 | 17.23 | - | - | - | - | - | - | - | - | - | - | - | -
Pop2 | 300 | 23.6 | 33.6 | 45.0 | 15.50 | - | - | - | - | - | - | - | - | - | - | - | -

¹ Significant at the 5% level by the t-test in relation to Viçosa; ² significant at the 5% level by the t-test in relation to Beija-Flor c1.

SNP frequency change

Selection for expansion volume was accompanied by increases in LD values in the Viçosa population (Table 2), population differentiation, and significant (non-random) changes in SNP frequency (Table 4). Increases in average LD values occurred only after four cycles of half-sib and inbred progeny selection and after two cycles of full-sib selection. With respect to linked SNPs, increases in average D, D', and r2 ranged from approximately 10% to 51%, 13% to 20%, and 64% to 118%, respectively. The increments for linked and unlinked SNPs were even higher. The genetic differentiation was proportional to the number of cycles. Relative to the Viçosa population, FST ranged from 0.09 in Viçosa c1 to 0.16 in Viçosa S4. The highest FST estimates, ranging from 0.19 to 0.34, were observed between tropical and temperate populations. The lowest value (0.00) was evidence of no genetic differentiation between Beija-Flor c1 and Beija-Flor c4 populations. Interestingly, genetic differentiation between the improved Viçosa and Beija-Flor populations was negligible (less than 0.05).
Table 4

Number of SNPs with significant allele frequency change by the Waples's test at 0.05% (N), and minimum, average, and maximum of the absolute value of the significant allele frequency changes in relation to population Viçosa, Beija-Flor c1, or Pop1

Population | N | Minimum | Average | Maximum
Viçosa c1 | 23 | 0.1848 | 0.2748 | 0.5493
Viçosa c4 | 41 | 0.0657 | 0.3004 | 0.8254
Viçosa c2 fsf | 40 | 0.0657 | 0.2862 | 0.6537
Viçosa S4 | 35 | 0.1012 | 0.2999 | 0.7853
Beija-Flor c4 | 25 | 0.1635 | 0.2136 | 0.4299
Pop2¹ | 0 | 0.0000 | 0.0551 | 0.1600
Pop2² | 5 | 0.1017 | 0.1220 | 0.1433

¹ Sample size 100; ² sample size 300.

These findings are partially consistent with results from the population structure analysis. The inferred plateau method uncovered six subpopulations corresponding to UFV MP-1, UFV MP-2, Viçosa, three Viçosa-derived populations as a fourth subpopulation, Viçosa S4 and the Beija-Flor-derived populations as a fifth subpopulation, and a non-existent population (with individuals in the five previous subpopulations). Based on Waples' test, the number of SNPs with significant (p < 0.05%) allele frequency changes relative to the base populations (Viçosa and Beija-Flor c1) ranged from 23 in Viçosa c1 to 41 in Viçosa c4, proportional to the number of cycles. The average change in SNP frequency ranged from 0.21 in Beija-Flor c4 to 0.30 in Viçosa c4 and Viçosa S4, which was also proportional to the number of cycles. Unexpected significant changes in SNP frequencies in the simulated population Pop2, with a sample size of 300, ranged from 0.10 to 0.14 (average of 0.12). It should be noted that one to seven SNPs in almost all bins showed significant frequency changes.

Association mapping

Not a single significant association at a FDR of 5% was observed in the popcorn populations. Assuming a FDR of 10%, we found three associations for expansion volume, two associations for 100-kernel weight, and seven associations for kernel density in distinct populations (Table 5). The Bayesian GWAS uncovered no significant associations. With respect to the simulated populations, association mapping at 5% level of significance revealed 13 significant associations in Pop1 with a sample size of 300, five significant associations in Pop2 with a sample size of 300, no significant associations in Pop1 with 100 individuals, and, surprisingly, six significant associations in Pop2 with 100 individuals (Table 6). Most of the significant associations were uncovered by Bayesian GWAS. Analyses of both field and simulated data evidenced differences between least squares regression and Bayesian GWAS results, and between SNPs with significant associations. Only SNPs 30 and 87 showed an association in Pop2 at both sample sizes, identifying QTLs 5 and 18, respectively. These two QTLs were also identified from the analysis of Pop1 data with a sample size of 300, but the associations were with SNPs 31 and 84. Importantly, no false positives were apparent, and in 70% of the significant associations, the distance between the SNP and the candidate gene ranged from 121 to 11,867 kb (average of 4,117 kb).
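The difference between the 5% and 10% FDR thresholds follows directly from how the Benjamini-Hochberg step-up procedure works. The sketch below is a generic implementation for illustration. The p-values are hypothetical and are not taken from the study, and the function name is an assumption.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return the indices of the tests rejected at the given false discovery rate."""
    m = len(p_values)
    # Indices of the tests ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= k * fdr / m.
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            largest_rank = rank

    # Step-up rule: reject every test ranked at or below that k.
    return sorted(order[:largest_rank])


# Hypothetical p-values for ten SNP-trait tests, for illustration only.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.07, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(p_values, fdr=0.05))
print(benjamini_hochberg(p_values, fdr=0.10))
```

Because the rule is step-up, a test whose own p-value exceeds its rank threshold is still rejected when a higher-ranked test passes, which is why relaxing the FDR can add several associations at once.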
Table 5

Location of SNPs with significant association at a false discovery rate of 10% for expansion volume, 100-kernel weight, or kernel density, in popcorn populations, and the candidate genes.

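Trait | Population | Chr. | Position (bp) | SNP | Candidate gene
Expansion volume | Viçosa S4 | 1 | 168429768 | PZE-101131166 | GRMZM2G018472
Expansion volume | Viçosa S4 | 6 | 147905068 | SYN4651 | GRMZM2G058472
Expansion volume | Viçosa S4 | 8 | 173931575 | PZE-108134983 | GRMZM2G118462
100-kernel weight | Beija-Flor c4 | 1 | 168256682 | PZE-101131103 | GRMZM2G018472
100-kernel weight | Viçosa c4 | 7 | 125723685 | PZE-107083429 | GRMZM2G133613
Kernel density | Viçosa c4 | 1 | 2208377 | SYN6001 | GRMZM2G109725
Kernel density | Viçosa c4 | 3 | 15222626 | SYN33443 | GRMZM2G334628
Kernel density | Viçosa c4 | 4 | 41873008 | PZE-104033459 | GRMZM2G138060
Kernel density | Viçosa c4 | 8 | 173973257 | SYN20806 | GRMZM2G118462
Kernel density | UFV MP-1 | 1 | 8510819 | SYN11901 | GRMZM2G009014
Kernel density | UFV MP-1 | 10 | 11184622 | PZE-110012671 | GRMZM2G392513
Kernel density | UFV MP-1 | 10 | 114892724 | PZE-110060686 | GRMZM2G049681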
Table 6

Location of SNPs with significant association for expansion volume, based on a false discovery rate (FDR) of 5% or the 95% highest probability density (HPD) interval of the regression coefficients, and location of the closest QTL (candidate gene), in two simulated populations

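Pop. | Sample | Chr. | Position (bp) | SNP | QTL | FDR | HPD int.
Pop1 | 300 | 1 | 8662055 | 7 | - | ns | -2.89; -1.00
Pop1 | 300 | 1 | 12041260 | 10 | - | 0.0472 | ns
Pop1 | 300 | 1 | 69218117 | 17 | - | 0.0409 | ns
Pop1 | 300 | 1 | 70928139 | 18 | - | 0.0158 | ns
Pop1 | 300 | 1 | 76126122 | - | 1 | - | -
Pop1 | 300 | 1 | 146847931 | 20 | - | 0.0195 | ns
Pop1 | 300 | 1 | 167714661 | 24 | - | ns | -2.16; -0.38
Pop1 | 300 | 1 | 168860031 | - | 3 | - | -
Pop1 | 300 | 1 | 168981018 | 25 | - | 0.0435 | ns
Pop1 | 300 | 1 | 236592209 | 29 | - | 0.0150 | ns
Pop1 | 300 | 1 | 245744736 | - | 4 | - | -
Pop1 | 300 | 2 | 326508 | - | 5 | - | -
Pop1 | 300 | 2 | 855530 | 31 | - | ns | 0.01; 1.54
Pop1 | 300 | 7 | 1462112 | 68 | - | 0.0247 | ns
Pop1 | 300 | 7 | 4291016 | - | 14 | - | -
Pop1 | 300 | 8 | 0 | 72 | - | ns | -2.36; -0.29
Pop1 | 300 | 8 | 163719894 | - | 15 | - | -
Pop1 | 300 | 9 | 4581985 | 84 | - | ns | 0.25; 1.89
Pop1 | 300 | 9 | 11190208 | - | 18 | - | -
Pop1 | 300 | 9 | 58272812 | - | 19 | - | -
Pop1 | 300 | 9 | 85800064 | 91 | - | ns | 0.03; 1.80
Pop2 | 300 | 2 | 0 | 30 | - | ns | 0.15; 1.74
Pop2 | 300 | 2 | 326508 | - | 5 | - | -
Pop2 | 300 | 4 | 138458023 | - | 10 | - | -
Pop2 | 300 | 4 | 148089966 | 49 | - | ns | -1.67; -0.01
Pop2 | 300 | 6 | 133876175 | - | 13 | - | -
Pop2 | 300 | 6 | 152790375 | 59 | - | ns | -1.83; -0.03
Pop2 | 300 | 6 | 159940903 | 61 | - | 0.0049 | ns
Pop2 | 300 | 9 | 7174065 | 87 | - | ns | 0.19; 2.27
Pop2 | 300 | 9 | 11190208 | - | 18 | - | -
Pop2 | 100 | 1 | 168860031 | - | 3 | - | -
Pop2 | 100 | 1 | 204597382 | 26 | - | ns | -2.91; -0.24
Pop2 | 100 | 2 | 0 | 30 | - | ns | 0.41; 3.37
Pop2 | 100 | 2 | 326508 | - | 5 | - | -
Pop2 | 100 | 2 | 855530 | 31 | - | 0.0203 | ns
Pop2 | 100 | 5 | 4536728 | 54 | - | ns | -2.89; -0.04
Pop2 | 100 | 5 | 8577499 | - | 11 | - | -
Pop2 | 100 | 7 | 4291016 | - | 14 | - | -
Pop2 | 100 | 7 | 16157837 | 70 | - | ns | -3.21; -0.03
Pop2 | 100 | 9 | 7174065 | 87 | - | ns | 0.14; 3.81
Pop2 | 100 | 9 | 11190208 | - | 18 | - | -

ns: non-significant at 5%.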

We found one or more candidate genes for each SNP with a significant association at a FDR of 10% and/or a significant frequency change at 0.05% (see Tables S1 and S2 in the Supplementary Material). In general, the identified candidate genes are involved in starch biosynthesis, lipid metabolism, cell wall polysaccharide (hemicellulose, cellulose, and pectin) biosynthesis, and storage protein metabolic/catabolic processes such as α-zein synthesis. Expression levels of these candidate genes in seeds (embryo, endosperm, and pericarp) are variable, generally ranging from intermediate to high depending on the reproductive stage (R1 to R4).

Discussion

Selection based on expansion volume indirectly led to a decrease in the number of polymorphic SNPs and in the number of SNPs in Hardy-Weinberg equilibrium, and an increase in expected heterozygosity, FST, and D and r2 values in populations derived from Viçosa after two or four cycles of non-inbred progeny selection. The selection procedures also caused several non-random changes in SNP frequencies. Theoretically, the possible causes are selection (indirectly, due to linkage disequilibrium between the SNPs and QTLs for quality), genetic drift (due to finite population size), migration, and mutation. Migration and mutation should be irrelevant causes. The inclusion of the simulated populations showed that genetic drift is not a relevant cause. Notice the equivalence between the parameters estimated in the populations with sample sizes 300 (lower genetic drift) and 100 (higher genetic drift). It should also be highlighted that the average random change in SNP frequencies in the simulated populations was lower than the average changes in the popcorn populations. Newell observed an increase in LD between SNPs having significant associations with methionine levels over cycles of divergent selection for methionine content. The LD increase occurred for linked and unlinked SNPs. They also observed changes in allele frequencies for two genes controlling methionine concentration. At the cys2 locus, one allele showed a decrease with selection for high methionine content (from 0.25 to 0.01) and an increase with selection for low methionine content (from 0.25 to 0.74). Wen observed that 57% of SNPs with significant allelic frequency changes among accession regenerations were within flowering-time QTL regions, which was evidence of assortative mating. Our results revealed greater LD for SNPs separated by more than 10 Mb in tropical populations than in both temperate and simulated populations.
In general, tropical populations showed average LD values greater than those of temperate populations. However, LD in the tropical populations was lower than that observed in a second-generation composite and higher than that in the composite after 10 generations of random crosses. Truntzler analyzed the extent of LD using a dent maize panel with public and private inbreds. For SNPs separated by 0 to 1,000 bp, the average r2 was higher for Syngenta lines (0.61) than for public lines (0.39). For SNPs separated by 1 to 10 Mb, the average r2 was 0.03 and 0.04 for public and Syngenta inbreds, respectively. In a study on the extent of LD in commercial maize germplasm, Van Inghelandt observed r2 values ranging from 0.009 to 0.013 for unlinked SNPs and from 0.020 to 0.029 for linked SNPs across four heterotic pools. The differing efficacy of least squares association mapping and Bayesian GWAS to detect true associations, as evidenced by the analysis of the simulated data, can be best attributed to the reduced proportion of phenotypic variance explained by the QTLs. In QTL mapping studies performed by Yongbin, Li (2007, 2008, 2009), Babu, and Lu, the proportion of phenotypic variance explained by QTLs for expansion volume ranged from 3.1% to 35.9%, with average values varying from 4.7% to 15.5%. These high values are due to the phenotyping of progeny or recombinant inbred lines (RILs) instead of plants. In regard to the field data, inefficiency in the identification of QTLs for expansion volume and other quality traits in the breeding populations or in validation of previously identified QTLs can be best explained by reduced heritability. Estimated heritabilities at the plant level for expansion volume in the two temperate populations were 53.2% and 50.7%; these values were lower than the heritabilities at the progeny level observed in the previous QTL mapping studies, which ranged from 72.0 to 83.0% with F2:3, BC1S1, or BC2F2 designs. 
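
The contrast between plant-level and progeny-level heritability can be illustrated with a deliberately simplified model. The sketch below assumes that each entry is phenotyped as the mean of n independent observations sharing the same genetic value, so that only the non-genetic variance is divided by n. This is an assumption made for illustration, not the model used in the cited studies, and the function name is hypothetical.

```python
def mean_basis_heritability(h2_single, n_obs):
    """Heritability of the mean of n_obs observations per entry.

    Simplifying assumptions (not from the source): the genetic variance is
    the same on both scales and residuals are independent, so
    h2_mean = sg / (sg + se / n) with sg = h2_single and se = 1 - h2_single,
    which simplifies to n * h2 / (1 + (n - 1) * h2).
    """
    if not 0.0 <= h2_single <= 1.0:
        raise ValueError("h2_single must lie between 0 and 1")
    if n_obs < 1:
        raise ValueError("n_obs must be at least 1")
    return n_obs * h2_single / (1.0 + (n_obs - 1) * h2_single)


# Hypothetical numbers of observations per entry, starting from the
# plant-level estimate of 0.532 reported in the text.
for n in (1, 2, 4, 8):
    print(n, round(mean_basis_heritability(0.532, n), 3))
```

Under these assumptions, averaging even a few observations per entry moves a plant-level value near 0.5 into the range reported for progeny-level analyses. Real progeny designs such as F2:3 also change the genetic variance among entries, so the sketch shows the direction of the effect rather than its exact size.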
From an analysis of RILs in four environments, Yongbin mapped seven QTLs for expansion volume and obtained an estimated heritability of 90.0%. To identify candidate genes for expansion volume and other popcorn quality traits, we based our analysis on kernel physicochemical characteristics affecting expansion volume, such as kernel size, shape, and density as well as kernel moisture, starch, protein, and fatty acid contents. The endosperm is the most important kernel component affecting popping, while starch is the major polymer involved in popcorn expansion. Popcorn kernels contain both vitreous (horny or hard) and opaque (floury or soft) endosperm. During popping, starch granules in the vitreous endosperm are highly expanded and responsible for flake formation, whereas starch granules in the opaque endosperm appear to undergo little change. Acting as a pressure vessel during heating, the pericarp gives popcorn its distinct popping ability. The pericarp is the primary source of fiber in the popcorn kernel, while the germ (embryo) is the primary source of lipids. Other than fracturing the pericarp, popping does not substantially alter either the germ or pericarp. In general, small to medium kernel size (lower 100-kernel weight) and greater kernel sphericity, kernel density, ratio of vitreous to opaque endosperm, and linoleic acid, oleic acid, and α-zein protein levels are associated with greater expansion volume. Pericarp damage and thickness also greatly affect expansion volume (Sweley). One candidate gene for SNPs SYN4651 and SYN4646 is GRMZM2G058472 (Gramene ID). The gene product is a glycosyltransferase involved in synthesis of glucuronoxylan, a polysaccharide of the hemicellulose fraction of the cell wall. The gene exhibits an intermediate expression level in the pericarp during the middle fruit ripening stage (R3). GRMZM2G060579 is the candidate gene for SNPs PZE-107105783 and SYN36108. 
This gene encodes an uncharacterized protein involved in pectin biosynthesis and shows an intermediate level of expression in the entire seed (embryo, endosperm, and pericarp) during early to middle stages of fruit ripening (R1 to R4). Tandjung showed that cellulose forms crystalline structures in the popcorn pericarp during microwave heating, thereby improving moisture retention and popping performance, the latter mainly by decreasing the number of unpopped kernels. The candidate gene for SNP PZE-107083429, GRMZM2G133613, encodes N-acetyllactosaminide 3-alpha-galactosyltransferase. This enzyme participates in glycoprotein synthesis, which is important for endosperm development (Riedell and Miernyk, 1988). The candidate gene for SNPs PZE-110060686, SYN16982, and SYN16979 is GRMZM2G049681. This gene codes for an uncharacterized protein that participates in protein metabolic processes and shows intermediate to high levels of expression in the embryo during early to middle fruit ripening stages. The candidate genes for SNPs PZE-101083826, SYN38509, and SYN38510 are GRMZM2G179521 and GRMZM2G074946; their respective gene products, 6-phosphogluconolactonase and glucose-6-phosphate 1-dehydrogenase, participate in the oxidative pentose phosphate pathway, which is a critical process for maize endosperm starch accumulation (Spielbauer). Surprisingly, these genes show intermediate to low levels of expression in the entire seed during early to middle stages of fruit ripening. The candidate gene for SNPs PZE-104033459, PZE-104033791, and PZE-104033817 is GRMZM2G138060 (sugary1), a determinant of starch composition in maize kernels (James). This gene shows a high level of expression in seeds (especially endosperm) during early to middle stages of fruit ripening. Among the SNPs with significant frequency changes, PZE-104008299, SYN33394, and SYN526 are of particular interest. 
Changes in the frequency of these SNPs ranged from 0.23 to 0.38, 0.15 to 0.39, and 0.35 to 0.50, respectively. The SNP PZE-104008299 is located in a region containing at least 12 genes coding for precursors of α-zeins, which are storage proteins accounting for 70% of maize endosperm protein (Holding and Larkins, 2006). All α-zein genes (including 19B1, PMS1, A30, and Z4) are highly expressed in the endosperm during early to middle stages of fruit ripening. The candidate gene for SNP SYN33394 is GRMZM2G429899 (shrunken-2), which encodes glucose-1-phosphate adenylyltransferase large subunit 1, an enzyme involved in starch biosynthesis. Mutation at this locus greatly reduces starch levels in the endosperm (Bhave). The gene is highly expressed in the endosperm during early to middle stages of fruit ripening. Finally, the candidate gene for SNP SYN526 is GRMZM2G007063 (ohp2). Similar to the well-known opaque-2 locus (o2), this gene also regulates the expression of many members of the zein multigene family of storage proteins (Ciceri). The gene displays intermediate to high levels of expression in the entire seed during early to middle stages of fruit ripening. To conclude, our results confirm some previously mapped QTLs for popcorn quality traits and provide evidence for several candidate genes affecting starch, storage protein, and oil content of popcorn kernels and pericarp polysaccharide content. The highlighted candidate genes are located in bins 1.04, 3.08, 4.02, 4.05, 5.01, 6.05, 7.02, 7.04, and 10.04. Yongbin, Li (2007, 2008, 2009), Babu, and Lu mapped QTLs for expansion volume in bins 1.04, 3.08, 4.02, 4.05, 5.01, 6.05, 7.03, and 10.04, among others. The main candidate genes affecting starch content are located in bins 1.04, 3.08, and 4.05. Those related to storage protein content are located in bins 4.02, 5.01, and 7.02. Some candidate genes associated with oil content were found in bins 1.01, 3.04, and 7.04. 
Yanyang mapped QTLs for starch, protein, and oil concentration. Four of the six QTLs for starch content were mapped in bins 1.01, 1.06-1.07, and 4.01-4.02. Three of the seven QTLs for protein content were mapped in bins 4.01-4.02, 7.01, and 7.03. Three of the five QTLs for oil content were mapped in bins 1.03, 3.04, and 7.03.

References (10 of 27 listed)

1.  Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies.

Authors:  Daniel Falush; Matthew Stephens; Jonathan K Pritchard
Journal:  Genetics       Date:  2003-08       Impact factor: 4.562

Review 2.  Structure of linkage disequilibrium in plants.

Authors:  Sherry A Flint-Garcia; Jeffry M Thornsberry; Edward S Buckler
Journal:  Annu Rev Plant Biol       Date:  2003       Impact factor: 26.379

3.  Genetic analysis and characterization of a new maize association mapping panel for quantitative trait loci dissection.

Authors:  Xiaohong Yang; Jianbing Yan; Trushar Shah; Marilyn L Warburton; Qing Li; Lin Li; Yufeng Gao; Yuchao Chai; Zhiyuan Fu; Yi Zhou; Shutu Xu; Guanghong Bai; Yijiang Meng; Yanping Zheng; Jiansheng Li
Journal:  Theor Appl Genet       Date:  2010-03-27       Impact factor: 5.699

4.  Role of the pericarp cellulose matrix as a moisture barrier in microwaveable popcorn.

Authors:  Agung S Tandjung; Srinivas Janaswamy; Rengaswami Chandrasekaran; Adam Aboubacar; Bruce R Hamaker
Journal:  Biomacromolecules       Date:  2005 May-Jun       Impact factor: 6.988

5.  Efficacy of population structure analysis with breeding populations and inbred lines.

Authors:  José Marcelo Soriano Viana; Mágno Sávio Ferreira Valente; Fabyano Fonseca E Silva; Gabriel Borges Mundim; Geísa Pinheiro Paes
Journal:  Genetica       Date:  2013-09-21       Impact factor: 1.082

6.  Association genetics in crop improvement.

Authors:  J Antoni Rafalski
Journal:  Curr Opin Plant Biol       Date:  2010-01-19       Impact factor: 7.834

7.  Maize association population: a high-resolution platform for quantitative trait locus dissection.

Authors:  Sherry A Flint-Garcia; Anne-Céline Thuillet; Jianming Yu; Gael Pressoir; Susan M Romero; Sharon E Mitchell; John Doebley; Stephen Kresovich; Major M Goodman; Edward S Buckler
Journal:  Plant J       Date:  2005-12       Impact factor: 6.417

8.  Mapping QTLs for popping ability in a popcorn x flint corn cross.

Authors:  R Babu; S K Nair; A Kumar; H S Rao; P Verma; A Gahalain; I S Singh; H S Gupta
Journal:  Theor Appl Genet       Date:  2006-03-09       Impact factor: 5.699

9.  GenAlEx 6.5: genetic analysis in Excel. Population genetic software for teaching and research--an update.

Authors:  Rod Peakall; Peter E Smouse
Journal:  Bioinformatics       Date:  2012-07-20       Impact factor: 6.937

10.  Functional mechanisms of drought tolerance in subtropical maize (Zea mays L.) identified using genome-wide association mapping.

Authors:  Nepolean Thirunavukkarasu; Firoz Hossain; Kanika Arora; Rinku Sharma; Kaliyugam Shiriga; Swati Mittal; Sweta Mohan; Pottekatt Mohanlal Namratha; Sreelatha Dogga; Tikka Shobha Rani; Sumalini Katragadda; Abhishek Rathore; Trushar Shah; Trilochan Mohapatra; Hari Shankar Gupta
Journal:  BMC Genomics       Date:  2014-12-24       Impact factor: 3.969
