Literature DB >> 32851437

Identification and validation of major QTLs associated with low seed coat deficiency of natto soybean seeds (Glycine max L.).

Qian Zhu1,2, Diana M Escamilla3, Xingbo Wu1, Qijian Song4, Song Li1, M Luciana Rosso1, Nilanka Lord1, Futi Xie2, Bo Zhang5.   

Abstract

KEY MESSAGE: Two major QTLs associated with low seed coat deficiency of soybean seeds were identified in two biparental populations, and three SNP markers were validated to assist low-SCD natto soybean breeding selection. Soybean seed coat deficiency (SCD), known as seed coat cracking during soaking in the natto production process, is problematic because split or broken beans clog production lines and increases production costs. Development of natto soybean cultivars with low SCD is crucial to support the growth of the natto industry. Unfortunately, information on the genetic control of SCD in soybean, which is desperately needed to facilitate breeding selection, remains sparse. In this study, two F2 populations derived from V11-0883 × V12-1626 (Pop 1) and V11-0883 × V12-1885 (Pop 2) were developed and genotyped with BARCSoySNP6K Beadchips and F2-derived lines were evaluated for SCD in three consecutive years (2016-2018) in order to identify quantitative trait loci (QTLs) associated with low SCD in soybean. A total of 17 QTLs underlying SCD were identified in two populations. Among these, two major and stable QTLs, qSCD15 on chromosome 15 and qSCD20 on chromosome 20, were detected across multiple years. These QTLs explained up to 30.3% of the phenotypic variation for SCD in Pop 1 and 6.1% in Pop 2 across years. Three SNP markers associated with the qSCD20 were validated in additional four biparental populations. The average selection efficiency of low-SCD soybean was 77% based on two tightly linked markers, Gm20_34626867 and Gm20_34942502, and 64% based on the marker Gm20_35625615. The novel and stable QTLs identified in this study will facilitate elucidation of the genetic mechanism controlling SCD in soybean, and the markers will significantly accelerate breeding for low-SCD soybean through marker-assisted selection.

Entities:  

Mesh:

Substances:

Year:  2020        PMID: 32851437      PMCID: PMC7547995          DOI: 10.1007/s00122-020-03662-5

Source DB:  PubMed          Journal:  Theor Appl Genet        ISSN: 0040-5752            Impact factor:   5.699


Introduction

Because of its high content of protein, fiber, amino acids, and isoflavones, soybean (Glycine max L. Merr.) has become increasingly appealing for human consumption as a nutritional and functional food (Gibbs et al. 2004; Ikeda et al. 2006; Sanjukta and Rai 2016). Natto, fermented whole soybeans, is a popular soyfood product in Japan and is well known for its nutrition, unique flavor and stickiness (Hu et al. 2010; Wei and Chang 2004). The USA has been the largest exporter of natto-type soybeans to Japan since the early 1990s. Utilization of US soybeans for natto production in Japan has been predicated on the ability of domestic growers to continuously meet the needs of natto manufacturers by supplying them with value-added, food-grade soybean seeds that result in high quality natto products (Ikeda et al. 2006; Yoshikawa et al. 2014). Soybean seeds must meet several quality standards for natto production: seeds must be small (< 9 g/100 seeds), have high water absorption capacity (Cook and Rainey 2010), be uniform in size, have minimum change in constituents during storage, be round in shape, and have clear hilum and yellow and smooth seed coat (Cui et al. 2004; Geater et al. 2000; Hosoi and Kiuchi 2003). Seeds with cracked seed coats or seed discoloration, or excessively flat seeds, were unqualified for natto production (Montague Farms, Inc., Center Cross, VA, personal communication). Thus, the quality of natto products is mainly determined by soybean cultivars, processing conditions (soaking, cooking and fermentation), and bacteria strains (Wei and Chang 2004). The initial water absorption in the natto producing process, where seeds are softened and soluble sugars are released, is an important step before natto fermentation (Cook and Rainey 2010). During this process, the seed coat regulates water absorption and prevents the destruction of seed tissue at the beginning of seed imbibition (Copeland and McDonald 2001; Koizumi et al. 2008). It is ideal for natto soybean to keep the seed coat intact despite weakening and expanding during absorption. However, inferior seed coat integrity has been frequently observed in USA, which was mainly caused by the seed coat cracking during water absorption and defined as seed coat deficiency (SCD) (Fig. 1S) (Cook and Rainey 2010 and personal communications). SCD is an undesirable trait for natto soybeans because it affects final natto appearance and clogs production lines which increases production costs and reduces profit (Yasui et al. 2017). The phenotypic data of SCD indicated that breeding selection is feasible to efficiently reduce SCD (Cook and Rainey 2010), but it is challenging due to the time-consuming, subjective, and laborious nature of phenotyping and the poorly understood influence of environment (Escamilla et al. 2019). Marker-assisted selection (MAS) has been incorporated into most soybean breeding programs (He et al. 2004); however, it has not been implanted in the low-SCD soybean selection due to limited genetic control information of this SCD trait. A recent study identified eight QTLs located on chromosomes 4, 6 and 8 that are associated with seed coat cracking after soaking and cooking (Yasui et al. 2017), but none of these QTLs were validated for MAS in breeding selection. SoyBase (http://www.soybase.org/ accessed Aug. 26, 2019) also reported 14 QTLs related to seed coat cracking in dry seeds; however, the phenotype of those QTLs was obtained by determining physical seed coat cracking after harvest, which was physiologically different compared to seed coat deficiency in natto soybean production. Currently, there is a widespread lack of available molecular tools that can be used for MAS of the low-SCD trait in soybean. Understanding of the genetic control of SCD is essential for development of molecular tools that can be used to improve breeding selection for low-SCD natto soybean cultivars. Therefore, the objectives of this study were to (1) identify QTLs associated with seed coat deficiency in soybean using two populations across multiple environments, and (2) validate these QTLs through Kompetitive Allele Specific PCR (KASP) assays using 86 breeding lines from four validation populations.

Materials and methods

Population development and experiment design

Two populations were developed by crossing small-seeded soybean breeding lines V11-0883 × V12-1626 (Pop 1) and V11-0883 × V12-1885 (Pop 2) (Pedigree was shown in Fig. 2S). The female parent V11-0883 produces the high-SCD phenotype, while the two male parents, V12-1626 and V12-1885, produce the low-SCD phenotype. Crosses were made in Blacksburg, VA in 2014, and the F1 generation was planted at the same location the following year. Six SSR markers (Satt449, Satt197, Satt281, Satt268, Satt431 and Satt345), which were polymorphic between parents, were used to verify true hybrids. A total of 240 and 153 F2 individuals from Pop 1 and Pop 2, respectively, were advanced to F3 at a winter nursery during the winter of 2015. In 2016, the F2:3 lines from two populations were separately spaced planted (0.03 m) in single, 3.05-mm-long rows with 0.76 m row spacing (with a seeding rate of 70,542 plants per hectare) arranged in a complete randomized design with three replications in Blacksburg, VA. F2:4 and F2:5 lines were harvested and replanted in subsequent years at the same growing location using the same experimental design. In all years, fertilizer was applied according to soil test recommendations and pre-emergent herbicide (Dual Magnum) was applied at rate of 2 L ha−1 to reduce weed pressure. No irrigation or insecticides was applied. Rows were inspected for flower and pubescence color each year in order to control population purity and avoid contamination. Seeds were harvested 5–10 days after 95% of the plants in a row reached R8 maturity.

Determination of seed coat deficiency

Seeds were stored in the seed storage room until seed moisture stabilized between 10 and 12.2% (Cook and Rainey 2010). Seeds with cracked seed coat or seed discoloration, or excessively flat seeds, were removed. One hundred intact soybean seeds of each plot were subsampled for SCD determination. A modified method from protocols described in previous studies (Cook and Rainey 2010; Rodda et al. 1973) was used for easy observation. Briefly, a 100-seed sample was placed in a plastic container with 50 ml of 1% commercial bleach solution for ten minutes; after soaking, the samples were drained and scored for SCD. Seeds that showed cracking were severely blistered around the hilum, or whose seed coats had detached from the hull were considered seed coat deficient (Fig. 1S) (Cook and Rainey 2010). The percentage of seed coat deficient seeds was used to score SCD.

Statistical data analysis

The SPSS statistical version 20.0 (SPSS Inc., Chicago, USA) was used to summarize the descriptive statistics of the SCD for each population. Normality assumption was assessed by the Shapiro–Wilk test and normal probability plots. Analysis of variance (ANOVA) was used to evaluate variation of SCD within and between each environment in each population. Histograms of SCD distributions were elaborated by R function “hist()”. The variance components were calculated and used to estimate the broad-sense heritability using the following equation:where H2 is heritability, is genotypic variance, is genotype × environment interaction variance, s2 is error variance, r is the number of replications, and e is the number of environments (Nyquist and Baker 1991).

DNA extraction and genotyping

Leaf tissue samples from F2 individuals and parental lines were collected from the field in centrifuge tubes and stored at −80 °C until extraction. For genomic DNA extraction, leaf tissue samples were freeze-dried at − 0.220 mbar with the collector temperature set at − 56 °C (FreeZone 6 Dryer system, Labconco, Kansas City, MO, USA). Lyophilized tissues were ground in liquid nitrogen using glass stirring rods. Total genomic DNA of each sample was isolated following a modified CTAB method (Saghai-Maroof et al. 1984). Fifty nanograms of genomic DNA for each genotype were sent to USDA–ARS Soybean Genomics and Improvement Laboratory (Beltsville, MD) for genotyping using the Illumina 6000-SNP BARCSoySNP6K Beadchip, selected from the SoySNP50K (Song et al. 2013). The SNP allele calling was conducted in GenomeStudio Module v2.0.3 (Illumina, Inc.). The low seed coat deficiency parents were scored as A, and high seed coat deficiency parents were scored as B. SNPs with no call and the monomorphic SNPs between parents were discarded. SNPs with low minor allele frequency (MAF) (< 10%) and high missing data ratio (< 5%), as well as severe segregation distortion, were filtered for quality control.

Linkage map construction and QTL analysis

Linkage maps were constructed by Joinmap 4.0 (Van Ooijen 2006) using a regression approach with a minimum logarithm of odds (LOD) threshold of 3. Recombination frequencies were converted to centimorgan (cM) using Kosambi mapping function (Kosambi 1943). To ensure consistency, QTL analyses were performed by single-marker analysis (SMA), interval mapping (IM) and composite interval mapping (CIM) implemented in ICiMapping v 4.1 (Wang et al. 2016). For SMA, p < 0.0001 was used as the experiment wide threshold for significant markers. In the CIM and IM, the experiment wide threshold was determined by 1000 permutation at significance level of 0.05 with a walk speed of 1 cM. MapChart (Voorrips 2002) was used to create the LOD plots based on JoinMap 4.0 and ICiMapping v 4.1 results.

KASP marker development

The SNPs tightly linked to major QTL identified in the mapping populations were converted into Kompetitive Allele Specific PCR (KASP) SNP genotyping assays (LGC, Middlesex, UK) with the flanking sequences obtained from the G. max genome Glyma.Wm82.a1 (Schmutz et al. 2010). The KASP oligos were synthesized by Integrated DNA Technologies (IDT, Iowa, USA), with primers carrying FAM tail (5′-GAAGGTGACCAAGTTCATGCT-3′) or VIC tail (5′-GAAGGTCGGAGTCAACGGATT-3′), and the target SNP in the 3′ end. Primer mix and PCR reaction were set up following LGC Genomics recommendation (46 µL distilled water, 30 µL common primer [100 µM], and 12 µL of each tailed primer [100 µM]). Thermocycling conditions consisted of the initial hot-start step at 95 °C for 15 min, followed by 10 cycles of touchdown PCR (annealing 65 °C to 57 °C, decreasing 0.8 °C per cycle), then 35 cycles of 20 s at 94 °C and 60 s at 57 °C. PCR and fluorescent endpoint reading were performed in FLUOstar Omega microplate reader (BMG LABTECH).

KASP marker validation

A total of 86 breeding lines from four biparental populations (MFS-561 × V09-0579, MFS-561 × V09-3876, V05-5973W × V09-3876, V05-5973W × V09-3984) were planted in Warsaw, VA for marker validation. The female parents, MFS-561 and V05-5973W produce the low-SCD phenotype, while all male lines produce the high-SCD phenotype. SCD was determined by the same method as in the mapping populations. Breeding lines with SCD less than or equal to that of either male parent were considered low-SCD lines. Selection efficiency (SE) of the selected markers linked to low SCD was calculated as follows: SE = (NC/NS) × 100, where NC is the number of low-SCD lines selected correctly by marker and NS is the total number of lines selected as low SCD by marker.

Results

Phenotypic analysis of seed coat deficiency

Both populations and their parental lines were scored for SCD during 2016–2018. In both populations, the SCD of the female parent V11-0883 (45.6%) exceeded that of both male parents (V12-1626, 7.3% and V12-1885, 13.7%) in all 3 years (Table 1). Among eight tests (two populations in 3 years plus mean across 3 years), three (SCD in 2017 and mean for Pop1; SCD in 2018 for Pop2) showed normal distribution (K–S test, p >0.2), while the rest showed continuous distribution (Table 1, Fig. 1), suggesting SCD was inherited as a quantitative trait. Large SCD variation was observed among individuals in both populations across all 3 years. For both populations, a larger degree of variation was observed in 2017 than in 2016 and 2018. The skewness was positive for SCD in all years and populations with the exception of Pop 2 in 2017, which displayed asymmetry toward the low-SCD end of the curve. Significant differences (p ≤ 0.0001) were detected when comparing SCD among genotypes, years, and the interaction between genotype and year in both populations (Table 1). The broad-sense heritability (H2) of SCD was 0.67 and 0.83 for Pop 1 and Pop 2, respectively, with an overall mean of 0.75.
Table 1

Descriptive statistics for seed coat deficiency (SCD) of two populations in 2016, 2017 and 2018

PopulationaYearMax (%)bMin (%)cMean (%)SEdSkewnessKurtosisp valueP1eP2fP3gH2hK–S testi
Pop 12016*48.0014.80.390.880.52< 0.000140.00.5p < 0.001
2017*94.03.043.31.020.18− 0.83< 0.000168.018.5p >0.2
2018*70.0021.70.660.710.31< 0.000133.03.0p = 0.004
2016/2017/2018**55.84.526.80.460.24− 0.69< 0.000147.07.30.67p >0.2
Pop 22016*59.02.022.00.780.43− 0.360.013340.06.5p = 0.017
2017*96.06.049.31.98− 0.07− 1.23< 0.000168.029.0p = 0.001
2018*75.02.027.01.100.560.240.013224.55.5p >0.2
2016/2017/2018**65.05.331.61.020.11− 0.720.011144.213.70.83p = 0.039

aPop 1 and Pop 2 were developed from the crosses of V11-0883 × V12-1626 and V11-0883 × V12-1885, respectively

bMax: maximum

cMin: minimum

dSE: standard error

eP1: V11-0883

fP2: V12-1626

gP3: V12-1885

hH2: Broad-sense heritability in combined environments (2016,2017, and 2018)

iK–S test: Kolmogorov–Smirnov test for normality distribution, probability was shown

*Significant difference among genotypes at the p < 0.0001

**Significant difference among genotypes (G), years (Y) and G × Y interaction at the p < 0.0001

Fig. 1

Distribution of seed coat deficiency (SCD) in 3 years (2016, 2017 and 2018) and their mean overall years (AVE) of Pop 1 (V11-0883 × V12-1626) and Pop 2 (V11-0883 × V12-1885)

Descriptive statistics for seed coat deficiency (SCD) of two populations in 2016, 2017 and 2018 aPop 1 and Pop 2 were developed from the crosses of V11-0883 × V12-1626 and V11-0883 × V12-1885, respectively bMax: maximum cMin: minimum dSE: standard error eP1: V11-0883 fP2: V12-1626 gP3: V12-1885 hH2: Broad-sense heritability in combined environments (2016,2017, and 2018) iK–S test: Kolmogorov–Smirnov test for normality distribution, probability was shown *Significant difference among genotypes at the p < 0.0001 **Significant difference among genotypes (G), years (Y) and G × Y interaction at the p < 0.0001 Distribution of seed coat deficiency (SCD) in 3 years (2016, 2017 and 2018) and their mean overall years (AVE) of Pop 1 (V11-0883 × V12-1626) and Pop 2 (V11-0883 × V12-1885)

QTL associated with SCD

Out of 6000 SNPs, 1318 and 1637 SNPs were polymorphic between the parental lines and subsequently selected after date filtration for linkage map construction in Pop 1 and Pop 2, respectively. A total of 1258 SNPs were mapped to 20 chromosomes (Chr.) in Pop 1 with a total genetic distance of 1826 cM ranging from 44.9 cM (Chr. 16) to 128.3 cM (Chr. 3) with an average length of 91.3 cM (Table 1S). The average distance between adjacent markers in Pop 1 was 1.6 cM. For Pop 2, a total of 1604 SNPs were mapped to 20 chromosomes (Table 2S). The total genetic length of LGs for Pop 2 was 1189 cM, with Chr. 10 being the longest (90.5 cM) and Chr. 16 being the shortest (33.3 cM). The average length of LG was 59.3 cM with an average marker interval of 0.8 cM. Genome-wide QTL analysis was performed based on the genetic map and phenotypic data of each population using composite interval mapping (CIM) and interval mapping (IM) in each year, as well as across 3 years. In Pop 1, two major QTLs were detected on Chr. 15 and Chr. 20 (Table 2, Fig. 2), while six other QTLs were detected on Chr. 3, 5, 6, 7, 10 and 14. The two stable QTLs, namely qSCD15 and qSCD20, were detected in each year and across 3 years, explaining 5.3–23.8% phenotypic variation. Meanwhile, qSCD15 and qSCD20 were also repeatedly detected in Pop 2 (Table 2, Fig. 3), along with seven other QTLs distributed on Chr. 1, 13, 16, 18 and 19. Among the two stable QTLs detected in Pop 2, qSCD15 was detected in individual and across years, explaining 3.1–10.4% phenotypic variation, while qSCD20 were detected in 2016, 2017 and across years, explaining 3.0–13.8% phenotypic variation. The additive effect of all detected QTLs was negative, indicating the male parents (V12-1626 and V12-1885) contributed negative allele decreasing SCD, while the female parent (V11-0883) contributed positive allele increasing SCD.
Table 2

Quantitative trait loci for seed coat deficiency (SCD) in the mapping populations

PopaEnvbQTL namecPosition (cM)ChrdFlanking markerseGI (cM)fWm82.a2.v1 Physical interval (bp)gLODhPVE(%)iAddjMethodk
Pop 12016qSCD142814Gm14_6024101-Gm14_495631726.5–29.56,135,158–5,040,7273.44.6− 1.9CIM
qSCD151215Gm15_511387-Gm15_47513378.5–13.5511,864–4,770,8144.5 − 6.16.6 − 10.1− 2.8 to − 2.9CIM, IM, SMA
qSCD204820Gm20_35625615-Gm20_3600214846.5–50.536,720,824–37,097,3157.3 − 8.47.5 − 11.3− 3.0 to − 3.1CIM, IM, SMA
2017qSCD343Gm03_241447-Gm03_7153930–8.5239,949–714,4484.5 − 7.24.2 − 5.3− 4.4 to − 6.1CIM, IM, SMA
qSCD102510Gm10_11518881-Gm10_2618842922.5–25.511,710,604–16,707,3343.4 − 4.54.1 − 5.0− 4.3 to − 6.1CIM, IM, SMA
qSCD152315Gm15_5312718-Gm15_606670922.5–23.55,331,364–6,085,7945.0 − 5.75.3 − 7.0− 5.0 to − 6.9CIM, IM, SMA
qSCD204520Gm20_34881595-Gm20_3562561544.5–46.536,021,058–36,720,82414.1 − 15.212.2 − 23.8− 9.3 to − 10.4CIM, IM, SMA
2018qSCD5985Gm05_36671535-Gm05_4174093695.5–9836,962,030–38,597,4253.14.9− 2.7CIM
qSCD6516Gm06_17258654-Gm06_3849946546.5–56.517,303,937–39,188,0865.5 − 5.99.3 − 9.5− 3.6 to − 4.6CIM, IM, SMA
qSCD7727Gm07_17030454-Gm07_1697958671.5–72.517,116,283–17,065,5623.14.8− 2.6CIM
qSCD152315Gm15_5312718-Gm15_606670922.5–23.55,331,364–6,085,7946.3 − 7.410.2 − 10.4− 3.7 to − 4.9CIM, IM, SMA
qSCD205220Gm20_36002148-Gm20_3609503751.5–52.537,097,315–37,190,2526.1 − 6.79.4 − 9.7− 3.7 to − 4.7CIM, IM, SMA
AVEqSCD6516Gm06_17258654-Gm06_3849946546.5–56.517,303,937–39,188,0865.2 − 7.37.3 − 7.8− 3.0 to − 4.7CIM, IM, SMA
qSCD102610Gm10_23967467-Gm10_1208027522.5–27.520,275,514–12,272,4543.1 − 3.83.8− 3.3IM, SMA
qSCD143014Gm14_5603904-Gm14_565548729.5–30.55,714,996–5,766,5792.93.5− 2.1CIM
qSCD152315Gm15_5312718-Gm15_606670922.5–23.55,331,364–6,085,7947.3 − 7.87.5 − 9.5− 3.5 to − 4.6CIM, IM, SMA
qSCD205220Gm20_36002148-Gm20_3609503751.5–52.537,097,315–37,190,25213.7 − 15.112.9 − 20.8− 5.1 to − 6.0CIM, IM, SMA
Pop 22016qSCD1_1151Gm01_1887205-Gm01_4932276013.5–16.51,887,609–50,206,3474.65.3− 4.3IM, SMA
qSCD1_2251Gm01_1653315-Gm01_5026249624.5–26.51,653,600–51,147,6755.4 − 7.26.8 − 17.4− 3.7 to − 5.2CIM, IM, SMA
qSCD1_3501Gm01_31906055-Gm01_705215646.5–55.533,203,133–7,071,1174.94.1− 3.9IM, SMA
qSCD132113Gm13_29677928-Gm13_2820601420.37–22.3730,875,555–29,399,3845.54.3− 3.8IM, SMA
qSCD15615Gm15_5312718-Gm15_62720065.5–6.55,331,364–6,291,0814.83.9− 4.1IM, SMA
qSCD18_12118Gm18_15324039-Gm18_734816820.5–21.515,066,204–7,370,3904.13.3− 3.5IM, SMA
qSCD203220Gm20_35625615-Gm20_3600214829.5–36.536,720,824–37,097,3154.73.6− 3.7IM, SMA
2017qSCD1_3511Gm01_31906055-Gm01_705215648.5–55.533,203,133–7,071,1174.0 − 5.54.8 − 14.0− 6.7 to − 8.6CIM, IM, SMA
qSCD15515Gm15_5312718-Gm15_62720061.5–7.55,331,364–6,291,0814.93.9− 8.3IM, SMA
qSCD16216Gm16_103052-Gm16_17558700–7.5102,882–1,772,7204.23.6− 7.5IM, SMA
qSCD18_2518Gm18_122382-Gm18_13084430–5.5122,380–1,308,7985.54.9− 8.5IM, SMA
qSCD196719Gm19_41918030-Gm19_5048691665.5–68.542,120,908–50,607,3365.35.8− 9.5IM, SMA
qSCD20_21820Gm20_26785339-Gm20_2506364615.5–21.527,947,431–26,214,4684.83.8− 8.0IM, SMA
qSCD203320Gm20_36095037-Gm20_3615304831.5–34.537,190,252–37,248,2634.1 − 5.44.3 − 13.8− 6.8 to − 8.4CIM, IM, SMA
2018qSCD15515Gm15_5312718-Gm15_62720062.5–6.55,331,364–6,291,0813.810.4− 5.1CIM, IM, SMA
AVEqSCD1_2251Gm01_1653315-Gm01_5026249624.5–26.51,653,600–51,147,6753.63.1− 5.0IM, SMA
qSCD1_3501Gm01_31906055-Gm01_705215646.5–55.533,203,133–7,071,1174.2 − 6.04.1 − 15.5− 4.5 to − 5.6CIM, IM, SMA
qSCD15515Gm15_5312718-Gm15_62720062.5–6.55,331,364–6,291,0814.93.1− 5.3IM, SMA
qSCD20_21220Gm20_827937-Gm20_123182327.5–14.5824,049–8,318,7185.74.3− 5.8IM, SMA
qSCD203320Gm20_36095037-Gm20_3615304831.5–34.537,190,252–37,248,2634.93.0− 5.0IM, SMA

Bold letters or values are QTLs names or the least or the maximum percentage of variance explained by corresponding QTL, respectively, mentioned in the text

Pop 1 and Pop 2 were developed from the crosses of V11-0883 × V12-1626 and V11-0883 × V12-1885, respectively

Env: the individual years and across years (AVE) in which the QTL was detected

The name of each QTL is a composite of SCD followed by the chromosome number, QTLs detected in different methods with the same or overlapping maker interval were designated as on QTL

dChr: chromosome

eFlanking markers: the markers flanking the QTL

fGI: genetic interval

gPhysical interval according to genome position at Wm82.a2.v

hLOD: logarithm of odds, data showed the range of the values detected across methods

iPVE: percentage of variance explained, data showed the range of the values detected across methods

jAdd: additive effect, data showed the range of the values detected across methods

kMethods: by which the QTL was detected, data presented were the range of different methods

Fig. 2

Mapping for seed coat deficiency (SCD) in the Pop 1 (V11-0883 × V12-1626) in individual and across years (AVE). QTL nomenclature is in the form of qTraitChr. Colored intervals refer to the mapped QTLs detected by CIM and/or IM in different environments, the curves indicate the physical position of markers against LOD score detected on chromosomes, and lines with different colors represent different environments

Fig. 3

Mapping for seed coat deficiency (SCD) in the Pop 2 (V11-0883 × V12-1885) in individual and across years (AVE). QTL nomenclature is in the form of qTraitChr. Colored intervals refer to the mapped QTLs detected by CIM and/or IM in different environments; the curves indicate the physical position of markers against LOD score detected on chromosomes, and lines with different colors represent different environments

Quantitative trait loci for seed coat deficiency (SCD) in the mapping populations Bold letters or values are QTLs names or the least or the maximum percentage of variance explained by corresponding QTL, respectively, mentioned in the text Pop 1 and Pop 2 were developed from the crosses of V11-0883 × V12-1626 and V11-0883 × V12-1885, respectively Env: the individual years and across years (AVE) in which the QTL was detected The name of each QTL is a composite of SCD followed by the chromosome number, QTLs detected in different methods with the same or overlapping maker interval were designated as on QTL dChr: chromosome eFlanking markers: the markers flanking the QTL fGI: genetic interval gPhysical interval according to genome position at Wm82.a2.v hLOD: logarithm of odds, data showed the range of the values detected across methods iPVE: percentage of variance explained, data showed the range of the values detected across methods jAdd: additive effect, data showed the range of the values detected across methods kMethods: by which the QTL was detected, data presented were the range of different methods Mapping for seed coat deficiency (SCD) in the Pop 1 (V11-0883 × V12-1626) in individual and across years (AVE). QTL nomenclature is in the form of qTraitChr. Colored intervals refer to the mapped QTLs detected by CIM and/or IM in different environments, the curves indicate the physical position of markers against LOD score detected on chromosomes, and lines with different colors represent different environments Mapping for seed coat deficiency (SCD) in the Pop 2 (V11-0883 × V12-1885) in individual and across years (AVE). QTL nomenclature is in the form of qTraitChr. Colored intervals refer to the mapped QTLs detected by CIM and/or IM in different environments; the curves indicate the physical position of markers against LOD score detected on chromosomes, and lines with different colors represent different environments The two stable QTLs, qSCD15 and qSCD20, were identified in both populations across multiple years and could together explain 21.4–30.8% and 6.3% of phenotypic variation for Pop 1 and Pop 2, respectively. The high stability and explained phenotypic variation percentage suggest that qSCD15 and qSCD20 are major QTLs for SCD in soybeans. Furthermore, the single-marker analysis (SMA) confirmed the significant associations between SCD and eleven markers within the two major QTL regions identified in Pop 1 (Table 3). Two markers within the region of qSCD15 (Gm15_5312718 and Gm15_6272006) were detected in 2018 and across years, and nine markers from the qSCD20 region were detected in 2017 and across years.
Table 3

SNPs highly significant associated with seed coat deficiency (SCD) detected by single-marker analysis (SMA) in Pop1 (V11-0883 × V12-1626) within two stable QTL regions

QTLaMarker IDPosition (cM)LODb
201620172018AVE
qSCD15Gm15_531271822.16.7*5.9*
Gm15_627200624.86.7*6.0*
qSCD20Gm20_3462686744.113.1*12.1*
Gm20_3494250244.513.8*12.8*
Gm20_3488159544.713.9*13.2*
Gm20_3562561547.97.3*11.7*11.8*
Gm20_3600214851.36.3*13.3*6.7*12.9*
Gm20_3609503752.06.4*13.5*6.7*13.1*
Gm20_3615304852.86.3*13.3*6.4*13.1*
Gm20_3665142955.413.3*6.4*12.9*
Gm20_3671044855.86.4*11.8*11.9*

aQTL: the QTL detected by CIM and/or IM, see in Table 2

bLOD: logarithm of odds

*SNPs significantly associated with seed coat deficiency at p < 0.0001

SNPs highly significant associated with seed coat deficiency (SCD) detected by single-marker analysis (SMA) in Pop1 (V11-0883 × V12-1626) within two stable QTL regions aQTL: the QTL detected by CIM and/or IM, see in Table 2 bLOD: logarithm of odds *SNPs significantly associated with seed coat deficiency at p < 0.0001

SNP validation via KASP genotyping assay

Eleven SNPs from the two stable QTLs (qSCD15 and qSCD20) that showed tight linkage to SCD (p < 0.001) were selected for marker validation (Table 3). None of the SNPs derived from the QTLs located on Chr. 15 segregated in the parental lines of the validation populations. In addition, three SNPs (Gm20_34626867, Gm20_34942502 and Gm20_35625615) located on Chr. 20 were found to be polymorphic between the parental lines of the validation populations. Gm20_34626867 and Gm20_34942502, located only 0.39 cM apart, had high selection efficiency (83% and 100%, respectively) in MFS-561-derived populations, and low selection efficiency (25% and 50%) in V05-5973 W-derived populations (Table 4). The other marker, Gm20_35625615, located 3.4 cM away from Gm20_34942502, showed variable selection efficiency across all four populations (57%, 92%, 33% or 50%). The combined average selection efficiency of marker Gm20_34626867 and Gm20_34942502 across all four validation populations was 77%, which exceeded that of marker Gm20_35625615 (64%) by just over 12%.
Table 4

Markers selection efficiency (SE) in four validation populations developed in crosses using one low-SCD soybean in each combination and three SNPs significantly associated with SCD

Validation populationLine no.Gm20_34626867Gm20_34942502Gm20_35625615
NCaNSbSE (%)NCaNSbSE (%)NCaNSbSE (%)
MFS-561 × V09-05793456835683122157
MFS-561 × V09-38762312121001212100111292
V05-5973W × V09-387616142514251333
V05-5973W × V09-398413245024503650
Total86202677202677274264

aNC: number of lines correctly selected as low SCD by marker

bNS: number of lines selected as low SCD by marker

Markers selection efficiency (SE) in four validation populations developed in crosses using one low-SCD soybean in each combination and three SNPs significantly associated with SCD aNC: number of lines correctly selected as low SCD by marker bNS: number of lines selected as low SCD by marker

Discussion

The quality of natto is largely cultivar dependent. Given its ramifications for consumer acceptance and production costs, developing natto soybean cultivars with lower incidence of SCD is of considerable interest to the natto industry. However, screening lines for SCD soybean is time-consuming and an effective selection tool is needed. In an effort to identify QTL and markers associated with low SCD, two sets of biparental populations and their parents were evaluated for SCD, QTLs associated with SCD were identified on Chr. 15 and 20. SNP markers derived from the two stable QTLs were developed and validated to select low-SCD natto soybean. Large SCD variance was observed among individuals in Pop 1 (0–94%) and Pop 2 (2–96%), consistent with previously reported SCD variation (Cook and Rainey 2010; Yasui et al. 2017). Significant variation in SCD was observed among different years in both populations. SCD was much more severe in 2017 than in 2016 and 2018 for both populations, suggesting that expression of this trait may be particularly influenced by environmental conditions (Table 1). Interestingly, the average temperature and total precipitation during pod development stage (August and September) were much lower in 2017 than in 2016 and 2018 (Table 3S). Low temperature can directly jeopardize soybean seed quality, as chilling temperatures during flowering and low moisture content are both conducive to seed coat cracking (Koizumi et al. 2008). Despite the differences in environmental conditions, continuous distributions were still observed in both populations in individual and across years. This suggests that while environmental influence holds potential to exacerbate absolute SCD incidence, recovery of genotypes that show consistently lower relative SCD is possible through targeted breeding. Although SCD is widely recognized as an important trait for natto cultivar development, few studies have investigated its genetic control. To address this need, we utilized two biparental mapping populations and identified seventeen QTLs associated with SCD (Table 2). Two stable QTLs were detected across multiple years and populations. A similar QTL study was conducted to study seed coat cracking during natto producing processes (soaking and cooking) in 126 recombinant inbred lines (Yasui et al. 2017). They found that QTL regions on Chr. 4, 6, and 8 were likely responsible for seed coat cracking; however no stable QTL was identified. The reliability of QTL mapping depends on the phenotyping methods, population size, linkage map density, and diverse environment factors. In order to increase the likelihood of identifying stable QTLs, we evaluated larger mapping populations over 3 years (environments) while observing greater marker density per chromosome. Thus, the QTLs identified in this study hold greater potential for future studies regarding the molecular mechanism and genetic basis of SCD in soybean. The QTL qSCD15, flanked by Gm15_5312718 and Gm15_6272006, included two significant SNPs. Gm15_5312718_C_T and Gm15_6272006_T_C were located at 5,331,364 bp and 6,291,081 bp (respectively) on Chr. 15 of the Wm82.a2.v1 reference genome. Even though no QTL associated with soybean seed traits was reported within this region, a previous study identified nearby QTLs associated with soybean seed coat hardness (Kuroda et al. 2013). Though further correlation studies must be conducted, the evidence from both our study and the study by Kuroda et al. (2013) suggest that seed coat deficiency and seed coat hardness might be closely related and that Chr. 15 may be particularly important for modification of seed coat traits in soybean. Nine significant SNPs were identified with the qSCD20 region. This region was flanked by Gm20_34881595 and Gm20_36095037, ranging from 36,021,058 bp to 37,190,252 bp on Chr. 20. QTLs responsible for seed composition (oil and isoflavones), seed set, and hilum color have been previously reported in this region (Fang et al. 2017; Leamy et al. 2017; Meng et al. 2016), all of which are also important for natto soybean breeding selection (Escamilla et al. 2019). Similar to the Chr. 15 QTL, qSCD20 overlapped with a previously identified QTL on Chr. 20 associated with seed coat hardness (Kuroda et al. 2013). Interestingly, qSCD20 also happened to be located approximately 20 Mbp (12.6 cM based on GmConsensus40) away from another previously reported QTL on Chr. 20 associated with seed coat cracking after harvest (Ha et al. 2012). Seed coat cracking is directly related to the strength of the seed coat. Chr. 20 may harbor several genes associated with seed coat strength and hardness, which play an important role or interact with other genes to cause a complex trait as seed coat deficiency. We identified candidate genes in the intervals that qSCD20 and qSCD15 map using public data on soybase.org. We found 52 genes in the qSCD20 interval and 95 genes in the qSCD15 interval. Among those genes in the qSCD20 interval, we identified one candidate gene, Glyma.20G128600, which is homologous to Arabidopsis CAD4 (AT3G19450.1). This gene is a GroES-like zinc-binding alcohol dehydrogenase family protein involved in lignin biosynthesis. Given that lignin is a key component of the cell wall, it is plausible that alleles governing improved lignin biosynthesis may result in decreased incidence of SCD. We also identified additional candidate genes Glyma.15G078300, Glyma.15G075300, Glyma.15G074700, Glyma.15G074000, and Glyma.15G072300, which are involved in cell wall organization, callose deposition in cell wall, cuticle development, cell wall modification, and secondary cell wall biogenesis (Shao et al. 2007). These candidate genes are targets for future experimental validations of the molecular mechanisms of SCD in soybean. Gene lists underlying both QTLs are provided as a supplementary table with the candidate genes highlighted (Table 4S). The broad-sense heritability (H2) of SCD in Pop 1 was 0.67, which was also observed for seed coat cracking after soaking (Yasui et al. 2017). Meanwhile, the broad-sense heritability of SCD was higher in Pop 2 (0.83). The reported heritability of other food-grade seed traits such as seed size, protein, sucrose, raffinose, and stachyose concentration range from 0.45 to 0.86 in soybean (Jaureguy et al. 2011). According to Robinson et al. (1949), trait heritability is categorized in three levels: low (0–30%), medium (31–60%), and high (> 60%). Heritability estimates indicate the potential to achieve genetic gain for a trait through breeding selection (Jaureguy et al. 2011). The high broad-sense heritability and large variance in SCD incidence observed in this study suggest that the reduction in SCD in natto soybean cultivars by accumulating favorable alleles through breeding selection is promising. Moreover, the QTLs identified in this study can be used to increase efficiency of breeding selection for the low-SCD trait, which can ultimately permit the development of superior natto soybean cultivars. Eleven SNP markers were developed from the two stable QTLs located on Chr. 15 and 20 and validated in four diverse biparental populations. None of the markers from Chr. 15 were polymorphic between the parental lines and thus were not suitable for marker-assisted selection in the four populations. Three markers located on Chr. 20 were polymorphic between the parental lines and segregated in the breeding lines of validation populations. Markers Gm20_34626867 and Gm20_34942502 showed the same selection efficiency pattern in the validation populations and both showed higher selection efficiency in populations derived from the low-SCD (2%) parent MFS-561 and relatively lower selection efficiency in populations derived from low-SCD (15%) parent V05-5973W. Marker Gm20_35625615 showed relatively higher selection efficiency in populations derived from MFS-561 than that in populations developed from V05-5973W, which was consistent with the other two markers. The pedigree of MFS-561 shows that its parent MFS-553 is the grand parent of V16-1626 and V12-1885 used for QTL mapping (Fig. 2S). V05-5973W doesn’t share any parents traced back to two generations with V16-1626 and V12-1885. Therefore, the markers explored in this study would be ideally used for SCD selection on populations derived from MFS-553. In addition, the different selection efficiency across populations might indicate the variation of recombination rate between the test markers and low-SCD genes, or the possible involvement or interaction with other QTL. The overall selection efficiency of three markers was 56% in four populations with higher selection efficiency on MFS-561-derived populations and lower selection efficiency on V05-5973W-derived populations, when single-marker selection efficiency was compared. The results indicated that multiple markers should be employed to make better selection decision when using marker-assisted breeding approach. In summary, two stable QTLs associated with low SCD were identified on chromosomes 15 and 20 by evaluating two biparental populations in three environments. Furthermore, three markers (Gm20_34626867, Gm20_34942502 and Gm20_35625615) were developed based on the stable QTL located on Chr. 20. The markers had a high selection efficiency in two populations with MFS-561 genetic background. The finding of these two major QTLs will shed light on the genetic control of seed coat deficiency, and the markers developed from this study will facilitate molecular marker-assisted selection of low-SCD natto soybean. Below is the link to the electronic supplementary material. Supplementary material 1 (DOCX 882 kb) Supplementary material 2 (DOCX 33 kb) Supplementary material 3 (DOCX 39 kb) Supplementary material 4 (XLSX 51 kb)
  14 in total

1.  MapChart: software for the graphical presentation of linkage maps and QTLs.

Authors:  R E Voorrips
Journal:  J Hered       Date:  2002 Jan-Feb       Impact factor: 2.645

2.  The outermost cuticle of soybean seeds: chemical composition and function during imbibition.

Authors:  Suqin Shao; Chris J Meyer; Fengshan Ma; Carol A Peterson; Mark A Bernards
Journal:  J Exp Bot       Date:  2007-01-11       Impact factor: 6.992

3.  Characterization of fermented black soybean natto inoculated with Bacillus natto during fermentation.

Authors:  Yongjin Hu; Changrong Ge; Wei Yuan; Renjun Zhu; Wujiu Zhang; Lijuan Du; Jie Xue
Journal:  J Sci Food Agric       Date:  2010-05       Impact factor: 3.638

4.  Ribosomal DNA spacer-length polymorphisms in barley: mendelian inheritance, chromosomal location, and population dynamics.

Authors:  M A Saghai-Maroof; K M Soliman; R A Jorgensen; R W Allard
Journal:  Proc Natl Acad Sci U S A       Date:  1984-12       Impact factor: 11.205

5.  Intake of fermented soybeans, natto, is associated with reduced bone loss in postmenopausal women: Japanese Population-Based Osteoporosis (JPOS) Study.

Authors:  Yukihiro Ikeda; Masayuki Iki; Akemi Morita; Etsuko Kajita; Sadanobu Kagamimori; Yoshiko Kagawa; Hideo Yoneshima
Journal:  J Nutr       Date:  2006-05       Impact factor: 4.798

6.  Evaluation of seed chemical quality traits and sensory properties of natto soybean.

Authors:  Yoko Yoshikawa; Pengyin Chen; Bo Zhang; Andrew Scaboo; Moldir Orazaly
Journal:  Food Chem       Date:  2013-12-16       Impact factor: 7.514

7.  Genome sequence of the palaeopolyploid soybean.

Authors:  Jeremy Schmutz; Steven B Cannon; Jessica Schlueter; Jianxin Ma; Therese Mitros; William Nelson; David L Hyten; Qijian Song; Jay J Thelen; Jianlin Cheng; Dong Xu; Uffe Hellsten; Gregory D May; Yeisoo Yu; Tetsuya Sakurai; Taishi Umezawa; Madan K Bhattacharyya; Devinder Sandhu; Babu Valliyodan; Erika Lindquist; Myron Peto; David Grant; Shengqiang Shu; David Goodstein; Kerrie Barry; Montona Futrell-Griggs; Brian Abernathy; Jianchang Du; Zhixi Tian; Liucun Zhu; Navdeep Gill; Trupti Joshi; Marc Libault; Anand Sethuraman; Xue-Cheng Zhang; Kazuo Shinozaki; Henry T Nguyen; Rod A Wing; Perry Cregan; James Specht; Jane Grimwood; Dan Rokhsar; Gary Stacey; Randy C Shoemaker; Scott A Jackson
Journal:  Nature       Date:  2010-01-14       Impact factor: 49.962

Review 8.  Genotyping-by-sequencing (GBS), an ultimate marker-assisted selection (MAS) tool to accelerate plant breeding.

Authors:  Jiangfeng He; Xiaoqing Zhao; André Laroche; Zhen-Xiang Lu; HongKui Liu; Ziqin Li
Journal:  Front Plant Sci       Date:  2014-09-30       Impact factor: 5.753

9.  A genome-wide association study of seed composition traits in wild soybean (Glycine soja).

Authors:  Larry J Leamy; Hengyou Zhang; Changbao Li; Charles Y Chen; Bao-Hua Song
Journal:  BMC Genomics       Date:  2017-01-05       Impact factor: 3.969

10.  Genome-wide association studies dissect the genetic networks underlying agronomical traits in soybean.

Authors:  Chao Fang; Yanming Ma; Shiwen Wu; Zhi Liu; Zheng Wang; Rui Yang; Guanghui Hu; Zhengkui Zhou; Hong Yu; Min Zhang; Yi Pan; Guoan Zhou; Haixiang Ren; Weiguang Du; Hongrui Yan; Yanping Wang; Dezhi Han; Yanting Shen; Shulin Liu; Tengfei Liu; Jixiang Zhang; Hao Qin; Jia Yuan; Xiaohui Yuan; Fanjiang Kong; Baohui Liu; Jiayang Li; Zhiwu Zhang; Guodong Wang; Baoge Zhu; Zhixi Tian
Journal:  Genome Biol       Date:  2017-08-24       Impact factor: 13.583

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.