Identification of the Genomic Region Underlying Seed Weight per Plant in Soybean (Glycine max L. Merr.) via High-Throughput Single-Nucleotide Polymorphisms and a Genome-Wide Association Study.

Yan Jing1, Xue Zhao1, Jinyang Wang1, Weili Teng1, Lijuan Qiu2, Yingpeng Han1, Wenbin Li1.   

Abstract

Seed weight per plant (SWPP) of soybean (Glycine max (L.) Merr.), a complicated quantitative trait controlled by multiple genes, was positively associated with soybean seed yields. In the present study, a natural soybean population containing 185 diverse accessions primarily from China was used to analyze the genetic basis of SWPP via genome-wide association analysis (GWAS) based on high-throughput single-nucleotide polymorphisms (SNPs) generated by the Specific Locus Amplified Fragment Sequencing (SLAF-seq) method. A total of 33,149 SNPs were finally identified with minor allele frequencies (MAF) > 5% which were present in 97% of all the genotypes. Twenty association signals associated with SWPP were detected via GWAS. Among these signals, eight SNPs were novel loci, and the other twelve SNPs were overlapped or located in the linked genomic regions of the reported QTL from SoyBase database. Several genes belonging to the categories of hormone pathways, RNA regulation of transcription in plant development, ubiquitin, transporting systems, and other metabolisms were considered as candidate genes associated with SWPP. Furthermore, nine genes from the flanking region of Gm07:19488264, Gm08:15768591, Gm08:15768603, or Gm18:23052511 were significantly associated with SWPP and were stable among multiple environments. Nine out of 18 haplotypes from nine genes showed the effect of increasing SWPP. The identified loci along with the beneficial alleles and candidate genes could be of great value for studying the molecular mechanisms underlying SWPP and for improving the potential seed yield of soybean in the future.

Keywords:  candidate genes; genome-wide association analysis; seed weight per plant; single nucleotide polymorphism; soybean

Year:  2018        PMID: 30369935      PMCID: PMC6194254          DOI: 10.3389/fpls.2018.01392

Source DB:  PubMed          Journal:  Front Plant Sci        ISSN: 1664-462X            Impact factor:   5.753


Introduction

Seed weight per plant (SWPP), a complex and agronomically important quantitative trait, is significantly related to yield in soybean (Glycine max L. Merr.) (Chen et al., 2007; Liu et al., 2016). SWPP is controlled by multiple genes or quantitative trait loci (QTL), and additive effects dominate its inheritance (Chen et al., 2007). Because SWPP is an important yield component, developing cultivars with suitable SWPP has been an important objective for many soybean breeders. SWPP is influenced by the environment and by genotype-by-environment interactions, so the trait performs differently in different environments. Hence, breeding soybean cultivars with suitable SWPP via traditional methods requires evaluation in multiple environments over several years, which is expensive, time-consuming, and labor-intensive. Advances in molecular marker technologies have enabled the efficient elucidation of the genetic architecture of soybean SWPP. To date, fewer than 50 QTL, located on chromosome (Chr.) 3 (linkage group, LG N), Chr.4 (LG C1), Chr.5 (LG A1), Chr.6 (LG C2), Chr.7 (LG M), Chr.8 (LG A2), Chr.9 (LG K), Chr.10 (LG O), Chr.18 (LG G), and Chr.19 (LG L), have been reported in the SoyBase databank. Of these, some QTL located on Chr.3 (LG N), Chr.4 (LG C1), and Chr.6 (LG C2) were verified by multiple studies (Chen et al., 2007; Kuroda et al., 2013; Yao et al., 2015). The genetic maps used in most of these studies covered the soybean genome incompletely and contained large gaps. Thus, some QTL were difficult to apply directly in marker-assisted selection (MAS) for SWPP. Genome-wide association studies (GWAS), based on high-density markers and natural populations, are regarded as an efficient alternative to linkage analysis because natural populations carry more extensive recombination events and shorter linkage disequilibrium (LD) segments. Thus, the resolution and accuracy of marker-phenotype associations can be further increased via GWAS compared with conventional QTL mapping in segregating populations (Li et al., 2015). To date, GWAS has been widely used to elucidate the genetic basis of many complex traits in crops, including rice (Huang et al., 2010), maize (Weng et al., 2011), wheat (Raman et al., 2010), and barley (Cockram et al., 2010). In soybean, the genetic architecture of some important traits, such as protein (Hwang et al., 2014), fatty acid content (Li et al., 2015), and soybean cyst nematode (SCN) resistance (Han et al., 2015; Zhao et al., 2017), has been well dissected. Moreover, the rapid development of next-generation sequencing and SNP genotyping technologies has greatly improved the practicability of GWAS. In previous studies, Yan et al. (2017), Contreras-Soto et al. (2017), and Zhang et al. (2016) identified seventeen, seven, and forty-eight SNPs, respectively, all significantly associated with 100-seed weight (HSW) of soybean. However, to date, only a few studies have identified QTL underlying SWPP based on high-throughput sequencing technology. Han et al. (2016) detected rs18976374 (located on Chr.16), which was significantly related to SWPP, by using a barcoded multiplex sequencing approach with an Illumina Genome Analyzer II. Liu et al. (2016) found one SNP (ss244932137, located on Chr.3) associated with SWPP through GWAS of an association panel of 138 cultivars genotyped with the Illumina SoySNP6K iSelect BeadChip. The aims of the present study were to better understand the genetic architecture of SWPP via GWAS based on 185 tested accessions and 33,149 SNPs, and to analyze the potential candidate genes that might regulate soybean SWPP in the genomic regions associated with peak SNPs.

Materials and Methods

Soybean Germplasms and Field Trials

A total of 185 diverse soybean accessions (Supplementary Table ), including landraces and elite cultivars, were applied to evaluate the variation of SWPP and for the subsequent sequencing analysis. All samples were grown in Harbin in 2015 and 2016, Gongzhuling in 2015 and 2016, and Shenyang in 2015 and 2016. Field experiments were performed with single row plots of 3 m in length with 0.65 m between rows. A randomized complete block design was used with three replications in each tested environment. After reaching full maturity, a total of 10 randomly selected plants per row in each plot were weighed and used to evaluate SWPP.

DNA Isolation and Sequencing Analysis

Total DNA was extracted from the fresh leaves of each tested sample by the CTAB method according to Han et al. (2015). The isolated high-quality DNA was partly sequenced via the specific locus amplified fragment sequencing (SLAF-seq) methodology (Sun et al., 2013). The soybean reference genome (Version: Glyma.Wm82.a2) was analyzed beforehand, and the restriction enzymes Mse I (EC 3.1.21.4) and Hae III (EC 3.1.21.4) (Thermo Fisher Scientific Inc., Waltham, MA, United States) were used for digestion to generate more than 50,000 sequencing tags (approximately 300–500 bp in length) for all tested samples. The obtained tags were evenly distributed among the unique genomic regions of the 20 soybean chromosomes. The sequencing library of each tested accession was built from these sequencing tags. A barcode method combined with the Illumina Genome Analyzer II system (Illumina Inc., San Diego, CA, United States) was used to generate 45-bp sequence reads at both ends of the sequencing tags from each accession library. Short Oligonucleotide Alignment Program 2 (SOAP2) software was used to align raw paired-end reads to the soybean reference genome (Version: Glyma.Wm82.a2). SLAF groups were defined from the raw reads that mapped to the same unique genomic position. Approximately 58,000 high-quality SLAF tags were acquired from each tested accession. SNPs were called with a threshold of MAF ≥ 0.05. If the ratio of the minor allele depth to the total depth of a sample was larger than 1/3, the genotype was considered heterozygous.

Thirty lines were additionally resequenced at 10-fold depth on an Illumina HiSeq 2000 sequencer. Paired-end resequencing reads were mapped to the soybean Williams 82 reference genome (Version: Glyma.Wm82.a2) with BWA (Version: 0.6.1-r104) (Zhou et al., 2015) using the default parameters. SAMtools (Version: 0.1.18) software (Zhou et al., 2015) was used to convert the mapping results into BAM format and to filter out unmapped and non-unique reads. Duplicated reads were filtered with the Picard package (picard.sourceforge.net, Version: 1.87) (Zhou et al., 2015). The BEDtools (Version: 2.17.0) (Zhou et al., 2015) coverageBed program was applied to compute the coverage of sequence alignments. A sequence was defined as absent when the coverage was lower than 90% and present when the coverage was higher than 90%. SNP detection was performed with the Genome Analysis Toolkit (GATK, version 2.4-7-g5e89f01) and SAMtools (Zhou et al., 2015). Only SNPs detected by both methods were analyzed further. SNPs with allele frequencies lower than 1% in the population were discarded. SNP annotation was performed based on the soybean genome (Version: Glyma.Wm82.a2) using the package ANNOVAR (Version: 2013-08-23) (Zhou et al., 2015).
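
The heterozygosity rule and the MAF filter described for the SLAF-seq data can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `call_genotype`, `minor_allele_frequency`, and `passes_maf` are hypothetical, genotypes are assumed to be biallelic two-character strings, and the heterozygosity rule is read as the ratio of minor-allele depth to total depth exceeding 1/3.

```python
from collections import Counter


def call_genotype(major_depth, minor_depth, het_ratio=1 / 3):
    """Classify one sample at one SNP from its read depths (hypothetical helper)."""
    total = major_depth + minor_depth
    if total == 0:
        return None  # no reads: genotype is missing
    # Heterozygous when minor-allele depth / total depth exceeds the ratio
    return "het" if minor_depth / total > het_ratio else "hom"


def minor_allele_frequency(genotypes):
    """genotypes: list of strings such as 'AA' or 'AG'; None marks missing data."""
    counts = Counter()
    for g in genotypes:
        if g is not None:
            counts.update(g)  # count each allele character
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or fully missing site
    return min(counts.values()) / total


def passes_maf(genotypes, min_maf=0.05):
    """Keep a SNP only if its minor allele frequency reaches the threshold."""
    return minor_allele_frequency(genotypes) >= min_maf
```

For example, a sample with 6 major-allele reads and 4 minor-allele reads has a ratio of 0.4 and would be treated as heterozygous under this reading of the rule.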

Population Structure Evaluation and Linkage Disequilibrium (LD) Analysis

The population structure analysis of the natural group was conducted based on the PCA programs in the GAPIT software package (Lipka et al., 2012). The LD block was evaluated across the soybean genome based on SNPs with MAF ≥ 0.05 and missing data ≤ 10% by using squared allele frequency correlations (r²) in TASSEL version 3.0 (Bradbury et al., 2007). In contrast to the GWAS, missing SNP genotypes were not imputed with the major allele prior to LD analysis. The parameters in the software programs were set according to the MAF (≥0.05) and the integrity of each SNP (≥80%).
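
As a rough illustration of the squared allele frequency correlation between two SNPs, the sketch below computes r² from 0/1 allele codes. It assumes phased haplotypes or fully homozygous lines, skips pairs with missing data rather than imputing them, and uses a hypothetical function name `ld_r2`; it is not TASSEL's implementation.

```python
def ld_r2(site_a, site_b):
    """Squared allele frequency correlation between two biallelic sites.

    site_a, site_b: equal-length lists of 0/1 allele codes, one entry per
    haplotype (or inbred line); None marks a missing call.
    """
    # Use only haplotypes typed at both sites (no imputation)
    pairs = [(a, b) for a, b in zip(site_a, site_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return None
    p_a = sum(a for a, _ in pairs) / n          # frequency of allele 1 at site A
    p_b = sum(b for _, b in pairs) / n          # frequency of allele 1 at site B
    p_ab = sum(1 for a, b in pairs if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return None  # at least one site is monomorphic in this subset
    d = p_ab - p_a * p_b                        # disequilibrium coefficient D
    return d * d / denom
```

Two sites that always carry the same allele give r² = 1, and two sites whose alleles are statistically independent give r² = 0.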

Genome-Wide Association Analysis

The SWPP association signals were identified based on 33,149 SNPs (Supplementary Table ) from 185 tested samples with the compressed mixed linear model (CMLM) in GAPIT (Lipka et al., 2012). The significance threshold was set using the Bonferroni method with α ≤ 0.05 (P ≤ 2.58 × 10⁻⁴) and was used to declare whether a significant association signal existed (Holm, 1979).
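
Because association results are usually reported as -log10(P), it can help to see how the stated P threshold maps onto that scale. The sketch below takes the threshold from the text as a parameter; the function names and the `example_snp` entry are hypothetical, while the two named SNP scores are taken from Table 1.

```python
import math


def neg_log10(p):
    """Convert a P value to the -log10 scale used in Manhattan plots."""
    return -math.log10(p)


def significant_snps(scores, p_threshold=2.58e-4):
    """Return SNP names whose -log10(P) meets the threshold.

    scores: list of (snp_name, neg_log10_p) tuples.
    """
    cutoff = neg_log10(p_threshold)
    return [name for name, score in scores if score >= cutoff]


scores = [
    ("Gm02:17792241", 4.91),   # Harbin 2016, from Table 1
    ("Gm18:23052511", 4.63),   # Shenyang 2016, from Table 1
    ("example_snp", 2.10),     # hypothetical non-significant SNP
]
print(round(neg_log10(2.58e-4), 3))
print(significant_snps(scores))
```

On this scale, P ≤ 2.58 × 10⁻⁴ corresponds to a -log10(P) of roughly 3.59 or higher.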

Prediction of Candidate Genes Controlling SWPP

According to the studies of Hwang et al. (2014), Han et al. (2015), and Zhao et al. (2017), the average LD decay distance of the soybean genome (Version: Glyma.Wm82.a2) was approximately 200 kbp. Thus, candidate genes located in the 200-kbp genomic region of each peak SNP were classified and annotated against the soybean reference genome Williams 82.
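
The window-based selection of candidate genes can be sketched as a simple interval filter. The function name `genes_near_snp`, the gene identifiers, and all coordinates below are hypothetical, and the 200-kbp region is assumed here to mean 100 kbp on either side of the peak SNP; the source does not spell out how the window is centered, so `max_distance_bp` is left as a parameter.

```python
def genes_near_snp(snp_pos, genes, max_distance_bp=100_000):
    """Return (gene_id, distance_bp) for genes within the window, nearest first.

    genes: list of (gene_id, start, end) on the same chromosome as the SNP.
    """
    hits = []
    for gene_id, start, end in genes:
        if start <= snp_pos <= end:
            distance = 0  # SNP lies inside the gene
        else:
            distance = min(abs(snp_pos - start), abs(snp_pos - end))
        if distance <= max_distance_bp:
            hits.append((gene_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical coordinates, for illustration only
example_genes = [
    ("geneA", 1_010_000, 1_012_000),
    ("geneB", 880_000, 890_000),
    ("geneC", 999_000, 1_001_000),
]
print(genes_near_snp(1_000_000, example_genes))
```

Distances are measured to the nearest gene boundary, which is one of several reasonable conventions.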

Haplotype Analysis of Candidate Genes

Based on the genome annotation, SNPs were classified as located in exonic regions (overlapping with a coding exon), splicing sites (within 2 bp of a splicing junction), 5′UTRs and 3′UTRs, intronic regions (overlapping with an intron), upstream and downstream regions (within a 1 kb region upstream or downstream from the transcription start site), or intergenic regions. SNPs in coding exons were further grouped into synonymous SNPs (causing no amino acid change) and nonsynonymous SNPs (causing an amino acid change). Variation in these regions (except intergenic regions) of the candidate genes, identified among the thirty lines from the genome resequencing data, was analyzed using the General Linear Model (GLM) method in TASSEL version 3.0 (Bradbury et al., 2007) to identify SWPP-related SNPs and haplotypes. SNPs were declared to affect SWPP significantly when the test statistic reached P < 0.01.

Results

Statistical and Variation Analysis of SWPP

The SWPP of the 185 tested soybean accessions, grown in multiple locations over two years, was determined. The analysis of SWPP showed that the effects of genotype, environment, and genotype-by-environment interaction were significant. The skewness and kurtosis of SWPP across the six environments were both less than 1, indicating continuous variation and a near-normal distribution (Figure ). Therefore, the SWPP data in the present study were suitable for the subsequent GWAS.

Figure: Distribution of seed weight per plant (SWPP) among 185 soybean accessions.
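
A minimal sketch of how skewness and kurtosis might be checked for a trait is given below. It uses population (moment-based) estimators and reports excess kurtosis, which is an assumption, since the source does not state which estimator was used; the function name `skewness_kurtosis` is hypothetical.

```python
def skewness_kurtosis(values):
    """Return (skewness, excess kurtosis) using central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n   # variance (population)
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical")
    skew = m3 / m2 ** 1.5
    excess_kurtosis = m4 / m2 ** 2 - 3               # 0 for a normal distribution
    return skew, excess_kurtosis


def looks_near_normal(values, limit=1.0):
    """Apply the 'both less than 1' criterion to the absolute values."""
    skew, kurt = skewness_kurtosis(values)
    return abs(skew) < limit and abs(kurt) < limit
```

Comparing absolute values against 1 is one reading of the criterion in the text; other conventions would need a different `limit` or estimator.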

Specific-Locus Amplified Fragment Sequencing (SLAF-seq) and Genotyping

The selected population contained 185 diverse accessions, primarily collected from China. The genomic DNA from all tested accessions was extracted and partially sequenced based on the SLAF-seq approach. For each tested sample, a mean of 49,571 high-quality tags (or SLAFs) was obtained from 153 million paired-end reads with a 45-bp read length and 6.51-fold sequencing depth. A total of 33,149 SNPs with MAF ≥ 0.05 were generated from the high-quality tags and subsequently used to perform GWAS for SWPP. The obtained SNPs were distributed among the 20 soybean chromosomes, resulting in an average marker density of one SNP per 28.7 kbp. Chr. 6 and Chr. 11 carried the most (3,556) and the fewest (638) SNPs, respectively (Figure ).

Figure: Distribution of SNP markers across 20 soybean chromosomes.

Extent of Linkage Disequilibrium (LD) and Analysis of Population Structure

The average distance of LD decay was analyzed to characterize the mapping resolution for GWAS, and LD decayed differently among the 20 chromosomes. The mean LD decay of the panel was estimated at 214 kbp, the distance at which r² dropped to half of its maximum value (Figure ). To assess the population stratification of the association panel, principal component analysis and kinship analysis were conducted based on all 33,149 SNP markers. The first three PCs explained 13.83% of the genetic variation. A drastic decline in the genetic variation appeared at PC2 (Figure ). However, analysis of the variation of the first 10 PCs revealed an inflection point at PC3 (Figure ). Thus, these results suggested that the first three PCs mainly dominated the population structure. Additionally, the heatmap of the kinship matrix revealed low levels of genetic relatedness among the 185 tested samples (Figure ).

Figure: Linkage disequilibrium (LD), principal component, and kinship analyses of soybean genetic data. (A) LD decay of the genome-wide association study (GWAS) population. (B) The first three principal components of more than 30,000 SNPs used in the GWAS. (C) Population structure of soybean germplasm collection reflected by principal components. (D) A heatmap of the kinship matrix of the 185 soybean accessions calculated from the same SNPs.
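
The "half of maximum r²" criterion for LD decay can be illustrated with a binned decay curve. The sketch below assumes the curve is already summarized as (distance, mean r²) pairs sorted by distance and interpolates linearly between bins; the function name `half_decay_distance` and the example values are hypothetical and are not the study's data.

```python
def half_decay_distance(curve):
    """Distance at which mean r2 first falls to half of its maximum.

    curve: list of (distance_kbp, mean_r2) sorted by increasing distance.
    Returns None if r2 never drops to half of its maximum.
    """
    max_r2 = max(r2 for _, r2 in curve)
    target = max_r2 / 2
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 > target >= r1:
            # Linear interpolation between the two neighbouring bins
            return d0 + (r0 - target) * (d1 - d0) / (r0 - r1)
    return None


# Hypothetical decay curve, for illustration only
example_curve = [(0, 0.80), (100, 0.60), (200, 0.42), (300, 0.30)]
print(half_decay_distance(example_curve))
```

In practice a smoothed or fitted decay curve is often used instead of raw bins, which would change the estimate slightly.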

Quantitative Trait Nucleotide (QTN) Associated With SWPP Evaluated by GWAS

A total of 20 association signals, distributed on 12 of the 20 soybean chromosomes, were detected by CMLM in the present study (Figure and Table ). Among these signals, only one QTN (Gm18:23052511 on Chr.18) was located in the linked region of a known SWPP QTL. However, because SWPP is a specific seed weight trait, another 11 SNPs overlapped or were located in the linked regions of known seed weight QTL from the SoyBase databank, particularly QTL for 100-seed weight (Table ). The remaining 8 association signals were regarded as novel loci, reported for SWPP for the first time in the present study (Table ). The average SWPP of the tested samples carrying each of the two alleles was compared (Table ), and the differences were large enough that the appropriate alleles might be effectively applied in the marker-assisted selection (MAS) of soybean cultivars with suitable SWPP.

Figure: Manhattan plot of association mapping of the SWPP in soybean.

The genes located in the 200-kbp genomic region of each peak SNP of the identified loci were considered as candidate genes, consistent with a mean LD decay distance of 214 kbp for the entire association panel. Approximately 126 candidate genes were identified. Among these genes, 26 genes had no functional annotation, and one gene had unknown function domains. The remaining 99 genes were classified into different groups by MAPMAN (Thimm et al., 2004) to clarify their potential functions.
A total of 16 categories were scanned, including co-factor and vitamin metabolism, misc, RNA regulation of transcription, DNA synthesis/chromatin structure, protein synthesis/modification/degradation, signaling, development, transport, amino acid metabolism, hormone metabolism, abiotic stress, cell wall, major CHO metabolism, nucleotide metabolism, other groups, and unassigned genes (Supplementary Figure ). The factors that influenced SWPP were almost the same as those that influence seed weight, and seed size was the main component for determining seed weight. Thus, the factors that could regulate the mechanism of seed size might be important elements in directly or indirectly adjusting SWPP (Wang et al., 2017). In some plants, several genes controlling seed size/weight have been identified, including the genes associated with hormone metabolism, such as auxin and cytokinin (Schruff et al., 2006; Li et al., 2013), various transcription factors associated with RNA regulation of transcription in plant development and maturation, such as TTG2, AP2, MINI3, and C2H2 (Garcia et al., 2005; Ohto et al., 2009; Costa et al., 2010; Yin et al., 2010; Cao et al., 2016), the genes associated with protein modification or degradation (particularly ubiquitin ligase genes) (Li et al., 2008; Xia et al., 2013; Wang et al., 2017), and regulatory genes dominating transport systems and other metabolic processes associated with plant growth and development, such as ABC transporters and amino acid metabolism (Wu et al., 2007; Kim et al., 2010; Less et al., 2010; Hildebrandt et al., 2015). 
Among these genes detected by GWAS in the present study, those associated with the RNA regulation of transcription, including different transcription factor families, such as AP2 and C2H2, were scanned and identified as the functional genes for SWPP, including Glyma.02G159600, Glyma.02G173900, Glyma.07G157600, Glyma.07G157800, Glyma.07G157900, Glyma.08G195300, Glyma.08G195500, Glyma.08G195600, Glyma.08G195700, Glyma.08G195800, Glyma.17G127400, Glyma.17G127700, Glyma.19G057700, and Glyma.19G067900 located on Chr.2, Chr.7, Chr.8, Chr.17, and Chr.19. Two brassinosteroid response factor genes (Glyma.10G129700 and Glyma.17G127100 with distances of 34.31 and 39.83 kbp from peak SNPs Gm10:34747895 and Gm17:10196755, respectively) were identified as candidate genes. Similarly, Glyma.10G130000, an auxin response factor gene located 44.84 kbp from SNP Gm10:34946492 on Chr.10, Glyma.15G240600, an ethylene factor gene located 31.54 kbp from SNP Gm15:45663581 on Chr.15, and Glyma.17G127500, a gibberellin factor gene located 15.58 kbp from SNP Gm17:10196755 on Chr.17, were all considered as the genes controlling SWPP. Three E3 ubiquitin ligase genes (Glyma.08G195200, Glyma.08G195400, and Glyma.08G195900 with distances of 40.42, 17.31, and 15.42 kbp, respectively, from the peak SNP Gm08:15768591 located on Chr.8), which are associated with protein modification or degradation, might also affect SWPP. An additional 10 genes belonging to transport systems (such as ABC transporters) and other main metabolic processes (such as amino acid metabolism, N-metabolism, and vitamin metabolism), including Glyma.02G159900, Glyma.08G196200, Glyma.12G163700, Glyma.12G163800, Glyma.14G207200, Glyma.14G207300, Glyma.14G207700, Glyma.14G207800, Glyma.17G127800, and Glyma.18G143700, were also selected and might also contribute to SWPP. To predict the possible roles of candidate genes associated with SWPP, haplotype analysis of the 99 genes was performed. 
A total of 578 SNPs in the 99 candidate genes were found among the thirty resequenced soybean lines (MAF > 0.1). Of these, 44 SNPs from nine genes were significantly associated with SWPP in multiple environments (Figure and Supplementary Table ). Glyma.07G157900, with only one SNP, was not shown in the figure (Figure ). Two haplotypes were identified for each of the nine genes, and SWPP differed significantly or highly significantly between the two haplotypes of each gene (Figure ). Glyma.07G157900, near Gm07:19488264, was detected at Shenyang in 2016 and in the Shenyang average over 2015 and 2016; it is located in the genomic region of the known locus “seed weight 12-4” (Csanadi et al., 2001). A total of seven genes near Gm08:15768591 and Gm08:15768603 were detected, which overlapped the region of the known loci “seed weight 22-1” (Zhang et al., 2004) and “seed weight 35-1” (Han et al., 2012). Of the seven genes, Glyma.08G195200 and Glyma.08G195900 were detected at Harbin in 2016 and in the Harbin average over 2015 and 2016. Glyma.08G195300 and Glyma.08G195400 were detected at Shenyang in 2015, at Harbin in 2016, and in the Harbin average over 2015 and 2016. Glyma.08G195500 and Glyma.08G195700 were detected at Harbin in 2015, at Harbin in 2016, and in the Harbin average over 2015 and 2016. Glyma.08G195600 was detected at Shenyang in 2015, in the Shenyang average over 2015 and 2016, at Harbin in 2016, and in the Harbin average over 2015 and 2016. Glyma.18G143700, near Gm18:23052511, overlapped the region of the known locus “SWPP 6-6” (Yao et al., 2015) and was detected at Shenyang in 2015, at Shenyang in 2016, and in the Shenyang average over 2015 and 2016. These genes and beneficial haplotypes might be valuable for MAS in regulating SWPP of soybean.

Figure: Candidate gene-based association. Gene-based association analysis of candidate genes with SNPs that were significantly correlated to SWPP. HB, Harbin; SY, Shenyang; Ave, average seed weight per plant in 2015 and 2016.

Figure: Haplotype analysis of genes related to SWPP. The ∗ and ∗∗ indicate significance of ANOVA at p < 0.05 and p < 0.01, respectively. HB, Harbin; GZ, Gongzhuling; SY, Shenyang; Ave, average seed weight per plant in 2015 and 2016.
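
The comparison of SWPP between two haplotypes of a gene relies on ANOVA. As a hedged illustration of what that test computes, the sketch below derives the one-way ANOVA F statistic for any number of groups using only the standard library. The function name `one_way_anova_f` and the example values are hypothetical, and the P value (which requires the F distribution) is deliberately left out.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA.

    groups: list of lists of trait values, one list per haplotype.
    Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by differences between haplotype means
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Residual variation within each haplotype group
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical SWPP values for two haplotypes, for illustration only
hap1 = [1.0, 2.0, 3.0]
hap2 = [4.0, 5.0, 6.0]
print(one_way_anova_f([hap1, hap2]))
```

With exactly two haplotypes, this F statistic equals the square of the two-sample t statistic with pooled variance, so either test leads to the same conclusion.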

Discussion

Seed weight per plant, controlled by multiple genes, is an important component of seed yield in soybean. Thus far, some QTL identified by linkage analysis have been recorded in the SoyBase databank. However, few studies have dissected the genetic basis of SWPP in soybean via GWAS with high-throughput SNPs and diverse accessions across multiple environments, and even fewer candidate genes have been reported. In the present study, 185 soybean accessions collected widely from China were used to conduct GWAS with high-throughput SNPs in diverse environments. In total, twenty SNPs were identified on twelve different soybean chromosomes, and these SNPs might be valuable for breeding cultivars with appropriate SWPP. In addition to the QTL associated with SWPP in the SoyBase databank, the genes associated with seed weight were consulted to identify more accurately the genomic regions underlying the SWPP of soybean that were tagged by the twenty SNPs. Among the twenty SNPs, eight (Gm02:27731352 on Chr.2, Gm03:25789435 and Gm03:26388034 on Chr.3, Gm12:24101071 on Chr.12, Gm14:47326193 on Chr.14, Gm15:45663581 on Chr.15, and Gm19:19848537 and Gm19:19848932 on Chr.19) were novel loci first reported in the present study. The genomic region of Chr.19 that included two close loci (Gm19:19848537 and Gm19:19848932) might be a primary region associated with SWPP. Additionally, another twelve SNPs overlapped or were located near known QTL (Table ). Among these SNPs, one SNP (Gm18:23052511 on Chr.18) was identified in the same region reported by Mian et al. (1996), and this polymorphism was also located near the locus named SWPP 6-6, which had been reported by Yao et al. (2015).
A set of seven SNPs, including Gm02:17792241 on Chr.2, Gm08:15768591 and Gm08:15768603 on Chr.8, Gm10:34747895 and Gm10:34946492 on Chr.10, Gm12:31782089 on Chr.12, and Gm20:29716176 on Chr.20, were located in genomic regions previously reported by Han et al. (2012) using three recombinant inbred line (RIL) populations. Among these seven SNPs, the two close loci on Chr.8 (Gm08:15768591 and Gm08:15768603), Gm12:31782089, and Gm20:29716176 were also located near other previously reported loci (Sebolt et al., 2000; Zhang et al., 2004; Kato et al., 2014). Additionally, four genomic regions (Gm07:19488264 on Chr.7, Gm14:24306475 on Chr.14, Gm17:10196755 on Chr.17, and Gm19:11082560 on Chr.19) were also identified in previous studies (Csanadi et al., 2001; Hoeck et al., 2003; Teng et al., 2009). Moreover, some SNPs in candidate genes detected by gene-based association analysis were close to loci that were verified both by GWAS in the present study and by previous studies. The main loci were Gm07:19488264, Gm08:15768591, Gm08:15768603, and Gm18:23052511. These four loci might be major genomic regions containing candidate genes associated with SWPP. GWAS has become an effective method to identify and confirm genes within a suitable LD block (Zhao et al., 2017). In the present study, a total of 99 candidate genes were identified in the 200-kbp genomic regions around the twenty association signals detected by GWAS, with an LD block of approximately 214 kbp in length. Among these genes, those involved in hormone pathways, RNA regulation of transcription in plant development and maturation (TTG2, AP2, MINI3, and C2H2, for example), ubiquitin, transport systems, and other metabolic processes associated with plant growth (ABC transporters and amino acid metabolism, for example) were considered important factors in the regulation of SWPP.
Therefore, fourteen genes (Glyma.02G159600, Glyma.02G173900, Glyma.07G157600, Glyma.07G157800, Glyma.07G157900, Glyma.08G195300, Glyma.08G195500, Glyma.08G195600, Glyma.08G195700, Glyma.08G195800, Glyma.17G127400, Glyma.17G127700, Glyma.19G057700, and Glyma.19G067900), mainly belonging to the AP2 and C2H2 transcription factor families, were proposed to be responsible for the SWPP of soybean. Another five novel genes associated with the brassinosteroid, auxin, ethylene, and gibberellin pathways were also regarded as candidate genes, including Glyma.10G129700, Glyma.10G130000, Glyma.15G240600, Glyma.17G127100, and Glyma.17G127500. Because E3 ubiquitin ligase genes catalyze the ubiquitination of proteins that regulate plant growth associated with SWPP (Xia et al., 2013; Yao et al., 2017), Glyma.08G195200, Glyma.08G195400, and Glyma.08G195900, belonging to the E3 ubiquitin family, might act as important factors in controlling SWPP. Additionally, in Arabidopsis, the regulators of amino acid metabolism and ABC transporters, which participate in auxin transport, play a key role in regulating plant growth and seed maturation (Wu et al., 2007; Less et al., 2010). Thus, Glyma.02G159900, Glyma.08G196200, Glyma.12G163700, Glyma.12G163800, Glyma.14G207200, Glyma.14G207300, Glyma.14G207700, Glyma.14G207800, Glyma.17G127800, and Glyma.18G143700, which belong to transport systems (ABC transporters, for example) and other main metabolic processes (amino acid metabolism, for example), might be novel genes associated with SWPP. To detect the candidate genes controlling SWPP more accurately, haplotype analysis of the candidate genes was performed. As a result, nine genes and 15 beneficial haplotypes were detected by gene-based association analysis. The definitive functions of all candidate genes will be examined and verified in further studies.

Author Contributions

YJ and XZ conceived the study and contributed to population development. JW contributed to phenotypic evaluation. WT and LQ contributed to genotyping. YH and WL contributed to the experimental design and drafting the manuscript. All authors contributed to and approved the final manuscript.

Conflict of Interest Statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
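
Table 1

Peak SNP associated with SWPP and the evaluation of beneficial alleles.

| SNPᵃ | Chr.ᵇ | Position | Location | Year | -log10(P) | MAFᶜ | Allele 1 | Allele 2 | Average SWPP of accessions with allele 1ᵈ | Average SWPP of accessions with allele 2ᵈ | Average SWPP of populationᵈ |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Gm02:17792241 | 2 | 17792241 | Harbin | 2016 | 4.91 | 0.05 | G | T | 27.33 | 19.71 | 20.07 |
| | | | Harbin | Average | 3.63 | | | | 27.53 | 18.66 | 19.01 |
| Gm02:27731352 | 2 | 27731352 | Shenyang | 2016 | 3.55 | 0.05 | C | A | 32.57 | 25.37 | 25.91 |
| | | | Shenyang | Average | 3.64 | | | | 30.4 | 24.95 | 25.48 |
| Gm03:25789435 | 3 | 25789435 | Shenyang | 2016 | 4.22 | 0.06 | A | G | 31.14 | 25.73 | 25.91 |
| | | | Shenyang | Average | 4.85 | | | | 31.81 | 25.28 | 25.48 |
| Gm03:26388034 | 3 | 26388034 | Shenyang | 2016 | 3.63 | 0.05 | G | T | 26.20 | 19.19 | 25.91 |
| | | | Shenyang | Average | 3.58 | | | | 25.73 | 19.83 | 25.48 |
| Gm07:19488264 | 7 | 19488264 | Shenyang | 2016 | 3.55 | 0.06 | G | T | 28.00 | 25.84 | 25.91 |
| | | | Shenyang | Average | 4.04 | | | | 27.1 | 25.34 | 25.48 |
| Gm08:15768591 | 8 | 15768591 | Harbin | 2016 | 4.72 | 0.22 | T | C | 23.07 | 19.15 | 20.07 |
| | | | Harbin | Average | 4.22 | | | | 21.75 | 18.19 | 19.01 |
| Gm08:15768603 | 8 | 15768603 | Harbin | 2016 | 4.04 | 0.27 | A | G | 21.76 | 19.42 | 20.07 |
| | | | Harbin | Average | 3.75 | | | | 20.62 | 18.39 | 19.01 |
| Gm10:34747895 | 10 | 34747895 | Gongzhuling | 2016 | 3.70 | 0.07 | T | C | 20.10 | 19.95 | 20.04 |
| | | | Gongzhuling | Average | 3.76 | | | | 19.64 | 18.9 | 19.58 |
| Gm10:34946492 | 10 | 34946492 | Gongzhuling | 2016 | 4.47 | 0.05 | C | T | 20.07 | 19.68 | 20.04 |
| | | | Gongzhuling | Average | 4.25 | | | | 19.65 | 18.22 | 19.58 |
| Gm12:24101071 | 12 | 24101071 | Shenyang | 2016 | 3.61 | 0.06 | C | A | 30.35 | 25.83 | 25.91 |
| | | | Shenyang | Average | 3.82 | | | | 30.47 | 25.38 | 25.48 |
| Gm12:31782089 | 12 | 31782089 | Shenyang | 2016 | 3.84 | 0.05 | C | A | 28.76 | 25.76 | 25.91 |
| | | | Shenyang | Average | 4.57 | | | | 27.71 | 25.45 | 25.48 |
| Gm14:24306475 | 14 | 24306475 | Shenyang | 2016 | 4.41 | 0.05 | G | T | 26.40 | 25.88 | 25.91 |
| | | | Shenyang | Average | 5.07 | | | | 25.55 | 25.39 | 25.48 |
| Gm14:47326193 | 14 | 47326193 | Shenyang | 2016 | 4.64 | 0.06 | T | G | 25.93 | 23.37 | 25.91 |
| | | | Shenyang | Average | 4.03 | | | | 25.63 | 22.96 | 25.48 |
| Gm15:45663581 | 15 | 45663581 | Gongzhuling | 2016 | 4.29 | 0.07 | T | A | 26.24 | 19.86 | 20.04 |
| | | | Gongzhuling | Average | 4.11 | | | | 24.28 | 19.46 | 19.58 |
| Gm17:10196755 | 17 | 10196755 | Gongzhuling | 2016 | 4.82 | 0.05 | A | C | 20.91 | 19.97 | 20.04 |
| | | | Gongzhuling | Average | 4.74 | | | | 19.68 | 19.52 | 19.58 |
| Gm18:23052511 | 18 | 23052511 | Shenyang | 2016 | 4.63 | 0.05 | C | A | 25.97 | 19.97 | 25.91 |
| | | | Shenyang | Average | 5.29 | | | | 25.53 | 20.37 | 25.48 |
| Gm19:11082560 | 19 | 11082560 | Gongzhuling | 2015 | 3.53 | 0.41 | C | T | 19.88 | 19.27 | 19.58 |
| | | | Gongzhuling | Average | 3.62 | | | | 19.67 | 19.51 | 19.58 |
| Gm19:19848537 | 19 | 19848537 | Shenyang | 2016 | 4.07 | 0.06 | G | A | 31.79 | 25.53 | 25.91 |
| | | | Shenyang | Average | 3.73 | | | | 28.96 | 25.26 | 25.48 |
| Gm19:19848932 | 19 | 19848932 | Shenyang | 2016 | 4.13 | 0.06 | G | A | 32.46 | 25.49 | 25.91 |
| | | | Shenyang | Average | 4.25 | | | | 28.83 | 25.26 | 25.48 |
| Gm20:29716176 | 20 | 29716176 | Gongzhuling | 2015 | 4.61 | 0.05 | A | C | 19.63 | 19.18 | 19.58 |
| | | | Gongzhuling | 2016 | 5.17 | | | | 20.11 | 19.36 | 20.04 |
| | | | Gongzhuling | Average | 5.77 | | | | 19.61 | 19.38 | 19.58 |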
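
The allele comparison in Table 1 amounts to subtracting the mean SWPP of accessions carrying one allele from the mean of those carrying the other. The sketch below illustrates this with two rows taken from Table 1; the function name `allele_effect` is hypothetical, and the difference is only a descriptive contrast, not a significance test.

```python
def allele_effect(mean_allele1, mean_allele2):
    """Return (difference, favorable allele) for increasing SWPP."""
    diff = round(mean_allele1 - mean_allele2, 2)
    if diff > 0:
        favorable = "allele 1"
    elif diff < 0:
        favorable = "allele 2"
    else:
        favorable = "neither"
    return diff, favorable


# Mean SWPP values copied from Table 1 (2016 rows)
rows = {
    "Gm02:17792241": (27.33, 19.71),  # Harbin 2016
    "Gm10:34747895": (20.10, 19.95),  # Gongzhuling 2016
}
for snp, (a1, a2) in rows.items():
    print(snp, allele_effect(a1, a2))
```

The two rows show how much the size of the allelic contrast varies between loci, which matters when choosing alleles for marker-assisted selection.
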
Table 2

Significant SNPs and candidate genes associated with SWPP.

| SNP^a | Chr.^b | Position | Known QTL | References | Gene | Distance to SNP (Kbp) | Functional annotations |
|---|---|---|---|---|---|---|---|
| Gm02:17792241 | 2 | 17792241 | Seed weight 34-3 | Han et al., 2012 | Glyma.02G159500 | 45.00 | C2H2 and C2HC zinc fingers superfamily protein |
| | | | | | Glyma.02G159600 | 41.97 | hAT transposon superfamily |
| | | | | | Glyma.02G159900 | 2.34 | pleiotropic drug resistance 11 |
| | | | | | Glyma.02G160000 | 17.07 | zinc ion binding; nucleic acid binding |
| Gm02:27731352 | 2 | 27731352 | | | Glyma.02G173900 | 29.78 | KH domain-containing protein/zinc finger (CCCH type) family protein |
| | | | | | Glyma.02G174100 | 54.94 | SNF2 domain-containing protein/helicase domain-containing protein |
| | | | | | Glyma.02G174200 | 74.67 | Nodulin MtN21/EamA-like transporter family protein |
| Gm03:25789435 | 3 | 25789435 | | | Glyma.03G086900 | 51.99 | GDSL-like Lipase/Acylhydrolase superfamily protein |
| Gm03:26388034 | 3 | 26388034 | | | Glyma.03G088500 | 85.93 | RNA polymerase II, Rpb4, core protein |
| | | | | | Glyma.03G088800 | 58.73 | AGC kinase family protein |
| | | | | | Glyma.03G088900 | 85.80 | auxin-responsive family protein |
| Gm07:19488264 | 7 | 19488264 | Seed weight 12-4 | Csanadi et al., 2001 | Glyma.07G157500 | 67.24 | hAT dimerization domain-containing protein / transposase-related |
| | | | | | Glyma.07G157600 | 25.92 | Nucleic acid-binding, OB-fold-like protein |
| | | | | | Glyma.07G157700 | 14.16 | PIF1 helicase |
| | | | | | Glyma.07G157800 | 0.02 | Cyclophilin-like peptidyl prolyl cis-trans isomerase family protein |
| | | | | | Glyma.07G157900 | 27.01 | SNF2 domain-containing protein / helicase domain-containing protein |
| Gm08:15768591 | 8 | 15768591 | Seed weight 22-1/Seed weight 35-1 | Zhang et al., 2004/Han et al., 2012 | Glyma.08G194600 | 87.34 | Multidrug resistance associated protein 6 |
| Gm08:15768603 | 8 | 15768603 | | | | | |
| | | | | | Glyma.08G194900 | 63.78 | Pyridoxal phosphate phosphatase related protein |
| | | | | | Glyma.08G195200 | 40.42 | Protein phosphatase 2A, regulatory subunit PR55 |
| | | | | | Glyma.08G195300 | 22.24 | DOF zinc finger protein 1 |
| | | | | | Glyma.08G195400 | 17.31 | F-box/RNI-like/FBD-like domains containing protein |
| | | | | | Glyma.08G195500 | 10.81 | C2H2 and C2HC zinc fingers superfamily protein |
| | | | | | Glyma.08G195600 | 9.55 | F-box/RNI-like superfamily protein |
| | | | | | Glyma.08G195700 | 2.34 | C2H2 and C2HC zinc fingers superfamily protein |
| | | | | | Glyma.08G195800 | 9.82 | C2H2 and C2HC zinc fingers superfamily protein |
| | | | | | Glyma.08G195900 | 15.42 | Ubiquitin-specific protease 6 |
| | | | | | Glyma.08G196100 | 34.08 | Dentin sialophosphoprotein-related |
| | | | | | Glyma.08G196200 | 45.51 | Amino acid dehydrogenase family protein |
| | | | | | Glyma.08G196300 | 56.32 | Ubiquitin conjugating enzyme family protein |
| | | | | | Glyma.08G196400 | 71.62 | Tudor / PWWP / MBT superfamily protein |
| | | | | | Glyma.08G196500 | 80.14 | Ribosomal L28e protein family |
| Gm10:34747895 | 10 | 34747895 | Seed weight 34-8 | Han et al., 2012 | Glyma.10G129300 | 92.46 | Ubiquitin C-terminal hydrolase 3 |
| Gm10:34946492 | 10 | 34946492 | Seed weight 34-8 | Han et al., 2012 | Glyma.10G129700 | 85.64 | Disease resistance family protein / LRR family protein |
| | | | | | Glyma.10G129800 | 34.31 | Myosin family protein with Dil domain |
| | | | | | Glyma.10G130000 | 44.84 | Aluminum-induced protein with YGL and LRDR motifs |
| | | | | | Glyma.10G130100 | 57.1 | Calcium-dependent lipid binding (CaLB domain) family protein |
| Gm12:24101071 | 12 | 24101071 | Seed weight 35-4/Seed weight 50-15 | Han et al., 2012/Kato et al., 2014 | Glyma.12G154400 | 85.95 | Cytochrome c oxidase 17 |
| Gm12:31782089 | 12 | 31782089 | | | | | |
| | | | | | Glyma.12G163800 | 48.65 | Major facilitator superfamily protein |
| | | | | | Glyma.12G164100 | 80.92 | Auxin response factor 1 |
| | | | | | Glyma.12G164200 | 92.20 | EMBRYO DEFECTIVE 140 |
| Gm14:24306475 | 14 | 24306475 | Seed weight 13-2 | Hoeck et al., 2003 | Glyma.14G136800 | 59.90 | ATP phosphoribosyl transferase 2 |
| Gm14:47326193 | 14 | 47326193 | | | | | |
| | | | | | Glyma.14G136900 | 71.77 | Cellulose-synthase-like C5 |
| | | | | | Glyma.14G207000 | 61.45 | Yippee family putative zinc-binding protein |
| | | | | | Glyma.14G207100 | 59.98 | Tetratricopeptide repeat (TPR)-like superfamily protein |
| | | | | | Glyma.14G207200 | 35.3 | ABC transporter of the mitochondrion 3 |
| | | | | | Glyma.14G207300 | 13.48 | Tetratricopeptide repeat (TPR)-like superfamily protein |
| | | | | | Glyma.14G207500 | 5.30 | TRAM, LAG1 and CLN8 (TLC) lipid-sensing domain containing protein |
| | | | | | Glyma.14G207600 | 11.92 | Cystathionine beta-synthase (CBS) family protein |
| | | | | | Glyma.14G207700 | 39.72 | Thiamine pyrophosphate dependent pyruvate decarboxylase family protein |
| | | | | | Glyma.14G208000 | 52.97 | Late embryogenesis abundant (LEA) hydroxyproline-rich glycoprotein family |
| | | | | | Glyma.14G208200 | 65.81 | AMP-dependent synthetase and ligase family protein |
| | | | | | Glyma.14G208400 | 77.79 | NC domain containing protein related |
| | | | | | Glyma.14G208500 | 86.61 | Auxin response factor 8 |
| Gm15:45663581 | 15 | 45663581 | | | Glyma.15G240600 | 31.54 | Senescence-related gene 1 |
| | | | | | Glyma.15G240700 | 18.30 | MAP kinase 19 |
| | | | | | Glyma.15G240800 | 51.16 | pfkbp-like carbohydrate kinase family protein |
| | | | | | Glyma.15G241000 | 79.81 | Multidrug resistance-associated protein 6 |
| Gm17:10196755 | 17 | 10196755 | Seed weight 13-5/Seed weight 49-10 | Hoeck et al., 2003/Teng et al., 2009 | Glyma.17G126800 | 74.02 | Zinc finger C-x8-C-x5-C-x3-H type family protein |
| | | | | | Glyma.17G126900 | 67.85 | RPM1-interacting protein 4 (RIN4) family protein |
| | | | | | Glyma.17G127000 | 56.77 | GDSL-like Lipase / Acylhydrolase superfamily protein |
| | | | | | Glyma.17G127100 | 39.83 | 26S proteasome, regulatory subunit Rpn7; Proteasome component (PCI) domain |
| | | | | | Glyma.17G127200 | 35.30 | Protein kinase superfamily protein |
| | | | | | Glyma.17G127400 | 24.28 | Nucleoside diphosphate kinase 2 |
| | | | | | Glyma.17G127700 | 8.92 | Putative endonuclease or glycosyl hydrolase |
| | | | | | Glyma.17G127800 | 14.89 | Lysine histidine transporter 1 |
| | | | | | Glyma.17G128000 | 37.93 | Malate synthase |
| | | | | | Glyma.17G128100 | 47.88 | Malate synthase |
| | | | | | Glyma.17G128200 | 54.33 | Protein kinase superfamily protein |
| | | | | | Glyma.17G128300 | 59.55 | Adenylate kinase 1 |
| | | | | | Glyma.17G128400 | 70.67 | Homolog of nucleolar protein NOP56 |
| | | | | | Glyma.17G128500 | 76.83 | RNA-metabolizing metallo beta lactamase family protein |
| | | | | | Glyma.17G128600 | 93.68 | Arginosuccinate synthase family |
| | | | | | Glyma.17G128700 | 99.79 | Gamma-irradiation and mitomycin c induced 1 |
| Gm18:23052511 | 18 | 23052511 | Seed weight per plant 6-6 | Yao et al., 2015 | Glyma.18G143700 | 10.62 | MATE efflux family protein |
| Gm19:11082560 | 19 | 11082560 | Seed weight 13-10 | Hoeck et al., 2003; Sebolt et al., 2000/Han et al., 2012 | Glyma.19G057700 | 17.42 | Germin-like protein 2 |
| Gm19:19848537 | 19 | 19848537 | Seed weight 9-1/Seed weight 34-5 | | Glyma.19G057800 | 69.03 | BTB-POZ and MATH domain 6 |
| Gm19:19848932 | 19 | 19848932 | | | Glyma.19G067900 | 24.55 | ABA-responsive element binding protein 3 |
| Gm20:29716176 | 20 | 29716176 | | | Glyma.20G079900 | 61.09 | Coenzyme F420 hydrogenase family / dehydrogenase, beta subunit family |
  40 in total

Review 1.  Amino Acid Catabolism in Plants.

Authors:  Tatjana M Hildebrandt; Adriano Nunes Nesi; Wagner L Araújo; Hans-Peter Braun
Journal:  Mol Plant       Date:  2015-09-15       Impact factor: 13.164

2.  Identification of an ABCB/P-glycoprotein-specific inhibitor of auxin transport by chemical genomics.

Authors:  Jun-Young Kim; Sina Henrichs; Aurélien Bailly; Vincent Vincenzetti; Valpuri Sovero; Stefano Mancuso; Stephan Pollmann; Daehwang Kim; Markus Geisler; Hong-Gil Nam
Journal:  J Biol Chem       Date:  2010-05-14       Impact factor: 5.157

3.  Principal transcriptional regulation and genome-wide system interactions of the Asp-family and aromatic amino acid networks of amino acid metabolism in plants.

Authors:  Hadar Less; Ruthie Angelovici; Vered Tzin; Gad Galili
Journal:  Amino Acids       Date:  2010-04-04       Impact factor: 3.520

4.  Stress-induced co-expression of two alternative oxidase (VuAox1 and 2b) genes in Vigna unguiculata.

Authors:  José Hélio Costa; Erika Freitas Mota; Mariana Virginia Cambursano; Martin Alexander Lauxmann; Luciana Maia Nogueira de Oliveira; Maria da Guia Silva Lima; Elena Graciela Orellano; Dirce Fernandes de Melo
Journal:  J Plant Physiol       Date:  2009-12-14       Impact factor: 3.549

5.  MAPMAN: a user-driven tool to display genomics data sets onto diagrams of metabolic pathways and other biological processes.

Authors:  Oliver Thimm; Oliver Bläsing; Yves Gibon; Axel Nagel; Svenja Meyer; Peter Krüger; Joachim Selbig; Lukas A Müller; Seung Y Rhee; Mark Stitt
Journal:  Plant J       Date:  2004-03       Impact factor: 6.417

6.  Mutations in Arabidopsis multidrug resistance-like ABC transporters separate the roles of acropetal and basipetal auxin transport in lateral root development.

Authors:  Guosheng Wu; Daniel R Lewis; Edgar P Spalding
Journal:  Plant Cell       Date:  2007-06-08       Impact factor: 11.277

7.  QTL mapping of ten agronomic traits on the soybean ( Glycine max L. Merr.) genetic map and their association with EST markers.

Authors:  W-K Zhang; Y-J Wang; G-Z Luo; J-S Zhang; C-Y He; X-L Wu; J-Y Gai; S-Y Chen
Journal:  Theor Appl Genet       Date:  2004-01-22       Impact factor: 5.699

8.  Identification of QTL with large effect on seed weight in a selective population of soybean with genome-wide association and fixation index analyses.

Authors:  Long Yan; Nicolle Hofmann; Shuxian Li; Marcio Elias Ferreira; Baohua Song; Guoliang Jiang; Shuxin Ren; Charles Quigley; Edward Fickus; Perry Cregan; Qijian Song
Journal:  BMC Genomics       Date:  2017-07-12       Impact factor: 3.969

9.  SLAF-seq: an efficient method of large-scale de novo SNP discovery and genotyping using high-throughput sequencing.

Authors:  Xiaowen Sun; Dongyuan Liu; Xiaofeng Zhang; Wenbin Li; Hui Liu; Weiguo Hong; Chuanbei Jiang; Ning Guan; Chouxian Ma; Huaping Zeng; Chunhua Xu; Jun Song; Long Huang; Chunmei Wang; Junjie Shi; Rui Wang; Xianhu Zheng; Cuiyun Lu; Xiaowu Wang; Hongkun Zheng
Journal:  PLoS One       Date:  2013-03-19       Impact factor: 3.240

10.  Genetic characteristics of soybean resistance to HG type 0 and HG type 1.2.3.5.7 of the cyst nematode analyzed by genome-wide association mapping.

Authors:  Yingpeng Han; Xue Zhao; Guanglu Cao; Yan Wang; Yinghui Li; Dongyuan Liu; Weili Teng; Zhiwu Zhang; Dongmei Li; Lijuan Qiu; Hongkun Zheng; Wenbin Li
Journal:  BMC Genomics       Date:  2015-08-13       Impact factor: 3.969

