Ensuring Global Food Security by Improving Protein Content in Major Grain Legumes Using Breeding and 'Omics' Tools.

Uday C Jha1, Harsh Nayyar2, Swarup K Parida3, Rupesh Deshmukh4, Eric J B von Wettberg5, Kadambot H M Siddique6.   

Abstract

Grain legumes are a rich source of dietary protein for millions of people globally and thus a key driver of global food security. Biofortifying legumes for plant-based dietary protein is an economical strategy for alleviating rising malnutrition-related problems and hidden hunger. Malnutrition from protein deficiency is predominant in human populations with an insufficient daily intake of animal or other dietary protein due to economic limitations, especially in developing countries. Therefore, enhancing grain legume protein content will help eradicate protein-related malnutrition in low-income and underprivileged countries. Here, we review the exploitable genetic variability for grain protein content in various major grain legumes for improving the protein content of high-yielding, low-protein genotypes. We highlight classical genetics-based inheritance of protein content in various legumes and discuss advances in molecular marker technology that have enabled the identification of quantitative trait loci controlling seed protein content (SPC) in biparental mapping populations and genome-wide association studies. We also review the progress of functional genomics in deciphering the underlying candidate gene(s) controlling SPC in various grain legumes and the role of proteomics and metabolomics in shedding light on the accumulation of various novel proteins and metabolites in high-protein legume genotypes. Lastly, we detail the scope of genomic selection, high-throughput phenotyping, emerging genome editing tools, and speed breeding protocols for enhancing SPC in grain legumes to achieve legume-based dietary protein security and thus reduce the global hunger risk.

Keywords:  QTL; biofortification; grain legume; molecular marker; protein

Year:  2022        PMID: 35887057      PMCID: PMC9325250          DOI: 10.3390/ijms23147710

Source DB:  PubMed          Journal:  Int J Mol Sci        ISSN: 1422-0067            Impact factor:   6.208


1. Introduction

Alarming trends of anthropogenic climate change and environmental deterioration jeopardize global crop yields, resource distribution, and ecosystems, resulting in global food insecurity and undernourishment in the growing human population [1]. An estimated 840 million people globally will be undernourished by 2030 [2]. The COVID-19 pandemic is likely to have raised this figure, deepening the hunger crisis. Dietary protein is an essential macronutrient for human growth and development, with infants requiring 1.52 g per kg body weight per day and adults requiring 0.80 g per kg body weight per day [3]. Apart from micronutrient deficiency, malnutrition from dietary protein deficiency causes ‘marasmus’, ‘kwashiorkor’, anemia, impaired immunity, and ‘environmental enteric dysfunction’, which are most prevalent in developing and low-income countries, especially in southern Asia and sub-Saharan Africa [4,5,6]. Most people in these regions predominantly consume maize, sorghum, and cassava in their daily diets, which are rich in starch but insufficient in protein [6,7]. Thus, many people in these regions, especially infants, do not consume the required daily protein, affecting their overall growth and development [5,6]. Notably, Europe imports 70% of the plant-based protein consumed by its human population [8], a trend that the increasing global human population will further exacerbate. Breeding crops, especially legumes, with high-quality traits such as seed protein content (SPC) is a promising approach for overcoming these challenges. Grain legumes are one of the richest sources of plant-based dietary protein, providing essential amino acids and supplying the increasing demand for protein-based human diets [9]. Grain legume seeds, popularly known as ‘poor man’s meat’, are the cheapest protein source [10,11,12]. In addition, legume-based protein could be instrumental in minimizing greenhouse gas emissions, helping to protect the environment [13].
Screening genetic variability for protein content in legume germplasm and crop wild relatives is the first step toward identifying high-protein grain legumes for the development of high-yielding, high-protein cultivars. A classical genetics-based approach can identify the inheritance pattern of high-protein gene(s) in various legumes. Advances in genomics have enabled the dissection of the genetic architecture of QTLs/gene(s) in various legumes through biparental mapping and genome-wide association studies. Moreover, the availability of complete reference genome assemblies and pangenomes of various legumes could assist in pinpointing high-protein genomic regions at the individual or species level. Likewise, advances in functional genomics have enabled the discovery of various candidate genes that improve legume protein content and the elucidation of their precise function. Proteomics and metabolomics can improve our understanding of the complex pathways, molecular networks, and metabolites underlying high-protein grain legumes. Non-destructive phenomics approaches could be instrumental for screening and identifying high-protein lines with high efficiency. Emerging technologies such as genomic selection, rapid generation advancement, and genome editing could be harnessed to improve SPC, eradicate malnutrition related to dietary protein deficiency, and meet the United Nations Sustainable Development Goal 2.

2. Grain Legumes as an Important Source of Dietary Protein

Grain legumes vary in their protein content, within fundamental limits set by the components a seed must contain to be viable. Many grain legumes have 25–40% SPC, and it may be difficult to raise that number much beyond 40% (see Table 1).
Table 1. Seed protein contents and deficient amino acids in major grain legumes.

Crop | Scientific name | Range of seed protein content | References | Deficient amino acids
Chickpea | Cicer arietinum L. | 17–22% before dehulling; 25.3–28.9% after dehulling | [14,15] | Methionine, cysteine, threonine and valine [16]
Lentil | Lens culinaris Medik | 20.6% and 31.4% | [17] | Methionine, cysteine [18]
Lupin | Lupinus albus L. | 35–44% | [19,20] | Alanine, tryptophan [21]
Soybean | Glycine max (L.) Merr. | up to 40% | [22,23] | Methionine, cysteine, threonine and lysine [24]
Common bean | Phaseolus vulgaris L. | 20–30% | [25,26] | Methionine, cysteine [27]
Pigeonpea | Cajanus cajan (L.) Millsp | 20–22% | [28] | Methionine, cysteine, valine [29]
Faba bean | Vicia faba L. | 26% to 41% | [30,31] | Methionine
Mung bean | Vigna radiata L. | 20.97–31.32% | [32] | Methionine, cysteine
Cowpea | Vigna unguiculata (L.) Walp. | 14.8–25% | [33,34,35] | Methionine
Pea | Pisum sativum L. | 13.7 to 30.7% | [36,37,38]; [37,38,39] | Methionine, cysteine and tryptophan [39]
Urd bean | Vigna mungo (L.) Hepper | 25–28% | [40,41] | Methionine, cysteine
Lathyrus | Lathyrus sativus L. | 8.6–34.6% | [42] | Methionine, cysteine

Chickpea (Cicer arietinum L.) SPC ranges from 17 to 22% before dehulling and 25.3 to 28.9% after dehulling [14,15] (see Table 1). Chickpea seed contains two main protein classes, globulin (11S legumin and 7S vicilin) and albumin, with low amounts of glutelin and prolamin; however, the seed is deficient in cysteine and methionine [43,44]. Although chickpea is a rich source of various essential amino acids, cysteine, methionine, valine, and threonine are its major limiting amino acids [45]. Desi type chickpea has higher SPC than the kabuli type, with no differences in essential amino acids [45].

Common bean (Phaseolus vulgaris L.) SPC ranges from 20 to 30% [25,26], and the crop plays a pivotal role in mitigating protein-related malnutrition, especially in underdeveloped countries [46,47]. The major storage protein in common bean is phaseolin, accounting for 36–46% of total seed proteins [48], with 50–60% of the phaseolin belonging to the 7S vicilin class, which is low in methionine, cysteine, and tryptophan [49,50,51].

Cowpea (Vigna unguiculata (L.) Walp.) is a ‘multi-functional’ grain legume widely used for human consumption. It helps mitigate the challenges of malnutrition in sub-Saharan Africa and in tropical and sub-tropical regions globally [52,53]. Cowpea SPC ranges from 15 to 25% [33,34] (see Table 1). Cowpea storage proteins are abundant in lysine and tryptophan but deficient in methionine and cysteine [53]. Globulins are the most abundant storage protein fraction of cowpea grain, followed by albumins, glutelins, and prolamins [54].

Faba bean (Vicia faba L.) SPC ranges from 26 to 41% [30,31,55,56], with abundant essential amino acids except for tryptophan, cysteine, and methionine [57]. More than 80% of the seed proteins are globulins (vicilin and legumin) [55]. Of the essential amino acids, faba bean seed is highest in lysine [58].

Lentil (Lens culinaris Medik) SPC ranges from 20 to 30% [59]. Like other legumes, lentil seed has a high content of globulin (44–70% of storage protein, comprising 11S legumin, 7S vicilin, and convicilin) and albumin (26–61% of lentil proteins) but low prolamin and glutelin levels [60,61].

White lupin (Lupinus albus L.) seeds are a rich reservoir of protein, containing up to 44% [19,62], with two major classes of protein: albumin (15%) and globulin (85%) [63]. The globulin fraction comprises α-, β-, γ-, and δ-conglutins [20]. Despite some allergenic effects of white lupin seed protein, the seeds are low in antinutritive properties compared with other grain legumes such as pea and soybean [62,64]. Moreover, white lupin seed contains higher amounts of some important amino acids (lysine, phenylalanine, arginine, and leucine) than soybean, making it a high-demand grain legume from a nutritional point of view [65].

Soybean (Glycine max (L.) Merr.) is rich in protein, ranging from 35 to 45%. It is deficient in methionine [22,23] but has sufficient lysine to compensate for the lysine deficiency of cereals [66]. In 2018, soybean alone was estimated to contribute 70% of the global protein meal [67].

Mung bean (Vigna radiata L.) contains easily digestible protein and several essential micronutrients [68]. It is an excellent source of protein except for the sulfur-containing amino acids (methionine and cysteine) [69]. Due to its ease of digestibility relative to other legumes [70] and its hypoallergenic properties, mung bean is used as a weaning food for infants [71]. Moreover, mung bean is a good meat substitute for vegetarians and those who cannot afford animal-based dietary protein [12].

Pea (Pisum sativum L.) is rich in protein, ranging from 13.7 to 30.7% [37]. Pea seed protein comprises legumin, vicilin, convicilin, and globulin-related proteins [37]. Vicilin is the most abundant protein (26.3–52.0% of total pea protein extract) [37]. Moreover, pea protein is in high demand in the food industry because it is gluten-free and has low allergenicity [72].

Pigeon pea (Cajanus cajan (L.) Millsp) seeds contain 20–22% protein and play an essential role in providing plant-based dietary protein to the vegetarian population in India, thus ensuring protein-based food security [73].

Urd bean (Vigna mungo (L.) Hepper) is another important grain legume rich in protein (up to 25%), comprising globulin (63%), albumin (12%), and glutelin (21%) [74]. Urd bean seeds are rich in glutamic acid, aspartic acid, and lysine but deficient in methionine and cysteine [74].

3. Harnessing Genetic Variability for Improving Seed Protein Content in Grain Legumes

Harnessing crop germplasm diversity is an economical way to improve important breeding traits, including SPC in grain legume crops [75,76,77,78,79]. Crop genetic resources are the key reservoir for exploring high-SPC genotypes in grain legumes. Considerable genetic variability for SPC has been captured in chickpea [78,80,81], with reported ranges of 12.4–31.5% [82], 17–22% [83], and 14.6–23.2% [84]. Serrano et al. [84] identified several high-SPC genotypes (LEGCA608, LEGCA609, LEGCA614, LEGCA619, LEGCA716) that could be used to improve SPC in elite chickpea cultivars.

Cowpea is a cheap source of protein for improving human nutrition. Boukar et al. [77] assessed a set of 1541 cowpea lines for genetic variability in grain protein content and mineral profiles. They reported a wide range of genetic variability for SPC (17.5–32.5%), including TVu-2508 (32.2%) [77]. Likewise, Weng et al. [85] screened 173 cowpea accessions collected from various parts of the world at two locations (Fayetteville and Alma, Arkansas). They also reported substantial genetic variability for SPC (22.8–28.9%), including PI 662992 (28.9%), PI 601085 (28.5%), PI 255765 (28.4%), PI 255774 (28.4%), and PI 666253 (28.4%) [85], which could be used to transfer the high SPC trait into high-yielding elite cowpea varieties. The nutritional profiles (including grain protein content) of 22 cowpea genotypes collected from various regions of eastern, southern, and western Africa were evaluated at two locations in South Africa [34]. Seed protein contents, measured using the combustion method, ranged from 23.16 to 28.13% [34]. The authors noted significant positive correlations between SPC and various mineral contents, indicating the possibility of simultaneously selecting for these traits. Among the tested genotypes, 98K-5301 had high Ca and SPC [34]. Similarly, an evaluation of 21 cowpea genotypes identified high SPC in COVU-702 (27.7%) and HC-98-64 (27.9%) [86]. In another study, Gonçalves et al. [87] identified high SPC in Paulistinha (29.2%) among 18 tested cowpea genotypes. An evaluation of 30 Brazilian cowpea lines for protein, vitamin, and mineral content identified high SPC in MNC01-649F-2 (28.3%), BRS-Cauamé (27.8%), BRS-Paraguaçu (27.7%), BRS-Marataoã (27.4%), Canapuzinho (25.0%), BRS-Tumucumaque (24.8%), and MNC01-631F-15 (24.6%) [88].

The SPC of selected common bean landraces ranged from 16.54 to 25.23%, while that of selected modern common bean cultivars ranged from 19.70 to 24.30% (Celmeli et al. [79]; see Table 2).

Table 2. List of various legume genotypes with improved seed protein content.

Crop | Genotypes | Seed protein content | Source | References
Chickpea | ICC 5912 | 29.2% | ICRISAT, Patancheru, India | [78]
 | LEGCA608, LEGCA609, LEGCA614, LEGCA619, LEGCA716 | >22% | Cordoba | [84]
Common bean | J-216, FJIP-43 | 222 (J/L-146) to 330 (J-216); 180 (G11027A) to 311 (FJIP-43) g kg−1 | Mexican states of Jalisco and Durango | [89]
 | LR05 | 25.23% | Food Safety and Agricultural Research Center, Akdeniz University | [79]
 | 6-EX | 23% | Santo Antônio de Goiás, Brazil | [90]
 | Accession 4049 | | Portugal | [91]
Cowpea | HC-6, HC-5, CP-21, LST-II-C-12, CP-16, COVU-702, HC-98-64 | 26.7–27.9% | India | [86]
 | TVu-2723, TVu-3638, TVu-2508 | 32.50% | Minjibir, Kano State, Nigeria | [77]
 | MNC01-649F-2, BRS-Cauamé, BRS-Paraguaçu, BRS-Marataoã | 27.4–28.3% | | [88]
 | ‘Early Scarlet’ and 09-204 | 26.9–27.4% | Arkansas State (Fayetteville, Alma, Hope) | [92]
 | Bengpla | 40% | Dokpong and Bamahu near Wa, Ghana, South Africa, Taung | [93]
 | PI662992, PI601085, PI255765, PI255774, PI666253 | 28.4–28.9% | Florida, Minnesota, Nigeria, Arkansas | [85]
 | Vuli, Mamlaka, IT90K-59, Ngoji, TVU13953, 98K-5301 | | Tanzania, South Africa, Nigeria, South Africa, Nigeria | [34]
 | Paulistinha | 29.20% | Brazil | [87]
Faba bean | 25 genotypes | 28.43–29.68% | Manitoba and Saskatchewan, Canada | [94]
Grasspea | IC127616 | 32.20% | India | [95]
Lentil | L. orientalis | 18.3–27.75% | India, IIPR, Kanpur | [96]
 | L. ervoides | 18.9–32.7% | India, IIPR, Kanpur | [96]
Mungbean | MGG330, Nagpuri | 29.9% and 29.3% | India | [97]
Pea | PI206793, PI206801, PI206838, PI210619, PI210644, PI210675, PI210678, PI210684 | >30% | Manitoba and Ontario, Canada | [98]
 | Majoret | 240.4 g kg−1 | Grain Research Laboratory Winnipeg, Canada | [82]
 | NGB 101293 Jordan | 26.80% | | [37]
 | L1 | 317.63 g kg−1 | Institute of Field and Vegetable Crops (Smederevska Palanka, Serbia) | [72]
Soybean | D76-8070 | 450 g kg−1 | | [99]
 | BARC-6, BARC-7, BARC-8, BARC-9 | | | [100]
 | AC Proteus | | Central Experimental Farm (Ottawa, ON) Canada | [101]
 | TN03-350, TN04-5321 | High protein content | Tennessee Agricultural Experiment Station, Tennessee, USA, USDA–ARS and the North Carolina Agricultural Research Service | [102]
 | ‘N6202’ | | | [103]
 | Lines developed from Kwangankong × Samnamkong and Danbaegkong × Samnamkong | 34.3–44.4% and 35.8–49.6% | Yeongnam Agricultural Research Institute (YARI), Milyang, Republic of Korea | [104]
 | JIHJ117 | 53% | | [105]
 | 17D derived population and M23 derived lines | 382 and 403 g kg−1 | University of Missouri Fisher Delta Research Center, Portageville, MO | [106]
 | ‘High-pro 1’ developed from Wyandot × GASF98-114 | 401 g kg−1 | USDA Agricultural Service and Ohio Agricultural Research and Developmental Centre Wooster | [107]
 | ‘TN11-5102’ | 421 g kg−1 protein on a dry weight basis | University of Tennessee Agricultural Research | [108]
 | PI407228 | 392.6–481.7 g kg−1 | Central Crops Research Station in Clayton, NC, Bradford Farm in Columbia, Sandhills Research Station in Jackson Springs, NC | [109]
 | R11-7999 | 439 g kg−1 (dry weight) | Arkansas Agricultural Experiment Station | [110]
 | Bioagro | | | [111]
 | S16-5540GT | 41.10% | University of Missouri–Fisher Delta Research Center Soybean Breeding Program | [112]

Grasspea is an inherently climate-resilient grain legume and an excellent source of seed protein. An evaluation of 37 grasspea genotypes identified IC127616 as rich in SPC (32.2%) [95].

An analysis of 27 local mung bean landraces using the micro-Kjeldahl method identified significant genetic variability for SPC (17.2–29.9%), with the highest values in MGG30 (29.9%), NAGPURI (29.3%), and BSN1 (27.8%) [97]. Moreover, significant genetic variability for SPC (15.2–21%), with protein rich in lysine, tryptophan, valine, leucine, isoleucine, and phenylalanine, was noted in Vigna radiata var. sublobata, a wild form of mung bean [113].

Genetic variability for SPC in lentil ranges from 20 to 30% [76,114,115,116,117]. Likewise, lentil crop wild relatives (CWRs) have significant genetic variability for SPC, such as L. orientalis (18.3–27.75%) and L. ervoides (18.9–32.7%) [96], which could be used in breeding programs to improve SPC in elite lentil cultivars.

Pea has considerable genetic variability for SPC (19.3–25.2%) (Gottschalk et al. [75]; see Table 2). Wang and Daun [82] reported SPC ranging from 201.6 to 266.6 g kg−1 DM in four elite pea cultivars. In large sets of pea accessions, the SPC was 22–32% [98] and 23–32% [118] using the Kjeldahl technique and 20.6–27.3% [119], 18.6–27.3% [120], 17–27% [121], 17.5–27.8% [122], and 19.3–30.3% [123] using the near-infrared technique. Several promising pea genotypes with high SPC have been identified: CDC Striker (up to 27.8%) [124,125,126,127], Ballet (up to 25.9%) [119,128], Solara (28.8%) [129], Caméor (29.9%), VavD265 (27.5%), and China (32%) [128].

Increasing SPC is a primary objective of soybean breeding programs; however, progress has been limited by the negative relationships of SPC with grain yield and oil content [24,130]. For example, Bandillo et al. [131] and Warrington et al. [132] reported a strong negative correlation between the high-SPC allele and seed oil content, with oil content falling by 1% for every 2% increase in SPC. High-protein soybean lines include Danbaegkong (48.9%) [133], Kwangankong (44.7%) [134], and TN11-5102, selected from the 5601T cultivar (421 g kg−1 protein on a dry weight basis) [108]. Apart from cultivated species, soybean CWRs (e.g., Glycine soja) are an important source of high-protein QTLs [135,136,137]. A population developed by incorporating exotic soybean germplasm exhibited significant genetic variability for SPC [138]. Wehrmann et al. [139] and Wilcox and Cavins [140] backcrossed the high-protein trait from Pando into Cutler 71, a high-yielding, low-protein soybean genotype. Later, Cober and Voldeng [101] attempted to transfer the high-protein trait from AC Proteus to Maple Glen; the selected progenies had higher protein content than Maple Glen but no yield advantage. The soybean cultivars Sathia, Seti, Kavre, and Soida Chiny, collected in Nepal, had high SPC (up to 42–45%) compared to Williams 82 (39%) and higher arginine (5–10%) content than Williams 82 (7.4%) [141].

Hence, harnessing the available genetic variability for SPC requires the large-scale screening of landraces, CWRs, and grain legume germplasm locked in gene banks across the globe.
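
The SPC values quoted in this section and in Table 2 mix two units (percent and g kg−1), which makes genotypes hard to compare at a glance. The following is a minimal sketch, not part of the reviewed studies, that normalizes four soybean entries from Table 2 to percent and applies the roughly 1:2 oil-to-protein trade-off quoted from [131,132]. The function names are illustrative assumptions, and the comparison ignores any differences in moisture basis between studies.

```python
import re

def spc_to_percent(value: str) -> float:
    """Convert an SPC string such as '450 g kg-1' or '41.10%' to percent."""
    number = float(re.search(r"\d+(?:\.\d+)?", value).group())
    # 1 g kg-1 corresponds to 0.1% by mass.
    return number / 10.0 if "g kg" in value else number

def expected_oil_change(spc_gain_points: float) -> float:
    """Rule of thumb from the text: oil drops ~1 point per 2 points of SPC gained."""
    return -0.5 * spc_gain_points

# SPC values as quoted in Table 2 for four soybean lines.
table2_soybean = {
    "D76-8070": "450 g kg-1",
    "R11-7999": "439 g kg-1",
    "TN11-5102": "421 g kg-1",
    "S16-5540GT": "41.10%",
}

# Rank the lines from highest to lowest SPC on a common percent scale.
ranked = sorted(
    ((name, spc_to_percent(raw)) for name, raw in table2_soybean.items()),
    key=lambda item: item[1],
    reverse=True,
)

for name, percent in ranked:
    print(f"{name}: {percent:.1f}% SPC")
print(f"Oil change for +4 points SPC: {expected_oil_change(4.0):+.1f} points")
```

Keeping the conversion in one helper means additional entries can be added to the dictionary without touching the ranking logic.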

4. Mendelian Inheritance of Seed Protein Content in Legumes

Several researchers have worked out the genetics of SPC in various grain legumes using Mendelian approaches [142,143,144]. For the pea storage proteins legumin and convicilin, Matta and Gatehouse [145] mapped the legumin gene (Lg-1), behaving as a single Mendelian gene with five alleles, on LG7 and the convicilin gene (Cvc), behaving as a single Mendelian gene, on LG2, using F2 seeds from the crosses 1238 × 1263, 110 × 807, and 110 × 851. Subsequently, Mahmoud and Gatehouse [146] described the monogenic inheritance of another pea seed storage protein gene, vicilin (Vc-1), controlled by two codominant alleles located on LG7, using an F2 population from 360 × 611. Perez et al. [147] revealed the genetic basis of high and low SPC in pea using the genetics of seed shape (round vs. wrinkled). They found that round-seeded pea plants (RR/RbRb) had low SPC with low albumin content, while those with recessive alleles (rr/rbrb) had high SPC and high albumin content [147]. The high heritability of protein content and its control by a few genes offer an opportunity to improve protein content in cowpea [92]. In soybean, six populations derived from diallel crosses among two high-protein and two high-yielding lines revealed a significant negative correlation between protein content and yield in the high protein × high protein population but a significant positive correlation between protein content and yield in the high yielding × high yielding population [148]. In pigeon pea, an analysis of F1 and F2 progenies derived from crosses involving four parents revealed a minimum of 3–4 genes controlling protein content [149]. The authors concluded that the low protein trait is partially dominant over the high protein trait. Various studies have reported a significant effect of environment on SPC [150,151,152]. In soybean, this environmental sensitivity reflects the control of SPC by multiple genes and the quantitative nature of the trait [150,151].
In chickpea, an F2 segregating population developed from ICC5912 (blue flowered) × ICC17109 (white flowered) revealed the quantitative nature of the SPC trait and its high negative correlation with seed yield and seed size [78]. A 5 × 5 half diallel cross of cowpea lines revealed the presence of additive and non-additive gene effects for SPC. High seed albumin, prolamin, and globulin were associated with positive effects of dominant genes, while high SPC and glutelin content were associated with recessive genes [153]. In lentil, Kumar et al. [154] also reported the quantitative nature of the SPC trait. Lentil seed storage protein showed high genetic variation and strong G × E interactions, with moderate heritability (31.3%) [152].
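
To make the practical meaning of a heritability estimate concrete, the sketch below applies the standard breeder's equation, R = h² × S, to the 31.3% heritability reported for lentil [152]. This is an illustration only: it assumes the reported value can be treated as narrow-sense heritability (the source does not say which kind it is), and the selection differential of 3 percentage points is a hypothetical figure chosen for the example.

```python
def response_to_selection(heritability: float, selection_differential: float) -> float:
    """Breeder's equation: expected response R = h^2 * S."""
    if not 0.0 <= heritability <= 1.0:
        raise ValueError("heritability must lie between 0 and 1")
    return heritability * selection_differential

# Heritability reported for lentil seed storage protein [152];
# treated here as narrow-sense, which is an assumption.
h2_lentil = 0.313

# Hypothetical: the selected parents average 3 SPC percentage points
# above the mean of the population they were chosen from.
selection_differential = 3.0

gain = response_to_selection(h2_lentil, selection_differential)
print(f"Expected SPC gain per selection cycle: {gain:.2f} percentage points")
```

With moderate heritability, less than a third of the parental advantage is expected to carry over to the progeny mean, which is one reason strong G × E interactions slow progress for SPC.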

5. QTL Mapping for Seed Protein Content

Advances in grain legume genomics have facilitated the identification of underlying QTLs controlling SPC using biparental mapping populations in various grain legumes [118,119,155,156,157]. Few studies have uncovered QTLs controlling SPC in chickpea. However, one study that phenotyped recombinant inbred lines (RILs) derived from ICC995 × ICC5912 across four environments and used a genotyping-by-sequencing approach delineated one major-effect QTL, q-3.2, on LG3 that explained 44.3% of the phenotypic variation (PV) for SPC [158]. In pea, using an F2-derived Wt10245 × Wt11238 mapping population, Irzykowska and Wolko [159] mapped five QTLs governing SPC on LG2, LG5, and LG7, explaining 13.1–25.8% PV. Subsequently, two F5 mapping populations developed from Wt11238 × Wt3557 and Wt10245 × Wt11238 revealed a QTL for protein content on LGVb flanked by the cp, gp, and te markers [118]. Likewise, genotyping an Orb × CDC Striker RIL mapping population with SNP markers identified two SPC QTLs on LG1b, explaining 16% PV, and two on LG4a, explaining 10.2% PV. Genotyping a Carrera × CDC Striker RIL mapping population identified four SPC QTLs on LG7b, explaining 13% PV, and one on LG3b [160]. An evaluation of a Térèse × K586 RIL population in five different environments identified 14 SPC QTLs located on LGI, LGIII, LGIV, LGV, LGVI, and LGVII [119]. The study identified the underlying candidate gene for the QTL on LGI as the Rgp gene (cell wall synthesis) and two underlying candidate genes for the QTL on LGV as Ls (GA biosynthesis) and Rbcs4 (encoding the small Rubisco subunit) [119]. Obala et al. [157] gained insight into the genetic determinants controlling SPC in pigeonpea based on the results obtained from five F2 populations segregating for SPC (ICP 11605 × ICP 14209, ICP 8863 × ICP 11605, HPL 24 × ICP 11605, ICP 8863 × ICPL 87119, and ICP 5529 × ICP 11605). Fourteen major-effect QTLs explaining up to 23.5% PV were located on CcLG02, CcLG03, CcLG06, and CcLG11 [157].
In soybean, the SPC trait is controlled by multiple alleles and highly influenced by G × E interactions [150]. More than 300 QTLs contributing to SPC in soybean have been reported (http://www.soybase.org, accessed on 10 May 2022) [161]. They reside across all chromosomes; however, the major SPC QTLs are on chromosomes 5, 15, and 20. Diers et al. [155] first reported a major QTL governing high SPC on chromosome 20 in a population developed from crossing cultivated and wild soybean, which was later mapped to a 3 cM region on LG I (Nichols et al. [156]). The location of this QTL on LG20 was subsequently narrowed to 8.4 Mb [162], <1 Mb [163], 77.4 kb [137], and a region with only three candidate genes [131]. Likewise, another major SPC QTL, qSeedPro_15, was narrowed to 4 Mb (Zhang et al. [164]; see Table 3), overlapping the previously identified genomic region on chromosome 15 [24,131,155,165,166]. Zhang et al. [164] proposed Glyma.15G049200 as a possible candidate gene underlying the QTL. Genotyping recombinant inbred lines derived from the interspecific cross Williams 82 × G. soja (PI 483460B) using the Illumina Infinium BeadChip platform identified five SPC QTLs, mapped on chromosomes 6, 8, 13, 19, and 20, explaining 4.6–19.6% PV [167]. Of these QTLs, qPro_20 was stable across the four tested environments.
Table 3. List of seed protein content QTLs reported in various grain legumes.

Crop | Mapping population/panel of genotypes | QTL/gene | Marker | LG | PV% | References
Chickpea | GWAS, 187 | 4 QTLs, 9 significant MTAs | SSR | LG3, 5 | 2.4–5.1 | [81]
 | GWAS, 336 | 6 candidate genes | SNP | | 41 | [170]
 | ICC 995 × ICC 5912, RIL (189) | q-3.2 | SNP | LG3 | 44.3 | [158]
Common bean | Xana × Cornell 49242, RIL (104) | SpA, SpB, SpE, SpI, SpJ, Pha, SpF, SpG, SpK, SpL, SpM, SpC, SpD | AFLP, RAPD, ISSR, SCAR | LG7, 4, 3, 1 | | [171]
 | Xana × Cornell 49242, RIL (104) | One QTL | SSR | PV07 | | [172]
Groundnut | TG26 × GPBD 4, RIL (146) | 8 QTLs | SSR | LG1, 3, 4, 7, 8 | 1.5–10.7 | [173]
Pea | 1238 × 1263, 110 × 807, 110 × 851 (F2) | Convicilin (Cvc), Legumin (Lg-1) | Protein marker | LG2, 7 | | [145]
 | 360 × 611 (F2) | Vicilin (Vc-1) | | LG7 | | [146]
 | Wt10245 × Wt11238, F2 (114) | prot1, prot2, prot3, prot4, prot5 | AFLP, RAPD, ISSR, STS, CAPS | LG2, 5, 7 | 13.1–25.5 | [159]
 | ‘Térèse’ × K586, RIL (139) | 14 QTLs | | | | [119]
 | Wt11238 × Wt3557 (F5), Wt10245 × Wt11238 (F5) | One QTL | AFLP, RAPD, STS, CAPS, ISSR | LGVa, 5b | | [118]
 | GWAS, 50 | One significant SNP | SNP | | | [120]
 | Orb × CDC Striker, Carrera × CDC Striker, 1–2347–144 × CDC Meadow | 8 QTLs | SNP | LG1b, 4a; LG3b, 7b | 16 | [160]
 | GWAS, 135 genotypes | Chr3LG5_194530376 | SNP | | | [174]
 | GWAS, 135 | Chr3LG5_138253621, Chr3LG5_194530376 | SNP | LG3, 5 | | [174]
 | 9 populations, RIL (1213) | 21 QTL | SNP | | | [123]
Pigeonpea | ICP 11605 × ICP 14209, ICP 8863 × ICP 11605, HPL 24 × ICP 11605, ICP 8863 × ICPL 87119, ICP 5529 × ICP 11605 | 48 M-QTLs for SPC | SNP | CcLG03, 11, 02, 06 | 0.7–23.5 | [157]
Soybean | Parker × PI 468916 | Two major quantitative trait locus (QTL) alleles | | LG20 | | [136]
 | A3733 × PI 437088A, RIL (76) | One QTL | Satt496 and Satt239, RAPD marker OPAW13a | LG20 | | [175]
 | Essex × Williams | | | LG6 | | [176]
 | PI 97100 × Coker 237 | | | LG15, 20 | | [177]
 | N87-984-16 × TN93-99 | | | LG18 | | [178]
 | N87-984-16 × TN93-99, F6 (101) | 4 QTL for cysteine, 3 QTL for methionine | Satt235, Satt252, Satt427, Satt436; Satt252, Satt564, Satt590 | D1a, F, G; F, G, M | | [102]
 | A81356022 × PI 468916 | | | LG20 | | [156]
 | G. soja (PI468916) × G. max (A81-356022) backcrossing | One QTL | SSR, AFLP | LG1 | | [162]
 | G. max A81-356022 × G. soja PI468916, near isogenic lines | One QTL, 13 genes | SNP | LG20 | | [162]
 | Magellan × PI 438489B | | | LG15, 5, 6 | | [165]
 | ZDD09454 × Yudou12 | | | LG18, 20 | | [179]
 | GWAS, 298 | 17 genomic regions, Glyma20g19680, Glyma20g21030, Glyma20g21080, Glyma20g19620, Glyma20g196030, Glyma20g21040 | SNP | LG8, 9, 20 | | [180]
 | IL-1964 (619 accessions), IL-1966 (977 accessions), MS-1996 (728 accessions), MS-2000 (934 accessions) | | SNP | LG20 | | [163]
 | ZYD2738 × Jidou 12, F2:3; ZYD2738 × Jidou 9, F2:3 | qPRO_2_1, qPRO_13_1, qPRO_20_1, qPRO_6_1, qPRO_18_1 | SSR | LG2, 6, 13, 18, 20 | 6.6–14.5 | [181]
 | Benning × Danbaekkong | 4 QTLs | | LG14, 15, 17, 20 | 55 | [132]
 | R05-1415 × R05-638 | | | LG14, 20 | | [182]
 | GWAS, 139 | qPC19 and 8 significant genomic regions | SNP | LG5, 8, 10, 14, 16, 19 | 10.3 | [183]
 | SD02-4-59 × A02-381100 (RIL), SD02-911 × SD00-1501 (RIL) | 8 QTLs | | | | [184]
 | Danbaekkong × Glycine soja (PI468916) | wp allele, cqSeed protein-003 | | LG20 | | [185]
 | G. max (Williams 82) × G. soja (PI 483460B) | 5 QTLs: qPro_06, qPro_19, qPro_20, qPro_08, qPro_13 | SNP | LG6, 8, 13, 19, 20 | 4.6–19.6 | [167]
 | GWAS, 144 lines derived from four parents | Glyma.03G100800, Glyma.10G207300, Glyma.12G019300, Glyma.12G112900, Glyma.14G081600, Glyma.18G028600, Glyma.18G07110, Glyma.18G071300 | SNP | LG1, 2, 3, 4, 6, 7, 9, 10, 12, 14, 18 | 3.84–19.21 | [186]
 | | 192 collinear protein QTLs, 13 candidate genes | | | | [187]
 | Linhefenqingdou × Meng 8206, RIL (104) | 25 main effect QTLs | SNP | LG1, 4, 6, 7, 8, 9, 10, 13, 14, 17, 18, 19, 20 | 5.7–26.22 | [169]
 | GWAS, 621 accessions | Three genomic regions, 16 significant SNPs, Glyma.15g049100, Glyma.15g049200, Glyma.15g050100, Glyma.15g050600 | SNP | LG4, 5, 8, 9, 10, 13, 15, 19, 20 | | [188]
 | GWAS, 185 | Three significant SNP markers: rs53140888, rs19485676, rs24787338 | SNP | Chromosomes 1, 13, 20 | | [189]
 | (Kenfeng14 × Kenfeng15) × (Heinong48 × Kenfeng19), RIL (160) | 34 QTLs | SSR | | 2.65–13.83 | [189]
 | G15FN-12 mutant | | SoySNP50K BeadChip | LG12 | | [190]
 | GWAS, 249 | 25 significant MTAs | SNP | LG2, 6, 7, 10, 13, 14, 16, 17, 18, 19 | | [191]
 | AC Proteus × Maple Arrow, F5, RIL | 5 QTLs | SSR, DArT and DArTseq | LG15, 20, 2, 18 | 70% | [168]
 | X3145-B-B-3-15 × 9063, F5, RIL; X3145-B-B-3-15 × AC Brant, F5, RIL; X3144-48-1-B/9063, F5, RIL; X3144-48-1-B × AC Brant, F5, RIL; X3145-B-B-3-15 × X3144-48-1-B, F5, RIL | | | LG1, 8, 9, 14, 16, 17, 19, 20 | | [168]
 | ‘AC X790P × S18-R6’ and ‘AC X790P × S23-T5’, RILs | qPro_Gm02–3, qPro_Gm04–4, qPro_Gm06–1, qPro_Gm06–3, qPro_Gm06–6, qPro_Gm13–4, qPro-Gm15–3 | SNP | LG1, 2, 4, 5, 6, 8, 12, 13, 15, 18 | 10.4–21.9 | [192]
 | GWAS, 211 | qPC-7-1, qPC-13-1, qPC-15-1 | SNP | LG7, 13, 15 | 18–34 | [164]
 | (Kenfeng 14 × Kenfeng 15) × (Heinong 48 × Kenfeng 19) | 85 QTL, 123 QTNs | 2,232 SNPs and 63,306 SNPs | | | [193]
 | G00-3213 × PI 594458A, 132 RIL | 16 QTLs | SoySNP6k BeadChip | LG3, 6, 13, 20 | | [194]
 | GWAS, 165 | 138 significant MTA; Glyma.07g175700 and Glyma.07g176000 | SNP | LG7 | | [195]
 | RIL, 944, Primus × Protina, Gallec × Sigalia, Primus × Sigalia, Protina × Sigalia, Gallec × Primus, Gallec × Protina, Sultana × Sigalia, Gallec × Protina, Gallec × Protina, Gallec × Sigalia, Primus × Sultana | qPY1, qPY2, qPY3, qPY4, qPY5, qPY6, qPY7 | SNP | LG5, 6, 7, 8, 16, 18, 19, 20 | 15.5–60 | [196]
 | ‘Nanxiadou 25’ × Tongdou 11, RIL (178) | 50 QTLs, three candidate genes: Glyma.20G088000, Glyma.20G111100, Glyma.20G087600 | SNP | LG1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 13, 15, 16, 20 | - | [197]
 | PI 468916 × A81-356022, BC | Glyma.20G85100, cqSeed protein-003 QTL | SNP | LG20 | - | [137]
 | 250, F2 | 3 QTLs | Infinium Soy6KSNP Beadchips | LG6, LG13, LG20 | - | [198]

AFLP = Amplified fragment length polymorphism; SNP = Single nucleotide polymorphism; SCAR = Sequence characterized amplified region; CAPS = Cleaved amplified polymorphic sequence; RAPD = Random amplified polymorphic DNA; SSR = Simple sequence repeats; ISSR = Inter simple sequence repeat.

SSR, DArT, and DArTseq analysis of five RIL-based mapping populations segregating for high and low SPC and one high × high SPC population identified two major QTLs controlling SPC on LG15 and LG20 in soybean [168]. Furthermore, bulked segregant analysis of four high × low SPC mapping populations unveiled novel SPC-controlling genomic regions on LG1, 8, 9, 14, 16, 17, 19, and 20 [168]. An assessment of soybean RILs developed from Linhefenqingdou × Meng 8206 in six different environments identified 25 SPC QTLs explaining up to 26.2% PV [169]. Of the identified QTLs, qPro-7-1 was highly stable across all tested environments. Recently, Fliege et al. [137] cloned a major SPC-governing QTL (cqSeed protein-003) and elucidated the underlying causative candidate gene, Glyma.20G85100, which encodes a CCT domain protein. Similar efforts are needed to fine map or clone major QTLs controlling SPC in other grain legumes, to delineate the underlying candidate gene(s) and their function for genomics-assisted breeding to improve SPC.
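
The "PV explained" figures quoted throughout this section are, in the simplest single-marker case, the share of phenotypic variance accounted for by grouping lines according to their marker genotype (the R² of a one-way model). The sketch below shows that calculation on a small, entirely hypothetical RIL data set; the cited studies used interval or composite interval mapping with flanking markers, so this is a simplification for illustration rather than their method.

```python
from statistics import fmean

def pv_explained(genotypes: list[str], phenotypes: list[float]) -> float:
    """Percent of phenotypic variance explained by one marker (one-way R^2)."""
    grand_mean = fmean(phenotypes)
    total_ss = sum((y - grand_mean) ** 2 for y in phenotypes)

    # Group phenotypes by marker genotype class.
    groups: dict[str, list[float]] = {}
    for genotype, value in zip(genotypes, phenotypes):
        groups.setdefault(genotype, []).append(value)

    # Between-class sum of squares: variation attributable to the marker.
    between_ss = sum(
        len(values) * (fmean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    return 100.0 * between_ss / total_ss

# Hypothetical RILs: each carries the marker allele of parent A or parent B.
alleles = ["A", "A", "A", "A", "B", "B", "B", "B"]
# Hypothetical SPC values in percent for the same eight lines.
spc = [20.0, 21.0, 19.0, 20.0, 23.0, 22.0, 24.0, 23.0]

print(f"PV explained by the marker: {pv_explained(alleles, spc):.1f}%")
```

A marker with a large difference between allele-class means relative to the within-class spread gives a high PV, which is what distinguishes a major-effect QTL from the many minor ones listed in Table 3.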

6. Underpinning Genomic Region/Haplotypes Controlling High Protein Content through GWAS

Traditional biparental QTL mapping for obtaining genetic recombinants controlling complex traits such as protein content is limited because only two parents are included in the crossing program. However, the increased capacity of next-generation sequencing technology to derive single nucleotide polymorphism markers, together with advanced phenotyping facilities, has facilitated the development of numerous genetic recombinants and the identification of plausible candidate genomic regions controlling protein content in various grain legumes using GWAS [81,174,183,186]. Jadhav et al. [81] performed association mapping for SPC using SSR markers on a panel of 187 chickpea genotypes (desi, kabuli, and exotic). Nine significant marker-trait associations (MTAs) for SPC were uncovered on LG1, LG2, LG3, LG4, and LG5, explaining 16.85% PV. A recent GWAS using high-throughput SNP markers examined MTAs for various nutrients in 140 chickpea genotypes grown under drought and heat stress and uncovered 66 (non-stress), 46 (drought stress), and 15 (heat stress) MTAs for SPC [199], which could be used to identify high-protein lines for improving SPC in chickpea. A GWAS relying on multilocation and multi-year phenotyping of a large set of pea germplasm representing diverse regions across the globe was undertaken to identify significant MTAs for agronomic and quality traits, including protein content [174]. Two significant MTAs controlling SPC were identified: Chr3LG5_138253621 and Chr3LG5_194530376. A GWAS using 16,376 SNPs in 332 chickpea genotypes (desi and kabuli) delineated seven genomic loci controlling SPC and explaining 41% combined PV [170]. The authors also validated five SPC-controlling genes, encoding cytidine (CMP) and deoxycytidylate (dCMP) deaminases, ATP-dependent RNA helicase DEAD-box, and zinc finger protein, in a RIL-based mapping population (ICC 12299 × ICC 4958).
An earlier comprehensive GWAS of 298 soybean lines using Illumina Infinium and GoldenGate assays identified 17 significant genomic regions controlling SPC [180]. Among the SPC-controlling genomic regions, LG20 was important as it contained six candidate genes (Glyma20g19680, Glyma20g21030, Glyma20g21080, Glyma20g19630, Glyma20g19620, and Glyma20g21040) within a 2.4 Mbp interval. Another GWAS performed on 139 soybean lines revealed eight significant regions contributing to SPC on LG5, LG8, LG10, LG14, LG16, LG19, and LG20 [183]. In addition, a major QTL, qPC19, controlling SPC on LG19 in the 42.3 to 44.2 Mb interval explained 10.3% PV [183]. Likewise, an assay using the SoySNP660k BeadChip in 144 soybean lines developed from four-way RILs identified eight candidate genes controlling SPC: Glyma.03G100800, Glyma.10G207300, Glyma.12G019300, Glyma.12G112900, Glyma.14G081600, Glyma.18G028600, Glyma.18G07110, and Glyma.18G071300 (Zhang et al. [186]; see Table 3). A comprehensive GWAS in a collection of 877 soybean accessions, tested in five different environments in the Midwest and southern USA using the SoySNP50K iSelect BeadChip [188], identified significant genomic regions for SPC that coincided with previous QTL/genomic regions identified on chromosomes 15 and 20 [161,166]. Three SNPs identified within 91 kb overlapped the 118 kb genomic region of a meta-QTL controlling SPC and seed oil content previously reported by Van and McHale [161]. Some important candidate genes identified in these genomic regions (Glyma.15G049100, Glyma.15G049200, Glyma.15G050100, and Glyma.15G050600) participate in partitioning carbon and regulating protein content (Lee et al. [188]; see Table 3). The authors also elucidated eight novel genomic regions controlling methionine, cysteine, lysine, and threonine contents.
A GWAS using whole genome sequencing data of 631 soybean accessions combined with a biparental QTL analysis uncovered a pleiotropic gene, GmSWEET39 (encoding a sugar transporter), controlling SPC and seed oil content in soybean [164]. The authors also reported that a 2 bp (CC) deletion in Glyma.15G049200, the gene underlying GmSWEET39, conferred high seed oil content and low SPC. A comprehensive association and linkage analysis surveyed 985 soybean accessions, including wild species, landraces, and old and modern cultivars, to capture haplotypic variation in the high SPC locus cqProt-003 on chromosome 20 [200]. The study uncovered significant trait-associated genomic regions within a 173 kb linkage block containing three causal candidate genes: Glyma.20G084500, Glyma.20G085250, and Glyma.20G085100 [200]. Of these, Glyma.20G085100 (containing a 304 bp deletion and trinucleotide insertions) was tightly linked with the high protein content phenotype [200].
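As a rough illustration of what a marker-trait association and its "PV explained" mean, the sketch below regresses a phenotype on allele dosage at each SNP and ranks markers by the fraction of phenotypic variance explained. It is a simplified single-marker scan under stated assumptions, not the method of any study cited above: the marker names, dosages, and SPC values are invented toy data, and the corrections for population structure and kinship that GWAS normally apply are omitted.

```python
from math import fsum


def marker_r2(dosage, phenotype):
    """Least-squares regression of phenotype on allele dosage (0/1/2).

    Returns (allele effect, fraction of phenotypic variance explained).
    """
    n = len(dosage)
    mean_x = fsum(dosage) / n
    mean_y = fsum(phenotype) / n
    sxx = fsum((x - mean_x) ** 2 for x in dosage)
    syy = fsum((y - mean_y) ** 2 for y in phenotype)
    sxy = fsum((x - mean_x) * (y - mean_y) for x, y in zip(dosage, phenotype))
    if sxx == 0 or syy == 0:
        # Monomorphic marker or constant phenotype: nothing to explain.
        return 0.0, 0.0
    return sxy / sxx, sxy ** 2 / (sxx * syy)


def scan(markers, phenotype):
    """Test every marker and sort by variance explained, largest first."""
    results = [(name, *marker_r2(d, phenotype)) for name, d in markers.items()]
    return sorted(results, key=lambda row: row[2], reverse=True)


# Hypothetical toy data: six lines, three SNPs, seed protein content in %.
toy_spc = [38.0, 38.0, 40.0, 40.0, 42.0, 42.0]
toy_markers = {
    "snp_A": [0, 0, 1, 1, 2, 2],
    "snp_B": [0, 1, 2, 0, 1, 2],
    "snp_C": [1, 1, 1, 1, 1, 1],
}

ranked = scan(toy_markers, toy_spc)
for name, effect, r2 in ranked:
    print(f"{name}: effect = {effect:.2f} % per allele, PV explained = {r2:.0%}")
```

In real panels, a significance threshold corrected for multiple testing would be applied before calling any marker an MTA.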

7. Functional Genomics Shedding Light on Causal Candidate Gene(s) Contributing to Seed Protein Content in Grain Legumes

In the last decade, unprecedented advances in RNA sequencing have expedited functional genomics research, especially transcriptome analysis for discovering trait gene(s), in various grain legumes [197]. Numerous studies have elucidated SPC-contributing candidate gene(s) and their functional roles in grain legumes, notably through cDNA cloning-based functional characterization of genes encoding storage proteins such as pea seed albumin (PA1, PA1b) [201] and the conglutin family in narrow leaf lupin [202]. In narrow leaf lupin, sequencing cDNA clones from developing seed identified 11 new genes encoding storage proteins of the conglutin family [202]. Transcriptome analysis via RNA-seq shed light on 16 conglutin genes encoding storage proteins in the Tanjil cultivar of narrow leaf lupin [203]. Conglutin gene expression is similar among lupin varieties of the same species but distinct between species [203]. In soybean, a QTL-seq approach combined with gene expression profiling identified 329 differentially expressed genes underlying the qSPC_20–1 and qSPC_20–2 QTL regions accounting for SPC [197]. Of the nine candidate genes underlying these QTL regions, Glyma.20G088000, Glyma.20G111100, and Glyma.20G087600 were functionally validated and identified as the most promising candidate genes controlling SPC [197]. RNAi technology, a robust functional genomics tool, offered novel insight into the regulatory role of Glyma.20G085100, which harbors a transposon insertion, in the SPC-controlling genomic region of soybean [137]. Reduced expression of Glyma.20G085100 using RNAi enhanced the protein level in the low-protein Thorne soybean genotype [137].
Most functional genomics studies identifying SPC-controlling candidate genes with their putative function in major legumes have involved soybean; thus, studies should focus on elucidating candidate genes and deciphering the molecular mechanism for improving SPC via functional genomics in other grain legumes.

8. Proteomics and Metabolomics Shed Light on the Genetic Basis of High Seed Protein Content in Legumes

Proteomics helps us understand the entire set of proteins produced at a specific time under a particular set of conditions in an organism or cell [204]. This approach could be used to discover novel seed storage proteins and investigate the molecular basis of enhancing SPC in various legumes [205]. A novel methionine-rich protein was discovered in soybean using two-dimensional (2D) electrophoresis [205]. Later, 2D-PAGE distinguished wild soybean (G. soja) from cultivated soybean based on the high storage proteins (beta-conglycinin and glycinin), detecting 44 protein spots in wild soybean and 34 protein spots in cultivated soybean, which helped in identifying high-protein soybean genotypes [206]. Combined SDS-PAGE and MALDI-TOF MS analysis in LG00-13260, PI 427138, and BARC-6 soybean genotypes revealed enhanced accumulation of beta-conglycinin and glycinins and thus higher grain protein content than in Williams 82 ([207]; see Table 4). A combined SDS-PAGE and MALDI-TOF MS analysis comparing nine soybean accessions with Williams 82 revealed significant differences in the content of seed 11S storage globulins [208]. In common bean, proteome analysis of lines deficient in seed storage proteins (phaseolin and lectins) revealed elevated sulfur amino acid content due to increased legumin, albumin 2, and defensin [209]. Santos et al. [210] characterized the protein content of 24 chickpea genotypes using a proteomics approach to explore genetic variability in storage protein. High-performance liquid chromatography analysis indicated the presence of sufficient genetic variability for SPC, with some genotypes rich in seven amino acids. In pea, a mature seed proteome map of a diverse set of 156 proteins identified novel storage proteins for enhanced SPC [211].
Table 4

Proteomic approach for investigating novel proteins for improving seed protein content in grain legumes.

| Crop | Protein Identified | Approach Used | Reference | Genotype |
|---|---|---|---|---|
| Chickpea | High amino acid content, 454 protein spots | Two-dimensional electrophoresis and mass spectrometry | [210] | Flip97-171C, Elite |
| Common bean | Sulfur-containing amino acids, S-methylcysteine accumulation | High resolution liquid chromatography-tandem mass spectrometry | [212] | |
| | Sulfur-containing amino acids; enhanced concentration of cysteine and methionine | Mass spectrometry | [213] | SARC1 and SMARC1N-PN1 |
| Faba bean | Amino acid metabolism | iTRAQ | [56] | Cixidabaican |
| | Legumin, vicilin, and convicilin | 1D SDS-PAGE, size-exclusion high-performance liquid chromatography | [214] | Cartouche, NV657, NV734 |
| Narrow-leafed lupin | 2760 protein identifications | LC-MS | [215] | P27255, Tanjil, Unicrop |
| Pea | 156 proteins | 2-D gels, MALDI-TOF MS | [211] | Caméor |
| Soybean | High arginine content in Nepalese | MALDI-TOF; two-dimensional gel electrophoresis | [141] | Nepalese, Karve, Seti |
| | High beta-conglycinin and glycinins | Two-dimensional electrophoresis SDS-PAGE | [207] | LG00-13260 |
| | High 11S storage globulins | SDS-PAGE, MALDI-TOF, two-dimensional electrophoresis | [208] | PI407788A |
| | High storage protein | 2D-PAGE | [206] | Wild soybean |
| | Asparagine, free 3-cyanoalanine, and L-malic acid | GC-TOF/MS | [216] | |

An iTRAQ-based proteomics analysis of CX (low SPC) and LX (high SPC) faba bean genotypes revealed differentially abundant proteins involved in amino acid metabolism [56]. Furthermore, a KEGG analysis suggested that the valine, leucine, histidine, and β-alanine metabolism pathways were significantly enriched among the differentially abundant proteins [56]. Likewise, metabolomic studies help us understand various metabolic pathways and metabolites controlling protein accumulation during seed development [217]. A meticulous amino acid profiling study using contrasting high and low SPC soybean lines revealed that the ability of embryos to assimilate nitrogen and synthesize storage proteins determines SPC [217]. Further, the authors reported that high SPC at maturity is related to increased accumulation of asparagine in developing cotyledons. A metabolomics study using GC-TOF/MS in soybean lines with contrasting seed protein showed a high abundance of metabolites (asparagine, aspartic acid, glutamic acid, free 3-cyanoalanine) that were positively associated with SPC and negatively associated with seed oil content [216]. However, various sugars (sucrose, fructose, glucose, mannose) had negative associations with seed protein and oil content [216]. Saboori-Robat et al. [218] undertook metabolite profiling of common bean genotypes differing in S-methylcysteine accumulation in seeds and found that S-methylcysteine accumulates as γ-glutamyl-S-methylcysteine during seed maturation, with a low accumulation of free methylcysteine. Amino acid profiling of Valle Agricola, a nutritionally rich chickpea genotype cultivated in southern Italy, revealed that 66% of the total amino acids comprised glutamic acid, glutamine, aspartic acid, phenylalanine, asparagine, lysine, and leucine, while ~40% comprised histidine, valine, isoleucine, leucine, methionine, and threonine [219].
Further advances in metabolomics could improve our understanding of various cellular metabolism networks and pathways related to SPC in legumes. Thus, integrating various ‘omics’ tools and emerging novel breeding approaches could assist in developing protein-fortified grain legumes (see Figure 1).
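Pathway enrichment of the kind reported for the faba bean proteome is commonly assessed with a hypergeometric (one-sided Fisher) test. The sketch below shows that calculation in isolation; whether the cited study used exactly this test is an assumption, and all counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(total, in_pathway, selected, overlap):
    """Hypergeometric upper-tail probability P(X >= overlap).

    total      : all quantified proteins (background)
    in_pathway : background proteins annotated to the pathway
    selected   : differentially abundant proteins
    overlap    : differentially abundant proteins annotated to the pathway
    """
    upper = min(in_pathway, selected)
    tail = sum(
        comb(in_pathway, k) * comb(total - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total, selected)


# Hypothetical counts: 2000 quantified proteins, 40 in the pathway,
# 100 differentially abundant, 8 of which fall in the pathway.
p = enrichment_p_value(total=2000, in_pathway=40, selected=100, overlap=8)
print(f"enrichment p-value = {p:.3g}")
```

When many pathways are tested, the resulting p-values would normally be adjusted for multiple testing before any pathway is called enriched.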
Figure 1

Integrated ‘omics’ and emerging novel breeding approach for improving protein content in grain legumes.

9. Progress of Genetic Engineering and Scope of Genome Editing for Improving SPC in Grain Legumes

Numerous studies have been undertaken to improve the essential amino acid content in various grain legumes by manipulating the relevant genes through genetic engineering [220,221,222]. Many examples are available of improved essential amino acid contents, especially of sulfur-rich amino acids, achieved by manipulating gene(s) in various legumes using transgenic technology. Chiaiese et al. [223] introduced a sunflower seed albumin transgene encoding a methionine- and cysteine-rich protein into chickpea to improve seed methionine content. The transgenic chickpea seed accumulated more methionine than the control. Likewise, Molvig et al. [224] improved seed methionine content in narrow leaf lupin by introducing a sunflower seed albumin transgene. However, cysteine-rich storage proteins, especially conglutin delta, declined in narrow leaf lupin seed due to low expression of the gene encoding this cysteine-rich protein ([225]; see Table 5). Introducing the Bertholletia excelsa methionine-rich 2S albumin gene into common bean enhanced seed methionine content by more than 20% over non-transgenic plants [220]. Improving sulfur-rich amino acids, such as methionine and cysteine, in soybean has been a research priority, made possible by introducing the 15 kDa δ-zein [226], 27 kDa γ-zein [227], and 11 kDa δ-zein [221,228] genes from maize using genetic engineering.
Table 5

Selected list of grain legumes with improved seed protein content using a genetic engineering approach.

| Crop | Gene Source | Gene Name | Function | References | Transformation Approach |
|---|---|---|---|---|---|
| Chickpea | Sunflower | Sunflower seed albumin | Increased methionine up to 90% | [223] | Agrobacterium tumefaciens |
| Common bean | Brazil nut | Brazil nut 2S albumin | Increased methionine by 14–23% | [220] | Particle bombardment |
| Narrow-leafed lupin | Sunflower | Sunflower seed albumin | Increased methionine by 90% | [224,225] | Agrobacterium tumefaciens |
| | Arabidopsis | Serine acetyltransferase | 26-fold increase in free cysteine | [229] | Agrobacterium tumefaciens |
| Soybean | Maize | 15 kDa δ-zein | Increased methionine by 20% and cysteine by 35% | [226] | Agrobacterium tumefaciens |
| | Maize | 27 kDa γ-zein | Increased methionine from 15.49 to 18.57% and cysteine from 26.97 to 29.33% | [227] | Particle bombardment |
| | Maize | 11 kDa δ-zein | Methionine | [221] | Agrobacterium tumefaciens |
| | | MB-16 | Increased methionine by 16% and cysteine by 66% | [230] | Biolistic |
| | Soybean | Soybean plastid ATP sulfurylase isoform 1 | Increased cysteine by 37–52% and methionine by 15–19% | [228] | Agrobacterium tumefaciens |
| | Maize | 11 kDa δ-zein | Increased sulfur amino acids | [222] | Agrobacterium tumefaciens |
| | Soybean | Glyma.20G085100 | Enhanced protein content | [137] | RNAi technology |

Despite some successes in introducing transgenes to enhance SPC in grain legumes, regulatory bodies prohibit or restrict the commercial use of these genetically engineered grain legumes due to health and environmental safety concerns. To overcome these stringent restrictions on genetically modified crops, rapidly evolving genome editing technologies could help develop enhanced-protein grain legumes without introducing foreign genes. Genome editing has already improved quality traits in various crop plants, such as increased fragrance and low gluten, starch, or oleic acid contents (for details, see [231]). However, the use of genome editing for SPC fortification in grain legumes is limited; future studies could adopt these powerful technologies to improve SPC by editing gene(s) such as those governing essential sulfur-rich amino acid content or storage proteins.

10. Whole Genome Resequencing and Pangenome Sequencing for Elucidating Novel Structural Variants Related to High SPC across the Genome

Current breakthroughs in genome sequencing technologies have facilitated the sequencing of the global germplasm of various crops, including legumes, to identify novel structural variants (SVs), such as presence/absence and copy number variations, at the genome level [232,233]. An analysis combining association and biparental mapping using whole genome resequencing (WGRS) data of 631 soybean genotypes discovered a pleiotropic sugar transporter gene, GmSWEET39, on chromosome 15 controlling SPC and seed oil content [164]. The authors suggested that a 2 bp (CC) deletion in the underlying causative gene Glyma.15G049200 reduced SPC and enhanced seed oil content. Likewise, a pangenomic approach can describe the full complement of genes in the ‘core genome’ and ‘accessory genome’ to capture structural variation at the species level that is not available in a single reference genome assembly [232]. Pangenome assemblies have been reported in chickpea [233], pigeon pea [234], soybean [235], and mungbean [236]. Thus, future construction and annotation of pangenomes for different grain legumes could reveal SPC-related structural variation missing from the available reference genome assemblies, expediting the development of grain legumes with enriched protein.
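The distinction between core and accessory genome can be made concrete with a gene presence/absence table. The following minimal sketch assumes each accession is represented by the set of gene IDs found in its assembly; the accession names and gene IDs are hypothetical placeholders, not real annotations.

```python
def split_pangenome(presence):
    """Split a pangenome into core and accessory gene sets.

    presence maps accession name -> set of gene IDs present in that accession.
    Core genes occur in every accession; accessory genes are missing from
    at least one accession (presence/absence variation).
    """
    gene_sets = list(presence.values())
    if not gene_sets:
        return set(), set()
    pan = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    return core, pan - core


def accessions_lacking(presence, gene):
    """List the accessions in which a given gene is absent."""
    return sorted(name for name, genes in presence.items() if gene not in genes)


# Hypothetical toy data: three accessions and four gene IDs.
toy_presence = {
    "accession_1": {"geneA", "geneB", "geneC"},
    "accession_2": {"geneA", "geneB", "geneD"},
    "accession_3": {"geneA", "geneC", "geneD"},
}

core, accessory = split_pangenome(toy_presence)
print("core:", sorted(core))
print("accessory:", sorted(accessory))
print("geneB absent in:", accessions_lacking(toy_presence, "geneB"))
```

A gene absent from the single reference accession but present elsewhere would appear only in the accessory set, which is the kind of variation a single reference assembly cannot reveal.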

11. Non-Destructive Phenomics Approach for Quantifying High Protein Content in Grain Legumes

Several high-throughput phenotyping approaches have been developed to bridge the genotyping and phenotyping gap for various quality traits, including protein content [237,238,239]. Advances in high-throughput non-destructive phenotyping approaches such as hyperspectral technologies, near-infrared reflectance spectroscopy, and nuclear magnetic resonance have enabled the phenotyping of various biochemical attributes in cereal and legume seeds, including protein content, with high accuracy and efficiency [237,238,239,240,241]. For example, Raman spectroscopy has been used to measure SPC in soybean [237]. Earlier, near-infrared reflectance spectroscopy was used to screen high-protein soybean genotypes [242,243]. Thus, non-destructive high-throughput phenotyping approaches could save time when screening high-SPC lines.

12. Genomic Selection and Rapid Generation Advances for Selecting High SPC Lines to Increase Genetic Gain

Unprecedented advances in genome-wide molecular marker development allow the use of genomic selection (GS) to predict the genetic merit of progenies for complex traits without observing their phenotypic values in large target populations. GS does this by developing a prediction model in a ‘training population’ with known phenotypic observations and then calculating genomic-estimated breeding values [244]. The benefit of GS for improving genetic gain could be harnessed by increasing selection intensity ($i$) and selection accuracy ($r$) and reducing the breeding cycle length ($L$) in the breeder’s equation: $\Delta G = R = h^2 S = (\sigma_a \times i \times r)/L$, where $\Delta G$ = genetic gain, $R$ = response to selection, $h^2$ = heritability, $S$ = selection differential, and $\sigma_a$ = additive genetic standard deviation (the square root of the additive genetic variance). Notable instances of using GS as a substitute for phenotypic selection for complex traits include grain yield under moisture stress in chickpea [245], common bean [246], cowpea [247], and pea [248,249] and cooking time in common bean [250]. However, GS has had limited application for selecting high SPC genotypes in legumes [251]. An rrBLUP model was used to predict SPC in 306 pea genotypes derived from three RIL populations, tested in three autumn seasons in northern and central Italy, to determine any advantage of GS over phenotypic selection for SPC [251]. The mean predictive ability of GS for SPC was 0.53. Future studies could use GS to improve SPC and select grain legume progenies with high SPC without phenotyping. Likewise, the emerging benefits of speed breeding techniques could be harnessed by using optimum light intensity, photoperiod, and temperature to enhance the rate of photosynthesis, resulting in early flowering and plant maturity, thus shortening the breeding cycle [252]. Speed breeding protocols have been established in chickpea, lupin, lentil, pea, soybean, and faba bean [253,254,255,256,257].
Further optimization of speed breeding protocols could fast-track improvements in various traits of breeding importance, including SPC, in grain legumes for sustaining global food security.
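To show how the terms of the breeder's equation interact, the sketch below computes expected gain per year as intensity × accuracy × additive genetic standard deviation ÷ cycle length, and computes predictive ability as the Pearson correlation between predicted and observed values. Apart from the predictive ability of 0.53 quoted above (used here as a stand-in for selection accuracy, which is itself an assumption), every number is hypothetical.

```python
from math import fsum, sqrt


def genetic_gain_per_year(intensity, accuracy, sigma_a, cycle_years):
    """Expected gain per year: (i * r * sigma_a) / L."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


def predictive_ability(predicted, observed):
    """Pearson correlation between predicted and observed phenotypes."""
    n = len(predicted)
    mean_p = fsum(predicted) / n
    mean_o = fsum(observed) / n
    spp = fsum((p - mean_p) ** 2 for p in predicted)
    soo = fsum((o - mean_o) ** 2 for o in observed)
    spo = fsum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    return spo / sqrt(spp * soo)


# Hypothetical scenario: same selection intensity and genetic variation,
# a slower but more accurate phenotypic cycle versus a faster genomic cycle.
SIGMA_A = 1.0    # additive genetic SD in % protein (assumed)
INTENSITY = 1.4  # roughly the top 20% selected (assumed)

phenotypic = genetic_gain_per_year(INTENSITY, 0.70, SIGMA_A, cycle_years=4)
genomic = genetic_gain_per_year(INTENSITY, 0.53, SIGMA_A, cycle_years=1)
print(f"phenotypic selection: {phenotypic:.3f} % protein per year")
print(f"genomic selection:    {genomic:.3f} % protein per year")
```

Under these assumed values, the shorter cycle more than compensates for the lower accuracy, which is the usual argument for combining GS with speed breeding.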

13. Fundamental Constraints on Seed Protein Content

As the offspring of plants, seeds are subject to several fundamental trade-offs that affect their size and composition. Seeds require certain components, such as cell walls and some amount of carbohydrates, lipids, and nucleic acids, to make a viable embryo. Consequently, there are limits to potential selection on protein content. For example, long-term selection on maize seed oil content has shown limits to the power of selection (e.g., [258]). Over the past two or more decades, ecologists have increasingly conceptualized these trade-offs as part of an economic spectrum, which influences the range of traits observed in leaves [259,260], stems [261,262], and roots [263]. As dispersal units, seeds are able to travel farther if they are smaller but establish more readily if larger [264]. In many individual legume crops, wild relatives have presumably been under millennia of selection for these trade-offs in seed size and composition, constraining genetic variation and architecture. However, few researchers have linked evolutionary and ecological limits on seed composition to breeding efforts or looked carefully at how they affect seed protein content. Seed size is generally an important covariate of seed protein content, although its role differs somewhat among grain legumes. Recent elegant work in chickpea suggests that these constraints are real and shape contemporary genetic diversity in seed size and composition. Chickpea has a QTL hotspot for seed size, leaf size, drought responses, and other “Vigour” traits. Nguyen and colleagues recently fine mapped this QTL [265,266], showing it to be due to variation in a TIFY gene, which mutant studies in Arabidopsis have shown to affect seed size. Natural variation at this locus suggests that it contributes significantly to a trade-off between seed size and seed number among parents that also differ in seed protein content.

14. Conclusions and Future Perspective

The growing human population faces increasing malnutrition-related problems such as dietary protein deficiency, especially in underprivileged and developing countries. Supplying protein-rich legumes improved through plant breeding and molecular breeding approaches could help address the rising challenge of hunger and malnutrition. Moreover, improved grain legume dietary protein could be an important and economically viable alternative to high-cost animal-based dietary protein. Protein biofortification of major grain legumes will help satisfy daily human dietary protein needs in underprivileged and developing countries. Accurate characterization of crop gene pools and landrace haplotypes with genetic variation for SPC needs urgent attention to accelerate SPC improvement in legumes. Harnessing the benefits of pre-breeding approaches could play a pivotal role in introgressing gene(s)/QTLs regulating high protein content from crop wild relatives (CWRs) into high-yielding, low-protein elite legume cultivars [96]. Recent advances in genomics, genome-wide association mapping, and whole genome resequencing approaches and the availability of complete genome and pangenome sequences in various legume crops could help identify the causative alleles/QTLs/haplotypes/candidate genes controlling high protein at the genome level, enabling genomics-assisted selection for improving protein concentration in grain legumes. Likewise, functional genomics, proteomics, and metabolomics could enrich our understanding of the complex molecular networks controlling improved protein content in various grain legumes. Selecting protein-rich grain legume genotypes in assessed germplasm or segregating progenies is challenging as most protein-estimating processes are based on destructive methods. Thus, high-throughput non-destructive methods are important for selecting high-protein legume genotypes.
Likewise, genomic selection and rapid generation advances could be important for selecting high-protein progenies and rapidly developing protein-dense legumes. To overcome the challenges of transgenic technology, genome editing will help us manipulate and edit gene(s) governing high protein content at specific locations on legume genomes to enhance SPC. Capitalizing on these modern breeding tools, we should be able to identify grain legumes with improved protein content without compromising yield, even though these two traits have a strong inverse relationship [123]. Hence, the amalgamation of approaches could help combat the growing protein-related malnutrition and lower the hunger risk, ensuring sustainable human growth globally.