Wilson Kimani, Li-Min Zhang, Xiao-Yuan Wu, Huai-Qing Hao, Hai-Chun Jing.
Abstract
BACKGROUND: In sorghum (Sorghum bicolor), one paramount breeding objective is to increase grain quality. The nutritional quality and end-use value of sorghum grains are primarily influenced by the proportions of tannins, starch and proteins, but the genetic basis of these grain quality traits remains largely unknown. This study aimed to dissect the natural variation of sorghum grain quality traits and identify the underpinning genetic loci by genome-wide association study.
Keywords: Amino acids; Genome-wide association study; Grain quality; Sorghum; Starch; Tannins
Year: 2020 PMID: 32005168 PMCID: PMC6995107 DOI: 10.1186/s12864-020-6538-8
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Population structure analysis of 196 diverse sorghum accessions using genome-wide SNPs. a Hierarchical organization of genetic relatedness of the 196 diverse sorghum lines. Each bar represents an individual accession. The six sub-populations were pre-determined as the optimum number based on ADMIXTURE analysis with cross-validation for K value from K = 2 to K = 10 using 841,038 unlinked SNPs (r² < 0.8), distributed across the genome. Different colours represent different sub-populations. b A plot of the first two principal components (PCs) coloured by sub-populations. c PC2 vs PC3 coloured by sub-populations. d Phylogenetic tree constructed using the maximum likelihood method in SNPhylo. The colours are based on the six sub-populations from ADMIXTURE results. e Comparison of genome-wide average linkage disequilibrium (LD) decay estimated from the whole population and six sub-populations. The horizontal broken grey and red lines show the LD threshold at r² = 0.2 and r² = 0.1, respectively
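The LD quantities in panel e can be illustrated with a minimal Python sketch. It assumes phased haplotypes coded 0/1 and uses the standard definition r² = D² / (pA(1 − pA) pB(1 − pB)); the function names, the toy haplotypes and the toy decay curve are hypothetical and are not taken from the study, whose own LD software is not named in this caption.

```python
from typing import List, Optional, Sequence, Tuple


def ld_r2(a: Sequence[int], b: Sequence[int]) -> float:
    """Squared correlation (r^2) between two biallelic loci coded 0/1 per haplotype."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("inputs must be non-empty and of equal length")
    p_a = sum(a) / n                                  # allele frequency at locus A
    p_b = sum(b) / n                                  # allele frequency at locus B
    p_ab = sum(x * y for x, y in zip(a, b)) / n       # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                              # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    return d * d / denom


def decay_distance(curve: List[Tuple[float, float]], threshold: float) -> Optional[float]:
    """First distance at which a binned mean-r^2 curve drops below the threshold.

    `curve` holds (distance_kb, mean_r2) pairs sorted by distance.
    Returns None if the curve never drops below the threshold.
    """
    for distance_kb, mean_r2 in curve:
        if mean_r2 < threshold:
            return distance_kb
    return None


# Hypothetical toy data, for illustration only.
hap_a = [0, 0, 1, 1, 0, 1, 0, 1]
hap_b = [0, 0, 1, 1, 0, 1, 1, 0]
toy_curve = [(1, 0.62), (5, 0.41), (10, 0.27), (20, 0.18), (50, 0.09)]

print(ld_r2(hap_a, hap_b))
print(decay_distance(toy_curve, 0.2))   # threshold marked by the grey line
print(decay_distance(toy_curve, 0.1))   # threshold marked by the red line
```

Reading the decay distance off a binned curve is only one possible convention; fitted decay models are also common and would give somewhat different distances.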
Fig. 2 Variations and Spearman’s correlations among 17 amino acids. The lower panel, left of the diagonal, shows scatter plots of the measured values for the 196 accessions. The red line through each scatter plot is the line of best fit. Spearman’s correlation coefficients between amino acids are shown in the upper panel, right of the diagonal. The correlation significance levels are *p = 0.05, **p = 0.01 and ***p = 0.001, and the size of each coefficient value is proportional to the strength of the correlation
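As a rough illustration of the statistic shown in the upper panel, the sketch below computes Spearman's coefficient as the Pearson correlation of average ranks and maps a p-value to the star notation. It assumes the three significance levels act as upper thresholds (p ≤ 0.05, 0.01, 0.001); all function names and the toy values are hypothetical, and the p-value itself is taken as an input rather than computed.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (vx * vy) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def stars(p_value: float) -> str:
    """Star label for a p-value, assuming the caption's levels are upper thresholds."""
    if p_value <= 0.001:
        return "***"
    if p_value <= 0.01:
        return "**"
    if p_value <= 0.05:
        return "*"
    return ""


# Hypothetical measurements for two amino acids across five accessions.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.0, 4.0, 8.0, 16.0, 32.0]
print(spearman(toy_x, toy_y), stars(0.005))
```

Because the coefficient depends only on ranks, the strongly non-linear toy relationship above still counts as a perfect monotonic association.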
Summary statistics of tannins, starch and 17 amino acid contents measured in the association panel
| Trait | Units | Absolute value (pmol µl−1 mg−1): Mean | Absolute value: SD | Absolute value: Minimum | Absolute value: Maximum | Relative value (% of total) |
|---|---|---|---|---|---|---|
| Ala | nmol mg−1 | 14.38 | 2.56 | 7.60 | 21.07 | 11.45 |
| Arg | nmol mg−1 | 2.84 | 0.69 | 1.09 | 4.96 | 2.26 |
| Asp | nmol mg−1 | 7.95 | 1.54 | 3.36 | 11.69 | 6.33 |
| Cys | nmol mg−1 | 14.83 | 13.56 | 0.05 | 70.56 | 11.82 |
| Glu | nmol mg−1 | 20.27 | 3.95 | 9.44 | 32.92 | 16.15 |
| Gly | nmol mg−1 | 6.49 | 1.44 | 0.05 | 11.49 | 5.17 |
| His | nmol mg−1 | 1.45 | 0.93 | 0.60 | 6.32 | 1.15 |
| Ile | nmol mg−1 | 4.48 | 1.02 | 2.40 | 7.42 | 3.57 |
| Leu | nmol mg−1 | 14.41 | 2.79 | 7.02 | 21.79 | 11.48 |
| Lys | nmol mg−1 | 2.09 | 0.59 | 1.16 | 4.60 | 1.67 |
| Met | nmol mg−1 | 1.45 | 0.48 | 0.05 | 3.03 | 1.15 |
| Phe | nmol mg−1 | 4.56 | 1.33 | 1.69 | 8.75 | 3.63 |
| Pro | nmol mg−1 | 11.06 | 3.97 | 0.05 | 20.60 | 8.81 |
| Ser | nmol mg−1 | 5.56 | 0.98 | 2.65 | 7.79 | 4.43 |
| Thr | nmol mg−1 | 4.49 | 1.30 | 0.05 | 9.23 | 3.57 |
| Tyr | nmol mg−1 | 2.81 | 0.72 | 0.42 | 5.08 | 2.24 |
| Val | nmol mg−1 | 5.72 | 1.28 | 1.87 | 9.16 | 4.56 |
| Starch | % dry grain weight | 59.28 | 6.02 | 38.65 | 75.80 | – |
| Tannin | % dry grain weight | 1.48 | 0.24 | 1.16 | 2.24 | – |
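One plausible reading of the relative value column is each amino acid's mean as a share of the summed means. The sketch below recomputes the column under that assumption from the means in the table; the variable and function names are hypothetical. The recomputed shares land within about 0.1 percentage points of the published values but do not match them exactly, so the authors' actual calculation, which is not stated in this excerpt, may differ.

```python
# Mean absolute values copied from the summary table above.
mean_abs = {
    "Ala": 14.38, "Arg": 2.84, "Asp": 7.95, "Cys": 14.83, "Glu": 20.27,
    "Gly": 6.49, "His": 1.45, "Ile": 4.48, "Leu": 14.41, "Lys": 2.09,
    "Met": 1.45, "Phe": 4.56, "Pro": 11.06, "Ser": 5.56, "Thr": 4.49,
    "Tyr": 2.81, "Val": 5.72,
}

# Published relative values (% of total) copied from the same table.
published_rel = {
    "Ala": 11.45, "Arg": 2.26, "Asp": 6.33, "Cys": 11.82, "Glu": 16.15,
    "Gly": 5.17, "His": 1.15, "Ile": 3.57, "Leu": 11.48, "Lys": 1.67,
    "Met": 1.15, "Phe": 3.63, "Pro": 8.81, "Ser": 4.43, "Thr": 3.57,
    "Tyr": 2.24, "Val": 4.56,
}


def percent_shares(means):
    """Each entry as a percentage of the sum of all entries (assumed definition)."""
    total = sum(means.values())
    return {name: 100.0 * value / total for name, value in means.items()}


total_mean = sum(mean_abs.values())
recomputed = percent_shares(mean_abs)

# Largest absolute gap between the recomputed and published shares.
max_gap = max(abs(recomputed[aa] - published_rel[aa]) for aa in mean_abs)

for aa in ("Glu", "Cys", "Ala"):
    print(aa, round(recomputed[aa], 2), published_rel[aa])
print("largest gap (percentage points):", round(max_gap, 3))
```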
Fig. 3 GWAS for tannin levels in sorghum seed and direct hits to a priori candidate gene regions. a Distribution of tannin content in 196 diverse accessions. b Manhattan plot for tannin content GWAS. Black arrows show associated SNPs located close to candidate genes. c Quantile-quantile plot for tannin content GWAS. d A close-up of the significant association on chromosome 4. The broken red line represents the significance threshold. e and f LD blocks showing pairwise r² values among all polymorphic sites in the candidate gene regions, where the intensity of the colour corresponds to the r² value as indicated on the legend. Candidate genes Zm1 (~ 61.7 Mb region), Tannin1, TT16 and SCL8 (~ 62.3 Mb region) are shown
Fig. 4 GWAS for starch content in sorghum grains. a Manhattan plot for starch content GWAS. The red arrow shows the significant SNP located close to candidate genes. b Distribution of starch content in 196 diverse accessions. c A close-up of the significant association on chromosome 5. The broken red line represents the significance threshold. d LD block showing pairwise r² values among all polymorphic sites in the candidate gene region, where the intensity of the colour corresponds to the r² value as indicated on the legend
Candidate genes for tannins and starch content that mapped into various KEGG pathways
| Trait | SNP | Chr | Position (bp)ᵃ | P-value | Candidate gene | Distance (kb)ᵇ | Annotation | Pathwayᶜ |
|---|---|---|---|---|---|---|---|---|
| Tannins | 4:3635914 | 4 | 3,635,914 | 2.45E-06 | Sobic.004G044200 | 1.01 | 1,4-dihydroxy-2-naphthoyl-CoA synthase, peroxisomal | Ubiquinone and other terpenoid-quinone biosynthesis |
| Tannins | 4:61736881 | 4 | 61,736,881 | 1.62E-08 | Sobic.004G273900 | 33.72 | peroxidase 5 | Phenylpropanoid biosynthesis |
| Tannins | 5:34971014 | 5 | 34,971,014 | 6.02E-12 | Sobic.005G110600 | 32.00 | chitinase-3-like protein 1 | Amino sugar and nucleotide sugar metabolism |
| Tannins | 8:57291105 | 8 | 57,291,105 | 2.55E-08 | Sobic.008G141700 | 2.38 | heparanase-like protein 1 | Glycosaminoglycan degradation |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G072000 | −26.21 | phosphoribosylformylglycinamidine cyclo-ligase, chloroplastic/mitochondrial | Purine metabolism |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Pentose phosphate pathway |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Glycolysis/gluconeogenesis |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | RNA degradation |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Biosynthesis of amino acids |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Fructose and mannose metabolism |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Carbon metabolism |
| Tannins | 9:8660880 | 9 | 8,660,880 | 1.22E-08 | Sobic.009G071800 | −36.11 | ATP-dependent 6-phosphofructokinase 6 | Galactose metabolism |
| Starch | 4:56136753 | 4 | 56,136,753 | 3.66E-07 | Sobic.004G211866 | 15.24 | S-adenosylmethionine decarboxylase proenzyme | Cysteine and methionine metabolism |
| Starch | 4:56136753 | 4 | 56,136,753 | 3.66E-07 | Sobic.004G211866 | 15.24 | S-adenosylmethionine decarboxylase proenzyme | Arginine and proline metabolism |
| Starch | 4:56136753 | 4 | 56,136,753 | 3.66E-07 | Sobic.004G211833 | 8.31 | cytochrome c oxidase subunit 6b-2 | Oxidative phosphorylation |
a Physical position in base pairs for the peak SNP according to v3.1 of the sorghum genome
b Distance of the gene from the significant SNP
c Pathway of the candidate gene according to Kyoto Encyclopedia of Genes and Genomes (KEGG) database [36]
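The SNP identifiers in the table encode chromosome and position as `chr:pos`, so each row can be checked against its own Chr and Position columns. The following is a minimal sketch of that consistency check; the helper names are hypothetical, and the rows are the distinct SNPs copied from the table above.

```python
from typing import Tuple


def parse_snp_id(snp_id: str) -> Tuple[int, int]:
    """Split an identifier such as '4:3635914' into (chromosome, position in bp)."""
    chrom_text, pos_text = snp_id.split(":")
    return int(chrom_text), int(pos_text)


def row_is_consistent(snp_id: str, chrom_text: str, position_text: str) -> bool:
    """True when the identifier agrees with the Chr and Position (bp) columns."""
    position = int(position_text.replace(",", ""))   # drop thousands separators
    return parse_snp_id(snp_id) == (int(chrom_text), position)


# (SNP, Chr, Position) for the distinct SNPs in the table above.
snp_rows = [
    ("4:3635914", "4", "3,635,914"),
    ("4:61736881", "4", "61,736,881"),
    ("5:34971014", "5", "34,971,014"),
    ("8:57291105", "8", "57,291,105"),
    ("9:8660880", "9", "8,660,880"),
    ("4:56136753", "4", "56,136,753"),
]

inconsistent = [row for row in snp_rows if not row_is_consistent(*row)]
print("inconsistent rows:", inconsistent)
```

A check of this kind is cheap to run after any manual re-keying of a table and catches shifted or truncated columns early.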
Fig. 5 Chromosomal distribution of significant SNPs identified in the amino acid content GWAS. SNP positions are represented by black circles. The size of each circle is proportional to the significance level. Different amino acid families are represented by each colour, as shown on the left of the y-axis. The x-axis represents the physical position across the 10 sorghum chromosomes. The density map on the x-axis represents the number of significant amino acid loci identified across the genome. The red arrows show the association hotspots
Candidate genes for amino acid traits as identified by a priori candidate genes from amino acid biosynthesis and degradation pathways
| Trait | SNP | Chr | Position (bp)ᵃ | Candidate gene | Distance (kb)ᵇ | Annotation | Pathwayᶜ |
|---|---|---|---|---|---|---|---|
| Asp family | 1:10068032 | 1 | 10,068,032 | Sobic.001G127700 | −25.64 | similar to Lysine Decarboxylase, putative | lysine degradation I |
| Leu/BCAA | 1:1014946 | 1 | 1,014,946 | Sobic.001G011700 | −4.06 | similar to Aspartokinase | superpathway of lysine, threonine and methionine biosynthesis II |
| Val/BCAA | 1:24852243 | 1 | 24,852,243 | Sobic.001G241200 | −21.77 | similar to EDR1 | threonine degradation III (to methylglyoxal) |
| Ile/BCAA | 1:69010559 | 1 | 69,010,559 | Sobic.001G405500 | 4.08 | similar to Pyruvate decarboxylase isozyme 2 | superpathway of leucine, valine, and isoleucine biosynthesis |
| Phe/Shikimate family | 1:69010559 | 1 | 69,010,559 | Sobic.001G405500 | 4.08 | similar to Pyruvate decarboxylase isozyme 2 | superpathway of leucine, valine, and isoleucine biosynthesis |
| Tyr/Shikimate | 1:69010559 | 1 | 69,010,559 | Sobic.001G405500 | 4.08 | similar to Pyruvate decarboxylase isozyme 2 | superpathway of leucine, valine, and isoleucine biosynthesis |
| Leu/BCAA | 1:72963758 | 1 | 72,963,758 | Sobic.001G453100 | −10.87 | similar to Homocysteine S-methyltransferase 1 | superpathway of lysine, threonine and methionine biosynthesis II |
| Lys | 2:13818293 | 2 | 13,818,293 | Sobic.002G113600 | 15.98 | similar to Rac GTPase activating protein 3-like protein | superpathway of lysine, threonine and methionine biosynthesis II |
| Ile/Asp family | 2:4671226 | 2 | 4,671,226 | Sobic.002G049200 | −15.65 | weakly similar to PHD finger transcription factor-like | superpathway of leucine, valine, and isoleucine biosynthesis |
| Thr/Asp family | 2:58060555 | 2 | 58,060,555 | Sobic.002G193800 | 15.95 | GLUCOSE TRANSPORTER TYPE 1 | superpathway of lysine, threonine and methionine biosynthesis II |
| Leu/Pyruvate | 3:11583493 | 3 | 11,583,493 | Sobic.003G126500 | 17.82 | similar to Os01g0269000 protein | leucine degradation I |
| Ala/Pyruvate | 3:3063590 | 3 | 3,063,590 | Sobic.003G033900 | 26.43 | similar to 1-aminocyclopropane-1-carboxylic acid synthase | phenylalanine degradation III |
| Ala/total | 3:5411028 | 3 | 5,411,028 | Sobic.003G061300 | −17.63 | Thiamine pyrophosphate dependent pyruvate decarboxylase family protein | superpathway of leucine, valine, and isoleucine biosynthesis |
| Leu/Pyruvate | 3:57321213 | 3 | 57,321,213 | Sobic.003G234701 | 12.80 | similar to Pectin-glucuronyltransferase-like | arginine degradation I (arginase pathway) |
| Gly | 3:70271670 | 3 | 70,271,670 | Sobic.003G391600 | 9.40 | similar to Putative 4-coumarate:coenzyme A ligase | superpathway of lysine, threonine and methionine biosynthesis II |
| Lys | 4:11594929 | 4 | 11,594,929 | Sobic.004G114500 | −18.26 | Core-2/I-branching beta-1,6-N-acetylglucosaminyltransferase family protein | glycine cleavage complex |
| Ser | 4:1351183 | 4 | 1,351,183 | Sobic.004G016800 | −22.65 | similar to Putative serine/threonine protein kinase | threonine degradation III (to methylglyoxal) |
| Thr/total | 4:49321838 | 4 | 49,321,838 | Sobic.004G156000 | 10.33 | similar to Putative steroleosin | lysine degradation II |
| Leu/Pyruvate family | 4:65472831 | 4 | 65,472,831 | Sobic.004G319400 | −16.93 | similar to DNA helicase RECQE-like | superpathway of leucine, valine, and isoleucine biosynthesis |
| Val/BCAA | 4:65472831 | 4 | 65,472,831 | Sobic.004G319400 | −16.93 | similar to DNA helicase RECQE-like | superpathway of leucine, valine, and isoleucine biosynthesis |
| Glu/Glutamate family | 5:3605534 | 5 | 3,605,534 | Sobic.005G039700 | 10.91 | similar to Rac GTPase activating protein 1 | superpathway of lysine, threonine and methionine biosynthesis II |
| Pro | 5:3605534 | 5 | 3,605,534 | Sobic.005G039700 | 10.91 | similar to Rac GTPase activating protein 1 | superpathway of lysine, threonine and methionine biosynthesis II |
| Pro/Glutamate family | 5:3605534 | 5 | 3,605,534 | Sobic.005G039700 | 10.91 | similar to Rac GTPase activating protein 1 | superpathway of lysine, threonine and methionine biosynthesis II |
| Lys | 5:5579891 | 5 | 5,579,891 | Sobic.005G055300 | – | similar to Tropinone reductase | lysine degradation II |
| Lys | 5:5579891 | 5 | 5,579,891 | Sobic.005G055300 | – | similar to Tropinone reductase | phenylalanine degradation III |
| Lys | 5:5579891 | 5 | 5,579,891 | Sobic.005G055400 | 1.07 | similar to Amidase family protein | arginine degradation X (arginine monooxygenase pathway) |
| Val/BCAA | 5:63968450 | 5 | 63,968,450 | Sobic.005G164200 | 2.49 | similar to Putative uncharacterized protein | superpathway of leucine, valine, and isoleucine biosynthesis |
| Val/BCAA | 5:63968450 | 5 | 63,968,450 | Sobic.005G164300 | 6.84 | similar to Putative uncharacterized protein | superpathway of leucine, valine, and isoleucine biosynthesis |
| Ile/BCAA | 5:67881473 | 5 | 67,881,473 | Sobic.005G194900 | −22.93 | similar to Phosphoserine phosphatase | superpathway of serine and glycine biosynthesis I |
| Val/Pyruvate | 5:67881473 | 5 | 67,881,473 | Sobic.005G194900 | −22.93 | similar to Phosphoserine phosphatase | superpathway of serine and glycine biosynthesis I |
| Val/BCAA | 5:67881473 | 5 | 67,881,473 | Sobic.005G194900 | −22.93 | similar to Phosphoserine phosphatase | superpathway of serine and glycine biosynthesis I |
| Val/total | 5:67881473 | 5 | 67,881,473 | Sobic.005G194900 | −22.93 | similar to Phosphoserine phosphatase | superpathway of serine and glycine biosynthesis I |
| Met/Asp family | 5:69690963 | 5 | 69,690,963 | Sobic.005G210500 | 20.74 | similar to ATP-dependent DNA helicase, RecQ family protein, expressed | superpathway of leucine, valine, and isoleucine biosynthesis |
| Leu/BCAA | 6:54237869 | 6 | 54,237,869 | Sobic.006G187900 | −0.29 | similar to Acc synthase | phenylalanine degradation III |
| Leu/BCAA | 6:54237869 | 6 | 54,237,869 | Sobic.006G187900 | −0.29 | similar to Acc synthase | tyrosine degradation I |
| Tyr/total | 7:60330803 | 7 | 60,330,803 | Sobic.007G168200 | −14.06 | similar to Peptidyl-prolyl cis-trans isomerase | phenylalanine degradation III |
| Tyr/total | 7:60330803 | 7 | 60,330,803 | Sobic.007G168200 | −14.06 | similar to Peptidyl-prolyl cis-trans isomerase | tyrosine degradation I |
| Leu/Pyruvate | 8:1074094 | 8 | 1,074,094 | Sobic.008G012400 | −27.01 | similar to Os11g0142500 protein | superpathway of leucine, valine, and isoleucine biosynthesis |
| Ala/total | 8:51569085 | 8 | 51,569,085 | Sobic.008G111100 | 1.99 | Predicted transporter (major facilitator superfamily) | superpathway of lysine, threonine and methionine biosynthesis II |
| Leu/Pyruvate | 8:52368227 | 8 | 52,368,227 | Sobic.008G114900 | 18.62 | similar to Rac GTPase activating protein 3, putative, expressed | superpathway of lysine, threonine and methionine biosynthesis II |
| Leu/Pyruvate | 8:59438201 | 8 | 59,438,201 | Sobic.008G160700 | −28.52 | similar to Methylcrotonoyl-CoA carboxylase subunit alpha, mitochondrial precursor | leucine degradation I |
| Glu/Glutamate family | 8:5993722 | 8 | 5,993,722 | Sobic.008G057500 | 1.10 | similar to Aldehyde dehydrogenase family protein | arginine degradation I (arginase pathway) |
| Pro/Glu family | 8:5993722 | 8 | 5,993,722 | Sobic.008G057500 | 1.10 | similar to Aldehyde dehydrogenase family protein | arginine degradation I (arginase pathway) |
| His/total | 10:6862967 | 10 | 6,862,967 | Sobic.010G080300 | −16.74 | similar to Putative aminoacylase | superpathway of lysine, threonine and methionine biosynthesis II |
| Cys | 10:8489698 | 10 | 8,489,698 | Sobic.010G094900 | 9.71 | similar to Putative uncharacterized protein | Tryptophan degradation III (eukaryotic) |
| Cys/total | 10:8489698 | 10 | 8,489,698 | Sobic.010G094900 | 9.71 | similar to Putative uncharacterized protein | Tryptophan degradation III (eukaryotic) |
| Val/BCAA | 10:55465480 | 10 | 55,465,480 | Sobic.010G212000 | 25.56 | similar to Putative uncharacterized protein | arginine degradation I (arginase pathway) |
| Val/BCAA | 10:55465480 | 10 | 55,465,480 | Sobic.010G212000 | 25.56 | similar to Putative uncharacterized protein | proline degradation I |
| Val/BCAA | 10:55465480 | 10 | 55,465,480 | Sobic.010G212000 | 25.56 | similar to Putative uncharacterized protein | proline degradation II |
| Val/BCAA | 10:55465480 | 10 | 55,465,480 | Sobic.010G212000 | 25.56 | similar to Putative uncharacterized protein | valine degradation I |
a Physical position in base pairs for the peak SNP according to v3.1 of the sorghum genome
b Distance of the gene from the significant SNP
c Biosynthesis or degradation pathway of the candidate gene as curated from the Gramene pathway tool [38]
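Several SNPs in the table recur under more than one trait. A minimal sketch of how such shared loci could be pulled out is given below; it groups (trait, SNP) pairs by SNP and keeps those linked to at least two distinct traits. The pairs are a subset copied from the table, the function names are hypothetical, and the result is not claimed to correspond to the hotspots marked in Fig. 5.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (trait, SNP) pairs: a subset of rows copied from the table above.
associations = [
    ("Ile/BCAA", "1:69010559"),
    ("Phe/Shikimate family", "1:69010559"),
    ("Tyr/Shikimate", "1:69010559"),
    ("Lys", "2:13818293"),
    ("Leu/Pyruvate family", "4:65472831"),
    ("Val/BCAA", "4:65472831"),
    ("Glu/Glutamate family", "5:3605534"),
    ("Pro", "5:3605534"),
    ("Pro/Glutamate family", "5:3605534"),
    ("Ile/BCAA", "5:67881473"),
    ("Val/Pyruvate", "5:67881473"),
    ("Val/BCAA", "5:67881473"),
    ("Val/total", "5:67881473"),
]


def multi_trait_snps(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Map each SNP linked to two or more distinct traits to its sorted trait list."""
    grouped = defaultdict(set)
    for trait, snp in pairs:
        grouped[snp].add(trait)          # sets ignore repeated trait/SNP rows
    return {snp: sorted(traits) for snp, traits in grouped.items() if len(traits) > 1}


shared = multi_trait_snps(associations)
for snp, traits in shared.items():
    print(snp, len(traits), traits)
```

Using a set per SNP matters here because the table repeats a trait and SNP once per pathway, and those repeats should not inflate the trait count.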
Fig. 6 Biosynthesis of aspartate family and branched-chain amino acids. The blue and black arrows represent the aspartate family and branched-chain amino acid pathways, respectively. The candidate genes identified in this GWAS are shown in red text and surrounded by a textbox with broken red lines. AK, Aspartokinase; AK-HSDH, Aspartate kinase-homoserine dehydrogenase; ALS, Acetolactate synthase; ASD, Aspartate semialdehyde dehydrogenase; BCAT, branched-chain aminotransferases; CBL, cystathionine β-lyase; CGS, cystathionine γ-synthase; DAPAT, diaminopimelate aminotransferase; DAPDC, diaminopimelate decarboxylase; DAPE, diaminopimelate epimerase; DHAD, dihydroxyacid dehydratase; DHDPR, dihydrodipicolinate reductase; HMT, homocysteine S-methyltransferase; HSK, homo-Ser kinase; IPMDH, isopropylmalate dehydrogenase; IPMI, isopropylmalate isomerase; KARI, ketol-acid reductoisomerase; MS, Methionine synthase; TD, Threonine deaminase; TS, Threonine synthase