| Literature DB >> 30549178 |
Roslen Anacleto1, Saurabh Badoni1, Sabiha Parween1, Vito M Butardo1,2, Gopal Misra1, Rosa Paula Cuevas1, Markus Kuhlmann3, Trinidad P Trinidad4, Aida C Mallillin4, Cecilia Acuin1, Anthony R Bird5, Matthew K Morell1, Nese Sreenivasulu1.
Abstract
Reliably generating rice varieties with low glycaemic index (GI) is an important nutritional intervention given the high rates of Type II diabetes incidences in Asia where rice is staple diet. We integrated a genome-wide association study (GWAS) with a transcriptome-wide association study (TWAS) to determine the genetic basis of the GI in rice. GWAS utilized 305 re-sequenced diverse indica panel comprising ~2.4 million single nucleotide polymorphisms (SNPs) enriched in genic regions. A novel association signal was detected at a synonymous SNP in exon 2 of LOC_Os05g03600 for intermediate-to-high GI phenotypic variation. Another major hotspot region was predicted for contributing intermediate-to-high GI variation, involves 26 genes on chromosome 6 (GI6.1). These set of genes included GBSSI, two hydrolase genes, genes involved in signalling and chromatin modification. The TWAS and methylome sequencing data revealed cis-acting functionally relevant genetic variants with differential methylation patterns in the hot spot GI6.1 region, narrowing the target to 13 genes. Conversely, the promoter region of GBSSI and its alternative splicing allele (G allele of Wxa ) explained the intermediate-to-high GI variation. A SNP (C˃T) at exon-10 was also highlighted in the preceding analyses to influence final viscosity (FV), which is independent of amylose content/GI. The low GI line with GC haplotype confirmed soft texture, while other two low GI lines with GT haplotype were characterized as hard and cohesive. The low GI lines were further confirmed through clinical in vivo studies. Gene regulatory network analysis highlighted the role of the non-starch polysaccharide pathway in lowering GI.Entities:
Keywords: final viscosity; gene-regulatory network analysis; genome-wide association study; glycaemic index; transcriptome-wide association study; whole-genome bisulfite sequencing
Mesh:
Year: 2019 PMID: 30549178 PMCID: PMC6575982 DOI: 10.1111/pbi.13051
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
Figure 1Correlation among in vitro glycemic index (GI), eating (amylose content; AC) and cooking (final viscosity; FV) qualities of 305 rice germplasm lines. The scatter plots showed (a) the weak correlation of AC with GI (Pearson's r 2 = 0.2809), (b) practically no meaningful relationship between GI and FV and (c) correlation between AC and FV with r 2 = 0.30. Genotypes that had low GI values (GI ≤ 55) were shown as red dots whereas lines with GI value between 56 and 60 (intermediate GI) were randomly selected and plotted as blue dots. All three data sets (GI, AC and FV) were transformed into a distribution with normally distributed residuals that had mean 0 and unit variance prior to genetic analysis without regard to their raw data distribution. (d and e) represent the plots for in vivo GI evaluation of two landraces GQ02522 and GQ02497 identified to be low GI based on cohort studies of at least 10 human subjects with reference to control (glucose solution, blue line). (f) Overlaid sensory and texture profiling of three low GI lines using 14 textural attributes representing the characteristic variations in the radar graph. Fourteen attributes in case of the three accessions were evaluated in comparison to ‘Dinorado’ a popular rice variety in the Philippines (depicted by grey areas, wideness of which corresponds to the differences between the profiles generated in the two sessions of sensory test).
Figure 2Genome‐wide association study of glycemic index and transcription‐wide association study (TWAS) highlighting the expression level of . (a) Manhattan plot shows significant association signals in chromosomes 5 and 6 indicated by the red line at –log(p) = 7.7 that represents the Bonferroni‐corrected P‐value significance threshold. The blue line at –log(p) = 4.8 is the suggestive line represented by the association P‐value that corresponds to the q‐value<0.5 (false discovery rate). A C→T synonymous SNP (1 525 361 bp) in exon 2 of LOC_Os05g03600 (encodes a signalling receptor kinase) is a significant SNP in chromosome 5. In chromosome 6, we found a 230 kb region (1.60–1.83 Mb) that we refer to in this manuscript as GI6.1. (b) Zoomed‐in GI6.1 region, where multiple significant SNPs have implicated within a total of 26 genes that have been validated in the gene‐set analysis, distributed in 19 different linkage disequilibrium (LD) blocks. (c) Manhattan plot of the transcription‐wide association study that highlighted the expression level of . The red and blue lines were as described in (a) with the exception of the suggestive line being a little bit higher at 4.9 due mainly to the adjustment made for false positives. (d) A plot of TWAS with –log10 P‐values in the similar GI6.1 region, based on the expression QTL (eQTL) analysis, narrowing down to 13 candidates genes in the GI6.1 region.
Genes associated with glycaemic index identified by genome‐wide association study and gene set analysis
| Gene ID | Chr | Block | Start | End | Strand | nSNPs |
|
|
| Bonferroni | FDR | Annotation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LOC_Os05g03600 | 5 | 1 | 1 523 847 | 1 529 707 | + | 5 | 303 | 3.7344 | 9.41e‐05 | 2.54e‐03 | 1.02e‐04 | Retrotransposon protein, putative, unclassified, expressed |
| LOC_Os06g03970 | 6 | 1 | 1 605 460 | 1 612 409 | + | 23 | 303 | 5.1343 | 1.42e‐07 | 3.82e‐06 | 2.55e‐07 | Receptor‐like protein kinase 5 precursor, putative, expressed |
| LOC_Os06g03980 | 6 | 1 | 1 613 664 | 1 620 183 | − | 26 | 301 | 5.2933 | 6.01e‐08 | 1.62e‐06 | 1.25e‐07 | Expressed protein |
| LOC_Os06g03990 | 6 | 2 | 1 629 717 | 1 633 629 | + | 10 | 302 | 5.5093 | 1.80e‐08 | 4.86e‐07 | 5.78e‐08 | Aminotransferase, classes I and II, domain containing protein, expressed |
| LOC_Os06g04000 | 6 | 3 | 1 635 096 | 1 636 029 | + | 1 | 299 | 5.3989 | 3.35e‐08 | 9.05e‐07 | 7.65e‐08 | Peptidyl‐prolyl cis‐trans isomerase, putative, expressed |
| LOC_Os06g04010 | 6 | 4 | 1 636 524 | 1 640 883 | − | 17 | 302 | 5.7231 | 5.23e‐09 | 1.41e‐07 | 2.29e‐08 | GAGA‐binding protein, putative, expressed |
| LOC_Os06g04020 | 6 | 4 | 1 642 214 | 1 643 800 | − | 8 | 302 | 4.8677 | 5.65e‐07 | 1.52e‐05 | 8.97e‐07 | Histone H1, putative, expressed |
| LOC_Os06g04030 | 6 | 5 | 1 645 414 | 1 646 326 | − | 5 | 301 | 4.7079 | 1.25e‐06 | 3.38e‐05 | 1.74e‐06 | Histone H3, putative, expressed |
| LOC_Os06g04040 | 6 | 6 | 1 648 595 | 1 654 300 | + | 15 | 302 | 5.4965 | 1.94e‐08 | 5.23e‐07 | 5.78e‐08 | WD domain, G‐beta repeat domain containing protein, expressed |
| LOC_Os06g04060 | 6 | 6 | 1667929 | 1 670 634 | + | 12 | 301 | 5.3964 | 3.40e‐08 | 9.18e‐07 | 7.65e‐08 | Expressed protein |
| LOC_Os06g04070 | 6 | 7 | 1 674 045 | 1 677 406 | − | 5 | 298 | 4.0391 | 2.68e‐05 | 7.24e‐04 | 3.15e‐05 | Pyridoxal‐dependent decarboxylase protein, putative, expressed |
| LOC_Os06g04080 | 6 | 8 | 1 691 885 | 1 694 140 | + | 9 | 299 | 5.1816 | 1.10e‐07 | 2.97e‐06 | 2.12e‐07 | Glycosyl hydrolases family 17, putative, expressed |
| LOC_Os06g04090 | 6 | 9 | 1 699 371 | 1 702 859 | − | 7 | 300 | 4.7765 | 8.92e‐07 | 2.41e‐05 | 1.34e‐06 | No apical meristem protein, putative, expressed |
| LOC_Os06g04120 | 6 | 10 | 1723575 | 1725392 | + | 2 | 302 | 5.0607 | 2.09e‐07 | 5.64e‐06 | 3.52e‐07 | Hypothetical protein |
| LOC_Os06g04130 | 6 | 10 | 1 725 313 | 1 729 996 | − | 7 | 302 | 5.7013 | 5.94e‐09 | 1.61e‐07 | 2.29e‐08 | Lung seven transmembrane domain containing protein, putative, expressed |
| LOC_Os06g04140 | 6 | 10 | 1 731 813 | 1 732 565 | − | 1 | 298 | 4.7024 | 1.29e‐06 | 3.47e‐05 | 1.74e‐06 | Expressed protein |
| LOC_Os06g04150 | 6 | 10 | 1 733 935 | 1 735 135 | − | 5 | 302 | 5.4788 | 2.14e‐08 | 5.78e‐07 | 5.78e‐08 | Magnesium‐protoporphyrin O‐methyltransferase, putative, expressed |
|
| 6 | 11 | 1 736 249 | 1 742 581 | − | 28 | 301 | 5.9279 | 1.53e‐09 | 4.14e‐08 | 1.02e‐08 | Hydrolase, alpha/beta fold family domain containing protein, expressed |
| LOC_Os06g04190 | 6 | 11 | 1 754 351 | 1 760 884 | − | 26 | 301 | 5.8932 | 1.89e‐09 | 5.11e‐08 | 1.02e‐08 | Rad1, putative, expressed |
|
| 6 | 12,13,14 | 1 765 622 | 1 770 656 | + | 33 | 300 | 6.6434 | 1.53e‐11 | 4.14e‐10 | 4.14e‐10 | Starch synthase, putative, expressed |
| LOC_Os06g04220 | 6 | 15 | 1 780 472 | 1 781 052 | + | 7 | 301 | 3.2284 | 6.22e‐04 | 1.68e‐02 | 6.22e‐04 | Expressed protein |
| LOC_Os06g04230 | 6 | 15 | 1 784 311 | 1 784 827 | + | 2 | 299 | 5.9086 | 1.73e‐09 | 4.66e‐08 | 1.02e‐08 | Expressed protein |
| LOC_Os06g04240 | 6 | 15 | 1 785 159 | 1 785 889 | − | 3 | 301 | 5.9713 | 1.18e‐09 | 3.18e‐08 | 1.02e‐08 | Expressed protein |
| LOC_Os06g04250 | 6 | 16 | 1 789 836 | 1 791 394 | + | 3 | 302 | 3.2908 | 4.99e‐04 | 1.35e‐02 | 5.19e‐04 | Phosphate‐induced protein 1 conserved region domain containing protein, expressed |
| LOC_Os06g04280 | 6 | 17 | 1 816 175 | 1 819 804 | + | 8 | 300 | 4.4773 | 3.78e‐06 | 1.02e‐04 | 4.86e‐06 | 3‐phosphoshikimate 1‐carboxyvinyltransferase, chloroplast precursor, putative, expressed |
| LOC_Os06g04300 | 6 | 18 | 1 822 766 | 1 826 383 | − | 6 | 300 | 3.7833 | 7.74e‐05 | 2.09e‐03 | 8.70e‐05 | tRNA 2‐phosphotransferase 1, putative, expressed |
| LOC_Os06g04310 | 6 | 19 | 1 828 169 | 1 831 736 | − | 14 | 298 | 4.2439 | 1.10e‐05 | 2.97e‐04 | 1.35e‐05 | Expressed protein |
These genes were implicated by SNPs that had genome‐wide association P‐values below the Bonferroni‐corrected threshold of 2.066345e‐08 (calculated as 0.05/m, where m = 2 419 731 (SNP markers).
Chr indicates the chromosome number; Block, the linkage block number as calculated from plink2 ‐–blocks within the implicated genomic region; Start and End, the start and end 1‐based integer coordinates of the gene's location within the chromosome; Strand, defined as + (forward) or − (reverse); nSNPs, number of SNPs within the gene; n, number of individuals tested; z‐stat, a probit transformation of the gene's P‐value computed during the gene analysis step; P‐value, calculated from magma; Bonferroni and FDR, Bonferroni‐corrected and False Discovery Rate corrected P‐values, respectively; Annotation, gene annotation reported in Rice Genome Annotation Project Release 7.
Genes associated with the variation in expression levels of the Waxy locus identified by cis‐eQTL analysis.
Italic gene Ids representing the candidate genes validated through SMR analysis.
Figure 3Summary for genomic variation present within the gene and 1 kb promoter region of granule‐bound starch synthase I (). (a) The –log plot of the association P‐values for the glycemic index (GI), shown in the top graph. A red bar indicates that the negative effect of the allele on the trait, while a black bar indicates a positive effect. Bar thickness reflects the relative effect size of the respective allele, as beta value. The middle one is the –log plot of the association P‐values for final viscosity (FV). The third track is the –log plot of the P‐values of the cis‐eQTL analysis done on the expression of using TWAS. The bottom track that shows SNP IDs with their alleles, the placement of the SNPs on the chromosome segment, and the haplotype blocks calculated using Gabriel's algorithm implemented in Haploview 4.2. The two linkage blocks within the gene were formed from the linkage break at the A→C SNP (located at chromosomal position 1 768 006 bp) in . This SNP is situated two base pairs away from the 3′ end of exon 6. (b and c) explained the haplotypes lying within and contributing influence the GI and FV values, respectively. The replacement of T with G allele of the splice junction SNP of intron1 predisposes with converting high to intermediate GI, whereas a conversion of T to C allele in the exon 10 SNP correlated to shifting FV from high to intermediate.
Figure 4Presence of CpG methylation and their correlation with the GI and amylose content (AC) observed in ten resequenced germplasm lines. (a) Representation of the correlations between degree of CpG methylation present in the promoter region of genes underlying GI6.1 hotspot region with the GI and AC; evaluated in three main categories, fully‐unmethylated, partially (10%–90% methylation) and fully methylated (>90% methylated region) categories. Correlation coefficients ranged from −1 (green) to +1 (red). Horizontal red arrow depicts the candidates showing the significant correlations with the respective trait. (b) Graphs showing the degree of the CpG methylation (fully and partially methylation) patterns in the genic region of the key candidates LOC_Os06g04200 linked with AC, (c) LOC_Os06g04070, (d) LOC_Os06g04230 and (e) LOC_Os06g04240, linked with and GI values (c–e).
Figure 5Weighted gene co‐expression analysis among Intermediate and low GI lines. (a) The heat map of differentially expressed genes between the intermediate versus low GI lines (Heatmap scale ranges from min −3 to max +3 normalized expression values which are shown as blue to red colour gradient representing the low and high expression, respectively). (b) Gene dendrogram with two clustered modules blue, turquoise in intermediate versus low; (c, d) Gene co‐expression subnetwork of Blue and Turquoise module, respectively in intermediate versus low. Nodes in each network represent the gene and edges as the interaction, the variation of node size show the different degree of connectivity and bordered wall nodes represents the hub. Nodes are colour coded based on MapMan functional categories.
Figure 6Schematic overview summarizing the predicted regions influencing rice glycaemic index and texture. Overlaying genome‐wide association study with transcriptome‐wide association study results (shown at the bottom left) narrowed the candidate genes down to 13 in the GI6.1 region, and further validated using expression QTL (eQTL) analysis. Four important loci influencing GI trait within GI6.1 region show interesting methylation patterns. The diagnostic haplotypes identified in LOC_Os06g04070, LOC_Os06g04200, LOC_Os06g04230 and LOC_Os06g04240 are critical to lowering the glycaemic index.