Omika Thakur, Gursharn Singh Randhawa.
Abstract
BACKGROUND: Guar [Cyamopsis tetragonoloba, L. Taub.] is an important industrial crop because of the commercial applications of the galactomannan gum contained in its seeds. Plant breeding programmes based on marker-assisted selection require a rich resource of molecular markers. As only a limited number of such markers is available for guar, molecular breeding programmes have not been undertaken for the genetic improvement of this important crop. Hence, the present work was done to enrich the molecular marker resource of guar by identifying high-quality SSR, SNP and InDel markers from the RNA-Seq data of the roots of two guar varieties.
Keywords: Insertions and deletions; Marker-assisted selection; Molecular markers; RNA-Seq; Simple sequence repeats; Single nucleotide polymorphisms
Year: 2018 PMID: 30572838 PMCID: PMC6302463 DOI: 10.1186/s12864-018-5205-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
The data summary of de novo transcriptome assembly of guar root
| Sample name | RGC-1066 | M-83 |
|---|---|---|
| No. of raw reads | 29,623,208 | 29,853,028 |
| Raw reads: no. of bases (Mb) | 2962.32 | 2985.3 |
| Raw reads: GC % | 44.89 | 44.435 |
| Raw reads: Q30 (%) | 86.755 | 87.67 |
| Raw reads: read length (bp) | 100 × 2 | 100 × 2 |
| No. of clean reads | 17,305,480 | 17,517,086 |
| Clean reads: no. of bases (Mb) | 1612.14 | 1637.48 |
| Clean reads: GC % | 44.27 | 43.625 |
| Clean reads: Q30 (%) | 97.345 | 97.51 |
| Clean reads: read length (bp) | 100 × 2 | 100 × 2 |
Fig. 1 Functional annotation of guar root transcriptome. a Length distribution. b GC distribution. c Distribution of total sequences obtained from the three-step Blast2GO process (BLASTX, mapping and annotation) of the guar root transcriptome. d BLASTX E-value distribution
Statistics of de novo assembly of root transcriptome of guar
| Characteristic | Details |
|---|---|
| Total number of transcripts | 102,479 |
| Min length | 201 |
| Max length | 16,844 |
| Average length | 1016.62 |
| Standard deviation | 1150.28 |
| Median length | 500 |
| Number of contigs < 500 bp | 51,229 |
| Number of contigs ≥500 bp | 51,250 |
| Number of contigs ≥1000 bp | 32,949 |
| Number of contigs ≥2000 bp | 15,360 |
| Number of contigs ≥5000 bp | 1297 |
| N50 | 1907 |
| Contigs in N50 | 16,574 |
| GC content | 39.82% |
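The N50 and "Contigs in N50" rows above follow the usual assembly-statistics definition: sort contigs from longest to shortest and find the length at which the running total first reaches half of the assembled bases. A minimal Python sketch of that definition is given below; the function name `n50` and the toy lengths are illustrative assumptions, not taken from the authors' pipeline.

```python
from typing import Iterable, Tuple


def n50(lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, number of contigs needed to reach it).

    N50 is the length of the contig at which the cumulative length,
    taken from the longest contig downwards, first reaches at least
    half of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise AssertionError("unreachable: running total always reaches half")


# Toy contig lengths (hypothetical, not the guar assembly).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50, toy_contigs_in_n50 = n50(toy_lengths)
print(toy_n50, toy_contigs_in_n50)
```

Applied to the full list of transcript lengths from an assembly FASTA file, the same function would give values comparable to the N50 and "Contigs in N50" rows of the table.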
Fig. 2 Similarity distributions and enzyme code distributions of assembled unigenes of the root transcriptome of guar varieties RGC-1066 and M-83. a Similarity distributions of the top BLAST hits for each sequence against the Nr database. b Enzyme code distributions
Fig. 3 Distribution of biological process, molecular function and cellular component categories in unigenes of guar root transcriptome
Repeat numbers of different SSRs obtained from the guar root transcriptome
| Repeat type | ≤5 repeats | 6 to 9 repeats | ≥10 repeats | Total number of repeats |
|---|---|---|---|---|
| Mononucleotides | – | – | 9983 | 9983 |
| Dinucleotides | – | 1732 | 1723 | 3455 |
| Trinucleotides | 2050 | 1334 | 530 | 3914 |
| Tetranucleotides | 201 | 172 | – | 373 |
| Pentanucleotides | 67 | 3 | – | 70 |
| Hexanucleotides | 50 | 8 | – | 58 |
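To make the categories in the table above concrete, the following Python sketch finds perfect SSRs in a sequence with regular expressions and assigns each hit to one of the table's repeat-number classes. The minimum repeat thresholds (10 for mono-, 6 for di-, 5 for tri- to hexanucleotides) are an assumption chosen to be consistent with the populated cells of the table; the source does not state the tool or settings used, and the function names are hypothetical.

```python
import re

# ASSUMED minimum repeat counts per motif length (not confirmed by the source).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq):
    """Return a list of (motif, repeat_count, start) for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mono) or "ATAT" (di), to avoid double counting.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((motif, len(match.group(0)) // unit_len, match.start()))
    return hits


def size_class(repeat_count):
    """Map a repeat count onto the three column classes of the table."""
    if repeat_count <= 5:
        return "≤5"
    if repeat_count <= 9:
        return "6 to 9"
    return "≥10"


example = "GGG" + "AT" * 7 + "CCC"  # toy sequence, not guar data
for motif, count, start in find_ssrs(example):
    print(motif, count, start, size_class(count))
```

Only perfect repeats are detected here; compound or interrupted SSRs would need an additional merging step over neighbouring hits.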
Percentage distribution frequencies of SSR motif repeats in guar root transcriptome
| Repeat number | Mono- | Di- | Tri- | Tetra- | Penta- | Hexa- | c | c* |
|---|---|---|---|---|---|---|---|---|
| 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 5 | 0 | 0 | 11 | 1 | 0.3 | 0.2 | 0 | 0 |
| 6 | 0 | 5 | 5 | 0.4 | 0.01 | 0.01 | 0 | 0 |
| 7 | 0 | 3 | 2 | 0.1 | 0 | 0 | 2 | 0 |
| 8 | 0 | 1 | 2 | 0.01 | 0 | 0 | 1 | 0 |
| 9 | 0 | 1 | 1 | 0 | 0 | 0 | 0.1 | 0 |
| 10 | 14 | 0.4 | 0.1 | 0 | 0 | 0 | 0 | 0 |
| 11 | 7 | 0.4 | 0.4 | 0 | 0 | 0 | 0 | 0 |
| 12 | 4 | 0.05 | 0.2 | 0 | 0 | 0 | 0 | 0 |
| 13 | 3 | 0.05 | 0.1 | 0 | 0 | 0 | 0 | 0.1 |
| 14 | 2 | 0.06 | 0.01 | 0 | 0 | 0 | 0 | 0.01 |
| 15 | 2 | 0.03 | 0.1 | 0 | 0 | 0 | 0 | 0 |
| 16 | 1 | 0.01 | 0.1 | 0 | 0 | 0 | 0 | 0 |
| 17 | 1 | 0 | 0.01 | 0 | 0 | 0 | 0 | 0 |
| 18 | 0.4 | 0.02 | 0 | 0 | 0 | 0 | 0 | 0 |
| 19 | 0.2 | 0.01 | 0 | 0 | 0 | 0 | 0 | 0 |
| 20 | 0.1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| ≥20 | 0.6 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
Fig. 4 Number and frequency distributions of nucleotide-level variants in root transcriptomes of RGC-1066 and M-83 guar varieties. a SNP distribution at read depths (RD) 2, 5 and 10. b InDel distribution at read depths (RD) 2, 5 and 10. c SNP distribution in the UniProt-annotated unigenes. d Frequency distribution of SNPs in various gene functions of unigenes
Transition vs transversion mutation rate, type and ratio determined from the transcriptomes of RGC-1066 and M-83 varieties of guar
| Transition type | Occurrence/Number (a) | Percent occurrence (%) |
|---|---|---|
| A/G | 865 | 17.8 |
| C/T | 585 | 12.04 |
| G/A | 673 | 13.9 |
| T/C | 910 | 18.7 |
| Transversion type | Occurrence/Number (a) | Percent occurrence (%) |
| A/C | 206 | 4.2 |
| A/T | 297 | 6.1 |
| C/G | 202 | 4.1 |
| G/T | 214 | 4.4 |
| C/A | 232 | 4.8 |
| T/A | 275 | 5.6 |
| G/C | 140 | 2.9 |
| T/G | 237 | 4.9 |
| Transition/Transversion ratio: 1.72 | | |
(a) Out of 4907 SNPs
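The transition/transversion split used in this table (and the TT/TV labels in the galactomannan SNP table) follows the standard rule: a substitution within purines (A, G) or within pyrimidines (C, T) is a transition, anything else is a transversion. The Python sketch below applies that rule to allele strings such as `A/G`; the function names are illustrative assumptions. The counts are copied from the rows above, and the ratio they yield need not reproduce the reported value exactly, since the footnote refers to a total of 4907 SNPs.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def snp_type(allele):
    """Classify an allele string like 'A/G' as 'TT' (transition) or 'TV' (transversion)."""
    ref, alt = allele.upper().split("/")
    if ref == alt or not {ref, alt} <= (PURINES | PYRIMIDINES):
        raise ValueError("not a valid single-nucleotide substitution: %r" % allele)
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "TT" if same_class else "TV"


def ts_tv_counts(counts):
    """Sum per-allele counts into (transitions, transversions)."""
    ts = sum(n for allele, n in counts.items() if snp_type(allele) == "TT")
    tv = sum(n for allele, n in counts.items() if snp_type(allele) == "TV")
    return ts, tv


# Per-substitution counts as listed in the table above.
TABLE_COUNTS = {
    "A/G": 865, "C/T": 585, "G/A": 673, "T/C": 910,
    "A/C": 206, "A/T": 297, "C/G": 202, "G/T": 214,
    "C/A": 232, "T/A": 275, "G/C": 140, "T/G": 237,
}

ts, tv = ts_tv_counts(TABLE_COUNTS)
print(ts, tv, round(ts / tv, 2))
```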
Characteristics of the SNPs located in genes involved in root development of guar
| S. No. | SNP name | Gene name | SNP Allele | Position | Gene description | Gene function |
|---|---|---|---|---|---|---|
| 1 | OT3537 | KK1_009388/ LOC100798193 GLYMA_06G069600 | G/T | 594 | cellulose synthase A catalytic subunit 1 [UDP-forming] | Transportation of mannose, nucleotide-sugar transport |
| 2 | OT2789 | -do- | C/A | 898 | -do- | β (1,4) Mannan synthesis |
| 3 | OT2790 | -do- | C/A | 899 | -do- | -do- |
| 4 | OT2791 | -do- | T/C | 917 | -do- | -do- |
| 5 | OT2792 | -do- | A/G | 959 | -do- | -do- |
| 6 | OT2793 | -do- | T/C | 960 | -do- | -do- |
| 7 | OT2794 | -do- | G/T | 1322 | -do- | -do- |
| 8 | OT4846 | LOC100797597 GLYMA_03G217500 | T/A | 344 | cellulose synthase-like protein D5 | -do- |
| 9 | OT1382 | LR48_Vigan02g201500 | T/A | 901 | cellulose synthase A catalytic subunit 2 [UDP-forming]-like | -do- |
| 10 | OT3553 | LOC100789704 GLYMA_12G002500 | A/C | 1528 | receptor-like protein kinase HSL1-like isoform X1 | Regulation of root development |
| 11 | OT3554 | -do- | A/G | 3027 | -do- | -do- |
| 12 | OT3555 | -do- | A/G | 3037 | -do- | -do- |
| 13 | OT3861 | LOC100780184 GLYMA_08G111400 | A/G | 2864 | uncharacterized protein LOC100780184 | Root cell wall biosynthesis |
| 14 | OT3862 | -do- | A/G | 5661 | -do- | -do- |
| 15 | OT3863 | -do- | A/G | 7007 | -do- | -do- |
| 16 | OT4656 | LOC100795671 GLYMA_12G192000 | T/C | 2145 | cellulose synthase-like protein B3-like | Root hair morphogenesis |
Characteristics of SNPs located in guar unigenes involved in biotic and abiotic stress responses
| S. No. | SNP name | Gene name | SNP allele | SNP position | Gene description | Gene function |
|---|---|---|---|---|---|---|
| 1 | OT769 | MTR_2g078070 | A/G | 3008 | NBS-LRR type | Defense response |
| 2 | OT1113 | KK1_009739 | G/T | 272 | Disease resistance protein RGA2 | Late blight ( |
| 3 | OT1114 | -do- | A/G | 428 | -do- | -do- |
| 4 | OT1115 | -do- | A/G | 485 | -do- | -do- |
| 5 | OT1116 | -do- | A/C | 1845 | -do- | -do- |
| 6 | OT1161 | MTR_3g035960 | A/T | 978 | NBS-LRR type disease resistance protein | R (Resistance) genes, plant defense |
| 7 | OT1162 | -do- | T/C | 1196 | -do- | -do- |
| 8 | OT1163 | -do- | A/G | 2357 | -do- | -do- |
| 9 | OT1431 | KK1_033834 | C/T | 493 | Disease resistance RPP13-like protein 1 | Plant hypersensitive response, disease resistance |
| 10 | OT1432 | -do- | A/G | 528 | -do- | -do- |
| 11 | OT1433 | -do- | T/G | 548 | -do- | -do- |
| 12 | OT1434 | -do- | C/A | 549 | -do- | -do- |
| 13 | OT1435 | -do- | T/G | 550 | -do- | -do- |
| 14 | OT1436 | -do- | T/A | 593 | -do- | -do- |
| 15 | OT1437 | -do- | G/A | 1323 | -do- | -do- |
| 16 | OT1881 | MTR_3g466750 | C/G | 1389 | NB-ARC domain disease resistance protein | Defense response |
| 17 | OT1916 | MTR_3g020890 | A/G | 290 | NB-LRR type disease resistance protein Rps1-k-2 | |
| 18 | OT2923 | KK1_021617 | | 588 | S-norcoclaurine synthase | Pathogen related (PR) 10 protein, defense response |
| 19 | OT2924 | -do- | C/G | 648 | -do- | -do- |
| 20 | OT3720 | MTR_2g020040 | G/A | 876 | RING finger and CHY zinc finger domain-containing protein | Negative regulation of p53 |
| 21 | OT67 | KK1_005680 | C/T | 228 | KK1_005680 | Response to freezing |
| 22 | OT458 | TCM_044345 | G/A | 2522 | dnaJ protein homolog | Response to heat stress |
| 23 | OT543 | KK1_034657 | T/A | 718 | Heat shock factor protein HSF24 | Response to heat stress |
| 24 | OT3934 | LOC100818430 GLYMA_13G105700 | A/T | 218 | heat stress transcription factor A-2-like | Response to heat stress |
| 25 | OT3935 | -do- | G/T | 590 | -do- | -do- |
| 26 | OT1691 | KK1_001069 | A/T | 79 | Heat stress transcription factor A-5 | Response to heat stress |
| 27 | OT1738 | LOC100820566 GLYMA_13G225700 | C/T | 146 | Heat stress transcription factor A-4a-like | Response to heat stress |
Characteristics of the SNPs located in genes involved in galactomannan synthesis in guar
| S. No. | SNP name | SNP allele | Type of SNP | Position | Gene name | Gene description | Function |
|---|---|---|---|---|---|---|---|
| 1 | OT697 | T/A | TV | 1856 | LOC106775241 | PREDICTED: callose synthase 11-like [ | Control delivery of UDP-Glc to the synthase |
| 2 | OT698 | A/G | TT | 4865 | -do- | -do- | -do- |
| 3 | OT1715 | C/G | TV | 5925 | glysoja_027079 | Callose synthase 3 [ | -do- |
| 4 | OT4075 | A/T | TV | 2011 | KK1_031323 | Callose synthase 3 [ | -do- |
| 5 | OT4076 | C/G | TV | 3624 | -do- | -do- | -do- |
| 6 | OT4342 | T/C | TT | 6780 | LOC100787540 GLYMA_10G295100 | PREDICTED: callose synthase 9-like isoform 1 [ | -do- |
| 7 | OT1366 | T/A | TV | 901 | LR48_Vigan02g201500 | PREDICTED: cellulose synthase A catalytic subunit 2 [UDP-forming]-like [ | -do- |
| 8 | OT2742 | C/A | TV | 898 | KK1_009388 | Cellulose synthase A catalytic subunit 1 [UDP-forming] [ | Mannan biosynthesis |
| 9 | OT2743 | C/A | TV | 899 | -do- | -do- | -do- |
| 10 | OT2744 | T/C | TT | 917 | -do- | -do- | -do- |
| 11 | OT2745 | A/G | TT | 959 | -do- | -do- | -do- |
| 12 | OT2746 | T/C | TT | 960 | -do- | -do- | -do- |
| 13 | OT2747 | G/T | TV | 1322 | -do- | -do- | -do- |
| 14 | OT3490 | G/T | TV | 594 | LOC100798193 GLYMA_06G069600 | hypothetical protein GLYMA_06G069600 [Glycine max] | -do- |
Abbreviations: TT Transition, TV Transversion, UDP-Glc Uridine diphosphate glucose
Fig. 5 Distribution of differentially expressed genes in the roots of RGC-1066 and M-83 varieties of guar. a FPKM plot. b MA plot (log2 fold change vs base mean). c Volcano plot (−log10 p-value vs log2 fold change). d Pie plot of padj values. e Pie plot of p-values. f Bar plot of log2 fold change of upregulated, downregulated and non-significant unigenes
Fig. 6 Heat-map representation of highly upregulated and downregulated differentially expressed genes in the roots of RGC-1066 and M-83 varieties of guar
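The quantities plotted in Fig. 5 (log2 fold change, −log10 p-value, and the upregulated/downregulated/non-significant classes) can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the pseudocount, the cutoffs (|log2 fold change| ≥ 1, adjusted p < 0.05), the direction of comparison and the function names are hypothetical and are not taken from the source, which does not state the thresholds used.

```python
import math


def log2_fold_change(expr_reference, expr_test, pseudocount=1.0):
    """log2 ratio of test to reference expression; the pseudocount avoids division by zero."""
    return math.log2((expr_test + pseudocount) / (expr_reference + pseudocount))


def volcano_point(log2fc, pval):
    """Coordinates of one gene on a volcano plot: (log2 fold change, -log10 p-value)."""
    return log2fc, -math.log10(pval)


def classify(log2fc, padj, lfc_cutoff=1.0, alpha=0.05):
    """Label a gene using ASSUMED cutoffs on fold change and adjusted p-value."""
    if padj is None or padj >= alpha or abs(log2fc) < lfc_cutoff:
        return "non-significant"
    return "upregulated" if log2fc > 0 else "downregulated"


# Toy expression values (hypothetical, not from the guar data).
lfc = log2_fold_change(expr_reference=1.0, expr_test=7.0)
print(lfc, classify(lfc, padj=0.01), volcano_point(lfc, 0.001))
```

In practice these values would come from a dedicated differential-expression package rather than from a hand-computed ratio; the sketch only shows how the plotted axes and the three gene classes relate to each other.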