N Sathyanarayana, Ranjith Kumar Pittala, Pankaj Kumar Tripathi, Ratan Chopra, Heikham Russiachand Singh, Vikas Belamkar, Pardeep Kumar Bhardwaj, Jeff J Doyle, Ashley N Egan.
Abstract
BACKGROUND: The medicinal legume Mucuna pruriens (L.) DC. has attracted attention worldwide as a source of the anti-Parkinson's drug L-Dopa. It is also a popular green manure cover crop that offers many agronomic benefits including high protein content, nitrogen fixation and soil nutrients. The plant currently lacks genomic resources and there is limited knowledge on gene expression, metabolic pathways, and genetics of secondary metabolite production. Here, we present transcriptomic resources for M. pruriens, including a de novo transcriptome assembly and annotation, as well as differential transcript expression analyses between root, leaf, and pod tissues. We also develop microsatellite markers and analyze genetic diversity and population structure within a set of Indian germplasm accessions.
Keywords: Differential gene expression; EST-SSRs; Fabaceae; Leguminosae; Mucuna pruriens; Population structure; Transcriptomics; Velvet bean
Year: 2017 PMID: 28545396 PMCID: PMC5445377 DOI: 10.1186/s12864-017-3780-9
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Details of accessions used for EST-SSR validation
| Sample Number | Accession number/Collector ID | Variety | Latitude (N) | Longitude (E) | Altitude (AMSL) (m) | State of Origin |
|---|---|---|---|---|---|---|
| 1 | 500101KA* | var | 13°14′ | 77°62′ | 911 | Karnataka |
| 2 | IC0620620** | var | 13°14′ | 77°62′ | 911 | Karnataka |
| 3 | IC0620622** | var | 20°00′ | 73°77′ | 745 | Maharashtra |
| 4 | 500120TN* | var | 9°55′ | 78°07′ | 138 | Tamil Nadu |
| 5 | IC0620624** | var | 14°48′ | 74°12′ | 7 | Karnataka |
| 6 | 500136TN* | var | 10°04′ | 77°45′ | 298 | Tamil Nadu |
| 7 | 500147AP* | var | 18°39′ | 78°10′ | 383 | Telangana |
| 8 | 500154AP* | var | 16°04′ | 78°52′ | 434 | Andhra Pradesh |
| 9 | 500186MH* | var | 19°09′ | 77°27′ | 373 | Maharashtra |
| 10 | 500192OR* | var | 20°18′ | 85°62′ | 63 | Odisha |
| 11 | 500193OR* | var | 21°94′ | 86°72′ | 51 | Odisha |
| 12 | 500194OR* | var | 21°94′ | 86°72′ | 51 | Odisha |
| 13 | 500195OR* | var | 21°63′ | 85°58′ | 650 | Odisha |
| 14 | 500196OR* | var | 20°47′ | 85°12′ | 121 | Odisha |
| 15 | 500197WB* | var | 26°71′ | 88°43′ | 125 | West Bengal |
| 16 | 500199WB* | var | 26°70′ | 88°80′ | 65 | West Bengal |
| 17 | 500202TN* | var | 12°57′ | 79°56′ | 43 | Tamil Nadu |
| 18 | 500210MN* | var | 25°41′ | 94°47′ | 782 | Manipur |
| 19 | 500211NL* | var | 25°67′ | 94°12′ | 1333 | Nagaland |
| 20 | 500212AS* | var | 26°11′ | 91°44′ | 61 | Assam |
| 21 | 500217MN* | var | 25°68′ | 93°03′ | 776 | Manipur |
| 22 | 500219TR* | var | 23°50′ | 91°25′ | 64 | Tripura |
| 23 | 500221AR* | var | 27°08′ | 93°40′ | 1035 | Arunachal Pradesh |
| 24 | 500224AR* | var | 27°08′ | 93°40′ | 296 | Arunachal Pradesh |
| 25 | 500267NL* | var | 25°68′ | 94°08′ | 1360 | Nagaland |
*Collector’s ID of newly collected accessions; **National genebank ID
Fig. 1 Map depicting collection locations of Mucuna pruriens used in this study
Summary of data generated for the Mucuna pruriens transcriptome. G1 is Mucuna pruriens var. utilis (IC0620620; collector’s ID: 500108KA); G2 is M. pruriens var. pruriens (IC0620622; collector’s ID: 500113MH)
| Sample | fastq file size (GB) | Total number of paired end reads | Total number of reads after quality filtering |
|---|---|---|---|
| G1 Leaf | 1.86 | 19,406,426 | 18,997,424 |
| G1 Pod | 5.42 | 58,585,008 | 57,166,422 |
| G1 Root | 2.69 | 28,623,354 | 28,046,508 |
| G1 Pooled | 5.68 | 61,341,664 | 59,885,295 |
| G2 Pooled | 2.59 | 27,801,324 | 27,137,593 |
| Total | 18.24 | 195,757,776 | 191,233,242 |
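The share of reads that survived quality filtering can be derived directly from the table above. The following is a minimal sketch of that arithmetic; the dictionary layout and the `retention` helper are illustrative names, not part of the original analysis.

```python
# Read counts copied from the table above:
# sample -> (raw paired-end reads, reads after quality filtering).
reads = {
    "G1 Leaf": (19_406_426, 18_997_424),
    "G1 Pod": (58_585_008, 57_166_422),
    "G1 Root": (28_623_354, 28_046_508),
    "G1 Pooled": (61_341_664, 59_885_295),
    "G2 Pooled": (27_801_324, 27_137_593),
}


def retention(raw, kept):
    """Percentage of reads retained after quality filtering."""
    return 100.0 * kept / raw


# Column totals, which should reproduce the "Total" row of the table.
total_raw = sum(raw for raw, _ in reads.values())
total_kept = sum(kept for _, kept in reads.values())

for sample, (raw, kept) in reads.items():
    print(f"{sample:10s} {retention(raw, kept):6.2f}% retained")
print(f"{'Total':10s} {retention(total_raw, total_kept):6.2f}% retained")
```

The summed counts match the "Total" row, and roughly 98% of reads are retained overall.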
Statistics of non-redundant set of Mucuna pruriens transcripts obtained from Trinity assembly
| Statistic | Value |
|---|---|
| Total number of assembled bases | 46,525,999 |
| Number of transcripts | 72,561 |
| The total number of transcripts after clustering | 67,561 |
| The mean sequence length | 626 |
| Average % of N | 0.00 |
| Average % of GC content | 44.58 |
| N50 | 987 |
| Maximum transcript length | 17,978 |
| Average transcript length | 641 |
| Number of putative non coding sequences | 1,493 |
| Length of the longest ORF (bp) | 2,362 |
| Number of ORFs ≥ 100 bp | 36,228 |
| Number of ORFs on plus (+) strand | 36,421 |
| Number of ORFs on minus (-) strand | 31,140 |
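For readers unfamiliar with the N50 statistic in the table above, the sketch below shows the standard definition: the length L such that transcripts of length L or longer contain at least half of all assembled bases. The function name and the toy lengths are illustrative assumptions; the assembler's own statistics script may differ in detail.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    of the sequence at which the running total first reaches half of the
    summed length of all sequences.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (hypothetical lengths, not data from the study).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

Under this definition, the reported N50 of 987 means that transcripts of at least 987 bp account for at least half of the assembled bases.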
Fig. 2 Functional annotation of Mucuna pruriens transcripts. Gene ontology term assignments to transcripts in different categories of (a) biological process, (b) cellular component, and (c) molecular function. Numbers are percentages for each function within each major category
Fig. 3 Distribution of Mucuna pruriens transcripts in different transcription factor families
Number of transcripts encoding transcription factor families in Mucuna pruriens compared with other legumes. Data for M. pruriens are from this study; data for soybean, Medicago, and Lotus are from Libault et al. [69]; data for chickpea are from Garg et al. [70]
| TF family | M. pruriens | Chickpea | Soybean | Medicago | Lotus |
|---|---|---|---|---|---|
| bHLH | 227 | 488 | 393 | 71 | 64 |
| AUX/IAA-ARF | 64 | 216 | 129 | 24 | 36 |
| C2C2-CO-like | 16 | 15 | 72 | 15 | 21 |
| C2C2-GATA | 44 | 49 | 62 | 29 | 16 |
| C2C2-YABBY | 13 | 8 | 18 | 6 | 4 |
| C3H | 93 | 594 | 147 | 41 | 50 |
| CAMTA | 18 | 26 | 15 | 6 | 4 |
| MYB | 146 | 528 | 791 | 171 | 191 |
| PHD | 10 | 489 | 222 | 45 | 47 |
Fig. 4 Simple sequence repeat length distribution across different motif classes in the Mucuna pruriens transcriptome
Statistics of SSRs identified in Mucuna pruriens transcripts
| SSRs mining | |
|---|---|
| Total number of sequences examined | 67,561 |
| Total size of examined sequences (bp) | 42,340,968 |
| Total number of identified SSRs | 7,943 |
| Number of SSR containing sequences | 6,284 (9.3%) |
| Number of sequences containing more than one SSR | 1,174 |
| Number of SSRs present in compound formation | 963 |
| Frequency of SSRs | One per 5.3 kb |
| Distribution of SSRs in different repeat types | |
| Mono-nucleotide | 3,638 (45.80%) |
| Di-nucleotide | 1,674 (21.07%) |
| Tri-nucleotide | 2,240 (28.20%) |
| Tetra-nucleotide | 146 (1.83%) |
| Penta-nucleotide | 64 (0.80%) |
| Hexa-nucleotide | 100 (1.25%) |
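The derived figures in the table above (SSR frequency, share of SSR-containing sequences, and per-class percentages) follow from the raw counts. The sketch below reproduces that arithmetic; the variable names are illustrative only.

```python
# Counts copied from the SSR statistics table above.
total_sequences = 67_561
total_bp = 42_340_968
total_ssrs = 7_943
ssr_sequences = 6_284

motif_counts = {
    "Mono-nucleotide": 3_638,
    "Di-nucleotide": 1_674,
    "Tri-nucleotide": 2_240,
    "Tetra-nucleotide": 146,
    "Penta-nucleotide": 64,
    "Hexa-nucleotide": 100,
}

# Average spacing between SSRs, in kilobases of examined sequence.
kb_per_ssr = total_bp / total_ssrs / 1000

# Share of examined sequences that carry at least one SSR.
pct_sequences_with_ssr = 100 * ssr_sequences / total_sequences

# Share of all identified SSRs that fall in each motif class.
motif_pct = {name: 100 * n / total_ssrs for name, n in motif_counts.items()}

print(f"One SSR per {kb_per_ssr:.1f} kb")
print(f"{pct_sequences_with_ssr:.1f}% of sequences contain an SSR")
for name, pct in motif_pct.items():
    print(f"{name:18s} {pct:5.2f}%")
print(f"Sum of listed motif classes: {sum(motif_counts.values()):,}")
```

The six listed class counts sum to 7,862, which is 81 fewer than the 7,943 SSRs reported in total, so the class percentages (taken over all 7,943 SSRs) do not add up to 100%.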
Gene diversity estimates for groups based on botanical varieties, geographical distribution and population structure analysis
| Grouping basis | Population group | Na | Ne | I | h |
|---|---|---|---|---|---|
| Geographical distribution | East India | 2.23 | 1.78 | 0.59 | 0.37 |
| | North East India | 2.21 | 1.68 | 0.52 | 0.36 |
| | Peninsular India | 2.98 | 1.95 | 0.72 | 0.35 |
| | | Ht | Hs | Gst | Nm |
| | Mean | 0.41 | 0.36 | 0.04 | 4.09 |
| | SD (±) | 0.19 | 0.18 | | |
| Botanical varieties | var. | 2.67 | 1.83 | 0.64 | 0.36 |
| | var. | 2.46 | 1.78 | 0.60 | 0.36 |
| | var. | 2.10 | 1.84 | 0.59 | 0.34 |
| | | Ht | Hs | Gst | Nm |
| | Mean | 0.43 | 0.36 | 0.04 | 2.57 |
| | SD (±) | 0.19 | 0.17 | | |
| Population groups based on K = 4 sub grouping | SG1 | 2.54 | 1.91 | 0.66 | 0.19 |
| | SG2 | 1.87 | 1.53 | 0.41 | 0.16 |
| | SG3 | 2.36 | 1.77 | 0.59 | 0.21 |
| | SG4 | 2.10 | 1.84 | 0.60 | 0.18 |
| | | Ht | Hs | Gst | Nm |
| | Mean | 0.41 | 0.34 | 0.04 | 1.83 |
| | SD (±) | 0.19 | 0.17 | | |
Na: number of alleles; Ne: effective number of alleles [41]; I: Shannon information content; h: Nei’s gene diversity [42]
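The four per-locus statistics defined above can be illustrated with their textbook formulas: Ne = 1/Σp², h = 1 − Σp², and I = −Σ p ln p, where p are the allele frequencies at a locus. The sketch below is an assumption-laden illustration, not the software used in the study: it applies no small-sample correction to h, uses the natural logarithm for I, and runs on hypothetical allele counts.

```python
import math


def diversity_stats(allele_counts):
    """Per-locus diversity statistics from observed allele counts.

    Assumes the uncorrected textbook formulas; the program used in the
    study may apply sample-size corrections that this sketch omits.
    """
    total = sum(allele_counts)
    freqs = [count / total for count in allele_counts if count > 0]
    homozygosity = sum(p * p for p in freqs)  # expected homozygosity
    return {
        "Na": len(freqs),                               # observed alleles
        "Ne": 1.0 / homozygosity,                       # effective alleles
        "I": -sum(p * math.log(p) for p in freqs),      # Shannon index
        "h": 1.0 - homozygosity,                        # gene diversity
    }


# Hypothetical locus with three alleles observed 6, 3 and 1 times.
for name, value in diversity_stats([6, 3, 1]).items():
    print(f"{name}: {value:.3f}")
```

Values reported in the table are averages over loci within each group, so a single-locus calculation like this one will not reproduce them.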
Fig. 5 Population structure analysis of the 23 Indian Mucuna pruriens accessions. (a) Bayesian clustering (fastSTRUCTURE, K = 4); (b) Scatter plot from principal component analysis (PCA); (c) Neighbor-joining tree generated for all accessions
Fig. 6 Differential transcript expression in leaf, root, and pod tissues. (a) Diagram showing the overlap of differentially expressed genes between leaf, root, and pod tissues. (b) Pairwise comparisons across tissues showing differentially expressed transcripts. Transcripts above the line are up-regulated and those below are down-regulated within each pairwise comparison. (c) Heat map of secondary metabolite associated differentially expressed genes of leaf, pod, and root transcriptomes. Boxes of similar shade indicate similar tendencies of gene expression. Labels along the right side correspond to transcript names (see Additional file 9)