Xia Wang, Dijia Chen, Yuqi Wang, Jun Xie.
Abstract
The plant Dioscorea composita has important applications in the medical and energy industries: it is used for the extraction of steroidal sapogenins (important raw materials for the synthesis of steroidal drugs) and for bioethanol production. However, little is known at the genetic level about how sapogenins are biosynthesized in this plant. Using Illumina deep sequencing, 62,341 unigenes were obtained by assembling its transcriptome, and 27,720 unigenes were annotated. Of these, 8,022 unigenes were mapped to 243 specific pathways, and 531 unigenes were identified as being involved in 24 secondary metabolic pathways. In this transcriptome database, 35 enzymes, encoded by 79 unigenes, were related to the biosynthesis of steroidal sapogenins, covering almost all the nodes in the steroidal pathway. Real-time PCR experiments on ten related transcripts (HMGR, MK, SQLE, FPPS, DXS, CAS, HMED, CYP51, DHCR7, and DHCR24) indicated that sapogenins were mainly biosynthesized by the mevalonate pathway. Expression of these ten transcripts was much higher in the tuber and leaves than in the stem, and expression in the shoots was also low. The nucleotide and protein sequences and conserved domains of four related genes (HMGR, CAS, SQS, and SMT1) were highly conserved between D. composita and D. zingiberensis, but expression of these four genes was greater in D. composita. In potato, by contrast, these key enzymes were not expressed and no steroidal sapogenins were synthesized.
Year: 2015 PMID: 25860891 PMCID: PMC4393236 DOI: 10.1371/journal.pone.0124560
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. The structures of some of the steroidal sapogenins in D. composita.
Summary of the transcript statistics generated from D. composita.
| Item | Characteristics |
|---|---|
| Number of paired-end reads | 114,136,043 |
| Transcript number | 131,655 |
| Transcripts ≥ 1000 bp | 62,285 |
| Transcripts ≥ 2000 bp | 31,868 |
| Average length of transcripts (bp) | 1,368 |
| Max length of transcripts (bp) | 25,464 |
| N50 length of transcripts (bp) | 2,303 |
| Unigene number | 62,341 |
| Unigenes ≥ 1000 bp | 14,425 |
| Unigenes ≥ 2000 bp | 5,975 |
| Average length of Unigenes (bp) | 724 |
| Max length of Unigenes (bp) | 22,744 |
| N50 length of Unigenes (bp) | 1,228 |
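
The N50 values in this table follow the usual assembly definition: sort the sequences from longest to shortest and report the length at which the running total first reaches half of all assembled bases. A minimal Python sketch of that calculation is given here. The function name `n50` and the toy lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence down, accumulating bases.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which we cover half of all bases is the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly (not data from the paper): 32 bases in total,
# and the two longest sequences already cover 16 of them.
toy_lengths = [8, 8, 4, 3, 3, 2, 2, 2]
print(n50(toy_lengths))  # 8
```

Because a few long sequences can cover half of the bases, N50 is typically larger than the average length, as it is for both the transcripts and the unigenes here.
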
Summary of the statistics of the CDS-containing sequences.
| Length of CDS (bp) | Number of transcripts | Ratio (%) |
|---|---|---|
| 300–599 | 27,398 | 20.8 |
| 600–899 | 15,368 | 11.7 |
| 900–1199 | 13,797 | 10.5 |
| 1200–1499 | 9,205 | 7.0 |
| ≥ 1500 | 17,346 | 13.2 |
| Total | 83,114 | 63.1 |
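
The ratios in this table appear to be computed against all 131,655 assembled transcripts rather than against the CDS-containing subset, which would explain why the total is 63.1% and not 100%. The short Python check below reproduces the column under that assumption; the variable and function names are illustrative.

```python
# "Transcript number" from the transcript statistics table.
TOTAL_TRANSCRIPTS = 131_655

# CDS length bin (bp) -> number of transcripts, copied from the table.
cds_bins = {
    "300–599": 27_398,
    "600–899": 15_368,
    "900–1199": 13_797,
    "1200–1499": 9_205,
    "≥ 1500": 17_346,
}


def percent_of_total(count, total=TOTAL_TRANSCRIPTS):
    """Percentage of all transcripts, rounded to one decimal place."""
    return round(100 * count / total, 1)


ratios = {bin_label: percent_of_total(n) for bin_label, n in cds_bins.items()}
cds_total = sum(cds_bins.values())

for bin_label, ratio in ratios.items():
    print(bin_label, ratio)
print("Total", cds_total, percent_of_total(cds_total))
```

Under this reading, roughly 37% of the transcripts have no predicted CDS of at least 300 bp.
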
Summary of the functional annotation statistics for the D. composita unigenes in the protein databases.
| Annotated database | Number of unigenes | Percentage (%) |
|---|---|---|
| NR | 22,815 | 36.6 |
| Arabidopsis protein database | 20,257 | 32.5 |
| Rice protein database | 21,017 | 33.7 |
| KEGG | 8,022 | 12.9 |
| Pfam | 20,747 | 33.3 |
| Total | 27,720 | 44.5 |
Fig 2. KEGG classification of the unigenes from RNA-Seq experiments on D. composita.
Key: A, cellular processes; B, environmental information processing; C, genetic information processing; D, metabolism; E, organismal systems.
Fig 3. Unigenes related to secondary metabolism from RNA-Seq experiments on D. composita.
The number of unigenes (NU) potentially related to steroidal sapogenin biosynthesis from RNA-Seq experiments on D. composita, and their Pfam IDs.
| Pathway | Enzyme | NU | Pfam ID |
|---|---|---|---|
| Terpenoid backbone | atoB | 3 | PF00108, PF02803 |
| | HMGS | 8 | PF01154, PF08540, PF08541, PF08545, PF00560 |
| | HMGR | 6 | PF03884, PF00368 |
| | MK | 1 | – |
| | PMK | 1 | PF00288 |
| | MVD | 1 | PF00288 |
| | IDI | 2 | PF00293 |
| | GPPS | 7 | PF00348, PF00420, PF11057, PF04847, PF01690 |
| | FPPS | 1 | PF08091, PF00348 |
| | FNTA | 1 | PF01239 |
| | FACE2 | 1 | PF02517 |
| | STE24 | 1 | PF01435 |
| | FOLK | 1 | PF01254, PF01148 |
| | FLDH | 1 | PF01370, PF04321, PF02719, PF00106, PF01073 |
| | PCME | 1 | PF02230, PF01738, PF07859, PF00326, PF01179 |
| | ICMT | 1 | PF04140 |
| | PCYOX1 | 1 | PF07992, PF01593, PF01266 |
| | DXS | 4 | PF02780, PF00676, PF02775, PF04805, PF07817 |
| | DXR | 1 | PF03447, PF02670, PF08436, PF01118 |
| | CMK | 1 | – |
| | MCS | 1 | PF02542 |
| | HMED | 1 | PF05699, PF04551, PF00293, PF05026, PF00118 |
| | HDR | 1 | PF00643, PF01667, PF01485 |
| Sesquiterpenoid and triterpenoid biosynthesis | SQS | 1 | PF00494 |
| | SQLE | 1 | PF01210, PF02737, PF05834, PF01266, PF08491, PF02558, PF07992, PF00070, PF01134 |
| Steroid biosynthesis | CAS | 1 | PF04159, PF00432, PF07678 |
| | Cycloeucalenol cycloisomerase | 1 | – |
| | Sterol methyl oxidase | 8 | PF04116 |
| | Methyl transferase | 2 | PF01135, PF08003, PF08241, PF08498, PF01209, PF02353, PF05175, PF02322, PF09445 |
| | Carboxylate-dehydrogenase | 2 | PF01370, PF02719, PF04321, PF01073, PF02453 |
| | Cytochrome P450 | 6 | PF00067, PF00283, PF11023, PF01254 |
| | Hydroxysteroid-isomerases | 1 | PF04116, PF01545, PF00172, PF05241 |
| | Sterol reductase | 6 | PF00014, PF01222, PF02326, PF01565, PF00424, PF02148, PF04790, PF06363, PF09088 |
| | Methylsterol monooxygenase | 2 | PF04116 |
| | Sterol desaturase | 1 | PF04116, PF01545, PF00172 |
| Total | 35 | 79 | |
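
The totals row (35 enzymes, 79 unigenes) can be cross-checked by tallying the NU column per pathway. The Python sketch below does so with counts copied from the table; the dictionary layout and variable names are illustrative assumptions.

```python
# Unigene counts (NU) per enzyme, copied from the table and grouped by pathway.
nu_by_pathway = {
    "Terpenoid backbone": {
        "atoB": 3, "HMGS": 8, "HMGR": 6, "MK": 1, "PMK": 1, "MVD": 1,
        "IDI": 2, "GPPS": 7, "FPPS": 1, "FNTA": 1, "FACE2": 1, "STE24": 1,
        "FOLK": 1, "FLDH": 1, "PCME": 1, "ICMT": 1, "PCYOX1": 1, "DXS": 4,
        "DXR": 1, "CMK": 1, "MCS": 1, "HMED": 1, "HDR": 1,
    },
    "Sesquiterpenoid and triterpenoid biosynthesis": {"SQS": 1, "SQLE": 1},
    "Steroid biosynthesis": {
        "CAS": 1, "Cycloeucalenol cycloisomerase": 1,
        "Sterol methyl oxidase": 8, "Methyl transferase": 2,
        "Carboxylate-dehydrogenase": 2, "Cytochrome P450": 6,
        "Hydroxysteroid-isomerases": 1, "Sterol reductase": 6,
        "Methylsterol monooxygenase": 2, "Sterol desaturase": 1,
    },
}

# Per-pathway tallies, then the grand totals reported in the last row.
for pathway, enzymes in nu_by_pathway.items():
    print(pathway, len(enzymes), sum(enzymes.values()))

n_enzymes = sum(len(enzymes) for enzymes in nu_by_pathway.values())
n_unigenes = sum(sum(enzymes.values()) for enzymes in nu_by_pathway.values())
print("Total", n_enzymes, n_unigenes)  # Total 35 79
```

The terpenoid backbone rows account for 23 of the 35 enzymes and 47 of the 79 unigenes.
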
The similarity of unigenes (NU) potentially related to steroidal sapogenin biosynthesis in D. composita with Arabidopsis and rice.
| Enzyme | Similarity ID (Arabidopsis) | Similarity ID (rice) |
|---|---|---|
| atoB | AT5G48230.1, AT5G47720.1 | LOC_Os01g02020.3 |
| HMGS | AT4G11820.1, AT4G11820.2 | LOC_Os03g02710.1, LOC_Os08g43170.1, LOC_Os09g34960.1 |
| HMGR | AT2G17370.1 | LOC_Os09g31970.1 |
| MK | AT5G27450.3 | LOC_Os10g18220.1 |
| PMK | AT1G31910.1 | LOC_Os03g48160.1 |
| MVD | – | – |
| IDI | AT3G02780.1 | LOC_Os07g36190.1 |
| GPPS | AT2G34630.2 | LOC_Os06g46450.1 |
| FPPS | AT5G47770.1 | LOC_Os01g50760.1 |
| FNTA | AT3G59380.1 | LOC_Os09g33930.3 |
| FACE2 | AT2G36305.1 | LOC_Os05g28950.1 |
| STE24 | AT4G01320.1 | LOC_Os02g45650.1 |
| FOLK | AT5G58560.1 | LOC_Os01g61560.1 |
| FLDH | AT4G33360.1 | LOC_Os03g08624.1 |
| PCME | AT5G15860.1 | LOC_Os06g49440.1 |
| ICMT | AT5G23320.1 | LOC_Os04g51380.1 |
| PCYOX1 | AT5G63910.1 | LOC_Os04g59630.1 |
| DXS | AT4G15560.1, AT5G11380.2 | LOC_Os05g33840.1, LOC_Os06g05100.3 |
| DXR | AT5G62790.1 | LOC_Os01g01710.1 |
| CMK | – | – |
| MCS | AT4G25720.2 | LOC_Os06g01410.1 |
| HMED | AT5G60600.1 | LOC_Os02g39160.1 |
| HDR | AT4G34350.1 | LOC_Os03g52170.1 |
| SQS | AT3G59380.1 | LOC_Os09g33930.3 |
| SQLE | AT1G58440.1 | LOC_Os03g12910.1 |
| CAS | AT2G07050.1 | LOC_Os02g04710.1 |
| Sterol methyl oxidase | AT1G07420.1, AT1G07420.2, AT4G22753.1, AT2G29390.3, AT2G29390.4, AT2G29390.2, AT4G12110.1 | LOC_Os11g48020.1, LOC_Os07g01150.3, LOC_Os10g39810.1 |
| Methyl transferase | AT5G13710.2, AT1G20330.1 | LOC_Os07g10600.2, LOC_Os03g04340.1 |
| Carboxylate-dehydrogenase | AT2G43420.1, AT2G26260.1 | LOC_Os09g34090.1, LOC_Os03g29170.1 |
| Cytochrome P450 | AT2G28860.1, AT2G34500.1, AT2G34490.1, AT1G11680.1 | LOC_Os01g11300.1, LOC_Os01g11340.1, LOC_Os01g11270.1, LOC_Os11g32240.1 |
| Hydroxysteroid-isomerases | AT3G02580.1, AT1G20050.1 | LOC_Os01g04260.1, LOC_Os01g01369.1 |
| Sterol reductase | AT3G19820.3, AT3G52940.1, AT1G50430.1 | LOC_Os10g25780.1, LOC_Os10g25780.3, LOC_Os09g39220.1, LOC_Os02g26650.3 |
| Cycloeucalenol cycloisomerase | AT5G50375.1 | LOC_Os11g19700.1 |
| Methylsterol monooxygenase | AT4G12110.1 | LOC_Os10g39810.1 |
| Sterol desaturase | AT3G02580.1 | LOC_Os01g04260.1 |
Fig 4. Putative biosynthetic pathway of steroidal sapogenin in D. composita.
Key: atoB: acetyl-CoA C-acetyltransferase, HMGS: hydroxymethylglutaryl-CoA synthase, HMGR: 3-hydroxy-3-methylglutaryl-CoA reductase, MK: mevalonate kinase, PMK: phosphomevalonate kinase, MVD: diphosphomevalonate decarboxylase, DXS: 1-deoxy-D-xylulose-5-phosphate synthase, DXR: 1-deoxy-D-xylulose-5-phosphate reductoisomerase, CMC: 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase, CMK: 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase, MCS: 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase, HMED: 4-hydroxy-3-methyl-2-butenyl-diphosphate synthase, HDR: 1-hydroxy-2-methyl-2-butenyl-4-diphosphate reductase, IDI: isopentenyl diphosphate isomerase, GPPS: geranyl diphosphate synthase, FPPS: farnesyl diphosphate synthase, FNTA: protein farnesyltransferase, FACE2: prenyl protein peptidase, STE24: endopeptidase, FOLK: farnesol kinase, FLDH: farnesol dehydrogenase, PCME: prenylcysteine alpha-carboxyl methylesterase, ICMT: protein-S-isoprenylcysteine O-methyltransferase, PCYOX1: prenylcysteine oxidase, SQS: squalene synthase, SQLE: squalene monooxygenase, CAS: cycloartenol synthase.
Fig 5. Expression patterns for ten transcripts related to steroidal sapogenin in D. composita.
The tuber, leaf, and stem results are from 18-month-old plants. The shoot results are from 1-month-old plants. (Key: CYP51: sterol 14-demethylase, DHCR24: delta 24-sterol reductase, DHCR7: 7-dehydrocholesterol reductase.)
Fig 6. Content of steroidal sapogenins in potato, cassava, Chinese yam, D. zingiberensis and D. composita (dry weight).
The identity of nucleotide sequences in D. composita, D. zingiberensis, and potato.
| Gene | Plant | Accession | Length (bp) | Query cover | Identity |
|---|---|---|---|---|---|
| HMGR | D. composita | | 2409 | – | – |
| | D. zingiberensis | KC960674.1 | 2077 | 89% | 90% |
| | Potato | L01400.1 | 2379 | No | No |
| CAS | D. composita | | 3227 | – | – |
| | D. zingiberensis | AM697885.1 | 2703 | 84% | 89% |
| | Potato | XM_006362747.1 | 2616 | No | No |
| SQS | D. composita | | 1825 | – | – |
| | D. zingiberensis | KC960673.1 | 1453 | 86% | 89% |
| | Potato | JF802610.1 | 1236 | No | No |
| SMT1 | D. composita | | 5718 | – | – |
| | D. zingiberensis | FR714840.1 | 1459 | 68% | 89% |
| | Potato | XM_006364446.1 | 1372 | No | No |
The comparison of conserved domains in D. composita, D. zingiberensis, and potato.
| Protein | Name of conserved domain | Plant | Identity |
|---|---|---|---|
| HMGR | HMG-CoA_reductase_classI | D. composita | – |
| | | D. zingiberensis | 93% |
| | | Potato | 82% |
| CAS | SQCY_1 | D. composita | – |
| | | D. zingiberensis | 90% |
| | | Potato | 78% |
| SQS | Trans_IPPS_HH | D. composita | – |
| | | D. zingiberensis | 94% |
| | | Potato | 84% |
| SMT1 | Sterol_MT_C | D. composita | – |
| | | D. zingiberensis | 79% |
| | | Potato | 30% |
Key: HMG-CoA_reductase_classI: Class I hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase, accession: cd00643; SQCY_1: Squalene cyclase domain subgroup 1, accession: cd02892; Trans_IPPS_HH: Trans-Isoprenyl Diphosphate Synthases, accession: cd00683; Sterol_MT_C: Sterol methyltransferase C-terminal, accession: pfam08498.
Fig 7. Expression of four transcripts related to steroidal sapogenin in the tuber of D. composita, D. zingiberensis, and potato.
The D. composita and D. zingiberensis samples came from 18-month-old plants, and the samples of potato from 6-month-old plants.
The identity of protein sequences in D. composita, D. zingiberensis, and potato.
| Protein | Plant | Accession | Length (aa) | Query cover | Identity |
|---|---|---|---|---|---|
| HMGR | D. composita | | 580 | – | – |
| | D. zingiberensis | AGN32411.1 | 585 | 96% | 92% |
| | Potato | AAA93498.1 | 596 | 100% | 66% |
| CAS | D. composita | | 759 | – | – |
| | D. zingiberensis | CAM91422.1 | 759 | 100% | 90% |
| | Potato | XP_006362809.1 | 757 | 99% | 77% |
| SQS | D. composita | | 408 | – | – |
| | D. zingiberensis | AGN32410.1 | 409 | 95% | 91% |
| | Potato | AEX26929.1 | 411 | 94% | 76% |
| SMT1 | D. composita | | 345 | – | – |
| | D. zingiberensis | CBX33151.1 | 333 | 99% | 93% |
| | Potato | XP_006364508.1 | 358 | 89% | 40% |