Huapeng Sun1, Fang Li2, Zijian Xu3, Mengli Sun3, Hanqing Cong1, Fei Qiao1, Xiaohong Zhong2.
Abstract
Hedera helix L. is an important traditional medicinal plant in Europe. Its main active components are triterpenoid saponins, but none of the enzymes potentially involved in triterpenoid saponin biosynthesis have been discovered and annotated. Here we report the first global transcriptome analysis of H. helix, performed on the Illumina HiSeq™ 2500 platform. In total, over 24 million clean reads were produced and 96,333 unigenes were assembled, with an average length of 1385 nt; 79,085 unigenes had at least one significant match to an existing gene model. Differentially expressed gene analysis identified 6,222 and 7,012 unigenes that were expressed at higher and lower levels, respectively, in leaf samples compared with roots. After functional annotation and classification, two pathways and 410 unigenes related to triterpenoid saponin biosynthesis were discovered. The accuracy of these de novo sequences was validated by RT-qPCR analysis and a RACE clone. These data will enrich our knowledge of triterpenoid saponin biosynthesis and provide a theoretical foundation for molecular research on H. helix.
Year: 2017 PMID: 28771546 PMCID: PMC5542655 DOI: 10.1371/journal.pone.0182243
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Summary of data output quality of various libraries.
| Library | Raw Reads | Clean Reads | Error (%) | Q20 (%) | Q30 (%) | GC (%) |
|---|---|---|---|---|---|---|
| L1 | 45,630,708 | 39,909,334 | 0.1 | 97.08 | 95.40 | 45.05 |
| L2 | 45,480,900 | 40,761,120 | 0.1 | 97.14 | 95.71 | 44.72 |
| L3 | 47,598,294 | 42,492,156 | 0.1 | 97.10 | 95.46 | 44.89 |
| R1 | 46,941,114 | 40,930,938 | 0.1 | 96.65 | 95.76 | 44.73 |
| R2 | 46,855,462 | 41,306,100 | 0.1 | 97.16 | 95.52 | 44.54 |
| R3 | 45,266,262 | 40,352,326 | 0.1 | 97.10 | 95.68 | 44.58 |
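The Q20 and Q30 columns are conventionally read as the percentage of bases whose Phred quality score is at least 20 or 30. As a minimal sketch of what those thresholds mean, the standard Phred relation between a quality score and the base-call error probability can be written as follows; the function name is illustrative and is not taken from the study.

```python
def phred_to_error_probability(q: float) -> float:
    """Convert a Phred quality score Q into a base-call error probability.

    Uses the standard Phred definition P = 10 ** (-Q / 10).
    """
    return 10 ** (-q / 10)


# Thresholds used in the table header (Q20 and Q30).
for q in (20, 30):
    print(f"Q{q}: error probability = {phred_to_error_probability(q):.4f}")
```

Under this definition, a base at Q20 has a 1 in 100 chance of being wrong and a base at Q30 has a 1 in 1000 chance.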
Summary of assembly results of H. helix.
| Samples | Total Number | Total Length (nt) | Mean Length (nt) | N50 (nt) | Distinct Clusters | Distinct Singletons |
|---|---|---|---|---|---|---|
| L1-Contig | 116,550 | 99,922,035 | 857 | 1339 | - | - |
| L2-Contig | 117,656 | 101,212,619 | 860 | 1341 | - | - |
| L3-Contig | 114,891 | 97,770,836 | 851 | 1323 | - | - |
| R1-Contig | 136,187 | 111,354,934 | 818 | 1271 | - | - |
| R2-Contig | 140,439 | 115,262,101 | 821 | 1280 | - | - |
| R3-Contig | 128,613 | 104,187,419 | 810 | 1267 | - | - |
| L1-Unigene | 71,734 | 69,222,524 | 965 | 1511 | 34,970 | 36,764 |
| L2-Unigene | 70,985 | 69,495,387 | 979 | 1526 | 35,544 | 35,441 |
| L3-Unigene | 68,637 | 66,692,896 | 972 | 1514 | 34,305 | 34,332 |
| R1-Unigene | 81,825 | 75,839,021 | 927 | 1462 | 38,521 | 43,304 |
| R2-Unigene | 85,302 | 79,392,368 | 931 | 1474 | 39,032 | 46,270 |
| R3-Unigene | 78,940 | 71,597,869 | 907 | 1440 | 35,493 | 43,447 |
| All-Unigene | 96,333 | 133,417,819 | 1385 | 1927 | 55,721 | 40,612 |
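The N50 column summarizes assembly contiguity: it is the length L such that sequences of length L or longer contain at least half of the total assembled bases. The sketch below shows that common definition on a toy list of lengths; it is not the assembler's own code, and the example lengths are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the sequence that crosses that point is the N50.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy lengths (nt), not data from the study.
toy_lengths = [2000, 1500, 1000, 500, 300, 200]
print(n50(toy_lengths))  # 1500: 2000 + 1500 = 3500 >= 5500 / 2
```

Because long sequences dominate the running total, N50 is typically larger than the mean length, which is the pattern seen in every row of the table.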
Fig 1. Length distribution frequency of unigenes in H. helix.
Summary of functional annotations of H. helix unigenes.
| Public database | Number of Unigenes | Percentage (%) |
|---|---|---|
| Annotated in Nr | 75,773 | 78.7 |
| Annotated in Nt | 70,728 | 73.4 |
| Annotated in SwissProt | 51,320 | 53.3 |
| Annotated in KEGG | 47,100 | 48.9 |
| Annotated in COG | 32,443 | 33.7 |
| Annotated in GO | 50,479 | 52.4 |
| All annotated Unigenes | 79,085 | 82.1 |
| Total Unigenes | 96,333 | 100 |
Fig 2. Gene similarity of unigenes against the Nr database.
(A) E-value distribution of top BLAST hits for each unigene (E-value ≤1.0E-5). (B) Similarity distribution of best BLAST hits for each unigene. (C) Distribution of BLAST results by species, shown as a percentage of total homologous sequences (E-value ≤1.0E-5). All plant proteins in the NCBI Nr database were used for the homology search, and the best hit of each sequence was used for analysis.
Fig 3. Comparison of Gene Ontology (GO) classifications of H. helix.
Results are summarized into three main GO categories (biological process, cellular component, molecular function) and 44 sub-categories. The x-axis indicates subcategories; right y-axis indicates number of genes in a category; and left y-axis indicates percentage of a specific category of genes in the main category.
Fig 4. COG function classification of H. helix transcriptome.
A total of 33,205 unigenes showed significant homology (E-value ≤1.0E-5) to genes in one of the 25 categories (A-W, Y and Z) in the NCBI COGs database.
Fig 5. Differentially expressed gene analysis of six libraries in H. helix.
(A) Unigenes expressed at higher levels in leaf samples. (B) Unigenes expressed at lower levels in leaf samples.
Discovery and expression of unigenes involved in triterpene saponin biosynthesis in Hedera helix L.
| Enzyme name | Abbreviation | EC number | Putative ortholog | Expressed higher | Expressed lower |
|---|---|---|---|---|---|
| Acetyl-CoA acetyltransferase | AACT | EC: | CL10734, CL8643 | ||
| Hydroxymethylglutaryl CoA synthase | HMGS | EC: | CL4883, Unigene7122 | ||
| 3-hydroxy-3-methylglutaryl-coenzyme A reductase | HMGR | EC: | CL84, Unigene12643, Unigene12948, Unigene9993 | Unigene9993 | CL84, Unigene12643 |
| Mevalonate kinase | MVK | EC: | CL10135, Unigene31942, Unigene37001 | CL10135 | |
| Phosphomevalonate kinase | PMVK | EC: | CL2755 | ||
| Mevalonate diphosphate decarboxylase | MVD | EC: | CL12343 | ||
| 1-deoxy-D-xylulose-5-phosphate synthase | DXS | EC: | CL1741, CL6964, CL7506, Unigene12956, Unigene20321, Unigene26465, Unigene26467, Unigene26469, Unigene7001 | CL1741 | |
| 1-deoxy-D-xylulose-5-phosphate reductoisomerase | DXR | EC: | CL4453, Unigene20397, Unigene27513, Unigene32536, Unigene5694 | CL4453, Unigene5694 | |
| 2-C-methyl-D-erythritol 4-phosphate cytidylyltransferase | ispD | EC: | CL177, Unigene28885 | ||
| 4-diphosphocytidyl-2-C-methyl-D-erythritol kinase | ispE | EC: | Unigene15885, Unigene29532 | ||
| 2-C-methyl-D-erythritol 2,4-cyclodiphosphate synthase | ispF | EC: | Unigene5248 | Unigene5248 | |
| (E)-4-hydroxy-3-methylbut-2-enyl-diphosphate synthase | ispG | EC: | CL4323, Unigene18905, Unigene21207, Unigene39208 | ||
| 4-hydroxy-3-methylbut-2-enyl diphosphate reductase | ispH | EC: | CL3923, Unigene11112, Unigene39239, Unigene40179 | CL3923 | |
| Isopentenyl-diphosphate Delta-isomerase | IDI | EC: | Unigene9222 | ||
| Geranyl diphosphate synthase | GPS | EC: | CL3895, Unigene23216, Unigene31228 | ||
| Geranylgeranyl diphosphate synthase | GGPS | EC: | CL12915, CL1497, CL3631, CL5060, CL8765, Unigene20086, Unigene26473, Unigene6117, Unigene7971 | Unigene26473, CL1497, Unigene20086 | |
| Farnesyl diphosphate synthase | FPS | EC: | CL8585, Unigene7386, Unigene7389, Unigene256 | CL8585 | |
| Squalene synthase | SS | EC: | CL11265, Unigene15662, Unigene17742, Unigene18488, Unigene28187, Unigene30735, Unigene35625, Unigene7507, Unigene14696 | CL11265, Unigene35625, Unigene7507, Unigene15662 | |
| Squalene epoxidase | SE | EC: | CL6504, CL9719, CL9981, Unigene14156, Unigene17245 | CL6504, CL9981 | |
| β-amyrin synthase | β-AS | EC: | CL5897, CL1580, CL11721, Unigene29516, Unigene32344 | CL11721, CL1580 | |
Fig 6. RT-qPCR validation of selected unigenes involved in triterpene saponin biosynthesis.
Columns indicate relative expression obtained by RT-qPCR (left y-axis); lines indicate expression levels calculated by the FPKM method (right y-axis). All data are presented as the mean value of three repeats.
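For readers unfamiliar with the FPKM values plotted on the right axis, the sketch below shows the usual definition: fragments per kilobase of transcript per million mapped fragments. The function name and the example counts are hypothetical, and the exact software and settings used in the study are not given in this record.

```python
def fpkm(fragments_on_gene: int, gene_length_nt: int, total_mapped_fragments: int) -> float:
    """Compute FPKM with the usual definition.

    FPKM = fragments_on_gene * 1e9 / (gene_length_nt * total_mapped_fragments),
    i.e. the count is scaled per kilobase of transcript length and per
    million mapped fragments in the library.
    """
    if gene_length_nt <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragments_on_gene * 1e9 / (gene_length_nt * total_mapped_fragments)


# Hypothetical counts for illustration only: 100 fragments on a 1,000 nt
# unigene in a library of 1,000,000 mapped fragments.
print(fpkm(100, 1000, 1_000_000))  # 100.0
```

Normalizing by both transcript length and library size is what makes expression levels comparable between unigenes of different lengths and between the leaf and root libraries.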