De Novo Sequencing and Assembly Analysis of the Pseudostellaria heterophylla Transcriptome.

Jun Li1,2, Wei Zhen1,2, Dengkai Long1,2, Ling Ding1,2, Anhui Gong1,2, Chenghong Xiao1,2, Weike Jiang1,2, Xiaoqing Liu1,2, Tao Zhou1,2, Luqi Huang3.   

Abstract

Pseudostellaria heterophylla (Miq.) Pax is a mild tonic herb widely cultivated in the southern part of China. The tuberous roots of P. heterophylla accumulate high levels of secondary metabolites of medicinal value, such as saponins, flavonoids, and isoquinoline alkaloids. Despite numerous studies on the pharmacological importance and purification of these compounds in P. heterophylla, their biosynthesis is not well understood. In the present study, we used the Illumina HiSeq 4000 sequencing platform to sequence RNA from flowers, leaves, stem, root cortex and xylem tissues of P. heterophylla. We obtained 616,413,316 clean reads that we assembled into 127,334 unique sequences with an N50 length of 951 bp. Among these unigenes, 53,184 unigenes (41.76%) were annotated in a public database and 39,795 unigenes were assigned to 356 KEGG pathways; 23,714 unigenes (8.82%) had high homology with the genes from Beta vulgaris. We discovered 32,095 DEGs in different tissues and performed GO and KEGG enrichment analysis. The most enriched KEGG pathways of secondary metabolism showed up-regulated expression in tuberous roots as compared with the aboveground parts of P. heterophylla. Moreover, we identified 72 candidate genes involved in triterpenoid saponin biosynthesis in P. heterophylla. The expression profiles of 11 candidate unigenes were analyzed by quantitative real-time PCR (RT-qPCR). Our study established a global transcriptome database of P. heterophylla for gene identification and regulation. We also identified the candidate unigenes involved in triterpenoid saponin biosynthesis. Our results provide an invaluable resource for studying the secondary metabolites and physiological processes in different tissues of P. heterophylla.

Year:  2016        PMID: 27764127      PMCID: PMC5072632          DOI: 10.1371/journal.pone.0164235

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Pseudostellaria heterophylla (Miq.) Pax, known as Hai Er Shen (HES) and false starwort, belongs to the Caryophyllaceae family. The Chung Yao Chi New Chinese Materia Medica has recorded the collection of HES plants since 1959 because of their local and ethnic use. P. heterophylla is distributed widely in the southern parts of China including Fujian, Jiangsu, Anhui, Shandong, Shanxi, Zhejiang, Jiangxi, Hubei, Shanxi, and Guangzhou provinces. P. heterophylla is a mild tonic herb, weaker than Panax ginseng, and is popularly used in Traditional Chinese Medicine (TCM) products such as Jiangzhong Jianweixiaoshi Tablets and Composite Pseudostellaria granule. The mitogenic fraction (PH-I) from the hot water extract of P. heterophylla has potent anti-tumor activity against Ehrlich ascites tumor (EAT) cells in mice in vivo, but not in vitro, through the release of tumor necrosis factor (TNF) [1]. The ethyl acetate fraction extracted from the roots of P. heterophylla markedly reduced the number of coughs and prolonged the latent cough period in a rat model of stable-phase chronic obstructive pulmonary disease induced by cigarette smoke exposure [2]. Saponins in P. heterophylla (PHS) are primary bioactive compounds and consist of pseudostellarinoside A and acutifoliside D, both of which are oleanyl-type saponins [3]. PHS extracts have significant anti-fatigue and anti-anoxia activities [4] and protect the cell membrane of H9c2 cells from oxidative injury by preventing increased oxidative stress [5]. The precursor for the biosynthesis of triterpenoid saponins is 2,3-oxidosqualene, which is synthesized via the MVA pathway [6]. Oxidosqualene cyclase (OSC) catalyzes the cyclization of 2,3-oxidosqualene to produce various triterpene skeletons. Some candidate genes involved in triterpene saponin biosynthesis have been isolated from Panax quinquefolium [7], Panax ginseng [8], and Panax notoginseng [9], but none have been identified from P. heterophylla.
Tuberous roots or stems are the primary medicinal organs of TCM plants such as P. heterophylla [10], Fallopia multiflora [11], Panax notoginseng [12], and Salvia miltiorrhiza [13]. Chemical analysis has helped identify the secondary metabolites in these plants, including flavonoids, an isoquinoline alkaloid, terpenoids, and phenylpropanoids [3]; however, there have been no molecular studies on the secondary metabolism pathways involved in their biosynthesis and degradation. Hua et al. [14] (2016) performed de novo sequencing and transcriptome analysis of P. heterophylla tuberous roots, but no transcriptomic or genomic information from the aboveground parts (leaf, stem, and flower) is available in the nucleotide databases of the National Center for Biotechnology Information (NCBI). Study of the molecular basis of traits related to saponin biosynthesis and secondary metabolism in P. heterophylla will facilitate its breeding and improvement. RNA-seq is a useful tool for studying the expressed transcripts in different tissues and stages [15]. In this study, we used the Illumina HiSeq 4000 sequencing platform to sequence the mRNA of P. heterophylla from various tissues (flowers, leaves, stem, root cortex and xylem). A global transcriptome database of P. heterophylla was constructed to identify the differentially expressed genes (DEGs) in different tissues and putative genes encoding the enzymes involved in the biosynthesis of triterpene saponins.

Methods

Plant materials and RNA extraction

P. heterophylla cultivar ‘Shitai 1’ was selected and grown in a commercial planting base in Sibing County, Guizhou Province, China. Five tissues (S1 Fig) were collected separately from three randomly selected individuals. After cleaning, all samples were cut into small pieces for RNA isolation, and partial materials were used for gene cloning and RT-qPCR. Total RNA was extracted following the instructions of the Transzol Plant RNA Extraction Kit (TransGen Biotech, Beijing, China). DNA contamination was removed using DNase I (Takara, Tokyo, Japan).

cDNA library preparation and transcriptome sequencing

The construction of the cDNA libraries and the RNA-seq were performed by Shanghai Majorbio Bio-pharm Technology Co., Ltd. (Shanghai, China). First, mRNA was purified from 12 μg of total RNA from each of the five tissues (flowers, leaves, stem, root cortex and xylem) using Oligo(dT) magnetic beads. Then, the mRNA was broken into small fragments (200 ± 25 bp), which were used for the second-strand cDNA synthesis. These cDNA fragments were ligated with the Illumina paired-end sequencing adaptors. Finally, the libraries were sequenced on a paired-end flow cell using the Illumina HiSeq 4000 platform. We obtained 5–8 Gb of reads from each sample for de novo assembly.

De novo assembly and Gene annotation

Before assembly, the adaptors and unknown nucleotides in raw reads were filtered with SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle). Then the high-quality clean reads from 15 samples were used for de novo assembly with Trinity [16] (http://trinityrnaseq.sourceforge.net/). Finally, the redundant Trinity-generated contigs were clustered and removed using the TIGR Gene Indices Clustering Tools (TGICL) (http://www.tigr.org/tdb/tgi/software/). ORF prediction was performed using the Markov model as described at http://trinityrnaseq.sourceforge.net/analysis/extract_proteins_from_trinity_transcripts.html. The predicted ORFs were then checked against the Pfam database (http://pfam.xfam.org/). All unigenes were annotated using BLASTx by sequence comparison with various protein databases [i.e., Nr, Swissprot, Cluster of Orthologous Groups of proteins (COG), Kyoto Encyclopedia of Genes and Genomes (KEGG)], with an e-value cutoff of 1e-5. Functional analysis of all unigenes was performed using Gene Ontology (GO). The Blast2GO program (https://www.blast2go.com/) was used to identify the GO terms for all assembled unigenes. Finally, we used the WEGO software (http://wego.genomics.org.cn/) to perform GO functional classification and determine the distribution of gene functions in P. heterophylla at the macromolecular level.
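The e-value cutoff described above can be illustrated with a minimal sketch. It assumes the BLAST hits were exported in the standard 12-column tabular format (-outfmt 6), which the text does not state; the function name best_hits and the example rows are hypothetical and are not data from this study.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6); evalue is the 11th field.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: row} holding the lowest-e-value hit that passes the cutoff."""
    best = {}
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        evalue = float(row["evalue"])
        if evalue > max_evalue:
            continue  # hit is weaker than the cutoff, so it does not annotate the unigene
        current = best.get(row["qseqid"])
        if current is None or evalue < float(current["evalue"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical example rows (illustration only).
example = io.StringIO(
    "c1_g1\tprotA\t85.0\t200\t30\t0\t1\t600\t1\t200\t1e-50\t300\n"
    "c1_g1\tprotB\t60.0\t150\t60\t0\t1\t450\t1\t150\t1e-8\t90\n"
    "c2_g1\tprotC\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t25\n"
)
hits = best_hits(example)
```

In this toy input, c2_g1 has no hit at or below 1e-5 and would therefore count as unannotated.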

Digital gene expression profiling

Gene expression levels were estimated using RSEM (http://deweylab.biostat.wisc.edu/rsem/). The reads per kb per million reads (RPKM) were used to normalize the expression levels for each gene in each tissue of P. heterophylla. The RPKM values from all isoforms of the same gene were summed as the RPKM of that gene. Cluster 3.0 software (http://bonsai.hgc.jp/~mdehoon/software/cluster/) was used to normalize the expression levels of the genes involved in triterpene saponin biosynthesis. Sample names are shown on the heat maps.
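The normalization described above can be sketched as follows. This is a minimal illustration of the standard RPKM definition and of summing isoform values per gene, not the RSEM implementation; the function names and the read counts are hypothetical.

```python
from collections import defaultdict


def rpkm(read_count, length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (length_bp * total_mapped_reads)


def gene_rpkm(isoforms, total_mapped_reads):
    """Sum isoform RPKM values per gene.

    isoforms: iterable of (gene_id, read_count, length_bp) tuples.
    """
    totals = defaultdict(float)
    for gene_id, read_count, length_bp in isoforms:
        totals[gene_id] += rpkm(read_count, length_bp, total_mapped_reads)
    return dict(totals)


# Hypothetical counts: two isoforms of one gene and one single-isoform gene.
isoforms = [
    ("c100_g1", 500, 1000),
    ("c100_g1", 250, 500),
    ("c200_g1", 100, 2000),
]
values = gene_rpkm(isoforms, total_mapped_reads=10_000_000)
```

Each isoform of the hypothetical gene c100_g1 contributes an RPKM of 50, so the gene-level value is their sum.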

Identification of the unigenes involved in triterpene saponins

The amino acid sequences of enzymes involved in triterpene saponin biosynthesis were downloaded from NCBI and used for searching the P. heterophylla transcriptomic database. Putative genes of saponin biosynthesis in P. heterophylla were identified using the BlastP program with an e-value of 1e-10. The default hits were removed manually.

Real-Time PCR verification

Total RNA was extracted from different tissues of P. heterophylla and first-strand cDNA synthesis was performed with Reverse Transcriptase M-MLV (Takara, Japan). We used an ABI 7500 real-time PCR system (Life Technologies, Carlsbad, CA, USA) to determine expression by real-time PCR. All reactions were performed using SYBR® Premix Ex Taq™ II (Takara Biotechnology, China) according to the manufacturer's procedure, with ten-fold diluted cDNA as template. Reactions were first incubated at 95°C for 30 s, followed by 40 cycles of amplification at 95°C for 5 s and 60°C for 34 s, and then a final cycle at 95°C for 15 s, 60°C for 1 min and 95°C for 15 s. The raw data were analyzed using ABI 7500 software, and expression levels were normalized to the PhACT2 gene (gi: KT363848) to minimize the variation in cDNA template content. The expression level was calculated using the 2^−ΔCt method. The experiments were performed with three individual biological replicates.
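As a small worked illustration of the 2^−ΔCt calculation, where ΔCt is the Ct of the target gene minus the Ct of the reference gene (PhACT2 in this study), consider the sketch below. The Ct values are hypothetical and the function name is an assumption made for this example.

```python
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Return 2^-dCt, where dCt = Ct(target) - Ct(reference gene)."""
    return 2 ** -(ct_target - ct_reference)


# Hypothetical Ct values for three biological replicates (illustration only).
target_ct = [24.0, 25.0, 23.0]
reference_ct = [20.0, 20.0, 20.0]

levels = [relative_expression(t, r) for t, r in zip(target_ct, reference_ct)]
# Mean and standard deviation across replicates, as plotted with error bars.
summary = (mean(levels), stdev(levels))
```

A ΔCt of 4 cycles corresponds to a target transcript level of 1/16 of the reference gene.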

Results

Illumina paired-end sequencing data and De novo assembly

We obtained 87 Gb of sequencing data including 645,961,688 raw reads and 616,413,316 clean reads with an average base error rate below 0.02%. A brief overview of the sequencing statistics is shown in Table 1. We used the Trinity program for the de novo assembly of the clean data because a P. heterophylla reference genome was not available. After removal of ambiguous reads and low-quality reads (Phred quality score < 20), 127,334 unique sequences were obtained from the cDNA libraries constructed from P. heterophylla flowers, stem, leaves, and tuberous roots (Table 1). The Q20 percentage (sequencing error rate < 1%) and Q30 percentage were 98% and 93.81%, respectively. The GC percentages in aboveground parts (leaves, stem, and flowers) and underground parts (cortex and xylem of tuberous roots) averaged 51.5% and 43.5%, respectively. The length of unigenes ranged from 201 to 82,236 bp, with an N50 length of 951 bp. In total, 48,860 coding sequences were obtained from all P. heterophylla unigene sequences, and 30,396 CDSs (62.21%) were longer than 1000 bp.
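For readers unfamiliar with the N50 statistic quoted above, the following minimal sketch shows how it is usually computed: sequences are sorted from longest to shortest, and N50 is the length of the sequence at which the running total first reaches half of all assembled bases. The lengths below are hypothetical and are not the unigene lengths from this assembly.

```python
def n50(lengths):
    """Length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no sequence lengths given")


# Hypothetical unigene lengths in bp (illustration only).
lengths = [201, 300, 450, 951, 1200, 3000]
assembly_n50 = n50(lengths)
```

Here the six lengths total 6,102 bp; the two longest sequences together exceed half of that, so the N50 of this toy set is the length of the second-longest sequence.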
Table 1

Sequence Summary of P. heterophylla tissues.

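Organ | Sample | Clean reads | Clean bases (Gb) | Error (%) | Q20 (%) | Q30 (%) | GC (%)
Flower | 1_Z_H | 32417586 | 4.6 | 0.0115 | 98.04 | 94 | 51.7
Flower | 3_Z_H | 38552446 | 5.4 | 0.0115 | 98.02 | 93.99 | 50.37
Flower | 4_Z_H | 30520166 | 4.3 | 0.0114 | 98.1 | 94.19 | 50.57
Stem | 1_Z_J | 42890328 | 6.1 | 0.0119 | 97.92 | 93.67 | 53.77
Stem | 3_Z_J | 29240052 | 4.1 | 0.0123 | 97.72 | 93.2 | 50.96
Stem | 4_Z_J | 40303560 | 5.7 | 0.0115 | 98.05 | 94.05 | 51.46
Leaf | 1_Z_YD | 33756994 | 4.8 | 0.0115 | 98.08 | 94.09 | 53.03
Leaf | 3_Z_YD | 37921060 | 5.4 | 0.0113 | 98.13 | 94.24 | 50.89
Leaf | 4_Z_YD | 35401430 | 5 | 0.0123 | 97.73 | 93.18 | 50.75
Root xylem | 1_G_M | 70142364 | 9.9 | 0.0119 | 98.01 | 93.73 | 43
Root xylem | 3_G_M | 46857578 | 6.7 | 0.0116 | 98.13 | 93.99 | 43.31
Root xylem | 4_G_M | 45086442 | 6.4 | 0.0115 | 98.16 | 94.06 | 43.11
Root cortex | 1_G_P | 39335332 | 5.6 | 0.012 | 97.99 | 93.62 | 43.69
Root cortex | 3_G_P | 44758244 | 6.3 | 0.0125 | 97.79 | 93.11 | 43.97
Root cortex | 4_G_P | 49229734 | 7 | 0.0116 | 98.18 | 94.08 | 43.93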

(1) Reads sequencing from the left; (2) Reads sequencing from the right.

Q20: percentage of bases with a Phred value >20; Q30: percentage of bases with a Phred value >30.

Functional annotation

Gene annotation showed that a total of only 52,937 unigenes (41.57%) had significant matches in public databases. This annotation rate was much lower than those of previous reports [17, 18]. About 74,150 unigenes (58.43%) had no match to known genes, and these unaligned unigenes may represent genes and novel transcripts specific to P. heterophylla. Our results showed that 20,104 unigenes had high similarity (greater than 80%) to sequences in the Nr database and 20,497 unigenes (16.09%) had significant homology (e-value < 1e-30) (Fig 1A and 1B). The mapping rates of unigenes against the Pfam, Swissprot, KEGG, and String databases were 38.83%, 68.24%, 44.32% and 18.49%, respectively. The numbers of unigenes annotated in only one database were as follows: 101 unigenes in the Pfam database, 90 unigenes in the SwissProt database, 36 unigenes in the KEGG database and 11,213 unigenes in the Nr database (Table 2). Species distribution analysis showed that 23,711 unigenes (18.62%) had high homology with genes from Beta vulgaris, followed by Vitis vinifera (1,182, 0.93%) and Theobroma cacao (380, 0.30%), while 17,126 unigenes had high homology with sequences from other organisms (Fig 1C).
Fig 1

Species distribution of unigenes from P. heterophylla.

a: Similarity distribution of top BLAST hits for each unigene; b: E-value distribution of BLAST hits with a cut off E-value of 1.0E−5; c: Species distribution for top BLAST hits in the Nr database.

Table 2

Blast results of the assembled unigenes.

Database | Total unigenes | Annotated unigenes | Percentage
Pfam | 127334 | 20652 | 16.22%
Swissprot | 127334 | 36291 | 28.50%
KEGG | 127334 | 23572 | 18.51%
String | 127334 | 16391 | 7.72%
Nr | 127334 | 52937 | 41.57%

Unigenes showing high similarity to genes from Microbotryum violaceum (686 unigenes), Fusarium oxysporum (509 unigenes), Leptosphaeria maculans (321 unigenes), Pseudomonas fluorescens (272 unigenes), and Rhodosporidium toruloides (255 unigenes) may belong to endophytes living in different parts of P. heterophylla [19]. Three unigenes from each species were validated by RT-PCR (S1 Table & S2 Fig).

Functional classification

We classified the functions of all unigenes using the Nr annotation and Gene Ontology (GO) classification (Fig 2, S2 Table). We assigned 28,210 unigenes to one or more gene ontology categories: 24,129 unigenes to molecular function, 15,544 unigenes to cellular component, and 23,751 unigenes to biological process. In the molecular function group, we found unigenes related to “catalytic activity” (15,220, 53.95%) and “binding” (14,909, 52.85%). For the cellular component category, “cell” (7,660, 45.78%), “cell part” (7,659, 45.77%), “organelle” (5,601, 33.47%), “membrane” (4,380, 26.18%), and “macromolecular complex” (3,485, 20.83%) represented the majority of unique sequences. In the biological process category, unigenes assigned to “metabolic process” (11,388, 68.06%), “cellular process” (10,343, 61.81%), and “single-organism process” (8,446, 50.47%) were the most abundant. A high percentage of genes were also grouped into the “biological regulation” (3,217, 19.22%), “response to stimulus” (3,084, 18.43%), “regulation of biological process” (3,015, 18.02%), and “cellular component organization or biogenesis” (2,461, 14.71%) categories.
Fig 2

Gene Ontology classification of assembled unigenes.

The unigenes were categorized into three main categories: biological process, cellular component, and molecular function.

The COG database was used for the function prediction and classification of all unigenes (Fig 3). In brief, 5,140 unigenes were grouped into 25 COG classifications. The largest group in the 25 COG categories was “translation, ribosomal structure and biogenesis” (803, 14.74%), followed by “general function prediction” (631, 11.58%), “signal transduction mechanisms” (565, 10.37%), and “posttranslational modification, protein turnover, chaperones” (544, 9.99%).
Fig 3

COG functional categories of P. heterophylla.

KEGG classification

All unigenes were compared against KEGG using BLASTx, with an e-value < 1e-10, to search for active biochemical pathways in P. heterophylla. We assigned 39,795 unigenes to 356 KEGG pathways (Fig 4). “Ribosome” had the largest number of unigenes (1,075 unigenes), followed by “protein processing in endoplasmic reticulum” (404 unigenes), “oxidative phosphorylation” (390 unigenes), “glycolysis/gluconeogenesis” (315 unigenes), “endocytosis” (309 unigenes), and “spliceosome” (287 unigenes). The metabolism categories in our study were: “carbohydrate metabolism” (1,398 unigenes), “amino acid metabolism” (1,193 unigenes), “energy metabolism” (1,124 unigenes), “lipid metabolism” (653 unigenes), “metabolism of cofactors and vitamins” (425 unigenes), “metabolism of other amino acids” (345 unigenes), “nucleotide metabolism” (343 unigenes), “glycan biosynthesis and metabolism” (306 unigenes), “metabolism of terpenoids and polyketides” (299 unigenes), and “biosynthesis of secondary metabolites” (270 unigenes). KEGG genetic information processing included “folding, sorting and degradation” (914 unigenes), followed by “replication” (413 unigenes) and “transcription” (108 unigenes). In the environmental information processing category, the most abundant subcategories were “signal transduction” (1,215 unigenes), “signaling molecules and interaction” (251 unigenes), and “membrane transport” (233 unigenes) (S3 Table).
Fig 4

Pathway assignment based on the Kyoto Encyclopedia of Genes and Genomes (KEGG).

(A) Classification based on metabolism categories, (B) Classification based on genetic information processing categories, (C) Classification based on environmental information processing categories, (D) Classification based on cellular processes categories, and (E) Classification based on organismal systems categories.

Differential Expression Analysis of P. heterophylla

We used our assembled data as a reference and compared the unigenes from different tissues of P. heterophylla (Fig 5A). A unigene was regarded as a differentially expressed gene (DEG) when FDR < 0.05 and |log2FC| ≥ 1. There were 32,095 DEGs between root cortex and xylem, of which 21,073 were down-regulated and 11,022 were up-regulated (Fig 5B). There were 30,070 DEGs between root cortex and leaf, of which 18,495 were down-regulated and 11,575 were up-regulated. Moreover, we identified 31,555 DEGs between root cortex and stem, of which 18,212 were down-regulated and 13,343 were up-regulated. Between root cortex and flower, 17,073 DEGs were down-regulated while 6,948 DEGs were up-regulated. Overall, we identified 2,289 DEGs common to the four comparison groups. Root cortex showed the highest number of up-regulated unigenes among all tissues.
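The DEG criteria stated above can be expressed as a short sketch. It takes an already computed FDR as input, since the statistical test is not described here; the pseudocount added before taking the ratio is an assumption made for this example to avoid division by zero, and the function name and RPKM values are hypothetical.

```python
import math


def classify_deg(rpkm_a, rpkm_b, fdr, fdr_cutoff=0.05,
                 min_abs_log2fc=1.0, pseudocount=1.0):
    """Classify a unigene as 'up' or 'down' in tissue B relative to A, or 'not DE'.

    The pseudocount is an illustrative assumption, not part of the stated criteria.
    """
    log2fc = math.log2((rpkm_b + pseudocount) / (rpkm_a + pseudocount))
    if fdr >= fdr_cutoff or abs(log2fc) < min_abs_log2fc:
        return "not DE"
    return "up" if log2fc > 0 else "down"


# Hypothetical RPKM values and FDRs (illustration only).
calls = [
    classify_deg(3, 15, fdr=0.01),    # four-fold higher in B, significant
    classify_deg(15, 3, fdr=0.01),    # four-fold lower in B, significant
    classify_deg(3, 15, fdr=0.20),    # large change but FDR too high
    classify_deg(10, 12, fdr=0.001),  # significant but change too small
]
```

Counting the "up" and "down" calls for each pairwise tissue comparison would give tallies of the kind reported in the paragraph above.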
Fig 5

Venn diagrams of unigenes of three libraries and statistical analysis of the differentially expressed genes (DEGs).

(A) Distribution of the unigenes of the three libraries; (B) The red columns indicate the up-regulated DEGs and the green columns represent the down-regulated DEGs in three pair-wise comparisons (FDR ≤ 0.001 and an absolute value of log 2 Ratio ≥ 1 was used as the significant threshold for DEGs).

GO enrichment analysis and KEGG enrichment analysis of DEGs in P. heterophylla

The GO enrichment analysis and KEGG enrichment analysis elucidated the functional differences of DEGs among different P. heterophylla samples (S3 Fig). In the GO enrichment analysis, a function was regarded as enriched if its corrected p-value was below 0.05. The results showed that unigenes involved in “response to fungus”, “oligosaccharide metabolic process”, “defense response to other organism”, “chloroplast envelope”, “hydrolase activity, hydrolyzing O-glycosyl compounds”, and “sucrose metabolic process” were enriched between root cortex and flower (S4 Fig). Highly enriched DEGs between root cortex and leaf were involved in “response to auxin”, “root development”, “plastid thylakoid”, “chloroplast thylakoid”, and “chloroplast stroma” (S5 Fig). The DEGs involved in “pollen development”, “gametophyte development”, “response to auxin”, “response to external stimulus”, and “thylakoid” were enriched between root cortex and stem (S6 Fig). Between root cortex and xylem, highly enriched genes were related to “oxidation−reduction process”, “naringenin−chalcone synthase activity”, “flavonoid metabolic process”, and “protein disulfide oxidoreductase activity” (S7 Fig). Moreover, we analyzed 31 response categories related to DEGs using a heat map based on the total RPKM values of all the DEGs in each category (Fig 6). Most of these categories were up-regulated in underground parts (root cortex and xylem), including “response to biotic stimulus”, “response to insect”, “response to carbohydrate”, “response to endogenous stimulus”, “response to fungus”, “response to bacterium” and “response to wounding”. The categories active only in leaf were “response to cytokinin”, “response to jasmonic acid”, “response to light stimulus”, and “response to cold”. Our results showed that 11 out of 31 response categories had up-regulated expression in both leaf and stem. These included “response to salt stress”, “response to brassinosteroid”, “response to auxin”, “response to water deprivation”, and “response to gibberellin”.
Fig 6

GO annotation of DEGs.

The heat map shows 31 categories of DEGs in different tissues, including leaf, stem, flower, root cortex and xylem. Different colors indicate different expression levels. Green indicates down-regulated expression and red represents up-regulated expression. The heat map of all genes involved in each category was constructed using the log10 values of RPKM. 1_G_M, 3_G_M, 4_G_M represent root xylem; 1_G_P, 3_G_P, 4_G_P represent root cortex; 1_G_YD, 3_G_YD, 4_G_YD represent leaf; 1_G_J, 3_G_J, 4_G_J represent stem; 1_G_H, 3_G_H, 4_G_H represent flower from three individual plants.

For a further study of DEGs, the KEGG database was used to search for significantly enriched biochemical pathways. Between root cortex and flower, the most significantly enriched pathway was “plant hormone signal transduction”, which contained DEGs down-regulated in aboveground parts. Most of the DEGs involved in “plant-pathogen interaction”, “starch and sucrose metabolism”, “phenylpropanoid biosynthesis”, “alpha-Linolenic acid metabolism”, “circadian rhythm–plant”, “glycosylphosphatidylinositol (GPI)-anchor biosynthesis”, and “N-Glycan biosynthesis” were down-regulated. On the other hand, the DEGs involved in “diterpenoid biosynthesis”, “isoquinoline alkaloid biosynthesis”, “monoterpenoid biosynthesis”, “stilbenoid, diarylheptanoid and gingerol biosynthesis”, “ubiquinone and other terpenoid-quinone biosynthesis”, and “zeatin biosynthesis” were up-regulated (S8 Fig). We observed similar results when either underground part (root cortex or xylem) was compared with any aboveground part (leaf, stem or flower) of P. heterophylla (S9–S11 Figs). We further used a heat map to analyze 14 KEGG pathways involved in the biosynthesis of secondary metabolites in different tissues (Fig 7). Our analysis showed that 6 out of 14 pathways were up-regulated in underground parts (root cortex and xylem), including “monoterpenoid biosynthesis”, “zeatin biosynthesis”, “tropane, piperidine and pyridine alkaloid biosynthesis”, “sesquiterpenoid and triterpenoid biosynthesis”, “ubiquinone and other terpenoid-quinone biosynthesis” and “isoquinoline alkaloid biosynthesis”. These results help explain why the tuberous root, including root cortex and xylem, is the principal medicinal part of P. heterophylla.
Fig 7

KEGG annotation of DEGs.

The heat map shows 31 pathways of secondary metabolism in different tissues, including leaf, stem, flower, root cortex and xylem. Expression differences are shown in different colors. Red represents high expression and green represents the low expression. 1_G_M, 3_G_M, 4_G_M indicates root xylem, 1_G_P, 3_G_P 4_G_P indicates root cortex, 1_G_YD, 3_G_YD, 4_G_YD indicates leaf, 1_G_J, 3_G_J, 4_G_J indicates stem, 1_G_H, 3_G_H, 4_G_H represents flower from three individual plants.

Identification of genes involved in triterpenoids saponins biosynthesis of P. heterophylla

We identified 70 candidate genes in P. heterophylla, including AACT (acetyl-CoA acetyltransferase), HMGS (HMG-CoA synthase), HMGR (HMG-CoA reductase), MVK (mevalonate kinase), PMK (phosphomevalonate kinase), MVD (mevalonate diphosphate decarboxylase), GGPS (geranylgeranyl pyrophosphate synthase), FPS (farnesyl diphosphate synthase), IDI (isopentenyl diphosphate isomerase), SS (squalene synthase), SE (squalene epoxidase), LuS (lupeol synthase), and β-A28O (β-amyrin 28-oxidase) (S4 Table). 2,3-Oxidosqualene cyclase is the key enzyme at the first committed step, and the skeleton of triterpenoid saponins in plants depends on its activity. Notably, three unigenes (c24484_g1, c60124_g1, c27529_g1) encoding lupeol synthase were identified from our transcriptome data, but none encoding β-amyrin synthase or dammarenediol-II synthase were identified. The heat map showed that most unigenes encoding AACT, HMGS, MK, PMK, MDD and IDI had high expression levels in flowers, leaves, stem, root cortex, and xylem (Fig 8). Some members of the HMGR, SE and β-A28O gene families were up-regulated in the root cortex and xylem while others were down-regulated. The unigenes encoding GGPS (c12012_g1, c99329_g1) and IDI (c1497_g1) were up-regulated specifically in leaf and stem. Some of the investigated genes showed high expression levels in the root cortex and xylem, such as FPS (c51143_g1, c54472_g1), SS (c65449_g2, c65449_g4, c66040_g4) and LuS (c60124_g1). The identification of genes involved in triterpenoid saponin biosynthesis may help explain the accumulation of saponins in different tissues of P. heterophylla. We validated the expression levels of 11 randomly selected genes using real-time PCR. The expression profiles of these unigenes were consistent with the transcriptomic data (Fig 9). Gene-specific primers were designed based on the gene sequences and are shown in S5 Table.
Fig 8

The expression profiles of unigenes involved in triterpene saponin biosynthesis of P. heterophylla.

Expression differences are shown in different colors. Red represents high expression and green represents the low expression.

Fig 9

The expression validation of candidate genes in triterpene saponin biosynthesis of P. heterophylla by qRT-PCR.

Error bars represent the mean (± SD) of three individual biologic experiments.

Discussion

High-throughput transcriptome sequencing has become a popular tool for sequencing non-model organisms such as Ginkgo kernels [20], Rehmannia glutinosa [21], Gossypium hirsutum [22], Liriodendron chinense [23], Ramia [24], and Centella asiatica [25]. We used the Illumina HiSeq 4000 sequencing platform to sequence RNA from flowers, leaves, stem, root cortex, and xylem of P. heterophylla. The number of unigenes (127,334) identified in our study was much higher than in the previous transcriptomic study of Pseudostellariae radix [14]. Our data provide a useful resource for gene identification and regulation in different tissues of P. heterophylla. Our transcriptomic data identified unigenes related to five endophytes. Three of these endophytes (M. violaceum, L. maculans, and P. fluorescens) are harmful to the development of plant organs [26-28]. F. oxysporum is an important replant disease pathogen in Pseudostellaria heterophylla rhizospheric soil [29] and has also been isolated from Chamaecyparis lawsoniana [30], Quercus variabilis [31] and Ephedra fasciculata [32]. Some active chemicals were previously purified from F. oxysporum, such as oxysporidinone (pyridine, antifungal) and beauvericin (cyclic peptide, anticancer) [33]. R. toruloides is an oleaginous yeast used for lipid production [34]. The transcriptome data and reverse transcription PCR indicated that transcripts of unigenes from M. violaceum, P. fluorescens and R. toruloides were detected in aboveground parts (leaf, stem and flower), whereas transcripts of unigenes from R. toruloides and L. maculans were detected in underground parts (root cortex and xylem). These results suggest that endophytes may participate in the interaction between plants and microorganisms and thus provide a novel guideline for the planting of P. heterophylla.
The transcriptomic data from different tissues showed that most DEGs were up-regulated either in aboveground parts (leaf, stem, and flower) or in underground parts (root cortex and xylem), while a few DEGs showed specific expression in certain tissues. The tuberous roots of sweet potato, cassava, and dahlia store nutrients, which permits survival from one year to the next. The formation of an enlarged area and secondary metabolic biosynthesis in the tuberous root are influenced by environmental factors including fungi, bacteria, and wounding [35, 36]. In this study, the pathways responding to these factors were up-regulated in both root cortex and xylem. The pathways related to response to cytokinin, jasmonic acid, light stimulus, and cold were specifically activated in the leaves. These results provide a better understanding of gene expression and regulation in different tissues of P. heterophylla. The unigenes involved in triterpenoid saponin biosynthesis of P. heterophylla were identified. The cyclization of 2,3-oxidosqualene, catalyzed by 2,3-oxidosqualene cyclases (OSCs), is the first committed step in triterpenoid saponin biosynthesis and yields various triterpene skeletons [37]. Although the structure of saponins in P. heterophylla is similar to that of saponins in P. vietnamensis and P. notoginseng, we did not identify any unigenes encoding β-amyrin synthase or dammarenediol-II synthase. The OSC family in plants includes four genes encoding β-amyrin synthase, dammarenediol-II synthase, lupeol synthase and cycloartenol synthase, respectively. Because of their high similarity, these triterpene synthases may have evolved in a complicated order from a common progenitor in triterpenoid saponin biosynthesis and sterol biosynthesis [38]. When the in vitro activities of OSCs were analyzed by expressing them in Saccharomyces cerevisiae, strains carrying OSC2 accumulated α-, β-, and δ-amyrin and strains carrying LuS accumulated α-amyrin and lupeol [39].
The above study suggests that the cyclization of 2,3-oxidosqualene in triterpenoid saponin biosynthesis of P. heterophylla may rely mainly on the activity of lupeol synthase. Moreover, identifying β-amyrin synthase will require more precise sequencing technology in the future. Our qRT-PCR results and transcriptome data showed that two unigenes (c65449_g2, c65449_g4) encoding squalene synthase and two unigenes (c59462_g1, c55401_g1) encoding squalene epoxidase in triterpenoid saponin biosynthesis were up-regulated in both root cortex and xylem. Unigenes encoding the GGPS (c99329_g1), IDI (c1497_g1), and MDD (c53051_g1) enzymes showed high expression in both aboveground parts (leaf and stem) and underground parts (root cortex and xylem). Triterpene saponins can be extracted from both the underground parts (tuberous root) and the aerial parts (leaf and stem) of P. heterophylla [40]; however, these triterpene saponins may also accumulate in specific tissues. Our study provides valuable information about the pathways for the synthesis of triterpenoid saponins. Future studies involving the isolation of key enzyme genes (OSCs) and their functional analysis are needed for a complete understanding of the triterpenoid biosynthetic pathways.

Different tissues. (PDF)
Reverse transcription PCR determination of unigenes from endophytes. (PDF)
Heatmap_Cortex_Flower_Leaf_Stem_Xylem. (PDF)
Cortex_vs_Flower.DE.list.enrichment.detail.xls.go. (PDF)
Cortex_vs_Leaf.DE.list.enrichment.detail.xls.go. (PDF)
Cortex_vs_Stem.DE.list.enrichment.detail.xls.go. (PDF)
Cortex_vs_Xylem.DE.list.enrichment.detail.xls.go. (PDF)
Cortex_vs_Flower.pathway. (PDF)
Cortex_vs_Leaf.new.pathway. (PDF)
Cortex_vs_Stem.pathway. (PDF)
Cortex_vs_Xylem.pathway. (PDF)
Primers of unigenes from endophytes used in reverse transcription PCR. (XLSX)
unigene_GO.list.level234.stat. (XLSX)
pathway_table. (XLSX)
Candidate genes involved in triterpenoid saponin biosynthesis of P. heterophylla. (XLSX)
List of gene-specific primers used in real-time PCR. (XLSX)