
Transcriptome changes in the phenylpropanoid pathway in senescing leaves of Toona sinensis.

Juanjuan Sui1, Changqing Qu1, Jingxia Yang1, Wenna Zhang2, Yuntao Ji1.   

Abstract

Toona sinensis is a deciduous tree native to eastern and southeastern Asia that has important culinary and cultural value. To expand current knowledge of the transcriptome and functional genomics in this species, a de novo transcriptome sequence analysis of young and mature leaf tissues of T. sinensis was performed using the Illumina platform. Over 8.1 Gb of data were generated, assembled into 64,541 unigenes, and annotated with known biological functions. Proteins involved in primary metabolite biosynthesis were identified based on similarities to known proteins, including some related to biosynthesis of carbohydrates, amino acids, lipids, and energy. Analysis of unigenes differentially expressed between young and mature leaves (transcriptomic libraries 'YL' and 'ML', respectively) showed that the KEGG pathways of phenylpropanoid, naringenin, lignin, cutin, suberin, and wax biosynthesis were significantly enriched in mature leaves. These results not only expand knowledge of transcriptome characteristics for this valuable species, but also provide a useful transcriptomic dataset to accelerate research on its metabolic mechanisms and functional genomics. This study can also further the understanding of the unique aromatic metabolism and Chinese medicinal properties of T. sinensis. © Franciszek Górski Institute of Plant Physiology, Polish Academy of Sciences, Kraków 2019.

Keywords:  Leaf senescence; Phenylpropanoid; Toona sinensis; Transcriptome analysis

Year:  2019        PMID: 32214546      PMCID: PMC7088779          DOI: 10.1007/s11738-019-2915-9

Source DB:  PubMed          Journal:  Acta Physiol Plant        ISSN: 0137-5881            Impact factor:   2.354


Introduction

Chinese mahogany (Taihe Toona sinensis Roem, syn. Cedrela sinensis, family Meliaceae) is a perennial woody tree that grows up to 25 m high and is used as a source of food, timber, and medicine, particularly in the Anhui province of China. Regarded as a nutritious food, the edible buds and young leaves are commonly used to make the condiment Toona Paste, which has a floral and onion-like flavor (Park et al. 1996; Edmonds and Staniforth 1998). The unique flavor results from various natural compounds including triterpenes, phenolics, flavonoids, and the amino acid lysine (Mu et al. 2007; Zhou et al. 2011; Kakumu et al. 2014; Zhang et al. 2015). The mature, fibrous leaves of T. sinensis are used in traditional Chinese medicine to treat conditions ranging from diarrhea and other intestinal complaints to reproductive concerns and cancer. Recently, other biological properties of T. sinensis leaf extracts have been reported, including anti-inflammatory, analgesic, boil growth-inhibiting, antioxidant, anti-diabetic, anti-neoplastic, and anti-atherosclerotic activities, as well as inhibition of replication of the severe acute respiratory syndrome (SARS) coronavirus and of the pandemic influenza A (H1N1) virus (Hsu et al. 2003; Chia et al. 2010; Huang et al. 2012; Yang et al. 2013, 2014; You et al. 2013). The nutritional value and potential health benefits of T. sinensis require further investigation. Currently, only very limited information is available about the compounds contributing to the flavor of young leaves and the medicinal content of mature leaves of T. sinensis. Only a few reports have addressed the effects of flavonoids on the taste of young leaves. Flavonoids, lysine, and polyphenols increase the antioxidant capacity of plant cells and associated tissues, and are responsible for the antioxidant properties of T. sinensis buds and young leaves (Wang et al. 2007; Vinodhini and Lokeswari 2014).
Recent rapid developments in bioinformatics have allowed the transcriptome approach to emerge as a powerful method for direct sequencing. RNA-Seq, or whole transcriptome shotgun sequencing, can now be used for transcriptome studies due to its high-throughput and high-resolution capabilities (Young et al. 2010; Torre et al. 2014). RNA-Seq allows analysis of complex transcriptional regulation and variable metabolic pathways of different flavonoids, including across different groups or tissues (Shi et al. 2014). Previous transcriptome studies in other species have improved understanding of multiple aspects of the biochemistry, development, and metabolism of leaves and shoots, and have provided new insights into the biosynthesis of metabolic compounds (Long et al. 2014; Wang et al. 2015). In this study, we sequenced the transcriptomes of young and mature leaves of T. sinensis. RNA sequencing data were de novo assembled and annotated, and candidate gene expression changes were characterized. For the first time, molecular regulation of the phenylpropanoid and naringenin biosynthesis pathways was characterized in this species. The transcriptome differences between young and mature leaves described in the current study provide crucial resources for gene annotation, gene discovery, and gene function analysis. Moreover, our sequencing results enhance understanding of the biosynthesis of phenylpropanoid and cutin and provide insights into the potential molecular mechanisms of pharmacological action in T. sinensis, which may help improve the production and yield of phenylpropanoids for medicinal or culinary uses.

Materials and methods

Plant materials and growth conditions

Mature 5-year trees of the T. sinensis cultivar ‘Heiyouchun’ were sampled from a T. sinensis industry demonstration zone in Taihe County, Anhui, China. The first to third pinnate fronds with purple color were identified as young leaves (YL), and the sixth to eighth green pinnate fronds were considered as mature leaves (ML) (Fig. 1a). Young and mature leaves were harvested randomly from three T. sinensis clones, which were propagated by asexual reproduction and thus had the same genetics as the ‘Heiyouchun’ cultivar. At least 20 YL or ML were mixed in each sample pool for RNA-seq analysis. All samples were immediately immersed in liquid nitrogen and stored at − 80 °C.
Fig. 1

Constructed RNA libraries of young and mature leaves and alignment of unigenes annotated by databases. a The first to third pinnate fronds with purple color are identified as young leaves (YL) indicated by yellow stars, and the sixth to eighth green pinnate fronds are mature leaves (ML) indicated by white stars. At least 20 YL and ML were harvested randomly from three T. sinensis ‘Heiyouchun’ cultivars and mixed in two sample pools to construct RNA libraries. b The final clean data of YL and ML were obtained from raw data by discarding the adapters (> 5 bp), low-quality fragments (with a quality score Q ≤ 19) or N (unknown nucleotide) content > 5%, and fragments shorter than 50 bp (including redundant sequences) with Trinity software. c Comparison of the matching sequences with recently used NCBI sequence homologies showed 20,515 (43.06%) unigenes out of 64,541 identified unigenes were successfully annotated using BLAST searches of the public Nr, Nt, BLASTp, BLASTx databases


RNA extraction and cDNA library construction

Total RNA was extracted from leaf samples with TRIzol Reagent (Cat. #15596026, Invitrogen, Carlsbad, CA, USA) and then treated with DNase I (Cat. #18047019, Invitrogen) according to established methods. To determine RNA quality and concentration, 1 µl of each RNA sample was electrophoresed (2% agarose, 1× TBE) and quantified using a NanoDrop ND-1000 (Thermo Scientific). In addition, the RNA integrity number (RIN) was determined with the Agilent 2100 BioAnalyzer (Agilent Technologies, Santa Clara, CA, USA). At least 20 µg of total RNA (concentration ≥ 250 ng/µl, OD260/280 = 1.8–2.2, OD260/230 ≥ 2.0, 28S:18S ≥ 1.0) was combined with oligo(dT) magnetic beads for library construction, after confirming that the RIN value was greater than 8.0. RNA-Seq libraries were prepared using the TruSeq RNA Sample Prep Kit (RS-122-2001, Illumina Inc., San Diego, CA, USA). Buffer reagent was used to fragment the extracted mRNA, and the fragmented mRNA was reverse transcribed into cDNA; purified short fragments were end-repaired and ligated with adaptors. The cDNA was enriched by PCR amplification and its quality was confirmed with the BioAnalyzer, after which RT-PCR was used to quantify the cDNA library. The library was then sequenced (Illumina HiSeq™ 4000, BGI, Shenzhen, China), generating 150-bp paired-end reads.

Data processing and de novo assembly

Raw reads were preprocessed using the filter-fq software (v1.2.0; https://github.com/bowentan/filterfq) to discard adapters (> 5 bp), low-quality fragments (quality score Q ≤ 19), fragments with an N (unknown nucleotide) content > 5%, and fragments shorter than 50 bp (including redundant sequences). These clean, high-quality data were used to calculate Q20 and Q30 values, GC content, and sequence duplication levels, and for all downstream analyses. The resulting paired-end reads were clustered using TGICL software (Pertea et al. 2003) to analyze the length and distribution of the transcript and unigene clusters. Paired-end sequences were separated into two files, with “left” reads in the “left.fq” file and “right” reads in the “right.fq” file. Reads that uniquely mapped to the left contigs were considered to be derived from T. sinensis. Reads matching at the genus level were classified as right reads. Reads still unmatched at this stage were treated as singleton reads and were also placed into the right.fq file. Potential transcripts and unigenes were assembled from the pooled clean reads of the left.fq and right.fq files using Trinity software (v20140717) (Manfred 2011).
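The filtering criteria above can be illustrated with a small sketch. This is not the filter-fq implementation: the function names are hypothetical, adapter trimming is omitted, and applying the Q ≤ 19 cut-off to the mean read quality is an assumption, since the text does not say whether the threshold applies per base or per read.

```python
def phred33(qual_string):
    """Decode a FASTQ quality string (Phred+33 encoding) into integer scores."""
    return [ord(ch) - 33 for ch in qual_string]


def keep_read(seq, quals, min_len=50, max_n_frac=0.05, min_mean_q=20):
    """Return True if a read passes the length, N-content, and quality filters.

    Assumptions (not stated in the source): the quality cut-off is applied
    to the mean Phred score, so reads with mean Q <= 19 are discarded.
    """
    # Discard fragments shorter than 50 bp.
    if len(seq) < min_len:
        return False
    # Discard reads whose unknown-nucleotide (N) content exceeds 5%.
    if seq.upper().count("N") / len(seq) > max_n_frac:
        return False
    # Discard low-quality reads (mean Q of 19 or lower).
    mean_q = sum(quals) / len(quals)
    return mean_q >= min_mean_q
```

A read of 150 bases with uniformly high quality and no N calls would pass all three checks, whereas a 40-base read would be rejected on length alone.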

Gene annotation and analysis

Trinotate was used to perform the functional annotation of unigenes and ORFs (Bryant et al. 2017). We processed all unigene sequences for identification and functional annotation including homology search with known databases including NCBI’s Nt (nonredundant nucleotide sequences), GO (gene ontology), COG (Cluster of Orthologous Groups), and KEGG (Kyoto Encyclopedia of Genes and Genomes). The highest similarity of aligned proteins was used for functional annotation of unigene sequences. First, BLASTx and BlastN (both with parameters of match length ≥ 90 bp, e value < 1e−5, and the allowance of ≤ 1 mismatch and 1 gap, and identity ≥ 90%) were used to align unigenes to protein databases and Nt, respectively. Subsequently, ESTScan software was used to determine sequence direction. Then Blast2GO was employed to determine GO annotation against the GO database for unigenes annotated by NCBI Nr (nonredundant protein sequences) (Conesa et al. 2005; Götz et al. 2008). InterProScan (v5) was used to give further protein annotation. Prediction of protein-coding regions was performed with OrfPredictor software (Min et al. 2005). Additionally, GO functional classification of all unigenes was performed using the Web Gene Ontology Annotation Plot (WEGO) software (Ye et al. 2006), which visualizes and characterizes gene functions and distributions across different pathways.

Analysis of differentially expressed genes (DEGs)

Gene expression levels were estimated by mapping clean reads to the Trinity transcript assembly using RNA-Seq by Expectation–Maximization (RSEM) (Li and Dewey 2011) for each sample. The abundance of each unigene was normalized using the Reads Per Kilobase per Million mapped reads (RPKM) method (Mortazavi et al. 2008) as follows:

$$\mathrm{RPKM} = \frac{10^{9}\, C}{N \times L}$$

where C is the number of mapped reads uniquely aligned to a unigene, N is the total number of sequenced reads uniquely aligned to all unigenes, and L is the length of the unigene in base pairs. The DEGSeq package in R was used to conduct differential expression analysis for the young and mature leaves by modeling count data with negative binomial distributions (Anders and Huber 2010). P values were adjusted to reduce false positives due to multiple testing (Storey and Tibshirani 2003), with a q value < 0.05 and |log2 (ratio)| ≥ 1 set as the thresholds for significantly differential expression between the two samples. The identified differentially expressed genes (DEGs) were analyzed according to KEGG enrichment pathways and GO functional categories. GO enrichment analyses were conducted in GOseq, using Wallenius’ noncentral hypergeometric distribution to search for and map all significantly enriched GO terms among the DEGs (Young et al. 2010). KEGG online tools were used for pathway enrichment analysis of the DEGs (http://www.kegg.jp/) (Mao et al. 2005).
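As an illustration of the normalization and thresholds described above, the following sketch computes RPKM in the standard form of Mortazavi et al. (10^9 × C / (N × L)) and applies the q value and fold-change cut-offs. The function names are hypothetical, and the sketch assumes both expression values are positive; it does not reproduce the DEGSeq statistics that generate the q values.

```python
import math


def rpkm(c, n, l):
    """RPKM for one unigene.

    c: reads uniquely aligned to the unigene
    n: total reads uniquely aligned to all unigenes
    l: unigene length in base pairs
    """
    return 1e9 * c / (n * l)


def is_deg(rpkm_ml, rpkm_yl, q, q_cut=0.05, lfc_cut=1.0):
    """Apply the thresholds q < 0.05 and |log2(ML/YL)| >= 1.

    Assumes both RPKM values are positive; zero expression would need a
    pseudocount or separate handling, which the source does not describe.
    """
    if rpkm_ml <= 0 or rpkm_yl <= 0:
        raise ValueError("RPKM values must be positive")
    log2_ratio = math.log2(rpkm_ml / rpkm_yl)
    return q < q_cut and abs(log2_ratio) >= lfc_cut
```

For example, a 1000-bp unigene with 1000 reads out of one million mapped reads has an RPKM of 1000, and a gene with a fourfold change only counts as a DEG if its q value is also below 0.05.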

Results

Sequencing and de novo assembly of the transcriptome of T. sinensis young and mature leaves

To minimize genetic differences among individual trees, at least 20 young leaves (YL) or mature leaves (ML) from three individual T. sinensis ‘Heiyouchun’ clones were mixed in each sample pool for RNA extraction. To construct high-quality YL and ML RNA libraries, total RNA quality was assessed by agarose gel electrophoresis, NanoDrop, and RNA integrity number (RIN) (Supplemental Fig. 1). The YL RNA yield was 38 µg (concentration = 411 ng/µl, OD260/280 = 2.1, OD260/230 = 2.0, 28S/18S = 1.8, RIN = 10), and the ML RNA yield was 24 µg (concentration = 532 ng/µl, OD260/280 = 2.1, OD260/230 = 1.3, 28S/18S = 1.7, RIN = 8.9). This quality was satisfactory for library construction. To generate a complete T. sinensis leaf transcriptome, two cDNA libraries from YL and ML were constructed and sequenced using the Illumina HiSeq™ 4000 platform, generating 3.82 and 3.10 Gb of raw RNA-seq data, respectively. After removal of adaptor-polluted, redundant, and other low-quality sequences, 3.32 and 2.80 Gb of clean reads from YL and ML, respectively, were retained and assembled. For these clean reads, the Q30 scores (sequencing error rate, 0.1%) were 97.32% and 95.44%, and the GC contents were 40.06% and 40.26%, for the ‘YL’ and ‘ML’ transcriptome libraries, respectively (Fig. 1b, Table 1).
Table 1

Description of two samples of T. sinensis transcriptome

| Samples | Mature leaves | Young leaves |
|---|---|---|
| Raw reads number | 42,092,488 | 46,710,424 |
| Raw bases number | 6,313,873,200 | 7,006,563,600 |
| Raw reads length (bp) | 150 | 150 |
| Clean reads number | 39,413,790 | 42,323,334 |
| Clean bases number | 5,912,068,500 | 6,348,500,100 |
| Clean reads length (bp) | 150 | 150 |
| Clean reads rate (%) | 93.64 | 90.61 |
| Adapter polluted reads number | 185,512 | 206,652 |
| Adapter polluted reads rate (%) | 0.44 | 0.44 |
| Ns reads number | 3730 | 4510 |
| Ns reads rate (%) | 0.01 | 0.01 |
| Low-quality reads number | 2,489,456 | 4,175,928 |
| Low-quality reads rate (%) | 5.91 | 8.94 |
| Raw Q30 bases rate (%) | 95.39 | 92.66 |
| Clean Q30 bases rate (%) | 97.32 | 95.44 |
After filtration, the Trinity tool was used to assemble independent high-quality clean sequences from each library, which were further merged, generating 102,881 transcripts and 64,541 unigenes. The transcripts and unigenes totaled 107,527,675 bp and 53,892,623 bp, with GC contents of 40.16% and 39.94%, respectively. The N50 and N90 values were 1758 and 417 bp for transcripts and 1563 and 313 bp for unigenes. The mean lengths of transcripts and unigenes were 1045 bp and 835 bp, respectively (Table 2). An overview of the sequence size distribution of transcripts and unigenes is shown in Supplemental Table 1.
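For readers unfamiliar with the N50 and N90 statistics reported here, the sketch below shows the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least 50% (or 90%) of the total assembled bases. The function name is hypothetical, and this is a generic illustration rather than the routine used by Trinity.

```python
def nxx(lengths, fraction=0.5):
    """Return the Nxx statistic (N50 by default) for a list of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    # Walk from the longest sequence down until the target share is covered.
    for length in ordered:
        running += length
        if running >= target:
            return length
    raise ValueError("lengths must be non-empty")
```

Calling `nxx(lengths)` gives the N50 and `nxx(lengths, 0.9)` the N90; because N90 requires covering more of the assembly, it is never larger than N50, as in Table 2.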
Table 2

Summary of de novo sequence assembly for Toona sinensis

| Assembly parameters | Transcript | Unigene |
|---|---|---|
| Transcripts generated | 102,881 | 64,541 |
| N50 (bp) | 1758 | 1563 |
| N90 (bp) | 417 | 313 |
| Minimum length | 201 | 201 |
| Maximum length | 17,855 | 17,855 |
| Mean length | 1045 | 835 |
| GC percent (%) | 40.16 | 39.94 |
| Total bases | 107,527,675 | 53,892,623 |
The quality and quantity of raw sequence data were sufficient to perform further analysis. Of the unigenes, 63.16% (40,767) were between 200 and 600 bp in length, 12.11% (7814) were between 600 and 1000 bp, 14.19% (9160) were between 1 and 2 kb, and 8.90% (5743) were between 2 and 4 kb, while unigenes longer than 4 kb accounted for only 1.64% (1057) (Table 3).
Table 3

Sequence size of transcripts and unigenes of Toona sinensis

| Length range | Transcripts: number | Transcripts: percentage (%) | Unigenes: number | Unigenes: percentage (%) |
|---|---|---|---|---|
| 200–600 bp | 49,842 | 48.45 | 40,767 | 63.16 |
| 600–1 kb | 14,998 | 14.58 | 7814 | 12.11 |
| 1–2 kb | 22,665 | 22.03 | 9160 | 14.19 |
| 2–4 kb | 13,444 | 13.07 | 5743 | 8.90 |
| > 4 kb | 1932 | 14.94 | 1057 | 1.64 |

Functional annotation

To obtain functional annotations, we subjected all generated unigenes to serial BLASTx alignment against the NCBI databases, with an e-value cut-off of 1e−5. In total, 75,779,461 raw reads (27.80% of the total reads) were annotated. Of these unigenes, 1746 were annotated with the Nr database and 726 with Nt (Fig. 1c); 33,791 with UniProt (including Swiss-Prot, TrEMBL, and PIR-PSD); 20,515 with GO (Supplemental Table 2); 8696 with COG; 5482 with KEGG; and 23,970 with PFAM. The 20,515 unigenes annotated with GO were assigned to categories including molecular functions, cellular processes, and biological processes (Fig. 2). Within biological processes, the two most abundant groups of unigenes belonged to cellular processes (4644, 22.63%) and metabolic processes (4375, 21.33%). Unigenes involved in cellular processes were distributed in cell and cell parts (6044 unigenes, 33.83%), organelles (2052, 11.48%), and plasma membrane (3377, 18.90%). Unigenes involved in molecular functions played roles in binding (4744, 42.59%) and catalytic activity (4107, 36.87%), whereas the remaining 20.53% represented proteins with other activities, including transporters, structural molecules, molecular transducers, enzyme regulators, receptors, antioxidants, electron carriers, and transcription factors.
Fig. 2

GO annotation categories of assigned unigenes. The annotated 20,515 unigenes were assigned to GO annotation categories of molecular function, cellular process, and biological process categories

COG analysis aligned 8696 unigenes for functional classification (Fig. 3). A general function was predicted for 14.26% (1240 unigenes), while translation, ribosomal structure, and biogenesis accounted for 9.50% (826); posttranslational modification for 8.41% (732); carbohydrate transport and metabolism for 6.41% (556); amino acid transport and metabolism for 5.76% (501); replication for 4.37% (380); and transcription for 2.94% (256).
Fig. 3

Functional classification with the COG database for assigned unigenes. A total of 8696 unigenes were aligned to data in the COG database for functional classification

Assembled unigenes were assigned to metabolic pathways in the KEGG database based on sequence similarity (Fig. 4). Of the 5482 unique mapped sequences, 14.61% (801) were assigned to amino acid metabolism pathways and 8.31% (456) to ribosome metabolism and translation pathways; 2.77% (152) were involved in the immune system; 2.04% (112) were classified under biosynthesis of secondary metabolites; 1.17% (64) under metabolism of terpenoids and polyketides; 0.97% (53) were assigned to phenylpropanoid biosynthesis; and 0.53% (14) to flavonoid biosynthesis.
Fig. 4

KEGG metabolic pathways of assembled unigenes. Assembled unigenes were assigned to metabolic pathways in the KEGG database based on sequence similarity


Differentially expressed genes between young and mature leaves

To identify genes with different expression levels between YL and ML, the unigene expression levels were calculated with the RPKM method, which accounts for effects of both sequencing depth and gene length on the read count (Fig. 5a). A total of 15,172 unigenes had differential expression (with q value < 0.05 and |log2 (ratio)| ≥ 1) between the two samples and thus were identified as differentially expressed genes (DEGs). Among these DEGs, 9648 were up-regulated and 5524 were down-regulated in ML compared with YL (Fig. 5b). DEGs mapped within each GO term category were counted. The hypergeometric test revealed that a total of 67 functional groups, including molecular functions, cellular components, and biological processes, showed remarkable enrichment in DEGs compared with the transcriptomic background (Fig. 6).
Fig. 5

The comparison of expression levels between mature leaves and young leaves. The unigenes’ expression levels based on RPKM (a) and fold change (b). q value < 0.05 and |log2 (ratio)| ≥ 1

Fig. 6

GO annotation categories with differentially expressed unigenes. All DEGs were mapped to each GO database term and counted within the corresponding GO term categories. DEGs when a cutoff ratio of |log2 (ratio)| ≥ 1, and q value < 0.05

Several enriched pathways, including amino acid biosynthesis, signal transduction, and metabolic pathways, were identified using KEGG enrichment analysis of DEGs. A total of 308 pathways containing DEGs are shown in Table 4 and Supplemental Table 4, with 22 pathways significantly over-represented. Pathways significantly enriched in YL samples were primarily related to plant biological processes and human pathogen resistance pathways, including ribosome (ko03010); cell cycle (ko04110); RNA transport (ko03013); ribosome biogenesis in eukaryotes (ko03008); DNA replication (ko03030); HTLV-I infection (ko05166); Fanconi anemia pathway (cellular response to DNA interstrand crosslinks) (ko03460); and systemic lupus erythematosus (ko05322). The most enriched pathways in ML samples were related to secondary metabolism, including ribosome (ko03010); phenylpropanoid biosynthesis (ko00940); pathogenic E. coli infection (ko05130); axon guidance (ko04360); hypertrophic cardiomyopathy (HCM) (ko05410); cutin, suberin, and wax biosynthesis (ko00073); dilated cardiomyopathy (ko05414); photosynthesis-antenna proteins (ko00196); malaria (similar to the plant galactolipid metabolism pathway) (ko05144); flavonoid biosynthesis (ko00941); nitrogen metabolism (ko00910); carotenoid biosynthesis (ko00906); limonene and pinene degradation (ko00903); and stilbenoid, diarylheptanoid, and gingerol biosynthesis (ko00945).
Excluding the ribosome pathway, phenylpropanoid biosynthesis was the most significantly enriched of the metabolic pathways related to the specific medicinal traits of T. sinensis, and it included 33 up-regulated genes and 20 down-regulated genes.
Table 4

Significant KEGG enrichment analysis of young leaf and mature leaf DEGs of T. sinensis

| New leaf vs. mature leaf | Map | Count1 | Count2 | Count3 | Count4 | p | q | Up_Count | Down_Count | Significance |
|---|---|---|---|---|---|---|---|---|---|---|
| Ribosome | map03010 | 265 | 66 | 1745 | 2923 | 4.45E−53 | 1.37E−50 | 48 | 217 | Yes |
| Phenylpropanoid biosynthesis | map00940 | 53 | 41 | 1957 | 2948 | 0.001006 | 0.01822 | 33 | 20 | Yes |
| Pathogenic Escherichia coli infection | map05130 | 34 | 14 | 1976 | 2975 | 1.57E−05 | 0.000805 | 17 | 17 | Yes |
| Axon guidance | map04360 | 27 | 14 | 1983 | 2975 | 0.000767 | 0.016878 | 14 | 13 | Yes |
| Hypertrophic cardiomyopathy (HCM) | map05410 | 21 | 9 | 1989 | 2980 | 0.000899 | 0.017299 | 14 | 7 | Yes |
| Cutin, suberin and wax biosynthesis | map00073 | 23 | 6 | 1987 | 2983 | 1.95E−05 | 0.000858 | 13 | 10 | Yes |
| Dilated cardiomyopathy | map05414 | 19 | 8 | 1991 | 2981 | 0.001438 | 0.023305 | 13 | 6 | Yes |
| Arrhythmogenic right ventricular cardiomyopathy (ARVC) | map05412 | 19 | 7 | 1991 | 2982 | 0.000682 | 0.016161 | 12 | 7 | Yes |
| Cell adhesion molecules (CAMs) | map04514 | 16 | 5 | 1994 | 2984 | 0.000873 | 0.017299 | 11 | 5 | Yes |
| ECM–receptor interaction | map04512 | 13 | 4 | 1997 | 2985 | 0.002626 | 0.036758 | 10 | 3 | Yes |
| Malaria | map05144 | 12 | 3 | 1998 | 2986 | 0.002005 | 0.029412 | 10 | 2 | Yes |
| Hematopoietic cell lineage | map04640 | 11 | 2 | 1999 | 2987 | 0.001367 | 0.023305 | 9 | 2 | Yes |
| RNA transport | map03013 | 86 | 68 | 1924 | 2921 | 4.96E−05 | 0.001911 | 8 | 78 | Yes |
| Photosynthesis-antenna proteins | map00196 | 8 | 0 | 2002 | 2989 | 0.000677 | 0.016161 | 8 | 0 | Yes |
| HTLV-I infection | map05166 | 61 | 24 | 1949 | 2965 | 2.83E−09 | 2.90E−07 | 6 | 55 | Yes |
| Systemic lupus erythematosus | map05322 | 29 | 8 | 1981 | 2981 | 2.36E−06 | 0.000145 | 6 | 23 | Yes |
| Gap junction | map04540 | 15 | 5 | 1995 | 2984 | 0.001683 | 0.025917 | 6 | 9 | Yes |
| Progesterone-mediated oocyte maturation | map04914 | 41 | 16 | 1969 | 2973 | 1.08E−06 | 8.28E−05 | 2 | 39 | Yes |
| Cell cycle | map04110 | 82 | 29 | 1928 | 2960 | 3.61E−13 | 5.55E−11 | 1 | 81 | Yes |
| Ribosome biogenesis in eukaryotes | map03008 | 51 | 32 | 1959 | 2957 | 6.66E−05 | 0.002278 | 1 | 50 | Yes |
| DNA replication | map03030 | 32 | 18 | 1978 | 2971 | 0.000545 | 0.01527 | 0 | 32 | Yes |
| Fanconi anemia pathway | map03460 | 28 | 12 | 1982 | 2977 | 0.000125 | 0.003855 | 0 | 28 | Yes |
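The enrichment analyses reported here rest on hypergeometric-type tests. As a minimal sketch of the idea, the function below computes the upper-tail probability of observing at least k DEGs in a pathway. The function name is hypothetical, and this is the plain hypergeometric test, not the Wallenius noncentral variant used by GOseq. Reading Count1 to Count4 as a 2×2 table (DEGs in the pathway, non-DEGs in the pathway, DEGs elsewhere, non-DEGs elsewhere) is an inference from the column sums in Table 4, not something the source states, so the usage line is illustrative only.

```python
from math import comb


def hypergeom_enrichment_p(k, pathway_size, n_degs, n_total):
    """P(X >= k) for X ~ Hypergeometric.

    k:            DEGs observed in the pathway
    pathway_size: all annotated genes in the pathway
    n_degs:       all annotated DEGs
    n_total:      all annotated genes (the background)
    """
    upper = min(pathway_size, n_degs)
    tail = sum(
        comb(pathway_size, i) * comb(n_total - pathway_size, n_degs - i)
        for i in range(k, upper + 1)
    )
    return tail / comb(n_total, n_degs)


# Illustrative call for the phenylpropanoid row, under the assumed reading
# of the count columns: 53 DEGs among 53 + 41 pathway genes, with
# 53 + 1957 DEGs among 53 + 41 + 1957 + 2948 background genes.
p_phenylpropanoid = hypergeom_enrichment_p(53, 94, 2010, 4999)
```

Exact integer arithmetic with `math.comb` avoids the rounding problems of factorial-based formulas at this scale; the resulting p values would still need multiple-testing correction before comparison with a q value threshold.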

Differentially expressed genes (DEGs) related to phenylpropanoid biosynthesis in mature leaves

Many unigenes related to phenylpropanoid biosynthesis were identified in the ML transcriptome (Fig. 7). Transcriptome analysis revealed that 53 enzyme genes related to phenylpropanoid biosynthesis (Table 5) were differentially expressed in ML compared with YL (33 up-regulated and 20 down-regulated), including genes related to the general phenylpropanoid pathway [4CL (6)], caffeic acid biosynthesis [CoumCoA3H (2), HCT (4), CYP98A (2)], and the later steps of lignin biosynthesis [CCR (5), CAD (5), REF1 (2), POD (5), CAD (3)] (Supplemental Fig. 1). These results indicated that caffeoyl-CoA, flavonoids, and lignin were each metabolized in an enzyme-dependent manner and accumulated in ML extracts. In addition, almost all major enzyme genes involved in cutin, suberin, and wax biosynthesis were annotated in this pathway (Supplemental Fig. 2; Supplemental Table 3). Despite this additional information, the molecular mechanism of cutin, suberin, and wax biosynthesis in mature leaves of T. sinensis remains uncertain and requires further study.
Fig. 7

Schematic diagram of the phenylpropanoid biosynthesis pathway. Differentially expressed genes involved in the phenylpropanoid biosynthesis pathway in response to leaf senescence in T. sinensis. The red-colored names of enzymes indicate the response pattern (up-regulated) of the unigenes that encoded the corresponding enzyme in mature leaf. Numbers of putative unigenes encoding enzymes are given for T. sinensis in parentheses

Table 5

Changes in transcript abundance of candidate genes related to phenylpropanoid biosynthesis in old leaves and new leaves

| Gene | Mature leaf normalization | Young leaf normalization | log2 fold change | P value | Up/down | PFAM name | PFAM description |
|---|---|---|---|---|---|---|---|
| c35946_g1 | 1.21997 | 0.013848952 | 6.46093 | 1.20E−11 | Up | p450 | Cytochrome P450 |
| c36184_g1 | 3.25325 | 0.055395809 | 5.87596 | 7.72E−29 | Up | Methyltransf_3 | O-Methyltransferase |
| c44663_g2 | 0.31952 | 0.013848952 | 4.52804 | 0.00067591 | Up | p450 | Cytochrome P450 |
| c34177_g1 | 30.7897 | 2.215832377 | 3.79653 | 2.43E−221 | Up | Peroxidase | Peroxidase |
| c22474_g1 | 7.08745 | 0.52626019 | 3.75142 | 5.25E−52 | Up | CYP98A3 | C3′H; Coumaroylquinate (coumaroylshikimate) 3′-monooxygenase |
| c16277_g1 | 1.33616 | 0.110791619 | 3.59217 | 8.85E−11 | Up | p450 | Cytochrome P450 |
| c56414_g1 | 1.77186 | 0.166187428 | 3.41438 | 2.47E−13 | Up | Beta-glucosidase | Beta-glucosidase |
| c43864_g1 | 4.61846 | 0.498562285 | 3.21157 | 1.11E−30 | Up | peroxidase | Peroxidase |
| c6701_g1 | 211.026 | 26.14682205 | 3.01271 | 0 | Up | Epimerase | NAD dependent epimerase/dehydratase family |
| c35488_g1 | 0.58094 | 0.083093714 | 2.80557 | 0.00012938 | Up | Aldedh | Aldehyde dehydrogenase family |
| c39663_g1 | 49.4378 | 7.173757321 | 2.78481 | 4.78E−271 | Up | Transferase | Transferase family |
| c44868_g1 | 96.0581 | 17.44967997 | 2.46071 | 0 | Up | Glyco_hydro_1 | Glycosyl hydrolase family 1 |
| c41387_g1 | 140.965 | 29.11049785 | 2.27572 | 0 | Up | Methyltransf_3 | O-Methyltransferase |
| c51066_g1 | 65.036 | 15.31694131 | 2.08611 | 4.85E−255 | Up | Glyco_hydro_1 | Glycosyl hydrolase family 1 |
| c51757_g1 | 31.3416 | 222.6080602 | −2.8284 | 0 | Down | Glyco_hydro_3 | Glycosyl hydrolase family 3 N-terminal domain |
| c36298_g2 | 5.02512 | 44.45513707 | −3.1451 | 2.35E−274 | Down | ADH_N | Alcohol dehydrogenase GroES-like domain |
| c44584_g1 | 0.52284 | 11.46693255 | −4.455 | 1.57E−91 | Down | Peroxidase | Peroxidase |
| c31990_g1 | 0.11619 | 7.450736368 | −6.0029 | 9.94E−64 | Down | Peroxidase | Peroxidase |
| c42734_g1 | 0.23238 | 2.188134472 | −3.2352 | 1.84E−15 | Down | Peroxidase | Peroxidase |
| c51571_g3 | 0.66808 | 3.185259042 | −2.2533 | 2.35E−15 | Down | Peroxidase | Peroxidase |
| c37802_g1 | 0.05809 | 0.99712457 | −4.1013 | 4.73E−09 | Down | Peroxidase | Peroxidase |
| c57658_g1 | 0.01452 | 0.775541332 | −5.7387 | 5.28E−08 | Down | p450 | Cytochrome P450 |
| c15574_g1 | 0.05809 | 0.609353904 | −3.3908 | 1.86E−05 | Down | Peroxidase | Peroxidase |
| c346_g1 | 28.0012 | 214.4648762 | −2.9372 | 0 | Down | adh_short | Short-chain dehydrogenase |
| c29453_g1 | 77.5262 | 193.9684267 | −1.3231 | 0 | Down | ADH_N | Alcohol dehydrogenase GroES-like domain |
| c46059_g1 | 215.47 | 94.72683412 | 1.18564 | 0 | Up | p450 | Cytochrome P450 |
| c35840_g1 | 23.6732 | 84.70019262 | −1.8391 | 5.20E−281 | Down | Glyco_hydro_3 | Glycosyl hydrolase family 3 N-terminal domain |
| c37180_g1 | 35.8148 | 100.1279255 | −1.4832 | 1.03E−244 | Down | Glyco_hydro_1 | Glycosyl hydrolase family 1 |
| c38152_g1 | 28.1174 | 2.077342854 | 3.75865 | 6.25E−201 | Up | Peroxidase | Peroxidase |
| c53977_g1 | 0.63903 | 0.166187428 | 1.94308 | 0.00124426 | Up | Peroxidase | Peroxidase |
| c4677_g1 | 0.29047 | 0.013848952 | 4.39054 | 0.00128935 | Up | 4CL | 4-Coumarate-CoA ligase |
| c58363_g1 | 0.01452 | 0.249281142 | −4.1013 | 0.00340948 | Down | Peroxidase | Peroxidase |

Discussion

Comparison of software packages for detecting gene differential expression of T. sinensis young and mature leaves

Transcriptome sequencing can be used to efficiently and effectively analyze the cellular transcriptome. Many computational software packages and pipelines have already been widely used during RNA-seq data analysis, including edgeR (Robinson et al. 2010), DESeq (Anders and Huber 2010), DEGSeq (Wang et al. 2010), and limma (Smyth 2004). edgeR is normally used to determine differential expression with empirical Bayes estimation and exact tests based on a negative binomial model. edgeR can be used for small numbers of replicates with over-dispersed data to assess differential gene expression. TMM normalization and Benjamini–Hochberg procedures are used as default to control sequencing depths and FDR, respectively (Robinson et al. 2010). Similar to edgeR, DESeq also uses a negative binomial model, a scaling factor normalization procedure and the Benjamini–Hochberg procedure to control sequencing depths and FDR of different samples, but exhibits more general dispersion estimation and balanced selection of DEGs. DESeq is technically possible to use with experiments without any biological replicates but this is not recommended (Anders and Huber 2010). Limma was originally used for microarray data analysis but was later extended to RNA-seq data. TMM normalization of the edgeR package and ‘voom’-conversed log2 scale are used to determine weight prior to linear modeling. The Benjamini–Hochberg procedure is used as default to estimate FDR (Smyth 2004). DEGseq exports gene expression values in a table format, which are then directly processed by edgeR. It analyzes gene expression based on a random sampling model or raw counts in Poisson distribution model. DEGseq can also be applied to identify differential expression of exons or pieces of transcripts with or without a small number of replicates. In our study, to get higher sequencing depth and detect subtle gene expression changes, we directly pooled 20 individual biological replicates together into YL and ML sample groups. 
Because of the lack of replicates, DEGseq was more suitable than the other programs for differential gene expression analysis. When analyzing a single replicate, the DEGseq package first normalizes the sample according to its internal algorithm (this step avoids bias to some extent), and differences are then assessed on the normalized data rather than directly on the original input data. To sort out reliable DEGs, the software calculates the corresponding p value and corrected q value. In addition, DESeq detected DEGs based on gene expression levels, using statistical methods built on the negative binomial distribution. The obtained p values were corrected according to the Benjamini and Hochberg method to control false-positive results. Unigenes with a corrected q value < 0.05 and |log2 (ratio)| ≥ 1 were defined as DEGs.
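The thresholding step described above can be illustrated with a minimal Python sketch. It is not the DEGseq or DESeq implementation: it only shows how raw p values might be converted to Benjamini and Hochberg q values and then filtered with the cutoffs stated in the text (q < 0.05 and |log2 (ratio)| ≥ 1). The function names and the example unigene records are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q values in the input order."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the adjusted
    # values never increase as the rank decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


def select_degs(records, q_cutoff=0.05, lfc_cutoff=1.0):
    """Keep unigenes with q < q_cutoff and |log2 ratio| >= lfc_cutoff.

    records: list of (name, log2_ratio, p_value) tuples.
    """
    q_values = benjamini_hochberg([p for _, _, p in records])
    return [
        name
        for (name, log2_ratio, _), q in zip(records, q_values)
        if q < q_cutoff and abs(log2_ratio) >= lfc_cutoff
    ]


# Hypothetical records: (unigene id, log2(ML/YL), raw p value).
example_unigenes = [
    ("u1", 2.3, 0.0001),
    ("u2", -1.5, 0.004),
    ("u3", 0.4, 0.001),   # significant, but the fold change is too small
    ("u4", 1.2, 0.2),     # large fold change, but not significant
    ("u5", -3.0, 0.03),
]

print(select_degs(example_unigenes))
```

In this toy input, u3 fails the fold-change cutoff and u4 fails the q-value cutoff, so only unigenes passing both criteria are retained.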

Characterization, assembly, and gene annotation of leaves of T. sinensis

In this study, transcriptome sequencing yielded 64,541 unigenes with an N50 value of 1563 bp and a mean length of 835 bp, which were used for assembly evaluation by comparison with NCBI and sequence homologies. In total, 20,515 (43.06%) of these unigenes were successfully annotated using BLAST searches of the public Nr, PFAM, Swiss-Prot, GO, COG, and KEGG databases. The resulting RNA-Seq data provided a high-quality annotated assembly for T. sinensis generated by comprehensive analysis. Distribution patterns annotated similarly across several databases indicated that YL and ML of T. sinensis undergo multiple unique developmental processes (Fig. 1; Supplemental Table 2). The large number of annotated enzymes suggests the presence of genes associated with different pathways of primary and secondary metabolite biosynthesis across life stages (Zhang et al. 2016; Zhao et al. 2017).
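For readers unfamiliar with the assembly statistics quoted above, the following Python sketch shows how N50 and mean length are conventionally computed from a list of unigene lengths. It assumes the standard definition of N50 (the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases); the function names and the example lengths are hypothetical and are not taken from the T. sinensis assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Accumulate lengths from longest to shortest until half of the
    # total assembled bases is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths (in bp)."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    return sum(lengths) / len(lengths)


# Hypothetical unigene lengths in bp, for illustration only.
example_lengths = [2000, 1500, 1000, 500, 300, 200]

print(n50(example_lengths), round(mean_length(example_lengths), 2))
```

Because N50 is weighted toward long sequences, it is usually larger than the mean length, as is the case for the values reported in this study.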

Differentially expressed unigenes in phenylpropanoid biosynthesis

Our findings demonstrate that the phenylpropanoid and lignin biosynthesis pathways were among the most enriched. Nine differentially expressed unigenes, including 4CL, CoumCoA3H, HCT, CYP98A, CCR, REF1, CAD, and POD, were up-regulated in YL and ML. ML were significantly enriched in phenylpropanoids, consistent with increased content of flavonoid, lignin, cutin, and wax. In plants, control of phenylpropanoid biosynthesis is complex and plays a significant role in pathogen resistance, anthocyanin biogenesis, and pharmacology (Jimene and Riguera 1994). In this transcriptome study, we identified most of the catabolic genes associated with phenylpropanoid synthesis, demonstrating an understanding of the precise pathway in plants (Shi et al. 2013). Genetic, molecular, and biochemical evidence suggests that synthesis and catabolism of phenylpropanoid amino acids are regulated by previously undescribed coordinated mechanisms (Burkhard et al. 2001; Grabherr et al. 2011). Information from the current study will advance understanding of the regulation of phenylpropanoid metabolism in T. sinensis, which will provide valuable information for the future production of high-phenylpropanoid crops with medical applications.

Author contribution statement

Conceived and designed the experiments: JS, CQ, JY, and YJ. Conducted the experiments: JS, CQ, and JY. Performed data analysis: JS, CQ, and WZ. Wrote a draft of the manuscript: WZ and YJ.

Below is the link to the electronic supplementary material.

Supplementary material 1 (XLS 952 kb)
Supplementary material 2 (DOCX 51 kb)
Supplementary material 3 (XLS 3132 kb)
Supplementary material 4 (XLSX 47 kb)
Supplementary material 5 (XLSX 214 kb)
Supplemental Figure 2. Metabolic pathways of phenylpropanoid biosynthesis of assembled unigenes. (TIFF 6595 kb)
Supplemental Figure 3. Metabolic pathways of cutin, suberin, and wax biosynthesis of assembled unigenes. (TIFF 6595 kb)
Supplemental Figure 1. Total RNA quality determined by agarose gel electrophoresis, Nanodrop and RNA integrity number (RIN) value. (A) 2% RNA agarose gel was run with 1x TBE buffer at 120 V for 30 min. (B) RIN values of young leaves (TR183982) and mature leaves (TR183983) were determined by an Agilent 2100 BioAnalyzer. (C) Total RNA quality determined by Nanodrop ND-1000. (TIFF 6595 kb)
Supplementary material 9 (TIFF 20 kb)