Tianyuan Zhang, Chi Song, Li Song, Zhiwei Shang, Sen Yang, Dong Zhang, Wei Sun, Qi Shen, Degang Zhao.
Abstract
Perilla frutescens is used as a traditional food and medicine in East Asia. Its seeds contain high levels of α-linolenic acid (ALA), which is important for health but scarce in the daily diet. Previous RNA-seq studies of perilla seed identified fatty acid (FA) and triacylglycerol (TAG) synthesis genes, but the mechanism of ALA biosynthesis and its regulation remains to be explored further. We therefore conducted Illumina RNA sequencing at seven developmental stages of perilla seeds. Sequencing generated a total of 127 million clean reads, containing 15.88 Gb of valid data. De novo assembly of the sequence reads yielded 64,156 unigenes with an average length of 777 bp. A total of 39,760 unigenes were annotated, and 11,693 unigenes were found to be differentially expressed across all samples. According to Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis, 486 unigenes were annotated in the "lipid metabolism" pathway. Of these, 150 unigenes were found to be involved in FA biosynthesis and TAG assembly in perilla seeds. A coexpression analysis showed that 104 genes were highly coexpressed (r > 0.95). The coexpression network could be divided into two main subnetworks, showing overexpression in the middle or earlier phases and in the late phases, respectively. To identify putative regulatory genes, a transcription factor (TF) analysis was performed. This identified 45 gene families, mainly the AP2-EREBP, bHLH, MYB, and NAC families. Coexpression analysis of the TFs with the highly expressed FAD2 and FAD3 genes found 162 TFs significantly associated with the two FAD genes (r > 0.95). These TFs were predicted to be key regulatory factors in ALA biosynthesis in perilla seed. qRT-PCR analysis also verified the correlation in expression patterns between the two FAD genes and some of the candidate TFs.
Although some TFs have been reported to be involved in seed development, more direct evidence is still needed to verify their function. Nevertheless, these findings provide clues to the possible molecular mechanisms of ALA biosynthesis and its regulation in perilla seed.
Keywords: Perilla frutescens; RNA sequencing (RNA-seq); herbgenomics; triacylglycerol (TAG) biosynthesis; α-linolenic acid (ALA)
Year: 2017 PMID: 29144390 PMCID: PMC5713401 DOI: 10.3390/ijms18112433
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Summary of perilla seed transcriptome data sequenced by the Illumina platform.
| Sample ID | Total Reads | Total Bases | GC Content | Q20 | Q30 |
|---|---|---|---|---|---|
| 2DAF | 18,094,914 | 2,261,864,250 | 46.64% | 94.64% | 89.76% |
| 6DAF | 17,885,482 | 2,235,685,250 | 47.88% | 94.36% | 89.70% |
| 10DAF | 17,075,076 | 2,134,384,500 | 49.31% | 94.38% | 89.77% |
| 14DAF | 16,747,764 | 2,093,470,500 | 49.61% | 94.25% | 89.58% |
| 18DAF | 14,929,122 | 1,866,140,250 | 51.42% | 93.70% | 88.05% |
| 22DAF | 20,751,538 | 2,593,942,250 | 52.52% | 93.74% | 88.71% |
| 26DAF | 21,517,792 | 2,689,724,000 | 47.89% | 95.39% | 91.21% |
| Total | 127,001,688 | 15,875,211,000 | | | |
DAF: days after flowering.
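The totals row and the 15.88 Gb figure in the abstract can be re-derived from the per-sample rows. The following is a minimal sketch that copies the read and base counts from the table above; the variable names are illustrative only. It also computes bases per read for each sample, a quantity the source does not state explicitly.

```python
# Per-sample (total reads, total bases), copied from the table above.
samples = {
    "2DAF": (18_094_914, 2_261_864_250),
    "6DAF": (17_885_482, 2_235_685_250),
    "10DAF": (17_075_076, 2_134_384_500),
    "14DAF": (16_747_764, 2_093_470_500),
    "18DAF": (14_929_122, 1_866_140_250),
    "22DAF": (20_751_538, 2_593_942_250),
    "26DAF": (21_517_792, 2_689_724_000),
}

# Column sums, to compare against the "Total" row.
total_reads = sum(reads for reads, _ in samples.values())
total_bases = sum(bases for _, bases in samples.values())

# Bases per read for each sample (derived here, not reported in the source).
bases_per_read = {name: bases / reads for name, (reads, bases) in samples.items()}

print(f"total reads: {total_reads:,}")
print(f"total bases: {total_bases:,} ({total_bases / 1e9:.2f} Gb)")
for name, length in bases_per_read.items():
    print(f"{name}: {length:g} bases per read")
```

Every sample works out to the same number of bases per read, which is consistent with a fixed read length.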
Statistics of de novo assembly of sequence reads.
| Item | Total Number | N50 (bp) | Median Length (bp) | Average Length (bp) | Total Length (bp) |
|---|---|---|---|---|---|
| Transcripts | 104,638 | 1608 | 600 | 968 | 101,378,085 |
| Unigenes | 64,156 | 1417 | 402 | 777 | 49,883,108 |
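For readers unfamiliar with the N50 column: N50 is the sequence length at which sequences of that length or longer contain at least half of the total assembled bases. The sketch below shows one common way to compute it. The function name `n50` and the toy lengths are hypothetical, and the assembler used in the study may handle ties or rounding differently.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    cumulative = 0
    # Walk from the longest sequence down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        cumulative += length
        if cumulative * 2 >= total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the median or the mean does, it is expected to exceed both, as it does for the transcripts and unigenes in the table.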
Figure 1. Overview of the de novo assembly of transcriptome sequencing in Perilla frutescens and annotation based on a non-redundant (NR) protein database. Length (A) and GC distribution (B) of transcripts; length (C) and GC distribution (D) of unigenes are shown; (E) E-value distribution of BLAST hits for the assembled unigenes; (F) Similarity score distribution of the top BLAST hits for the assembled unigenes; (G) Species distribution of the top BLAST hits for the assembled unigenes.
Statistics of annotations of assembled unigenes.
| Database | Count | Percentage c |
|---|---|---|
| NR a | 32,132 | 50.08% |
| KEGG classified unigenes | 10,904 | 17.00% |
| COG classified unigenes | 8654 | 14.47% |
| GO classified unigenes | 22,263 | 34.70% |
| Blast_hit b | 31,287 | 48.77% |
| Pfam classified unigenes | 19,340 | 30.15% |
| eggNOG classified unigenes | 9425 | 14.69% |
| TMHMM classified unigenes | 6719 | 10.47% |
| SignalP classified unigenes | 2354 | 3.67% |
| All annotated unigenes | 39,760 | 61.97% |
| All | 64,156 | 100.00% |
a NCBI non-redundant database; b Swiss-Prot and TrEMBL databases; c Percentage of all assembled unigenes.
Figure 2. Perilla sequences associated with the fatty acid biosynthetic pathway. Each row represents a gene, and each column represents a specimen (stage). The depth of red and blue in the rectangles indicates higher and lower Z-score RNA expression levels, respectively. Identified enzymes include: PDHC: pyruvate dehydrogenase complex; ACCase: acetyl-CoA carboxylase; MAT: malonyl-CoA ACP transacylase; ACP: acyl carrier protein; KAS I, II, III: ketoacyl-ACP synthase I, II, III; KAR: ketoacyl-ACP reductase; HAD: hydroxyacyl-ACP dehydrase; EAR: enoyl-ACP reductase; SAD: stearoyl-ACP desaturase; FAD: fatty acid desaturase; FATA/B: fatty acyl-ACP thioesterase A/B; PCH: palmitoyl-CoA hydrolase; LACS: long-chain acyl-CoA synthetase; FAD2: oleate desaturase (endoplasmic reticulum); FAD3: linoleate desaturase; GK: glycerol kinase; GPDH: glycerol-3-phosphate dehydrogenase; GPAT: glycerol-3-phosphate acyltransferase; LPAT: 1-acylglycerol-3-phosphate acyltransferase; ATS1: glycerol-3-phosphate O-acyltransferase; PP: phosphatidate phosphatase LPIN; DGAT: acyl-CoA:diacylglycerol acyltransferase; PDAT: phospholipid:diacylglycerol acyltransferase; LPCAT: 1-acylglycerol-3-phosphocholine acyltransferase; CPT: diacylglycerol cholinephosphotransferase; PDCT: phosphatidylcholine:diacylglycerol cholinephosphotransferase.
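The heat map colors are based on per-gene Z-scores, meaning each gene's expression values are centered and scaled across the stages so that genes with very different absolute expression can share one color scale. The sketch below illustrates the idea. The function name and the expression values are hypothetical, and the use of the population standard deviation is an assumption, since the source does not say which variant was used.

```python
from statistics import mean, pstdev


def row_zscores(values):
    """Scale one gene's expression values to Z-scores across stages."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A gene with constant expression has no variation to scale.
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


stages = ["2DAF", "6DAF", "10DAF", "14DAF", "18DAF", "22DAF", "26DAF"]
# Hypothetical expression values for a single gene, not data from the study.
expression = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

z = row_zscores(expression)
for stage, score in zip(stages, z):
    print(f"{stage}: {score:+.2f}")
```

Positive scores would map to the red end of the scale (expression above the gene's own mean) and negative scores to the blue end.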
Differentially Expressed Genes (DEGs) between two different developmental stages.
| | 2DAF | 6DAF | 10DAF | 14DAF | 18DAF | 22DAF | 26DAF |
|---|---|---|---|---|---|---|---|
| 2DAF | 0 | | | | | | |
| 6DAF | 137↑ 263↓ | 0 | | | | | |
| 10DAF | 876↑ 1316↓ | 294↑ 558↓ | 0 | | | | |
| 14DAF | 1755↑ 1399↓ | 1352↑ 1227↓ | 353↑ 368↓ | 0 | | | |
| 18DAF | 3022↑ 1813↓ | 2370↑ 1583↓ | 2001↑ 955↓ | 257↑ 85↓ | 0 | | |
| 22DAF | 3666↑ 2031↓ | 3188↑ 1816↓ | 2941↑ 1363↓ | 1350↑ 674↓ | 47↑ 83↓ | 0 | |
| 26DAF | 4751↑ 2221↓ | 4730↑ 2119↓ | 4445↑ 1692↓ | 2615↑ 1268↓ | 1253↑ 949↓ | 360↑ 122↓ | 0 |
DAF: days after flowering.
Figure 3. (A) A lipid metabolism-enriched module presented in the degree-sorted circle layout of Cytoscape v3.4.10, with the sizes and colors of nodes reflecting the level of connectivity within the network. The bigger the node, the more connections it has. For clarity, edges with correlation values smaller than 0.95 were removed; (B) Heat maps of the coexpressed lipid metabolism genes. The genes in the left heat map correspond to subnetwork I, and the genes in the right heat map correspond to subnetwork II. Each row represents a gene, and each column represents a specimen (stage). The depth of red and blue in the rectangles indicates higher and lower Z-score RNA expression levels, respectively.
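The network construction described in this caption, correlating expression profiles, dropping edges below 0.95, and sizing nodes by their number of connections, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' pipeline: the source reports only "r", so the Pearson coefficient is assumed, and the gene names and profiles are hypothetical toy data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold=0.95):
    """Keep gene pairs whose correlation is at least the threshold."""
    edges = []
    for g1, g2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[g1], profiles[g2])
        if r >= threshold:  # edges with smaller r are removed
            edges.append((g1, g2, r))
    return edges


def node_degrees(profiles, edges):
    """Count connections per gene; this is what node size would reflect."""
    degree = {gene: 0 for gene in profiles}
    for g1, g2, _ in edges:
        degree[g1] += 1
        degree[g2] += 1
    return degree


# Hypothetical expression profiles over seven stages, not data from the study.
profiles = {
    "gene_a": [1, 2, 3, 4, 5, 6, 7],
    "gene_b": [3, 5, 7, 9, 11, 13, 15],  # rises in step with gene_a
    "gene_c": [7, 6, 5, 4, 3, 2, 1],     # falls as gene_a rises
    "gene_d": [1, 3, 2, 5, 4, 7, 6],     # loosely follows gene_a
}

edges = coexpression_edges(profiles)
degrees = node_degrees(profiles, edges)
print(edges)
print(degrees)
```

Only positively correlated pairs survive this filter. A network that also draws negative correlations would need to threshold on the absolute value of r instead.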
Figure 4. Transcription factor analysis. (A) Distribution of transcription factor (TF) families; (B) Coexpression network of transcription factors and fatty acid desaturase (FAD). The TF module is presented using Cytoscape v3.4.10. Rectangles indicate the TFs directly related to FAD in the network. Solid lines represent positive correlations, and dotted lines represent negative correlations.