Yonghong Zhang, Lanlan Zheng, Yan Zheng, Chao Zhou, Ping Huang, Xiao Xiao, Yongheng Zhao, Xincai Hao, Zhubing Hu, Qinhua Chen, Hongliang Li, Xuanbin Wang, Kenji Fukushima, Guodong Wang, Chen Li.
Abstract
Polygonum cuspidatum (Japanese knotweed, also known as Huzhang in Chinese), a plant that produces bioactive components such as stilbenes and quinones, has long been recognized as important in traditional Chinese herbal medicine. To better understand the biological features of this plant and to gain genetic insight into the biosynthesis of its natural products, we assembled a draft genome of P. cuspidatum using Illumina sequencing technology. The draft genome is ca. 2.56 Gb long, with 71.54% of the genome annotated as transposable elements. Integrated gene prediction suggested that the P. cuspidatum genome encodes 55,075 functional genes, including 6,776 gene families that are conserved in the five eudicot species examined and 2,386 that are unique to P. cuspidatum. Among the functional genes identified, 4,753 are predicted to encode transcription factors. We traced the gene duplication history of P. cuspidatum and determined that it has undergone two whole-genome duplication events about 65 and 6.6 million years ago. Roots are considered the primary medicinal tissue, and transcriptome analysis identified 2,173 genes that were expressed at higher levels in roots compared to aboveground tissues. Detailed phylogenetic analysis demonstrated expansion of the gene family encoding stilbene synthase and chalcone synthase enzymes in the phenylpropanoid metabolic pathway, which is associated with the biosynthesis of resveratrol, a pharmacologically important stilbene. Analysis of the draft genome identified 7 abscisic acid and water deficit stress-induced protein-coding genes and 14 cysteine-rich transmembrane module genes predicted to be involved in stress responses. The draft de novo genome assembly produced in this study represents a valuable resource for the molecular characterization of medicinal compounds in P. cuspidatum, the improvement of this important medicinal plant, and the exploration of its abiotic stress resistance.
Keywords: Polygonum cuspidatum; genome assembly; medicinal plant; resveratrol biosynthesis; stress tolerance; whole-genome duplication
Year: 2019 PMID: 31681373 PMCID: PMC6813658 DOI: 10.3389/fpls.2019.01274
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. K-mer distribution from the genome survey. The x-axis shows the depth of each k-mer, and the y-axis shows the frequency of k-mers at that depth. All k-mers in this analysis were 21 bp long.
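A k-mer depth distribution of this kind can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `kmer_depth_histogram` is hypothetical, and unlike production k-mer counters it does not merge a k-mer with its reverse complement. Only the default of k = 21 comes from the caption.

```python
from collections import Counter


def kmer_depth_histogram(reads, k=21):
    """Return {depth: number of distinct k-mers seen exactly `depth` times}.

    Simplification (assumption): k-mers are counted on the given strand only,
    and any k-mer containing an ambiguous base (N) is skipped.
    """
    kmer_counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:
                kmer_counts[kmer] += 1
    # Invert the table: how many distinct k-mers occur at each depth.
    return Counter(kmer_counts.values())


if __name__ == "__main__":
    # Toy reads with a small k so the result is easy to check by hand.
    toy_reads = ["ACGTAC", "ACGTAC"]
    print(dict(kmer_depth_histogram(toy_reads, k=3)))
```

Plotting depth on the x-axis against the number of k-mers on the y-axis gives a curve of the kind described in the caption.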
Summary of the genome assembly and annotation in P. cuspidatum.
| Metric | Value |
|---|---|
| Total assembly size (bp) | 2,565,149,001 |
| Total gap length (bp) | 57,172,525 |
| Total number of scaffolds | 948,118 |
| Scaffold N50 (bp) | 3,215 |
| Max scaffold length (bp) | 649,261 |
| Total number of contigs | 1,078,298 |
| Contig N50 (bp) | 2,769 |
| Max contig length (bp) | 628,071 |
| GC content (%) | 37.46 |
| TE content (%) | 71.54 |
| Total number of genes | 55,075 |
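The N50 values in the table follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total length. The sketch below shows that calculation and also derives the gap fraction from the two totals reported in the table. The function name `n50` and the toy lengths are illustrative assumptions, not the software or data used in the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sequences are sorted from longest to shortest and accumulated until the
    running sum reaches at least half of the total length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Totals taken from the table above.
TOTAL_ASSEMBLY_BP = 2_565_149_001
TOTAL_GAP_BP = 57_172_525
gap_fraction = TOTAL_GAP_BP / TOTAL_ASSEMBLY_BP

if __name__ == "__main__":
    # Toy scaffold lengths (hypothetical), not data from the assembly.
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
    print(f"Gap fraction: {gap_fraction:.2%}")
```

With the reported totals, gaps make up roughly 2.2% of the assembly.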
Figure 2. Gene annotation in P. cuspidatum. (A) Sequence alignment of all genes against the nr database. The proportions of genes with the closest homologs in different species are shown. (B) GO annotation of genes in P. cuspidatum. The GO terms were categorized into three groups: cellular component, molecular function, and biological process.
Figure 3. Classification of gene families. (A) Venn diagram showing the number of gene families in five plant species. Each number represents the number of gene families in each species or those shared by different species. The analysis was carried out using OrthoMCL software. (B) Summary of the number of genes in different groups. The genes were parsed from OrthoMCL clustering analysis.
Figure 4. Dynamic evolution of gene families. (A) Gene family expansion and contraction in five plant species. The gene families that have undergone expansion and contraction are shown in green and red, respectively. The numbers separated by slashes (expansions/contractions) indicate the number of gene families. The scale bar indicates the divergence time [million years ago (MYA)]. (B) Ks distribution of paralogous gene pairs in P. cuspidatum and F. tataricum. In this analysis, two peaks were identified, which are thought to represent two WGD events. The x-axis shows Ks values, and the y-axis shows the density of distribution.
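Ks peaks are commonly converted into approximate ages with the relation T = Ks / (2r), where r is the synonymous substitution rate per site per year. The sketch below illustrates that standard conversion only. The peak position and rate in the example are arbitrary placeholders, not values from this study, which does not state them in the text shown here.

```python
def ks_to_mya(ks_peak, subs_per_site_per_year):
    """Convert a Ks peak to an age in million years ago using T = Ks / (2r).

    The factor of 2 reflects that both duplicated copies accumulate
    substitutions independently after the duplication event.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks_peak / (2 * subs_per_site_per_year)
    return years / 1e6


if __name__ == "__main__":
    # Placeholder inputs (assumptions chosen only for easy arithmetic).
    print(ks_to_mya(ks_peak=0.5, subs_per_site_per_year=5e-9))
```

Because the result scales inversely with r, the choice of substitution rate matters as much as the location of the Ks peak.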
Figure 5. Identification of differentially expressed genes in roots versus aboveground tissues. (A) Venn diagram of upregulated genes in roots and aboveground tissues. The number in the overlapping region represents the number of genes that were expressed (FPKM > 0.1) in at least one sample but did not appear to be differentially expressed. The other numbers represent the number of genes that were upregulated in each sample. (B) GO enrichment analysis of genes that were upregulated in root samples. The most highly enriched GO terms in different biological process categories are shown.
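The expression filter in panel (A) relies on FPKM (fragments per kilobase of transcript per million mapped fragments). The sketch below shows the standard FPKM formula and a filter for genes with FPKM > 0.1 in at least one sample. The function names, gene identifiers, and numbers are hypothetical, and the code does not reproduce the differential expression analysis itself.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """FPKM = fragments * 1e9 / (gene length in bp * total mapped fragments)."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def expressed_genes(fpkm_by_gene, threshold=0.1):
    """Return the genes whose FPKM exceeds `threshold` in at least one sample."""
    return {
        gene
        for gene, values in fpkm_by_gene.items()
        if any(value > threshold for value in values)
    }


if __name__ == "__main__":
    # Hypothetical gene: 100 fragments, 2,000 bp long, 10 million mapped fragments.
    print(fpkm(100, 2000, 10_000_000))

    # Hypothetical FPKM values per gene, ordered as (root, aboveground).
    table = {"geneA": (0.05, 0.30), "geneB": (0.02, 0.08), "geneC": (12.0, 3.5)}
    print(sorted(expressed_genes(table)))
```

The 0.1 threshold is the only value taken from the caption. Everything else is illustrative.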
Figure 6. Phylogenetic and expression analysis of gene families involved in the resveratrol biosynthetic pathway in five species. (A) Proposed pathway for resveratrol biosynthesis derived from the phenylpropanoid pathway in P. cuspidatum. PAL, phenylalanine ammonia lyase; C4H, cinnamic acid 4-hydroxylase; 4CL, 4-coumarate CoA ligase; STS, stilbene synthase; CHS, chalcone synthase. The colors represent the expression level of each gene, as ln(FPKM), in aboveground and root tissues. (B) Phylogenetic tree of the STS/CHS gene family. Genes in P. cuspidatum are indicated by red dots. The tree was constructed using the neighbor-joining method with 500 bootstrap replicates. (C) Expression analysis of the representative resveratrol biosynthetic genes, as determined by qRT-PCR. (D) Resveratrol levels in roots and aboveground tissues. Each bar represents the mean value (n = 3). Error bars represent the SD.