Jiang Xu1, Yang Chu1, Baosheng Liao1, Shuiming Xiao1, Qinggang Yin1, Rui Bai1, He Su1,2, Linlin Dong1, Xiwen Li1, Jun Qian1, Jingjing Zhang1, Yujun Zhang1, Xiaoyan Zhang1, Mingli Wu1, Jie Zhang1, Guozheng Li3, Lei Zhang4, Zhenzhan Chang5, Yuebin Zhang6, Zhengwei Jia7, Zhixiang Liu1, Daniel Afreh8, Ruth Nahurira8, Lianjuan Zhang1, Ruiyang Cheng1, Yingjie Zhu1, Guangwei Zhu1, Wei Rao7, Chao Zhou7, Lirui Qiao7, Zhihai Huang2, Yung-Chi Cheng9, Shilin Chen1.
Abstract
Ginseng, which contains ginsenosides as bioactive compounds, has been regarded as an important traditional medicine for several millennia. However, the genetic background of ginseng remains poorly understood, partly because of the plant's large and complex genome. We report the entire genome sequence of Panax ginseng, obtained using next-generation sequencing. The 3.5-Gb nucleotide sequence contains more than 60% repeats and encodes 42 006 predicted genes. Twenty-two transcriptome datasets and mass spectrometry images of ginseng roots were used to precisely quantify the functional genes. Thirty-one genes were identified as involved in the mevalonic acid pathway. Eight of these genes were annotated as 3-hydroxy-3-methylglutaryl-CoA reductases, which displayed diverse structures and expression characteristics. A total of 225 UDP-glycosyltransferases (UGTs) were identified, making the UGTs one of the largest gene families in ginseng. Tandem repeats contributed to the duplication and divergence of UGTs. Molecular modeling of UGTs in the 71st, 74th, and 94th families revealed a regiospecific conserved motif located at the N-terminus. Molecular docking predicted that this motif captures ginsenoside precursors. The ginseng genome represents a valuable resource for understanding and improving the breeding, cultivation, and synthetic biology of this key herb.
Keywords: Panax ginseng; genome; ginsenosides; mass spectrometry imaging
Year: 2017 PMID: 29048480 PMCID: PMC5710592 DOI: 10.1093/gigascience/gix093
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Statistical analysis of the P. ginseng draft genome
| | Size, bp | Number |
|---|---|---|
| Contig | | |
| N90 | 4516 | 150 620 |
| N80 | 8639 | 103 388 |
| N70 | 12 833 | 75 040 |
| N50 | 21 977 | 39 481 |
| Longest | 574 183 | – |
| Total size | 2 999 700 459 | 337 439 |
| Scaffold | | |
| N90 | 24 143 | 33 423 |
| N80 | 45 718 | 23 391 |
| N70 | 65 171 | 17 168 |
| N50 | 108 708 | 9072 |
| Longest | 1 303 414 | – |
| Total size | 3 414 349 854 | 83 074 |
| Gap ratio | 12.15% [a] | – |

[a] Among these gaps, 368 679 are single-N.
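The N50 and N90 rows in the table follow the usual assembly convention: sort the sequences from longest to shortest and report the length at which the running total first reaches the given percentage of the assembly size, together with the number of sequences needed to get there. The sketch below illustrates that convention on toy lengths; the function name `nx_stats` and the toy values are hypothetical and not taken from the paper. The gap-ratio line assumes the ratio is (scaffold total − contig total) / scaffold total, which the source does not state explicitly; under that assumption the table totals give roughly 12.1%, close to the reported 12.15%.

```python
def nx_stats(lengths, x):
    """Return (Nx length, number of sequences needed) for percentage x.

    Hypothetical helper illustrating the standard Nx convention;
    not code from the ginseng assembly pipeline.
    """
    ordered = sorted(lengths, reverse=True)
    threshold = sum(ordered) * x / 100
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (illustrative only, not real ginseng data).
toy = [100, 80, 60, 40, 20]
n50 = nx_stats(toy, 50)  # (length, count) at 50% of the total
n90 = nx_stats(toy, 90)  # (length, count) at 90% of the total

# Totals copied from the table; the formula itself is an assumption.
scaffold_total = 3_414_349_854
contig_total = 2_999_700_459
gap_ratio = (scaffold_total - contig_total) / scaffold_total

print(n50, n90, f"{gap_ratio:.2%}")
```

Read this way, the contig N50 row says that the 39 481 longest contigs, each at least 21 977 bp, together cover half of the contig total.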
Figure 1: P. ginseng genome assembly and functional gene annotations. (a) Phylogenetic tree and divergence data of 14 species, including P. ginseng, based on the proteins of 383 single-copy genes annotated to the genome sequence of each species. (b) Distribution of orthologous gene families in P. ginseng and 4 sequenced species: carrot (Daucus carota), coffee (Coffea canephora), Arabidopsis (Arabidopsis thaliana), and tomato (Solanum lycopersicum).
Figure 2: Ginsenoside distribution in P. ginseng root cross-sections, obtained by mass spectrometric imaging with DESI–MS. (a) Optical image of the main root. (b) Tetramethylsilane (TMS) image spectrum. (c) DESI–MS image of metabolites and ginsenosides: maltose, citbismine C, Rg1/Rf, pseudo-Rc1, Ra1/Ra2, Rd/Re, Rs1/Rs2, and Ra3. Scale bar = 2 mm.
Figure 3: Metabolism and transcriptome analysis of P. ginseng root. (a) HPLC chromatograms of the ginsenosides Rg1, Re, Rf, Rg2, Rb1, Rc, Rb2, and Rd standards. (b) PCA score plots based on the HPLC dataset (red circles indicate periderm, black circles indicate cortex, and green circles indicate stele). (c) PLS-DA score plots based on the HPLC dataset. (d) Cluster tree of the ginseng samples based on the expression pattern of 42 006 genes. The leaves of the tree correspond to the different ginseng tissue samples (Per: periderm; Cor: cortex; Ste: stele). The color bands beneath the tree represent the relative content of the total ginsenosides, Rb1 and Rg1 (red indicates high values).
Figure 4: Gene expression in the MVA pathway for ginsenosides in P. ginseng. (a) Possible biosynthesis pathway for ginsenosides with the designated candidate genes. AACT: acetyl-CoA C-acetyltransferase; HMGS: 3-hydroxy-3-methylglutaryl-CoA synthase; HMGCoA: 3-hydroxy-3-methylglutaryl-CoA; HMGR: 3-hydroxy-3-methylglutaryl-CoA reductase; MVK: mevalonate kinase; MVP: mevalonate phosphate; PMK: phosphomevalonate kinase; MVPP: diphosphomevalonate; MVD: mevalonate diphosphate decarboxylase; IPP: isopentenyl diphosphate; DMAPP: dimethylallyl diphosphate; IDI: isopentenyl-diphosphate delta-isomerase; FPS: farnesyl diphosphate synthase; FPP: farnesyl diphosphate; SS: squalene synthase; SE: squalene epoxidase; β-AS: β-amyrin synthase; DDS: dammarenediol synthase; LAS: lanosterol synthase; CAS: cycloartenol synthase; OAS: oleanolic acid synthase; PPDS: protopanaxadiol synthase; PPTS: protopanaxatriol synthase. (b) Heatmap of the candidate biosynthesis pathway gene expression patterns in 9 organs from P. ginseng.
Figure 5: Sequence analysis and transcript levels of the HMGR gene family. (a) Phylogenetic analysis of PgHMGRs and characterized HMGRs from other plants. (b) Multiple alignments of the amino acid sequences of PgHMGRs with homologous HMGRs from Arabidopsis. The black boxes indicate identical residues, and the gray boxes represent identical residues for at least 2 of the sequences. Functional domains are highlighted in colored boxes (red, membrane domain; green, linker domain; and blue, catalytic domain). The 2 putative HMGR-CoA-binding sites, 2 NADP(H)-binding sites, and ER retention motifs are denoted by square boxes. (c) Tissue-specific PgHMGR expression patterns in 4-year-old roots. The data represent the mean ± SD of the 3 independent samples. (d) Genomic DNA structure of PgHMGRs. The exons are represented by the green-filled square boxes. The lines between the boxes correspond to the introns. The numbers above the exons indicate the length in bp.
Figure 6: Analysis of UGTs from P. ginseng. (a) All the identified UGTs, which were newly classified according to the standardization of the UGT Nomenclature Committee, were assigned to 24 subfamilies. (b) The expression (lower) of UGT gene copies (PG22765) from the same scaffold (upper) in the different tissues of P. ginseng.