Literature DB >> 34954899

MPOD: Applications of integrated multi-omics database for medicinal plants.

Simei He1,2, Ling Yang1,3, Shuang Ye1,2, Yuan Lin1,2, Xiaobo Li1,2, Yina Wang1,2, Geng Chen1,2, Guanze Liu1,2, Ming Zhao1,2, Xiu Zhao4, Kunhua Wei5, Guanghui Zhang1,2, Jianhua Miao5, Yang Dong1,5,6, Shengchao Yang1,2.   

Abstract

Entities:  

Keywords:  biosynthesis; genome; medicinal plants; secondary metabolite; transcriptome

Mesh:

Year:  2022        PMID: 34954899      PMCID: PMC9055808          DOI: 10.1111/pbi.13769

Source DB:  PubMed          Journal:  Plant Biotechnol J        ISSN: 1467-7644            Impact factor:   13.263


× No keyword cloud information.
Plant natural products (PNPs) have been an important source in human nutrition, industrial raw materials, medicinal ingredients and half of anticancer drugs are derived from PNPs such as paclitaxel, vinblastine, and ginsenoside (Caputi et al., ; Luo et al., 2019; Yang et al., 2020). Biosynthesis is one of the key ways to produce PNPs, and the increasing development of medicinal Phyto‐omics data helps to decode the PNPs biosynthetic pathway (Liu et al., 2017). Genetic resources also provide the basis for medicinal plants (MPs) molecular breeding. To integrate the genome and transcriptome data of MPs, we completed the first omics database for herbal medicine (HMOD) in December 2017 (Wang et al., 2018). The less genomic data and the simple metabolites information from the website, as the data increases, makes it necessary to comprehensively optimize and upgrade the database from the data, interface, tool, and management. Thus, we constructed an integrated multi‐omics database for MPs (MPOD; http://medicinalplants.ynau.edu.cn/). MPOD collects genomes and transcriptomes of MPs published since January 2018. In addition, we sequenced six genomes, 28 transcriptomes, and five metabolomes in this study. All genomic and transcriptomic sequences in the MPOD are available for query of orthologous gene candidates, and homology comparison between gene families from different species by blast. More importantly, correlation analyses between metabolite distribution and gene expression including metabolite content in different tissues, Pearson correlation analyses of genes involved metabolic pathways and expression profile were performed. Compared with HMOD, MPOD details metabolic pathways of flavonoids, alkaloids and terpenoids, respectively. To facilitate synthetic biology, ‘the biosynthetic tools’ module is added in MPOD with some popular bioinformatics tools including SynVisio, heatmap, and enrichment. The framework of MPOD is constructed using MySQL, ThinkPHP, and FastAdmin, with four main modules, including genomics, transcriptomics, pathways, and biosynthetic tools (Figure 1a, b). In brief, the genomics module consists of genomes, genome size, re‐sequencing, and gene (Figure 1c). This module contains 154 published genomes and 6 unpublished genome‐assemblies (Synsepalum dulcificum, Antirrhinum majus, Platycodon grandiflorus, Codonopsis pilosula, Panax vietnamensis, Gynostemma pentaphyllum) from this project. The web interface of species constitutes species introduction, sequencing data, assembly results, the data source links, and reference. For the published genomic data, the GCA data uploaded on NCBI has been linked to MPOD, and for unpublished data, FASTA formatted files for assembly, CDS, and protein sequences can be downloaded from this database. Genome size provides 50 plant genome size results, predicted by flow cytometry. Re‐sequencing contains single nucleotide polymorphism (SNP) information of Erigeron breviscapus, P. notoginseng (He et al., 2021) from our team, and published re‐sequencing data for 19 other plants. Gene section provides gene assembly, annotation, and expression profiles from E. breviscapus and Acanthopanax senticosus.
Figure 1

Schematic of the Database for Medicinal Plants. (a) The flow diagram showing design and construction of MPOD. (b) The home page of MPOD. (c) The ‘genomics’ module providing summary of genomes, genome size, and re‐sequencing. (d) The ‘transcriptomics’ module showing sequencing, assembly result and expression profiles. (e) The ‘pathways’ module. (f) The ‘biosynthetic tools module’ providing detailed information of catalytic components, chassis cells, and regulatory elements. (g) A case study for the application of MPOD.

Schematic of the Database for Medicinal Plants. (a) The flow diagram showing design and construction of MPOD. (b) The home page of MPOD. (c) The ‘genomics’ module providing summary of genomes, genome size, and re‐sequencing. (d) The ‘transcriptomics’ module showing sequencing, assembly result and expression profiles. (e) The ‘pathways’ module. (f) The ‘biosynthetic tools module’ providing detailed information of catalytic components, chassis cells, and regulatory elements. (g) A case study for the application of MPOD. The transcriptomics module contains transcriptomes, expression, and Pearson. The transcriptomes collect 200 published and 28 de novo sequenced data in this project (Figure 1d). It consists of species introduction, sample information, sequencing data, assembly results, annotation methods, the data source links, and reference. The transcriptome data is uploaded and linked like genomes. More importantly, for 28 unpublished transcriptomes, we provide gene expression profiles from different experimental conditions or tissues in a heatmap for easy visualization. We also perform Pearson correlation analyses of genes involved in metabolic pathways using some of our transcript expression data. The pathways module collects 85 typical compounds whose biosynthetic pathway has been deciphered, including 28 flavonoids, 28 terpenoids, 20 alkaloids, and 9 other compounds. This module lists the compound name, molecular formula, molecular weight, function, basic organisms, precursor, host, synthesis type, downstream gene, pathway, and reference (Figure 1e). Furthermore, this module also collects 7 important compounds, but their biosynthetic pathways are not completely deciphered. Similarly, it includes type of compounds, distribution, proposed pathway, and provides the sequences and expression profiles of candidate genes potentially involved in biosynthesis. It also provides five metabolomes showing that metabolite content from different tissues using heatmap. The biosynthetic tools module lists chassis cells, catalytic components, and regulatory elements (Figure 1f). Chassis cells present 46 strains of Escherichia coli and Saccharomyces cerevisiae commonly used in biosynthesis, and Nicotiana benthamiana and Solanum lycopersicum as a heterologous expression platform for reconstituting PNPs pathways. In the section of catalytic components, 629 enzymes from 8 major gene families that play key roles in the biosynthesis of natural products were summarized, including 21 acyltransferase (ACT), 7 C‐glycosyltransferase (CGT), 159 cytochrome P450 (CYP), 75 O‐methyltransferase (OMT), 163 oxidosqualene cyclase (OSC), 25 squalene epoxidase (SE), 65 terpene synthases (TPS), and 114 UDP‐glycosyltransferases (UGT). The accession number, gene length, sequence, reaction equation, and references are listed. The regulatory elements section presents 196 microbial promoter and terminator sequences commonly used in biosynthesis. In addition to the main modules, MPOD provides some popular bioinformatics tools including ‘BLAST’, ‘Search’, ‘Heatmap’, and ‘JBrowse’ (Dong et al., 2020). All available MPOD genomes and gene models are incorporated into JBrowse. ‘SynVisio’ shows gene synteny relationships of chromosome‐level reference genomes. ‘Co‐expression analysis’ creates networks comprising sets of genes whose expressions are highly correlated. A typical case of a user using our web is shown in Figure 1g. Gypenoside A is the main active component of G. pentaphyllum, and its content is the highest in leaves from metabolome. The biosynthesis of gypenoside A begins with 2,3‐oxidosqualene, but the key downstream enzymes OSC, CYP, and UGT have not been identified. A total of 235 CYPs from G. pentaphyllum (GpCYPs) were found by Blast. The phylogenetic tree was constructed based on the deduced amino acid sequences for the GpCYPs and other plant CYPs, and were distributed in eight subfamilies, namely 144 CYP71, 34 CYP85, 28 CYP72, 20 CYP86, and 4 CYP74. We also explored the expressions of GpCYPs from different tissues and presented as a heatmap. Furthermore, we performed Pearson correlation analyses of our transcript expression data among GpOSCs, GpCYPs, and GpUGTs using GpOSCs as the query gene (Figure 1g). These results facilitate the discovery of unknown genes involved in gypenoside A biosynthesis. In summary, from genes to metabolite levels, MPOD integrates the genomics, transcriptomics, and metabolomics data of MPs published in almost recent years and sequenced in this study. These datasets provide a rich genetic resource for mining functional genes, screening molecular markers, and developing biological elements. Further combination of pathways and catalytic components greatly facilitate to decode the biosynthetic pathways of medicinal ingredients. MPOD will be continuously updated as multi‐omics data increases and new bioinformatics tools emerge, so that MPOD provides long‐term support to the research of MPs molecular‐assisted breeding and synthetic biology.

Conflict of interest

The authors declare no conflict of interest.

Author contributions

S. Y. and Y. D. conceived the study. S. H., L. Y., S. Y., Y. L., X. L., Y. W., G. C., G. L., M. Z., X. Z., K. W., and G. Z. collected and processed data. S. Y., Y. D., J. M., S. H., and L. Y. designed the experiments and wrote the manuscript. All the authors approved the manuscript.
  8 in total

Review 1.  Research progress on the source, production, and anti-cancer mechanisms of paclitaxel.

Authors:  Yan-Hua Yang; Jia-Wang Mao; Xiao-Li Tan
Journal:  Chin J Nat Med       Date:  2020-12

2.  HMOD: An Omics Database for Herbal Medicine Plants.

Authors:  Xiao Wang; Jiajin Zhang; Simei He; Yuanni Gao; Xiaoqin Ma; Yun Gao; Guanghui Zhang; Ling Kui; Wen Wang; Ying Wang; Shengchao Yang; Yang Dong
Journal:  Mol Plant       Date:  2018-03-07       Impact factor: 13.164

3.  VitisGDB: The Multifunctional Database for Grapevine Breeding and Genetics.

Authors:  Xiao Dong; Wei Chen; Zhenchang Liang; Xuzhen Li; Peter Nick; Shanshan Chen; Yang Dong; Shaohua Li; Jun Sheng
Journal:  Mol Plant       Date:  2020-05-15       Impact factor: 13.164

Review 4.  Naturally occurring anti-cancer compounds: shining from Chinese herbal medicine.

Authors:  Hua Luo; Chi Teng Vong; Hanbin Chen; Yan Gao; Peng Lyu; Ling Qiu; Mingming Zhao; Qiao Liu; Zehua Cheng; Jian Zou; Peifen Yao; Caifang Gao; Jinchao Wei; Carolina Oi Lam Ung; Shengpeng Wang; Zhangfeng Zhong; Yitao Wang
Journal:  Chin Med       Date:  2019-11-06       Impact factor: 5.455

5.  Missing enzymes in the biosynthesis of the anticancer drug vinblastine in Madagascar periwinkle.

Authors:  Lorenzo Caputi; Jakob Franke; Scott C Farrow; Khoa Chung; Richard M E Payne; Trinh-Don Nguyen; Thu-Thuy T Dang; Inês Soares Teto Carqueijeiro; Konstantinos Koudounas; Thomas Dugé de Bernonville; Belinda Ameyaw; D Marc Jones; Ivo Jose Curcino Vieira; Vincent Courdavault; Sarah E O'Connor
Journal:  Science       Date:  2018-05-03       Impact factor: 63.714

Review 6.  Engineering microbial cell factories for the production of plant natural products: from design principles to industrial-scale production.

Authors:  Xiaonan Liu; Wentao Ding; Huifeng Jiang
Journal:  Microb Cell Fact       Date:  2017-07-19       Impact factor: 5.328

7.  High quality genome of Erigeron breviscapus provides a reference for herbal plants in Asteraceae.

Authors:  Simei He; Xiao Dong; Guanghui Zhang; Wei Fan; Shengchang Duan; Hong Shi; Dawei Li; Rui Li; Geng Chen; Guangqiang Long; Yan Zhao; Mo Chen; Mi Yan; Jianli Yang; Yingchun Lu; Yanli Zhou; Wei Chen; Yang Dong; Shengchao Yang
Journal:  Mol Ecol Resour       Date:  2020-10-22       Impact factor: 7.090

8.  MPOD: Applications of integrated multi-omics database for medicinal plants.

Authors:  Simei He; Ling Yang; Shuang Ye; Yuan Lin; Xiaobo Li; Yina Wang; Geng Chen; Guanze Liu; Ming Zhao; Xiu Zhao; Kunhua Wei; Guanghui Zhang; Jianhua Miao; Yang Dong; Shengchao Yang
Journal:  Plant Biotechnol J       Date:  2022-03-10       Impact factor: 13.263

  8 in total
  3 in total

1.  Integrated Metabolomic and Transcriptomic Analysis and Identification of Dammarenediol-II Synthase Involved in Saponin Biosynthesis in Gynostemma longipes.

Authors:  Shuang Ye; Lei Feng; Shiyu Zhang; Yingchun Lu; Guisheng Xiang; Bo Nian; Qian Wang; Shuangyan Zhang; Wanling Song; Ling Yang; Xiangyu Liu; Baowen Feng; Guanghui Zhang; Bing Hao; Shengchao Yang
Journal:  Front Plant Sci       Date:  2022-03-25       Impact factor: 5.753

2.  Abortion occurs during double fertilization and ovule development in Paeonia ludlowii.

Authors:  Tingqiao Chen; Mengyu Xie; Yumeng Jiang; Tao Yuan
Journal:  J Plant Res       Date:  2022-01-20       Impact factor: 2.629

3.  MPOD: Applications of integrated multi-omics database for medicinal plants.

Authors:  Simei He; Ling Yang; Shuang Ye; Yuan Lin; Xiaobo Li; Yina Wang; Geng Chen; Guanze Liu; Ming Zhao; Xiu Zhao; Kunhua Wei; Guanghui Zhang; Jianhua Miao; Yang Dong; Shengchao Yang
Journal:  Plant Biotechnol J       Date:  2022-03-10       Impact factor: 13.263

  3 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.