Minghui Kang1, Haolin Wu1, Qiao Yang1, Li Huang1, Quanjun Hu1, Tao Ma1, Zaiyun Li2,1, Jianquan Liu1,3.
Abstract
Isatis indigotica (2n = 14) is an important medicinal plant in China. Its dried leaves and roots (called Isatidis Folium and Isatidis Radix, respectively) are broadly used in traditional Chinese medicine for curing diseases caused by bacteria and viruses such as influenza and viral pneumonia. Various classes of compounds isolated from this species have been identified as effective ingredients. Previous studies based on transcriptomes revealed only a few candidate genes for the biosynthesis of these active compounds in this medicinal plant. Here, we report a high-quality chromosome-scale genome assembly of I. indigotica with a total size of 293.88 Mb and scaffold N50 = 36.16 Mb using single-molecule real-time long reads and high-throughput chromosome conformation capture techniques. We annotated 30,323 high-confidence protein-coding genes. Based on homolog searching and functional annotations, we identified many candidate genes involved in the biosynthesis of main active components such as indoles, terpenoids, and phenylpropanoids. In addition, we found that some key enzyme-coding gene families related to the biosynthesis of these components were expanded due to tandem duplications, which likely drove the production of these major active compounds and explained why I. indigotica has excellent antibacterial and antiviral activities. Our results highlighted the importance of genome sequencing in identifying candidate genes for metabolite synthesis in medicinal plants.
Keywords: Comparative genomics; Medical genomics
Year: 2020; PMID: 32025321; PMCID: PMC6994597; DOI: 10.1038/s41438-020-0240-5
Source DB: PubMed; Journal: Hortic Res; ISSN: 2052-7276; Impact factor: 6.793
Statistics for the final genome assembly of I. indigotica
| Sequencing platform | PacBio Sequel |
| Assembly size (bp) | 293,875,465 |
| GC % | 38.18 |
| Number of scaffolds | 810 |
| Scaffold N50 size (bp) | 36,165,591 |
| Scaffold N90 size (bp) | 87,397 |
| Number of contigs | 1,199 |
| Contig N50 size (bp) | 1,176,212 |
| Contig N90 size (bp) | 75,736 |
| Gap % | 0.01 |
| Longest sequence length (bp) | 38,253,781 |
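The N50 and N90 values in this table are length-weighted summary statistics: the Nx value is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the assembly. A minimal Python sketch of that calculation follows. The function name `nx` and the toy lengths are illustrative assumptions, not taken from the paper or its pipeline.

```python
def nx(lengths, x=50):
    """Return the Nx statistic for a list of sequence lengths (in bp).

    Sequences are sorted from longest to shortest and accumulated until
    they cover at least x% of the total length; the length of the last
    sequence added is the Nx value.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length


# Hypothetical toy assembly (NOT the I. indigotica scaffolds).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50 = nx(toy_lengths, 50)
n90 = nx(toy_lengths, 90)
print(n50, n90)
```

Applied to the real scaffold lengths, the same calculation would be expected to reproduce the scaffold N50 and N90 reported above, provided the same sequence set is used.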
Statistics of predicted protein-coding genes in the I. indigotica genome
| Number of protein-coding genes | 30,323 |
| Number of transcripts | 42,061 |
| Average transcript length (bp) | 2693.27 |
| Average exon length (bp) | 252.24 |
| Average intron length (bp) | 215.32 |
| Average number of exons per gene | 5.50 |
| Average total exon length per gene (bp) | 1387.32 |
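The final row of this table appears to be a per-gene total rather than a per-exon average: it matches the product of the average number of exons per gene and the average exon length. The short check below uses only values from the table; the variable names are illustrative, and the interpretation is an inference from the arithmetic rather than something the source states.

```python
# Values copied from the table above.
avg_exons_per_gene = 5.50
avg_exon_length_bp = 252.24
reported_per_gene_bp = 1387.32

# Expected total exon length per gene if the row is exons x exon length.
total_exon_length_bp = avg_exons_per_gene * avg_exon_length_bp

# Compare to the reported figure at two-decimal precision.
matches = abs(total_exon_length_bp - reported_per_gene_bp) < 0.005
print(round(total_exon_length_bp, 2), matches)
```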
Fig. 1 Ancestral Brassicaceae genomes and the distribution of ancestral genomic blocks (GBs) along the seven pseudochromosomes of I. indigotica.
a The ancestral genomes ACK, PCK, and tPCK, each comprising 22 ancestral GBs. Blocks whose orientation is inverted relative to ACK are represented by downward-pointing arrows. GBs M-N and D are translocated in tPCK chromosomes tPC2 and tPC7 compared with PCK chromosomes PC2 and PC7. The structures were drawn based on previous studies[32–34]. b The 22 GBs and their positions within the I. indigotica genome. The genes at each GB boundary are shown beside the chromosomes, with the corresponding A. thaliana locus IDs in parentheses, based on MCScanX results. The A. thaliana GB boundaries were derived from a previous study[34].
Fig. 2 Evolutionary and comparative genomic analyses.
a The phylogenetic tree of I. indigotica and eight other Brassicaceae species, with C. hassleriana as the outgroup. All branch bootstrap values are 100. Gene family expansions are indicated in purple, and gene family contractions in light brown. The estimated divergence times (million years ago, Mya) are indicated at each node; bars show the 95% highest posterior density (HPD) intervals. Blue circles represent recent whole-genome duplication events. b Ks value distributions between B. rapa, I. indigotica, and A. thaliana. c Orthogroups shared by selected species.
Fig. 3 Putative biosynthetic pathways of the three main classes of active compounds in I. indigotica.
The putative biosynthetic pathways of terpenoids (a), phenylpropanoids (b), and indole alkaloids (c). Values in brackets indicate the number of gene copies found for each catalytic gene in the pathway.