Ling Peng1, Liangwei Li1, Xiaochuan Liu1, Jianwei Chen1, Chengcheng Shi1, Wenjie Guo1, Qiwu Xu1, Guangyi Fan1,2, Xin Liu1,2, Dehai Li3.
Abstract
Penicillium is an ascomycetous genus widely distributed in the natural environment and is one of the dominant fungi involved in the decomposition of mangroves; its members can produce a variety of antitumor compounds and bioactive substances. However, no complete genome from this genus has been reported for mangrove ecosystems. In this study, we isolated a fungal strain named Penicillium variabile HXQ-H-1 from a coastal mangrove (Fujian Province, China). We generated a chromosome-level genome with a total size of 33.32 Mb, a scaffold N50 of 5.23 Mb and a contig N50 of 96.74 kb. Additionally, we anchored about 95.91% of the assembled sequence into the seven longest scaffolds and predicted 10,622 protein-coding genes, of which 99.66% could be annotated by eight protein databases. Secondary metabolite analysis reveals that the strain has various gene clusters involving polyketide synthase (PKS), non-ribosomal peptide synthetase (NRPS) and terpene synthase, which may give it considerable biotechnological potential. Comparative genome analysis between Penicillium variabile and Talaromyces islandicus reveals a small difference in the total number of genes, whereas HXQ-H-1 has a higher number of genes with COG functional annotation. The evolutionary relationships of Penicillium were analyzed based on genome-wide data for the first time, showing that strain HXQ-H-1 is closely related to Talaromyces islandicus. This genomic resource may support the development of novel bioactive antibiotics, drug candidates and precursors in Penicillium variabile.
Keywords: Penicillium variabile; chromosome-level genome; mangrove; phylogeny; secondary metabolites
Year: 2019 PMID: 31878043 PMCID: PMC7151134 DOI: 10.3390/jof6010007
Source DB: PubMed Journal: J Fungi (Basel) ISSN: 2309-608X
Statistics of genome assembly.
| Values | Draft Genome (Scaffold) | Draft Genome (Contig) | Chromosome Genome (Scaffold) | Chromosome Genome (Contig) |
|---|---|---|---|---|
| Num | 3860 | 3860 | 348 | 757 |
| Length (bp) | 34,059,084 | 34,059,084 | 33,318,899 | 33,114,399 |
| N50 (bp) | 93,482 | 93,482 | 5,232,202 | 96,738 |
| N90 (bp) | 24,393 | 24,393 | 2,951,000 | 30,747 |
| GC% | 46.58 | 47 | 48 | 48 |
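The N50 and N90 rows in the table follow the usual definition: sort the sequences from longest to shortest and report the length at which the running total first reaches 50% (or 90%) of the assembly. The sketch below illustrates that definition only. The function name `nxx` and the toy lengths are hypothetical, and this is not the tool the authors used to produce the table.

```python
def nxx(lengths, fraction=0.5):
    """Return the Nxx statistic for a list of sequence lengths.

    Nxx is the length L such that sequences of length >= L together
    cover at least `fraction` of the total assembly length.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    threshold = fraction * sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating covered bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= threshold:
            return length


# Hypothetical toy lengths (NOT taken from the paper), total = 300 bp.
toy_lengths = [100, 80, 60, 40, 20]
n50 = nxx(toy_lengths, 0.5)  # cumulative 100, 180 -> crosses 150 at length 80
n90 = nxx(toy_lengths, 0.9)  # cumulative 100, 180, 240, 280 -> crosses 270 at length 40
print(n50, n90)
```

Applied to real data, the same function would take the per-scaffold or per-contig lengths of an assembly. A high scaffold N50 alongside a much lower contig N50, as in the chromosome-level columns above, is what one would expect when many contigs are joined into a few long scaffolds.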
Figure 1. Summary of genome assembly and assessment. (a) Interaction heatmap of the seven longest scaffolds of the Penicillium variabile assembly. The horizontal and vertical lines divide the genome into seven major pseudochromosomes. The red gradient indicates the intensity of contact between sequences. Contacts within a pseudochromosome are stronger than contacts between pseudochromosomes. (b) The length of each chromosome and the number of contigs it contains. (c) GC-depth distribution of the final assembly. The horizontal axis represents the average GC content per 10 kb window, and the vertical axis represents the corresponding coverage depth. The bar charts on the top and right represent the distributions of GC content and coverage depth. (d) BUSCO (benchmarking universal single-copy orthologs) assessment of the genome and the gene set. The lower bar chart is the BUSCO assessment of the genome, in which 3090 genes (97.91%) are predicted to be complete, including 3074 single-copy and 16 duplicated genes. The upper bar chart is the BUSCO assessment of the gene set, in which 2788 genes (88.34%) are predicted to be complete, including 2769 single-copy and 19 duplicated genes.
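The BUSCO figures in panel (d) can be cross-checked with simple arithmetic. The sketch below takes the single-copy and duplicated counts and the completeness percentages from the caption. The total number of BUSCO groups is not stated in this record, so `implied_total` is an inference from those numbers, not a reported value.

```python
# Counts and percentages as given in the Figure 1(d) caption.
genome = {"single": 3074, "duplicated": 16, "complete_pct": 97.91}
gene_set = {"single": 2769, "duplicated": 19, "complete_pct": 88.34}


def complete(record):
    """Complete BUSCOs = single-copy + duplicated."""
    return record["single"] + record["duplicated"]


def implied_total(record):
    """Infer the size of the BUSCO set from count and percentage (an assumption)."""
    return round(complete(record) / (record["complete_pct"] / 100))


for name, record in (("genome", genome), ("gene set", gene_set)):
    total = implied_total(record)
    pct = 100 * complete(record) / total
    print(f"{name}: {complete(record)} complete of ~{total} ({pct:.2f}%)")
```

Both rows imply the same total, which is consistent with the genome and the gene set having been assessed against a single BUSCO lineage set.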
Gene prediction results using homology-based and de novo methods.
| Values | Homolog | Homolog | Homolog | Homolog | Augustus | Glean |
|---|---|---|---|---|---|---|
| Gene_number | 8947 | 9214 | 9510 | 9360 | 12,032 | 10,622 |
| Average_gene_len (bp) | 1456.62 | 1394.67 | 1459.34 | 1478.38 | 1773.88 | 1994.87 |
| Average_cds_len (bp) | 1317.95 | 1294.49 | 1349.26 | 1363.3 | 1581.44 | 1568.29 |
| Average_exon_number | 2.34 | 2.31 | 2.43 | 2.47 | 3.38 | 3.31 |
| Average_exon_len (bp) | 562.45 | 560.76 | 554.42 | 552.59 | 468.3 | 473.94 |
| Average_intron_len (bp) | 103.24 | 76.57 | 76.78 | 78.44 | 80.96 | 184.74 |
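The per-gene averages in the table are related to one another, which gives a rough internal consistency check. The sketch below assumes that the reported gene length covers only exons plus introns and that exon length refers to coding exons. Neither assumption is confirmed by this record. Because an average of ratios is not the same as a ratio of averages, the estimates are only expected to agree approximately with the reported values.

```python
# Values copied from the "Augustus" and "Glean" columns of the table above.
columns = {
    "Augustus": {"gene_len": 1773.88, "cds_len": 1581.44, "exon_number": 3.38,
                 "exon_len": 468.3, "intron_len": 80.96},
    "Glean": {"gene_len": 1994.87, "cds_len": 1568.29, "exon_number": 3.31,
              "exon_len": 473.94, "intron_len": 184.74},
}


def relative_error(estimate, reported):
    """Absolute difference as a fraction of the reported value."""
    return abs(estimate - reported) / reported


def check(col):
    """Return relative errors of the estimated CDS and intron lengths."""
    # Assumption: CDS length ~ number of exons x average exon length.
    est_cds = col["exon_number"] * col["exon_len"]
    # Assumption: a gene with n exons has n - 1 introns filling the
    # difference between gene length and CDS length.
    est_intron = (col["gene_len"] - col["cds_len"]) / (col["exon_number"] - 1)
    return (relative_error(est_cds, col["cds_len"]),
            relative_error(est_intron, col["intron_len"]))


results = {name: check(col) for name, col in columns.items()}
for name, (cds_err, intron_err) in results.items():
    print(f"{name}: CDS estimate off by {cds_err:.2%}, "
          f"intron estimate off by {intron_err:.2%}")
```

Under these assumptions both columns agree with their reported averages to well under 1%. This suggests that the longer average gene length in the Glean set comes mainly from longer introns, not from longer coding sequence.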
Figure 2. Functional annotation of strain P. variabile HXQ-H-1. (a) Statistics of genes annotated by 9 databases. The third column gives the proportion of annotated genes. (b) KEGG pathway classification. Histograms represent the number of genes in each pathway, grouped into 6 classifications distinguished by color. (c) Numbers of genes annotated by CAZyme.
Figure 3. Genome comparison of Penicillium variabile HXQ-H-1 and Talaromyces islandicus WF-38-12. The 7 scaffolds of HXQ-H-1 are on the left, and the 13 scaffolds of WF-38-12 are on the right. Collinear regions between scaffolds are connected by thin lines in the center. Each small square on the scale of the outermost circle represents 5 kb. Circle A is the GC heatmap. Circle B is the histogram of GC (red: G > C; blue: G < C). Circle C is the gene density along the chromosomes. Circles D and E are the COG annotation heatmaps for the positive and negative strands, with the legend shown at the bottom right.
Figure 4. Species phylogenetic trees of Penicillium variabile HXQ-H-1 and 33 other Penicillium genomes. Trees were reconstructed based on core genes (a), homologous genes (b) and SNPs (c), respectively. The three clades are marked with different colors.