Zhitao Mao1,2,3, Ping Yang2,3,4, Huanhuan Liu5, Yufeng Mao1,2,3, Yu Lei3,4, Dongwei Hou3,4, Hongwu Ma1,2,3, Xiaoping Liao1,2,3, Wenxia Jiang2,3,4.
Abstract
Ceriporia lacerata is an endophytic white-rot fungus that has lignocellulolytic and terpenoid-biosynthetic abilities. However, little is known about the genomic architecture of this fungus, even at the genus level. In this study, we present the first de novo genome assembly of C. lacerata (CGMCC No. 10485), based on PacBio long-read and Illumina short-read sequencing. The size of the C. lacerata genome is approximately 36 Mb (N50, 3.4 Mb). It encodes a total of 13,243 genes, and further functional analysis revealed that these genes are primarily involved in primary metabolism and host interactions in this strain's saprophytic lifestyle. Phylogenetic analysis based on ITS demonstrated a primary evolutionary position for C. lacerata, while the phylogenetic analysis based on orthogroup inference and average nucleotide identity revealed high-resolution phylogenetic details in which Ceriporia, Phlebia, Phlebiopsis, and Phanerochaete belong to the same evolutionary clade within the order Polyporales. Annotation of carbohydrate-active enzymes across the genome yielded a total of 806 genes encoding enzymes that decompose lignocellulose, particularly ligninolytic enzymes, lytic polysaccharide monooxygenases, and enzymes involved in the biodegradation of aromatic components. These findings illustrate the strain's adaptation to woody habitats, which requires the degradation of lignin and various polycyclic aromatic hydrocarbons. The terpenoid-production potential of C. lacerata was evaluated by comparing the genes of terpenoid biosynthetic pathways across nine Polyporales species. The shared genes highlight the major part of terpenoid synthesis pathways, especially the mevalonic acid pathway, as well as the main pathways of sesquiterpenoid, monoterpenoid, diterpenoid, and triterpenoid synthesis, while the strain-specific genes illustrate the distinct genetic factors determining the synthesis of structurally diverse terpenoids.
To our knowledge, this is the first genomic analysis of a species from this genus, and it will help advance functional genome research and resource development of this important fungus for applications in renewable energy, pharmaceuticals, and agriculture.
Keywords: Ceriporia lacerata; comparative genomics; de novo genome sequencing; lignin degradation; phylogenetic analysis; terpenoid biosynthesis
Year: 2022 PMID: 35685935 PMCID: PMC9171200 DOI: 10.3389/fmicb.2022.880946
Source DB: PubMed Journal: Front Microbiol ISSN: 1664-302X Impact factor: 6.064
Representative genomes of order Polyporales.
| Strain | Assembly accession | Number of scaffolds | Genome coverage | Sequencing technology | Year |
| | | 237 | 144.4× | Illumina | 2016 |
| | | 542 | 50.63× | 454; Illumina | 2012 |
| | | 861 | N/A | N/A | 2012 |
| | | 504 | 85.9× | Illumina; PacBio | 2013 |
| | | 69 | 500.0× | 454; Illumina HiSeq | 2017 |
| | | 740 | 56.6× | 454; Sanger | 2013 |
| | | 127 | 100.0× | PacBio | 2016 |
| | | 399 | 85.2× | Illumina | 2016 |
| | | 712 | 127× | Illumina | 2016 |
| | | 1137 | 58.1× | Sanger; 454; Illumina | 2012 |
| | | 1355 | 160.0× | Illumina HiSeq | 2018 |
| | | 573 | 145× | Illumina | 2015 |
| | | 549 | 47× | 454; Sanger; Illumina | 2017 |
| | | 776 | 31× | N/A | 2014 |
| | | 222 | 99.4× | Illumina | 2017 |
| | | 1731 | 160.0× | Illumina HiSeq | 2016 |
| | | 283 | 40× | Sanger; 454; Illumina | 2012 |
| | | 348 | 40× | Sanger; 454; Illumina | 2013 |
Whole-genome assembly features of C. lacerata CGMCC No. 10485.
| Assembly parameters | Value |
| Total genome size (bp) | 36,361,585 |
| Number of contigs | 58 |
| Maximum contig length (bp) | 4,415,373 |
| Minimum contig length (bp) | 1,192 |
| Average contig length (bp) | 626,923 |
| N50 value (bp) | 3,409,197 |
| GC (%) | 49.33 |
| BUSCO (%) | 98.4 |
| | |
| Total number of predicted proteins/genes | 13,243 |
| Total number of annotated proteins/genes | 9,085 |
| Non-coding RNAs | |
| tRNAs | 179 |
| rRNAs | 14 |
| snRNAs | 10 |
| Other | 20 |
| | |
| Number of protein-coding genes | 13,243 |
| Average gene length (bp) | 1,860.07 |
| Gene density (number of genes per Mb) | 364.22 |
| | |
| Number of exons | 119,885 |
| Total exon length (Mb) | 24.18 |
| Average exon length (bp) | 201.71 |
| Average number of exons per gene | 9.05 |
| | |
| Number of introns | 89,079 |
| Total intron length (Mb) | 6.35 |
| Average intron length (bp) | 71.31 |
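As a reading aid for the assembly table above, the following minimal Python sketch shows how an N50 value is conventionally computed and how the gene density can be rederived from the reported genome size and gene count. The contig lengths in `toy_contigs` are hypothetical placeholders (the real contig lengths are not listed here), and the function name `n50` is ours, not taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50: the contig length at which the cumulative sum of
    contigs, sorted from longest to shortest, first reaches at least half
    of the total assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths in bp (illustration only, NOT the real assembly).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
toy_n50 = n50(toy_contigs)
print("toy N50:", toy_n50)

# Values taken from the table: total genome size and protein-coding gene count.
genome_size_bp = 36_361_585
n_genes = 13_243

# Gene density in genes per Mb, assuming 1 Mb = 1,000,000 bp.
gene_density = n_genes / (genome_size_bp / 1_000_000)
print("genes per Mb:", round(gene_density, 2))
```

Under the 1 Mb = 1,000,000 bp assumption, the recomputed density comes out close to, but not exactly at, the tabulated 364.22 genes per Mb; the small difference presumably reflects rounding or a slightly different denominator in the original calculation.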
FIGURE 1. Repetitive sequences in the genome of Ceriporia lacerata. (A) Repeat region types and counts identified by RepeatMasker. (B) SSRs identified by MISA. LINE, long interspersed nuclear element; LTR, long terminal repeat; SINE, short interspersed nuclear element.
FIGURE 2. Functional annotations of the Ceriporia lacerata genome. (A) Top 15 GO terms ranked by gene count. GO terms with asterisks refer to oxidoreductase activity acting on paired donors, with incorporation or reduction of molecular oxygen. (B) Top 15 KEGG BRITE categories ranked by KO count. (C) KOG annotation and classification. (D) GO annotations of the secretome. BP, biological process; CC, cellular component; MF, molecular function.
FIGURE 3. Genome-based phylogenetic analysis of representative strains within the order Polyporales. (A) Phylogenetic analysis based on the ITS sequences from representative TYPE materials from Polyporales and C. lacerata CGMCC No. 10485. Genera are distinguished by color. (B) Unrooted ML phylogenetic tree built from single-copy orthologous genes of representative genomes of the order Polyporales, based on hidden Markov models. (C) ANI values and ANI-value-based hierarchical clustering. The data matrix uses species as the independent variable and pairwise ANI values between species as the dependent variable; clustering used the group-average method with Euclidean distance as the scale.
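The clustering described for panel C (group-average linkage on Euclidean distances between rows of the ANI matrix) can be illustrated with a small pure-Python sketch. The species labels and ANI values below are hypothetical placeholders, not values from the study, and the function `average_linkage` is our own illustrative implementation rather than the software the authors used.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(names, rows):
    """Agglomerative clustering with group-average linkage.

    Each row is one species' vector of ANI values against all species.
    Returns the merges in order as (cluster_a, cluster_b, distance),
    where clusters are tuples of species names.
    """
    # Pairwise Euclidean distances between the original observations.
    dist = {
        frozenset((i, j)): euclidean(rows[i], rows[j])
        for i, j in combinations(range(len(rows)), 2)
    }
    clusters = [(i,) for i in range(len(rows))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Group average: mean distance over all cross-cluster pairs.
            d = sum(dist[frozenset((i, j))] for i in a for j in b)
            d /= len(a) * len(b)
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append(
            (tuple(names[i] for i in a), tuple(names[i] for i in b), d)
        )
    return merges


# Hypothetical ANI matrix (%) for four placeholder species.
names = ["sp_A", "sp_B", "sp_C", "sp_D"]
ani = [
    [100.0, 90.0, 75.0, 74.0],
    [90.0, 100.0, 76.0, 75.0],
    [75.0, 76.0, 100.0, 88.0],
    [74.0, 75.0, 88.0, 100.0],
]

merges = average_linkage(names, ani)
for left, right, d in merges:
    print(left, "+", right, "at distance", round(d, 2))
```

With this toy matrix, the two pairs with the most similar ANI profiles merge first, and the two resulting groups are joined last. For real data, an established library routine would normally be preferable to this quadratic-per-step sketch.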
FIGURE 4. CAZyme-encoding genes in the genome of Ceriporia lacerata, identified using a comparative approach based on 18 additional species from the order Polyporales. (A) Gene counts and phylogenetic analysis of CAZymes. The phylogenetic tree was constructed with the ML algorithm. (B) Gene count heatmap of CAZyme subfamilies. The color scale represents gene counts normalized by the Z-score method. The CAZyme annotation information of the analyzed species is listed in Supplementary Table 2.
FIGURE 5. Identification of terpenoid biosynthesis pathways in the Ceriporia lacerata genome. (A) Sketch of the biosynthesis pathways. (B) Annotation of the terpenoid backbone biosynthesis pathway (KEGG pathway map00900). (C) Annotation of the monoterpenoid biosynthesis pathway (map00909). (D) Annotation of the sesquiterpenoid and triterpenoid biosynthesis pathway (map00904). (E) Annotation of the diterpenoid biosynthesis pathway (map00902). The Enzyme Commission (EC) numbers for each pathway were converted into KO identifiers. The gene counts of each KO term are normalized by the Z-score method, with values increasing from blue to red on the color bar; missing KOs are indicated by black circles. More information is supplied in Supplementary Tables 3–6.
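The Z-score normalization of gene counts mentioned in the captions of Figures 4 and 5 can be sketched as follows. This is a minimal illustration with hypothetical counts for a single subfamily or KO term across five species. It assumes the population standard deviation and maps a constant row to zeros; the source does not state which variant the authors used.

```python
from statistics import mean, pstdev


def zscores(values):
    """Z-score normalize one row of gene counts: (x - mean) / std.

    Assumption: population standard deviation (pstdev). A row with no
    variation is mapped to all zeros to avoid division by zero.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical gene counts for one subfamily / KO term across five species.
counts = [2, 4, 4, 4, 6]
z = zscores(counts)
print([round(x, 3) for x in z])
```

After normalization, each row has mean zero, so the heatmap colors show whether a species carries more or fewer copies than the average of the compared species rather than the raw copy number.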