Luiz Marcelo Ribeiro Tomé, Felipe Ferreira da Silva, Paula Luize Camargos Fonseca, Thairine Mendes-Pereira, Vasco Ariston de Carvalho Azevedo, Bertram Brenig, Fernanda Badotti, Aristóteles Góes-Neto.
Abstract
Trametes villosa is a wood-decaying fungus with great potential to be used in the bioconversion of agro-industrial residues and to obtain high-value-added products, such as biofuels. Nonetheless, the lack of high-quality genomic data hampers studies investigating genetic mechanisms and metabolic pathways in T. villosa, hindering its application in industry. Herein, applying a hybrid assembly pipeline using short reads (Illumina HiSeq) and long reads (Oxford Nanopore MinION), we obtained a high-quality genome for T. villosa CCMB561 and investigated its genetic potential for lignocellulose breakdown. The new genome possesses 143 contigs, an N50 of 1,009,271 bp, a total length of 46,748,415 bp, 14,540 protein-coding genes, 22 secondary metabolite gene clusters, and 426 genes encoding Carbohydrate-Active enzymes. Our CAZome annotation and comparative genomic analyses of nine Trametes spp. genomes revealed T. villosa CCMB561 as the species with the highest number of genes encoding lignin-modifying enzymes and a wide array of genes encoding proteins for the breakdown of cellulose, hemicellulose, and pectin. These results bring to light the potential of this isolate to be applied in the bioconversion of lignocellulose and will support future studies on the expression, regulation, and evolution of genes, proteins, and metabolic pathways regarding the bioconversion of lignocellulosic residues.
Keywords: CAZymes; Trametes villosa CCMB561; comparative genomics; genome assembly; lignocellulosic biomass
Year: 2022 PMID: 35205897 PMCID: PMC8876698 DOI: 10.3390/jof8020142
Source DB: PubMed Journal: J Fungi (Basel) ISSN: 2309-608X
Summary statistics of the Illumina HiSeq and Oxford Nanopore MinION reads after the preprocessing step.
| | Illumina | MinION |
|---|---|---|
| Total reads number | 48,347,940 | 1,043,247 |
| Total reads bases (bp) | 5,798,237,268 | 4,189,223,607 |
| Coverage | 129× | 93× |
| Longest read (bp) | 151 | 21,613 |
| Mean reads length (bp) | 138 | 4476 |
| GC content (%) | 57.5 | 56 |
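The coverage values in the table above can be related to the total read bases with a simple depth calculation. The sketch below is illustrative only: the 45 Mb genome size is an assumption chosen because it is consistent with the reported 129× and 93× figures, not a value stated in the source, and `fold_coverage` is a hypothetical helper name.

```python
def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Average sequencing depth: sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


# Assumption: a 45 Mb genome size estimate (not given in the source).
ASSUMED_GENOME_SIZE_BP = 45_000_000

# Total read bases taken from the table above.
illumina_cov = fold_coverage(5_798_237_268, ASSUMED_GENOME_SIZE_BP)
minion_cov = fold_coverage(4_189_223_607, ASSUMED_GENOME_SIZE_BP)

print(f"Illumina: {illumina_cov:.0f}x, MinION: {minion_cov:.0f}x")
```

Under that assumed genome size, the rounded depths agree with the coverage row of the table; a different size estimate would shift both values proportionally.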
Summary statistics for the assembled genomes of Trametes villosa CCMB561 using reads from Illumina HiSeq and Oxford Nanopore MinION.
| Assembly | MaSuRCa (Illumina) | CANU (MinION) | CANU-smartdenovo (MinION) | RACON (MinION) | FLYE (MinION) | SPADES (Hybrid) | MaSuRCa (Hybrid) | MaSuRCa-Purge_Dups (Hybrid) |
|---|---|---|---|---|---|---|---|---|
| Number of contigs | 4026 | 1836 | 337 | 1836 | 882 | 12,829 | 264 | 143 |
| Number of contigs (≥500 bp) | 3930 | 1836 | 337 | 1836 | 881 | 1940 | 264 | 143 |
| Largest contig | 470,636 | 1,594,329 | 1,660,310 | 1,605,280 | 1,891,910 | 1,207,893 | 4,772,416 | 9,749,168 |
| Total length (≥500 bp) | 58,820,861 | 63,704,316 | 42,774,667 | 63,971,542 | 49,876,064 | 65,406,907 | 62,711,988 | 46,748,415 |
| GC (%) | 59.40 | 59.36 | 59.39 | 59.41 | 59.35 | 59.39 | 59.39 | 59.45 |
| N50 | 27,657 | 103,641 | 238,816 | 104,325 | 204,679 | 282,055 | 598,690 | 1,009,271 |
| L50 | 503 | 115 | 43 | 114 | 55 | 69 | 21 | 8 |
| # N’s per 100 kbp | 0.00 | 0.00 | 0.00 | 0.00 | 2.41 | 227.07 | 0.16 | 0.69 |
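For readers unfamiliar with the contiguity metrics in the table above, the following minimal sketch shows how N50 and L50 are conventionally derived from a list of contig lengths. It is not the evaluation tool used by the authors; `n50_l50` is a hypothetical helper and the contig lengths in the example are toy values.

```python
from typing import Iterable, Tuple


def n50_l50(contig_lengths: Iterable[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a collection of contig lengths.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the assembly.
    L50 is the number of contigs in that set.
    """
    ordered = sorted(contig_lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    half_total = sum(ordered) / 2
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, rank
    raise AssertionError("unreachable: cumulative sum always reaches half")


# Toy example (not real assembly data): total length is 30, half is 15.
toy_n50, toy_l50 = n50_l50([2, 4, 6, 8, 10])
print(toy_n50, toy_l50)
```

A higher N50 together with a lower L50, as reported for the MaSuRCa-Purge_Dups assembly (1,009,271 bp and 8), indicates that fewer and longer contigs account for half of the assembled sequence.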
Figure 1. Overview of the newly assembled genome of Trametes villosa CCMB561. (a) Assembly workflow proposed as the best approach for genome assembly. (b) Summary evaluation of the genome assembled through the MaSuRCa-Purge_Dups workflow and the reference genome of Trametes villosa deposited in the NCBI database (GCA_002964805.1). (c) BUSCO completeness assessment of the new genome and the reference genome of Trametes villosa previously deposited in the NCBI database (GCA_002964805.1).
Completeness assessment of Trametes villosa CCMB561 assemblies using BUSCO software.
| Assembly | Complete (%) | Single-Copy (%) | Duplicated (%) | Fragmented (%) | Missing (%) |
|---|---|---|---|---|---|
| CANU | 80.7 | 65.2 | 15.5 | 6.7 | 12.6 |
| CANU-smartdenovo | 76.2 | 73.8 | 2.4 | 8.6 | 15.2 |
| FLYE | 90.2 | 85.2 | 5.0 | 3.9 | 5.9 |
| MaSuRCa (Hybrid) | 99.0 | 64.0 | 35.0 | 0.2 | 0.8 |
| MaSuRCa (Illumina) | 97.4 | 64.3 | 33.1 | 0.9 | 1.7 |
| MaSuRCa-Purge_Dups | 99.1 | 96.7 | 2.4 | 0.1 | 0.8 |
| RACON | 88.2 | 70.0 | 18.2 | 4.5 | 7.3 |
| SPADES | 99.1 | 41.6 | 57.5 | 0.2 | 0.7 |
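The BUSCO categories in the table above are related by two identities: complete equals single-copy plus duplicated, and complete plus fragmented plus missing equals 100%. The sketch below checks the tabulated values against those identities. It is only an illustrative consistency check, not part of the BUSCO software; `is_consistent` and the 0.15 tolerance (to absorb one-decimal rounding) are assumptions introduced here.

```python
import math

# (complete, single_copy, duplicated, fragmented, missing), all in percent,
# transcribed from the table above.
BUSCO_ROWS = {
    "CANU": (80.7, 65.2, 15.5, 6.7, 12.6),
    "CANU-smartdenovo": (76.2, 73.8, 2.4, 8.6, 15.2),
    "FLYE": (90.2, 85.2, 5.0, 3.9, 5.9),
    "MaSuRCa (Hybrid)": (99.0, 64.0, 35.0, 0.2, 0.8),
    "MaSuRCa (Illumina)": (97.4, 64.3, 33.1, 0.9, 1.7),
    "MaSuRCa-Purge_Dups": (99.1, 96.7, 2.4, 0.1, 0.8),
    "RACON": (88.2, 70.0, 18.2, 4.5, 7.3),
    "SPADES": (99.1, 41.6, 57.5, 0.2, 0.7),
}


def is_consistent(complete, single, duplicated, fragmented, missing, tol=0.15):
    """Check both BUSCO identities within a rounding tolerance (assumed)."""
    parts_match = math.isclose(complete, single + duplicated, abs_tol=tol)
    sums_to_100 = math.isclose(complete + fragmented + missing, 100.0, abs_tol=tol)
    return parts_match and sums_to_100


for name, row in BUSCO_ROWS.items():
    status = "ok" if is_consistent(*row) else "inconsistent"
    print(f"{name}: {status}")
```

A check of this kind also makes the effect of duplicate purging easy to see: the hybrid MaSuRCa assembly and the MaSuRCa-Purge_Dups assembly have nearly the same complete percentage, but the duplicated fraction drops from 35.0% to 2.4%.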
Figure 2. Gene Ontology (GO) functional annotation of Trametes villosa CCMB561 proteins. (a) Number of hits (GO terms) associated with the predicted proteins by GO categories (Biological process, Cellular component, and Molecular function), in which one protein can be associated with multiple GO terms. (b–d) The 20 most assigned terms per category in the GO enrichment analysis.
Figure 3. Annotation of secondary metabolite gene clusters (SMGCs) and Carbohydrate-Active enzymes (CAZymes). (a) SMGCs identified in the genome of Trametes villosa CCMB561. (b) CAZymes identified in the genome of Trametes villosa CCMB561.
Figure 4. Overview of the comparative genomics results. (a) Maximum-likelihood phylogenomic tree constructed using the newly assembled genome of Trametes villosa CCMB561 (marked with *) and nine available genomes from the genus Trametes. Bootstrap values are expressed as percentages, and the features of each genome are shown beside the phylogeny. (b) Network plot created using a matrix containing the genome size, number of genes, TE coverage, GC content, and number of tRNAs of each genome. (c) Correlation analysis among the main genome metrics (statistically significant correlations are marked with *).
Transposable elements (TE) identified in the Trametes species.
| Fungus ID | Total No. TE | Total TE Coverage (%) | Retroelements: SINEs | Retroelements: LINEs | Retroelements: LTR Ty1/Copia | Retroelements: LTR Gypsy/DIRS1 | DNA Transposons | Helitron | Unclassified |
|---|---|---|---|---|---|---|---|---|---|
| | 252 | 8.57 | 0 | 144 | 337 | 450 | 390 | 0 | 3079 |
| | 104 | 2.22 | 0 | 0 | 119 | 137 | 0 | 0 | 1187 |
| | 129 | 4.50 | 0 | 0 | 210 | 451 | 14 | 0 | 1841 |
| | 191 | 10.61 | 0 | 73 | 264 | 387 | 65 | 42 | 1949 |
| | 172 | 3.67 | 0 | 0 | 317 | 263 | 89 | 0 | 2219 |
| | 349 | 6.41 | 0 | 113 | 416 | 912 | 105 | 69 | 5043 |
| | 303 | 5.82 | 19 | 106 | 107 | 160 | 258 | 30 | 3385 |
| | 191 | 7.73 | 0 | 41 | 184 | 328 | 32 | 0 | 1855 |
| | 234 | 4.6 | 0 | 50 | 38 | 144 | 11 | 94 | 4165 |
| | 274 | 7.13 | 17 | 97 | 186 | 503 | 74 | 70 | 4437 |
Figure 5. CAZyme-encoding genes involved in the degradation of lignocellulosic biomass. (a) Number of auxiliary redox enzyme-encoding genes. (b) Number of hemicellulose-degrading enzyme-encoding genes. (c) Number of cellulose breakdown enzyme-encoding genes. (d) Number of pectin-degrading enzyme-encoding genes.