Tao Sun, Yixuan Zhang, Hao Jiang, Kai Yang, Shiyu Wang, Rui Wang, Sha Li, Peng Lei, Hong Xu, Yibin Qiu, Dafeng Sun.
Abstract
Naematelia aurantialba is a rare edible fungus with both nutritional and medicinal value, and it is especially rich in bioactive polysaccharides. However, because genomic information has been lacking, research on the mining of active compounds, artificial breeding and cultivation, genetics, and molecular biology has been limited. To facilitate the medicinal and food applications of N. aurantialba, we sequenced and analyzed its whole genome for the first time. The 21-Mb genome comprised 15 contigs, and a total of 5860 protein-coding genes were predicted. The genome sequence shows that 296 genes are related to polysaccharide synthesis, including 15 genes related to nucleoside-activated sugar synthesis and 11 genes related to glucan synthesis. The genome also contains genes and gene clusters for the synthesis of other active substances, including terpenoids, unsaturated fatty acids, and bioactive proteins. In addition, N. aurantialba was found to be more closely related to Naematelia encephala than to Tremella fuciformis. In short, this study provides a reference for the molecular understanding of N. aurantialba and for related research.
Keywords: Naematelia aurantialba; functional annotation; polysaccharides; secondary metabolism; whole-genome sequencing
Year: 2021 PMID: 35049946 PMCID: PMC8777972 DOI: 10.3390/jof8010006
Source DB: PubMed Journal: J Fungi (Basel) ISSN: 2309-608X
Figure 1. Fruiting bodies of N. aurantialba.
Statistics of N. aurantialba NX-20 genome assembly and gene prediction.
| Feature | Value |
|---|---|
| Genome assembly | |
| Contigs number | 15 |
| Max length (bp) | 2,546,384 |
| N50 length (bp) | 1,814,705 |
| Total length (bp) | 20,998,359 |
| GC (%) | 56.42 |
| Gene prediction | |
| Gene number | 5860 |
| Gene total length (bp) | 8,989,977 |
| Gene average length (bp) | 1534 |
| Gene length/Genome (%) | 42.81 |
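The derived rows in the table above can be checked from the raw counts. The following is a minimal sketch, not the authors' pipeline: the constants are copied from the table, while the `n50` helper and the toy contig lengths are illustrative assumptions, because the individual contig lengths are not listed in this section.

```python
# Values copied from the assembly and gene-prediction table above.
TOTAL_LENGTH_BP = 20_998_359
GENE_NUMBER = 5860
GENE_TOTAL_LENGTH_BP = 8_989_977


def n50(contig_lengths):
    """Return the N50: walking the contigs from longest to shortest,
    the length of the contig at which the running total first reaches
    at least half of the summed assembly length."""
    half = sum(contig_lengths) / 2
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("contig_lengths must not be empty")


# Derived gene statistics, recomputed from the table values.
gene_average_length = GENE_TOTAL_LENGTH_BP / GENE_NUMBER
gene_fraction_pct = 100 * GENE_TOTAL_LENGTH_BP / TOTAL_LENGTH_BP

print(round(gene_average_length))    # 1534
print(round(gene_fraction_pct, 2))   # 42.81

# Hypothetical toy contig lengths (NOT from the paper), used only to
# show how the N50 definition works.
toy_contigs = [5, 4, 3, 2, 1]
toy_n50 = n50(toy_contigs)
print(toy_n50)                       # 4
```

Both recomputed values agree with the reported gene average length (1534 bp) and gene length/genome ratio (42.81%).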
Assembly summary statistics compared to other mushrooms of Tremellales.
| Species | NCBI BioProject | Total Length (Mb) | GC (%) | Contigs | N50 Length (bp) | Complete a | Fragmented | Missing |
|---|---|---|---|---|---|---|---|---|
| | PRJNA281519 | 23.6356 | 57.0 | 3502 | 18,448 | 92.4% | 1.4% | 6.2% |
| | PRJNA225529 | 28.6399 | 46.8 | 484 | 123,767 | 92.0% | 1.4% | 6.6% |
| | PRJNA207298 | 27.1109 | 41.3 | 1019 | 73,463 | 90.6% | 2.4% | 7.0% |
| | PRJNA330699 | 19.7863 | 49.3 | 151 | 209,500 | 85.5% | 3.4% | 11.1% |
| | PRJNA772294 | 20.9984 | 56.4 | 15 | 1,825,336 | 93.1% | 2.4% | 4.5% |
Note: a, number of BUSCO proteins (as a percentage of total BUSCOs).
Statistical results of repeat sequences in the N. aurantialba NX-20 genome.
| Repeat Type | Type | Number of Elements | Length Occupied (bp) | Repeat Size (bp) | Percentage of Genome (%) |
|---|---|---|---|---|---|
| Interspersed repeat | SINE | 9 | 1030 | - | 0.0049 |
| Interspersed repeat | LINEs | 395 | 39,539 | - | 0.1883 |
| Interspersed repeat | LTR elements | 643 | 115,566 | - | 0.5504 |
| Interspersed repeat | DNA elements | 418 | 39,329 | - | 0.1873 |
| Interspersed repeat | RC | 68 | 8542 | - | 0.0407 |
| Interspersed repeat | Unknown | 16 | 1593 | - | 0.0076 |
| Tandem repeat | TR | 12,449 | 583,229 | 1~982 | 2.7775 |
| Tandem repeat | Microsatellite DNA | 1448 | 91,405 | 2~6 | 0.4353 |
| Tandem repeat | Minisatellite DNA | 9096 | 453,057 | 10~60 | 2.1576 |
Note: -, not detected.
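The "Percentage of Genome" column appears to be the occupied length divided by the total assembly length of 20,998,359 bp. The sketch below recomputes it from the table values. The dictionary layout and function name are illustrative assumptions, not part of the original analysis.

```python
# Total assembly length from the genome assembly statistics (bp).
GENOME_LENGTH_BP = 20_998_359

# "Length Occupied (bp)" values copied from the repeat table above.
repeat_length_bp = {
    "SINE": 1030,
    "LINEs": 39_539,
    "LTR elements": 115_566,
    "DNA elements": 39_329,
    "RC": 8542,
    "Unknown": 1593,
    "TR": 583_229,
    "Microsatellite DNA": 91_405,
    "Minisatellite DNA": 453_057,
}


def percent_of_genome(length_bp, genome_bp=GENOME_LENGTH_BP):
    """Share of the assembly covered by a repeat class, in percent."""
    return 100 * length_bp / genome_bp


# Recompute the percentage column, rounded to four decimals like the table.
repeat_pct = {
    name: round(percent_of_genome(bp), 4)
    for name, bp in repeat_length_bp.items()
}

for name, pct in repeat_pct.items():
    print(f"{name}: {pct:.4f}%")
```

Under this assumption, each recomputed percentage matches the reported column to four decimal places.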
Statistical results of noncoding RNAs in the N. aurantialba NX-20 genome.
| Type | Number of Elements | Total Length (bp) | Average Length (bp) | Percentage in Genome (%) |
|---|---|---|---|---|
| tRNA | 44 | 3925 | 89 | 0.01869 |
| 5s_rRNA | 9 | 1034 | 115 | 0.00599 |
| 5.8s_rRNA | 0 | 0 | 0 | 0 |
| 18s_rRNA | 1 | 1802 | 1802 | 0.02294 |
| 28s_rRNA | 1 | 3492 | 3492 | 0.05030 |
| sRNA | 0 | 0 | 0 | 0 |
| snRNA | 7 | 677 | 96 | 0.00322 |
| miRNA | 0 | 0 | 0 | 0 |
Figure 2. The number of CAZyme genes in N. aurantialba and the other 18 fungi.
Figure 3. Comparative genomics analysis. (A) Gene families (Single-Copy Orthologs, the number of single-copy homologous genes in the gene families common to the species; Multiple-Copy Orthologs, the number of multiple-copy homologous genes in the gene families common to the species; Unique Paralogs, genes in species-specific gene families; Other Orthologs, other genes; Unclustered Genes, genes that were not clustered into any family); (B) conserved and specific gene counts (each ellipse represents a strain, and the number in each ellipse is its count of specific genes; the central white circle represents the genes conserved among the nine strains); (C) maximum likelihood phylogenetic tree.
Figure 4. Synteny of N. aurantialba NX-20 with N. encephala 68-887.2 (A), T. mesenterica DSM 1558 (B), and Tremella fuciformis Tr26 (C). The upper axis represents the sequenced genome, and the lower axis represents the reference genome. The forward and reverse strands are represented by yellow boxes and blue boxes, respectively. The height of the color fill in each box indicates the similarity of the alignment, with a completely filled box indicating 100% similarity. The color of the links between the upper and lower axes indicates the alignment type: Collinear, syntenic alignment; Translocation, translocation alignment; Inversion, inverted alignment; Tran + inver, alignment with both translocation and inversion.