Chunji Li, Ping Cheng, Yunhao Sun, Di Qin, Guohui Yu.
Abstract
Sporobolomyces roseus is an important oleaginous red yeast with critical biotechnological applications and has received significant recognition as a valuable source of industrial enzymes, carotenoids, and lipids. To reveal the genetic basis and functional components underlying its biotechnological applications, a high-quality genome assembly is required. Here, we present a novel genome assembly of S. roseus CGMCC 2.4355 using a combination of Illumina and Oxford Nanopore technologies. The genome has an assembly size of 21.4 Mb and consists of 15 scaffolds with an N50 length of 2,126,566 bp and a GC content of 49.52%. The assembly is of high integrity, comprising 95.2% complete Benchmarking Universal Single-Copy Orthologs (BUSCOs) as evaluated by a genome completeness assessment. The genome was predicted to contain 8,124 protein-coding genes, 6,890 of which were functionally annotated. We believe that the combination of our analyses and high-quality genome assembly will promote the basic development of S. roseus as an agent for biotechnological applications and contribute significantly to assessing the evolutionary relationships of Sporobolomyces species.
Keywords: Sporobolomyces roseus; biotechnology; evolutionary relationship; genome assembly; oleaginous red yeasts
Year: 2021 PMID: 34864973 PMCID: PMC8643702 DOI: 10.1093/gbe/evab258
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
(A) Genomic landscape of S. roseus CGMCC 2.4355. (B) Venn diagram illustrating shared and unique genes annotated in the NR, SwissProt, KOG, and KEGG databases. (C) Distribution of orthologous and species-specific genes within three sequenced Sporobolomyces species. (D) Phylogenetic tree of three sequenced Sporobolomyces species was constructed based on aligned orthologs with Neighbor-Joining method (bootstrap: 1,000 replicates).
Summary of Assembly Statistics
| Category | Metric | Value |
|---|---|---|
| Assembly | Size (bp) | 22,396,975 |
| Assembly | Number of scaffolds | 15 |
| Assembly | Scaffold N50 (bp) | 2,126,566 |
| Assembly | Scaffold N90 (bp) | 1,104,580 |
| Assembly | Longest scaffold (bp) | 4,742,556 |
| Assembly | Shortest scaffold (bp) | 23,714 |
| Assembly | GC content (%) | 49.52 |
| BUSCO | Complete and single-copy BUSCOs | 1,266 |
| BUSCO | Complete and duplicated BUSCOs | 5 |
| BUSCO | Fragmented BUSCOs | 22 |
| BUSCO | Missing BUSCOs | 42 |
| BUSCO | Total BUSCOs searched | 1,335 |
| Repetitive elements | SINEs (bp) | 2,206 |
| Repetitive elements | LINEs (bp) | 6,840 |
| Repetitive elements | LTR (bp) | 930 |
| Repetitive elements | DNA transposons (bp) | 2,446 |
| Repetitive elements | Total (bp) | 12,422 |
| Annotation | Predicted genes | 8,124 |
| Annotation | Functional-annotated genes | 6,890 |
| Annotation | Mean gene length (bp) | 2,391.07 |
| Annotation | Exons/gene | 6.98 |
| Annotation | Introns/gene | 5.98 |
| Annotation | Exon ratio (%) | 67.96 |
| Annotation | Intron ratio (%) | 18.77 |
| Annotation | Mean exon length (bp) | 268.38 |
| Annotation | Mean intron length (bp) | 86.54 |
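
The summary figures in the table can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' pipeline. It assumes that "complete BUSCOs" means single-copy plus duplicated, and that N50 follows the usual definition (the length at which the cumulative sum of scaffolds, sorted longest first, reaches half the total). The dictionary keys, function names, and the toy scaffold lengths are hypothetical; the counts themselves are copied from the table.

```python
# Values copied from the Summary of Assembly Statistics table.
busco = {
    "complete_single": 1266,
    "complete_duplicated": 5,
    "fragmented": 22,
    "missing": 42,
}
busco_total = 1335

repeats_bp = {"SINEs": 2206, "LINEs": 6840, "LTR": 930, "DNA transposons": 2446}


def busco_complete_pct(counts, total):
    """Percentage of searched BUSCOs that are complete (single-copy + duplicated)."""
    complete = counts["complete_single"] + counts["complete_duplicated"]
    return 100.0 * complete / total


def n50(lengths):
    """Length at which the cumulative sum, longest first, reaches half the total."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    raise ValueError("lengths must be non-empty")


if __name__ == "__main__":
    # The four BUSCO categories should add up to the number searched.
    print(sum(busco.values()) == busco_total)
    # Complete BUSCO percentage, rounded to one decimal as in the abstract.
    print(round(busco_complete_pct(busco, busco_total), 1))
    # Sum of the four repeat classes, to compare with the "Total (bp)" row.
    print(sum(repeats_bp.values()))
    # Toy scaffold lengths (hypothetical; the real per-scaffold lengths are not listed here).
    print(n50([5, 4, 3, 2, 1]))
```

Under these assumptions the BUSCO categories sum to the 1,335 searched, the complete fraction rounds to the 95.2% quoted in the abstract, and the four repeat classes sum to the 12,422 bp total in the table.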