Chenghua Zhang1, Wangqiu Deng1, Wenjuan Yan1, Taihui Li2.
Abstract
Cordyceps guangdongensis is an edible fungus that was approved as a novel food by the Chinese Ministry of Public Health in 2013. It also has broad potential for application in the pharmaceutical industry because of its many medicinal activities. In this study, the whole genome of C. guangdongensis GD15, a single-spore isolate from a wild strain, was sequenced and assembled using Illumina and PacBio sequencing technologies. The resulting genome is 29.05 Mb in size and comprises nine scaffolds with an average GC content of 57.01%. It is predicted to contain a total of 9150 protein-coding genes. Sequence identification and comparative analysis showed that the assembled scaffolds include two complete chromosomes and four single-end chromosomes, reflecting a high-level assembly. Gene annotation revealed a diversity of transposons that could contribute to genome size and evolution. In addition, approximately 15.57% and 12.01% of genes were annotated by KEGG and COG, respectively, as being involved in metabolic processes. Genes encoding CAZymes accounted for 3.15% of the total genes. A total of 435 transcription factors, involved in various biological processes, were also identified. Among these, fungal transcription regulatory proteins (18.39%) and fungal-specific transcription factors (19.77%) were the two largest classes. This genomic resource provides new insight into the relationship between phenotypic characters and genetic mechanisms in C. guangdongensis.
Keywords: Cordyceps; Genome Report; chromosome; transcription factors; transporters
Year: 2018 PMID: 29666196 PMCID: PMC5982816 DOI: 10.1534/g3.118.200287
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. General genomic features of Cordyceps guangdongensis. A: I, scaffolds, with different colors representing different scaffolds; II, gene density, given as the number of genes per 100 kb, with color intensity increasing from light blue to dark blue, dark, dark red, and light red. The density of non-coding RNA increases in color intensity from dark blue to light blue, white, light red, and dark red; III, percentage coverage of repetitive sequences, with color intensity increasing from light green to dark green, dark, dark red, and light red; IV, GC content, estimated as the percentage of G + C per 100 kb. B: Density of genomic elements, including genic and nongenic features, over the full genome assembly length, of which 40.5% is non-annotated sequence.
Assembly summary statistics of Cordyceps guangdongensis GD15 compared to other Cordyceps genomes
| Species | C. guangdongensis GD15 | | | |
|---|---|---|---|---|
| NCBI Bio Project | NRQP00000000 | AEVU00000000 | ANOV00000000 | MWMN00000000 |
| Assembly size (Mb) | 29.0 | 32.2 | 78.5 | 33.9 |
| Coverage fold | 183x | 147x | 241x | 80x |
| No. of scaffolds | 9 | 32 | 10603 | 599 (>1 kb) |
| N50 | 7.88 Mb | 0.11 Mb | 5.39 kb | 0.21 Mb |
| GC% | 57.0 | 51.4 | 46.1 | 53.0 |
| Repeat content (%) | 2.77 | 3.04 | 37.98 | 3.19 |
| Gene density (genes per Mb) | 315 | 301 | 87 | 286 |
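The N50 reported for GD15 can be cross-checked against the nine scaffold sizes listed in the chromosome analysis table of this record. The sketch below assumes the usual N50 definition (the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly); the function name `n50` is illustrative and does not come from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (standard definition)."""
    total = sum(lengths)
    running = 0
    # Walk scaffolds from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Scaffold sizes (bp) copied from the chromosome analysis table.
GD15_SCAFFOLDS = [
    8_817_043, 7_881_840, 5_000_199, 4_508_454, 2_058_248,
    614_660, 75_887, 68_140, 31_250,
]

print(n50(GD15_SCAFFOLDS))   # N50 in bp
print(sum(GD15_SCAFFOLDS))   # total assembly length in bp
```

With these inputs the N50 is 7,881,840 bp, which matches the 7.88 Mb in the table, and the total is 29,055,721 bp, close to the 29.05 Mb stated in the abstract.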
Chromosome analysis of Cordyceps guangdongensis GD15 genomic sequence
| Scaffold | Size (bp) | Start telomere | End telomere | Telomere status | Chromosome assignment |
|---|---|---|---|---|---|
| Scaffold1 | 8,817,043 | CCCTAA | TTAGGG | double-end | Complete chromosome |
| Scaffold2 | 7,881,840 | No | TTAGGG | single-ended | Chromosome fragment |
| Scaffold3 | 5,000,199 | CCCTAA | TTAGGG | double-end | Complete chromosome |
| Scaffold4 | 4,508,454 | CCCTAA | No | single-ended | Chromosome fragment |
| Scaffold5 | 2,058,248 | No | TTAGGG | single-ended | Chromosome fragment |
| Scaffold6 | 614,660 | CCCTAA | No | single-ended | Chromosome fragment |
| Scaffold7 | 75,887 | No | No | No | Mitochondrial genome |
| Scaffold8 | 68,140 | No | No | No | Fragment |
| Scaffold9 | 31,250 | No | No | No | Fragment |
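The classification in this table follows from whether the telomeric repeat is seen at each scaffold end. The sketch below is one possible way to express that rule, not the authors' pipeline: the window size, the minimum number of tandem repeats, and the function name `classify_scaffold` are all assumptions made for illustration. Only the motifs (CCCTAA at the start, TTAGGG at the end) and the status labels come from the table.

```python
def classify_scaffold(seq, window=200, min_repeats=2,
                      start_motif="CCCTAA", end_motif="TTAGGG"):
    """Label a scaffold by the presence of telomeric repeats at its two ends.

    window and min_repeats are hypothetical thresholds, not values from the paper.
    """
    seq = seq.upper()
    # Require a short tandem run of the motif inside each terminal window.
    has_start = (start_motif * min_repeats) in seq[:window]
    has_end = (end_motif * min_repeats) in seq[-window:]
    if has_start and has_end:
        return "double-end"      # candidate complete chromosome
    if has_start or has_end:
        return "single-ended"    # candidate chromosome fragment
    return "No"


# Toy sequences, not real GD15 data.
body = "ACGT" * 100
print(classify_scaffold("CCCTAA" * 3 + body + "TTAGGG" * 3))
print(classify_scaffold(body + "TTAGGG" * 3))
print(classify_scaffold(body))
```

On real data the terminal windows would usually be larger and some tolerance for degenerate repeats would be needed; the exact-match test here is kept deliberately simple.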
Genome annotation features of Cordyceps guangdongensis GD15
| Feature | Total number | Total length (bp) | Average length (bp) | Length / genome length (%) |
|---|---|---|---|---|
| Genes | 9,150 | 16,372,278 | 1,789.32 | 56.35 |
| Exons | 29,548 | 14,031,735 | 474.88 | 48.29 |
| CDS | 9,150 | 14,031,735 | 1,533.52 | 48.29 |
| Introns | 20,398 | 2,340,543 | 114.74 | 8.06 |
| tRNA | 111 | 9,519 | 85.75 | 0.03 |
| rRNA | 31 | 95,875 | 3,092.74 | 0.33 |
| sRNA | 121 | 7,369 | 60.9 | 0.025 |
| snRNA | 25 | 2,908 | 116.32 | 0.01 |
| miRNA | 26 | 1,766 | 67.92 | 0.006 |
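The derived columns of this table are simple ratios of the counts and total lengths. The sketch below recomputes them under one assumption that the record does not state explicitly: that the genome length used as the denominator is the sum of the nine scaffold sizes from the chromosome analysis table (29,055,721 bp). The helper names are illustrative.

```python
# Scaffold sizes (bp) copied from the chromosome analysis table.
SCAFFOLD_SIZES = [
    8_817_043, 7_881_840, 5_000_199, 4_508_454, 2_058_248,
    614_660, 75_887, 68_140, 31_250,
]
GENOME_BP = sum(SCAFFOLD_SIZES)  # assumed denominator for the percentage column

# (feature, total number, total length in bp) copied from the annotation table.
FEATURES = [
    ("gene", 9_150, 16_372_278),
    ("Exons", 29_548, 14_031_735),
    ("Introns", 20_398, 2_340_543),
]


def summarize(count, total_bp, genome_bp=GENOME_BP):
    """Return (average length in bp, percent of genome), both rounded to 2 places."""
    return round(total_bp / count, 2), round(100 * total_bp / genome_bp, 2)


def gene_density_per_mb(n_genes, genome_bp=GENOME_BP):
    """Genes per megabase of assembly."""
    return n_genes / (genome_bp / 1e6)


for name, count, total_bp in FEATURES:
    avg_len, pct = summarize(count, total_bp)
    print(f"{name}: average {avg_len} bp, {pct}% of genome")

print(round(gene_density_per_mb(9_150)), "genes per Mb")
```

Under that assumption the gene row comes out at 1,789.32 bp and 56.35%, and the gene density rounds to 315 genes per Mb, in agreement with the tables. Some rows in the table, such as tRNA, appear to be truncated rather than rounded, so they can differ in the last digit.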
Figure 2. COG functional classification of proteins in the Cordyceps guangdongensis genome.
Transposable element repeat class analysis in Cordyceps guangdongensis
| Repeat element family | Number of unique elements in family | Cumulative length (bp) | % of genome assembly |
|---|---|---|---|
| LINE | 343 | 57,004 | 0.19618 |
| L1 | 14 | 660 | |
| Others | 329 | 56,344 | |
| LTR | 661 | 259,862 | 0.89435 |
| Copia | 269 | 174,266 | |
| DIRS | 4 | 324 | |
| ERV1 | 17 | 1,170 | |
| Gypsy | 314 | 79,093 | |
| Pao | 17 | 1,741 | |
| Others | 40 | 3,268 | |
| SINE | 15 | 1,304 | 0.00448 |
| Alu | 1 | 51 | |
| Others | 14 | 1,253 | |
| | 14 | 1,154 | 0.00397 |
| CMC-EnSpm | 44 | 3,001 | |
| Dada | 8 | 485 | |
| Ginger | 2 | 141 | |
| hAT | 4 | 440 | |
| Merlin | 3 | 108 | |
| MULE-MuDR | 29 | 4,628 | |
| P | 2 | 134 | |
| PIF-Harbinger | 11 | 729 | |
| PiggyBac | 63 | 19,337 | |
| Sola | 29 | 1,840 | |
| TcMar-Tc1 | 14 | 3,780 | |
| Zisupton | 1 | 88 | |
| Others | 284 | 39,028 | |
Figure 3. Transcription factor analysis in the Cordyceps guangdongensis genome.