Abstract
Morchella are macrofungi and are also called morels, as they exhibit a morel-like upper cap structure. Morels contain abundant essential amino acids, vitamins and biologically active compounds, which provide substantial health benefits. Approximately 80 species of Morchella have been reported, and even more species have been isolated. However, the lack of wild Morchella resources and the difficulties associated with culturing Morchella have caused a shortage in the morels available for daily consumption. Additionally, in-depth genomic and morphological studies are still needed. In this study, to provide genomic data for further investigations of culturing techniques and the biological functions of Morchella sextelata (M. sextelata), de novo genome sequencing was carried out on the Illumina HiSeq 4000 platform using both the Illumina 150 and PacBio systems. The final estimated genome size of M. sextelata was 52.93 Mb, containing 59 contigs and a GC content of 47.37%. A total of 9,550 protein-coding genes were annotated. In addition, the repeat sequences, gene components and gene functions were analyzed using various databases. Furthermore, the secondary metabolite gene clusters and the predicted structures of their products were analyzed. Finally, a genomic comparison of different species of Morchella was performed.
Year: 2019 PMID: 31653932 PMCID: PMC6814724 DOI: 10.1038/s41598-019-51831-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Horizontal coordinates represent the sequencing quality. The bars correspond to the left vertical coordinate, which shows the reads relevant to the quality of each sequencing data set.
Figure 2. Horizontal coordinates represent sequencing read lengths. The bars correspond to the left vertical coordinate, which shows the read number relevant to each read length.
The contig statistics of the assembled genome of M. sextelata.
| Sample ID | Contig | Max Length (bp) | N50 Length (bp) | Total Length (bp) | Sequence GC (%) |
|---|---|---|---|---|---|
| | 59 | 4,823,818 | 1,569,782 | 52,925,331 | 47.37 |
Note: N50 length is the length of the contig at which, with the contigs arranged from longest to shortest, the cumulative length first reaches 50% of the total contig length.
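The N50 definition in the note can be sketched in a few lines of Python. This is a minimal illustration, not the assembler's own routine: the function name `n50` and the toy lengths are hypothetical, and the individual M. sextelata contig lengths are not listed here, so the reported value of 1,569,782 bp cannot be recomputed from this record.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk the contigs from longest to shortest, accumulating their lengths.
    for length in sorted(lengths, reverse=True):
        running += length
        # The first contig that brings the running sum to at least half
        # of the total length is the N50 contig.
        if 2 * running >= total:
            return length


# Toy lengths for illustration only (not the M. sextelata contigs).
toy_lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_lengths))  # prints 8: 10 + 9 + 8 = 27, half of the total 54
```

Comparing `2 * running` with `total` keeps the test in integer arithmetic, so an odd total length cannot cause a rounding error.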
Figure 3. The horizontal coordinate represents the length of protein-coding genes, and the vertical coordinate shows the number of protein-coding genes. The number written on the top of the bar is the number of genes within each range of gene lengths.
The summary of the final genome size and protein-coding gene annotation.
| Item | Number or content |
|---|---|
| Genome size (bp) | 52,925,331 |
| Gene number | 9,550 |
| Gene total length (bp) | 13,107,305 |
| GC content in genes (%) | 53.53 |
| Gene content in genome (%) | 24.77 |
| Gene average length (bp) | 1,372 |
| Intergenic region length (bp) | 39,818,026 |
| Intergenic region GC content (%) | 45.34 |
| Intergenic region content in genome (%) | 75.23 |
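The derived rows of this table follow from three primary values: genome size, gene number and total gene length. The sketch below recomputes them as a consistency check. It assumes that the last three rows describe the part of the genome outside annotated genes, which the arithmetic supports, and the variable names are illustrative.

```python
# Primary values taken from the table above.
genome_size = 52_925_331        # bp
gene_number = 9_550
gene_total_length = 13_107_305  # bp
gene_gc = 53.53                 # % GC within genes
non_gene_gc = 45.34             # % GC outside genes (assumed meaning of the row)

# Derived values.
non_gene_length = genome_size - gene_total_length
gene_percent = 100 * gene_total_length / genome_size
non_gene_percent = 100 * non_gene_length / genome_size
mean_gene_length = gene_total_length / gene_number

# A length-weighted mean of the two GC values should recover the
# whole-genome GC content reported for the assembly (47.37%).
genome_gc = (gene_percent * gene_gc + non_gene_percent * non_gene_gc) / 100

print(f"non-gene length: {non_gene_length:,} bp")
print(f"gene content: {gene_percent:.2f}%")
print(f"non-gene content: {non_gene_percent:.2f}%")
print(f"mean gene length: {mean_gene_length:.0f} bp")
print(f"length-weighted genome GC: {genome_gc:.2f}%")
```

The recomputed figures agree with the table to the precision shown, including the genome-wide GC content of 47.37%.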
The summary of gene function annotation in different databases.
| Database | Enriched gene number |
|---|---|
| NR | 6009 |
| Swiss-Prot | 2702 |
| KEGG | 5839 |
| KOG | 2148 |
| TCDB | 324 |
| GO | 5876 |
| PHI | 810 |
| DFVF | 635 |
| P450 | 61 |
| Secreted protein | 542 |
| CAZy | 290 |
*NR: non-redundant protein database, KOG: eukaryotic orthologous groups, TCDB: transporter classification database, PHI: pathogen host interactions, DFVF: database of fungal virulence factors, CAZy: carbohydrate-active enzymes database.
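To put these counts in context, each one can be expressed as a share of the 9,550 annotated protein-coding genes. The sketch below does this for a subset of the databases. It assumes that each count is a number of distinct genes with at least one hit in that database, which this record does not state explicitly.

```python
# Total protein-coding genes, from the genome annotation summary.
TOTAL_GENES = 9_550

# Counts copied from the table above (subset of rows).
annotated = {
    "NR": 6009,
    "Swiss-Prot": 2702,
    "KEGG": 5839,
    "KOG": 2148,
    "TCDB": 324,
    "GO": 5876,
    "PHI": 810,
    "DFVF": 635,
    "P450": 61,
    "CAZy": 290,
}

# Percentage of all protein-coding genes with an annotation per database.
percent_annotated = {
    db: 100 * count / TOTAL_GENES for db, count in annotated.items()
}

# Print databases from the most to the least covered.
for db, pct in sorted(percent_annotated.items(), key=lambda kv: -kv[1]):
    print(f"{db:<10} {annotated[db]:>5} {pct:5.1f}%")
```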
Figure 4. (A) The genetic map of M. sextelata. The outer ring is the gene position on the M. sextelata sequence. The ring (a) represents the information of coding gene positions. (b–d) represent the information of gene functions in the KOG, KEGG, and GO databases, respectively. (e) shows the ncRNA information. The outside ring is the positive chain, and the inner ring is the negative chain. The colors correspond to different functional characteristics predicted in the GO, KEGG, and KOG databases and different ncRNAs. (B) GO legend for (A). (C) KEGG legend for (A). (D) KOG legend for (A). (E) ncRNA legend for (A).
Summary of Carbohydrate-Active Enzymes database annotation.
| CAZy_class | Match number |
|---|---|
| CBM | 57 |
| CE | 13 |
| GH | 159 |
| GT | 41 |
| PL | 20 |
| AA | 44 |
Figure 5. The horizontal coordinate represents the gene cluster name, and the vertical coordinate shows the number of gene clusters and genes in each cluster. Blue bars correspond to the gene number in each gene cluster, while red bars represent the number of clusters.
The summary of secondary metabolite gene clusters.
| Clusters | Cluster number | Gene number |
|---|---|---|
| terpene | 4 | 24 |
| nrps | 1 | 9 |
| t1pks | 1 | 9 |
| other | 4 | 31 |
The statistics of secondary metabolite gene clusters.
| Cluster | Type | Contig | Genes |
|---|---|---|---|
| 1 | terpene | Contig3 | A6295; A6296; A6297; A6298 |
| 2 | other | Contig4 | A7647; A7648; A7650; A7651; A7652; A7653; A7654; A7655; A7656 |
| 3 | other | Contig8 | A9156; A9157; A9158; A9159; A9160; A9161; A9162; A9163 |
| 4 | terpene | Contig11 | A0451; A0452; A0453; A0454; A0455 |
| 5 | terpene | Contig14 | A1287; A1288; A1289; A1290; A1291; A1292; A1293; A1295; A1296; A1297 |
| 6 | t1pks | Contig20 | A3426; A3427; A3428; A3429; A3430; A3431; A3432; A3433; A3434 |
| 7 | nrps | Contig27 | A4555; A4556; A4557; A4558; A4559; A4560; A4561; A4562; A4563 |
| 8 | other | Contig28 | A4609; A4610; A4611; A4612; A4613; A4614; A4615; A4616 |
| 9 | other | Contig33 | A5824; A5825; A5826; A5827; A5828; A5829 |
| 10 | terpene | Contig36 | A6021; A6022; A6023; A6024; A6025 |
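The per-type totals in the summary of secondary metabolite gene clusters can be re-derived from the per-cluster gene lists above. The sketch below tallies clusters and genes by type. The gene lists are copied from the table, while the data layout and variable names are illustrative.

```python
from collections import defaultdict

# (cluster type, semicolon-separated gene IDs), one entry per cluster 1-10.
clusters = [
    ("terpene", "A6295; A6296; A6297; A6298"),
    ("other", "A7647; A7648; A7650; A7651; A7652; A7653; A7654; A7655; A7656"),
    ("other", "A9156; A9157; A9158; A9159; A9160; A9161; A9162; A9163"),
    ("terpene", "A0451; A0452; A0453; A0454; A0455"),
    ("terpene", "A1287; A1288; A1289; A1290; A1291; A1292; A1293; A1295; A1296; A1297"),
    ("t1pks", "A3426; A3427; A3428; A3429; A3430; A3431; A3432; A3433; A3434"),
    ("nrps", "A4555; A4556; A4557; A4558; A4559; A4560; A4561; A4562; A4563"),
    ("other", "A4609; A4610; A4611; A4612; A4613; A4614; A4615; A4616"),
    ("other", "A5824; A5825; A5826; A5827; A5828; A5829"),
    ("terpene", "A6021; A6022; A6023; A6024; A6025"),
]

# For each type, track [number of clusters, number of genes].
summary = defaultdict(lambda: [0, 0])
for cluster_type, genes in clusters:
    gene_ids = [g.strip() for g in genes.split(";") if g.strip()]
    summary[cluster_type][0] += 1
    summary[cluster_type][1] += len(gene_ids)

for cluster_type, (n_clusters, n_genes) in summary.items():
    print(f"{cluster_type:<8} clusters={n_clusters} genes={n_genes}")
```

The tallies reproduce the summary table: 4 terpene clusters with 24 genes, 1 nrps cluster with 9, 1 t1pks cluster with 9, and 4 clusters of other types with 31.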
Figure 6. (A) The predicted structure of the product of gene cluster 6; (B) the predicted structure of the product of gene cluster 7.
The statistics of P450 genes.
| P450 class | Class name | Gene number | Gene |
|---|---|---|---|
| 1 | P450, CYP52 | 7 | A7570, A0041, A7809, A2469, A1467, A3417, A4602 |
| 2 | E-class P450, CYP2D | 2 | A7321, A1285 |
| 3 | Undetermined | 8 | A8534, A8641, A1943, A1303, A0047, A0110, A1004, A1375 |
| 4 | Cytochrome P450 | 11 | A6397, A8907, A2892, A7709, A0184, A3635, A4884, A6462, A2432, A5836, A5236 |
| 5 | E-class P450, group I | 25 | A2224, A0822, A0168, A1345, A7129, A6827, A6624, A9296, A7194, A0622, A2360, A3045, A8405, A6914, A1184, A1955, A9488, A5674, A9367, A9393, A4439, A2648, A8332, A5691, A0796 |
| 6 | E-class P450, group IV | 8 | A1869, A6522, A6749, A0104, A0103, A3061, A4737, A5331 |
Genomic assembly statistics of different species of Morchella.
| No. | Species | Strain | Assembly (Mb) | GC% | Scaffolds | Contigs | N50 | Genome coverage | INSDC | Sequencing | Assembly method |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | | MG91 | 49.62 | 47.1 | 3707 | 5121 | 28400 | 58x | QLOX00000000.1 | Illumina | platanus version v. 1.2.1 |
| 2 | | MG90 | 73.46 | 46.0 | 7793 | 10613 | 26474 | 57x | QMFK00000000.1 | Illumina | platanus version v. 1.2.1 |
| 3 | | M04 M26 | 51.08 | 47.3 | 106 | 110 | 958716 | 298x | QOKS00000000.1 | Illumina HiSeq | AllPaths v. 44849 |
| 4 | | CCBAS932 | 48.21 | 47.2 | 540 | 2145 | 52248 | 67.8x | PZQV00000000.1 | Illumina | AllPathsLG v. R47710 |
| 5 | | M04 M24 | 48.86 | 47.0 | 323 | 504 | 362388 | 210x | QORM00000000.1 | Illumina HiSeq | AllPaths v. 44849 |
| 6 | | MG91 | 49.96 | — | 5231 | 5241 | 37765 | 151.17x | PYSJ00000000.1 | Illumina GAIIx | SPAdes v. 3.0 |
| 7 | | MG113 | 51.40 | — | 9172 | 11637 | 28426 | 88x | QMFJ00000000.1 | Illumina | platanus version v. 1.2.1 |
Figure 8. Left: the clustering tree of dispensable genes. Top: the clustering tree of samples. Middle: the expression level of dispensable genes in Morchella species (or strains) with different identity values, which are described with different colors, as shown on the upper right.