
The whole-genome sequence of the novel yeast species Metschnikowia persimmonesis isolated from medicinal plant Diospyros kaki Thunb.

Endang Rahmat1,2, Inkyu Park2, Youngmin Kang1,2.   

Abstract

The new yeast Metschnikowia persimmonesis KCTC 12991BP (KIOM G15050 strain) exhibits strong antimicrobial activity against some pathogens. This activity may be related to medicinal secondary metabolites whose biosynthetic genes could be found in the genome of this species. Therefore, to explore its potential for beneficial applications, including medicinal use, we report a high-quality whole-genome assembly of M. persimmonesis produced with a PacBio RS II sequencer. The final draft assembly consisted of 16 scaffolds with a GC content of 45.90% and contained a fairly complete set (82.8%) of BUSCO genes from the Saccharomycetales lineage data set. The total length of the genome was 16.473 Mb, with a scaffold N50 of 1.982 Mb. Annotation of the M. persimmonesis genome revealed the presence of 7029 genes and 6939 functionally annotated proteins. Based on the analysis of phylogenetic relationships and the average nucleotide identity, M. persimmonesis was shown to be a novel species within the Metschnikowia genus. This finding is expected to contribute significantly to the discovery of high-value natural products from M. persimmonesis, as well as to comparative analyses of genome biology and evolution within Metschnikowia species.
© The Author(s) 2021. Published by Oxford University Press on behalf of Genetics Society of America.

Keywords: Metschnikowia sp.; de novo assembly; genome annotation; next-generation sequencing; yeast

Year:  2021        PMID: 34849782      PMCID: PMC8527480          DOI: 10.1093/g3journal/jkab246

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Introduction

Persimmon (Diospyros kaki Thunb.) is an edible medicinal fruit belonging to the Ebenaceae family. It is commonly cultivated in countries such as Korea, China, Japan, Brazil, Turkey, and Italy (Butt ). The calyx of Diospyros kaki Thunb. exhibits antimicrobial, anticancer, anti-inflammatory, antioxidant, anticoagulant, antithrombotic, antigenotoxic, diaphragm contraction inhibitory, elastase inhibitory, and tyrosinase inhibitory effects (Jang ; Park ; Choi 9). In traditional Korean medicine (TKM), the calyx is described as having a mild nature and a bitter, astringent taste, as being poison-free (Heo 2001), and as releasing blocked Qi. The D. kaki fruit stalk is also inhabited by a number of microorganisms that could become a valuable new source of medicine (Choi ). Based on microbial community analysis, Metschnikowia sp. are the dominant microorganisms associated with the native varieties of Diospyros kaki fruits cultivated in Gyeongnam Province, Korea (Choi ).

Metschnikowia is one of the most diverse genera of ascomycetous yeasts, consisting of more than 80 identified species isolated from many types of hosts (Lachance 2016). Metschnikowia (Saccharomycetaceae) species form needle-shaped ascospores in their sexual state (Lachance 2011), and some species in this genus are used as commercial biocontrol agents against molds on fruit (Hershkovitz ). A novel yeast species, Metschnikowia persimmonesis, was later discovered in the calyx of Korean Diospyros kaki cultivars originating from various parts of South Korea (Kang ), and has been registered under a Korean patent (10-2016-0137873) and an international patent (PCT-KR2017-010681). According to these patents, M. persimmonesis KIOM G15050 possesses antimicrobial activity against several pathogenic microorganisms, including Botrytis cinerea, Fusarium oxysporum, Paecilomyces inflatus, Colletotrichum higginsianum, Sclerotinia sclerotiorum, and Alternaria alternata (Kang ). These activities are possibly related to the presence of medicinal secondary metabolites in M. persimmonesis, such as cyclodipeptides, benzoic acid, and tyrosol, among others. Owing to the mycostatic effects of this strain, it can be used as a natural preservative that is much safer than chemical preservatives.

Advances in next-generation sequencing technologies have rapidly boosted fundamental and applied studies on fungal species (Shendure ; Rahmat and Kang 2020). To date, many Metschnikowia genomes have been successfully analyzed using various next-generation sequencing platforms, including the genome sequence of Metschnikowia fructicola (Piombo ), the draft genome of Metschnikowia australis (Batista ), and the draft genome of Metschnikowia pulcherrima subclade, UCD127 (Venkatesh ). Here, we report the first genome assembly of M. persimmonesis KCTC 12991BP (KIOM G15050 strain). A hybrid and hierarchical de novo assembly approach was applied to assemble the genome of M. persimmonesis from long reads generated on the Pacific Biosciences (PacBio) RS II sequencer.

Materials and methods

DNA sample collection, library construction, and sequencing

The yeast M. persimmonesis was isolated from the calyx of local Diospyros kaki, acquired from Gyeongnam Province, South Korea (35.141180 N 128.144161 E). A single-colony isolate was cultured in Potato-Dextrose or Luria-Bertani medium for 48 h at 25°C, with shaking (100 rpm) in dark conditions. The strain was earlier characterized at the phenotypic, physiological, and molecular levels and was deposited in the Korean Collection for Type Cultures (KCTC) (Kang ). Genomic DNA was extracted using a DNA extraction kit (MG™ gDNA purification kit, www.macrogen.com). The quality and integrity of the genomic DNA (gDNA) were evaluated by 0.8% agarose gel electrophoresis and densitometry, with comparison to suitable size standards. The yield and purity of the gDNA were verified using a Nanodrop 2000 spectrophotometer (Thermo Fisher Scientific Inc., Wilmington, DE, USA) and a Qubit 2.0 fluorimeter (Life Technologies Ltd., Paisley, UK). The library was prepared using 16 µg of gDNA, which was sheared into 20 kb fragments using g-tubes (Covaris, Inc., Woburn, MA, USA). DNA integrity and actual size distribution were determined using a Bioanalyzer 2100 (Agilent Technologies, Santa Clara, CA, USA). The fragmented DNA was purified using 0.45 × AMPure® PB beads (Pacific Biosciences, USA). The DNA Template Prep Kit 1.0 was used to build the SMRTbell 20 kb template library following the 20 kb insert library protocol (Pacific Biosciences; Menlo Park, CA, USA). DNA size selection was performed using BluePippin (Sage Science, Beverly, MA, USA). The library was sequenced on 24 single-molecule real-time (SMRT) cells (8 with BluePippin size selection and 16 without) using P6/C4 chemistry, and a 240 min collection protocol was applied with the stage start feature to achieve longer reads.

De novo assembly of the genome

Initially, the pre-assembly process was performed, followed by correction and filtering of the reads. The length cutoff for seed reads used for initial mapping was 6000 bp. The pre-assembly option details are as follows: length_cutoff_pr: 6000 bp; falcon_sense_option: --output_multi --min_idt 0.70 --min_cov 4 --local_match_count_threshold 2 --max_n_read 200 --n_core 6; overlap_filtering_setting: --max_diff 240 --max_cov 360 --min_cov 5 --bestn 10. De novo genome assembly of PacBio subreads was conducted using FALCON (v0.2.1) to acquire primary contigs and haplotigs (Chin ). The sequence of the assembled DNA was polished using Quiver (v1). Errors were rectified using SMRT Pipe (v2.3.0.139497) after evaluating the outcomes of different FALCON parameters during the pre-assembly step. Mapping between the assembled contigs and the PacBio long reads was carried out using BLASR (Chaisson and Tesler 2012).

Protein-coding gene prediction and functional annotation

Gene prediction was carried out on the assembled draft sequence of M. persimmonesis together with the estimated transcriptome and protein sequence data (GCF_001664035.1) available on NCBI. The pipeline first builds a gene prediction model with SNAP (2012/05/17). These prediction results were then integrated using the MAKER annotation software (v2.28) (Cantarel ). For additional annotation, the consensus sequences were searched against the GenBank non-redundant and Swiss-Prot databases using blastx (v2.4.0+) (E value threshold 1e-3) (Camacho ). Annotation and classification of the clusters of orthologous groups (COGs) correlated with M. persimmonesis genes were performed using the eggNOG tool (http://eggnogdb.embl.de/) (Huerta-Cepas ).

Genome quality assessment

The completeness of the assembled and annotated draft genome was evaluated using BUSCO (v5) (Seppey ), which we used to determine how many core genes of the Saccharomycetales lineage were present in our assembly. Furthermore, the contiguity and completeness of the draft genome were validated by mapping RNA sequences from the same M. persimmonesis strain using CLC Genomics Workbench ver. 7.0.3 with the default parameters.

Genome comparison, phylogenetic analysis, and ANI calculation

A genome comparison analysis was performed with closely related species (Metschnikowia bicuspidata and M. fructicola). We used the cross-match program (v1.080812) to obtain information on the aligned positions and the Circos program (v0.69-3) to draw circular maps (Krzywinski ). To identify phylogenetic relationships, we used orthologous genes from 17 available Metschnikowia genomes. Of these, 16 genome sequences were downloaded from NCBI GenBank (Supplementary Figure S1). First, we obtained single-copy genes from BUSCO (v5) (Seppey ) with default parameters. A FASTA file was generated for each coding sequence (CDS) ID and aligned using the MAFFT program (v7) (Katoh and Standley 2013) with default parameters. A total of 17 alignment files (299 total CDSs of analyzed Metschnikowia species) were merged using in-house scripts. The alignment files were filtered to remove ambiguously aligned regions using Gblocks (v5) (Castresana 2000). A phylogenetic tree was constructed using RAxML (v8.2.12) (Stamatakis 2014) with the GTRGAMMA model. The constructed tree was visualized using iTOL (v6) (https://itol.embl.de/). The average nucleotide identity (ANI) value between M. persimmonesis and M. fructicola was calculated by pairwise genome comparison on the JSpeciesWS webpage (http://jspecies.ribohost.com/jspeciesws/). The ANIb algorithm was used for the analysis.
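The merging of per-gene alignments is described only as being done with in-house scripts, which are not shown. As a rough illustration of what such a step typically involves, the sketch below concatenates per-gene alignments into a single supermatrix. The function name, the dict-based data layout, the toy sequences, and the gap padding for species missing from a gene are all assumptions made for this example, not the authors' implementation.

```python
from typing import Dict, List


def concatenate_alignments(gene_alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Concatenate per-gene alignments into one supermatrix keyed by species."""
    species = sorted({name for aln in gene_alignments for name in aln})
    parts: Dict[str, List[str]] = {name: [] for name in species}
    for aln in gene_alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for name in species:
            # Pad species missing from this gene with gaps so columns stay aligned.
            parts[name].append(aln.get(name, "-" * width))
    return {name: "".join(chunks) for name, chunks in parts.items()}


# Toy data (hypothetical sequences, for illustration only).
toy = [
    {"M_persimmonesis": "ATG-CA", "M_fructicola": "ATGACA", "S_cerevisiae": "ATGTCA"},
    {"M_persimmonesis": "GGT", "M_fructicola": "GGC"},
]
supermatrix = concatenate_alignments(toy)
for name, seq in supermatrix.items():
    print(name, seq)
```

With single-copy orthologs shared by every species, as in the analysis described here, the padding branch would never be exercised; it is included only so the sketch stays well defined on incomplete inputs.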

Results and discussion

Assembly, gene prediction, and functional annotation of the genome of Metschnikowia persimmonesis

The genome assembly (accession number JACBPP000000000) of M. persimmonesis (type strain KIOM G15050 = KCTC 12991BP) was created using sequence data generated on the PacBio RS II sequencer. The filtered subreads had the following statistics: mean subread length, 5921 bp; total number of bases, 1,272,475,916 bp; number of reads, 214,881; and N50, 9372 bp (Supplementary Figure S2). The sequence data were assembled using FALCON (v0.2.1), producing a draft genome of 17 contigs with an N50 of 1,981,760 bp (Supplementary Figure S3). After running the eukaryotic annotation pipeline, we removed the mitochondrial DNA sequence (contig15), leaving 16 contigs in the final draft genome. The estimated genome size was approximately 16.473 Mb. Moreover, histogram analysis of biallelic single-nucleotide polymorphism frequencies showed that the genome is diploid. The assembled M. persimmonesis genome has 16 final scaffolds, fewer than the M. fructicola and M. bicuspidata assemblies (Table 1). The long reads produced by PacBio sequencing technology can result in a more complete genome. A summary of the gap-free draft genome is presented in Table 1. MAKER predicted a total of 7,029 transcripts, and 6,939 proteins were annotated with blastP (v2.4.0+) (Supplementary Table S1). COGs correlated with M. persimmonesis genes are shown in Figure 1A.
Table 1

 Statistics of M. persimmonesis final genome assembly compared to M. fructicola and M. bicuspidata

Statistic            M. persimmonesis   M. fructicola   M. bicuspidata
Total length (bp)    16,473,584         26,126,100      16,055,203
Scaffolds            16                 93              48
N50 (bp)             1,981,760          957,836         2,544,904
MaxLength (bp)       2,532,630          2,548,689       5,390,915
MinLength (bp)       17,888             14,113          2,018
AvgLength (bp)       1,029,599          280,925         334,483
GC (%)               45.90              45.90           48
Number of gaps       0                  0               546
N's                  0                  0               377,479
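For readers unfamiliar with the assembly statistics in Table 1, the following sketch shows how N50 is conventionally computed and re-derives two of the reported values from the published totals. The `n50` helper and the toy length list are illustrative assumptions; the individual scaffold lengths are not given in the text, so the reported N50 itself cannot be recomputed here. The depth figure is a back-of-the-envelope estimate (filtered subread bases divided by assembly length) and is not a value stated by the authors.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the length L such that sequences of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


# Toy example (hypothetical lengths): total is 20, so the first sequence already covers half.
toy_n50 = n50([10, 5, 3, 2])

# Values reported in the text and in Table 1.
total_length = 16_473_584      # assembly length in bp
scaffolds = 16
subread_bases = 1_272_475_916  # filtered subread bases
subread_count = 214_881

avg_scaffold = total_length // scaffolds      # compare with AvgLength in Table 1
mean_subread = subread_bases // subread_count  # compare with the reported mean subread length
approx_depth = subread_bases / total_length    # rough sequencing depth, not reported in the text

print(toy_n50, avg_scaffold, mean_subread, round(approx_depth, 1))
```

Both derived values agree with the reported figures (AvgLength 1,029,599 bp and mean subread length 5921 bp), which is a quick consistency check on the table.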
Figure 1

Metschnikowia persimmonesis genome characteristic and comparison. (A) Clusters of orthologous groups (COG) distribution of protein coding genes in M. persimmonesis based on EggNog database. (B) Circos plot showing relationship between M. persimmonesis and M. fructicola genome. Draft sequence of M. persimmonesis and reference scaffolds of M. fructicola are arranged around the circumference of the figure. (C) Circos plot showing relationship between M. persimmonesis and M. bicuspidata genome. Arrangement around the circumference of the figure contains draft sequence of M. persimmonesis and reference scaffolds of M. bicuspidata. Respective color lines crossing the circle represent a unique alignment and similarity.

Analysis of genome completeness using BUSCO showed that 82.8% of the BUSCO orthologs searched in the M. persimmonesis assembly were complete (79.3% single copy and 3.5% duplicated), and only a few were fragmented (7.7%) or missing (9.5%) (Table 2). This value is reasonably high and similar to the BUSCO results for other Metschnikowia genomes, such as M. fructicola (86.8%), M. pulcherrima (85%), and M. reukaufii (87%) (Supplementary Figure S4). Further validation using RNA-Seq reads of M. persimmonesis KIOM G15050 showed that about 93.8% of them mapped in pairs to the final draft genome, indicating its high quality in terms of contiguity and completeness (Supplementary Table S2).
Table 2

 Statistics of the genome completeness using BUSCO

Summary: C: 82.8% [S: 79.3%, D: 3.5%], F: 7.7%, M: 9.5%, n: 1711

Category                               Count
Complete BUSCOs (C)                    1416
Complete and single-copy BUSCOs (S)    1356
Complete and duplicated BUSCOs (D)     60
Fragmented BUSCOs (F)                  132
Missing BUSCOs (M)                     163
Total BUSCO groups searched            1711
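The percentages in the BUSCO summary line follow directly from the counts in Table 2. The short sketch below recomputes them; the variable names and the `pct` helper are illustrative choices, and only the counts themselves come from the table.

```python
# Counts taken from Table 2.
counts = {"S": 1356, "D": 60, "F": 132, "M": 163}
total = 1711

# Complete BUSCOs are the sum of single-copy and duplicated ones.
complete = counts["S"] + counts["D"]


def pct(n: int) -> str:
    """Format a count as a percentage of all BUSCO groups searched."""
    return f"{100 * n / total:.1f}%"


summary = (
    f"C: {pct(complete)} [S: {pct(counts['S'])}, D: {pct(counts['D'])}], "
    f"F: {pct(counts['F'])}, M: {pct(counts['M'])}, n: {total}"
)
print(summary)
```

Note that the 82.8% completeness figure covers both single-copy and duplicated orthologs; the single-copy fraction alone is 79.3%.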

Genome comparison

A comparison of genome sizes between M. persimmonesis and two closely related species (Table 1) showed that the genome of the M. persimmonesis strain used in this study (16.473 Mb, 16 scaffolds) was similar in size to that of M. bicuspidata (16.055 Mb, 48 scaffolds) and considerably smaller than that of M. fructicola (26.126 Mb, 93 scaffolds). The comparatively larger genome size of M. fructicola is probably due to a whole-genome duplication event (Piombo ). The final genome assemblies of M. persimmonesis and M. fructicola contain no gaps, whereas the M. bicuspidata assembly contains a relatively high number of gaps (546). This is because both the M. persimmonesis and M. fructicola genomes were assembled from long-read data, whereas the M. bicuspidata genome was sequenced with a short-read sequencer. Short-read sequence data are more difficult to assemble because of sequencing biases, repetitive genomic features, genomic polymorphism, many gaps, and other complicating factors (English ). Whole-genome alignment between M. persimmonesis (query) and M. fructicola (subject) showed that 99.01% of M. persimmonesis sequences were aligned to 64.02% of M. fructicola sequences, with substitution, deletion, and insertion ratios of 6.54, 0.71, and 0.62, respectively (Supplementary Table S3). Meanwhile, genome alignment of M. persimmonesis with M. bicuspidata (subject) had 33.64% query coverage and a 0.04% query overlap region, compared to 32.44% subject coverage and a 2.27% subject overlap region, with substitution, deletion, and insertion ratios of 20.58, 0.38, and 0.33, respectively (Supplementary Table S4). The relationship and genomic interval between M. persimmonesis, M. fructicola, and M. bicuspidata are shown in Figure 1, B and C.

Phylogenetic relationship

To identify the phylogenetic relationships between M. persimmonesis and other Metschnikowia species, we aligned 299 conserved single-copy protein-coding sequences shared by the 17 Metschnikowia species (Figure 2). We used the yeast model species Saccharomyces cerevisiae as an outgroup. This whole-genome-based phylogenetic tree revealed that M. persimmonesis groups with M. fructicola and M. pulcherrima and shares the closest phylogenetic relationship with M. fructicola. The result is consistent with our previous report based on 16S rRNA gene sequences, which showed that M. persimmonesis is a sister clade of M. fructicola (Kang ). Furthermore, we performed ANI analysis to ascertain whether M. persimmonesis is really a new species distinct from M. fructicola. The ANI value between M. fructicola and M. persimmonesis was 94.19%, which is below the 95-96% threshold used as the boundary for species circumscription. Therefore, the combination of the ANI value, the whole-genome-based phylogenetic tree, and the previous 16S rRNA sequence result (Kang ) confirmed that M. persimmonesis is a novel species within the genus Metschnikowia.
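The species-boundary argument reduces to comparing the reported ANI with the 95-96% range. A minimal sketch of that comparison is given below; the function name and the choice to test both ends of the range are assumptions for illustration, and the ANI value itself was obtained by the authors from JSpeciesWS rather than computed here.

```python
def below_species_boundary(ani_percent: float, boundary: float) -> bool:
    """Return True when the ANI falls below the given species boundary (in percent)."""
    return ani_percent < boundary


reported_ani = 94.19  # M. persimmonesis vs M. fructicola, as reported in the text

# Check against both ends of the 95-96% range cited in the text.
distinct_at_95 = below_species_boundary(reported_ani, 95.0)
distinct_at_96 = below_species_boundary(reported_ani, 96.0)
print(distinct_at_95, distinct_at_96)
```

Because 94.19% lies below even the lower end of the range, the conclusion does not depend on which cutoff within 95-96% is chosen.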
Figure 2

Phylogenetic relationship of 17 Metschnikowia strains. The whole genome-based phylogeny tree was constructed using 299 CDSs of available Metschnikowia species genome. Yeast model species Saccharomyces cerevisiae was used as an outgroup.


Conclusion

Using long reads from the Pacific Biosciences (PacBio) RS II sequencer and high-coverage RNA-seq data, we have generated a high-quality draft genome assembly and annotation of the novel species M. persimmonesis, providing an important genomic resource for future study of this endophytic yeast. Comprehensive study of these M. persimmonesis genomic assets could provide further insight into evolutionary biology within the Metschnikowia genus, genetic adaptations to highly distinct environments, and the discovery of highly valuable natural products.
References (first 10 of 21)

1.  Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis.

Authors:  J Castresana
Journal:  Mol Biol Evol       Date:  2000-04       Impact factor: 16.240

2.  Circos: an information aesthetic for comparative genomics.

Authors:  Martin Krzywinski; Jacqueline Schein; Inanç Birol; Joseph Connors; Randy Gascoyne; Doug Horsman; Steven J Jones; Marco A Marra
Journal:  Genome Res       Date:  2009-06-18       Impact factor: 9.043

3.  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors:  Kazutaka Katoh; Daron M Standley
Journal:  Mol Biol Evol       Date:  2013-01-16       Impact factor: 16.240

4.  BLAST+: architecture and applications.

Authors:  Christiam Camacho; George Coulouris; Vahram Avagyan; Ning Ma; Jason Papadopoulos; Kevin Bealer; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2009-12-15       Impact factor: 3.169

5.  RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.

Authors:  Alexandros Stamatakis
Journal:  Bioinformatics       Date:  2014-01-21       Impact factor: 6.937

6.  Genome Sequence, Assembly and Characterization of Two Metschnikowia fructicola Strains Used as Biocontrol Agents of Postharvest Diseases.

Authors:  Edoardo Piombo; Noa Sela; Michael Wisniewski; Maria Hoffmann; Maria L Gullino; Marc W Allard; Elena Levin; Davide Spadaro; Samir Droby
Journal:  Front Microbiol       Date:  2018-04-03       Impact factor: 5.640

7.  Characterization of a novel yeast species Metschnikowia persimmonesis KCTC 12991BP (KIOM G15050 type strain) isolated from a medicinal plant, Korean persimmon calyx (Diospyros kaki Thumb).

Authors:  Young Min Kang; Ji Eun Choi; Richard Komakech; Jeong Hwan Park; Dae Wook Kim; Kye Man Cho; Seung Mi Kang; Sang Haeng Choi; Kun Chul Song; Chung Min Ryu; Keun Chul Lee; Jung-Sook Lee
Journal:  AMB Express       Date:  2017-11-10       Impact factor: 3.298

Review 8.  Persimmon (Diospyros kaki) fruit: hidden phytochemicals and health claims.

Authors:  Masood Sadiq Butt; M Tauseef Sultan; Mahwish Aziz; Ambreen Naz; Waqas Ahmed; Naresh Kumar; Muhammad Imran
Journal:  EXCLI J       Date:  2015-05-04       Impact factor: 4.068

9.  eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences.

Authors:  Jaime Huerta-Cepas; Damian Szklarczyk; Kristoffer Forslund; Helen Cook; Davide Heller; Mathias C Walter; Thomas Rattei; Daniel R Mende; Shinichi Sunagawa; Michael Kuhn; Lars Juhl Jensen; Christian von Mering; Peer Bork
Journal:  Nucleic Acids Res       Date:  2015-11-17       Impact factor: 16.971

10.  Anti-inflammatory activities of astringent persimmons (Diospyros kaki Thunb.) stalks of various cultivar types based on the stages of maturity in the Gyeongnam province.

Authors:  Jieun Choi; Mi Jeong Kim; Richard Komakech; Haiyoung Jung; Youngmin Kang
Journal:  BMC Complement Altern Med       Date:  2019-09-23       Impact factor: 3.659
