
Genome sequences of Knoxdaviesia capensis and K. proteae (Fungi: Ascomycota) from Protea trees in South Africa.

Janneke Aylward, Emma T Steenkamp, Léanne L Dreyer, Francois Roets, Brenda D Wingfield, Michael J Wingfield.

Abstract

Two closely related ophiostomatoid fungi, Knoxdaviesia capensis and K. proteae, inhabit the fruiting structures of certain Protea species indigenous to southern Africa. Although K. capensis occurs in several Protea hosts, K. proteae is confined to P. repens. In this study, the genomes of K. capensis CBS139037 and K. proteae CBS140089 were determined. The genome of K. capensis consists of 35,537,816 bp assembled into 29 scaffolds and contains 7940 predicted protein-coding genes, of which 6192 (77.98 %) could be functionally classified. K. proteae has a similar genome size of 35,489,142 bp, assembled into 133 scaffolds. A total of 8173 protein-coding genes were predicted for K. proteae and 6093 (74.55 %) of these have functional annotations. The GC content of both genomes is 52.8 %.

Keywords:  Gondwanamycetaceae; Knoxdaviesia; Microascales; Ophiostomatoid fungi; Protea

Year: 2016; PMID: 26933475; PMCID: PMC4772463; DOI: 10.1186/s40793-016-0139-9

Source DB: PubMed; Journal: Stand Genomic Sci; ISSN: 1944-3277


Introduction

Two lineages of the polyphyletic assemblage known as ophiostomatoid fungi [1] are associated with the fruiting structures (infructescences) of serotinous Protea L. plants [2]. Protea species are a key component of the fynbos vegetation in the Core Cape Subregion (CCR) of South Africa [3] and the genus is predominantly encountered in South Africa [4, 5]. The Protea-associated ophiostomatoid fungi are, therefore, believed to be endemic to this region, similar to their hosts. This association of ophiostomatoid fungi with a keystone plant genus in a biodiversity hotspot is intriguing [6], as many ophiostomatoid fungi are notorious pathogens of trees [7-10], yet the Protea-associated ophiostomatoid species are not associated with disease symptoms [11]. Ophiostomatoid fungi are characterized by the flask-shaped morphology of their sexual fruiting structures and their association with arthropods [1, 12]. The Protea-associated members of this assemblage are primarily dispersed by mites that come into contact with fungal spores in the infructescences [13, 14]. These mites have limited dispersal ability, but use beetles and possibly larger vertebrates (such as birds) as vehicles for long-distance dispersal [15, 16]. The three Knoxdaviesia M.J. Wingf., P.S. van Wyk & Marasas species associated with Protea have intriguing host ranges. K. capensis M.J. Wingf. & P.S. van Wyk occurs in at least eight different hosts, whereas K. proteae M.J. Wingf., P.S. van Wyk & Marasas and K. wingfieldii (Roets & Dreyer) Z.W. de Beer & M.J. Wingf. are each confined to a single host species, P. repens L. and P. caffra Meisn., respectively [17-20]. An investigation of the population biology of K. proteae revealed that this fungus has a high level of intra-specific genetic diversity and that it is extensively dispersed within the CCR of South Africa [16, 21]. However, other than host range and dispersal mechanisms, little is known about the biology and ecology of Knoxdaviesia in general [11].
Here we present the first draft genome sequences of the two CCR species, K. capensis and K. proteae, together with their respective annotations.

Organism information

Classification and features

One lineage of Protea-associated ophiostomatoid fungi resides in the Ophiostomataceae (Ophiostomatales, Ascomycota), while the second resides in the Gondwanamycetaceae (Microascales, Ascomycota) [11, 22]. The latter group includes three closely related Protea-associated species in the genus Knoxdaviesia (Fig. 1). This genus was initially described to accommodate the asexual state of the first species in the genus, K. proteae [23]. Under the dual nomenclature system of fungi, the sexual state of this fungus was described in the same paper as Ceratocystiopsis proteae M.J. Wingf., P.S. van Wyk & Marasas [23]. A new genus, Gondwanamyces G.J. Marais & M.J. Wingf., was later described to accommodate the sexual state of this species and that of another species, Ophiostoma capense M.J. Wingf. & P.S. van Wyk [24]. The asexual states of both continued to be treated as species of Knoxdaviesia. Since the abolishment of the dual nomenclature system of fungi, the oldest genus name takes precedence, irrespective of morph [25, 26]. The name Knoxdaviesia, therefore, has priority and all species previously treated in Gondwanamyces were transferred to Knoxdaviesia [27].
Fig. 1

Maximum Likelihood tree illustrating the phylogenetic position of K. capensis and K. proteae in the Gondwanamycetaceae (grey block). The Protea-associated species are shaded red and the two isolates for which genome sequences were determined are indicated with a box. The sequences of the Internal Transcribed Spacer (ITS) region (available from GenBank®, accession numbers in brackets following isolate numbers) were aligned in MAFFT 7 [55]. The phylogeny was calculated in MEGA6 [56] using the Tamura-Nei substitution model [57], 1000 bootstrap replicates and Ceratocystis fimbriata (Ceratocystidaceae) as an outgroup

In a study determining the genome sequence of any fungus, it is advisable to use a living isolate connected to the type specimen. However, the ex-type isolate of K. proteae (CMW738 = CBS486.88) is more than 20 years old and no longer displays the characteristic morphological features of the fungus in culture. No living ex-type isolate exists for K. capensis. We thus collected fresh isolates of both species for this study in order to eliminate possible mutations or degradation that may have occurred through continual artificial propagation in culture media. The new isolates (Figs. 1 & 2) were collected from the same localities and hosts as the holotype specimens: K. capensis (CMW40890 = CBS139037) from the infructescences of P. longifolia Andrews in Hermanus, and K. proteae (CMW40880 = CBS140089) from P. repens infructescences in Stellenbosch, both locations in the Western Cape Province of South Africa. General features of these isolates are outlined in Table 1.
Fig. 2

Sexual sporing structures of the two Knoxdaviesia species sequenced in this study. K. capensis (a) and K. proteae (b) were sampled from Protea longifolia and P. repens flowers, respectively. Scale bars = 1 mm

Table 1

Classification and general features of K. capensis and K. proteae [29]

| MIGS ID | Property | K. capensis Term | K. proteae Term | Evidence code (a) |
|---|---|---|---|---|
| | Classification | Domain Fungi | Domain Fungi | TAS [19, 23] |
| | | Phylum Ascomycota | Phylum Ascomycota | TAS [19, 23] |
| | | Class Sordariomycetes | Class Sordariomycetes | TAS [19, 23] |
| | | Order Microascales | Order Microascales | TAS [2] |
| | | Family Gondwanamycetaceae | Family Gondwanamycetaceae | TAS [22] |
| | | Genus Knoxdaviesia | Genus Knoxdaviesia | TAS [27] |
| | | Species K. capensis | Species K. proteae | TAS [27] |
| | | Strain: CMW40890 = CBS139037 | Strain: CMW40880 = CBS140089 | |
| | Cell shape | Septate, smooth-walled hyphae | Septate, smooth-walled hyphae | TAS [19, 23] |
| | Motility | Non-motile | Non-motile | NAS |
| | Sporulation | Unsheathed allantoid ascospores | Falcate ascospores | TAS [19, 23] |
| | Temperature range | 15–30 °C | 15–30 °C | TAS [19, 23] |
| | Optimum temperature | 25 °C | 25 °C | TAS [19, 23] |
| | pH range; Optimum | Unknown | Unknown | |
| | Carbon source | Unknown | Unknown | |
| MIGS-6 | Habitat | Seed cones (infructescences) of Protea spp. | Seed cones (infructescences) of Protea repens L. | TAS [19, 23] |
| MIGS-6.3 | Salinity | Unknown | Unknown | |
| MIGS-22 | Oxygen requirement | Aerobic; requirement/tolerance unknown | Aerobic; requirement/tolerance unknown | |
| MIGS-15 | Biotic relationship | Plant-associated | Plant-associated | TAS [24] |
| MIGS-14 | Pathogenicity | None known | None known | |
| MIGS-4 | Geographic location | Hermanus, South Africa | Stellenbosch, South Africa | |
| MIGS-5 | Sample collection | February 2014 | January 2014 | |
| MIGS-4.1 | Latitude | -34.4093 | -33.9430 | |
| MIGS-4.2 | Longitude | 19.2150 | 18.8802 | |
| MIGS-4.4 | Altitude | 20 m | 140 m | |

(a) Evidence codes: IDA, inferred from direct assay; TAS, traceable author statement (i.e., a direct report exists in the literature); NAS, non-traceable author statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from http://www.geneontology.org/GO.evidence.shtml of the Gene Ontology project [58]


Genome sequencing information

Genome project history

Considering the lack of ecological information on the genus Knoxdaviesia and the close relationship these Microascalean fungi have to important plant pathogens, two Protea-associated species, believed to be native to the CCR in South Africa, were selected for genome sequencing. Both species were sequenced at Fasteris in Switzerland. The genome projects are listed in the Genomes OnLine Database [28] and the whole genome shotgun (WGS) projects have been deposited at DDBJ/EMBL/GenBank (Table 2). Table 2 presents the project information and its compliance with the Minimum Information about a Genome Sequence (MIGS) version 2.0 [29]. The full MIGS records for K. capensis and K. proteae are available in Additional file 1: Table S1 and Additional file 2: Table S2, respectively.
Table 2

Project information

| MIGS ID | Property | K. capensis Term | K. proteae Term |
|---|---|---|---|
| MIGS-31 | Finishing quality | High quality draft | High quality draft |
| MIGS-28 | Libraries used | 2x paired-end (PE) (350 and 550 bp) and 1x mate-pair (MP) (3 kbp) | 2x paired-end (PE) (350 and 550 bp) and 1x mate-pair (MP) (3 kbp) |
| MIGS-29 | Sequencing platforms | Illumina HiSeq 2500 | Illumina HiSeq 2500 |
| MIGS-31.2 | Fold coverage | PE library 1: 91.6x; PE library 2: 80x; MP library: 17x | PE library 1: 142x; PE library 2: 79.3x; MP library: 50.2x |
| MIGS-30 | Assemblers | ABySS 1.5.2; SSPACE 3.0 | ABySS 1.5.2; SSPACE 3.0 |
| MIGS-32 | Gene calling method | MAKER 2.31.8 | MAKER 2.31.8 |
| | GenBank ID | LNGK00000000 | LNGL00000000 |
| | GenBank date of release | 11 January 2016 | 11 January 2016 |
| | GOLD ID | Gp0093999 | Gp0110284 |
| | BioProject | PRJNA246171 | PRJNA275563 |
| MIGS-13 | Source material identifier | CMW40890/CBS139037 | CMW40880/CBS140089 |
| | Project relevance | Biodiversity, evolution | Biodiversity, evolution |
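
As a rough illustration of how the per-library values in Table 2 relate to the combined coverage reported for each genome, the following Python sketch applies the coverage estimate used in this report (total sequenced nucleotides divided by genome size) and sums the three libraries. The function and variable names are illustrative assumptions and are not part of the original analysis; the coverage values are copied from Table 2.

```python
# Per-library fold coverage copied from Table 2 (two paired-end, one mate-pair).
LIBRARY_COVERAGE = {
    "K. capensis": {"PE library 1": 91.6, "PE library 2": 80.0, "MP library": 17.0},
    "K. proteae": {"PE library 1": 142.0, "PE library 2": 79.3, "MP library": 50.2},
}


def fold_coverage(total_sequenced_bases, genome_size_bp):
    """Coverage estimate: total sequenced nucleotides / genome size."""
    return total_sequenced_bases / genome_size_bp


def combined_coverage(per_library):
    """Combined coverage is the sum of the individual library coverages."""
    return sum(per_library.values())


if __name__ == "__main__":
    for species, libraries in LIBRARY_COVERAGE.items():
        print(f"{species}: {combined_coverage(libraries):.1f}x combined coverage")
```

Summing the rounded Table 2 values gives 271.5x for K. proteae and 188.6x for K. capensis; the small difference from the 188.5x stated in the text is presumably due to rounding of the per-library values.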

Growth conditions and genomic DNA preparation

Both K. capensis and K. proteae were cultured on Malt Extract Agar (MEA; Merck, Wadeville, South Africa) overlaid with sterile cellophane sheets (Product no. Z377597, Sigma-Aldrich, Steinheim, Germany). After 10 days of growth at 25 °C, mycelium was scraped from the cellophane and DNA was extracted according to Aylward et al. [30]. Approximately 5 μg DNA from each species was used to prepare the three Illumina libraries (Table 2). RNA was extracted from the genome isolate to use as evidence for gene prediction. After growth on MEA at 25 °C for approximately 10 days, total RNA was isolated from the mycelium with the PureLink™ RNA Mini Kit (Ambion, Austin, TX, USA). Quality control was performed on the Agilent 2100 Bioanalyzer (Agilent Technologies, USA) using the RNA 6000 Nano Assay kit (Agilent Technologies, USA). The mRNA component of the total RNA was subsequently extracted with the Dynabeads® mRNA purification kit (Ambion, Austin, TX, USA).

Genome sequencing and assembly

The genomes of K. capensis and K. proteae were sequenced with the Illumina HiSeq 2500 platform at Fasteris, Switzerland, using two paired-end and one Nextera mate-pair library (Table 2). More than 60 million paired-end and 8 million mate-pair reads were obtained for each species. These reads were trimmed in CLC Genomics Workbench 6.5 (CLC bio, Aarhus, Denmark) so that the Phred quality (Q) score of each base was at least Q20. VelvetOptimiser (Gladman & Seemann, unpublished), a Perl script used as part of the Velvet assembler [31, 32], was initially used to optimize the assembly parameters. Assembly of contigs was performed in ABySS 1.5.2 [33] using the optimal parameters suggested by VelvetOptimiser as a starting point. Several assemblies were computed using k-mer values slightly higher and lower than the k-mer value suggested by VelvetOptimiser. The assembly with the lowest number of contigs was used to build scaffolds in SSPACE 3.0 [34], discarding scaffolds smaller than 1000 bp. Automatic gap closure was performed in GapFiller 1.10 [35]. The average genome coverage of each library was estimated using the Lander-Waterman equation (total sequenced nucleotides/genome size) (Table 2), which yielded a combined average coverage for the three libraries of 188.5x (K. capensis) and 271.5x (K. proteae). The K. capensis genome consists of 29 scaffolds ranging between 1226 and 5,637,848 bp, whereas the 133 scaffolds of K. proteae range between 1022 and 2,610,973 bp. A search for the 1438 fungal universal single-copy ortholog genes with BUSCO 1.1b1 [36] identified 1355 complete and 67 partial genes in K. capensis and 1366 complete and 57 partial genes in K. proteae. The two genomes are therefore estimated to be >98 % complete. The extracted mRNA was sequenced using an Ion PI™ Chip on the Ion Proton™ System (Life Technologies, Carlsbad, CA) at the Central Analytical Facility (CAF), Stellenbosch University, South Africa. The >49 million raw RNA-Seq reads were mapped to the corresponding genome in CLC Genomics Workbench and assembled with Trinity 2.0.6 [37] using the genome-guided option.
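
The completeness estimate quoted above can be reproduced with simple arithmetic on the BUSCO counts. The Python sketch below assumes that the ">98 %" figure counts both complete and partial orthologs as detected, since the complete genes alone amount to roughly 94 to 95 % of the 1438 searched; this reading is an inference from the reported numbers, and the function name is hypothetical.

```python
BUSCO_TOTAL = 1438  # fungal universal single-copy orthologs searched

# (complete, partial) counts as reported in the text.
BUSCO_COUNTS = {
    "K. capensis": (1355, 67),
    "K. proteae": (1366, 57),
}


def busco_summary(complete, partial, total=BUSCO_TOTAL):
    """Return percentages of complete, partial, detected and missing orthologs."""
    missing = total - complete - partial
    return {
        "complete": 100.0 * complete / total,
        "partial": 100.0 * partial / total,
        # Assumption: "detected" = complete + partial.
        "detected": 100.0 * (complete + partial) / total,
        "missing": 100.0 * missing / total,
    }


if __name__ == "__main__":
    for species, (complete, partial) in BUSCO_COUNTS.items():
        summary = busco_summary(complete, partial)
        print(f"{species}: {summary['detected']:.2f} % detected, "
              f"{summary['missing']:.2f} % missing")
```

Under this assumption both genomes come out just under 99 % detected, which is consistent with the >98 % completeness stated in the text.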

Genome annotation

Genome annotation was performed with the MAKER 2.31.8 pipeline [38, 39], using custom repeat libraries for each species constructed with RepeatScout 1.0.5 [40] and two de novo gene predictors, SNAP 2006-07-28 [41] and AUGUSTUS 3.0.3 [42]. The assembled RNA-Seq data and predicted protein and/or transcript sequences from 22 sequenced Sordariomycete species (Additional file 3: Table S3), including two Microascalean fungi, were provided as additional evidence. AUGUSTUS was trained with the assembled RNA-Seq data and MAKER was subsequently used to annotate the largest scaffold of the K. capensis assembly and the largest scaffold of the K. proteae assembly, independently. After manually curating all the gene predictions on these scaffolds with Apollo 1.11.8 [43], SNAP was trained with the curated gene predictions of each scaffold and the scaffolds were re-annotated. SNAP was then re-trained for each species individually and both genomes were annotated. EuKaryotic Orthologous Group (KOG) classifications were assigned to the predicted proteins through the WebMGA [44] portal, which performs reverse-position-specific BLAST [45] searches against the KOG database [46]. Additional functional annotations were predicted with InterProScan 5.13-52.0 [47, 48], SignalP 4.1 [49] and TMHMM 2.0 [50].

Genome properties

K. capensis and K. proteae have similar genome sizes of 35.54 and 35.49 Mbp, respectively. The K. capensis genome was assembled into 29 scaffolds larger than 1000 bp, whereas the number of scaffolds above this threshold achieved for K. proteae was 133. Both genomes have a GC content of 52.8 %. A total of 7940 protein-coding genes were predicted for K. capensis and 8173 for K. proteae. Additionally, 137 and 116 tRNA genes and 30 and 27 rRNA genes were predicted for K. capensis and K. proteae, respectively. More than 74 % of the protein-coding genes of each species could be assigned a putative function via the KOG and Pfam databases. The content of the two genomes is summarized in Tables 3 and 4.
Table 3

Genome statistics

| Attribute | K. capensis Value | K. capensis % of Total (a) | K. proteae Value | K. proteae % of Total (a) |
|---|---|---|---|---|
| Genome size (bp) | 35,537,816 | 100.00 | 35,489,142 | 100.00 |
| DNA coding (bp) | 12,640,368 | 35.57 | 12,542,580 | 35.34 |
| DNA G + C (bp) | 18,774,628 | 52.83 | 18,745,365 | 52.82 |
| DNA scaffolds | 29 | | 133 | |
| Total genes | 8107 | 100.00 | 8316 | 100.00 |
| Protein coding genes | 7940 | 97.94 | 8173 | 98.28 |
| RNA genes (b) | 167 | 2.06 | 143 | 1.72 |
| Pseudo genes | unknown | | unknown | |
| Genes in internal clusters | unknown | | unknown | |
| Genes with function prediction | 6192 | 77.98 | 6093 | 74.55 |
| Genes assigned to KOGs | 6059 | 76.31 | 6015 | 73.60 |
| Genes with Pfam domains | 5455 | 68.70 | 5335 | 65.28 |
| Genes with signal peptides | 354 | 4.46 | 335 | 4.10 |
| Genes with transmembrane helices | 1510 | 19.02 | 1527 | 18.68 |
| CRISPR repeats | N/A | | N/A | |

(a) The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome

(b) Based on tRNA and rRNA genes only
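
The percentages in Table 3 use different denominators depending on the row. The following Python sketch, with hypothetical names, shows one way to reproduce a few of them from the raw counts, assuming that base-pair rows are divided by genome size, that the protein-coding and RNA gene rows are divided by the total gene count, and that the functional annotation rows are divided by the number of protein-coding genes. These denominators are inferred from the values in the table rather than stated in full by the source.

```python
# Raw values copied from Table 3.
GENOME_STATS = {
    "K. capensis": {
        "genome_size_bp": 35_537_816, "coding_bp": 12_640_368,
        "gc_bp": 18_774_628, "total_genes": 8107,
        "protein_coding": 7940, "rna_genes": 167, "with_function": 6192,
    },
    "K. proteae": {
        "genome_size_bp": 35_489_142, "coding_bp": 12_542_580,
        "gc_bp": 18_745_365, "total_genes": 8316,
        "protein_coding": 8173, "rna_genes": 143, "with_function": 6093,
    },
}


def percent(part, total):
    """Percentage of `part` in `total`, rounded to two decimals as in Table 3."""
    return round(100.0 * part / total, 2)


def table3_percentages(stats):
    """Recompute selected '% of Total' entries for one species."""
    return {
        "coding": percent(stats["coding_bp"], stats["genome_size_bp"]),
        "gc": percent(stats["gc_bp"], stats["genome_size_bp"]),
        "protein_coding": percent(stats["protein_coding"], stats["total_genes"]),
        "rna_genes": percent(stats["rna_genes"], stats["total_genes"]),
        "with_function": percent(stats["with_function"], stats["protein_coding"]),
    }


if __name__ == "__main__":
    for species, stats in GENOME_STATS.items():
        print(species, table3_percentages(stats))
```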

Table 4

Number of genes associated with the 25 general KOG functional categories

| Code | K. capensis Value | K. capensis % of total (a) | K. proteae Value | K. proteae % of total (a) | Description |
|---|---|---|---|---|---|
| J | 359 | 4.52 | 371 | 4.54 | Translation, ribosomal structure and biogenesis |
| A | 280 | 3.53 | 273 | 3.34 | RNA processing and modification |
| K | 475 | 5.98 | 484 | 5.92 | Transcription |
| L | 196 | 2.47 | 198 | 2.42 | Replication, recombination and repair |
| B | 109 | 1.37 | 99 | 1.21 | Chromatin structure and dynamics |
| D | 209 | 2.63 | 227 | 2.78 | Cell cycle control, cell division, chromosome partitioning |
| Y | 34 | 0.43 | 32 | 0.39 | Nuclear structure |
| V | 32 | 0.40 | 32 | 0.39 | Defence mechanisms |
| T | 505 | 6.36 | 486 | 5.95 | Signal transduction mechanisms |
| M | 69 | 0.87 | 76 | 0.93 | Cell wall/membrane/envelope biogenesis |
| N | 6 | 0.08 | 6 | 0.07 | Cell motility |
| Z | 279 | 3.51 | 289 | 3.54 | Cytoskeleton |
| W | 10 | 0.13 | 12 | 0.15 | Extracellular structures |
| U | 539 | 6.79 | 543 | 6.64 | Intracellular trafficking, secretion, and vesicular transport |
| O | 502 | 6.32 | 495 | 6.06 | Post-translational modification, protein turnover, chaperones |
| C | 265 | 3.34 | 256 | 3.13 | Energy production and conversion |
| G | 202 | 2.54 | 202 | 2.47 | Carbohydrate transport and metabolism |
| E | 227 | 2.86 | 228 | 2.79 | Amino acid transport and metabolism |
| F | 76 | 0.96 | 74 | 0.91 | Nucleotide transport and metabolism |
| H | 87 | 1.10 | 85 | 1.04 | Coenzyme transport and metabolism |
| I | 234 | 2.95 | 234 | 2.86 | Lipid transport and metabolism |
| P | 144 | 1.81 | 151 | 1.85 | Inorganic ion transport and metabolism |
| Q | 139 | 1.75 | 137 | 1.68 | Secondary metabolites biosynthesis, transport and catabolism |
| R | 735 | 9.26 | 694 | 8.49 | General function prediction only |
| S | 344 | 4.33 | 330 | 4.04 | Function unknown |
| X | 2 | 0.03 | 1 | 0.01 | Multiple functions |
| - | 1881 | 23.69 | 2159 | 26.41 | Not in KOGs |

(a) The total is based on the total number of protein coding genes in the genome
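
As a small worked check of Table 4, the Python sketch below recomputes the "% of total" column for K. capensis and confirms that the category counts add up to the number of genes assigned to KOGs in Table 3 (6059), which together with the genes not in KOGs gives the 7940 protein-coding genes. The counts are copied from Table 4; the names used in the code are illustrative assumptions.

```python
# Gene counts per KOG category for K. capensis, copied from Table 4.
KOG_COUNTS_K_CAPENSIS = {
    "J": 359, "A": 280, "K": 475, "L": 196, "B": 109, "D": 209, "Y": 34,
    "V": 32, "T": 505, "M": 69, "N": 6, "Z": 279, "W": 10, "U": 539,
    "O": 502, "C": 265, "G": 202, "E": 227, "F": 76, "H": 87, "I": 234,
    "P": 144, "Q": 139, "R": 735, "S": 344, "X": 2,
}
NOT_IN_KOGS = 1881           # last row of Table 4
PROTEIN_CODING_GENES = 7940  # denominator named in the table footnote


def percent_of_total(count, total=PROTEIN_CODING_GENES):
    """Share of protein-coding genes, rounded to two decimals as in Table 4."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    assigned = sum(KOG_COUNTS_K_CAPENSIS.values())
    print("Genes assigned to KOGs:", assigned)
    print("Assigned + not in KOGs:", assigned + NOT_IN_KOGS)
    for code, count in KOG_COUNTS_K_CAPENSIS.items():
        print(code, count, percent_of_total(count))
```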


Conclusions

At least six Microascalean fungi currently have publicly accessible genomes [51-54]. K. capensis and K. proteae, however, represent the first sequenced genomes from the Microascalean family Gondwanamycetaceae. The genomes of these two species will not only enable further understanding of the unique ecology of Protea-inhabiting fungi, but will also be valuable in taxonomic and evolutionary studies.

References

1.  Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes.

Authors:  A Krogh; B Larsson; G von Heijne; E L Sonnhammer
Journal:  J Mol Biol       Date:  2001-01-19       Impact factor: 5.469

2.  Interactions among Scolytid bark beetles, their associated fungi, and live host conifers.

Authors:  T D Paine; K F Raffa; T C Harrington
Journal:  Annu Rev Entomol       Date:  1997       Impact factor: 19.686

3.  Biotic and abiotic constraints that facilitate host exclusivity of Gondwanamyces and Ophiostoma on Protea.

Authors:  Francois Roets; Natalie Theron; Michael J Wingfield; Léanne L Dreyer
Journal:  Fungal Biol       Date:  2011-10-06

4.  MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors:  Kazutaka Katoh; Daron M Standley
Journal:  Mol Biol Evol       Date:  2013-01-16       Impact factor: 16.240

5.  A new bioinformatics analysis tools framework at EMBL-EBI.

Authors:  Mickael Goujon; Hamish McWilliam; Weizhong Li; Franck Valentin; Silvano Squizzato; Juri Paern; Rodrigo Lopez
Journal:  Nucleic Acids Res       Date:  2010-05-03       Impact factor: 16.971

6.  Toward almost closed genomes with GapFiller.

Authors:  Marten Boetzer; Walter Pirovano
Journal:  Genome Biol       Date:  2012-06-25       Impact factor: 13.583

7.  A new dawn for the naming of fungi: impacts of decisions made in Melbourne in July 2011 on the future publication and regulation of fungal names.

Authors:  David L Hawksworth
Journal:  IMA Fungus       Date:  2011-11-11       Impact factor: 3.515

8.  IMA Genome-F 1: Ceratocystis fimbriata: Draft nuclear genome sequence for the plant pathogen, Ceratocystis fimbriata.

Authors:  P Markus Wilken; Emma T Steenkamp; Michael J Wingfield; Z Wilhelm de Beer; Brenda D Wingfield
Journal:  IMA Fungus       Date:  2013-12-06       Impact factor: 3.515

9.  Draft Genome Sequence of the Pathogenic Fungus Scedosporium apiospermum.

Authors:  Patrick Vandeputte; Sarah Ghamrawi; Mathias Rechenmann; Agnès Iltis; Sandrine Giraud; Maxime Fleury; Christopher Thornton; Laurence Delhaès; Wieland Meyer; Nicolas Papon; Jean-Philippe Bouchara
Journal:  Genome Announc       Date:  2014-10-02

10.  Panmixia defines the genetic diversity of a unique arthropod-dispersed fungus specific to Protea flowers.

Authors:  Janneke Aylward; Léanne L Dreyer; Emma T Steenkamp; Michael J Wingfield; Francois Roets
Journal:  Ecol Evol       Date:  2014-08-21       Impact factor: 2.912
