Taruna A Schuelke, Guangxi Wu, Anthony Westbrook, Keith Woeste, David C Plachetzki, Kirk Broders, Matthew D MacManes.
Abstract
Geosmithia morbida is an emerging fungal pathogen that serves as a model for examining the evolutionary processes behind pathogenicity, because it is one of two known pathogens within a genus of mostly saprophytic, beetle-associated fungi. This pathogen causes thousand cankers disease in black walnut trees and is vectored into the host by the walnut twig beetle. Geosmithia morbida was first detected in the western United States and currently threatens the timber industry concentrated in the eastern United States. We sequenced the genome of G. morbida in a previous study and the genomes of two nonpathogenic Geosmithia species in this work, and compared these species to other fungal pathogens and nonpathogens to identify genes under positive selection in G. morbida that may be associated with pathogenicity. Geosmithia morbida possesses one of the smallest genomes among the fungal species observed in this study, and one of the smallest fungal pathogen genomes to date. The enzymatic profile of this pathogen is very similar to that of its nonpathogenic relatives. Our findings indicate that genome reduction or retention of a smaller genome may be an important adaptive force during the evolution of a specialized lifestyle in fungal species that occupy a specific niche, such as beetle-vectored tree pathogens. We also present potential genes under selection in G. morbida that could be important for adaptation to a pathogenic lifestyle.
Keywords: Geosmithia morbida; pathogenicity; thousand cankers disease; tree pathogen
Year: 2017 PMID: 29186370 PMCID: PMC5737690 DOI: 10.1093/gbe/evx242
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Species, Geographic Origins, and Host Information for Geosmithia morbida, Geosmithia flava, and Geosmithia putterillii
| Species | Pathogen | Isolate | Geographic Origins | Host |
|---|---|---|---|---|
| G. morbida | Yes | 1262 | California | |
| G. flava | No | CCF3333 | Czech Republic | |
| G. putterillii | No | CCF4204 | California | |
This isolate is the reference genome. The details of assembly for this genome are discussed in Schuelke et al. (2016).
Statistics for Sequence Data from Isolates of Geosmithia morbida, Geosmithia flava, and Geosmithia putterillii
| Species | Total Read Pairs | Est. Coverage |
|---|---|---|
| G. morbida | 14,013,863; 20,674,289 | 109×; 160× |
| G. flava | 16,183,281 | 102× |
| G. putterillii | 19,711,745 | 131× |
These values are for paired-end read data for G. morbida from Schuelke et al. (2016).
These values are for mate-pair read data for G. morbida from Schuelke et al. (2016).
Fungal Species Used for Phylogenetic Analysis in This Study
| Species | Class | Order | Ecological Role | Download Source | References |
|---|---|---|---|---|---|
| | Sordariomycetes | Hypocreales | Pathogen | — | |
| | Sordariomycetes | Hypocreales | Nonpathogen | — | – |
| | Sordariomycetes | Hypocreales | Nonpathogen | — | – |
| | Sordariomycetes | Hypocreales | Beneficial | FungalEnsembl | |
| | Sordariomycetes | Hypocreales | Saprotrophic | JGI | Used with permission |
| | Sordariomycetes | Hypocreales | Mycoparasite | JGI | |
| | Sordariomycetes | Hypocreales | Saprotrophic | FungalEnsembl | |
| | Sordariomycetes | Hypocreales | Mycoparasite | EnsemblGenomes | |
| | Sordariomycetes | Hypocreales | Biotrophic pathogen | FungalEnsembl | |
| | Sordariomycetes | Hypocreales | Insect pathogen | FungalEnsembl | |
| | Sordariomycetes | Hypocreales | Saprotrophic | JGI | Used with permission |
| | Sordariomycetes | Hypocreales | Necrotrophic pathogen | FungalEnsembl | |
| | Sordariomycetes | Hypocreales | Necrotrophic pathogen | FungalEnsembl | |
| | Sordariomycetes | Microascales | Pathogen | FungalEnsembl | |
| | Sordariomycetes | Sordariales | Saprotrophic | FungalEnsembl | |
| | Sordariomycetes | Sordariales | Saprotrophic | JGI | |
| | Sordariomycetes | Ophiostomatales | Pathogen | FungalEnsembl | |
| | Sordariomycetes | Xylariales | Pathogen | JGI | |
| | Leotiomycetes | Helotiales | Necrotrophic pathogen | FungalEnsembl | |
| | Leotiomycetes | Incertae sedis | Mycorrhizal | JGI | |
Note.—The species in bold were utilized for positive selection analysis.
Length-Based Statistics for Geosmithia morbida, Geosmithia flava, and Geosmithia putterillii Generated with QUAST v2.3
| Species | Est. Genome Size (Mb) | | Scaffold Count | Largest Scaffold (bp) | NG50 (bp) | LG50 | Genome Completeness (%) | Predicted Proteins | Transcript Completeness (%) |
|---|---|---|---|---|---|---|---|---|---|
| G. morbida | 26.5 | NA | 73 | 2,597,956 | 1,305,468 | 7 | 98 | 6,273 | 93 |
| G. flava | 29.6 | 91 | 1,819 | 1,534,325 | 460,430 | 22 | 98 | 6,976 | 94 |
| G. putterillii | 30.0 | 91 | 320 | 2,758,267 | 1,379,352 | 9 | 98 | 7,086 | 94 |
Note.—The average GC content for G. morbida, G. flava, and G. putterillii is 54%, 52%, and 55.5%, respectively. All genome completeness values were produced with BUSCO v1.1b1. These percentages represent genes that are complete and not duplicated or fragmented. NG50 is the scaffold length such that scaffolds of equal or greater length together contain 50% of the bases of the reference genome. LG50 is the number of scaffolds of length NG50 or longer.
Genome assembly for G. morbida was constructed using AllPaths-LG (v49414). See Schuelke et al. (2016) for further details.
These percentages were computed using the fungal data set 9 provided with BUSCO.
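The NG50 and LG50 definitions in the note above can be illustrated with a minimal sketch. This is not the QUAST implementation; the function name `ng50_lg50` and the toy scaffold lengths are hypothetical and chosen only for illustration.

```python
def ng50_lg50(scaffold_lengths, genome_size):
    """Return (NG50, LG50) for a list of scaffold lengths.

    genome_size is the reference (or estimated) genome size in bases.
    Returns (None, None) if the scaffolds sum to less than half of it.
    """
    half = genome_size / 2
    running_total = 0
    # Walk the scaffolds from longest to shortest, accumulating bases.
    ordered = sorted(scaffold_lengths, reverse=True)
    for count, length in enumerate(ordered, start=1):
        running_total += length
        if running_total >= half:
            # 'length' is NG50; 'count' scaffolds are at least that long (LG50).
            return length, count
    return None, None


# Toy example (hypothetical lengths, not data from the study).
toy_result = ng50_lg50([50, 40, 30, 20, 10], genome_size=200)
print(toy_result)
```

Because the threshold is half of the genome size rather than half of the assembly length, an assembly that covers less than 50% of the genome has no NG50, which is why the sketch returns `None` in that case.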
Repetitive Elements Profile of Geosmithia Species Generated with RepeatMasker v4.0.5
| Species | Genome Size (Mb) | GC (%) | Bases Masked (%) | No. of Retroelements | No. of DNA Transposons |
|---|---|---|---|---|---|
| G. morbida | 26.5 | 54.30 | 0.81 | 152 | 60 |
| G. flava | 29.6 | 51.87 | 0.63 | 401 | 42 |
| G. putterillii | 30.0 | 55.47 | 0.64 | 214 | 15 |
Fig. 1.—CAZymes distribution for Geosmithia species, other Hypocreales, and C. platani. The species in red are pathogens, whereas the names in black are nonpathogens. CAZymes were identified with HMMER searches of dbCAN peptide models. GH, glycoside hydrolases; GT, glycosyltransferases; PL, polysaccharide lyases; CE, carbohydrate esterases; AA, auxiliary activities enzymes; CBM, carbohydrate-binding modules.
Fig. 2.—Proteolytic enzymes distribution for Geosmithia species, other Hypocreales, and C. platani. The species in red are pathogens, whereas the names in black are nonpathogens. Proteases were identified using BLASTp searches against the MEROPS database v10. S, serine; M, metallo; C, cysteine; A, aspartic; T, threonine; I, inhibitors; P, mixed; G, glutamic.
Fig. 3.—The ML phylogeny was estimated with RAxML (Stamatakis 2014) using a scheme determined by PartitionFinder (Lanfear et al. 2014). The IC (top) and IC All (bottom) scores are also presented for each node. This topology is identical to the BMCMC phylogeny constructed in MrBayes (Ronquist et al. 2012). All nodes in the ML and BMCMC analyses receive maximum support of 1. The black circles symbolize classes. The color-shaded boxes at the right of the figure denote the orders within each class. The first and second numbers in parentheses represent the genome size in Mb and the number of predicted protein models, respectively. Black and red branches correspond to nonpathogens and pathogens, respectively, which span multiple orders.
Functional Analyses of Genes under Positive Selection in Geosmithia morbida Detected by the Branch-Site Model in PAML 4.8
| Gene Number | Function | dN/dS | Transmembrane Domain |
|---|---|---|---|
| 3078 | Takes part in intracellular signaling, protein recruitment to various membranes | 2.04 | 0 |
| 2666 | Involved in receptor-mediated endocytosis and vesicle trafficking | 2.01 | 0 |
| 563 | Unclear function | 1.94 | 1 |
| 2194 | Unknown function | 1.94 | 0 |
| 801 | Catalyzes the transfer of electrons from ferrocytochrome c to oxygen, reducing the oxygen to water | 1.93 | 1 |
| 3944 | Involved in methylation and has a wide range of substrate specificity | 1.90 | 5 |
| 5058 | Involved in ubiquitination of proteins targeted for degradation | 1.90 | 0 |
| 1843 | Involved in heat-shock response | 1.86 | 0 |
| 521 | Involved in damaged DNA binding and repair | 1.85 | 0 |
| 5111 | Involved in receptor-mediated endocytosis and vesicle trafficking | 1.84 | 0 |
| 4128 | Catalyzes the hydrolysis of esters | 1.84 | 0 |
| 923 | Hydrolyzes the peptide bond at the C-terminus of ubiquitin | 1.83 | 1 |
| 4405 | Involved in transport and metabolism of lipids | 1.83 | 1 |
| 3137 | Part of proteins with diverse functions such as cell-cycle regulators, signal transducers, transcriptional initiators | 1.78 | 0 |
| 4359 | Unknown function | 1.73 | 2 |
| 5639 | Involved in rRNA synthesis | 1.67 | 0 |
| 5 | Involved in vesicular transport | 1.63 | 0 |
| 624 | Involved in transfer of glucose molecules that are part of a larger glycosylation machinery | 1.62 | 9 |
| 3929 | Unknown function but associates with GRAM domain found in glucosyltransferases and other membrane affiliated proteins | 1.61 | 0 |
| 1456 | Involved in DNA repair and replication | 1.59 | 0 |
| 4829 | Forms cAMP | 1.59 | 0 |
| 254 | Major ATP transporters | 1.59 | 2 |
| 4888 | Unknown function | 1.54 | 0 |
| 5426 | Hydrolyzes nonubiquitinated peptides | 1.54 | 0 |
| 5709 | Transcription factors | 1.50 | 0 |
| 859 | May be involved in the timing of nuclear migration | 1.50 | 0 |
| 5703 | Cleaves peptide bonds in other proteins | 1.47 | 6 |
| 5255 | Heat shock protein involved in induced stress response to ethanol | 1.46 | 3 |
| 5704 | Regulates gene expression during oxidative stress caused by the host plant | 1.46 | 0 |
| 2485 | Transfers phosphates | 1.39 | 0 |
| 6116 | Hydratase and/or isomerase | 1.38 | 0 |
| 5266 | Breaks down actin, cell membrane deformations | 1.34 | 0 |
| 5000 | Catalyzes the first step in histidine biosynthesis | 1.34 | 0 |
| 3326 | Involved in de novo synthesis of purine nucleotides | 1.32 | 0 |
| 2142 | E2 enzymes that catalyze the binding of activated ubiquitin to the substrate protein. The substrate proteins are targeted for degradation by the proteasome | 1.24 | 0 |
| 581 | Ribosomal protein | 1.17 | 0 |
| 5948 | Involved in initiation of transcription | 1.14 | 1 |
| 3700 | Part of the TOM complex that recognizes and regulates the transport of mitochondrial precursor molecules from the cytosol to the intermembrane space of the mitochondrion | 1.03 | 0 |
Note.—The gene number corresponds to the sequence ID in the G. morbida protein file available at DRYAD. The P-value for each dN/dS ratio is < 0.05. dN/dS is the ratio of the nonsynonymous substitution rate to the synonymous substitution rate.
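The note above reports P-values without describing how they were obtained. A branch-site test is conventionally evaluated with a likelihood-ratio test between a null and an alternative model, and the sketch below shows that calculation under stated assumptions: the statistic is compared against a chi-square distribution with 1 degree of freedom, the function name `branch_site_lrt` is hypothetical, and the log-likelihood values are invented for illustration. It is not taken from the study's pipeline.

```python
import math


def branch_site_lrt(lnl_null, lnl_alt):
    """Likelihood-ratio test for one gene.

    lnl_null, lnl_alt: log-likelihoods of the null and alternative models
    (hypothetical inputs, e.g. as read from two model-fitting runs).
    Returns (test statistic, P-value), assuming a chi-square null
    distribution with 1 degree of freedom.
    """
    # Clamp at zero: small negative values can arise from optimizer noise.
    statistic = max(0.0, 2.0 * (lnl_alt - lnl_null))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Invented log-likelihoods, for illustration only.
stat, p = branch_site_lrt(lnl_null=-1000.0, lnl_alt=-998.0)
print(f"2*delta lnL = {stat:.2f}, P = {p:.4f}")
```

With many genes tested, a correction for multiple comparisons would normally be applied to the resulting P-values; whether and how that was done for this table is not stated here.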