Yan Wang1,2,3,4, Matt Stata5, Wei Wang5, Jason E Stajich3,4, Merlin M White6, Jean-Marc Moncalvo5,2.
Abstract
Modern genomics has shed light on many entomopathogenic fungi and expanded our knowledge widely; however, little is known about the genomic features of the insect-commensal fungi. Harpellales are obligate commensals living in the digestive tracts of disease-bearing insects (black flies, midges, and mosquitoes). In this study, we produced and annotated whole-genome sequences of nine Harpellales taxa and conducted the first comparative analyses to infer the genomic diversity within the members of the Harpellales. The genomes of the insect gut fungi feature low (26% to 37%) GC content and large genome size variations (25 to 102 Mb). Further comparisons with insect-pathogenic fungi (from both Ascomycota and Zoopagomycota), as well as with free-living relatives (as negative controls), helped to identify a gene toolbox that is essential to the fungus-insect symbiosis. The results not only narrow the genomic scope of fungus-insect interactions from several thousands to eight core players but also distinguish host invasion strategies employed by insect pathogens and commensals. The genomic content suggests that insect commensal fungi rely mostly on adhesion protein anchors that target the digestive system, while entomopathogenic fungi have higher numbers of transmembrane helices, signal peptides, and pathogen-host interaction (PHI) genes across the whole genome and enrich genes as well as functional domains to inactivate the host inflammation system and suppress the host defense. Phylogenomic analyses have revealed that genome sizes of Harpellales fungi vary among lineages with an integer-multiple pattern, which implies that ancient genome duplications may have occurred within the gut of insects.
IMPORTANCE Insect guts harbor various microbes that are important for host digestion, immune response, and disease dispersal in certain cases. Bacteria, which are among the primary endosymbionts, have been studied extensively. However, fungi, which are also frequently encountered, are poorly known with respect to their biology within the insect guts. To understand the genomic features and related biology, we produced the whole-genome sequences of nine gut commensal fungi from disease-bearing insects (black flies, midges, and mosquitoes). The results show that insect gut fungi tend to have low GC content across their genomes. By comparing these commensals with entomopathogenic and free-living fungi that have available genome sequences, we found a universal core gene toolbox that is unique and thus potentially important for the insect-fungus symbiosis. This comparative work also uncovered different host invasion strategies employed by insect pathogens and commensals, as well as a model system to study ancient fungal genome duplication within the gut of insects.
© Crown copyright 2018.
Keywords: FISCoG; Trichomycetes; Zoopagomycota; Zygomycota; phylogenomics
Year: 2018 PMID: 29764946 PMCID: PMC5954228 DOI: 10.1128/mBio.00636-18
Source DB: PubMed Journal: mBio Impact factor: 7.867
Genome features and statistics of the nine Harpellales taxa
| Strain | No. of | Genome size by | % CEGMA | GC | No. of | Repeat | SNP | NCBI |
|---|---|---|---|---|---|---|---|---|
|  | 6,137 | 77.12 | 97.98 | 28.61 | 11,209 | 3.34 | 0.45 |  |
|  | 7,749 | 71.05 | 97.58 | 29.46 | 10,024 | 3.64 | 0.68 |  |
|  | 7,797 | 102.35 | 93.55 | 26.05 | 8,712 | 2.94 | 0.75 |  |
|  | 1,954 | 28.70 | 92.74 | 35.52 | 7,387 | 4.29 | 0.64 |  |
|  | 3,927 | 43.63 | 96.77 | 32.49 | 7,132 | 4.60 | 0.41 |  |
|  | 1,283 | 28.05 | 99.19 | 32.40 | 7,385 | 1.60 | 0.43 |  |
|  | 1,312 | 28.13 | 99.19 | 32.37 | 7,338 | 1.58 | 0.43 |  |
|  | 72 | 24.85 | 97.18 | 37.82 | 6,649 | 4.54 | 0.06 |  |
|  | 1,131 | 43.91 | 94.76 | 28.38 | 6,519 | 3.38 | 0.58 |  |
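The abstract reports that genome sizes vary among lineages in an integer-multiple pattern. One simple way to inspect such a pattern is to divide each genome size by a base size and look at how close the ratios fall to whole numbers. The sketch below does this with the genome sizes from the table above (second numeric column, taken to be in Mb because the values match the 25 to 102 Mb range given in the abstract). The function name `size_ratios` and the choice of the smallest genome as the base are illustrative assumptions, not the authors' method.

```python
# Genome sizes copied from the table above, in row order (assumed to be in Mb).
GENOME_SIZES_MB = [77.12, 71.05, 102.35, 28.70, 43.63, 28.05, 28.13, 24.85, 43.91]


def size_ratios(sizes, base=None):
    """Return each size divided by a base size, rounded to two decimals.

    If no base is given, the smallest size is used (an assumption made
    here for illustration only).
    """
    if base is None:
        base = min(sizes)
    if base <= 0:
        raise ValueError("base size must be positive")
    return [round(size / base, 2) for size in sizes]


if __name__ == "__main__":
    for size, ratio in zip(GENOME_SIZES_MB, size_ratios(GENOME_SIZES_MB)):
        print(f"{size:7.2f} Mb -> {ratio:.2f}x the smallest genome")
```

Passing a different `base` (for example, the typical size of one subclade) changes the ratios, so any reading of the output depends on that choice.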
FIG 1 Genome size variation across recognized subclades and Venn diagrams showing homologues across the nine genome-sequenced members of the Harpellales. (a) Harpellales phylogenetic tree based on 5 genes (reconstructed using the data set from reference 41 by adding the strains of S. culicis ID-206-W2 and Capniomyces stellatus MIS-10-108). Branches indicated in bold are considered strongly supported, with Bayesian posterior probability (BPP) values of >95% and maximum-likelihood bootstrap probability (MLBP) values of >0.70. Genome sizes of the recently sequenced 9 taxa were mapped with subclade information (non-Smittium Harpellales, true Smittium, Parasmittium subclades I and II). (b) Venn diagrams for each subclade derived from analysis of reciprocal best matches of protein-coding genes, showing relatedness and homologous comparisons across the subclades of Harpellales. (c) Identification of the Harpellales feature genes. Clade-specific genes were also identified in comparisons of the three major subclades of Harpellales.
FIG 2 Whole-genome dot plots among the nine Harpellales genome sequences (centered diagonally, from lower left corner; determined using MUMmer plotting). Circles with detailed outputs of comparisons with exact match numbers (left) and the identity level of the matches (right) (centered diagonally, from upper right corner) are shown. Light blue circles indicate the pairs with matched regions longer than 100 kb. A default minimum cluster length of 65 bp was used for the comparison pairs, except for S. mucronatum and S. culicis (GSMNP) (75 bp), S. mucronatum and S. culicis (ID-206-W2) (75 bp), S. culicis (GSMNP) and S. culicis (ID-206-W2) (350 bp), S. angustum and Furculomyces boomerangus (4,000 bp), and S. angustum and S. simulii (70 bp), as well as S. simulii and Capniomyces stellatus (74 bp). Self-comparisons were performed using a minimum cluster length of 500 bp.
FIG 3 Phylogenomics and genome statistics of Harpellales. (a) The phylogenomic tree was reconstructed based on a concatenated alignment of 1,241 homologues using IQ-TREE v1.5.3 for maximum-likelihood analysis and ultrafast bootstrap analysis performed with 1,000 replicates (true Smittium and Parasmittium members are colored in red and blue, respectively, while non-Smittium Harpellales taxa are in black). (b to g) Genomic features of the Harpellales in the order of genome sizes (b), predicted gene models (c), single nucleotide polymorphism sites (d), signal peptide numbers (e), transmembrane helix numbers (f), and numbers of genes that have homologues in the Pathogen-Host Interaction (PHI) database (g).
FIG 4 Genome-wide allele frequency distribution among the single-copy orthologs of the nine Harpellales taxa (applied to 460 to 484 transcripts individually, allowing one taxon to be missing from among the nine). Eight of the nine taxa (all except C. stellatus) exhibited a cumulative percentage around the 50% position, suggesting a disomic tendency of the genomes. Specifically, 1,398 to 3,235 variable nucleotide positions were analyzed and plotted for each of the eight Harpellales (allele frequency interval of 10% to 90%) but only 238 for C. stellatus.
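The reasoning behind an allele frequency plot of this kind can be sketched in a few lines: at each variable position, the fraction of reads carrying the alternate allele is computed, positions outside the 10% to 90% interval are dropped, and the remaining frequencies are binned. A peak near 50% is what a two-copy (disomic) genome would be expected to show. The input format (pairs of reference and alternate read counts), the function name `allele_frequency_bins`, and the toy counts are all hypothetical; the caption does not describe the authors' actual pipeline.

```python
from collections import Counter


def allele_frequency_bins(sites, n_bins=10, lo_pct=10, hi_pct=90):
    """Bin alternate-allele frequencies of variable sites.

    sites: iterable of (ref_count, alt_count) read counts (hypothetical format).
    Sites whose alternate-allele frequency lies outside [lo_pct, hi_pct]
    percent are skipped. Returns a Counter mapping bin index to site count,
    where bin i covers frequencies from i/n_bins up to (i+1)/n_bins.
    """
    bins = Counter()
    for ref, alt in sites:
        depth = ref + alt
        if depth == 0:
            continue  # no coverage, nothing to measure
        # Integer comparisons avoid floating-point edge cases at the cutoffs.
        if not (lo_pct * depth <= 100 * alt <= hi_pct * depth):
            continue
        index = min(alt * n_bins // depth, n_bins - 1)
        bins[index] += 1
    return bins


if __name__ == "__main__":
    # Toy read counts, invented for illustration only.
    toy_sites = [(10, 10), (12, 8), (9, 11), (20, 0), (5, 15), (19, 1)]
    bins = allele_frequency_bins(toy_sites)
    for index in sorted(bins):
        print(f"{index * 10:3d}-{index * 10 + 10:3d}%: {bins[index]} site(s)")
```

With real data, the bin holding the most sites (`bins.most_common(1)`) indicates where the distribution peaks.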
FIG 5 Comparative genomics between the entomopathogenic fungi (Ascomycota in red and Zoopagomycota in green) and insect commensals of the Harpellales (in blue). (a) Venn diagram derived from interphylum homologues with the aim to sort out fungus-insect symbiotic core genes (FISCoGs), using pathogenic representatives both from Ascomycota and Zoopagomycota and commensals from Harpellales. (b) Box plot comparisons of genome-wide PHI genes, signal peptides, and transmembrane helices among the three groups. (c) Cladogram exhibiting the phylogenetic relationship of the included taxa based on 29 shared single-copy genes. (d) Heat map enrichment of the FISCoG toolbox among the insect-associated fungi (analyzed by removing the 1,612 false-positive hits with non-insect-associated Zoopagomycota genomes from those corresponding to the 1,620 shared genes in panel a). (e) Heat map comparison showing the enrichment pattern of genome-wide Pfam domains (detailed information for the fungus-insect symbiotic core domains is listed in Table S3).
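The filtering logic described for panels a and d amounts to a set intersection followed by a set difference: keep the homologues shared by all insect-associated groups, then remove those that also have hits in non-insect-associated genomes. With the counts given in the caption, 1,620 shared genes minus 1,612 false-positive hits leaves the 8 FISCoGs. The sketch below shows that logic on toy data; the function name `core_symbiosis_genes` and the homologue group identifiers are hypothetical, and the step of detecting homologues is not covered.

```python
def core_symbiosis_genes(insect_associated_groups, non_insect_associated):
    """Return homologue groups shared by every insect-associated group
    and absent from the non-insect-associated set.

    insect_associated_groups: list of iterables of homologue group IDs.
    non_insect_associated: iterable of homologue group IDs to exclude.
    """
    if not insect_associated_groups:
        return set()
    shared = set.intersection(*(set(g) for g in insect_associated_groups))
    return shared - set(non_insect_associated)


if __name__ == "__main__":
    # Toy homologue group IDs, invented for illustration only.
    ascomycota_pathogens = {"hg1", "hg2", "hg3", "hg4"}
    zoopagomycota_pathogens = {"hg1", "hg2", "hg3", "hg5"}
    harpellales_commensals = {"hg1", "hg2", "hg3", "hg6"}
    non_insect_relatives = {"hg2", "hg3", "hg7"}

    core = core_symbiosis_genes(
        [ascomycota_pathogens, zoopagomycota_pathogens, harpellales_commensals],
        non_insect_relatives,
    )
    print(sorted(core))

    # Arithmetic from the caption: shared genes minus false-positive hits.
    print(1620 - 1612)
```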
Information and detailed annotations of the FISCoG toolbox
| FISCoG | Description | GO name(s) | Homologue with | Subcellular | PHI hit(s) |
|---|---|---|---|---|---|
| FISCoG.g1 | Peroxisomal NADH | F, hydrolase activity | Regulation of concn of | Peroxisomal | N/A |
| FISCoG.g2 | Fasciclin domain- | C, fungal vacuole membrane, | Cell adhesion protein ( | Extracellular | PHI:4231 |
| FISCoG.g3 | Acyl-CoA N- | F, N-acetyltransferase activity | Involvement in intestinal | Cytoplasmic | PHI:5571 |
| FISCoG.g4 | Nuclear movement | N/A | Nuclear migration and | Cytoplasmic | PHI:2524 |
| FISCoG.g5 | F-box/LRR-repeat | F, protein kinase activity, ATP binding, | Ubiquitin ligase complex F-box | Cytoplasmic | PHI:733; |
| FISCoG.g6 | Platelet-activating | F, 1-alkyl-2- | Enzyme that catabolizes | Cytoplasmic | N/A |
| FISCoG.g7 | Putative SET-like | N/A | Related to growth control, | Nuclear | N/A |
| FISCoG.g8 | RNA-binding | F, RNA binding; C, cytoplasm; P, | Nucleotide binding; | Nuclear | N/A |
F, molecular function; C, cellular component; P, biological process; N/A, not available.