Kristian Barrett, Kristian Jensen, Anne S Meyer, Jens C Frisvad, Lene Lange.
Abstract
Fungi secrete an array of carbohydrate-active enzymes (CAZymes), reflecting their specialized habitat-related substrate utilization. Despite its importance for fitness, enzyme secretome composition is not used in fungal classification, since an overarching relationship between CAZyme profiles and fungal phylogeny/taxonomy has not been established. For 465 Ascomycota and Basidiomycota genomes, we predicted CAZyme-secretomes using a new peptide-based annotation method, Conserved-Unique-Peptide-Patterns, which enables functional prediction directly from sequence. We categorized each enzyme according to CAZy-family and predicted molecular function, thereby obtaining a list of "EC-Function;CAZy-Family" observations. These "Function;Family"-based secretome profiles were compared using a Yule-dissimilarity scoring algorithm, giving equal consideration to the presence and absence of individual observations. Assessment of "Function;Family" enzyme profile relatedness (EPR) across 465 genomes partitioned Ascomycota from Basidiomycota, placing Aspergillus and Penicillium among the Ascomycota. Analogously, we calculated CAZyme "Function;Family" profile-similarities among 95 Aspergillus and Penicillium species to form an alignment-free, EPR-based dendrogram. This revealed a stunning congruence between EPR categorization and phylogenetic/taxonomic grouping of the Aspergilli and Penicillia. Our analysis suggests EPR grouping of fungi to be defined both by "shared presence" and "shared absence" of CAZyme "Function;Family" observations. This finding indicates that CAZyme-secretome evolution is an integral part of fungal speciation, supporting integration of cladogenesis and anagenesis.
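The pairwise comparison described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used form of the Yule dissimilarity (the one implemented in SciPy), 2bc / (ad + bc), where a counts observations present in both genomes, d counts observations absent from both, and b and c count observations present in only one. The record does not state which implementation the authors used, so this form is an assumption, and the profiles and the function name `yule_dissimilarity` are hypothetical and not taken from the paper's data.

```python
def yule_dissimilarity(profile_a, profile_b, universe):
    """Yule dissimilarity between two presence/absence profiles.

    Profiles are sets of "Function;Family" strings; `universe` is the set of
    all observations seen across the compared genomes. Assumed form:
    2*b*c / (a*d + b*c), which ranges from 0 to 2.
    """
    a = len(profile_a & profile_b)               # present in both
    b = len(profile_a - profile_b)               # present only in A
    c = len(profile_b - profile_a)               # present only in B
    d = len(universe - (profile_a | profile_b))  # absent from both
    if b * c == 0:
        # Convention assumed here: no discordance on at least one side
        # gives a dissimilarity of zero.
        return 0.0
    return 2.0 * b * c / (a * d + b * c)


# Hypothetical toy data, for illustration only.
universe = {
    "3.2.1.4;GH5", "3.2.1.8;GH10", "3.2.1.8;GH11",
    "3.2.1.15;GH28", "3.2.1.21;GH3", "unknown;AA9",
}
genome_x = {"3.2.1.4;GH5", "3.2.1.8;GH10", "3.2.1.21;GH3"}
genome_y = {"3.2.1.4;GH5", "3.2.1.8;GH11", "3.2.1.21;GH3"}

score = yule_dissimilarity(genome_x, genome_y, universe)
```

Because the shared-absence count d enters the denominator alongside the shared-presence count a, two genomes that lack the same observations are scored as more alike, which matches the abstract's statement that presence and absence are given equal consideration.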
Year: 2020 PMID: 32198418 PMCID: PMC7083838 DOI: 10.1038/s41598-020-61907-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Map of selected fungi based on their predicted secreted CAZyme inventory, presented as a multidimensional scaling plot. Similarity mapping of secreted “Function;Family” annotated carbohydrate active enzymes from 465 representative genomes of species of Dikarya, visualized in two-dimensional space. In total, 295 different enzyme “Function;Family” observations were identified. The relative sizes of the dots represent the number of different enzyme “Function;Family” observations in each genome analyzed, ranging from 40 to 144. The distances were calculated as Yule distances based on in silico annotated carbohydrate active enzyme protein families combined with in silico prediction of enzyme function, represented by their respective EC number (if available). The clusters that represent members of the Ascomycota and Basidiomycota phyla, respectively, were defined by hierarchical clustering of the calculated distances among the genomes using a flat clustering threshold of 0.7. For illustrative purposes, all species in each of these two phylum clusters are connected with pink and yellow lines, respectively. The cluster defining Aspergillus and Penicillium, containing 95 species, was based on a threshold of 0.3. All the Aspergillus and Penicillium species are connected with red lines. The coordinates were obtained by conducting 50,000 different initiations; the map with the smallest final stress is shown.
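The flat clustering step in this caption, cutting a hierarchy at a distance threshold such as 0.7 or 0.3, can be sketched as follows. The caption does not name the linkage method, so single linkage is assumed here purely because it reduces to a compact union-find over all pairs at or below the threshold. The genome labels, distances, and the function name `flat_clusters` are hypothetical.

```python
def flat_clusters(names, dist, threshold):
    """Single-linkage flat clustering (linkage choice is an assumption).

    Two items share a cluster if a chain of pairwise distances, each no
    greater than `threshold`, connects them. `dist` maps frozenset pairs
    of names to a distance.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the cluster root, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for i, u in enumerate(names):
        for v in names[i + 1:]:
            if dist[frozenset((u, v))] <= threshold:
                parent[find(u)] = find(v)  # merge the two clusters

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical distances: three ascomycete-like and two basidiomycete-like genomes.
names = ["asc1", "asc2", "asc3", "bas1", "bas2"]
dist = {
    frozenset(("asc1", "asc2")): 0.2,
    frozenset(("asc1", "asc3")): 0.5,
    frozenset(("asc2", "asc3")): 0.6,
    frozenset(("bas1", "bas2")): 0.4,
}
for a in ("asc1", "asc2", "asc3"):
    for b in ("bas1", "bas2"):
        dist[frozenset((a, b))] = 0.9  # all cross-group pairs are distant

coarse = flat_clusters(names, dist, 0.7)
fine = flat_clusters(names, dist, 0.3)
```

In this toy example the looser threshold separates the two groups, while the tighter one keeps together only the closest pair, mirroring how the caption uses 0.7 for the phylum clusters and 0.3 for the Aspergillus and Penicillium cluster.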
Figure 2. Circular dendrogram representing the secreted carbohydrate active enzyme profile relatedness (EPR) of Aspergillus and Penicillium, presented with one representative genome of each of the fungal species. The distances are based on binary presence or absence assessment of “Function;Family” observation matches of the in silico predicted CAZyme secretomes from the genomes, using Yule dissimilarity. The blue rings concentrically dividing the EPR-based dendrogram in the middle indicate the scale and have a spacing of 0.15 (innermost) and 0.3 (outermost). Around the dendrogram, the labels associated with the individual genomes give genus, strain or isolate number, species, and section, respectively. A dashed line indicates sections having members with diverse habitats or an adjacent section whose members share the same habitat. The stylized images in the outermost area indicate the primary natural habitat (or ecological specialization) of the fungal species. The images, described clockwise in the order they first appear starting from section A. Terrei, are: Compost, dry Cereal, Tropical plants, Coffee, Wood, Nuts, Hay, Grapes, Plant soil, Maize, Grass, Fallen leaves, Dung, Desert plants, Cheese, Apple, Citrus and Silage. A dashed line indicates a section having more than one primary habitat. The asterisk on P. canescens indicates a revision of incorrect P. capsulatum species identification (see Supplementary Material, Fig. S2).
Summary of the enzyme “Function;Family” observations underlying the dendrogram in Fig. 2, organized according to the fungal sections in the dendrogram, starting from section Terrei.
“Number of species” states the number of genomes included from different fungal species in each section; “Total observations” gives the number of different “Function;Family” observations in each section; “Total different functions” is the total number of different EC numbers (functions) found in the section, i.e. the number of different enzyme functions annotated from the genomes (an unknown function counts as a “function”, but does not have an EC number); “Function overlap between families” describes the number of times an EC number is found in more than one CAZy family in the section; “Shared observations” describes the number of observations found in all members of a section; “Absent observations” states the number of observations that are not found in any member of the given section but are present in one or more of the other sections; “Ratio of present:absent” is the ratio of “Total observations” to “Absent observations”. The “present:absent ratio” obtained for each section was used to assign an enzyme profile type to that section: type III, with a ratio below one, indicates that members of the section are weak enzyme producers; type II, with a ratio between one and two, indicates medium enzyme producers; and type I, with a ratio above two, indicates strong enzyme producers.
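The per-section quantities defined in this legend can be sketched as simple set operations. This is an illustrative sketch only: the profiles and the function name `summarize_section` are hypothetical, and because the legend does not say how ratios of exactly one or two are classified, both boundary values are assigned to type II here as an assumption.

```python
def summarize_section(section_profiles, other_profiles):
    """Summarize one fungal section from per-genome observation sets.

    `section_profiles` holds one set of "Function;Family" observations per
    genome in the section; `other_profiles` holds the sets for genomes in
    all other sections.
    """
    total = set().union(*section_profiles)         # seen in any member
    shared = set.intersection(*section_profiles)   # seen in every member
    elsewhere = set().union(*other_profiles)
    absent = elsewhere - total                     # seen only in other sections

    ratio = len(total) / len(absent) if absent else float("inf")
    if ratio < 1:
        profile_type = "III"   # weak enzyme producers
    elif ratio > 2:
        profile_type = "I"     # strong enzyme producers
    else:
        profile_type = "II"    # medium; boundary handling is an assumption
    return {
        "total": len(total),
        "shared": len(shared),
        "absent": len(absent),
        "ratio": ratio,
        "type": profile_type,
    }


# Hypothetical placeholder observations "A".."G", for illustration only.
section = [{"A", "B", "C"}, {"A", "B", "D"}]
others = [{"A", "E"}, {"B", "F", "G"}]
summary = summarize_section(section, others)
```

Here the section has four observations in total, two shared by both members, and three that occur only in other sections, giving a present:absent ratio of 4/3 and hence type II.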
Enzyme observation overview for A. Aspergillus, A. Candidi, A. Flavi and P. Fasciculata in relation to action on the major polysaccharides cellulose, xylan, and pectin (these four fungal sections all have dry cereal as preferred habitats while being taxonomically diverse).
Orange-colored cells indicate the most prominent differences between the sections, contributing to their separation with regard to EPR profile and fungal section. Dots indicate presence of an individual enzyme observation in the given fungal species; orange cells with dots indicate the presence of a particular enzyme observation (“Function;Family”) among all members of a section, except where all the included species (19 in total) have the particular observation; empty orange cells indicate enzyme observations whose absence is shared among all members of a section.