Alejandro Barrera, Ana Alastruey-Izquierdo, María J Martín, Isabel Cuesta, Juan Antonio Vizcaíno.
Abstract
Over the past several years, fungal infections have shown an increasing incidence in the susceptible population and have caused high mortality rates. In parallel, multi-resistant fungi are emerging in human infections. Therefore, the identification of new potential antifungal targets is a priority. The first task of this study was to analyse the protein domain and domain architecture content of the 137 fungal proteomes (corresponding to 111 species) available in UniProtKB (UniProt KnowledgeBase) by January 2013. The resulting list of core and exclusive domains and domain architectures is provided in this paper. It delineates the different levels of fungal taxonomic classification: phylum, subphylum, order, genus and species. The analysis highlighted Aspergillus as the most diverse genus in terms of exclusive domain content. In addition, we investigated which domains could be considered promiscuous in the different organisms. As an application of this analysis, we explored three different ways to detect potential targets for antifungal drugs. First, we compared the domain and domain architecture content of the human and fungal proteomes, and identified those domains and domain architectures only present in fungi. Second, we looked for information in public repositories regarding fungal pathways in which proteins containing promiscuous domains could be involved. Three pathways were identified as a result: lovastatin biosynthesis, xylan degradation and biosynthesis of siroheme. Finally, we classified a subset of the studied fungi into five groups depending on their occurrence in clinical samples. We then looked for exclusive domains in the groups that were more relevant clinically and determined which of them had the potential to bind small molecules. Overall, this study provides a comprehensive analysis of the available fungal proteomes and shows three approaches that can be used as a first step in the detection of new antifungal targets.
Year: 2014 PMID: 25033262 PMCID: PMC4102429 DOI: 10.1371/journal.pcbi.1003733
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Figure 1. Distribution of number of Pfam domains and domain architectures found for selected species.
Figure 2. Distribution of protein domains (A) and domain architectures (B) exclusively found in the different fungal subphyla.
Figure 3. Distribution of Pfam domains and domain architectures per genus.
In parentheses, the number of species and the number of strains that belong to a given genus are indicated. The area occupied by each genus corresponds to the number of exclusive domain architectures, whereas the colour correlates with the number of exclusive domains present among those architectures (calculated as the number of domains divided by the number of domain architectures).
List of genera including more than one species.
| Genera (Nb. species/Nb. strains) | Sum of the number of strain-specific architectures | Number of genus-specific architectures | Increase (%) |
| | 20 | 39 | 95.0 |
| | 88 | 140 | 59.0 |
| | 59 | 88 | 49.0 |
| | 131 | 152 | 16.8 |
| | 218 | 246 | 12.8 |
| | 839 | 943 | 12.4 |
| | 244 | 273 | 11.8 |
| | 149 | 160 | 7.4 |
| | 230 | 245 | 6.5 |
| | 351 | 372 | 6.0 |
| | 160 | 154 | 3.9 |
| | 367 | 378 | 3.0 |
| | 149 | 145 | 2.6 |
| | 387 | 392 | 1.3 |
| | 43 | 43 | 0 |
The table is sorted by the increase in the number of exclusive architectures in the genus when compared to the individual species. In parentheses, the number of species and the total number of strains are indicated.
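The "Increase (%)" column can be read as a relative change from the strain-level sum to the genus-level count. The sketch below assumes that formula (the source does not state it explicitly) and applies it to four rows copied from the table; the function name `percent_increase` is hypothetical.

```python
def percent_increase(strain_sum: int, genus_count: int) -> float:
    """Relative change from the strain-level sum to the genus-level count, in percent.

    Assumed formula: 100 * (genus_count - strain_sum) / strain_sum, rounded to one decimal.
    """
    if strain_sum <= 0:
        raise ValueError("strain_sum must be positive")
    return round(100.0 * (genus_count - strain_sum) / strain_sum, 1)


# (sum of strain-specific architectures, genus-specific architectures), copied from the table
rows = [(20, 39), (218, 246), (839, 943), (43, 43)]
increases = [percent_increase(s, g) for s, g in rows]
print(increases)
```

For these four rows the assumed formula gives the percentages listed in the table; it is offered as a reading aid, not as the authors' stated method.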
Protein domains that most frequently appear among the 25 most promiscuous domains across all the fungal organisms.
| Pfam domain name | Description | Times in top 25 ranking of promiscuous domains | Average number of bigrams | Gene ontology (GO) terms |
| AAA* | AAA family proteins often perform chaperone-like functions that assist in the assembly, operation, or disassembly of protein complexes | 132 | 17 | GO:0005524 ATP binding |
| GATase* | Glutamine amidotransferase class-I | 123 | 7 | - |
| SH3_1* | SH3 (Src homology 3) domains are often indicative of a protein involved in signal transduction related to cytoskeletal organization | 122 | 11 | GO:0005515 protein binding |
| PX | PX domains bind to phosphoinositides. | 117 | 10 | GO:0005515 protein binding; GO:0007154 cell communication; GO:0035091 phosphatidylinositol binding |
| PH* | PH stands for pleckstrin homology | 116 | 9 | GO:0005515 protein binding; GO:0005543 phospholipid binding |
| SNF2_N | SNF2 family N-terminal domain. This domain is found in proteins involved in a variety of processes including transcription regulation, DNA repair, DNA recombination and chromatin unwinding | 115 | 12 | GO:0003677 DNA binding; GO:0005524 ATP binding |
| Helicase_C | Helicase conserved C-terminal domain | 108 | 20 | GO:0003676 nucleic acid binding; GO:0004386 helicase activity; GO:0005524 ATP binding |
| MMR_HSR1 | The full-length GTPase protein is required for the complete activity of the protein interacting with the 50S ribosome and binding of both adenine and guanine nucleotides, with a preference for guanine nucleotide | 98 | 8 | GO:0005525 GTP binding |
| DEP* | Domain found in Dishevelled, Egl-10, and Pleckstrin (DEP). The DEP domain is responsible for mediating intracellular protein targeting and regulation of protein stability in the cell | 89 | 5 | GO:0035556 intracellular signal transduction |
| UBA* | UBA/TS-N domain. Found in several proteins having connections to ubiquitin and the ubiquitination pathway | 88 | 7 | GO:0005515 protein binding |
| TPR_1* | Tetratricopeptide repeat | 86 | 9 | GO:0005515 protein binding |
| zf-RING_2 | Ring finger domain | 80 | 11 | GO:0005515 protein binding; GO:0008270 zinc ion binding |
| C1_1* | Phorbol esters/diacylglycerol binding domain (C1 domain). This domain is also known as the Protein kinase C conserved region 1 (C1) domain. | 76 | 5 | GO:0035556 intracellular signal transduction |
| JmjC* | The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins are hydroxylases that catalyse a novel histone modification | 74 | 5 | GO:0005515 protein binding |
| UCH* | Ubiquitin carboxyl-terminal hydrolase | 72 | 9 | GO:0004221 ubiquitin thiolesterase activity; GO:0006511 ubiquitin-dependent protein catabolic process |
| BRCT* | BRCA1 C-terminus (BRCT) domain. Canonical BRCT phosphopeptide interaction cleft at a groove between the BRCT domains | 71 | 7 | - |
| PHD* | PHD folds into an interleaved type of Zn-finger chelating two Zn ions in a similar manner to that of the RING and FYVE domains | 66 | 9 | GO:0005515 protein binding |
| UBACT | Repeat in ubiquitin-activating (UBA) protein | 65 | 5 | GO:0005524 ATP binding; GO:0006464 cellular protein modification process; GO:0008641 small protein activating enzyme activity |
| TPR_2 | Tetratricopeptide repeat | 65 | 7 | - |
| RhoGEF | RhoGEF domain. Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases. Also called Dbl-homologous (DH) domain. It appears that Pfam:PF00169 domains invariably occur C-terminal to RhoGEF/DH domains | 57 | 6 | GO:0005089 Rho guanyl-nucleotide exchange factor activity; GO:0035023 regulation of Rho protein signal transduction |
| CBM_1 | Fungal cellulose binding domain | 24 | 13 | GO:0004553 hydrolase activity, hydrolyzing O-glycosyl compounds; GO:0005576 extracellular region; GO:0005975 carbohydrate metabolic process; GO:0030248 cellulose binding |
Domains marked with an asterisk had been previously identified as promiscuous in animals, plants and fungi [37].
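The "Average number of bigrams" column suggests that promiscuity is assessed through the distinct pairs of adjacent domains a domain takes part in. The following is a minimal sketch under that assumption, not the scoring used in the study: the function names, the `~` separator and the toy architectures are all hypothetical.

```python
from collections import defaultdict


def bigram_counts(architectures, sep="~"):
    """Map each domain to the number of distinct adjacent domain pairs it occurs in."""
    pairs = defaultdict(set)
    for arch in architectures:
        domains = arch.split(sep)
        # Walk over neighbouring domains; a single-domain architecture yields no pair.
        for left, right in zip(domains, domains[1:]):
            pairs[left].add((left, right))
            pairs[right].add((left, right))
    return {domain: len(seen) for domain, seen in pairs.items()}


def top_promiscuous(architectures, n=25, sep="~"):
    """Return the n domains with the most distinct bigrams (ties broken by name)."""
    counts = bigram_counts(architectures, sep)
    return sorted(counts, key=lambda d: (-counts[d], d))[:n]


# Toy architectures for illustration only; they are not taken from the paper.
toy = ["AAA~Helicase_C", "SNF2_N~Helicase_C", "AAA~PH", "PH"]
counts = bigram_counts(toy)
print(counts)
print(top_promiscuous(toy, n=2))
```

Running the ranking once per proteome and tallying how often a domain lands in the top 25 would give a figure comparable to the "Times in top 25 ranking" column, again assuming this simplified scoring.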
Figure 4Distribution of protein domains (A), domain architectures (B) and Pfam clans (C) shared between the fungal species included in this study and Homo sapiens.
The category “fungi” refers to the set of 137 organisms analysed.
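The comparison between fungal and human domain content comes down to set operations over per-proteome domain lists. The sketch below illustrates the idea with hypothetical function names and made-up toy proteomes; it assumes each proteome has already been reduced to a set of Pfam domain names.

```python
def fungal_union(fungal_proteomes):
    """All domains seen in at least one fungal proteome."""
    return set().union(*fungal_proteomes.values())


def exclusive_to_fungi(fungal_proteomes, human_domains):
    """Domains present in at least one fungal proteome and absent from human."""
    return fungal_union(fungal_proteomes) - set(human_domains)


def shared_with_human(fungal_proteomes, human_domains):
    """Domains present in both the fungal set and the human proteome."""
    return fungal_union(fungal_proteomes) & set(human_domains)


def core_domains(fungal_proteomes):
    """Domains present in every fungal proteome of the collection."""
    sets = [set(domains) for domains in fungal_proteomes.values()]
    return set.intersection(*sets) if sets else set()


# Toy data for illustration only; the memberships below are invented.
fungi = {
    "fungus_A": {"AAA", "CBM_1", "PH"},
    "fungus_B": {"AAA", "CBM_1"},
}
human = {"AAA", "PH", "SH3_1"}

print(exclusive_to_fungi(fungi, human))
print(shared_with_human(fungi, human))
print(core_domains(fungi))
```

The same functions work for domain architectures if each architecture string is treated as a single set element.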
Pathways and protein domain architectures related to fungal promiscuous domains. The promiscuous domains are indicated in bold letters.
| Protein(s) originally annotated | Domain architecture | Number of Proteins | Number of species | Metabolic pathway (UniPathway) | Metabolic pathway description |
| Q0C8M3 | ketoacyl-synt∼Ketoacyl-synt_C∼Acyl_transf_1∼PS-DH∼Methyltransf_12∼KR∼PP-binding∼Condensation | 4 | 4 | UPA00875: lovastatin biosynthesis | Biosynthesis of lovastatin, an HMG-CoA reductase inhibitor produced by the fungus |
| Q4WBW4; A1DBP9 | Esterase_phd∼CBM_1 | 34 | 22 | UPA00114: xylan degradation | Degradation of xylan, a polymer of xylose residues |
| P15807; O14172 | NAD_binding_7∼Sirohm_synth_M∼Sirohm_synth_C | 125 | 96 | UPA00262: siroheme biosynthesis | Biosynthesis of siroheme, the cofactor for sulfite and nitrite reductases. Siroheme is formed by methylation, oxidation and iron insertion into the tetrapyrrole uroporphyrinogen III (Uro-III) |
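To see how a pathway can be linked to a domain of interest, the architecture strings in the table can be split on the `∼` separator and checked for membership. This is a minimal sketch: the function name `pathways_with` is hypothetical, and the query set contains only CBM_1 (taken from the promiscuous-domain table) for illustration, since the bold markup identifying the promiscuous domains is not preserved in this text.

```python
SEP = "∼"  # separator used in the "Domain architecture" column of the table

# Architecture strings copied from the table, keyed by UniPathway identifier.
architectures = {
    "UPA00875": "ketoacyl-synt∼Ketoacyl-synt_C∼Acyl_transf_1∼PS-DH∼Methyltransf_12∼KR∼PP-binding∼Condensation",
    "UPA00114": "Esterase_phd∼CBM_1",
    "UPA00262": "NAD_binding_7∼Sirohm_synth_M∼Sirohm_synth_C",
}


def pathways_with(domains_of_interest, pathway_architectures, sep=SEP):
    """Return {pathway: [matching domains]} for architectures containing any queried domain."""
    hits = {}
    for pathway, arch in pathway_architectures.items():
        found = [d for d in arch.split(sep) if d in domains_of_interest]
        if found:
            hits[pathway] = found
    return hits


# Illustrative query only; the full set of promiscuous domains is not given here.
hits = pathways_with({"CBM_1"}, architectures)
print(hits)
```

With the complete list of promiscuous domains as the query set, the same lookup would be expected to flag every row of the table.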
Protein domains found exclusively in proteins from Groups 3 and 4 of clinical isolates.
| Pfam domain name | Domain description | Pfam clan information | Group |
| ATP1G1_PLM_MAT8 | ATP1G1/PLM/MAT8 family | - | 3 |
| CTP_transf_3 | Cytidylyltransferase. This family consists of two main Cytidylyltransferase activities: 1) 3-deoxy-manno-octulosonate cytidylyltransferase; 2) acylneuraminate cytidylyltransferase. NeuAc cytidylyltransferase of | CL0110: GT-A. This is the GT-A clan that contains diverse glycosyltransferases that possess a Rossmann like fold | 3 |
| FRG | FRG domain. This presumed domain contains a conserved N-terminal (F/Y)RG motif. It is functionally uncharacterised | - | 3 |
| HI0933_like | HI0933-like protein | CL0063: NADP_Rossmann. A class of redox enzymes is composed by two domain proteins. One domain, termed the catalytic domain, confers substrate specificity and the precise reaction of the enzyme. The other domain, which is common to this class of redox enzymes, is a Rossmann-fold domain | 3 |
| PTS-HPr | PTS HPr component phosphorylation site | - | 3 |
| SdiA-regulated | SdiA-regulated. This family represents a conserved region approximately within a number of hypothetical bacterial proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. Some family members contain the Pfam:PF01436 repeat | CL0186: Beta_propeller. This large clan contains proteins that contain beta propellers. These are composed of between 6 and 8 repeats. The individual repeats are composed of a four stranded sheet | 3 |
| Sugarporin_N | Maltoporin periplasmic N-terminal extension. This domain would appear to be the periplasmic, N-terminal extension of the outer membrane maltoporins | - | 3 |
| TIR_2 | TIR domain. This is a family of bacterial Toll-like receptors | CL0173: STIR. Both members of this clan are thought to be involved in TOLL/IL1R-like pathways, by mediating protein-protein interactions between pathway components. The N-termini of SEFIR and TIR domains are similar, but the domains are more divergent towards the C-terminus | 3 |
| Uma2 | Putative restriction endonuclease. This family consists of hypothetical proteins that are greatly expanded in cyanobacteria. The proteins are found sporadically in other bacteria. A small number of member proteins also contain Pfam:PF02861 domains that are involved in protein interactions. Solutions of several structures for members of this family show that it is likely to be acting as an endonuclease | CL0236: PDDEXK. This clan includes a large number of nuclease families related to holliday junction resolvases | 3 |
| FixP_N | N-terminal domain of cytochrome oxidase-cbb3, FixP. This is the N-terminal domain of FixP, the cytochrome oxidase type-cbb3. The exact function is not known | - | 3 |
| MFMR | G-box binding protein MFMR. It is between 150 and 200 amino acids in length. The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain), whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C, classified according to motif composition | - | 3 |
| HEPN | HEPN domain | CL0291: KNTase_C. This alpha helical domain is found associated with a variety of nucleotidyltransferase domains | 4 |
| Keratin_B2_2 | Keratin, high sulfur B2 protein | CL0520: Keratin_assoc. Families in this clan are cysteine-rich and are from proteins associated with Keratin | 4 |