Sergio Hernández1, Luís Franco1, Alejandra Calvo2, Gabriela Ferragut2, Antoni Hermoso1, Isaac Amela1, Antonio Gómez3, Enrique Querol1, Juan Cedano2.
Abstract
Multitasking, or moonlighting, is the capability of some proteins to execute two or more biochemical functions. Moonlighting proteins are usually discovered experimentally by serendipity. It would therefore be helpful if bioinformatics could predict this multifunctionality, especially given the large number of sequences produced by genome projects. In the present work, we analyze and describe several approaches that use sequences, structures, interactomics, and current bioinformatics algorithms and programs to address this problem. These approaches are (a) remote homology searches using PSI-BLAST, (b) detection of functional motifs and domains, (c) analysis of data from protein-protein interaction (PPI) databases, (d) matching the query protein sequence against 3D databases (e.g., with algorithms such as PISITE), and (e) mutation correlation analysis between amino acids with algorithms such as MISTIC. Programs designed to identify functional motifs/domains mainly detect the canonical function but usually fail to detect the moonlighting one, with Pfam and ProDom being the best methods. Remote homology search by PSI-BLAST combined with data from interactomics databases (PPIs) has the best performance. Structural information and mutation correlation analysis can help to map the functional sites. Mutation correlation analysis can only be used in very specific situations, because it requires multiply aligned sequences of the protein family, but it can suggest how the evolutionary acquisition of the second function took place. The multitasking protein database MultitaskProtDB (http://wallace.uab.es/multitask/), previously published by our group, has been used as a benchmark for all of the analyses.
Keywords: bioinformatics; moonlighting protein; multifunctional; multitasking; protein evolution; protein function
Year: 2015 PMID: 26157797 PMCID: PMC4478894 DOI: 10.3389/fbioe.2015.00090
Source DB: PubMed Journal: Front Bioeng Biotechnol ISSN: 2296-4185
Figure 1. An example of second function mapping using sequence approaches. In (A), a moonlighting protein sequence (red) is aligned with ClustalW to a moonlighting protein sequence from a different organism (black) that was found by a BLASTP search. In (B), the same approach is used, and the moonlighting protein sequence (red) is aligned with two monofunctional proteins (green/blue), each matching a different region of the moonlighting protein, thereby mapping the canonical and moonlighting functions of the original protein.
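The region-by-region mapping in panel (B) can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes each monofunctional hit has already been reduced to a hypothetical tuple of query start, query end (1-based, inclusive), and a function label, and it reports which labels cover each stretch of the query.

```python
def map_functions(query_length, hits):
    """Collapse aligned hits into contiguous query segments.

    hits: iterable of (start, end, function) tuples in 1-based,
    inclusive query coordinates (a hypothetical, simplified format).
    Returns a list of (start, end, functions) where functions is a
    sorted tuple of the labels covering that segment.
    """
    # Collect the set of function labels covering each query position.
    labels = [set() for _ in range(query_length)]
    for start, end, function in hits:
        for pos in range(start - 1, end):
            labels[pos].add(function)

    # Merge neighbouring positions that carry the same label set.
    segments = []
    for pos, funcs in enumerate(labels, start=1):
        key = tuple(sorted(funcs))
        if segments and segments[-1][2] == key:
            segments[-1] = (segments[-1][0], pos, key)
        else:
            segments.append((pos, pos, key))
    return segments


# Toy coordinates, invented for illustration only.
example_hits = [(1, 4, "canonical"), (7, 10, "moonlighting")]
for segment in map_functions(10, example_hits):
    print(segment)
```

Segments with an empty label tuple are query regions that no hit covers, which may be worth inspecting with more sensitive searches.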
Examples of moonlighting protein prediction combining PPI databases and Bypass.
| Canonical function | Moonlighting function | PPI partners (only some hits are shown) | Bypass output (only some hits are shown) |
|---|---|---|---|
| Phosphoglucose isomerase | Neurotrophic factor | GO:4842, autocrine motility factor receptor 2 | giI17380385 |
| Pyruvate kinase | Thyroid hormone-binding protein | GO:3707, nuclear hormone receptor member nhr-111 | giI20178296 |
| Ribosomal protein S3 (human) | Apurinic/apyrimidinic endonuclease | GO:31571, DNA damage binding protein 1 | giI290275 |
| Ure2 | Glutathione peroxidase | GO:6808, nitrogen regulatory protein | giI173152; gi449015276 |
| P0 ribosomal protein | DNA repair | GO:6281, FACT complex subunit SSRP1 | |
| Vhs3-phosphopantothenoylcysteine decarboxylase subunit Vhs3 | Regulator of serine/threonine protein phosphatase | GO:4724, serine/threonine-protein phosphatase PP-Z1 | gi\|254572327\|ref\|XP_002493273.1\| Negative regulatory subunit of the protein phosphatase 1 Ppz1p |
| Epsin | Organizing mitotic membranes/influencing spindle assembly | GO:7067, cell division control protein 2 homolog | gi\|2072301\|gb\|AAC60123.1\| mitotic phosphoprotein 90 |
| Alpha-crystallin A chain | Heat-shock protein | GO:6986, Heat shock protein beta-1 | gi\|1706112\|sp\|P02489.2\|CRYAA_HUMAN |
| Hexokinase | Transcriptional regulation | GO:16563, metallothionein expression activator | gi\|254573908\|ref\|XP_002494063.1\| Non-essential protein of unknown function required for transcriptional induction |
| Ribosomal protein L7 | Autogenous regulation of translation | GO:6414, 60S ribosomal protein L7a | gi\|339256006\|ref\|XP_003370746.1\| eukaryotic translation initiation factor 2C 2 |
| PIAS1 (E3 SUMO-protein ligase PIAS1) | Activation of p53 | GO:7569, cellular tumor antigen p53 | gi\|58176991\|pdb\|1V66\|A Chain A, solution structure of human P53 binding domain of Pias-1 |
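
The GO identifiers in the table are given in shortened form (for example, GO:4842), whereas Gene Ontology identifiers are conventionally written as "GO:" followed by seven zero-padded digits. When cross-referencing table entries against an annotation database, a small helper like the following could normalize them. The function name is hypothetical, and the regular expression only covers the spellings seen in this table.

```python
import re

# Matches "GO:" followed by optional whitespace and up to seven digits.
_GO_PATTERN = re.compile(r"GO:\s*(\d{1,7})")


def normalize_go_id(cell_text):
    """Return the first GO identifier in cell_text in zero-padded
    seven-digit form, or None if the text contains no GO identifier."""
    match = _GO_PATTERN.search(cell_text)
    if match is None:
        return None
    return f"GO:{int(match.group(1)):07d}"


print(normalize_go_id("GO: 6808 nitrogen regulatory protein"))  # GO:0006808
```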
Figure 2. Two examples of output from motif/domain programs. (A) The Blocks server identifies both functions of the protein Arg 2 in the top positions of the output. (B) The ProDom program shows two domains related to the canonical and moonlighting functions of aconitase.
Figure 3. Enolase mutation correlation analysis. The regions that have been redesigned to fit the new function of enolase (highlighted in green and navy blue) change the correlation matrix in the regions directly related to the new interaction. However, modifying a portion of the protein without compromising the network of internal interactions may require additional changes (depicted here in light blue) to maintain the correct conformation and the canonical function of the protein.
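To make the idea of a correlation matrix concrete, the sketch below computes uncorrected mutual information between two columns of a multiple sequence alignment, a common basic covariation measure. It is an assumption-laden illustration rather than the MISTIC implementation: dedicated tools apply further corrections and scoring that are not reproduced here, and the toy alignment is invented.

```python
from collections import Counter
from math import log2


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    msa: list of equal-length aligned sequences (strings).
    No gap handling or background correction is applied.
    """
    n = len(msa)
    col_i = [seq[i] for seq in msa]
    col_j = [seq[j] for seq in msa]
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))

    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 covary perfectly, column 2 is invariant.
toy_msa = ["AKG", "AKG", "DEG", "DEG"]
print(mutual_information(toy_msa, 0, 1))  # 1.0
print(mutual_information(toy_msa, 0, 2))  # 0.0
```

Computing this value for every pair of columns yields a matrix of the kind the figure describes, in which covarying positions stand out against conserved or independent ones.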
Figure 4. An example of second function mapping using structural approaches. The proteins shown in Figure 1 were compared structurally using SwissPDBViewer and UCSF Chimera. The 3D structure of the “red” moonlighting protein was predicted using Phyre, while the other structures were taken from the PDB. In (A), the sequence similarity found in Figure 1A is corroborated. In (B), the structural superposition of the three proteins aligned in Figure 1B emphasizes the utility of these methods for mapping the two functions of a moonlighting protein.
Figure 5. The problem of moonlighting prediction. (A) Databases containing a high heterogeneity of functions, such as Pfam-B, allow the identification of non-canonical functions that cannot be found by searching databases of patterns with high functional homology, such as Pfam-A. However, this comes with an increased rate of false positives compared with Pfam-A [as shown in (D)]. (B) Exploiting the variability of annotation in a non-redundant database also has its costs, because the oversaturation of canonical function annotations includes all sorts of synonyms. (C) Checking the reported supplementary documentation can help to uncover relevant details about the moonlighting function being explored. (D) The ratio between false positives and true positives gives an idea of the compromise between specificity and sensitivity. When the score thresholds are relaxed, new moonlighting functions can still be found, but the number of false positives increases more sharply.
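The trade-off in panel (D) can be illustrated with a short calculation of the false-positive to true-positive ratio at several score thresholds. This is only a sketch: the function name is hypothetical, the scores and labels are invented, and it assumes each hit has already been judged as a true or false positive against a benchmark.

```python
def fp_tp_ratio(hits, threshold):
    """Count true and false positives among hits scoring at or above
    threshold and return (tp, fp, ratio).

    hits: list of (score, is_true_positive) pairs.
    ratio is fp / tp, or None when no true positive passes the threshold.
    """
    kept = [is_tp for score, is_tp in hits if score >= threshold]
    tp = sum(1 for is_tp in kept if is_tp)
    fp = len(kept) - tp
    ratio = fp / tp if tp else None
    return tp, fp, ratio


# Invented scores for illustration; not data from the study.
toy_hits = [(90, True), (80, True), (60, False), (55, True),
            (40, False), (35, False), (30, False)]

for threshold in (70, 50, 30):
    print(threshold, fp_tp_ratio(toy_hits, threshold))
```

In this toy data, lowering the threshold from 70 to 50 gains one true positive at the cost of one false positive, while lowering it further to 30 adds only false positives, which mirrors the pattern the caption describes.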