
MoonProt 2.0: an expansion and update of the moonlighting proteins database.

Chang Chen1, Shadi Zabad2, Haipeng Liu3, Wangfei Wang1, Constance Jeffery1,4.   

Abstract

MoonProt 2.0 (http://moonlightingproteins.org) is an updated, comprehensive and open-access database storing expert-curated annotations for moonlighting proteins. Moonlighting proteins contain two or more physiologically relevant distinct functions performed by a single polypeptide chain. Here, we describe developments in the MoonProt website and database since our previous report in the Database Issue of Nucleic Acids Research. For this V 2.0 release, we expanded the number of proteins annotated to 370 and modified several dozen protein annotations with additional or updated information, including more links to protein structures in the Protein Data Bank, compared with the previous release. The new entries include more examples from humans and several model organisms, more proteins involved in disease, and proteins with different combinations of functions. The updated web interface includes a search function using BLAST to enable users to search the database for proteins that share amino acid sequence similarity with a protein of interest. The updated website also includes additional background information about moonlighting proteins and an expanded list of links to published articles about moonlighting proteins.
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.


Year:  2018        PMID: 29126295      PMCID: PMC5753272          DOI: 10.1093/nar/gkx1043

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

MoonProt is an expert, manually curated, non-redundant resource of information about moonlighting proteins. Moonlighting proteins are proteins in which more than one physiologically relevant, discrete function is performed by a single polypeptide chain (1–3). For example, the taxon-specific crystallins are lens structural proteins in the eyes of several species and metabolic enzymes in other tissues (4). Moonlighting proteins are found throughout the evolutionary tree and perform many kinds of functions (1–11). They are usually found through serendipity because they lack a shared sequence or structural feature that would indicate that a protein has multiple functions, and information about them is scattered across many different publications. A database therefore provides a way for researchers to learn about these proteins and to find out whether a protein of interest is a known moonlighting protein or is related to one. In addition, the collection of information about known moonlighting proteins can aid in understanding the connections between protein structure and function, determining the functions of genes identified in newly sequenced genomes, interpreting proteomics results, and annotating protein sequence and structural databases. Information about the structures and functions of moonlighting proteins can be helpful in understanding the evolution of protein function, which can also help in the design of proteins with novel functions. In 2014, our lab constructed the open-access web server MoonProt, the Moonlighting Proteins Database (http://www.moonlightingproteins.org/) (12). In this paper, we present the latest version of MoonProt. Since its first development three years ago, the database has grown to include annotations for 370 proteins, the website interface has been redesigned, and information about individual moonlighting proteins and about moonlighting proteins in general has been updated.

MATERIALS AND METHODS

Selection of moonlighting proteins included in the database

For inclusion of a protein in the MoonProt Database, peer-reviewed published biochemical, biophysical, mutagenic, or other data supporting the presence of multiple physiologically relevant functions were required and were critically reviewed by the PI. Proteins were not included if the ‘multiple functions’ are due to gene fusions, different RNA splice variants, the same function in two different locations, pleiotropic effects on multiple pathways or multiple physiological processes, or a family of proteins in which the different functions are performed by different proteins. Proteins were also not included if the ‘multiple functions’ are simply different aspects of the same function (e.g. ‘membrane protein’ and ‘transmembrane receptor’).

Information included about individual proteins

Information about each protein was manually curated from published journal articles and online resources as described for Version 1.0 (12). The entry for each protein includes a description of each function and a list of references for publications providing experimental evidence of that function. When available, information is included about the specific cellular location in which the protein exhibits each function. Importantly, the specific species in which each protein has two or more functions was identified and included because a homologue from another species might or might not have both functions. Amino acid sequences were identified using UniProtKB (13) or PubMed [http://www.ncbi.nlm.nih.gov/pubmed/] resources and are included in FASTA format. Those sequences were used with BLAST [http://blast.ncbi.nlm.nih.gov/Blast.cgi] to identify structures in the Protein Data Bank (14) that correspond to the amino acid sequence, if available. GO terms (15) were obtained from UniProtKB (13), and Enzyme Commission (EC) numbers are included in order to illustrate the different types of proteins included. UniProt entry IDs are included as links for easy connection to external resources.
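Because the sequences are distributed in FASTA format, a reader who downloads them may want to load them into a program. The following is a minimal sketch of a FASTA reader in Python; it is not part of MoonProt, and the function name `parse_fasta` and the example header and sequence are made up purely for illustration.

```python
from typing import Dict, Iterable


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into a {header: sequence} mapping.

    A header line starts with '>'; the lines that follow, up to the next
    header, are concatenated into a single upper-case sequence string.
    """
    records: Dict[str, str] = {}
    header = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)  # close the previous record
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records[header] = "".join(chunks)  # close the final record
    return records


# Hypothetical record: header and residues are placeholders, not a MoonProt entry.
example = """>example_protein hypothetical moonlighting entry
MKTAYIAKQR
QISFVKSHFS
"""
sequences = parse_fasta(example.splitlines())
for name, seq in sequences.items():
    print(name, len(seq))
```

The sketch keeps the whole header line as the key; a real pipeline would probably split out the accession number, but the header layout used by the database is not specified here.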

Database architecture and web interface

The database uses MySQL (http://www.mysql.com) for data storage, together with PHP 7.1 (http://www.php.net), HTML (HyperText Markup Language), and CSS (Cascading Style Sheets) for construction of the new interface. WordPress, a content management system (CMS) built on modern web technologies, was used to help streamline the software development process.
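The schema of the MoonProt MySQL database is not described in this paper. As a purely hypothetical illustration of how a relational store can hold one protein with two or more documented functions, and how a text search over the annotations might be expressed, the sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL. All table names, column names, and rows are assumptions invented for the example.

```python
import sqlite3

# Hypothetical schema: these table and column names are illustrative only
# and are NOT taken from the MoonProt database.
SCHEMA = """
CREATE TABLE protein (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    uniprot_id TEXT,
    species TEXT,
    sequence TEXT
);
CREATE TABLE protein_function (
    id INTEGER PRIMARY KEY,
    protein_id INTEGER NOT NULL REFERENCES protein(id),
    description TEXT NOT NULL,
    reference TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding one placeholder protein."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO protein (id, name, uniprot_id, species, sequence) "
        "VALUES (?, ?, ?, ?, ?)",
        (1, "demo enzyme", "PLACEHOLDER", "Demo species", "MKTAYIAKQR"),
    )
    # One protein row, two function rows: the defining feature of an entry.
    conn.executemany(
        "INSERT INTO protein_function (protein_id, description, reference) "
        "VALUES (?, ?, ?)",
        [
            (1, "placeholder catalytic function", "placeholder reference A"),
            (1, "placeholder cell-surface function", "placeholder reference B"),
        ],
    )
    conn.commit()
    return conn


def text_search(conn: sqlite3.Connection, term: str) -> list:
    """Return names of proteins whose name, species, or function text contains term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT DISTINCT p.name
        FROM protein AS p
        LEFT JOIN protein_function AS f ON f.protein_id = p.id
        WHERE p.name LIKE ? OR p.species LIKE ? OR f.description LIKE ?
        ORDER BY p.name
        """,
        (pattern, pattern, pattern),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_demo_db()
print(text_search(conn, "surface"))
```

Storing functions in their own table, rather than as columns on the protein row, lets an entry carry any number of functions, each with its own references.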

RESULTS

New developments in MoonProt

Additional proteins and updated annotations

The MoonProt Database version 2.0 is now available at www.moonlightingproteins.org and provides information about hundreds of moonlighting proteins for which experimental evidence is available confirming the presence of more than one function. The database has grown by over one third since our last report, with an additional 90 moonlighting proteins added based on information from the peer-reviewed literature. At the time of writing, the database includes 370 proteins. The new entries increase the number of human proteins included to 73, with an increase in the number of proteins from several model organisms such as Saccharomyces cerevisiae (34 proteins) and Escherichia coli (31 proteins). As in version 1.0, most of the new entries have catalytic activities as one or more of their functions. There is also an increase in the number of proteins that are enzymes or chaperones inside the cell and have a second function on the cell surface or when secreted into the extracellular fluid (e.g. blood). Many of these proteins play important roles in health and disease. In prokaryotes, cytoplasmic enzymes can have a second role as a secreted signaling protein that affects the host immune system or as a cell surface receptor for host proteins. This can play a key role in infection by pathogens, but even commensal or ‘good’ bacteria have been found to make use of intracellular/surface moonlighting proteins to interact with the host. Even our own cells make use of cytoplasmic proteins on the cell surface: several new additions to the database are cytosolic enzymes that are also found on the surface of sperm and are involved in sperm and egg interactions during fertilization. Along with adding more proteins to the database, we have updated the annotation for many of the proteins, including more links to protein structures in the Protein Data Bank. For some proteins, additional references have been included, and a few dozen outdated UniProt IDs have been replaced with updated IDs.

New web interface

Since our last publication, we have developed a website with a new interface located at www.moonlightingproteins.org that gives access to the manually curated information about moonlighting proteins. The front page/home page, which is now also accessible with full functionality on mobile devices, includes a panel of summary information and several mechanisms to access the data. Several of the previous interaction options are also available, including a Proteins link that leads to a list of all the proteins in the database. Clicking on the protein names in the Proteins list will lead to the individual Protein Details page that displays the annotation information for that protein (Figure 1). Other links on the home page lead to general information about moonlighting proteins (FAQs), review articles about moonlighting proteins (Publications), and references for resources used in annotating the database (Resources). The information in each of these pages has been expanded and updated.
Figure 1.

Example of a protein annotation page. Each protein page contains the names of the protein, a UniProt accession number, the species of organism for which the protein has been shown to have more than one function (homologues of a moonlighting protein might have only one of the functions), GO terms, the length of the amino acid sequence, the amino acid sequence in FASTA format, PDB IDs for any available protein structures in the Protein Data Bank, descriptions of at least two functions, links to peer-reviewed publications describing experiments demonstrating the protein performs each function, and Enzyme Commission numbers (if an enzyme).


BLAST search function added

On the homepage, an updated Search link leads to a page with two types of search options: a text search and a BLAST sequence similarity search. The Search box enables a text search of all the annotated information in the database, whereas the first version of the database allowed a search of only some of the categories of information. The search returns a list of protein entries containing the search term. A second box on the Search page, labeled BLAST, enables use of the NCBI-blast-2.6.0+ implementation of BLAST (Basic Local Alignment Search Tool) (16) to search the database for moonlighting proteins that share sequence similarity with a query sequence. Users can paste an amino acid sequence (in the single-letter code) in the box, and the search returns a sorted list of protein entries ranked by their similarity to the query sequence (Figure 2). By using this feature, a user can determine whether their protein of interest is a known moonlighting protein or whether any of the known moonlighting proteins share sequence similarity with it.
Figure 2.

Example of the output of a BLAST query. Users can supply the amino acid sequence of a protein of interest and check if that protein or a homologous protein is in the MoonProt Database. In this example, the user submitted a fragment of the sequence for glycyl-tRNA synthetase, ‘FNLMFKTFIGPGGNMPGYLRPETAQGIFLNFKRLLEFNQGKLPFAAAQIGNSFRNEISPRSGLIRVREFTMAEIEHFVDPSEKDHPKFQNVADLHLYLYSAKAQVSGQSARKMRLGDAVEQGVINNTVLGYFIGRIYLYLTKVGISPDKLRFRQHMENEMAHYACDCWDAESKTSYGWIEIVGCADRSCYDLSCHARATKVPLVAEKPLKEPKTVNV’. The search returns a sorted list of protein names ranked by their similarity to glycyl-tRNA synthetase, the query sequence. Clicking on the link for each protein name leads to its protein page.
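How the website orders its hits internally is not stated in this paper. As a hedged sketch of what ranking BLAST hits can look like, the Python example below parses the standard 12-column tabular output of BLAST+ (`-outfmt 6`: qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore) and sorts hits by bit score. The choice of bit score as the sort key, the names `Hit`, `parse_tabular_hits` and `rank_hits`, and all values in the sample rows are assumptions made for illustration.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    subject_id: str          # column 2 (sseqid): the database entry that matched
    percent_identity: float  # column 3 (pident)
    evalue: float            # column 11 (evalue)
    bit_score: float         # column 12 (bitscore)


def parse_tabular_hits(text: str) -> List[Hit]:
    """Parse tab-separated BLAST+ '-outfmt 6' lines into Hit records."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if len(fields) < 12:
            continue  # not a complete 12-column row
        hits.append(
            Hit(fields[1], float(fields[2]), float(fields[10]), float(fields[11]))
        )
    return hits


def rank_hits(hits: List[Hit]) -> List[Hit]:
    """Order hits from most to least similar: highest bit score first,
    with the lower E-value breaking ties (an assumed ranking rule)."""
    return sorted(hits, key=lambda h: (-h.bit_score, h.evalue))


# Made-up rows for illustration; they are not real MoonProt search results.
sample = "\n".join([
    "query\tentry_B\t45.2\t210\t100\t5\t1\t210\t3\t205\t1e-40\t150.0",
    "query\tentry_A\t100.0\t218\t0\t0\t1\t218\t1\t218\t1e-150\t450.0",
    "query\tentry_C\t30.1\t120\t80\t4\t10\t129\t20\t139\t2e-05\t42.5",
])
ranked = rank_hits(parse_tabular_hits(sample))
for hit in ranked:
    print(hit.subject_id, hit.bit_score, hit.evalue)
```

In this made-up sample, the hit with 100% identity and the highest bit score sorts to the top, which is the pattern a user would expect when the query itself is already in the database.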


CONCLUSIONS AND PERSPECTIVES

The MoonProt Database version 2.0 is now available at www.moonlightingproteins.org and provides a centralized, organized resource containing information about 370 moonlighting proteins for which experimental evidence is available for more than one function. Most moonlighting proteins have been discovered through serendipity, and the absence of a common physical or sequence characteristic among them prevents the development of a robust algorithm for accurately predicting the presence of moonlighting functions. This database, with its collection of information about hundreds of moonlighting proteins, provides a resource for labs interested in developing computational methods for predicting protein functions based on sequence, structure, cellular localization, protein–protein interactions, or other characteristics. It also includes links to structures in the Protein Data Bank that could be used by synthetic biologists as a guide for designing proteins that can perform more than one function. We note that MoonProt 2.0 might be more useful for some of these purposes than another recent resource describing multifunctional proteins (17) because MoonProt includes only proteins for which biochemical or biophysical experiments demonstrated that the multiple functions are performed by a single polypeptide chain and are not due to different functions of different proteins within a large multiprotein complex, to the effects of pleiotropy, or to other similar mechanisms. We continue to add annotations to the MoonProt Database as new peer-reviewed publications about moonlighting proteins become available and as new protein structures are deposited in the Protein Data Bank. The MoonProt Database is likely to grow considerably in the next few years as the discovery of protein functions is aided by large-scale functional proteomics studies. In addition, the information available about the known moonlighting proteins is likely to increase as new protein structures are solved.

AVAILABILITY AND LICENSE

The MoonProt Database is freely available via a user-friendly graphical user interface (GUI) at the web address www.moonlightingproteins.org. The interface enables text search for a protein name, species, or a UniProtKB or PDB identifier and a BLAST search using an amino acid sequence in the one letter code. The user can also browse a list of all the proteins in the database. The database is ‘read and search only’ by the public, but additional information about the known moonlighting proteins and suggestions of other proteins that might also be moonlighting are welcome and can be sent to the curators for possible inclusion in the database.
  16 in total

Review 1.  Moonlighting proteins.

Authors:  C J Jeffery
Journal:  Trends Biochem Sci       Date:  1999-01       Impact factor: 13.807

2.  Moonlighting proteins: old proteins learning new tricks.

Authors:  Constance J Jeffery
Journal:  Trends Genet       Date:  2003-08       Impact factor: 11.639

Review 3.  Trigger enzymes: bifunctional proteins active in metabolism and in controlling gene expression.

Authors:  Fabian M Commichau; Jörg Stülke
Journal:  Mol Microbiol       Date:  2007-12-11       Impact factor: 3.501

Review 4.  Moonlighting proteins--an update.

Authors:  Constance J Jeffery
Journal:  Mol Biosyst       Date:  2009-02-03

Review 5.  Essential nontranslational functions of tRNA synthetases.

Authors:  Min Guo; Paul Schimmel
Journal:  Nat Chem Biol       Date:  2013-03       Impact factor: 15.040

6.  Recruitment of enzymes as lens structural proteins.

Authors:  G Wistow; J Piatigorsky
Journal:  Science       Date:  1987-06-19       Impact factor: 47.728

Review 7.  Moonlighting proteins in yeasts.

Authors:  Carlos Gancedo; Carmen-Lisset Flores
Journal:  Microbiol Mol Biol Rev       Date:  2008-03       Impact factor: 11.056

8.  MoonProt: a database for proteins that are known to moonlight.

Authors:  Mathew Mani; Chang Chen; Vaishak Amblee; Haipeng Liu; Tanu Mathur; Grant Zwicke; Shadi Zabad; Bansi Patel; Jagravi Thakkar; Constance J Jeffery
Journal:  Nucleic Acids Res       Date:  2014-10-16       Impact factor: 16.971

9.  Expansion of the Gene Ontology knowledgebase and resources.

Authors:  The Gene Ontology Consortium
Journal:  Nucleic Acids Res       Date:  2016-11-29       Impact factor: 16.971

10.  UniProt: the universal protein knowledgebase.

Authors:  The UniProt Consortium
Journal:  Nucleic Acids Res       Date:  2016-11-29       Impact factor: 16.971
