
PupDB: a database of pupylated proteins.

Chun-Wei Tung

Abstract

BACKGROUND: Prokaryotic ubiquitin-like protein (Pup), the first post-translational protein modifier identified in prokaryotes, is an important signal for the selective degradation of proteins. Recently, large-scale proteomics technology has been applied to identify a large number of pupylated proteins. A database for managing pupylated proteins and pupylation sites is important for further analyses.
DESCRIPTION: PupDB was constructed by collecting experimentally identified pupylated proteins and pupylation sites from published studies and integrating this information with the corresponding structures and functional annotations. PupDB is a web-based database with tools for browsing and searching pupylated proteins and for interactive display of protein structures and pupylation sites.
CONCLUSIONS: The structured and searchable database PupDB is expected to be a useful resource for further analysis of substrate specificity, identification of pupylated proteins in other organisms and development of computational tools for predicting pupylation sites. PupDB is freely available at http://cwtung.kmu.edu.tw/pupdb.

Year:  2012        PMID: 22424087      PMCID: PMC3314583          DOI: 10.1186/1471-2105-13-40

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

Protein-to-protein modifications are essential for regulating protein functions. In eukaryotes, ubiquitylation is particularly important because it is involved in numerous regulatory functions such as protein degradation, DNA repair, transcription and signal transduction [1]. Recently, pupylation has been identified as the first post-translational protein-to-protein modification in prokaryotes [2,3]. Similar to ubiquitin, prokaryotic ubiquitin-like protein (Pup) attaches to specific lysine residues of substrate proteins by forming isopeptide bonds, which targets the proteins for proteasomal degradation [2,3]. Although ubiquitylation and pupylation are functional analogues, their enzymology differs. In contrast to the three-step reaction of ubiquitylation, pupylation requires only two steps and two enzymes. First, the C-terminal glutamine of Pup is deamidated to glutamate by deamidase of Pup (Dop) [4]. Subsequently, proteasome accessory factor A (PafA) attaches the deamidated Pup to specific lysine residues of substrate proteins [5].

The identification of pupylated proteins and pupylation sites can provide insights into the substrate specificity and functions of pupylation. Recently, large-scale proteomics technology has been applied to identify pupylated proteins and pupylation sites [6-9]. As the number of identified pupylated proteins and sites grows, a structured and searchable database is desirable for further analysis of substrate specificity and function and for the development of prediction methods for pupylation sites. For this purpose, the freely accessible database PupDB was constructed. It integrates pupylated proteins and pupylation sites, protein structures and functional annotations with tools for browsing, searching and interactive display of protein structures and pupylation sites.

Construction and content

The PupDB database is implemented using MySQL Server Edition 5.1. The PupDB website is publicly available at http://cwtung.kmu.edu.tw/pupdb. The web interface and all functions are implemented in PHP and Perl. Google Chart Tools [10] is used to render sortable tables.

Database content

PupDB includes two kinds of proteins: pupylated proteins and candidate pupylated proteins. All proteins are collected from four large-scale proteomics studies [6-9]. Proteins with experimentally identified pupylation sites are annotated as pupylated proteins. Candidate pupylated proteins are proteins experimentally identified as pupylated whose pupylation sites are still unknown. Redundant proteins are removed from PupDB by using CD-HIT [11,12] with a sequence identity threshold of 98%. Currently, PupDB contains 182 pupylated proteins with 215 known pupylation sites and 1,123 candidate pupylated proteins. All proteins come from three organisms: Mycobacterium smegmatis, Mycobacterium tuberculosis and Escherichia coli. For each protein, the corresponding information consists of six major parts: basic information, PDB ID, gene ontology (GO) annotation, pupylation site, protein sequence and structure, as shown in Figure 1. PupDB will be regularly updated with additional data, corrections and analytical tools. Researchers are encouraged to contribute their data and suggestions to PupDB.
Figure 1

Content of a typical PupDB entry. (a) Basic information. (b) Structure. (c) GO annotation. (d) Pupylation site. (e) Sequence. (f) 3D structure.
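The redundancy-removal step described above can be illustrated with a minimal Python sketch. It is not CD-HIT itself. It borrows only the greedy idea of keeping the longest sequences as representatives and discarding any sequence that is at least 98% identical to one already kept, and it approximates identity with Python's difflib instead of a real sequence alignment. The function names and the identity measure are assumptions made for illustration.

```python
from difflib import SequenceMatcher


def approximate_identity(seq_a, seq_b):
    """Matched residues divided by the length of the shorter sequence.

    Assumption: difflib matching blocks stand in for a real alignment.
    Both sequences must be non-empty.
    """
    matcher = SequenceMatcher(None, seq_a, seq_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(seq_a), len(seq_b))


def remove_redundant(sequences, threshold=0.98):
    """Greedy filter: keep a sequence only if it is below the identity
    threshold against every representative kept so far."""
    representatives = []
    # Longest first, so the longest member of each group is the one kept.
    for candidate in sorted(sequences, key=len, reverse=True):
        if all(approximate_identity(candidate, kept) < threshold
               for kept in representatives):
            representatives.append(candidate)
    return representatives
```

With a threshold of 0.98, a 69-residue sequence that differs from a 70-residue representative by a single substitution and a one-residue truncation would be dropped, while an unrelated sequence would be kept as a new representative.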


Annotations

As shown in Figure 1a, the first part, basic information, includes the UniProt AC, description, gene name, organism and sequence length. For further protein annotations, PupDB links to the corresponding entries of the UniProt database [13]. Structure information, including the PDB (Protein Data Bank) ID and hyperlinks to the PDB database [14], is provided in the second part (Figure 1b). Visualizing pupylation sites in a protein structure can provide helpful information for analysis. The protein 3D structure and associated pupylation sites can be viewed in PupDB by clicking the '3D visualization' link. The Java applet-based program Jmol [15] is used for interactive display of protein structures (Figure 1f). The UniProt protein accession numbers and PDB IDs are obtained by using the ID mapping function of UniProt. Currently, there are 766 PDB structures associated with 294 PupDB entries.

GO annotations [16] give useful information on molecular function, cellular component and biological process. For a given protein, the corresponding GO annotations can be extracted by using its UniProt accession number. Figure 1c shows the third part, GO annotations, for protein P69440. Further GO information can be accessed by clicking the 'Detailed GO annotation' hyperlink, which links to the corresponding entry of QuickGO [17].

The fourth part, pupylation sites, lists the pupylation sites of pupylated proteins and the corresponding references (Figure 1d). References are represented as PubMed IDs with hyperlinks to the PubMed database [18]. For a candidate pupylated protein, whose pupylation sites are still unknown, only references are shown. For a pupylated protein, PupDB additionally highlights the pupylation sites in both the sequence and the structure, as shown in Figure 1e and 1f, respectively.
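The sequence highlighting described above can be sketched in a few lines of Python. This is an illustration only, not the code behind PupDB (which the text says is written in PHP and Perl). The function name, the bracket markers and the check that every site is a lysine are assumptions.

```python
def highlight_sites(sequence, sites, open_tag="[", close_tag="]"):
    """Return the sequence with residues at the given 1-based positions wrapped in markers.

    Pupylation occurs on lysine, so a site that is not 'K' is treated as
    a data error (an assumption of this sketch).
    """
    parts = []
    for position, residue in enumerate(sequence, start=1):
        if position in sites:
            if residue != "K":
                raise ValueError(
                    f"position {position} is {residue!r}, expected lysine"
                )
            parts.append(f"{open_tag}{residue}{close_tag}")
        else:
            parts.append(residue)
    return "".join(parts)
```

For example, `highlight_sites("MAKDEK", {3})` returns `"MA[K]DEK"`. In a web page, the markers could be replaced with HTML tags that colour the residue.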

Utility and discussion

PupDB is a database of pupylated proteins and pupylation sites aiming to provide an easily accessible web service for the analysis of pupylated proteins. The analysis of pupylated proteins in PupDB can provide better insights into the specificity of pupylation. For example, Two Sample Logo [19] can be utilized to graphically analyze over- and underrepresented residues surrounding pupylation sites as shown in Figure 2.
Figure 2

Two-Sample Logo representation of over- (upper half) and underrepresented (lower half) residues surrounding pupylation sites.
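A two-sample comparison of this kind needs two sets of aligned sequence windows: those centred on pupylated lysines and those centred on other lysines. The Python sketch below shows one way to build such windows and to compare residue frequencies in a single column. It does not reproduce the statistical test used by Two Sample Logo. The function names, the flank size and the '-' padding character are assumptions.

```python
from collections import Counter


def lysine_windows(sequence, pupylated_positions, flank=3, pad="-"):
    """Split lysine-centred windows into pupylated and non-pupylated sets.

    pupylated_positions holds 1-based positions; windows that run past
    either end of the sequence are padded.
    """
    padded = pad * flank + sequence + pad * flank
    positive, negative = [], []
    for index, residue in enumerate(sequence):
        if residue != "K":
            continue
        # In the padded string the lysine sits at index + flank, so this
        # slice places it in the middle of the window.
        window = padded[index:index + 2 * flank + 1]
        if index + 1 in pupylated_positions:
            positive.append(window)
        else:
            negative.append(window)
    return positive, negative


def frequency_difference(positive, negative, column):
    """Per-residue frequency in the positive set minus the negative set.

    Both window lists must be non-empty.
    """
    pos = Counter(window[column] for window in positive)
    neg = Counter(window[column] for window in negative)
    return {
        residue: pos[residue] / len(positive) - neg[residue] / len(negative)
        for residue in set(pos) | set(neg)
    }
```

Positive differences indicate residues that are more frequent around pupylated lysines in that column, and negative differences indicate the opposite. Whether a difference is significant still has to be decided with a proper statistical test.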

Hyperlinks to major protein, structure and annotation databases are provided for accessing related information. Four tools are integrated into PupDB for browsing, keyword searches, sequence similarity searches and interactive display of protein structures. The functions of these tools are described below.

Browse tool

Users can browse PupDB by selecting the 'Browse' option. All proteins are shown in a sortable table. An entry with 'Y' in the 'Site' field is a pupylated protein; an entry with 'N' is a candidate pupylated protein. Clicking the caption of a column sorts the table by that column. Users can also specify the number of rows shown per page (Figure 3).
Figure 3

Browse tool.


Search and BLAST tools

To retrieve entries of interest, PupDB provides two search tools: keyword search and sequence similarity search. 1) The keyword search tool is accessed by selecting the 'Search' option. Six fields can be searched: description, UniProt AC, gene name, organism, protein type and protein with structure. After keywords are entered for any one field or any combination of fields, PupDB returns the matching entries as a sortable table. 2) Users can enter a protein sequence of interest in FASTA format to run a BLAST [20] search against PupDB and fetch entries that pass a user-defined E-value threshold. The BLAST tool is potentially useful for predicting promising pupylation sites by sequence similarity. In addition to the protein information, the output sortable table includes three columns obtained from the BLAST search: scores, E-values and alignments. The detailed BLAST sequence alignment can be downloaded by clicking the download link. Figure 4 shows an example of a BLAST search. Lysines in the query sequence that align to known pupylation sites are marked in red. Users can therefore submit proteins from other organisms to predict pupylation sites.
Figure 4

BLAST tool.
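The marking of query lysines that align to known pupylation sites can be sketched as a walk along one pairwise alignment. The Python function below is a hypothetical illustration, not PupDB's implementation. It assumes the aligned query and subject strings use '-' for gaps, as in BLAST pairwise output, and that start positions and site positions are 1-based.

```python
def transfer_sites(query_aln, subject_aln, query_start, subject_start, subject_sites):
    """Return 1-based query positions of lysines aligned to known subject sites.

    query_aln and subject_aln are the two rows of one alignment and must
    have equal length; subject_sites holds known pupylation positions in
    the subject (database) protein.
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned strings must have the same length")
    query_pos = query_start - 1
    subject_pos = subject_start - 1
    hits = []
    for query_res, subject_res in zip(query_aln, subject_aln):
        # Advance each coordinate only when its row holds a residue.
        if query_res != "-":
            query_pos += 1
        if subject_res != "-":
            subject_pos += 1
        if query_res == "K" and subject_res == "K" and subject_pos in subject_sites:
            hits.append(query_pos)
    return hits
```

For example, `transfer_sites("MA-KDEK", "MAGKD-K", 1, 5, {8})` returns `[3]`: subject lysine 8 is a known site and aligns to query lysine 3. The returned positions are candidates suggested by similarity, not experimentally confirmed sites.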


Interactive tool for protein structure

PupDB incorporates version 12.2 of the Jmol applet, the latest version at the time of writing, for interactive display of protein structures. By default, protein structures are shown in grey and pupylation sites in yellow. Users can manipulate protein structures through either the user interface or the scripting console.

Conclusions

The PupDB database is a comprehensive repository of pupylated proteins and pupylation sites with a web-based user interface. The built-in tools for browsing, searching and interactive display of protein structures and pupylation sites make PupDB a useful resource for further analysis of substrate specificity, identification of pupylated proteins in other organisms and development of computational tools for predicting pupylation sites. In addition to graphical analysis using two-sample logos, advanced machine learning methods such as string kernels [21] can be used to further analyze the specificity of pupylation. The exported dataset of pupylated proteins can be downloaded from PupDB. Post-translational modification databases are good data sources for developing prediction tools. For example, UbiPred [22], a predictor of ubiquitylation sites, was built on the UbiProt dataset [23]. Although the predictor GPS-PUP [24] is available for pupylation sites, it was trained on only 127 sites; the 215 pupylation sites in PupDB could be used to improve it. Future work is two-fold. First, the development and integration of prediction tools based on the PupDB dataset would be useful for analyzing and predicting pupylation sites. Second, incorporating orthology relationships and the locations of functional domains could substantially improve PupDB.

Availability and requirements

PupDB is freely available at http://cwtung.kmu.edu.tw/pupdb. The website has been tested with Safari, Opera, Internet Explorer 7 or later, Firefox and Google Chrome. The Java Runtime Environment (JRE) is required for interactive display of protein 3D structures by Jmol.

Competing interests

The author declares that they have no competing interests.

Authors' contributions

CWT designed and implemented the database, performed the analysis and wrote the manuscript.
  21 in total

1.  Proteasomal protein degradation in Mycobacteria is dependent upon a prokaryotic ubiquitin-like protein.

Authors:  Kristin E Burns; Wei-Ting Liu; Helena I M Boshoff; Pieter C Dorrestein; Clifton E Barry
Journal:  J Biol Chem       Date:  2008-11-21       Impact factor: 5.157

2.  Reconstitution of the Mycobacterium tuberculosis pupylation pathway in Escherichia coli.

Authors:  Francisca A Cerda-Maira; Fiona McAllister; Nadine J Bode; Kristin E Burns; Steven P Gygi; K Heran Darwin
Journal:  EMBO Rep       Date:  2011-07-08       Impact factor: 8.807

3.  Mycobacterial ubiquitin-like protein ligase PafA follows a two-step reaction pathway with a phosphorylated pup intermediate.

Authors:  Ethan Guth; Michael Thommen; Eilika Weber-Ban
Journal:  J Biol Chem       Date:  2010-11-16       Impact factor: 5.157

4.  Proteome-wide identification of mycobacterial pupylation targets.

Authors:  Christian Poulsen; Yusuf Akhter; Amy Hye-Won Jeon; Gerold Schmitt-Ulms; Helmut E Meyer; Anja Stefanski; Kai Stühler; Matthias Wilmanns; Young-Hwa Song
Journal:  Mol Syst Biol       Date:  2010-07-13       Impact factor: 11.429

5.  Expansion of the mycobacterial "PUPylome".

Authors:  Jeramie Watrous; Kristin Burns; Wei-Ting Liu; Anand Patel; Vivian Hook; Vineet Bafna; Clifton E Barry; Steve Bark; Pieter C Dorrestein
Journal:  Mol Biosyst       Date:  2009-11-16

6.  Ubiquitin-like protein involved in the proteasome pathway of Mycobacterium tuberculosis.

Authors:  Michael J Pearce; Julian Mintseris; Jessica Ferreyra; Steven P Gygi; K Heran Darwin
Journal:  Science       Date:  2008-10-02       Impact factor: 47.728

7.  UniProt Knowledgebase: a hub of integrated protein data.

Authors:  Michele Magrane
Journal:  Database (Oxford)       Date:  2011-03-29       Impact factor: 3.451

8.  QuickGO: a web-based tool for Gene Ontology searching.

Authors:  David Binns; Emily Dimmer; Rachael Huntley; Daniel Barrell; Claire O'Donovan; Rolf Apweiler
Journal:  Bioinformatics       Date:  2009-09-10       Impact factor: 6.937

9.  Prokaryotic ubiquitin-like protein (Pup) proteome of Mycobacterium tuberculosis [corrected] .

Authors:  Richard A Festa; Fiona McAllister; Michael J Pearce; Julian Mintseris; Kristin E Burns; Steven P Gygi; K Heran Darwin
Journal:  PLoS One       Date:  2010-01-06       Impact factor: 3.240

10.  CD-HIT Suite: a web server for clustering and comparing biological sequences.

Authors:  Ying Huang; Beifang Niu; Ying Gao; Limin Fu; Weizhong Li
Journal:  Bioinformatics       Date:  2010-01-06       Impact factor: 6.937
