
Sentra: a database of signal transduction proteins for comparative genome analysis.

Mark D'Souza, Elizabeth M Glass, Mustafa H Syed, Yi Zhang, Alexis Rodriguez, Natalia Maltsev, Michael Y Galperin.

Abstract

Sentra (http://compbio.mcs.anl.gov/sentra), a database of signal transduction proteins encoded in completely sequenced prokaryotic genomes, has been updated to reflect recent advances in understanding signal transduction events on a whole-genome scale. Sentra consists of two principal components, a manually curated list of signal transduction proteins in 202 completely sequenced prokaryotic genomes and an automatically generated listing of predicted signaling proteins in 235 sequenced genomes that are awaiting manual curation. In addition to two-component histidine kinases and response regulators, the database now lists manually curated Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews. All entries in Sentra are extensively annotated with relevant information from public databases (e.g. UniProt, KEGG, PDB and NCBI). Sentra's infrastructure was redesigned to support interactive cross-genome comparisons of signal transduction capabilities of prokaryotic organisms from a taxonomic and phenotypic perspective and in the framework of signal transduction pathways from KEGG. Sentra leverages the PUMA2 system to support interactive analysis and annotation of signal transduction proteins by the users.

Year:  2006        PMID: 17135204      PMCID: PMC1751548          DOI: 10.1093/nar/gkl949

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Recent experimental and in silico studies have resulted in a much better understanding of the principles and mechanisms of prokaryotic signal transduction (1–6). The list of recognized environmental sensors has been dramatically expanded and now includes, in addition to two-component histidine kinases and methyl-accepting chemotaxis proteins, Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases (2–10). These classes of proteins are also found as (predicted) cytoplasmic proteins, proposed to function as sensors of intracellular biochemical parameters, such as pH, osmolarity or levels of oxygen, CO, NO and other molecules (2,10). Accordingly, many prokaryotic genomes contain multiple copies of the respective genes, whose exact functions (i.e. the parameters sensed by their protein products) are rarely known. Detailed analyses of the protein sets involved in signal transduction in such model organisms as Escherichia coli, Bacillus subtilis, Pseudomonas aeruginosa, Synechocystis sp. PCC6803, Anabaena sp. PCC7120 and Halobacterium salinarum have provided much-needed insight into signal transduction mechanisms. In silico studies have contributed by highlighting such phenomena as the abundance of (predicted) diguanylate cyclases and c-di-GMP phosphodiesterases in many bacterial genomes, the importance of cross-talk between different signaling pathways and the existence of a complex system of intracellular signaling (2,3,10). Progress in understanding prokaryotic signal transduction systems, as well as the availability of a large number of newly sequenced genomes, prompted us to perform a major update of Sentra (http://compbio.mcs.anl.gov/sentra), a database of signal transduction proteins developed by the Bioinformatics group at Argonne National Laboratory (13,14).
The objective of further development of Sentra was to provide users with an analytical environment containing expert-curated information describing prokaryotic signal transduction systems, as well as an up-to-date knowledge base and interactive analytical tools for further analysis of signal transduction proteins in all completely sequenced genomes as they become publicly available. Such an environment will add accuracy and sensitivity to the sequence analysis of signal transduction proteins and aid in the development of conjectures regarding the nature of the transmitted signal. The previous release of Sentra featured signal transduction proteins encoded in 43 completely sequenced genomes (14). Although it contained all complete, public genomes at the time of publication, it lacked a number of valuable data types and analytical capabilities. For example, it did not include diguanylate cyclases or c-di-GMP phosphodiesterases and did not support cross-genome comparative analysis of signal transduction systems (14). Further, since most components of the signal transduction machinery are multi-domain proteins, they are notoriously difficult to annotate through automated sequence comparisons and are commonly misannotated in genomic databases (10,15). Discovery of new domains often makes the existing annotations incomplete or even obsolete. To address this problem, Sentra was redesigned to perform periodic (monthly) automated updates that include automated pre-computed analysis of newly sequenced genomes and re-analysis of existing Sentra genomes with an array of bioinformatics tools including InterPro (16), Blocks (17), BLAST (18), TMHMM (19) and tools developed by our group (e.g. Dremmel and Chisel). The results of these automated analyses are presented to the users in Sentra's interactive environment for further updates and annotation. The most significant changes in Sentra database content, capabilities and user interface are as follows.

Update of the Sentra database content

Sentra now consists of two principal components: (i) a manually curated list of signal transduction proteins that includes proteins derived from 202 completely sequenced prokaryotic genomes, and (ii) an automatically generated listing of predicted signaling proteins in 235 genomes that are awaiting manual curation. The expert-curated section of the database now lists, besides two-component histidine kinases and response regulators, Ser/Thr/Tyr protein kinases and protein phosphatases, as well as adenylate and diguanylate cyclases and c-di-GMP phosphodiesterases, as defined in several recent reviews (2,10,12).

Support for comparative and evolutionary analysis of signal transduction proteins and signaling pathways

In the process of adaptation to their environment, prokaryotic organisms have developed an ability to detect and process environmental signals that are vital for their survival. Sentra provides a unique opportunity to explore and compare the signaling apparatus of prokaryotes according to their habitat (e.g. aquatic, terrestrial), lifestyle (e.g. pathogenic) and major physiological features (e.g. energy source, motility). Users can also perform comparative analysis of signal transduction proteins characteristic of different taxonomic groups of organisms in the framework of the signaling pathways from the KEGG database (20). This capability allows identification of signaling pathways and mechanisms characteristic of particular taxonomic groups and habitats. Sentra leverages PUMA2 (21), a system for high-throughput analysis of genomes being developed by the Bioinformatics group at Argonne. This connection allows Sentra to support comparative analysis of prokaryotic signal transduction systems at multiple levels of organization: users may explore the domain and feature composition of signal transduction proteins and perform interactive analysis of sequences with over 30 bioinformatics tools. All entries in Sentra are annotated with information from the PUMA2 knowledge base, which integrates information from over 20 sequence, structural, metabolic and taxonomic databases, as well as derived results from various bioinformatics tools. Sentra also contains information regarding participation of the signal transduction proteins in conserved chromosomal gene clusters (22). Such information may provide important clues regarding the nature of the transmitted signal.
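
The kind of cross-genome comparison described above can be illustrated with a minimal sketch. It does not use Sentra's actual interface or schema: the record layout, the function names and every value in `RECORDS` are hypothetical placeholders chosen for illustration, not data from the database. The sketch groups signaling proteins by habitat and summarizes them per protein class and per genome.

```python
from collections import Counter, defaultdict

# Hypothetical records: (genome, habitat, protein_class).
# Placeholder values for illustration only; not taken from Sentra.
RECORDS = [
    ("genome_A", "aquatic", "histidine kinase"),
    ("genome_A", "aquatic", "response regulator"),
    ("genome_A", "aquatic", "diguanylate cyclase"),
    ("genome_B", "aquatic", "histidine kinase"),
    ("genome_C", "terrestrial", "histidine kinase"),
    ("genome_C", "terrestrial", "Ser/Thr/Tyr protein kinase"),
]


def class_counts_by_habitat(records):
    """Count proteins of each class within each habitat."""
    counts = defaultdict(Counter)
    for _genome, habitat, protein_class in records:
        counts[habitat][protein_class] += 1
    return counts


def mean_proteins_per_genome(records):
    """Average number of listed signaling proteins per genome, by habitat."""
    proteins = Counter()
    genomes = defaultdict(set)
    for genome, habitat, _protein_class in records:
        proteins[habitat] += 1
        genomes[habitat].add(genome)
    return {habitat: proteins[habitat] / len(genomes[habitat]) for habitat in proteins}


counts = class_counts_by_habitat(RECORDS)
means = mean_proteins_per_genome(RECORDS)
```

Normalizing by the number of genomes in each group matters in practice, because habitats and taxonomic groups are represented by very different numbers of sequenced genomes.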

Support for user annotation of signal transduction proteins

One of the important new features of Sentra is its support for user annotation of signal transduction proteins via the PUMA2 framework. Registered users can interactively analyze the sequences, correct functional assignments and provide detailed comments. This capability will allow us to leverage the enormous expert knowledge accumulated in the scientific community for annotation of information in the Sentra database. All computationally intensive operations in Sentra are performed using GADU (23), a Grid technology-based engine being developed by the Bioinformatics group at Argonne.

Future prospects

As new completely sequenced microbial genomes become publicly available, they will be processed through the automated pipeline and included in quarterly updates of the database. These genomes will also be subject to manual curation of the overall protein lists and orthology groupings. We also intend to provide manually curated lists of proteins containing certain signal transduction domains, such as PAS (24) and FHA (25).

References (24 in total)

1.  The use of gene clusters to infer functional coupling.

Authors:  R Overbeek; M Fonstein; M D'Souza; G D Pusch; N Maltsev
Journal:  Proc Natl Acad Sci U S A       Date:  1999-03-16       Impact factor: 11.205

Review 2.  PAS domains: internal sensors of oxygen, redox potential, and light.

Authors:  B L Taylor; I B Zhulin
Journal:  Microbiol Mol Biol Rev       Date:  1999-06       Impact factor: 11.056

3.  One-component systems dominate signal transduction in prokaryotes.

Authors:  Luke E Ulrich; Eugene V Koonin; Igor B Zhulin
Journal:  Trends Microbiol       Date:  2005-02       Impact factor: 17.079

4.  GNARE: automated system for high-throughput genome analysis with grid computational backend.

Authors:  Dinanath Sulakhe; Alex Rodriguez; Mark D'Souza; Michael Wilde; Veronika Nefedova; Ian Foster; Natalia Maltsev
Journal:  J Clin Monit Comput       Date:  2005-10       Impact factor: 2.502

Review 5.  Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

Authors:  S F Altschul; T L Madden; A A Schäffer; J Zhang; Z Zhang; W Miller; D J Lipman
Journal:  Nucleic Acids Res       Date:  1997-09-01       Impact factor: 16.971

Review 6.  C-di-GMP: the dawning of a novel bacterial signalling system.

Authors:  Ute Römling; Mark Gomelsky; Michael Y Galperin
Journal:  Mol Microbiol       Date:  2005-08       Impact factor: 3.501

7.  Distribution and evolution of multiple-step phosphorelay in prokaryotes: lateral domain recruitment involved in the formation of hybrid-type histidine kinases.

Authors:  Weiwen Zhang; Liang Shi
Journal:  Microbiology (Reading)       Date:  2005-07       Impact factor: 2.777

8.  PUMA2--grid-based high-throughput analysis of genomes and metabolic pathways.

Authors:  Natalia Maltsev; Elizabeth Glass; Dinanath Sulakhe; Alexis Rodriguez; Mustafa H Syed; Tanuja Bompada; Yi Zhang; Mark D'Souza
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

9.  Diversity in domain architectures of Ser/Thr kinases and their homologues in prokaryotes.

Authors:  A Krupa; N Srinivasan
Journal:  BMC Genomics       Date:  2005-09-19       Impact factor: 3.969

Review 10.  A census of membrane-bound and intracellular signal transduction proteins in bacteria: bacterial IQ, extroverts and introverts.

Authors:  Michael Y Galperin
Journal:  BMC Microbiol       Date:  2005-06-14       Impact factor: 3.605

