Mohamed Barakat, Philippe Ortet, Cécile Jourlin-Castelli, Mireille Ansaldi, Vincent Méjean, David E Whitworth.
Abstract
BACKGROUND: With the escalation of high-throughput prokaryotic genome sequencing, there is an ever-increasing need for databases that characterise, catalogue and present data relating to particular gene sets and genomes/metagenomes. Two-component system (TCS) signal transduction pathways are the dominant mechanisms by which micro-organisms sense and respond to external as well as internal environmental changes. These systems respond to a wide range of stimuli by triggering diverse physiological adjustments, including alterations in gene expression, enzymatic reactions, or protein-protein interactions.
DESCRIPTION: We present P2CS (Prokaryotic 2-Component Systems), an integrated and comprehensive database of TCS signal transduction proteins, which contains a compilation of the TCS genes within 755 completely sequenced prokaryotic genomes and 39 metagenomes. P2CS provides detailed annotation of each TCS gene, including family classification, sequence features, functional domains, and genomic context visualization. To bypass the generic problem of gene underestimation during genome annotation, we also constituted and searched an ORFeome, which improves the recovery of TCS proteins compared to searches on the equivalent proteomes.
Year: 2009 PMID: 19604365 PMCID: PMC2716373 DOI: 10.1186/1471-2164-10-315
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Schematic modular pipeline of P2CS. After prokaryotic genomes and metagenomes are downloaded, protein sequence files are constructed and searched for TCS domains. Finally, each TCS protein is annotated and categorized into a TCS family.
Enzyme list implicated in TCS signal transduction
| EC Number | Enzyme Description | Present In TCS |
| 2.7.13.3 | Histidine kinase | Yes |
| 3.1.1.61 | Protein-glutamate methylesterase | Yes |
| 2.1.1.80 | Protein-glutamate O-methyltransferase | Yes |
| 4.6.1.1 | Adenylate cyclase | Yes |
| 4.6.1.2 | Guanylate cyclase | Yes |
| 3.1.3.3 | Phosphoserine phosphatase | Probable |
| 3.6.3.25 | Sulfate-transporting ATPase | Probable |
| 2.7.11.25 | Mitogen-activated protein kinase kinase kinase | Probable |
| 2.7.11.17 | Calcium/calmodulin-dependent protein kinase | Probable |
| 3.1.3.16 | Phosphoprotein phosphatase | Probable |
| 3.4.21.53 | Endopeptidase La | Probable |
| 1.8.1.9 | Thioredoxin-disulfide reductase | Probable |
| 3.6.3.28 | Phosphonate-transporting ATPase | Probable |
| 4.1.1.18 | Lysine decarboxylase | Probable |
| 1.4.1.3 | Glutamate dehydrogenase (NAD(P)(+)) | Probable |
| 4.1.1.19 | Arginine decarboxylase | Probable |
| 4.1.1.17 | Ornithine decarboxylase | Probable |
Figure 2. Response regulators classification. Schematic representation of the conserved domain architectures of RRs.
Figure 3. Numbers and types of TCS proteins. Histidine kinases (A) and response regulators (B) in prokaryotic genomes and metagenomes.
Figure 4. Histidine kinases classification. Schematic representation of the conserved domain architectures of HKs.
Performance test of P2CS
| Species | Manually defined TCS proteins | P2CS Predicted TCS proteins | Sensitivity (%) | Specificity (%) | Precision (%) | Reference |
| | 174 | 188 (185+3) | 100 | 99.67 | 92.55 | [ |
| | 102 (94+8) | 104 (94+10) | 100 | 99.96 | 97.9 | [ |
| | 102 | 103 | 99.02 | 99.96 | 98.06 | [ |
| | 101 (99+2) | 102 (99+3) | 100 | 99.98 | 99.02 | [ |
| | 101 (100+1) | 99 (98+1) | 98.02 | 100 | 100 | [ |
| | 107 | 107 | 100 | 100 | 100 | [ |
| | 70 | 70 | 100 | 100 | 100 | [ |
| | 109 | 109 | 100 | 100 | 100 | [ |
| | 62 | 62 (61+1) | 100 | 100 | 100 | [ |
| | 278 (276+2) | 283 (281+2) | 99.28 | 99.91 | 97.53 | [ |
| | 62 | 59 | 93.55 | 99.96 | 98.31 | [ |
| | 142 | 143 | 97.89 | 99.92 | 97.2 | [ |
| | 143 (140+3) | 144 (139+5) | 97.2 | 99.92 | 96.53 | [ |
| | 139 (137+2) | 141 (138+3) | 97.84 | 99.87 | 96.45 | [ |
| | 267 | 273 | 98.88 | 99.92 | 96.70 | [ |
| | 164 | 187 | 99.39 | 99.68 | 87.17 | [ |
| | 106 | 110 | 100 | 99.9 | 96.36 | [ |
| | 106 | 110 | 100 | 99.9 | 96.36 | [ |
| | 114 | 120 | 100 | 99.86 | 95 | [ |
| | 121 | 126 | 100 | 99.86 | 95 | [ |
| | 92 (91+1) | 96 (95+1) | 100 | 99.9 | 95.83 | [ |
| | 93 | 95 | 100 | 99.95 | 97.89 | [ |
Comparison to manually detected TCS proteins (numbers in parentheses detail the predicted and mis-predicted TCS proteins). Parameter calculation: Sensitivity = TP/(TP+FN), Specificity = TN/(TN+FP), Precision = TP/(TP+FP).
TP: true positive; TN: true negative; FP: false positive; FN: false negative.
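As a worked illustration of these three measures, the following minimal Python sketch computes them from raw counts using the standard definitions (sensitivity as TP over TP plus FN). The function name `classification_metrics` is hypothetical. The TP, FP and FN values are chosen to be consistent with the first table row (174 manually defined proteins, 188 predicted, 100% sensitivity); the TN value is not given in the table and is an arbitrary placeholder, so the specificity printed here is illustrative only.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return (sensitivity, specificity, precision) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)  # share of real TCS proteins recovered
    specificity = 100.0 * tn / (tn + fp)  # share of non-TCS proteins correctly rejected
    precision = 100.0 * tp / (tp + fp)    # share of predictions that are real TCS proteins
    return sensitivity, specificity, precision


# Counts consistent with the first table row: all 174 manually defined
# proteins are recovered, and 188 - 174 = 14 predictions are extra.
tp, fp, fn = 174, 14, 0
tn = 4000  # ASSUMPTION: placeholder value, not taken from the source

sens, spec, prec = classification_metrics(tp, fp, fn, tn)
print(f"sensitivity={sens:.2f}%  specificity={spec:.2f}%  precision={prec:.2f}%")
```

With these counts, the sensitivity and precision match the 100 and 92.55 reported in that row; the specificity depends entirely on the assumed TN.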
Figure 5. P2CS analysis of the . Each class result provides a link to a detailed gene list. Clickable links are underlined.
Figure 6. Genomic context for a chromosomal region (2133100 – 2139600). Genes are represented in the six reading frames with initial data from the NCBI database (green genes with domains in blue) and the result of the P2CS analysis process (genes in light red with domains in purple). Red vertical lines represent stop codons and green lines represent potential start codons.
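The six-frame view in this figure, bounded by stop codons and potential start codons, can be approximated with a simple open reading frame scan. The sketch below is not the P2CS procedure, which is not detailed here; it is a generic illustration that assumes three common bacterial start codons (ATG, GTG, TTG), the three standard stop codons, and a hypothetical `min_codons` length threshold. The function names are likewise hypothetical.

```python
STARTS = {"ATG", "GTG", "TTG"}   # ASSUMPTION: common bacterial start codons
STOPS = {"TAA", "TAG", "TGA"}    # standard stop codons
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Yield (strand, frame, start, end) for each start-to-stop ORF.

    Coordinates are 0-based on the scanned strand; `end` is exclusive and
    includes the stop codon. Only the first start codon after the previous
    stop is used, so each stop codon closes at most one ORF per frame.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in STARTS:
                    start = i
                elif codon in STOPS:
                    # Length is counted in codons, excluding the stop codon.
                    if start is not None and (i - start) // 3 >= min_codons:
                        yield strand, frame, start, i + 3
                    start = None


# Toy sequence: one short ORF (ATG AAA AAA AAA TAA) in frame 2 of the + strand.
toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "G"
orfs = list(find_orfs(toy, min_codons=3))
print(orfs)
```

Translating each reported span would give a candidate protein set that could then be searched for TCS domains, which is the general idea behind searching an ORFeome rather than only the annotated proteome.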