K S Makarova, A A Mironov, M S Gelfand.
Abstract
BACKGROUND: The arginine repressor ArgR/AhrC is a transcription factor universally conserved in bacterial genomes. Its recognition signal (the ARG box), a weak palindrome, is also conserved between genomes, despite a very low degree of similarity between individual sites within a genome. Thus, the arginine repressor is different from two other universal transcription factors: HrcA, whose recognition signal is very strongly conserved both within and between genomes, and LexA/DinR, whose signal is strongly conserved within, but not between, genomes. The arginine regulon is well studied in Escherichia coli and to some extent in Bacillus subtilis and some other genomes. Here, we apply the comparative genomic approach to the prediction of the ArgR-binding sites in all completely sequenced bacterial genomes.
Year: 2001 PMID: 11305941 PMCID: PMC31482 DOI: 10.1186/gb-2001-2-4-research0013
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Comparison of three transcriptional regulator families with predominantly single representatives from each bacterial genome
| Definition | Regulated pathway | Pattern of species* | Type of DNA-binding domain and fused domain | DNA-binding domain conservation† | Recognition site‡ | Sites per genome | Reference§ |
| LexA (Gram-negative), DinR (Gram-positive) | SOS repair | ADC-VEBMH---- | 'Winged helix'¶ HTH and serine protease (S24 family) | 2.20 ± 0.28 | SOS box (Gram-negative): CTGTatatatatMCAG; Cheo box (Gram-positive): cGAACrnryGTTYg | 8-20 | [3,4] |
| HrcA | Heat shock | A-CS-EBM-XYTP | Predicted 'winged helix' HTH and uncharacterized domain possibly responsible for activation by chaperonin GroE | 1.72 ± 0.22 | CIRCE box: TTAGCACTCn9GAGTGCTAA | 1-2 | [5,6] |
| ArgR (Gram-negative), AhrC (Gram-positive) | Arginine metabolism | ADC-VEBMH---P | 'Winged helix' HTH and arginine-binding domain | 1.81 ± 0.22 | ARG box: TGMATwwwwATKCA | 1-20 | [8,9,10,11] |
*Abbreviations for species: B, Bacillus subtilis; C, Clostridium acetobutylicum; M, Mycobacterium tuberculosis; D, Deinococcus radiodurans; A, Thermotoga maritima; S, Synechocystis sp.; H, Haemophilus influenzae; E, Escherichia coli; P, Chlamydia pneumoniae; T, Chlamydia trachomatis; Z, Mycoplasma genitalium; Y, Mycoplasma pneumoniae; V, Vibrio cholerae.
†The estimate is the average maximum likelihood distance between the last-step UPGMA clusters of the corresponding tree (reconstructed by the PHYLIP package program NEIGHBOR), computed from a distance matrix (calculated by the PHYLIP package program PROTDIST) for the DNA-binding domains only [39].
‡Letter codes used in consensus sequences: M = A or C; Y = T or C; R/r = A or G; W/w = A or T; K = G or T; N/n = any nucleotide. Upper-case letters denote strongly conserved nucleotides and lower-case letters denote less conserved nucleotides.
§References correspond to the last two columns.
¶The 'winged helix' superfamily is defined in the SCOP database [40]; the LexA [41] and ArgR [42,43,44] DNA-binding domains have been resolved by X-ray crystallography. The same type of domain was predicted for HrcA using the PSI-BLAST program.
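The letter codes in the footnote above map directly onto regular-expression character classes, which gives a quick way to test whether a sequence matches a consensus exactly. The following is a minimal sketch, not the authors' search procedure: the function name `consensus_to_regex` is hypothetical, case is ignored (the footnote uses case only to mark conservation strength), and a count such as `n9` is assumed to mean nine arbitrary nucleotides.

```python
import re

# Codes from the table footnote (M, Y, R, W, K, N) plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "M": "[AC]", "Y": "[CT]", "R": "[AG]",
    "W": "[AT]", "K": "[GT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a consensus such as 'TGMATwwwwATKCA' into a case-insensitive regex.

    A letter followed by digits (e.g. 'n9') is repeated that many times.
    """
    parts = []
    for symbol, count in re.findall(r"([A-Za-z])(\d*)", consensus):
        repeat = int(count) if count else 1
        parts.append(IUPAC[symbol.upper()] * repeat)
    return re.compile("".join(parts), re.IGNORECASE)


arg_box = consensus_to_regex("TGMATwwwwATKCA")
print(arg_box.pattern)  # TG[AC]AT[AT][AT][AT][AT]AT[GT]CA
```

Exact matching of this kind is stricter than a weighted profile: a site that deviates from the consensus at a single position is rejected outright, which is one reason weakly conserved signals such as the ARG box are usually searched for with scored profiles instead.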
Figure 1. Schematic representation of the operon organization and regulation of the arginine metabolism and transport genes. Genes are represented by boxes. ARG boxes in the upstream region are shown by black arrows. The direction of the arrow indicates the direction of transcription. The linear pathway (in E. coli and V. cholerae) involves N-acetylglutamate synthase (argA) and N2-acetylornithine deacetylase (argE). The circular pathway (in other bacteria) involves N2-acetyl-L-ornithine:L-glutamate acetyltransferase (argJ). The common genes are acetylglutamate kinase (argB); acetylglutamate semialdehyde dehydrogenase (argC); acetylornithine delta-aminotransferase (argD); ornithine carbamoyltransferase (argF, argI); argininosuccinate synthase (argG); argininosuccinate lyase (argH); carbamoyl-phosphate synthase (carAB). The H. influenzae genome contains only argH, argG, argF and possibly argD orthologs. Orthologs of argC, argJ and argB are difficult to identify in D. radiodurans because several paralogous genes encode proteins that could perform these functions. The B. subtilis roc operons, which are involved in arginine degradation, are also regulated by AhrC, as are the anaerobic arginine catabolism genes arcABCD in B. licheniformis [14] (data not shown). The transporter genes are: periplasmic binding protein (white), permease transmembrane protein (light gray), ATPase component (dark gray).
Candidate ARG boxes upstream of arginine metabolism related genes and operons
| Genome | Operon | Position | Score | Site |
| | | -64 | 4.24 | |
| | | -43 | 3.34 | |
| | | -50 | 3.98 | |
| | | -39 | 3.98 | |
| | | -128 | 4.61 | |
| | | -109 | 4.61 | |
| | | -68 | 4.01 | |
| | | -47 | 3.50 | |
| | | -64 | 3.80 | |
| | | -43 | 3.39 | |
| | | -65 | 4.16 | |
| | | -44 | 4.41 | |
| | | -210 | 4.31 | |
| | | -91 | 3.90 | |
| | | -70 | 4.51 | |
| | | -63 | 4.33 | |
| | | -42 | 4.49 | |
| | | -50 | 4.36 | |
| | | -39 | 3.79 | |
| | | -72 | 4.08 | AtTGaATAAttATTCtgT |
| | | -86 | 4.36 | AtTGcATAtAAATTCAcT |
| | | -50 | 4.27 | AgTGAATttttATgCAaT |
| | | -54 | 4.52 | tATGAATAAAtATgCAca |
| | | -64 | 3.87 | AtaGAATttttATTCAca |
| | | -43 | 3.75 | AtcGAtTAtttATTCAaT |
| | | -50 | 4.01 | tATGcATAAAAATgtAaT |
| | | -15 | 3.82 | AaaGAATAAAAAgTCATT |
| | | -52 | 3.57 | ttTGcAaAAtAATTtATT |
| | | -72 | 3.72 | ttaaAATAtttATTCAcT |
| | | -51 | 3.35 | AtaGcATtttcATgCtTT |
| | | -120 | 3.83 | AaTaAATgtAAATaCAaT |
| | | -76 | 3.00 | AacacATAttAAaTCAcT |
| | | -55 | 4.00 | AGTGAATAAAAAaaCAaT |
| | | -79 | 3.50 | AtTGttTttttATTCAcT |
| | | -58 | 3.27 | AGTGAtTtAAtATgtgTT |
| | | -74 | 3.00 | ttTtggTttttATaCATT |
| | | -53 | 3.85 | AtTGcATAAAAATaCgTT |
| | | -64 | 5.22 | |
| | | -55 | 4.54 | AgGcATAaAAATTCAT |
| | | -35 | 4.02 | AtaAtTAatTATTCAT |
| | | -67 | 5.04 | |
| | | -31 | 4.54 | AaGcATTTtTATTCAT |
| | | -47 | 4.91 | ATttATTTtTATaCAa |
| | | -53 | 5.14 | |
| | | -63 | 4.76 | |
| | | -193 | 4.76 | cTttATATAAATgCAa |
| | | -88 | 4.21 | tTGcATAaAAATgaga |
| | | -51 | 4.62 | AcGAATAaATATTCAa |
| | 414 | -191 | 5.27 | ATGAATAaATATTCAa |
| | | -132 | 4.62 | tTGAATAaATATTCgT |
| | | -32 | 5.27 | ATGAATAaAAATTCAa |
| | | -124 | 5.04 | ATGAATATtTATaaAa |
| | | -63 | 4.79 | tTGAATATtTATaaAg |
| | | -43 | 4.59 | gTGtATAatTATTCAT |
| | | -118 | 4.59 | ATGAATAatTATaCAc |
| | | -98 | 4.79 | cTttATAaATATTCAa |
| | | -181 | 4.96 | ATGcATAaATATaaAT |
| | | -80 | 4.91 | ATGtATAaATATaaAa |
| | | -87 | 4.96 | ATGcATAaATATaaAT |
| | | -67 | 4.89 | tTGAATAatTATaaAa |
| | | -194 | 5.09 | ATGcATAaATATTaAT |
| | | -69 | 4.62 | gTtAATAatTATTCAT |
| | | -59 | 4.21 | tTtcATATtTATgCta |
| | | -81 | 3.92 | tTtAATTcAAAgTaAa |
| | | -34 | 4.16 | tTGtgTTatAATaaAT |
| | | -138 | 3.87 | cgtAATTgATATTCAT |
| | | -91 | 4.44 | ATttATTTAAcTTaAT |
| | | -57 | 4.11 | cTGtATTTcTATaCAg |
| | | -131 | 4.09 | tTGcATAgtcATTCAT |
| | | -85 | 3.62 | ATGgATTgAAATcCAg |
| | | -62 | 3.69 | cTGgATTTtAAggaAT |
| | | -53 | 4.26 | tTGcATAaATATgatT |
| | | -32 | 4.64 | ATaAATAaATATgCAT |
| | | -112 | 3.92 | tTtAATcaAAATTatT |
| | | 27 | 3.96 | ATttATTTtTATaatg |
| | | -11 | 3.91 | tTGcATAacgATgCAa |
Position is indicated relative to the start of translation. Scores for E. coli and H. influenzae genes were computed using the profile from [27]; all other scores were computed using a profile trained on B. subtilis candidate ARG boxes by the procedure from [28]. The sites used to construct the profile are shown in bold.
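To illustrate what scoring a candidate site against a profile involves, the sketch below builds a generic log-odds positional weight matrix from aligned sites and slides it along an upstream region. It is an illustration under stated assumptions, not the profile of [27] or the training procedure of [28]: the uniform background (0.25 per base), the pseudocount of 1, and the function names `build_profile`, `score_site` and `best_hit` are all assumptions, and the four training sites are picked from the table purely for illustration. The resulting scores are therefore not expected to reproduce the Score column.

```python
import math
from collections import Counter

BASES = "ACGT"


def build_profile(sites, pseudocount=1.0):
    """Return one dict of log2-odds weights per alignment column.

    Assumes a uniform background of 0.25 per base (an illustrative choice).
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    profile = []
    for column in zip(*(site.upper() for site in sites)):
        counts = Counter(column)
        profile.append({
            base: math.log2((counts[base] + pseudocount) / total / 0.25)
            for base in BASES
        })
    return profile


def score_site(profile, site):
    """Sum the column weights of the bases observed in `site`."""
    if len(site) != len(profile):
        raise ValueError("site length must equal profile width")
    return sum(column[base] for column, base in zip(profile, site.upper()))


def best_hit(profile, upstream):
    """Slide the profile along `upstream`; return (best score, 0-based offset)."""
    width = len(profile)
    return max(
        (score_site(profile, upstream[i:i + width]), i)
        for i in range(len(upstream) - width + 1)
    )


# Four 16-nt sites taken from the table above, for illustration only.
SITES = [
    "AcGAATAaATATTCAa",
    "ATGAATAaATATTCAa",
    "tTGAATAaATATTCgT",
    "ATGAATAaAAATTCAa",
]
profile = build_profile(SITES)
score, offset = best_hit(profile, "GGGG" + "ATGAATAaATATTCAa" + "GGGG")
print(offset, round(score, 2))
```

The offset returned by `best_hit` counts from the start of the supplied fragment; converting it to a position relative to the start of translation, as in the table, requires knowing where the fragment ends relative to the start codon.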
Figure 2. Unrooted neighbor-joining tree of the predicted polar amino acid periplasmic binding proteins for selected organisms. The tree was reconstructed using the PHYLIP package (SEQBOOT, PROTDIST, NEIGHBOR, CONSENSE and FITCH programs). Nodes with bootstrap values exceeding 60% are marked by open circles. BS, B. subtilis; CA, Cl. acetobutylicum; Cpn, C. pneumoniae; DR, D. radiodurans; EC, E. coli; HI, H. influenzae; Rv, M. tuberculosis; TM, T. maritima. Experimentally established specificity of transporters is indicated in parentheses. Genes with candidate ARG boxes in upstream regions are shown in italic and in a larger font.