
Transcription profile of Escherichia coli: genomic SELEX search for regulatory targets of transcription factors.

Akira Ishihama, Tomohiro Shimada, Yukiko Yamazaki.

Abstract

Bacterial genomes are transcribed by DNA-dependent RNA polymerase (RNAP), which achieves gene selectivity through interaction with sigma factors that recognize promoters, and transcription factors (TFs) that control the activity and specificity of RNAP holoenzyme. To understand the molecular mechanisms of transcriptional regulation, the identification of regulatory targets is needed for all these factors. We then performed genomic SELEX screenings of targets under the control of each sigma factor and each TF. Here we describe the assembly of 156 SELEX patterns of a total of 116 TFs performed in the presence and absence of effector ligands. The results reveal several novel concepts: (i) each TF regulates more targets than hitherto recognized; (ii) each promoter is regulated by more TFs than hitherto recognized; and (iii) the binding sites of some TFs are located within operons and even inside open reading frames. The binding sites of a set of global regulators, including cAMP receptor protein, LeuO and Lrp, overlap with those of the silencer H-NS, suggesting that certain global regulators play an anti-silencing role. To facilitate sharing of these accumulated SELEX datasets with the research community, we compiled a database, 'Transcription Profile of Escherichia coli' (www.shigen.nig.ac.jp/ecoli/tec/).
© The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2016        PMID: 26843427      PMCID: PMC4797297          DOI: 10.1093/nar/gkw051

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Bacteria constantly monitor the physical, chemical and biological conditions in the environment and adapt to these conditions by modifying the expression pattern of their genomes. Transcription, the primary step of gene expression, is carried out by a single species of RNA polymerase (RNAP). The transcription pattern is modulated by controlling the distribution of a limited number of RNAP molecules among the whole set of genes in the genome (1,2). The RNAP core enzyme with the subunit composition α2ββ′ω possesses RNA polymerization activity but lacks the ability to initiate transcription. For promoter recognition and transcription initiation, another subunit, the sigma (σ) factor, is required. Transcriptionally active RNAP holoenzyme (or transcriptase) is formed after association of the core enzyme with the σ subunit (3–5). The model prokaryote Escherichia coli harbors seven species of the σ subunit, each recognizing a specific set of promoters (6–8). Each resultant RNAP holoenzyme is able to recognize a specific set of promoters, referred to as constitutive promoters, in the absence of additional transcription factors (TFs) and to initiate transcription from them (9). The promoter selectivity of RNAP holoenzyme is, however, modulated through interplay with another set of regulatory proteins, referred to as TFs (1,2). Most TFs associate with DNA targets (usually located near promoters), interact with promoter-bound RNAP, and modulate transcription from their target promoters (Figure 1A).
Figure 1.

Outline of genomic SELEX screening system. (A) Functional differentiation of RNAP. The core enzyme, with subunit structure α2ββ′ω, has RNA polymerization activity but cannot recognize promoters. After binding one of the sigma (σ) subunits, the holoenzyme acquires the ability to initiate transcription from constitutive promoters. The promoter selectivity of RNAP is further modulated through interactions with DNA-binding TFs that bind to their target DNA near promoters. Each DNA-bound TF interacts with a subunit of promoter-bound RNAP. Based on the contact subunit, TFs are classified into four groups (1,2). (B) Genomic SELEX system for identification of regulatory targets of TFs. A plasmid library of Escherichia coli genomic DNA was constructed using collections of genomic DNA fragments 200–300 bp in length. The mixture of genome DNA segments can be regenerated by PCR. All 285 E. coli TFs were expressed in His-tagged form by addition of a His6 sequence to either the N- or C-terminus, and then affinity purified. Mapping of TF-bound DNA segments on the E. coli genome was carried out by either SELEX-clos (cloning-sequencing) or SELEX-chip (DNA tiling array analysis) method. (C) The binding sites of TFs on the genome were classified into four types: type-A, spacer of bidirectional transcriptional units; type-B, spacer upstream of one transcription unit but downstream of another transcription unit; type-C, spacer downstream of both transcription units; and type-D, inside open reading frames. Regulatory targets of TFs were predicted based on the locations of their binding sites.

Complete genome sequences have allowed prediction of the full repertoire of TFs (1–2,10–11). In E. coli, a total of 285 TFs have been identified and classified into 54 families based on their DNA-binding motifs (2,12) (Table 1; for details see Supplementary Table S1). When bound to target DNA sites, these proteins interact directly with one of the RNAP subunits (13–15). 
There is a tight correlation between the mode of transcriptional action and the contact subunit (class I, II, III or IV) (Figure 1A) (1,2). Binding of TFs at specific target sites near promoters plays an essential role in ensuring effective protein–protein interactions by increasing the local concentration of pairing proteins at promoter regions. To date, 70–80% of the estimated 285 TFs in E. coli have been linked to at least one regulatory target gene; however, the whole set of target promoters, genes or operons has been identified for only a small number of TFs; moreover, the regulatory targets and regulatory functions remain unknown for about a quarter of E. coli TFs, referred to as Y-gene TFs. Consequently, we currently have only fragmentary knowledge of the genome-wide transcriptional regulation even for E. coli, the best-characterized model prokaryote. In order to obtain detailed insight into the molecular basis of genome regulation, it is absolutely necessary to determine the targets and regulatory modes of all predicted TFs.
Table 1.

DNA-binding transcription factors (TF) in Escherichia coli

Family | No. | Characterized TF | Uncharacterized TF
AraC | 32 | Ada, AdiY, AppY, AraC, ChbR, EnvY, EutR, FeaR, FliZ, FrvR, GadW, GadX, LsrR, MarA, MelR, NimR, RclR, RhaR, RhaS, Rob, SoxS, XylR, YdeO, YkgA, YqhC | YbcM, YdiP, YidL, YiiF, YijO, YneL, YpdC
ArgR | 1 | ArgR | -
ArsR | 3 | ArsR, FeoC, YgaV | -
AsnC | 3 | AsnC, Lrp | YbaO
BirA | 1 | BirA | -
BolA | 1 | BolA | -
CadC | 2 | CadC | YqeI
CaiF | 1 | CaiF | -
CbpA | 1 | CbpA | -
CdaR | 1 | CdaR | -
CitB | 3 | CitB, DctR, DcuR | -
Crp | 4 | Crp, Fnr, YeiL | YaiV
DeoR | 13 | AgaR, DeoR, DeoT, FucR, GatR, GlpR, SgcR, SrlR, UlaR | YdjF, YfjR, YgbI, YihW
DicC | 1 | DicC | -
DnaA | 1 | DnaA | -
Dps | 1 | Dps | -
DtxR* | 1 | MntR | -
Fis | 1 | Fis | -
FlhC | 1 | FlhC | -
FlhD | 1 | FlhD | -
FrmR | 2 | FrmR, RcnR | -
Fur | 2 | Fur, Zur | -
GntR | 25 | CsiR, DgoR, ExuR, FadR, FarR, FrlR, GlcC, LctR, LgoR, McbR, MtlR, NanR, PaaX, PdhR, PhnF, UxuR, YdfH, YggD, YqjI | YdcR, YegW, YidP, YieP, YihL, YjiR
GutM | 1 | GutM | -
H-NS | 2 | Hns, StpA | -
HU | 4 | HupA, HupB, IhfA, IhfB | -
IclR | 8 | AllR, IclR, KdgR, MhpR, RhmR, YiaJ | YagI, YjhI
LacI | 15 | AscG, Cra, CscR, CytR, EbgR, GalR, GalS, GntR, IdnR, LacI, MalI, PurR, RbsR, TreR | YcjW
LexA | 1 | LexA | -
LuxR* | 20 | BglJ, CsgD, EvgA, FimZ, GadE, MalT, MatA, NarL, NarP, RcsA, RcsB, SdiA, UhpA, UvrY, YgeH, YjjQ | YahA, YgeK, YhjB, YqeH
LysR | 47 | AbgR, AllS, ArgP, Cbl, CynR, CysB, Dan, DmlR, DsdC, GcvA, HcaR, HdfR, IlvY, LeuO, LrhA, LysR, MetR, Nac, NhaR, OxyR, PerR, PgrR, QseA, QseD, TdcA, XapR, YeiE, YfeR, YidZ | YafC, YahB, YbdO, YbeF, YbhD, YcaN, YdcI, YdhB, YeeY, YfeC, YfeD, YfiE, YgfI, YhaJ, YhjC, YiaU, YneJ, YnfL
LytR* | 2 | YehT, YpdB | -
MarR | 5 | EmrR, MarR, SlyA | YdaW, YtfH
MerR* | 5 | BluR, CueR, MlrA, SoxR, ZntR | -
MetJ | 1 | MetJ | -
ModE | 1 | ModE | -
MraZ | 1 | MraZ | -
NagC | 3 | Mlc, NagC | YphH
NikR | 1 | NikR | -
NrdR | 1 | NrdR | -
NsrR | 1 | NsrR | -
NtrC | 8 | AtoC, NorR, NtrC, PrpR, PspF, QseF, RtcR, ZraR | -
OgrK | 1 | OgrK | -
OmpR | 14 | ArcA, BaeR, BasR, CpxR, CreB, CusR, KdpE, OmpR, PhoB, PhoP, QseB, RstA, TorR, YedW | -
PepA | 1 | PepA | -
PutA | 1 | PutA | -
RpiR | 4 | HexR, MurR, RpiR | YfhH
SfsA | 2 | SfsA, SfsB | -
TdcR | 1 | TdcR | -
TetR* | 15 | AcrR, BdcR, BetI, ComR, EnvR, FabR, GusR, NemR, RcdA, RutR, SlmA | YagA, YbiH, YcfQ, YjdC
TrpR | 1 | TrpR | -
TyrR | 4 | DhaR, FhlA, HyfR, TyrR | -
Xre* | 13 | DicA, DinJ, HigA, MqsA, NadR, PuuR, RacR, SutR, YgiV | YbaQ, YddM, YheO, YiaG
Total: 54 families | 285 | 227 | 58

*These six families are named using TF symbols from bacteria other than E. coli. DtxR, diphtheria toxin repressor of Corynebacterium diphtheriae; LuxR, luminescence activator of Vibrio fischeri; LytR, autolysin regulator of Streptococcus mutans; MerR, metal regulatory protein of R100 plasmid; TetR, tetracycline repressor of Tn10; and Xre, xenobiotic-response-element recognition protein.

Most of the 285 E. coli TFs recognize and bind to specific DNA sequences. However, it is difficult to identify the whole set of in vivo TF-binding sites in the genome, mainly because of competition between DNA-binding proteins, including the abundant nucleoid proteins that cover the entire E. coli genome. In addition, the target selectivity of TFs is modulated in vivo by molecular interaction with specific effector ligands and/or by covalent modification of TFs by phosphorylation and acetylation. Therefore, to search accurately and quickly for DNA sequences that are recognized by a defined form of a TF under fixed conditions, we developed the in vitro ‘Genomic SELEX’ system (16). In the original SELEX (systematic evolution of ligands by exponential enrichment) system, DNA–protein complexes are isolated from mixtures of a test DNA-binding protein and synthetic oligonucleotides with all possible sequences, followed by sequencing of protein-bound DNA fragments (17,18). As the size of the probes increases, the number of probe species increases; consequently, it becomes difficult to solubilize all probes at the effective concentrations required for protein binding. To overcome this solubility problem, a mixture of genomic DNA fragments is used in place of synthetic oligonucleotide mixtures, because the binding sites of the TFs of interest are located in the E. coli genome (16,19–20). Using the improved genomic SELEX system (Figure 1B), we initiated a systematic search for DNA sequences recognized by each of the seven sigma subunits and 285 DNA-binding TFs from E. coli. 
The sequences of protein-bound SELEX DNA fragments are determined using two procedures, SELEX-clos (cloning and sequencing) and SELEX-chip (mapping by a tiling array consisting of 43 450 species of 60-base oligonucleotide probes aligned at ∼45-bp intervals along the E. coli genome). Genomic SELEX screening is particularly useful for identifying regulatory targets of uncharacterized TFs because it is difficult to predict targets from phenotypic screening of TF mutants. To date, this method has been used to identify the regulatory functions of more than 10 hitherto uncharacterized Y-gene TFs, including Dan (renamed from YgiP) (21), NemR (renamed from YdhM) (22), NimR (renamed from YeaM) (23), PgrR (renamed from YcjZ) (24), RcdA (renamed from YbjK) (25), RutR (renamed from YcdC) (26,27) and SutR (renamed from YdcN) (28). EcoCyc (29,30) and RegulonDB (31,32) are currently the most comprehensive databases of the E. coli regulatory network. Most of our published data obtained by SELEX screening of regulatory targets have been deposited into these databases. However, the regulatory targets of E. coli TFs listed in these databases were assigned with varying levels of accuracy: some were predicted in silico simply based on the presence of sequences similar to the recognition sequence of a given TF and are not supported by experimental confirmation. One serious problem with current TF databases is that the data were obtained using various E. coli strains grown in different types of media, under different conditions, using various techniques, by a number of different laboratories. Since we found five different types of the rpoS gene, which encodes the stationary phase-specific σ subunit of RNAP, in stocks of one and the same E. coli K-12 W3110 strain from eleven different laboratories (33), the heterogeneity of RpoS, a key regulator of transcription, has been widely recognized in laboratory stocks of E. coli from all over the world (34). 
At present, massive diversification is recognized in the genomes of wild-type E. coli laboratory strains, leading to differences in the regulatory mode of genome transcription (35). To overcome this problem, we have undertaken a long-term project aimed at identifying the regulatory targets of all 285 TFs from a single E. coli strain, K-12 W3110, using the same experimental systems in a single research group. This is the first summary report of a total of 156 SELEX patterns for a set of 116 TFs. Using this collection of data sets, we elucidated a number of novel concepts regarding transcriptional regulation in E. coli, some of which are described in this report. To make this information publicly available, the SELEX screening data were assembled into a database entitled ‘Transcription Profile of E. coli’ (TEC).

MATERIALS AND METHODS

Genomic SELEX screening

Genomic SELEX screenings of 116 TFs and SELEX-chip analyses were conducted as previously described (16). In brief, a collection of 200–300-bp DNA fragments of the E. coli W3110 genome was prepared by polymerase chain reaction (PCR) using an E. coli DNA plasmid library (16) as the template and a set of primers, EcoRV-F and EcoRV-R, that hybridize to vector pBR322 at EcoRV junctions. This collection includes ∼10^5 different random DNA fragments, corresponding to about 6.5-fold coverage of the E. coli genome. The size of the DNA fragments was chosen so as to allow binding of all the hitherto characterized E. coli TFs. The standard protocol for genomic SELEX was as follows: (i) a mixture of 5 pmol of genome DNA fragments and 10 pmol of His-tagged TF in binding buffer [10 mM Tris–HCl (pH 7.8 at 4°C), 3 mM Mg acetate, 150 mM NaCl and 1.25 mg/ml bovine serum albumin] was incubated for 30 min at 37°C [note that in the initial screening, TFs were added in excess to find the binding sites even with weak affinity]; (ii) the DNA–TF mixture was applied to a Ni-NTA column; (iii) to remove unbound DNA, the column was washed three times with binding buffer containing 10 mM imidazole; and (iv) DNA–TF complexes were eluted with elution buffer containing 200 mM imidazole. DNA was isolated from the DNA–TF complexes and PCR-amplified using EcoRV-F and EcoRV-R primers. This SELEX cycle was repeated at least twice, and up to eight times for some TFs, to enrich TF-binding DNA. At each SELEX step, a gel-shift assay was carried out to examine enrichment of specific DNA segments with high affinity to the test TF. The activity and specificity of most E. coli TFs are modulated by covalent modifications, such as phosphorylation, or by interaction with low-molecular-weight effectors (2). Therefore, the genomic SELEX screening of TF-binding sequences was also performed in the presence of effector ligands. 
Among the 116 TFs examined in this study, some were analyzed in the presence and absence of effector ligands; as a result, the total number of SELEX patterns increased to 156.
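As a rough consistency check on the library size, the number of 200–300-bp fragments needed for about 6.5-fold genome coverage can be estimated directly. The sketch below assumes a genome length of about 4.6 Mb for E. coli K-12, a value that is not stated in the text; the function name is illustrative only.

```python
# Back-of-the-envelope estimate of library size for a given fold coverage.
# ASSUMPTION: genome length of ~4.6 Mb (not given in the text).
GENOME_BP = 4_600_000


def fragments_for_coverage(genome_bp, fold, mean_fragment_bp):
    """Number of fragments whose summed length equals fold x genome length."""
    return genome_bp * fold / mean_fragment_bp


if __name__ == "__main__":
    for frag_bp in (200, 250, 300):
        n = fragments_for_coverage(GENOME_BP, 6.5, frag_bp)
        print(f"{frag_bp} bp fragments: ~{n:,.0f} fragments for 6.5-fold coverage")
```

Under this assumption, the estimate falls between roughly 1.0 × 10^5 and 1.5 × 10^5 fragments, in line with a library on the order of 10^5 different fragments.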

Identification of TF-bound DNA

To identify DNA segments recovered by genomic SELEX screening, we employed the SELEX-chip (tiling array) system. For this purpose, we used a DNA tiling array consisting of 43 450 species of 60-base probes spaced at an average interval of 45 bp along the E. coli genome. Using this array, the 200–300-bp SELEX fragments should bind to three or more consecutive probes. The TF-bound SELEX DNA fragments and the original DNA library were labeled with Cy5 and Cy3, respectively. The height of each peak was calculated relative to the combined fluorescence intensities of all probes within the highest peak. The height of TF-bound peaks correlates with the affinity of the test TF for the corresponding probe sequence. The fluctuation level of background DNA was <2-fold, as estimated from a SELEX experiment performed without added TF. At the resolution of the genomic SELEX system employed here, however, the exact binding sites and their orientations within promoter/operator regions remain unknown. The TF-binding sites within the E. coli genome were classified into four types depending on their positions relative to transcriptional units (Figure 1C): type-A, upstream of bidirectional transcription units; type-B, upstream of one transcription unit but downstream of another; type-C, downstream of two transcription units; and type-D, inside open reading frames. TFs bound to type-A spacers could regulate one or both bidirectional transcription units, whereas TFs bound to type-B spacers should regulate transcription in only one direction.
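The expectation that a genuine SELEX fragment covers three or more consecutive probes suggests a simple way to separate peaks from isolated noisy probes. The following is a minimal sketch of that idea, not the authors' analysis pipeline: it assumes the input is a list of per-probe Cy5/Cy3 ratios in genome order, and the default cut-off of 2.0 is an assumption borrowed from the <2-fold background fluctuation mentioned above.

```python
def call_peaks(ratios, cutoff=2.0, min_run=3):
    """Find runs of >= min_run consecutive probes with ratio above cutoff.

    ratios  : per-probe Cy5/Cy3 ratios, ordered along the genome (assumed input)
    cutoff  : hypothetical background cut-off
    min_run : minimum number of consecutive probes forming a peak
    Returns a list of (first_probe_index, last_probe_index, max_ratio).
    """
    peaks = []
    start = None
    for i, ratio in enumerate(ratios):
        if ratio > cutoff:
            if start is None:
                start = i  # a new run of above-cutoff probes begins here
        else:
            if start is not None and i - start >= min_run:
                peaks.append((start, i - 1, max(ratios[start:i])))
            start = None
    # Close a run that extends to the last probe.
    if start is not None and len(ratios) - start >= min_run:
        peaks.append((start, len(ratios) - 1, max(ratios[start:])))
    return peaks


if __name__ == "__main__":
    example = [1.0, 1.2, 5.0, 9.0, 6.0, 1.1, 3.0, 1.0, 2.5, 2.6, 2.7, 2.8]
    for first, last, height in call_peaks(example):
        print(f"probes {first}-{last}: max ratio {height}")
```

In this example the single elevated probe at index 6 is discarded because it is not supported by its neighbours, whereas the two longer runs are reported as peaks. Raising or lowering `cutoff` changes both the number and the height of the peaks that are retained.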

Construction of TEC database

The collection of datasets from SELEX screening was assembled into a single database, entitled ‘TEC’. The data in the TEC database are managed by a PostgreSQL relational database system. Web applications were developed on an Apache Tomcat web server. Software for data processing and map drawing was written in gcc, Perl or Java. BioProspector and WebLogo, furnished under the MIT license, can be used to analyze the consensus sequence of each TF.

RESULTS AND DISCUSSION

Classification of TFs

When we started this project in the mid-1990s, nothing was known of the structure and function of more than 100 DNA-binding TFs in E. coli K-12. Since we published the first classification of E. coli TFs (12), the regulatory functions have been identified for some of the uncharacterized TFs, and in addition, several additional proteins with DNA-binding activity have been shown to participate in transcriptional regulation. Accordingly, the classification list of DNA-binding TFs has been revised twice (1,2). Based on recent progress in this field, including the systematic screening using the improved genomic SELEX system described herein, we propose a revised version of the TF classification including 285 DNA-binding TFs in E. coli K-12 W3110 (Table 1; for details see Supplementary Table S1). Fifty-eight of the 285 TFs remain uncharacterized. The genomic SELEX screening system is particularly useful for the identification of regulatory targets of hitherto uncharacterized TFs, referred to as Y-gene TFs. Besides these DNA-binding TFs, about 10 species of TFs, including transcription termination and anti-termination factors such as Rho, NusA and GreA, can be isolated as complexes with RNAP in the absence of DNA. These RNAP-binding TFs without DNA-binding affinity are not included in the revised list of TFs.

Genomic SELEX search for TF regulatory targets

Genomic SELEX screening aimed at identification of the binding sites of all seven RNAP sigma factors and all 285 TFs from E. coli K-12 W3110 is currently underway. For instance, the full set of 669 constitutive promoters recognized by RpoD holoenzyme was identified by SELEX screening and has already been published (9,36). For SELEX screening of DNA sequences recognized by TFs, DNA–TF complexes were recovered by affinity purification using His-tagged TFs. Most TFs retain their binding activity on specific target sequences after addition of the His-tag at the N- or C-terminus. However, in some cases, such as OmpR, the addition of a His-tag at the N-terminus interferes with the recognition specificity of known target sequences, and in this case, the His-tag was added to the C-terminus (37). The cycle of SELEX screening was repeated at least twice, but in some cases, up to eight cycles were necessary for enrichment of specific DNA segments as detected by gel electrophoresis. The distribution of TF-bound DNA sequences along the E. coli genome was analyzed using a tiling array (16). In the case of E. coli TFs, target promoters, genes and operons are located in close proximity to TF-binding sites. Thus, the regulatory targets of each TF were estimated based on the binding site of the test TF (Figure 1C). In some cases, TF-bound DNA segments were identified using the cloning and sequencing procedure (the SELEX-clos system) (see the reference list in Table 2). The number of clones identified by the SELEX-clos procedure correlates with the affinity for the test TF. The combination of SELEX-clos and SELEX-chip patterns provides a more reliable set of regulatory targets for a given TF. 
In this report, we describe results of the SELEX screening for a total of 116 TFs (84 characterized and 32 uncharacterized) (Supplementary Table S2), of which the SELEX screening results have already been reported for 19 TFs (note that SELEX screening of SdiA was performed in the absence and presence of four different effectors) after experimental confirmation of the predicted targets (Table 2).
Table 2.

Regulation targets of transcription factor (TF)

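TF | Effector | Family | No. TF binding sites | No. regulation targets [A] | No. known targets [B] | Increase [A]/[B] | Ref
AscG | - | LacI | 13 | 13–19 | 4 | 3.3–4.8 | (91)
BasR | Acetyl-P | OmpR | 38 | 33–47 | 11 | 3.0–4.3 | (92)
BetI | Choline | RutR | 1 | 2 | 1 | 2.0 | Figure 2A
CitB | - | CitB | 11 | 11–15 | 5 | 2.2–3.0 | (93)
Cra | - | LacI | 194 | 193–296 | 40 | 4.8–7.4 | (16,58)
CRP | cAMP | CRP | 277 | 192–290 | 190 | 1.0–1.5 | (57)
LeuO | - | LysR | 140 | 131–173 | 15 | 8.7–11.5 | (60)
Lrp | - | AsnC | 314 | 218–296 | 43 | 5.1–6.9 | (63)
NanR | - | GntR | 2 | 3 | 3 | ∼1.0 | Figure 2C
NemR | - | RutR | 2 | 4 | 2 | ∼2.0 | (22)
NimR | - | AraC | 1 | 2 | 2 | ∼1.0 | (23)
OmpR | Acetyl-P | OmpR | 26 | 21–32 | 12 | 1.8–2.1 | (37)
PgrR | - | LacI | 11 | 8–11 | 4 | 2.0–2.75 | (24)
RbsR | - | LacI | 5 | 3–4 | 2 | 1.5–2.0 | (94)
RcdA | - | RutR | 20 | 11–17 | 8 | 1.4–2.1 | (25)
RutR | - | RutR | 40 | 35–60 | 7 | 5.0–8.6 | (26)
SdiA | - | SdiA | 31 | 20–27 | 3 | 6.7–9.0 | (41)
SdiA | HSL-A | SdiA | 43 | 32–54 | 3 | 10.7–18.0 | (41)
SdiA | HSL-F | SdiA | 26 | 17–29 | 3 | 5.7–9.7 | (41)
SdiA | HSL-K | SdiA | 37 | 26–41 | 3 | 8.7–13.7 | (41)
SdiA | HSL-OC | SdiA | 34 | 25–44 | 3 | 8.3–14.7 | (41)
SutR | - | DinJ | 17 | 13–15 | 4 | 3.3–3.8 | (28)
UlaR | Acetyl-P | DeoR | 2 | 3 | 2 | 1.0–1.5 | Figure 2B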
The regulation targets of these TFs were predicted by genomic SELEX screening and afterward experimentally confirmed. The numbers of known targets are from EcoCyc.
Nucleoid protein
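TF | Effector | Family | No. TF binding sites | No. regulation target operons | No. targets (EcoCyc) | Ref
Dan | - | LysR | 688 | - | - | (21)
Fis | - | Fis | 1269 | - | - | Figure 3A
HNS | - | HNS | 987 | - | - | (60); Figure 3C
IHF | - | HU | 813 | - | - | Figure 3B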

The binding sites of these nucleoid proteins were identified by using the in vitro SELEX screening system.

The affinity and specificity of DNA binding by E. coli TFs are mostly modulated by either protein phosphorylation or interactions with low-molecular-weight effectors. E. coli contains 285 species of DNA-binding TF (Table 1), of which 30–33 are organized into two-component systems (TCSs) (38,39), each consisting of a sensor histidine kinase (HK) that can auto-phosphorylate a specific His residue and a response regulator (RR) that can phosphorylate its own Asp residue using the HK-bound phosphate. In most cases, RR phosphorylation can take place in vivo using acetyl phosphate (AcP) as a substrate. Therefore, for TFs belonging to TCS RRs, SELEX screening was performed in the presence or absence of high concentrations of acetyl phosphate. In this report, we provide SELEX screening data for 11 TCS RRs, including the hitherto uncharacterized YpdB (Table 2 and Supplementary Table S2). On the other hand, the activity and specificity of about 90% of other E. coli TFs are regulated through molecular interactions with low-molecular-weight effector ligands. For 106 TFs under the control of known effectors, SELEX screening was performed in the presence or absence of the appropriate effector ligands. For some TFs, the involvement of multiple effectors in activity control has been established, and for TFs in this group, SELEX was performed in the presence of individual effectors such as Arg and Lys for ArgP (40) or HSL-A, HSL-F, HSL-K and HSL-OC for SdiA (41). For some TFs, the activity and specificity of DNA recognition are modulated through protein–protein interactions. For instance, SELEX screening of the nucleoid protein H-NS was performed in the presence or absence of its small partner protein Hha (42). As a result, the total number of SELEX patterns for effector-regulated TFs increased to 156 (Supplementary Table S2).
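The 'Increase [A]/[B]' column of Table 2 can be read as the range of predicted regulation targets [A] divided by the number of known targets [B]. The short sketch below reproduces that ratio for three rows taken from Table 2; the function name and the rounding to one decimal place are assumptions made for illustration.

```python
def fold_increase(predicted_min, predicted_max, known):
    """Return the (low, high) ratio of predicted to known targets, rounded to 0.1."""
    return (round(predicted_min / known, 1), round(predicted_max / known, 1))


# (min predicted targets, max predicted targets, known targets) from Table 2
rows = {
    "Cra": (193, 296, 40),
    "LeuO": (131, 173, 15),
    "RutR": (35, 60, 7),
}

if __name__ == "__main__":
    for tf, (lo, hi, known) in rows.items():
        low, high = fold_increase(lo, hi, known)
        print(f"{tf}: {low}-{high}-fold")
```

For these three rows the computed ranges agree with the tabulated values (4.8–7.4 for Cra, 8.7–11.5 for LeuO and 5.0–8.6 for RutR).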

Prediction of regulatory targets of TFs

The whole set of regulatory target promoters, genes and operons for each TF was predicted based on the distribution of TF-binding sites along the entire E. coli genome, as determined by SELEX patterns (see Figure 1C for the classification of TF-binding sites). SELEX patterns, however, always include background noise arising, for instance, from TF binding to non-specific DNA sequences and incomplete removal of unbound DNA segments during SELEX screening. Therefore, the initial screening of the targets of each TF was carried out by setting a fixed but low cut-off for all TFs (Supplementary Table S2). The binding sites of 116 TFs (156 SELEX patterns) were located within 641 type-A spacers between bidirectional transcription units, which are thus predicted to regulate a minimum of 641 (in cases of one-direction transcription) and a maximum of 1282 units (in cases of bidirectional transcription) (Supplementary Table S3, type-A spacer). Based on the known operon organization, the total number of genes under the control of 641 type-A spacer-associated TFs is 2176 (average, 3.39 genes per TF). By contrast, a total of 1212 TFs bind to promoter regions of type-B spacers, which should regulate unidirectional transcription units including 1858 genes (average, 1.53 genes per TF) (Supplementary Table S3, type-B spacer). One surprising finding of this research is that as many as 1223 TF-binding sites were identified within type-B spacers inside operons (Supplementary Table S3, type-B spacer). A marked difference in the number of TF-binding peaks was observed among the 143 SELEX patterns, ranging from a single peak to more than 1000 peaks. Likewise, a marked difference in TF-binding intensity was observed, with a ∼10 000-fold difference (after background subtraction) in the peak height between the highest and the lowest values. Both the number and the height of TF-binding peaks vary after resetting the cut-off level. 
The background level differed among the 156 SELEX patterns. To achieve a more accurate estimation of regulatory targets, care was taken to set the background level for each TF. One important criterion employed for background setting was the inclusion of known targets, allowing a more accurate estimation of specific TF-binding sites and regulatory targets. Depending on the research purpose, the background can be freely controlled by using the TEC database (see below). The set of regulatory targets described in Supplementary Table S2 is a list of estimates that is nevertheless useful for detailed analysis of the regulatory function of each TF. For confirmation of the exact regulatory targets, we employ several experimental approaches such as in vitro assays of DNA–TF interaction (by gel retardation assay and DNase foot-printing) and in vivo assays of promoter regulation by the test TFs (using reporter assays, northern blot and RT-PCR of the target RNAs). The regulatory targets have been experimentally confirmed for a total of 19 TFs (Table 2). For these TFs, the number of regulatory target promoters (and operons) was 2- to 20-fold higher than that listed in databases such as RegulonDB and EcoCyc (Table 2). The list of promoters identified by the SELEX screening system employed here represents the potential promoters that are recognized in the presence of excess TF alone, in the absence of other DNA-binding proteins. The number of target promoters should decrease in vivo with a decrease in the level of functional TF and in the presence of competing proteins. Early molecular genetic studies suggested that E. coli promoters are regulated by a single specific regulatory protein, either a repressor or an activator; a classic example is regulation of the lac operon by LacI repressor (43). Accordingly, most TFs in E. coli are believed to be ‘local’ TFs that control transcription of a specific gene or a small number of transcription units (44,45). 
Until recently, only a few were classified as global regulators that influence the expression of a large number of transcription units belonging to different metabolic pathways, thereby exhibiting pleiotropic phenotypes (10,46–47). The set of global TFs includes CRP (cAMP receptor protein), FNR (fumarate and nitrate reduction), ArcA (anoxic redox control), Fur (ferric uptake regulation), Lrp (leucine-responsive regulatory protein) and NarL (nitrate/nitrite response regulator) (48). Based on the whole sets of regulatory targets for the 116 TFs (and 156 SELEX patterns) listed in this report, we next reexamined the classification of TFs with respect to the number of regulatory targets.
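The position-based prediction used in this section (Figure 1C) can be sketched as a small classifier that takes the orientation of the transcription units flanking a binding site. This is an illustrative reading of the four site types, not the authors' software: the `Unit` type, the strand convention and the decision to return no position-based targets for type-C and type-D sites are assumptions, as are the strand assignments in the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    name: str
    strand: str  # "+" = transcribed left to right, "-" = right to left


def classify_site(left, right, inside_orf=False):
    """Classify a binding site lying between two flanking units.

    Returns (site_type, potential_targets).
    """
    if inside_orf:
        return "D", []  # inside an open reading frame
    if left.strand == "-" and right.strand == "+":
        return "A", [left.name, right.name]  # divergent: upstream of both units
    if left.strand == "+" and right.strand == "-":
        return "C", []  # convergent: downstream of both units
    # Tandem units: the spacer is upstream of exactly one of them.
    target = right if right.strand == "+" else left
    return "B", [target.name]


def unit_bounds(n_type_a_spacers):
    """Each type-A spacer controls at least one and at most two units."""
    return n_type_a_spacers, 2 * n_type_a_spacers


if __name__ == "__main__":
    # Strand assignments below are illustrative, not taken from the text.
    print(classify_site(Unit("betIBA", "-"), Unit("betT", "+")))
    print(unit_bounds(641))
    print(round(2176 / 641, 2), "genes per type-A spacer on average")
```

With 641 type-A spacers, `unit_bounds` returns the minimum of 641 and maximum of 1282 transcription units quoted above, and 2176/641 gives the stated average of 3.39.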

Single-target transcription factors (local regulators)

Among the 116 TFs (156 SELEX patterns) examined, 24 (20%) regulate fewer than 10 genes. This group of local regulators includes BetI (betaine inhibitor), NorR (NO reduction and detoxification regulator), NanR (N-acetyl-neuraminic acid regulator), NimR (2-imidazole regulator) and UlaR (utilization of L-ascorbate operon regulator). For instance, BetI binds to a single site within the spacer between the bidirectional transcription units betIBA (encoding BetI itself and enzymes for glycine betaine biosynthesis from choline) and betT (encoding choline:H+ symporter) (Figure 2A: genome-wide pattern in upper panel, expanded local pattern in lower panel). Expression of the bet operon is regulated in vivo by the effector choline (49), but BetI is a unique regulator that remains bound to DNA even in the presence of choline (50). NanR-binding sites were identified in the spacer between nanCM and fimB, and upstream of nanATEK (Figure 2C). NanR activates transcription of both nanCM and nanATEK operons, together encoding proteins involved in transport and metabolism of N-acetyl-neuraminic acid (51,52), but NanR also represses transcription of fimB, which encodes a Fim switch protein (51,52). NimR, a typical single-target regulator, regulates divergent transcription of nimT, which encodes a transporter of the anti-bacterial agent 2-imidazole (23). UlaR is the repressor of the divergent ula operon involved in transport and catabolism of L-ascorbate (53). The high-affinity binding site of UlaR is located within the spacer between the bidirectional transcription units ulaG and ulaABCDEF (Figure 2B), both of which are involved in transport and utilization of L-ascorbic acid. In addition, a low-level peak was identified upstream of the ulaR gene itself, indicating that UlaR regulates its own expression.
Figure 2.

Single-target TFs. After in vitro SELEX screening of DNA-binding sequences of purified TFs, their regulatory targets were predicted based on their binding sites, as noted in Figure 1C. Among 156 SELEX patterns examined for 116 TFs, a small number of TFs regulated only one specific target operon. Some representative SELEX patterns are shown: (A) BetI; (B) UlaR; and (C) NanR. For each SELEX pattern, the upper panel shows the genome-wide distribution of TF-binding sites, whereas the lower panel shows the expanded local region of a TF-binding site [note that the expanded patterns were retrieved from the TEC (Transcription Profile of Escherichia coli) database]. The Y-axis indicates the fluorescent intensity of SELEX fragment binding to each probe relative to that of library DNA. The Y-axis in each expanded image indicates the level of each peak relative to that of the highest peak.


Multi-target transcription factors (global regulators)

On the other hand, Genomic SELEX screening revealed that most TFs herein examined regulate multiple targets, ranging from 10 to 100 operons. Among the TFs controlling more than 100 targets, two sets of TFs were shown to act as global regulators of metabolism: CRP (cAMP receptor protein) and Cra (catabolite repressor and activator) for carbon metabolism, LeuO (leucine regulator) and Lrp (leucine response regulatory protein) for nitrogen metabolism (Table 3).
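The three size classes used in this section (fewer than 10 targets, 10 to 100 targets, more than 100 targets) can be written down as a small rule. The following is only an illustrative sketch: the function name, the class labels, the handling of the boundary values 10 and 100, and the example counts are assumptions made here, not part of the original analysis.

```python
def classify_tf(n_targets: int) -> str:
    """Assign a TF to a size class from its number of predicted targets.

    Thresholds follow the text: fewer than 10 targets (local regulators),
    10 to 100 targets (multi-target regulators), more than 100 targets
    (global regulators). Treating exactly 10 and exactly 100 as
    'multi-target' is an assumption of this sketch.
    """
    if n_targets < 0:
        raise ValueError("target count must be non-negative")
    if n_targets < 10:
        return "local"
    if n_targets <= 100:
        return "multi-target"
    return "global"


# Hypothetical target counts, for illustration only.
example_counts = {"TF_a": 1, "TF_b": 42, "TF_c": 218}
classes = {tf: classify_tf(n) for tf, n in example_counts.items()}
```

Because the predicted number of targets is often given as a minimum–maximum range, a TF near a threshold may fall into different classes depending on which end of the range is used.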
Table 3.

Overlapping of binding sites between silencer H-NS and global regulators

[A] Overlapping of H-NS binding sites with global regulators

TF | Total no. H-NS sites | Type-A spacer (A): no. H-NS sites | (A) overlap TF | (A) no. TF sites | (A) overlapping (%) | Type-B spacer (B): no. H-NS sites | (B) overlap TF | (B) no. TF sites | (B) overlapping (%) | Total no. H-NS (A + B) | Total no. overlap TF sites | Overlapping (%) | Ref
H-NS | 987 | 137 | CRP | 37 | 27.0 | 294 | CRP | 36 | 12.2 | 431 | 73 | 16.9 | (60)
H-NS | 987 | 137 | LeuO | 31 | 22.6 | 294 | LeuO | 56 | 19.1 | 431 | 87 | 20.2 | (60)
H-NS | 987 | 137 | Lrp | 36 | 26.3 | 294 | Lrp | 61 | 20.7 | 431 | 97 | 22.5 | (60)

[B] Overlapping of global TF-binding sites with silencer H-NS

TF | Total no. TF sites | Type-A spacer (A): no. TF sites | (A) overlap H-NS sites | (A) no. H-NS sites | (A) overlapping (%) | Type-B spacer (B): no. TF sites | (B) overlap H-NS sites | (B) no. H-NS sites | (B) overlapping (%) | Total no. TF sites (A + B) | Total no. overlap H-NS sites | Overlapping (%) | Ref
CRP | 374 | 142 | H-NS | 37 | 26.0 | 141 | H-NS | 36 | 25.5 | 283 | 73 | 25.8 | (57)
LeuO | 140 | 42 | H-NS | 31 | 73.8 | 76 | H-NS | 56 | 73.7 | 118 | 87 | 73.7 | (60)
Lrp | 314 | 78 | H-NS | 36 | 46.2 | 140 | H-NS | 61 | 43.6 | 218 | 97 | 44.5 | (63)

The three TFs in focus (CRP, LeuO and Lrp) were set in bold in the original table.

The availability of carbon in the environment influences the expression pattern of multiple genes in E. coli in various ways. The cAMP receptor protein CRP, also called catabolite gene activator protein (CAP), is the best-characterized global regulator of genes involved in transport and utilization of carbon sources (54–56). Lack of glucose activates synthesis of cAMP, which associates with CRP to convert it into an active regulator of transcription. Based on the SELEX pattern of CRP in the presence of cAMP, CRP was predicted to regulate a minimum of 119 and a maximum of 219 target operons (57). Taken together with the hitherto identified targets, a total of 283–425 operons were predicted to be under the control of the cAMP-CRP complex (Table 3). The primary role of cAMP-CRP is control of genes involved in both uptake of carbon sources and metabolism downstream of glycolysis, including the TCA cycle and aerobic respiration.

On the other hand, Cra (renamed from FruR, another global regulator of carbon metabolism) was determined to bind a total of 173 sites, allowing prediction of between 70 and 104 regulatory targets (16,58). Cra was shown to act as an activator of most genes encoding enzymes involved in gluconeogenesis, the TCA cycle, and the glyoxylate shunt pathway, as well as a repressor of the genes encoding the Entner–Doudoroff pathway and glycolysis. Derepression of glycolysis genes takes place when the repressor Cra is inactivated after interaction with inducers such as D-fructose-1-phosphate and D-fructose-1,6-bisphosphate. In the presence of glucose, the intracellular concentrations of these inducers increase, and they interact with Cra to prevent it from binding to the target operons.

Genomic SELEX screening revealed that two global regulators, LeuO and Lrp, regulate a large set of genes including those involved in N metabolism. Leucine, a metabolic signal that controls the level of amino acids, affects expression of multiple E. coli genes. LeuO, one of the Leu sensors, was originally identified as a regulator of genes involved in Leu biosynthesis (59). Genomic SELEX screening identified at least 140 LeuO-binding sites in the E. coli genome (60), of which 118 are in promoter regions involved in gene regulation. Again the total number of regulatory targets increased, from 15 known targets to 118 (60), of which 74% overlap with the binding sites of H-NS, the universal silencer of stress-response genes (Table 3). This finding indicates that one important function of LeuO is anti-silencing of H-NS-mediated repression of a set of unnecessary genes, including some toxic genes transferred from other organisms.

Another Leu sensor, Lrp, regulates genes involved in nutrient transport, biosynthesis and catabolism of amino acids, as well as utilization of amino acids for a variety of cellular functions (61,62). Using Genomic SELEX screening, we identified 314 Lrp-binding sites on the E. coli genome, and predicted 218–296 regulatory targets (Table 3) (63). Consistent with the role of Lrp in sensing leucine availability, multiple genes involved in nitrogen metabolism and genes encoding components of the translation system appear to be under the direct control of Lrp. In addition, a variety of stress-response genes that respond to nutrient availability were also included in the Lrp regulon. With respect to the wide range of genome regulation, the role of Lrp as a key regulator of N-source utilization is similar to that of CRP in carbon utilization.

Nucleoid transcription factors (bifunctional regulators)

In the E. coli nucleoid, two groups of nucleoid protein are present: universal nucleoid proteins (UNPs) that always remain in the nucleoid, and growth phase- and growth condition-specific nucleoid proteins (GNPs) that appear only at specific phases of cell growth (12,64). All of these nucleoid proteins are bifunctional and play both architectural roles in folding and packaging of genomic DNA and regulatory roles in genome functions (12,64–66). We subjected all of the nucleoid proteins to SELEX screening. IHF is abundant during all growth phases, at concentrations ranging from 6000 dimers per cell in log phase to 3000 dimers per cell in stationary phase (67,68). Genomic SELEX screening revealed 813 IHF-binding sites in the E. coli genome (Figure 3C). The list of IHF-binding targets supports its dual role, i.e. an architectural role in DNA supercoiling and DNA duplex destabilization, and a regulatory role in genome functions controlling processes such as DNA replication, recombination, and the expression of multiple genes (12,69–70). H-NS is present in many copies per cell in exponential growth phase, but is much less abundant in stationary phase (12,65,71). H-NS is a general silencer of a large number of unused genes, with a strong preference for horizontally acquired genes (72–74). Genomic SELEX screening revealed that the E. coli genome contains 987 H-NS binding sites (Figure 3B). Almost all the genes under silencing control by H-NS are included in the list of H-NS binding targets.
Figure 3.

Multi-target TFs. SELEX screening of TF-binding sequences revealed that a set of the bifunctional nucleoid proteins with both architectural roles in genome folding and regulatory role in transcription bind up to ∼1000 sites along the entire Escherichia coli genome. The number of binding peaks was: 1269 for Fis (A); 987 for H-NS (B); and 813 for IHF (C). High-level binding sites of these nucleoid proteins within type-A and type-B spacers are shown under green background. In the case of bidirectional transcription units within type-A spacers, the first genes on both sides are shown, whereas in the case of type-B spacers, the first genes in one transcriptional direction are shown. The Y-axis indicates the fluorescent intensity of SELEX fragment binding to each probe relative to that of library DNA.

The GNP group of nucleoid proteins includes Fis (preferentially expressed in exponential growth phase), Dps (expressed only in the stationary phase) and Dan (expressed under anaerobic conditions), all of which participate in the organization and maintenance of nucleoid structure through direct contact with specific sequences in the genome. Marked growth-coupled changes are observed in the intracellular levels of these nucleoid proteins (67,68). Fis is a typical GNP involved in activation of the growth-related genes (75). SELEX screening identified 1269 Fis-binding sites (Figure 3A), including multiple genes necessary for high-rate growth such as components of the translation system. The stationary phase-specific nucleoid protein Dps binds in vitro to a total of 624 sites, but as in the case of other nucleoid proteins, it forms clusters on the genome through cooperative protein–protein interactions when its concentration increases. The uncharacterized nucleoid protein YgiP, renamed Dan, is virtually absent during aerobic growth, but its levels increase under anaerobic conditions, when it binds 688 sites (21). Overall, each nucleoid protein has a large number (∼1000) of primary binding sites on the E. coli genome.

Novel concepts of the genome regulation using SELEX data

Multi-factor promoters

Current promoter databases indicate that ∼50% of E. coli promoters are under the control of a single specific regulator, whereas the remaining promoters are regulated by two or more TFs. Genomic SELEX search, however, revealed that most E. coli promoters carry binding sites for multiple TFs, each of which monitors a different environmental condition or a different metabolic state. The involvement of multiple TFs may be employed in fine tuning of genomic transcription. One well-characterized example is the regulation of the promoter of the csgD gene, which encodes the master regulator of biofilm formation. The expression of csgD is controlled by more than 10 TFs, each monitoring a specific environmental factor or condition that is effective in induction of biofilm formation (2,32,76–77) (Figure 4A). In parallel with the SELEX screening, a systematic search for multi-TF promoters is being carried out using the PS-TF (promoter-specific TF) screening system (41).
Figure 4.

Multi-TF promoters. After SELEX screening of regulatory targets for a number of TFs, multiple TFs were shown to bind the same promoters. (A) Promoter of the csgDEFG operon, which encodes CsgD, the master regulator of biofilm formation and the biofilm components CsgEFG. (B) Promoter of the flhDC operon, which encodes the master regulator of flagella formation. (C) Promoter of the gadE-mdtEF operon, which encodes the acid-stress response regulator and multi-drug efflux system proteins. (D) Promoter of the fimBE operon, which encodes the recombinase for fimbriae switching. TFs shown in blue indicate those hitherto listed in RegulonDB (29,30) and EcoCyc (31,32) whereas TFs shown in green were identified by this research team and published, but not listed in these databases: H-NS, RcdA and RcdB for the csgD promoter (2,32,76–77); and CpxR for the flhD promoter (2,32,78–79). TFs shown in orange are those identified by the SELEX screening (Ishihama, A., unpublished).

A set of the genes that are involved in construction of the flagella, the key apparatus of bacterial motility for planktonic growth, is controlled by the master regulator FlhDC (78). The flhDC operon promoter is also under the direct control of multiple TFs (Figure 4B), each monitoring a different environmental condition and factor (32,78–79). The gadE operon encodes the master regulator (GadE) of pH homeostasis and the multidrug efflux system (MdtEF) (80,81). In concert with this wide range of regulation, the gadE promoter is under the control of more than nine TFs (32,80–81) (Figure 4C). As in the case of the csgD, flhD and gadE operons, the fimBE operon encoding the fim switch recombinase for type-I fimbriae expression is controlled by various environmental signals through at least eight TFs (Figure 4D) (32,52).

Multi-factor operons

During the search for TF-binding sites by SELEX screening, we identified a set of TFs that bind inside specific operons (type-D binding site; see Figure 1) (2,27). For instance, Cra (renamed from FruR), one of the global regulators controlling the whole set of genes encoding enzymes for central carbon metabolism, binds to a total of 170 sites on the E. coli genome, of which 100 sites are located on open reading frames (type-D site, see Figure 1) (57). This unique pattern of Cra binding implies that Cra is involved in regulation, either activation or repression, of flanking genes organized downstream of the same operons. The assembly of SELEX patterns allows identification of operons that carry binding sites for multiple TFs within one and the same operon. For instance, the hyfA operon is composed of 12 genes: the ten 5′-proximal genes encoding all 10 subunits of the hydrogenase that interacts with formate dehydrogenase (FdhF) to produce the active formate hydrogenlyase complex; the hyfR gene encoding a regulator of hydrogenase 4; and the 3′-proximal focB gene coding for a formate transporter (82). At least five different TFs bind along this 13 581-bp operon, including YahA (TF with c-di-GMP phosphodiesterase activity), GalR (galactose repressor), YidZ (anaerobiosis- and NO-response TF), QseB (quorum-sensing TCS response regulator) and Rob (Mar-Sox-Rob group dual regulator) (Figure 5A). Inside the 3228-bp fusion operon, including the 5′-proximal sdhCDAB genes encoding quinone oxidoreductase and the 3′-proximal sucABCD genes encoding 2-oxoglutarate decarboxylase (83), two regulators bind: TreR (trehalose repressor) within the sdhCDAB operon and Fis (nucleoid-architectural regulator of growth-related genes) within the sucABCD operon (Figure 5B). Inside the 13 390-bp wca operon, which includes 12 genes involved in synthesis of the capsular polysaccharide export apparatus, the binding sites of QseB (QS regulator) and YgfB (uncharacterized TF) were identified (Figure 5C). Likewise, the binding sites of at least two regulators, SoxR (superoxide response regulator) and EvgA (acid response TCS regulator), were identified within the 15 024-bp nuoA operon, which encodes the NADH dehydrogenase I complex (53) (Figure 5D). We hypothesize that TFs located inside operons play as-yet-unidentified roles in transcription initiation from internal promoters and/or transcription continuation, including transcriptional pausing, elongation, termination and anti-termination.
Figure 5.

Binding of TFs inside operons. After SELEX screening of DNA-binding sequences for a total of 116 TFs, some operons were found to carry the binding sites for specific TFs. Some representative operons are shown: (A) hyfA operon (13 581 bp); (B) sdhC operon (9872 bp); (C) wcaC operon (13 390 bp); and (D) nuoA operon (15 024 bp). Some TFs with clear binding peaks are shown, but there were several small peaks within all four operons. The binding level of the general silencer H-NS is also shown (blue line) even though the peak height is low. The fluorescent intensity of SELEX fragment binding to each probe relative to that of library DNA was determined, and is shown as the relative value to that of the highest peak.


Novel TF network between silencers and anti-silencers

Nucleoid structuring protein (H-NS) plays a key role in silencing of a set of genes acquired through horizontal gene transfer, including pathogenicity islands (84,85). Gene silencing is now recognized to expand to the set of E. coli genes that are not essential for growth under laboratory culture conditions. After extensive comparison of binding sites between LeuO and H-NS, we realized that among 140 LeuO-binding sites, as many as 133 (95%) overlapped with the H-NS-binding sites (60). Among a total of 118 LeuO-binding promoters (type-A and type-B spacers; see Figure 1), 87 (74%) overlapped with H-NS-binding sites (Table 3). Our findings support the role of LeuO in antagonizing H-NS-mediated silencing. In fact, the order-of-addition experiment indicated that LeuO did not bind to its targets when LeuO was added after addition of silencer H-NS (60). Extending along these lines, we examined the possible overlap of other global regulators with H-NS binding. To our surprise, among a total of 431 high-level H-NS-binding sites that were identified in promoter regions, 97 (22.5%) and 73 (16.9%) promoters harbor the binding sites of Lrp and CRP, respectively (Figure 6). Including LeuO (20.2%), overall more than 50% of H-NS-binding sites were found to overlap with the binding sites of these three global regulators (LeuO, Lrp and CRP). Based on these findings, we propose that one key role of global regulators is anti-silencing, e.g. antagonism of the general silencer H-NS. The search for global regulators involved in anti-silencing of other H-NS binding sites is in progress.
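The percentages quoted above can be re-derived from the counts in Table 3. The short sketch below does only that arithmetic; the variable and function names are illustrative choices, and the combined figure is an upper bound that assumes no promoter-region H-NS site is shared between two of the three regulators (the text does not state how much they overlap with one another).

```python
# Counts taken from Table 3 [A]: H-NS sites in promoter regions
# (type-A plus type-B spacers) and, for each global regulator, the number
# of those sites that also carry a binding site for that regulator.
HNS_PROMOTER_SITES = 431
OVERLAP_WITH_HNS = {"CRP": 73, "LeuO": 87, "Lrp": 97}


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of promoter-region H-NS sites overlapped by each regulator.
per_tf = {tf: percent(n, HNS_PROMOTER_SITES) for tf, n in OVERLAP_WITH_HNS.items()}

# Upper bound for the combined share; assumes the three overlap sets are disjoint.
combined_upper_bound = percent(sum(OVERLAP_WITH_HNS.values()), HNS_PROMOTER_SITES)

# Seen from the LeuO side (Table 3 [B]): 87 of 118 LeuO-bound promoters.
leuo_side = percent(87, 118)

print(per_tf, combined_upper_bound, leuo_side)
```

Under the disjointness assumption the three regulators together cover just under 60% of the 431 promoter-region H-NS sites, which is consistent with the statement that more than 50% overlap.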
Figure 6.

Network involving the silencer H-NS and the anti-silencers Crp, LeuO and Lrp. Among the 987 binding sites of the nucleoid protein H-NS (see Figure 3B), 431 are located near promoter regions within type-A and type-B spacers, where H-NS acts as a general silencer of gene expression. These H-NS binding sites overlap with binding sites of global regulators: Crp (73 sites), LeuO (87 sites) and Lrp (97 sites). High-level overlap of binding sites between the silencer H-NS and the global regulators Crp, LeuO and Lrp indicates that these global regulators act as anti-silencers, and that each is involved in derepression of a large set of target operons (for details see Table 3).


Transcription profile of Escherichia coli: TEC database

The SELEX screening data were assembled into a database entitled ‘Transcription Profile of E. coli’ (TEC), which is open to the public. The first version of this TEC database compiles SELEX-chip analysis data of 123 TFs herein described. TEC is complementary to the established databases, EcoCyc (29,30), RegulonDB (31,32) and CollecTF (86). The most important and unique property of this database is that all data were obtained from a single strain of E. coli K-12 W3110, using the same experimental systems, and by a single research group. Therefore, the data are quite reliable and useful, especially for comparisons among different TFs. The TEC database is the first to provide both locations and intensity of TF binding across the E. coli genome. Therefore, it has enormous potential to provide new insight and facilitate discoveries, especially for TFs whose targets have not yet been identified. TEC provides a number of functions, some of which are discussed below.

Search for TF-binding sites and regulatory targets

The genome-wide distribution of TF-binding sites can be displayed on a single pattern (Figure 7A), and candidate genes and operons under the control of a specified TF can be retrieved from this pattern. The list of regulated genes and operons can also be narrowed down by changing the cut-off value of TF-binding intensity. By selecting a set of genes under the control of a specific TF, a bar chart can be generated in which the vertical axis corresponds to the binding strength between the TF and probes and the horizontal axis represents genome location (Figure 7B). For identification of binding sites of multiple TFs, the binding patterns of specified TFs can be assembled in a single chart (Figure 7C). From two to ten TFs can also be selected on the same screen, and multiple bar charts will be displayed. For comparison, users can merge these into one chart with different colors by clicking the ‘merge’ button (Figure 7C). In addition to the bar charts, TEC provides a heat map view, allowing visualization of binding intensities among test TFs (Figure 7D).
Figure 7.

TEC database. All SELEX patterns described herein are compiled in the TEC database. Using this database, various types of analysis are possible: (A) search for the list of regulatory targets under direct control of each TF. (B) The location of binding peaks of a test TF along the entire E. coli genome can be visualized at various scales, from the whole genome to restricted areas. (C) The relative location of binding sites of different TFs can be visualized on various scales. (D) Heat map of binding intensity of test TFs along the entire E. coli genome. (E) Search for the consensus sequence of TF binding, using the whole set of TF-binding sequences.


Search for TF consensus sequence

Using the sequence collection of binding regions of a TF of interest, the consensus sequence of TF binding can be analyzed with the open-source tool ‘BioProspector’, and shown in WebLogo style (Figure 7E). For this purpose, all 500-bp sequences centered on a probe are prepared in advance. Following selection of a test TF and a cut-off value for TF-binding level, only runs of contiguous probes exceeding the cut-off are automatically extracted. Then the probe with the highest intensity in each contiguous probe set is collected, and the sequences of the final collection are imported into the BioProspector software.
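The probe-selection step described above (keep runs of contiguous probes above the cut-off, then take the strongest probe of each run) can be sketched as follows. This is a minimal illustration and not the TEC implementation: the function names, the representation of probes as a list of intensities in genome order, and the handling of the 500-bp window (no wrap-around at the ends of the circular genome) are all assumptions.

```python
from itertools import groupby


def pick_peak_probes(intensities, cutoff):
    """Return the index of the strongest probe in each contiguous run of
    probes whose intensity exceeds `cutoff`.

    `intensities` is assumed to be ordered by genome position.
    """
    peaks = []
    indexed = list(enumerate(intensities))
    # groupby splits the ordered probes into alternating runs that are
    # above / not above the cut-off.
    for above, run in groupby(indexed, key=lambda item: item[1] > cutoff):
        if above:
            best_index, _ = max(run, key=lambda item: item[1])
            peaks.append(best_index)
    return peaks


def window_around(center, width=500):
    """Start and end coordinates of a `width`-bp window centered on a probe
    (hypothetical helper; ignores genome boundaries)."""
    start = center - width // 2
    return start, start + width


# Hypothetical relative intensities for eight consecutive probes.
example = [0.1, 0.6, 0.9, 0.7, 0.2, 0.1, 0.8, 0.3]
print(pick_peak_probes(example, cutoff=0.5))
```

Raising the cut-off shortens or removes runs, so fewer sequences are passed on to the motif search; this mirrors how the cut-off value narrows the list of candidate sites elsewhere in TEC.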

Search for regulatory networks

Using the Regulatory Network page of TEC, the list of regulated genes can be retrieved as compressed zip file(s), including .eda, .sif and .txr files, and the TF network can then be created using ‘Cytoscape’. The levels of TF peaks can be modulated by controlling the cut-off value. This function was successfully used to identify the regulatory network involving the H-NS silencer and global regulators with anti-silencing activity (see Figure 6).
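For readers unfamiliar with the .sif file type, the sketch below shows how a list of TF-target pairs filtered by a cut-off could be written in Cytoscape's simple interaction format, in which each line holds a source node, an interaction type and a target node. It is an illustration only: the function name, the interaction label "binds", the tuple layout and the binding levels are assumptions, and the exact content of the files exported by TEC is not described in the text.

```python
def to_sif_lines(bindings, cutoff, interaction="binds"):
    """Build SIF lines (source<TAB>interaction<TAB>target) for every
    TF-target pair whose binding level is at or above `cutoff`.

    `bindings` is an iterable of (tf, target, level) tuples.
    """
    lines = []
    for tf, target, level in bindings:
        if level >= cutoff:
            lines.append(f"{tf}\t{interaction}\t{target}")
    return lines


# TF-promoter pairs mentioned in the text; the binding levels are hypothetical.
example_bindings = [
    ("H-NS", "csgD", 0.9),
    ("CpxR", "flhD", 0.4),
]

sif_lines = to_sif_lines(example_bindings, cutoff=0.5)
print("\n".join(sif_lines))
```

Writing these lines to a file with a .sif extension gives a network that Cytoscape can import directly; lowering the cut-off adds weaker interactions as additional edges.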

Future of TEC database

Based on advances in strategies for whole-genome studies, several attempts have been made to identify the full set of regulatory targets by comparing transcriptome patterns between wild-type and mutants lacking or over-producing specific TFs (87,88). However, the majority of genes detected in this manner represent indirectly affected genes (1,2). A search for genomic sites associated with TFs in vivo has been carried out using chromatin immunoprecipitation (ChIP)-chip assays (89,90). As noted above, the in vivo binding of TFs within nucleoids interferes with the binding of competing proteins, but it is worthwhile to compare the list of potential TF-binding sites identified in vitro by SELEX-chip and the actual binding sites within nucleoids in vivo, detected by ChIP-chip. For instance, among a total of 70–80 in vitro binding sites of RutR (the key regulator of pyrimidine synthesis and degradation), only 10–20 sites were found to be bound in vivo by RutR under normal growth conditions (1). SELEX screening of all 285 TFs from E. coli is in progress, and we plan to add more SELEX data to the TEC database. In parallel with SELEX screening for target promoters, genes and operons of each sigma factor and TF, we initiated a systematic search of TFs involved in regulation of one specific promoter using a newly developed PS-TF (promoter-specific TF) screening system (41). The results of PS-TF screening will be added to TEC in the near future. Using both SELEX screening and PS-TF screening data together, TEC will provide a comprehensive view of the genome-wide regulation of transcription in E. coli. The interaction of TFs with target promoters, genes and operons depends on the intracellular concentration of each TF, which varies depending on environmental physical and chemical conditions, including the levels of nutrients, toxic compounds and cell–cell communication signals. The intracellular concentrations of TFs in E. coli K-12 W3110 under various growth conditions and at various growth phases will also be entered into the TEC database (67,68).
  90 in total

Review 1.  The lactose operon-controlling elements: a complex paradigm.

Authors:  W S Reznikoff
Journal:  Mol Microbiol       Date:  1992-09       Impact factor: 3.501

Review 2.  Role of the RNA polymerase alpha subunit in transcription activation.

Authors:  A Ishihama
Journal:  Mol Microbiol       Date:  1992-11       Impact factor: 3.501

3.  In vitro selection of RNA molecules that bind specific ligands.

Authors:  A D Ellington; J W Szostak
Journal:  Nature       Date:  1990-08-30       Impact factor: 49.962

Review 4.  Protein-protein communication within the transcription apparatus.

Authors:  A Ishihama
Journal:  J Bacteriol       Date:  1993-05       Impact factor: 3.490

Review 5.  Transcriptional regulation by cAMP and its receptor protein.

Authors:  A Kolb; S Busby; H Buc; S Garges; S Adhya
Journal:  Annu Rev Biochem       Date:  1993       Impact factor: 23.643

Review 6.  The leucine-responsive regulatory protein, a global regulator of metabolism in Escherichia coli.

Authors:  J M Calvo; R G Matthews
Journal:  Microbiol Rev       Date:  1994-09

7.  Expanded roles of two-component response regulator OmpR in Escherichia coli: genomic SELEX search for novel regulation targets.

Authors:  Tomohiro Shimada; Hiraku Takada; Kaneyoshi Yamamoto; Akira Ishihama
Journal:  Genes Cells       Date:  2015-09-02       Impact factor: 1.891

8.  Characterization of the regulon controlled by the leucine-responsive regulatory protein in Escherichia coli.

Authors:  B R Ernsting; M R Atkinson; A J Ninfa; R G Matthews
Journal:  J Bacteriol       Date:  1992-02       Impact factor: 3.490

Review 9.  The role of integration host factor in gene expression in Escherichia coli.

Authors:  M Freundlich; N Ramani; E Mathew; A Sirko; P Tsui
Journal:  Mol Microbiol       Date:  1992-09       Impact factor: 3.501

10.  A novel regulator RcdA of the csgD gene encoding the master regulator of biofilm formation in Escherichia coli.

Authors:  Tomohiro Shimada; Yasunori Katayama; Shuichi Kawakita; Hiroshi Ogasawara; Masahiro Nakano; Kaneyoshi Yamamoto; Akira Ishihama
Journal:  Microbiologyopen       Date:  2012-10-08       Impact factor: 3.139
