Chengli Fang1,2, Lingting Li1,2, Liqiang Shen1,2, Jing Shi3, Sheng Wang4, Yu Feng3, Yu Zhang1. 1. Key Laboratory of Synthetic Biology, CAS Center for Excellence in Molecular Plant Sciences, Shanghai Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai 200032, China. 2. University of Chinese Academy of Sciences, Beijing 100049, China. 3. Department of Biochemistry and Molecular Biology, School of Medicine, Zhejiang University, Hangzhou 310058, China. 4. Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST) Thuwal, 23955, Saudi Arabia.
Abstract
Bacterial RNA polymerase (RNAP) forms distinct holoenzymes with extra-cytoplasmic function (ECF) σ factors to initiate specific gene expression programs. In this study, we report a cryo-EM structure at 4.0 Å of an Escherichia coli transcription initiation complex comprising σE, the most-studied bacterial ECF σ factor (Ec σE-RPo), and a crystal structure at 3.1 Å of a Mycobacterium tuberculosis transcription initiation complex with a chimeric σH/E (Mtb σH/E-RPo). The structure of Ec σE-RPo reveals key interactions essential for assembly of the E. coli σE-RNAP holoenzyme and for promoter recognition and unwinding by E. coli σE. Moreover, both structures show that the non-conserved linkers (σ2/σ4 linkers) of the two ECF σ factors are inserted into the active-center cleft and exit through the RNA-exit channel. We performed secondary-structure prediction of 27,670 ECF σ factors and found that their non-conserved linkers probably reach into and exit from the RNAP active-center cleft in a similar manner. Further biochemical results suggest that the σ2/σ4 linker plays an important role in RPo formation, abortive production and promoter escape during ECF σ factor-mediated transcription initiation.
Bacterial σ factors are key components of the bacterial RNAP holoenzyme. During transcription initiation, the σ factors associate with the RNAP core enzyme, guide the transcription machinery to promoter regions of genes, unwind double-stranded promoter DNA, and facilitate de novo RNA synthesis (1–4). Bacterial genomes encode one primary σ factor (or group-1 σ factor; σ70 in Escherichia coli), which maintains expression of the majority of genes, and a collection of alternative σ factors, which control subsets of genes in response to certain intracellular and environmental signals (5,6).
The group-1 σ factor is the most-studied and best-known σ factor. E. coli σ70 is composed of multiple domains: σ1.1, σ1.2, σNCR, σ2, σ3.1, σ3.2 and σ4. Domains σ1.2, σNCR, σ2, σ3.1 and σ4 reside on the surface of the RNAP core enzyme and are responsible for recognizing promoter DNA (7–14). Domain σ2 also initiates unwinding of double-stranded promoter DNA to form a transcription bubble (10,11,13,15). Domain σ3.2, a linker between σ3.1 and σ4, threads through the RNAP and makes extensive interactions in the active-center cleft. The σ3.2 linker serves as a mimic of RNA to pre-organize template single-stranded DNA (ssDNA) of the transcription bubble into a helical conformation (10), facilitating base-pairing of initiating NTPs to template ssDNA (16,17); however, the RNA-mimic σ3.2 linker would inevitably collide with the 5′-end of nascent RNA of length >4 nt, partially accounting for abortive production, transcription initiation pausing (18–20) and promoter escape (21,22).
The alternative σ factors contain three groups of σs belonging to the σ70 family (group-2, 3 and 4 σs) and one group of σs belonging to the σ54 family (1). The group-2 σ factors (σ38 in E. coli) contain all domains except domain σ1.1 and recognize promoters very similar to those of the group-1 σ factor (primary σ factor). The group-3 σ factors (σ32 or σ28 in E. coli) lack domains σ1.1 and σ1.2 and recognize promoters distinct from those of the group-1 σ factor. The group-4 σ factors (also known as extra-cytoplasmic function σ factors; ECF σ factors) retain only the conserved domains σ2 and σ4. ECF σ factors are the most abundant, compact and divergent σ factors (1,3). They are important for stress adaptation of most bacteria and are associated with virulence and drug resistance of pathogenic bacteria (6,23–26). ECF σ factors recognize promoters with stringent specificity and have been engineered into orthogonal transcriptional elements for constructing gene circuits (27–29).
Escherichia coli σE (σ24) is an essential ECF σ factor. It maintains cell envelope integrity both under stress conditions (heat-shock, acid or oxidative stresses) and during normal growth (30); it also participates in biofilm formation and drug resistance of pathogenic E. coli (31,32). The activation of σE is induced by misfolded proteins in the periplasm under cell envelope stress, which triggers a cascade of protease cleavage resulting in release of σE into the cytoplasm (33). The σE subsequently forms a holoenzyme with RNAP and directly upregulates expression of ∼100 protein-encoding genes that are involved in transport and assembly of outer membrane proteins and lipopolysaccharide to relieve stress. It also indirectly downregulates expression of outer membrane proteins by activating transcription of their small regulatory RNAs (MicA, RybB and MicL) to reduce protein load (34,35).
Like other bacterial ECF σ factors, E. coli σE contains two conserved domains (σ2 and σ4) and a non-conserved σ2/σ4 linker. Escherichia coli σE recognizes promoters with consensus sequences at the −35 and −10 elements of ‘GGAACTT’ and ‘GTC’, respectively (36–38). Crystal structures of E. coli σE2 complexed with the −10 element promoter DNA and of E. coli σE4 complexed with the −35 element promoter DNA reveal protein-DNA interactions for sequence-specific promoter recognition by E. coli σE (12,39). However, no structural information is available for an RNAP complex comprising E. coli σE except a model of the σE-RNAP holoenzyme predicting that σE2 and σE4 bind to RNAP as primary σ factors do (40). It is unknown how RNAP and E. coli σE are assembled into a functional σE-RNAP holoenzyme, how the σE-RNAP recognizes promoter DNA, and how σE-RNAP initiates transcription.
Recently, Li et al. and Lin et al. independently reported crystal structures of Mycobacterium tuberculosis transcription initiation complexes comprising ECF σ factors (σH or σL) (41,42). The structures together revealed interactions among ECF σ factors, RNAP core enzyme and promoter DNA, and surprisingly showed that the σ2/σ4 linkers of the two ECF σ factors interact with the RNAP core enzyme in a way analogous to the σ3.2 of the primary σ factor: the linker inserts into the active-center cleft and exits through the RNA-exit channel (43,44). As the σ2/σ4 linkers of ECF σ factors are highly divergent in length and sequence, it is intriguing to ask whether the σ2/σ4 linkers of other ECF σ factors interact with RNAP in a similar manner, how RNAP manages to accommodate the extremely variable structure modules using one binding site, and more importantly what role the linkers play during transcription initiation.
In this study, we determined a cryo-EM structure at 4.0 Å of an E. coli transcription initiation complex comprising E. coli σE, and a crystal structure of an M. tuberculosis transcription initiation complex comprising a chimeric M. tuberculosis σH/E factor. The structures reveal protein-protein interactions essential for RNAP holoenzyme assembly, and protein-DNA interactions critical for promoter recognition and unwinding by E. coli σE. More importantly, the structures show that the σ2/σ4 linkers of E. coli σE and M. tuberculosis σE insert into the active-center cleft of RNAP and interact with template single-stranded DNA as do the σ2/σ4 linkers of M. tuberculosis σH and σL, despite no sequence similarity of the linker regions. The structure prediction of 27,670 bacterial ECF σ factors shows that the σ2/σ4 linkers of ECF σ factors retain similar secondary structures at the end regions, indicating that the σ2/σ4 linkers, albeit highly divergent in sequence, probably follow the same path to enter and exit the active center of RNAP. We demonstrated that the σ2/σ4 linker is essential for ECF σ-initiated transcription, probably by facilitating several steps including RPo formation, synthesis of initial short RNA transcripts, and promoter escape.
MATERIALS AND METHODS
Plasmids
DNA fragments encoding E. coli σE, Bacillus subtilis σW, B. subtilis σM, M. tuberculosis σH, and M. tuberculosis σE were amplified from genomic DNA of E. coli, B. subtilis and M. tuberculosis respectively. The DNA fragments were subsequently cloned into pTolo-EX5 (ToloBio Inc.) or pET28a as described in Supplementary Table S1. The pEASY-prpoH, pEASY-prpoE, pEASY-psigM, pEASY-psigW, pEASY-psigB and pEASY-pClpB were constructed by inserting the promoter region (−50 to +50) of respective genes amplified from genomic DNA into the pEASY-blunt vector (Transgen biotech, China).
Proteins
The wild-type or derivatives of bacterial ECF σ factors were over-expressed in E. coli BL21(DE3) cells (NovoProtein) and purified from soluble fractions using Ni-NTA (SMART, Inc.) and Heparin columns (GE Healthcare). The Mtb σE2 and Bs σE2 were obtained from the inclusion body. The E. coli, M. tuberculosis, and B. subtilis RNAP core enzymes were over-expressed in E. coli BL21(DE3) and sequentially purified on a Ni-NTA affinity column, a Mono Q ion-exchange column, and a Superdex S200 size-exclusion column.
Crystallization and structure determination of Mtb σH–RPo and σH/E–RPo
The Mtb σH–RPo and σH/E–RPo complexes for crystallization were prepared by reconstitution. The Mtb RNAP core enzyme, Mtb σH (or σH/E), and nucleic-acid scaffolds (Figure 3A) were mixed at a 1:4:1.2 molar ratio and incubated at 4°C overnight. The RPo complexes were purified using a Hiload 16/60 Superdex S200 column (GE Healthcare, Inc.) and stored in 20 mM Tris–HCl pH 8.0, 0.1 M NaCl, 1% (v/v) glycerol, 1 mM 1,4-dithiothreitol (DTT) at a concentration of 7.5 mg/ml. Crystals of Mtb σH–RPo were obtained from 0.08 M magnesium acetate, 0.05 M sodium cacodylate pH 6.5, 15% PEG400; and crystals of Mtb σH/E–RPo were obtained from 0.2 M sodium acetate, 0.1 M sodium citrate pH 5.5, 10% PEG4000. The X-ray diffraction data were collected at Shanghai Synchrotron Radiation Facility (SSRF) beamlines 17U and 19U, and the structures were solved by molecular replacement with Phaser MR using the structure of the M. tuberculosis RNAP holoenzyme (PDB ID: 5ZX3).
Figure 3.
The crystal structures of Mycobacterium tuberculosis σH-RPo and σH/E-RPo. (A) The nucleic-acid scaffold used for the structure determination. (B) The schematic diagram of M. tuberculosis σH and interaction of the σ3.2-like linker of σH (σH3.2) with the RNAP active-center cleft. (C) The schematic diagram of M. tuberculosis σH/E and interaction of σE3.2 with the RNAP active-center cleft. (D) The σ3.2 regions of Ec σE, Ec σ70, Mtb σE, Mtb σL and Mtb σH follow a similar path to enter and exit the RNAP active-center cleft.
Cryo-EM structure determination of E. coli σE-RPo
The E. coli σE-RPo was obtained by reconstitution with E. coli RNAP core enzyme, E. coli σE, and a nucleic-acid scaffold as above (Figure 1A). The E. coli σE-RPo was concentrated to ∼15 mg/ml and stored in 10 mM Hepes pH 7.5, 50 mM KCl, 5 mM MgCl2, 3 mM DTT. The E. coli σE-RPo was mixed with CHAPSO (Hampton Research Inc.) to a final concentration of 8 mM prior to grid preparation. The complex (3 μl) was subsequently applied to glow-discharged C-flat CF-1.2/1.3 400 mesh holey carbon grids (Protochips, Inc.) and plunge-frozen in liquid ethane using a Vitrobot Mark IV (FEI). The grids were loaded into a 300 keV Titan Krios (FEI) equipped with a K2 Summit direct electron detector (Gatan) and a dataset was collected. The electron density map was obtained by single-particle reconstruction with RELION2.1. Gold-standard Fourier-shell-correlation analysis indicated a mean map resolution of 4.02 Å. The structure model was built in Coot and refined in Phenix.
Figure 1.
The cryo-EM structure of Escherichia coli σE-RPo. (A) The nucleic-acid scaffold used for structure determination. (B) Schematic diagram of E. coli σE. (C) The side and top views of cryo-EM electron density map (gray surface) at resolution 4.0 Å and the structure model of σE-RPo. (D) The side and top views of the overall structure of E. coli σE-RPo. (E) The electron density map and structural model of E. coli σE. (F) The electron density map and structural model of the nucleic-acid scaffold. RNAP-α subunit, light brown; RNAP-β subunit, gray; RNAP-β’ subunit, dark gray; RNAP-ω subunit, pink; σE2, light green; the σ3.2-like linker of σE (σE3.2), cyan; σE4, green. Template DNA, red; non-template DNA, orange; RNA, blue; the −35 element DNA, light blue; the −10 element, violet.
Stopped-flow assay
The promoter DNA for the stopped-flow assay was prepared as in Supplementary Figure S5A. To monitor the efficiency of RPo formation by E. coli RNAP holoenzymes comprising wild-type E. coli σE or its derivatives, 60 μl σE-RNAP holoenzyme (200 nM) and 60 μl Cy3-PrpoE (4 nM) in 10 mM Tris–HCl, pH 7.7, 20 mM NaCl, 10 mM MgCl2, 1 mM DTT were rapidly mixed, and the change of Cy3 fluorescence was monitored in real time by a stopped-flow instrument (SX20, Applied Photophysics Ltd, UK) equipped with an excitation filter (515/9.3 nm) and a long-pass emission filter (570 nm). The data were plotted in SigmaPlot (Systat software, Inc.) and the observed rates Kobs,1 and Kobs,2 were estimated as in the Supplementary Materials and Methods.
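As an illustration of the mixing arithmetic and of a two-rate kinetic model, the following Python sketch computes the concentrations after mixing equal volumes and evaluates a double-exponential rise. The double-exponential form and all parameter values are assumptions made for illustration only; the fitting equation actually used is given in the Supplementary Materials and Methods, and the function names are hypothetical.

```python
import math


def final_concentration(stock_nM, stock_volume_ul, total_volume_ul):
    """Concentration of a component after dilution into the mixed volume."""
    return stock_nM * stock_volume_ul / total_volume_ul


def two_phase_signal(t, f0, a_fast, k_fast, a_slow, k_slow):
    """Assumed model: baseline plus a fast and a slow exponential rise."""
    return (f0
            + a_fast * (1.0 - math.exp(-k_fast * t))
            + a_slow * (1.0 - math.exp(-k_slow * t)))


# Equal volumes (60 ul + 60 ul) halve each stock concentration.
total_ul = 60 + 60
holoenzyme_nM = final_concentration(200, 60, total_ul)
promoter_nM = final_concentration(4, 60, total_ul)

# Hypothetical amplitudes and rates, sampled every 0.1 s for 5 s.
times = [i * 0.1 for i in range(51)]
trace = [two_phase_signal(t, 1.0, 0.6, 5.0, 0.4, 0.5) for t in times]

print(holoenzyme_nM, promoter_nM)
```

With these inputs the mixed sample contains 100 nM holoenzyme and 2 nM labeled promoter DNA, so the holoenzyme is in 50-fold excess over the DNA.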
Fluorescence polarization (FP) competitive assay
The E. coli σE was labeled with fluorescein at residue C165. The affinity of E. coli RNAP core enzyme for wild-type E. coli σE was first determined as ∼53 nM by an FP assay. An FP competition assay was further employed to compare the affinities of wild-type E. coli σE and its derivatives for RNAP core enzyme. Label-free E. coli σE (0, 2.5, 5, 10, 20, 40, 80, 160, 320, 640, 1280, 2560 or 5120 nM; final concentration) pre-mixed with fluorescein-labeled E. coli σE (5 nM; final concentration) was incubated with E. coli RNAP core enzyme (100 nM; final concentration) in FP buffer at room temperature for 20 min. The FP signals were measured using a plate reader (SPARK, TECAN Inc.) equipped with an excitation filter of 495/10 nm and an emission filter of 520/20 nm. The data were plotted in SigmaPlot (Systat software, Inc.) and the IC50 values were estimated as in the Supplementary Materials and Methods.
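The competitor concentrations listed above form a two-fold dilution series plus a zero point. The sketch below regenerates that series and evaluates a four-parameter logistic curve, a commonly used form for competition data. The logistic model and the example parameter values are assumptions for illustration; the equation actually used to estimate IC50 is described in the Supplementary Materials and Methods.

```python
def twofold_series(start_nM, n_points):
    """Concentrations that double at each step, starting from start_nM."""
    return [start_nM * 2 ** i for i in range(n_points)]


def four_param_logistic(conc, top, bottom, ic50, hill=1.0):
    """Assumed competition model: signal falls from top to bottom."""
    if conc == 0:
        return top
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


# Zero-competitor control followed by 12 doubling steps from 2.5 nM.
competitor_nM = [0] + twofold_series(2.5, 12)

# Hypothetical curve parameters (arbitrary polarization units).
signal = [four_param_logistic(c, top=200.0, bottom=100.0, ic50=50.0)
          for c in competitor_nM]

print(competitor_nM)
```

At a competitor concentration equal to the IC50, this model returns the midpoint between the top and bottom plateaus, which is the property the IC50 estimate relies on.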
RESULTS
The cryo-EM structure of E. coli σE-RPo
To obtain a structure of E. coli σE-RPo, we reconstituted the E. coli σE-RPo complex with E. coli RNAP core enzyme, E. coli σE and a nucleic-acid scaffold (Figure 1A and B; Supplementary Figure S1A). The nucleic-acid scaffold (−34 to +14; with respect to +1 as the transcription start site) is composed of a 24-bp upstream double-stranded DNA (dsDNA) with consensus sequences of the −35 element, a 12-bp transcription bubble (maintained open by non-complementary sequences on the nontemplate- and template-strand DNA), a 12-bp downstream dsDNA, and a 5-mer RNA.
We obtained a cryo-EM map at 4.0 Å for the E. coli σE-RPo complex with a local resolution of ∼3 Å at the active-center cleft of RNAP (Figure 1C; Supplementary Figure S1B–E and Table S3). The map shows clear density for residues of σE2 (residues 5–87) and σE4 (residues 131–190) and all residues of the σE2/σE4 linker (residues 88–130) (Figure 1E and Supplementary Figure S2B). The map also shows clear density for the upstream dsDNA, template and nontemplate ssDNA of the transcription bubble, the RNA/DNA hybrid and the downstream dsDNA (Figure 1F and Supplementary Figure S2D). The RNAP clamp in the structure of σE-RPo adopts a closed conformation as in other bacterial RPo complexes (Supplementary Figure S3A) (10,45–48). The template ssDNA and nontemplate ssDNA follow the same path as in the structure of M. tuberculosis σH-RPo (Supplementary Figure S3B) (41).
In the structure of σE-RPo, the domains σE2 and σE4 are located on the surface of RNAP (Figure 1D). σE2 attaches to the clamp helices of the RNAP-β’ subunit via a polar surface as the σ2 of other σ70-family factors does (Figure 1D and Supplementary Figure S3C) (48,49). The residues on the interface are conserved (Supplementary Figure S4A). Intriguingly, σE4 uses a distinct hydrophobic surface to bind the tip helix of the RNAP-β flap domain (βFTH). 
The interface residues include V131, F132, I135, L151, I181, V185 and I189 of σE4 and E898, L901, L902, I905 and F906 of the βFTH. Such interaction induces a 90° rotation of the βFTH, where the βFTH is further stabilized by the extended hydrophobic surface created by residues I121, L123 and L127 of the σE2/σE4 linker (Figure 2A and B).
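The scaffold dimensions quoted for the Ec σE-RPo complex (positions −34 to +14, built from 24-bp, 12-bp and 12-bp segments) can be cross-checked with a few lines of Python. The sketch assumes the usual promoter numbering without a position 0 and lays the three segments end to end from the upstream side; the resulting segment boundaries are an inference for illustration, and Figure 1A remains the authoritative description.

```python
def promoter_positions(first, last):
    """Positions relative to the start site (+1); there is no position 0."""
    return [p for p in range(first, last + 1) if p != 0]


positions = promoter_positions(-34, 14)

# Segment lengths in base pairs, ordered from upstream to downstream.
segments = {
    "upstream dsDNA": 24,
    "transcription bubble": 12,
    "downstream dsDNA": 12,
}

# Assumed layout: consecutive segments with no gaps or overlaps.
layout = {}
start = 0
for name, length in segments.items():
    chunk = positions[start:start + length]
    layout[name] = (chunk[0], chunk[-1])
    start += length

print(len(positions), layout)
```

Under these assumptions the 48 positions are fully accounted for, and the bubble begins at position −10, which is consistent with promoter unwinding at the −11/−10 junction.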
Figure 2.
The interactions among RNAP core enzyme, σE and nucleic-acid scaffold. (A) The βFTH interacts with a large hydrophobic surface created by σE4 and the σ3.2-like linker of σE. (B) The structural comparison of interactions between βFTH and domain σ4 of Escherichia coli σE-RPo (green and dark gray), E. coli σA-RNAP (PDB ID: 6CA0; yellow) and Mycobacterium tuberculosis σH-RPo (PDB ID: 5ZX2; gray). (C) The E. coli σE–RPo (green) is superimposable on the crystal structure of E. coli σE4/-35 dsDNA (gray). (D) The E. coli σE-RPo (yellow) is superimposable on the crystal structure of E. coli σE2/-10 ssDNA (gray). (E) The detailed interaction between the E. coli σE2 and the −10 element promoter DNA. (F) The stopped-flow experiments measuring the kinetics of promoter unwinding by WT or derivatives of E. coli σE-RNAP. The data points were recorded every 0.1 s and the data were fitted as described in the ‘Materials and Methods’ section. The experiments were repeated three times and representative curves are shown. (G) The σ3.2-like linker of σE (σE3.2) inserts into the active-center cleft. (H) The detailed interactions between σE3.2 and template ssDNA of the transcription bubble.
The promoter recognition and unwinding by E. coli σE
The structure of E. coli σE-RPo is superimposable on the binary structure of E. coli σE4/−35 element promoter dsDNA (Figure 2C), supporting the previous conclusion that σ4 of bacterial ECF σ factors reads the sequence and shape of the −35 dsDNA (12). The structure of E. coli σE-RPo is also superimposable on the binary structure of E. coli σE2/−10 element promoter ssDNA (Figure 2D). In particular, the T-10 and C-9 of the non-template strand are inserted into two protein pockets (Figure 2E) in exactly the same manner as in the structure of E. coli σE2/−10 ssDNA. The DNA–protein interactions are sequence specific, as swapping the ‘specificity loop’ of E. coli σE altered the specificity of the element (39).
The structure suggests that N80 might serve as a wedge to separate the base pair at position −10. To explore the contribution of this residue to promoter unwinding, we modified a stopped-flow assay to monitor RPo formation by E. coli σE-RNAP, in which the fluorescence of a Cy3 fluorophore at the +1 position on the non-template strand DNA increases upon RPo formation (Figure 2F and Supplementary Figure S5A). Similar assays have been used to measure the kinetics of RPo formation by the primary σ factor (50–52). As shown in Figure 2F, the fluorescence rapidly increases and reaches a plateau in 5 s after mixing the σE-RNAP with promoter DNA, while RNAP core enzyme induces no change of fluorescence, validating the assay. RPo equilibration is two times slower with the σE(N80A)-RNAP holoenzyme than with wild-type σE-RNAP, suggesting a role of N80 during RPo formation, probably by facilitating promoter unwinding (Figure 2F). Interestingly, mutations of the protein pockets on σE for T-10 and C-9 (F64A or W73A) also slowed RPo equilibration (Figure 2F), indicating that RPo equilibration could be accelerated by securing the unwound nucleotides. 
It is worth noting that all curves could be fitted well with typical two-phase kinetics (a fast phase and a slow phase), suggesting the existence of a significant intermediate (RPi) on the path toward RPo (Figure 2F and Supplementary Figure S5C–E). Alanine substitutions of N80, F64 or W73 slow down the kinetics of both phases (Supplementary Table S5).
The above evidence supports the conclusion of a previous study that E. coli σE unwinds the promoter at the −11/−10 junction (39), similar to M. tuberculosis σH (41) but different from E. coli σ70 (13), which unwinds promoter DNA at a position 1 bp downstream of that used by the ECF σ factors (Supplementary Figure S3E–H) (9,41,42). Structure superimposition (M. tuberculosis σH-RPo, E. coli σ70-RPo, and E. coli σE-RPo) reveals that the melting residues of the primary σ factor (W433 and W434 for E. coli σ70) and of ECF σ factors (N80 for E. coli σE) are located at slightly different positions on the protein surface (Supplementary Figure S3E–H). Tryptophan substitution of the residues of E. coli σE located at the positions corresponding to the W-dyad of σA (R76W, I77W or R76W/I77W) resulted in a substantial decrease of promoter unwinding efficiency, confirming that σE opens the promoter through a different mechanism than the primary σ factor (Figure 2F and Supplementary Table S5).
The σE2/σE4 linker interacts with the active-center cleft of RNAP
We discovered that the σE2/σE4 linker dives into the active-center cleft of RNAP and emerges through the RNA-exit channel (Figure 2G). The path of the σE2/σE4 linker inside RNAP is remarkably similar to that of the σ3.2 of the group-1 σ factor and also to those of the linker regions of two other ECF σ factors (M. tuberculosis σH and σL) (41,42); we therefore designated the σE2/σE4 linker as the σE3.2-like linker (Figure 2G and Supplementary Figure S6). The σE3.2-like linker region can be further divided into three sub-regions: the head (residues 88–98), middle (residues 99–118) and tail (residues 119–130) (Figure 2G). The head sub-region extends the helix of σE2 and enters the active-center cleft through the T-ssDNA channel created by the RNAP-β’ lid and rudder motifs. The middle sub-region passes underneath the lid domain and makes a turn toward the RNA-exit channel; it resides in the RNAP active-center cleft and contacts the T-6 nucleotide (Figure 2H). The tail sub-region forms a continuous helix with the first helix of σE4 and exits the RNAP active-center cleft through the RNA-exit channel (Figure 2G).
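The residue ranges given for the head, middle and tail sub-regions can be checked for consistency with the overall linker range (residues 88–130). The short Python sketch below does only that bookkeeping; the helper names are hypothetical and nothing here is derived from the structure itself.

```python
LINKER = (88, 130)
SUBREGIONS = {
    "head": (88, 98),
    "middle": (99, 118),
    "tail": (119, 130),
}


def span_length(span):
    """Number of residues in an inclusive (first, last) range."""
    first, last = span
    return last - first + 1


def tiles_exactly(whole, parts):
    """True if the ordered parts cover whole with no gaps or overlaps."""
    expected = whole[0]
    for first, last in parts:
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected == whole[1] + 1


lengths = {name: span_length(span) for name, span in SUBREGIONS.items()}
print(lengths, tiles_exactly(LINKER, SUBREGIONS.values()))
```

The three sub-regions contain 11, 20 and 12 residues and together account for all 43 residues of the linker.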
The σ3.2-like linker of M. tuberculosis σE also inserts into the active-center cleft of RNAP
Considering that there is no similarity in the primary sequences of the σ2/σ4 linker regions of bacterial ECF σ factors (1), we asked whether the linker regions of other bacterial ECF σ factors follow the same path in RNAP. Our initial attempts to obtain additional structures of RNAP complexed with ECF σ factors failed. Inspired by the finding that chimeric σ factors with swapped linker regions function normally, and by the idea of determining crystal structures of transcription initiation complexes containing chimeric σ factors (41,42), we sought to obtain crystal structures of Mtb RPo complexes with chimeric σH factors. We first took advantage of the high crystallizability of M. tuberculosis σH–RPo and obtained a novel robust crystal form of M. tuberculosis σH–RPo (Mtb σH-RPo-Fork) at 2.9 Å (Supplementary Table S2) (41). The Mtb σH-RPo was reconstituted in vitro with M. tuberculosis RNAP σH-holoenzyme, a downstream fork DNA scaffold and a 5-mer RNA (Figure 3A). The Mtb σH in the structure of Mtb σH-RPo-Fork makes the same interactions with RNAP core enzyme and with promoter DNA as it does in the previously reported structure of Mtb σH-RPo with a full transcription bubble promoter DNA (41).
Using the same fork scaffold, we determined a crystal structure at 3.1 Å of Mtb σH/E-RPo comprising a chimeric σH/E in which the σ2/σ4 linker of σH is replaced by that of Mtb σE (Figure 3C). 
In the structure of Mtb σH/E-RPo, the σ2/σ4 linker region of Mtb σE follows a similar path through the RNAP active-center cleft and interacts with the template ssDNA as do those of other bacterial ECF σ factors, providing further evidence for the conserved interaction mode of the linker region with RNAP (Figure 3D).
The head and tail of σ3.2-like linkers retain conserved secondary structures
All four available structures comprising bacterial ECF σ factors (Ec σE-RPo and Mtb σH/E-RPo in this study and Mtb σL-RPo and Mtb σH-RPo in previous reports) show that the σ3.2-like regions of ECF σ factors interact with RNAP in a similar manner (Supplementary Figure S6A–D), although the four σ factors share little primary-sequence similarity in these regions. Multiple sequence alignments (MSAs) of the 27,670 ECF σ factors reveal a clear boundary of their σ3.2-like region (residues 88–130 for E. coli σE) and confirm that the linker is the least conserved region of ECF σ factors (Figure 4A and Supplementary Figure S4; Supplementary Files 1 and 2). However, structural comparison of the σ3.2-like linkers of the four available RPo structures comprising ECF σ factors shows similar secondary structures for the head and tail sub-regions. Namely, the head sub-regions contain a short helix followed by a short β strand or a coil, while the tail sub-regions are mainly composed of a helix (Figure 4B and C).
Figure 4.
The sequence alignment and secondary-structure prediction of 27,670 bacterial ECF σ factors. (A) The sketched diagram of the MSA of 27,670 bacterial ECF σ factors. Upper panel, the schematic diagram of Escherichia coli σE; middle panel, the conservation score from the MSA for each position of E. coli σE; bottom panel, the occupancy score from MSA for each position of E. coli σE. The conservation and occupancy scores were calculated by Jalview. The occupancy scores show the ratio of ungapped positions in each column of the alignment. (B) The structures of the head, middle and tail sub-regions of Ec σE3.2, Mtb σE3.2, Mtb σH3.2 (PDB ID: 5ZX3) and Mtb σL3.2 (PDB ID: 6DV9), and Ec σ703.2(PDB ID: 6CA0). (C) The primary protein sequences and secondary structures of E. coli σE3.2 and Mtb σE3.2. (D) The probability score of secondary structures of σ3.2-like linkers of bacterial ECF σ factors at corresponding positions of σE3.2.
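The occupancy score in panel (A) is defined as the ratio of ungapped positions in each column of the alignment. A minimal Python sketch of that definition follows; it uses a hypothetical toy alignment rather than the actual MSA and does not attempt to reproduce Jalview's own implementation.

```python
def occupancy(alignment, gap="-"):
    """Fraction of sequences with a residue (not a gap) in each column."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(seq) != width for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    n = len(alignment)
    return [sum(1 for seq in alignment if seq[col] != gap) / n
            for col in range(width)]


# Hypothetical toy alignment of four sequences and five columns.
toy_alignment = ["MK-LV", "MR-LV", "MKALV", "M--L-"]
scores = occupancy(toy_alignment)
print(scores)
```

Columns dominated by gaps, such as the third column of the toy alignment, receive low occupancy, which is the behavior expected for poorly aligned linker positions.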
To explore whether other bacterial ECF σ factors also retain similar secondary-structure folds for the σ3.2-like linker regions, we performed secondary-structure prediction of the σ3.2-like linker regions of the 27,670 bacterial ECF σ factors using RaptorX-Property and calculated the probability score of secondary structures for each position (53,54). The predictions agree very well with the secondary-structure pattern of the four available structures (Supplementary Figure S7); 85% of residues adopt exactly the same secondary structures as predicted, validating the predictions. More importantly, the predictions show a strikingly conserved pattern of secondary structures for the head and tail sub-regions of σ3.2-like linkers. 
Namely, ∼80% of ECF σ factors are predicted to contain a short helix followed by a coil in the head sub-region and a short helix in the tail sub-region of σ3.2-like linkers (Figure 4D and Supplementary File 3).
The conserved helical structures of the head and tail sub-regions suggest that the σ3.2-like linkers of most bacterial ECF σ factors probably bind to the RNAP active-center cleft. The head sub-region of σ3.2-like linkers extends the last helix of domain σ2 and guides the rest of the σ3.2-like linker into the RNAP active-center cleft through the T-ssDNA channel, as do those of M. tuberculosis σH, σE, σL and E. coli σE (Figures 2G, 3B and C), while the short coil of the head sub-region passes through the T-ssDNA channel, probably by forming a β-sheet with the RNAP β’-lid motif, as do those of M. tuberculosis σH, σL and σE (Figure 4B). Similarly, the first helix of σ4 together with the tail helix of σ3.2-like linkers reaches into the RNA-exit channel, where it connects with the middle sub-region of σ3.2-like linkers, as do those of M. tuberculosis σH, σL, σE and E. coli σE (Figures 2G, 3B-C and 4B).
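A per-position summary of predicted secondary structure across many linkers can be sketched as follows. This is a simplified illustration, not the published pipeline: it assumes each linker has already been mapped onto common reference positions and reduced to one three-state call per position (H, helix; E, strand; C, coil), whereas the probability scores in the study come from RaptorX-Property output. The toy strings are hypothetical.

```python
from collections import Counter

STATES = "HEC"  # helix, strand, coil


def state_fractions(predictions):
    """Fraction of sequences assigned to each state at every position."""
    if not predictions:
        raise ValueError("no predictions given")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("predictions must cover the same positions")
    n = len(predictions)
    profile = []
    for col in range(length):
        counts = Counter(p[col] for p in predictions)
        profile.append({state: counts[state] / n for state in STATES})
    return profile


# Hypothetical three-state calls for five linkers at five positions.
toy_predictions = ["HHHCC", "HHHCC", "HHCCC", "HHHEC", "CHHCC"]
profile = state_fractions(toy_predictions)

for position, fractions in enumerate(profile, start=1):
    print(position, fractions)
```

A position where most sequences share the same state, such as a helix fraction of 0.8, corresponds to the kind of conserved pattern described for the head and tail sub-regions.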
The σ3.2-like linker plays a pivotal role during transcription initiation
The evidence above suggests that the σ3.2-like linker of most bacterial ECF σ factors probably follows a similar path into and out of the active-center cleft of RNAP, implying that the σ3.2-like linker is an indispensable domain and probably has an essential function. We therefore explored its functional importance. We prepared wild-type and derivatives of well-studied bacterial ECF σ factors (including E. coli σE, B. subtilis σW, B. subtilis σM, M. tuberculosis σE and M. tuberculosis σH) and performed in vitro transcription experiments. Figure 5A shows that deleting the σ3.2-like linker, or replacing it with a disordered sequence, completely abolished the transcription activity of all tested bacterial ECF σ factors. The results suggest that the σ3.2-like linker region is indeed essential for the transcription activity of bacterial ECF σ factors.
Figure 5.
The σ3.2-like linker plays essential roles in multiple steps of transcription initiation. (A) The in vitro transcription assay showing the transcription activity of WT and derivatives of various bacterial ECF σ factors. RO represents the run-off transcripts. (B) The FP competitive assay showing binding affinities of WT and derivatives of Escherichia coli σE to bacterial RNAP core enzyme. The experiments were repeated in triplicate, and the data are presented as mean ± S.E.M. (C) The stopped-flow experiments measuring the kinetics of promoter unwinding by WT or derivatives of E. coli σE. The data points were recorded every 0.1 s and the data were fitted as described in the ‘Materials and Methods’ section. The Ec σE3.2 head region (residues 88–98) was replaced by ‘GGSSGSGGSSS’, resulting in Ec σE (head); the Ec σE3.2 tail region (residues 119–130) was replaced by ‘GGSSGSGGGSSS’, resulting in Ec σE (tail); the Ec σE3.2 head region (residues 88–98) and tail region (residues 119–130) were replaced by ‘GGSSGSGGSSS’ and ‘GGSSGSGGGSSS’, respectively, resulting in Ec σE (head/tail). (D) The in vitro transcription assay with WT or derivatives of E. coli σE. ‘Abortive’ represents abortive transcripts and ‘T’ represents terminated transcripts of 82 nt. The in vitro transcription and stopped-flow experiments were repeated three times and representative data are shown.
To further dissect the steps of transcription initiation in which the σ3.2-like linker might be involved, we studied the assembly of RNAP holoenzyme, the formation of RPo, and the synthesis of abortive and productive transcripts using wild-type or derivatives of E. coli σE. We developed a competitive FP assay (in which unlabeled wild-type σE or its derivatives compete with [C165-FAM]σE for binding to RNAP core enzyme) to compare the binding affinities of various σ factors. E. coli σA exhibited the strongest inhibition, with an IC50 ∼5-fold lower than that of E. coli σE, which is consistent with the previous finding that σA has higher affinity for RNAP core enzyme than ECF σ factors (red in Figure 5B and Supplementary Table S4) (55). Deletion of the σ3.2-like linker of σE substantially decreased the binding affinity, with an IC50 ∼20-fold higher than that of wild-type E. coli σE (Figure 5B and Supplementary Table S4). However, replacing the head or the tail sub-region of the σ3.2-like linker of σE with random sequences did not significantly change the affinity of E. coli σE, while replacing the entire linker with a disordered acidic loop slightly increased the binding affinity. The results suggest that the presence of a physical linker between σ2 and σ4, regardless of its protein sequence, is necessary for maintaining the high affinity of E. coli σE for RNAP core enzyme (the linker physically ties σE2 and σE4 together and thus greatly increases the affinity of the two domains for RNAP), but that the interactions of the linker with RNAP play little role in the assembly of RNAP holoenzyme. The results are also consistent with the fact that bacterial ECF σ factors show the highest conservation scores for RNAP-contacting residues on σ2 and σ4, but show no conservation of any residues in σ3.2-like linkers (Figure 4A and Supplementary Figure S4). The results also explain that the identities of the −10 element are exclusively recognized at the non-template strand of promoter DNA (10,41,42).
The chimeric E. coli σE factors serve as good materials for subsequent experiments, as they showed affinities for RNAP core enzyme similar to that of wild-type σE. Therefore, any effects can be attributed to the altered conformation of the σ3.2-like linker or to altered interactions between the linker and RNAP. We next studied the potential effect of the chimeric E. coli σE factors on RPo formation using a stopped-flow fluorescence assay as described above. All the chimeric E. coli σE factors showed slowed RPo equilibration (Figure 5C and Supplementary Table S6), suggesting a role for the σ3.2-like linker during RPo formation.
To explore the potential role of the σ3.2-like linker of σE in the steps following RPo formation, we performed in vitro transcription assays. As shown in Figure 5D, RNAP holoenzymes comprising chimeric E. coli σE factors produced substantially smaller amounts of abortive as well as full-length products. Intriguingly, RNAP holoenzyme with σE (DL) (the whole linker replaced by a disordered loop), σE (Head/Tail) (the head and tail regions of the σ3.2-like linker replaced by disordered loops), or σE (R2/R4) (disconnected σE2 and σE4; the σ3.2-like linker completely truncated) still produced abortive transcripts, albeit less efficiently, but produced no full-length products (Lanes IV, V and VI in Figure 5D), suggesting that the σ3.2-like linker probably also affects a later step of transcription initiation (i.e. promoter escape).
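The IC50 comparisons from the competitive FP assay can be made concrete with a short Python sketch. The fitting procedure used for Figure 5B is not described in this section, so the four-parameter logistic form, the log-linear interpolation and all function and parameter names below are assumptions for illustration, not the authors' analysis.

```python
import math


def competition_signal(conc, ic50, top, bottom, hill=1.0):
    """Assumed four-parameter logistic: FP signal at competitor concentration `conc`."""
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def interpolate_ic50(concs, signals, top, bottom):
    """Estimate IC50 by log-linear interpolation at the half-maximal signal.

    Assumes positive concentrations and a signal that decreases as the
    unlabeled competitor displaces the labeled probe.
    """
    half = (top + bottom) / 2.0
    points = sorted(zip(concs, signals))
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        # Find the pair of neighbouring points that brackets the half-maximal signal.
        if s_lo >= half >= s_hi and s_lo != s_hi:
            frac = (s_lo - half) / (s_lo - s_hi)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    raise ValueError("signal does not cross the half-maximal level")


def fold_change(ic50_variant, ic50_reference):
    """Ratio of two IC50 values; values above 1 indicate weaker competition."""
    if ic50_reference <= 0:
        raise ValueError("reference IC50 must be positive")
    return ic50_variant / ic50_reference
```

With such a helper, the ∼20-fold higher IC50 reported for the linker deletion corresponds to a fold change of about 20 relative to wild-type σE, and the ∼5-fold lower IC50 of σA to a fold change of about 0.2.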
DISCUSSION
In this work, we have solved a cryo-EM structure of E. coli σE-RPo at 4.0 Å, a crystal structure of M. tuberculosis σH–RPo at 2.9 Å, and a crystal structure of M. tuberculosis σH/E-RPo at 3.1 Å. We included a 5-nt RNA primer (complementary to nucleotides of template ssDNA at positions −4 to +1) to stabilize the complexes, a strategy that has been used previously for determination of bacterial RPo complexes (10,13,56). The conformation of the 5-bp hybrid in our structures is indistinguishable from that of the bona fide bacterial transcription initiation complexes with 5-nt RNA (16,48), although it is not an on-pathway state of transcription initiation.
The structure of E. coli σE-RPo reveals protein-protein interactions essential for σE-RNAP holoenzyme assembly, and protein-DNA interactions essential for promoter recognition and unwinding. More importantly, the four structures of transcription initiation complexes comprising ECF σ factors and the secondary-structure prediction of 27,670 available ECF σ factors show that the σ3.2-like linkers of most bacterial ECF σ factors retain a conserved pattern of secondary structures in the head and tail sub-regions, and strongly suggest that the σ3.2-like linkers follow the same path into and out of the active-center cleft of RNAP.
Our study explains how bacterial RNAP manages to accommodate such divergent σ3.2-like linkers and why the primary sequences of σ3.2-like linkers became so divergent during evolution. The head sub-region of σ3.2-like linkers comprises a short helix followed by a coil. The short helix extends the last helix of σ2 and helps guide the σ3.2-like linker into the channel through which it enters the active-center cleft of RNAP. The short coil forms a β-sheet with the lid domain of the RNAP β’ subunit in three of four available structures of ECF σ-RPo (Figure 4B). Such an interaction model explains the poor conservation of primary sequence in this region, as a β-sheet is typically stabilized through main-chain interactions. The tail sub-region of σ3.2-like linkers in the RNA exit channel forms a long intact helix (occasionally with a kink) with residues of σ4 (Figure 4B). It seems that the channels for entry and exit of σ3.2-like linkers of ECF σ factors put some evolutionary pressure on the head and tail sub-regions, and consequently certain secondary-structure patterns in the two sub-regions are retained. The middle sub-region of σ3.2-like linkers is located mainly in the active-center cleft, a wide channel for accommodating the DNA/RNA hybrid that places much less restraint on indels in this sub-region during evolution; this sub-region therefore exhibits varied lengths in primary sequence and diverse secondary structures.
In the case of primary σ factors, σ3.2 plays an essential role during transcription initiation (10,16,17,21,44,57). It inserts into the active-center cleft of RNAP, where it mimics an RNA molecule, pre-organizes the template ssDNA into a helical conformation, and increases the binding affinity of initiating NTPs. Having shown that the σ3.2-like linkers of bacterial ECF σ factors bind to the active-center cleft of RNAP and to the template ssDNA in the transcription bubble in a manner similar to the σ3.2 of primary σ factors (Figures 2G and 3B-C), we demonstrated that the σ3.2-like linker of bacterial ECF σ factors is also crucial to transcription initiation, as is the σ3.2 of primary σ factors. Deletion of the σ3.2-like linker of bacterial ECF σ factors completely abolished production of full-length transcripts (Figure 5A). 
We further showed that multiple steps of transcription initiation require proper engagement of the σ3.2-like linker in the active-center cleft of RNAP, as disrupting such interactions impaired RPo formation, synthesis of abortive transcripts and promoter escape (Figure 5).
Transcription machineries from all three domains of life retain essential structural modules similar to domain σ3.2 of the σ70 family (42): domain RII.3 of the σ54 family in bacteria (58,59), the B-reader of TFB for archaeal RNAP (60), the reader of Rrn7 for eukaryotic Pol I (61,62), the B-finger (B-reader) of TFIIB for eukaryotic Pol II (63,64) and the linker of Brf1 for eukaryotic Pol III (Supplementary Figure S6) (65,66). Apparently, distinct multiple-subunit DNA-dependent RNAPs have evolved non-homologous but functionally equivalent structural modules for efficient transcription initiation, implying a unified mechanism of transcription initiation for multiple-subunit DNA-dependent RNAPs.
DATA AVAILABILITY
Atomic coordinates and structure factors of the Ec σE–RPo, Mtb σH–RPo and Mtb σH/E–RPo complexes have been deposited in the Protein Data Bank with accession codes 6JBQ, 6JCX and 6JCY, respectively (https://www.wwpdb.org/).