Kristi Moncja, Michael W Van Dyke.
Abstract
Transcription factors (TFs) have been extensively researched in certain well-studied organisms, but far less so in others. Following the whole-genome sequencing of a new organism, TFs are typically identified through their homology with related proteins in other organisms. However, recent findings demonstrate that structurally similar TFs from distantly related bacteria are not usually evolutionary orthologs. Here we explore TTHB099, a cAMP receptor protein (CRP)-family TF from the extremophile Thermus thermophilus HB8. Using the in vitro iterative selection method Restriction Endonuclease Protection, Selection and Amplification (REPSA), we identified the preferred DNA-binding motif for TTHB099, 5'-TGT(A/g)NBSYRSVN(T/c)ACA-3', and mapped potential binding sites and regulated genes within the T. thermophilus HB8 genome. Comparisons with expression profile data in TTHB099-deficient and wild type strains suggested that, unlike E. coli CRP (CRPEc), TTHB099 does not have a simple regulatory mechanism. However, we hypothesize that TTHB099 can be a dual-regulator similar to CRPEc.
Keywords: bioinformatics; biolayer interferometry (BLI); electrophoretic mobility shift assay (EMSA); extremophile; protein-DNA binding; type IIS restriction endonuclease
Year: 2020 PMID: 33114549 PMCID: PMC7662524 DOI: 10.3390/ijms21217929
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Selection of TTHB099-binding DNA sequences. Shown are IR fluorescence images of restriction endonuclease cleavage-protection assays made during Rounds 1–7 of REPSA selection with 50.6 nM TTHB099 protein. The presence (+) or absence (−) of TTHB099 and IISRE FokI (F) or BpmI (B) are indicated above each lane. The electrophoretic mobility of the intact (T) and cleaved (X) ST2R24 selection template, primer dimer species (D), as well as the IRD7_ST2R primer (P) are indicated at the right of the figure.
Figure 2. Validation of TTHB099–binding DNA sequences. (A) Shown are IR fluorescence images of restriction endonuclease protection assays made with DNA from Round 4 and 7 of REPSA selection. The presence (+) or absence (−) of TTHB099 and IISRE FokI (F) or BpmI (B) are indicated above each lane. The electrophoretic mobility of the intact (T) and cleaved (X) IRD8-labeled REPSAis control DNA (green, Tc and Xc), IRD7-labeled ST2R24 selection template (red, Ts and Xs), primer-dimers (D), as well as the IRD7_ST2R primer (P) are indicated at the right of the figure and color-coded to match the fluorescently labeled DNA present. (B) Shown are IR fluorescence images of electrophoretic mobility shift assays made with DNA mixtures obtained from Round 1 (left lanes) and Round 7 (right lanes) of REPSA selection incubated with increasing concentrations of TTHB099 protein (from left to right: 0, 5.06, 50.6, 506, and 5060 nM TTHB099). The electrophoretic mobility of a single protein-DNA complex (S) as well as the uncomplexed ST2R24 selection template (T) and IRD7_ST2R primer (P) are indicated at the right of the figure.
Figure 3. TTHB099-binding motifs. Sequence logos were determined using MEME software with an input of 1000 Round 7 DNA sequences. (A) MEME performed with no filters. (B) MEME performed using a palindromic filter.
Figure 4. EMSA analysis of TTHB099 binding to its palindromic consensus sequence. Shown is an IR fluorescence image of IRD7-labeled ST2_099 incubated with 0, 0.66, 1.32, 2.64, 5.27, 10.5, 21.1, or 42.2 nM TTHB099 protein. (S) Protein-DNA complex, and (T) uncomplexed DNA.
Figure 5. Biolayer interferometry analysis of TTHB099 binding to DNA. Shown are raw traces (dots) and best-fit lines of TTHB099 binding to (A) ST2_099 consensus DNA and (B) ST2_REPSAis control DNA. TTHB099 concentrations investigated include 450 nM (red), 150 nM (green), 50 nM (blue), and 17 nM (magenta).
Table 1. TTHB099-DNA binding parameters for consensus and mutant sequences.
| Name | Sequence | kon (M−1 s−1) | koff (s−1) | KD (M) | R2 |
|---|---|---|---|---|---|
| wt | TGTATTCTAGAATACA | 131,308 | 2.907 × 10−4 | 2.214 × 10−9 | 0.9883 |
| m1 | gGTATTCTAGAATACA | 120,059 | 7.558 × 10−4 | 6.295 × 10−9 | 0.9895 |
| m2 | TtTATTCTAGAATACA | 112,773 | 3.785 × 10−3 | 3.356 × 10−8 | 0.9778 |
| m3 | TGaATTCTAGAATACA | 88,146 | 1.221 × 10−3 | 1.385 × 10−8 | 0.9824 |
| m4 | TGTcTTCTAGAATACA | 142,953 | 1.366 × 10−3 | 9.557 × 10−9 | 0.9817 |
| m5 | TGTAcTCTAGAATACA | 110,766 | 5.379 × 10−4 | 4.856 × 10−9 | 0.9879 |
| m6 | TGTATaCTAGAATACA | 125,945 | 7.064 × 10−4 | 5.608 × 10−9 | 0.9794 |
| m7 | TGTATTtTAGAATACA | 119,827 | 6.978 × 10−4 | 5.823 × 10−9 | 0.9805 |
| m8 | TGTATTCaAGAATACA | 115,299 | 7.848 × 10−4 | 6.807 × 10−9 | 0.9840 |
| wt + cAMP | TGTATTCTAGAATACA | 214,759 | 4.780 × 10−4 | 2.226 × 10−9 | 0.9231 |
(Sequence) Lowercase nucleotides indicate a mutation from the TTHB099 consensus sequence (wt). (wt + cAMP) Binding reactions performed with the consensus sequence in the presence of 100 nM 3′,5′cAMP.
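The KD values in the table above are consistent with the ratio of the two numeric columns that precede them, which suggests those columns hold an association rate constant and a dissociation rate constant. The following minimal sketch checks that relationship for three rows. The column interpretation (k_on in M−1 s−1, k_off in s−1), the simple 1:1 binding assumption, and the helper names `dissociation_constant` and `affinity_loss` are assumptions made for illustration, not part of the original analysis.

```python
# Values copied from the binding-parameter table above.
# Assumption: the two columns before KD are k_on (1/(M*s)) and k_off (1/s).
TABLE_ROWS = {
    "wt": (131308, 2.907e-4, 2.214e-9),
    "m2": (112773, 3.785e-3, 3.356e-8),
    "m3": (88146, 1.221e-3, 1.385e-8),
}


def dissociation_constant(k_on, k_off):
    """Equilibrium dissociation constant, assuming a simple 1:1 interaction."""
    return k_off / k_on


def affinity_loss(name, reference="wt", rows=TABLE_ROWS):
    """Fold increase in the reported KD relative to the reference sequence."""
    return rows[name][2] / rows[reference][2]


if __name__ == "__main__":
    for name, (k_on, k_off, reported_kd) in TABLE_ROWS.items():
        computed = dissociation_constant(k_on, k_off)
        print(f"{name}: computed KD = {computed:.3e} M, "
              f"reported KD = {reported_kd:.3e} M")
    print(f"m2 vs wt: {affinity_loss('m2'):.1f}-fold weaker binding")
```

Under these assumptions, the m2 substitution (G to T at position 2) corresponds to a roughly 15-fold increase in KD relative to the consensus sequence, the largest change among the single-base mutants listed.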
Table 2. TTHB099-consensus sequences mapped in the genome of T. thermophilus HB8.
| Start | End | p-Value | Q-value | Sequence | Loc | Gene | Op |
|---|---|---|---|---|---|---|---|
| 81,408 | 81,423 | 4.03 × 10−6 | 1 | AGTAAACTAAAACACA | +1 | | 1/3 |
| 81,408 | 81,423 | 4.03 × 10−6 | 1 | TGTGTTTTAGTTTACT | −48 | | S |
| 32,704 | 32,719 | 5.82 × 10−6 | 1 | TGTGTACGAAATTACA | +434 | | 1/2 |
| 472,203 | 472,218 | 7.74 × 10−6 | 1 | TGTATCTTGAAAAACA | −26 | | S |
| 472,203 | 472,218 | 7.74 × 10−6 | 1 | TGTTTTTCAAGATACA | −56 | | S |
| 130,005 | 130,020 | 1.01 × 10−5 | 1 | TTTATTCTCCCTTACA | −10 | | 1/2 |
| 130,005 | 130,020 | 1.01 × 10−5 | 1 | TGTAAGGGAGAATAAA | −3 | | S |
| 1,506 | 1,521 | 1.23 × 10−5 | 1 | AGTGAGATAACTCACA | −666 | | 1/3 |
| 1,506 | 1,521 | 1.23 × 10−5 | 1 | TGTGAGTTATCTCACT | +627 | | S |
| 79,627 | 79,642 | 1.30 × 10−5 | 1 | TGTGGTCCAGGCTACC | −78 | | 1/3 |
| 79,627 | 79,642 | 1.30 × 10−5 | 1 | GGTAGCCTGGACCACA | −162 | | S |
| 615,132 | 615,147 | 1.46 × 10−5 | 1 | GGTAGCCAGGGATACA | +909 | | 4/4 |
| 1,715,061 | 1,715,076 | 1.65 × 10−5 | 1 | TGTAGGCCAGGCCACG | −33 | | 1/2 |
| 609,145 | 609,160 | 1.83 × 10−5 | 1 | CGTGTCCCTGAACACA | +790 | | 2/4 |
| 614,143 | 614,158 | 2.12 × 10−5 | 1 | TGTGCCTTTGGCCACA | +326 | | 1/3 |
| 1,794,923 | 1,794,938 | 2.33 × 10−5 | 1 | GGTATGCTCAAGTACA | +13 | | 1/2 |
| 1,794,923 | 1,794,938 | 2.33 × 10−5 | 1 | TGTACTTGAGCATACC | −19 | | 1/4 |
| 1,272 | 1,287 | 2.61 × 10−5 | 1 | TGTAGCCCAGGCCAAA | +239 | | S |
| 1,272 | 1,287 | 2.61 × 10−5 | 1 | TTTGGCCTGGGCTACA | +536 | | 4/4 |
| 199,120 | 199,135 | 2.90 × 10−5 | 1 | TGTGGCGTATAACAAA | −17 | | S |
| 199,120 | 199,135 | 2.90 × 10−5 | 1 | TTTGTTATACGCCACA | −103 | | S |
| 357,035 | 357,050 | 3.43 × 10−5 | 1 | AGTGATGTAAACTAAA | −26 | | S |
| 314,103 | 314,118 | 3.67 × 10−5 | 1 | TGTGTTGCAGGACCCA | +58 | | 2/11 |
| 1,540,358 | 1,540,373 | 3.95 × 10−5 | 1 | TGTAGCTTCCCATACC | −67 | | S |
| 1,540,358 | 1,540,373 | 3.95 × 10−5 | 1 | GGTATGGGAAGCTACA | +13 | | S |
(p-Value) The probability of a random sequence of the same length matching that position of the sequence with an equal or better score. (Q-value) False discovery rate if the occurrence is accepted as significant. (Loc) Location of the TTHB099-binding site relative to the start site of transcription. (Gene) Proximal gene downstream of TTHB099 consensus sequence. (Op) Gene position within the postulated operon. (S) No operon, single transcriptional unit.
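Several sites in the table above appear twice, once per strand, because a palindrome-like site can be read in either orientation relative to a neighboring gene. The sketch below illustrates that idea by comparing a 16-bp site, and its reverse complement, against the preferred motif reported in the abstract. It is a simplification and not the FIMO procedure used in the study: FIMO scores sites against a position weight matrix, whereas this example only counts mismatches against a degenerate consensus. Collapsing (A/g) to R and (T/c) to Y, and the function names, are assumptions made for illustration.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Assumption: 5'-TGT(A/g)NBSYRSVN(T/c)ACA-3' with (A/g) -> R and (T/c) -> Y,
# which ignores the stated preference for the uppercase base.
MOTIF = "TGTRNBSYRSVNYACA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def mismatches(seq, motif=MOTIF):
    """Count positions where seq falls outside the degenerate motif."""
    if len(seq) != len(motif):
        raise ValueError("sequence and motif must have the same length")
    return sum(base not in IUPAC[code]
               for base, code in zip(seq.upper(), motif))


def best_mismatches(seq, motif=MOTIF):
    """Smallest mismatch count over both strand orientations."""
    return min(mismatches(seq, motif),
               mismatches(reverse_complement(seq), motif))


if __name__ == "__main__":
    for site in ("TGTATTCTAGAATACA", "TGTGTTTTAGTTTACT", "AGTAAACTAAAACACA"):
        print(site, reverse_complement(site), best_mismatches(site))
```

The first site is the palindromic consensus used in the binding assays, so it equals its own reverse complement and has no mismatches. The second and third are the two orientations of the site at positions 81,408 to 81,423, which are reverse complements of each other.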
Figure 6. Promoter predictions of sequences potentially regulated by TTHB099 within the T. thermophilus HB8 genome. Shown are ±200 bp sequences from the TTHB099 binding site identified through FIMO (see Table 2). Blue nucleotides represent the longest open reading frames with a downstream orientation relative to the TTHB099 binding site; green nucleotides indicate open reading frames with the opposite orientation; black nucleotides imply intergenic regions. Potential promoter elements (−35 and −10 boxes, +1 start site of transcription) are indicated with cyan highlighting; TTHB099-binding sites are indicated with yellow highlighting; overlapping TTHB099-binding and core promoter elements are indicated by green highlighting.
Table 3. Binding kinetics parameters of TTHB099 to potential gene promoter elements.
| Gene | Sequence | kon (M−1 s−1) | koff (s−1) | KD (M) | R2 |
|---|---|---|---|---|---|
| | TGTGTTTTAGTTTACT | 122,852 | 1.145 × 10−2 | 9.322 × 10−8 | 0.9817 |
| | TGTTTTTCAAGATACA | 164,971 | 1.280 × 10−2 | 7.762 × 10−8 | 0.9718 |
| | TGTAAGGGAGAATAAA | 96,736 | 2.140 × 10−2 | 2.212 × 10−7 | 0.9687 |
| | GGTAGCCTGGACCACA | 214,153 | 7.163 × 10−4 | 3.345 × 10−9 | 0.9805 |
| | TGTAGGCCAGGCCACG | 332,611 | 1.013 × 10−3 | 3.046 × 10−9 | 0.9757 |
| | TGTACTTGAGCATACC | 136,294 | 8.938 × 10−3 | 6.558 × 10−8 | 0.9806 |
| | TTTGTTATACGCCACA | 57,231 | 4.464 × 10−2 | 7.801 × 10−7 | 0.9596 |
| | AGTGATGTAAACTAAA | − | − | − | − |
| | GGTATGGGAAGCTACA | 126,605 | 1.291 × 10−2 | 1.020 × 10−7 | 0.9759 |
(TTHA0080/81) A common TTHB099-binding site shared by two bidirectional promoters. (−) No apparent binding.
Table 4. Expression profile data of the FIMO identified operons in a TTHB099-deficient strain of T. thermophilus HB8.
| Operon | Gene | Role | LogFC | Adj. p-value |
|---|---|---|---|---|
| S | | hypothetical protein | 0.851 | 0.0268 |
| 1 | | hypothetical protein | −0.202 | 0.421 |
| 2 | | phosphoesterase | −0.176 | 0.463 |
| 3 | | dimethyladenosine transferase | −0.219 | 0.336 |
| S | | malate synthase | −0.454 | 0.0983 |
| S | | IclR family transcriptional regulator, acetate operon repressor | 0.276 | 0.619 |
| S | | hypothetical protein | 0.872 | 0.0295 |
| 1 | | Short-chain dehydrogenase/reductase family oxidoreductase | −0.211 | 0.674 |
| 2 | | NrdR family transcriptional regulator | −0.328 | 0.350 |
| S | | Zn-dependent hydrolase | −0.386 | 0.653 |
| 1 | | hypothetical protein | −0.779 | 0.0451 |
| 2 | | hypothetical protein | −0.0653 | 0.955 |
| 3 | | hypothetical protein | −0.217 | 0.674 |
| 1 | | ABC transporter permease | −0.294 | 0.287 |
| 2 | | ABC transporter ATP-binding protein | −0.195 | 0.567 |
| 1 | | 3-isopropylmalate dehydratase large subunit | −0.817 | 0.0246 |
| 2 | | homoaconitate hydratase small subunit | −1.14 | 0.0265 |
| 3 | | hypothetical protein | −0.0793 | 0.790 |
| 4 | | hypothetical protein | −0.0327 | 0.905 |
| 1 | | hypothetical protein | 0.353 | 0.154 |
| 2 | | hypothetical protein | 0.723 | 0.0284 |
| S | | Mg2+ chelatase family protein | 0.141 | 0.698 |
| S | | hypothetical protein | 0.454 | 0.0644 |
| S | | hypothetical protein | 0.687 | 0.0421 |
| S | | hypothetical protein | 2.62 | 2.10 × 10−3 |
| S | | hypothetical protein | −1.20 | 0.0960 |
(Operon) Numbers indicate positions of the genes within the operon. (S) Single transcriptional unit. (Role) The biological function identified using the KEGG database [19]. (LogFC) Log2-fold change between data obtained from TTHB099-deficient (accessions GSM530118/20/22) and wild-type (accessions GSM532194/5/6) T. thermophilus HB8 strains, SuperSeries GSE21875. (Adj. p-value) The p-value obtained following multiple testing corrections using the default Benjamini and Hochberg false discovery rate method [25].
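The two statistics reported in the table above, the log2-fold change and the Benjamini and Hochberg adjusted p-value, can be illustrated with a short sketch. This is a generic implementation of the Benjamini and Hochberg step-up adjustment, not the GEO2R pipeline itself, and the raw p-values in the usage example are hypothetical because the source reports only adjusted values. The function names are likewise illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fold_from_log2(log_fc):
    """Convert a log2-fold change into a linear expression ratio."""
    return 2.0 ** log_fc


if __name__ == "__main__":
    raw = [0.01, 0.04, 0.03, 0.005]  # hypothetical raw p-values
    print(benjamini_hochberg(raw))
    # LogFC of 2.62 is the largest value in the table above.
    print(f"LogFC 2.62 corresponds to a {fold_from_log2(2.62):.1f}-fold ratio")
```

A positive LogFC here means higher expression in the TTHB099-deficient strain than in the wild type, so a LogFC of 2.62 corresponds to roughly a 6-fold increase when TTHB099 is absent.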
Table 5. GEO2R analysis of the most affected genes in the absence of TTHB099.
| Operon | Gene | Role | LogFC | Adj. p-value |
|---|---|---|---|---|
| 1 | | Elongation Factor G | +4.384 | 2.07 × 10−4 |
| 2 | | MoxR-like protein | +5.067 | 7.03 × 10−5 |
| 3 | | Phosphoenolpyruvate Synthase | +5.231 | 7.03 × 10−5 |
| 4 | | Hemolysin III | +3.133 | 1.27 × 10−3 |
| 5 | | Response Regulator_two-component system, OmpR family | +1.087 | 9.51 × 10−3 |
| 6 | | Sensor Histidine Kinase | +0.369 | 2.46 × 10−1 |
| S | | Isocitrate lyase | +4.423 | 1.52 × 10−4 |
| 1 | | SufC protein, ATP-binding protein | −2.465 | 1.06 × 10−3 |
| 2 | | SufB protein, membrane protein | −2.593 | 9.53 × 10−4 |
| 3 | | SufD protein, membrane protein | −2.630 | 6.25 × 10−4 |
| 4 | | Dioxygenase ferredoxin subunit | −2.419 | 2.59 × 10−3 |
| 1 | | ba3-type cytochrome C oxidase polypeptide IIA | +1.311 | 4.37 × 10−2 |
| 2 | | ba3-type cytochrome C oxidase polypeptide II | +2.944 | 7.89 × 10−3 |
| 3 | | ba3-type cytochrome C oxidase polypeptide I | +4.269 | 1.27 × 10−3 |
| 1 | | hypothetical protein | +1.910 | 1.29 × 10−3 |
| 2 | | Major facilitator superfamily transporter | +2.300 | 9.53 × 10−4 |
| 1 | | Elongation factor Tu | −1.254 | 1.17 × 10−2 |
| 1 | | 50S ribosomal protein L33 | −1.139 | 8.04 × 10−3 |
| 2 | | Preprotein translocase subunit SecE | −0.997 | 9.18 × 10−3 |
| 3 | | Transcription antitermination protein NusG | −1.136 | 7.86 × 10−3 |
| 1 | | 50S ribosomal protein L11 | −2.378 | 1.27 × 10−3 |
| 2 | | 50S ribosomal protein L1 | −1.776 | 2.16 × 10−3 |
| 1 | | NADH-quinone oxidoreductase subunit 7 | +1.083 | 8.73 × 10−3 |
| 2 | | NADH dehydrogenase subunit B | +1.005 | 2.41 × 10−2 |
| 3 | | NADH-quinone oxidoreductase subunit 5 | +1.251 | 1.06 × 10−2 |
| 4 | | NADH-quinone oxidoreductase subunit 4 | +1.255 | 6.43 × 10−3 |
| 5 | | NADH-quinone oxidoreductase subunit 2 | +0.693 | 4.43 × 10−2 |
| 6 | | NADH-quinone oxidoreductase subunit 1 | +1.249 | 4.68 × 10−3 |
| 7 | | NADH-quinone oxidoreductase subunit 3 | +1.248 | 5.76 × 10−3 |
| 8 | | NADH-quinone oxidoreductase subunit 8 | +1.490 | 3.62 × 10−3 |
| 9 | | NADH-quinone oxidoreductase subunit 9 | +1.502 | 2.21 × 10−3 |
| 10 | | NADH-quinone oxidoreductase subunit 10 | +1.626 | 6.84 × 10−3 |
| 11 | | NADH-quinone oxidoreductase subunit 11 | +1.043 | 6.39 × 10−3 |
| 12 | | NADH-quinone oxidoreductase subunit 12 | +1.492 | 2.85 × 10−3 |
| 13 | | NADH-quinone oxidoreductase subunit 13 | +1.679 | 3.34 × 10−3 |
| 14 | | NADH-quinone oxidoreductase subunit 14 | +1.509 | 2.84 × 10−3 |
| 15 | | arginyl-tRNA synthetase | +0.397 | 8.43 × 10−2 |
| 16 | | serine protease | +0.106 | 6.09 × 10−1 |
| 17 | | UDP-N-acetylmuramoylalanyl-D-glutamate--2,6-diaminopimelate ligase | +0.520 | 5.11 × 10−2 |
| S | | hypothetical protein | +2.616 | 2.10 × 10−3 |
| S | | Osmotically inducible protein OsmC | +1.206 | 3.65 × 10−3 |
| 1 | | Iron ABC transporter substrate-binding protein | −2.947 | 1.83 × 10−3 |
| 2 | | Iron ABC transporter permease | −2.344 | 1.68 × 10−3 |
| 3 | | Iron ABC transporter ATP-binding protein | −0.796 | 1.69 × 10−2 |
| 4 | | tRNA pseudouridine synthase A | −0.461 | 8.43 × 10−2 |
| S | | MutT/nudix family protein | −1.369 | 6.82 × 10−3 |
| 1 | | nicotinamide nucleotide transhydrogenase subunit alpha 1 | +1.516 | 5.30 × 10−3 |
| 2 | | nicotinamide nucleotide transhydrogenase subunit alpha 2 | +1.596 | 2.85 × 10−3 |
| 3 | | nicotinamide nucleotide transhydrogenase subunit beta | +1.647 | 2.10 × 10−3 |
| 1 | | 50S ribosomal protein L10 | −1.673 | 5.33 × 10−3 |
| 2 | | 50S ribosomal protein L7/L12 | −1.326 | 8.74 × 10−3 |
| 1 | | putative type IV pilin | +1.125 | 4.09 × 10−2 |
| 2 | | secretion system protein | +1.450 | 3.74 × 10−3 |
| 3 | | prepilin-like protein | +1.429 | 5.85 × 10−3 |
| 4 | | hypothetical protein | +2.250 | 1.27 × 10−3 |
| 1 | | maltose ABC transporter substrate-binding protein | +1.787 | 1.72 × 10−3 |
| 2 | | maltose ABC transporter permease | +2.154 | 1.17 × 10−3 |
| 3 | | maltose ABC transporter permease | +2.108 | 1.29 × 10−3 |
| 1 | | putative transcriptional regulator | +3.377 | 2.59 × 10−3 |
| 2 | | hypothetical protein | +2.036 | 7.58 × 10−3 |
| 1 | | hypothetical protein | +1.215 | 9.19 × 10−3 |
| 2 | | CRISPR-associated Cse2 family protein | +1.514 | 4.80 × 10−3 |
| 3 | | hypothetical protein | +1.671 | 6.62 × 10−3 |
| 4 | | hypothetical protein | +1.480 | 4.34 × 10−3 |
| 5 | | hypothetical protein | +1.669 | 4.68 × 10−3 |
| 6 | | hypothetical protein | +1.446 | 6.84 × 10−3 |
| 7 | | hypothetical protein | +1.549 | 1.71 × 10−2 |
(Operon) Numbers indicate positions of the genes within the operon. (S) Single transcriptional unit. (Role) The biological function identified using the KEGG database. (LogFC) Log2-fold change between data obtained from TTHB099-deficient and wild-type T. thermophilus HB8 strains. (Adj. p-value) The p-value obtained following multiple testing corrections using the default Benjamini and Hochberg false discovery rate method [25].