Justina Krawczyk, Thomas A Kohl, Alexander Goesmann, Jörn Kalinowski, Jan Baumbach.
Abstract
Every year, approximately two million people die from tuberculosis, a disease caused by the bacterium Mycobacterium tuberculosis. There is a tremendous need for new anti-tuberculosis therapies (antituberculotica) and drugs to cope with the spread of tuberculosis. Despite many efforts to better understand the pathogenicity of M. tuberculosis and its survival strategy in humans, many questions remain unresolved. Among other cellular processes in bacteria, pathogenicity is controlled by transcriptional regulation. Thus, various studies on M. tuberculosis concentrate on the analysis of transcriptional regulation in order to gain new insights into pathogenicity and other essential processes ensuring mycobacterial survival. We designed a bioinformatics pipeline for the reliable transfer of gene regulations between taxonomically closely related organisms that incorporates (i) a prediction of orthologous genes and (ii) a prediction of transcription factor binding sites. In total, 460 regulatory interactions were identified for M. tuberculosis using our comparative approach. Based on that, we designed a publicly available platform that aims at data integration, analysis, visualization and finally the reconstruction of mycobacterial transcriptional gene regulatory networks: MycoRegNet. It is a comprehensive database system and analysis platform that offers several methods for data exploration and the generation of novel hypotheses. MycoRegNet is publicly available at http://mycoregnet.cebitec.uni-bielefeld.de.
Year: 2009 PMID: 19494184 PMCID: PMC2724278 DOI: 10.1093/nar/gkp453
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Diagram of the prediction pipeline. The diagram shows the main steps performed during the transfer of gene regulations from C. glutamicum to M. tuberculosis. Starting with an orthology detection, the next step was a prediction of conserved regulations. Based on that, a TFBS prediction provided further evidence. Finally, the results can be exported as TAB-delimited files and imported into the MycoRegNet data repository.
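The transfer step described in the Figure 1 caption can be illustrated with a minimal Python sketch. It assumes that a regulation counts as conserved when both the transcription factor and the target gene have an ortholog in the target organism; this is a common simplification and not necessarily the exact criterion used by the MycoRegNet pipeline. The function names and the placeholder gene IDs (`cg_target_1`, `cg_target_2`, `Rv_target_1`) are hypothetical; only the GlxR/Rv3676 and RamB/Rv0465c pairs are taken from the tables of this record.

```python
from typing import Dict, List, Tuple

Regulation = Tuple[str, str]  # (transcription factor, target gene)


def transfer_regulations(
    source_regulations: List[Regulation], orthologs: Dict[str, str]
) -> List[Regulation]:
    """Map each source regulation onto the target organism.

    A regulation is kept only if BOTH the TF and the target gene
    have an ortholog (assumed conservation criterion).
    """
    transferred = []
    for tf, target in source_regulations:
        if tf in orthologs and target in orthologs:
            transferred.append((orthologs[tf], orthologs[target]))
    return transferred


def to_tab_delimited(regulations: List[Regulation]) -> str:
    """Serialize regulations as TAB-delimited text, one pair per line."""
    return "\n".join(f"{tf}\t{target}" for tf, target in regulations)


# Toy input: source-organism gene -> target-organism ortholog.
# "cg_target_1" and "Rv_target_1" are hypothetical placeholders.
orthologs = {
    "GlxR": "Rv3676",
    "RamB": "Rv0465c",
    "cg_target_1": "Rv_target_1",
}

# Toy regulations in the source organism; "cg_target_2" has no ortholog.
source_regulations = [
    ("GlxR", "cg_target_1"),
    ("RamB", "cg_target_2"),
    ("GlxR", "RamB"),
]

predicted = transfer_regulations(source_regulations, orthologs)
exported = to_tab_delimited(predicted)
print(exported)
```

In the pipeline described here, such transferred candidates are then supported by a separate binding site prediction; the sketch covers only the orthology-based mapping and the TAB-delimited export.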
Putative gene regulations of CG in MT
| TF | Target genes | |
|---|---|---|
| Carbohydrate metabolism | ||
| Rv0465c | Carbohydrate metabolism | |
| Cell division and septation | ||
| Specific biosynthesis pathways | ||
| Rv0792c | Carbohydrate metabolism | |
| Rv1719 | Carbohydrate metabolism | |
| Rv3676 | Carbohydrate metabolism | |
| Cell division and septation | ||
| Macroelement and metal homeostasis | ||
| SOS and stress response | ||
| Specific biosynthesis pathways | ||
| Cellular Program | ||
| RelA | Sigma factor module | |
| SOS and stress response | ||
| Macroelement and metal homeostasis | ||
| Rv0485 | Macroelement and metal homeostasis | |
| PhoP | Macroelement and metal homeostasis | |
| Rv0827c | Macroelement and metal homeostasis | |
| Rv1994c | Macroelement and metal homeostasis | |
| IdeR | Carbohydrate metabolism | |
| Macroelement and metal homeostasis | ||
| Rv3160c | Macroelement and metal homeostasis | |
| Rv3173c | Macroelement and metal homeostasis | |
| Sigma factor module | ||
| SigB | Carbohydrate metabolism | |
| Rv0363c ( | ||
| Rv1437 ( | ||
| SOS and stress response | ||
| Rv3132c ( | ||
| Specific biosynthesis pathways | ||
| Rv2210c ( | ||
| SigM | SOS and stress response | |
| Rv0384c ( | ||
| Rv3418c ( | ||
| SOS and stress response | ||
| HspR | SOS and stress response | |
| Rv0350 ( | ||
| Rv0384c ( | ||
| HrcA | SOS and stress response | |
| Rv0440 ( | ||
| LexA | Cell division and septation | |
| Rv2748c ( | ||
| SOS and stress response | ||
| Rv1235 ( | ||
| Rv2593c ( | ||
| Rv2737c ( | ||
| Rv2745c | SOS and stress response | |
| Rv0782 ( | ||
| Rv3596c ( | ||
| WhiB1 | SOS and stress response | |
| Rv3913 ( | ||
| MtrA | SOS and stress response | |
| Rv0917 ( | ||
| CspA | Carbohydrate metabolism | |
| Specific biosynthesis pathways | ||
| PyrR | Specific biosynthesis pathways | |
| ArgR | Specific biosynthesis pathways | |
Putative gene regulations of CG in MT, predicted in silico by using the introduced MycoRegNet pipeline
Detected binding sites upstream transferred target genes of CG in MT
| TF | Gene ID | Gene name | Operon | Binding motif |
|---|---|---|---|---|
| Rv0465c | Rv0211a | ATAACTACGCAGG | ||
| Rv0249c | – | Rv0249c-Rv0248c-Rv0247ca | AGTAGTTCGCGAT | |
| Rv0363ca | – | CGTACTTCTCAAA | ||
| Rv0407 | Rv0407-Rv0408a-Rv0409a | CGTGCTGTGCTCA | ||
| Rv0465c | – | Rv0465ca-Rv0464c | CTAACTCTGCGAA | |
| Rv0467a | – | CAAAATTTGCAAA | ||
| Rv0884ca | – | ATGGCATGGCCGA | ||
| Rv0896a | – | TGAGCAGATCACT | ||
| Rv0904ca | – | ATTGCATGGCAAG | ||
| Rv0951 | Rv0951a-Rv0952a | AGTGCTAAGCCGT | ||
| Rv1009 | Rv1009a-Rv1010a-Rv1011a | TCTACTTACCAAA | ||
| Rv1379 | Rv1379a-Rv1380a-Rv1381a-Rv1382-Rv1383-Rv1384-Rv1385 | AGTGCTACGCTGC | ||
| Rv1475c | Rv1475ca-Rv1474c | ACTGCTAGGCTGA | ||
| Rv1837ca | – | TAGGCTGAGCAAT | ||
| Rv1862a | – | TGTGCTGGGCTAA | ||
| Rv2193 | Rv2193a-Rv2194-Rv2195-Rv2196 | ACTACAAAGCGTC | ||
| Rv2241 | Rv2241a-Rv2242 | CAAACAGCGCAAG | ||
| Rv2332a | – | TGCGCTCTGCGAA | ||
| Rv2967ca | – | CATGCAATGTCAA | ||
| Rv3316 | Rv3316-Rv3317-Rv3318a-Rv3319 | GTTGCATTGCCCC | ||
| IdeR | Rv0249c | – | Rv0249c-Rv0248c-Rv0247ca | TTAGATGAGCGCACCCACG |
| Rv0827ca | – | – | CTATGGATCGCTGTACTAC | |
| Rv0844ca | – | CGACGAGCAGCTAAACTCA | ||
| Rv1285 | Rv1285a-Rv1286a | GAGGGCGAGGCACACGTCA | ||
| Rv2391 | Rv2391a-Rv2392a-Rv2393a | TCAGGTGCGCGTCTCCCAG | ||
| Rv2895ca | – | TAAGCGAAGCCGAACGCCA | ||
| Rv3044a | – | GTAGACCAGGCTCCCCTTG | ||
| Rv3316 | Rv3316-Rv3317-Rv3318a-Rv3319 | CTAAGAAAAGCCAGCCTAA | ||
| Rv3841a | – | CTAGGAAAGCCTTTCCTGA | ||
| LexA | Rv1235 | Rv1235a-Rv1236-Rv1237-Rv1238 | TCGACTATCTATCCGA | |
| Rv1638a | – | TCGAATGTCAGCTCGC | ||
| Rv1696a | – | |||
| Rv2594c | Rv2594ca-Rv2593ca-Rv2592ca | TCGAACGATTGTTCGG | ||
| Rv2720a | – | TCGAACACATGTTTGA | ||
| Rv2737c | Rv2737ca-Rv2736ca | TCGAACAGGTGTTCGG | ||
| Rv2748ca | – | CCGACCAGGTGCTCGC | ||
| Rv3370ca | – | TCGAACAATTGTTCGA | ||
| Rv3395c | – | Rv3395ca-Rv3394c | TCGAACATATTTTCGA | |
| Rv3160c | Rv1848 | Rv1848a-Rv1849-Rv1850a-Rv1851-Rv1852a-Rv1853 | GTGTCTACTGCGCGATGATCGAGAGCAT | |
| Rv2220a | – | CAACACGGGGTTGACTGACGGGCAATAT | ||
| Rv2920c | Rv2920ca-Rv2919ca-Rv2918ca | AAGTTTTACGTTAATCCTGATGAAACAT | ||
| Rv3666c | Rv3666ca-Rv3665c-aRv3664ca-Rv3663c-Rv3662c | GTGGTAGCTAACGGTCACCGGCGAGTGT | ||
| Rv3859c | Rv3859ca-Rv3858c | CGCTTGACGGACAGCCTATCGACAAGAC | ||
| Rv3676 | Rv0211a | TGTGAGCAGGCTTATA | ||
| Rv0249c | – | Rv0249c-Rv0248c-Rv0247ca | TGTGATCTGTAACACC | |
| Rv0400ca | – | AGTGATGAGCACCCCG | ||
| Rv0465c | – | Rv0465ca-Rv0464c | TTTGTCGAGGCTCACG | |
| Rv0467a | – | TGTTACAACGCTCACA | ||
| Rv0820a | GGTGGTGATCCGCACC | |||
| Rv0867ca | TGTGACATTACCCACA | |||
| Rv0884ca | TGTGAGCTTGTTCACA | |||
| Rv0896a | – | GGCGTTGAACATCACC | ||
| Rv0904ca | – | CGTGAGTCGTATCACG | ||
| Rv0928 | Rv0928a-Rv0929a-Rv0930a | ACTGAATTGAAACTCA | ||
| Rv0951 | Rv0951a-Rv0952a | TGTGAGTTGGATCACG | ||
| Rv1009 | Rv1009a-Rv1010a-Rv1011a | GGTGGCGCTCATCACC | ||
| Rv1092ca | TGCCACGTAGGTCACG | |||
| Rv1099c | Rv1099c-Rv1098ca-Rv1097c | |||
| Rv1130 | – | Rv1130a-Rv1131 | TGTGGATAAGTCCAGG | |
| Rv1161 | Rv1161a-Rv1162a-Rv1163a-Rv1164-Rv1165-Rv1166 | TGCGTTGAACGGCACG | ||
| Rv1436 | Rv1436a-Rv1437a-Rv1438a | GGTTGTTTAGCCAACA | ||
| Rv1475c | Rv1475ca-Rv1474c | TGTAACTGCCGACATA | ||
| Rv1837ca | – | AGGGATGCACTACACA | ||
| Rv1854ca | – | TGTGGCTGATGACACA | ||
| Rv1862a | – | CGTGGGGCGCCACACA | ||
| Rv1872ca | – | GATGCCGTAGCGCACT | ||
| Rv2029c | Rv2029ca-Rv2028c-Rv2027c-Rv2026c | GGTGACGAGTCGCGCA | ||
| Rv2145c | CGTGACTGGCGTCCCA | |||
| Rv2193 | Rv2193a-Rv2194a-Rv2195a-Rv2196a | GGTGGATAGGTTCACC | ||
| Rv2200c | Rv2200ca-Rv2199c | TGTGATACAGGAGGCG | ||
| Rv2201 | GCTGTCGAAGACCACG | |||
| Rv2220a | TGTGACGGAAAAGACG | |||
| Rv2524ca | – | CGTTACCCACGACACG | ||
| Rv2835c | Rv2835ca-Rv2834ca-Rv2833ca-Rv2832ca | GGTGATGCCGGGCACG | ||
| Rv2920c | Rv2920ca-Rv2919ca-Rv2918ca | AGTGGACCAATTCCCC | ||
| Rv2967ca | – | CGTGGTGGTGGTCACC | ||
| Rv3003ca | Rv3003ca-Rv3002ca-Rv3001ca | TGTGGTGGCCACCCCA | ||
| Rv3010ca | – | GGTGATGGCGATGACC | ||
| Rv3043c | Rv3043ca-Rv3042c | AGTGGATCGCATCCCG | ||
| Rv3048ca | GGTGACTGGAAACGCA | |||
| Rv3217ca | – | TGTGGTGGCGGTCGCA | ||
| Rv3219a | AGTGAGATAGCCCACG | |||
| Rv3279c | Rv3279ca-Rv3278c | TATCGGCTGCCGCACA | ||
| Rv3280 | Rv3280-aRv3281-Rv3282 | CGGGACGTCGACCACA | ||
| Rv3316 | Rv3316-Rv3317-Rv3318a-Rv3319 | CGAGACGTTTTCCACG | ||
| Rv3549c | – | Rv3549c-Rv3548ca | GGTGATCGGCATTGCA | |
| Rv3676a | – | TGTCACCTACGACAGA | ||
| Rv3681ca | TGAGATACAGGTAACA | |||
| Rv3859c | Rv3859ca-Rv3858c | TGCTCCGGATTTCACA |
Detected binding sites of GlxR (ortholog in MT: Rv3676/Crp), RamB (ortholog in MT: Rv0465c), AmtR (ortholog in MT: Rv3160c), DtxR (ortholog in MT: IdeR/Rv3173c) and LexA (ortholog in MT: Rv2720/LexA) orthologs of CG in MT. Code:
a: Transferred target gene of CG in MT.
Figure 2. MycoRegNet main page. The main page includes a typical search mask, a statistical overview of the database content, an entry point to browse the integrated organisms, and links to more specific statistics, the system documentation and a tutorial on how to use the MycoRegNet Web Service.
Figure 3. Sequence logo of the predicted Crp binding sites (A) in comparison to the sequence logo of GlxR (B). The sequence logo models the binding site motif of Crp. It was deduced from the predicted binding sites in Table 3. The height of each letter within an individual stack represents the nucleotide's frequency relative to the particular motif position; thus, the degree of a nucleotide's conservation is indicated by the stack according to the respective position.
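The per-position nucleotide frequencies that underlie such a sequence logo can be computed as in the following minimal sketch. It is an illustration only: the four example sites are taken from the Rv3676 rows of the binding site table above rather than from the full set of predicted Crp sites, and the function name `position_frequencies` is hypothetical.

```python
from collections import Counter
from typing import Dict, List


def position_frequencies(sites: List[str]) -> List[Dict[str, float]]:
    """Return, for each motif position, the relative frequency of A, C, G, T."""
    if not sites:
        raise ValueError("at least one binding site is required")
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all binding sites must have the same length")

    matrix = []
    for pos in range(length):
        counts = Counter(site[pos] for site in sites)
        matrix.append({nt: counts.get(nt, 0) / len(sites) for nt in "ACGT"})
    return matrix


# Four 16-bp sites listed for Rv3676 in the binding site table (subset only).
sites = [
    "TGTGAGCAGGCTTATA",
    "TGTGATCTGTAACACC",
    "TGTGACATTACCCACA",
    "TGTGAGCTTGTTCACA",
]

freqs = position_frequencies(sites)
for pos, column in enumerate(freqs, start=1):
    print(pos, column)
```

In this small subset the first five positions are fully conserved (TGTGA), while the spacer positions vary, which is the pattern a logo makes visible.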
Predicted Crp binding sites
| Gene ID | Gene | Motif position | Motif sequence | Operon |
|---|---|---|---|---|
| Carbohydrate metabolism | ||||
| Rv0211 | −166 | – | ||
| Rv0249c | −104 | |||
| Rv0249c | −410 | |||
| Rv0458 | −41 | |||
| Rv0465c | – | −167 | ||
| Rv0467a,g | −341 | – | ||
| Rv0896 | −356 | – | ||
| Rv0951 | −173 | |||
| Rv1099c | – | −515 | ||
| Rv1130 | −152 | |||
| Rv1436 | −48 | |||
| Rv1475c | −462 | |||
| Rv1552 | −284 | |||
| Rv1837c | −381 | – | ||
| Rv1862 | −227 | – | ||
| Rv1872c | −200 | – | ||
| Rv2029c | −410 | |||
| Rv2967ca,f | −389 | – | ||
| Rv3010c | −532 | – | ||
| Rv3316 | −386 | |||
| Rv3676 | CRP | −538 | – | |
| Fatty acid metabolism | ||||
| Rv0097 | – | −526 | ||
| Rv0166 | −84 | – | ||
| Rv0400ca,f | −5 | – | ||
| Rv1185c | −168 | – | ||
| Rv1714 | – | −405 | ||
| Rv2485c | −91 | – | ||
| Rv2486 | −287 | – | ||
| Rv2524ca,f | −259 | – | ||
| Rv2930 | −498 | |||
| Rv3279c | −38 | |||
| Rv3280 | −331 | |||
| Rv3549c | – | −67 | ||
| Nitrogen assimilation | ||||
| Rv1538c | −187 | – | ||
| Rv2220a,f,g | −1 | – | ||
| Rv2920c | −2 | |||
| Rv3859c | −398 | |||
| PGRS | ||||
| Rv0453 | PPE11 | −269 | – | |
| Rv1386 | PE15 | −133 | ||
| Rv2408 | PE24 | −213 | – | |
| Rv2591 | PE_PGRS44 | −38 | – | |
| Rv3136 | PPE51 | −16 | – | |
| Rv3650 | PE33 | −83 | – | |
| Respiration | ||||
| Rv1161 | −512 | |||
| Rv1623c | −181 | – | ||
| Rv1854c | −109 | – | ||
| Rv2193 | −517 | |||
| Rv2200c | −23 | |||
| Rv3043c | −227 | |||
| Other cellular processes | ||||
| Rv0019c | −69 | – | ||
| Rv0079 | – | −110 | ||
| Rv0103c | −159 | – | ||
| Rv0104 | – | −1 | – | |
| Rv0145 | – | −59 | ||
| Rv0188 | – | −356 | – | |
| Rv0194 | – | −517 | – | |
| Rv0232 | – | −53 | ||
| Rv0250c | – | −37 | – | |
| Rv0360c | – | −2 | – | |
| Rv0457c | – | −43 | – | |
| Rv0470A | – | −212 | – | |
| Rv0483 | −116 | – | ||
| Rv0793 | – | −375 | – | |
| Rv0820a,g | −538 | – | ||
| Rv0867c | −443 | – | ||
| Rv0884ca,f | −91 | – | ||
| Rv0885 | – | −133 | ||
| Rv0904c | −2 | – | ||
| Rv0928 | −6 | |||
| Rv0993 | −8 | |||
| Rv0950c | – | −153 | – | |
| Rv0992c | – | −109 | ||
| Rv1009 | −271 | |||
| Rv1057 | – | −248 | – | |
| Rv1092ca,f | −242 | – | ||
| Rv1111c | – | −411 | – | |
| Rv1158c | – | −69 | ||
| Rv1159 | −77 | – | ||
| Rv1230c | – | −79 | – | |
| Rv1291c | – | −323 | – | |
| Rv1314c | – | −294 | – | |
| Rv1324 | – | −104 | – | |
| Rv1482c | – | −23 | – | |
| Rv1566c | – | −235 | – | |
| Rv1568 | −553 | |||
| Rv1592c | – | −215 | – | |
| Rv1757c | – | −351 | – | |
| Rv1779c | – | −89 | – | |
| Rv1780 | – | −147 | – | |
| Rv1890c | – | −7 | – | |
| Rv1891e,g | – | −63 | ||
| Rv2145ca,f | −463 | – | ||
| Rv2172c | – | −2 | – | |
| Rv2180c | – | −304 | – | |
| Rv2201a,f | −336 | – | ||
| Rv2258c | – | −459 | – | |
| Rv2362c | −224 | |||
| Rv2377c | −268 | – | ||
| Rv2406c | – | −34 | – | |
| Rv2407 | – | −242 | – | |
| Rv2428 | −93 | – | ||
| Rv2450c | −509 | – | ||
| Rv2450c | −422 | – | ||
| Rv2455c | – | −237 | ||
| Rv2650c | – | −305 | – | |
| Rv2699c | – | −116 | – | |
| Rv2700e,f | – | −138 | – | |
| Rv2712c | – | −296 | – | |
| Rv2835c | −513 | |||
| Rv2874 | −351 | – | ||
| Rv3003c | −335 | |||
| Rv3048ca,f | −2 | – | ||
| Rv3053c | −347 | |||
| Rv3217ca,e | – | −278 | – | |
| Rv3219 | −176 | – | ||
| Rv3613c | – | −458 | – | |
| Rv3617 | −315 | |||
| Rv3645 | – | −179 | – | |
| Rv3681c | −106 | – | ||
| Rv3729 | – | −190 | – | |
| Rv3843c | – | −505 | ||
| Rv3856c | – | −547 | – | |
| Rv3857c | – | −341 | – | |
| Consensus | TGTGANNNNNNTCACA | |||
Crp binding sites detected by the TFBS search of the introduced pipeline and by the additional TFBS search with adopted and optimized PWMs. Bold letters indicate conserved pentamers of the motif. Codes:
a: Transferred target gene from CG.
b: Binding site experimentally verified by EMSA/ChIP/RT-PCR (21–23).
c: Gene showed altered expression in microarray studies of ΔRv3676 versus WT (24).
d: Motif position relative to the translation start site.
e: Core gene.
f: Essential gene.
g: Gene involved in virulence processes.
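The consensus row of the table (TGTGANNNNNNTCACA) can be used for a simple illustration of motif scanning. The sketch below counts mismatches against the consensus, treating N as a wildcard, and slides a window over a sequence. This is a deliberately simplified stand-in and not the PWM-based search used for the predictions above; the function names and the `max_mismatches` threshold are assumptions made for this example.

```python
from typing import List, Tuple

CRP_CONSENSUS = "TGTGANNNNNNTCACA"  # consensus row of the table above


def mismatches(site: str, consensus: str = CRP_CONSENSUS) -> int:
    """Count positions where the site differs from the consensus (N matches anything)."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for s, c in zip(site, consensus) if c != "N" and s != c)


def scan(
    sequence: str, consensus: str = CRP_CONSENSUS, max_mismatches: int = 2
) -> List[Tuple[int, str, int]]:
    """Return (start, window, mismatch count) for every window within the threshold."""
    width = len(consensus)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        count = mismatches(window, consensus)
        if count <= max_mismatches:
            hits.append((start, window, count))
    return hits


# Two sites listed for Rv3676 in the binding site table.
perfect = mismatches("TGTGAGCTTGTTCACA")
one_off = mismatches("TGTGACATTACCCACA")

# Toy upstream sequence: one listed site embedded in arbitrary flanks.
hits = scan("AAA" + "TGTGAGCTTGTTCACA" + "GGG", max_mismatches=0)
print(perfect, one_off, hits)
```

A PWM scores each position by its observed nucleotide frequencies instead of a fixed letter, so it can rank partially matching sites more finely than a mismatch count.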
Figure 4. Reconstructed network of the GlxR ortholog Crp. The network reconstruction of the Crp regulon is based on the 121 transcription units presented in Table 3. It was generated by GraphVis, the integrated network reconstruction tool of MycoRegNet. Transcription units relying on binding site predictions/experimental verifications that were reported previously in (22–24,60) and correspond with our findings are colored according to the appropriate publication. Arrows and gene IDs (node labels) colored in red indicate a repressive regulation by Crp; green arrows correspond to an activating regulation.