From Corynebacterium glutamicum to Mycobacterium tuberculosis--towards transfers of gene regulatory networks and integrated data analyses with MycoRegNet.

Justina Krawczyk, Thomas A Kohl, Alexander Goesmann, Jörn Kalinowski, Jan Baumbach.

Abstract

Every year, approximately two million people die from tuberculosis, a disease caused by the bacterium Mycobacterium tuberculosis. There is a tremendous need for new anti-tuberculosis therapies (antituberculotica) and drugs to cope with the spread of tuberculosis. Despite many efforts to obtain a better understanding of M. tuberculosis' pathogenicity and its survival strategy in humans, many questions are still unresolved. Among other cellular processes in bacteria, pathogenicity is controlled by transcriptional regulation. Thus, various studies on M. tuberculosis concentrate on the analysis of transcriptional regulation in order to gain new insights into pathogenicity and other essential processes ensuring mycobacterial survival. We designed a bioinformatics pipeline for the reliable transfer of gene regulations between taxonomically closely related organisms that incorporates (i) a prediction of orthologous genes and (ii) a prediction of transcription factor binding sites. In total, 460 regulatory interactions were identified for M. tuberculosis using our comparative approach. Based on that, we designed a publicly available platform that aims at data integration, analysis, visualization and, finally, the reconstruction of mycobacterial transcriptional gene regulatory networks: MycoRegNet. It is a comprehensive database system and analysis platform that offers several methods for data exploration and the generation of novel hypotheses. MycoRegNet is publicly available at http://mycoregnet.cebitec.uni-bielefeld.de.

Year:  2009        PMID: 19494184      PMCID: PMC2724278          DOI: 10.1093/nar/gkp453

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Every year, approximately two million people worldwide die from tuberculosis (1), and one-third of the world's total population suffers from this communicable disease (http://www.who.int), which is caused by the bacterium Mycobacterium tuberculosis. Tuberculosis is the leading cause of death among people living with HIV and claims on average 200 000 lives every year, most of them in Africa. Persons infected with tuberculosis do not directly develop the characteristic full-blown clinical picture but, in most cases, the latent form, which can progress to an active condition after years. A person with active tuberculosis can infect about 10–15 people a year if left untreated (http://www.who.int). Although there is effective treatment to cure patients with tuberculosis, and new strategies have been developed to stop its further dissemination, its containment is still a serious problem (2). The number of multi-resistant strains not responding to standard drug treatments is increasing constantly worldwide (3,4). Consequently, there is a tremendous need for new anti-tuberculosis therapies (antituberculotica) and drugs to cope with the spread of tuberculosis. Despite many efforts to obtain a better understanding of the pathogenicity of M. tuberculosis and its survival strategy in humans, many questions are still unresolved. The molecular mechanisms responsible for resisting the human immune system, and their activation, are not yet sufficiently understood; most notably, this applies to the ability of M. tuberculosis to remain within the human host for years in a clinically latent state (5). Among other cellular processes in bacteria, pathogenicity is controlled by transcriptional regulation. Thus, various studies on M. tuberculosis concentrate on the analysis of transcriptional regulation in order to gain new insights into pathogenicity and other essential processes ensuring mycobacterial survival.
The identification and characterization of transcriptional regulation on a genome-wide level will enable a better understanding of drug metabolism in M. tuberculosis and facilitate the development of new antibiotics, which are urgently needed. At present, studies focus mainly on the analysis of single regulons or distinct subunits of the complex transcriptional regulatory network of M. tuberculosis [see e.g. (3,5,6)]. Bioinformatics platforms for data storage and public access to transcriptional regulation data exist for M. tuberculosis, similar to other organisms such as Escherichia coli [RegulonDB (7)] or Corynebacterium glutamicum [CoryneRegNet (8)]. MtbRegList (9) and MTBreg (http://www.doe-mbi.ucla.edu/services/mtbreg) offer information relevant to regulatory interactions in M. tuberculosis H37Rv (MT), accumulated from the literature or obtained from computational predictions. While MtbRegList contains predicted and characterized regulatory DNA motifs cross-referenced with transcription factors (TFs), MTBreg combines a collection of conditionally regulated proteins with information about selected TFs. However, both systems are designed as data repositories and provide only limited bioinformatics support for the visualization, analysis and reconstruction of transcriptional gene regulatory networks. Recently, the TB database has become available. This integrated online platform for tuberculosis research combines the annotated genome and expression data with a suite of bioinformatics tools for data analysis (10). The TB database focuses on investigating and providing expression data, while little support is given for the reconstruction of regulatory networks based on these findings. Hence, there is currently no online platform or database system available that aims at appropriate data handling and analysis of transcriptional regulation in M. tuberculosis on a genome-wide level.
Here, we introduce MycoRegNet, a user-friendly online platform dedicated to biomedical researchers interested in the regulation of gene expression in the human pathogen M. tuberculosis. MycoRegNet is available online at http://mycoregnet.cebitec.uni-bielefeld.de. Our approach is based on the assumption that orthologous TFs tend to regulate the expression of orthologous target genes in taxonomically closely related species (11–13). Corynebacterium glutamicum and M. tuberculosis are both classified into the suborder Corynebacterineae of the Actinobacteria phylum and are thus taxonomically closely related (14). Hence, the industrially important amino acid producer C. glutamicum has been successfully applied as a model organism, e.g. for investigating cell envelope synthesis of M. tuberculosis (15–17). We therefore started with the well-examined regulatory network of C. glutamicum ATCC 13032 (CG) (18), which is stored in the corynebacterial reference database CoryneRegNet (8). Our comparative genomics approach aims at a reliable transfer of known regulatory interactions from CG to MT. Instead of relying exclusively on the detection of orthologous genes, we consider further evidence by means of an integrated TF binding site (TFBS) prediction. The resulting data were subsequently stored in MycoRegNet, an online platform designed for the visualization and analysis of the deduced transcriptional regulatory network, which also enables the execution of bioinformatics tools for further hypothesis generation. The remainder of this article is structured as follows: we first describe in detail the workflow used for the transfer of C. glutamicum data to M. tuberculosis. The design of MycoRegNet is briefly introduced afterwards. It aims to overcome typical data integration problems and to supply online visualization and hypothesis generation tools. In the last section, we illustrate and discuss these functionalities.
We finally conclude that MycoRegNet is an appropriate reference database and platform for gene regulatory network analysis of M. tuberculosis.

MATERIALS AND METHODS

The network reconstruction pipeline mainly consists of the detection of (i) conserved genes between CG and MT and (ii) binding sites upstream of the conserved genes in MT. Based on the corresponding results, a list of putative gene regulatory interactions in MT is generated and imported into the MycoRegNet database back-end (see Figure 1 for a graphical overview of the workflow).
Figure 1.

Diagram of the prediction pipeline. The diagram shows the main steps performed during transfer of gene regulations from C. glutamicum to M. tuberculosis. Starting with an orthology detection, the next step was a prediction of conserved regulations. Based on that, a TFBSs prediction provided further evidence. Finally, the results can be exported as TAB-delimited files and imported into the MycoRegNet data repository.

Detection of orthologous genes

Generally, the detection of orthologous genes is not straightforward, since the analysis can be perturbed by factors such as paralogs or sequence divergence in the genomes of interest. To reduce such effects, we searched for orthologous genes by performing bidirectional BLASTP (19) searches on the corresponding protein sequences. To this end, we scanned the CG proteome for sequence similarities with the MT proteome and vice versa, performing BLASTP with an E-value cut-off of 10^-4 in both directions. As a result, we obtained amino acid sequence pairs, so-called bidirectional best hits (BBHs), representing the reciprocal best alignments of the respective protein sequences. The identified BBHs were considered to be putative orthologous proteins in CG and MT, which in turn suggests that the respective genes are regulated by orthologous TFs in both bacteria.
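The reciprocal filtering described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: it assumes the BLASTP results of each direction have already been parsed into (query, subject, E-value, bit score) tuples, and it assumes that the "best" hit is the one with the highest bit score. The function names, the identifiers other than cg0350 and Rv3676, and all numerical values are hypothetical toy data.

```python
from typing import Dict, Iterable, List, Tuple

# One parsed BLASTP hit: (query ID, subject ID, E-value, bit score).
Hit = Tuple[str, str, float, float]


def best_hits(hits: Iterable[Hit], max_evalue: float = 1e-4) -> Dict[str, str]:
    """Map each query to its best-scoring subject among hits passing the E-value cut-off."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, evalue, bitscore in hits:
        if evalue > max_evalue:
            continue  # hit does not pass the E-value cut-off
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(
    cg_vs_mt: Iterable[Hit], mt_vs_cg: Iterable[Hit], max_evalue: float = 1e-4
) -> List[Tuple[str, str]]:
    """Return (CG protein, MT protein) pairs that are each other's best hit."""
    forward = best_hits(cg_vs_mt, max_evalue)
    reverse = best_hits(mt_vs_cg, max_evalue)
    return sorted((cg, mt) for cg, mt in forward.items() if reverse.get(mt) == cg)


if __name__ == "__main__":
    # Toy data: E-values and bit scores are invented for illustration only.
    cg_vs_mt = [
        ("cg0350", "Rv3676", 1e-60, 200.0),
        ("cg0350", "Rv9999", 1e-10, 60.0),
        ("cg0001", "Rv0001", 1e-3, 30.0),  # fails the 1e-4 cut-off
    ]
    mt_vs_cg = [
        ("Rv3676", "cg0350", 1e-58, 195.0),
        ("Rv0001", "cg0001", 1e-80, 300.0),
    ]
    print(bidirectional_best_hits(cg_vs_mt, mt_vs_cg))
```

Requiring the best hit in both directions is what suppresses many paralog-induced false pairs: a paralog may be the best hit in one direction, but it rarely is in both.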

Transfer of regulatory interactions

Based on the previously identified BBHs, regulatory interactions characterized in CG were transferred to MT. We utilized the comprehensive data on transcriptional regulation in CG collected in the corynebacterial reference database CoryneRegNet (8), which contains 806 regulatory interactions between 72 TFs and 544 regulated target genes in CG (status: January 2009). For each regulatory interaction taken from CG, both the gene encoding the TF and the target gene were compared with the list of predicted orthologs in the MT genome. Only if both the TF and its target gene were identified as BBHs was the regulatory interaction transferred from CG to the orthologous counterparts in MT and considered a candidate transcriptional regulation in MT. Furthermore, we assume that the regulatory role of the TF (activation or repression) is conserved as well, including known autoregulations.
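The transfer rule can be illustrated with a short Python sketch. It is a simplified reading of the text above, not the pipeline's actual code: a regulation is kept only when both the TF gene and the target gene have an ortholog, and the regulatory role is copied unchanged. The `Regulation` type, the function name and all identifiers in the example are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple


class Regulation(NamedTuple):
    tf: str      # gene encoding the transcription factor
    target: str  # regulated target gene
    role: str    # "activator" or "repressor"


def transfer_regulations(
    cg_regulations: Iterable[Regulation], orthologs: Dict[str, str]
) -> List[Regulation]:
    """Map CG regulations onto MT when both TF and target have a BBH ortholog."""
    transferred: List[Regulation] = []
    for reg in cg_regulations:
        mt_tf = orthologs.get(reg.tf)
        mt_target = orthologs.get(reg.target)
        if mt_tf is None or mt_target is None:
            continue  # TF or target is not conserved: no transfer
        # The regulatory role is assumed to be conserved.
        transferred.append(Regulation(mt_tf, mt_target, reg.role))
    return transferred


if __name__ == "__main__":
    # Toy data with made-up identifiers.
    orthologs = {"cg_tf1": "mt_tf1", "cg_geneA": "mt_geneA"}
    cg_regulations = [
        Regulation("cg_tf1", "cg_geneA", "repressor"),
        Regulation("cg_tf1", "cg_tf1", "repressor"),    # autoregulation
        Regulation("cg_tf1", "cg_geneB", "activator"),  # target without ortholog
        Regulation("cg_tf2", "cg_geneA", "activator"),  # TF without ortholog
    ]
    for reg in transfer_regulations(cg_regulations, orthologs):
        print(reg)
```

Autoregulations need no special handling in this formulation: when the TF has an ortholog, the TF-to-TF edge passes the same check as any other edge.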

Further evidence through conserved TFBSs

In the last step of our regulatory network prediction pipeline, we added further evidence to the orthology-based approach introduced above by combining the preliminary results with a prediction of TFBSs. To this end, all known binding sites of characterized CG TFs with potential orthologs in MT were used to create motif profiles. TF binding motifs were modeled as so-called position weight matrices (PWMs), the most widely used model for this purpose. However, we applied only PWMs of corynebacterial TFs deduced from more than 20 binding sites, i.e. those of the TFs GlxR, RamB, AmtR, DtxR and LexA. To detect instances of the respective motifs in MT, we employed the TFBS matching tool PoSSuMsearch (20) and scanned 580-bp long, noncoding DNA sequences upstream of all genes and operons that had been detected as potential orthologs of target genes of the respective TF. The upstream sequences extended to +20 bp relative to the transcription start. In our initial approach, we performed a restrictive search by setting the P-value threshold to 10^-5. Because only few binding sites were detected in these first PoSSuMsearch runs, we relaxed the P-value threshold, since the initial value was apparently too restrictive for our TFBS predictions. To determine a new threshold, we used as a reference the P-values of GlxR PWM matches upstream of 26 target genes for which binding of the GlxR ortholog Rv3676 in MT had been experimentally verified (21–24). Among the binding sites matching upstream of these genes, we chose the one with the worst P-value to define the threshold. Thus, we finally set the P-value threshold to <10^-2 and, for each target gene/operon in MT, took the TFBS match with the lowest P-value as the prediction for the respective binding site.
Taken together, the outcome of the workflow introduced above is a list of transcriptional regulations for MT in which (i) the TF is conserved, (ii) the target gene is conserved between CG and MT as well, and (iii) a binding site is predicted if the target gene/operon is controlled by one of the five TFs for which a TFBS search was performed. Given the close taxonomic relation between CG and MT, the resulting predictions represent likely regulatory interactions in MT. These are the data we aim to integrate into the MycoRegNet platform, together with validated knowledge from the literature (5,25–34).
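The threshold calibration and the final site selection can be sketched as follows. This is one possible reading of the procedure and not the authors' code: it assumes the PWM matches (with P-values already computed by a tool such as PoSSuMsearch) are available as (target, site, P-value) tuples, it takes the best match of each experimentally verified target, and it reports the worst of those P-values as a guide for choosing the threshold. All names and values below are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

# One PWM match: (target gene/operon, matched site sequence, P-value).
Match = Tuple[str, str, float]


def best_match_per_target(
    matches: Iterable[Match], threshold: float
) -> Dict[str, Tuple[str, float]]:
    """Keep, per target, the match with the lowest P-value below the threshold."""
    best: Dict[str, Tuple[str, float]] = {}
    for target, site, pvalue in matches:
        if pvalue >= threshold:
            continue  # the threshold is strict: P-value must be < threshold
        if target not in best or pvalue < best[target][1]:
            best[target] = (site, pvalue)
    return best


def worst_verified_pvalue(matches: List[Match], verified_targets: Set[str]) -> float:
    """Worst (largest) best-match P-value among experimentally verified targets."""
    best = best_match_per_target(matches, threshold=float("inf"))
    verified = [p for target, (_, p) in best.items() if target in verified_targets]
    if not verified:
        raise ValueError("no matches upstream of verified targets")
    return max(verified)


if __name__ == "__main__":
    # Toy matches; sequences and P-values are invented for illustration.
    matches = [
        ("geneA", "TGTGA", 1e-6),
        ("geneA", "TGTCA", 1e-3),
        ("geneB", "GGTGA", 5e-3),
        ("geneC", "AAAAA", 0.2),
    ]
    print(worst_verified_pvalue(matches, {"geneA", "geneB"}))
    print(best_match_per_target(matches, threshold=1e-2))
```

Calibrating against verified sites trades specificity for sensitivity: the relaxed threshold admits every known site, and restricting the scan to upstream regions of orthologous targets limits the number of additional false positives.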

Data integration with the MycoRegNet platform

Based on our experience with CoryneRegNet, we designed MycoRegNet in a very similar way: as an ontology-based data warehouse for mycobacterial TFs and regulatory networks. We set it up as a sister project of CoryneRegNet to store, analyze and visualize the regulatory interactions in M. tuberculosis that are derived from the prediction pipeline introduced above. MycoRegNet is composed of two main parts: (i) a web front-end running on an Apache HTTP web server, which manages user-database interactions as well as the execution of further online bioinformatics computations, and (ii) a back-end consisting of data preprocessing tools and a MySQL database that stores all data corresponding to the deduced and ontologically restructured mycobacterial gene regulatory interactions. Data import comprises the integration of transcriptional regulations, the complete genome sequence of MT along with the genome annotation as stored in the GenBank database (NCBI) (35), operon predictions available from the Virtual Institute of Microbial Stress and Survival (VIMSS) (36), precalculated PWMs and other preprocessed data necessary for subsequent online TFBS detection, and stimulons derived from the literature (25–31). The import and conversion software is implemented in Java, while the web pages generated at the front-end level are developed in PHP. An embedded Java applet provides the visualization of gene regulatory networks from the included data. A SOAP-based Web Service (37) client/server system implemented by means of NuSOAP enables a bidirectional interconnection with GenDB (38) and EMMA (39). The server is open access and provides well-structured data access via the SOAP interface to any other bioinformatics client. GenDB is an open source system for the annotation of prokaryotic genomes, while EMMA is a web-based application for the storage and analysis of transcriptomics data from microarrays.
By means of the clients for GenDB and EMMA, data integrated in MycoRegNet is supplemented with up-to-date information on the genome annotation of MT (GenDB) and gene expression data preanalyzed with EMMA. To give one example, the Web Service client for GenDB facilitates the mapping of all genes controlled by a certain regulator to KEGG pathways (40) in order to provide an overview on the general nature of a TF of interest. Furthermore, the automatic annotation pipeline of GenDB can be used to regularly update gene function assignments.

RESULTS AND DISCUSSION

Here, we first summarize the database content. Subsequently, we present and discuss the benefits of MycoRegNet from the end-user perspective. We first describe the web interface with special attention to the TFBS prediction feature and the network visualization and analysis capability. We then briefly describe the Web Service access and finally demonstrate the platform's visualization functionality by means of an application example.

The database content

By using the transfer pipeline for regulatory interactions described above, we identified 1012 of 3991 proteins from MT as putative orthologs of proteins from CG. Based on the respective set of genes coding for the orthologous proteins, we detected 226 of 806 regulatory interactions from CG as likely conserved in MT (Table 1). Our initial findings reveal 24 partially conserved regulons affecting processes of carbohydrate metabolism, cellular program, macroelement and metal homeostasis, SOS and stress response, and specific biosynthesis, as well as processes governed by sigma factors. By setting the P-value threshold to 10^-2, we could add further evidence for 129 target genes by predicting binding sites upstream of the respective target genes/operons regulated by the TF orthologs of GlxR, RamB, AmtR, DtxR and LexA in MT (Table 2). All in all, we obtained a set of regulatory interactions that is based on good evidence. The database content comprises 618 regulatory interactions for 515 target genes regulated by 26 TFs. Several gene expression experiments are also stored directly within MycoRegNet's database back-end (data not shown). We also integrated genome annotation data of M. tuberculosis CDC1551 for future investigations concerning transcriptional regulation in another ecotype of M. tuberculosis.
Table 1.

Putative gene regulations of CG in MT

TF functional module | TF | Target gene module | Target genes
Carbohydrate metabolism | Rv0465c | Carbohydrate metabolism | Rv0211 (pckA), Rv0247c (-), Rv0363c (fba), Rv0408 (pta), Rv0409 (ackA), Rv0465c (-), Rv0467 (icl), Rv0896 (gltA2), Rv0904c (accD3), Rv0951 (sucC), Rv0952 (sucD), Rv1475c (acn), Rv1837c (glcB), Rv1862 (adhA), Rv2193 (ctaE), Rv2241 (aceE), Rv2332 (mez), Rv2967c (pca), Rv3318 (sdhA)
Carbohydrate metabolism | Rv0465c | Cell division and septation | Rv1009 (rpfB)
Carbohydrate metabolism | Rv0465c | Specific biosynthesis pathways | Rv0884c (serC), Rv1010 (ksgA), Rv1011 (ispE), Rv1379 (pyrR), Rv1380 (pyrB), Rv1381 (pyrC)
Carbohydrate metabolism | Rv0792c | Carbohydrate metabolism | Rv0753c (mmsA)
Carbohydrate metabolism | Rv1719 | Carbohydrate metabolism | Rv0554 (bpoC), Rv1074c (fadA3), Rv1719 (-), Rv2503c (scoB), Rv2504c (scoA)
Carbohydrate metabolism | Rv3676 | Carbohydrate metabolism | Rv0211 (pckA), Rv0247c (-), Rv0400c (fadE7), Rv0465c (-), Rv0467 (icl), Rv0896 (gltA2), Rv0904c (accD3), Rv0951 (sucC), Rv0952 (sucD), Rv1098c (fumC), Rv1130 (-), Rv1161 (narG), Rv1162 (narH), Rv1163 (narJ), Rv1436 (gap), Rv1437 (pgk), Rv1438 (tpi), Rv1475c (acn), Rv1837c (glcB), Rv1854c (ndh), Rv1862 (adhA), Rv1872c (lldD2), Rv2029c (pfkB), Rv2193 (ctaE), Rv2194 (qcrC), Rv2195 (qcrA), Rv2196 (qcrB), Rv2200c (ctaC), Rv2524c (fas), Rv2967c (pca), Rv3010c (pfkA), Rv3043c (ctaD), Rv3279c (birA), Rv3280 (accD5), Rv3318 (sdhA), Rv3548c (-), Rv3676 (-)
Carbohydrate metabolism | Rv3676 | Cell division and septation | Rv1009 (rpfB), Rv2145c (wag31), Rv2201 (asnB)
Carbohydrate metabolism | Rv3676 | Macroelement and metal homeostasis | Rv0820 (phoT), Rv0928 (pstS3), Rv0929 (pstC2), Rv0930 (pstA1), Rv2220 (glnA1), Rv2832c (ugpC), Rv2833c (ugpB), Rv2834c (ugpE), Rv2835c (ugpA), Rv2918c (glnD), Rv2919c (glnB), Rv2920c (amt), Rv3859c (gltB)
Carbohydrate metabolism | Rv3676 | SOS and stress response | Rv0867c (rpfA), Rv3048c (nrdF2), Rv3217c (-), Rv3219 (whiB1), Rv3681c (whiB4)
Carbohydrate metabolism | Rv3676 | Specific biosynthesis pathways | Rv0884c (serC), Rv1010 (ksgA), Rv1011 (ispE), Rv1092c (coaA), Rv3001c (ilvC), Rv3002c (ilvN), Rv3003c (ilvB1)
Cellular Program | RelA | Sigma factor module | Rv1221 (sigE), Rv2710 (sigB), Rv3221A (-), Rv3911 (sigM)
Cellular Program | RelA | SOS and stress response | Rv2720 (lexA)
Macroelement and metal homeostasis | Rv0485 | Macroelement and metal homeostasis | Rv0132c (fgd2)
Macroelement and metal homeostasis | PhoP | Macroelement and metal homeostasis | Rv0545c (pitA), Rv0757 (phoP), Rv0758 (phoR), Rv0820 (phoT), Rv0928 (pstS3), Rv0929 (pstC2), Rv0930 (pstA1), Rv1095 (phoH2), Rv2832c (ugpC), Rv2833c (ugpB), Rv2834c (ugpE), Rv2835c (ugpA)
Macroelement and metal homeostasis | Rv0827c | Macroelement and metal homeostasis | Rv0827c (-)
Macroelement and metal homeostasis | Rv1994c | Macroelement and metal homeostasis | Rv1994c (-)
Macroelement and metal homeostasis | IdeR | Carbohydrate metabolism | Rv0247c (-), Rv3318 (sdhA)
Macroelement and metal homeostasis | IdeR | Macroelement and metal homeostasis | Rv0827c (-), Rv0844c (narL), Rv1285 (cysD), Rv1286 (cysN), Rv2391 (nirA), Rv2392 (cysH), Rv2393 (-), Rv2895c (viuB), Rv3044 (fecB), Rv3841 (bfrB)
Macroelement and metal homeostasis | Rv3160c | Macroelement and metal homeostasis | Rv1848 (ureA), Rv1849 (ureB), Rv1850 (ureC), Rv1852 (ureG), Rv2220 (glnA1), Rv2918c (glnD), Rv2919c (glnB), Rv2920c (amt), Rv3664c (dppC), Rv3665c (dppB), Rv3666c (dppA), Rv3859c (gltB)
Macroelement and metal homeostasis | Rv3173c | Macroelement and metal homeostasis | Rv0132c (fgd2), Rv0485 (-), Rv1079 (metB), Rv1133c (metE), Rv1175c (fadH), Rv1285 (cysD), Rv1286 (cysN), Rv1294 (thrA), Rv1296 (thrB), Rv1392 (metK), Rv2124c (metH), Rv2334 (cysK1), Rv2391 (nirA), Rv2392 (cysH), Rv2393 (-), Rv3025c (iscS), Rv3028c (fixB), Rv3029c (fixA), Rv3173c (-), Rv3340 (metC), Rv3341 (metA)
Sigma factor module | SigB | Carbohydrate metabolism | Rv0363c (fba), Rv1023 (eno), Rv1098c (fumC), Rv1436 (gap), Rv1437 (pgk), Rv1438 (tpi), Rv3010c (pfkA)
Sigma factor module | SigB | SOS and stress response | Rv3132c (devS)
Sigma factor module | SigB | Specific biosynthesis pathways | Rv2210c (ilvE)
Sigma factor module | SigM | SOS and stress response | Rv0384c (clpB), Rv1464 (csd), Rv1465, Rv1471 (trxB1), Rv3418c (groES), Rv3913 (trxB2), Rv3914 (trxC)
SOS and stress response | HspR | SOS and stress response | Rv0350 (dnaK), Rv0351 (grpE), Rv0352 (dnaJ1), Rv0353 (hspR), Rv0384c (clpB), Rv0440 (groEL), Rv2745c
SOS and stress response | HrcA | SOS and stress response | Rv0440 (groEL), Rv3418c (groES)
SOS and stress response | LexA | Cell division and septation | Rv2748c (ftsK)
SOS and stress response | LexA | SOS and stress response | Rv1235 (lpqY), Rv1638 (uvrA), Rv1696 (recN), Rv2592c (ruvB), Rv2593c (ruvA), Rv2594c (ruvC), Rv2720 (lexA), Rv2736c (recX), Rv2737c (recA), Rv3370c (dnaE2), Rv3395c, Rv3585 (radA)
SOS and stress response | Rv2745c | SOS and stress response | Rv0782 (ptrBb), Rv2460c (clpP2), Rv2461c (clpP1), Rv2725c (hflX), Rv3596c (clpC1), Rv3715c (recR), Rv3716c
SOS and stress response | WhiB1 | SOS and stress response | Rv3913 (trxB2), Rv3914 (trxC)
SOS and stress response | MtrA | SOS and stress response | Rv0917 (betP), Rv3476c (kgtP)
SOS and stress response | CspA | Carbohydrate metabolism | Rv1837c (glcB)
Specific biosynthesis pathways | PyrR | Specific biosynthesis pathways | Rv1379 (pyrR), Rv1380 (pyrB), Rv1381 (pyrC), Rv2883c (pyrH)
Specific biosynthesis pathways | ArgR | Specific biosynthesis pathways | Rv1383 (carA), Rv1384 (carB), Rv1652 (argC), Rv1653 (argJ), Rv1654 (argB), Rv1655 (argD), Rv1656 (argF), Rv1657 (argR)
 |  |  | Rv0488

Putative gene regulations of CG in MT, predicted in silico by using the introduced MycoRegNet pipeline

Table 2.

Detected binding sites upstream transferred target genes of CG in MT

TF | Gene ID | Gene name | Operon | Binding motif
Rv0465c | Rv0211(a) | pckA |  | ATAACTACGCAGG
Rv0465c | Rv0249c |  | Rv0249c-Rv0248c-Rv0247c(a) | AGTAGTTCGCGAT
Rv0465c | Rv0363c(a) | fba |  | CGTACTTCTCAAA
Rv0465c | Rv0407 | pta | Rv0407-Rv0408(a)-Rv0409(a) | CGTGCTGTGCTCA
Rv0465c | Rv0465c |  | Rv0465c(a)-Rv0464c | CTAACTCTGCGAA
Rv0465c | Rv0467(a) | icl |  | CAAAATTTGCAAA
Rv0465c | Rv0884c(a) | serC |  | ATGGCATGGCCGA
Rv0465c | Rv0896(a) | gltA2 |  | TGAGCAGATCACT
Rv0465c | Rv0904c(a) | accD3 |  | ATTGCATGGCAAG
Rv0465c | Rv0951 | sucC | Rv0951(a)-Rv0952(a) | AGTGCTAAGCCGT
Rv0465c | Rv1009 | rpfB | Rv1009(a)-Rv1010(a)-Rv1011(a) | TCTACTTACCAAA
Rv0465c | Rv1379 | pyrR | Rv1379(a)-Rv1380(a)-Rv1381(a)-Rv1382-Rv1383-Rv1384-Rv1385 | AGTGCTACGCTGC
Rv0465c | Rv1475c | acn | Rv1475c(a)-Rv1474c | ACTGCTAGGCTGA
Rv0465c | Rv1837c(a) | glcB |  | TAGGCTGAGCAAT
Rv0465c | Rv1862(a) | adhA |  | TGTGCTGGGCTAA
Rv0465c | Rv2193 | ctaE | Rv2193(a)-Rv2194-Rv2195-Rv2196 | ACTACAAAGCGTC
Rv0465c | Rv2241 | aceE | Rv2241(a)-Rv2242 | CAAACAGCGCAAG
Rv0465c | Rv2332(a) | mez |  | TGCGCTCTGCGAA
Rv0465c | Rv2967c(a) | pca |  | CATGCAATGTCAA
Rv0465c | Rv3316 | sdhA | Rv3316-Rv3317-Rv3318(a)-Rv3319 | GTTGCATTGCCCC
IdeR | Rv0249c |  | Rv0249c-Rv0248c-Rv0247c(a) | TTAGATGAGCGCACCCACG
IdeR | Rv0827c(a) |  |  | CTATGGATCGCTGTACTAC
IdeR | Rv0844c(a) | narL |  | CGACGAGCAGCTAAACTCA
IdeR | Rv1285 | cysD | Rv1285(a)-Rv1286(a) | GAGGGCGAGGCACACGTCA
IdeR | Rv2391 | nirA | Rv2391(a)-Rv2392(a)-Rv2393(a) | TCAGGTGCGCGTCTCCCAG
IdeR | Rv2895c(a) | viuB |  | TAAGCGAAGCCGAACGCCA
IdeR | Rv3044(a) | fecB |  | GTAGACCAGGCTCCCCTTG
IdeR | Rv3316 | sdhA | Rv3316-Rv3317-Rv3318(a)-Rv3319 | CTAAGAAAAGCCAGCCTAA
IdeR | Rv3841(a) | bfrB |  | CTAGGAAAGCCTTTCCTGA
LexA | Rv1235 | lpqY | Rv1235(a)-Rv1236-Rv1237-Rv1238 | TCGACTATCTATCCGA
LexA | Rv1638(a) | uvrA |  | TCGAATGTCAGCTCGC
LexA | Rv1696(a) | recN |  | 
LexA | Rv2594c | ruvC | Rv2594c(a)-Rv2593c(a)-Rv2592c(a) | TCGAACGATTGTTCGG
LexA | Rv2720(a) | lexA |  | TCGAACACATGTTTGA
LexA | Rv2737c | recA | Rv2737c(a)-Rv2736c(a) | TCGAACAGGTGTTCGG
LexA | Rv2748c(a) | ftsK |  | CCGACCAGGTGCTCGC
LexA | Rv3370c(a) | dnaE2 |  | TCGAACAATTGTTCGA
LexA | Rv3395c |  | Rv3395c(a)-Rv3394c | TCGAACATATTTTCGA
Rv3160c | Rv1848 | ureA | Rv1848(a)-Rv1849-Rv1850(a)-Rv1851-Rv1852(a)-Rv1853 | GTGTCTACTGCGCGATGATCGAGAGCAT
Rv3160c | Rv2220(a) | glnA1 |  | CAACACGGGGTTGACTGACGGGCAATAT
Rv3160c | Rv2920c | amt | Rv2920c(a)-Rv2919c(a)-Rv2918c(a) | AAGTTTTACGTTAATCCTGATGAAACAT
Rv3160c | Rv3666c | dppA | Rv3666c(a)-Rv3665c(a)-Rv3664c(a)-Rv3663c-Rv3662c | GTGGTAGCTAACGGTCACCGGCGAGTGT
Rv3160c | Rv3859c | gltB | Rv3859c(a)-Rv3858c | CGCTTGACGGACAGCCTATCGACAAGAC
Rv3676 | Rv0211(a) | pckA |  | TGTGAGCAGGCTTATA
Rv3676 | Rv0249c |  | Rv0249c-Rv0248c-Rv0247c(a) | TGTGATCTGTAACACC
Rv3676 | Rv0400c(a) | fadE7 |  | AGTGATGAGCACCCCG
Rv3676 | Rv0465c |  | Rv0465c(a)-Rv0464c | TTTGTCGAGGCTCACG
Rv3676 | Rv0467(a) | icl |  | TGTTACAACGCTCACA
Rv3676 | Rv0820(a) | phoT |  | GGTGGTGATCCGCACC
Rv3676 | Rv0867c(a) | rpfA |  | TGTGACATTACCCACA
Rv3676 | Rv0884c(a) | serC |  | TGTGAGCTTGTTCACA
Rv3676 | Rv0896(a) | gltA2 |  | GGCGTTGAACATCACC
Rv3676 | Rv0904c(a) | accD3 |  | CGTGAGTCGTATCACG
Rv3676 | Rv0928 | pstS3 | Rv0928(a)-Rv0929(a)-Rv0930(a) | ACTGAATTGAAACTCA
Rv3676 | Rv0951 | sucC | Rv0951(a)-Rv0952(a) | TGTGAGTTGGATCACG
Rv3676 | Rv1009 | rpfB | Rv1009(a)-Rv1010(a)-Rv1011(a) | GGTGGCGCTCATCACC
Rv3676 | Rv1092c(a) | coaA |  | TGCCACGTAGGTCACG
Rv3676 | Rv1099c | fumC | Rv1099c-Rv1098c(a)-Rv1097c | 
Rv3676 | Rv1130 |  | Rv1130(a)-Rv1131 | TGTGGATAAGTCCAGG
Rv3676 | Rv1161 | narG | Rv1161(a)-Rv1162(a)-Rv1163(a)-Rv1164-Rv1165-Rv1166 | TGCGTTGAACGGCACG
Rv3676 | Rv1436 | gap | Rv1436(a)-Rv1437(a)-Rv1438(a) | GGTTGTTTAGCCAACA
Rv3676 | Rv1475c | acn | Rv1475c(a)-Rv1474c | TGTAACTGCCGACATA
Rv3676 | Rv1837c(a) | glcB |  | AGGGATGCACTACACA
Rv3676 | Rv1854c(a) | ndh |  | TGTGGCTGATGACACA
Rv3676 | Rv1862(a) | adhA |  | CGTGGGGCGCCACACA
Rv3676 | Rv1872c(a) | lldD2 |  | GATGCCGTAGCGCACT
Rv3676 | Rv2029c | pfkB | Rv2029c(a)-Rv2028c-Rv2027c-Rv2026c | GGTGACGAGTCGCGCA
Rv3676 | Rv2145c | wag31 |  | CGTGACTGGCGTCCCA
Rv3676 | Rv2193 | ctaE | Rv2193(a)-Rv2194(a)-Rv2195(a)-Rv2196(a) | GGTGGATAGGTTCACC
Rv3676 | Rv2200c | ctaC | Rv2200c(a)-Rv2199c | TGTGATACAGGAGGCG
Rv3676 | Rv2201 | asnB |  | GCTGTCGAAGACCACG
Rv3676 | Rv2220(a) | glnA1 |  | TGTGACGGAAAAGACG
Rv3676 | Rv2524c(a) | fas |  | CGTTACCCACGACACG
Rv3676 | Rv2835c | ugpA | Rv2835c(a)-Rv2834c(a)-Rv2833c(a)-Rv2832c(a) | GGTGATGCCGGGCACG
Rv3676 | Rv2920c | amt | Rv2920c(a)-Rv2919c(a)-Rv2918c(a) | AGTGGACCAATTCCCC
Rv3676 | Rv2967c(a) | pca |  | CGTGGTGGTGGTCACC
Rv3676 | Rv3003c(a) | ilvB1 | Rv3003c(a)-Rv3002c(a)-Rv3001c(a) | TGTGGTGGCCACCCCA
Rv3676 | Rv3010c(a) | pfkA |  | GGTGATGGCGATGACC
Rv3676 | Rv3043c | ctaD | Rv3043c(a)-Rv3042c | AGTGGATCGCATCCCG
Rv3676 | Rv3048c(a) | nrdF2 |  | GGTGACTGGAAACGCA
Rv3676 | Rv3217c(a) |  |  | TGTGGTGGCGGTCGCA
Rv3676 | Rv3219(a) | whiB1 |  | AGTGAGATAGCCCACG
Rv3676 | Rv3279c | birA | Rv3279c(a)-Rv3278c | TATCGGCTGCCGCACA
Rv3676 | Rv3280 | accD5 | Rv3280(a)-Rv3281-Rv3282 | CGGGACGTCGACCACA
Rv3676 | Rv3316 | sdhA | Rv3316-Rv3317-Rv3318(a)-Rv3319 | CGAGACGTTTTCCACG
Rv3676 | Rv3549c |  | Rv3549c-Rv3548c(a) | GGTGATCGGCATTGCA
Rv3676 | Rv3676(a) |  |  | TGTCACCTACGACAGA
Rv3676 | Rv3681c(a) | whiB4 |  | TGAGATACAGGTAACA
Rv3676 | Rv3859c | gltB | Rv3859c(a)-Rv3858c | TGCTCCGGATTTCACA

Detected binding sites of GlxR (ortholog in MT: Rv3676/Crp), RamB (ortholog in MT: Rv0465c), AmtR (ortholog in MT: Rv3160c), DtxR (ortholog in MT: IdeR/Rv3173c) and LexA (ortholog in MT: Rv2720/LexA) orthologs of CG in MT. The marker "a" attached to a gene ID denotes a transferred target gene of CG in MT.

The user interface

As with other online databases, MycoRegNet's web interface provides three major capabilities: browsing the database content, searching by filter criteria, and basic visualization. Furthermore, the front-end offers the execution of computational features. At the main page (Figure 2), one has the option to search or to browse the database content. The user may browse the data repository by clicking on an ecotype name of interest and is then provided with an overview of the selected organism. Alternatively, using one of the options provided within the search form, the database can be searched for specific gene/protein identifiers, gene/protein names, regulator types or functional modules. The search results are presented in tabular form, listing all relevant information for subsequent investigation. Furthermore, the following built-in features can be accessed directly from the main page: TFBScan (for TFBS predictions; see below) and COMA [to check for contradictions within microarray gene expression studies, given the regulatory network stored in the database; refer to (8) for more details]. Detailed information on the results can be obtained via the respective links at the result page. By selecting a particular gene, the corresponding gene details page is invoked. It presents a detailed overview of all available data attached to the gene of interest. Besides general information about the gene/protein (position in the genome, nucleotide sequence, etc.), it comprises a graphical representation of the genomic context, regulated target genes (if the gene encodes a TF) including the TFBSs, and stimulons that induce differential gene expression. The integrated Web Service client for GenDB keeps the displayed gene annotation data up to date. General information (description, comments, an assigned function, etc.) is listed, as well as the EC numbers for enzymes and links to COG (41) and GO (42).
Additionally, all target genes of a TF of interest are linked to KEGG pathways, and a list of regulated pathways is displayed.
Figure 2.

MycoRegNet main page. The main page includes a typical search mask, a statistical overview of the database content, an entry point to browse the integrated organisms, and links to more specific statistics, the system documentation and a tutorial on how to use the MycoRegNet Web Service.

TFBS prediction

With the integrated PoSSuMsearch software, MycoRegNet provides a statistically sound tool for the prediction of TFBSs based on PWMs, which have been precalculated during data import. To our knowledge, PoSSuMsearch is the only TFBSs profiling tool that offers exact P-value calculations and at the same time provides reasonable response times on genome-wide runs. There are three ways to access this feature through the MycoRegNet web site: (1) The TFBScan button at the main page offers the possibility to upload user-defined sequences in FASTA format. (2) At any gene details page the user can predict TFBSs in the upstream sequence of the selected gene. (3) If the gene of interest encodes a TF, the PWM learned from the known TFBSs of the TF may be used to scan for further TFBSs in the upstream sequences of all other mycobacterial genes. The predicted results may further be visualized as graphs. The interface is easy to use: one just has to choose a background model (nucleotide distribution) and a P-value threshold. For further details regarding the prediction of prokaryotic TFBSs by utilizing PoSSuMsearch, the reader is referred to (20,43).
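To make the PWM-based scan more concrete, the following Python sketch builds a log-odds PWM from a set of aligned binding sites and reports the best-scoring window in a sequence. It is an illustration of the general PWM technique only: it does not reproduce PoSSuMsearch, it computes raw log-odds scores rather than exact P-values, and the uniform background, the pseudocount, the function names and the example sequences are all assumptions made for this example.

```python
import math
from typing import Dict, List, Optional, Tuple

PWM = List[Dict[str, float]]


def build_pwm(
    sites: List[str],
    background: Optional[Dict[str, float]] = None,
    pseudocount: float = 1.0,
) -> PWM:
    """Build a log-odds PWM (log2 of frequency over background) from aligned sites."""
    if background is None:
        background = {base: 0.25 for base in "ACGT"}  # assumed uniform background
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all binding sites must have the same length")
    pwm: PWM = []
    for i in range(width):
        column: Dict[str, float] = {}
        for base in "ACGT":
            count = sum(1 for site in sites if site[i] == base) + pseudocount
            freq = count / (len(sites) + 4 * pseudocount)
            column[base] = math.log2(freq / background[base])
        pwm.append(column)
    return pwm


def best_window(pwm: PWM, sequence: str) -> Tuple[int, str, float]:
    """Return (start, site, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scored = [
        (sum(pwm[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    score, start = max(scored)
    return start, sequence[start:start + width], score


if __name__ == "__main__":
    # Toy binding sites and upstream sequence, invented for illustration.
    pwm = build_pwm(["TCGAAC", "TCGAAC", "TCGATC"])
    print(best_window(pwm, "GGGTCGAACGGG"))
```

The choice of background model matters because the score of each base is measured relative to its expected frequency; in a GC-rich genome such as that of M. tuberculosis, a uniform background would overrate G and C matches, which is presumably why the interface asks the user to pick a nucleotide distribution.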

Gene regulatory network visualization

As mentioned earlier, MycoRegNet also provides a network visualization toolkit: GraphVis. It is a Java applet, which graphically reconstructs regulatory networks as graphs based on selected genes and a user-defined graph depth cutoff. It traverses all regulatory interactions from the starting point until the graph depth cutoff has been reached. Finally, a Java applet window appears showing the regulatory network as graph, where nodes represent genes and edges regulatory interactions. GraphVis allows the user to zoom into the graph, apply different layout styles, remove selected elements or retrieve detailed information on selected genes. Furthermore, it is possible to extend the graph dynamically with more genes/regulations from the database and to display the operon grouping of presented genes. Visualized networks may also be graphically compared between two species or between a predicted and an evidenced network by utilizing special comparative graph layout algorithms. In addition, GraphVis features the projection of gene expression data onto the genes of a visualized network. The user can choose to apply gene expression data from the stimulon repository of MycoRegNet or from own tab-delimited text or MS-Excel files, which can be uploaded to GraphVis directly. It is also possible to use expression data extracted from EMMA by means of the integrated Web Service interconnection. According to the differential expression level of the genes, the concerned nodes are resized within the graph. Thus, the user can achieve a comprehensive overview of the transcriptional response of M. tuberculosis to a certain stimulus.
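The depth-limited traversal can be sketched as a breadth-first search over the regulatory interactions. This is a generic illustration and not the GraphVis source: it assumes that interactions are given as (TF, target) pairs and that only outgoing edges (from regulator to target) are followed; the function name and the gene labels are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def subnetwork(regulations: Iterable[Edge], start: str, max_depth: int) -> List[Edge]:
    """Collect all regulatory edges reachable from start within max_depth steps."""
    adjacency: Dict[str, List[str]] = {}
    for tf, target in regulations:
        adjacency.setdefault(tf, []).append(target)

    depth = {start: 0}          # graph depth at which each gene was first reached
    edges: List[Edge] = []
    queue = deque([start])
    while queue:
        gene = queue.popleft()
        if depth[gene] >= max_depth:
            continue  # depth cutoff reached: do not expand this gene
        for target in adjacency.get(gene, []):
            edges.append((gene, target))
            if target not in depth:
                depth[target] = depth[gene] + 1
                queue.append(target)
    return edges


if __name__ == "__main__":
    # Toy network: A regulates B and itself, B regulates C, C regulates D.
    regulations = [("A", "B"), ("A", "A"), ("B", "C"), ("C", "D")]
    print(subnetwork(regulations, "A", max_depth=2))
```

Tracking the depth at which each gene was first reached keeps the traversal finite in the presence of autoregulation and feedback loops, since no gene is expanded more than once.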

Well-structured data access by using Web Services

Although there is no real standard in bioinformatics, a growing number of platforms offer SOAP-based Web Service access to their data repositories [refer to some EBI resources (44) or to the BRENDA database (45), to name just two]. Many databases still provide flat files for exchange with other data processing systems. Developers of novel tools and platforms therefore have to perform updates at regular intervals and adjust the downloaded data to their particular purpose. For this reason, gene regulatory data stored in MycoRegNet can also be accessed via the integrated Web Service server. The data can be integrated directly into corresponding projects without further time-consuming data processing. Detailed information on how to use the MycoRegNet Web Services is available from the main page via the Web Service button.

Application example—the regulatory network of the GlxR ortholog Rv3676 (Crp) in MT

Both GlxR (Cg0350) of C. glutamicum and CRP (Rv3676) of M. tuberculosis belong to the Crp-Fnr family of TFs (46) and have been characterized as cAMP-sensing homologs of E. coli Crp (23,47,48). Crp-cAMP-dependent gene regulation is commonly involved in carbon catabolite repression and forms one of the possible connections between carbon metabolism and virulence (49,50). In mycobacteria, cAMP signalling is the subject of intensive research, as it may be related to the virulence of these strains (51,52). It is noteworthy that M. tuberculosis contains 16 putative adenylate cyclases as well as 10 putative cyclic nucleotide binding proteins (53,54), hinting at a crucial and diverse role for cAMP signalling in mycobacteria. GlxR of C. glutamicum has been a focus of interest in recent years (48,55–59), and the available data indicate that GlxR is a global regulator with about 150 target genes in functionally diverse network modules, such as carbohydrate metabolism, aerobic and anaerobic respiration, fatty acid metabolism, aromatic compound degradation, glutamate uptake and nitrogen assimilation, the cellular stress response and resuscitation. Previous studies suggested a similarly vital role for Crp in M. tuberculosis. Published data implicate Crp in virulence, hypoxia and nutrient starvation (21,23,24,60). Deletion of Crp altered the expression of 16 genes and caused an impaired growth phenotype in bone marrow-derived macrophages as well as in tuberculosis mouse models (24). Several suggestions for a putative Crp regulon have been made, although these studies relied solely on the detection of putative binding sites (23,24,60). As part of our pipeline, the known regulatory interactions of GlxR collected in CoryneRegNet have been used to reconstruct the regulon of the orthologous TF Crp.
Due to the apparently vital role of these regulators in their respective organisms, and the available data on putative target genes and characterized binding sites, we chose them as an application case for our analysis. Employing our pipeline, regulatory interactions with 64 target genes could be transferred from C. glutamicum GlxR to M. tuberculosis Crp. Furthermore, we considered 26 genes with experimental evidence of regulation by Crp as potential target genes (21–24). Based on experimentally verified binding sites of Crp (21–23) together with binding sites predicted by the TFBS search of our pipeline, we complemented the suggested regulon with the prediction of Crp binding sites in the upstream regions of putative target genes. In contrast to the TFBS search within our pipeline, we created an adapted and optimized PWM for Crp from experimentally verified and predicted binding sites, and applied it for the TFBS search. To detect novel binding sites, we set the P-value threshold to 10^−5 and performed a restrictive search on the sequences upstream of genes/operons across the whole genome of MT. Again, we used PoSSuMsearch for binding site prediction, scanning 580-bp-long upstream sequences ranging from +20 bp relative to the transcription start site. Using Weblogo (61), we generated a sequence logo from the detected binding sites of Crp and from the appropriate binding sites of GlxR that were used for PWM creation (see Methods section). The resulting sequence logos are shown in Figure 3.
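How such an upstream window can be cut out of a genome sequence is sketched below in Python. The sketch assumes that a 580-bp window ending at +20 starts at position −560 relative to the gene start, with no position 0; this boundary is inferred from the stated length and is not given explicitly in the text. The function names and the toy sequence are hypothetical, and genome circularity is ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def upstream_window(genome, start, strand, upstream=560, downstream=20):
    """Return the region from -upstream to +downstream around a gene start.

    start is the 0-based forward-strand index of the first base of the gene
    (position +1). For genes on the reverse strand the window is returned
    as read along the gene's own strand.
    """
    if strand == "+":
        lo, hi = start - upstream, start + downstream
    else:
        lo, hi = start - downstream + 1, start + upstream + 1
    if lo < 0 or hi > len(genome):
        raise ValueError("window extends beyond the sequence")
    window = genome[lo:hi]
    return window if strand == "+" else reverse_complement(window)

# Toy example with a short window so the result is easy to inspect.
toy = "AAAACCCCGGGGTTTT"
print(upstream_window(toy, 8, "+", upstream=4, downstream=2))
print(upstream_window(toy, 7, "-", upstream=4, downstream=2))
```

Returning reverse-strand windows in the orientation of the gene keeps motif positions comparable between genes on both strands.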
Figure 3.

Sequence logo of the predicted Crp binding sites (A) in comparison to the sequence logo of GlxR (B). The sequence logo models the binding site motif of Crp. It was deduced from the predicted binding sites in Table 3. The height of each letter within an individual stack represents the nucleotide's frequency relative to the particular motif position; thus, the degree of a nucleotide's conservation is indicated by the stack according to the respective position.
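The relation between letter height and conservation can be made explicit with a small Python sketch. It follows the usual sequence logo convention, in which the stack height at a position is its information content in bits and each letter takes a share of the stack proportional to its frequency. This is an assumption about the plotting convention; the small-sample correction that WebLogo can apply is omitted, and the function name is hypothetical. The four input sites are verified Crp sites from Table 3.

```python
import math
from collections import Counter

def logo_heights(sites):
    """Return per-position letter heights (bits) for aligned DNA sites."""
    n = len(sites)
    columns = []
    for i in range(len(sites[0])):
        counts = Counter(site[i] for site in sites)
        freqs = {base: counts[base] / n for base in "ACGT"}
        # Shannon entropy of the column; fully conserved columns give 0.
        entropy = -sum(f * math.log2(f) for f in freqs.values() if f > 0)
        information = 2.0 - entropy  # maximum is 2 bits for DNA
        columns.append({base: f * information for base, f in freqs.items()})
    return columns

sites = [
    "TGTGATCTAGGTCACG",
    "TGTGACATTACCCACA",
    "TGTGAGCTTGTTCACA",
    "CGTGATCCAACTCACA",
]
heights = logo_heights(sites)
for position, column in enumerate(heights, start=1):
    print(position, {b: round(h, 2) for b, h in column.items() if h > 0})
```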

Table 3. Predicted Crp binding sites

Gene ID | Gene | Motif position (d) | Motif sequence | Operon

Carbohydrate metabolism
Rv0211(a) | pckA | −166 | TGTGAGCAGGCTTATA |
Rv0249c | sdhCD | −104 | TGTGATCTGTAACACC | Rv0249c-Rv0248c-Rv0247c(a)
Rv0249c | sdhCD | −410 | GGTGTCGGAGGTCACA | Rv0249c-Rv0248c-Rv0247c(a)
Rv0458 | adhA | −41 | TGTGAGCTGTATTACA | Rv0458-Rv0459
Rv0465c |  | −167 | TTTGTCGAGGCTCACG | Rv0465c(a)-Rv0464c(g)
Rv0467(a,g) | icl | −341 | TGTTACAACGCTCACA |
Rv0896(a) | gltA2 | −356 | GGCGTTGAACATCACC |
Rv0951 | sucC | −173 | TGTGAGTTGGATCACG | Rv0951(a,f)-Rv0952(a,f)
Rv1099c |  | −515 | GCTGATGAATCCCACG | Rv1099c-Rv1098c(a,f)-Rv1097c
Rv1130 | prpD2 | −152 | TGTGGATAAGTCCAGG | Rv1130(a)-Rv1131
Rv1436 | gap | −48 | GGTTGTTTAGCCAACA | Rv1436(a,f)-Rv1437(a,f)-Rv1438(a,f)
Rv1475c | acn | −462 | TGTAACTGCCGACATA | Rv1475c(a,f)-Rv1474c
Rv1552 | frdA | −284 | TGTGATCTAGGTCACG(b) | Rv1552-Rv1553-Rv1554-Rv1555
Rv1837c(a) | glcB | −381 | AGGGATGCACTACACA |
Rv1862(a) | adhA | −227 | CGTGGGGCGCCACACA |
Rv1872c(a) | lldD2 | −200 | GATGCCGTAGCGCACT |
Rv2029c(a) | pfkB | −410 | GGTGACGAGTCGCGCA | Rv2029c-Rv2028c-Rv2027c-Rv2026c(f)
Rv2967c(a,f) | pca | −389 | CGTGGTGGTGGTCACC |
Rv3010c(a) | pfkA | −532 | GGTGATGGCGATGACC |
Rv3316 | sdhC | −386 | CGAGACGTTTTCCACG | Rv3316-Rv3317-Rv3318(a)-Rv3319
Rv3676(a) | CRP | −538 | TGTCACCTACGACAGA |

Fatty acid metabolism
Rv0097 |  | −526 | TGTCACGCCGGCCACG | Rv0097-Rv0098(e)-Rv0099(c)-Rv0100(e)-Rv0101
Rv0166 | fadD5 | −84 | TGTGACCCAGACAACA |
Rv0400c(a,f) | fadE7 | −5 | AGTGATGAGCACCCCG |
Rv1185c | fadD21 | −168 | CGTGACGCCCCTCACG |
Rv1714 |  | −405 | GGTGACGGCGGCCACA | Rv1714(f)-Rv1715(f)-Rv1716-Rv1717-Rv1718
Rv2485c(c) | lipQ | −91 | TGTGATCCTCGACACA |
Rv2486 | echA14 | −287 | TGTGTCGAGGATCACA |
Rv2524c(a,f) | fas | −259 | CGTTACCCACGACACG |
Rv2930(c) | fadD26 | −498 | TGTTAATCTCGTCACA | Rv2930-Rv2931-Rv2932(f)-Rv2933-Rv2934-Rv2935-Rv2936(g)-Rv2937(g)-Rv2938-Rv2939
Rv3279c | birA | −38 | TATCGGCTGCCGCACA | Rv3279c(a)-Rv3278c(e)
Rv3280 | accD5 | −331 | CGGGACGTCGACCACA | Rv3280(a)-Rv3281(e,f)-Rv3282
Rv3549c |  | −67 | GGTGATCGGCATTGCA | Rv3549c-Rv3548c(a)

Nitrogen assimilation
Rv1538c | ansA | −187 | TGTGAGCACCACCACA |
Rv2220(a,f,g) | glnA1 | −1 | TGTGACGGAAAAGACG |
Rv2920c | amt | −2 | AGTGGACCAATTCCCC | Rv2920c(a)-Rv2919c(a,g)-Rv2918c(a)
Rv3859c | gltB | −398 | TGCTCCGGATTTCACA | Rv3859c(a,f)-Rv3858c(f)

PGRS
Rv0453 | PPE11 | −269 | GGTGACCAAACTCACG |
Rv1386 | PE15 | −133 | TGTGACCAAACTCACC(b) | Rv1386(e)-Rv1387(c,e)
Rv2408 | PE24 | −213 | GGTGATCGGCGTCACG |
Rv2591 | P_PGRS44 | −38 | CGTGACATGTGTCACA |
Rv3136(c) | PPE51 | −16 | AAGGAGCTGAGACACA |
Rv3650 | PE33 | −83 | TGTGATGCACTTGACA |

Respiration
Rv1161 | narG | −512 | TGCGTTGAACGGCACG | Rv1161(a)-Rv1162(a)-Rv1163(a)-Rv1164-Rv1165-Rv1166(f)
Rv1623c(c) | cydA | −181 | CGTGGTGATCGGCACA |
Rv1854c(a) | ndh | −109 | TGTGGCTGATGACACA |
Rv2193 | ctaE | −517 | GGTGGATAGGTTCACC | Rv2193(a,f)-Rv2194(a,f)-Rv2195(a,f)-Rv2196(a,f)
Rv2200c | ctaC | −23 | TGTGATACAGGAGGCG | Rv2200c(a,f)-Rv2199c
Rv3043c | ctaD | −227 | AGTGGATCGCATCCCG | Rv3043c(a,f)-Rv3042c(f)

Other cellular processes
Rv0019c(g) | fhaB | −69 | CGTGACTTTGCTGACG(b) |
Rv0079 |  | −110 | GGTGACACAGCCCACA | Rv0079-Rv0080
Rv0103c | ctpB | −159 | TGTGACGGGCGTCACA |
Rv0104 |  | −1 | TGTGACGCCCGTCACA |
Rv0145 |  | −59 | AGTGATGTGCCACACA(b) | Rv0145-Rv0146
Rv0188c |  | −356 | AGAGAACAACGTCGCA |
Rv0194 |  | −517 | TGTCATCTAGATCACG |
Rv0232 |  | −53 | CGTGATGCAGCGCACA | Rv0232-Rv0233
Rv0250c(e) |  | −37 | TGTGATCTGTAACACC |
Rv0360c |  | −2 | CGTGACCAAGCGCACA |
Rv0457c |  | −43 | TGTAATACAGCTCACA |
Rv0470A |  | −212 | TGTGGTGGGAATCACA |
Rv0483 | lprQ | −116 | TGTGTTTGGTATCACA |
Rv0793 |  | −375 | TGTGATGGTGCGCACG |
Rv0820(a,g) | phoT | −538 | GGTGGTGATCCGCACC |
Rv0867c(a) | rpfA | −443 | TGTGACATTACCCACA(b) |
Rv0884c(a,f) | serC | −91 | TGTGAGCTTGTTCACA(b) |
Rv0885 |  | −133 | TGTGAACAAGCTCACA | Rv0885-Rv0886
Rv0904c(a) | accD3 | −2 | CGTGAGTCGTATCACG |
Rv0928 | pstS3 | −6 | ACTGAATTGAAACTCA | Rv0928(a,g)-Rv0929(a,g)-Rv0930(a,g)
Rv0993 | galU | −8 | TGTGAACGATGTCACG | Rv0993(f)-Rv0994(g)-Rv0995(g)
Rv0950c |  | −153 | CGTGATCCAACTCACA(b) |
Rv0992c |  | −109 | CGTGACATCGTTCACA | Rv0992-Rv0991-Rv0990
Rv1009 | rpfB | −271 | GGTGGCGCTCATCACC | Rv1009(a)-Rv1010(a,g)-Rv1011(a,f)
Rv1057 |  | −248 | CGTGACCTAGGTAACA |
Rv1092c(a,f) | coaA | −242 | TGCCACGTAGGTCACG |
Rv1111c(e) |  | −411 | GGTGACATGAGTCACG |
Rv1158c |  | −69 | TGTCACTTGAGTCACA(b) | Rv1158c(e)-Rv1157c(e)
Rv1159 | pimE | −77 | TGTGACTCAAGTGACA |
Rv1230c |  | −79 | GGTGATCTAGTTCACG(b) |
Rv1291c |  | −323 | TGTGATCGGCGCCACC |
Rv1314c |  | −294 | GGTGATCCGGGCCACG |
Rv1324(e) |  | −104 | TGTGATCTTGGTCATA |
Rv1482c |  | −23 | TGTGACTCAGCACACT |
Rv1566c |  | −235 | CGTGACTGAAATCACA |
Rv1568 | bioA | −553 | TGTGATTTCAGTCACG | Rv1568-Rv1569(g)-Rv1570-Rv1571
Rv1592c(c) |  | −215 | TGTGATAGGCGCCACG |
Rv1757c |  | −351 | TGTGACGGCGGCCACG |
Rv1779c |  | −89 | TGTGAACAACACCACA |
Rv1780 |  | −147 | TGTGGTGTTGTTCACA |
Rv1890c |  | −7 | TGTGTCGTGGCCCACA |
Rv1891(e,g) |  | −63 | TGTGGGCCACGACACA | Rv1891-Rv1892-Rv1893(e)
Rv2145c(a,f) | wag31 | −463 | CGTGACTGGCGTCCCA |
Rv2172c(e) |  | −2 | TGTGACCCTCAACACG |
Rv2180c |  | −304 | TGTGTGGAACAACACA |
Rv2201(a,f) | asnB | −336 | GCTGTCGAAGACCACG |
Rv2258c |  | −459 | GGTGACGTCGACCACG |
Rv2362c | recO | −224 | TGTGGGCTGGCTCACA | Rv2362c-Rv2361c(f)-Rv2360c
Rv2377c | mbtH | −268 | TGTGGTTCACCTCACT |
Rv2406c |  | −34 | TGTGAACCAGCTCACC |
Rv2407 |  | −242 | GGTGAGCTGGTTCACA |
Rv2428(c) | ahpC | −93 | GGTGTGATATATCACC |
Rv2450c | rpfE | −509 | TGTGGCGCAGGTCACC |
Rv2450c | rpfE | −422 | CGTGATTCGGCTCACG |
Rv2455c |  | −237 | AGTGACCAATACCACA | Rv2455c-Rv2454c-Rv2453c
Rv2650c |  | −305 | CGTGAGGAGCCTCACG |
Rv2699c |  | −116 | TGTGATGTAAATCACA |
Rv2700(e,f) |  | −138 | TGTGATTTACATCACA |
Rv2712c |  | −296 | GGTGAGGTAGAGCACA |
Rv2835c | ugpA | −513 | GGTGATGCCGGGCACG | Rv2835c-Rv2834c(a)-Rv2833c(a,f)-Rv2832c(a,f)
Rv2874 | dipZ | −351 | TGTGGCGGAGTTCACA |
Rv3003c | ilvB1 | −335 | TGTGGTGGCCACCCCA | Rv3003c(a,f)-Rv3002c(a,f)-Rv3001c(a,f)
Rv3048c(a,f) | nrdF2 | −2 | GGTGACTGGAAACGCA |
Rv3053c | nrdH | −347 | GGTGATCTGCGACACG | Rv3053c-Rv3052c-Rv3051c(f)
Rv3217c(a,e) |  | −278 | TGTGGTGGCGGTCGCA |
Rv3219(a,c) | whiB1 | −176 | AGTGAGATAGCCCACG(b) |
Rv3613c(c) |  | −458 | CGTGACGAATCCCCCA |
Rv3617 | ephA | −315 | TGTGACCGGTGTCACT | Rv3617-Rv3618
Rv3645 |  | −179 | TGTGAGCCGAATCACG |
Rv3681c(a) | whiB4 | −106 | TGAGATACAGGTAACA |
Rv3729 |  | −190 | TGTGACCACGGCCACG |
Rv3843c |  | −505 | GGTGAGGTAAGTCACA | Rv3843c(e)-Rv3842c(g)
Rv3856c |  | −547 | TGTGGGCTTCGTCACA |
Rv3857c |  | −341 | TGTGGGCTTCGTCACA(b) |

Consensus |  |  | TGTGANNNNNNTCACA |

Crp binding sites detected by the TFBS search of the introduced pipeline and by the additional TFBS search with adapted and optimized PWMs. Bold letters indicate conserved pentamers of the motif. Codes:

(a) Transferred target gene from C. glutamicum.
(b) Experimentally verified binding site by EMSA/ChIP/RT-PCR (21–23).
(c) Gene showed altered expression in microarray studies of ΔRv3676 versus WT (24).
(d) Motif position relative to the translation start site.
(e) Core gene.
(f) Essential gene.
(g) Gene involved in virulence processes.
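How closely an individual site follows the consensus can be checked with a few lines of Python. The sketch below counts mismatches in the two conserved pentamers of TGTGANNNNNNTCACA and shows that the sites listed for the neighbouring genes Rv0103c and Rv0104 are reverse complements of each other, i.e. the same site read from either strand. The function names are hypothetical; the sequences are copied from Table 3.

```python
CONSENSUS = "TGTGANNNNNNTCACA"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def mismatches(site, consensus=CONSENSUS):
    """Count positions that differ from the consensus, ignoring N positions."""
    return sum(1 for s, c in zip(site, consensus) if c != "N" and s != c)

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

site_rv0103c = "TGTGACGGGCGTCACA"  # ctpB
site_rv0104 = "TGTGACGCCCGTCACA"
site_pckA = "TGTGAGCAGGCTTATA"     # Rv0211

print(mismatches(site_rv0103c))
print(mismatches(site_pckA))
print(reverse_complement(site_rv0103c) == site_rv0104)
```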

In total, we identified 207 putative target genes of Crp, organized in 121 transcription units (see Table 3 and Figure 4). Of this set, 17 genes belong to the mycobacterial core regulon (62) and 41 were reported as essential for M. tuberculosis (63,64). Furthermore, at least 17 genes of the suggested regulon are connected to antibiotic resistance and virulence of M. tuberculosis (65–69). Based on annotation information for M. tuberculosis (69), knowledge about orthologous C. glutamicum genes and operon structures, we attributed individual target genes to distinct functional modules.
Figure 4.

Reconstructed network of the GlxR ortholog Crp. The network reconstruction of the Crp regulon is based on the 121 transcription units presented in Table 3. It was generated by GraphVis, the integrated network reconstruction tool of MycoRegNet. Transcription units relying on binding site predictions/experimental verifications that were reported previously in (22–24,60) and correspond with our findings are colored according to the appropriate publication. Arrows and gene IDs (node labels) colored in red indicate repression by Crp; green arrows correspond to activation.

Similar to present knowledge on GlxR, results implicate Crp in the regulation of several functional modules such as carbohydrate metabolism (40 target genes), fatty acid metabolism (33 target genes), respiration (16 target genes) and nitrogen assimilation (7 target genes). Therefore, the position of the GlxR homolog Crp as global regulator in the transcriptional regulatory network seems to be conserved in M. tuberculosis. It is interesting to note that the suggested regulon comprises genes involved in essential functional modules, e.g. the citrate cycle, as well as genes involved in the synthesis of the cellular envelope, which plays an important role in the virulence of M. tuberculosis. Together with the supposed regulation of further virulence-associated genes, this might explain why a functional Crp is required for virulence in model systems (24).

CONCLUSIONS

With MycoRegNet, we have set up a system that allows researchers in the tuberculosis community to perform comprehensive analyses and visualizations of the gene regulatory network of MT. With its TFBS prediction, it further provides easy access to a method that helps to generate new hypotheses in silico. As the sister project to CoryneRegNet, the MycoRegNet database content was generated through our comparative genomics pipeline, which provided us with reliable transfers of gene regulatory interactions from the reference organism C. glutamicum to M. tuberculosis. With MycoRegNet, the corresponding data are publicly available and can be accessed easily through the web interface, or in a well-structured manner through the MycoRegNet Web Service, to support the reconstruction, visualization and validation of mycobacterial regulatory networks at different hierarchical levels. Taken together, MycoRegNet is a reference resource that helps the tuberculosis community gain a better understanding of the complex relationships in transcriptional gene control. It has the potential to assist researchers in the development of new vaccines and drugs to treat and prevent tuberculosis. Although MycoRegNet was initially designed for MT, it may also serve other mycobacterial strains in the future, such as the already integrated M. tuberculosis CDC1551.