MycoRRdb: a database of computationally identified regulatory regions within intergenic sequences in mycobacterial genomes.

Mohit Midha, Nirmal K Prasad, Vaibhav Vindal.

Abstract

The identification of regulatory regions for a gene is an important step towards deciphering gene regulation. Regulatory regions tend to be conserved during evolution, which facilitates the application of comparative genomics to identify such regions. The present study makes use of this attribute to identify regulatory regions in Mycobacterium species, followed by the development of a database, MycoRRdb. It consists of the regulatory regions identified within the intergenic regions of 25 mycobacterial species. MycoRRdb allows users to retrieve the identified intergenic regulatory elements in the mycobacterial genomes. In addition to the predicted motifs, it also allows users to retrieve the Reciprocal Best BLAST Hits across the mycobacterial genomes. It is a useful resource for understanding the transcriptional regulatory mechanisms of mycobacterial species. This database is the first of its kind that specifically addresses cis-regulatory regions and is comprehensive across the mycobacterial species. Database URL: http://mycorrdb.uohbif.in.

Year:  2012        PMID: 22563442      PMCID: PMC3338573          DOI: 10.1371/journal.pone.0036094

Source DB:  PubMed          Journal:  PLoS One        ISSN: 1932-6203            Impact factor:   3.240


Introduction

Over the past few years the repertoire of mycobacterial genome sequences has increased tremendously. The availability of complete genome sequences makes it possible to employ computational approaches efficiently to understand genome function and complexity [1]. An important aspect of comparing genome sequences is finding orthologous proteins among the existing species [2], [3]. The identification of orthologs is important not only to assist the functional annotation of a gene but also to identify its regulatory region. Regulatory regions are known to evolve at a slower rate than non-functional elements, and therefore finding conserved DNA motifs within non-coding regions is an efficient method to predict them [4], [5]. Different approaches have been used to find regulatory regions [6]–[8]. Generally, identification of these DNA elements relies on an extensive set of known target genes [4], [9]. Therefore, identification of the regulatory region for a novel transcriptional regulator remains a challenging task. Extensive research on mycobacteria has produced a number of online resources providing information on pathogenicity, cellular physiology, operon arrangement, microarray data, etc. [10]–[15]. These resources also include a database, MtbRegList, which contains the reported regulatory regions in Mycobacterium tuberculosis [16]. Nevertheless, there is still a need to document the putative regulatory regions for all the mycobacterial genomes. The present study addresses this issue: it identifies the putative cis-regulatory sequences within the intergenic regions of mycobacterial species, as well as similar DNA motifs within each genome. In addition to the predicted regulatory regions, the database includes a list of Reciprocal Best BLAST Hits (RBBHs) for all 25 mycobacterial species. The database also has a search feature to identify sequences similar to a query DNA motif. This database can assist in the characterization of gene regulation in all the mycobacterial species.

Methods

Retrieval and filtering of the genome sequences

The complete genome sequences of 25 Mycobacterium species were downloaded from the NCBI FTP site (ftp://ftp.ncbi.nih.gov/genomes/Bacteria/). In certain species, some proteins were found to be present in more than one copy with identical sequence. In the present study, such Multiple Identical Proteins (MIPs) were identified and replaced with a single representative protein sequence for further analysis.
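The MIP filtering step can be illustrated with a minimal sketch. This is not the authors' script; the function name `collapse_identical_proteins` and the choice of the first-seen protein as the representative are assumptions made purely for illustration.

```python
def collapse_identical_proteins(proteins):
    """Keep one representative per distinct protein sequence.

    `proteins` is an iterable of (protein_id, sequence) pairs.
    Returns (representatives, groups):
      representatives: first-seen id -> sequence
      groups: representative id -> list of all ids sharing that sequence
    Assumption: the first occurrence is kept as the representative.
    """
    rep_by_seq = {}        # normalized sequence -> representative id
    representatives = {}
    groups = {}
    for pid, seq in proteins:
        key = seq.upper().rstrip("*")   # ignore case and a trailing stop symbol
        if key not in rep_by_seq:
            rep_by_seq[key] = pid
            representatives[pid] = seq
            groups[pid] = [pid]
        else:
            groups[rep_by_seq[key]].append(pid)
    return representatives, groups


if __name__ == "__main__":
    reps, groups = collapse_identical_proteins(
        [("p1", "MKVLA"), ("p2", "MKVLA"), ("p3", "MKVLT")]
    )
    print(sorted(reps))    # representative ids
    print(groups["p1"])    # ids collapsed into p1
```

Keeping the `groups` mapping makes it possible to propagate any result obtained for the representative back to its identical copies.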

Identification of orthologs

The Reciprocal Best BLAST Hit (RBBH) method was used to predict orthologous proteins in mycobacterial proteomes. Pairs of proteins from two mycobacterial species were selected as RBBHs when the alignment covered at least 50% of the sequence length of both proteins and the E-value was lower than 10^-20 in both directions, using the BLASTP program with all other parameters at default values [17]–[19].
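The RBBH selection can be sketched as follows, assuming the BLASTP results of both directions have already been parsed into records. The `Hit` record, its field names, the use of bit score to rank hits, and applying the thresholds before picking the best hit are all assumptions of this sketch, not details given in the text.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bitscore: float
    q_aligned: int   # aligned residues on the query
    s_aligned: int   # aligned residues on the subject
    q_len: int       # full query length
    s_len: int       # full subject length


def passes(hit, max_evalue=1e-20, min_cov=0.5):
    """E-value below threshold and >= 50% coverage of both proteins."""
    return (hit.evalue < max_evalue
            and hit.q_aligned / hit.q_len >= min_cov
            and hit.s_aligned / hit.s_len >= min_cov)


def best_hits(hits):
    """Map each query to the subject of its highest-scoring passing hit."""
    best = {}
    for h in hits:
        if not passes(h):
            continue
        current = best.get(h.query)
        if current is None or h.bitscore > current.bitscore:
            best[h.query] = h
    return {q: h.subject for q, h in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

A pair is reported only when each protein is the other's best passing hit, so a second protein that also hits the same subject in one direction is not paired with it.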

Retrieval of operons and the intergenic sequences

Information on all mycobacterial operons in the genomes of all 25 species was retrieved from the DOOR database (version 2) [20], [21]. The intergenic sequence upstream of the first gene of each operon was retrieved using a Perl script. Sets of intergenic sequences were compiled for each orthologous gene. These sets of sequences were then subjected to the identification of a regulatory region.
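The authors used a Perl script for this step; the sketch below is an independent Python illustration of the same idea, not a reconstruction of that script. It assumes 1-based inclusive gene coordinates, a linear (non-circular) sequence, and that the intergenic region extends only to the nearest neighbouring gene; the function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_intergenic(genome, genes, first_gene):
    """Intergenic sequence upstream of `first_gene`, read 5'->3' on its strand.

    genome: DNA string
    genes: list of (start, end, strand) tuples, 1-based inclusive
    first_gene: the (start, end, strand) of the operon's first gene
    Returns "" when a neighbouring gene overlaps or abuts the gene start.
    """
    start, end, strand = first_gene
    if strand == "+":
        # Nearest gene end to the left of this gene's start.
        left = max((e for s, e, _ in genes if s < start), default=0)
        return genome[left:start - 1]
    # Minus strand: upstream lies to the right in genome coordinates.
    right = min((s for s, e, _ in genes if e > end), default=len(genome) + 1)
    return reverse_complement(genome[end:right - 1])
```

For a minus-strand gene the region is reverse complemented so that every sequence in a set of orthologous intergenic regions is oriented the same way relative to its gene.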

Identification of regulatory regions

The MEME suite was used to identify the conserved regulatory DNA elements from the sets of sequences described above [4]. The DNA motif length, from a minimum of 20 bases to a maximum of 30 bases, was optimized using known DNA targets from M. tuberculosis [16]. The search was carried out to look for palindromes within the given strand as well as its complementary strand. The top predicted DNA motifs associated with three or more orthologous sequences were selected as potential regulatory DNA elements. All other parameters were kept at their default values. These DNA motifs were then searched in their respective genomes to identify significantly similar motifs with a minimum aligned length (L) of 16 bases, allowing N mismatches where N ≤ 0.2L and L − N > 14.
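The mismatch criterion (N no greater than 0.2L, with L − N greater than 14 and L at least 16) can be made concrete with a short sketch. It makes simplifying assumptions: L is taken to be the full motif length, the comparison is ungapped, and both strands are scanned by brute force. The function names are hypothetical and this is not the search tool used in the study.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(_COMPLEMENT)[::-1]


def max_mismatches(length):
    """Largest integer N with N <= 0.2*L and L - N > 14; None if L < 16."""
    if length < 16:
        return None
    return min(length // 5, length - 15)


def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def scan_genome(genome, motif):
    """Return (offset, strand, mismatches) for every acceptable match.

    offset is 0-based on the forward strand; strand "-" means the window
    matches the reverse complement of the motif.
    """
    allowed = max_mismatches(len(motif))
    if allowed is None:
        return []
    width = len(motif)
    patterns = (("+", motif), ("-", reverse_complement(motif)))
    hits = []
    for i in range(len(genome) - width + 1):
        window = genome[i:i + width]
        for strand, pattern in patterns:
            n = hamming(window, pattern)
            if n <= allowed:
                hits.append((i, strand, n))
    return hits
```

Under these rules a 16-base match tolerates at most one mismatch, while a 20-base match tolerates four and a 30-base match six.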

Database development

Following the identification of regulatory regions from all mycobacterial genomes, a web resource, MycoRRdb, was developed. The database was built using MySQL and is designed to let users browse the results of the study in an easily accessible manner. The web interface of the database is implemented using PHP, HTML and JavaScript. A flowchart of the methodology followed in the study is shown in Figure 1.
Figure 1

Flowchart of the methodology.

Results and Discussion

RBBHs across the mycobacterial species

Ortholog prediction is important not only for identifying regulatory regions but also for the functional annotation of a sequenced genome. Our study began with the identification of RBBHs, which serve as potential orthologs. All RBBHs from the mycobacterial species were identified using the methodology described above. The resulting lists of RBBHs for each mycobacterial gene across all 25 mycobacterial genomes were used as a data source for MycoRRdb.

Mycobacterial regulatory regions

Subsequently, DNA regulatory regions were identified across all 25 Mycobacterium species. In total, 37101 regulatory motifs were predicted for the 25 mycobacterial genomes. The motifs predicted across the mycobacterial species were then compared with the known DNA motifs reported in the literature [3], [5], [16], [22]–[41]. Of the 181 DNA motifs retrieved, 116 were mapped in MycoRRdb and are indicated through a link given in the database. The comparative list of the predicted and the reported DNA motifs is given in Table S1. The maximum number of motifs was predicted from Mycobacterium tuberculosis H37Ra, while the minimum number was from Mycobacterium abscessus ATCC 19977. These predicted DNA motifs are putative Transcription Factor Binding Sites (TFBS). TFBS positioned more than 400 nucleotides upstream of the translational start site are highlighted in red font. Further, in view of over-representation, similar DNA motifs were searched for within the predicted list of intergenic regulatory regions. All the identified motifs are displayed with strand information and their position relative to the translational start site.

Database access

MycoRRdb can be accessed through the database web interface at http://mycorrdb.uohbif.in. Two kinds of data are stored in MycoRRdb: (i) Reciprocal Best BLAST Hits (RBBHs), and (ii) the predicted regulatory region for each transcription unit (Figure 2). This information can be retrieved for any mycobacterial gene by either browsing or searching. The homepage of the database provides links for each mycobacterial genome, which lead to the complete list of genes/protein ids/ORF ids of that species. From the list one can proceed to find the RBBHs of any gene across the other mycobacterial species and the associated regulatory DNA motifs, along with their occurrence in the orthologous intergenic sequences. A link is also provided to retrieve the known motifs reported in the literature. In addition, a list of similar DNA motifs in a genome is available (Figure 3).
Figure 2

Schematic architecture of MycoRRdb.

Figure 3

A browsable interface to retrieve RBBHs and DNA motifs.

Besides browsing data from the complete gene list, separate links are available on the web interface to quickly retrieve RBBHs or regulatory DNA motifs by gene name/protein id/ORF id. A searchable interface to retrieve RBBHs is shown in Figure 4A. The predicted regulatory regions and the similar sequences present in that genome can also be retrieved through the searchable interface using gene name/protein id/ORF id (Figure 4B). Moreover, users can check whether a DNA sequence of interest occurs in the set of identified DNA motifs of any mycobacterial species in the database (Figure 4C).
Figure 4

A searchable mode to retrieve RBBHs and DNA motifs.

A. Interface to retrieve the RBBHs; B. Interface to retrieve the regulatory DNA motifs; C. Interface to retrieve the similar DNA motifs to the desired DNA sequence.

This database is under constant development to incorporate experimentally validated DNA motifs. It also provides a link for biologists to submit experimentally validated mycobacterial regulatory regions, if any.

Conclusions

The availability of whole genome sequences makes Mycobacterium one of the most highly sequenced genera. This wealth of sequence data provides a unique opportunity to extract genome information in order to address cellular physiology and to develop better intervention strategies for pathogenic species. This study is a systematic approach to reveal the putative regulatory regions and RBBHs across the mycobacterial species. On the one hand, the identified regulatory regions will help in understanding the transcriptional regulation of mycobacterial genes; on the other hand, the identified RBBHs will help transfer functional knowledge from one gene to another. The availability of all the identified regulatory regions and RBBHs from the mycobacterial species in a web resource, MycoRRdb, will ease access to the data and has potential implications for unravelling the genomic complexity of the mycobacteria.

Table S1. DNA motifs in MycoRRdb mapped with regulatory regions reported in literature. (Available at: http://mycorrdb.uohbif.in/links.php). (XLS)
  41 in total

1.  The role of multiple SOS boxes upstream of the Mycobacterium tuberculosis lexA gene--identification of a novel DNA-damage-inducible gene.

Authors:  Edith M Dullaghan; Patricia C Brooks; Elaine O Davis
Journal:  Microbiology       Date:  2002-11       Impact factor: 2.777

2.  A nickel-cobalt-sensing ArsR-SmtB family repressor. Contributions of cytosol and effector binding sites to metal selectivity.

Authors:  Jennifer S Cavet; Wenmao Meng; Mario A Pennella; Rebecca J Appelhoff; David P Giedroc; Nigel J Robinson
Journal:  J Biol Chem       Date:  2002-08-05       Impact factor: 5.157

3.  Identification of some DNA damage-inducible genes of Mycobacterium tuberculosis: apparent lack of correlation with LexA binding.

Authors:  P C Brooks; F Movahedzadeh; E O Davis
Journal:  J Bacteriol       Date:  2001-08       Impact factor: 3.490

4.  Identification and characterization of two divergently transcribed iron regulated genes in Mycobacterium tuberculosis.

Authors:  G M Rodriguez; B Gold; M Gomez; O Dussurget; I Smith
Journal:  Tuber Lung Dis       Date:  1999

5.  Expression, autoregulation, and DNA binding properties of the Mycobacterium tuberculosis TrcR response regulator.

Authors:  Shelley E Haydel; William H Benjamin; Nancy E Dunlap; Josephine E Clark-Curtiss
Journal:  J Bacteriol       Date:  2002-04       Impact factor: 3.490

6.  The Mycobacterium tuberculosis IdeR is a dual functional regulator that controls transcription of genes involved in iron acquisition, iron storage and survival in macrophages.

Authors:  B Gold; G M Rodriguez; S A Marras; M Pentecost; I Smith
Journal:  Mol Microbiol       Date:  2001-11       Impact factor: 3.501

7.  A novel copper-responsive regulon in Mycobacterium tuberculosis.

Authors:  Richard A Festa; Marcus B Jones; Susan Butler-Wu; Daniel Sinsimer; Russell Gerads; William R Bishai; Scott N Peterson; K Heran Darwin
Journal:  Mol Microbiol       Date:  2010-10-29       Impact factor: 3.501

8.  Definition of the mycobacterial SOS box and use to identify LexA-regulated genes in Mycobacterium tuberculosis.

Authors:  Elaine O Davis; Edith M Dullaghan; Lucinda Rand
Journal:  J Bacteriol       Date:  2002-06       Impact factor: 3.490

9.  Dissection of the heat-shock response in Mycobacterium tuberculosis using mutants and microarrays.

Authors:  Graham R Stewart; Lorenz Wernisch; Richard Stabler; Joseph A Mangan; Jason Hinds; Ken G Laing; Douglas B Young; Philip D Butcher
Journal:  Microbiology       Date:  2002-10       Impact factor: 2.777

10.  ideR, an essential gene in Mycobacterium tuberculosis: role of IdeR in iron-dependent gene expression, iron metabolism, and oxidative stress response.

Authors:  G Marcela Rodriguez; Martin I Voskuil; Benjamin Gold; Gary K Schoolnik; Issar Smith
Journal:  Infect Immun       Date:  2002-07       Impact factor: 3.441

