Blanca Taboada, Ricardo Ciria, Cristian E Martinez-Guerrero, Enrique Merino.
Abstract
The Prokaryotic Operon DataBase (ProOpDB, http://operons.ibt.unam.mx/OperonPredictor) constitutes one of the most precise and complete repositories of operon predictions now available. Using our novel and highly accurate operon identification algorithm, we have predicted the operon structures of more than 1200 prokaryotic genomes. ProOpDB offers diverse alternatives by which a set of operon predictions can be retrieved including: (i) organism name, (ii) metabolic pathways, as defined by the KEGG database, (iii) gene orthology, as defined by the COG database, (iv) conserved protein domains, as defined by the Pfam database, (v) reference gene and (vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient method to select the most representative organisms based on a precompiled phylogenetic distances matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes.
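The abstract does not describe how representative organisms are chosen from the precompiled distance matrix. As an illustration only, the following minimal sketch shows one common way to do this kind of filtering: a greedy pass that keeps an organism only if it is at least a chosen distance away from every organism already kept. The function name `select_representatives`, the `min_distance` threshold, the placeholder organism names and the toy distances are all assumptions made for this example and are not taken from ProOpDB.

```python
from typing import List, Sequence


def select_representatives(
    names: Sequence[str],
    dist: Sequence[Sequence[float]],
    min_distance: float,
) -> List[str]:
    """Greedy non-redundant selection (hypothetical helper, not ProOpDB's method).

    Organisms are visited in the order given. An organism is kept only if
    its distance to every organism already kept is >= min_distance.
    """
    kept: List[int] = []
    for i in range(len(names)):
        if all(dist[i][j] >= min_distance for j in kept):
            kept.append(i)
    return [names[i] for i in kept]


# Toy symmetric distance matrix with made-up values, for illustration only.
names = ["strain_A1", "strain_A2", "strain_B", "strain_C"]
dist = [
    [0.00, 0.02, 0.40, 0.55],
    [0.02, 0.00, 0.41, 0.56],
    [0.40, 0.41, 0.00, 0.08],
    [0.55, 0.56, 0.08, 0.00],
]

representatives = select_representatives(names, dist, min_distance=0.10)
print(representatives)
```

With this toy matrix and a threshold of 0.10, `strain_A2` is dropped as too close to `strain_A1`, and `strain_C` is dropped as too close to `strain_B`. The result of a greedy pass depends on the input order, so a real pipeline would likely fix that order by some quality criterion.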
Year: 2011 PMID: 22096236 PMCID: PMC3245079 DOI: 10.1093/nar/gkr1020
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Operon structures of genes participating in the thiamine metabolism pathway (KEGG 00730) in different organisms. Among the diverse alternatives offered by ProOpDB, selection based on KEGG pathways allows comparison of the different transcription units that belong to a specific metabolic process in different organisms. (a) The great diversity of operon organization involved in thiamine metabolism can be observed. Note that genes in the ProOpDB output are colored according to the feature (phylogenetic: COG; metabolic: KEGG; conserved protein domains: Pfam) that was used in the operon retrieval process. In our example, the genes are colored based on the KEGG pathway annotations, so potential relationships between metabolic pathways can be inferred. For example, genes that belong to thiamine metabolism (KEGG 00730, yellow) are part of operons that also co-transcribe genes of the sulfur relay system (KEGG 04122, red) and genes of purine metabolism (KEGG 00230, orange) in Aquifex aeolicus (Aquificae), Corynebacterium diphtheriae gravis (Actinobacteria) and E. coli K-12 MG1655 (Proteobacteria). (b) The 5′ and 3′ regulatory sequences of the operon, as well as the protein and nucleotide sequences of the genes, can be retrieved for specific analyses by particular user programs. (c) Fingerprint analyses can be performed using the programs installed locally on the ProOpDB web server. Redundant sequences are eliminated using the CD-HIT program (15) prior to the analysis of over-represented motifs using the MEME program (4).