Salim Bougouffa, Aleksandar Radovanovic, Magbubah Essack, Vladimir B Bajic.
Abstract
Microorganisms are known to counteract salt stress through salt influx or by the accumulation of osmoprotectants (also called compatible solutes). Understanding the pathways that synthesize and/or break down these osmoprotectants is of interest to studies of crop halotolerance and to biotechnology applications that use microbes as cell factories for the production of biomass or commercial chemicals. To facilitate the exploration of osmoprotectants, we have developed the first online resource, 'Dragon Explorer of Osmoprotection associated Pathways' (DEOP), which gathers and presents curated information about osmoprotectants, complemented by information about the reactions and pathways that use or affect them. A combined total of 141 compounds were confirmed as osmoprotectants and were matched to 1883 reactions and 834 pathways. DEOP can also be used to map genes or microbial genomes to potential osmoprotection-associated pathways, and thus link genes and genomes to other associated osmoprotection information. Moreover, DEOP provides a text-mining utility to search deeper into the scientific literature for supporting evidence or for new associations of osmoprotectants with pathways, reactions, enzymes, genes or organisms. Two case studies are provided to demonstrate the usefulness of DEOP. Database URL: http://www.cbrc.kaust.edu.sa/deop/
Year: 2014 PMID: 25326239 PMCID: PMC4201361 DOI: 10.1093/database/bau100
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. DEOP conceptual diagram.
DEOP statistics
| Type | Count |
|---|---|
| Osmoprotectants | 109 |
| Orphan osmoprotectants (a) | 34 |
| Candidate osmoprotectants (b) | 3 |
| Osmoprotection related (c) | 23 |
| Other compounds (d) | 2738 |
| Pathways | 834 |
| Reactions affecting one or more osmoprotectants | 1883 |
| Other reactions (e) | 2166 |
| Genes | 3529 |
| Enzymes | 4899 |
| Organisms | 1160 |
(a) An orphan osmoprotectant is one that we could not link to a metabolic pathway.
(b) A candidate osmoprotectant is one that was reported to accumulate in a microorganism under stress but was not labelled as an osmoprotectant.
(c) Osmoprotection related: compounds that play a role in the osmoprotection response but are not accumulated as osmolytes.
(d) Other compounds: the remaining compounds in a reaction in which a curated osmoprotectant is involved.
(e) Other reactions: reactions that are part of a pathway in which one or more osmoprotectants are involved but that do not directly affect any of the osmoprotectants.
Data integration database
| Name | Records | Description | Source |
|---|---|---|---|
| Chemical Entities of Biological Interest (ChEBI) | 38 580 | The database of chemical entities of biological interest | |
| ChEBI ontology | 29 974 | ||
| Enzyme | 5 418 | Enzyme nomenclature | ca.expasy.org/enzyme |
| Gene | 8 927 911 | NCBI gene database | |
| Functional association data/networks (GeneMania) | 21 084 | Gene associations database | genemania.org |
| GO | 34 940 | Gene ontology database | |
| GOA | 11 300 749 | Gene ontology annotation | |
| HUGO Gene nomenclature | 35 795 | Human genes nomenclature | |
| Human major histocompatibility complex | 6 939 | Human major histocompatibility complex (HLA) sequences | |
| Immunoglobulins and T-cell receptors nucleotide sequences | 156 529 | The international imMunoGeneTics information system | |
| Interpro | 21 749 | Protein sequence analysis and classification | |
| KEGG module | 196 659 | Collection of functional units used for annotation and biological interpretation of sequenced genomes. | |
| KEGG pathway | 262 432 | Pathway maps on the molecular interaction and reaction networks for biological interpretation of higher-level systemic functions. | |
| KEGG ligand compound | 34 182 | Database of chemical substances and reactions that are relevant to life. | |
| KEGG ligand enzyme | 6 118 | ||
| KEGG Ligand Glycan | 10 985 | ||
| KEGG Ligand Reaction | 9 400 | ||
| Oxford Human Mouse grid | 17 834 | Laboratory mouse genetic, genomic and biological data resources. | |
| Pfam-A | 12 273 | Collection of protein families, each represented by multiple sequence alignments and hidden Markov models | pfam.sanger.ac.uk |
| Pfam-B | 233 174 | ||
| Pfam seed | 12 273 | ||
| PRINTS | 2 050 | Protein fingerprints: groups of conserved motifs used to characterize a protein family | bioinf.man.ac.uk/dbbrowser/PRINTS |
| Prosite | 2 247 | Documentation entries describing protein domains, families, functional sites and associated patterns and profiles to identify them. | |
| Prosite documentation | 1 621 | ||
| REBASE | 5 020 | The restriction enzyme database. | rebase.neb.com/rebase/rebase.html |
| RefSeq | 18 236 994 | Set of reference sequences including genomic, transcript, and protein. | |
| UniProt/Swiss-Prot | 531 473 | Protein database, manually annotated and reviewed. | |
| Taxonomy | 817 120 | Classification and nomenclature for all of the organisms in the public sequence databases. | |
| UniProt/TrEMBL | 16 504 022 | Protein database, automatically annotated and not reviewed. | |
| Unigene | 2 652 777 | NCBI database of the transcriptome. | |
| Uniprot KB | 17 035 495 | Protein knowledgebase (Swiss-Prot + TrEMBL). | |
| Total records | 77 163 817 | | |
Synonyms extracted from NCBI Gene database
| Synonym type | Number of synonyms |
|---|---|
| Alias | 756 689 |
| Alternate_name | 405 135 |
| Gene_Symbol | 1 366 580 |
| Locus_tag | 9 824 390 |
| Official_Full_Name | 109 969 |
| Official_Symbol | 296 280 |
| Total synonyms | 12 759 043 |
Figure 2. Values in the left panel are pathway completeness estimates. Values in the right panel are the total hits for each pathway, normalized against its completeness. A minimum completeness of 50% was applied to both panels: for each pathway, at least one sample had to meet this threshold. The left panel shows the presence or absence of selected pathways in the mangrove samples vs the control data sets, whereas the right panel shows the enrichment of these pathways.
Figure 3. Values in the left panel are pathway completeness estimates. Values in the right panel are the total hits for each pathway, normalized against its completeness. A minimum completeness of 75% was applied to both panels: for each pathway, at least one sample had to meet this threshold. The left panel shows the presence or absence of selected pathways in the mangrove samples vs the control data sets, whereas the right panel shows the enrichment of these pathways.
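The minimum-completeness rule described in the captions of Figures 2 and 3 can be illustrated with a minimal Python sketch. It assumes completeness is stored as a fraction between 0 and 1 for each pathway and sample. The function name `filter_pathways`, the data layout and the example values are hypothetical and are not taken from DEOP. Only the filtering step is sketched, because the captions do not give the normalization formula.

```python
from typing import Dict

# Hypothetical layout: pathway name -> {sample name -> completeness in [0, 1]}.
Completeness = Dict[str, Dict[str, float]]


def filter_pathways(completeness: Completeness, min_completeness: float) -> Completeness:
    """Keep a pathway if at least one sample reaches the minimum completeness."""
    return {
        pathway: per_sample
        for pathway, per_sample in completeness.items()
        if any(value >= min_completeness for value in per_sample.values())
    }


# Made-up values for illustration only; these are not results from the paper.
example: Completeness = {
    "pathway_A": {"mangrove_1": 0.80, "control_1": 0.20},
    "pathway_B": {"mangrove_1": 0.60, "control_1": 0.30},
    "pathway_C": {"mangrove_1": 0.40, "control_1": 0.10},
}

kept_at_50 = filter_pathways(example, 0.50)  # threshold quoted for Figure 2
kept_at_75 = filter_pathways(example, 0.75)  # threshold quoted for Figure 3
```

A pathway is kept when any single sample meets the threshold, so a pathway that is complete in the mangrove samples but largely absent from the controls stays visible in both panels.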