Márcia Duarte, Ruy Jauregui, Ramiro Vilchez-Vargas, Howard Junca, Dietmar H Pieper.
Abstract
Understanding prokaryotic transformation of recalcitrant pollutants and the in-situ metabolic networks involved requires the integration of massive amounts of biological data. Decades of biochemical studies, together with novel next-generation sequencing data, have exponentially increased the information available on aerobic aromatic degradation pathways. However, the majority of protein sequences in public databases have not been experimentally characterized, and homology-based methods are still the most routinely used approach to assign protein function, which allows misannotations to propagate. AromaDeg is a web-based resource targeting aerobic degradation of aromatics that comprises recently updated (September 2013) and manually curated databases constructed using a phylogenomic approach. Grounded in phylogenetic analyses of protein sequences of key catabolic protein families and of proteins of documented function, AromaDeg allows querying and data mining of novel genomic, metagenomic or metatranscriptomic data sets. Essentially, each query sequence that matches a given protein family of AromaDeg is associated with a specific cluster of a given phylogenetic tree, and further functional annotation and/or substrate specificity may be inferred from neighboring cluster members with experimentally validated function. This allows a detailed characterization of individual protein superfamilies as well as high-throughput functional classification. Thus, AromaDeg addresses the deficiencies of homology-based protein function prediction by combining phylogenetic tree construction with the integration of experimental data to obtain more accurate annotations of new biological data related to aerobic aromatic biodegradation pathways. In the future, we intend to expand AromaDeg to other enzyme families involved in aromatic degradation and to update it regularly. Database URL: http://aromadeg.siona.helmholtz-hzi.de
Year: 2014 PMID: 25468931 PMCID: PMC4250580 DOI: 10.1093/database/bau118
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Aerobic metabolism of selected aromatics via di- or trihydroxylated intermediates. Reactions catalyzed by Rieske non-heme iron oxygenases are indicated by an R, those catalyzed by extradiol dioxygenases of the vicinal chelate superfamily by an E, those catalyzed by enzymes of the LigB superfamily by an L and those catalyzed by enzymes of the cupin superfamily by a C. Ring-cleavage products are channeled to the Krebs cycle via central reactions.
Figure 2. Evolutionary relationships of α-subunits of the salicylate family of Rieske non-heme iron oxygenases. Protein sequences were aligned with MAFFT, and the phylogenetic tree was constructed with MEGA5 using the neighbor-joining algorithm with p-distance correction and pairwise deletion of gaps and missing data. A total of 100 bootstrap replications were performed to test branch robustness (bootstrap values are shown adjacent to each cluster node), and redundant protein sequences (>95–99% sequence identity) were removed. According to the documented substrate specificity of representative members, the sequences cluster as follows: Clusters I, XIV and XV: salicylate 1-hydroxylases; Cluster II: salicylate 5-hydroxylases; Clusters III, IV, V and VI: Rieske oxygenases related to salicylate 5-hydroxylases; Cluster VII: chlorobenzoate dioxygenases; Clusters VIII, X, XI and XII: Rieske oxygenases related to terephthalate dioxygenases; Cluster IX: terephthalate dioxygenases; Cluster XVI: probable salicylate 1-hydroxylases; Cluster XVII: anthranilate dioxygenases of Burkholderia and some other organisms. Further information about each cluster is included in Table 1.
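The distance measure named in the Figure 2 caption can be illustrated with a minimal Python sketch. It computes the p-distance (the proportion of compared sites at which two aligned sequences differ) with pairwise deletion, meaning a column is skipped only when one of the two sequences being compared has a gap or missing-data symbol there. This is not MEGA5's code: the function names, the toy sequences and the choice of `-` and `?` as gap and missing-data symbols are assumptions made for illustration.

```python
from itertools import combinations

GAP_OR_MISSING = "-?"  # assumed symbols for alignment gaps and missing data


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of compared sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be rows of the same alignment")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Pairwise deletion: drop the site for this pair only.
        if a in GAP_OR_MISSING or b in GAP_OR_MISSING:
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no sites left to compare after pairwise deletion")
    return differing / compared


def distance_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """p-distance for every unordered pair of sequences in the alignment."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(alignment, 2)
    }


# Toy alignment with hypothetical sequence names (not AromaDeg data).
toy = {"seq_a": "MKT-LV", "seq_b": "MRTAL?", "seq_c": "MKTALV"}
for pair, dist in distance_matrix(toy).items():
    print(pair, dist)
```

A matrix like this is the input that a neighbor-joining implementation would consume. Because deletion is pairwise, each pair may be compared over a different number of sites, which keeps more data than removing every gapped column from the whole alignment.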
Table 1. Phylogenomic clusters of α-subunits of the salicylate family of Rieske non-heme iron oxygenases
| Cluster | Representative sequence | Annotation | Substrate | Abbreviation | PubMed ID |
|---|---|---|---|---|---|
| I | BAC65453 | Salicylate 1-hydroxylases | Salicylate | Sa1 | |
| II | AAD12607 | Salicylate 5-hydroxylases | Salicylate | Sa5 | |
| III | ZP_06687231 | Rieske oxygenases related to salicylate 5-hydroxylases | Unknown | Non | |
| IV | NP_928334 | Rieske oxygenases related to salicylate 5-hydroxylases | Unknown | Non | |
| V | YP_556166 | Rieske oxygenases related to salicylate 5-hydroxylases | Unknown | Non | |
| VI | ZP_06687854 | Rieske oxygenases related to salicylate 5-hydroxylases | Unknown | Non | |
| VII | AAL17610 | Chlorobenzoate dioxygenases | Chlorobenzoate | Cbz | |
| VIII | YP_999321 | Rieske oxygenases related to terephthalate dioxygenases | Unknown | Non | |
| IX | BAE47077 | Terephthalate dioxygenases | Terephthalate | Tph | |
| X | WP_009518931 | Rieske oxygenases related to terephthalate dioxygenases | Unknown | Non | |
| XI | YP_552991 | Rieske oxygenases related to terephthalate dioxygenases | Unknown | Non | |
| XII | WP_017727645 | Rieske oxygenases related to terephthalate dioxygenases | Unknown | Non | |
| XIII | WP_007398631 | Rieske oxygenases related to salicylate 1-hydroxylases | Unknown | Non | |
| XIV | BAC65433 | Salicylate 1-hydroxylases | Salicylate | Sa1 | |
| XV | BAC65426 | Salicylate 1-hydroxylases | Salicylate | Sa1 | |
| XVI | YP_008374324 | Probable salicylate 1-hydroxylases | Probably salicylate | Non | |
| XVII | YP_002361789 | Probable salicylate 1-hydroxylases | Probably salicylate | Non | |
| XVIII | AAO83639 | Anthranilate dioxygenases of | Anthranilate | Ant | |
Notes. The three- or four-letter codes indicate the experimentally validated function and/or substrate: Ant, anthranilate; Cbz, chlorobenzoate; Sa1, salicylate (salicylate 1-hydroxylases); Sa5, salicylate (salicylate 5-hydroxylases); Tph, terephthalate.
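Once a query sequence has been placed in a cluster, the table can serve as a lookup for the inferred annotation. The following Python sketch transcribes a few rows of the table for illustration. The `Cluster` type and the `annotate` function are hypothetical and not part of AromaDeg. Treating the code `Non` as "no experimentally validated function" is an assumption, since the notes do not define it.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    representative: str  # accession of the representative sequence
    annotation: str
    substrate: str
    code: str            # abbreviation column of the table


# A few rows transcribed from the table above (not the full table).
SALICYLATE_FAMILY = {
    "I": Cluster("BAC65453", "Salicylate 1-hydroxylases", "Salicylate", "Sa1"),
    "II": Cluster("AAD12607", "Salicylate 5-hydroxylases", "Salicylate", "Sa5"),
    "III": Cluster("ZP_06687231",
                   "Rieske oxygenases related to salicylate 5-hydroxylases",
                   "Unknown", "Non"),
    "VII": Cluster("AAL17610", "Chlorobenzoate dioxygenases",
                   "Chlorobenzoate", "Cbz"),
    "IX": Cluster("BAE47077", "Terephthalate dioxygenases",
                  "Terephthalate", "Tph"),
}


def annotate(cluster_id: str) -> dict:
    """Return the annotation inferred for a query placed in cluster_id."""
    cluster = SALICYLATE_FAMILY[cluster_id]
    return {
        "annotation": cluster.annotation,
        "substrate": cluster.substrate,
        # Assumption: "Non" marks clusters lacking validated members.
        "validated": cluster.code != "Non",
    }


print(annotate("IX"))
print(annotate("III"))
```

Keeping the `validated` flag separate from the annotation text lets downstream high-throughput classification distinguish clusters anchored by characterized enzymes from those annotated only by relatedness.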