The PHA Depolymerase Engineering Database: A systematic analysis tool for the diverse family of polyhydroxyalkanoate (PHA) depolymerases.

Michael Knoll¹, Thomas M Hamm, Florian Wagner, Virginia Martinez, Jürgen Pleiss.

Abstract

BACKGROUND: Polyhydroxyalkanoates (PHAs) can be degraded by many microorganisms using intra- or extracellular PHA depolymerases. PHA depolymerases are very diverse in sequence and substrate specificity, but share a common alpha/beta-hydrolase fold and a catalytic triad, which is also found in other alpha/beta-hydrolases.
RESULTS: The PHA Depolymerase Engineering Database (DED, http://www.ded.uni-stuttgart.de) has been established as a tool for systematic analysis of this enzyme family. The DED contains sequence entries of 587 PHA depolymerases, which were assigned to 8 superfamilies and 38 homologous families based on their sequence similarity. For each family, multiple sequence alignments and profile hidden Markov models are provided, and functionally relevant residues are annotated.
CONCLUSION: The DED is a valuable tool which can be applied to identify new PHA depolymerase sequences from complete genomes in silico, to classify PHA depolymerases, to predict their biochemical properties, and to design enzyme variants with improved properties.

Year: 2009 PMID: 19296857 PMCID: PMC2666664 DOI: 10.1186/1471-2105-10-89

Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169

Background

In the past decade, polyhydroxyalkanoates (PHAs) have gained industrial interest as biodegradable substitutes for non-degradable plastics. While poly (R)-3-hydroxybutyric acid (PHB) is the most widely studied and the best characterized PHA, a wide variety of PHAs with differences in flexibility and thermostability have been described. Many bacteria accumulate PHAs as storage compounds of carbon and energy [1-3]. PHAs have been assigned to two classes, depending on the number of carbon atoms of the monomers: short chain length PHAs (PHASCL) with 3 to 5 carbon atoms per monomer and medium chain length PHAs (PHAMCL) with 6 to 15 carbon atoms per monomer. PHAs are degraded by intracellular and extracellular PHA depolymerases. Intracellular PHA depolymerases hydrolyze an endogenous carbon reservoir, the native PHA granules, consisting of the polymer with a surface layer of proteins and phospholipids. Extracellular PHA depolymerases degrade denatured extracellular granules, which are partially crystalline and lack a surface layer [4,5]. Thus, depending on their substrate and its physical state, PHA depolymerases are generally grouped into four families: PHA depolymerases degrading the native intracellular granules (nPHAMCL depolymerases and nPHASCL depolymerases) and PHA depolymerases degrading the denatured extracellular PHA granules (dPHAMCL depolymerases and dPHASCL depolymerases). One exception to this classification is an extracellular nPHASCL depolymerase from Paucimonas lemoignei, which is active only against native PHA granules [6]. Additionally, periplasmatic PHA depolymerases exist: a PHA depolymerase from Rhodospirillum rubrum has recently been described as being located in the periplasm [7]. PHA depolymerases are carboxylesterases and belong to the α/β-hydrolase fold family [5].
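The grouping by monomer chain length and granule state can be written down as a small lookup. The following is only an illustrative sketch: the function names and the "native"/"denatured" labels are assumptions made here, and the exceptions mentioned in the text (for example the extracellular nPHASCL depolymerase from Paucimonas lemoignei) are not modelled.

```python
def chain_length_class(carbon_atoms):
    """Return 'SCL' for 3 to 5 and 'MCL' for 6 to 15 carbon atoms per monomer."""
    if 3 <= carbon_atoms <= 5:
        return "SCL"
    if 6 <= carbon_atoms <= 15:
        return "MCL"
    raise ValueError(f"no PHA class defined for {carbon_atoms} carbon atoms")


def depolymerase_group(carbon_atoms, granule_state):
    """Combine granule state ('native' or 'denatured') and chain length class."""
    prefix = {"native": "n", "denatured": "d"}[granule_state]
    return f"{prefix}PHA{chain_length_class(carbon_atoms)} depolymerases"


# 3-hydroxybutyrate has 4 carbon atoms, 3-hydroxyoctanoate has 8.
example_scl = depolymerase_group(4, "denatured")
example_mcl = depolymerase_group(8, "native")
```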
Two families of the α/β-hydrolase fold that include PHA depolymerases have also been described in the Pfam protein families database [8]: the family of Esterase PHB depolymerases (Pfam accession code: PF10503) and the family describing the C-terminus of bacterial PHB depolymerases (Pfam accession code: PF06850). With the exception of a few intracellular nPHASCL depolymerases, all PHA depolymerases have a catalytic triad (serine – histidine – aspartic acid) as active site. The catalytic serine is embedded in a GxSxG sequence motif (known as 'lipase box') as found in other α/β-hydrolases. Additionally, a conserved non-catalytic histidine is found near the oxyanion hole, analogous to lipases [5,9]. The best studied PHA depolymerases are dPHASCL depolymerases. They share a common domain architecture consisting of a short signal peptide, a catalytic domain (including the lipase box and the oxyanion hole), a short linker domain, and a substrate binding domain [10]. Depending on the location of the lipase box at the sequence level relative to the oxyanion hole, two types of catalytic domains are known. Within the sequences of type 1 catalytic domains, the oxyanion hole can be found N-terminal to the lipase box, similar to lipases. Within the sequences of type 2 catalytic domains, the oxyanion hole is found C-terminal to the catalytic triad. In contrast to dPHASCL depolymerases, dPHAMCL depolymerases possess no substrate binding domain. In these enzymes, the N-terminal region of the catalytic domain is assumed to function as substrate binding site [5]. The PHA depolymerase from Rhodospirillum rubrum, which is described to be located in the periplasm [7], is a special case, as its catalytic domain is similar to that of extracellular PHA depolymerases with a type 2 catalytic domain. For intracellular nPHA depolymerases no particular substrate binding domain has been described so far.
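As a rough illustration of the motif described above, the sketch below searches a sequence for GxSxG pentapeptides with a regular expression and, given the position of an oxyanion-hole residue, reports which side of the lipase box it lies on. The function names, the toy sequence, and the idea of supplying the oxyanion-hole position as an input are assumptions for illustration only. The type assignment is simplified to the position relative to the lipase-box serine, and a plain pattern match will also report GxSxG occurrences that are not catalytic.

```python
import re

# A lookahead is used so that overlapping GxSxG matches are all reported.
LIPASE_BOX = re.compile(r"(?=(G.S.G))")


def find_lipase_boxes(sequence):
    """Return (0-based start, pentapeptide) for every GxSxG match."""
    return [(m.start(), m.group(1)) for m in LIPASE_BOX.finditer(sequence.upper())]


def catalytic_domain_type(box_start, oxyanion_position):
    """Return 1 if the oxyanion hole is N-terminal to the serine, else 2."""
    serine_position = box_start + 2
    if oxyanion_position == serine_position:
        raise ValueError("oxyanion hole cannot coincide with the catalytic serine")
    return 1 if oxyanion_position < serine_position else 2


toy_sequence = "MKTAHHLLGLSAGGAMTW"  # hypothetical sequence, not a real enzyme
boxes = find_lipase_boxes(toy_sequence)
```

For the toy sequence, the only match is the pentapeptide GLSAG starting at index 8, so an oxyanion-hole position of 4 would be reported as type 1 and a position of 15 as type 2.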
A few intracellular nPHASCL depolymerases have no lipase box, but have a catalytic triad consisting of cysteine, histidine, and aspartic acid. One member of this family is the nPHASCL depolymerase of Ralstonia eutropha [11]. Only about 30 PHA depolymerases with experimentally validated PHA depolymerase activity have been described so far. The factors which mediate the capability of depolymerases to degrade PHAs with high specificity are not yet understood. Although the sequence similarity of PHA depolymerases to other known α/β-hydrolases like lipases and esterases is low and substrate specificity differs considerably, they belong to the same fold family and possess a highly conserved active site. From a systematic comparison of the PHA depolymerase family to other α/β-hydrolases, depolymerase-specific motifs can be derived. However, a data resource is still lacking which integrates sequence and structure information and provides tools for a systematic analysis of the sequence-structure-function relationship of PHA depolymerases. Therefore, the PHA Depolymerase Engineering Database (DED, http://www.ded.uni-stuttgart.de) has been designed to assist a comprehensive analysis of sequences, the annotation of new sequences and the design of mutants. For the analysis of lipases and esterases, the Lipase Engineering Database (LED) has previously been established and applied [12,13]. Comparison of the rules derived from the LED to the DED will help to understand differences between PHA depolymerases and other α/β-hydrolases, and will relate experimentally observed properties of PHA depolymerases to their sequence.

Construction and content

To establish the PHA Depolymerase Engineering Database, the data warehouse system DWARF [14] has been applied. The DWARF system provides an automated retrieval tool to extract information on sequence, structure, or function from different source databases into a local data warehouse system. As a first step, 28 seed sequences of proteins with experimentally validated depolymerase activity (Table 1 [15-33]) were stored in the database and annotated. These seed sequences were assigned to 6 previously described superfamilies based on their function [34]. Additionally, the family of intracellular nPHASCL depolymerases (lipase box), of which one family member has recently been described [35], and the family of periplasmatic PHA depolymerases, which includes the PHA depolymerase from Rhodospirillum rubrum [7], were introduced. Thus, a total of 8 superfamilies were introduced:

Table 1

Experimentally validated PHA depolymerases, which were used as seed sequences to set up the DED.

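Accession number (gi)	Organism	Family	Reference
3641686	Ralstonia eutropha H16	Intracellular nPHASCL depolymerases (no lipase box)	[28]
75763431	Bacillus thuringiensis serovar israelensis ATCC 35646	Intracellular nPHASCL depolymerases (lipase box)	[35]
22035160	Rhodospirillum rubrum	Periplasmatic PHA depolymerases	[7]
130002	Pseudomonas oleovorans	Intracellular nPHAMCL depolymerases	[18]
21689574	Pseudomonas putida	Intracellular nPHAMCL depolymerases	[17]
130019	Ralstonia pickettii	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[29]
1777951	Alcaligenes faecalis	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[23]
116744367	Bacillus megaterium	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[48]
1730532	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[20]
7385117	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[31]
1657610	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[15]
1621355	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[15]
531464	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[45]
531466	Paucimonas lemoignei	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[45]
75538924	Pseudomonas stutzeri	Extracellular dPHASCL depolymerases (catalytic domain type 1)	[26]
5360565	Ralstonia pickettii	Extracellular dPHASCL depolymerases (catalytic domain type 1)	
1381030	Ralstonia pickettii	Extracellular dPHASCL depolymerases (catalytic domain type 1)	
4033618	Acidovorax sp. TP4	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[25]
7209864	Caldimonas manganoxidans	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[33]
565666	Comamonas sp.	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[19]
75340123	Delftia acidovorans	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[21]
47078657	Schlegelella sp. KB1a	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[27]
1389770	Streptomyces exfoliatus	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[24]
88192747	Penicillium funiculosum	Extracellular dPHASCL depolymerases (catalytic domain type 2)	[16]
15788987	Paucimonas lemoignei	Extracellular nPHASCL depolymerases	[6]
34452163	Pseudomonas alcaligenes	Extracellular dPHAMCL depolymerases	[22]
29470160	Pseudomonas alcaligenes	Extracellular dPHAMCL depolymerases	[22]
21542177	Pseudomonas fluorescens	Extracellular dPHAMCL depolymerases	[30]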

▪ intracellular nPHASCL depolymerases (no lipase box)
▪ intracellular nPHASCL depolymerases (lipase box)
▪ intracellular nPHAMCL depolymerases
▪ periplasmatic PHA depolymerases
▪ extracellular dPHASCL depolymerases (catalytic domain type 1)
▪ extracellular dPHASCL depolymerases (catalytic domain type 2)
▪ extracellular nPHASCL depolymerases
▪ extracellular dPHAMCL depolymerases

The DWARF system was further applied to populate the database with sequences obtained from a pool of selected sequences, which have been annotated as "depolymerase" in the GenBank [36]. This was done by performing a BLAST search [37] of each seed sequence in the database against the selected sequence pool derived from the GenBank. As a final population step, BLAST searches against the non-redundant sequence database at NCBI were performed for each sequence with an E-value cut-off of E = 10⁻⁵⁰ to populate the database with more sequences. Superfamilies were subdivided into homologous families, which were introduced based on sequence similarity and phylogenetic analysis (Fig. 1). New protein entries were assigned to homologous families and superfamilies by their sequence similarity.
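The E-value filtering in the final population step can be sketched as follows. The sketch assumes that BLAST results are available in the common 12-column tabular format (query id, subject id, percent identity, alignment length, mismatches, gap openings, query start, query end, subject start, subject end, E-value, bit score); the text does not state which output format or parser DWARF uses, and the identifiers in the example rows are made up.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-50  # cut-off quoted in the text


def filter_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return {query id: set of subject ids} for hits at or below the cutoff."""
    hits = {}
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, evalue = row[0], row[1], float(row[10])
        if evalue <= cutoff:
            hits.setdefault(query, set()).add(subject)
    return hits


# Hypothetical rows: only the first and third pass the cut-off.
example_rows = (
    "seedA\tcand1\t62.1\t410\t150\t3\t1\t408\t5\t412\t3e-72\t250\n"
    "seedA\tcand2\t31.0\t200\t130\t6\t10\t205\t22\t215\t2e-12\t60.1\n"
    "seedB\tcand3\t88.4\t390\t45\t0\t1\t390\t1\t390\t0.0\t700\n"
)
accepted = filter_hits(example_rows)
```

A cut-off this strict keeps only close homologues, so more distant family members would have to be reached through the seed sequences of other families rather than through a single search.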

Figure 1

Assignment of homologous families. Homologous families of intracellular nPHASCL depolymerases (no lipase box) [A], extracellular dPHASCL depolymerases (catalytic domain type 1) [B], and extracellular dPHASCL depolymerases (catalytic domain type 2) [C]. The homologous family number is indicated.

Sequence entries with more than 98% sequence identity, which originate from the same source organism, were assigned to a single protein entry. In case of multiple sequence entries for one protein, the longest sequence was set as reference sequence. For protein entries with available structure information, structural monomers were downloaded from the Protein Data Bank [38] and stored as structure entries. Secondary structure information was calculated applying the program DSSP [39] and displayed in the annotated multiple sequence alignments which are generated using ClustalW (v1.83) with default parameters [40]. Annotation information on structurally or functionally relevant residues (active site, disulfide bridges, signal peptide) was extracted from the NCBI entries and annotated in the DED. Information on experimentally validated depolymerases was manually added. Residues of the lipase box and the catalytic triad were manually annotated, which enables an easy identification of these residues for almost all PHA depolymerases based on multiple sequence alignments.

The PHA Depolymerase Engineering Database consists of 735 sequence entries which code for 587 different proteins. The proteins have been assigned to 8 superfamilies and 38 homologous families. The largest PHA depolymerase families are the intracellular nPHASCL depolymerases (no lipase box) and the extracellular dPHASCL depolymerases (catalytic domain type 1) with 224 and 234 protein entries, respectively, and account for 38% and 39% of all protein entries. Only one member of the family of periplasmatic PHA depolymerases was found, the PHA depolymerase of Rhodospirillum rubrum.
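The rule for merging sequence entries into protein entries (more than 98% identity, same source organism, longest sequence as reference) might be sketched as below. Several points are assumptions: pairwise identities are supplied as a precomputed dictionary because the text does not say how identity was calculated, entries are grouped by single linkage with a union-find structure, and all identifiers and sequences are placeholders.

```python
from itertools import combinations


def merge_entries(entries, identity, threshold=98.0):
    """Group sequence entries into protein entries.

    entries:  list of (entry id, organism, sequence)
    identity: dict mapping frozenset({id_a, id_b}) to percent identity
    """
    parent = {entry[0]: entry[0] for entry in entries}

    def find(node):
        # Follow parent links to the group representative, compressing the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for (id_a, org_a, _), (id_b, org_b, _) in combinations(entries, 2):
        same_organism = org_a == org_b
        percent = identity.get(frozenset((id_a, id_b)), 0.0)
        if same_organism and percent > threshold:
            parent[find(id_a)] = find(id_b)

    groups = {}
    for entry in entries:
        groups.setdefault(find(entry[0]), []).append(entry)

    proteins = []
    for members in groups.values():
        reference = max(members, key=lambda e: len(e[2]))  # longest sequence
        proteins.append({"reference": reference[0],
                         "members": sorted(m[0] for m in members)})
    return sorted(proteins, key=lambda p: p["reference"])


# Placeholder data: s1 and s2 share an organism, s3 does not.
example_entries = [
    ("s1", "Paucimonas lemoignei", "A" * 400),
    ("s2", "Paucimonas lemoignei", "A" * 380),
    ("s3", "Ralstonia pickettii", "A" * 400),
]
example_identity = {
    frozenset(("s1", "s2")): 99.2,
    frozenset(("s1", "s3")): 99.5,
    frozenset(("s2", "s3")): 99.0,
}
protein_entries = merge_entries(example_entries, example_identity)
```

In this example s1 and s2 collapse into one protein entry with s1 as reference, while s3 stays separate despite its high identity because it comes from a different organism.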
For the families of extracellular dPHASCL depolymerases (catalytic domain type 2) and the family of extracellular nPHASCL depolymerases, structure information is available. Interestingly, two proteins from Cupriavidus taiwanensis and Ralstonia eutropha H16 which are annotated as "intracellular PHA depolymerase" in the GenBank were assigned to the family of extracellular dPHASCL depolymerases (catalytic domain type 1) due to their sequence similarity (gi:194292521, gi:74267419 [41]). The latter is reported to be highly active against artificial amorphous PHB granules, and lacks a signal peptide, a linker domain, and a substrate binding domain. Another exception is the PHA depolymerase from Pseudomonas sp., which is annotated as "extracellular PHA depolymerase" in the GenBank but was assigned to the family of intracellular nPHAMCL depolymerases in the DED (gi:34452171).

Utility and discussion

The DWARF system is an integrative bioinformatics tool to build up protein family databases into a local data warehouse system. The DWARF system has previously been successfully applied to build up the Lipase Engineering Database [12,13], the Cytochrome P450 Engineering Database [42], and the Medium-Chain Dehydrogenase/Reductase Engineering Database [43]. A local data warehouse has the advantage of a common and consistent data structure which enables systematic analysis of complete protein families. The DED is the first data source that integrates information on sequence, structure, and function of PHA depolymerases in a systematic and consistent format.

Web accessibility

The database can be browsed on the level of sequence, structure, or organism. All protein entries are linked to the respective NCBI entries. Annotated multiple sequence alignments and phylogenetic trees that are visualized applying the program PHYLODENDRON are provided via the online accessible version of the DED at http://www.ded.uni-stuttgart.de. For each family, information on amino acid conservation is given as calculated by PLOTCON [44]. For each homologous family and superfamily, family-specific profile hidden Markov models were calculated by the HMMER program to assist the classification of new PHA depolymerase sequences and the identification of new PHA depolymerase sequences from complete genomes in silico. A local BLAST interface is available to perform a BLAST search against the DED. A new dynamic user interface was developed which enables fast and easy integration of updated versions of the DED. The DED will be regularly updated by an automated script. For new sequence entries referring to a new structure in the Protein Data Bank (PDB), structure information is updated as well. New sequence and structure entries are classified into the homologous families and superfamilies based on their sequence identity.
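One way the family-specific profiles could be used to classify new sequences is to scan each sequence against all family profiles and keep the best-scoring family. The sketch below parses HMMER's whitespace-separated per-target table (as written with the `--tblout` option of `hmmscan`, where the first column is the profile name and the third the query sequence). The best-score rule, the E-value threshold of 1e-10, and the family and sequence names are assumptions; the text does not describe how profile hits are turned into family assignments.

```python
def best_family(tblout_text, max_evalue=1e-10):
    """Map each query sequence to its best-scoring family profile.

    Columns used: 0 = profile name, 2 = query sequence name,
    4 = full-sequence E-value, 5 = full-sequence bit score.
    """
    best = {}
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header/footer
        fields = line.split()
        family, sequence = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue > max_evalue:
            continue  # hit too weak to support an assignment
        if sequence not in best or score > best[sequence][1]:
            best[sequence] = (family, score)
    return {sequence: family for sequence, (family, _) in best.items()}


# Hypothetical table rows with made-up profile and sequence names.
example_table = """\
# target name  accession  query name  accession  E-value  score  bias ...
fam_type1_03 - newseq1 - 1.2e-85 285.1 0.1 1.5e-85 284.8 0.1 1.0 1 0 0 1 1 1 1 -
fam_type2_01 - newseq1 - 3.0e-12 45.3 0.0 4.1e-12 44.9 0.0 1.1 1 0 0 1 1 1 1 -
fam_type2_01 - newseq2 - 0.5 5.2 0.0 0.7 4.8 0.0 1.0 1 0 0 1 1 1 1 -
"""
assignments = best_family(example_table)
```

Sequences without any hit below the threshold, such as newseq2 here, are left unassigned rather than forced into the nearest family.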

Analysis

All PHA depolymerases in the DED possess a lipase box around the catalytic serine with a Gx1Sx2G sequence motif with the exception of the family of intracellular nPHASCL depolymerases (no lipase box), which possess a catalytic cysteine instead of the lipase box. For particular PHA depolymerases it has been previously described that a hydrophobic residue is found at position x1 within the Gx1Sx2G motif [4,9,45]. This seems to be a common feature of almost all PHA depolymerases as seen from a systematic analysis of the DED family multiple sequence alignments. Thus, compared to other α/β-hydrolases like lipases and esterases, where a polar residue is most frequently found at position x1, this conserved residue of the Gx1Sx2G motif might be relevant to differentiate between lipases or esterases and PHA depolymerases on sequence level. This hydrophobic residue is solvent exposed and located near the catalytic serine at the bottom of a deep cleft, as seen in the structure of the PHB depolymerase from Penicillium funiculosum (PDB entry 2D80) [46] (Fig. 2). The hydrophobic residue at position x1 is tryptophan and isoleucine for the families of intracellular nPHASCL depolymerases (lipase box) and periplasmatic PHA depolymerases, respectively. For the family of intracellular nPHAMCL depolymerases, the residue at position x1 is valine for almost all proteins. Although not possessing a lipase box, but utilizing a catalytic cysteine, all family members of the family of intracellular nPHASCL depolymerases (no lipase box) also have a hydrophobic residue (almost all valine) at position cysteine-1. While the hydrophobic residue at position x1 differs among the families of intracellular PHA depolymerases, leucine and isoleucine are the most frequent residues at this position for extracellular PHA depolymerases. 
While all proteins of the family of extracellular dPHASCL depolymerases (catalytic domain type 2) possess a hydrophobic residue at position x1, only 81% of the proteins of the family of extracellular dPHASCL depolymerases (catalytic domain type 1) have a hydrophobic residue at position x1. All extracellular dPHAMCL depolymerases have an isoleucine at position x1. One exception is the family of extracellular nPHASCL depolymerases, which neither possesses a typical Gx1Sx2G motif nor has a hydrophobic residue at position x1. In this family, the Gx1Sx2G motif is altered to an AHSMG motif, which can also be found in the family of Bacillus lipases (homologous family abH18.01 in the LED). One member of this special family is the PHB depolymerase from Paucimonas lemoignei, for which structure information is also available (PDB entry: 2VTV) [6,47]. This PHB depolymerase also has special biochemical properties, as it is an extracellular nPHASCL depolymerase degrading native granules, and is the only experimentally validated extracellular PHASCL depolymerase not having a substrate binding domain. Within lipases and esterases, a polar residue is typically found at position x1. However, a few exceptions also exist among lipases and esterases, such as the Candida antarctica lipase-like family (homologous family abH37 in the LED) and the family of Bacillus carboxylesterases (abH11.1).
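The per-family statistics on position x1 can be reproduced in outline from a list of lipase-box pentapeptides taken from an alignment. In the sketch below, the set of residues counted as hydrophobic (A, V, L, I, M, F, W) is an assumption, since the text names individual residues but gives no formal definition, and the example pentapeptides are invented rather than taken from the DED.

```python
import re
from collections import Counter

HYDROPHOBIC = set("AVLIMFW")  # assumed definition of "hydrophobic"
BOX_PATTERN = re.compile(r"^G.S.G$")


def x1_statistics(pentapeptides):
    """Count residues at x1 of Gx1Sx2G and return (counts, hydrophobic fraction).

    Pentapeptides that do not match GxSxG (for example AHSMG) are ignored.
    """
    valid = [p.upper() for p in pentapeptides if BOX_PATTERN.match(p.upper())]
    counts = Counter(p[1] for p in valid)
    hydrophobic = sum(n for residue, n in counts.items() if residue in HYDROPHOBIC)
    fraction = hydrophobic / len(valid) if valid else 0.0
    return counts, fraction


# Invented example: three hydrophobic x1 residues, one histidine, one non-GxSxG box.
example_boxes = ["GLSAG", "GISSG", "GLSQG", "GHSMG", "AHSMG"]
x1_counts, hydrophobic_fraction = x1_statistics(example_boxes)
```

For the invented list, four pentapeptides match GxSxG and three of them carry a hydrophobic residue at x1, giving a fraction of 0.75; the AHSMG variant is excluded because it does not match the motif.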

Figure 2

Top view of the binding site of the PHB depolymerase from Penicillium funiculosum (PDB entry 2D80, [46]). The catalytic residues are marked in red, the hydrophobic residue at position x1 of the Gx1Sx2G motif is marked in blue.

Conclusion

The PHA Depolymerase Engineering Database (DED) has been designed to serve as a navigation and analysis tool for PHA depolymerases. It serves as a platform to analyze sequence-structure-function relationships and to classify new sequences by providing multiple sequence alignments, phylogenetic trees, and family-specific profiles. The DED hence provides a valuable source of information to investigate the family of PHA depolymerases in a systematic way, to identify new proteins from genomes, and to distinguish between PHA depolymerases and lipases. Thus, it paves the way for a deeper understanding of the biochemical properties of PHA depolymerases and for the design of PHA depolymerases with improved properties.

Availability and requirements

The PHA Depolymerase Engineering Database (DED) is accessible online at http://www.ded.uni-stuttgart.de. All information on families of sequence and structure data, as well as alignments, phylogenetic trees, and family-specific profiles can be accessed by manual download.

Authors' contributions

MK established and analyzed the database, and wrote the manuscript. TH carried out analysis and contributed to writing the manuscript. FW programmed the dynamic user interface. VM contributed to establishing the database. JP supervised the project and finalized the manuscript. All authors read and approved the final manuscript.

43 in total

1. The Protein Data Bank.

Authors: H M Berman; J Westbrook; Z Feng; G Gilliland; T N Bhat; H Weissig; I N Shindyalov; P E Bourne
Journal: Nucleic Acids Res Date: 2000-01-01 Impact factor: 16.971

2. From oil to bioplastics, a dream come true?

Authors: María A Prieto
Journal: J Bacteriol Date: 2006-11-03 Impact factor: 3.490

3. The Cytochrome P450 Engineering Database: a navigation and prediction tool for the cytochrome P450 protein family.

Authors: Markus Fischer; Michael Knoll; Demet Sirim; Florian Wagner; Sonja Funke; Juergen Pleiss
Journal: Bioinformatics Date: 2007-05-17 Impact factor: 6.937

4. The Medium-Chain Dehydrogenase/reductase Engineering Database: a systematic analysis of a diverse protein family to understand sequence-structure-function relationship.

Authors: Michael Knoll; Jürgen Pleiss
Journal: Protein Sci Date: 2008-07-09 Impact factor: 6.725

5. Biochemical evidence that phaZ gene encodes a specific intracellular medium chain length polyhydroxyalkanoate depolymerase in Pseudomonas putida KT2442: characterization of a paradigmatic enzyme.

Authors: Laura I de Eugenio; Pedro García; José M Luengo; Jesús M Sanz; Julio San Román; José Luis García; María A Prieto
Journal: J Biol Chem Date: 2006-12-14 Impact factor: 5.157

6. Characterization of the extracellular poly(3-hydroxybutyrate) depolymerase of Comamonas sp. and of its structural gene.

Authors: D Jendrossek; M Backhaus; M Andermann
Journal: Can J Microbiol Date: 1995 Impact factor: 2.419

7. Biochemical and molecular characterization of the polyhydroxybutyrate depolymerase of Comamonas acidovorans YM1609, isolated from freshwater.

Authors: K Kasuya; Y Inoue; T Tanaka; T Akehata; T Iwata; T Fukui; Y Doi
Journal: Appl Environ Microbiol Date: 1997-12 Impact factor: 4.792

8. Structural basis of poly(3-hydroxybutyrate) hydrolysis by PhaZ7 depolymerase from Paucimonas lemoignei.

Authors: Anastassios C Papageorgiou; Siska Hermawan; Chouhan Bhanupratap Singh; Dieter Jendrossek
Journal: J Mol Biol Date: 2008-08-05 Impact factor: 5.469

9. Metabolism of poly(3-hydroxyalkanoates) (PHAs) by Pseudomonas oleovorans. Identification and sequences of genes and function of the encoded proteins in the synthesis and degradation of PHA.

Authors: G W Huisman; E Wonink; R Meima; B Kazemier; P Terpstra; B Witholt
Journal: J Biol Chem Date: 1991-02-05 Impact factor: 5.157

10. DWARF--a data warehouse system for analyzing protein families.

Authors: Markus Fischer; Quan K Thai; Melanie Grieb; Jürgen Pleiss
Journal: BMC Bioinformatics Date: 2006-11-09 Impact factor: 3.169
