The B6 database: a tool for the description and classification of vitamin B6-dependent enzymatic activities and of the corresponding protein families.

Riccardo Percudani, Alessio Peracchi

Abstract

BACKGROUND: Enzymes that depend on vitamin B6 (and in particular on its metabolically active form, pyridoxal 5'-phosphate, PLP) are of great relevance to biology and medicine, as they catalyze a wide variety of biochemical reactions mainly involving amino acid substrates. Although PLP-dependent enzymes belong to a small number of independent evolutionary lineages, they encompass more than 160 distinct catalytic functions, thus representing a striking example of divergent evolution. The importance and remarkable versatility of these enzymes, as well as the difficulties in their functional classification, create a need for an integrated source of information about them.
DESCRIPTION: The B6 database (http://bioinformatics.unipr.it/B6db) contains documented B6-dependent activities and the relevant protein families, defined as monophyletic groups of sequences possessing the same enzymatic function. One or more families were associated with each of 121 PLP-dependent activities with known sequences. Hidden Markov models (HMMs) were built from family alignments and incorporated in the database. These HMMs can be used for the functional classification of PLP-dependent enzymes in genomic sets of predicted protein sequences. An example of such analyses (a census of human genes coding for PLP-dependent enzymes) is provided here, whereas many more are accessible through the database itself.
CONCLUSION: The B6 database is a curated repository of biochemical and molecular information about an important group of enzymes. This information is logically organized and available for computational analyses, providing a key resource for the identification, classification and comparative analysis of B6-dependent enzymes.

Year:  2009        PMID: 19723314      PMCID: PMC2748086          DOI: 10.1186/1471-2105-10-273

Source DB:  PubMed          Journal:  BMC Bioinformatics        ISSN: 1471-2105            Impact factor:   3.169


Background

The term 'vitamin B6' refers to a collective of six biologically interconvertible 3-hydroxy-2-methylpyridine compounds: pyridoxal, pyridoxine, pyridoxamine, and their respective 5'-phosphates. Among these, pyridoxal 5'-phosphate (PLP) is the main metabolically active form, serving as a cofactor for a variety of enzymes in all organisms [1-7]. Nearly all PLP-dependent enzymes, with the exception of glycogen phosphorylases, are associated with biochemical pathways involving amino compounds - mostly amino acids. The reactions catalyzed by the PLP-dependent enzymes that act on amino acids include transamination, decarboxylation, racemization, and eliminations or replacements at the β- or γ-carbons. Such versatility arises from the fact that PLP can covalently bind the substrate and then act as an electrophilic catalyst, stabilizing different types of carbanionic reaction intermediates [7] (Figure 1).
Figure 1

A schematic view of the different reaction types catalyzed by PLP-dependent enzymes that act on amino acids. In these enzymes, PLP is bound to the ε-amino group of a catalytic lysine residue, forming a Schiff base (internal aldimine). Covalent binding of the substrate amino acid occurs through a transimination reaction, leading to formation of an external aldimine intermediate (structure in the upper left corner). Subsequently, the protonated ring system of PLP acts as an electron sink, stabilizing species that carry a negative charge on the α-carbon (carbanions). Depending on the enzyme (and hence on the specific arrangement of the active site residues), such stabilized carbanions can be formed upon cleavage of any of the three covalent bonds connecting the α-carbon to its substituents. Removal of the carboxylate group is typical of decarboxylases. Removal of the amino acid side chain occurs, for example, in threonine aldolase. Finally, removal of the α-proton may be the prelude to the formation of various further intermediates, leading to racemization, cyclization, β- and γ-elimination, and transamination reactions [1,4,7].

The Enzyme Commission (EC) lists more than 140 PLP-dependent activities, corresponding to ~4% of all classified activities [6]. Despite this wide functional variety, all structurally characterized PLP-dependent enzymes have been classified into just five distinct structural groups (also known as 'fold types') [4,8], which presumably correspond to independent evolutionary lineages [3,5]. This represents a remarkable example of divergent evolution, meaning that proteins with similar structure and sequence can perform different chemical reactions. Because PLP-dependent enzymes share mechanistic features and show limited structural diversity, inferring the function of these catalysts from sequence similarity alone is particularly difficult.
To aid the identification and classification of sequences belonging to PLP-dependent enzymes, we have created the B6 database. In addition to a wealth of links to other Internet resources (including BRENDA [9] and the PLP mutant enzyme database [10]), the B6 database contains over 180 documented PLP-dependent activities that are associated, when possible, with one or more protein families (defined as monophyletic groups of homologous proteins sharing the same function). The database also contains hidden Markov models (HMMs) that were built from family alignments and that can be employed for the identification and functional classification of PLP-dependent enzymes in genomic sets of protein sequences. Indeed, we have used these HMMs to scan a series of complete genomes, obtaining a census of predicted PLP-dependent enzymes in various organisms.

Construction and content

Organization and statistics of the B6 database

Figure 2 summarizes the structure of the database, illustrating the types of information it includes and the ways in which this information is linked together and can be searched. As shown, the B6 database site actually accesses and integrates four distinct databases, namely a list of PLP-dependent activities, a collection of pertinent literature references, a large set of sequences of PLP-dependent proteins (grouped into protein families) and the results of our genomic searches.
Figure 2

The B6 database relational structure. This figure delineates the four modules composing the database and their relationships. The database of PLP-dependent enzyme families was assembled based on the examination of the literature and on a collection of functionally validated sequences, as described in the text. These families, and in particular the HMMs associated with them, have been used for the identification and functional classification of PLP-dependent enzymes in sets of predicted protein sequences from whole genomes.

The B6 database release 1.0 (as of 15/05/2009) includes 184 activities and over 2000 sequences of B6-dependent enzymes, subdivided into 149 families. For each family, the database provides a multiple sequence alignment and the derived hidden Markov model.

Assembly of the databases: activities, sequences and protein families

The B6 database was constructed based on an inventory of documented B6-dependent activities, most but not all of which have been catalogued by the Enzyme Commission and are therefore associated with an official EC number. A systematic examination of the literature showed that 121 of these activities could be associated with enzymes of known sequence. In these cases we created protein families, which we define as monophyletic groups of sequences all possessing the same enzymatic activity. Each activity was associated with one or more families based on this criterion. The number of sequences in individual families was then increased by homology searches, i.e. by scanning GenBank with BLAST [11] or PSI-BLAST [12], using the functionally validated protein(s) as query. Criteria for inclusion of a sequence in a family were the following:
(1) Only sequences yielding an E value < 10⁻¹⁰ were generally considered (this limit could be somewhat relaxed for families composed of short sequences).
(2) Sequences showing >90% identity to a protein of known function were usually not included, to diminish redundancy.
(3) Sequences substantially (>30%) shorter than the shortest functionally validated sequence in the family were discarded. Sequences lacking the PLP-binding lysine residue were also discarded (except for rare cases in which the protein is known not to bind PLP via a lysine).
(4) Sequences showing a higher similarity to other characterized PLP-dependent enzymes (i.e., to some functionally validated protein belonging to another family) were discarded.
(5) Finally, sequences from taxa in which the enzymatic activity of the family was not documented were also generally discarded.
Multiple alignments were constructed with ClustalW [13]. Given that the families were composed of closely related sequences, these alignments did not need to be manually adjusted or to be guided by structural information (even when available).
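The inclusion criteria listed above can be read as a sequence of filters. The following is a minimal sketch of that reading, not the authors' actual pipeline: the `Candidate` record, its field names and the function `passes_inclusion` are hypothetical, and the thresholds are applied as hard limits even though the text describes them as applied "generally" or "usually" by a curator.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One homology-search hit (all field names are hypothetical)."""
    e_value: float          # E value against the functionally validated query
    max_identity: float     # highest % identity to a protein of known function
    length: int             # sequence length in residues
    has_plp_lysine: bool    # PLP-binding lysine conserved in the alignment
    best_family: str        # family of the most similar validated protein
    taxon_documented: bool  # activity documented in the candidate's taxon


def passes_inclusion(c, family, shortest_validated_len,
                     e_cutoff=1e-10, lysine_required=True):
    """Apply criteria (1)-(5) in order; return True if the hit is kept."""
    if c.e_value >= e_cutoff:                     # (1) E value limit
        return False
    if c.max_identity > 90.0:                     # (2) redundancy filter
        return False
    if c.length < 0.7 * shortest_validated_len:   # (3) more than 30% shorter
        return False
    if lysine_required and not c.has_plp_lysine:  # (3) catalytic lysine
        return False
    if c.best_family != family:                   # (4) closer to another family
        return False
    return c.taxon_documented                     # (5) taxon check
```

The `e_cutoff` and `lysine_required` parameters stand in for the two exceptions mentioned in the criteria (short-sequence families and enzymes that do not bind PLP through a lysine).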
The ProDom program [14] was used for alignment inspection and phylogenetic analysis. Family alignments were used to build hidden Markov models (HMMs) with programs of the HMMER suite [15]. The scores of sequences included in or excluded from a given family were then calculated with respect to the family HMM. From this procedure, score cut-offs for each family were determined and then used for sequence classification. A family HMM is a probabilistic model, constructed from a multiple alignment, which describes the sequence conservation within a protein family. In comparison to consensus sequences or similar regular expressions, HMMs model the features of a protein family in greater detail. This higher complexity is responsible for the greater discriminatory power of the HMM methodology in identifying other putative family members [15]. Depending on family inclusion criteria and score thresholds, HMMs can be used to identify homology at different levels of granularity. The 'family' definition adopted in the B6 database is similar to the 'equivalog family' definition of TIGRFAM [16], while a single family in PFAM [17] typically corresponds to many different families in our database.
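The text does not state the exact rule by which a family's score cut-off was derived from the scores of included and excluded sequences. As a rough illustration only, one plausible rule places the cut-off midway between the lowest-scoring member and the highest-scoring non-member; the function name `trusted_cutoff` and the midpoint rule itself are assumptions, not the procedure used by the database curators.

```python
def trusted_cutoff(member_scores, nonmember_scores):
    """Return a score cut-off separating family members from non-members.

    Assumed rule (not taken from the paper): the midpoint between the
    lowest member score and the highest non-member score.
    """
    lowest_member = min(member_scores)
    highest_nonmember = max(nonmember_scores)
    if highest_nonmember >= lowest_member:
        # No clean separation: a real curator would inspect the family by hand.
        raise ValueError("member and non-member scores overlap")
    return (lowest_member + highest_nonmember) / 2.0


# Toy HMM bit scores, invented for illustration.
cutoff = trusted_cutoff([512.3, 480.0, 455.5], [120.0, 210.5])
```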

Cluster analysis of PLP-dependent enzyme families

To elucidate the relationships between the 149 enzyme families defined as above, we performed an all-versus-all comparison of the families in the database using an HMM-HMM alignment software [18]. The results of this comparison were analyzed with an interaction network software [19] to build a homology-based network of PLP-dependent families (Figure 3). By considering only significant similarities (E < 10⁻⁵) between HMMs, the analysis identified seven separate clusters of PLP-dependent families (Figure 3). Five of these clusters corresponded to the traditional classification of PLP-dependent enzymes into five distinct structural groups (fold types I to V). Of the two additional clusters, one included lysine 5,6-aminomutase (EC: 5.4.3.4) and the other lysine 2,3-aminomutase (EC: 5.4.3.2) - two enzymes whose structures have been recently determined and found to be different from the known structures of PLP-dependent enzymes [20,21]. In the database, the protein families belonging to these two clusters were assigned, respectively, to fold types VI and VII.
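Identifying separate clusters in such a network amounts to finding the connected components of a graph whose edges are the significant HMM-HMM similarities. The sketch below shows this with a small union-find structure; it is not the Cytoscape-based analysis used in the paper, and the function name, the input format and the toy family names and E values are all invented for illustration.

```python
def cluster_families(families, hits, e_cutoff=1e-5):
    """Group families into connected components of the homology network.

    families: iterable of family names.
    hits: iterable of (family_a, family_b, e_value) pairwise comparisons.
    Only hits with e_value < e_cutoff create an edge.
    """
    parent = {f: f for f in families}

    def find(x):
        # Follow parent links to the representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, e_value in hits:
        if e_value < e_cutoff:
            parent[find(a)] = find(b)  # merge the two components

    clusters = {}
    for f in parent:
        clusters.setdefault(find(f), set()).add(f)
    return sorted(clusters.values(), key=sorted)


# Toy input: famB-famC is not significant, so two clusters remain.
toy_hits = [("famA", "famB", 1e-20), ("famB", "famC", 1e-3), ("famC", "famD", 1e-8)]
toy_clusters = cluster_families(["famA", "famB", "famC", "famD"], toy_hits)
```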
Figure 3

Homology network of PLP-dependent enzymes. Nodes represent Hidden Markov models (HMMs) of PLP-dependent families. Edges represent homology connections (E < 10⁻⁵) between families established by HMM-HMM comparisons [18]. Black edges connect protein families with the most significant similarities (E < 10⁻⁵⁰). The network is visualized with the "Degree sorted circle layout" of Cytoscape [19]. Colors were mapped into nodes using the structural group of the protein family as a node property.

Since HMM-HMM comparison is very sensitive to sequence similarity, it can reveal faint evolutionary relationships between protein families. This information can be particularly useful to identify relatives for PLP-dependent families that fail to reveal similarity with other families when analyzed by sequence-sequence (e.g., BLAST) or sequence-HMM (e.g., HMMPFAM) methods. The HMM-HMM analysis, for example, indicates a significant similarity between Prosc (a family of proteins with unknown function) and diaminopimelate decarboxylase (EC: 4.1.1.20) - a relationship that is not apparent through BLAST or HMMPFAM comparisons. Inter-family distances deriving from HMM-HMM comparisons served as a guide to build alignments representative of the seven distinct structural groups. Distance matrices among families were analyzed with a UPGMA algorithm, and a rapid multiple sequence alignment method [22] was used to progressively align PLP-dependent families belonging to the same structural type. From these alignments, we constructed HMMs (hereafter named "fold-type HMMs") representative of the seven structural groups of PLP-dependent enzymes.
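UPGMA builds a joining order by repeatedly merging the two closest groups and averaging their distances to every other group, weighted by group size. The following sketch shows only this tree-building step on a toy distance matrix; the function name, the input format and the distances are assumptions made for illustration, and the subsequent progressive alignment step [22] is not reproduced.

```python
def upgma(names, dist):
    """Return the UPGMA joining order as a nested tuple.

    names: list of labels.
    dist: symmetric matrix (list of lists), dist[i][j] between names[i], names[j].
    """
    # Active clusters: id -> (subtree, number of original members).
    active = {i: (name, 1) for i, name in enumerate(names)}
    d = {(i, j): dist[i][j] for i in active for j in active if i < j}
    next_id = len(names)

    while len(active) > 1:
        i, j = min(d, key=d.get)  # closest pair of active clusters
        tree_i, n_i = active.pop(i)
        tree_j, n_j = active.pop(j)
        new_d = {}
        for k in active:
            d_ik = d[(min(i, k), max(i, k))]
            d_jk = d[(min(j, k), max(j, k))]
            # Size-weighted mean = mean over all original member pairs.
            new_d[k] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        # Drop distances involving the merged clusters, add the new ones.
        d = {p: v for p, v in d.items() if i not in p and j not in p}
        for k, v in new_d.items():
            d[(k, next_id)] = v  # k is always smaller than next_id
        active[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1

    tree, _ = next(iter(active.values()))
    return tree


# Toy inter-family distances, invented for illustration.
toy_names = ["famA", "famB", "famC", "famD"]
toy_dist = [
    [0, 2, 6, 10],
    [2, 0, 6, 10],
    [6, 6, 0, 8],
    [10, 10, 8, 0],
]
guide_tree = upgma(toy_names, toy_dist)
```

In this toy case famA and famB are joined first, then famC, then famD, which is the order in which a progressive aligner would add the groups.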

Utility and discussion

The B6 database is a repository in which detailed (biochemical and genetic) information about an important group of enzymes is concentrated, organized and made available for computational analyses. We expect that the B6 database will be a valuable tool for experimental researchers in the PLP field, but also a reference point for the design of theoretical studies by bioinformaticians. In particular, the sequence information accumulated in the database can be used to facilitate the identification and functional assignment of B6-dependent enzymes. To illustrate this point, we employed the family and fold-type HMMs (constructed as described above) to search and preliminarily classify PLP-dependent enzymes in genomic sets of predicted proteins. The results of such analyses have also been incorporated in the database. Complete sets of protein sequences deduced from genomic data were generally obtained from NCBI or from similar ftp repositories. The classification of protein sequences was achieved through a two-step procedure. First, each sequence was compared with our database of PLP-dependent sequences by performing an HMM search with the seven fold-type HMMs, using relaxed significance criteria (E ≤ 10⁻¹; database size = 10000). This step served as a quick filter to sift out genes that were likely to code for PLP-dependent enzymes. Candidates were subsequently compared with the library of family HMMs using HMMPFAM. This step was more time-consuming and served for a preliminary functional classification of the proteins. A protein was considered to possess the same activity as its best-hit family if it exhibited a significant similarity to the family HMM (E ≤ 10⁻³) and a score above a 'trusted' cut-off established by the family curator. Sequences with a score below this threshold were marked as 'low-score' to indicate their modest similarity to the family model. These sequences were not considered as possessing the enzymatic function of the family, but were regarded as possessing an uncharacterized, possibly related, activity. According to this analysis, very few sequences exhibited a significant similarity to a fold-type HMM (E ≤ 10⁻³) but no significant similarity to any family HMM. In such cases, sequences were considered as potential PLP-dependent enzymes with an uncharacterized catalytic activity. To further characterize the protein sequence under examination, the classification program searched for a putative PLP-binding lysine residue (see legend of Figure 1). This was achieved by aligning the sequence with validated family members in which the position of the catalytic lysine had been previously mapped. This analysis can reveal proteins that are evolutionarily related to PLP-dependent enzymes, but have lost the ability to bind the PLP cofactor.
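The two-step procedure described above can be summarized as a small decision function. This is a sketch of the logic as stated in the text, not the database's classification program: the function name, the input format and the outcome labels are hypothetical, "best hit" is taken to mean the highest-scoring significant family (an assumption), and the label "unresolved" covers a case the text does not discuss (a sequence that passes the relaxed filter but matches no model significantly).

```python
def classify(fold_e, family_hits, trusted_cutoffs,
             fold_filter_e=1e-1, significant_e=1e-3):
    """Classify one protein sequence.

    fold_e: best E value against the seven fold-type HMMs.
    family_hits: list of (family, e_value, score) from the family-HMM search.
    trusted_cutoffs: dict mapping family name to its curated score cut-off.
    Returns (label, family or None).
    """
    # Step 1: quick filter with the fold-type HMMs.
    if fold_e > fold_filter_e:
        return ("not PLP-dependent", None)

    # Step 2: comparison with the library of family HMMs.
    significant = [h for h in family_hits if h[1] <= significant_e]
    if not significant:
        if fold_e <= significant_e:
            return ("uncharacterized PLP-dependent", None)
        return ("unresolved", None)  # case not covered by the text

    family, _, score = max(significant, key=lambda h: h[2])  # best hit by score
    if score >= trusted_cutoffs[family]:
        return ("family member", family)
    return ("low-score", family)


# Toy values, invented for illustration.
toy_cutoffs = {"famA": 300.0}
member = classify(1e-80, [("famA", 1e-120, 650.0)], toy_cutoffs)
weak = classify(1e-30, [("famA", 1e-20, 150.0)], toy_cutoffs)
```

A check for the PLP-binding lysine, as described in the text, would be a further step applied to sequences that pass this classification.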

Example: a census of human genes that encode PLP-dependent enzymes

By employing the approach outlined above, we searched the latest draft of the human genome (NCBI 36 assembly) to obtain an inventory of the human genes coding for PLP-dependent enzymes. The initial output of the program (69 sequences recognized as probable PLP-dependent proteins) was further analyzed to identify pseudogenes, false positives and entries representing alternative protein isoforms. The search identified 56 expressed genes coding for PLP-dependent proteins (Table 1). Note that the products of genes SPTLC1, ADC and AZIN1, albeit homologs of bona fide PLP-dependent enzymes, appear to have acquired a non-enzymatic function during evolution. Thirteen more proteins were recognized as isoforms deriving from some of the genes above. To estimate the rate of false negatives in our analysis, we performed an extensive text search in the GenBank database of human genes, to identify all those genes annotated (directly or indirectly) as coding for B6-dependent proteins. We found no hits other than the 56 genes listed in Table 1, which therefore represent, to the best of our current knowledge, the complement of human PLP-dependent genes.
Table 1

Inventory of the human genes that encode PLP-dependent enzymes

| Activity | Family | Protein accession # | E-value | Isoforms | Gene | Chromosome |
|---|---|---|---|---|---|---|
| Glycine dehydrogenase | 1.4.4.2 | NP_000161 | 0 | | GLDC | 9 |
| Glycine hydroxymethyltransferase | 2.1.2.1 | NP_004160 | 2e-294 | NP_683718 | SHMT1 | 17 |
| Glycine hydroxymethyltransferase | 2.1.2.1 | NP_005403 | 8e-289 | | SHMT2 | 12 |
| Glycine C-acetyltransferase | 2.3.1.29 | NP_055106 | 4e-225 | | GCAT | 22 |
| 5-aminolevulinic acid synthase | 2.3.1.37 | NP_954635 | 1e-269 | NP_000679 | ALAS1 | 3 |
| 5-aminolevulinic acid synthase | 2.3.1.37 | NP_000023 | 2e-270 | NP_001033057, NP_001033058, NP_001033056 | ALAS2 | X |
| Serine C-palmitoyltransferase | 2.3.1.50.a | NP_006406 | 6e-240 | | SPTLC1 (a) | 14 |
| Serine C-palmitoyltransferase | 2.3.1.50.b | NP_004854 | 6e-299 | | SPTLC2 | 20 |
| Serine C-palmitoyltransferase | 2.3.1.50.b | NP_060797 | 3e-260 | | SPTLC3 | 20 |
| Phosphorylase | 2.4.1.1 | NP_002854 | 0 | | PYGL | 14 |
| Phosphorylase | 2.4.1.1 | NP_005600 | 0 | | PYGM | 11 |
| Phosphorylase | 2.4.1.1 | NP_002853 | 0 | | PYGB | 20 |
| Aspartate aminotransferase | 2.6.1.1.a | NP_002070 | 2e-295 | | GOT1 | 10 |
| Aspartate aminotransferase | 2.6.1.1.a | NP_689626 | 1e-68 | | GOT1L1 | 8 |
| Aspartate aminotransferase | 2.6.1.1.a | NP_002071 | 2e-306 | | GOT2 | 16 |
| Alanine aminotransferase | 2.6.1.2.b | NP_005300 | 6e-262 | | GPT | 8 |
| Alanine aminotransferase | 2.6.1.2.b | NP_597700 | 3e-267 | | GPT2 | 16 |
| Tyrosine aminotransferase | 2.6.1.5 | NP_000344 | 2e-292 | | TAT | 16 |
| Kynurenine:oxoglutarate aminotransf. | 2.6.1.7 | NP_004050 | 7e-279 | | CCBL1 | 9 |
| Kynurenine:oxoglutarate aminotransf. | 2.6.1.7 | NP_001008662 | 4e-266 | NP_001008661 | CCBL2 | 1 |
| Ornithine:oxo-acid aminotransferase | 2.6.1.13 | NP_000265 | 4e-274 | | OAT | 10 |
| 4-aminobutyrate aminotransferase | 2.6.1.19.a | NP_065737 | 0 | NP_000654 | ABAT | 16 |
| 2-aminoadipate aminotransferase | 2.6.1.39.a | NP_872603 | 3e-265 | NP_057312 | AADAT | 4 |
| Branched-chain aa aminotransferase | 2.6.1.42 | NP_005495 | 3e-191 | | BCAT1 | 12 |
| Branched-chain aa aminotransferase | 2.6.1.42 | NP_001181 | 1e-181 | | BCAT2 | 19 |
| Alanine:glyoxylate aminotransferase | 2.6.1.44.a | NP_114106 | 0 | | AGXT2 | 5 |
| Alanine:glyoxylate aminotransferase | 2.6.1.44.a | NP_112569 | 1e-114 | | AGXT2L1 (b) | 4 |
| Alanine:glyoxylate aminotransferase | 2.6.1.44.a | NP_699204 | 8e-100 | | AGXT2L2 (b) | 5 |
| Serine:pyruvate aminotransferase | 2.6.1.51 | NP_000021 | 7e-269 | | AGXT | 2 |
| Phosphoserine aminotransferase | 2.6.1.52 | NP_478059 | 8e-235 | NP_066977 | PSAT1 | 9 |
| Cysteine desulfurase | 2.8.1.7.a | NP_066923 | 5e-290 | | NFS1 | 20 |
| Cysteine desulfurase | 2.8.1.7.b | NP_060417 | 0 | | MOCOS | 18 |
| Kynureninase | 3.7.1.3 | NP_003928 | 3e-259 | | KYNU | 2 |
| Glutamate decarboxylase | 4.1.1.15.a | NP_000808 | 0 | | GAD1 | 2 |
| Glutamate decarboxylase | 4.1.1.15.a | NP_000809 | 0 | | GAD2 | 10 |
| Ornithine decarboxylase | 4.1.1.17.1 | NP_002530 | 1e-175 | | ODC1 | 2 |
| Histidine decarboxylase | 4.1.1.22.b | NP_002103 | 1e-285 | | HDC | 15 |
| Aromatic-L-amino-acid decarboxylase | 4.1.1.28 | NP_000781 | 2e-291 | NP_001076440 | DDC | 7 |
| Sulfinoalanine decarboxylase | 4.1.1.29 | NP_997242 | 2e-116 | | GADL1 | 3 |
| Sulfinoalanine decarboxylase | 4.1.1.29 | NP_057073 | 0 | | CSAD | 12 |
| Sphinganine-1-phosphate aldolase | 4.1.2.27 | NP_003892 | 9e-249 | | SGPL1 | 10 |
| Cystathionine beta-synthase | 4.2.1.22 | NP_000062 | 2e-228 | | CBS | 21 |
| Threonine synthase | 4.2.3.1 | NP_079114 | 7e-91 | | THNSL1 | 10 |
| Threonine synthase | 4.2.3.1 | NP_060741 | 4e-80 | | THNSL2 (c) | 2 |
| L-serine ammonia-lyase | 4.3.1.17 | NP_006834 | 5e-209 | | SDS | 12 |
| L-serine ammonia-lyase | 4.3.1.17 | NP_612441 | 2e-205 | | SDSL | 12 |
| Cystathionine gamma-lyase | 4.4.1.1 | NP_001893 | 7e-296 | NP_714964 | CTH | 1 |
| 1-ACC synthase | 4.4.1.14 | NP_115981 | 1e-81 | | ACCS (d) | 11 |
| 1-ACC synthase | 4.4.1.14 | NP_001027025 | 3e-78 | | ACCSL | 11 |
| Selenocysteine lyase | 4.4.1.16.b | NP_057594 | 0 | | SCLY | 2 |
| Serine racemase | 5.1.1.18 | NP_068766 | 8e-243 | | SRR | 17 |
| Ornithine decarboxylase paralogue | odcp | NP_443724 | 8e-197 | | ADC (e) | 1 |
| Ornithine decarboxylase paralogue | odcp | NP_056962 | 2e-116 | NP_680479 | AZIN1 (e) | 8 |
| L-phosphoseryl-tRNASer:selenophosphate seleniumtransferase | sec.synthase | NP_722547 | 0 | NP_058651 | SEPSECS | 4 |
| Unclassified activity | Uncharact. family.prosc | NP_009129 | 5e-166 | | PROSC | 8 |
| Unassigned | n/a | NP_055842 | 9e-8 | | PDXDC1 | 16 |

The complete set of proteins from the NCBI 36 assembly of the human genome was searched as described in the text. The 'Family' column reports the name of the family whose HMM scored best when compared to the query. When possible, the family name consists of, or includes, the four-digit EC number of the corresponding activity. Note that the layout of the present table recapitulates the output provided by the B6 database site, with some modifications. In particular, the occurrence of protein isoforms derived from some genes was obtained by analysis of the crude program results. The official names of the genes corresponding to the transcripts and their chromosomal locations were obtained from NCBI.

(a) This gene encodes the subunit LCB1 of human serine palmitoyltransferase. This subunit is not itself a PLP-dependent enzyme, but is homologous to a second subunit (LCB2) that is PLP-dependent [24].

(b) The products of genes AGXT2L1 and AGXT2L2 are homologs of the mitochondrial alanine-glyoxylate aminotransferase (AGXT2), but were recently shown to lack this activity [25].

(c) The murine homolog of this gene was shown to encode an enzyme with phospho-lyase activity [26].

(d) This gene is a homolog of the plant 1-aminocyclopropane-1-carboxylate (ACC) synthase, but the gene product was shown to lack this activity [27].

(e) Genes ADC and AZIN1 encode ornithine decarboxylase antizyme inhibitors, i.e. proteins homologous to ornithine decarboxylase, but devoid of ornithine decarboxylase activity, that are involved in regulation of polyamine biosynthesis [28,29]. The human ADC, contrary to its mouse ortholog, retains a conserved lysine at the active site.

We also compared the functional classification provided by the B6 database with the manual annotation included in the NCBI 36 release of the human genome, finding no significant differences. This implies that the accuracy of our automatic classification system can match that of a manual expert annotation. It should be noted that only a minority of complete genomes have been subjected to accurate manual annotation. In genomes where proteins have been mostly annotated through a general system of automatic annotation, our specialized tool provides a more complete and accurate classification of PLP-dependent enzymes. Of course, accuracy in the annotation of a gene product does not always guarantee a precise functional assignment, as can be gleaned by inspecting Table 1. For example, some of the human PLP-dependent proteins in our inventory are homologs of enzymes (such as plant ACC synthases or bacterial threonine synthases) that are not expected to occur in mammals. In other cases, the proteins are homologs of other (functionally validated) human enzymes, but it is unclear whether they represent true isozymic forms, or rather possess distinct catalytic activities - this latter possibility may be especially pertinent for those sequences that were recognized as 'low-score' by our search procedure. These uncharacterized gene products therefore represent interesting subjects for functional genomic studies. Some genes encoding PLP-dependent enzymes may be missing from the list, possibly due to the limits of the current human genome assembly, even eight years after publication of the first genome draft [23]. For example, the gene ACCSL has been recognized as protein-coding only in the NCBI 36 assembly but was absent in the preceding version (NCBI 35).

Conclusion

The increasing number of predicted protein sequences generated by genomic sequencing projects requires methods to predict details regarding their functions. The B6 database allows the comparison of newly sequenced PLP-dependent proteins with a curated collection of protein families, making preliminary functional classification more reliable and helping to pinpoint the gene products that are the most interesting candidates for functional studies. Due to the progress of functional genomics, as well as to classical biochemical and genetic approaches, the body of information on PLP-dependent enzymes is bound to increase. Many activities that are currently 'orphan' (i.e., with no molecular details about the responsible enzymes) will be associated with specific sequences, while many new activities are likely to be discovered [6]. Accordingly, we expect to periodically update and expand the B6 database with the ensuing information, to keep this database a serviceable tool and a reference point for the scientific community.

Availability and requirements

The B6 database, which is based on the web-oriented Perl package Woda, is publicly available over the Internet. Users are asked to cite the present article.

Abbreviations

PLP: pyridoxal 5'-phosphate; HMM: hidden Markov model; EC: Enzyme Commission.

Authors' contributions

RP designed and implemented the B6 database and website, carried out the genomic analysis and revised the manuscript. AP collected the literature included in the database, selected the functionally validated sequences, helped to build the families, drafted and revised the manuscript. Both authors read and approved the final manuscript.
  29 in total

1.  Initial sequencing and analysis of the human genome.

Authors:  E S Lander; L M Linton; B Birren; C Nusbaum; M C Zody; J Baldwin; K Devon; K Dewar; M Doyle; W FitzHugh; R Funke; D Gage; K Harris; A Heaford; J Howland; L Kann; J Lehoczky; R LeVine; P McEwan; K McKernan; J Meldrim; J P Mesirov; C Miranda; W Morris; J Naylor; C Raymond; M Rosetti; R Santos; A Sheridan; C Sougnez; Y Stange-Thomann; N Stojanovic; A Subramanian; D Wyman; J Rogers; J Sulston; R Ainscough; S Beck; D Bentley; J Burton; C Clee; N Carter; A Coulson; R Deadman; P Deloukas; A Dunham; I Dunham; R Durbin; L French; D Grafham; S Gregory; T Hubbard; S Humphray; A Hunt; M Jones; C Lloyd; A McMurray; L Matthews; S Mercer; S Milne; J C Mullikin; A Mungall; R Plumb; M Ross; R Shownkeen; S Sims; R H Waterston; R K Wilson; L W Hillier; J D McPherson; M A Marra; E R Mardis; L A Fulton; A T Chinwalla; K H Pepin; W R Gish; S L Chissoe; M C Wendl; K D Delehaunty; T L Miner; A Delehaunty; J B Kramer; L L Cook; R S Fulton; D L Johnson; P J Minx; S W Clifton; T Hawkins; E Branscomb; P Predki; P Richardson; S Wenning; T Slezak; N Doggett; J F Cheng; A Olsen; S Lucas; C Elkin; E Uberbacher; M Frazier; R A Gibbs; D M Muzny; S E Scherer; J B Bouck; E J Sodergren; K C Worley; C M Rives; J H Gorrell; M L Metzker; S L Naylor; R S Kucherlapati; D L Nelson; G M Weinstock; Y Sakaki; A Fujiyama; M Hattori; T Yada; A Toyoda; T Itoh; C Kawagoe; H Watanabe; Y Totoki; T Taylor; J Weissenbach; R Heilig; W Saurin; F Artiguenave; P Brottier; T Bruls; E Pelletier; C Robert; P Wincker; D R Smith; L Doucette-Stamm; M Rubenfield; K Weinstock; H M Lee; J Dubois; A Rosenthal; M Platzer; G Nyakatura; S Taudien; A Rump; H Yang; J Yu; J Wang; G Huang; J Gu; L Hood; L Rowen; A Madan; S Qin; R W Davis; N A Federspiel; A P Abola; M J Proctor; R M Myers; J Schmutz; M Dickson; J Grimwood; D R Cox; M V Olson; R Kaul; C Raymond; N Shimizu; K Kawasaki; S Minoshima; G A Evans; M Athanasiou; R Schultz; B A Roe; F Chen; H Pan; J Ramser; H Lehrach; R Reinhardt; W R McCombie; M de la Bastide; N Dedhia; 
H Blöcker; K Hornischer; G Nordsiek; R Agarwala; L Aravind; J A Bailey; A Bateman; S Batzoglou; E Birney; P Bork; D G Brown; C B Burge; L Cerutti; H C Chen; D Church; M Clamp; R R Copley; T Doerks; S R Eddy; E E Eichler; T S Furey; J Galagan; J G Gilbert; C Harmon; Y Hayashizaki; D Haussler; H Hermjakob; K Hokamp; W Jang; L S Johnson; T A Jones; S Kasif; A Kaspryzk; S Kennedy; W J Kent; P Kitts; E V Koonin; I Korf; D Kulp; D Lancet; T M Lowe; A McLysaght; T Mikkelsen; J V Moran; N Mulder; V J Pollara; C P Ponting; G Schuler; J Schultz; G Slater; A F Smit; E Stupka; J Szustakowki; D Thierry-Mieg; J Thierry-Mieg; L Wagner; J Wallis; R Wheeler; A Williams; Y I Wolf; K H Wolfe; S P Yang; R F Yeh; F Collins; M S Guyer; J Peterson; A Felsenfeld; K A Wetterstrand; A Patrinos; M J Morgan; P de Jong; J J Catanese; K Osoegawa; H Shizuya; S Choi; Y J Chen; J Szustakowki
Journal:  Nature       Date:  2001-02-15       Impact factor: 49.962

2.  MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform.

Authors:  Kazutaka Katoh; Kazuharu Misawa; Kei-ichi Kuma; Takashi Miyata
Journal:  Nucleic Acids Res       Date:  2002-07-15       Impact factor: 16.971

3.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

4.  Basic local alignment search tool.

Authors:  S F Altschul; W Gish; W Miller; E W Myers; D J Lipman
Journal:  J Mol Biol       Date:  1990-10-05       Impact factor: 5.469

Review 5.  A genomic overview of pyridoxal-phosphate-dependent enzymes.

Authors:  Riccardo Percudani; Alessio Peracchi
Journal:  EMBO Rep       Date:  2003-09       Impact factor: 8.807

6.  The human cDNA for a homologue of the plant enzyme 1-aminocyclopropane-1-carboxylate synthase encodes a protein lacking that activity.

Authors:  K A Koch; G Capitani; M G Gruetter; J F Kirsch
Journal:  Gene       Date:  2001-07-11       Impact factor: 3.688

Review 7.  Pyridoxal phosphate-dependent enzymes.

Authors:  R A John
Journal:  Biochim Biophys Acta       Date:  1995-04-27

8.  CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

Authors:  J D Thompson; D G Higgins; T J Gibson
Journal:  Nucleic Acids Res       Date:  1994-11-11       Impact factor: 16.971

Review 9.  From cofactor to enzymes. The molecular evolution of pyridoxal-5'-phosphate-dependent enzymes.

Authors:  P Christen; P K Mehta
Journal:  Chem Rec       Date:  2001       Impact factor: 6.771

Review 10.  Serine palmitoyltransferase, a key enzyme of sphingolipid metabolism.

Authors:  Kentaro Hanada
Journal:  Biochim Biophys Acta       Date:  2003-06-10
  69 in total

1.  Rational assignment of key motifs for function guides in silico enzyme identification.

Authors:  Matthias Höhne; Sebastian Schätzle; Helge Jochens; Karen Robins; Uwe T Bornscheuer
Journal:  Nat Chem Biol       Date:  2010-09-26       Impact factor: 15.040

2.  Pyridoxine-dependent epilepsy: report on three families with neuropathology.

Authors:  Florent Marguet; Hager Barakizou; Abdellah Tebani; Lenaig Abily-Donval; Stéphanie Torre; Fethi Bayoudh; Sami Jebnoun; Marie Brasseur-Daudruy; Stéphane Marret; Annie Laquerriere; Soumeya Bekri
Journal:  Metab Brain Dis       Date:  2016-07-20       Impact factor: 3.584

Review 3.  5-Aminolevulinate synthase catalysis: The catcher in heme biosynthesis.

Authors:  Bosko M Stojanovski; Gregory A Hunter; Insung Na; Vladimir N Uversky; Rays H Y Jiang; Gloria C Ferreira
Journal:  Mol Genet Metab       Date:  2019-06-13       Impact factor: 4.797

Review 4.  Direct and Functional Biomarkers of Vitamin B6 Status.

Authors:  Per Magne Ueland; Arve Ulvik; Luisa Rios-Avila; Øivind Midttun; Jesse F Gregory
Journal:  Annu Rev Nutr       Date:  2015-05-13       Impact factor: 11.848

5.  The Mtm1p carrier and pyridoxal 5'-phosphate cofactor trafficking in yeast mitochondria.

Authors:  Mei M Whittaker; Aravind Penmatsa; James W Whittaker
Journal:  Arch Biochem Biophys       Date:  2015-01-28       Impact factor: 4.013

6.  Functional evolution of PLP-dependent enzymes based on active-site structural similarities.

Authors:  Jonathan Catazaro; Adam Caprez; Ashu Guru; David Swanson; Robert Powers
Journal:  Proteins       Date:  2014-06-20

Review 7.  Reactive Enamines and Imines In Vivo: Lessons from the RidA Paradigm.

Authors:  Andrew J Borchert; Dustin C Ernst; Diana M Downs
Journal:  Trends Biochem Sci       Date:  2019-05-15       Impact factor: 13.807

8.  Lysine relay mechanism coordinates intermediate transfer in vitamin B6 biosynthesis.

Authors:  Matthew J Rodrigues; Volker Windeisen; Yang Zhang; Gabriela Guédez; Stefan Weber; Marco Strohmeier; Jeremiah W Hanes; Antoine Royant; Gwyndaf Evans; Irmgard Sinning; Steven E Ealick; Tadhg P Begley; Ivo Tews
Journal:  Nat Chem Biol       Date:  2017-01-16       Impact factor: 15.040

9.  Molecular identification of hydroxylysine kinase and of ammoniophospholyases acting on 5-phosphohydroxy-L-lysine and phosphoethanolamine.

Authors:  Maria Veiga-da-Cunha; Farah Hadi; Thomas Balligand; Vincent Stroobant; Emile Van Schaftingen
Journal:  J Biol Chem       Date:  2012-01-12       Impact factor: 5.157

10.  Structural characterization of human O-phosphoethanolamine phospho-lyase.

Authors:  Chiara Vettraino; Alessio Peracchi; Stefano Donini; Emilio Parisini
Journal:  Acta Crystallogr F Struct Biol Commun       Date:  2020-04-01       Impact factor: 1.056

