HydDB: A web tool for hydrogenase classification and analysis.

Dan Søndergaard1, Christian N S Pedersen1, Chris Greening2,3.   

Abstract

H2 metabolism is proposed to be the most ancient and diverse mechanism of energy-conservation. The metalloenzymes mediating this metabolism, hydrogenases, are encoded by over 60 microbial phyla and are present in all major ecosystems. We developed a classification system and web tool, HydDB, for the structural and functional analysis of these enzymes. We show that hydrogenase function can be predicted by primary sequence alone using an expanded classification scheme (comprising 29 [NiFe], 8 [FeFe], and 1 [Fe] hydrogenase classes) that defines 11 new classes with distinct biological functions. Using this scheme, we built a web tool that rapidly and reliably classifies hydrogenase primary sequences using a combination of k-nearest neighbors' algorithms and CDD referencing. Demonstrating its capacity, the tool reliably predicted hydrogenase content and function in 12 newly-sequenced bacteria, archaea, and eukaryotes. HydDB provides the capacity to browse the amino acid sequences of 3248 annotated hydrogenase catalytic subunits and also contains a detailed repository of physiological, biochemical, and structural information about the 38 hydrogenase classes defined here. The database and classifier are freely and publicly available at http://services.birc.au.dk/hyddb/.

Year:  2016        PMID: 27670643      PMCID: PMC5037454          DOI: 10.1038/srep34212

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Microorganisms conserve energy by metabolizing H2. Oxidation of this high-energy fuel yields electrons that can be used for respiration and carbon-fixation. This diffusible gas is also produced in diverse fermentation and anaerobic respiratory processes1. H2 metabolism contributes to the growth and survival of microorganisms across the three domains of life, including chemotrophs and phototrophs, lithotrophs and heterotrophs, aerobes and anaerobes, mesophiles and extremophiles alike12. On the ecosystem scale, H2 supports microbial communities in most terrestrial, aquatic, and host-associated ecosystems13. It is also proposed that H2 was the primordial electron donor45. In biological systems, metalloenzymes known as hydrogenases are responsible for oxidizing and evolving H216. Our recent survey showed there is a far greater number and diversity of hydrogenases than previously thought2. It is predicted that over 55 microbial phyla and over a third of all microorganisms harbor hydrogenases27. Better understanding H2 metabolism and the enzymes that mediate it also has wider implications, particularly in relation to human health and disease38, biogeochemical cycling9, and renewable energy1011. There are three types of hydrogenase, the [NiFe], [FeFe], and [Fe] hydrogenases, that are distinguished by their metal composition. Whereas the [Fe]-hydrogenases are a small methanogenic-specific family12, the [NiFe] and [FeFe] classes are widely distributed and functionally diverse. They can be classified through a hierarchical system into different groups and subgroups/subtypes with distinct biochemical features (e.g. directionality, affinity, redox partners, and localization) and physiological roles (i.e. respiration, fermentation, bifurcation, sensing)16. It is necessary to define the subgroup or subtype of the hydrogenase to predict hydrogenase function. 
For example, while Group 2a and 2b [NiFe]-hydrogenases share >35% sequence identity, they have distinct roles as respiratory uptake hydrogenases and H2 sensors respectively1314. Likewise, discrimination between Group A1 and Group A3 [FeFe]-hydrogenases is necessary to distinguish fermentative and bifurcating enzymes215. Building on previous work1617, we recently created a comprehensive hydrogenase classification scheme predictive of biological function2. This scheme was primarily based on the topology of phylogenetic trees built from the amino acid sequences of hydrogenase catalytic subunits/domains. It also factored in genetic organization, metal-binding motifs, and functional information. This analysis identified 22 subgroups (within four groups) of [NiFe]-hydrogenases and six subtypes (within three groups) of [FeFe]-hydrogenases, each proposed to have unique physiological roles and contexts2. In this work, we build on these findings to develop the first web database for the classification and analysis of hydrogenases. We developed an expanded classification scheme that captures the full sequence diversity of hydrogenase enzymes and predicts their biological function. Using this information, we developed a classification tool based on the k-nearest neighbors’ (k-NN) method. HydDB is a user-friendly, high-throughput, and functionally-predictive tool for hydrogenase classification that operates with precision exceeding 99.8%.

Results and Discussion

A sequence-based classification scheme for hydrogenases

We initially developed a classification scheme to enable prediction of hydrogenase function by primary sequence alone. To do this, we visualized the relationships between all hydrogenases in sequence similarity networks (SSN)18, in which nodes represent individual proteins and the distances between them reflect BLAST E-values. As reflected by our analysis of other protein superfamilies1920, SSNs allow robust inference of sequence-structure-function relationships for large datasets without the problems associated with phylogenetic trees (e.g. long-branch attraction). Consistent with previous phylogenetic analyses21617, this analysis showed the hydrogenase sequences clustered into eight major groups (Groups 1 to 4 [NiFe]-hydrogenases, Groups A to C [FeFe]-hydrogenases, [Fe]-hydrogenases), six of which separate into multiple functionally-distinct subgroups or subtypes at narrower logE filters (Fig. 1; Figure S1). The SSNs demonstrated that all [NiFe]-hydrogenase subgroups defined through phylogenetic trees in our previous work2 separated into distinct clusters, which is consistent with our evolutionary model that such hydrogenases diverged from a common ancestor to adopt multiple distinct functions2. The only exceptions were the Group A [FeFe]-hydrogenases, which, as previously reported217, cannot be classified by sequence alone as they have principally diversified through changes in domain architecture and quaternary structure. It remains necessary to analyze the organization of the genes encoding these enzymes to determine their specific function, e.g. whether they serve fermentative or electron-bifurcating roles.
Figure 1

Sequence similarity network of hydrogenase sequences.

Nodes represent individual proteins and the edges show the BLAST E-values between them at the logE filter defined at the bottom-left of each panel. The sequences are colored by class as defined in the legends. Figure S1 shows the further delineation of the encircled [NiFe] hydrogenase classes.

The SSN analysis revealed that several branches that clustered together in the phylogenetic tree analysis2 in fact separate into several well-resolved subclades (Fig. 1). We determined whether this was significant by analyzing the taxonomic distribution, genetic organization, metal-binding sites, and reported biochemical or functional characteristics of the differentiated subclades. On this basis, we concluded that 11 of the new subclades identified are likely to have unique physiological roles. We therefore refine and expand the hydrogenase classification to reflect that hydrogenases are more diverse in both primary sequence and predicted function than accounted for by even the latest classification scheme2. The new scheme comprises 38 hydrogenase classes, namely 29 [NiFe]-hydrogenase subclasses, 8 [FeFe]-hydrogenase subtypes, and the monophyletic [Fe]-hydrogenases (Table 1).
Table 1

Expanded classification scheme for hydrogenase enzymes.

Class | Name | Function | Reference

[NiFe] Group 1: Respiratory H2-uptake [NiFe]-hydrogenases
1a | Periplasmic | Electron input for sulfate, metal, and organohalide respiration. [NiFeSe] variants. | 2
1b | Prototypical | Electron input for sulfate, fumarate, metal, and nitrate respiration. | 2
1c | Hyb-type | Electron input for fumarate, nitrate, and sulfate respiration. Physiologically reversible. | 2
1d | Oxygen-tolerant | Electron input for aerobic respiration and oxygen-tolerant anaerobic respiration. | 2
1e | Isp-type | Electron input primarily for sulfur respiration. Physiologically reversible. | 2
1f | Oxygen-protecting | Unresolved role. May liberate electrons to reduce reactive oxygen species. | 2
1g | Crenarchaeota-type | Electron input primarily for sulfur respiration. | 2
1h | Actinobacteria-type | Electron input for aerobic respiration. Scavenges electrons from atmospheric H2. | 2,46
1i | Coriobacteria-type (putative) | Undetermined role. May liberate electrons for anaerobic respiration. | This work
1j | Archaeoglobi-type | Electron input for sulfate respiration. | This work
1k | Methanophenazine-reducing | Electron input for methanogenic heterodisulfide respiration22. | This work

[NiFe] Group 2: Alternative and sensory uptake [NiFe]-hydrogenases
2a | Cyanobacteria-type | Electron input for aerobic respiration. Recycles H2 produced by other cellular processes. | 16
2b | Histidine kinase-linked | H2 sensing. Activates two-component system controlling hydrogenase expression. | 16
2c | Diguanylate cyclase-linked (putative) | Undetermined role. May sense H2 and regulate processes through cyclic di-GMP production. | 2
2d | Aquificae-type | Unresolved role. May generate reductant for carbon fixation or have a regulatory role. | 2
2e | Metallosphaera-type (putative) | Undetermined role. May liberate electrons primarily for aerobic respiration26. | This work

[NiFe] Group 3: Cofactor-coupled bidirectional [NiFe]-hydrogenases
3a | F420-coupled | Couples oxidation of H2 to reduction of F420 during methanogenesis. Physiologically reversible. [NiFeSe] variants. | 16
3b | NADP-coupled | Couples oxidation of NADPH to evolution of H2. Physiologically reversible. May have sulfhydrogenase activity. | 16
3c | Heterodisulfide reductase-linked | Bifurcates electrons from H2 to heterodisulfide and Fdox in methanogens. [NiFeSe] variants. | 16
3d | NAD-coupled | Interconverts electrons between H2 and NAD depending on cellular redox state. | 16

[NiFe] Group 4: Respiratory H2-evolving [NiFe]-hydrogenases
4a | Formate hydrogenlyase | Couples formate oxidation to fermentative H2 evolution. May be H+-translocating. | 2
4b | Formate-respiring | Respires formate or carbon monoxide using H+ as electron acceptor. Na+-translocating via Mrp23. | This work
4c | Carbon monoxide-respiring | Respires carbon monoxide using H+ as electron acceptor. H+-translocating. | 2
4d | Ferredoxin-coupled, Mrp-linked | Couples Fdred oxidation to H+ reduction. Na+-translocating via Mrp complex24. | This work
4e | Ferredoxin-coupled, Ech-type | Couples Fdred oxidation to H+ reduction. Physiologically reversible via H+/Na+ translocation. | 2
4f | Formate-coupled (putative) | Undetermined role. May couple formate oxidation to H2 evolution and H+ translocation. | 2
4g | Ferredoxin-coupled (putative) | Undetermined role. May couple Fdred oxidation to proton reduction and H+/Na+ translocation. | This work
4h | Ferredoxin-coupled, Eha-type | Couples Fdred oxidation to H+ reduction in anaplerotic processes. H+/Na+-translocating25. | This work
4i | Ferredoxin-coupled, Ehb-type | Couples Fdred oxidation to H+ reduction in anabolic processes. H+/Na+-translocating25. | This work

[FeFe] Hydrogenases
A1 | Prototypical | Couples ferredoxin oxidation to fermentative or photobiological H2 evolution. | 2,17
A2 | Glutamate synthase-linked (putative) | Undetermined role. May couple H2 oxidation to NAD reduction, generating reductant for glutamate synthase. | 2,17
A3 | Bifurcating | Reversibly bifurcates electrons from H2 to NAD and Fdox in anaerobic bacteria. | 2,17
A4 | Formate dehydrogenase-linked | Couples formate oxidation to H2 evolution. Some bifurcate electrons from H2 to ferredoxin and NADP. | 2,17
B | Colonic-type (putative) | Undetermined role. May couple Fdred oxidation to fermentative H2 evolution. | 17
C1 | Histidine kinase-linked (putative) | Undetermined role. May sense H2 and regulate processes via histidine kinases2. | This work
C2 | Chemotactic (putative) | Undetermined role. May sense H2 and regulate processes via methyl-accepting chemotaxis proteins2. | This work
C3 | Phosphatase-linked (putative) | Undetermined role. May sense H2 and regulate processes via serine/threonine phosphatases2. | This work

[Fe] Hydrogenases
All | Methenyl-H4MPT dehydrogenase | Reversibly couples H2 oxidation to 5,10-methenyltetrahydromethanopterin reduction. | 16

The majority of the classes were defined in previous work2161746. The [NiFe] Group 1i, 1j, 1k, 2e, 4d, 4g, 4h, and 4i enzymes and [FeFe] Groups C1, C2, and C3 enzymes were defined in this work based on their separation into distinct clusters in the SSN analysis (Fig. 1). HydDB contains detailed information on each of these classes, including their taxonomic distribution, genetic organization, biochemistry, and structures, as well as a list of primary references.

Three lineages originally classified as Group 1a [NiFe]-hydrogenases were reclassified as new subgroups, namely those affiliated with Coriobacteria (Group 1i), Archaeoglobi (Group 1j), and Methanosarcinales (Group 1k). Cellular and molecular studies show these enzymes all support anaerobic respiration of H2, but differ in the membrane carriers (methanophenazine, menaquinone) and terminal electron acceptors (heterodisulfide, sulfate, nitrate) that they couple to2122. The previously-proposed 4b and 4d subgroups2 were dissolved, as the SSN analysis confirmed they were polyphyletic. These sequences are reclassified here into five new subgroups: the formate- and carbon monoxide-respiring Mrp-linked complexes (Group 4b)23, the ferredoxin-coupled Mrp-linked complexes (Group 4d)24, the well-described methanogenic Eha (Group 4h) and Ehb (Group 4i) supercomplexes25, and a more loosely clustered class of unknown function (Group 4g). Enzymes within these subgroups, with the exception of the uncharacterized 4g enzymes, sustain well-described specialist functions in the energetics of various archaea232425. Three crenarchaeotal hydrogenases were also classified as their own family (Group 2e); these enzymes enable certain crenarchaeotes to grow aerobically on O22627 and hence may represent a unique lineage of aerobic uptake hydrogenases currently underrepresented in genome databases. The Group C [FeFe]-hydrogenases were also separated into three main subtypes given they separate into distinct clusters even at relatively broad logE values (Fig. 1); these subtypes are each transcribed with different regulatory elements and are likely to have distinct regulatory roles21728 (Table 1).

HydDB reliably predicts hydrogenase class using the k-NN method and CDD referencing

Using this information, we built a web tool to classify hydrogenases. Hydrogenase classification is determined through a three-step process following input of the catalytic subunit sequence. Two checks are initially performed to confirm if the inputted sequence is likely to encode a hydrogenase catalytic subunit/domain. The Conserved Domain Database (CDD)29 is referenced to confirm that the inputted sequence has a hydrogenase catalytic domain, i.e. “Complex1_49kDa superfamily” (cl21493) (for NiFe-hydrogenases), “Fe_hyd_lg_C superfamily” (cl14953) (for FeFe-hydrogenases), and “HMD” (pfam03201) (for Fe-hydrogenases). A homology check is also performed that computes the BLAST E-value between the inputted sequence and its closest homolog in HydDB. HydDB classifies any inputted sequence that lacks hydrogenase conserved domains or has low homology scores (E-value > 10^-5) as a non-hydrogenase (Table S1). In the final step, the sequence is classified through the k-NN method that determines the most similar sequences listed in the HydDB reference database. To determine the optimal k for the dataset, we performed a 5-fold cross-validation for k = 1…10 and computed the precision for each k. The results are shown in Fig. 2. The classifier predicted the classes of the 3248 hydrogenase sequences with 99.8% precision and high robustness when performing a 5-fold cross-validation (as described in the Methods section) for k = 4. The six sequences where there were discrepancies between the SSN and k-NN predictions are shown in Table S2. The classifier has also been trained to detect and exclude protein families that are homologous to hydrogenases but do not metabolize H2 (Nuo, Ehr, NARF, HmdII12) using reference sequences of these proteins (Table S1).
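The three-step decision described above can be illustrated with a minimal Python sketch. It is not the HydDB implementation: the function name `classify`, the input shapes (a set of CDD identifiers and a list of BLAST hits already computed elsewhere), and the tie-breaking behavior are assumptions made for illustration. Only the three domain identifiers, the E-value cutoff, and k = 4 come from the text.

```python
from collections import Counter

# CDD identifiers quoted in the text for the three hydrogenase types.
HYDROGENASE_DOMAINS = {"cl21493", "cl14953", "pfam03201"}
E_VALUE_CUTOFF = 1e-5  # closest homolog above this E-value -> non-hydrogenase


def classify(domains, hits, k=4):
    """Hypothetical helper illustrating the three-step process.

    domains: CDD identifiers found in the query sequence (assumed input).
    hits: list of (e_value, class_label) pairs against the reference set.
    """
    # Step 1: the query must contain a hydrogenase catalytic domain.
    if not HYDROGENASE_DOMAINS & set(domains):
        return "non-hydrogenase"
    # Step 2: the closest homolog must pass the E-value cutoff.
    ranked = sorted(hits, key=lambda hit: hit[0])
    if not ranked or ranked[0][0] > E_VALUE_CUTOFF:
        return "non-hydrogenase"
    # Step 3: majority vote among the k nearest reference sequences.
    # Assumption: ties go to the label seen first, i.e. the nearer hit.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


example_hits = [
    (1e-50, "[NiFe] Group 1h"),
    (1e-40, "[NiFe] Group 1h"),
    (1e-30, "[NiFe] Group 1d"),
    (1e-20, "[NiFe] Group 1h"),
]
print(classify({"cl21493"}, example_hits))
```

The example hits are invented solely to exercise the function; real E-values and labels would come from a BLAST search against the reference database.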
Figure 2

Evaluating the k-NN classifier for k = 1…10.

For each k, a 5-fold cross-validation was performed. The mean precision ± two standard deviations of the folds is shown in the figure (note the y-axis). k = 1 provides the most accurate classifier. However, k = 4 provides almost the same precision and is more robust to errors in the training set (reflected by the lower standard deviation). In general, the standard deviation is very small, indicating that the predictions are robust to changes in the training data.

Sequences of the [FeFe] Group A can be classified into functionally-distinct subtypes (A1, A2, A3, A4) based on genetic organization2. The classifier can classify such hydrogenases if the protein sequence immediately downstream from the catalytic subunit sequence is provided. The classifier references the CDD to search for conserved domains in the downstream protein sequence. A sequence is classified as [FeFe] Group A2 if one of the domains “GltA”, “GltD”, “glutamate synthase small subunit” or “putative oxidoreductase”, but not “NuoF”, is found in the sequence. Sequences are classified as [FeFe] Group A3 if the domain “NuoF” is found and [FeFe] Group A4 if the domain “HycB” is present. If none of the domains are found, the sequence is classified as A1. These classification rules were determined by collecting 69 downstream protein sequences, submitting them to the CDD, and extracting the domains that occurred most often in each subtype. In addition to its precision, the classifier is superior to other approaches due to its usability. It is accessible as a free web service at http://services.birc.au.dk/hyddb/. HydDB allows users to paste or upload hydrogenase catalytic subunit sequences in FASTA format and run the classification (Figure S2). When the analysis has completed, results are presented in a table that can be downloaded as a CSV file (Figure S3). This provides an efficient and user-friendly way to classify hydrogenases, in contrast to the previous standard, which requires visualization of phylogenetic trees derived from multiple sequence alignments30.
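The domain rules for the Group A subtypes can be written as a short decision function. The sketch below is illustrative only: the function name `classify_group_a` and its input (a list of CDD domain names found in the downstream protein) are hypothetical. The text does not say which rule wins when several domains co-occur, so the order used here (NuoF first, then the glutamate synthase-type domains, then HycB) is an assumption.

```python
# Domain names quoted in the text for the A2 rule.
A2_DOMAINS = {
    "GltA",
    "GltD",
    "glutamate synthase small subunit",
    "putative oxidoreductase",
}


def classify_group_a(downstream_domains):
    """Assign an [FeFe] Group A subtype from downstream-protein domains.

    downstream_domains: CDD domain names found in the protein encoded
    immediately downstream of the catalytic subunit (assumed input).
    """
    found = set(downstream_domains)
    # NuoF is checked first because the A2 rule explicitly excludes it.
    if "NuoF" in found:
        return "[FeFe] Group A3"
    if found & A2_DOMAINS:
        return "[FeFe] Group A2"
    # Assumption: HycB is only considered when no A2 or A3 domain is present.
    if "HycB" in found:
        return "[FeFe] Group A4"
    # No diagnostic domain found: default to the prototypical subtype.
    return "[FeFe] Group A1"


print(classify_group_a(["GltD"]))
print(classify_group_a(["GltD", "NuoF"]))
```

Keeping the rules as an ordered chain of checks makes the precedence explicit, which matters because the published rules are stated independently of one another.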

HydDB infers the physiological roles of H2 metabolism

As summarized in Table 1, hydrogenase class is strongly correlated with physiological role. As a result, the classifier is capable of predicting both the class and function of a sequenced hydrogenase. To demonstrate this capacity, we used HydDB to analyze the hydrogenases present in 12 newly-sequenced bacteria, archaea, and eukaryotes of major ecological significance. The classifier correctly classified all 24 hydrogenases identified in the sequenced genomes, as validated with SSNs (Table 2). On the basis of these classifications, the physiological roles of H2 metabolism were predicted (Table 2). For five of the organisms, these predictions are confirmed or supported by previously published data2731323334. Other predictions are in line with metabolic models derived from metagenome surveying353637. In some cases, the capacity for organisms to metabolize H2 was not tested or inferred in previous studies despite the presence of hydrogenases in the sequenced genomes32383940.
Table 2

Predictive capacity of the HydDB.

Organism | Phylum | Hydrogenase accession no. | HydDB classification | SSN classification | Predicted H2 metabolism | Confirmed H2 metabolism
Pyrinomonas methylaliphatogenes | Acidobacteria | WP_041979300.1 | [NiFe] Group 1h | [NiFe] Group 1h | Persistence by aerobic respiration of atmospheric H2 | Confirmed experimentally31
Phaeodactylibacter xiamenensis | Bacteroidetes | WP_044227713.1; WP_044216927.1; WP_044227053.1 | [NiFe] Group 1d; [NiFe] Group 2a; [NiFe] Group 3d | [NiFe] Group 1d; [NiFe] Group 2a; [NiFe] Group 3d | Chemolithoautotrophic growth by aerobic H2 oxidation | Bacterium grows aerobically, but H2 oxidation untested32
Bathyarchaeota archaeon BA1 | Bathyarchaeota | KPV62434.1; KPV62673.1; KPV62298.1 | [NiFe] Group 3c; [NiFe] Group 3c; [NiFe] Group 4g | [NiFe] Group 3c; [NiFe] Group 3c; [NiFe] Group 4g | Couples Fdred oxidation to H2 evolution in energy-conserving and bifurcating processes | Unconfirmed but consistent with metagenome-based models36
Lenisia limosa | Obazoa (Breviatea class) | LenisMan28 | [FeFe] Group A1 | [FeFe] Group A | Fermentative evolution of H2 | Confirmed experimentally47
Acidianus copahuensis | Crenarchaeota | WP_048100721.1; WP_048100713.1; WP_048100378.1; WP_048100359.1 | [NiFe] Group 1g; [NiFe] Group 1g; [NiFe] Group 1h; [NiFe] Group 2e | [NiFe] Group 1g; [NiFe] Group 1g; [NiFe] Group 1h; [NiFe] Group 2e | Chemolithoautotrophic growth by H2 oxidation using O2 or S0 as electron acceptors | Partially confirmed experimentally27
Arcobacter sp. E1/2/3 | Proteobacteria (Epsilon class) | Arc.peg.2312 | [NiFe] Group 1b | [NiFe] Group 1b | Chemolithoautotrophic growth by anaerobic H2 oxidation | Confirmed experimentally47
Methanoperedens nitroreducens | Euryarchaeota (ANME) | WP_048088262.1; WP_048090768.1 | [NiFe] Group 3b; [NiFe] Group 3b | [NiFe] Group 3b; [NiFe] Group 3b | Secondary role for H2 metabolism limited to fermentative evolution of H2 | Unconfirmed but consistent with metagenome-based models35
Kryptonium thompsoni | Kryptonia | CUU03002.1; CUU06124.1 | [NiFe] Group 1d; [NiFe] Group 3b | [NiFe] Group 1d; [NiFe] Group 3b | Chemolithoautotrophic growth by aerobic H2 oxidation, fermentative evolution of H2 | Untested, candidate phylum identified by metagenomics39
Lokiarchaeum sp. GC14_75 | Lokiarchaeota | KKK40681.1 | [NiFe] Group 3c | [NiFe] Group 3c | Bifurcates electrons between H2, heterodisulfide, and ferredoxin | Unconfirmed but consistent with metagenome-based models48
Nitrospira moscoviensis | Nitrospirae | WP_053379275.1 | [NiFe] Group 2a | [NiFe] Group 2a | Chemolithoautotrophic growth by aerobic H2 oxidation | Confirmed experimentally33
Bacterium GW2011_GWE1_35_17 | Moranbacteria | KKQ46070.1; KKQ45273.1 | [NiFe] Group 1a; [NiFe] Group 3b | [NiFe] Group 1a; [NiFe] Group 3b | Chemolithoautotrophic growth by anaerobic H2 oxidation, fermentative evolution of H2 | Unconfirmed but consistent with metagenome-based models37
Bacterium GW2011_GWA2_33_10 | Peregrinibacteria | KKP36897.1 | [FeFe] Group A3 | [FeFe] Group A | Bifurcates electrons between H2, NADH, and ferredoxin | Unconfirmed but consistent with metagenome-based models37
Entotheonella sp. TSY1 | Tectomicrobia | ETW97737.1; ETW94065.1 | [NiFe] Group 1h; [NiFe] Group 3b | [NiFe] Group 1h; [NiFe] Group 3b | Persistence by aerobic respiration of atmospheric H2, fermentative evolution of H2 | Untested, candidate phylum identified by metagenomics40

HydDB accurately determined hydrogenase content and predicted the physiological roles of H2 metabolism in 12 newly-sequenced archaeal and bacterial species.

While HydDB serves as a reliable initial predictor of hydrogenase class and function, further analysis is recommended to verify predictions. Hydrogenase sequences only provide organisms with the genetic capacity to metabolise H2; their function is ultimately modulated by their expression and integration within the cell141. In addition, some classifications are likely to be overgeneralized due to lack of functional and biochemical characterization of certain lineages and sublineages. For example, it is not clear if two distant members of the Group 1h [NiFe]-hydrogenases (Robiginitalea biformata, Sulfolobus islandicus) perform the same H2-scavenging functions as the core group9. Likewise, it seems probable that the Group 3a [NiFe]-hydrogenases of Thermococci and Aquificae use a distinct electron donor to the main class42. Prominent cautions are included in the enzyme pages in cases such as these. HydDB will be updated when literature is published that influences functional assignments.

HydDB contains interfaces for hydrogenase browsing and analyzing

In addition to its classification function, HydDB is designed to be a definitive repository for hydrogenase retrieval and analysis. The database presently contains entries for 3248 hydrogenases, including their NCBI accession numbers, amino acid sequences, hydrogenase classes, taxonomic affiliations, and predicted behavior (Figure S4). To enable easy exploration of the data set, the database also provides an interface for searching, filtering, and sorting the data, as well as the capacity to download the results in CSV or FASTA format. There are individual pages for the 38 hydrogenase classes defined here (Table 1), including descriptions of their physiological role, genetic organization, taxonomic distribution, and biochemical features. This is supplemented with a compendium of structural information about the hydrogenases, which is integrated with the Protein Data Bank (PDB), as well as a library of over 500 literature references (Figure S5).

Conclusions

To summarize, HydDB is a definitive resource for hydrogenase classification and analysis. The classifier described here provides a reliable, efficient, and convenient tool for hydrogenase classification and functional prediction. HydDB also provides browsing tools for the rapid analysis and retrieval of hydrogenase sequences. Finally, the manually-curated repository of class descriptions, hydrogenase structures, and literature references provides a deep but accessible resource for understanding hydrogenases.

Methods

Sequence datasets

The database was constructed using the amino acid sequences of all curated non-redundant 3248 hydrogenase catalytic subunits represented in the NCBI RefSeq database in August 20142 (Dataset S1). In order to test the classification tool, additional sequences from newly-sequenced archaeal and bacterial phyla were retrieved from the Joint Genome Institute’s Integrated Microbial Genomes database43.

Sequence similarity networks

Sequence similarity networks (SSNs)18 constructed using Cytoscape 4.144 were used to visualize the distribution and diversity of the retrieved hydrogenase sequences. In this analysis, each node represents one of the 3248 hydrogenase sequences in the reference database (Dataset S1). Each edge represents the sequence similarity between two nodes as determined by E-values from all-vs-all BLAST analysis, with all self and duplicate edges removed. Three networks were constructed, namely for the [NiFe]-hydrogenase large subunit sequences (Dataset S2), [FeFe]-hydrogenase catalytic domain sequences (Dataset S3), and [Fe]-hydrogenase sequences (Dataset S4). To control the degree of separation between nodes, logE cutoffs were incrementally decreased from −5 to −200 until no major changes in clustering were observed. The logE cutoffs used for the final classifications are shown in Fig. 1 and Figure S1.
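The effect of tightening the logE cutoff can be sketched in a few lines of Python. This is a simplified stand-in for the Cytoscape workflow described above, not a reproduction of it: the function `ssn_clusters`, the node names, and the E-values are invented for illustration, and clusters are taken to be connected components of the filtered network.

```python
import math


def ssn_clusters(nodes, blast_hits, log_e_cutoff):
    """Group nodes into connected components of a thresholded network.

    blast_hits: iterable of (node_a, node_b, e_value) from an all-vs-all
    search (assumed input). An edge is kept when log10(E) <= log_e_cutoff.
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Union-find lookup with path halving.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    for a, b, e_value in blast_hits:
        if a == b:
            continue  # self edges are removed
        log_e = math.log10(e_value) if e_value > 0 else float("-inf")
        if log_e <= log_e_cutoff:
            parent[find(a)] = find(b)  # duplicate edges are harmless here

    clusters = {}
    for node in nodes:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(clusters.values(), key=lambda members: sorted(members))


toy_nodes = ["a", "b", "c", "d"]
toy_hits = [("a", "b", 1e-50), ("b", "c", 1e-10), ("c", "d", 1e-60), ("a", "a", 0.0)]
print(ssn_clusters(toy_nodes, toy_hits, -5))   # one cluster at a broad cutoff
print(ssn_clusters(toy_nodes, toy_hits, -40))  # splits at a narrower cutoff
```

Running the function over a series of decreasing cutoffs and comparing the resulting cluster lists mirrors the incremental procedure in the text, where the cutoff is lowered until the clustering stops changing.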

Classification method

The k-NN method is a well-known machine learning method for classification45. Given a set of data points x_1, x_2, …, x_N (e.g. sequences) with known labels y_1, y_2, …, y_N (e.g. type annotations), the label of a point, x, is predicted by computing the distance from x to x_1, x_2, …, x_N and extracting the k labeled points closest to x, i.e. the neighbors. The predicted label is then determined by majority vote of the labels of the k neighbors. The distance measure applied here is that of a BLAST search. Thus, the classifier corresponds to a homology search where the types of the top k results are considered. However, formulating the classification method as a machine learning problem allows the use of common evaluation methods to estimate the precision of the method and perform model selection. The classifier was evaluated using n-fold cross-validation. The dataset is first split into n parts of equal size. Then n − 1 parts (the training set) are used for training the classifier, and the labels of the data points in the remaining part (the test set) are predicted. This process, called a fold, is repeated n times. The predicted labels of each fold are then compared to the known labels and a precision can be computed.
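A compact Python sketch of the method described above follows. It is a toy illustration rather than the authors' code: numbers with an absolute-difference distance stand in for sequences and BLAST scores, the names `knn_predict` and `cross_validate` are hypothetical, folds are formed by simple striding without shuffling (the text does not specify how the split is made), and the per-fold score is the fraction of correctly predicted labels.

```python
from collections import Counter


def knn_predict(train, query, distance, k):
    """Return the majority label among the k training points nearest to query.

    train: list of (item, label) pairs; distance: function of two items.
    """
    nearest = sorted(train, key=lambda pair: distance(query, pair[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_validate(data, distance, k, n_folds):
    """Return the fraction of correct predictions for each of n_folds folds.

    Assumes len(data) >= n_folds so that no test set is empty.
    """
    # Assumption: strided split; a real evaluation may shuffle first.
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, test in enumerate(folds):
        # Train on every part except the held-out one.
        train = [pair for j, fold in enumerate(folds) if j != i for pair in fold]
        correct = sum(knn_predict(train, x, distance, k) == y for x, y in test)
        scores.append(correct / len(test))
    return scores


# Toy data: two well-separated groups on the number line.
toy_data = [(0.0, "A"), (0.1, "A"), (0.2, "A"), (0.3, "A"),
            (5.0, "B"), (5.1, "B"), (5.2, "B"), (5.3, "B")]


def absolute_difference(a, b):
    return abs(a - b)


print(cross_validate(toy_data, absolute_difference, k=3, n_folds=4))
```

Repeating the call for k = 1…10 and comparing the mean and spread of the returned scores is one way to carry out the kind of model selection the text describes.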

Additional Information

How to cite this article: Søndergaard, D. et al. HydDB: A web tool for hydrogenase classification and analysis. Sci. Rep. 6, 34212; doi: 10.1038/srep34212 (2016).