Eyal Akiva, Janine N Copp, Nobuhiko Tokuriki, Patricia C Babbitt.
Abstract
Insight regarding how diverse enzymatic functions and reactions have evolved from ancestral scaffolds is fundamental to understanding chemical and evolutionary biology, and for the exploitation of enzymes for biotechnology. We undertook an extensive computational analysis using a unique and comprehensive combination of tools that include large-scale phylogenetic reconstruction to determine the sequence, structural, and functional relationships of the functionally diverse flavin mononucleotide-dependent nitroreductase (NTR) superfamily (>24,000 sequences from all domains of life, 54 structures, and >10 enzymatic functions). Our results suggest an evolutionary model in which contemporary subgroups of the superfamily have diverged in a radial manner from a minimal flavin-binding scaffold. We identified the structural design principle for this divergence: insertions at key positions in the minimal scaffold that, combined with the fixation of key residues, have led to functional specialization. These results will aid future efforts to delineate the emergence of functional diversity in enzyme superfamilies, provide clues for functional inference for superfamily members of unknown function, and facilitate rational redesign of the NTR scaffold.
Keywords: enzyme superfamilies; evolution; flavoenzyme; nitroreductase; sequence similarity network
Year: 2017 PMID: 29078300 PMCID: PMC5692541 DOI: 10.1073/pnas.1706849114
Source DB: PubMed Journal: Proc Natl Acad Sci U S A ISSN: 0027-8424 Impact factor: 11.205
Fig. 1. An overview of NTR superfamily structure and reaction diversity. (A) A representative NTR structure (PDB ID code 3E39) is depicted in cartoon display in two orientations, with individual monomers colored in gray and red and FMN depicted as a stick model with carbons in yellow (Dataset S1 includes a detailed list of NTR superfamily structures). (B) Diagram showing the ping-pong bi-bi reaction scheme. (C) Representative NTR superfamily reactions: Electron donor (oxidative) reactions, e.g., (1) nicotinamide oxidation, (2) thiazoline oxidation, (3) diketopiperazine oxidation; FMN reduction from (4) oxidized FMN, (5) FMN semiquinone to (6) reduced FMN; electron acceptor (reductive) reactions, e.g., (7) deiodination, (8) quinone reduction, (9) nitroimidazole reduction, (10) ene reduction, and (11) the fragmentation of reduced FMN to dimethylbenzamide.
Fig. 2. A representative SSN of the NTR superfamily: 24,270 protein sequences are depicted by 5,337 nodes (circles), which represent proteins sharing >60% sequence identity. Edges between nodes indicate an average pairwise BLAST E-value of at least 1 × 10^−18. Node coloring represents subgroup classification. White nodes with light gray borders indicate remainder sequences that do not belong in any of the categorized subgroups. Large triangle nodes include at least one solved crystal structure; black borders indicate that a biochemical activity was also experimentally characterized. Large circular nodes with black borders include at least one protein associated with experimental evidence (but without structural information). Names in bold indicate subgroups that contain at least one protein with literature-documented functional information. The network is visualized by Cytoscape (74) using the organic layout algorithm (36). (Inset) HMM networks of the NTR superfamily. Nodes represent SSGs, and node size correlates with SSG size, from smallest (<100 proteins) to largest (>300 proteins). Edges represent pairwise HMM alignment between SSGs, and similarities with HHalign scores >154 (corresponding with an HMM alignment score more significant than 1 × 10^−24) are shown. Edge color and width correspond with the HHalign score: <160 indicated by thin and light edges, >300 indicated by thick and dark edges. Nodes are colored based on (A) subgroup and (B) betweenness centrality.
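The thresholding step described in the Fig. 2 caption can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the node names and E-values are hypothetical toy data, and the sketch assumes that "an E-value of at least 1 × 10^−18" means "at least as significant as", i.e., numerically less than or equal to the cutoff.

```python
# Minimal sketch of building a thresholded sequence similarity network (SSN).
# Assumption: an edge is kept when its E-value is <= the cutoff (more significant).

E_VALUE_CUTOFF = 1e-18  # threshold quoted in the Fig. 2 caption


def build_ssn(pairwise_evalues, cutoff=E_VALUE_CUTOFF):
    """Return an adjacency map {node: set(neighbors)} for edges passing the cutoff."""
    graph = {}
    for (a, b), evalue in pairwise_evalues.items():
        # Register both nodes so that isolated nodes still appear in the network.
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if evalue <= cutoff:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def connected_components(graph):
    """Group nodes into clusters using an iterative depth-first search."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        components.append(component)
    return components


# Hypothetical average pairwise E-values between representative nodes.
toy_evalues = {
    ("n1", "n2"): 1e-30,
    ("n2", "n3"): 1e-20,
    ("n3", "n4"): 1e-5,  # too weak: n4 ends up isolated
}

ssn = build_ssn(toy_evalues)
clusters = connected_components(ssn)
```

Tightening the cutoff (a smaller E-value) splits clusters further, which is the usual way to explore subgroup boundaries in such networks.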
NTR subgroup summary and taxonomic distribution
Taxonomic profiling columns (ND through Oth) give % representation; Bdt through Oth are bacterial groups.
| Subgroup | Sequences/investigated enzymes | EC number(s) | Activity (function) | ND | Ar | Eu | Bdt | Str | Pro | Frm | Act | Oth |
| NfsB | 2,632/18 | 1.3.1.x, 1.5.1.x, 1.6.5.x, 1.6.99.x | Diverse | 2 | 1 | — | 20 | — | 54 | 18 | 1 | 4 |
| Hub | 2,540/3 | 1.3.3.x | Diverse | 3 | 9 | — | 17 | 1 | 7 | 50 | 3 | 10 |
| NfsA | 2,299/20 | 1.5.1.x, 1.6.3.x, 1.6.5.x, 1.6.99.x | Diverse | 4 | 1 | — | 7 | — | 35 | 41 | 8 | 4 |
| SagB | 1,936/5 | 1.3.1.x, 3.4.21.x | Azole oxidation (TOMM biosynthesis) | 5 | 7 | 1 | 5 | 7 | 26 | 24 | 13 | 12 |
| unk1 | 1,769/3 | — | Unknown | 6 | — | — | — | 1 | 90 | — | 3 | — |
| MhqN | 1,688/5 | 1.6.5.x, 1.6.99.x | Diverse | 3 | 2 | 3 | 11 | — | 27 | 44 | 3 | 7 |
| Frm2 | 1,568/2 | 1.6.5.x | Quinoline reduction (redox stress) | 4 | 1 | 13 | 6 | — | 20 | 53 | 2 | 1 |
| PnbA | 1,455/7 | 1.6.5.x, 1.6.99.x | Diverse | 4 | — | 2 | — | 2 | 66 | 7 | 17 | 2 |
| TdsD | 943/1 | 1.5.1.x | FMN reduction | 5 | 5 | — | 4 | 1 | 50 | 5 | 21 | 9 |
| RutE | 861/1 | 1.1.1.x | Malonate semialdehyde reduction (pyrimidine catabolism) | 4 | — | — | — | 6 | 80 | — | 10 | — |
| BluB | 859/4 | 1.13.11.x, 1.16.8.x | Unknown (FMN fragmentation) | 7 | 5 | — | 1 | 5 | 61 | 3 | 14 | 4 |
| unk2 | 827 | — | Unknown | 3 | 5 | 1 | 18 | — | 6 | 57 | 2 | 8 |
| unk3 | 789 | — | Unknown | 6 | — | 3 | 17 | — | 2 | 67 | 1 | 4 |
| Acg | 773/5 | — | Unknown (virulence) | 8 | 1 | — | 12 | 13 | 20 | — | 44 | 2 |
| Iyd | 625/12 | 1.21.x | Dehalogenation (iodine salvage) | 13 | 3 | 25 | 5 | 14 | 29 | — | 9 | 2 |
| unk4 | 623 | — | Unknown | 10 | — | — | — | 15 | 1 | — | 73 | 1 |
| unk5 | 533 | — | Unknown | 3 | 2 | 1 | 4 | — | 1 | 70 | 14 | 5 |
| FeS | 529/2 | 1.6.99.x | Nitroaromatic reduction | 2 | 7 | 3 | 7 | — | 28 | 47 | 2 | 4 |
| unk6 | 287 | — | Unknown | 4 | — | — | 14 | — | 20 | 36 | 5 | 21 |
| FbiB | 242/2 | 6.3.2.x | Unknown (F420 biosynthesis) | 5 | — | — | — | 30 | — | — | 65 | — |
| unk7 | 135 | — | Unknown | 3 | 1 | — | 21 | — | 35 | 9 | 24 | 7 |
| unk8 | 129 | — | Unknown | 1 | — | — | — | 5 | — | — | 93 | 1 |
| unk9 | 71 | — | Unknown | 7 | — | — | — | — | 1 | 88 | 3 | 1 |
| unk10 | 59 | — | Unknown | 3 | — | 2 | 48 | — | 10 | 28 | 2 | 7 |
| unk11 | 14 | — | Unknown | — | — | 100 | — | — | — | — | — | — |
| Remainder | 84 | — | Unknown | 7 | 11 | 1 | — | 1 | 6 | 11 | 32 | 31 |
| Superfamily | 24,270 | — | Diverse | 4.9 | 2.6 | 2.5 | 8 | 3 | 35 | 26 | 13 | 5 |
Act, Actinobacteria; Acg, acr coregulated gene; Ar, Archaea; Bdt, Bacteroidetes; BluB, Blush B; Eu, Eukaryota; FbiB, F420 biosynthetic pathway B; Frm, Firmicutes; Frm2, fatty acid repression mutant 2; Iyd, Iodotyrosine dehalogenase; MhqN, 2-methylhydroquinone reductase N; ND, sequences typically originating from metagenomic surveys; NfsA, nitrofurazone sensitivity A; NfsB, nitrofurazone sensitivity B; Oth, other; PnbA, p-nitrobenzoate reductase A; Pro, Proteobacteria; RutE, pyrimidine utilization E; SagB, SLS-associated gene B; Str, Streptomycetales; TdsD, Thermophilic desulfurization D.
To the best of our knowledge.
Subgroup activity and function were assigned based on literature associated with canonical members.
Taxonomical frequencies are based on UniProtKB/National Center for Biotechnology Information data retrieved for each subgroup member.
Multidomain enzymes.
Taxonomic profiling numbers represent percentages from all NTR enzymes.
Fig. 3. The NTR superfamily scaffold. (A) The NTR superfamily domain: An overlay of 17 representative NTR structures showing the conserved α+β FMN binding fold that was generated using MUSTANG-MR (76) at a sieving level of 2.0 Å. (B) A 2D topology map of the minimal NTR scaffold colored from blue (N terminus) to red (C terminus) with numbered α-helices and β-strands. (C) A ribbon representation of the hub subgroup structure PDB ID code 3E39 with monomers colored in gray and red, respectively. FMN is depicted in stick form with carbons in yellow. (Inset) Key FMN interacting residues: The FMN moiety and interacting active site residues are displayed in stick form and labeled.
Fig. 4. Structural analysis of the NTR superfamily. (A) A structure similarity network of the NTR superfamily. Each node represents a crystal structure, colored by subgroup as per Fig. 2 (red nodes represent hub subgroup members). Nodes are filled according to the presence or absence of the structural extensions inserted in any of the three hot spot sites, as depicted by the key (Inset). Edges represent pairwise structural similarity scored <0.746, as measured by TM-align. (B) A diagram of the structural diversity observed at the E1, E2, and E3 insertion sites relative to one FMN binding active site of the enzyme. A cartoon representation of a hub protein structure (PDB ID code 3E39) is shown with monomers depicted in gray and red. The locations of the E1, E2, and E3 structural insertion points are indicated by spheres that depict the bordering residues of each insertion (E3 has only one bordering residue, as it extends the C terminus). The FMN molecule is shown in a stick model with carbons colored in yellow. (Inset) Boxes display examples of subgroup-specific diversity at each extension site labeled by PDB ID code and subgroup. Extensions are colored by subgroup as per Fig. 2.
Fig. 5. Conservation of FMN-interacting positions across the NTR superfamily. (A) A representative SSN of the NTR superfamily is shown with nodes colored by the most frequent residue type found in the FMN phosphate moiety interacting position. (Inset) Ribbon representation of the active site of hub subgroup structure PDB ID code 3E39 is shown with FMN depicted in stick form with carbons in yellow. The residue (arginine) interacting with the phosphate moiety is circled. (B) A representative SSN of the NTR superfamily is shown with nodes colored by the most frequent residue type found in the re-loop position. (Inset) Ribbon representation of PDB ID code 3E39 (as per A) with the re-loop residue (leucine) circled. Note that the re-loop residue originates from the alternative chain of the homodimer (shown in red).
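The per-node coloring in Fig. 5 relies on finding the most frequent residue type at one alignment position among the sequences grouped into each node. A minimal sketch of that tally follows; the node names and residue lists are hypothetical toy data, and the function is an illustration rather than the authors' code.

```python
from collections import Counter


def most_frequent_residue(residues):
    """Return (residue, percent) for the most common residue at one position.

    `residues` holds the one-letter amino acid code observed at a single
    aligned position in each member sequence of a node. On ties, Counter
    returns the residue encountered first.
    """
    counts = Counter(residues)
    residue, n = counts.most_common(1)[0]
    return residue, 100.0 * n / len(residues)


# Hypothetical residues at the FMN phosphate-interacting position, per node.
node_members = {
    "node_a": ["R", "R", "R", "K"],
    "node_b": ["L", "L", "M", "L", "L"],
}

node_summary = {
    node: most_frequent_residue(residues)
    for node, residues in node_members.items()
}
```

The percentage returned alongside the residue is the same kind of quantity as a conservation value: the share of member sequences carrying the dominant residue at that position.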
Conservation and location of functional residues within extensions
| Subgroup | Reaction or function | Location of catalytic residues | Conservation, % |
| NfsB | Reduction of a diverse substrate range | E1 & E2 | 34–55 |
| NfsA | Reduction of a diverse substrate range | E3 | 19–60 |
| MhqN | Diverse catalysis | E1 & E2 | 5–51 |
| Frm2 | Reduction of 4NQO; oxidative stress | E1 & E2 | 81 |
| PnbA | Reduction of a diverse substrate range | E1 | 10–72 |
| BluB | FMN fragmentation | E2 & E3 | 99–100 |
| Iyd | Dehalogenation of aromatic compounds | E1 | 92–100 |
| FbiB | Biosynthesis of the F420 flavin cofactor | E1 & E2 | 98 |
Location of catalytic residues.
Percentage conservation; Dataset S2 includes further details.
Fig. 6. A phylogenetic reconstruction of the NTR superfamily. Branches are colored and labeled by subgroup; dispersed red branches represent hub subgroup sequence sets, and black branches represent members of the remainder subgroup. The eight hub SSG-5 branches are labeled (H5). Circles represent branching points with probabilities >0.9; triangles represent probabilities >0.8.