Distinct Modes of Hidden Structural Dynamics in the Functioning of an Allosteric Polysaccharide Lyase.

Abstract

Dynamics is an essential process to drive an enzyme to perform a function. When a protein sequence encodes for its three-dimensional structure and hence its function, it essentially defines the intrinsic dynamics of the molecule. The static X-ray crystal structure was thought to shed little insight into the molecule's dynamics until the recently available tool "Ensemble refinement" (ER). Here, we report the structure-function-dynamics of PanPL, an alginate-specific, endolytic, allosteric polysaccharide lyase belonging to the PL-5 family from Pandoraea apista. The crystal structures determined in apo and tetra-ManA bound forms reveal that the PanPL maintains a closed state with an N-terminal loop lid (N-loop-lid) arched over the active site. The B-factor analyses and ER congruently reveal how pH influences the functionally relevant atomic fluctuations at the N-loop-lid. The ER unveils enhanced fluctuations at the N-loop-lid upon substrate binding. The normal-mode analysis finds that the functional states are confined. The 1 μs simulation study suggests the existence of a hidden open state. The longer N-loop-lid selects a mechanism to adopt a closed state and undergo fluctuations to facilitate the substrate binding. Here, our work demonstrates the distinct modes of dynamics; both intrinsic and substrate-induced conformational changes are vital for enzyme functioning and allostery.

Year: 2022 PMID: 35912344 PMCID: PMC9336148 DOI: 10.1021/acscentsci.2c00277

Source DB: PubMed Journal: ACS Cent Sci ISSN: 2374-7943 Impact factor: 18.728

Introduction

Enzymes are molecules that sample fluctuating conformational states in order to drive reactions at an accelerated rate that would otherwise be difficult to achieve. The conformational states correspond to discrete energy levels, and thus the enzyme resides in multiple local minima on an energy landscape. How do enzymes access these energy states? The main mode is intrinsic: the dynamics encoded at the sequence level, modulated by extrinsic factors, drive the enzyme through the discrete energy levels. These motions occur on time scales ranging from microseconds to milliseconds and are vital for catalysis.[1] Like temperature, the pH of the solution is an essential extrinsic factor. Therefore, it would be interesting to explore the dynamics of enzymes triggered by the pH of the solution. In our continuing effort to understand the structure–function relationships in the polysaccharide lyase (PL)-5 family of proteins, we came across PanPL from Pandoraea apista. Interestingly, the protein was stable over a wide range of pH from acidic to alkaline, allowing us to explore the third dimension of the protein universe, “the dynamics”. Thus, PanPL was a suitable candidate to probe the structure–function–dynamics relationship experimentally. The polysaccharide lyases (PLs), present in all kingdoms of life, are a diverse and expanding class of carbohydrate-active enzymes (CAZymes), categorized into 42 families based on sequence similarities.[2] The PLs employ a β-elimination reaction mechanism to depolymerize anionic polysaccharides. Polysaccharides have a wide range of applications in the food, cosmetic, and pharmaceutical industries, thus making PLs valuable biotechnological tools.
Besides the enormous possibilities in industrial and medical applications, researchers are intrigued by PLs across the families for their interesting molecular attributes such as substrate specificity, conformational dynamics, catalytic mechanism, mode of action, and pH regulation. Such PLs include Smlt1473, which was reported to be a unique PL-5 enzyme that possesses a pH-regulated substrate specificity.[3] The vAL-1 from PL-14 switches its mode of action from endolytic to exolytic with a shift in pH from 7 to 10.[4] Some interesting studies discussed the dynamical exchange between open and closed conformations during substrate acquisition in different PLs, such as Sphingomonas sp. alginate lyase A-III (PL-5), Aly-SJ02 (PL-18), and BtHepIII (PL-12), which belong to families with various structural folds. In the case of the PL-5 alginate lyase AIII from Sphingomonas sp., substrate binding induced a dynamical conformational shift from an open cleft-like apo structure to a closed tunnel-like structure.[5] Normal-mode analysis of the PL-12 heparin lyase BtHepIII revealed a dynamical open–closed movement around the enzyme’s catalytic site.[6] In the PL-18 alginate lyase Aly-SJ02, substrate entry is facilitated by an open–closed gating function of the lid loops (A208-G217; A260-T265), achieved through their side-chain conformational changes.[7] The catalytic activity of Aly-SJ02 was abolished in an N214C/T236C mutant, which restricted the open–closed dynamics by disulfide bond formation. Several studies have described not only the variability of structural folds, with the emergence of new folds among PLs, but also the conserved catalytic geometry that facilitates the classical β-elimination reaction. Adding to this existing knowledge, we report here the biochemical and structural characterization of a novel PL, PanPL from Pandoraea apista. Further, our study on the intrinsic dynamics of PanPL provides new insight into the functioning of PL-5 family enzymes.
Our biochemical characterization suggests that PanPL cleaves alginate optimally at pH 7.0, with activity falling off at other pH values, and that it is specific to alginate. The kinetics are allosteric, with positive cooperativity. The product analysis reveals PanPL to be endolytic, with dimeric units as the major product. The allosteric behavior renders PanPL unique among the PL-5 family of proteins. To investigate the structural changes of PanPL as a function of pH, we determined the crystal structures by trapping the enzyme across the pH spectrum (3.5–8.5 in steps of 1.0 units) in the crystal lattice. The static structures across the pH spectrum did not reveal any observable conformational changes other than R48 rotameric transitions; thus, we turned to ensemble refinement to probe the conformational flexibility of the protein as a function of pH. To further delineate the substrate interactions and find a possible explanation for the endolytic nature of PanPL, we determined the crystal structure of the substrate-bound (tetra-ManA) complex. To obtain an in-depth view of the structural dynamics of the enzyme, we performed molecular dynamics simulations on the apo and substrate-bound structures. Our results suggest that PanPL has minimal atomic fluctuations and a discrete charge distribution at its functionally optimum pH, i.e., pH 7.0. Since we observed a closed tunnel in all the crystal structures across the pH spectrum, the closed state appears to have the lowest energy on the energy landscape and to be present as the major conformer.[8] Our analysis suggests that the N-loop-lid length is important for maintaining this catalytically competent closed state as the major conformer. Our MD simulation studies show the feasibility of accessing the open state in solution. The open state might be present as an alternative conformer of higher energy that lies near the closed state on the energy landscape and is thus transiently attainable.
However, the atomic fluctuations at the catalytic tunnel in apo structures, as demonstrated by ensemble refinement and normal-mode analysis, seem important for substrate acquisition, substrate positioning, and functioning.[9] The ensemble refinement of the substrate-bound structure suggests that the N-loop-lid of the closed catalytic tunnel undergoes fluctuations to facilitate substrate entry into the tunnel. Both the intrinsic dynamics to access an open state and the substrate-enhanced conformational flexibility of the N-loop-lid in solution collectively provide a basis for the allosteric nature of PanPL biochemistry.

Results and Discussion

Biochemical Characterization Reveals the Allosteric Nature of PanPL

We recorded the rates of unsaturated product formation by measuring UV absorbance at 235 nm at 25 °C for different substrate concentrations. Enzyme activity data were plotted against substrate concentration using OriginLab software (Figure S1). While fitting the curves, we noticed that the kinetic plots deviate from classical Michaelis–Menten behavior. At low substrate concentrations, the enzyme activity increases very slowly, and as the substrate concentration increases further, the curves take a sigmoidal shape. We therefore fitted the curves with the Hill equation, floating the Hill coefficient. The values of the Hill coefficient n for alginate and poly ManA are 4.4 and 2.6, respectively (n > 1). Such kinetics reflect positive cooperativity and indicate an allosteric mechanism. Allosteric behavior has not previously been reported among PL-5 family proteins (alginate lyases), making it a unique feature of PanPL. Further, we derived the enzyme kinetic parameters from the product formation rate data.[10] The Km and kcat values were calculated to be 0.0039 mM and 0.76 s–1 for alginate and 0.175 mM and 0.82 s–1 for poly ManA. The specific activities of PanPL for alginate and poly ManA are 7.94 μmol min–1 mg–1 and 8.56 μmol min–1 mg–1, respectively, which fall in the range of reported values.
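
The sigmoidal behavior described above can be illustrated with a minimal sketch of the Hill equation, v = Vmax·S^n / (K^n + S^n). Only the Hill coefficients (4.4 and 2.6) come from the text; the function name `hill_rate` and the unit values chosen for Vmax and the half-saturation constant are hypothetical and serve only to contrast n = 1 (Michaelis–Menten-like) with n > 1.

```python
def hill_rate(s, vmax, k_half, n):
    """Hill equation: rate at substrate concentration s.

    vmax   -- maximal rate
    k_half -- substrate concentration giving half-maximal rate
    n      -- Hill coefficient (n > 1 indicates positive cooperativity)
    """
    return vmax * s**n / (k_half**n + s**n)


# Hill coefficients reported in the text.
N_ALGINATE = 4.4
N_POLY_MANA = 2.6

# Hypothetical, dimensionless Vmax and K_half, used only to compare curve shapes.
VMAX, K_HALF = 1.0, 1.0

for s in (0.25, 0.5, 1.0, 2.0, 4.0):
    hyperbolic = hill_rate(s, VMAX, K_HALF, 1.0)
    alginate = hill_rate(s, VMAX, K_HALF, N_ALGINATE)
    poly_mana = hill_rate(s, VMAX, K_HALF, N_POLY_MANA)
    print(f"S={s:4.2f}  n=1: {hyperbolic:.3f}  n=4.4: {alginate:.3f}  n=2.6: {poly_mana:.3f}")
```

With n > 1 the rate stays close to zero at low substrate concentration and rises steeply near the half-saturation point, which is the slow onset followed by a sigmoidal rise noted in the kinetic plots.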

Overall Structure of PanPL Forms a Catalytic Tunnel

To provide a structural basis for the functioning of PanPL, we determined the crystal structures across the pH spectrum. The protein folds into an incomplete toroid comprising six α-helices that form the inner core, five α-helices that constitute the outer surface, and connecting loops, giving rise to an (α/α)5 incomplete toroid fold (Figure ). Figure Aii shows the topology diagram; inner helices are labeled Hi1 to Hi6 and outer helices Ho1 to Ho5, while L1 to L15 designate the loops. The N-terminal segment assumes a loop conformation [L1]; part of it partially covers the cavity created by the inner helices, constituting an N-loop-lid (aa 42–52). The N-loop-lid is followed by the inner helix Hi1, the longest helix, which curves with a bending angle of ∼27° at V63. Particularly interesting are the interactions of the N-loop-lid with the loop (aa 218–223): the R219 side chain atoms (NE, NH2) form hydrogen bonds with the backbone carbonyl oxygen atoms of R48 and A47 (Figure Biii). This gives the molecule a closed state. Figure A shows the surface representation, clearly revealing the tunnel formation.

Figure 1

Primary structure, topology, and overall structure of PanPL: Sequence overlaid with the secondary structural elements that the sequence segment can assume (Ai); signal peptide depicted in red color text. Topology of the PanPL depicting the sequential arrangement of helices connected through loops (Aii). Three-dimensional structure of PanPL, when aligned with its principal axis along the Z-axis (Bi), represents the toroidal shape. Overall fold appears to be (α/α)5 toroid (Bi, Bii, Biv). Inner helices are colored cyan, outer helices green, and interconnecting loops magenta. The loop segment represented in stick mode is the N-loop-lid (Bii); the inset shows the N-loop-lid locked in position by the side-chain to main-chain hydrogen bonds.

Figure 2

Tunnel architecture: entry site, constriction region, exit site, and substrate binding in the tunnel. Tunnel formed can be visualized as a passage defined by concentric shells with decreasing radius; the arrow points from the entry site to exit site. The tunnel’s side view defines a passage with a wide opening at the entry site following the constriction region and exit site (A). The substrate-bound structure shows the substrate being snugly fit into the tunnel cavity (B,C). The 2Fo-Fc electron density map quality for the substrate contoured at 1.2σ level is also shown (C).

Tunnel Architecture

The tunnel can be characterized by concentric shells, from the first shell to the fifth shell (Figure A). The tunnel is widest at the first shell and narrows as it progresses toward the fifth shell. The residues lying in each of these shells point into the tunnel (Figure ), and their physicochemical properties are imparted to the tunnel’s surface texture. Aromatic residues such as W175 line the floor, giving it a mosaic texture. The continuity from the first shell to the second shell defines the entry site, while the fifth shell and beyond give rise to the exit site. The third and fourth shells harbor the active site residues.

Figure 3

Detailed view of the tunnel: amino acid distribution and active site location. The catalytic tunnel resembles a wine glass and can be constructed using concentric shells with decreasing radii, from the first shell to the fifth shell. Each shell is defined by tracing the Cα atoms on its circumference. The first shell has the largest radius and favors polar residues, while the second shell is surrounded by polar and aromatic residues. The active site residues (Y226, N171, and H172) reside in the third and fourth shells. The fifth shell corresponds to the constricted region, favored by hydrophilic residues, which lies closer to the exit site. The active site residues are represented in stick mode in orange. The carbohydrate subsites [−1] and [+1] are also depicted and correspond to the third and fourth shells.

Location of Active Site Residues

In terms of tunnel architecture, the conserved active site residues[3] N171 and H172 reside in the fourth shell and Y226 in the third shell (Figure ). The residues N171 and H172 are located near the C-terminal end of Hi3, and Y226 is at the N-terminal end of Hi4. The tunnel is constricted [∼2 Å] at the constellation of active site residues (Figure S2). Since we observed that activity varies as a function of pH, it was intriguing to check whether the tunnel architecture also changes with pH. However, our crystal structures revealed that the tunnel architecture and the constriction at the active site residues remain invariant across the pH spectrum. The side chain conformations of the flanking residues also remain invariant as a function of pH, except for the R48 residue at the exit side (Figure S7).

Enzyme Substrate Interaction: PanPL tetra-ManA Complex Crystal Structure

Having established the details of the apo structure, we were interested in characterizing the substrate-bound structure. We reasoned that it would show, first, how the substrate fits into the constricted region of the active site and, second, the possible side-chain configurations in the tunnel and the interactions of the substrate with the active site residues as well as the rest of the tunnel residues. To cocrystallize the substrate (tetra-ManA)-bound protein, we created the active site mutants (Y226F, H172A, and N171L) and found that, among these, only the H172A mutant trapped the substrate in the crystal structure. We determined the crystal structure of the tetra-ManA-PanPL H172A complex at 2.2 Å resolution. The structure remained in the closed state. Table shows the data collection and refinement statistics. During model building, we could resolve only two sugar units in the electron density map (Figure A, Figure S8). Our analyses of the structure revealed several interesting features. First, the constriction of the tunnel is similar to that in the apo structures (Figure S2), and the catalytic active site residues are in the same conformations observed in the apo structures. Second, the first unit of the sugar is at the [+1] subsite, which is similar to the other PL-5 structures[3] (Figure S9). The substrate fits into the constricted tunnel. We analyzed the enzyme–substrate interactions. The crystal structure of the substrate-bound state of PanPL with tetra-ManA catalogues the substrate-interacting amino acid residues and the interaction types (Table ). The interactions are mostly H-bond interactions; however, the distribution of aromatic residues throughout the tunnel also facilitates optimum orientation, optimum binding, and substrate processing in the tunnel (Figure B).[11] The substrate positioning and optimum binding are crucial for catalysis. A dense network of H-bonds grips the [+1] and [−1] sugar units (Figure C).
The first sugar subunit of tetra-ManA is positioned at the [+1] subsite between the fourth and third shells and interacts with the active site residue N171 and the other fourth shell residues, i.e., R219 and Q116. The third shell active site residue Y226 interacts with the glycosidic bond O4 atom via an H-bond as well as with the O8 atom at the [−1] sugar unit (Figure C). The second shell residue Y42 interacts with both the [+1] and [−1] sugar subunits. The second sugar unit at the [−1] subsite has a greater number of interactions with second shell residues such as H225 and R319. We could see only diffuse electron density for the R48 residue in the catalytic tunnel and hence could not model its side chain. The overall electron density of the N-loop-lid in the substrate-bound structure is less ordered than in the apo structures. Additionally, we observed partial electron density for the sugar unit at the [−2] subsite (Figure S8) and no electron density at the [−3] subsite. Therefore, we could not model the two sugar units of tetra-ManA at the [−2] and [−3] subsites, possibly because the lower number of interactions with the flanking residues impairs their stable positioning in the tunnel.

Table 1

Crystallographic Data Collection and Refinement Statistics

Crystal Name	PanPL_pH3.5	PanPL_pH4.5	PanPL_pH5.5	PanPL_pH6.5	PanPL_pH7.5	PanPL_pH8.5	PanPL_H172A + tetra-ManA
PDB ID	7WXJ	7WXK	7WXL	7WXM	7WXN	7WXO	7WXP
Crystallization condition	0.1 M Citric acid pH 3.5, 25% w/v Polyethylene glycol 3350	0.1 M Sodium acetate trihydrate pH 4.5, 25% w/v Polyethylene glycol 3350	0.1 M BIS-TRIS pH 5.5, 25% w/v Polyethylene glycol 3350	0.1 M BIS-TRIS pH 6.5, 25% w/v Polyethylene glycol 3350	0.1 M HEPES pH 7.5, 25% w/v Polyethylene glycol 3350	0.1 M TRIS pH 8.5, 25% w/v Polyethylene glycol 3350	0.1 M Citric acid pH 3.5, 25% w/v Polyethylene glycol 3350
Data Collection Statistics
Beamline	ID29	ID29	ID29	ID29	NISER Home source, BRUKER-PROTEUM	ID29	NISER Home source, BRUKER-PROTEUM
Wavelength (Å)	1.07234	1.07234	1.07234	1.07234	1.54178	1.07234	1.54178
Detector	Pilatus	Pilatus	Pilatus	Pilatus	Photon 100	Pilatus	Photon 100
Processing software	XDS/autoproc	XDS/autoproc	XDS/autoproc	XDS/autoproc	Proteum 3	XDS/autoproc	Proteum 3
Cell Parameter	a = 99.01 Å, b = 46.10 Å, c = 64.59 Å; β = 91.1°	a = 45.97 Å, b = 65.41 Å, c = 98.89 Å	a = 36.09 Å, b = 84.30 Å, c = 91.16 Å	a = 36.11 Å, b = 84.62 Å, c = 91.37 Å	a = 36.03 Å, b = 84.62 Å, c = 91.30 Å	a = 35.92 Å, b = 85.48 Å, c = 90.82 Å	a = 46.65 Å, b = 53.01 Å, c = 121.41 Å
Space group	C2	P2₁2₁2₁	P2₁2₁2₁	P2₁2₁2₁	P2₁2₁2₁	P2₁2₁2₁	P2₁2₁2₁
Resolution range (Å)	64.58–2.10 (2.14–2.10)b	54.56–1.68 (1.70–1.68)b	61.89–1.45 (1.47–1.45)b	62.09–1.24 (1.26–1.24)b	62.06–2.14 (2.24–2.14)b	62.25–1.96 (1.99–1.96)b	60.71–2.20 (2.30–2.2)b
R_merge	0.076 (0.146)b	0.104 (0.823)b	0.123 (0.944)b	0.054 (0.658)b	0.185 (0.488)b	0.102 (0.810)b	0.058 (0.180)b
Unique reflections	15375 (408)b	34187 (1525)b	50270 (2460)b	78113 (2704)b	15538 (1605)b	20836 (1047)b	15844 (1839)b
I/σ(I)	16.7 (6.4)b	17.2 (2.1)b	15.2 (3.3)b	21.9 (2.1)b	10.54 (2.51)b	19.2 (3.3)b	26.84 (5.61)b
Completeness (%)	89.2 (48.9)b	97.5 (88.0)b	100 (100)b	96.1 (67.8)b	96.6 (80.3)b	100 (100)b	99.2 (94.6)b
Redundancy	5.1 (2.9)b	11.2 (5.9)b	12.1 (10.3)b	11.3 (4.9)b	10.37 (2.19)b	12.8 (12.6)b	9.86 (2.55)b
CC(1/2)	0.996 (0.975)b	0.998 (0.683)b	0.998 (0.822)b	1.0 (0.759)b	0.961 (0.650)b	0.998 (0.878)b	0.998 (0.948)b
Data Refinement Statisticsa
Refinement	phenix	phenix	phenix	phenix	phenix	phenix	phenix
R_work/R_free	0.1893/0.2297	0.2122/0.2554	0.1680/0.1903	0.1697/0.1860	0.2449/0.2854	0.1959/0.2325	0.1695/0.2171
ΔR = |R_work – R_free|	0.0404	0.0432	0.0223	0.0163	0.0405	0.0366	0.0476
Matthews Coefficient (Å³/Da)	2.11	2.13	1.99	2.00	1.99	2.00	2.15
Number of molecules in asymmetric unit (Z′)	1	1	1	1	1	1	1
Solvent content (%)	41.8	42.3	38.1	38.5	38.4	38.5	42.9
Number of atoms
Proteins	2431	2405	2413	2442	2409	2407	2394
Water and other (ligands or ions)	186	278	322	276	5	121	221
Overall B-factor (Å²)
Proteins	16.0	17.6	11.0	15.0	11.3	24.9	13.4
Ligands or ions			15.0	14.0			19.7
Water	19.4	25.4	19.9	22.5	3.0	26.9	13.9
RMSD from ideal values
rms bond length (Å)	0.002	0.006	0.006	0.009	0.004	0.003	0.009
rms bond angle (°)	0.49	0.85	0.81	1.06	0.85	0.66	0.91
Ramachandran plot statistics
Favored (%)	96.8	96.8	97.4	97.4	96.4	97.1	97.1
Allowed (%)	3.2	3.2	2.6	2.6	3.6	2.9	2.9
Outlier (%)	0	0	0	0	0	0	0

a R_work = ∑||F_o| − |F_c||/∑|F_o|. R_free is the R_work value for 5% of the reflections excluded from the refinement. R_merge = ∑|I – ⟨I⟩|/∑I.

b Values in parentheses are for the highest resolution shell.
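
The R-factor definition in the table footnote and the ΔR row of Table 1 can be checked with a short sketch. The R_work and R_free values are transcribed from Table 1; the function names and the two-reflection toy amplitudes in the last lines are hypothetical and only demonstrate the formula.

```python
def r_factor(f_obs, f_calc):
    """R = sum(||Fo| - |Fc||) / sum(|Fo|) over a set of reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def r_gap(r_work, r_free):
    """Delta R = |R_work - R_free|."""
    return abs(r_work - r_free)


# (R_work, R_free) per structure, transcribed from Table 1.
TABLE1_R_VALUES = {
    "PanPL_pH3.5": (0.1893, 0.2297),
    "PanPL_pH4.5": (0.2122, 0.2554),
    "PanPL_pH5.5": (0.1680, 0.1903),
    "PanPL_pH6.5": (0.1697, 0.1860),
    "PanPL_pH7.5": (0.2449, 0.2854),
    "PanPL_pH8.5": (0.1959, 0.2325),
    "PanPL_H172A + tetra-ManA": (0.1695, 0.2171),
}

for name, (r_work, r_free) in TABLE1_R_VALUES.items():
    print(f"{name}: delta R = {r_gap(r_work, r_free):.4f}")

# Toy example with made-up amplitudes, only to show the formula in action.
print(r_factor([10.0, 20.0], [9.0, 22.0]))
```

The printed gaps reproduce the ΔR row of the table to four decimal places.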

Figure 4

Enzyme–substrate interactions with tunnel residues in the PanPL–tetra-ManA complex: crystal structure and docked structure. The electron density for the two sugar units in the tetra-ManA bound crystal structure is shown (A). The mosaic floor formed by aromatic residues and the substrate is shown in the catalytic tunnel (B). The interacting residues of PanPL and two units of sugar molecules (orange) are shown in stick form (C). The polar interactions are shown as dashes. The interacting residues in the tetra-ManA docked structure are shown for all four sugar units (D).

Table 2

Enzyme–Substrate Interaction in Substrate-Bound Crystal Structure

ATOM:Residue	ATOM:Substrate	Subsite	Interaction Type	Distance (Å)
ND2:Asn171	O6A:BEM1	[+1]	H-BOND	2.8
NH1:Arg219	O6A:BEM1	[+1]	H-BOND	3.1
OE1:Gln116	O2:BEM1	[+1]	H-BOND	2.8
NE2:Gln116	O3:BEM1	[+1]	H-BOND	2.9
NE2:Gln116	O2:BEM1	[+1]	H-BOND	3.2
OH:Tyr42	O3:BEM1	[+1]	H-BOND	3.1
OH:Tyr226	O4:BEM1	[+1]	H-BOND	2.8
O:HOH298	O1:BEM1	[+1]	H-BOND	3.1
O:HOH298	O5:BEM1	[+1]	H-BOND	2.7
O:HOH298	O6A:BEM1	[+1]	H-BOND	3.6
OH:Tyr226	O2:BEM2	[-1]	H-BOND	3.0
NE2:His225	O2:BEM2	[-1]	H-BOND	2.9
NE2:His225	O3:BEM2	[-1]	H-BOND	3.2
OH:Tyr42	O6B:BEM2	[-1]	H-BOND	2.9
NH1:Arg319	O6A:BEM2	[-1]	H-BOND	2.9
NH2:Arg319	O6A:BEM2	[-1]	H-BOND	3.5
O:HOH282	O6B:BEM2	[-1]	H-BOND	3.5
O:HOH220	O3:BEM2	[-1]	H-BOND	3.6
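
As a small aid for reading Table 2, the contacts can be tallied per subsite. The rows below are transcribed from the table; the 3.2 Å cutoff used in the second call is a hypothetical threshold chosen for illustration, not a criterion stated in the text, and the function name is likewise an assumption.

```python
from collections import Counter

# (enzyme or water atom, subsite, distance in Å), transcribed from Table 2.
CONTACTS = [
    ("ND2:Asn171", "+1", 2.8),
    ("NH1:Arg219", "+1", 3.1),
    ("OE1:Gln116", "+1", 2.8),
    ("NE2:Gln116", "+1", 2.9),
    ("NE2:Gln116", "+1", 3.2),
    ("OH:Tyr42", "+1", 3.1),
    ("OH:Tyr226", "+1", 2.8),
    ("O:HOH298", "+1", 3.1),
    ("O:HOH298", "+1", 2.7),
    ("O:HOH298", "+1", 3.6),
    ("OH:Tyr226", "-1", 3.0),
    ("NE2:His225", "-1", 2.9),
    ("NE2:His225", "-1", 3.2),
    ("OH:Tyr42", "-1", 2.9),
    ("NH1:Arg319", "-1", 2.9),
    ("NH2:Arg319", "-1", 3.5),
    ("O:HOH282", "-1", 3.5),
    ("O:HOH220", "-1", 3.6),
]


def count_by_subsite(contacts, max_distance=None, include_water=True):
    """Count listed contacts per subsite, optionally filtering by distance or water."""
    counts = Counter()
    for atom, subsite, distance in contacts:
        if max_distance is not None and distance > max_distance:
            continue
        if not include_water and "HOH" in atom:
            continue
        counts[subsite] += 1
    return dict(counts)


print(count_by_subsite(CONTACTS))                       # all listed contacts
print(count_by_subsite(CONTACTS, max_distance=3.2))     # hypothetical cutoff
print(count_by_subsite(CONTACTS, include_water=False))  # protein atoms only
```

Counting this way makes it easy to see how many of the listed contacts at each subsite are short, direct protein–sugar hydrogen bonds and how many are longer or water-mediated.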

Since we observed only two units in the enzyme–substrate-bound crystal structure, we decided to perform a docking study of PanPL and tetra-ManA to investigate the interacting residues with sugar subunits at [−2] and [−3] subsites.

Docking of PanPL with tetra-ManA

We performed the docking calculation for tetra-ManA and the PanPL pH 6.5 apo structure using the Rosetta flexible docking protocol.[12] The docked structure showed the interacting residues for the sugar unit at the [−2] subsite (Figure D). The first shell residue K56 interacts with the sugar unit at the [−2] subsite, and no interaction is observed for the sugar unit at the [−3] subsite (Table S2). The reduced number of interactions grants flexibility at the [−2] and [−3] subsites; therefore, the electron densities for these two sugar subunits were untraceable in the substrate-bound crystal structure. Similarly, a loss of interactions at the corresponding end of the polysaccharide substrate was reported in the umbrella sampling (US) analysis of Glycoside Hydrolase 7 cellobiohydrolase (GH7 CBH),[13] where the leading two subunits of the sugar chain had a greater number of H-bond interactions.

Role of Active Site Residues in Enzyme–Substrate Interaction

The characteristic (α/α)5 incomplete toroid fold of the PL-5 family is maintained across the pH spectrum, indicating robust pH stability of the fold. The heart of the catalytic site of PanPL resides in the closed tunnel. The active-site geometry remains unaltered in crystal structures across the pH spectrum. To delineate the binding mode of a substrate in the tunnel, we generated the inactive mutants N171L, H172A, and Y226F and attempted cocrystallization with the tetra-ManA substrate. The cocrystallization of tetra-ManA with N171L did not yield a substrate-bound structure; however, we could trap the ligand in the H172A mutant. In the H172A substrate-bound structure, we find that the side-chain polar atoms of N171 form hydrogen bonds with the carboxylate group of the sugar unit at the [+1] subsite. This observation indicates that, in the N171L mutant, even though the electrostatic charge distribution favors substrate entry into the tunnel, the hydrophobic L171 side chain cannot provide the critical contact for substrate stabilization and positioning in the tunnel. This supports the idea that the neutralizer active site residue N171 plays a critical role in substrate binding by establishing the initial contact with the sugar residue at the [+1] subsite in the catalytic tunnel. The crystal structure of the substrate-bound H172A mutant reveals the orientation of the sugar subunits in the catalytic tunnel and the interactions with the other flanking amino acids that hold the substrate precisely. Though the H172A mutation renders the enzyme inactive, the role of H172 as a catalytic base or proton abstractor cannot be established, as the C5–H is not oriented so as to be accessible to H172. The substrate-bound crystal structure suggests that the catalytic mechanism is a syn β-elimination reaction. Based on our docking and biochemical studies, we suggest that the functionally critical H172 residue acts as a substrate stabilizer in the case of the tetra-ManA substrate.
This is further supported by the crystal structure of the Smlt1473-tetra-ManA complex (PDB id 7FI0).[3] Only in the case of guluronic acid (the C5 epimer of ManA) does the histidine residue in the active site act as a catalytic base, where the C5–H orients toward it and an anti β-elimination reaction takes place. The catalytically inactive Y226F mutant establishes the role of Y226 as both proton abstractor and proton donor for the ManA substrate, consistent with a syn β-elimination reaction mechanism. Since all the crystal structures were in a closed state, we were curious to see whether crystal packing played a role in restricting the degrees of freedom of the N-loop-lid. Thus, we performed crystal packing analysis on all the structures to obtain the details of interactions between symmetry-related molecules, with a focus on the N-loop-lid. Except for the H172A bound structure and the N171L and Y226F apo structures, we found a hydrogen bond interaction (crystal contact) with symmetry-related molecules at one end of the N-loop-lid (near either its N-terminal or C-terminal end) (Table S4), while the rest of the loop is completely exposed to the solvent in the lattice (Table S5). This suggests that, irrespective of variations in cell parameters and space group, the loop is not restricted by crystal packing.

pH Modulates the Electrostatic Surface Charge

Further, as the pH changes the ionization states, we calculated the electrostatic surface charge distribution for the structures determined across the pH spectrum (Figure ). The electrostatic models revealed that at lower pH (pH 3.5 and 4.5) the surface charge was more electropositive, and toward higher pH (pH 5.5 to 8.5) the blue patch faded away, making the surface more electronegative. For catalysis to occur, the anionic substrate must enter the tunnel precisely. In the catalytic activity range (pH 5.5 to 7.5), the electrostatic surface charge helps to avoid nonspecific binding of the substrate. At pH 6.5, the surface electropositive charge, i.e., the blue patch, becomes more confined around the catalytic tunnel, which attracts the substrate optimally toward the tunnel. The electrostatic interactions between the positively charged tunnel and the negatively charged polysaccharide substrate can bring about the substrate acquisition step of catalysis. In addition, the mosaic floor of aromatic residues in the tunnel (Figure B) helps the sugar rings of the substrate to orient properly.[11] This is in line with the Umbrella Sampling (US) simulation study of Glycoside Hydrolase 7 cellobiohydrolase, in which the strong electrostatic interaction of the sugar chain with the polar residues in the tunnel is the driving force for cellulose chain processivity and the conserved aromatic residues also facilitate substrate processing.[13]
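
How pH shifts ionization states can be sketched with the Henderson–Hasselbalch relation for isolated titratable groups. This is not the electrostatic surface calculation used for the figure; the model pKa values below are approximate textbook numbers assumed for illustration, not values computed for PanPL, and the function names are hypothetical.

```python
def fraction_protonated(ph, pka):
    """Henderson-Hasselbalch: fraction of a titratable group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# residue: (approximate model pKa, True if the protonated form carries +1).
# These pKa values are illustrative assumptions for free side chains.
MODEL_GROUPS = {
    "Asp": (3.9, False),
    "Glu": (4.3, False),
    "His": (6.0, True),
    "Lys": (10.5, True),
    "Arg": (12.5, True),
}


def side_chain_charge(residue, ph):
    """Average charge of one side chain at the given pH under the simple model."""
    pka, cationic_when_protonated = MODEL_GROUPS[residue]
    f = fraction_protonated(ph, pka)
    # Basic groups: +1 when protonated. Acidic groups: -1 when deprotonated.
    return f if cationic_when_protonated else f - 1.0


for ph in (3.5, 4.5, 5.5, 6.5, 7.5, 8.5):
    charges = ", ".join(
        f"{res} {side_chain_charge(res, ph):+.2f}" for res in MODEL_GROUPS
    )
    print(f"pH {ph}: {charges}")
```

Under this simplified model, acidic side chains are only partly ionized at pH 3.5–4.5 and histidine is largely protonated, which is qualitatively consistent with a more electropositive surface at low pH; in a folded protein the actual pKa values can differ substantially from these model numbers.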

Figure 5

B factor putty representation of crystal structures and electrostatic surface charge distribution of PanPL across the pH spectrum. The plot represents the enzyme activity as a function of pH and corresponding B-factor putty representation and electrostatic surface charge distribution of PanPL structures determined across the pH spectrum (3.5–8.5 in steps of 1.0 units). The PanPL shows higher enzyme activity in the pH range of 5.5–7.5, with optimal pH of 7.0. The N-loop-lid vibrations at pH 3.5, 4.5, and 8.5 are relatively higher in contrast to pH 5.5, 6.5, and 7.5. pH 5.5 has the least vibration across the pH spectrum. The electropositive charge is distributed throughout the surface at pH 3.5 and pH 4.5, which might lead to nonspecific binding of anionic substrates. Within the range pH 5.5–7.5, the electropositive charge becomes confined around the catalytic tunnel which might guide the anionic substrate into the tunnel. At pH 8.5, the electronegative surface charge increases and becomes less attractive to an anionic substrate. The comparison of fluctuations and electrostatic surface charge distribution among different pH structures provides a basis of pH optimum of enzyme activity around pH 6.5–7.0.

pH Tunes the Structural Flexibility and Atomic Fluctuation

As the B-factor can assess the regions of flexibility independently, to gain further insight into flexible regions, we performed B-factor analysis on the structures.

B-Factor Analyses

For each apo structure, the average B-factor of the main chain atoms was calculated per residue, as the main chain atoms are responsible for the flexibility of the protein structure (Figure S5). Since flexible regions have high B-factors in contrast to the core of the protein, we identified the regions with peaks as the flexible segments of the protein. We sorted the B-factor plots into two groups. The pH 5.5, 6.5, and 7.5 structures follow a similar trend and fall into group 1, while the pH 3.5, 4.5, and 8.5 structures fall into group 2. We examined the peaks in the plots in the context of structural features. We observed peaks around the N-loop-lid, a part of the first inner helix (Hi1), and the loop segments (Figure S5). In group 1, the peak heights of L4, L6, and L13 are higher than those in group 2. In group 2, the peaks of the N-loop-lid, a part of the inner helix (Hi1), L3, L5, and L9 are higher than those in group 1. To make the raw average B-factors independent of crystal packing and resolution effects, we calculated the normalized B-factor. Figure shows the plots of the normalized B-factor for main-chain atoms per residue. We used a threshold of 1.0 units of normalized B-factor to identify the peaks corresponding to the flexible protein segments. The plots confirm the grouping made from the previous plots and agree with the inference from the average B-factor plots. Furthermore, we observed that the fluctuations at the N-loop-lid and a part of the first inner helix (Hi1) are in sync with those at L3, but out of sync with those at L2 and L4.
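
The normalization and thresholding step can be sketched as follows. The text does not state the normalization formula, so this example assumes the common z-score convention, (B − mean)/standard deviation over all residues of a structure; the function names and the toy B-factor values are hypothetical.

```python
from statistics import mean, pstdev


def normalize_b_factors(b_values):
    """Z-score normalization of per-residue B-factors (assumed convention)."""
    mu = mean(b_values)
    sigma = pstdev(b_values)
    return [(b - mu) / sigma for b in b_values]


def flexible_positions(b_values, threshold=1.0):
    """Indices whose normalized B-factor exceeds the threshold (1.0 in the text)."""
    normalized = normalize_b_factors(b_values)
    return [i for i, value in enumerate(normalized) if value > threshold]


# Toy per-residue main-chain B-factors (made up): a rigid core and one flexible residue.
toy_b_factors = [10.0, 10.0, 10.0, 10.0, 30.0]

print(normalize_b_factors(toy_b_factors))
print(flexible_positions(toy_b_factors))
```

Because each structure is scaled by its own mean and spread, peaks can be compared between data sets collected at different resolutions, which is the purpose of the normalization described above.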

Figure 6

Normalized B-factor fluctuation per residue for PanPL structures at different pH. The plot shows the normalized B-factor fluctuations at different pH for PanPL crystal structures. The plots for different pH are grouped into two groups based on the fluctuation pattern observed at different regions of PanPL structure. The secondary structural components are represented on the top of each group. The vertical dotted bars are used to show the vibrational hotspots distinctly. The fluctuation patterns for the pH 3.5, 4.5, and 8.5 structures resemble one another and represent the group with low to no enzyme activity. The pH 5.5, 6.5, and 7.5 structures show similar fluctuation patterns and represent the group within the range of enzyme activity.

The dynamics/fluctuations in the tunnel, near and distal to the active site, are under the influence of changes in pH values. These analyses suggest that a change in pH modulates the dynamics of the molecules and puts the molecule in different energy states. Thus, while some states in the continuum make the molecules functionally active, some states drive the molecules to be nonfunctional or have reduced activity. We analyzed the B-factor putty representation of all crystal structures at different pH values in apo form, and a definite trend of atomic fluctuations is observed at the N-loop-lid as a function of pH (Figure ). The N-loop-lid vibrates more at the extreme ends of the pH spectrum, compared to the active pH range 5.5–7.0. Therefore, we anticipated that performing normal-mode analysis on the molecule would reveal the functionally correlated/concerted motions.

Normal Mode Analysis: A Study of Functional Dynamics

We used normal-mode analysis to get insights into the functional dynamics. The normal modes provide internal molecular motions, giving rise to indications of flexibility and rigid parts in the proteins and correlated motions.[14] We performed all-atom normal-mode analysis, considering the Cartesian coordinate system. Since we had structures across the pH spectrum, we performed the ensemble normal-mode analysis. Figure S6 shows the fluctuations derived from normal-mode analysis; the pattern resembles the B-factor plot depicting the flexible parts in the protein structures. This clearly shows that the N-loop-lid is highly flexible. To gain further insights into the nature of motions, we generated the trajectories for mode 7, the first nontrivial mode. The vector field representation of these trajectories is shown in Figure S6C and provides visualization of normal mode 7, denoting the possible correlated functional motions. As the structures were available across the pH spectrum, we performed principal component analysis (PCA) to decode the relationship between different structures. We found that the pH 5.5, 6.5, and 7.5 structures lie closer in the PC-space, while structures at pH 3.5, 4.5, and 8.5 are all spread across the PC-space (Figure S6E). This suggests that the conformations (structures) closer to optimal functioning may be constrained in a limited space while other conformations are widely distributed.

Ensemble Refinement Displays the Conformational Substates in Crystal Structures

The static crystal structures explain little about the mechanism of product expulsion followed by the substrate processability in the narrow catalytic tunnel. Some local fluctuations are expected here to bring about the flexibility in the enzyme to process long-chain polysaccharide without compromising its stability. We utilized ensemble refinement (ER) to sample the hidden alternative conformers in the static crystal structures.[15] ER uses time-averaged refinement and molecular dynamic simulations for sampling local molecular motions to generate an ensemble of structures. We have performed ER on each structure and determined optimal empirical refinement parameters (t, ptls, Tbath) to generate the ensemble of structures.[16] The ensembles of structures fit X-ray data better than single structure as validated by reduced Rfree values in contrast to single structures (Table S3). For each structure, ER produced a number of structures (Table S3); however for analysis to interpret the functional dynamics, we consider the equal number of structures in ensembles of each structure. Figure S11 shows the ensembles of apo structures. It indicates well-ordered residues in the protein core and flexible residues in the loop regions. ER modeled a significant number of alternative conformations in the N-loop-lid region. It reflected the conformational dynamics: highly flexible motions for the N-loop-lid region at pH 3.5, 4.5, and 8.5 and comparatively less flexibility at pH 5.5, 6.5, and 7.5 (Figure S13). This implies that the protein adopts a less dynamic state in its active state. The ensemble refinement of the substrate-bound structure shows that substrate acquisition enhances the fluctuation at the N-loop-lid region (Figure ). For substrate to enter in the catalytic tunnel, the loop lid undergoes local backbone dynamics along with side chain fluctuation. This substrate-induced loop flexibility samples a huge number of conformational substates for the loop residues. 
Even though the substrate was not bound in the N171L cocrystallized crystal structure, we found enhanced fluctuations at the N-loop-lid (Figure S12). This is not surprising, as the substrate in cocrystallization solution has induced the dynamics at the N-loop-lid, and those molecules assembled in a crystal lattice during the crystal growth phase. This implies that the substrate presence is sufficient to induce the dynamics in the loop. The loop dynamics was recorded in solution NMR studies on ABL kinase upon inhibitor binding, and this was divulged during the case study of ER on the ABL kinase crystal structure.[15,17] Since ensemble refinement reveals the loop dynamics, it indicates the occurrence of such dynamics in the solution during substrate acquisition. Thus, analyses of the resulting ensembles have provided details implying that atomic fluctuations are essential for functioning.

Figure 8

Dynamic behavior of N-loop-lid of PanPL. A and B show the substrate-induced fluctuation. A. Ensembles of structures (ribbons) for apo and substrate-bound states generated by ensemble refinement. Substrate binding enhances the fluctuation at the N-loop-lid. The plots show the rmsd per residue. B. B-factor putty representation of PanPL apo and substrate-bound structures. Substrate-bound structure shows a high atomic fluctuation at the N-loop-lid. C. Intrinsic dynamics. MD simulation analysis shows a closed tunnel (0 ns) to open forming a cleft-like structure (1 μs) during simulation.

Molecular Dynamics Simulation Shows a Hidden Open State

Our B-factor analysis on apo structures across the pH spectrum shows how fluctuations at the N-loop-lid are correlated with enzyme activity. The normal-mode analysis brought consistency to the previous data and established the N-loop-lid dynamics to be functionally relevant (Figure S6). Supporting this observation, the substrate-bound structure also reflects a huge vibration at the N-loop-lid (Figure S6). All these data motivated an in-depth study of the loop dynamics of the enzyme. Thus, we performed molecular dynamics simulation studies for the apo (pH 6.5) and substrate-bound crystal structures to observe the dynamic property of PanPL. We observed the time evolution of atomic coordinates of PanPL for 100 ns initially. Further, we performed the simulation for 1 μs, the time scale required for loop motion. We analyzed the structures at different time points and compared them with the initial structures for both apo and substrate-bound structures. It was interesting to realize that the major jumps in the rmsd plots at different transition time points reflect a shift of conformation from the closed state to the open state (Figure A,B). Further, we intended to investigate the dynamics of the H-bond interactions holding the N-loop-lid during the 1 μs molecular dynamics simulation.

Figure 7

Analysis of the molecular dynamics simulation trajectories of apo and tetra-ManA-bound crystal structures. Figure shows rmsd as a function of time (ns). A. rmsd plot for the apo structure trajectory with snapshots of the coordinates at different time points (green color, initial structure; magenta color, structures at several time points during simulation). The jump at 650 ns corresponds to the N-loop-lid opening and the transition from closed tunnel to open cleft. B. rmsd plot of substrate-bound crystal structure during 1 μs simulation with snapshots of coordinates at different time points. C. Atomic fluctuations per residue during the length of trajectory, suggesting the residue range ∼30–60 has large movement during the course of simulation.
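As an illustration of the rmsd-versus-time analysis, the following minimal sketch computes the rmsd of each frame against the initial structure and reports the largest frame-to-frame increase. It assumes that every frame has already been superposed on the reference and that atoms are listed in the same order. The function names are hypothetical and do not correspond to the simulation package used in the study.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    Assumes the two structures are already superposed (no fitting is done here).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must have the same number of atoms")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def rmsd_series(reference, frames):
    """rmsd of every trajectory frame with respect to the reference structure."""
    return [rmsd(reference, frame) for frame in frames]


def largest_jump(times_ns, values):
    """Return (time, increase) for the largest increase between consecutive frames."""
    index = max(range(1, len(values)), key=lambda i: values[i] - values[i - 1])
    return times_ns[index], values[index] - values[index - 1]
```

A jump located this way only flags a candidate transition; the snapshots around that time point still need to be inspected to confirm that the change corresponds to lid opening.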

N-Loop-Lid Dynamics and the Role of Lid Loop H-Bonds

Three hydrogen bond interactions between the R48 and A47 main chain carbonyl groups of the N-loop-lid (aa 42–52) and the side chain of the R219 residue loop (aa 218–223) favor the tunnel formation. This sort of hydrogen bond reportedly helps to maintain the fold’s architecture.[18] To investigate the significance of the H-bonds, we analyzed the trajectories of 1 μs molecular dynamics simulation data. In the case of the apo structure, our result shows the breaking and making of the H-bonds (Figure S15), as a result of back and forth N-loop-lid movement. During the simulation at 650 ns, the H-bond interactions were lost permanently, and the N-loop-lid moved away from the tunnel to adopt an open state without dismantling the overall scaffold. This open cleft formation was never observed in any of the PanPL crystal structures. This supports the notion that the open state is transiently accessible while a closed state is the most stable. However, above 650 ns, the open state prevails throughout the simulation. Further, in the case of tetra-ManA crystal structures, the hydrogen bonds were retained up to ∼100 ns (Figure S15). The rmsd plot shows the fluctuations (Figure C) at the N-loop-lid region for both apo and substrate-bound structures in a similar fashion as seen in ensemble refinement and B-factor trends (Figure ). Further, we created the R219L mutation to establish the importance of the H-bonds experimentally, which resulted in the loss of lyase activity confirmed by TBA assay (Figure S19).
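The making and breaking of the lid H-bonds can be followed frame by frame. The sketch below is a minimal illustration that assumes a distance-only criterion (donor–acceptor distance of at most 3.5 Å, a common convention that is not stated in the text) and takes a precomputed list of distances per frame. All names are hypothetical.

```python
def hbond_present(distances, cutoff=3.5):
    """Flag frames in which the donor-acceptor distance (in Å) is within the cutoff.

    The 3.5 Å distance-only criterion is an assumption; no angle term is used.
    """
    return [d <= cutoff for d in distances]


def occupancy(present):
    """Fraction of frames in which the H-bond is formed."""
    return sum(present) / len(present)


def permanent_loss_time(times_ns, present):
    """Time of the first frame after which the H-bond never re-forms.

    Returns None if the bond is still formed in the last frame.
    """
    last_formed = None
    for index, formed in enumerate(present):
        if formed:
            last_formed = index
    if last_formed is None:
        return times_ns[0]
    if last_formed == len(present) - 1:
        return None
    return times_ns[last_formed + 1]
```

Transient breaks followed by re-formation lower the occupancy but do not count as permanent loss, which separates back-and-forth lid movement from a lasting transition to the open state.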

Principal Component Analysis Clustering of MD Trajectories

To delineate the conformations spanned, we performed principal component (PC) clustering of MD trajectories of 100 ns simulation for both apo and substrate-bound structures.[19] The PC analysis gives the essential dynamics and samples major conformational changes occurring during MD simulation. The hierarchical clustering showed the presence of three states. The average structures for each group were analyzed and compared with the initial structures. We observed the presence of three distinct states, i.e., a closed state, an intermediate open state, and a widely open state, during simulation in both apo and substrate-bound structures. This result reflects the inherent flexibility of the lid loop of the enzyme (Figure S14).

Significance of N-Loop-Lid Length in Selecting Open/Closed or Only Closed State Mechanism

We compared the PL-5 crystal structures available in the PDB to observe the occurrence of both open cleft-like and closed tunnel-like architectures. The alginate lyase AIII from Sphingomonas sp. could access an open cleft-like configuration in the apo structure (1QAZ) and adopt a closed conformation to create a catalytically competent tunnel-like microenvironment in its substrate-bound crystal structure (4F13) following an induced fit mechanism (Figure S16). Whereas the crystal structures of alginate lyase from P. aeruginosa (4OZV), Smlt1473, and PanPL in both apo and substrate bound forms preferred only the closed tunnel conformation. The sequence alignment analysis (Figure S16) reveals that there is a deletion of 4 residues at the N-terminal long-helix region of alginate lyase AIII compared to the other PL-5 which shortens the stalk (the bent portion of the helix) of the N-loop-lid. This short stalk loosens the H-bond locking interactions necessary to maintain the tunnel and thereby access the open cleft conformation. However, substrate binding triggers the conformational transition (tunnel-like) to perform the catalysis. In other PL-5 enzymes structures, the N-terminal helix is bent to facilitate the lid loop to form a catalytically competent tunnel without accessing the open conformation in their crystal structures. This is feasible for a longer N-terminal helix. Here the length of the lid loop plays an important role in selecting the mechanism of dynamics, whether to follow (i) open and closed conformation or (ii) a predominantly closed state with functional fluctuations around the catalytic site to ensure catalysis. Our ensemble refinement analysis on the PanPL substrate-bound crystal structure confirms the occurrence of conformational fluctuation at the N-loop-lid backbone and side chain during substrate acquisition, preserving the catalytically competent closed state (Figure ). 
However, our MD analysis shows that the scope of accessing a transient open state by hidden dynamics at the loop lid cannot be ignored in the case of PanPL. These results suggest that the N-loop-lid of PanPL is inherently flexible and can access different modes of dynamics (an open/closed state transition and a closed state with functional fluctuation). The longer stalk of the N-loop-lid in PanPL predominantly selects the closed state with a functional fluctuation mechanism over the open/closed state transition. Similar kinds of divergent dynamics have been reported in the case of the enzyme E. coli dihydrofolate reductase (ecDHFR) and human dihydrofolate reductase (hDHFR), where a single residue insertion in hDHFR at the Met20 loop alters the dynamic mechanism considerably.[20] The ecDHFR Met20 loop (7 residues) has 2 different loop conformations, i.e., a closed and an occluded state, whereas the hDHFR Met20 loop (8 residues) predominantly has a closed conformation in its crystal structure. The ecDHFR utilizes the closed to occluded state conformational transition mechanism, whereas hDHFR undergoes fast fluctuations in its closed structure to facilitate the ligand binding and product expulsion.[20] Further, based on our sequence alignment analysis, we selected two PL-5 enzymes from Ralstonia pickettii (RpPL) and Burkholderia cenocepacia (BcPL) where a deletion of 10 residues at the N-terminal helix region was observed. We built homology models using the Rosetta Comparative Modeling module[21] for both open and closed states. In the predicted closed structures of RpPL and BcPL, the tunnels formed were too constricted to bind any substrates (Figure S16). Further, the cleft-like open-state structure was never observed to bind substrates. We also modeled RpPL and BcPL using AlphaFold.[22] Although the models show tunnel formation, the hydrogen bonds critical for maintaining the tunnel were missing. 
This in turn disrupts the tunnel architecture and hence affects the catalysis. In addition to this, the putative catalytic residue histidine is substituted by leucine in the cases of both RpPL and BcPL. We overexpressed RpPL and BcPL to test their activity. As anticipated, the enzymes did not give any confirmation of lyase activity upon TBA assay.

Allostery as a Result of Conformational Flexibility and Structural Dynamics

Besides NMA and MD simulation, the ensemble refinement of the crystal structure data of all pH values captures the hidden conformational substates in average crystal structures. In this ensemble, there exist different fractions of conformational substates with a narrow range of energy variation. These population distributions of conformers can be modulated by external perturbations like pH change, substrate binding, mutations, etc. Our ensemble refinement study of substrate-bound structures shows the N-loop-lid to be a vibrational hot-spot around the catalytic tunnel. The substrate acquisition causes local lid loop dynamics and corresponding conformational substates for the loop residues. Here the occurrence of substrate-induced loop flexibility along with loop residue side-chain conformational heterogeneity can explain allosteric behavior of PanPL. Along with the ensemble representation, the hidden open–closed state N-loop-lid dynamics suggested by our MD simulation analysis also might contribute to the positive cooperativity reflected in PanPL biochemistry. This type of positive cooperativity was never reported in PL-5 family enzymes previously. PanPL represents a monomeric allosteric enzyme which shows homotropic positive cooperativity with a single binding site. This allosteric property of PanPL adds the enzyme to the list of allosteric enzymes where dynamics drive allostery without noticeable structural changes. Hence, we suggest that N-loop-lid dynamics is the key factor for positive cooperativity of PanPL. Our overall studies of structure–function and dynamics of PanPL helped us explain the enzyme’s behavior holistically. Here, our work suggests inclusion of the dynamics to the classical structure–function aspects of all PL-5 family enzymes.

Conclusion

We have performed biochemical, structural, and molecular dynamic studies on a polysaccharide lyase PanPL belonging to the PL-5 family from the Gram-negative bacteria Pandoraea apista. The biochemical characterization establishes PanPL as an allosteric, endolytic alginate lyase. Our structural work reveals that PanPL folds as a pseudotoroid with an N-loop-lid arc over the active site residues, defining a tunnel with an entry, active site, and exit site. Furthermore, we observe that in the crystal structure, PanPL exists in a closed state for apo across the pH spectrum and substrate-bound form, the closed-state being catalytically competent. There is an intrinsic flexibility at the N-loop-lid as supported by the B-factor trend, ensemble refinement, and normal-mode analysis on all apo structures across the pH spectrum. Here pH attunes the electrostatic surface charge distribution as well as the loop flexibility to approach optimality at optimum pH. The ensemble refinement of the PanPL substrate-bound structure displayed the enhanced fluctuations at the N-loop-lid recorded during substrate acquisition maintaining the closed tunnel architecture. In some cases, substrate binding induces open to closed-state transition. Here, in closed state, the substrate enhanced the fluctuations in the loop. Nevertheless, the molecular dynamics simulation study suggests the presence of a hidden and transiently accessible open state in the energy landscape. Our study suggests that the insertion in the N-terminal helix stabilizes the low-energy closed state causing a shift in paradigm: to adopt the closed state with functional fluctuations over an open and closed state transition for the functioning. All these dynamic properties of PanPL collectively influence the biochemistry, and a unique trend of allostery or positive cooperativity is reflected in enzyme kinetics of PanPL.

Experimental Section

Cloning and Overexpression of PanPL

PanPL gene (GenBank: AJE99968.1) was cloned into a pET28(a) vector between NdeI and XhoI restriction sites. We optimized the overexpression of PanPL for E. coli Lemo21(DE3) cells. The plasmid was transformed into E. coli Lemo21(DE3) cells and spread on a prewarmed agar plate with antibiotics Kanamycin and Chloramphenicol. A single colony was inoculated into a 10 mL of LB broth with 10 μL of Kanamycin and Chloramphenicol. The same was left to grow overnight at 37 °C to obtain the primary culture. The next day, the primary culture was inoculated to 1 L of LB broth with 1 mL of antibiotics. After 3 h, the secondary culture was induced with 1.5 mM of IPTG (OD 0.6) and left for 12 h at 18 °C in the shaker incubator. The N-terminal signal peptide and the (His)6 tag were cleaved during the expression, rendering the protein tagless. The molecular weight of the tagless protein was calculated to be 34.9 kDa by the Expasy Protparam tool.[23]

Extracellular Secretion

We performed an alginate plate assay to confirm the secretion of PanPL into the periplasmic space. We prepared the LB agar plate with 1 mg of alginate for the assay, and the induced bacteria culture was spotted onto the agar-alginate plate and left for 24 h at room temperature. Later, the plates were flooded with 10% cetyl butyl solution, and we observed the clear sections to confirm the extracellular lyase activity.

Purification of PanPL

Anion exchange chromatography was used to purify the tagless protein. We used the RESOURCE-Q column (GE healthcare). As the theoretical pI of PanPL is 7.2, the lysis buffer was chosen to be at pH 9.0 to keep the protein negatively charged. We prepared the lysis buffer with the composition 50 mM Tris pH 9.0, 150 mM NaCl, 0.02% Triton X-100, and 5 mM β-mercaptoethanol. Cells were lysed in the lysis buffer using a tip-sonicator operating with a pulse rate of ON (5 s) and OFF (10 s) for 15 min at 45% amplitude. The lysate was centrifuged at 18000 rpm for 1 h. The supernatant was diluted with buffer 50 mM Tris pH 9.0 to maintain the salt concentration at 30 mM NaCl. Buffer A: 50 mM Tris pH 9, 30 mM NaCl, 5 mM β-mercaptoethanol, and B: 50 mM Tris pH 9, 1 M NaCl, 5 mM β-mercaptoethanol were prepared for the subsequent purification steps. The column was pre-equilibrated with buffer A, and the protein was loaded onto the column. The gradient was set between A and B buffers to elute protein. The PanPL eluted at 100 mM NaCl. To further purify, the size exclusion chromatography was employed using the Superdex 75 10/300 pg. We used the purified protein for biochemical studies and crystallization.
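The dilution step follows from C1·V1 = C2·V2. As a small worked sketch, assuming the diluent (50 mM Tris pH 9.0) contains no NaCl and ignoring any salt contributed by the cells, the volume of diluent needed to bring the lysate from 150 mM to 30 mM NaCl can be calculated as below. The function name and the 10 mL example volume are illustrative.

```python
def diluent_volume(c_initial_mM, c_target_mM, v_initial_mL):
    """Volume of salt-free buffer (mL) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2 and assumes the diluent contributes no salt.
    """
    if not 0 < c_target_mM <= c_initial_mM:
        raise ValueError("target must be positive and not exceed the initial concentration")
    v_final_mL = c_initial_mM * v_initial_mL / c_target_mM
    return v_final_mL - v_initial_mL


# Example: 10 mL of supernatant at 150 mM NaCl diluted to 30 mM NaCl.
extra_buffer_mL = diluent_volume(150, 30, 10)  # 40.0 mL, i.e., a 5-fold dilution
```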

TBA Assay

The standard colorimetric TBA assay[24] was performed to confirm the lyase activity of PanPL. For the TBA assay, we used three solutions: a periodate solution, an arsenite solution, and a TBA solution. 20 μg of protein was added to the substrates (500 μg/mL) dissolved in 200 μL buffers at different pH values ranging from 4.0 to 9.0. The reactions were allowed to proceed for 10 min. 50 μL of periodate solution was added to each reaction and incubated for 20 min. During the reaction, the unsaturated product formed results in a prechromogen in the presence of periodate. The excess periodate was destroyed with the addition of 200 μL of arsenite to the reaction mixture. 500 μL of TBA solution was then added to the reaction mixture, which was heated in a boiling water bath. The prechromogen reacts with TBA to give a pink coloration. The pink color confirms the lyase activity. To quantify the chromogen formed, we measured absorption at 550 nm. For the absorption spectroscopy, the blank buffer was prepared with a 1:1 reaction solution and cyclohexanone. The lyase activity of the PanPL mutants was assayed with the cell lysate, taking WT PanPL as the positive control. The TBA assay for RpPL and BcPL was done with the cell lysate and partially purified enzyme, respectively.

Enzyme Kinetics

The optimum pH of the PanPL lyase activity for different substrates was first determined using TBA assay and further confirmed using the enzyme kinetic assay. For a pH scan in the enzyme kinetics, we incubated the 50 μg of protein and 100 μM of substrates (alginate/polyManA) in buffers with pH ranging from 4.0 to 9.0 in steps of 1.0 units. For each pH, enzyme activity at 235 nm was recorded on Eppendorf Bio-spectrometer Kinetics using 10 mm quartz cuvette. The reaction was performed at room temperature. The unit of enzyme activity was μmol/min/mL. The optimum pH was found to be 7.0. For the optimum pH, the kinetic measurement was performed on 50 μg of protein sample with different substrate (polymannuronic and alginate) concentration ranging from 20 μM to 1 mM. The reaction volume of 500 μL was used in the experiment. We plotted the curves and fitted the data points using the OriginLab software.
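The fitting model is not named in this section. Since positive cooperativity is reported for PanPL, a Hill-type rate law, v = Vmax·[S]^n / (K^n + [S]^n), is one plausible choice, and the sketch below is written under that assumption. It estimates the Hill coefficient n and the half-saturation constant K from a linearized Hill plot, treating Vmax as known. This is an illustration only and does not reproduce the OriginLab fit; all names are hypothetical.

```python
import math


def hill_rate(s, vmax, k_half, n):
    """Hill equation: reaction rate at substrate concentration s."""
    return vmax * s**n / (k_half**n + s**n)


def hill_plot_fit(substrate, rates, vmax):
    """Estimate (n, K) by linear regression on the Hill plot.

    log10(v / (Vmax - v)) = n * log10(S) - n * log10(K)
    Assumes Vmax is known and every rate is strictly between 0 and Vmax.
    """
    xs = [math.log10(s) for s in substrate]
    ys = [math.log10(v / (vmax - v)) for v in rates]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    k_half = 10 ** (-intercept / slope)
    return slope, k_half
```

A fitted n greater than 1 indicates positive cooperativity, whereas n = 1 reduces the expression to Michaelis–Menten kinetics. A nonlinear least-squares fit of all three parameters is generally preferable to the linearized plot when Vmax is not well determined.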

Product Analysis

Alginate is a polymer of anionic monosaccharide with a negative charge on each unit. We utilized this size and charge proportionality to analyze the products. The final products of enzymatic degradation of substrates by PanPL were separated by anion exchange chromatography by Hi-trap HP (Cytiva) column. We incubated 5 mg of substrates (alginate/poly-ManA) with PanPL overnight in phosphate buffer pH 7.0. The enzyme was separated from the reaction mixture using 10 kDa cutoff Amicon-Ultra Centricon tubes. The presence of the cleaved products were detected by measuring the absorbance at 235 nm (Figure S17). The peak fractions were collected, lyophilized, and analyzed further using mass spectrometry (Figure S18).

Site-Directed Mutagenesis, Mutant Expression, and Purification

The active site mutants N171L, H172A, and Y226F and an R219L mutant were generated using a site-directed mutagenesis protocol. The PCR was performed using mutant primers and NEB Q5 polymerase. The PCR product was purified using a QIAquick PCR purification kit. The T4 PNK (NEB) enzyme was used for phosphorylation of PCR product ends, followed by ligation using Quick ligase (NEB). We degraded the parental strands in the product using the DpnI (NEB) enzyme. The DpnI-digested product (10 μL) was transformed into DH5α cells followed by amplification using primary culture. The primary culture was prepared using a single colony inoculated in 10 mL LB with 0.1% Kanamycin, incubated to grow overnight. Mutant plasmids were isolated from the cells using a Qiagen MiniPrep kit, and the mutation was confirmed by Sanger sequencing. The mutant plasmids were transformed into Lemo21(DE3) E. coli cells. The mutant expressions were optimized at different IPTG concentrations. We used PanPL wild-type purification protocols to purify the mutant proteins.

Crystallization

For wild-type PanPL initial crystal screening, we used the purified protein (5 mg/mL) in 50 mM Tris-HCl, pH 9.0, 200 mM NaCl. The single crystals were grown in crystallization condition 0.1 M Bis-Tris pH 5.5 and 25% PEG 3350. Later, we optimized the crystallization for different pH values. We used a buffer exchange protocol to prepare the protein samples at different pH values (50 mM citric acid buffer for pH 3.5, 50 mM acetate buffer for pH 4.5, pH 5.5, 50 mM HEPES buffer for pH 6.5, 50 mM sodium cacodylate buffer for pH 7.5, 50 mM Tris for pH 8.5) with 200 mM NaCl. To grow crystals at a specific pH value, both protein solution and crystallization solution were maintained at that particular pH. We used the commercially available crystallization “Index Screen”. Index 40–45 (pH 3.5 to pH 8.5) crystallization conditions gave crystals for the protein. Both hanging drop and sitting drop methods were utilized to crystallize the protein.

Active Site Mutants (N171L and H172A) Crystallization

N171L was crystallized at 100 mM citric acid pH 3.5, 25% PEG 3350. H172A, Y226F, and R219L were crystallized at 100 mM Bis-Tris pH 5.5, 25% PEG 3350.

Cocrystallization of Substrate with Active Site Mutant (N171L and H172A)

The mutant proteins N171L and H172A were incubated with tetra-ManA for half an hour, and crystallization was set up with 0.1 M citric acid pH 3.5 and 25% PEG 3350. Prior to the diffraction data collection, the H172A crystal was again soaked with 50 mM tetra-ManA overnight.

X-ray Diffraction Data Collection and Structure Solution

We first observed the PanPL protein crystal growth in the crystallization condition with pH 5.5. The crystal was flash-frozen using 20% ethylene glycol as a cryoprotectant and mounted on a home source diffractometer equipped with BRUKER’s rotating anode Cu Kα radiation source, cryo LN2 stream maintained at 100 K, and PHOTON 100 detector. The unit cell determining strategy implemented in the PROTEUM software from BRUKER was used to determine the cell parameters. The complete data sets were collected using a φ-scan, and diffraction data sets were recorded to a maximum resolution of 1.89 Å. The raw data sets were integrated and scaled using PROTEUM software. As sequence similarity with Smlt1473 was ∼60%, we used the molecular replacement (MR) method implemented in PHASER[25] and the polyala model of 7FHX (Smlt1473 crystal structure) as the probe to solve the phase problem. The refinement was performed using PHENIX,[26] and all the side chains were traced in the electron density map. The water molecules were placed using Fo-Fc maps contoured at a 3.0 σ level at positions that satisfy the geometric parameters for hydrogen bonds with the polar atoms. The model building, tracing of the side chains, and modeling of solvent molecules into the electron density were performed using the visualization program Coot.[27] The refinement was stopped once it converged (data not shown). We used this structure as a model probe in MR to solve the phase problem for the diffraction data sets collected for the PanPL crystals grown at other pH values and the mutants. Similarly, for the crystals of wild-type grown at pH 7.5, mutants N171L, H172A, and N171L cocrystallized with tetra-ManA, and H172A cocrystallized with tetra-ManA, the diffraction data sets were collected on the home source. For all of these crystals, 20% ethylene glycol was used as a cryoprotectant while flash-freezing. 
In the case of the H172A mutant cocrystallized with tetra-ManA, the crystal was soaked with 50 mM tetra-ManA substrate overnight before the crystal was flash-frozen. Tables and S1 show the data collection statistics. We used the molecular replacement method implemented in PHASER to solve these structures. The autobuild protocol implemented in phenix.autobuild was used to build the model, and phenix.refine was used to refine the structures.[26] In the N171L cocrystallized structure, we did not find the density for the substrate (tetra-ManA). However, in the case of the H172A cocrystallized structure, the substrate was modeled in the Fo-Fc map at the 3 σ level, and we could resolve only two sugar units in the map. For the crystals grown in conditions with pH values 3.5, 4.5, 5.5, 6.5, and 8.5, the diffraction data sets were collected at the ESRF ID29 beamline. The data for the Y226F crystal were collected at the ESRF ID30A beamline. Table and Table S2 show the data collection statistics. To solve the phase problem for all of these data sets, as mentioned before, we used the polyala model of the first structure solved at pH 5.5. We used phenix.autobuild for the model building and phenix.refine for the refinement. We modeled the water and solvent molecules using a Fo-Fc map contoured at a 3.0 σ level. Table and Table S2 show the refinement statistics for the structures.

Electrostatic Surface Charge Analysis

To create electrostatic surface charge map, the pqr models were generated by using pdb2pqr.[28] The pdb2pqr calculation was performed for all pH crystal structures by giving pdb files as input. We used PARSE as force field and PROPKA for pKa calculation at their corresponding pH, H bond, contacts, and salt bridges. The outputs were used to generate and to visualize the electrostatic potential surface map by Pymol APBS plugin in an electrostatic potential range (+5000 Ke/T to −5000 Ke/T) with 0.05 grid spacing.

Ensemble Refinement

We used the ensemble refinement (ER) protocol implemented in the Phenix software package.[16] First, the alternate conformations were stripped off, and occupancies were set at 1.0 in the structures for the ensemble refinement. For each pH structure, we performed the ER at various settings of ER parameters ptls, Tbath, T, and nmodels to figure out the optimal values. The resolution-specific parameters T and nmodels were allowed to configure by the program, while Tbath was set at values 10K, 5K, and 2K, and for each Tbath value, the ptls was set from 0.6 to 1.0 in steps of 0.1. In each ER, we have assigned a monomeric unit of protein as a single TLS group. Table S4 shows the ER statistics and associated parameters. The ER parameters were considered optimal that yield the model with low Rfree. We performed ER four more times independently using the optimized parameters to test the reproducibility. Since we had structures deduced across the pH spectrum, we adapted the ER approach to probe the atomic detail protein dynamics as a pH function. This helped us to provide an analysis of the structure–function–dynamics relationship.
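The parameter scan described above amounts to a small grid: three Tbath settings times five ptls settings, i.e., 15 ER runs per structure, with the lowest-Rfree run retained. The sketch below enumerates that grid and picks the best setting from a table of Rfree values. It does not invoke Phenix; the Rfree numbers in the usage note are placeholders, and the function names are hypothetical.

```python
from itertools import product

# Values taken from the text: Tbath of 10, 5, and 2 K; ptls from 0.6 to 1.0 in steps of 0.1.
TBATH_VALUES = (10, 5, 2)
PTLS_VALUES = tuple(round(0.6 + 0.1 * i, 1) for i in range(5))


def parameter_grid():
    """All (Tbath, ptls) combinations to be tested for one structure."""
    return list(product(TBATH_VALUES, PTLS_VALUES))


def best_setting(rfree_by_setting):
    """Return the (Tbath, ptls) pair that gave the lowest Rfree.

    rfree_by_setting: dict mapping (Tbath, ptls) -> Rfree from a finished ER run.
    """
    return min(rfree_by_setting, key=rfree_by_setting.get)
```

For example, with made-up results `{(10, 0.6): 0.25, (5, 0.8): 0.21, (2, 1.0): 0.23}`, `best_setting` returns `(5, 0.8)`, and that setting would then be rerun independently to check reproducibility.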

Homology Modeling

We used the comparative modeling method implemented in RosettaCM[21] to generate homology structural models of BcPL (GenBank: AIO37757.1) and RpPL (GenBank: ACS64514.1) in the closed and open states. Four structures, 4OZV, 4OZW, 4F13, 7FHX, and PanPL, were used as closed-state templates, while 1QAZ was used as the open-state template. We deleted the predicted signal peptide regions of RpPL and BcPL before proceeding with modeling. First, we generated sequence-based 3-mer and 9-mer fragments for the BcPL and RpPL sequences using the Rosetta server. Sequence alignments between the template structures (open state or closed state) and the target sequences (BcPL and RpPL) were then created. The partial threading protocol (partial_thread) of Rosetta was used to build threaded models of the target sequences from each sequence alignment and its corresponding template structure. We used all three stages of RosettaCM. In stage 1, the full-length model is generated from the sequence-based fragments (3-mer and 9-mer) and template-derived segments. In stage 2, the local structure of the full-length model is optimized and loops are closed. In stage 3, all-atom refinement of the backbone and side chains (fastrelax) is performed on the stage 2 model. Appropriate weights were defined for each stage. Because we had four templates for the closed-state modeling, we treated them as multiple templates to generate hybridized structures with RosettaCM. The run script that calls RosettaCM was configured to generate 100 models. We selected the model with the lowest Rosetta score as the best homology structure for further analysis.

Rosetta Flexible Ligand Docking

We performed ligand docking using the Rosetta flexible docking protocol,[12] which implements a two-step procedure. In the low-resolution docking step, the ligand was translated into the putative binding site as a rigid body, followed by rotation and translation. In the subsequent high-resolution docking step, the side chains were repacked and scored with the ligand_soft_rep energy terms, and the complex was minimized using the hard_rep energy terms. This protocol samples different conformations and orientations of the ligand. To sample protein conformational flexibility, the side chains around the ligand were repacked and relaxed along with the backbone. We generated 5000 models of the protein–ligand complex and sorted them by total energy score. The top 20% of the sorted models were then screened for the best interfacial binding energy. The best model of the complex was chosen as the one with the lowest interface energy and with plausible ligand–protein interactions in the active site, in comparison with the bound crystal structures (7WXP, 7FI1, and 7FI0) (Figure S10).
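The two-stage filtering of docked models described above (sort by total score, keep the top 20%, then take the lowest interface energy) can be sketched as follows. The dictionary keys are hypothetical names rather than Rosetta score-file column names, and the final visual comparison with the bound crystal structures is a manual step that this sketch does not attempt to automate.

```python
def select_docked_model(models, top_fraction=0.20):
    """Pick a docked model by the two-stage filter described in the text.

    `models` is a list of dicts with the (hypothetical) keys 'name',
    'total_score' and 'interface_energy'; lower values are better for both.
    """
    if not models:
        raise ValueError("no models supplied")
    # Stage 1: rank by total energy and keep the best-scoring fraction.
    ranked = sorted(models, key=lambda m: m["total_score"])
    n_keep = max(1, int(len(ranked) * top_fraction + 1e-9))
    shortlist = ranked[:n_keep]
    # Stage 2: within the shortlist, take the lowest interface energy.
    return min(shortlist, key=lambda m: m["interface_energy"])
```

For the 5000 models generated here, the shortlist would contain 1000 models.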

Molecular Dynamics Simulations

Classical molecular dynamics simulations were performed on the PanPL (apo) crystal structure. Initially, the protein was hydrated in a cubic box with the protein at its center, and periodic boundary conditions were applied. Gromacs version 2019.2[29] was used for the simulations with the CHARMM force field[30] (CHARMM36), and the TIP3P[31] model was used for water. The initial protein structure was in a closed state. The particle mesh Ewald (PME) method was employed to evaluate long-range electrostatic interactions with grid-spacing of 1.2 nm (the cutoff for Lennard-Jones potential) combined with the LINCS constraint algorithm. First, the system was relaxed through energy minimization to ensure that it was free from steric clashes and abnormal geometries. The minimization was carried out with the steepest descent method to reach the minimum potential energy (negative value) with a maximal target force no greater than 1000 kJ mol–1 nm–1. This was followed by equilibration for 4 ns and production for 100 ns. A 2 fs time step was used throughout the equilibration and production stages. During equilibration, restraints were applied to all protein atoms; they were removed for the production stage. In the production stage, the Nosé–Hoover thermostat was used to maintain a constant temperature (303.15 K), while the Parrinello–Rahman barostat was used with isotropic coupling to control the pressure. The equations of motion were integrated using a multistep leapfrog integrator. We then performed an extended production run of 1 μs. The same approach was applied to the substrate-bound crystal structure. For this crystal structure, we computationally mutated alanine to histidine at position 172, because our docked structure shows two H-bond interactions that are crucial for function according to biochemical studies. 
The protein was modeled using the CHARMM force field, and the substrate di-ManA (as the electron density of only two sugar units of the tetra-ManA substrate was resolved) was parametrized using the CHARMM General Force Field (CGenFF, version 3.0.1).[32]
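As a quick arithmetic check on the run lengths quoted in this section, the number of integration steps for each stage follows directly from the 2 fs time step. The sketch below only does that conversion; the stage labels are illustrative, and the step counts are the kind of value one would supply as `nsteps` in a GROMACS .mdp file with `dt = 0.002` ps.

```python
TIME_STEP_FS = 2  # fs, as stated in the text

# Stage durations in nanoseconds, taken from the text (1 us = 1000 ns).
STAGES_NS = {"equilibration": 4, "production": 100, "extended production": 1000}


def n_steps(duration_ns, time_step_fs=TIME_STEP_FS):
    """Number of integration steps needed to cover `duration_ns`."""
    duration_fs = duration_ns * 1_000_000  # 1 ns = 1e6 fs
    steps, remainder = divmod(duration_fs, time_step_fs)
    if remainder:
        raise ValueError("duration is not a whole number of time steps")
    return steps


if __name__ == "__main__":
    for stage, length_ns in STAGES_NS.items():
        print(f"{stage}: {n_steps(length_ns):,} steps")
```

This gives 2,000,000 steps for equilibration, 50,000,000 for the 100 ns production run, and 500,000,000 for the 1 μs extension.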

Analysis of Molecular Dynamics Simulation

For the analysis of the PanPL simulations, the trajectories were post-processed with the built-in Gromacs tool trjconv to account for periodicity, i.e., to correct for the jumps or breaks that the protein undergoes as it diffuses across the periodic boundary (the unit cell) during the simulation. The corrected trajectory had the protein as a group at the center of the box, and all analyses were performed on it. For each snapshot taken at 10 ps intervals along the 1 μs trajectory, the rmsd (root-mean-square deviation) was computed after alignment with the initial structure (Figure ). Similarly, the per-residue rmsf (root-mean-square fluctuation) of the backbone atoms was calculated with respect to the zeroth snapshot (Figure ).
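To make the two quantities concrete, here is a minimal NumPy sketch of rmsd after least-squares superposition (Kabsch algorithm) and of per-atom rmsf. It is a stand-in for the Gromacs analysis tools, not the authors' workflow. It assumes the trajectory has already been loaded as an array of shape (n_frames, n_atoms, 3), and it takes "with respect to the zeroth snapshot" to mean that frames are fitted to that snapshot while fluctuations are measured about the time-averaged positions, which is an assumption.

```python
import numpy as np


def kabsch_align(mobile, reference):
    """Superpose `mobile` (N, 3) onto `reference` (N, 3) by least squares."""
    mob_c = mobile - mobile.mean(axis=0)
    ref_c = reference - reference.mean(axis=0)
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Flip the last axis if needed so the result is a proper rotation.
    d = np.sign(np.linalg.det(u @ vt))
    rotation = u @ np.diag([1.0, 1.0, d]) @ vt
    return mob_c @ rotation + reference.mean(axis=0)


def rmsd(a, b):
    """Root-mean-square deviation between two (N, 3) coordinate sets."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))


def rmsd_series(trajectory, reference):
    """rmsd of every frame to `reference` after superposition."""
    return np.array([rmsd(kabsch_align(f, reference), reference) for f in trajectory])


def rmsf(trajectory, reference):
    """Per-atom fluctuation about the mean position, after fitting to `reference`."""
    aligned = np.stack([kabsch_align(f, reference) for f in trajectory])
    mean_pos = aligned.mean(axis=0)
    sq_dev = ((aligned - mean_pos) ** 2).sum(axis=2)  # (n_frames, n_atoms)
    return np.sqrt(sq_dev.mean(axis=0))
```

Calling `rmsd_series(traj, traj[0])` and `rmsf(traj, traj[0])` uses the zeroth snapshot as the fitting reference; per-residue values would follow by averaging over the backbone atoms of each residue.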

Principal Component Analysis on MD Simulation Trajectories

The PC analysis was performed on the Cartesian coordinates of the Cα atoms along the time course of the 100 ns trajectory. To begin with, these Cα atoms were aligned to the Cα atoms of the reference structure (zeroth frame). As the signal-to-noise ratio was high, the first few PCs were considered to interpret the essential dynamics; the first component (PC1; apo 55.9%, substrate-bound 52.1%) had the highest proportion of variance, followed by the second (PC2; apo 6.7%, substrate-bound 7.6%). The first principal component (PC1) highlights the major conformational states or dynamics, followed by the other components. To further decode the dynamical states, hierarchical clustering was performed in PC space. This gave rise to three prominent subgroups (Figure S14). The average structures of the subgroups represented three different conformational states; one corresponds to the closed state, while the other two correspond to open states (Figure S14). The contribution of individual residues to the principal components was also evaluated to identify the segments or regions of the protein that contribute most to the concerted motions. The trajectories corresponding to concerted motions along the first two principal components were extracted. The first principal component (PC1) revealed the N-loop-lid motions that configure the molecule into a closed and an open state (Figure S14). In contrast, the second principal component had its motions confined to sliding back and forth (a twisting motion) along the tunnel axis. The complete PC analysis was performed using the library functions implemented in the Bio3D package.[14]
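The quantities reported above (proportion of variance per PC, projections onto the PCs, and per-residue contributions) can be illustrated with a short NumPy sketch. This is a swapped-in illustration, not the Bio3D code used for the analysis. It assumes the Cα coordinates are already superposed on the reference frame and stored as an array of shape (n_frames, n_atoms, 3); hierarchical clustering of the projections is not shown.

```python
import numpy as np


def pca_cartesian(coords):
    """PCA of superposed coordinates with shape (n_frames, n_atoms, 3).

    Returns (eigenvalues, components, scores, variance_fraction), where
    `components[k]` is the k-th PC as a flat vector of length 3 * n_atoms and
    `scores[:, k]` is the projection of every frame onto it.
    """
    n_frames = coords.shape[0]
    x = coords.reshape(n_frames, -1)
    x_centered = x - x.mean(axis=0)
    # The right singular vectors of the centered data are the eigenvectors
    # of the coordinate covariance matrix, sorted by decreasing variance.
    _, s, vt = np.linalg.svd(x_centered, full_matrices=False)
    eigenvalues = s**2 / (n_frames - 1)
    variance_fraction = eigenvalues / eigenvalues.sum()
    scores = x_centered @ vt.T
    return eigenvalues, vt, scores, variance_fraction


def residue_contributions(component):
    """Per-atom weight of one PC: norm of its x, y, z components per atom."""
    return np.linalg.norm(component.reshape(-1, 3), axis=1)
```

In this notation, the percentages quoted for PC1 and PC2 correspond to `100 * variance_fraction[0]` and `100 * variance_fraction[1]`, and the first two columns of `scores` span the PC space in which clustering would be carried out.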

28 in total

1. The energy landscapes and motions of proteins.

Authors: H Frauenfelder; S G Sligar; P G Wolynes
Journal: Science Date: 1991-12-13 Impact factor: 47.728

2. Small-molecule ligand docking into comparative models with Rosetta.

Authors: Steven A Combs; Samuel L Deluca; Stephanie H Deluca; Gordon H Lemmon; David P Nannemann; Elizabeth D Nguyen; Jordan R Willis; Jonathan H Sheehan; Jens Meiler
Journal: Nat Protoc Date: 2013-06-06 Impact factor: 13.491

3. On the evolutionary conservation of hydrogen bonds made by buried polar amino acids: the hidden joists, braces and trusses of protein architecture.

Authors: Catherine L Worth; Tom L Blundell
Journal: BMC Evol Biol Date: 2010-05-31 Impact factor: 3.260

4. Principal component and clustering analysis on molecular dynamics data of the ribosomal L11·23S subdomain.

Authors: Antje Wolf; Karl N Kirschner
Journal: J Mol Model Date: 2012-09-08 Impact factor: 1.810

5. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields.

Authors: K Vanommeslaeghe; E Hatcher; C Acharya; S Kundu; S Zhong; J Shim; E Darian; O Guvench; P Lopes; I Vorobyov; A D Mackerell
Journal: J Comput Chem Date: 2010-03 Impact factor: 3.376

6. Solution conformations and dynamics of ABL kinase-inhibitor complexes determined by NMR substantiate the different binding modes of imatinib/nilotinib and dasatinib.

Authors: Navratna Vajpai; André Strauss; Gabriele Fendrich; Sandra W Cowan-Jacob; Paul W Manley; Stephan Grzesiek; Wolfgang Jahnke
Journal: J Biol Chem Date: 2008-04-22 Impact factor: 5.157

7. Divergent evolution of protein conformational dynamics in dihydrofolate reductase.

Authors: Gira Bhabha; Damian C Ekiert; Madeleine Jennewein; Christian M Zmasek; Lisa M Tuttle; Gerard Kroon; H Jane Dyson; Adam Godzik; Ian A Wilson; Peter E Wright
Journal: Nat Struct Mol Biol Date: 2013-09-29 Impact factor: 15.369

8. Integrating protein structural dynamics and evolutionary analysis with Bio3D.

Authors: Lars Skjærven; Xin-Qiu Yao; Guido Scarabelli; Barry J Grant
Journal: BMC Bioinformatics Date: 2014-12-10 Impact factor: 3.169

9. Highly accurate protein structure prediction with AlphaFold.

Authors: John Jumper; Richard Evans; Alexander Pritzel; Tim Green; Michael Figurnov; Olaf Ronneberger; Kathryn Tunyasuvunakool; Russ Bates; Augustin Žídek; Anna Potapenko; Alex Bridgland; Clemens Meyer; Simon A A Kohl; Andrew J Ballard; Andrew Cowie; Bernardino Romera-Paredes; Stanislav Nikolov; Rishub Jain; Demis Hassabis; Jonas Adler; Trevor Back; Stig Petersen; David Reiman; Ellen Clancy; Michal Zielinski; Martin Steinegger; Michalina Pacholska; Tamas Berghammer; Sebastian Bodenstein; David Silver; Oriol Vinyals; Andrew W Senior; Koray Kavukcuoglu; Pushmeet Kohli
Journal: Nature Date: 2021-07-15 Impact factor: 49.962

10. Phaser crystallographic software.

Authors: Airlie J McCoy; Ralf W Grosse-Kunstleve; Paul D Adams; Martyn D Winn; Laurent C Storoni; Randy J Read
Journal: J Appl Crystallogr Date: 2007-07-13 Impact factor: 3.304