Zilin Song1, Francesco Trozzi1, Hao Tian1, Chao Yin1, Peng Tao1. 1. Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75205, United States.
Abstract
With the increasing popularity of machine learning (ML) applications, the demand for explainable artificial intelligence techniques to explain ML models developed for computational chemistry has also emerged. In this study, we present the development of the Boltzmann-weighted cumulative integrated gradients (BCIG) approach for effective explanation of mechanistic insights into ML models trained on high-level quantum mechanical and molecular mechanical (QM/MM) minimum energy pathways. Using the acylation reactions of the Toho-1 β-lactamase and two antibiotics (ampicillin and cefalexin) as the model systems, we show that the BCIG approach could quantitatively attribute the energetic contribution in one system and the relative reactivity of individual steps across different systems to specific chemical processes such as the bond making/breaking and proton transfers. The proposed BCIG contribution attribution method quantifies chemistry-interpretable insights in terms of contributions from each elementary chemical process, which is in agreement with the validating QM/MM calculations and our intuitive mechanistic understandings of the model reactions.
With the increasing popularity of machine learning (ML) applications, the demand for explainable artificial intelligence techniques to explain ML models developed for computational chemistry has also emerged. In this study, we present the development of the Boltzmann-weighted cumulative integrated gradients (BCIG) approach for effective explanation of mechanistic insights into ML models trained on high-level quantum mechanical and molecular mechanical (QM/MM) minimum energy pathways. Using the acylation reactions of the Toho-1 β-lactamase and two antibiotics (ampicillin and cefalexin) as the model systems, we show that the BCIG approach could quantitatively attribute the energetic contribution in one system and the relative reactivity of individual steps across different systems to specific chemical processes such as the bond making/breaking and proton transfers. The proposed BCIG contribution attribution method quantifies chemistry-interpretable insights in terms of contributions from each elementary chemical process, which is in agreement with the validating QM/MM calculations and our intuitive mechanistic understandings of the model reactions.
Machine learning (ML)
has emerged with a great promise to approximate
the target function of any form regardless of the a priori knowledge
about the underlying correlations among variables. The application
of various ML techniques has also advanced theoretical chemistry in
various subjects,[1−4] which have been suffering from either the extensive computational
demands of high levels of theory[5−8] or the high dimensionality of the chemical and/or
conformational spaces.[9−14] Although ML could be introduced to many topics that require accurate
and efficient approximations, its performance and effectiveness have
been limited by the feature representation and interpretability of
the model.[15] In addition to the routinely
applied feature representations, unsupervised models and rational
statistical procedures have been developed and applied to extract
robust feature vectors from the chemical feature space.[16−18] In particular, considerable pioneering efforts have focused on the
development of suitable descriptors and accurate ML models for approximating
hybrid quantum mechanical/molecular mechanical (QM/MM) potentials.[19−23]With the booming popularity of ML, interest to interpret deep
learning
(DL) neural networks have synergistically risen as a subfield of great
importance, namely the explainable artificial intelligence (XAI).[24] The DL models being interpretable not only elevate
our understanding of the learning algorithms but also constitute responsible
ML/DL-assisted decision making.[25] Practically,
XAI techniques attribute the predicted outcome of DL models to individual
feature contributions, therefore, rationalizing the driving forces
behind the decision flow in the models that are black boxes. While
explicit indicators for feature contributions are straightforward
in linear models[26] and are incorporated
by design in specific ensemble-based models,[27] explaining neural networks is in general hindered by the high nonlinearity
accumulated through the activations of the hidden layers. Based on
the assumption that the predicted nonlinear surface could be approximated
as linear at local regions, effective importance attribution methods
have been proposed based on model gradients.[28] The state-of-the-art XAI techniques, such as the integrated gradients
(IG)[29] and the layer-wise relevance propagation,[30] have demonstrated great promise in various explaining
tasks such as medical diagnosis[31] and cheminformatics
applications.[32] In addition, interpretable-by-design
DL architectures effectively making the use of the attention mechanism
have been developed for challenging explaining tasks such as protein–ligand
binding affinities.[33,34] Whereas DL models have proven
to accurately predict various chemical properties, we are particularly
interested in that if the DL models could also be post hoc explained to gain mechanistic insights into enzyme catalysis.In this study, the acylation between Toho-1, a class A serine-based
β-lactamase (ASβL), and two β-lactam antibiotics,
ampicillin (AMP) and cefalexin (CEX), were used as the model reactions
(Figure a,b). Theoretical
efforts have revealed that the acylation of ASβLs could undergo
two distinct pathways, which are distinguished by the reactant protonation
states of Lys73 and Glu166 (Figure c).[35−40] Hermann et al.[35,36] first reported that the acylation
is initialized by a deprotonated Glu166 acting as the general base,
leading to the Glu166 as the sole base pathway which produces the
acyl-enzyme product (R1-AE). Meroueh et al.[37] further proposed that the deprotonated Lys73 and protonated Glu166
could concertedly mediate the acylation (R2-AE). Whereas the acylation
mechanisms of β-lactamases are considered similar, in general,
the reactivity of β-lactamases varies with different β-lactam
families.[38] The understanding of the determinant
factors that impact the acylation activity of different ligand scaffolds
is challenging but critical for future optimizations of β-lactam
drugs.[39] Our recent study on the acylation
of Toho-1 by AMP and CEX showed that the R1-AE pathways are viable
for AMP but prohibitive for CEX, while the R2-AE mechanism is generally
accessible for both AMP and CEX.[40] Herein,
we aim to unravel the conformational and energetic factors that prime
the different pathways observed for the acylation of Toho-1 by AMP
and CEX. We first present 800 minimum energy pathways (MEPs) that
were obtained from a robust computational workflow combining constrained
molecular dynamics (MD) and QM/MM chain-of-state calculations. Feed-forward
neural networks were designed to accurately predict the MEP profiles
using conformational features selected from an ad hoc univariate variable
analysis. In light of the pioneering works on XAI, we developed a
thermodynamics-aware contribution attribution method to interpret
the ML MEP models. We show that quantitative mechanistic insights
into various aspects of enzyme catalysis can be derived from interpreting
purpose-oriented ML-MEP models in terms of energy contributions from
chemistry-interpretable factors.
Figure 1
Toho-1 β-lactamase, the β-lactam
antibiotics, and the
acylation pathways of Toho-1. (a) Toho-1/β-lactam complex and
the selection of QM atoms. The carbon atoms of the amino acid residues
are colored in dark gray. The Cα–Cβ bonds (used
as the QM/MM boundary for the amino acid side chains) are colored
in cyan. The carbon atoms of the β-lactam ligand are colored
in magenta, except the carbonyl carbon in β-lactams, which is
colored in dark purple. The catalytic water molecule is uniformly
colored in green. All nitrogen, oxygen, sulfur, and hydrogen atoms
are colored in blue, red, yellow, and white, respectively. All hydrogen
atoms in the ligand are omitted for clarity. (b) Chemical structures
of AMP and CEX. The penam and cephem scaffolds are colored in red;
(c) Acylation pathways using Glu166 as the general base (R1-AE) and
using Glu166/Lys73 concertedly as the general base (R2-AE).
Toho-1 β-lactamase, the β-lactam
antibiotics, and the
acylation pathways of Toho-1. (a) Toho-1/β-lactam complex and
the selection of QM atoms. The carbon atoms of the amino acid residues
are colored in dark gray. The Cα–Cβ bonds (used
as the QM/MM boundary for the amino acid side chains) are colored
in cyan. The carbon atoms of the β-lactam ligand are colored
in magenta, except the carbonyl carbon in β-lactams, which is
colored in dark purple. The catalytic water molecule is uniformly
colored in green. All nitrogen, oxygen, sulfur, and hydrogen atoms
are colored in blue, red, yellow, and white, respectively. All hydrogen
atoms in the ligand are omitted for clarity. (b) Chemical structures
of AMP and CEX. The penam and cephem scaffolds are colored in red;
(c) Acylation pathways using Glu166 as the general base (R1-AE) and
using Glu166/Lys73 concertedly as the general base (R2-AE).
Methods
For
clarity, we first present the theoretical framework for contribution
attribution of ML/DL models that were trained on an ensemble of QM/MM
MEPs. For an ML-MEP model, which is designated as F and is trained on a dataset of MEPs,
the contribution of a chemical process c on the p-th MEP can be represented as the “pathway-wise”
contribution attributing function ApathThe overall contribution of
chemical process c in the ensemble of MEPs is the sum
of the contribution weighted by
the accessibility (the Boltzmann factor) of p-th
MEP, which iswhere ΔE is the energy barrier of
the p-th
MEP, R is the ideal gas constant, T is the temperature, and CN is a normalization
factor.The exponential averaging implicitly assumes sample
completeness
in the MEP datasets, which is mostly impractical for actual MEP calculations.
In practice, the direct application of Boltzmann weights would lead
to numerical instability with a limited number of sampled MEPs. Alternatively,
a probability density function (PDF), PDF(ΔE), could be introduced to smoothen the
density of MEP barriers. In its simplest form, the PDF could be a
single Gaussian function. Furthermore, in cases where the sampled
barrier distribution failed to approximate a Gaussian distribution,
alternative density estimators such as Gaussian mixture models (GMMs)
or Kernel density estimations could be employed for better approximation.
Nonetheless, introducing PDF to eq yieldsNote that the normalization
factor CN′ isIn our
implementation, 2-component GMMs, which approximate the
distribution of MEP barriers using a weighted sum of two independent
Gaussian, were used as the PDF for MEP barriers.If the chain-of-states
replica path method (RPM, see Computational Details)[41−43] is used for
MEP calculations, each transition path is represented by a series
of discrete replicated structures (replicas) that connect the reactant
and product. The “pathway-wise” contribution Apath could be calculated from the “replica-wise”
attribution function Areplica of c to the energy of the r-th replica on
the p-th MEPwhere is the
total number of replicas in each MEP.As proposed by Sundararajan
et al.,[29] for a DL model F, the contribution of the i-th feature x of the feature vector corresponding
to a specific sample point can be calculated as the IG along a path
γ(α) that connects the sample point with feature vector (where α = 1) and a baseline with
feature vector x′ (where α = 0)In our case, the reactant states on the MEPs were selected
as the
baselines, and the contribution of c at the r-th replica was the integrated partial derivatives (with
regard to c) through the intermediate replicas preceding
the r-th replica along the MEP. Accordingly, eq is adapted for the discrete
reaction pathway aswhere i is the index of the
pathway replicas, and (p,i) is the feature vector of the i-th replica on the p-th MEP.The representation
of c must be determined to
expand the first partial derivative in eq . As noted, c represents
a “chemical process” that includes (but is not limited
to) bond making/breaking and proton transfers. The progress of the
chemical process is commonly represented by the linear combination
of multiple order parameters such as atomic distances, often referred
to as the reaction coordinates or collective variables. However, we
note that the correlation between the atomic distances is highly nonlinear
in the evolution along the optimal reaction path obtained from the
chain-of-state calculations. Therefore, instead of feeding the reduced
representation of linear-combined atomic distances, we used a set
of atomic distances that accounts for all chemical processes during
the acylation for the training of the DL model F,
that iswhere E(p,i) is the energy of the i-th replica
on the p-th MEP. Furthermore, owing to the nonlinearity
of the correlation between the feature dimensions, the gradients of F with regard to c, which usually correspond
to multiple feature dimensions in ,
cannot be calculated analytically. Therefore, the first partial derivative
in eq is computed numericallywhere ε denotes a small perturbative
factor (0.01 as in the current study); δ acts as the selector
for the feature dimensions included in the chemical process (c). In practice, δ is a multihot-encoded mask to ensure
that the perturbation ε is applied only to the features that represent the chemical process of
interest; Dist(A,B) stands for the distance metric that accounts for the pathway curvature.
The gradient on the pathway curvature readsCombining eqs , 9, and 10, the contribution
of the chemical process c to the r-th replica on the p-th pathway is calculated as
the integrated partial gradients of the ML-MEP model F asAccordingly, the contribution of c along one MEP Apath(F,c,p) can be calculated by cumulatively
summing the integrated
partial gradients (eq ). The sign of Apath(F,c,p) gives the interaction between
the chemical processes, whereas its absolute values give the perturbative
response of the ML-MEP model regarding different c. Therefore, the absolute values of Apath(F,c,p) were used
to calculate the weighted contributions in eq .In this study, eq is termed the Boltzmann-weighted cumulative
IG (BCIG) contributions.
We proceed with the computational details that implement the BCIG
approach for interpreting the acylation reactions of Toho/AMP and
Toho/CEX.
Computational Details
System Setup
The crystal structure
of the Toho-1/benzylpenicillin
complex with high resolution (PDB entry: 5KMW)[44] was used
as the template to generate the target systems. Toho/AMP and Toho/CEX
systems were constructed from the crystal structure of the Toho-1/benzylpenicillin
complex, respectively. The parameters for AMP and CEX were obtained
from the CHARMM general force field (CGenFF).[45,46] The three engineered mutations (Ala166/Asn274/Asn276) in the crystal
structure were modified to Glu166/Arg274/Arg276 as in the wild type
enzyme. All titratable residues were protonated under neutral pH based
on PropKa calculations and experimental neutron diffraction results.[47,48] Both systems were solvated in 80 Å cubic water boxes. Sodium
and chloride ions were added to neutralize the total charges in the
systems. The CHARMM36 force field (C36),[49] CGenFF,[50] and TIP3P parameters were used
to treat the protein, ligands, and solvent molecules, respectively.
The solvent molecules were held rigid with the SHAKE constraint.[51] The nonbonding interactions excluding long-range
electrostatic interactions between the atoms using molecular mechanics
(MM) were explicitly treated within 10 Å and were smoothened
out to zero at 12 Å. The periodic boundary conditions were employed
in all three dimensions, and the particle mesh Ewald method[52] was used for the summation of classical long-range
electrostatic interactions.The solvated systems were then subjected
to 500 steps of steepest descent minimizations followed by 5000 steps
of adopted basis Newton–Raphson (ABNR) minimizations using
the MM potentials. In order to obtain reliable orientations of the
active site, we further optimized the system for 5000 ABNR steps using
the semiempirical third-order density-functional tight-binding (DFTB3)
potential with the 3OB parameter set for the QM region and CHARMM36
force field for the MM region (DFTB3/3OB/C36).[53,54] The DFTB3/3OB/C36 potential has been used extensively for simulating
biomolecules, including β-lactamases,[55] for its computational efficiency. While the size of the QM region
converges slowly with respect to physicochemical properties,[56,57] we note that we did not perform extensive benchmarks on the dependency
of energetic profiles to the QM selections. Instead, the QM atoms
are picked as the balance between interactions with the reacting atoms
and computational cost. The QM region includes Ser70, Lys73, Ser130,
Glu166, Asn170, Lys234, Thr235, Ser237, ligands, and catalytic water
(Figure a). Our selections
yield 125 and 123 atoms in the QM subsystems of Toho/AMP and Toho/CEX,
respectively. The QM part of the amino acid residues was capped by
hydrogen link atoms through their Cα–Cβ bonds.
In all QM/MM simulations, the classical charges on the MM host atoms
(Cα) were removed to prevent the unphysical overpolarizations
to the QM density.After energy minimization, both systems were
subjected to heating
and equilibration simulations using classical force fields. All QM
atoms were kept frozen during the heating and equilibration stage
in order to preserve the QM-optimized geometries of the active site.
Accordingly, the QM link atoms (used in the DFTB3 minimizations) were
temporally removed, and the masses of the MM hosting carbon atoms
were rescaled to their unit atomic mass (12.0110). Both systems were
heated from 110 to 310 K with a temperature increment of 10 K per
1.5 ps followed by canonical (NVT) equilibrations at 310 K for 150
ps. The temperatures during the heating and the NVT equilibration dynamics were regulated via explicit
velocity rescaling. Both Toho/AMP and Toho/CEX systems were then subjected
to 150 ps isothermal–isobaric (NPT) equilibration
dynamics. During this stage, the temperature of the systems was maintained via the Hoover thermostat[58] at
310 K, while the pressure was coupled to 1 atm with the Langevin piston
method.[59]Based on reaction mechanisms
of the target reaction, we created
the reactant state suitable for the mechanism with Glu166 as the sole
base (R1), the reactant state suitable for the mechanism with Lys73
and Glu166 as dual bases (R2), and the acyl-enzyme (AE) state for
Toho/AMP and Toho/CEX systems using the DFTB3/3OB/C36 potential with
necessary distance-based restraints. A total of six states (R1, R2,
and AE of Toho/AMP and Toho/CEX) were subjected to extensive conformational
sampling with constrained NVT simulations for 150 ns. During the constrained
dynamics, the reacting function groups (the hydroxyls of the Ser70/Ser130,
the Lys73 amino group, the Glu166 carboxyl group, the carbonyl–nitrogen
bond of the ligands, and the catalytic water) were fixed in place
to retain their QM optimized orientations, and the rest of the QM
atoms were allowed to move freely. The snapshots used for the MEP
calculations were taken from the last 120 ns of the constrained MD
trajectories with a time interval of 1.2 ns for each system. Briefly,
a total number of 600 snapshots (100 snapshots from three states of
two systems) were selected as the starting conformations for the MEP
calculations.
QM/MM MEPs
The 600 starting conformations
were all
optimized using the DFTB3 potential. During the optimizations, the
surrounding MM residues within 4 Å of the QM region were allowed
to move while the remaining of the system was fixed. The corresponding
product/reactant states were generated from the starting conformations.
Specifically, AE states were generated from each R1 or R2 conformation
from the simulations as the product state for each R1 → AE
or R2 → AE pathway. For each AE state sampled from the simulations,
both R1 and R2 reactant states are generated, leading to two independent
AE → R1 and AE → R2 pathways. In total, 800 pairs of
either R1/AE or R2/AE states were generated for the Toho/ligand complexes
and were used as the reactant/product pairs for the MEP optimizations
using the RPM with holonomic constraints.The accuracy of the
QM potential is crucial for the quality of the MEP profiles. The DFTB3
method has been previously validated to produce accurate geometries
at stable states. However, it was also reported that this approach
would largely overestimate the bond dissociation energies between
the heavy atoms.[60] Alternatively, a plausible
approach for obtaining high-quality pathway profiles is to first optimize
the MEP geometry under a fast, parameterized semiempirical QM method
and refine the single-point energies using higher ab initio/DFT QM
levels of theory along the reaction pathway represented as a chain
of replicas.[61] This approach requires that
the MEPs optimized at the semiempirical QM level of theory are close
to the MEPs optimized at higher ab initio/DFT QM levels of theory
to achieve the goals of avoiding large errors of the energetics using
semiempirical QM models such as DFTB3 and high computational cost
of pathway optimization using the higher QM levels of theory. We compared
the MEPs optimized using DFTB3/3OB/C36 and B3LYP-D3/6-31G**/C36 levels
of theory, respectively (Figure S1). Additionally,
the 3OB-f parameter set,[53] with optimal
stretching vibrational frequencies for the C=C, C=N,
and C=O species, was also employed as comparison (referred
to as DFTB3/3OB-f/C36). Overall, the DFTB3/3OB/C36 level of theory
overestimates the bond lengths of the tetrahedral intermediates compared
to that of DFT calculations. On the other hand, DFTB3/3OB-f/C36 MEPs
closely resemble the B3LYP-D3/6-31G**/C36 MEPs in terms of key bond
distances during the acylation processes. Therefore, DFTB3/3OB-f/C36
was applied to carry out all the pathway optimization calculations
in this study. The B3LYP hybrid functional[62,63] with the 6-31+G** basis set[64] plus the
D3 dispersion correction[65] (B3LYP-D3/6-31+G**/C36)
were employed as the high-level QM counterpart to refine the single-point
energies along the optimized MEPs. We note that the combination of
the levels of theory (DFTB3 and B3LYP-D3) has been proposed previously
for enzyme catalysis simulations.[66,67] It has also
been demonstrated that both the DFTB3 and B3LYP methods are applicable
for studying concerted reaction steps in QM/MM simulations.[68,69]The initial guesses of the MEPs were created by linearly intercepting
the Cartesian space between the reactants and the acyl-enzyme products
using 50 replicated structures. The constrained RPM method,[41−43] which enforces equal mass-weighted root-mean-square deviation distances
between adjacent replicas, was applied to optimize the initial linear
chain-of-replicas. The atomic weights of the migrating protons were
scaled by a factor of 50 in order to capture continuous proton transfers
along the pathway trajectory. A kinetic energy potential force constant
of 0.05 kcal mol–1 Å–2 was
applied for all MEP optimizations to facilitate the smoothness and
convergence of the transition path. All 800 Toho/AMP and Toho/CEX
acylation reaction MEPs were subjected to single-point calculations
at the B3LYP-D3/6-31+G**/C36 level of theory. These 800 MEPs (40,000
replica conformations and energies) were grouped into four datasets:
the Toho/AMP: R1-AE pathways, the Toho/AMP: R2-AE pathways, the Toho/CEX:
R1-AE pathways, and the Toho/CEX: R2-AE pathways (Figure S2).
Machine Learning
A proper selection
of feature representation
is critical to the performance and interpretability of the ML models.
Various feature vectors built from conformational and/or physical
descriptors have been proposed to accurately predict molecular properties
or potentials.[13,14] In this study, we note that the
goal of the ML regressions is not only to predict the single-point
total energies of the systems but also to bridge the conformational
change with the energy evolution along the optimized MEPs, which mainly
attributes to the displacement of the reacting atoms in the QM regions.
Therefore, the initial selection of features covered (1) the atomic
distances between the chemical-bonded and hydrogen-bonded heavy atoms
in the QM region and (2) the hydrogen-donor/acceptor distances between
the reacting hydrogens and surrounding QM heavy atoms. This initial
selection of features was then subjected to a univariate analysis
against the energy profiles using mutual information as the metric.
In each pathway dataset, the atomic distances sharing higher mutual
information than the intuitive reaction coordinates were included
in the feature vectors. The features selected based on datasets of
the same mechanism (e.g., R1-AE pathways for both Toho/AMP and Toho/CEX
complexes) were merged to produce unified selections for either R1-AE
or R2-AE mechanisms across different systems, respectively. The final
selection of features on the R1-AE and R2-AE pathway datasets was
illustrated in Figure a,b.
Figure 2
Selected features, the 2D representation of the pathway conformations,
and the architecture of the ML-MEP models. The selected features and
chemical processes of (a) Toho/AMP: R1-AE and Toho/CEX: R1-AE datasets
and (b) Toho/AMP: R2-AE and Toho/CEX: R2-AE datasets. The atomic distances
that are included in feature vectors are noted in orange lines and
the chemical processes are noted in blue; see also Table S1. (c) 2D principal component dimensionality reduction
of the pairwise inter-heavy-atom distances in the QM region, and a
schematic demonstration for the loss of pathway context of the replicas;
(d) architecture of the QM/MM MEP learning deep-and-wide neural network.
Selected features, the 2D representation of the pathway conformations,
and the architecture of the ML-MEP models. The selected features and
chemical processes of (a) Toho/AMP: R1-AE and Toho/CEX: R1-AE datasets
and (b) Toho/AMP: R2-AE and Toho/CEX: R2-AE datasets. The atomic distances
that are included in feature vectors are noted in orange lines and
the chemical processes are noted in blue; see also Table S1. (c) 2D principal component dimensionality reduction
of the pairwise inter-heavy-atom distances in the QM region, and a
schematic demonstration for the loss of pathway context of the replicas;
(d) architecture of the QM/MM MEP learning deep-and-wide neural network.In order to reflect the conformational change along
each MEP, the
selected bond lengths as features were offset by subtracting their
corresponding values in the acyl-enzyme conformation as the product
state in the pathway. Furthermore, for each MEP, the feature vectors
on the replicas are rescaled by dividing the largest values along
each feature dimensionwhere represents
the feature vectors
of all replicas on the -th MEP. Similarly,
for each MEP, the replica energy profiles were also offset using the
energies of the acyl-enzyme product states as the reference (0 kcal
mol–1).Notably, our preprocessing of input
features enables the independent
representation of conformational and energetic evolutions along each
MEP: each pathway is treated individually without reference to other
MEPs in the dataset. However, the largely reduced feature representation
also blurred the context of each replica, leading to the loss of pathway
identity. As shown in Figure c, replicas that belong to different pathways could distribute
closer than their neighboring replica on the same pathway (d1 <
d2). Using the reduced feature representation, it is important to
distinguish the pathway identity for replicas, especially those that
are close to each other in the reduced feature space. In this regard,
the distinguishability between different pathways should be retrieved
by further hardcoding the pathway affinity of the replicas in a “one-hot”
manner.With these considerations, a feed-forward neural network
with the
“deep-and-wide” learning architecture (DaWNN) proposed
by Cheng et al.[70] was implemented for the
learning of the QM/MM MEPs (Figure d). The conformational features were fed to the deep
side of the input layers and were embedded through four hidden layers.
The encoded pathway identities were introduced from the wide entrance
and were concatenated to the deep embedding flow before the last hidden
layer. The single-point replica energies were obtained from a linear
output layer. The rectifier linear unit was used to activate all hidden
neuron nodes. The dropout strategy[71] was
applied for all hidden layers to promote the generalizability of the
neural networks and prevent overfitting. The dropout rate (0.1) and
the number of neuron units (256) on the hidden layers were tuned via a grid search strategy on a 10% pathwise stratified
validation set. Practically, we constructed the validation set by
randomly picking five replicas from each of the pathways carrying
50 replicas. The standard mean squared error (MSE) was used as the
objective loss function to train the ML-MEP models. All models were
trained with the AdaM optimizer[72] for 300
epochs with a sample batch size of 25.
Implementation Details
All MD simulations were performed
using CHARMM and OpenMM.[73,74] All QM/MM calculations
were performed with the built-in DFTB module of CHARMM or its interface
to Q-Chem.[75−77] The postprocessing of the pathway conformations,
featurization, and the density estimating GMMs took advantage of the
MDAnalysis and the scikit-learn package.[78,79] The MEP-learning neural networks were implemented with TensorFlow.[80] All molecular graphics were prepared with UCSF
ChimeraX.[81]
Results
QM/MM MEPs
Pioneering mechanistic studies have suggested
that the acylation half of the ASβL catalysis is the rate-limiting
step of the β-lactam hydrolysis,[35,36] allowing the
comparison between the theoretical acylation barriers and the experimental kcat values using the Eyring equation. According
to the Toho-1 catalysis kinetic studies by Shimizu-Ibuka et al.,[82] the acylation barrier of Toho/AMP is ∼15.5
kcal mol–1. Moreover, the enzyme kinetics reported
by Nitanai et al.[83] on Toho-1 with the
Arg274Asn/Arg276Asn-engineered mutants showed that the acylation barrier
of AMP is lower compared to that of CEX (determined from the kcat/kM ratio) by
∼1.7 kcal mol–1. While both mutations were
known to not participate in the acylation process, the acylation barrier
of wild type Toho/CEX can be estimated as ∼17.2 kcal mol–1.The barrier distributions of the calculated
QM/MM MEPs are plotted in Figure . While the free-energy barriers of a certain reaction
path could be obtained from the exponential average of the barriers
from a set of MEPs, the number of minimum energy barriers needed to
achieve an accurate estimation is prohibitively high. According to
the numerical simulations of Ryde,[84] if
the barriers were Gaussian distributed, the number of pathway samples
required for an accurate exponential averaged free-energy barrier
is >5000 for all four pathways investigated in this study. Fortunately,
the relative rankings of free-energy barriers for different mechanisms
can be robustly estimated from the arithmetic averaged barrier. In
particular, in cases where the difference between mean averages of
the modeled systems is larger than ∼4.8 kcal mol–1, reliable comparisons of reactivity can be assessed from ∼70
barrier samples with regard to the largest standard deviation of the
pathway barriers in the current study (12.58 kcal mol–1 of the Toho/CEX: R2-AE pathways, Figure d).[84] With 200
barrier samples for each mechanism, we accordingly rank the relative
activity of the four mechanisms as Toho/AMP: R2-AE > Toho/AMP:
R1-AE
> Toho/CEX: R2-AE > Toho/CEX: R1-AE.
Figure 3
The distribution of the
acylation barriers (ΔE) at B3LYP-D3/6-31+G**/C36
level of theory. (a) Toho/AMP: R1-AE acylation
pathways; (b) Toho/CEX: R1-AE acylation pathways; (c) Toho/AMP: R2-AE
acylation pathways; and (d) Toho/CEX: R2-AE acylation pathways. The
scatters show the locations of the energy barriers. The width of the
histograms is 4 kcal mol–1. The red curves note
the density estimation from the GMMs. The labels “min. ΔE”, “expo. ”, “mean ”, “med. ΔE”, and “std”
refer to the minimum, exponential
average, arithmetic average, median, and standard deviation of the
energy barriers from the QM/MM MEP profiles, respectively.
The distribution of the
acylation barriers (ΔE) at B3LYP-D3/6-31+G**/C36
level of theory. (a) Toho/AMP: R1-AE acylation
pathways; (b) Toho/CEX: R1-AE acylation pathways; (c) Toho/AMP: R2-AE
acylation pathways; and (d) Toho/CEX: R2-AE acylation pathways. The
scatters show the locations of the energy barriers. The width of the
histograms is 4 kcal mol–1. The red curves note
the density estimation from the GMMs. The labels “min. ΔE”, “expo. ”, “mean ”, “med. ΔE”, and “std”
refer to the minimum, exponential
average, arithmetic average, median, and standard deviation of the
energy barriers from the QM/MM MEP profiles, respectively.Both R1-AE and R2-AE acylation pathways of Toho/AMP are accessible
as they show lower mean averaged barriers than the Toho/CEX ones.
The Toho/AMP pathways also resemble the energetic landscapes of ASβL/penam
acylation reported in previous studies.[37] The exponential-averaged acylation barrier is 16.98 kcal mol–1 for the Toho/AMP: R1-AE pathway (Figure a), which is 12.81 kcal mol–1 higher than the Toho/AMP: R2-AE pathway(4.17 kcal
mol–1, Figure c). The lowest Toho/AMP: R1-AE barrier is shown to
be 14.01 kcal mol–1, which is lower than the experimental
acylation barriers of ∼15.5 kcal mol–1. As
for Toho/CEX acylation, the pathway via the R2-AE
mechanism confers an exponential-averaged barrier of 14.33 kcal mol–1 (Figure d), which is 11.22 kcal mol–1 lower than
its R1-AE alternative (25.55 kcal mol–1, Figure b). Furthermore,
as the lowest energy barrier found on Toho/CEX: R1-AE pathways (22.35
kcal mol–1) is much higher than the estimated experimental
barrier (∼17.2 kcal mol–1), these pathways
are considered generally inaccessible. The viable acylation path for
Toho/CEX is therefore verified to be the R2-AE mechanism.[40]
Energetics Interpreted from BCIG Contributions
In order
to delineate the BCIG contributions of the chemical processes during
the acylation, each MEP dataset was used to train an ML-MEP model
to learn the correlation between the evolution of energetics and conformations
along the MEPs.The prerequisite for model expandability is
that the ML/DL model should correctly resemble the underlying variable
correlations, which also facilitate accurate predictions. The importance
of accounting for the pathway identity is reflected in the predictive
performances of our DaWNN architecture and conventional deep neural
networks (DNN) with identical hyperparameters and training configurations.
As shown in Figure a, the DaWNN models could accurately predict the replica-wise energy
with R2 scores >0.995. Meanwhile, the
predictive performance of the DNN models dramatically decreases (R2 < 0.940) in the absence of pathway distinguishability
(Figure S3). On the other hand, the DaWNN
models could correctly predict the barrier heights with RMSE <
2 kcal mol–1 (Figure b). While the improvement of including the pathway
identity of each replica is evident, intuitively, the identity encoding
extended the input feature space to higher dimensions and embedded
the replicas into their respective pathways, allowing each pathway
replica to be independently correlated to its own conformational/energetic
context.
Figure 4
Predictive performance and the BCIG contributions of the ML-MEP
models. The predictive performance of (a) replica energies and (b)
pathway barriers of (left to right) the Toho/AMP: R1-AE, Toho/CEX:
R1-AE, Toho/AMP: R2-AE, and Toho/CEX: R2-AE models. The BCIG contributions
of the models: (c) Toho/AMP: R1-AE; (d) Toho/CEX: R1-AE; (e) Toho/AMP:
R2-AE; and (f) Toho/CEX: R2-AE. The rankings (highest to lowest) of
the BCIG contributions are noted below.
Predictive performance and the BCIG contributions of the ML-MEP
models. The predictive performance of (a) replica energies and (b)
pathway barriers of (left to right) the Toho/AMP: R1-AE, Toho/CEX:
R1-AE, Toho/AMP: R2-AE, and Toho/CEX: R2-AE models. The BCIG contributions
of the models: (c) Toho/AMP: R1-AE; (d) Toho/CEX: R1-AE; (e) Toho/AMP:
R2-AE; and (f) Toho/CEX: R2-AE. The rankings (highest to lowest) of
the BCIG contributions are noted below.To interpret the ML-MEP models in terms of energetic contributions,
we first grouped the input features (as atomic distances) by the chemical
processes that they are involved in (Figure a,b, Table S1).
The contributions of each chemical process are quantified using the
proposed BCIG metrics. At this stage, the BCIG measurements are not
ready to be quantitatively compared between the ML-MEP models that
were trained from different pathway datasets (see below).For
the Toho/AMP: R1-AE pathways (Figure c), the highest BCIG contributions come from
the concerted proton transfers from Lys73 to the thiazolidine nitrogen
(P2 and P3). The proton abstraction of Glu166 carboxyl (P0 and P1)
was assessed to be moderately rate-limiting as they pose higher contributions
than the nucleophilic attacks of the Ser70 hydroxyl to the β-lactam
carbonyl (B0 and B1). The contributions for the Toho/CEX: R1-AE pathways
also suggest that the protonation of the cephem nitrogen (P2 and P3)
is the determinant factor with the highest contribution (Figure d). However, the
deprotonation of Ser70 Oγ (P1) and its nucleophilic attach to
the cephem carbonyl carbon (B0) were shown to considerably contribute
to the reaction profiles of Toho/CEX: R1-AE. On the R2-AE acylation
pathways, the concerted proton transfers from Lys73 to the β-lactam
nitrogen, bridged by Ser130 hydroxyl (P2 and P3), remains as the reaction
step of the highest BCIG contributions in both Toho/AMP and Toho/CEX
systems (Figure e,f).
Interestingly, in the Toho/AMP: R2-AE pathways, the highest individual
contribution comes from P2 (the proton transfer between Lys73 and
Ser130), while in Toho/CEX: R2-AE, it was determined as P3 (the protonation
of the cephem nitrogen). In summary, on all reactive pathways (Figure c–f), the
protonation of the β-lactam nitrogen (P2 and P3) is deemed to
be the highest energy contribution, in agreement with our previous
investigations on TEM-1/benzylpenicillin acylation.[85]Noteworthily, even if the contributions between two
systems cannot
be directly compared, the ordering of the contributions from various
chemical processes could qualitatively hint at the major factors that
differentiate the acylation kinetics of different systems. Specifically,
the ranking of contributions from the nucleophilic attack of Ser70
Oγ (B0 and P1) are higher for the Toho/CEX: R1-AE pathways (Figure d) than that of the
Toho/AMP: R1-AE pathways (Figure c). This suggests that the formation of the tetrahedral
intermediates for the CEX acylation is one of the main causes for
its decreased acylation activity. On the other hand, the ranked order
of BCIG contributions on the R2-AE pathways suggests that the tetrahedral
collapsing (B1 and P3) is the major source of the kinetic differences,
as P3 and B1 are respectively ranked the second- and the fourth-highest
BCIG contributions in the Toho/AMP contributions (Figure e), and increase to the first
and the third as in the Toho/CEX contributions (Figure f), respectively. These implications are
related to the BCIG-explained reactivities.
Reactivity Interpreted
from BCIG Contributions
As aforementioned,
the BCIG contributions cannot be compared among different models trained
from different MEP datasets due to that the distributions of input
features from different pathway datasets are not guaranteed to be
identical. In this case, the BCIG contributions were incompatible
with each other as the models from which the contributions were derived
do not share the same input space (Figure c). A solution is to derive BCIG contributions
from a new model that is trained on both datasets, which allows the
ML model to extract knowledge from the union input space of different
systems. Therefore, the pathway datasets of the same mechanism were
merged to produce the Toho/AMP&CEX: R1-AE and Toho/AMP&CEX:
R2-AE datasets. Additional ML-MEP models with the DaWNN architecture
were trained on the combined datasets with extended pathway identity
encodings. We note that the mixed ML-MEP models could deep-embed all
the input MEP samples through their hidden layers from both systems,
adapting their trainable weights to the mixing of training samples
from different sources. While this mixed embedding does not impact
the prediction accuracy of the model, from the perspective of sample
affiliations (Toho/AMP or Toho/CEX), the model’s knowledge
on the learned PES landscape of one system is likely biased. In other
words, when the DaWNNs are trained from a dataset with MEPs from different
systems, the BCIG-explained reactivity contribution of the same feature
from one system (e.g., the Toho/AMP: R1-AE pathways) can be compared
to its contribution to the MEPs of the other system (the Toho/CEX:
R1-AE pathways), as the knowledge landscape of the ML-MEP models is
aware of the coexistence and the merged sample distribution of both
systems in the training data. Meanwhile, the reactivity contributions
of different features from the same system cannot be compared because
the contributions of all feature dimensions in one system are biased
by the other.With the above awareness, we first examined the
predictive performance of the ML-MEP models trained on the unified
datasets. As shown in Figure a,b, the model prediction on the replica energies or the pathway
barriers remain at the same predictive accuracies (RMSEs < 2 kcal
mol–1 and R2 > 0.995)
compared with the models trained on individual datasets (Figure a,b). The scalability
of the DaWNN architecture for learning MEPs is demonstrated.
Figure 5
Predictive
performance and the BCIG contributions of the unified
ML-MEP models. The predictive performance of (left to right) the replica
energies and the pathway barriers of (a) Toho/AMP&CEX: R1-AE and
(b) Toho/AMP&CEX: R2-AE models. The BCIG contributions of (c)
Toho/AMP&CEX: R1-AE and (d) Toho/AMP&CEX: R2-AE models.
Predictive
performance and the BCIG contributions of the unified
ML-MEP models. The predictive performance of (left to right) the replica
energies and the pathway barriers of (a) Toho/AMP&CEX: R1-AE and
(b) Toho/AMP&CEX: R2-AE models. The BCIG contributions of (c)
Toho/AMP&CEX: R1-AE and (d) Toho/AMP&CEX: R2-AE models.The BCIG contributions are computed for the reactivity-explaining
models: Toho/AMP&CEX: R1-AE and Toho/AMP&CEX: R2-AE. For the
R1-AE pathways, the BCIG contributions to all chemical processes are
much higher in the Toho/CEX acylation MEPs than the Toho/AMP ones
(Figure c). As expected,
the correlated P2 and P3 contributions largely increase for the Toho/CEX
R1-AE acylation pathways, reflecting the less active protonation of
the cephem nitrogen. In addition, the nucleophilic serine attack (B0
and P1) in Toho/CEX is more than three times higher than that in Toho/AMP,
which aligns with the increment of rankings of B0 and P1 for the energetic
contribution in Toho/CEX (Figure c,d). While the acylation pathways are initialized
by the nucleophilic serine addition, the high BCIG contributions attributed
to this process indicate that the R1-AE acylation pathways for Toho/CEX
are unfavored compared to Toho/AMP. Combining the enzyme kinetics
discussed above, the interpretation of BCIG contributions shows that
the acylation pathway using solely Glu166 as the general base is turned
off for Toho/CEX due to its incapability to activate the serine attack
on the cephem carbonyl.On the other hand, the contributions
in the Toho/AMP&CEX: R2-AE
models demonstrated the same trend (Figure d): the BCIG values for most chemical processes
in Toho/CEX pathways are higher than those of the Toho/AMP pathways.
The differences of the BCIG contributions in the two systems mainly
come from the residue-involved processes: B0, P4 (the serine nucleophilic
attack to β-lactam carbonyl), and P2, P3 (the concerted proton
transfers to protonate the β-lactam nitrogen). Interestingly,
the BCIG contribution from C–N bond breaking (B1) for Toho/CEX
concerted base acylation is shown to be slightly lower than that for
Toho/AMP. In brief, the BCIG contributions from the Toho/AMP&CEX:
R2-AE MEP learning model show that the energy contributions of concerned
chemical processes on the Toho/CEX pathways are moderately higher
than the Toho/AMP pathways, suggesting lower acylation activity for
the Toho/CEX: R2-AE pathways.
Validation of the BCIG
Explanations
The mechanistic
insights into the Toho/β-lactam energetic and reactivity contributions
derived from the ML-MEP models using the BCIG metric are further validated
by additional QM/MM protocols. Specifically, we measured the atomic
distances between critical heavy atoms and performed the ChElPG population
analysis[86] on the 800 reactant conformations
in all MEPs.We first validate the BCIG-explained energy contribution
for each system. For the Glu166 mediated acylation pathways (R1-AE),
the concerted proton transfers from Lys73 to the β-lactam nitrogen
(P2 and P3) are determined as the reaction steps of the highest energy
contributions in both Toho/AMP and CEX pathways (Figure c,d). The proton transfer processes
in enzyme catalysis are known to highly correlate with the strength
of the hydrogen-bonding interactions between the proton donor and
the acceptor. Here, the atomic distances among Lys73 Nζ, Ser70
Oγ, and the β-lactam nitrogen (P2 and P3) are shown to
be the largest compared to that of other proton transfers (P0 and
P1, Figure a), suggesting
the weaker the hydrogen actions, the harder the proton transfers.
A similar conclusion could also be drawn on the R2-AE acylation pathways
(Figure e,f), where
the P2/P3 atomic distances are larger than that of P4 (Figure c).
Figure 6
Atomic distances between
the ChElPG charges on critical heavy atoms
in all reactant states at the B3LYP-D3/6-31+G**/C36 level. The (a)
atomic distances and (b) ChElPG charge profiles of the Toho/AMP: R1-AE
and Toho/CEX: R1-AE pathways. The (c) atomic distances and (d) ChElPG
charge profiles of the Toho/AMP: R2-AE and Toho/CEX: R2-AE pathways.
The meanings of the labels on x-axes are defined
in the symbolic legend (right panel).
Atomic distances between
the ChElPG charges on critical heavy atoms
in all reactant states at the B3LYP-D3/6-31+G**/C36 level. The (a)
atomic distances and (b) ChElPG charge profiles of the Toho/AMP: R1-AE
and Toho/CEX: R1-AE pathways. The (c) atomic distances and (d) ChElPG
charge profiles of the Toho/AMP: R2-AE and Toho/CEX: R2-AE pathways.
The meanings of the labels on x-axes are defined
in the symbolic legend (right panel).The BCIG-explained reactivities generally state that most of the
chemical processes in Toho/AMP acylation pathways are more reactive
than the Toho/CEX ones, regardless of the acylation mechanism. For
the R1-AE acylation pathways that are prohibitive for Toho/CEX, it
is shown that the atomic distances between the Lys73 amino and the
Ser130 hydroxyl (P2) and the Ser130 hydroxyl to the cephem nitrogen
(P3) are generally longer than those in Toho/AMP (Figure a). Moreover, the ChElPG charges
on the β-lactam nitrogen atoms in CEX are much higher than that
of the penam nitrogen of AMP (Figure b). These suggest the poor proton affinity of the CEX
nitrogen atoms, which lead to higher BCIG contributions of P2 and
P3 observed in Figure c. The contributions from the Ser70 addition processes (B0, P1) are
another source of the deactivation of R1-AE acylation pathways for
CEX. The key reactant distances in the two systems also suggest that
the serine attack in CEX is less favorable than that of AMP due to
the longer B0 distance in Toho/CEX (Figure a). Meanwhile, the ChElPG analysis shows
no significant change in the density population of Ser70 Oγ
upon binding of different ligands, but the atomic charges on the CEX
carbonyl carbon are found to decrease (Figure b). Such electrostatic features are expected
for the cephem scaffolds as the extended π-conjugations introduced
by the C3=C4 double bond could delocalize the densities on
the β-lactam carbonyl and nitrogen, leading to stronger resistance
to the nucleophilic serine addition for CEX acylation. In the BCIG-explained
ML-MEPs, this is reflected as the increased contributions from B0
and P1 observed in Toho/CEX: R1-AE pathways (Figure c).The general mechanistic insights
made for the different reactivity
in R1-AE acylation also hold for the R2-AE acylation pathways: the
cephem scaffold delocalizes the density on the β-lactam, promoting
its resistance to the protonation of the dihydrothiazine nitrogen
(P2, P3, Figure c)
and the nucleophilic serine addition (B0, P4, Figure d). As for the C–N bond breaking (B1)
in different systems, the BCIG contribution from the pairwise distance
between two bonded atoms refers not to the bond strength
between the atoms but the impact to the energy profiles when perturbations
are applied to the progress of that distance (eq ). Specifically, the BCIG contribution of
B1 in the R2-AE hybrid models being close (Figure d) does not indicate that the “bond
strengths” of the scissile C–N bonds are similar. Instead,
it suggests that the “sensitivity” of the energetics
of the QM/MM MEPs to the perturbation of the atomic distance between
the carbonyl carbon and the nitrogen are similar. This is further
supported by the reactant conformations and the atomic charges. As
shown in Figure c,
the β-lactam C–N bond distances are generally similar
in different systems; it is expected that the BCIG contributions of
B1 are close in Toho/CEX: R2-AE and Toho/AMP: R2-AE. On the other
hand, the distributions of ChElPG charges of the bonded atoms (the
β-lactam C, N) largely deviate between the two systems (Figure d), suggesting very
different densities and thus the “bond strength” of
the C–N scissile bonds in Toho/AMP and Toho/CEX systems. Alternatively,
the relative “bond strength” of the β-lactam carbonyl
nitrogen is reflected from the contribution of the serine attack (B0)
and the nitrogen protonation (P3), which are concerted processes to
the ring-opening (B1).
Discussions
In this work, we propose
the contribution attribution method, BCIG,
as an XAI technique to interpret ML-MEP models for mechanistic insights
into enzyme catalysis. Using the BCIG contributions, we demonstrate
how to extract interpretable mechanistic information that is difficult
to quantify due to the concerted nature of the enzymatic reaction
steps.The BCIG method is helpful to discern the thermodynamics
that governs
all chemical transitions. However, since the ML-MEP models are trained
based on the limited samples of MEPs, the direct application of the
Boltzmann factor to the cumulative IG could be limited by the size
of the training set. Therefore, a PDF was introduced to smoothen the
density of barriers distributions. Pioneering efforts have extensively
applied equivalent density smoothening assumptions, such as Gaussian
functions, for investigating or promoting the numerical convergence
of free-energy methods.[84,87,88] Moreover, von der Esch et al.[89] reported
that the adiabatic mapped desuccinylation barriers of Sirtuin five
could reasonably resemble a Gaussian distribution with ∼150
samples. In this regard, we further examined the applicability of
Gaussian distributions of our MEP barriers by a goodness-of-fit test.
It was shown that Toho/CEX: R2-AE barriers failed to approximate a
reasonable Gaussian distribution (Table S2). However, we note that the type of the PDFs does not seem to qualitatively
impact the BCIG results (Table S3, Figure S4). Furthermore, numerical features of
BCIG are characterized with a bootstrapping-based convergence test
(Supporting Information Methods). We note
that the BCIG contributions in all four pathway datasets converge
when ∼170 MEPs were used (Figure S5).Moreover, our BCIG approach inherits the advantage of “model
implementation invariance” from the IG formulism of Sundararajan
et al.,[29] with the contributions attributed
to the feature inputs being independent of the ML/DL algorithms used.
Based on the model implementation invariance, if two models predict
the same output on all input examples, the gradients of the predicted
hypersurface should also be equal everywhere, and thus the invariance
of the IG and the BCIG contributions. On the other hand, it is notable
that the “path implementation invariance” of the IG
approach is not preserved in the BCIG formulism for obvious reasons:
it is vital to account for the contribution of the chemical process
along a specific transition path (MEPs as in the current study). Thus,
the gradients calculated for each replica along the path are cumulatively
summed to achieve path specificity for the feature contributions.It is also worth mentioning that our explanations are purpose-oriented.
It is preferred to build individual ML-MEP models using datasets for
distinct chemical processes, such as Toho/AMP acylation pathways or
Toho/CEX acylation pathways in this study. Therefore, we intentionally
trained multiple ML-MEP models to learn different aspects of Toho-1-catalyzed
acylation. Practically, the energetic contributions in each acylation
pathway dataset are interpreted from models that learn only on a particular
pathway dataset, and the reactivity of the chemical processes between
different systems is interpreted from models trained on per mechanism
merged datasets.We further note the extensibility of the current
workflow based
on the BCIG approach. In terms of data preparation, any MEP sampling
method that effectively explores the transition region could be applied
to construct the MEP datasets. Thanks to the “model implementation
invariance,” the ML/DL algorithm used to learn the chemical
MEPs are not limited to the DaWNN as we used. However, it is important
to monitor the transformations applied to the model input and rationally
design the learning strategies. Herein, we sequentially applied feature
selections and pathway-wise feature rescaling, resulting in the largely
dimensionality-reduced feature space and the loss of the pathway context
for each replica. Accordingly, this is solved by hardcoding the pathway
identity, which leads to the DaWNN architecture. Lastly, the c term to be explained is not limited to “chemical
processes” as in the current study. The only requirement for
the representation of c is that it should describe
a certain interaction of interest, which is generally open to any
physical descriptors that are known to correlate with chemical transitions.
Conclusions
In this study, we first presented a QM/MM computational workflow
that achieves fast sampling of QM/MM MEPs for enzyme catalysis. Based
on the observation that the 3OB-f DFTB3 parameter set could correctly
resemble the bond dissociation distances during the β-lactam
acylation reaction, we optimized 800 MEP conformations and refined
the single-point energies using B3LYP-D3/6-31+G**/C36 calculations.
The energetics from this computational workflow are in good agreement
with previous calculations based on DFT/MM calculations.[40]ML-MEP models with high performance and
scalability using the DaWNN
architecture were developed for the ML of the QM/MM MEPs of enzyme
catalysis. Compared to conventional DNN models, the DaWNN architecture
achieves much higher accuracy in learning the energetic profiles from
the conformational evolutions along the QM/MM MEPs. Furthermore, the
DaWNN model is shown to be highly scalable to the training size or
the source of the training data without significant loss in performance.Inspired by the IG approach for explaining ML/DL models, we further
developed the BCIG approach to interpreting the ML-MEP models for
mechanistic insights into enzyme catalysis. Using Toho/AMP and Toho/CEX
as the model systems, the energetic and the reactivity contributions
of the processes with different substrates are quantified by the BCIG
attributions. The conformational factors that differentiate the Toho-1
acylation activities of AMP and CEX were identified. The BCIG contributions
quantified that the cephem scaffold was less susceptible to the nucleophilic
serine addition and the protonation of the β-lactam nitrogen
than the penam. Moreover, we presented a purpose-oriented training-explaining
strategy to focus on mode interpretability. Whereas the different
ML-MEP models are trained and interpreted for specific mechanistic
aspects, we have shown that the interpretations of different models
give consistent mechanistic insights that agree with our intuitive
mechanistic understandings and the validating QM/MM calculations on
the modeled systems.With the developments of robust ML/DL-based
QM/MM potential functions
and the implementations of thermodynamics-aware XAI approaches for
protein-related systems, one could foresee the emergence of the “full-stack”
(simulation to analysis) ML/DL-assisted QM/MM studies on enzyme catalytic
mechanisms in the near future.
Authors: K Vanommeslaeghe; E Hatcher; C Acharya; S Kundu; S Zhong; J Shim; E Darian; O Guvench; P Lopes; I Vorobyov; A D Mackerell Journal: J Comput Chem Date: 2010-03 Impact factor: 3.376