Alexander H Williams1,2, Chang-Guo Zhan1,2. 1. Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, Kentucky 40536, United States. 2. Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, Kentucky 40536, United States.
Abstract
Since the introduction of the novel SARS-CoV-2 virus (COVID-19) in late 2019, various new variants have appeared with mutations that confer resistance to the vaccines and monoclonal antibodies that were developed in response to the wild-type virus. As we continue through the pandemic, an accurate and efficient methodology is needed to help predict the effects certain mutations will have on both our currently produced therapeutics and those that are in development. Using published cryo-electron microscopy and X-ray crystallography structures of the spike receptor binding domain region with currently known antibodies, in the present study, we created and cross-validated an intermolecular interaction modeling-based multi-layer perceptron machine learning approach that can accurately predict the mutation-caused shifts in the binding affinity between the spike protein (wild-type or mutant) and various antibodies. This validated artificial intelligence (AI) model was used to predict the binding affinity (Kd) of reported SARS-CoV-2 antibodies with various variants of concern, including the most recently identified "Deltamicron" (or "Deltacron") variant. This AI model may be employed in the future to predict the Kd of developed novel antibody therapeutics to overcome the challenging antibody resistance issue and develop structural bases for the effects of both current and new mutants of the spike protein. In addition, the similar AI strategy and approach based on modeling of the intermolecular interactions may be useful in development of machine learning models predicting binding affinities for other protein-protein binding systems, including other antibodies binding with their antigens.
The SARS-CoV-2 virus (which causes
COVID-19) remains
the world health community’s number one priority
as the pandemic prepares to enter its third year.[1] Although potent vaccines have been developed for the virus,
widespread inoculation has been achieved primarily in well-developed
nations, leaving vast areas of lesser developed nations in peril against
the virus and its many variants.[2,3] Vaccine hesitancy within
the United States has led to multiple additional waves of COVID-19
infections, in which the B.1.617.2 (Delta) and B.1.1.529 (Omicron)
variants have played significant roles, especially within the southern
United States.[3−9] The Delta variant is noted for both its increased infectivity and
the severity of symptoms it causes.[7] Although
vaccines still remain largely effective against the Delta variant,
breakthrough infections are still possible, and the viral load carried
by vaccinated individuals is unaffected, a testament to the infectivity
of the Delta variant.[10−12] However, the resistance against mutations seen within
the fast-tracked vaccines from Pfizer, Moderna, and Janssen is not
carried over to the monoclonal antibody therapeutics that have also
been approved for COVID-19 treatment.[13−17] Of note is Ly-CoV555, an antibody that received FDA
emergency use authorization, which shows a dramatic reduction in efficacy
when used against the B.1.617.1 (Kappa), Delta, and most notably the
Omicron variants.[13,15,18−20]

Mutations seen within the receptor binding
domain (RBD) region
of the spike protein are of particular interest when developing COVID-19
variant-resistant antibodies. Due to the similar positioning of angiotensin-converting
enzyme II (ACE2) and many antibodies in relation to the spike protein’s
RBD region (see Supporting Information Figure
S1), the mutations that confer greater binding affinity to ACE2 may
be deleterious to the binding of the antibody.[21] Of note are the L452R and T478K mutations, both
of which are capable of greatly decreasing binding affinity with multiple
known antibodies on their own, including Ly-CoV555.[13] As variants continue to emerge, it is important to develop
an accurate and efficient methodology that can predict the effects
on binding affinity these mutations may have. During the early months
of the pandemic in 2020, our laboratory group was attempting to adapt
our in silico protein/ligand binding affinity methodology
to the spike protein and several engineered ACE2 proteins and ACE2
mimics to optimize the binding interactions between them. This methodology
is a very convenient, computationally inexpensive way of estimating
the Gibbs binding free energy using only energy minimization and molecular
mechanics Poisson–Boltzmann surface area calculations (MM-PBSA),
a methodology that our laboratory had seen great success with in multiple
protein/ligand systems.[22−27] Concurrently, the first variant of major public note, the N501Y variant
(later named B.1.1.7 or Alpha), began sweeping across the United Kingdom.
Our group then decided to include a prediction for the binding affinity
of this Alpha variant using the linear regression model that had been
established for the spike protein and the engineered ACE2 protein
and its mimics. Our prediction of 0.44 nM Kd for the Alpha variant was validated during the review process of
our first spike/ACE2 model as the Alpha variant’s Kd was determined in vitro at 0.8 nM,
less than a 0.5 nM difference.[28] Throughout
2021, additional waves of the COVID-19 virus manifested, introducing
new, even more heavily mutated variants (e.g., B.1.617.2
and B.1.1.529) to the public. Using the same computational methodology,
our laboratory was able to further predict the binding affinity of
the Omicron variant at 22.63 nM, a prediction that was later validated
during review, as in vitro experiments determined
the Kd at 20.69 nM.[29]

Since the publication of our previous methodologies
focusing on
the interactions between the spike protein and ACE2, a slew of new
antibody structures binding with the spike protein have been deposited
and have subsequently been used in in silico methodologies.
Both coarse-grained (CG) and all-atom molecular dynamics (MD) simulations
have been performed with numerous antibodies to study the changes
in the binding modes induced by the mutation of the spike protein
across the many variants of SARS-CoV-2.[30−32] These models have also
been used to produce reasonable estimates of the binding affinity;
however, these studies are more focused on a particular subset of
antibodies, rather than producing a more generalizable model.[32] With these new antibody structures available,
along with the empirically determined Kd values, we can update our model to accurately predict the binding
affinity of these antibodies with the different variants of the spike
protein.

However, building a model that can predict the affinity
of various
antibodies with the spike protein broadly is a much more challenging
task than building a model to determine how certain mutations will
affect the binding of a singular protein/protein pair. Generalized
linear regression models for the prediction of protein/protein interactions[33] do suffer in terms of their accuracy when compared
to models such as our previously published ACE2/spike mutation model.[28] This loss in accuracy is inherent to the methodology:
while a change in a residue in an otherwise unchanged protein structure
can be seen as a small perturbation of the overall system, the use
of many antibodies, each with its own unique binding mode with the
spike protein (Figure S1), cannot. This
issue is further compounded when considering the multitude of different
charge/charge combinations between the spike protein and the antibody,
which is variable due to several common mutations seen within the
SARS-CoV-2 spike protein (e.g., L452R and T478K).

To circumvent this loss in accuracy, we have
constructed a multi-layer
perceptron (MLP) neural network, a machine learning method that is
able to accurately predict the binding affinity of a wide variety
of antibodies, including several monoclonal antibody therapeutics
on offer from Regeneron, AstraZeneca, and Eli Lilly.[34] By including multiple predictors including the gas and
solvation free energy predictions from the MM-PBSA methodology, solvent-accessible
surface areas (SASA) of the complex, hydrogen bonds between the antibody
and spike protein, and the charges of each, we have developed and
cross-validated such an artificial intelligence model which can accurately
predict the binding affinity of various SARS-CoV-2 variants’
spike protein with numerous antibodies. We have then proceeded to
use the model to predict the binding affinity of these antibodies
with numerous variants of concern (VOCs) including the B.1.1.529 and
the newly published “Deltamicron” (or “Deltacron”)
variant identified within southern France.[35]
Methods
The methodology for preparing our MLP neural
network models consists
of three major stages: (1) structure preparation of spike/antibody
structures, (2) energy minimization utilizing the Sander module of
the Amber MD package followed by an MM-PBSA calculation, and finally,
(3) the construction, training, and cross-validation of the MLP models.[28,36] Once the MLP models are validated for their accuracy in both the
training and validation data sets, the model predictions may be averaged
to mitigate local inaccuracies located within one model.
Modeling of Spike Protein/Antibody Complexes
For the
first stage, multiple spike/antibody crystal and cryo-electron microscopy
(cryo-EM) structures were obtained from the RCSB database (PDB IDs
are available in Supporting Information Table S1), and the structures which had only one type of antibody
binding to the spike protein and had a known binding affinity Kd value were chosen for use in the MLP model.
Additional data points from mutated versions of these antibodies were
taken from Liu et al.,[37] who measured the Kd for numerous antibodies
with spike proteins that mimicked the B.1.617.1 and B.1.617.2 strains
of the RBD region of the spike protein. Each of these structures was
prepared using the PDB4AMBER module of the AmberTools suite, which
removed any non-protein atoms from the structure.[38] These prepared structures then had their initial coordinates
and parameters created using the tLEaP module of the AmberTools package;
this package also assigned the positions for the hydrogens for each
structure. For the non-wild-type data points, the structures were
perturbed using the PyMol mutation tool to replace each residue with
its respective mutation in the RBD region of the spike protein, making
sure that the lowest-energy rotamer available was chosen to avoid
large steric clashes with nearby residues.[44]

For the second stage, to prepare the complexes for their interaction
energy calculation with the MM-PBSA method, we employed our previous
methodology that employs computationally efficient energy minimization
over a two-step process. This methodology has been shown to accurately
estimate the binding affinity in both protein/protein and protein/ligand
systems[22−27] and has previously been used to engineer anti-cocaine therapeutics.[39,40] This methodology, although simpler than other methodologies
that employ all-atom or CG MD simulations, has major advantages. Sufficiently
long MD simulations require extensive computational resources and
can require days of computing time before a system may enter equilibrium.
Additionally, over these long simulations, the limitations placed
upon these MD simulations (e.g., force field cutoffs,
periodic conditions, etc.) can introduce artifacts
into the system, impacting the actual accuracy of the simulation.
Usage of energy minimization also limits the number of snapshots required
to determine the ΔG of the system, as only one
snapshot is needed, in contrast to MD simulations or free-energy
perturbation methods.[41,42] However, this methodology does
assume that the structure used within the energy minimization/MM-PBSA
calculations is close to its equilibrium state (i.e., unperturbed), and the amino acid mutations inserted into each of
these structures are a small perturbation away from the equilibrium
state, to which the structure can be returned by the energy minimization steps.

The wild-type protein/protein complex structures were energy-minimized
over a two-step process using the CUDA-optimized version of PMEMD
in the AMBER20 package.[38] The first round
of energy minimization used 1000 steps of steepest descent minimization
followed by 4000 steps of conjugate gradient energy minimization,
using 10 kcal/(mol·Å²) restraints upon the backbone alpha carbons of
the proteins. The second round of energy minimization reduced these
restraints to 2 kcal/(mol·Å²). Interaction (binding) energies of the
spike/antibody complexes were then evaluated using the MMPBSA.py module
of AMBER20, using the MM-GBSA methodology.[36,38,43] The ΔG_GB values were decomposed into their constituent electrostatic (EEL),
van der Waals (vdW), and solvation (SOLV) energies for use in training/validation
of our model. Unlike our other empirical models, the residues mutated
within the spike protein variants induced a change in charge, which
caused overestimations of the ΔΔG_EEL between variants using the standard MM-PBSA method. To
account for these overestimations, we implemented a distance-dependent
dielectric coefficient that we previously employed to correctly predict
the binding affinity of ligands with the nicotinic acetylcholine receptors.[44−46] The distance between the charge centers of the antibody and spike
protein was calculated by exporting the minimized parameters and topology
of each in the PQR format. Using the product of each atom’s
charge and XYZ coordinates, an average location for
the overall charge in each structure was calculated, followed by determining
the distance between these two points using the Euclidean distance
formula. Finally, the SASA of each complex, antibody, and spike protein
was calculated using a 1.4 Å probe via the CPPTRAJ
module of the AMBER package.[47]
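As a minimal sketch of the charge-center distance described above, the following Python example parses PQR-style records, computes a charge-weighted mean position for each molecule, and measures the Euclidean distance between the two centers. The function names, the whitespace-delimited parsing of the last five fields (x, y, z, charge, radius), the normalization by the net charge, and the toy coordinates are illustrative assumptions rather than the authors' script. How molecules with zero net charge were treated is not stated in the text, so the sketch simply raises an error in that case.

```python
import math


def parse_pqr_atoms(pqr_text):
    """Return (x, y, z, charge) tuples from whitespace-delimited PQR records.

    Assumption: the last five fields of each ATOM/HETATM record are
    x, y, z, charge, radius (the usual whitespace-delimited PQR layout).
    """
    atoms = []
    for line in pqr_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            x, y, z, charge, _radius = map(float, line.split()[-5:])
            atoms.append((x, y, z, charge))
    return atoms


def charge_center(atoms):
    """Charge-weighted mean position: sum(q_i * r_i) / sum(q_i)."""
    total_charge = sum(atom[3] for atom in atoms)
    if abs(total_charge) < 1e-6:
        # Assumption: a net-neutral molecule has no well-defined center here.
        raise ValueError("net charge is ~0; charge center is undefined")
    return tuple(
        sum(atom[3] * atom[axis] for atom in atoms) / total_charge
        for axis in range(3)
    )


def charge_center_distance(pqr_a, pqr_b):
    """Euclidean distance (in the units of the input, e.g. angstroms)."""
    center_a = charge_center(parse_pqr_atoms(pqr_a))
    center_b = charge_center(parse_pqr_atoms(pqr_b))
    return math.dist(center_a, center_b)


# Toy, hypothetical inputs (not taken from any real structure).
TOY_ANTIBODY = """\
ATOM      1  N   ALA     1       0.000   0.000   0.000  1.0000 1.8240
ATOM      2  CA  ALA     1       2.000   0.000   0.000  1.0000 1.9080
"""
TOY_SPIKE = """\
ATOM      1  N   LYS     1       1.000   3.000   4.000  2.0000 1.8240
"""

toy_distance = charge_center_distance(TOY_ANTIBODY, TOY_SPIKE)
print(f"charge-center distance: {toy_distance:.2f}")
```

Reading the coordinates from the end of each record keeps the parser tolerant of an optional chain identifier column, which some PQR writers include and others omit.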
Construction of the MLP Neural Network Model
Once the
necessary in silico parameters for each complex were
generated, the in vitro Kd values were
converted to kcal/mol units to match those of the MM-GBSA calculations.
These converted Kd values were set as
the response variable in the JMP 16.0 software package, utilizing
the neural predictive modeling package.[48] The neural network structure consists of 11 input nodes
(i.e., electrostatic, vdW, and solvation energy decomposed
from the MM-GBSA calculation, long-range electrostatic energy (LRE),
antibody and spike overall charges, distance between antibody and
spike protein charge centers, hydrogen bonds between the spike protein
and antibody, and the surface areas of the overall complex, spike
protein, and the antibody), feeding forward into two hidden
layers of 11 nodes each, which in turn feed forward into a single output
node that provides the binding free energy (ΔG) prediction of the antibody/protein complex in question (Figure 1).
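The conversion of in vitro Kd values to kcal/mol mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' script: the function names are hypothetical, Kd is referenced to a 1 M standard state, and the temperature of 300 K is an assumption chosen because it appears consistent with the Kd/ΔG pairs listed in Table 1.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_nm_to_dg(kd_nm, temperature_k=300.0):
    """Binding free energy (kcal/mol) from a dissociation constant in nM.

    Uses dG = RT ln(Kd) with Kd in mol/L, so tighter binding (smaller Kd)
    gives a more negative dG. The default temperature is an assumption.
    """
    kd_molar = kd_nm * 1e-9
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def dg_to_kd_nm(dg_kcal_per_mol, temperature_k=300.0):
    """Inverse conversion: dissociation constant in nM from dG (kcal/mol)."""
    kd_molar = math.exp(dg_kcal_per_mol / (R_KCAL_PER_MOL_K * temperature_k))
    return kd_molar * 1e9


for kd in (3.90, 1.00, 76.30):
    print(f"Kd = {kd:6.2f} nM -> dG = {kd_nm_to_dg(kd):.2f} kcal/mol")
```

With these assumptions, a Kd of 3.90 nM maps to about −11.54 kcal/mol, matching the corresponding experimental ΔG entry in Table 1.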
Figure 1
Schematic of the MLP
neural network used to predict spike/antibody
binding affinities. The network is divided into four sections: an
input layer, two fully connected hidden layers, and a final singular
output layer.
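The architecture in Figure 1 (11 inputs, two fully connected hidden layers of 11 TanH nodes, one output) can be illustrated with a dependency-free forward pass. The weights below are random placeholders, and the initialization range, zero biases, linear output node, and function names are assumptions made for illustration; the networks in this study were trained in JMP, whose internals are not reproduced here.

```python
import math
import random

N_INPUTS = 11   # one node per descriptor (energies, charges, distance, ...)
N_HIDDEN = 11   # nodes in each of the two hidden layers


def init_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases for one dense layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: activation(W @ x + b), or linear if None."""
    outputs = []
    for weight_row, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(weight_row, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


def mlp_forward(features, params):
    """11 -> 11 (tanh) -> 11 (tanh) -> 1 (linear, assumed) forward pass."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} features, got {len(features)}")
    hidden1 = dense(features, *params[0], activation=math.tanh)
    hidden2 = dense(hidden1, *params[1], activation=math.tanh)
    (output,) = dense(hidden2, *params[2])
    return output


rng = random.Random(0)
params = [
    init_layer(N_INPUTS, N_HIDDEN, rng),
    init_layer(N_HIDDEN, N_HIDDEN, rng),
    init_layer(N_HIDDEN, 1, rng),
]
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in params)
print("trainable parameters:", n_parameters)
print("untrained output:", mlp_forward([0.1] * N_INPUTS, params))
```

Because the raw descriptors span very different magnitudes (surface areas in the tens of thousands versus charges of a few units), a practical implementation would normally standardize the inputs before the first layer; that step is omitted here.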
This setup (Figure 1) allows each of the input variables collected
in the previous
stages to have its own input node for weight training
while simultaneously avoiding overfitting the data with an overly
complex model.[49,50] Each node within the hidden layer
uses the TanH activation function, which has previously been shown
to be superior in accuracy and the root mean square error (RMSE) in
MLP performance meta-analyses.[51] To train
and cross-validate the model, the 48 data points collected were randomly
split into training and validation sets, using 3-fold (K = 3) cross-validation,
allowing each of our models to receive 67% (32 points) of the available
data as a training set, while the other 33% (16 points) is used as
the validation set to train the weights for each fold.[52] This cross-validation method allows for efficient
use of the relatively small set of available Kd values measured for antibody/protein complexes and allows
us to determine the generalizability of the model over the entire
data set. Each K-fold neural-network model was trained over 10,000
iterations, and loss was measured using the RMSE of the validation
set’s prediction versus the empirically determined binding
affinity values. Finally, the predictions of the K-fold models were
averaged to give the final spike/antibody Kd prediction.

Once the MLP model was established, the structures
for multiple
antibody/spike complexes were reconstructed for each variant’s
set of mutations using the PyMol mutation tool. The variants chosen
for analysis are the VOCs listed by the CDC in June 2021. Much like
our previous work investigating the effects of spike protein mutations
with ACE2,[28] the mutations investigated
were treated as small structural perturbations of the overall spike
protein structure, which is made possible due to the availability
of reliable experimental crystal or cryo-EM structures of these antibody/spike
complexes, whose energy-minimized structures produce results consistent
with in vitro data.[22−27] This methodology allows us to avoid long-timescale MD simulations
of the spike/antibody complex that can introduce artifacts into the
simulated system from the imperfect force field and from the truncation
of long-distance interactions.[53−57] Additionally, these long MD simulations allow for deviations away
from the original crystal/cryo-EM structures, thus losing the overall
equilibrium state of the complex captured by these structural methods
(Supporting Information Figure S17). These
complex structures were similarly energy-minimized, and their interaction
energies were calculated. The values obtained from the MM-GBSA calculation,
along with the charge center, SASA, and H-bond measures, were then
used within our MLP network to obtain the predicted antibody affinity
for the mutated spike protein. To obtain the predicted Kd values of these antibodies against the SARS-CoV-2 variants,
the difference between the ΔG_GB value
of the WT spike protein and that of each variant was applied to the ΔG_exp obtained from in vitro experiments.
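The final step above, in which a computed WT-to-variant shift is applied to the experimental WT binding free energy, can be sketched as a short worked example. The function name, the numerical inputs, and the temperature of 300 K are hypothetical and serve only to illustrate the arithmetic: ΔG_variant = ΔG_exp(WT) + [ΔG_calc(variant) − ΔG_calc(WT)], followed by Kd = exp(ΔG_variant/RT).

```python
import math

RT_KCAL_PER_MOL = 1.987204e-3 * 300.0  # assumed temperature of 300 K


def predict_variant_kd_nm(kd_wt_exp_nm, dg_calc_wt, dg_calc_variant):
    """Shift the experimental WT affinity by a computed ddG.

    kd_wt_exp_nm    : experimental WT dissociation constant (nM)
    dg_calc_wt      : computed binding free energy of the WT complex (kcal/mol)
    dg_calc_variant : computed binding free energy of the variant complex
    Returns the predicted variant Kd in nM.
    """
    dg_exp_wt = RT_KCAL_PER_MOL * math.log(kd_wt_exp_nm * 1e-9)
    ddg = dg_calc_variant - dg_calc_wt          # > 0 means weaker binding
    dg_variant = dg_exp_wt + ddg
    return math.exp(dg_variant / RT_KCAL_PER_MOL) * 1e9


# Hypothetical numbers: a 1.0 nM antibody whose computed energy worsens
# by 1.0 kcal/mol upon mutation of the spike protein.
example_kd = predict_variant_kd_nm(1.0, dg_calc_wt=-12.0, dg_calc_variant=-11.0)
print(f"predicted variant Kd: {example_kd:.2f} nM")
```

Anchoring the prediction to the experimental WT value means that only the relative shift between WT and variant has to be computed accurately, which is the premise of treating each mutation as a small perturbation.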
Results and Discussion
Structural Insights into the Binding Modes of VOCs
Binding Interactions of B.1.617.2 (Delta) Variant’s Spike
Protein with Human ACE2
We have previously utilized the available
structures of ACE2 binding with the spike protein to investigate the
effects the N501Y mutation had on the interactions within the ACE2/spike
interface.[28] Although the mutations of
the Delta variant are still located in the RBD region of the spike
protein, the L452R and T478K mutations are not located within the
interface between the spike protein and ACE2. These mutations are
either closer to the core of the spike protein (e.g., L452R) (Figure 2B) or are located on the receptor binding motif (RBM) but too far
away to closely interact with any ACE2 residues (e.g., T478K) (Figure 2B). A commonality between these two residues is their change from
an uncharged residue to a positively charged residue; this change
alters the electrostatic surface of the spike protein and thus its
affinity with ACE2, which has an overall charge of −27. With
this large number of negative charges within ACE2, it is not surprising
that many of the VOCs utilize such mutations within the RBD region
to gain affinity with ACE2 [e.g., B.1.1.7 (Alpha),
B.1.351 (Beta), B.1.427 (Epsilon), B.1.429 (Epsilon), B.1.525 (Eta),
B.1.526 (Iota), B.1.526.1, B.1.617.1-2, P.1 (Gamma), P.2 (Zeta), C.37
(Lambda), and B.1.1.529 (Omicron)].[58] However,
the distance of these residues from the interface between spike and
ACE2 limits their effects on modifying any existing interactions between
the two. This can be seen in Figure 2C,D, which reveals the low rmsd between the two energy-minimized
protein structures (rmsd = 0.046) (Figure 2A,B).
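For readers unfamiliar with the metric, the rmsd between two structures can be computed as sketched below. This is a generic illustration, not the procedure used in this study: it assumes the two coordinate lists contain the same atoms in the same order and are already in a common frame (no superposition is performed), and the toy coordinates are hypothetical. The text does not state which atoms were included or whether the structures were superposed first.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate sets."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom example: the second set is shifted by 1 unit along x.
reference = [(0.0, 0.0, 0.0), (1.5, 2.0, 3.0), (4.0, 0.0, 1.0)]
shifted = [(x + 1.0, y, z) for x, y, z in reference]
print(rmsd(reference, reference), rmsd(reference, shifted))
```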
Figure 2
(A) Overall binding structure of the WT
spike protein with ACE2,
with L452 and T478 indicated in stick representation. (B) Overall
binding structure of the B.1.617.2 (Delta) spike protein and ACE2,
with L452R and T478K mutations in ball and stick representation. (C)
Detailed representation of the WT spike and ACE2 binding mode, showing
several hydrogen bonds of interest at the interface between the two
proteins. (D) Detailed binding mode of Delta spike and ACE2, revealing
limited deviation in position for the mutated spike protein’s
residues. All hydrogen bonds in (C) are retained.
Mutational Escape of Spike VOCs against Antibodies
Although the Delta variant is known for its increased infectivity,
its mutational escape against several antibodies is also notable.
Although there are few examples in the literature of empirical binding
data of antibodies against the B.1.617.1 (Kappa) spike protein, Liu et al. released a study of the effects of the T478K and
E484Q mutations in the spike RBD region against several of their identified
COVOX-series antibodies.[37] Two antibodies,
COVOX-316 and COVOX-384, were notable for the drastic increase in
their Kd against the B.1.617.1 protein
relative to the WT spike, 200- and >500-fold, respectively. Identifying
these Kd values against the Delta spike’s
RBD
region while also elucidating the cryo-EM structure of these antibody
complexes with the spike protein provides an opportunity to establish
a predictive model like our previously published model with the spike/ACE2
complex and enables the prediction of Kd values
across a representative sample of known antibodies.[37]

COVOX-316 and COVOX-384 bind to the RBM of the spike
protein in much the same way as ACE2 does (Figure 3A,D). However, both antibodies are in much
closer proximity to the L452 and E484 mutational sites, making them
much more susceptible to the effects of these changes. Although the
L452R mutation’s primary effect is the addition of deleterious
+/+ charge interactions with the positively charged antibody, the
E484Q mutation has a much larger effect. The mutated glutamine residue
is unable to fit within the same pocket as the glutamic acid and thus
loses two strong hydrogen bonds with COVOX-316 (Figure 3B,C). In COVOX-384, the E484Q
mutation is more devastating, removing a critical +/– charge
interaction between E484 and R52 of COVOX-384 (Figure 3E,F). Additionally, a +/+ charge repulsion
between the mutated R452 and H56 further adds to the loss in binding
affinity between these two proteins (Table 1). This drastic loss in interactions within
COVOX-384 explains the “knockout” reported by Liu et al. regarding the Kd value
of the complex. Several antibodies share a similar positioning of
an arginine residue in the range of the L452R mutation including COVA2-04,
C1A-C2, C1A-F10, C1A-B3, and C1A-B12 (Figures S6 and S9–S11), a potential issue if these antibodies
were to be trialed against either the Kappa or Delta variants.
Figure 3
(A) Overall
binding mode of COVOX-316 and the B.1.617.1 variant
spike protein, with the mutated residues R452 and Q484 in green ball
and stick representation. (B) Detailed binding mode of COVOX-316 with
WT spike protein, showing the hydrogen bonds within the complex. (C)
Binding mode of COVOX-316 with Delta spike protein. The E484Q mutation
causes a loss of two hydrogen bonds between E484/S494 and E484/N54
of the spike and antibody, respectively, leaving only the hydrogen
bond with Y33 with the Q484 of the mutated spike protein. (D) Overall
binding mode of COVOX-384 and the Delta variant spike protein, with
mutated residues in green ball and stick representation. (E) Binding
mode of COVOX-384 with WT spike protein; E484 forms a +/– charge
interaction with R52 of the antibody, which is destroyed upon mutation
to Q484, along with a hydrogen bond with the backbone amine of F490
(F).
Table 1
Training Data Used for the Establishment
of the MLP Neural Network, along with the In Vitro Experimental Binding Energies and Predicted ΔG Values (kcal/mol)
strain
antibody
EEL (kcal/mol)
VDW (kcal/mol)
GAS (kcal/mol)
SOLV (kcal/mol)
LREa (Hartrees)
antibody charge
spike charge
distanceb (Å)
complex surface (Å2)
spike surface (Å2)
antibody surface (Å2)
H-bonds
Kdc (nM)
ΔGd (kcal/mol)
predicted ΔGe (kcal/mol)
WT
AZD1061[37]
–269.46
–77.95
–347.41
265.56
–0.05
5
2
45.70
26086.97
18300.65
9303.83
7
3.90
–11.54
–12.04
WT
AZD8895[37]
–99.10
–72.96
–172.06
115.85
–0.04
5
2
39.11
26638.58
18504.48
9467.62
6
1.00
–12.35
–11.84
WT
C1A-B3[61]
–128.24
–116.13
–244.36
150.00
–0.07
6
3
34.77
23735.05
17006.07
9145.51
8
76.30
–9.77
–9.95
WT
C1A-C2[61]
–321.24
–122.33
–443.57
345.33
–0.08
3
3
49.00
25301.64
18396.42
9537.21
11
14.10
–10.78
–10.36
WT
C1A-F10[61]
–216.73
–114.17
–330.89
233.77
–0.08
5
3
41.72
24711.42
17710.94
9458.29
9
55.70
–9.96
–10.07
WT
CC12.1[62]
–121.01
–93.60
–214.61
221.08
–0.05
4
3
41.59
22147.20
16163.98
8829.57
10
17.00
–10.66
–10.19
WT
CC12.3[62]
–87.60
–100.28
–187.88
118.23
–0.04
5
4
51.74
25532.84
17837.53
9615.36
6
14.00
–10.78
–10.76
WT
COVA2-04[63]
–223.60
–130.87
–354.47
253.52
–0.07
4
2
34.06
24389.47
17863.50
9118.65
7
40.00
–10.15
–10.04
WT
COVOX-150[37]
–237.88
–130.69
–368.56
256.08
–0.07
4
3
55.10
24272.68
17905.28
9064.87
8
0.57
–12.69
–12.64
WT
COVOX-158[37]
–172.95
–112.40
–285.35
198.85
–0.06
4
3
39.99
23722.72
16853.48
9218.80
8
1.40
–12.15
–11.81
WT
COVOX-222[37]
–235.74
–118.04
–353.78
255.04
–0.06
2
2
33.28
23836.53
16887.70
9274.22
7
0.25
–13.18
–12.89
WT
COVOX-253[37]
–138.11
–69.95
–208.06
162.06
–0.04
3
1
48.69
25509.26
17822.87
9089.26
7
0.51
–12.76
–12.68
WT
COVOX-269[37]
–114.77
–138.27
–253.04
143.41
–0.07
7
3
46.79
24935.20
18424.49
9276.91
10
0.52
–12.74
–12.76
WT
COVOX-278[37]
–206.63
–90.13
–296.76
207.93
–0.06
6
3
37.77
25211.40
17444.56
9509.06
8
1.60
–12.07
–11.49
WT
COVOX-316[37]
41.74
–75.93
–34.20
–16.57
–0.03
8
3
41.95
24697.54
16854.09
9315.47
2
0.38
–12.93
–13.22
WT
COVOX-384[37]
–1.01
–74.98
–76.00
22.88
–0.03
7
4
52.84
26393.46
18008.86
9635.81
1
0.65
–12.61
–12.55
WT
COVOX-40[37]
–143.09
–116.85
–259.95
158.64
–0.07
5
4
52.15
25831.58
18107.25
10191.22
13
0.34
–13.00
–13.50
WT
CR3022[64]
–316.93
–106.82
–423.75
323.02
–0.07
3
4
36.23
25756.90
18227.55
9606.00
8
68.00
–9.84
–9.92
WT
LY-CoV488[37,61]
–409.96
–87.61
–497.57
410.27
–0.08
0
2
0.00
25032.37
17670.79
9295.96
9
54.00
–9.98
–9.65
WT
LY-CoV555[37,65]
–90.46
–93.19
–183.66
99.17
–0.05
8
2
21.38
26215.80
18375.75
9674.69
3
1.45
–12.13
–12.53
WT
REGN10933[37,66]
–216.00
–77.40
–293.40
238.72
–0.05
1
2
92.38
28940.33
19871.40
10808.07
6
0.38
–12.93
–12.90
WT
REGN10987[37,66]
–41.04
–63.99
–105.03
69.98
–0.02
5
2
18.26
27916.56
18461.53
10639.06
4
0.81
–12.48
–12.62
B.1.617.1
AZD1061
–226.77
–79.91
–306.68
221.52
–0.05
5
4
46.61
26333.59
18247.78
9651.99
7
6.70
–11.22
–11.59
B.1.617.1
AZD8895
–13.93
–71.38
–85.31
28.90
–0.04
5
4
39.77
27406.48
19052.16
9704.11
7
3.90
–11.54
–11.24
B.1.617.1
COVOX-150
–212.20
–130.34
–342.55
221.40
–0.07
4
5
51.23
24740.35
18286.38
9177.15
10
0.70
–12.57
–13.83
B.1.617.1
COVOX-158
–151.24
–111.05
–262.29
172.04
–0.06
4
5
37.46
24345.54
17207.79
9508.15
10
1.40
–12.15
–11.92
B.1.617.1
COVOX-222
–224.82
–115.86
–340.69
238.36
–0.07
2
4
30.55
23984.29
17061.75
9305.89
10
1.40
–12.15
–12.40
B.1.617.1 | COVOX-253 | –105.82 | –66.88 | –172.70 | 127.81 | –0.04 | 3 | 3 | 38.15 | 25688.33 | 17846.93 | 9220.55 | 6 | 1.90 | –11.97 | –11.90
B.1.617.1 | COVOX-269 | –33.03 | –140.32 | –173.35 | 63.19 | –0.06 | 7 | 5 | 42.49 | 25273.68 | 18619.63 | 9407.03 | 11 | 1.50 | –12.11 | –12.22
B.1.617.1 | COVOX-278 | –100.56 | –95.83 | –196.39 | 107.79 | –0.05 | 6 | 5 | 35.29 | 25587.32 | 17700.77 | 9736.49 | 9 | 22.00 | –10.51 | –10.67
B.1.617.1 | COVOX-316 | 311.56 | –82.17 | 229.39 | –264.31 | –0.01 | 8 | 5 | 34.97 | 24674.69 | 16764.88 | 9337.28 | 1 | 1623.00 | –7.95 | –7.67
B.1.617.1 | COVOX-384 | 255.82 | –76.57 | 179.25 | –227.98 | –0.01 | 7 | 6 | 43.07 | 26777.68 | 18192.43 | 9828.18 | 3 | 4000.00 | –7.41 | –7.54
B.1.617.1 | COVOX-40 | –92.14 | –122.14 | –214.28 | 109.53 | –0.06 | 5 | 6 | 46.94 | 26394.45 | 18546.29 | 10359.61 | 12 | 0.33 | –13.01 | –12.49
B.1.617.1 | REGN10933 | –175.61 | –77.75 | –253.36 | 195.96 | –0.05 | 1 | 4 | 91.41 | 29099.79 | 20076.96 | 10706.73 | 6 | 0.13 | –13.57 | –13.55
B.1.617.1 | REGN10987 | –4.14 | –68.69 | –72.83 | 30.77 | –0.03 | 5 | 4 | 25.91 | 28161.94 | 18723.90 | 10712.40 | 4 | 0.44 | –12.84 | –12.58
B.1.617.2 | AZD1061 | –241.80 | –78.22 | –320.02 | 269.66 | –0.06 | 5 | 4 | 48.11 | 26428.70 | 18364.63 | 9616.34 | 7 | 6.90 | –11.20 | –10.74
B.1.617.2 | AZD8895 | –68.24 | –75.81 | –144.05 | 144.32 | –0.03 | 5 | 4 | 37.62 | 27252.96 | 19050.93 | 9660.93 | 6 | 1.50 | –12.11 | –12.24
B.1.617.2 | COVOX-150 | –188.94 | –129.24 | –318.19 | 274.25 | –0.07 | 4 | 5 | 48.84 | 24743.11 | 18245.38 | 9231.71 | 10 | 0.77 | –12.51 | –13.31
B.1.617.2 | COVOX-158 | –190.27 | –108.09 | –298.36 | 266.29 | –0.07 | 4 | 5 | 34.96 | 24337.77 | 17167.34 | 9600.56 | 10 | 2.40 | –11.83 | –11.79
B.1.617.2 | COVOX-222 | –214.29 | –117.07 | –331.36 | 301.95 | –0.06 | 2 | 4 | 26.41 | 24166.89 | 17096.96 | 9447.41 | 10 | 0.52 | –12.74 | –12.22
B.1.617.2 | COVOX-253 | –207.08 | –69.15 | –276.23 | 260.99 | –0.05 | 3 | 3 | 39.48 | 25698.71 | 17939.00 | 9218.31 | 7 | 0.96 | –12.38 | –12.42
B.1.617.2 | COVOX-269 | –5.13 | –140.46 | –145.59 | 113.60 | –0.06 | 7 | 5 | 40.86 | 25517.58 | 18815.95 | 9461.17 | 10 | 0.73 | –12.54 | –12.44
B.1.617.2 | COVOX-278 | –157.34 | –95.21 | –252.55 | 201.98 | –0.06 | 6 | 5 | 34.46 | 25848.16 | 17877.36 | 9857.21 | 8 | 6.40 | –11.25 | –11.22
B.1.617.2 | COVOX-316 | 203.42 | –75.78 | 127.64 | –118.28 | –0.03 | 8 | 5 | 35.32 | 26552.11 | 18185.91 | 9809.75 | 2 | 0.78 | –12.50 | –12.47
B.1.617.2 | COVOX-384 | 141.35 | –71.84 | 69.51 | –87.59 | –0.03 | 7 | 6 | 42.50 | 28084.62 | 18938.32 | 10393.09 | 3 | 5.60 | –11.33 | –11.43
B.1.617.2 | COVOX-40 | –64.52 | –118.51 | –183.03 | 143.12 | –0.06 | 5 | 6 | 43.55 | 26454.01 | 18592.44 | 10344.35 | 14 | 0.47 | –12.80 | –11.71
B.1.617.2 | REGN10933 | –234.27 | –75.22 | –309.48 | 308.17 | –0.05 | 1 | 4 | 92.68 | 28988.04 | 19942.75 | 10747.29 | 6 | 0.74 | –12.53 | –12.75
B.1.617.2 | REGN10987 | –18.88 | –69.00 | –87.88 | 99.78 | –0.03 | 5 | 4 | 29.93 | 28046.79 | 18694.79 | 10638.52 | 5 | 0.38 | –12.93 | –12.71
B.1.617.2 | Ly-Cov555 | 252.20 | –74.30 | 177.90 | –187.88 | –0.04 | 6 | 6 | 29.42 | 25682.40 | 19568.10 | 11423.69 | 4 | 31.00 | –10.31 | –9.58
RMSE | 0.38
LRE interaction as calculated using the distance-dependent dielectric function.
Distance between the overall charge center of the spike and antibody proteins.
Experimentally determined Kd.
Experimental binding affinity[6] converted to Gibbs binding free energy: ΔGexp = RT ln(Kd).
Computationally predicted ΔG in kcal/mol using the MLP Neural Network.
(A) Overall binding mode of COVOX-316 and the B.1.617.1 variant spike protein, with the mutated residues R452 and Q484 in green ball and stick representation. (B) Detailed binding mode of COVOX-316 with WT spike protein, showing the hydrogen bonds within the complex. (C) Binding mode of COVOX-316 with Delta spike protein. The E484Q mutation causes a loss of two hydrogen bonds between E484/S494 and E484/N54 of the spike and antibody, respectively, leaving only the hydrogen bond between Y33 and the Q484 of the mutated spike protein. (D) Overall binding mode of COVOX-384 and the Delta variant spike protein, with mutated residues in green ball and stick representation. (E) Binding mode of COVOX-384 with WT spike protein; E484 forms a +/– charge interaction with R52 of the antibody, which is destroyed upon mutation to Q484, along with a hydrogen bond with the backbone amine of F490 (F).
Delta Variant’s Escape of Ly-CoV555
Upon closer inspection, the cause of the large loss in efficacy of Ly-CoV555 against the Delta variant becomes clear when one looks at the antibody/spike interface, particularly at the L452R mutation site (Figure 4A,B).[7] The mutation from leucine to arginine constitutes a shift in both residue size and charge, putting it in stark contrast to the hydrophobic Ly-CoV555 residues I54 and L55 with which it interfaces. Additionally, the L452R and T478K mutations within the Delta variant introduce two additional positive charges into the spike protein, shifting its total charge from +2 to +4 and thus inducing multiple sources of positive/positive charge repulsion, the strongest of which comes from the nearby R24 on the Ly-CoV555 light chain. These mismatches in residue affinity lead to a 20% loss in LRE energy between Ly-CoV555 and the spike protein on going from the WT to the Delta variant (−31.4 to −25.1 kcal/mol, respectively) (Table ).
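The charge bookkeeping described above can be checked with a short sketch. It assumes formal side-chain charges at roughly neutral pH (Asp/Glu = −1, Lys/Arg = +1, histidine and all other residues treated as neutral). The function names are illustrative and are not part of the authors' workflow.

```python
# Formal side-chain charges at roughly neutral pH.
# Assumption: histidine and every residue not listed here count as neutral.
RESIDUE_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def charge_shift(mutation):
    """Charge change for a substitution written like 'L452R'
    (original residue, position, new residue)."""
    original, new = mutation[0], mutation[-1]
    return RESIDUE_CHARGE.get(new, 0) - RESIDUE_CHARGE.get(original, 0)


def total_charge_shift(mutations):
    """Sum of the charge changes over a list of substitutions."""
    return sum(charge_shift(m) for m in mutations)


delta_rbd_mutations = ["L452R", "T478K"]
print(total_charge_shift(delta_rbd_mutations))  # 2
```

Under these assumptions the two Delta RBD substitutions add two positive charges in total, in line with the shift from +2 to +4 stated in the text.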
Figure 4
(A) Overall structure of the spike protein and
Ly-CoV555 complex.
(B) Interface of the L452 to R452 (in yellow ball and stick representation)
mutation with Ly-CoV555, with nearby residues I54 and L55 clashing
with the polar and charged arginine residue.
Construction and Validation of the MLP Model of Spike/Antibody Interactions
Whereas our previous models primarily used either the ΔGPB or the deconstructed van der Waals, electrostatic, and solvation energies to create a linear regression to predict the binding affinity, creating a generalized model for numerous classes of antibodies presents an additional challenge. The differences in spike/antibody charges and binding modes cause each class of antibodies (i.e., COVOX and CC12.x, each of which contains three or more examples available for linear regression; both have high sequence similarity and were published contemporaneously) to be locally comparable with one another via linear regression analysis (Figure S14 and Table S2). However, when these two classes of antibodies are combined, the ability to predict the binding affinity worsens significantly, with an R² value of only 0.001 and an RMSE of nearly 1 kcal/mol (Figure S13), far below the performance of our previous models, which can achieve up to R² = 0.924.[28,29,36] Given these differences between the antibodies, a more complex model that can take the structural differences into account is required to accurately predict the binding affinity of each antibody to the spike protein and its variants.
Using the obtained interaction energy values and the additional structural information variables (Figure ) from the spike/antibody structures and their modified, mutated versions, we proceeded to train a four-layer perceptron (MLP) neural network machine learning model, using the Kd converted to kcal/mol as the response variable. Due to the limited number of available experimental data points for spike/antibody binding affinities, we decided to implement a methodology that would allow for the greatest utilization of the data. By partitioning the whole data set of spike/antibody binding affinity data into 67% training and 33% validation partitions (each data point labeled as either 0 or 1, respectively, within JMP 16.0 software;[48] Table S1), we effectively created three data sets (i.e., K-folds 1, 2, and 3) to train three separate models, each having a different combination of data points included in the training set. Additionally, this ensures that each data point is within the validation set once over the course of training the three neural network models, which alleviates the overtraining commonly seen within supervised learning tasks for neural network models.[59] For each of the training partitions of the data, the models have both high accuracy and precision when compared to the experimentally derived values, with an average RMSE of 0.16 kcal/mol across all three K-folds (Figure 5). Additionally, the R² values for each K-fold range from 0.962 to 1.0 for the training sets, which is comparable to our previous linear regression-based prediction models of spike/ACE2 and GPCR/ligand binding affinities.[28,44,46,60] Additional covariance analysis of the 11 features used, in comparison with the ΔGbind values used to train the model, shows that the electrostatic (EDW), LRE, and charges of the spike/antibody proteins show both high correlation and covariance with the ΔG value (Tables S3 and S4). This correlation is expected, as each of these variables is either a direct force-field-based calculation of the interactions between the spike protein and the antibody or is used to calculate said force-field-based features (i.e., the charges used to calculate the Coulombic potential). Although each of these parameters on its own does not have a strong correlation with the ΔGbind, their use within the neural network creates a model that can accurately predict these values.
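The three-fold labeling scheme described above (0 = training, 1 = validation, each data point held out once) can be sketched as follows. This is a minimal standard-library illustration, not the JMP procedure the authors used. The data-set size of 60 and the random seed are placeholders, and `make_fold_labels` is a hypothetical helper name.

```python
import random


def make_fold_labels(n_points, n_folds=3, seed=0):
    """Return one label list per fold: 0 = training, 1 = validation.

    Every data point is assigned to the validation partition of exactly
    one fold, so each fold trains on roughly (n_folds - 1) / n_folds of
    the data and validates on the rest.
    """
    order = list(range(n_points))
    random.Random(seed).shuffle(order)
    # held_out_in[point] is the fold in which that point is validated
    held_out_in = {point: rank % n_folds for rank, point in enumerate(order)}
    return [
        [1 if held_out_in[point] == fold else 0 for point in range(n_points)]
        for fold in range(n_folds)
    ]


labels = make_fold_labels(n_points=60)  # 60 is a placeholder data-set size
for fold, fold_labels in enumerate(labels, start=1):
    n_validation = sum(fold_labels)
    n_training = len(fold_labels) - n_validation
    print(f"fold {fold}: {n_training} training, {n_validation} validation")
```

With three folds this reproduces the 67%/33% split, and summing the labels of any data point across the three folds always gives 1.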
Figure 5
Predicted binding free energy (kcal/mol) vs experimental
binding free energy (kcal/mol) of antibody proteins and SARS-CoV-2
spike proteins listed in Table utilizing the neural networks trained on fold 1, 2 and 3
training and validation partitions of the spike/antibody binding free
energy data. RMSE and R2 values for each
K-fold’s training (0) and validation (1) partitions are displayed.
Once the ability of the models to accurately predict each K-fold’s training partition had been confirmed (a correlation matrix of the ΔGprediction against the descriptors used is available in Table S5), we then proceeded to test their generalizability by introducing the models to their respective validation partitions of the data. These predictions also had low RMSE values (0.40 kcal/mol on average), and averaging the K-folds’ predictions further improved this metric, decreasing the RMSE of our final prediction to 0.38 kcal/mol (Table ). While fold 2 does appear to be over-trained on the training partition, with a 0.0 kcal/mol training RMSE, the model’s generalizability is not impacted: the fold’s validation set has an RMSE (0.35 kcal/mol) similar to the average K-fold validation RMSE (0.40 kcal/mol). Importantly, the K-folds’ models displayed an ability to predict when the spike protein’s binding affinity with a certain antibody would be dramatically decreased into the micromolar range, which will serve as a warning when utilizing this model with three antibody/variant combinations.
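The two metrics used in this validation step, the RMSE and the point-by-point average of the fold models' predictions, can be sketched as below. The ΔG pairs are the experimental and final predicted values for eight B.1.617.1 rows of the table above (COVOX-253 through REGN10987). The per-fold numbers in `toy_folds` are placeholders for illustration only and are not values from the paper.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences."""
    squared = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared) / len(squared))


def average_folds(per_fold_predictions):
    """Average the predictions of several fold models, point by point."""
    n_folds = len(per_fold_predictions)
    return [sum(values) / n_folds for values in zip(*per_fold_predictions)]


# Experimental and final predicted binding free energies (kcal/mol)
dg_exp = [-11.97, -12.11, -10.51, -7.95, -7.41, -13.01, -13.57, -12.84]
dg_pred = [-11.90, -12.22, -10.67, -7.67, -7.54, -12.49, -13.55, -12.58]
print(round(rmse(dg_pred, dg_exp), 2))

# Placeholder per-fold outputs for two data points (NOT from the paper)
toy_folds = [[-12.0, -8.0], [-11.6, -7.7], [-11.8, -7.4]]
print(average_folds(toy_folds))
```

For this eight-row subset the RMSE comes out at about 0.24 kcal/mol; the 0.38 kcal/mol reported in the text refers to the full data set.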
Prediction of Binding Affinity of Concerned Spike Variants with Various Antibodies via the MLP Neural Network
With the establishment and successful validation of our three K-folds’ neural networks, attention was then turned to the spike VOCs and how their mutations would affect a wide array of antibodies, especially the antibodies that are currently being used to treat SARS-CoV-2 infections. By utilizing the same computational methodology to generate the parameters needed (e.g., MM-PBSA, LRE interactions, surface area, etc.), each variant spike/antibody complex’s binding free energy was predicted using each K-fold’s neural network, and the predictions were subsequently averaged to reduce the influence of any one K-fold model’s biases and reduce the overall RMSE of the predictions. The predicted binding affinity data are summarized in Table 2. As seen in Table 2, experimental Kd values have been reported in the literature for multiple antibodies with spike variants B.1.617.1 (Kappa) and B.1.617.2 (Delta). The predicted Kd values are all close to the corresponding experimental data, suggesting that the predictions provided in Table 2 are reasonable.
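Because the neural networks are trained on ΔG while the predictions are reported as Kd in nM, a conversion in both directions is needed. A minimal sketch, assuming the standard relation ΔG = RT ln(Kd) with Kd in mol/L and a temperature of 300 K: the temperature is an assumption, chosen because it reproduces the tabulated ΔG values, and the function names are illustrative.

```python
import math

R_KCAL = 1.987204e-3   # gas constant in kcal/(mol*K)
TEMPERATURE_K = 300.0  # assumed; reproduces the tabulated conversions


def kd_nm_to_dg(kd_nm, temperature=TEMPERATURE_K):
    """Binding free energy (kcal/mol) from a dissociation constant in nM."""
    return R_KCAL * temperature * math.log(kd_nm * 1e-9)


def dg_to_kd_nm(dg, temperature=TEMPERATURE_K):
    """Dissociation constant in nM from a binding free energy (kcal/mol)."""
    return math.exp(dg / (R_KCAL * temperature)) * 1e9


# Ly-CoV555 with the Delta variant: experimental Kd of 31 nM
print(round(kd_nm_to_dg(31.0), 2))
# The averaged predicted free energy for the same pair is -9.58 kcal/mol
print(round(dg_to_kd_nm(-9.58)))
```

Under these assumptions, 31 nM corresponds to about −10.31 kcal/mol, and a predicted ΔG of −9.58 kcal/mol corresponds to roughly 105 nM, matching the Ly-CoV555/Delta values in the tables.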
Table 2
Predicted Kd (nM) Values of Antibodies with Multiple SARS-CoV-2 Spike Protein VOCs Using Our Neural Network Models (a)
antibody | Kd (b,c) | ΔGNN | AY.1 | Beta (B.1.351) | Epsilon (B.1.427/B.1.429/B.1.526.1) (d) | Eta (B.1.525/P.2) (e) | Iota (B.1.526) | Kappa (B.1.617.1) | Delta (B.1.617.2) | Lambda (C.37) | Gamma (P.1) | Omicron (B.1.1.529) | Delta/Omicron hybrid (f)
clinical antibodies
AZD1061 | 3.9 | 1.7 | 55 | 55 | 27 | 2.8 | 3.1 | 3.2 (6.7) | 15 (6.9) | 26 | 2.1 | 3.8 | 22
AZD8895 | 1 | 2 | 0.4 | 0.4 | 0.7 | 3 | 20 | 6 (3.9) | 1 (1.5) | 0.5 | 1 | 50 | 50
LY-CoV488 | 54 | 93 | 13 | 13 | 72 | 3.7 | 48 | 4.1 | 30 | 43 | 25 | 37 | 43
LY-CoV555 | 1.45 | 0.74 | 1.38 | 1.38 | 1.21 | 107 | 44.6 | 46.2 | 105 (31) | 1.02 | 2.18 | 288 | 368
REGN10933 | 0.38 | 0.4 | 0.93 | 0.93 | 0.33 | 0.11 | 0.11 | 0.11 | 0.41 | 1 | 1.9 | 0.31 | 0.83
REGN10987 | 0.81 | 0.64 | 7 | 7 | 4.5 | 15 | 15 | 13 | 16 | 0.61 | 6 | 2.1 | 5
non-clinical antibodies
B38 | 70.1 | 81 | 140 | 140 | 130 | 210 | 200 | 220 | 170 | 24 | 150 | 15 | 15
C1A-B12 | 4.22 | 1.35 | 17.4 | 17.4 | 7.88 | 17.8 | 14.3 | 15.2 | 8.5 | 6.32 | 2460 | 8.13 | 11.9
C1A-B3 | 76.3 | 56.7 | 16 | 16 | 6.93 | 10.8 | 12.9 | 9.86 | 6.75 | 14.1 | 6.3 | 3.26 | 4.55
C1A-C2 | 14.1 | 28.3 | 0.357 | 0.357 | 4.54 | 1.24 | 1.46 | 2.14 | 2.29 | 12.6 | 0.194 | 1.21 | 1.23
C1A-F10 | 55.7 | 46.2 | 2.94 | 2.94 | 4.71 | 9.05 | 5.61 | 4.93 | 1.86 | 47.1 | 39.5 | 2.79 | 2.37
CC12.1 | 17 | 38 | 0.44 | 0.44 | 0.47 | 1.1 | 0.71 | 0.69 | 0.82 | 0.23 | 0.3 | 43 | 43
CC12.3 | 14 | 15 | 21 | 21 | 5.1 | 6.2 | 6.5 | 6.3 | 5.9 | 1.2 | 99 | 15 | 15
COVA2-04 | 40 | 50 | 0.9 | 0.9 | 20 | 1 | 0.6 | 2 | 2 | 20 | 0.8 | 0.8 | 0.8
COVA2-39 | 21 | 110 | 7.5 | 7.5 | 6.7 | 37 | 24 | 39 | 34 | 17 | 89 | 530 | 530
COVOX-150 | 0.57 | 0.62 | 0.099 | 0.099 | 0.12 | 0.39 | 0.33 | 15 (0.77) | 0.20 (0.77) | 2 | 0.85 | 0.86 | 0.86
COVOX-158 | 1.4 | 2.5 | 5 | 5 | 2 | 1.6 | 1.4 | 2.1 (1.4) | 2.4 (2.6) | 1.3 | 13 | 0.78 | 0.97
COVOX-253 | 0.51 | 0.57 | 2.3 | 2.3 | 1.1 | 1.4 | 1.8 | 2.1 (1.9) | 0.90 (0.96) | 0.69 | 0.78 | 2.1 | 2.1
COVOX-269 | 0.52 | 0.51 | 4.3 | 4.3 | 0.6 | 1.9 | 1.5 | 1.3 (1.5) | 0.86 (0.73) | 0.24 | 3.4 | 3.5 | 2.6
COVOX-278 | 1.6 | 4.3 | 3 | 3 | 3.7 | 2.3 | 1.8 | 17 (22) | 6.7 (6.4) | 1.8 | 3.2 | 4 | 4.3
COVOX-316 | 0.38 | 0.24 | 650 | 650 | 1200 | 1500 | 2100 | 2600 (1623) | 0.82 (0.78) | 39 | 1500 | 470 | 1600
COVOX-384 | 0.65 | 0.71 | 220 | 220 | 380 | 3200 | 1900 | 3200 (4000) | 4.7 (5.6) | 7 | 2400 | 420 | 250
COVOX-40 | 0.34 | 0.15 | 1.5 | 1.5 | 2.7 | 4.9 | 2.5 | 0.80 (0.33) | 2.9 (0.47) | 1.5 | 6.8 | 3.1 | 3.1
COVOX-88 | 4.4 | 0.83 | 230 | 230 | 10 | 20 | 19 | 0.39 (0.13) | 0.79 (5.8) | 17 | 66 | 11 | 15
CR3022 | 68 | 59 | 80 | 80 | 68 | 16 | 3 | 15 | 33 | 70 | 30 | 29 | 15
(a) Known variant experimental Kd values are set within parentheses. Ly-CoV555 shows decreased affinity with B.1.617.1 and B.1.617.2 in confirmation with in vitro data.[37]
(b) Experimental Kd obtained by Liu et al. (2021).[37]
(c) Experimental Kd obtained by Liu et al. (2021).[68]
(d) RBD region of these noted strains contain the same mutations within their respective columns.
(e) Neural network prediction of the wild-type strain.
(f) Delta/Omicron hybrid mutations obtained by Colson et al. (2022).[35]
Upon analysis of the screened antibodies versus the numerous SARS-CoV-2 variants’ spike proteins, a pattern emerged: many antibodies are unable to retain their affinity across the entire spectrum of variants. This is especially true of Ly-CoV555, which loses significant affinity toward B.1.617.1, B.1.617.2, and B.1.1.529.[13,37,67] Additionally, due to the high similarity between the Omicron and “Deltamicron” RBD regions, the same antibodies that showed susceptibility to the Omicron variant show similar binding affinities toward the hybrid, including Ly-CoV555 (Table 2). However, the positioning of the antibody in relation to the spike protein has significant effects on its susceptibility to these variants. The antibody CR3022 (RCSB: 6YLA) has a unique positioning against the spike protein: it binds in proximity to the β-sheets of the RBD region of the spike protein rather than at the ACE2 binding interface, where most other antibodies bind (Figure 6).[64] This confers on CR3022 significant resistance to the mutations that are common in the VOCs. This is reflected in the data, where CR3022’s binding affinity has little variability when compared to that of the RBD binding antibodies and has no predicted knockouts among the variants tested. This unique positioning and high affinity to the spike protein could potentially be exploited in a dual-antibody treatment, wherein CR3022 is used as a secondary binder to the spike protein in combination with an antibody tailored to the SARS-CoV-2 VOC (e.g., B.1.617.2).
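The contrast between Ly-CoV555 and CR3022 can be made concrete by expressing each predicted variant Kd as a fold change relative to the wild-type Kd. The sketch below uses the Kd values (nM) listed for these two antibodies in Table 2. The conversion of a fold change into a free-energy difference, ΔΔG = RT ln(fold), assumes a temperature of 300 K, which is not stated in this section, and the helper names are illustrative.

```python
import math

RT_KCAL = 1.987204e-3 * 300.0  # kcal/mol, assuming 300 K

# Kd values in nM: wild-type experimental Kd and predicted variant Kd
KD_NM = {
    "LY-CoV555": {"WT": 1.45, "Kappa": 46.2, "Delta": 105.0, "Omicron": 288.0},
    "CR3022": {"WT": 68.0, "Kappa": 15.0, "Delta": 33.0, "Omicron": 29.0},
}


def fold_change(variant_kd, wt_kd):
    """Ratio of variant Kd to wild-type Kd (>1 means weaker binding)."""
    return variant_kd / wt_kd


def ddg_from_fold(fold):
    """Free-energy difference (kcal/mol) equivalent to a Kd fold change."""
    return RT_KCAL * math.log(fold)


for antibody, row in KD_NM.items():
    for variant in ("Kappa", "Delta", "Omicron"):
        fold = fold_change(row[variant], row["WT"])
        print(f"{antibody:10s} {variant:8s} "
              f"fold = {fold:7.2f}  ddG = {ddg_from_fold(fold):+.2f} kcal/mol")
```

For Ly-CoV555 with Omicron the ratio 288/1.45 is close to 200, whereas every CR3022 ratio in this subset is below 1, i.e., no loss of predicted affinity.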
Figure 6
(A) Overall
binding mode of CR3022 with the WT spike protein; important
to note is the unique positioning of the antibody in the spike RBD
region. Unlike Ly-CoV555 (B) and many other antibodies (Figure S1), CR3022 binds to a separate motif
of the spike protein, separate from the commonly used RBM region utilized
by the ACE2. This difference in positioning (C) leads to additional
resistance against the spike protein mutations (Table ), which are tailored to increase binding
affinity with ACE2 at the RBM.
Conversely, when looking at the trends from a variant-centric perspective, the Kappa, Delta, and Gamma variants all show substantial changes in their average binding affinity, from 17 nM for the WT Kd values to 560, 240, and 260 nM, respectively. These large increases in Kd over a large class of antibodies can be tied to these variants’ mutations within the RBD; although these three variants do not contain the same mutation, each contains a mutation that alters the overall charge balance of the spike protein (i.e., E484 for B.1.617.1 and P.1, and T478K for B.1.617.2). This change in charge balance will affect the more positively charged antibodies (e.g., COVOX-316 and COVOX-384) and decrease the binding affinity between the spike protein and antibody.
Conclusions
We have developed and validated
an intermolecular interaction modeling-based MLP neural network that
is able to replicate the experimentally determined in vitro binding affinities of a wide set of known antibodies toward the
wild-type SARS-CoV-2 spike protein and various measurements of these
same antibodies with both the Kappa and Delta variants of the spike
protein. This neural network utilizes non-proprietary in silico-derived parameters, which can be obtained using the publicly available
structures of these spike/antibody complexes. Using this model, we
have predicted the binding affinities across multiple VOCs for each
of these antibodies, including those from Eli Lilly and Regeneron,
which are currently being used to treat SARS-CoV-2. In the case of
the Ly-CoV555 antibody, the L452R and T478K mutations appear to be
deleterious to the antibody’s binding to the protein, through
a combination of the disruption of the charge/charge interactions
between the antibody and spike protein and steric hindrance induced
by the L452R mutation. Our model was also able to predict the deleterious effects of the B.1.1.529 (Omicron) variant’s mutations on Ly-CoV555’s efficacy, with a predicted 200-fold decrease in binding affinity; this prediction has been borne out in clinical studies where Ly-CoV555 has little to no effect in treating the Omicron variant.[18,67] These effects carry over to the newly discovered “Deltamicron” variant, whose RBD region shares significant similarity with that of the Omicron variant. Additionally, this method has revealed that the antibody CR3022, which binds peripherally to the RBD region of the protein, is resistant to mutations in the RBD region; this resistance could be employed
within a multi-antibody therapeutic to circumvent the loss of affinity
seen with many other antibodies. This neural network methodology provides
a quick and computationally inexpensive way to not only test currently
existing antibodies against new SARS-CoV-2 variants but also to allow
for the quick in silico screening of newly designed
antibodies before their costly production and in vitro testing.
In addition, our machine learning approach for fast protein–protein binding affinity prediction is based on modeling of intermolecular interactions between proteins. Compared to commonly used machine learning models, which often use hundreds or even thousands of molecular descriptors as input nodes, an intermolecular interaction modeling-based machine learning model needs only a few intermolecular interaction descriptors as input nodes (only 11 in this study). A similar computational strategy and approach may be useful in the development of machine learning models predicting binding affinities for other protein–protein binding systems, including other antibodies binding with their antigens. Further iterations of this model may focus on including additional mutation Kd values as they are reported, along with implementing other machine learning methods (e.g., AdaBoost, gradient boosting, etc.), which can also produce models that show strong correlation with the empirically obtained mutational data (Figure S16). Additionally, further feature optimization may be performed by paring unnecessary features (a prospective five-feature model’s performance can be seen in Supporting Information S18).
Authors: Bill R Miller; T Dwight McGee; Jason M Swails; Nadine Homeyer; Holger Gohlke; Adrian E Roitberg Journal: J Chem Theory Comput Date: 2012-08-16 Impact factor: 6.006
Authors: Bryan E Jones; Patricia L Brown-Augsburger; Kizzmekia S Corbett; Kathryn Westendorf; Julian Davies; Thomas P Cujec; Christopher M Wiethoff; Jamie L Blackbourne; Beverly A Heinz; Denisa Foster; Richard E Higgs; Deepa Balasubramaniam; Lingshu Wang; Yi Zhang; Eun Sung Yang; Roza Bidshahri; Lucas Kraft; Yuri Hwang; Stefanie Žentelis; Kevin R Jepson; Rodrigo Goya; Maia A Smith; David W Collins; Samuel J Hinshaw; Sean A Tycho; Davide Pellacani; Ping Xiang; Krithika Muthuraman; Solmaz Sobhanifar; Marissa H Piper; Franz J Triana; Jorg Hendle; Anna Pustilnik; Andrew C Adams; Shawn J Berens; Ralph S Baric; David R Martinez; Robert W Cross; Thomas W Geisbert; Viktoriya Borisevich; Olubukola Abiona; Hayley M Belli; Maren de Vries; Adil Mohamed; Meike Dittmann; Marie I Samanovic; Mark J Mulligan; Jory A Goldsmith; Ching-Lin Hsieh; Nicole V Johnson; Daniel Wrapp; Jason S McLellan; Bryan C Barnhart; Barney S Graham; John R Mascola; Carl L Hansen; Ester Falconer Journal: Sci Transl Med Date: 2021-04-05 Impact factor: 19.319
Authors: Pengfei Wang; Manoj S Nair; Lihong Liu; Sho Iketani; Yang Luo; Yicheng Guo; Maple Wang; Jian Yu; Baoshan Zhang; Peter D Kwong; Barney S Graham; John R Mascola; Jennifer Y Chang; Michael T Yin; Magdalena Sobieszczyk; Christos A Kyratsous; Lawrence Shapiro; Zizhang Sheng; Yaoxing Huang; David D Ho Journal: Nature Date: 2021-03-08 Impact factor: 69.504
Authors: Marek Widera; Alexander Wilhelm; Sebastian Hoehl; Christiane Pallas; Niko Kohmer; Timo Wolf; Holger F Rabenau; Victor M Corman; Christian Drosten; Maria J G T Vehreschild; Udo Goetsch; Rene Gottschalk; Sandra Ciesek Journal: J Infect Dis Date: 2021-07-05 Impact factor: 5.226
Authors: Jamie Lopez Bernal; Nick Andrews; Charlotte Gower; Eileen Gallagher; Ruth Simmons; Simon Thelwall; Julia Stowe; Elise Tessier; Natalie Groves; Gavin Dabrera; Richard Myers; Colin N J Campbell; Gayatri Amirthalingam; Matt Edmunds; Maria Zambon; Kevin E Brown; Susan Hopkins; Meera Chand; Mary Ramsay Journal: N Engl J Med Date: 2021-07-21 Impact factor: 91.245