Jagdish Suresh Patel¹, F. Marty Ytreberg². ¹ Center for Modeling Complex Interactions, University of Idaho, Moscow, Idaho 83844, United States. ² Department of Physics, University of Idaho, Moscow, Idaho 83844, United States.
Abstract
Determination of protein-protein binding affinity values is key to understanding various underlying biological phenomena, such as how missense variations change protein-protein binding. Most existing non-rigorous (fast) and rigorous (slow) methods that rely on all-atom representation of the proteins force the user to choose between speed and accuracy. In an attempt to achieve balance between speed and accuracy, we have combined rigorous umbrella sampling molecular dynamics simulation with a coarse-grained protein model. We predicted the effect of missense variations on binding affinity by selecting three protein-protein systems and comparing results to empirical relative binding affinity values and to non-rigorous modeling approaches. We obtained significant improvement both in our ability to discern stabilizing from destabilizing missense variations and in the correlation between predicted and experimental values compared to non-rigorous approaches. Overall our results suggest that using a rigorous affinity calculation method with coarse-grained protein models could offer fast and reliable predictions of protein-protein binding free energies.
Protein–protein interactions are
at the heart of regulation
for all biological processes in a cell. Missense variations (or mutations)
of the amino acids that make up these proteins play an essential role
by introducing diversity into genomes. These missense variations can
lead to an altered protein affinity and can result in dysfunction
of the protein interaction network.[1] To
understand living organisms, it is thus vital to have a comprehensive
knowledge of how proteins interact under physiological conditions,
that is, to determine their binding affinities and how these affinities
can be modified.[2]

Many techniques
have been successful in determining the Gibbs free
energy change of protein–protein binding due to a missense
variation (i.e., relative affinity, ΔΔG). Experimental biophysical methods can quantitatively measure ΔΔG values for protein interactions, but these methods are
typically costly, laborious, and time-consuming since all mutants
must be expressed and purified.[3] Consequently,
many researchers have developed and utilized computational methods
to predict ΔΔG values. The most promising
in terms of accuracy are rigorous methods based on statistical mechanics
that use molecular dynamics (MD) simulations and are capable of addressing
conformational flexibility and entropic effects; however, these approaches
are computationally highly expensive.[4] By
contrast, non-rigorous, computationally less expensive, methods have
been developed using the static all-atom protein complex structure.
Such methods typically involve the following: (i) empirical energy
scoring function;[5] (ii) potentials derived
using molecular mechanics principles that enumerate the interactions
in physically meaningful terms;[6] (iii)
statistical potentials based on the likelihood of similar interactions
and local conformations occurring in the Protein Data Bank (PDB);[7] (iv) combination of the first three;[1c] and (v) protein–protein docking.[3b,8] Other approaches have also emerged relying on either coarse representation
of the protein (use of Cα or Cβ backbone atoms) to derive
a simple contact map potential[9] or machine
learning on sequence conservation, solvent accessibility, and secondary
structure information to predict ΔΔG values.[10] These approaches are fast
and show some degree of success, but they do not account for the conformational
changes that a missense variation can induce, changes that can relieve
clashes and allow residues to form more favorable interactions.[1c] Protein–protein docking has been successful
in identifying the interface region but struggles to correctly predict
ΔΔG values.[3b,8] Some
efforts have also been made to incorporate flexibility into non-rigorous
binding affinity calculations.[6,11] However, such approaches
could model only small deviations in the protein complex, since larger ones would
sacrifice computational speed.

A promising approach to increase
MD simulation speed is to use
coarse-grained (CG) force fields that rely on abstract descriptions
of the biomolecular system including the solvent, yet retain essential
physicochemical information. Several CG models for water and proteins
have emerged over the years, each with their strengths and limitations.
These models have the potential to significantly increase the speed
of a molecular simulation with a cost to biochemical accuracy as compared
to atomistic force fields.[2,12] CG models came into
existence mainly with the purpose of modeling the structure and the
dynamics of the biomolecular systems.[12b] However, in recent years CG simulations have been employed in combination
with enhanced sampling/biasing methods to obtain single-dimensional
and multidimensional projections of the free energy landscape of association
and dissociation processes of biomolecular assembly systems[13] (also see review by Baaden and Marrink[2] on this topic). CG modeling can enable researchers
to extend the time scale of the simulation and increase the phase
space exploration allowing the study of rare events and large-scale
motions of the biomolecules at less computational expense compared
to all-atom.[2,12b,14]

In this work, we investigate the performance of a strategy
combining
a SIRAH CG protein model[15] with rigorous
umbrella sampling[16] molecular dynamics
to predict the ΔΔG values of single amino
acid missense variations. We are interested in predicting the effect
of multiple missense variations and thus have developed a semi-automated
strategy with default values for input simulation parameters that
avoids fine-tuning each parameter to individual complex systems. To
investigate whether this strategy has a good trade-off between speed
and accuracy, we chose three protein–protein test systems with
empirical ΔΔG values for observed missense
variations. For each test system we selected eight different missense
variations occurring at the different sites with varying empirical
ΔΔG values. We calculated ΔΔG values for each missense variation using fast umbrella
sampling simulations (i.e., short simulation time with similar input
parameters) and compared the results with empirical ΔΔG values and with two non-rigorous approaches.[11a,17] We obtained significant improvement in the correlation between predicted
and experimental ΔΔG values compared
to faster approaches. Moreover, our strategy predicted the sign of
ΔΔG values correctly at a much higher
rate compared to the other tested methods. To our knowledge, there
is only one study by May et al.[13a] that
has previously applied a strategy of using Martini CG models[18] combined with restrained simulations to estimate
the effect of single missense variations on protein–protein binding
affinities. In their study, absolute affinity values were calculated
and found to be in reasonable agreement with those from atomistic
simulation and correlated well with evolutionary likelihood. However,
their predictions lacked experimental validation, and the study did
not investigate the performance of this strategy in computing ΔΔG compared to other methods or in predicting the sign of
ΔΔG. Combining CG models with enhanced
sampling techniques is slowly gaining traction for calculating the
free energy of various physiological processes. However, previous
studies have mainly employed the Martini CG model or a highly coarse
Go̅-like model and lacked a systematic evaluation of these models
in predicting the effect of missense variations.
Methods
Test Systems
To provide a test of the speed and accuracy
of predicting relative binding free energy differences (ΔΔG), we selected three different protein–protein complexes
(see Figure 1) from
the SKEMPI database:[19] (i) the complex between
human leukocyte elastase (218 aa) and the third domain of the turkey
ovomucoid inhibitor (56 aa) (PDB ID 1PPF);[20] (ii) Barnase
(110 aa)–Barstar (89 aa) complex (PDB ID 1BRS);[21] (iii) an antigen–antibody complex of the lysozyme
(129 aa)–HY/HEL-10 FAB (429 aa) (PDB ID 3HFM).[22] We chose eight missense variations for each of the three
protein complex systems. The choice of these missense variations was
driven by several factors: (i) the reported experimental ΔΔG values varied
in sign, which is important since negative, stabilizing values are often harder
to predict than positive, destabilizing values; (ii) there were non-alanine-scanning
point missense variations at differing sites; (iii) the structures
in the PDB were not missing a large number of residues; (iv) there
was a range in the size of the chosen protein complexes; and (v) missense
variations were reported on one chain (1PPF), on both chains (1BRS), and on multiple
chains (3HFM) (see Figure 1).
Figure 1
Three-dimensional structures of test protein–protein complexes.
System names (PDB IDs) are given above each panel (1PPF, 1BRS, and 3HFM). Each protein pair
is colored in orange and green. The red spheres along the interface
of the protein complex indicate the sites of the single missense variations
chosen for the present study.
Preparation of the Wild-Type and Mutant Complexes
Each
test complex was prepared in an identical manner using the following
steps: (i) experimental structures were downloaded from the PDB Web
site (http://www.rcsb.org/pdb/home/home.do); (ii) the structure files were edited to remove all but the two
interacting chains listed in the SKEMPI database;[19] (iii) all missing residues or atoms in the PDB files were
added using MODELLER v9.15;[23] and (iv) mutant
complexes were generated using the Dunbrack rotamer library[24] in UCSF Chimera.[25]
Coarse-Grained Molecular Dynamics Simulations
All MD
simulations were carried out using GROMACS v5.1.2.[26] Biasing potentials necessary to carry out umbrella sampling[16] (US) with restraints were introduced via the
PLUMED v2.2 plugin[27] integrated in the
GROMACS code.

Coarse-grained simulations were performed using
the SIRAH force field[15] (http://www.sirahff.com) for all
three systems. The SIRAH CG force field aims to address some common limitations
of CG force fields, such as the use of a uniform dielectric constant,
a lack of long-range interactions, the use of topological information to
maintain the secondary structure, and implicit or absent ionic strength
effects.[15] In contrast to the “four
heavy atoms to one CG bead” rule used by the popular Martini
force field,[18,28] the SIRAH CG force field treats the
peptide bonds with a relatively high degree of detail, preserving
the positions of the nitrogen (N), α-carbon (Cα), and
oxygen (O), while side chains are modeled more coarsely. The WT4 water
model[29] included in the SIRAH CG force field
is formed by four linked beads, each carrying a partial charge, thus
allowing it to generate its own dielectric permittivity. Moreover,
the CG electrolytes are capable of mimicking the ionic strength effects
and osmotic pressure. This residue-based CG model provides all the
interactions within a classical Hamiltonian, which is commonly found
in most MD simulation packages.[15]

We followed the protocol reported in Darré et al.[15] for carrying out CG simulations. Coordinate
mapping and analysis were performed with SIRAH tools.[30] Prior to mapping to the SIRAH CG model, protonation states
were assigned assuming neutral pH using the PDB2PQR server[31] and choosing the AMBER[32] naming scheme for the output. Following the all-atom to CG conversion,
the protein complexes were placed in a dodecahedral box of SIRAH WT4 water and neutralized by adding Na+ and Cl– ions at a concentration of 0.15 mol/L. The thickness of
the water layer was kept at 4 nm, resulting in the following number
of WT4 molecules added to each protein complex: 1PPF, 5472; 1BRS, 5687; 3HFM, 10943. The large
box size was chosen to make sure that when the two proteins are at
the maximum separation distance, they do not interact with each other
via their periodic images. Each system was then minimized using the
steepest descent algorithm for 10,000 steps. To allow for equilibration of the
water around the protein complex, each system was then simulated for
1 ns with the positions of all CG atoms in the complex harmonically
restrained. During the restrained simulations, the temperature of
the systems was set to 300 K and the pressure to 1 atm using respectively
the V-rescale thermostat[33] and Parrinello–Rahman
barostat[34] with isotropic pressure coupling.
Unrestrained simulations were then carried out for 2 ns. All the simulations
used a time step of 20 fs and updated neighbor lists every 10 steps.
Electrostatic interactions were calculated using particle mesh Ewald[35] with a direct cutoff of 1.2 nm and a grid spacing
of 0.2 nm, and a 1.2 nm cutoff was used for van der Waals (vdW) interactions.
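As an illustration of how the run settings quoted above map onto simulation input, the following Python sketch collects them as GROMACS .mdp options. The helper names (`production_mdp`, `render_mdp`) are hypothetical, and the option list is deliberately incomplete: coupling groups, coupling time constants, compressibility, and restraint settings are omitted because the text does not give them. It is a sketch under these assumptions, not the authors' input file.

```python
# Illustrative only: maps the run parameters quoted in the text onto
# GROMACS .mdp option names. A real run needs further options
# (e.g., tc-grps, tau_t, tau_p, compressibility) that are not given here.

def production_mdp(sim_time_ns: float = 2.0, dt_ps: float = 0.02) -> dict:
    """Return .mdp key/value pairs for an unrestrained CG run."""
    nsteps = round(sim_time_ns * 1000.0 / dt_ps)  # ns -> ps, then ps -> steps
    return {
        "dt": dt_ps,                    # 20 fs time step
        "nsteps": nsteps,               # 2 ns at 20 fs = 100,000 steps
        "nstlist": 10,                  # neighbor list update every 10 steps
        "coulombtype": "PME",           # particle mesh Ewald electrostatics
        "rcoulomb": 1.2,                # direct-space cutoff (nm)
        "fourierspacing": 0.2,          # PME grid spacing (nm)
        "rvdw": 1.2,                    # van der Waals cutoff (nm)
        "tcoupl": "V-rescale",          # thermostat
        "ref_t": 300,                   # temperature (K)
        "pcoupl": "Parrinello-Rahman",  # barostat
        "pcoupltype": "isotropic",
        "ref_p": 1.01325,               # 1 atm expressed in bar
    }


def render_mdp(options: dict) -> str:
    """Format the options as 'key = value' lines."""
    return "\n".join(f"{key:<16}= {value}" for key, value in options.items())


if __name__ == "__main__":
    print(render_mdp(production_mdp()))
```

Calling `production_mdp(1.0)` gives the step count for the 1 ns restrained equilibration at the same time step.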
Coarse-Grained-Umbrella Sampling Simulations
To calculate
the potential of mean force (PMF) for the wild-type and mutant protein
complexes, we chose the widely used umbrella sampling (US) method.[16] Since we were interested in predicting the effects
of missense variations on protein–protein binding affinity,
we selected interprotein separation (i.e., distance) as the reaction
coordinate (RC; i.e., pulling variable). To avoid any distortion
of the proteins as a consequence of applying an external harmonic
potential, this distance was defined between the centers of mass of
all the coarse-grained atoms of each protein in the complex (see Supporting Information (SI) Figure S1). Suitable
spring constants for each complex (1PPF, 500 kJ/mol/nm²; 1BRS, 2000 kJ/mol/nm²; 3HFM, 1500 kJ/mol/nm²) were chosen by test simulations performed
on wild-type complexes to be strong enough to separate the two proteins
in a short amount of time without affecting the overall structure
of the proteins. The maximum pulling length was chosen to be 1.7 nm,
which ensured the complete solvation of both proteins in the complex
in their unbound states (see SI Figure S2). The unbinding pathway was chosen to be a vector joining the centers
of mass of the two proteins. To prevent the systems from drifting,
a weak harmonic restraint with a force constant of 20 kJ/mol/nm² was added to all the CG atoms of the larger protein in the
case of the 1PPF and 1BRS complexes
and to the antibody in the 3HFM antigen–antibody complex. The RC for the US
simulation of each complex was discretized into 35 windows with a
spacing of 0.05 nm, adopted from May et al.,[13a] ensuring sufficient overlap of the probability
distributions of adjacent windows. The simulation length for each window
was 8 ns (coarse-grained time scale). The time scale for the US simulations
was intentionally kept short to match the time scales of the non-rigorous
approaches used in this study and to stay within the same order of
magnitude of CPU hours (CPUh) as the protein–protein
simulations using the Martini coarse-grained model in May et al.[13a] To improve the convergence of the PMF, we used
a cylindrical harmonic restraint to prevent the
protein being pulled out of the pocket from interacting with the full surface of the
other protein (see SI Figure S1). This
cylindrical restraint was implemented using the distance from the
center of mass of all the CG atoms of the protein being pulled from
an axis between the centers of mass of two groups of atoms of the
other. One of these two groups was the same as that used to define
the RC, and the atoms in the second group are denoted in the Supporting Information. A harmonic restraint
of 500 kJ/mol/nm² on the center of mass of each unbinding
protein was applied when the distance from the axis was 0.3 nm or
larger. This cylindrical restraint was applied only to the US windows
with pulling length greater than 1.5 nm, i.e., only in the unbound
state. The effect of this cylindrical restraint was not factored into
the calculation of relative binding free energy differences (ΔΔG) since the same restraint was applied to all protein complexes
in the same fashion.
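The window layout and the cylindrical restraint described above can be sketched in a few lines of Python. Several points are assumptions rather than details from the text: the first window is taken to sit at zero pulling length (which makes 35 windows at 0.05 nm spacing end exactly at the stated 1.7 nm), the restraint is written as a flat-bottom wall with the conventional 1/2 prefactor, and each of the four trials is assumed to repeat all 35 windows. All function and constant names are hypothetical.

```python
# Sketch of the umbrella sampling window layout and the flat-bottom
# cylindrical restraint. Functional form and window origin are assumptions.

N_WINDOWS = 35        # windows along the reaction coordinate
SPACING_NM = 0.05     # spacing between window centers (nm)
WINDOW_NS = 8         # simulation length per window (ns)
N_TRIALS = 4          # independent trials per complex
CYL_K = 500.0         # cylindrical restraint constant (kJ/mol/nm^2)
CYL_RADIUS_NM = 0.3   # restraint acts at or beyond this distance from the axis
CYL_ONSET_NM = 1.5    # restraint only for pulling lengths above this value


def window_centers(n=N_WINDOWS, spacing=SPACING_NM):
    """Pulling lengths (nm) of the window centers, starting from 0."""
    return [round(i * spacing, 2) for i in range(n)]


def cylinder_energy(radial_nm, pull_nm, k=CYL_K, r0=CYL_RADIUS_NM,
                    onset=CYL_ONSET_NM):
    """Restraint energy (kJ/mol); zero inside the cylinder or in bound windows."""
    if pull_nm <= onset or radial_nm < r0:
        return 0.0
    return 0.5 * k * (radial_nm - r0) ** 2


centers = window_centers()
restrained = [c for c in centers if c > CYL_ONSET_NM]
total_ns_per_trial = N_WINDOWS * WINDOW_NS
total_ns_per_complex = total_ns_per_trial * N_TRIALS

if __name__ == "__main__":
    print(f"last window center: {centers[-1]} nm")
    print(f"windows with cylindrical restraint: {restrained}")
    print(f"sampling per trial: {total_ns_per_trial} ns")
    print(f"sampling per complex: {total_ns_per_complex} ns")
```

Under these assumptions only the last four windows carry the cylindrical restraint, and one trial amounts to 35 × 8 ns = 280 ns of coarse-grained simulation time.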
Potential of Mean Force and ΔΔG Calculation
In the US method,[16] a biasing potential is used at a certain position along
the RC (distance
in our case) to enhance the sampling of the regions involved in high
potential barriers. The RC was discretized into 35 windows, and a
harmonic potential, eq 1, was added to the original potential (unbiased potential) in each
window to drive the system from one thermodynamic state (bound) to
another (unbound):

$$V(s) = \frac{k}{2}\,\bigl(s(q) - s_n\bigr)^2 \qquad (1)$$

where s(q) is the
current RC (distance) and V(s) is the biasing potential. Here, s_n is the reference distance
for the nth window, k is the spring
constant, and q are the microscopic coordinates.
We used the weighted histogram analysis method (WHAM)[36] to eliminate the bias from the restrained US simulations
and construct PMFs using 100 bins along the RC.

Binding free
energy (ΔG) for a protein complex was calculated
by taking a difference of free energy in the bound and unbound states.
A cutoff distance of 1.5 nm was chosen to differentiate a bound from
an unbound state as the protein being pulled had no residual contact
beyond this distance (see SI Figure S2):

where Φ_i is the
PMF associated with the ith bin along the
RC. The relative binding free energy difference (ΔΔG) for a missense variation is then calculated using

$$\Delta\Delta G = \Delta G_{\mathrm{mutant}} - \Delta G_{\mathrm{wild\text{-}type}} \qquad (3)$$

The average PMF profiles for each wild-type and mutant complex
for all three chosen systems were calculated by averaging the outcomes
of four independent trials per complex.
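A minimal Python sketch of this post-processing follows. It assumes that ΔG is obtained from a Boltzmann-weighted sum over PMF bins, ΔG = −k_B T ln(Σ_bound e^(−Φ_i/k_B T) / Σ_unbound e^(−Φ_i/k_B T)), which is a common choice but an assumption here rather than the authors' confirmed expression. It also assumes equal-width bins, PMF values in kcal/mol, bins at or below the 1.5 nm cutoff counted as bound, and the usual k/2 prefactor in the harmonic bias. All function names are hypothetical.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def harmonic_bias(s, s_ref, k):
    """Umbrella bias for one window, assuming the usual k/2 prefactor."""
    return 0.5 * k * (s - s_ref) ** 2


def average_pmf(profiles):
    """Bin-wise mean of several PMF profiles defined on the same bins."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def _log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a non-empty list."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def binding_free_energy(rc_nm, pmf, cutoff_nm=1.5, temp_k=300.0):
    """Assumed estimator: -kT ln(sum_bound e^(-PMF/kT) / sum_unbound e^(-PMF/kT))."""
    kt = KB_KCAL_PER_MOL_K * temp_k
    bound = [-p / kt for r, p in zip(rc_nm, pmf) if r <= cutoff_nm]
    unbound = [-p / kt for r, p in zip(rc_nm, pmf) if r > cutoff_nm]
    if not bound or not unbound:
        raise ValueError("need at least one bin on each side of the cutoff")
    return -kt * (_log_sum_exp(bound) - _log_sum_exp(unbound))


def relative_binding_free_energy(dg_mutant, dg_wild_type):
    """Relative binding free energy: mutant minus wild-type."""
    return dg_mutant - dg_wild_type
```

With this sign convention a deeper bound-state well gives a more negative ΔG, so a stabilizing variation yields ΔΔG < 0, in line with the convention used in the text.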
Results and Discussion
The purpose of our study was to assess the ability of combining
the SIRAH CG model and US (CG-US) using short simulation times with
similar input parameters to calculate ΔΔG values for missense variations and to compare the results with both
experimental ΔΔG values and two semiempirical
modeling methods, FoldX[5,17] and MD-FoldX.[11a] Our strategy was applied to three different protein complexes, 1PPF, 1BRS, and 3HFM (see Figure 1), for which experimental ΔΔG values for the reported missense variations are available
in the literature. A total of eight missense variations for each
protein complex were chosen according to the criteria listed in Methods (see Figure 1).

Figure 2 shows the
PMF profiles resulting from CG-US simulations for each missense variation
for all three protein complexes. Each PMF is an average from four
independent simulation trials. The panels indicate our chosen 1.5
nm distance used to distinguish bound versus unbound as seen in eqs 2 and 3. In comparison to wild-type, a stabilizing (ΔΔG < 0) missense variation will typically be indicated
by a PMF profile with larger barrier, and destabilizing missense variations
(ΔΔG > 0) will typically have a lower
barrier. As the interactions between the proteins diminish in the
unbound state, the PMF profiles for most protein complexes approach
a plateau. We acknowledge that these profiles have not yet reached
full convergence due to the use of a short simulation time per US
window (e.g., see 1BRS). Averaged PMF profiles were used to compute the corresponding ΔΔG value for each missense variation. The goal of this work
is to compare our results with the experimental data and not with
atomistic simulation results, but we note that the PMF profile obtained
for wild-type 1BRS complex is similar to what was obtained using an atomistic model.[4c] This is an interesting observation considering
the small simulation time per window and the granularity of the SIRAH
CG model.
Figure 2
PMF profiles (kcal/mol) for the protein complexes as a function
of separation distance (nm). System names (PDB IDs) are given above
each panel (1PPF, 1BRS, and 3HFM). PMF profiles for
wild-type protein complexes are shown using bold black lines, and
mutant protein complex profiles are shown using other colors as denoted
in the legends. The dashed black line at 1.5 nm separates the
bound from the unbound state. Each PMF profile shown above is shifted
so that the average PMF value in the unbound state is 0 kcal/mol.
Figure 3 summarizes
our ability to estimate ΔΔG values for
single missense variations. In the case of 1PPF our strategy clearly outperforms FoldX
and MD-FoldX; CG-US shows a high correlation (R² = 0.88) with the experimental data, and five out of eight
missense variations were predicted with high accuracy, within ±1.0
kcal/mol (see SI Table S1). Although we
obtained a high correlation, the CG-US strategy led to a large error
for the R21P mutant complex. We believe this is because
predicting missense variations to proline is difficult, even for atomistic methods, owing to proline's
uniquely fused side chain. CG-US performed
equally well for 1BRS with R² = 0.92, but in this case FoldX
and MD-FoldX also have high R² values.
This is perhaps not surprising since the FoldX energy function was
trained on a set of protein complexes that included 1BRS.[5] CG-US was able to estimate four out of eight missense variations
with high accuracy (see SI Table S1). However,
we observed significant overestimation of ΔΔG values in the cases of R83Q and D39A missense variations. We believe
this is because the spring constant used for all missense variations
in the case of 1BRS was tuned to the wild-type protein complex, and using the same constant
for R83Q and D39A complexes led to probability distributions that
were not sufficiently overlapped, causing larger errors. For the third
and largest protein complex, 3HFM, CG-US significantly outperforms FoldX
and MD-FoldX but yields a lower correlation with experiment compared
to the other systems, with R² = 0.59. CG-US
still predicted five out of eight missense variations with high accuracy
despite having a low R² value (see SI Table S1). It is important to note that the large
error associated with the calculation of the D101K mutant
substantially lowers the overall R² value (see Figure 3). Experimental data
suggest this mutant has a positive ΔΔG value but interestingly all the approaches here predict it to have
a negative ΔΔG. We assume that the experimental
data are correct, and this error is likely associated with modeling;
however we also note that there are at least two serine residues in
the neighborhood of the mutation site that can interact with the positively
charged lysine.
Figure 3
Experimentally observed ΔΔG compared
to calculated ΔΔG (kcal/mol) for all
three test protein complexes. System names (PDB IDs) are given above
each panel (1PPF, 1BRS, and 3HFM). The three methods
for calculating ΔΔG are (i) using FoldX on the experimental
structure (FoldX); (ii) using FoldX on each of 100 samples taken from
a MD simulation and averaging (MD-FoldX); and (iii) using our strategy
of combining SIRAH CG with US simulation (CG-US). A perfect fit to
experimental data would fall along the gray diagonal line. The solid
(green and blue) and dashed (red) lines show the linear fit
between calculated and experimental values for each method of prediction
(FoldX, MD-FoldX, or CG-US), and the corresponding R² values are given in the inset legends. Error bars shown on
the CG-US data points represent the standard errors.
It is worth noting that our CG-US strategy consistently
outperforms
FoldX and MD-FoldX in predicting the signs of the ΔΔG values even if we consider the associated standard errors
(see Figure 4). We
believe this is an important achievement of the CG-US strategy since
correctly predicting the sign of ΔΔG allows
discrimination between missense variations that enhance or disrupt
binding, e.g., for predicting antibody escape missense variations.
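The three figures of merit used in this comparison (R², the number of predictions within ±1.0 kcal/mol, and the rate of correct sign prediction) can be computed as in the Python sketch below. It assumes that R² is the squared Pearson correlation, which coincides with the R² of a simple linear fit. The function names are hypothetical and the numbers in the usage example are made up for illustration; they are not data from this study.

```python
def pearson_r2(predicted, experimental):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_e = sum(experimental) / n
    sxy = sum((p - mean_p) * (e - mean_e)
              for p, e in zip(predicted, experimental))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    syy = sum((e - mean_e) ** 2 for e in experimental)
    return sxy * sxy / (sxx * syy)


def sign_agreement(predicted, experimental):
    """Fraction of predictions whose sign matches the experimental sign."""
    hits = sum(1 for p, e in zip(predicted, experimental) if p * e > 0)
    return hits / len(experimental)


def count_within(predicted, experimental, tol=1.0):
    """Number of predictions within +/- tol (kcal/mol) of experiment."""
    return sum(1 for p, e in zip(predicted, experimental) if abs(p - e) <= tol)


if __name__ == "__main__":
    # Made-up values for illustration only (kcal/mol).
    expt = [-1.0, 0.5, 2.0, 3.0]
    pred = [-0.5, 1.0, 1.5, 4.5]
    print(pearson_r2(pred, expt))
    print(sign_agreement(pred, expt))
    print(count_within(pred, expt))
```

A prediction of exactly zero is counted as a sign mismatch in this sketch; a different convention could be substituted if needed.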
Figure 4
ΔΔG (kcal/mol) for all three test
protein complexes. System names (PDB IDs) are given above each panel
(1PPF, 1BRS, and 3HFM). Black bars indicate
experimentally observed ΔΔG, whereas
other colored bars indicated in the inset legends represent results
from different methods of prediction (FoldX or MD-FoldX or CG-US).
FoldX, as expected, was the fastest
among the three approaches tested
in this work, requiring ∼0.42 CPUh to complete a single ΔΔG calculation for 3HFM, the largest of the three test protein complexes. For the same calculation, the MD-FoldX
and CG-US approaches consumed ∼4093 and ∼425
CPUh, respectively (see the SI for more
details). It should be emphasized that CG-US is trivially parallelizable
in that each US window can run independently without relying on the
completion of the previous simulation window; thus, the speed of the
calculation depends on the availability of the computational resources.

Our current strategy assumes that conformations in the bound and
unbound states do not significantly change due to missense variation.
When the protein–protein interaction involves induced fit effects,
it is unlikely that our strategy will be directly applicable because
of our use of shorter simulation times and mild restraints used to
prevent the drifting of stable protein. All-atom MD simulations would
be equally unfeasible in this case due to the cost of achieving adequate
conformational sampling.

Given that our interest is in calculating
relative binding affinities,
it should be noted that a more efficient implementation of our approach
would be to use alchemical simulation, i.e., using the well-studied
single- or dual-topology methods.[37] Such
methods have the potential for shorter simulation times and smaller
system sizes. However, these methods also require the generation of
hybrid structures and topologies, significantly increasing the challenge
associated with proper calculation of affinities, and thus will be
investigated in future studies.
Conclusions
In
this article, we have described a computational strategy combining
the SIRAH coarse-grained (CG) force field with rigorous umbrella sampling
(US) simulations using short simulation times with similar input parameters
and tested it to predict the effects of single missense variations
on protein–protein binding affinity. We have shown that our
strategy is capable of delivering more accurate results than two non-rigorous,
semiempirical methods. Moreover,
it predicted the signs of relative binding free energy (ΔΔG) values of the studied missense variations with high accuracy
compared to those of non-rigorous approaches, which is remarkable
given that the simulation times were intentionally kept short to match
the speed of the non-rigorous approaches. With ever-increasing computational
power, this strategy has the potential of becoming a routine tool
to screen the effect of missense variations. In future work, we will
test the generality of these findings by using a larger test set.
In addition, we will test the ability of the CG-US strategy to predict
relative affinity changes due to missense variations far from the
binding interface, and for multiple missense variations.