Vasili Korol1, Peter Husen1, Emil Sjulstok1,2, Claus Nielsen1, Ida Friis1, Anders Frederiksen1, Adrian B Salo1, Ilia A Solov'yov1,3. 1. Department of Physics, Chemistry and Pharmacy, University of Southern Denmark, Odense 5230, Denmark. 2. Neuroscience, University of Texas Southwestern Medical Center at Dallas, Dallas 75390, Texas, United States. 3. Department of Physics, Carl von Ossietzky Universität Oldenburg, Oldenburg 26111, Germany.
Abstract
Various biochemical and biophysical processes, occurring on multiple time and length scales, can nowadays be studied using specialized software packages on supercomputer clusters. The complexity of such simulations often requires application of different methods in a single study and strong computational expertise. We have developed VIKING, a convenient web platform for carrying out multiscale computations on supercomputers. VIKING allows combining methods in standardized workflows, making complex simulations accessible to a broader biochemical and biophysical society.
Various biochemical and biophysical processes, occurring on multiple time and length scales, can nowadays be studied using specialized software packages on supercomputer clusters. The complexity of such simulations often requires application of different methods in a single study and strong computational expertise. We have developed VIKING, a convenient web platform for carrying out multiscale computations on supercomputers. VIKING allows combining methods in standardized workflows, making complex simulations accessible to a broader biochemical and biophysical society.
Computational methods
have in recent decades increasingly been
used to model complex molecular systems and have in particular been
extensively employed in the study of the biophysical and biochemical
processes in living organisms.[1−3] The computational modeling tools
allow researchers to study molecular processes and effects that are
difficult or even impossible to probe experimentally, such as quantum
mechanical effects,[4−6] diffusion of small molecules in various intracellular
environments,[7,8] protein conformational changes,[9−11] and self-assembly of biomembranes.[12]Molecular processes and phenomena occur at different length- and
timescales (see Figure ) and require different modeling methods. Chemical reactions, characterized
by electron transfers and the formation and breaking of bonds, and
processes involving quantum spin states or absorption/emission of
photons all require a quantum mechanical treatment, whereas the dynamic
behavior of larger biomolecules, such as proteins or DNA, are best
treated with a classical molecular dynamics (MD) approach. Even larger
scale phenomena, such as diffusion of macromolecules, self-aggregation
of supramolecular structures, or the kinetics of a network of processes,
require yet other methods, such as coarse-grained particle dynamics[12,13] or Monte Carlo methods.[14,15] Crucially, complex
biomolecular processes, such as enzymatic reactions, often inherently
consist of a system of subprocesses at a range of different scales.
Correspondingly, a range of modeling techniques, therefore, needs
to be applied for a comprehensive treatment.
Figure 1
Processes occurring at
different time and length scales can be
modeled using different methods. (a) Ions penetrating a molecule in
ion beam therapy[5] and (b) absorption spectra
of ligand molecules,[16] which may be crucial
for biological function, can be modeled through computational quantum
chemistry. (c) Diffusion of small molecules in various biomolecular
environments,[17,18] (d) ensembles of possible conformations
of mobile parts of proteins, and (e) larger scale conformational changes
and adhesion to surfaces of proteins[19] can
all be studied through MD simulations. (f) Diffusion of macromolecules,
such as proteins, within a cell[20] can be
modeled using Monte Carlo-based methods.
Processes occurring at
different time and length scales can be
modeled using different methods. (a) Ions penetrating a molecule in
ion beam therapy[5] and (b) absorption spectra
of ligand molecules,[16] which may be crucial
for biological function, can be modeled through computational quantum
chemistry. (c) Diffusion of small molecules in various biomolecular
environments,[17,18] (d) ensembles of possible conformations
of mobile parts of proteins, and (e) larger scale conformational changes
and adhesion to surfaces of proteins[19] can
all be studied through MD simulations. (f) Diffusion of macromolecules,
such as proteins, within a cell[20] can be
modeled using Monte Carlo-based methods.
Results and Discussion
In this paper,
we introduce VIKING, the Scandinavian online kit
for nanoscale modeling, which is tailored to model a broad range of
molecular processes occurring at different scales. Many powerful software
toolkits for computational modeling of molecular systems exist, like
NAMD,[21] Gromacs,[22] AMBER,[23] MBN Explorer,[3,24] and
AutoDock Vina[30] for classical atomistic
study, and Gaussian,[25] GAMESS,[26] Dalton,[27] ORCA,[28] Molcas,[29] and Molspin,
which enable modeling quantum chemical processes. These programs are
generally highly specialized for a particular modeling technique and
scale of modeling, whereas VIKING integrates a number of these tools
in a single easy-to-use multiscale platform that provides tools for
setting up simulations, data analysis, and visualization. VIKING not
only alleviates the need for specialized know-how, which is traditionally
required for each individual modeling technique, but also provides
a standardized workflow, making the elaborate work of integrating
multiple methods in a single study significantly more tractable and
reproducible. Available completely in a regular web browser at https://viking-suite.com, VIKING
is designed to be a powerful tool for both experts in computational
modeling and researchers, who do not usually make use of computational
modeling. This lowers the entry threshold for running multiscale simulations,
which will eventually cause computational methods to be more widely
used in new research areas.Furthermore, VIKING offers a gateway
to an increasing number of
supercomputers and allows researchers to make use of these high-performance
computing (HPC) resources at the click of a button. As the multiscale
simulations often rely on large datasets, spanning up to millions
of atoms, and the most appropriate methods for specific modeling have
a high computational complexity, the link to supercomputers is critical
to enable successful molecular multiscale studies. Users can link
their existing supercomputer accounts with the VIKING interface, allowing
the platform to operate with the computational resource on behalf
of the user. VIKING automatically takes care of data transfer to and
from an HPC resource and manages jobs in the queueing system, thus
encapsulating the intricacies of working with individual supercomputers
and allowing the researcher to focus on the higher level protocol
of a computational study. The computational tasks of a single study
can even be spread across separate supercomputers, whereas the researcher
interacts seamlessly with the molecular structures and simulation
data in the same interface—including extending simulations,
transferring structures between different methods, and analyzing results.The computational tasks that can be solved through VIKING include
MD simulations, various quantum chemistry (QC) calculations, virtual
screening, spin chemistry, and genome studies. These tasks are configured
step by step in the browser, providing a similar interface and workflow
across a diverse set of computational methods. The step-by-step process
is tailored for each class of computations in order to provide just
the necessary configuration options. Different calculation types can
be combined by using the output of one task as the input of another:
for example, the extracted parts of a structure resulting from an
MD simulation can be used to start quantum chemical calculations,
and equilibrated protein structures can be used for virtual screening
simulations. Such seamless interlinking of simulations in VIKING facilitates
comprehensive studies of multiscale processes by creating networks
of tasks using different computational methods. The general workflow
of VIKING is shown in Figure .
Figure 2
Concept and workflow of VIKING. Computational tasks are configured
in the web interface by supplying the input data (structures, potentials,
input field values, etc.), from the local computer or an online database.
The simulation is then performed on a supercomputer (Stampede2, Marconi
and Abacus 2.0 are currently supported), and the results are aggregated
and represented visually in the web browser. Supercomputer photograph
courtesy of iStockphoto LP. Copyright 2012.
Concept and workflow of VIKING. Computational tasks are configured
in the web interface by supplying the input data (structures, potentials,
input field values, etc.), from the local computer or an online database.
The simulation is then performed on a supercomputer (Stampede2, Marconi
and Abacus 2.0 are currently supported), and the results are aggregated
and represented visually in the web browser. Supercomputer photograph
courtesy of iStockphoto LP. Copyright 2012.In order to measure the efficiency of VIKING in
comparison with
alternative approaches to setting up computational tasks, a benchmarking
experiment has been carried out, involving 19 users with different
levels of experience with computer simulations and molecular modeling.The participants of the experiment carried out several simulations
(see verbose description in the Supporting Information) in VIKING, using Abacus 2.0, the Danish national supercomputer
located at the University of Southern Denmark. Each participant measured
the time spent on configuring the simulation in VIKING from scratch,
as well as the time spent on analyzing the results; the time spent
on actually running the simulation was not measured. The recorded
timings were collected and processed anonymously. Additionally, each
respondent reported the self-estimated levels of experience with various
IT-related and scientific disciplines. The recorded times were then
compared with the corresponding times obtained by experts, who attempted
to perform the same computations in a traditional manner without using
VIKING.The list of assignments included several computational
tasks, such
as: equilibrium MD of two different systems, free-energy perturbation
calculations, geometry optimization, electronic properties calculation,
virtual screening, infrared (IR) spectroscopy, Raman spectroscopy
(RS), circular dichroism spectroscopy (CDS) modeling, as well as nuclear
magnetic resonance (NMR) shielding tensor determination and spin dynamics
modeling of radical pairs.In order to obtain the characteristic
timings for the simulations
without VIKING, we asked seven experts in computational biophysics
to perform the same tasks using conventional methods, which required
configuring various software packages (NAMD,[21] VMD,[31] and Gaussian09[25]) and processing the output data manually. The experiment
data (participant expertise estimation and measured times) are presented
in Tables S1–S3 in the Supporting Information. The results of the experiment are presented in Figure . It illustrates that by using
VIKING, novice users, even those without any prior experience with
biophysical simulations, are able to work almost as efficiently as
professionals. Moreover, VIKING also reduces the time needed for the
experts to configure the simulations and analyze the results compared
to conventional methods involving manual usage of specialized software.
Figure 3
Efficiency
of VIKING. Characteristic time needed to set up various
computational tasks and to analyze basic results in VIKING and manually.
Here, 11 representative computations were studied that include MD,
QC, drug docking, and spin chemistry tasks. The times required for
novice users to carry out the computations without VIKING were not
measured, as they can be arbitrarily high and require extensive preknowledge.
The expert (manual) results define the typical minimal possible time
for each task.
Efficiency
of VIKING. Characteristic time needed to set up various
computational tasks and to analyze basic results in VIKING and manually.
Here, 11 representative computations were studied that include MD,
QC, drug docking, and spin chemistry tasks. The times required for
novice users to carry out the computations without VIKING were not
measured, as they can be arbitrarily high and require extensive preknowledge.
The expert (manual) results define the typical minimal possible time
for each task.These results show that VIKING is a promising tool
for the computational
biophysics and biochemistry communities, lowering the barrier to entry
of computational methods significantly. By making the use of computational
methods more widespread, VIKING will lead to crucial opportunities
for interdisciplinary studies, combining experimental and computational
efforts to investigate the hypotheses in biophysics and biochemistry.
At the same time, VIKING makes complex computational studies using
multiple methods both more tractable and more reproducible thanks
to the simplified and standardized workflow, which is independent
of the use of specific supercomputers.
Methods
VIKING supports a number of
different task types based on different
computational methods. Every task is configured using a step-by-step
interface, and the task types can be combined by using the results
of one simulation as the input for another one. At each step, VIKING
runs appropriate software packages, extracts the results, and presents
them to the user. In this section, a description of each computational
task type in VIKING is provided.
Equilibrium MD: General Concepts
MD simulations provide a powerful tool to study biomolecular systems,
with an atomistic resolution. They can be used to investigate the
mechanical and thermal properties of proteins,[32−34] transport events,[7,8] and enzyme reaction mechanisms,[17,18,35,36] to name a few examples.At its core, the MD task implementation in VIKING functions as
a user-friendly interface to the NAMD software package.[21] Setting up an MD simulation only requires the
user to provide a molecular structure, e.g. a protein, and to set
thermodynamic parameters, such as the temperature and pressure, while
running the simulation, file handling and data analysis is done internally.As the output, VIKING produces plots of energy and temperature
as a function of simulation time and a dynamic trajectory of the structure.
A completed MD simulation can also be further analyzed to study the
stability of a structure or the time evolution of separation distances
for the chosen selection of atoms, all directly from the overview
of the results of the MD simulation task. Furthermore, the data rendered
during the MD simulation could also be employed for the energy perturbation
calculation or drug docking tasks. The typical workflow for running
an MD simulation in VIKING is illustrated in Figure S2 in the Supporting Information.
Drug Discovery
An important application
that goes beyond the standard MD workflow is related to modern drug
discovery. Computational drug discovery has become an important part
of pharmaceutical research as it allows for the screening of hundreds
of thousands of ligands in an efficient and cheap way, such that only
the top candidate compounds are tested in the lab. Drug discovery
involves computational docking of candidate ligands from a library
of small molecules to a receptor protein, in order to find leads in
medical drug design.VIKING allows automatic docking of ligands
to receptor structures, relying on the AutoDock Vina software package.[30] The user may select a receptor and either upload
the ligands or retrieve them from the PubChem online database (see
Figure S3 in the Supporting Information for the illustration of the workflow in VIKING).As a result
of the drug discovery screening task, VIKING presents
the list of best candidate ligands, arranged by a calculated docking
score and the interaction energy between the receptor and the ligand.
The user can observe each combined receptor–ligand structure
using the VIKING structure viewer in the browser and use them for
further modeling tasks, such as an MD simulation to refine the docked
structure and provide better sampling of the interaction energy.
MD: Free-Energy Perturbation Method
One of the most accurate ways to calculate the free energy of binding
between two molecules, for example, receptor and ligand, is using
the MD free energy perturbation (MDFEP) method.[37,38] VIKING supports free-energy calculations of receptor–ligand
complexes using the alchemical approach:[37,39] in a series of simulations based on NAMD,[21] the interactions between the ligand and its surroundings are gradually
decoupled, essentially annihilating the ligand, either in the binding
site of the receptor or in the solvent, and the resulting free-energy
changes are sampled and used to reconstruct the total binding free
energy. Unlike pure force field interaction energy calculations, the
free-energy perturbation method crucially captures the entropic contributions
to the free energy of binding.[37] This information
can be crucial, when investigating the factors that may affect the
binding of drugs or other ligands, for example, studying drug resistance,
or as a part of predicting the free energies and rates of enzymatic
reactions.Running an MDFEP task in VIKING requires the user
to supply a simulation state consisting of a set of atom positions
and velocities for a molecular structure and specifying the part of
the structure to be considered as the ligand. The simulation state
can be uploaded as a set of files or chosen directly from a previous
MD task. Restraints can then be applied to avoid the ligand diffusing
away from the binding site, when interactions with the protein are
turned off. This is important, as the MDFEP framework requires the
decoupling transformation to be reversible. VIKING ensures this by
performing both an annihilation, “forward” transformation,
turning off the interactions between the ligand and its surroundings,
and a subsequent “backward”, or creation, transformation,
turning the interactions back on. A separate set of simulations is
automatically set up to measure the free-energy error because of the
artificial constraints, and VIKING ensures proper bookkeeping to assemble
this information at the end of the FEP task to provide the binding
free energy to the user. The MDFEP workflow in VIKING is illustrated
in Figure S4 in the Supporting Information.
Atomic and Molecular Properties
It
has been well established that polarization interactions play a key
role in biochemical systems,[40,41] whereas the conventional
MD simulations typically rely only on Lennard-Jones and Coulomb potentials,[21] which do not take polarization into account.
In order to consider polarizabilities and different quantum phenomena
in molecular structures of increased complexity, QC simulations are
needed to fully account for these aspects.[4,42−44] The main difficulty of the QC calculations is the
poor scalability, making it infeasible, even on modern supercomputers,
to accurately treat systems with more than ∼1000 atoms quantum-mechanically.[45] Despite this limitation, a great deal of successful
QC algorithms and programs have been developed, and in particular,
VIKING employs the popular Gaussian09[25] software package for the QC calculations described here.VIKING
offers several QC task types for calculating atomic and molecular
properties, specifically geometry optimization, electronic properties
calculation, and NMR properties calculation. A wide variety of calculation
methods provided by Gaussian09, such as the Hartree Fock,[46] Møller–Plesset perturbation theory
(MP2),[47] or density functional theory[48] with a variety of different functionals, can
be selected, jointly with all the standard basis sets typically used
for the wave function expansion, including variants with polarization
or diffuse functions. It is also possible to add constraints to certain
atom coordinates, bond lengths, or angles or to divide the molecular
structure into fragments to provide a better starting guess for the
chosen QC method.After a successful calculation, VIKING analyzes
the output files
from Gaussian09 and collects the relevant data for the chosen QC task.
These data are visually presented to the user on the results page,
and, in case of a geometry optimization task, the resulting optimized
structure is available for use in other computational tasks in VIKING.
The general workflow for the QC tasks is illustrated in Figure S5
in the Supporting Information.
Molecular Spectroscopy
VIKING provides
a tool for studying the properties of smaller molecules and is specifically
equipped with capabilities to perform a variety of molecular spectroscopy
calculations. Every spectroscopy task is designed to reveal specific
properties of the molecules, and VIKING supports IR spectroscopy,
RS, NMR spectroscopy, and circular dichroism spectroscopy.As
for the task types in the previous section for determining atomic
and molecular properties, the spectroscopy tasks in VIKING also rely
on the Gaussian09 software package[25] to
carry out the computations, and the same set of QC methods and basis
sets are available. To set up a spectroscopy calculation task, the
user is required to supply the molecular structure, select the charge
and spin states of the molecule, and select the calculation method
to use.The resulting spectra and other calculated molecular
properties
are presented directly in the web interface, and normal mode vibrations
calculated in the IR spectroscopy, RS, and CDS tasks can be visualized
as animations in the structure viewer.
Spin Chemistry
Radicals have received
renewed interest as possible biological implications of radical pairs
have been suggested, in particular, the radical pair mechanism of
avian magnetoreception[43,44,49] and the possible adverse health effects of radiofrequency radiation.[50−52]The radical pair dynamics task in VIKING is designed to allow
the investigation of radical pair processes in various ways, including
studies of how the quantum yields are affected by static magnetic
fields or radiation, or calculation of the time evolution of radical
pair ensembles. VIKING can track the energy levels of the radical
pairs in a new and innovative way that may help tremendously in interpreting
any unexpected results. In particular, the energy levels may be obtained
as a function of, for example, external magnetic field strength or
time, and the spin states involved in each energy level are color-coded
in the figures produced by VIKING. With the capability to include
time-dependent magnetic fields, it is possible not only to study magnetic
resonance experiments or the effects of radiofrequency radiation but
also to simply calculate the set of resonance frequencies that may
affect the dynamics of a radical pair.Describing a radical
pair requires a range of parameters that are
normally obtained from quantum chemical calculations. As VIKING supports
these types of quantum calculations, all the parameters needed for
the radical pair can be imported directly from the other VIKING tasks
through a visual interface. The workflow of the radical pair dynamics
task is illustrated in Figure S6 in the Supporting Information.The radical pair dynamics task in VIKING
relies on the MolSpin
software,[53] which has been developed separately
by the members of the VIKING development team. It is a dedicated spin
dynamics software package, which is designed to be able to perform
any kind of calculation on the spin systems of arbitrary complexity.
Genome Editing
Apart from providing
interfaces for the existing program packages, VIKING features a specialized
tool for analyzing profiles from genome editing studies, ProfileIt[54] (https://cobotechnologies.com/software/indel-analysis-software/). Sponsored by Cobo Technologies, it allows INDEL (insertion/deletion
of bases) profiling of sample data produced by Applied Biosystems
genetic analyzer devices. ProfileIt provides an interactive interface
for visualizing, selecting, and subtracting INDEL peaks, as well as
displaying various statistics and export of profile data and publication-quality
images. The user is only required to provide the files obtained from
an analyzer device and assign a control sample and, optionally, a
negative control. The peak discovery process in VIKING is tuned for
an accurate detection of profile extrema exceeding a given threshold
and calculation of their basic properties. The sample profiles are
visualized in an interactive plot in the interface.
Molecular Editor
VIKING provides
a fully functional molecular editor which is a diverse and intuitive
tool for constructing and editing the molecular structures the user
wishes to simulate. The collection of editing tools is integrated
in VIKING, and no additional programs are needed to apply the desired
manipulations.In addition to common translation and rotation
tools, VIKING provides extended possibilities to construct custom
structures. This set of tools includes generating chemical bonds,
placing single atoms, and merging multiple structures. This allows
the user to construct small molecules from scratch and incorporate
them in bigger structures or construct large protein complexes. The
implemented tools can also be used to delete and replace atoms in
molecules or structures with just a few clicks.The support
for different representations allows the user to choose
between visualizing individual atoms in a structure or showing the
secondary structures of a protein, if the latter is more convenient.
This allows for utilizing the editor in a multiscale fashion: it is
possible to choose to work with single atoms at a time, or choose
to work with bigger chunks of a structure, like an entire β-sheet
or α-helix, making it easy to select and edit many individual
atoms at once.As mentioned earlier, VIKING is able to visualize
trajectories
from the MD tasks as well as the individual steps of a geometry optimization
procedure. This allows the user to choose specific configurations
of a simulated system, edit them, and apply them for further studies.
Moreover, VIKING makes it possible to extract a specific part of a
structure from a specific frame of a trajectory from MD simulations
and apply the extracted structure in a quantum chemical task. Furthermore,
VIKING supports visualizing structures in an immersive three-dimensional
experience using virtual reality devices. The interface of the molecular
viewer and editor can be seen in Figure S7 in the Supporting Information.
Authors: Maria-Andrea Mroginski; Suliman Adam; Gil S Amoyal; Avishai Barnoy; Ana-Nicoleta Bondar; Veniamin A Borin; Jonathan R Church; Tatiana Domratcheva; Bernd Ensing; Francesca Fanelli; Nicolas Ferré; Ofer Filiba; Laura Pedraza-González; Ronald González; Cristina E González-Espinoza; Rajiv K Kar; Lukas Kemmler; Seung Soo Kim; Jacob Kongsted; Anna I Krylov; Yigal Lahav; Michalis Lazaratos; Qays NasserEddin; Isabelle Navizet; Alexander Nemukhin; Massimo Olivucci; Jógvan Magnus Haugaard Olsen; Alberto Pérez de Alba Ortíz; Elisa Pieri; Aditya G Rao; Young Min Rhee; Niccolò Ricardi; Saumik Sen; Ilia A Solov'yov; Luca De Vico; Tomasz A Wesolowski; Christian Wiebeler; Xuchun Yang; Igor Schapiro Journal: Photochem Photobiol Date: 2021-02-13 Impact factor: 3.521