Piers A Townsend1, Matthew N Grayson2. 1. Centre for Sustainable Chemical Technologies, Department of Chemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom. 2. Department of Chemistry, University of Bath, Claverton Down, Bath BA2 7AY, United Kingdom.
Abstract
As a field, computational toxicology is concerned with using in silico models to predict and understand the origins of toxicity. It is fast, relatively inexpensive, and avoids the ethical conundrum of using animals in scientific experimentation. In this perspective, we discuss the importance of computational models in toxicology, with a specific focus on the different model types that can be used in predictive toxicological approaches toward mutagenicity (SARs and QSARs). We then focus on how quantum chemical methods, such as density functional theory (DFT), have previously been used in the prediction of mutagenicity. It is then discussed how DFT allows for the development of new chemical descriptors that focus on capturing the steric and energetic effects that influence toxicological reactions. We hope to demonstrate the role that DFT plays in understanding the fundamental, intrinsic chemistry of toxicological reactions in predictive toxicology.
In the last few decades, the ethical conundrum of in vivo animal testing has plagued toxicology; the use of animals in science
has been under intense scrutiny for many years, and finding fast,
sustainable, alternative ways to reduce animal testing is of huge
interest to both scientists and nonscientists across the globe. Currently,
there are many well established in vitro and in vivo methods in toxicology, each having advantages and
disadvantages. Computational methods in toxicology, however, are not
so well established, and they will play an important role in finding
solutions to the complex ethical issue of animal testing. Computational
toxicology has seen a surge in popularity throughout the last two
decades; this is due to an increase in the accessibility of toxicological
databases, pressure from industries to provide low-cost methods to
test the safety of compounds, and the drive to reduce the need for animal testing.[1] If sufficient accuracy can be achieved, in silico methods are typically inexpensive, are relatively
fast, and allow circumvention of the ethical issues attached to animal
testing. For this reason, legislative programs are increasingly keen
to explore the use of computational methods. Furthermore, computational
chemistry has an important role to play in the development of in silico approaches in toxicology. Current computational
approaches are not typically built on the fundamental chemical origins
of toxicity, and quantum mechanical methods such as density functional
theory (DFT) can be used to explore the intrinsic chemistry behind
a toxicological reaction. This perspective will explore three topics:
the benefits and problems associated with in vivo animal testing (Section ), current and widely used methodologies for in silico mutagenicity prediction (Section ), and last, how quantum chemical
methods such as DFT can be used to explore the chemistry of
mutagenicity (Section ).
Animal Testing in Toxicology
Throughout
the course of history, animals have been utilized by humans for many
different uses: food supply, transport, and domestication to name
a few. However, the most contentious use of animals since the dawn
of the scientific era is their use in scientific research, for example,
testing new pharmaceutical products and toxicological screenings.[2] A formal definition for animal testing (or animal
experimentation) is given by the German Animal Welfare Act, “the
use or manipulation of animals that involves the inflicting of suffering,
pain or injury to them”.[3] This applies
to any procedure involving an animal that subjects them to “stress
equivalent to, or higher than that caused by the introduction of a
needle in accordance with good veterinary practice” as defined
in a 2010 European directive on the protection of animals used for
scientific purposes.[4] Animal testing remains
contentious due to the purposeful inflicting of pain for the acquisition
of knowledge, which raises the question: is it ethically justifiable
to kill for the advancement of knowledge? To understand the extent
of animal testing in the modern era, it is important to examine our
everyday lives. A colossal number of domestic products such as food,
utilities, and pharmaceuticals have likely used animal testing in
their development cycle at one stage or another. Despite the widespread
criticism of animal testing, it plays a critical role in ensuring
that substances are safe for human use and consumption. Animal testing
directly allows scientists to empirically observe the emergent properties
of chemical exposure to organisms; without its widespread use, many
adverse toxicological properties would remain misunderstood and unexplored.
It is thus important to acknowledge the vital role that animal testing
has played and continues to play in the development of in
vitro and in silico approaches in toxicology.[5] For toxicologists, a clear challenge lies in
developing in vitro assays and cheminformatic tools
that are accurate and reliable testing methods, without missing key
information that could lead to human harm or death. It has been argued
that experiments involving animals can often be poorly predictive
and wasteful by design.[6] For this reason,
combined with the ethical considerations, it is imperative that science
tries to reduce animal testing where possible, and finds alternative
solutions. However, it is more realistic to acknowledge that animal
testing and alternative approaches (in silico, in vitro, in chemico methods) could exist
as complementary methods, as opposed to permanent and direct replacements.
A good starting point for this movement was proposed by Russell and
Burch in 1959; three Rs were defined: replacement, reduction, and
refinement.[7] The three Rs should be applied
to any experimental design or methodology that involves animals. Can
you replace the process that involves in vivo testing
with an alternative method? Can you reduce the number of animals used
in experiments? And last, if you must use animals, can you refine
the process such that husbandry and care are improved? A recent study
highlighted how the three Rs have become an integral part of scientific
research in the United Kingdom while simultaneously becoming a “transnational
gold standard” in laboratory ethics.[8] Despite widespread acceptance of the three Rs, a large number of
animals are still involved in scientific experimentation. A recent
report published by the U.K. government suggests that around 3.80
million experiments involving animals were performed in 2017.[9] This is undoubtedly one of the largest drivers
for the development of computational methods within toxicology. As
a field, computational toxicology aligns closely with the three Rs and at its core aims to
drastically reduce the use of animals in safety testing and pharmaceutical
drug design.
Legislation in Toxicology
To ensure
widespread consistency, and to minimize the risk of harm to society
and public health, legislation plays an important role in chemical
toxicology. There are many boards that regulate toxicology such as
the Organization for Economic Co-operation and Development (OECD)
and the International Council for Harmonization (ICH). However, one
of the most important and prominent legislative programs is a European
Union regulation called REACH (Registration, Evaluation, Authorisation
and restriction of Chemicals). This regulation aims to protect human
health and keep the environment safe while simultaneously promoting
innovation in the EU chemicals industry. Its aims also prove to meet
a few principles of green chemistry as proposed by Anastas and Warner.[10] In particular, REACH aligns with principle 4
‘designing safer chemicals’ and principle 12 ‘safer
chemistry for accident prevention’. The principles of green
chemistry should be key considerations when designing any new, modern
chemical process or technology. One of REACH’s main goals is
to support the use and development of alternative methods for the
assessment of chemical safety, methods such as quantitative structure
activity relationships (QSARs, see Section ). This is indicative of the vital role that
computational toxicology is set to play in REACH’s vision of
the future. It has become increasingly common for alternative nontesting
methods to be cited as possible ways of meeting data requirements
within a regulatory context. For example, Annex VII of REACH regulation
requires in vitro/in chemico tests as a first step
in addressing the skin sensitization risk of a chemical.[11] Traditionally, in vivo and in vitro chemical safety assessment has been performed according
to test guidelines (TG) as put forward by the OECD, ensuring that
consistency and reliability are core to the test procedures and outcomes.
Although documentation does exist for guidance on how to utilize and
report data obtained from computational approaches (e.g., QSAR), no
formal test guidelines have been put forward for in silico approaches.[12] Evaluating in silico approaches for the assessment of mutagenicity and other end points
of concern remains an area of active interest for computational toxicologists.
This perspective aims to introduce and appraise how density functional
theory (DFT) can and has previously been used as a tool for the assessment
of mutagenic risk in pharmaceutically relevant organic molecules.
Computational Models for the Prediction of Mutagenicity
As a biological concept, mutagenicity refers to “the permanent
and transmissible changes in the amount or structure of the genetic
materials of cells and organisms”. These changes can be focused
toward a single gene, clusters of genes, or entire chromosomes.[13] The chemical causing changes to DNA is itself
termed a “mutagen”, and mutagens can cause direct (or
indirect) damage to DNA, resulting in different types of mutation
to the genome. A variety of experimental approaches exist for assessing
mutagenic risk, but these will not be discussed in this review. Please
see Hasselgren et al. for a thorough, in-depth analysis of in silico genotoxicity assessment and associated experimental
protocols.[12] Before broadly discussing in silico approaches for mutagenicity prediction, it is
first important to highlight the role that in vivo and in vitro data play in constructing computational
models. Without large, high-quality experimental data sets, there
would be no method of anchoring chemical structures to their associated
adverse toxicological outcomes. However, computational methods are
among the most dynamic, flexible tools for the assessment of chemical
safety. Predictions are relatively inexpensive and fast when compared
to in vivo and in vitro methods,
and this continues to drive the development of in silico approaches. Since 1991, when Ashby and Tennant published a study
that successfully correlated chemical structure with genotoxicity
and DNA reactivity,[14] in silico approaches to predict mutagenicity have been at the forefront of
toxicological research. Models for the prediction of mutagenicity
typically fall into one of two categories. This section will explore
these categories and the approaches they take toward mutagenicity
prediction.
Structure–Activity Relationships (SARs)
As a concept, structure–activity relationships (SARs) underpin
all fundamental investigations in toxicology. SARs are computational
models that attempt to link qualitative chemical structure with biological
activity. The central idea of SARs is that molecular structure implicitly
determines physical and chemical properties. These properties then
directly influence the biological interactions and therefore the toxicological
mode of action.[15] In 1991, Ashby and Tennant
published ground-breaking work that introduced the role of SARs and
computational modeling in the prediction of mutagenicity.[16] They chose 301 chemicals and categorized them
according to pre-existing chemical “structural alerts”
that indicate a propensity toward DNA reactivity (see Figure ). The chemicals were split
into 154 “alerting” chemicals and 147 “nonalerting”
chemicals. The alerting chemicals were further subcategorized into
aromatic amino/nitro types, DNA alkylating agents, and an “assorted”
structurally alerting group. The results of this study showed that
most structurally alerting chemicals were mutagenic, while approximately
95% of the nonalerting chemicals were not mutagenic. These results
showed that using so-called “structural alerts” can
give a good level of confidence in predicting mutagenicity. This work
by Ashby and Tennant laid the foundation for further work in attempting
to correlate structural features with mutagenicity, and their work
still plays an important role in modern predictive techniques. The
idea of “chemical category” formation is fundamental
in the development of SARs, and it has previously been proposed
that chemicals should be categorized according to their initial mode
of action, the so-called “molecular initiating event”
(MIE).[17] Category formation and the MIE
are built around the concept that chemicals with similar profiles
will exhibit similar toxicological responses. The first discussion
of the MIE can be traced back to 2006, where Schultz et al. showed
that a framework for reactive toxicity can be constructed according
to the initial covalent reaction of biological nucleophiles (such
as DNA) with soft electrophiles.[18] It is
worth noting, however, that directly applying mechanistic organic
chemistry in toxicology does have limitations; this is due to a wide
range of conditions in which reactions may be carried out. In organic
chemistry, reactions can be carried out in different solvents and
at a range of different temperatures. This differs greatly from an
aqueous, well-controlled biological or cellular environment. Thus,
reaction conditions can drastically affect reactivity and the extent
of reaction. Despite these limitations, understanding the initial
MIE, and the fundamental chemistry associated with a toxicological
end point is of paramount importance in predictive toxicology. The
approach of category formation according to the MIE intrinsically
focuses on the mechanistic chemistry as opposed to previously obtained
toxicological data sets that only rely on empirical evidence (e.g.,
Ames test data).[19] For mutagenicity, the
most important MIE for category formation includes chemicals that
can react to form covalent DNA adducts. Although the chemistry of
DNA adducts will not be discussed in this work, Benigni and Bossa
present a large number of chemical categories that show evidence of
mutagenicity and carcinogenicity and act as a great starting point
for understanding the mechanisms behind DNA reactivity.[20] Returning to the structural alerts developed
by Ashby and Tennant,[17] a number of “expert
systems” exist that utilize structural alerts for toxicity
prediction, systems such as Derek and Toxtree.[21,22] An expert system is one of the earliest forms of artificial intelligence,
which uses rules and knowledge to make “if–then”
decisions. It takes advantage of information gathered from human experts
and makes decisions according to a set of rules. Derek and Toxtree
are widely used SAR software programs, and both use chemical categories
and structural alerts to make predictions about the mutagenic risk
of chemicals. However, despite their widespread use, one problem that
frequently occurs relates to the “applicability domain”
of the models. Computer models are typically constructed and trained
with a limited data set. A recent study by Bossuyt et al. showed that
the applicability domain is important in assessing the confidence
of a predictive model. The study showed that the predictive potential
is moderately low for chemicals that are new to a model and not included
in the initial training data set.[23] This
therefore leads to difficulties in evaluating test compounds that
are structurally different to those in the initial training data set.
To evaluate the predictive performance of SAR models, sensitivity,
accuracy, specificity, positive predictivity, and negative predictivity
are all parameters that should be evaluated according to the OECD
guidance document.[24] In particular, model
sensitivity and accuracy are two of the most important metrics when
developing SAR models. Overall, studies show that SARs are widely
used and well developed in the field of toxicology. However, due to
the size of their wide-ranging applicability domains, there can exist
issues with model performance.[25] This raises
the question: can large, robust models with a large applicability domain
be developed, yielding universal models that allow the accurate
and sensitive prediction of mutagenicity? Alternatively, does the
key lie in the construction of individual small models for each chemical
category? Although these models may have a limited applicability domain,
can their targeted focus ensure they remain highly sensitive and highly
predictive?
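The five performance statistics named above can be made concrete with a small calculation. The following is a minimal sketch only: the function name and the confusion-matrix counts are hypothetical and are not taken from any of the studies cited here. It assumes a binary mutagenic/nonmutagenic classification in which "positive" means predicted mutagenic.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN when the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def classification_metrics(tp, fp, tn, fn):
    """Compute five common statistics from confusion-matrix counts.

    tp: mutagens correctly predicted mutagenic
    fp: nonmutagens wrongly predicted mutagenic
    tn: nonmutagens correctly predicted nonmutagenic
    fn: mutagens wrongly predicted nonmutagenic
    """
    return {
        "sensitivity": _ratio(tp, tp + fn),            # fraction of true mutagens found
        "specificity": _ratio(tn, tn + fp),            # fraction of true nonmutagens found
        "accuracy": _ratio(tp + tn, tp + fp + tn + fn),
        "positive_predictivity": _ratio(tp, tp + fp),  # reliability of a positive call
        "negative_predictivity": _ratio(tn, tn + fn),  # reliability of a negative call
    }


# Hypothetical counts for an external test set of 100 chemicals.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In a safety context a false negative (a missed mutagen) is usually the costlier error, which is one reason sensitivity tends to be reported prominently alongside overall accuracy.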
Figure 1
Diagram showing 5 structural alerts associated with DNA reactivity.
These compounds sit within the mechanistic domain of “Michael
addition”. These were developed more recently by Enoch and
Cronin. However, Ashby and Tennant laid the foundation for this type
of work.
Quantitative structure activity relationships (QSARs)
are models built from biological, chemical, and statistical data to
better understand toxicological events (see Figure ). The fundamental principle of a QSAR is
to establish links between a chemical descriptor and the biological
activity. In this section, the use of QSAR models in mutagenicity
is examined, with particular emphasis on how computational chemistry
can play a powerful role in underpinning the data used in QSAR models.
Molecules are represented as numerical models and their properties
can be calculated using classical and quantum equations; these properties,
alternatively called descriptors (e.g., Number of bonds, HOMO/LUMO
energies), can then be analyzed for variation and coupled to their
associated biological activity. This allows development of a model
that contains “rules” for predicting the activity $A_B$ of any chemical structure.[26] Generally,
QSARs adopt the form of a linear equation as below:
$$A_B = \sum_{i=1}^{N} c_i P_i$$
where $c_i$ is a coefficient, $P_i$ is a parameter derived from molecular structure
(e.g., hydrophobicity),
and N is the number of parameters included within
the model. The descriptors are computed for each molecule in a given
data set, followed by calculating the coefficient for each parameter;
this is done by fitting variation in both parameters and the biological
activity. QSAR models can be one of two types: global or local. Global
QSAR models take large data sets of chemicals (which are both structurally
comparable and noncomparable) and attempt to refine the predictive
potential of the model. Alternatively, local QSAR models take congeneric
groups of chemicals and refine the model's predictivity. For example,
Gramatica et al. successfully developed a local (Q)SAR model to predict
the toxicological response of phenylureas and s-triazines.[27] They arrived at an important conclusion: although
the same toxicological end point was considered for the two different
groups of chemicals, the descriptors showing highest predictivity
were different between groups. This highlights the variability in
chemical structure, and how not all adverse outcomes arise from the
same chemical “origin”. Despite the widespread use of
local models, global models come with the advantage of being able
to offer predictions on any chemical entity, accompanied by a numerical
level of confidence in the prediction. Two of the most popular global
(Q)SARs are Sarah Nexus[28] and CAESAR,[29] both of which have been shown to be successful models.
A recent study by Honma et al. compared the performance of global
(Q)SAR models for the prediction of Ames test results in three different
phases over the course of three years.[30] In 2014 (phase I), the models respectively showed a sensitivity
of 51.2% and 69.5%, while three years later in 2017 (phase III), the
models showed a sensitivity of 44% and 67.5%. This study demonstrates
that the prediction of Ames test results using global (Q)SAR models
has room for improvement. Despite the success of global (Q)SAR models
such as Sarah Nexus and CAESAR, the confidence in prediction for large
data sets of congeneric compounds can suffer; this is due to global
models being built around chemicals with largely varying structures
and physicochemical properties. Due to the commercial success of global
models, (Q)SARs built specifically for congeneric groups of chemicals
can often be left underutilized. This is despite the fact that improved
levels of confidence and predictivity may be achieved through use
of a local model as opposed to a large global (Q)SAR model.[31] The development of QSARs to predict mutagenicity
has been an active area of research for many years, with an increasing
focus on using them as part of evidence-based regulatory submissions.
The OECD proposed a set of guidelines for the validation of (Q)SARs
when used for a regulatory purpose.[24] It
is suggested that any (Q)SAR should be constructed with the following
characteristics: (i) a clearly defined end point, (ii) an unambiguous
algorithm, (iii) a defined applicability domain, (iv) appropriate
measures of predictivity and robustness, and last, (v) if possible,
a mechanistic interpretation. These guidelines are deemed highly
appropriate for the development of a useful, highly applicable (Q)SAR.
The authors encourage particular emphasis on guideline (ii); often, commercially
available software can be difficult to interpret due to the ambiguity
in its algorithm. This often means that the reliable use of (Q)SAR
models is restricted to experts in programming and computer science.
Ensuring that transparent, easily interpretable algorithms are available
to accompany regulatory submissions is vital in ensuring that models
can be independently assessed.
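To illustrate the fitting step described in this section, in which a coefficient is obtained for each descriptor by fitting variation in the descriptors against the biological activity, the sketch below performs an ordinary least-squares fit of a linear model in pure Python. Everything specific in it is an assumption made for illustration: the descriptor choices (a hydrophobicity-like value and a LUMO energy), the training values, and the function names are hypothetical, and the activities are generated from known coefficients so that the fit can be checked. No intercept is fitted; one can be included by appending a constant descriptor of 1 to every row.

```python
def fit_linear_qsar(descriptors, activities):
    """Least-squares coefficients c_i for activity = sum_i c_i * P_i.

    Solves the normal equations (X^T X) c = X^T y by Gaussian elimination
    with partial pivoting. `descriptors` is a list of rows, one per molecule.
    """
    n = len(descriptors[0])
    # Build the augmented matrix [X^T X | X^T y].
    aug = []
    for i in range(n):
        row = [sum(r[i] * r[j] for r in descriptors) for j in range(n)]
        row.append(sum(r[i] * a for r, a in zip(descriptors, activities)))
        aug.append(row)
    # Forward elimination.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("descriptors are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (aug[i][n] - known) / aug[i][i]
    return coeffs


def predict_activity(coeffs, descriptor_row):
    """Apply the fitted linear model to one molecule's descriptors."""
    return sum(c * p for c, p in zip(coeffs, descriptor_row))


# Hypothetical training set: each row is [hydrophobicity, E_LUMO in eV].
training_descriptors = [[1.0, -1.0], [2.0, -0.5], [0.5, -2.0], [3.0, -1.5]]
# Synthetic activities generated from known coefficients (0.5 and -2.0).
training_activities = [0.5 * p1 - 2.0 * p2 for p1, p2 in training_descriptors]

coefficients = fit_linear_qsar(training_descriptors, training_activities)
new_prediction = predict_activity(coefficients, [2.0, -1.0])
print(coefficients, new_prediction)
```

Because the algorithm is fully visible, a sketch of this kind is in the spirit of the "unambiguous algorithm" guideline, although a model intended for regulatory use would also need a defined applicability domain and external validation.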
Figure 2
(Q)SARs are computational models that
rely on statistics, chemistry,
and biology to make predictions in toxicology.
Density Functional Theory in Predictive Toxicology
One of the fundamental steps in developing a (Q)SAR is the selection
of relevant toxicological descriptors. Chemical descriptors are at
the core of any (Q)SAR model, and many types of descriptor have been
proposed that represent different levels of chemical structure (e.g.,
atom counts (0D), substructures (1D), topological (2D), geometrical
(3D) descriptors). Many of these descriptors can be calculated using
quantum chemical methods. This section will explore the basics of
DFT, the quantum chemical descriptors that can be calculated (such
as HOMO/LUMO energies), and discuss why DFT transition state modeling
could have an important role to play in the prediction of mutagenicity.
Introducing Density Functional Theory (DFT)
It is well
established that chemicals, when seen as atomic scale
constructs, obey the laws of quantum mechanics. Thus, to gain a detailed
energetic understanding of chemicals, it is necessary to use the mathematical
toolset provided by quantum chemistry. As a field, quantum chemistry
is defined by the application of quantum mechanical models to study
chemical reactions. For the last 40 years, DFT has been one of the
primary methods in physics and chemistry for probing the electronic
structure of periodic systems such as crystals.[32] However, more recently, the uses of DFT have become more
widespread. As a quantum chemical method, DFT arguably offers one of the best trade-offs
between accuracy and speed. Thus, in the last 20 years, DFT has become
widely used for the calculation of molecular properties in toxicologically
relevant organic and inorganic species. DFT is a quantum mechanical
method used in computational chemistry for calculating potential energy
surfaces (PES) of chemical systems; a PES provides information about
the energy of a chemical at a multitude of geometries and degrees
of freedom.
Quantum chemistry is primarily concerned with solving
the time-independent, nonrelativistic electronic Schrödinger
equation as follows:
$$\hat{H}\Psi(\mathbf{r}) = E\Psi(\mathbf{r})$$
where Ĥ is the Hamiltonian operator, E is the energy, Ψ is the wavefunction,
and r is the coordinate of each electron. In quantum chemical methods, the Born–Oppenheimer
approximation is invoked, meaning that electronic and nuclear
motion have been decoupled and separated. The electronic Schrödinger
equation can be solved through the construction of approximate many-electron
wavefunctions, for example, in the Hartree–Fock (HF) theory.[33] The central object in DFT, as proposed by Hohenberg
and Kohn, is the electron density distribution ρ(r) rather than
the wavefunction. The electronic ground state energy of a molecule
can be calculated as a functional of its density, E[ρ(r)]. To understand the principles of DFT from an intuitive,
nonmathematical point of view, E. B. Wilson proposed three fundamental
ideas about the electron density (see Figure ): (i) so-called “cusps” in
the electron density correspond to the position of nuclei, (ii) the
heights of these cusps are directly linked to nuclear charge, and
(iii) numerical integration of the electron density gives the total
number of electrons in the system, for example, the electron density
in benzene would integrate to 42.[34] These
core ideas are what allow us to understand the direct relationship
between the electron density and the energy of a system under study.
To use DFT in computational chemistry (and therefore toxicology),
an orbital approach needs to be adopted as put forward by Kohn and
Sham.[35] The Kohn–Sham (KS-DFT) approach
defines the total electron density as a sum over Kohn–Sham
orbital densities as seen in the following equation:
$$\rho(\mathbf{r}) = \sum_{i=1}^{n_{\mathrm{elec}}} |\psi_i(\mathbf{r})|^2$$
where $\psi_i$ are individual Kohn–Sham orbitals and $n_{\mathrm{elec}}$ is the number of electrons in the system. A problematic
term in the overall energy expression is the unknown exchange–correlation
energy functional. Many approximate functional forms of this term
have been developed over the years, and thus, it is not always clear
which one to choose for a given chemical problem. One of the most
commonly used functionals is B3LYP, but benchmarking studies should
be consulted to determine which functional will likely perform best
for a chemical system of interest.[36−38]
Figure 3
Graphical representation of the electron density surface for a
water molecule. Cusps are observed at the position of nuclei, and
the total electron density must integrate to the total number of electrons.
Diagram reprinted with permission from Koch and Holthausen.[34] Copyright 2001 Wiley-VCH.
The advantages
of DFT are best described by comparing it to wavefunction-based
approaches. Speed of calculation is an important consideration
when working with large data sets of chemical structures. Although
calculation length will differ for each functional, generalizations
can be made for different quantum chemical methods. The simplest wavefunction-based
method, HF theory, can show $N^4$ scaling, where N is
a relative measure of the system size. Higher level wavefunction methods
such as MP2 can show $N^5$ scaling, while coupled-cluster
singles, doubles, and perturbative triples (CCSD(T)) can show very
expensive $N^7$ scaling.[39] DFT
can show a substantial reduction in computational cost, with $N^3$ scaling.[40] Further, a research
field of wide-ranging interest, linear-scaling DFT, aims to further
reduce the scaling to N for very large systems.[39] Although a reduced scaling can be attached to DFT, many
popular functionals show improved performance when compared to HF.[36] Some functionals have also been shown to outperform
MP2.[38] It is this fine balance between
accuracy and computational efficiency that makes Kohn–Sham
DFT so desirable for the calculation of molecular properties. This
is paramount in predictive toxicology and (Q)SAR, where accurate geometries
and molecular properties are vital in building consistent, reliable
models.
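The practical meaning of the scaling exponents quoted above can be seen with a short calculation. The sketch below is illustrative only: it uses the formal exponents given in the text, ignores method-dependent prefactors, basis-set effects, and implementation details, and the names `FORMAL_SCALING` and `relative_cost` are hypothetical. It estimates how much more expensive a calculation becomes when the system size N is multiplied by a given factor.

```python
# Formal scaling exponents as quoted in the text (cost proportional to N**exponent).
FORMAL_SCALING = {"DFT": 3, "HF": 4, "MP2": 5, "CCSD(T)": 7}


def relative_cost(method, size_factor):
    """Factor by which cost grows when system size is multiplied by size_factor.

    This is an idealized estimate: real timings also depend on prefactors,
    integral screening, and the software used.
    """
    return size_factor ** FORMAL_SCALING[method]


# Cost increase on doubling the system size for each method.
doubling_costs = {method: relative_cost(method, 2) for method in FORMAL_SCALING}
for method, cost in doubling_costs.items():
    print(f"{method}: x{cost}")
```

Under these idealized assumptions, doubling the size of a molecule makes a DFT calculation about 8 times more expensive but a CCSD(T) calculation about 128 times more expensive, which is why the gap between methods widens quickly for large data sets of drug-like molecules.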
DFT-Derived Chemical Descriptors in Mutagenicity
Prediction
Molecular structures are complex entities. Much
research has been concerned with trying to capture and utilize the
theoretical information embedded within chemical structures for the
construction of (Q)SARs. Evidence of scientific focus on molecular
descriptors is shown by many (>5000) proposed descriptors for use
in fields such as toxicology and environmental protection. Molecular
descriptors are described as “the final result of a logic and
mathematical procedure which transforms chemical information encoded
within a symbolic representation of a molecule into a useful number
or the result of some standardized experiment”, and have an
important role to play in predictive toxicology (see Figure ).[41] Many mutagenic events are initiated by the reaction of exogenous
electrophiles with nucleophilic atoms in DNA nucleobases such as nitrogen
and sulfur.[42,43] A number of mechanisms and reaction
types can occur, such as the formation of cyclic adducts, frameshift
mutations, and strand breaks.[44] Many of
these mechanisms will be fundamentally governed by electrophilicity,
nucleophilicity, and regioselectivity, and building models that utilize
the descriptors that control this behavior can prove powerful in predictive
potential.[45] Quantum mechanical methods
such as DFT can be used to calculate and develop such descriptors
for use in predictive toxicological models. These descriptors can
be simple zero-dimensional parameters such as molecular weight or
higher-dimensional descriptors such as free energy of activation.
This section will detail and examine some of the most commonly used
descriptors in the prediction of mutagenicity.
Figure 4
Chemical descriptors
can vary from simple 0D parameters (e.g.,
number of atoms) up to complex 3D descriptors such as free energy
of activation. The graphic on the right is a typical output from quantum
chemical DFT calculations.
HOMO/LUMO Energies
The energies of the highest
occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital
(LUMO), and the difference between them, can be key determinants of the
likelihood of reaction between two chemical species. These descriptors
are easily and routinely calculated using quantum chemical methods
such as DFT. They find their origins in the frontier molecular orbital
(FMO) theory as proposed by Fukui, who argued there is better orbital
overlap when the nucleophile HOMO and the electrophile LUMO are closer
in energy.[46] The HOMO–LUMO energy
gap can be used in predictive (Q)SAR models; however, it is not uncommon
for studies to neglect the toxicant–target HOMO–LUMO interaction
and to examine only toxicant energies, for example, individual LUMO
energies for a range of congeneric toxicants. A recent study by Kuhnke
et al. used the DFT-derived HOMO–LUMO energy gap as a descriptor
to predict Ames mutagenicity data for primary aromatic amines.[47] Their results showed that the HOMO–LUMO
gap was an effective descriptor of Ames mutagenicity, particularly when combined with a
quantum mechanical stability term. HOMO/LUMO energies can also be utilized for
calculation of chemical hardness η and chemical softness S according
to the hard and soft acids and bases (HSAB) theory and the following
equations:

$$\eta = \frac{E_{\mathrm{LUMO}} - E_{\mathrm{HOMO}}}{2}, \qquad S = \frac{1}{\eta}$$

LoPachin et al. used DFT to show that
hardness and softness as chemical descriptors can be instrumental
in understanding irreversible, covalent toxicant–target interactions.[48] They showed that soft–soft and hard–hard
interactions are favorable, and nucleophile–electrophile selectivity
is significant when examining toxicological phenomena. We
believe this paper highlights the importance of developing parameters
that relate to the molecular initiating event. Building QSAR models
that utilize DFT-derived chemical descriptors associated with regioselectivity
could be key to predicting the most prevalent molecular sites that
control covalent toxicological phenomena. We also consider that many
descriptors focus exclusively on a single molecule of interest, and
more research should be pursued to examine the fundamental chemistry
between toxicant and target.
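
As an illustration of how the frontier orbital descriptors discussed above might be derived in practice, the following minimal Python sketch computes the HOMO–LUMO gap, hardness, and softness from user-supplied orbital energies, together with a toxicant–target gap of the kind FMO theory considers. It assumes the common convention η = (E_LUMO − E_HOMO)/2 and S = 1/η (some authors instead use S = 1/(2η)). The class name, function name, and all numerical values are hypothetical and are not taken from any of the cited studies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FrontierOrbitals:
    """HOMO and LUMO energies in eV, e.g. read from a DFT output file."""

    homo_ev: float
    lumo_ev: float

    @property
    def gap_ev(self) -> float:
        # HOMO-LUMO gap of a single molecule.
        return self.lumo_ev - self.homo_ev

    @property
    def hardness_ev(self) -> float:
        # Assumed convention: eta = (E_LUMO - E_HOMO) / 2.
        return self.gap_ev / 2.0

    @property
    def softness_per_ev(self) -> float:
        # Assumed convention: S = 1 / eta.
        return 1.0 / self.hardness_ev


def toxicant_target_gap_ev(
    nucleophile: FrontierOrbitals, electrophile: FrontierOrbitals
) -> float:
    """Gap between the nucleophile HOMO and the electrophile LUMO."""
    return electrophile.lumo_ev - nucleophile.homo_ev


# Hypothetical orbital energies, chosen only to make the arithmetic easy to follow.
toxicant = FrontierOrbitals(homo_ev=-6.0, lumo_ev=-2.0)
target = FrontierOrbitals(homo_ev=-5.5, lumo_ev=0.5)

single_molecule_gap = toxicant.gap_ev
interaction_gap = toxicant_target_gap_ev(nucleophile=target, electrophile=toxicant)
```

Keeping the single-molecule gap and the toxicant–target gap as separate quantities mirrors the distinction drawn above between descriptors of the toxicant alone and descriptors of the toxicant–target pair.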
Molecular Size and Shape
The size
and shape of a molecule play important roles in its degree of bioavailability.
Once a structure has been geometrically optimized using DFT, its shape
can be graphically visualized. The relevant metrics may be both molecular
weight and molecular volume; for example, oral bioavailability is
not significant for molecular weights >1000 Da.[49] These descriptors are among the simplest descriptors to
calculate yet can often be vital building blocks when constructing
multivariate QSAR models.
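
To show how simple such a descriptor is to obtain, the sketch below computes a molecular weight from an elemental composition and compares it with the 1000 Da figure quoted above. The function names are hypothetical, the mass table is deliberately limited to a few elements with rounded standard atomic weights, and the 1000 Da cutoff is used purely as an illustrative flag rather than as a validated bioavailability rule.

```python
from typing import Mapping

# Rounded standard atomic weights in Da for a few common elements (assumed subset).
ATOMIC_MASS_DA = {
    "H": 1.008,
    "C": 12.011,
    "N": 14.007,
    "O": 15.999,
    "S": 32.06,
    "Cl": 35.45,
}


def molecular_weight(composition: Mapping[str, int]) -> float:
    """Sum atomic masses for a composition such as {"C": 6, "H": 7, "N": 1}."""
    return sum(ATOMIC_MASS_DA[element] * count for element, count in composition.items())


def exceeds_weight_cutoff(composition: Mapping[str, int], cutoff_da: float = 1000.0) -> bool:
    """Flag compositions heavier than the chosen cutoff (illustrative only)."""
    return molecular_weight(composition) > cutoff_da


# Aniline, C6H7N, the simplest primary aromatic amine.
aniline = {"C": 6, "H": 7, "N": 1}
aniline_weight = molecular_weight(aniline)
```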
Partial Charges
Partial charges
are extremely useful for understanding inter- and intramolecular electrostatic
interactions. In chemistry, a partial charge is typically considered
to be a noninteger charge on atoms in molecules, brought about by
the asymmetrical distribution of electrons within chemical bonds.
These charges play a vital role in steering where reactivity is likely
to occur and therefore which molecular regions will likely be involved
in mutagenicity. Although many different methods exist for the calculation
of partial charges, accurate atomic charges are generally obtained
only through quantum mechanical calculations such as DFT. For example,
a study by Korchowiec et al. showed the strength of DFT for examining
the relative reactivity of different sites in purine bases.[50] By examining charge distribution in guanine,
regions that would likely be involved in electrophilic attack were
ascertained; many toxic chemicals are known electrophiles that cause
genetic damage, and this type of model allows better prediction of
where and why these reactions occur. There are different types of
charge that may be calculated for use in QSAR. Class I charges are
obtained by matching to experimental data or using nonquantum models
that involve methods employed from classical physics. The advantage
of using Class I charges is the speed of acquiring data: they
can be very useful for investigating large data sets.[51] Class II charges are obtained by using wavefunction or
electron density-based approaches, such as HF or DFT, with the charges
being partitioned into individual atomic contributions. An example
of a Class II approach is Mulliken population analysis (MPA). This
method has been used previously in toxicology, where Kim et al. used
DFT to perform MPA for examination of partial charges on exocyclic
nitrogen atoms in aryl amines.[52] Their
results showed that nitrenium ions formed from known mutagenic aryl
nitro drug candidates show greater partial charges on their exocyclic
nitrogen when compared to other similar drug classes. This work directly
shows that partial charges can control the extent of mutagenic activity
and have an important role to play in understanding mutagenesis, allowing
them to be utilized as chemical descriptors where possible. Class
III charges are obtained through the direct analysis of physical observables
that are predicted from the molecular wavefunction. However, for an
understanding of intermolecular interactions, Class III charges appear
to have limited accuracy and applicability.[51] Class IV charges show remarkable accuracy for fast, low-cost calculations
and can be considered semiempirical in nature. They typically utilize
predetermined values from Class II charges that are mapped onto the
atom types, both of which can be calculated using DFT.[53]
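
The Class II partitioning described above can be sketched for Mulliken population analysis as follows. The gross population of each basis function is the corresponding diagonal element of the product of the density matrix P and the overlap matrix S, and the atomic charge is the nuclear charge minus the sum of the populations of the basis functions centered on that atom. This is a minimal pure-Python sketch under stated assumptions: the function name is hypothetical, the matrices are assumed to come from a prior HF or DFT calculation, and the H2-like input uses an illustrative overlap value rather than data from any cited study.

```python
from math import sqrt
from typing import List, Sequence


def mulliken_charges(
    density: Sequence[Sequence[float]],
    overlap: Sequence[Sequence[float]],
    basis_to_atom: Sequence[int],
    nuclear_charges: Sequence[float],
) -> List[float]:
    """Return Mulliken atomic charges q_A = Z_A - sum over mu on A of (P S)_mu,mu."""
    n_basis = len(basis_to_atom)
    populations = [0.0] * len(nuclear_charges)
    for mu in range(n_basis):
        # Diagonal element (P S)_mu,mu: gross population of basis function mu.
        gross = sum(density[mu][nu] * overlap[nu][mu] for nu in range(n_basis))
        populations[basis_to_atom[mu]] += gross
    return [z - n for z, n in zip(nuclear_charges, populations)]


# Toy H2-like system: one basis function per atom, doubly occupied bonding orbital.
s = 0.66  # illustrative overlap between the two basis functions (assumption)
c = 1.0 / sqrt(2.0 * (1.0 + s))  # normalized bonding-orbital coefficient
S = [[1.0, s], [s, 1.0]]
P = [[2.0 * c * c, 2.0 * c * c], [2.0 * c * c, 2.0 * c * c]]

h2_charges = mulliken_charges(P, S, basis_to_atom=[0, 1], nuclear_charges=[1.0, 1.0])
```

For this symmetric toy system the two charges vanish to within rounding, as expected; in a real application the same function would be applied to the exocyclic nitrogen or other atoms of interest. Mulliken charges are known to be sensitive to the basis set, which is worth bearing in mind when they are used as descriptors.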
Hydrogen Bonding
As a phenomenon,
hydrogen bonding is one of the most important concepts in the field
of biochemistry. Proteins, DNA, RNA, and many reactive biological
nucleophiles have a variety of residues that can accept and donate
hydrogen bonds. Hydrogen bonding as a descriptor can be approached
by understanding the energetics behind hydrogen bond formation using
quantum chemical methods such as DFT. Numerous studies have used DFT
to investigate hydrogen bonding and the associated interaction energies
in toxicological phenomena.[54−56] However, in predictive
toxicology, detailed energetic studies are limited, and the number
of hydrogen bond donors/acceptors is typically chosen as a simple
descriptor. It has been shown that when probing these energetics,
wavefunction-based approaches can show improved performance when compared
to DFT. Boese tested over 50 DFT functionals for their performance
in assessing hydrogen bonding and showed that large errors are omnipresent
when compared to higher-level wavefunction-based methods such as Møller–Plesset
(MP2, MP3) perturbation theory and coupled-cluster (CC) methods.[57] When
working with quantum mechanical methods, a fine balance exists between
accuracy and computational feasibility. Many high-level wavefunction-based
methods, such as coupled-cluster, can take impractical lengths
of computation time, ranging from hours to days for large, individual
molecules. A cheminformatic setting will often consider
thousands of molecules, making high-level wavefunction methods
difficult to apply.[58]

As described
earlier, there are many types of chemical descriptor that can be included
in (Q)SAR models for mutagenicity prediction. However, many of these
descriptors are obtained solely from the potential toxicant itself
and neglect any target–toxicant interaction. As discussed earlier in this perspective, the molecular
initiating event (MIE) is an important step in toxicological reactions
and, more broadly, adverse outcome pathways (AOPs).[59] To fully probe this step and gain a detailed understanding
of the MIE, methods that investigate the steric and energetic interactions
between toxicant and target could reveal a hidden layer of information
when attempting to build and develop new descriptors and models.
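
The hydrogen-bond energetics discussed above are commonly expressed as a supermolecular interaction energy, that is, the electronic energy of the toxicant–target complex minus the energies of the two isolated partners. The sketch below shows this arithmetic only. The function name and the energies are hypothetical, and no counterpoise correction for basis set superposition error is applied, which a production calculation would normally consider.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095


def interaction_energy_kcal(
    e_complex_hartree: float,
    e_toxicant_hartree: float,
    e_target_hartree: float,
) -> float:
    """Uncorrected interaction energy E(complex) - E(toxicant) - E(target) in kcal/mol."""
    delta_hartree = e_complex_hartree - e_toxicant_hartree - e_target_hartree
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


# Hypothetical electronic energies in hartree (not from any cited study).
e_int = interaction_energy_kcal(
    e_complex_hartree=-152.880,
    e_toxicant_hartree=-76.436,
    e_target_hartree=-76.436,
)
```

A negative value indicates a stabilizing interaction. Unlike a count of hydrogen bond donors and acceptors, this quantity depends on both partners, so it is a descriptor of the toxicant–target pair rather than of the toxicant alone.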
DFT Transition State Modeling for Mutagenicity Prediction
DFT transition state modeling is a quantum mechanical
method for exploring complex organic reaction mechanisms (see Figure 5). Many mutagenic
events arise due to reaction between a biological nucleophile and
an organic molecule, and thus, transition state modeling can be an
invaluable tool for probing these reactions. According to IUPAC nomenclature,
a transition state is defined as a specific geometric assembly of
atoms that, when randomly placed at the saddle point, would have
an equal probability of forming the reactants or of forming the products.[60] Reaction activation barriers can be calculated
by using quantum chemical methods to calculate the energies of the reactants
(toxicant and target) and transition states (toxicant–target). The
magnitude of the activation barrier gives an indication of the likelihood
of reaction between a biological nucleophile and an exogenous electrophile.
If this methodology can be successfully implemented into predictive
computational toxicology, new insights into the energetic and mechanistic
details of mutagenic events may be possible. Transition state modeling
can give the user insight into competing reaction pathways and will
often reveal the lowest energy pathway.[61−63] Few studies have used
this methodology for the prediction of mutagenicity; those that have
are highlighted below. In 2011, Cronin et
al. used DFT to show that transition state modeling can be used to
predict the reactivity of α,β-unsaturated carbonyl compounds
with glutathione.[64] They showed that steric
hindrance plays a key role in the reactivity profiles and that mechanistic
information is an invaluable tool in predicting electrophilic toxicity.
Although this study was not targeted directly at mutagenicity, it
showed that transition state modeling can be successfully used to
group compounds according to their intrinsic reactivity. In the same
year, Mulliner et al. studied a data set of 35 electrophilic 1,4-Michael
acceptors using DFT transition state modeling. Although this study
focused on correlating transition state barriers to experimental rate
constants, with a targeted end point of aquatic toxicity, it showed
that modeling transition states can be vital for the in silico study of the MIE.[65] In 2012, Kostal et
al. used transition state modeling to examine an SN2 reaction
between 15 epoxides and a chloride anion.[66] Their results showed that free energy of activation (ΔG⧧) could not be used effectively to examine
the mutagenic potential of their epoxide data set. However, in this
case, a chloride anion was chosen as the nucleophile due to “comparable
nucleophilic strength to DNA nucleotides in aqueous solution”.
This could be an oversimplification of the true situation due to DNA
nucleotides having multiple nucleophilic sites with different relative
strengths. An unsuitable choice of nucleophile for
transition state modeling could drastically affect the predictive performance
of a model. Following this, in 2013, Leach et al. carried out a set
of Ames test procedures on a virtual array of aminopyrazoles and,
in parallel, used DFT to predict the associated probability of being
positive in the Ames test.[67] The dissociation
energy ΔE was calculated for a variety of activated
aminopyrazole conjugates at the B3LYP/6-31G* level of theory. The
probabilistic results generated from DFT calculations showed excellent
promise for predicting the risk of mutagenic activity in the Ames
test. This work directly highlights the pivotal role that DFT can
play in making predictions related to mutagenicity. In 2018, Goodman
et al. published a study investigating whether DFT transition state
modeling can be used to predict the Ames test result, and thus the
mutagenic potential, of 19 1,4-Michael acceptor-type compounds.[61] Their chosen nucleophile was methylamine, and
the results demonstrated that free energy of activation shows good
predictivity for the mutagenic potential of Michael acceptors. This
study has importance in showing that transition state modeling may
be widely applicable for studying the mutagenicity of different groups
of electrophilic chemicals. We have since published work that builds
upon this model, where improvements were made to the previously published
transition state barriers, and LUMO energies were shown to be significantly
predictive of Ames test results and thus of mutagenic risk.[68] The study showed that a data set of 29 1,4-Michael
acceptors could be separated by their Ames test result, with 100%
of compounds being correctly predicted and categorized. The work showed
that compounds with reaction barriers less than 20.7 kcal/mol and
LUMO energies less than −1.85 eV should be Ames positive, while
those with reaction barriers greater than 22 kcal/mol and LUMO energies
greater than −1.83 eV should be Ames negative. We believe that
transition state modeling has an important role to play in the future
of predictive toxicology. We further propose that free energy of activation
(with a relevant biological nucleophile) should be more commonly examined
as a chemical descriptor when building future (Q)SAR models for mutagenicity.
In previous years, quantum chemical calculations required considerable
time and expertise to perform. However, with the continued increase
in computational processing power and automation, DFT transition state
calculations are more readily performed than ever, thus unlocking
the potential for toxicologists to incorporate them into chemical
risk assessment.
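
As a minimal sketch of how the activation barrier might be used as a descriptor, the code below first computes a free energy of activation from the free energies of the transition state and the separated reactants, and then applies the barrier and LUMO thresholds quoted above for 1,4-Michael acceptors. The function names and input energies are hypothetical. The "undetermined" label for compounds that fall outside both quoted regions is an assumption made here, since the text does not state how such cases should be treated.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095


def activation_free_energy_kcal(
    g_ts_hartree: float,
    g_toxicant_hartree: float,
    g_nucleophile_hartree: float,
) -> float:
    """Barrier G(TS) - [G(toxicant) + G(nucleophile)] converted to kcal/mol."""
    delta_hartree = g_ts_hartree - (g_toxicant_hartree + g_nucleophile_hartree)
    return delta_hartree * HARTREE_TO_KCAL_PER_MOL


def classify_michael_acceptor(barrier_kcal: float, lumo_ev: float) -> str:
    """Apply the thresholds quoted in the text for 1,4-Michael acceptors."""
    if barrier_kcal < 20.7 and lumo_ev < -1.85:
        return "Ames positive"
    if barrier_kcal > 22.0 and lumo_ev > -1.83:
        return "Ames negative"
    # Outside both quoted regions: no call is made (assumption of this sketch).
    return "undetermined"


# Hypothetical free energies in hartree and LUMO energy in eV.
barrier = activation_free_energy_kcal(
    g_ts_hartree=-400.000,
    g_toxicant_hartree=-304.520,
    g_nucleophile_hartree=-95.512,
)
prediction = classify_michael_acceptor(barrier_kcal=barrier, lumo_ev=-2.0)
```

Requiring both descriptors to agree before making a call follows the way the thresholds are stated in the text. Other combinations, such as a low barrier with a high LUMO energy, are deliberately left unclassified here.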
Figure 5. Diagram of a reaction coordinate showing reactants, products, and
a transition state. Transition state modeling involves calculating
reactant and transition state energies with quantum mechanical methods,
for example, DFT and HF.
Conclusion
This perspective has provided a broad insight into the current
status of how in silico methods can identify genotoxicants,
specifically mutagens, with particular emphasis on how DFT can aid
in the computational prediction of mutagenicity. We first discussed
the importance of developing predictive in silico methods in toxicology, along with the increasing desire to reduce
animal testing where possible. Different in silico approaches (SARs and QSARs) for examining mutagenic potential were
discussed, followed by rationalizing how DFT and transition state
modeling are both powerful tools for calculating molecular descriptors
in predictive toxicology. Despite the broad approach in this perspective,
we have discussed and highlighted why the computational sciences have
an important role to play in the prediction of mutagenicity. We further
ask the research community to consider transition state modeling as
a fundamental method for assessing the mutagenic potential of electrophilic
toxicants. We thank you for your attention and hope that reading this
perspective has been a fruitful endeavor.