Dagmar Stumpfe1, Huabin Hu1, Jürgen Bajorath1. 1. Department of Life Science Informatics, b-it, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Endenicher Allee 19c, D-53115 Bonn, Germany.
Abstract
Activity cliffs (ACs) are generally defined as pairs or groups of structurally similar compounds that are active against the same target but have large differences in potency. Accordingly, ACs capture chemical modifications that strongly influence biological activity. Therefore, they are of particular interest in structure-activity relationship (SAR) analysis and compound optimization. The AC concept is much more complex than it may appear at a first glance, especially if one aims to represent ACs computationally and identify them systematically. To these ends, molecular similarity and potency difference criteria must be carefully considered for AC assessment. Furthermore, ACs are often perceived differently in medicinal and computational chemistry, depending on whether they are studied on a case-by-case basis or systematically. For practical applications, intuitive access to AC information plays a major role. Over the years, the AC concept has been further refined and extended. Herein, we review the evolution of the AC concept, emphasizing new analysis schemes and findings that help to better understand ACs and extract SAR knowledge from them.
Activity cliffs (ACs) have been discussed in computational and
medicinal chemistry for nearly three decades.[1−4] The first mention of the term
probably dates back to a book chapter published by Michael Lajiness
in 1991.[1] ACs were generally defined as
pairs of structurally similar active compounds with a large difference
in potency.[1,2] As such, ACs received strong attention in
computational chemistry and drug design because they represented instances
of structure–activity relationship (SAR) discontinuity that
were detrimental for quantitative SAR (QSAR) modeling.[2] In QSAR, AC compounds were often falsely considered outliers,
as pointed out in a milestone commentary by Gerry Maggiora that raised
awareness of ACs in the field.[2] However,
SAR discontinuity also translates into high SAR information content
and ACs identify small chemical modifications that determine the potency
of active compounds. This explains the strong interest in ACs in medicinal
chemistry to aid in compound optimization.[3,4] ACs
and the SAR discontinuity they capture are usually focal points during
early stages of compound optimization efforts when potency must be
improved. By contrast, encountering ACs is less desirable during late
stages when multiple compound properties must be balanced.[5] This is the case because ACs and the underlying
“steep” SARs complicate multiple property optimization
while attempting to retain sufficiently high compound potency.[5] In medicinal chemistry and structure-based compound
design, individual ACs are frequently found[6,7] including
instances revealing dramatic effects of minute chemical modifications
such as “magic methyl” effects.[7] Moreover, large volumes of SAR information become available when
ACs are systematically identified across bioactive compounds.[8] This is an area where computational and medicinal
chemistry meet, providing a large knowledge base for compound optimization.[4,5] While ACs might be intuitively appreciated from a chemical viewpoint,
by subjectively considering compound series and judging SARs, the
systematic study of ACs requires careful consideration of two criteria
that are essential for defining ACs, the similarity criterion and
potency difference criterion.[3,4] Investigating these
criteria and their interplay has largely determined the way in which
the AC concept has evolved over the years and continues to evolve,[5] as discussed in the following.
Similarity
Molecular Graph-Based Similarity
Compound similarity can be assessed in a variety of ways. For computational
representation of ACs, molecular fingerprints (bit string representations
of chemical structure and/or properties) were originally used
as descriptors to calculate Tanimoto similarity,[3] a standard procedure in chemoinformatics. Tanimoto coefficient
(Tc)-based ACs are shown in Figure 1. While convenient computationally, the use of the Tc, which ranges from 0 (no fingerprint
overlap) to 1 (identical fingerprints), requires the choice of a threshold
value for classifying compounds as similar, which is largely subjective.[3] Furthermore, different fingerprints produce different
Tc values, which changes similarity criteria. Accordingly, attempts
have been made to identify ACs that would be formed independently
of different molecular representations and similarity calculations,
leading to the search for “consensus ACs”.[9] Moreover, although straightforward to calculate
for computational screening of compounds, Tc values are sometimes
difficult to interpret from a chemical perspective because they are
calculated as a whole-molecule similarity measure and do not take
any substituent rules or reaction information into account. Therefore,
as an alternative to calculated similarity values, substructure-based
similarity criteria can be applied, which lead to a binary assessment
of similarity,[4] i.e., either two compounds
share a predefined substructure and are classified as similar or not.
Substructure-based similarity can also be assessed in different ways,
for example, by using formally defined molecular scaffolds (core structures)
and distinguishing different scaffold/substituent (R-group) relationships,[10] as illustrated in Figure 1.
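The fingerprint-based AC definition discussed above can be illustrated with a minimal sketch. It assumes that fingerprints are given as sets of "on" bit positions and that potency is reported as pKi. The function names, the toy compounds, and the Tc threshold of 0.55 are hypothetical and chosen only for illustration; as noted above, the choice of a similarity threshold is largely subjective.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bits."""
    union = len(fp_a | fp_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(fp_a & fp_b) / union


def fingerprint_cliffs(compounds, tc_min=0.55, delta_min=2.0):
    """Return compound pairs with Tc >= tc_min and |delta pKi| >= delta_min.

    compounds: dict mapping a compound name to (set of on bits, pKi).
    Both thresholds are illustrative assumptions, not recommended values.
    """
    cliffs = []
    for (name_a, (fp_a, pki_a)), (name_b, (fp_b, pki_b)) in combinations(
        compounds.items(), 2
    ):
        tc = tanimoto(fp_a, fp_b)
        if tc >= tc_min and abs(pki_a - pki_b) >= delta_min:
            cliffs.append((name_a, name_b, round(tc, 2)))
    return cliffs


# Made-up toy data: bit sets and pKi values are not taken from any data set.
toy = {
    "cpd_1": ({1, 2, 3, 4, 5}, 8.9),
    "cpd_2": ({1, 2, 3, 4, 6}, 6.1),
    "cpd_3": ({7, 8, 9}, 5.0),
}
print(fingerprint_cliffs(toy))  # [('cpd_1', 'cpd_2', 0.67)]
```

In this toy example, only cpd_1 and cpd_2 share enough bits (4 of 6) and differ sufficiently in potency to qualify. Changing the fingerprint or the Tc threshold changes the resulting AC set, which is the dependence described in the text.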
Figure 1
Fingerprint- and substructure-based activity
cliffs. At the top,
two fingerprint-based ACs are shown. Tanimoto similarity values for
comparison of structural fingerprints are reported. At the bottom,
ACs defined on the basis of different scaffold/R-group relationships
are shown, applying substructure-based similarity criteria instead
of fingerprint calculations. Distinguishing R-groups and other structural
differences between compounds are highlighted in red and compound
potency (pKi) values are provided (Adapted
from the work of Stumpfe et al.[4] Copyright
2014. American Chemical Society).
The substructure-based classification scheme depicted in Figure 1 establishes different
categories of ACs that are chemically intuitive including “chirality
cliffs” where compounds differ only by the configuration at
a single stereo center but display large potency differences. The
study of such chirality cliffs (or chiral cliffs) capturing subtle
structural differences was recently further extended by systematically
evaluating if pairs of enantiomers tested in the same assay formed
an AC or not.[11] In this study, chiral cliffs
were represented by a variety of chirality-sensitive descriptors and
subjected to machine learning to predict chiral cliffs and distinguish
them from pairs of enantiomers that did not form ACs.[11]
Another substructure-based similarity criterion for ACs is provided
by the formation of matched molecular pairs (MMPs),[12] i.e., pairs of compounds that are only distinguished by
a chemical modification at a single site.[13] MMPs can be efficiently identified algorithmically in large compound
databases.[13] The size of chemical modifications
can be restricted to mostly limit compounds forming MMPs to structural
analogs typically generated in medicinal chemistry.[12] Application of this similarity criterion led to the introduction
of MMP-based ACs that were termed “MMP-cliffs”.[12] For MMP generation, algorithmic fragmentation
of exocyclic single bonds can also be replaced with bond fragmentation
according to retrosynthetic rules, giving rise to the formation of
retrosynthetic MMPs (RMMPs)[14] and “RMMP-cliffs”
(in analogy to MMP-cliffs), as further discussed below. By definition,
MMPs and RMMPs are characterized by substitutions at a single site.
Hence, MMP- and RMMP-cliffs cannot contain different R-groups at multiple
sites, which is often the case for lead optimization series. Recently,
a computational method has been developed for systematically identifying
analog series with single or multiple substitution sites in compound
data sets. This approach, termed the compound-core relationship (CCR)
method,[15] is applicable to enumerate analog
pairs (APs) from a given series with single or multiple substitution
sites, hence providing an extension of the (R)MMP formalism to multiple
points of chemical modification (R-group replacements)[15] and another similarity criterion for AC formation.
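The MMP criterion can be sketched in a few lines, assuming that each compound has already been fragmented into (core, substituent) pairs by single-bond cuts. The fragmentation step itself requires a cheminformatics toolkit and is not shown. The function name, the input layout, the placeholder core and substituent labels, and the size limit are hypothetical; the sketch is not the published MMP algorithm.

```python
from collections import defaultdict
from itertools import combinations


def find_mmps(fragmentations, max_sub_size):
    """Pair compounds that share a core and differ in one substituent.

    fragmentations: dict mapping a compound name to a list of
    (core, substituent, substituent_heavy_atom_count) tuples.
    max_sub_size: largest substituent (heavy atoms) that is accepted;
    the value is an assumption left to the caller.
    """
    by_core = defaultdict(dict)  # core -> {compound: substituent}
    for cpd, frags in fragmentations.items():
        for core, sub, size in frags:
            if size <= max_sub_size:
                by_core[core][cpd] = sub

    mmps = set()
    for core, members in by_core.items():
        for a, b in combinations(sorted(members), 2):
            if members[a] != members[b]:  # same core, different substituent
                mmps.add((a, b, core, members[a], members[b]))
    return sorted(mmps)


# Placeholder labels, not valid structures.
toy = {
    "cpd_A": [("core_1", "OH", 1)],
    "cpd_B": [("core_1", "OCH3", 2)],
    "cpd_C": [("core_2", "Cl", 1)],
}
print(find_mmps(toy, max_sub_size=10))
# [('cpd_A', 'cpd_B', 'core_1', 'OH', 'OCH3')]
```

Indexing by core avoids an all-against-all structure comparison, which is one reason why MMPs can be identified efficiently in large databases. Extending the pairing to compounds that differ at several sites of a shared core would correspond to the AP concept, which this sketch does not cover.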
From Two to Three Dimensions
ACs can be defined not only using molecular graph-based (2D) similarity
but also on the basis of three-dimensional (3D) structures. To this
ends, bound ligands in complex structures with a given target protein
must be spatially aligned based upon protein superposition and 3D
similarity is calculated taking conformational and positional differences
between ligands into account.[16] For this
purpose, numerical 3D similarity functions are available.[4] Then, if ligands with similar binding modes exceed
a predefined potency difference threshold, “3D-cliffs”
are obtained. As illustrated in Figure 2, compounds forming 3D-cliffs often (but not always)
display characteristic interaction differences in experimental structures
that are likely to explain differences in their potency. As such,
3D-cliffs provide informative examples for structure-based compound
design and excellent test cases for the calibration of computational
methods to calculate binding energies.
Figure 2
Categorization of 3D-cliffs.
Shown are representative 3D-cliffs
that were defined on the basis of binding mode similarity. 3D-cliffs
are assigned to five categories on the basis of different interactions
that distinguish highly and weakly potent cliff partners. For each
3D-cliff, the superimposed compound binding modes and their calculated
(property density function-based) 3D similarity values (black) are
provided. In addition, binding site views of individual X-ray complexes
are shown, with structures of weakly and highly potent ligands on
the left and right, respectively. Compound potency (pKi) values are reported (blue). On the right, regions with
clearly evident interaction differences between ligands forming 3D-cliffs
are highlighted using red dashed circles (Reprinted from the work
of Stumpfe et al.[4] Copyright 2014. American
Chemical Society).
In 2015, a systematic
analysis of publicly available X-ray structures
of ligands in complex with human targets identified 630 3D-cliffs
for which high confidence activity data were available in the medicinal
chemistry literature.[16] These ACs involved
61 human targets. MMP search in ChEMBL,[17] the major public repository of compounds and activity data for medicinal
chemistry, identified a total of 1980 structural analogs of 268 3D-cliff
compounds with activity against 50 human targets.[16] These analogs complemented 414 3D-cliffs and hence bridged
between 3D- and 2D-ACs, providing ample opportunities for advanced
SAR analysis.
3D-cliffs were also used to investigate the ability of different
docking and scoring approaches to predict large potency differences
between similar compounds.[18] From 158 complex
X-ray structures of nine target proteins, 146 3D-cliffs were isolated
and subjected to docking calculations. A target-based scoring scheme
was established to evaluate 3D-cliffs and predict cliff partners for
compounds with known or modeled binding modes.[18] Another study combined the analysis of corresponding 2D-
and 3D-ACs using a database of more than 1700 X-ray structures of
inhibitor complexes of 190 kinases.[19] Compound
similarity was assessed on the basis of 2D fingerprints and similarity
of crystallographic inhibitor-kinase interactions using molecular
interaction fingerprints (IFPs), hence providing a complementary assessment
of molecular and interaction similarity, leading to the introduction
of “interaction cliffs”.[19] Only ∼25% of 2D-ACs identified in this study also qualified
as interaction cliffs, due to overall low interaction similarity detected
with IFPs. However, interaction cliffs revealed ligand–target
interaction “hot spots” and were proposed to aid in
the design of new inhibitors,[19] thus widening
the spectrum of 3D-cliffs.
Concerted Formation of Activity Cliffs
Originally, ACs were investigated
on the basis of compound pairs,[2,3] consistent with how
ACs were typically encountered during sequential
compound optimization efforts.[5] However,
in compound data sets, ACs are rarely formed by “isolated”
pairs of compounds (i.e., pairs without structural neighbors forming
ACs). Rather, most ACs (>90%) are formed by groups of structural analogs
with varying potency, involving compounds in multiple ACs.[20] The formation of these “coordinated”
ACs can be rationalized using AC network representations in which
nodes represent compounds and edges pairwise AC relationships.[20] Figure 3 shows a representative example. In networks, coordinated
ACs give rise to clusters of varying size and topology. These clusters
reveal more SAR information than ACs considered as isolated pairs.
Figure 3
Activity
cliff network. The RMMP-based AC (RMMP-cliff) network
for adenosine A1 receptor ligands is shown. Nodes represent compounds
and edges represent pairwise ACs. Highly and weakly potent cliff partners
are colored green and red, respectively, while yellow nodes represent
compounds that are highly and weakly potent partners in different
ACs. For two exemplary clusters (I, II) from the network, compound
structures are displayed, and pKi values
are reported. Structural differences between AC-forming compounds
are highlighted using orange circles. For ACs, a general potency difference
threshold of ΔpKi ≥ 2 was
applied.
AC clusters are frequently formed
by a highly potent compound and
multiple weakly potent analogs or vice versa, resulting in densely
connected central nodes or “hubs” in AC networks.[20] ACs can also be visualized in “activity
landscapes”, which are generally defined as graphical representations
that integrate compound similarity and potency relationships.[4] Hubs in AC networks correspond to “AC
generators” in activity landscape models, which were defined
as compounds forming ACs with high frequency.[21] AC generators identified using fingerprint Tanimoto similarity can
be organized on the basis of scaffolds and enrichment factors can
be calculated that account for the proportion of ACs containing a
particular scaffold,[21] thereby combining
numerical and substructure-based similarity assessment.
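The network view described in this section can be illustrated with a small sketch that starts from a list of pairwise ACs, derives clusters as connected components, and flags hubs by node degree. The function names, the toy pairs, and the degree cutoff of 3 are hypothetical; the text does not state a numerical criterion for hubs or AC generators.

```python
from collections import defaultdict


def build_network(cliff_pairs):
    """Adjacency sets: nodes are compounds, edges are pairwise ACs."""
    adjacency = defaultdict(set)
    for a, b in cliff_pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def clusters(adjacency):
    """Connected components of the AC network (coordinated ACs)."""
    seen, result = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


def hubs(adjacency, min_degree=3):
    """Compounds taking part in at least min_degree ACs (assumed cutoff)."""
    return sorted(n for n, nbrs in adjacency.items() if len(nbrs) >= min_degree)


# Made-up AC pairs for illustration.
pairs = [("c1", "c2"), ("c1", "c3"), ("c1", "c4"), ("c5", "c6")]
net = build_network(pairs)
print(clusters(net))  # [['c1', 'c2', 'c3', 'c4'], ['c5', 'c6']]
print(hubs(net))      # ['c1']
```

Here c1 forms ACs with three analogs and is therefore a hub of a four-compound cluster, whereas c5 and c6 form an isolated AC pair.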
Relevant Potency Differences
While similarity criteria for
AC formation have been intensely
investigated, as discussed above, comparably little attention has
been paid to determining most relevant potency differences. In general,
as potency measurements for AC assessment, assay-independent equilibrium
constants (Ki values) are strongly preferred
over assay-dependent measurements (such as IC50 values).[4] Although it is possible to represent ACs as a
continuum of compound pairs with increasing potency differences,[22] most studies have used a predefined and constant
potency threshold across different compound activity classes (also
termed target sets).[4,5] A potency difference threshold
of 2 orders of magnitude (100-fold) has often been applied,[12,18] although a threshold of only 1 order of magnitude has also been used.[11,19] A constantly applied potency difference threshold is convenient
for computational identification and analysis of ACs across different
target sets but does not take target set-dependent differences in
potency value distributions into account.[23] However, these distributions significantly vary in target sets where
AC formation is strongly influenced by the interplay between potency
value distributions and structural relationships.[23] Therefore, ACs have also been defined on the basis of variable
target set-dependent potency difference thresholds.[24] For a given target set, it is first investigated if compound
potency variations are large enough to yield ACs,[23,24] then all qualifying RMMPs or APs are identified and their potency
differences determined. Finally, the set-dependent potency difference
threshold for AC formation is calculated as the mean of the compound
pair-based potency difference distribution plus two standard deviations.[24] Figure 4 provides a representative example.
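The set-dependent threshold (mean of the pair-based ΔpKi distribution plus two standard deviations) can be computed as in the following sketch. The use of the population standard deviation is an assumption, since the text does not specify whether the sample or population form is meant. The function names and the ΔpKi values are made up, and the preceding check of whether potency variation in a target set is large enough is not shown.

```python
from statistics import mean, pstdev


def set_dependent_threshold(delta_pki_values):
    """Mean plus two standard deviations of the pair-based delta pKi values.

    pstdev (population SD) is an assumption; swap in statistics.stdev
    if the sample SD is intended.
    """
    return mean(delta_pki_values) + 2 * pstdev(delta_pki_values)


def set_dependent_cliffs(pair_deltas):
    """Return the threshold and all pairs whose delta pKi reaches it.

    pair_deltas: dict mapping a compound pair to its absolute delta pKi.
    """
    threshold = set_dependent_threshold(list(pair_deltas.values()))
    cliffs = sorted(p for p, d in pair_deltas.items() if d >= threshold)
    return threshold, cliffs


# Made-up potency differences for ten analog pairs of one target set.
toy = {
    ("a", "b"): 0.1, ("a", "c"): 0.3, ("a", "d"): 0.2, ("b", "c"): 0.5,
    ("b", "d"): 0.4, ("c", "d"): 0.6, ("e", "f"): 0.2, ("e", "g"): 0.3,
    ("f", "g"): 2.9, ("g", "h"): 0.5,
}
threshold, cliffs = set_dependent_cliffs(toy)
print(round(threshold, 2), cliffs)
```

Because the threshold is derived from each target set's own ΔpKi distribution, sets with narrow potency ranges obtain lower thresholds than sets with wide ranges, in contrast to a constant cutoff such as ΔpKi ≥ 2.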
Figure 4
Potency difference criterion
for target set-dependent activity
cliffs. For an exemplary target set (serine/threonine-protein kinase
PIM2 inhibitors), the potency value (pKi) distribution of all compounds (CPDs) is shown on the left (each
dot represents a CPD). In addition, potency differences (ΔpKi) between CPDs forming analog pairs (APs) are
reported (middle, each triangle represents an AP). The red dashed
line indicates the potency difference threshold value for the target
set-dependent AC definition (ΔpKi ≥ 2.26). This threshold is calculated as the mean plus two
times the standard deviation (SD, δ) of the ΔpKi distribution. Red triangles represent target
set-dependent ACs. On the right, two exemplary ACs are shown.
In a systematic analysis of ChEMBL (release 23),
212 target sets
with available Ki measurements and qualifying
potency value distributions yielded a total of 16 096 set-dependent
RMMP-cliffs.[24] Most target set-dependent
potency difference thresholds fell into the range 1 ≤ ΔpKi ≤ 2.5. For comparison, when a general
potency difference criterion of ΔpKi ≥ 2 was applied, 11 773 RMMP-cliffs originating from
195 target sets were obtained.[24] Hence,
the target set-dependent potency difference criterion yielded more
ACs in more target sets than a generally applied potency difference
threshold of comparable magnitude and a more balanced distribution
of ACs across target sets. The increase in the number of set-dependent
ACs was often due to AC formation in different compound subsets, which
also further increased the amount of AC-associated SAR information.
Different Generations of Activity Cliffs
On the basis
of alternative similarity and potency difference criteria,
as discussed above, we distinguish between different generations of
ACs, as illustrated in Figure 5, which reflects the evolution of the AC concept in our work.
The first generation of ACs was based upon fingerprint descriptors,
numerical Tanimoto similarity, and predefined potency difference thresholds.[3] The transition to second generation ACs was marked
by the application of substructure-based similarity criteria such
as the formation of MMPs[12] or RMMPs[23] and the use of general or target set-dependent
potency difference criteria, leading to the introduction of MMP-cliffs[12] and RMMP-cliffs,[24] respectively. These types of ACs contained a single substitution
site, consistent with the underlying MMP formalism. Compared to numerical
similarity measures, the use of MMPs/RMMPs often increased the interpretability
and medicinal chemistry relevance of ACs and set-dependent potency
difference criteria further increased their SAR information content.
Figure 5
Evolution
of activity cliff definitions. Exemplary first, second,
and third generation ACs are shown. In each case, the similarity criterion
for AC definition is given (fingerprint-based, formation of (R)MMPs
or APs) and the applied method is specified (Tc, random or rule-based
fragmentation, CCR). Compound potency (pKi) values are provided and structural modifications in ACs are highlighted
in red.
Third generation ACs, as introduced
recently,[25] are exclusively based on variable
target set-dependent
potency difference thresholds and depend on APs enumerated from computationally
identified or already available analog series[15] such that single and multiple substitution sites are taken into
account, as illustrated in Figure 5. Thus, application of this similarity criterion has
led to the introduction of single-site and multisite ACs. A systematic
search in ChEMBL (release 24.1) revealed that only 25.6% of currently
available third generation ACs are multisite cliffs. The 4205 multisite
ACs include 3805 (90.5%) dual-site ACs,[25] as depicted in Figure 6.
Figure 6
Single- and dual-site activity cliffs. Shown are third generation
ACs with a structural modification at a single site (single-site AC,
left) or modifications at two sites (dual-site AC, right) that are
formed by kinase inhibitors given a target set-dependent ΔpKi threshold of 1.49. For the dual-site AC, single-site
analogs were identified that revealed a synergistic effect of the
two substitutions contributing to AC formation, as illustrated by
the potency value comparison of the four analogs at the right.
For 297 dual-site ACs, pairs of single-site analogs
were identified
that contained the individual substitutions of the dual-site ACs.
Comparison of the potency of dual-site AC partners with single-site
analogs identified redundant dual-site ACs, which were represented
by single-site cliffs, as well as the presence of additive, synergistic,
and compensatory potency effects in confirmed dual-site ACs.[25] Figure 6 illustrates a synergistic effect of the individual substitutions.
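The comparison of a dual-site AC with its single-site analogs can be sketched as a simple additivity check over four analogs: a reference compound, the two single-site analogs, and the dual-site partner. The labels used below are deliberately generic and only loosely correspond to the additive, synergistic, and compensatory effects named in the text; the exact published criteria are not given here, so the tolerance of 0.3 pKi units, the function name, and the classification rules are assumptions. The ΔpKi threshold of 1.49 is taken from Figure 6, whereas the pKi values are made up.

```python
def dual_site_summary(pki_ref, pki_site1, pki_site2, pki_dual,
                      threshold, tol=0.3):
    """Compare the dual-site potency change with the sum of single-site changes.

    All rules are illustrative assumptions, not the published scheme:
    - 'redundant' is set if any single-site pair among the four analogs
      already reaches the AC threshold;
    - the additivity label compares d12 with d1 + d2 within +/- tol.
    """
    d1 = pki_site1 - pki_ref    # effect of substitution at site 1 alone
    d2 = pki_site2 - pki_ref    # effect of substitution at site 2 alone
    d12 = pki_dual - pki_ref    # effect of both substitutions
    excess = d12 - (d1 + d2)

    single_site_deltas = [
        abs(d1), abs(d2), abs(pki_dual - pki_site1), abs(pki_dual - pki_site2)
    ]
    if abs(excess) <= tol:
        label = "approximately additive"
    elif excess > 0:
        label = "more than additive"
    else:
        label = "less than additive"
    return {
        "is_cliff": abs(d12) >= threshold,
        "redundant": max(single_site_deltas) >= threshold,
        "label": label,
    }


# Made-up pKi values; threshold as reported for the kinase inhibitor example.
result = dual_site_summary(5.0, 5.6, 5.7, 7.0, threshold=1.49)
print(result)
# {'is_cliff': True, 'redundant': False, 'label': 'more than additive'}
```

In this toy case, neither substitution alone produces a cliff, but together they raise potency by 2.0 pKi units, more than the 1.3 units expected from adding the individual effects, which is the kind of pattern a synergistic dual-site AC would show.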
Monitoring Activity Cliff Populations on a Time Scale
Some new findings are reported to illustrate how AC
populations
evolve over time. In ChEMBL (release 25), 96 target sets were identified
that grew over a period of 10 years (2009–2018) and contained
at least 100 APs from different analog series in 2009. For each year,
the number of third generation ACs per set was determined. Figure 7 shows the cumulative
growth in the number of compounds, APs, and ACs. While the total number
of third generation ACs available in all sets increased from 5563
to 17 539 over 10 years, the proportion of APs forming target
set-dependent ACs remained essentially constant at 4.9%. Over time,
there was only a slight increase in the proportion of dual-site and
multisite ACs with three or more substitution sites relative to single-site
ACs. Hence, ACs were formed at a constant rate over time in different
target sets in the presence of stable compound structure-potency
relationships and single-site ACs dominated AC populations.
Figure 7
Growing numbers
of third generation activity cliffs. For 96 target
sets, cumulative growth in the number of compounds (CPDs), APs, and
third generation ACs is monitored from 2009 to 2018 (top left). In
addition, time-dependent changes in the proportion (percentage) of
single-, dual-, and multisite ACs relative to all APs are reported
(top right). Furthermore, boxplots report target set-dependent ΔpKi values for APs over time for all target sets
(gray background) and subsets with increasing compound growth (bottom).
For each boxplot, the median ΔpKi value range is given in blue and the number of target sets is reported
in parentheses.
Conclusions and Outlook
Herein, foundations of the AC concept and its evolution have been
reviewed. ACs are of interest in both computational and medicinal
chemistry. They are explored on a case-by-case basis, especially during
early stages of compound optimization, as well as on a large scale
through systematic computational analysis of compound collections
and activity data. In computational chemistry, there is also interest
in predicting ACs using machine learning, and a few models have been
reported so far, yielding promising prediction accuracy.[11,26,27]
Most of our current knowledge
about ACs has originated from compound
data analysis. In growing compound data sets, ACs are formed at an
essentially constant rate. Large numbers of ACs have been identified
across current target sets, providing a wealth of SAR information
for medicinal chemistry. In addition to applying molecular graph-based
similarity measures, ACs can also be defined on the basis of compound
binding modes and 3D similarity, providing interesting examples and
test cases for drug design.
Several factors influence AC analysis.
First and foremost, compound
similarity and potency difference criteria are of critical importance
for the definition, perception, and utilization of ACs.
Clearly,
alternative AC definitions change AC populations and SAR
information associated with them. Another critical aspect of AC analysis
is that experimental accuracy and measurement characteristics must
be carefully considered. For example, AC analysis is not reliable
if incompatible measurement types such as Ki and IC50 values are used in combination. In general,
high-confidence activity data are strongly preferred.
A recent
conceptual advance has been the introduction of target
set-dependent ACs taking differences in potency value distributions
into account and arriving at set-dependent potency difference thresholds.
These ACs best reflect SAR characteristics of different compound classes
and data sets. The evolution of the AC concept is well illustrated
by distinguishing different generations of ACs and their characteristics,
as discussed herein.
The AC concept is subject to further extensions.
For example, an
interesting task will be the evaluation of ACs during sequential optimization
efforts, considering alternative compound series in parallel. Since
most large-scale analyses of ACs have focused on data sets from heterogeneous
sources, the frequency with which ACs occur during series-based compound
optimization is currently unknown. A thorough analysis of this topic
has thus far been prohibited by limited public availability of “real
life” lead optimization series from drug discovery. Another
interesting area for future research will be the use of the AC knowledge
base to better understand structural features that determine compound
potency across different target sets. This also provides further opportunities
for machine learning and significance analysis of molecular features
that determine successful predictions. Clearly, despite progress made
in rationalizing ACs over the years, we are just beginning to explore
their potential for practical applications. There is more to come.