
Computational Modeling of Supramolecular Metallo-organic Cages: Challenges and Opportunities.

Tomasz K Piskorz¹, Vicente Martí-Centelles², Tom A Young¹, Paul J Lusby³, Fernanda Duarte¹.

Abstract

Self-assembled metallo-organic cages have emerged as promising biomimetic platforms that can encapsulate whole substrates akin to an enzyme active site. Extensive experimental work has enabled access to a variety of structures, with a few notable examples showing catalytic behavior. However, computational investigations of metallo-organic cages are scarce, not least due to the challenges associated with their modeling and the lack of accurate and efficient protocols to evaluate these systems. In this review, we discuss key molecular principles governing the design of functional metallo-organic cages, from the assembly of building blocks through binding and catalysis. For each of these processes, we review computational protocols, considering their inherent strengths and weaknesses. We demonstrate that, while each approach has its own specific pitfalls, these methods can be powerful tools for rationalizing experimental observables and guiding synthetic efforts. To illustrate this point, we present several examples where modeling has helped to elucidate fundamental principles behind molecular recognition and reactivity. We highlight the importance of combining computational and experimental efforts to speed up supramolecular catalyst design while reducing time and resources.


Year: 2022 PMID: 35633896 PMCID: PMC9127791 DOI: 10.1021/acscatal.2c00837

Source DB: PubMed Journal: ACS Catal Impact factor: 13.700

Introduction

Nature provides stunning examples that show how the organization of relatively simple building blocks leads to vital functions, from compartmentalization to catalysis. Inspired by these observations, chemists have attempted to design artificial structures in the laboratory that can also deliver useful properties. Although these structures are yet to match the performance of nature’s systems, impressive progress has been made in the synthesis of increasingly complex molecular systems, highlighted by the award of the 2016 Nobel Prize in Chemistry for molecular machines. Enzymes, in particular, provide inspiration for the design of self-assembled catalysts.[1−3] Prominent architectures include porous organic cages (POCs)[4,5] and metallo-organic cages.[6,7] In particular, metallo-organic cages have emerged as important bioinspired systems due to their tunability and predictable structure, with applications in drug delivery,[8−10] chemical sensing,[11,12] recognition,[13−17] separation,[18−23] cargo transport,[24] stabilization of the reactive state,[25−29] and catalysis.[30] However, creating metallo-organic cages that mimic the way enzymes work is challenging. This is because enzymes are complex biopolymers containing many different residues that only become functional upon correct folding. The formation of a hollow interior containing a network of noncovalent interactions, such as hydrogen bonds, ion-pairing, and van der Waals interactions, enables enzymes to selectively bind substrates, stabilize transition states (TSs), and achieve catalytic turnover. While metallo-organic cages can mimic several of these features and have the advantage of being easier to (re)design, synthesize, and prepare than enzymes,[31] it remains challenging to control self-assembly beyond highly symmetric systems (Figure ).[32,33] Therefore, current efforts have centered on developing synthetic protocols to obtain cages that are easy to functionalize or with low-symmetry cavities.[34−39]

Figure 1

Comparison between enzymes and self-assembled metallo-organic cages. The design of metallo-organic cage catalysts and property prediction requires a detailed understanding of each of the following stages: structural design, self-assembly, binding, catalysis, and release.

Alongside new synthetic methods, computational molecular modeling has been employed to investigate structural parameters of metallo-organic cages, such as the volume of the cavity and the metal-to-metal distance. However, more recently, it has also been used to study their binding and catalytic properties, shedding light on the molecular features driving selectivity and activity. Efficient computational tools have also emerged to rapidly predict molecular properties, enabling synthetic chemists to quickly screen multiple cage designs before attempting their synthesis in the laboratory.[40,41] Combining computational and experimental efforts could substantially speed up the design of functional metallo-organic cages, reducing time and resources.

From Building Blocks to Catalytic Metallo-organic Cages

As is the case for enzymes, where folding mainly depends on the sequence of amino acids and their environment, self-assembly is determined by the nature of the ligand(s) and metal(s) building blocks and the experimental conditions. Understanding how assembly takes place becomes particularly challenging when the number of assembling components increases, as many intermediates may be possible. Following assembly, the precise recognition and uptake of a given substrate(s) are determined by the size, shape, and electrostatic complementarity of the host–guest(s) complex. While achieving size and shape complementarity is relatively straightforward using 3D models, predicting binding and catalysis is much more difficult because the processes involved are complex and difficult to characterize experimentally. As a result, most of the catalysts reported to date have been obtained via a trial-and-error approach and/or chemical intuition.[42−45] Therefore, for computational chemistry to contribute to the discovery of novel catalysts, it is necessary to develop better computational protocols that can accurately and efficiently quantify solvation, dynamics, and electrostatic effects at the reactant, TS, and product stage. This will, in turn, facilitate experimental design and speed up the identification of new catalysts (Figure ).

General Self-Assembly Principles

Understanding the design principles underlying metal-driven self-assembly is essential for creating discrete assemblies with well-defined internal environments. Several design strategies have been developed, enabling access to structures with an ever-increasing number of components, albeit often focused on symmetric homoleptic systems that only use two different components, i.e., one ligand and one metal building block.[38] More recently, strategies to obtain low-symmetry structures, i.e., heteroleptic cages with different organic building blocks, have been developed.

Symmetric Cages

Over the years, a series of synthetic strategies that exploit metal centers as structural building blocks have been introduced to rationalize and design increasingly large and diverse homoleptic structures. Two main approaches include directional bonding and symmetry interaction, which are based on either control of bonding vectors of the metal precursor or control of the overall symmetry of the components.[46] While they have illustrated the power of geometrical considerations when designing novel assemblies, a clear-cut division remains challenging, especially when the classification is done a posteriori to rationalize rather than design a given system. Below we briefly describe these strategies and refer the reader to relevant reviews on the topic for further details.[6,7,47] The directional bonding approach coined by Stang et al.[6] exploits the use of metallo building blocks to “direct” ligands onto the edge and/or the face of a polygon or polyhedron (Figure a). The outcome of the assembly reaction is mainly determined by the number and relative orientation of the acceptor and donor sites on the metal and ligand, respectively.[6,48] cis-Protected square planar complexes are the most widely used metallo component within this method, as the “vacant” coordination sites provide a 90° turn that promotes closure to give a discrete assembly. The strategy was first exemplified by Fujita in his seminal 1990 paper,[49] which showed that the combination of (en)Pd(NO3)2 (en = ethylenediamine) and 4,4′-bipyridine leads to a Pd4L4 molecular square in quantitative yield. Molecular paneling[50] can be seen as a subset of the directional bonding approach. This method, which employs planar ligands that occupy the faces rather than the edges of the cage, has yielded a number of notable cage structures that possess interesting host–guest and catalytic properties (Figure a).

Figure 2

Design strategies for homoleptic cages: (a) directional bonding approach[6,52] and its variation molecular paneling,[50] (b) symmetry interaction strategy,[51] and (c) family of roughly spherical coordination Platonic or Archimedean polyhedra.

Raymond pioneered a method he defined as the symmetry interaction approach to rationalize and predict the outcome of coordination assembly reactions using multibranched catecholate ligands and trivalent pseudo-octahedral metal ions (e.g., Ga3+/Fe3+).[51] This method uses ligand design to control the relative orientation of coordination sphere symmetries. For example, a tris(catecholate) metal(III) coordination vertex possesses a C3-axis that lies perpendicular to the chelate plane. In a M2L3 helicate, the two chelate planes must be aligned in a parallel arrangement, whereas for a M4L6 cage, the ideal angle between any two coordination planes is 70.6°. This guide is relevant to all cages that are composed of multibranched bidentate ligands and “naked” octahedral metal ions. However, there are multiple instances in which the outcome is different from what would be expected. Assemblies that use tris(bidentate) metal units are interesting (especially from a catalytic perspective) because this coordination sphere is intrinsically chiral (Figure b). Often, multimetallic structures are produced as a single diastereomer because strong mechanical coupling influences the Δ or Λ-stereoconfiguration at an adjacent coordination site. For instance, M4L6 cages most commonly, although not always, form as a 1:1 mixture of ΔΔΔΔ- and ΛΛΛΛ-enantiomers. Coordination cages that utilize the assembly of “naked” square planar metal ions with ditopic ligands have been classified as the directional bonding approach[7] but can also be considered using a symmetry interaction description.
Indeed, the simplest of these structures, the M2L4 “lantern” cage, can be defined as two MN4 coordination planes (where, e.g., N = pyridyl) that are aligned in a parallel arrangement (cf. M2L3 helicate structure). As the angle between these two planes changes, which is controlled by the bend angle in the bridging ligand, the size of the [MnL2n] (e.g., n > 2) architecture changes. Both approaches are underpinned by the same thermodynamic principles, which consider maximum site occupancy (i.e., all metal–ligand interactions are satisfied) and the formation of the smallest, minimally strained structure, maximizing the number of system particles.[53] With a few exceptions, e.g., square-triangle equilibria or prismatic structures,[6] the system’s energy minimum is the thermodynamic product. However, for larger cages, kinetic traps can preclude the formation of the predicted lowest energy cage product.[54] Fujita and co-workers have employed geometry-based design principles to design a series of MnL2n Platonic or Archimedean polyhedra (n = 6, 12, 24, 30, or 60, Figure c).[38] In this case, unprotected Pd centers were used as precursors in combination with ditopic ligands that, depending on their bend angles, give rise to different size polyhedra. While for the smallest systems (n = 6 and 12) the outcome of the assembly reaction follows the predicted product, for larger assemblies kinetic effects play a role.[33] For example, ligand L (α = 149°; Figure a), which has a nearly ideal bending angle (α = 150°) to form [Pd30L60]60+, led to the kinetically trapped [Pd24L48]48+ cage, which only upon heating was partially converted into [Pd30L60]60+.[54] Using a longer ligand, which results in slightly higher flexibility, they later obtained the expected [Pd30L60]60+ cage quantitatively.
Efforts toward the [Pd60L120]120+ cage have also serendipitously led to the self-assembly of a new series of Goldberg polyhedra, including a new topology of [Pd30L60]60+ and the giant [Pd48L96]96+ cage.[55] These examples demonstrate that, in addition to geometrical considerations, aspects such as ligand flexibility and experimental conditions also affect the final assembly as they may favor the formation of kinetic traps. Therefore, accounting for these effects in modeling is essential for the successful computational design of cages.

Low-Symmetry Cages

While the strategies mentioned above have led to impressive structures, they are restricted to symmetric cages, often involving just a single ligand, limiting complexity inside the cavity. Low-symmetry cages have been obtained using either heteroleptic designs based on (i) steric hindrance, (ii) coordination sphere engineering, and (iii) shape complementarity or homoleptic designs (iv) using low-symmetry ligands.[34,35] Most recent strategies to obtain low-symmetry structures have focused on the [Pd2L4]4+ “lantern” topology, which is also outlined below.

Steric Hindrance

Hooley and Johnson have exploited steric hindrance between endohedrally modified ligands to obtain heteroleptic cages. Using a mixture of bispyridyl ligands, L and L, they obtained the heteroleptic [Pd2LL3]4+ cage (Figure a(i)).[56] One of the main disadvantages of this method is that the functional group occupies the cavity of the cage, which blocks the binding of guests. Clever and co-workers have shown that appending ligands with steric bulk does not necessarily lead to cages with blocked cavities. They showed that an unusual [Pd4L4L′4]8+ tetrahedral cage could be formed from a mixture of exohedrally modified and nonmodified ligands by balancing the entropic tendency to form smaller assemblies and repulsion between bulky functional groups, resulting in low-symmetry unoccupied cavities.[57]

Figure 3

Design strategies to obtain low-symmetry cages. (a) Thermodynamic control: (i) steric hindrance, (ii) coordination sphere engineering, (iii) shape complementarity, and (iv) homoleptic cages via low-symmetry ligands. Red spheres indicate the methyl group proximal to the metal ion. (b) Kinetic control.

Coordination Sphere Engineering

The coordination sphere engineering approach uses substituted ligands that, due to steric or noncovalent interactions around the coordinating atom, disfavor the formation of homoleptic cages. Fujita and co-workers pioneered this strategy and demonstrated that prismatic assemblies that incorporate ditopic and tritopic ligands could be favored by exploiting a sterically hindered 2,6-dimethyl-substituted pyridyl motif, which stopped self-sorting.[58] Clever and co-workers have also employed this strategy to assemble a heteroleptic cis-[Pd2L2L2]4+ cage from 6- or 2-methyl-substituted ligands.[59] When the homoleptic [Pd2L2]4+ assembly with ligands L or L was attempted, steric hindrance around the metal center disfavored cage formation; instead, [Pd2L3(solvent)2]4+ and [Pd2L2(solvent)4]4+ structures were generated under kinetic control. When the combination of ligands L and L was used, the “in/out” orientation of methyl groups resulted in the formation of cis-[Pd2L2L2] cages under thermodynamic control (Figure a(ii)).

Shape Complementarity

In this strategy, two shape-complementary ligands are combined, resulting in enthalpic destabilization of the homoleptic species and, in some cases, entropy reduction of the heteroleptic cage due to size reduction. Zhou and Li originally employed this approach in a series of CuII-based cages,[60] where the heteroleptic cage was achieved via ligand displacement from a homoleptic cage. Clever and co-workers have also employed this approach to obtain thermodynamically stable cis-[Pd2L2L2]4+ and trans-[Pd2L2L2]4+ heteroleptic cages either from the mixture of precursors or via ligand substitution from their homoleptic cage precursors (Figure a(iii)).[61]

Low-Symmetry Ligands

Low-symmetry homoleptic cavities can also be obtained using low-symmetry ligands.[62−65] This strategy uses coordination sphere engineering and/or shape complementarity introduced in a single ligand. For example, Lewis and co-workers employed a low-symmetry ligand containing a 2-substituted pyridyl donor to generate low-symmetry [Pd2L4]4+ cages for which four different isomeric forms exist. The increased steric hindrance around the metal center and the use of different linker lengths resulted in misalignment around the metals, favoring the formation of the unsymmetrical cage (Figure a(iv)).[36] While most unsymmetric cages are formed using one of the strategies described above, other notable examples exist. For example, Crowley and co-workers have studied the sequential substitution of ligands by reacting the [Pd2L4]4+ cage with 2-amine-substituted bispyridyl ligand, L.[66] Rather than observing the expected [Pd2L4]4+ thermodynamic product, they obtained the kinetically trapped heteroleptic [Pd2L2L2]4+ cage (Figure b), which did not undergo further exchange after being left for 40 days at room temperature. Presumably, the amino groups of the ligands shield the palladium ion and prevent ligand exchange. Obtaining kinetically trapped cages could be an alternative strategy for designing heteroleptic cages, but it remains an unexplored direction in the field.[67]

Computational Cage Design

In recent years, the computational prediction of synthetically viable structures via in silico screening has become increasingly popular,[68−71] complementing the synthetic strategies described above. This has been possible thanks to the growth in computational power and algorithmic improvement of modeling software. Different modeling techniques are currently available, where the choice of the method depends on the size of the system and process under study and the resources available. These approaches can be generally divided into classical molecular mechanics (MM) and quantum mechanics (QM) approaches.

Classical Approaches

Classical force fields (FFs) describe atoms as charged points with Lennard-Jones interactions linked by springs representing bonds, allowing the evaluation of potential energies with a simple and computationally efficient algorithm. As a result, systems with millions of atoms can be simulated on millisecond time scales.[72] Several force fields exist for describing organic molecules, including the universal force field (UFF),[73] the general AMBER force field (GAFF),[74] the CHARMM general force field (CGenFF),[75] the optimized potentials for liquid simulation force field (OPLS),[76] and the OpenFF family.[77] The UFF includes parameters for most atoms of the periodic table (including metals) and has also been extended to metal–organic frameworks (MOFs, UFF4MOF).[78] These FFs have been carefully parametrized to reproduce, for example, hydration free energies, partition coefficients, QM energy profiles, or vibrational frequencies, often targeting biological systems; however, their intrinsic simplicity means that they provide limited quantitative estimates of, for example, binding free energies.[79] Moreover, while classical FFs can describe noncovalent interactions and self-assembly, they do not allow the study of the formation or breaking of covalent bonds, making them unsuitable for studying chemical reactions and catalysis. Indeed, only a few exceptions exist, such as ReaxFF and the empirical valence bond (EVB) approach, which require extensive parametrization.[80,81] Modeling metal-containing systems using current FFs is particularly challenging, as FFs often lack parameters for metal centers or even protocols to obtain them. Moreover, when they exist, problems associated with their stability during molecular dynamics (MD) simulations often appear. 
For example, metals may strongly interact with counterions or repel other metal centers nearby.[82] Three main protocols have been reported to model metal ions classically; most of them aim to reproduce the geometries and solvation free energies of aqua complexes. They include the commonly used soft-sphere model,[83,84] in which the metal–ligand interactions are described through electrostatic and van der Waals terms only. While these models are simple to parametrize, they are unable to simultaneously reproduce two or more experimental properties, e.g., first solvation shell and hydration free energy.[84] The covalent bond model includes predefined covalent bonds between the metal and ligands, which enhance stability; however, it precludes ligand exchange. The Seminario method,[85] automated in the MCPB.py protocol,[86] is often used to obtain bonded parameters. To account for charge transfer between the ligand and the metal, partial charges are also recomputed. Finally, the dummy model describes the metal center as a set of cationic dummy atoms placed around the metal nucleus, encouraging a specific coordination geometry on the metal center.[87] Since this model allows for breaking metal–ligand bonds, it is the model of choice for self-assembly studies.[88−90] As described below, all these methods have been used to model metallo-organic cages with different levels of success.
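
The functional form described above (point charges, Lennard-Jones interactions, and harmonic "springs" for bonds) can be illustrated with a minimal sketch. This is not the implementation of any named force field: the function names, the Lorentz-Berthelot combining rule, the AMBER-style bond convention without a factor of 1/2, and all parameter values are assumptions chosen for illustration only. In a soft-sphere description of a metal, only the nonbonded terms act between the metal and its donor atoms; a covalent bond model would additionally apply a term such as `harmonic_bond` to each metal-donor pair.

```python
import math

# Coulomb constant in kcal mol^-1 Å e^-2, the convention used by many MD codes.
COULOMB_CONST = 332.0637


def harmonic_bond(r, r0, k):
    """Bond 'spring' energy k*(r - r0)**2 (assumed AMBER-style, no factor 1/2)."""
    return k * (r - r0) ** 2


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy; the minimum is -epsilon at r = 2**(1/6)*sigma."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)


def coulomb(r, qi, qj):
    """Electrostatic energy between two point charges (in units of e) in vacuum."""
    return COULOMB_CONST * qi * qj / r


def nonbonded_energy(coords, charges, lj_params, excluded=frozenset()):
    """Sum Lennard-Jones and Coulomb terms over all non-excluded atom pairs.

    coords    : list of (x, y, z) tuples in Å
    charges   : list of partial charges in e
    lj_params : list of (epsilon, sigma) per atom
    excluded  : set of (i, j) index pairs with i < j that are skipped,
                e.g. directly bonded atoms
    """
    total = 0.0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if (i, j) in excluded:
                continue
            r = math.dist(coords[i], coords[j])
            # Lorentz-Berthelot combining rules (an assumption of this sketch).
            eps = math.sqrt(lj_params[i][0] * lj_params[j][0])
            sig = 0.5 * (lj_params[i][1] + lj_params[j][1])
            total += lennard_jones(r, eps, sig) + coulomb(r, charges[i], charges[j])
    return total


# Hypothetical metal-donor pair treated with nonbonded terms only;
# charges and Lennard-Jones parameters are placeholders, not fitted values.
pair_energy = nonbonded_energy(
    coords=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    charges=[2.0, -0.5],
    lj_params=[(0.01, 2.5), (0.17, 3.25)],
)
```

Because the metal-donor interaction here is purely nonbonded, nothing prevents the donor from leaving, which is why such models permit ligand exchange while bonded models do not.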

Quantum Approaches

To reliably quantify the origin of catalysis, ab initio (wave function-based) or density functional theory (DFT) methods are necessary. They allow the optimization of geometries and the calculation of energies and relative (free) energies. However, their applicability is limited; even low-cost DFT methods (e.g., B97-3c[91] and PBEh-3c[92]), which can be applied for energy calculations with up to 1000 atoms, are computationally impractical for larger systems.[92] Semiempirical QM methods, such as PMx methods[93−95] and the more recently developed extended tight-binding methods of the xTB family,[96,97] provide an efficient alternative to optimize large structures. For instance, the xTB family enables optimization and thermochemistry evaluation of systems with up to 1000 atoms, including metal centers. In the field of metallo-organic cage modeling, currently used methods include DFT calculations to quantify the relative stability of cage conformers/isomers with structures optimized at either the DFT,[66,98] PMx,[27,63,99−101] or xTB[36,102] level of theory.

Automated Tools

In the last 10 years, there has been enormous progress in open-source software development, including Open Babel[103] and RDKit,[104] which facilitate 3D conformer generation and determination of ground-state properties, such as geometries, charges, and dipoles. Indeed, several open-source tools are now available for high-throughput screening of COFs, MOFs, rotaxanes, and metallo-organic cages.[40,41,105−107] They commonly employ classical force fields or semiempirical-based algorithms for fast generation of structures allowing the creation of an extensive library of scaffolds, which subsequently is reduced by a series of filters to structures with desired properties. The computational tool HostDesigner has been developed by Hay and Firman[108] to design new hosts that can effectively bind cations[109] and anions.[110] The authors designed a sulfate host by mimicking the interaction that the anions establish with water (Figure a).[111] Their calculations indicated that sulfate forms up to 12 hydrogen bonds with water; these interactions were mimicked with six urea molecules that formed a T-symmetry [SO4(urea)6]2– complex. They then employed [Ni4L6]8+ whereby [Ni(bpy)3]2+ molecules occupy the vertices of the tetrahedron with sulfate in the center. By simultaneously screening linkers and varying the positions and orientations of the vertices and complex, they designed and synthesized a cage with a higher affinity toward sulfate than any available sulfate receptor synthesized to date.

Figure 4

High-throughput screening. (a) HostDesigner procedure to generate a cage with high affinity toward SO42–, (b) stk procedure for cage generation and its use in the identification of low-symmetry cis-[Pd2L4]4+ cages, and (c) cgbind procedure for cage generation.

Two prominent open-source cage generation tools with graphical interfaces include stk developed by Jelfs and co-workers[40] and cgbind developed by our group.[41] Both tools generate cages from user-supplied ligands and a specified, predefined topology. In their current form, they do not allow predicting the lowest energy architecture for specified ligand(s). stk was originally designed to generate structures of small linear polymers, porous organic cages, and covalent organic frameworks (Figure b).[40,112] However, it has also been extended to rotaxanes, host–guest complexes, metallo-organic cages, and MOFs.[40] The tool uses predefined topological graphs, where the building blocks are placed on the edges and the vertices of the graph. They are then joined by bonds, and in the case of covalent organic molecules, redundant atoms are removed. To ensure that atoms do not overlap, the assembly initially has bonds with exaggerated distances, which are then energy minimized using third-party optimizers, such as RDKit, xTB, Schrödinger’s Macromodel, or GULP,[113] or their Monte Carlo based MCHammer optimization.[114] Lewis, Jelfs, and co-workers have used stk in combination with UFF4MOF and xTB to screen low-symmetry cis-[Pd2L4]4+ cages.[102] A library of 60 ligands, which generated 240 cages, was screened using three filters. First, isomers with energies >6 kJ mol–1 relative to the lowest energy isomer were disregarded.
To ensure that dinuclear structures containing a square planar metal configuration were preferred over multinuclear ones, two criteria were measured: the sum of the distance of four nitrogen atoms from the plane defined by the PdN4 unit and a square planar order parameter.[115] As a result, five out of 60 ligands were synthesized, and four of them successfully formed a clean cis-isomer, which was confirmed by NMR and DOSY experiments. Our computational tool, cgbind,[41] targets the generation of metallo-organic cages from crystal-structure templates. The approach is based on finding a common motif of donor atoms in a template and input ligand. The optimal structure is found by screening different conformations of ligands and optimizing the distance of the position of the metal from the center of the cage (Figure c). Several frameworks are implemented in the code (including M2L4 and M4L6), while new ones can be added by providing a structure downloaded from the CSD or generated by a molecule editor, e.g., Spartan[116] or Avogadro.[117] The cages generated by this method have shown an excellent agreement with crystal structures (root-mean-square deviation (RMSD) < 1.5 Å). Additionally, the structures can be further optimized by xTB, MOPAC, ORCA, or NWChem by interfacing cgbind to autodE.[118] cgbind also includes a series of analysis tools, such as a maximum enclosed and escape sphere and electrostatic potential surface. Moreover, it provides a fast and straightforward way to optimize the position of substrates inside the cavity and estimate the binding affinity using a simple encoded nonbonded force field. In addition to the Python module, a limited version of cgbind is available as a web-based graphical user interface at cgbind.chem.ox.ac.uk. HostDesigner, stk, and cgbind rely on covalently connected ligands and metals of predefined architectures to reduce the configurational space to search.
As a result, some of the generated cages might have unreasonable structures. Although they can be improved, for example, by geometry optimization with xTB, this would significantly increase computational cost when exploring increasingly large systems. Moreover, the methods do not consider interactions with solvent or ions, flexibility of ligands, or entropic contributions. Therefore, they only provide information about the final structure and not their likelihood (kinetic or thermodynamic driving force) to form under specific experimental conditions. These factors could be considered, for example, by using MD simulations. However, simulating self-assembly from metal and ligand precursors might require microsecond-long MD simulation (Section ),[89,119] making the approach impractical for routine cage design. Another consideration is accessibility of the software to a broader scientific audience with less computational experience. Most of the described software is command-line based and therefore requires basic programming skills (with the exception of the web-based version of cgbind). Development of graphical user interfaces is needed for broader applicability of these methods.
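
The first screening filter described above, which discards isomers lying more than 6 kJ mol–1 above the lowest energy isomer, can be sketched in a few lines. This is only an illustration of the energy cutoff, not the workflow of ref 102: the function name `filter_isomers`, the isomer labels, and the energies are hypothetical, and the geometric filters (deviation from the PdN4 plane and the square planar order parameter) are not reproduced here.

```python
def filter_isomers(energies_kj, threshold=6.0):
    """Keep isomers within `threshold` kJ/mol of the lowest-energy isomer.

    energies_kj : dict mapping an isomer label to its energy in kJ/mol
                  (e.g. from a semiempirical optimization)
    Returns a dict of the surviving labels and their relative energies.
    """
    e_min = min(energies_kj.values())
    return {
        label: energy - e_min
        for label, energy in energies_kj.items()
        if energy - e_min <= threshold
    }


# Hypothetical energies (kJ/mol) for the four isomers of one ligand.
example = {"cis": -1250.0, "trans": -1238.5, "all-up": -1246.2, "three-up": -1241.0}
survivors = filter_isomers(example)
```

In a screening setting, a ligand would be of interest when the desired isomer (here the cis form) is the only survivor, or survives with a clear energy margin over the alternatives.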

Architecture Prediction

Reek and co-workers have employed classical modeling to predict the preferential cage architecture of homo- and heteroleptic cages with four different ligands.[120] They employed GAFF to describe the organic ligands and a covalent bond model to describe the metal center, with Lennard-Jones parameters obtained from a soft-sphere model.[84] Bonded parameters between the metal and nitrogen donor were fitted to reproduce DFT energies for configurations generated from xTB. To account for charge transfer between the donor nitrogen and the metal, the RESP method was employed,[121] which assigned a +0.26 e charge to Pd. Acetonitrile solvent was modeled implicitly using the Generalized Born model.[122] The ligands were mapped into predefined templates, representing existing and hypothetical polyhedral MnL2n architectures with n = 3–30 vertices. The relative distribution of cages was then evaluated by Boltzmann weighting. The procedure was first tested for the L ligand, originally reported by Fujita and co-workers to form a single topology, [Pd12L24]24+ (Figure a).[123,124] Reek and co-workers reproduced these results computationally, demonstrating the formation of [Pd12L24]24+ as a major product (89.1%).[120] However, the model suggests the presence of a minor assembly, [Pd15L30]30+ (Figure a), which could be experimentally detected via mass spectrometry.

Figure 5

Prediction of the [PdnL2n]2n+ cage architectures.[120] (a) Prediction of homoleptic cages for the L ligand. (b) Architecture prediction of heteroleptic [PdLL2]2n+ cages for a mixture of L and L ligands. Adapted from ref (120) with permission from the Royal Society of Chemistry.

The same approach was then extended to study assemblies for various molar fractions of L and L ligands, which are known to form [Pd12LL24–]24+ heteroleptic cages.[123] The authors suggested a thermodynamic preference for [Pd12LL24–]24+ cages when up to 0.27 mole fraction of L was used, while for a higher molar fraction of L, the [Pd24LL48–]48+ cage was preferred. This is in line with previous experimental reports (Figure b).[123] This protocol also successfully predicted the formation of homoleptic cages formed from endo- and exohedrally modified ligands; however, it failed to predict the correct architectures when using different ratios of ligand mixtures, for which larger assemblies are expected. The authors associated this failure with the differences in the ligand’s dihedral angles, which were only captured by DFT but not by the classical model.
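
The Boltzmann weighting used above to turn relative energies of candidate architectures into a predicted distribution can be sketched as follows. This is a generic illustration rather than the protocol of ref 120: the function name, the temperature of 298.15 K, and the example energies are assumptions. In particular, comparing cages of different nuclearity requires energies referred to a common amount of building blocks (for example per Pd(ligand)2 unit); how this normalization was done in the original work is not stated here.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(rel_free_energies, temperature=298.15):
    """Convert relative free energies (kcal/mol) into fractional populations.

    rel_free_energies : dict mapping an architecture label to its free energy,
                        all on a common reference (an assumption of this sketch)
    """
    e_min = min(rel_free_energies.values())
    # Shift by the minimum so the largest weight is exactly 1 (numerical stability).
    weights = {
        label: math.exp(-(energy - e_min) / (R_KCAL * temperature))
        for label, energy in rel_free_energies.items()
    }
    partition = sum(weights.values())
    return {label: w / partition for label, w in weights.items()}


# Hypothetical relative free energies (kcal/mol) for three candidate cages.
populations = boltzmann_populations({"Pd12L24": 0.0, "Pd15L30": 1.2, "Pd24L48": 3.5})
```

At room temperature, RT is about 0.6 kcal mol–1, so an energy gap of only 1 to 2 kcal mol–1 already shifts the distribution strongly toward one architecture. This makes the predicted ratios sensitive to small force-field errors.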

Self-Assembly Mechanism

While design rules enable one to assess the thermodynamic stability of the desired assembly, they do not guarantee that such a structure can be isolated or even observed, as kinetic traps may prevent the thermodynamic product from being reached. A mechanistic understanding of self-assembly could help the rational design of novel cages by identifying competing pathways, potential kinetic traps, and interconversion barriers. However, obtaining information using experimental techniques is challenging as it is often extremely difficult to detect (and reliably characterize) early-stage intermediates using noninvasive, quantifiable methods. For example, metal–ligand assemblies that form at the beginning of a reaction are not only present in low concentration, but they are also often low-symmetry structures, which makes their NMR signal weak. This contrasts with the NMR signals observed for the final closed structure, where a single resonance corresponds to many atoms in equivalent chemical environments. Therefore, computational modeling provides a promising avenue to fill this gap, complementary to experiments.

Experimental Approaches to Quantify the Self-Assembly Reaction Pathway

There are only a few experimental reports exploring the assembly mechanism of metallo-organic cages. Most notably, Hiraoka and co-workers have developed the quantitative analysis of the self-assembly process (QASAP) approach.[125,126] This method indirectly provides information about intermediate structures from measurable data, most commonly the amount of a precursor ligand that is displaced during the course of the reaction.[127−131] QASAP relies on calculating two parameters from the species that can be measured by NMR: n, the average number of metal ions bound to a ligand, and k, the metal/ligand ratio. From average (n, k) values, the progression of the sample’s composition and the presence of intermediates can be inferred. This technique has been used to study the self-assembly of [Pd2L4]4+,[127−131] [Pd3L6]6+,[132] [Pd4L8]8+,[133,134] [Pd6L8]12+,[130] [Pd12L24]24+,[135] [Pt6L6]12+,[136] and [Pt3L3]6+/[Pt6L6]12+.[137] For instance, the formation of a [Pd2L4]4+ cage from [Pd(Py*)4]2+ (Py* = 3-chloropyridine) and rigid ditopic ligands was found to proceed via intermediates [Pd2L4(Py*)2]4+ and [Pd2L4Py*]4+.[131] By monitoring (n, k) values, it was estimated that all the building blocks were consumed within the first 5 min, after which the final product formed steadily via the identified intermediates (15–300 min) (Figure a). Mass spectrometry experiments also confirmed the presence of these intermediates. The authors also obtained the activation free energy barriers for the formation of the first (ΔG1‡ = 22.3 kcal mol–1) and second (ΔG2‡ = 21.9 kcal mol–1) intermediate, in reasonable agreement with DFT calculations (ΔG1‡ = 17.5 kcal mol–1 and ΔG2‡ = 17.7 kcal mol–1, respectively), which underestimate the measured barriers by 4–5 kcal mol–1. When the same approach was applied to the formation of the [Pd2L4]4+ cage from a flexible ligand, L (Figure a),[128] it was found that the product formed slowly via a submicrometer-sized sheet intermediate rather than the intermediates described above. This state was characterized by dynamic light scattering (DLS) and transmission electron microscopy (TEM).[128]
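The (n, k) bookkeeping described above can be illustrated with a minimal sketch. It assumes that every species can be written as Pd_a L_b X_c (X being the leaving ligand, e.g., Py*), that each square-planar Pd(II) center has four coordination sites occupied by either L or X, and that n = (4a − c)/b and k = a/b. These formulas, the function names, and the example concentrations are illustrative assumptions rather than details taken from the text above.

```python
from typing import Dict, Tuple

# A species Pd_a L_b X_c is keyed by its composition (a, b, c);
# X is the leaving ligand (e.g., Py*). Hypothetical representation.
Species = Tuple[int, int, int]


def nk_single(a: int, b: int, c: int, coord: int = 4) -> Tuple[float, float]:
    """(n, k) of one species, assuming n = (coord*a - c)/b and k = a/b."""
    return (coord * a - c) / b, a / b


def nk_average(conc: Dict[Species, float], coord: int = 4) -> Tuple[float, float]:
    """Concentration-weighted (n, k) over a mixture of species."""
    a_sum = sum(x * a for (a, _, _), x in conc.items())
    b_sum = sum(x * b for (_, b, _), x in conc.items())
    c_sum = sum(x * c for (_, _, c), x in conc.items())
    return (coord * a_sum - c_sum) / b_sum, a_sum / b_sum


# Hypothetical relative concentrations of two intermediates and the cage.
mixture = {(2, 4, 2): 0.3, (2, 4, 1): 0.2, (2, 4, 0): 0.5}
n_avg, k_avg = nk_average(mixture)
print(f"<n> = {n_avg:.2f}, <k> = {k_avg:.2f}")
```

Under these assumptions, the closed [Pd2L4]4+ cage corresponds to (n, k) = (2, 0.5), so the drift of the averaged values toward that point tracks how far the mixture has progressed.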

Figure 6

QASAP and NASAP approach. (a) Assembly mechanism deduced using the QASAP approach for [Pd2L4]4+ and [Pd2L4]4+ cages.[128,131] (b) Assembly mechanism of the [Pd2L4]4+ cage elucidated using the NASAP approach. The assembly was classified into four stages to infer the main assembly pathway. Adapted from ref (126) with permission from the Chemical Society of Japan and Wiley-VCH GmbH, Weinheim.

Sato and Hiraoka have also introduced a numerical analysis of the self-assembly process (NASAP),[126,138] which complements the QASAP approach by providing information on intermediates at an early stage of self-assembly (<5 min, Figure b). NASAP identifies the major assembly pathways using a graph representation and Gillespie stochastic simulations,[139] and it has been used to probe the formation of [Pd2L4]4+,[67,140,141] [Pd3L6]6+,[142] and [Pd6L4]12+ cages.[138,143,144] For [Pd2L4]4+, 29 intermediate species, 68 elementary reactions, and four pathways were analyzed.[140] Only pathways with small intermediates (MnLm, n ≤ 2 and m ≤ 4) were considered. For this system, the analysis showed that assembly starts with the formation of [Pd2L2(Py*)5]4+, which is then converted to [Pd2L4(Py*)3]4+ and finally rearranges to the final product. Both QASAP and NASAP have enabled a better understanding of the self-assembly mechanism and, consequently, rational access to kinetically trapped cages. For example, a [Pd2L4]4+ cage[67] and a [Pd6L4]12+ square-based pyramid[144] have been obtained using this information rather than by serendipity. Both methodologies also demonstrated the importance of counteranions in the self-assembly pathway. For example, the expected cage was only formed when the templating guest BF4– was present; in contrast, only uncharacterized byproducts were obtained in the absence of any templating anion.[67]
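To make the stochastic-simulation ingredient concrete, the following is a minimal sketch of the Gillespie direct method for a single reversible binding step, M + L ⇌ ML. It is not the NASAP reaction network, which involves many species and elementary reactions; the function name, rate constants, and particle numbers are hypothetical.

```python
import random


def gillespie_binding(n_metal, n_ligand, k_on, k_off, t_end, seed=0):
    """Simulate M + L <-> ML with the Gillespie direct method.

    Counts are integers; k_on and k_off are stochastic rate constants
    (hypothetical values, in 1/time). Returns a list of
    (time, n_metal, n_ligand, n_complex) tuples.
    """
    rng = random.Random(seed)
    t, n_complex = 0.0, 0
    trajectory = [(t, n_metal, n_ligand, n_complex)]
    while True:
        a_on = k_on * n_metal * n_ligand   # propensity of association
        a_off = k_off * n_complex          # propensity of dissociation
        a_total = a_on + a_off
        if a_total == 0.0:
            break                          # no reaction can fire
        t += rng.expovariate(a_total)      # exponentially distributed waiting time
        if t > t_end:
            break
        if rng.random() * a_total < a_on:  # choose a reaction by its propensity
            n_metal, n_ligand, n_complex = n_metal - 1, n_ligand - 1, n_complex + 1
        else:
            n_metal, n_ligand, n_complex = n_metal + 1, n_ligand + 1, n_complex - 1
        trajectory.append((t, n_metal, n_ligand, n_complex))
    return trajectory


traj = gillespie_binding(n_metal=20, n_ligand=40, k_on=0.01, k_off=0.1, t_end=50.0)
print(f"{len(traj) - 1} reaction events, final state: {traj[-1][1:]}")
```

Extending such a sketch to a full assembly network would mainly require one propensity per elementary reaction and a table of stoichiometric changes; how pathways are then ranked is specific to the cited method and is not reproduced here.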

Computational Approaches to Explore Self-Assembly

Yoneya and co-workers pioneered the computational modeling of metallo-organic cage assembly. They used stochastic MD simulations and implicit solvation to study the formation of [Pd6L8]12+ and [Pd12L24]24+ cages.[89,119] To describe the ligands, they used a united-atom model, where hydrogen atoms are merged into the non-hydrogen atoms they are bonded to, and a dummy model for Pd2+, which was parametrized to reproduce the crystallographic Pd–N distance. This simplified model substantially reduced the number of degrees of freedom and enabled them to reach the microsecond time scale. Long-range electrostatic interactions were computed by the generalized reaction-field method with the relative dielectric constant of the solvent (εr = 47; DMSO).[145] To model the [Pd6L8]12+ cage, short-range electrostatic interactions were screened with various relative dielectric constants (εr = 1.0, 2.5, and 4.0) to determine the optimal one for the process under study. When εr = 1.0 (vacuum) was employed, the ligand–metal interaction was so strong that disassembly was extremely rare, preventing the correction of initially formed kinetic traps. On the other hand, εr = 4.0 led to ligand–metal exchange so fast that no stable ligand–metal complex could form. Only for εr = 2.5 did the ligand–metal exchange rate provide a balance between growth and disassembly, allowing cage formation and correction of structural defects. Subsequently, explicit DMSO molecules were also added to the system. This shortened the lifetime of small assemblies relative to that of the complete nanosphere. The simulations did not include counterions as they affected the stability of the metal model employed. 
In subsequent work, the same authors studied the formation of two larger homoleptic [Pd12L24]24+ cages from ligands with different bend angles.[124] In comparison to their previous study, a higher number of kinetic traps—closed structures with lower nuclearity than [Pd12L24]24+, i.e., [Pd6L12]12+, [Pd8L16]16+, and [Pd9L18]18+—were observed. Moreover, furan-cored ligands with slightly larger bend angles led to fewer kinetic traps, as the formation of a small cluster increased strain. Modifying atomic charges to mimic charge transfer also affected the resulting structures, further stabilizing the complexes. These results illustrate the challenges when attempting to evaluate the kinetics of assembly processes, which can be highly dependent on the dielectric of the environment, the presence of explicit solvent molecules, and the consideration of charge transfer. As mentioned above, these aspects are often evaluated via trial and error to balance ligand exchange and assembly events on a reasonable time scale. Tan and co-workers have also performed simulations in implicit solvent to study the formation of a [Hg2L4]4+ cage.[146] This cage experimentally assembles in acetonitrile and has been found to encapsulate C60 and C70, with the [C60⊂Hg2L4]4+ complex being the most stable. The guest can be removed upon the addition of Hg2+ ions, leading to the formation of the [Hg2L2]4+ metallocycle.[21] Similar to Yoneya’s approach, Hg2+ was described with a dummy model with parameters obtained to reproduce the Hg–N distances and N–Hg–N angle in [Hg2L2]4+. Organic molecules were modeled with the GAFF force field, and simulated annealing was employed to speed up conformational sampling and promote guest encapsulation. The authors identified a stepwise assembly from the [Hg2L2]4+ metallocycle to the [Hg2L3]4+ intermediate and finally to the [Hg2L4]4+ cage. 
While metallo-organic cage self-assembly remains challenging to model due to the difficulties in reliably modeling metal centers, it is expected that advances in force-field development will enable more realistic simulations with explicit solvent and counterions. For example, it would be interesting to combine MD with rare-event sampling methods, such as metadynamics,[147,148] or Markov state models,[117] to quantify the kinetics of the assembly process. These approaches have been used, for example, by Hiraoka and co-workers to study the solvent effect in the self-assembly of an organic cage formed from six aromatic ligands in aqueous methanol[149,150] but not yet to model metallo-organic cage assembly. In the cited example, Markov state models were used to study the assembly pathway and estimate the rate-limiting step. Applying similar techniques to study metallo-organic cage formation would provide valuable information about the formation of kinetic traps and their rate of interconversion, which could be exploited for unsymmetric cage design. Moreover, they could complement more demanding experimental approaches, such as QASAP and NASAP. However, for these approaches to become predictive and accurate, the description of solvent and counterions is required, as they will affect the stability of intermediates and their rate of exchange. It is worth noting that the composition of the system depends exponentially on the relative energy difference, and therefore high-accuracy FFs are required to determine the ratio of the products.
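The exponential dependence of composition on relative energies can be illustrated with a short Boltzmann-population sketch. It assumes two competing products of equal stoichiometry at thermodynamic equilibrium, relative free energies in kcal mol–1, and a temperature of 298.15 K; the function name and the energy values are hypothetical.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def boltzmann_populations(rel_energies, temperature=298.15):
    """Equilibrium populations from relative free energies (kcal/mol)."""
    beta = 1.0 / (R_KCAL * temperature)
    e_min = min(rel_energies)  # shift energies for numerical stability
    weights = [math.exp(-beta * (e - e_min)) for e in rel_energies]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical case: the "true" gap between two products is 1.0 kcal/mol,
# while a force field with a 1.0 kcal/mol error predicts 2.0 kcal/mol.
for gap in (1.0, 2.0):
    major, minor = boltzmann_populations([0.0, gap])
    print(f"gap = {gap:.1f} kcal/mol -> product ratio = {major / minor:.1f}")
```

At room temperature, each additional 1 kcal mol–1 changes the predicted ratio by roughly a factor of five, which is why small force-field errors translate into large errors in product distributions.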

Binding and Guest Release

Experimental Studies

Key Factors Controlling Host–Guest Interactions

The ability of a guest to bind to a given cage depends on a number of parameters, including size, shape, and electrostatic complementarity, as well as other factors such as the nature of counterions and solvation. Metallo-organic cages are also invariably charged, and this can have significant implications. Rebek introduced the “55% rule” to predict host–guest binding: it states that the optimal volume of the guest should be around 55 ± 9% of the internal host volume.[151] While originally developed for organic capsules, this rule-of-thumb has also been used with metallo-organic cages.[17,152,153] However, caution is needed when applying it. The cavity volume of organic capsules is usually easy to define, whereas metallo-organic cages may possess several windows, which makes it difficult to define the boundary between the inner cavity and bulk solvent and hence to calculate the internal volume. This possibly explains why the 55% rule has been applied to metallo-organic cages with varying success.[153−155] Counterions play an enormous role in the host–guest chemistry of metallo-organic cages. Often, one or several counterions act as strongly binding guest(s); this is common with small tetrahedral cages, which tend to be shape- or size-complementary to the weakly coordinating anions (e.g., BF4–, PF6–) typically used in self-assembly reactions.[156,157] Counterions can also influence the solubility of metallo-organic cages. For example, small, charge-dense counterions that are strongly hydrated have been used to create water-soluble systems with both cationic cages (using oxyanions such as nitrate) and anionic cages (e.g., use of K+ in [Ga4L6]12–). Externally hydrated counterions allow the cage to bind less polar guests, and this binding can be significantly enhanced by the hydrophobic effect. This approach has been particularly successful with cages that possess flat aromatic panels because these can create a solvophobic cavity. 
Water-soluble cage systems have been particularly relevant for catalysis as they facilitate the binding of organic substrates. Keeping the cavity free for binding organic substrates has also been accomplished using large counterions that cannot access the internal cavity. Our group has adopted this approach, exploiting tetrakis[3,5-bis(trifluoromethyl)phenyl]borate (BArF4–) counteranions.[158] This strategy has the opposite effect on solubility; it allows charged, multimetallic cages to be used in apolar solvents such as dichloromethane. This also means that the cavity acts as a polar environment in comparison to the solvent phase (cf. hydrophobic guest binding with water-soluble cages). As binding is driven by the formation of polar host–guest interactions rather than solvent–solvent interactions, this maximizes the opportunities to leverage catalytic activity using electrostatic effects.
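As a rough illustration of how the 55% rule discussed above is applied, the sketch below computes a packing coefficient (guest volume divided by cavity volume) and checks it against the 55 ± 9% window. The function names and the example volumes are hypothetical, and the cavity volume is assumed to be already known, which, as noted above, is the difficult part for cages with open windows.

```python
def packing_coefficient(guest_volume: float, cavity_volume: float) -> float:
    """Fraction of the cavity volume occupied by the guest (same units)."""
    if cavity_volume <= 0:
        raise ValueError("cavity_volume must be positive")
    return guest_volume / cavity_volume


def within_rebek_window(pc: float, centre: float = 0.55, tol: float = 0.09) -> bool:
    """True if the packing coefficient lies within 55 +/- 9%."""
    return abs(pc - centre) <= tol


# Hypothetical volumes in cubic angstroms.
for guest, cavity in [(120.0, 220.0), (80.0, 300.0)]:
    pc = packing_coefficient(guest, cavity)
    print(f"guest {guest:.0f} / cavity {cavity:.0f}: PC = {pc:.2f}, "
          f"in window: {within_rebek_window(pc)}")
```

Because the result scales inversely with the assumed cavity volume, it is worth repeating such a check with several cavity definitions before drawing conclusions.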

Mechanism of Guest Binding

Guest binding and release can be achieved via a series of mechanisms involving the expansion and partial or complete disassembly of the cage. For example, Raymond and co-workers employed bulky guests (NEt4+ and PPr4+) within [M4L6]12– cages (M = Ga3+, Ti4+, Ge4+).[159,160] These guests were much larger than the cage windows; however, they were able to bind inside the cavity via expansion of the cage (Figure a). In contrast, rupture of the cage to enable guest binding was observed by Yoshizawa and co-workers for binding of fullerene inside a [Hg2L4]4+ cage.[21]

Figure 7

Binding of guests. (a) Uptake of the guest via expansion of the cage or (partial) rupture of the cage. (b) Example of controlled uptake and release of the guest involving the complete disassembly of the cage.

Recognizing the dynamic nature of the binding process, recent efforts have shifted toward systems where guest binding is switched on/off, which has potential applications in drug delivery.[161] Since the ligand–metal bond formation is in principle dynamic, the simplest way to alter binding is by using invasive methods based on full or partial disassembly of the metallo-organic cage. Nitschke and co-workers showed switchable binding of cyclohexane inside a [Fe4L6]4– tetrahedron (Figure b).[152] This structure disassembles upon the addition of acid, presumably through the protonation and subsequent hydrolysis of the ligands. The process is then reversed by the addition of a base. Crowley and co-workers have shown that the disassembly of a [Pd2L4]4+ cage, triggered by the addition of the strongly coordinating dimethylaminopyridine (DMAP), can be used to release the chemotherapeutic cisplatin.[10] The stronger basicity of DMAP compared to the cage ligand means that the cage reassembles when treated with p-toluenesulfonic acid (TsOH). It is interesting to compare the Crowley and Nitschke methods and the orthogonal way in which acid and base can be used to trigger both disassembly and assembly of cages. More recently, Crowley has applied the DMAP-responsive chemistry to a heteronuclear [PdPtL4]4+ cage. In this example, the DMAP selectively removes the more labile Pd(II) ion.[37] Noninvasive stimuli such as light,[99,162] temperature,[98] or chemical signals[154,163,164] have also been used to open cages reversibly. 
For example, Fujita and co-workers functionalized a [Pd12L24]24+ cage with internalized photoswitchable azobenzene units.[162] Upon irradiation, the azobenzene group changed from trans to cis conformation, increasing the size of the cavity and allowing a hydrophobic guest, pyrene, to enter. This process could be reversed by heating. Clever and co-workers also inserted a photoswitch unit in the ligand of a [Pd2L4]4+ cage, resulting in contraction and expansion of the cage upon irradiation with light.[99] The contracted structure was found to have the optimal size to bind [B12F12]2–.

Computational Prediction of Binding

Predicting host–guest binding is one of the central goals of computational chemistry.[165] As a result, several methods have been developed, which differ in their accuracy and computational cost. For example, efficient but low-accuracy methods such as docking allow screening of a large number (∼10⁹) of possible guests.[166−168] Docking relies on sampling possible binding modes that are subsequently ranked by a scoring function, which can be empirical, knowledge-based, or force-field-based. Related approaches include linear interaction energy (LIE)[169] and molecular mechanics Poisson–Boltzmann/generalized Born surface area (MM-PBSA/MM-GBSA),[170−172] which also include conformational sampling from short MD simulations and the implicit or explicit consideration of solvent. On the other hand, more accurate methods include free energy perturbation (FEP),[173] umbrella sampling (US),[174] and metadynamics,[147,148] which are based on extensive MD simulations of the unbound guest and host, and of the bound guest–host complex, in explicit solvent. Although these methods have relatively high accuracy (<2 kcal mol–1 error),[79] their computational cost makes them unsuitable for screening. Ward, Hunter, and co-workers employed the GOLD (Genetic Optimization for Ligand Docking) package[175] to predict the binding affinity of 54 guests inside the [Co8L°12]16+ cage for which experimental data were available (Figure a).[176] Initially, they employed CHEMPLP,[177] a scoring function designed for protein–ligand interactions; however, this resulted in a poor correlation with experiment (R² = 0.02). A significant improvement was achieved by reparameterizing the scoring function to predict the association constant (log K) directly. The modified scoring function included only four parameters (ligand_clash, ligand_torsion, nonpolar, and buried part) and gave an RMSD of 1.0 kcal mol–1. 
Further improvement, especially for flexible guests, was obtained by including the number of rotatable bonds as a fifth parameter (RMSD = 0.5 kcal mol–1; Figure b). Using this function, they screened 3000 potential binders, from which 15 were experimentally characterized. An excellent agreement between computed and experimental binding affinity was obtained (RMSD = 0.5 kcal mol–1). Moreover, they found a new guest with a much higher binding affinity than previously reported, demonstrating that correct docking parametrization for a specific cage–guest system can aid the identification of strongly interacting substrates.
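The two quality metrics quoted above, RMSD and R², measure different things, which the following sketch makes explicit. It assumes that R² denotes the squared Pearson correlation coefficient (the source does not state which definition was used); the function names and the data are hypothetical.

```python
import math


def rmsd(pred, ref):
    """Root-mean-square deviation between predicted and reference values."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))


def r_squared(pred, ref):
    """Squared Pearson correlation coefficient (one common meaning of R^2)."""
    n = len(pred)
    mp, mr = sum(pred) / n, sum(ref) / n
    cov = sum((p - mp) * (r - mr) for p, r in zip(pred, ref))
    var_p = sum((p - mp) ** 2 for p in pred)
    var_r = sum((r - mr) ** 2 for r in ref)
    return cov * cov / (var_p * var_r)


# Hypothetical binding free energies in kcal/mol: predictions that are
# systematically 1 kcal/mol too weak still correlate perfectly.
experiment = [-4.0, -5.5, -6.0, -7.2]
predicted = [e + 1.0 for e in experiment]
print(f"RMSD = {rmsd(predicted, experiment):.2f} kcal/mol, "
      f"R^2 = {r_squared(predicted, experiment):.2f}")
```

A scoring function used to rank guests mainly needs a high correlation, whereas one used to predict absolute affinities also needs a low RMSD, so reporting both, as in the study above, is informative.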

Figure 8

Docking of guests inside [Co8L°12]16+.[176] (a) [Co8L°12]16+ cage structure. (b) Correlation between computed and experimental binding affinity (blue dots). The function was used to predict the binding of 3000 substrates, from which 15 were synthesized and their binding affinity calculated (red dots). Adapted from ref (176) with permission from the Royal Society of Chemistry.

Computational Mechanistic Binding Studies

Complementary to experimental characterization, detailed computational studies have shed light on the mechanisms of guest encapsulation and release and on the flexibility of the cage during these processes. Such studies have examined, for example, the binding of fullerenes inside [Pd4L8]8+,[178−180] charged guests inside [Ga4L6]12–,[153] guest binding in photoswitchable [Pd2L4]4+ cages,[181] and photoswitchable guests inside the [Pd6L4]12+ cage.[182] Since binding occurs on the millisecond to second time scale, which is practically impossible to reach using conventional MD simulations, rare event sampling methods such as US,[174] accelerated MD (aMD),[183] the attach–pull–release (APR) method,[184] and metadynamics[147,148] are employed. For example, Ribas and co-workers studied the binding of fullerenes inside [Pd4L8]8+ cages using a combination of 1H–1H exchange spectroscopy (2D-EXSY) NMR, conventional MD, and aMD with explicit solvent.[180] The mechanism of fullerene binding was found to be regulated by the aromatic rings of cage ligands, which act as gatekeepers. The rate of the guest entrance was determined by the rotation of aromatic rings along the ligand axis. Ujaque and co-workers used the APR method to study the binding of cationic guests inside [Ga4L6]12– (Figure ). Their computed binding affinities were within 2.3 kcal mol–1 of those obtained by NMR. Similar to Ribas and co-workers, they observed that the rotation of aromatic rings of the cage’s ligands controls the guest’s entrance, becoming the rate-limiting step for binding. Moreover, they found that the binding affinity strongly depends on the method of parametrization of the cage. For instance, using implicit solvation to calculate bonded parameters and partial charges significantly improved the results. In addition, low-cost metrics, such as the relative guest/cavity volume or the guest volume, correlated well with the binding affinities (R² = 0.97 and R² = 0.86, respectively). 
Despite the significant difference between the sizes of the guests (80–160 Å³), it was noticed that Rebek’s 55% rule holds for these guests if the encapsulated solvent molecules are considered.

Figure 9

Computational mechanistic studies of binding. Binding pathway of NEt4+ inside the [Ga4Lp6]12– cage. Adapted from ref (153) with permission from the American Chemical Society.

Schäfer and co-workers have also studied the controllable uptake and release of [B12F12]2– in a [Pd2L4]4+ cage with photoswitchable ligands.[181] They calculated the binding affinity using US with explicit solvent, and the result compared well with the values obtained from experiments (computed −6.7 kcal mol–1 vs −6.4 and −5.9 kcal mol–1 from ITC and NMR, respectively). Moreover, they estimated activation barriers for the removal of the guest from cages with open, closed, or mixed open/closed ligands. When all the ligands are in a closed configuration, the binding affinity and activation barrier decrease significantly, suggesting that the guest is released after closure of the first ligand of the cage. Finally, Pavan and co-workers studied the binding and the cis–trans isomerization of azobenzene inside a flexible [Pd6L4]12+ cage.[182] They performed MD simulations in explicit water solvent using a covalently bound metal model. Metadynamics simulations were used to calculate binding and kinetic parameters (kon and koff). Moreover, the trans–cis isomerization time (τ = 1/k) of azobenzene inside the cage was found to be orders of magnitude shorter than the residence time of the encapsulated guest (τoff), suggesting that isomerization occurs inside the cage. These examples demonstrate the unique opportunities that molecular modeling provides to gain insights into the mechanisms driving binding, which could be used for further optimization.
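The relationship between the kinetic parameters mentioned above and the binding thermodynamics can be sketched as follows. The example assumes simple two-state binding, a 1 M standard state, and a temperature of 298.15 K; the function names and rate constants are hypothetical and are not values from the cited studies.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal mol^-1 K^-1


def residence_time(k_off: float) -> float:
    """Mean residence time of the bound guest, tau_off = 1/k_off (s)."""
    return 1.0 / k_off


def binding_free_energy(k_on: float, k_off: float,
                        temperature: float = 298.15, c0: float = 1.0) -> float:
    """Standard binding free energy (kcal/mol) from k_on (M^-1 s^-1) and k_off (s^-1).

    Uses K_d = k_off / k_on and dG = RT ln(K_d / c0), with c0 = 1 M.
    """
    k_d = k_off / k_on
    return R_KCAL * temperature * math.log(k_d / c0)


# Hypothetical rate constants and isomerization time.
k_on, k_off, tau_iso = 1.0e6, 1.0, 1.0e-3
tau_off = residence_time(k_off)
print(f"dG_bind = {binding_free_energy(k_on, k_off):.1f} kcal/mol")
print(f"tau_off = {tau_off:.1e} s, tau_iso = {tau_iso:.1e} s")
print("isomerization is expected inside the cage:", tau_iso < tau_off)
```

Comparing τ for a process inside the host with τoff is the same argument used above to conclude that isomerization occurs before the guest leaves.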

Catalytic Activity

Brief Summary of Experimental Studies

Mimicking enzymes by utilizing the defined cavities of supramolecular systems has been of significant interest to supramolecular chemists for many decades.[1,185] Prominent catalytic coordination cages include the following: the [Ga4Lp6]12– cage, originally developed by Raymond and studied in collaboration with Bergman and Toste, which has been shown to catalyze a broad range of reactions, including aza-Cope[186−189] and Prins rearrangements,[190] Nazarov cyclization,[191−194] hydrolysis of acid-labile compounds under basic conditions,[42,195,196] and alkyl–alkyl reductive elimination;[197,198] the octahedral and bowl-shaped [Pd6L4]12+ cages developed by Fujita and co-workers, and other examples by Mukherjee,[199] which have been shown to catalyze Diels–Alder[200−203] and Knoevenagel reactions;[204] and the octanuclear cubic [Co8L°12]16+ cages by Ward and co-workers, which have been shown to promote the Kemp elimination,[205] phosphate ester hydrolysis,[206] and aldol reactions[207] (although the latter two examples have been shown to occur on the cage surface rather than inside). Recently, we demonstrated the catalytic activity of the [Pd2L4]4+ topology in Diels–Alder,[45] Michael addition,[208] and radical–cation cycloaddition reactions.[209] For comprehensive reviews focused on experimental studies of cage catalysis, we refer to the relevant literature.[30,210−212] While these promising examples demonstrate the potential of metallo-organic cages to achieve selectivity and activity not possible with other synthetic catalysts, the limited number of examples shows that the design of these systems is challenging. In the examples reported to date, acceleration occurs either because of enthalpic stabilization, i.e., efficient and selective recognition of intermediates and TSs over the reactants and products, or via entropic mechanisms, i.e., increasing the effective concentration of reactants or constricting acyclic substrates. 
However, catalysis also requires turnover; therefore, the relative association constants of the reactants and products are key. In practice, all these requirements are often difficult to balance.

Computational Cage Catalysis

Computational modeling provides atomic-level insight into the fundamental aspects of cage catalysis, helping to rationalize experimental observables and predict possible outcomes.[213] From these investigations, several factors have been identified as crucial for catalytic activity, including reduction of entropy, (relative) destabilization of reactant complexes, TS stabilization, distortion, and microsolvation. In the next paragraphs, we discuss relevant examples where computation has helped elucidate the origin of metallo-organic cage catalysis. We will not cover computational studies on supramolecular capsules (organic noncovalent cages) and refer the reader to relevant works in this area.[213−216]

Gallium [Ga4L6]12– Cage

Orthoformate Hydrolysis

The cage-catalyzed hydrolysis of orthoformates and acetals was reported by Raymond and co-workers, who showed that the [Ga4L6]12– cage promotes these reactions at high pH, whereas the bulk-phase reaction only occurs under acidic conditions.[42] Warshel and co-workers were the first to computationally study the mechanism of this reaction inside the [Ga4L6]12– cage (Figure a).[217] They employed the EVB approach,[81] which was parametrized to fit the reference hydrolysis of two orthoformates in water. These parameters were used unchanged to model the reaction inside the cage (Figure b).[217] While TS stabilization was found to be important, the overall catalytic activity was found to primarily arise from electrostatic preorganization of the H3O+ species, leading to a very low “local pH” inside the cage even when the external solution was kept at a high pH.

Figure 10

Catalysis in the [Ga4L6]12– cage. (a–d) Structure and reactions catalyzed by the cage.[197] (e–h) Computational studies on the origin of catalysis for reductive elimination: (e) electrostatic effect,[197] (f) solvation effects inside and outside the cage,[219] (g) explicit solvation and encapsulation,[221] and (h) catalysis of triethyl-substituted complex.[222]

Aza-Cope Rearrangement

The cage-catalyzed 3-aza-Cope rearrangement of allyl enammonium cations to iminium cations (which subsequently hydrolyze to the corresponding aldehydes) was also reported by Raymond and co-workers. The reaction was accelerated by a factor of 850 inside the [Ga4L6]12– cage (Figure c).[187,189] The origin of this effect and the preference for the R-enantiomer were computationally investigated by Nakajima and co-workers.[218] They performed QM/MM calculations, with the cage and substrate modeled at the QM level (B97D/def2-SV(P) and MP2/def2-SV(P)//B97D/def2-SV(P)) and the 12 countercations in the MM region. The effect of solvent, modeled implicitly, was found to be negligible. The authors computed only enthalpies, as it had been shown experimentally that entropy reduction plays a major role in catalysis but not in enantioselectivity. The computed enthalpies of activation, ΔH‡, for the uncatalyzed and catalyzed reactions were in excellent agreement with experimental values (uncatalyzed: comp. 24.5 kcal mol–1 vs exp. 23.6 kcal mol–1; catalyzed: comp. 21.7 kcal mol–1 vs exp. 22.7 kcal mol–1). The preference for the R-product was suggested to originate from the different stability of the prochiral structures inside the cage due to deformation of the bulky substituent.

Reductive Elimination

In 2015, Toste and co-workers reported that the C–C reductive elimination reaction of high-valent [Au(III), Pt(IV)] metal alkyl complexes is accelerated inside [Ga4L6]12– cages (Figure d).[197] This elementary reaction was incorporated into a dual cross-coupling cycle, for which the metal complex and supramolecular cage were required for efficient turnover (TON > 300). The C–C bond-forming reaction was suggested to proceed via a pre-equilibrium halide dissociation followed by a transient and reversible encapsulation of the nascent organometallic cationic species, which then undergoes irreversible elimination inside the cavity. In their subsequent studies, they evaluated the effect of spectator ligands, reactive alkyl ligands, solvent, and catalyst structure.[198] In particular, they observed an increased reaction rate upon increasing the water content of the methanol solvent and by substituting the methyl phosphine ligand for ethyl. The first computational rationalization of this [Ga4L6]12– catalyzed reaction was reported by Head-Gordon and co-workers using DFT (ωB97X-v/TZV2P(Au),DZVP).[219] They hypothesized that the negatively charged cage stabilizes the positively charged intermediate and TS (Figure e). Indeed, the activation energy for the reductive elimination step was found to be lower for the charged intermediate than for the uncharged substrate. Electrostatic stabilization was further confirmed by comparing this cage to a lesser charged analogue ([Si4L6]8–). Moreover, the high turnover of the catalyst was rationalized based on the stronger binding of the reactant state compared to the product. Their study suggested that the addition of water would decrease the activation energy; however, this effect was not quantified. In a subsequent study, they investigated the role of water using ab initio MD (AIMD; B97M-rV/DZVP) simulations in an explicit water solvent (Figure f). The calculated acceleration rate was in reasonable agreement with the experiment (comp. 
3.3 × 10⁷ vs exp. 5.0 × 10⁵–2.5 × 10⁶). Electrostatic effects and the presence of a single complexed water molecule inside the cage, which organizes the electrostatic potential, were suggested to facilitate the reaction. They also noted that, unlike enzymes, the cage poorly reorganizes water on its surface. Since they found that the unorganized interfacial water is detrimental for TS stabilization, it was suggested that even better catalytic acceleration could be achieved by enhancing the ordering of the solvent inside the cage. They suggested that replacing the gallium ions with indium would reduce the energy required for solvent reorganization and result in stabilization of the transition state.[220] The same reaction has been studied by Ujaque and co-workers using MD simulations with explicit methanol solvent and DFT (SMD-B3LYP-D3/6-31G(d),SDD(Au)). MD simulations were used to evaluate the number of solvent molecules inside the cage while DFT was used to study catalytic activity (Figure g).[221] On average, two methanol molecules were found to be present inside the cage. The activation barrier for the uncatalyzed reaction was calculated with DFT using implicit solvent and 12 explicit methanol molecules. For the reaction in the cage, two explicit methanol molecules were placed inside the cavity. The computed barriers were in excellent agreement with the experimental values (uncatalyzed: exp. 27.2 kcal mol–1 vs comp. 26.2 kcal mol–1 and cage-catalyzed: exp. 20.7 kcal mol–1 vs comp. 19.5 kcal mol–1). Using an energy decomposition analysis, they identified that TS stabilization arises from its encapsulation and interaction with the cage (3 kcal mol–1). Additionally, in solution, the formation of the charged intermediate requires dissociation of iodide from the saturated organometallic complex, which is slightly endergonic (ΔG = 1.8 kcal mol–1). They investigated the role of the solvent by removing explicit methanol molecules from the uncatalyzed reaction. 
Removal of the 10 methanol molecules, leaving only two, had a minor effect (0.7 kcal mol–1) on the activation barrier. However, removal of the remaining two molecules decreased the activation barrier significantly, by 7 kcal mol–1. Since the simulations show two methanol molecules inside the cage, it was proposed that the solvent has a negligible effect on catalysis. However, the authors noted that hypothetical removal of the remaining methanol molecules from the cage would significantly lower the activation barrier. In a different study, Ujaque et al. also investigated the effect of changing the substrate’s trimethylphosphine ligand to triethylphosphine (Figure h).[222] As expected, a larger ligand led to less solvent encapsulated by the cage. Their MD studies showed that at equilibrium, the cavity is occupied by 5–8 methanol molecules and surrounded by 6–9 K+ ions with up to two inside the cavity. As a result, the cage has an effective charge in the range −6 to −3. Upon encapsulation of the charged Au(III) iodide complex, the iodide rapidly leaves the cage, and only one methanol stays inside. They also investigated the reaction in vacuum and in explicit solvent, modeling the cage using DFT (SMD-B3LYP-D3/6-31G(d),SDD(Au)). Compared to the uncatalyzed reaction with 12 explicit methanol molecules, they observed a decrease of 8.8 kcal mol–1 in the barrier for the reaction inside the cage. Overall, two factors were found to contribute to catalytic activity: desolvation, with removal of 11 explicit methanol molecules (leaving only one in the complex) decreasing the activation energy by 5.7 kcal mol–1, and the electrostatic interaction between the TS and the cage, which reduces the barrier by 3.1 kcal mol–1. Surprisingly, the surrounding ions, which make the effective charge of the cage less negative, did not have a significant effect on the activation barrier. 
They extended this analysis to the reductive elimination of a Pt(IV) complex.[223] Despite the similarity of the Au(III) and Pt(IV) complex reactions, the factors contributing to catalytic activity differ. In the Pt(IV) complex, catalytic activity was found to arise from encapsulation effects (6.5 kcal mol–1) rather than microsolvation (0.5 kcal mol–1), as was observed in the Au(III) complex.
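
To put the barriers quoted above for the Au(III) reductive elimination on the same footing as rate accelerations, the following minimal sketch converts a barrier lowering into a rate ratio using transition-state theory. It assumes equal pre-exponential factors for the catalyzed and uncatalyzed reactions and a temperature of 298.15 K; neither assumption is taken from the studies discussed, and the function name is hypothetical.

```python
import math

# Gas constant in kcal mol^-1 K^-1
R_KCAL = 1.987204e-3


def rate_acceleration(dg_uncat, dg_cat, temperature=298.15):
    """Estimate k_cat/k_uncat from two activation free energies (kcal/mol).

    Assumes transition-state theory with identical prefactors, so that
    k_cat / k_uncat = exp((dG_uncat - dG_cat) / (R * T)).
    The default temperature is an assumption, not a reported value.
    """
    return math.exp((dg_uncat - dg_cat) / (R_KCAL * temperature))


# Barriers quoted in the text (kcal/mol)
exp_ratio = rate_acceleration(27.2, 20.7)   # experimental barriers
comp_ratio = rate_acceleration(26.2, 19.5)  # computed barriers

# Additive decomposition reported for the triethylphosphine substrate
desolvation = 5.7    # kcal/mol, removal of 11 explicit methanol molecules
electrostatic = 3.1  # kcal/mol, TS-cage electrostatic interaction
total_lowering = desolvation + electrostatic  # compare with the quoted 8.8

print(f"k_cat/k_uncat from experimental barriers: {exp_ratio:.2e}")
print(f"k_cat/k_uncat from computed barriers:     {comp_ratio:.2e}")
print(f"Sum of barrier-lowering contributions:    {total_lowering:.1f} kcal/mol")
```

Because the ratio depends exponentially on the barrier difference, an error of about 1.4 kcal mol–1 in a computed barrier already changes the predicted rate by roughly an order of magnitude at room temperature.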

Pd-Cage Catalysis

Diels–Alder Reaction

The [Pd2L4]4+ topology, originally reported by Steel and McMorran[224] and later utilized by Clever,[59,61,99,101] Hooley,[225] and Crowley,[10] has been extensively used for binding, transport, and catalysis. In 2018, we evaluated the ability of two [Pd2L4]4+ cages to catalyze Diels–Alder (DA) reactions using quinone substrates as dienophiles. In these systems, the linkers differ in having either benzene or pyridine as the central group, and therefore they are referred to as the CH-cage and N-cage, respectively (Figure a).[45] Unlike the cage-catalyzed reactions described above, which take place in water, this system operates in dichloromethane with noncompeting BArF– anions, enabling enthalpic activation of the substrate via C–H hydrogen bond interactions. It should be noted that, unlike previous examples of cage-promoted Diels–Alder reactions,[195−198] this catalysis only involves formal binding of the dienophile, and so there is no contribution from increased effective concentration. While the N-cage was found to be catalytic, with rate accelerations (kcat/kuncat) of up to 10³, the CH-cage was inactive, despite the latter having strong substrate binding. The contrasting catalytic ability was postulated to arise from a combination of weakened substrate binding (ground-state destabilization) and enhanced TS stabilization.

Figure 11

Reactivity in [PdL2]2+. (a) Diels–Alder reaction is catalyzed by the N-cage ([Pd2Lr4]4+).[226] (b) Electrostatic potential (ESP) slices of the CH-cage ([Pd2Ll4]4+) and N-cage ([Pd2Lr4]4+) on the xz plane containing two opposing ligands and two metal centers.[208] Reprinted with permission from refs (208) and (226), from the American Chemical Society.

To rationalize how those subtle structural differences in the cage framework affect binding and catalysis, a computational protocol was developed, employing MD and DFT methods.[226] MD simulations in explicit dichloromethane (DCM) solvent were used to evaluate the flexibility of the cage, using a modified version of Yoneya’s Pd2+ dummy model (Section ). Higher flexibility was identified for the N-cage than for the CH-cage, suggesting that the active N-cage can deform significantly to accommodate the increasing bulk of the TS without the energetic penalty that would be expected for the CH-cage. Guest binding was calculated using DFT at the SMD(DCM)-M06-2X/def2-TZVP//PBE0-D3BJ/def2-SVP level of theory, which was previously shown to accurately describe association energies for large supramolecular systems.[227] Relative (ΔΔEbind) rather than absolute binding energies were calculated. This avoids having to consider entropic contributions, which can introduce significant errors, as they are expected to be similar in both systems.[228,229] The relative binding affinities of a range of quinone-based guests were calculated with very good accuracy (MAD = 1.9 kcal mol–1, R² = 0.76). Activation energies for the reaction between different benzoquinones and dienes were calculated at the same level of theory. In addition, a distortion-interaction analysis was used to determine the components of the activation energy. 
A more favorable interaction energy than for the uncatalyzed reaction was observed for both cages, which arises from lowering of the dienophile LUMO energy and enhancement of its electrophilic character. However, the lack of catalytic activity in the CH-cage was found to arise from a large distortion penalty due to steric clashes between the ligand CH moieties and the dienophile at the TS (ΔE⧧cat > ΔE⧧uncat). To decrease the computing time when estimating catalytic proficiency, “TS analogues” were used, in which the bonds being formed and broken are constrained to the values found in the uncatalyzed TS. This strategy resulted in >80% accuracy in classifying catalytic cages and reduced the computational time by a factor of 10.
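
The distortion-interaction analysis mentioned above splits an activation energy into the cost of deforming each fragment to its TS geometry and the interaction between the deformed fragments. The sketch below illustrates the bookkeeping only. The function name and all energies are hypothetical placeholder values chosen for illustration; they are not results from the cited work.

```python
def distortion_interaction(e_ts, fragments):
    """Split an activation energy into distortion and interaction terms.

    e_ts      : energy of the full transition state
    fragments : list of (e_relaxed, e_distorted) pairs, one per fragment,
                where e_distorted is the single-point energy of the
                fragment frozen in its TS geometry.
    All energies must share the same units and reference.
    Returns (e_distortion, e_interaction, e_activation), which satisfy
    e_activation = e_distortion + e_interaction.
    """
    e_distortion = sum(distorted - relaxed for relaxed, distorted in fragments)
    e_interaction = e_ts - sum(distorted for _, distorted in fragments)
    e_activation = e_ts - sum(relaxed for relaxed, _ in fragments)
    return e_distortion, e_interaction, e_activation


# Hypothetical relative energies in kcal/mol (illustrative only):
# each fragment is referenced to its own relaxed geometry (0.0).
diene = (0.0, 12.0)       # (relaxed, distorted to TS geometry)
dienophile = (0.0, 9.0)
e_ts = 15.0               # TS energy relative to the separated, relaxed fragments

e_dist, e_int, e_act = distortion_interaction(e_ts, [diene, dienophile])
print(f"distortion:  {e_dist:+.1f} kcal/mol")
print(f"interaction: {e_int:+.1f} kcal/mol")
print(f"activation:  {e_act:+.1f} kcal/mol")
```

In this picture, a catalyst can lower the barrier by making the interaction term more negative, while a cavity that clashes sterically with the substrate raises the distortion term, which is the behavior described for the CH-cage.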

Base-free Michael Addition Reaction

In a subsequent study, the CH-cage was found to efficiently catalyze several Michael addition reactions, while the N-cage was catalytically inactive.[208] The cage promoted the spontaneous, base-free deprotonation of the pro-nucleophile via stabilization of the conjugate anion, with an acidity enhancement equivalent to several pKa units. The calculated electrostatic potential of the two cages showed that the nitrogen atoms of the N-cage significantly neutralize the remote charge arising from the Pd2+ ions, making its central cavity much less electropositive than that of the CH-cage (Figure b). This means that the CH-cage is better at stabilizing anionic species than the N-cage, which explains the reactivity difference. The cage also promoted different levels of diastereoselectivity for reactions in which the product possesses multiple stereocenters. This effect was probed with DFT calculations at the SMD(DCM)-M06-2X/def2-TZVP//PBE0-D3BJ/def2-SVP level of theory, which showed a pronounced bias toward encapsulating one of the diastereomers inside the cage, supporting the observed diastereoselectivity in the reactions.

Overview and Future Perspectives

In recent years, substantial progress has been made toward the design of structurally diverse metallo-organic cages. For example, simple geometric principles have been successfully employed to guide the design of increasingly complex systems. However, functional metallo-organic cages, for example, as catalysts, remain underdeveloped. Despite the hundreds of assemblies reported in the past three decades, catalysis appears limited to a few privileged structures. By reviewing the processes enabling the functional activity of these systems, we have illustrated the key aspects that need to be considered to study metallo-organic cage design and move away from current trial-and-error approaches. While substantial challenges remain, we have shown that existing computational tools can already help interpret and, in some cases, predict experimental observations. For example, efficient open-source computational tools have enabled the screening of hundreds of new designs, saving precious time and resources for experimentalists. Ensuring that these tools continue to be developed and are accessible to experimentalists will be essential to generating a feedback loop between computational and supramolecular chemists. Only in the past decade have techniques extensively used in enzyme modeling, such as QM and MD, been employed to study processes such as cage assembly, binding, and catalysis. Difficulties related to the quality of the models and the efficiency of the available techniques remain to be overcome. For example, the choice of force field and whether solvent and/or counterions are included have a tremendous influence on the results of self-assembly and binding simulations. Moreover, most classical force fields are intrinsically unsuitable for studying cage formation, as they lack charge transfer. Therefore, high-quality and easy-to-generate force fields that accurately describe charge transfer and polarization will be necessary. 
Recently developed machine-learned potentials could help bridge the current gap between classical and quantum techniques, providing the accuracy and efficiency required. The community will also need to go beyond conventional modeling techniques targeted at describing thermodynamic minima and introduce efficient enhanced sampling techniques able to describe processes that experimentally take place on the scale of seconds or beyond. Moreover, knowledge of the pathways involved will enable us to explore kinetic traps from which novel designs could arise. Finally, in the realm of catalytic cage design, detailed mechanistic studies on targeted catalytic activity may still be necessary. This will require efficient and accurate electronic structure methods that enable the calculation of large QM regions and their accurate sampling. Advances in this area include linear scaling DFT approaches, the introduction of GPU architectures, and the development of improved semiempirical methods such as xTB. These studies could then be complemented with large-scale screening efforts and machine-learning techniques to further explore the available chemical space. Collaboration with experimentalists is key to success, as it will give access to both positive and negative results, allowing the quality of the models and the accuracy of the conclusions to be better assessed. Computational chemistry shows great promise for the design of supramolecular cages. We envision that the pioneering efforts highlighted in this review will be expanded to create more robust and efficient design methods. The synergy between rapid screening approaches, accurate molecular modeling, and experimental validation will enable us to go beyond traditional approaches of intuition-driven trial-and-error, reducing the overall time and cost needed to discover new functional cages.
