
Characterization of near-miss connectivity-invariant homogeneous convex polyhedral cages.

Bernard M A G Piette¹, Agnieszka Kowalczyk²,³, Jonathan G Heddle².

Abstract

Following the discovery of a nearly symmetric protein cage, we introduce the new mathematical concept of a near-miss polyhedral cage (p-cage) as an assembly of nearly regular polygons with holes between them. We then introduce the concept of the connectivity-invariant p-cage and show that they are related to the symmetry of uniform polyhedra. We use this relation, combined with a numerical optimization method, to characterize some classes of near-miss connectivity-invariant p-cages with a deformation below 10% and faces with up to 17 edges.

Keywords: Platonic group; capsid; nano-cage; near-miss cages; protein cage; uniform polyhedra

Year: 2022 PMID: 35450023 PMCID: PMC8984814 DOI: 10.1098/rspa.2021.0679

Source DB: PubMed Journal: Proc Math Phys Eng Sci ISSN: 1364-5021 Impact factor: 2.704

Introduction

Recently, a structure referred to as a TRAP-cage, made out of 24 nearly regular hendecagons, was engineered from TRAP (trp RNA-binding attenuation protein) [1-4]. The structure is such that each hendecagon has five neighbours with which it shares an edge. This leaves six edges per face which define the boundary of 38 holes. Thirty-two of the holes are triangles whereas the remaining six are in between four hendecagons, each contributing two of their edges to them (figure 1).
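The counts quoted in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping check; it assumes that each triangular hole takes one edge from each of three hendecagons, which the text implies but does not state, and all variable names are illustrative.

```python
# Edge bookkeeping for the TRAP-cage described above.
faces = 24                 # hendecagons
edges_per_face = 11
neighbours = 5             # edges shared with another face, per face

hole_edges_per_face = edges_per_face - neighbours
total_hole_edges = faces * hole_edges_per_face

# Assumption: a triangular hole takes one edge from each of three faces.
triangular_holes, edges_per_triangular_hole = 32, 3 * 1
# Stated in the text: six holes sit between four faces, two edges from each.
large_holes, edges_per_large_hole = 6, 4 * 2

edges_around_holes = (triangular_holes * edges_per_triangular_hole
                      + large_holes * edges_per_large_hole)

shared_edges = faces * neighbours // 2   # each shared edge joins two faces
holes = triangular_holes + large_holes

# Euler's formula on the face-adjacency skeleton:
# vertices = faces, edges = shared edges, regions = holes.
euler = faces - shared_edges + holes

print(hole_edges_per_face, total_hole_edges, edges_around_holes,
      shared_edges, holes, euler)
```

Both ways of counting hole edges give the same total, and the face-adjacency skeleton satisfies Euler's formula for a sphere, so the description is self-consistent.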

Figure 1

(a) The structure of a TRAP-cage as determined using cryo-electron microscopy [3]. The cage is shown in surface view with each TRAP ring (made from 11 identical protein monomers) coloured a different colour. (b) Polyhedral representation of the TRAP-cage: 24 hendecagons, 32 triangular holes and six non-planar holes. (c) As in (b), but viewed from an axis centred in between two triangular holes. (Online version in colour.)

Such a regular structure is mathematically impossible but can be realized if the edge lengths and angles of the polygons are deformed by as little as 0.5%. This makes such a structure look totally regular and symmetric, although in reality it is only nearly so [3]. Given the huge library of diverse protein structures known, there may be a number of proteins from which it would be useful to build cage-like structures but which may have been overlooked owing to the mathematical ‘impossibility’ of them forming a regular cage. This raises the question of whether other protein cages with such nearly symmetrical geometries could be made.

We start by defining the new concept of a polyhedral cage [5], referred to as a p-cage, as an assembly of (nearly) regular planar polygons, which we also call faces, with holes in between them. The holes can have any shape and do not have to be planar. From a biochemical point of view, the faces will be made out of proteins, while the holes can be empty or can be used to attach particular molecules of choice.

The mathematical concept of a regular shape is not new. Uniform polyhedra are assemblies of regular polygons such that all the vertices are equivalent. It was Kepler who showed [6] that the only convex uniform polyhedra are the Platonic and Archimedean solids, as well as prisms and antiprisms. The non-convex versions were described by Coxeter et al. [7]. Johnson [8] generalized the concept to strictly convex assemblies of regular polygons without requiring any equivalence between the vertices.
Johnson [8] also listed the 92 so-called Johnson solids and Zalgaller [9] later proved that the list was complete. This was further extended [10] to near-miss Johnson solids as strictly convex assemblies of nearly regular polygons. Because of the holes, p-cages are neither proper solids nor polytopes and so do not fall within the already classified polyhedra. What we are defining is a further generalization of convex regular assemblies of (nearly) regular polygons by allowing holes, of any shape, in the structure.

Definition

We define a polyhedral cage as an assembly of planar polygons, which we also refer to as faces, separated by holes which do not need to be either regular or even planar. The edges of the p-cage faces are either shared with another face or with a hole. Of two adjacent edges at least one of them must be shared with a hole. Moreover, we impose that two adjacent faces do not share more than one edge with each other and that each face must have at least three neighbour faces. Examples of p-cages are presented in figures 1 and 2.
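The local part of this definition can be phrased as a check on a single face. The following is a minimal sketch, not the authors' code: it assumes a face is encoded as a cyclic string of `'F'` (edge shared with another face) and `'H'` (edge shared with a hole), an encoding introduced here purely for illustration. The rule that two faces share at most one edge needs the whole cage and is not checked.

```python
def is_valid_face(edge_kinds):
    """Check the per-face rules of the p-cage definition.

    edge_kinds: cyclic sequence of 'F' (shared with a face) or 'H' (hole).
    """
    n = len(edge_kinds)
    if n < 3 or any(kind not in ("F", "H") for kind in edge_kinds):
        return False
    # Of two adjacent edges, at least one must be shared with a hole.
    for i in range(n):
        if edge_kinds[i] == "F" and edge_kinds[(i + 1) % n] == "F":
            return False
    # Each face must have at least three neighbour faces.
    return edge_kinds.count("F") >= 3


print(is_valid_face("FH" * 5))         # decagon with five neighbours
print(is_valid_face("FHFHFHFHFHH"))    # hendecagon with five neighbours
print(is_valid_face("FFHFHFHH"))       # two adjacent shared edges
print(is_valid_face("FHFHHH"))         # only two neighbours
```

The first two encodings satisfy the rules; the last two violate the adjacency rule and the three-neighbour rule, respectively.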

Figure 2

(a) A p-cage made out of decagons. Each hole is made out of three faces, each contributing one edge to the hole. The red line, drawn on the p-cage faces for clarity, illustrates the construction of the hole-polyhedron. (b) The hole-polyhedron, an icosahedron, for the p-cage in (a). (c) A p-cage made out of dodecagons with three types of holes. Each face contributes, respectively, one, two and three edges to each of these holes. (d) Planar projections of four heptagons placed on the vertices (large dots) of a tetrahedron (dotted lines). The holes are painted except the one underneath the projection. Each hole is made out of two edges from one face and one edge from two different faces, as indicated by the numbers around each edge. (Online version in colour.)

A p-cage is defined as regular if all of its faces are regular planar polygons, but the bionanometric cages such as the TRAP-cage are made out of strictly planar but nearly regular polygonal faces. We refer to these as near-miss p-cages. The amount of deformation is subjective, but in this paper we will consider edge lengths and angles between edges differing by up to 10% from the regular polygons. This is motivated by the fact that p-cages with such a deformation are noticeably irregular to the naked eye, but not excessively so. It is also motivated by the fact that nano-cages with deformation close to or exceeding 2% are known to be likely (AP Biela 2019, personal communication) [11].

The hull of a p-cage is obtained by extending all the faces to infinite planes and considering the interior of all the resulting intersecting half-spaces. The hull will always be a polyhedron, but the faces will usually be irregular. We define a p-cage as convex if its hull is convex.

The p-cage graph is the graph generated from the edges of the p-cage. Some of the nodes will belong to three edges and be part of two faces and one hole while the others will only belong to two edges and be part of one face and one hole.
A homogeneous p-cage is defined as a p-cage for which all the faces are polygons with the same number of edges. A homogeneous p-cage is said to be connectivity invariant if all the faces are indistinguishable from their connectivity; in other words, if for any pair of faces there is an automorphism of the p-cage graph onto itself that maps the vertices of the first face onto the vertices of the second, such that the connecting edges are mapped to connecting edges and hole edges are mapped to hole edges. The faces are assumed to be isomorphic but not isometric. For nanobiotechnological motivations, in this paper, we do not consider the p-cages for which one must involve a reflection to achieve the connectivity invariance. We are also only interested in convex p-cages. Connectivity invariance is introduced because bionano-cages with that property will be able to assemble randomly/thermally following a larger number of possible assembly paths and, as a result, are more likely to be generated experimentally.

In what follows, we will use the following notation: N will stand for the total number of faces of a p-cage while P will refer to the number of edges of each face. As we will only consider homogeneous cages herein, P will be defined for each p-cage. We will also denote as Q the number of faces surrounding a given hole. In general, p-cages will have holes made out of a different number of faces and so there will be different values of Q for a given cage (figures 1 and 2).

The paper is organized as follows. We start by defining the dual of the p-cage, which we call the hole-polyhedron. We then show that each p-cage can be constructed from these hole-polyhedra. For this, we must start by characterizing all the possible ways to distribute face edges to the holes. Finally, we proceed by identifying all the graphs with distributed edges corresponding to connectivity-invariant p-cages (either regular or near-miss) for P ranging between 6 and 17.
We then describe a method that will be used to construct all the connectivity-invariant p-cages as well as some geometric constraints used to rule out p-cages with deformation exceeding our chosen threshold. To achieve this, we describe a quality function which measures the non-regularity of p-cages and which one must minimize to find the most regular configuration for the convex near-miss p-cages. In the final section, we will describe the cages we have found, focusing our attention on the most regular ones.

Hole-polyhedron

If we join the centre of each face of a p-cage to the centres of the faces that share one edge with it and project this skeleton on a plane, we obtain a planar graph which can also be seen as the three-dimensional (3D) graph of a polyhedron (the faces of the 3D graph will not necessarily be planar, but the vertices of the graph can be projected onto a plane so that none of the edges cross each other). This is effectively the dual of the p-cage, but as the faces of that polyhedron bear information about the holes of the p-cage, we call it a hole-polyhedron. Each face of the hole-polyhedron surrounds a p-cage hole and has as many edges as that hole has surrounding faces, so the hole-polyhedron captures the connectivity of the p-cage: the vertices, edges and faces of the hole-polyhedron correspond, respectively, to the faces, shared edges and holes of the p-cage (figure 2).

To create p-cages, we can proceed backwards and consider any planar graph as a hole-polyhedron, placing a polygon on each vertex (figure 2). One must then distribute the edges of each face between the adjacent faces and holes. The edges of the hole-polyhedron specify which polygons are adjacent on the p-cage. When we place a polygon with P edges on a vertex of degree q, we must join q of its edges to neighbour faces, distributing the remaining P − q edges, the hole-edges, between the q holes. This step is not unique as the hole-edges can be distributed in several ways. For example, an octagon with three neighbours can contribute one, two and two edges to the three holes or one, one and three. This will result in a number of different p-cages. If the polygons are regular and identical, the p-cage will be regular. If the polygons are slightly deformed regular polygons, the p-cage will be near-miss. By considering all the planar graphs of interest and all the possible repartitions of the edges, we will obtain all the possible p-cage connectivities.
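The octagon example can be reproduced by brute-force enumeration. The sketch below lists the ways to split the hole-edges of one face into strictly positive contributions, one per hole. Treating cyclic rotations as equivalent is a simplifying assumption made here; when the holes around a vertex are of different kinds, rotated distributions are distinct. The function name is illustrative.

```python
from itertools import product


def hole_edge_distributions(n_edges, n_neighbours):
    """Ways to split the hole-edges of one face between its holes.

    Returns tuples of strictly positive integers, one entry per hole,
    keeping a single representative of each cyclic rotation.
    """
    n_hole_edges = n_edges - n_neighbours
    seen, result = set(), []
    for combo in product(range(1, n_hole_edges + 1), repeat=n_neighbours):
        if sum(combo) != n_hole_edges:
            continue
        rotations = [combo[i:] + combo[:i] for i in range(n_neighbours)]
        canonical = min(rotations)
        if canonical not in seen:
            seen.add(canonical)
            result.append(canonical)
    return result


# Octagon with three neighbours: 8 - 3 = 5 hole-edges over 3 holes.
print(hole_edge_distributions(8, 3))   # [(1, 1, 3), (1, 2, 2)]
```

The two results match the two distributions quoted in the text. The exhaustive loop is adequate for the face sizes considered here but is not meant to be efficient.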
While this is very simple conceptually, the number of possibilities grows very quickly with the size of the hole-polyhedron and with the number of edges per face, so much so that it very quickly becomes intractable [12,13]. From a bionanotechnology point of view, the most relevant p-cages are the ones where all the faces have the same number of edges and play an equivalent role. This is because such cages, by biological necessity, are built from multiple identical building blocks with identical amino acid sequences and held together by specific interactions between particular amino acids. So we have decided to restrict ourselves to homogeneous connectivity-invariant p-cages. This reduces considerably the number of possible p-cage geometries. Note that the hole-polyhedra of homogeneous connectivity-invariant p-cages correspond to regular planar graphs. Moreover, the connectivity invariance of the p-cage implies that the hole-polyhedron is vertex transitive or, in other words, a Cayley graph. These graphs were classified by Maschke [14] and are the Platonic and Archimedean solids as well as the uniform prisms and uniform antiprisms. As we disregard connectivity invariance via reflection, we must exclude the truncated cuboctahedron and truncated icosidodecahedron.

Repartition of edges on the p-cage holes

The next task we must perform is to determine all the connectivity-invariance-preserving ways to distribute the face edges of the p-cage between the holes. For a hole-polyhedron with q edges per vertex and p-cage faces with P edges, this is equivalent to distributing q strictly positive integers, which sum to P − q, around each hole-polyhedron vertex, in such a way that the p-cage is connectivity invariant; in other words, such that, for any two vertices of the hole-polyhedron, there is at least one automorphism of the hole-polyhedron that maps the first vertex onto the second and that preserves the distribution of these integers.

We will now perform the construction graphically using the letters ‘a’, ‘b’, ‘c’, ‘d’ and ‘e’ instead of numerical values. As this labelling will later be used to name the p-cages, we have adopted the following convention to decide on which face the ‘a’ is placed for the first and arbitrary set of labels. For prisms and antiprisms, the label ‘a’ is placed on the base polygon. For Platonic solids, it does not matter as all the faces are identical, while for Archimedean solids we place the ‘a’ on the face with the smallest number of edges except for the snub cube and the snub dodecahedron because this could lead to an ambiguity as there are two types of triangles for these solids. We thus place the label ‘a’, respectively, on the square and the pentagon for these solids. The other labels are then placed anti-clockwise around the vertices in alphabetical order.

To identify the p-cages unambiguously, we have adopted a notation made out of three parts, PN, SYM and QI, where PN is the letter P followed by the number of edges of the p-cage faces; SYM refers to a symbol, listed in table 1, used to specify the hole-polyhedron from which the p-cage is made; finally, QI refers to the diagram values of ‘a’, ‘b’, ‘c’, ‘d’ and ‘e’ separated by the symbol ‘–’.
For example, the name of the p-cage in figure 2a is built from P10, Pic and 1–1–1–1–1 because its faces have 10 edges, its hole-polyhedron is the icosahedron and each face contributes five times a single edge to the holes (a=b=c=d=e=1 in figure 2a).

Table 1

Symbols for convex uniform solids.

solid	SYM	solid	SYM
triangular prism	tp	tetrahedron	Pte
square prism (cube)	Pcu	octahedron	Poc
pentagonal prism	pp	dodecahedron	Pdo
hexagonal prism	hp	icosahedron	Pic
heptagonal prism	7p	truncated cube	Atc
octagonal prism	8p	truncated tetrahedron	Att
nonagonal prism	9p	truncated octahedron	Ato
decagonal prism	10p	truncated dodecahedron	Atd
triangular antiprism	ta	truncated icosahedron	Ati
square antiprism	sa	snub cube	Asc
pentagonal antiprism	pa	snub dodecahedron	Asd
hexagonal antiprism	ha	cuboctahedron	Aco
heptagonal antiprism	7a	rhombicuboctahedron	Arco
octagonal antiprism	8a	rhombicosidodecahedron	Arcd
nonagonal antiprism	9a	icosidodecahedron	Aid
decagonal antiprism	10a

We now consider each regular solid in turn, starting with the prisms but excluding the cube, which has a higher symmetry. By connectivity invariance, all the corners of the base of the prism must be identical, as they can only be mapped between themselves via a rotation of the base, and we label them ‘a’. The corners of the squares, forming the sides of the prism, can then have a different number of edges, labelled ‘b’ and ‘c’, but diagonally opposite corners must have the same value. This is shown graphically in figure 3.

Figure 3

Repartition of edges: (a) on a prism; (b) on an antiprism. The arrows show the order in which the labelling is performed: one starts with the arbitrary labels (top left), moves on to the next ones (bottom left) and then infers the following ones. (Online version in colour.)

Similarly for antiprisms, the corners of the base of the antiprism must be identical. The corners of the triangles can then have a different number of edges, labelled ‘b’, ‘c’ and ‘d’, as shown in figure 3.

Before we consider all the Platonic and Archimedean solids, we consider the possible configurations for a triangle, as shown in figure 4, as this will be a recurrent structure which we will use several times. We start by placing the labels ‘a’, ‘b’ and ‘c’ around the top vertex. We then place the three labels on the bottom left vertex in the three possible positions and consider each case in turn. We then try to impose the connectivity invariance on the other vertices. For the first case, we see that the pair ‘c’, ‘a’ faces the pair ‘a’, ‘b’ and this means that on the third vertex ‘a’, ‘b’ must face ‘c’, ‘a’ from the second vertex. We also notice that the connectivity invariance is satisfied on the third edge so this configuration is invariant. For the second case, there is no connectivity invariance we can apply, so we try all three possibilities on the third vertex and, in all three cases, we see that the only possible configuration is the trivial one where ‘a’=‘b’=‘c’. For the third case, we only have one possible configuration. We can then conclude that the only possible configurations are the ones for which the three corners of the triangle have either the same label or three different ones, in which case they are ordered clockwise.

Figure 4

Repartition of edges on a triangle whose vertices are connected to three other vertices. The non-trivial connectivity-invariant configurations are surrounded by a box. (Online version in colour.)

We can do the same construction for a triangle with vertices connected to four other vertices (figure 5). In this case, we also see that the vertices of the triangle must be either all identical or all different. When different they must be ordered clockwise and there are three different ways to do this as one of the labels must be missed out (except ‘a’, which is fixed).

Figure 5

Repartition of edges on a triangle whose vertices are connected to four other vertices. The non-trivial connectivity-invariant configurations are surrounded by a box. (Online version in colour.)

Finally, we also consider a square with vertices connected to three other vertices (figure 6). In this case, we see that the four corners of the square must either have the same label or have two different ones with identical labels for diagonally opposite corners.

Figure 6

Repartition of edges on a square whose vertices are connected to three other vertices. The non-trivial connectivity-invariant configurations are surrounded by a box. (Online version in colour.)

We will now construct the different repartitions for the Platonic solids. For the tetrahedron, each vertex has degree 3 and it is not possible to fit three ‘a’ on one face as this leads to a contradiction unless all three labels are identical. One must then have three different labels on each face and one obtains the diagram shown in figure 7. For the cube, each vertex also has degree 3 and we can place four ‘a’ on each face or alternating ‘b’ and ‘c’. In both cases, we obtain the diagram shown in figure 7. For the octahedron, we can place three ‘a’ on the same face or three different labels in clockwise order. When trying all possible combinations, one obtains two different diagrams. In the first one, named Poc1, two opposite faces have three ‘a’, while the others have the remaining three labels. In the second, named Poc2, each face has only ‘a’ or only ‘b’ in such a way that similar faces do not share an edge (figure 7) (see electronic supplementary material).

Figure 7

Repartition of hole-edges on the Platonic hole-polyhedra.

For the dodecahedron, the only diagram which ensures connectivity invariance for the p-cages is when all the corners have the same label, so we only have Pdo. For the icosahedron, there is only one possible diagram, modulo some equivalence, and it is shown in figure 7 (see electronic supplementary material for the proof).

We can now construct the connectivity invariance diagrams for the Archimedean solids. The truncated tetrahedron, truncated cube, truncated dodecahedron, rhombicuboctahedron and rhombicosidodecahedron all have one triangle per vertex. By connectivity invariance, this implies that the corners of triangles must all be ‘a’ and that the other labels are uniquely distributed. The truncated octahedron and snub cube have one square per vertex, implying that these faces must have ‘a’ in each corner and that this determines the position of the other labels. The same is true for the truncated icosahedron and snub dodecahedron, which have one pentagon per vertex. This is shown in figure 8.

Figure 8

Repartition of hole-edges on the truncated tetrahedron, truncated dodecahedron, rhombicosidodecahedron, truncated icosahedron, snub dodecahedron, cuboctahedron, truncated octahedron, truncated cube, rhombicuboctahedron, snub cube and icosidodecahedron. For the large solids, we only present a section of the diagram.

The vertices of the cuboctahedron are adjacent to two triangles and two squares. We can thus place three ‘a’ on one of the triangles and as we do so the triangles on the opposite side of the vertex must be ‘c’ and the diagram is completely determined (figure 8). One could also start with three different labels clockwise on one of the triangles, but this is not possible without breaking the connectivity invariance unless the labels are all identical.

The vertices of the icosidodecahedron are adjacent to two triangles and two pentagons. We can thus place three ‘a’ on one of the triangles and as we apply the connectivity invariance rule we find that ‘a’=‘c’ and ‘b’=‘d’. So all the triangles must be filled with ‘a’ and all the pentagons with ‘b’. It is not possible to fill a triangle with three different labels without breaking the connectivity invariance.

The vertices of the truncated cuboctahedron and truncated icosidodecahedron are not invariant if one excludes reflections. This is easily seen by noticing that the rotation symmetry around the axis going through an n-gonal face is a rotation of 4π/n and not 2π/n. As a result, it is not possible to generate connectivity-invariant p-cages from these two Archimedean solids.

Constraints on holes

While in principle we could consider all the possible distributions of edges on the holes, some of them lead to configurations which cannot correspond to a p-cage, or ones for which the deformation would be too large. By deformation, we mean angles different from the angle of the regular polygon or edge lengths different from a reference length, which we will ultimately set to 1. To make the problem tractable, we start by deriving a set of constraints on the holes and their edges to guarantee face deformations below a set threshold.

We start by defining the p-cage sub-face as the polygon, usually irregular, made out of the edges shared with other p-cage faces and completed by replacing the edges contributing to each hole by a straight line joining the two exterior vertices (figure 9a). These sub-faces will be hexagons, octagons and decagons, respectively, for faces with three, four and five neighbours. Around any hole, the p-cage sub-faces generate a polygon, usually not flat, with one edge per face surrounding the hole. We call it the sub-face hole.

Figure 9

(a) Sub-face (black) for a p-cage and one of its holes. (b) Structure of the flattened sub-faces hole. (c) Close-up view of face-edges contributing to a hole and depicting the angles between the face hole edges (segmented dotted lines) and the angles between the face hole edges and the sub-face hole edge (black bold line). (Online version in colour.)

If we take the faces surrounding a hole and disjoin two of the adjacent faces, one will be able to flatten the structure onto a plane and the two faces that have been severed will not overlap (figure 9b). This means that the edges of the sub-face holes will not be intersecting and will not close into a polygon. This in turn implies that, to form a convex p-cage, we must impose that the sum of the angles of the sub-faces hole, in figure 9b, once projected onto a plane must be greater than the sum of the angles of a -gon. As illustrated in figure 9c, we call the angle between two adjacent edges of a polygon and the inner angle between the two edges. denotes the angle between the edge of a sub-face hole and the p-cage edge adjacent to it. Following the notation of figure 9 and using the index to label the faces, the sum of the is equal to twice the sum of the and we thus have where we have used the fact that the sum of the inside angles of an n-gon is (n − 2)π.

In what follows, we will use the index 0 to denote the angles of the regular polygons/faces, , and write for a non-regular polygon, where is a deformation factor which can differ between faces, hence the index . We then have and . If the face contributes edges to the hole, we must then have Substituting the expression for , we obtain and the constraint (2.1) becomes In the construction of the p-cages, it is the deformation of the angle which we use as the deformation parameter. Denoting it , we have , where .
Then, as we can rewrite (2.1) and (2.3) as

The second constraint we can derive is that each edge of a sub-face hole must be shorter than the sum of the other edges of the same sub-face hole (2.5). When the equality sign holds, one can join the faces together by deforming the polygons with the smallest contribution to the hole so that their is 0. We can assume that the extreme configuration is one where all the edges and angles are all stretched or contracted to the maximum amount. So, when evaluating (2.5) we must assume on the left must be deformed to become as small as possible while on the right-hand side are to be as large as possible. We need to perform the test taking each of the hole on the left-hand side and rule out any cage for which the test fails. If is even, we have where is a length scale where we take or depending on whether we want an upper or a lower bound. If is odd, we have Note that, for , . To help satisfy (2.5) we can stretch the shorter lengths by a factor up to , where is the threshold deformation factor, and shorten the longest one by the same amount, so (2.5) becomes where we compute using and the on the right-hand side using .

The conditions (2.4) and (2.8) allow one to rule out many possible p-cage configurations. For the configurations which fulfil those two conditions, we must construct the corresponding p-cages and deform the polygonal faces until one obtains a convex p-cage with planar faces. This can be achieved by using a computer program where the vertices of the polygons are moved so that all the necessary conditions for the p-cage are obtained. Some p-cages differ only by a chiral transformation. In that case, we have only kept one of them.

For faces with between 6 and 17 edges, and considering only prisms and antiprisms with bases ranging from triangles to decagons as well as the Platonic and 11 of the Archimedean solids, we found 5743 potential p-cage configurations satisfying condition (2.8).
The list of all these configurations is given in the electronic supplementary material. Many of these cage configurations will have angles and edge length deformations larger than 10%. To discard them, we need to realize each of these cages geometrically and minimize the amount of deformation for the angles and the edges of the faces. This is achieved by defining and then minimizing numerically a functional which measures the amount of deformation of the p-cage faces. One starts from an approximate position for the vertices of all the faces and then randomly displaces them using a Metropolis algorithm to optimize the functional, which we will now define.
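The random-displacement step can be illustrated independently of the functional itself. The following is a generic Metropolis sketch on a toy energy, not the authors' program; the function name, step size, temperature and number of steps are all assumptions chosen for illustration.

```python
import math
import random


def metropolis_minimize(energy, x0, step=0.05, temperature=1e-3,
                        n_steps=20000, seed=0):
    """Generic Metropolis search; returns the best point seen and its energy."""
    rng = random.Random(seed)
    x = list(x0)
    e = energy(x)
    best_x, best_e = list(x), e
    for _ in range(n_steps):
        i = rng.randrange(len(x))              # pick one coordinate
        old = x[i]
        x[i] = old + rng.uniform(-step, step)  # random displacement
        e_new = energy(x)
        # Downhill moves are always accepted; uphill moves are accepted
        # with the Boltzmann probability exp(-(e_new - e) / temperature).
        if e_new <= e or rng.random() < math.exp((e - e_new) / temperature):
            e = e_new
            if e < best_e:
                best_x, best_e = list(x), e
        else:
            x[i] = old                         # rejected: restore coordinate
    return best_x, best_e


# Toy energy with its minimum at (1, 1, 1); it stands in for the functional.
toy = lambda x: sum((c - 1.0) ** 2 for c in x)
best_x, best_e = metropolis_minimize(toy, [0.0, 0.0, 0.0])
print(best_x, best_e)
```

In the setting of the paper the coordinates would be the node positions of the p-cage and the energy would be the deformation functional defined in the next section.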

Deformation functional

To optimize the regularity of the faces of a convex p-cage we define a functional which is the sum of five terms. The first two measure, respectively, the amount of deformation of the face edge lengths and the face angles. The third measures the non-planarity of the faces, while the last two measure the convexity of, respectively, the faces and the p-cage. We need to impose planarity as a constraint, rather than geometrically, because the vertices of the p-cage faces are the degree of freedom we need to optimize. Each of these five terms is then multiplied by a weight factor as explained below. There is no reason to assume that all the p-cage faces will be deformed the same way and become identical. In our optimization, we thus assume that the vertices of the p-cage are independent parameters. Before we define each of these functionals, we need to define a few quantities. First of all, we call node the vertices of the p-cage faces and we denote by the total number of the nodes for the p-cage (each counted only once). To keep our notation compact, we use as the operator modulo for any pair of integers and . We then denote by the coordinates of node , of face , while corresponds to the vector spanning the edge between node and node where the vectors are oriented so that they rotate anti-clockwise when looking at the face from outside the p-cage. As a result, the angle at node of face is We define the centre of the p-cage, , and the centre of the -gonal face , , as The centre of a face relative to the centre of the p-cage is then . We now introduce the following vectors: are the local vertex coordinates relative to the centre of the face, i.e. the vector joining the centre of face and node . is the vector position of the centre of the edge linking node and node and is the vector from the centre of a face to the centre of an edge. We now define a facelet as the triangle spanned by two adjacent vectors . 
Its normal vector and the area vector of face are given, respectively, by For flat faces, is a vector perpendicular to the face and of length equal to its area. As a result, the vector orthonormal to the face is We now define the different terms for a functional which we will use to minimize the deformation of the p-cage faces while ensuring face planarity as well as face and p-cage convexity.
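The geometric quantities introduced above can be computed directly from the node coordinates of one face. The sketch below assumes that the face centre is the average of its nodes and that each facelet vector is half the cross product of two consecutive local vertex vectors, so that their sum has a length equal to the face area for a flat face; the exact normalization used by the authors is not recoverable from the text, and the helper names are illustrative.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_centre(nodes):
    """Average of the node positions (assumed definition of the centre)."""
    n = len(nodes)
    return tuple(sum(p[k] for p in nodes) / n for k in range(3))


def facelet_vectors(nodes):
    """Half cross products of consecutive local vertex vectors."""
    centre = face_centre(nodes)
    local = [sub(p, centre) for p in nodes]
    n = len(local)
    return [tuple(0.5 * v for v in cross(local[i], local[(i + 1) % n]))
            for i in range(n)]


def area_vector(nodes):
    facelets = facelet_vectors(nodes)
    return tuple(sum(f[k] for f in facelets) for k in range(3))


def unit_normal(nodes):
    a = area_vector(nodes)
    norm = math.sqrt(sum(v * v for v in a))
    return tuple(v / norm for v in a)


# Unit square in the xy-plane, nodes listed anti-clockwise seen from +z.
square = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
print(face_centre(square), area_vector(square), unit_normal(square))
```

For the unit square the area vector has length 1 and points along +z, the side from which the nodes appear anti-clockwise.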

Face regularity

The first constraint we want to impose is that the lengths of the edges of the faces are as close as possible to a reference length and we thus define the following least-squares quality function: Similarly, we want to impose that the angles of the faces are as close as possible to the angle of a regular polygon, which, for a polygon with P edges, is given by π(P − 2)/P. We thus define the quality function We have chosen these two functions so that they carry similar weight. This can be justified by considering an isosceles right triangle and deforming it so that the long edge, , is elongated by a small amount. This can be achieved by either keeping the two smaller edges at the same length and changing the right angle , or by keeping the right angle and elongating one of the smaller edges. In the second case, we have , and . Then . In the first case, we have , where and are small deformation parameters. Comparing the two cases, we have implying that if is measured in radians. We indeed found that for many p-cages the most regular configurations were obtained when these two functions have roughly the same weight.
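The two regularity terms can be sketched for a single face as plain sums of squared deviations. The normalization is an assumption, since the original expressions are not reproduced here, and the function name and the default reference length of 1 are illustrative.

```python
import math


def edge_and_angle_deviation(nodes, ref_length=1.0):
    """Sums of squared deviations of edge lengths and interior angles."""
    n = len(nodes)
    target_angle = math.pi * (n - 2) / n   # interior angle of a regular n-gon
    e_len = e_ang = 0.0
    for i in range(n):
        prev_pt, pt, next_pt = nodes[i - 1], nodes[i], nodes[(i + 1) % n]
        u = [a - b for a, b in zip(prev_pt, pt)]   # towards previous node
        v = [a - b for a, b in zip(next_pt, pt)]   # towards next node
        lu = math.sqrt(sum(c * c for c in u))
        lv = math.sqrt(sum(c * c for c in v))
        cos_a = sum(a * b for a, b in zip(u, v)) / (lu * lv)
        # acos returns values in [0, pi], so reflex angles are not detected.
        angle = math.acos(max(-1.0, min(1.0, cos_a)))
        e_len += (lv - ref_length) ** 2
        e_ang += (angle - target_angle) ** 2
    return e_len, e_ang


# A regular hexagon with unit edges should give two vanishing terms.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3), 0.0)
           for k in range(6)]
e_len, e_ang = edge_and_angle_deviation(hexagon)
print(e_len, e_ang)
```

Both terms vanish, up to rounding, for the regular hexagon and grow as the nodes are displaced.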

Face planarity and convexity

We must also impose that the faces are planar. We need to impose the planarity constraint numerically because imposing it analytically would involve solving a large number of algebraic equations, which would make the minimization algorithm computationally far too slow. We can do this by imposing that all the facelet vectors are parallel to each other and parallel to the face normal vector and define the quality function which corresponds to the sum of the squares of the projected lengths of the facelet vectors onto the face plane. This evaluates to 0 if the face is planar. We must also impose that each face is convex and, as the edge vectors are rotating anti-clockwise, the vector must point towards the outside of the face and so we must have . We can then use the following quality function: where is the Heaviside function.
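A possible numerical form of these two terms is sketched below. The planarity term sums the squared in-plane components of the facelet vectors, as described above. The convexity term counts the nodes at which consecutive edge vectors turn the wrong way relative to the face normal; this is a simple stand-in chosen here because the authors' exact expression is not recoverable from the text. All names are illustrative.

```python
import math


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def planarity_penalty(nodes):
    """Return (penalty, unit normal); the penalty is 0 for a flat face."""
    n = len(nodes)
    centre = [sum(p[k] for p in nodes) / n for k in range(3)]
    local = [[p[k] - centre[k] for k in range(3)] for p in nodes]
    facelets = [cross(local[i], local[(i + 1) % n]) for i in range(n)]
    total = [sum(f[k] for f in facelets) for k in range(3)]
    norm = math.sqrt(dot(total, total))
    normal = [c / norm for c in total]
    penalty = 0.0
    for f in facelets:
        along = dot(f, normal)
        in_plane = [f[k] - along * normal[k] for k in range(3)]
        penalty += dot(in_plane, in_plane)
    return penalty, normal


def convexity_violations(nodes, normal):
    """Count nodes where the boundary turns clockwise (Heaviside step)."""
    n = len(nodes)
    count = 0
    for i in range(n):
        e_in = [nodes[i][k] - nodes[i - 1][k] for k in range(3)]
        e_out = [nodes[(i + 1) % n][k] - nodes[i][k] for k in range(3)]
        if dot(cross(e_in, e_out), normal) < 0:
            count += 1
    return count


flat = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
bent = [(0, 0, 0), (1, 0, 0), (1, 1, 0.3), (0, 1, 0)]
dart = [(0, 0, 0), (2, 0, 0), (0.5, 0.5, 0), (0, 2, 0)]   # one reflex corner

for face in (flat, bent, dart):
    penalty, normal = planarity_penalty(face)
    print(penalty, convexity_violations(face, normal))
```

The flat square gives no penalty, the bent square gives a positive planarity penalty, and the dart is planar but has one convexity violation.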

Convexity of a p-cage

We must finally impose the condition that the p-cage is convex. As we will optimize the quality function using the Metropolis algorithm, we must use an expression which depends on as few points as possible so as to make the algorithm as fast as possible. To achieve this, we impose that two adjacent faces, i.e. sharing an edge, must be bent towards the centre of the cage. Mathematically, this implies that if the faces and are adjacent and touching at their respective edges and , the sum of the two vectors , defined in (3.3), must be pointing away from the centre of the cage; in other words, Note that and as a quality function we can use This expression does not strictly impose convexity in all configurations but we found that it works very well in the majority of cases. For some p-cages, we had to use another expression which is more expensive computationally but more rigorous. If we consider the normal unit vectors and of two adjacent faces, the p-cage will be convex if the distance between the base of the two vectors is smaller than the distance between their tips. In other words, we can use as a quality function
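The second, more rigorous criterion can be sketched as follows. The sketch assumes that each unit normal is based at the centroid of its face and points outwards, and that the penalty simply counts the adjacent pairs which violate the condition. These choices, and the function names, are illustrative assumptions rather than the expression used in the paper.

```python
import math


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def pair_is_convex(centre_a, normal_a, centre_b, normal_b):
    """True if the bases of the two normals are closer together than their tips."""
    tip_a = tuple(c + n for c, n in zip(centre_a, normal_a))
    tip_b = tuple(c + n for c, n in zip(centre_b, normal_b))
    return distance(centre_a, centre_b) < distance(tip_a, tip_b)


def cage_convexity_penalty(centres, normals, adjacent_pairs):
    """Number of adjacent face pairs that fail the base/tip distance test."""
    return sum(
        0.0 if pair_is_convex(centres[i], normals[i], centres[j], normals[j]) else 1.0
        for i, j in adjacent_pairs
    )


if __name__ == "__main__":
    # Three mutually adjacent faces of a cube centred at the origin.
    centres = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
    outward = centres
    pairs = [(0, 1), (0, 2), (1, 2)]
    print(cage_convexity_penalty(centres, outward, pairs))
```

Because the test involves only two centroids and two normals per adjacent pair, it stays local, although it needs the face normals and is therefore more costly than the first expression.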

Optimizing functional

Putting all of these functions together, what we have to do is to find p-cages which optimize the function where and are weight parameters. For , we use (3.12) most of the time except for some cages which prefer to assume a concave configuration and for which using (3.13) works better.

To perform this optimization, we have considered each hole edge repartition for each hole-polyhedron separately, hence fixing the connectivity from the start. We then started from a simple mechanical model of semi-rigid faces connected together by springs. The polygonal faces, connected by springs, were very crudely distributed around a sphere and the system was relaxed to obtain a better estimate of the face positions. We then used a Metropolis algorithm to optimize (3.14) with , and . The convexity parameters were usually set to and but in most cases the actual value did not matter as the p-cages were naturally assuming a convex configuration. For some cages had to be larger or smaller for the optimization to work well.

Once a good configuration was obtained, we used a combined downhill simplex [15] and Metropolis method to relax the configurations for varying and while keeping their sum equal to 2. We used 100 different values spread logarithmically in that interval. Defining the maximum relative deformation of edge lengths and face angles as we then took as the best cage the one which minimizes . Finally, we used a bisection method, varying and but keeping , to find the cage with the smallest deformation. We then ruled out any cages for which .

When determining the node positions of a p-cage numerically it is possible that some of the faces intersect each other. This occurs mostly for some p-cages derived from prisms. These cage configurations must be ignored. We have thus written a Python program which searches for such intersections using an algorithm derived by Möller [16].
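The Metropolis step of the optimization can be illustrated with a generic sketch. It is not the authors' program: the quadratic `toy_energy` stands in for the quality function (3.14), and the step size, temperature, number of steps and seed are arbitrary assumptions chosen only to make the example reproducible.

```python
import math
import random


def metropolis_minimize(energy, x0, steps=5000, step_size=0.1,
                        temperature=1e-3, seed=0):
    """Minimize energy(x) with single-coordinate Metropolis moves."""
    rng = random.Random(seed)
    x = list(x0)
    e = energy(x)
    best_x, best_e = list(x), e
    for _ in range(steps):
        i = rng.randrange(len(x))
        old = x[i]
        x[i] = old + rng.uniform(-step_size, step_size)
        e_new = energy(x)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / temperature):
            e = e_new
            if e < best_e:
                best_x, best_e = list(x), e
        else:
            x[i] = old  # reject the move and restore the coordinate
    return best_x, best_e


def toy_energy(x, target=(1.0, -2.0, 0.5)):
    """Hypothetical stand-in for the quality function: squared distance to a target."""
    return sum((a - b) ** 2 for a, b in zip(x, target))


if __name__ == "__main__":
    best_x, best_e = metropolis_minimize(toy_energy, [0.0, 0.0, 0.0])
    print(best_x, best_e)
```

In the setting of the paper the coordinates would be the node positions of the p-cage and the energy would combine the edge-length, angle, planarity and convexity terms with their weights. Moving one coordinate at a time is one reason why quality terms that depend on few points keep each step cheap.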

Results

As we have identified nearly 1000 near-miss p-cages, it is not possible to describe them all in the main text, but a full list is provided in the electronic supplementary material. The numbers of cages found for each polygon are listed in table 2. Some of the p-cages are regular and these can be determined using basic trigonometry.

Table 2

Number of connectivity-invariant convex p-cages for each polygon.

P	6	7	8	9	10	11	12	13	14	15	16	17	total
near-miss	2	4	12	32	38	63	69	99	117	141	183	228	988
regular	6	8	15	6	11	11	23	8	11	18	18	14	149

Regular connectivity-invariant homogeneous p-cages

One can first build regular p-cages from the prism hole-polyhedra. This amounts to making two pyramid-like structures, with holes, where the bases are glued together and removed. These regular p-cages are listed in table 3. All the faces are arranged symmetrically around the prism rotation axis. One can also obtain p-cages from non-symmetric arrangements of the faces, but they are all degenerate cages where some of the holes are pinched such that two opposite edges merge with each other; the resulting p-cages are equivalent to other p-cages (for example, tp is equivalent to Poc1). A similar construction can be done with antiprism hole-polyhedra as shown at the bottom of table 3.

Table 3

Prism and antiprism-based regular connectivity-invariant p-cages. . is the angle between the p-cage face and the base of the underlying prism. ( is provided in the electronic supplementary material.)

Placing a regular polygon on a vertex of a Platonic solid is equivalent to placing it on the faces of its dual polyhedron. This is quite an easy problem to solve and the results are provided on the left-hand side of table 4.

Table 4

Left: Regular connectivity-invariant p-cages derived from Platonic solids (except the Pcu ones), . Centre: Regular p-cages obtained from the truncated Platonic hole-polyhedra . is the angle between the p-cage faces and the face of the underlying solid. Right: Regular p-cages obtained from cuboctahedron, rhombicuboctahedron and rhombicosidodecahedron hole-polyhedra. Not all values of lead to regular p-cages. ( is provided in the electronic supplementary material.)

Placing faces on the vertices of a truncated Platonic solid is equivalent to placing a pyramid without its base on the face of the dual of the corresponding Platonic solid. The resulting p-cages are listed in the centre of table 4. One is then left with placing regular polygons on the vertices of the cuboctahedron, rhombicuboctahedron and rhombicosidodecahedron. The resulting regular p-cages are listed on the right-hand side of table 4. The detailed geometric derivations are provided in the electronic supplementary material.

Near-miss connectivity-invariant homogeneous p-cages

We will now describe the different properties that the near-miss p-cages exhibit. The level of deformation varies greatly between p-cages; not surprisingly, polygons with a large number of edges form more p-cages below the set deformation threshold. The number of near-miss p-cages with deformation below 1% is relatively small and we have listed them in table 5.

Table 5

Near-miss p-cages with up to 1% deformation grouped by type of polygonal faces. (None for .)

From the outset of our construction, we have avoided imposing that the p-cage faces are identical. We should hence find out whether this was indeed justified or whether the obtained p-cages do happen to have identical faces. As our computer program outputs for each p-cage the length of all the edges of face as well as the angles , it was easy to compare the edge length of any pair of faces and by computing , varying to determine the smallest value of the difference. We did the same with the corresponding angles (which we now label with both an edge and a face index). The relative deformation of the p-cage faces was hence obtained by computing We found that for most cages the relative deformations were smaller than the deformations of the faces themselves, as expected, but not small, justifying our decision to impose face connectivity invariance at graph level rather than geometrically. The only p-cages for which and are very small are some of the ones corresponding to a tiling of the face of a Platonic solid and some for which the angles are all regular. Full details are provided in the electronic supplementary material, which describes all the near-miss p-cages.

To test the accuracy of our minimization, we have performed it several times on the same p-cages and obtained the same result each time. Moreover, to evaluate the numerical accuracy of our procedure, we have also relaxed the regular p-cages and obtained deformations or equal to or smaller. The p-cages with the smallest deformation for each value of P are presented in figure 10. We can see from table 5 and figure 10 that the deformations are very small, below 0.1% for several of them, and are so small that they are impossible to detect with the naked eye. We also see that the p-cages exhibit a variety of features, which we will now describe.
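The face-to-face comparison described above, in which the edge lengths of two faces are compared under every cyclic relabelling of the edges and the smallest difference is retained, can be sketched as follows. The exact expression used in the paper is not reproduced here. Taking the largest absolute difference for each shift, and the name `min_cyclic_difference`, are assumptions made for illustration.

```python
def min_cyclic_difference(values_a, values_b):
    """Smallest, over cyclic shifts of values_b, of the largest absolute difference."""
    n = len(values_a)
    if len(values_b) != n:
        raise ValueError("both faces must have the same number of edges")
    return min(
        max(abs(values_a[i] - values_b[(i + shift) % n]) for i in range(n))
        for shift in range(n)
    )


if __name__ == "__main__":
    face_a = [1.0, 1.1, 0.9, 1.0]
    face_b = [0.9, 1.0, 1.0, 1.1]  # the same face with its edges relabelled
    print(min_cyclic_difference(face_a, face_b))
```

The same function applies unchanged to the lists of face angles. Dividing the result by the reference length, or by the regular angle, would give a relative deformation between faces.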

Figure 10

Least deformed (smallest ) near-miss p-cages for each value of . While they do look regular they are not so. (Online version in colour.)

In figure 11a, we present some p-cages obtained from prism and antiprism hole-polyhedra. They all appear as rings, which is well illustrated by 9p and 9p. These rings can then be nearly flat, like 7p or elongated, like tp. P-cages obtained from antiprisms look similar except that they have four neighbours, forcing them to assume ring-like structures such as 8a.

Figure 11

Some p-cages obtained from: (a) prisms and antiprisms, (b) Platonic solids. (Online version in colour.)

Several p-cages derived from prisms look very similar to p-cages obtained from antiprisms, the only difference being that the latter have an extra set of joined faces, which in the former becomes a tiny gap. They are listed on the left-hand side of table 6.

Table 6

Visual similarity between p-cages with different numbers of neighbour faces. Left: Prism-derived p-cages and antiprism-derived p-cages. Right: Archimedean solid-derived p-cages. Parentheses denote deformations exceeding 10%.

Figure 11b presents some typical p-cages derived from Platonic solid hole-polyhedra. They all correspond to an embedding of the polygon into the faces of the dual of the hole-polyhedron. When the numbers of hole edges are all equal, the p-cage is regular. P16 is different because the cube is also a square prism and this allows it to flatten like other prism-based p-cages, but this cannot happen for the other Platonic-based p-cages. Figures 12 and 13 present a range of p-cages derived from Archimedean solid hole-polyhedra. The majority of p-cages assume a sphere-like shape, such as Atc, Ato or Asc, but for some specific hole edge distributions the p-cage can look like a tiling of the faces of a Platonic solid, such as Att and Ato (figure 12), where each edge of the Platonic solid is where two faces of the p-cage are joined together, or Ati, Aco, Arco and Arcd (figure 13), where each edge of the Platonic solid is where two pairs of faces of the p-cage are joined together. Some others, on the other hand, look like a wire-frame construction of the Archimedean solids, such as Att, Ato or Atd.

Figure 12

Some p-cages obtained from Archimedean solids (part I). (Online version in colour.)

Figure 13

Some p-cages obtained from Archimedean solids (part II). (Online version in colour.)

As can be seen from the figures, some cages have very small holes while others have very large ones. For some cages with holes with a large value, the faces organize themselves to fill the gap of what could potentially be a very large hole. This is the case for Ati where for one group of holes, but where each face seems to have five neighbours when it actually has only three.

Conclusion

In this paper, we have defined near-miss connectivity-invariant p-cages as assemblies of nearly regular polygons with holes between them where all the faces are connectivity equivalent. We have then shown that each p-cage can be characterized by a planar graph, the hole-polyhedron, where the holes’ edges are distributed around the nodes of the graph. We have then enumerated all the distributions of hole edges on the hole-polyhedra compatible with the connectivity invariance of the p-cages, excluding those which would necessarily lead to edge length and angle deformation exceeding 10% and restricting ourselves to the polygons with 6–17 edges. We have then derived a quality function which measures the level of deformation of the p-cages and have used a numerical method to minimize that quality function for each of the possible configurations we had identified. This resulted in a large number of non-regular p-cages, most of which had a deformation exceeding a 10% threshold that we had set upfront, but still leaving around 1000 near-miss p-cages with deformation below 10% and 74 near-miss p-cages with deformation less than 1%. We proceeded by describing some properties of the obtained p-cages with 6–17 edges. Most near-miss p-cages have configurations similar to the regular p-cages, but some are different in that large holes are filled with the faces, leaving what looks like medieval castle loopholes. In our approach, we have not assumed any symmetry for the deformed cages, as the different faces of a p-cage could potentially be deformed differently. We have found that, for most cages, the faces were deformed slightly differently and that our assumption was thus justified. We have thus generated a very large list of potential geometries for nearly symmetric protein cages. 
While some p-cages exhibit large holes, probably making them of lesser use in biochemistry, many others have a pseudo-spherical shape, making them good geometrical candidates for shells which could contain some cargo.

5 in total

1. Gold nanoparticle-induced formation of artificial protein capsids.

Authors: Ali D Malay; Jonathan G Heddle; Satoshi Tomita; Kenji Iwasaki; Naoyuki Miyazaki; Koji Sumitomo; Hisao Yanagi; Ichiro Yamashita; Yukiharu Uraoka
Journal: Nano Lett Date: 2012-03-13 Impact factor: 11.189

2. The structure of trp RNA-binding attenuation protein.

Authors: A A Antson; J Otridge; A M Brzozowski; E J Dodson; G G Dodson; K S Wilson; T M Smith; M Yang; T Kurecki; P Gollnick
Journal: Nature Date: 1995-04-20 Impact factor: 49.962

3. Probing structural dynamics of an artificial protein cage using high-speed atomic force microscopy.

Authors: Motonori Imamura; Takayuki Uchihashi; Toshio Ando; Annika Leifert; Ulrich Simon; Ali D Malay; Jonathan G Heddle
Journal: Nano Lett Date: 2015-01-12 Impact factor: 11.189

4. An ultra-stable gold-coordinated protein cage displaying reversible assembly.

Authors: Ali D Malay; Naoyuki Miyazaki; Artur Biela; Soumyananda Chakraborti; Karolina Majsterkiewicz; Izabela Stupka; Craig S Kaplan; Agnieszka Kowalczyk; Bernard M A G Piette; Georg K A Hochberg; Di Wu; Tomasz P Wrobel; Adam Fineberg; Manish S Kushwah; Mitja Kelemen; Primož Vavpetič; Primož Pelicon; Philipp Kukura; Justin L P Benesch; Kenji Iwasaki; Jonathan G Heddle
Journal: Nature Date: 2019-05-08 Impact factor: 49.962

5. Artificial Protein Cage with Unusual Geometry and Regularly Embedded Gold Nanoparticles.

Authors: Karolina Majsterkiewicz; Artur P Biela; Sourav Maity; Mohit Sharma; Bernard M A G Piette; Agnieszka Kowalczyk; Szymon Gaweł; Soumyananda Chakraborti; Wouter H Roos; Jonathan G Heddle
Journal: Nano Lett Date: 2022-03-07 Impact factor: 12.262
