Literature DB >> 33490793

From Infection Clusters to Metal Clusters: Significance of the Lowest Occupied Molecular Orbital (LOMO).

Abstract

In this paper, the nature of the lowest-energy electrons is detailed. The orbital occupied by such electrons can be termed the lowest occupied molecular orbital (LOMO). There is a good correspondence between the Hückel method in chemistry and graph theory in mathematics; the molecular orbital, which chemists view as the distribution of an electron with a specific energy, is to mathematicians an algebraic entity, an eigenvector. The mathematical counterpart of LOMO is known as eigenvector centrality, a centrality measure characterizing nodes in networks. It may be instrumental in solving some problems in chemistry, and also it has implications for the challenge facing humanity today. This paper starts with a demonstration of the transmission of infectious disease in social networks, although it is unusual for a chemistry paper but may be a suitable example for understanding what the centrality (LOMO) is all about. The converged distribution of infected patients on the network coincides with the distribution of the LOMO of a molecule that shares the same network structure or topology. This is because the mathematical structures behind graph theory and quantum mechanics are common. Furthermore, the LOMO coefficient can be regarded as a manifestation of the centrality of atoms in an atomic assembly, indicating which atom plays the most important role in the assembly or which one has the greatest influence on the network of these atoms. Therefore, it is proposed that one can predict the binding energy of a metal atom to its cluster based on its LOMO coefficient. A possible improvement of the descriptor using a more sophisticated centrality measure is also discussed.

Entities: Chemical Disease Gene Species

Year: 2021 PMID： 33490793 PMCID： PMC7818624 DOI： 10.1021/acsomega.0c04913

Source DB: PubMed Journal: ACS Omega ISSN： 2470-1343

Introduction

Kenichi Fukui first published a paper on the so-called frontier orbital theory of chemical reactions in 1952.[1] His theory was supported by a series of works published by Robert Woodward and Roald Hoffmann on the conservation of orbital symmetry,[2−4] the so-called Woodward–Hoffmann rules. Since then, the frontier orbital theory has been applied not only to organic chemical reactions[5] but also to organometallic reactions[6] and even to surface reactions.[7] As an example of frontier orbitals, Figure illustrates those of hexatriene, calculated using the simple Hückel method. In this paper, the Hückel molecular orbitals (HMOs) are depicted using the HuLiS program.[8,9] Usually, the frontier orbitals refer to the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO). The two electrons occupying the HOMO are distinct from the other electrons in that they are essential to the reactivity of a molecule as they interact with the LUMO of another molecule.[1] This simplistic picture has made it possible to develop an understanding of a variety of physicochemical properties of molecules, including thermodynamic stability,[10] reorganization energies,[11] and electron transmission.[12]

Figure 1

Frontier orbitals of hexatriene calculated with the simple Hückel method are shown along with the lowest occupied molecular orbital (LOMO), which will be referred to as the central orbital in this paper. The unit of energy is the resonance integral β. In this paper, we will focus on the lowest occupied molecular orbital (LOMO). Though this orbital has not yet been found as useful as the frontier orbitals, we describe some interesting aspects of this orbital. We would respectfully like to call it the central orbital, or central molecular orbital, for some reasons. The term “frontier” literally means the part of a country which fronts or faces another country.[13] Its opposite would be the capital or the center of the country. On the analogy of Bohr’s model, the most stable, lowest-energy electron is found in the innermost orbital, which is situated closest to the center of the atom.[14] These grounds may rationalize our nomenclature; however, the most critical reason why we call it central will soon become clear, when we demonstrate that the orbital denotes the centrality of a network. In the chemistry literature, one may find the term “the lowest occupied molecular orbital”, but they are often a typo for the lowest unoccupied molecular orbital. As only a few exceptions, one can find the use of the term in a work by Imamura and Aoki[15] on polyyne oligomers and a work by Erkoç and Türker[16] on ammonia molecules encapsulated by fullerene. In the field of chemical graph theory, some researchers have paid attention to the LOMO. For example, Bonchev and co-workers pointed out that the expansion coefficients of the LOMO of a molecule are related with the number of self-returning walks (SRWs) on its molecular graph.[17−19] Since SRWs play an important role in the characterization of the center of a graph,[20] the LOMO coefficients may be of importance for determining atomic environments in molecules. They further developed their method based on the SRW, proposing topological atomic charges, valencies, and bond orders.[21] Estrada and Knight suggested in their seminal book of network theory[22] that the LOMO distribution has something to do with centrality. Recently, Redzepovic and co-workers have applied spectral graph theory to benzenoid hydrocarbons, finding that the π-electron energies of these molecules can be related with the energies of the HOMO and LOMO.[23]

Results and Discussion

Note for the Reader

The focus of this paper is on the use of centrality on molecules. LOMO coefficients are equivalent to a centrality measure, i.e., eigenvector centrality. In this paper, we will describe how useful the LOMO of a molecule is to find out an atom located at the center of an interatomic network. “Centrality” is a term borrowed from network theory. Expert readers, who may be familiar with it, can skip Section and go on to Sections and 2.4, where we discuss the possibility that eigenvector centrality (LOMO coefficients) can be used as a descriptor for the binding energy of an atom in an atomic cluster. Eigenvector centrality was introduced by Bonacich[24] in 1987 for the study of social networks. One may further trace it back to his early study in 1972.[25] Extensive studies on this centrality measure have been done in the literature.[22] In Sections and 2.2.2, using a simplistic model as an example, we demonstrate what eigenvector centrality is all about for the sake of chemists. If one takes a look at a network theory textbook, one can find the derivation of the equations regarding eigenvector centrality, but they might be far from evident for chemists. In Sections and 2.2.4, we rely on principles of quantum chemistry or quantum mechanics, which are much more familiar to chemists than network theory, to clarify the relation between LOMO and eigenvector centrality. We are sure that this paper will help the chemistry community to satisfactorily understand the significance of LOMO along with its underlying mathematical structure.

Example of the Use of Eigenvector Centrality and Its Relation to LOMO

Infection Network

In December 2019, a local outbreak of pneumonia was detected in Wuhan (Hubei Province, China) and was soon identified as being caused by a new coronavirus (COVID-19).[26] The first half of 2020 witnessed the rapid spread of COVID-19 in China and then globally. Understanding the propagation of viruses in social networks is of significant importance. On the basis of a simple model by Spizzirri,[27] we will demonstrate how the spread of infection happens and its relation with the centrality of a network. Figure shows a simplified model of the propagation of infection in a town network. In Figure a, town B is connected with two towns, A and C. Suppose the first infected person appears in town B. This will result in the same number of infected people in each town connected with town B while the source of the infection in town B will recover. For a while, there will be no infected person in town B, but the infection will soon return. This time, the virus will be transferred from both town A and town C, resulting in two infected people in town B while the infected in town A and town C will recover. This process will continue.

Figure 2

Evolution of the number of infected people in town networks. The number indicated below a town denotes the number of infected people in that town. The black arrows indicate the propagation of infection. Let us take a look at the propagation of infection in the town network shown in Figure b, in which every town has two neighboring towns. Again, suppose the first infected person appears in town B. The first step of the propagation is the same as that for the network detailed above. A difference emerges in the second step. The infection will return to town B from town A and town C, leading to two infected people in town B. Meanwhile, an exchange of infection will happen between town A and town C. And then the sources of infection in town A and town C will recover; however, a new infected person will appear in both towns. From then on, all of the towns will have at least one infected person, and the exchange of infection will continue in all of the pairs of neighboring towns.

Graph Theory

One may want to have a mathematical apparatus to deal with the evolution of the propagation of infection. Graph theory will be helpful for this purpose. In Figure , we show how one can mathematically represent the town networks introduced in Figure . The towns are converted into nodes (also called vertices) labeled using numbers (1, 2, and 3), while the roads are converted into lines called edges though in the drawing of Figure , there is no distinction. The network representation consisting of the nodes and the edges is called graph. Similarly, when the atoms and the bonds of a molecule are converted into nodes and edges, one can obtain a molecular graph, which has been extensively investigated in chemical graph theory.[28]

Figure 3

Process of reducing a network to mathematical entity: graph and adjacency matrix.

Process of reducing a network to mathematical entity: graph and adjacency matrix. One of the most useful matrices associated with graph G is the adjacency matrix A(G). This matrix can be constructed such that if the ith and jth nodes are adjacent (connected), the (i,j) and (j,i) entries of A(G) are 1; otherwise 0. One can perceive a similarity of A(G) to the Hückel matrix. We will return to this point later. The initial distribution of the infected population on a town network is represented by a vector of C(0) (see Figure ). Figure shows the first few steps of iterations to simulate the evolution of the infected population in the town network shown in Figure a. The product of AC(0) corresponds to the distribution of the infected population after a certain period of time has passed. Generally, the ith row element of the vector of AC(0) can be calculated from ∑ac(0), where a denotes the (i,j) entry of the adjacency matrix, either 0 or 1, and c(0) the initial infected population at the jth node (town); therefore, the ith entry of AC(0) counts the total infected population propagated from the nodes connected with the ith node. The vector of C(1), equivalent to AC(0), can also be used as the initial population for the next step. Then, AC(1) tells how infected people are distributed over the nodes after the next.

Figure 4

Parallel between mathematical and graphical representations for the spread of infection on the town network shown in Figure a. The vector of C(0) describes the initial distribution of the infected population and the vector of AC(0) corresponds to the distribution of the infected population after a certain period of time has passed. Additionally, after the same length of time has passed, the infected population grows as represented by the vector of AC(1). Since the process presented in Figure looks very similar to a Markov chain,[29,30] we make the link to it clear briefly in the Supporting Information (SI). Let us move on to a bit more complicated network (see Figure ). For this network, we will try out a variety of initial populations. We will simulate how the infected population grows at each node as the virus spreads. We performed three simulations: The initial infected person appears at the node of 1 (a), at the node of 2 (b), and at the node of 3 (c). One should notice that nodes 3 and 4 are identical due to symmetry.

Figure 5

Graph representing a town network in a small county (left) and its adjacency matrix (right).

Graph representing a town network in a small county (left) and its adjacency matrix (right). The results of the simulations for the growth of infected population are presented in Figure . Generally, one can see an exponential growth in every case. The most rapid growth is observed when the initial person is set to appear at the node of 2. The slowest growth is observed for the case where the initial person is set to appear at the node of 1. This may be because town 1 is rural whereas town 2 is urban in the county. Or one could say that town 1 is “frontier”, whereas town 2 is “central”.

Figure 6

Growth of the number of infected people at each node of the network shown in Figure ; node 1: blue, node 2: orange, and node 3 (equivalent to node 4): gray. The images on the right indicate the growth of infected population representing the initial distribution of the infected population. The initial infected person appears at the first node (a), second node (b), and third node (c). Figure shows that the initial distribution of infected people will affect the pattern of the growth of the number of infected people. But we demonstrate in Figure that the ratio of the population does not vary after a sufficiently long time has passed whatever the initial distribution was. At every iteration step, the number of infected people at each node was divided by the number of infected people at the node with the largest number of infected people of the four, resulting in Figure . Thus, in this figure, the maximum value does not exceed 1. At the very beginning, the ratios oscillate greatly, but they converge soon. Remarkably, this figure shows that different initial distributions converge to the same distribution. This can be proved mathematically as will be shown later.

Figure 7

Evolution of the ratio of the number of infected people at each node shown in Figure ; node 1: blue, node 2: orange, and node 3 (equivalent to node 4): gray. The ratios at the 30th iteration step are shown explicitly. The images on the right represent the initial distribution of the infected population. The initial infected person appears at the first node (a), second node (b), and third node (c). This simulation implies one can make a distinction of the towns in a county. Based on the converged distribution of the ratio of infected people, the towns can be ranked. Town 2 holds the largest number of infected people of the four while the ratio of infected people in town 1 stays low. The ratios of towns 3 and 4 fall in between. The distinction of the towns is likely to be traced back to the extent to which the town is urbanized. Thus, the ratios shown in Figure can be used as a measure of centrality of the nodes in a network. This is the eigenvector centrality.[24,31] The reason why the name includes “eigenvector” will become clear soon. As more sophisticated models for epidemics on networks, instead of the one presented above, one may use two fundamental models known in the literature:[32−34] Susceptible-Infected-Susceptible (SIS) and Susceptible-Infected-Recovered (SIR) models, which are extensions of the classical models used in epidemiology that consider the influence of the topology of a network on the propagation of an epidemic, where the spreading of an infectious disease on the network have been modeled representing individuals as nodes and the contacts between them as edges. To better incorporate the dynamical process of the spread of epidemics into the model, one needs to consider a ratio of the infection birth rate and the infection death rate, which is called the epidemic threshold and determines whether an infection becomes epidemic or not.[34,35]

Chemical Graph Theory

It is time to start chemistry. Not only infection networks but also molecules can be viewed as a graph. A simple undirected graph is used for the description of molecules, in which the nodes correspond to the atoms in the molecule and the edges correspond to the chemical bonds.[36] Such a graph is termed the molecular graph. π-conjugated compounds, whose π-electronic structures are well described with the simple Hückel model, have been a good target for the application of graph theory because there is a one-to-one correspondence between the Hückel Hamiltonian (Hückel matrix) and the adjacency matrix.[37,38] The relation between the Hückel Hamiltonian H and the adjacency matrix A can be written as[39]where I denotes the identity matrix and α and β represent the Coulomb and resonance integrals for sp2-hybridized carbon atoms and C(sp2)–C(sp2) bonds, respectively. By setting α and β to 0 and 1, respectively, the Hückel Hamiltonian can be reduced to the adjacency matrix. One should note that α corresponds to the energy origin of the eigenvalue spectrum while β the energy unit of the spectrum, so the arbitrariness of setting the values of α and β does not undermine the essential features of the spectrum. In Figure a, one can see a process of reducing a chemical structure to a graph representation. The π-conjugated molecule of triafulvene includes four sp2 carbon atoms and four C(sp2)–C(sp2) bonds, which are, respectively, converted into nodes and edges in the so-called Hückel graph (hydrogen-depleted graph).[40] The Hückel graph for triafulvene shares the same network structure as the infection network simulated in the previous section. We will see a remarkable correspondence between the simulation of the evolution of the infected population on the network and the electronic structure of triafulvene.

Figure 8

(a) Process of reducing the chemical structural formula for triafulvene to the Hückel graph. (b) Hückel Hamiltonian for triafulvene and the adjacency matrix for the Hückel graph.

(a) Process of reducing the chemical structural formula for triafulvene to the Hückel graph. (b) Hückel Hamiltonian for triafulvene and the adjacency matrix for the Hückel graph. Figure shows the central orbital—LOMO—of triafulvene calculated with the Hückel method. For comparison, the frontier orbitals are also shown. By and large, one can perceive that the frontier orbitals have a large amplitude at the atoms located at “frontier” areas, whereas the central orbital has a large amplitude at the atom located at the “central” area. Since chemical reactions are prone to occur at terminal atoms in a molecule, that the frontier orbitals are localized at the frontier region can be regarded as reasonable in terms of the fact that they play a great role in chemical reactions.

Figure 9

(a) Eigenspectrum of the Hückel matrix for triafulvene. The distributions of the frontier orbitals (HOMO and LUMO) and the central orbital (LOMO) are shown. (b) Amplitudes of the LOMO coefficients of triafulvene are shown. The numbers in the parentheses indicate a normalized value such that the largest amplitude becomes 1. The main purpose of this paper is to shed a new light on the role played by the LOMO. The amplitudes of the LOMO coefficients are indicated in Figure b. The coefficient of the ith carbon atom (c) is normalized such that ∑c2 = 1 in the Hückel method. As we have done in Figure , we renormalize these coefficients such that the largest coefficient takes a value of 1. The resultant values are shown in parentheses in Figure b. One can see a remarkable correspondence between the LOMO coefficients and the converged distribution of the ratios of the infected populations shown in Figure . This is not accidental, as will be proved below using the basic principles of quantum mechanics.

Proof

Both adjacency matrix A for graph theory and Hamiltonian matrix H for the electronic structure calculation are Hermitian, that is A = A† and H = H†, where † denotes conjugate transpose. The eigenvectors of a Hermitian matrix form a complete set, meaning that if the matrix is of order N, any vector of dimension N can be written as a linear combination of the orthonormal eigenvectors.[41] Let ψk denote the kth molecular orbital associated with the Hamiltonian of H with a dimension of N × N. One can define an initial wave function as follows:Thanks to the Hermitian property, one can even form the initial wave function such that Ψ(0) coincides with the vector of C(0), which describes the initial distribution of the infected population used in the previous section. The procedure to trace the evolution of the infected population shown in Figure is reproduced below but we will use the Hamiltonian and the wave function of Ψ(0), instead of the adjacency matrix A and the vector of C(0). The operation of the Hamiltonian Ĥ on the wave function Ψ(0) results inWhen we remind ourselves that the Schrödinger equation states that Ĥψk = εψk, where ε denotes the eigenvalue of the molecular orbital of ψk, eq leads toUsing this, we define a new wave function as followsThis process corresponds to the step of updating C(0) to C(1) in Figure . By repeating this procedure n times, we arrive atSuppose that the absolute value of ε1 is the largest of the N eigenvalues. By parenthesizing Ψ( with ε1, one obtainsSince the absolute value of ε1 is the largest, ; therefore, Ψ( converges to ε1c1ψ1. Note that ε1c1 is just a coefficient of the eigenvector of ψ1, so the normalization of ε1c1ψ1 results in ψ1. This suggests that by multiplying an arbitrary wave function (or an arbitrary vector) from the left by the Hamiltonian (or adjacency matrix) repeatedly, one can see the convergence of the wave function to an eigenvector of the Hamiltonian, i.e., ψ1. This means that whatever the initial distribution is, the process yields the same distribution. This process is well known in mathematics as the power method.[42] Next thing we need to do is to check whether ψ1 is the LOMO. Since the LOMO is the most stable orbital, there is no wave function node. This is the eigenvector whose all components have the same sign.[43] The Perron–Frobenius theorem guarantees that when all of the elements of a matrix are positive (positive matrix), the largest eigenvalue of the matrix is not degenerate and all of the elements of the corresponding eigenvector (a.k.a. Perron–Frobenius vector) have the same sign.[44] Any other eigenvector does not show this property due to orthogonality. Some of the elements must have the opposite sign. From this theorem, it is clear that ψ1 is the LOMO because all of the LOMO coefficients have the same sign and the eigenvalue corresponding to ψ1, namely, ε1, is the largest. Someone versed in this theorem may worry about the distinction between the positive matrix and non-negative matrix. Actually, the adjacency matrix is not the positive matrix because it includes some zeros. By adding a small perturbation to the adjacency matrix, one can convert it to a positive matrix from the non-negative matrix. If the perturbation is substantially small, the eigenvalues and eigenvectors are hardly affected. This point is discussed further in the SI. For a detailed mathematical proof of the Perron–Frobenius theorem, one can consult the mathematical literature,[45] but here we provide a proof of a certain kind from the point of view of quantum mechanics, which may be helpful for chemists. A matrix version of the Schrödinger equation reads,[46]whereand ε = diag (ε1, ε2,..., ε), where diag(...) denotes a diagonal matrix in which the ith element in the parentheses corresponds to its (i,i) entry. Note that in the Hückel method, the overlap matrix is approximated by the identity matrix, so it does not appear in eq . The (i,j) entry of C corresponds to the jth MO expansion coefficient at site (atom) i. The molecular orbital of ψ is equivalent to the column vector of . Using C, eq can be rewritten as[46]If one uses the adjacency matrix of A instead of H, eq can be changed intoThe relation between ε and λ is ε = α + βλ.[47] By setting α to the origin of energy, we have ε = βλ. AC has the dimension of N × 1 because A and C have the dimensions of N × N and N × 1, respectively. AC is a column vector. Its ith row element is ∑a[C], where [C] denotes the kth row of the vector of C, which is equal to C. Of course, λC is also a column vector. Its ith row element is λ[C], which is equal to λC. Therefore, we obtainSuppose the lth eigenvector corresponds to the LOMO. The following equality holds trueThis is because all of the lth MO expansion coefficients have the same sign. One should also note that a ≥ 0. As mentioned above, if we are allowed to add a small positive number to each nonzero entry of A as a perturbation, we have a > 0, and we assume this in the following. For the mth eigenvector, where m ≠ l, in contrast, we havebecause some of the coefficients have the different sign from the rest. Only the LOMO is allowed to let all of the coefficients have the same sign. This is not the case for the other orbitals, which have to be orthogonal to the LOMO. By multiplying both sides of the inequality in eq by |C| and summing them up over i, we have ∑|λ||C||C| < ∑∑a|C||C|. By interchanging i and k on the right-hand side of this inequality, we get ∑|λ||C||C| < ∑∑a|C||C|. Since A is a symmetric matrix, a = a. Using this equality and changing the order of summation, ∑|λ||C||C| < ∑∑a|C||C| holds. Using eq on the right-hand side of this inequality, ∑|λ||C||C| < ∑|λ||C||C| is obtained. This leads toThis inequality states that the eigenvalue associated with the LOMO, i.e., |λ|, is larger than any other eigenvalues. Note that the mth eigenvector is an arbitrary one. This is an outcome of the application of the Perron–Frobenius theorem in conjunction with the basic principles in quantum mechanics to the Hückel matrix or the adjacency matrix. Since all of the LOMO coefficients have the same sign, it is clear from eq that λ > 0. Eq can be read asWhen we return to the Hückel energy scale, eq is converted intowhere λβ is the Hückel energy of the LOMO. One should note that β < 0.

Example

Let us look at an example of the LOMO. In Figure , we show the LOMO of calicene. The energy of the LOMO is 2.36β, so the Perron–Frobenius theorem guarantees that the other orbitals fall in a range of 2.36β < ε < −2.36β. This is indeed satisfied as can be seen from the eigenspectrum shown in Figure .

Figure 10

Hückel energy spectrum calculated for calicene (left) and the distribution of the LOMO coefficients (right).

Hückel energy spectrum calculated for calicene (left) and the distribution of the LOMO coefficients (right). We turn to the LOMO coefficients of calicene. The largest amplitude is found on the C7 atom while the smallest on the C2 and C3 atoms. This implies that the C7 atom is the center of the molecule whereas the C2 and C3 atoms are located farthest from the center. Thus, one can discriminate an atom from another depending on the topological distance from the center of the molecule. The LOMO coefficient corresponds to eigenvector centrality. This is not the only way to assess the centrality of a node in a graph. An important alternative would be degree centrality, which is defined as the number of edges incident upon a node.[48] In the chemical terminology, degree centrality is the same as the coordination number. When counting the coordination number for a carbon atom in a molecule represented by the Hückel graph, we do not take hydrogen atoms into account because they are not incorporated in the Hamiltonian at the Hückel level. The carbon atoms in calicene can be divided into two groups using the degree centrality (coordination number): the first group includes the C7 and C8 atoms, whose coordination number is 3. The other carbon atoms are classified into the second group: they have the coordination number of 2. That all of the carbon atoms except for C7 and C8 have the same centrality when relying on the method of degree centrality may run counter to our intuition for the network. A major drawback of degree centrality is that it only takes into account local information. For example, a node with a few important neighbors may have more global influence than a node with many less important neighbors.[49] Thus, one uses eigenvector centrality as a refinement of degree centrality. Generally, eigenvector centrality can be used to measure the influence of a node in a network.[50] A recursive definition of eigenvector centrality, in which the eigenvector centrality of a vertex is defined as being proportional to the sum of the eigenvector centrality of the vertices to which it is connected, is also helpful for understanding the importance of eigenvector centrality.[51] This is a way one can understand what the LOMO is all about.

Molecular Graph as a Network for Electron Flow

What walks on the networks detailed in the former part of this paper (Sections and 2.2.2) may be a human being or virus. In contrast, on the molecular graph, electrons are transported. Chemical graph theory has recently been found useful for understanding how electrons flow in molecules.[38,52−54] This is reminiscent of the theory elaborated by Bonchev and co-workers mentioned in the introductory part of this paper. What Bonchev and co-workers presented in the chemistry literature can be written as follows[18]where C2 is the square of the LOMO coefficient at atom i and SRW denotes the number of self-returning walks (SRWs) of length k starting and ending at atom i. It should be noted that in the mathematics literature, this equation can be traced back to Wei’s work in 1952,[55] noted by Berg in 1958,[56] and later reproduced by Cvetković, Rowlinson, and Simić in their book.[57] In Figure , the enumeration of SRWs is exemplified by the counting of SRW14 for the graph corresponding to triafulvene. SRWs are also of importance in the method of moments, which has been applied to a variety of solids to establish relationships between geometric and electronic structures.[58,59]

Figure 11

Illustration of how to count SRW14 for the graph corresponding to triafulvene.

Illustration of how to count SRW14 for the graph corresponding to triafulvene. Since ∑SRW corresponds to the total number of SRWs of length k on the molecular graph, the right-hand side of eq can be viewed as the limit for the relative number of SRWs. Bonchev and co-workers proposed that SRW in the molecular graph can be interpreted as electron motion near atom i.[18,21] Thus, LOMO coefficients are expected to have something to do with the relative frequencies of such electron motion. This further can be correlated with Feynman’s idea of accounting for all possible electron paths.[60] Nagao, Nishikawa, and Aono demonstrated that the transition amplitude in the Feynman path integral method can be obtained by summing contributions from all independent trajectories.[61] Each SRW in a molecular graph would be interpreted as an independent Feynman’s electron trajectory.[21]

Atom Binding Energy

As has been clarified in the previous sections, LOMO coefficients will be helpful for identifying an atom in a molecule where electrons frequently visit: The larger the LOMO coefficient of an atom, the more frequently electrons pass by the atom. Also, according to network theory, the larger the LOMO coefficient of an atom, the more likely it is to be influential in the network. Based on these facts, we have conceived that the binding energy of an atom to a host (another atom, molecule, solid, or surface) may be correlated with the LOMO coefficient of the atom. The LOMO coefficient is likely to be a good descriptor for the prediction of the binding energy. This is because the binding energy can be traced back to the electronic communication between the atom and those of the host. In the remaining part of this paper, we will see a correlation between the binding energy of a metal atom to its cluster and the LOMO coefficient of the corresponding atom. Since the theory of LOMO coefficients have been developed relying on the simple Hückel theory, we investigate s-block metal clusters, whose electronic structures are expected to be well described by the Hückel method: Li, Be, Na, and Mg clusters were thus chosen. Fan et al.[62] found that lithium ions are likely to diffuse on graphene, an anode material for lithium-ion batteries, and as the lithium content increases, small lithium clusters can be formed thereon. Such nanoclusters can potentially nucleate Li dendrites, leading to device failure. Based not only on such a practical point of view but also on fundamental interest, Li clusters have attracted researchers’ attention.[63−65] Also, high-spin Li clusters have caught the attention of theoreticians because of their novel bonding motifs.[66−68] Before we take a look at the LOMO coefficients of these clusters, we need to obtain their structures. The prediction of the structure of the ground state of a cluster, which corresponds to the global minimum of the potential energy surface, is a difficult task. Generally, the number of local minima increases exponentially as the size of the cluster increases.[69] The implementation of an evolutionary approach known as particle swarm optimization (PSO) capable of finding the global minimum of the potential energy surface of clusters was reported by Call et al.[70] The PSO algorithm was originally proposed by Kennedy and Eberhart in 1995, inspired by the social behavior of flocks of birds and fish.[71] Recently, Ma and co-workers have developed a crystal structure analysis by particle swarm optimization (CALYPSO) methodology, which has made it possible to apply the PSO algorithm to the global optimization of extended systems.[72−74] The CALYPSO method has been developed so that it can also be applied to cluster structure prediction.[75] In this study, to effectively obtain stable lowest-energy structures of Li, Be, Na, and Mg clusters, we used CALYPSO[72−74] in conjunction with the periodic density functional theory (DFT) code of Vienna ab initio simulation package (VASP) 5.4.4.[76−79] The number of structures produced at each step—population—was set to 10. The number of CALYPSO steps—generations—was fixed to 10. The first generation was generated randomly. For the next generation, 20% of the population were generated randomly while the others were generated on the basis of the previous generation using a local version of PSO algorithm with Metropolis criterion.[75] All of the structures generated were locally optimized using VASP. Since VASP employs periodic boundary conditions, an orthorhombic unit cell with a dimension of 15 Å × 15 Å × 15 Å or larger was built and the cluster was located at its center so that the vacuum surrounding the cluster could be regarded as thick enough to avoid interactions between the cluster and its replica in the neighboring cell. The generalized gradient approximation (GGA) was adopted with the functional described by Perdew, Burke, and Ernzerhof (PBE).[80] The Kohn–Sham equations were solved with a plane-wave basis set using the projector-augmented wave (PAW) method.[81,82] The cutoff energy for the plane-wave basis set was set to 300 eV. The break conditions for the electronic self-consistent-field (SCF) loop and the optimization loop were set to 1.0 × 10–4 and 1.0 × 10–3 eV, respectively. Only Γ point was used for the Brillouin zone sampling. Spin-polarization calculations were performed for all of the systems. The most stable structure thus found was singled out as the initial configuration for the reoptimization with more accurate convergence criteria. The cutoff energy was increased to 500 eV. A tighter SCF convergence criterion of 1.0 × 10–5 eV was used. The all band simultaneous update of wave functions[83] was selected for the algorithm for SCF. The atoms in the cluster were relaxed until the forces on them become less than 0.05 eV/Å. Figure shows optimized structures of some Li clusters. The atomic coordinates of these clusters are shown in the SI. We take clusters with at least three or more symmetrically distinct atoms into consideration. The reason for this will become clear soon.

Figure 12

Optimized structures of Li5 (a), Li8 (b), and Li10 (c). The LOMO coefficients for the 2s atomic orbitals on symmetrically distinct atoms calculated at the level of the extended Hückel method are indicated. The point group of each cluster is shown in parentheses. The structures of the Li5 and Li10 clusters seem consistent with what were reported in preceding studies.[63,65] As for the Li8 cluster, a D5 structure is reported in the literature,[63,64] while our CALYPSO calculation pointed to the C3 structure. The VASP-calculated energy of the C3 structure is almost the same as that of the D5, but the C3 was found slightly lower in energy by 0.04 eV. The coexistence of multiple Li isomers at room temperature was suggested by Fournier et al. based on their DFT calculations.[84] These clusters in Figure were optimized at the level of DFT, while the extended Hückel method was used for the generation of the LOMO coefficients. The reader who has read thus far surely appreciates the value of the Hückel method. However, the metal clusters cannot be dealt with in the framework of the simple Hückel method, so we adopted the extended Hückel method.[85] The standard extended Hückel parameters were taken from the literature[86] and those used in this study were tabulated in the SI. For the extended Hückel calculation, Yet Another extended Hückel Molecular Orbital Package (YAeHMOP) was used.[87] The off-diagonal elements of the Hamiltonian were evaluated using the Wolfsberg–Helmholz approximation.[88] The proportional constant in this approximation was set to 1.75 as was suggested by Hoffmann.[85] Following the convention, four Slater-type basis functions of 2s, 2p, 2p, and 2p were used for the Li atoms. The contributions of the 2p orbitals to the LOMO were found negligibly small. Their coefficients are always lower than 0.05. The LOMO coefficients shown in Figure are those for the 2s basis functions. For the visualization of the metal clusters, we use VESTA.[89] In Figure , bonds are drawn between Li atoms separated by 3.1 Å or less. The distance of 3.1 Å corresponds to the double of the standard value of the metallic radius of Li.[90] In the cluster of Li5, one can perceive a correlation between the coordination number and the LOMO coefficient. Without using the LOMO coefficients, one may rank the centrality of the Li atoms in the Li5 cluster based on the coordination number. However, things are not so simple when moving to a more complicated cluster. In the Li8 cluster, the Li7 atom has the coordination number of 7 whereas that of the others is 1. Thus, there is no doubt that the Li7 atom has the highest centrality, whereas the other atoms, namely, the peripheral atoms, share the lowest centrality. Nevertheless, we see a variation in the LOMO coefficients for the peripheral atoms. This implies that the centrality may vary from atom to atom even if these atoms have the same coordination number. Complicated as the geometry of the Li10 cluster is, one may manage to recognize the correspondence between the coordination number and the centrality. Such a problem as the one we have seen in the Li8 cluster will not arise. Let us take a look at whether the LOMO coefficient is correlated with the binding energy of an atom in a cluster. We define the binding energy for the metal cluster with a composition of M as followswhere E(M) and E(M) are the energies of the M atom to be removed and the parent M cluster, respectively. The removal of the M atom results in the M cluster, whose energy is expressed as E(M). For this energy calculation, the geometry of the M cluster was kept fixed to the same as that adopted in the parent cluster structure. These energies were obtained from the calculations at the DFT level with VASP. In Figure , one can see correlations between the binding energy for a Li atom in the Li clusters and the LOMO coefficient of its 2s orbital. The coefficients of determination (R2) calculated for these plots are very high, approaching 1. This appears to be a justification of the idea that the LOMO coefficient could be a good descriptor for the atom binding energy in atom assemblies.

Figure 13

Scattering plot of the LOMO coefficient for the Li atom to be removed vs. the binding energy of that Li atom is obtained for the Li5 (a), Li8 (b), and Li10 (c) clusters. The coefficient of determination (R2) of the linear regression line is also shown. If the number of points in these plots was less than 3, any plots would result in the R2 value of 1, which would hinder one from clarifying whether or not there is a correlation between them. This is why the metal clusters consisting of at least three or more symmetrically distinct atoms were singled out for analysis. In the clusters of Li5 and Li10, the correspondence between the measures of centrality based on the coordination number and the LOMO coefficient and the binding energy is apparent. One would argue that such a geometrical feature as the coordination number might be sufficient for the prediction of the binding energy. However, the Li8 cluster would oppose this argument. Owing to the lack of correspondence between the two measures of centrality in this cluster, this system begs for further analysis. Since all of the peripheral Li atoms in the Li8 cluster have the coordination number of 1, they are expected to be equally bound to the cluster as long as we rely on the centrality measure of the coordination number. However, the binding energies of these atoms vary. As another factor which affects the binding energy, the distance of the atom measured from the center of the cluster occurs to us. The center of the Li8 cluster is the Li7 atom. How far a Li atom is from the center of the cluster is shown in Figure a. A scatter plot of the distance vs. the binding energy is generated as shown in Figure b, where one cannot see any correlation. This means that the binding energies of the Li atoms in the Li8 cluster cannot be predicted only on the basis of the geometrical features. It is the LOMO coefficient that is helpful for predicting the binding energy.

Figure 14

(a) Selected Li–Li bond distances in the Li8 cluster. (b) Scatter plot of the Li–Li distance vs. the binding energy.

(a) Selected Li–Li bond distances in the Li8 cluster. (b) Scatter plot of the Li–Li distance vs. the binding energy. The results for the other clusters are presented in the SI. Here, just briefly, one can, by and large, observe a good correlation between the LOMO coefficient and the binding energy for the Be, Na, and Mg clusters, too. For the clusters of Na5 and Mg10, the highest R2 value of 1 was marked (see Figure a,b). The lowest R2 value was seen in the analysis of Mg8 cluster (see Figure c).

Figure 15

Top: Optimized structures of Na5 (a), Mg10 (b), and Mg8 (c). The distance thresholds for the visualization of the Na–Na and Mg–Mg bonds are set to 3.8 and 3.2 Å, respectively, based on the metallic radii of Na and Mg.[90] The LOMO coefficients for the 3s atomic orbitals on symmetrically distinct atoms calculated at the level of the extended Hückel method are indicated. The contributions of the 3p orbitals to the LOMO are smaller than 0.04. The point group of each cluster is shown in parentheses. Bottom: Scattering plot of the LOMO coefficient for the metal atom to be removed vs. the binding energy for that atom is generated for the Na5 (a), Mg10 (b), and Mg8 (c) clusters. The coefficient of determination (R2) of the linear regression line is also shown. All in all, these structures look consistent with those reported in the literature.[91−93] Given that higher correlations are found for the clusters with higher symmetry, the symmetry of the cluster may also have something to do with the binding energy. We should mention that there is also a problem in the use of eigenvector centrality (LOMO), which problem is that this centrality has the same value for all of the nodes in a regular graph, i.e., a graph in which all of the nodes have the same degree.[94] This may be particularly serious in the case of atomic clusters whose constituting atoms all have the same coordination number. The Mg8 cluster may fall into this class because all of the atoms have the same coordination number of three. In the Mg8 cluster, it is easy to see that for instance, Mg1 participates in one triangle while Mg5 in two, so they are not equivalent. At the simple Hückel level, eigenvector centrality (LOMO) does not distinguish such important structural differences. However, at the extended Hückel level, since not only the through-bond interaction but also the through-space interaction is included, the atoms are ranked as shown in Figure c. Nevertheless, the relatively lower R2 value of 0.73 for the Mg8 cluster suggests a need for amelioration. In 2005, Estrada and Rodríguez-Velázquez[94] proposed a measure of centrality whereby nodes are ranked according to their participation in different network subgraphs. This is called subgraph centrality, which is defined for the ith node in a graph represented by the adjacency matrix of A asThe use of subgraph centrality was later contextualized in molecular systems.[95,96] Estrada and co-workers proved a theorem that opens for extending this context in a quantum chemical context, which allowed the use of Hermitian matrices such as those used in quantum chemistry.[95] By assuming a simple relation between the Hamiltonian and adjacency matrices, namely, H = −A,[96] subgraph centrality for the ith atom in an atomic cluster may be calculated fromwhere Cμ is the kth MO expansion coefficient associated with basis function μ centered on the ith atom and ε is the kth MO energy at the extended Hückel level. Using this centrality, the R2 value for the correlation in the Mg8 cluster is increased to 0.76 (see the SI).

Conclusions

The importance of electrons with the highest energy in a molecule or empty orbitals that can accept them is well received in chemistry. On the other hand, the nature of the lowest-energy electrons in a molecule or the orbital that holds them, namely, the lowest occupied molecular orbital (LOMO), has not been given much attention, but we had the audacity to name it the central orbital, exploring the possibility of its use. Thanks to graph theory, we have learned that LOMO quantifies which atoms have the greatest effect on a network of atoms. LOMO pinpoints which atom is the most important in an aggregate of atoms, or cluster. On the other hand, in an aggregate of atoms, how strong the interaction between one atom and another can also be a measure of centrality. We investigated the relationship between the LOMO coefficient of an atom in a cluster and its binding energy using some s-block metal clusters as examples, and found that there is a fairly good correlation between them. Using a more sophisticated centrality measure such as subgraph centrality, one may further improve the correlation. There is an urgent need to develop descriptors for predicting adsorption energy in catalytic chemistry because machine learning for catalytic informatics and catalyst development has been active in recent years and seeking for them.[97−99] We hope that the utility of LOMO as a descriptor for the binding energy as well as centrality, as revealed in the present study, will provide some hints for these studies.

26 in total

1. Generalized Gradient Approximation Made Simple.

Authors:
Journal: Phys Rev Lett Date: 1996-10-28 Impact factor: 9.161

2. Ab initio molecular-dynamics simulation of the liquid-metal-amorphous-semiconductor transition in germanium.

Authors:
Journal: Phys Rev B Condens Matter Date: 1994-05-15

Review 3. Networks and epidemic models.

Authors: Matt J Keeling; Ken T D Eames
Journal: J R Soc Interface Date: 2005-09-22 Impact factor: 4.118

4. Theoretical organometallic chemistry.

Authors: R Hoffmann
Journal: Science Date: 1981-03-06 Impact factor: 47.728

5. Structural acid-base chemistry in the metallic state: how μ3-neutralization drives interfaces and helices in Ti21Mn25.

Authors: Timothy E Stacey; Daniel C Fredrickson
Journal: Inorg Chem Date: 2013-01-31 Impact factor: 5.165

6. Deciphering the Structural Evolution and Electronic Properties of Magnesium Clusters: An Aromatic Homonuclear Metal Mg₁₇ Cluster.

Authors: Xinxin Xia; Xiaoyu Kuang; Cheng Lu; Yuanyuan Jin; Xiaodong Xing; Gabriel Merino; Andreas Hermann
Journal: J Phys Chem A Date: 2016-09-20 Impact factor: 2.781

7. Computational and spectral studies of 3,3'-(propane-1,3-diyl)bis(7,8-dimethoxy-1,3,4,5-tetrahydro-2H-benzo[d]azepin-2-one).

Authors: S Anil Kumar; B L Bhaskar
Journal: Heliyon Date: 2019-09-30