Literature DB >> 33449681

Multilevel Density Functional Theory.

Gioia Marrazzini¹, Tommaso Giovannini², Marco Scavino¹, Franco Egidi¹, Chiara Cappelli¹, Henrik Koch¹.

Abstract

Following recent developments in multilevel embedding methods, we introduce a novel density matrix-based multilevel approach within the framework of density functional theory (DFT). In this multilevel DFT, the system is partitioned in an active and an inactive fragment, and all interactions are retained between the two parts. The decomposition of the total system is performed upon the density matrix. The orthogonality between the two parts is maintained by solving the Kohn-Sham equations in the MO basis for the active part only, while keeping the inactive density matrix frozen. This results in the reduction of computational cost. We outline the theory and implementation and discuss the differences and similarities with state-of-the-art DFT embedding methods. We present applications to aqueous solutions of methyloxirane and glycidol.

Entities: Chemical Disease Gene

Year: 2021 PMID： 33449681 PMCID： PMC7880574 DOI： 10.1021/acs.jctc.0c00940

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

The study of the energetics and physico-chemical properties of large molecular systems is one of the most challenging problems in quantum chemistry.[1] Many processes of chemical interest take place in solution,[2−6] in biological matrices,[7,8] or at the interfaces between different materials.[9−11] The large size of such systems poses theoretical and computational challenges because high-level correlated electronic structure methods are usually unfeasible because of their high computational cost and unfavorable scaling.[12,13] A good compromise between accuracy and computational cost is provided by density functional theory (DFT),[14,15] which accounts for electron correlation in an approximate way. Because of the proven reliability of the results that can be obtained at the DFT level, it has become the most widely used approach for describing the electronic structure of large systems. DFT permits the investigation of much larger systems than offered by highly correlated methods. However, it cannot be routinely applied to systems constituted by more than 500 atoms, unless implementations through graphical processing units are exploited.[16] The practical limit of 500 atoms makes applications of DFT to biological matrices, interfaces, and solutions particularly cumbersome and, in some cases, even impossible. For these reasons, several approximations have been developed in the past. The nature of any approximation aimed at reducing the computational cost of a calculation is often rooted in the information that one seeks to extract from a simulation. If one is interested in the interaction between two complex medium-sized subsystems and how they influence each other, then, one might seek a partition method which accurately models both moieties with a computational cost comparable to treating both systems separately. More often than not, however, one is more interested in modeling only a small part of a larger chemical system, while the rest is only relevant in as much as it can alter the physical and chemical properties of the smaller region. This is the case of systems embedded in an external environment, whose properties are modified by the latter, the archetypal example being a molecule dissolved in a solvent. For these problems, the system may be partitioned into two or more regions which are then treated at different levels of accuracy, and different approaches may be more or less suitable depending on the specificities of the system/environment. In the special case of systems in solution, particular success has been enjoyed by methods belonging to the family of the so-called focused models,[17−20] which are extremely useful when dealing with the property of a moiety or a chromophore embedded in an external environment. In focused models, the target molecule is described at a higher level of theory with respect to the environment, which acts as a perturbation on the target system. Among the different focused models that have been developed in the past, the large majority belongs to the family of quantum mechanics (QM)/classical approaches, in which the target is treated at the QM level. The environment is instead described classically, by means of either continuous descriptions, such as the polarizable continuum model,[17,18] or by retaining its atomistic nature in the so-called QM/molecular mechanics (QM/MM) approaches.[21−24] In all these methods, however, the interaction between the two parts of the whole system is usually described by classical electrostatics[20,25−27] and very rarely by including the interactions of the quantum nature, such as Pauli repulsion and dispersion.[28−30] Also, QM/classical methods allow for the treatment of very large systems; however, their accuracy crucially depends on the quality of the parametrization of the classical fragments. In order to avoid such a variability, quantum embedding methods can be exploited.[31−48] In these approaches, the whole system is treated by resorting to a QM description; thus, Pauli repulsion effects are introduced in the modeling. The reduction in the computational cost is then achieved by partitioning the system in at least one active and one inactive part. The former is described at a higher level of accuracy, whereas the latter may be kept frozen or described at a lower level of theory. Different approaches have been proposed in the past, ranging from projection-based methods, such as DFT-in-DFT or HF-in-DFT,[39,46,47,49,50] to frozen density embedding (FDE).[48,51−57] Quantum embeddings have become a popular choice to overcome the aforementioned shortcomings of QM/classical approaches; however, the partitioning of the electronic degrees of freedom for the whole system into the two layers is theoretically far from trivial. In fact, all such partitioning methods can be thought of as being based upon the same basic idea. However, the partition of the electronic degrees of freedom may be performed on molecular orbitals, the density function, or the density matrix. Which basis set must be used to expand the density and wavefunction of the different regions is also an important factor. All these choices lead to drastically different methods which have their unique advantages and complications (vide infra). In addition, all quantum-embedding approaches based on DFT suffer from the complication of the exchange–correlation functionals being nonlinear, which leads to nonadditive energy terms in the partitioned equations. In this paper, we are proposing a novel quantum embedding approach defined in a DFT framework. We denote this method multilevel DFT (MLDFT), due to its similarity with multilevel Hartree–Fock (MLHF),[58,59] and multilevel coupled-cluster[12,60] that we have recently developed. Similar to FDE, in MLDFT, the system is partitioned into an active and an inactive region, and all the interactions are retained between the two parts. The MLDFT conceptually differs from the aforementioned quantum embedding methods because the partitioning is performed on the density matrix instead of the density function as in FDE.[61] As a consequence, the nonadditive term in MLDFT does not contain any kinetic energy term, thus avoiding the theoretical complications present in FDE (vide infra). Also, the MLDFT equations are defined in the MO basis of the active fragment only. This feature automatically allows for savings in the computational cost because the inactive MOs are not involved in the self-consistent field (SCF) procedure. In this paper, we derive MLDFT, and we apply the method to ground state (GS) energies of aqueous solutes. The results are compared, in all cases, to full DFT, in order to assess the quality of the multilevel partition. The manuscript is organized as follows. In the next section, DFT theory is formulated in the MO basis, and MLDFT equations are presented with a particular focus on the computational savings that can be expected. Theoretical comparison with other quantum embedding approaches is also discussed. Then, after a brief section reporting on the computational details of the method, MLDFT is applied to selected aqueous systems, with emphasis on comparison with full DFT results. Conclusions and perspectives of the present work end the manuscript.

Theory

Our starting point is the DFT expression for the electronic energy of the systemHere, D is the one-particle density matrix expanded in the atomic orbital (AO) basis {χμ}, and h is the one-electron operator, whereas J and K are Coulomb and exchange matrices, respectively. The Ex and Ec terms are DFT exchange and correlation energy functionals; ρ(r) is the DFT density function, and εx and εc are the exchange and correlation energy densities per unit particle, respectively. The coefficient cx defines whether pure DFT (cx = 0) or hybrid DFT functionals (cx ≠ 0) is used. Note that in the second equality in eq , it is implied that we are restricting the treatment to semilocal possibly hybrid functionals that are commonly used in quantum chemistry. The energy defined in eq is usually minimized in the AO basis. It is also possible to reformulate the minimization in the MO basis, which is the strategy employed in this work, similarly to what was presented for the Hartree–Fock case by Sæther et al.[58] This can be accomplished by parametrizing the density matrix D in terms of an antisymmetric rotation matrix, in which only the nonredundant occupied-virtual rotations are considered.[58]

Multilevel DFT

The MLDFT method belongs to the family of the so-called focused models. The part of the system which is under investigation (active) is described accurately, whereas the remaining (inactive) part remains frozen during the optimization of the active fragment. The choice of the partitioning intimately depends on the specificities of the system, its chemical nature, and the properties one wishes to simulate. Like other QM/QM methods, the details of the partitioning largely define the nature of the method, with its advantages as well as the limitations that arise from the simplifications that bring about the sought-after computational savings. Within the MLDFT formalism, the separation of the system into two parts is based on the following decomposition of the density matrix D, which in turn defines the separation of the density function ρ(r) as wellwhere A and B indicate the active and inactive fragments, respectively. As stated above, the active and inactive densities are usually defined on a physico-chemical basis. In case of a molecular system in solution, it is natural to define the solute as the active fragment, whereas the solvent molecules are treated as the inactive part. It is crucial to notice that the partitioning in eq is mathematical and therefore has some level of arbitrariness; that is, there is no unique way to perform the separation purely based on physico-chemical properties of the system. Therefore, a choice has to be made at this point. In this work, the partitioning is performed by means of Cholesky decomposition of the total density matrix for the active occupied MOs, from which the active density matrix DA is calculated, similarly to what was done in previous works presenting alternative multilevel methods.[12,58,60,62,63] One advantage of this procedure is that it ensures the all active and inactive orbitals are orthogonal and remain so during all the subsequent SCF procedure performed on the active subsystem.[64] Now using eq , the total electronic energy in eq can be written aswhere the symmetry of J and K matrices has been used. So far, the method is analogous to the recently proposed MLHF approach.[58] However, in MLDFT, the last two terms are not linear in the densities of the two subsystems; therefore, we cannot directly split them into distinct contributions arising from ρA(r) and ρB(r). In order to acquire a physical understanding of eq , we rewrite the last two terms by using this trivial identity for the exchange–correlation energy density (εxc = εx + εc) Substituting eq into 4 and reorganizing the different terms we obtainwhere In eq , the first four lines define the energy of the active and inactive fragments, whereas the last three lines define the active–inactive interaction. In MLDFT, the density matrix of the inactive part DB (and ρB(r)) is frozen, and therefore, it acts as an external field on the active fragment. Therefore, only the energy terms containing the B labels are fixed during the SCF procedure. In this regard, the MLDFT is similar in construction to FDE approaches.[52,55,61] However, as it can be noticed from eq , one crucial difference is that the nonadditive energy terms do not involve the kinetic energy functional, thus allowing for a computational advantage with respect to common FDE approaches, which need to specifically employ nonadditive kinetic terms. This advantage stems from the fact that in MLDFT, the separation of the system into two subsystems is performed on the total density matrix, rather than the density function. However, one disadvantage with respect to FDE is that for MLDFT, we are forced to select a suitable decomposition algorithm [Cholesky/projected atomic orbitals (PAOs) are used in this work for active occupied and virtual orbitals, respectively], and the results may depend on it, while FDE does not need to make this choice. The total DFT Fock matrix is given bywhere vx(ρ(r)) and vc(ρ(r)) are the exchange and correlation potential densities, respectively. Using the partitioning in eqs and 8 we get We exploit the same identity of eq for the exchange–correlation potential density (vxc = vx + vc). In this way, the last two terms in eq become Reorganizing the terms in eq , we can obtain the working expression for the MLDFT Fock matrixwhere the two-electron contributions of A and B fragments and the interaction term AB are highlighted as 2eX,μν, {X = A, B, AB}. There are two main advantages of using MLDFT compared to full DFT. First, the HF exchange contribution is usually the most expensive term in most hybrid functionals. In MLDFT, only the active exchange term is to be computed at each SCF cycle, whereas the exchange integral of the inactive fragment is computed at the first SCF cycle only, as it is constant during the optimization. Second, the MLDFT SCF procedure can be performed in the MO basis of the active part only, thus intrinsically reducing the computational time as previously observed for the MLHF method.[58]

Comparison between MLDFT and Other Quantum-Embedding Methods

As stated above, in MLDFT, the density matrix of the whole system is partitioned into active and inactive fragment densities (see eq ), from which the active and inactive density functions are calculated. The inactive density matrix is kept frozen during the SCF cycles and enters the active Fock matrix as a one-electron term. The MLDFT shares many similarities with FDE methods;[42−45,52,54,61] however, some relevant differences are present, some of which are mentioned in the previous section. First, due to the partitioning performed on the density matrix instead of the density function as in FDE, we avoid the problems arising in the definition of the nonadditive kinetic potential terms that enforce Pauli exclusion between the electrons of the various subsystems in FDE. This is clearly evident in the definition of the nonadditive energy terms in eq , which are only due to the intrinsic nonlinearity of the DFT exchange–correlation functionals. However, a choice needs to be made on the decomposition algorithm to partition the density matrix. One other difference that is worth mentioning is that if the decomposition is applied to the fully converged DFT density matrix, then, the MLDFT energy corresponds to the exact DFT energy of the full system, and it is in fact completely independent on the decomposition algorithm. This feature is shared with projection-based approaches, though the philosophy of MLDFT conceptually differs from DFT-in-DFT methods.[35,39,41,46,47,49] Of course, the point of MLDFT is to avoid a full DFT calculation on the entire system and instead find a way to approximate the total density matrix, which is then partitioned and the density of the active subsystem is then optimized through a subsequent SCF procedure while the density matrix of the inactive part is kept frozen. In this work, this initial full density matrix is obtained through superposition of molecular densities[51,65] followed by Fock matrix diagonalization, although we should emphasize that the method is general and a different choice may be made. Within this procedure, the solute–solvent and solvent–solvent polarization energy terms are only introduced in the starting full Fock diagonalization step. In projection-based approaches, a DFT calculation on the whole system is performed, and the occupied orbitals are then localized. The localized MOs are then assigned to the active or inactive fragments, based on a population threshold.[50] Neither step is required in MLDFT, in which a starting Fock matrix is diagonalized and the obtained density matrix is decomposed into fragment densities. This explains why the SCF procedure on the active subsystem has to be performed in the MO basis of the active fragment only, which is defined in terms of the full AO basis set. We do not perform any basis set truncation which is instead usually applied in projection-based approaches. This feature automatically allows for saving the computational cost because the inactive MOs are not involved in the SCF procedure.

Computational Details

The DFT and MLDFT are implemented in a development version of the electronic structure program eT v.1.0.[66] In particular, the DFT grid is constructed using the widely employed Lebedev grid,[67] with the radial quadrature proposed in ref (68). All the calculations are performed using a quadrature of 25th order, and the radial threshold is set to 10–5. The DFT functionals are implemented using the LibXC library.[69,70] The MLDFT calculation follows this computational protocol: Construction of the initial density matrix by means of superposition of molecular densities,[51,65] followed by the diagonalization of the initial Fock matrix. Partitioning of the new density matrix into A and B densities, using Cholesky decomposition for the active occupied orbitals and PAOs for active virtual orbitals.[12,58,60,62−64,71,72] We refer the reader to ref (63) for the full details on this partitioning of the density matrix. The inactive density matrix is obtained by subtracting the active density matrix from the total one. Calculation of the constant energy terms and the one-electron contributions due to the inactive density matrix B entering in eqs and 11. Minimization of the energy defined in eq in the MO basis of the active part A only, until convergence is reached. All the equations reported in Section , which are expressed in the AO basis set, can be transformed in the active MO basis by using the active MO coefficients. To show the robustness of MLDFT, three different functionals are used: LDA,[73] GGA (PBE[74]), and hybrid (B3LYP[75]). These are combined with three different basis sets: 6-31G, 6-31G*, and aug-cc-pVDZ.

Numerical Applications

In this section, the MLDFT is applied to some test cases to show the accuracy and the performance of the method. Solvation is one of the main physico-chemical phenomena in which such approaches can be exploited. We show the results of coupling MLDFT with two alternative, fully atomistic, strategies to model aqueous solutions. The first consists of a static modeling, which uses small clusters composed of the solute and a small number of surrounding water molecules. As an alternative, we apply MLDFT to snapshots extracted from a molecular dynamics (MD) simulation. In the latter framework, the dynamical aspects of the solvation phenomenon are retained, as those arising from the combination of conformational changes in the solute and the surrounding solvent. In addition, long range interactions are taken into account. This latter modeling of the solvation phenomenon has been amply and successfully exploited by some of us within the framework of QM/MM approaches.[20,30,76−78] In the following sections, the combination of MLDFT to the two aforementioned solvation approaches is tested, with application to two relatively small molecules, that is, methyloxirane and glycidol in aqueous solution, which have been studied in the literature both theoretically and experimentally.[27,79−89] Such systems are chosen not only for their simplicity but also because methyloxirane is a rigid molecule, whereas glycidol is not. Therefore, in the latter case, the results depend on the selected QM level, and the approach used to solvation and conformational flexibility, which is instead discarded in the case of methyloxirane. In this way, we can dissect the various effects and highlight the quality of the MLDFT approach in details.

Cluster Models

Methyloxirane/Water Clusters

The first studied solute is methyloxirane (MOXY), which is one of the smallest molecules that exhibits a chiral carbon. We have selected different clusters constituted by MOXY and one or two water molecules (see Figure ) that have been previously studied by Su and Xu[86] to explain the unique characteristics of MOXY in aqueous solution.[86,89]

Figure 1

Structure of conformers methyloxirane/water clusters. MLDFT partition is constructed so that methyloxirane is the active part whereas the water molecules are the inactive fragments.

Structure of conformers methyloxirane/water clusters. MLDFT partition is constructed so that methyloxirane is the active part whereas the water molecules are the inactive fragments. The two different conformers for the cluster composed of MOXY and one water molecule (MOXY + 1w) are depicted in Figure a. In the 1w-syn structure, water interacts with MOXY through hydrogen bonding on the same side of the methyl group, whereas the opposite occurs for the 1w-anti structure. In both cases, MOXY is the active fragment, and water is the inactive moiety in MLDFT calculations. GS energy differences between DFT and MLDFT calculations are depicted in Figure , panel (a), left. Raw data are reported in Table S1 given in Supporting Information. We see that the error between MLDFT and full DFT is below 1 mHartree (<0.628 kcal/mol), irrespective of the combination of functional/basis set employed. The error due to the MLDFT partitioning is well below the chemical accuracy (i.e. 1 kcal/mol).

Figure 2

(a) (Left) 1w-syn and 1w-anti total energy differences between MLDFT and DFT. (Right) MLDFT and DFT energy difference between 1w-anti and 1w-syn. (b) (Left) 2w-syn, 2w-anti, and 2w-bi total energy differences between MLDFT and DFT. (Right) MLDFT-DFT difference on relative energies of 2w-anti and 2w-bi with respect to 2w-syn. In the right panel of Figure a, DFT and MLDFT energy differences between 1w-anti and 1w-syn conformers are reported for all the considered combinations of the functional/basis set. The raw data are reported in Table S1 in the Supporting Information. We see that DFT and MLDFT values almost coincide. In particular, LDA and PBE functionals predict 1w-syn to be the most stable conformer, both at DFT and MLDFT levels, independently of the selected basis set. Notice however that the energy difference between the two conformers decreases either as GGA functionals are employed or diffuse/polarization basis sets are used. The inclusion of HF exchange makes 1w-anti the most stable conformer, if polarization/diffuse functions are considered. However, for all the considered combinations of the functional/basis set, MLDFT and DFT values are almost perfectly in agreement, with the largest discrepancy being reported for B3LYP/aug-cc-pVDZ (0.08 kcal/mol). These findings clearly show that for this system, MLDFT is able to catch small energy differences, which are again well below the chemical accuracy. We now turn to the clusters composed of MOXY and two water molecules (MOXY + 2w, Figure b). Three main conformers are considered, according to Su and Xu:[86]2w-syn, 2w-anti, and 2w-bi. The first two conformers differ from the position of water molecules, being both placed on the same side with respect to the methyl group in case of 2w-syn or on the opposite side for 2w-anti. In 2w-bi, the two water molecules are instead placed on the opposite sides of the epoxyl oxygen atom. In all MLDFT calculations, MOXY is the active moiety, whereas the two water molecules are inactive. In Figure b, left, GS energy differences between DFT and MLDFT for the three conformers are reported. The raw values associated with the data plotted in Figure b are given in Table S2 in the Supporting Information. The MLDFT and DFT results are, also in this case, in very good agreement with an absolute error below 1 kcal/mol for all combinations of functional/basis sets. However, the absolute deviation between DFT and MLDFT energies is larger than for the previous case (see Figure a). In particular, the MLDFT error is larger for 2w-syn and 2w-anti than for 2w-bi, for which it is in line with what we have shown above for MOXY + 1w clusters (∼0.1–0.3 kcal/mol). The increase in the error may be justified by the fact that 2w-syn and 2w-anti feature one water molecule that is linked to another water molecule by means of intermolecular hydrogen bonding. The density matrix of the inactive fragments (the two water molecules) is kept frozen; therefore, the water molecule that is not directly bonded to the solute remains in its frozen electronic configuration, resulting in a larger error in the total energy. Such an hypothesis is confirmed by the fact that the error increases when the diffuse aug-cc-pVDZ basis set is used, and the same does not occur for 2w-bi, where both water molecules are directly linked to methyloxirane through hydrogen bonding interactions. The MLDFT-DFT deviations in energy differences between each conformer and 2w-syn are shown in Figure b, right. We note small discrepancies between MLDFT and full DFT; however, also in this case, they are below the chemical accuracy, with the maximum error reported by PBE/6-31G* (∼0.35 kcal/mol). The error in the energy differences between the conformers is lower than for the total GS energies reported in Figure b, left.

Glycidol/Water Clusters

MOXY is a rigid molecule, so the different solvated conformers mainly differ by the position of the water molecules. In this section, we show how MLDFT can treat flexible solutes, and to this end, we have selected glycidol (GLY), which is a derivative of MOXY, where one hydrogen of the methyl group is replaced by the OH group (see Figures –5). In all MLDFT calculations, the GLY moiety is the active fragment and the water molecules are the inactive part. The presence of the hydroxyl group makes glycidol flexible up to the point that eight different conformers can be located in the gas phase potential energy surface (PES).[79,84]

Figure 3

Structures of the six conformers of glycidol + 1 water clusters. In MLDFT calculations, the glycidol moiety is active whereas the water molecule is inactive.

Figure 5

Structures of the eight conformers of glycidol + 3 waters clusters. In MLDFT calculations, glycidol is active whereas water molecules are inactive.

Structures of the six conformers of glycidol + 1 water clusters. In MLDFT calculations, the glycidol moiety is active whereas the water molecule is inactive. Structures of the ten conformers of glycidol + 2 waters clusters. In MLDFT calculations, glycidol is active whereas water molecules are inactive. Structures of the eight conformers of glycidol + 3 waters clusters. In MLDFT calculations, glycidol is active whereas water molecules are inactive. To build up a glycidol/water clusters, different structures constituted by GLY and one, two, and three water molecules were constructed, by following the strategy reported in ref (84). Such structures are depicted in Figures –5. We note that the different structures not only differ by the position of the water molecules but also by the conformation of glycidol. In particular, the six conformers constituted by GLY and one water (GLY + 1w) are characterized by a different position of the water molecule. The latter interacts via hydrogen bonding with both the hydroxyl and epoxyl groups (1w-I and 1w-II), with the epoxyl group only (1w-III and 1w-IV) or with only the oxygen atom of the hydroxyl group (1w-V and 1w-VI). The inclusion of additional water (GLY + 2w) results in ten different conformers, which are shown in Figure . These contain three or four center bridges (conformers 2w-I, 2w-II, 2w-IV, 2w-V, 2w-VI, 2w-VII, and 2w-VIII) or are conformers where the two water molecules interact via hydrogen bonding with the epoxyl and hydroxyl groups (conformers 2w-III, 2w-IX, and 2w-X). If three explicit water molecules are added to GLY (GLY + 3w), the conformational search provides eight main conformers, which are graphically depicted in Figure . Similarly to the previous case, some of them contain three or four center bridges (conformers 3w-I, 3w-II, 3w-V, and 3w-VI), whereas in conformers 3w-IV, 3w-VII, and 3w-VIII, a five center bridge is present. In all cases, water molecules that are not involved in bridges interact with GLY through hydrogen bonding interaction. Conformer 3w-III is instead characterized by a three center bridge and by the remaining water molecules hydrogen bonded to the bridge water.

Figure 4

Structures of the ten conformers of glycidol + 2 waters clusters. In MLDFT calculations, glycidol is active whereas water molecules are inactive.

We now move to discuss GS energy differences between DFT and MLDFT (see Figure a, raw data are given in Tables S3–S5 in the Supporting Information).

Figure 6

(a) GLY + 1w (left), GLY + 2w (middle), and GLY + 3w (right) GS energy differences between MLDFT and reference DFT values. (b) MLDFT-DFT energy deviations for the energy differences between each conformer of GLY + 1w conformers and 1w-I (left), GLY + 2w conformers and 2w-I (middle) and GLY + 3w conformers and 3w-I (right). All values are reported in kcal/mol. In Figure , panel (a), MLDFT - DFT GS energy differences for all the different conformers of GLY + 1w, GLY + 2w, and GLY + 3w water clusters are shown. The error reported by MLDFT is below 0.1 mH (<0.627 kcal/mol) when applied to GLY + 1w, at all the selected levels of theory. In particular, energy differences are perfectly in line with what is shown in Figure a, left panel, in case of MOXY + 1w clusters. Moving to GLY + 2w conformers, the agreement between DFT and MLDFT is almost perfect at all levels of theory, being the energy difference below 0.8 kcal/mol in all cases. We also see that at the B3LYP/aug-cc-pVDZ level, for 2w-I and 2w-II, the difference between MLDFT and full DFT is larger than for the other conformers (>0.1 mH, 0.627 kcal/mol). This is due to the specific spatial arrangement of water molecules, which create a four-center bridge connecting GLY hydroxyl and epoxyl groups (see Figure ). As stated above for MOXY + 2w clusters, MLDFT accounts for all the interactions between active and inactive parts, with the exception of dispersion; however, the inactive fragment(s) are described by a frozen density matrix. Therefore, polarization and charge transfer (and dispersion) effects are neglected in the inactive region. For 2w-I and 2w-II, we can speculate that such interactions may play a relevant role because the two inactive water molecules are hydrogen bonded. Also, their role is clearly increased when diffuse and polarization functions are included in the basis set (aug-cc-pVDZ) because such functions enhance the effects of these interactions. This does not occur in case of other conformers because of the different spatial arrangement of the solvent molecules. We now focus on GLY + 3w conformers. The agreement between MLDFT and the reference full DFT values is generally worse than in the previous cases (see the right panel of Figure a). However, the average error is of about 0.67 kcal/mol (∼0.1 mH), that is, again well beyond the chemical accuracy. The largest discrepancy is shown by 3w-III for all the functionals (LDA, PBE, or B3LYP) in combination with aug-cc-pVDZ (∼1.2 kcal/mol). Again, this can be explained by considering the spatial arrangement of water molecules around GLY (see Figure ). Similar to 2w-I and 2w-II, the effect of charge transfer and polarization interactions, which are neglected by the partitioning of the inactive density matrix in MLDFT, may play a relevant role. Such effects are larger for 3w-III; however, they affect also other conformers which are characterized by a four/five center bridge. It is also worth noticing that the MLDFT error is expected to increase with the size of the studied system because the energy is an extensive quantity. Such a trend is in fact reported for both MOXY and GLY clusters. Let us now discuss the MLDFT-DFT energy deviations for the energy differences between each conformer of the GLY clusters and 1w-I, 2w-I, and 3w-I, which are reported in Figure b. Raw data are given in Tables S3–S5 in the Supporting Information. For the GLY + 1w system, both MLDFT and DFT predict 1w-I to be the most stable at all levels of theory, whereas the relative populations of the other conformers strongly depend on the theory level (see Figure b, left panel). In particular, the energy differences of each conformer with respect to 1w-I decrease as larger basis sets are employed and also by moving from LDA to PBE and B3LYP. The error between MLDFT and DFT is instead almost constant (in absolute value) for all different combinations of the basis set and DFT functional, and in all cases, MLDFT correctly reproduces the trends obtained at the reference full DFT level. The same considerations outlined above for GLY + 1w conformers also apply to GLY + 2w ones (see Figure b, middle panel). In fact, by moving from LDA to B3LYP and by including polarization and diffuse functions in the basis set, MLDFT errors with respect to DFT reference values decrease. The largest DFT-MLDFT discrepancy is reported for 2w-X at the B3LYP/aug-cc-pVDZ level (−0.55 kcal/mol). This is due to the fact that the largest error is associated to the GS energy of the most stable conformer 2w-I (see left panel of Figure a) for this combination of the DFT functional/basis set. However, as already reported for all the other studied systems, the error in the relative energies of the different conformers is always lower than the corresponding error in the total energies. Finally, also in case of GLY + 3w clusters, the agreement between DFT and MLDFT is almost perfect, with errors ranging from −0.6 to 0.6 kcal/mol. The maximum error is observed for 3w-III at the PBE/aug-cc-pVDZ level (0.53 kcal/mol), whereas the minimum is reported for 3w-VII at the B3LYP/6-31G* level (error < 0.01 kcal/mol). Therefore, also for these systems, MLDFT provides a reliable description of the relative energies of the different conformers. The only notable exceptions are conformers 3w-III and 3w-IV at the LDA/6-31G and B3LYP/6-31G levels, respectively. As a final comment, we note that although the MLDFT error on total GS energy can be larger than 1 kcal/mol, relative energies of the different conformers are accurately predicted, with an error that is always below 1 kcal/mol.

Toward a Realistic Picture of Solvation

In the previous sections, we have presented and discussed solute–solvent structures obtained by modeling the solvation phenomenon in aqueous solution by means of the so-called cluster approach,[90] in which only the closest water molecules are explicitly treated at the QM level. However, this picture is not realistic, being a strongly approximate way of modeling solvation. In fact, any dynamical aspect of solvation is neglected as well as, more importantly, long range interactions which are especially relevant for polar environments such as water. In this section, we show how MLDFT may be coupled to approaches that have been developed to model solvation more realistically. In particular, we will apply MLDFT to a randomly selected structure extracted from a classical MD simulation performed on both MOXY and GLY in aqueous solution. In this way, the atomistic details of solvation are retained, and dynamical aspects could easily be introduced by repeating the calculations on several structures. A closer investigation of the latter aspect is beyond the scope of our first work on MLDFT and will be the topic of further studies. Let us start with MOXY. We have selected one random snapshot extracted from a MD simulation, which was previously reported by some of the present authors.[82,91,92] Note that MOXY is a rigid molecule; therefore, a single snapshot well represents its conformational structure (Figure ).

Figure 7

Selected structure of MOXY + 50 water molecules, as extracted from MD. In MLDFT calculations, MOXY is the active part and water molecules are inactive.

Selected structure of MOXY + 50 water molecules, as extracted from MD. In MLDFT calculations, MOXY is the active part and water molecules are inactive. In MLDFT calculations, MOXY is the active fragment and it is treated at the B3LYP/6-31+G* level. The inactive part is constituted by the 50 closest water molecules, which are described at the B3LYP/6-31G level. The reference full DFT calculation is instead performed by using the B3LYP functional, in combination with the 6-31+G* basis set for MOXY and the 6-31G one for water molecules. In order to quantify the accuracy of MLDFT, we compute the solvation energy Esolv, which is defined aswhere Etot, EMOXY, and Ew are the total, MOXY, and water GS energies, respectively. Note that EMOXY is calculated in the gas phase, and thus, it is the same in both full DFT and MLDFT calculations. Etot and Ew are defined differently in the two approaches; in MLDFT, Ew is calculated at step 1 of the computational protocol (see Section ), whereas in full DFT, it refers to the GS energy of the 50 water molecules. Computed energy values for MOXY are reported in Table for both DFT and MLDFT. We first notice that the MLDFT error on the total energy Etot is larger than what is found for clusters (see previous sections). This is not surprising because the error of the method scales with the number of the water molecules in the inactive part. Such discrepancies are primarily due to the neglect of polarization and charge-transfer interactions in the inactive solvent water molecules because their density matrix remains fixed in MLDFT. The largest contribution to the error on total energy is due to Ew. In fact, MLDFT Ew differs from full DFT of about the same extent as total energies. Such differences between MLDFT and DFT are reflected by the computed solvation energy, which can be taken as a measure of the accuracy of MLDFT. For the studied snapshot, the agreement between MLDFT and DFT is almost perfect (−12.30 vs −12.48 kcal/mol), and the error is of about 0.2 kcal/mol.

Table 1

DFT and MLDFT Total GS Energies (Etot) of MOXY + 50 Water Molecules Snapshot Depicted in Figure a

	DFT	MLDFT
E_tot	–4013.1956	–4013.1660
E_MOXY	–193.1079	–193.1079
E_w	–3820.0681	–3820.0382
E_solv	–0.0196	–0.0199

EMOXY, Ew, and Esolv are also reported. All values are given in Hartree.

EMOXY, Ew, and Esolv are also reported. All values are given in Hartree. The same analysis may be applied to glycidol, for which the snapshots were extracted from MD simulations previously reported by some of us.[79] We recall that GLY is a flexible solute, of which the main conformers may be identified by means of two dihedral angles δ1 and δ2 [see Figure , panel (a)]. Seven most probable conformers have been selected [see Figure panel (b)].

Figure 8

(a) Definition of δ1 and δ2 dihedral angles of GLY; (b) δ1 and δ2 values for the selected GLY + 50 water molecules snapshots extracted from MD; (c) molecular structures of the seven selected snapshots. In MLDFT calculations, GLY is the active part, whereas water molecules are inactive. The MLDFT partition has been done so that GLY is the active fragment and treated at the B3LYP/6-31+G* level, whereas water molecules are inactive and described at the B3LYP/6-31G level. All the reference full DFT calculations are performed by using the B3LYP functional in combination with the 6-31+G* basis set for the solute and the 6-31G one for the water molecules. The DFT and MLDFT energies (Etot, EGLY, Ew, and Esolv) are reported in Table S6 in the Supporting Information. Overall, MLDFT total energies are higher than DFT values of about 0.02–0.03 Hartree. The reasons of this discrepancy are the same as reported for MOXY. The DFT and MLDFT solvation energies are graphically compared in Figure . We observe that all MLDFT values are almost in perfect agreement with the reference full DFT data. The average discrepancy is of about 0.7 kcal/mol (∼1 mH), with the largest discrepancy reported for conformer 5 (1.1 kcal/mol). Notice that in this study, we only include the GLY moiety in the active part. Similar calculations performed at the MLHF level[58] needed to insert at least five water molecules in the active fragment to reach the same level of accuracy.

Figure 9

DFT and MLDFT solvation energies (Esolv) for the different conformers graphically depicted in Figure . Values are given in kcal/mol.

Summary and Conclusions

In this work, we report a novel density-based multilevel approach based on the DFT treatment of the electronic structure problem. In MLDFT, the full system is partitioned in two layers, one active and one inactive. MLDFT stands apart from methods based on a similar paradigm because of the choice of partitioning an initial density matrix into active and inactive parts, instead of the density function. The MLDFT SCF procedure is then performed in the MO basis of the active subsystem only. This is the source of the reduction in computational cost because the density matrix of the inactive fragments is kept frozen during the optimization of the density. The MLDFT was applied to aqueous methyloxirane and glycidol, for which two different approaches to solvation were discussed. First, the so-called cluster approach is employed, which models solvation in terms of minimal clusters composed of the solute and a small number of water molecules. Second, a more realistic picture is considered, which focuses on randomly selected snapshots extracted from MD simulations. For all studied structures, the computed data confirm that MLDFT is able to correctly reproduce reference full DFT values, with errors which are always ≤1 kcal/mol. Because of its favorable computational scaling, MLDFT can be coupled to more realistic approaches to solvation, that is, it can treat a large number of representative snapshots extracted from MD simulations, so to effectively take into account the dynamical aspects of solvation. In this first presentation of the approach, we have limited the analysis to GS energies. However, MLDFT has the potential to be extended to the calculation of molecular properties and spectra. In particular, the analytical evaluation of molecular gradients will allow for MLDFT geometry optimizations. However, this extension is not trivial because in order to guarantee the continuity of the PES, the same pivots must be imposed in the Cholesky decomposition. The calculation of molecular geometries and properties will be the topic of future communications. Also, the active and inactive parts can be described at two different levels of theory, for instance, using two different DFT functionals. Similarly to projection-based approaches, the active and inactive regions can also be treated at the HF and DFT levels, respectively, thus allowing for post-HF calculations on the active part only.[50] Such extensions will be the topic of future communications. The method will also be further developed by focusing on some technical aspects, which are worth being improved. For instance, in the current implementation, the DFT grid is homogeneous in the whole space. However, it is reasonable to assume that the grid can be downgraded further away from the active part, because within the focused model paradigm, we only seek to accurately model the effect of the environment on the properties of the active subsystem, rather than the intrinsic properties of the environment itself. Technical refinements of the current implementation are in progress and will be discussed in future communications.

65 in total

1. Chiral self-recognition: direct spectroscopic detection of the homochiral and heterochiral dimers of propylene oxide in the gas phase.

Authors: Zheng Su; Nicole Borho; Yunjie Xu
Journal: J Am Chem Soc Date: 2006-12-27 Impact factor: 15.419

2. Effective yet reliable computation of hyperfine coupling constants in solution by a QM/MM approach: Interplay between electrostatics and non-electrostatic effects.

Authors: Tommaso Giovannini; Piero Lafiosca; Balasubramanian Chandramouli; Vincenzo Barone; Chiara Cappelli
Journal: J Chem Phys Date: 2019-03-28 Impact factor: 3.488

3. Correlated natural transition orbitals for core excitation energies in multilevel coupled cluster models.

Authors: Ida-Marie Høyvik; Rolf Heilemann Myhre; Henrik Koch
Journal: J Chem Phys Date: 2017-04-14 Impact factor: 3.488

4. Improved Accuracy and Efficiency in Quantum Embedding through Absolute Localization.

Authors: Dhabih V Chulhai; Jason D Goodpaster
Journal: J Chem Theory Comput Date: 2017-03-09 Impact factor: 6.006

5. Polarizable Embedding Approach for the Analytical Calculation of Raman and Raman Optical Activity Spectra of Solvated Systems.

Authors: Tommaso Giovannini; Marta Olszówka; Franco Egidi; James R Cheeseman; Giovanni Scalmani; Chiara Cappelli
Journal: J Chem Theory Comput Date: 2017-08-18 Impact factor: 6.006

6. The dynamics of water at DNA interfaces: computational studies of Hoechst 33258 bound to DNA.

Authors: Kristina E Furse; Steven A Corcelli
Journal: J Am Chem Soc Date: 2008-09-04 Impact factor: 15.419

7. Hydrogen-bonding-induced shifts of the excitation energies in nucleic acid bases: an interplay between electrostatic and electron density overlap effects.

Authors: Tomasz A Wesolowski
Journal: J Am Chem Soc Date: 2004-09-22 Impact factor: 15.419

8. A gauge invariant multiscale approach to magnetic spectroscopies in condensed phase: general three-layer model, computational implementation and pilot applications.

Authors: Filippo Lipparini; Chiara Cappelli; Vincenzo Barone
Journal: J Chem Phys Date: 2013-06-21 Impact factor: 3.488