Literature DB >> 32338891

Performance of Localized Coupled Cluster Methods in a Moderately Strong Correlation Regime: Hückel-Möbius Interconversions in Expanded Porphyrins.

Nitai Sylvetsky¹, Ambar Banerjee¹, Mercedes Alonso², Jan M L Martin¹.

Abstract

Localized orbital coupled cluster theory has recently emerged as a nonempirical alternative to DFT for large systems. Intuitively, one might expect such methods to perform less well for highly delocalized systems. In the present work, we apply both canonical CCSD(T) approximations and a variety of localized approximations to a set of flexible expanded porphyrins-macrocycles that can switch between Hückel, figure-eight, and Möbius topologies under external stimuli. Both minima and isomerization transition states are considered. We find that Möbius(-like) structures have much stronger static correlation character than the remaining structures, and that this causes significant errors in DLPNO-CCSD(T) and even DLPNO-CCSD(T1) approaches, unless TightPNO cutoffs are employed. If sub-kcal mol-1 accuracy with respect to canonical relative energies is required even for Möbius-type systems (or other systems plagued by strong static correlation), then Nagy and Kallay's LNO-CCSD(T) method with "tight" settings is the suitable localized approach. We propose the present POLYPYR21 data set as a benchmark for localized orbital methods or, more broadly, for the ability of lower-level methods to handle energetics with strongly varying degrees of static correlation.

Entities: Chemical Disease Gene Species

Year: 2020 PMID： 32338891 PMCID： PMC7304861 DOI： 10.1021/acs.jctc.0c00297

Source DB: PubMed Journal: J Chem Theory Comput ISSN： 1549-9618 Impact factor: 6.006

Introduction

Expanded porphyrins have drawn much attention over the past few decades due to their facile redox interconversions, novel metal coordination behaviors, versatile electronic states, and conformational flexibility.[1] The latter is responsible for the rich chemistry associated with such systems, which has led to various applications such as near-infrared dyes,[2] nonlinear optical materials,[3] magnetic resonance imaging contrast agents,[4] and molecular switches.[5] Contrary to the parent porphyrin, expanded porphyrins are flexible enough to easily undergo conformational changes, which correspond to distinct π-conjugation topologies (Hückel, Möbius, and twisted-Hückel/figure-eight) encoding different chemical and physical properties (Scheme ).[6,7]

Scheme 1

Representation of Different π-Conjugation Topologies of Expanded Porphyrins and Their Expected Aromaticities as a Function of the Number of π-Electrons

Such changes may involve a Hückel–Möbius aromaticity switch within a single molecule, which can easily be induced by, inter alia, an appropriate solvent, pH, temperature, and metalation conditions.[8,9] Thus, these Hückel–Möbius aromaticity switches have already been recognized for their potential applications in molecular optoelectronic devices.[10] Additional applications for expanded porphyrins, including conductance switching devices[11,12] and efficient nonlinear optical switches,[13] have recently been covered in the literature. In a very recent collaboration[6] with Alonso et al., relative energies and isomerization pathways of a set of expanded porphyrins were investigated using wave function ab initio methods and DFT methods,[6] motivated by the fact that DFT-based energetics were shown to be highly dependent on the density functional employed in the calculations.[14,15] Furthermore, different DFT studies on expanded porphyrins have arrived at contradictory findings concerning the best-performing functionals to be used for these systems.[14−18] Indeed, since the stability of these isomers depends on the complex interplay of different factors (hydrogen bonding, π···π stacking, steric effects, ring strain, aromaticity, and so forth), it is no surprise that the selection of an exchange-correlation functional appropriate for describing the energy profiles of such topological switches is no trivial task. Thus, in ref (6), we opted to assess the performance of different exchange-correlation functionals for describing the thermochemistry and kinetics of topology interconversions in N-fused [24]penta-, [28]hexa-, and [32]heptaphyrins—by comparing them to benchmark results obtained at the canonical CCSD(T)/CBS level of theory. The structures included in this benchmark are illustrated in Figure .

Figure 1

(a) Hückel (H), Möbius (M), and figure-eight (F) conformations of selected expanded porphyrins. Aromatic and antiaromatic macrocycles are colored in red and green, respectively. (b) Two 28H ⇌ 28M interconversion pathways investigated for the Hückel–Möbius interconversion in [28]hexaphyrin. Unfortunately, canonical CCSD(T) calculations are notorious for their heavy computational burden, having formal CPU-time scaling of O(n3N4), n being the number of electrons in the system and N corresponding to the number of basis functions employed in the calculation. Hence, even for heptaphyrins with the cc-pVDZ basis set, canonical CCSD(T) hit the ceiling of our computational resources. As an illustration, a canonical CCSD(T)/cc-pVDZ calculation on structure 28M required no less than two node months total CPU time. Thus, treating even larger polypyrroles by means of robust, nonempirical ab initio methods is only feasible using alternative, computationally more economical methodologies. DLPNO-type approaches (domain localized pair natural orbital), which have recently gained popularity due to their near-linear scaling properties, embrace the notion of pair natural orbitals (PNOs) in order to reduce the virtual space which has to be taken into account in a given calculation.[19−22] Recent methodological developments have led to the situation in which, using modern commodity servers, systems with over 44,000 basis functions and 2,300 atoms[23] are within reach of localized orbital-based ab initio methods. They may therefore constitute an obvious solution for the larger expanded porphyrins.[24,25] That being said, the systems under consideration are known to be strongly delocalized:[26,27] thence, one may intuitively expect that localized orbital-based correlation approaches, such as the above-mentioned DLPNO-type ones, would prove to be inadequate. For this reason, assessing the performance of DLPNO-type approaches against canonical benchmark results is essential for confirming their reliability for highly delocalized π-systems. We shall therefore assess the performance of several localized orbital approaches for the problem at hand. Below we show that some of the structures, specifically Möbius π-systems and the transition states resembling them, suffer from elevated degrees of static correlation, and that the errors for such systems can reach several kcal mol–1 for the more cost-effective localized methods, although such errors can be mitigated through judicious choice of cutoffs.

Theoretical Methods

In the present work, we shall consider four different localized orbital approaches. The first and second, both used as implemented in ORCA 4.1 and later, are two variants of the MPI-Mühlheim DLPNO approach. The popular DLPNO-CCSD(T) approach, in which off-diagonal Fock matrix elements are neglected in the (T) contribution (such elements vanish for closed-shell canonical orbital calculations, but not for localized orbitals), actually corresponds to an approximation to canonical CCSD(T0).[28] The latter approximation is eliminated in the more rigorous DLPNO-CCSD(T1)[29] approach, at considerable additional CPU cost and I/O overhead. The third method is the PNO-LCCSD(T) approach of Werner and co-workers[30,31] as implemented in MOLPRO 2018.[32] It likewise eschews the (T0) approximation but differs substantially from DLPNO-CCSD(T) in terms of domain construction strategy—as explained in detail in refs (23 and 24) and summarized below. Finally, we consider the LNO-CCSD(T) approach of Kállay and co-workers[23] as implemented in the MRCC package.[33] Here, the correlation energy is partitioned into occupied orbital contributions, and domains are adjusted for each such orbital individually to ensure that it is adequately represented. For orbitals that are not strongly delocalized, domains will be small, while strongly delocalized orbitals will entail large domains. As we shall see, this mitigates errors in such cases. For example, in the present work and for the systems at hand, we found that Möbius structures of the hexaphyrin required LNO-CCSD(T) wall times a factor of 4–5 longer than for simpler Hückel structures, compared to only about a factor of 2–2.5 for DLPNO-CCSD(T). Each of the above DLPNO, PNO, and LNO methods has an array of cutoffs, screening thresholds, and other numerical parameters too unwieldy for routine manipulation by the nonspecialist user. Hence, typically several tuned combinations of such settings are offered that aim to consistently yield a given numerical precision for optimal computational cost. In the case of DLPNO-CCSD(T) in ORCA,[34] for example, three ascending levels of accuracy are collected under the keywords LoosePNO, NormalPNO (the default), and TightPNO: for details see Table 1 of ref (34). NormalPNO aims to yield energetics precise to 1 kcal mol–1, while TightPNO sets the bar higher and is intended for applications like noncovalent interactions or conformer/isomer equilibria, where 1 kcal mol–1 would be an unacceptably large fraction of the interaction and relative conformer/isomer energies, respectively. Similarly, PNO-LCCSD(T) in MOLPRO offers “Normal” and “Tight” domain settings (cf. Tables 1–4 of ref (31)), while the corresponding MRCC settings are detailed in Table 1 of Nagy and Kállay.[35] While the DLPNO-CCSD approach in ORCA and the equivalent PNO-LCCSD method in MOLPRO are very similar in their fundamentals and both achieve roughly linear CPU time scaling with system size, they differ considerably in their practical implementation details. Aside from the subtle differences in screening and cutoff strategies between codes, one more fundamental variance has chemical consequences for highly delocalized systems. Both codes construct virtual orbital domains for each correlation pair from the PAOs (projected atomic orbitals, i.e., the original basis set after projecting out all occupied MO components) and then constructs virtual orbital “domains” from these for the diagonal pair correlation E of each localized MO i [domains for off-diagonal pair E are taken as the union of the domains for the diagonal pairs E and E]. Pair natural orbitals are then calculated at the MP2 level, and only those PNOs whose natural orbital occupation number exceeds a set threshold are retained. Where DLPNO-CCSD(T1) in ORCA, and PNO-LCCSD(T) in MOLPRO, differ is how domains are constructed. MOLPRO uses a spatial criterion based on a fixed number of atom shells (or a given maximum distance) around the bonded atom pair, viz., the atom that the lone pair sits on.[31,36] In contrast, ORCA uses an orbital population (older version) or orbital overlap (newer version) based criterion. In the older version,[20,22] all atoms for which the orbital had a Mulliken population greater in absolute value than the population cutoff parameter TCutMKN were included in the domain; whereas in ORCA 4 and later,[37] the orbital is included if the square root of the differential overlap is greater than the differential overlap cutoff parameter TCutDO. The MOLPRO approach typically yields much more compact domains, while the ORCA approach appears to be more resilient toward highly delocalized systems, such as the considered polypyrroles. It should be noted that, for nonconjugated molecules, the two approaches may be expected to perform comparably well. Ma and Werner[31] have argued that, in view of the much faster basis set convergence of F12 approaches, their ultimate goal is PNO-LCCSD(T)-F12: the deficiencies of the smaller PNO domains would then in practice be obviated by inclusion of F12 corrections. While we acknowledge this argument, we do not currently have a viable way of generating canonical CCSD(T)-F12 data for such large systems. Canonical CCSD(T) reference data for a DZP basis set, on the other hand, proved computationally tractable albeit demanding. We do believe that it is valuable to test the approximations in the localized methods in isolation against the corresponding canonical answers, our view “uncluttered” by any F12 correction. How do specific domain size settings affect the CPU time required for a given calculation? As an example, we consider the 28M Möbius structure. For DLPNO-CCSD(T1)/cc-pVDZ, the CPU time ratio TightPNO:NormalPNO was 8.65:1; in other words, the more lenient settings save ∼88% of the total CPU time required for such a calculation. A somewhat smaller ratio (5.81:1) was observed for the DLPNO-CCSD(T0) calculation. All else being equal, we find that DLPNO-CCSD(T1) for these systems requires about twice the CPU time for DLPNO-CCSD(T0): the fairly high I/O bandwidth required for (T1) required running on nodes with high-speed local SSD (solid state drive) scratch disk volumes. For the problem at hand, it may be said that neither approaches require outlandish computational resources—and that the difference between them is still small enough to justify “going the extra mile” for superior accuracy. The CPU requirements stand in stark contrast to those for the corresponding canonical calculations, which are almost 2 orders of magnitude larger: as said above—running massively parallel on eight 16-core machines with a fully nonblocking InfiniBand interconnect and local SSD (solid state disk) scratch on all machines, canonical CCSD(T) on 28M required about 1 week total wall clock time. Moreover, adding just one more pyrrole ring into the macrocycle already quadruples the required time for the canonical calculation, while the difference is barely noticeable in the DLPNO or PNO calculations. Formally, canonical CCSD(T) asymptotically scales with system size n as O(n7), while DLPNO-CCSD(T) and PNO-LCCSD(T) asymptotically scale linearly. As part of the present work, we have also considered the following diagnostics for type A static correlation[38] (a.k.a. left–right static correlation,[39] absolute near-degeneracy correlation): D1 [defined as[40] λmax(T·T†)1/2 where T is the single excitations amplitude vector], 1 – C02 (i.e., one minus the squared coefficient of the reference determinant in a CASSCF calculation with an appropriate active space), the M diagnostic proposed by Truhlar and co-workers[41] (which for closed-shell systems reduces to 1 – nHOMO/[2 + (nLUMO/2)]), and Matito’s IND diagnostic[42] based on natural orbital occupations. A fairly recent review of static correlation diagnostics can be found in ref (43). Additional diagnostics for the considered systems are discussed in ref (6). As shown there, FOD analysis[44] (as expected) indicates that static correlation is smeared out over the entire molecule. (As an aside, using the same (12,12) active space in all CASSCF calculations for the purpose of calculating 1 – C02 sidesteps the issue pointed out by a reviewer that 1 – C02 is not size-extensive. A workaround to bring 1 – C02 on the same scale for different numbers of correlated electrons was pointed out in endnote 31 of the work of Via-Nadal et al.:[45] tt effectively amounts to replacing x ≡ 1 – C02 by 1 – (1 – x)1/ where a ≡ Nval/Nval,ref or the number of valence electrons divided by the number for a reference system. If x is not too large, MacLaurin expansion in x yields and hence . Finally, in order to verify that SCF orbitals for all structures in all codes correspond to global minima on the S = 0 Hartree–Fock energy hypersurface, relative energies at the SCF level were compared to those obtained in the same basis set using stable = (int,opt) in Gaussian 16[46] and found to agree to 0.01 kcal mol–1 or better. Some of the Möbius structures, in particular, required care to ensure convergence to the correct state with the other codes: in the most refractory case, 32M, we resorted to HF/STO-3G on the quadruple cation, used as initial guess for HF/STO-3G on the neutral species, finally used in turn as initial guess for HF/cc-pVDZ.

Results and Discussion

Adequacy of the Canonical Reference Level

As mentioned in the Introduction, the largest basis set for which we were able to obtain fully canonical CCSD(T) answers for comparison was the cc-pVDZ (no p on hydrogen) basis set.[47] The mind wonders whether, at least for the problem at hand, this level of theory is sufficiently close to the FCI/CBS (full configuration interaction/complete basis set) limit to be adequate as a canonical reference point. Concerning the first aspect, i.e., post-CCSD(T) correlation effects, the size of the system clearly precludes carrying out CCSDT(Q) let alone CCSDTQ calculations. However, for limited orbital active spaces, we were able to carry out ICE-CI (iterative configuration expansion–configuration interaction—ICE-CI is effectively ORCA’s implementation of Malrieu’s CIPSI algorithm[48]) calculations using ORCA and compare them to CCSD(T) in the same orbital space. The result, for active spaces ranging from 12-electrons-in-12-orbitals, or (12,12) for short, to (30,30) are given in Table . Clearly, at least for the property of interest, post-CCSD(T) corrections are surprisingly small. This may, of course, be the result of a fortunate error compensation between neglect of higher-order iterative triple substitution effects CCSDT-CCSD(T) and neglect of connected quadruple excitations. Similar cancellations are seen in the atomization energies of some small molecules with multireference character, e.g., C2.[49−51]

Table 1

Post-CCSD(T) Corrections (kcal mol–1) for the Relative Energies of [24] N-Fused Pentaphyrin, [28]Hexaphyrin, and [32]Heptaphyrin structuresa

system	CCSD(T)	ICE-CI	CCSD(T)	ICE-CI	CCSD(T)	ICE-CI	CCSD(T)	ICE-CI	CCSD(T)
active space	all orbitals	(12,12)	(12,12)	(18,18)	(18,18)	(24,24)	(24,24)	(30,30)	(30,30)
24H_a	9.12	6.79	6.82	–0.53	–0.49	4.84	4.84	4.49	4.36
24H_b	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
24M	6.06	8.12	8.24	4.89	4.96	7.90	7.92	8.40	8.31
24TS₁	9.05	6.70	6.71	3.29	3.28	6.68	6.62	6.53	6.38
24TS₂	4.87	6.00	6.04	3.08	3.09	5.86	5.83	6.39	6.27
28H	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
28M	–0.73	10.88	11.12	8.91	9.09	9.28	9.43	7.62	7.56
28M_1A	0.46	12.60	12.87	9.91	10.12	10.39	10.54	8.55	8.48
28M_1B	1.82	13.57	13.82	11.67	11.86	10.98	11.09	11.38	11.34
28F	–0.38	7.41	7.38	9.18	9.12	5.40	5.28	4.70	4.45
28TS_1A	6.33	13.75	13.77	12.24	12.16	10.82	10.66	14.06	13.92
28TS_1B	2.86	9.14	9.12	10.02	9.97	8.72	8.60	6.56	6.16
28TS_2A	6.87	26.41	26.68	28.09	28.31	24.60	24.74	22.21	22.05
28TS_2B	9.89	30.33	30.57	31.42	31.62	28.31	28.44	26.44	26.30
28TS₃	5.17	15.03	15.02	14.55	14.44	13.31	13.15	12.17	11.84
32F	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00	0.00
32M_a	16.81	18.35	18.63	10.75	10.99	13.47	13.71	12.55	12.62
32M_b	16.74	18.46	18.72	13.85	14.15	15.71	16.03	17.48	17.98
32H	34.60	22.91	22.90	24.75	24.74	24.18	24.17	27.23	27.49
32TS₁	17.49	16.64	16.64	10.69	10.67	11.23	11.22	14.13	14.39
32TS₂	33.79	24.28	24.22	24.71	24.65	25.40	25.37	27.58	27.74
RMSDa	-	0.15		0.14		0.13		0.21
MUEa	-	0.10		0.10		0.09		0.16

See Figure for the structural notation. RMSD and MUE (in kcal mol–1) for the relative energies computed with ICE-CI and CCSD(T) methods for different orbital active spaces. Orbitals in the (n,n) active space are the n/2 highest occupied molecular orbitals and the n/2 lowest unoccupied MOs, selected from HF level orbital energies. Concerning the second aspect, i.e., basis set incompleteness, we were able to carry out canonical explicitly correlated[52,53] RI-MP2-F12 calculations with the cc-pVDZ-F12 basis set[54] and associated auxiliary basis sets[55] for all species. For the largest macrocycle (i.e., the [32]heptaphyrin), such calculations required about 10TB of scratch space each, which we “jury-rigged” by cross-mounting SSD scratch directories from other nodes through NFS-over-InfiniBand. Typically (see, e.g., reviews on F12 theory[52,53]), F12 calculations with appropriate basis sets gain about 2–3 “zetas” in basis set convergence. Hence, the MP2-F12/cc-pVDZ-F12 energetics ought to be comparable or superior to MP2/cc-pVQZ in terms of convergence. We can easily verify this in the present context, of course, by carrying out RI-MP2/cc-pVTZ and cc-pVQZ calculations and extrapolating to the complete basis set limit using the Helgaker formula.[56] In this event, MP2/cc-pV{T,Q}Z relative energies thus obtained deviate from their MP2-F12/cc-pVDZ-F12 counterparts by less than 0.1 kcal mol–1 RMSD. The basis set extension effect itself, from MP2/cc-pVDZ, is just 0.9 kcal mol–1 RMSD in both cases. We may thus safely assume that the coupling term C in the equation below is negligible:and thus, that we can make the familiar “high-level correction” (HLC) approximation:(For a discussion of one-particle/“basis set” vs n-particle space/“electron correlation method” coupling, see ref (57).) Our best estimates for the relative energies of the topology interconversions in our expanded porphyrin database are collected in Table . For the purpose of assessing localized methods against canonical results, however, the above gives us confidence that CCSD(T)/cc-pVDZ is a reasonable starting point.

Table 2

Our Best Estimates for the Relative Isomer Energies Considered in This Worka

system	CCSD(T)/cc-pVDZ	MP2/cc-pV{T,Q}Z + [CCSD(T)-MP2]/cc-pVDZ	MP2-F12/cc-pVDZ-F12 + [CCSD(T)-MP2]/cc-pVDZ
24H_a	9.12	7.92	8.06
24H_b	0.00	0.00	0.00
24M	6.06	6.38	6.48
24TS₁	9.05	8.93	9.01
24TS₂	4.87	5.12	5.18
28H	0.00	0.00	0.00
28M	–0.73	–1.77	–1.75
28M_1A	0.46	0.28	0.28
28M_1B	1.82	1.39	1.39
28F	–0.38	0.16	–0.08
28TS_1A	6.33	4.65	4.58
28TS_1B	2.86	2.00	1.92
28TS_2A	6.87	6.10	6.02
28TS_2B	9.89	8.88	8.79
28TS₃	5.17	4.50	4.36
32F	0.00	0.00	0.00
32M_a	16.81	15.45	15.65
32M_b	16.74	16.59	16.52
32H	34.60	32.59	32.72
32TS₁	17.49	16.08	16.16
32TS₂	33.79	32.33	32.36

The latter were obtained at the MP2/cc-pV{T,Q}Z + [CCSD(T)-MP2]/cc-pVDZ and MP2-F12/cc-pVDZ-F12 + [CCSD(T)-MP2]/cc-pVDZ levels of theory, with p functions on H omitted. All entries are in kcal mol–1.

Initial Assessment of the Localized vs Canonical Methods

For heptaphyrin, each canonical CCSD(T) required about a week on eight 16-core Intel Haswell nodes, with MOLPRO running a 3-level parallelism of nodes, processes, and [in (T) and LAPACK] OpenMP threads. In contrast, the corresponding localized calculations took from a few hours to 1 day on just a single node. A comparison of various approximate PNO-CCSD(T) relative energies with the canonical reference values is given in Table , and the box-and-whisker plots for different localized approaches are shown in Figure .

Table 3

Canonical CCSD(T) Relative Energies (kcal mol–1) and Errors with Various Localized Orbital CCSD(T) Approximations for the Relative Energies of [24] N-Fused Pentaphyrin, [28]Hexaphyrin, and [32]Heptaphyrin Structures (F = figure-eight, M = Möbius, H = Hückel; TS = transition states)a

RMSDs from canonical results in the same basis set (kcal mol–1). Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value); diagnostics are heat-mapped green (low) via orange to red (high) on a continuous percentile gradient for each column.

NormalPNO.

tightPNO.

defaultDomain.

tightDomain.

lcorthr = normal.

lcorthr = tight.

lcorthr = tight, wpairtol = 1 × 10–6.

Taken from Table 2 in ref (6). (1 – C02) was obtained at the CASSCF(12,12) level for all species, M and IND at the ICE-FCI(30/30) level, and D1 at the CCSD/cc-pVDZ(no p functions on H) level.

Figure 2

Box-and-whisker plots for various localized orbital CCSD(T) approximations, showing the error spread for the expanded porphyrin database with respect to canonical CCSD(T) energies. The RMSDs (in kcal mol–1) are also displayed below each method. RMSDs from canonical results in the same basis set (kcal mol–1). Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value); diagnostics are heat-mapped green (low) via orange to red (high) on a continuous percentile gradient for each column. NormalPNO. tightPNO. defaultDomain. tightDomain. lcorthr = normal. lcorthr = tight. lcorthr = tight, wpairtol = 1 × 10–6. Taken from Table 2 in ref (6). (1 – C02) was obtained at the CASSCF(12,12) level for all species, M and IND at the ICE-FCI(30/30) level, and D1 at the CCSD/cc-pVDZ(no p functions on H) level. First of all, DLPNO-CCSD(T1) with tight PNO settings appears to be the overall best performer among all PNO-type approaches, having an RMSD of only 1.43 kcal mol–1 from the reference. Resorting to default PNO settings raises the error by only ∼0.4 kcal mol–1, while reducing wall time by about 75–80%, and may therefore be a desirable option in cases where tight PNO settings become too computationally demanding. DLPNO-CCSD(T0) does not measure up to the former scheme—exhibiting 2.14 and 2.27 kcal mol–1 RMSDs from the canonical energies using tight and default PNO settings, respectively. Indeed, the difference associated with the latter settings is not as large as in the (T1) case–the domain improvement “drowns in the noise” of the T0 approximation, so to speak. PNO-LCCSD(T1) seemingly offers the least-satisfactory performance among this class of localized methods, deviating from the reference values by 3.66 and 2.73 kcal mol–1 using default and tight PNO settings, respectively. The latter PNO settings are clearly superior in this case. LNO-CCSD(T) performs exceptionally well compared to the above PNO-type approaches—having an RMSD of 1.47 kcal mol–1 with default settings (lcorthr = Normal) and just 0.74 kcal mol–1 with (lcorthr = Tight). Further tightening thresholds to lcorthr = vtight approximately quadruples the total CPU time; fortunately, we found that just tightening wpairtol from 3 × 10–6 to 1 × 10–6, while leaving the remaining parameters at their lcorthr = Tight values, recovered most of the improvement at minimal additional cost. This setting is denoted as “Tight+” in Tables –7. RMSD from canonical CCSD(T) for Tight+ was found to be just 0.42 kcal mol–1.

Table 7

[CCSD(T)-MP2] Relative Energies (kcal mol–1) and Errors with Various Localized Orbital HLC Approximations for the Relative Energies of the Expanded Porphyrins under Considerationa

NormalPNO (ORCA).

tightPNO (ORCA).

defaultDomain (MOLPRO).

tightDomain (MOLPRO).

lcorthr = normal (MRCC).

lcorthr = tight.

lcorthr = tight, wpairtol = 1 × 10–6.

RMSDs from canonical results in the same basis set likewise in kcal mol–1. Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value). NormalPNO (ORCA). tightPNO (ORCA). defaultDomain (MOLPRO). tightDomain (MOLPRO). lcorthr = normal (MRCC). lcorthr = tight. lcorthr = tight, wpairtol = 1 × 10–6. RMSDs from canonical results in the same basis set likewise in kcal mol–1. Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value). NormalPNO (ORCA). tightPNO (ORCA). defaultDomain (MOLPRO). tightDomain (MOLPRO). lcorthr = normal (MRCC). lcorthr = tight. lcorthr = tight, wpairtol = 1 × 10–6. RMSDs from canonical results in the same basis set likewise in kcal mol–1. Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value. NormalPNO (ORCA). tightPNO (ORCA). defaultDomain (MOLPRO). tightDomain (MOLPRO). Normal settings (MRCC). Tight settings (MRCC). RMSDs from canonical results in the same basis set likewise in kcal mol–1. Energy differences are heat-mapped on a continuous gradient from blue (most negative value) via white (zero) to red (most positive value). NormalPNO (ORCA). tightPNO (ORCA). defaultDomain (MOLPRO). tightDomain (MOLPRO). lcorthr = normal (MRCC). lcorthr = tight. lcorthr = tight, wpairtol = 1 × 10–6. As can be seen in Table and Figure for the expanded porphyrin database, deviations of DLPNO and PNO from canonical relative energies are found to be statistically correlated with several diagnostics for the type A static correlation (i.e., absolute near-degeneracy). Indeed, the largest values for all four diagnostics, on the one hand, and the largest deviations from canonical energetics, on the other hand, are specifically observed for the Möbius structures and for two Möbius-like transition states (28TS and 28TS). The M diagnostic, in particular, turns out to be a fair predictor for the energy difference between the localized orbital approaches and canonical CCSD(T) method, with R2 = 0.89 for DLPNO-CCSD(T1) (Figure a) and R2 = 0.96 for PNO-LCCSD(T1) (Figure b), both of them with Tight settings.

Figure 3

Relationship between the energy differences computed for various localized orbital CCSD(T) approximations and the canonical CCSD(T) method and the M diagnostic for static correlation: a) For DLPNO-CCSD(T1) with Tight settings and b) For PNO-LCCSD(T1) with Tight settings. The Möbius structures are highlighted in red. We found it informative, then, to break down error statistics between Möbius(-like) structures vs the Hückel and twisted-Hückel structures. At the bottom of Table , we then see that, for the non-Möbius structures, all three PNO methods can reach about 0.5 kcal mol–1 RMSD on Tight settings and about 1 kcal mol–1 on regular settings. However, for the Möbius structures, much more pronounced errors are seen. A similar discrepancy can be observed for the DLPNO methods. Indeed, for the non-Möbius structures, RMSD is just 0.7 kcal mol–1 for NormalPNO and 0.4 kcal mol–1 for TightPNO, while these figures jump up to 2 kcal mol–1 for DLPNO-CCSD(T1) with TightPNO and to 3 kcal mol–1 for the other options, when considering only the Möbius-type structures. Also, while T0 and T1 are essentially indistinguishable in quality for the non-Möbius systems, T1 is markedly superior for the Möbius ones. The deficiencies of the T0 approximation for systems with some static correlation are of course not unique to the macrocycles at hand. In the original DLPNO-CCSD(T1) paper,[29] it was shown that, for small-gap systems, the (T0) approximation breaks down and relative energies show substantial deviations from the parent canonical CCSD(T) results. Relatedly, we point to the work of Iron and Janes on metal–organic barrier heights (MOBH35),[58,59] where a comparatively small, yet significant, difference of almost 1 kcal mol–1 RMSD was found between DLPNO-CCSD(T0) and DLPNO-CCSD(T1) barrier heights.[59] Efremenko and Martin[60] found more significant differences for the mechanisms of Ru(II) and Ru(III) catalyzed hydroarylation and oxidative coupling.[61] While the Möbius RMSD does get worse from (T1) to (T0), it is a difference of degree and not of kind. Switching from “Normal” to “Tight” criteria actually has the largest impact for LNO-CCSD(T), where it cuts the remaining error for the Möbius structures by over half; a significant improvement is also seen for DLPNO-CCSD(T1). In stark contrast, while the LNO-CCSD(T) approach with Normal criteria does exhibit nearly twice the RMSD (1.9 kcal mol–1) for Möbius as for non-Möbius (1.0 kcal mol–1), for Tight and especially Tight+ criteria, this difference essentially vanishes. Indeed, for Tight+ all errors are below 1 kcal mol–1.

Component Breakdown of Localized vs Canonical Methods

Let us now decompose the relative canonical CCSD(T) energies into their MP2 and CCSD building blocks, in order to get deeper insights regarding the relationship between the canonical and PNO-based methods considered above. As can be seen in Table , PNO-LMP2 with default PNO settings performs rather poorly, having a RMSD of no less than 2.6 kcal mol–1 from the canonical reference values. Resorting to tight PNO domains does lead to an improvement, reducing the error by more than half (1.2 kcal mol–1 RMSD). Like for the complete CCSD(T) energetics, it can be seen that the Möbius structures are responsible for most of the observed errors—RMSD = 3.7 (Normal) and 1.8 (Tight) kcal mol–1 versus, for the non-Möbius structures, 1.0 and 0.3 kcal mol–1, respectively.

Table 4

Canonical MP2 Relative Energies (kcal mol–1) and Errors with Various Localized Orbital MP2 Approximations for the Relative Energies of the Expanded Porphyrins under Considerationa

NormalPNO (ORCA).

tightPNO (ORCA).

defaultDomain (MOLPRO).

tightDomain (MOLPRO).

lcorthr = normal (MRCC).

lcorthr = tight.

lcorthr = tight, wpairtol = 1 × 10–6.

ORCA’s DLPNO-MP2, on the other hand, has no such large discrepancy between Möbius and non-Möbius systems. Overall RMSD = 0.33 kcal mol–1 (NormalPNO) and 0.14 kcal mol–1 (TightPNO), the latter functionally equivalent in quality to the reference values. For LNO-MP2, we do see a substantial difference between Möbius and non-Möbius structures at default settings, but this is much reduced for Tight and especially Tight+ settings. Overall, RMSD for Tight settings (0.57 kcal mol–1) is still almost double that for NormalPNO DLPNO-MP2 (0.33 kcal mol–1), while Tight+ slightly reduces the statistical errors to RMSD = 0.26 kcal mol–1. We shall now move on to the CCSD contributions (Table ). For LNO-CCSD(T), we have followed the recommendation from Nagy et al.[35] to split the weak-pair MP2 corrections evenly between CCSD and (T). It can be seen that DLPNO-CCSD gets closer to canonical CCSD in the same basis set compared to PNO-LCCSD; even DLPNO-CCSD with DefaultPNO settings, at RMSD = 1.1 kcal mol–1, outperforms PNO-LCCSD with TightDomain settings (1.3 kcal mol–1), and with default domain settings, RMSD for PNO-LCCSD even increases to 1.9 kcal mol–1. For the non-Möbius systems, DLPNO-CCSD and PNO-LCCSD are virtually indistinguishable in performance; for the Möbius structures, PNO-LCCSD exhibits more significant errors (2.7 and 1.9 kcal mol–1 RMSD for default and tight settings, respectively), whereas DLPNO-CCSD seems to offer a more satisfying performance (1.3 and 0.8 kcal mol–1).

Table 5

Canonical CCSD Relative Energies (kcal mol–1) and Errors with Various Localized Orbital CCSD Approximations for the Relative Energies of the Expanded Porphyrins under Considerationa

NormalPNO (ORCA).

tightPNO (ORCA).

defaultDomain (MOLPRO).

tightDomain (MOLPRO).

lcorthr = normal (MRCC).

lcorthr = tight.

lcorthr = tight, wpairtol = 1 × 10–6.

LNO-CCSD with Tight+ settings, at RMSD = 0.27 kcal mol–1, is the best performer; RMSD climbs to 0.46 kcal mol–1 for tight settings, still superior to DLPNO-CCSD TightPNO. Here too, we observe a performance difference between Möbius and non-Möbius structures. What about the (T) contribution when considered in isolation? As we have seen for the full CCSD(T) relative energies, there is little difference between DLPNO-(T0) and DLPNO-(T1) for the non-Möbius structures, and this applies for both NormalPNO and TightPNO. Nevertheless, for the Möbius structures, the difference is quite pronounced with DLPNO-(T1) TightPNO having only about one-half the error (0.9 kcal mol–1) of DLPNO-(T0) TightPNO and about one-third the error of DLPNO-(T0) NormalPNO. LNO-CCSD(T) can, however, match the best result even with normal settings, while on TightPNO RMSD drops to just 0.3 kcal mol–1, with no noticeable difference between Möbius and non-Möbius. Our attempts to carry out PNO-LCCSD(T)-F12/cc-pVDZ-F12 calculations[30,62] on these extended π-systems met with failure for technical reasons. Presumably, if we were able to run them to completion, they would be much closer to the canonical basis set limit than PNO-LCCSD(T) is to its canonical counterpart. This comparatively weak basis set sensitivity beyond cc-pVDZ (Table ) indicates that thermodynamic equilibria in the present systems are primarily driven by nondynamical correlation effects—which are well-known (e.g., ref (49)) to converge fairly rapidly with the basis set—rather than the slowly converging dynamical correlation contributions. In such a scenario, especially for still larger systems like octaphyrins or decaphyrins, it may be attractive not just to combine MP2 in a large basis set with a “high-level correction”, i.e., the aggregate post-MP2 correction [CCSD(T)-MP2] from a small basis set, but to obtain the latter using a DLPNO, PNO-L, or LNO approach to reduce the scaling with system size. For the HLCs of non-Möbius structures, all methods can comfortably meet the 1 kcal mol–1 threshold (see Table ); DLPNO with tight settings can even reach down to 0.4 kcal mol–1, and PNO-LCCSD(T) and LNO-CCSD(T) on their respective tight settings can go as low as 0.2 kcal mol–1 RMSD. It is again the Möbius structures that are the most problematic for DLPNO-CCSD(T) (x = 0,1) and PNO-LCCSD(T), while LNO-CCSD(T) is much more resilient to them. Even on default settings, LNO-CCSD(T) achieves RMSD = 0.7 kcal mol–1 overall, which drops to 0.3 kcal mol–1 on tight settings. PNO-LCCSD(T) on default settings and DLPNO-CCSD(T1) on TightPNO settings both come close to the 1 kcal mol–1 mark overall. Particularly, the Möbius heptaphyrins throw a spanner in the works for DLPNO and PNO, which is much less the case for LNO on default settings, while no error exceeds 0.6 kcal mol–1 for LNO on tight settings. For the entire database of expanded porphyrins, we find an RMSD of 1.2–1.3 kcal mol–1 both for PNO-LCCSD(T) on Normal settings and for DLPNO-CCSD(T1) on Tight settings. The above results do make a good case for combining a localized HLC—for which either PNO-CCSD(T) Normal or DLPNO-CCSD(T1) Tight, but especially LNO-CCSD(T), would fit the bill—with a separate canonical MP2 calculation in a larger basis set—be it canonical RI-MP2 or DLPNO-MP2. For larger systems, eventually the O(N5) scaling of RI-MP2 would dominate the CPU time, but we have seen in Table that especially DLPNO-MP2 with TightPNO can closely emulate canonical MP2 energetics. Another approach toward converging the MP2 part would be to carry out PNO-LMP2-F12 calculations.[63]

Conclusions

Localized natural orbital approaches are a very promising new alternative to both wave function methods and density functional theory. They, in principle, offer the gentle system size scaling of DFT without the empiricism (of accuracy) involved in the exchange-correlation functional—at the expense of introducing a measure of “empiricism of precision” through the various cutoffs introduced. For systems with predominantly dynamical correlation, approaches like DLPNO-CCSD(T1) and PNO-LCCSD(T) seem to track canonical CCSD(T) results quite closely (see also the very recent paper[64] by Liakos, Guo, and Neese on the GMTKN55 benchmark suite[65]), while for truly severe static correlation, both canonical CCSD(T) and its localized approximations may be beyond help. Our results concern the intermediate regime, i.e., moderate static correlation: we found not only that discrepancies between canonical CCSD(T) and DLPNO-CCSD(T1) or PNO-LCCSD(T) can reach several kcal mol–1 for isomerization energies of chemical interest but that their magnitude is roughly proportional to several diagnostics for Type A static correlation. These problems can be somewhat mitigated by combining HLCs (i.e., CCSD(T)-MP2 differences), from the localized methods with more rigorous MP2 energetics, which are comparatively inexpensive to obtain. The LNO-CCSD(T) approach of Nagy and Kallay offers an alternative that, at least for the considered macrocycles, is more resilient to static correlation, especially with tight cutoffs, and can consistently approach the canonical values to better than 1 kcal mol–1. It achieves this feat at the expense of CPU times being more dependent on the degree of static correlation. For the present problem, Möbius structures required about four times as much CPU time as their Hückel and figure-eight isomers, while the Möbius:Hückel ratio is only about 2:1 for the DLPNO and PNO approaches. As in so many scientific and nonscientific contexts, the TANSTAAFL principle[66] applies (“there ain’t no such thing as a free lunch”). Finally, since the expanded porphyrins considered here and in ref (6) appear to be a useful test for resilience of quantum chemical approaches to static correlation, we propose the present POLYPYR21 data set as a benchmark for this purpose. The reference geometries, obtained at the B3LYP/6-311G(d,p) level[67−69] in ref (6), are available for download as Supporting Information to the present paper.

Table 6

Canonical (T) Relative Energies (kcal mol–1) and Errors with Various Localized Orbital (T) Approximations for the Relative Energies of the Expanded Porphyrins under Considerationa

NormalPNO (ORCA).

tightPNO (ORCA).

defaultDomain (MOLPRO).

tightDomain (MOLPRO).

Normal settings (MRCC).

Tight settings (MRCC).

47 in total

1. Scalable Electron Correlation Methods. 2. Parallel PNO-LMP2-F12 with Near Linear Scaling in the Molecular Size.

Authors: Qianli Ma; Hans-Joachim Werner
Journal: J Chem Theory Comput Date: 2015-10-27 Impact factor: 6.006

2. Basis set convergence of post-CCSD contributions to molecular atomization energies.

Authors: Amir Karton; Peter R Taylor; Jan M L Martin
Journal: J Chem Phys Date: 2007-08-14 Impact factor: 3.488

3. The two faces of static correlation.

Authors: Joshua W Hollett; Peter M W Gill
Journal: J Chem Phys Date: 2011-03-21 Impact factor: 3.488

4. Optimization of the Linear-Scaling Local Natural Orbital CCSD(T) Method: Improved Algorithm and Benchmark Applications.

Authors: Péter R Nagy; Gyula Samu; Mihály Kállay
Journal: J Chem Theory Comput Date: 2018-07-24 Impact factor: 6.006

5. Topology switching in [32]heptaphyrins controlled by solvent, protonation, and meso substituents.

Authors: Mercedes Alonso; Paul Geerlings; Frank De Proft
Journal: Chemistry Date: 2012-12-18 Impact factor: 5.236

6. New electron delocalization tools to describe the aromaticity in porphyrinoids.

Authors: Irene Casademont-Reig; Tatiana Woller; Julia Contreras-García; Mercedes Alonso; Miquel Torrent-Sucarrat; Eduard Matito
Journal: Phys Chem Chem Phys Date: 2018-01-24 Impact factor: 3.676

Review 7. Chemistry of meso-Aryl-Substituted Expanded Porphyrins: Aromaticity and Molecular Twist.

Authors: Takayuki Tanaka; Atsuhiro Osuka
Journal: Chem Rev Date: 2016-08-18 Impact factor: 60.622

8. Three-level topology switching in a molecular Möbius band.

Authors: Marcin Stepień; Bartosz Szyszko; Lechosław Latos-Grazyński
Journal: J Am Chem Soc Date: 2010-03-10 Impact factor: 15.419

Review 9. The Rise of Near-Infrared Emitters: Organic Dyes, Porphyrinoids, and Transition Metal Complexes.

Authors: Andrea Barbieri; Elisa Bandini; Filippo Monti; Vakayil K Praveen; Nicola Armaroli
Journal: Top Curr Chem (Cham) Date: 2016-07-19

10. Exploring the structure-aromaticity relationship in Hückel and Möbius N-fused pentaphyrins using DFT.

Authors: M Alonso; P Geerlings; F De Proft
Journal: Phys Chem Chem Phys Date: 2014-07-28 Impact factor: 3.676

5 in total

1. Accurate Reduced-Cost CCSD(T) Energies: Parallel Implementation, Benchmarks, and Large-Scale Applications.

Authors: László Gyevi-Nagy; Mihály Kállay; Péter R Nagy
Journal: J Chem Theory Comput Date: 2021-01-05 Impact factor: 6.006

2. Triplet State Baird Aromaticity in Macrocycles: Scope, Limitations, and Complications.

Authors: Rabia Ayub; Ouissam El Bakouri; Joshua R Smith; Kjell Jorner; Henrik Ottosson
Journal: J Phys Chem A Date: 2021-01-11 Impact factor: 2.781

3. Fine-Tuning of Nonlinear Optical Contrasts of Hexaphyrin-Based Molecular Switches Using Inverse Design.

Authors: Eline Desmedt; Tatiana Woller; Jos L Teunissen; Freija De Vleeschouwer; Mercedes Alonso
Journal: Front Chem Date: 2021-12-03 Impact factor: 5.221

4. Performance of Electronic Structure Methods for the Description of Hückel-Möbius Interconversions in Extended π-Systems.

Authors: Tatiana Woller; Ambar Banerjee; Nitai Sylvetsky; Golokesh Santra; Xavier Deraet; Frank De Proft; Jan M L Martin; Mercedes Alonso
Journal: J Phys Chem A Date: 2020-03-13 Impact factor: 2.781

5. The MOBH35 Metal-Organic Barrier Heights Reconsidered: Performance of Local-Orbital Coupled Cluster Approaches in Different Static Correlation Regimes.

Authors: Emmanouil Semidalas; Jan M L Martin
Journal: J Chem Theory Comput Date: 2022-01-19 Impact factor: 6.006

5 in total