Abstract
This article studies two kinds of information extracted from statistical correlations between methods for assigning net atomic charges (NACs) in molecules. First, relative charge transfer magnitudes are quantified by performing instant least squares fitting (ILSF) on the NACs reported by Cho et al. (ChemPhysChem, 2020, 21, 688-696) across 26 methods applied to ∼2000 molecules. The Hirshfeld and Voronoi deformation density (VDD) methods had the smallest charge transfer magnitudes, while the quantum theory of atoms in molecules (QTAIM) method had the largest charge transfer magnitude. Methods optimized to reproduce the molecular dipole moment (e.g., ACP, ADCH, CM5) have smaller charge transfer magnitudes than methods optimized to reproduce the molecular electrostatic potential (e.g., CHELPG, HLY, MK, RESP). Several methods had charge transfer magnitudes even larger than the electrostatic potential fitting group. Second, confluence between different charge assignment methods is quantified to identify which charge assignment method produces the best NAC values for predicting via linear correlations the results of 20 charge assignment methods having a complete basis set limit across the dataset of ∼2000 molecules. The DDEC6 NACs were the best such predictor of the entire dataset. Seven confluence principles are introduced explaining why confluent quantitative descriptors offer predictive advantages for modeling a broad range of physical properties and target applications. These confluence principles can be applied in various fields of scientific inquiry. A theory is derived showing confluence is better revealed by standardized statistical analysis (e.g., principal components analysis of the correlation matrix and standardized reversible linear regression) than by unstandardized statistical analysis. These confluence principles were used together with other key principles and the scientific method to make assigning atom-in-material properties non-arbitrary. 
The N@C60 system provides an unambiguous and non-arbitrary falsifiable test of atomic population analysis methods. The HLY, ISA, MK, and RESP methods failed for this material. This journal is © The Royal Society of Chemistry.
Year: 2020 PMID: 35517149 PMCID: PMC9058476 DOI: 10.1039/d0ra06392d
Source DB: PubMed Journal: RSC Adv ISSN: 2046-2069 Impact factor: 4.036
Fig. 1 Geometry illustrating the error measures used in total least squares (approach 1) and orthogonal distance regression (approach 2). The red line represents the model equation. The green dot represents the measured datapoint. Approach 1 minimizes t², and approach 2 minimizes h².
Relative charge transfer magnitudes of 26 NAC methods across ∼2000 molecules and ions. The NAC methods are ordered from smallest to largest charge transfer magnitude. Other characteristics of each NAC method are listed in the remaining columns. The last column includes the following additional comments on convergence properties: (a) non-convex means there is a problem in some materials where the converged solutions are not unique because the optimization landscape is not convex, (b) fails for buried atoms (FFBA) means the method assigns erroneous charges on buried atoms, and (c) frozen core inconsistent (FCI) means the method is defined in such a way that it may give vastly different results if a different number of frozen core electrons is chosen
| Method | Relative charge transfer magnitude | | Relative to DDEC6 | Basis set limit? | Non-negative density partition? | Approach | Comment |
|---|---|---|---|---|---|---|---|
| Hirshfeld | 0.1284 | 0.00171 | 0.413 | Yes | Overlapping | Deformation density | |
| VDD | 0.1318 | 0.00191 | 0.424 | Yes | No | Deformation density | |
| Mulliken | 0.1993 | 0.00171 | 0.641 | No | No | 1PDM projection | |
| ACP | 0.2208 | 0.00171 | 0.710 | Yes | Overlapping | Dipole intent | FCI |
| CM5 | 0.2225 | 0.00171 | 0.716 | Yes | No | Dipole intent | |
| ADCH | 0.2291 | 0.00171 | 0.737 | Yes | No | Dipole fit | |
| EEQ | 0.2294 | 0.00171 | 0.738 | Yes | No | Classical (no QM) | |
| i-ACP | 0.2994 | 0.00170 | 0.963 | Yes | Overlapping | Dipole intent | FCI |
| DDEC6 | 0.3108 | 0.00171 | 1.000 | Yes | Overlapping | Confluence | |
| CHELPG | 0.3210 | 0.00171 | 1.033 | Yes | No | MEP fit | FFBA |
| IBO | 0.3220 | 0.00171 | 1.036 | Yes | No | Reference orbitals | |
| RESP | 0.3231 | 0.00171 | 1.039 | Depends on fitting constraints | No | Constrained MEP fit | |
| MK | 0.3304 | 0.00171 | 1.063 | Yes | No | MEP fit | FFBA |
| Bickelhaupt | 0.3345 | 0.00171 | 1.076 | No | No | 1PDM projection | |
| HLY | 0.3465 | 0.00171 | 1.115 | Yes | No | MEP fit | FFBA |
| ISA | 0.3516 | 0.00116 | 1.131 | Yes | Overlapping | Spherical averaging | FFBA |
| Hirshfeld-I | 0.3783 | 0.00171 | 1.217 | Yes | Overlapping | Reference ions | Non-convex |
| MBIS | 0.3808 | 0.00111 | 1.225 | Yes | Overlapping | Slater functions | Non-convex |
| MBSBickelhaupt | 0.3828 | 0.00171 | 1.231 | No | No | 1PDM projection | |
| Becke | 0.3914 | 0.00171 | 1.259 | Yes | Overlapping | Reference radii | |
| Stout–Politzer | 0.3937 | 0.00171 | 1.267 | No | No | 1PDM projection | |
| APT | 0.3952 | 0.00171 | 1.272 | Yes | No | Dipole derivatives fit | |
| NPA | 0.4272 | 0.00171 | 1.374 | No | No | 1PDM projection | |
| MBSMulliken | 0.4333 | 0.00171 | 1.394 | Yes (with respect to the source basis set) | No | 1PDM projection | |
| Ros–Schuit | 0.4557 | 0.00171 | 1.466 | No | No | 1PDM projection | |
| QTAIM | 0.6299 | 0.00171 | 2.027 | Yes | Non-overlapping | Virial compartments | |
No basis set or quantum chemistry calculation is required to compute EEQ NACs.
Many different charge electronegativity equilibration schemes have been proposed. Many of these are not robust, because they sometimes produce extremely high NAC magnitudes.
The IBO method currently requires the first-order density matrix to be idempotent.
Whether or not the RESP NACs have a complete basis set limit depends on the type of fitting constraints used. If and only if the fitting constraints have no basis set dependence or have a complete basis set limit, then the corresponding RESP NACs will have a complete basis set limit. Whether the RESP NACs are robust depends on how the constraints are constructed.
Not rotationally invariant.
Methods that project populations from a quantum chemistry calculation basis set (aka ‘source basis set’) onto a small basis set (aka ‘target basis set’) have a basis set limit with respect to improving the source basis set towards completeness, but their results depend on the small target basis set onto which the populations are projected.
QTAIM partitions are robust only when they have been sufficiently smoothed so that noise does not create spurious virial compartments.
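The third numeric column of the table above (0.413 for Hirshfeld, 1.000 for DDEC6) appears to be each method's charge transfer magnitude divided by the DDEC6 value. The following is a minimal Python sketch that checks this reading on a few rows copied from the table. The names `magnitudes` and `normalize_to_reference` are illustrative and do not come from the source.

```python
# Charge transfer magnitudes copied from a subset of rows in the table above.
magnitudes = {
    "Hirshfeld": 0.1284,
    "CM5": 0.2225,
    "DDEC6": 0.3108,
    "CHELPG": 0.3210,
    "QTAIM": 0.6299,
}


def normalize_to_reference(values, reference="DDEC6"):
    """Divide every magnitude by the reference method's magnitude."""
    ref = values[reference]
    return {name: value / ref for name, value in values.items()}


normalized = normalize_to_reference(magnitudes)
for name, ratio in normalized.items():
    # Three decimals, to match the precision used in the table.
    print(f"{name:10s} {ratio:.3f}")
```

Rounded to three decimals, the ratios for these rows reproduce the tabulated values 0.413, 0.716, 1.000, 1.033, and 2.027.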
Relative root mean squared errors (RRMSE) in electrostatic potential of the water molecule for 20 charge assignment methods having a complete basis set limit. Errors in the predicted molecular dipole moment magnitude are also listed. Methods listed from smallest to largest NAC magnitude on oxygen. For some of the non-negative AIM density partitioning methods, the errors including atomic dipoles are listed in parentheses
| Method | Oxygen NAC | RRMSE in electrostatic potential | Error in dipole moment magnitude |
|---|---|---|---|
| VDD | −0.286 | 61% | −59% |
| Hirshfeld | −0.306 | 58% (11%) | −56% (0%) |
| EQeq | −0.368 | 49% | −47% |
| APT | −0.513 | 30% | −26% |
| ACP | −0.522 | 29% | −25% |
| CM5 | −0.642 | 16% | −7% |
| Becke | −0.645 | 16% (22%) | −7% (0%) |
| MBSMulliken | −0.663 | 15% | −4% |
| ADCH | −0.693 | 14% | 0% |
| RESP | −0.704 | 14% | 2% |
| MK | −0.705 | 14% | 2% |
| CHELPG | −0.710 | 14% | 2% |
| HLY | −0.715 | 14% | 3% |
| i-ACP | −0.720 | 14% | 4% |
| IBO | −0.734 | 14% | 6% |
| DDEC6 | −0.802 | 19% (8%) | 16% (0%) |
| ISA | −0.841 | 23% (7%) | 21% (0%) |
| MBIS | −0.876 | 27% (6%) | 26% (0%) |
| Hirshfeld-I | −0.900 | 30% (4%) | 30% (0%) |
| QTAIM | −1.212 | 72% (10%) | 75% (0%) |
Fig. 2 Correlation matrix between 20 methods having a complete basis set limit for assigning net atomic charges in molecules. Stoplight colors indicate the correlation values: green ≥ 0.9, 0.8 ≤ yellow < 0.9, red < 0.8. Blue shading marks blocks of values ≥ 0.9. There are three primary groups: (a) a main group that covers a large number of methods, (b) the i-ACP, APT, and QTAIM group, and (c) the VDD and Hirshfeld group. The DDEC6 method is strongly correlated to all members of group (a) plus the i-ACP method in group (b) and the Hirshfeld method in group (c). No other charge assignment method besides DDEC6 is strongly correlated to some members of all three groups. The ADCH and Becke methods are not strongly correlated to any charge assignment method other than themselves. The ADCH-CHELPG entry is red rather than yellow, because its value is 0.7996, which is below the 0.8 cutoff. The ADCH-CM5 entry is yellow rather than green, because its value is 0.8999, which is below the 0.9 cutoff.
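The color bins in the Fig. 2 caption amount to a simple threshold rule. The following minimal Python sketch applies it to the two borderline entries the caption mentions. The function name `stoplight_color` is illustrative and not from the source.

```python
def stoplight_color(value):
    """Map a matrix entry to the color bins given in the Fig. 2 caption."""
    if value >= 0.9:
        return "green"
    if value >= 0.8:
        return "yellow"
    return "red"


# The two borderline entries discussed in the caption.
for pair, value in [("ADCH-CHELPG", 0.7996), ("ADCH-CM5", 0.8999)]:
    print(pair, value, stoplight_color(value))
```

Both entries fall just under a cutoff, so ADCH-CHELPG is classified as red and ADCH-CM5 as yellow, as the caption states.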
The first four eigenvalues and principal components coefficients for correlation PCA of 20 charge assignment methods having a complete basis set limit. The methods are listed in order from largest to smallest contribution to the MPC. The last column is listed for comparison to the MPC coefficient of column 2
| | PC1 (MPC) | PC2 | PC3 | PC4 | Comparison (see caption) |
|---|---|---|---|---|---|
| % correlation explained | 85.8% | 4.1% | 2.8% | 2.6% | — |
| Eigenvalue→ | 17.158 | 0.816 | 0.562 | 0.524 | — |
| DDEC6 | 0.238 | 0.020 | −0.035 | −0.075 | 0.238 |
| MBIS | 0.237 | 0.020 | −0.015 | −0.098 | 0.237 |
| ISA | 0.236 | −0.094 | 0.142 | −0.102 | 0.236 |
| Hirshfeld-I | 0.235 | −0.076 | −0.082 | −0.078 | 0.235 |
| ACP | 0.233 | 0.083 | −0.097 | 0.023 | 0.233 |
| CHELPG | 0.230 | −0.126 | 0.301 | −0.134 | 0.230 |
| i-ACP | 0.230 | −0.249 | −0.043 | 0.046 | 0.230 |
| RESP | 0.230 | −0.011 | 0.340 | −0.185 | 0.229 |
| MK | 0.229 | −0.007 | 0.354 | −0.200 | 0.229 |
| IBO | 0.228 | 0.156 | −0.220 | −0.043 | 0.227 |
| CM5 | 0.227 | 0.247 | −0.145 | 0.030 | 0.227 |
| EEQ | 0.225 | 0.207 | −0.146 | 0.050 | 0.225 |
| MBSMulliken | 0.225 | 0.226 | −0.203 | −0.072 | 0.225 |
| HLY | 0.224 | 0.094 | 0.366 | −0.246 | 0.224 |
| Hirshfeld | 0.223 | 0.045 | −0.259 | 0.123 | 0.223 |
| VDD | 0.222 | −0.026 | −0.263 | 0.158 | 0.222 |
| ADCH | 0.213 | 0.376 | −0.082 | 0.004 | 0.213 |
| APT | 0.204 | −0.537 | −0.087 | 0.096 | 0.205 |
| QTAIM | 0.203 | −0.514 | −0.189 | 0.106 | 0.203 |
| Becke | 0.169 | 0.126 | 0.420 | 0.861 | 0.171 |
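Two standard PCA identities help in reading this table. The eigenvalues of a correlation matrix sum to the number of variables (here 20), so each percentage is the eigenvalue divided by 20. The correlation between a standardized variable and a principal component equals its coefficient times the square root of that component's eigenvalue. The sketch below applies both to numbers copied from the table. The variable names are illustrative, and because the inputs are rounded, the second result is only approximate.

```python
import math

n_methods = 20  # number of charge assignment methods in the correlation PCA
eigenvalues = [17.158, 0.816, 0.562, 0.524]  # first four eigenvalues from the table

# The trace of a correlation matrix equals the number of variables,
# so each eigenvalue's share is eigenvalue / n_methods.
percent_explained = [100.0 * ev / n_methods for ev in eigenvalues]

# PC1 (MPC) coefficients for three methods, copied from the table.
mpc_coefficients = {"DDEC6": 0.238, "MBIS": 0.237, "Becke": 0.169}

# Correlation of each method with the MPC: coefficient * sqrt(eigenvalue).
mpc_correlations = {
    name: coeff * math.sqrt(eigenvalues[0])
    for name, coeff in mpc_coefficients.items()
}

print([round(p, 1) for p in percent_explained])
for name, corr in mpc_correlations.items():
    print(f"{name:6s} {corr:.3f}")
```

The percentages reproduce the 85.8%, 4.1%, 2.8%, and 2.6% in the first row. The three products come out close to 0.986, 0.981, and 0.699, which are the values in the unlabeled numeric column next to these methods in the 20-method ranking table. This suggests that column reports correlation to the MPC, but that reading is an inference from the numbers, not a label from the source.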
Rank of each charge assignment method according to its amount of correlation to other charge assignment methods. The S and Ω(α, ϕ) ranking criteria always give the same order of methods. This table includes 20 charge assignment methods with a complete basis set limit
| Rank | Method | S | Ω(α, ϕ) | Method | | Method | Number ( | Method | Number ( |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DDEC6 | 18.204 | 0.985 | DDEC6 | 0.986 | DDEC6 | 19 | DDEC6 | 15 |
| 2 | MBIS | 18.109 | 0.980 | MBIS | 0.981 | MBIS | 19 | MBIS | 14 |
| 3 | ISA | 18.064 | 0.977 | ISA | 0.978 | ISA | 19 | Hirshfeld-I | 11 |
| 4 | Hirshfeld-I | 17.981 | 0.973 | Hirshfeld-I | 0.974 | Hirshfeld-I | 19 | ISA | 10 |
| 5 | ACP | 17.823 | 0.964 | ACP | 0.965 | CHELPG | 18 | ACP | 9 |
| 6 | i-ACP | 17.603 | 0.953 | CHELPG | 0.953 | i-ACP | 18 | CHELPG | 9 |
| 7 | CHELPG | 17.600 | 0.952 | i-ACP | 0.952 | ACP | 17 | i-ACP | 9 |
| 8 | RESP | 17.564 | 0.950 | RESP | 0.951 | RESP | 17 | RESP | 8 |
| 9 | MK | 17.520 | 0.948 | MK | 0.949 | MK | 17 | MK | 8 |
| 10 | IBO | 17.400 | 0.942 | IBO | 0.942 | IBO | 17 | MBSMulliken | 8 |
| 11 | CM5 | 17.396 | 0.941 | CM5 | 0.942 | CM5 | 17 | CM5 | 7 |
| 12 | EEQ | 17.237 | 0.933 | EEQ | 0.933 | EEQ | 17 | HLY | 7 |
| 13 | MBSMulliken | 17.188 | 0.930 | MBSMulliken | 0.931 | MBSMulliken | 17 | IBO | 6 |
| 14 | HLY | 17.143 | 0.928 | HLY | 0.929 | VDD | 17 | EEQ | 6 |
| 15 | Hirshfeld | 17.088 | 0.925 | Hirshfeld | 0.924 | Hirshfeld | 17 | Hirshfeld | 3 |
| 16 | VDD | 16.996 | 0.920 | VDD | 0.919 | HLY | 16 | APT | 3 |
| 17 | ADCH | 16.318 | 0.883 | ADCH | 0.883 | ADCH | 15 | QTAIM | 3 |
| 18 | APT | 15.683 | 0.849 | APT | 0.847 | APT | 9 | VDD | 2 |
| 19 | QTAIM | 15.562 | 0.842 | QTAIM | 0.840 | QTAIM | 8 | ADCH | 1 |
| 20 | Becke | 13.052 | 0.706 | Becke | 0.699 | Becke | 1 | Becke | 1 |
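The first two numeric columns of this table are numerically consistent with the second being the first divided by 18.48063, the summed-correlations value listed in the final table of this section. This would explain why the two criteria always give the same order. The Python sketch below checks the ratio on four rows. It is an observation about the tabulated numbers, not a definition taken from the source, and the names `S_PHI`, `s_values`, and `reported` are illustrative.

```python
# Summed correlations copied from the final table (assumed here to belong to phi).
S_PHI = 18.48063

# First numeric column of the ranking table, for four methods.
s_values = {"DDEC6": 18.204, "MBIS": 18.109, "ADCH": 16.318, "Becke": 13.052}

# Second numeric column of the ranking table, for the same methods.
reported = {"DDEC6": 0.985, "MBIS": 0.980, "ADCH": 0.883, "Becke": 0.706}

ratios = {method: s / S_PHI for method, s in s_values.items()}

for method in s_values:
    print(f"{method:6s} ratio = {ratios[method]:.3f}  reported = {reported[method]:.3f}")
```

For these four rows the ratio matches the reported value to three decimals.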
Falsifiable scientific tests of 20 methods to assign NACs in molecular systems. The NAC and ASM of the central atom are listed for each method
| Method | N@C60 NAC | N@C60 ASM | [Eu@C60]+ NAC | [Eu@C60]+ ASM |
|---|---|---|---|---|
| ACP | −0.017 | n/a | n/a | n/a |
| ADCH | 0.126 | 2.720 | 0.476 | 6.891 |
| APT | 0.015 | n/a | 0.415 | n/a |
| Becke | −0.056 | 2.900 | −4.427 | 7.001 |
| CHELPG | 0.371 | n/a | 1.031 | n/a |
| CM5 | 0.120 | 2.720 | 1.016 | 6.891 |
| DDEC6 | 0.143 | 2.836 | 1.360 | 6.933 |
| EQeq | −0.081 | n/a | 1.278 | n/a |
| Hirshfeld | 0.139 | 2.720 | 0.525 | 6.891 |
| Hirshfeld-I | 0.147 | 2.788 | 1.483 | 6.892 |
| HLY | 1050.40 | n/a | 199.86 | n/a |
| i-ACP | −0.009 | n/a | n/a | n/a |
| IBO | −0.013 | 2.987 | n/a | n/a |
| ISA | −3.082 | 2.800 | 1.452 | 6.910 |
| MBIS | 0.157 | 2.821 | n/a | n/a |
| MBSMulliken | −0.019 | 2.981 | n/a | n/a |
| MK | 11.986 | n/a | 0.926 | n/a |
| QTAIM | 0.014 | 2.888 | 2.691 | 6.932 |
| RESP | 9.116 [6.553] | n/a | 0.925 [0.925] | n/a |
| VDD | 0.198 | 2.906 | 0.339 | 6.931 |
The ACP and i-ACP parameters are not yet defined for the element Eu. Although the ACP and i-ACP methods could yield ASMs, this is not yet available in the software.
ASMs for the ADCH and CM5 methods are taken from the Hirshfeld partition.
This method does not give ASMs.
IBOView version 20150427 could not compute IBO populations for atoms using a RECP.
The software used was not set up to compute MBIS populations for atoms using a RECP.
MBSMulliken was not available for the Eu element in the Gaussian 16 program.
Two-stage fitting without brackets. One-stage fitting in brackets. See text for RESP penalty function parameter values.
Rank of each charge assignment method according to its amount of correlation to other charge assignment methods. The S and Ω(α, ϕ) ranking criteria always give the same order of methods. This table includes all 26 charge assignment methods
| Rank | Method | S | Ω(α, ϕ) | Method | | Method | Number ( | Method | Number ( |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DDEC6 | 23.575 | 0.986 | DDEC6 | 0.987 | DDEC6 | 24 | DDEC6 | 20 |
| 2 | MBIS | 23.481 | 0.982 | MBIS | 0.983 | MBIS | 24 | MBIS | 19 |
| 3 | MBSBickelhaupt | 23.468 | 0.981 | MBSBickelhaupt | 0.981 | MBSBickelhaupt | 24 | MBSBickelhaupt | 16 |
| 4 | Hirshfeld-I | 23.251 | 0.972 | Hirshfeld-I | 0.973 | Hirshfeld-I | 24 | Hirshfeld-I | 14 |
| 5 | ISA | 23.195 | 0.970 | ISA | 0.970 | ISA | 24 | ACP | 14 |
| 6 | ACP | 23.138 | 0.967 | ACP | 0.967 | Bickelhaupt | 24 | Bickelhaupt | 14 |
| 7 | Bickelhaupt | 23.093 | 0.965 | Bickelhaupt | 0.966 | i-ACP | 23 | ISA | 13 |
| 8 | NPA | 22.884 | 0.957 | NPA | 0.958 | ACP | 22 | MBSMulliken | 13 |
| 9 | IBO | 22.801 | 0.953 | IBO | 0.954 | NPA | 22 | Mulliken | 13 |
| 10 | CM5 | 22.693 | 0.949 | CM5 | 0.949 | IBO | 22 | NPA | 12 |
| 11 | MBSMulliken | 22.663 | 0.947 | MBSMulliken | 0.948 | CM5 | 22 | IBO | 11 |
| 12 | Mulliken | 22.653 | 0.947 | Mulliken | 0.947 | MBSMulliken | 22 | CM5 | 11 |
| 13 | EEQ | 22.540 | 0.942 | EEQ | 0.942 | Mulliken | 22 | i-ACP | 10 |
| 14 | i-ACP | 22.530 | 0.942 | i-ACP | 0.942 | EEQ | 22 | CHELPG | 10 |
| 15 | RESP | 22.467 | 0.939 | RESP | 0.940 | RESP | 22 | Stout–Politzer | 10 |
| 16 | CHELPG | 22.429 | 0.938 | CHELPG | 0.938 | CHELPG | 22 | EEQ | 8 |
| 17 | MK | 22.414 | 0.937 | MK | 0.938 | MK | 22 | RESP | 8 |
| 18 | Stout–Politzer | 22.162 | 0.927 | Stout–Politzer | 0.927 | Hirshfeld | 22 | MK | 8 |
| 19 | Hirshfeld | 22.074 | 0.923 | Hirshfeld | 0.923 | HLY | 21 | HLY | 7 |
| 20 | HLY | 22.021 | 0.921 | HLY | 0.922 | VDD | 21 | Hirshfeld | 4 |
| 21 | VDD | 21.897 | 0.915 | VDD | 0.915 | Stout–Politzer | 20 | APT | 3 |
| 22 | ADCH | 21.283 | 0.890 | ADCH | 0.890 | ADCH | 20 | QTAIM | 3 |
| 23 | APT | 19.970 | 0.835 | APT | 0.834 | APT | 11 | VDD | 2 |
| 24 | QTAIM | 19.855 | 0.830 | QTAIM | 0.830 | QTAIM | 10 | ADCH | 1 |
| 25 | Ros–Schuit | 16.867 | 0.705 | Ros–Schuit | 0.701 | Ros–Schuit | 1 | Ros–Schuit | 1 |
| 26 | Becke | 16.718 | 0.699 | Becke | 0.693 | Becke | 1 | Becke | 1 |
Rankings of nine charge assignment methods in a pared down dataset. The S and Ω(α, ϕ) ranking criteria always give the same order of methods
| Rank | Method | S | Ω(α, ϕ) | Method | | Method | Number ( | Method | Number ( |
|---|---|---|---|---|---|---|---|---|---|
| 1 | DDEC6 | 8.111 | 0.977 | DDEC6 | 0.978 | DDEC6 | 9 | DDEC6 | 5 |
| 2 | IBO | 7.967 | 0.960 | IBO | 0.962 | VDD | 8 | CM5 | 5 |
| 3 | CM5 | 7.899 | 0.952 | CM5 | 0.954 | IBO | 7 | MBSMulliken | 5 |
| 4 | MBSMulliken | 7.868 | 0.948 | MBSMulliken | 0.951 | CM5 | 7 | IBO | 4 |
| 5 | EEQ | 7.856 | 0.946 | EEQ | 0.948 | MBSMulliken | 7 | EEQ | 4 |
| 6 | VDD | 7.733 | 0.932 | VDD | 0.931 | EEQ | 7 | QTAIM | 2 |
| 7 | ADCH | 7.417 | 0.894 | ADCH | 0.896 | ADCH | 7 | APT | 2 |
| 8 | QTAIM | 7.046 | 0.849 | QTAIM | 0.843 | APT | 4 | VDD | 1 |
| 9 | APT | 7.013 | 0.845 | APT | 0.839 | QTAIM | 3 | ADCH | 1 |
Fig. 3 In a group of darts aimed at any target, the centrally located dart never lands farthest from the target. If the group of darts follows a spherically symmetric distribution, then a centrally located dart lands closer to the target than at least ∼50% of the darts. In other words, the centrally located dart performs average or better for diverse targets. This centrally located dart exhibits confluence properties including high correlation to the other individual darts and to the main principal component of the dart group.
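The dart analogy can be explored with a small Monte Carlo sketch in Python. Everything here is an assumption made for illustration: the darts are drawn from an isotropic two-dimensional Gaussian, "centrally located" is taken to mean the dart nearest the group centroid, and the target positions are arbitrary. None of these choices come from the source.

```python
import math
import random

random.seed(0)  # fixed seed so repeated runs give the same darts

# Hypothetical dart group: 2000 darts from an isotropic 2D Gaussian.
darts = [(random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)) for _ in range(2000)]


def distance(p, q):
    """Euclidean distance between two points in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Assumption: the "centrally located" dart is the one nearest the centroid.
centroid = (
    sum(x for x, _ in darts) / len(darts),
    sum(y for _, y in darts) / len(darts),
)
central = min(darts, key=lambda d: distance(d, centroid))


def fraction_beaten(target):
    """Fraction of the other darts landing farther from target than the central dart."""
    d_central = distance(central, target)
    others = [d for d in darts if d is not central]
    return sum(distance(d, target) > d_central for d in others) / len(others)


# Arbitrary targets, from the middle of the group to far outside it.
targets = [(0.0, 0.0), (1.0, 0.5), (-3.0, 2.0), (10.0, -10.0), (100.0, 0.0)]
fractions = [fraction_beaten(t) for t in targets]
for target, frac in zip(targets, fractions):
    print(target, f"{frac:.2f}")
```

Under these assumptions, the central dart should beat nearly all other darts for targets near the middle of the group and roughly half of them for distant targets, in line with the caption.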
Summed correlations and summed squared correlations between ϕ or MPC and the NAC methods
| | Summed correlations | Summed squared correlations |
|---|---|---|
| ϕ | 18.48063 | 17.15555 |
| MPC | 18.47951 | 17.15756 |