Stanislav Geidl, Tomáš Bouchal, Tomáš Raček, Radka Svobodová Vařeková, Václav Hejret, Aleš Křenek, Ruben Abagyan, Jaroslav Koča.
Abstract
BACKGROUND: Partial atomic charges describe the distribution of electron density in a molecule and therefore provide clues to its chemical behaviour. Recently, these charges have become popular in chemoinformatics, as they are informative descriptors that can be utilised in pharmacophore design, virtual screening, similarity searches, etc. Conformationally dependent charges perform especially well. In particular, their fast and accurate calculation via the Electronegativity Equalization Method (EEM) seems very promising for chemoinformatics applications. Unfortunately, published EEM parameter sets include parameters only for basic atom types and often lack parameters for halogens, phosphorus, sulphur, triple-bonded carbon, etc. Their applicability to drug-like molecules is therefore limited.
Keywords: Drug-like molecules; EEM; Electronegativity Equalization Method; Partial atomic charges; QM; Quantum mechanics
Year: 2015 PMID: 26633997 PMCID: PMC4667495 DOI: 10.1186/s13321-015-0107-1
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
Summary information about published EEM parameters evaluated in this study
| QM theory level + basis set | Charge calc. scheme | EEM parameter set name | Published by | Elements and bond orders included |
|---|---|---|---|---|
| HF/STO-3G | MPA | Baek1991 | Baekelandt et al. | C, O, N, H, P, Al, Si |
| HF/STO-3G | MPA | Svob2007_cbeg2 | Svobodova et al. | C1, C2, O, N1, N2, H, S1 |
| HF/STO-3G | MPA | Svob2007_cmet2 | Svobodova et al. | C1, C2, O, N1, N2, H, S1, Fe, Zn |
| HF/STO-3G | MPA | Svob2007_chal2 | Svobodova et al. | C1, C2, O, N1, N2, H, S1, Br, Cl, F, I |
| HF/STO-3G | MPA | Svob2007_hm2 | Svobodova et al. | C1, C2, O, N1, N2, H, S1, F, Cl, Br, I, Fe, Zn |
| HF/6-31G* | MK | Jir2008_hf | Jirouskova et al. | C1, C2, O, N1, N2, H, S1, F, Cl, Br, Zn |
| B3LYP/6-31G* | MPA | Bult2002_mpa | Bultinck et al. | C, O, N, H, F |
| B3LYP/6-31G* | NPA | Bult2002_npa | Bultinck et al. | C, O, N, H, F |
| B3LYP/6-31G* | NPA | Ouy2009 | Ouyang et al. | C, O, N, H |
| B3LYP/6-31G* | NPA | Ouy2009_elem | Ouyang et al. | C, O, N, H |
| B3LYP/6-31G* | Hir. | Bult2002_hir | Bultinck et al. | C, O, N, H, F |
| B3LYP/6-31G* | MK | Bult2002_mk | Bultinck et al. | C, O, N, H, F |
| B3LYP/6-31G* | MK | Jir2008_mk | Jirouskova et al. | C1, C2, O, N1, N2, H, S1, F, Cl, Br, Zn |
| B3LYP/6-31G* | CHELPG | Bult2002_che | Bultinck et al. | C, O, N, H, F |
| B3LYP/6-31G* | AIM | Bult2004_aim | Bultinck et al. | C, O, N, H, F |
†An element symbol with no further information (e.g., C) means that EEM parameters are available for this element at all possible bond orders. An element symbol followed by a number (e.g., C1) means that EEM parameters are available only for this element bound by a bond of the order given by this number.
‡For this parameter set, C1 represents sp hybridization, C2 sp² hybridization, C3 sp³ hybridization, etc.
Information about freely available software tools enabling EEM charge calculation
| Software | EEM parameters used by the software |
|---|---|
| OpenBabel | It contains the embedded EEM parameter set Bult2002_mpa, which was parameterized for B3LYP/6-31G*/MPA charges. It does not allow any other EEM parameter set to be used |
| Balloon | It contains an embedded EEM parameter set published by Puranen et al. |
| EEM SOLVER | It allows the use of any input EEM parameter sets provided by the user. It does not contain any embedded EEM parameter sets |
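
To make concrete what such tools compute from a parameter set, the following is a minimal sketch of the standard EEM linear system. It is not the implementation of any tool listed above. It assumes that each atom type carries two parameters (called `A` and `B` here), that the parameter set has one constant `kappa`, and that interatomic distances `R_ij` come from 3D coordinates. It then solves `A_i + B_i*q_i + kappa * sum_{j != i} q_j / R_ij = chi` for all atoms, together with the constraint that the charges sum to the total molecular charge. The parameter values and the water-like geometry are hypothetical placeholders chosen only for illustration; they are not taken from any published set.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def eem_charges(atom_types, coords, params, kappa, total_charge=0.0):
    """Return (charges, equalized electronegativity) for one molecule."""
    n = len(atom_types)
    matrix = [[0.0] * (n + 1) for _ in range(n + 1)]
    rhs = [0.0] * (n + 1)
    for i in range(n):
        a_i, b_i = params[atom_types[i]]
        matrix[i][i] = b_i
        for j in range(n):
            if j != i:
                matrix[i][j] = kappa / math.dist(coords[i], coords[j])
        matrix[i][n] = -1.0  # coefficient of the unknown electronegativity chi
        rhs[i] = -a_i
    for j in range(n):
        matrix[n][j] = 1.0  # last row: charges sum to the total charge
    rhs[n] = total_charge
    solution = solve_linear(matrix, rhs)
    return solution[:n], solution[n]


# Hypothetical (A, B) parameters and kappa, for illustration only.
PARAMS = {"O1": (2.9, 1.6), "H1": (2.4, 1.8)}
KAPPA = 0.4
WATER_TYPES = ["O1", "H1", "H1"]
WATER_COORDS = [(0.0, 0.0, 0.0), (0.757, 0.586, 0.0), (-0.757, 0.586, 0.0)]

charges, chi = eem_charges(WATER_TYPES, WATER_COORDS, PARAMS, KAPPA)
print(charges, chi)
```

Because the system has only one more equation than there are atoms, a single linear solve yields all charges for a given conformation, which is why the charges depend on the 3D geometry through the distances.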
Fig. 1 (a) Composition of steps performed within this work and (b) tasks performed during EEM parametrization
Occurrence of atom types in the training set
| Denotation of atom type | Element symbol | Maximal bond order | Number of atoms with this atom type in the training set | Number of molecules containing this atom type in the training set |
|---|---|---|---|---|
| H1 | H | 1 | 57,119 | 4442 |
| C1 | C | 1 | 15,220 | 3447 |
| C2 | C | 2 | 38,097 | 4149 |
| C3 | C | 3 | 345 | 266 |
| N1 | N | 1 | 4151 | 2483 |
| N2 | N | 2 | 3383 | 1879 |
| N3 | N | 3 | 345 | 266 |
| O1 | O | 1 | 5016 | 2525 |
| O2 | O | 2 | 5793 | 3069 |
| F1 | F | 1 | 938 | 395 |
| P1 | P | 1 | 153 | 143 |
| P2 | P | 2 | 251 | 213 |
| S1 | S | 1 | 1034 | 770 |
| S2 | S | 2 | 1391 | 1211 |
| Cl1 | Cl | 1 | 1084 | 676 |
| Br1 | Br | 1 | 336 | 261 |
| I1 | I | 1 | 1734 | 1365 |
| Total | – | – | 136,390 | 4475 |
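
The atom type labels in this table combine the element symbol with the maximal bond order of the atom. As a minimal sketch (not the authors' code), the two count columns could be produced as follows, assuming each molecule is given as a list of `(element, bond_orders)` pairs with integer bond orders. This input representation and the function names are illustrative assumptions, and aromatic or fractional bond orders are not handled.

```python
from collections import Counter


def atom_type(element, bond_orders):
    """Label an atom as element symbol + maximal bond order, e.g. 'C2'."""
    return f"{element}{max(bond_orders)}"


def count_atom_types(molecules):
    """Return (atoms per type, molecules containing each type)."""
    atom_counts = Counter()
    molecule_counts = Counter()
    for atoms in molecules:
        types = [atom_type(element, orders) for element, orders in atoms]
        atom_counts.update(types)           # every atom counts once
        molecule_counts.update(set(types))  # every molecule counts once per type
    return atom_counts, molecule_counts


# Hypothetical toy input: hydrogen cyanide and formaldehyde.
HCN = [("H", [1]), ("C", [1, 3]), ("N", [3])]
FORMALDEHYDE = [("C", [2, 1, 1]), ("O", [2]), ("H", [1]), ("H", [1])]

atom_counts, molecule_counts = count_atom_types([HCN, FORMALDEHYDE])
print(atom_counts)
print(molecule_counts)
```

The per-atom column sums to the total number of atoms, whereas the per-molecule column does not sum to the number of molecules, because one molecule usually contains several atom types.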
Quality criteria of our EEM parameter sets
| EEM parameter set name | Relevant QM charges | R² | RMSD | |
|---|---|---|---|---|
| Cheminf_b3lyp_mpa | B3LYP/6-311G/MPA | 0.9007 | 0.1038 | 0.0727 |
| Cheminf_b3lyp_npa | B3LYP/6-311G/NPA | 0.9651 | 0.0746 | 0.0540 |
| Cheminf_b3lyp_aim | B3LYP/6-311G/AIM | 0.9499 | 0.0785 | 0.0558 |
| Cheminf_hf_mpa | HF/6-311G/MPA | 0.9178 | 0.1125 | 0.0776 |
| Cheminf_hf_npa | HF/6-311G/NPA | 0.9633 | 0.0805 | 0.0574 |
| Cheminf_hf_aim | HF/6-311G/AIM | 0.9441 | 0.0919 | 0.0651 |
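
As a minimal sketch of how two of these quality criteria can be computed, the code below assumes that R² denotes the squared Pearson correlation coefficient and RMSD the root mean square deviation between EEM charges and the corresponding QM charges over the same atoms. The function names and the charge values are hypothetical and serve only as an illustration.

```python
import math


def rmsd(eem, qm):
    """Root mean square deviation between two equally long charge lists."""
    return math.sqrt(sum((e - q) ** 2 for e, q in zip(eem, qm)) / len(qm))


def pearson_r2(eem, qm):
    """Squared Pearson correlation coefficient between two charge lists."""
    n = len(qm)
    mean_e = sum(eem) / n
    mean_q = sum(qm) / n
    cov = sum((e - mean_e) * (q - mean_q) for e, q in zip(eem, qm))
    var_e = sum((e - mean_e) ** 2 for e in eem)
    var_q = sum((q - mean_q) ** 2 for q in qm)
    return cov ** 2 / (var_e * var_q)


# Hypothetical charges for four atoms, for illustration only.
QM_CHARGES = [-0.60, 0.30, 0.25, 0.05]
EEM_CHARGES = [-0.55, 0.28, 0.22, 0.05]

print(pearson_r2(EEM_CHARGES, QM_CHARGES), rmsd(EEM_CHARGES, QM_CHARGES))
```

The two measures are complementary: a parameter set can correlate well with the QM charges while still being systematically offset, which R² alone would not reveal but RMSD would.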
Sizes of the databases used to compare the coverage of EEM parameter sets
| Database | Number of compounds |
|---|---|
| DrugBank | 6874 |
| ChEMBL | 1,456,020 |
| PubChem | 63,676,639 |
| ZINC | 21,957,378 |
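
A minimal sketch of a coverage calculation follows. It assumes that coverage means the fraction of molecules in a database for which the parameter set provides parameters for every atom type, and that a bare element symbol in a parameter set (e.g., C) covers all bond orders of that element. The function names and the four-molecule toy database are hypothetical; the two parameter set compositions are copied from the Bult2002 and Svob2007_cbeg2 rows of the table of published parameter sets.

```python
def element_of(atom_type):
    """Strip the trailing bond-order digits: 'Cl1' -> 'Cl'."""
    return atom_type.rstrip("0123456789")


def is_covered(molecule_types, parameter_types):
    """True if every atom type is parameterized directly or via its element."""
    return all(
        t in parameter_types or element_of(t) in parameter_types
        for t in molecule_types
    )


def coverage(database, parameter_types):
    """Fraction of molecules whose atom types are all parameterized."""
    molecules = list(database)
    if not molecules:
        return 0.0
    covered = sum(is_covered(m, parameter_types) for m in molecules)
    return covered / len(molecules)


BULT2002 = {"C", "O", "N", "H", "F"}
SVOB2007_CBEG2 = {"C1", "C2", "O", "N1", "N2", "H", "S1"}

# Hypothetical toy database: each molecule is the set of its atom types.
TOY_DATABASE = [
    {"C1", "H1", "O1"},
    {"C2", "H1", "Cl1"},
    {"C3", "N3", "H1"},
    {"C1", "H1", "S1"},
]

print(coverage(TOY_DATABASE, BULT2002))
print(coverage(TOY_DATABASE, SVOB2007_CBEG2))
```

Under this definition a single unparameterized atom type excludes the whole molecule, so sets lacking halogens, sulphur, or triple-bonded carbon lose coverage quickly on drug-like databases.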
Summary information about coverage and quality of all tested EEM parameters (see below for meaning of colours)