Tomáš Raček, Jana Pazúriková, Radka Svobodová Vařeková, Stanislav Geidl, Aleš Křenek, Francesco Luca Falginella, Vladimír Horský, Václav Hejret, Jaroslav Koča.
Abstract
BACKGROUND: The concept of partial atomic charges was first applied in physical and organic chemistry and was later also adopted in computational chemistry, bioinformatics and chemoinformatics. The electronegativity equalization method (EEM) is the most frequently used approach for calculating partial atomic charges. EEM is fast and its accuracy is comparable to the quantum mechanical charge calculation method for which it was parameterized. Several EEM parameter sets for various types of molecules and QM charge calculation approaches have been published and new ones are still needed and produced. Methodologies for EEM parameterization have been described in a few articles, but a software tool for EEM parameterization and EEM parameter sets validation has not been available until now.
Keywords: EEM; EEM parameterization; Electronegativity equalization method; Partial atomic charges; wwPDB CCD database
Year: 2016 PMID: 27803746 PMCID: PMC5067907 DOI: 10.1186/s13321-016-0171-1
Source DB: PubMed Journal: J Cheminform ISSN: 1758-2946 Impact factor: 5.514
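The abstract describes EEM as a fast way to obtain partial atomic charges from a parameter set. The following is a minimal sketch of that calculation, assuming the commonly used EEM formulation in which each atom i satisfies A_i + B_i·q_i + κ·Σ_{j≠i} q_j/R_ij = χ̄ and the charges sum to the total molecular charge. The function names, the parameter values, the value of κ and the water-like geometry are all hypothetical and chosen only for illustration; this is not the implementation from the paper.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x

def eem_charges(coords, types, params, kappa, total_charge=0.0):
    """Return (charges, equalized electronegativity) for one molecule.

    params maps an atomic type (e.g. "O1") to a pair (A, B).
    Units of kappa and of the coordinates are assumed to be consistent.
    """
    n = len(coords)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    b = [0.0] * (n + 1)
    for i in range(n):
        a_i, b_i = params[types[i]]
        a[i][i] = b_i
        for j in range(n):
            if j != i:
                a[i][j] = kappa / math.dist(coords[i], coords[j])
        a[i][n] = -1.0  # coefficient of the unknown equalized electronegativity
        b[i] = -a_i
        a[n][i] = 1.0   # last row: charges sum to the total charge
    b[n] = total_charge
    solution = solve_linear(a, b)
    return solution[:n], solution[n]

# Hypothetical water-like example; the (A, B) values are not a published set.
coords = [(0.0, 0.0, 0.0), (0.76, 0.59, 0.0), (-0.76, 0.59, 0.0)]
types = ["O1", "H1", "H1"]
params = {"O1": (2.6, 0.9), "H1": (2.4, 0.8)}
charges, chi = eem_charges(coords, types, params, kappa=0.5)
print(charges, chi)
```

Because the charges come from a single linear solve per molecule, the cost is dominated by the system size (number of atoms plus one), which is consistent with the abstract's remark that EEM is fast compared with QM calculations.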
Fig. 1 Example of quality validation outputs: graphs of correlation between reference charges and EEM charges. Correlation graph for all atoms (a) and correlation graph for the C1 atom type (b)
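The correlation graphs compare reference charges with EEM charges, both for all atoms and for a single atomic type. A minimal sketch of how such a comparison could be quantified is given here, assuming RMSD and the squared Pearson correlation coefficient as the quality measures. The function names and the sample charges are hypothetical and are not taken from the paper or from its software.

```python
import math
from collections import defaultdict

def rmsd(ref, pred):
    """Root mean square deviation between two equally long charge lists."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(ref, pred)) / len(ref))

def pearson_r2(ref, pred):
    """Squared Pearson correlation; NaN when either list has zero variance."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_p = sum(pred) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(ref, pred))
    var_r = sum((r - mean_r) ** 2 for r in ref)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_r == 0.0 or var_p == 0.0:
        return float("nan")
    return cov * cov / (var_r * var_p)

def per_type_metrics(types, ref, pred):
    """Group atoms by atomic type and compute both metrics per group."""
    groups = defaultdict(lambda: ([], []))
    for t, r, p in zip(types, ref, pred):
        groups[t][0].append(r)
        groups[t][1].append(p)
    return {
        t: {"n": len(rs), "rmsd": rmsd(rs, ps), "r2": pearson_r2(rs, ps)}
        for t, (rs, ps) in groups.items()
    }

# Hypothetical reference (QM) and EEM charges for a handful of atoms.
atom_types = ["C1", "C1", "C1", "H1", "H1", "O1"]
qm_charges = [-0.20, -0.15, 0.10, 0.12, 0.14, -0.45]
eem_charges_list = [-0.18, -0.17, 0.08, 0.11, 0.15, -0.40]

print("all atoms:", rmsd(qm_charges, eem_charges_list),
      pearson_r2(qm_charges, eem_charges_list))
print("per type:", per_type_metrics(atom_types, qm_charges, eem_charges_list))
```

Reporting the metrics per atomic type as well as overall mirrors the two panels of the figure: a parameter set can correlate well over all atoms while still performing poorly for one type.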
Description of datasets used in parameterization case study
| Denotation | DTP_small | DTP_large | CCD_gen | CCD_exp |
|---|---|---|---|---|
| Source database | DTP NCI | DTP NCI | wwPDB CCD | wwPDB CCD |
| Number of molecules | 1956 | 4475 | 4443 | 4443 |
| Atomic types (elements and bond orders) | C1, C2, O1, O2, N1, N2, H, S1 | H1, C1, C2, C3, N1, N2, N3, O1, O2, F1, P1, P2, S1, S2, Cl1, Br1, I1 | H1, C1, C2, C3, N1, N2, N3, O1, O2, F1, P2, S1, S2, Cl1, Br1 | H1, C1, C2, C3, N1, N2, N3, O1, O2, F1, P2, S1, S2, Cl1, Br1 |
| Size of molecules | 6-176 atoms | 5-124 atoms | 3-305 atoms | 3-305 atoms |
| Type of molecules | Small organic molecules | Small organic molecules | Small organic and inorganic molecules, organometals, peptides | Small organic and inorganic molecules, organometals, peptides |
| Source of 3D structures | Generated by CORINA | Generated by CORINA | Generated by CORINA | Experimental structures |
| Characterization: variability of atomic types | Low | High | High | High |
| Characterization: variability of molecules | Low | Low | High | High |
| Characterization: variability of structure sources | Low | Low | Low | High |
| Reference to publication | [ | [ | – | – | |
Quality criteria of EEM parameter sets calculated in parameterization comparison case study
Fig. 2 Graph of correlation between QM and EEM charges for the dataset CCD_exp with the LR+RMSD approach (a) and the DE-MIN+RMSD approach (b)
NEEMP performance on a standard personal computer (Intel i7-4790K CPU @ 4.00GHz)
| Dataset | EEM parameterization method | Running time |
|---|---|---|
| DTP_small | LR | 54 m |
| DTP_small | DE-MIN | 14 m |
| DTP_large | LR | 4 h 25 m |
| DTP_large | DE-MIN | 16 m |
| CCD_gen and CCD_exp | LR | 9 h 24 m |
| CCD_gen and CCD_exp | DE-MIN | 25 m |
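The running times reported above can be compared directly by converting them to minutes and taking the ratio of LR to DE-MIN for each dataset. The short sketch below does this arithmetic; the helper names are hypothetical, and the times are copied from the table.

```python
import re

def to_minutes(text):
    """Parse strings such as '54 m' or '4 h 25 m' into whole minutes."""
    hours = re.search(r"(\d+)\s*h", text)
    minutes = re.search(r"(\d+)\s*m", text)
    total = int(hours.group(1)) * 60 if hours else 0
    total += int(minutes.group(1)) if minutes else 0
    return total

# Running times as listed in the performance table.
running_times = {
    "DTP_small": {"LR": "54 m", "DE-MIN": "14 m"},
    "DTP_large": {"LR": "4 h 25 m", "DE-MIN": "16 m"},
    "CCD_gen and CCD_exp": {"LR": "9 h 24 m", "DE-MIN": "25 m"},
}

def lr_to_demin_ratio(times):
    """How many times longer LR ran than DE-MIN on the same dataset."""
    return to_minutes(times["LR"]) / to_minutes(times["DE-MIN"])

for dataset, times in running_times.items():
    print(f"{dataset}: LR took {lr_to_demin_ratio(times):.1f}x as long as DE-MIN")
```

On these figures the gap widens with dataset size: the LR running time grows from under an hour to more than nine hours, while DE-MIN stays under half an hour.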
Fig. 3 Speedup achieved by the parallel version when run on different numbers of CPU cores
Denotations and main quality criteria of EEM parameter sets calculated in parameterization calculation case study
Sizes of the databases used for comparing the coverage of EEM parameter sets
| Database | Number of compounds |
|---|---|
| DrugBank | 7097 |
| wwPDB CCD | 21,741 |
| PubChem | 71,632,601 |
Description of datasets used in quality validation case study
| Designation | CCD_gen_CHNO* | CCD_gen_all* |
|---|---|---|
| Source database | wwPDB CCD | wwPDB CCD |
| Number of molecules | 8144 | 17,769 |
| Atomic types (elements and bond orders) | H1, C1, C2, N1, N2, O1, O2 | H1, C1, C2, C3, N1, N2, N3, O1, O2, F1, P2, S1, S2, Cl1, Br1 |
*All other information about the dataset is the same as for the dataset CCD_gen, described in Table 1
Quality criteria of the EEM parameter sets on the dataset CCD_gen_all
Fig. 4 Graph of correlation between QM and EEM charges for the Cheminf2015_mpa parameter set (a) and the Cheminf2015_npa parameter set (b) on the dataset CCD_gen_all. The graph for Cheminf2015_mpa shows a marked correlation problem for C3 atoms (shown in green); the graph for Cheminf2015_npa shows a slight correlation issue for S2 atoms (shown in brown)