Literature DB >> 34840396

Rényi's divergence as a chemical similarity criterion.

N Flores-Gallegos1.   

Abstract

In this work, a new version of Rényi's divergence is presented. The expression obtained is used as a tool to identify molecules that could share some chemical or structural properties, and a data basis set of 1641 molecules is used in this study. Our results suggest that this new form of Rényi divergence could be a useful tool that will eventually permit complementary studies in which the main goal is to obtain molecules with similar properties.
© The Author(s), under exclusive licence to Springer Nature Switzerland AG 2021.

Entities:  

Keywords:  Chemical reactivity; Chemical similarity; Rényi’s divergence; Rényi’s entropy

Year:  2021        PMID: 34840396      PMCID: PMC8607974          DOI: 10.1007/s10910-021-01307-6

Source DB:  PubMed          Journal:  J Math Chem        ISSN: 0259-9791            Impact factor:   2.413


Introduction

Perhaps, currently in the modern sciences, there are no clear barriers between fields; for example, let us consider two different areas such as biology and computer sciences which in the last decade formed a field called bioinformatics. A similar situation happens with chemistry, in where there are no clear separations between the modern chemistry, physics, mathematics, and computer sciences; in this regard, a new discipline of chemistry called cheminformatics or molecular informatics has emerged in recent years. In this area, one of the objectives is to find molecules that may have some chemical or physical properties of interest. In the last year, in which, for practically the entire population of the world, it was suggested to stay isolated at home, all the scientific community was called to participate in the ‘JEDI Grand Challenge: Billion molecules against COVID-19’. In an announcement by The Joint European Disruptive Initiative that was published on the main page of the journal ‘Molecular Informatics’, one can read, ‘The Joint European Disruptive Initiative has announced a May 1st launch of the Billion Molecules against Covid19 Grand Challenge, with awards of up to € 2 million for the winners.’ for which the main objective is ‘to screen billions of molecules with blocking interactions relevant to SARS-CoV-2 and fast-track the route to a therapeutic treatment.’ It is interesting to note that to address a problem such as the mentioned before, there are two necessary ingredients; the first, of course, is the equations, which must be defined using solid fundamentals, instead of empirical, heuristic, or statistical assumptions, and the second is the algorithms, that is, how can we improve the search of biological, chemical, or physical properties that a group of molecules could have in common to obtain a biological or chemical agent that permits to define of a set of molecules that may act as a good treatment or even a vaccine. In this sense, cheminformatics is an area that has a strong link with QSAR analysis [1]; however, it is not common to find that such analyses are complemented by descriptors based on the formal fundamentals of theoretical chemistry. In this regard, perhaps the first proposal of a similarity coefficient based on electron density was done in the late 1970s by Carbó [2]. On the other hand, according to Willett [3], a cheminformatics analysis has three components, (i) a structure representation, (ii) a weighting scheme, and (iii) a similarity coefficient; in the first case, the analysis consists of performing a comparison of the structure between two molecules, using as a criterion the structural representation of the molecules in 2D or 3D; in the second case, one has to define a fingerprint of several molecules considering the number of the apparition of certain functionals groups, and in the last case, the analysis consists of choosing certain coefficients defined in terms of the number of features present in the compounds to analyze. In this sense, clearly one can define a vast number of such coefficients [4]; for example, in Todeschini’s work, the authors analyzed 51 different coefficients [5], which could be an attractive aspect to perform some studies in the field of medicinal chemistry [6]. However, it is worthwhile to mention that the majority of these coefficients were defined in a heuristic way. In this work, we propose the use of a new definition of Rényi’s divergence that can be used as a tool to find molecules that could have some chemical or structural properties in common. In the following section, we present a brief background of Rényi’s divergence and its mathematical properties.

Theoretical background

Since the publication of Shannon’s work in the 1940s, information theory [7] has been applied in several areas of science and technology. In chemistry, the use of the ideas of information theory started in the 1970s, and these studies were principally focused on the study of atoms and simple systems using Shannon’s entropy in position and momentum spaces; in the 1990s, with the advantages of computers, it was possible to extend these studies to simple molecules and some simple chemical processes. In this line, it was also possible to show that Shannon’s entropy can be related to some important concepts of quantum chemistry, such as the chemical reactivity, electron correlation, and structural stability of the molecules [8-17]. On the other hand, in the 1960s, the idea of generalizing Shannon’s entropy emerged, and today, one can find three generalizations of it: (i) Rényi’s entropy [18] (ii) the Havrda and Charvát entropy [19] and (iii) Tsallis’ entropy [20]. The first two were proposed in the 1960s, while the third one was proposed at the end of the 1980s. Considering a historical point of view, perhaps Rényi’s work can be considered the first attempt to generalize Shannon’s entropy. Since the 1990s, Rényi’s entropy has been profusely used in the field of atomic and molecular physics [21-36], and these works were focused on the study of the electron correlation phenomenon. Nevertheless, we would like to address the attention of the readers to another concept, also proposed by Rényi, which is the divergence, and it can be interpreted as a measure that compares and quantifies the difference between two probabilistic distributions. It is defined asandin which the sets and are probabilistic distributions that fulfill , , and , . In both cases, the term , can take values of , but with the exception that . Here, , and n and m are the sizes of the sets and , respectively. These constants prevent the numerical overestimation of the sums (see “Appendix”). From Eqs. (1) and (2), one can infer easily that ; because of this, we need to ensure that the expression used to calculate the divergence is symmetric; therefore, our proposal for the Rényi’s divergence isIf one applies the logarithm properties to this equation, then it can be reduced towhere ; Eq. (3) and consequently Eq. (4) both fulfill the following criteria [37]: In the field of information theory, it is well known that Rényi’s entropy is a generalization of Shannon’s entropy; in this regard, Rényi’s divergence is a generalization of several divergence measures. For example, if one considers , Eq. (4) becomes,An interesting aspect of this equation is that if one applies the exponential function to obtain , one obtains a discrete version of Carbó’s similarity coefficient, which is related to the overlap of two distributions [38-41]. is a continuous function of and . is permutationally symmetric; that is, the total value of this measure does not change if the elements and are labeled in a different way, which implies that the pairs can permute among themselves. . Based on the previous property, then one can infer that the minimum value of is zero. If we consider , Eq. (3) becomes,which is known as the Bhattacharyya coefficient [42] and has been used extensively in the field of computational sciences as a measure of overlap between two statistical distributions. When , Eq. (3) converges to the Kullback–Leibler entropy [43],Other interesting aspect of Eq. (3) appears when ; in this case, Eq. (3) has the following form,in this equation, one can recognize the terms and as terms associated with the chi-squared distribution, which is a useful tool in statistical analyses. In general, we can say that Eqs. (5), (6), (7) and (8) can be used as measures that permit the comparison of two different probabilistic distributions. In this regard, if one extrapolates these equations to the context of quantum chemistry, then one obtains a set of so-called quantum similarity coefficients. Thus, if one wants to express Eq. (4) in its quantum version, then it is necessary to rewrite it in terms of two diagonalized density matrices, namely, and . These density matrices are related to the quantum states of the systems A and B; therefore, we can write Eq. (4) in terms of the trace of and to obtainwhere the superscript indicates the quantum version of Eq. (4), which can be rewritten in the following compact and simplest form:To apply Eq. (9) to systems such as atoms, molecules, macromolecules, it is necessary to use two diagonalized density matrices, in which each element of their trace is an occupation number; in the theoretical chemistry field, the occupation numbers are subject to , , where N is the electron number. It is convenient to clarify that such numbers are restricted to the values of 0, 1, 2 when they are obtained through mono-determinantal methodologies, such as HF or DFT; moreover, when they are obtained through multideterminantal methodologies as CI or CC, they are subject to , where and . Therefore, to use such numbers with Eq. (10) they have to be normalized as . This new set of numbers is subject to , and , and now the set fulfills all criteria of a probabilistic distribution, which is one of the main ingredients of any information measure. Perhaps the most interesting aspect of Eq. (10) is that it permits comparison of the quantum information of two molecular systems, which are described by two density matrices, and such matrices, according to Dirac, completely describe the state of the quantum systems [44]. In this sense, it is not hard to see that Eq. (10) can represent a quantum superposition; that is, in the context of quantum mechanics, a quantum superposition is considered a sum of states to obtain a new state. For example, if one considers Eq. (9) with , it is evident that this equation is reduced to a sum of logarithms, and each logarithm has an argument for the states, and ; therefore, one can interpret the numerical results obtained with Eq. (10) also as a quantum superposition. On the other hand, in the context of density functional theory, there are three main concepts linked to chemical reactivity, hardness, softness, and chemical potential [45], which are defined aswhere the chemical potential, , is define asand is Mulliken’s electronegativity definition [46]. Equations (11), (12) and (13) can be rewritten in terms of the electron affinity, A, and the ionization potential, I, asandUsing Koopman’s theorem and under the LCAO approximation, the last three equations can be rewritten in terms of the HOMO and LUMO energies asandIn general, Eqs. (17), (18) and (19) are used as powerful tools to study how a molecule can react under different chemical conditions; for this reason, we consider that such equations are important to complement our study. In the following section, we present some numerical results when Eq. (10) is applied to a large set of molecules; our analysis is complemented by the numerical results of Eqs. (17), (18) and (19).

Results

To apply and to show the versatility of Eq. (10), a set of 1641 molecules defined by Blair and Thakkar [47] was chosen. Such molecules are closed-shell molecules, and according to the authors, the molecules contain up to 34 atoms and 246 electrons, which can be used to study the interrelationships among the properties of molecules of organic, biochemical, and pharmaceutical interest. To perform the calculations of this study, we used Eq. (10) with the Löwdin occupation numbers obtained with the CISD method and DGDZVP basis set, while to calculate the hardness, softness and chemical potential in terms of the HOMO and LUMO energies, we used the functional M062x with the DGDZVP basis set. All calculations were performed with Gaussian 09 [48]. For practical proposes, we chose in a random way one molecule of the set of molecules used. The molecule chosen was number 750, tetranitromethane, , which is a highly explosive chemical used as an oxidizer in rocket propellants and is used to increase the cetane number of diesel fuels in the manufacture of liquid explosives. It has been classified as possibly carcinogenic to humans [49]. This molecule has 98 electrons, a softness value of [a.u.], a hardness value of [a.u.] and a chemical potential value of [a.u.]. In Tables 1, 2, 3 and 4, we present the numerical results of Eq. (10) using the values of , and in Figs. 2, 3, 4 and 5 are depicted the molecules that according to the numerical results of Eq. (10) may share some properties. At this point, we consider that it is opportune to mention that we report only the numerical results of the molecules that are similar, instead of the 1641 numerical values of for each value considered in this study; however, we present the general trends of in Fig. 1a–d. For each case, a percentage of difference of among the molecules was chosen because, for example, if one observes Fig. 1a, in which the location of the molecule 750 is highlighted with red lines and follows the horizontal line, one can appreciate that there is a dense section in which several molecules appear. For that reason, we selected the molecules with a difference of 0.0001%, while for with , the difference used was 0.2%, 0.3% and 0.1%. To obtain small sets of molecules, clearly, one can change these percentage values to increase the size of the sets; on the other hand, if one follows the vertical line on Fig. 1a, one observes that there are several isoelectronic molecules, with , but in the majority of the cases, the value of differentiates all these molecules.
Table 1

Numerical results of Eq. (10) with

MoleculeN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{q}R_{\alpha =0}$$\end{document}qRα=0\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η [a.u.]S [a.u.]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ [a.u.]
mol7508010.40801337415330.20730002.4119633\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.098680
mol10418410.40801337415340.12869003.8853057\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.151720
mol11639010.40801337415330.15616503.2017417\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.150825
mol15538210.40801337415340.14727003.3951246\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.126900

The molecule of reference was tetranitromethane with , , , , and . mol750: tetraspiro[2.0.0.0.2.1.1.1]undecane, mol1041: 1,2-furylpyridin-2-one, mol1153: 1H-tetrazol-5-amine, mol1163: 1R-1,2,2-trifluoroethoxybenzene

Table 2

Numerical results of Eq. (10) with

MoleculeN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{q}R_{\alpha =0.5}$$\end{document}qRα=0.5\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η [a.u.]S [a.u.]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ [a.u.]
mol586225.37607557370340.09643505.1848395\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.148575
mol867025.3683766340630.21690002.3052097\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.102550
mol1497025.33167365303110.12286504.0695071\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.177535
mol1537025.3691181093990.22527002.2195587\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.092870
mol2397625.34309028796080.12817503.9009167\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.122995
mol5676825.37662564428750.12688003.9407314\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.143010
mol6067625.34058656657570.13475003.7105751\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.227120
mol7767025.36966986541210.23215002.1537798\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.107770
mol8158825.37049749795440.15184003.2929399\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.183810
mol8387025.33106632758220.15185503.2926146\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.109345
mol8607425.37574836158680.22501502.2220740\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.102315
mol9487425.37503161775020.22544002.2178850\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.116470
mol9497225.37912989854970.14136003.5370684\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.167100
mol9998225.3506795624350.22325502.2395915\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.112635
mol11327025.36965644441620.22472002.2249911\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.112970
mol13357025.33056401259370.13278503.7654855\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.100355
mol14877025.37737913349030.13420003.7257824\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.117240

The molecule of reference was tetranitromethane with , , , , and . Where, mol58: azulene, mol86: bicyclo-3,3,1-nonane, mol149: cinnamaldehyde, mol153: cis-bicyclo-6,1,0-nonane, mol239: difuro[3,2-b:2′,3′-d]furan, mol567: naphthalene, mol607: p-dichlorobenzene, mol776: trans-2,3-dimethylnorborane, mol815: vinylsulfonylbenzene, mol838: 1′H-1,2′-bipyrrole, mol860: 1-butoxybutane, mol948: 1-octanol, mol949: 1-p-tolylethanone, mol999: 1,1,1,2-tetramethoxyethane, mol1132: 1,4-dimethylnorborane, mol1335: 2,1H-pyrrol-2-yl-1H-pyrrole, mol1487: 3-methyl-1H-indole

Table 3

Numerical results of Eq. (10) with

MoleculeN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{q}R_{\alpha =1}$$\end{document}qRα=1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η [a.u.]S [a.u.]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ [a.u.]
mol2049010.5411934644820.16907002.9573549\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.106900
mol5637610.88100270961170.11766504.2493519\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.126665
mol5657610.88329308822760.11930504.1909391\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.127765
mol8158810.58944882465960.15184003.2929399\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.183810
mol8798410.67217056057490.21068002.3732675\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.142970

The molecule of reference was tetranitromethane with , and , , and . mol204: di-t-butyl sulfoxide, mol563: naphthalen-1-amine, mol565: naphthalen-2-amine, mol815: vinylsulfonylbenzene, mol879: 1-chloro-1,3-butadiene

Table 4

Numerical results of Eq. (10) with

MoleculeN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{q}R_{\alpha =2}$$\end{document}qRα=2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η (a.u.)S (a.u)\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ (a.u)
mol242985.366992589608230.21057502.3744509\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.132485
mol546905.776729477915930.14159503.5311981\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.176035
mol879845.915191122339040.12381004.0384460\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.151740
mol1640865.810259742105460.13600503.6763354\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.146185

The molecule of reference was tetranitromethane with , and , , and . mol242: diisobutyl sulfone, mol546: N-pyrazin-2-ylmethanesulfonamide, mol879: 1-chloronaphthalene, mol1640: 9H-purine-2,6,8-trione

Fig. 2

Graphic representation of the molecules: mol750, mol563, mol565, mol815, mol879, used in Table 1

Fig. 3

Graphic representation of the molecules: mol58, mol86, mol149, mol153, mol239, mol567, mol607, mol776, mol815, mol838, mol860, mol948, mol949, mol999, mol1132, mol1335, mol1487, used in Table 2

Fig. 4

Graphic representation of the molecules: mol204, mol1041, mol1153, mol1163, used in Table 3

Fig. 5

Graphic representation of the molecules: mol242, mol546, mol879, mol1640, used in Table 4

Fig. 1

Trends of with

Numerical results of Eq. (10) with The molecule of reference was tetranitromethane with , , , , and . mol750: tetraspiro[2.0.0.0.2.1.1.1]undecane, mol1041: 1,2-furylpyridin-2-one, mol1153: 1H-tetrazol-5-amine, mol1163: 1R-1,2,2-trifluoroethoxybenzene Numerical results of Eq. (10) with The molecule of reference was tetranitromethane with , , , , and . Where, mol58: azulene, mol86: bicyclo-3,3,1-nonane, mol149: cinnamaldehyde, mol153: cis-bicyclo-6,1,0-nonane, mol239: difuro[3,2-b:2′,3′-d]furan, mol567: naphthalene, mol607: p-dichlorobenzene, mol776: trans-2,3-dimethylnorborane, mol815: vinylsulfonylbenzene, mol838: 1′H-1,2′-bipyrrole, mol860: 1-butoxybutane, mol948: 1-octanol, mol949: 1-p-tolylethanone, mol999: 1,1,1,2-tetramethoxyethane, mol1132: 1,4-dimethylnorborane, mol1335: 2,1H-pyrrol-2-yl-1H-pyrrole, mol1487: 3-methyl-1H-indole Numerical results of Eq. (10) with The molecule of reference was tetranitromethane with , and , , and . mol204: di-t-butyl sulfoxide, mol563: naphthalen-1-amine, mol565: naphthalen-2-amine, mol815: vinylsulfonylbenzene, mol879: 1-chloro-1,3-butadiene Numerical results of Eq. (10) with The molecule of reference was tetranitromethane with , and , , and . mol242: diisobutyl sulfone, mol546: N-pyrazin-2-ylmethanesulfonamide, mol879: 1-chloronaphthalene, mol1640: 9H-purine-2,6,8-trione In addition, in Fig. 1a–d, it is also possible to observe the effect of the coefficient on ; in such figures, we noticed that for values of , the general trends have a convex behavior, while for the interval , the trends are concave. Therefore we can say that depending on the value used, we can find more or fewer molecules because some could be or could not be in the same position. This can be inferred easily by comparing the position of the molecule of reference used, which is depicted in the figures mentioned previously. Trends of with Observing the four tables, one can note that depending on the value used, different molecules are found that according to Eq. (10) are informationally equivalent. For example, with (see Table 1 and Fig. 2), the molecules 750, 1041, 1153 and 1163 are equivalent under the perspective of . These molecules have structures constituted mainly by rings, and three of them present aromaticity, while the structure of tetranitromethane is tetrahedral (see Fig. 2). From a chemical reactivity point of view, the molecules 1041, 1153, and 1163 have similar values of softness, hardness, and chemical potential; however, if we compare such values with the values of tetranitromethane, we find that molecule 1163 has the closest value of hardness and softness but that its value of chemical potential differs significantly. Thus, based on the numerical results, we conclude that 1R-1,2,2-trifluoroethoxybenzene is similar to tetranitromethane. Graphic representation of the molecules: mol750, mol563, mol565, mol815, mol879, used in Table 1 In Table 2 are reported the numerical results of Eq. (10) with , and the graphic representation of the molecules of this case is shown in Fig. 3. The first general observation that we note is that with this value of , it is possible to find more molecules in comparison with the previous case, but this case, the numerical values of present a percentage difference of 0.2%. As in the previous case, this set of molecules constitutes mostly rings, but in this case, there are two linear molecules present in the set (see Fig. 3). Now, if we compare the values of softness, hardness, and chemical potential, it is possible to reduce this set of molecules. In this case, we find that only molecules 815 and 838 have similar hardness values [a.u.] and [a.u.] in comparison with the value for tetranitromethane of [a.u.]. A similar situation happens with the hardness; however, the chemical potential of molecules 815 and 838 differs by almost 0.1 a.u with respect to that of tetranitromethane. Graphic representation of the molecules: mol58, mol86, mol149, mol153, mol239, mol567, mol607, mol776, mol815, mol838, mol860, mol948, mol949, mol999, mol1132, mol1335, mol1487, used in Table 2 Graphic representation of the molecules: mol204, mol1041, mol1153, mol1163, used in Table 3 In Tables 3 and 4, we present the results of . Using the values of and 2, as in the first case, these values of produce values of with a small percentage difference. In both sets are present molecules with rings, and all they have some of the most electronegative atoms; nevertheless, in the case of the set with we also find two linear molecules (see Figs. 4 and 5). When a value of is considered, only molecules 204 and 815 have similar hardness and softness values, while the chemical potential is different by more than 50% with respect to tetranitromethane. On the other hand, with the value of , the four molecules differ considerably in all their chemical reactivity parameters with respect to tetranitromethane. Graphic representation of the molecules: mol242, mol546, mol879, mol1640, used in Table 4 Graphic representation of the molecules: mol204 mol815, mol838, mol1163, used in Table 5
Table 5

Selection molecules that could share some chemical properties

MoleculeN\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{q}R_{\alpha }$$\end{document}qRα\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\alpha$$\end{document}α\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\eta$$\end{document}η [a.u.]S [a.u.]\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$\mu$$\end{document}μ [a.u.]
Tetranitromethane980.1658803.0142271\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.28717
mol2049010.54119346448210.1690702.9573549\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.106900
mol8158810.589448824659610.1518403.2929399\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.183810
mol8158825.37049749795440.50.1518403.2929399\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.183810
mol8387025.33106632758220.50.1518553.2926146\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.109345
mol11639010.408013374153300.1561653.2017417\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$-$$\end{document}-0.150825

mol204: di-t-butyl sulfoxide, mol815: vinylsulfonylbenzene, mol838: 1′H-1,2′-bipyrrole, mol1163: 1R-1,2,2-trifluoroethoxybenzene

In Table 5, we summarize our findings of . In this case, it was possible to reduce the set of molecules to four molecules. Based on this table, we note that molecule 204, di-t-butyl sulfoxide, with , has hardness and softness values close to tetranitromethane, which is the molecule of reference. This result may be explained in terms of the structure of both molecules (see Fig. 6). In this figure, one can observe that both molecules are semispherical, while the rest of them have rings and are semiplanar. Another interesting observation of the results presented in Table 5 is that the numerical results of with suggest that molecule 815, vinylsulfonylbenzene, has similar properties to tetranitromethane; however, we also note that while the hardness and softness in several cases match with the molecule of reference, the chemical potential does not.
Fig. 6

Graphic representation of the molecules: mol204 mol815, mol838, mol1163, used in Table 5

Selection molecules that could share some chemical properties mol204: di-t-butyl sulfoxide, mol815: vinylsulfonylbenzene, mol838: 1′H-1,2′-bipyrrole, mol1163: 1R-1,2,2-trifluoroethoxybenzene In conclusion, we can say that the version of Rényi’s divergence presented in this work could be an effective tool used in conjunction with the descriptors of the chemical reactivity to complement studies where the main propose is to find some chemical systems that may share some properties.

Conclusion

In this work, we presented a new version of Rényi’s divergence defined in terms of two diagonalized density matrices. The equation proposed in this work satisfies rigorously all the properties of any informational measure. Our results suggest that Eq. (10) with values of and produces the best sets of molecules that may share some properties. Finally, we consider that our expression of Renyi’s divergence could be a good candidate to complement studies in which the main goal is the search for chemical systems with similar properties or as a tool to classify sets of molecules that may share some physical or chemical properties.
  8 in total

1.  New measure of electron correlation.

Authors:  Alex D Gottlieb; Norbert J Mauser
Journal:  Phys Rev Lett       Date:  2005-09-16       Impact factor: 9.161

2.  Molecular similarity in medicinal chemistry.

Authors:  Gerald Maggiora; Martin Vogt; Dagmar Stumpfe; Jürgen Bajorath
Journal:  J Med Chem       Date:  2013-11-11       Impact factor: 7.446

3.  Similarity coefficients for binary chemoinformatics data: overview and extended comparison using simulated and real data sets.

Authors:  Roberto Todeschini; Viviana Consonni; Hua Xiang; John Holliday; Massimo Buscema; Peter Willett
Journal:  J Chem Inf Model       Date:  2012-11-07       Impact factor: 4.956

4.  The Calculation of Molecular Structural Similarity: Principles and Practice.

Authors:  Peter Willett
Journal:  Mol Inform       Date:  2014-04-29       Impact factor: 3.353

5.  Fisher information and Rényi dimensions: A thermodynamical formalism.

Authors:  B Godó; Á Nagy
Journal:  Chaos       Date:  2016-08       Impact factor: 3.642

6.  Fisher information and Rényi entropies in dynamical systems.

Authors:  B Godó; Á Nagy
Journal:  Chaos       Date:  2017-07       Impact factor: 3.642

7.  SARS-CoV-2 vaccination modelling for safe surgery to save lives: data from an international prospective cohort study.

Authors: 
Journal:  Br J Surg       Date:  2021-09-27       Impact factor: 11.122

8.  Rényi's divergence as a chemical similarity criterion.

Authors:  N Flores-Gallegos
Journal:  J Math Chem       Date:  2021-11-22       Impact factor: 2.413

  8 in total
  1 in total

1.  Rényi's divergence as a chemical similarity criterion.

Authors:  N Flores-Gallegos
Journal:  J Math Chem       Date:  2021-11-22       Impact factor: 2.413

  1 in total

北京卡尤迪生物科技股份有限公司 © 2022-2023.