
A Simple, Robust and Efficient Computational Method for n-Octanol/Water Partition Coefficients of Substituted Aromatic Drugs.

Asrin Bahmani1, Saadi Saaidpour2, Amin Rostami1.   

Abstract

In this paper, multiple linear regression (MLR) was used to build a quantitative structure-property relationship (QSPR) model for the n-octanol/water partition coefficient (logPo/w) of 195 substituted aromatic drugs. Molecular descriptors were calculated for each compound with VLifeMDS. The most relevant descriptors were selected with a genetic algorithm/multiple linear regression (GA/MLR) procedure and used to build the QSPR model. The robustness of the model was characterized by statistical validation and by its applicability domain (AD). The MLR predictions are in good agreement with the experimental values: R2 and Q2LOO are 0.9433 and 0.9341, respectively. The AD of the model was analyzed with a Williams plot. The effects of the selected descriptors are described.


Year:  2017        PMID: 28720783      PMCID: PMC5515958          DOI: 10.1038/s41598-017-05964-z

Source DB:  PubMed          Journal:  Sci Rep        ISSN: 2045-2322            Impact factor:   4.379


Introduction

Lipophilicity is the tendency of a compound to partition into a non-polar organic phase rather than an aqueous phase. The usual quantitative descriptor of lipophilicity is the partition coefficient P of a compound between two immiscible solvents[1]. Traditionally, n-octanol has been used as the non-polar phase and water as the polar phase, and the measured partitioning value is termed logPo/w [2]. n-Octanol is considered a good mimic of phospholipid membranes because it is amphiphilic[3]. Among physicochemical properties, lipophilicity plays a key role in molecular discovery in a variety of domains, including agrochemicals, cosmetics, material sciences, environmental chemistry, food chemistry, and particularly medicinal chemistry[4]. A correct estimate of logPo/w is essential for the discovery and development of efficient therapeutic molecules[5]. Although lipophilicity cannot characterize the whole physicochemical nature of a compound, the properties governing lipophilicity have a basic effect on the actions of organic molecules such as drugs or drug candidates. Many drugs go through a series of partitioning steps: (a) leaving the aqueous extracellular fluids, (b) passing through lipid membranes, and (c) entering other aqueous environments before reaching the receptor. In this sense, a drug undergoes the same partitioning phenomenon as any chemical in a separatory funnel containing water and a non-polar solvent. A compound must therefore have an optimal lipophilicity, because a very lipophilic solute will remain trapped in the membrane[6]. Lipophilicity is one of the main factors influencing the pharmacokinetic behavior of β-blockers, in several ways: (1) oral absorption, (2) penetration into the central nervous system (CNS), (3) renal clearance, (4) degree of biotransformation and plasma half-life, (5) cardioselectivity, and (6) corneal penetration[7, 8].
For example, the most lipophilic β-blockers (such as propranolol) penetrate readily into the CNS and cause central effects (somnolence), whereas the more hydrophilic drugs have low CNS penetration and negligible central effects[8]. The in situ rat gut technique is an informative tool yielding realistic absorption rates. In a 1981 study of 18 sulfonamides, the absorption rate constant ka was correlated with the lipophilicity parameter[9]. Good gastrointestinal (GI) absorption was for many years a problem in the development of penicillins. Yoshimura[10] carried out a systematic study in mice and rats and showed that the two major molecular properties influencing the GI absorption of penicillins are their stability in acidic solutions and their lipophilicity. Corneal penetration is a critical condition for the therapeutic success of ocularly administered drugs such as the β-blockers used as antiglaucoma agents. In 1983, an important study showed that lipophilicity clearly plays a key role in penetration through the intact cornea: in a series of 12 β-blockers, the logPC (permeability coefficient) exhibited a parabolic relation with lipophilicity[11]. For a homogeneous set of phenols, a parabolic relation was found between human skin permeability (Kp) and logPo/w [12]. In 1991, for 11 aromatic acids (model compounds and anti-inflammatory drugs), the binding constant to bovine serum albumin (in logarithmic form) was correlated with a hydrophobic index obtained by RP-HPLC[13]. In another study, the unbound fraction in plasma (fu), taken as the biological response, showed a sigmoidal relation with logPo/w [14]. Interestingly, parabolic relations between protein binding and lipophilicity are also known, consistent with the limited dimensions of some binding sites. When large molecules such as cephalosporins were tested for their association constant (Ka) to human serum albumin, a fair parabolic relation was found with lipophilicity[15].
In one important study, the concentrations of 10 basic drugs in plasma and in 8 non-metabolizing tissues were examined after administration to rabbits. These drugs were weakly basic benzodiazepines and strongly basic neurological drugs. Good linear relations (R2 = 0.92 to 0.97) were found between the tissue-to-plasma concentration ratios of unbound, non-ionized drugs and their logPo/w. The slope of the linear regressions increased in the series: muscle < skin < bone < brain < gut < heart < lung < adipose[16]. In many studies on drug permeation through biological membranes (gut wall, skin, blood-brain barrier, and Caco-2 cell monolayers), relationships between permeation and lipophilicity have been developed with homologous series of compounds of diverse nature (acidic, alkaline and neutral) to investigate the influence of lipophilicity on passive diffusion. For example, sigmoidal relationships were established between permeability coefficients in rat jejunum and logPo/w for seven steroids[17] and 11 β-blockers[18]. Even so, despite the good solubility of most organic compounds in n-octanol and its ease of handling in the laboratory, the experimental determination of logPo/w remains a resource- and time-consuming process. Methods to estimate logPo/w are mainly used in medicinal chemistry and molecular design. Estimation approaches include group and atom contribution methods[19, 20] and quantitative structure-property relationships (QSPR) derived from statistical regressions[21-23]. Group and atom contribution models are usually based on fragments, derived either from atoms or from groups of atoms, which are assigned incremental logPo/w contributions[24]. QSPR models have been developed as alternative strategies for estimating lipophilicity. The assumption behind QSPR for logPo/w is that physicochemical properties can be correlated with molecular structural characteristics (geometric and electronic) expressed in terms of appropriate molecular descriptors[25].
In recent years, enhancements in logPo/w QSPR have been suggested through the use of molecular descriptors derived from semi-empirical molecular orbital (MO) theory (quantum mechanics) calculations[26]. For example, Bodor[27], using AM1 semi-empirical MO theory, reported a standard deviation of 0.306 logPo/w units for an 18-parameter linear correlation developed to estimate lipophilicity for a heterogeneous data set of 302 organic compounds. In 1999, Eisfeld and Maurer[28] proposed a logPo/w correlation with dipole moment, polarizability, electrostatic potential and molar volume as chemical descriptors, based on a heterogeneous set of 202 compounds with a reported standard deviation and maximum absolute error of 0.287, respectively. Yaffe[29], using fuzzy ARTMAP and back-propagation neural network based QSPR, estimated logPo/w for a heterogeneous set of 442 organic compounds. In this work we develop a QSPR model of logPo/w for 195 substituted aromatic drugs. These drugs are very important in medicinal chemistry. Examples include alprazolam, which is mostly used to treat anxiety disorders, panic disorders, and nausea due to chemotherapy; dapsone, which is commonly used in combination with rifampicin and clofazimine for the treatment of leprosy; procaine, a local anesthetic of the amino ester group that is used primarily to reduce the pain of intramuscular injection of penicillin and is also used in dentistry; and warfarin, which can help prevent the formation of future blood clots and reduce the risk of embolism[30]. In this paper, all 195 drugs form a homogeneous set of aromatic drugs.

Computational approach

All calculations were run on a Dell Inspiron N5010 laptop with an Intel® Core™ i7 processor and the Windows 7 operating system. The molecular structures of all compounds were drawn in HyperChem 8.0 (Hypercube, Inc., Gainesville, 2011) and pre-optimized using the MM+ molecular mechanics method (Polak–Ribiere algorithm). The final geometries of the minimum-energy conformations were obtained by a more precise optimization with the semi-empirical PM3 method, applying a root mean square gradient limit of 0.05 kcal·mol−1·Å−1 as the stopping criterion. The molecular descriptors were calculated with VLifeMDS (version 4.4). Descriptors were selected with a GA/MLR procedure in the QSARINS software package (QSAINSubria version 2.2.1 2015), which was also used to perform the MLR.

Data set selection

The logPo/w values of 195 drug compounds were collected from the literature[31]. The molecules cover a wide range of lipophilicity (−2.17 to 6.03). To obtain a validated, and therefore predictive, QSPR model, the available data set should be divided into training and test sets. Commonly, this splitting is performed using random or rational splitting methods[32]. Here the data set was split randomly into a training set of 147 compounds and a prediction set of 48 compounds (see Table 1).
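As an illustration of the random split described above, the following sketch partitions 195 compound indices into 147 training and 48 prediction indices. The function name `random_split` and the seed are assumptions made for this example; the paper does not report how the random assignment was generated, so this will not reproduce the membership shown in Table 1.

```python
import random

def random_split(n_total=195, n_train=147, seed=0):
    """Randomly partition compound indices into training and prediction sets.

    The seed is an arbitrary assumption; the original split is not reproducible
    from the information given in the paper.
    """
    indices = list(range(n_total))
    rng = random.Random(seed)
    rng.shuffle(indices)
    return sorted(indices[:n_train]), sorted(indices[n_train:])

train_idx, test_idx = random_split()
```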
Table 1

Experimental logPo/w, predicted logPo/w and residual values for the training and test sets of aromatic drugs for the MLR model.

No | Name | Experimental logPo/w | Predicted logPo/w | Residual
Training set
1 | 2-Aminobenzoic acid | 1.26 | 1.1309 | 0.1291
2 | 3,5-Dichlorophenol | 3.63 | 3.6918 | −0.0618
3 | 3-Aminobenzoic acid | 0.34 | 0.399 | −0.059
4 | 3-Bromoquinoline | 2.91 | 2.8631 | 0.0469
5 | 4-Aminobenzoic acid | 0.86 | 0.5373 | 0.3227
6 | 4-Butoxyphenol | 2.87 | 2.7491 | 0.1209
7 | 4-Chlorophenol | 2.45 | 2.488 | −0.038
8 | 4-Ethoxyphenol | 1.81 | 1.942 | −0.132
9 | 4-Iodophenol | 2.9 | 2.7765 | 0.1235
10 | 4-Methoxyphenol | 1.41 | 1.3653 | 0.0447
11 | 4-Pentoxyphenol | 3.26 | 3.1021 | 0.1579
12 | 4-Phenylbutylamine | 2.39 | 2.3327 | 0.0573
13 | 4-Propoxyphenol | 2.31 | 2.8643 | −0.5543
14 | 5-Phenylvaleric acid | 2.92 | 2.6447 | 0.2753
15 | Acebutolol | 2.02 | 1.8328 | 0.1872
16 | Acetaminophen | 0.34 | 0.6683 | −0.3283
17 | Acetophenone | 1.58 | 1.4477 | 0.1323
18 | Acetylsalicylic acid | 0.9 | 0.9666 | −0.0666
19 | Alprazolam | 2.61 | 3.0152 | −0.4052
20 | Alprenolol | 2.99 | 2.5599 | 0.4301
21 | Aminopyrine | 0.85 | 1.0384 | −0.1884
22 | Amitriptyline | 4.62 | 4.9183 | −0.2983
23 | Amlodipine | 3.74 | 3.3935 | 0.3465
24 | Ampicillin | −2.17 | −2.0385 | −0.1315
25 | Atenolol | 0.22 | 0.1532 | 0.0668
26 | Atropine | 1.89 | 1.4201 | 0.4699
27 | Benzoic acid | 1.96 | 2.1432 | −0.1832
28 | Bifonazole | 4.77 | 4.9596 | −0.1896
29 | Bisoprolol | 2.15 | 2.0414 | 0.1086
30 | Bromazepam | 1.65 | 2.2939 | −0.6439
31 | Bumetanide | 4.06 | 4.5235 | −0.4635
32 | Bupropion | 3.21 | 3.436 | −0.226
33 | Carazolol | 3.73 | 3.6693 | 0.0607
34 | Carbamazepine | 2.45 | 3.0449 | −0.5949
35 | Cefadroxil | −0.09 | −0.3343 | 0.2443
36 | Cefalexin | 0.65 | 0.5127 | 0.1373
37 | Celiprolol | 1.92 | 2.0377 | −0.1177
38 | Chlorambucil | 3.7 | 3.2156 | 0.4844
39 | Chloramphenicol | 1.14 | 0.8834 | 0.2566
40 | Chlorothiazide | −0.24 | −0.0353 | −0.2047
41 | Chlorpheniramine | 3.39 | 3.9023 | −0.5123
42 | Chlorpromazine | 5.4 | 5.4701 | −0.0701
43 | Chlorprothixene | 6.03 | 5.3408 | 0.6892
44 | Chlorsulfuron | 1.79 | 1.4552 | 0.3348
45 | Chlortalidone | −0.74 | −0.1934 | −0.5466
46 | Ciprofloxacin | −1.08 | −1.5556 | 0.4756
47 | Clofibrate | 3.65 | 3.5281 | 0.1219
48 | Clonazepam | 3.02 | 2.8587 | 0.1613
49 | Clonidine | 1.57 | 2.2257 | −0.6557
50 | Clotrimazole | 5.2 | 5.0106 | 0.1894
51 | Clozapine | 4.1 | 4.0854 | 0.0146
52 | Cocaine | 3.01 | 2.2712 | 0.7388
53 | Codeine | 1.19 | 1.2284 | −0.0384
54 | Coumarin | 1.39 | 1.3826 | 0.0074
55 | Debrisoquine | 0.85 | 1.1733 | −0.3233
56 | Desipramine | 3.79 | 4.173 | −0.383
57 | Diacetylmorphine | 1.59 | 1.6449 | −0.0549
58 | Diclofenac | 4.51 | 4.7773 | −0.2673
59 | Diethylstilbestrol | 5.07 | 5.5014 | −0.4314
60 | Diltiazem | 2.89 | 2.6989 | 0.1911
61 | Diphenhydramine | 3.18 | 3.128 | 0.052
62 | Doxorubicin | 0.65 | 0.8555 | −0.2055
63 | Enalaprilat | −0.13 | 1.1457 | −1.2757
64 | Fenpropimorph | 4.93 | 4.9856 | −0.0556
65 | Fluconazole | 0.5 | −0.1396 | 0.6396
66 | Flufenamic acid | 5.56 | 5.1055 | 0.4545
67 | Flumazenil | 1.64 | 1.0018 | 0.6382
68 | Flumequine | 1.72 | 1.7723 | −0.0523
69 | Furosemide | 2.56 | 2.2861 | 0.2739
70 | Griseofulvin | 2.18 | 2.2831 | −0.1031
71 | Heptastigmine | 4.82 | 4.6349 | 0.1851
72 | Hydrochlorothiazide | −0.03 | −0.309 | 0.279
73 | Hydroflumethiazide | 0.54 | 0.4884 | 0.0516
74 | Hydroxyzine | 3.55 | 3.422 | 0.128
75 | Ibuprofen | 4.13 | 3.75 | 0.38
76 | Imazaquin | 1.86 | 1.4923 | 0.3677
77 | Imipramine | 4.39 | 4.3287 | 0.0613
78 | Indomethacin | 3.51 | 4.3134 | −0.8034
79 | Ketoconazole | 4.34 | 4.2547 | 0.0853
80 | Labetalol | 1.33 | 2.3242 | −0.9942
81 | Lidocaine | 2.44 | 2.6036 | −0.1636
82 | Lormetazepam | 2.72 | 3.1982 | −0.4782
83 | Mefluidide | 2.02 | 2.0636 | −0.0436
84 | Meloxicam | 3.43 | 3.411 | 0.019
85 | Melphalan | −0.52 | −0.1399 | −0.3801
86 | Methotrexate | 0.54 | 0.5184 | 0.0216
87 | Methysergide | 1.95 | 2.0114 | −0.0614
88 | Metipranolol | 2.81 | 2.4265 | 0.3835
89 | Metoclopramide | 2.34 | 1.9124 | 0.4276
90 | Metoprolol | 1.95 | 1.7498 | 0.2002
91 | Nadolol | 0.85 | 1.0663 | −0.2163
92 | Naproxen | 3.24 | 3.6225 | −0.3825
93 | Nifedipine | 3.17 | 2.8894 | 0.2806
94 | Niflumic acid | 3.88 | 3.2672 | 0.6128
95 | Nitrendipine | 3.59 | 3.2033 | 0.3867
96 | N-Methylaniline | 1.65 | 1.6284 | 0.0216
97 | Norcodeine | 0.69 | 0.8584 | −0.1684
98 | Nordiazepam | 3.15 | 2.9419 | 0.2081
99 | Normorphine | −0.17 | 0.2632 | −0.4332
100 | Nortriptyline | 4.39 | 4.2362 | 0.1538
101 | Ofloxacin | −0.41 | −0.1945 | −0.2155
102 | Omeprazole | 1.8 | 1.7495 | 0.0505
103 | Oxprenolol | 2.51 | 2.213 | 0.297
104 | Papaverine | 2.95 | 3.6619 | −0.7119
105 | Penbutolol | 4.62 | 4.3191 | 0.3009
106 | Penicillin V | 2.09 | 1.465 | 0.625
107 | Pentachlorophenol | 5.12 | 4.9701 | 0.1499
108 | Pentamidine | 2.08 | 2.4219 | −0.3419
109 | Pericyazine | 3.65 | 4.1045 | −0.4545
110 | Phenazopyridine | 3.31 | 2.9295 | 0.3805
111 | Phenobarbital | 1.53 | 1.5003 | 0.0297
112 | Phenol | 1.48 | 1.2669 | 0.2131
113 | Phe-Phe-Phe | 0.02 | 0.6718 | −0.6518
114 | Prazosin | 2.16 | 1.8179 | 0.3421
115 | Primaquine | 3 | 3.4409 | −0.4409
116 | Probenecid | 3.7 | 3.0608 | 0.6392
117 | Procainamide | 1.23 | 1.2642 | −0.0342
118 | Procaine | 2.14 | 2.1052 | 0.0348
119 | Promethazine | 4.05 | 4.5525 | −0.5025
120 | Proquazone | 3.13 | 3.8239 | −0.6939
121 | Quinidine | 3.44 | 3.0699 | 0.3701
122 | Quinine | 3.5 | 2.7869 | 0.7131
123 | Quinmerac | 0.78 | 0.9345 | −0.1545
124 | Quinoline | 2.15 | 2.062 | 0.088
125 | Rufinamide | 0.9 | 0.4976 | 0.4024
126 | Salicylic acid | 2.19 | 2.0417 | 0.1483
127 | Serotonin | 0.53 | 1.0892 | −0.5592
128 | Sotalol | −0.47 | −0.0212 | −0.4488
129 | Sulfadiazine | −0.12 | −0.1382 | 0.0182
130 | Sulfinpyrazone | 2.32 | 2.537 | −0.217
131 | Sulindac | 3.6 | 3.038 | 0.562
132 | Tacrine | 3.32 | 2.8079 | 0.5121
133 | Terazosin | 2.29 | 2.332 | −0.042
134 | Terbutaline | −0.08 | 0.2173 | −0.2973
135 | Terfenadine | 5.52 | 5.3235 | 0.1965
136 | Tetracaine | 3.51 | 3.7148 | −0.2048
137 | Thiabendazole | 1.94 | 1.3245 | 0.6155
138 | Thiamphenicol | −0.27 | −0.5873 | 0.3173
139 | Tralkoxydim | 4.46 | 4.4558 | 0.0042
140 | Trazodone | 1.66 | 2.3977 | −0.7377
141 | Trimethoprim | 0.83 | 1.4642 | −0.6342
142 | Trovafloxacin | 0.15 | −0.3398 | 0.4898
143 | Trp-Phe | −0.28 | 0.2391 | −0.5191
144 | Trp-Trp | −0.1 | −0.018 | −0.082
145 | Tryptophan | −0.77 | −0.2481 | −0.5219
146 | Verapamil | 4.33 | 3.8853 | 0.4447
147 | Warfarin | 3.54 | 2.4709 | 1.0691
Test set
148 | 1-Benzylimidazole | 1.6 | 1.248 | 0.352
149 | 2,4-Dichlorophenoxy acetic acid | 2.78 | 2.9783 | −0.1983
150 | 3,4-Dichlorophenol | 3.39 | 3.8638 | −0.4738
151 | 3-Chlorophenol | 2.57 | 2.5277 | 0.0423
152 | Amoxicillin | −1.71 | −1.7229 | 0.0129
153 | Antipyrine (phenazone) | 0.56 | 0.4371 | 0.1229
154 | Bentazone | 2.83 | 1.7299 | 1.1001
155 | Benzocaine | 1.89 | 1.9062 | −0.0162
156 | Carvedilol | 4.14 | 3.4016 | 0.7384
157 | Cromolyn | 1.95 | 1.7931 | 0.1569
158 | Dapsone | 0.94 | 0.9417 | −0.0017
159 | Diflunisal | 4.32 | 3.9003 | 0.4197
160 | Disopyramide | 2.37 | 2.7188 | −0.3488
161 | Ephedrine | 1.13 | 0.6715 | 0.4585
162 | Ergonovine | 1.67 | 1.8769 | −0.2069
163 | Flamprop | 3.09 | 3.117 | −0.027
164 | Flurbiprofen | 3.99 | 3.9066 | 0.0834
165 | Fluvastatin | 4.17 | 4.302 | −0.132

Computational methods

Descriptor generation

Molecular descriptors are generated from molecular structures. Although different descriptors require different processing steps, many steps are common to these procedures. Molecular descriptors are powerful tools for approximating selected properties of chemical structures in an easy-to-handle form that allows efficient comparison and selection of compounds possessing the required chemical, structural, pharmacological or biological features. In this study, molecular descriptors were calculated for each compound with VLifeMDS on the minimum-energy conformations. VLifeMDS calculates about 500 different molecular descriptors from the following categories: topological, electronic, electrostatic, E-state, information theory based, physicochemical and semi-empirical.

Descriptor selection

After descriptor generation, a pool of molecules with their corresponding descriptors is available for model calculation. Only a limited number of modeling descriptors, related to the studied response, must be selected from this pool. Descriptor selection is the process of selecting a subset of relevant variables for use in model construction. In QSARINS this is done using a GA/MLR procedure. This technique explores a broad range of solutions, searching for the best ones by maximizing or minimizing a selected fitness function. It mimics natural selection, in which the best solutions replace the worse-performing ones; in biological terms, the best genes in the population displace the less fit. In our case, every descriptor represents a gene, and a set of descriptors represents a chromosome. The fitness of a chromosome is related to the performance of the corresponding model. Starting from a pool of chromosomes, small subsets of chromosomes are picked at random, and the best of each become parents. Pairs of parent chromosomes are then crossed at a random position (crossing-over) to give offspring whose chromosomes are combinations of the parent ones. If one or more of the new chromosomes outperform the least fit members of the parent population, they replace them. Repeating this procedure many times, and also introducing random mutations (descriptor substitutions) in the chromosomes, yields a final population of models that perform better than those introduced at the beginning. To avoid a completely random start of the GA, QSARINS uses the best set of descriptors extracted from the all-subset process as the core of the chromosomes of the initial population. In QSARINS, the GA can be tuned by changing the population size, the mutation rate, and the number of generations.
A fundamental option is the choice of the fitness function used by the GA. In this work, the leave-one-out cross-validated coefficient (Q2 LOO) was used as the fitness function throughout the GA process. The GA selection is stopped when increasing the model size no longer improves the Q2 value significantly. Used as a fitness function, Q2 LOO helps to select models that fit well with the minimum number of descriptors. It is essential to note, however, that pure fitting criteria provide no information on the predictive ability of the models. For this reason, Q2 LOO is proposed as the fitness function for the selection of predictive models[33]. The main parameters of the GA process were set as follows: population size 100, maximum number of descriptors allowed in a model 10, and reproduction/mutation trade-off 0.5. Finally, we obtained a 10-descriptor subset that keeps most of the interpretive information for logPo/w. These ten descriptors were calculated for each compound in the data set. The selected descriptors are: SKMostHydrophobic Area, SAHydrophobic Area, SKAverage, XKAverage Hydrophobicity, PSA, Average Potential, Polar Surface Area Excluding P & S, 4Path Count, ChiV6chain and AlphaR.
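The GA/MLR idea described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the QSARINS implementation: it assumes a fixed subset size, a steady-state replacement scheme and tournament groups of three, and the names `q2_loo` and `ga_select`, the synthetic data and all parameter values are hypothetical. The fitness is the leave-one-out Q2 of an ordinary least squares model, computed with the standard hat-matrix shortcut for leave-one-out residuals.

```python
import numpy as np

def q2_loo(X, y):
    """Leave-one-out Q2 of an OLS model with intercept (hat-matrix shortcut)."""
    A = np.column_stack([np.ones(len(y)), X])
    H = A @ np.linalg.pinv(A.T @ A) @ A.T
    loo_resid = (y - H @ y) / (1.0 - np.diag(H))  # residual of each left-out point
    return 1.0 - np.sum(loo_resid ** 2) / np.sum((y - y.mean()) ** 2)

def ga_select(X, y, n_desc=3, pop_size=30, n_gen=200, p_mut=0.5, seed=0):
    """Steady-state GA over descriptor subsets of fixed size n_desc (n_desc >= 2)."""
    rng = np.random.default_rng(seed)
    n_pool = X.shape[1]

    def fitness(chrom):
        return q2_loo(X[:, list(chrom)], y)

    # Chromosome = sorted tuple of descriptor (gene) indices.
    pop = [tuple(sorted(int(g) for g in rng.choice(n_pool, n_desc, replace=False)))
           for _ in range(pop_size)]
    scores = [fitness(c) for c in pop]

    for _ in range(n_gen):
        # Pick two small random groups; the best member of each becomes a parent.
        parents = []
        for _ in range(2):
            group = rng.choice(pop_size, 3, replace=False)
            parents.append(pop[max(group, key=lambda i: scores[i])])
        # Crossing-over at a random position.
        cut = int(rng.integers(1, n_desc))
        genes = list(parents[0][:cut]) + list(parents[1][cut:])
        # Mutation = substitution of one descriptor.
        if rng.random() < p_mut:
            genes[int(rng.integers(n_desc))] = int(rng.integers(n_pool))
        # Repair duplicated genes so the subset keeps n_desc distinct descriptors.
        genes = list(dict.fromkeys(genes))
        while len(genes) < n_desc:
            g = int(rng.integers(n_pool))
            if g not in genes:
                genes.append(g)
        child = tuple(sorted(genes))
        if child in pop:
            continue
        # The offspring replaces the least fit chromosome only if it performs better.
        child_score = fitness(child)
        worst = int(np.argmin(scores))
        if child_score > scores[worst]:
            pop[worst], scores[worst] = child, child_score

    best = int(np.argmax(scores))
    return pop[best], scores[best]

# Synthetic demonstration: 12 candidate descriptors, 3 of which drive the response.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 12))
y_demo = (2.0 * X_demo[:, 0] - 1.5 * X_demo[:, 4] + 0.8 * X_demo[:, 7]
          + rng.normal(scale=0.2, size=60))
best_subset, best_q2 = ga_select(X_demo, y_demo)
```

Because a chromosome is replaced only by a better-scoring offspring, the best Q2 in the population never decreases from one generation to the next; whether the truly relevant descriptors are recovered depends on the seed and the number of generations.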

Modeling method in QSARINS

The data sets used in QSPR analysis are, as previously mentioned, composed of descriptors that should be correlated with the corresponding experimental responses. At this step it is necessary to apply a quantitative method able to find the relationship between a limited number of structural descriptors and the modeled response. QSARINS uses the MLR approach, which can be written as:

$$y_i = b_0 + \sum_{j=1}^{m} b_j x_{ij} + e_i \quad (2)$$

where a linear relationship is computed between the studied responses (yi) and the selected values of the descriptors (xij), and ei is the random error (also called the model residual). The intercept (b0) and the coefficients (bj) are to be estimated. Equation (2) can be rewritten in a more compact form using matrix notation:

$$\mathbf{y} = \mathbf{X}\mathbf{b} + \mathbf{e} \quad (3)$$

where y is the vector of responses, b is the vector of coefficients and e is the vector of errors. X is the matrix of the model, whose columns are the descriptors. In this software, the vector of coefficients is estimated with the ordinary least squares (OLS) technique:

$$\hat{\mathbf{b}} = (\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y} \quad (4)$$

where $\hat{\mathbf{b}}$ is the vector that estimates the vector b of the coefficients, X^T is the transposed X matrix and −1 denotes the matrix inverse. OLS minimizes the sum of squares of the differences between the experimental responses and those calculated by the model. To work correctly, OLS assumes that: (1) a linear relationship exists between the descriptors and the response, (2) the response errors are independent and identically distributed, (3) the descriptors are not too strongly correlated with one another, and (4) there are more compounds than modeling descriptors (a ratio that should always be higher than 5:1). Once the coefficients of the model are calculated, the vector of calculated responses $\hat{\mathbf{y}}$ is obtained as:

$$\hat{\mathbf{y}} = \mathbf{X}\hat{\mathbf{b}} = \mathbf{X}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{X}^{T}\mathbf{y} = \mathbf{H}\mathbf{y} \quad (5)$$

where H is the leverage (or hat) matrix that relates the calculated and the experimental responses. The diagonal elements of the hat matrix (h_ii) can be used to determine the distance of the ith object from the centre of the chemical space of the model[34, 35], and thus to check the structural applicability domain (AD) of the model.
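The OLS estimate and the hat matrix described above can be illustrated with a small NumPy sketch on synthetic data. The function name `ols_fit` and the data are hypothetical, and the intercept is handled by adding a column of ones to the descriptor matrix, which is an assumption about how the model matrix is built.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients (intercept first) and the hat matrix H."""
    A = np.column_stack([np.ones(X.shape[0]), X])  # model matrix with intercept column
    XtX_inv = np.linalg.inv(A.T @ A)
    b_hat = XtX_inv @ A.T @ y                      # b = (X^T X)^-1 X^T y
    H = A @ XtX_inv @ A.T                          # y_hat = H y
    return b_hat, H

# Synthetic example: 20 compounds, 3 descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = 1.0 + X @ np.array([0.5, -2.0, 1.5]) + rng.normal(scale=0.1, size=20)

b_hat, H = ols_fit(X, y)
y_hat = H @ y                 # calculated responses
leverage = np.diag(H)         # diagonal elements used for the applicability domain
```

Two properties are convenient sanity checks: the hat matrix is idempotent (H·H = H), and its trace equals the number of estimated parameters (here 3 coefficients plus the intercept).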

Model evaluation

Evaluation of a QSPR model is a very important step, and the goodness-of-fit is central to it. The goodness-of-fit of the models is quantified by the squared correlation coefficient (R2), the adjusted squared correlation coefficient (R2 adj), the standard error of the regression (s) and the Fisher ratio for regression (F). R2 is defined as:

$$R^2 = 1 - \frac{RSS}{TSS} \quad (6)$$

where RSS is the residual sum of squares and TSS is the total sum of squares. Adjusted R2 penalizes possible overfitting of a model and is therefore useful for selecting models that fit well with the minimum number of descriptors. Adjusted R2 is defined as:

$$R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - m - 1} \quad (7)$$

where n is the number of members of the training set and m is the number of descriptors included in the model. The adjusted R2 is a better measure than R2 of the proportion of variance in the data explained by the correlation. The standard error indicates the degree of dispersion of the random error. The F ratio in regression is defined as the ratio of the variance explained by the model to the residual variance. The larger R2, R2 adj and F, and the smaller s, the better the fitting ability of the model.
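As a worked check of these definitions, the sketch below recomputes the adjusted R2 and the F ratio from the values reported in the paper (R2 = 0.9433, 147 training compounds, 10 descriptors). The function names are hypothetical, and the F ratio is written in its usual R2-based form, (R2/m) / ((1 − R2)/(n − m − 1)), which is assumed to be the definition the authors used.

```python
def r2_adjusted(r2, n, m):
    """Adjusted R2 for n training compounds and m descriptors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - m - 1)

def f_ratio(r2, n, m):
    """Explained variance per descriptor over residual variance per degree of freedom."""
    return (r2 / m) / ((1.0 - r2) / (n - m - 1))

r2_adj = r2_adjusted(0.9433, n=147, m=10)  # about 0.9391
f_val = f_ratio(0.9433, n=147, m=10)       # about 226.3
```

The recomputed adjusted R2 agrees with the reported 0.9391, and the F value is within rounding of the reported 226.3247 (the small gap is consistent with R2 being rounded to four decimals).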

Model validation

Model calculation and evaluation are the basic steps in QSPR analysis, but they are not sufficient to guarantee the validity of the model. Validation is fundamental to ensure the reliability of the data predicted by the models, so both internal and external validation are considered necessary[35]. Internal validation analyzes the influence of each individual object that contributes to the final equation, using leave-one-out (LOO) cross-validation. This process was carried out on the training set, and Q2 LOO was calculated as:

$$Q^2_{LOO} = 1 - \frac{PRESS}{TSS} \quad (8)$$

where TSS is the total sum of squares, that is, the sum of squared deviations from the data set mean, and PRESS is the sum of squares of the prediction errors. The larger Q2 LOO, the greater the predictive ability of the model. However, perturbing only one compound at a time is a very weak test of real model robustness. QSARINS therefore also includes the stronger leave-many-out (LMO) technique, which studies the behavior of the model when a larger number of compounds is eliminated. LMO is used to counteract the slight overoptimism of LOO cross-validation. The model under analysis can be considered stable if the R2 and Q2 values calculated in every LMO iteration, and their averages (R2 LMO and Q2 LMO), are close to the R2 and Q2 LOO values of the model[36]. To show that the model is not the result of chance correlation, the Y-scrambling procedure can be applied. In this process, the responses are shuffled at random, so that no correlation between them and the descriptors should exist. As a consequence, the performance of the corresponding scrambled models should decrease drastically. If the original model under validation is good, the values of R2 and Q2 of every iteration, and their averages (R2 yscr and Q2 LOO-yscr), must be much smaller than the values of the original model.
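The leave-one-out and Y-scrambling procedures described above can be sketched as follows. This is an illustrative NumPy version on synthetic data, not the QSARINS code; the function names, the number of scrambling iterations and the random seeds are assumptions.

```python
import numpy as np

def fit_predict(X_train, y_train, X_new):
    """Fit OLS with intercept on the training data and predict X_new."""
    A = np.column_stack([np.ones(len(y_train)), X_train])
    b, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(X_new.shape[0]), X_new]) @ b

def q2_loo(X, y):
    """Q2 LOO = 1 - PRESS/TSS, refitting the model with each compound left out."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        pred = fit_predict(X[keep], y[keep], X[i:i + 1])[0]
        press += (y[i] - pred) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def q2_yscrambled(X, y, n_iter=50, seed=0):
    """Average Q2 LOO over models rebuilt on randomly shuffled responses."""
    rng = np.random.default_rng(seed)
    return float(np.mean([q2_loo(X, rng.permutation(y)) for _ in range(n_iter)]))

# Synthetic data with a real descriptor-response relationship.
rng = np.random.default_rng(42)
X = rng.normal(size=(40, 3))
y = 1.0 + X @ np.array([1.5, -2.0, 0.7]) + rng.normal(scale=0.2, size=40)

q2 = q2_loo(X, y)
q2_scr = q2_yscrambled(X, y)
```

For a model that captures a real relationship, `q2` should be high while `q2_scr` collapses towards zero or below, which is the pattern the Y-scrambling test looks for.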
If Q2 LOO-yscr < 0.2 and R2 yscr < 0.2, there is no risk of chance correlation in the developed model. External validation is also necessary; it checks the ability of the model to predict new compounds. This is done by applying the model equation, obtained on the training set, to one or more prediction data sets, that is, to excluded compounds that were never used in model calculation, and measuring the performance by means of different criteria, such as RMSE[37], Q2 F1 [38], Q2 F2 [39], Q2 F3 [40], CCC[41] and Q2 EXT [42]. The external Q2 F1 for the test set is determined with the following equation:

$$Q^2_{F1} = 1 - \frac{\sum_{i=1}^{n_{EXT}} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n_{EXT}} (y_i - \bar{y}_{TR})^2} = 1 - \frac{PRESS}{TSS_{EXT}(\bar{y}_{TR})} \quad (9)$$

where $\bar{y}_{TR}$ is the response mean of the training set, PRESS is the predictive sum of squares, and $TSS_{EXT}(\bar{y}_{TR})$ is the total sum of squares of the external set calculated by means of the training set mean. Consequently, this formula gives valid values when the test set spans the whole response domain of the model, because in this case the test set mean approaches the training set mean. Q2 F2 is defined as:

$$Q^2_{F2} = 1 - \frac{\sum_{i=1}^{n_{EXT}} (\hat{y}_i - y_i)^2}{\sum_{i=1}^{n_{EXT}} (y_i - \bar{y}_{EXT})^2} = 1 - \frac{PRESS}{TSS_{EXT}(\bar{y}_{EXT})} \quad (10)$$

where $\bar{y}_{EXT}$ is the response mean of the external test set and $TSS_{EXT}(\bar{y}_{EXT})$ is the total sum of squares of the external set calculated by means of the external set mean. The function Q2 F2 does not account for information about the reference model, because it encodes information derived from the external set, and this information changes continuously with the objects belonging to the external set. Q2 F3 is defined as:

$$Q^2_{F3} = 1 - \frac{\left[\sum_{i=1}^{n_{EXT}} (\hat{y}_i - y_i)^2\right] / n_{EXT}}{\left[\sum_{i=1}^{n_{TR}} (y_i - \bar{y}_{TR})^2\right] / n_{TR}} = 1 - \frac{PRESS / n_{EXT}}{TSS / n_{TR}} \quad (11)$$

where TSS is the total sum of squares of the training set, nEXT is the number of test set compounds and nTR is the number of training set compounds. The expression for Q2 F3 reduces to the expression for Q2 LOO when the training and test sets coincide (nEXT = nTR), or, in other words, when all available data are used both for fitting and for assessing model predictive ability. CCC is the concordance correlation coefficient.
It is well suited to measure the agreement between experimental and predicted data, which should be the real aim of any predictive QSPR model:

$$CCC = \frac{2\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2 + \sum_{i=1}^{n} (y_i - \bar{y})^2 + n(\bar{x} - \bar{y})^2} \quad (12)$$

where xi and yi are the abscissa and ordinate values of the plot of the experimental values versus the values calculated using the model, n is the number of chemicals, and $\bar{x}$ and $\bar{y}$ are the averages of the abscissa and ordinate values, respectively. This coefficient measures both precision (how far the observations are from the fitting line) and accuracy (how far the regression line deviates from the slope-1 line passing through the origin, the concordance line); consequently, any divergence of the regression line from the concordance line gives a CCC value smaller than 1. A basic property of a function for assessing model fit from external evaluation data is that the external observations are independent of each other. This means that the Q2 value derived from the whole external data set, Q2 EXT, and the average of the Q2 values obtained by taking each external object separately, one at a time, should coincide. The optimized model was applied to predict the logPo/w values of the 48 drugs in the prediction set, which were not used in the optimization procedure. The predictive ability of a model on the external validation set can be expressed by Q2 EXT:

$$Q^2_{EXT} = \frac{1}{n_{EXT}} \sum_{i=1}^{n_{EXT}} Q^2_i \quad (13)$$

where Q2 i is the external Q2 calculated taking into account only the ith object of the test set and nEXT is the total number of external objects. An additional measure of the accuracy of the proposed QSPR is the RMSE (root mean squared error), which summarizes the overall error of the model:

$$RMSE = \sqrt{\frac{\sum_{i=1}^{n_{EXT}} (\hat{y}_i - y_i)^2}{n_{EXT}}} \quad (14)$$

where $\hat{y}_i$ is the predicted value for the ith test object, yi is its observed value, and nEXT is the total number of test objects. This parameter depends only on the mean deviations between predictions and observed values, and it can always be calculated, even when there is only one test object.
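The external validation statistics named in this section can be sketched in NumPy as shown below. The formulas follow the definitions commonly given for Q2 F1, Q2 F2, Q2 F3, CCC and RMSE in the QSAR literature, which is assumed to match the authors' usage; the function name `external_metrics` is hypothetical. The example inputs are only the first five training rows and the first five test rows of Table 1, so the resulting numbers illustrate the calculation and are not the statistics of the full model.

```python
import numpy as np

def external_metrics(y_train, y_ext, y_pred):
    """Q2 F1, Q2 F2, Q2 F3, CCC and RMSE for an external prediction set."""
    y_train, y_ext, y_pred = (np.asarray(a, dtype=float)
                              for a in (y_train, y_ext, y_pred))
    n_tr, n_ext = len(y_train), len(y_ext)
    press = np.sum((y_ext - y_pred) ** 2)              # predictive sum of squares
    tss_tr = np.sum((y_train - y_train.mean()) ** 2)   # training total sum of squares
    q2_f1 = 1.0 - press / np.sum((y_ext - y_train.mean()) ** 2)
    q2_f2 = 1.0 - press / np.sum((y_ext - y_ext.mean()) ** 2)
    q2_f3 = 1.0 - (press / n_ext) / (tss_tr / n_tr)
    ccc = (2.0 * np.sum((y_ext - y_ext.mean()) * (y_pred - y_pred.mean()))
           / (np.sum((y_ext - y_ext.mean()) ** 2)
              + np.sum((y_pred - y_pred.mean()) ** 2)
              + n_ext * (y_ext.mean() - y_pred.mean()) ** 2))
    rmse = np.sqrt(press / n_ext)
    return {"Q2_F1": q2_f1, "Q2_F2": q2_f2, "Q2_F3": q2_f3, "CCC": ccc, "RMSE": rmse}

# Illustrative subset of Table 1 (compounds 1-5 and 148-152 only).
y_train_demo = [1.26, 3.63, 0.34, 2.91, 0.86]
y_ext_demo = [1.6, 2.78, 3.39, 2.57, -1.71]
y_pred_demo = [1.248, 2.9783, 3.8638, 2.5277, -1.7229]

metrics = external_metrics(y_train_demo, y_ext_demo, y_pred_demo)
perfect = external_metrics(y_train_demo, y_ext_demo, y_ext_demo)
```

With perfect predictions every Q2 and the CCC equal 1 and the RMSE is 0, which is a quick way to check an implementation.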
The RMSE is calculated as the square root of the sum of squared errors in prediction divided by their total number. This parameter was calculated to compare the accuracy and the stability of our models in the training (RMSETR) and prediction (RMSEEXT) sets. It is important to note that RMSE values must not only be low but also as similar as possible for the training, cross-validation and external prediction sets. This suggests that the proposed model has both predictive ability (low values) and sufficient generalizability (similar values).

The AD is a theoretical region in chemical space, defined by the model descriptors and the modeled response, and thus by the nature of the chemicals in the training set, as represented in each model by specific molecular descriptors. As even a robust, significant and validated QSPR cannot be expected to reliably predict the modeled property for the whole universe of chemicals, its domain of application must be defined, and only the predictions for chemicals that fall in this domain can be considered reliable. The Williams plot of the regression permits a graphical detection of both the outliers for the response (Y-outliers) and the structurally influential chemicals (X-outliers) in a model. It consists of plotting the standardized residuals on the y-axis against the leverage values from the hat matrix diagonal on the x-axis. The leverage (h) of a compound measures its influence on the model. The leverage of a compound in the original variable space is defined as:

$$h_i = \mathbf{x}_i^{T}(\mathbf{X}^{T}\mathbf{X})^{-1}\mathbf{x}_i \quad (15)$$

where x_i is the descriptor vector of the ith compound, X is the model matrix derived from the training set descriptor values, and the leverage values of the training set are the diagonal elements of the hat (or influence) matrix H (hi = diag(H)). The leverage values are always between 0 and 1. The warning leverage h* is defined as follows:

$$h^{*} = \frac{3p'}{n} \quad (16)$$

where n is the number of training set compounds and p′ is the number of model parameters plus one.
Observations whose standardized residuals fall outside the (−3; +3) range, that is, outside the horizontal reference lines on the plot, are treated as response outliers in QSARINS (the residuals being expressed in units of their standard deviation). The standardized residual (SRi) for each sample is calculated as in equation (17):

$$SR_i = \frac{y_i - \hat{y}_i}{\sqrt{\dfrac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}}} \quad (17)$$

where yi and $\hat{y}_i$ are the measured and predicted values of the property, respectively, and n is the number of compounds in each set of data. To visualize the AD of a QSPR model, the plot of standardized residuals versus leverage values (h), the Williams plot, can be used for an immediate and simple graphical detection of both the response outliers and the structurally influential chemicals in a model (h > h*). Concerning the residuals, all the chemicals falling above or below the user-defined threshold are not well predicted and are thus considered outliers. Too many outliers, especially underestimated ones, are symptomatic of a poor model, which is why the outliers are counted. Leverage values represent the degree of influence that the structure of each chemical has on the model. A compound with high leverage in a QSPR model is a driving force for the variable selection if it is in the training set (good leverage). A high-leverage compound in the prediction set lies far from the chemical domain of the training compounds, so its predicted value could be unreliable, being the result of substantial extrapolation of the model. In that case, the structural information of the chemicals included in the training set may not be sufficient for a reliable prediction of chemicals lying outside the training AD[43].
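The quantities plotted in a Williams plot can be sketched as follows. This is a minimal NumPy illustration under stated assumptions: the model matrix includes an intercept column of ones, the standardized residual is taken as the residual divided by the root mean squared residual of its set, and the function names are hypothetical. With the 10 descriptors and 147 training compounds of this study, p′ = 11 and the warning leverage is 3 × 11 / 147.

```python
import numpy as np

def leverages(X_train, X_query):
    """h_i = x_i^T (X^T X)^-1 x_i, with X built from the training descriptors."""
    A_tr = np.column_stack([np.ones(len(X_train)), X_train])
    A_q = np.column_stack([np.ones(len(X_query)), X_query])
    XtX_inv = np.linalg.inv(A_tr.T @ A_tr)
    return np.einsum("ij,jk,ik->i", A_q, XtX_inv, A_q)

def warning_leverage(n_train, p_prime):
    """Warning leverage h* = 3 p' / n."""
    return 3.0 * p_prime / n_train

def standardized_residuals(y, y_hat):
    """Residuals scaled by the root mean squared residual of the set."""
    resid = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return resid / np.sqrt(np.sum(resid ** 2) / len(resid))

def williams_flags(h, sr, h_star, sr_limit=3.0):
    """Boolean masks for response outliers and structurally influential compounds."""
    return np.abs(sr) > sr_limit, h > h_star

h_star = warning_leverage(147, 11)  # about 0.224 for this study

# Synthetic check of the leverage calculation: 30 compounds, 3 descriptors.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 3))
h_demo = leverages(X_demo, X_demo)
sr_demo = standardized_residuals(rng.normal(size=30), np.zeros(30))
y_outlier, x_influential = williams_flags(h_demo, sr_demo, warning_leverage(30, 4))
```

Passing prediction-set descriptors as `X_query` gives the leverages of new compounds relative to the training space, so compounds with h above h* can be flagged as extrapolations.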

Results and Discussions

Multiple regression analysis

MLR analysis was used to derive a QSPR model. The data set was randomly divided into a training set and a test set: 147 drugs were selected as the training set for modeling, and 48 drugs were chosen as the prediction set and used for external validation of the MLR model. Using the MLR method, a linear model was obtained in which the molecular descriptors served as independent variables. Table 2 lists the descriptors, their coefficients and the model parameters.
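The split-and-fit procedure described above can be sketched with ordinary least squares. The example uses synthetic descriptors and responses, an arbitrary random seed, and hypothetical helper names. It does not reproduce the genetic-algorithm descriptor selection or the actual compound assignment used in the study.

```python
import numpy as np

rng = np.random.default_rng(42)          # arbitrary seed, for reproducibility only
n_total, n_train, n_desc = 195, 147, 10

# Random division into a training set (147) and a prediction set (48).
idx = rng.permutation(n_total)
train_idx, test_idx = idx[:n_train], idx[n_train:]

# Synthetic stand-ins for the descriptor matrix and the logP values.
X = rng.normal(size=(n_total, n_desc))
true_coef = rng.normal(size=n_desc)
y = 1.0 + X @ true_coef + rng.normal(scale=0.1, size=n_total)


def fit_mlr(X, y):
    """Ordinary least squares; returns [intercept, b1, ..., bp]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    return coef[0] + X @ coef[1:]


coef = fit_mlr(X[train_idx], y[train_idx])
rmse_tr = np.sqrt(np.mean((y[train_idx] - predict_mlr(coef, X[train_idx])) ** 2))
rmse_ext = np.sqrt(np.mean((y[test_idx] - predict_mlr(coef, X[test_idx])) ** 2))
print(f"RMSE_TR = {rmse_tr:.3f}, RMSE_EXT = {rmse_ext:.3f}")
```

The prediction set is never used in the fit, so RMSE_EXT reflects how the model performs on unseen compounds.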
Table 2

The list of descriptors, their coefficients and model parameters.

| No. | Descriptor | Coefficient | Model parameter |
|---|---|---|---|
| 1 | Intercept | −2.1502 | n = 147 |
| 2 | PSA | −0.0176 | R2 = 0.9433 |
| 3 | SKMostHphobic | 7.1814 | R2 adj = 0.9391 |
| 4 | 4PathCount | −0.0108 | s = 0.4031 |
| 5 | chiV6chain | 6.4751 | F = 226.3247 |
| 6 | Average Potential | −15.9893 | |
| 7 | AlphaR | −0.0897 | |
| 8 | XKAverageHydrophobicity | 2.1153 | |
| 9 | SAHydrophobic Area | 0.0055 | |
| 10 | SKAverage | −4.0213 | |
| 11 | Polar Surface Area Excluding P&S | 0.0176 | |
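The coefficients in Table 2 define a linear model, so a prediction is the intercept plus the coefficient-weighted sum of the descriptor values. The sketch below copies the coefficients from the table; the function name and the descriptor values in the usage example are hypothetical and do not describe any compound in the data set.

```python
# Coefficients copied from Table 2 (descriptor names as printed there).
INTERCEPT = -2.1502
COEFFICIENTS = {
    "PSA": -0.0176,
    "SKMostHphobic": 7.1814,
    "4PathCount": -0.0108,
    "chiV6chain": 6.4751,
    "Average Potential": -15.9893,
    "AlphaR": -0.0897,
    "XKAverageHydrophobicity": 2.1153,
    "SAHydrophobic Area": 0.0055,
    "SKAverage": -4.0213,
    "Polar Surface Area Excluding P&S": 0.0176,
}


def predict_logp(descriptors):
    """Return intercept + sum(coefficient * descriptor value)."""
    missing = set(COEFFICIENTS) - set(descriptors)
    if missing:
        raise KeyError(f"missing descriptors: {sorted(missing)}")
    return INTERCEPT + sum(
        COEFFICIENTS[name] * descriptors[name] for name in COEFFICIENTS
    )


# Hypothetical descriptor values, for illustration only.
example = {name: 0.0 for name in COEFFICIENTS}
example["PSA"] = 50.0
example["Polar Surface Area Excluding P&S"] = 50.0
print(round(predict_logp(example), 4))
```

The coefficients of PSA and Polar Surface Area Excluding P&S are equal in magnitude and opposite in sign. If the two descriptors take the same value for a molecule without P or S (an assumption here, not a statement from the text), their contributions cancel, and only the polar surface attributed to P and S lowers the prediction.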
Here, n is the number of compounds used for regression, R2 is the squared correlation coefficient, R2 adj is the adjusted squared correlation coefficient, s is the standard error of the regression and F is the Fisher ratio for the regression. R2 measures how well the regression line approximates the real data points; values closer to 1 represent a better fit. The high value obtained (R2 = 0.9433) indicates that the regression line fits the data closely. Equation 18 has an R2 adj value of 0.9391, which indicates very good agreement between the correlation and the variation in the data. s represents the average distance of the observed values from the regression line, that is, how wrong the regression model is on average, expressed in the units of the response variable. Smaller values are better because they indicate that the observations lie closer to the fitted line; here s = 0.4031. The F-test reflects the ratio of the variance explained by the model to the variance due to the error in the model, and the high value obtained (F = 226.3247) indicates that the model is statistically significant. The predicted and experimental values of logPo/w, together with the residuals (experimental logPo/w − predicted logPo/w), are presented in Table 1. The plot of predicted versus experimental logPo/w and the plot of residuals versus experimental logPo/w obtained by the MLR modeling, which shows the random distribution of residuals about a zero mean, are shown in Fig. 1A and B. These results show that the predicted values are in good agreement with the experimental values. Leave-one-out and leave-many-out cross-validations were performed on the training set. Q2 LOO and Q2 LMO describe the stability of a regression model by focusing on its sensitivity to the elimination of one or more data points.
The values Q2 LOO = 0.9341 and Q2 LMO = 0.9318 illustrate the stability of the model. In the present study, R2 yscr = 0.0685 and Q2 LOO-yscr = −0.0901 show that the model is not the result of chance correlation (see Fig. 2). External validation is an indispensable method for determining the true predictive ability of a QSPR model. The large values of Q2 EXT = 0.8982, Q2 F1 = 0.8941, Q2 F2 = 0.8921, Q2 F3 = 0.9118 and CCC = 0.9463 illustrate the predictive capability of the model on the external prediction set. In the Williams plot for the AD (see Fig. 3), Sulfasalazine in the test set lies to the right of the vertical line, which indicates that it has a high leverage value (h > h* = 0.224); because its standardized residual is low, it is still considered to belong to the model AD. Doxorubicin in the training set also lies to the right of the vertical line, which indicates a high leverage value (h > h* = 0.224) and a low standardized residual. These chemicals with high leverage have a stronger influence on the model than the other chemicals. In the standardized residuals plot, Enalaprilat in the training set and Phe-Phe in the test set have standardized residuals outside the (−3; +3) range, which identifies them as the two response outliers. Furthermore, there is no clear pattern in the residuals, which gives no indication of a systematic error in the model. The fitting criteria, internal validation criteria and external validation criteria are shown in Table 3.
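The internal validation ideas above, leave-one-out cross-validation and Y-scrambling, can be sketched for an ordinary least-squares model. The example assumes Q2 LOO = 1 − PRESS/TSS with the total sum of squares taken about the training-set mean; the paper's own definition is given by its earlier equations. The data are synthetic and the function names are hypothetical.

```python
import numpy as np


def _design(X):
    """Model matrix with a leading column of ones for the intercept."""
    return np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])


def r2_fit(X, y):
    """Squared correlation coefficient of the least-squares fit."""
    A = _design(X)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - np.sum(resid ** 2) / np.sum((y - y.mean()) ** 2)


def q2_loo(X, y):
    """Q2_LOO = 1 - PRESS / TSS, using e_i / (1 - h_i) as the deleted residual."""
    A = _design(X)
    H = A @ np.linalg.pinv(A.T @ A) @ A.T
    deleted = (y - H @ y) / (1.0 - np.diag(H))
    return 1.0 - np.sum(deleted ** 2) / np.sum((y - y.mean()) ** 2)


def y_scrambled_r2(X, y, n_rounds=100, seed=0):
    """R2 of models refitted on randomly permuted responses."""
    rng = np.random.default_rng(seed)
    return np.array([r2_fit(X, rng.permutation(y)) for _ in range(n_rounds)])


rng = np.random.default_rng(1)
X = rng.normal(size=(60, 3))                      # synthetic descriptors
y = 0.5 + X @ np.array([1.0, -2.0, 0.7]) + rng.normal(scale=0.3, size=60)

print(f"R2 = {r2_fit(X, y):.3f}, Q2_LOO = {q2_loo(X, y):.3f}, "
      f"mean R2_yscr = {y_scrambled_r2(X, y).mean():.3f}")
```

For a linear least-squares model the deleted residual e_i/(1 − h_i) equals the residual obtained by refitting without compound i, so no explicit refitting loop is needed. A scrambled-response R2 far below the real R2 is the pattern that argues against chance correlation.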
Figure 1

(A) Plot of predicted versus experimental logPo/w values. (B) Plot of residuals versus experimental logPo/w values.

Figure 2

Plot of R2 and Q2 Y-scrambling models versus correlations among the block of the descriptors and the experimental data (Kxy).

Figure 3

Williams plot of standardized residual (SR) versus leverage (h) values for training and test sets.

Table 3

Fitting, internal validation and external validation criteria for GA/MLR model.

| Criteria | Statistical parameter | Value |
|---|---|---|
| Fitting criteria | R2 | 0.9433 |
| | R2 adj | 0.9391 |
| | RMSETR | 0.3877 |
| | S | 0.4031 |
| | F | 226.3247 |
| Internal validation criteria | Q2 LOO | 0.9341 |
| | Q2 LMO | 0.9318 |
| | RMSECV | 0.4181 |
| | R2 yscr | 0.0685 |
| | Q2 yscr | −0.0901 |
| External validation criteria | Q2 EXT | 0.8982 |
| | Q2 F1 | 0.8941 |
| | Q2 F2 | 0.8921 |
| | Q2 F3 | 0.9118 |
| | RMSEEXT | 0.4836 |
| | CCCEXT | 0.9463 |
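The fitting criteria in Table 3 are linked by standard regression identities, which gives a quick consistency check. The sketch below assumes the usual definitions for a model with p = 10 descriptors plus an intercept fitted on n = 147 compounds; the identities are textbook relations, not formulas quoted from the paper.

```python
import math

n, p = 147, 10              # training compounds and descriptors (Table 2)
r2 = 0.9433                 # R2 from Table 3
rmse_tr = 0.3877            # RMSE_TR from Table 3

dof = n - p - 1             # residual degrees of freedom

# Adjusted R2 penalises R2 for the number of fitted parameters.
r2_adj = 1.0 - (1.0 - r2) * (n - 1) / dof

# Fisher ratio: explained variance per descriptor over residual variance.
f_ratio = (r2 / p) / ((1.0 - r2) / dof)

# s divides the squared error by the degrees of freedom, RMSE by n.
s = rmse_tr * math.sqrt(n / dof)

print(f"R2_adj = {r2_adj:.4f}, F = {f_ratio:.2f}, s = {s:.4f}")
```

With these inputs the adjusted R2 and s reproduce the tabulated 0.9391 and 0.4031 to four decimals, and F comes out close to the tabulated 226.3247; the small difference is what rounding R2 to four decimals would be expected to produce.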

Interpretation of descriptors

SKMostHydrophobic Area, SAHydrophobic Area and SKAverage

SKMostHydrophobic Area is the most hydrophobic value on the van der Waals (vdW) surface. The van der Waals surface of a molecule is the surface defined by hard cutoffs at the van der Waals radii of the individual atoms, and it represents a surface through which the molecule might be conceived as interacting with other molecules. Hydrophobic ("water-hating") materials respond to water in the opposite way to hydrophilic materials: they have little or no tendency to absorb water, and water tends to bead on their surfaces. Hydrophobic materials possess low surface tension values and lack active groups in their surface chemistry for the formation of hydrogen bonds with water. Hydrophobicity is very important for the solubility of drugs. Drugs that are extremely hydrophobic are poorly absorbed, because they are essentially insoluble in aqueous body fluids and therefore cannot gain access to the surface of cells. For a drug to be readily absorbed, it must be largely hydrophobic, yet have some solubility in aqueous solutions. This is one reason why many drugs are weak acids or weak bases. Some drugs are highly lipid-soluble and are transported in the aqueous solutions of the body on carrier proteins such as albumin. The results indicate that logPo/w increases as SKMostHydrophobic Area increases. SAHydrophobic Area is a van der Waals surface descriptor giving the hydrophobic surface area. The lipid solubility of a compound is of special importance to drug discovery and development, because it is directly related to the ability of a drug candidate to cross biological membranes. Drug molecules must be soluble enough in lipid to get into membranes, but not so soluble that they become trapped in them. These membranes are not exclusively anhydrous fatty or oily structures.
As a first approximation, membranes can be considered bilayers composed of lipids consisting of a polar head group and a large hydrophobic tail. Phosphoglycerides are major components of lipid bilayers. Other groups of bifunctional lipids include the sphingomyelins, galactocerebrosides, and plasmalogens. The hydrophobic portion is composed largely of unsaturated fatty acids, mostly with cis double bonds. In addition, there are considerable amounts of cholesterol esters, protein, and charged mucopolysaccharides in the lipid membranes. The final result is that these membranes are highly organized structures containing channels for the transport of important molecules such as metabolites, chemical regulators (hormones), amino acids, glucose, and fatty acids into the cell, and for the removal of waste products and biochemically produced products out of the cell. Accordingly, increasing the SAHydrophobic Area increases logPo/w. SKAverage is the average hydrophobicity function value. According to the Supplementary information, some molecules have a positive hydrophobicity function value and others a negative one. If a compound is more soluble in the non-polar phase than in the polar phase, its average hydrophobicity function value is higher. On this reasoning, increasing SKAverage increases logPo/w, although the SKAverage coefficient in Table 2 is negative (−4.0213), so within the fitted model its contribution must be read together with the other hydrophobicity descriptors. SKMostHydrophobic Area, SAHydrophobic Area and SKAverage are calculated by the SlogP method[44], which introduced a new atom type classification system for atom-based calculation of logPo/w.

XKAverageHydrophobicity

XKAverageHydrophobicity is the average hydrophobic value on the van der Waals (vdW) surface. This descriptor is calculated by the XlogP method[45]. In this method the atoms are classified by their hybridization states and their neighboring atoms. XlogP is based on the summation of atomic contributions and includes correction factors for some intramolecular interactions. logPo/w increases as XKAverageHydrophobicity increases.

PSA, Polar Surface Area Excluding P & S and Average Potential

The polar surface area of a molecule is defined as the sum of the contributions to the molecular surface area of polar atoms such as oxygen and nitrogen and their attached hydrogens. This parameter is easy to understand and, most importantly, provides good correlation with experimental transport data. PSA correlates with passive molecular transport through membranes, which allows prediction of human intestinal absorption, Caco-2 monolayer permeability, and blood-brain barrier penetration. Molecules with a polar surface area greater than 140 Å2 tend to be poor at permeating cell membranes. For molecules to penetrate the blood-brain barrier, a PSA of less than 90 Å2 is usually needed. In a newer approach, PSA is calculated by summation of tabulated surface contributions of polar fragments, as proposed by Ertl[46]. logPo/w decreases as PSA increases. Polar Surface Area Excluding P & S signifies the total polar surface area excluding phosphorus and sulfur. According to Table 2, this descriptor has a positive coefficient. This suggests that molecules containing S and P tend to dissolve in the polar phase, whereas molecules containing other atoms tend to dissolve in the non-polar phase. Thus, the presence of S and P atoms in a molecule does not favor lipophilicity. logPo/w increases as Polar Surface Area Excluding P & S increases. Average Potential signifies the average of the total electrostatic potential on the van der Waals surface of the molecule. According to Table 2, logPo/w decreases as Average Potential increases.
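The two PSA rules of thumb quoted above can be written as a small helper. This is only an illustration of the thresholds as stated in the text (140 Å2 and 90 Å2); the function name and output keys are hypothetical, and the flags are heuristics rather than predictions of the QSPR model.

```python
def psa_permeability_flags(psa):
    """Rule-of-thumb flags from a polar surface area given in square angstroms."""
    return {
        # PSA above 140 A^2: tends to permeate cell membranes poorly.
        "poor_membrane_permeation_likely": psa > 140.0,
        # PSA below 90 A^2: usually needed for blood-brain barrier penetration.
        "bbb_penetration_plausible": psa < 90.0,
    }


# Hypothetical PSA values, for illustration only.
for psa in (60.0, 110.0, 150.0):
    print(psa, psa_permeability_flags(psa))
```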

4PathCount, ChiV6chain and AlphaR

4PathCount signifies the total number of fragments of fourth order (four-bond paths) in a compound. It describes the connectivity of the atoms within the molecule and also reflects its branching and flexibility or rigidity. In fact, lipophilicity decreases with branching, because branching of the chain makes the molecule more compact and thereby decreases its surface area. Thus, more branching reduces the size of the molecule, making it harder to solvate in the non-polar phase. As a result, the lipophilicity of the straight-chain (normal) isomers is higher in all instances than that of the branched compounds. According to Table 2, 4PathCount has a negative coefficient, which indicates that logPo/w decreases as this descriptor increases. ChiV6chain signifies the atomic valence connectivity index for six-membered rings. This descriptor indicates the importance of molecular bulk for lipophilicity. Lipophilicity increases with molecular bulk because large molecules are better solvated in a non-polar phase such as n-octanol. This descriptor is calculated from the molecular graph. Accordingly, increasing chiV6chain increases logPo/w. AlphaR is the sum of the α values of all non-hydrogen atoms in a reference alkane. The reference alkane is the molecular graph obtained when all heteroatoms are replaced by carbon and all multiple bonds are replaced by single bonds. The parameter α is related to the size of an atom. The term ∑α is a measure of molecular bulk. When ∑α is compared to that of the corresponding reference alkane, a measure of the heteroatom count and size of a molecule can be obtained. The value of α for an atom is given by:

$$\alpha = \frac{Z - Z^{v}}{Z^{v}} \cdot \frac{1}{PN - 1}$$

where Z and Zv represent the atomic number and the number of valence electrons, respectively, and PN stands for the period number. The hydrogen atom is considered the reference, and α for hydrogen is taken to be zero.
Table 4 shows the α values of different atoms. According to Table 2, the coefficient of AlphaR is negative. These results indicate that the electronegativity of the atoms must be considered. Molecules that contain atoms such as Cl, Br, S and P have a higher α, and therefore greater size and electronegativity. As a result, more electronegative molecules are better solvated in the aqueous phase[47]. Finally, logPo/w decreases as AlphaR increases.
Table 4

The list of α of atoms commonly occurring in organic compounds.

| No. | Atom | α |
|---|---|---|
| 1 | H | 0.000 |
| 2 | C | 0.500 |
| 3 | N | 0.400 |
| 4 | O | 0.333 |
| 5 | F | 0.286 |
| 6 | P | 1.000 |
| 7 | S | 0.833 |
| 8 | Cl | 0.714 |
| 9 | Br | 1.333 |
| 10 | I | 1.643 |
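The α values in Table 4 can be recomputed from the atomic number Z, the valence electron number Zv and the period number PN. The sketch below assumes the form α = ((Z − Zv)/Zv) · 1/(PN − 1), which reproduces every entry of the table. The atom dictionary and function names are illustrative, and the AlphaR helper rests on a literal reading of the reference-alkane definition, in which every non-hydrogen atom counts as carbon.

```python
# symbol: (atomic number Z, valence electrons Zv, period number PN)
ATOMS = {
    "C": (6, 4, 2), "N": (7, 5, 2), "O": (8, 6, 2), "F": (9, 7, 2),
    "P": (15, 5, 3), "S": (16, 6, 3), "Cl": (17, 7, 3),
    "Br": (35, 7, 4), "I": (53, 7, 5),
}


def alpha(symbol):
    """alpha = ((Z - Zv) / Zv) * 1 / (PN - 1); hydrogen is the reference (0)."""
    if symbol == "H":
        return 0.0
    z, zv, pn = ATOMS[symbol]
    return (z - zv) / zv / (pn - 1)


def alpha_r(n_heavy_atoms):
    """Sum of alpha over the reference alkane: every heavy atom counted as carbon."""
    return n_heavy_atoms * alpha("C")


for symbol in ["H", "C", "N", "O", "F", "P", "S", "Cl", "Br", "I"]:
    print(f"{symbol:>2}  {alpha(symbol):.3f}")
```

Under this reading AlphaR grows linearly with the number of non-hydrogen atoms (0.5 per atom), which is consistent with its description as a bulk-related quantity.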

Conclusion

In this work, MLR was used to construct a linear QSPR model to predict the logPo/w of a wide and homogeneous set of aromatic drugs. The GA/MLR method was applied for descriptor selection, and the MLR method could model the relationship between logPo/w and the selected descriptors. The results show that the GA/MLR method is a very effective descriptor selection approach for QSPR analysis. Internal and external validation indicate that the goodness of fit, robustness and predictive ability of the MLR model are high. From the model validation, it can be concluded that the presented model is valid and can be effectively used to predict logPo/w. Moreover, the mechanism of the model was interpreted and the applicability domain of the model was defined.