
QSAR analysis of pyrimidine derivatives as VEGFR-2 receptor inhibitors to inhibit cancer using multiple linear regression and artificial neural network.

Fariba Masoomi Sefiddashti1, Saeid Asadpour1, Hedayat Haddadi1, Shima Ghanavati Nasab1.   

Abstract

BACKGROUND AND PURPOSE: In this study, the pharmacological activity of 33 furopyrimidine and thienopyrimidine compounds as vascular endothelial growth factor receptor 2 (VEGFR-2) inhibitors for cancer therapy was investigated. The most important angiogenesis inducer is vascular endothelial growth factor (VEGF), which exerts its activity by binding to two tyrosine kinase receptors called VEGFR-1 and VEGFR-2. Because of its critical role in pathological angiogenesis, VEGF is a valuable therapeutic target for anti-angiogenesis therapies.
EXPERIMENTAL APPROACH: After the descriptors were calculated, a stepwise selection method in SPSS software was used to choose 5 descriptors for modeling with multiple linear regression (MLR) and an artificial neural network (ANN). The calibration set and the test set in this study included 26 and 7 compounds, respectively.
FINDINGS/RESULTS: The performance of the models was evaluated with the R2, RMSE, and Q2 statistical parameters. The R2 values of the MLR and ANN models were 0.889 and 0.998, respectively. The RMSE of the ANN model was also lower, and its Q2 value higher, than those of the MLR model.
CONCLUSION AND IMPLICATIONS: The results were evaluated by different statistical methods, and it was concluded that the nonlinear neural network method is powerful for predicting the pharmacological activity of similar compounds. Because of the complex and nonlinear relationships involved, MLR was not capable of establishing a good model with high predictive power.
Copyright: © 2021 Research in Pharmaceutical Sciences.

Keywords:  Artificial neural network; Cancer; Multiple linear regression; Pyrimidine derivatives; QSAR

Year:  2021        PMID: 34760008      PMCID: PMC8562410          DOI: 10.4103/1735-5362.327506

Source DB:  PubMed          Journal:  Res Pharm Sci        ISSN: 1735-5362


INTRODUCTION

Cancer is one of the leading causes of mortality worldwide. It is characterized by the loss of control of cell proliferation, and most patients die without treatment (1,2). Angiogenesis is a physiological process in which new blood vessels grow from existing vessels, and it plays a key role in many pathological conditions such as tumor growth and metastasis. In adults, endothelial cells are normally quiescent but can be activated in response to appropriate factors (3,4). Angiogenesis plays a vital role in life, growth, and recovery, for example in wound healing. However, this process also underlies the transition of tumors from a dormant to a malignant state. The most important angiogenesis inducer is vascular endothelial growth factor (VEGF), which exerts its activity by binding to two tyrosine kinase receptors called VEGFR-1 and VEGFR-2 (4). Because of its critical role in pathological angiogenesis, VEGF is a valuable therapeutic target for anti-angiogenesis therapies (5). Pyrimidine is an aromatic heterocyclic organic compound similar to pyridine, with nitrogen atoms at positions 1 and 3 of the ring (6). Pyrimidine derivatives have a wide range of pharmaceutical applications. The chemical literature reports pyrimidine derivatives as antibacterial, analgesic, antiviral, anti-inflammatory, anti-HIV, antituberculosis, anticancer, anti-Parkinson, and antifungal agents, as well as sleep medications. Among the reported medicinal properties of pyrimidines, anticancer activity is the most frequently reported (7,8). The quantitative structure-activity relationship (QSAR) is a strategy of critical importance for chemistry and pharmacy, based on the idea that when the structural properties of a molecule change, its activity or properties change accordingly (9-11).
QSAR models, mathematical equations that relate chemical structure to biological activity, provide useful information for drug design and medicinal chemistry (12-15). These computational screening methods are a good alternative to the costly and laborious screening tests performed in laboratories. QSAR specialists therefore continue to work on more efficient QSAR techniques that give pharmaceutical chemists more reliable approaches in practice (16-18). Following other QSAR papers from our group (19-21), the current study attempted to model the pharmacological activity of some furopyrimidine and thienopyrimidine derivatives as VEGFR-2 inhibitors using both MLR, an extension of linear regression, and ANN, a nonlinear method, each of which uses several explanatory variables to predict the outcome of a response variable (22). A comparison of various linear and nonlinear modeling techniques in recent research has shown how different regression methods can affect the predictive power of QSAR models (23,24).

MATERIAL AND METHODS

Data sources

Two series of pyrimidine-based derivatives, namely the furo[2,3-d]pyrimidine and thieno[2,3-d]pyrimidine series, linked to either a biarylamide or a biarylurea via an NH or ether linker, have shown in vitro VEGFR-2 inhibitory activity. All VEGFR-2 inhibitors and their biological activities (percent inhibition values) were taken from the report by Aziz et al. (22). Principal component analysis (PCA) was used to divide the data set into a calibration set of 26 compounds and a test set of 7 compounds for model evaluation. The chemical structures and the bioactivity values of all compounds are presented in Table 1.
Table 1

Structural formulae of compounds and their percent inhibition values

Order  % of inhibition  (structure drawings not reproduced)
1 8
2 15
3 5
4 8
5 14
6 14
7 32
8 19
9 97
10 62
11 61
12 72
13 98
14 14
15 45
16 23
17 27
18 77
19 100
20 40
21 73
22 39
23 47
24 67
25 19
26 10
27 83
28 87
29 86
30 94
31 46
32 96
33 100

Molecular model

All 2D and 3D structures were drawn and built with ChemDraw Ultra version 15.0 and Chem3D Ultra version 15.0, respectively, and were then optimized using the MM2 algorithm in Chem3D (25,26). The theoretical molecular descriptors were derived from these optimized chemical structures.

Molecular descriptors

A molecular descriptor is a mathematical representation of a molecule that encodes chemical information in a form that can be used to address chemical, biological, and pharmaceutical problems. To develop the 2D-QSAR models, different physicochemical descriptors were calculated for each compound in the data set using DRAGON software version 5.5 (2007) (27). DRAGON is a program that calculates a variety of molecular descriptors, converting molecular information such as bond energy, bond angle, bond type, molecular mass, and electronic properties into numerical form. These descriptors can be used to study molecular structure-activity or structure-property relationships, as well as for similarity analysis and high-throughput screening of molecule databases. DRAGON is widely used in scientific studies and as part of several QSAR software collections.

Feature selection

Feature selection is the first and most important step of model design. Feature selection methods are used to pick the best descriptors from the many calculated, discarding those that carry little information or are correlated with other descriptors, without incurring much loss of information. In this study, three methods were used to reduce the number of descriptors (28). First, for each pair of descriptors with a correlation coefficient above 0.95, one was eliminated by the DRAGON software, which reduced the 3224 calculated descriptors to 447. Second, descriptors with constant or zero values, which cannot relate differences in structure to differences in activity, were removed. Third, the remaining descriptors were passed to SPSS, where the important descriptors were selected by a stepwise approach. In the stepwise strategy, a multiple linear equation is built step by step: an initial model is determined and then repeatedly changed by adding or removing a predictor variable according to stepping criteria for inclusion and exclusion. In each step, all variables are evaluated to identify the important descriptors. SPSS proposed 8 models by the stepwise regression method.
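The three reduction steps can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data are synthetic, the function names (`drop_constant`, `drop_correlated`, `forward_select`) are hypothetical, and the greedy R2-based forward selection is a simplified stand-in for the stepwise procedure in SPSS, which applies statistical entry and removal criteria. The constant-value filter is run first here only so that the correlation matrix is well defined.

```python
import numpy as np

def drop_constant(X, names):
    """Remove descriptors that take the same value for every compound."""
    keep = X.std(axis=0) > 0
    return X[:, keep], [n for n, k in zip(names, keep) if k]

def drop_correlated(X, names, threshold=0.95):
    """Keep a descriptor only if |r| with every already kept one is <= threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return X[:, kept], [names[j] for j in kept]

def forward_select(X, y, names, n_select):
    """Greedy forward selection: at each step add the descriptor that raises R2 most."""
    chosen = []
    ss_tot = np.sum((y - y.mean()) ** 2)
    for _ in range(n_select):
        best_j, best_r2 = None, -np.inf
        for j in range(X.shape[1]):
            if j in chosen:
                continue
            A = np.column_stack([np.ones(len(y)), X[:, chosen + [j]]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            r2 = 1.0 - np.sum((y - A @ coef) ** 2) / ss_tot
            if r2 > best_r2:
                best_j, best_r2 = j, r2
        chosen.append(best_j)
    return [names[j] for j in chosen]

# Synthetic example: 33 "compounds", 5 made-up descriptors
rng = np.random.default_rng(0)
a, b, c = rng.normal(size=(3, 33))
X = np.column_stack([a, 2 * a + 1, np.full(33, 3.0), b, c])
names = ["d0", "d1_dup", "d2_const", "d3", "d4"]
y = 3 * a - 2 * b + rng.normal(scale=0.1, size=33)  # activity depends on d0 and d3 only

X1, names1 = drop_constant(X, names)        # removes d2_const
X2, names2 = drop_correlated(X1, names1)    # removes d1_dup (duplicate of d0)
selected = forward_select(X2, y, names2, n_select=2)
print(names2, selected)
```

With real descriptor matrices the same functions would be applied to the 33 x 3224 table exported from the descriptor software; the 0.95 threshold is the one stated in the text.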

RESULTS

Descriptor selection

First, the data set of 33 compounds was divided into a calibration set of 26 compounds and a test set of 7 compounds, a ratio of approximately 80% to 20%. Compounds 2, 7, 9, 12, 19, 21, and 31 were selected as the test set and the remaining 26 compounds as the training set. The split of the data set was done with PCA. The equation must use the minimum number of descriptors that gives the best fit, and the stepwise regression method was used to find this number. Among the models given by SPSS, no considerable improvement in the regression coefficient (R2) was observed after the sixth model. For the proposed models, the values of the root-mean-square error (RMSE), Q2, R2, and adjusted R2 are shown in Table 2.
Table 2

R2, RMSE, Q2, adjusted R2 values for models with the different number of descriptors.

Order  Adjusted R2  Q2        RMSE      R2
1      0.527003     0.027606  22.33229  0.541784
2      0.704868     0.572353  17.35135  0.723314
3      0.778927     0.738329  14.74748  0.799653
4      0.827726     0.821183  12.83163  0.84926
5      0.869454     0.871045  10.97305  0.889852
6      0.884936     0.88934   10.18958  0.906511
7      0.909911     0.890616  10.14926  0.929618
8      0.926377     0.890344  10.2435   0.944782

R2, Regression coefficient; RMSE, root-mean-square error.

The appropriate regression model is the one with the lowest number of descriptors that gives the best fit; the number of compounds should be at least 5 times the number of descriptors, and the descriptors should be orthogonal (29,30). After analyzing the statistical parameters in Table 2 and the changes in their slope, model 5 with 5 descriptors was selected as the top model, and modeling was performed with these 5 descriptors. The characteristics of the descriptors used in this study are presented in Table 3, and their values are given in Table 4. The selected descriptors should be independent of each other, because when two descriptors are highly dependent, only the one with the higher correlation with the dependent variable is included in the model. The pairwise correlation coefficients of the descriptors were calculated with SPSS and are presented in Table 5. The results showed that the selected descriptors are largely independent, with little correlation between them.
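As a small worked check of the selection criteria, the standard adjusted R2 formula can be applied to the R2 values of Table 2. The text does not state which sample size was used for that table; the tabulated adjusted R2 values are reproduced when n = 33 is assumed, so that value is used below as an assumption. The function name is illustrative.

```python
def adjusted_r2(r2, n_compounds, n_descriptors):
    """Standard adjusted R2: penalizes R2 for the number of descriptors used."""
    return 1.0 - (1.0 - r2) * (n_compounds - 1) / (n_compounds - n_descriptors - 1)

# R2 values taken from Table 2, keyed by the number of descriptors in the model
table2_r2 = {1: 0.541784, 5: 0.889852, 8: 0.944782}

# n = 33 is an assumption (not stated in the text) that matches the tabulated values
adj = {p: adjusted_r2(r2, 33, p) for p, r2 in table2_r2.items()}
for p, value in adj.items():
    print(p, round(value, 6))

# Compounds-per-descriptor ratio for the calibration set and the chosen model
ratio = 26 / 5
print(ratio)  # 5.2, which satisfies the "at least 5 times" rule of thumb
```

The same ratio for a 6-descriptor model would be 26 / 6, about 4.3, which falls below the rule of thumb cited in the text.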
Table 3

Descriptors used in the 2D-QSAR study.

Descriptor  Descriptor block    Descriptor description
RDF035u     RDF descriptors     Radial Distribution Function - 3.5 / unweighted
Mor24v      3DMoRSE             3D-MoRSE - signal 24 / weighted by atomic van der Waals volumes
EEig11r     3DMoRSE             Eigenvalue 11 from edge adj. matrix weighted by resonance integrals
G2s         WHIM descriptors    2nd component symmetry directional WHIM index / weighted by atomic electrotopological states
ATS3v       2D autocorrelation  Broto-Moreau autocorrelation of a topological structure - lag 3 / weighted by atomic van der Waals volumes
Table 4

Values of the obtained parameters of the studied derivatives of furopyrimidine and thienopyrimidine

Number  RDF035u  Mor24v  EEig11r  ATS3v  G2s
1       26.484   -0.41   2.044    3.739  0.167
2       27.26    -0.443  2.009    3.739  0.167
3       22.56    -0.276  2.14     3.739  0.167
4       23.356   -0.168  2.005    3.763  0.174
5       24.407   -0.304  2.005    3.739  0.167
6       24.234   -0.288  2.271    3.861  0.162
7       23.311   -0.237  2.011    3.746  0.165
8       26.9     -0.409  2.01     3.687  0.167
9       28.934   -0.395  2.179    3.736  0.182
10      27.84    -0.318  2.193    3.76   0.164
11      26.824   -0.369  2.164    3.736  0.165
12      26.841   -0.351  2.01     3.736  0.165
13      28.809   -0.377  2.329    3.805  0.18
14      29.849   -0.374  2.322    3.806  0.163
15      26.845   -0.416  2.162    3.754  0.165
16      28.53    -0.345  2.01     3.783  0.164
17      25.623   -0.279  2.007    3.664  0.167
18      28.411   -0.27   2.179    3.714  0.165
19      27.922   -0.246  2.193    3.739  0.164
20      25.499   -0.303  2.007    3.714  0.165
21      28.285   -0.21   2.328    3.785  0.164
22      26.108   -0.41   2.07     3.718  0.169
23      27.997   -0.402  2.179    3.766  0.168
24      27.069   -0.353  2.193    3.789  0.184
25      26.056   -0.364  2.164    3.766  0.168
26      26.003   -0.387  2.07     3.766  0.168
27      27.842   -0.396  2.333    3.833  0.167
28      28.994   -0.398  2.324    3.833  0.174
29      26.499   -0.143  2.068    3.696  0.169
30      28.306   -0.263  2.179    3.744  0.168
31      27.494   -0.042  2.193    3.768  0.167
32      26.049   -0.191  2.068    3.744  0.168
33      28.785   -0.19   2.333    3.813  0.176
Table 5

Correlation matrix between different obtained descriptors.

Descriptor  RDF035u   Mor24v    EEig11r   ATS3v     G2s
RDF035u     1
Mor24v      -0.2354   1
EEig11r     0.562653  -0.02447  1
ATS3v       0.255481  -0.08419  0.723526  1
G2s         0.190981  -0.08694  0.209459  0.158042  1

Multiple Linear Regression (MLR)

MLR, also known simply as multiple regression, is a statistical technique that uses several explanatory variables to predict the outcome of a response variable. MLR models the linear relationship between the explanatory (independent) variables and the response (dependent) variable. In essence, multiple regression is the extension of ordinary least-squares (OLS) regression to more than one explanatory variable. The equation for MLR is:

$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_p x_{ip} + \epsilon$ (1)

where $i = 1, \dots, n$ indexes the observations; $y_i$ is the dependent variable; $x_{i1}, \dots, x_{ip}$ are the explanatory variables; $\beta_0$ is the y-intercept (constant term); $\beta_1, \dots, \beta_p$ are the slope coefficients for each explanatory variable; and $\epsilon$ is the model error term (also known as the residual). The multiple regression model is based on the following assumptions: a) there is a linear relationship between the dependent variable and the independent variables; b) the independent variables are not too highly correlated with each other; c) the $y_i$ observations are selected independently and randomly from the population; d) the residuals are normally distributed with a mean of 0 and variance $\sigma^2$. At the center of regression analysis is the task of fitting a single line through a scatter plot; MLR generalizes this by fitting a plane or hyperplane through a multi-dimensional space of data points. The simplest form has one dependent and two independent variables. The dependent variable may also be referred to as the outcome variable or regressand. The independent variables may also be referred to as the predictor variables or regressors.

After the number of final descriptors had been selected, PCA was used to divide the molecules into calibration and test sets: a calibration set of 26 compounds to build the MLR model and a test set of 7 compounds for model evaluation. The regression tool of the Analysis ToolPak add-in in Excel was used, and the MLR model was created by entering the data of the calibration set (26 compounds). The results are given in Table 6 and Fig. 1.
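For readers who prefer a scriptable alternative to a spreadsheet, the least-squares fit of equation (1) can be sketched with NumPy. This is not the authors' workflow; the data below are synthetic, and the names `fit_mlr` and `predict_mlr` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares: returns [beta_0, beta_1, ..., beta_p]."""
    A = np.column_stack([np.ones(X.shape[0]), X])  # column of ones for the intercept
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

def predict_mlr(beta, X):
    """Evaluate y = beta_0 + sum_j beta_j * x_j for each row of X."""
    return beta[0] + X @ beta[1:]

# Synthetic calibration set: 26 "compounds", 5 made-up descriptors
rng = np.random.default_rng(1)
X_cal = rng.normal(size=(26, 5))
true_beta = np.array([10.0, 2.0, -1.0, 0.5, 3.0, -2.0])
y_cal = true_beta[0] + X_cal @ true_beta[1:] + rng.normal(scale=0.01, size=26)

beta_hat = fit_mlr(X_cal, y_cal)
y_fit = predict_mlr(beta_hat, X_cal)
print(np.round(beta_hat, 2))
```

Applied to the calibration rows of Table 4 with the observed inhibition values, `fit_mlr` would return one intercept and five slopes in the same form as the fitted MLR equation.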
Table 6

Observed and calculated values of inhibition percent according to multiple linear regression method for the calibration and test sets.

Compound  Inhibition% (observed)  Inhibition% (predicted)  Residual  Relative error%
Calibration set
1         8                       21.30                    -13.30    -166.31
3         5                       14.02                    -9.02     -180.54
4         8                       14.97                    -6.97     -87.07
5         14                      6.71                     7.29      52.06
6         14                      15.37                    -1.37     -9.78
8         19                      32.61                    -13.61    -71.65
10        62                      61.90                    0.10      0.17
11        61                      47.89                    13.17     21.50
13        98                      95.33                    2.67      2.73
14        14                      24.61                    10.61     -75.75
15        45                      37.76                    7.24      16.09
16        23                      29.55                    -6.55     -28.48
17        27                      40.48                    -13.48    -49.92
18        77                      83.43                    -6.43     -8.35
20        40                      21.79                    18.21     45.54
22        39                      29.38                    9.62      24.66
23        47                      54.44                    -7.44     -15.83
24        67                      66.88                    0.12      0.18
25        19                      37.08                    -18.08    -95.14
26        10                      0.65                     -9.35     93.51
27        83                      95.59                    12.59     -15.17
28        87                      79.83                    7.17      8.24
29        86                      70.21                    15.78     18.37
30        94                      79.49                    14.51     15.43
32        96                      93.04                    -2.96     3.09
33        100                     111.34                   -11.34    -11.34
Test set
2         15                      19.31                    4.31      -28.70
7         32                      18.00                    -14.00    93.51
9         97                      88.72                    8.28      8.54
12        72                      84.33                    12.33     -17.13
19        100                     76.44                    23.56     23.56
21        73                      61.67                    -11.33    15.52
31        46                      47.07                    -0.02     -2.32
Fig. 1

Predicted inhibition percent activities by multiple linear regression in comparison with experimental for (A) model and (B) test set.

Equation (2) was the best MLR model selected by the regression method for the furopyrimidine and thienopyrimidine derivatives:

Inhibition percent = 177.73 + 9.99 RDF035u + 117.40 Mor24v + 166.22 EEig11r - 243.89 ATS3v + 1196.81 G2s (2)

where N = 26; R2 = 0.874; RMSE = 10.97; R2CV = 0.87. The inhibition percentages predicted by this model for the calibration and test sets were plotted against the experimental values and are shown in Fig. 1. The linear model was used to predict the seven external test compounds, which were never used in descriptor selection or model construction. The predicted inhibition percent values of the calibration set and the test set obtained with the MLR equation are presented in Table 6. The Y-randomization test checks the robustness of a QSAR model: the dependent variable is shuffled randomly and a new QSAR model is developed using the original matrix of independent variables. The new QSAR models (after several iterations) are expected to have low R2 and Q2 values. The results are shown in Table 7. The low R2 and Q2 values show that the good results of the original model are not due to chance correlation or structural dependency of the training set.
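The Y-randomization procedure described above can be sketched as follows. The example runs on synthetic data, computes R2 only (Q2 would additionally require a cross-validation loop), and uses hypothetical function names; it illustrates the idea rather than the exact procedure behind Table 7.

```python
import numpy as np

def r2_fit(X, y):
    """R2 of an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def y_randomization(X, y, n_iter=7, seed=0):
    """Refit the model n_iter times on randomly shuffled activities."""
    rng = np.random.default_rng(seed)
    return [r2_fit(X, rng.permutation(y)) for _ in range(n_iter)]

# Synthetic data: 26 "compounds", 5 made-up descriptors, a genuine linear signal
rng = np.random.default_rng(2)
X = rng.normal(size=(26, 5))
y = X @ np.array([3.0, -2.0, 1.0, 2.0, -1.0]) + rng.normal(scale=0.5, size=26)

r2_true = r2_fit(X, y)
r2_shuffled = y_randomization(X, y)
print(round(r2_true, 3), [round(v, 3) for v in r2_shuffled])
```

A model that captures a real structure-activity relationship should give an R2 for the unshuffled data that is clearly above every shuffled value, which is the pattern reported in Table 7.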
Table 7

R2 and Q2 values after several Y- randomization tests.

Iteration  R2    Q2
1          0.16  0.01
2          0.15  0.00
3          0.20  0.03
4          0.33  0.11
5          0.18  0.02
6          0.14  0.00
7          0.24  0.05

Artificial neural networks (ANN)

The ANN is one of the main tools used in machine learning. As the "neural" part of the name suggests, ANNs are brain-inspired systems intended to replicate the way that humans learn. Neural networks consist of input and output layers, as well as (in most cases) a hidden layer of units that transform the input into something that the output layer can use (31-33). They are excellent tools for finding patterns that are far too complex or numerous for a human programmer to extract and teach the machine to recognize. In the network, each node is connected to the nodes of the adjacent layers, and its output is determined by the weighted inputs from the units connected to it. During training, which starts from random initial weights and biases, the weights are adjusted to minimize the difference between the output value and the target value. After a sufficient number of training iterations, the ANN learns to recognize patterns in the data, so it can be used to predict the output for new input values (34,35). The network used in this study consisted of three layers (an input layer, a hidden layer, and an output layer). The input nodes contain the five parameters of the regression equation and one constant. The output neuron corresponds to the inhibition percent. Before entering the neural network, the input data were scaled to the range 0 to 1, and the inhibition percent values were scaled in the same way. Sigmoid transfer functions were applied in all layers. The weights were adjusted through a backpropagation algorithm to correct the model behavior. The program allows the desired number of neurons in the hidden layer to be set. To select the optimal model, network topologies with different numbers of hidden units were tested. In addition, the learning rate, the momentum coefficient, and the initial values of the weights and biases were varied to find the best performance and fastest convergence.
The predicted values of inhibition percent for the training, validation, and test sets using the ANN model are presented in Table 8. The inhibition percentages predicted by the ANN model are plotted against the experimental values in Fig. 2. The residual plot of the furopyrimidine and thienopyrimidine derivatives for the ANN model is shown in Fig. 3.
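A network of the kind described above (five inputs, one hidden layer of three sigmoid units, one sigmoid output, inputs and targets scaled to 0 to 1, backpropagation with momentum) can be sketched in NumPy. This is a minimal illustration and not the authors' program: the data are synthetic, and the learning rate, momentum, epoch count, and initial weight scale are assumed values chosen only so that the example runs.

```python
import numpy as np

def minmax_scale(a):
    """Rescale each column to the range 0 to 1."""
    lo, hi = a.min(axis=0), a.max(axis=0)
    return (a - lo) / (hi - lo)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ann(X, y, n_hidden=3, lr=0.1, momentum=0.9, epochs=3000, seed=0):
    """Train a one-hidden-layer sigmoid network by backpropagation with momentum."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], n_hidden))  # random initial weights
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
    b2 = np.zeros(1)
    params = [W1, b1, W2, b2]
    velocity = [np.zeros_like(p) for p in params]
    mse_history = []
    for _ in range(epochs):
        hidden = sigmoid(X @ W1 + b1)            # forward pass
        out = sigmoid(hidden @ W2 + b2)
        err = out - y
        mse_history.append(float(np.mean(err ** 2)))
        # Backward pass: gradient of the mean squared error
        d_out = (2.0 / len(y)) * err * out * (1.0 - out)
        d_hid = (d_out @ W2.T) * hidden * (1.0 - hidden)
        grads = [X.T @ d_hid, d_hid.sum(axis=0), hidden.T @ d_out, d_out.sum(axis=0)]
        for p, v, g in zip(params, velocity, grads):
            v *= momentum                        # momentum update, in place
            v -= lr * g
            p += v

    def predict(X_new):
        return sigmoid(sigmoid(X_new @ W1 + b1) @ W2 + b2)

    return predict, mse_history

# Synthetic data: 26 "compounds", 5 made-up descriptors, nonlinear activity
rng = np.random.default_rng(42)
X_raw = rng.uniform(size=(26, 5))
y_raw = 100.0 * sigmoid(4.0 * (X_raw[:, 0] - X_raw[:, 1]) + X_raw[:, 2] * X_raw[:, 3])

X_scaled = minmax_scale(X_raw)
y_scaled = minmax_scale(y_raw.reshape(-1, 1))
predict, mse_history = train_ann(X_scaled, y_scaled)
y_pred = predict(X_scaled)
print(mse_history[0], mse_history[-1])
```

Predictions come out on the 0 to 1 scale and would be mapped back to percent inhibition by inverting the min-max scaling. With so few compounds, a held-out validation set such as the one in Table 8 is what guards against overfitting.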
Table 8

Observed values and calculated values of inhibition percent according to the artificial neural network method.

Compound  Inhibition% (observed)  Inhibition% (predicted)  Residual  Relative error%
Training set
1         8                       8.02                     -0.02     0.25
2         15                      15.08                    -0.08     0.53
4         8                       8.38                     -0.38     4.75
5         14                      14.03                    -0.03     0.21
7         32                      35.24                    -3.24     10.13
8         19                      19.10                    -0.10     0.53
9         97                      97.00                    0.00      0.00
10        62                      62.25                    -0.25     0.40
11        61                      61.2                     -0.2      0.33
14        14                      14.07                    -0.07     0.50
16        23                      23.21                    -0.21     0.91
17        27                      27.06                    -0.06     0.22
18        77                      77.13                    -0.13     0.17
19        100                     100.00                   0.00      0.00
20        40                      40.10                    -0.10     0.25
22        39                      39.01                    -0.01     0.03
23        47                      40.59                    6.41      -13.64
24        67                      64.58                    2.42      -3.61
25        19                      21.47                    -2.47     13.00
26        10                      10.01                    -0.01     0.10
27        83                      83.10                    -0.10     0.12
30        94                      95.62                    -1.62     1.72
33        100                     100.00                   0.00      0.00
Validation set
6         14                      13.73                    0.27      -1.93
12        72                      66.86                    5.14      -7.14
21        73                      73.13                    -0.13     0.18
29        86                      86.01                    -0.01     0.01
32        96                      99.47                    -3.47     3.61
Test set
3         5                       5.00                     0.00      0.00
13        98                      98.00                    0.00      0.00
15        45                      45.21                    -0.21     0.47
28        87                      87.02                    -0.02     0.02
31        46                      46.08                    -0.08     0.17

N = 33; Rtrain = 0.998; Rtest = 0.999; Rvalidation = 0.999; Rall = 0.998; R2CV= 0.99998; root-mean-square error = 1.78

Fig. 2

Predicted inhibition percent activities by artificial neural network in comparison with experimental.

Fig. 3

Residual plot of furopyrimidine and thienopyrimidine derivatives by artificial neural network model.


DISCUSSION

Modeling was performed with 5 descriptors named RDF035u, Mor24v, EEig11r, ATS3v, and G2s. The RDF035u descriptor belongs to the RDF family, Mor24v and EEig11r to the 3DMoRSE family, G2s to the WHIM family, and ATS3v to the 2D-autocorrelation family. The 2D autocorrelation descriptors are a subset of the autocorrelation descriptors, a group of molecular descriptors calculated from the autocorrelation function (AC1). In general, 2D autocorrelation descriptors express how an atomic property is distributed over the topological structure, and they can be calculated by summing the products of the atomic property values of the terminal atoms of all paths of a given length. Four descriptors, RDF035u, Mor24v, EEig11r, and G2s, entered the model with positive coefficients, and ATS3v entered with a negative coefficient. A positive coefficient indicates a direct effect of the descriptor on the activity, and a negative coefficient indicates an inverse effect. The ATS3v descriptor is a Broto-Moreau autocorrelation descriptor weighted by atomic van der Waals volumes. It has a negative coefficient in the equation, meaning that an increase in this descriptor reduces the inhibitory activity against the VEGFR-2 receptor. 3DMoRSE descriptors (3D molecule representation of structures based on electron diffraction) are calculated from an equation used in electron diffraction studies, which allows the 3D structure of the molecule to be represented by a fixed number of values. These descriptors provide a link between the 3D structure of organic compounds and their physical, chemical, and biological properties. Because they express the 3D arrangement of atoms independently of the size of the molecule, they can be applied to a large number of molecules with large structural differences.
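The summation that defines a 2D autocorrelation descriptor can be illustrated on a toy molecular graph. The sketch below computes the raw Broto-Moreau sum for a given lag, taking the path length between two atoms as their topological distance (number of bonds on the shortest path). The five-atom chain and its atomic weights are made up for illustration, and any scaling or transformation applied by descriptor software to the raw sum is not reproduced.

```python
from collections import deque

def topological_distances(adjacency, start):
    """Breadth-first search: bonds on the shortest path from start to each atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        atom = queue.popleft()
        for neighbor in adjacency[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist

def broto_moreau(adjacency, weights, lag):
    """Sum of w_i * w_j over all atom pairs whose topological distance equals lag."""
    total = 0.0
    atoms = sorted(adjacency)
    for i in atoms:
        dist = topological_distances(adjacency, i)
        for j in atoms:
            if j > i and dist.get(j) == lag:  # count each pair once
                total += weights[i] * weights[j]
    return total

# Hypothetical 5-atom chain 0-1-2-3-4 with made-up atomic property values
chain = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
w = {0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0}

ats3 = broto_moreau(chain, w, lag=3)  # pairs (0,3) and (1,4): 1*4 + 2*5
ats1 = broto_moreau(chain, w, lag=1)  # bonded pairs: 2 + 6 + 12 + 20
print(ats3, ats1)  # 14.0 40.0
```

For a descriptor such as ATS3v, the weights would be atomic van der Waals volumes and the lag would be 3.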
The Mor24v descriptor in the equation, which is weighted by atomic van der Waals volumes, has a positive coefficient and therefore a direct effect on the inhibition percent. WHIM descriptors are calculated from the Cartesian coordinates of the 3D structure of a molecule, using the lowest-energy conformer, and include information about the size, shape, symmetry, and atom distribution of the 3D structure. The G2s descriptor, which is weighted by the electrotopological state of Kier and Hall, has a positive coefficient in the equation and a direct effect on the inhibition percent. RDF descriptors are based on interatomic distances in the 3D representation of molecules and, in addition to these distances, provide information about ring types, planar and non-planar systems, and atom types. The RDF035u descriptor in the equation, which is not weighted by any specific atomic property, has a positive coefficient, so the inhibition percent increases as it increases. The main performance parameters of the two models are shown in Table 9. According to these results, all statistical parameters of the ANN model are better than those of the MLR model. The predictions of the two models are compared compound by compound in Table 10, which shows that the relative error of the ANN model is much lower than that of the MLR model.
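The per-compound quantities used in this comparison can be written out explicitly. The sketch below assumes the convention residual = observed - predicted and relative error = 100 x residual / observed, which reproduces the MLR entries of Table 10 for compounds 1 and 5. The RMSE helper divides by the number of compounds; that divisor is an assumption, since the text does not state which one was used, and the helper is not claimed to reproduce the RMSE values of Table 9.

```python
import math

def residual(observed, predicted):
    return observed - predicted

def relative_error_percent(observed, predicted):
    """Signed relative error in percent, relative to the observed value."""
    return 100.0 * (observed - predicted) / observed

def rmse(observed, predicted):
    """Root-mean-square error with divisor n (assumed convention)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# MLR entries for compounds 1 and 5, taken from Table 10
res_1 = residual(8, 21.305)
rel_1 = relative_error_percent(8, 21.305)
rel_5 = relative_error_percent(14, 6.711)
print(round(res_1, 3), round(rel_1, 3), round(rel_5, 3))

# RMSE over the same two compounds, purely as a usage example
rmse_two = rmse([8, 14], [21.305, 6.711])
print(round(rmse_two, 3))
```

Because the relative error is divided by the observed value, compounds with low observed inhibition (for example 5% or 8%) produce very large percentages even for moderate residuals, which explains the largest MLR entries in the table.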
Table 9

Performance comparison between models obtained by MLR and ANN.

Models  Calibration          Prediction
        RMSE   R2     Q2     RMSE   R2     Q2
MLR     10.97  0.889  0.87   14.54  0.684  0.75
ANN     1.78   0.998  0.99   0.17   0.999  0.99

MLR, multiple linear regression; ANN, artificial neural network; RMSE, root-mean-square error; R2, regression coefficient.

The ANN model, with a hidden layer of three nodes and a sigmoid transfer function, could predict the activity of the VEGFR-2 inhibitory derivatives with an absolute relative error lower than 1% for calibration and validation and lower than 1% for prediction. Table 10 compares the prediction performance of the MLR and ANN models.
Table 10

Comparing values of inhibition percent experimental and predicted results using MLR and ANN methods.

Number  Inhibition (observed)  MLR (predicted)  MLR residual  MLR relative error (%)  ANN (predicted)  ANN residual  ANN relative error (%)
1       8                      21.305           -13.305       -166.313                8.017            -0.017        -0.219
2       15                     19.305           -4.305        -28.7                   15.082           -0.082        -0.545
3       5                      14.027           -9.027        -180.54                 5.001            -0.001        -0.027
4       8                      14.966           -6.966        -87.075                 8.382            -0.382        -4.773
5       14                     6.711            7.289         52.064                  14.031           -0.031        -0.218
6       14                     15.369           -1.369        -9.779                  13.734           0.266         1.899
7       32                     18.175           13.825        43.203                  35.244           -3.244        -10.137
8       19                     32.614           -13.614       -71.653                 19.1             -0.1          -0.525
9       97                     88.716           8.284         8.54                    96.998           0.002         0.002
10      62                     61.903           0.097         0.156                   62.25            -0.25         -0.403
11      61                     47.884           13.116        21.502                  61.195           -0.195        -0.32
12      72                     84.331           -12.331       -17.126                 66.864           5.136         7.133
13      98                     95.327           2.673         2.728                   97.998           0.002         0.002
14      14                     24.605           -10.605       -75.75                  14.068           -0.068        -0.487
15      45                     37.76            7.24          16.089                  45.207           -0.207        -0.46
16      23                     29.55            -6.55         -28.478                 23.208           -0.208        -0.903
17      27                     40.479           -13.479       -49.922                 27.056           -0.056        -0.207
18      77                     83.432           -6.432        -8.353                  77.126           -0.126        -0.163
19      100                    76.441           23.559        23.559                  99.999           0.001         0.001
20      40                     21.785           18.215        45.538                  40.099           -0.099        -0.248
21      73                     61.667           11.333        15.525                  73.131           -0.131        -0.179
22      39                     29.382           9.618         24.662                  39.011           -0.011        -0.029
23      47                     54.44            -7.44         -15.83                  40.586           6.414         13.646
24      67                     66.878           0.122         0.182                   64.579           2.421         3.613
25      19                     37.076           -18.076       -95.137                 21.466           -2.466        -12.979
26      10                     0.649            9.351         93.51                   10.007           -0.007        -0.073
27      83                     95.591           -12.591       -15.17                  83.1             -0.1          -0.12
28      87                     79.829           7.171         8.243                   87.023           -0.023        -0.026
29      86                     70.205           15.795        18.366                  86.008           -0.008        -0.009
30      94                     79.492           14.508        15.434                  95.62            -1.62         -1.724
31      46                     47.071           -1.071        -2.328                  46.084           -0.084        -0.184
32      96                     93.038           2.962         3.085                   99.47            -3.47         -3.615
33      100                    111.342          -11.342       -11.342                 99.996           0.004         0.004

MLR, multiple linear regression; ANN, artificial neural network.


CONCLUSION

QSAR analysis can greatly help in understanding the basic structural properties that inhibitors need for their target, and thus in discovering more promising chemical derivatives (36). The MM2 force field was used to optimize the 3D geometry of the molecules, and DRAGON was used to calculate a diverse set of molecular descriptors. The values predicted by the MLR method and the ANN technique are close to the experimental values, which demonstrates the ability of the selected descriptors to describe the molecules for prediction. In the MLR method, a six-parameter equation containing a constant value and the coefficients of the 5 selected descriptors was obtained. The ANN model, with a hidden layer of three nodes and a sigmoid transfer function, could predict the activity of the VEGFR-2 inhibitory derivatives with an absolute relative error lower than 1% for calibration and validation and lower than 1% for prediction. Comparing the results of the MLR and ANN methods showed the superiority of the ANN method over MLR for predicting the activities.

Conflict of interest statement

All authors declared no conflict of interest in this study.

Authors’ contribution

F. Masoomi Sefiddashti conducted the research, analyzed the data, and wrote the manuscript. Sh. Ghanavati Nasab conducted the research and participated in revising the manuscript. H. Haddadi and S. Asadpour supervised the project and revised the manuscript. The manuscript was reviewed by all authors.
  17 in total

Review 1.  Chem-bioinformatics and QSAR: a review of QSAR lacking positive hydrophobic terms.

Authors:  C Hansch; A Kurup; R Garg; H Gao
Journal:  Chem Rev       Date:  2001-03       Impact factor: 60.622

Review 2.  Pericytes at the intersection between tissue regeneration and pathology.

Authors:  Alexander Birbrair; Tan Zhang; Zhong-Min Wang; Maria Laura Messi; Akiva Mintz; Osvaldo Delbono
Journal:  Clin Sci (Lond)       Date:  2015-01       Impact factor: 6.124

3.  Improved 3D-QSAR prediction by multiple-conformational alignment: A case study on PTP1B inhibitors.

Authors:  Xiangyu Zhang; Jianping Mao; Wei Li; Kazuo Koike; Jian Wang
Journal:  Comput Biol Chem       Date:  2019-10-01       Impact factor: 2.877

4.  Novel piperidinylpyrimidine derivatives as inhibitors of HIV-1 LTR activation.

Authors:  Norio Fujiwara; Takashi Nakajima; Yutaka Ueda; Hitoshi Fujita; Hajime Kawakami
Journal:  Bioorg Med Chem       Date:  2008-09-30       Impact factor: 3.641

5.  The predicting study for chromatographic retention index of saturated alcohols by MLR and ANN.

Authors:  W Guo; Y Lu; X M Zheng
Journal:  Talanta       Date:  2000-03-06       Impact factor: 6.057

Review 6.  Vascular endothelial growth factor as an anti-angiogenic target for cancer therapy.

Authors:  Gang Niu; Xiaoyuan Chen
Journal:  Curr Drug Targets       Date:  2010-08       Impact factor: 3.465

7.  A modeling study of aldehyde inhibitors of human cathepsin K using partial least squares method.

Authors:  M Shahlaei; A Fassihi; L Saghaie; E Arkan; A Pourhossein
Journal:  Res Pharm Sci       Date:  2011-07

8.  Discovery of Potent VEGFR-2 Inhibitors based on Furopyrimidine and Thienopyrimidne Scaffolds as Cancer Targeting Agents.

Authors:  Marwa A Aziz; Rabah A T Serya; Deena S Lasheen; Amal Kamal Abdel-Aziz; Ahmed Esmat; Ahmed M Mansour; Abdel Nasser B Singab; Khaled A M Abouzid
Journal:  Sci Rep       Date:  2016-04-15       Impact factor: 4.379

9.  A quantitative structure-activity relationship (QSAR) study of some diaryl urea derivatives of B-RAF inhibitors.

Authors:  Sedighe Sadeghian-Rizi; Amirhossein Sakhteman; Farshid Hassanzadeh
Journal:  Res Pharm Sci       Date:  2016-12

10.  Prediction of p38 map kinase inhibitory activity of 3, 4-dihydropyrido [3, 2-d] pyrimidone derivatives using an expert system based on principal component analysis and least square support vector machine.

Authors:  M Shahlaei; L Saghaie
Journal:  Res Pharm Sci       Date:  2014 Nov-Dec