
Dataset of 2-(2-(4-aryloxybenzylidene) hydrazinyl) benzothiazole derivatives for GQSAR of antitubercular agents.

Amit S Tapkir, Sohan S Chitlange, Ritesh P Bhole.

Abstract

A fragment-based quantitative structure-activity relationship (QSAR) analysis was carried out on a reported dataset of 25 2-(2-(4-aryloxybenzylidene) hydrazinyl) benzothiazole derivatives as antitubercular agents. Molecules in the dataset were fragmented into six fragments (R1, R2, R3, R4, R5, R6). Group-based QSAR (GQSAR) models were derived using multiple linear regression (MLR) analysis and selected on the basis of various statistical parameters. The benzothiazole dataset revealed that the presence of halogen atoms is an essential requirement for activity. The generated models describe the structural requirements of benzothiazole derivatives and can be used to design and develop potent antitubercular derivatives.

Keywords:  Antitubercular; Benzothiazole; GQSAR; Quantitative structure-activity relationship

Year:  2017        PMID: 28831410      PMCID: PMC5554989          DOI: 10.1016/j.dib.2017.08.006

Source DB:  PubMed          Journal:  Data Brief        ISSN: 2352-3409


Value of the data

Tuberculosis is one of the most lethal diseases of the current decade, and the development of potent antitubercular compounds is an urgent need.

The GQSAR modelling data were developed to predict the structural properties of the benzothiazole dataset that impart antitubercular activity.

The generated GQSAR models can be used to screen various heterocyclic datasets for antitubercular potency, which may lead to the development of novel antitubercular compounds.

Data

The data presented here describe the development of a GQSAR equation that is used to predict the contribution of substituents to the antitubercular potential of the benzothiazole dataset.

Experimental design, materials and methods

Data set preparation

The molecular data set for the current study was taken from the literature reported by Telvekar et al. [1]. All 25 structures of the benzothiazole derivatives were drawn using the 2D builder module of VLife MDS 4.3. These 2D structures were converted into 3D using the VLife engine platform. The geometry of the 3D structures was optimized by energy minimization using the Merck molecular force field (MMFF) and Gasteiger charges. A common template representative of all the molecules under study was prepared with a dummy atom (X) at each substitution site.

Calculation of descriptors

The common chemical structure shown in Fig. 1 was used for development of the GQSAR model. The molecules in the data set were fragmented into six different fragments (R1–R6). The fragmented molecules were imported into the QSAR module of VLife MDS for calculation of molecular descriptors. Molecular descriptors are numerical values that represent physical and chemical information about the molecules. In GQSAR studies, descriptors represent the physical and chemical behavior of the substituents present at each site.
Fig. 1. Molecular template utilized for the fragmentation pattern.
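
As a rough illustration of a substituent-level descriptor, the sketch below counts chlorine atoms at each substitution site for three molecules taken from Table 1. The function names `chlorines_count` and `descriptor_table` and the dictionary layout are assumptions made for this example; the descriptors in the study were calculated by the QSAR module of VLife MDS, whose internals are not described here.

```python
# Site labels as they appear in the header of Table 1.
SITES = ("R", "R1", "R2", "R3", "R4", "R5")

# Substituents for molecules 1, 4 and 7, copied from Table 1.
MOLECULES = {
    1: ("H", "H", "H", "H", "H", "H"),
    4: ("H", "Cl", "H", "Cl", "H", "H"),
    7: ("Cl", "Cl", "H", "H", "H", "H"),
}


def chlorines_count(substituent: str) -> int:
    """Count chlorine atoms in a substituent formula such as 'Cl' or 'CH3'."""
    return substituent.count("Cl")


def descriptor_table(molecules):
    """Return {molecule: {site: chlorine count}} for every site of every molecule."""
    return {
        mol: {site: chlorines_count(sub) for site, sub in zip(SITES, subs)}
        for mol, subs in molecules.items()
    }


if __name__ == "__main__":
    for mol, row in descriptor_table(MOLECULES).items():
        print(mol, row)
```

Each molecule then contributes one row of per-site descriptor values, which is the form a group-based regression needs as input.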

Data selection and building G-QSAR model [2], [3], [4], [5]

The dataset of 25 benzothiazole derivatives was randomly divided into a training set of 17 molecules and a test set of 8 molecules. Random division is intended to give a uniform distribution of biological activity across the training and test sets. Multiple linear regression analysis was used to develop the GQSAR models, with the number of independent variables (descriptors) limited to no more than 3 per model (Table 1).
Table 1. Molecules under study.

Mol. No.  R     R1   R2    R3   R4   R5
1         H     H    H     H    H    H
2         H     Cl   H     H    H    H
3         H     H    H     Cl   H    H
4         H     Cl   H     Cl   H    H
5         H     H    CH3   Cl   H    H
6         Cl    H    H     H    H    H
7         Cl    Cl   H     H    H    H
8         Cl    H    H     Cl   H    H
9         Cl    Cl   H     Cl   H    H
10        Cl    H    CH3   Cl   H    H
11        CH3   H    H     H    H    H
12        CH3   Cl   H     H    H    H
13        CH3   H    H     Cl   H    H
14        CH3   Cl   H     Cl   H    H
15        CH3   H    CH3   Cl   H    H
16        OCH3  H    H     H    H    H
17        OCH3  Cl   H     H    H    H
18        OCH3  H    H     Cl   H    H
19        OCH3  Cl   H     Cl   H    H
20        OCH3  H    CH3   Cl   H    H
21        NO2   H    H     H    H    H
22        NO2   Cl   H     H    H    H
23        NO2   H    H     Cl   H    H
24        NO2   Cl   H     Cl   H    H
25        NO2   H    CH3   Cl   H    H
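
The random 17/8 division described above can be sketched in a few lines. This is only an illustration: the helper `split_dataset` and the seed value are assumptions, and the split it produces will not in general match the test set actually used in the study (the molecules marked # in Table 2).

```python
import random


def split_dataset(molecule_ids, n_test, seed):
    """Randomly split molecule IDs into (training, test) lists.

    The seed is a hypothetical choice that only makes the example reproducible.
    """
    ids = list(molecule_ids)
    rng = random.Random(seed)
    test = sorted(rng.sample(ids, n_test))
    test_lookup = set(test)
    train = [m for m in ids if m not in test_lookup]
    return train, test


# 25 molecules numbered 1..25, 8 of them held out as the test set.
train_ids, test_ids = split_dataset(range(1, 26), n_test=8, seed=0)

if __name__ == "__main__":
    print("training set:", train_ids)
    print("test set:", test_ids)
```

In practice it is worth checking that both sets cover the full activity range, since a purely random draw from 25 molecules does not guarantee that.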

Validation of the developed G-QSAR model [6], [7], [8], [9], [10]

Validation is a critical step in QSAR model development. Validation methods are required to establish the predictive ability of a QSAR model on unseen data and to determine whether the complexity of the model is justified by the data under study. A number of methods are reported for internal validation of QSAR models, including least-squares fit (R2), cross-validation (Q2), adjusted R2 (R2adj), the chi-squared test (χ2), root mean squared error (RMSE), bootstrapping and scrambling (Y-randomization). The observed activity of the molecules in the dataset was expressed as MIC (μg/ml) and converted into pMIC for QSAR analysis. All molecules in the dataset have activity (MIC) in the range 1.5–29.00 μg/ml.
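
Several of the internal validation statistics named above can be written down compactly. The sketch below is a minimal illustration and not the VLife MDS implementation: the function names are assumptions, the leave-one-out q2 is shown for a single-descriptor linear model only, and the numbers in the usage example are hypothetical.

```python
import math


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Penalise r2 for the number of descriptors in the model."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def q2_loo(x, y):
    """Leave-one-out q2: refit without each point, then predict that point."""
    press = 0.0
    for i in range(len(x)):
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (a + b * x[i])) ** 2
    mean_y = sum(y) / len(y)
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


if __name__ == "__main__":
    # Hypothetical descriptor values and activities, for illustration only.
    x = [0.0, 1.0, 2.0, 3.0, 4.0]
    y = [0.1, 1.1, 1.9, 3.2, 3.9]
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    print(r_squared(y, fitted), rmse(y, fitted), q2_loo(x, y))
```

The q2 here uses the mean of all observed values in the denominator; some packages use the mean of each reduced training set instead, which gives slightly different numbers.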

QSAR analysis

The congeneric nature of the dataset is a basic prerequisite for any QSAR analysis. Fragment-based QSAR is a recent methodology in which complex structures can be analyzed. Thirty different G-QSAR models were generated and the best one was selected on the basis of statistical values such as r2, q2, pred_r2, F-test and standard error. The activity predicted by the QSAR models was in accordance with the observed biological activity, with small variations that are evident in the correlation plots of the different models (Table 2 and Fig. 2). The equation of the selected model is given after Fig. 2.
Table 2. Observed and predicted activity of the selected GQSAR model.

Mol. No.  Observed activity, pMIC (μg/ml)  Predicted activity, pMIC (μg/ml)
1         2.1                              2.7
2         0.9                              0.7
3         1.2                              1.2
4 #       2.3                              1.2
5 #       1.2                              0.9
6         1.3                              1.6
7         3.3                              4.1
8         2.8                              3.6
9         5.6                              4.0
10        1.5                              3.7
11 #      1.0                              0.7
12        1.4                              1.2
13        0.8                              0.7
14        0.9                              1.2
15        0.8                              0.8
16        0.8                              0.7
17 #      0.7                              1.3
18 #      1.1                              0.7
19        0.6                              1.3
20        0.6                              0.8
21 #      0.6                              0.7
22 #      0.6                              1.1
23        0.7                              0.7
24        0.7                              1.1
25 #      0.7                              0.8

# Test set molecules

Fig. 2. Correlation plot for the selected GQSAR model having r2 0.88.

pMIC = 0.0038 + 2.9110 (±0.4296) R1-ChlorinesCount + 5.5097 (±2.0358) R2-MomInertiaX

r2: 0.8845, q2: 0.6059, F test: 35.45.
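
To show how the selected model equation is applied, the sketch below evaluates it for given descriptor values. The coefficients are those reported for the selected model; the function name `predict_pmic` and the descriptor values in the usage example are hypothetical, since the article does not list the MomInertiaX values of the individual molecules.

```python
# Coefficients of the selected GQSAR model, as reported in the article.
INTERCEPT = 0.0038
COEF_R1_CHLORINES_COUNT = 2.9110
COEF_R2_MOM_INERTIA_X = 5.5097


def predict_pmic(r1_chlorines_count, r2_mom_inertia_x):
    """Predicted pMIC from the two fragment descriptors of the selected model."""
    return (
        INTERCEPT
        + COEF_R1_CHLORINES_COUNT * r1_chlorines_count
        + COEF_R2_MOM_INERTIA_X * r2_mom_inertia_x
    )


if __name__ == "__main__":
    # Hypothetical descriptor values: one chlorine at R1 and an illustrative
    # moment-of-inertia descriptor value for the R2 fragment.
    print(predict_pmic(r1_chlorines_count=1, r2_mom_inertia_x=0.1))
```

Both coefficients are positive, so within this model an additional chlorine at R1 or a larger R2 moment-of-inertia descriptor raises the predicted pMIC.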
Specifications Table

Subject area: Computational and in silico chemistry
More specific subject area: Group Quantitative Structure-Activity Relationship (GQSAR)
Type of data: Equation, tables, graphs
How data was acquired: Group-based QSAR modelling
Data format: Analysis
Experimental factors: Multiple linear regression GQSAR models for predicting the inhibitory potential of the benzothiazole dataset were created. 17 molecules were used as the training dataset and 8 molecules as the test dataset.
Experimental features: Fragment descriptors and pMIC values were used in GQSAR analysis via the stepwise variable selection method on a dataset of 25 molecules.
Data source location: Laboratory of Pharmaceutical Chemistry, Progressive Education Society's Modern College of Pharmacy, Sector 21, Yamunanagar, Nigdi, Pune 411044, Maharashtra, India
Data accessibility: The data is with this article

1.  Predictive QSAR modeling based on diversity sampling of experimental datasets for the training and test set selection.

Authors:  Alexander Golbraikh; Alexander Tropsha
Journal:  J Comput Aided Mol Des       Date:  2002 May-Jun       Impact factor: 3.686

2.  Novel 2-(2-(4-aryloxybenzylidene) hydrazinyl)benzothiazole derivatives as anti-tubercular agents.

Authors:  Vikas N Telvekar; Vinod Kumar Bairwa; Kalpana Satardekar; Anirudh Bellubi
Journal:  Bioorg Med Chem Lett       Date:  2011-10-30       Impact factor: 2.823

Review 3.  Variable selection methods in QSAR: an overview.

Authors:  Maykel Pérez González; Carmen Terán; Liane Saíz-Urra; Marta Teijeira
Journal:  Curr Top Med Chem       Date:  2008       Impact factor: 3.295
