
Models for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines.

Rakesh K Goyal, Harish Dureja, Gajendra Singh, Anil Kumar Madan.

Abstract

The relationship between topological indices and antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines has been investigated. A data set consisting of 31 analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines was selected for the present study. The values of numerous topostructural and topochemical indices for each of the 31 differently substituted analogues of the data set were computed using an in-house computer program. The resulting data were analyzed and suitable models were developed through decision tree, random forest and moving average analysis (MAA). The goodness of the models was assessed by calculating overall accuracy of prediction, sensitivity, specificity and Matthews correlation coefficient. Pendentic eccentricity index, a novel, highly discriminating, non-correlating pendenticity-based topochemical descriptor, was also conceptualized and successfully utilized for the development of a model for antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. The proposed index not only exhibited high sensitivity towards both the presence and the relative position(s) of pendent/heteroatom(s) but also led to a significant reduction in degeneracy. Random forest correctly classified the analogues into active and inactive with an accuracy of 67.74%. A decision tree was also employed for determining the importance of molecular descriptors. The decision tree learned the information from the input data with an accuracy of 100% and correctly predicted the cross-validated (10-fold) data with an accuracy of up to 77.4%. Statistical significance of the proposed models was also investigated using intercorrelation analysis. Accuracy of prediction of the proposed MAA models ranged from 90.4 to 91.6%.

Keywords:  5′-O-[(N-Acyl)sulfamoyl]adenosines; Antitubercular activity; Molecular connectivity topochemical index; Pendentic eccentricity index; Superpendentic topochemical index; Wiener’s topochemical index

Year:  2010        PMID: 21179317      PMCID: PMC3007618          DOI: 10.3797/scipharm.1006-03

Source DB:  PubMed          Journal:  Sci Pharm        ISSN: 0036-8709


Introduction

In the pharmaceutical industry, much effort is devoted to developing new drugs [1]. The seven steps involved in the drug discovery process are: disease selection, target hypothesis, lead identification, lead optimization, pre-clinical trial, clinical trial and pharmacogenomic optimization. Traditionally, these steps are carried out sequentially, so a delay in any one of them slows down the entire process [2]. Considering both the potential benefits to human health and the enormous cost in time and money of drug discovery, any tool or technique that enhances the efficiency of any stage of the drug discovery enterprise will be highly prized [3]. A viable solution lies in estimating the necessary properties of molecules directly from their structure, without the input of any other experimental data, through quantitative structure-activity relationship (QSAR) models [4]. The main hypothesis in the QSAR/QSPR (quantitative structure-activity/property relationship) approach is that all properties (physico-chemical and biological) of a chemical substance are statistically related to its molecular structure [5]. Quantitative relations generated from such studies help in hypothesizing important contributions of specific structural aspects or chemical interactions in modifying physico-chemical properties and biological activities, and also in predicting properties and activities of untested and not yet synthesized compounds [6]. Mathematical descriptors of molecular structure, such as various topological indices (TIs), have been widely used in structure-property-activity relationship studies [7]. Topological descriptors are mathematical entities encoding molecular graphs composed of vertices (corresponding to the atoms) and edges (representing the bonds among atoms).
These are two-dimensional descriptors which take into account the internal atomic arrangement of compounds, and encode in numerical form information about molecular size, shape, branching, presence of heteroatoms and multiple bonds [8]. One of the most interesting advantages of molecular topology is the straightforward calculation of topological descriptors [9] without requiring any experimentally derived measurement. The usefulness of TIs in QSPR and QSAR studies has been widely demonstrated, and they have also been used as a measure of structural similarity or diversity by their application to databases virtually generated by computer [10]. Although a large number of topostructural and topochemical indices of diverse nature have been reported in the literature, only a small proportion of them have been successfully employed in structure-activity relationships (SARs). Topostructural and topochemical indices that have been successfully employed in SAR studies include Wiener’s index [11], Hosoya’s index [12], Randic’s molecular connectivity index [13], Zagreb group parameters [14, 15], Balaban’s index [16], Schultz’s index [17], molecular connectivity topochemical index [18, 19], eccentric connectivity index [20], revised Wiener index [21], E-state index [22], eccentric connectivity topochemical index [23], Zagreb topochemical indices [24], and superaugmented eccentric connectivity indices [25].
Tuberculosis (TB), one of the oldest recorded human afflictions, is still one of the biggest killers among the infectious diseases, despite the worldwide use of a live attenuated vaccine and a combination of several antibiotics [26]. The disease spreads more easily in overcrowded places and in conditions of malnutrition and poverty, characteristics typical of developing countries. Tuberculosis is the commonest opportunistic disease in persons infected with human immunodeficiency virus [27].
Mycobacterium tuberculosis, the causative agent of TB, is the leading bacterial cause of infectious disease mortality. Mycobacterium tuberculosis and Yersinia pestis, the causative agent of plague, have been reported to be pathogens with serious ongoing impact on global public health and potential use as agents of bioterrorism [28]. The development of M. tuberculosis strains that are resistant to all of the current front-line antitubercular drugs has prompted worldwide efforts to develop new antibiotics to treat this notorious pathogen [29]. It is a well-known fact that iron is a required element for growth and survival of M. tuberculosis in its host, and iron overload can be an exacerbating cofactor to tuberculosis [30]. Although iron’s abundance in the earth’s crust, spin state, and redox tuneability make it the most versatile among transition elements, the insolubility of ferric hydroxide at pH 7.4 limits the concentration of [Fe3+] (the free aqueous ion) to ∼10⁻¹⁸ M. However, even below this concentration, free ferric ion is toxic. To avoid toxicity and regulate iron transport, the human serum iron transport protein, transferrin, maintains the free ferric iron concentration at about 10⁻²⁴ M [31]. In a mammalian host, the concentration of free iron in serum and body fluids is too low to support growth of bacteria [32]. The ability of pathogens to obtain iron from transferrins, ferritin, hemoglobin, and other iron-containing proteins of their host is central to whether they live or die [33]. Both pathogenic and saprophytic microorganisms have evolved sophisticated iron-acquisition systems to overcome the iron deficiency imposed by host defensive mechanisms and their environment. At the core of such systems is the production of small molecules known as siderophores, which are secreted into the extracellular space, tightly bind available iron, and are then reinternalized with their bound iron through specific cell surface receptors [34].
M. tuberculosis is reported to produce two series of structurally related siderophores, collectively known as the mycobactins, which are critical for virulence and growth. Mycobactin biosynthesis is initiated by MbtA, an adenylate-forming enzyme that catalyzes a two-step reaction and is responsible for incorporating salicylic acid into the mycobactins [35]. The reaction mechanism catalyzed by MbtA provides several opportunities to develop inhibitors against MbtA [32]. MbtA is an ideal target since it has no mammalian homologues [36]. Inhibition of siderophore biosynthesis has emerged as an attractive strategy to develop new antibiotics against pathogens which require siderophores for virulence [32]. In the present study, a pendenticity-based topochemical descriptor termed the pendentic eccentricity index (in both topostructural and topochemical forms) has been conceptualized and successfully utilized along with existing TIs for the development of models for prediction of antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines.

Methodology

Dataset

A dataset comprising 31 analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines was selected for the present investigation [35]. The basic structures of 5′-O-[(N-Acyl)sulfamoyl]adenosines are shown in Fig. 1 and the various substituents are listed in Tab. 1. Somu et al. reported that, in order to enhance stability, all compounds were converted to triethylammonium salts after purification, while conversion to alkali salts was readily achieved through ion-exchange [36]. In the present study, only the basic structures were taken into consideration while determining index values.

Fig. 1. Basic structures of 5′-O-[(N-Acyl)sulfamoyl]adenosines [35].

Tab. 1. Relationship between topological indices and antitubercular activity.

Columns: Cpd. No. | Basic ring | R | Wc | χA | ∫cP | ξcP | Antitubercular activity predicted using MAA models (Wc, χA, ∫cP, ξcP) | Reported antitubercular activity
1A3896.41713.4617405.109189780.156±+±+
2A3583.74813.1172651.54123721.924
3A3893.84513.4937325.882185842.313±+±
4A3900.29213.4167522.852195710.641±+±+
5A3738.75213.4712810.77326456.617
6A4596.53814.19129026.42752262.75
7A4263.4213.5867768.0452337094.5+±+
8A4289.4213.5458214.6162233735+++++
9A4265.4213.5457977.9062335093+++++
10A4267.4213.73628115.132313057.25+±+++
11A4248.86513.458214.6161821982.125+++++
12A4258.09213.85427540.722353780.75+±+++
13A5478.72214.769554283.4780373248
14A4260.76413.81727706.482341639+±+++
15A4217.4213.75926691.562453185.25±±++
16A4237.4213.80827131.012452283.75±±+
17A4287.4213.78828608.062462222.5+±
18A3899.3413.3917423.716190558.016±±
19A3903.21513.357541.169196488.5±±
20A3924.52813.1792810.773212818.438±±
21A3612.95812.8752743.98124494.674
22A3254.24112.6172319.14620258.762
23A3583.74813.1172651.54123721.924
24AH3C—2119.20910.5275860.135171240.594
25AH3C—O—2400.37110.936196.673168013.875
26A3986.91413.5883339.37433262.125±±
27A6181.41615.608212137710269211648
28A4731.02714.47710746.7305689.281±
29B3855.00213.5037238.409184644.953±±±+
30B3210.49413.9741767.59613858.688
31B3331.26314.574676.292056.221

+…Active analogue; –…Inactive analogue; ±…Transitional analogue where activity could not be specifically assigned; Note: The cation of anionic structures is Et3NH+ or Na+

Enzyme Assay and Biological Activity against Whole-cell M. tuberculosis

Enzyme assays were performed by Qiao et al. [35] at 37 °C with recombinant MbtA expressed in E. coli in a buffer of 75 mM Tris-HCl, pH 7.5, 10 mM MgCl2, 2 mM DTT, 250 μM salicylic acid, 10 mM ATP, and 1 mM PPi. The apparent inhibition constants (Kiapp) were determined by fitting the concentration-response plots either to the Hill equation or to the Morrison equation. All of the Kiapp values reported therein are uncorrected for substrate concentrations and represent an upper limit of the true dissociation constant. Although the Kiapp values reported are not a measure of the true inhibitor potency, the differences are reflective of free energy differences associated with inhibitor binding to MbtA, presuming equivalent modalities of inhibition [35]. All inhibitors were also evaluated against whole-cell M. tuberculosis H37Rv under iron-limiting and iron-rich conditions by Qiao et al. [35]. For the purpose of the present study, the analogues possessing Kiapp values of ≤0.05 μM were considered to be active and analogues possessing Kiapp values of >0.05 μM were considered to be inactive. Further, the analogues possessing MIC99 (minimum inhibitory concentration that inhibited >99% of cell growth) values of ≤12.5 μM in iron-deficient conditions and ≤50 μM in iron-rich conditions were considered to be active, and analogues possessing MIC99 values of >12.5 μM in iron-deficient conditions and >50 μM in iron-rich conditions were considered to be inactive.
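
The activity cut-offs described above can be written as a small labelling routine. The following Python sketch is illustrative only: the function and constant names are hypothetical, and the sample values used in the usage note are not taken from the data set.

```python
# Activity cut-offs as stated in the text (all concentrations in micromolar).
# Names below are hypothetical and chosen only for this sketch.
KIAPP_ACTIVE_MAX_UM = 0.05
MIC99_ACTIVE_MAX_UM = {"iron_deficient": 12.5, "iron_rich": 50.0}


def label_by_kiapp(kiapp_um: float) -> str:
    """Label an analogue from its apparent inhibition constant."""
    return "active" if kiapp_um <= KIAPP_ACTIVE_MAX_UM else "inactive"


def label_by_mic99(mic99_um: float, condition: str) -> str:
    """Label an analogue from its MIC99 under the given iron condition."""
    limit = MIC99_ACTIVE_MAX_UM[condition]
    return "active" if mic99_um <= limit else "inactive"
```

For example, `label_by_mic99(25.0, "iron_rich")` gives "active", while the same value under "iron_deficient" gives "inactive", because the two conditions use different cut-offs.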

Topological indices

Values of twenty-six topological indices [13–15, 18–20, 23–25, 37–50] of diverse nature used in the present study (Tab. 2) were calculated for all the analogues involved in the data set using an in-house computer program.

Tab. 2. Topological indices.

Code | Index | Reference
A1 | Molecular connectivity topochemical index | 18, 19
A2 | Eccentric adjacency topochemical index | 37
A3 | Augmented eccentric connectivity topochemical index | 38
A4 | Superadjacency topochemical index | 39
A5 | Eccentric connectivity topochemical index | 23
A6 | Connective eccentricity topochemical index | 40
A7 | Zagreb topochemical index, M1c | 24
A8 | Zagreb topochemical index, M2c | 24
A9 | Wiener’s topochemical index | 41
A10 | Superaugmented eccentric topochemical connectivity index 1 | 42
A11 | Superpendentic topochemical index |
A12 | Superaugmented eccentric topochemical connectivity index 3 | 42
A13 | Pendentic eccentricity topochemical index |
A14 | Molecular connectivity index | 13, 43
A15 | Eccentric adjacency index | 44
A16 | Augmented eccentric connectivity index | 45
A17 | Superadjacency index | 39
A18 | Eccentric connectivity index | 20
A19 | Connective eccentricity index | 46
A20 | Zagreb group parameter, M1 | 14, 15
A21 | Zagreb group parameter, M2 | 14, 15
A22 | Wiener’s index | 47, 48
A23 | Superaugmented eccentric connectivity index 1 | 25
A24 | Superpendentic index | 49
A25 | Eccentric distance sum index | 50
A26 | Pendentic eccentricity index |

Decision tree

The decision tree (DT) methodology determines the activity of a chemical through a series of rules based on selection of descriptors [51]. In simplified terms, a decision tree finds rules for each class based on the descriptors of the training set. These rules are subsequently utilized for building a decision tree having several branches, each leading to a leaf with a given class assignment [52]. The name decision tree reflects the fact that the classification is done using a set of tests (or decisions) that are arranged in the form of a tree [53]. The prediction for a molecule reaching a given terminal node is obtained by majority vote of the training-set molecules reaching the same terminal node. The tree with the lowest cross-validation error is selected as the optimal tree [54]. In this study, the R program (version 2.1.0) with the RPART library was used to grow the decision tree.

Random Forest

A random forest (RF) is an ensemble of unpruned classification trees created by using bootstrap samples of the training data to construct multiple trees (forests) and random subsets of variables to define the best split at each node, hence the name “random” forests [55, 56]. Random forest operates by generating a user-defined number of decision trees, 100 in this application. Mathematically, an RF may be expressed as the set of trees {T1(X), T2(X), …, TB(X)} [57], where T1(X) is a single decision tree, B is the number of trees and X represents a single molecular descriptor vector. In the present study, the RFs were grown with the R program (version 2.1.0) using the random forest library.

Moving average analysis

In order to develop single topological index based models for classifying the data set into active and inactive analogues, moving average analysis (MAA) was applied. Index values of all the 26 chosen descriptors were analyzed and suitable models were developed after identification of the active ranges by maximization of the moving average with respect to active compounds (<35% = inactive, 35–65% = transitional, >65% = active) [44, 54]. Subsequently, each analogue of the data set was assigned a biological activity using these models, which was then compared with the reported activity [35]. The apparent inhibition constant was reported quantitatively as Kiapp (μM). For the purpose of the present study, the analogues possessing Kiapp values of ≤0.05 μM were considered to be active [labelled as “A” (N=10)] and analogues possessing Kiapp values of >0.05 μM were considered to be inactive [labelled as “B” (N=21)]. The analogues possessing MIC99 values of ≤12.5 μM in iron-deficient conditions and ≤50 μM in iron-rich conditions were considered to be active, and analogues possessing MIC99 values of >12.5 μM in iron-deficient conditions and >50 μM in iron-rich conditions were considered to be inactive.
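
The range-labelling rule quoted above (<35% inactive, 35–65% transitional, >65% active) can be sketched as follows. This Python fragment is only an illustration under stated assumptions: the function names are hypothetical, the window size passed to `moving_average` is not specified in the text, and the boundary-search procedure actually used for the MAA models is not reproduced here.

```python
def moving_average(flags, window):
    """Moving average of 0/1 activity flags for compounds sorted by index value.

    `window` is a hypothetical parameter; the text does not state its size.
    """
    return [sum(flags[i:i + window]) / window
            for i in range(len(flags) - window + 1)]


def label_range(n_active: int, n_total: int) -> str:
    """Label an index range from the percentage of active compounds in it."""
    percent_active = 100.0 * n_active / n_total
    if percent_active > 65.0:
        return "active"
    if percent_active < 35.0:
        return "inactive"
    return "transitional"
```

A range holding 7 active compounds out of 10 would be labelled "active", and one holding 5 out of 10 would be labelled "transitional".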

Calculation of topological indices

Although a total of 26 indices were employed in the present study (Tab. 2), only 11 indices were ultimately shortlisted by either DT or MAA. Classification ability and the non-correlating nature of TIs were the main criteria adopted for shortlisting TIs for MAA.

Wiener’s topochemical index (Wc)

Wiener’s topochemical index [41] is defined as the sum of the chemical distances between all pairs of vertices in the hydrogen-suppressed molecular graph. It is a refined form of the oldest and most widely used distance-based topological index, Wiener’s index [11], and this modified index takes into consideration the presence as well as the relative position of heteroatom(s) in a molecular structure. It can be expressed as:
$$ W_c = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} P_{ij}^{c} $$
where $P_{ij}^{c}$ is the chemical length of the path that contains the least number of edges between vertex i and vertex j in the graph G, and n is the number of vertices in the hydrogen-depleted graph [41].
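
A minimal Python sketch of this calculation is given below. It rests on two assumptions that are not spelled out in the text: the chemical length of a path from i to j is taken as the sum of the relative atomic weights (with respect to carbon) of the vertices on the path, excluding i itself, and the relative weight of nitrogen is taken as 1.167. Under these assumptions the sketch reproduces the Wc values listed in Tab. 3 for the chains C-N-C and N-C-C. The function names are hypothetical.

```python
from collections import deque


def chemical_distances(adj, weight):
    """Chemical distance matrix of a hydrogen-suppressed graph.

    dist[i][j] is the summed weight of the vertices on the fewest-edge path
    from i to j, excluding i (assumed convention). Ties between equally short
    paths, which can occur in cyclic graphs, are resolved arbitrarily.
    """
    n = len(adj)
    dist = [[0.0] * n for _ in range(n)]
    for src in range(n):
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    dist[src][v] = dist[src][u] + weight[v]
                    queue.append(v)
    return dist


def wiener_topochemical(adj, weight):
    """Half the sum of all entries of the chemical distance matrix."""
    return 0.5 * sum(sum(row) for row in chemical_distances(adj, weight))


N = 1.167                      # assumed relative atomic weight of N versus C
chain3 = [[1], [0, 2], [1]]    # adjacency list of a three-vertex chain
wc_cnc = wiener_topochemical(chain3, [1.0, N, 1.0])   # C-N-C
wc_ncc = wiener_topochemical(chain3, [N, 1.0, 1.0])   # N-C-C
```

With all weights set to 1.0 the same routine returns the ordinary Wiener’s index (4 for the three-vertex chain).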

Molecular connectivity topochemical index (χA)

The molecular connectivity topochemical index [18, 19] is defined as the summation of the modified bond values of adjacent vertices for all edges in the hydrogen-suppressed molecular graph. It is a modified form of the widely used adjacency-based topological index, the molecular connectivity index [13, 43], and it takes into consideration the presence as well as the relative position of heteroatom(s) in a molecular structure, as per the following equation:
$$ \chi^{A} = \sum_{\{i,j\} \in E(G)} \left( V_i^{c} V_j^{c} \right)^{-1/2} $$
where the sum runs over all edges {i, j} of a graph G with n vertices, and $V_i^{c}$ and $V_j^{c}$ are the chemical degrees of the adjacent vertices i and j forming the edge {i, j}. The modified degree of a vertex can be obtained from the adjacency matrix by substituting the row element corresponding to a heteroatom with its relative atomic weight with respect to the carbon atom [18, 19].
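
The definition can be illustrated with a short Python sketch. It assumes that the chemical degree of a vertex is the sum of the relative atomic weights of its neighbours (carbon = 1) and that the relative weight of nitrogen is 1.167; with these assumptions it reproduces the χA values of Tab. 3 for the three- and four-vertex chains. The function names are hypothetical.

```python
from math import sqrt


def chemical_degrees(adj, weight):
    """Chemical degree of each vertex: summed weights of its neighbours."""
    return [sum(weight[v] for v in neighbours) for neighbours in adj]


def molecular_connectivity_topochemical(adj, weight):
    """Sum of (Vi * Vj) ** -0.5 over all edges {i, j}."""
    degree = chemical_degrees(adj, weight)
    total = 0.0
    for i, neighbours in enumerate(adj):
        for j in neighbours:
            if i < j:          # count each undirected edge once
                total += 1.0 / sqrt(degree[i] * degree[j])
    return total


N = 1.167                      # assumed relative atomic weight of N versus C
chain3 = [[1], [0, 2], [1]]
chi_cnc = molecular_connectivity_topochemical(chain3, [1.0, N, 1.0])  # C-N-C
chi_ncc = molecular_connectivity_topochemical(chain3, [N, 1.0, 1.0])  # N-C-C
```

Setting every weight to 1.0 recovers the ordinary molecular connectivity index, for which the three-vertex chain gives 2/√2 ≈ 1.414.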

Superpendentic index (∫P)

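A pendenticity-based graph invariant termed the superpendentic index, denoted by ∫P, is calculated as the square root of the sum of products of the non-zero row elements in the pendent matrix [49]. It is expressed as:
$$ \int^{P} = \left( \sum_{i=1}^{m} \prod_{j=1}^{n} P_{ij} \right)^{1/2} $$
Similarly, its topochemical version, termed the superpendentic topochemical index (∫cP), can be calculated from the chemical pendent matrix as:
$$ \int^{cP} = \left( \sum_{i=1}^{m} \prod_{j=1}^{n} P_{ij}^{c} \right)^{1/2} $$
where $P_{ij}$ and $P_{ij}^{c}$ are the non-zero elements of the pendent matrix and the chemical pendent matrix respectively, and m and n are the maximum possible numbers of i and j respectively.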

Pendentic eccentricity index (ξP)

The pendentic eccentricity index (ξP), proposed in the present study, is defined as the summation of the quotients of the product of non-zero row elements in the pendent matrix and the squared eccentricity of the concerned vertex, for all vertices in the hydrogen-suppressed molecular graph. The pendent matrix, Dp, of a graph G is a submatrix of the distance matrix obtained by retaining the columns corresponding to pendent vertices, i.e. terminal (end) vertices with a degree of one [58]. The eccentricity Ei of a vertex i in a graph G is the path length from vertex i to the vertex j that is farthest from i (Ei = max d(i, j); j ∈ G). The index is expressed as:
$$ \xi^{P} = \sum_{i=1}^{n} \left( \frac{\prod_{j} P_{ij}}{E_i^{2}} \right) $$
where $P_{ij}$ is the length of the path that contains the least number of edges between vertex i and pendent vertex j in graph G, the product runs over the non-zero elements of row i of the pendent matrix, and n is the number of vertices in the hydrogen-depleted graph. Similarly, the topochemical version of ξP, the pendentic eccentricity topochemical index (ξcP), can be expressed as:
$$ \xi^{cP} = \sum_{i=1}^{n} \left( \frac{\prod_{j} P_{ij}^{c}}{\left(E_i^{c}\right)^{2}} \right) $$
where $P_{ij}^{c}$ is the chemical length of the path that contains the least number of edges between vertex i and pendent vertex j in graph G, $E_i^{c}$ is the eccentricity of vertex i taken from the chemical distance matrix, and n is the number of vertices in the hydrogen-depleted graph. The pendentic eccentricity topochemical index can be easily calculated from the chemical pendent matrix, a submatrix of the chemical distance matrix. Calculation of the proposed index for three isomers of a five-membered molecule containing one heteroatom and at least one pendent vertex is exemplified in Fig. 2. The sensitivity of the proposed topochemical descriptor towards the presence and relative position of heteroatom(s) for all three-, four- and five-membered isomers containing only one heteroatom and at least one pendent vertex is illustrated in Tab. 3. Discriminating power and degeneracy of the pendentic eccentricity topochemical index were investigated using all possible structures with three, four and five vertices containing one heteroatom and at least one pendent vertex and were compared with those of the other three indices (Tab. 4).
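
The two pendent-matrix based indices can be sketched in Python as shown below. The sketch assumes that the chemical length of a path from i to j is the summed relative atomic weight of the vertices on the path excluding i, that the eccentricity in the topochemical form is taken from the same chemical distance matrix, and that the relative weight of nitrogen is 1.167. These conventions are inferred from the entries of Tab. 3 (they reproduce the ∫cP and ξcP values of the linear three- and four-vertex structures) rather than stated in the text, and the function names are hypothetical.

```python
from collections import deque
from math import prod, sqrt


def chemical_distances(adj, weight):
    """dist[i][j]: summed weights of the vertices on the fewest-edge path
    from i to j, excluding i (assumed convention)."""
    n = len(adj)
    dist = [[0.0] * n for _ in range(n)]
    for src in range(n):
        seen = {src}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    dist[src][v] = dist[src][u] + weight[v]
                    queue.append(v)
    return dist


def pendent_row_products(adj, dist):
    """Product of the non-zero pendent-matrix elements in each row."""
    pendent = [v for v, nbrs in enumerate(adj) if len(nbrs) == 1]
    return [prod(dist[i][j] for j in pendent if dist[i][j] != 0)
            for i in range(len(adj))]


def superpendentic_topochemical(adj, weight):
    dist = chemical_distances(adj, weight)
    return sqrt(sum(pendent_row_products(adj, dist)))


def pendentic_eccentricity_topochemical(adj, weight):
    dist = chemical_distances(adj, weight)
    products = pendent_row_products(adj, dist)
    # max(dist[i]) is the (chemical) eccentricity of vertex i
    return sum(p / max(dist[i]) ** 2 for i, p in enumerate(products))


N = 1.167                      # assumed relative atomic weight of N versus C
chain3 = [[1], [0, 2], [1]]
xi_cnc = pendentic_eccentricity_topochemical(chain3, [1.0, N, 1.0])  # C-N-C
```

Passing a weight of 1.0 for every vertex gives the topostructural forms ∫P and ξP.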

Fig. 2. Calculation of pendentic eccentricity topochemical index values for three isomers of a five-membered molecule containing one heteroatom and at least one pendent vertex.

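Tab. 3. Index values for all possible structures with three, four and five vertices containing one heteroatom and at least one pendent vertex.

S.No. | Structure | Wc | χA | ∫cP | ξcP
1 | C-N-C | 4.334 | 1.309 | 2.31 | 1.923
2 | N-C-C | 4.167 | 1.359 | 2.31 | 1.818
3 | N-C-C-C | 10.25 | 1.867 | 3.266 | 1.694
4 | C-N-C-C | 10.585 | 1.814 | 3.241 | 1.593
5 |  | 9.25 | 1.686 | 3.72 | 3.703
6 |  | 9.752 | 1.603 | 3.884 | 4
7 |  | 8.585 | 1.780 | 2.31 | 1.923
8 |  | 8.25 | 1.821 | 2.236 | 1.734
9 |  | 8.25 | 1.857 | 2.345 | 1.780
10 |  | 18.334 | 2.228 | 5.354 | 3.843
11 |  | 18.334 | 2.226 | 5.339 | 3.89
12 |  | 18.835 | 2.176 | 5.373 | 3.725
13 |  | 19.169 | 2.140 | 5.518 | 3.869
14 | N-C-C-C-C | 20.334 | 2.367 | 4.378 | 2.116
15 | C-N-C-C-C | 20.835 | 2.322 | 4.34 | 2.052
16 | C-C-N-C-C | 21.002 | 2.319 | 4.321 | 2.111
17 |  | 16.334 | 1.960 | 5.931 | 8.395
18 |  | 14.334 | 2.253 | 2.646 | 2.160
19 |  | 14.334 | 2.223 | 2.646 | 2.234
20 |  | 14.334 | 2.282 | 2.769 | 2.241
21 |  | 16.334 | 2.317 | 2.828 | 1.509
22 |  | 16.334 | 2.322 | 2.828 | 1.546
23 |  | 16.334 | 2.357 | 2.944 | 1.530
24 |  | 16.835 | 2.280 | 2.916 | 1.489
25 |  | 13.334 | 2.338 | 2.769 | 2.241
26 |  | 13.334 | 2.288 | 2.646 | 2.234
27 |  | 15.835 | 2.294 | 2.916 | 1.489
28 |  | 16.334 | 2.269 | 3.873 | 2.617
29 |  | 16.334 | 2.234 | 3.742 | 2.667
30 |  | 16.835 | 2.195 | 3.852 | 2.516
31 |  | 17.835 | 2.347 | 3.082 | 1.343
32 |  | 17.334 | 2.388 | 3.109 | 1.385
33 |  | 18.002 | 2.317 | 3.055 | 1.344
34 |  | 17.334 | 2.364 | 3 | 1.380
35 |  | 15.334 | 2.337 | 2.828 | 1.546
36 |  | 15.334 | 2.361 | 2.944 | 1.530
37 |  | 15.334 | 2.300 | 2.828 | 1.583
38 |  | 15.334 | 2.173 | 3.742 | 3.664
39 |  | 15.334 | 2.139 | 3.606 | 3.586
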
Tab. 4. Comparison of discriminating power and degeneracy of Wc, χA, ∫cP and ξcP using all possible structures having three, four and five vertices containing one heteroatom and at least one pendent vertex.

Parameter | Wc | χA | ∫cP | ξcP
For three vertices
Minimum value | 4.167 | 1.309 | 2.31 | 1.818
Maximum value | 4.334 | 1.359 | 2.31 | 1.923
Ratio | 1:1.04 | 1:1.04 | 1:1 | 1:1.05
Degeneracy | 0/2 | 0/2 | 2/2 | 0/2
For four vertices
Minimum value | 8.25 | 1.603 | 2.236 | 1.593
Maximum value | 10.585 | 1.867 | 3.884 | 4
Ratio | 1:1.28 | 1:1.16 | 1:1.73 | 1:2.51
Degeneracy | 1/7 | 0/7 | 0/7 | 0/7
For five vertices
Minimum value | 13.334 | 1.96 | 2.646 | 1.343
Maximum value | 21.002 | 2.388 | 5.931 | 8.395
Ratio | 1:1.57 | 1:1.22 | 1:2.24 | 1:6.25
Degeneracy | 13/30 | 2/30 | 9/30 | 5/30

Degeneracy = Number of compounds having same values / total number of compounds with same number of vertices.
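
The "Ratio" rows of Tab. 4 express discriminating power as the ratio of the highest to the lowest index value among structures with the same number of vertices. The short Python sketch below illustrates this with the five-vertex minimum and maximum values from Tab. 4; the function name is hypothetical.

```python
def discriminating_power(values):
    """Ratio of the highest to the lowest index value in a set of structures."""
    return max(values) / min(values)


# Five-vertex minimum and maximum values taken from Tab. 4.
ratio_xi = discriminating_power([1.343, 8.395])       # pendentic eccentricity
ratio_sp = discriminating_power([2.646, 5.931])       # superpendentic
print(round(ratio_xi, 2), round(ratio_sp, 2))
```

Only the extreme values matter for this ratio, so passing the full column of Tab. 3 for a given vertex count gives the same result as passing its minimum and maximum.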

Zagreb indices (M1 and M2)

This pair of indices [14, 15] was introduced in 1972 and has been given different names in the literature, such as the Zagreb group indices, the Zagreb group parameters and, most often, the Zagreb indices. These indices are denoted by M1 and M2 and are defined as per Eqs. 7 and 8:
$$ M_1 = \sum_{i=1}^{n} d(i)^{2} \qquad (7) $$
$$ M_2 = \sum_{\{i,j\} \in E(G)} d(i)\, d(j) \qquad (8) $$
where d(i) is the degree of vertex i, defined as the number of edges incident on vertex i [58], and d(i)d(j) is the weight of edge {i, j}. Similarly, the Zagreb topochemical indices [24], M1c and M2c, are defined as per Eqs. 9 and 10:
$$ M_1^{c} = \sum_{i=1}^{n} \left( d_i^{c} \right)^{2} \qquad (9) $$
where $d_i^{c}$ is the chemical degree of vertex i and n is the number of vertices.
$$ M_2^{c} = \sum_{\{i,j\} \in E(G)} d_i^{c}\, d_j^{c} \qquad (10) $$
where $d_i^{c} d_j^{c}$ is the chemical weight of the edge {i, j} in the hydrogen-suppressed molecular graph and the sum runs over all n edges [24].
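
As an illustration, both Zagreb indices and their topochemical forms can be computed with one small Python routine. The chemical degree is assumed here to be the sum of the relative atomic weights of the neighbouring vertices (carbon = 1), the same convention described for the molecular connectivity topochemical index; the function name and the nitrogen weight of 1.167 are assumptions of this sketch.

```python
def zagreb_indices(adj, weight=None):
    """Return (M1, M2); with `weight` given, the topochemical (M1c, M2c).

    adj    -- adjacency list of the hydrogen-suppressed graph
    weight -- optional relative atomic weight per vertex (carbon = 1.0)
    """
    if weight is None:
        weight = [1.0] * len(adj)
    degree = [sum(weight[v] for v in neighbours) for neighbours in adj]
    m1 = sum(d * d for d in degree)
    m2 = sum(degree[i] * degree[j]
             for i, neighbours in enumerate(adj)
             for j in neighbours if i < j)      # each edge counted once
    return m1, m2


chain3 = [[1], [0, 2], [1]]                     # three-vertex chain
star4 = [[1, 2, 3], [0], [0], [0]]              # isobutane-like skeleton
print(zagreb_indices(chain3))                   # degrees 1, 2, 1
print(zagreb_indices(star4))                    # degrees 3, 1, 1, 1
print(zagreb_indices(chain3, [1.0, 1.167, 1.0]))  # C-N-C, topochemical
```

For the three-vertex chain the degrees are 1, 2 and 1, so M1 = 6 and M2 = 4; for the star-shaped skeleton M1 = 12 and M2 = 9.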

Augmented eccentric connectivity index (Aξ)

This is an adjacency-cum-distance based index [44] and is defined as the summation of the quotients of the product of adjacent vertex degrees and the eccentricity of the concerned vertex, for all vertices in the hydrogen-suppressed molecular graph. It is expressed as:
$$ {}^{A}\xi = \sum_{i=1}^{n} \frac{M_i}{E_i} $$
where $M_i$ is the product of the degrees of all vertices adjacent to vertex i, $E_i$ is the eccentricity of vertex i, and n is the number of vertices in graph G [44].
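
A minimal Python sketch of the topostructural form follows. Eccentricities are obtained by breadth-first search on the hydrogen-suppressed graph; the function names are hypothetical and the example graphs are simple chains chosen for hand verification, not compounds from the data set.

```python
from collections import deque
from math import prod


def eccentricities(adj):
    """Eccentricity of each vertex: distance to its farthest vertex."""
    result = []
    for src in range(len(adj)):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        result.append(max(dist.values()))
    return result


def augmented_eccentric_connectivity(adj):
    """Sum over vertices of (product of neighbour degrees) / eccentricity."""
    degree = [len(neighbours) for neighbours in adj]
    ecc = eccentricities(adj)
    return sum(prod(degree[v] for v in adj[i]) / ecc[i]
               for i in range(len(adj)))


chain3 = [[1], [0, 2], [1]]
chain4 = [[1], [0, 2], [1, 3], [2]]
print(augmented_eccentric_connectivity(chain3))
print(augmented_eccentric_connectivity(chain4))
```

For the three-vertex chain each vertex contributes 1 (2/2, 1/1 and 2/2), giving 3; for the four-vertex chain the contributions are 2/3, 1, 1 and 2/3.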

Performance evaluation

The goodness of the models was assessed by calculating sensitivity, specificity [59, 60], overall accuracy of prediction [44], and Matthews correlation coefficient (MCC) [61]. The sensitivity and specificity are defined as:
$$ \text{Sensitivity} = \frac{TP}{TP + FN} \times 100 $$
$$ \text{Specificity} = \frac{TN}{TN + FP} \times 100 $$
where true positive (TP) is the number of compounds correctly predicted as active, false negative (FN) is the number of compounds incorrectly predicted as inactive, true negative (TN) is the number of compounds correctly predicted as inactive, and false positive (FP) is the number of compounds incorrectly predicted as active. Thus, the overall accuracy is defined as:
$$ \text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100 $$
MCC quantifies the strength of the linear relation between the molecular descriptors and the classifications, and it may often provide a much more balanced evaluation of the prediction than, for instance, the percentages (accuracy). MCC takes both sensitivity and specificity into account: a value of 1 corresponds to a perfect prediction, whereas 0 corresponds to a completely random prediction. It is calculated as [59]:
$$ MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FN)(TP + FP)(TN + FP)(TN + FN)}} $$
The percent classification was obtained from the ratio of the number of compounds present in active and inactive ranges to the total number of compounds in the data set. The percent degree of prediction for each range as well as the overall accuracy of prediction of the proposed model for antitubercular activity in iron-deficient and iron-rich states were also calculated. The validation of the DT based model and the self-consistency test were performed by the 10-fold cross validation (CV) method, in which the compound dataset was randomly split into 10 folds. The model was developed using 9 randomly selected folds, and prediction was done on the remaining fold. The goodness of the DT based model was also assessed by calculating sensitivity, specificity, overall accuracy of prediction and MCC. The 10-fold CV results are given in Tab. 5.
From a practical application point of view, the topological descriptors used should be least correlated [62]. Absence of direct correlation indicates that the two indices are distinctive and consider different structural components. Statistical significance of the TIs used in building predictive models was also assessed by intercorrelation analysis using index values of the analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines.
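
The four performance measures can be computed from a confusion matrix as sketched below in Python. The function name is hypothetical; the example counts (TP = 7, FN = 3, FP = 4, TN = 17) are those read from the cross-validated decision tree rows of Tab. 5.

```python
from math import sqrt


def classification_metrics(tp, fn, fp, tn):
    """Return sensitivity (%), specificity (%), accuracy (%) and MCC."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + fp + tn)
    denominator = sqrt((tp + fn) * (tp + fp) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return sensitivity, specificity, accuracy, mcc


# Cross-validated decision tree counts as read from Tab. 5.
sens, spec, acc, mcc = classification_metrics(tp=7, fn=3, fp=4, tn=17)
print(f"{sens:.1f} {spec:.1f} {acc:.1f} {mcc:.3f}")
```

These counts give a sensitivity of 70%, a specificity of about 81%, an accuracy of about 77.4% and an MCC of about 0.50, in line with the cross-validated values reported in Tab. 5.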

Tab. 5. Confusion matrix for antitubercular activity and recognition rate of models based on decision tree and random forest.

Model | Description | Ranges | Number predicted active | Number predicted inactive | Sensitivity (%) | Specificity (%) | Overall accuracy of prediction (%) | MCC
Decision tree | Training set | Active | 10 | 0 | 100 | 100 | 100 | 1
 |  | Inactive | 0 | 21 |  |  |  |
Decision tree | Cross-validated set | Active | 7 | 3 | 70 | 80.9 | 77.4 | 0.497
 |  | Inactive | 4 | 17 |  |  |  |
Random forest |  | Active | 5 | 5 | 50 | 76.19 | 67.74 | 0.26
 |  | Inactive | 16 | 5 |  |  |  |

Results and Discussion

Computational approaches applied in drug discovery and toxicity prediction often require molecular descriptors that reflect structural information and physicochemical properties of molecules [63]. The description of the molecular structure through the so-called molecular descriptors is a more difficult but necessary task. Difficulties arise in the generation of such indices due to the non-mathematical nature of the molecular structure [64]. Topological indices are among the most widely used molecular descriptors; they are easily available and can be quickly computed for existing and virtual structures [65, 66]. The successful implementation of QSPR and QSAR certainly decreases the number of compounds synthesized, by making it possible to select the most promising compounds. However, it does not completely eliminate the trial and error factor involved in the development of new drugs [67]. Researchers are striving to develop new TIs that not only have high discriminating power but are also devoid of both degeneracy and correlation with existing TIs. As observed from Fig. 2, the value of the pendentic eccentricity topochemical index changes by >4 times (from 2.052 to 8.395) with a small change in the branching of a five-membered molecule containing one heteroatom and at least one pendent vertex. Thus, the novel descriptor has high discriminating power, defined as the ratio of the highest to the lowest value for all possible structures with the same number of vertices. This is evident from the fact that the ratio of the highest to lowest value for all possible structures containing five vertices is 6.25 for ξcP, in contrast to 1.57, 1.22 and 2.24 for Wc, χA and ∫cP respectively (Tab. 4).
Thus, the pendentic eccentricity topochemical index revealed ∼4 times higher discriminating power with respect to Wiener’s topochemical index, >5 times higher discriminating power with respect to the molecular connectivity topochemical index and ∼2.8 times higher discriminating power with respect to the superpendentic topochemical index for all the possible structures of five vertices containing a heteroatom and at least one pendent vertex (Tab. 4). High discriminating power and extremely low degeneracy are desirable properties of an ideal topological index. The high discriminating power of the proposed descriptor makes it more sensitive towards any change in molecular structure. Degeneracy is a measure of the ability of an index to differentiate between the relative positions of atoms in a molecule. It is a well-known fact that topological indices show degeneracy, that is, two or more non-isomorphic graphs may have identical numerical values for an index [68]. The novel pendentic eccentricity topochemical index had significantly reduced degeneracy as compared to Wiener’s topochemical index and the superpendentic topochemical index. This is evident from the fact that the pendentic eccentricity topochemical index had only 5 identical values out of 30 structures with five vertices containing one heteroatom and at least one pendent vertex, whereas Wiener’s topochemical index and the superpendentic topochemical index had 13 and 9 identical values, respectively, for the same compounds (Tab. 4). It is pertinent to mention here that the pendentic eccentricity topochemical index also had reduced degeneracy as compared to the molecular connectivity topochemical index, as is evident from the fact that the novel index had a single identical index value out of the 31 values of the dataset under study, whereas the molecular connectivity topochemical index had two identical values for the same (see Tab. 1). The lower the degeneracy, the better the index [39].
Significant reduction in degeneracy indicates the enhanced capability of the novel topochemical index to differentiate and demonstrate slight variations in the molecular structure. This means that the likelihood of different structures having the same value is very low. As observed from Tab. 6, the pendentic eccentricity topochemical index is not correlated with most of the commonly used TIs. Pairs of indices with r≥0.97 are considered highly intercorrelated, those with 0.90≤r<0.97 are appreciably correlated, those with 0.50≤r≤0.89 are weakly correlated and finally the pairs of indices with low r values (<0.50) are not intercorrelated [69]. Intercorrelation analysis (Tab. 6) revealed that the pair of indices ∫cP - ξcP is highly intercorrelated, the pairs of indices Wc - M1c and M1c - M2c are appreciably intercorrelated, the pairs of indices Wc - χA, Wc - ∫cP, Wc - ξcP, Wc - M2c, χA - M1c, χA - M2c, ∫cP - M1c, ξcP - M1c, M1c - Aξc and M2c - Aξc are weakly correlated, and the pairs of indices Wc - Aξc, χA - ∫cP, χA - ξcP, χA - Aξc, ∫cP - M2c, ∫cP - Aξc, ξcP - M2c and ξcP - Aξc are not intercorrelated.
Tab. 6.

Intercorrelation matrix.

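| | Wc | χA | cP | ξcP | M1c | M2c | Aξc |
|---|---|---|---|---|---|---|---|
| Wc | 1 | 0.851 | 0.63 | 0.57 | 0.914 | 0.847 | 0.336 |
| χA | | 1 | 0.476 | 0.435 | 0.617 | 0.544 | −0.074 |
| cP | | | 1 | 0.984 | 0.594 | 0.417 | 0.282 |
| ξcP | | | | 1 | 0.522 | 0.363 | 0.271 |
| M1c | | | | | 1 | 0.923 | 0.632 |
| M2c | | | | | | 1 | 0.653 |
| Aξc | | | | | | | 1 |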
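The correlation bands quoted from [69] can be applied mechanically to the upper triangle of Tab. 6. The following is a minimal sketch under two assumptions: the magnitude of r is used, and values falling between 0.89 and 0.90 are treated as weakly correlated. The ASCII labels (`chiA`, `xicP`, `Axic`) and function names are illustrative only.

```python
LABELS = ["Wc", "chiA", "cP", "xicP", "M1c", "M2c", "Axic"]

# Upper triangle of Tab. 6, one row per index, diagonal omitted.
UPPER = [
    [0.851, 0.63, 0.57, 0.914, 0.847, 0.336],  # Wc vs chiA ... Axic
    [0.476, 0.435, 0.617, 0.544, -0.074],      # chiA vs cP ... Axic
    [0.984, 0.594, 0.417, 0.282],              # cP vs xicP ... Axic
    [0.522, 0.363, 0.271],                     # xicP vs M1c ... Axic
    [0.923, 0.632],                            # M1c vs M2c, Axic
    [0.653],                                   # M2c vs Axic
]


def correlation_class(r):
    """Map a correlation coefficient to the bands quoted in the text."""
    a = abs(r)  # assumption: sign is ignored
    if a >= 0.97:
        return "highly intercorrelated"
    if a >= 0.90:
        return "appreciably correlated"
    if a >= 0.50:
        return "weakly correlated"
    return "not intercorrelated"


def group_pairs(labels, upper):
    """Group every index pair of the upper triangle by correlation band."""
    groups = {}
    for i, row in enumerate(upper):
        for offset, r in enumerate(row, start=1):
            pair = (labels[i], labels[i + offset])
            groups.setdefault(correlation_class(r), []).append(pair)
    return groups


groups = group_pairs(LABELS, UPPER)
for band, pairs in groups.items():
    print(f"{band}: {len(pairs)} pair(s) {pairs}")
```

With the values of Tab. 6 this yields one highly intercorrelated pair, two appreciably correlated pairs, ten weakly correlated pairs and eight pairs that are not intercorrelated.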
In the present study, DT, RF and MAA based models were developed for the prediction of antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines. The decision tree was built using 26 TIs of diverse nature. This recursive partitioning scheme generates rules based on the numerical values of the available descriptors for each molecule. In this case, a classification of the data set [35] into active and inactive compounds was desired. The decision tree assigns each compound a probability value (0–1) of being active; compounds with a probability equal to or greater than 0.5 are designated as active, while the others are designated as inactive [70]. The decision tree identified five important topological indices: superpendentic topochemical index (A11), Zagreb group parameter, M (A21), molecular connectivity topochemical index (A1), Zagreb topochemical index (A8) and augmented eccentric connectivity topochemical index (A3). The topology of the resulting decision tree is shown in Fig. 3, where each descriptor is denoted by an alphanumerical abbreviation that refers to Tab. 2. The index at the root node is the most important, and the significance of an index decreases with increasing depth in the tree. The DT classified the analogues of 5′-O-[(N-Acyl)sulfamoyl]adenosines in the training set with an accuracy of 100% and in the cross-validated set with an accuracy of 77.4% with regard to antitubercular activity. The sensitivity and specificity of the DT based model in the training set were both 100%. The sensitivity and specificity of the DT based model in the cross-validated set were of the order of 70% and 80.9% respectively. The MCC values for the DT based model in the training set and cross-validated set are 1 and 0.497 respectively, suggesting satisfactory performance as well as robustness of the model. The values of sensitivity, specificity and MCC are shown in Tab. 5.
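The reported accuracy, sensitivity, specificity and MCC all follow from a 2×2 confusion matrix, as the short sketch below shows. The paper does not list the confusion-matrix counts. The cross-validation counts used here (7 true positives, 3 false negatives, 17 true negatives, 4 false positives) are an assumption, back-calculated as the integer counts consistent with the reported percentages for 31 analogues.

```python
from math import sqrt


def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity and Matthews correlation coefficient."""
    total = tp + fn + tn + fp
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # fraction of actives recovered
        "specificity": tn / (tn + fp),   # fraction of inactives recovered
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Assumed counts (not reported in the paper), chosen to match the stated figures.
training = classification_metrics(tp=10, fn=0, tn=21, fp=0)
cross_validated = classification_metrics(tp=7, fn=3, tn=17, fp=4)

for name, metrics in (("training", training), ("10-fold CV", cross_validated)):
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

With these assumed counts the cross-validated values come out at about 77.4% accuracy, 70% sensitivity, 80.9% specificity and an MCC just under 0.5, in line with the figures quoted above.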
Fig. 3.

Topology of a decision tree distinguishing active compounds {A} from inactive compounds {B}.

The random forests were also grown using the 26 TIs listed in Tab. 2. The RF classified 5′-O-[(N-Acyl)sulfamoyl]adenosines with regard to antitubercular activity with an accuracy of 67.74%, and the out-of-bag (OOB) estimate of error was 32.26%. The sensitivity, specificity and MCC of the RF based model were found to be 50%, 76.19% and 0.26 respectively. The values of sensitivity, specificity and MCC are shown in Tab. 5.
Using a single index at a time, MAA provided four independent models, based on Wiener’s topochemical index, molecular connectivity topochemical index, superpendentic topochemical index and pendentic eccentricity topochemical index, with an accuracy of prediction ranging from 90.4% to 91.6%. The index values of the various analogues along with their substituents are presented in Tab. 1. These four indices were chosen for model development because they capture structural information based on different concepts. Wiener’s topochemical index is based upon inter-atomic distances, and any increase in linearity and molecular size results in an increase in its value. The molecular connectivity topochemical index, on the other hand, is based upon adjacency or connectivity of atoms within a molecule. The superpendentic topochemical index and the novel pendentic eccentricity topochemical index are pendenticity based topological indices and thus take into consideration the pendent vertices in the molecule.
The methodology used in the present study aims at the development of suitable models for providing lead molecules through exploitation of the active ranges in the proposed models based on topological indices. The proposed models are unique and differ widely from conventional QSAR models. Both systems of modeling have their own advantages and limitations. In the present case, the modeling system adopted has the distinct advantage of identifying narrow active range(s), which may be erroneously skipped during routine regression analysis in conventional QSAR modeling.
Since the ultimate goal of modeling is to provide lead structures, these active ranges can play a vital role in lead identification [71]. Retrofit analysis of the data with regard to Wiener’s topochemical index (Tab. 7–9) revealed that 90.4% of the analogues were predicted correctly with respect to antitubercular activity. The extremely low average Kiapp value of 0.019 μM for the correctly predicted compounds indicates high potency of the active range in the proposed model. The activity of all the analogues in both inactive ranges was predicted correctly. The average Kiapp values for the lower inactive and upper inactive ranges were found to be 43.22 μM and 37.05 μM respectively. The existence of a transitional range indicates a gradual change in biological activity. The ratio of the average Kiapp value of the active range to that of the lower inactive range and upper inactive range was found to be 1:2274.73 and 1:1950 respectively for correctly predicted analogues. The overall accuracy of this model for prediction of antitubercular activity in the iron-deficient and iron-rich states was found to be 80.9%. Sensitivity, specificity and MCC for this model were found to be 100%, 86.66% and 0.8 respectively.
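The range-based model for Wiener’s topochemical index can be written as a simple lookup, and the headline numbers can be recomputed from the counts in Tab. 7. This is a hedged sketch: the range boundaries and counts are read from Tab. 7, while the function names are illustrative. Excluding the transitional range from the overall accuracy is an inference from the tabulated numbers (19 correct out of 21), not a rule stated in this section.

```python
# Range boundaries for Wiener's topochemical index, as read from Tab. 7.
WC_LOWER, WC_ACTIVE_MIN, WC_ACTIVE_MAX = 3855.002, 4248.865, 4289.42


def wc_range(wc):
    """Assign an analogue to a range of the MAA model from its index value."""
    if wc < WC_LOWER:
        return "lower inactive"
    if wc < WC_ACTIVE_MIN:
        return "transitional"
    if wc <= WC_ACTIVE_MAX:
        return "active"
    return "upper inactive"


# (total, correctly predicted) analogues per range, from Tab. 7.
WC_COUNTS = {"lower inactive": (9, 9), "active": (8, 6), "upper inactive": (4, 4)}


def overall_accuracy(counts):
    """Percent correct over the non-transitional ranges (assumed convention)."""
    total = sum(t for t, _ in counts.values())
    correct = sum(c for _, c in counts.values())
    return 100.0 * correct / total


def potency_ratio(inactive_avg, active_avg):
    """Fold difference between average Kiapp of an inactive and the active range."""
    return inactive_avg / active_avg


print(wc_range(4260.0))  # hypothetical index value, for illustration only
print(round(overall_accuracy(WC_COUNTS), 2))
print(round(potency_ratio(43.22, 0.019), 2), round(potency_ratio(37.05, 0.019), 2))
```

The accuracy evaluates to 19/21, about 90.5%, which the text reports as 90.4%. The fold differences of roughly 2275 and 1950 correspond to the quoted ratios of 1:2274.73 and 1:1950.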
Tab. 7.

MAA derived topological models for antitubercular activity.

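| Model index | Nature of range in proposed model | Index value | Analogues in range (total) | Analogues in range (correct) | Percent accuracy | Average MIC99 (μM) (correctly predicted analogues) | Overall accuracy of prediction (%) |
|---|---|---|---|---|---|---|---|
| Wc | Lower Inactive | <3855.002 | 9 | 9 | >99.9 | 43.22 | 90.4 |
| | Transitional | 3855.002–<4248.865 | 10 | N.A. | N.A. | 15.61 | |
| | Active | 4248.865–4289.42 | 8 | 6 | 75 | 0.019 | |
| | Upper Inactive | >4289.42 | 4 | 4 | >99.9 | 37.05 | |
| χA | Lower Inactive | <13.416 | 9 | 9 | >99.9 | 30.71 | 91.3 |
| | Active | 13.416–13.545 | 8 | 6 | 75 | 0.016 | |
| | Transitional | >13.545–≤13.854 | 8 | N.A. | N.A. | 20.1 | |
| | Upper Inactive | >13.854 | 6 | 6 | >99.9 | 41.5 | |
| cP | Lower Inactive | <7238.409 | 11 | 11 | >99.9 | 44.47 | 91.6 |
| | Transitional | 7238.409–<7977.906 | 7 | N.A. | N.A. | 1.07 | |
| | Active | 7977.906–28115.13 | 9 | 7 | 77.7 | 0.018 | |
| | Upper Inactive | >28115.13 | 4 | 4 | >99.9 | 52.93 | |
| ξcP | Lower Inactive | <184644.953 | 10 | 10 | >99.9 | 48.9 | 91.3 |
| | Transitional | 184644.953–<1821982.125 | 8 | N.A. | N.A. | 5.51 | |
| | Active | 1821982.125–2453185.25 | 9 | 7 | >77.7 | 0.018 | |
| | Upper Inactive | >2453185.25 | 4 | 4 | >99.9 | 52.92 | |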
Tab. 9.

MAA derived topological models for antitubercular activity in Iron-rich state.

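| Model index | Nature of range in proposed model | Index value | Analogues in range (total) | Analogues in range (correct) | Percent accuracy | Average MIC99 (μM) (correctly predicted analogues) | Overall accuracy of prediction (%) |
|---|---|---|---|---|---|---|---|
| Wc | Lower Inactive | <3855.002 | 9 | 7 | >77.7 | 200 | 80.9 |
| | Transitional | 3855.002–<4248.865 | 10 | N.A. | N.A. | 112.74 | |
| | Active | 4248.865–4289.42 | 8 | 6 | 75 | 33.39 | |
| | Upper Inactive | >4289.42 | 4 | 4 | >99.9 | 200 | |
| χA | Lower Inactive | <13.416 | 9 | 7 | 77.7 | 200 | 78 |
| | Active | 13.416–13.545 | 8 | 6 | 75 | 33.65 | |
| | Transitional | >13.545–≤13.854 | 8 | N.A. | N.A. | 110.35 | |
| | Upper Inactive | >13.854 | 6 | 5 | 83 | 200 | |
| cP | Lower Inactive | <7238.409 | 11 | 8 | 72.72 | 200 | 79.16 |
| | Transitional | 7238.409–<7977.906 | 7 | N.A. | N.A. | 121.7 | |
| | Active | 7977.906–28115.13 | 9 | 7 | 77.7 | 29.07 | |
| | Upper Inactive | >28115.13 | 4 | 4 | >99.9 | 200 | |
| ξcP | Lower Inactive | <184644.953 | 10 | 8 | 80 | 200 | 82.6 |
| | Transitional | 184644.953–<1821982.125 | 8 | N.A. | N.A. | 112.74 | |
| | Active | 1821982.125–2453185.25 | 9 | 7 | 77.7 | 29.07 | |
| | Upper Inactive | >2453185.25 | 4 | 4 | >99.9 | 200 | |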
Retrofit analysis of the data with regard to the molecular connectivity topochemical index (Tab. 7–9) revealed that 91.3% of the analogues were predicted correctly with respect to antitubercular activity. The extremely low average Kiapp value of 0.016 μM for the correctly predicted compounds indicates high potency of the active range in the proposed model. The biological activity of all the analogues in both inactive ranges was predicted correctly. The average Kiapp values of the lower inactive range and upper inactive range were found to be 30.71 μM and 41.5 μM respectively. The existence of a transitional range indicates a gradual change in biological activity. The ratio of the average Kiapp value of the active range to that of the lower inactive range and upper inactive range was found to be 1:1919.37 and 1:2593.75 respectively for correctly predicted analogues. The overall accuracy of this model for prediction of antitubercular activity in the iron-deficient and iron-rich states was found to be 78%. Sensitivity, specificity and MCC for this model were found to be 100%, 88.23% and 0.8 respectively.
Retrofit analysis of the data with regard to the superpendentic topochemical index (Tab. 7–9) revealed that 91.6% of the analogues were predicted correctly with respect to antitubercular activity. The extremely low average Kiapp value of 0.018 μM for the correctly predicted compounds indicates high potency of the active range in the proposed model. The activity of all the analogues in both inactive ranges was predicted correctly. The average Kiapp values of the lower inactive and upper inactive ranges were found to be 44.47 μM and 52.93 μM respectively. The existence of a transitional range indicates a gradual change in biological activity. The ratio of the average Kiapp value of the active range to that of the lower inactive range and upper inactive range was found to be 1:2470.55 and 1:2940.55 respectively for correctly predicted analogues.
The overall accuracy of this model for prediction of antitubercular activity in the iron-deficient and iron-rich states was found to be 79.16% (Tab. 8 and 9). Sensitivity, specificity and MCC for this model were found to be 100%, 88.23% and 0.82 respectively.
Retrofit analysis of the data with regard to the pendentic eccentricity topochemical index (Tab. 7–9) revealed that 91.3% of the analogues were predicted correctly with respect to antitubercular activity. The extremely low average Kiapp value of 0.018 μM for the correctly predicted compounds indicates high potency of the active range in the proposed model. The activity of all the analogues in both inactive ranges was predicted correctly. The average Kiapp values of the lower inactive range and upper inactive range were found to be 48.9 μM and 52.92 μM respectively. The existence of a transitional range indicates a gradual change in biological activity. The ratio of the average Kiapp value of the active range to that of the lower inactive range and upper inactive range was found to be 1:2716.66 and 1:2940 respectively. The overall accuracy of this model for prediction of antitubercular activity in the iron-deficient and iron-rich states was found to be 82.6%. Sensitivity, specificity and MCC for this model were found to be 100%, 87.5% and 0.82 respectively.
The pendentic eccentricity topochemical index (ξcP) depends upon the number of pendent atoms and eccentricity. It also takes into account both the nature and the relative position(s) of pendent atom(s)/heteroatom(s). For a compound to be biologically active, two pendent vertices on the cyclic substituent R (at appropriate places) are essential, as observed from the relative Kiapp (μM) values [35]. Any deviation from such substitution leads to either loss or reduction of biological activity. All of the compounds characterized as active by the proposed model contained two pendent atoms in the cyclic substituent R.
Accordingly, all the compounds (except 7 and 16) predicted as active by the proposed model were also experimentally reported to be active. Compounds 7 and 16 were categorised as active by the proposed model with a cut-off value of ≤0.05 μM. Although these two compounds are experimentally inactive at the cut-off value of ≤0.05 μM, both exhibited significant biological activity, with Kiapp values of 0.061 and 0.137 μM respectively, compared with the average Kiapp values of ∼50 μM for the inactive ranges. Consequently, all the compounds categorised as active by the proposed model were either experimentally reported to be active or exhibited significant biological activity. All the compounds characterized as inactive by the model possessed either fewer than two or more than two pendent atoms in the cyclic substituent R, with the exception of compound 17. The inactivity of compound 17 may be due to the lack of a pendent vertex at the ortho-position, as reported earlier [35]. Since the study demonstrates the influence of both the number and the relative position(s) of pendent atom(s) in the cyclic substituent R on biological activity, pendenticity based topological descriptors are naturally of utmost importance in drug design. The average Kiapp (μM) values of correctly predicted analogues in the various ranges of the proposed MAA based topological models are shown in Figures 4–6.
Fig. 4.

Average Kiapp (μM) values of correctly predicted analogues in various ranges of the proposed MAA topological models.

Fig. 6.

Average Kiapp (μM) values of correctly predicted analogues in various ranges of the proposed MAA topological models in iron-rich state

Conclusion

The pendentic eccentricity topochemical index, a novel molecular descriptor, exhibited high discriminating power, sensitivity towards both the presence and the relative position(s) of pendent atoms/heteroatom(s), and reduced degeneracy. Moreover, the pendentic eccentricity topochemical index was found not to be correlated with important topological descriptors, rendering it a highly beneficial tool for isomer discrimination, similarity/dissimilarity studies, drug design, quantitative structure-activity/structure-property relationships, lead optimization and combinatorial library design. Significant correlation of topological descriptors with antitubercular activity of 5′-O-[(N-Acyl)sulfamoyl]adenosines led to the development of numerous models through decision tree, random forest and MAA. All the proposed models exhibited a high degree of prediction with regard to antitubercular activity. These models offer vast potential for providing lead structures for the development of potent therapeutic agents for the treatment of tuberculosis.
Tab. 8.

MAA derived topological models for antitubercular activity in Iron-deficient state.

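| Model index | Nature of range in proposed model | Index value | Analogues in range (total) | Analogues in range (correct) | Percent accuracy | Average MIC99 (μM) (correctly predicted analogues) | Overall accuracy of prediction (%) |
|---|---|---|---|---|---|---|---|
| Wc | Lower Inactive | <3855.002 | 9 | 7 | >77.7 | 200 | 80.9 |
| | Transitional | 3855.002–<4248.865 | 10 | N.A. | N.A. | 103.19 | |
| | Active | 4248.865–4289.42 | 8 | 6 | 75 | 5.74 | |
| | Upper Inactive | >4289.42 | 4 | 4 | >99.9 | 200 | |
| χA | Lower Inactive | <13.416 | 9 | 7 | >77.7 | 200 | 78.0 |
| | Active | 13.416–13.545 | 8 | 6 | 75 | 7.38 | |
| | Transitional | >13.545–≤13.854 | 8 | N.A. | N.A. | 86.02 | |
| | Upper Inactive | >13.854 | 6 | 5 | 83 | 200 | |
| cP | Lower Inactive | <7238.409 | 11 | 8 | 72.72 | 200 | 79.16 |
| | Transitional | 7238.409–<7977.906 | 7 | N.A. | N.A. | 94.73 | |
| | Active | 7977.906–28115.13 | 9 | 7 | 77.7 | 5.03 | |
| | Upper Inactive | >28115.13 | 4 | 4 | >99.9 | 200 | |
| ξcP | Lower Inactive | <184644.953 | 10 | 8 | 80 | 200 | 82.6 |
| | Transitional | 184644.953–<1821982.125 | 8 | N.A. | N.A. | 103.19 | |
| | Active | 1821982.125–2453185.25 | 9 | 7 | 77.7 | 5.03 | |
| | Upper Inactive | >2453185.25 | 4 | 4 | >99.9 | 200 | |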