
Modeling of thermodynamic and physico-chemical properties of coumarins bioactivity against Candida albicans using a Levenberg-Marquardt neural network.

Seyyedeh Soghra Mousavi, Hanieh Bokharaie, Shadi Rahimi, Sima Azadi Soror, Mehrdad Hamidi.

Abstract

In recent years, the vital need for novel fungicidal agents has increased research into natural antifungal resources. The special features of neural network classifiers make them suitable for handling complex problems such as analyzing the properties of candidate compounds in computer-aided drug design. In this study, a Levenberg-Marquardt (LM) neural network (LM being the fastest of the training algorithms) was used to evaluate the relation between some important thermodynamic and physico-chemical properties of coumarin compounds and their biological activities (tested against Candida albicans). A set of previously reported antifungal bioactive coumarins and some well-known physical descriptors were selected and, using the LM training algorithm, the best architecture of the neural model was designed for forecasting new bioactive compounds.


Keywords:  Levenberg/Marquardt algorithm; coumarin; neural network

Year:  2010        PMID: 21918627      PMCID: PMC3170013          DOI: 10.2147/aabc.s11812

Source DB:  PubMed          Journal:  Adv Appl Bioinform Chem        ISSN: 1178-6949


Introduction

Human fungal infections have increased at an alarming rate over the last 20 years, mainly among immunocompromised individuals.1 Although there appears to be a wide array of antifungal drugs, there is at present a quest for new generations of antifungal compounds because of the low efficacy, side effects, or resistance associated with the existing drugs.2,3 Only a limited number of antifungal agents are clinically available, including amphotericin B, ketoconazole, fluconazole, and itraconazole. These drugs have disadvantages, including high toxicity, ineffectiveness against some fungi, and low bioavailability, so they cannot fully meet the needs of patients.4 Coumarins are naturally occurring constituents of many plants and essential oils; they comprise a chromenone ring, often a chromen-2-one or chromen-4-one ring.5 Selected coumarins are known to have antifungal activity. For example, Sardari et al6 described how a limited number of coumarins are active against C albicans, Cryptococcus neoformans, Saccharomyces cerevisiae, and Aspergillus niger. Several different parameters should be evaluated to design a new coumarin antifungal.7,8
Recently, artificial neural networks (ANNs) have been widely used in drug design. They usually consist of 3 or 4 layers: 1 input layer, 1 output layer, and 1 or 2 hidden ones.9 In pattern classification, the classifier learns the class boundaries during a training phase with a training algorithm.10 Gradient-based training algorithms, like back-propagation, are inefficient because the gradient vanishes at the solution.11 Hessian-based algorithms allow the network to learn more subtle features of a complicated mapping.12 Their training process converges quickly as the solution is approached, because the Hessian does not vanish at the solution.
The Levenberg–Marquardt (LM) algorithm is basically a Hessian-based algorithm for nonlinear least squares optimization.13,14 For neural network training, the objective function is an error function of the type13
$$E(x) = \frac{1}{2}\sum_{k=1}^{p}\sum_{l=1}^{n_0}\left(a_{kl} - d_{kl}\right)^2$$
where $a_{kl}$ is the actual output at output neuron l for input k, $d_{kl}$ is the desired output at output neuron l for input k, p is the total number of training patterns, $n_0$ is the total number of neurons in the output layer of the network, and x represents the weights and biases of the network.14 Because they can find the complex relationship between predictor variables (inputs) and predicted variables (outputs), LM-trained ANNs have received growing attention in drug discovery.
In this study, several LM neural networks were first built for a set of thermodynamic and physico-chemical properties of antifungal coumarins. The best architecture, in terms of the lowest error and the fewest calculation cycles, was then selected. Finally, this neural model was used to calculate the correlation coefficient between the thermodynamic and physico-chemical properties and the bioactivity of antifungal coumarins (tested against Candida albicans), and the role of these properties in the bioactivity of antifungal coumarins is discussed.
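As a rough illustration of the sum-of-squares objective and the LM update described above, the following Python sketch minimizes such an error with a damped Gauss-Newton step. It is a minimal sketch under stated assumptions, not the implementation used in this study: the function names (`sse`, `numerical_jacobian`, `lm_fit`), the finite-difference Jacobian, the damping schedule (factor of 10), and the toy exponential model are all hypothetical choices made for illustration.

```python
import numpy as np

def sse(residuals):
    """Sum-of-squares error: 0.5 * sum of squared (actual - desired)."""
    return 0.5 * float(np.sum(residuals ** 2))

def numerical_jacobian(residual_fn, x, eps=1e-6):
    """Forward-difference Jacobian of the residual vector with respect to x."""
    r0 = residual_fn(x)
    J = np.empty((r0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps
        J[:, j] = (residual_fn(x + step) - r0) / eps
    return J

def lm_fit(residual_fn, x0, mu=1e-2, max_iter=100, tol=1e-12):
    """Levenberg-Marquardt loop: x <- x - (J^T J + mu*I)^-1 J^T r."""
    x = np.asarray(x0, dtype=float)
    err = sse(residual_fn(x))
    for _ in range(max_iter):
        r = residual_fn(x)
        J = numerical_jacobian(residual_fn, x)
        # J^T J approximates the Hessian; mu*I damps it toward steepest descent.
        H = J.T @ J + mu * np.eye(x.size)
        x_new = x - np.linalg.solve(H, J.T @ r)
        err_new = sse(residual_fn(x_new))
        if err_new < err:
            x, err = x_new, err_new
            mu /= 10.0  # accepted step: behave more like Gauss-Newton
        else:
            mu *= 10.0  # rejected step: behave more like steepest descent
        if err < tol:
            break
    return x, err

# Toy problem (hypothetical): recover a and b in y = a * exp(b * t).
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.5 * t)
x_fit, final_err = lm_fit(lambda x: x[0] * np.exp(x[1] * t) - y, [1.0, 0.0])
```

For a neural network, `x` would hold all weights and biases and the residual vector would stack the output errors over all training patterns; the loop itself stays the same.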

Materials and methods

In the first step, some thermodynamic and physico-chemical descriptors for all congeners were computed or taken from the literature (Table 1).15–18 Geometry optimization was carried out using the semiempirical PM3 method,19 as implemented in the HyperChem™ program package (Hypercube, Inc., Gainesville, FL, USA).20 All of the descriptors were generated with applications such as ACDLAB (release 11.02, 21 May 2008), HyperChem (8.0.2), and MOPAC 93, together with the help of the references (Table 1). For example, the basic thermodynamic properties, such as standard enthalpy of formation, standard free enthalpy of formation, molar entropies, and heat capacities, and the energies of the highest occupied molecular orbital (HOMO) and lowest unoccupied molecular orbital (LUMO) were extracted from MOPAC 93 data files.
Table 1

Some thermodynamic and physico-chemical descriptors for all congeners

| Descriptor | Parameter | Reference |
| --- | --- | --- |
| 1 | Vicinal carbon atoms substitution pattern | 8 |
| 2 | Sasvol (solvent-accessible volume) | 8 |
| 3 | Vdwvol (van der Waals volume) | 8 |
| 4 | Symmetry of molecule | 8 |
| 5 | Maxq+ (the largest positive charge over the atoms in molecules) | 8 |
| 6 | Vapor pressure | 8 |
| 7 | Energy of HOMO (highest occupied molecular orbital) | 8 |
| 8 | Energy of LUMO (lowest unoccupied molecular orbital) | 8 |
| 9 | Molecular mass (Da) | 8 |
| 10 | Dipole moment of the molecule | 15 |
| 11 | Density (g/cm³) | 15 |
| 12 | Retention time | 15 |
| 13 | Heat capacity | 16 |
| 14 | Standard enthalpy of formation | 16 |
| 15 | Specific polarizability of molecule | 17 |
| 16 | Molar refractivity (cm³) | 17 |
| 17 | Molar volume (cm³) | 18 |
| 18 | Log P | 18 |
| 19 | Surface tension (dyne/cm) | 18 |

The dataset is composed of coumarins and coumarin derivatives that have previously shown antifungal activity. The bioactivities of these compounds were determined by the well dilution method and were taken from the literature (Table 2).6,21–26 One major problem is that antifungal activity is reported in 2 different forms: 50% inhibitory concentration (IC50) and minimal inhibitory concentration (MIC). Multiplying the IC50 values by 2 gives a close equivalent of the MIC level,17 which makes the dataset uniform. We used antifungal screening results for isolates of C albicans to simulate bioactivity.
The Error Back Propagation (EBP) algorithm was a significant improvement in neural network research, but it has a weak convergence rate.27,28 Many efforts have been made to speed up the EBP algorithm,29,30 but all of these methods have led to barely acceptable results. The LM algorithm arose from the development of EBP-dependent methods. It gives a good trade-off between the speed of the Newton algorithm and the stability of the steepest descent method,31 the 2 methods on which the LM algorithm is based. In this paper, a feed-forward neural network trained with the LM algorithm was applied to model the bioactivity of antifungal coumarins. A standard feed-forward network with 1–3 hidden layers was chosen. To limit over-fitting, the number of neurons was kept to a minimum, and the optimum architecture, with a target error of less than 0.02%, was found by varying the total number of nodes and hidden layers. The neural model was built with NeuroSolutions (version 5.07, 2008; NeuroDimension, Inc., Gainesville, FL, USA).
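The unit harmonization described above (IC50 multiplied by 2 as an approximate MIC) can be sketched in a few lines of Python. The function name `to_mic` and the example records are hypothetical and are not values from Table 2; only the factor of 2 comes from the text.

```python
def to_mic(value, kind):
    """Return an approximate MIC (ug/mL) for a reported activity value."""
    if kind == "MIC":
        return value
    if kind == "IC50":
        return 2.0 * value  # factor of 2 as stated in the text
    raise ValueError(f"unknown activity type: {kind!r}")

# Hypothetical records (value in ug/mL, reported form); not data from Table 2.
records = [(125.0, "MIC"), (31.25, "IC50")]
uniform = [to_mic(value, kind) for value, kind in records]
```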
To validate the model and to analyze the influence of inherent randomness on prediction stability, the complete validation process was repeated 10 times with different random seeds in all cases (Y-scrambling test). Accuracy was selected to evaluate the predictive performance of a single validation process, whereas the coefficient of correlation of the accuracies obtained across the 10 repetitions was taken as a measure of learning stability. Cross-validation was also applied by the leave-n-out method.
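The idea behind the Y-scrambling test can be sketched as follows: the model is refitted on randomly permuted activities, and a real structure-activity relationship should score much higher than the scrambled ones. This is a minimal sketch under loud assumptions: an ordinary least-squares fit stands in for the neural model, the data are synthetic (54 samples, 3 made-up descriptors), and the names `fit_r2` and `y_scrambling` are hypothetical.

```python
import numpy as np

def fit_r2(X, y):
    """R^2 of an ordinary least-squares fit (stand-in for the neural model)."""
    A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    ss_res = np.sum((y - pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def y_scrambling(X, y, n_repeats=10):
    """Refit on permuted targets, one random seed per repetition."""
    scores = []
    for seed in range(n_repeats):
        rng = np.random.default_rng(seed)
        scores.append(fit_r2(X, rng.permutation(y)))
    return scores

# Synthetic data (assumption): 54 samples, 3 descriptors, known linear relation.
rng = np.random.default_rng(42)
X = rng.normal(size=(54, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=54)

r2_true = fit_r2(X, y)
r2_scrambled = y_scrambling(X, y)
```

If the scrambled scores approached `r2_true`, the original fit would be suspected of chance correlation.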
Table 2

Structure and bioactivity of studied coumarins. The observed MICs and structures of coumarin compounds are derived from mentioned references

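| Number | MIC observed, μg/mL | Number | MIC observed, μg/mL |
| --- | --- | --- | --- |
| 1 | 1,000 | 2 | 2,000 |
| 3 | 1,000 | 4 | 1,000 |
| 5 | 250 | 6 | 250 |
| 7 | 1,000 | 8 | 62.5 |
| 9 | 250 | 10 | 250 |
| 11 | 250 | 12 | 250 |
| 13 | 1,000 | 14 | 250 |
| 15 | 500 | 16 | 500 |
| 17 | 250 | 18 | 1,000 |
| 19 | 1,000 | 20 | 1,000 |
| 21 | 500 | 22 | 500 |
| 23 | 80 | 24 | 1,000 |
| 25 | 70 | 26 | 64 |
| 27 | 25 | 28 | 93.75 |
| 29 | 512 | 30 | 64 |
| 31 | 42.65 | 32 | 78.75 |
| 33 | 16.65 | 34 | 31.4 |
| 35 | 5 | 36 | 25 |
| 37 | 500 | 38 | 15.6 |
| 39 | 15.6 | 40 | 125 |
| 41 | 31.3 | 42 | 7.8 |
| 43 | 15.6 | 44 | 7.8 |
| 45 | 250 | 46 | 250 |
| 47 | 250 | 48 | 2,150 |
| 49 | 1,979 | 50 | 3,321 |
| 51 | 3,478 | 52 | 2,035 |
| 53 | 3,752 | 54 | 2,705 |

Note: the Compound column (structure drawings) is not reproduced here.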

Abbreviation: MIC, minimal inhibitory concentration

Results

The computed basic physico-chemical and thermodynamic descriptors for the coumarins are presented in Table 1, and the various neural network architectures are shown in Table 3. In this study, an LM-trained ANN was used to build a neural model for predicting lead antifungal coumarins. The best architecture, in terms of the number of calculation cycles and the correlating behavior, was 19-8-1.
ANNs are used to model systems that receive inputs and produce outputs. The relationships between the inputs, the outputs, and the representation parameters are critical issues in the design of a good model for bioactive compounds, and sensitivity analysis concerns methods to analyze these relationships. Perturbations of neural networks are caused by machine imprecision; they can be simulated by embedding disturbances in the original inputs or connection weights, which allows us to study the characteristics of a function under small perturbations of its parameters. Sensitivity analysis is a measure of how the outputs change when the inputs are changed, and its result could help to predict the bioactivity of new antifungal coumarins. The results show that the most sensitive inputs are Log P and molar refractivity. The input importance shows the relative importance of each input column; it is the sum of the absolute weights of the connections from the input node to all the nodes in the first hidden layer. The energy of LUMO, the energy of HOMO, and surface tension were the most important inputs, and density was the least important.
The correlation coefficient between the observed and the predicted MIC values was 0.9266. Predicted activity varied from 22.55878 to 2010.87537 (Figure 1). The Y-scrambling result showed that the classification accuracy for randomized datasets was significantly lower than for the original datasets (data not shown). The highest errors were observed for compounds 11, 34, and 43.
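The two measures used in this section, input importance (sum of absolute first-layer weights per input) and sensitivity (change in output under a small input perturbation), can be sketched in Python as below. This is an illustrative sketch only: the function names, the perturbation size `delta`, and the small example weights are assumptions, and NeuroSolutions may compute its sensitivity measure differently.

```python
import numpy as np

def input_importance(W1):
    """Sum of absolute first-layer weights for each input.

    W1 has shape (n_hidden, n_inputs); one score is returned per input.
    """
    return np.abs(W1).sum(axis=0)

def sensitivity(predict, X, delta=0.01):
    """Mean absolute change in the output when each input is shifted by delta."""
    base = predict(X)
    scores = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        X_shifted = X.copy()
        X_shifted[:, j] += delta  # perturb one input column at a time
        scores[j] = np.mean(np.abs(predict(X_shifted) - base))
    return scores

# Hypothetical first-layer weights: 2 hidden nodes, 2 inputs.
W1_demo = np.array([[1.0, -2.0],
                    [3.0, 0.5]])
importance_demo = input_importance(W1_demo)

# Hypothetical linear predictor, used only to make the sensitivities easy to read.
w_demo = np.array([2.0, 0.0, -1.0])
X_demo = np.zeros((5, 3))
sensitivity_demo = sensitivity(lambda X: X @ w_demo, X_demo)
```

The two measures need not rank the inputs identically, which is consistent with the text reporting different top descriptors for sensitivity and for importance.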
Cross-validation was done by the leave-some-out (some = 4) method. Validation showed that the average of the absolute errors was 0.029.
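A leave-4-out scheme of the kind mentioned above can be sketched as follows. The sketch assumes that the samples are shuffled once and held out in consecutive groups of 4, and that the reported figure is a mean absolute error over all held-out samples; neither detail is stated in the text, and the names `leave_n_out_splits` and `cross_validated_mae` are hypothetical. The mean predictor is only a stand-in for the neural model.

```python
import numpy as np

def leave_n_out_splits(n_samples, n_out=4, seed=0):
    """Yield (train_idx, test_idx) pairs, holding out n_out samples at a time."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, n_out):
        test_idx = order[start:start + n_out]
        train_idx = np.concatenate([order[:start], order[start + n_out:]])
        yield train_idx, test_idx

def cross_validated_mae(X, y, fit, predict, n_out=4, seed=0):
    """Average absolute error over all held-out samples."""
    errors = []
    for train_idx, test_idx in leave_n_out_splits(len(y), n_out, seed):
        model = fit(X[train_idx], y[train_idx])
        errors.extend(np.abs(predict(model, X[test_idx]) - y[test_idx]))
    return float(np.mean(errors))

# Stand-in model (assumption): predict the mean of the training targets.
fit_mean = lambda X, y: y.mean()
predict_mean = lambda model, X: np.full(len(X), model)

X_demo = np.zeros((54, 19))   # 54 compounds, 19 descriptors (placeholder values)
y_demo = np.ones(54)          # constant target, so the error must be zero
mae_demo = cross_validated_mae(X_demo, y_demo, fit_mean, predict_mean)
```

With 54 compounds and groups of 4, this produces 14 folds, the last of which holds out only 2 samples.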
Table 3

Architectures of some of the networks applied, with a focus on their errors

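| HL | Design | Y-scrambling R² | Validation set error | Calculation cycles | Training set error |
| --- | --- | --- | --- | --- | --- |
| 1 | 19-3-1 | 0.672 | 0.007787 | 668 | 0.009987 |
| 1 | 19-6-1 | 0.761 | 0.009988 | 564 | 0.009941 |
| 1 | 19-8-1 | 0.763 | 0.009277 | 456 | 0.009781 |
| 1 | 19-11-1 | 0.553 | 0.009900 | 787 | 0.009887 |
| 1 | 19-15-1 | 0.221 | 0.009856 | 623 | 0.009924 |
| 2 | 19-6-10-1 | 0.447 | 0.009677 | 980 | 0.00996 |
| 2 | 19-10-11-1 | 0.664 | 0.09967 | 1345 | 0.09999 |
| 2 | 19-13-12-1 | 0.774 | 0.9779 | 2321 | 0.09999 |
| 3 | 19-8-3-5-1 | 0.441 | 0.09959 | 1255 | 0.08812 |
| 3 | 19-7-8-6-1 | 0.364 | 0.08999 | 876 | 0.06812 |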

Abbreviation: HL, hidden layers.

Figure 1

Plot of predicted vs observed activity.

Discussion

The development of a new drug is still a challenging, time-consuming, and cost-intensive process. Computational methods can be used to assist and speed up the drug discovery process. In contrast to classical statistical methods such as regression analysis or partial least squares analysis, ANNs enable the investigation of complex nonlinear relationships. Therefore, neural networks are well suited for use in drug design and quantitative structure-activity relationship (QSAR) studies. They consist of many basic units, called artificial neurons (or simply neurons), which perform identical tasks. A neuron collects a series of input signals and transforms them into an output signal via a transfer function. In the course of training, such a network of neurons “learns” by changing the weights of its neurons.
Two different learning methods can be distinguished:32 supervised and unsupervised learning. When learning is unsupervised, the neural network is provided only with the input patterns, and after some iterations it should settle into a stable state. The goal of supervised learning methods is to find a model that correctly associates the inputs (representation of the objects) with the targets (representation of the responses). The targets serve not only as a criterion for how well the system has been trained but also influence the correction of each weight. The best-known example of a neural network training algorithm is back-propagation. It is the easiest algorithm to understand, it is a good choice if the dataset is very large or contains a great deal of redundant data, and it still has advantages in some circumstances; however, modern second-order algorithms such as conjugate gradient descent and LM are substantially faster (eg, an order of magnitude faster).
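The description above of a neuron (a weighted sum of inputs passed through a transfer function) and of a layered network can be sketched in Python as follows. The tanh transfer function, the linear output unit, and the random weights are assumptions made for illustration; the text does not state which transfer functions were used. The 19-8-1 shape mirrors the architecture discussed in this paper.

```python
import numpy as np

def neuron(inputs, weights, bias, transfer=np.tanh):
    """One artificial neuron: weighted sum of inputs, then a transfer function."""
    return transfer(np.dot(weights, inputs) + bias)

def forward(x, W1, b1, W2, b2):
    """Forward pass of a network with one hidden layer.

    Hidden units use tanh (assumption); the output unit is linear (assumption).
    """
    hidden = np.tanh(W1 @ x + b1)
    return W2 @ hidden + b2

# Random, untrained weights for a 19-8-1 network (illustrative values only).
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(8, 19)), np.zeros(8)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)

x_demo = rng.normal(size=19)             # one compound described by 19 descriptors
y_demo = forward(x_demo, W1, b1, W2, b2)  # array with a single predicted value
```

Training, whether by back-propagation or LM, consists of adjusting `W1`, `b1`, `W2`, and `b2` so that the outputs approach the targets.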
LM is typically the fastest of the training algorithms and performs calculations using the entire dataset, which might improve the performance of the network.33,34 In this study, the LM training algorithm was applied to discover the relationship between the antifungal activity data for a dataset of antifungal coumarins and their thermodynamic and physico-chemical descriptors. The descriptors are derived from molecular structure. Among the architectures constructed, the best ANN architecture was 19-8-1. Table 3 shows the statistical criteria of the different architectures. The quite low errors for the training and validation sets indicate that training and validation were successful.
Thermodynamic and physico-chemical descriptors play a crucial role in the interaction of candidate compounds with their specific receptors (eg, biological membrane). The results show that the LUMO and HOMO energies are the most important of all the descriptors. The LUMO is the lowest energy level in the molecule that contains no electrons. When a molecule acts as a Lewis acid (an electron-pair acceptor) in bond formation, incoming electron pairs are received in its LUMO. Molecules with low-lying LUMOs are more able to accept electrons than those with high LUMOs; thus, the LUMO descriptor should measure the electrophilicity of a molecule. The HOMO is the highest energy level in the molecule that contains electrons. It is crucially important in governing molecular reactivity and properties. When a molecule acts as a Lewis base (an electron-pair donor) in bond formation, the electrons are supplied by the molecule’s HOMO. Molecules with high HOMOs are more able to donate their electrons and are hence relatively reactive compared with molecules with low-lying HOMOs; hence, the HOMO descriptor should measure the nucleophilicity of a molecule. Both descriptors strongly define how a compound could interact with a receptor. The third most important descriptor is surface tension.
This is in accordance with previous studies.17 The most sensitive inputs are Log P and molar refractivity. Log P (the octanol/water partition coefficient) and molar refractivity are molecular descriptors that can be used to relate chemical structure to observed chemical behavior. Log P is related to the hydrophobic character of the molecule. The molar refractivity of a substituent is a combined measure of its size and polarizability.
Because of their flexibility, supervised neural networks have found wide application in drug design. For example, a network with the LM learning algorithm can be employed for the following applications: analysis of multidimensional data, classification and prediction of biological activity and ADME-Tox (absorption, distribution, metabolism, excretion, and toxicity) properties, lead discovery, comparison of compound libraries, and analysis of similarity. Unfortunately, LM has some important limitations: it can be used only on single-output networks, it can be used only with the sum squared error function, and it has memory requirements proportional to W² (where W is the number of weights in the network), which makes it impractical for reasonably big networks. LM training also seems very prone to becoming stuck in local minima in the early phases.35 Modifying the LM algorithm to reduce these limitations may increase its application in drug design.
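To make the W² memory requirement concrete, the short Python sketch below counts the adjustable parameters of a fully connected feed-forward network and squares that count. It assumes that each non-input unit has one bias and that biases are counted among the W weights; the function name `count_weights` is hypothetical.

```python
def count_weights(layers):
    """Weights plus biases of a fully connected feed-forward network.

    layers lists the number of units per layer, e.g. [19, 8, 1].
    """
    return sum((n_in + 1) * n_out for n_in, n_out in zip(layers, layers[1:]))

# Two architectures taken from Table 3.
for design in ([19, 8, 1], [19, 13, 12, 1]):
    w = count_weights(design)
    print(design, "W =", w, "W^2 =", w * w)
```

Under these assumptions the 19-8-1 network has 169 parameters (28,561 entries in a W × W matrix), whereas 19-13-12-1 has 441 (194,481 entries), so the storage grows much faster than the network itself.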
  12 in total

1.  Antifungal activity of Paraguayan plants used in traditional medicine.

Authors:  A Portillo; R Vila; B Freixa; T Adzet; S Cañigueral
Journal:  J Ethnopharmacol       Date:  2001-06       Impact factor: 4.360

2.  The value of improving the productivity of the drug development process: faster times and better decisions.

Authors:  Joseph A DiMasi
Journal:  Pharmacoeconomics       Date:  2002       Impact factor: 4.981

3.  Enhanced training algorithms, and integrated training/architecture selection for multilayer perceptron networks.

Authors:  M G Bello
Journal:  IEEE Trans Neural Netw       Date:  1992

4.  Training feedforward networks with the Marquardt algorithm.

Authors:  M T Hagan; M B Menhaj
Journal:  IEEE Trans Neural Netw       Date:  1994

Review 5.  Epidemiology of fungal infections: the promise of molecular typing.

Authors:  M A Pfaller
Journal:  Clin Infect Dis       Date:  1995-06       Impact factor: 9.079

6.  Antifungal activity of some coumarins obtained from species of Pterocaulon (Asteraceae).

Authors:  Ana Cristina Stein; Sandra Alvarez; César Avancini; Susana Zacchino; Gilsane von Poser
Journal:  J Ethnopharmacol       Date:  2006-02-28       Impact factor: 4.360

7.  Guidelines for the investigation of invasive fungal infections in haematological malignancy and solid organ transplantation. British Society for Medical Mycology.

Authors:  D W Denning; E G Evans; C C Kibbler; M D Richardson; M M Roberts; T R Rogers; D W Warnock; R E Warren
Journal:  Eur J Clin Microbiol Infect Dis       Date:  1997-06       Impact factor: 3.267

Review 8.  Importance of Candida species other than Candida albicans as opportunistic pathogens.

Authors:  D C Coleman; M G Rinaldi; K A Haynes; J H Rex; R C Summerbell; E J Anaissie; A Li; D J Sullivan
Journal:  Med Mycol       Date:  1998       Impact factor: 4.076

9.  Interpretable correlation descriptors for quantitative structure-activity relationships.

Authors:  Benson M Spowage; Craig L Bruce; Jonathan D Hirst
Journal:  J Cheminform       Date:  2009-12-24       Impact factor: 5.514

10.  Forward Modeling of the Coumarin Antifungals; SPR/SAR Based Perspective.

Authors:  Saeed Soltani; Shima Dianat; Soroush Sardari
Journal:  Avicenna J Med Biotechnol       Date:  2009-07