Watshara Shoombuatong, Veda Prachayasittikul, Virapong Prachayasittikul, Chanin Nantasenamat.
Abstract
Aromatase inhibition is an effective treatment strategy for breast cancer. Several in silico methods have been developed for the prediction of aromatase inhibitors (AIs) using artificial neural networks (ANN) or support vector machines (SVM). Nevertheless, there is ample room for improvement through a simple and interpretable quantitative structure-activity relationship (QSAR) method. Herein, an efficient linear method (ELM) is proposed for constructing a highly predictive QSAR model containing a spontaneous feature importance estimator. Briefly, ELM is a linear-based model with optimal parameters derived from a genetic algorithm. Results showed that the simple ELM method displayed robust performance, with 10-fold cross-validation MCC values of 0.64 and 0.56 for steroidal and non-steroidal AIs, respectively. Comparative analyses with other machine learning methods (i.e. ANN, SVM and decision tree) were also performed. A thorough analysis of informative molecular descriptors for both steroidal and non-steroidal AIs provided insights into the mechanism of action of the compounds. Our findings suggest that the shape and polarizability of compounds may govern the inhibitory activity of both steroidal and non-steroidal types, whereas the terminal primary C(sp3) functional group and electronegativity may be required for non-steroidal AIs. The R code of the ELM method is available at http://dx.doi.org/10.6084/m9.figshare.1274030.
Keywords: QSAR; aromatase; aromatase inhibitors; data mining; efficient linear method; genetic algorithm
Year: 2015 PMID: 26535037 PMCID: PMC4614109 DOI: 10.17179/excli2015-140
Source DB: PubMed Journal: EXCLI J ISSN: 1611-2156 Impact factor: 4.068
Table 1. Dataset of steroidal and non-steroidal AIs
Figure 1. Workflow diagram of the efficient linear method (ELM)
Table 2. Pseudocode of ELM
Figure 2. Box and histogram plots of the weighted summation f(C) of steroidal AIs obtained using the initial parameter (left) and the optimal parameter (right).
Table 3. The 10 independent experiments of our proposed ELM method for predicting steroidal AIs
Figure 3. Box and histogram plots of the weighted summation f(c) of non-steroidal AIs obtained using the initial parameter (left) and the optimal parameter (right).
Table 4. The 10 independent experiments of our proposed ELM method for predicting non-steroidal AIs
Table 5. Performance comparison of the proposed ELM method with existing and other related methods
Figure 4. Important molecular descriptors for steroidal (left) and non-steroidal AIs (right), which are ranked according to their feature usages
Table 6. Definition of informative molecular descriptors of steroidal and non-steroidal AIs
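The abstract describes ELM only at a high level: a linear model whose parameters are tuned by a genetic algorithm and scored by MCC. The following Python sketch illustrates that general idea and is not the authors' implementation (their R code is at the figshare link above). Everything in it is an assumption made for illustration: the function names, the chromosome layout (descriptor weights followed by a threshold), the crossover and mutation scheme, and the toy data.

```python
import math
import random


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def weighted_sum(weights, x):
    """Weighted summation of the descriptor values of one compound."""
    return sum(w * v for w, v in zip(weights, x))


def predict(weights, threshold, X):
    """Label a compound 1 (active) when its weighted sum reaches the threshold."""
    return [1 if weighted_sum(weights, x) >= threshold else 0 for x in X]


def fitness(chrom, X, y):
    # Assumed chromosome layout: descriptor weights, then the threshold.
    return mcc(y, predict(chrom[:-1], chrom[-1], X))


def genetic_search(X, y, pop_size=30, generations=40, mut_sd=0.3, seed=0):
    """Toy genetic algorithm: keep the better half, refill with mutated crossovers."""
    rng = random.Random(seed)
    n_genes = len(X[0]) + 1
    pop = [[rng.uniform(-1, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: fitness(c, X, y), reverse=True)
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_genes)      # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(n_genes)           # mutate a single gene
            child[i] += rng.gauss(0, mut_sd)
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda c: fitness(c, X, y))


# Hypothetical toy data: two made-up descriptors per compound, binary activity.
X_toy = [[0.9, 0.1], [0.8, 0.3], [0.7, 0.2], [0.2, 0.9], [0.1, 0.7], [0.3, 0.8]]
y_toy = [1, 1, 1, 0, 0, 0]

best = genetic_search(X_toy, y_toy)
print("weights:", best[:-1], "threshold:", best[-1])
print("training MCC:", fitness(best, X_toy, y_toy))
```

The MCC printed here is computed on the training data only. Reproducing the reported figures would require the 10-fold cross-validation protocol and the descriptor sets from the paper.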