Vishan Kumar Gupta, Prashant Singh Rana.
Abstract
In this study, we develop a quantitative structure-activity relationship (QSAR)-based model for predicting toxicity, with the aim of reducing animal testing, time, and cost in the early stages of drug development. An efficient machine learning model is developed to predict the toxicity of drug molecules that bind to the androgen receptor (AR). Toxicity is predicted in terms of activity, activity score, potency, and efficacy, using various physicochemical properties. A multilevel ensemble model is proposed: the first level performs ensemble-based classification of activity, and the second level performs ensemble-based regression of activity score, potency, and efficacy for only those drug molecules found active at the classification level. The AR dataset has 10,273 drug molecules, of which 461 are active and 9,812 are inactive, and each drug molecule has 1,444 features. The dataset is therefore highly imbalanced and has a very large number of features. We first perform feature selection and then resolve the class imbalance problem. k-fold cross-validation is used to measure the consistency of the model. Finally, the proposed multilevel ensemble model is validated and compared with some existing models.
Keywords: Androgen receptor; activity; class imbalance; feature selection; molecular descriptor; classification; regression; multilevel ensemble model; random forest; validation
Year: 2019 PMID: 31744364 DOI: 10.1142/S0219720019500331
Source DB: PubMed Journal: J Bioinform Comput Biol ISSN: 0219-7200 Impact factor: 1.122
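
The two-level flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the ensemble members are combined, so majority voting (level 1) and simple averaging (level 2) are assumptions, and the toy classifiers, toy regressors, and two-feature molecules below are hypothetical stand-ins for trained models and the 1,444 descriptors. Only the class counts are taken from the abstract.

```python
from statistics import mean

# Class counts reported in the abstract.
N_ACTIVE, N_INACTIVE = 461, 9812
# Roughly 21 inactive molecules for every active one.
imbalance_ratio = N_INACTIVE / N_ACTIVE


def vote_active(classifiers, features):
    """Level 1 (assumed rule): strict majority vote over binary classifiers."""
    votes = sum(1 for clf in classifiers if clf(features))
    return votes * 2 > len(classifiers)


def average_prediction(regressors, features):
    """Level 2 (assumed rule): average the outputs of several regressors."""
    return mean(reg(features) for reg in regressors)


def multilevel_predict(classifiers, regressors_by_target, molecules):
    """Classify every molecule; run regression only on those voted active."""
    results = []
    for features in molecules:
        if not vote_active(classifiers, features):
            # Inactive molecules never reach the regression level.
            results.append({"active": False})
            continue
        row = {"active": True}
        for target, regressors in regressors_by_target.items():
            row[target] = average_prediction(regressors, features)
        results.append(row)
    return results


# Hypothetical toy models over a two-feature vector (illustration only).
toy_classifiers = [
    lambda x: x[0] > 0.5,
    lambda x: x[1] > 0.5,
    lambda x: x[0] + x[1] > 1.0,
]
toy_regressors = {
    "activity_score": [lambda x: 100 * x[0], lambda x: 100 * x[1]],
    "potency": [lambda x: x[0] + x[1]],
    "efficacy": [lambda x: 50.0 * x[0], lambda x: 50.0 * x[1]],
}
molecules = [(0.9, 0.8), (0.1, 0.2)]
predictions = multilevel_predict(toy_classifiers, toy_regressors, molecules)
```

In this sketch the second molecule is voted inactive, so its result carries no activity score, potency, or efficacy, mirroring the abstract's statement that regression is applied only to molecules found active at the classification level.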