Systematic Modeling of log D7.4 Based on Ensemble Machine Learning, Group Contribution, and Matched Molecular Pair Analysis.

Li Fu, Lu Liu, Zhi-Jiang Yang, Pan Li, Jun-Jie Ding, Yong-Huan Yun, Ai-Ping Lu, Ting-Jun Hou, Dong-Sheng Cao

Abstract

Lipophilicity, as evaluated by the n-octanol/buffer solution distribution coefficient at pH = 7.4 (log D7.4), is a major determinant of various absorption, distribution, metabolism, elimination, and toxicology (ADMET) parameters of drug candidates. In this study, we developed several quantitative structure-property relationship (QSPR) models to predict log D7.4 based on a large and structurally diverse data set. Eight popular machine learning algorithms were employed to build the prediction models with 43 molecular descriptors selected by a wrapper feature selection method. The results demonstrated that XGBoost yielded better prediction performance than any other single model (R_T^2 = 0.906 and RMSE_T = 0.395). Moreover, the consensus model built from the top three models further improved the prediction performance (R_T^2 = 0.922 and RMSE_T = 0.359). The robustness, reliability, and generalization ability of the models were rigorously evaluated by the Y-randomization test and applicability domain analysis. In addition, a group contribution model based on 110 atom types and local models for different ionization states were established and compared to the global models. The results demonstrated that the descriptor-based consensus model is superior to the group contribution method, and that the local models have no advantage over the global models. Finally, matched molecular pair (MMP) analysis and descriptor importance analysis were performed to extract transformation rules and to help explain the structural factors that influence log D7.4. In conclusion, we believe that the consensus model developed in this study can be used as a reliable and promising tool to evaluate log D7.4 in drug discovery.
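
The consensus step and the two reported metrics can be illustrated with a minimal sketch. The abstract does not state how the consensus was formed, so the unweighted mean used here is an assumption, and the observed values and the three sets of model predictions are invented toy numbers rather than data from the study. The function names (`consensus`, `r_squared`, `rmse`) are hypothetical.

```python
import math
from statistics import fmean


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(fmean(squared_errors))


def consensus(predictions):
    """Unweighted mean of per-molecule predictions across models.

    Assumption: simple averaging; the abstract does not specify the scheme.
    """
    return [fmean(per_molecule) for per_molecule in zip(*predictions)]


# Toy test-set values (hypothetical, not taken from the paper).
y_test = [1.0, 2.0, 3.0, 4.0]
model_a = [1.2, 1.9, 3.3, 3.8]
model_b = [0.9, 2.2, 2.8, 4.1]
model_c = [1.1, 2.1, 2.9, 4.3]

y_consensus = consensus([model_a, model_b, model_c])

for name, pred in [("A", model_a), ("B", model_b), ("C", model_c),
                   ("consensus", y_consensus)]:
    print(f"{name}: R2 = {r_squared(y_test, pred):.3f}, "
          f"RMSE = {rmse(y_test, pred):.3f}")
```

On this toy data the individual models err in different directions, so averaging cancels part of the error. That is not guaranteed in general: a consensus helps only when the member models are reasonably accurate and their errors are not strongly correlated.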

Year:  2020        PMID: 31869226     DOI: 10.1021/acs.jcim.9b00718

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  5 in total

1.  Comparison of logP and logD correction models trained with public and proprietary data sets.

Authors:  Ignacio Aliagas; Alberto Gobbi; Man-Ling Lee; Benjamin D Sellers
Journal:  J Comput Aided Mol Des       Date:  2022-04-01       Impact factor: 3.686

2.  Improvement of Prediction Performance With Conjoint Molecular Fingerprint in Deep Learning.

Authors:  Liangxu Xie; Lei Xu; Ren Kong; Shan Chang; Xiaojun Xu
Journal:  Front Pharmacol       Date:  2020-12-18       Impact factor: 5.810

3.  QSPR model for Caco-2 cell permeability prediction using a combination of HQPSO and dual-RBF neural network.

Authors:  Yukun Wang; Xuebo Chen
Journal:  RSC Adv       Date:  2020-11-26       Impact factor: 4.036

4.  A joint optimization QSAR model of fathead minnow acute toxicity based on a radial basis function neural network and its consensus modeling.

Authors:  Yukun Wang; Xuebo Chen
Journal:  RSC Adv       Date:  2020-06-04       Impact factor: 4.036

5.  Ensemble machine learning to evaluate the in vivo acute oral toxicity and in vitro human acetylcholinesterase inhibitory activity of organophosphates.

Authors:  Liangliang Wang; Junjie Ding; Peichang Shi; Li Fu; Li Pan; Jiahao Tian; Dongsheng Cao; Hui Jiang; Xiaoqin Ding
Journal:  Arch Toxicol       Date:  2021-05-01       Impact factor: 5.153
