LightGBM: An Effective and Scalable Algorithm for Prediction of Chemical Toxicity-Application to the Tox21 and Mutagenicity Data Sets.

Jin Zhang, Daniel Mucs, Ulf Norinder, Fredrik Svensson.

Abstract

Machine learning algorithms are widely used to assess the potential toxicities of pharmaceuticals and industrial chemicals because they are faster and cheaper than experimental bioassays. Gradient boosting is an effective algorithm that often achieves high predictivity, but historically its relatively long computational time has limited its application to predicting large compound libraries and to developing in silico predictive models that require frequent retraining. LightGBM, a recent improvement of the gradient boosting algorithm, inherits its high predictivity but addresses its poor scalability and long computational time by adopting a leaf-wise tree growth strategy and introducing novel techniques. In this study, we compared the predictive performance and the computational time of LightGBM to those of deep neural networks, random forests, support vector machines, and XGBoost. All algorithms were rigorously evaluated on publicly available Tox21 and mutagenicity data sets using a nested 10-fold cross-validation scheme with integrated Bayesian optimization, which performs hyperparameter optimization while examining model generalizability and transferability to new data. The evaluation results demonstrated that LightGBM is an effective and highly scalable algorithm, offering the best predictive performance while requiring significantly less computational time than the other investigated algorithms across all Tox21 and mutagenicity data sets. We recommend LightGBM for in silico safety assessment and for other areas of cheminformatics, to meet the ever-growing demand for accurate and rapid prediction of various toxicity- or activity-related end points for the large compound libraries present in the pharmaceutical and chemical industries.
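
The nested cross-validation scheme mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `k_fold_indices`, `nested_cv`, and the `fit_and_score` callback are hypothetical, and an exhaustive search over a small candidate list stands in for the Bayesian optimization used in the study. The point it shows is that hyperparameters are chosen using only the training part of each outer fold, so the outer test fold remains unseen until the final scoring.

```python
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle the sample indices once and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nested_cv(n_samples, candidates, fit_and_score, outer_k=10, inner_k=10, seed=0):
    """Return one held-out score per outer fold.

    fit_and_score(train_idx, test_idx, params) -> float (higher is better)
    is a hypothetical callback that trains a model on train_idx with the
    given hyperparameters and evaluates it on test_idx.
    """
    outer_folds = k_fold_indices(n_samples, outer_k, seed)
    outer_scores = []
    for i, test_idx in enumerate(outer_folds):
        # Everything outside the current outer test fold is available for tuning.
        train_idx = [j for n, fold in enumerate(outer_folds) if n != i for j in fold]

        # Inner folds are positions within train_idx, never the outer test fold.
        inner_folds = k_fold_indices(len(train_idx), inner_k, seed + 1)

        def inner_score(params):
            scores = []
            for m, val_pos in enumerate(inner_folds):
                val = [train_idx[p] for p in val_pos]
                fit = [train_idx[p]
                       for n, fold in enumerate(inner_folds) if n != m
                       for p in fold]
                scores.append(fit_and_score(fit, val, params))
            return mean(scores)

        # Assumption: plain search over a candidate list, standing in for
        # Bayesian optimization.
        best_params = max(candidates, key=inner_score)

        # Refit on the full outer training part and score on the unseen fold.
        outer_scores.append(fit_and_score(train_idx, test_idx, best_params))
    return outer_scores
```

In practice, `fit_and_score` would wrap the training and evaluation of a real model (for example a gradient boosting classifier scored by a metric of choice), and the mean of the returned outer scores would serve as the estimate of generalization performance.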

Year:  2019        PMID: 31560206     DOI: 10.1021/acs.jcim.9b00633

Source DB:  PubMed          Journal:  J Chem Inf Model        ISSN: 1549-9596            Impact factor:   4.956


  19 in total

1.  MutagenPred-GCNNs: A Graph Convolutional Neural Network-Based Classification Model for Mutagenicity Prediction with Data-Driven Molecular Fingerprints.

Authors:  Shimeng Li; Li Zhang; Huawei Feng; Jinhui Meng; Di Xie; Liwei Yi; Isaiah T Arkin; Hongsheng Liu
Journal:  Interdiscip Sci       Date:  2021-01-27       Impact factor: 2.233

2.  E-CatBoost: An efficient machine learning framework for predicting ICU mortality using the eICU Collaborative Research Database.

Authors:  Nima Safaei; Babak Safaei; Seyedhouman Seyedekrami; Mojtaba Talafidaryani; Arezoo Masoud; Shaodong Wang; Qing Li; Mahdi Moqri
Journal:  PLoS One       Date:  2022-05-05       Impact factor: 3.752

3.  DeepSnap-Deep Learning Approach Predicts Progesterone Receptor Antagonist Activity With High Performance.

Authors:  Yasunari Matsuzaka; Yoshihiro Uesawa
Journal:  Front Bioeng Biotechnol       Date:  2020-01-22

4.  Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models.

Authors:  Dejun Jiang; Zhenxing Wu; Chang-Yu Hsieh; Guangyong Chen; Ben Liao; Zhe Wang; Chao Shen; Dongsheng Cao; Jian Wu; Tingjun Hou
Journal:  J Cheminform       Date:  2021-02-17       Impact factor: 5.514

5.  Prediction Models for AKI in ICU: A Comparative Study.

Authors:  Qing Qian; Jinming Wu; Jiayang Wang; Haixia Sun; Lei Yang
Journal:  Int J Gen Med       Date:  2021-02-25

6.  Predicting the membrane permeability of organic fluorescent probes by the deep neural network based lipophilicity descriptor DeepFl-LogP.

Authors:  Kareem Soliman; Florian Grimm; Christian A Wurm; Alexander Egner
Journal:  Sci Rep       Date:  2021-03-26       Impact factor: 4.379

7.  Predicting student satisfaction of emergency remote learning in higher education during COVID-19 using machine learning techniques.

Authors:  Indy Man Kit Ho; Kai Yuen Cheong; Anthony Weldon
Journal:  PLoS One       Date:  2021-04-02       Impact factor: 3.240

8.  Development of machine learning model for diagnostic disease prediction based on laboratory tests.

Authors:  Dong Jin Park; Min Woo Park; Homin Lee; Young-Jin Kim; Yeongsic Kim; Young Hoon Park
Journal:  Sci Rep       Date:  2021-04-07       Impact factor: 4.379

9.  Identifying sarcopenia in advanced non-small cell lung cancer patients using skeletal muscle CT radiomics and machine learning.

Authors:  Xing Dong; Xu Dan; Ao Yawen; Xu Haibo; Li Huan; Tu Mengqi; Chen Linglong; Ruan Zhao
Journal:  Thorac Cancer       Date:  2020-08-06       Impact factor: 3.500

10.  A Toxicity Prediction Tool for Potential Agonist/Antagonist Activities in Molecular Initiating Events Based on Chemical Structures.

Authors:  Kota Kurosaki; Raymond Wu; Yoshihiro Uesawa
Journal:  Int J Mol Sci       Date:  2020-10-23       Impact factor: 5.923
