In silico prediction of toxic action mechanisms of phenols for imbalanced data with Random Forest learner.

Jing Chen, Yuan Yan Tang, Bin Fang, Chang Guo.

Abstract

With an increasing need for rapid and effective safety assessment of compounds in industrial and civil-use products, in silico toxicity exploration techniques provide an economical approach to environmental hazard assessment. Over the last decade, in silico research has produced many quantitative structure-activity relationship (QSAR) models to predict toxicity mechanisms. Most of these methods draw on data analysis and machine learning techniques, which rely heavily on the characteristics of the data sets. For Tetrahymena pyriformis toxicity data sets, there is a major technical challenge: data imbalance. Skewness in the class distribution can greatly degrade prediction performance on rare classes. Most previous studies on predicting the mechanisms of toxic action of phenols did not consider this practical problem. In this work, we addressed the problem by accounting for the difference between the two types of misclassification. A Random Forest learner was employed in a cost-sensitive learning framework to construct prediction models based on selected molecular descriptors. In computational experiments, both the global and local models achieved appreciable overall prediction accuracy. In particular, performance on rare classes improved. Moreover, for practical use of these models, the balance between the two types of misclassification can be adjusted by using different cost matrices according to the application goals.
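As a rough illustration of how a cost matrix can shift the balance between the two types of misclassification, the sketch below applies a minimum-expected-cost decision rule to class-probability estimates. It is not the authors' implementation: the function name `min_cost_class`, the probabilities, the class labels, and the cost values are all assumptions made up for this example.

```python
# Hypothetical sketch of a cost-sensitive decision rule (not the paper's code).
# cost[i][j] is the assumed cost of predicting class j when the true class is i.

def min_cost_class(probs, cost):
    """Return the class index with the lowest expected misclassification cost."""
    n_classes = len(probs)
    expected = [
        sum(probs[i] * cost[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    return min(range(n_classes), key=expected.__getitem__)

# Assumed setup: class 0 is a common mechanism, class 1 a rare one.
# probs stands in for class-probability estimates, e.g. vote fractions of a forest.
probs = [0.7, 0.3]

equal_cost = [[0, 1], [1, 0]]    # both error types cost the same
rare_costly = [[0, 1], [5, 0]]   # missing a rare-class compound costs 5x more

print(min_cost_class(probs, equal_cost))   # 0: majority class wins
print(min_cost_class(probs, rare_costly))  # 1: rare class is now preferred
```

With equal costs the rule reduces to picking the most probable class; raising the cost of missing the rare class moves the decision toward it, which is the kind of adjustment the abstract attributes to different cost matrices.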

Year: 2012 PMID: 22481075 DOI: 10.1016/j.jmgm.2012.01.002

Source DB: PubMed Journal: J Mol Graph Model ISSN: 1093-3263 Impact factor: 2.518

Cited by

5 in total

1. Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods.

Authors: Sankalp Jain; Vishal B Siramshetty; Vinicius M Alves; Eugene N Muratov; Nicole Kleinstreuer; Alexander Tropsha; Marc C Nicklaus; Anton Simeonov; Alexey V Zakharov
Journal: J Chem Inf Model Date: 2021-02-03 Impact factor: 4.956

2. CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

Authors: Li Ma; Suohai Fan
Journal: BMC Bioinformatics Date: 2017-03-14 Impact factor: 3.169

3. Comparing the performance of meta-classifiers - a case study on selected imbalanced data sets relevant for prediction of liver toxicity.

Authors: Sankalp Jain; Eleni Kotsampasakou; Gerhard F Ecker
Journal: J Comput Aided Mol Des Date: 2018-04-06 Impact factor: 4.179

4. Structure-activity relationship-based chemical classification of highly imbalanced Tox21 datasets.

Authors: Gabriel Idakwo; Sundar Thangapandian; Joseph Luttrell; Yan Li; Nan Wang; Zhaoxian Zhou; Huixiao Hong; Bei Yang; Chaoyang Zhang; Ping Gong
Journal: J Cheminform Date: 2020-10-27 Impact factor: 5.514

5. QSAR modeling of imbalanced high-throughput screening data in PubChem.

Authors: Alexey V Zakharov; Megan L Peach; Markus Sitzmann; Marc C Nicklaus
Journal: J Chem Inf Model Date: 2014-02-28 Impact factor: 4.956