Classification ensembles for unbalanced class sizes in predictive toxicology.

J J Chen, C A Tsai, J F Young, R L Kodell.

Abstract

This paper investigates how the ratio of positive to negative samples affects sensitivity, specificity, and concordance. When the class sizes in the training samples are unequal, the derived classification rule favors the majority class, which results in low sensitivity when predicting the minority class. We propose an ensemble classification approach to adjust for differential class sizes in a binary classifier system. An ensemble classifier consists of a set of base classifiers; its prediction rule is based on a summary measure of the individual classifications made by the base classifiers. Two re-sampling methods, augmentation and abatement, are proposed to generate different bootstrap samples of equal class size for building the base classifiers. The augmentation method balances the two class sizes by bootstrapping additional samples from the minority class, whereas the abatement method balances them by sampling only a subset of the majority class. The proposed procedure is applied to two data sets, one to predict estrogen receptor binding activity and one to predict animal liver carcinogenicity, using SAR (structure-activity relationship) models as base classifiers. The abatement method appears to perform well in balancing sensitivity and specificity.
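
The two re-sampling schemes and the ensemble vote described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not give the sampling details, so the following are assumptions: augmentation draws the extra minority samples with replacement, abatement draws the majority subset without replacement, and the summary measure is a simple majority vote. The nearest-centroid base classifier on a single numeric feature is a hypothetical stand-in for the SAR models, and all function names are illustrative.

```python
import random
from statistics import mean


def balanced_sample(pos, neg, method, rng):
    """Return (pos_sample, neg_sample) with equal class sizes."""
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    if method == "augmentation":
        # Assumption: bootstrap extra minority samples (with replacement)
        # until the minority class matches the majority class in size.
        extra = rng.choices(minority, k=len(majority) - len(minority))
        new_min, new_maj = list(minority) + extra, list(majority)
    elif method == "abatement":
        # Assumption: keep a random subset of the majority class (without
        # replacement) that matches the minority class in size.
        new_min, new_maj = list(minority), rng.sample(majority, len(minority))
    else:
        raise ValueError(f"unknown method: {method}")
    return (new_min, new_maj) if minority is pos else (new_maj, new_min)


def fit_centroid(pos, neg):
    """Hypothetical base classifier: assign x to the nearer class mean."""
    mean_pos, mean_neg = mean(pos), mean(neg)
    return lambda x: 1 if abs(x - mean_pos) <= abs(x - mean_neg) else 0


def fit_ensemble(pos, neg, method, n_base=25, seed=0):
    """Build n_base base classifiers, each on its own balanced sample."""
    rng = random.Random(seed)
    return [
        fit_centroid(*balanced_sample(pos, neg, method, rng))
        for _ in range(n_base)
    ]


def predict(ensemble, x):
    """Summary measure (assumed): majority vote over the base classifiers."""
    votes = sum(clf(x) for clf in ensemble)
    return 1 if 2 * votes > len(ensemble) else 0


def evaluate(ensemble, pos, neg):
    """Sensitivity, specificity, and concordance on labelled samples."""
    tp = sum(predict(ensemble, x) == 1 for x in pos)
    tn = sum(predict(ensemble, x) == 0 for x in neg)
    return {
        "sensitivity": tp / len(pos),
        "specificity": tn / len(neg),
        "concordance": (tp + tn) / (len(pos) + len(neg)),
    }


if __name__ == "__main__":
    # Toy data: 4 positives (minority) and 12 negatives (majority).
    pos = [5.0, 5.5, 6.0, 6.5]
    neg = [0.0, 0.2, 0.3, 0.5, 0.7, 0.9, 1.0, 1.1, 1.2, 1.5, 1.7, 2.0]
    for method in ("augmentation", "abatement"):
        print(method, evaluate(fit_ensemble(pos, neg, method), pos, neg))
```

The toy data are well separated and serve only to show the mechanics; they say nothing about the relative performance of the two methods reported in the paper.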

Year:  2005        PMID: 16428129     DOI: 10.1080/10659360500468468

Source DB:  PubMed          Journal:  SAR QSAR Environ Res        ISSN: 1026-776X            Impact factor:   3.000


  8 in total

1.  Text classification for assisting moderators in online health communities.

Authors:  Jina Huh; Meliha Yetisgen-Yildiz; Wanda Pratt
Journal:  J Biomed Inform       Date:  2013-09-08       Impact factor: 6.317

2.  The Promise of AI in Detection, Diagnosis, and Epidemiology for Combating COVID-19: Beyond the Hype. (Review)

Authors:  Musa Abdulkareem; Steffen E Petersen
Journal:  Front Artif Intell       Date:  2021-05-14

3.  CURE-SMOTE algorithm and hybrid algorithm for feature selection and parameter optimization based on random forests.

Authors:  Li Ma; Suohai Fan
Journal:  BMC Bioinformatics       Date:  2017-03-14       Impact factor: 3.169

4.  Towards a generalized toxicity prediction model for oxide nanomaterials using integrated data from different sources.

Authors:  Jang-Sik Choi; My Kieu Ha; Tung Xuan Trinh; Tae Hyun Yoon; Hyung-Gi Byun
Journal:  Sci Rep       Date:  2018-04-17       Impact factor: 4.379

5.  Deep Convolutional Neural Networks for the Prediction of Molecular Properties: Challenges and Opportunities Connected to the Data.

Authors:  Niclas Ståhl; Göran Falkman; Alexander Karlsson; Gunnar Mathiason; Jonas Boström
Journal:  J Integr Bioinform       Date:  2018-12-05

6.  Effect of Dataset Size and Train/Test Split Ratios in QSAR/QSPR Multiclass Classification.

Authors:  Anita Rácz; Dávid Bajusz; Károly Héberger
Journal:  Molecules       Date:  2021-02-19       Impact factor: 4.411

7.  Cardiovascular RNA markers and artificial intelligence may improve COVID-19 outcome: a position paper from the EU-CardioRNA COST Action CA17129.

Authors:  Lina Badimon; Emma L Robinson; Amela Jusic; Irina Carpusca; Leon J deWindt; Costanza Emanueli; Péter Ferdinandy; Wei Gu; Mariann Gyöngyösi; Matthias Hackl; Kanita Karaduzovic-Hadziabdic; Mitja Lustrek; Fabio Martelli; Eric Nham; Ines Potočnjak; Venkata Satagopam; Reinhard Schneider; Thomas Thum; Yvan Devaux
Journal:  Cardiovasc Res       Date:  2021-07-07       Impact factor: 10.787

8.  NanoTox: Development of a Parsimonious In Silico Model for Toxicity Assessment of Metal-Oxide Nanoparticles Using Physicochemical Features.

Authors:  Nilesh Anantha Subramanian; Ashok Palaniappan
Journal:  ACS Omega       Date:  2021-04-23