Biased Random Forest For Dealing With the Class Imbalance Problem.

Mohammed Bader-El-Den, Eleman Teitei, Todd Perry.   

Abstract

The class imbalance issue has been a persistent problem in machine learning that hinders the accurate predictive analysis of data in many real-world applications. The class imbalance problem exists when the number of instances in one class (or classes) is significantly smaller than the number of instances belonging to another class (or classes). Adequately recognizing the minority class during classification is difficult because most learning algorithms are biased toward the majority class. The underlying issue is made more complex by the data difficulty factors embedded in the input data. This paper presents a novel and effective ensemble-based method for dealing with the class imbalance problem. It is motivated by the idea of moving oversampling from the data level to the algorithm level: instead of increasing the number of minority instances in the data set, the proposed algorithm aims to "oversample the classification ensemble" by increasing the number of classifiers that represent the minority class in the ensemble, i.e., the random forest. The proposed biased random forest algorithm employs the nearest neighbor algorithm to identify the critical areas in a given data set. The standard random forest is then augmented with additional random trees generated from these critical areas. The results show that the proposed algorithm is very effective in dealing with the class imbalance problem.
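The two steps named in the abstract (finding critical areas with nearest neighbors, then growing extra trees on them) can be illustrated with a minimal sketch. It is one plausible reading of the abstract, not the authors' implementation: the function names `critical_indices` and `split_forest`, the neighbor count `k`, the `critical_share` ratio, and the rule "each minority instance plus its k nearest majority-class neighbors" are all assumptions made for illustration. Tree training itself is omitted.

```python
import math

def critical_indices(X, y, minority_label, k=3):
    """Hypothetical critical area: every minority instance plus its
    k nearest majority-class neighbours (Euclidean distance)."""
    minority = [i for i, label in enumerate(y) if label == minority_label]
    majority = [i for i, label in enumerate(y) if label != minority_label]
    critical = set(minority)
    for i in minority:
        # Rank majority instances by distance to this minority instance.
        nearest = sorted(majority, key=lambda j: math.dist(X[i], X[j]))[:k]
        critical.update(nearest)
    return sorted(critical)

def split_forest(total_trees, critical_share=0.5):
    """Split a tree budget into (trees grown on the full data set,
    trees grown on the critical area). The share is an assumed parameter."""
    n_critical = round(total_trees * critical_share)
    return total_trees - n_critical, n_critical

# Toy data: one minority point (label 1) among five majority points (label 0).
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (5.0, 5.0), (6.0, 5.0), (1.2, 1.0)]
y = [0, 0, 1, 0, 0, 0]

crit = critical_indices(X, y, minority_label=1, k=2)
print(crit)               # [1, 2, 5]
print(split_forest(100))  # (50, 50)
```

In this reading, the trees allotted to the critical area would be trained only on the rows in `crit`, so the combined ensemble contains more classifiers that have seen the minority class and its nearby majority instances.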

Year:  2018        PMID: 30475733     DOI: 10.1109/TNNLS.2018.2878400

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  3 in total

1.  Rule extraction from biased random forest and fuzzy support vector machine for early diagnosis of diabetes.

Authors:  Jingwei Hao; Senlin Luo; Limin Pan
Journal:  Sci Rep       Date:  2022-06-14       Impact factor: 4.996

2.  A novel early diagnostic framework for chronic diseases with class imbalance.

Authors:  Xiaohan Yuan; Shuyu Chen; Chuan Sun; Lu Yuwen
Journal:  Sci Rep       Date:  2022-05-21       Impact factor: 4.996

3.  MSPJ: Discovering potential biomarkers in small gene expression datasets via ensemble learning.

Authors:  HuaChun Yin; JingXin Tao; Yuyang Peng; Ying Xiong; Bo Li; Song Li; Hui Yang
Journal:  Comput Struct Biotechnol J       Date:  2022-07-14       Impact factor: 6.155

