Near-Bayesian Support Vector Machines for imbalanced data classification with equal or unequal misclassification costs.

Shounak Datta, Swagatam Das.

Abstract

Support Vector Machines (SVMs) form a family of popular classifier algorithms originally developed to solve two-class classification problems. However, SVMs are likely to perform poorly in situations with data imbalance between the classes, particularly when the target class is under-represented. This paper proposes a Near-Bayesian Support Vector Machine (NBSVM) for such imbalanced classification problems by combining the philosophies of decision boundary shift and unequal regularization costs. Based on certain assumptions which hold true for most real-world datasets, we use the fraction of the data contributed by each class to set both the boundary shift and the asymmetric regularization costs. The proposed approach is extended to the multi-class scenario and also adapted for cases with unequal misclassification costs for the different classes. An extensive comparison with standard SVM and some state-of-the-art methods shows that the proposed approach performs competitively on imbalanced datasets. A modified Sequential Minimal Optimization (SMO) algorithm is also presented to solve the NBSVM optimization problem in a computationally efficient manner.
Copyright © 2015 Elsevier Ltd. All rights reserved.
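
The abstract's two ingredients, class-dependent regularization costs and a shifted decision boundary, can be illustrated with a minimal Python sketch. This is not the NBSVM formulation, which the abstract does not spell out. It assumes the common "balanced" heuristic C_k = C / (K * f_k), where f_k is the fraction of training points in class k and K is the number of classes, and it treats the boundary shift as a free parameter. The function names `class_fractions`, `asymmetric_costs`, and `shifted_predict` are hypothetical.

```python
from collections import Counter


def class_fractions(labels):
    """Return the fraction of training points that belong to each class."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def asymmetric_costs(labels, base_cost=1.0):
    """Per-class regularization cost, inversely proportional to class fraction.

    Assumption: the 'balanced' heuristic C_k = C / (K * f_k). The weighting
    actually used by NBSVM may differ.
    """
    fractions = class_fractions(labels)
    num_classes = len(fractions)
    return {cls: base_cost / (num_classes * f) for cls, f in fractions.items()}


def shifted_predict(score, shift=0.0):
    """Binary decision with a shifted boundary: +1 when score > shift, else -1.

    Assumption: `score` is the raw SVM decision value and `shift` is chosen
    by the user; a negative shift favours the positive (minority) class.
    """
    return 1 if score > shift else -1


# Toy imbalanced label set: 10 positives, 90 negatives.
labels = [1] * 10 + [-1] * 90
fractions = class_fractions(labels)
costs = asymmetric_costs(labels)
```

With these toy labels the minority class receives the larger cost, so slack on minority points is penalized more heavily. The same two helper functions apply unchanged to more than two classes, while `shifted_predict` covers only the binary case.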

Keywords:  Bayes error; Decision boundary shift; Imbalanced data; Multi-class classification; Support Vector Machines; Unequal costs

Year:  2015        PMID: 26210983     DOI: 10.1016/j.neunet.2015.06.005

Source DB:  PubMed          Journal:  Neural Netw        ISSN: 0893-6080


  5 in total

1.  A Random Forests Quantile Classifier for Class Imbalanced Data.

Authors:  Robert O'Brien; Hemant Ishwaran
Journal:  Pattern Recognit       Date:  2019-01-29       Impact factor: 7.740

2.  Prediction of selective estrogen receptor beta agonist using open data and machine learning approach.

Authors:  Ai-Qin Niu; Liang-Jun Xie; Hui Wang; Bing Zhu; Sheng-Qi Wang
Journal:  Drug Des Devel Ther       Date:  2016-07-18       Impact factor: 4.162

3.  Automatic Multi-Label ECG Classification with Category Imbalance and Cost-Sensitive Thresholding.

Authors:  Yang Liu; Qince Li; Kuanquan Wang; Jun Liu; Runnan He; Yongfeng Yuan; Henggui Zhang
Journal:  Biosensors (Basel)       Date:  2021-11-14

4.  Research on expansion and classification of imbalanced data based on SMOTE algorithm.

Authors:  Shujuan Wang; Yuntao Dai; Jihong Shen; Jingxue Xuan
Journal:  Sci Rep       Date:  2021-12-15       Impact factor: 4.379

5.  Integration of gene co-expression analysis and multi-class SVM specifies the functional players involved in determining the fate of HTLV-1 infection toward the development of cancer (ATLL) or neurological disorder (HAM/TSP).

Authors:  Mohadeseh Zarei Ghobadi; Rahman Emamzadeh
Journal:  PLoS One       Date:  2022-01-18       Impact factor: 3.240
