
Online Nonlinear AUC Maximization for Imbalanced Data Sets.

Junjie Hu, Haiqin Yang, Michael R Lyu, Irwin King, Anthony Man-Cho So.   

Abstract

Classifying binary imbalanced streaming data is a significant task in both machine learning and data mining. Previously, online area under the receiver operating characteristic (ROC) curve (AUC) maximization has been proposed to seek a linear classifier. However, it is not well suited for handling nonlinearity and heterogeneity of the data. In this paper, we propose the kernelized online imbalanced learning (KOIL) algorithm, which produces a nonlinear classifier for the data by maximizing the AUC score while minimizing a functional regularizer. We address four major challenges that arise from our approach. First, to control the number of support vectors without sacrificing the model performance, we introduce two buffers with fixed budgets to capture the global information on the decision boundary by storing the corresponding learned support vectors. Second, to restrict the fluctuation of the learned decision function and achieve smooth updating, we confine the influence on a new support vector to its k-nearest opposite support vectors. Third, to avoid information loss, we propose an effective compensation scheme after the replacement is conducted when either buffer is full. With such a compensation scheme, the performance of the learned model is comparable to the one learned with infinite budgets. Fourth, to determine good kernels for data similarity representation, we exploit the multiple kernel learning framework to automatically learn a set of kernels. Extensive experiments on both synthetic and real-world benchmark data sets demonstrate the efficacy of our proposed approach.
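
The ideas named in the abstract (AUC as a pairwise ranking objective, one fixed-budget buffer per class, and updates confined to the k nearest opposite-class samples) can be illustrated with a small Python sketch. This is not the authors' KOIL implementation: the class name `BudgetedPairwiseLearner`, the RBF kernel, the pairwise hinge loss, the parameter names and defaults, and the first-in-first-out eviction are all assumptions made for illustration, and the compensation scheme and multiple kernel learning described in the abstract are left out.

```python
import math
from collections import deque


def rbf(x, z, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature sequences."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))


class BudgetedPairwiseLearner:
    """Hypothetical simplified learner, not the authors' KOIL implementation."""

    def __init__(self, budget=50, k=5, eta=0.1, lam=0.01, gamma=1.0):
        self.k, self.eta, self.lam, self.gamma = k, eta, lam, gamma
        # One fixed-size buffer per class; each entry is [features, coefficient].
        # ASSUMPTION: a full buffer evicts its oldest entry (FIFO), with no compensation.
        self.buffers = {+1: deque(maxlen=budget), -1: deque(maxlen=budget)}

    def score(self, x):
        """Kernel expansion over every sample currently held in either buffer."""
        return sum(alpha * rbf(z, x, self.gamma)
                   for buf in self.buffers.values() for z, alpha in buf)

    def update(self, x, y):
        """Process one labeled sample; y must be +1 or -1."""
        # Shrink existing coefficients (gradient step on the regularizer).
        for buf in self.buffers.values():
            for entry in buf:
                entry[1] *= 1.0 - self.eta * self.lam
        # Select the k most similar stored samples of the opposite class.
        neighbors = sorted(self.buffers[-y],
                           key=lambda e: -rbf(x, e[0], self.gamma))[: self.k]
        # Evaluate the pairwise hinge loss against each neighbor before changing anything.
        fx = self.score(x)
        violated = [e for e in neighbors
                    if 1.0 - y * (fx - self.score(e[0])) > 0.0]
        # Gradient step: push the new sample and each violating neighbor apart.
        alpha_x = 0.0
        for entry in violated:
            alpha_x += self.eta * y
            entry[1] -= self.eta * y
        self.buffers[y].append([x, alpha_x])
```

A stream would be consumed by calling `update(x, y)` once per arriving sample, with `score(x)` available at any time to rank new points. Because the number of stored samples never exceeds twice the budget, the cost of each update stays bounded no matter how long the stream runs.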

Year:  2017        PMID: 28141529     DOI: 10.1109/TNNLS.2016.2610465

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  3 in total

1.  Augmentation Method for High Intra-Class Variation Data in Apple Detection.

Authors:  Huibin Li; Wei Guo; Guowen Lu; Yun Shi
Journal:  Sensors (Basel)       Date:  2022-08-23       Impact factor: 3.847

2.  A classification for complex imbalanced data in disease screening and early diagnosis.

Authors:  Yiming Li; Wei-Wen Hsu
Journal:  Stat Med       Date:  2022-05-23       Impact factor: 2.497

3.  Design of a Clinical Decision Support System for Fracture Prediction Using Imbalanced Dataset.

Authors:  Yung-Fu Chen; Chih-Sheng Lin; Kuo-An Wang; La Ode Abdul Rahman; Dah-Jye Lee; Wei-Sheng Chung; Hsuan-Hung Lin
Journal:  J Healthc Eng       Date:  2018-03-22       Impact factor: 2.682

