
Integrating new data balancing technique with committee networks for imbalanced data: GRSOM approach.

Danaipong Chetchotsak, Sirorat Pattanapairoj, Banchar Arnonkijpanich

Abstract

To deal with imbalanced data in classification problems, this paper proposes a data balancing technique to be used in conjunction with a committee network. The proposed technique is based on the growing ring self-organizing map (GRSOM), an unsupervised learning algorithm. GRSOM balances the data by growing new data on a well-defined ring structure, which is developed iteratively based on the winning node near the samples. Accordingly, the new balanced data preserve the topology of the original data. The proposed method is evaluated on four real data sets from the UCI Machine Learning Repository, and classification performance is measured using fivefold cross-validation. Classifiers built with the most common data balancing techniques, namely the Synthetic Minority Over-sampling Technique (SMOTE) and the random under-sampling technique (RT), serve as the baseline methods. The results reveal that a committee of classifiers constructed using GRSOM performs at least as well as the baseline methods. The results also suggest that classifiers constructed using neural networks with the backpropagation algorithm are more robust than those using the support vector machine.
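
The two baseline balancing techniques named in the abstract can be illustrated with a minimal sketch. This is not GRSOM and not the authors' code: the function names, the toy two-dimensional data, and the neighbour count k are all hypothetical, and the over-sampler is a simplified SMOTE-style interpolation rather than a full implementation.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def random_undersample(majority, minority, rng):
    """Keep a random subset of the majority class equal in size to the minority class."""
    return rng.sample(majority, len(minority)), list(minority)


def smote_like_oversample(majority, minority, rng, k=3):
    """Grow the minority class until it matches the majority class in size.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Assumes the minority class holds at least two samples.
    """
    synthetic = []
    while len(minority) + len(synthetic) < len(majority):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: sq_dist(base, p))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(tuple(b + t * (m - b) for b, m in zip(base, mate)))
    return list(majority), list(minority) + synthetic


# Toy imbalanced data (hypothetical): 12 majority points, 4 minority points.
majority = [(float(i), float(i % 3)) for i in range(12)]
minority = [(20.0, 20.0), (21.0, 20.5), (20.5, 21.0), (21.5, 21.5)]

rng = random.Random(0)
maj_u, min_u = random_undersample(majority, minority, rng)
maj_o, min_o = smote_like_oversample(majority, minority, rng)
print(len(maj_u), len(min_u))  # 4 4
print(len(maj_o), len(min_o))  # 12 12
```

Under-sampling discards majority samples, while SMOTE-style over-sampling adds interpolated minority samples; according to the abstract, GRSOM instead grows new data on a ring structure so that the topology of the original data is preserved.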

Keywords:  Classification; Committee networks; Growing ring self-organizing map; Imbalanced data

Year: 2015   PMID: 26557932   PMCID: PMC4635392   DOI: 10.1007/s11571-015-9350-4

Source DB: PubMed   Journal: Cogn Neurodyn   ISSN: 1871-4080   Impact factor: 5.082


