Distribution-Sensitive Unbalanced Data Oversampling Method for Medical Diagnosis.

Weihong Han, Zizhong Huang, Shudong Li, Yan Jia.

Abstract

To address the low accuracy of classification learning algorithms caused by severely imbalanced sample sets in medical diagnostic applications, this paper proposes a distribution-sensitive oversampling algorithm for imbalanced data. The algorithm accurately divides the minority samples into noise samples, unstable samples, boundary samples, and stable samples according to their location. Each type of sample is processed differently so that the most suitable samples are selected for synthesizing new samples. For sample synthesis, a distribution-sensitive method is adopted: the synthesis method is chosen according to a sample's distance from the surrounding minority samples, which ensures that newly synthesized samples share the characteristics of the original minority samples. Tests on real medical diagnostic data show that the algorithm improves the accuracy of classification learning algorithms compared with existing sampling algorithms, especially the accuracy and recall for minority classes.
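
The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual criteria, so everything below is an assumption made for illustration: the function names `categorize_minority` and `synthesize`, the neighbour count `k`, the thresholds on the share of majority-class neighbours, and the rule that shortens the interpolation step when the chosen neighbour is farther away than the typical nearest-neighbour distance. It is a SMOTE-style approximation of the idea, not the authors' algorithm.

```python
import math
import random
from statistics import median


def categorize_minority(X, y, minority_label=1, k=5):
    """Label each minority sample by the share of majority points among its k nearest neighbours.

    The thresholds are illustrative assumptions, not the paper's criteria.
    """
    categories = {}
    for i, (xi, yi) in enumerate(zip(X, y)):
        if yi != minority_label:
            continue
        others = [j for j in range(len(X)) if j != i]
        nearest = sorted(others, key=lambda j: math.dist(xi, X[j]))[:k]
        share = sum(y[j] != minority_label for j in nearest) / len(nearest)
        if share == 1.0:
            categories[i] = "noise"      # surrounded only by majority samples
        elif share > 0.5:
            categories[i] = "unstable"   # mostly majority neighbours
        elif share > 0.0:
            categories[i] = "boundary"   # some majority neighbours
        else:
            categories[i] = "stable"     # only minority neighbours
    return categories


def synthesize(X, categories, n_new, k=5, seed=0):
    """Create n_new points by interpolating between non-noise minority samples.

    Assumed distance rule: if the chosen neighbour is farther away than the
    median nearest-neighbour distance, the new point stays in the half of the
    segment closest to the seed sample.
    """
    rng = random.Random(seed)
    pool = [i for i, c in categories.items() if c != "noise"]
    if len(pool) < 2:
        raise ValueError("need at least two non-noise minority samples")

    def neighbours(i):
        rest = (j for j in pool if j != i)
        return sorted(rest, key=lambda j: math.dist(X[i], X[j]))[:k]

    typical = median(math.dist(X[i], X[neighbours(i)[0]]) for i in pool)
    new_samples = []
    for _ in range(n_new):
        s = rng.choice(pool)
        t = rng.choice(neighbours(s))
        max_gap = 1.0 if math.dist(X[s], X[t]) <= typical else 0.5
        gap = rng.uniform(0.0, max_gap)
        new_samples.append([a + gap * (b - a) for a, b in zip(X[s], X[t])])
    return new_samples


# Toy data: a minority cluster, a majority cluster, and one minority point
# sitting inside the majority cluster.
X = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
     [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],
     [5.05, 5.05]]
y = [1, 1, 1, 1, 0, 0, 0, 0, 1]
categories = categorize_minority(X, y, minority_label=1, k=3)
synthetic = synthesize(X, categories, n_new=10, k=3, seed=0)
```

In this sketch, samples labelled as noise are excluded both as seeds and as interpolation partners, so every synthetic point lies on a segment between two retained minority samples.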

Keywords:  Classification learning; Data resampling; Imbalanced data; Medical diagnosis; Oversampling; Undersampling

Year:  2019        PMID: 30631957     DOI: 10.1007/s10916-018-1154-8

Source DB:  PubMed          Journal:  J Med Syst        ISSN: 0148-5598            Impact factor:   4.460


  5 in total

1.  Pan-Cancer Transcriptional Models Predicting Chemosensitivity in Human Tumors.

Authors:  Jason D Wells; Jacqueline R Griffin; Todd W Miller
Journal:  Cancer Inform       Date:  2021-03-19

2.  Application of Machine Learning for the Prediction of Etiological Types of Classic Fever of Unknown Origin.

Authors:  Yongjie Yan; Chongyuan Chen; Yunyu Liu; Zuyue Zhang; Lin Xu; Kexue Pu
Journal:  Front Public Health       Date:  2021-12-24

3.  Emotion Recognition Based on EEG Using Generative Adversarial Nets and Convolutional Neural Network.

Authors:  Bo Pan; Wei Zheng
Journal:  Comput Math Methods Med       Date:  2021-10-11       Impact factor: 2.238

4.  Machine learning in the loop for tuberculosis diagnosis support.

Authors:  Alvaro D Orjuela-Cañón; Andrés L Jutinico; Carlos Awad; Erika Vergara; Angélica Palencia
Journal:  Front Public Health       Date:  2022-07-26

5.  A Deep Neural Network-Based Method for Prediction of Dementia Using Big Data.

Authors:  Jungyoon Kim; Jihye Lim
Journal:  Int J Environ Res Public Health       Date:  2021-05-18       Impact factor: 3.390
