Clinical data classification using an enhanced SMOTE and chaotic evolutionary feature selection.

S Sreejith, H Khanna Nehemiah, A Kannan.

Abstract

Class imbalance and the presence of irrelevant or redundant features in training data can pose serious challenges to the development of a classification framework. This paper proposes a framework for developing a Clinical Decision Support System (CDSS) that addresses class imbalance and the feature selection problem. Under this framework, the dataset is balanced at the data level and a wrapper approach is used to perform feature selection. The following three clinical datasets from the University of California Irvine (UCI) machine learning repository were used for experimentation: the Indian Liver Patient Dataset (ILPD), the Thoracic Surgery Dataset (TSD) and the Pima Indian Diabetes (PID) dataset. The Synthetic Minority Over-sampling Technique (SMOTE), enhanced using Orchard's algorithm, was used to balance the datasets. A wrapper approach that uses Chaotic Multi-Verse Optimisation (CMVO) was proposed for feature subset selection. The arithmetic mean of the Matthews correlation coefficient (MCC) and F-score (F1), as measured using a Random Forest (RF) classifier, was used as the fitness function. After selecting the relevant features, an RF comprising 100 estimators and using the Information Gain Ratio as the split criterion was used for classification. The classifier achieved a 0.65 MCC, a 0.84 F1 and 82.46% accuracy for the ILPD; a 0.74 MCC, a 0.87 F1 and 86.88% accuracy for the TSD; and a 0.78 MCC, a 0.89 F1 and 89.04% accuracy for the PID dataset. The effects of balancing and feature selection on the classifier were investigated, and the performance of the framework was compared with existing works in the literature. The results showed that the proposed framework is competitive in terms of the three performance measures used. The results of a Wilcoxon test confirmed the statistical superiority of the proposed method.
Copyright © 2020 Elsevier Ltd. All rights reserved.
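
The fitness function described in the abstract (the arithmetic mean of MCC and F1) can be illustrated with a minimal sketch. This is not the authors' code: the function names and the choice to compute both scores from binary confusion-matrix counts (tp, tn, fp, fn) are assumptions made for illustration, as is returning 0.0 when a score is undefined. In the paper the counts would come from a Random Forest evaluated on a candidate feature subset; that step is omitted here.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when undefined (assumption)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when undefined (assumption)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def fitness(tp, tn, fp, fn):
    """Arithmetic mean of MCC and F1, as named in the abstract."""
    return (mcc(tp, tn, fp, fn) + f1_score(tp, fp, fn)) / 2


if __name__ == "__main__":
    # Hypothetical counts for one candidate feature subset.
    print(round(fitness(tp=40, tn=45, fp=5, fn=10), 4))
```

Because MCC ranges over [-1, 1] and F1 over [0, 1], this mean lies in [-0.5, 1]; a wrapper search such as CMVO would presumably favour feature subsets that maximise it.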

Keywords:  Chaotic maps; Class imbalance; Classification; Clinical decision support system; Feature selection; Multi Verse Optimisation; SMOTE

Year:  2020        PMID: 32987205     DOI: 10.1016/j.compbiomed.2020.103991

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  6 in total

1.  Feature Selection and Classification of Clinical Datasets Using Bioinspired Algorithms and Super Learner.

Authors:  S Murugesan; R S Bhuvaneswaran; H Khanna Nehemiah; S Keerthana Sankari; Y Nancy Jane
Journal:  Comput Math Methods Med       Date:  2021-05-17       Impact factor: 2.238

2.  Prediction Model for Infectious Disease Health Literacy Based on Synthetic Minority Oversampling Technique Algorithm.

Authors:  Rongsheng Zhou; Weihao Yin; Wenjin Li; Yingchun Wang; Jing Lu; Zhong Li; Xinxin Hu
Journal:  Comput Math Methods Med       Date:  2022-03-25       Impact factor: 2.238

3.  Zoo: Selecting Transcriptomic and Methylomic Biomarkers by Ensembling Animal-Inspired Swarm Intelligence Feature Selection Algorithms.

Authors:  Yuanyuan Han; Lan Huang; Fengfeng Zhou
Journal:  Genes (Basel)       Date:  2021-11-18       Impact factor: 4.096

4.  Accurate Tumor Subtype Detection with Raman Spectroscopy via Variational Autoencoder and Machine Learning.

Authors:  Chang He; Shuo Zhu; Xiaorong Wu; Jiale Zhou; Yonghui Chen; Xiaohua Qian; Jian Ye
Journal:  ACS Omega       Date:  2022-03-21

5.  Using random forest algorithm for glomerular and tubular injury diagnosis.

Authors:  Wenzhu Song; Xiaoshuang Zhou; Qi Duan; Qian Wang; Yaheng Li; Aizhong Li; Wenjing Zhou; Lin Sun; Lixia Qiu; Rongshan Li; Yafeng Li
Journal:  Front Med (Lausanne)       Date:  2022-07-28

6.  Stacking Ensemble Method for Gestational Diabetes Mellitus Prediction in Chinese Pregnant Women: A Prospective Cohort Study.

Authors:  Ruiyi Liu; Yongle Zhan; Xuan Liu; Yifang Zhang; Luting Gui; Yimin Qu; Hairong Nan; Yu Jiang
Journal:  J Healthc Eng       Date:  2022-09-13       Impact factor: 3.822

