
Semisupervised Feature Selection Based on Relevance and Redundancy Criteria.

Jin Xu, Bo Tang, Haibo He, Hong Man.   

Abstract

Feature selection aims to retain relevant features for improved classification performance and to remove redundant features for reduced computational cost. Balancing these two factors is difficult, especially when categorical labels are costly to obtain. In this paper, we address this problem with a semisupervised learning method and propose a max-relevance and min-redundancy criterion based on Pearson's correlation coefficient (RRPC). The new method uses an incremental search technique to select optimal feature subsets. Each newly selected feature has strong relevance to the labels in a supervised manner and avoids redundancy with the already selected feature subset under unsupervised constraints. Comparative studies are performed on binary and multicategory data from benchmark data sets. The results show that RRPC can achieve a good balance between relevance and redundancy in semisupervised feature selection. We also compare RRPC with classic supervised feature selection criteria (such as mRMR and Fisher score), unsupervised feature selection criteria (such as Laplacian score), and semisupervised feature selection criteria (such as sSelect and locality sensitive). Experimental results demonstrate the effectiveness of our method.
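
The abstract does not give the exact RRPC formula, so the following is only a minimal sketch of the general idea it describes: an incremental (greedy) search that scores each candidate feature by its relevance to the labels, measured on the labeled samples, minus its redundancy with the features already chosen, measured on all samples. The function names, the use of absolute Pearson correlation, the mean over selected features, and the simple difference score are all assumptions made for illustration, not the authors' published criterion.

```python
import numpy as np


def abs_pearson(a, b):
    """Absolute Pearson correlation of two 1-D arrays (0.0 if either is constant)."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a @ a) * (b @ b))
    return 0.0 if denom == 0 else abs(a @ b) / denom


def greedy_relevance_redundancy(X_labeled, y, X_unlabeled, k):
    """Greedily pick up to k feature indices (hypothetical helper, not the paper's code).

    Relevance:  |corr(feature, y)| on the labeled samples only (supervised part).
    Redundancy: mean |corr(feature, selected feature)| on labeled + unlabeled
                samples (unsupervised part).
    Score:      relevance - redundancy (an assumed way to combine the two).
    """
    X_all = np.vstack([X_labeled, X_unlabeled])
    n_features = X_all.shape[1]
    relevance = np.array(
        [abs_pearson(X_labeled[:, j], y) for j in range(n_features)]
    )

    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in remaining:
            if selected:
                redundancy = np.mean(
                    [abs_pearson(X_all[:, j], X_all[:, s]) for s in selected]
                )
            else:
                redundancy = 0.0  # nothing selected yet, so no redundancy penalty
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
        remaining.remove(best_j)
    return selected
```

In this sketch a feature that duplicates an already selected one is penalized by the redundancy term even when it is highly relevant on its own, which is the trade-off the abstract refers to. Because the redundancy term needs no labels, the unlabeled samples contribute to it directly.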

Year:  2016        PMID: 28113443     DOI: 10.1109/TNNLS.2016.2562670

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw Learn Syst        ISSN: 2162-237X            Impact factor:   10.451


  4 in total

1.  Simple strategies for semi-supervised feature selection.

Authors:  Konstantinos Sechidis; Gavin Brown
Journal:  Mach Learn       Date:  2017-07-17       Impact factor: 2.940

2.  How to balance the bioinformatics data: pseudo-negative sampling.

Authors:  Yongqing Zhang; Shaojie Qiao; Rongzhao Lu; Nan Han; Dingxiang Liu; Jiliu Zhou
Journal:  BMC Bioinformatics       Date:  2019-12-24       Impact factor: 3.169

3.  The impact of public health emergencies on hotel demand - Estimation from a new foresight perspective on the COVID-19.

Authors:  Ling-Yang He; Hui Li; Jian-Wu Bi; Jing-Jing Yang; Qing Zhou
Journal:  Ann Tour Res       Date:  2022-04-12

4.  A hybrid feature selection model based on improved squirrel search algorithm and rank aggregation using fuzzy techniques for biomedical data classification.

Authors:  Gayathri Nagarajan; L D Dhinesh Babu
Journal:  Netw Model Anal Health Inform Bioinform       Date:  2021-06-02
