Prototype selection for nearest neighbor classification: taxonomy and empirical study.

Salvador García, Joaquín Derrac, José Ramón Cano, Francisco Herrera.

Abstract

The nearest neighbor classifier is one of the most used and well-known techniques for performing recognition tasks. It has also demonstrated itself to be one of the most useful algorithms in data mining in spite of its simplicity. However, the nearest neighbor classifier suffers from several drawbacks such as high storage requirements, low efficiency in classification response, and low noise tolerance. These weaknesses have been the subject of study for many researchers and many solutions have been proposed. Among them, one of the most promising solutions consists of reducing the data used for establishing a classification rule (training data) by means of selecting relevant prototypes. Many prototype selection methods exist in the literature and the research in this area is still advancing. Different properties could be observed in the definition of them, but no formal categorization has been established yet. This paper provides a survey of the prototype selection methods proposed in the literature from a theoretical and empirical point of view. Considering a theoretical point of view, we propose a taxonomy based on the main characteristics presented in prototype selection and we analyze their advantages and drawbacks. Empirically, we conduct an experimental study involving different sizes of data sets for measuring their performance in terms of accuracy, reduction capabilities, and runtime. The results obtained by all the methods studied have been verified by nonparametric statistical tests. Several remarks, guidelines, and recommendations are made for the use of prototype selection for nearest neighbor classification.
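
As an illustration of the idea described in the abstract (shrinking the training data to a subset of prototypes while keeping a nearest neighbor rule), the following is a minimal sketch using Hart's condensed nearest neighbor procedure. It is one classical prototype selection method chosen here only as an example, not a method or setup taken from the paper. The helper names (`nearest_label`, `condensed_nn`, `make_blobs`, `accuracy`) and the synthetic two-class data are assumptions made for this sketch. It reports the three quantities the abstract mentions: accuracy, reduction, and runtime.

```python
import math
import random
import time


def nearest_label(prototypes, x):
    """Return the label of the prototype closest to x (1-NN, Euclidean)."""
    return min(prototypes, key=lambda p: math.dist(p[0], x))[1]


def condensed_nn(train, seed=0):
    """Condensed nearest neighbor (illustrative): keep a point only if the
    prototypes kept so far misclassify it; repeat until a pass adds nothing."""
    rng = random.Random(seed)
    order = list(range(len(train)))
    rng.shuffle(order)
    kept = [order[0]]
    kept_set = {order[0]}
    changed = True
    while changed:
        changed = False
        for i in order:
            if i in kept_set:
                continue
            x, y = train[i]
            if nearest_label([train[j] for j in kept], x) != y:
                kept.append(i)
                kept_set.add(i)
                changed = True
    return [train[j] for j in kept]


def make_blobs(n_per_class, seed):
    """Hypothetical toy data: two 2-D Gaussian classes."""
    rng = random.Random(seed)
    data = []
    for label, (cx, cy) in enumerate([(0.0, 0.0), (4.0, 4.0)]):
        for _ in range(n_per_class):
            data.append(((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label))
    return data


def accuracy(prototypes, data):
    """Fraction of points in data whose 1-NN prototype has the right label."""
    return sum(nearest_label(prototypes, x) == y for x, y in data) / len(data)


train = make_blobs(200, seed=1)
test = make_blobs(100, seed=2)

start = time.perf_counter()
prototypes = condensed_nn(train)
runtime = time.perf_counter() - start

reduction = 1 - len(prototypes) / len(train)
print(f"prototypes kept: {len(prototypes)} of {len(train)}")
print(f"reduction rate:  {reduction:.3f}")
print(f"test accuracy:   {accuracy(prototypes, test):.3f}")
print(f"selection time:  {runtime:.4f} s")
```

The selected subset classifies every training point correctly by construction, which is why this family of methods can cut storage without changing decisions on the training data. How well it generalizes, and how it copes with noisy labels, depends on the data.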

Year:  2012        PMID: 21768651     DOI: 10.1109/TPAMI.2011.142

Source DB:  PubMed          Journal:  IEEE Trans Pattern Anal Mach Intell        ISSN: 0162-8828            Impact factor:   6.226


  7 in total

1.  Computer-aided classification of sickle cell retinopathy using quantitative features in optical coherence tomography angiography.

Authors:  Minhaj Alam; Damber Thapa; Jennifer I Lim; Dingcai Cao; Xincheng Yao
Journal:  Biomed Opt Express       Date:  2017-08-25       Impact factor: 3.732

2.  Multi-class texture analysis in colorectal cancer histology.

Authors:  Jakob Nikolas Kather; Cleo-Aron Weis; Francesco Bianconi; Susanne M Melchers; Lothar R Schad; Timo Gaiser; Alexander Marx; Frank Gerrit Zöllner
Journal:  Sci Rep       Date:  2016-06-16       Impact factor: 4.379

3.  Outlier Removal in Model-Based Missing Value Imputation for Medical Datasets.

Authors:  Min-Wei Huang; Wei-Chao Lin; Chih-Fong Tsai
Journal:  J Healthc Eng       Date:  2018-02-04       Impact factor: 2.682

4.  IMMIGRATE: A Margin-Based Feature Selection Method with Interaction Terms.

Authors:  Ruzhang Zhao; Pengyu Hong; Jun S Liu
Journal:  Entropy (Basel)       Date:  2020-03-02       Impact factor: 2.524

5.  Multi-Objective Evolutionary Instance Selection for Regression Tasks.

Authors:  Mirosław Kordos; Krystian Łapa
Journal:  Entropy (Basel)       Date:  2018-09-29       Impact factor: 2.524

6.  Optimal 1-NN prototypes for pathological geometries.

Authors:  Ilia Sucholutsky; Matthias Schonlau
Journal:  PeerJ Comput Sci       Date:  2021-04-09

7.  Practical selection of representative sets of RNA-seq samples using a hierarchical approach.

Authors:  Laura H Tung; Carl Kingsford
Journal:  Bioinformatics       Date:  2021-07-12       Impact factor: 6.937
