Effects of Distance Measure Choice on K-Nearest Neighbor Classifier Performance: A Review.

Haneen Arafat Abu Alfeilat, Ahmad B A Hassanat, Omar Lasassmeh, Ahmad S Tarawneh, Mahmoud Bashir Alhasanat, Hamzeh S Eyal Salman, V B Surya Prasath.

Abstract

The K-nearest neighbor (KNN) classifier is one of the simplest and most common classifiers, yet its performance competes with the most complex classifiers in the literature. The core of this classifier depends mainly on measuring the distance or similarity between the tested examples and the training examples. This raises a major question: which of the many available distance and similarity measures should be used for the KNN classifier? This review attempts to answer this question by evaluating the performance (measured by accuracy, precision, and recall) of the KNN using a large number of distance measures, tested on a number of real-world data sets, with and without adding different levels of noise. The experimental results show that the performance of the KNN classifier depends significantly on the distance used, with large gaps between the performances of different distances. We found that a recently proposed nonconvex distance performed best on most data sets compared with the other tested distances. In addition, the performance of the KNN with this top-performing distance degraded by only ∼20% when the noise level reached 90%; this is true for most of the other distances used as well. This means that the KNN classifier using any of the top 10 distances tolerates noise to a certain degree. Moreover, the results show that some distances are less affected by the added noise than others.
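
To illustrate how the choice of distance can change a KNN prediction, here is a minimal sketch in Python. It is not the authors' implementation: the function names (knn_predict, euclidean, manhattan, chebyshev), the toy data, and the three distances shown are assumptions chosen for illustration, and the nonconvex distance mentioned in the abstract is not reproduced here.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line (L2) distance between two equal-length vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # City-block (L1) distance: sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # L-infinity distance: largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    # Hypothetical helper: rank training examples by distance to the query,
    # then return the most common label among the k closest ones.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data (made up): two training points and one query at the origin.
train_X = [(3.0, 3.0), (0.0, 5.0)]
train_y = ["a", "b"]
query = (0.0, 0.0)

# With k=1 the nearest neighbor depends on the distance used:
# Euclidean: (3,3) is ~4.24 away, (0,5) is 5 away, so "a" wins.
# Manhattan: (3,3) is 6 away, (0,5) is 5 away, so "b" wins.
# Chebyshev: (3,3) is 3 away, (0,5) is 5 away, so "a" wins.
for dist in (euclidean, manhattan, chebyshev):
    print(dist.__name__, knn_predict(train_X, train_y, query, k=1, distance=dist))
```

Passing the distance as a parameter keeps the classifier itself unchanged, so different measures can be compared on the same data in the way the review describes.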

Keywords:  K-nearest neighbor; big data; machine learning; noise; supervised learning

Year:  2019        PMID: 31411491     DOI: 10.1089/big.2018.0175

Source DB:  PubMed          Journal:  Big Data        ISSN: 2167-6461            Impact factor:   2.128


  27 in total

1.  Visible Particle Identification Using Raman Spectroscopy and Machine Learning.

Authors:  Han Sheng; Yinping Zhao; Xiangan Long; Liwen Chen; Bei Li; Yiyan Fei; Lan Mi; Jiong Ma
Journal:  AAPS PharmSciTech       Date:  2022-07-06       Impact factor: 3.246

2.  Diagnostic classification of cancers using DNA methylation of paracancerous tissues.

Authors:  Baoshan Ma; Bingjie Chai; Heng Dong; Jishuang Qi; Pengcheng Wang; Tong Xiong; Yi Gong; Di Li; Shuxin Liu; Fengju Song
Journal:  Sci Rep       Date:  2022-06-23       Impact factor: 4.996

3.  Application of Machine Learning Methods in Modeling the Loss of Circulation Rate while Drilling Operation.

Authors:  Ahmed Alsaihati; Mahmoud Abughaban; Salaheldin Elkatatny; Dhafer Al Shehri
Journal:  ACS Omega       Date:  2022-06-08

4.  Evaluation of Traditional Culture Teaching Efficiency by Course Ideological and Political Integration Lightweight Deep Learning.

Authors:  Qingqing Zhong
Journal:  Comput Intell Neurosci       Date:  2022-06-25

5.  POCASUM: Policy Categorizer and Summarizer Based on Text Mining and Machine Learning.

Authors:  Rushikesh Deotale; Shreyash Rawat; V Vijayarajan; V B Surya Prasath
Journal:  Soft comput       Date:  2021-06-11       Impact factor: 3.732

6.  Non-Invasive Glucose Monitoring Using Optical Sensor and Machine Learning Techniques for Diabetes Applications.

Authors:  Maryamsadat Shokrekhodaei; David P Cistola; Robert C Roberts; Stella Quinones
Journal:  IEEE Access       Date:  2021-05-11       Impact factor: 3.367

7.  Fast prototyping of a local fuzzy search system for decision support and retraining of hospital staff during pandemic.

Authors:  Evgeny A Bakin; Oksana V Stanevich; Daria M Danilenko; Dmitry A Lioznov; Alexander N Kulikov
Journal:  Health Inf Sci Syst       Date:  2021-05-11

8.  Masked face recognition with convolutional neural networks and local binary patterns.

Authors:  Hoai Nam Vu; Mai Huong Nguyen; Cuong Pham
Journal:  Appl Intell (Dordr)       Date:  2021-08-14       Impact factor: 5.019

9.  Model-Based Reasoning of Clinical Diagnosis in Integrative Medicine: Real-World Methodological Study of Electronic Medical Records and Natural Language Processing Methods.

Authors:  Wenye Geng; Xuanfeng Qin; Tao Yang; Zhilei Cong; Zhuo Wang; Qing Kong; Zihui Tang; Lin Jiang
Journal:  JMIR Med Inform       Date:  2020-12-21

10.  Development and validation of consensus machine learning-based models for the prediction of novel small molecules as potential anti-tubercular agents.

Authors:  Mushtaq Ahmad Wani; Kuldeep K Roy
Journal:  Mol Divers       Date:  2021-06-10       Impact factor: 2.943
