
Fractional Norms and Quasinorms Do Not Help to Overcome the Curse of Dimensionality.

Evgeny M Mirkes, Jeza Allohibi, Alexander Gorban.

Abstract

The curse of dimensionality causes well-known and widely discussed problems for machine learning methods. There is a hypothesis that using the Manhattan distance and even fractional l_p quasinorms (for p less than 1) can help to overcome the curse of dimensionality in classification problems. In this study, we systematically test this hypothesis. We illustrate that fractional quasinorms have a greater relative contrast and coefficient of variation than the Euclidean norm l_2, but show that this difference decays with increasing space dimension. The concentration of distances shows qualitatively the same behaviour for all tested norms and quasinorms. A greater relative contrast does not mean a better classification quality: for different databases, the best (worst) performance was achieved under different norms (quasinorms). A systematic comparison shows that the difference in the performance of kNN classifiers for l_p at p = 0.5, 1, and 2 is statistically insignificant. Analysis of the curse and blessing of dimensionality requires a careful definition of data dimensionality, which rarely coincides with the number of attributes. We systematically examined several intrinsic dimensions of the data.
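The two concentration measures named in the abstract can be illustrated with a small numerical sketch. It is not the authors' code. It assumes the commonly used definitions, which the abstract does not spell out: relative contrast RC = (D_max - D_min) / D_min and coefficient of variation CV = std(D) / mean(D), where D are the l_p distances from the origin to points drawn uniformly from the unit cube. The function names, sample size, dimensions, and seed are all hypothetical choices made for illustration.

```python
import numpy as np


def lp_length(x, p):
    """l_p norm (p >= 1) or quasinorm (0 < p < 1) along the last axis of x."""
    return np.sum(np.abs(x) ** p, axis=-1) ** (1.0 / p)


def relative_contrast(dists):
    """(D_max - D_min) / D_min for a 1-D array of positive distances."""
    return (dists.max() - dists.min()) / dists.min()


def coefficient_of_variation(dists):
    """Standard deviation of the distances divided by their mean."""
    return dists.std() / dists.mean()


def concentration_table(dims=(2, 10, 100), ps=(0.5, 1.0, 2.0),
                        n_points=1000, seed=0):
    """Return rows (dim, p, RC, CV) for uniform samples in the unit cube.

    Distances are measured from the origin; this is an assumed test setup,
    not necessarily the protocol used in the paper.
    """
    rng = np.random.default_rng(seed)
    rows = []
    for d in dims:
        points = rng.uniform(0.0, 1.0, size=(n_points, d))
        for p in ps:
            dists = lp_length(points, p)
            rows.append((d, p,
                         relative_contrast(dists),
                         coefficient_of_variation(dists)))
    return rows


if __name__ == "__main__":
    for d, p, rc, cv in concentration_table():
        print(f"dim={d:4d}  p={p:3.1f}  RC={rc:10.3f}  CV={cv:6.3f}")
```

Running the script prints one row per (dimension, p) pair, so the decay of both measures with growing dimension can be compared across p = 0.5, 1, and 2 under these assumptions.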


Keywords:  blessing of dimensionality; curse of dimensionality; fractional norm; high dimension; kNN; metrics

Year:  2020        PMID: 33286874      PMCID: PMC7597215          DOI: 10.3390/e22101105

Source DB:  PubMed          Journal:  Entropy (Basel)        ISSN: 1099-4300            Impact factor:   2.524


