Konstantin Barkalov1, Anton Shtanyuk1, Alexander Sysoyev1.
Abstract
The paper considers a time-efficient implementation of the k nearest neighbours (kNN) algorithm. A well-known approach to accelerating the kNN algorithm is to use dimensionality reduction methods based on space-filling curves. In this paper, we take this approach further and propose an algorithm that employs multiple space-filling curves and is faster, with comparable quality, than a kNN algorithm that uses kd-trees to determine the nearest neighbours. A specific method for constructing multiple Peano curves is outlined, and statements are given about the preservation of object proximity information in the course of dimensionality reduction. An experimental comparison with known kNN implementations using kd-trees was performed on test and real-life data.
Keywords: dimensionality reduction; kNN; machine learning; multiple space-filling curves
Year: 2022 PMID: 35741488 PMCID: PMC9223091 DOI: 10.3390/e24060767
Source DB: PubMed Journal: Entropy (Basel) ISSN: 1099-4300 Impact factor: 2.738
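The idea summarised in the abstract (reduce multidimensional points to one dimension with several space-filling curves, then look for neighbours along each curve) can be illustrated with a minimal sketch. It is not the authors' implementation: it substitutes a Z-order (Morton) curve for the Peano-type curves of the paper, and it obtains the family of curves by shifting the data, which is an assumption of this sketch. The names `morton_key`, `quantise`, `CurveIndex`, `shifts`, `bits` and `window` are hypothetical.

```python
import bisect
import math
import random


def morton_key(cell, bits=10):
    """Interleave the bits of integer grid coordinates (Z-order curve)."""
    key = 0
    dim = len(cell)
    for b in range(bits):
        for d, c in enumerate(cell):
            key |= ((c >> b) & 1) << (b * dim + d)
    return key


def quantise(x, shift, bits=10):
    """Map x in [0, 1)^N to grid cells after adding a shift in [0, 1)."""
    cells = 1 << bits
    # Halving keeps every shifted coordinate inside [0, 1).
    return tuple(min(cells - 1, int((xi + shift) / 2.0 * cells)) for xi in x)


class CurveIndex:
    """Approximate kNN search over several shifted Z-order curves."""

    def __init__(self, points, shifts=(0.0, 0.25, 0.5), bits=10):
        self.points = points
        self.shifts = shifts
        self.bits = bits
        # One sorted list of (curve key, point index) per curve.
        self.orders = [
            sorted((morton_key(quantise(p, s, bits), bits), i)
                   for i, p in enumerate(points))
            for s in shifts
        ]

    def query(self, x, k, window=None):
        """Return indices of (approximately) the k nearest points to x."""
        window = window or 2 * k
        candidates = set()
        for s, keyed in zip(self.shifts, self.orders):
            key = morton_key(quantise(x, s, self.bits), self.bits)
            pos = bisect.bisect_left(keyed, (key, -1))
            lo, hi = max(0, pos - window), min(len(keyed), pos + window)
            candidates.update(i for _, i in keyed[lo:hi])
        # Rank the merged candidates by true distance in the original space.
        ranked = sorted(candidates, key=lambda i: math.dist(x, self.points[i]))
        return ranked[:k]


if __name__ == "__main__":
    rng = random.Random(0)
    pts = [(rng.random(), rng.random()) for _ in range(1000)]
    index = CurveIndex(pts)
    print(index.query((0.3, 0.7), k=5))
```

Points that are close on one curve are close in the original space, but the converse can fail near cell boundaries of a single curve. Merging candidates from several curves is meant to reduce such misses; the `window` parameter trades search time against the chance of overlooking a true neighbour.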
Figure 1. Peano curve.
Figure 2. Family of Peano curves.
Test infrastructure.
| Computer Specifications | |
|---|---|
| CPU | Intel Core i3-7100 (3.9 GHz) |
| RAM | 8 GB |
| Operating system | Windows 10 |
| Compiler | Intel(R) oneAPI DPC++/C++ Compiler, Version 2022.0.0 |
Figure 3. Running times of the algorithms with .
Figure 4. Running times of the algorithms with .
Figure 5. Running times of the algorithms with .
Figure 6. Running times of the algorithms with , .
Figure 7. Comparison of neighbour search times when determining the colour of an image point.
Classes in the CarEvaluation set.
| Object Class | Number of Objects in Class |
|---|---|
| 1 | 1400 |
| 2 | 400 |
| 3 | 70 |
| 4 | 65 |
Figure 8. Comparison of neighbour search times for the CarEvaluation dataset.
Figure 9. Running times of the algorithms with , 7 million objects.
Figure 10. “Surrounded by aliens”.
Percentage of errors in DS-Random-2 recognition.
| | | | | | | |
|---|---|---|---|---|---|---|
| 5 | 5 | 5 | 11 | 12 | 6 | 8 |
| 15 | 4 | 6 | 10 | 11 | 10 | 9 |
| 25 | 5 | 7 | 10 | 11 | 7 | 9 |
| 35 | 6 | 6 | 11 | 13 | 7 | 10 |
| 45 | 7 | 6 | 10 | 13 | 7 | 10 |
Percentage of DS-Skin recognition errors, .
| | kNN-KD | kNN-ME |
|---|---|---|
| 5 | 0.5 | 0.5 |
| 15 | 0.5 | 0.5 |
| 25 | 0.5 | 0.5 |
| 35 | 1 | 1 |
| 45 | 1 | 1 |
Average error rate (in percent) for DS-CarEvaluation recognition, .
| | kNN-KD | kNN-ME |
|---|---|---|
| 5 | 8 | 11 |
| 7 | 9 | 10 |
| 9 | 8 | 10 |
| 11 | 10 | 11 |
| 13 | 9 | 11 |