Open Access Article
Entropy 2019, 21(3), 234

Compressed kNN: K-Nearest Neighbors with Data Compression

Facultad de Ingeniería, Ciencias Físicas y Matemática, Universidad Central del Ecuador, Quito 170129, Ecuador
Computer Technology Department, University of Alicante, 03080 Alicante, Spain
Author to whom correspondence should be addressed.
Received: 27 December 2018 / Revised: 14 February 2019 / Accepted: 22 February 2019 / Published: 28 February 2019


The kNN (k-nearest neighbors) classification algorithm is one of the most widely used non-parametric classification methods; however, it is limited by memory consumption, which grows with the size of the dataset and makes it impractical to apply to large volumes of data. Variations of this method have been proposed, such as condensed kNN, which divides the training dataset into clusters to be classified; other variations reduce the input dataset before applying the algorithm. This paper presents a variation of the kNN algorithm, of the structure-less NN type, that works with categorical data. Categorical data, due to their nature, can be compressed in order to decrease the memory requirements at classification time. The method adds a preliminary phase that compresses the data and then applies the algorithm to the compressed data. This allows the whole dataset to be kept in memory while considerably reducing the amount of memory required. Experiments and tests carried out on known datasets show a reduction in the volume of information stored in memory while maintaining classification accuracy. They also show a slight degradation in processing time, because the information is decompressed in real time (on-the-fly) while the algorithm is running.
Keywords: classification; kNN; compression; categorical data; feature pre-processing
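
The idea described in the abstract, compressing categorical attributes and decoding them only while distances are computed, can be illustrated with a minimal sketch. This is not the authors' implementation: the bit-packing scheme, the Hamming distance, and every function name below (`bits_needed`, `pack_row`, `unpack_row`, `knn_predict`) are assumptions chosen for illustration. Each training row is stored as a single packed integer and is unpacked on the fly inside the neighbor search.

```python
from collections import Counter


def bits_needed(n_categories):
    """Bits required to store codes 0..n_categories-1 (at least 1 bit)."""
    return max(1, (n_categories - 1).bit_length())


def pack_row(codes, widths):
    """Pack one row of integer category codes into a single integer."""
    packed = 0
    for code, width in zip(codes, widths):
        packed = (packed << width) | code
    return packed


def unpack_row(packed, widths):
    """Recover the category codes from a packed row (on-the-fly decoding)."""
    codes = []
    for width in reversed(widths):
        codes.append(packed & ((1 << width) - 1))
        packed >>= width
    return codes[::-1]


def hamming(a, b):
    """Number of attributes on which two categorical rows differ."""
    return sum(x != y for x, y in zip(a, b))


def knn_predict(packed_train, labels, widths, query, k=3):
    """Classify `query` by majority vote among its k nearest packed rows."""
    # Each row is decoded only for the duration of its distance computation,
    # so the uncompressed training set is never materialized in memory.
    dists = sorted(
        (hamming(unpack_row(p, widths), query), i)
        for i, p in enumerate(packed_train)
    )
    votes = Counter(labels[i] for _, i in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three categorical attributes with 4, 2 and 3 levels.
widths = [bits_needed(c) for c in (4, 2, 3)]
train = [[0, 1, 2], [0, 1, 1], [3, 0, 0], [3, 0, 1]]
labels = ["a", "a", "b", "b"]
packed_train = [pack_row(row, widths) for row in train]
prediction = knn_predict(packed_train, labels, widths, [0, 1, 0], k=3)
```

In this toy setup each row fits in 5 bits instead of three separate values. The price is the extra `unpack_row` call per distance computation, which mirrors the trade-off between memory and processing time that the abstract reports.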

This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).

MDPI and ACS Style

Salvador–Meneses, J.; Ruiz–Chavez, Z.; Garcia–Rodriguez, J. Compressed kNN: K-Nearest Neighbors with Data Compression. Entropy 2019, 21, 234.

Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland