Open Access Article
Computers 2018, 7(4), 54; https://doi.org/10.3390/computers7040054

Norm-Based Binary Search Trees for Speeding Up KNN Big Data Classification

Information Technology College, Mutah University, Karak 61710, Jordan
Received: 26 September 2018 / Revised: 11 October 2018 / Accepted: 19 October 2018 / Published: 21 October 2018

Abstract

Because of its large size and/or high dimensionality, Big Data is challenging to classify with traditional machine learning methods, particularly with the well-known K-nearest neighbors (KNN) classifier, which is slow and lazy by nature. In this paper, we propose a new approach to Big Data classification using the KNN classifier, based on inserting the training examples into a binary search tree that is later used to speed up the search for test examples. For this purpose, we used two methods to sort the training examples. The first calculates the minimum/maximum scaled norm of each example and rounds it to 0 or 1. Examples with 0-norms are placed in the left child of a node, and those with 1-norms in the right child of the same node; this process continues recursively until a leaf node holds one example or a small number of examples with the same norm. The second method inserts each example into the binary search tree based on its similarity to the examples with the minimum and maximum Euclidean norms. Experimental results on several machine learning big datasets show that both methods are much faster than most of the state-of-the-art methods compared, with competitive accuracy rates obtained by the second method, which shows great potential for further enhancement of both methods for practical use.
Keywords: Big Data classification; machine learning datasets; binary search tree; norms
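
The first method described in the abstract can be illustrated with a minimal Python sketch. It is based only on the abstract and is not the author's implementation. Several details are assumptions: the use of the Euclidean norm for this method, re-scaling the norms with the minimum and maximum of each node, a 0.5 rounding threshold, and a majority vote inside the leaf that a test example reaches. The names `norm`, `side`, `Node`, `build`, `classify`, and `leaf_size` are hypothetical.

```python
import math
from collections import Counter


def norm(x):
    """Euclidean norm of a feature vector (assumed norm for this sketch)."""
    return math.sqrt(sum(v * v for v in x))


def side(n, lo, hi):
    """Min/max-scale a norm and round it: 0 means left child, 1 means right."""
    scaled = (n - lo) / (hi - lo)
    return 1 if scaled >= 0.5 else 0  # explicit threshold, assumed


class Node:
    def __init__(self, lo=0.0, hi=0.0, left=None, right=None, examples=None):
        self.lo, self.hi = lo, hi          # norm range seen at this node
        self.left, self.right = left, right
        self.examples = examples           # (vector, label) pairs, leaves only


def build(examples, leaf_size=1):
    """Recursively split examples by their rounded, scaled norm."""
    norms = [norm(x) for x, _ in examples]
    lo, hi = min(norms), max(norms)
    # Stop at a small leaf, or when all remaining norms are equal.
    if len(examples) <= leaf_size or hi == lo:
        return Node(examples=examples)
    left = [e for e, n in zip(examples, norms) if side(n, lo, hi) == 0]
    right = [e for e, n in zip(examples, norms) if side(n, lo, hi) == 1]
    return Node(lo, hi, build(left, leaf_size), build(right, leaf_size))


def classify(root, x):
    """Descend by the test example's norm, then vote among the leaf's labels."""
    node, n = root, norm(x)
    while node.examples is None:
        node = node.right if side(n, node.lo, node.hi) else node.left
    votes = Counter(label for _, label in node.examples)
    return votes.most_common(1)[0][0]


# Toy data, made up for illustration only.
train = [((1.0, 0.0), "a"), ((0.0, 2.0), "a"),
         ((6.0, 8.0), "b"), ((9.0, 0.0), "b")]
tree = build(train)
print(classify(tree, (0.5, 0.5)))  # small norm, ends in an "a" leaf
print(classify(tree, (7.0, 7.0)))  # large norm, ends in a "b" leaf
```

In this sketch, a lookup only follows one root-to-leaf path instead of comparing the test example with every training example, which is where the speed-up would come from. The price is that examples with similar norms are not necessarily close to each other, so the leaf reached is only an approximation of the true nearest neighbors.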
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).
MDPI and ACS Style

Hassanat, A.B.A. Norm-Based Binary Search Trees for Speeding Up KNN Big Data Classification. Computers 2018, 7, 54.


Computers, EISSN 2073-431X, published by MDPI AG, Basel, Switzerland