Open Access Article
Symmetry 2017, 9(4), 58; doi:10.3390/sym9040058

A Fast K-prototypes Algorithm Using Partial Distance Computation

Creative Informatics & Computing Institute, Korea University, Seoul 02841, Korea
Academic Editor: Doo-Soon Park
Received: 6 April 2017 / Revised: 17 April 2017 / Accepted: 18 April 2017 / Published: 21 April 2017
(This article belongs to the Special Issue Scientific Programming in Practical Symmetric Big Data)

Abstract

The k-means algorithm is one of the most popular and widely used clustering algorithms; however, it is limited to numerical data. The k-prototypes algorithm is a well-known algorithm for clustering data with both numerical and categorical attributes. However, there have been no studies on accelerating it. In this paper, we propose a new, fast k-prototypes algorithm that gives the same answers as the original k-prototypes algorithm. The proposed algorithm avoids distance computations by using partial distance computation: it finds the minimum distance between an object and a cluster center without computing distances over all attributes, which reduces its time complexity. Partial distance computation relies on the fact that the maximum difference between two categorical attribute values is 1. If data objects have m categorical attributes, the maximum difference over the categorical attributes between an object and a cluster center is therefore m. Our algorithm first computes distances using the numerical attributes only. If the difference between the smallest and the second-smallest of these numerical distances is greater than m, the nearest cluster center can be found without computing distances over the categorical attributes. The experimental results show that the computational performance of the proposed k-prototypes algorithm is superior to that of the original k-prototypes algorithm on our dataset.
Keywords: clustering algorithm; k-prototypes algorithm; partial distance computation
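
The partial distance test described in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation. It assumes squared Euclidean distance on the numerical attributes and simple mismatch counting on the categorical ones. The function name `assign_cluster`, its parameters, and the weight `gamma` are hypothetical; `gamma` defaults to 1 so that the categorical part is bounded by m, as in the abstract.

```python
import numpy as np

def assign_cluster(x_num, x_cat, centers_num, centers_cat, gamma=1.0):
    """Return (index of the nearest center, True if categorical work was skipped).

    Illustrative sketch only: names and the gamma weight are assumptions,
    not taken from the article.
    """
    x_num = np.asarray(x_num, dtype=float)
    x_cat = np.asarray(x_cat)
    centers_num = np.asarray(centers_num, dtype=float)
    centers_cat = np.asarray(centers_cat)

    # Step 1: distance on numerical attributes only, for every center.
    num_dist = ((centers_num - x_num) ** 2).sum(axis=1)
    m = x_cat.shape[0]  # number of categorical attributes

    if num_dist.shape[0] > 1:
        order = np.argsort(num_dist)
        best, second = order[0], order[1]
        # Step 2: the categorical part adds at most gamma * m to any distance,
        # so a larger numerical gap already decides the nearest center.
        if num_dist[second] - num_dist[best] > gamma * m:
            return int(best), True

    # Step 3: otherwise fall back to the full mixed distance.
    cat_dist = (centers_cat != x_cat).sum(axis=1)
    total = num_dist + gamma * cat_dist
    return int(np.argmin(total)), False
```

When the gap test succeeds, the returned index is the same one the full computation would give, because no center other than the numerically closest one can make up a gap larger than gamma * m through categorical matches alone.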

This is an open access article distributed under the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


MDPI and ACS Style

Kim, B. A Fast K-prototypes Algorithm Using Partial Distance Computation. Symmetry 2017, 9, 58.


