Entropy 2014, 16(6), 3273-3301; doi:10.3390/e16063273

On Clustering Histograms with k-Means by Using Mixed α-Divergences

Received: 15 May 2014; in revised form: 10 June 2014 / Accepted: 13 June 2014 / Published: 17 June 2014
(This article belongs to the Special Issue Information Geometry)
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract: Clustering sets of histograms has become popular thanks to the success of the generic method of bag-of-X used in text categorization and in visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.
Keywords: bag-of-X; α-divergence; Jeffreys divergence; centroid; k-means clustering; k-means seeding
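
As an illustration of the distortion measures named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes the standard Amari form of the α-divergence for positive arrays (valid for α ≠ ±1) and assumes that a mixed divergence is the weighted sum λ·D(p:q) + (1−λ)·D(q:r). The function names and the parameter `lam` are hypothetical and chosen only for this example.

```python
import numpy as np

def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence between positive arrays (assumed form, alpha != +-1)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if abs(alpha) == 1.0:
        # The limits alpha -> +-1 are Kullback-Leibler-type divergences; not handled here.
        raise ValueError("this sketch only covers alpha != +-1")
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    # 4/(1-alpha^2) * sum( a*p + b*q - p^a * q^b )
    return 4.0 / (1.0 - alpha**2) * np.sum(a * p + b * q - p**a * q**b)

def mixed_alpha_divergence(p, q, r, alpha, lam):
    """Assumed mixed divergence: lam * D(p:q) + (1 - lam) * D(q:r)."""
    return (lam * alpha_divergence(p, q, alpha)
            + (1.0 - lam) * alpha_divergence(q, r, alpha))

def symmetrized_alpha_divergence(p, q, alpha):
    """Symmetrized case: mixed divergence with r = p and lam = 1/2."""
    return mixed_alpha_divergence(p, q, p, alpha, 0.5)

# Two toy histograms (hypothetical data, normalized to sum to 1).
p = np.array([0.2, 0.3, 0.5])
q = np.array([0.4, 0.4, 0.2])
print(symmetrized_alpha_divergence(p, q, alpha=0.5))
```

Under these assumptions, swapping the arguments while negating α leaves the α-divergence unchanged, which is why averaging the two orientations yields a symmetric measure.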

MDPI and ACS Style

Nielsen, F.; Nock, R.; Amari, S.-I. On Clustering Histograms with k-Means by Using Mixed α-Divergences. Entropy 2014, 16, 3273-3301.

Entropy, EISSN 1099-4300. Published by MDPI AG, Basel, Switzerland.