Entropy 2014, 16(6), 3273-3301; doi:10.3390/e16063273

On Clustering Histograms with k-Means by Using Mixed α-Divergences

Frank Nielsen 1,2,*, Richard Nock 3 and Shun-ichi Amari 4
1 Sony Computer Science Laboratories, Inc., Tokyo 141-0022, Japan
2 Polytechnique, 91128 Palaiseau Cedex, France
3 NICTA and The Australian National University, Locked Bag 9013, Alexandria NSW 1435, Australia
4 RIKEN Brain Science Institute, 2-1 Hirosawa Wako City, Saitama 351-0198, Japan
* Author to whom correspondence should be addressed.
Received: 15 May 2014 / Revised: 10 June 2014 / Accepted: 13 June 2014 / Published: 17 June 2014
(This article belongs to the Special Issue Information Geometry)


Clustering sets of histograms has become popular thanks to the success of the generic bag-of-X method used in text categorization and visual categorization applications. In this paper, we investigate the use of a parametric family of distortion measures, called the α-divergences, for clustering histograms. Since it usually makes sense to deal with symmetric divergences in information retrieval systems, we symmetrize the α-divergences using the concept of mixed divergences. First, we present a novel extension of k-means clustering to mixed divergences. Second, we extend the k-means++ seeding to mixed α-divergences and report a guaranteed probabilistic bound. Finally, we describe a soft clustering technique for mixed α-divergences.
Keywords: bag-of-X; α-divergence; Jeffreys divergence; centroid; k-means clustering; k-means seeding
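
To make the distortion measure named in the abstract concrete, here is a minimal sketch of an α-divergence between two positive histograms and of a weighted two-sided combination. It assumes the standard Amari form of the α-divergence for α ≠ ±1. The function names `alpha_divergence` and `two_sided_divergence` and the weight `lam` are hypothetical choices for this illustration. The abstract does not give the paper's exact definition of a mixed divergence, so the weighted sum below is only a stand-in for it.

```python
def alpha_divergence(p, q, alpha):
    """Amari alpha-divergence D_alpha(p : q) for strictly positive bins.

    Assumed form (alpha != +1 and alpha != -1):
        4 / (1 - alpha^2) * sum_i [ (1 - alpha)/2 * p_i + (1 + alpha)/2 * q_i
                                    - p_i^((1 - alpha)/2) * q_i^((1 + alpha)/2) ]
    """
    if abs(alpha) == 1:
        raise ValueError("alpha = +/-1 are limit cases and are not handled in this sketch")
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    a = (1.0 - alpha) / 2.0
    b = (1.0 + alpha) / 2.0
    total = sum(a * pi + b * qi - (pi ** a) * (qi ** b) for pi, qi in zip(p, q))
    return 4.0 / (1.0 - alpha ** 2) * total


def two_sided_divergence(p, q, alpha, lam=0.5):
    """Weighted sum of both orientations; lam = 0.5 gives a symmetric measure.

    Illustrative assumption only, not the paper's definition.
    """
    forward = alpha_divergence(p, q, alpha)
    backward = alpha_divergence(q, p, alpha)
    return lam * forward + (1.0 - lam) * backward


# Two small normalized histograms, made up for this example.
p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]
print(alpha_divergence(p, q, 0.5))
print(two_sided_divergence(p, q, 0.5))
```

With `lam=0.5` the combined measure takes the same value when `p` and `q` are swapped, which is the property the abstract asks of a divergence used for retrieval. Other values of `lam` interpolate between the two sided divergences.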
This is an open access article distributed under the Creative Commons Attribution License (CC BY) which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Nielsen, F.; Nock, R.; Amari, S.-I. On Clustering Histograms with k-Means by Using Mixed α-Divergences. Entropy 2014, 16, 3273-3301.

Entropy, EISSN 1099-4300, published by MDPI AG, Basel, Switzerland.