You are currently on the new version of our website. Access the old version .
AlgorithmsAlgorithms
  • Article
  • Open Access

30 January 2024

Efficient Time-Series Clustering through Sparse Gaussian Modeling

,
,
and
1
School of Electrical and Computer Engineering, National Technical University of Athens, 15780 Zografou, Greece
2
LIP6, Sorbonne Université, CNRS, F-75005 Paris, France
*
Author to whom correspondence should be addressed.
This article belongs to the Special Issue 2022 and 2023 Selected Papers from Algorithms Editorial Board Members

Abstract

In this work, we consider the problem of shape-based time-series clustering with the widely used Dynamic Time Warping (DTW) distance. We present a novel two-stage framework based on Sparse Gaussian Modeling. In the first stage, we apply Sparse Gaussian Process Regression and obtain a sparse representation of each time series in the dataset with a logarithmic (in the original length T) number of inducing data points. In the second stage, we apply k-means with DTW Barycentric Averaging (DBA) to the sparsified dataset using a generalization of DTW, which accounts for the fact that each inducing point serves as a representative of many original data points. The asymptotic running time of our Sparse Time-Series Clustering framework is Ω(T2/log2T) times faster than the running time of applying k-means to the original dataset because sparsification reduces the running time of DTW from Θ(T2) to Θ(log2T). Moreover, sparsification tends to smoothen outliers and particularly noisy parts of the original time series. We conduct an extensive experimental evaluation using datasets from the UCR Time-Series Classification Archive, showing that the quality of clustering computed by our Sparse Time-Series Clustering framework is comparable to the clustering computed by the standard k-means algorithm.

Article Metrics

Citations

Article Access Statistics

Multiple requests from the same IP address are counted as one view.