KDiscShapeNet: A Structure-Aware Time Series Clustering Model with Supervised Contrastive Learning
Abstract
1. Introduction
- (1) Insufficient modeling of temporal shape: Time series often exhibit varying forms due to differences in sampling frequency, amplitude perturbations, or nonlinear phase shifts. As a result, feature extraction methods based on Euclidean distance or fixed filters fail to accurately capture the underlying semantic consistency across sequences, making them ineffective at modeling local shape variations and elastic alignments.
- (2) Ambiguous inter-cluster boundaries and cluster collapse: In unsupervised settings, imbalanced cluster distributions, scattered intra-class representations, and degraded embeddings frequently arise, leading to unstable clustering performance.
- (3) Non-differentiable distance metrics hinder end-to-end learning: Metrics such as the normalized cross-correlation (NCC) used in k-Shape are non-Euclidean and inherently non-differentiable, making them incompatible with gradient-based neural network optimization. This prevents joint training of the feature encoder and the clustering objective: without a differentiable formulation, the learned representations cannot be adjusted dynamically to meet shape-alignment needs, resulting in suboptimal and unstable clustering outcomes (a minimal sketch of this obstacle follows this list).
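Challenge (3) can be made concrete in a few lines. The NumPy sketch below is illustrative only, not the paper's implementation: it computes the maximum normalized cross-correlation that k-Shape uses as its similarity, and contrasts the usual argmax assignment (piecewise constant, hence zero gradient almost everywhere) with a softmax relaxation that restores differentiability. The temperature `tau` and all function names are our assumptions.

```python
import numpy as np

def ncc_max(x, y):
    """Maximum normalized cross-correlation over all lags (the k-Shape similarity)."""
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    cc = np.correlate(x, y, mode="full")   # cross-correlation at every lag
    return cc.max() / denom

def hard_assign(x, centroids):
    """k-Shape-style assignment: argmax over similarities.
    Piecewise constant in its inputs, so gradients are zero almost everywhere."""
    sims = np.array([ncc_max(x, c) for c in centroids])
    return int(sims.argmax())

def soft_assign(x, centroids, tau=0.1):
    """Softmax relaxation: smooth in the similarities, so gradients can flow."""
    sims = np.array([ncc_max(x, c) for c in centroids])
    e = np.exp((sims - sims.max()) / tau)  # shift by max for numerical stability
    return e / e.sum()                     # soft membership distribution
```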
- (1) We propose KDiscShapeNet, a unified time series clustering framework that integrates discriminative learning with structure-aware modeling. By jointly incorporating Center Loss and Supervised Contrastive Loss, the model enforces intra-cluster compactness and inter-cluster separability, while a Differentiable k-Shape mechanism preserves temporal shape alignment and enables end-to-end optimization within a single training paradigm (a sketch of the two losses follows this list).
- (2) We adopt Kolmogorov–Arnold Networks (KAN) as the clustering encoder. With their high-order nonlinear functional representation and inherent interpretability, KANs significantly enhance the model’s ability to capture complex temporal deformations and dynamic patterns, while extending their applicability to unsupervised learning settings.
- (3) We conduct extensive experiments on benchmark datasets, including UCR and ETT, covering both comparative evaluation and ablation studies. KDiscShapeNet consistently outperforms state-of-the-art time series clustering baselines on nonlinear and multi-scale sequences, and visualization and statistical significance analyses further confirm the model’s robustness and interpretability.
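Both losses in contribution (1) have standard published forms: Center Loss (Wen et al., ECCV 2016) and Supervised Contrastive Loss (Khosla et al., NeurIPS 2020). The PyTorch sketch below is a minimal rendering of those published definitions, not the authors' released code; in a clustering setting, `labels` would plausibly be pseudo-labels derived from the soft assignments.

```python
import torch
import torch.nn.functional as F

def center_loss(feats, labels, centers):
    """Center Loss: pull each embedding toward its class/cluster center."""
    return 0.5 * (feats - centers[labels]).pow(2).sum(dim=1).mean()

def supcon_loss(feats, labels, tau=0.07):
    """Supervised contrastive loss on L2-normalized embeddings."""
    z = F.normalize(feats, dim=1)
    sim = z @ z.t() / tau                                  # pairwise similarities
    n = z.size(0)
    self_mask = torch.eye(n, dtype=torch.bool, device=z.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    sim = sim.masked_fill(self_mask, float("-inf"))        # exclude self-pairs
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    # average log-probability of positives per anchor, then negate
    pos_count = pos_mask.sum(dim=1).clamp(min=1)           # anchors w/o positives contribute 0
    mean_log_prob_pos = log_prob.masked_fill(~pos_mask, 0.0).sum(dim=1) / pos_count
    return -mean_log_prob_pos.mean()
```

A joint objective would weight these two terms alongside the clustering loss; a plausible form is given in Section 3.5.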
2. Related Work
2.1. Time Series Clustering
- (1) Distance-based clustering: These methods directly partition raw time series based on explicitly defined similarity measures, with crisp partitional algorithms being the most widely used category in this group. Representative approaches include k-Means, k-Medoids, and k-Shape, where each sample is deterministically assigned to a single cluster, emphasizing partition consistency and intra-cluster compactness. To improve robustness under uncertainty, nonlinearity, or multi-scale temporal variations, various extensions have been proposed, such as fuzzy k-Shape [16] and kernelized k-Shape [17]. These methods introduce soft assignment strategies or kernel-based similarity measures, enhancing adaptability to ambiguous cluster boundaries and nonlinear structures.
- (2) Distribution-based clustering: These methods model time series as samples generated from underlying probabilistic processes. Common approaches include Gaussian Mixture Models (GMMs) and Hidden Markov Models (HMMs), which capture the statistical generative characteristics of the data. Owing to their strong modeling assumptions, such methods are typically best suited to domain-specific applications, such as financial time series analysis or biomedical signal interpretation.
- (3) Subsequence-based clustering: These methods aim to identify recurring, reusable pattern fragments in long time series and are widely used in applications such as activity recognition and event detection. Representative approaches include Symbolic Aggregate approXimation (SAX) [18], Matrix Profile [19], and Shapelet Transformation [20]. However, most of these techniques rely on heuristic search and sliding-window mechanisms, which hinder their scalability to large datasets (a compact SAX sketch follows this list).
- (4) Representation-learning-based clustering: Leveraging the expressive power of deep neural networks, these methods automatically extract latent representations from time series and perform clustering in the learned embedding space. They often employ self-supervised pretraining, contrastive learning, or autoencoding objectives to construct well-structured feature spaces. Common architectures include autoencoders, convolutional neural networks (CNNs) [21], recurrent neural networks (RNNs), and the recently popular Transformer models [22]. Motivated by this line of research, we introduce the Kolmogorov–Arnold Network (KAN) into the time series clustering framework, aiming to improve both embedding quality and downstream clustering performance by combining strong representational capacity with functional interpretability.
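To make the subsequence-based family in item (3) concrete, here is a compact SAX sketch under common defaults (alphabet size 4 with the standard Gaussian breakpoints); the segment count and the divisibility assumption are ours, and this is not a reference implementation:

```python
import numpy as np

def sax(x, n_segments=8, breakpoints=(-0.6745, 0.0, 0.6745)):
    """SAX: z-normalize, reduce with PAA, then discretize against N(0,1) quantiles.
    The default breakpoints split the standard normal into 4 equiprobable regions."""
    x = (x - x.mean()) / (x.std() + 1e-8)          # z-normalization
    paa = x.reshape(n_segments, -1).mean(axis=1)   # assumes len(x) % n_segments == 0
    return np.searchsorted(breakpoints, paa)       # symbol indices 0..3
```

Downstream, clustering or motif discovery operates on these short symbol strings instead of the raw series, which is what makes the approach fast but search-heuristic-dependent.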
2.2. Differentiable Clustering and End-to-End Optimization
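The central device in this line of work is smoothing the alignment itself: Soft-DTW (Cuturi & Blondel, 2017) replaces the hard `min` in the DTW recursion with a log-sum-exp soft-min that is differentiable for any gamma > 0. The NumPy sketch below is a didactic rendering of that recursion for intuition, not an efficient or official implementation.

```python
import numpy as np

def soft_min(values, gamma=1.0):
    """Smoothed minimum: -gamma * log(sum(exp(-v / gamma))).
    Approaches min(values) as gamma -> 0; differentiable for gamma > 0."""
    v = np.asarray(values) / -gamma
    m = v.max()                                    # stabilize the log-sum-exp
    return -gamma * (m + np.log(np.exp(v - m).sum()))

def soft_dtw(x, y, gamma=1.0):
    """Soft-DTW value: the DTW recursion with soft_min in place of min."""
    n, m = len(x), len(y)
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2      # pointwise squared distance
            R[i, j] = cost + soft_min(
                [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma)
    return R[n, m]
```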
3. KDiscShapeNet
3.1. Overall Architecture
3.2. KANEncoder: High-Order Nonlinear Encoding
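The outline gives no implementation details for the encoder, so the following PyTorch sketch only illustrates the KAN idea it builds on: each input-output edge carries its own learnable univariate function, and the functions arriving at an output node are summed. The official KAN parameterizes these functions with B-splines; this stand-in uses fixed Gaussian RBF bases for brevity, and every name and size here is our assumption.

```python
import torch
import torch.nn as nn

class KANLayerSketch(nn.Module):
    """Simplified KAN-style layer: a learnable univariate function per edge,
    expressed as a linear combination of fixed RBF basis functions."""
    def __init__(self, in_dim, out_dim, n_basis=8, grid=(-2.0, 2.0)):
        super().__init__()
        self.register_buffer("centers", torch.linspace(*grid, n_basis))
        self.width = (grid[1] - grid[0]) / n_basis
        # one coefficient vector per edge: (out_dim, in_dim, n_basis)
        self.coef = nn.Parameter(torch.randn(out_dim, in_dim, n_basis) * 0.1)

    def forward(self, x):                          # x: (batch, in_dim)
        # evaluate every basis at every input coordinate: (batch, in_dim, n_basis)
        phi = torch.exp(-((x.unsqueeze(-1) - self.centers) / self.width) ** 2)
        # sum the per-edge univariate functions into each output: (batch, out_dim)
        return torch.einsum("bif,oif->bo", phi, self.coef)
```

Stacking two or three such layers yields an encoder whose per-edge functions can be inspected directly, which is the interpretability argument made in Section 1.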
3.3. Differentiable k-Shape Clustering with Soft Assignment
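No equations survive in this outline, so the following is only one plausible minimal reading of a soft-assignment clustering step: similarities between encoder embeddings and learnable centroids pass through a temperature softmax, and an expected-distance loss lets gradients reach both the centroids and the encoder. Cosine similarity on normalized vectors stands in for NCC here (the two coincide at zero lag after z-normalization); `tau` and all names are our assumptions.

```python
import torch
import torch.nn.functional as F

def soft_kshape_loss(z, centroids, tau=0.1):
    """Differentiable clustering step: soft assignments over centroid
    similarities, plus an expected-distance loss for backpropagation."""
    z = F.normalize(z, dim=1)                  # (batch, d) embeddings
    c = F.normalize(centroids, dim=1)          # (k, d) learnable shape centroids
    sim = z @ c.t()                            # (batch, k) cosine similarities
    q = F.softmax(sim / tau, dim=1)            # soft assignments (rows sum to 1)
    loss = ((1.0 - sim) * q).sum(dim=1).mean() # expected distance under q
    return loss, q
```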
3.4. Discriminative Enhancement Module
3.5. Joint Optimization Objective
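The outline does not state the objective explicitly, but a form consistent with Sections 3.3 and 3.4 would combine the clustering term with the two discriminative terms under trade-off weights; the lambdas below are hypothetical:

```latex
\mathcal{L}_{\mathrm{total}}
  = \mathcal{L}_{\mathrm{cluster}}
  + \lambda_{1}\,\mathcal{L}_{\mathrm{center}}
  + \lambda_{2}\,\mathcal{L}_{\mathrm{supcon}}
```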
4. Experimental Evaluation
4.1. Settings
- (1) ARI (Adjusted Rand Index)
- (2) NMI (Normalized Mutual Information)
- (3) SC (Silhouette Coefficient); a snippet computing all three follows this list.
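All three metrics have reference implementations in scikit-learn. The toy snippet below, with placeholder labels and embeddings of our own making, shows the call pattern:

```python
import numpy as np
from sklearn.metrics import (adjusted_rand_score,
                             normalized_mutual_info_score,
                             silhouette_score)

y_true = np.array([0, 0, 1, 1, 2, 2])                       # ground-truth classes
y_pred = np.array([0, 0, 1, 2, 2, 2])                       # predicted cluster labels
embeddings = np.random.default_rng(0).normal(size=(6, 16))  # stand-in for encoder output

print("ARI:", adjusted_rand_score(y_true, y_pred))           # external metric
print("NMI:", normalized_mutual_info_score(y_true, y_pred))  # external metric
print("SC: ", silhouette_score(embeddings, y_pred))          # internal metric: no ground truth needed
```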
4.2. Comparative Experiment
- (1) Traditional time series clustering methods
- (2) Representation learning combined with clustering
- (3) Deep clustering methods
4.3. Ablation Study
4.4. Complexity and Efficiency Analysis
5. Conclusions and Future Work
- (1) Unified end-to-end framework: KDiscShapeNet integrates a Kolmogorov–Arnold Network (KAN) encoder with a Differentiable k-Shape clustering module, achieving joint optimization between feature representation and cluster assignment.
- (2) Dual discriminative enhancement: The combination of Center Loss and Supervised Contrastive Loss enforces intra-cluster compactness and inter-cluster separability, strengthening discriminative capability.
- (3) Structure-aware modeling: The design explicitly captures dynamic shape variations and local structural heterogeneity, improving adaptability to diverse temporal patterns.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Palpanas, T.; Beckmann, V. Report on the first and second interdisciplinary time series analysis workshop (ITISA). ACM SIGMOD Rec. 2019, 48, 36–40.
2. Bagnall, A.J.; Cole, R.L.; Palpanas, T.; Zoumpatianos, K. Data series management (Dagstuhl Seminar 19282). Dagstuhl Rep. 2019, 9, 24–39.
3. Huang, Z.; Hao, H.; Du, L. Exploring the explainability of time series clustering: A review of methods and practices. In Proceedings of the 18th ACM International Conference on Web Search and Data Mining (WSDM 2025), Singapore, 10–14 March 2025; pp. 1005–1007.
4. Likas, A.; Vlassis, N.; Verbeek, J. The global k-means clustering algorithm. Pattern Recognit. 2003, 36, 451–461.
5. Paparrizos, J.; Gravano, L. k-Shape: Efficient and accurate clustering of time series. ACM SIGMOD Rec. 2016, 45, 69–76.
6. Li, G.; Choi, B.; Xu, J.; Bhowmick, S.; Mah, D.; Wong, G. AUTOSHAPE: An autoencoder-shapelet approach for time series clustering. arXiv 2022, arXiv:2208.04313.
7. Yue, Z.; Wang, Y.; Duan, J.; Yang, T.; Huang, C.; Tong, Y.; Xu, B. TS2Vec: Towards universal representation of time series. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2022), Virtual Event, 22 February–1 March 2022; pp. 8980–8987.
8. Yang, C.H.H.; Tsai, Y.-Y.; Chen, P.-Y. Voice2Series: Reprogramming acoustic models for time series classification. arXiv 2021, arXiv:2104.09577.
9. Zhong, Y.; Huang, D.; Wang, C.-D. Deep temporal contrastive clustering. Neural Process. Lett. 2023, 55, 7869–7885.
10. Ma, Q.; Zheng, J.; Li, S.; Cottrell, G.W. Learning representations for time series clustering. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; p. 339.
11. Javed, A.; Rizzo, D.M.; Lee, B.S.; Gramling, R. SOMTimeS: Self-organizing maps for time series clustering and its application to serious illness conversations. Data Min. Knowl. Discov. 2024, 38, 813–839.
12. Boniol, P.; Tiano, D.; Bonifati, A.; Palpanas, T. k-Graph: A graph embedding for interpretable time series clustering. IEEE Trans. Knowl. Data Eng. 2025, 37, 2680–2694.
13. Huang, F.; Deng, Y. TCGAN: Convolutional generative adversarial network for time series classification and clustering. Neural Netw. 2023, 165, 868–883.
14. Gao, P.; Yang, X.; Zhang, R.; Huang, K.; Goulermas, J.Y. Explainable tensorized neural ordinary differential equations for arbitrary-step time series prediction. IEEE Trans. Knowl. Data Eng. 2023, 35, 5837–5850.
15. Esling, P.; Agon, C. Time-series data mining. ACM Comput. Surv. 2012, 45, 12.
16. Yang, J.; Ning, C.; Deb, C.; Zhang, F.; Cheong, D.; Lee, S.E.; Sekhar, C.; Tham, K.W. k-Shape clustering algorithm for building energy usage patterns analysis and forecasting model accuracy improvement. Energy Build. 2017, 146, 27–37.
17. Zhang, X.; Zhu, Y.; Cheng, W.; Wang, Y. Kernelized k-shape clustering for time series. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI 2019), Honolulu, HI, USA, 27 January–1 February 2019; pp. 5683–5690.
18. Lin, J.; Keogh, E.; Lonardi, S.; Chiu, B. A symbolic representation of time series, with implications for streaming algorithms. In Proceedings of the 8th ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery (DMKD 2003), San Diego, CA, USA, 13 June 2003; pp. 2–11.
19. Yeh, C.C.M.; Zhu, Y.; Ulanova, L.; Begum, N.; Ding, Y.; Dau, H.A.; Silva, D.F.; Mueen, A.; Keogh, E. Matrix Profile I: All pairs similarity joins for time series: A unifying view that includes motifs, discords and shapelets. In Proceedings of the 2016 IEEE International Conference on Data Mining (ICDM 2016), Barcelona, Spain, 12–15 December 2016; pp. 1317–1322.
20. Hills, J.; Lines, J.; Baranauskas, E.; Mapp, J.; Bagnall, A. Classification of time series by shapelet transformation. Data Min. Knowl. Discov. 2014, 28, 851–881.
21. LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
22. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NeurIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 6000–6010.
23. Lafabregue, B.; Weber, J.; Gançarski, P.; Forestier, G. End-to-end deep representation learning for time series clustering: A comparative study. Data Min. Knowl. Discov. 2022, 36, 29–81.
24. Xie, J.; Girshick, R.; Farhadi, A. Unsupervised deep embedding for clustering analysis. In Proceedings of the 33rd International Conference on Machine Learning (ICML 2016), New York, NY, USA, 19–24 June 2016; pp. 478–487.
25. Guo, X.; Gao, L.; Liu, X.; Yin, J. Improved deep embedded clustering with local structure preservation. In Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI 2017), Melbourne, Australia, 19–25 August 2017; pp. 1753–1759.
26. Yang, B.; Fu, X.; Sidiropoulos, N.D.; Hong, M. Towards k-means-friendly spaces: Simultaneous deep learning and clustering. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, 6–11 August 2017; pp. 3861–3870.
27. Cuturi, M.; Blondel, M. Soft-DTW: A differentiable loss function for time-series. In Proceedings of the 34th International Conference on Machine Learning (ICML 2017), Sydney, Australia, 6–11 August 2017; pp. 894–903.
28. Cai, X.; Xu, T.; Yi, J.; Huang, J.; Rajasekaran, S. DTWNet: A dynamic time warping network. In Proceedings of the 33rd International Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, BC, Canada, 8–14 December 2019; pp. 10512–10522.
29. Liu, Y.; Wijewickrema, S.; Bester, C.; Song, Y.; Kasmarik, K.; Zhou, H. Time series representation learning with supervised contrastive temporal transformer. In Proceedings of the 2024 International Joint Conference on Neural Networks (IJCNN 2024), Yokohama, Japan, 30 June–5 July 2024; pp. 1–8.
30. Li, X.; Xi, W.; Lin, J. RandomNet: Clustering time series using untrained deep neural networks. Data Min. Knowl. Discov. 2024, 38, 3473–3502.
31. Peng, F.; Luo, J.; Lu, X.; Wu, F.; Xu, Y.; Liu, H.; Wang, L. Cross-domain contrastive learning for time series clustering. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI 2024), Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 8921–8929.
32. Liu, Z.; Wang, Y.; Vaidya, S. KAN: Kolmogorov–Arnold networks. arXiv 2024, arXiv:2404.19756.
33. Wen, Y.; Zhang, K.; Li, Z.; Qiao, Y. A discriminative feature learning approach for deep face recognition. In Proceedings of the 14th European Conference on Computer Vision (ECCV 2016), Amsterdam, The Netherlands, 11–14 October 2016; pp. 499–515.
34. Khosla, P.; Teterwak, P.; Wang, C.; Sarna, A.; Tian, Y.; Isola, P.; Maschinot, A.; Liu, C.; Krishnan, D. Supervised contrastive learning. In Proceedings of the 34th International Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, BC, Canada, 6–12 December 2020; Article 1567.
35. Dau, H.A.; Bagnall, A.; Kamgar, K.; Yeh, C.C.M.; Zhu, Y.; Gharghabi, S.; Ratanamahatana, C.A.; Keogh, E. The UCR time series archive. IEEE/CAA J. Autom. Sin. 2019, 6, 1293–1305.
36. Zhang, Q.; Wu, J.; Zhang, P.; Long, G.; Zhang, C. Salient subsequence learning for time series clustering. IEEE Trans. Pattern Anal. Mach. Intell. 2019, 41, 2193–2207.
Dataset | Classes | Train | Test | Description |
---|---|---|---|---|
Beef | 5 | 30 | 30 | Food industry application with limited sample size. |
ECG200 | 2 | 100 | 100 | Medical diagnostics involving short time series. |
GunPoint | 2 | 50 | 150 | Video-based sequences with medium to long durations. |
ItalyPowerDemand | 2 | 67 | 1029 | Energy demand forecasting with extremely short sequences. |
SonyAIBORobotSurface1 | 2 | 20 | 601 | Recognition under noisy environments with high task difficulty. |
Trace | 4 | 100 | 100 | Simulated control signals consisting of short sequences. |
StarLightCurves | 2 | 10,000 | 8000 | Astronomical observations with long sequences and large sample size. |
ETTh1 | - | 12,288 | 4096 | Power load data with hourly-resolution long sequences. |
ETTm1 | - | 69,120 | 17,280 | Power load data with ultra-long sequences at minute-level granularity. |
| Data | Standard Metrics | k-Shape (2015) | DTCR (2019) | TNC (2021) | TS2Vec (2022) | AutoShape (2022) | TCGAN (2023) | CDCC (2024) | k-Graph (2025) | KDiscShapeNet (Ours) |
|---|---|---|---|---|---|---|---|---|---|---|
| Beef | ARI | 0.786 | 0.116 | 0.425 | 0.077 | 0.778 | 0.426 | 0.781 | 0.859 | 0.776 |
| | NMI | 0.798 | 0.267 | 0.194 | 0.323 | 0.386 | 0.352 | 0.792 | 0.842 | 0.962 |
| | Silhouette | 0.598 | 0.251 | 0.665 | 0.613 | 0.358 | 0.603 | 0.511 | 0.612 | 0.986 |
| ECG200 | ARI | 0.903 | 0.192 | 0.311 | 0.039 | 0.759 | 0.325 | 0.764 | 0.533 | 0.946 |
| | NMI | 0.881 | 0.292 | 0.219 | 0.267 | 0.393 | 0.245 | 0.775 | 0.821 | 0.802 |
| | Silhouette | 0.621 | 0.546 | 0.612 | 0.509 | 0.426 | 0.574 | 0.498 | 0.563 | 0.947 |
| GunPoint | ARI | 0.844 | 0.462 | 0.206 | 0.416 | 0.705 | 0.516 | 0.845 | 0.962 | 0.921 |
| | NMI | 0.861 | 0.110 | 0.115 | 0.016 | 0.403 | 0.420 | 0.853 | 0.931 | 0.862 |
| | Silhouette | 0.673 | 0.556 | 0.664 | 0.757 | 0.598 | 0.601 | 0.572 | 0.739 | 0.786 |
| ItalyPowerDemand | ARI | 0.799 | 0.221 | 0.248 | 0.046 | 0.835 | 0.281 | 0.621 | 0.756 | 0.953 |
| | NMI | 0.812 | 0.102 | 0.292 | 0.108 | 0.635 | 0.331 | 0.633 | 0.701 | 0.980 |
| | Silhouette | 0.641 | 0.674 | 0.483 | 0.755 | 0.517 | 0.513 | 0.401 | 0.497 | 0.774 |
| SonyAIBORobotSurface1 | ARI | 0.864 | 0.313 | 0.076 | 0.207 | 0.895 | 0.161 | 0.693 | 0.745 | 0.875 |
| | NMI | 0.870 | 0.347 | 0.416 | 0.247 | 0.610 | 0.274 | 0.702 | 0.828 | 0.951 |
| | Silhouette | 0.622 | 0.237 | 0.378 | 0.610 | 0.676 | 0.494 | 0.455 | 0.587 | 0.879 |
| Trace | ARI | 0.896 | 0.328 | 0.182 | 0.424 | 0.922 | 0.225 | 0.872 | 0.801 | 0.916 |
| | NMI | 0.911 | 0.508 | 0.281 | 0.455 | 0.948 | 0.332 | 0.881 | 0.777 | 0.927 |
| | Silhouette | 0.903 | 0.192 | 0.311 | 0.039 | 0.759 | 0.325 | 0.596 | 0.533 | 0.946 |
| StarLightCurves | ARI | 0.703 | 0.245 | 0.331 | 0.643 | 0.802 | 0.458 | 0.804 | 0.532 | 0.931 |
| | NMI | 0.732 | 0.506 | 0.040 | 0.485 | 0.933 | 0.382 | 0.816 | 0.902 | 0.840 |
| | Silhouette | 0.754 | 0.603 | 0.308 | 0.534 | 0.915 | 0.490 | 0.538 | 0.903 | 0.902 |
| Data | Standard Metrics | T-Loss (2019) | DTS-Cluster (2021) | TNC + KMeans (2021) | TS2Vec + KMeans (2022) | DTCC (2022) | AutoShape (2022) | KDiscShapeNet (Ours) |
|---|---|---|---|---|---|---|---|---|
| ETTh1 | ACC | 0.762 | 0.802 | 0.756 | 0.771 | 0.788 | 0.795 | 0.926 |
| | ARI | 0.690 | 0.733 | 0.672 | 0.685 | 0.712 | 0.721 | 0.751 |
| | NMI | 0.710 | 0.754 | 0.695 | 0.709 | 0.738 | 0.743 | 0.917 |
| | Silhouette | 0.414 | 0.451 | 0.390 | 0.403 | 0.428 | 0.436 | 0.619 |
| ETTm1 | ACC | 0.754 | 0.798 | 0.747 | 0.763 | 0.779 | 0.785 | 0.904 |
| | ARI | 0.678 | 0.729 | 0.658 | 0.671 | 0.701 | 0.715 | 0.985 |
| | NMI | 0.702 | 0.744 | 0.682 | 0.695 | 0.723 | 0.731 | 0.981 |
| | Silhouette | 0.398 | 0.438 | 0.375 | 0.391 | 0.412 | 0.427 | 0.741 |
Experiment | KAN Encoder | SupCon Loss | Center Loss | ARI | NMI | Silhouette |
---|---|---|---|---|---|---|
A | ✗ | ✗ | ✗ | 0.594 | 0.689 | 0.307 |
B | ✔ | ✗ | ✗ | 0.541 | 0.509 | 0.321 |
C | ✔ | ✔ | ✗ | 0.651 | 0.735 | 0.464 |
D | ✔ | ✗ | ✔ | 0.934 | 0.940 | 0.879 |
E | ✗ | ✔ | ✔ | 0.950 | 0.958 | 0.781 |
F | ✗ | ✔ | ✗ | 0.912 | 0.925 | 0.613 |
G | ✗ | ✗ | ✔ | 0.650 | 0.735 | 0.846 |
H | ✔ | ✔ | ✔ | 0.916 | 0.927 | 0.931 |