Hybrid Clustering for Retail Demand Forecasting: Combining Rule-Based and Machine Learning Methods
Highlights
- This study proposes an adaptive hybrid clustering framework that integrates rule-based and machine learning approaches to address the intermittent and heterogeneous demand patterns characteristic of FMCG retail environments.
- The results show that hybrid forecasting models incorporating demand-pattern embeddings achieve a lower MAE than the best single-algorithm baseline in 14 of the 15 cluster-level comparisons reported across the identified demand segments.
- Since no single clustering method demonstrates universal superiority, practitioners are advised to adopt a context-sensitive strategy, selecting rule-based or machine learning approaches based on the characteristics of demand patterns.
- A diagnostic heuristic derived from preliminary clustering statistics can reduce experimental overhead by up to 50%, facilitating more resource-efficient model selection in large-scale retail settings.
Abstract
1. Introduction
- Practical Applicability: The hybrid clustering framework for retail demand forecasting proposed in this study addresses limitations of existing clustering-based hybrid methods. It compares rule-based and ML clustering, whose relative performance had not previously been validated, so that an effective method can be selected according to data characteristics. Because the approach uses actual sales data, the results are directly applicable in practice.
- Utilization of Embedding-Based Representation Learning: Embedding-based representation learning is a core strength of this study. Time series embeddings transform product-specific sales patterns into fixed-length vectors, which improves clustering when combined with unsupervised learning methods in the ML clustering phase. Under both the rule-based and ML methods, the embeddings are then combined with baseline algorithms and exogenous variables in the forecasting phase to improve product-level forecasts.
- Explainable Model: This study uses XGBoost-based feature importance analysis to identify the key factors influencing predictions. Because the retail data contain no promotions, the analysis isolates the variables that affect actual demand.
- Feature Diversity: The diversity of features is a key contributor to improved forecasting in this study. It integrates time series data, sales data (domestic/import classification, category, first shipment date, sales start date, price), economic data (Consumer Price Index (CPI), unemployment rate, West Texas Intermediate (WTI) oil price, retail sales index), and weather data (average temperature, average relative humidity, average wind speed).
2. Related Work
2.1. Traditional Forecasting Methods
2.2. Machine Learning-Based Forecasting Methods
2.3. Hybrid Models
2.4. Research Trends in Clustering-Based Hybrid Demand Forecasting for Retail
3. Methodology
3.1. Dataset
3.2. Data Preprocessing and Feature Engineering
3.3. Clustering Methods for Demand Forecasting
3.3.1. Rule-Based Clustering
3.3.2. Machine Learning-Based Clustering
3.4. Hybrid Forecasting Framework
3.5. Forecasting Performance Evaluation
4. Results and Discussion
4.1. Evaluation of ML-Based Clustering
4.2. Clustering Results: Rule-Based vs. ML Methods
4.3. Performance Evaluation of Cluster-Level Forecasting Models
4.4. Feature Importance Analysis
4.5. Comparison of Rule-Based and ML Methods
4.6. Robustness Validation Under Alternative Data Splitting
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Al Orbani, M. SKU Time Series Forecasting Methods for FMCGs. Master’s Thesis, Rochester Institute of Technology, Dubai, United Arab Emirates, 2022. Available online: https://repository.rit.edu/theses/11172 (accessed on 10 January 2026).
- Abolghasemi, M.; Gerlach, R.; Tarr, G.; Beh, E. Demand Forecasting in Supply Chain: The Impact of Demand Volatility in the Presence of Promotion. arXiv 2019, arXiv:1909.13084.
- Mejia, S.; Aguilar, J. A Demand Forecasting System of Product Categories Defined by Their Time Series Using a Hybrid Approach of Ensemble Learning with Feature Engineering. Computing 2024, 106, 1765–1784.
- Chen, I.-F.; Lu, C.-J. Demand Forecasting for Multichannel Fashion Retailers by Integrating Clustering and Machine Learning Algorithms. Processes 2021, 9, 1578.
- Petropoulos, F.; Kourentzes, N. Forecast Combinations for Intermittent Demand. J. Oper. Res. Soc. 2015, 66, 914–924.
- Li, L.; Kang, Y.; Petropoulos, F.; Li, F. Feature-Based Intermittent Demand Forecast Combinations: Bias, Accuracy and Inventory Implications. arXiv 2022, arXiv:2204.08283.
- Afifi, A.A. Demand Forecasting of Short Life Cycle Products Using Data Mining Techniques. In Artificial Intelligence Applications and Innovations; Maglogiannis, I., Iliadis, L., Pimenidis, E., Eds.; IFIP Advances in Information and Communication Technology; Springer: Cham, Switzerland, 2020; Volume 583, pp. 151–162.
- Puspita, R.; Wulandhari, L. Hardware Sales Forecasting Using Clustering and Machine Learning Approach. IAES Int. J. Artif. Intell. 2022, 11, 1074–1084.
- Paruthipattu, S.P. Demand Forecasting Based on External Factors Using Clustering and Machine Learning. Master’s Thesis, National College of Ireland, Dublin, Ireland, 2021. Available online: https://norma.ncirl.ie/id/eprint/6251 (accessed on 1 February 2026).
- David, E.; Bellot, J.; Le Corff, S. HERMES: Hybrid Error-Corrector Model with Inclusion of External Signals for Nonstationary Fashion Time Series. arXiv 2022, arXiv:2202.03224.
- Dincer, K.F.; Turgay, S. Balancing Demand and Supply: Inventory Allocation in FMCG. Ind. Eng. Innov. Manag. 2023, 6, 41–49.
- Olatunji, A.O. Leveraging Data Science for Demand Forecasting and Inventory Management: A Comprehensive Review. J. Basic Appl. Res. Int. 2025, 31, 29–38.
- Khakpour, A. Data Science for Decision Support: Using Machine Learning and Big Data in Sales Forecasting for Production and Retail. Master’s Thesis, Østfold University College, Halden, Norway, 2020. Available online: https://hdl.handle.net/11250/2660428 (accessed on 25 March 2026).
- Suddala, S. Dynamic Demand Forecasting in Supply Chains Using Hybrid ARIMA–LSTM Architectures. Int. J. Adv. Res. 2024, 12, 1167–1171.
- Punia, S.; Singh, S.P.; Madaan, J.K. Predictive Analytics for Demand Forecasting: A Deep Learning-Based Decision Support System. Knowl.-Based Syst. 2022, 258, 109956.
- Fattah, J.; Ezzine, L.; Aman, Z.; El Moussami, H.; Lachhab, A. Forecasting of Demand Using ARIMA Model. Int. J. Eng. Bus. Manag. 2018, 10, 1847979018808673.
- Tirkes, G.; Guray, C.; Celebi, N. Demand Forecasting: Comparison Between Holt-Winters Model, Trend Analysis and Decomposition Models. Teh. Vjesn. 2017, 24, 503–509.
- Wang, G. Sales Forecasting for Firms Based on Multiple Regression Model. In Proceedings of the International Conference on Big Data Economy and Digital Management (BDEDM); SciTePress: Setúbal, Portugal, 2022; pp. 628–633.
- Lukman, A.F.; Farghali, R.A.; Kibria, B.G.; Oluyemi, O.A. Robust-Stein Estimator for Overcoming Outliers and Multicollinearity. Sci. Rep. 2023, 13, 9066.
- Ahmed, A.M. Accelerate Demand Forecasting by Hybridizing CatBoost with the Dingo Optimization Algorithm to Support Supply Chain Conceptual Framework Precisely. Front. Sustain. 2024, 5, 1388771.
- Roy, K.; Ishmam, A.; Abu Taher, K. Demand Forecasting in Smart Grid Using Long Short-Term Memory. arXiv 2021, arXiv:2107.13653.
- Oliveira, J.M.; Ramos, P. Evaluating the Effectiveness of Time Series Transformers for Demand Forecasting in Retail. Mathematics 2024, 12, 2728.
- Zhang, G.P. Time Series Forecasting Using a Hybrid ARIMA and Neural Network Model. Neurocomputing 2003, 50, 159–175.
- Falatouri, T.; Darbanian, F.; Brandtner, P.; Udokwu, C. Predictive Analytics for Demand Forecasting—A Comparison of SARIMA and LSTM in Retail SCM. Procedia Comput. Sci. 2022, 200, 993–1003.
- Liu, Z.; Zhang, Z.; Zhang, W. A Hybrid Framework Integrating Traditional Models and Deep Learning for Multi-Scale Time Series Forecasting. Entropy 2025, 27, 695.
- Seyedan, M.; Mafakheri, F.; Wang, C. Cluster-Based Demand Forecasting Using Bayesian Model Averaging: An Ensemble Learning Approach. Decis. Anal. J. 2022, 3, 100033.
- Ozturk, Z.K.; Cetin, Y.; Isik, Y.; Cicek, Z.I.E. Demand Forecasting with Clustering and Artificial Neural Networks Methods: An Application for Stock Keeping Units. In Modeling, Dynamics, Optimization and Bioeconomics IV; Pinto, A., Zilberman, D., Eds.; Springer Proceedings in Mathematics & Statistics; Springer: Cham, Switzerland, 2021; Volume 365, pp. 275–292.
- Duan, G.; Dong, J. Construction of Ensemble Learning Model for Home Appliance Demand Forecasting. Appl. Sci. 2024, 14, 7658.
- Zhang, Y.; Ren, G.; Liu, X.; Gao, G.; Zhu, M. Ensemble Learning-Based Modeling and Short-Term Forecasting Algorithm for Time Series with Small Sample. Eng. Rep. 2021, 4, e12486.
- Smyl, S. A Hybrid Method of Exponential Smoothing and Recurrent Neural Networks for Time Series Forecasting. Int. J. Forecast. 2020, 36, 75–85.
- Büyükşahin, Ü.Ç.; Ertekin, Ş. Improving Forecasting Accuracy of Time Series Data Using a New ARIMA–ANN Hybrid Method and Empirical Mode Decomposition. arXiv 2018, arXiv:1812.11526.
- Rhif, M.; Ben Abbes, A.; Martínez, B.; Farah, I.R. Veg-W2TCN: A Parallel Hybrid Forecasting Framework for Non-Stationary Time Series Using Wavelet and Temporal Convolution Network Model. Appl. Soft Comput. 2023, 137, 110172.
- Hassanpouri Baesmat, K.; Farrokhi, Z.; Chmaj, G.; Regentova, E.E. Parallel Multi-Model Energy Demand Forecasting with Cloud Redundancy: Leveraging Trend Correction, Feature Selection, and Machine Learning. Forecasting 2025, 7, 25.
- Cawood, P.; van Zyl, T.L. Feature-Weighted Stacking for Nonseasonal Time Series Forecasts: A Case Study of the COVID-19 Epidemic Curves. arXiv 2021, arXiv:2108.08723.
- Godahewa, R.; Bergmeir, C.; Webb, G.I.; Montero-Manso, P. An Accurate and Fully-Automated Ensemble Model for Weekly Time Series Forecasting. arXiv 2020, arXiv:2010.08158.
- Molina-Tenorio, Y.; Prieto-Guerrero, A.; Rodriguez-Colina, E.; Vásquez-Toledo, L.A.; Olvera-Guerrero, O.A. Gramian Angular Field and Convolutional Neural Networks for Real-Time Multiband Spectrum Sensing in Cognitive Radio Networks. Sensors 2025, 25, 3580.
- Nie, Y.; Nguyen, N.H.; Sinthong, P.; Kalagnanam, J. A Time Series Is Worth 64 Words: Long-Term Forecasting with Transformers. arXiv 2022, arXiv:2211.14730.
- Fisher, M.; Rajaram, K. Accurate Retail Testing of Fashion Merchandise: Methodology and Application. Mark. Sci. 2000, 19, 266–278.
- Bala, P.K. Improving Inventory Performance with Clustering-Based Demand Forecasts. J. Model. Manag. 2012, 7, 23–37.
- İşlek, İ.; Öğüdücü, Ş.G. A Retail Demand Forecasting Model Based on Data Mining Techniques. In Proceedings of the 2015 IEEE 24th International Symposium on Industrial Electronics (ISIE), Buzios, Brazil, 3–5 June 2015; pp. 55–60.
- Pereira, M.M.; Frazzon, E.M. Towards a Predictive Approach for Omni-Channel Retailing Supply Chains. IFAC-PapersOnLine 2019, 52, 844–850.
- Benhamida, F.Z.; Kaddouri, O.; Ouhrouche, T.; Benaichouche, M.; Casado-Mansilla, D.; López-de-Ipiña, D. Stock&Buy: A New Demand Forecasting Tool for Inventory Control. In Proceedings of the 2020 5th International Conference on Smart and Sustainable Technologies (SpliTech), Split, Croatia, 23–26 September 2020; pp. 1–6.
- Giri, C.; Chen, Y. Deep Learning for Demand Forecasting in the Fashion and Apparel Retail Industry. Forecasting 2022, 4, 565–581.
- Cohen, M.C.; Zhang, R.; Jiao, K. Data Aggregation and Demand Prediction. Oper. Res. 2022, 70, 2597–2618.
- van Ruitenbeek, R.E.; Koole, G.M.; Bhulai, S. A Hierarchical Agglomerative Clustering for Product Sales Forecasting. Decis. Anal. J. 2023, 8, 100318.
- Soltani, M.; Khatami Firouzabadi, S.M.A.; Amiri, M.; Hajian Heidary, M. Proposing an Integrated Approach for Omnichannel Demand Forecasting Using Machine Learning–Time Series Clustering with Dynamic Time Warping Algorithm and Artificial Neural Networks. Res. Prod. Oper. Manag. 2023, 14, 121–140.
- Malik, A.; Dargar, G.; Sharma, A.; Pandey, P. Predictive Analysis for Retail Shops Using Machine Learning for Maximizing Revenue. In Proceedings of the 2023 7th International Conference on Intelligent Computing and Control Systems (ICICCS), Madurai, India, 17–19 May 2023; pp. 126–133.
- Mitra, R.; Saha, P.; Tiwari, M.K. Sales Forecasting of a Food and Beverage Company Using Deep Clustering Frameworks. Int. J. Prod. Res. 2023, 62, 3320–3332.
- Mikkilineni, B.S.; Madala, U.; Bonthagorla, R.S.; Parikala, Y.P.; Kumar, V.P.; Kishore, V.K. An Experimental Study on Prediction of Revenue and Customer Segmentation. In Proceedings of the 2024 8th International Conference on Inventive Systems and Control (ICISC), Coimbatore, India, 4–6 January 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 500–507.
- Stylianou, T.; Pantelidou, A. A Machine Learning Approach to Consumer Behavior in Supermarket Analytics. Decis. Anal. J. 2025, 16, 100600.
- Poslavskaya, E.; Korolev, A. Encoding Categorical Data: Is There Yet Anything ‘Hotter’ Than One-Hot Encoding? arXiv 2023, arXiv:2312.16930.
- Pinheiro, J.M.H.; Oliveira, S.V.B.; Silva, T.H.S.; Saraiva, P.A.R.; de Souza, E.F.; Godoy, R.V.; Ambrosio, L.A.; Becker, M. The Impact of Feature Scaling in Machine Learning: Effects on Regression and Classification Tasks. arXiv 2025, arXiv:2506.08274.
- Yin, H.; Aryani, A.; Petrie, S.; Nambissan, A.; Astudillo, A.; Cao, S. A Rapid Review of Clustering Algorithms. arXiv 2024, arXiv:2401.07389.
- Gao, J.; Hu, W.; Chen, Y. Revisiting PCA for Time Series Reduction in Temporal Dimension. arXiv 2024, arXiv:2412.19423.
- Liang, Z.; Zhang, J.; Liang, C.; Wang, H.; Liang, Z.; Pan, L. A Shapelet-Based Framework for Unsupervised Multivariate Time Series Representation Learning. Proc. VLDB Endow. 2023, 17, 386–399.
- Irani, H.; Ghahremani, Y.; Kermani, A.; Metsis, V. Time Series Embedding Methods for Classification Tasks: A Review. arXiv 2025, arXiv:2501.13392.
- Yue, Z.; Wang, Y.; Duan, J.; Yang, T.; Huang, C.; Tong, Y.; Xu, B. TS2Vec: Towards Universal Representation of Time Series. arXiv 2021, arXiv:2106.10466.
- Kanungo, T.; Mount, D.M.; Netanyahu, N.S.; Piatko, C.D.; Silverman, R.; Wu, A.Y. An Efficient K-Means Clustering Algorithm: Analysis and Implementation. IEEE Trans. Pattern Anal. Mach. Intell. 2002, 24, 881–892.
- Murtagh, F.; Legendre, P. Ward’s Hierarchical Agglomerative Clustering Method: Which Algorithms Implement Ward’s Criterion? J. Classif. 2014, 31, 274–295.
- Syntetos, A.A.; Boylan, J.E.; Croston, J.D. On the Categorization of Demand Patterns. J. Oper. Res. Soc. 2005, 56, 495–503.
- Garine, R. Enhanced E-Commerce Demand Prediction through Ensemble Models and Optuna-Based Hyperparameter Optimization. In Proceedings of the 2024 2nd DMIHER International Conference on Artificial Intelligence in Healthcare, Education and Industry (IDICAIEI), Wardha, India, 29–30 November 2024; pp. 1–7.
- Chukwuemeka, U.M.; Nnalue, A.D.; Obiekwe, S.J.; Maruf, F.A.; Anakor, A.C.; Moses, M.O.; Amaechi, C.; Okonkwo, U.P.; Amaechi, I.A. Comparative Validity Assessment of Three Android Step Counter Applications: A Semi-Structured Laboratory-Based Study. BMC Digit. Health 2025, 3, 20.
- Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.



| Ref. | Year | Data Source | Clustering Algorithms | Forecasting Algorithms | Performance Indicators | Feature |
|---|---|---|---|---|---|---|
| [38] | 2000 | Fashion Retail | K-Median | Linear Regression | revenue | Location, Sales, AVG Temp |
| [39] | 2012 | Supermarket Retail | K-Means | ARIMA, SARIMA, ANN | decrease in inventory | Customer/Transaction information |
| [5] | 2014 | Royal Air Force | SBC | KH, KH-SES | MAPE | Sales |
| [40] | 2015 | Food Retail | Bipartite Graph | Bayesian Network, MLP | MAPE | Warehouse/Product Properties, Sales |
| [41] | 2019 | Omni-channel Retail | K-Means | ANN | MSE | Online/Offline Sales |
| [7] | 2020 | IT e-commerce Retail | K-Means | OneR, Naive Bayes, KNN, RIPPER, C4.5, Rules-6 | MAPE | e-commerce Sales |
| [42] | 2020 | Online Retail (Stock&Buy) | ClustAvg | Theta, ARIMA, MLP | Accuracy | Sales |
| [4] | 2021 | Fashion Retail | K-Means | ELM, SVR | MAPE, RMSE | Online/Offline Sales, Weather |
| [9] | 2021 | Kaggle Supermarket Retail | HAC | RF, XGBoost, LSTM+RF | RMSE, MAE | Transactions, Items, Stores, Holiday events, Oil prices |
| [6] | 2022 | M5 Walmart, Spare Parts Retail | SBC | SES, ARIMA, CROSTON | Inventory Decision Insights | Sales |
| [8] | 2022 | IT Hardware Retail | K-Means, AHC, GMM | ARIMA, RNN-LSTM | Cost | Sales/Stock/Customer Information |
| [25] | 2022 | Sports Retail | K-Means | LSTM, Prophet, Bayesian | Accuracy | Sales, Stores, Customers, Products, Delivery |
| [43] | 2022 | Fashion Retail | K-Means | SVM, RF, NN | MAE, RMSE | Sales |
| [44] | 2022 | Online Retail | K-Means | GLM | Accuracy | Sales, Product |
| [45] | 2023 | Bicycle accessories Retail | HAC | Regression | Accuracy | Sales, Product, Promotion |
| [46] | 2023 | Omni-channel Retail | DTW | ANN | RMSE | Sales |
| [47] | 2023 | Kaggle Retail | K-Means | Linear Regression, RF, XGBoost, LSTM | Accuracy | Sales, Customer, Product |
| [48] | 2024 | Food & Beverage Retail | GMM, HAC | RF | Accuracy | Sales, Customer, Region, Distribution, Product, Promotion |
| [49] | 2024 | Kaggle Retail | LSTM | RF | Accuracy | Sales, Product, Location |
| [50] | 2025 | Walmart Retail | K-Means | ARIMA | Improve inventory management | Sales, Product, Promotion |
| Feature Group | No. | Column | Description | Data Type |
|---|---|---|---|---|
| Time Series Feature (8) | 1 | DATE | Date | DATETIME64 |
| | 2 | YEAR | Year | INT64 |
| | 3 | MONTH | Month | INT64 |
| | 4 | WEEK | Week | INT64 |
| | 5 | TOTAL_HOLIDAY_CNT | Number of Holidays | INT64 |
| | 6 | LAG1 | Sales Quantity 1 Week Ago | FLOAT64 |
| | 7 | LAG1_4W_ROLLING_AVG | Average Sales Quantity over the Past 4 Weeks | FLOAT64 |
| | 8 | W_QTY | Sales Quantity | INT64 |
| Product Feature (7) | 9 | CATEGORY | Product Category | OBJECT |
| | 10 | PRODUCT_CODE | Product Code | INT64 |
| | 11 | PRODUCT_NAME | Product Name | OBJECT |
| | 12 | PRICE | Unit Price | INT64 |
| | 13 | FIRST_SHIPMENT_DATE | Initial Release Date | INT64 |
| | 14 | START_SALES_DATE | Sales Start Date | INT64 |
| | 15 | ORIGIN_TYPE | Domestic/Import Classification | OBJECT |
| Weather Feature (3) | 16 | AVG_TEMPERATURE | Average Temperature | FLOAT64 |
| | 17 | AVG_HUMIDITY | Average Humidity | FLOAT64 |
| | 18 | AVG_WIND_SPEED | Average Wind Speed | FLOAT64 |
| Economy Feature (4) | 19 | CPI | Consumer Price Index | FLOAT64 |
| | 20 | UNEMPLOYMENT_RATE | Unemployment Rate | FLOAT64 |
| | 21 | OIL_PRICE | West Texas Intermediate (WTI) | FLOAT64 |
| | 22 | RETAIL_SALES_INDEX | Retail Sales Index | FLOAT64 |
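
The two lag-based columns in the feature table (LAG1 and LAG1_4W_ROLLING_AVG) can be sketched for a single product as follows. This is a minimal illustration, not the study's preprocessing code: the helper name `lag_features` is hypothetical, and the rolling average is assumed to cover the four weeks preceding the current week so that the target W_QTY does not leak into its own feature.

```python
from typing import List, Optional


def lag_features(weekly_qty: List[float]) -> List[dict]:
    """Build LAG1 and LAG1_4W_ROLLING_AVG for each week of one product."""
    rows = []
    for t, qty in enumerate(weekly_qty):
        # LAG1: sales quantity one week earlier (undefined for the first week).
        lag1: Optional[float] = weekly_qty[t - 1] if t >= 1 else None
        # Assumed definition: mean of the four weeks before week t.
        window = weekly_qty[t - 4:t] if t >= 4 else []
        rolling: Optional[float] = sum(window) / 4 if window else None
        rows.append({"W_QTY": qty, "LAG1": lag1, "LAG1_4W_ROLLING_AVG": rolling})
    return rows


rows = lag_features([10, 12, 8, 14, 16, 20])
print(rows[4])
```

The first weeks carry `None` because no history exists yet; how such rows are treated in the study (dropped or imputed) is not stated in the source.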
| Center | No. | Embedding | Model | Feature Variables | K | SC | DBI |
|---|---|---|---|---|---|---|---|
| A | 1 | PCA | K-Means | Time Series | 3 | 0.4846 | 0.9419 |
| A | 2 | PCA | K-Means | Time Series + Sales | 3 | 0.4785 | 0.9378 |
| A | 3 | TS2Vec | K-Means | Time Series | 3 | 0.5933 | 0.6866 |
| A | 4 | PatchTST | K-Means | Time Series | 5 | 0.6038 | 0.6218 |
| A | 5 | GAF-CNN | K-Means | Time Series | 2 | 0.5055 | 0.8910 |
| A | 6 | PCA | HAC | Time Series | 3 | 0.4192 | 0.9428 |
| A | 7 | TS2Vec | HAC | Time Series | 3 | 0.5902 | 0.6586 |
| A | 8 | PatchTST | HAC | Time Series | 5 | 0.5287 | 0.6359 |
| A | 9 | PCA | HAC | Time Series + Sales | 3 | 0.3920 | 0.9592 |
| A | 10 | GAF-CNN | HAC | Time Series | 2 | 0.6434 | 0.7152 |
| A | 11 | PCA | GMM | Time Series | 3 | 0.1324 | 1.7658 |
| A | 12 | TS2Vec | GMM | Time Series | 3 | 0.1869 | 1.5709 |
| A | 13 | PatchTST | GMM | Time Series | 3 | 0.6488 | 0.8826 |
| A | 14 | PCA | GMM | Time Series + Sales | 3 | 0.1446 | 1.6398 |
| A | 15 | GAF-CNN | GMM | Time Series | 2 | 0.5336 | 0.7954 |
| B | 1 | PCA | K-Means | Time Series | 3 | 0.4699 | 0.9352 |
| B | 2 | PCA | K-Means | Time Series + Sales | 3 | 0.4655 | 0.9288 |
| B | 3 | TS2Vec | K-Means | Time Series | 3 | 0.5782 | 0.7021 |
| B | 4 | PatchTST | K-Means | Time Series | 3 | 0.5950 | 0.6570 |
| B | 5 | GAF-CNN | K-Means | Time Series | 2 | 0.5358 | 0.8310 |
| B | 6 | PCA | HAC | Time Series | 3 | 0.5801 | 0.9680 |
| B | 7 | TS2Vec | HAC | Time Series | 3 | 0.5453 | 0.6849 |
| B | 8 | PatchTST | HAC | Time Series | 4 | 0.5565 | 0.7410 |
| B | 9 | PCA | HAC | Time Series + Sales | 3 | 0.4533 | 0.9360 |
| B | 10 | GAF-CNN | HAC | Time Series | 2 | 0.6023 | 0.7152 |
| B | 11 | PCA | GMM | Time Series | 3 | 0.2489 | 1.7731 |
| B | 12 | TS2Vec | GMM | Time Series | 3 | 0.1227 | 1.8953 |
| B | 13 | PatchTST | GMM | Time Series | 4 | 0.6339 | 0.6165 |
| B | 14 | PCA | GMM | Time Series + Sales | 3 | 0.2179 | 1.6418 |
| B | 15 | GAF-CNN | GMM | Time Series | 2 | 0.4791 | 0.9863 |
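
Assuming that SC denotes the silhouette coefficient (higher is better) and DBI the Davies-Bouldin index (lower is better), the two clustering-quality scores in the table above can be illustrated with a small pure-Python sketch. The toy points stand in for embedding vectors and the function names are illustrative; the study's own tooling is not specified in the source.

```python
import math
from typing import Dict, List, Sequence

Point = Sequence[float]


def silhouette(points: List[Point], labels: List[int]) -> float:
    """Mean silhouette coefficient; assumes at least two clusters."""
    scores = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, grouped by cluster label.
        by_cluster: Dict[int, List[float]] = {}
        for j, q in enumerate(points):
            if i != j:
                by_cluster.setdefault(labels[j], []).append(math.dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own:
            scores.append(0.0)  # singleton cluster: score is 0 by convention
            continue
        a = sum(own) / len(own)                                # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


def davies_bouldin(points: List[Point], labels: List[int]) -> float:
    """Davies-Bouldin index: mean worst-case (scatter_i + scatter_j) / centroid distance."""
    clusters: Dict[int, List[Point]] = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centroids = {k: [sum(c) / len(ps) for c in zip(*ps)] for k, ps in clusters.items()}
    scatter = {k: sum(math.dist(p, centroids[k]) for p in ps) / len(ps)
               for k, ps in clusters.items()}
    worst = [max((scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
                 for j in clusters if j != i) for i in clusters]
    return sum(worst) / len(worst)


toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
toy_labels = [0, 0, 1, 1]
sc = silhouette(toy_points, toy_labels)
dbi = davies_bouldin(toy_points, toy_labels)
print(f"SC={sc:.4f}, DBI={dbi:.4f}")
```

Two tight, well-separated groups give a silhouette near 1 and a Davies-Bouldin index near 0, which is the direction in which the better rows of the table move.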
| Center | K | Products in Cluster 1 (Smooth) | Products in Cluster 2 (Intermittent) | Products in Cluster 3 (Erratic) | Products in Cluster 4 (Lumpy) |
|---|---|---|---|---|---|
| A | 4 | 9451 | 117 | 2139 | 954 |
| B | 4 | 9608 | 121 | 2017 | 915 |
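
The four rule-based classes above (Smooth, Intermittent, Erratic, Lumpy) follow the Syntetos-Boylan-Croston (SBC) scheme, which splits series by the average inter-demand interval (ADI) and the squared coefficient of variation (CV²) of non-zero demand sizes. The sketch below uses the cut-offs commonly associated with that scheme (ADI = 1.32, CV² = 0.49); whether the study applied exactly these cut-offs and this CV² definition is an assumption, and `sbc_class` is an illustrative name.

```python
from statistics import mean, pstdev
from typing import List

ADI_CUTOFF = 1.32  # average inter-demand interval cut-off (assumed)
CV2_CUTOFF = 0.49  # squared coefficient of variation cut-off (assumed)


def sbc_class(weekly_qty: List[float]) -> str:
    """Assign one weekly demand series to an SBC demand-pattern class."""
    nonzero = [q for q in weekly_qty if q > 0]
    if not nonzero:
        raise ValueError("series contains no demand")
    # ADI: number of periods per period with demand.
    adi = len(weekly_qty) / len(nonzero)
    # CV^2 of the non-zero demand sizes.
    cv2 = (pstdev(nonzero) / mean(nonzero)) ** 2
    if adi < ADI_CUTOFF:
        return "Smooth" if cv2 < CV2_CUTOFF else "Erratic"
    return "Intermittent" if cv2 < CV2_CUTOFF else "Lumpy"


print(sbc_class([10, 11, 9, 10, 12, 10]))     # regular demand, stable sizes
print(sbc_class([0, 0, 1, 0, 0, 50, 0, 2]))   # sparse demand, unstable sizes
```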
| Center | K | Products in Cluster 1 | Products in Cluster 2 | Products in Cluster 3 | Products in Cluster 4 |
|---|---|---|---|---|---|
| A | 3 | 2513 | 10,008 | 140 | - |
| B | 4 | 10,211 | 1801 | 214 | 435 |
| Cluster (Product Quantity) | Metric | Phase 1: ARIMA | Phase 1: RF | Phase 1: XGBoost | Phase 1: LSTM | Phase 1: Autoformer | Phase 2: GAF-CNN-XGBoost | Phase 2: GAF-CNN-RF | Phase 2: PatchTST-XGBoost | Phase 2: PatchTST-RF | Phase 2: TS2Vec-XGBoost | Phase 2: TS2Vec-RF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster_1 (9451) | MAE | 249.72 | 164.83 | 167.39 | 194.14 | 187.71 | 144.30 | 176.33 | 150.67 | 143.11 | 195.26 | 188.49 |
| | RMSE | 267.94 | 182.65 | 186.81 | 211.27 | 207.07 | 162.78 | 195.00 | 170.57 | 163.87 | 217.11 | 207.69 |
| | MAPE (%) | 33.51 | 23.78 | 23.88 | 27.81 | 27.88 | 22.29 | 26.11 | 22.87 | 22.08 | 27.58 | 27.08 |
| | MASE | 2.85 | 1.93 | 1.94 | 2.25 | 2.12 | 1.62 | 1.98 | 1.67 | 1.59 | 2.13 | 2.08 |
| Cluster_2 (117) | MAE | 219.87 | 84.81 | 85.36 | 87.77 | 86.31 | 79.62 | 91.42 | 80.14 | 74.40 | 107.30 | 101.20 |
| | RMSE | 228.59 | 93.52 | 95.68 | 97.49 | 96.64 | 90.83 | 101.02 | 91.72 | 84.45 | 118.09 | 111.04 |
| | MAPE (%) | 37.01 | 14.09 | 14.59 | 16.95 | 14.97 | 13.78 | 15.08 | 13.54 | 12.52 | 17.24 | 16.35 |
| | MASE | 4.54 | 2.01 | 2.08 | 2.04 | 1.84 | 1.75 | 1.94 | 1.74 | 1.57 | 2.16 | 2.03 |
| Cluster_3 (2139) | MAE | 538.26 | 268.12 | 260.91 | 429.59 | 351.77 | 245.42 | 342.29 | 240.97 | 256.65 | 348.58 | 321.90 |
| | RMSE | 601.13 | 329.70 | 323.01 | 489.66 | 411.27 | 311.45 | 409.58 | 299.27 | 318.44 | 416.21 | 384.66 |
| | MAPE (%) | 64.60 | 51.52 | 49.93 | 62.49 | 55.10 | 46.63 | 57.46 | 46.75 | 49.69 | 54.50 | 54.17 |
| | MASE | 4.14 | 2.51 | 2.43 | 3.69 | 2.90 | 2.11 | 3.08 | 2.12 | 2.26 | 3.12 | 2.94 |
| Cluster_4 (954) | MAE | 494.67 | 344.53 | 284.63 | 507.02 | 435.01 | 378.21 | 434.45 | 316.94 | 343.04 | 477.27 | 388.87 |
| | RMSE | 572.37 | 404.03 | 345.16 | 590.65 | 504.97 | 456.39 | 509.96 | 379.87 | 412.59 | 567.51 | 461.58 |
| | MAPE (%) | 64.62 | 55.59 | 51.97 | 56.12 | 57.14 | 51.71 | 59.23 | 48.57 | 54.59 | 57.01 | 55.15 |
| | MASE | 4.37 | 3.41 | 2.96 | 4.24 | 3.97 | 3.66 | 4.36 | 3.34 | 3.38 | 4.67 | 3.99 |
| Cluster (Product Quantity) | Metric | Phase 1: ARIMA | Phase 1: RF | Phase 1: XGBoost | Phase 1: LSTM | Phase 1: Autoformer | Phase 2: GAF-CNN-XGBoost | Phase 2: GAF-CNN-RF | Phase 2: PatchTST-XGBoost | Phase 2: PatchTST-RF | Phase 2: TS2Vec-XGBoost | Phase 2: TS2Vec-RF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster_1 (9608) | MAE | 247.27 | 160.37 | 160.52 | 204.52 | 173.86 | 140.76 | 175.64 | 143.06 | 139.50 | 193.99 | 187.78 |
| | RMSE | 265.21 | 178.00 | 180.17 | 223.55 | 192.92 | 158.80 | 193.84 | 163.92 | 160.64 | 215.89 | 206.51 |
| | MAPE (%) | 32.89 | 23.11 | 23.20 | 29.14 | 25.89 | 21.50 | 25.54 | 21.45 | 21.19 | 27.09 | 26.82 |
| | MASE | 2.83 | 1.89 | 1.88 | 2.33 | 1.98 | 1.59 | 1.99 | 1.61 | 1.57 | 2.14 | 2.09 |
| Cluster_2 (121) | MAE | 113.98 | 81.46 | 81.09 | 84.83 | 85.38 | 75.78 | 89.94 | 77.98 | 73.04 | 101.00 | 100.05 |
| | RMSE | 123.74 | 91.59 | 91.61 | 93.13 | 95.83 | 87.57 | 100.86 | 90.00 | 83.61 | 113.61 | 110.14 |
| | MAPE (%) | 21.21 | 14.96 | 15.01 | 14.44 | 16.50 | 14.33 | 16.44 | 14.68 | 13.81 | 18.17 | 18.22 |
| | MASE | 2.68 | 1.96 | 1.97 | 2.00 | 1.87 | 1.67 | 1.97 | 1.72 | 1.61 | 2.09 | 2.11 |
| Cluster_3 (2017) | MAE | 540.17 | 270.38 | 255.92 | 414.88 | 362.22 | 239.26 | 314.49 | 225.18 | 248.65 | 349.55 | 314.20 |
| | RMSE | 600.93 | 330.30 | 312.57 | 480.92 | 418.20 | 303.88 | 377.92 | 284.68 | 306.31 | 416.28 | 375.86 |
| | MAPE (%) | 63.97 | 50.81 | 49.21 | 62.55 | 54.68 | 45.45 | 54.11 | 42.69 | 48.66 | 52.96 | 51.76 |
| | MASE | 4.11 | 2.49 | 2.36 | 3.60 | 3.05 | 2.09 | 2.85 | 2.00 | 2.23 | 3.08 | 2.85 |
| Cluster_4 (915) | MAE | 538.76 | 373.24 | 333.30 | 493.95 | 466.53 | 401.90 | 486.88 | 335.64 | 360.39 | 319.78 | 393.09 |
| | RMSE | 618.66 | 432.83 | 395.05 | 576.42 | 537.86 | 480.18 | 567.47 | 394.63 | 420.36 | 378.65 | 462.77 |
| | MAPE (%) | 64.79 | 56.52 | 53.64 | 60.82 | 57.01 | 50.66 | 61.01 | 47.55 | 54.82 | 47.30 | 53.45 |
| | MASE | 4.39 | 3.46 | 3.21 | 4.44 | 3.99 | 3.64 | 4.58 | 3.36 | 3.36 | 3.27 | 3.80 |
| Cluster (Product Quantity) | Metric | Phase 1: ARIMA | Phase 1: RF | Phase 1: XGBoost | Phase 1: LSTM | Phase 1: Autoformer | Phase 2: GAF-CNN-XGBoost | Phase 2: GAF-CNN-RF | Phase 2: PatchTST-XGBoost | Phase 2: PatchTST-RF | Phase 2: TS2Vec-XGBoost | Phase 2: TS2Vec-RF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster_1 (2513) | MAE | 594.25 | 369.31 | 358.56 | 475.31 | 389.54 | 318.94 | 405.52 | 323.48 | 323.89 | 442.94 | 420.97 |
| | RMSE | 643.86 | 416.08 | 408.03 | 528.69 | 438.54 | 375.62 | 456.93 | 376.98 | 372.98 | 500.89 | 471.79 |
| | MAPE (%) | 35.18 | 25.42 | 24.38 | 30.84 | 27.10 | 22.59 | 27.56 | 22.97 | 23.25 | 28.94 | 28.18 |
| | MASE | 3.46 | 2.36 | 2.29 | 2.82 | 2.32 | 1.91 | 2.43 | 1.96 | 1.96 | 2.63 | 2.51 |
| Cluster_2 (10,008) | MAE | 218.89 | 133.37 | 127.45 | 184.76 | 160.09 | 122.60 | 152.02 | 115.77 | 121.24 | 167.50 | 155.05 |
| | RMSE | 240.09 | 152.85 | 147.35 | 206.68 | 180.89 | 145.54 | 173.58 | 135.88 | 140.60 | 191.39 | 175.87 |
| | MAPE (%) | 42.37 | 32.21 | 31.48 | 38.78 | 35.23 | 29.46 | 34.90 | 28.96 | 30.43 | 35.71 | 34.89 |
| | MASE | 3.09 | 2.07 | 2.00 | 2.67 | 2.28 | 1.79 | 2.27 | 1.74 | 1.80 | 2.46 | 2.31 |
| Cluster_3 (140) | MAE | 2212.97 | 1497.40 | 1434.19 | 1917.62 | 1706.07 | 1385.69 | 1704.68 | 1314.49 | 1361.60 | 1739.86 | 1643.32 |
| | RMSE | 2542.16 | 1800.25 | 1748.01 | 2262.51 | 2029.11 | 1731.76 | 2036.26 | 1624.23 | 1677.37 | 2105.41 | 1973.25 |
| | MAPE (%) | 36.04 | 28.05 | 26.82 | 33.99 | 31.03 | 26.74 | 31.17 | 24.86 | 25.93 | 30.19 | 29.57 |
| | MASE | 3.75 | 2.73 | 2.62 | 3.44 | 2.93 | 2.40 | 2.92 | 2.32 | 2.39 | 3.03 | 2.90 |
| Cluster (Product Quantity) | Metric | Phase 1: ARIMA | Phase 1: RF | Phase 1: XGBoost | Phase 1: LSTM | Phase 1: Autoformer | Phase 2: GAF-CNN-XGBoost | Phase 2: GAF-CNN-RF | Phase 2: PatchTST-XGBoost | Phase 2: PatchTST-RF | Phase 2: TS2Vec-XGBoost | Phase 2: TS2Vec-RF |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Cluster_1 (10,211) | MAE | 214.04 | 129.92 | 129.58 | 172.54 | 151.78 | 161.17 | 158.25 | 115.63 | 113.75 | 161.40 | 154.28 |
| | RMSE | 232.09 | 147.10 | 148.21 | 189.95 | 169.71 | 176.40 | 176.64 | 134.69 | 133.70 | 182.47 | 172.67 |
| | MAPE (%) | 39.90 | 29.88 | 29.99 | 34.75 | 32.71 | 34.80 | 34.28 | 27.41 | 26.92 | 32.74 | 32.66 |
| | MASE | 2.97 | 1.95 | 1.95 | 2.44 | 2.16 | 2.59 | 2.33 | 1.69 | 1.65 | 2.29 | 2.22 |
| Cluster_2 (1801) | MAE | 545.60 | 336.74 | 332.48 | 440.27 | 375.90 | 293.81 | 366.07 | 299.57 | 306.37 | 417.85 | 385.64 |
| | RMSE | 584.83 | 376.68 | 375.32 | 479.81 | 416.80 | 343.00 | 407.69 | 344.84 | 346.71 | 468.42 | 427.69 |
| | MAPE (%) | 35.19 | 24.94 | 24.34 | 31.20 | 27.98 | 22.10 | 26.54 | 22.44 | 23.74 | 29.34 | 27.87 |
| | MASE | 3.66 | 2.48 | 2.45 | 3.02 | 2.60 | 2.01 | 2.49 | 2.10 | 2.12 | 2.81 | 2.64 |
| Cluster_3 (214) | MAE | 1964.62 | 1143.86 | 1129.15 | 1410.41 | 1399.62 | 996.41 | 1334.76 | 1044.68 | 1050.29 | 1347.56 | 1320.13 |
| | RMSE | 2200.80 | 1378.82 | 1360.12 | 1652.65 | 1635.47 | 1252.10 | 1568.43 | 1302.25 | 1288.23 | 1615.57 | 1555.77 |
| | MAPE (%) | 37.17 | 25.08 | 24.17 | 30.39 | 29.79 | 21.26 | 28.28 | 22.57 | 22.85 | 28.81 | 27.26 |
| | MASE | 4.50 | 3.07 | 2.99 | 3.55 | 3.35 | 2.45 | 3.33 | 2.63 | 2.54 | 3.39 | 3.18 |
| Cluster_4 (435) | MAE | 967.49 | 581.70 | 502.84 | 884.72 | 771.52 | 639.71 | 767.74 | 459.61 | 534.54 | 725.10 | 643.54 |
| | RMSE | 1112.19 | 691.19 | 605.04 | 1033.78 | 903.06 | 773.33 | 911.58 | 559.07 | 639.33 | 877.01 | 767.45 |
| | MAPE (%) | 60.90 | 52.03 | 47.74 | 64.52 | 58.25 | 49.78 | 55.80 | 42.93 | 49.09 | 52.70 | 51.41 |
| | MASE | 4.65 | 3.48 | 3.01 | 4.60 | 4.12 | 3.99 | 4.56 | 2.95 | 3.35 | 4.29 | 3.91 |
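
The four accuracy measures reported in the forecasting tables (MAE, RMSE, MAPE, MASE) can be written out as a short sketch. Two choices here are assumptions rather than details taken from the source: MAPE skips weeks with zero actual demand, and MASE is scaled by the in-sample error of a one-step naive forecast.

```python
import math
from typing import List


def mae(actual: List[float], forecast: List[float]) -> float:
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual: List[float], forecast: List[float]) -> float:
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual: List[float], forecast: List[float]) -> float:
    # Percentage error is undefined when the actual value is zero; such weeks are skipped.
    terms = [abs((a - f) / a) for a, f in zip(actual, forecast) if a != 0]
    return 100 * sum(terms) / len(terms)


def mase(actual: List[float], forecast: List[float], train: List[float]) -> float:
    # Scale: mean absolute error of the naive "last value" forecast on the training data.
    naive_errors = [abs(train[t] - train[t - 1]) for t in range(1, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, forecast) / scale


train = [100.0, 110.0, 90.0, 105.0]
actual = [100.0, 120.0]
forecast = [110.0, 105.0]
print(mae(actual, forecast), rmse(actual, forecast),
      mape(actual, forecast), mase(actual, forecast, train))
```

A MASE above 1, as in every cell of the tables above, means the model's test error exceeds the naive forecast's in-sample error under this scaling.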
| Method | Center | Cluster | N | Baseline | Proposed | Baseline MAE | Proposed MAE | Diff MAE | CI | t-test p | Wilcoxon p |
|---|---|---|---|---|---|---|---|---|---|---|---|
| SBC | A | 1 | 122,863 | RF | PatchTST-RF | 164.43 | 142.64 | 21.79 | 20.84, 22.74 | <0.001 | <0.001 |
| SBC | A | 2 | 1521 | RF | PatchTST-RF | 84.34 | 73.59 | 10.75 | 8.22, 13.28 | <0.001 | <0.001 |
| SBC | A | 3 | 27,807 | XGBoost | PatchTST-XGBoost | 257.16 | 236.43 | 20.73 | 17.45, 24.02 | <0.001 | <0.001 |
| SBC | A | 4 | 12,402 | XGBoost | PatchTST-XGBoost | 280.38 | 310.19 | −29.81 | −37.66, −21.95 | <0.001 | <0.001 |
| SBC | B | 1 | 124,904 | RF | PatchTST-RF | 160.00 | 138.66 | 21.34 | 20.48, 22.21 | <0.001 | <0.001 |
| SBC | B | 2 | 1573 | RF | PatchTST-RF | 80.82 | 72.20 | 8.62 | 6.50, 10.75 | <0.001 | <0.001 |
| SBC | B | 3 | 26,221 | XGBoost | PatchTST-XGBoost | 252.35 | 220.83 | 31.52 | 28.47, 34.58 | <0.001 | <0.001 |
| SBC | B | 4 | 11,895 | XGBoost | TS2Vec-XGBoost | 329.76 | 313.30 | 16.45 | 9.78, 23.13 | <0.001 | <0.001 |
| ML | A | 1 | 32,669 | XGBoost | GAF-CNN-XGBoost | 356.49 | 315.86 | 40.63 | 37.03, 44.23 | <0.001 | <0.001 |
| ML | A | 2 | 130,104 | XGBoost | PatchTST-XGBoost | 126.43 | 114.06 | 12.37 | 11.66, 13.07 | <0.001 | <0.001 |
| ML | A | 3 | 1820 | XGBoost | PatchTST-XGBoost | 1426.71 | 1300.43 | 126.27 | 69.00, 183.55 | <0.001 | <0.001 |
| ML | B | 1 | 132,743 | RF | PatchTST-RF | 129.52 | 112.96 | 16.56 | 15.85, 17.28 | <0.001 | <0.001 |
| ML | B | 2 | 23,413 | XGBoost | GAF-CNN-XGBoost | 332.08 | 292.33 | 39.75 | 36.07, 43.42 | <0.001 | <0.001 |
| ML | B | 3 | 2782 | XGBoost | GAF-CNN-XGBoost | 1134.65 | 998.32 | 136.32 | 108.18, 164.47 | <0.001 | <0.001 |
| ML | B | 4 | 5655 | XGBoost | PatchTST-XGBoost | 481.88 | 439.23 | 42.64 | 33.59, 51.70 | <0.001 | <0.001 |

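
The Diff MAE and CI columns above can be read as the mean of paired per-observation error differences (baseline minus proposed) with an interval around it. A minimal sketch of that computation follows, under stated assumptions: the confidence level (95%) and the normal approximation are not given in the table and are assumed here, and `paired_mae_diff` and the toy error lists are hypothetical names and values used only for illustration.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_mae_diff(baseline_abs_err, proposed_abs_err, level=0.95):
    """Mean paired difference in absolute error with a z-based interval.

    Assumptions (not taken from the paper): a 95% level and a normal
    approximation, which is close to a t-based interval when N is large.
    """
    diffs = [b - p for b, p in zip(baseline_abs_err, proposed_abs_err)]
    d_bar = mean(diffs)  # equals Baseline MAE minus Proposed MAE
    se = stdev(diffs) / math.sqrt(len(diffs))  # standard error of the mean difference
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    return d_bar, (d_bar - z * se, d_bar + z * se)


# Toy absolute errors for four paired observations (hypothetical values).
baseline = [10.0, 12.0, 14.0, 16.0]
proposed = [8.0, 11.0, 11.0, 14.0]
diff, (lo, hi) = paired_mae_diff(baseline, proposed)
print(f"Diff MAE = {diff:.2f}, CI = ({lo:.2f}, {hi:.2f})")
```

An interval that excludes zero, as in every row of the table, indicates a systematic difference between the two models. An interval that lies entirely below zero (SBC, Center A, Cluster 4) means the baseline was the more accurate model for that cluster.
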
| Feature | A Center Importance (%) | B Center Importance (%) |
|---|---|---|
| CATEGORY | 30.99 | 45.76 |
| LAG1 | 22.66 | 14.95 |
| START_SALES_DATE | 15.15 | 16.97 |
| LAG1_4W_ROLLING_AVG | 13.00 | 8.72 |
| PRICE | 6.34 | 4.71 |
| FIRST_SHIPMENT_DATE | 4.48 | 4.30 |
| Subtotal (Top 6 Features) | 92.62 | 95.41 |
| Other Features (n = 9) | 7.38 | 4.59 |

| Center | Method | Weighted SUM (WMAPE) | WMAPE (%) | Weighted SUM (WAMPE) | WAMPE (%) |
|---|---|---|---|---|---|
| A | Rule-based (SBC) | 359,720.55 | 28.41 | 304,274.08 | 24.03 |
| A | ML (PatchTST-GMM) | 350,080.75 | 27.65 | 299,910.97 | 23.69 |
| B | Rule-based (SBC) | 334,649.76 | 26.43 | 289,848.53 | 22.89 |
| B | ML (PatchTST-GMM) | 368,130.97 | 29.08 | 311,734.52 | 24.62 |

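
The weighted sums and percentages above are consistent with a cluster-size-weighted average: dividing each weighted sum by 12,661, the number of products per center implied by the product counts in the cluster-statistics table (2513 + 10,008 + 140 for Center A), reproduces the percentage column. The sketch below illustrates that relationship. The weighting scheme is inferred from the figures rather than stated in the table, and `weighted_metric` and the per-cluster values in the toy call are hypothetical.

```python
def weighted_metric(cluster_sizes, cluster_values):
    """Return (weighted sum, size-weighted mean) of a per-cluster metric."""
    weighted_sum = sum(n * v for n, v in zip(cluster_sizes, cluster_values))
    return weighted_sum, weighted_sum / sum(cluster_sizes)


# Toy example with hypothetical cluster sizes and per-cluster MAPE values (%).
total, wmape = weighted_metric([100, 300], [10.0, 20.0])
print(total, wmape)

# Consistency check against the table: Center A, ML (PatchTST-GMM).
n_products_a = 2513 + 10008 + 140          # cluster sizes reported for Center A
print(round(350080.75 / n_products_a, 2))  # agrees with the reported 27.65
```

Weighting by cluster size keeps a small, hard-to-forecast cluster from dominating the center-level comparison between rule-based and ML clustering.
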
| Center | A | A | A | B | B | B | B |
|---|---|---|---|---|---|---|---|
| Cluster | 1 | 2 | 3 | 1 | 2 | 3 | 4 |
| number_of_products | 2513 | 10,008 | 140 | 10,211 | 1801 | 214 | 435 |
| mean_W_QTY | 1411.2 | 405.3 | 5197.6 | 484.4 | 1317.0 | 4027.3 | 669.4 |
| mean_CV_W_QTY | 0.4 | 0.6 | 0.7 | 0.6 | 0.5 | 0.6 | 1.6 |
| mean_ZERO_RATIO | 0.0 | 0.1 | 0.1 | 0.1 | 0.0 | 0.0 | 0.3 |
| mean_ADI | 1.0 | 1.1 | 1.1 | 1.0 | 1.0 | 1.1 | 1.9 |
| mean_MAD | 264.9 | 96.1 | 824.5 | 107.6 | 238.1 | 697.0 | 175.2 |
| mean_KURTOSIS | 3.1 | 3.1 | 7.4 | 2.7 | 3.6 | 6.1 | 12.2 |
| mean_TREND_VOLATILITY | 297.4 | 118.4 | 1288.2 | 117.7 | 276.6 | 834.8 | 473.6 |
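
The per-product statistics averaged in the table above can be derived from each product's weekly quantity series. The sketch below computes three of them under commonly used definitions, which are assumptions here because the table does not define them: CV as the (population) standard deviation divided by the mean, zero ratio as the share of weeks with no sales, and ADI (average demand interval) as the number of weeks divided by the number of weeks with non-zero sales. The function name `demand_profile` and the sample series are hypothetical.

```python
from statistics import mean, pstdev


def demand_profile(weekly_qty):
    """Summarize one product's weekly demand series (assumed definitions)."""
    n = len(weekly_qty)
    nonzero_weeks = sum(1 for q in weekly_qty if q > 0)
    avg = mean(weekly_qty)
    return {
        # Coefficient of variation: dispersion relative to the mean level.
        "CV": pstdev(weekly_qty) / avg if avg else float("nan"),
        # Share of weeks with zero sales.
        "ZERO_RATIO": (n - nonzero_weeks) / n,
        # Average demand interval: weeks per non-zero demand occurrence.
        "ADI": n / nonzero_weeks if nonzero_weeks else float("inf"),
    }


# Hypothetical intermittent series: sales in every other week.
print(demand_profile([0, 10, 0, 10]))
```

Averaging these per-product values within a cluster gives rows comparable to `mean_CV_W_QTY`, `mean_ZERO_RATIO`, and `mean_ADI`. Center B's Cluster 4 stands out on all three, which is the profile of intermittent demand.
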
| Case | Volatility (A, B) | Required Experiments | Eliminated Experiments | Experimental Savings (Count, %) |
|---|---|---|---|---|
| Case 1 | Low, Low | ML Clustering (A, B); ML Forecasting (A, B) | Rule-based Clustering & Forecasting (A, B) | 4, 50% |
| Case 2 | Low, High | ML Clustering (A, B); ML Forecasting (A); Rule-based Clustering & Forecasting (B) | ML Forecasting (B); Rule-based Clustering & Forecasting (A) | 3, 37.5% |
| Case 3 | High, Low | ML Clustering (A, B); Rule-based Clustering & Forecasting (A); ML Forecasting (B) | ML Forecasting (A); Rule-based Clustering & Forecasting (B) | 3, 37.5% |
| Case 4 | High, High | ML Clustering (A, B); Rule-based Clustering & Forecasting (A, B) | ML Forecasting (A, B) | 2, 25% |

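
The four cases above follow a simple per-center rule: ML clustering is always run, a low-volatility center keeps ML forecasting, and a high-volatility center keeps rule-based clustering and forecasting instead. The sketch below encodes that rule and reproduces the savings column, counting four candidate experiments per center (eight in total). How a center is labeled Low or High is not specified in the table and is treated as an input; `plan_experiments` and the experiment labels are illustrative names.

```python
EXPERIMENTS = (
    "ML clustering",
    "ML forecasting",
    "Rule-based clustering",
    "Rule-based forecasting",
)


def plan_experiments(volatility):
    """Split candidate experiments into required and eliminated sets.

    volatility maps a center name to "Low" or "High" (labels assumed as given).
    """
    required, eliminated = [], []
    for center, level in volatility.items():
        keep = {"ML clustering"}  # run in every case of the table
        if level == "Low":
            keep.add("ML forecasting")
        elif level == "High":
            keep.update({"Rule-based clustering", "Rule-based forecasting"})
        else:
            raise ValueError(f"unknown volatility level: {level!r}")
        for experiment in EXPERIMENTS:
            target = required if experiment in keep else eliminated
            target.append((center, experiment))
    savings_pct = 100 * len(eliminated) / (len(required) + len(eliminated))
    return required, eliminated, savings_pct


for levels in (("Low", "Low"), ("Low", "High"), ("High", "Low"), ("High", "High")):
    _, skipped, pct = plan_experiments({"A": levels[0], "B": levels[1]})
    print(levels, len(skipped), f"{pct}%")
```

The savings are largest when both centers show low volatility, because each low-volatility center drops two rule-based experiments while a high-volatility center drops only ML forecasting.
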
| Center | Method | Weighted SUM (WMAPE) | WMAPE (%) | Weighted SUM (WAMPE) | WAMPE (%) |
|---|---|---|---|---|---|
| A | Rule-based (SBC) | 356,626.44 | 28.17 | 313,516.71 | 24.76 |
| A | ML (PatchTST-GMM) | 343,664.48 | 27.14 | 270,475.29 | 21.36 |
| B | Rule-based (SBC) | 334,585.68 | 26.43 | 294,933.17 | 23.29 |
| B | ML (PatchTST-GMM) | 340,044.67 | 26.86 | 316,253.30 | 24.97 |
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Kim, J.-H.; Cho, N.-W. Hybrid Clustering for Retail Demand Forecasting: Combining Rule-Based and Machine Learning Methods. Forecasting 2026, 8, 37. https://doi.org/10.3390/forecast8030037
