Consumer Transactions Simulation Through Generative Adversarial Networks Under Stock Constraints in Large-Scale Retail
Abstract
1. Introduction
1.1. Problem Statement
1.2. Objectives and Contributions
- Enhanced Product and Consumer Representation: We combine SKUs with transactional data and compare product embeddings produced by the classic word2vec method [24] with those produced by the hypergraph-based Cleora model [25]. Similarly, we contrast consumer embeddings generated through a recurrent neural network (RNN) [26] with those produced by Cleora, giving a comparative view of representation efficacy.
- Innovative Use of GAN with Stocks Data: We combine the classic GAN architecture with stock availability data so that the generated data closely mirror real-world transactional patterns. To the best of our knowledge, this is the first study to test variations in stock data representation (weighted and unweighted embeddings) and their impact on the accuracy of generated transactions within a GAN architecture.
2. Related Work
3. Proposed Approach
3.1. Overview of Generative Adversarial Networks
- G: the Generator network, responsible for generating synthetic data that aims to mimic the real data distribution;
- D: the Discriminator network, responsible for distinguishing between real and synthetic (generated) data;
- x: a sample from the real data distribution;
- p: the product embedding;
- s: the stock embedding;
- λ: the penalty coefficient;
- $\hat{x}$: a point sampled uniformly along a straight line between a pair of real and generated data points.
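For orientation, the symbols listed above match the standard WGAN-GP objective of Gulrajani et al. [37]. The block below is a sketch under that assumption and may differ from the exact formulation used in this work. Writing the Generator input as $(p, s)$, the generated sample as $\tilde{x}$, and the interpolation coefficient as $\epsilon$ are notational assumptions; $\hat{x}$ is the point interpolated between a real and a generated sample.

```latex
\begin{aligned}
\tilde{x} &= G(p, s), \\
\hat{x} &= \epsilon\, x + (1 - \epsilon)\, \tilde{x}, \qquad \epsilon \sim U[0, 1], \\
L_D &= \mathbb{E}\big[D(\tilde{x})\big] - \mathbb{E}\big[D(x)\big]
      + \lambda\, \mathbb{E}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big], \\
L_G &= -\,\mathbb{E}\big[D(\tilde{x})\big].
\end{aligned}
```

The last term of $L_D$ is the gradient penalty: it pushes the norm of the Discriminator's gradient at $\hat{x}$ toward 1, with $\lambda$ controlling the strength of that constraint.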
3.2. Features and Representations
- Product: As in the related research discussed in Section 2, we generate a simple representation using word2vec, but we also use Cleora [25] as a potentially more advanced alternative that proved effective in one of our previous studies [removed for blind review] in a comparable setting. Whereas the word2vec embeddings rely on product names only, the Cleora embeddings also include information about product interactions within purchase baskets.
- Customer: We use an RNN architecture, which other researchers have applied to similar problems [38], and extract the consumer representation from the network's last hidden layer. For comparison, we use the abovementioned Cleora algorithm with the relevant input parameters to enable both product and consumer embedding outputs.
- Dates: Retail data often exhibit strong seasonal patterns. We therefore extract cyclic features from the transaction dates, such as the day of the week, the day of the month, and the month itself, so that the model can capture these patterns and generate more accurate and realistic transactions.
- Price: We transform the unit price using the natural logarithm to reduce its dynamic range, since some values are significantly larger than others. Such price variability is expected when handling a wide assortment in large-scale retail.
- Stocks: Stock embeddings are generated in two steps by aggregating product embeddings, either unweighted or weighted (we use Cleora-generated embeddings, which hold more information than those generated with word2vec). First, for each unique combination of site and date, the subset of product indices with the corresponding product embeddings and quantities is extracted. Second, for the unweighted representation, the mean of the product embeddings is computed, whereas for the weighted representation, a weighted average of the product embeddings is computed using the quantities as weights.
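The date, price, and stock transformations described in the list above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The sine/cosine encoding is one common way to build cyclic features and is an assumption here, as are the function names, the row layout `(site, date, product_id, quantity)`, and the toy two-dimensional embeddings.

```python
import math
from collections import defaultdict


def cyclic_encode(value, period):
    """Map a periodic value (e.g. day of week with period 7) to (sin, cos)."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def log_price(unit_price):
    """Natural-log transform that compresses the dynamic range of prices."""
    return math.log(unit_price)


def stock_embeddings(stock_rows, product_embeddings, weighted):
    """Aggregate product embeddings per (site, date).

    stock_rows: iterable of (site, date, product_id, quantity) tuples
                (hypothetical layout).
    weighted:   if True, quantities act as weights; otherwise a plain mean.
    """
    sums = {}
    totals = defaultdict(float)
    for site, date, product_id, quantity in stock_rows:
        key = (site, date)
        weight = float(quantity) if weighted else 1.0
        vector = product_embeddings[product_id]
        acc = sums.setdefault(key, [0.0] * len(vector))
        for i, component in enumerate(vector):
            acc[i] += weight * component
        totals[key] += weight
    # Skip groups whose total weight is zero to avoid division by zero.
    return {
        key: [c / totals[key] for c in acc]
        for key, acc in sums.items()
        if totals[key] > 0
    }


# Toy data: two products stocked at one site on one date.
toy_embeddings = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
toy_rows = [("S1", "2024-01-01", "A", 3), ("S1", "2024-01-01", "B", 1)]

unweighted = stock_embeddings(toy_rows, toy_embeddings, weighted=False)
weighted = stock_embeddings(toy_rows, toy_embeddings, weighted=True)
print(unweighted[("S1", "2024-01-01")])  # [0.5, 0.5]
print(weighted[("S1", "2024-01-01")])    # [0.75, 0.25]
```

In the toy example, the weighted variant shifts the stock embedding toward product A because three of the four stocked units are A, whereas the unweighted mean treats both products equally.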
4. Experiments and Results
4.1. Dataset
4.2. Model Design and Training
Generator:
- Layer I: 2048 input units (product and stock embeddings) to 1024 output units, followed by a LeakyReLU (https://pytorch.org/docs/stable/generated/torch.nn.LeakyReLU.html, accessed on 1 April 2024) activation.
- Layer II: 1024 to 512 units, followed by a LeakyReLU activation.
- Layer III: 512 to 256 units, followed by a LeakyReLU activation.
- Output Layer: 256 to 1287 units, followed by a Tanh activation.

Discriminator:
- Layer I: 1287 input units to 512 output units, followed by a LeakyReLU activation and a dropout layer.
- Layer II: 512 to 128 units, followed by a LeakyReLU activation and another dropout layer.
- Output Layer: 128 to 1 unit, followed by a sigmoid activation.
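As a quick sanity check on the layer sizes listed above, the following sketch verifies that consecutive layers are shape-compatible and counts trainable parameters. It assumes every layer is fully connected with a bias term, which the text does not state explicitly; the constant and function names are illustrative.

```python
# Layer sizes as (input_units, output_units), taken from the lists above.
GENERATOR_LAYERS = [(2048, 1024), (1024, 512), (512, 256), (256, 1287)]
DISCRIMINATOR_LAYERS = [(1287, 512), (512, 128), (128, 1)]


def check_chain(layers):
    """Raise ValueError if a layer's output size differs from the next input size."""
    for (_, out_units), (in_units, _) in zip(layers, layers[1:]):
        if out_units != in_units:
            raise ValueError(f"shape mismatch: {out_units} -> {in_units}")


def count_parameters(layers):
    """Weights plus biases, assuming dense layers with a bias term."""
    return sum(i * o + o for i, o in layers)


check_chain(GENERATOR_LAYERS)
check_chain(DISCRIMINATOR_LAYERS)

# The Generator's output width must equal the Discriminator's input width.
assert GENERATOR_LAYERS[-1][1] == DISCRIMINATOR_LAYERS[0][0]

print(count_parameters(GENERATOR_LAYERS))
print(count_parameters(DISCRIMINATOR_LAYERS))
```

Under these assumptions the Generator is roughly four times larger than the Discriminator, mostly because of its 2048-to-1024 first layer.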
| Algorithm 1. Training Process of the Proposed GAN Model |
|---|
| 1: Input: Training dataset, batch size, number of epochs |
| 2: Initialize: Generator G, Discriminator D |
| 3: Initialize: Optimizers optimizer_G, optimizer_D |
| 4: for epoch in 1, 2, …, epochs do |
| 5: for mini-batch in DataLoader do |
| 6: Extract real orders and stock embeddings |
| 7: Extract product embeddings from real orders |
| 8: Move all data to computation device (GPU) |
| 9: for iteration in 1, 2, …, n_critic do |
| 10: Generate fake orders using G |
| 11: Compute Discriminator loss L_D |
| 12: Update D using optimizer_D |
| 13: Generate fake orders using G |
| 14: Compute Generator loss L_G |
| 15: Update G using optimizer_G |
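The control flow of Algorithm 1 can be sketched as a framework-agnostic skeleton. The nesting is an assumption based on common WGAN-GP practice [37]: the Discriminator is updated n_critic times per mini-batch, followed by a single Generator update. The callables `generate`, `update_discriminator`, and `update_generator` are hypothetical placeholders for the model-specific steps (loss computation and optimizer updates), and each batch is assumed to already hold the extracted real orders, stock embeddings, and product embeddings.

```python
def train(batches, epochs, n_critic, generate, update_discriminator, update_generator):
    """Run the update schedule of Algorithm 1 with user-supplied step functions."""
    for _ in range(epochs):
        for real_orders, stock_emb, product_emb in batches:
            # Steps 9-12: several Discriminator updates on fresh fake orders.
            for _ in range(n_critic):
                fake_orders = generate(product_emb, stock_emb)
                update_discriminator(real_orders, fake_orders)
            # Steps 13-15: one Generator update per mini-batch.
            fake_orders = generate(product_emb, stock_emb)
            update_generator(fake_orders)


# Usage with counting stubs to show how often each network is updated.
calls = {"D": 0, "G": 0}


def stub_generate(product_emb, stock_emb):
    return [0.0]


def stub_update_d(real_orders, fake_orders):
    calls["D"] += 1


def stub_update_g(fake_orders):
    calls["G"] += 1


toy_batches = [("real", "stock", "product")] * 3  # three mini-batches
train(toy_batches, epochs=2, n_critic=5,
      generate=stub_generate,
      update_discriminator=stub_update_d,
      update_generator=stub_update_g)
print(calls)  # {'D': 30, 'G': 6}
```

With two epochs, three mini-batches, and n_critic = 5, the Discriminator is updated 30 times and the Generator 6 times, which makes the asymmetry of the schedule explicit.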
4.3. Performance Metrics
4.4. Results and Discussion
5. Conclusions and Future Work
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
1. Bahng, Y.; Rogers, F.T.; Kincade, D.H. Assortment planning for retail buying, retail store operations, and firm performance. J. Distrib. Sci. 2018, 16, 15–27.
2. Kök, A.G.; Fisher, M.L.; Vaidyanathan, R. Assortment planning: Review of literature and industry practice. In Retail Supply Chain Management: Quantitative Models and Empirical Studies; Springer: Boston, MA, USA, 2015; pp. 175–236.
3. Rooderkerk, R.P.; Kök, A.G. Omnichannel assortment planning. In Operations in an Omnichannel World; Springer: Cham/Geneva, Switzerland, 2019; pp. 51–86.
4. Goyal, V.; Levi, R.; Segev, D. Near-optimal algorithms for the assortment planning problem under dynamic substitution and stochastic demand. Oper. Res. 2016, 64, 219–235.
5. Méndez-Díaz, I.; Miranda-Bront, J.J.; Vulcano, G.; Zabala, P. A branch-and-cut algorithm for the latent class logit assortment problem. Electron. Notes Discret. Math. 2010, 36, 383–390.
6. Nazir, S.; Khadim, S.; Asadullah, M.A.; Syed, N. Exploring the influence of artificial intelligence technology on consumer repurchase intention: The mediation and moderation approach. Technol. Soc. 2023, 72, 102190.
7. Gallego, G.; Topaloglu, H. Revenue Management and Pricing Analytics; Springer Nature: Dordrecht, The Netherlands, 2019; Volume 209.
8. Müller, S.; Huber, J.; Fleischmann, M.; Stuckenschmidt, H. Data-Driven Inventory Management Under Customer Substitution. 2020. SSRN 3624026. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3624026 (accessed on 5 March 2024).
9. Anshari, M.; Almunawar, M.N.; Lim, S.A.; Al-Mudimigh, A. Customer relationship management and big data enabled: Personalization & customization of services. Appl. Comput. Inform. 2019, 15, 94–101.
10. Adjie Eryadi, R.; Nizar Hidayanto, A. Critical success factors for business intelligence implementation in an enterprise resource planning system environment using DEMATEL: A case study at a cement manufacture company in Indonesia. J. Inf. Technol. Manag. 2020, 12, 67–85.
11. Alshurideh, M.; Gasaymeh, A.; Ahmed, G.; Alzoubi, H.; Kurd, B. Loyalty program effectiveness: Theoretical reviews and practical proofs. Uncertain Supply Chain Manag. 2020, 8, 599–612.
12. OxfordDictionaries. Gen Z. 2023. Available online: https://www.oed.com/dictionary/gen-z_n (accessed on 3 January 2025).
13. Thangavel, P.; Pathak, P.; Chandra, B. Consumer decision-making style of Gen Z: A generational cohort analysis. Glob. Bus. Rev. 2022, 23, 710–728.
14. Stankov, I.; Tsochev, G. Vulnerability and protection of business management systems: Threats and challenges. Probl. Eng. Cybern. Robot. 2020, 72, 29–40.
15. Robertson, V.H. Excessive data collection: Privacy considerations and abuse of dominance in the era of big data. Common Mark. Law Rev. 2020, 57, 161–190.
16. European Parliament and Council of the European Union. Regulation (EU) 2016/679 of the European Parliament and of the Council. Available online: https://data.europa.eu/eli/reg/2016/679/oj (accessed on 3 January 2025).
17. Fonseca, J.; Bacao, F. Tabular and latent space synthetic data generation: A literature review. J. Big Data 2023, 10, 115.
18. Salinas, D.; Flunkert, V.; Gasthaus, J.; Januschowski, T. DeepAR: Probabilistic forecasting with autoregressive recurrent networks. Int. J. Forecast. 2020, 36, 1181–1191.
19. Fisher, M.L.; Krishnan, J.K.; Netessine, S. Retail Store Execution: An Empirical Study. 2006. SSRN 2319839. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2319839 (accessed on 3 January 2025).
20. Sinha, D.; Tulabandhula, T. Optimizing revenue over data-driven assortments. arXiv 2017, arXiv:1708.05510.
21. McColl, R.; Macgilchrist, R.; Rafiq, S. Estimating cannibalizing effects of sales promotions: The impact of price cuts and store type. J. Retail. Consum. Serv. 2020, 53, 101982.
22. Hollebeek, L.D.; Sprott, D.E.; Andreassen, T.W.; Costley, C.; Klaus, P.; Kuppelwieser, V.; Karahasanovic, A.; Taguchi, T.; Islam, J.U.; Rather, R.A. Customer engagement in evolving technological environments: Synopsis and guiding propositions. Eur. J. Mark. 2019, 53, 2018–2023.
23. Apeh, E.T.; Gabrys, B.; Schierz, A. Customer profile classification using transactional data. In Proceedings of the 2011 Third World Congress on Nature and Biologically Inspired Computing, Salamanca, Spain, 19–21 October 2011; IEEE: Piscataway, NJ, USA; pp. 37–43.
24. Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient estimation of word representations in vector space. arXiv 2013, arXiv:1301.3781.
25. Rychalska, B.; Bąbel, P.; Gołuchowski, K.; Michałowski, A.; Dąbrowski, J.; Biecek, P. Cleora: A simple, strong and scalable graph embedding scheme. In Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, 8–12 December 2021, Proceedings, Part IV 28; Springer: Cham, Switzerland, 2021; pp. 338–352.
26. Sherstinsky, A. Fundamentals of recurrent neural network (RNN) and long short-term memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
27. Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A. Photo-realistic single image super-resolution using a generative adversarial network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 4681–4690.
28. Pathak, D.; Krahenbuhl, P.; Donahue, J.; Darrell, T.; Efros, A.A. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2536–2544.
29. Tzeng, E.; Hoffman, J.; Saenko, K.; Darrell, T. Adversarial discriminative domain adaptation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 7167–7176.
30. Zhou, T.; Li, Q.; Lu, H.; Cheng, Q.; Zhang, X. GAN review: Models and medical image fusion applications. Inf. Fusion 2023, 91, 134–148.
31. Assefa, S.A.; Dervovic, D.; Mahfouz, M.; Tillman, R.E.; Reddy, P.; Veloso, M. Generating synthetic data in finance: Opportunities, challenges and pitfalls. In Proceedings of the First ACM International Conference on AI in Finance, New York, NY, USA, 15–16 October 2020; pp. 1–8.
32. Takahashi, S.; Chen, Y.; Tanaka-Ishii, K. Modeling financial time-series with generative adversarial networks. Phys. A Stat. Mech. Its Appl. 2019, 527, 121261.
33. Kumar, A.; Biswas, A.; Sanyal, S. eCommerceGAN: A generative adversarial network for e-commerce. arXiv 2018, arXiv:1801.03244.
34. Doan, T.; Veira, N.; Keng, B. Generating realistic sequences of customer-level transactions for retail datasets. In Proceedings of the 2018 IEEE International Conference on Data Mining Workshops (ICDMW), Singapore, 17–20 November 2018; IEEE: Piscataway, NJ, USA; pp. 820–827.
35. Zhao, Z.; Kunar, A.; Birke, R.; Chen, L.Y. CTAB-GAN: Effective table data synthesizing. In Proceedings of the Asian Conference on Machine Learning, Virtually, 17–19 November 2021; PMLR, 2021; pp. 97–112.
36. Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27.
37. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. Adv. Neural Inf. Process. Syst. 2017, 30.
38. Salampasis, M.; Siomos, T.; Katsalis, A.; Diamantaras, K.; Christantonis, K.; Delianidi, M.; Karaveli, I. Comparison of RNN and embeddings methods for next-item and last-basket session-based recommendations. In Proceedings of the 2021 13th International Conference on Machine Learning and Computing, Shenzhen, China, 26 February–1 March 2021; pp. 477–484.
39. Tkachuk, S.; Wróblewska, A.; Dabrowski, J.; Łukasik, S. Identifying Substitute and Complementary Products for Assortment Optimization with Cleora Embeddings. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18–23 July 2022; IEEE: Piscataway, NJ, USA.
40. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
41. Tieleman, T.; Hinton, G. Lecture 6.5-rmsprop, Coursera: Neural Networks for Machine Learning; Technical Report, 6; University of Toronto: Toronto, ON, Canada, 2012.
42. Liu, K.; Qiu, G. Lipschitz constrained GANs via boundedness and continuity. Neural Comput. Appl. 2020, 32, 18271–18283.
43. Saxena, D.; Cao, J. Generative adversarial networks (GANs) challenges, solutions, and future directions. ACM Comput. Surv. (CSUR) 2021, 54, 1–42.
44. Karampatsa, M.; Grigoroudis, E.; Matsatsinis, N.F. Retail category management: A review on assortment and shelf-space planning models. In Operational Research in Business and Economics: 4th International Symposium and 26th National Conference on Operational Research, Chania, Greece, June 2015; Springer: Cham, Switzerland, 2015; pp. 35–67.
45. Mahalanobish, O.; Mishra, S.; Das, A.; Misra, S. Capturing Demand Transference in Retail: A Statistical Approach. 2017. SSRN 3227753. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3227753 (accessed on 3 January 2025).


| Consumer | Product | Stocks (Unweighted): EMD | Stocks (Unweighted): JSD | Stocks (Unweighted): Acc | Stocks (Weighted): EMD | Stocks (Weighted): JSD | Stocks (Weighted): Acc | No Stocks: EMD | No Stocks: JSD | No Stocks: Acc |
|---|---|---|---|---|---|---|---|---|---|---|
| Cleora | Cleora | 0.23 | 0.12 | 0.60 | 0.18 | 0.09 | 0.58 | 0.35 | 0.20 | 0.68 | 
| Cleora | w2v | 0.28 | 0.16 | 0.63 | 0.24 | 0.12 | 0.61 | 0.40 | 0.24 | 0.71 | 
| RNN | Cleora | 0.21 | 0.10 | 0.58 | 0.16 | 0.07 | 0.54 | 0.33 | 0.18 | 0.66 | 
| RNN | w2v | 0.26 | 0.14 | 0.61 | 0.22 | 0.11 | 0.59 | 0.38 | 0.22 | 0.69 | 
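To make the EMD and JSD columns easier to interpret, the sketch below shows one common way to compute both quantities for discrete distributions. It assumes that EMD denotes the earth mover's (Wasserstein-1) distance and JSD the Jensen–Shannon divergence, that both distributions are histograms over the same ordered bins with unit spacing, and that the logarithm is base 2 (so JSD lies in [0, 1]). The exact procedure behind the reported values may differ, and the Acc column is not covered here.

```python
import math


def normalize(counts):
    """Turn non-negative counts into a probability distribution."""
    total = float(sum(counts))
    return [c / total for c in counts]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def emd_1d(p, q):
    """1-D earth mover's distance: sum of absolute CDF differences."""
    distance, cdf_gap = 0.0, 0.0
    for pi, qi in zip(p, q):
        cdf_gap += pi - qi
        distance += abs(cdf_gap)
    return distance


real = normalize([4, 4, 0])
fake = normalize([0, 4, 4])
print(jsd(real, fake), emd_1d(real, fake))
```

Both measures are zero for identical distributions and grow as the generated distribution drifts away from the real one, which is why lower EMD and JSD values in the table indicate a closer match.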
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tkachuk, S.; Łukasik, S.; Wróblewska, A. Consumer Transactions Simulation Through Generative Adversarial Networks Under Stock Constraints in Large-Scale Retail. Electronics 2025, 14, 284. https://doi.org/10.3390/electronics14020284