A Hybrid Recommendation System Based on Similar-Price Content in a Large-Scale E-Commerce Environment
Abstract
1. Introduction
2. Literature Review on Recommendation Systems and Price-Based Approaches
2.1. Recent Advances in Deep Learning-Based Recommendation
2.2. Collaborative Filtering
- Advantages:
- Relies solely on user behavior data, without requiring domain-specific knowledge.
- Effectively captures and reflects emerging trends.
- Disadvantages:
- Performance declines when interaction data are sparse, as is the case for new users or newly listed items (the “cold-start” problem).
- Susceptible to recommendation bias, in which a few popular items dominate exposure [24].
2.3. Content-Based Filtering with BM25
- Example: For the query “wireless earphone”, products that contain the query terms more frequently, and whose matching terms have high IDF values, receive higher BM25 scores and appear more prominently in search rankings.
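The scoring behavior in this example can be illustrated with a minimal BM25 sketch. It is not the Elasticsearch implementation; it assumes pre-tokenized documents, the Lucene-style non-negative IDF, and the default parameters k1 = 1.2 and b = 0.75 listed in Appendix A. The function name `bm25_scores` and the sample documents are hypothetical.

```python
import math
from collections import Counter

K1 = 1.2   # term-frequency saturation (default value listed in Appendix A)
B = 0.75   # document-length normalization (default value listed in Appendix A)

def bm25_scores(query, docs):
    """Score each tokenized document in `docs` against the tokenized `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            # Lucene-style IDF, which is always non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + K1 * (1 - B + B * len(d) / avg_len)
            score += idf * tf[term] * (K1 + 1) / norm
        scores.append(score)
    return scores

# Hypothetical product titles, already tokenized.
docs = [
    "wireless earphone bluetooth wireless earphone case".split(),
    "wired earphone with microphone".split(),
    "phone charger cable".split(),
]
scores = bm25_scores("wireless earphone".split(), docs)
```

In this toy corpus the first title matches both query terms more often than the others and therefore receives the highest score, while the third title matches nothing and scores zero.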
2.4. The Concept and Necessity of Price Similarity Recommendation
- If a product’s price exceeds the user’s baseline, its score is reduced to prevent overexposure of high-cost items.
- If a product’s price falls below the baseline, the adjustment depends on user tendencies; for instance, some users avoid excessively cheap items, so the score is adjusted to keep the placement of such items in the ranking balanced.
2.5. Hybrid Recommendation Architecture in High-Traffic Environments
3. Design and Implementation of a Very Large-Scale Hybrid Recommendation System
3.1. Research Process Design
3.2. Algorithm Proposal and Performance Verification Strategy for Analyzing Practical Effect
- A/B Test design
  - Subject assignment: Users are randomly selected from the overall user base.
    - Experimental Group 1 (Hybrid): Receives hybrid recommendations integrating CF, BM25, and price similarity.
    - Experimental Group 2 (CF-only): Receives recommendations based solely on CF.
    - Control Group (popular products): Receives a list of popular products refreshed every four hours, reflecting the existing operational approach.
- Experiment period and data collection
  - The experiment runs for approximately 8–16 days, during which recommendation outputs and user interactions are recorded.
  - An adequate sample size (in terms of sessions and users) is ensured to support statistically valid analysis.
- Evaluation procedure
  - Recommendation logging: Each session’s recommendations and their ranking positions (e.g., Top-N) are stored in logs.
  - Click and purchase tracking: User clicks and completed purchases are tracked at the session and item levels.
  - Key Metrics:
    - CTR: Ratio of clicks to impressions.
    - Purchases: Number of transactions completed during a test period.
    - Revenue: Calculated from total payment amounts.
  - Statistical testing: CTR, purchase counts, and revenue are compared across groups to assess whether differences are statistically significant. Performance is evaluated to determine whether the hybrid method outperforms the CF-only and popular-products baselines.
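The statistical-testing step can be sketched as follows. The text does not state which test is applied, so this example assumes a pooled two-proportion z-test on CTR, a common choice for comparing click rates between two groups. The function name and all counts are hypothetical and are not taken from the experiment.

```python
import math

def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Return (z, two-sided p-value) for the null hypothesis CTR_A == CTR_B."""
    ctr_a = clicks_a / views_a
    ctr_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis of equal CTRs.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (ctr_a - ctr_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts for a treatment group (A) and a control group (B).
z, p = two_proportion_z_test(clicks_a=5_300, views_a=100_000,
                             clicks_b=5_000, views_b=100_000)
significant = p < 0.05
```

Purchase counts and revenue are usually noisier than CTR, so in practice they may call for a different test (for example, a t-test or bootstrap on per-user revenue) than the one assumed here.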
3.3. Designing the Architecture for a Large-Scale Hybrid Recommendation System
- Distributed caching. To accelerate retrieval, similarity scores, particularly cosine similarity, are pre-computed and stored in a multi-node cache system (e.g., Redis). During a request, the cached similarity values are directly integrated with BM25 search results, minimizing additional computation. This considerably reduces response time by offloading the most resource-intensive operations. Because the cache operates in a distributed structure, workload surges on individual nodes can be balanced across the cluster, ensuring both stability and scalability.
- Horizontal scaling. The architecture is designed for scale-out deployment, where the search engine, cache servers, and analysis servers (responsible for statistical processing and model computation) are deployed as independent nodes or containers. Load balancing ensures that traffic spikes are distributed across the cluster, preventing bottlenecks. Since nodes can be dynamically added or removed, the system can flexibly adapt to unpredictable traffic patterns while balancing operational costs and performance.
- Memory optimization through hashing. To improve memory efficiency, hashing techniques such as CRC32 and simplified similarity tables are applied. For example, in CF computations, hash-based compression reduces the memory required for storing similarity scores and indexing tables while maintaining recommendation accuracy. This approach is particularly important for large-scale user-item datasets, where memory access costs directly affect processing speed. By minimizing collisions and enabling rapid lookups, the hashing strategy keeps system performance stable as the data volume grows. Combined with distributed caching and horizontal scaling, it allows data operations in a distributed environment to be managed flexibly.
- Database
  - Distributed RDB: A relational database distributed across multiple nodes is employed to ensure reliable large-scale transaction processing.
  - NoSQL (MongoDB): MongoDB is used as a complementary solution for storing semi-structured and unstructured data and for supporting high-speed read/write operations.
- Search engine
  - Elasticsearch: Responsible for searching text fields such as product names, categories, and descriptions, and for computing BM25 scores. It is particularly suitable for large-scale document processing and real-time tasks.
- Distributed cache
  - Redis: Enhances query speed by caching frequently accessed metadata (e.g., session information, recommendation results). Scalability and fault tolerance are achieved through data partitioning and replication.
- Analysis tools and pipeline
  - Java-based statistical analysis: In a Java environment for large-scale processing, logs are aggregated and analyzed to support model refinement. Large-scale traffic monitoring is conducted using Scouter.
  - Log processing pipeline: Collects and stores event-based logs in both real-time and batch modes, supporting the computation and visualization of statistical metrics.
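The hash-based memory optimization described above can be illustrated with a small sketch. It assumes that each unordered product pair is reduced to a 32-bit CRC32 key and that similarity values are rounded before storage; the key layout, the `SimilarityTable` class, and the product IDs are hypothetical and are not the system's actual data structures.

```python
import zlib

def pair_key(item_a: str, item_b: str) -> int:
    """Map an unordered product pair to a 32-bit integer key via CRC32."""
    # Sorting makes the key independent of argument order.
    first, second = sorted((item_a, item_b))
    # Assumes product IDs never contain the "|" separator.
    return zlib.crc32(f"{first}|{second}".encode("utf-8"))

class SimilarityTable:
    """Compact in-memory table keyed by CRC32 hashes instead of ID strings."""

    def __init__(self):
        self._scores = {}

    def put(self, item_a, item_b, score):
        # Rounding is a further hypothetical simplification of stored values.
        self._scores[pair_key(item_a, item_b)] = round(score, 3)

    def get(self, item_a, item_b, default=0.0):
        return self._scores.get(pair_key(item_a, item_b), default)

table = SimilarityTable()
table.put("P1001", "P2002", 0.41237)
value = table.get("P2002", "P1001")
```

A 32-bit hash can map two different pairs to the same key, so a production table would need a collision policy (for example, keeping the higher score or verifying the pair on read); this sketch simply lets the later write win.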
3.4. Building a Hybrid Recommendation System Based on CF and a Search Engine
- Preprocessing and threshold setting. To maximize the efficiency of processing large-scale user-product interaction logs, a preprocessing step was conducted in the offline stage to retain only data that satisfied specific criteria. The distributions of products and users were first analyzed, and thresholds were established to exclude cases with extremely low or disproportionately high view frequencies. In addition, thresholds were dynamically adjusted using metrics that varied according to the system’s application environment (e.g., user-to-product ratios). This approach prevented an excessive number of candidate pairs and improved computational efficiency.
- Co-occurrence extraction. From the filtered interaction history, only products that met the criteria were selected, and the co-occurrence count between product pairs was calculated based on the set of users who had viewed each product. Specifically, the method tracked the frequency with which users who viewed one product also viewed another. This co-occurrence information was accumulated in a parallel and distributed manner, ensuring efficient computation for subsequent similarity estimation.
- Similarity calculation (cosine similarity). Using the extracted co-occurrence data, cosine similarity values were computed by combining the co-occurrence counts with exposure frequencies (e.g., the number of users who viewed or clicked each product). Noise was reduced by applying the standard cosine normalization, with the co-occurrence count in the numerator and the square root of the product of the two individual view counts in the denominator, and by excluding pairs below a defined threshold. The resulting similarity values provided a quantitative measure of correlations among products.
- Category-based correction. Product classification information was then incorporated to refine the similarity estimates. Additional weights were applied to product pairs within the same or similar categories, allowing the system to prioritize candidates with stronger domain-relevant similarity rather than relying solely on view-based measures. This adjustment integrated domain knowledge into the recommendation process and captured category-specific characteristics more accurately.
- Recommendation list composition and sorting. For the extracted product pairs, a recommendation list was generated by ranking them in descending order of similarity. The products were then grouped by category or closely related categories, facilitating use in specific scenarios (e.g., shopping cart recommendations or set configurations). Items that did not meet the minimum occurrence threshold were filtered out to improve the reliability of the final list.
- Result storage and merging. The final recommendation candidates were stored in a fast-access storage system (e.g., caching servers or external file storage) to enable real-time retrieval. These results were later combined with other offline analyses, such as user purchase history, and aggregated into a format immediately usable at critical interaction points (e.g., when a user opens a product detail page). This strategy reduces response time in high-traffic online environments and supports the implementation of a multi-factor recommendation system that integrates multiple metrics.
- Expansion and optimization strategy. The algorithm can adapt to varying scales of user traffic and product diversity by dynamically adjusting parameters such as occurrence thresholds and category weights (Appendix A). In large-scale environments, caching and streaming techniques distribute the computational load and accelerate processing. Moreover, incorporating additional metadata, such as brand or price range, enables the system to deliver more refined recommendation functions beyond basic category-based weighting.
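The co-occurrence extraction, cosine normalization, and category-based correction steps above can be sketched on a single machine as follows. The threshold `min_cooc`, the multiplier `category_boost`, and the sample data are hypothetical placeholders; the production pipeline runs in a parallel and distributed manner, which this sketch does not attempt to reproduce.

```python
import math
from collections import defaultdict
from itertools import combinations

def item_similarities(views, categories, min_cooc=2, category_boost=1.2):
    """views: {user_id: set of viewed product ids}; returns {(a, b): score}."""
    view_count = defaultdict(int)   # number of users who viewed each product
    cooc = defaultdict(int)         # number of users who viewed both products
    for items in views.values():
        for item in items:
            view_count[item] += 1
        for a, b in combinations(sorted(items), 2):
            cooc[(a, b)] += 1

    scores = {}
    for (a, b), count in cooc.items():
        if count < min_cooc:        # drop pairs below the occurrence threshold
            continue
        # Cosine normalization: co-occurrence over sqrt of the view counts.
        sim = count / math.sqrt(view_count[a] * view_count[b])
        cat_a = categories.get(a)
        if cat_a is not None and cat_a == categories.get(b):
            sim *= category_boost   # category-based correction
        scores[(a, b)] = sim
    return scores

# Hypothetical interaction history and category labels.
views = {"u1": {"A", "B", "C"}, "u2": {"A", "B"}, "u3": {"A", "C"}, "u4": {"B"}}
categories = {"A": "shoes", "B": "shoes", "C": "bags"}
scores = item_similarities(views, categories)
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Here the pair (B, C) is viewed together by only one user and is filtered out, while (A, B) receives the same-category boost.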
3.5. Applying Price-Similarity Weighting in the Hybrid Recommendation Engine
Algorithm 1. Price-adjusted scoring function
Input: document price, product price
Output: Adjusted score reflecting the document’s price relative to the product price
1. For each document price in the document set:
2.   If the document price is greater than the product price:
3.     Compute the price difference: document price − product price.
4.     Increment the difference by 1.
5.     Compute the relative price: difference / document price.
6.     Take the natural logarithm: log(relative price).
7.     Compute the adjustment factor: log(relative price) × −2.
8.     Update the adjusted score: _score + adjustment factor.
9.     Return the adjusted score.
10.   Else if the document price is less than the product price:
11.     Compute the price difference: product price − document price.
12.     Increment the difference by 1.
13.     Compute the relative price: difference / product price.
14.     Take the natural logarithm: log(relative price).
15.     Compute the adjustment factor: log(relative price) × −1.2.
16.     Update the adjusted score: _score + adjustment factor.
17.     Return the adjusted score.
18.   End if
19. End for
Note: _score denotes the document’s initial score before price adjustment. The logarithmic transformation ensures that the influence diminishes as the price difference increases.
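Algorithm 1 can be transcribed into a short function for a single document. This is a sketch under stated assumptions: prices are positive numbers, the multipliers 2 and 1.2 are exposed as the hypothetical parameters `higher_weight` and `lower_weight`, and a document whose price equals the product price keeps its score unchanged, a case the algorithm does not define.

```python
import math

def price_adjusted_score(score, doc_price, product_price,
                         higher_weight=2.0, lower_weight=1.2):
    """Apply the Algorithm 1 adjustment to one document's initial score."""
    if doc_price > product_price:
        # Steps 3-8: document is more expensive than the reference product.
        relative = (doc_price - product_price + 1) / doc_price
        return score + math.log(relative) * -higher_weight
    if doc_price < product_price:
        # Steps 11-16: document is cheaper than the reference product.
        relative = (product_price - doc_price + 1) / product_price
        return score + math.log(relative) * -lower_weight
    # Equal prices: not covered by Algorithm 1; left unchanged by assumption.
    return score

# Hypothetical example: reference product priced at 10,000, base score 5.0.
base = 5.0
near_higher = price_adjusted_score(base, 10_500, 10_000)
far_higher = price_adjusted_score(base, 20_000, 10_000)
near_lower = price_adjusted_score(base, 9_500, 10_000)
far_lower = price_adjusted_score(base, 5_000, 10_000)
```

In both branches the added term is largest when the two prices are close and shrinks toward zero as the gap widens, which matches the note that the influence diminishes as the price difference increases.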
4. Experiments and Analysis
4.1. Experimental Method
4.2. Test of CF-Only Recommendation Vs. Popular Items
4.3. Test of the Hybrid Recommendation Method Versus Popular Items
5. Discussion
- Effectiveness of price-similarity recommendations. Prioritizing items within a user’s preferred price range led to notable gains in CTR and purchase conversion. These results indicate that, within the same product category, price sensitivity significantly influences purchasing behavior. Thus, incorporating personalized price-range filtering effectively enhances user engagement and increases the likelihood of purchase.
- Improvement in recommendation diversity. The proposed method reduced exposure bias toward popular items, increasing visibility for new and less popular products. This outcome demonstrates that the hybrid approach successfully fosters diversity in product recommendations, thereby broadening user options and reducing algorithmic homogenization.
6. Conclusions
- Practical applications
  - In high-traffic domains such as online marketplaces or streaming platforms, the combination of CF, BM25, and price similarity can be flexibly tuned by adjusting their respective weights.
  - A practical implementation strategy is to integrate CF and BM25 scores as a baseline, followed by incorporating price similarity as a weighting factor to generate the final recommendation list.
- Limitations
  - Implementation complexity: Deploying the system requires advanced search engine configuration, caching strategies, and large-scale data pipelines, which demand substantial development and operational expertise.
  - Variability in price sensitivity: For branded, luxury, or niche products, price exerts limited influence on user decisions. In such cases, alternative or supplementary recommendation strategies should be considered to address varying levels of price elasticity across product categories.
  - Generalizability of findings: A primary limitation of this study is its reliance on data from a single, large-scale domestic e-commerce platform. Consequently, the findings regarding the impact of price sensitivity may not be directly generalizable to other platforms that differ in their product categories, primary customer demographics, or overall market environment.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Appendix A.1. Implementation Details and Parameters
Component | Parameter | Value/Description |
---|---|---|
BM25 Algorithm | k1 (Term frequency saturation) | 1.2 (Elasticsearch default) |
BM25 Algorithm | b (Document length normalization) | 0.75 (Elasticsearch default) |
Hybrid Scoring | Score-Based Candidate List Generation | As illustrated in Figure 2, the final score was calculated as a weighted sum of the BM25 score, price similarity score, and category similarity score. Weights were empirically tuned. |
Hybrid Scoring | Final List Integration | When integrating the CF-based and score-based candidate lists, CF-based recommendations were prioritized for items appearing in both lists. |
Redis Caching Strategy | Time-To-Live (TTL) for CF recommendations | 172,800 s (48 h) |
Redis Caching Strategy | Time-To-Live (TTL) for price-similarity search results | 14,400 s (4 h) |
Redis Caching Strategy | Cache Update Policy | The CF cache was refreshed every 24 h to reflect the latest results from the offline model. |
Elasticsearch Indexing Strategy | Bulk Index Policy | The Elasticsearch product information is refreshed every 4 h to reflect the latest results from the offline model. |
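The hybrid scoring and final list integration rows can be illustrated with a short sketch. The weights are placeholders (the tuned values are not reported), and the merge rule is one possible reading of "CF-based recommendations were prioritized": the CF list keeps its order and position, and score-based candidates that are not already present are appended in descending score order. Both function names are hypothetical.

```python
def hybrid_score(bm25, price_sim, category_sim, weights=(0.6, 0.25, 0.15)):
    """Weighted sum of the three component scores; weights are placeholders."""
    w_bm25, w_price, w_cat = weights
    return w_bm25 * bm25 + w_price * price_sim + w_cat * category_sim

def merge_candidates(cf_items, scored_items, top_n=10):
    """cf_items: ranked product ids; scored_items: {product_id: hybrid score}."""
    merged = list(dict.fromkeys(cf_items))   # keep CF order, drop repeats
    seen = set(merged)
    by_score = sorted(scored_items.items(), key=lambda kv: kv[1], reverse=True)
    # Items already supplied by CF keep their CF position; the rest follow.
    merged.extend(pid for pid, _ in by_score if pid not in seen)
    return merged[:top_n]

# Hypothetical candidates: "B" appears in both lists and keeps its CF slot.
final_list = merge_candidates(["A", "B"], {"B": 0.9, "C": 0.5, "D": 0.7})
```

With these inputs the result is A, B, D, C: the two CF items come first, followed by the remaining score-based candidates ordered by score.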
References
- Fayyaz, Z.; Afzal, A.; Mirza, H.T. Recommendation Systems: Algorithms, Challenges, Metrics, and Business Opportunities. Appl. Sci. 2020, 10, 7748.
- Zhang, S.; Yao, L.; Sun, A.; Tay, Y. Deep Learning Based Recommender System: A Survey and New Perspectives. ACM Comput. Surv. 2019, 52, 5.
- Krishna, E.S.P.; Ramu, T.B.; Chaitanya, R.K.; Ram, M.S.; Balayesu, N.; Gandikota, H.P.; Jagadesh, B.N. Enhancing E-commerce recommendations with sentiment analysis using MLA-EDTCNet and collaborative filtering. Sci. Rep. 2025, 15, 6739.
- Ou, T.Y.; Chen, C.H.; Tsai, W.L. Establishing a Dynamic Recommendation System for E-commerce by Integrating Online Reviews, Product Feature Expansion, and Deep Learning. Appl. Artif. Intell. 2025, 39, 2463723.
- Pei, C.; Yang, X.; Cui, Q.; Lin, X.; Sun, F.; Jiang, P.; Ou, W.; Zhang, Y. Value-aware Recommendation based on Reinforcement Profit Maximization. In Proceedings of the World Wide Web Conf 2019, San Francisco, CA, USA, 13–17 May 2019; pp. 3123–3129.
- Gomez-Uribe, C.A.; Hunt, N. The Netflix Recommender System: Algorithms, Business Value, and Innovation. ACM Trans. Manag. Inf. Syst. 2016, 6, 13.
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
- Guo, Q.; Zhuang, F.; Qin, C.; Zhu, H.; Xie, X.; Xiong, H.; He, Q. A Survey on Knowledge Graph-Based Recommender Systems. IEEE Trans. Knowl. Data Eng. 2022, 34, 3549–3568.
- Wu, S.; Sun, F.; Zhang, W.; Xie, X.; Cui, B. Graph neural networks in recommender systems: A survey. ACM Comput. Surv. 2022, 55, 97.
- Gao, C.; Zheng, Y.; Li, N.; Li, Y.; Qin, Y.; Piao, J.; Quan, Y.; Chang, J.; Jin, D.; He, X.; et al. A Survey of Graph Neural Networks for Recommender Systems: Challenges, Methods, and Directions. arXiv 2023, arXiv:2109.12843.
- Wu, J.; Wang, X.; Feng, F.; He, X.; Chen, L.; Lian, J.; Xie, X. Self-supervised Graph Learning for Recommendation. In Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, Virtual Event Canada, 11–15 July 2021.
- Zhao, Z.; Fan, W.; Li, J.; Liu, Y.; Mei, X.; Wang, Y. Recommender Systems in the Era of Large Language Models (LLMs). IEEE Trans. Knowl. Data Eng. 2024, 36, 6889–6907.
- Min, X.; Sun, Z.; Liu, F.; Li, Q.; Liu, Y.; Zhang, D. Matrix Factorization Recommendation Algorithm Based on Attention Interaction. Symmetry 2024, 16, 267.
- Covington, P.; Adams, J.; Sargin, E. Deep Neural Networks for YouTube Recommendations. In Proceedings of the 10th ACM Conference on Recommender Systems, Boston, MA, USA, 15–19 September 2016; pp. 191–198.
- Bell, R.M.; Koren, Y. Scalable Collaborative Filtering with Jointly Derived Neighborhood Interpolation Weights. In Proceedings of the Seventh IEEE International Conference on Data Mining (ICDM 2007), Omaha, NE, USA, 28–31 October 2007; pp. 43–52.
- Jiang, W. Enhancing Operational Efficiency in E-Commerce Through Artificial Intelligence and Information Management Integration. Rev. d’Intell. Artif. 2023, 37, 1441–1449.
- Asfar, M.; Crump, T.; Far, B. Reinforcement Learning based Recommender Systems: A Survey. ACM Comput. Surv. 2022, 55, 38.
- Lu, J.; Wu, D.; Mao, M.; Wang, W.; Zhang, G. Recommender System Application Developments: A Survey. Decis. Support Syst. 2015, 74, 12–32.
- Lu, S.; Liu, Z.; Yang, X.; Ding, Y.; Gao, Y.; Yuan, Y. Meta-Learning for Debiasing Recommendation using Simulated Uniform Data. In Proceedings of the 2024 IEEE International Conference on Big Data (BigData), Washington, DC, USA, 15–18 December 2024.
- Zheng, Y.; Gao, C.; He, X.; Li, Y.; Jin, D. Price-aware Recommendation with Graph Convolutional Networks. In Proceedings of the 36th International Conference on Data Engineering (ICDE), Dallas, TX, USA, 20–24 April 2020; pp. 1–12.
- Adomavicius, G.; Tuzhilin, A. Toward the next generation of recommender systems: A survey of the state-of-the-art and possible extensions. IEEE Trans. Knowl. Data Eng. 2005, 17, 734–749.
- Dong, Y.; Chawla, N.V.; Swami, A. metapath2vec: Scalable Representation Learning for Heterogeneous Networks. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 135–144.
- Su, X.; Khoshgoftaar, T.M. A survey of collaborative filtering techniques. Adv. Artif. Intell. 2009, 2009, 421425.
- Fleder, D.; Hosanagar, K. Blockbuster Culture’s Next Rise or Fall: The Impact of Recommender Systems on Sales Diversity. Manag. Sci. 2009, 55, 697–712.
- He, X.; Liao, L.; Zhang, H.; Nie, L.; Hu, X.; Chua, T.S. Neural Collaborative Filtering. In Proceedings of the 26th International Conference on World Wide Web, Perth, Australia, 3–7 April 2017; pp. 173–182.
- Agarwal, S.; Mehta, R.; Singh, V. A Healthy Food Recommendation System Using KNN Model and Elasticsearch with Quantum Computing. In Handbook of Research on Predictive Analysis Using AI Tools and Applications; IGI Global: Hershey, PA, USA, 2024; pp. 1–18.
- Liu, H.; Wang, Y.; Zhang, T. Optimizing Real Estate Recommendations with Elasticsearch and Collaborative Filtering. In Advanced Computational Intelligence, Proceedings of the ICACI 2024, Singapore, 5–7 January 2024; Springer: Singapore, 2024; pp. 187–201.
- Burke, R. Hybrid recommender systems: Survey and experiments. User Model. User-Adapt. Interact. 2002, 12, 331–370.
- Lian, J.; Zhang, F.; Chen, X.; Xie, X.; Sun, G. XDeepFM: Combining explicit and implicit feature interactions for recommender systems. In Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, London, UK, 19–23 August 2018; pp. 1754–1763.
- Robertson, S.; Zaragoza, H. The Probabilistic Relevance Framework: BM25 and Beyond. Found. Trends Inf. Retr. 2009, 3, 333–389.
- Salton, G.; Buckley, C. Term-weighting approaches in automatic text retrieval. Inf. Process. Manag. 1988, 24, 513–523.
- Sedhain, S.; Mehta, A.; Suthossat, S.; Piech, C. AutoRec: Autoencoders meet collaborative filtering. In Proceedings of the 24th International Conference on World Wide Web, Florence, Italy, 18–22 May 2015; pp. 92–93.
- Moreno, A.; Mehmood, R.; Hästbacka, E.; Ericsson, D. Exploring the landscape of hybrid recommendation systems in e-commerce: A systematic literature review. IEEE Access 2024, 12, 34413–34433.
- Sharma, S.; Baishya, K.; Pandey, M.; Rautaray, S. Hybrid Product Recommendation System using Popularity Based and Content-Based Filtering. In Proceedings of the 2023 International Conference on Data Science, Agents & Artificial Intelligence, New York, NY, USA, 24–26 February 2023; pp. 642–649.
- Ratchford, B.T. Online Pricing: Review and Directions for Research. J. Interact. Mark. 2009, 23, 82–90.
- Luo, H. Research on the impact of online promotions on consumers’ impulsive online shopping intentions. J. Theor. Appl. Electron. Commer. Res. 2021, 16, 2386–2404.
- Lin, C.C.; Chien, T.K.; Ma, H.Y. The effects of popularity: An online store perspective. Int. J. Inf. Sci. Manag. 2014, 12, 1–11.
- Davidson, J.; Liebald, B.; Liu, J.; Nandy, P.; Van Vleet, T.; Gargi, U.; Waterson, P. The YouTube Video Recommendation System. In Proceedings of the Fourth ACM Conference on Recommender Systems, Barcelona, Spain, 26–30 September 2010; pp. 293–296.
- Park, Y.J.; Tuzhilin, A. The long tail of recommender systems and how to leverage it. In Proceedings of the 2008 ACM Conference on Recommender systems, Lausanne, Switzerland, 23–25 October 2008; pp. 11–18.
- Abdollahpouri, H.; Burke, R.; Mobasher, B. Controlling Popularity Bias in Learning-to-Rank Recommendation. In Proceedings of the Eleventh ACM Conference on Recommender Systems, Como, Italy, 27–31 August 2017; pp. 42–46.
- Han, G.; Feng, Z.; Xu, Y. Bundle Recommendation for Budget-Conscious Consumers. Tsinghua Sci. Technol. 2024. Available online: https://www.sciopen.com/article/10.26599/TST.2024.9010144?issn=1007-0214 (accessed on 30 September 2025).
- Amatriain, X.; Basilico, J. Recommender Systems in Industrial Settings. In Recommender Systems Handbook, 2nd ed.; Ricci, F., Rokach, L., Shapira, B., Eds.; Springer: New York, NY, USA, 2016; pp. 385–419.
- Kohavi, R.; Deng, A.; Frasca, B.; Walker, T.; Xu, Y.; Pohlmann, N. Online Controlled Experiments at Large Scale. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Chicago, IL, USA, 11–14 August 2013; pp. 1168–1176.
Component | Key Takeaways |
---|---|
Distributed search engine utilization | Uses distributed search engines such as Elasticsearch or Solr for product searches that integrate BM25 and price weighting, allowing horizontal scaling of query performance under heavy traffic. Well-suited for scale-out architectures. |
Memory efficiency and caching | Reduces the computational cost of collaborative filtering by caching predicted results for frequently queried or computed user-item pairs. Ensures rapid response times by minimizing redundant computations. |
Log ingestion and real-time analysis | Collects click, cart, and purchase logs through an event streaming platform (e.g., Kafka) and performs batch, parallel, and real-time analysis. Rapidly integrates analysis outcomes into the recommendation model, improving freshness and accuracy. |
Hybrid ranking adjustment | Determines the final recommendation order using a ranking function that combines BM25 scores, CF scores, and price weights. Built with a modular structure that supports scale-out in large-scale environments. |
Metric | Definition & Characteristics | Application & Significance |
---|---|---|
CTR (Click-Through Rate) | Ratio of clicks to impressions for a product or recommendation slot. Quantifies user engagement with recommendation results. Serves as an indirect indicator of system effectiveness. | Used to refine recommendation strategies and gauge user interest. Optimization results are validated through CTR changes. |
Purchase Growth Rate | Measures the extent to which purchasing activity is stimulated by comparing the number of purchases before and after the implementation of the recommender system. Indicates whether higher CTR translates into actual purchase growth. | Assesses whether CTR improvements lead to conversions. Demonstrates the business impact of the recommendation strategy over time. |
Revenue | Compares total sales before and after the implementation of the recommender system. Captures the algorithm’s contribution to revenue growth. | Confirms financial outcomes (ROI) in relation to CTR and purchase trends. Provides persuasive evidence for management or resource allocation. |
TPS (Transactions Per Second) | Indicates the number of transactions processed per second. Serves as an indicator of system stability and scalability under high-traffic conditions. Confirms whether operational requirements, such as zero-downtime index updates, are satisfied. | Evaluates the system’s capacity to manage a sudden surge in concurrent users or traffic when the recommender system is deployed. Detects performance bottlenecks and validates scalability. |
Event Type | Description | Weight |
---|---|---|
View | Product viewed by clicking | 1 |
Cart | Product added to cart | 2 |
Purchase | Purchased product | 3 |
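The event weights in this table can be applied to raw logs as in the sketch below. It assumes that weights for repeated events on the same user-product pair are summed; the table does not specify the aggregation rule (taking the maximum would be another option), and the function name and sample events are hypothetical.

```python
from collections import defaultdict

# Weights taken from the event table above.
EVENT_WEIGHTS = {"view": 1, "cart": 2, "purchase": 3}

def interaction_strength(events):
    """events: iterable of (user_id, product_id, event_type) tuples."""
    strength = defaultdict(int)
    for user_id, product_id, event_type in events:
        # Summing weights is an assumption of this sketch.
        strength[(user_id, product_id)] += EVENT_WEIGHTS[event_type]
    return dict(strength)

# Hypothetical log entries.
events = [
    ("u1", "P1", "view"),
    ("u1", "P1", "cart"),
    ("u2", "P1", "purchase"),
    ("u2", "P2", "view"),
]
strengths = interaction_strength(events)
```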
Dictionary Type | Word List |
---|---|
Stopword dictionary | control, reheating, stop button, at time of purchase, when adding, bestseller, free gift random giveaway, series, multiple, choice product, special price event, multi-purpose, rush of orders, owner’s recommendation, limited quantity, discount, coupon, official importer, option, single item sale, free shipping, same-day shipping, time-sensitive, points, or more, choose 1, way-drawer, launch, super high quality, of material, BEST, recommendation, launching, collection, coming soon, Namyang, more than X sheets, same-day shipping, fashion beauty week, free shipping, time-sensitive, self-produced, special price in progress, store price, domestic invention patent, product, direct shipping, use, additional, limited, limited quantity, discount, new launching, points, coupon discount, wonder coupon, mat excluded, official importer, option, single item sale, additional choice, included, free round-trip shipping, round-trip shipping included, free round-trip shipping, available, product sale, partner card |
Weighted word dictionary | men, women, male, female, girl, boy, child, infant, for infants, kids, stage 1–4, jacket, pants, belt, hat, gloves, ball cap, cap, sun cap, heavy down, golf shoes, scrub holder, extended padding, loafer, sneakers, bloafer, knit cap, cardigan, bikini, shirt, jumpsuit, v-neck top, lace, culottes, skirt, cooler bag, colored pencil, swimsuit, card, stiletto heels, socks, headset, clutch, dinosaur, glue stick, sandals, note, toothpaste, wet wipes, beauty tissues, tteokbokki, shorts, string pants, liner socks, slippers, long one-piece dress, turtleneck, culottes, spikes, fleece-lined, neck cushion, newborn baby clothes, trash can, long skirt, gauze, airway maintainer, Velpeau band, cloth plaster, non-sterile gauze, sterile gauze, roll gauze, absorbent cotton, cotton bandage, cotton ball, compressed cotton, InBody, height scale, cast, mugwort moxibustion, heating mat, electric mat, hot water mat, rhinitis, nasal wash, moxibustion tool, mini moxibustion, fire cupping set, cupping, cupping device, moxibustion device, bathroom slippers, Black Friday, hot pack, neck traction, turtle neck traction device, disc care, vision test chart, toothbrush, bath thermometer, digital thermometer, colored jeans, sweater, padded jumper, suspenders, apron |
Noun Dictionary | tight, female, male, spring, summer, autumn, winter, timing, leg, value, British, stripe, girl, boy, pants, button one-piece dress, halter one-piece dress, bikini, girl, short sleeve, melody, halter, culottes pants, Kanuda, I am Mother, fresh, arugula, for in-flight, sweetim, dream, bear, beanie, Pororo, heavy equipment, basketball, basketball (ball), desk, chair, set, complete set, random color shipping, pang-pang, C’est Vsi Bon, houndstooth, Novalac, pillow stuffing, Superga, bed, captain, u-line, sweet pumpkin, commercial, living room cabinet, cross-body bag, Unika, smart, hanger, Saekom-Dalkom, lace, point, one-piece dress, simple, ring, desk, clay bed, Harim, rabbit, Bioderma, clothing, slingback, solid wood, baby high chair, dried fruit, coffee, waterproof, name sticker, party, card, flower, materials, French Cafe, Cafe Mix, Coffee Mix, Chamgreen, one-touch, kimchi refrigerator, drawer container, stainless steel, freezer compartment, refrigerator compartment, flatty, flat, tall small, basic set, tongs, funnel, storage container, no-collar, wine, red wine, white wine, color scheme, one-button, jacket, zipper, fine dust, yellow dust, mask, navigation, handmade |
Day | View A | View B | CTR Diff. | Purchase Diff. | Revenue Diff. |
---|---|---|---|---|---|
1 | 10,242,358 | 10,244,362 | 3.15% | −5.11% | 15.88% |
2 | 10,794,068 | 10,793,921 | 2.02% | −1.90% | −2.02% |
3 | 10,865,519 | 10,858,848 | 3.64% | −3.44% | 4.41% |
4 | 11,105,431 | 11,106,800 | 3.24% | 6.19% | 20.32% |
5 | 8,463,556 | 8,461,972 | 2.44% | 0.85% | 1.18% |
6 | 8,680,147 | 8,685,480 | 2.81% | 5.68% | 34.88% |
7 | 10,965,151 | 10,966,954 | 3.70% | 2.14% | −8.39% |
8 | 11,068,590 | 11,068,059 | 3.29% | −3.06% | −28.63% |
9 | 11,226,542 | 11,225,323 | 2.39% | −4.41% | −23.36% |
10 | 10,796,796 | 10,799,107 | 3.55% | 3.49% | −15.06% |
11 | 9,927,224 | 9,928,889 | 3.08% | 8.47% | −5.55% |
12 | 8,022,208 | 8,016,256 | 2.90% | 5.04% | −13.52% |
13 | 8,106,215 | 8,108,953 | 3.80% | 9.02% | −5.56% |
14 | 10,497,534 | 10,495,112 | 2.86% | 13.52% | −12.09% |
Total | 24,378,054 | 24,381,210 | 3.06% | 1.98% | −5.10% |
Day | View A | View B | CTR Diff. | Purchase Diff. | Revenue Diff. |
---|---|---|---|---|---|
1 | 10,706,774 | 10,709,954 | 0.17% | 5.70% | 78.92% |
2 | 8,611,818 | 8,620,872 | 6.58% | 3.46% | 2.98% |
3 | 10,788,443 | 10,781,719 | 5.56% | −1.24% | −21.40% |
4 | 10,760,489 | 10,763,836 | −1.11% | 3.38% | −5.20% |
5 | 11,036,644 | 11,032,724 | 2.79% | −2.22% | −2.98% |
6 | 8,868,809 | 8,876,145 | 1.01% | −9.32% | −5.37% |
7 | 9,483,432 | 9,479,642 | 2.97% | −11.01% | 23.77% |
8 | 10,383,113 | 10,392,445 | 2.00% | −4.63% | −1.11% |
9 | 10,996,398 | 10,993,813 | 2.90% | −6.48% | 11.13% |
10 | 10,768,384 | 10,768,389 | 4.10% | 4.46% | 5.31% |
11 | 10,498,714 | 10,499,502 | 3.44% | −7.77% | −1.07% |
12 | 9,387,142 | 9,394,486 | 4.18% | 8.46% | −2.49% |
13 | 7,051,044 | 7,051,973 | 1.91% | 15.20% | 65.74% |
14 | 7,714,353 | 7,707,807 | 1.52% | −3.99% | 2.73% |
15 | 9,694,423 | 9,690,520 | 3.33% | −1.88% | −8.62% |
16 | 9,074,197 | 9,075,797 | 1.55% | 4.59% | 5.38% |
17 | 9,987,808 | 9,992,299 | 3.16% | −5.65% | −13.06% |
18 | 8,551,118 | 8,549,539 | 0.86% | −3.31% | 21.13% |
19 | 10,204,142 | 10,205,193 | 1.62% | 5.05% | 7.60% |
20 | 8,212,553 | 8,204,835 | 1.88% | −6.26% | −3.48% |
21 | 8,451,097 | 8,454,349 | 2.02% | −3.05% | −13.06% |
22 | 10,261,191 | 10,257,224 | 2.35% | 0.52% | −11.75% |
23 | 10,242,358 | 10,244,362 | 3.15% | −5.11% | 15.88% |
24 | 10,794,068 | 10,793,921 | 2.02% | −1.90% | −2.02% |
25 | 10,865,519 | 10,858,848 | 3.64% | −3.44% | 4.41% |
26 | 11,105,431 | 11,106,800 | 3.24% | 6.19% | 20.32% |
27 | 8,463,556 | 8,461,972 | 2.44% | 0.85% | 1.18% |
28 | 8,680,147 | 8,685,480 | 2.81% | 5.68% | 34.88% |
Total | 271,643,165 | 271,654,446 | 2.55% | −0.53% | 5.55% |
Day | View A | View B | CTR Diff. | Purchase Diff. | Revenue Diff. |
---|---|---|---|---|---|
1 | 6,120,323 | 6,127,115 | 0.64% | 9.05% | 136.00% |
2 | 4,703,805 | 4,712,818 | 11.01% | −2.57% | −7.67% |
3 | 6,214,008 | 6,206,601 | 8.33% | 1.16% | −28.20% |
4 | 6,055,943 | 6,057,083 | −2.15% | 10.06% | −4.67% |
5 | 6,290,783 | 6,290,295 | 5.06% | −1.27% | −6.17% |
6 | 4,853,853 | 4,860,400 | 2.54% | −5.30% | 13.58% |
7 | 5,251,918 | 5,251,540 | 4.12% | −4.79% | 22.13% |
8 | 5,829,956 | 5,832,374 | 3.64% | −7.10% | −5.24% |
9 | 6,187,362 | 6,187,827 | 5.01% | −11.55% | 14.21% |
10 | 6,161,072 | 6,162,085 | 7.00% | 7.96% | 14.86% |
11 | 5,891,823 | 5,892,350 | 5.53% | −8.66% | 7.05% |
12 | 5,412,457 | 5,416,814 | 6.86% | 17.55% | −3.65% |
13 | 3,966,932 | 3,966,488 | 2.43% | 7.43% | 92.17% |
14 | 4,327,500 | 4,324,016 | 2.46% | 1.84% | 24.00% |
15 | 5,480,579 | 5,478,136 | 4.42% | 0.35% | −8.72% |
16 | 5,074,602 | 5,069,520 | 3.18% | −0.55% | 15.54% |
17 | 5,595,537 | 5,598,759 | 5.66% | −2.36% | −5.04% |
18 | 4,689,294 | 4,688,116 | 2.06% | −8.31% | 53.31% |
19 | 5,737,617 | 5,735,085 | 3.22% | 3.70% | 3.46% |
20 | 4,515,367 | 4,510,117 | 2.25% | −5.78% | 0.27% |
21 | 4,717,987 | 4,722,724 | 3.59% | −1.19% | −21.30% |
22 | 5,947,914 | 5,945,721 | 1.86% | 12.80% | −14.02% |
23 | 5,882,857 | 5,886,064 | 4.81% | 5.37% | 42.67% |
24 | 6,184,598 | 6,182,748 | 3.69% | 1.82% | 14.03% |
25 | 6,228,829 | 6,225,887 | 5.59% | 0.85% | 18.87% |
26 | 6,407,521 | 6,408,274 | 4.99% | 17.25% | 33.71% |
27 | 4,696,162 | 4,697,371 | 4.35% | 0.29% | −1.97% |
28 | 4,792,173 | 4,797,007 | 5.02% | 25.00% | 83.71% |
Total | 153,218,772 | 153,233,335 | 4.13% | 2.20% | 13.30% |
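The following is a minimal sketch of how the exposure balance between the two groups in the table above could be checked. It assumes that a relative difference is measured as group B against group A, expressed as a percentage of A; the table does not state this definition, so it should be read as an assumption. The function name `relative_diff_pct` is hypothetical, and only the view counts of days 1 to 3 and the Total row are taken from the table.

```python
def relative_diff_pct(a: float, b: float) -> float:
    """Relative difference of b versus a, in percent (assumed definition)."""
    if a == 0:
        raise ValueError("baseline value must be non-zero")
    return (b - a) / a * 100.0


# (View A, View B) pairs copied from days 1-3 and the Total row of the table above.
views = {
    "1": (6_120_323, 6_127_115),
    "2": (4_703_805, 4_712_818),
    "3": (6_214_008, 6_206_601),
    "Total": (153_218_772, 153_233_335),
}

# Relative gap in exposure between the two groups for each row.
view_imbalance = {day: relative_diff_pct(a, b) for day, (a, b) in views.items()}

for day, pct in view_imbalance.items():
    print(f"Day {day}: view imbalance {pct:+.3f}%")
```

For these rows the gap in views between the two groups stays well under 1%, so the daily differences in CTR, purchases, and revenue are compared on nearly equal exposure. Under the same assumption, the function could be applied to per-group CTR or revenue figures, which the table reports only as differences.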
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Kwon, Y.; Bak, G.; Bae, Y. A Hybrid Recommendation System Based on Similar-Price Content in a Large-Scale E-Commerce Environment. Appl. Sci. 2025, 15, 10758. https://doi.org/10.3390/app151910758