Population-Level Analysis of Personalized Food Recommendation Using Reinforcement Learning
Abstract
1. Introduction
- How do the characteristics of different populations influence the performance of personalized meal recommender systems?
- A fuzzy logic-based approach for generating user profiles that account for diverse populations and varied preferences.
- A methodology for comparing and evaluating reinforcement learning-based recommender systems across different populations.
2. Materials and Methods
2.1. Materials
2.2. Methods
Fuzzy Logic (FL)
2.3. Recommendation Systems (RSs)
2.3.1. Problem Formulation
- State Space: A state is defined as the user's historical selections for each course over the past w days, up to day m. Each day is a vector of size three (first course, second course, and dessert) containing the selected plate type (tag) for each course. For instance, a menu consisting of pasta, meat, and fruit would be encoded as the tag IDs of those three plate types, in that order.
- Action Space: The action is a pair consisting of a course index (first, second, or dessert) and a candidate tag t (e.g., pasta) to be recommended. The action space at time m comprises all plate types available for selection. Not all tags are available for every course; for dessert, for example, only fruit and dairy are available. The action space has size 15, and the proposed algorithms consider all tags.
- Transition Probability: The transition probability defines the likelihood of moving from one state to the next after taking an action. Its definition depends on the learning framework:
  - In multi-armed bandit (MAB) settings, the transitions are independent across time steps, as each recommendation does not influence future states.
  - In Q-learning and State–Action–Reward–State–Action (SARSA) settings, the transitions account for sequential dependencies, where the user’s historical choices influence future selections.
- Reward: Once the user selects a plate for each course, the chosen items are compared with the recommended tags. If the recommendation matches the user’s selection, the system receives a reward. This mechanism allows learning without explicit user feedback.
- Discount Factor: The discount factor controls the importance of future rewards when selecting the current action. A value of zero leads to a greedy selection based only on immediate rewards, whereas a positive value incorporates future outcomes into the decision-making process.
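
The formulation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation. Only the three-course structure, the tag names pasta, meat, fruit, and dairy, the 5/8/2 split of tags per course (15 actions in total), and the match-based reward come from the article. The remaining tag names, the window length `W`, the 0/1 reward values, and the helper names `build_state` and `reward` are hypothetical.

```python
from collections import deque
from typing import Deque, List, Tuple

# Hypothetical tag catalogue: only "pasta", "meat", "fruit" and "dairy" are
# named in the text; all other names are placeholders. The counts per course
# (5, 8, 2) follow the plate-type table and give 15 (course, tag) actions.
TAGS_BY_COURSE = {
    0: ["pasta"] + [f"first_tag_{i}" for i in range(1, 5)],   # first course
    1: ["meat"] + [f"second_tag_{i}" for i in range(1, 8)],   # second course
    2: ["fruit", "dairy"],                                    # dessert
}

Day = Tuple[str, str, str]   # selected tag for (first, second, dessert)
Action = Tuple[int, str]     # (course index, candidate tag)

ACTIONS: List[Action] = [
    (course, tag) for course, tags in TAGS_BY_COURSE.items() for tag in tags
]

W = 3  # assumed history window in days


def build_state(history: Deque[Day]) -> Tuple[Day, ...]:
    """State: the user's selected tags per course over the last W days."""
    return tuple(history)


def reward(recommended: Action, selected: Day) -> int:
    """Assumed reward: 1 if the recommended tag matches the user's choice
    for that course, 0 otherwise (the exact value is not given in the text)."""
    course, tag = recommended
    return int(selected[course] == tag)


# Usage: two observed days, then one recommendation is scored.
history: Deque[Day] = deque(maxlen=W)
history.append(("pasta", "meat", "fruit"))
history.append(("first_tag_1", "meat", "dairy"))
state = build_state(history)
hit = reward((0, "pasta"), ("pasta", "meat", "fruit"))
```

Using a bounded `deque` keeps the state at a fixed maximum length, so older days drop out automatically as new selections arrive.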
2.3.2. Algorithms
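
For reference, the sketch below shows the standard textbook update rules for two of the learners named in the problem formulation: an incremental sample-average update for a multi-armed bandit, and the tabular SARSA update. It is not taken from the article. The function names, the learning rate `alpha`, and the discount factor `gamma` are hypothetical choices made only for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Tuple


def bandit_update(values: DefaultDict[Hashable, float],
                  counts: DefaultDict[Hashable, int],
                  action: Hashable, reward: float) -> None:
    """Sample-average bandit update: no state, each arm keeps a running mean."""
    counts[action] += 1
    values[action] += (reward - values[action]) / counts[action]


def sarsa_update(q: DefaultDict[Tuple[Hashable, Hashable], float],
                 state: Hashable, action: Hashable, reward: float,
                 next_state: Hashable, next_action: Hashable,
                 alpha: float = 0.1, gamma: float = 0.9) -> None:
    """Tabular SARSA: bootstrap from the action actually taken in next_state."""
    td_target = reward + gamma * q[(next_state, next_action)]
    q[(state, action)] += alpha * (td_target - q[(state, action)])


# Usage with toy values (alpha and gamma are illustrative, not the paper's).
values: DefaultDict[Hashable, float] = defaultdict(float)
counts: DefaultDict[Hashable, int] = defaultdict(int)
for r in (1, 0, 1):
    bandit_update(values, counts, "pasta", r)

q: DefaultDict[Tuple[Hashable, Hashable], float] = defaultdict(float)
q[("s1", "b")] = 1.0
sarsa_update(q, "s0", "a", 1.0, "s1", "b", alpha=0.5, gamma=0.5)
```

The contrast mirrors the transition-probability bullets: the bandit update ignores the state entirely, while SARSA propagates value from the next state-action pair through the discount factor.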
2.3.3. Evaluation
- Accumulated reward refers to the mean of the sum of rewards obtained by each algorithm in the evaluated population. If the recommendation matches the user selection, a reward is provided to the algorithm. This metric indicates how often the algorithm succeeded in recommending the correct item compared with the actual selections.
- Improvement shows the relative increase in performance with respect to a random recommender. To correctly compute this metric, a random baseline is required. For this purpose, the algorithm randomly selected one dish from the three options for the first and second courses, and one from the two options offered for dessert, without considering any probabilities.
- Efficiency measures how many of the user's selections the algorithm predicts correctly per day. In the context of our study, this metric indicates how many dishes, on average, the recommender correctly guessed out of the three daily choices (first dish, second dish, and dessert).
- Supervised classification metrics: these include F1-score, recall, and precision [25], which are commonly used in supervised classification. These metrics assess the effectiveness of recommendations based on user-selected items. A value of 1 indicates perfect recommendation performance (i.e., the system recommends exactly the same items chosen by the user), while 0 represents the lowest performance.
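
The metrics listed above can be sketched as follows. This is a hedged illustration rather than the authors' evaluation code. The function names are hypothetical, the improvement formula assumes the usual relative-difference definition with respect to the random baseline, and precision, recall, and F1 are computed for a single tag because the article does not state how the scores are averaged across tags.

```python
from typing import Sequence, Tuple


def accumulated_reward(rewards_per_user: Sequence[Sequence[int]]) -> float:
    """Mean over users of each user's summed rewards."""
    totals = [sum(user_rewards) for user_rewards in rewards_per_user]
    return sum(totals) / len(totals)


def improvement(algorithm_reward: float, random_reward: float) -> float:
    """Assumed definition: relative increase over the random baseline, in %."""
    return 100.0 * (algorithm_reward - random_reward) / random_reward


def efficiency(total_reward: float, n_days: int) -> float:
    """Average number of correct recommendations per day (between 0 and 3)."""
    return total_reward / n_days


def precision_recall_f1(recommended: Sequence[str], selected: Sequence[str],
                        label: str) -> Tuple[float, float, float]:
    """Per-tag precision, recall and F1 of recommendations vs. selections."""
    pairs = list(zip(recommended, selected))
    tp = sum(r == label and s == label for r, s in pairs)
    fp = sum(r == label and s != label for r, s in pairs)
    fn = sum(r != label and s == label for r, s in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Usage with toy data (values are illustrative only).
mean_reward = accumulated_reward([[1, 0, 1], [1, 1, 1]])
gain = improvement(150.0, 100.0)
per_day = efficiency(6.0, 3)
scores = precision_recall_f1(
    ["pasta", "rice", "pasta", "rice"],
    ["pasta", "pasta", "rice", "rice"],
    label="pasta",
)
```

Keeping each metric as a separate pure function makes it straightforward to apply the same evaluation to every algorithm and population.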
Algorithm 1: Menu recommendation process for all algorithms.
2.3.4. Optimal Recommendation System Selection
3. Results
3.1. Characterization of Different Target Populations
3.2. Recommendation Systems (RSs)
4. Discussion
5. Conclusions
Supplementary Materials
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| MAB | Multi-Armed Bandit |
| SARSA | State–Action–Reward–State–Action |
| DQN | Deep Q-Network |
| RL | Reinforcement Learning |
| FL | Fuzzy Logic |
| RS | Recommendation System |
| MDP | Markov Decision Process |
| DNN | Deep Neural Network |
| LSTM | Long Short-Term Memory |
References
- Research and Markets. Europe Contract Catering Market—Focused Insights 2023–2028. 2023. Available online: https://www.researchandmarkets.com/report/europe-contract-catering-market?utm_source=BW&utm_medium=PressRelease&utm_code=gnqg5x&utm_campaign=1901594+-+Europe+Contract+Catering+Market+Insights+Report+2023-2028%3a+Collaboration+Sparks+Growth%2c+Fresh+Food+Subscriptions+%26+Digitalization+Gaining+Momentum&utm_exec=chdo54prd (accessed on 13 December 2023).
- Singh, A.; Prasad, S.; Singh, R.; Younis, K.; Yousuf, O. Revolutionizing the supply chain: Cutting-edge strategies and technologies for food waste reduction. Bioresour. Technol. Rep. 2025, 29, 102047.
- CCL Hospitality Group. Key Dining Challenges in Senior Living & Implications for the Future. 2022. Available online: https://www.ccl-hg.com/perspectives/key-dining-challenges-in-senior-living-and-implications-for-the-future/ (accessed on 13 December 2023).
- Benton, G.; Fazelnia, G.; Wang, A.; Carterette, B. Trajectory based podcast recommendation. arXiv 2020, arXiv:2009.03859.
- Fuad, A.; Bayoumi, S.; Al-Yahya, H. A Recommender System for Mobile Applications of Google Play Store. Int. J. Adv. Comput. Sci. Appl. 2020, 11, 42–50.
- Zhu, Q.; Zhou, X.; Song, Z.; Tan, J.; Guo, L. DAN: Deep Attention Neural Network for News Recommendation. Proc. AAAI Conf. Artif. Intell. 2019, 33, 5973–5980.
- Keshav, A.; Viswanathan, S.; Dinesh, R. Smart Dine-in: A Personalized Food Recommendation System. In Proceedings of the 2023 Intelligent Computing and Control for Engineering and Business Systems (ICCEBS), Chennai, India, 14–15 December 2023; pp. 1–6.
- Basco, A.L.; Licup, P.M.; Longno, D.A.; Martinez, M.A.; Yabut, A.L.; Zamin, N. Tinira Ni Benny: A Recipe Recommender System to Minimize Food Waste. In Proceedings of the 2024 5th International Conference on Artificial Intelligence and Data Sciences (AiDAS), Bangkok, Thailand, 3–4 September 2024; pp. 262–267.
- Scherhaufer, S.; Moates, G.; Hartikainen, H.; Waldron, K.; Obersteiner, G. Environmental impacts of food waste in Europe. Waste Manag. 2018, 77, 98–113.
- Reisch, L.A.; Sunstein, C.R.; Andor, M.A.; Doebbe, F.C.; Meier, J.; Haddaway, N.R. Mitigating climate change via food consumption and food waste: A systematic map of behavioral interventions. J. Clean. Prod. 2021, 279, 123717.
- Gupta, M.; Thakkar, A.; Aashish; Gupta, V.; Rathore, D.P.S. Movie Recommender System Using Collaborative Filtering. In Proceedings of the 2020 International Conference on Electronics and Sustainable Communication Systems (ICESC), Coimbatore, India, 2–4 July 2020; pp. 415–420.
- Ting, J.; Ramaswamy, S.I. Yelp Recommendation System. 2013. Available online: https://cs229.stanford.edu/proj2014/Jason%20Ting,%20Swaroop%20Indra%20Ramaswamy,%20Yelp%20Recommendation%20System.pdf (accessed on 30 October 2025).
- Silva, N.; Werneck, H.; Silva, T.; Pereira, A.C.; Rocha, L. Multi-armed bandits in recommendation systems: A survey of the state-of-the-art and future directions. Expert Syst. Appl. 2022, 197, 116669.
- Balakrishnan, A.; Bouneffouf, D.; Mattei, N.; Rossi, F. Using multi-armed bandits to learn ethical priorities for online AI systems. IBM J. Res. Dev. 2019, 63, 1:1–1:13.
- Steck, H.; Baltrunas, L.; Elahi, E.; Liang, D.; Raimond, Y.; Basilico, J. Deep Learning for Recommender Systems: A Netflix Case Study. AI Mag. 2021, 42, 7–18.
- R, K.; Kumar, P.; Bhasker, B. DNNRec: A novel deep learning based hybrid recommender system. Expert Syst. Appl. 2020, 144, 113054.
- Zhao, X.; Gu, C.; Zhang, H.; Yang, X.; Liu, X.; Tang, J.; Liu, H. DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems. Proc. AAAI Conf. Artif. Intell. 2021, 35, 750–758.
- Salloum, S.; Rajamanthri, D. Implementation and Evaluation of Movie Recommender Systems Using Collaborative Filtering. J. Adv. Inf. Technol. 2021, 12, 189–196.
- Srifi, M.; Oussous, A.; Ait Lahcen, A.; Mouline, S. Recommender Systems Based on Collaborative Filtering Using Review Texts – A Survey. Information 2020, 11, 317.
- Aljunid, M.F.; Manjaiah, D.H.; Hooshmand, M.K.; Ali, W.A.; Shetty, A.M.; Alzoubah, S.Q. A collaborative filtering recommender systems: Survey. Neurocomputing 2025, 617, 128718.
- Wang, F.; Zhu, H.; Srivastava, G.; Li, S.; Khosravi, M.R.; Qi, L. Robust Collaborative Filtering Recommendation With User-Item-Trust Records. IEEE Trans. Comput. Soc. Syst. 2022, 9, 986–996.
- Gazdar, A.; Hidri, L. A new similarity measure for collaborative filtering based recommender systems. Knowl.-Based Syst. 2020, 188, 105058.
- Bobadilla, J.; Ortega, F.; Hernando, A.; Bernal, J. A collaborative filtering approach to mitigate the new user cold start problem. Knowl.-Based Syst. 2012, 26, 225–238.
- Li, Y. A Book Recommendation Algorithm Based on Improved Similarity Calculation. In Proceedings of the 2018 3rd International Conference on Mechanical, Control and Computer Engineering (ICMCCE), Huhhot, China, 14–16 September 2018; pp. 615–618.
- Hossin, M.; Sulaiman, M.N. A review on evaluation metrics for data classification evaluations. Int. J. Data Min. Knowl. Manag. Process 2015, 5, 1.
- Dutt, V. Explaining human behavior in dynamic tasks through reinforcement learning. J. Adv. Inf. Technol. 2011, 2, 177–188.
- O’Doherty, J.P.; Lee, S.W.; McNamee, D. The structure of reinforcement-learning mechanisms in the human brain. Curr. Opin. Behav. Sci. 2015, 1, 94–100.
- Hu, X.; Kang, S.; Ren, L.; Zhu, S. Interactive preference analysis: A reinforcement learning framework. Eur. J. Oper. Res. 2024, 319, 983–998.
- Risso, D.S.; Giuliani, C.; Antinucci, M.; Morini, G.; Garagnani, P.; Tofanelli, S.; Luiselli, D. A bio-cultural approach to the study of food choice: The contribution of taste genetics, population and culture. Appetite 2017, 114, 240–247.
- Enriquez, J.P.; Archila-Godinez, J.C. Social and cultural influences on food choices: A review. Crit. Rev. Food Sci. Nutr. 2022, 62, 3698–3704.
- Monteleone, E.; Spinelli, S.; Dinnella, C.; Endrizzi, I.; Laureati, M.; Pagliarini, E.; Sinesio, F.; Gasperi, F.; Torri, L.; Aprea, E.; et al. Exploring influences on food choice in a large population sample: The Italian Taste project. Food Qual. Prefer. 2017, 59, 123–140.
- Cao, J.; Wang, K.; Shi, Y.; Pan, Y.; Lyu, M.; Ji, Y.; Zhang, Y. Effects of personal and interpersonal factors on changes of food choices and physical activity among college students. PLoS ONE 2023, 18, e0288489.
- Rolls, E.T.; Kellerhals, M.B.; Nichols, T.E. Age differences in the brain mechanisms of good taste. NeuroImage 2015, 113, 298–309.
- Barragán, R.; Coltell, O.; Portolés, O.; Asensio, E.M.; Sorlí, J.V.; Ortega-Azorín, C.; González, J.I.; Sáiz, C.; Fernández-Carrión, R.; Ordovas, J.M.; et al. Bitter, Sweet, Salty, Sour and Umami Taste Perception Decreases with Age: Sex-Specific Analysis, Modulation by Genetic Variants and Taste-Preference Associations in 18 to 80 Year-Old Subjects. Nutrients 2018, 10, 1539.
- Shim, J.S.; Shim, S.Y.; Cha, H.J.; Kim, J.; Kim, H.C. Socioeconomic characteristics and trends in the consumption of ultra-processed foods in Korea from 2010 to 2018. Nutrients 2021, 13, 1120.
- Azmi, A.K.; Abdullah, N.; Emran, N.A. A recommender system model for improving elderly well-being: A systematic literature review. Int. J. Adv. Soft Comput. Appl. 2019, 11, 87.
- Cena, F.; Console, L.; Likavec, S.; Micheli, M.; Vernero, F. How Personality Traits can be Used to Shape Itinerary Factors in Recommender Systems for Young Travellers. IEEE Access 2023, 11, 61968–61985.
- Matos, P.; Rocha, J.; Gonçalves, R.; Almeida, A.; Santos, F.; Abreu, D.; Martins, C. Smart Coach – A Recommendation System for Young Football Athletes. In Ambient Intelligence – Software and Applications –, 10th International Symposium on Ambient Intelligence; Novais, P., Lloret, J., Chamoso, P., Carneiro, D., Navarro, E., Omatu, S., Eds.; Springer: Cham, Switzerland, 2019; pp. 171–178.
- Ekstrand, M.D.; Tian, M.; Azpiazu, I.M.; Ekstrand, J.D.; Anuyah, O.; McNeill, D.; Pera, M.S. All the cool kids, how do they fit in?: Popularity and demographic biases in recommender evaluation and effectiveness. In Proceedings of the Conference on Fairness, Accountability and Transparency, PMLR, New York, NY, USA, 23–24 February 2018; pp. 172–186.
- Beel, J.; Langer, S.; Nürnberger, A.; Genzmehr, M. The Impact of Demographics (Age and Gender) and Other User-Characteristics on Evaluating Recommender Systems. In Research and Advanced Technology for Digital Libraries; Aalberg, T., Papatheodorou, C., Dobreva, M., Tsakonas, G., Farrugia, C.J., Eds.; Springer: Berlin/Heidelberg, Germany, 2013; pp. 396–400.
- Raza, S.; Ding, C. News recommender system: A review of recent progress, challenges, and opportunities. Artif. Intell. Rev. 2021, 55, 749–800.
- Bundasak, S.; Yoksuriyan, P.; Kuntawee, P.; Kotama, R. Food recommendation system for the elderly. Int. J. Sci. 2021, 18, 152–167.
- Naik, P. Intelligent Food Recommendation System Using Machine Learning. Int. J. Innov. Sci. Res. Technol. 2020, 5, 616–619.
- Aramayo, N.; Schiappacasse, M.; Goic, M. A Multiarmed Bandit Approach for House Ads Recommendations. Mark. Sci. 2023, 42, 271–292.
- Wu, F.; Qiao, Y.; Chen, J.H.; Wu, C.; Qi, T.; Lian, J.; Liu, D.; Xie, X.; Gao, J.; Wu, W.; et al. MIND: A Large-scale Dataset for News Recommendation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 3597–3606.
- Harper, F.M.; Konstan, J.A. The MovieLens Datasets: History and Context. ACM Trans. Interact. Intell. Syst. 2015, 5, 19.
- Trattner, C.; Parra, D.; Elsweiler, D. Monitoring obesity prevalence in the United States through bookmarking activities in online food portals. PLoS ONE 2017, 12, e0179144.
- Li, S. Food.com Recipes and Interactions. 2019. Available online: https://www.kaggle.com/datasets/shuyangli94/food-com-recipes-and-user-interactions (accessed on 30 October 2025).
- Gulla, J.A.; Zhang, L.; Liu, P.; Özgöbek, O.; Su, X. The Adressa dataset for news recommendation. In Proceedings of the International Conference on Web Intelligence, WI ’17, Leipzig, Germany, 23–26 August 2017; pp. 1042–1048.
- Lahtinen, S. Identification of fuzzy controller for use with a falling-film evaporator. Food Control 2001, 12, 175–180.
- Stoian, V.; Ivanescu, M. Robot Control by Fuzzy Logic. In Frontiers in Robotics, Automation and Control; IntechOpen: London, UK, 2008; pp. 111–132.
- Mehra, A.; Gupta, O.; Avikal, S. Finding the combined effect of academic and non-academic performance on management students’ placement: A fuzzy logic approach. Int. J. Manag. Educ. 2023, 21, 100837.
- Leal Ramírez, C.; Castillo, O. A Hybrid Model Based on a Cellular Automata and Fuzzy Logic to Simulate the Population Dynamics. In Soft Computing for Hybrid Intelligent Systems; Springer: Berlin/Heidelberg, Germany, 2008; pp. 189–203.
- Zhao, K.; Liu, S.; Cai, Q.; Zhao, X.; Liu, Z.; Zheng, D.; Jiang, P.; Gai, K. KuaiSim: A comprehensive simulator for recommender systems. Adv. Neural Inf. Process. Syst. 2023, 36, 44880–44897.
- Zhang, S.; Balog, K. Evaluating conversational recommender systems via user simulation. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020; pp. 1512–1520.
- Dernoncourt, F. Introduction to Fuzzy Logic; John Wiley & Sons: Hoboken, NJ, USA, 2013; p. 21.
- Vegconomist. El 44% de la Población Vegana en España Tiene Entre 25 y 34 años—Vegconomist-la Revista de los Negocios Veganos-en Español. Available online: https://vegconomist.es/estudios-y-numeros/poblacion-vegana-espana/ (accessed on 7 May 2024).
- Sandri, E.; Modesto i Alapont, V.; Cantín Larumbe, E.; Cerdá Olmedo, G. Analysis of the Influence of Socio-Demographic Variables and Some Nutrition and Lifestyle Habits on Beverage Consumption in the Spanish Population. Foods 2023, 12, 4310.
- Han, S.; Wu, L.; Wang, W.; Li, N.; Wu, X. Trends in dietary nutrients by demographic characteristics and BMI among US adults, 2003–2016. Nutrients 2019, 11, 2617.
- Bilgin Fıçıcılar, B. Comparative Analysis of Fish Consumption Habits in Coastal and Inland Districts of Samsun Province. Mar. Sci. Technol. Bull. 2024, 13, 251–261.
- Turrini, A.; Saba, A.; Perrone, D.; Cialfa, E.; D’Amicis, A. Food consumption patterns in Italy: The INN-CA Study 1994–1996. Eur. J. Clin. Nutr. 2001, 55, 571–588.
- Afsar, M.M.; Crump, T.; Far, B. Reinforcement Learning Based Recommender Systems: A Survey. ACM Comput. Surv. 2022, 55, 145.
- Slivkins, A. Introduction to multi-armed bandits. Found. Trends® Mach. Learn. 2019, 12, 1–286.
- Yan, C.; Xian, J.; Wan, Y.; Wang, P. Modeling implicit feedback based on bandit learning for recommendation. Neurocomputing 2021, 447, 244–256.
- Ravish, R.; Rangaswamy, S.; V, A.; U, V. User preference-based intelligent road route recommendation using SARSA and dynamic programming. J. Control Decis. 2023, 10, 443–453.
- Farebrother, J.; Machado, M.C.; Bowling, M. Generalization and regularization in DQN. arXiv 2020, arXiv:1810.00123.
- Lin, Y.; Liu, Y.; Lin, F.; Zou, L.; Wu, P.; Zeng, W.; Chen, H.; Miao, C. A Survey on Reinforcement Learning for Recommender Systems. IEEE Trans. Neural Networks Learn. Syst. 2023, 35, 13164–13184.
- Intayoad, W.; Kamyod, C.; Temdee, P. Reinforcement Learning for Online Learning Recommendation System. In Proceedings of the 2018 Global Wireless Summit (GWS), Chiang Rai, Thailand, 25–28 November 2018; pp. 167–170.
- Liu, L.; Guan, Y.; Wang, Z.; Shen, R.; Zheng, G.; Fu, X.; Yu, X.; Jiang, J. An interactive food recommendation system using reinforcement learning. Expert Syst. Appl. 2024, 254, 124313.
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
- Mezard, M.; Nadal, J.P. Learning in feedforward layered networks: The tiling algorithm. J. Phys. A Math. Gen. 1989, 22, 2191.
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2017, arXiv:1412.6980.
- Unión Vegetariana. El Veganismo en España en Cifras—Actualizado en 2021. 2021. Available online: https://unionvegetariana.org/el-veganismo-en-espana-en-cifras-actualizado-en-2021/ (accessed on 18 December 2023).
- Roy, D.; Dutta, M. A systematic review and research perspective on recommender systems. J. Big Data 2022, 9, 59.
- Bondevik, J.N.; Bennin, K.E.; Babur, O.; Ersch, C. A systematic review on food recommender systems. Expert Syst. Appl. 2024, 238, 122166.
- Rostami, M.; Farrahi, V.; Ahmadian, S.; Mohammad Jafar Jalali, S.; Oussalah, M. A novel healthy and time-aware food recommender system using attributed community detection. Expert Syst. Appl. 2023, 221, 119719.







| Plate Type | Count | #Tags | #Innovative | Example |
|---|---|---|---|---|
| Firsts | 60 | 5 | 10 | Rice with tomato, pumpkin risotto |
| Seconds | 60 | 8 | 23 | Zucchini omelet, sweet and sour pork |
| Desserts | 25 | 2 | 4 | Cheesecake, mango |
| Parameter | MAB | SARSA | DQN |
|---|---|---|---|
| Random recommendation probability | 15% | 15% | 15%, decreasing by 1% per training run (min 1%) |
| Exploration–exploitation balance | ε-greedy | ε-greedy | ε-greedy |
| Previous selections used (d) | – | – | 3 days |
| LSTM neurons | – | – | 32 |
| DNN structure | – | – | 1 hidden layer (16 neurons), 1 output layer |
| Learning rate | – | – | 0.0005 |
| Target network update | – | – | Every 3 training runs |
| Training frequency | Real time | Real time | Every 20 recommendations |
| Training duration | 20 days | 20 days | 20 days |
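
The exploration settings in the table above can be illustrated with a short sketch of ε-greedy action selection and a linear decay schedule. It is not the authors' code. The function names are hypothetical, and the decay is assumed to subtract one percentage point per training run down to the 1% floor, which is one possible reading of the table entry.

```python
import random
from typing import Dict, Hashable, List


def epsilon_greedy(q_values: Dict[Hashable, float], actions: List[Hashable],
                   epsilon: float, rng: random.Random) -> Hashable:
    """With probability epsilon pick a random action, else the best-valued one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q_values.get(a, 0.0))


def decay_epsilon(epsilon: float, step: float = 0.01, floor: float = 0.01) -> float:
    """Assumed schedule: subtract one percentage point, never below the floor."""
    return max(floor, epsilon - step)


# Usage: start at 15% exploration and decay over 20 training runs.
rng = random.Random(0)
epsilon = 0.15
for _ in range(20):
    epsilon = decay_epsilon(epsilon)

q_values = {"fruit": 0.9, "dairy": 0.1}
greedy_choice = epsilon_greedy(q_values, ["fruit", "dairy"], 0.0, rng)
random_choice = epsilon_greedy(q_values, ["fruit", "dairy"], 1.0, rng)
```

A fixed 15% rate, as listed for MAB and SARSA, corresponds to calling `epsilon_greedy` with a constant `epsilon` and never calling `decay_epsilon`.
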
| Group | % Female | Age Range | % Omnivorous | % Flexitarians | % Vegetarians | % Vegans | % Innovative Users |
|---|---|---|---|---|---|---|---|
| Spanish | 52% | 1* | 80% | 16% | 2% | 2% | 45% |
| Foodies | 44% | Any age | 72% | 20% | 6% | 2% | 88% |
| Veggies | 70% | Any age | 0% | 50% | 30% | 20% | 88% |
| Senior | 54% | >70 | 98% | 2% | 0% | 0% | 10% |
| Recommendation System | Group | Accumulated Reward | Std. Dev. | Improvement | Efficiency | F1 | Recall | Precision |
|---|---|---|---|---|---|---|---|---|
| DQN | Spanish | 655.74 | 118.05 | 63.46% | 1.90 | 0.6448 | 0.6371 | 0.6664 |
| DQN | Foodie | 698.08 | 120.22 | 71.60% | 2.02 | 0.6754 | 0.6722 | 0.6835 |
| DQN | Veggie | 680.72 | 83.60 | 65.02% | 1.97 | 0.6673 | 0.6576 | 0.6894 |
| DQN | Senior | 436.91 | 110.65 | 8.89% | 1.27 | 0.4198 | 0.4216 | 0.4315 |
| MAB | Spanish | 617.44 | 101.37 | 53.91% | 1.79 | 0.6058 | 0.5998 | 0.6220 |
| MAB | Foodie | 654.85 | 102.47 | 66.97% | 1.90 | 0.6339 | 0.6335 | 0.6370 |
| MAB | Veggie | 640.50 | 72.26 | 55.26% | 1.86 | 0.6273 | 0.6216 | 0.6413 |
| MAB | Senior | 431.23 | 95.01 | 7.48% | 1.25 | 0.4160 | 0.4180 | 0.4242 |
| SARSA | Spanish | 418.86 | 36.99 | 4.41% | 1.21 | 0.3837 | 0.4088 | 0.4042 |
| SARSA | Foodie | 408.78 | 30.32 | 0.48% | 1.18 | 0.3846 | 0.3966 | 0.4033 |
| SARSA | Veggie | 415.25 | 33.04 | 0.65% | 1.20 | 0.3914 | 0.4043 | 0.4018 |
| SARSA | Senior | 431.69 | 30.24 | 7.59% | 1.25 | 0.4032 | 0.4175 | 0.4094 |
| | Spanish | Foodie | Senior | Veggie |
|---|---|---|---|---|
| Spanish | p = 1 | p < 0.005 | p < 0.005 | p = 0.75 |
| Foodie | p < 0.005 | p = 1 | p < 0.005 | p < 0.005 |
| Senior | p < 0.005 | p < 0.005 | p = 1 | p < 0.005 |
| Veggie | p = 0.75 | p < 0.005 | p < 0.005 | p = 1 |
| | Spanish | Foodie | Senior | Veggie |
|---|---|---|---|---|
| Spanish | p = 1 | p < 0.005 | p < 0.005 | p = 0.87 |
| Foodie | p < 0.005 | p = 1 | p = 0.86 | p < 0.005 |
| Senior | p < 0.005 | p = 0.86 | p = 1 | p < 0.005 |
| Veggie | p = 0.87 | p < 0.005 | p < 0.005 | p = 1 |
| Type of Study/References | Dataset Characteristics and Advantages | Main Limitations |
|---|---|---|
| Studies using public datasets (e.g., Wu et al. [45], Harper and Konstan [46], Trattner et al. [47], Li [48], Gulla et al. [49]) | Contain large-scale real user interactions (clicks, ratings, purchases, etc.). Facilitate benchmarking and reproducibility across different algorithms. Easily accessible and standardized for research. | Users are anonymized and lack demographic, cultural, or contextual information. Prevent cross-population or cultural behavior comparisons. Limit the study of how user diversity influences recommendation performance. |
| Studies using custom or simulated datasets (e.g., Bundasak et al. [42], Naik [43], Aramayo et al. [44]) | Allow control over user and contextual variables. Enable domain-specific studies (e.g., food, health, or elderly care). Adaptable to specific experimental designs. | Limited sample size and often rely on simulated or survey-based data. Reduced representativeness of real-world populations. Results are difficult to generalize to broader contexts. |
| Studies addressing user diversity and cross-population comparison (e.g., Ekstrand et al. [39], Beel et al. [40], Raza and Ding [41]) | Highlight the importance of user diversity and fairness in recommendations. Identify biases and performance differences across user groups. | Lack datasets with explicit demographic or cultural information. No standardized approach for population-level evaluation. Limited understanding of algorithm generalization across populations. |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Tellechea, Y.; Arrojo, M.; Cejudo, A.; Martin, C. Population-Level Analysis of Personalized Food Recommendation Using Reinforcement Learning. Foods 2025, 14, 3770. https://doi.org/10.3390/foods14213770


