1. Introduction
The rapid evolution of educational technology has opened new avenues for delivering complex concepts to younger audiences. One such avenue is the use of serious games that merge entertainment with pedagogical goals. This paper focuses on a serious game designed to address two critical and intertwined topics: nutritional health and environmental sustainability. These issues are particularly relevant for Generation Alpha, who will grow up in a world facing mounting challenges related to food security and climate change.
Generation Alpha (born 2010–2025) was selected as the target audience for ARFood not only because of their strong affinity for digital technology and interactive learning environments, but also because adolescence is a critical period for shaping long-term dietary and sustainability behaviors. Research indicates that early exposure to nutrition education significantly influences future food choices, making this age group particularly receptive to interventions that promote healthy and sustainable consumption habits [1, 2].
Traditional instructional methods often fail to engage younger learners, whereas AR-based serious games enhance motivation, attention, and cognitive engagement by providing interactive real-world simulations [3].
The app discussed in this study leverages Augmented Reality (AR) and Artificial Intelligence (AI) to create an engaging, interactive learning experience. Through AR, users interact with a virtual supermarket and select food items as they would in a real environment. Two AI-driven Non-Player Characters (NPCs), powered by ChatGPT, evaluate the choices made by users. Each NPC provides feedback tailored to one of two core educational themes: nutritional education or environmental sustainability.
AR offers a more immersive and interactive educational experience than traditional non-AR learning methods, particularly in the context of nutrition and sustainability education. Traditional approaches, such as text-based materials or classroom teaching, often fail to engage students in experiential learning, which is essential for developing critical thinking and decision-making skills. AR improves learning by integrating digital elements into real-world scenarios, allowing users to dynamically interact with virtual environments that simulate realistic shopping experiences. This interactivity is in line with research suggesting that active learning environments improve knowledge retention and motivation compared to passive methods of instruction [1].
A key advantage of AR lies in its ability to support situated cognition, allowing students to apply their knowledge in realistic decision-making contexts. Instead of relying on theoretical explanations, AR-based serious games present practical problem-solving situations, reinforcing learning through immediate feedback and adaptive guidance. Studies indicate that AR fosters greater engagement and cognitive processing, particularly among younger people who are accustomed to digital interfaces and interactive content [4, 5]. Compared to static educational formats, AR applications encourage users to explore, experiment, and receive real-time AI-generated responses, making the learning process more personalized and responsive to individual choices [6].
AR and serious games are proving to be transformative in the field of nutrition education. Research highlights their ability to enhance engagement and knowledge retention and to positively influence dietary behavior. For instance, Paramita et al. [7] emphasize that AR can reduce boredom and heighten interest in learning about nutrition. Interactive games and simulations are also effective tools. McMahon and Henderson [8] explore mobile-based pervasive games that use QR codes to promote children’s engagement in dietary learning. Similarly, Barwood et al. [9] report on the efficacy of computer games in promoting healthier food choices among young users. Educational innovations extend to professional training as well. Camacho and Guevara [10] note the benefits of AR in dietetics education, providing realistic and interactive training environments that surpass traditional methods.
Participatory game design is another promising approach. Leong et al. [11] detail the development of video games aimed at improving children’s nutrition knowledge, highlighting the importance of balancing engagement with concerns like screen time. Moreover, AR and Virtual Reality (VR) applications are gaining traction. For example, Pilut et al. [12] show how a VR grocery store tour can boost self-efficacy in purchasing healthy foods, suggesting broader implications for serious games in enhancing nutrition literacy. Overall, these studies collectively underscore the significant potential of AR and serious games in making nutrition education more effective and enjoyable.
Regarding the second educational objective, food sustainability, its intersection with serious games and AR is an innovative area of research that offers promising strategies for promoting sustainable eating habits and education. AR has been shown to effectively engage users by providing interactive experiences that enhance dietary behavior and support sustainable nutrition practices. Serious games leverage interactive and immersive gameplay to instill sustainable nutrition values. The game “You Better Eat to Survive!” exemplifies how virtual reality (VR) can incorporate real food consumption to enhance social interaction and sustainable eating behaviors [13]. Beyond traditional gameplay, AR and VR technologies foster more profound behavioral changes by simulating real-world experiences. A study by Plechatá et al. [14] demonstrated that VR interventions can reduce dietary footprints by enhancing awareness of the environmental impact of food choices. Meanwhile, Fritz et al. [15] show that AR enhances food desirability by enabling users to mentally simulate consumption, promoting healthier and more sustainable purchasing decisions.
ChatGPT is emerging as a promising tool in nutrition education, offering personalized learning experiences and supporting dietary education. For instance, Garcia [16] explores ChatGPT’s potential as a virtual dietitian, highlighting its ability to improve nutrition knowledge through personalized meal planning and educational materials. Similarly, Ray [17] notes the increasing use of AI technologies, including ChatGPT, in academic settings for nutrition and dietetics. The accuracy and effectiveness of ChatGPT in responding to nutrition-related questions have also been tested. Kirk et al. [18] found that ChatGPT provided more scientifically correct and actionable answers than human dietitians. However, Mishra et al. [19] caution against potential harm in complex medical nutrition scenarios, emphasizing the need for responsible use by healthcare professionals. Beyond individual learning, ChatGPT also aids in healthcare education. Sallam [20] highlights its role in personalized learning and critical thinking. Despite its potential, challenges like generating incorrect information remain, as noted by Lo [21]. ChatGPT’s application extends to specific patient groups, such as those with chronic kidney disease, where Acharya et al. [22] report its potential for enhancing nutrition education through accurate and timely responses. While ChatGPT shows significant promise, further research is essential to refine its use and address its limitations in nutrition education [23]. This dual approach of leveraging technology while maintaining professional oversight could redefine the educational landscape of dietetics and nutrition.
While generative AI applications in education are rapidly emerging, current research has primarily focused on their ability to deliver personalized learning experiences and facilitate interactive dialogue. Studies have demonstrated the potential of AI to improve engagement and dynamically adapt content to learner needs. However, critical gaps remain unaddressed in the literature, particularly in the following areas.
Lack of Objective Evaluation Frameworks: existing AI-driven educational tools often lack a robust framework for evaluating their effectiveness against specific, measurable learning objectives. While they may improve user engagement, their alignment with predefined educational goals has seldom been systematically assessed.
Iterative Improvement Based on Learning Outcomes: most AI applications in education do not incorporate iterative processes to refine their content delivery based on feedback or performance relative to educational objectives. This limits their capacity to adapt and improve their ability to achieve targeted learning outcomes.
Integration of AI and AR in Serious Games: although AR and AI have each been used independently in education, their combined potential for immersive and adaptive learning experiences remains largely unexplored. In particular, the integration of generative AI into AR-based serious games to achieve specific educational goals has not been examined in depth.
This study aims to fill these gaps by presenting an innovative framework for evaluating and optimizing AI-driven educational feedback in a serious game.
Integrating AI and AR into serious games presents a significant opportunity to enhance educational experiences. While Augmented Reality (AR) and Artificial Intelligence (AI) have been independently utilized in education, their combined potential for creating immersive, adaptive learning environments remains largely unexplored. Existing research has primarily focused on AR-driven engagement or AI-assisted learning without fully leveraging their synergistic capabilities. Furthermore, the use of generative AI in AR-based serious games to achieve structured educational outcomes has received little attention, with most applications relying on rule-based systems or static AI models that lack adaptability and iterative refinement. This study aims to fill these critical gaps by introducing an innovative framework for evaluating and optimizing AI-driven educational feedback in the context of serious games. ARFood, the focus of this study, is designed to educate Generation Alpha about nutritional health and environmental sustainability through an AI-enhanced AR experience.
This paper intends to explore how AR technology can be used as an educational tool to raise awareness about sustainable and healthy eating habits among teenagers. Rather than aiming for immediate behavioral changes, the primary goal of ARFood is to engage users in a simulated shopping experience that encourages reflection and informed decision-making. Through interactive scenarios and AI-driven feedback, the application helps users understand the nutritional and environmental impacts of their choices, in accordance with experiential learning principles.
Unlike existing AI-based serious games, which often rely on predefined feedback mechanisms or supervised learning models, ARFood incorporates an iterative AI-driven feedback refinement process. By leveraging zero-shot classification with RoBERTa, the system systematically evaluates and enhances NPCs’ responses to ensure alignment with educational objectives. Additionally, the integration of distinct linguistic styles represents a novel approach to engaging Generation Alpha in nutritional and sustainability education. This iterative refinement methodology not only personalizes feedback dynamically but also ensures comprehensive pedagogical coverage through a data-driven optimization process.
Based on the gaps identified above, the research questions of the paper are as follows:
- RQ1. How does the use of AR technology facilitate the creation of realistic virtual shopping carts, promote user interaction, and support educational outcomes related to purchasing behavior?
- RQ2. Can the responses of an AI-based generative non-player character (NPC) designed to communicate in an engaging and accessible language for Generation Alpha be systematically evaluated for alignment with specific educational goals?
- RQ3. Can the evaluation results be used to improve the appropriateness of NPC responses to specific educational objectives?
3. Results
3.1. The Dataset of Virtual Shopping Carts
3.1.1. Spending Behavior
To analyze the shopping behavior of adolescents using the AR-based serious game app, a standardization procedure was applied to data from 83 virtual shopping carts. This procedure normalized the quantity of each product to the percentage of the total products selected in each shopping cart. This step ensured comparability between the different shopping carts, as participants simulated purchases for different time periods, e.g., for daily or weekly needs, and for households with different numbers of members. By standardizing the data, preference and allocation patterns can be identified independently of absolute quantities.
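The normalization step described above can be illustrated with a minimal sketch. The cart contents and the function name `normalize_cart` are hypothetical and serve only to show why percentage shares make a daily and a weekly cart comparable; this is not the authors' code.

```python
def normalize_cart(cart: dict[str, float]) -> dict[str, float]:
    """Express each product quantity as a percentage of all items in the cart."""
    total = sum(cart.values())
    if total == 0:
        raise ValueError("cannot normalize an empty cart")
    return {product: 100 * qty / total for product, qty in cart.items()}


# Hypothetical carts: identical preferences, different shopping horizons.
daily_cart = {"fruit": 2, "pasta": 1, "sweets": 1}
weekly_cart = {"fruit": 10, "pasta": 5, "sweets": 5}

daily_shares = normalize_cart(daily_cart)
weekly_shares = normalize_cart(weekly_cart)
print(daily_shares)
print(weekly_shares)
```

Although the weekly cart holds five times as many items, both carts yield the same shares, which is the property the standardization relies on.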
Fruits and vegetables emerged as important components of shopping carts (Figure 3). Fruit accounted for a median of 11.1% and an average of 12.2% of the total cart, with some participants allocating up to 36.1% of their choices to this category. Vegetables were slightly less emphasized, with a median of 7.4% and an average of 7.3%, although the maximum contribution of 33.3% suggests that some adolescents gave high priority to this food group.
Locally grown and garden produce brought a further level of diversity to the trolleys. Items such as strawberries, courgettes, lettuce, tomatoes, and carrots were frequently included, with averages of 6.5%, 4.0%, 5.4%, 5.8%, and 4.8%, respectively. The highest values for strawberries (35.7%) and tomatoes (38.1%) highlight the occasional strong preference for fresh garden-grown produce. Local specialties, such as caciotta cheese and olive oil, played a minor but distinct role, with median contributions of 1.4% and 1.9%, respectively.
Staple foods, including bread, pasta, and rice, occupied a significant share of the carts, reflecting their centrality in meal planning. Bread had a median contribution of 4.8% and a mean contribution of 5.4%, with some participants dedicating up to 30% of their trolleys to this category. Pasta followed closely, with a median of 6.2% and an average of 6.7%, while rice was slightly less important, with a median of 4.5% and an average of 4.6%. These staples underline the importance of familiar and versatile ingredients in the purchasing decisions of adolescents.
Protein sources showed considerable variability between animal- and plant-based options. Animal proteins, such as chicken, beef, and fish, were generally less emphasized, with median contributions of 2.3%, 3.0%, and 2.9%, respectively. However, the share of fish reached 20% in some carts, indicating an occasional priority for this category. In contrast, plant-based proteins, such as legumes and nuts, showed median contributions of 3.8% and 3.5%, respectively. Legumes were present in quantities up to 23.5%, while nuts reached a high of 33.3%, highlighting the potential appeal of these items in promoting sustainable and healthy eating habits.
Dairy products and sweets played minor roles in the shopping carts. Milk represented a median of 3.6%, with maximum contributions reaching 20%, while cheese showed a median of 2.0% and an average of 2.2%. Sweets, although present, were generally not a priority, with a median contribution of 1.7% and a maximum contribution of 19.4%. This suggests that adolescents may have responded to implicit or explicit encouragement within the application to focus on healthier food options.
3.1.2. Cluster Analysis
To further understand the spending patterns observed in the 83 virtual shopping carts created by adolescents with the AR application, a hierarchical clustering analysis was conducted. A standardized dataset, normalized to express the proportion of each product to the total items in the cart, was used to calculate the distance matrix. The Ward.D2 method was applied to the hierarchical clustering process. The resulting dendrogram (Figure 4) revealed three distinct clusters, comprising 21 observations in Cluster 1, 3 observations in Cluster 2, and 59 observations in Cluster 3.
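As a rough illustration of the clustering step, the sketch below implements Ward's minimum-variance criterion (the criterion that the `ward.D2` option of R's `hclust` names) in plain Python on a toy set of normalized carts. The data and function names are hypothetical, and the naive search is only suitable for small inputs; in practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be used instead.

```python
from itertools import combinations


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def ward_clusters(rows, k):
    """Merge the cheapest pair repeatedly until k clusters of row indices remain."""
    clusters = [[i] for i in range(len(rows))]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: ward_cost(
                [rows[r] for r in clusters[pair[0]]],
                [rows[r] for r in clusters[pair[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j is safe
        del clusters[j]
    return clusters


# Hypothetical percentage shares (fruit, pasta, sweets) for five carts.
toy_shares = [
    [50, 25, 25],
    [48, 27, 25],
    [10, 60, 30],
    [12, 58, 30],
    [30, 30, 40],
]
groups = ward_clusters(toy_shares, k=3)
print(groups)
```

Cutting at three clusters mirrors the choice reported for the dendrogram; the toy data are constructed so that two pairs of similar carts group together and one cart stays on its own.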
Cluster 1, comprising 21 observations, reflects a balanced spending behavior characterized by a moderate emphasis on fruit and vegetables, with median contributions of 11.8% and 6.3%, respectively (Figure 5). Staples, such as bread and pasta, also occupy a prominent place, with median proportions of 8.3% for both categories. However, protein sources, particularly those of animal origin, such as chicken, beef, and fish, are minimally represented. The median contributions of these items were negligible or nil, suggesting a limited focus on these categories. Plant-based proteins, including legumes and nuts, are present but not highly emphasized, with median contributions of 1.5% and 3.8%, respectively. Dairy products, such as milk, had a higher presence, with an average of 6.7%. Locally grown products, such as strawberries, tomatoes, and courgettes, were relatively more important in this cluster, with average contributions of 11.9%, 7.7%, and 5.8%, respectively, underlining a preference for fresh produce. Local specialties, such as olive oil and caciotta cheese, had minimal contributions, with averages of 2.3% and 1.8%, respectively.
Cluster 2, comprising only 3 observations, presents a distinctly different pattern, characterized by a high emphasis on fruit and vegetables (Figure 5). These categories have average contributions of 16.1% and 22.4%, respectively. Dried fruits are another key component, with an average contribution of 20.3%. Staples such as bread, pasta, and rice are largely neglected, with average contributions of 2.8%, 6.1%, and 0%, respectively. Animal protein, sweets, and dairy products were minimally represented, with negligible or zero median values. This cluster is also characterized by the limited inclusion of garden produce and local specialties, which appear in very low proportions or are absent altogether. The spending pattern of Cluster 2 shows a focus on fresh and nutrient-dense categories, potentially reflecting preferences aligned with plant-based or minimalist dietary patterns.
Cluster 3, the largest group with 59 observations, represents a more diverse and balanced spending behavior (Figure 5). Fruits and vegetables remained important components, with average contributions of 12.4% and 6.9%, respectively. Staples, including bread, pasta, and rice, were moderately represented, with average proportions of 4.8%, 6.3%, and 5.4%, respectively. Protein sources in this cluster showed a more even distribution, with average contributions of 3.1% for chicken, 4.1% for beef, and 3.3% for fish. Plant-based proteins, such as legumes and nuts, also play a prominent role, with averages of 4.7% and 3.3%. Sweets, although present, were not a dominant category, with an average of 3.1%. Dairy products, including milk and cheese, made moderate contributions, with averages of 3.4% and 2.7%. Garden produce and local specialties had a stronger presence in this cluster than in Cluster 2. Items such as strawberries, lettuce, and tomatoes contributed averages of 5.1%, 5.4%, and 6.2%, respectively. Local caciotta and olive oil were present in small but non-negligible proportions, with averages of 2.3% and 2.1%.
3.2. Initial Prompt Results
The boxplots in Figure 6 summarize NutriBot’s performance in evaluating virtual shopping carts based on eight educational objectives. The results reveal a clear imbalance in how the NPC addresses these objectives in the initial prompt.
Variety of Foods, Nutritional Education, and Motivation to Healthy Eating showed the highest levels of engagement. These objectives had relatively high median values (0.21, 0.15, and 0.23, respectively) and broader interquartile ranges, indicating consistent and thorough feedback. The high maximum values for these categories also highlight NutriBot’s strong focus on promoting dietary diversity and motivation for healthier choices.
In contrast, objectives like Portion Control, Snack Quality, and Unhealthy Eating received far less attention, with median values close to zero. Portion Control exhibited the lowest coverage, with most evaluations clustering near the minimum. This indicates that the initial prompt did not sufficiently guide NutriBot in addressing these aspects of nutrition education.
Additionally, Healthy Choices and Balanced Diet received moderate attention, with medians of 0.15 and 0.11, respectively. Although these objectives are more consistently covered than the lowest-performing ones, the distribution of scores suggests room for improvement in providing more comprehensive evaluations.
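The per-objective values summarized above come from scoring NPC responses with a zero-shot RoBERTa classifier. The sketch below shows one way such scores could be collected and reduced to medians. It is not the authors' pipeline: the helper names and the `uniform_stub` stand-in are hypothetical, and the checkpoint mentioned in the comment is an assumption, since the text does not name the exact model. The stub follows the output shape of a Hugging Face zero-shot pipeline (parallel `labels` and `scores` lists) so that the example runs offline.

```python
from statistics import median

# The eight nutrition objectives named in the text.
OBJECTIVES = [
    "Healthy Choices", "Balanced Diet", "Variety of Foods",
    "Nutritional Education", "Snack Quality", "Unhealthy Eating",
    "Portion Control", "Motivation to Healthy Eating",
]


def score_feedback(text, classifier, labels=OBJECTIVES):
    """Return {objective: probability} for one NPC response.

    `classifier` is any callable that accepts (text, candidate_labels=...)
    and returns a dict with parallel "labels" and "scores" lists.
    """
    result = classifier(text, candidate_labels=labels)
    return dict(zip(result["labels"], result["scores"]))


def median_by_objective(score_dicts, labels=OBJECTIVES):
    """Median probability per objective across all scored responses."""
    return {lab: median(d[lab] for d in score_dicts) for lab in labels}


# Offline stand-in that spreads probability evenly over the labels.
# Assumption: a real run might instead build the classifier with
#   transformers.pipeline("zero-shot-classification", model="roberta-large-mnli")
def uniform_stub(text, candidate_labels):
    n = len(candidate_labels)
    return {
        "sequence": text,
        "labels": list(candidate_labels),
        "scores": [1 / n] * n,
    }


scores = [score_feedback(t, uniform_stub) for t in ["feedback A", "feedback B"]]
medians = median_by_objective(scores)
print(medians)
```

With single-label scoring the eight probabilities for each response sum to one, which is why 1/8 = 0.125 is a natural reference level for balanced coverage.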
Figure 7 shows the distribution of CyberFlora’s evaluations across the eight ecological objectives under the initial generic prompt, highlighting significant disparities in how the NPC addresses these goals. The red dashed line at 0.125 (1/8) marks the balanced probability threshold, that is, the level each objective would reach if all eight received equal attention.
The results show that CyberFlora prioritizes certain objectives disproportionately. The use of Sustainable Products stands out, with a mean probability of 0.5155 and a median of 0.5475, significantly exceeding the balanced threshold. Similarly, Support for Local Products also shows relatively high coverage, with a mean of 0.2100.
In contrast, several objectives fall well below the balanced threshold. Carbon Footprint, with a mean of 0.0096 and a median of 0.0079, and Ecological Impact, with a mean of 0.0709, are notably underrepresented. Waste Reduction, Biodiversity Support, and Organic Food Preference also displayed low probabilities, indicating a limited focus on these critical aspects of sustainability.
These findings suggest that the initial prompt guides CyberFlora to emphasize specific aspects of sustainability, particularly the use of sustainable and local products, while neglecting other equally important goals, such as reducing carbon footprint and promoting biodiversity. This imbalance underscores the necessity of refining the prompt to ensure more comprehensive coverage across all educational objectives.
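The comparison against the balanced threshold can be made explicit with a small sketch. It uses only the four mean probabilities reported above for CyberFlora; the function `flag_coverage` and the 25% tolerance band around 1/8 are illustrative assumptions, not part of the study's method.

```python
def flag_coverage(means, n_objectives, tolerance=0.25):
    """Label each objective relative to the equal-attention level 1/n_objectives.

    `tolerance` (hypothetical) is the relative band around the threshold
    inside which an objective is treated as balanced.
    """
    threshold = 1 / n_objectives
    low, high = threshold * (1 - tolerance), threshold * (1 + tolerance)
    flags = {}
    for name, value in means.items():
        if value < low:
            flags[name] = "under"
        elif value > high:
            flags[name] = "over"
        else:
            flags[name] = "balanced"
    return flags


# Mean probabilities reported in the text for the initial CyberFlora prompt.
reported_means = {
    "Use of Sustainable Products": 0.5155,
    "Support for Local Products": 0.2100,
    "Ecological Impact": 0.0709,
    "Carbon Footprint": 0.0096,
}
flags = flag_coverage(reported_means, n_objectives=8)
print(flags)
```

Under this band, the two product-related objectives are flagged as over-represented and the other two as under-represented, matching the qualitative reading of the boxplots.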
3.3. Prompt Refinement Results
The graph in Figure 8 presents the distribution of NutriBot’s evaluations across the eight educational objectives after refining the prompt to explicitly incorporate these goals. The refined prompt led to a more balanced coverage of all objectives, as evidenced by the distribution of probabilities aligning closely with the equipartition probability of 0.125 (indicated by the red dashed line).
Compared to the evaluations from the original generic prompt, where some objectives were overemphasized and others were largely neglected, the refined prompt ensured a more even representation. The boxplots show that the median probabilities for all eight objectives, including Healthy Choices, Balanced Diet, Variety of Foods, Nutritional Education, Snack Quality, Unhealthy Eating, Portion Control, and Motivation to Healthy Eating, hover near 0.125. Furthermore, the interquartile ranges (IQRs) are narrow and consistent, indicating low variability and stable performance across different evaluations.
This improvement underscores the success of the refined prompt in guiding NutriBot to provide comprehensive feedback. The probability that NutriBot addresses each educational objective is now more evenly distributed, thereby ensuring that all key aspects of nutritional education are adequately covered. This balanced feedback enhances the app’s effectiveness in promoting a holistic understanding of healthy eating among its users.
The boxplot in Figure 9 illustrates the distribution of CyberFlora’s evaluations across the eight educational objectives after refining the prompt to explicitly incorporate these targets. The refined prompt yielded markedly more balanced feedback than the initial generic prompt. The median values for all objectives, including Ecological Impact, Carbon Footprint, Use of Sustainable Products, Waste Reduction, Support for Local Products, Biodiversity Support, Minimizing Packaging Waste, and Organic Food Preference, were consistently close to the theoretical equipartition probability of 0.125 (1/8), which is marked by a red dashed line in the graph. The variability, as reflected by the interquartile ranges, was relatively low across objectives, suggesting that CyberFlora’s evaluations were evenly distributed. No single objective dominated, and none was substantially underrepresented. For instance, Carbon Footprint, Support for Local Products, and Biodiversity Support showed probabilities similar to those of Waste Reduction and Organic Food Preference, indicating that the NPC addresses each aspect of sustainability education with comparable frequency.
These findings mirror those observed with NutriBot’s refined prompt. The explicit integration of educational objectives leads to a more holistic and equitable evaluation process. This refined design ensures that CyberFlora provides feedback that thoroughly covers all key sustainability goals, thereby enhancing the educational impact of the app.
To assess the improvement in the topic probability distribution for NutriBot and CyberFlora, we computed four key statistical metrics: standard deviation, Shannon entropy, Euclidean distance from a uniform distribution, and coefficient of variation (CV). The standard deviation measures the spread of probabilities across topics, with lower values indicating a more balanced distribution. Shannon entropy quantifies the randomness in the probability distribution and reaches its maximum when all probabilities are equal. The Euclidean distance from a perfectly uniform probability distribution evaluates how closely the NPC’s outputs align with the ideal scenario in which all topics are equally represented. Finally, the coefficient of variation (CV) assesses the relative variability in topic representation, with lower values indicating a more equitable distribution.
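The four metrics can be computed from a single topic probability vector as sketched below. Several details are assumptions rather than facts from the text: the sample standard deviation (n − 1 denominator), the natural logarithm for entropy, and the function name `uniformity_metrics`. The natural logarithm is at least consistent with the reported Round 2 entropies, which lie just below ln 8 ≈ 2.0794, the maximum for eight topics. How the authors aggregated across carts before computing the metrics is not stated, so the example vectors are hypothetical.

```python
import math
from statistics import mean, stdev


def uniformity_metrics(probs):
    """Balance metrics for one probability vector over K topics."""
    k = len(probs)
    uniform = 1 / k
    sd = stdev(probs)  # assumption: sample SD with n - 1 denominator
    # Assumption: natural log; terms with p == 0 contribute nothing.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    distance = math.sqrt(sum((p - uniform) ** 2 for p in probs))
    return {
        "std": sd,
        "entropy": entropy,
        "euclidean_distance": distance,
        "cv": sd / mean(probs),
    }


# Hypothetical vectors: one dominated by a single topic, one perfectly even.
skewed = uniformity_metrics([0.65] + [0.05] * 7)
balanced = uniformity_metrics([0.125] * 8)
print(skewed)
print(balanced)
```

A perfectly even vector gives zero standard deviation, zero distance, zero CV, and the maximum entropy, so movement of all four metrics in those directions between rounds indicates more balanced coverage.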
For NutriBot, the results (Table 1) demonstrate a marked improvement in probability uniformity from Round 1 to Round 2. The standard deviation decreased from 0.2698 to 0.0136, indicating a substantial reduction in the disparity between the topic probabilities. The Shannon entropy increased from 1.5450 to 2.0731, suggesting a more evenly distributed representation of nutritional topics. The Euclidean distance from the uniform distribution decreased from 0.8256 to 0.0359, confirming that NutriBot’s outputs in Round 2 were much closer to an ideally balanced distribution. Additionally, the coefficient of variation fell from 1.0825 to 0.1087, reinforcing the finding that topic probabilities are now more homogeneous across different records than before.
Similarly, for CyberFlora (Table 1), a comparable pattern of improvement was observed. The standard deviation was reduced from 0.1776 to 0.0223, indicating a more consistent probability distribution across the ecological topics. The Shannon entropy increased from 1.3955 to 2.0644, reflecting an increase in the topic balance. The Euclidean distance from uniformity decreased sharply from 0.4700 to 0.0589, demonstrating that CyberFlora’s probability distribution is now much closer to the uniform ideal. Moreover, the coefficient of variation declined from 1.4211 to 0.1780, supporting the conclusion that the modifications in Round 2 led to a more equal representation of the ecological aspects.
In summary, both NutriBot and CyberFlora exhibited substantial improvements in the uniformity of their topic probability distributions from Round 1 to Round 2. The observed reductions in the standard deviation, Euclidean distance, and coefficient of variation, alongside the increases in entropy, confirm that the refinements made to these NPCs resulted in a more balanced topic representation, achieving the objective of equalizing the discussion across all relevant themes.
4. Discussion
The integration of Augmented Reality (AR) and Artificial Intelligence (AI) in education has shown great promise in addressing complex topics such as nutritional health and environmental sustainability, as evidenced by the ARFood app. The results of this study support its potential as an effective tool for engaging Generation Alpha in these critical areas.
In response to RQ1, the use of AR technology facilitated a realistic and engaging simulation, promoting user interaction and retention of the educational content.
The results of this study indicate that ARFood successfully facilitates a realistic simulation of purchasing behavior, allowing users to experiment with different food choices in a controlled environment. While it does not directly assess real-world behavioral change, research on game-based learning suggests that interactive and immersive environments significantly enhance knowledge retention and decision-making processes. Future studies could explore how knowledge gained through AR-based simulations translates to real-world consumption patterns by employing longitudinal tracking methods.
Overall, the virtual shopping cart data reveal a balanced allocation of resources among the major food groups, with fruits and vegetables consistently favored. The inclusion of staples such as bread, pasta, and rice reflects a practical approach to meal composition, while the relatively low emphasis on animal protein suggests an intentional preference for plant-based options or the influence of the app’s educational design. The presence of sweets and processed foods was significantly limited, which probably reflects both the app’s objectives and the engagement of adolescents with its framework for healthy eating habits.
This shopping pattern also has significant implications for environmental sustainability. The priority given to plant-based foods such as fruits, vegetables, legumes, and nuts over animal-based proteins is in line with dietary patterns that reduce greenhouse gas emissions and resource use. In addition, the emphasis on locally grown produce and gardens supports sustainable agricultural practices, minimizing transport-related emissions and fostering links with local food systems. By placing little emphasis on high-impact foods, such as beef and processed products, the adolescents in this study demonstrate a potential alignment with environmentally conscious consumption patterns. These findings highlight the role that gamified educational interventions can play in promoting not only healthier but also more sustainable eating behaviors, contributing to broader efforts to promote sustainable food systems. Further research should explore the long-term impact of such interventions on environmental attitudes and purchasing habits in the real world.
Cluster analysis reveals distinct spending patterns among the three groups of adolescents. Cluster 1 shows a balanced but low-protein spending behavior, with a preference for fresh produce. Cluster 2 emphasizes fruit, vegetables, and nuts, reflecting a minimalist, vegetable-oriented approach. Cluster 3, on the other hand, shows a varied and balanced pattern, integrating a wide range of food categories, including plant- and animal-based options. These results underline the potential of the AR serious game application to accommodate and reflect diverse food preferences, providing an educational platform that aligns with both individual choices and broader sustainability goals.
In response to RQ2, mapping NPC feedback onto specific educational goals in nutrition and sustainability showed that the simplified initial prompts (hip-hop style for NutriBot, New Age style for CyberFlora) achieved only partial alignment with the identified educational dimensions. Evaluating NutriBot’s ratings with the RoBERTa classifier confirms that the initial prompt strongly emphasizes fruit and vegetable balance, dietary diversity, and local and home-grown products, while consistently neglecting goals such as Portion Control, variety of protein sources, and whole foods and grains. These shortcomings underscore the need for prompt refinement so that NutriBot assesses all educational objectives more comprehensively, thereby improving its effectiveness as an educational tool.
The uneven distribution of the relevance of CyberFlora’s ratings to the dimensions of food sustainability indicates that while the initial prompt was successful in eliciting responses related to some high-level sustainability goals, it failed to comprehensively address the others.
The identified gaps highlight the need for targeted and detailed prompt designs to ensure a balanced coverage of all educational goals. This finding underscores the importance of refining prompts to achieve a more holistic and effective educational experience.
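The gap analysis described here can be pictured as averaging, per educational objective, the relevance scores that a classifier assigns to a batch of NPC responses and flagging the objectives whose mean falls under a cutoff. The sketch below is a minimal illustration under stated assumptions: the 0.5 threshold, the function name `coverage_gaps`, and the scores are hypothetical, and only the objective names are taken from the text.

```python
from statistics import mean

def coverage_gaps(score_rows, threshold=0.5):
    """Return per-objective mean relevance and the objectives below threshold.

    score_rows: one dict per NPC response, mapping objective name to a
    relevance score in [0, 1].
    """
    objectives = score_rows[0].keys()
    means = {obj: mean(row[obj] for row in score_rows) for obj in objectives}
    gaps = sorted(obj for obj, m in means.items() if m < threshold)
    return means, gaps

# Hypothetical scores for three NutriBot responses (not the study's data).
rows = [
    {"fruit and vegetable balance": 0.9, "portion control": 0.1, "whole foods and grains": 0.2},
    {"fruit and vegetable balance": 0.8, "portion control": 0.2, "whole foods and grains": 0.3},
    {"fruit and vegetable balance": 0.7, "portion control": 0.3, "whole foods and grains": 0.1},
]
means, gaps = coverage_gaps(rows)
print(gaps)
```

Objectives returned in `gaps` are the candidates for more explicit treatment in the next prompt revision.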
Addressing RQ3, the iterative process proved instrumental in refining NPC responses to better align with educational objectives. For instance, NutriBot’s coverage of previously underrepresented goals, such as Portion Control and Snack Quality, improved markedly with the refined prompts, yielding more complete coverage of critical nutritional aspects. This improvement is evidenced by a significant increase in the alignment of feedback with the predefined educational targets, indicating NutriBot’s enhanced ability to guide users toward healthier dietary habits. Similarly, CyberFlora achieved more balanced coverage across all sustainability objectives, closing prior gaps in areas such as Waste Reduction and Biodiversity Support. These refinements demonstrate how targeted, iteratively evaluated adjustments can optimize the educational value of AI-driven tools, and how such tools can be adapted as educational needs evolve. Importantly, the educational objectives were met while preserving NutriBot’s hip-hop style and CyberFlora’s New Age tone, which shows that accurate and comprehensive educational content can be combined with an engaging and dynamic communication style. Well-designed and rigorously tested prompts can therefore deliver educational messages in a manner that is both informative and appealing to the target audience.
The study’s findings align with key sustainability frameworks, particularly the United Nations Sustainable Development Goals (SDGs) and the EAT-Lancet Commission’s planetary health diet. The preference for plant-based and locally sourced foods observed in participant choices supports SDG 12 (Responsible Consumption and Production) by promoting food selections that reduce environmental impact [
24]. Additionally, integrating AI-driven sustainability feedback into educational tools like ARFood aligns with SDG 4 (Quality Education) by fostering experiential learning and awareness of dietary sustainability.
The findings of this study align with the existing literature on the use of Augmented Reality (AR) and Artificial Intelligence (AI) in education, particularly in the domains of nutrition and personalized learning.
The ARFood app’s use of AR to create engaging and realistic simulations for teaching nutritional health is consistent with the conclusions of Yigitbas and Mazur [
44], who found that AR and Virtual Reality (VR) technologies effectively support healthy eating by providing additional product information and new learning applications. Similarly, McGuirt et al. [
45] highlighted the potential of Extended Reality (XR) technologies to increase the accessibility and attractiveness of nutrition education programs, which aligns with ARFood’s approach to engaging Generation Alpha.
The iterative refinement of AI-based Non-Player Characters (NPCs) in ARFood to provide personalized feedback aligns with the findings of Maghsudi et al. [
46,
47], who noted that AI in higher education is used for assessment, evaluation, and intelligent tutoring systems, thereby enhancing personalized learning experiences.
The study’s iterative process to refine NPC responses for better alignment with educational objectives reflects the adaptability of AI-driven educational tools. The results confirm that the correct wording of prompts is crucial in the use of artificial intelligence (AI) in education, as it directly affects the quality and relevance of the responses generated. Recent studies have highlighted the importance of these practices. For example, Denny et al. [
48] introduced the concept of ‘Prompt Problems’ to help students develop skills in creating effective prompts for code generators based on large language models (LLMs). Furthermore, Ng and Fung [
49] have shown that the careful design of prompts, including specific information about learners, effectively guides AI in generating coherent and pedagogically sound learning paths, improving the effectiveness of personalized instruction.
The ARFood app not only aligns with existing research on the use of Augmented Reality (AR) and Artificial Intelligence (AI) in education but also extends these findings in several significant ways. While previous studies have explored the use of AR and AI in nutrition education, ARFood uniquely combines these technologies to address both nutritional health and environmental sustainability. By guiding users toward balanced diets and eco-friendly food choices, ARFood provides a holistic educational experience that encompasses multiple dimensions of well-being. ARFood employs AI-based Non-Player Characters (NPCs) that offer tailored feedback on users’ food selections, enhancing the personalization of the learning experience. This approach builds on existing research by demonstrating the effectiveness of AI in delivering customized educational content, thereby increasing user engagement and knowledge retention. The development of ARFood involved an iterative process to refine NPC feedback to ensure alignment with educational objectives. This method not only improved the quality of the information provided, but also showcased the adaptability of AI-driven educational tools in meeting diverse learning needs. By incorporating gamified elements and immersive AR experiences, ARFood effectively engages Generation Alpha, catering to their digital proficiency and learning preferences. This strategy enhances motivation and participation, addressing the challenges identified in previous studies regarding student engagement in educational programs.
Unlike traditional supervised machine learning models, which require large labeled datasets, ARFood employs a RoBERTa-based zero-shot classification framework to iteratively refine the AI-generated feedback. This approach allows dynamic adaptation without the need for continuous retraining. Supervised learning models often achieve high accuracy when trained on well-structured datasets; however, they face challenges in scalability and adaptability when new educational objectives emerge [
50]. ARFood circumvents these limitations by leveraging zero-shot classification, enabling the system to assess the relevance of AI feedback to predefined objectives without additional training data [
51].
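A zero-shot relevance check of this kind can be sketched with the Hugging Face `transformers` zero-shot classification pipeline. This is an illustration under assumptions, not the study's implementation: the checkpoint name `roberta-large-mnli`, the use of `multi_label=True`, and the helper names are choices made here, since this section does not give the exact configuration.

```python
def build_zero_shot_scorer(model_name="roberta-large-mnli"):
    """Wrap a Hugging Face zero-shot pipeline as scorer(text, labels) -> dict.

    Assumption: the checkpoint name is illustrative only.
    Requires the `transformers` package and a model download.
    """
    from transformers import pipeline  # imported lazily; heavy dependency

    classifier = pipeline("zero-shot-classification", model=model_name)

    def scorer(text, labels):
        # multi_label=True scores each objective independently in [0, 1].
        result = classifier(text, candidate_labels=labels, multi_label=True)
        return dict(zip(result["labels"], result["scores"]))

    return scorer


def score_feedback(feedback, objectives, scorer):
    """Score one NPC feedback message against every educational objective."""
    scores = scorer(feedback, objectives)
    # Return scores in the caller's objective order; missing labels count as 0.
    return {obj: scores.get(obj, 0.0) for obj in objectives}
```

With the dependency installed, `score_feedback(text, objectives, build_zero_shot_scorer())` returns one relevance score per objective. No labeled training data is involved, so adding a new objective only means adding a new label string.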
Similarly, ARFood differs from rule-based feedback systems that rely on predefined conditions and logical rules to generate responses. While rule-based approaches provide transparent and predictable feedback [
52], they lack the ability to adapt dynamically to diverse learning scenarios. ARFood overcomes this rigidity by integrating a flexible AI-driven evaluation process that refines feedback iteratively based on a probabilistic alignment with educational goals.
Furthermore, while many AI-driven serious games employ generative AI models, such as GPT-4, to create personalized learning experiences, these models often lack an intrinsic evaluation mechanism to ensure alignment with pedagogical objectives. Without structured oversight, generative AI responses can become inconsistent, off-topic, or fail to address key learning goals [
53]. In contrast, ARFood systematically integrates RoBERTa-based classification to analyze AI-generated feedback and refine it iteratively, ensuring that each interaction progressively enhances educational effectiveness.
By combining generative AI (ChatGPT 4o) with zero-shot classification (RoBERTa), ARFood represents an innovative hybrid approach that balances the flexibility of natural language generation with the structure of an objective-driven evaluation system. This dual-layer methodology not only ensures high-quality feedback tailored to user interactions but also guarantees that all educational objectives—ranging from nutritional awareness to sustainability practices—are covered comprehensively and equitably. Compared to traditional AI-driven feedback mechanisms, ARFood demonstrates greater adaptability, scalability, and effectiveness in optimizing educational outcomes within serious games [
47,
49].
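The generate-then-evaluate cycle described above can be sketched as a loop in which feedback is produced, scored against each objective, and the prompt is extended for any objective that scores too low. Everything concrete here is an assumption for illustration: the automatic prompt extension, the 0.5 threshold, and the stub `generate` and `scorer` functions stand in for the language model and the classifier, and the study's refinement may well have been carried out manually by the authors.

```python
def refine_prompt(prompt, objectives, generate, scorer, threshold=0.5, max_rounds=3):
    """Iteratively extend a prompt until the generated feedback covers all objectives.

    generate(prompt) -> feedback text; scorer(text, objectives) -> {objective: score}.
    Stops when no objective scores below threshold, or after max_rounds.
    """
    history = []
    for _ in range(max_rounds):
        feedback = generate(prompt)
        scores = scorer(feedback, objectives)
        missing = [obj for obj in objectives if scores.get(obj, 0.0) < threshold]
        history.append((prompt, missing))
        if not missing:
            break
        prompt += " Also comment on: " + ", ".join(missing) + "."
    return prompt, history


# Hypothetical stand-ins: the "model" echoes its prompt, and the "classifier"
# checks whether the objective name appears in the text.
def echo_generate(prompt):
    return prompt

def keyword_scorer(text, objectives):
    return {obj: (1.0 if obj in text.lower() else 0.0) for obj in objectives}

objectives = ["waste reduction", "biodiversity support"]
final_prompt, history = refine_prompt(
    "Rate the cart for waste reduction.", objectives, echo_generate, keyword_scorer
)
print(final_prompt)
```

The `history` list records which objectives were missing in each round, which makes the effect of each prompt revision auditable.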
However, this study has limitations. The sample, while sufficient for the initial analysis, may not support generalization of the findings to broader populations. Additionally, reliance on a single zero-shot classification model might overlook nuanced feedback gaps. The absence of longitudinal data limits insight into long-term behavioral impacts. Future research could address these limitations by expanding the participant base, incorporating diverse AI models, and conducting longitudinal studies to assess whether the observed educational benefits persist.
Future developments could include integrating adaptive learning mechanisms to tailor feedback dynamically based on individual user progress. Expanding the application to address other sustainability and health topics, such as water conservation and physical activity, could further enhance its educational value. Finally, collaborative features that enable peer interactions may enrich learning experiences and foster collective awareness.
The potential of ARFood to influence real-world behaviors is an important consideration. While the application is not designed as a direct intervention tool, it serves as a learning environment where users can experiment with purchasing decisions, receive AI-driven feedback, and engage with sustainability concepts in a risk-free, gamified setting. Research on game-based learning and serious games suggests that immersive and interactive educational environments enhance knowledge retention and decision-making skills, which can lead to incremental behavioral changes over time [
54]. Through a realistic virtual shopping simulation, ARFood fosters cognitive engagement with sustainable food choices, potentially influencing attitudes that extend beyond the game environment.
To further demonstrate the long-term impact of improved AI feedback, future research will focus on conducting pre/post knowledge assessments to measure learning gains over time. Longitudinal monitoring of behavior will make it possible to determine whether feedback-driven learning leads to lasting changes in food and sustainability choices. Another important line of investigation will be the comparison of AI-enhanced and rule-based feedback to quantify the added value of iterative AI refinement in serious games. By integrating empirical assessments of knowledge retention and behavioral change, future studies could further validate AI-driven gamification strategies as transformative tools in education, health, and sustainability learning.
While this study focused on how AR facilitates a realistic and educationally meaningful simulation of purchasing behavior, future research should explore the role of engagement in enhancing learning effectiveness and behavior change. Engagement is a crucial factor in gamified educational interventions, influencing motivation, sustained interaction, and knowledge retention. To gain deeper insights into how AR-based serious games impact user engagement, future studies should incorporate quantitative engagement metrics, such as self-reported user experience surveys, time-on-task measurements, behavioral tracking, and physiological indicators (e.g., eye-tracking or emotional response analysis).
The study involved middle school students (ages 11–14) recruited from local school networks, ensuring that the participants reflected the target demographic for the AR-based learning intervention. However, since the sample was selected through convenience sampling, the findings may not be fully generalizable across different socioeconomic and cultural backgrounds. Future studies could enhance the robustness of these results by incorporating a more diverse participant pool that includes students from varied educational and geographic settings.