Article

Design and Evaluation of a Generative AI-Enhanced Serious Game for Digital Literacy: An AI-Driven NPC Approach

by Suepphong Chernbumroong 1, Kannikar Intawong 2, Udomchoke Asawimalkit 3, Kitti Puritat 4 and Phichete Julrode 4,*
1 College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
2 Faculty of Public Health, Chiang Mai University, Chiang Mai 50200, Thailand
3 Department of Public Administration, Faculty of Political Science and Public Administration, Chiang Mai University, Chiang Mai 50200, Thailand
4 Department of Library and Information Science, Faculty of Humanities, Chiang Mai University, Chiang Mai 50200, Thailand
* Author to whom correspondence should be addressed.
Informatics 2026, 13(1), 16; https://doi.org/10.3390/informatics13010016
Submission received: 27 November 2025 / Revised: 15 January 2026 / Accepted: 20 January 2026 / Published: 21 January 2026

Abstract

The rapid proliferation of misinformation on social media underscores the urgent need for scalable digital-literacy instruction. This study presents the design and evaluation of a Generative AI-enhanced serious game system that integrates Large Language Models (LLMs) to drive adaptive non-player characters (NPCs). Unlike traditional scripted interactions, the system employs role-based prompt engineering to align real-time AI dialogue with the Currency, Relevance, Authority, Accuracy, and Purpose (CRAAP) framework, enabling dynamic scaffolding and authentic misinformation scenarios. A mixed-method experiment with 60 undergraduate students compared this AI-driven approach to traditional instruction using a 40-item digital-literacy pre/post test, the Intrinsic Motivation Inventory (IMI), and open-ended reflections. Results indicated that while both groups improved significantly, the game-based group achieved larger gains in credibility-evaluation performance and reported higher perceived competence, interest, and effort. Qualitative analysis highlighted the HCI trade-off between the high pedagogical value of adaptive AI guidance and technical constraints such as system latency. The findings demonstrate that Generative AI can be effectively operationalized as a dynamic interface layer in serious games to strengthen critical reasoning. This study provides practical guidelines for architecting AI-NPC interactions and advances the theoretical understanding of AI-supported educational informatics.

1. Introduction

The proliferation of misinformation and “fake news” on digital platforms poses a critical challenge to information ecosystems, threatening evidence-based decision-making across domains ranging from public health to international relations [1,2]. In the current Post-Truth Era, characterized by algorithmic amplification and rapid content velocity [3,4], traditional static educational methods often struggle to replicate the dynamic, adversarial nature of online information consumption. This is particularly evident in short-video ecosystems like TikTok, where high engagement and low barriers to entry facilitate the spread of manipulative content [5,6]. Consequently, there is an urgent need for advanced technological tools that can effectively train users in digital literacy skills, specifically the application of structured evaluation criteria such as the CRAAP framework [7,8].
Serious games have emerged as a promising solution to this challenge, offering interactive environments that enhance motivation and knowledge retention [9]. However, a significant technical bottleneck in current educational games—particularly those relying on role-playing or simulation—is the design of Non-Player Characters (NPCs). Traditionally, NPCs are powered by scripted decision trees or finite-state machines (FSMs), resulting in repetitive, pre-determined interactions that fail to simulate the nuance and unpredictability of real-world human dialogue. This lack of responsiveness limits the “authenticity” of the simulation and, by extension, the system’s pedagogical effectiveness. Recent advancements in Large Language Models (LLMs) and Generative AI offer a transformative opportunity to overcome these limitations by enabling NPCs to converse naturally, adaptively, and responsively [10].
By integrating Generative AI into game engines, developers can create NPCs that simulate human-like dialogue, present complex perspectives, and dynamically adjust responses based on user input. These capabilities shift the paradigm from pre-scripted learning paths to adaptive, free-form interactions—a crucial requirement for training critical thinking and information evaluation. While the application of serious games in digital literacy is well-documented, the technical implementation and user experience (UX) implications of integrating generative AI-driven NPCs into these systems remain underexplored. Specifically, questions remain regarding how such systems can be architected to ensure pedagogical coherence while maintaining the fluidity of AI generation.
To address this gap, this study presents the design and evaluation of a Generative AI–enhanced serious game system. The system integrates OpenAI’s GPT models with a Unity-based game environment to support undergraduate students in developing digital literacy skills. Unlike traditional approaches, this study focuses on the intersection of system design and learning efficacy. It evaluates how AI-generated NPC dialogues influence learning outcomes, user engagement, and perceived system authenticity compared to traditional instruction. The study addresses the following research questions:
  • RQ1: How can generative AI be effectively integrated into a serious game to support dynamic, realistic, and pedagogically meaningful learning interactions?
  • RQ2: To what extent does the generative AI-enhanced serious game improve students’ learning performance compared with traditional classroom instruction?
  • RQ3: How do learners perceive the benefits and limitations of generative AI–driven NPC interactions in terms of engagement and support for information-evaluation skills?
This study contributes to the field of Informatics and Educational Technology in three key ways. First, it demonstrates a technical framework for integrating Generative AI into serious games using prompt engineering to enforce role-specific behaviors, addressing the limitations of finite-state systems. Second, it provides empirical evidence of the system’s effectiveness in a controlled misinformation-evaluation context. Finally, it offers design guidelines for AI-supported educational environments, highlighting the balance between open-ended AI generation and structured pedagogical scaffolding.

2. Literature Review

2.1. Serious Games for Learning

Serious games are designed to achieve specific instructional objectives rather than mere entertainment [11], offering immersive and interactive learning experiences that enhance understanding across multiple fields [12], particularly in subjects where learners often struggle, such as science and mathematics [13]. Research has shown that incorporating game mechanics can increase motivation [14], confidence [15], and engagement; Karimov et al., for example, demonstrated improved participation and performance among below-average students after game-based revision sessions [16]. Consistent findings also indicate that serious games sustain interest and enjoyment in the learning process [17], making them valuable pedagogical tools that improve engagement and learning outcomes across various domains, including digital and information literacy, as summarized in Table 1.

2.2. Generative AI for Learning/AI-Driven Educational Interactions

Recent advances in generative artificial intelligence—particularly large language models (LLMs)—have accelerated the development of AI-based tools to support teaching, learning, and assessment across educational levels. Systematic reviews consistently report that generative AI can enhance personalized feedback, adaptive scaffolding, and learner engagement when integrated with clear instructional objectives [23,24]. At the same time, researchers emphasize the importance of human oversight due to challenges related to hallucinated content, bias, and overreliance on automated support [25]. Studies on AI-powered educational agents suggest that hybrid human–AI designs—where instructors guide prompts, curate AI output, and embed AI interactions within structured learning tasks—yield better learning outcomes than fully autonomous systems [24,26].
Conversational agents represent one of the most widely adopted applications of AI in education. Meta-analyses indicate that chatbots can promote learner support, motivation, and reflective thinking, although their impact on achievement varies depending on interaction design and instructional alignment [27]. Effective agents tend to use questioning, scaffolding, and dialogic prompts rather than providing direct answers or solutions. This aligns with socio-constructivist and dialogic learning principles, where learners build understanding through guided exploration and feedback. Within this pedagogical landscape, generative AI offers new opportunities to create dynamic, context-sensitive, and human-like educational interactions. In the present study, generative AI is used to drive non-player character (NPC) dialogues in a serious game, enabling adaptive questioning, scenario-based feedback, and authentic information-evaluation tasks. These AI-mediated interactions form the basis for comparing the learning performance, engagement, and perceived support of students using the game with those receiving traditional instruction.

2.3. Generative AI and the Evolving Role of NPCs in Game-Based Learning

Generative AI is reshaping both how games are produced and how players experience them. Although early-career developers have raised ethical and usability concerns regarding generative AI tools [28], current research highlights their strong potential to streamline development by automating prototyping, narrative generation, and asset creation [29]. These capabilities reduce the resource demands traditionally associated with scenario design and character development in serious games, allowing designers to direct more attention toward pedagogy and user experience. As a core component of game interaction, non-player characters (NPCs) have historically relied on scripted or finite-state systems [30], which limit responsiveness and realism. Prior work documents persistent constraints in NPC adaptivity [31], reinforcing calls for more flexible models that enhance immersion and perceived credibility [32]. Recent research in human–AI interaction emphasizes that adaptive agents are more effective when their behavior aligns with users’ implicit mental models, enabling systems to anticipate user needs and adjust interaction strategies dynamically [33].
Recent AI advances have significantly strengthened NPC authenticity in both entertainment and educational games. Titles such as Grand Theft Auto V demonstrate how decision trees, behavior trees, and reinforcement learning can enable nuanced character reactions shaped by needs and environmental cues [34,35,36]. Similarly, memory-based NPC systems in games like Red Dead Redemption 2 support emergent storytelling with context-dependent consequences [37,38,39]. Although the present study does not implement explicit learner or mental-state models, NPC dialogue adaptation is achieved through rule-based prompt conditioning using observable contextual variables, such as NPC roles, scenario types, and performance-based severity levels. These developments hold important implications for game-based learning: NPCs that generate adaptive dialogue, emotional expression, and contextual memory can simulate complex scenarios, present diverse viewpoints, and deliver personalized scaffolding—affordances particularly valuable for educational tasks that require critical thinking, reasoning, and real-time information evaluation.

2.4. Digital Literacy, Misinformation, and the CRAAP Framework in Educational Contexts

Misinformation and “fake news” have increasingly drawn scholarly attention due to their varied forms—including rumors, manipulation, propaganda, and deliberately deceptive disinformation—each posing risks to public understanding and democratic discourse [40,41,42,43]. Their rapid spread is amplified by social media algorithms, reward systems, and echo chambers that reinforce pre-existing beliefs [44,45,46]. Platforms such as TikTok, which combine fast-paced, visually rich content with large youth audiences, have become particularly influential in accelerating this circulation [47,48,49,50]. Motivations for creating or sharing misleading content range from entertainment and social recognition to financial gain or political manipulation [51,52,53], underscoring the need for educational interventions that strengthen learners’ critical evaluation skills.
Digital literacy therefore plays a crucial role in equipping individuals to identify, analyze, and responsibly use information in digital environments [54,55]. It encompasses technical proficiency as well as the cognitive ability to discern credible sources, a skill that improves through repeated exposure to online materials [56]. Educational programs aimed at cultivating digital literacy stress the importance of structured evaluation methods to help learners navigate vast and uneven information landscapes [57]. The CRAAP framework—Currency, Relevance, Authority, Accuracy, and Purpose—offers such a structure, serving as a widely adopted checklist for assessing the quality and credibility of online content [8,58]. In the present study, CRAAP-based evaluation tasks form the applied context through which students practice information-assessment skills and interact with generative AI-driven NPCs. Although misinformation scenarios provide the domain of practice, the study’s primary focus lies in examining how AI-supported interactions can enhance learning, engagement, and authenticity.

3. Game Design of the Generative AI-Enhanced Serious Game

The serious game developed in this study was designed to provide an interactive, authentic, and pedagogically meaningful environment in which undergraduate students practice evaluating online information. The design approach integrates three core components: (1) scenario-based tasks inspired by real-world misinformation contexts, (2) generative AI-driven non-player characters (NPCs) that enable dynamic and adaptive interactions, and (3) embedded scaffolding based on the CRAAP framework to support learners’ information-evaluation processes. Together, these components create a learning environment that mirrors authentic digital-media encounters and encourages students to apply critical-thinking skills in a guided yet open-ended manner.

3.1. Learning Objectives and Pedagogical Foundations

To define the learning objectives of the generative AI-enhanced serious game, a knowledge–expert co-creation approach was adopted, involving structured ideation sessions with specialists in digital literacy, librarianship, and game design. This collaborative process ensured that the intended learning outcomes were pedagogically sound while maintaining an appropriate balance between cognitive challenge, narrative coherence, and gameplay enjoyment. Digital literacy experts contributed domain knowledge in information-evaluation frameworks and curricular alignment, while game designers translated these goals into Mechanics–Dynamics–Aesthetics (MDA) structures that support meaningful player engagement. Student participants were also involved during early prototyping stages to provide formative feedback on usability, difficulty balance, and clarity of in-game tasks.
To articulate the instructional focus, the CRAAP framework was initially applied to validate the misinformation scenarios used in the game, ensuring that tasks reflected authentic challenges encountered in university-level digital literacy instruction. The target audience comprises undergraduate learners with foundational digital skills who are expected to engage with complex, conversation-driven gameplay facilitated by generative AI–powered NPCs. Through iterative co-design, the learning objectives were refined to emphasize three areas: (1) applying structured information-evaluation criteria, (2) engaging in critical reasoning through adaptive NPC dialogue, and (3) reflecting on credibility judgments through embedded scaffolding. An overview of the subject-matter experts’ contributions is summarized in Table 2.

3.2. Narrative Structure and Misinformation Scenarios

The narrative of the serious game was designed to immerse learners in authentic, real-world information-evaluation challenges while maintaining a coherent storyline that supports critical thinking and sustained engagement. Players assume the role of a novice fact-checker recruited by a university media-literacy unit, where they must investigate questionable news items, social-media posts, video clips, and comments circulating within a simulated digital ecosystem. The game world includes dynamic outdoor and indoor locations that reflect familiar digital-media and public-information environments (see Figure 1). This framing mirrors how misinformation typically appears in students’ everyday contexts, ensuring both relevance and authenticity. Each mission introduces a distinct scenario modeled after common patterns of misleading or low-credibility content—such as manipulated images, decontextualized statistics, emotionally charged claims, or authority-appeal fallacies—allowing learners to repeatedly apply evaluation strategies across varied contexts.
To support pedagogical authenticity, misinformation samples and scenario templates were co-developed with digital-literacy experts and librarians, who validated the plausibility, difficulty level, and instructional value of each case. The scenarios progressively increase in complexity: early missions focus on identifying basic credibility cues, while later missions require interpreting conflicting evidence, cross-checking sources, and articulating justification for evaluation decisions. In more advanced tasks, players are presented with multiple documents or posts displayed simultaneously for comparison and evaluation, as illustrated in Figure 2. Throughout the narrative, generative AI-driven NPCs act as editors, peers, or community members who challenge, guide, or question the player’s reasoning. Their dynamic, context-responsive dialogue enables branching narrative paths—such as presenting alternative viewpoints, requesting clarification, or offering supportive hints—resulting in personalized learning trajectories while maintaining narrative coherence. This scenario design ensures that gameplay remains tightly aligned with the learning objectives established during the expert co-design process.

3.3. AI-Driven NPC Interaction Design

The serious game incorporates generative AI-powered non-player characters (NPCs) to facilitate adaptive, naturalistic, and cognitively engaging interactions. Unlike traditional scripted characters, the AI-driven NPCs generate real-time responses based on the learner’s inputs, reasoning patterns, and in-game decisions. Each NPC fulfills a specific instructional role—such as mentor, peer, or skeptic—providing diverse viewpoints that prompt learners to justify their evaluations, question assumptions, and consider alternative interpretations of online information. Their conversational moves—such as probing questions, clarification requests, or reflective prompts—support deeper reasoning and authentic engagement with credibility-assessment tasks embedded in the narrative. The overall interaction cycle between the player and the AI-driven NPCs is illustrated in Figure 3.
To preserve instructional coherence, the AI dialogue system was guided by carefully designed prompts and constraints that align NPC behavior with the CRAAP framework and the pedagogical purpose of each mission. These guardrails ensure that NPC interactions remain relevant and supportive while preserving the fluidity and personalization afforded by generative AI. The resulting interaction model enables branching dialogue paths, varied learning trajectories, and personalized conversational scaffolding, enhancing immersion while maintaining a structured environment that promotes systematic information evaluation.
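The role-prompt guardrails described above can be sketched as a small prompt-composition function. This is a minimal illustration only: the role names follow the paper (mentor, skeptic, peer), but the function name, behavior strings, and word limit are hypothetical, since the study does not publish its actual prompt text.

```python
# Hypothetical sketch of a role-prompt builder; wording and limits are
# illustrative assumptions, not the study's actual prompts.
CRAAP_CRITERIA = ["Currency", "Relevance", "Authority", "Accuracy", "Purpose"]

ROLE_BEHAVIORS = {
    "mentor": ("Ask probing questions that guide the player toward the "
               "relevant criterion; never state the answer directly."),
    "skeptic": ("Challenge the player's judgment and demand evidence for "
                "each credibility claim."),
    "peer": ("Think aloud as a fellow student, voicing plausible doubts "
             "and partial reasoning."),
}

def build_role_prompt(role: str, mission_goal: str, criterion: str) -> str:
    """Compose a constrained system prompt for one NPC turn."""
    if role not in ROLE_BEHAVIORS:
        raise ValueError(f"unknown NPC role: {role}")
    if criterion not in CRAAP_CRITERIA:
        raise ValueError(f"unknown CRAAP criterion: {criterion}")
    return (
        f"You are an in-game {role} NPC in a digital-literacy serious game.\n"
        f"Mission goal: {mission_goal}\n"
        f"Focus this turn on the '{criterion}' criterion of the CRAAP framework.\n"
        f"Behavior: {ROLE_BEHAVIORS[role]}\n"
        "Stay in character, keep replies under 80 words, and discuss only "
        "credibility assessment of the current scenario."
    )
```

Rejecting unknown roles and criteria up front is one way such a system can keep free-form generation inside the pedagogical frame: every turn is forced through a vetted role and a vetted CRAAP focus.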

3.4. Integrated LLM Prompting and NPC Response Mechanisms

To integrate the LLM into the serious game, a structured prompting mechanism was designed to explicitly control NPC roles, belief states, and response severity levels while supporting two complementary learning pathways. Rather than allowing free-form text generation, the LLM functioned as a constrained dialogue generator to simulate realistic community responses to misinformation. One pathway involved a knowledge-oriented AI NPC, represented as a “man in black” character (Figure 4), who randomly provided concise explanations related to information literacy and the CRAAP framework during gameplay. This NPC delivered brief, context-aware guidance on concepts such as credibility, authority, accuracy, currency, relevance, and purpose, reinforcing evaluative principles without interrupting gameplay flow. In parallel, the LLM prompts specified the news scenario, the NPC’s role as a non-expert city resident within a thematic zone (health, economic, education, or belief-based), and a response-severity level that was explicitly determined by the player’s misinformation-evaluation score in each scenario.
The second learning pathway emphasized experiential understanding through outcome-driven NPC responses that illustrated the real-world consequences of misinformation. The system employed three graded response levels for both negative and positive outcomes, along with a neutral response level representing everyday dialogue unrelated to misinformation evaluation. The intensity of both negative and positive response levels was directly mapped to the player’s scoring outcome, with lower scores triggering stronger misinformation-consistent behaviors and higher scores producing increasingly sophisticated digital-literacy responses. Negative response levels (Levels 1–3) reflected escalating misinformation impact, ranging from limited personal belief and private sharing (Level 1), to confident belief with active recommendation to others (Level 2), and to strong belief associated with broader social consequences (Level 3). Conversely, positive response levels (Levels 1–3) represented progressively stronger evaluative reasoning, skepticism, and responsible information-sharing behavior. The neutral response level maintained social realism without affecting player scores. As summarized in Table 3, this design establishes a clear mapping between player scoring, NPC belief states, LLM prompt configurations, and in-game dialogue outputs, enabling players to learn from both explicit conceptual guidance and the simulated social consequences of their own news-evaluation decisions.
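The score-to-severity mapping summarized in Table 3 amounts to a threshold function from the player's evaluation score to an NPC belief state. The sketch below assumes a normalized score in [0, 1] and evenly spaced cut-points; both are illustrative assumptions, as the paper does not report the exact thresholds.

```python
def npc_response_level(score: float) -> tuple[str, int]:
    """Map a normalized misinformation-evaluation score (0.0-1.0) to an NPC
    response polarity and severity level (1-3). Thresholds are illustrative;
    neutral everyday-dialogue turns are assumed to be sampled independently
    of the score and are not modeled here."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    if score < 0.5:
        # Lower scores trigger stronger misinformation-consistent behavior.
        if score < 0.17:
            return ("negative", 3)  # strong belief, broad social consequences
        if score < 0.34:
            return ("negative", 2)  # confident belief, active recommendation
        return ("negative", 1)      # limited personal belief, private sharing
    # Higher scores yield progressively stronger digital-literacy responses.
    if score >= 0.84:
        return ("positive", 3)
    if score >= 0.67:
        return ("positive", 2)
    return ("positive", 1)
```

The returned (polarity, level) pair would then select the corresponding prompt configuration for the LLM, so dialogue intensity tracks the player's performance deterministically.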

3.5. Scaffolding and Feedback Mechanics

The game integrates multiple layers of scaffolding to support learners as they navigate the credibility-evaluation tasks and interact with generative AI–driven NPCs. Scaffolding is primarily delivered through adaptive conversational guidance, where NPCs pose probing questions, request clarification, or highlight overlooked cues to prompt deeper reasoning. These interactions operate as just-in-time support, encouraging players to articulate their thinking and apply structured evaluation strategies. Additional scaffolding is incorporated through the game interface, including subtle visual indicators, CRAAP-guided checkpoints, and context-relevant hints that help learners focus on essential elements of credibility assessment without overwhelming cognitive resources. The sequential structure of the missions—beginning from exploration and progressing toward source evaluation—provides a natural progression of scaffolding opportunities (see Figure 5).
Feedback is delivered through a combination of immediate and reflective mechanisms. Immediate feedback is presented after key evaluation choices, offering concise explanations that validate correct reasoning or address misunderstandings. Reflective feedback appears at the end of each mission, summarizing the learner’s decision-making process, identifying strengths and areas requiring improvement, and offering strategies for more rigorous analysis in subsequent tasks. This layered feedback structure ensures that players not only receive corrective information but also develop metacognitive awareness of how they evaluate online content. Collectively, the scaffolding and feedback mechanics reinforce learning while maintaining narrative immersion and an appropriate level of cognitive challenge.

3.6. Technical System Implementation

The technical architecture of the system integrates a Unity WebGL client with a cloud-based backend responsible for managing generative AI interactions, instructional rules, and data logging. The Unity WebGL client handles front-end rendering, player input, in-game decision logic, and NPC dialogue presentation, while delegating all natural-language generation to the GPT-4 model through secure API requests. When players initiate NPC conversations, the Unity client packages the player message together with contextual metadata—including mission state, CRAAP-related cues, and NPC role parameters—and sends this structured request to the backend. The backend then forwards the query to the GPT-4 model using a constrained prompt template specifying behavioral limits, dialogue tone, and pedagogical intent. The response is returned to the Unity client and integrated seamlessly into the gameplay loop. All conversational exchanges, decision events, timestamps, and gameplay metrics are logged in a MySQL 8.4 database for later analysis. The overall system structure is shown in Figure 6.
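The request-packaging step described above can be sketched as follows. The field names are hypothetical, since the paper does not publish its request schema; the sketch only shows the shape of a structured client-to-backend message combining the player's utterance with contextual metadata.

```python
import json

def package_npc_request(player_msg: str, mission_state: str,
                        craap_cues: list[str], npc_role: str) -> str:
    """Bundle the player's message with contextual metadata for the backend.

    Field names are illustrative assumptions; the study's actual schema
    is not published.
    """
    payload = {
        "message": player_msg,
        "context": {
            "mission_state": mission_state,  # e.g., current mission id/phase
            "craap_cues": craap_cues,        # CRAAP criteria relevant this turn
            "npc_role": npc_role,            # e.g., mentor / skeptic / peer
        },
    }
    return json.dumps(payload)
```

Keeping metadata in a separate `context` object lets the backend build the constrained prompt template server-side, so the client never holds the prompt logic or the API credentials.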
To maintain instructional coherence and minimize irrelevant or hallucinated AI output, the dialogue generation pipeline incorporates prompt-engineering strategies and system-level constraints. Each NPC is governed by a dedicated role-prompt that defines its instructional purpose—for example, mentor, skeptic, or peer evaluator—and restricts dialogue content to credibility assessment and scenario-specific learning goals. Additional server-side safeguards, including keyword filters and context-validation procedures, ensure that generated responses remain appropriate and aligned with the game’s pedagogical objectives. The MySQL backend stores player choices, CRAAP-aligned evaluation results, and interaction transcripts, enabling detailed behavioral analytics and supporting the system’s adaptive feedback mechanisms. This hybrid architecture—combining Unity WebGL for client delivery, GPT-4 for adaptive dialogue, and MySQL for persistent storage—provides a stable yet flexible foundation suitable for controlled educational research while supporting real-time personalized learning.

Before AI-generated dialogue is presented to the player, all NPC responses pass through a multi-stage validation process. First, the LLM generates NPC dialogue based on the selected misinformation scenario, player choices, and the predefined NPC role, allowing the response to reflect beliefs and social consequences associated with the fake news context. Second, the generated output is screened using a keyword-based filtering mechanism. If sensitive, inappropriate, or high-risk terms (e.g., offensive language, sexual references, or other flagged keywords stored in the system database) are detected, the response is discarded and regenerated. Third, a secondary AI-based validation step evaluates whether the regenerated response is consistent with the intended scenario context, NPC role, and assigned severity level.
If contextual inconsistency is identified, the response is regenerated iteratively until it satisfies all constraints. Only validated responses are displayed to the player, a process designed to reduce the likelihood of harmful or misleading content and to mitigate the risk of inappropriate AI-generated dialogue, including offensive language, procedural or “how-to” misinformation, and contextually inconsistent responses, while maintaining pedagogical alignment.
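The three-stage validation loop can be sketched as a generate-filter-validate cycle. The keyword list, retry limit, and fallback reply below are assumptions for illustration; the paper states only that responses are regenerated until they satisfy all constraints.

```python
# Placeholder flagged-term list; the study stores its list in a database.
FLAGGED_KEYWORDS = {"offensive_term", "explicit_term"}

def passes_keyword_filter(text: str) -> bool:
    """Stage 2: reject candidates containing any flagged keyword."""
    lowered = text.lower()
    return not any(k in lowered for k in FLAGGED_KEYWORDS)

def validated_npc_reply(generate, validate_context, max_attempts: int = 5) -> str:
    """Run the multi-stage pipeline described in the text.

    generate():            produce one candidate NPC reply (stage 1, the LLM)
    validate_context(txt): AI-based consistency check (stage 3) -> bool
    max_attempts / the stock fallback line are illustrative safeguards,
    not reported parameters of the actual system.
    """
    for _ in range(max_attempts):
        candidate = generate()
        if not passes_keyword_filter(candidate):   # stage 2: keyword screen
            continue                               # discard and regenerate
        if validate_context(candidate):            # stage 3: context check
            return candidate
    return "Let me think about that for a moment."  # safe stock fallback
```

Passing the generator and validator in as callables keeps the control flow testable without live API calls; in production they would wrap the GPT-4 request and the secondary AI validation step, respectively.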

4. Materials and Methods

This study employed a quasi-experimental research design to evaluate the effectiveness of the generative AI-enhanced serious game in supporting students’ digital-literacy skills and credibility-evaluation performance. The methodology consisted of two components: (1) the implementation of the AI-driven game prototype, and (2) an empirical evaluation with undergraduate participants. A total of 60 students voluntarily participated in the study and completed the gameplay session followed by an assessment of learning performance and perceptions. Quantitative data, including accuracy scores, task-completion metrics, and evaluation logs, were collected to address the three research questions concerning AI integration, learning outcomes, and learner perceptions.

4.1. Participants

Participants were recruited through announcements posted on the university’s official website and the official website of the Safe and Creative Media Development Fund. To broaden outreach, targeted social-media advertisements were used to direct individuals with internet access to the game’s landing page, where study information was provided. Upon launching the game, participants were invited to provide informed consent and to complete an in-game pre- and post-survey; data were collected over a two-month period. During the active data-collection phase, we obtained n = 60 fully matched pre–post responses, with data paired at the individual level and linked to research-relevant gameplay records. Only participants who (a) completed the entire gameplay session and (b) explicitly consented to the use of their data were included in the analysis. Table 4 presents the demographic characteristics of the research participants.

4.2. Instruments

To evaluate learning performance, gameplay behaviors, and learner perceptions, three categories of research instruments were employed. Each instrument includes clearly defined measurement items to ensure systematic and replicable assessment.

4.2.1. Pre- and Post-Test of Digital Literacy Questionnaire

A 40-item pre- and post-test was developed to assess participants’ theoretical and applied knowledge of digital literacy, fake news evaluation, and the CRAAP framework. The assessment items were adapted from a question set routinely used in an undergraduate Information Literacy course to evaluate students’ ability to access, analyze, and critically interpret online information. This part of the instrument focused on clearly defined, knowledge-based questions assessing foundational concepts of information literacy and news evaluation. All items and the corresponding scoring criteria were reviewed by three domain experts and demonstrated acceptable content validity, with an item–objective congruence (IOC) value of 0.75. Responses were scored using a predefined rubric, with scores reflecting the accuracy and appropriateness of participants’ reasoning rather than response length. For items requiring qualitative judgment, responses were independently evaluated by two raters using the predefined scoring rubric, and inter-rater reliability analysis indicated an acceptable level of agreement between raters (Cohen’s κ = 0.62).
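The inter-rater agreement statistic reported here (Cohen’s κ) has a standard closed form: κ = (pₒ − pₑ)/(1 − pₑ), where pₒ is observed agreement and pₑ is agreement expected by chance from each rater’s marginal label frequencies. A minimal sketch for two raters’ categorical rubric scores:

```python
from collections import Counter

def cohens_kappa(rater_a: list, rater_b: list) -> float:
    """Cohen's kappa for two raters' categorical scores.

    Assumes at least one disagreement is possible (expected agreement < 1),
    otherwise the denominator is zero.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and equal length")
    n = len(rater_a)
    # Observed proportion of agreement.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    pa, pb = Counter(rater_a), Counter(rater_b)
    expected = sum(pa[c] * pb[c] for c in set(rater_a) | set(rater_b)) / n**2
    return (observed - expected) / (1 - expected)
```

A value such as the study’s κ = 0.62 is conventionally read as moderate-to-substantial agreement on common benchmark scales (e.g., Landis and Koch).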

4.2.2. Qualitative Post-Experience Responses

Qualitative data were gathered through open-ended post-experience questions asking participants to describe their learning experience, their interaction with the AI-driven NPCs, and their overall impressions of the game. Responses were analyzed thematically across key dimensions, including perceived learning gains, engagement, clarity and usefulness of NPC dialogue, technical stability, and feedback on gameplay structure or desired improvements. These qualitative insights provided contextual understanding of learner experience beyond quantitative measures.

4.2.3. Intrinsic Motivation Questionnaire

Intrinsic motivation toward the learning activity was measured using a short-form Intrinsic Motivation Inventory (IMI), a widely applied instrument for assessing subjective motivational states in educational settings [59,60]. The questionnaire comprised 14 Likert-scale items (1 = strongly disagree to 5 = strongly agree) across three core subscales—Interest/Enjoyment, Perceived Competence, and Effort/Importance—with minimal wording adjustments to align with the study’s instructional contexts. The IMI was administered to both the experimental group (AI-driven serious game) and the control group (traditional instruction) to enable comparison of motivational outcomes across learning modalities. Mean subscale scores were calculated to generate composite indicators of intrinsic motivation for subsequent analysis.

4.3. Research Procedure

The overall research workflow is shown in Figure 7, which presents the quasi-experimental pre–post design used in this study. After assessing eligibility and obtaining informed consent, all participants completed a pre-test of digital literacy along with the IMI to establish baseline knowledge and motivational levels prior to the intervention. Participants were then assigned to one of two groups of equal size (n = 30 each) using a quasi-experimental assignment based on course sections. The experimental group accessed the AI-driven serious game through the web-based platform, receiving mission instructions and interacting with generative AI-powered NPCs while performing misinformation evaluation tasks. In contrast, the control group participated in a traditional instructor-led information-literacy lesson covering the same core topics, including digital literacy concepts, fake news characteristics, and the CRAAP evaluation framework.
Upon completion of the instructional activities, both groups undertook the post-test of digital literacy and the IMI to assess learning gains and motivational changes. In addition, the experimental group completed an open-ended qualitative questionnaire to provide deeper insights into user experience and perceptions of the AI-driven interactions. Gameplay logs, user decisions, and NPC dialogue transcripts were automatically stored in the system’s database, while all collected data were anonymized prior to statistical analysis. Paired and between-group comparisons were performed to evaluate the relative effectiveness of AI-driven game-based learning versus traditional classroom instruction.

5. Data Analysis and Results

5.1. Results for Digital Literacy Learning Outcomes

Table 5 presents the pre- and post-test score summaries for the digital-literacy and news-evaluation assessment across the traditional instruction group and the game-based learning group. Normality diagnostics indicated that the paired difference scores were acceptable for parametric testing; therefore, paired-samples t-tests were used to compare pre- and post-intervention performance within each group. For the game-based learning group, mean scores increased from 4.40 (SD = 2.91) to 16.50 (SD = 4.93), a statistically significant improvement, t = 11.587, p < 0.001, with a large effect size (Cohen’s d = 2.116). The traditional instruction group also showed significant gains, with mean scores rising from 4.76 (SD = 2.68) to 14.20 (SD = 4.90), t = 16.081, p < 0.001, and similarly demonstrating a large effect size (Cohen’s d = 2.936).
An independent-samples t-test comparing gain scores between groups was conducted to evaluate relative improvement from pre- to post-test. Results showed that the game-based learning group achieved significantly greater improvement (M = 12.10, SD = 5.71) than the traditional instruction group (M = 9.43, SD = 3.21), t(58) = 2.226, p = 0.03, with a medium effect size (Cohen’s d = 0.575). These results indicate that, although both groups improved after instruction, the game-based learning condition produced stronger learning gains overall. Note that Table 5 reports raw post-test scores (Traditional: 14.20; Game: 16.50), which reflect absolute performance, whereas Table 6 reports gain scores used to compare learning improvement across groups.
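The within-group statistics above follow the standard paired-samples formulas, in which Cohen’s d is the mean pre–post difference divided by the standard deviation of the differences, and the paired t statistic equals d·√n (e.g., 2.116 × √30 ≈ 11.59, matching the reported t = 11.587). A short sketch with hypothetical scores (not the study’s raw data):

```python
import statistics as st

def cohens_d_paired(pre, post):
    """Cohen's d for paired samples: mean of differences / SD of differences."""
    diffs = [b - a for a, b in zip(pre, post)]
    return st.mean(diffs) / st.stdev(diffs)

def paired_t(pre, post):
    """t statistic for a paired-samples t-test (df = n - 1)."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    return st.mean(diffs) / (st.stdev(diffs) / n ** 0.5)

# Hypothetical pre/post scores for five learners
pre = [1, 2, 3, 4, 5]
post = [3, 4, 5, 6, 8]
d = cohens_d_paired(pre, post)
t = paired_t(pre, post)
# Consistency check: t = d * sqrt(n)
print(round(t - d * len(pre) ** 0.5, 10))
```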

5.2. Results of Intrinsic Motivation

Table 7 presents the pre- and post-test comparisons of intrinsic motivation across the three IMI subscales—competency, interest, and effort—for both the experimental (game-based) and control (traditional instruction) groups. In the experimental group, significant increases were observed in both competency and interest. Competency scores rose from 3.12 (SD = 0.61) to 4.08 (SD = 0.55), t = 7.214, p < 0.001, with a large effect size (Cohen’s d = 1.317). Interest showed a similarly strong improvement from 3.25 (SD = 0.72) to 4.35 (SD = 0.49), t = 8.103, p < 0.001, with a large effect size (Cohen’s d = 1.480). Effort also increased significantly, though to a lesser degree, from 3.41 (SD = 0.67) to 4.22 (SD = 0.52), t = 6.548, p < 0.001, with a large effect size (Cohen’s d = 1.195). In the control group, competency scores increased modestly from 3.18 (SD = 0.58) to 3.46 (SD = 0.60), t = 2.201, p = 0.035, indicating a small to medium effect size (Cohen’s d = 0.401). Changes in the interest and effort subscales were not statistically significant. Interest increased from 3.29 (SD = 0.69) to 3.52 (SD = 0.66), t = 1.884, p = 0.070, while effort rose from 3.38 (SD = 0.64) to 3.59 (SD = 0.62), t = 1.767, p = 0.087.

5.3. Results of Open-Ended Questionnaire

The open-ended responses from the experimental group provided deeper insights into learners’ experiences with the AI-driven serious game. The thematic analysis revealed three major themes, each comprising both positive perceptions and critical feedback.

5.3.1. Theme 1: Enhanced Engagement Through Interactive Gameplay

Learners consistently highlighted the game’s interactive missions, visual design, and scenario-based tasks as key elements that boosted engagement and made the learning experience more enjoyable than traditional lectures. Many participants emphasized that the combination of hands-on gameplay and AI-driven interactions created a sense of liveliness and unpredictability that kept them focused. The integration of AI-generated dialogue alongside interactive exploration was frequently described as making the overall experience feel more dynamic, immersive, and closely aligned with real-world misinformation challenges. In addition to appreciating the gameplay mechanics themselves, several students noted that the presence of AI enhanced engagement by giving each mission a sense of novelty, as NPC responses varied with each playthrough. This unpredictability not only increased enjoyment but also motivated some learners to replay missions to observe different conversational outcomes. However, participants also reported moments where engagement declined—particularly when certain mission structures became repetitive or when visual scenes did not evolve to match the dynamism of the AI interactions. Some users further mentioned that time-limited tasks created pressure that occasionally reduced enjoyment.

5.3.2. Theme 2: Clarity and Usefulness of AI-Driven NPC Guidance

Participants expressed particularly strong appreciation for the AI-driven NPC interactions, emphasizing that the dynamic and non-repetitive dialogue made the gameplay feel lively, unpredictable, and highly enjoyable. Many students commented that the NPCs’ varied conversational responses added “color,” “depth,” and “freshness” to each playthrough, increasing their motivation to replay missions in order to see how the NPCs would respond differently. Learners highlighted that this adaptiveness made the experience feel more authentic than scripted characters and contributed to sustained engagement.
Despite these strengths, participants suggested that the system could be further improved by extending AI capabilities beyond dialogue generation. While the NPCs’ responses were diverse, learners expressed a desire for AI to also influence in-game events, such as triggering different scenarios, altering mission conditions, or dynamically modifying story branches based on player choices. Several respondents noted that current gameplay relies mainly on AI-generated conversation, whereas integrating AI-driven event generation would create a more immersive and responsive narrative world.

5.3.3. Theme 3: Technical and Usability Limitations

Participants identified several strengths related to usability, including intuitive interfaces, accessible gameplay mechanics, and clear visual feedback after decisions. Players found the in-game tools easy to navigate, even on first use. At the same time, users also reported technical issues, including lag, occasional crashes, slow loading times, and audio–text synchronization problems. These interruptions disrupted immersion and reduced motivation in longer sessions. Some users also noted that performance on mobile devices was less stable than on desktop.

6. Discussion and Findings

6.1. Summary of Key Findings

Regarding the system integration (RQ1), this study demonstrates that generative AI can be effectively operationalized within a serious game architecture through adaptive, role-based NPCs. By utilizing structured prompts, scenario metadata, and CRAAP-aligned instructional constraints, the system enabled NPCs to engage learners in clarifying questions and personalized scaffolding. From a system design perspective, our architecture successfully decoupled dialogue logic from the game engine, shifting the development paradigm from exhaustive scripting to prompt engineering. This modularity not only facilitated dynamic interaction but also significantly enhanced system scalability, allowing for the creation of complex educational scenarios without the overhead of hard-coding finite-state machines. This aligns with recent research showing that AI-supported serious games can significantly enhance cognitive engagement and adaptive learning experiences when compared with traditional scripted approaches [61]. Furthermore, the implementation confirms that conversational AI can strengthen immersion and realism by replacing rigid rule-based NPCs with flexible agent behaviors [62].
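The decoupling described above can be illustrated with a minimal sketch in which an NPC’s system prompt is assembled from scenario metadata and CRAAP-aligned constraints as data, so new characters are authored by editing configuration rather than game-engine code. All names and rule texts here are hypothetical illustrations, not the study’s actual prompt templates:

```python
# Hypothetical CRAAP-aligned instructional constraints, kept as data
# outside the game engine so designers can edit them without recompiling.
CRAAP_RULES = {
    "Currency": "Ask when the information was published or last updated.",
    "Authority": "Ask who the source is and what their credentials are.",
    "Accuracy": "Ask what verifiable evidence supports the claim.",
    "Purpose": "Ask why the information was created or shared.",
}

def build_npc_prompt(role, scenario, focus_criteria):
    """Compose a constrained system prompt for one NPC from scenario metadata."""
    rules = "\n".join(f"- {c}: {CRAAP_RULES[c]}" for c in focus_criteria)
    return (
        f"You are {role} in the following scenario: {scenario}\n"
        "Guide the learner with clarifying questions; never reveal the verdict directly.\n"
        f"Ground every hint in these CRAAP criteria:\n{rules}"
    )

prompt = build_npc_prompt(
    role="a skeptical local journalist",
    scenario="a viral post claims the city's tap water is unsafe",
    focus_criteria=["Authority", "Accuracy"],
)
print(prompt)
```

Because the prompt is pure data, adding a new mission or NPC amounts to supplying a new role, scenario, and criteria list rather than extending a hard-coded finite-state machine.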
These architectural innovations translated directly into measurable learning gains (RQ2). The generative-AI-enhanced system produced significantly greater improvements in digital-literacy performance compared with traditional instruction. While both groups showed pre–post gains, the game-based group achieved markedly higher post-test scores and larger effect sizes. These results align with recent work showing that AI-driven learning environments can substantially enhance performance and skill development in higher education settings [63]. The observed improvements confirm that the system’s ability to streamline instructional processes and adapt dynamically to learner needs—key features of generative AI—intensifies cognitive engagement and learning effectiveness [64,65]. Consequently, the findings validate that generative AI meaningfully improves learning performance by creating adaptive, feedback-rich instructional contexts that traditional classroom methods cannot easily replicate.
From a Human–Computer Interaction (HCI) perspective (RQ3), learners perceived the system as highly supportive of motivation and cognitive effort. IMI results indicated statistically significant gains across competency, interest, and effort dimensions, suggesting that AI-driven dialogue enhanced perceived understanding more strongly than traditional instruction. Open-ended responses described the adaptive guidance as “encouraging” and “natural,” confirming that context-aware prompts successfully clarified misconceptions. However, the study also identified a critical technical trade-off between response quality and system latency. Mirroring limitations highlighted in the current literature [64], participants noted occasional response delays. These interruptions, caused by API inference times for the high-complexity model, occasionally disrupted the gameplay “flow”. This finding highlights a key consideration for future educational informatics: balancing the depth of AI reasoning with the real-time responsiveness required for seamless user immersion.

6.2. Opportunities and Challenges of AI-Driven NPCs

Theoretically, this study contributes to the intersection of educational informatics and game design by demonstrating how generative AI can operationalize adaptive scaffolding and dynamic NPC behavior within a constrained learning environment. The findings extend existing theories on game-based learning by showing that AI-NPC interactions function most effectively when architected with role-specific prompting and conversational adaptivity aligned with instructional frameworks such as CRAAP. These results support emerging conceptualizations that generative AI enables richer forms of cognitive engagement and sustained immersion compared with traditional pre-scripted pedagogies [63]. Furthermore, the motivational gains evidenced through IMI scores provide empirical support for HCI theories suggesting that AI-powered virtual characters enhance learner competence and cognitive effort by simulating authentic social interactions [65]. This suggests that in the context of higher education, Generative AI serves not just as a content generator, but as a dynamic interface layer that personalizes the user experience in real-time.
Practically, this study offers evidence-based design recommendations for integrating generative AI into educational software. A key implication is the role of prompt engineering as a functional ‘control layer’ within the system architecture. The AI-driven dialogue system used in this project illustrates how constrained prompts, mission-aligned role design, and scenario metadata act as logic gates to ensure pedagogical coherence while preserving the fluidity of conversation. This approach effectively mitigates technical risks such as hallucination or content drift, functioning similarly to strict logic constraints in traditional software development. These insights guide developers and instructional designers in implementing AI-NPCs that improve learning outcomes without compromising system reliability. Moreover, the study highlights a scalability advantage: generative AI can streamline the design of educational scenarios and automate content creation [64], allowing developers to expand game content by modifying prompts rather than rewriting complex codebases. Collectively, the implications highlight the transformative potential of generative AI to reshape digital-literacy instruction through adaptive, interactive, and pedagogically grounded game environments.
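One way such a control layer can be realized, beyond prompt constraints alone, is a post-generation “logic gate” that rejects replies that leak verdicts, drift off the mission’s CRAAP focus, or exceed a length budget. The following sketch is a hypothetical illustration of this pattern, not the project’s implementation; the banned phrases and thresholds are invented for the example:

```python
import re

# Hypothetical phrases an NPC must never emit (verdict leaks)
BANNED_PATTERNS = [r"the answer is", r"this news is (fake|real)"]
MAX_CHARS = 400  # illustrative length budget to keep dialogue snappy

def passes_gate(reply, required_keywords):
    """Post-generation gate: accept a candidate NPC reply only if it stays
    within the length budget, leaks no verdict, and touches the mission's
    CRAAP focus (signalled by at least one required keyword)."""
    if len(reply) > MAX_CHARS:
        return False
    if any(re.search(p, reply, re.IGNORECASE) for p in BANNED_PATTERNS):
        return False
    return any(k.lower() in reply.lower() for k in required_keywords)

# An on-topic clarifying question passes; a verdict leak is rejected.
assert passes_gate("Who published this article, and when?", ["published", "author"])
assert not passes_gate("The answer is that this news is fake.", ["published"])
```

A rejected reply would simply trigger regeneration (or a scripted fallback), keeping unreliable outputs from ever reaching the learner.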

6.3. Limitations and Recommendations for Future Research

This study has several limitations that merit consideration. Although the sample of 60 undergraduate volunteers was suitable for the analyses conducted, it inevitably limits the broader applicability of the findings. Because participation was voluntary, the sample may reflect students who already felt comfortable with digital tools or were more open to game-based activities. The study also took place within a single university, and the learning environment may not resemble those found in other institutions or learner groups. On the technical side, some learners experienced delays or repetitive AI-NPC responses, and a few instances of simplified reasoning were noted. These issues partly reflect the constraints of current large language models, including the well-known tendency toward occasional factual inaccuracies or hallucinations despite careful prompt design. While acceptable in a research-prototype context, such system latency may disrupt user flow and pose challenges for real-world deployment, particularly on mobile devices or in environments with limited network connectivity. In addition, the factual accuracy of AI-generated responses was not systematically evaluated in this study, as the LLM was used to support guided reasoning rather than to provide authoritative answers. Furthermore, the study did not include a scripted serious game condition without generative AI; therefore, the specific contribution of the LLM cannot be fully isolated from the effects of game-based interactivity. Human scoring of the open-ended assessment may also introduce some unavoidable subjectivity, even with the use of rubrics. Lastly, the WebGL deployment caused performance fluctuations on older devices, which may have influenced immersion during gameplay.
Future work should examine how AI-driven serious games operate in more varied and larger populations, including secondary-school learners, adult learners, or professional fact-checkers. Cross-cultural studies would also be valuable, as misinformation challenges differ across regions and media ecosystems. On the technical side, exploring more advanced LLMs or retrieval-augmented generation could help reduce hallucination risks and improve the consistency of AI-NPC dialogue. In addition, future system iterations may explore cross-model validation strategies, such as employing heterogeneous large language models for dialogue generation and response validation, to reduce shared model bias and further strengthen safety assurance in high-risk misinformation scenarios. Future research may also consider integrating explicit learner models to enable AI-driven NPCs to adapt interactions based on learners’ evolving knowledge states, potentially enhancing personalization, improving replayability, and reducing learner frustration through dynamically adjusted challenges, feedback, and scaffolding. Further research might also look at longer-term outcomes, such as skill retention or transfer to real-world social-media settings. More detailed qualitative methods—such as think-aloud protocols, gameplay analytics, or eye-tracking—could deepen our understanding of how learners interact with AI-NPCs and how these interactions shape reasoning and engagement over time.

7. Conclusions

This study explored how generative AI can enhance serious-game-based digital-literacy instruction by enabling NPCs to respond in more adaptive and context-aware ways. The AI-driven game supported improvements in misinformation-evaluation skills and increased learners’ motivation and engagement when compared with traditional instruction. Students generally perceived the AI-NPCs as helpful in guiding their reasoning, although some technical limitations—such as response delays or occasionally simplified explanations—were also observed. These mixed but encouraging results suggest that generative AI can strengthen the authenticity of educational simulations, provided that its limitations are acknowledged and carefully managed. While the findings are promising, they represent a first step rather than a final answer. The study highlights design considerations for future AI-enhanced serious games, including clearer scaffolding structures, more stable dialogue generation, and broader evaluation across different learner groups. As generative-AI technologies continue to evolve, there will be opportunities to refine how NPCs support critical thinking and digital-literacy skills in environments where misinformation is increasingly complex. The overall evidence from this study points to the potential of AI-driven serious games as a useful direction for future educational development and research.

Author Contributions

Conceptualization, S.C.; methodology, S.C. and P.J.; software, K.I.; validation, U.A. and K.I.; formal analysis, K.P. and K.I.; investigation, U.A.; resources, K.I.; data curation, K.I.; writing—original draft preparation, S.C. and P.J.; writing—review and editing, S.C.; visualization, K.P.; supervision, K.I.; project administration, K.P.; funding acquisition, K.I. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the Teaching and Learning Innovation Center (TLIC) and partially supported by Chiang Mai University.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Institutional Review Board of Chiang Mai University Research Ethics Committee (CMUREC No.67/268) on 14 October 2024.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available upon request from the corresponding author due to restrictions. The data are not publicly available.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Iyengar, S.; Massey, D.S. Scientific communication in a post-truth society. Proc. Natl. Acad. Sci. USA 2019, 116, 7656–7661. [Google Scholar] [CrossRef]
  2. Baptista, J.P.; Gradim, A. Online disinformation on Facebook: The spread of fake news during the Portuguese 2019 election. J. Contemp. Eur. Stud. 2022, 30, 297–312. [Google Scholar] [CrossRef]
  3. Alonso-López, N.; Sidorenko-Bautista, P.; Giacomelli, F. Beyond challenges and viral dance moves: TikTok as a vehicle for disinformation and fact-checking. Anàlisi 2021, 64, 65–80. [Google Scholar] [CrossRef]
  4. Scheibenzuber, C.; Neagu, L.M.; Ruseti, S.; Artmann, B.; Bartsch, C.; Kubik, M.; Dascalu, M.; Trausan-Matu, S.; Nistor, N. Dialog in the echo chamber: Fake news framing predicts emotion, argumentation and dialogic social knowledge building in subsequent online discussions. Comput. Hum. Behav. 2023, 140, 107587. [Google Scholar] [CrossRef]
  5. Su, C. Douyin, TikTok and China’s Online Screen Industry; Taylor & Francis: London, UK, 2023. [Google Scholar]
  6. Wardle, C.; Derakhshan, H. Information Disorder; Council of Europe: Strasbourg, France, 2017. [Google Scholar]
  7. Gilster, P.; Watson, T. Digital Literacy; Wiley: New York, NY, USA, 1997. [Google Scholar]
  8. Zak, E. Three years of misinformation: A case study of information literacy methods. Dialogue 2024, 11, 2–9. [Google Scholar]
  9. Lamb, R. Serious games and learning motivation. Educ. Technol. Res. Dev. 2024, 72, 1–18. [Google Scholar]
  10. Hare, R.; Tang, Y. Player modeling and adaptation methods within adaptive serious games. IEEE Trans. Comput. Soc. Syst. 2022, 10, 1939–1950. [Google Scholar] [CrossRef]
  11. Jacobs, R.S. Serious games: Play for change. In The Video Game Debate 2; Routledge: London, UK, 2020; pp. 19–40. [Google Scholar]
  12. Bellotti, F.; Kapralos, B.; Lee, K.; Moreno-Ger, P.; Berta, R. Assessment in and of serious games. Adv. Hum.-Comput. Interact. 2013, 2013, 136864. [Google Scholar]
  13. Cheng, M.T.; Su, T.F.; Huang, W.Y.; Chen, J.H. An educational game for learning human immunology. Br. J. Educ. Technol. 2014, 45, 820–833. [Google Scholar] [CrossRef]
  14. Ongoro, C.A.; FanJiang, Y.Y.; Hung, C.H.; Lin, B.J.; Guo, J. Tares: A game-based tangible augmented reality English spelling mastery system with minimal cognitive load. IEEE Access 2024, 12, 61163–61184. [Google Scholar] [CrossRef]
  15. Byusa, E.; Kampire, E.; Mwesigye, A.R. Game-based learning approach on students’ motivation and understanding of chemistry concepts: A systematic review of literature. Heliyon 2022, 8, e09441. [Google Scholar] [CrossRef]
  16. Karimov, A.; Saarela, M.; Kärkkäinen, T. Clustering student feedback. In Proceedings of the 16th International Conference on Educational Data Mining (EDM 2023), Bengaluru, India, 11–14 July 2023; pp. 234–243. [Google Scholar]
  17. Arztmann, M.; Hornstra, L.; Jeuring, J.; Kester, L. Effects of games in STEM education. Stud. Sci. Educ. 2023, 59, 109–145. [Google Scholar]
  18. Neylan, J.; Biddlestone, M.; Roozenbeek, J.; van der Linden, S. Inoculating against misinformation. Sci. Rep. 2023, 13, 18273. [Google Scholar] [CrossRef] [PubMed]
  19. Roozenbeek, J.; van der Linden, S. Breaking Harmony Square: A game that “inoculates” against political misinformation. Harv. Misinf. Rev. 2020, 1, 1–12. [Google Scholar]
  20. Alnuaim, A. The impact and acceptance of gamification by learners in a digital literacy course at the undergraduate level: Randomized controlled trial. JMIR Serious Games 2024, 12, e52017. [Google Scholar] [CrossRef] [PubMed]
  21. Elkin, J.A.; McDowell, M.; Yau, B.; Machiri, S.V.; Pal, S.; Briand, S.; Purnat, T.D. The good talk! A serious game to boost People’s competence to have open conversations about COVID-19: Protocol for a randomized controlled trial. JMIR Res. Protoc. 2023, 12, e40753. [Google Scholar] [CrossRef]
  22. Muis, K.R.; Denton, C.; Dubé, A. Identifying CRAAP on the internet: A source evaluation intervention. Adv. Soc. Sci. Res. J. 2022, 9, 239–265. [Google Scholar] [CrossRef]
  23. Garzón, J.; Patiño, E.; Marulanda, C. Systematic review of artificial intelligence in education: Trends, benefits, and challenges. Multimodal Technol. Interact. 2025, 9, 84. [Google Scholar] [CrossRef]
  24. Córdova-Esparza, D.-M. AI-Powered Educational Agents: Opportunities, Innovations, and Ethical Challenges. Information 2025, 16, 469. [Google Scholar] [CrossRef]
  25. Lo, C.K. What Is the Impact of ChatGPT on Education? A Rapid Review of the Literature. Educ. Sci. 2023, 13, 410. [Google Scholar] [CrossRef]
  26. Labadze, L.; Grigolia, M.; Machaidze, L. Role of AI chatbots in education. Int. J. Educ. Technol. High. Educ. 2023, 20, 56. [Google Scholar] [CrossRef]
  27. Deng, X.; Yu, Z. A meta-analysis and systematic review of the effect of chatbot technology use in sustainable education. Sustainability 2023, 15, 2940. [Google Scholar] [CrossRef]
  28. Boucher, J.D.; Smith, G.; Telliel, Y.D. Is Resistance Futile?: Early Career Game Developers, Generative AI, and Ethical Skepticism. In Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI ’24), Honolulu, HI, USA, 11–16 May 2024; ACM: New York, NY, USA, 2024. [Google Scholar] [CrossRef]
  29. Colado, A.; Morata, M.; Piriz, A.; Manjón, B.F. Using New AI-Driven Techniques to Ease Serious Games Authoring. In Proceedings of the IEEE Frontiers in Education Conference (FIE 2023), College Station, TX, USA, 18–21 October 2023; pp. 1–9. [Google Scholar] [CrossRef]
  30. Orkin, J. Three states and a plan: The AI of FEAR. In Proceedings of the Game Developers Conference (GDC) 2006, San Jose, CA, USA, 20–24 March 2006. [Google Scholar]
  31. Uludağlı, M.Ç.; Oğuz, K. Non-player character decision-making in computer games. Artif. Intell. Rev. 2023, 56, 14159–14191. [Google Scholar]
  32. Chen, Y. Design of Basketball Game AI System. Master’s Thesis, Nanjing University, Nanjing, China, 2017. [Google Scholar]
  33. Andrews, R.W.; Lilly, J.M.; Srivastava, D.; Feigh, K.M. The role of shared mental models in human-AI teams: A theoretical review. Theor. Issues Ergon. Sci. 2023, 24, 129–175. [Google Scholar]
  34. Saranya Rubini, S.; Ram, R.V.; Narasiman, C.V.; Umar, J.M.; Naveen, S. Behaviors of modern NPCs. In Proceedings of the Fourth International Conference on Communication, Computing, and Electronics Systems (ICCCES 2022); Springer: Berlin/Heidelberg, Germany, 2022. [Google Scholar]
  35. Schrier, K. Designing Games for Moral Learning and Knowledge Building. Games Cult. 2019, 14, 306–343. [Google Scholar] [CrossRef]
  36. Jaderberg, M.; Czarnecki, W.M.; Dunning, I.; Marris, L.; Lever, G.; Castaneda, A.G.; Silver, D. Human-level performance in 3D multiplayer games with population-based reinforcement learning. Science 2019, 364, 859–865. [Google Scholar]
  37. Fronek, R.; Göbl, B.; Hlavacs, H. Procedural behavior trees for NPCs. In Proceedings of the 22nd International Conference on Electronic Commerce (ICEC 2020), Chengdu, China, 7–9 July 2020; Springer: Berlin/Heidelberg, Germany, 2020. [Google Scholar]
  38. Wang, S. Effect of VR on storytelling. Highlights Sci. Eng. Technol. 2023, 39, 131–137. [Google Scholar]
  39. Zeng, G. AI-based game NPCs review. Appl. Comput. Eng. 2023, 15, 155–159. [Google Scholar] [CrossRef]
  40. Vath, D.; Vanderlyn, L.; Vu, N.T. Towards a Zero-Data, Controllable, Adaptive Dialog System. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), Torino, Italy, 20–25 May 2024; Calzolari, N., Kan, M.-Y., Hoste, V., Lenci, A., Sakti, S., Xue, N., Eds.; ELRA and ICCL: Torino, Italy, 2024; pp. 16433–16449. [Google Scholar]
  41. Boyd-Barrett, O. Fake news and ‘RussiaGate’discourses: Propaganda in the post-truth era. Journalism 2019, 20, 87–91. [Google Scholar]
  42. Tandoc, E.C.; Lim, Z.W.; Ling, R. Defining “fake news” A typology of scholarly definitions. Digit. J. 2018, 6, 137–153. [Google Scholar]
  43. Armitage, R.; Vaccari, C. Misinformation and disinformation. In Routledge Companion to Media Disinformation; Routledge: Oxfordshire, UK, 2021. [Google Scholar]
  44. Salas-Paramo, J.; Escandon-Barbosa, D. The moderating effect of fake news on the relationship between behavioral patterns and vaccines. Cogent Soc. Sci. 2022, 8, 2103900. [Google Scholar] [CrossRef]
  45. Islam, A.N.; Laato, S.; Talukder, S.; Sutinen, E. Misinformation sharing and social media fatigue during COVID-19: An affordance and cognitive load perspective. Technol. Forecast. Soc. Change 2020, 159, 120201. [Google Scholar] [CrossRef] [PubMed]
  46. Kalsnes, B. Fake news. In Oxford Research Encyclopedia of Communication; Oxford University Press: Oxford, UK, 2018. [Google Scholar] [CrossRef]
  47. González-Padilla, D.A.; Tortolero-Blanco, L. Social media influence in the COVID-19 Pandemic. Int. Braz. J. Urol. 2020, 46, 120–124.
  48. Newman, N.; Fletcher, R.; Schulz, A.; Andi, S.; Robertson, C.T.; Nielsen, R.K. Digital News Report 2021; Reuters Institute: Oxford, UK, 2021. Available online: https://reutersinstitute.politics.ox.ac.uk/digital-news-report/2021 (accessed on 19 January 2026).
  49. O’Sullivan, N.J.; Nason, G.; Manecksha, R.P.; O’Kelly, F. Misinformation on TikTok. J. Pediatr. Urol. 2022, 18, 371–375.
  50. Truong, P.H.; Kim, A.D. TikTok influence on youth. Eur. Conf. Soc. Media 2023, 10, 310–317.
  51. Melchior, C.; Oliveira, M. Motivations to share fake news. New Media Soc. 2023, 25, 1461–1480.
  52. Altay, S.; Hacquin, A.-S.; Mercier, H. Why people avoid sharing fake news. New Media Soc. 2022, 24, 1303–1324.
  53. Aïmeur, E.; Amri, S.; Brassard, G. Fake news, disinformation and misinformation in social media: A review. Soc. Netw. Anal. Min. 2023, 13, 30.
  54. Reddy, P.; Sharma, B.; Chaudhary, K. Digital literacy: A review of literature. Int. J. Technoethics 2020, 11, 65–94.
  55. Safitri, I.; Marsidin, S.; Subandi, A.; Padang, U.N. Analisis kebijakan terkait kebijakan literasi digital di sekolah dasar [Policy analysis of digital-literacy policies in elementary schools]. Edukatif 2020, 2, 176–180.
  56. Siregar, K.E. Increasing digital literacy in education: Analysis of challenges and opportunities through literature study. Int. J. Multiling. Educ. Appl. Linguist. 2024, 1, 10–25.
  57. Siregar, K.E. Human resource development in education. Al-Ma’lumat 2023, 1, 30–39.
  58. Dinkelman, A.L. Course syllabi and information literacy. Issues Sci. Technol. Librariansh. 2010, 60. Available online: https://projectcora.org/library-collection/collection-information-literacy-course-syllabi (accessed on 19 January 2026).
  59. McAuley, E.; Duncan, T.; Tammen, V.V. Psychometric properties of the Intrinsic Motivation Inventory in a competitive sport setting: A confirmatory factor analysis. Res. Q. Exerc. Sport 1989, 60, 48–58.
  60. Arayaphan, W.; Sirasakmol, O.; Nadee, W.; Puritat, K. Enhancing intrinsic motivation of librarian students using virtual reality for education in the context of culture heritage museums. TEM J. 2022, 11, 620–630.
  61. Mitsea, E.; Drigas, A.; Skianis, C. A Systematic Review of Serious Games in the Era of Artificial Intelligence, Immersive Technologies, the Metaverse, and Neurotechnologies: Transformation Through Meta-Skills Training. Electronics 2025, 14, 649.
  62. Adetunji, R.O.; Ade-Ibijola, A. TechStartUpGame: A serious game for training tech start-ups using AI-generated scenarios. Discover Educ. 2025, 4, 286.
  63. Marengo, A.; Pagano, A.; Lund, B.D.; Santamato, V. Research AI: Integrating AI and gamification in higher education for e-learning optimization and soft skills assessment through a cross-study synthesis. Front. Comput. Sci. 2025, 7, 1587040.
  64. Swacha, J. Supporting Serious Game Development with Generative Artificial Intelligence: Mapping Solutions to Lifecycle Stages. Appl. Sci. 2025, 15, 11606.
  65. Zhao, J. Enhancing Design History Education through AI Virtual Characters and Role-Playing Narratives in Serious Games. Int. J. Gaming Comput.-Mediat. Simul. 2025, 17, 1–20.
Figure 1. Overview of the outdoor scenario illustrating the game’s narrative environment, where players investigate misinformation embedded in a simulated public setting.
Figure 2. Example of an in-game evaluation task showing multiple news items and artefacts that players must assess using credibility and CRAAP-based criteria.
Figure 3. Conceptual interaction flow between player input, AI-NPC responses, scaffolding, and evaluation.
Figure 4. A knowledge-oriented AI NPC (“man in black”) that randomly provides contextual explanations of information literacy concepts and the CRAAP framework during gameplay.
Figure 5. Gameplay flow illustrating the progression from initial exploration to community inspection and final source evaluation within each mission.
Figure 6. System architecture illustrating the integration of the Unity WebGL client, GPT-4 dialogue engine, and MySQL data layer.
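The architecture in Figure 6 routes player dialogue from the Unity WebGL client through a server-side relay to the GPT-4 dialogue engine, with exchanges persisted in a MySQL data layer. As a rough illustration of the data shapes such a relay might pass around (not the authors' implementation; the payload fields, log schema, and function names are assumptions for illustration), consider this Python sketch:

```python
import json
import time

# Hypothetical relay logic between the Unity WebGL client and an LLM API.
# Field names and the log schema are illustrative assumptions only; the
# paper does not publish its server-side implementation.

def build_chat_request(npc_role: str, player_utterance: str,
                       model: str = "gpt-4") -> dict:
    """Assemble the JSON body a relay server might POST to a chat-completion API."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": npc_role},
            {"role": "user", "content": player_utterance},
        ],
        "temperature": 0.7,  # assumed value balancing variety and consistency
    }

def build_log_row(session_id: str, request: dict, reply: str) -> dict:
    """Flatten one exchange into a row for a MySQL-style dialogue log table."""
    return {
        "session_id": session_id,
        "npc_prompt": request["messages"][0]["content"],
        "player_input": request["messages"][1]["content"],
        "npc_reply": reply,
        "logged_at": int(time.time()),
    }

req = build_chat_request("You are a city resident who believes this news.",
                         "Do you think this claim is reliable?")
row = build_log_row("demo-session", req, "I shared it with my whole family!")
print(json.dumps(row, indent=2))
```

Persisting both the system prompt and the player input per exchange would also support the kind of dialogue-log analysis reported in the qualitative results.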
Figure 7. Research procedure illustrating participant allocation, assessment stages, instructional conditions, and data-collection flow in the quasi-experimental pre–post design.
Table 1. A summary of serious games as educational tools for digital literacy and information literacy education.
| Game Name | Domain/Topic | AI-Driven | Game/Intervention | Research Subjects/Participants | Key Findings |
|---|---|---|---|---|---|
| Cat Park [18] | Misinformation inoculation | No (serious game) | Browser game | Adult online users/general public | Playing the game decreased the perceived dependability of misinformation and the urge to distribute it. |
| Harmony Square [19] | Election misinformation | No (serious game) | Browser game | Adult participants recruited online | Perceived credibility of misinformation decreased by around 16%, and the urge to distribute it by approximately 11%. |
| N/A [20] | Digital literacy | No (gamified course) | Gamification | Undergraduate students | The experimental group performed better academically in digital literacy than students who were not exposed to the game-based setting. |
| The Good Talk! [21] | Conversational skills for misinformation resilience | No | Serious game | Secondary/undergraduate students | Improved determination and techniques for holding productive discussions about misinformation. |
| N/A [22] | Information literacy/CRAAP outcomes | N/A | Instructional intervention | Undergraduate students | CRAAP instruction improved assessment judgments but did not substantially increase source integration. |
Table 2. Subject-matter experts’ contributions and representative examples.
| Subject-Matter Expert | Knowledge Contribution | Representative Case |
|---|---|---|
| Librarian, digital-literacy lecturer | Domain knowledge of information-evaluation frameworks (CRAAP), curricular integration, and evaluation design. | Co-defined learning goals, collected and evaluated fake-news examples, and created scoring rubrics for in-game assessment. |
| Student participants (players, game testers) | Playtesting insights into player needs, usability, and engagement, as well as formative assessment. | Joined focus groups and playtest discussions to offer comments on difficulty balance, prompt clarity, and replayability. |
| Game designer | Translation of learning objectives into Mechanics–Dynamics–Aesthetics (MDA), rapid prototyping, and balancing pedagogy with enjoyment. | Developed MDA design maps, constructed playable prototypes, and improved them based on telemetry and user input. |
Table 3. Examples of Integrated LLM Prompting and In-Game NPC Dialogue Across News.
| News Scenario | Player Input | NPC Role | LLM Prompt | In-Game Output (Dialogue) |
|---|---|---|---|---|
| Health news: Claim that herbal medicine can cure and prevent COVID-19 | Player accepts the fake news as true (normal score) | Community resident in the health zone | You are a city resident who believes this [news]. Simulate casual dialogue reflecting personal experience and social impact. [Negative Level 1]. | “I saw this news about herbal medicine curing COVID. I bought some to keep at home, and it seemed really good, so I told my family about it.” |
| Economic news: Claim that a specific investment guarantees 200 percent high returns with no risk | Player accepts the fake news as true (low score) | Community resident in the economic zone | You are a city resident who believes this [news]. Simulate casual dialogue reflecting personal experience and social impact. [Negative Level 2]. | “This investment sounded really safe and easy. I put all my savings into it and recommended it to several friends.” |
| Belief-based news: Claim that wearing a charm can protect against illness or bad luck | Player accepts the fake news as true (very low score) | Community resident in the belief zone | You are a city resident who believes this [news]. Simulate casual dialogue reflecting personal experience and social impact. [Negative Level 3]. | “I strongly believe this charm can protect me. I even borrowed money to buy it, and the belief spread so widely that there were news reports of people fighting over these charms, including a serious violent incident.” |
| Education news: Claim that enrolling in this course guarantees an A grade | Player accepts the news as not true (normal score) | Community resident in the education zone | You are a city resident who believes this [news]. Simulate casual dialogue reflecting personal experience and social impact. [Positive Level 1]. | “I saw this news about the course, but I don’t really believe it’s true. I think it’s better to focus on my studies and take care of myself properly.” |
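The prompt pattern in Table 3 can be read as a single template parameterized by the player's judgment and score band. A minimal sketch of this role-based prompt assembly follows; the template wording is taken from the table, but the score cut-offs and function names are illustrative assumptions rather than the authors' actual values:

```python
# Sketch of the role-based prompt pattern in Table 3. The template text
# follows the table; the numeric score thresholds are assumed for
# illustration only.

TEMPLATE = ("You are a city resident who believes this [news]. "
            "Simulate casual dialogue reflecting personal experience "
            "and social impact. [{polarity} Level {level}].")

def consequence_level(score: float) -> int:
    """Map a credibility-evaluation score (0-1) to a severity band (assumed cut-offs)."""
    if score >= 0.66:
        return 1  # normal score -> mild social consequence
    if score >= 0.33:
        return 2  # low score -> stronger consequence
    return 3      # very low score -> severe consequence

def build_npc_prompt(accepted_fake_news: bool, score: float) -> str:
    """Combine the player's judgment and score band into the NPC system prompt."""
    polarity = "Negative" if accepted_fake_news else "Positive"
    return TEMPLATE.format(polarity=polarity, level=consequence_level(score))

# A very low score after accepting fake news yields the severest framing.
print(build_npc_prompt(True, 0.1))
```

Keeping the level token inside the system prompt, as in the table, lets the same LLM role produce graded social consequences without a separate scripted dialogue tree per outcome.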
Table 4. The research participants’ demographics.
| | n | Percent |
|---|---|---|
| Gender | | |
| Male | 24 | 40.0% |
| Female | 33 | 55.0% |
| Non-binary | 3 | 5.0% |
| Total | 60 | |
| Age | | |
| 18–20 | 34 | 56.7% |
| 21–23 | 26 | 43.3% |
| Total | 60 | |
Table 5. Results of paired-samples t-tests for traditional-based and game-based learning groups.
| Group | Intervention | Mean (SD) | n | Std. Error Mean | t | Cohen’s d | 95% CI (Cohen’s d) | Sig. (2-Tailed) |
|---|---|---|---|---|---|---|---|---|
| Traditional-based Learning | Pre-survey | 4.76 (2.68) | 30 | 0.49 | 16.081 | 2.936 | [2.09, 3.76] | <0.001 ** |
| | Post-survey | 14.20 (4.90) | 30 | | | | | |
| Game-based Learning | Pre-survey | 4.40 (2.91) | 30 | 0.53 | 11.587 | 2.116 | [1.46, 2.75] | <0.001 ** |
| | Post-survey | 16.50 (4.93) | 30 | | | | | |
Note: ** p < 0.001.
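As a consistency check, the t statistics and effect sizes in Table 5 can be recomputed from the gain-score summaries reported in Table 6 (mean and SD of the pre-post differences, n = 30 per group); the small deviations (e.g., 16.09 vs. 16.081) reflect rounding in the published values. A minimal sketch:

```python
import math

# Paired-samples t and Cohen's d (d_z) from summary statistics of the
# pre-post difference scores; inputs are the values reported in Table 6.
def paired_t_from_summary(mean_diff: float, sd_diff: float, n: int):
    """Return (t, Cohen's d) for a paired-samples t-test from summary stats."""
    t = mean_diff / (sd_diff / math.sqrt(n))
    d = mean_diff / sd_diff  # standardized mean difference of paired scores
    return t, d

t_trad, d_trad = paired_t_from_summary(9.43, 3.21, 30)   # reported: 16.081, 2.936
t_game, d_game = paired_t_from_summary(12.10, 5.71, 30)  # reported: 11.587, 2.116
print(round(t_trad, 2), round(d_trad, 3))
print(round(t_game, 2), round(d_game, 3))
```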
Table 6. Independent-samples t-test comparing gain scores between the Traditional-based Learning and Game-based Learning groups.
| Group | Mean Difference (SD) | n | Std. Error Mean | t | Cohen’s d | 95% CI (Cohen’s d) | Sig. (2-Tailed) |
|---|---|---|---|---|---|---|---|
| Traditional-based Learning | 9.43 (3.21) | 30 | 0.58 | 2.226 | 0.575 | [0.056, 1.089] | 0.03 * |
| Game-based Learning | 12.10 (5.71) | 30 | | | | | |
Note: * p < 0.05.
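The between-group comparison in Table 6 can likewise be reproduced from the reported gain-score means and SDs with a pooled-variance independent-samples t-test; minor differences from the published 2.226 and 0.575 again stem from rounding:

```python
import math

# Pooled-variance independent-samples t and Cohen's d from the gain-score
# summaries in Table 6 (n = 30 per group).
def independent_t_from_summary(m1, s1, n1, m2, s2, n2):
    """Return (t, Cohen's d) comparing two group means from summary stats."""
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    t = (m2 - m1) / (sp * math.sqrt(1 / n1 + 1 / n2))
    d = (m2 - m1) / sp
    return t, d

t, d = independent_t_from_summary(9.43, 3.21, 30, 12.10, 5.71, 30)
print(round(t, 2), round(d, 3))  # Table 6 reports 2.226 and 0.575
```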
Table 7. Paired-samples t-test results for IMI subscales comparing pre- and post-test intrinsic motivation between the experimental (game-based) and control (traditional instruction) groups.
| Group | Dimension | Pre-Test (SD) | Post-Test (SD) | t | Cohen’s d | 95% CI (Cohen’s d) | Sig. (2-Tailed) |
|---|---|---|---|---|---|---|---|
| Experimental Group (n = 30) | Competency | 3.12 (0.61) | 4.08 (0.55) | 7.214 | 1.317 | [0.78, 1.84] | <0.001 ** |
| | Interest | 3.25 (0.72) | 4.35 (0.49) | 8.103 | 1.480 | [0.94, 2.02] | <0.001 ** |
| | Effort | 3.41 (0.67) | 4.22 (0.52) | 6.548 | 1.195 | [0.66, 1.72] | <0.001 ** |
| Control Group (n = 30) | Competency | 3.18 (0.58) | 3.46 (0.60) | 2.201 | 0.401 | [0.05, 0.73] | 0.035 * |
| | Interest | 3.29 (0.69) | 3.52 (0.66) | 1.884 | 0.344 | [−0.03, 0.67] | 0.070 |
| | Effort | 3.38 (0.64) | 3.59 (0.62) | 1.767 | 0.322 | [−0.06, 0.64] | 0.087 |
Note: * p < 0.05, ** p < 0.001.
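The effect sizes in Table 7 appear consistent with the paired-design identity d = t/√n (Cohen's d computed on difference scores, n = 30 per group), so each reported d can be recovered from its t value:

```python
import math

# Table 7's Cohen's d values satisfy d = t / sqrt(n) for the paired
# design with n = 30 participants per group.
def d_from_t(t: float, n: int) -> float:
    """Paired-design Cohen's d recovered from the t statistic."""
    return t / math.sqrt(n)

# (t, reported d) pairs for all six IMI subscale tests in Table 7.
reported = [(7.214, 1.317), (8.103, 1.480), (6.548, 1.195),
            (2.201, 0.401), (1.884, 0.344), (1.767, 0.322)]
for t, d in reported:
    assert abs(d_from_t(t, 30) - d) < 0.005
print("all Table 7 effect sizes consistent with d = t/\u221an")
```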

Share and Cite

MDPI and ACS Style

Chernbumroong, S.; Intawong, K.; Asawimalkit, U.; Puritat, K.; Julrode, P. Design and Evaluation of a Generative AI-Enhanced Serious Game for Digital Literacy: An AI-Driven NPC Approach. Informatics 2026, 13, 16. https://doi.org/10.3390/informatics13010016

