Adaptive Architectures for Gamified Learning in Software Engineering: A Systematic Review

Quartulli, Aurora Annamaria; Mignogna, Giovanni; Zizzo, Vera; Mongiello, Marina

doi:10.3390/computers15040235



Department of Electrical and Information Engineering (DEI), Polytechnic University of Bari, 70125 Bari, Italy

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

Computers 2026, 15(4), 235; https://doi.org/10.3390/computers15040235

Submission received: 7 March 2026 / Revised: 1 April 2026 / Accepted: 2 April 2026 / Published: 9 April 2026

(This article belongs to the Special Issue Advances in Game-Based Learning, Gamification in Education and Serious Games)


Abstract

Effective software engineering education today requires tools that adapt to individual learner proficiency and progress, while ensuring positive student engagement. Gamified platforms represent an effective approach to learning and maintaining motivation, but their efficacy depends on a robust underlying architecture. This systematic literature review analyzes state-of-the-art artificial intelligence (AI)-based adaptive architectures designed to support gamified learning tools, highlighting their architectural models (such as intelligent tutoring systems, multi-agent systems, and immersive virtual reality/augmented reality environments), adaptation mechanisms (including Generative AI and chatbots), and personalization strategies. A significant focus is placed on Process Mining and Learning Analytics as methodological approaches to organize learning paths and guide dynamic adaptation based on student behavior. The results of the selected studies demonstrate advantages such as increased engagement, longer-term participation, and personalized learning pace. However, challenges remain, such as common assessment criteria, integrating different technologies, and system scalability. The findings offer concrete insights for designing the next generation of effective gamified learning tools, based on data and software engineering processes.

Keywords:

gamification; software engineering education; adaptive learning; intelligent tutoring systems; software architecture; process mining; learning analytics; personalization; artificial intelligence in education

1. Introduction

Teaching software engineering extends beyond simple syntax knowledge; it requires a solid theoretical foundation and extensive practice. For many students, particularly at the beginning, abstract concepts can be difficult to apply. Traditional methods are often considered too rigid and insufficiently engaging. Consequently, due to frustration and lack of confidence, many students end up losing interest or dropping out, without developing a truly deep understanding of the subject matter [1]. For this reason, research is increasingly focusing on gamification and adaptive learning methods. This approach involves introducing game-like elements into education, such as points, leaderboards, or levels, to enhance engagement and transform extrinsic motivation into genuine interest. Beyond engagement, the main advantage of these tools lies in their ability to adapt to each individual student. Modern systems no longer offer the same path to everyone but adapt to each student’s pace, preferences, and style, dynamically adjusting activities and content. It is this personalization that makes the experience truly effective and addresses the uniform approach of the past [2]. In recent years, educational methodologies have changed profoundly due to the convergence of Generative Artificial Intelligence (GenAI) and immersive technologies. Large Language Models (LLMs), such as ChatGPT, are no longer just tools: they are becoming digital tutors, capable of providing personalized explanations, suggesting next steps, and even creating tailor-made exercises. At the same time, Mixed Reality (MR) and Virtual Reality (VR) are transforming learning into a tangible and engaging experience. Rather than merely observing abstract concepts on a two-dimensional interface, students can immerse themselves in the content, manipulating code structures or software architecture elements as if they were physical objects. 
This bridges the gap between theory and practice and makes what once seemed distant and hard to visualize more intuitive [3]. One of the most important aspects of these new architectures is their ability to derive significant value from learner interactions. Every command typed, every dialogue with a chatbot, every movement in a VR environment becomes a valuable indicator of their cognitive processes, where they encounter difficulties, what confuses them, and what facilitates their progress. Thanks to advanced techniques in Educational Data Mining and Process Mining, this raw data is transformed into a comprehensive representation of the student’s learning journey. This allows teachers and the systems themselves to recognize patterns, understand when someone is about to lose motivation or is at risk of dropping out, and intervene at the right moment. Essentially, the system analyzes learner behavior and adapts the environment in real time, closing the loop between action, analysis, and support [4]. Despite extensive research on gamification, artificial intelligence, and data analytics, these topics are often studied in isolation rather than as components of a unified ecosystem. A comprehensive framework and a clear understanding of how all these elements can function synergistically remain absent. In other words, there is limited empirical consensus on how game engines, adaptive algorithms, and process analysis tools should interact to create a single coherent experience. To address this gap, this systematic review analyzes 59 studies to outline a reference architecture for building a truly integrated gamified and adaptive ecosystem for teaching Software Engineering [5]. Specifically, this review answers the following six Research Questions (RQs):

RQ1 (Architectures): What software architectures and platforms are used to implement adaptive gamification in Software Engineering Education (SEE)?
RQ2 (Process Mining): How can process mining be integrated into gamified platforms to monitor learning processes?
RQ3 (Techniques): What adaptive mechanisms (e.g., AI, Rule-based) are employed to personalize the experience?
RQ4 (Impact): What is the documented impact of these systems on academic performance and engagement?
RQ5 (Optimization): How can platforms combine gamification and data analytics to optimize learning?
RQ6 (Challenges): What are the main technical and pedagogical challenges in integrating these technologies?

2. Research Methodology

This Systematic Literature Review was conducted following the guidelines proposed by Kitchenham and Charters for software engineering research and is reported in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement. The primary goal was to identify, classify, and analyze state-of-the-art adaptive architectures and gamification strategies in SEE.

2.1. Search Strategy

To ensure comprehensive coverage of the topic, an automated search was performed across three major digital libraries: Scopus, ACM Digital Library, and SpringerLink. Scopus was selected as the primary metadata aggregator because of its capacity to index high-impact technical literature from publishers such as IEEE and Web of Science. The effectiveness of this approach is demonstrated by the retrieval of relevant studies published by IEEE (e.g., Shum et al. [6]) through the Scopus engine, confirming that key engineering-centric records were successfully captured. Furthermore, the inclusion of ACM Digital Library and SpringerLink was essential to cover pedagogical frameworks and computer science education research, which constitute the primary publication venues for the SEE domain.

The search string was constructed by combining keywords related to the three main pillars of the study: (1) Domain (Software Engineering Education), (2) Pedagogical Strategy (Gamification), and (3) Technical Architecture (Adaptive Systems/Process Mining). The final search string used was: (“Software Engineering Education” OR “teaching software engineering” OR “learning software engineering” OR “programming education”) AND (“Gamification” OR “Game-based learning” OR “Serious games”) AND (“Adaptive” OR “Personalization” OR “Intelligent Tutoring Systems” OR “Process Mining” OR “Learning Analytics” OR “Architecture”). The breakdown of raw search results per database is provided in Table 1.

2.2. Inclusion and Exclusion Criteria

Studies were selected based on specific eligibility criteria to ensure relevance to the RQs. Inclusion Criteria (IC):

IC1: Papers describing software architectures, tools, or platforms for SEE.
IC2: Studies integrating gamification elements (e.g., points, leaderboards, serious games).
IC3: Studies proposing or evaluating adaptive mechanisms, AI integration, or data-driven personalization (including Process Mining).
IC4: Peer-reviewed articles published in journals or conference proceedings in English.

Exclusion Criteria (EC):

EC1: Studies focusing solely on K-12 (primary/secondary) education without relevance to higher SEE concepts.
EC2: Papers discussing gamification without any software implementation or architectural details.
EC3: Short papers, posters, editorials, and non-peer-reviewed material.

2.3. Study Selection Process

The selection process followed a structured funnel approach (see Figure 1). The initial automated search, conducted on 12 December 2025, yielded a total of 474 records. The distribution of these raw hits across the selected digital libraries is detailed in Table 1.

Prior to the screening phase, 304 records were removed by automation tools based on initial filters, including publication year (2020–2025), language (English), and document type (e.g., excluding posters and short papers). The choice of this time frame was primarily motivated by the significant surge and rapid evolution of Generative Artificial Intelligence and immersive technologies within this period, which have fundamentally redefined the landscape of adaptive architectures in education (see Figure 2). The remaining 170 records underwent title and abstract screening. During this stage, 3 records were manually identified as duplicates that had bypassed the initial automated filters and were subsequently removed. Consequently, 167 reports were sought for retrieval. Of these, 1 report could not be accessed, leaving 166 reports to be assessed for eligibility through a full-text review. In this final phase, 107 reports were excluded as they were found to be not directly relevant to the Research Questions. This systematic process resulted in a final selection of 59 primary studies for analysis.
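The counts in the selection funnel above can be checked with simple arithmetic. The following is a minimal sketch that only restates the figures reported in this section; the variable names are illustrative and not part of the review protocol.

```python
# Counts reported in Section 2.3 (variable names are illustrative).
identified = 474          # records returned by the automated search
removed_by_filters = 304  # removed by automated filters before screening
duplicates = 3            # duplicates identified manually during screening
not_retrieved = 1         # report that could not be accessed
excluded_full_text = 107  # excluded after the full-text review

# Each stage is the previous stage minus the records dropped at that step.
screened = identified - removed_by_filters
sought = screened - duplicates
assessed = sought - not_retrieved
included = assessed - excluded_full_text

funnel = {
    "identified": identified,
    "screened (title/abstract)": screened,
    "sought for retrieval": sought,
    "assessed for eligibility": assessed,
    "included": included,
}

for stage, count in funnel.items():
    print(f"{stage}: {count}")
```

The final value matches the 59 primary studies analyzed in the rest of the review.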

2.4. Data Extraction and Analysis

A structured data extraction form was developed to answer the six defined RQs. For each selected study, the following data were extracted:

General Information: Title, Year, Venue, Authors.
Architecture (RQ1): Platform type (e.g., Web, VR, Plugin), architectural patterns, and integration with Learning Management Systems (LMSs).
Process Mining & Optimization (RQ2, RQ5): Usage of educational data mining, process mining techniques, and strategies for learning optimization.
Adaptive Techniques (RQ3): Adaptation logic (Rule-based vs. AI/ML), personalization algorithms, and feedback mechanisms.
Evaluation (RQ4): Methodology (Case study, Experiment), sample size, and reported impact on student engagement and performance.
Challenges (RQ6): Reported limitations, technical barriers, and open issues in implementation.

The extracted data were synthesized qualitatively to categorize architectural patterns and challenges, and quantitatively to report the distribution of technologies and outcomes.

2.5. Quality Assessment

A formal Quality Assessment scoring matrix, typically employed to evaluate empirical rigor through metrics such as the presence of randomized control groups, statistical power, longitudinal sample sizes, and risk of bias, was not applied in this study. Instead, a foundational baseline of scientific quality was established by strictly adhering to our inclusion criteria. Specifically, the review exclusively incorporates peer-reviewed articles published in recognized scientific journals and international conference proceedings.

The rationale for omitting a structured Quality Assessment step lies in the primary epistemological objective of this systematic review. Rather than aiming to perform a statistical meta-analysis to quantify the absolute educational efficacy of gamified systems, this research is inherently exploratory and architectural in nature. Our goal is to map the landscape of state-of-the-art software architectures, identify emerging adaptation mechanisms such as Generative AI integration, and delineate implementation strategies within Software Engineering Education.

Consequently, the methodological approach adopted for analyzing educational impact is strictly descriptive. It is crucial to note that the findings related to student performance, behavioral engagement, and overall educational outcomes inherently reflect the self-reported results and interpretations documented by the original authors. Because the selected primary studies were not weighted according to a standardized hierarchy of evidence, these impact outcomes should be interpreted as a mapping of reported technological affordances rather than a definitively validated measure of pedagogical effectiveness across the domain.

3. Answering the Research Questions

3.1. RQ1: Architectural Patterns and Platforms

The analysis of the 59 selected primary studies reveals that adaptive gamification in SEE is not implemented through a single standard architecture. Instead, three distinct architectural patterns have emerged: (1) Immersive Extended Reality (XR) Environments, (2) GenAI-Driven Multi-Agent Systems (MAS), and (3) Web-based Intelligent Tutoring Systems (ITSs). The full mapping of all 59 primary studies into these categories is provided in Appendix A (see Table A1).

3.1.1. Immersive Environments (VR/MR)

A significant cluster of studies focuses on VR and MR to teach abstract concepts like Unified Modeling Language or algorithms. These architectures typically rely on game engines (e.g., Unity3D) to visualize code structures in a spatial environment, requiring specific hardware like Head-Mounted Displays.

Constructivist Frameworks (TeachVR): Wee et al. [7] implement the TeachVR framework through a constructivist approach, prioritizing learner autonomy. The system transforms the virtual environment into an adaptive space where students manipulate 3D logic blocks. This architectural choice facilitates knowledge construction via direct interaction, aligning the learning process with student preferences for contextualized activities. A notable drawback, however, is that the framework’s evaluation primarily relies on subjective student preferences; furthermore, the authors acknowledge that the high cost of VR hardware and the potential “novelty effect” remain significant barriers to verifying long-term pedagogical effectiveness in standard classrooms.
Spatial Programming (MR-LEAP): The MR-LEAP system [8] aims to reduce the abstraction of programming by projecting code structures into the physical environment. By representing instructions as tangible 3D objects, the architecture bridges the gap between conceptual logic and spatial perception, potentially lowering the cognitive barrier for novices. While MR-LEAP offers significant portability, its reliance on high-end devices like Microsoft HoloLens limits its accessibility. Additionally, while the visual editor simplifies initial learning, it remains to be determined whether the spatial representation scales effectively to complex, large-scale software architectures without inducing extraneous cognitive load.
Narrative-Driven VR (AdLer, MoonBase): Holder et al. [9] demonstrate how integrating coherent storytelling within 3D environments supports autonomous learning. In these systems, the narrative structure acts as a functional context that guides the student through programming challenges, facilitating the transition from passive reception to active exploration of abstract concepts. Nonetheless, the study’s findings are limited by a relatively small sample size (n = 40), and the researchers note that the immersive nature of the game may lead to “gamification distraction”, where the entertainment value of the narrative potentially overshadows the focus on underlying technical competencies.

3.1.2. GenAI-Powered Agents and Chatbots

The most recent trend (2023–2025) involves architectures integrating LLMs and GenAI. These systems move beyond static rules, employing multi-agent architectures to simulate real-world software development teams.

Virtual Team Simulation (DevCoach): Wang et al. [10] utilize GenAI to simulate a multi-agent software development team (e.g., Product Manager, Tester, Senior Developer). This architecture enables students to engage with the social and collaborative dynamics of the software lifecycle, framing learning as a dialogic process consistent with the Community of Inquiry framework. Conversely, the empirical validation of DevCoach was limited to a small sample size (n = 20) and a short-term study. Furthermore, the authors note that the non-deterministic nature of LLMs can lead to “hallucinations” or inconsistent feedback, which might confuse novice students if not carefully monitored by an instructor.
Motivational Scenario Design (GhostCoder): Shum et al. [6] integrate Keller’s ARCS model (Attention, Relevance, Confidence, Satisfaction) directly into the game engine. The system dynamically adapts both narrative elements and programming difficulty based on the learner’s detected motivational state, providing a calibrated learning curve. However, the researchers concede that the system’s effectiveness was tested on a small cohort within a specific vocational IT diploma program focusing on foundational programming concepts, limiting the generalizability of the findings. Additionally, there is a risk that students may become overly reliant on AI-generated hints, potentially undermining the development of independent problem-solving skills.
Evolution of Conversational Tutors: The scoping review by Barzanji and Loitsch [11] identifies a transition from command-based chatbots to LLM-driven digital tutors. These contemporary agents exhibit contextual understanding and provide targeted code explanations, shifting the pedagogical focus toward personalized, guided instruction. Nevertheless, the review highlights critical gaps in current research, such as the lack of longitudinal studies to assess long-term knowledge retention. The study also warns that fragmented pedagogical foundations and the risk of generating incorrect or biased code explanations remain significant technical and ethical challenges for the widespread adoption of AI tutors.

3.1.3. Web-Based Intelligent Tutoring Systems

The majority of the analyzed platforms utilize scalable web-based architectures (Client-Server) that integrate gamification elements (points, badges, leaderboards) with backend logic for tracking student progress.

Many of the systems examined rely on a classic web architecture, the client-server model, which allows students to be easily reached and content to be updated without complications. In these platforms, gamification elements are not just decorative: points, badges, and leaderboards are managed directly by the backend, which tracks students’ progress and adjusts activities based on their performance. The result is a more engaging learning environment, capable not only of motivating but also of effectively monitoring each individual’s learning journey [12]. Despite these benefits, the study highlights critical hurdles in current LMS implementations, such as poor interoperability with specialized Software Engineering tools and the significant effort required by lecturers to individualize content for diverse student groups. Furthermore, the authors acknowledge a geographical bias in their findings, as the participants were primarily from Western, industrialized countries, which may limit the generalizability of the results to other global contexts.
In the field of vertical skills, QueryCompetition represents a very concrete example of how gamification can specialize in a technical domain. Morales-Trujillo and García-Mireles [13] show, through a quasi-experimental study focused on SQL, that a simple competitive web architecture consisting of points, challenges, and leaderboards does more than just make the activity more enjoyable: it leads to a real and measurable improvement in academic performance. Students who used the gamified version achieved significantly better results compared to those who worked without game elements, demonstrating that competition can become a powerful driver of learning. Yet, the authors caution that the competitive nature of the system can trigger increased anxiety and stress among students who are less comfortable with public recognition or time pressure. Additionally, the study identifies a lack of independent, invigilated assessment items as a threat to the objective validation of learning gains.
In the field of software quality assurance, Sojourner under Sabotage offers a completely new way of learning [14]. Instead of presenting abstract or fragmented exercises, the game immerses students in a sci-fi story: players act as members of a spaceship crew whose task is to discover and repair sabotaged components using testing and debugging. The fact that everything happens directly in the browser makes the experience immediate and accessible, while the narrative transforms activities often perceived as monotonous into a high-stakes mission, with clear objectives and a sense of urgency that keeps motivation high. However, the researchers note that the highly tailored game context may not fully reflect real-world professional testing scenarios. Moreover, while over 80% of participants enjoyed the experience, less experienced students occasionally felt overwhelmed by the debugging tasks, and the study’s reliance on proxies like test coverage may not fully capture the depth of students’ cognitive effort or long-term knowledge retention.

The classification of the 59 identified records into the three architectural categories is summarized in Figure 3.

3.2. RQ2: Process Mining and Learning Analytics Integration

The second research question investigates how gamified platforms utilize data-driven techniques to monitor and optimize learning processes. The analysis of the primary studies indicates an evolving perspective in assessment strategies within SEE. Traditional platforms have predominantly relied on “result-oriented” evaluation (n = 25, see Figure 4), in which the system verifies the correctness of the final code output (e.g., via unit tests). This approach may not fully capture the cognitive processes and problem-solving strategies employed by students. To address this, an emerging subset of the analyzed architectures explores the integration of Educational Process Mining (EPM) and advanced Learning Analytics to perform “process-oriented” analysis.

A specific cluster of studies (n = 8) focuses on reconstructing the student’s “learning trajectory” by analyzing granular interaction logs. In contrast to standard LMSs that track high-level metrics like login frequency, these adaptive platforms capture micro-events such as compilation errors, time spent debugging specific functions, and navigation patterns within the IDE. In the field of cybersecurity training, research such as [4] demonstrates that behavioral analysis of interaction logs (e.g., distinguishing between trial-and-error and systematic strategies) enables the identification of divergent problem-solving patterns. Visualizing these “learning trajectories” as graphs facilitates the identification of learners at risk of frustration or suboptimal strategy adoption. However, as noted in the survey by [4], a significant challenge remains the lack of interoperability between different training platforms, which often results in fragmented data that complicates the reconstruction of a holistic learning journey. Similarly, platforms inspired by the KINAITICS architecture [15] utilize digital traces to compare student actions against ideal solution models, triggering automated interventions when significant deviations are detected. Still, it is acknowledged that creating these “ideal solution models” for complex cybersecurity tasks is a labor-intensive process that may not scale easily. Furthermore, since the evaluation was based on short-term hackathon events, the effectiveness of such interventions in sustaining long-term knowledge retention remains to be fully explored.
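To make the idea of reconstructing a “learning trajectory” from interaction logs more concrete, the following is a minimal sketch of a directly-follows graph, one of the basic structures used in process mining. The log format, event names, and the `error_loop_count` heuristic are hypothetical and are not taken from any of the reviewed platforms.

```python
from collections import Counter, defaultdict

# Hypothetical interaction log: (student_id, timestamp, event) tuples.
# Event names are illustrative only.
EVENT_LOG = [
    ("s1", 1, "edit"), ("s1", 2, "compile_error"), ("s1", 3, "edit"),
    ("s1", 4, "compile_error"), ("s1", 5, "edit"), ("s1", 6, "compile_ok"),
    ("s2", 1, "read_task"), ("s2", 2, "edit"), ("s2", 3, "compile_ok"),
    ("s2", 4, "run_tests"),
]


def build_traces(log):
    """Group events per student, ordered by timestamp."""
    traces = defaultdict(list)
    for student, _timestamp, event in sorted(log, key=lambda r: (r[0], r[1])):
        traces[student].append(event)
    return dict(traces)


def directly_follows(traces):
    """Count how often event a is immediately followed by event b."""
    graph = Counter()
    for events in traces.values():
        for a, b in zip(events, events[1:]):
            graph[(a, b)] += 1
    return graph


def error_loop_count(events):
    """Count edit -> compile_error transitions, a crude trial-and-error signal."""
    pairs = zip(events, events[1:])
    return sum(1 for pair in pairs if pair == ("edit", "compile_error"))


traces = build_traces(EVENT_LOG)
graph = directly_follows(traces)
loops = {student: error_loop_count(events) for student, events in traces.items()}

for (a, b), count in sorted(graph.items()):
    print(f"{a} -> {b}: {count}")
print(loops)
```

In this toy log, student `s1` alternates between editing and failed compilation, while `s2` follows a more systematic path; a real platform would likely rely on a dedicated process mining library and far richer event data.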

3.2.1. Predictive Modeling and Early Warning Systems

Beyond visualization, datasets derived from gamified activities are frequently utilized for predictive modeling (n = 12). The literature indicates that clustering and classification algorithms constitute prevalent methodologies for forecasting student outcomes and identifying attrition risks prior to course completion. Lokkila et al. [16] proposed a framework for analyzing pedagogical approaches to programming by recording each compilation attempt as a discrete “snapshot” of the source code. Analyzing the longitudinal evolution of these snapshots, specifically the frequency of errors and the temporal intervals between attempts, yields a behavioral profile for each learner.

Subsequently, K-Means clustering (n = 10) is applied to categorize students based on these distinct operational patterns. This methodology allows for the differentiation between students exhibiting high self-efficacy and those experiencing cognitive difficulties. For instance, the model identifies students trapped in repetitive error loops, a pattern that often serves as a proxy for conceptual confusion. The detection of such patterns enables the implementation of targeted pedagogical interventions before student disengagement or formal withdrawal occurs. Limitations arise, however, regarding the data source; as noted by Lokkila et al. [16], relying solely on code snapshots provides a behavioral proxy that may miss the student’s underlying cognitive intent, and the model’s accuracy remains sensitive to the specific pedagogical context and language used.
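The clustering step described above can be illustrated with a small, self-contained sketch. It is not the implementation used by Lokkila et al. [16]: the two features (error rate per attempt and mean seconds between attempts), the sample values, and the deterministic initialization are assumptions chosen for readability, and a production system would more plausibly use an established library.

```python
import math

# Hypothetical behavioral profiles: (error rate per attempt, mean seconds between attempts).
PROFILES = {
    "s1": (0.10, 120), "s2": (0.80, 15), "s3": (0.15, 100),
    "s4": (0.75, 20), "s5": (0.20, 110), "s6": (0.90, 10),
}


def min_max_scale(rows):
    """Rescale every feature to [0, 1] so that no single unit dominates the distance."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def k_means(points, k, iterations=20):
    """Plain K-Means; the first k points seed the centroids (assumption, for reproducibility)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids


scaled = min_max_scale(list(PROFILES.values()))
labels, centroids = k_means(scaled, k=2)

for student, label in zip(PROFILES, labels):
    print(student, "cluster", label)
```

With these toy values, students with frequent errors and very short intervals between attempts end up in one cluster, which is the kind of group the text associates with repetitive error loops.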

3.2.2. Dynamic Difficulty Adjustment via Psychometrics

The integration of analytical tools facilitates more precise difficulty adjustment than traditional level-based systems. A specific implementation within this category (n = 4) involves the adaptation of the Elo-rating model, originally developed for chess ranking, to the context of programming education. In the system proposed by Vesin et al. [17], both learners and exercises are assigned dynamic scores that evolve over time. Upon successful completion of a task, the learner’s rating increases, while the exercise’s difficulty parameters are simultaneously updated based on the aggregate performance of the student population. This continuous calibration ensures that learning activities remain within the most suitable difficulty range for each learner. Consequently, students demonstrating rapid progress are presented with increased complexity, whereas those experiencing difficulties receive reinforcement tasks to maintain engagement. Through this approach, adaptability shifts from linear level-unlocking to a personalized learning path that evolves according to student proficiency. Nevertheless, the researchers acknowledge that the modified Elo-rating system requires a significant volume of student task interactions to overcome the “cold start” problem and stabilize difficulty estimates, a factor that may limit its immediate applicability in smaller or shorter-term courses.
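As an illustration of the rating mechanism, the sketch below uses the classic chess Elo formula with its conventional constants (a scale of 400 and K = 32). This is an assumption: the modified variant of Vesin et al. [17] is not specified in this review, and the symmetric per-attempt update of the exercise rating is a simplification of the population-level calibration described above. The exercise names and the target success probability of 0.7 are likewise hypothetical.

```python
def expected_success(learner_rating, exercise_rating):
    """Classic Elo expectation: probability that the learner solves the exercise."""
    return 1.0 / (1.0 + 10 ** ((exercise_rating - learner_rating) / 400.0))


def update_ratings(learner_rating, exercise_rating, solved, k=32.0):
    """Return updated (learner, exercise) ratings after one attempt."""
    expected = expected_success(learner_rating, exercise_rating)
    outcome = 1.0 if solved else 0.0
    delta = k * (outcome - expected)
    # The learner gains what the exercise loses, and vice versa.
    return learner_rating + delta, exercise_rating - delta


def pick_exercise(learner_rating, exercises, target=0.7):
    """Pick the exercise whose expected success is closest to the target."""
    return min(
        exercises,
        key=lambda name: abs(expected_success(learner_rating, exercises[name]) - target),
    )


# Hypothetical exercise pool with current difficulty ratings.
exercises = {"loops": 1300.0, "recursion": 1500.0, "pointers": 1700.0}

new_learner, new_exercise = update_ratings(1500.0, exercises["recursion"], solved=True)
choice = pick_exercise(1500.0, exercises)
print(new_learner, new_exercise, choice)
```

Selecting tasks near a fixed expected success rate is one simple way to keep activities within a suitable difficulty range; the “cold start” issue mentioned above appears here as well, since all ratings are arbitrary until enough attempts have been recorded.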

3.3. RQ3: Adaptive Mechanisms and AI Techniques

The third research question investigates the specific algorithms and mechanisms employed to implement adaptability. The systematic review highlights a clear dichotomy in the architectural landscape: the coexistence of traditional Rule-Based Systems and the rapid emergence of GenAI driven by LLMs. This transition is quantitatively illustrated in Figure 5, which shows the temporal evolution and the significant surge of GenAI-based studies starting from 2023.

3.3.1. Rule-Based vs. AI-Driven Adaptation

Historically, the majority of educational platforms utilized deterministic, rule-based architectures. These systems operated through predefined logic, typically employing conditional “if-then” statements to manage student progress (e.g., triggering a module repetition if performance metrics fell below a 50% threshold). Recent advancements in this domain have refined these deterministic logic models; for instance, Sanal Kumar and Thandeeswaran [18] demonstrated that rule-based systems can effectively personalize video-based e-learning by integrating parameters such as complexity levels and assessment variance to align with the learner’s receptive pace. While effective for standardized instructional paths, such linear methods lacked the capacity to diagnose the underlying cognitive or conceptual causes of student errors. As evidenced by recent literature, the integration of next-generation language models has fundamentally altered this paradigm. Current systems have evolved from basic syntax validation to the interpretation of student intent. According to Hong [19], GenAI-based tools demonstrate the ability to process natural language queries, generate contextually appropriate code, and provide step-by-step explanations. Furthermore, the evolution of adaptive pedagogy now leverages sophisticated models for tracking student knowledge to dynamically calibrate task difficulty. For instance, Willert and Eriksson [20] propose a multidimensional approach that combines cognitive load theory with feature-oriented software engineering to generate personalized programming exercises. By identifying core and optional task features, these systems can modularize the learning experience, ensuring that the complexity of the challenge remains aligned with the learner’s evolving computational thinking skills. 
This shift from mechanical correction to semantic understanding facilitates a highly personalized and flexible programming pedagogy that more closely aligns with human-centric instruction.
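For contrast with the AI-driven approaches, the deterministic “if-then” logic described at the start of this subsection can be sketched in a few lines. The 50% threshold comes from the example in the text; the mastery threshold, the action labels, and the `ModuleResult` type are hypothetical.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.50     # the 50% threshold used as an example in the text
MASTERY_THRESHOLD = 0.85  # hypothetical cut-off for unlocking extra content


@dataclass
class ModuleResult:
    module: str
    score: float  # fraction of available points, in [0, 1]


def next_action(result):
    """Deterministic rule set: the same score always yields the same action."""
    if not 0.0 <= result.score <= 1.0:
        raise ValueError("score must be a fraction between 0 and 1")
    if result.score < PASS_THRESHOLD:
        return f"repeat:{result.module}"
    if result.score >= MASTERY_THRESHOLD:
        return "unlock:bonus_challenge"
    return "advance:next_module"


for result in (ModuleResult("sql_joins", 0.40),
               ModuleResult("sql_joins", 0.70),
               ModuleResult("sql_joins", 0.90)):
    print(result.score, "->", next_action(result))
```

The rules are transparent and cheap to evaluate, but they react only to the score itself and carry no information about why a learner fell below the threshold.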

However, a qualitative synthesis of these core methodologies reveals significant trade-offs in terms of architectural complexity and operational sustainability. Table 2 provides a technical comparison based on the computational requirements and robustness observed in the analyzed frameworks.

The analysis of these architectures highlights that while GenAI offers superior semantic scaffolding, its robustness remains a concern due to the non-deterministic nature of large models, which may require additional validation layers compared to the inherent stability of rule-based systems [18]. Furthermore, as noted by Hong [19], the computational cost of LLM inference introduces a scalability bottleneck that is absent in lightweight, feature-based generators [20].

It is critical to note that the existing corpus lacks primary studies providing a direct empirical comparison (e.g., A/B testing) of these approaches on the same population; therefore, current evidence remains limited to isolated implementation reports rather than comparative effectiveness. This limitation underscores the need for future research to evaluate the relative cost-benefit trade-offs between AI-driven and deterministic adaptation in large-scale SEE deployments.

3.3.2. Multi-Agent Systems and Intelligent Tutors

A prominent trend identified in the reviewed studies is the transition from monolithic intelligent tutors to Multi-Agent Systems (MAS). These systems aim to simulate professional software development environments, where the learner interacts with multiple specialized agents rather than a single interface. The DevCoach platform [10] exemplifies this architectural shift. By coordinating several LLM-based agents, each assigned a distinct role within the software development lifecycle (such as requirement elicitation, bug reporting, or procedural guidance), the system situates the learner within an authentic Agile team context. Consequently, the learner’s role shifts from passive consumption to active participation, requiring continuous discussion, clarification, and negotiation of technical solutions. This pedagogical approach replaces traditional “lecture-style” instruction with collaborative, project-based learning. However, as noted by Wang et al. [21], the actual effectiveness of these Intelligent Tutoring Systems (ITSs) in natural educational contexts remains a complex landscape, often showing mixed results depending on the experimental rigor and the specific non-cognitive factors analyzed. Furthermore, educational chatbots have undergone significant qualitative improvements [11]. Modern conversational agents no longer function as purely reactive tools; instead, they act as proactive tutors capable of identifying learning plateaus and providing scaffolding without offering direct solutions. By employing Socratic questioning (e.g., querying the rationale behind specific loop structures or predicting execution outcomes), these agents stimulate metacognitive reflection and foster a deeper educational dialogue.

3.3.3. Narrative and Affective Adaptation

Recent advancements in adaptive systems extend beyond cognitive difficulty adjustments to encompass the learner’s affective state. The “Motivational Scenario-Based Design” engine within GhostCoder, for instance, monitors student progress and dynamically modifies the narrative experience to mitigate frustration. When a task exceeds the learner’s current capabilities, the system provides integrated support through narrative interventions, such as character-led hints or recontextualized objectives, thereby maintaining engagement during challenging phases. In this framework, adaptation accounts for both the learner’s knowledge base and their emotional response. The environment is designed to convert potential frustration into a constructive learning opportunity [6]. Similarly, Marougkas et al. [22] propose the use of Fuzzy Cognitive Maps to model learner affect within Virtual Reality (VR) environments. By analyzing interaction patterns, including error frequency, exploration duration, and task latency, the system can infer states of boredom or cognitive overload. Upon detection of these states, the platform dynamically adjusts the pace, challenge complexity, and visual feedback to maintain the learner within the “Flow” zone, ensuring that the technology addresses both pedagogical content and the learner’s emotional well-being.
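The inference step of a Fuzzy Cognitive Map can be illustrated with a minimal sketch. It assumes a commonly used update rule, in which each concept's activation is squashed by a sigmoid after adding the weighted influence of the other concepts. The concept names follow the interaction patterns listed above, but the causal weights, the initial activations, and the function names are hypothetical and are not taken from Marougkas et al. [22].

```python
import math
from typing import Dict

# Observed interaction patterns, each normalized to [0, 1] (inputs are held fixed).
INPUTS = ("error_frequency", "task_latency", "exploration_duration")
# Inferred affective concepts.
STATES = ("cognitive_overload", "boredom")

# Hypothetical causal weights in [-1, 1]: (source concept, target concept) -> weight.
WEIGHTS = {
    ("error_frequency", "cognitive_overload"): 0.7,
    ("task_latency", "cognitive_overload"): 0.5,
    ("error_frequency", "boredom"): -0.3,
    ("exploration_duration", "boredom"): -0.6,
    ("cognitive_overload", "boredom"): -0.4,
}


def squash(x: float) -> float:
    """Sigmoid threshold function keeping activations inside (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def infer_affect(observations: Dict[str, float], iterations: int = 10) -> Dict[str, float]:
    """Iterate the map and return the activations of the affective concepts."""
    activation = {name: 0.5 for name in STATES}  # neutral starting point
    for _ in range(iterations):
        current = {**observations, **activation}
        updated = {}
        for state in STATES:
            influence = sum(
                WEIGHTS.get((source, state), 0.0) * value
                for source, value in current.items()
            )
            updated[state] = squash(activation[state] + influence)
        activation = updated
    return activation
```

With these illustrative weights, a learner showing frequent errors and long task latency ends with a higher `cognitive_overload` activation than a learner showing few errors and short latency. A real system would calibrate the weights with domain experts or data before using the activations to adjust pace or challenge.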

3.4. RQ4: Educational Impact and Student Outcomes

The primary objective of implementing adaptive architectures in Software Engineering Education (SEE) is to evaluate whether technical sophistication yields measurable educational advancements. The synthesis of quantitative and qualitative data from the 59 primary studies reveals significant impacts across two fundamental dimensions: behavioral engagement (encompassing motivation and participation) and cognitive performance (including skill acquisition and academic achievement). The distribution and hierarchy of these reported outcomes are illustrated in Figure 6, showing a predominant focus on behavioral engagement followed by cognitive improvements.

3.4.1. Impact on Engagement and Team Dynamics

A prevalent finding within the surveyed literature is the positive correlation between adaptive gamification and student motivation. Multiple studies utilized the ARCS model (Attention, Relevance, Confidence, Satisfaction) as a psychometric framework to quantify this impact. Findings indicate that personalized feedback mechanisms contribute to a statistically significant reduction in attrition rates, particularly in technically demanding courses.

Voluntary Practice and Engagement: Balla et al. [23] investigated the efficacy of game-based learning environments in SQL programming education. By replacing traditional, abstract exercises with narrative-driven mini-games (e.g., Star Wars or Harry Potter themes), the authors observed a reduction in students’ perceived complexity of the subject matter. Quantitative assessments integrated within these gamified modules demonstrated superior performance compared to conventional testing methods. The data suggest that gamification facilitates a cognitive context conducive to sustained focus and reduced performance anxiety, thereby aligning pedagogical delivery with intrinsic motivational drivers rather than merely providing entertainment. Furthermore, a five-year longitudinal study by Hamann et al. [24] reinforces this by demonstrating that voluntary participation in incentive-based e-learning programs correlates with significantly higher exam performance in software engineering, suggesting that continuous feedback loops are a scalable strategy for improving academic success.
Social Transparency and Agile Dynamics: To address the common challenge of disparate contribution levels in collaborative software projects, Meißner et al. [25] introduced DinoDev. This system implements “social transparency” by visualizing individual contributions through real-time activity feeds and merit-based badges. The empirical results indicate that such transparency fosters a self-regulating social effect, where lower-contributing students increased their engagement, thereby reinforcing Agile best practices through shared accountability. Building on the importance of such interactive frameworks, Masson et al. [26] demonstrate that gamified dynamics such as the Scrum Game Challenge are crucial for bridging the gap between theoretical knowledge and the practical application of methodologies. Their findings suggest that these simulations provide a “realistic experience” of industry standards, an essential component for Computer Science curricula. Together, these studies highlight that integrating visibility of labor with hands-on, gamified simulations not only enhances cohort coordination but also prepares students for the professional rigors of the software industry.
Emotional Valence and Affective States: Beyond performance metrics, recent research has pivoted toward the granular mapping of student emotions during gamified interventions. Paredes-Velasco et al. [27] conducted a comprehensive taxonomical analysis of affective responses to the synergistic use of Augmented Reality (AR) and data visualization. Their findings, synthesized into a table of emotions, reveal a complex psychological profile: while students reported a predominance of positive over negative emotions (characterized by high levels of stimulation and agitation rather than passivity), the intervention also triggered a persistent and escalating state of anxiety associated with the use of AR. Furthermore, the study highlights that the pedagogical medium (face-to-face vs. online) acts as a significant moderator for both emotional valence and learning outcomes. This underscores the necessity of balancing “flow” and engagement with the potential cognitive and emotional load induced by immersive technologies. Complementing this perspective, El Hassan et al. [28] demonstrate the transformative potential of AR-based gamification in bridging the gap between theoretical instruction and practical observation. Through a case study in a botanical context, the authors argue that the integration of AR serves as a pivotal catalyst for fostering immersive learning experiences, effectively reshaping traditional pedagogical methods into dynamic environments that significantly deepen student understanding of complex, multi-dimensional subjects.

3.4.2. Impact on Skill Acquisition and Grades

While engagement metrics remain high, the influence of adaptive systems on academic grades and mastery is moderated by the complexity of the domain and the granularity of the adaptation.

Conceptualization of Abstract Processes: Understanding the Software Development Life Cycle (SDLC) presents significant cognitive hurdles due to the interdependence of roles and decision-making processes. Wang et al. [10] addressed this via DevCoach, an experiment comparing a control group using static materials with an experimental group interacting with AI agents simulating professional roles (e.g., tester, product manager, architect). Post-test analysis revealed a significant increase in scores for the DevCoach group. The authors attribute this improvement to the transition from passive memorization to active, role-based interaction, which enables students to internalize the collaborative nature of software engineering through simulated professional discourse.
Efficiency in Testing and Debugging: To mitigate the perceived monotony of debugging, Straubinger et al. [14] developed Sojourner, a tool that embeds unit testing within a security-themed narrative. The study results indicate that this narrative framework functions as a cognitive anchor, enhancing both the speed and effectiveness of bug identification. By contextualizing abstract vulnerabilities within a coherent story, students demonstrated improved long-term retention of debugging patterns compared to traditional instructional methods.
Immersive Learning and Spatial Reasoning: Wee et al. [7] explored the role of Virtual Reality (VR) through the TeachVR platform. While VR implementation did not result in increased coding speed, it significantly aided the visualization of three-dimensional data structures and object-oriented relationships. Qualitative feedback suggests that VR facilitates the reification of abstract concepts. However, the authors maintain a cautious stance, noting that a portion of the observed student enthusiasm may be attributed to the “novelty effect” of the technology, necessitating further longitudinal research to confirm sustained learning outcomes.
Internalization of Software Quality Standards: Beyond functional correctness, the acquisition of professional competencies involves a shift toward structural and qualitative rigor. De Luca et al. [29] addressed the common educational gap where software quality is overshadowed by project functionality. By implementing an automated pipeline utilizing ArchUnit and SonarQube within Object-Oriented Programming (OOP) courses, the study demonstrates that integrating quality metrics directly into assessment criteria effectively highlights recurrent structural flaws in student projects. This approach suggests that pedagogical tools facilitating immediate feedback on code quality are essential for transitioning students from basic coding to professional-grade engineering.

3.5. RQ5: Optimization of Learning Paths via Data-Driven Gamification

Research Question 5 examines the intersection of gamification mechanics and data analytics. The synthesis of the 59 primary studies indicates that advanced architectural frameworks increasingly utilize a “Closed-Loop” design. Figure 7 formalizes this closed-loop reference architecture using UML Component Diagram notation, detailing the functional decoupling between the Learner Interface layer, the Monitoring and Data Layer, the Adaptive Intelligence layer, and the Adaptation Controller. Within this model, gamification transcends static reward structures (e.g., conventional points and badges) to become a dynamic layer optimized through continuous data streams derived from Process Mining (RQ2) and Artificial Intelligence (AI).
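The data flow between the four layers named above can be sketched as a single iteration of the loop. This is a simplified illustration under stated assumptions, not a reconstruction of Figure 7: all class names, the five-event window, the 0.8 and 0.4 thresholds, and the 1 to 5 difficulty scale are hypothetical, and the Learner Interface is reduced to the `Event` it emits.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Event:
    """What the Learner Interface layer emits after each attempt."""
    learner_id: str
    task_id: str
    correct: bool


@dataclass
class MonitoringLayer:
    """Monitoring and Data Layer: stores events and derives simple indicators."""
    events: List[Event] = field(default_factory=list)

    def record(self, event: Event) -> None:
        self.events.append(event)

    def recent_success_rate(self, learner_id: str, window: int = 5) -> Optional[float]:
        outcomes = [e.correct for e in self.events if e.learner_id == learner_id][-window:]
        return sum(outcomes) / len(outcomes) if outcomes else None


class AdaptiveIntelligence:
    """Adaptive Intelligence layer: turns indicators into a recommendation."""

    def recommend(self, success_rate: Optional[float]) -> int:
        if success_rate is None:
            return 0
        if success_rate >= 0.8:   # hypothetical threshold
            return 1              # raise difficulty
        if success_rate <= 0.4:   # hypothetical threshold
            return -1             # lower difficulty
        return 0


@dataclass
class AdaptationController:
    """Adaptation Controller: applies recommendations within safe bounds."""
    difficulty: int = 3
    min_level: int = 1
    max_level: int = 5

    def apply(self, delta: int) -> int:
        self.difficulty = max(self.min_level, min(self.max_level, self.difficulty + delta))
        return self.difficulty


def closed_loop_step(event: Event, monitor: MonitoringLayer,
                     brain: AdaptiveIntelligence, controller: AdaptationController) -> int:
    """One pass of the loop: observe, analyze, adapt; returns the new difficulty."""
    monitor.record(event)
    rate = monitor.recent_success_rate(event.learner_id)
    return controller.apply(brain.recommend(rate))
```

Keeping the recommendation logic separate from the controller mirrors the functional decoupling described above: the success-rate heuristic could be replaced by a Process Mining or AI component without changing how events are recorded or how adaptations are bounded.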

3.5.1. The “Smart Gamification” Feedback Loop

Traditional gamification models often encounter limitations due to their non-adaptive, “one-size-fits-all” nature. To enhance pedagogical effectiveness, modern architectures implement Smart Gamification, wherein reward systems autonomously adjust to learner behavior.

Adaptive Feedback Mechanisms: Despite methodological variations across the analyzed projects, a consistent trend involves the deployment of data-driven feedback to modulate student engagement. For instance, the KINAITICS platform [15] utilizes Capture The Flag (CTF) challenges and scoring algorithms that prioritize both accuracy and longitudinal progression. In the context of Test-Driven Development (TDD) gamification, Ren [30] proposes a multi-dimensional scoring system that evaluates procedural integrity (including commit frequency, TDD cycle adherence, and test-first sequencing) rather than solely final outputs. Furthermore, the Experience–Simulation–Debrief–Reflection model in Requirements Engineering leverages hierarchical badges and real-time feedback to enhance meta-cognitive awareness. These empirical findings suggest that by incentivizing the learning process (the “how”) alongside the learning outcome (the “what”), systems can effectively discourage minimal-effort strategies in favor of procedural mastery [31].
Equilibrium between Challenge and Proficiency: Marougkas et al. [22] investigate the maintenance of the Flow state, a cognitive equilibrium in which task difficulty aligns with user skill so that both boredom and anxiety are avoided. This is operationalized through Fuzzy Cognitive Maps, enabling Extended Reality (XR) systems to calibrate task complexity in real time based on performance metrics. While other studies may not explicitly address affective states, they converge on the principle of personalized content adaptation: Vesin et al. [17] employ an Elo-based adaptive assessment, while Logacheva et al. [32] demonstrate that contextual customization significantly increases learner engagement. Collectively, these studies indicate that responsive, individualized learning environments correlate with sustained motivation and cognitive focus.
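As a point of reference for the Elo-based assessment mentioned above, the sketch below shows the classic Elo update applied to a learner and an exercise. It assumes the standard chess-style formulation (base-10 logistic curve with a scale of 400 and a fixed K-factor of 32); the variant used by Vesin et al. [17] may differ, and the function names are illustrative.

```python
from typing import Tuple

K_FACTOR = 32.0  # assumed step size; educational systems often tune or decay this


def expected_success(learner_rating: float, item_rating: float) -> float:
    """Probability that the learner solves the item under the Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((item_rating - learner_rating) / 400.0))


def elo_update(learner_rating: float, item_rating: float, solved: bool,
               k: float = K_FACTOR) -> Tuple[float, float]:
    """Return the updated (learner_rating, item_rating) after one attempt."""
    surprise = (1.0 if solved else 0.0) - expected_success(learner_rating, item_rating)
    # The learner gains what the item loses, so the two ratings stay on one scale.
    return learner_rating + k * surprise, item_rating - k * surprise
```

For example, `elo_update(1500, 1500, solved=True)` moves the learner up and the exercise down by the same amount. A platform can then select the next exercise whose rating yields an expected success probability in a target band, which is one way to keep the challenge aligned with proficiency.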

3.5.2. Curriculum Optimization (Test-Driven Learning)

Data-driven insights extend beyond real-time adaptation, informing the structural optimization of curricula for subsequent cohorts.

Evolutionary Curriculum Design: The APLAS system [33] exemplifies the Test-Driven Learning (TDL) approach, utilizing automated verification to guide students. Evidence from the broader literature suggests a growing reliance on historical interaction data to refine educational pathways. Analysis of large-scale datasets, including command-line telemetry in cyber ranges [4], recurrent errors in mutation testing [34], and failure patterns in white-box testing games [35], reveals that pedagogical “bottlenecks” are non-random and highly localized. By aggregating thousands of student submissions, these systems can identify specific conceptual hurdles with high granularity. Consequently, the platforms can autonomously restructure module sequences, integrate targeted remedial micro-activities, or suggest reinforcement exercises. This iterative process utilizes empirical evidence to ensure that the curriculum evolves according to the demonstrated needs of the student population.
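The aggregation step described here can be illustrated with a small sketch that flags exercises whose failure rate across submissions is unusually high. The data layout (pairs of exercise identifier and pass/fail outcome), the function name, and both thresholds are assumptions made for illustration; the reviewed platforms rely on richer telemetry and their own criteria.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple


def find_bottlenecks(submissions: Iterable[Tuple[str, bool]],
                     min_attempts: int = 3,
                     failure_threshold: float = 0.6) -> List[Tuple[str, float]]:
    """Return (exercise_id, failure_rate) pairs for likely bottlenecks, worst first.

    Exercises with fewer than `min_attempts` submissions are ignored so that
    a handful of failures does not trigger a curriculum change.
    """
    attempts = defaultdict(int)
    failures = defaultdict(int)
    for exercise_id, passed in submissions:
        attempts[exercise_id] += 1
        if not passed:
            failures[exercise_id] += 1

    flagged = []
    for exercise_id, total in attempts.items():
        rate = failures[exercise_id] / total
        if total >= min_attempts and rate >= failure_threshold:
            flagged.append((exercise_id, rate))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)
```

The flagged exercises are candidates for the interventions listed above, such as inserting a remedial micro-activity before the exercise or moving it later in the module sequence.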

3.5.3. Optimization in Immersive Spaces

In XR-based learning environments, optimization efforts are primarily directed toward the mitigation of cognitive load.

Spatial and Visuo-Cognitive Optimization: The MR-LEAP framework [8] utilizes spatial tracking to ensure consistent virtual object placement, thereby reducing the extraneous cognitive load associated with environmental interaction. Complementary VR studies, including TeachVR [7], MoonBase VR [9], and the platforms discussed by Yigitbas et al. [36], Agbo et al. [37], and Dörringer et al. [38], underline the necessity of managing visual complexity to prevent sensory overload and cybersickness. Although these frameworks may not all feature real-time velocity-based adaptation, there is a consensus that 3D educational environments must balance immersion with clarity. This is achieved through streamlined interfaces and immediate feedback loops designed to align with human perceptual constraints.

4. Findings

The examination of the 59 core studies provides a comprehensive overview of the transition from static gamified systems to adaptive, AI-driven architectures in SEE. While the results (RQ1–RQ5) demonstrate clear benefits in terms of engagement and personalized feedback, the implementation of these systems is not without significant hurdles. In this section, we address the final research question (RQ6) regarding challenges and opportunities, and we discuss our findings in the context of the broader literature.

4.1. RQ6: Challenges and Limitations

Integrating adaptive architectures with gamification presents a complex set of technical and pedagogical challenges. The review identifies four main categories of barriers that currently hinder widespread adoption.

4.1.1. Technical Integration and Interoperability

One of the most common problems is the complexity of integrating gamified tools into existing educational ecosystems. As Meißner et al. [12] note, current LMSs are not flexible enough to support finer forms of data tracking, such as those required for process mining. Added to this is an extremely fragmented landscape: teachers often find themselves using isolated tools, each with its own logic and interfaces, rather than integrated platforms. The same problem arises when trying to connect specialized tools. For example, creating a seamless experience between a development environment and a mutation testing framework like FRAFOL [34] still requires substantial technical work. Likewise, even gamified tools for requirements engineering, such as the pedagogical architecture described by Santana et al. [39], struggle to find a place in traditional university courses, which are often too rigid to accommodate more interactive methodologies. The result is a siloed learning experience, where each tool exists on its own and data cannot flow freely between activities, play, and assessment.

4.1.2. Pedagogical Alignment and Assessment

Ensuring that serious games truly teach substantive concepts, and not just how to “play”, is an increasingly evident challenge. Wünsche et al. [40] warn against the risk of excessive gamification: when the game takes over, students may end up optimizing strategies to win, not to truly learn. A similar concern emerges in the work of Barbosa Monteiro et al. [41], who highlight how the field still lacks shared and reliable metrics for evaluating the effectiveness of gamification. Without common criteria, comparing different approaches becomes difficult, and the risk is that of accumulating isolated, non-cumulative experiences. Regarding automatic feedback, the situation is equally nuanced. The large study by Frankford et al. [42] shows that students continue to place great trust in unit-test-based feedback, perceived as stable and predictable. AI-based solutions, however, while offering enormous potential, can still produce inaccurate or out-of-context responses. This limitation has also been observed in recent explorations into the use of chatbots in education, as reported by Fernandez-y Fernandez et al. [43], where cases of misinterpretations or suggestions not aligned with best practices emerge.

4.1.3. Cognitive Load in Immersive Spaces

XR architectures (RQ1) offer very high levels of engagement, but they also bring significant physical and cognitive challenges. The systematic review by Liu et al. [3] shows, for example, that prolonged use of wearable headsets can generate fatigue, discomfort, and even episodes of cybersickness, especially when tasks require sustained concentration. This highlights a critical point: finding a balance between immersion and usability. Even in the context of collaborative modeling, Yigitbas et al. [36] observe that, although VR makes the experience more natural and immersive, 3D interaction can be slower and less efficient than traditional desktop tools, precisely because of the greater complexity of the gestures and manipulations required.

4.1.4. Inclusivity and Resource Constraints

A final critical issue concerns the accessibility of AI technologies themselves. As Omeh et al. [44] observe, when universities operate in resource-constrained settings, adopting heavyweight AI models requiring expensive hardware or API credits becomes simply unrealistic. This creates a clear divide between those who can afford innovation and those who are excluded. At the same time, Zhu and Zhang’s systematic review draws attention to another often-overlooked risk, one that concerns diversity, equity, and inclusion. If models are trained on biased or unrepresentative data, students from minority groups may find themselves disadvantaged by tools that, paradoxically, should support everyone. This is a structural problem that many AI-based educational architectures still fail to seriously address [45].

4.2. Comparison with Related Work

Although the results of our analysis align with emerging trends in recent research, this study provides a distinct contribution by addressing specific literature gaps and broadening the scope of prior surveys.

A critical differentiation of our work lies in its progression beyond foundational mappings of gamification evaluation and adaptation. For instance, although Barbosa Monteiro et al. [41] provided a comprehensive systematic mapping of gamification evaluation strategies in software engineering, their findings primarily highlighted a persistent lack of standardized evaluation models, observing that most studies rely on basic, self-reported metrics of engagement, motivation, and performance. Our review extends their research by shifting the analytical lens from merely how gamification is evaluated to what architectural complexities and emerging technologies, such as Generative AI and Process Mining, actually drive these outcomes.

Similarly, Aydin et al. [5] conducted a systematic review of adaptation components in serious games, identifying various decision algorithms (e.g., Bayesian networks, fuzzy logic, and deep learning) utilized to adjust game difficulty and non-player character behaviors. While their research offers a valuable taxonomy of adaptability across general educational settings, our study specifically contextualizes and extends this paradigm within the rigorous domain of higher education software engineering. We emphasize that, in this specific context, adaptability requires significantly more than simple algorithmic adjustments of task difficulty; it necessitates deep integration with real-world development processes, code repositories, and continuous feedback loops driven by GenAI.

This deep integration aligns with the rapid proliferation of GenAI technologies observed in our RQ3, mirroring the findings of Li et al. [46], whose meta-analysis demonstrates that the impact of AI on STEM education has reached a consistently medium-to-high level. Furthermore, whereas the work of Triantafyllou et al. [47] focuses primarily on K-12 contexts, our study underscores that the complexity of software engineering concepts in higher education renders superficial gamified mechanics (e.g., basic points or badges) fundamentally insufficient. Instead, students require sophisticated systems capable of modelling real-world processes, complex data flows, and professional interactions.

Consistent with the observations of Ishaq et al. [48] and Wilson et al. [1], we ultimately conclude that personalization remains the quintessential factor in sustaining student engagement over time.

4.3. The Fragmented Landscape of Software Engineering Games

Although some recurring patterns are beginning to emerge, the landscape remains fragmented: a constellation of highly specialized educational tools and games exists, each designed to cover a very specific phase of software engineering. Beyond solutions dedicated to programming in the narrow sense, we find games designed for specific areas such as risk management in software projects (SERGE [49]), software architecture (D-LEARN [50]), model-based modeling (PapyGame [51]), or white-box testing (GAMFLEW [35]). Other tools focus on similarly vertical aspects: cloud security through Infrastructure-as-Code (Sifu [52]), peer assessment during programming exercises (PuzzleMe [53]), or the transition from block-based to text-based coding (Pyrates [54]). These initiatives are often brilliant and innovative, but they tend to exist as separate islands: each offers its own model, its own interaction logic, and its own ecosystem. Harmonization efforts exist, such as the Framework for Gamified Programming Education (FGPE [55]), mobile-enabled adaptive gamification models [56], or co-design approaches applied to the creation of mini-games [37,57] and VR environments, but a unified, modular, and truly plug-and-play architecture that allows these experiences to be integrated into a coherent framework is still lacking. Building a universal framework for the gamification of software engineering therefore remains an open and far from trivial challenge.

5. Conclusions

This systematic literature review has highlighted a profound metamorphosis in Software Engineering Education. We are witnessing a decisive transition from static, “one-size-fits-all” instructional tools toward complex, adaptive educational ecosystems. In these environments, technology does not merely act as a medium for knowledge transmission but actively shapes the learning experience through data-driven personalization and architectural flexibility.

Our analysis confirms that the effectiveness of modern pedagogical architectures relies on their ability to contextualize abstract engineering principles. The integration of narrative design and simulation-based approaches allows students to internalize complex methodologies, such as Agile and Scrum, by experiencing them within a social and functional framework rather than through passive theoretical study. This immersive dimension is significantly enhanced by Extended Reality technologies. Beyond simple code visualization, XR impacts the affective domain of learning by reducing cognitive anxiety and fostering an emotional connection with the subject matter.

However, technical immersion must be balanced with intelligent adaptation. The shift toward Generative AI and Multi-Agent Systems represents a milestone in providing individualized feedback that accounts for the learner’s specific cognitive load and progress. To achieve professional-grade training, these systems must move beyond functional verification to include the automated assessment of architectural quality, such as the detection of code smells and anti-patterns, thereby instilling maintainability as a core competency.

In summary, the future of SEE lies at the convergence of immersive storytelling, algorithmic adaptability, and rigorous process-oriented assessment. While the current landscape remains fragmented, the path forward requires standardizing the interoperability between these specialized “closed-loop” systems and traditional Learning Management Systems. Only through such integration and further longitudinal empirical validation can we democratize access to advanced, personalized, and truly effective software engineering training.

Author Contributions

Conceptualization, M.M.; methodology, A.A.Q.; software, A.A.Q.; validation, G.M. and V.Z.; formal analysis, A.A.Q.; investigation, A.A.Q.; resources, A.A.Q. and G.M.; data curation, A.A.Q.; writing—original draft preparation, A.A.Q.; writing—review and editing, G.M. and V.Z.; visualization, G.M. and V.Z.; supervision, M.M.; project administration, G.M., V.Z. and M.M.; funding acquisition, M.M. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article. Further inquiries can be directed to the corresponding author.

Acknowledgments

The authors used Gemini 3 Pro (Google AI) for language editing and syntactic correction during the preparation of this manuscript. Following the use of this tool, the authors reviewed and edited the content as needed and take full responsibility for the final version of the publication.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

AI	Artificial Intelligence
EC	Exclusion Criteria
EPM	Educational Process Mining
GenAI	Generative Artificial Intelligence
IC	Inclusion Criteria
ITSs	Intelligent Tutoring Systems
LLMs	Large Language Models
LMSs	Learning Management Systems
MAS	Multi-Agent Systems
MR	Mixed Reality
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RQs	Research Questions
SEE	Software Engineering Education
VR	Virtual Reality
XR	Extended Reality

Appendix A. Mapping of the 59 Primary Studies

In this appendix, the complete mapping of the 59 primary studies identified in the systematic review is provided. The studies are categorized by their architectural pattern (Immersive XR, GenAI/Agents, or Web-based ITS) and their primary Software Engineering focus area.

Table A1. Classification of identified architectural patterns in SEE.

Study Reference	Architecture	Software Engineering Focus Area	Adaptation Level
[15]	Web-based ITSs	Software Security	Feedback-driven
[45]	Web-based ITSs	Programming	Rule-based/Adaptive
[36]	Immersive (VR/MR)	Software Design (UML)	Collaborative-based
[40]	Web-based ITSs	General SEE	Adaptive/Scaffolded
[1]	Web-based ITSs	Software Lifecycle/Management	Scenario-based
[20]	Web-based ITSs	General SEE	Feedback-driven
[7]	Immersive (VR/MR)	Computational Thinking	Constructivist
[10]	GenAI & Agents	Agile	Multi-agent
[21]	Web-based ITSs	General SEE	Data-driven/Adaptive
[53]	Web-based ITSs	Programming	Peer-based
[58]	Web-based ITSs	General SEE	Narrative-based
[17]	Web-based ITSs	Programming	Elo-rating/Adaptive
[31]	Web-based ITSs	Requirements Engineering	Rule-based
[47]	Web-based ITSs	Computational Thinking	Meta-adaptive
[34]	Web-based ITSs	Software Testing	Multi-tool Integration
[33]	Web-based ITSs	Mobile Programming	Test-driven
[4]	Web-based ITSs	Software Security	Data-driven
[14]	Web-based ITSs	Software Testing	Narrative-based
[35]	Web-based ITSs	Software Testing	Rule-based
[6]	GenAI & Agents	Programming	ARCS
[8]	Immersive (VR/MR)	Programming	Spatial-based
[39]	Web-based ITSs	Requirements Engineering	Scenario-based
[18]	Web-based ITSs	Programming	Rule-based
[30]	Web-based ITSs	Software Testing	Rule-based
[27]	Immersive (VR/MR)	Programming	Immersive/Affective
[44]	Web-based ITSs	Programming	AI-assisted
[13]	Web-based ITSs	SQL/Database	Rule-based
[12]	Web-based ITSs	General SEE	Descriptive/Technical
[25]	Web-based ITSs	Agile	Rule-based
[26]	Web-based ITSs	Agile/Scrum	Simulation-based
[55]	Web-based ITSs	Programming	Pareto-optimized/Adaptive
[22]	Immersive (VR/MR)	Programming	Flow-based
[16]	Web-based ITSs	Programming	Data-driven
[32]	GenAI & Agents	Programming	LLM-based
[3]	Immersive (VR/MR)	General SEE	Biosignal-adaptive
[50]	Web-based ITSs	Software Architecture	Simulation-based
[46]	Web-based ITSs	General SEE	AI-Personalized
[48]	Web-based ITSs	Programming	Personalized-learning
[19]	GenAI & Agents	Programming	Deep Learning
[9]	Immersive (VR/MR)	Programming	Narrative-based
[24]	Web-based ITSs	Modeling/Programming	Feedback-driven
[42]	Web-based ITSs	Programming	Feedback-driven
[43]	GenAI & Agents	General SEE	Chatbot-driven
[52]	Web-based ITSs	Cloud Security	Challenge-based
[28]	Immersive (VR/MR)	General SEE	Resource-based
[38]	Web-based ITSs	General SEE	Narrative-based
[29]	Web-based ITSs	Software Quality/Architecture	Static Analysis/Feedback
[51]	Web-based ITSs	Modeling	Rule-based
[54]	Web-based ITSs	Programming	ML-based
[11]	GenAI & Agents	Programming	Pedagogical Scaffolding
[2]	Web-based ITSs	Computer Science/General SEE	Multi-technique
[41]	Web-based ITSs	General SEE	Multi-model
[23]	Web-based ITSs	Databases/SQL	Game-based
[5]	Web-based ITSs	General SEE	Player-modeling
[49]	Web-based ITSs	Project Management (Risk)	Simulation-based
[56]	Web-based ITSs	Programming	Adaptive Pathways
[59]	Web-based ITSs	Requirements Engineering	Model-driven
[37]	Immersive (VR/MR)	Computational Thinking	Interactive
[57]	Web-based ITSs	Computational Thinking	Co-design/Collaborative

References

1. Wilson, K.N.; Ghansah, B.; Ananga, P.; Oppong, S.O.; Essibu, W.K.; Essibu, E.K. Exploring the efficacy of computer games as a pedagogical tool for teaching and learning programming: A systematic review. Educ. Inf. Technol. 2025, 30, 4157–4184.
2. Barbosa, P.L.S.; Carmo, R.A.F.D.; Gomes, J.P.P.; Viana, W. Adaptive learning in computer science education: A scoping review. Educ. Inf. Technol. 2024, 29, 9139–9188.
3. Liu, S.; Toreini, P.; Maedche, A. Mixed Reality Learning Systems with Head-Mounted Displays in Higher Education: A Systematic Review. Technol. Knowl. Learn. 2025.
4. Švábenský, V.; Vykopal, J.; Čeleda, P.; Kraus, L. Applications of educational data mining and learning analytics on data from cybersecurity training. Educ. Inf. Technol. 2022, 27, 12179–12212.
5. Aydin, M.; Karal, H.; Nabiyev, V. Examination of adaptation components in serious games: A systematic review study. Educ. Inf. Technol. 2023, 28, 6541–6562.
6. Shum, L.C.; Rosunally, Y.; Munir, K. Transforming Programming Education: The Effectiveness of Motivational Scenario-Based Design in Serious Games. IEEE Trans. Educ. 2025, 68, 394–406.
7. Wee, C.; Wang, L.Y.K.; Ong, H.F. TeachVR: An Immersive Virtual Reality Framework for Computational Thinking Based on Student Preferences. ACM Trans. Comput. Educ. 2025, 25, 1–36.
8. Schez-Sobrino, S.; García, F.M.; Albusac, J.A.; Glez-Morcillo, C.; Castro-Schez, J.J.; Vallejo, D. MR-LEAP: Mixed-Reality Learning Environment for Aspirational Programmers. Softw. Impacts 2024, 20, 100648.
9. Holder, R.; Carey, M.; Walder, P.; Keir, P. MoonBase VR: Learning to program in a virtual reality game. In Proceedings of the 2023 8th International Conference on Information and Education Innovations, Manchester, UK, 13–15 April 2023; pp. 74–80.
10. Wang, T.; Trimble, M.; Brown, C. DevCoach: Supporting Students Learning the Software Development Life Cycle with a Generative AI powered Multi-Agent System. In Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, Clarion Hotel Trondheim, Trondheim, Norway, 23–28 June 2025; pp. 987–998.
11. Barzanji, C.; Loitsch, C. Exploring conversational agents for novice programmers: A scoping review. Discov. Artif. Intell. 2025, 5, 271.
12. Meißner, N.; Koch, N.; Speth, S.; Breitenbücher, U.; Becker, S. Unveiling Hurdles in Software Engineering Education: The Role of Learning Management Systems. In Proceedings of the 46th International Conference on Software Engineering: Software Engineering Education and Training, Lisbon, Portugal, 14–20 April 2024; pp. 242–252.
13. Morales-Trujillo, M.E.; García-Mireles, G.A. Gamification and SQL: An Empirical Study on Student Performance in a Database Course. ACM Trans. Comput. Educ. 2021, 21, 1–29.
14. Straubinger, P.; Greller, T.; Fraser, G. Sojourner under Sabotage: A Serious Testing and Debugging Game. In Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, Clarion Hotel Trondheim, Trondheim, Norway, 23–28 June 2025; pp. 738–748.
15. Zola, F.; Echeberria, X.; Petisco, J.; Vakakis, N.; Voulgaridis, A.; Votis, K. KINAITICS: Enhancing Cybersecurity Education Using AI-Based Tools and Gamification. In Proceedings of the 2024 16th International Conference on Education Technology and Computers, Porto, Portugal, 18–21 September 2024; pp. 138–147.
16. Lokkila, E.; Christopoulos, A.; Laakso, M.J. A Clustering Method to Detect Disengaged Students from Their Code Submission History. In Proceedings of the 27th ACM Conference on Innovation and Technology in Computer Science Education, Dublin, Ireland, 8–13 July 2022; Volume 1, pp. 228–234.
17. Vesin, B.; Mangaroska, K.; Akhuseyinoglu, K.; Giannakos, M. Adaptive Assessment and Content Recommendation in Online Programming Courses: On the Use of Elo-rating. ACM Trans. Comput. Educ. 2022, 22, 1–27.
18. Sanal Kumar, T.S.; Thandeeswaran, R. An improved adaptive personalization model for instructional video-based e-learning environments. J. Comput. Educ. 2024, 12, 267–313.
19. Hong, Z. Research on the Innovative Application of AIGC in Programming Education: A Case Study of Code Generation and Intelligent Feedback System Design. In Proceedings of the 2nd Guangdong–Hong Kong–Macao Greater Bay Area International Conference on Digital Economy and Artificial Intelligence, Dongguan, China, 28–30 March 2025; pp. 1368–1373.
20. Willert, N.; Eriksson, J. Towards a feature-based didactic framework for generating individualized programming tasks for an e-learning environment. In Proceedings of the 5th European Conference on Software Engineering Education, Bavaria, Germany, 19–21 June 2023; pp. 246–255.
21. Wang, H.; Tlili, A.; Huang, R.; Cai, Z.; Li, M.; Cheng, Z.; Yang, D.; Li, M.; Zhu, X.; Fei, C. Examining the applications of intelligent tutoring systems in real educational contexts: A systematic literature review from the social experiment perspective. Educ. Inf. Technol. 2023, 28, 9113–9148.
22. Marougkas, A.; Troussas, C.; Krouska, A.; Sgouropoulou, C. An adaptive virtual reality game for programming education using fuzzy cognitive maps and pedagogical models. Smart Learn. Environ. 2025, 12, 62.
23. Balla, T.; Kiraly, S.; Kiraly, R. Enhancing SQL programming assessments through educational games: A gamified approach. Discov. Educ. 2025, 4, 365.
24. Hamann, M.; Götz, S.; Domanowski, A.; Marx, R.; Aßmann, U. The Influence of Incentives and Continuous Feedback in Combined Modeling and Programming Education. In Proceedings of the 6th European Conference on Software Engineering Education, Seeon, Germany, 2–4 June 2025; pp. 135–144.
25. Meißner, N.; Bredl, P.; Speth, S.; Becker, S. Enhancing Motivation in Software Engineering Education through Gamified Agile Project-based Learning. In Proceedings of the 33rd ACM International Conference on the Foundations of Software Engineering, Clarion Hotel Trondheim, Trondheim, Norway, 25–28 June 2025; pp. 835–846.
26. Masson, E.T.S.; Calazans, A.T.S.; Bandeira, I.N.; Silva, G.R.S.; Canedo, E.D. Scrum in Practice: City Reconstruction as a Pedagogical Game Challenge. In Proceedings of the XXII Brazilian Symposium on Software Quality, Brasília, Brazil, 7–10 November 2023; pp. 321–331.
27. Paredes-Velasco, M.; Velázquez-Iturbide, J.Á.; Gómez-Ríos, M. Augmented reality with algorithm animation and their effect on students’ emotions. Multimed. Tools Appl. 2023, 82, 11819–11845.
28. El Hassan, A.; Nabou, A.; Jaouar, M.; Ez-Zarzouri, H.; Qazdar, I.; Belfarji, K. Toward a Gamified Approach Based on Augmented Resources in Education. In Proceedings of the 7th International Conference on Networking, Intelligent Systems and Security, Meknes, Morocco, 18–19 April 2024; pp. 1–7.
29. De Luca, M.; Di Meglio, S.; Fasolino, A.R.; Starace, L.L.L.; Tramontana, P. Automatic Assessment of Architectural Anti-patterns and Code Smells in Student Software Projects. In Proceedings of the 28th International Conference on Evaluation and Assessment in Software Engineering, Salerno, Italy, 18–21 June 2024; pp. 565–569.
30. Ren, W. Gamification in Test-Driven Development Practice. In Proceedings of the 2nd International Workshop on Gamification in Software Development, Verification, and Validation, San Francisco, CA, USA, 4 December 2023; pp. 38–46.
31. Unkelos-Shpigel, N. Elicit, specify, require, revise: Enhancing requirements engineering process using gamification. Softw. Qual. J. 2025, 33, 27.
32. Logacheva, E.; Hellas, A.; Prather, J.; Sarsa, S.; Leinonen, J. Evaluating Contextually Personalized Programming Exercises Created with Generative AI. In Proceedings of the 2024 ACM Conference on International Computing Education Research, Melbourne, Australia, 13–15 August 2024; Volume 1, pp. 95–113.
33. Syaifudin, Y.W.; Funabiki, N.; Kaswar, A.B.; Sunandar, A.; Fatmawati, T.; Saputra, P.Y.; Wijaya, D.C.; Irawanto, M.F. Implementing Data Integration Self-Study: A Curriculum Development for Android Programming Learning Assistance System. SN Comput. Sci. 2025, 6, 559.
34. Tavares, P.; Paiva, A.; Amalfitano, D.; Just, R. FRAFOL: FRAmework FOr Learning mutation testing. In Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, Vienna, Austria, 16–20 September 2024; pp. 1846–1850.
35. Silva, M.; Paiva, A.C.R.; Mendes, A. GAMFLEW: Serious game to teach white-box testing. Softw. Qual. J. 2025, 33, 5.
36. Yigitbas, E.; Gorissen, S.; Weidmann, N.; Engels, G. Design and evaluation of a collaborative UML modeling environment in virtual reality. Softw. Syst. Model. 2023, 22, 1397–1425.
37. Agbo, F.J.; Oyelere, S.S.; Suhonen, J.; Tukiainen, M. Design, development, and evaluation of a virtual reality game-based application to support computational thinking. Educ. Technol. Res. Dev. 2023, 71, 505–537.
38. Dörringer, A.; Klopp, M.; Rossmann, R. Digital Storytelling in Serious Games: Empirical Research on the Impact of Narrative in 3D Learning Worlds. In Proceedings of the 6th European Conference on Software Engineering Education, Seeon, Germany, 2–4 June 2025; pp. 96–105.
39. Santana, T.S.; Kudo, T.N.; Bulcao-Neto, R.F. Undergraduates’ perspective on a pedagogical architecture to requirements engineering education. In Proceedings of the XXXVII Brazilian Symposium on Software Engineering, Campo Grande, Brazil, 25–29 September 2023; pp. 422–431.
40. Wünsche, B.C.; Hooper, S.; Whalley, J.; Straand, I.; Denny, P.; Crow, T.; Lange-Nawka, D.; Luxton-Reilly, A.; Thompson, S.E.R. Leveling up Learning: Serious Games for Computing Education - Long-Term Opportunities and Risks. In Proceedings of the 27th Australasian Computing Education Conference, Brisbane, Australia, 12–13 February 2025; pp. 134–143.
41. Barbosa Monteiro, R.H.; De Almeida Souza, M.R.; Bezerra Oliveira, S.R.; Dos Santos Portela, C.; De Cristo Lobato, C.E. The Diversity of Gamification Evaluation in the Software Engineering Education and Industry: Trends, Comparisons and Gaps. In Proceedings of the 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering Education and Training (ICSE-SEET), Virtual, 25–28 May 2021; pp. 154–164.
42. Frankford, E.; Antensteiner, T.; Vierhauser, M.; Sauerwein, C.; Wallner, V.; Groher, I.; Plösch, R.; Breu, R. A Survey on Feedback Types in Automated Programming Assessment Systems. ACM Trans. Comput. Educ. 2025, 26, 1–35.
43. Fernandez-y Fernandez, C.A.; Sánchez-Soto, E.; Cisnero, J.R.A.; Juárez-Ramírez, R. Exploring the Frontier of Software Engineering Education with Chatbots. Program. Comput. Softw. 2024, 50, 796–815.
44. Omeh, C.B.; Ayanwale, M.A.; Mnguni, L.E.; Olelewe, C.J. Fostering programming skill and critical thinking through AI-assisted PBL integration. J. New Approaches Educ. Res. 2025, 14, 22.
45. Zhu, M.; Zhang, K. Artificial Intelligence for Computer Science Education in Higher Education: A Systematic Review of Empirical Research Published in 2003–2023. Technol. Knowl. Learn. 2025, 30, 2417–2441.
46. Li, S.; Zeng, C.; Liu, H.; Jia, J.; Liang, M.; Cha, Y.; Lim, C.P.; Wu, X. A meta-analysis of AI-enabled personalized STEM education in schools. Int. J. STEM Educ. 2025, 12, 58.
47. Triantafyllou, S.A.; Sapounidis, T.; Stamovlasis, D. Gamification and Computational Thinking in Education: A Review and a Meta-Analysis. Technol. Knowl. Learn. 2025.
48. Ishaq, K.; Alvi, A.; Haq, M.I.U.; Rosdi, F.; Choudhry, A.N.; Anjum, A.; Khan, F.A. Level up your coding: A systematic review of personalized, cognitive, and gamified learning in programming education. PeerJ Comput. Sci. 2024, 10, e2310.
49. Annunziata, G.; Lambiase, S.; Palomba, F.; Ferrucci, F. SERGE - Serious Game for the Education of Risk Management in Software Project Management. In Proceedings of the 46th International Conference on Software Engineering: Software Engineering Education and Training, Lisbon, Portugal, 14–20 April 2024; pp. 264–273.
50. Lima Caminha, L.O.; Marques, A.B. D-LEARN: A digital game for Software Architecture education. In Proceedings of the 20th Brazilian Symposium on Information Systems, Juiz de Fora, Brazil, 20–23 May 2024; pp. 1–10.
51. Bucchiarone, A.; Savary-Leblanc, M.; Le Pallec, X.; Cicchetti, A.; Gérard, S.; Bassanelli, S.; Gini, F.; Marconi, A. Gamifying model-based engineering: The PapyGame experience. Softw. Syst. Model. 2023, 22, 1369–1389.
52. Espinha Gasiba, T.; Andrei-Cristian, I.; Lechner, U.; Pinto-Albuquerque, M. Raising Security Awareness of Cloud Deployments using Infrastructure as Code through CyberSecurity Challenges. In Proceedings of the 16th International Conference on Availability, Reliability and Security, Vienna, Austria, 17–20 August 2021; pp. 1–8.
53. Wang, A.Y.; Chen, Y.; Chung, J.J.Y.; Brooks, C.; Oney, S. PuzzleMe: Leveraging Peer Assessment for In-Class Programming Exercises. Proc. ACM Hum. Comput. Interact. 2021, 5, 1–24.
54. Branthôme, M.; Lallé, S. Impact of Adaptive Feedback on Learning Programming with a Serious Game in High Schools’ Classes. In Proceedings of the 33rd ACM Conference on User Modeling, Adaptation and Personalization, New York, NY, USA, 16–19 June 2025; pp. 104–113.
55. Maskeliūnas, R.; Damaševičius, R.; Blažauskas, T.; Swacha, J.; Queirós, R.; Paiva, J.C. FGPE+: The Mobile FGPE Environment and the Pareto-Optimized Gamified Programming Exercise Selection Model - An Empirical Evaluation. Computers 2023, 12, 144.
56. Alshalabi, I.A.; Alrawashdeh, T.; AbuKaraki, A.; Alksasbeh, M.Z. A Mobile-Enabled Adaptive Gamification Framework for Programming Education. Int. J. Interact. Mob. Technol. (IJIM) 2025, 19, 42–69.
57. Agbo, F.J.; Oyelere, S.S.; Suhonen, J.; Laine, T.H. Co-design of mini games for learning computational thinking in an online environment. Educ. Inf. Technol. 2021, 26, 5815–5849.
58. Vieira, B.R.; Santos, C.Q. Learning through play: Design and creation of a narrative Game-Based Learning experience. In Proceedings of the XXII Brazilian Symposium on Human Factors in Computing Systems, Maceió, Brazil, 16–20 October 2023; pp. 1–11.
59. Albaghajati, A.; Hassine, J. A use case driven approach to game modeling. Requir. Eng. 2022, 27, 83–116.

Figure 1. PRISMA Flow Diagram summarizing the study selection process.

Figure 2. Distribution of primary studies by year of publication (2020–2025).

Figure 3. Distribution of the three architectural patterns identified in the literature (RQ1).

Figure 4. Data analysis techniques: The shift from result-oriented evaluation to process-oriented mining (RQ2).

Figure 5. Temporal evolution of adaptation mechanisms: The rapid emergence of GenAI (RQ3).

Figure 6. Hierarchy of educational impacts: Behavioral Engagement vs. Cognitive Performance (RQ4).

Figure 7. UML Reference Architecture for the ‘Closed-Loop’ Adaptive System (RQ5).

Table 1. Breakdown of automated search results per database.

Digital Library	Raw Search Hits	Records Removed (Automation Filters)	Records Remaining
ACM Digital Library	292	202	90
SpringerLink	144	73	71
Scopus	38	29	9
Total	474	304	170
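
The arithmetic in Table 1 can be re-derived with a few lines of Python. This is only a consistency check on the published figures; the variable names are illustrative and not part of the review protocol.

```python
# Per-database counts copied from Table 1: (raw search hits, records removed).
rows = {
    "ACM Digital Library": (292, 202),
    "SpringerLink": (144, 73),
    "Scopus": (38, 29),
}

# Records remaining = raw hits minus records removed by the automation filters.
remaining = {name: hits - removed for name, (hits, removed) in rows.items()}

total_hits = sum(hits for hits, _ in rows.values())
total_removed = sum(removed for _, removed in rows.values())
total_remaining = sum(remaining.values())

print(remaining)
print(total_hits, total_removed, total_remaining)
```

The recomputed per-database and total values match the "Records Remaining" column and the "Total" row of the table.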

Table 2. Technical trade-offs among Rule-Based, Template-Based, and GenAI-Driven adaptation mechanisms.

Feature	Rule-Based [18]	Template-Based [20]	GenAI-Driven [19]
Mechanism	Deterministic Decision Trees	Feature-Oriented DSL	NLP/Deep Learning
Complexity	Low ($O(n)$ logic)	Moderate (Static parsing)	Very High (Probabilistic)
Robustness	High (Zero hallucinations)	High (Formal constraints)	Moderate (Potential bias)
Personalization	Level-based coarse tuning	Feature-based modularity	Semantic/Intent-aware
Resource Cost	Very Low (4 GB Disk/Standard CPU)	Moderate (High initial modeling)	High (API Tokens/GPU)
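
To make the "Rule-Based" column of Table 2 concrete, the following is a minimal sketch of a deterministic decision tree that performs level-based coarse tuning. It is not taken from any of the reviewed systems: the `LearnerState` fields, the thresholds (0.8 and 0.4), and the five-level scale are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical difficulty scale; the reviewed studies do not prescribe one.
MIN_LEVEL, MAX_LEVEL = 1, 5


@dataclass(frozen=True)
class LearnerState:
    success_rate: float  # fraction of recent exercises solved, in [0, 1]
    hints_used: int      # hints requested on the latest exercise
    level: int           # current difficulty level


def next_level(state: LearnerState) -> int:
    """Deterministic decision tree: identical input always yields the same level."""
    if state.success_rate >= 0.8:      # assumed "mastery" threshold
        # Promote only when the learner succeeded without hints.
        step = 1 if state.hints_used == 0 else 0
    elif state.success_rate < 0.4:     # assumed "struggling" threshold
        step = -1
    else:
        step = 0
    # Clamp to the valid range so the rule set can never leave the scale.
    return max(MIN_LEVEL, min(MAX_LEVEL, state.level + step))


print(next_level(LearnerState(success_rate=0.9, hints_used=0, level=2)))
```

Because every branch is an explicit comparison, the output is fully reproducible and auditable, which is the robustness property the table attributes to rule-based adaptation. The cost is the coarse granularity: the learner can only move one level at a time along a single difficulty axis.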

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
