Motivational Scaffolding Through Digital Gamification in Early Childhood Science Teacher Education: A Design-Based Research Study

Guimerà-Ballesta, Gerard; Jiménez-Valverde, Gregorio; Fabre-Mitjans, Noëlle

doi:10.3390/educsci16060855

by Gerard Guimerà-Ballesta ¹, Gregorio Jiménez-Valverde ¹,²,* and Noëlle Fabre-Mitjans ¹

¹ Department of Language, Science and Mathematics Education (Faculty of Education), EduCiTS Innovation and EMA Research Groups, Universitat de Barcelona, 08035 Barcelona, Spain

² Institut de Recerca en Educació (IREUB), Universitat de Barcelona, 08035 Barcelona, Spain

* Author to whom correspondence should be addressed.

Educ. Sci. 2026, 16(6), 855; https://doi.org/10.3390/educsci16060855 (registering DOI)

Submission received: 30 April 2026 / Revised: 22 May 2026 / Accepted: 26 May 2026 / Published: 29 May 2026

(This article belongs to the Special Issue Learning Futures: Designing for Motivation, Self-Regulation and Success in Evolving Learning Ecosystems)

Abstract

Digital gamification may function as motivational scaffolding within situated course designs when it helps learners perceive progress, participate actively, and connect course activities with meaningful professional goals. This article examines how motivational scaffolds were designed and refined through digital gamification in a fourth-year science course in early childhood teacher education. Using a two-cycle design-based research approach, the study analyzed an initial FantasyClass-supported implementation and a subsequent redesigned version. In Cycle 1, broad affective outcomes, feature ratings, and open responses were associated with more favorable recent learning experiences, somewhat more favorable current views of science and its relevance, and higher perceived science-teaching capability and preparedness. Feature-level evidence identified progression, collaborative work, and narrative coherence as central motivational supports. These findings informed Cycle 2, which recalibrated selected mechanics and strengthened the narrative structure. Post-course Intrinsic Motivation Inventory results were descriptively consistent with above-midpoint enjoyment, perceived competence, and perceived choice, with enjoyment positively associated with competence and choice. Qualitative evidence highlighted active participation, progress awareness, and perceived relevance for future teaching. The findings suggest that, under the design conditions examined here, digital gamification may support motivationally meaningful course design when treated as motivational scaffolding rather than as an isolated reward system.

Keywords:

design-based research; gamification; motivational design; educational technology; teacher education; early childhood education; science education; FantasyClass; intrinsic motivation; narrative

1. Introduction

Preparing preservice early childhood teachers to teach science well involves more than providing disciplinary knowledge and pedagogical techniques. It also requires helping them develop a sufficiently positive relationship with science, confidence in their own capacity to engage with it, and a willingness to see science as a meaningful part of future professional practice (Appleton, 2003; Osborne et al., 2003). In teacher education, these affective and motivational conditions are not peripheral. They influence what preservice teachers value, what they avoid, and how much effort they are willing to invest in science-related planning and instruction (Bravo et al., 2022; Brígido et al., 2013; Membiela et al., 2022; Tytler & Ferguson, 2023). In early childhood education, this is especially important because teachers mediate children’s first encounters with science, and the quality of those early encounters is shaped not only by content knowledge but also by teachers’ confidence, enthusiasm, and readiness to create exploratory learning environments (Sasway & Kelly, 2021; Tytler & Ferguson, 2023).

This challenge is well documented in preservice teacher education. Many preservice teachers arrive at university with fragile or negative relationships with science that are linked to prior schooling experiences marked by transmissive teaching, low personal relevance, and weak opportunities for meaningful participation (Appleton, 2003; Osborne et al., 2003). These trajectories are often accompanied by low self-efficacy, negative emotions, and avoidance, particularly in relation to the sciences perceived as more difficult (Bravo et al., 2022; Brígido et al., 2013). Because preservice teachers’ self-efficacy is not fixed but develops through formative teaching experiences and contextual supports (Klassen & Durksen, 2014), initial teacher education offers an important opportunity to reshape how preservice teachers perceive their capacity to engage with science. In early childhood teacher education, such patterns are particularly concerning because they may persist into professional practice and later translate into reduced instructional time for science, lower pedagogical ambition, or a tendency to treat science as secondary to other domains (Bravo et al., 2019; Sasway & Kelly, 2021). If these beliefs and experiences are not addressed during initial teacher education, they risk being reproduced in future classrooms, thereby reinforcing a cycle of disengagement from science at the earliest educational stages (Appleton, 2003; Tytler & Ferguson, 2023).

For this reason, science-related outcomes in teacher education are better understood as broader than “attitude” in a narrow sense. What matters educationally is a configuration of affective and motivational variables that includes remembered learning experiences, present views of science and its relevance, confidence for future teaching, and willingness to invest effort in science-related work. That broader perspective is especially pertinent in early childhood teacher education, where many students need conceptual support as well as emotionally reparative and professionally empowering experiences if they are to move beyond inherited science avoidance (Bravo et al., 2022; Brígido et al., 2013; Tytler & Ferguson, 2023).

1.1. Gamification as Motivational Design in Teacher Education

Against this background, gamification has emerged as a promising but design-sensitive approach for supporting motivation and engagement in educational contexts. In broad terms, gamification refers to the use of game design elements in non-game settings in order to influence participation, engagement, and goal-directed activity (Deterding et al., 2011; Hamari et al., 2014; Kapp, 2012). Across the field, recent reviews and meta-analyses generally point to positive tendencies, but they also emphasize heterogeneity, design dependence, variable motivational effects, and the need for stronger theoretical and methodological grounding (Khaldi et al., 2023; L. Li et al., 2024; Ratinho & Martins, 2023). This means that the relevant educational question is no longer whether gamification “works” in the abstract, but under what design conditions it can support sustained and educationally valuable forms of engagement.

That point is particularly important in science education. A recent systematic review focused specifically on science education concluded that gamification is often associated with improved motivation, engagement, and learning, but also highlighted strong variability across interventions and a continuing need for more robust research designs and more explicit theoretical grounding (Kalogiannakis et al., 2021). Broader reviews and meta-analyses in educational gamification reach a similar conclusion: positive outcomes are plausible, but they are far from automatic, and their magnitude depends on design configuration, duration, educational context, learner experience, and outcome measures (Khaldi et al., 2023; M. Li et al., 2023; Ratinho & Martins, 2023; Zainuddin et al., 2020). In higher education, longitudinal evidence suggests that gamification may support educational outcomes when it is carefully calibrated and sustained over time (Putz et al., 2020). This design sensitivity is especially relevant in teacher education, where the aim is not simply to increase short-term participation, but to create motivationally meaningful experiences that preservice teachers may later recognize as pedagogically transferable.

A useful framework for understanding this issue is Self-Determination Theory (SDT), one of the most recurrent theoretical frameworks for explaining motivation in gamification, serious games, and game-based learning research (Deci & Ryan, 1985, 2000; Krath et al., 2021; Ryan & Deci, 2000, 2020). SDT distinguishes between more controlled and more autonomous forms of motivation and argues that high-quality engagement flourishes when three basic psychological needs are supported: competence, autonomy, and relatedness (Deci & Ryan, 1985, 2000). Competence refers to feeling effective when facing meaningful challenges; autonomy refers to a sense of volition and ownership over one’s actions; and relatedness refers to feeling connected to others within a shared activity (Ryan & Deci, 2000, 2020). Importantly, SDT also offers a way to think beyond a rigid intrinsic–extrinsic dichotomy: educational activities that are not initially interesting for their own sake may nevertheless become increasingly self-endorsed when learners perceive them as personally or professionally valuable, a process of internalization that is especially relevant in teacher education (Deci et al., 1994; Ryan & Deci, 2017).

This makes SDT particularly useful for studying gamification. It shifts attention away from the simplistic assumption that points, rewards, or badges are motivating in themselves and toward the question of whether a learning environment supports or frustrates students’ psychological needs. In game contexts, need satisfaction has been consistently linked to enjoyment and sustained engagement (Ryan et al., 2006). In educational gamification, more fine-grained studies show that different gameful features do not operate in the same way; more recent experimental work has likewise treated elements such as progress bars, narrative, feedback, and badges as distinguishable design components rather than as an undifferentiated gamified package (Mazarakis & Bräuer, 2023; Sailer et al., 2017). Sailer et al. (2017) demonstrated experimentally that different configurations of game design elements have different psychological effects: achievement-oriented elements such as badges, leaderboards, and performance graphs supported competence and task meaningfulness, whereas avatars, meaningful stories, and teammates more strongly supported relatedness. Xi and Hamari (2019) likewise showed that achievement-related, social, and immersion-related gamification features have different relationships with autonomy, competence, and relatedness need satisfaction. In the same vein, Van Roy and Zaman (2019) argued that gamification has an ambivalent motivational power, because the very same elements can either support or thwart basic psychological needs depending on the situations in which learners experience them.

This evidence also helps explain why findings on gamification remain mixed. A recent meta-analysis and systematic review by L. Li et al. (2024) found that gamification has a small but significant positive effect on students’ intrinsic motivation, together with clearer positive effects on perceived autonomy and relatedness, but only minimal impact on competence. Their systematic review further identified two recurring challenges in gamified classes: students’ lack of perceived competence and lack of perceived autonomy. This pattern is highly relevant for teacher education, where fragile science-related confidence, perceived lack of agency, and externally regulated engagement can undermine the intended benefits of innovation. These findings also help explain why the design of a gamified course should be treated as an educational design problem rather than as the mere addition of game elements to conventional teaching.

Related evidence warns against reducing gamification to reward accumulation. Mekler et al. (2017), for example, found that points, levels, and leaderboards increased the quantity of task performance but did not significantly improve either competence or intrinsic motivation. This suggests that common reward-oriented mechanics may function as extrinsic incentives without necessarily improving the quality of motivation. Gamification is therefore better understood as a form of motivational design: its educational value depends on how mechanics, feedback loops, social organization, and thematic framing are configured to support agency, competence, and social connection over time (Grabner-Hagen & Kingsley, 2023; Van Roy & Zaman, 2019).

This perspective is especially important in teacher education because prospective teachers are simultaneously learners and emerging pedagogical decision-makers. Recent work with preservice primary teachers also suggests that gamification can be positively perceived as an active teaching methodology (Colomo-Magaña et al., 2024). A well-designed gamified course may therefore influence both how they experience science learning in the present and how they imagine science teaching as part of their future professional repertoire. For this study, the relevant issue is not whether gamification works in the abstract, but how a science teacher education course can be designed so that its mechanics, feedback structures, social organization, and course-level coherence support competence, autonomy, and meaningful participation.

1.2. Narrative, Coherence, and Meaning in Gamified Science Learning

Among the many resources available within gamification, narrative is especially relevant for science education. Narrative approaches have long been recognized as powerful pedagogical tools because they help organize information temporally and causally, provide emotional resonance, and connect abstract ideas to more meaningful contexts (Bruner, 1991; Negrete & Lartigue, 2004). In science education, where students often encounter specialized language, fragmented concepts, and content that feels detached from their lived experience, narrative can function as a bridge between formal knowledge and meaningful sense-making (Avraamidou & Osborne, 2009; Millar & Osborne, 1998).

This pedagogical value has been discussed from several complementary perspectives. Bruner (1991) argued that narrative is a fundamental mode of thought through which people organize experience and construct meaning. In science education, this insight has supported the claim that stories can help learners connect scientific ideas to purpose, causality, human action, and memorable situations rather than encountering them as isolated informational fragments (Avraamidou & Osborne, 2009; Kokkotas et al., 2010). Other scholars have highlighted that narrative supports memory and engagement because stories are easier to retain and emotionally process than decontextualized expository sequences (Negrete & Lartigue, 2004; Rowcliffe, 2004). In science classrooms, narrative has also been used to make highly abstract or multilevel ideas more meaningful, not only by contextualizing them, but by helping students position scientific content within coherent, experience-near frames of interpretation (Boström, 2008; Millar & Osborne, 1998). In this sense, the pedagogical potential of narrative aligns with broader efforts to humanize science, make it more accessible, and situate it within social and personal contexts that matter to learners (Avraamidou & Osborne, 2009; Boström, 2008). More recent reviews have reinforced this view, arguing that narrative can act simultaneously as a cognitive scaffold and a motivational resource in science learning (Jiménez-Valverde, 2025; Soares et al., 2023).

These affordances become even more significant in gamified environments. One recurring problem in gamification is fragmentation: mechanics such as points, rewards, tasks, and optional features can remain disconnected from one another and from the educational purpose of the course. Narrative offers a way of counteracting that risk by giving students a sense of continuity and by linking activities to a broader shared purpose. In gamified learning, narrative can therefore function as an organizing spine that gives coherence to course activities. Koivisto and Hamari (2019) noted that gamification research still lacks consistent models of how different features and experiences are connected, and more recent reviews similarly emphasize the importance of theoretically grounded combinations of game elements (Khaldi et al., 2023). Within this design problem, narrative coherence is especially relevant because it can help students experience otherwise discrete mechanics as part of a coherent course-level trajectory (Sailer et al., 2017).

There is also empirical evidence that narrative is not merely aesthetic but can be psychologically consequential when it shapes learners’ experience of the gameful environment. Bormann and Greitemeyer (2015) showed that in-game storytelling facilitated immersion, and that this immersion, in turn, was associated with enhanced autonomy and relatedness need satisfaction. This finding is especially relevant to educational gamification because it suggests that narrative may strengthen the motivational potential of an environment by helping learners feel more agentic, more connected, and more embedded in a meaningful world of action. More recent educational evidence extends this argument to fantasy-based gamification: Bai et al. (2022), in a design-based study in higher education, found that incorporating fantasy into a gamified course promoted student learning and the quality of online interaction. These findings suggest that narrative and fantasy are pedagogically valuable when they shape activity, feedback, and participation, not when they remain at the level of surface decoration. In that sense, narrative and gamification are not separate resources but potentially synergistic ones: narrative gives meaning to mechanics, while mechanics give form and consequence to narrative.

Related work in science teacher education supports this synergy. Emerging studies in preservice teacher education suggest that structurally gamified and narrative-rich courses can foster more favorable science-related dispositions, higher enjoyment, stronger perceptions of competence, and greater investment in learning when the design is coherent and aligned with motivational principles (Jiménez-Valverde et al., 2025b, 2026; Sánchez-Martín et al., 2017). However, this work also indicates that such effects are not automatic: neither a fantasy frame nor a reward system is sufficient if the environment does not also make progress intelligible, support collaboration, and connect academic tasks to a meaningful course-level trajectory.

For this reason, narrative should not be treated as an optional add-on to a gamified course. In science teacher education, where many students arrive with prior negative experiences and fragile confidence, the question is not simply whether narrative is “fun”, but whether it helps transform science learning into something more coherent, more approachable, and more professionally relevant. For the present study, narrative is therefore conceptualized as a coherence device that can link tasks, sustain continuity across the semester, and make the motivational structure of the course more intelligible to students. Whereas previous studies have documented affective or motivational benefits of gamified science courses in teacher education, much less is known about how evidence from an initial implementation can be used to redesign a course and refine the evaluative lens applied to it.

1.3. Study Aim and Research Questions

To address this gap, the study examines a digitally mediated, structurally gamified science course in early childhood teacher education supported by FantasyClass. Specifically, it documents a two-cycle process in which a broad affective diagnosis and feature-level analysis in Cycle 1 informed targeted redesign decisions, followed by a more focused motivational assessment in Cycle 2. The primary contribution of this article is design-oriented: it documents how evidence from an initial gamified implementation informed redesign decisions concerning visible progression, collaboration, narrative coherence, and ancillary mechanics. Empirically, the study reports broad affective patterns and experiential contrasts in Cycle 1 and a focused motivational profile in Cycle 2. Methodologically, it illustrates how the evaluative focus of a two-cycle course refinement process can shift from broad affective diagnosis to a more targeted assessment of motivational quality.

The article addresses four research questions:

RQ1. What broad affective patterns and experiential contrasts were observed during the initial implementation?
RQ2. Which implemented gamification features and platform-level aspects were perceived as most and least motivating by participants in the initial implementation?
RQ3. How did those findings inform the redesign of the course in the subsequent iteration?
RQ4. What motivational profile emerged after redesign, as assessed through intrinsic motivation measures and qualitative accounts?

RQ1 and RQ2 serve a broad diagnostic purpose in Cycle 1, RQ3 addresses the redesign logic between cycles, and RQ4 examines the motivational profile of the redesigned course in Cycle 2. The two cycles are therefore linked as successive stages of the same design-and-redesign process, but they are not metrically equivalent assessments of the same construct. Accordingly, the study should not be read as a direct pre–post comparison across academic years.

2. Methodology

The study adopted design-based research (DBR) as its overarching methodological approach. DBR is widely characterized as an interventionist and iterative form of educational inquiry in which the design of a learning environment, its enactment in an authentic setting, the analysis of evidence, and subsequent redesign are tightly intertwined (Anderson & Shattuck, 2012; Brown, 1992; Design-Based Research Collective, 2003; Hoadley & Campos, 2022). Rather than asking only whether an intervention ‘works’, DBR is concerned with how and why a design functions in practice, how it can be refined across cycles, and what transferable design knowledge can be generated from that process (Cobb et al., 2003; Guisasola, 2024; Lehrmann et al., 2022; McKenney & Reeves, 2021; Tinoca et al., 2022). Here, DBR was operationalized through two successive course implementations: an initial diagnostic cycle and a refined cycle focused on the motivational experience of the redesigned course. Consistent with the SDT framework developed above, the main design features were interpreted as provisional motivational supports: progression and feedback were expected to support perceived competence, opportunities for choice and agency to support autonomy, group-based work to support relatedness, and narrative coherence to make the course trajectory more meaningful and intelligible.

In the present case, the intervention was a digitally mediated, structurally gamified science course implemented in two successive academic years (2022–2023 and 2023–2024) in the same fourth-year elective subject of the Early Childhood Education degree, Biological Development of the Child and Didactic Intervention (BDCDI), at the University of Barcelona. This 6-ECTS course addresses the biological development of young children and the didactic translation of that knowledge into early childhood teaching practice. The course combines theory–practice sessions, small-group tutoring, and laboratory work, and explicitly links scientific understanding to the planning of classroom activities and didactic projects in early childhood education.

Within this stable curricular context, the DBR process unfolded across two successive cycles that shared the same broad pedagogical aim—strengthening favorable dispositions toward science and supporting future teaching readiness—while differing in design intensity and evaluative focus. The first cycle fulfilled an exploratory and diagnostic function using a broad affective instrument, ratings of implemented gamification features, and a written open response. The second cycle implemented evidence-informed refinements to the gamified course and narrowed the evaluation focus to intrinsic motivation, assessed through a dedicated post-course instrument and complemented by an open-ended question. The two cohorts were independent natural class groups from successive academic years. The study is therefore reported as a single DBR process composed of two analytically differentiated but methodologically linked cycles.

Because gamification was integrated into the ordinary course design, a clear distinction was established between participation in course activities and research participation. Completing course activities, earning experience points (XP), progressing through levels, and accessing rewards formed part of the regular learning and assessment structure of BDCDI. By contrast, completing the research questionnaires and open-ended responses was voluntary and had no effect on grades, XP, level progression, access to rewards, academic standing, or treatment by the teaching team. Students were explicitly informed of this distinction before data collection, received written information about the study, and signed an informed consent form. They were also informed that they could decline participation, withdraw from the research component, or leave items unanswered without academic consequences. The questionnaires were administered digitally by the teaching–research team through forms separate from the course assessment records. The Cycle 1 pretest was administered at the beginning of the course; the post-course questionnaires and open-ended responses were administered after students had experienced the relevant course activities but before final grades were communicated. Questionnaire responses were nevertheless not consulted or used for grading purposes, and all research data were analyzed only after course grades had been communicated to students.

Research responses were reported only in aggregate form or through anonymized quotations. Because pretest–post-test matching in Cycle 1 required a participant identifier, complete anonymity at the point of data collection was not possible in that cycle; therefore, identifiers were used only for matching and were replaced with codes before analysis. By contrast, Cycle 2 data were collected anonymously because no pretest–post-test matching was required. The research dataset was thus de-identified before analysis in Cycle 1 and anonymous from collection in Cycle 2. These procedures were intended to reduce perceived coercion, social desirability, and power imbalance, although such risks cannot be fully eliminated in classroom-based research conducted within a course designed and implemented by the teaching team.
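The matching-then-coding step described for Cycle 1 can be illustrated with a minimal Python sketch. It pairs pretest and post-test records on a participant identifier and then substitutes a random code for that identifier. The field names (`student_id`, `responses`), the code format, and the function name are hypothetical, and the sketch is not a description of the software or procedure actually used by the teaching–research team.

```python
import secrets


def match_and_deidentify(pretest, posttest):
    """Pair pre/post records by identifier, then drop the identifier.

    `pretest` and `posttest` are lists of dicts with the hypothetical keys
    'student_id' and 'responses'. Only participants present in both
    administrations are kept, because matching needs both records.
    """
    post_by_id = {rec["student_id"]: rec["responses"] for rec in posttest}
    matched = []
    used_codes = set()
    for rec in pretest:
        sid = rec["student_id"]
        if sid not in post_by_id:
            continue  # no post-test record, so the pair cannot be matched
        # Draw a random code that carries no information about the identifier.
        code = "P" + secrets.token_hex(4)
        while code in used_codes:
            code = "P" + secrets.token_hex(4)
        used_codes.add(code)
        matched.append(
            {"code": code, "pre": rec["responses"], "post": post_by_id[sid]}
        )
    return matched
```

In a workflow of this kind, the analysis file would contain only the coded records, and the list linking identifiers to codes would not be retained.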

This article is organized accordingly. After this general methodological description, Cycle 1 is presented as an exploratory and diagnostic iteration, followed by a section that documents how its findings informed redesign decisions. Cycle 2 is then presented as the refined iteration. The final discussion integrates both cycles and formulates provisional design principles for gamification and motivational design in educational technology within early childhood science teacher education.

3. Cycle 1: Exploratory and Diagnostic Iteration

The purpose of Cycle 1 was twofold: first, to determine whether participating in a structurally gamified course was associated with more favorable affective outcomes related to science and science teaching; and second, to identify which elements of the gamified design appeared to function as core motivational supports and which elements generated weaker responses or friction. The cycle therefore combined pretest–post-test quantitative analysis with post-course feature ratings and qualitative written reflections.

3.1. Cycle 1 Participants

The Cycle 1 cohort comprised 28 preservice early childhood teachers enrolled in BDCDI during the first year of the project. The group was predominantly female (27 women and 1 man), with a mean age of 22.14 years (SD = 1.48; median = 22). Eight participants (28.6%) reported a prior science-oriented upper-secondary track. The cohort should be understood as a convenience sample drawn from the natural class group, which is consistent with the situated and interventionist logic of DBR in teacher education (Anderson & Shattuck, 2012; Lehrmann et al., 2022).

3.2. Initial Implementation Context and Intervention Design

In Cycle 1, BDCDI was implemented through structural gamification supported by FantasyClass, a configurable web-based platform that enables teachers to overlay a gameful architecture onto existing course content through points, levels, rewards, inventories, teams, and narrative framing (Jiménez-Valverde et al., 2026). The core feedback ecology of the first implementation revolved around three point systems: health points (HP), experience points (XP), and gold coins. At the beginning of the course, students created avatars, represented as astronaut-like explorers adapted to the science-fiction theme of Cycle 1 (Figure 1). HP reflected classroom conduct and participation and could increase or decrease depending on students’ behavior and some game events; XP indexed academic progression and were awarded according to the quality of completed tasks, including optional tasks; and gold coins functioned as an in-course currency that students could spend in the virtual shop. Accumulated XP allowed students to level up, and the final level contributed up to 40% of the course grade, making progression both motivationally salient and assessment-relevant. The gamified tasks and level progression formed part of the ordinary course assessment. As explained in Section 2, this assessment function was separated from the voluntary research data collection: completing questionnaires or open-ended prompts did not generate XP, level progression, rewards, penalties, or grade consequences.
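To make the progression logic concrete, the following Python sketch shows one way in which accumulated XP could be mapped to a level and the final level to a grade contribution. Only the 40% ceiling comes from the course description; the XP thresholds, the number of levels, and the linear level-to-grade mapping are assumptions introduced for illustration and do not reproduce the actual FantasyClass configuration.

```python
from bisect import bisect_right

# Hypothetical thresholds: reaching LEVEL_THRESHOLDS[i] XP unlocks level i + 1.
LEVEL_THRESHOLDS = [0, 100, 250, 450, 700, 1000]

# From the course description: the final level contributes up to 40% of the grade.
MAX_GRADE_SHARE = 40.0


def level_for_xp(xp):
    """Return the highest level whose XP threshold has been reached."""
    return bisect_right(LEVEL_THRESHOLDS, xp)


def grade_contribution(xp):
    """Map the final level to grade points (assumed linear, capped at 40)."""
    max_level = len(LEVEL_THRESHOLDS)
    return MAX_GRADE_SHARE * level_for_xp(xp) / max_level
```

Under these assumed thresholds, a student cannot exceed the 40-point ceiling however much XP is accumulated, because the level itself is capped at the last threshold.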

Around this points-and-progression backbone, a second layer of supporting mechanics expanded the gameful ecology. Students could obtain or purchase cards (Figure 1), activate skills, adopt virtual pets, open mysterious chests, spin the fortune wheel, and participate in wordlets. The virtual shop allowed them to exchange gold coins for items with concrete in-course consequences, while micro-recognition features such as badges or symbolic rewards enriched the experience to varying degrees. A more detailed description of these FantasyClass features is provided elsewhere (Jiménez-Valverde et al., 2026). These ancillary mechanics did not all play an equally central role in the intervention, but they were sufficiently present to be included in the Cycle 1 post-course feature questionnaire as individually rated elements.

The narrative frame positioned students as astronaut-explorers in a fictional world called the ‘Lost Biological Kingdom’, divided into thematic domains that paralleled blocks of the course. An introductory video presented the mission and connected the semester’s tasks to a broader adventure. Students worked in stable groups of four or five throughout the course, which enabled collaboration-based mechanics and, in some tasks, light inter-group competition. Throughout the course, students faced unpredictable situations where they could gain or lose points, such as random events that introduced unexpected bonuses or penalties, and whole-class monster battles. These battles, often implemented through quiz-like challenges, connected course revision and content application to collective play. Success in these challenges rewarded students with XP needed for leveling up, while failure resulted in the loss of HP. This risk-and-reward dynamic gave functional meaning to the game’s inventory. Accordingly, some cards and skills were specifically designed to reinforce group interdependence: for example, a healing skill that restored HP to the whole team, or a card that allowed an action to be reversed after a point loss. Overall, the initial design combined a clear progression system, multiple optional reward channels, group-based participation, and a light narrative frame that provided occasional thematic context for selected activities.
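The risk-and-reward dynamic of the battles can be sketched in a few lines of Python. The sketch models only what the description states: success yields XP, failure costs HP, and a healing skill restores HP to every member of a team. The `Student` class, the starting and maximum HP, and the reward, penalty, and healing amounts are all hypothetical and are not taken from the platform.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    hp: int = 10      # health points (starting value is an assumption)
    xp: int = 0       # experience points
    max_hp: int = 10  # assumed ceiling for HP


def resolve_battle(group, answered_correctly, xp_reward=50, hp_penalty=2):
    """Apply one battle outcome to every student in the group.

    Success grants XP; failure removes HP, never dropping below zero.
    The reward and penalty sizes are hypothetical.
    """
    for s in group:
        if answered_correctly:
            s.xp += xp_reward
        else:
            s.hp = max(0, s.hp - hp_penalty)


def team_heal(team, amount=3):
    """Healing skill: restore HP to the whole team, capped at max_hp."""
    for s in team:
        s.hp = min(s.max_hp, s.hp + amount)
```

Because `team_heal` acts on all members rather than on the student who activates it, the sketch reflects the group interdependence that such skills were designed to reinforce.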

3.3. Cycle 1 Instruments

Cycle 1 used three data sources. The first was a 22-item Likert questionnaire previously developed and validated (Jiménez-Valverde et al., 2024) as a broad diagnostic instrument for examining preservice teachers’ affective positioning toward science and science teaching. The validation study reported that the instrument was constructed by selecting and adapting items from two previously validated Spanish-language instruments on science attitudes and motivation. Its content and three-block structure were reviewed by a panel of three experts in science education, who confirmed the adequacy and coherence of the items within the proposed structure. The validation study also reported satisfactory internal consistency for the three components later used in the present study as Blocks 1–3 (α = 0.851, 0.827, and 0.792, respectively), and all corrected item–total correlations exceeded the conventional 0.30 threshold.

In the present study, Block 1 (items 1–8) addresses science learning experiences; Block 2 (items 9–16) addresses current attitudes toward science and perceptions of its relevance; and Block 3 (items 17–22) addresses future self-efficacy for teaching science and the professional value attributed to teaching science in early childhood education. Because the original instrument was developed for preservice teachers’ attitudes and motivation toward physics and chemistry and their teaching, item wording was adapted only where necessary to refer to the science-learning and science-teaching context of BDCDI. This adaptation supports contextual relevance, but it also means that the Cycle 1 results should be interpreted as exploratory and course-specific. Accordingly, the questionnaire was analyzed by blocks, and no global questionnaire score was computed.

The questionnaire was administered at the beginning and at the end of the course. However, the meaning of the pre–post comparison is not identical across the three blocks. Whereas Blocks 2 and 3 retained the same conceptual referent across both administrations, Block 1 referred to different experiential frames. Within Block 1, the pretest captured prior learning experiences with science, whereas the post-test referred to the science-learning experience in BDCDI; accordingly, this block is interpreted as a contrast between previous and recent science learning experience rather than as a strictly invariant longitudinal measure. To preserve semantic parallelism while shifting the experiential frame, Block 1 items were minimally reworded between administrations, and an illustrative example of this adaptation is provided in Appendix A. Internal consistency estimates in this study were acceptable for Blocks 1 and 2 and modest but interpretable for Block 3, which contains fewer items and more heterogeneous content. Cronbach’s alpha was 0.792 (pre) and 0.685 (post) for Block 1, 0.887 and 0.790 for Block 2, and 0.623 and 0.653 for Block 3. Given the exploratory and DBR-oriented nature of the first cycle, these values were considered sufficient for cautious, exploratory block-level interpretation.
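For readers who wish to reproduce this kind of estimate, a minimal sketch of Cronbach's alpha follows. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores). The function name and the five-student response matrix are illustrative assumptions, not the study's data or the software used by the authors.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per participant, all of equal length."""
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's summed score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: five students, three items (not study data).
demo_rows = [[4, 5, 4], [3, 3, 4], [5, 5, 5], [2, 3, 2], [4, 4, 5]]
demo_alpha = cronbach_alpha(demo_rows)
```

Because alpha depends on the number of items as well as on inter-item covariance, a shorter block such as Block 3 can yield a lower coefficient even when its items are reasonably related.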

The second data source was a post-course questionnaire in which students rated both the overall FantasyClass experience and the perceived motivational value of specific gamification features. The overall FantasyClass rating was treated as a platform-level item, whereas the remaining items referred to feature-level components of the gamified design. Ratings were expressed on a five-point ordinal rating scale of perceived motivational value (1 = not motivating, 2 = slightly motivating, 3 = moderately motivating, 4 = motivating, 5 = highly motivating). Feature-level items included progression-related mechanics (e.g., XP, leveling up), social components (e.g., group work), narrative, economy-related features, random events, and ancillary mechanics such as badges, skills, pets, cards, chests, and curses. The third data source was an open-ended prompt asking students to explain how they perceived their attitude and motivation toward science in relation to the gamified course experience.

3.4. Cycle 1 Data Analysis

All 28 students completed the pre–post affective questionnaire. Valid responses for the post-course feature ratings varied by item because of occasional missing responses; analyses were therefore conducted with the available valid responses for each item, without imputation. All 28 students also provided substantive open-ended responses; thematic frequencies in Appendix C are therefore reported as percentages of the full Cycle 1 cohort. No imputation was applied to missing or non-substantive qualitative responses.

Cycle 1 quantitative analyses were conducted at two levels. First, participant-level block scores were computed as the mean of the items belonging to each affective block, and pretest–post-test differences were examined using the Wilcoxon signed-rank test. Holm adjustment was applied across the three block-level comparisons. Second, item-level pretest–post-test comparisons were conducted within each block using Wilcoxon tests with block-wise Holm adjustment. The normality of block-level change scores was examined using the Shapiro–Wilk test. Whereas Block 1 and Block 3 did not show significant departures from normality, Block 2 did (W = 0.775, p < 0.001). Given this result, together with the ordinal nature of the questionnaire responses, all descriptive summaries and inferential analyses in Cycle 1 were conducted and reported using non-parametric statistics. Following current reporting recommendations in educational research, effect sizes were expressed as matched-pairs rank-biserial correlations (r_rb), interpreted as very small (<0.10), small (0.10–0.29), moderate (0.30–0.49), and large (≥0.50) (López-Martín & Ardura-Martínez, 2023). Feature ratings were summarized descriptively using medians, interquartile ranges, and percentages of high endorsement (ratings ≥ 4).
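The block-level procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the matched-pairs rank-biserial correlation is computed as (W+ − W−)/(W+ + W−), pairs with zero difference are dropped before ranking, and the boundaries between effect-size labels are taken as 0.10, 0.30, and 0.50. The article does not name the software used, and all function names, scores, and p-values below are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def rank_biserial(pre, post):
    """Matched-pairs rank-biserial correlation: (W+ - W-) / (W+ + W-)."""
    # Assumption: pairs with no change are dropped before ranking.
    diffs = [after - before for before, after in zip(pre, post) if after != before]
    if not diffs:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for step, index in enumerate(order):
        candidate = min(1.0, (m - step) * p_values[index])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[index] = running_max
    return adjusted


def effect_label(r_rb):
    """Labels from the thresholds quoted in the text, applied to |r_rb|."""
    size = abs(r_rb)
    if size < 0.10:
        return "very small"
    if size < 0.30:
        return "small"
    if size < 0.50:
        return "moderate"
    return "large"


# Hypothetical block scores for six students (not study data).
pre_scores = [3.0, 3.5, 4.0, 2.5, 3.0, 4.0]
post_scores = [3.5, 4.0, 4.0, 3.5, 2.75, 4.5]
r_rb = rank_biserial(pre_scores, post_scores)

# Hypothetical raw p-values for the three block-level tests.
adjusted_p = holm_adjust([0.004, 0.030, 0.020])
```

The sketch does not compute the Wilcoxon p-values themselves; in practice these would come from a statistics package (for example, `scipy.stats.wilcoxon`) and then be passed to the Holm step, once across the three blocks and once per block for the item-level tests.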

The qualitative data were analyzed using reflexive thematic analysis (Braun & Clarke, 2006). In Cycle 1, the qualitative corpus consisted of 28 brief individual written responses to a single open-ended prompt. The analysis was primarily semantic, because the aim was to identify how students explicitly described their motivational experience, perceived changes in their relationship with science, and reactions to specific gamification features. At the same time, theme development involved interpretive refinement in order to connect students’ accounts with the broader design questions of the study. Responses were first read repeatedly by one author for familiarization, then coded inductively, and subsequently grouped into candidate themes. During this initial phase, codes remained close to participants’ wording, for example references to visible progress, rewards, enjoyment, group work, professional usefulness, or frustration with particular mechanics. Candidate themes were then discussed by the research team as a reflexive audit of coherence and interpretive adequacy, rather than as an inter-rater reliability procedure. In this audit, candidate themes were checked against the full set of responses, overlapping codes were merged, overly broad themes were split or renamed, and less positive or more ambivalent responses were deliberately considered as potential negative or deviant cases. These included comments indicating that gamification was not the main source of motivation for every student, or that some mechanics generated frustration or unclear usefulness. The themes were reviewed against the dataset and refined to ensure internal coherence and conceptual distinction. Representative quotations were translated into English for reporting purposes, preserving the substantive meaning of the original responses. Translations were checked against the original-language responses to ensure that the reported quotations preserved participants’ intended meaning. 
Given the situated nature of the project, the qualitative strand was interpreted reflexively rather than as a search for coder-independent theme reliability. In this mixed-methods DBR context, frequencies are reported only as orienting indicators of thematic prominence, not as quasi-quantitative evidence of prevalence or importance (Bakker, 2018; Braun & Clarke, 2006). Accordingly, the qualitative findings are used to contextualize and interpret the quantitative patterns rather than to establish independent estimates of effect or prevalence.

3.5. Cycle 1 Results

Cycle 1 results are organized around the three data sources used in the exploratory and diagnostic iteration: the affective questionnaire, the post-course ratings of the overall FantasyClass experience and specific gamification features, and students’ open-ended accounts of the course. This sequence reflects the diagnostic role of Cycle 1, moving from broad affective positioning to platform- and feature-level appraisal and qualitative interpretation.

3.5.1. Cycle 1 Affective Profile

At the block level, all three affective blocks showed statistically significant pretest–post-test differences after Holm correction (Table 1). In Block 1 (learning experiences), the evaluation of the BDCDI course was more favorable than the evaluation of prior science-learning experiences (pre-Mdn = 3.38, post-Mdn = 3.75). Block-level scores also increased in Block 2 (pre-Mdn = 4.00, post-Mdn = 4.19), which concerned current attitudes toward science and perceived relevance. Finally, Block 3 (future teaching self-efficacy and professional value) showed the strongest change (pre-Mdn = 3.92, post-Mdn = 4.17). All three block-level effect sizes were large (r_rb ≥ 0.50).

Item-level analyses clarified where these differences were concentrated. In Block 1 (Appendix A, Table A1), three items remained significant after Holm adjustment: obtaining answers to intriguing questions (item 1), being able to express one’s own ideas in science class (item 2) and finding science classes fascinating (item 4). Together, these shifts point to a more dialogic and affectively engaging learning experience in the course compared with students’ prior science experiences. In Block 2 (Appendix A, Table A2), several items showed favorable raw pretest–post-test trends, but none survived correction for multiple comparisons. This suggests that the block-level gain reflected a distributed pattern of small improvements rather than strong change in a specific item. In Block 3 (Appendix A, Table A3), two items displayed very robust changes after Holm adjustment: perceived capability to explain natural science content in early childhood education (item 21) and the perception of having sufficient knowledge to teach that content (item 22). In substantive terms, this pattern indicates that the strongest movement in Cycle 1 occurred primarily in science-related teaching self-efficacy and perceived preparedness to teach science, with more limited evidence of change in the professional-value component.

3.5.2. Perceived Motivational Value of Gamification Features

Students’ ratings of implemented gamification features and the overall platform experience (Table 2) provided a second diagnostic layer. At the platform level, FantasyClass overall was highly valued (Mdn = 5; 89.3% of valid responses rated it 4 or 5). At the feature level, XP was the most strongly endorsed item (92.9% ≥ 4), followed by leveling up and group work (both 89.3% ≥ 4). Gold coins also showed a high level of endorsement (78.6% ≥ 4), followed by narrative (74.1% ≥ 4) and HPs (71.4% ≥ 4). Random events and monster battles were also positively valued, although less strongly endorsed than the core progression and collaboration features. By contrast, skills, badges, curses, and the FantasyClass shop showed more modest profiles (Mdn = 3; 29.6–44.4% ≥ 4), indicating that they either played a secondary role in the motivational ecology of the course or were experienced more heterogeneously across students. This descriptive profile is important in DBR terms because it distinguishes a valued “core” of the design—visible progression, collaboration, and narrative coherence—from weaker or more friction-prone features.
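The descriptive summaries reported in this subsection can be reproduced along the following lines. This is a minimal sketch: the function name, the rating vector, and the quartile convention (Python's inclusive method) are assumptions, since the article does not specify how the interquartile range was computed. The example vector is hypothetical and is built so that 20 of 27 valid ratings are 4 or 5, with one missing response excluded rather than imputed.

```python
from statistics import median, quantiles


def summarize_ratings(ratings):
    """Median, IQR, and share of ratings >= 4 among valid (non-missing) responses."""
    valid = [r for r in ratings if r is not None]  # available-case analysis, no imputation
    q1, _, q3 = quantiles(valid, n=4, method="inclusive")
    high = sum(1 for r in valid if r >= 4)
    return {
        "n_valid": len(valid),
        "median": median(valid),
        "iqr": q3 - q1,
        "pct_high": round(100 * high / len(valid), 1),
    }


# Hypothetical ratings from 28 students, one of them missing (not study data).
demo_ratings = [5] * 12 + [4] * 8 + [3] * 5 + [2] * 2 + [None]
demo_summary = summarize_ratings(demo_ratings)
```

Because the denominator is the number of valid responses for each item, percentages for different features may rest on slightly different sample sizes.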

3.5.3. Cycle 1 Qualitative Themes

The qualitative data converged with this quantitative picture. Five themes emerged (Appendix C, Table A8). The most frequently identified theme concerned activating motivation through goals, rewards, and visible progress. Students explicitly linked engagement to XP, participation incentives, and the possibility of seeing advancement, as in the comment “More motivation to gain experience within the platform” (C1–S1). A second theme depicted the course as a more playful, dynamic, and enjoyable way of learning science (“It was much more dynamic and entertaining”, C1–S9). A third theme captured the reframing of science as more approachable and professionally relevant, including statements suggesting that science had become more positive, less distant, or more worth learning for future teaching. Two additional themes provided nuance: one foregrounded social and professional transfer value, and the other identified frictions related to the design, notably economy/shop calibration, frustration about penalties, and the fact that gamification did not function as the main motivator for every student. This latter theme was the least frequent in the dataset, being endorsed by 10.7% of students, but it was analytically relevant because it helped identify specific sources of design friction for the subsequent iteration. Overall, the Cycle 1 qualitative data suggest that students perceived the course as engaging and developed a more articulated view of which design features supported or constrained their experience.

3.6. Cycle 1 Reflections and Design Implications

Three broad conclusions emerged from Cycle 1. First, the quantitative results suggested that the course was associated with more favorable affective outcomes across all three blocks: students valued their recent learning experience more positively than their prior science experiences, reported somewhat more favorable current views of science and its relevance, and—most strongly—showed increased science-teaching self-efficacy and perceived preparedness to teach science, with a more modest pattern in the professional-value items. This pattern indicated that the intervention was promising, particularly in relation to the professional future of the participants, but it did not yet clarify which components of the design were most central to students’ motivational experience.

Second, the element-level ratings clarified that not all mechanics contributed equally. Progression-related mechanics (especially XP and leveling up), collaborative work, and the narrative layer formed a highly valued core. The ratings also showed that not all reward- or economy-related mechanics functioned in the same way: gold coins were positively valued, whereas the shop, curses, badges, and skills showed more modest or heterogeneous profiles. The qualitative data helped interpret these patterns. Students repeatedly referred to visible progress, continuity across activities, and participation incentives as motivating, while lower-valued elements were often mentioned either as peripheral or as insufficiently functional during the semester. In DBR terms, these findings offered evidence about which features were experienced as more central or more friction-prone, and signals about where the initial design lacked coherence or balance (Bakker, 2018; Tinoca et al., 2022).

Third, Cycle 1 showed that the initial measurement strategy, although informative, provided a broad affective diagnosis rather than a focused portrait of motivational quality. The three-block questionnaire was useful for detecting experiential contrasts in science learning, shifts in current attitudes toward science and perceived relevance, and shifts in future teaching self-efficacy, but it did not isolate intrinsic motivational experience in the more specific way offered by SDT-aligned measures. This became especially relevant because the qualitative data suggested mechanisms—enjoyment, sense of purpose, competence, and agency—that were conceptually closer to intrinsic motivation than to a broad affective profile.

4. Cycle 2: Refined Iteration

4.1. Cycle 2 Participants

The analytical sample for Cycle 2 consisted of 32 preservice early childhood teachers enrolled in BDCDI during the following academic year. All participants were women, reflecting the strongly feminized gender composition that characterizes early childhood teacher education in many contexts. The mean age was 22.52 years (SD = 1.52; median = 23), and three participants (9.4%) reported having completed a science-oriented high-school track prior to university.

4.2. Revised Implementation Context and Intervention Design

Drawing on the Cycle 1 reflections, the redesign focused on four main priorities: preserving the highly valued core of progression and collaboration, recalibrating the shop/economy implementation and selected penalty mechanics, intensifying narrative continuity, and refining the evaluative focus toward intrinsic motivation. So that these changes are not read as isolated adjustments, Table 3 summarizes the DBR audit trail linking Cycle 1 evidence, the design tension identified, the Cycle 2 response, and the expected motivational rationale.

Cycle 2 therefore retained the structural backbone of the first implementation—FantasyClass as the digital platform, progression through XP and levels, group-based participation, and a points-based reward ecology—while introducing targeted revisions in response to Cycle 1 evidence. The redesign did not replace the ordinary pedagogical structure of BDCDI: regular theory–practice sessions, small-group tutoring, laboratory work, disciplinary objectives, and assessment criteria remained in place. Instead, the gameful layer was organized more selectively and transparently. XP and level progression remained central and were linked more explicitly to course missions and milestones; the shop/economy system was recalibrated so that rewards became more attainable and functionally relevant earlier in the semester; selected penalty-linked mechanics were made more transparent and less punitive; and lower-valued ancillary features, such as some shop items, badges, skills, or curses, were used more selectively when they served a clear functional, collaborative, or narrative purpose.

The most visible redesign concerned the narrative. Instead of the lighter science-fiction frame used in Cycle 1, Cycle 2 adopted an enveloping Harry Potter-inspired storyline that functioned as the main organizing frame of the course. On the first day, students assumed the fictional role of young residents of an orphanage who gradually discovered their magical identity after receiving a letter signed by Dumbledore. The letter also provided their access credentials to FantasyClass, so entry into the platform was narratively embedded from the outset. Once admitted to Hogwarts, students were sorted into houses and prepared for the impending return of Voldemort, while course activities were framed as missions set by Hogwarts staff and other characters (Figure 2). This storyline was designed to provide continuity across the semester and to frame academic tasks as purposeful actions inside a shared fictional world.

The narrative was not restricted to atmosphere; it structured the sequencing of activities and the communication of challenges. Character-led prompts, videos, and classroom episodes were used to introduce tasks across blocks of the course. One example of this integration was a “stolen wand” mission, a playful nature-of-science activity in which learners construct and revise hypotheses as new clues are progressively revealed (Jiménez-Valverde et al., 2025a). Within the Cycle 2 storyline, student groups had to identify a possible suspect by interpreting incomplete and sometimes ambiguous evidence released in successive phases. The activity was linked to BDCDI aims concerning scientific inquiry and its didactic translation to early childhood education: groups generated hypotheses, revised them in light of new evidence, compared interpretations with other groups, and reflected on uncertainty, collaboration, evidence sharing, and the diversity of scientific methods. The link with ordinary assessed course work was established through the group reflection notebook: students answered guided questions about how the mission resembled scientific activity and how similar inquiry processes could be translated into early childhood science teaching. In this way, the narrative did not replace the academic task; rather, it framed scientific reasoning, reflective writing, and didactic transfer as meaningful actions within the course storyline. Other activities were framed around different Hogwarts classes and characters, including logic and pattern games, inquiry-oriented challenges, and tasks connected to course content. The semester closed with a final confrontation with Voldemort linked to students’ culminating work.

4.3. Cycle 2 Instruments

Cycle 2 employed a more focused measurement strategy centered on intrinsic motivation and its SDT-related experiential conditions. To this end, students completed the Intrinsic Motivation Inventory (IMI; Self-Determination Theory, n.d.), a multidimensional self-report instrument designed to assess participants’ subjective experience in relation to a specific activity (Ryan, 1982; Ryan et al., 1983). The IMI has been widely used in studies on intrinsic motivation, internalization, and self-regulation, and includes several subscales assessing distinct experiential dimensions rather than a single global construct (Ryan et al., 1990; Deci et al., 1994). In the present study, four subscales were selected because of their close alignment with the motivational aims of the redesigned course and with SDT: Enjoyment/Interest, Perceived Competence, Perceived Choice, and Relatedness. Enjoyment/Interest was treated as the primary self-report indicator of intrinsic motivation, whereas Perceived Competence, Perceived Choice, and Relatedness were used as indicators aligned with the three basic psychological needs emphasized in SDT. Other IMI subscales, such as Value/Usefulness, Effort/Importance, and Pressure/Tension, were not included in order to keep the instrument focused on intrinsic motivation and the three SDT-related experiential dimensions most directly aligned with the redesign, while limiting response burden. Value/Usefulness was not selected because professional relevance and perceived value had already been addressed through the Cycle 1 affective questionnaire and through qualitative responses across both cycles. However, its omission means that Cycle 2 does not provide a direct IMI-based estimate of internalized value or perceived usefulness. The omission of Pressure/Tension is more consequential for interpretation because XP, levels, penalties, and grade-linked progression could be experienced by some students as controlling or pressure-inducing. 
Thus, the selected IMI subscales provide a focused profile of enjoyment, perceived competence, perceived choice, and relatedness, but they do not capture all possible motivational costs of the gamified assessment ecology.

The questionnaire used a 7-point Likert-type response scale ranging from 1 (“not at all true”) to 7 (“very true”). Following established IMI practice (Deci et al., 1994), item wording was minimally adapted to anchor responses to the gamified course context while preserving the original meaning of each item; the full adapted wording used in the present study is reported in Appendix B. Because the IMI is intended to capture subjective experience in relation to a concrete activity, it was administered once, at the end of the semester, after participants had experienced the redesigned course as a whole. This timing is consistent with SDT-informed post-activity assessment and prior gamification studies using subjective motivational measures (Ryan & Deci, 2000; Mekler et al., 2017; Sailer et al., 2017). Internal consistency was estimated at the subscale level, yielding good sample-specific coefficients, all above 0.80: Enjoyment/Interest, α = 0.891; Perceived Competence, α = 0.831; Perceived Choice, α = 0.857; and Relatedness, α = 0.894.

As in Cycle 1, the IMI was complemented with an open-ended question, which invited students to explain how FantasyClass and the narrative of the activities and characters had motivated them in the course. This qualitative source was important because it allowed the motivational profile emerging from the IMI to be interpreted through the students’ own accounts of what they found engaging, useful, or problematic.

4.4. Cycle 2 Data Analysis

All 32 students provided valid IMI responses. No imputation was therefore required for IMI scores. Subscale scores were calculated at the participant level as the mean of the items belonging to each subscale. Although Shapiro–Wilk tests did not indicate statistically significant departures from normality for any subscale (all p > 0.05), Cycle 2 retained a non-parametric reporting strategy because the scores derived from Likert-type items, the sample was small, and Cycle 1 had already adopted the same analytic strategy. Subscale descriptives were therefore reported using medians and interquartile ranges. In addition, Spearman correlations were estimated among the four IMI subscales, with Holm correction applied to control familywise error across the six pairwise comparisons.
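The correlational step can be illustrated with a short sketch. It computes Spearman's ρ as the Pearson correlation of average ranks and enumerates the six subscale pairs over which the Holm correction would then be applied. The function names and the six-participant scores are hypothetical, and p-values are not computed here; they would normally be obtained from a statistics package (for example, `scipy.stats.spearmanr`).

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upward; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranked values."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)


# Hypothetical subscale scores for six participants (not study data).
subscales = {
    "Enjoyment/Interest": [5.6, 6.1, 4.9, 5.3, 6.4, 4.2],
    "Perceived Competence": [5.0, 5.4, 4.8, 5.2, 5.8, 4.4],
    "Perceived Choice": [5.5, 6.0, 5.1, 4.7, 6.3, 4.9],
    "Relatedness": [4.6, 3.9, 5.2, 4.1, 4.8, 5.0],
}

# Four subscales give six unordered pairs, the family used for the Holm step.
pairs = list(combinations(subscales, 2))
rhos = {pair: spearman_rho(subscales[pair[0]], subscales[pair[1]]) for pair in pairs}
```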

The open-ended responses were analyzed thematically using the same Braun and Clarke–informed procedure applied in Cycle 1. In Cycle 2, the qualitative corpus consisted of 32 brief individual written responses to one open-ended prompt focused on how FantasyClass, the narrative, course activities, and characters had motivated students during the redesigned course. Because all 32 students provided substantive written responses, thematic frequencies in Appendix C are reported as percentages of the full Cycle 2 cohort. As in Cycle 1, coding was primarily semantic and inductive, and candidate themes were refined through team discussion, attention to ambivalent or less positive accounts, and interpretation in relation to the quantitative motivational profile. The qualitative findings were therefore used to contextualize students’ motivational experience rather than to provide independent estimates of prevalence or effect.

4.5. Cycle 2 Results

4.5.1. Cycle 2 Motivational Profile

The IMI results revealed a broadly positive motivational profile: Enjoyment/Interest showed the highest central tendency (N = 32; Mdn = 5.64, IQR = 1.43), followed by Perceived Choice (Mdn = 5.50, IQR = 1.11) and Perceived Competence (Mdn = 5.00, IQR = 0.71), while Relatedness was somewhat lower and more variable (Mdn = 4.63, IQR = 1.63). All four subscales were descriptively above the midpoint of the 1–7 scale, suggesting that the redesigned course was generally experienced as enjoyable, autonomy-supportive, competence-supportive, and socially positive, albeit with more heterogeneity in the relational dimension.

At the item level (Appendix B), the four IMI subscales showed a differentiated but coherent pattern. Within Enjoyment/Interest, responses were consistently positive, with five of the seven items reaching a median of 6, indicating that the course was broadly experienced as enjoyable, engaging, and attention-sustaining. By contrast, the more explicitly reflective item about being aware during class of how much one was enjoying the course was somewhat lower (Item 7: Mdn = 5, IQR = 2), suggesting that enjoyment was salient even if students did not always explicitly attend to it in the moment. Within Perceived Competence, the pattern was comparatively compact: feeling competent after working on the course activities was especially consistent across students (Item 10: Mdn = 5, IQR = 0), while the lowest endorsement appeared in the socially comparative item about doing well relative to classmates (Item 9: Mdn = 4, IQR = 1), which suggests that competence was experienced more in mastery than in normative terms. In Perceived Choice, the strongest scores were concentrated in reverse-worded items rejecting lack of choice, particularly Item 16 (Mdn = 7, IQR = 1.25), while direct volitional items such as having some choice in participation and doing the activities because one wanted to were somewhat more moderate (Items 14 and 19: both Mdns = 5), pointing to a positive but not unqualified autonomy profile. Finally, Relatedness was the most heterogeneous subscale: students clearly rejected social withdrawal from classmates in future interactions (Item 25, reversed: Mdn = 6, IQR = 3) and also rejected lack of trust (Item 26, reversed: Mdn = 6, IQR = 3), yet items tapping present closeness and interpersonal trust were lower (Items 23 and 28: Mdns = 4.5 and 4), and the median of 5 for Item 24 suggests that many students would still have welcomed more opportunities for interaction. 
At subscale and item level, these results reinforce the interpretation of a generally positive motivational profile, while also clarifying that the relational dimension was less consolidated than enjoyment, competence, and perceived choice.
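Several of the items discussed above are reverse-worded, so their raw responses must be recoded before subscale means are computed. The following sketch assumes the usual IMI convention for a 1–7 scale, in which a reversed item is scored as 8 minus the raw response. The function names, the item subset, and the responses are hypothetical; only Items 25 and 26 are marked as reversed here because the text identifies them as such, and the full item composition of each subscale is not reproduced.

```python
from statistics import mean

SCALE_MIN, SCALE_MAX = 1, 7  # response scale reported for the questionnaire


def reverse_score(raw):
    """Recode a reverse-worded item so that higher always means more of the construct."""
    return SCALE_MIN + SCALE_MAX - raw


def subscale_score(responses, reversed_items):
    """Mean of one participant's item scores after recoding the reversed items."""
    recoded = [
        reverse_score(value) if item in reversed_items else value
        for item, value in responses.items()
    ]
    return mean(recoded)


# Hypothetical raw responses from one participant to an illustrative item subset.
demo_responses = {23: 4, 25: 2, 26: 1, 28: 5}
demo_score = subscale_score(demo_responses, reversed_items={25, 26})
```

Under this convention, a low raw response to a reversed item (for example, disagreeing with a statement about lack of trust) becomes a high recoded score, so a high recoded median on such an item reflects rejection of the negatively worded statement.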

The intercorrelation analysis showed a selective rather than uniform structure (Table 4). Enjoyment was positively associated with Perceived Competence (ρ = 0.60, p_Holm = 0.002) and with Perceived Choice (ρ = 0.55, p_Holm = 0.005), whereas Relatedness did not show significant associations with the other subscales after Holm correction. This pattern suggests that, in this cohort, Enjoyment/Interest was more strongly associated with perceived competence and meaningful room for agency than with social connection per se.

4.5.2. Cycle 2 Qualitative Themes

The qualitative analysis identified five themes (Appendix C, Table A9). The most frequent theme captured behavioral activation: students described the course as helping them stay on track, attend class, participate, and persist through points, progress, and recurring challenges. A second theme highlighted the methodology as innovative, dynamic, and interactive, often contrasted with lecture-and-slides teaching. A third theme positioned the Harry Potter narrative as a motivating spine that made classes more coherent and enjoyable, even for some students who had little prior familiarity with the saga.

Two less frequent but important themes added nuance. One emphasized the value of visible progress and feedback, consistent with the centrality of progress-based mechanics in the Cycle 1 feature ratings. The other identified design frictions, particularly the need for additional challenges tied more closely to theory and the occasional initial difficulty experienced by students who did not know the Harry Potter universe. Overall, the qualitative data converged with the IMI profile in suggesting that students experienced the revised course as enjoyable and agency-supportive, while also pointing to areas where the alignment between narrative, challenge density, and theoretical assessment could be further refined.

5. Discussion

The discussion is organized around the central design question of the study: how evidence from an initial gamified implementation informed the refinement of motivational scaffolding in a subsequent course iteration. The two cycles are not treated as parallel evaluations or as metrically equivalent assessments of the same construct, because they involved independent natural class groups and different evaluative lenses. Instead, they are interpreted as successive stages of a design-and-redesign process. On this basis, the discussion focuses on four design insights: the affective and professional meaning of the Cycle 1 pattern, the motivational role of visible progression, the differentiated contribution of collaboration and relatedness, and the function of narrative coherence in the redesigned course.

5.1. From Affective Repair to Perceived Teaching Readiness

The Cycle 1 results suggest that the most educationally meaningful pattern was not simply that students reported a more enjoyable science-learning experience, but that science teaching appeared more imaginable and attainable for them. At block level, all three affective blocks showed favorable pre–post differences, but the item-level analyses indicate that these differences should be interpreted with nuance. Block 1 points to a more favorable recent science-learning experience, Block 2 suggests a distributed and less item-specific favorable pattern in current views of science and its relevance, and Block 3 shows the clearest pattern in perceived teaching capability and preparedness.

Because Block 1 contrasts prior science-learning experiences with the recent BDCDI experience, it is interpreted here as an experiential contrast rather than as a strictly invariant longitudinal change. This contrast is nevertheless educationally relevant. Students evaluated the BDCDI experience more favorably than their previous science-learning experiences, especially in relation to obtaining answers to intriguing questions, expressing their own ideas, and finding science classes fascinating. In a population where prior science experiences may be associated with transmissive teaching, low relevance, and fragile confidence, this contrast may be interpreted as a possible counterexperience to earlier disengagement.

The more modest pattern in Block 2 suggests that current attitudes toward science and perceptions of relevance may be less sharply reflected at item level. Although the block-level difference was statistically significant, no individual item survived correction for multiple comparisons, which points to a distributed pattern of smaller favorable shifts rather than a strong difference in one or two focal beliefs. By contrast, Block 3 showed the most robust item-level pattern, concentrated in students’ perceived capability to explain natural science content in early childhood education and their perception of having sufficient knowledge to teach it. The professional-value items were more mixed, suggesting that perceived capability and perceived preparedness were more clearly reflected in the Cycle 1 data than broader beliefs about the professional importance of science teaching.

This distinction matters in early childhood teacher education. The pedagogical significance of motivational design lies not only in whether students experience a course as enjoyable, but also in whether they report feeling more able to teach science in their future classrooms. Preservice teachers’ confidence, affective relationship with science, and perceived capability to teach it have been repeatedly identified as important conditions for whether science is later approached with enthusiasm, avoided, or reduced to low-risk activities (Appleton, 2003; Brígido et al., 2013). From this perspective, the most consequential Cycle 1 finding may be that the gamified course was associated with stronger perceived science-teaching capability and preparedness, while the recent BDCDI experience was also evaluated more favorably than students’ prior memories of science education.

5.2. Progression, Visible Feedback, and the Motivational Role of Competence and Agency

Across both cycles, the clearest design signal concerns visible progression. In Cycle 1, experience points and leveling up were among the most highly valued features, and qualitative responses linked motivation to goals, advancement, and reward visibility. In Cycle 2, enjoyment was positively associated with perceived competence and perceived choice. This convergence suggests that progression was motivationally meaningful not simply as a reward structure, but as an informational scaffold that made effort, achievement, and next steps easier to interpret.

This point deserves emphasis because it helps avoid a common misinterpretation of gamification. The motivating force of the design does not appear to have rested on points or rewards as isolated stimuli. Rather, progression seems to have helped students track what they had achieved, how effort translated into movement, and what remained ahead. This reading is consistent with SDT-informed research showing that competence-supportive feedback is most effective when it is experienced as informative rather than controlling (Deci & Ryan, 2000; Ryan & Deci, 2020). It is also consistent with experimental evidence suggesting that specific achievement-related features can positively affect competence need satisfaction when embedded in a coherent design (Sailer et al., 2017; Xi & Hamari, 2019).

This pattern is important because FantasyClass, as embedded in the course design, helped make progress visible, cumulative, and interpretable to students. The platform allowed students to monitor their own XP accumulation and level progression throughout the course, while also situating that progress within a shared gameful environment in which classmates’ progress was visible. In this modest sense, FantasyClass may have supported conditions for self-monitoring by helping students keep track of where they were in the course, what they had already achieved, and what still remained to be done. Because self-regulated learning was not directly measured, this claim should be understood as a cautious design inference; nevertheless, it is consistent with DBR perspectives on technology-enhanced learning environments as designs that make learning processes visible, analyzable, and open to refinement (Hoadley & Campos, 2022; Wang & Hannafin, 2005).

The Cycle 2 IMI profile strengthens this reading in an important way. The item-level pattern suggests that competence was experienced more in mastery terms than in normative terms: students felt competent after working on the activities, but the lowest endorsement within that subscale concerned doing well relative to classmates. In teacher education, this pattern is relevant because it suggests that the environment may have supported a sense of growing efficacy without relying too heavily on social comparison. Likewise, Perceived Choice was positive but not unqualified: students strongly rejected lack of choice, yet direct volitional items were somewhat more moderate. This suggests that agency was present but structured, which is consistent with the fact that the course still retained fixed academic tasks and assessment demands. From an SDT perspective, such a profile is plausible in a teacher-education course: learners do not need unlimited freedom to feel autonomous, but they do need to experience their actions as meaningful and not purely imposed (Deci & Ryan, 2000; Ryan & Deci, 2020).

5.3. Collaboration, Relatedness, and the More Uneven Social Dimension

A second relevant pattern is the difference between the strong value of collaboration in Cycle 1 and the more variable pattern of relatedness in Cycle 2. On the one hand, collaborative work was one of the most highly valued features in Cycle 1, and the qualitative data from both cycles repeatedly described the course as dynamic, interactive, and participatory. Stable group structures, team-based goals, and other shared activities were reflected in both the feature ratings and students’ accounts as relevant components of the course’s motivational ecology. On the other hand, the Cycle 2 IMI results showed that Relatedness, although descriptively above the midpoint overall, was lower and more heterogeneous than enjoyment, competence, or perceived choice, and did not correlate significantly with the other subscales after Holm correction. Because Relatedness is one of the three basic psychological needs in SDT, this lower and more heterogeneous profile should be interpreted as a substantive design tension rather than as a marginal result. The pattern suggests that the social dimension of the course was experienced more consistently at the level of functional collaboration than at the level of consolidated interpersonal closeness.

This less consolidated and more differentiated role of relatedness is especially noteworthy because it suggests that, in this cohort, students’ enjoyment and engagement appeared more closely associated with perceived competence and agency than with deep interpersonal connection or affinity. More specifically, enjoyment covaried with competence and choice, but not with relatedness, indicating that students could experience the course as highly engaging without necessarily reporting equally strong feelings of social closeness. This distinction suggests that working together and feeling socially close are not the same thing. The course appears to have supported collaboration as a functionally meaningful component of the learning experience, but not all students experienced that collaboration as strong interpersonal closeness.

The item-level pattern supports this distinction. Students clearly rejected social withdrawal and lack of trust in future interactions, but present closeness and interpersonal trust were more moderate, and many expressed a desire for more opportunities to interact. In other words, the design may have created conditions for productive social participation without fully consolidating felt relatedness at the level of friendship or strong group belonging. This nuance complicates any overly simple assumption that collaborative mechanics automatically satisfy relatedness. The literature instead suggests a more contingent pattern. Van Roy and Zaman (2019) showed that the motivational power of gamification is ambivalent: the same elements can satisfy or thwart needs depending on situational factors. Xi and Hamari (2019) likewise found that social-related features can support need satisfaction, but that their effects are not uniform across contexts. L. Li et al. (2024) reported positive effects of gamification on relatedness overall, but also emphasized that motivational outcomes remain sensitive to how designs are enacted. The present findings fit this broader pattern. Collaboration appears to have mattered, but its contribution may have been more structural and task-focused than deeply relational.

There are several plausible reasons for this. First, relatedness may simply be harder to consolidate in a semester-long course than competence or enjoyment. Second, the structure of the course may have prioritized goal-directed interaction over slower forms of interpersonal bonding. Third, the Harry Potter narrative and the points-based system may have fostered a shared framework of action without necessarily generating a strong sense of emotional intimacy. Finally, in higher education, students often navigate multiple courses, schedules, and pre-existing social networks, which can limit the extent to which any one course becomes a primary site of belonging. None of this negates the value of collaboration in the design. Rather, it suggests that if strengthening relatedness is a more explicit goal in future iterations, additional relational scaffolds may be needed—for example, structured cross-group peer exchange, longer-running interdependent missions, and short reflection activities in which students explicitly recognize one another’s contributions. This interpretation is consistent with recent work on gamification of cooperation, which emphasizes that cooperative effects require explicit design of interdependence, shared goals, and interaction structures rather than the mere presence of group-based activity (Riar et al., 2022).

5.4. Narrative Coherence, Cultural Accessibility, and Transferability

If progress visibility formed one key design pillar across both cycles, narrative coherence formed another. In Cycle 1, narrative was already among the more highly valued features, although its role was still relatively light and atmospheric. In Cycle 2, the Harry Potter-inspired storyline was used as the main organizing frame of the course, linking access to FantasyClass, group identity, classroom missions, and culminating tasks within a shared fictional trajectory. Students’ qualitative accounts suggest that this more continuous narrative frame was perceived as making the course more coherent and enjoyable, including by some students who reported limited prior familiarity with the Harry Potter universe.

This interpretation should be understood cautiously. The findings do not show that the Harry Potter universe itself caused the favorable motivational profile observed in Cycle 2. Rather, they suggest that, within this course context, students perceived the narrative as meaningful when it was integrated into the structure of activity rather than treated as atmosphere alone. Activities involving hypothesis generation, indirect evidence, and mystery-solving were not presented or experienced as a separate curricular strand detached from the gamified design, but as part of a broader course trajectory. In this sense, the narrative may be better understood as a coherence device: it offered a shared frame through which otherwise separate activities, mechanics, and challenges could be experienced as connected.

This reading qualifies a recurrent concern in educational gamification research: that game elements may promote surface-level engagement when they are not meaningfully integrated with academic tasks, learning activities, and instructional goals (Dah et al., 2025; Dichev & Dicheva, 2017). In the present case, students’ accounts indicate that narrative and academic substance were not necessarily experienced as competing dimensions. When the narrative was linked to inquiry-oriented tasks, challenge sequences, and course assessment, it appeared compatible with cognitively demanding scientific work. This interpretation is consistent with scholarship arguing that narrative in science education can act as both a cognitive scaffold and a motivational resource, helping learners connect fragmented tasks or concepts to a broader sense of purpose (Avraamidou & Osborne, 2009; Jiménez-Valverde, 2025). It also resonates with Bruner’s (1991) argument that narrative is a fundamental mode of meaning-making rather than a merely ornamental form of discourse.

At the same time, the use of a familiar commercial universe requires a more critical interpretation. A widely recognizable fantasy world may make the narrative immediately legible, emotionally engaging, and socially shareable for some students, but this familiarity cannot be assumed for all participants. Students who are less familiar with, less interested in, or less culturally connected to the Harry Potter universe may initially find the narrative less accessible or less motivating. The qualitative data support this caution, because some students reported initial difficulty due to limited familiarity with the Harry Potter universe. For this reason, the transferable principle is not the use of Harry Potter specifically, but the design of an accessible narrative frame that gives continuity, purpose, and coherence to course activities. Future implementations should therefore consider whether the chosen narrative is culturally inclusive, sufficiently explained, and adaptable to different cohorts; in some contexts, an original or locally meaningful narrative may be more appropriate than a commercial universe.

The comparison with closely related work in gamified science teacher education further clarifies this contribution. Fabre-Mitjans et al. (2025), in a related FantasyClass-supported science course with preservice early childhood teachers, reported significant pre–post gains in attitudes toward science and perceived teaching competence, together with qualitative evidence of engagement, enjoyment, and perceived relevance in a narrative-rich gamified environment. The current study extends that line of work by documenting a design-based refinement process in which evidence from Cycle 1 informed a more continuous and structurally integrated narrative design in Cycle 2. This does not establish the causal effect of narrative intensity, nor does it isolate the specific effect of the Harry Potter theme. It instead identifies narrative coherence, cultural accessibility, and integration with course activity as design variables that merit explicit attention in future research and in the design of gamified science teacher education.

5.5. Implications for Motivational Design in Early Childhood Science Teacher Education

Taken together, the two cycles point to a central implication: gamification in teacher education should be treated as motivational design rather than as feature accumulation. The most valued and most stable elements of the course were not all of the available mechanics, but a smaller set of design supports that students experienced as making the environment more progressive, collaborative, and meaningful. This interpretation is consistent with recent literature suggesting that the educational value of gamification depends less on the number of game elements included and more on how those elements are aligned with psychological needs, instructional goals, and learner experience (Dai et al., 2025; Khaldi et al., 2023; L. Li et al., 2024; Ratinho & Martins, 2023).

More specifically, the findings suggest three provisional principles for motivational design in gamified science teacher education. First, progression should be made visible and interpretable, because students valued feedback systems that helped them understand effort, achievement, and remaining goals. Second, collaboration should be structurally embedded rather than added occasionally, because group work was associated with participation and shared momentum even when relatedness was uneven. Third, narrative should be used as a coherence device that links tasks, activities, and assessment into a meaningful course trajectory. These principles are provisional and context-sensitive, but they offer practical guidance for designing gamified learning environments as motivational scaffolds rather than as collections of isolated game elements.

A further implication concerns the professional meaning of motivational design in early childhood science teacher education. The Cycle 1 pattern was strongest in perceived science-teaching capability and preparedness, whereas the professional-value component was more mixed. This suggests that the educational relevance of gamified science courses should not be judged only in terms of enjoyment or participation, but also in terms of whether preservice teachers report feeling more able to teach science. Preservice early childhood teachers who feel more capable, more prepared, and more positive about science could plausibly be better positioned to later offer richer science experiences to children (Membiela et al., 2022). Because the present study did not examine subsequent practicum performance, lesson design, or classroom teaching, this implication should be treated as a hypothesis for future longitudinal or practice-based research rather than as evidence of transfer to professional practice.

Finally, the study suggests that the value of digital platforms lies not primarily in their technical novelty, but in their capacity to organize feedback, continuity, visibility, and participation within a coherent course structure. In this study, FantasyClass appeared useful as a platform through which progression, rewards, group work, and narrative could be made more interpretable and cumulative. At the same time, the uneven reception of ancillary mechanics such as the shop, curses, badges, and skills reinforces the need for careful calibration: not all game elements were perceived as contributing equally to students’ motivational experience, a finding consistent with research on feature-specific and design-sensitive effects in gamification (L. Li et al., 2024; Mekler et al., 2017; Sailer et al., 2017; Xi & Hamari, 2019).

5.6. Limitations and Future Directions

This study has several limitations that should frame the interpretation of its findings. First, both cycles were conducted in a single course and institution using convenience samples, which limits transferability. Second, because the study was conducted within a course designed and implemented by the teaching team, some degree of socially desirable responding cannot be ruled out, particularly in the self-report and open-ended data. This risk was mitigated by voluntary questionnaire administration, aggregated reporting, and explicit communication that students’ responses would not affect course assessment. Third, the first cycle used a one-group pretest–post-test design and the second a post-test-only design, which means that causal claims would be unwarranted. Relatedly, because the Cycle 2 IMI did not include the Pressure/Tension or Value/Usefulness subscales, the study cannot directly assess whether the evaluative role of XP, levels, or other game mechanics was experienced by some students as controlling or pressure-inducing, nor can it quantify the extent to which students internalized the usefulness or professional value of the redesigned course through a dedicated IMI subscale. Fourth, the qualitative analysis relied on brief written responses rather than interviews or focus groups; accordingly, the themes should be interpreted as concise experiential accounts rather than exhaustive reconstructions of students’ meaning-making. A further practical limitation is that the narrative drew on a cultural universe that may not have been equally familiar to all students, which may have moderated engagement for some participants during the initial stages of the course. Finally, the strongly feminized composition of the cohorts—especially the all-women Cycle 2 sample—reflects the degree context but limits transferability to more gender-balanced teacher education settings.

Future research could extend this line of work in several directions. A further DBR cycle would help determine whether the motivational profile observed in Cycle 2 is reproduced once novelty effects stabilize and whether the remaining design tensions around challenge density, alignment with theoretical course content, and uneven relatedness can be addressed more effectively. Comparative or quasi-experimental designs could help isolate the relative contribution of progression systems, collaborative structures, and narrative intensity. Future work should also incorporate more direct measures of progress awareness, self-monitoring, or self-regulated learning so that possible links between visible progression and regulatory processes can be examined more precisely. Finally, it would be especially valuable to follow students beyond the course itself and investigate whether the changes in perceived science-teaching capability and preparedness, together with the favorable post-course motivational profile suggested by the present evidence, are later reflected in lesson design, practicum decisions, or science teaching practices during school placement.

6. Conclusions

This article reported a two-cycle DBR study in a fourth-year early childhood science education course supported by the digital platform FantasyClass. Cycle 1 provided a broad affective diagnosis and identified both promising and problematic features of the initial design. Cycle 2 implemented evidence-informed refinements (most notably, recalibrated mechanics and a more immersive narrative) and was characterized by a favorable motivational profile across selected IMI subscales, marked especially by enjoyment, perceived competence, and perceived choice. Across the two cycles, the findings suggest that, under the design conditions examined here, digitally mediated gamification may serve as a form of motivational design in educational technology when it makes progress visible, gives tasks narrative meaning, and preserves collaborative participation as a central principle. These claims are offered as situated and provisional, rather than as evidence of causal effectiveness across contexts.

Author Contributions

Conceptualization, G.G.-B., G.J.-V. and N.F.-M.; Methodology, G.G.-B. and G.J.-V.; Software, G.G.-B.; Validation, G.J.-V. and N.F.-M.; Formal analysis, G.G.-B. and G.J.-V.; Investigation, G.G.-B. and G.J.-V.; Data curation, G.G.-B. and G.J.-V.; Writing—original draft, G.G.-B., G.J.-V. and N.F.-M.; Writing—review & editing, G.G.-B., G.J.-V. and N.F.-M.; Supervision, G.G.-B., G.J.-V. and N.F.-M.; Funding acquisition, G.J.-V. All authors have read and agreed to the published version of the manuscript.

Funding

The research was funded by the Universitat de Barcelona, grant number REDICE22-3080, and the APC was partially funded by the Universitat de Barcelona.

Institutional Review Board Statement

The institution did not require specific ethics committee approval for this research. The study adhered to the ethical standards set forth by the University of Barcelona’s Code of Conduct of Research Integrity and complied with the EU General Data Protection Regulation (GDPR) 2016/679 regarding data protection and participant privacy.

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy restrictions.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

BDCDI	Biological Development of the Child and Didactic Intervention
DBR	Design-Based Research
ECTS	European Credit Transfer and Accumulation System
HP	Health Points
IMI	Intrinsic Motivation Inventory
IQR	Interquartile Range
Mdn	Median
R	Reverse-worded/Reverse-scored item
RQ	Research Question
SD	Standard Deviation
SDT	Self-Determination Theory
XP	Experience Points

Appendix A

The following tables report all questionnaire items from Cycle 1, grouped by analytically differentiated affective blocks. For Block 1, item wording was minimally adapted between administrations so that the pretest referred to prior science-learning experiences and the post-test referred specifically to the science-learning experience in BDCDI. For example, the pretest item “In science class, I got answers to questions that intrigued me” was reformulated in the post-test as “In this course, I got answers to questions that intrigued me”.

Table A1. Cycle 1 item-level statistics for Block 1: Learning experiences.

Item Number and Statement	Pretest Mdn	Pretest IQR	Post-Test Mdn	Post-Test IQR	W	p (Holm)	r_rb
1—In science class, I got answers to questions that intrigued me	4	1	4	1	36.0	0.033 *	0.66
2—In science class, I could express my own ideas	3.5	1	5	1	15.0	<0.001 *	0.89
3—I could get good grades in science without the help of the teacher	3	2	3	1	78.5	0.613	0.25
4—Science classes fascinated me	3	1	4	1	14.0	0.020 *	0.79
5—Science lessons were easy to study	3	1	3	1	30.5	0.329	0.49
6—I had fun learning science	4	1	4	1	42.5	0.284	0.50
7—Science allowed me to understand everyday phenomena	4	2	4	1	44.5	0.348	0.42
8—For me, it was difficult to learn science (R)	3	2	3	2.25	129.0	0.780	0.07

Note: Mdn = median; IQR = interquartile range; W = Wilcoxon signed-rank test statistic (two-tailed, continuity-corrected); p (Holm) = Holm step-down family-wise error–adjusted p-value within this block; asterisks (*) indicate statistical significance at the 5% level; r_rb = matched-pairs rank-biserial correlation. Item marked (R) was reverse-worded and was reverse-scored so that higher values indicate more favorable science learning experiences.
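The Holm step-down adjustment named in the note can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the authors' analysis code: the function name `holm_adjust` and the raw p-values are hypothetical and are not taken from the study data.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values in the original item order."""
    m = len(p_values)
    # Indices of the raw p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is multiplied by (m - k + 1), capped at 1.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Adjusted values must not decrease as the raw p-values increase.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values for a block of four items (not study data).
raw_p = [0.010, 0.040, 0.030, 0.500]
print(holm_adjust(raw_p))
```

Because the adjustment is applied within a block, the multiplier depends on the number of items in that block, which is why a block-level difference can be significant while no single item survives correction.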

Table A2. Cycle 1 item-level statistics for Block 2: Current attitudes and perceived relevance.

Item Number and Statement	Pretest Mdn	Pretest IQR	Post-Test Mdn	Post-Test IQR	W	p (Holm)	r_rb
9—Science has no connection to my life (R)	4	1	5	1	28.0	1.000	0.15
10—Understanding science is important for everyone	5	1	5	1	10.5	0.816	0.42
11—I like to read and learn about science through social media, YouTube, or other media	3	1	3.5	1	27.5	1.000	0.17
12—I think science is interesting	4	0.25	5	1	7.0	0.536	0.61
13—I am interested in explanations of scientific phenomena	4	1	4	0.5	40.0	0.359	0.48
14—Science makes our lives healthier, easier, and more comfortable	4	0	4	1	32.0	0.152	0.58
15—The benefits of science outweigh the potential adverse effects	4	1	4	1	5.0	0.100	0.82
16—Science can solve environmental problems	4	1	4	1	26.0	0.625	0.43

Note: Mdn = median; IQR = interquartile range; W = Wilcoxon signed-rank test statistic (two-tailed, continuity-corrected); p (Holm) = Holm step-down family-wise error–adjusted p-value within this block; r_rb = matched-pairs rank-biserial correlation. Item marked (R) was reverse-worded and was reverse-scored so that higher values indicate more positive current attitudes toward science and stronger perceived relevance.
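As a reading aid for the r_rb column, the matched-pairs rank-biserial correlation can be sketched as the difference between the favorable and unfavorable rank sums divided by their total. The sketch below is illustrative only: the function name `rank_biserial` and the paired scores are hypothetical, and dropping zero differences before ranking is an assumption, since the article does not state how ties at zero were handled.

```python
def rank_biserial(pre, post):
    """Matched-pairs rank-biserial correlation for paired ordinal scores."""
    # Assumption: pairs with no change are dropped before ranking.
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    if not diffs:
        return 0.0
    # Rank the absolute differences; tied values share their average rank.
    ordered = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    start = 0
    while start < len(ordered):
        end = start
        while (end + 1 < len(ordered)
               and abs(diffs[ordered[end + 1]]) == abs(diffs[ordered[start]])):
            end += 1
        average_rank = (start + end) / 2 + 1  # ranks are 1-based
        for k in range(start, end + 1):
            ranks[ordered[k]] = average_rank
        start = end + 1
    # Rank sums for increases (favorable) and decreases (unfavorable).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return (w_plus - w_minus) / (w_plus + w_minus)


# Hypothetical paired Likert responses for five students (not study data).
pretest = [3, 2, 4, 3, 3]
posttest = [4, 4, 4, 2, 5]
print(rank_biserial(pretest, posttest))
```

Under this definition, a value of 1.00 means that every non-zero change was an increase, and a negative value means that the rank sum of decreases exceeded that of increases.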

Table A3. Cycle 1 item-level statistics for Block 3: Future teaching self-efficacy and professional value.

Item Number and Statement	Pretest Mdn	Pretest IQR	Post-Test Mdn	Post-Test IQR	W	p (Holm)	r_rb
17—Science should not be taught in early childhood education (R)	5	0	5	0	3.5	0.307	−0.67
18—More time should be devoted to science in early childhood education	4	0	4	1	55.0	0.766	0.08
19—I think teaching science to children in early childhood education must be boring (R)	5	1	5	1	15.0	0.635	−0.33
20—The science I can learn is important for my future professional development as an early childhood education teacher	5	1	4	1	18.0	0.283	−0.54
21—I feel capable of teaching science content to children in early childhood education	3	1	4	0	0.0	0.001 *	1.00
22—I consider that I have sufficient knowledge to teach the science content in the early childhood education curriculum	2	1	4	0	7.0	0.001 *	0.96

Note: Mdn = median; IQR = interquartile range; W = Wilcoxon signed-rank test statistic (two-tailed, continuity-corrected); p (Holm) = Holm step-down family-wise error–adjusted p-value within this block; asterisks (*) indicate statistical significance at the 5% level; r_rb = matched-pairs rank-biserial correlation. Item marked (R) was reverse-worded and was reverse-scored so that higher values indicate greater future self-efficacy for teaching science and stronger professional value attributed to science teaching in early childhood education.

Appendix B

The following tables report the course-specific IMI item wording used in Cycle 2, grouped by subscale. Because the IMI is designed to assess subjective experience in relation to a specific target activity, item wording was minimally adapted to anchor responses to the gamified course context (Deci et al., 1994). For example, the generic item “I enjoyed doing this activity very much” was adapted to “I enjoyed taking this course very much”. Such context-sensitive modifications are consistent with established IMI practice when the substantive meaning of the items is preserved. Because the IMI was administered only after the intervention, only post-intervention descriptive statistics are presented.

Table A4. Cycle 2 item-level descriptive statistics for the IMI enjoyment/interest subscale.

Item Number and Statement	Mdn	IQR
1—I enjoyed taking this course very much	6	2
2—This course was fun	6	1.25
3—I thought this was a boring course (R)	6	2
4—The classes in this course did not hold my attention at all (R)	6	2
5—I would describe this course as very interesting	5	2
6—I thought this course was quite enjoyable	6	2
7—During the classes, I thought about how much I was enjoying the course	5	2

Note: Mdn = median; IQR = interquartile range. Items marked (R) were reverse-worded and reverse-scored so that higher values indicate greater enjoyment/interest.
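The reverse-scoring convention mentioned in the note can be shown with a one-line transformation. The sketch assumes a 1 to 7 response scale for the IMI items, which is the usual IMI format and is compatible with the medians reported here but is not stated in this appendix; the function name `reverse_score` and the example responses are hypothetical.

```python
def reverse_score(response, scale_min=1, scale_max=7):
    """Mirror a Likert response so that higher values are more favorable."""
    if not scale_min <= response <= scale_max:
        raise ValueError("response lies outside the assumed scale range")
    # On a 1-7 scale this maps 1 -> 7, 2 -> 6, ..., 7 -> 1.
    return scale_min + scale_max - response


# Hypothetical raw responses to a reverse-worded item (not study data).
raw_responses = [2, 1, 3, 2]
print([reverse_score(r) for r in raw_responses])
```

For a differently sized scale, only the bounds change; for example, `reverse_score(2, 1, 5)` mirrors a response on an assumed 1 to 5 scale.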

Table A5. Cycle 2 item-level descriptive statistics for the IMI perceived competence subscale.

Item Number and Statement	Mdn	IQR
8—I think I am pretty good at this course	5	1
9—I think I did pretty well at this course, compared to other students	4	1
10—After working on the activities in this course for a while, I felt pretty competent	5	0
11—I am satisfied with my performance at this course	5	1
12—I was pretty skilled at this course	5	1
13—This was a course that I couldn’t do very well (R)	5	1.25

Note: Mdn = median; IQR = interquartile range. The item marked (R) was reverse-worded and reverse-scored so that higher values indicate greater perceived competence.

Table A6. Cycle 2 item-level descriptive statistics for the IMI perceived choice subscale.

Item Number and Statement	Mdn	IQR
14—I believe I had some choice in how I participated in this course	5	1.25
15—I felt like I had no choice in how to complete the tasks in this course (R)	6	2.25
16—I didn’t really have a choice about how to participate in this course (R)	7	1.25
17—I felt like I had to complete the activities in a very specific way (R)	6	2
18—I did the activities in this course because I had no choice (R)	5.5	2
19—I did the activities in this course because I wanted to	5	2
20—I did the activities in this course because I had to (R)	5.5	2

Note: Mdn = median; IQR = interquartile range. Items marked (R) were reverse-worded and reverse-scored so that higher values indicate greater perceived choice (autonomy).

Table A7. Cycle 2 item-level descriptive statistics for the IMI relatedness subscale.

Item Number and Statement	Mdn	IQR
21—I felt really distant from my classmates (R)	5	3
22—I really doubt that my classmates and I could ever become friends (R)	4	2
23—I felt like I could really trust my classmates	4.5	3
24—I’d like more opportunities to interact with my classmates	5	1.25
25—I’d really prefer not to interact with my classmates in the future (R)	6	3
26—I don’t feel like I could really trust my classmates (R)	6	3
27—It is likely that my classmates and I could become friends if we interacted more	5	1
28—I feel close to my classmates	4	1.25

Note: Mdn = median; IQR = interquartile range. Items marked (R) were reverse-worded and reverse-scored so that higher values indicate greater relatedness.

Appendix C

Thematic Analysis Summary Tables.

Table A8. Cycle 1 thematic analysis summary.

Theme (Frequency)	Codes	Representative Quotes
Activating motivation through goals, rewards, and visible progress (46.4%)	XP as incentive; points and rewards; increased participation; task persistence; attendance and engagement; progress tracking	“More motivation to gain experience within the platform.” (C1–S1). “It encouraged me to attend classes and participate more.” (C1–S12).
A more playful, dynamic, and enjoyable way of learning science (28.6%)	dynamic classes; enjoyable learning; fun; enriched experience; entertaining format	“It was much more dynamic and entertaining.” (C1–S9). “Very enriching and fun.” (C1–S28).
Reframing science as more approachable and professionally relevant (17.9%)	more positive view of science; content becomes closer; motivation to learn more; future professional relevance; increased confidence	“My perspective changed a lot, seeing science in a more positive and motivating way.” (C1–S10). “I went from having no scientific knowledge to having a basis for my future.” (C1–S8).
Social and professional transfer value (17.9%)	working together; transfer to early childhood teaching; learning strategies for future practice	“I think gamification is very important to motivate children and help them enjoy science.” (C1–S11). “Seeing different learning strategies.” (C1–S27).
Tensions and design limitations (10.7%)	shop/economy calibration; frustration with penalties; gamification not main motivator	“The items were so expensive that I could barely buy anything before the course ended.” (C1–S2). “It changed my attitude, but it also annoyed me that points were deducted for almost anything.” (C1–S7).

Note: Frequencies represent the percentage of unique students endorsing each theme at least once (participant-level; N = 28), and do not sum to 100% because students often endorsed multiple themes. They are used as descriptive indicators of thematic prominence within this corpus of brief open-ended responses and should not be interpreted as inferential estimates of prevalence or as a hierarchy of thematic importance. Representative quotes were translated into English by the authors.
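The participant-level counting rule in the note above (each student counts at most once per theme, and percentages need not sum to 100%) can be illustrated with a minimal Python sketch. The coded responses, theme labels, and the function name `theme_percentages` are hypothetical and serve only to show the arithmetic.

```python
def theme_percentages(coded, n_students):
    """Percentage of unique students endorsing each theme at least once."""
    counts = {}
    for themes in coded.values():
        # set() ensures a student who mentions a theme twice is counted once.
        for theme in set(themes):
            counts[theme] = counts.get(theme, 0) + 1
    return {t: round(100 * c / n_students, 1) for t, c in counts.items()}


# Hypothetical coding of four students' open responses (not study data).
coded = {
    "S1": ["progress", "enjoyment"],
    "S2": ["tensions", "progress"],
    "S3": ["progress", "progress"],  # repeated mention, counted once
    "S4": [],                        # no codable theme
}
result = theme_percentages(coded, n_students=4)
print(sum(result.values()))  # 125.0, because students can endorse several themes
```

Dividing by the number of students rather than the number of coded segments is what makes these figures participant-level indicators.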

Table A9. Cycle 2 thematic analysis summary.

Theme (Frequency)	Codes	Representative Quotes
Behavioral activation: staying on track, attending, and participating (34.4%)	progress and rewards; staying connected; attendance; active participation; tasks and challenges; persistence	“You can see results and rewards, and that keeps you motivated and willing to participate.” (C2–S2). “It motivated me to stay active and show interest in improving myself because of the points attached to fulfilling the course demands.” (C2–S25).
An innovative, dynamic, and interactive methodology (31.2%)	different way of working; interactive activities; novelty; more than lecture/PowerPoint; learning by doing	“It has been a very different way of working in class.” (C2–S4). “I liked going to class knowing we would do different, cooler things, not just sit while the teacher read slides.” (C2–S27).
Harry Potter as a motivating narrative spine (28.1%)	narrative coherence; thematic engagement; initial affinity with Harry Potter; accessible storyline even without prior familiarity	“It was a good narrative thread and it fit the whole syllabus well.” (C2–S17). “The Harry Potter storyline was explained so well that even without having seen the movies, it was easy to follow.” (C2–S29).
Design frictions: theory alignment, challenge density, and narrative familiarity (15.6%)	need for more theory-linked challenges; difficulty studying for exam; unfamiliarity with Harry Potter; initial disconnection from the plot	“I missed more challenges to earn experience points and gold coins.” (C2–S10). “I had not seen the movies, so sometimes I did not know what the plot was about.” (C2–S22).
Feedback and progress monitoring (6.2%)	progress tracking; monitoring learning; visible advancement	“A very fun tool to work with and to see the progress of learning.” (C2–S12). “It helped me maintain continuity in my classroom work and keep my attention on the different projects.” (C2–S9).

Note: Frequencies represent the percentage of unique students endorsing each theme at least once (participant-level; N = 32), and do not sum to 100% because students often endorsed multiple themes. They are used as descriptive indicators of thematic prominence within this corpus of brief open-ended responses and should not be interpreted as inferential estimates of prevalence or as a hierarchy of thematic importance. Representative quotes were translated into English by the authors.

References

Anderson, T., & Shattuck, J. (2012). Design-based research: A decade of progress in education research? Educational Researcher, 41(1), 16–25.
Appleton, K. (2003). How do beginning primary school teachers cope with science? Toward an understanding of science teaching practice. Research in Science Education, 33(1), 1–25.
Avraamidou, L., & Osborne, J. (2009). The role of narrative in communicating science. International Journal of Science Education, 31(12), 1683–1707.
Bai, S., Hew, K. F., Gonda, D. E., Huang, B., & Liang, X. (2022). Incorporating fantasy into gamification promotes student learning and quality of online interaction. International Journal of Educational Technology in Higher Education, 19, 29.
Bakker, A. (2018). Design research in education: A practical guide for early career researchers. Routledge.
Bormann, D., & Greitemeyer, T. (2015). Immersed in virtual worlds and minds: Effects of in-game storytelling on immersion, need satisfaction, and affective theory of mind. Social Psychological and Personality Science, 6(6), 646–652.
Boström, A. (2008). Narratives as tools in designing the school chemistry curriculum. Interchange, 39(4), 391–413.
Braun, V., & Clarke, V. (2006). Using thematic analysis in psychology. Qualitative Research in Psychology, 3(2), 77–101.
Bravo, E., Brígido, M., Hernández, M. A., & Mellado, V. (2022). Las emociones en ciencias en la formación inicial del profesorado de infantil y primaria. Revista Interuniversitaria de Formación del Profesorado, 97(36.1), 57–74.
Bravo, E., Costillo, E., Bravo, J. L., & Borrachero, A. B. (2019). Emociones de los futuros maestros de Educación Infantil en las distintas áreas del currículo. Profesorado. Revista de Currículum y Formación del Profesorado, 23(4), 196–214.
Brígido, M., Borrachero, A. B., Bermejo, M. L., & Mellado, V. (2013). Prospective primary teachers’ self-efficacy and emotions in science teaching. European Journal of Teacher Education, 36(2), 200–217.
Brown, A. L. (1992). Design experiments: Theoretical and methodological challenges in creating complex interventions in classroom settings. Journal of the Learning Sciences, 2(2), 141–178.
Bruner, J. (1991). The narrative construction of reality. Critical Inquiry, 18(1), 1–21.
Cobb, P., Confrey, J., DiSessa, A. A., Lehrer, R., & Schauble, L. (2003). Design experiments in educational research. Educational Researcher, 32(1), 9–13.
Colomo-Magaña, E., Colomo-Magaña, A., Cívico-Ariza, A., & Basgall, L. (2024). Pre-service primary teachers’ perceptions of gamification as a methodology. Journal of Technology and Science Education, 14(1), 109–122.
Dah, J., Hussin, N., Zaini, M. K., Helda, L. I., Ametefe, D. S., & Aliu, A. A. (2025). Gamification is not working: Why? Games and Culture, 20(7), 934–957.
Dai, W.-A., Xu, W., & Xing, Q.-W. (2025). Gamified learning impact: A meta-analysis of game element combinations on students’ learning outcomes. Educational Technology Research and Development, 73(4), 2617–2643.
Deci, E. L., Eghrari, H., Patrick, B. C., & Leone, D. R. (1994). Facilitating internalization: The self-determination theory perspective. Journal of Personality, 62(1), 119–142.
Deci, E. L., & Ryan, R. M. (1985). Intrinsic motivation and self-determination in human behavior. Plenum Press.
Deci, E. L., & Ryan, R. M. (2000). The “what” and “why” of goal pursuits: Human needs and the self-determination of behavior. Psychological Inquiry, 11(4), 227–268.
Design-Based Research Collective. (2003). Design-based research: An emerging paradigm for educational inquiry. Educational Researcher, 32(1), 5–8.
Deterding, S., Dixon, D., Khaled, R., & Nacke, L. E. (2011). From game design elements to gamefulness: Defining gamification. In Proceedings of the 15th International Academic MindTrek Conference (pp. 9–15). ACM.
Dichev, C., & Dicheva, D. (2017). Gamifying education: What is known, what is believed and what remains uncertain: A critical review. International Journal of Educational Technology in Higher Education, 14, 9.
Fabre-Mitjans, N., Jiménez-Valverde, G., Guimerà-Ballesta, G., & Calafell-Subirà, G. (2025). Digital gamification to foster attitudes toward science in early childhood teacher education. Applied Sciences, 15(11), 5961.
Grabner-Hagen, M. M., & Kingsley, T. (2023). From badges to boss challenges: Gamification through need-supporting scaffolded design to instruct and motivate elementary learners. Computers and Education Open, 4, 100131.
Guisasola, J. (2024). Design-based research: Some challenges and insights. Revista Eureka sobre Enseñanza y Divulgación de las Ciencias, 21(2), 2801.
Hamari, J., Koivisto, J., & Sarsa, H. (2014). Does gamification work?—A literature review of empirical studies on gamification. In Proceedings of the 47th Hawaii International Conference on System Sciences (pp. 3025–3034). IEEE.
Hoadley, C., & Campos, F. C. (2022). Design-based research: What it is and why it matters to studying online learning. Educational Psychologist, 57(3), 207–220.
Jiménez-Valverde, G. (2025). Narrative approaches in science education: From conceptual understanding to applications in chemistry and gamification. Encyclopedia, 5(3), 116.
Jiménez-Valverde, G., Calafell-Subirà, G., Fabre-Mitjans, N., & Heras-Paniagua, C. (2024). Desarrollo y validación de un cuestionario sobre actitudes y motivación hacia la física y química en estudiantes de Magisterio: Un análisis comparado. In M. Díez-Ojeda, S. Martínez-Juste, R. Bogdan-Toma, M. E. Dies-Álvarez, A. Ramírez-Segado, R. Jiménez-Fontana, & E. García-González (Eds.), Sobre la educación científica y el cuidado de la casa común: Necesidades y perspectivas (pp. 140–161). Dykinson. Available online: https://hdl.handle.net/2445/215006 (accessed on 29 April 2026).
Jiménez-Valverde, G., Fabre-Mitjans, N., & Guimerà-Ballesta, G. (2025a). Games and playful activities to learn about the nature of science. Encyclopedia, 5(4), 193.
Jiménez-Valverde, G., Fabre-Mitjans, N., & Guimerà-Ballesta, G. (2025b). Narrative-driven digital gamification for motivation and presence: Preservice teachers’ experiences in a science education course. Computers, 14(9), 384.
Jiménez-Valverde, G., Fabre-Mitjans, N., & Guimerà-Ballesta, G. (2026). Potions & Dragons: Player-informed web-based gamification for science attitudinal change in initial teacher education. Computers, 15(2), 78.
Kalogiannakis, M., Papadakis, S., & Zourmpakis, A.-I. (2021). Gamification in science education: A systematic review of the literature. Education Sciences, 11(1), 22.
Kapp, K. M. (2012). The gamification of learning and instruction: Game-based methods and strategies for training and education. Pfeiffer.
Khaldi, A., Bouzidi, R., & Nader, F. (2023). Gamification of e-learning in higher education: A systematic literature review. Smart Learning Environments, 10, 10.
Klassen, R. M., & Durksen, T. L. (2014). Weekly self-efficacy and work stress during the teaching practicum: A mixed methods study. Learning and Instruction, 33, 158–169.
Koivisto, J., & Hamari, J. (2019). The rise of motivational information systems: A review of gamification research. International Journal of Information Management, 45, 191–210.
Kokkotas, P., Rizaki, A., & Malamitsa, K. (2010). Storytelling as a strategy for understanding concepts of electricity and electromagnetism. Interchange, 41, 379–405.
Krath, J., Schürmann, L., & von Korflesch, H. F. O. (2021). Revealing the theoretical basis of gamification: A systematic review and analysis of theory in research on gamification, serious games and game-based learning. Computers in Human Behavior, 125, 106963.
Lehrmann, A. L., Skovbjerg, H. M., & Arnfred, S. J. (2022). Design-based research as a research methodology in teacher and social education—A scoping review. Educational Design Research, 6(3), 1–32.
Li, L., Hew, K. F., & Du, J. (2024). Gamification enhances student intrinsic motivation, perceptions of autonomy and relatedness, but minimal impact on competency: A meta-analysis and systematic review. Educational Technology Research and Development, 72, 765–796.
Li, M., Ma, S., & Shi, Y. (2023). Examining the effectiveness of gamification as a tool promoting teaching and learning in educational settings: A meta-analysis. Frontiers in Psychology, 14, 1253549.
López-Martín, E., & Ardura-Martínez, D. (2023). The effect size in scientific publication. Educación XX1, 26(1), 9–17.
Mazarakis, A., & Bräuer, P. (2023). Gamification is working, but which one exactly? Results from an experiment with four game design elements. International Journal of Human–Computer Interaction, 39(3), 612–627.
McKenney, S., & Reeves, T. C. (2021). Educational design research: Portraying, conducting, and enhancing productive scholarship. Medical Education, 55(1), 82–92.
Mekler, E. D., Brühlmann, F., Tuch, A. N., & Opwis, K. (2017). Towards understanding the effects of individual gamification elements on intrinsic motivation and performance. Computers in Human Behavior, 71, 525–534.
Membiela, P., Vidal, M., Fragueiro, S., Lorenzo, M., García-Rodeja, I., Aznar, V., Bugallo, A., & González, A. (2022). Motivation for science learning as an antecedent of emotions and engagement in preservice elementary teachers. Science Education, 106(1), 119–141.
Millar, R., & Osborne, J. F. (Eds.). (1998). Beyond 2000: Science education for the future: The report of a seminar series funded by the Nuffield Foundation. King’s College London, School of Education.
Negrete, A., & Lartigue, C. (2004). Learning from education to communicate science as a good story. Endeavour, 28(3), 120–124.
Osborne, J., Simon, S., & Collins, S. (2003). Attitudes towards science: A review of the literature and its implications. International Journal of Science Education, 25(9), 1049–1079.
Putz, L. M., Hofbauer, F., & Treiblmaier, H. (2020). Can gamification help to improve education? Findings from a longitudinal study. Computers in Human Behavior, 110, 106392.
Ratinho, E., & Martins, C. (2023). The role of gamified learning strategies in student’s motivation in high school and higher education: A systematic review. Heliyon, 9(8), e19033.
Riar, M., Morschheuser, B., Zarnekow, R., & Hamari, J. (2022). Gamification of cooperation: A framework, literature review and future research agenda. International Journal of Information Management, 67, 102549.
Rowcliffe, S. (2004). Storytelling in science. School Science Review, 86(314), 121–126.
Ryan, R. M. (1982). Control and information in the intrapersonal sphere: An extension of cognitive evaluation theory. Journal of Personality and Social Psychology, 43(3), 450–461.
Ryan, R. M., Connell, J. P., & Plant, R. W. (1990). Emotions in nondirected text learning. Learning and Individual Differences, 2(1), 1–17.
Ryan, R. M., & Deci, E. L. (2000). Self-determination theory and the facilitation of intrinsic motivation, social development, and well-being. American Psychologist, 55(1), 68–78.
Ryan, R. M., & Deci, E. L. (2017). Self-determination theory: Basic psychological needs in motivation, development, and wellness. Guilford Press.
Ryan, R. M., & Deci, E. L. (2020). Intrinsic and extrinsic motivation from a self-determination theory perspective: Definitions, theory, practices, and future directions. Contemporary Educational Psychology, 61, 101860.
Ryan, R. M., Mims, V., & Koestner, R. (1983). Relation of reward contingency and interpersonal context to intrinsic motivation: A review and test using cognitive evaluation theory. Journal of Personality and Social Psychology, 45(4), 736–750.
Ryan, R. M., Rigby, C. S., & Przybylski, A. (2006). The motivational pull of video games: A self-determination theory approach. Motivation and Emotion, 30(4), 344–360.
Sailer, M., Hense, J. U., Mayr, S. K., & Mandl, H. (2017). How gamification motivates: An experimental study of the effects of specific game design elements on psychological need satisfaction. Computers in Human Behavior, 69, 371–380.
Sasway, H. M., & Kelly, A. M. (2021). Instructional behaviors affecting student attitudes towards science. Community College Journal of Research and Practice, 45(6), 385–402.
Sánchez-Martín, J., Cañada-Cañada, F., & Dávila-Acedo, M. A. (2017). Just a game? Gamifying general science class at university: Collaborative and competitive work implications. Thinking Skills and Creativity, 26, 51–59.
Self-Determination Theory. (n.d.). Intrinsic motivation inventory (IMI). Available online: https://selfdeterminationtheory.org/wp-content/uploads/2022/02/IMI_Complete.pdf (accessed on 29 April 2026).
Soares, S., Gonçalves, M., Jerónimo, R., & Kolinsky, R. (2023). Narrating science: Can it benefit science learning, and how? A theoretical review. Journal of Research in Science Teaching, 60(9), 2042–2075.
Tinoca, L., Piedade, J., Santos, S., Pedro, A., & Gomes, S. (2022). Design-based research in the educational field: A systematic literature review. Education Sciences, 12(6), 410.
Tytler, R., & Ferguson, J. P. (2023). Student attitudes, identity, and aspirations toward science. In N. G. Lederman, D. L. Zeidler, & J. S. Lederman (Eds.), Handbook of research on science education: Volume III (pp. 158–192). Routledge.
Van Roy, R., & Zaman, B. (2019). Unravelling the ambivalent motivational power of gamification: A basic psychological needs perspective. International Journal of Human-Computer Studies, 127, 38–50.
Wang, F., & Hannafin, M. J. (2005). Design-based research and technology-enhanced learning environments. Educational Technology Research and Development, 53(4), 5–23.
Xi, N., & Hamari, J. (2019). Does gamification satisfy needs? A study on the relationship between gamification features and intrinsic need satisfaction. International Journal of Information Management, 46, 210–221.
Zainuddin, Z., Chu, S. K. W., Shujahat, M., & Perera, C. J. (2020). The impact of gamification on learning and instruction: A systematic review of empirical evidence. Educational Research Review, 30, 100326.

Figure 1. Examples of three FantasyClass features. From left to right: a student’s avatar (showing HP, XP, gold coins, group icon, level, and pet), an equipment-upgrade screen in the virtual shop, and a card. The use of these screenshots has been authorized by the creator of FantasyClass.

Figure 2. Author-created schematic narrative map, not reproducing official franchise artwork, inspired by Diagon Alley and used in Cycle 2 to situate course missions within the Harry Potter-inspired storyline. Harry Potter, Hogwarts, Dumbledore, Voldemort, Diagon Alley, and related Wizarding World names and indicia are trademarks and/or copyrighted material of their respective rights holders. Their use in this study is limited to the academic description of a non-commercial instructional intervention and does not imply endorsement, sponsorship, or affiliation.

Table 1. Cycle 1 block-level results for the three differentiated affective blocks.

Affective Block	Pre Mdn [IQR]	Post Mdn [IQR]	W	p (Holm)	r_rb
Block 1. Learning experiences	3.38 [0.63]	3.75 [0.53]	57.0	0.003 *	0.70
Block 2. Current attitudes and perceived relevance	4.00 [0.53]	4.19 [0.88]	67.5	0.010 *	0.58
Block 3. Future teaching self-efficacy and professional value	3.92 [0.50]	4.17 [0.50]	26.0	0.002 *	0.81

Note: Mdn = median; IQR = interquartile range; W = Wilcoxon signed-rank test statistic (two-tailed, continuity-corrected) for all blocks. N = 28 paired observations. p (Holm) = Holm step-down family-wise error–adjusted p-value across the three block-level comparisons; asterisks (*) indicate statistical significance at the 5% level; r_rb = matched-pairs rank-biserial correlation.
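To make the W and r_rb columns easier to read, the following Python sketch shows one common way to obtain a signed-rank statistic and the matched-pairs rank-biserial correlation from paired scores. It rests on stated assumptions rather than on the study's analysis scripts: zero differences are dropped, tied absolute differences receive average ranks, W is taken as the smaller of the two rank sums, and r_rb = (W+ - W-) / (W+ + W-). The paired values and the function name `signed_rank_summary` are hypothetical, and the p-value computation is omitted.

```python
def signed_rank_summary(pre, post):
    """Return (W, r_rb) for paired scores, dropping zero differences."""
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    total = w_plus + w_minus
    r_rb = (w_plus - w_minus) / total if total else 0.0
    return min(w_plus, w_minus), r_rb


# Hypothetical pre/post block scores for six students (not study data).
pre = [3, 4, 2, 5, 3, 4]
post = [4, 4, 4, 4, 5, 5]
w_stat, r_rb = signed_rank_summary(pre, post)
print(w_stat, round(r_rb, 2))  # 2.0 0.73
```

Under these assumptions, a positive r_rb indicates that rank-weighted improvements outweigh declines, and values near 1 mean that almost all non-zero changes were increases.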

Table 2. Cycle 1 descriptive profile of perceived motivational value for gamification features and platform-level rating.

Feature/Platform Item	N	Mdn	IQR	% ≥ 4
FantasyClass overall rating	28	5	1	89.3
Experience points (XP)	28	4.5	1	92.9
Leveling up	28	5	1	89.3
Group work	28	5	1	89.3
Gold coins	28	4	1	78.6
Narrative	27	4	1.5	74.1
Health points (HP)	28	4	1.25	71.4
Random events	27	4	1.5	63.0
Monster battles	27	4	1	59.3
Avatar choice	28	4	2	57.1
Pets	28	4	2.25	57.1
Wordlets	25	4	1	52.0
Fortune wheel	27	4	1	51.9
Cards	27	4	2	51.9
Chests	26	3.5	2	50.0
Skills	27	3	1.5	44.4
Badges	27	3	1	40.7
Curses	25	3	2	32.0
FantasyClass shop	27	3	1	29.6

Note: Mdn = median; IQR = interquartile range; % ≥ 4 = proportion of valid responses rated as 4 or 5 on a five-point ordinal rating scale of perceived motivational value (1 = not motivating; 4 = motivating; 5 = highly motivating). Valid N varied by item because of occasional missing responses. The platform-level overall rating is shown first; feature-level items are then ordered by % ≥ 4 in descending order.
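The "% ≥ 4" column and the item-specific valid N described in the note above can be illustrated as follows. This is a minimal sketch with hypothetical ratings; missing responses are represented as `None`, and the function name `top_box` is an assumption introduced for the example.

```python
def top_box(ratings, threshold=4):
    """Return (valid N, percentage of valid ratings at or above threshold)."""
    valid = [r for r in ratings if r is not None]  # exclude missing responses
    share = 100 * sum(r >= threshold for r in valid) / len(valid)
    return len(valid), round(share, 1)


# Hypothetical five-point ratings with two missing responses (not study data).
ratings = [5, 4, None, 3, 4, 2, 5, None, 4, 1]
n_valid, pct = top_box(ratings)
print(n_valid, pct)  # 8 62.5
```

Because the denominator is the valid N for each item, percentages for items with missing responses are not directly comparable as raw counts; for example, 89.3% of 28 valid responses corresponds to 25 students.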

Table 3. Design audit trail linking Cycle 1 evidence to Cycle 2 redesign decisions.

Cycle 1 Evidence	Design Tension Identified	Cycle 2 Redesign Response	Motivational Rationale	Evaluation Focus in Cycle 2
XP, levels and group work were among the most valued features	Preserve the motivational core without overloading the system	Progression and collaboration were retained as central mechanics	Intended to support perceived competence and relatedness	Enjoyment/Interest, Perceived Competence, Relatedness
Shop, badges, skills, and penalty-linked mechanics received more modest or uneven responses	Some mechanics generated friction or unclear usefulness	Shop/economy implementation and selected penalty-linked mechanics were recalibrated	Intended to reduce perceived control or pressure and improve functional clarity	Perceived Choice, Enjoyment/Interest
Open responses suggested that narrative coherence could be strengthened	Narrative was present but remained relatively light	A more continuous Harry Potter-inspired storyline was introduced	Intended to support meaning, continuity and situational interest	Enjoyment/Interest, open responses
Cycle 1 used broad affective measures	Need for a more targeted motivational lens	IMI subscales were used in Cycle 2	Aligns the evaluative focus with intrinsic motivation and SDT constructs	Enjoyment/Interest, Perceived Choice, Perceived Competence, Relatedness

Table 4. Cycle 2 Spearman intercorrelations among IMI subscales.

Subscale A	Subscale B	ρ	p (Raw)	p (Holm)
Enjoyment/Interest	Perceived Competence	0.601	<0.001	0.002 *
Enjoyment/Interest	Perceived Choice	0.551	0.001	0.005 *
Enjoyment/Interest	Relatedness	0.054	0.770	1.000
Perceived Competence	Perceived Choice	0.219	0.229	0.917
Perceived Competence	Relatedness	0.119	0.516	1.000
Perceived Choice	Relatedness	0.119	0.515	1.000

Note: ρ = Spearman rank-order correlation; p (Raw) = uncorrected two-tailed p-value; p (Holm) = Holm step-down family-wise error-adjusted p-value. * p (Holm) < 0.05. N = 32 participants.
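The Holm step-down adjustment referred to in the note above can be sketched in plain Python. The raw p-values below are taken from Table 4 in row order, except the first: it is reported only as <0.001, so the value 0.0003 is a placeholder assumption chosen for illustration. The function name `holm_adjust` is hypothetical, and the sketch is not the authors' analysis code.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted p never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Table 4 raw p-values; 0.0003 is an assumed stand-in for "<0.001".
raw = [0.0003, 0.001, 0.770, 0.229, 0.516, 0.515]
adjusted = holm_adjust(raw)
print([round(p, 3) for p in adjusted])  # [0.002, 0.005, 1.0, 0.916, 1.0, 1.0]
```

The output agrees with the p (Holm) column to within rounding of the published raw values (for example, 4 × 0.229 = 0.916, against the reported 0.917).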
