1. Introduction
In today’s complex socio-economic and cultural landscape, fostering financial literacy has become a critical educational priority [1]. Financial education is increasingly recognised as essential for cultivating competent, responsible, and active citizens [2]. It also supports key United Nations Sustainable Development Goals (SDGs), particularly SDG 4 (Quality Education), SDG 10 (Reduced Inequalities), and SDG 12 (Responsible Consumption and Production) [3]. International institutions, including the OECD [4], highlight the importance of introducing financial education from an early age, in line with children’s cognitive and socio-emotional development. Early financial education contributes not only to individual financial well-being, resilience, and informed decision-making, but also to broader societal outcomes, including financial inclusion, the reduction of inequalities, and the promotion of sustainable economic behaviours [5,6]. When integrated with sustainability principles, financial literacy supports the development of critical, ethical, and socially responsible citizenship. In this sense, it becomes a pillar of sustainability-oriented education, helping learners understand and address interconnected economic, environmental, and social challenges such as overconsumption, environmental degradation, and inequality.
Despite growing awareness and policy efforts, international assessments consistently report low levels of financial literacy among both young people and adults [
6,
7], revealing structural limitations in current curricula and teaching practices. While financial education initiatives are expanding [
8,
9], significant gaps remain in research and provision [
10,
11]. Existing studies tend to emphasize secondary and adult learners [
5,
12,
13], with limited empirical work on younger children and a lack of comprehensive and geographically diverse data on primary financial literacy programs [
14,
15]. At the same time, extensive research across educational domains shows that active learning strategies tend to outperform traditional didactic approaches in fostering meaningful learning and behavioural change [
16,
17,
18]. However, their application in financial education remains scarce [
19,
20]. Among active pedagogies, cooperative learning and game-based learning are widely recognised, yet they are often studied as isolated methods. In practice, these approaches are enacted through broader instructional configurations that vary in terms of student autonomy, group structure, and teacher mediation, all of which may substantially influence learning outcomes. To capture this ecological complexity, this study adopts the concept of ‘instructional implementation package’. It refers to a coherent configuration of classroom features, rather than a pedagogical approach in isolation. Each package includes elements such as group composition and stability, levels of student autonomy, teacher orchestration strategies, and instructional materials. This perspective allows for a more nuanced interpretation of pedagogical effectiveness grounded in real classroom practice.
Direct comparisons of these pedagogical implementations are particularly limited, and very few studies adopt mixed-methods designs that combine quantitative outcomes with qualitative insights on motivation, social interaction, and sustainability competencies. This gap constrains the development of evidence-informed practices capable of addressing cognitive, social, and sustainability-related educational goals simultaneously.
Building on these premises, the present study investigates the effectiveness of two instructional implementation packages—one based on cooperative learning and one based on game-based learning—in fostering financial literacy among primary school pupils. Rather than assuming the inherent superiority of one pedagogy, the study examines how differences in learning outcomes may emerge as a function of how each approach is enacted. The research is guided by the following question:
Which instructional implementation package, the one based on cooperative learning or the one based on game-based learning, is more effective in fostering financial literacy in primary education, and why?
Addressing this question allows us to move beyond method labels and examine the mechanisms through which active approaches support financial literacy and sustainability-related learning.
This study builds upon a previously published qualitative investigation of the same educational intervention and sample [
15]. While the earlier work focused on examining the applicability, strengths, and limitations of the EduFin Framework as an evaluative tool, the present study addresses a different research question. Here, the focus shifts from framework evaluation to instructional effectiveness: specifically, to whether and why two contrasting instructional implementation packages produce different learning outcomes in terms of financial literacy. By adopting a mixed-methods design, this study provides complementary and novel evidence on measurable learning gains, thereby extending the prior qualitative findings from a descriptive and interpretive perspective to an outcome-oriented and explanatory one.
The paper first reviews the theoretical foundations of financial education and evidence-based pedagogy, with particular attention to active and experiential learning. It then outlines the two instructional approaches examined: cooperative learning, drawing on Johnson and Johnson’s Learning Together model, and game-based learning, grounded in socio-constructivist principles and implemented through the Jun€co (Amiotti Foundation, Milan, Italy) financial literacy kit. Section 3 describes the multiple case study design and the two intervention programs, highlighting the alignment between intended learning outcomes, instructional activities, and assessment methods. Finally, quantitative and qualitative findings are presented and discussed in relation to financial literacy development, student motivation, collaboration, and reflective learning processes, with implications for sustainability education and evidence-based practice.
3. Materials and Methods
This study adopts a multiple case study design, an evidence-based approach intended to generate a comprehensive, multi-dimensional understanding of a complex issue within its real-life context [
62]. This study is exploratory in nature and does not aim at population-level statistical generalization beyond the specific educational contexts examined. The primary goal is to avoid oversimplification, enabling a deep and holistic comprehension of the phenomenon under investigation [
63]. This method allows for the exploration of both variations and commonalities across different settings or participants, thus increasing the robustness of the findings [
64,
65]. By examining several cases in parallel, the design facilitates comparative analysis and offers richer insights into context-specific dynamics, while also identifying patterns that transcend individual cases. The research involved two fifth-grade classes from a primary school in Northern Italy. The two classes were assigned to the instructional conditions through convenience assignment, based on institutional constraints and school scheduling considerations. Each class included 16 pupils, all aged 10–11. Class A had 9 male and 7 female pupils, while Class B had an equal ratio of 8 male and 8 female pupils. The relatively small sample size of 32 pupils is consistent with case study research norms, where the focus is on depth over breadth. This sample allows for detailed exploration of participants’ perceptions and behaviors, while supporting systematic cross-case comparison of the two pedagogical implementations [
62,
63,
64,
65]. According to Italian and European regulations for educational research, this study was exempt from formal review by an institutional ethics committee. All participants’ legal guardians provided written informed consent following the school’s standard procedures. The research adhered strictly to ethical guidelines, ensuring voluntary participation, anonymity, and the protection of participants’ rights and well-being.
The data used in this study derive from the same intervention implemented in 2024 as reported in Andreatti and Morselli [15]. Both studies share the same sample and pedagogical context, but differ in focus, research questions, and methodological approach.
Data collection employed a mixed-methods approach, combining quantitative methods (pre- and post-tests) and qualitative ones (focus groups). This allowed for exploration of the research question from multiple perspectives, providing a deeper understanding of the problem [66,67]. Integrating quantitative and qualitative data responded to the need to combine an objective measurement of intervention efficacy (through pre- and post-test statistical analyses) with an exploration of the motivational, relational, and learning dynamics experienced by students (through qualitative analysis of focus group data). Such integration aligns with established educational research paradigms and facilitates a more comprehensive and in-depth evaluation of the interventions.
The pre- and post-tests measured pupils’ financial literacy. They were based on the low-stakes tests developed by INVALSI (the Italian National Institute for the Evaluation of the School System) for Grade 5 pupils and covered the financial topics addressed in the program. The items adopted a competence-based approach, asking the learner to apply financial concepts to everyday life situations [68]. The tests were conceived as a unitary instrument to evaluate pupils’ overall financial literacy, rather than as separate modules for each lesson, in line with the INVALSI framework for primary education. This choice reflected the principle of constructive alignment [69], whereby intended learning outcomes, teaching–learning activities, and assessment are coherently aligned at the program level. Consequently, we calculated a single effect size per class, referring to the entire educational pathway, rather than separate values for each of the six sessions.
To avoid practice effects, the tests were not identical: a pool of 48 items was developed and divided into two parallel forms of 24 items each, equivalent in content, difficulty, and competence coverage. This split-half design ensured comparability while reducing test–retest bias, allowing performance differences to be attributed more reliably to the educational interventions. Using identical tests could have inflated results through memory effects, while entirely different forms could have introduced difficulty imbalances. For each form, internal consistency was assessed by splitting items into two balanced halves, correlating half-scores, and applying the Spearman–Brown correction. The pooled analysis across Classes A and B yielded coefficients of r_SB = 0.71 for the pre-test and r_SB = 0.76 for the post-test, indicating satisfactory reliability. Class-level estimates ranged from 0.70 to 0.85, confirming robust consistency even with small samples. Qualitative checks of item content and difficulty further supported the equivalence of the two forms, reinforcing that observed pre–post differences reflect genuine learning rather than measurement artifacts. Taken together, these analyses support the reliability and parallelism of the test forms [
70]. Each pupil was assigned a numerical code to allow comparison of pre- and post-test results while preserving anonymity. Teachers confirmed that the financial topics addressed in the program were not part of their regular curriculum, ensuring that differences in test performance could be attributed to the intervention rather than to prior instruction.
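The split-half procedure described above can be sketched as follows. This is a minimal illustration in Python, not the authors’ analysis script: the odd/even item split, the function names, and the small 0/1 response matrix are assumptions made for the example, since the text states only that items were split into two balanced halves.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)

def split_half_reliability(responses):
    """responses: one list of 0/1 item scores per pupil.
    Assumption: halves are formed from odd- and even-positioned items."""
    half_a = [sum(row[0::2]) for row in responses]
    half_b = [sum(row[1::2]) for row in responses]
    return spearman_brown(pearson_r(half_a, half_b))

# Hypothetical responses of five pupils to six items (illustrative only).
demo = [
    [1, 1, 1, 1, 1, 0],
    [1, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]
r_sb = split_half_reliability(demo)
```

Read in reverse, the same formula implies that a corrected coefficient of 0.71 corresponds to a correlation of roughly 0.55 between the two half-scores, since r = r_SB / (2 − r_SB).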
Table 1 presents a sample question.
Given the exploratory and preliminary nature of this study, primary emphasis was placed on effect size estimation to quantify the magnitude of the intervention’s impact. While inferential statistics are sensitive to sample size and primarily address whether an observed effect is unlikely to be due to chance, effect sizes provide an estimate of magnitude and educational relevance that is independent of sample size. In educational research, and particularly in small-sample intervention studies, effect sizes are widely recommended because they allow for a more meaningful interpretation of learning gains and facilitate comparison with prior studies and meta-analytic benchmarks. For this reason, Hedges’ g was selected as the primary quantitative indicator of intervention effectiveness, as it provides a bias-corrected estimate suitable for samples smaller than 20 units [71]. This approach avoided generalizations or inferences about a broader population while allowing quantitative data to enrich and explain qualitative findings, and vice versa [72]. To complement these analyses, a non-parametric Wilcoxon signed-rank test was also employed to evaluate whether the observed pre–post differences could be reasonably attributed to chance variation, taking into account the small sample size and the non-normal distributions typical of educational datasets [73,74]. While the Wilcoxon test provides useful inferential support in small samples, consistent with EBE principles [31,32], the central emphasis remains on effect sizes and confidence intervals as primary indicators of intervention effectiveness and practical relevance. The Wilcoxon test results are thus presented as complementary evidence rather than as the main criterion for interpreting the study’s findings.
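For readers who wish to see how such an estimate can be obtained, the sketch below computes Hedges’ g from pre- and post-test scores. It assumes the pooled-standard-deviation form of the standardized mean difference with the usual small-sample correction factor J = 1 − 3/(4·df − 1). The text does not state which pre–post variant was used, and the function name and scores are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def hedges_g(pre, post):
    """Bias-corrected standardized mean difference (post minus pre).

    Assumption: pre and post are standardized by their pooled SD,
    as for two groups; other pre-post variants exist.
    """
    n1, n2 = len(pre), len(post)
    df = n1 + n2 - 2
    pooled_sd = sqrt(((n1 - 1) * variance(pre) + (n2 - 1) * variance(post)) / df)
    d = (mean(post) - mean(pre)) / pooled_sd   # uncorrected Cohen's d
    j = 1 - 3 / (4 * df - 1)                   # small-sample correction factor
    return j * d

# Hypothetical raw scores (number of correct answers), illustrative only.
pre_scores = [10, 12, 14]
post_scores = [13, 15, 17]
g = hedges_g(pre_scores, post_scores)
```

The correction factor shrinks the estimate most when the sample is small, which is why it matters for classes of 16 pupils.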
For the focus groups, the stimulus questions encouraged pupils to reflect on their learning experiences and the processes involved. As the discussions were semi-structured, pupils could steer the conversation toward topics, opinions, and concerns they found most relevant, while maintaining a focus on learning. Due to school regulations and parental consent constraints, focus groups could not be audio- or video-recorded. Instead, the researcher produced detailed real-time transcriptions of pupils’ spoken contributions, aiming to capture their reflections and interactions as faithfully as possible. Immediately after each session, the notes were reviewed and refined to enhance completeness and accuracy. To ensure anonymity, no identifying information was collected, and each pupil was referred to only by a numerical code. Although the absence of recordings limited the possibility of verifying transcripts against audio data, the combination of careful notetaking, immediate revision, and systematic coding provided a reliable and ethically compliant dataset for analysis. Qualitative content analysis was employed, enabling a systematic and adaptable exploration of patterns emerging from the focus groups. This method involved developing a coding frame and iteratively refining it in response to the data, with coding units progressively aligned to the evolving analytical framework [
75].
The qualitative analysis phase was conducted collaboratively, with researchers independently defining coding frames before comparing and merging them to identify categories that best represented the data. Triangulating objective data with analyses from multiple researchers enhanced awareness of potential biases and minimized subjective interpretations [76]. Furthermore, to ensure the reliability of the qualitative coding process, inter-rater agreement between the two researchers was assessed by calculating Cohen’s Kappa coefficient [75].
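As an illustration of the agreement statistic mentioned above, the following sketch computes Cohen’s Kappa for two coders who each assign one category per speaking turn. The function name and the category labels are hypothetical and serve only to show the calculation.

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders assigning one category per unit."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of units")
    n = len(coder_a)
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    # Agreement expected by chance from each coder's marginal proportions.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    # Note: undefined (division by zero) if both coders use one identical category.
    return (observed - expected) / (1 - expected)

# Hypothetical category labels for four speaking turns (illustrative only).
coder_1 = ["roles", "roles", "competition", "competition"]
coder_2 = ["roles", "competition", "competition", "competition"]
kappa = cohens_kappa(coder_1, coder_2)
```

Kappa discounts the agreement that two coders would reach by chance given how often each uses every category, so it is lower than the raw proportion of matching codes.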
After defining the research question and data collection tools, the authors designed two educational programs based on the theory of constructive alignment [69]. This entailed formulating intended learning outcomes (ILOs), planning coherent teaching and learning activities, and selecting appropriate assessment methods, both summative and formative. To promote deep learning, the ILOs in both programs were aligned with the relational level of the SOLO taxonomy, aiming to develop students’ ability to integrate and synthesize knowledge into a coherent whole [77]. Although the cognitive objectives were comparable, the ILOs were specifically adapted to the pedagogical implementation used in each class: a CL-based implementation with high autonomy and stable groups in Class A, and a GBL-based implementation with stronger teacher orchestration and dynamic grouping in Class B, as shown in Table 2. Both programs had the same duration (12 h, delivered in six two-hour sessions) and shared a common assessment structure: a pre-test as diagnostic assessment, a post-test as summative assessment, and focus groups as formative evaluation. The same classroom teacher delivered both instructional implementation packages to the two fifth-grade classes, in order to ensure comparable instructional conditions across groups. The teacher had several years of teaching experience in primary education and prior familiarity with both CL- and GBL-based implementations. This design choice was intended to reduce variability related to teacher-specific characteristics, while acknowledging that each package required distinct instructional roles. Fidelity to the intended implementation packages was supported using predefined instructional plans, shared learning objectives, and standardized materials for each condition. Ongoing coordination between the researchers and the teacher before and during the intervention ensured adherence to the core features of each package, while allowing for natural classroom interaction.
3.1. Description of CL-Based Implementation in Class A
Before starting the educational program, pupils were divided into four base groups, each consisting of four members. These groups remained stable throughout the financial education program. With the support of the class teacher, pupils were assigned to groups heterogeneously, taking into account their performance and interpersonal skills. This approach is important because it fosters peer learning, allowing all students to benefit from mutual support.
In each of the six sessions, the teaching and learning activities were designed to help pupils achieve the ILOs related to six financial education topics, in line with the Financial Competence Framework for children and youth [4]. The topics covered during the program were: the value of goods and services, specialization and exchange, price estimation, the transition from barter to currency, loans and interest, and the sustainable economy.
Each session was divided into three phases, with groups receiving an envelope containing all the materials needed for the planned activities.
The first phase (15 min) focused on internal organization within the group, role assignment, and a review of feedback from the previous session. The roles assigned were as follows:
The secretary, who managed the group, read instructions, and encouraged participation;
The timekeeper, who controlled the tone of voice and managed time;
The teacher, who checked for understanding, summarized, and corrected mistakes;
The scribe, who wrote and completed tasks or exercises.
Roles were rotated in each session so that all pupils experienced each role, thereby sharing responsibility for group success.
The second phase (60 min) was dedicated to the financial education activities. During this phase, groups engaged with exercises, readings, group discussions and reflections, case analyses, and debates. The third phase (15 min) focused on individual and group evaluations, identifying areas for improvement. The final 30 min of each session were reserved for a focus group discussion on what had been learned.
In the CL-based implementation, the adult’s role was limited to distributing materials at the beginning of each phase. Each envelope contained materials for the three phases of work, along with an instruction sheet including a checklist of required actions based on the principle of internal interdependence, thus ensuring a high degree of pupil autonomy.
For feedback, the sandwich technique was used, which places corrective feedback between two positive statements. Feedback is among the most powerful factors influencing the learning process, and the effectiveness of this technique has been studied by Prochazka et al. [78].
3.2. Description of GBL-Based Implementation in Class B
In this class, we tested the Jun€co kit (Ethical and Sustainable Economics for Active Citizenship), developed by the Amiotti Foundation (https://fondazioneamiotti.org/juneco/, accessed on 10 December 2025). This kit has been specifically designed for financial literacy and includes six sessions that use GBL to enhance pupils’ financial literacy, as well as their competencies in active citizenship and ethical economics. The playful activities are designed to develop the ILOs defined during the initial phase of the research, focusing on the same financial education themes outlined earlier. Specifically, the Jun€co kit consists of the following games:
The Hidden Prices Puzzle—To explore how goods and services result from human labour and the complex processes of resource combination.
Pizza, Love, and Economics—To understand how specialisation and exchange have enabled society to create new goods and improve prosperity.
The Price of Parsley—To recognize and assess the price of goods and services, highlighting how products with the same use value can vary significantly in price.
The Economic Goose Game—To learn about the historical evolution of exchange, from bartering to various forms of money, including virtual currency.
The Tug of War with Money—A business game designed to help students understand the impact of decisions on production, investments, and interest-bearing loans.
Recycle—To understand the principles of ethical, sustainable, and circular economics, promoting conscious consumption, reuse, and waste reduction.
In this class, pupils did not work alone, as the activities required team formation. Unlike the cooperative groups in the other class, each game involved the creation of random teams of varying sizes and numbers, depending on the activity. Pupils were randomly assigned to different teams for each game, ensuring that they worked with different teammates every time. This design encouraged interaction with different peers and promoted healthy competition, but without the collective responsibility and structured social skill development typical of cooperative groups.
Unlike the other class, where each session followed a similar structure despite covering different financial education topics, each of these six sessions had a different organization based on the types of games included in the Jun€co kit. Materials included flashcards, game boards, card decks, dice, game pieces, self-correcting puzzles, sets of banknotes and coins, income and expense ledgers, and crosswords. The only constant across all sessions was the final 30 min collective reflection in the form of a focus group [79], aimed at supporting pupils in articulating and consolidating what they had learned. Another difference compared to the other class was the role of the adult, who managed the games by explaining the rules, guiding the execution, and coordinating all phases of the activity. As a result, pupils operated with lower levels of autonomy during the learning process.
4. Results
Table 3 summarizes the quantitative results for Class A, including median pre- and post-test scores (raw scores, i.e., number of correct answers out of 24) with interquartile ranges, means, standard deviations (SD), and the corresponding effect size (Hedges’ g), which indicates the magnitude of the difference between pre- and post-test performance.
In Class A, all pupils showed improvement between the pre-test and post-test (see Supplementary Materials). The intervention produced a large effect size, with a Hedges’ g value of 1.84. The 95% confidence interval for this estimate ranged from 1.01 to 2.67, indicating a substantial and educationally meaningful improvement in performance. The Wilcoxon Signed-Rank Test indicated a statistically significant improvement in financial literacy scores from pre-test to post-test. The test statistic was W = 0, with a corresponding standardized value Z = −3.52 and a two-tailed significance level of p = 0.00044. A value of W = 0 means that every score difference reflected improved post-test performance. The mean difference in scores was 6.31, providing strong evidence of a meaningful positive effect of the educational intervention on pupils’ financial literacy.
Table 4 presents the quantitative results for Class B, including median pre- and post-test scores (raw scores) with interquartile ranges, means, standard deviations (SD), and the corresponding effect size (Hedges’ g).
In Class B, pupils demonstrated clear improvement in their performance from the pre-test to the post-test. Only one student showed a regression, with their score decreasing from 10 to 6 correct responses (see Supplementary Materials). The intervention yielded a Hedges’ g value of 1.2, indicating a large effect size. The 95% confidence interval ranged from 0.45 to 1.95, suggesting a meaningful improvement in pupils’ financial literacy following the intervention.
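The reported intervals can be checked against a common large-sample approximation for the standard error of a standardized mean difference. The sketch below is an inference about how such intervals may be obtained, not a procedure stated in the text: it assumes that the pre- and post-test scores are treated as two groups of 16 pupils each and that a normal critical value of 1.96 is used. The function name is hypothetical.

```python
from math import sqrt

def g_confidence_interval(g, n1, n2, z_crit=1.96):
    """Approximate 95% CI for a standardized mean difference.

    Assumption: SE = sqrt((n1 + n2) / (n1 * n2) + g**2 / (2 * (n1 + n2))).
    """
    se = sqrt((n1 + n2) / (n1 * n2) + g ** 2 / (2 * (n1 + n2)))
    return g - z_crit * se, g + z_crit * se

# Effect sizes reported in the text; group sizes of 16 are an assumption.
ci_class_a = g_confidence_interval(1.84, 16, 16)
ci_class_b = g_confidence_interval(1.2, 16, 16)
```

Under these assumptions the intervals round to 1.01 to 2.67 for Class A and 0.45 to 1.95 for Class B, which agrees with the reported values.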
The Wilcoxon Signed-Rank Test was performed on 15 paired pre-test and post-test scores to assess the effectiveness of the intervention. One participant was excluded from the analysis because their pre-test and post-test scores were identical, resulting in a difference score of zero; such cases are excluded from the Wilcoxon test, thereby reducing the effective sample size. The test statistic was W = 9, with a corresponding standardized value Z = −2.90 and a two-tailed significance level of p = 0.00374. These results indicate a statistically significant improvement in financial literacy attributable to the intervention.
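The standardized values reported for the two classes can be reproduced with the normal approximation to the signed-rank statistic. The sketch below is a minimal check under stated assumptions: W is taken as the smaller of the two rank sums, n is the number of non-zero differences (16 in Class A, 15 in Class B), and no tie or continuity correction is applied. The function name is hypothetical.

```python
from math import erfc, sqrt

def wilcoxon_normal_approx(w, n):
    """Z and two-tailed p for a signed-rank statistic W.

    w: smaller of the positive and negative rank sums.
    n: number of non-zero paired differences.
    Assumption: no tie or continuity correction is applied.
    """
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean_w) / sd_w
    p_two_tailed = erfc(abs(z) / sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_two_tailed

z_a, p_a = wilcoxon_normal_approx(0, 16)   # Class A: W = 0, 16 pairs
z_b, p_b = wilcoxon_normal_approx(9, 15)   # Class B: W = 9, 15 pairs
```

With these inputs the Z values round to −3.52 and −2.90, matching the reported statistics. The resulting p-values are close to the reported ones; small discrepancies can arise if Z is rounded before being converted to a p-value.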
Table 5 presents the categories that emerged from the analysis of the pupils’ contributions regarding how the learning process unfolded during the activities (the complete transcripts are provided in the Supplementary Materials). Each speaking turn was assigned to a single category based on its primary qualitative content, ensuring mutually exclusive and clearly defined classification. The Cohen’s Kappa coefficient obtained was κ = 0.747, indicating substantial agreement between coders according to Landis and Koch’s criteria [80]. This level of agreement supports the reliability of the qualitative coding process.
For each category, absolute counts and corresponding percentages are provided separately for the two classes: Class A (n = 42 speaking turns) and Class B (n = 12 speaking turns). Percentages are calculated based on the total number of speaking turns for each class. Representative quotes are provided to enhance transparency and to add qualitative depth to the identified themes. Given the substantial difference in the total number of speaking turns between the two classes, frequencies and percentages should be interpreted primarily as descriptive indicators of emerging themes, rather than as precise measures of engagement or emphasis.
As shown in the table, in Class A, where the financial education program based on CL was implemented, pupils reflected on the learning process to a much greater extent than in Class B, where GBL was used. This difference is noteworthy, as the semi-structured focus groups allowed for open-ended responses, giving pupils the freedom to discuss what they found meaningful.
In Class A, pupils clearly valued the opportunity to reflect on their experiences within the cooperative learning groups. The categories that emerged were closely related to key components of CL. For example, pupils discussed the functioning and assignment of roles within their cooperative groups. One student noted, “It took us a while to understand that not everyone does everything—each person has their own role”. As this class had no prior experience with CL, pupils initially struggled to grasp its structure, particularly the principle of positive interdependence required to achieve shared learning goals.
Pupils also reflected on their progress—“We learned how to work as a team. It used to be harder to collaborate, but now it’s not anymore!”—as well as on the challenges faced and strategies for improvement in future sessions, saying things like “We need to be a bit faster”, “We need to respect other people’s ideas more”, and “We have to get better at discussing things”. The novelty of the approach also emerged as a theme, since CL—particularly the Learning Together model—had never been implemented by their teachers before. This was highlighted by multiple pupils: “We’ve never worked like this before…” and “Usually the teacher tells us what to do… it’s nice to do everything by ourselves”.
In Class B, the category of cooperation did not emerge, despite pupils being divided into teams. Instead, the dominant theme was competition, with students frequently referring to rivalry and the desire to outperform others. For instance, one student remarked, “We’ll see soon… when you run out of money and we don’t!” while another added, “We played it smart and won… can’t say the same for you!”.
Both classes expressed positive emotions regarding the learning process and commented on the complexity of certain activities or specific games. However, the theme of ease, interpreted as a growing sense of confidence and understanding, emerged only in Class A.
5. Discussion
Previous research has shown both pedagogical approaches to be effective, with similar average effect sizes (0.53) [31]. Our findings offer a more context-sensitive analysis, highlighting how pedagogical effectiveness is mediated by implementation variables. Recognizing the importance of contextual factors, such as student engagement, classroom dynamics, and alignment with intended learning outcomes, we formulated the research question: Which instructional implementation package, the one based on cooperative learning or the one based on game-based learning, is more effective in fostering financial literacy in primary education, and why?
The discussion interprets the quantitative and qualitative findings in relation to this question and situates them within the existing literature. Throughout this section, references to cooperative learning and game-based learning are understood as referring specifically to their classroom enactment through the two instructional implementation packages examined, rather than to the pedagogical approaches in isolation.
Consistent with the theoretical framework, the two implementation packages were expected to produce differential learning outcomes, particularly in relation to the depth of financial literacy. The quantitative results indicate that both the CL-based implementation (g = 1.84) and the GBL-based implementation (g = 1.2) produced large observed effects on financial literacy among primary school pupils. To interpret the magnitude of these effect sizes, we adopted conventional benchmarks commonly used in educational research: according to Cohen [81], effect sizes of 0.2, 0.5, and 0.8 are considered small, medium, and large, respectively. Hattie’s “zone of desired effects” [31] further suggests that effects above 0.4 are likely to produce meaningful educational outcomes. By these criteria, both CL- and GBL-based implementations generated substantial learning gains. The associated confidence intervals, neither of which includes zero, further support the robustness of these findings within the specific study context.
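The benchmarks cited above amount to a simple decision rule, sketched below. The function name, the treatment of each threshold as an inclusive lower bound, and the label for values under 0.2 are assumptions made for illustration.

```python
def interpret_effect_size(g):
    """Label an effect size using the benchmarks cited in the text."""
    magnitude = abs(g)
    if magnitude >= 0.8:
        cohen_label = "large"
    elif magnitude >= 0.5:
        cohen_label = "medium"
    elif magnitude >= 0.2:
        cohen_label = "small"
    else:
        cohen_label = "below small"   # hypothetical label, not from Cohen
    # Hattie's "zone of desired effects" is taken here as effects above 0.4.
    in_desired_zone = magnitude > 0.4
    return cohen_label, in_desired_zone

class_a_reading = interpret_effect_size(1.84)
class_b_reading = interpret_effect_size(1.2)
```

Both reported values fall in the "large" band and above the 0.4 threshold, which is the reading adopted in this section.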
These results are in line with evidence reported by Kaiser and Menkhoff, showing that experiential and small-group-based financial education programs are particularly effective with younger learners [
19,
20], and they extend this evidence to the under-researched domain of primary education [
8,
9,
10,
11]. The effectiveness of both implementation packages is also consistent with Frisancho’s conclusions [
82], that experiential financial education programs tend to benefit a broad range of students, regardless of their initial knowledge level. At the same time, the magnitude of the observed effect sizes warrants careful interpretation, as they exceed average estimates reported in meta-analyses of financial education interventions [
19,
20]. Several contextual and design-related factors may help explain these comparatively large effects. First, the intervention was designed according to the principle of constructive alignment, ensuring close coherence between intended learning outcomes, instructional activities, and assessment. Such alignment is known to amplify measured learning gains when assessment closely reflects instructional content. Second, the competence-based pre- and post-test, specifically aligned with the instructional content, may have been particularly sensitive to the specific knowledge, practices, and cognitive processes emphasized during instruction. As a result, it may have captured learning gains that would be less visible using more generic or standardized measures. Third, the small sample size and the highly contextualized nature of the intervention limit external validity, and effects of this magnitude may not be replicated in more heterogeneous settings or with less tightly aligned assessment instruments. Accordingly, the reported effect sizes should be interpreted as context-dependent and exploratory, rather than as broadly generalizable benchmarks.
The Wilcoxon signed-rank tests further confirmed statistically significant pre–post improvements in both classes, reinforcing the conclusion that the observed gains reflect genuine learning outcomes [
74]. Together, these quantitative findings provide context-sensitive empirical support for the theoretical assumptions underpinning the two instructional implementation packages [
31]. Cooperative learning is explicitly designed to foster positive interdependence, individual accountability, and sustained peer interaction, conditions associated with deeper conceptual understanding [
37,
38,
39]. In contrast, game-based learning is grounded in socio-constructivist and motivational theories that emphasize engagement but are highly sensitive to implementation features such as competition, orchestration, and novelty [
52,
53,
54,
58].
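For readers less familiar with the Wilcoxon signed-rank test referred to in this paragraph, the following minimal Python sketch shows how such a paired pre–post comparison can be computed. It is an illustrative implementation under stated assumptions, not the procedure or software used in this study: zero differences are dropped, tied absolute differences receive average ranks, and the two-sided p-value is obtained from the exact null distribution. The function name and the example scores are hypothetical.

```python
def wilcoxon_signed_rank(pre, post):
    """Return (W, two-sided exact p) for paired pre/post scores.

    Assumptions: zero differences are discarded; ties in absolute
    differences get average ranks. Ranks are stored doubled so that
    averaged ranks stay integers.
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_ranks = [0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        # Positions i..j share the average of ranks i+1..j+1; doubled, that is i + j + 2.
        for k in range(i, j + 1):
            doubled_ranks[order[k]] = i + j + 2
        i = j + 1

    total = sum(doubled_ranks)
    w_plus = sum(r for r, d in zip(doubled_ranks, diffs) if d > 0)
    statistic = min(w_plus, total - w_plus)

    # Exact null distribution: every rank is positive or negative with probability 1/2.
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled_ranks:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    p_value = min(1.0, 2 * sum(counts[: statistic + 1]) / 2 ** n)
    return statistic / 2, p_value


if __name__ == "__main__":
    # Hypothetical scores, for illustration only (not data from this study).
    pre_scores = [4, 5, 6, 5, 7, 6, 3, 5]
    post_scores = [7, 8, 8, 7, 9, 9, 6, 6]
    w, p = wilcoxon_signed_rank(pre_scores, post_scores)
    print(w, p)
```

The test uses only the signs and ranks of the paired differences, which is why it suits small classes where normality of the score differences cannot be assumed.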
Qualitative evidence from student focus groups helps illuminate the mechanisms underlying these outcomes. As illustrated in
Table 5, in the CL-based implementation, pupils displayed increasing metacognitive awareness of their learning processes. They frequently reflected on collaboration (“Now we’ve really gotten good at working together”), their strategies for navigating group challenges (“We have to learn how to bring our ideas together”), and the evolution of group dynamics. Statements such as “If someone doesn’t do their job properly, the assembly line breaks” and “Usually the teacher tells us what to do… it’s fun doing everything by ourselves” illustrate how autonomy and stable group membership supported reflective engagement and collective responsibility. These features, with the teacher acting mainly as a facilitator [
37], are consistent with a classroom climate aligned with Theory Y [
33,
69], characterized by intrinsic motivation, deep engagement and self-regulated learning.
In contrast, students in the GBL-based implementation more often framed their experience through a competitive lens. Although the games included team-based elements, classroom dialogue emphasized outperforming peers, and learning strategies were oriented toward rapid information acquisition in order to succeed in the game. As one student noted, “We’ll see soon… when you run out of money and we don’t!”—highlighting a shift from cooperative problem-solving to peer rivalry. While this approach generated strong engagement during gameplay, it appeared to privilege performance over conceptual understanding. Once the game phase ended, this surface-oriented engagement appeared to lose relevance. This competitive climate aligns with Theory X [
33,
69], where extrinsic rewards take precedence over intrinsic learning goals. This pattern aligns with previous studies indicating that motivational benefits associated with game-based learning may diminish over time without sustained pedagogical scaffolding [
50,
55,
56]. Accordingly, the effectiveness of GBL-based implementation appears to depend strongly on the educator’s ability to balance pedagogical goals, scaffolding strategies, student needs, and game dynamics to support meaningful learning [
57,
58].
The qualitative evidence should be interpreted with caution due to an imbalance in the volume of contributions recorded across the two focus groups. Pupils in the CL-based implementation produced more verbal reflections related to learning processes than those in the GBL-based implementation. This difference may reflect contextual factors such as discussion time, communication styles, or familiarity with verbal reflection rather than differences in engagement alone. For this reason, qualitative frequencies are treated as descriptive indicators rather than as directly comparable measures of effectiveness.
Another contextual factor that may have influenced the effectiveness of the two pedagogical implementations was the teacher’s role during key moments of the learning process. In the GBL-based implementation, the high level of teacher orchestration (focused on managing gameplay, explaining rules, and coordinating turns) appeared to limit opportunities for student autonomy and reflective engagement. Consequently, pupils in Class B had fewer opportunities to self-organize, negotiate roles, or support peer understanding. In contrast, pupils in the CL-based implementation managed learning through distributed roles (e.g., timekeeper, secretary, teacher, scribe), with the teacher acting primarily as a facilitator. Qualitative analyses corroborated this dynamic, revealing that students in Class B perceived lower levels of autonomy and cooperation than those in Class A. These differing levels of teacher involvement are consistent with the typical characteristics of the respective instructional implementations and followed established guidelines in the literature [
37,
49]. CL inherently promotes student autonomy and peer collaboration with minimal teacher intervention; GBL conventionally involves active teacher facilitation to ensure smooth progression [
57,
58]. This fundamental distinction, while reflecting authentic pedagogical differences, also constitutes a confounding factor that calls for cautious interpretation of comparative outcomes. While some authors highlight the importance of the teacher’s role in GBL [
57,
58], this finding supports Bado’s [
49] warning that excessive teacher intervention may undermine the student-led discovery that makes experiential approaches most effective. Taken together, these classroom dynamics may help explain the deeper and more reflective learning observed in the CL-based implementation.
The relatively high effect size observed in the GBL-based implementation, associated with more context-dependent and performance-oriented learning, may be partly attributable to an initial novelty effect. Several students expressed excitement and curiosity during the early sessions—an effect identified by Faiella and Ricciardi [
53] as a potential catalyst for engagement. Novelty can temporarily enhance engagement by disrupting routine instructional practices. However, in this study, as competitive dynamics became more salient, this motivational boost appeared to diminish, and engagement shifted toward performance-oriented goals. This shift reflects a move from a more autonomous and intrinsically motivated environment (Theory Y) to a more controlled and competitive one (Theory X) and is consistent with existing literature. Faiella and Ricciardi [
53] note that the effectiveness of GBL largely depends on students’ intrinsic motivation, the novelty of the activity, and the quality of game design. While GBL can promote knowledge acquisition, its long-term effects on behavioural change and deep conceptual understanding remain debated [
59], with some studies reporting outcomes comparable to traditional instruction [
60,
61]. When competition overshadows collaboration—as observed in this study—motivational benefits tend to be short-lived and oriented toward performance rather than meaningful learning and deep engagement [
31]. Individual variability further illustrates this dynamic: one pupil in the GBL-based implementation exhibited a notable decline in performance, highlighting how strongly individual variation can affect results in small-sample studies. Possible explanations include a mismatch between the student’s learning style and the game mechanics, fluctuations in motivation, cognitive overload associated with fast-paced game elements, or external factors such as stress or fatigue. Differences in prior knowledge or familiarity with game-based learning environments may also have contributed. Acknowledging such variability is essential for interpreting the findings and suggests that future research should examine which learner profiles may benefit most, or least, from specific features of GBL-based implementation [
53].
Conversely, a substantial body of research supports the potential benefits of cooperative learning approaches, highlighting improvements in motivation, academic achievement, critical thinking, and social skills [
38,
39,
41]. Johnson and Johnson [
36] demonstrated that CL fosters positive interdependence, individual accountability, and social competence, which together enhance both cognitive and affective learning outcomes. In this study, these effects appeared particularly salient in the CL-based implementation characterized by stable groups and high autonomy.
Taken together, these findings highlight not only the cognitive and motivational implications of adopting instructional implementation packages based on active pedagogies but also their broader educational value, particularly in relation to sustainability competences. Pupils were encouraged to identify and discuss the potential risks and sustainability implications of investment activities, with attention to environmental and social impacts. Evidence of collective action was observed in the CL-based implementation, where pupils negotiated roles and responsibilities and reached shared decisions. Elements of systems thinking emerged as pupils connected financial scenarios to broader real-life contexts, recognising interdependencies between individual and collective choices. Similarly, signs of future orientation were present when pupils reflected on the longer-term consequences of financial decisions. Although these competences were not the primary focus of assessment, their presence in classroom interactions suggests that active pedagogies may support sustainability-related skills alongside financial literacy.
Overall, this study provides contextually rich, exploratory evidence of how cooperative learning and game-based learning function when enacted through contrasting instructional implementation packages in primary financial education. By foregrounding implementation features, teacher mediation, and classroom dynamics, the findings extend existing literature and offer practical insights for educators and policymakers seeking to design evidence-informed and developmentally appropriate financial education from an early age.
6. Conclusions
This study explored the relative effectiveness of two distinct instructional implementation packages—one grounded in cooperative learning (CL) and one grounded in game-based learning (GBL)—in fostering financial literacy in two fifth-grade classes at an Italian primary school. The study is situated within the field of sustainability education, as financial education is a key enabler of the United Nations Sustainable Development Goals, particularly SDG 4 (Quality Education), SDG 10 (Reduced Inequalities), and SDG 12 (Responsible Consumption and Production) [
3].
As an exploratory multiple case study conducted within a single school context and with a small sample size, the findings should be interpreted cautiously, as preliminary and closely tied to the specific characteristics of the two participating classes [
19,
20,
31].
The analysis of effect sizes revealed that both implementations were highly effective, each achieving an effect size well above the 0.4 threshold of Hattie’s zone of desired effects and thus producing educationally meaningful learning gains. The CL-based implementation showed a larger observed effect size (g = 1.84) than the GBL-based implementation (g = 1.20). However, this difference should be interpreted as the result of two distinct instructional implementation packages, characterized by different levels of student autonomy, grouping structures, and teacher orchestration, rather than as evidence of the intrinsic superiority of one pedagogical method over the other.
The qualitative analysis revealed a fundamental difference in the learning processes activated by the two approaches. The CL-based implementation appeared to foster mutual support, autonomy, and shared responsibility, encouraging students to engage deeply with content while developing social and relational competences [
39]. These skills—including cooperation, reflection, conflict resolution, and decision-making—contribute to the development of key sustainability competences [
3] and support the SDGs previously mentioned.
Through these features, the CL-based implementation may support the formation of active, responsible, and socially aware learners. In doing so, it promotes a Theory Y classroom climate, characterized by active participation, intrinsic motivation, and a focus on the learning process itself [
69]. In contrast, the GBL-based implementation, while engaging, tended to activate a competitive dynamic that shifted students’ attention from collaboration and understanding toward winning and outperforming peers. Such dynamics can foster a Theory X climate, marked by extrinsic motivation and surface-level learning, in which pressure and performance concerns could limit attention to deeper comprehension. Taken together, these observations suggest that within the specific instructional and contextual conditions examined in this study, the learning outcomes observed are better understood as the product of two distinct implementation packages—CL with high autonomy and stable groups, and GBL with strong teacher facilitation and dynamic grouping—rather than as evidence of the intrinsic superiority of one pedagogical method over the other.
These findings have important implications for motivation and classroom culture [
31]. Cooperative learning’s emphasis on positive interdependence and shared goals supports intrinsic motivation, which research links to more meaningful and higher-quality learning. Conversely, excessive competition can undermine intrinsic motivation and promote context-dependent knowledge, potentially limiting the long-term benefits of financial literacy education [
53,
56]. Nonetheless, the effectiveness of GBL should not be underestimated; it still contributes positively to student learning and engagement [
54], especially when thoughtfully implemented to balance competitive and cooperative elements.
Beyond the specific comparison, this study offers broader insights relevant to diverse educational contexts. It highlights the importance of adopting active instructional implementation packages that effectively foster financial literacy and sustainability competences in line with the SDGs in primary education. While specific strategies may vary according to context, approaches that actively engage pupils in meaningful tasks, encourage collaboration, and stimulate critical reflection have demonstrated strong potential and adaptability. On this basis, teachers and educators can integrate active approaches into daily practice by designing activities that promote responsibility, dialogue, and shared decision-making. For example, assigning structured roles in group work or combining playful activities with collective reflection may strengthen both autonomy and deeper understanding, while also nurturing sustainability-related competences such as systems thinking and collective action. Likewise, parents can play a crucial role by involving children in small everyday financial choices, encouraging open conversations about money management, and transforming playful moments into opportunities for awareness and cooperation. They can also support the transfer of school learning into daily life by discussing budgeting decisions or sustainability-related trade-offs at home. Strengthening alignment between school and family contexts may further contribute to the development of financially literate, socially responsible, and sustainability-conscious citizens.
With regard to limitations, our research involved a small number of primary school pupils, which limits population-level generalisability and reduces statistical power for detecting small effects. The absence of an external control group further limits the scope for statistical generalisation. However, the detection of statistically significant pre–post differences and large effect sizes suggests that the observed learning gains were robust within this specific context. Accordingly, the findings should be interpreted as context-dependent and exploratory, rather than as population-level estimates. In addition, the pre- and post-tests were developed as two parallel forms of a competence-based instrument, conceived as a unitary test of financial literacy rather than six separate modules. While this choice was consistent with the principle of constructive alignment, it limited the possibility of analysing variations across topics. Future research with larger samples and multiple trials could address these issues by computing separate effect sizes for each learning unit, thereby providing a more fine-grained comparison of cooperative and game-based approaches.
A further limitation concerns the qualitative data collection process: due to school regulations, focus group sessions could not be audio- or video-recorded, and transcripts were instead produced manually in real time. Although detailed note-taking and immediate revision enhanced accuracy, this procedure may have reduced the richness of interactional details compared to full recordings.
An additional limitation of the present study concerns the potential impact of unmeasured or unexamined teacher-related variables in the two experimental settings. Although a prominent mediating and controlling role of the teacher was noted in the GBL-based implementation, which appeared to limit students’ autonomy and reflective engagement, it cannot be excluded that teacher-related factors also positively or negatively influenced the effectiveness of the intervention in the CL-based implementation. For instance, aspects such as the teacher’s ability to effectively organize cooperative groups, the quality of feedback provided, and motivational strategies may have played a significant role in the success of the CL-based implementation. Similarly, variables such as the teacher’s competence in facilitating game-based learning environments, managing gameplay time, and balancing intervention with student autonomy might have affected results in the GBL-based implementation. These considerations suggest that some observed differences may be partially attributable to unmeasured teacher-related factors. This underscores the need for future research to include more systematic analyses of teacher variables and their moderating role in pedagogical effectiveness.
Despite these limitations, this study demonstrates that financial literacy can be effectively developed in primary school through active instructional implementation packages. Furthermore, such approaches support the development of key sustainability competences, including critical thinking, cooperation, and responsible decision-making, thereby contributing directly to the educational goals of the 2030 Agenda. The findings also highlight the importance of fostering student autonomy and maintaining a balanced classroom climate, avoiding overly competitive environments that may privilege extrinsic motivation and hinder deeper learning.
Future research could expand and generalize these findings by encompassing a wider range of settings, replicating the study in other primary schools to compare results, or including secondary school students. Additionally, other instructional implementation packages, including hybrid models that combine CL and GBL, could be evaluated for their effectiveness. Moreover, the long-term effects of the developed financial competences remain to be empirically investigated, alongside other contextual variables such as students’ socioeconomic backgrounds and gender differences, to better understand the impact of different instructional strategies.