2. Materials and Methods
2.1. Participants
The study involved 388 students from two schools: a public high school in the province of Valladolid (Spain) and a Spanish state-run school in Bogota (Colombia), which follows the same Spanish educational system and official curriculum as schools in Spain. The distribution of students by stage of education and group is shown in Table 1.
2.2. Instruments
To assess general cognitive aptitudes and, in particular, those related to logical-mathematical reasoning, two versions of a psychometric battery were administered: the BADyG/M-r (Differential and General Skills Battery, middle level, updated version) in years one to four of compulsory secondary education (ESO), and the BADyG/S-r (Differential and General Skills Battery, high level, updated version) in years one and two of Bachillerato (upper secondary).
BADyG (Yuste & Martínez, 2003a, 2003b) is one of the most widely applied tools in education and educational psychology thanks to its high degree of validity and reliability (Monsalvo Díez & Carbonero Martín, 2009). It is a test of intelligence, understood here as the capacity to undertake activities that require abstract thinking and the ability to predict the possible consequences of hypothetical situations.
This construct is assessed through six first-order factors: (a) verbal analogies, (b) numerical series, (c) logical matrices, (d) sentence completion, (e) numerical problems, and (f) matching shapes. These are grouped into four second-order factors: (a) logical reasoning, which includes verbal analogies, numerical series, and logical matrices; (b) verbal factor, which encompasses verbal analogies and sentence completion; (c) numerical factor, comprising numerical series and numerical problems; and (d) visual-spatial factor, which covers logical matrices and matching shapes. The sum of these four complementary factors provides a global value, general intelligence.
2.2.1. Validity of the Construct
Official assessments of the BADyG have provided ample evidence of validity based on its internal structure. In the case of the BADyG/M-r, exploratory factor analyses show a general intelligence factor and three correlated factors (verbal, numerical, and visual-spatial), which is consistent with the battery’s theoretical framework.
These results also support the hierarchical conception of intelligence and the factor structure proposed by the authors. Likewise, in the BADyG/S-r, principal component analyses have confirmed the presence of a general factor and of the specific factors that make up the test, thus offering sufficient evidence of construct validity (Spanish General Council of Psychology, 2019a, 2019b).
2.2.2. Concurrent Validity
Both versions of the psychometric battery provide empirical evidence of validity based on their relationship with relevant external criteria. For the BADyG/M-r, significant correlations with academic performance in language, mathematics, English, and social sciences have been reported, with values deemed appropriate for assessing cognitive aptitudes amongst schoolchildren.
These relationships are predictive, since the BADyG is administered at the start of the school year and academic results are recorded when the school year ends. As regards the BADyG/S-r, positive correlations with academic performance have also been found, with mean values between 0.20 and 0.35, thereby offering further evidence of validity in secondary students aged 16 to 17 (Spanish General Council of Psychology, 2019a, 2019b).
2.2.3. Reliability
Internal consistency, as an index of the reliability of the scores, has been analyzed through Cronbach’s alpha coefficient and the corrected split-half method. In the BADyG/M-r, coefficients were above 0.75 in the basic tests, over 0.90 in the global factors, and over 0.94 in general intelligence (Spanish General Council of Psychology, 2019a).
In the case of the BADyG/S-r, internal consistency varies between 0.70 and 0.86 in the basic tests and between 0.83 and 0.88 in the global factors, reaching values of between 0.93 and 0.94 in general intelligence (Spanish General Council of Psychology, 2019b). These results support the psychometric robustness of the two batteries for use in research in education.
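For readers less familiar with these two reliability indices, the following minimal sketch shows how each is usually computed. It assumes an odd-even split and the Spearman-Brown correction for the split-half method, which the text does not specify; the function names and the toy data are hypothetical and are not BADyG data.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    k = len(responses[0])
    items = list(zip(*responses))                      # one tuple of scores per item
    item_variances = [variance(item) for item in items]
    total_variance = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def pearson_r(x, y):
    """Pearson correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def split_half_reliability(responses):
    """Odd-even split-half reliability with the Spearman-Brown correction (assumed)."""
    odd = [sum(row[0::2]) for row in responses]        # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]       # items 2, 4, 6, ...
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)


# Hypothetical toy data: 5 respondents x 4 dichotomous items (not BADyG data).
toy = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(toy), 3), round(split_half_reliability(toy), 3))
```

Both indices approach 1 as the items rank respondents more consistently, which is why coefficients above 0.90 for the global factors are read as high reliability.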
2.3. Methodological Design and Procedure
Since the participating groups were already formed in each school, and because no random allocation could be made, the study was organized with a quasi-experimental design that included a non-equivalent comparison group, which is common in research in education in natural environments. Work was carried out with an experimental group, whose activities focused on computational thinking, and a comparison group, which proceeded with their regular curriculum.
In both educational contexts, existing class groups were assigned either to the experimental group or to the comparison group within each school; class assignment to condition was conducted at the intact class-group level, based on organizational and timetable constraints, in order to avoid disruption of regular school functioning and to minimize contamination between groups. Thus, both schools included experimental and comparison groups following the same institutional framework. All intervention activities were implemented during regular class time within the subject “Technology and Digitalization”. No additional instructional time or extracurricular sessions were introduced for the experimental group. This design reduces the likelihood that observed effects are attributable to increased instructional time or general enrichment, rather than to the specific characteristics of the intervention.
In order to ensure the study’s internal validity, a pretest (O1 and O3) and a post-test (O2 and O4) were applied to the two groups, which enabled a comparison to be made not only with regard to the initial differences between them but also in terms of improvements achieved in the study variables, as suggested in classical quasi-experimental designs (Creswell & Creswell, 2021; Mertens, 2020). Below is the outline of the quasi-experimental design used in the study (Table 2).
The study was carried out in two schools: a public high school in the province of Valladolid (Spain) and a Spanish state-run school in Bogota (Colombia). Both schools operate under the same Spanish educational system and official curriculum, ensuring curricular equivalence across contexts. The methodological sequence, materials and intervention conditions were identical in the two schools, which allowed the results to be merged into a single comparative analysis.
The intervention was implemented by the same research team in both contexts, following a shared instructional plan and common progression criteria, in order to ensure consistency across groups. In the two schools, intervention lasted for about six months and was organized in the same way: 40 sessions of 50 min each, undertaken during school time. In the high school in Valladolid (Spain), intervention took place over one school year, applying the BADyG battery as a pre-test halfway through the first term (October) and as a post-test prior to the end of the school year (May). In the school in Bogota (Colombia), the same procedure was employed in the following school year, with the BADyG also being applied at the start and at the end of the intervention in the same months. The study was therefore conducted over two consecutive academic years, and all experimental and comparison groups from both cohorts were included in the statistical analyses once baseline equivalence had been verified through pretest scores.
The intervention program was designed around activities that gradually increased in complexity. The activities initially required basic algorithms, after which more advanced programming structures had to be tackled. All the programs involved mathematical content (mental calculation, algebra, geometry, and mathematical analysis) that students had to work through. These activities were designed using a common set of tasks and materials that were implemented uniformly in both schools, without contextual adaptations, in order to ensure internal consistency of the intervention.
The intervention was explicitly designed to promote core computational thinking practices—such as decomposition, algorithmic reasoning, abstraction, and debugging—rather than general digital skills or generic enrichment activities.
Students in the comparison group attended the same subject during the same time slots but followed the official curriculum of the subject “Technology and Digitalization”, without engaging in programming or computational thinking activities.
In order to help students delve deeper into the concepts, practical tasks and creative activities were systematically combined.
Stage 1: All the students in the experimental group undertook an initial stage of 20 sessions using Scratch, in order to reduce the cognitive load that is often associated with text-based programming languages and to avoid the initial rejection that syntax and level of abstraction might provoke (Tsai et al., 2025). The Scratch environment made basic computational concepts such as sequences, loops, conditionals, parallelism, and debugging more accessible to students. Graphical representations and flow diagrams were also used to encourage algorithmic thinking.
Stage 2: Experimental-group students in the first, second, and third years of ESO (12–15-year-olds) completed the remainder of the intervention using only Scratch, up to the 40 planned sessions. By contrast, once the foundations had been established, fourth-year ESO and upper-secondary students worked for a further 20 sessions with Python, thus completing the intervention.
Python is a text-based programming language that is better suited to developing complex structures and eases the transition to more advanced mathematical contexts. Once students had become familiar with the environment, they gradually engaged in increasingly complex activities involving different data types, mathematical operations, lists, tuples, and libraries. Exercises were also carried out with variables, conditional structures (if/else), loop control structures (while, for), and programs in which functions needed to be defined. The use of standard modules such as math, time, and random was also introduced. Some of the main programs undertaken by students involved creating passwords, counters, timers, guessing games, and “hangman”. These activities were designed to help overcome the difficulties that beginners tend to face when moving from block-based to text-based programming environments.
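To give a sense of the level of these Python activities, the sketch below shows two short programs of the kind described (a password generator and a guessing game). It is an illustration written for this text, not the classroom material used in the study; the function names, parameters, and the fixed list of guesses are hypothetical.

```python
import random
import string


def make_password(length=10, rng=None):
    """Return a random password made of letters and digits."""
    if rng is None:
        rng = random.Random()
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def guessing_game(secret, guesses):
    """Check a sequence of guesses against the secret number.

    Returns the number of attempts needed, or None if the secret is never guessed.
    """
    for attempt, guess in enumerate(guesses, start=1):
        if guess == secret:
            return attempt
        hint = "higher" if guess < secret else "lower"
        print(f"Attempt {attempt}: {guess} is wrong, try {hint}")
    return None


print(make_password(12))
print(guessing_game(secret=7, guesses=[5, 9, 7]))
```

Even programs this small exercise functions, loops, conditionals, and a standard module. Outside a classroom exercise, passwords would normally be generated with the standard `secrets` module rather than `random`.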
According to Mladenović et al. (2024), the transition from block-based to text-based languages can give rise to misconceptions and pose cognitive barriers that must be addressed if a deeper understanding of programming is to be achieved. All sessions were delivered following the same instructional guidelines and progression criteria in both educational contexts, allowing any observed differences in outcomes to be attributed to the intervention rather than to differences in task design or implementation.
Although no formal external fidelity checks (e.g., observational protocols or fidelity checklists) were applied, the use of a common instructional framework and shared implementation criteria contributed to maintaining instructional consistency across contexts.
2.4. Data Analysis
Once the assumptions for the parametric tests had been verified, a descriptive statistical analysis was carried out, followed by inferential analysis, to gauge what impact the intervention had. In order to compare the results between the experimental group and the comparison group, gains (differences between post- and pre-test) were considered in all the variables of the BADyG, after verifying initial equivalence between the two groups in each level of education through the pre-test scores.
Although analysis of covariance (ANCOVA) is frequently used in pretest–posttest designs, it was not applied in the present study. In quasi-experimental designs with non-equivalent groups, the use of pretest scores as covariates may violate the assumption of independence between the covariate and the experimental condition, particularly when initial differences between groups exist, which can lead to biased estimates of the treatment effect (Jamieson, 2004; Miller & Chapman, 2001). For this reason, gain scores were considered the most appropriate analytical strategy in this context, as recommended for quasi-experimental designs with non-equivalent groups.
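The gain-score approach described above can be sketched in a few lines. The scores below are hypothetical values chosen for illustration, not data from the study, and the function name is an assumption.

```python
from statistics import mean


def gain_scores(pretest, posttest):
    """Return post-minus-pre differences for paired score lists."""
    if len(pretest) != len(posttest):
        raise ValueError("pretest and posttest must have the same length")
    return [post - pre for pre, post in zip(pretest, posttest)]


# Hypothetical scores for illustration only (not data from the study).
experimental_pre, experimental_post = [40, 45, 50, 55], [48, 52, 57, 60]
comparison_pre, comparison_post = [41, 44, 51, 54], [43, 45, 53, 55]

experimental_gain = gain_scores(experimental_pre, experimental_post)
comparison_gain = gain_scores(comparison_pre, comparison_post)

# The quantity of interest is the difference between the mean gains of the two groups.
gain_difference = mean(experimental_gain) - mean(comparison_gain)
print(experimental_gain, comparison_gain, gain_difference)
```

Each student contributes a single difference score per BADyG factor, and it is these differences, not the raw post-test scores, that enter the subsequent analyses.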
A two-factor multivariate analysis of variance (MANOVA) was then applied, using education level and group (comparison/experimental) as independent variables and the difference scores of the complementary factors of the BADyG as dependent variables, following the recommendations of Tabachnick and Fidell (2019) for multivariate analysis in research in education. Effect size was calculated with partial eta squared (ηp²), and its values were interpreted following Cohen (1988): 0.01 ≤ ηp² < 0.06 indicates a small effect; 0.06 ≤ ηp² < 0.14, a moderate effect; and ηp² ≥ 0.14, a large effect.
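The interpretation rule for partial eta squared can be written as a small lookup. This is a sketch under one assumption: the cut-offs 0.01, 0.06, and 0.14 are treated as lower bounds of contiguous bands, so that no value falls between categories. The function name and the "negligible" label are hypothetical.

```python
def interpret_partial_eta_squared(eta_p2):
    """Map a partial eta squared value to a verbal label (Cohen-style benchmarks)."""
    if not 0 <= eta_p2 <= 1:
        raise ValueError("partial eta squared must lie between 0 and 1")
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "moderate"
    if eta_p2 >= 0.01:
        return "small"
    return "negligible"   # label assumed here for values below the smallest benchmark


for value in (0.005, 0.03, 0.08, 0.20):
    print(value, interpret_partial_eta_squared(value))
```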
Post hoc tests were subsequently carried out to identify significant differences between the various levels of education. We opted for the Scheffé procedure, given its conservative nature as well as its robustness in contexts with unequal sample sizes between education levels (Field, 2018). When comparing the groups, we also applied the t-test for independent samples and used Hedges’ g (Hedges, 1981; Hedges & Olkin, 1985) to calculate effect size, which proves particularly useful in small or unbalanced samples. Interpretation was based on the criteria of Cohen (1988, 1992): (a) g = 0.20, indicating a small effect size; (b) g = 0.50, a moderate effect size; and (c) g = 0.80, representing a large effect size. All the analyses were conducted with a 95% confidence level using the IBM SPSS Statistics v.29 (IBM Corp., Armonk, NY, USA) statistical package.
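As a reminder of how Hedges’ g differs from Cohen’s d, the sketch below computes the standardized mean difference with a pooled standard deviation and then applies the usual approximate small-sample correction, 1 − 3/(4·df − 1). The analyses in the study were run in SPSS; this function and its example data are hypothetical and only illustrate the statistic.

```python
from math import sqrt
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Hedges' g for two independent samples (positive when group_a scores higher)."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    # Pooled standard deviation from the two sample variances.
    pooled_sd = sqrt(((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df)
    cohens_d = (mean(group_a) - mean(group_b)) / pooled_sd
    # Approximate small-sample bias correction.
    correction = 1 - 3 / (4 * df - 1)
    return cohens_d * correction


# Hypothetical gain scores for two small groups (not data from the study).
print(hedges_g([2, 4, 6], [1, 3, 5]))
```

With large samples the correction factor is close to 1, so g and d nearly coincide; the correction matters mainly when groups are small.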
Since the research was conducted at two different schools and covered two consecutive school years, group equivalence was verified prior to intervention through pre-test comparisons. This procedure reduces the likelihood that post-test differences are explained by baseline variability between groups and supports a more cautious interpretation of observed gains as being associated with the computational thinking intervention under the conditions of this study.
3. Results
3.1. Descriptive Analysis
As shown in Table 3, gains in the complementary factors of the BADyG are greater in compulsory secondary (ESO) than in upper secondary (Bachillerato). In particular, first and second year ESO students show the greatest improvements in most factors, whereas these gains tend to diminish in later years, with the lowest values observed in Bachillerato. This trend is particularly clear in the numerical factor, the visual-spatial factor, and general intelligence, where the differences between levels are more marked. In contrast, the verbal factor displayed minimal variation between school years, suggesting general stability in this aspect.
When comparing the groups, an even clearer pattern emerges; the experimental group obtains systematically higher gains than the comparison group in all factors, except in the verbal factor, where the two groups show very small increases. Particularly noteworthy are the major differences in the visual-spatial factor and in general intelligence, in which the improvements in the experimental group were markedly greater.
Taken as a whole, the descriptive results suggest that intervention based on computational thinking leads to greater gains in the cognitive skills assessed, with a more marked impact in the early years of the ESO, and a more moderate impact in Bachillerato. Descriptively, the clearest and most consistent patterns of improvement are observed in the numerical, visual-spatial, and general intelligence factors. Furthermore, the advantage gained by the experimental group in virtually all the factors anticipates a possible positive effect of the intervention, an aspect that needs to be tested through inferential analysis.
3.2. Multivariate Analysis
In order to determine how education level and group affect the post-pre difference scores in the complementary factors of the BADyG, a multivariate analysis of variance (MANOVA) was carried out using a 6 × 2 design (six education levels by two groups). Prior to conducting the MANOVA, the assumptions of multivariate analysis were examined (e.g., linearity and absence of multicollinearity). Box’s M test was used to assess the homogeneity of covariance matrices across groups. The results indicated no substantial deviations from the assumptions; therefore, Wilks’ lambda was employed as the principal statistic, given its widespread use and robustness to minor assumption violations. The stability of the findings was further confirmed using alternative multivariate statistics (Pillai’s trace, Hotelling–Lawley trace, and Roy’s largest root), following methodological recommendations for multivariate analysis (Field, 2018).
The MANOVA results shown in Table 4 reveal significant main effects for education level and group, albeit with effects of different sizes. The education level factor showed a significant effect, Wilks’ Λ = 0.845, F(20, 1238) = 3.152, p < 0.001, ηp² = 0.041, suggesting that there are multivariate differences between school years in the improvements obtained. Nevertheless, this effect is small, which is consistent with the diversity of the education levels involved in the study.
The main effect of group also proved significant, Wilks’ Λ = 0.664, F(4, 373) = 47.13, p < 0.001, ηp² = 0.336, indicating a large effect size. These results indicate that students in the experimental group displayed significantly higher gains in the variables studied than those in the comparison group. The interaction between level and group was not significant: Wilks’ Λ = 0.920, F(20, 1238) = 1.572, p = 0.052, ηp² = 0.021.
In other words, although the p-value is close to the significance threshold (α = 0.05), differences between groups do not vary significantly by school year.
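The multivariate effect sizes reported above can be related to Wilks’ Λ through a definition that is commonly used for multivariate partial eta squared, 1 − Λ^(1/s) with s = min(number of dependent variables, effect degrees of freedom). The sketch below assumes that this definition was used and that the MANOVA had four dependent variables, which is inferred from the reported degrees of freedom and is not stated explicitly in the text; the function name is hypothetical.

```python
def partial_eta_squared_from_wilks(wilks_lambda, n_dependent, df_effect):
    """Multivariate partial eta squared derived from Wilks' lambda."""
    s = min(n_dependent, df_effect)
    return 1 - wilks_lambda ** (1 / s)


# Reported Wilks' lambda values; n_dependent=4 is an assumption (see lead-in).
effects = {
    "education level": (0.845, 5),   # six levels -> 5 degrees of freedom
    "group": (0.664, 1),             # two groups -> 1 degree of freedom
    "level x group": (0.920, 5),
}
for name, (wilks_lambda, df_effect) in effects.items():
    eta = partial_eta_squared_from_wilks(wilks_lambda, n_dependent=4, df_effect=df_effect)
    print(f"{name}: {eta:.3f}")
```

Under these assumptions the function gives approximately 0.041, 0.336, and 0.021, matching the reported effect sizes to three decimals.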
3.3. Main Effects of the Level of Education Variable
The univariate analyses showed statistically significant differences between education levels in three of the complementary factors of the BADyG (Table 5), with small to moderate effect sizes, as is common in multi-level education studies. Notably, the visual-spatial factor and general intelligence show the clearest developmental pattern across school years. Significant differences were found in: (a) numerical factor, F(5, 382) = 3.397, p = 0.005; (b) visual-spatial factor, F(5, 382) = 6.973, p < 0.001; and (c) general intelligence, F(5, 382) = 5.733, p < 0.001.
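For a univariate F test, partial eta squared can be recovered from the F statistic and its degrees of freedom through the identity ηp² = F·df1 / (F·df1 + df2). The sketch below applies it to the three F values reported above; the resulting figures are recomputed here and are not quoted from Table 5, so small rounding differences are possible. The function name is hypothetical.

```python
def partial_eta_squared_from_f(f_value, df_effect, df_error):
    """Partial eta squared implied by a univariate F statistic."""
    return f_value * df_effect / (f_value * df_effect + df_error)


reported_f = {
    "numerical": 3.397,
    "visual-spatial": 6.973,
    "general intelligence": 5.733,
}
for factor, f_value in reported_f.items():
    eta = partial_eta_squared_from_f(f_value, df_effect=5, df_error=382)
    print(f"{factor}: {eta:.3f}")
```

The recomputed values are roughly 0.04 for the numerical factor and roughly 0.07 to 0.08 for general intelligence and the visual-spatial factor, that is, small to moderate effects on Cohen’s benchmarks.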
In these three variables, the largest gains were reported in the first and second years of ESO, whereas gains in later school years tended to drop gradually, particularly in Bachillerato (Figure 1). According to this pattern, improvements tend to be larger in the early years of secondary education.
The analyses of variance therefore showed significant differences between education levels in the numerical factor, the visual-spatial factor, and general intelligence, with effect sizes (ηp²) ranging from small to moderate. Post hoc comparisons carried out using the Scheffé procedure allowed us to identify specific differences between education levels. Specifically, with regard to the visual-spatial factor, first- and second-year ESO students obtained significantly higher gains than those in the two years of Bachillerato. For general intelligence, significant differences were found between first and second year ESO and second year Bachillerato. Finally, no significant post hoc differences were observed for the numerical factor despite the significant main effect of education level.
In contrast, no significant differences were found between levels in the logical reasoning factor, F(5, 382) = 2.229, p = 0.051, or in the verbal factor, F(5, 382) = 1.140, p = 0.339, indicating that improvements in these areas are more evenly distributed over the different school years.
In sum, these results indicate that education level has a significant impact on improvements in the factors of numerical reasoning, visual-spatial reasoning, and general intelligence, whereas gains in logical reasoning and the verbal factor do not depend on the school year.
3.4. Main Effects of the Group Variable
Table 6 shows statistically significant differences between groups in four of the five complementary factors of the BADyG, all with effect sizes ranging from moderate to large. The largest effects were observed in the numerical, visual-spatial, and general intelligence factors.
As shown in Figure 2, the experimental group achieved substantially higher post-pre gains than the comparison group in logical reasoning, numerical reasoning, visual-spatial reasoning, and general intelligence.
The verbal factor showed no significant differences between groups, t(386) = −0.59, p = 0.557, with a very small effect size (g = 0.06), which suggests that improvements in this area were similar in the two groups.
In sum, these results confirm that intervention based on computational thinking yielded improvements in most of the cognitive factors assessed, with particularly marked effects in numerical, visual-spatial, and general intelligence factors.
3.5. Effects of Interaction Between Education Level and Group
The analyses showed no significant interaction effect between education level and group (comparison and experimental) on the difference scores of the BADyG: Wilks’ Λ = 0.920, F(20, 1238) = 1.572, p = 0.052. Even though the p-value is close to the threshold, the statistical evidence is insufficient to state that the effect of the intervention varies between the different levels of education.
There may be slight variations in the size of the improvements between school years, but these differences are not consistent enough to constitute a solid pattern. As a result, no additional simple-effects analyses by level were carried out, since a significant interaction is a prerequisite for interpreting specific comparisons within each school year.
Overall, the results indicate that no statistically significant differential effects of the intervention were observed across educational levels, with gain patterns remaining broadly comparable throughout secondary education.
4. Discussion
The results of this study indicate that high school students who take part in an educational program that involves programming with Scratch and Python—designed to promote computational thinking—obtain significant improvements in several dimensions of logical-mathematical reasoning, assessed using the BADyG psychometric battery.
In particular, gains in these variables in the experimental group were higher than those in the comparison group in logical reasoning, numerical factor, visual-spatial factor, and general intelligence, with effect sizes that range from moderate to large. However, no significant differences were found in the verbal factor between groups. Given the nature of the intervention, which primarily emphasizes logical-mathematical reasoning and problem-solving processes, substantial changes in verbal intelligence were not necessarily expected. The absence of significant differences in the verbal factor is therefore consistent with the cognitive focus of computational thinking-based activities.
In this context, gains observed in the general intelligence factor should be interpreted as improvements in test-related cognitive performance sensitive to educational intervention, rather than as changes in stable or trait-like intelligence constructs.
From an educational standpoint, the strongest and most meaningful effects were observed in the numerical, visual-spatial, and general intelligence factors, suggesting that computational thinking interventions particularly reinforce core cognitive processes underlying logical-mathematical reasoning rather than producing generalized gains across all cognitive domains.
Furthermore, education level had a significant main effect, with greater gains in the early years of ESO followed by a gradual decline towards Bachillerato, although it was not possible to confirm a statistically significant interaction between level and group. This pattern is consistent with the role of education level as a moderating variable in the impact of computational thinking, as reported by Li and Oon (2024).
These results concur with other studies that link computational thinking to the development of advanced cognitive skills, particularly those related to logical-mathematical reasoning (Grover & Pea, 2013; Scherer et al., 2019; Shute et al., 2017; Zambrano-Choez et al., 2025). In line with Wing (2008) as well as the ISTE and CSTA (2011), computational thinking encompasses processes such as decomposition, abstraction, the use of algorithms, pattern identification, and generalization, all of which tie in directly with the ability to solve mathematical problems in a structured and efficient manner.
In this context, the improvements observed in the numerical factor, the visual-spatial factor, and general intelligence may be seen as reflecting gains in the logical organization of thinking, symbolic representation, and problem modeling, as highlighted by Román-González et al. (2017).
The results show a pattern with a significant impact on the numerical and visual-spatial factors. Using environments such as Scratch and Python enabled students to work with concepts like coordinates, variables, control structures, graphical representations, and algorithmic models to solve problems. This, in turn, helped with the understanding of abstract mathematical concepts and the exploration of their underlying logical structures (Pinargote-Zambrano et al., 2024; Sáez-López et al., 2016). From a constructionist perspective (Papert, 1980, 1993), creating one’s own projects in these environments becomes a driver for learning, since students “program to learn” rather than merely “learning to program”, and thus build mathematical knowledge by elaborating and experimenting with artifacts that are meaningful both to the learners and to their learning context.
As regards the effect of education level, results indicate that improvements in numerical reasoning, visual-spatial reasoning, and general intelligence occur primarily in the early years of ESO. As students progress through the education system, these gains become more modest, especially in Bachillerato. This may be explained from the standpoint of developmental psychology: the transition from concrete operations toward the more formal thinking described by Piaget (1981), as well as the role played by the zone of proximal development in the acquisition of new cognitive skills (Vygotsky et al., 1978). It is reasonable to assume that there is a greater margin for improvement in the early years and that tasks based on computational thinking are best matched to students’ zone of proximal development at that stage. In contrast, in Bachillerato, many of these cognitive structures are already well established, which might account for the smaller improvement. Such reasoning concurs with the meta-analysis conducted by Li and Oon (2024), who point out that education level moderates the impact of computational thinking, with stronger effects at earlier educational stages.
In this regard, the absence of a statistically significant interaction between education level and group suggests that, while the magnitude of gains varies across school years, the intervention does not produce qualitatively different effects at specific educational stages. The observed impact should therefore be interpreted as broadly consistent across secondary education under the instructional conditions of this study.
The BADyG battery is well suited as an assessment tool in this context, as it allows for the measurement of different aptitudes such as numerical series, logical matrices, numerical problems or matching shapes whilst also grouping them into second-order factors. This offers a clearer and more structured representation of intelligence, together with a suitable level of validity and reliability in educational contexts (Yuste & Martínez, 2003a, 2003b). In this study, the fact that improvements were found mainly in the numerical and visual-spatial factors, as well as in general intelligence, aligns with previous works: computational thinking and computer programming boost the processes of logical reasoning that underpin these dimensions (Román-González et al., 2017; Scherer et al., 2019; Shute et al., 2017).
Limitations and Proposals for Improvement
Although the results of this study do provide consistent evidence concerning the impact that computational thinking has on logical-mathematical reasoning, certain limitations should be taken into account. First, although intervention was carried out at two different schools (one in Spain and another in Colombia) and over two consecutive academic years, student distribution by education levels was not totally homogeneous. This might have affected the sensitivity of some analyses, particularly in the level × group interaction, where the results showed p-values that were very close to the significance threshold. It would prove advisable to repeat this study using samples that are more balanced between school years and between schools in order to enhance the validity of the results.
Additionally, as the study relied on a quasi-experimental design with non-randomized groups, residual initial differences between groups cannot be entirely ruled out. Although gain scores were considered the most appropriate analytical strategy under these conditions, future research employing randomized designs or alternative longitudinal modeling approaches could further strengthen causal inference. In addition, variables not captured by the pretest (e.g., prior academic achievement, teacher effects, classroom climate, or other contextual factors) may have differed between conditions and could have influenced the observed outcomes.
Secondly, the intervention was carried out by the researchers themselves. This might have led to bias, either due to prior expectations or due to the manner in which activities were conducted. In order to offset this possible effect, future research should draw on external teachers when applying the program or should include a double-blind assessment design.
Furthermore, even though conducting the intervention in two schools and in two countries is a key strength, the study was carried out within a very specific institutional framework, which could lessen the applicability of the results to other educational contexts—both national and international—in which there are different organizational or curricular features. Even under a shared official curriculum, cross-national implementation may involve cultural, organizational, or contextual differences (e.g., school routines, available resources, or student background) that were not modeled explicitly and may have influenced the outcomes. As a result, broadening the research into other educational environments would help to boost the generalization of the results.
Finally, it would be interesting to ascertain how lasting the improvements observed in logical, numerical, and visual-spatial reasoning prove to be.
In this regard, although the present study is based on a quantitative approach that allows for the objective measurement of gains in logical-mathematical reasoning, future research could benefit from the incorporation of qualitative methodologies. Approaches such as semi-structured interviews, classroom observations, or learning diaries would enable a deeper exploration of students’ problem-solving strategies, perceptions, and cognitive processes when engaging in programming activities with Scratch and Python. This complementary perspective would contribute to a more comprehensive understanding of how computational thinking interventions influence learning beyond measurable performance gains.
5. Conclusions
The present study provides empirical evidence that a program based on activities that promote computational thinking using Scratch and Python as programming environments has a positive educational impact on certain variables related to logical-mathematical reasoning. In particular, the effect size ranges from moderate to large in the numerical and visual-spatial factors as well as in the general intelligence score of the BADyG psychometric battery.
The results indicate that education level influences the magnitude of observed gains, with more pronounced improvements in the early years of ESO and more moderate gains in Bachillerato. At the same time, the lack of a statistically significant interaction between education level and group suggests that the intervention operates in a comparable manner across secondary education, without evidence of stage-specific effects.
From an educational standpoint, these findings support the potential value of integrating computational thinking within secondary education curricula, provided that such integration is carefully aligned with curricular objectives, instructional design, and contextual constraints, and that appropriate teacher preparation and instructional fidelity are ensured.
Finally, future research should adopt longitudinal designs in order to gain insights into how lasting the impact might be and to deepen current understanding of the cognitive processes involved in computational thinking.