1. Introduction
In contemporary education systems, the primary goal is not only to ensure that students acquire knowledge but also to enable them to use this knowledge in an analytical, holistic, and productive manner. In this regard, it is considered critical that students can integrate knowledge from different domains of expertise, adapt to changing conditions, develop interdisciplinary competencies, and continuously structure their own thinking processes [
1]. These expectations further emphasize the importance of problem-solving and systems thinking skills, which are prominent in 21st century skills frameworks.
Systems thinking is recognized as one of the core competencies in contemporary 21st century skills frameworks and is gaining increasing importance across a range of disciplines [
2,
3,
4]. Due to its potential to deepen students’ conceptual understanding, to support the solution of complex problems, and to foster informed decision-making regarding global challenges such as sustainability, systems thinking is increasingly emphasized, particularly in the context of science education [
5]. However, the systematic treatment of systems thinking education at the K–12 level is a relatively recent development, and studies investigating children’s developmental trajectories in relation to these skills remain limited [
6,
7]. Given that problem-solving plays a central role in science and engineering education [
8], examining systems thinking and problem solving together becomes even more meaningful.
In recent years, interest in the learning of complex systems has grown markedly in the science education literature. Gilissen et al. [
9] highlight that systems thinking and complex systems learning have become an emerging research focus in science education. The fact that systems thinking is increasingly considered an expected competency in science classrooms [
10] further underscores its practical importance. Many phenomena in science education—such as ecosystems, the phases of the Moon, or energy transfer—are inherently complex systems that require a systemic perspective to be understood [
11]. Accordingly, new teaching programs have been developed that aim to explicitly integrate systems thinking into instruction [
12]. Systems thinking is also defined as a key component of the cognitive flexibility needed for individuals to collaboratively solve social and environmental problems [
13]. Systems thinking, which involves understanding the relationships among the elements of a system, feedback loops, and causal connections [
14], offers an interdisciplinary skill domain that aligns with the core aims of science education.
Although interest in systems thinking has increased in the science education literature, the instructional approaches that most effectively develop this skill are still not clearly identified [
15]. This uncertainty makes the need for age- and developmentally appropriate instructional materials and valid assessment tools more visible [
16]. Although various methods have been developed to assess systems thinking, research regarding the direct teachability of this competency, the extent to which it can be developed, and how it can be guided through instruction remains limited [
14]. Nevertheless, recent studies have shown that systems thinking approaches are increasingly being integrated into classroom practices and offer important pedagogical opportunities for supporting students’ learning processes [
17]. Taken together, these findings suggest that the design and implementation of pedagogical models and instructional tools aimed at fostering systems thinking in science education—grounded in strong empirical foundations—will make significant contributions to the field.
Problem solving is a cognitive process that involves analyzing a situation, generating solution paths, making decisions, and evaluating the outcomes. The quality of this process is directly related to the holistic understanding of the relationships and interactions among the elements that constitute the problem. Therefore, there is a strong and complementary connection between systems thinking and problem solving. Systems thinking skills are described as essential for addressing the complex problems of today’s world [
18]. Systems thinking is argued to provide a fundamental cognitive framework for rational decision-making and effective problem solving [
19]. It is also emphasized that problem solving should address the relationships among components within a system from a holistic perspective [
20].
Complexity theory further illuminates the nature of problem solving by offering a more realistic approach that explains how systems function and interact with one another. In this context, systems thinking facilitates understanding the problem and increases the speed of generating solutions [
1]. Comprehensive systems thinking, by considering interactions among components, contextual conditions, and stakeholder needs, points to the necessity of a holistic problem-solving approach [
21]. Thus, the complexity of contemporary problems necessitates that problem-solving methods be shaped by a systems thinking perspective [
4].
Within this framework, technological tools—particularly simulations—are seen as offering substantial potential for the development of both systems thinking skills and students’ engagement in problem-solving processes. Rapid developments in information and communication technologies have made it possible to integrate innovative instructional tools—such as computer simulations, video games, virtual environments, and internet-based applications—into educational settings [
22]. Simulations are regarded as an important pedagogical tool because they allow students to engage in interactive learning experiences [
23]. It is also noted that simulations have didactic value due to their capacity to introduce new perspectives into scientific processes and provide innovative learning experiences [
24]. However, it has been emphasized that the epistemological effects of simulations on science education are still not adequately explored [
22]. Research on the role of simulations in developing systems thinking and problem-solving perceptions in science education is therefore of particular importance.
There are some studies in the literature examining the effects of simulation-based learning on problem-solving processes and performance [
8,
25,
26,
27,
28,
29]. These studies systematically address how simulations support students’ thinking processes, how they influence conceptual understanding, and their contributions to problem-solving performance. For example, Yuliati et al. [
26] found that supporting the direct current electricity topic with a PhET simulation contributed to students’ development of a scientific approach and enhanced their problem-solving skills. Similarly, Avramiotis and Tsaparlis [
30] examined the effect of simulations on students’ processes of solving chemistry problems. Chen et al. [
31] investigated the relationship between online simulation-based collaborative problem solving and communication and collaboration skills. In addition, early studies in the field have provided important examples regarding the use of simulation and modeling software in science education [
32].
Beyond simulation-based environments, students’ problem-solving perceptions have also been examined in relation to various instructional approaches such as educational games, STEM-based learning, and argumentation-oriented instruction [
33,
34,
35,
36]. These studies suggest that learners’ perceptions regarding their problem-solving competence, effort, and perseverance are associated with their engagement and learning behaviors. From a psychological perspective, perception is considered a key determinant influencing individuals’ behaviors and the way they enact their skills in learning contexts [
37]. In contrast to the dominant body of simulation research focusing on problem-solving performance, the present study specifically examines students’ problem-solving perceptions. Accordingly, the findings should be interpreted in terms of perceived competence, willingness, and perseverance in problem solving, rather than observable problem-solving skills or performance outcomes.
The potential effects of simulations on systems thinking have been examined in a smaller number of studies [
11,
38,
39,
40,
41]. Evagorou et al. [
11] showed in their work with elementary school students that simulation-based environments can contribute to the understanding of complex processes. Waddington and Fennewald [
41] demonstrated that a complex simulation focusing on climate change can support students in identifying systemic relationships. These studies indicate that simulations hold considerable potential for the development of both systems thinking skills and problem-solving perceptions.
However, studies that simultaneously examine how simulation-based teaching influences both systems thinking skills and students’ perceptions of their problem-solving abilities remain limited. In this regard, the Electricity in Our Lives unit—often described by students as abstract and challenging to understand due to its relational structure—is expected to become more comprehensible and make systemic relationships more visible through simulation-based teaching. Therefore, the Electricity in Our Lives unit was chosen as the focus of this study due to both its systemic structure and its potential for problem solving. The aim of this research is to examine the effect of simulation-based science teaching on fifth-grade students’ systems thinking skills and problem-solving perceptions. The sub-problems of the study are as follows:
Does simulation-based science teaching have a significant effect on students’ systems thinking skills compared to the Ministry of National Education (MoNE) curriculum?
Does simulation-based science teaching have a significant effect on students’ problem-solving perceptions compared to the MoNE curriculum?
Is there a significant relationship between students’ systems thinking skills and problem-solving perceptions?
2. Theoretical Framework
2.1. Systems Thinking
Systems thinking (ST) is regarded as one of the fundamental competencies needed to make sustainable decisions in today’s rapidly changing and increasingly complex world. Indeed, ST provides a critical cognitive capacity for steering the world toward a more sustainable future and is gaining increasing importance, particularly in the context of science education [
42]. When students develop a systems perspective, they are better prepared to cope with the complexity of real-world problems and to handle uncertain and multifaceted situations [
43]. This approach requires a holistic way of thinking that involves understanding the structure and functioning of a system, anticipating its future behavior, and enabling desired changes within the system [
44].
An examination of the content of science curricula reveals that many topics are grounded in contemporary problems and seek solutions to those problems. This structure indicates a close relationship between science education and systems thinking skills by its very nature [
45]. ST offers a holistic approach that focuses on the interactions among system components, the patterns that emerge from these interactions, and the dynamic properties of the system as a whole [
46]. In this respect, ST is defined as an interdisciplinary field that seeks to understand the structure of complex systems and provides an important cognitive framework for solving a wide variety of problems [
47].
Systems thinking is also conceptualized as a problem-solving approach that aims to understand relational structures that cannot be reduced to the individual parts of the whole [
48]. In this sense, ST is considered a higher-order skill necessary for understanding complex problems and making informed decisions [
49], and it enables the analysis of the dynamic structure that emerges when system components come together [
50]. Furthermore, ST offers a robust cognitive framework for examining the interactions among elements of a system over time and evaluating the consequences of these interactions [
51]. Therefore, ST is a critical skill that helps students approach real-world problems from an interdisciplinary perspective, view scientific inquiry through a holistic lens, and form deeper conceptual connections in STEM learning processes [
1].
Understanding a system requires not only identifying its components but also grasping the relationships among these components, the impacts of these relationships on the overall behavior of the system, and how the system interacts with its broader context. This holistic understanding is defined as one of the core aims of ST [
52]. In this framework, systems thinking ability enables individuals to evaluate events or problems both as a whole and at the level of relationships among parts [
53]. Such a mode of thinking can support students in analyzing situations that involve uncertainty more clearly and in making more effective decisions [
54].
The role of ST in education is becoming increasingly evident, and it is reported to make multifaceted contributions such as developing higher-order thinking skills, promoting interdisciplinary learning, enhancing student agency, and integrating with computational thinking [
55]. The ability to see systems as complex wholes composed of interconnected components helps students construct many concepts more accurately and meaningfully [
56].
2.2. Systems Thinking Hierarchy (STH) Model and Its Instruction
Eight hierarchical characteristics of systems thinking in the context of Earth systems have been proposed [
57]:
- (1) The ability to identify the components of a system and processes within the system.
- (2) The ability to identify relationships among the system’s components.
- (3) The ability to organize the system’s components and processes within a framework of relationships.
- (4) The ability to make generalizations.
- (5) The ability to identify dynamic relationships within the system.
- (6) Understanding the hidden dimensions of the system.
- (7) The ability to understand the cyclic nature of systems.
- (8) Thinking temporally: retrospection and prediction.
The model’s eight characteristics are organized into three sequential levels: (A) analyzing system components (1); (B) synthesizing system components and relationships (2, 3, 4, and 5); and (C) implementation (6, 7, and 8) [
12,
57].
The essence of the model is that the main emphasis in systems thinking should be on synthesis rather than analysis. The meaning of the whole does not emerge merely from examining the parts individually, but from understanding the relationships among the parts in an integrated way. Therefore, cognitive processes in systems thinking are driven more by synthesis than by analysis [
58].
Various instructional methods have been reported in the literature as effective in fostering systems thinking among students. The use of systems modeling tools with appropriate scaffolds is proposed as an effective strategy to help students develop a systems perspective [
59]. However, due to the difficulties encountered in acquiring ST, different instructional strategies have been developed in recent years. Among these, model-based teaching has emerged as an important approach that supports students in structuring complex systems [
60]. Active learning methods are also considered effective pedagogical tools for developing ST skills among STEM students [
61]. Moreover, simulations of complex systems have long been used as tools for the teaching and investigation of ST, although further research is needed to identify games and simulations that specifically support the development of systems thinking [
41]. In the present study, simulations were used with the goal of enhancing systems thinking skills.
2.3. Problem-Solving
Problem solving is a deliberate, student-centered process grounded in multiple interactions among problem solvers, tools, and relevant resources, through which learners seek different solutions to authentic and personally meaningful problems [
62]. Regarded as one of the most critical cognitive activities in both everyday life and professional contexts [
63], problem solving constitutes a fundamental component of learning in educational settings. However, in traditional classroom practices, problem solving is often treated as a linear process and conceptualized as a limited activity focused on reaching predetermined solutions [
1]. This perspective can overlook the multidimensional nature of problem solving and its demands on higher-order cognition.
In reality, problem solving is a complex cognitive activity that requires individuals to perform multiple cognitive functions simultaneously, such as retrieving information from long-term memory, maintaining it in working memory, and associating and transforming it [
30]. Therefore, effective problem solving necessitates the integrated regulation of cognitive, metacognitive, and non-cognitive processes [
64]. One of the most classical and holistic explanations of the problem-solving process is the four-step model consisting of understanding the problem, devising a plan, executing the plan, and reflecting on the solution [
65]. This framework reveals that problem solving is not merely an outcome-oriented activity but also a cognitive journey that involves regulating one’s thinking processes.
In educational contexts, students must be able to apply problem-solving skills in order to effectively address the problems they encounter [
37]. However, in many classroom-based studies, students’ problem solving is examined not only through performance-based indicators but also through their perceptions of competence, effort, and perseverance during the problem-solving process. In this regard, interactive learning environments supported by technology offer important opportunities for supporting students’ problem-solving perceptions, particularly by influencing motivational and self-regulatory aspects of learning. In particular, how simulation-based environments enhance students’ perceived problem-solving competence and willingness to engage in challenging tasks has been of considerable interest to researchers. Such environments provide a dynamic structure that enables students to explore alternative solution paths, reflect on their thinking, and strengthen their confidence and persistence in problem-solving processes [
25].
2.4. Simulation-Based Science Teaching
Computer simulations are defined as digital programs that represent a real system or phenomenon [
66] and enable students to examine complex scientific processes dynamically. These simulations, which provide interactive modeling of specific physical phenomena through dynamic structures and user-friendly interfaces, are increasingly used in science education [
8]. Simulation-based software is notable for providing rich learning opportunities such as visualizations of mechanisms, graphical and tabular displays of variables, and virtual laboratory applications [
22].
A substantial portion of the simulations used in science education involves the representation of experimental processes [
67] and has therefore long been regarded as an effective tool in science teaching. In this context, the PhET simulations used in the present study aim to provide an interactive and inquiry-based digital learning environment for physics and science learning.
Simulation-based instruction is defined as a pedagogical approach that enables students to practice skills in a safe environment for later use in real-world contexts, and its impact can be further enhanced through supportive elements such as peer coaching [
68]. The fact that students can access and interact with simulations anytime and anywhere makes simulations a powerful learning tool [
23].
Research has shown that simulations facilitate students’ understanding of complex concepts and microscopic processes in particular [
31]. By making otherwise unobservable events visible, simulations help students understand a wide range of scientific phenomena—from the molecular to the astronomical scale [
69]. In this respect, simulations support active learning and strengthen students’ ability to transfer scientific knowledge to real-world contexts [
23].
The integration of simulations into teaching also offers practical and ethical advantages. In situations where experiments cannot be conducted in the classroom due to danger or ethical concerns, simulations provide an alternative learning environment; they make costly laboratory experiments more accessible, accelerate time-consuming processes, and allow teachers to spend less time on experimental setups and more time interacting with students [
66]. In addition, interactive tools such as digital simulation games can support emotional and cognitive skills that are vital for effective learning [
70]. Simulations also increase students’ motivation by offering opportunities to explore phenomena they are curious about [
71].
Another important contribution of simulations is their ability to present different forms of scientific representation simultaneously. Students can engage with multiple representations—such as quantitative graphs, qualitative explanations, and visual models on the same screen—and thereby have opportunities to integrate these representations [
67]. This integration supports both conceptual learning and scientific reasoning. Moreover, computer-based simulations provide valid and reliable environments for assessing problem-solving behaviors [
64].
In this framework, simulation-based science teaching is considered a promising pedagogical approach that can support students’ cognitive, affective, and behavioral learning processes. Simulations facilitate the understanding of complex systems and contribute to the development of higher-order skills such as problem solving, modeling, and visualizing system relationships. Therefore, investigating the effects of simulation-based instruction—particularly in science units dominated by abstract concepts—is regarded as an important research area in contemporary science education.
3. Methods
3.1. Research Design
This study was conducted using a quasi-experimental design with a pre-test–post-test control group. The research included two groups: an experimental group and a control group. The experimental group participated in simulation-based instruction in addition to the curriculum determined by the MoNE. The control group carried out activities aligned with the same learning outcomes solely based on the MoNE curriculum, using physical materials (battery, bulb, switch, cable, etc.).
The groups were selected from two classes in the same school with similar socio-economic and academic characteristics. Random assignment was implemented at the class level by designating one intact class as the experimental group and the other as the control group. Because the unit of assignment was the class rather than the individual student, students’ observations may not be fully independent due to potential classroom-level (cluster) influences (e.g., shared peer dynamics and classroom climate). The implementation process lasted a total of 4 weeks. Eight sessions were held, with two class sessions each week. Each session consisted of two class hours (approximately 80 min), and thus the total duration of the implementation was about 16 class hours. This design aimed to comparatively examine the effect of simulation-based instruction on students’ systems thinking skills and their problem-solving perceptions. A schematic representation of the research design is given in
Figure 1.
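The group comparison implied by this pretest–posttest control-group design can be illustrated with a short sketch. The following example (with synthetic gain scores, not the study’s data, and not necessarily the study’s actual analysis) applies a permutation test to the difference in mean gain scores, an approach that avoids normality assumptions with small cluster-assigned samples such as n = 18 and n = 17:

```python
import numpy as np

def permutation_test(gains_exp, gains_ctrl, n_perm=10000, seed=0):
    """Two-sided permutation test on the difference in mean gain scores."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate([gains_exp, gains_ctrl])
    n_exp = len(gains_exp)
    observed = gains_exp.mean() - gains_ctrl.mean()
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign students to groups at random
        diff = pooled[:n_exp].mean() - pooled[n_exp:].mean()
        if abs(diff) >= abs(observed):
            count += 1
    return observed, count / n_perm

# Hypothetical gain scores (post-test minus pre-test), for illustration only
rng = np.random.default_rng(42)
gains_exp = rng.normal(3.0, 2.0, 18)   # experimental group, n = 18
gains_ctrl = rng.normal(1.0, 2.0, 17)  # control group, n = 17
obs, p = permutation_test(gains_exp, gains_ctrl)
print(f"mean difference = {obs:.2f}, p = {p:.3f}")
```

Note that, because assignment occurred at the class level, such a student-level test ignores the clustering caveat acknowledged above; it is shown only to make the design’s comparative logic concrete.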
3.2. Study Group
The study group consisted of 35 fifth-grade students enrolled in a public middle school in Turkiye. One of the classes was designated as the experimental group (n = 18) and the other as the control group (n = 17). The classes were selected from two parallel sections in the same school with similar socio-economic characteristics. One section was randomly assigned as the experimental group and the other as the control group. Both classes were taught by the same science teacher, and group assignment was implemented at the class level rather than at the individual student level.
Inclusion criteria for the classes were enrollment at the fifth-grade level and voluntary participation in the study. In the experimental group, 50% of the students were female (n = 9) and 50% were male (n = 9), with a mean age of 11.00 years. In the control group, 52.9% of the students were female (n = 9) and 47.1% were male (n = 8), with a mean age of 11.07 years. In both groups, the majority of students (approximately 67%) reported having prior experience in constructing an electric circuit.
Regarding students’ self-efficacy perceptions about electricity, most students in the experimental group rated themselves at a “good” level (M = 2.67), while this value was 2.19 in the control group. In response to the question “When you encounter a problem, can you solve it?” most students in both groups selected “yes, I can solve it” (experimental group M = 2.39, control group M = 2.31).
3.3. Data Collection Instruments
Two instruments were used in the study: the Systems Thinking Skills Test and the Problem-Solving Skills Perception Scale.
3.3.1. Systems Thinking Skills Test
The Systems Thinking Skills Test was developed by the researcher to assess fifth-grade students’ ability to understand, interpret, and reason about systemic relationships in science learning. The test was designed within the context of the Electricity in Our Lives unit and consists of two complementary sections: a multiple-choice section and a scenario-based open-ended section.
In the test development process, a comprehensive literature review was first conducted to clarify the definition, components, and assessment approaches related to systems thinking. Based on this review, the systems thinking framework proposed by Assaraf and Orion [
57] was adopted, and key subcomponents of systems thinking—such as identifying system components, recognizing relationships among components, and interpreting feedback processes—guided the item-writing process. This framework was selected because it conceptualizes systems thinking as a progression from identifying system elements to reasoning about dynamic interactions and feedback mechanisms, which is particularly suitable for elementary-level science contexts.
All items were aligned with the learning outcomes of the fifth-grade science curriculum related to electric circuits. Contextual, real-life scenarios were intentionally used to support students’ meaningful engagement with systemic reasoning (e.g., explaining changes in bulb brightness in everyday situations). AI-supported content generation tools were used selectively during this phase to assist in generating age-appropriate contexts and scenarios, while conceptual accuracy and pedagogical appropriateness were ensured by the researcher. The use of AI tools was limited to scenario construction and wording support, and all final items were reviewed and refined by the researcher to ensure alignment with the theoretical framework and curriculum objectives.
To establish content validity, the draft test was reviewed by three experts with expertise in science education, measurement and evaluation, and educational technology. Experts evaluated the items in terms of alignment with systems thinking sub-competencies, curriculum relevance, cognitive demand, clarity of language, and grade-level appropriateness. Expert feedback was collected using structured evaluation forms, and revisions were made based on consensus recommendations. Based on expert feedback, several items were revised for clarity, distractors were balanced, and the overall test length was adjusted to ensure that completion time did not exceed 40 min.
The revised draft was initially piloted with four fifth-grade students to examine item clarity and administration time. Following minor revisions, the test was administered to a larger sample of 88 middle school students for item analysis. Item difficulty indices ranged from 0.33 to 0.83, and most item discrimination indices were 0.30 or higher, indicating adequate differentiation between students with different levels of systems thinking skills. Items with low discrimination values (V5, V8, and V12) were removed. In its final form, the multiple-choice section consisted of 13 items, with a KR-21 reliability coefficient of 0.74, indicating acceptable internal consistency for research purposes.
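The item statistics reported above follow standard classical test theory formulas. The sketch below (illustrative only, using a synthetic 0/1 response matrix rather than the study’s pilot data) shows how item difficulty, a rest-score discrimination index, and the KR-21 coefficient are computed for dichotomously scored items:

```python
import numpy as np

def item_difficulty(responses):
    """Proportion correct per item (0/1 matrix: rows = students, cols = items)."""
    return responses.mean(axis=0)

def item_discrimination(responses):
    """Correlation of each item with the rest-of-test score (item excluded)."""
    n_items = responses.shape[1]
    total = responses.sum(axis=1)
    disc = np.empty(n_items)
    for j in range(n_items):
        rest = total - responses[:, j]  # exclude the item from its own criterion
        disc[j] = np.corrcoef(responses[:, j], rest)[0, 1]
    return disc

def kr21(responses):
    """Kuder-Richardson formula 21 for dichotomously scored items."""
    k = responses.shape[1]
    total = responses.sum(axis=1)
    m, var = total.mean(), total.var(ddof=1)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

# Hypothetical response matrix for illustration only (200 students x 13 items)
rng = np.random.default_rng(0)
ability = rng.normal(size=(200, 1))
difficulty = np.linspace(-1, 1, 13)
responses = (ability - difficulty + rng.normal(size=(200, 13)) > 0).astype(int)

print(item_difficulty(responses).round(2))
print(item_discrimination(responses).round(2))
print(round(kr21(responses), 2))
```

Under these conventions, difficulty indices between roughly 0.3 and 0.8 and discrimination indices of 0.30 or higher are commonly treated as acceptable, which matches the retention criteria described above.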
The open-ended section consisted of eight scenario-based items designed to capture students’ ability to explain, justify, and predict system behavior using relational and causal reasoning. Responses to open-ended items were scored using analytic rubrics ranging from 0 to 2 points, reflecting increasing levels of systemic reasoning. To balance research transparency with the protection of the instrument for planned future use and ongoing studies, the full item set and scoring rubrics are not publicly released at this stage. However, detailed sample items and scoring anchors are provided in
Appendix A.1, and the blueprint table is presented in the manuscript to support transparency and replicability. The complete instrument is available from the corresponding author upon reasonable request for research purposes.
To ensure transparency regarding construct coverage, a blueprint mapping all items in the final version of the Systems Thinking Skills Test to the targeted systems thinking sub-competencies was developed based on the framework of Assaraf and Orion [
57]. This mapping illustrates how each sub-competency is represented across both the multiple-choice and scenario-based open-ended sections of the test and is presented in
Table 1.
Although the Systems Thinking Skills Test is conceptually grounded in distinct sub-competencies proposed by Assaraf and Orion [
57], it was primarily designed to yield a composite systems thinking score representing students’ overall systemic reasoning. This decision was based on the developmental level of the participants and the study’s primary focus on examining overall growth in systems thinking rather than conducting a psychometric validation of subscale structures. Sub-dimension scores were computed to provide descriptive and exploratory insights into specific aspects of systems thinking rather than for confirmatory or scale-level inference.
3.3.2. Problem-Solving Skills Perception Scale
To determine students’ perceptions regarding their problem-solving, the Problem-Solving Skills Perception Scale for Middle School Students [
37] was used. This scale was developed to measure middle school students’ self-reported perceptions regarding the problem-solving process, rather than direct problem-solving performance. It consists of 22 items rated on a five-point Likert scale ranging from 1 (strongly disagree) to 5 (strongly agree). The scale has two factors: “perceived problem-solving competence” and “perceived willingness and perseverance regarding problem solving.” In the original validity and reliability studies conducted by the developers, the KMO value was reported as 0.87 and Cronbach’s alpha as 0.89.
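The Cronbach’s alpha reported for this scale is computed from item and total-score variances. The following minimal sketch (with synthetic 22-item Likert responses, not the developers’ validation data) illustrates the calculation:

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a students x items matrix of Likert ratings."""
    k = scores.shape[1]
    item_vars = scores.var(axis=0, ddof=1)
    total_var = scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Synthetic 1-5 Likert responses for illustration (120 students x 22 items)
rng = np.random.default_rng(1)
trait = rng.normal(size=(120, 1))                       # shared latent perception
raw = trait + rng.normal(scale=1.0, size=(120, 22))     # item-level noise
scores = np.clip(np.round(2.5 + raw), 1, 5)

print(round(cronbach_alpha(scores), 2))
```

Because all items load on one simulated latent trait, alpha comes out high here; real scale data would reflect the two-factor structure described above.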
3.4. Experimental Procedure
The experimental procedure of the study was carried out over four weeks during the spring semester of the 2024–2025 academic year within the scope of the fifth-grade science course. A total of eight sessions (approximately 16 class hours) were planned, with two class sessions per week. The same learning outcomes were addressed in both the experimental and control groups, and instruction in both was grounded in the same inquiry-oriented teaching principles, including prediction, observation, explanation, and teacher-guided questioning; the primary difference between the conditions lay in the instructional tools and representational affordances used during teaching, rather than in the pedagogical orientation.
Prior to the implementation, the researcher prepared lesson plans, instructional materials, and assessment tools. The PhET Circuit Construction Kit: DC and PhET Circuit Construction Kit: DC—Virtual Lab simulations (University of Colorado Boulder, USA) [72] to be used in the experimental group were aligned with the learning outcomes of the Electricity in Our Lives unit in the MoNE Science curriculum. The teacher implementing the intervention was a science educator pursuing doctoral-level training in science education. Before the intervention, the teacher received targeted guidance from the researcher regarding the instructional sequence, the use of simulations, and the focus on systems thinking processes, and adhered to the planned procedure throughout the implementation. To control for instructional variability, the same teacher conducted the lessons in both the experimental and control groups, and lesson durations, activity sequences, and assessment times were standardized. Equal time was allocated to experimentation in both groups, and the same physical circuit experiments were conducted concurrently. Students in both conditions worked in small groups during the hands-on experimentation process. The experimental group additionally engaged with PhET simulations, which were implemented via an interactive whiteboard in a whole-class format to complement the physical experiments. Apart from the integration of simulation-supported activities, efforts were made to maintain equivalence across instructional conditions.
First, the pre-tests were administered, and then the four-week instructional process began. In the control group, instruction followed the fifth-grade Science curriculum of the MoNE. In these lessons, students constructed circuits using materials such as batteries, bulbs, cables, and switches; explored key concepts such as “open and closed circuits” and “series and parallel connections”; and observed variables affecting circuit brightness through hands-on experimentation. These activities were inquiry-oriented and involved teacher-guided questioning; however, they relied on static physical configurations and limited opportunities for rapid manipulation and comparison of multiple system states, which constrained students’ ability to explore dynamic system variations within a single lesson. During these activities, the teacher prompted students to make predictions, explain observed outcomes, and justify their reasoning, consistent with guided inquiry practices.
In the experimental group, the same learning outcomes were addressed, but instruction was additionally supported by PhET interactive simulations displayed on an interactive whiteboard. Due to classroom infrastructure conditions, simulations were implemented in a whole-class format rather than through individual student devices. Students participated sequentially by manipulating the simulations on the interactive whiteboard, while other students actively contributed through observation, discussion, and commentary. During these activities, students manipulated variables such as the number of batteries, number of bulbs, and connection types to explore causal relationships within the system.
Instruction in the experimental group followed a guided inquiry cycle that included prediction, manipulation, observation, explanation, and reflection. The teacher adopted a facilitative role by prompting students to articulate causal explanations (e.g., why bulb brightness changed), identify system components and relationships, and consider hypothetical scenarios (e.g., “What would happen if another battery were added?”). These prompts were explicitly designed to elicit systems thinking processes such as relational reasoning, feedback awareness, and generalization across configurations. Thus, the simulation-supported environment actively engaged students in hypothesis generation, testing, and systemic reasoning.
At the end of the implementation, the same instruments used as pre-tests were administered again as post-tests in both groups. Instructional consistency across conditions was supported through the use of standardized lesson plans and implementation by the same teacher.
3.5. Data Analysis
Scores obtained from the instruments used in the study were computed according to the nature of each measure. The Problem-Solving Skills Perception Scale is scored on a five-point Likert scale, with items rated from “1 = Strongly disagree” to “5 = Strongly agree.” Negatively worded items were reverse-coded, and a total score was then calculated. The possible total score ranges from 22 to 110, with higher scores indicating higher perceived problem-solving skills.
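The scoring rule described above can be made concrete with a short sketch. Note that the indices of the negatively worded items are not specified in this section, so the `reverse_items` argument below is a hypothetical placeholder rather than the scale’s actual item keys.

```python
def total_problem_solving_score(responses, reverse_items):
    """Total score for the 22-item, five-point Likert scale.

    responses: sequence of 22 ratings, 1 (strongly disagree) to 5 (strongly agree)
    reverse_items: 1-based indices of negatively worded items (hypothetical here)
    """
    if len(responses) != 22:
        raise ValueError("expected 22 item responses")
    total = 0
    for i, rating in enumerate(responses, start=1):
        if not 1 <= rating <= 5:
            raise ValueError(f"item {i}: rating {rating} outside 1-5")
        # reverse-code negatively worded items: 1 <-> 5, 2 <-> 4, 3 stays 3
        total += (6 - rating) if i in reverse_items else rating
    return total  # possible range: 22-110
```

With all items rated 1 and no reverse-coded items the total is 22, and with all items rated 5 it is 110, matching the score range stated above.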
The Systems Thinking Skills Test comprises two sections: multiple-choice and open-ended items. For the multiple-choice section, each correct response was scored as 1 and incorrect or blank responses as 0. The open-ended questions were scored on three levels (0–1–2 points) using analytic rubrics developed by the researcher and reviewed by subject-matter experts. For the total systems thinking score, multiple-choice and open-ended item scores were summed to obtain a composite score representing overall systems thinking performance. An AI-assisted provisional coding tool was used solely to assist with the initial rubric-based categorization of open-ended responses. The AI system processed anonymized response texts only and had no access to students’ group membership (experimental/control), test time (pre/post), or study hypotheses. AI-generated scores were not retained as final scores. All AI-generated codes were subsequently reviewed by the researcher, and final scoring decisions were made exclusively by human raters. To examine scoring reliability, open-ended responses from a randomly selected 20% of the sample (n = 70) were independently coded by the researcher and a second rater with expertise in science education. Inter-rater reliability was calculated for open-ended item scores using Cohen’s kappa coefficient, which indicated a substantial level of agreement (κ = 0.70, p < 0.001). Reliability estimates based on total open-ended scores yielded comparable agreement levels. In cases of discrepancy between human raters, scores were discussed until consensus was reached, and only consensus-based human scores were used in the final dataset and all subsequent analyses. Because the AI-assisted tool functioned solely as an organizational aid and AI-generated provisional codes were not incorporated into the final analytic dataset, an AI–human agreement statistic was not computed.
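For transparency, the inter-rater agreement statistic reported above can be reproduced from two raters’ categorical codes. The sketch below is a standard implementation of Cohen’s kappa for the 0–1–2 rubric levels; the example codes in the illustration are invented and are not the study’s data.

```python
def cohen_kappa(rater1, rater2):
    """Cohen's kappa for two raters' categorical codes (e.g., 0/1/2 rubric levels)."""
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("raters must code the same non-empty set of responses")
    n = len(rater1)
    # observed proportion of agreement
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # chance agreement from each rater's marginal category proportions
    categories = set(rater1) | set(rater2)
    p_e = sum((rater1.count(c) / n) * (rater2.count(c) / n) for c in categories)
    return (p_o - p_e) / (1 - p_e)
```

Perfect agreement yields κ = 1, while agreement at chance level yields κ = 0; values around 0.70, as reported above, are conventionally read as substantial agreement.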
Prior to statistical analyses of the quantitative data, the suitability of the dataset for parametric tests was examined. Data were analyzed using SPSS 26.0. Descriptive statistics (means, standard deviations, and minimum–maximum values) were computed for all variables. Normality assumptions were assessed based on total scores using skewness and kurtosis coefficients. For the Systems Thinking Test, skewness and kurtosis coefficients for all items ranged from −0.99 to 0.49 and from −1.13 to 0.63, respectively, which are within the ±1 boundaries. These results indicated that the distribution of systems thinking pre-test scores met the assumptions of normality. For problem solving perceptions, skewness and kurtosis values were also within ±1 (skewness = −0.43, kurtosis = −0.30), indicating that the distribution fell within acceptable limits for normality. Therefore, it was concluded that the data met the assumptions of normality, and parametric tests were used.
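The ±1 rule of thumb applied above can be illustrated with a minimal sketch. SPSS reports bias-corrected skewness and kurtosis estimators, so its values differ slightly from the simple moment-based versions below in small samples; the interpretation against the ±1 boundaries is the same.

```python
import math

def skew_and_excess_kurtosis(x):
    """Moment-based sample skewness and excess kurtosis.

    Note: SPSS uses bias-corrected estimators, so its reported values
    differ slightly for small samples; the +-1 heuristic applies the same way.
    """
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    skew = sum(((v - mean) / sd) ** 3 for v in x) / n
    kurt = sum(((v - mean) / sd) ** 4 for v in x) / n - 3.0  # excess kurtosis
    return skew, kurt
```

A perfectly symmetric distribution has skewness 0, and a flat distribution has negative excess kurtosis; both would fall within the ±1 boundaries used as the normality criterion here.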
Prior to inferential analyses, the dataset was screened for missing values and potential outliers. Missing data were examined across all variables, and no missing values were detected at the total score level; therefore, all analyses were conducted using complete cases. Potential outliers were identified using standardized z-scores, with values exceeding |z| ≥ 3.0 considered indicative of extreme cases. This screening did not identify any outliers for the total scores of systems thinking or problem-solving perceptions. Accordingly, all statistical analyses were performed using the full dataset, and the primary results were not influenced by extreme observations.
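The z-score screening rule described above amounts to standardizing each total score and flagging cases at or beyond the |z| ≥ 3.0 cutoff; a minimal sketch:

```python
def flag_outliers(scores, cutoff=3.0):
    """Return indices of cases whose standardized score satisfies |z| >= cutoff."""
    n = len(scores)
    mean = sum(scores) / n
    sd = (sum((v - mean) ** 2 for v in scores) / (n - 1)) ** 0.5  # sample SD
    return [i for i, v in enumerate(scores) if abs((v - mean) / sd) >= cutoff]
```

An empty list, as obtained for both outcome measures in this study, indicates that no case exceeded the extreme-value threshold.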
In line with the purpose of the study, a combination of inferential statistical analyses was employed to examine between-group differences and changes over time associated with simulation-based science teaching on students’ systems thinking and problem-solving perceptions. To examine within-group changes over time, paired-samples t-tests were conducted separately for the experimental and control groups using pre-test and post-test total scores. In addition, paired-samples t-tests were applied to the sub-dimensions of systems thinking within the experimental group for descriptive and exploratory purposes.
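The within-group comparison described above can be sketched as follows. The study’s analyses were run in SPSS; this is only an illustration of the paired-samples t statistic and the accompanying Cohen’s d (here standardized by the SD of the pre–post differences, one common convention for paired designs).

```python
import math

def paired_t_and_d(pre, post):
    """Paired-samples t statistic, Cohen's d, and degrees of freedom."""
    diffs = [b - a for a, b in zip(pre, post)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    t = mean_d / (sd_d / math.sqrt(n))        # t with df = n - 1
    cohens_d = mean_d / sd_d                  # standardized mean difference
    return t, cohens_d, n - 1
```

Calling the function with a group’s pre-test and post-test score lists returns the statistics reported in the within-group tables.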
To assess between-group differences at baseline, independent-samples t-tests were conducted on pre-test scores. For the primary between-group comparisons at post-test, analyses of covariance (ANCOVA) were performed with post-test scores as the dependent variables, instructional condition (experimental vs. control) as the fixed factor, and corresponding pre-test scores as covariates, in order to adjust for potential baseline differences. Prior to conducting ANCOVA, key assumptions were evaluated. Levene’s test indicated that the assumption of homogeneity of variances was met (p > 0.05). The homogeneity of regression slopes assumption was assessed by testing the interaction between the group variable and pre-test scores. The interaction terms were not statistically significant (p > 0.05), indicating that the assumption was satisfied and that ANCOVA was appropriate for the data. Finally, Pearson product–moment correlation analyses (r) were conducted using post-test scores to examine the relationship between students’ systems thinking and problem-solving perceptions. For all analyses, the significance level was set at α = 0.05. Because group assignment was conducted at the class level using two intact classes, statistical analyses were interpreted with caution, and effect sizes were emphasized alongside p-values.
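To make the covariate-adjustment logic concrete, the sketch below implements a one-covariate ANCOVA as a comparison of nested least-squares models: the F test for the condition effect is the change in residual sum of squares between a model containing only the pre-test and a model containing the pre-test plus condition. The scores are invented for illustration and are not the study’s data.

```python
def ols_sse(X, y):
    """Least-squares fit via the normal equations; returns the residual sum of squares."""
    n, k = len(y), len(X[0])
    XtX = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)] for a in range(k)]
    Xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    A = [XtX[a][:] + [Xty[a]] for a in range(k)]        # augmented matrix
    for c in range(k):                                  # Gauss-Jordan elimination
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [A[r][j] - f * A[c][j] for j in range(k + 1)]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    return sum((y[i] - sum(beta[j] * X[i][j] for j in range(k))) ** 2 for i in range(n))

# Hypothetical scores (NOT the study's data); group 0 = control, 1 = experimental
pre   = [10, 12, 9, 14, 11, 13, 10, 12, 15, 9, 11, 13]
post  = [12, 13, 10, 15, 12, 14, 17, 18, 22, 15, 18, 20]
group = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
n = len(post)

full    = [[1.0, pre[i], group[i]] for i in range(n)]  # intercept + covariate + condition
reduced = [[1.0, pre[i]] for i in range(n)]            # intercept + covariate only
sse_f, sse_r = ols_sse(full, post), ols_sse(reduced, post)

df_error = n - 3                               # n minus parameters in the full model
F = (sse_r - sse_f) / (sse_f / df_error)       # 1 numerator df for the condition effect
partial_eta_sq = (sse_r - sse_f) / sse_r       # SS_effect / (SS_effect + SS_error)
```

The homogeneity-of-slopes check reported above corresponds to adding a pre × condition interaction column to the full model and testing it the same way.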
Given the multiple comparisons conducted across the systems thinking skills and problem-solving perception sub-dimensions in both within-group (pre–post) and between-group analyses, these sub-dimension analyses were treated as exploratory rather than confirmatory. Therefore, no formal multiplicity adjustment was applied, and unadjusted p-values are reported. The primary conclusions of the study are based on total scores and ANCOVA results. Given the small number of clusters and the exploratory nature of sub-dimension analyses, results were interpreted cautiously with greater emphasis on effect sizes than on p-values.
3.6. Ethical and Privacy Considerations
All student data were anonymized prior to analysis, and no personally identifiable information was included at any stage of data processing. The AI-assisted scoring tool was used only with anonymized response texts and operated in a secure environment for provisional coding purposes. No student identifiers were stored, transmitted, or retained during AI-assisted processing.
All study procedures were reviewed and approved by the Bursa Uludağ University Research and Publication Ethics Committees (Social and Human Sciences Research and Publication Ethics Committee) (Approval Date: 28 November 2025; Protocol No: 2025-10).
Given that the participants were minors, the study was conducted in accordance with institutional and national ethical regulations governing educational research involving children. Participation was based on voluntary involvement, and consent procedures were implemented in line with school policies and prevailing ethical practices. Students were informed about the purpose of the study and their right to withdraw at any stage without penalty.
Artificial intelligence tools (specifically ChatGPT by OpenAI, GPT-5) were used in a limited and controlled manner for language editing, text organization, and improving clarity during manuscript preparation. In addition, an AI-assisted tool was used solely as a provisional coding aid for the initial categorization of anonymized open-ended responses. All AI-generated outputs were reviewed by human raters, and final scoring decisions were made exclusively through human judgment. AI was not used for statistical analyses, data interpretation, or the generation of study findings.
4. Findings
4.1. Findings Related to the First Sub-Problem
The first sub-problem of the study was: “Does simulation-based science teaching have a significant effect on students’ systems thinking skills compared to the MoNE curriculum?” Students’ systems thinking skills were assessed using a test consisting of multiple-choice and open-ended items. Prior to the intervention, an independent-samples t-test was conducted to examine whether there was a significant difference between the experimental and control groups’ pre-test systems thinking scores. The results indicated no statistically significant difference between the groups at baseline (p > 0.05), suggesting that the two groups had comparable initial levels of systems thinking before the implementation. To examine the effect of simulation-based instruction while controlling for baseline differences, an analysis of covariance (ANCOVA) was conducted with post-test systems thinking scores as the dependent variable, instructional condition (experimental vs. control) as the fixed factor, and pre-test scores as the covariate. The ANCOVA results are presented in Table 2.
The ANCOVA revealed a statistically significant difference between instructional conditions on post-test systems thinking scores after controlling for pre-test performance, F(1, 32) = 34.63, p < 0.001, partial η² = 0.52, indicating a large effect size. The covariate (pre-test systems thinking) was also significant, F(1, 32) = 82.96, p < 0.001, partial η² = 0.72. These findings indicate that simulation-based science teaching was associated with significantly higher systems thinking performance compared to the MoNE curriculum when baseline differences were taken into account. Exploratory analyses were conducted to descriptively examine post-test differences between the experimental and control groups across the sub-dimensions of systems thinking. The results of these independent-samples t-test analyses are provided in Table A2 (Appendix A.2). These sub-dimension analyses are reported for descriptive and exploratory purposes only and should not be interpreted as confirmatory evidence.
The paired-samples t-test results for the pre-test and post-test systems thinking scores of the experimental and control groups are presented in Table 3.
As shown in Table 3, systems thinking scores in the experimental group increased significantly from pre-test to post-test (p < 0.001, d = 2.55). In the control group, the change was not significant (p > 0.05). This finding suggests that the observed increase in systems thinking scores was primarily evident in the experimental group. Given the small sample size and class-level assignment, the magnitude of the observed effect size should be considered in light of these design characteristics. The results for the sub-dimensions of systems thinking in the experimental group are shown in Table 4.
As seen in Table 4, exploratory within-group analyses indicated statistically significant increases from pre-test to post-test in seven of the eight sub-dimensions in the experimental group (p < 0.05). Only the increase in the “Thinking temporally: retrospection and prediction” sub-dimension was not statistically significant (p > 0.05). These sub-dimension findings are interpreted descriptively and should be considered exploratory in nature. Overall, the pattern of results suggests that simulation-based teaching may strongly support the development of processes such as identifying system components, constructing relational structures, and making conceptual generalizations, whereas mental simulation skills based on temporal flow and causality may require longer-term interventions.
4.2. Findings Related to the Second Sub-Problem
The second sub-problem of the study was: “Does simulation-based science teaching have a significant effect on students’ perceptions of their problem-solving compared to the MoNE curriculum?” To address this question, the Problem-Solving Skills Perception Scale was administered to both the experimental and control groups as a pre-test and post-test. Prior to the intervention, an independent-samples t-test was conducted to examine baseline equivalence between the groups. The results indicated no statistically significant difference between the experimental and control groups’ pre-test problem-solving perception scores (p > 0.05), suggesting comparable initial levels before the implementation. To examine the effect of simulation-based instruction while controlling for baseline differences, an analysis of covariance (ANCOVA) was conducted with post-test problem-solving perception scores as the dependent variable, instructional condition (experimental vs. control) as the fixed factor, and pre-test scores as the covariate. The ANCOVA results are presented in Table 5.
The ANCOVA revealed a statistically significant effect of instructional condition on post-test problem-solving perception scores after controlling for pre-test performance, F(1, 32) = 8.40, p = 0.007, partial η² = 0.21, indicating a moderate effect size. The pre-test covariate was also statistically significant, F(1, 32) = 101.58, p < 0.001, partial η² = 0.76. These findings indicate that simulation-based science teaching was associated with more positive problem-solving perceptions compared to the MoNE curriculum when baseline differences were taken into account. Exploratory analyses of post-test differences across the sub-dimensions of problem-solving perceptions are provided in Table A3 (Appendix A.2). These sub-dimension analyses are reported for descriptive and exploratory purposes only and should not be interpreted as confirmatory evidence. Paired-samples t-test results for pre-test and post-test scores on the problem-solving perception scale in the experimental and control groups are presented in Table 6.
In the experimental group, problem-solving perception scores increased from pre-test to post-test, although this increase was not statistically significant (p > 0.05); the corresponding effect size was moderate (d = 0.43). In the control group, there was no significant change between pre-test and post-test scores (p > 0.05). This finding suggests that gains in problem-solving perceptions may have begun to emerge in the experimental group but develop more gradually over time. In other words, although simulation-based learning may support students’ perceptions of problem-solving, its effects appear to emerge more slowly and in a more process-oriented manner than those observed for systems thinking.
4.3. Findings Related to the Third Sub-Problem
The third sub-problem of the study was: “Is there a significant relationship between students’ systems thinking skills and problem-solving perceptions?” The correlation results addressing this question, based on post-test scores, are presented in Table 7.
When Table 7 is examined, a positive and moderate correlation is observed between students’ systems thinking skills and their problem-solving perceptions (r = 0.44, p < 0.01). This result shows that students with higher systems thinking skills also report more positive perceptions of their problem-solving processes. Therefore, it can be cautiously argued that students who understand the components, relationships, and functioning of a system may be better positioned to perceive themselves as competent, willing, and persistent problem solvers, even though this relationship does not directly indicate problem-solving performance.
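The Pearson product–moment coefficient underlying this result is a simple computation; a minimal sketch (the correlation itself was computed in SPSS on the study’s post-test scores):

```python
def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))   # co-deviation term
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5
```

Values near +1 or -1 indicate strong linear association, while r = 0.44 falls in the moderate positive range described above.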
5. Discussion
The findings of this study reveal that, under the present instructional and contextual conditions, simulation-based science teaching is meaningfully associated with the development of systems thinking skills. Results related to the first sub-problem show that PhET-based simulations were associated with significant gains in higher-order cognitive processes such as identifying system components, organizing relationships among these components, making generalizations, understanding hidden dimensions, and recognizing the cyclic nature of systems. The large effect sizes observed after controlling for baseline differences suggest that simulations may support a holistic examination of systemic structures and contribute to students’ construction of deeper mental models. These findings should, however, be interpreted with caution, as classroom-level dynamics and instructional interactions may also have contributed to the observed differences. Accordingly, the results are best understood as reflecting the combined influence of simulation affordances and naturally occurring classroom processes rather than the independent effect of a single instructional variable.
These results are consistent with previous research. Evagorou et al. [11] demonstrated that a simulation-based learning environment resulted in significant gains in multiple components of systems thinking among elementary students, even within a relatively short instructional period. This finding provides a useful framework for explaining the substantial improvements observed in the present study. Similarly, do Amaral and Fregni [73] reported that computer simulations enhance understanding of system structures and behavioral patterns, support critical thinking, and develop cognitive skills related to quantitative modeling. In this context, it can be argued that simulations help students to concretize abstract concepts and make systemic relationships more visible. Likewise, Waddington and Fennewald [41] indicated that complex simulations such as “Fate of the World” serve as powerful conceptual tools for fostering systems thinking and enhancing students’ multidimensional reasoning about global issues. This suggests that simulations support not only content knowledge but also the ability to interpret system-level interactions, feedback loops, and long-term consequences.
In line with this body of literature, the present findings show that simulation-based science teaching was associated with higher levels of systems thinking performance when baseline differences were taken into account, and thus may represent a promising pedagogical approach particularly for learning science topics that involve multi-component and dynamic relationships. The visualization, interactivity, and immediate feedback afforded by simulations enable students to notice systemic relationships, build conceptual connections, and structure their holistic thinking skills more deeply.
From a theoretical perspective, the observed patterns of improvement can be interpreted in relation to the design features of the PhET simulations and their alignment with specific characteristics of the Systems Thinking Hierarchy (STH) [57]. This interpretation is consistent with research on cognitive transfer suggesting that instructional designs that promote mindful engagement and active interaction with content are more likely to support higher-order cognitive processes [74]. Interactive manipulation of system components and immediate visual feedback support lower-level STH characteristics such as identifying components and relationships, while coordinated visual representations may facilitate generalization and the recognition of hidden system dimensions. In contrast, higher-level temporal reasoning, which requires understanding delayed and cumulative system effects, may be less strongly supported by short-term simulation tasks, helping to explain the weaker gains observed in this sub-dimension.
The very large effect sizes observed for systems thinking should be interpreted in relation to the instructional and methodological characteristics of the study. Although such magnitudes are less common in short-term educational interventions, several factors may have contributed to the observed effects. These include close alignment between the assessment instrument and the instructional focus of the simulation-supported lessons, the sensitivity of rubric-scored open-ended items to instructional change, possible novelty effects associated with interactive simulations, and the influence of small sample size and class-level assignment on standardized effect size estimates. Accordingly, the reported effect sizes are best interpreted as indicators of strong associations under the present conditions rather than as precise population-level estimates.
However, the absence of significant gains in temporal thinking, together with the comparatively weaker improvement observed in understanding dynamic relationships within the system, suggests that these skills may require longer-term interventions. The literature emphasizes that understanding causal chains and feedback loops that unfold over time is among the highest levels of systems thinking and demands more complex cognitive processing [57]. From this perspective, despite the strong visualization opportunities provided by simulations, additional instructional scaffolds and extended learning experiences may be necessary to support students’ understanding of the temporal dimension of systems. It should be noted that findings at the sub-dimension level are interpreted as exploratory, as the primary inferences of the study are based on total scores and covariate-adjusted group comparisons.
Findings related to the second sub-problem of this study indicate that simulation-based instruction was associated with more positive problem-solving perceptions, rather than directly measured problem-solving performance. The fact that the experimental group achieved significantly higher scores than the control group in the post-test suggests that interactive simulation environments may strengthen students’ self-perceived motivation, perseverance, and task persistence in the problem-solving process. Rather than reflecting direct gains in problem-solving performance, these results should be interpreted as changes in students’ self-perceptions regarding their engagement with and persistence in problem-solving processes. This aligns with studies emphasizing that simulations provide learning environments that activate students’ cognitive and affective processes [23,70].
There is strong evidence in the literature that simulation-based instruction can support problem-solving skills and processes as assessed through performance-based measures [25,30,31]. These studies demonstrate that simulations can facilitate the regulation of cognitive processes and promote the use of expert-like scientific procedures, such as hypothesis generation, data interpretation, and drawing conclusions. For example, Ceberio et al. [8] report that simulation-based learning environments contribute to the development of more sophisticated strategies in students’ scientific problem-solving processes. Similarly, Simanjuntak et al. [29] showed that the combined use of problem-based learning and simulations leads to significant improvements in problem-solving and creative thinking performance compared to traditional instruction.
In contrast to this performance-oriented body of research, the present study focuses on students’ problem-solving perceptions rather than observable problem-solving skills. Accordingly, the findings should be interpreted as indicating that simulation-based instruction may influence how students perceive their competence, effort, and persistence during problem-solving, rather than directly enhancing their problem-solving performance. In this sense, the gradual and process-oriented effects observed in the present study align with prior research suggesting that motivational and perceptual changes may precede measurable gains in problem-solving skills.
Findings related to the third sub-problem of the study show that there is a positive and moderate correlation between systems thinking skills and problem-solving perceptions. This result is consistent with literature emphasizing the critical role of understanding relationships among system components and holistic thinking in solving complex problems [19,20,21]. Students’ ability to grasp the structure of a system, its processes, and its relational patterns may help them perceive problems as more manageable and approach them with greater confidence and strategic awareness. In this sense, systems thinking can be regarded as a potential cognitive foundation that supports students’ perceptions of problem-solving competence, and instructional approaches that foreground systemic reasoning may indirectly contribute to more holistic learning experiences in science education.
6. Conclusions and Implications
The results of this study show that simulation-based science teaching was associated with improvements in fifth-grade students’ systems thinking skills and problem-solving perceptions. The findings indicate that students who participated in PhET-supported instruction demonstrated higher levels of understanding of system components, relational structures, cyclic processes, and hidden aspects of scientific phenomena after controlling for baseline differences, compared to their peers who received curriculum-based instruction. Although temporal reasoning and aspects of dynamic causal tracking appeared to develop more slowly, the overall pattern suggests that simulations provide a cognitively rich environment that may support the construction of deeper and more coherent mental models. This underscores the potential of simulation-based learning as a promising tool for deepening students’ mental models in science topics characterized by abstract and multi-component structures. These findings should be interpreted within the constraints of a quasi-experimental design based on two intact classes, and therefore reflect associations consistent with simulation-supported instruction rather than definitive causal effects. Moreover, given the quasi-experimental nature of the design, the observed outcomes likely reflect the combined influence of simulation affordances and naturally occurring classroom processes rather than the independent effect of a single instructional variable.
Results pertaining to problem-solving perceptions show that simulations are particularly associated with improvements in students’ process-oriented and motivational components—especially willingness, effort, perseverance, and task persistence. Although there was no significant increase in students’ general problem-solving perceptions, the improvement observed in motivational dimensions within a short implementation period highlights the potential of simulations to activate non-cognitive processes that may precede broader and more stable perceptual change over time.
The positive relationship found between systems thinking skills and problem-solving perceptions highlights the importance of instructional approaches that foreground systemic reasoning, as such approaches may also shape how students perceive and engage with problem-solving processes in science education. It may be argued that students who are more competent in understanding system components, relationships, and patterns may be better positioned to structure the problem-solving process in a strategic and holistic manner. Therefore, it is important in science instruction not only to focus on content transmission but also to design learning environments that emphasize the analysis of systemic structures, relationships, and processes.
The educational implications of these findings point to the value of integrating well-designed simulations into science education, especially for topics such as electricity that involve abstract or unobservable processes. Simulations can help students visualize complex interactions, safely test their ideas, and engage in inquiry-based explorations that are often difficult to implement in traditional classroom settings. From a design perspective, the findings suggest that early instructional scaffolds should prioritize helping students identify system components, relationships, and cyclic structures, as these aspects appear to respond relatively quickly to simulation-based support. In contrast, competencies involving temporal and dynamic causal reasoning may benefit from staged tasks and longer instructional sequences that allow students to revisit and refine their understanding over time. In addition, effective use of simulations appears to depend on iterative instructional feedback loops in which simulation use is integrated with inquiry-based questioning and teacher facilitation. Rather than functioning as stand-alone visualization tools, simulations are likely to be most effective when embedded within cycles of prediction, testing, reflection, and guided discussion.
Overall, the study provides preliminary evidence that simulation-based instruction can contribute to more effective and engaging science learning environments by simultaneously supporting cognitive understanding, relational reasoning, and students’ perceived motivational readiness to cope with complex problems. At the same time, the findings highlight an important pedagogical trade-off: while simulations may yield relatively rapid gains in structural and relational aspects of systems thinking, more complex competencies—such as temporal reasoning and causal tracking—appear to mature more slowly and require sustained instructional attention.
In conclusion, it is recommended that simulations be used more extensively, progressively, and systematically in science education, and that they be enriched with activities designed to support students’ causal and temporal reasoning, particularly in topics where space–time relations are central. Future research may also benefit from examining these effects across larger samples and multiple instructional contexts, as well as from further exploring sub-dimension–level changes using confirmatory designs. In addition, implementing and comparing holistic instructional designs that examine how systems thinking–oriented learning environments relate to students’ problem-solving perceptions across different grade levels may provide important directions for future research.
7. Limitations
Several limitations of this study should be acknowledged. First, the study was conducted with a relatively small sample drawn from a single public middle school, which may limit the generalizability of the findings. In addition, randomization was implemented at the class level using two intact classes taught by the same teacher. Although this approach helped control for teacher-related variability, the inclusion of only two clusters precluded the use of multilevel modeling or cluster-robust statistical techniques. Accordingly, ANCOVA was used as a robustness check to adjust for baseline differences rather than as a full correction for clustering effects. Because intraclass correlation coefficients (ICCs) could not be reliably estimated with so few clusters, the potential influence of clustering cannot be ruled out; standard errors may therefore be underestimated, and the findings should be considered preliminary and interpreted cautiously.
Additionally, although efforts were made to standardize instructional conditions across the experimental and control groups—including the use of the same teacher, equivalent lesson durations, shared learning objectives, and structured lesson plans—it is possible that unmeasured classroom dynamics influenced the outcomes. Factors such as subtle variations in teacher facilitation, student engagement, interaction patterns, or the intensity of inquiry-based practices may have contributed to the observed differences. Therefore, the findings should be interpreted as reflecting the affordances of simulation-supported learning environments rather than the isolated effect of simulation tools alone.
Second, the duration of the intervention was relatively short. While significant gains were observed in systems thinking skills and in certain motivational dimensions of problem-solving perceptions, more complex skills—particularly those related to temporal reasoning and causal tracking—may require longer-term or repeated instructional exposure. Therefore, the effects observed in this study may reflect early-stage developments rather than fully consolidated cognitive changes.
Third, problem-solving was assessed through students’ self-reported perceptions rather than performance-based measures. Although perceptions of problem solving are an important motivational and affective component of learning, future studies could benefit from incorporating objective problem-solving tasks or process-based assessments to provide a more comprehensive evaluation of students’ problem-solving competencies.
Finally, although expert-reviewed rubrics and AI-assisted scoring procedures were used to ensure consistency and rigor in scoring the open-ended systems thinking items, the AI system functioned solely as a provisional coding tool, and all final scoring decisions were made by human raters; consequently, some degree of subjectivity in human judgment cannot be entirely eliminated. Future research may further strengthen reliability by employing multiple independent human raters or automated scoring systems validated across larger datasets.
Despite these limitations, the study provides valuable insights into the potential of simulation-based science instruction to support the development of systems thinking and students’ problem-solving perceptions in elementary science education.