Article

Educating for Complexity: A Learning Architecture for Systems Thinking in Professional Education and Generative AI Governance

by Liliana Pedraja-Rejas 1,*, Katherine Acosta-García 2, Emilio Rodríguez-Ponce 3 and Camila Muñoz-Fritis 3
1 Departamento de Ingeniería Industrial y de Sistemas, Facultad de Ingeniería, Universidad de Tarapacá, Arica 1000000, Chile
2 Departamento de Educación, Facultad de Educación y Humanidades, Universidad de Tarapacá, Arica 1000000, Chile
3 Instituto de Alta Investigación, Universidad de Tarapacá, Arica 1000000, Chile
* Author to whom correspondence should be addressed.
Systems 2026, 14(4), 403; https://doi.org/10.3390/systems14040403
Submission received: 27 February 2026 / Revised: 29 March 2026 / Accepted: 3 April 2026 / Published: 7 April 2026
(This article belongs to the Special Issue Systems Thinking in Education: Learning, Design and Technology)

Abstract

Professional education increasingly requires graduates to make decisions in complex systems marked by multiple stakeholders, feedback, delays, uncertainty, and unintended consequences, yet systems thinking is still often taught as a set of disconnected tools rather than as an integrated professional practice. This conceptual paper adopts an integrative theory-building approach to develop a unified architecture for systems thinking in professional education, drawing purposively on systems traditions, practice-based learning, assessment scholarship, and emerging work on generative artificial intelligence (GenAI). The paper proposes four iterative practices (sensemaking and boundary setting, co-modelling and causal representation, intervention reasoning, and meta-learning) as the core architecture for learning systems thinking in professional contexts. It further translates this architecture into indicative implications for curriculum sequencing, authentic tasks, and assessment, while positioning GenAI as a cross-cutting support/risk layer that can assist iteration and critique but also introduce predictable risks such as fabricated causal links, overreliance, and false mastery. To address these risks, the paper outlines governance conditions based on traceability, uncertainty checks, stakeholder validation, and process-based assessment. Overall, the framework provides a design-oriented basis for teaching, assessing, and governing systems thinking in contemporary professional education and a foundation for future empirical testing.

1. Introduction

Professionals increasingly make decisions inside complex systems where multiple actors, rules, resources, and information flows interact, producing nonlinearity, delays, path dependence, and unintended consequences. In such settings, competence is not only technical: it includes the ability to frame problems (and reframe them), engage stakeholders, model causal structure, anticipate feedback, and justify interventions under ethical and regulatory constraints. Systems thinking offers a transdisciplinary language and set of practices for this work, with deep roots in system dynamics [1], organisational learning [2], and leverage-point reasoning [3], as well as in interpretive and critical traditions that foreground multiple perspectives, problem structuring, and boundary judgements [4,5].
Yet the educational translation of this mature theoretical base remains uneven. While recent scholarship shows sustained growth and diversification in systems thinking education, learners still often encounter the field as a set of disconnected techniques (e.g., causal mapping, stakeholder lists, or simulations) rather than as an integrated practice that links representation to intervention and reflective learning over time [6,7]. The result is a recurring mismatch: the professional demand is iterative, situated, and justificatory, but instructional pathways frequently remain episodic and tool-centred.
This mismatch becomes especially visible in assessment. Systems thinking is not simply familiarity with terminology; it is performance in situated tasks where learners must make assumptions explicit, build and critique causal explanations, and defend decisions under constraints. Empirical work further indicates that self-reports and performance can diverge, reinforcing the need for assessment designs that capture both mindsets and observable reasoning practices, rather than treating either one as sufficient on its own [8,9].
A rapidly evolving force now amplifies these long-standing challenges: the diffusion of generative artificial intelligence (GenAI) into higher education and professional learning. Large language models can support iteration, explanation, and feedback at scale [10,11]. At the same time, they introduce failure modes that directly threaten core goals in systems thinking education, including persuasive fabrication (“hallucination”), inappropriate reliance, and false mastery, where outputs improve while underlying competence stagnates [12,13,14]. In systems thinking education, where causal reasoning, epistemic humility, and revision in light of evidence are central, these risks are not peripheral; they target the integrity of learning itself.
Against this background, three interrelated gaps motivate the argument. First, existing work provides valuable pedagogical strategies and domain-specific innovations, yet offers limited conceptual integration across the full learning process. In particular, the literature lacks a clear end-to-end curricular throughline from boundary setting to modelling, intervention reasoning, and meta-learning that can support longitudinal curriculum design beyond stand-alone activities [6,15]. Second, assessment remains only partially aligned with the situated nature of systems-thinking practice, and there is a need to specify performance criteria, expected evidence, and instrument options that fit authentic professional tasks while explicitly integrating self-report and performative metrics [8,9]. Third, the productive and responsible integration of GenAI in systems thinking education remains under-specified: although opportunities and risks are increasingly acknowledged, fewer frameworks explain how GenAI should be incorporated into learning environments for systems thinking without undermining causal reasoning, reflective judgement, and accountability [12,14,16,17].
Existing integrative contributions in systems thinking education have tended to advance one of these dimensions at a time rather than connect them within a single design logic. Some provide valuable curricular and pedagogical guidance for particular domains or learning settings, but stop short of specifying an end-to-end architecture linking framing, modelling, intervention, and reflective revision across longitudinal learning journeys [6,7,15]. Others clarify assessment challenges or measurement options, but do not embed these within a broader curricular and governance framework for professional formation [8,9]. In parallel, emerging work on GenAI has identified important opportunities and risks, yet usually treats these at the level of tool use or general educational governance rather than as conditions internal to the learning of systems thinking itself [12,14,16,17]. The specific gap addressed here, therefore, is not the absence of relevant scholarship, but the absence of a design-oriented architecture that integrates these strands into a single account of how systems thinking can be learned, assessed, and governed in professional education.
In response, this paper develops a unified conceptual architecture for systems thinking in professional education and translates it into actionable implications for curriculum design, assessment, and governance in GenAI-mediated learning environments. Its contribution lies not only in synthesis, but in architectural integration, expressed in four specific contributions. First, it reconceptualises systems thinking for professional education as an iterative architecture of four linked practices (sensemaking and boundary setting, co-modelling and causal representation, intervention reasoning, and meta-learning), thereby moving beyond tool-centred approaches and offering an end-to-end learning logic. Second, it translates systems traditions into a cross-level design model that connects concepts, professional capabilities, learning artefacts, curriculum sequencing, and assessment criteria.
Third, it brings GenAI into the framework not as a generic educational add-on, but as a governance condition internal to the teaching and learning of systems thinking. Fourth, it identifies a pathway for empirical development by linking the framework to propositions, measurable constructs, and candidate instruments. Taken together, these contributions position the paper as a design-oriented framework for how systems thinking can be taught, assessed, and governed in contemporary professional education. The discussions of curriculum, assessment, governance, and empirical development should therefore be read as translational extensions of the proposed architecture rather than as equally elaborated stand-alone frameworks.

2. Methodological Approach

This paper adopts a conceptual, integrative theory-building approach to develop a framework for systems thinking in professional education. Rather than providing a systematic review, the paper synthesises concepts from four relevant bodies of scholarship: systems thinking traditions, professional education and practice-based learning, assessment of complex competencies and authentic performance, and emerging work on GenAI in education.
The literature was identified primarily through Google Scholar and selected purposively in line with the conceptual aims of the paper. The search was broad in thematic scope rather than exhaustive in corpus coverage. Indicative search terms included combinations such as systems thinking education, systems thinking in professional education, system dynamics education, soft systems methodology, critical systems thinking, practice-based learning, authentic assessment, systems thinking assessment, causal loop diagrams, systems thinking competencies, generative AI in education, and AI governance in education. These terms were used heuristically to identify seminal, theoretically relevant, and analytically productive contributions across the four bodies of scholarship informing the framework, rather than to generate an exhaustive review corpus.
Selection was guided by conceptual relevance to the educational design problem, theoretical centrality within the field, and usefulness for linking systems concepts to curriculum, assessment, and governance. This is consistent with methodological guidance on conceptual papers, which emphasises the need to explicate and justify the choice of theories and their role in the analysis [18]. The purpose was not exhaustive coverage or formal evidence synthesis, but the construction of a coherent conceptual framework. Accordingly, the methodological boundaries of the synthesis are explicit: it is selective rather than exhaustive, oriented toward framework construction rather than corpus mapping, and intended to integrate seminal and analytically productive contributions rather than reproduce the procedures of a systematic review.
Framework development proceeded through an iterative process of comparative conceptual mapping. The analysis first clarified the focal design problems in systems thinking education that recur across the literature, especially fragmentation between tools and learning progression, weak alignment between situated performance and assessment, and the under-specified integration of GenAI into the teaching and learning of systems thinking. Second, core concepts from the selected systems traditions were compared across traditions in terms of their educational and professional relevance, with particular attention to recurring constructs such as feedback, causality, delays, boundary judgements, multiple perspectives, leverage, and reflective revision. Third, these overlapping and complementary constructs were grouped according to the kind of learning work they imply, which led to the formulation of four iterative practices: sensemaking and boundary setting, co-modelling and causal representation, intervention reasoning, and meta-learning. Fourth, the emerging architecture was checked for cross-level coherence by examining whether these practices could be translated into professional capabilities, curricular sequences, artefacts, assessment criteria, and governance implications for GenAI without losing conceptual alignment. Finally, the framework was refined by evaluating its internal coherence, practical interpretability for educators, and empirical tractability through measurable constructs and observable artefacts.
Within this synthesis, the systems traditions were prioritised for conceptual centrality in defining the architecture itself, while the literature on professional education, assessment, and GenAI was used to translate that architecture into curricular, evaluative, and governance implications. The resulting framework is assessed in terms of internal coherence, cross-level integration, practical interpretability for educators, and empirical tractability through measurable constructs and observable artefacts.

3. Fundamentals of Systems Thinking for Professional Practice

Systems thinking is best understood as a family of traditions united by a shift in attention from isolated parts and linear cause-effect chains to relationships, feedback, temporality, boundaries, and multiple perspectives [19]. Professional practice benefits from drawing on three complementary traditions.

3.1. System Dynamics: Feedback, Accumulation, and Delays

System dynamics formalised the idea that system behaviour is generated endogenously by structure, particularly by feedback loops, accumulations (stocks), and delays [1]. The professional implication is practical: interventions must be judged not only by intended direct effects but also by feedback-mediated side effects and time-lagged responses. Meadows [3] made this reasoning widely accessible and introduced leverage points as a way to connect modelling to action.
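To make the endogenous-structure argument tangible for readers less familiar with system dynamics notation, the following minimal sketch (in Python, with purely illustrative parameter values not drawn from the cited sources) simulates a single stock adjusted toward a goal through a decision rule that reacts to a delayed perception of that stock. The delayed feedback, not any external shock, is what produces overshoot and oscillation rather than smooth convergence.

```python
# Minimal stock-and-flow sketch: one stock adjusted toward a goal through a
# corrective flow that reacts to a delayed perception of the stock.
# All parameter values are illustrative only.

def simulate(goal=100.0, stock=40.0, adjustment_time=4.0,
             delay=6.0, dt=0.25, horizon=60.0):
    """Euler integration of a goal-seeking stock with a first-order perception delay."""
    perceived = stock
    trajectory = []
    for step in range(int(horizon / dt)):
        inflow = (goal - perceived) / adjustment_time   # decision rule sees the delayed state
        stock += inflow * dt                            # accumulation: the stock integrates its flow
        perceived += (stock - perceived) / delay * dt   # first-order information delay
        trajectory.append((step * dt, stock))
    return trajectory

if __name__ == "__main__":
    for t, s in simulate()[::20]:
        print(f"t={t:5.1f}  stock={s:6.1f}")  # overshoots the goal, then oscillates and settles
```

Shortening the perception delay in the sketch damps the oscillation, which is precisely the kind of feedback-mediated, time-lagged behaviour that professionals are asked to anticipate when judging interventions.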

3.2. Soft Systems and Problem Structuring: Pluralism and Learning

Soft Systems Methodology and related problem-structuring approaches emphasise that many professional problems are “messy” and socially constructed: stakeholders disagree on goals, values, and what counts as a solution [4]. In professional education, this tradition motivates curricula that treat stakeholder engagement and framing as core learning outcomes rather than as optional preambles to technical analysis.

3.3. Critical Systems Thinking: Boundary Critique and Power

Critical systems thinking foregrounds the ethical and political dimensions of system definition. Ulrich’s [5] boundary critique argues that system boundaries are value-laden judgements about whose interests count and what evidence is legitimate. In professional practice, this is essential for avoiding models that are technically elegant but socially harmful or infeasible.

3.4. Core Concepts and Their Professional Meaning

The following concepts recur across systems traditions and are foundational for professional systems practice. To make the discussion operational for professional education, they are synthesised here in terms of their professional meaning. Rather than treating these concepts as abstract theoretical categories, the aim is to clarify what each one enables professionals to do in practice, especially when they must diagnose situations, anticipate consequences, and justify interventions in complex settings.
Accordingly, Table 1 summarises the core concepts and links each of them to its practical relevance for professional reasoning and action. The table is intended as a bridge between the theoretical foundations presented above and the learning architecture developed in the following sections, where these concepts are translated into capabilities, curricular design principles, and assessment implications.
Taken together, these traditions do more than provide background context; they directly inform the architecture proposed in the following sections. System dynamics contributes to the framework’s emphasis on causal representation, feedback, delays, and leverage-oriented intervention reasoning. Soft systems and related problem-structuring traditions inform the emphasis on sensemaking and boundary setting, stakeholder engagement, and the recognition that professional problems are often contested rather than technically given. Critical systems thinking contributes to the framework’s concern with boundary critique, inclusion, power, and the ethical defensibility of system definitions and proposed interventions.

4. Conceptual Model for Learning Systems Thinking in the Professions

Building on the three systems traditions outlined above, we propose a learning architecture organised as an iterative cycle of four practices. The architecture translates the theoretical contributions of system dynamics, soft systems, and critical systems thinking into an educational model for professional formation. In this sense, the cycle mirrors how professionals work with complex systems: understanding evolves through repeated interaction with evidence, stakeholders, values, and outcomes.

4.1. Components

The proposed learning architecture is organised around four iterative practices through which systems thinking is developed in professional education. The first is sensemaking and boundary setting, understood as the work of defining the system of interest, clarifying purpose, identifying relevant stakeholders, and making boundary judgements explicit [4,5]. This practice is foundational because it frames what is considered relevant in the analysis and what remains outside the immediate scope, thereby making assumptions visible from the outset.
The second practice is co-modelling and causal representation, through which learners develop shared representations (e.g., CLDs or qualitative system maps) that render causal assumptions, feedback structures, delays, and emergent patterns over time explicit [1]. Here, modelling is not merely descriptive. It functions as a medium for collective reasoning, comparison, and dialogue about how the system behaves over time. Empirical work suggests that diagrammatic representations can shape professionals’ information use and systems reasoning when they are presented and used appropriately [20].
The third practice is intervention reasoning, in which learners use these representations to move from explanation to action by identifying leverage points, exploring scenarios, and weighing feasibility constraints alongside ethical trade-offs [3]. This practice links systems understanding to professional judgement, since learners are required not only to propose possible interventions but also to justify them in relation to system dynamics, practical constraints, and responsibility for consequences.
The fourth practice is meta-learning, which involves reflecting on model limits, uncertainty, and professional responsibility while revising boundaries and causal assumptions in light of feedback. In this sense, meta-learning reinforces the iterative character of the architecture by connecting reflection to model revision and renewed inquiry. It also supports transfer and adaptive expertise by helping learners become more aware of how they frame problems, how they reason through complexity, and how their judgements change over time.

4.2. Relations and Iteration

The four practices described above should not be understood as a fixed linear sequence. Although they can be introduced pedagogically in a staged manner, their value in professional education lies in their iterative and relational character. In practice, learners move back and forth between framing the situation, representing causal structures, reasoning about interventions, and revising assumptions in light of uncertainty and feedback. Progression is therefore not simply cumulative but recursive, as each practice can prompt reconsideration of the others.
Within this architecture, sensemaking and boundary setting shape what is included in the analysis and what counts as a relevant problem, stakeholder, or constraint. These boundary judgements then condition co-modelling and causal representation, because the structure of the model depends on how the system of interest has been defined. In turn, intervention reasoning builds on these representations by connecting causal understanding to possible actions, leverage points, feasibility constraints, and ethical trade-offs. At the same time, intervention reasoning feeds back into earlier stages, since attempts to design action often reveal missing actors, overlooked variables, or overly narrow boundaries.
Meta-learning operates across the entire cycle by supporting reflection on model limits, uncertainty, and professional responsibility while prompting revision of both boundaries and causal assumptions. In the broader framework developed in this paper, moreover, this iterative cycle does not unfold in a neutral instructional environment. It increasingly operates in educational settings shaped by GenAI, which can support, accelerate, or distort each practice depending on how its use is structured and governed. Figure 1 therefore summarises the framework as an iterative cycle of four learning practices while also showing GenAI as a cross-cutting layer of support, risk, and governance rather than as a separate fifth stage. In this sense, iteration is not an optional extension of the process but a constitutive feature of systems thinking as professional practice, and GenAI forms part of the conditions under which that practice is now learned, enacted, and assessed.

4.3. Translating Systems Concepts into Professional Capabilities

The proposed architecture becomes educationally meaningful only when its concepts and practices are translated into capabilities relevant to professional formation. In this context, “capabilities” refer to integrated forms of knowing and doing: not only understanding systems concepts, but also applying them in situated tasks, justifying decisions, and revising reasoning in light of feedback. This translation is particularly important in professional education, where systems thinking must inform diagnosis, intervention, and responsible action under real-world constraints.
To make this translation explicit, Table 2 links the core systems concepts and iterative practices to corresponding professional capabilities. The table clarifies what learners should be able to do when systems thinking is approached as a professional competence and serves as a bridge to the curriculum design and assessment implications developed in the following sections.

4.4. Illustrative Example of Framework Enactment

To make the framework more concrete, consider a health-professions module focused on emergency department overcrowding and delayed discharge. In the sensemaking and boundary setting phase, students define the system of interest, identify key stakeholders (patients, nurses, physicians, hospital managers, and community care providers), and justify what is included or excluded from the problem framing. In co-modelling and causal representation, they develop a CLD showing how bed availability, discharge delays, staff workload, waiting times, and readmission risk interact over time. In intervention reasoning, they compare possible responses, such as changes in discharge coordination, triage redesign, or community follow-up, and justify these options in terms of leverage, feasibility, unintended consequences, and ethical trade-offs. In meta-learning, they review how their framing shaped the model, identify uncertainties and missing perspectives, and revise their analysis in light of peer, instructor, or stakeholder feedback.
GenAI may be used to generate alternative stakeholder perspectives, challenge the plausibility of causal links, or prompt counterfactual scenarios. However, students must document what the tool contributed, what they accepted or rejected, where evidence remains weak, and how stakeholder feedback altered the final model and intervention brief. In this way, the framework becomes visible not only as a conceptual architecture but as a sequence of learnable practices, artefacts, and evaluable decisions within an authentic professional task.
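As a complementary, purely illustrative sketch, the emergency department example can be encoded as a signed digraph so that loop polarity and GenAI contributions become explicit, reviewable objects. The variables, link polarities, and the single GenAI annotation below are assumptions made for teaching purposes, not a validated model of any specific hospital.

```python
# Illustrative encoding of the emergency-department CLD described above.
# Variables, polarities, and the GenAI note are teaching assumptions, not a validated model.

cld_links = {
    # (cause, effect): polarity (+1 = same-direction change, -1 = opposite-direction change)
    ("bed_availability", "waiting_times"): -1,
    ("waiting_times", "staff_workload"): +1,
    ("staff_workload", "discharge_delays"): +1,
    ("discharge_delays", "bed_availability"): -1,
    ("discharge_delays", "readmission_risk"): +1,
    ("readmission_risk", "waiting_times"): +1,
}

def loop_polarity(variables):
    """Classify a closed loop: reinforcing if the product of link polarities is positive."""
    product = 1
    for cause, effect in zip(variables, variables[1:] + variables[:1]):
        product *= cld_links[(cause, effect)]
    return "reinforcing" if product > 0 else "balancing"

core_loop = ["bed_availability", "waiting_times", "staff_workload", "discharge_delays"]
print(loop_polarity(core_loop))  # (-1)(+1)(+1)(-1) = +1 -> "reinforcing"

# Students annotate what GenAI contributed and what they accepted or rejected.
genai_note = {
    "suggested_link": ("community_followup", "readmission_risk"),
    "suggested_polarity": -1,
    "decision": "accepted",
    "evidence": "consistent with discharge-planning sources reviewed in class; strength uncertain",
}
```

Encodings of this kind also make the documentation requirement above concrete: each suggested link carries a decision and an evidence note that can later be reviewed and assessed.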

4.5. Empirical Anchors

Although this paper is conceptual, recent empirical and design-based studies illustrate that these capabilities can be cultivated through intentional pedagogical design. Examples include systems-thinking-informed curriculum design in health professions education [6], simulation-based nursing education supporting systems thinking development through reflection [21], design-based approaches to integrating systems thinking in sustainable chemistry teacher education with explicit rubrics [15], and quasi-experimental work combining systems thinking with digital tools in teacher education [22]. Taken together, these studies reinforce the need for coherent learning journeys, iterative artefacts, and assessment aligned with authentic tasks.
The four-practice architecture therefore offers more than a conceptual description of how systems thinking is learned. Because the practices are expressed through identifiable capabilities, artefacts, and forms of judgement, they also provide a basis for curriculum design, assessment, and GenAI governance. The next sections translate these implications into curricular, evaluative, and governance terms.

5. Curriculum Design Implications: Outcomes, Sequences, Authentic Tasks, and Learning Journeys

The four-practice architecture implies a curricular logic in which learners revisit these practices across time, contexts, and increasing levels of complexity. The following subsections outline indicative outcomes, sequences, authentic tasks, and learning journeys for systems thinking education.

5.1. Learning Outcomes Aligned to the Model

A curriculum for professional systems thinking should articulate learning outcomes that map onto observable performance in authentic or profession-relevant tasks. In this framework, expected outcomes extend beyond conceptual familiarity with systems language and are expressed as demonstrable competencies. First, learners should develop boundary and stakeholder competence, understood as the ability to define a system boundary, justify inclusions and exclusions, and represent stakeholder perspectives and power dynamics [4,5]. Second, they should demonstrate causal modelling competence by producing and explaining causal representations that incorporate feedback and delays and that support professional communication [1,20]. Third, the curriculum should foster intervention competence, enabling learners to identify leverage points, anticipate unintended consequences, and justify interventions under professional constraints [3]. Finally, learners should develop reflective and ethical competence, expressed in the ability to critique model limits, represent uncertainty, and articulate the ethical implications of interventions.

5.2. Sequencing a Learning Journey

Longitudinal progression requires repeated cycles with increasing authenticity, stakeholder complexity, and evaluative rigour. Rather than treating systems thinking as a single-course outcome, the proposed model frames development as a learning journey in which learners revisit the core practices across stages, each time with more demanding tasks, richer constraints, and stronger expectations for justification and reflection.
Several recent professional-education studies align with this logic. For example, Khanna et al. [6] argue for curricula designed as complex adaptive systems that support learners’ journeys; Vuorio et al. [15] describe structured sequences (pre-task, modelling assignment, assessment rubric) in teacher education; and Liou [21] highlights repeated simulation scenarios paired with reflection as a mechanism for developing systems thinking. To make this progression explicit in curricular terms, Table 3 outlines an indicative learning sequence from novice to adaptive practitioner, showing how task design, stakeholder complexity, and evaluative expectations can be progressively intensified.

5.3. Authentic Tasks and Conditions for Learning

Authentic tasks for systems thinking in professional education should resemble the artefacts and decision practices that learners are expected to use in real professional contexts. Rather than focusing only on conceptual recall, task design should require learners to produce and justify outputs that make their reasoning visible. This may include, for example, a stakeholder ecosystem report that identifies who matters, why, and how power shapes outcomes; a CLD portfolio with annotated assumptions, evidence notes, and revision history; a scenario and intervention brief linking model structure to action, unintended consequences, and trade-offs; and a reflective critique that articulates uncertainty, ethical implications, and learning from feedback. Taken together, these tasks position systems thinking as a form of professional reasoning-in-action and create opportunities to assess not only what learners know, but how they frame, model, intervene, and reflect.
Progression in these tasks depends on learning conditions that support iteration, comparison, and transfer. In particular, learners need repeated opportunities to revise their work through feedback loops, including multiple drafts and critique from peers, instructors, and, where feasible, stakeholders. Comparative modelling is also important, as requiring at least two plausible boundary framings can reduce premature closure and strengthen justificatory reasoning. When possible, stakeholder contact through participatory sessions, interviews, or structured role-play further increases authenticity and exposes learners to competing perspectives. Finally, cross-context transfer should be intentionally designed through repeated application across cases and domains, so that systems thinking develops as an adaptive professional capability rather than as a case-specific technique.
The architecture proposed here is intended to be transferable across professional-education contexts at the level of design logic, not as a uniform implementation template. Its practical enactment will depend on disciplinary epistemologies, programme structures, institutional resources, stakeholder access, and local cultures of assessment and participation. For this reason, the framework should be adapted and validated in context rather than applied as if the same sequence, artefacts, or governance routines would function identically across all professional fields.

6. Criteria, Expected Evidence, and Measurement Options

The proposed architecture becomes educationally meaningful through observable artefacts, judgements, and revisions. Assessment must therefore focus not only on what learners know about systems concepts, but also on how they frame problems, represent causality, justify interventions, and revise their reasoning under uncertainty. The assessment blueprint below aligns performance criteria with the kinds of evidence through which the four practices become visible in situated professional tasks, explicitly combining self-report and performative metrics.

6.1. Principles for Assessment in Systems Thinking

Assessment in systems thinking should be guided by principles that align evaluation with the situated and performative nature of professional reasoning. First, assessment should focus on situated performance, examining how learners reason through authentic tasks that resemble professional practice rather than relying only on decontextualised tests of conceptual recall. In addition, robust assessment requires triangulation, combining self-report measures, performance artefacts, and process data (such as revision trails or interaction logs) in order to reduce false positives and obtain a more valid picture of competence development.
At the same time, evaluation should prioritise causal quality over diagram aesthetics, rewarding coherent causal explanations, awareness of feedback structures and delays, and the ability to reason through trade-offs, rather than the visual polish of representations alone. Finally, in an artificial-intelligence-rich environment, assessment must be process-sensitive. Because digital tools can improve final outputs without necessarily improving underlying competence, it becomes essential to evaluate drafts, justifications, and decision processes in addition to final products [14].

6.2. Performance Criteria and Evidence

To make the assessment principles operational, they must be translated into explicit performance criteria and corresponding forms of evidence. In systems thinking, this is especially important because competence is expressed through situated reasoning and artefact production, not only through declarative knowledge. As a result, assessment design should specify what counts as quality performance and what kinds of evidence can validly support evaluative judgements across tasks and stages of progression.
Accordingly, Table 4 summarises indicative performance criteria and associated evidence for assessing systems thinking in professional education. The table is intended to support alignment between learning outcomes, authentic tasks, and assessment decisions by clarifying what evaluators should look for in learners’ products, justifications, and iterative work processes. The following two subsections provide supporting elaborations through an illustrative rubric and links to existing instruments.

6.3. Illustrative Rubric Template for Causal Loop Diagrams

Rubrics operationalise quality and support more consistent and reliable scoring in systems thinking assessment. In the specific case of CLDs, a rubric is useful not only for evaluating the final artefact but also for making explicit the dimensions of reasoning that matter pedagogically, such as causal coherence, feedback awareness, the representation of delays, and the quality of explanatory justification.
Table 5 presents a rubric template that translates these core reasoning demands into observable criteria and performance descriptors. The dimensions consolidate (i) recent rubric-development work for evaluating systems thinking in causal modelling contexts [23] and (ii) measurement research comparing self-report and performative metrics in learners’ responses to building causal loop models [9]. Rather than functioning as a fixed scoring scheme, Table 5 is designed as an adaptable scaffold: instructors can tailor weights, exemplars, and proficiency thresholds to course level and task complexity, while using the same dimensions to align expectations, structure formative feedback, and support transparent evaluation across learning stages.
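For instructors who prefer to manage such rubrics digitally, the following sketch shows one way the dimensions discussed here could be encoded and scored. The dimension labels follow the reasoning demands named above, while the weights, level descriptors, and three-level scale are illustrative placeholders rather than the published instruments in [9,23].

```python
# Sketch of an adaptable CLD rubric in the spirit of Table 5.
# Weights and level descriptors are illustrative placeholders, not a published instrument.

RUBRIC = {
    "causal_coherence": {
        "weight": 0.30, "levels": ["links unjustified", "mostly plausible", "evidenced and justified"]},
    "feedback_awareness": {
        "weight": 0.25, "levels": ["no loops identified", "loops named", "loops explained and classified"]},
    "delays_represented": {
        "weight": 0.15, "levels": ["absent", "mentioned", "located and interpreted"]},
    "explanatory_justification": {
        "weight": 0.30, "levels": ["assertion only", "partial argument", "explicit assumptions and evidence"]},
}

def score(ratings: dict[str, int]) -> float:
    """Weighted score in [0, 1]; ratings map each dimension to a level index (0-based)."""
    total = 0.0
    for dimension, spec in RUBRIC.items():
        total += spec["weight"] * ratings[dimension] / (len(spec["levels"]) - 1)
    return round(total, 2)

print(score({"causal_coherence": 2, "feedback_awareness": 1,
             "delays_represented": 1, "explanatory_justification": 2}))  # 0.80
```

Because the weights and thresholds sit in a single structure, instructors can tune them per course level while keeping the same dimensions for feedback and calibration across markers.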

6.4. Supplementary Links to Validated Instruments and Measurable Constructs

To strengthen empirical grounding, even in a conceptual paper, future studies should connect proposed constructs to established instruments and measurable indicators. In particular, research on systems thinking disposition can draw on self-report measures such as the Systems Thinking Scale [8], while systems thinking performance can be assessed through performative metrics based on causal model building and evaluation [9], rubric-based scoring of feedback reasoning [23], and domain-specific modelling tasks. This distinction is important because it preserves the conceptual difference between reported orientation toward systems thinking and demonstrated competence in situated performance.
In addition, future empirical work can examine critical thinking in learning interactions using the Practical Inquiry/cognitive presence framework [24], which has been used in recent research on GenAI and critical thinking [25]. For the study of appropriate reliance on artificial intelligence (AI), researchers may incorporate behavioural indicators such as the Appropriateness of Reliance proposed by Schemmer et al. [26], as well as experimental designs using cognitive forcing functions to reduce overreliance [12]. Finally, learning analytics for process evidence can be incorporated through revision histories and interaction traces, consistent with learning analytics perspectives for large language models [27]. Taken together, these options provide a concrete basis for translating the paper’s conceptual propositions into testable designs with validated or at least clearly operationalised measures.
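To illustrate how behavioural reliance indicators of this kind can be computed from logged decisions, the sketch below compares a learner's initial answer, the AI advice, and the final answer against a known reference. The two ratios are offered in the spirit of appropriateness-of-reliance measures rather than as the exact formulation of Schemmer et al. [26].

```python
# Illustrative reliance indicators computed from logged decision tuples.
# A hedged sketch in the spirit of appropriateness-of-reliance measures, not the exact metric of [26].

def reliance_indicators(trials):
    """Each trial: dict with keys 'initial', 'advice', 'final', 'truth' (comparable labels)."""
    switched_to_correct = eligible_switch = kept_correct = eligible_keep = 0
    for t in trials:
        if t["initial"] != t["truth"] and t["advice"] == t["truth"]:
            eligible_switch += 1                              # AI was right, learner started wrong
            switched_to_correct += t["final"] == t["truth"]
        if t["initial"] == t["truth"] and t["advice"] != t["truth"]:
            eligible_keep += 1                                # AI was wrong, learner started right
            kept_correct += t["final"] == t["truth"]
    return {
        "relative_ai_reliance": switched_to_correct / eligible_switch if eligible_switch else None,
        "relative_self_reliance": kept_correct / eligible_keep if eligible_keep else None,
    }

trials = [
    {"initial": "B", "advice": "A", "final": "A", "truth": "A"},  # corrected by good advice
    {"initial": "A", "advice": "B", "final": "B", "truth": "A"},  # overreliance on bad advice
    {"initial": "A", "advice": "B", "final": "A", "truth": "A"},  # appropriate self-reliance
]
print(reliance_indicators(trials))  # {'relative_ai_reliance': 1.0, 'relative_self_reliance': 0.5}
```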

7. Generative Artificial Intelligence as Support and Risk in Systems Thinking Education

In the proposed architecture, GenAI can support or distort each of the four practices and must therefore be treated as a pedagogical condition rather than as an external add-on. The section that follows examines how GenAI operates across sensemaking and boundary setting, co-modelling and causal representation, intervention reasoning, and meta-learning, first by identifying productive roles, then by specifying predictable failure modes, and finally by outlining the governance mechanisms required for responsible integration.

7.1. Productive Roles

In the proposed framework, GenAI can play productive roles when it is used to support, rather than substitute for, the four core practices involved in learning systems thinking. In sensemaking and boundary setting, generative tools can help learners surface alternative framings, identify overlooked stakeholders, and generate contrasting narratives that broaden problem definition and reduce premature closure [10]. In co-modelling and causal representation, they can prompt learners to check polarity, identify missing feedback loops or delays, compare alternative explanations, and improve the clarity of causal accounts. More broadly, GenAI can support iterative critique and feedback when it is used with appropriate pedagogical scaffolding [10].
In intervention reasoning, generative tools can support scenario-based comparison and the exploration of alternative courses of action, thereby broadening the analysis of trade-offs, leverage points, and possible consequences [10]. In meta-learning, they can contribute to reflective prompting and guided metacognitive support, especially when learners engage with them collaboratively rather than passively [25]. Under these conditions, GenAI functions less as a source of ready-made answers than as a structured prompt for inquiry, comparison, and revision. Its educational value therefore lies not in replacing systems reasoning, but in supporting iterative learning while preserving human oversight and reducing uncritical acceptance [10,12].

7.2. Predictable Failure Modes

In GenAI-mediated systems thinking education, several failure modes are sufficiently recurrent to be treated as predictable rather than exceptional. Within the proposed architecture, these risks matter because they can distort each of the four iterative practices rather than affect only one isolated stage of learning. A first risk is plausible fabrication and spurious causality: generative models can produce fluent but incorrect statements and invent links that appear coherent at the surface level, a phenomenon widely documented as hallucination in natural language generation and large language models [13]. In systems thinking education, this is particularly problematic because learners may adopt causal explanations that are rhetorically persuasive but analytically unfounded, thereby weakening co-modelling and causal representation.
A second recurring failure mode is overreliance and automation bias. Research in human factors has consistently shown that people may accept automated advice even when it is wrong [28,29]. More recent experiments indicate that cognitive forcing functions can reduce overreliance in AI-assisted decision making [12], and newer work has proposed measurable metrics for assessing appropriate reliance [26]. In educational contexts, this risk can appear when learners accept GenAI-generated causal narratives, stakeholder framings, or intervention suggestions without verification or critical comparison. In this sense, overreliance may narrow sensemaking and boundary setting, weaken intervention reasoning, and reduce the reflective work expected in meta-learning.
Related to this, a third failure mode concerns cognitive offloading and false mastery. Recent research suggests that some patterns of GenAI use may reduce cognitive effort while creating an illusion of competence. For example, studies operationalising cognitive offloading report negative relationships between heavy GenAI tool use and critical thinking [30], while mixed-methods research indicates that critical thinking benefits depend on whether students use tools passively or collaboratively with guidance [25]. A preprint combining neural and behavioural measures further suggests that sustained use of large language models for essay writing may reduce independent engagement over repeated sessions (“cognitive debt”; [31]). In a similar vein, the OECD [14] highlights false mastery as a systemic assessment risk in GenAI-rich environments. For systems thinking education, the concern is not only that learners produce stronger-looking outputs, but that these outputs may mask weak causal understanding, limited reflective judgement, and shallow revision practices.
A fourth risk is what may be termed complexity collapse, in which generative tools compress complex situations into tidy narratives that omit delays, feedback interactions, and stakeholder conflict. This tendency can be especially damaging in systems thinking education because it undermines the very features—nonlinearity, interdependence, and contested boundaries—that learners are expected to recognise and reason through. Finally, there is the risk of bias and marginalisation of perspectives. Critiques of foundation models have shown that training data and design choices can reproduce societal biases and obscure accountability [32,33]. In pedagogical settings, this may narrow the range of perspectives considered during problem framing and weaken the quality of stakeholder analysis, particularly when GenAI outputs are treated as neutral or comprehensive representations of the system.
Taken together, these failure modes show why GenAI cannot be treated simply as a productivity aid within systems thinking education. Because it can distort framing, modelling, intervention reasoning, and reflective revision, its educational use requires explicit governance conditions. The following subsection therefore turns to the mechanisms needed to keep GenAI aligned with the pedagogical aims of the proposed architecture.

7.3. Governance Mechanisms for Responsible Integration

A governance approach treats GenAI as a bounded partner within a learning system, useful for iteration and critique, but constrained by documentation, uncertainty work, and stakeholder validation. Within the proposed architecture, governance is best understood as a design condition for keeping GenAI use aligned with the pedagogical aims of the model. The mechanisms outlined below should therefore be read as analytically grounded design implications: some are directly supported by existing scholarship on AI oversight, overreliance, and process-based assessment, while others are proposed here as framework-consistent strategies that require further empirical testing.
Within this framework, governance mechanisms are intended to make GenAI-supported work visible, reviewable, and contestable across the learning process. In sensemaking and boundary setting, this includes traceability routines that document where GenAI contributed to framing, stakeholder identification, or problem definition, together with stakeholder validation and bias-reflection prompts that can reduce premature closure and exclusion. In co-modelling and causal representation, governance may include evidence requirements, uncertainty checks, and model-justification routines so that generated causal claims are not accepted without scrutiny. In intervention reasoning, governance may also involve explicit assumptions, counterfactual comparison, and justification before acceptance, thereby reducing overconfidence in plausible but weak intervention narratives. In meta-learning, governance can be supported through process-based assessment, revision logs, confidence notes, and explicit reflection on what was accepted, rejected, or revised in interaction with GenAI.
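A minimal sketch of the kind of traceability record these routines imply is shown below; the field names and the example entry are assumptions consistent with the mechanisms just described, not a prescribed institutional standard.

```python
# Minimal sketch of a GenAI traceability record kept per learning practice.
# Field names are assumptions consistent with the mechanisms above, not a prescribed standard.

from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class GenAITraceEntry:
    practice: str            # "sensemaking", "co-modelling", "intervention", or "meta-learning"
    prompt_summary: str      # what was asked of the tool
    output_summary: str      # what the tool returned
    decision: str            # "accepted", "revised", or "rejected"
    justification: str       # evidence or reasoning behind the decision
    uncertainty_note: str    # where the claim remains weak or unverified
    stakeholder_check: bool = False   # validated with a stakeholder or peer?
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

revision_log: list[GenAITraceEntry] = []
revision_log.append(GenAITraceEntry(
    practice="co-modelling",
    prompt_summary="Asked for feedback loops missing from the draft diagram",
    output_summary="Suggested a link from readmission risk back to waiting times",
    decision="revised",
    justification="Kept the link but weakened its claimed strength; no local data available",
    uncertainty_note="Polarity plausible, magnitude unknown",
    stakeholder_check=True,
))
print(asdict(revision_log[0])["decision"])  # entries remain reviewable for process-based assessment
```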
These mechanisms align with broader principles of trustworthy and human-centred AI governance while remaining internal to the pedagogical aims of the framework. Traceability supports accountability and reviewability [16,17]. Cognitive forcing functions can reduce uncritical acceptance by requiring learners to reason before adopting recommendations [12]. Process-based assessment helps counter false mastery by evaluating drafts, revision rationales, evidence notes, and the integration of stakeholder feedback rather than relying on polished outputs alone [14]. Finally, AI literacy may be treated as a curricular requirement rather than an optional add-on. Institutions increasingly treat AI literacy as essential, and policy developments such as the European Union’s AI Act include obligations to promote AI literacy in relevant contexts [34].
To synthesise these mechanisms at the instructional level, Table 6 maps productive GenAI roles, predictable risks, and governance responses directly onto the four-practice architecture.
The interaction between the four practices and GenAI can be illustrated through a brief hypothetical example. In sensemaking and boundary setting, learners analysing emergency department overcrowding might use GenAI to generate alternative framings of the problem—for example, as a triage issue, a discharge-coordination issue, or a community-care bottleneck—and then compare these framings against stakeholder evidence. In co-modelling and causal representation, students building a CLD could prompt GenAI to suggest missing feedback loops or delays, but would need to justify whether those additions are evidentially defensible. In intervention reasoning, learners could use GenAI to compare possible interventions, such as redesigning discharge protocols or expanding community follow-up, and then examine trade-offs, feasibility constraints, and unintended consequences. In meta-learning, students might use GenAI to prompt reflective comparison between an initial and revised model, identify where their confidence exceeded the available evidence, and document how peer or stakeholder feedback changed their reasoning. In each case, the educational value of GenAI depends not on automated output alone, but on the quality of the learner’s evaluation, revision, and justification.

8. Testable Propositions and Future Research Agenda

To support future empirical development, this section outlines an initial set of propositions and related methodological directions derived from the proposed architecture. The aim is to indicate how the framework may be translated into testable lines of inquiry rather than to define a complete empirical programme.

8.1. Testable Propositions

The propositions that follow should be read as conceptually derived and variably grounded in existing scholarship. Some extend findings already suggested in the adjacent literature, whereas others represent interpretive extrapolations from the proposed architecture and are offered as hypotheses for future testing rather than as established claims.
A first set of propositions concerns the effects of curricular design and iterative practice on capability development. The framework leads to the proposition that curricula requiring repeated cycles of boundary setting, modelling, intervention reasoning, and reflection are likely to support higher systems-thinking performance than curricula focused on one-off modelling activities (P1, capability gains through iteration). It also suggests that incorporating stakeholder validation checkpoints may improve the quality of boundary judgements and the feasibility of interventions in learner artefacts (P2, stakeholder validation improves robustness). In addition, when CLDs are presented and used as interpretive tools rather than as decorative outputs, their use may improve systems reasoning and information utilisation in complex professional cases (P3, diagram use improves information utilisation) [20].
A second group of propositions addresses the effects of GenAI under different governance conditions. The framework suggests that, when traceability and uncertainty routines are in place, GenAI support may improve causal model quality and explanatory depth compared with non-GenAI conditions (P4, GenAI improves modelling under governance). By contrast, without governance mechanisms, GenAI use may increase automation bias and reduce transfer to novel cases (P5, overreliance without governance reduces transfer). Relatedly, designs that require independent first drafts and justification before GenAI exposure may reduce overreliance while preserving productivity gains (P6, cognitive forcing reduces inappropriate reliance) [12].
A third set of propositions concerns measurement and learning interaction patterns. The framework suggests that systems-thinking self-report measures may not fully predict modelling performance, and that this divergence may be moderated by task authenticity and feedback intensity (P7, self-report diverges from performance) [8,9]. It also proposes that passive GenAI-directed use may correlate with lower critical-thinking indicators than collaborative, guided use (P8, mode of GenAI interaction predicts critical thinking) [25]. Taken together, these propositions are intended to provide an initial basis for testing how the proposed architecture operates under different instructional, assessment, and governance conditions, rather than to exhaust the full range of possible empirical questions.
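As an indication of how such propositions could be operationalised, the sketch below outlines one possible analysis for P7, assuming a hypothetical dataset in which each learner has a self-report score (for instance, a Systems Thinking Scale total), a rubric-scored modelling performance, and a binary task-authenticity condition. The data and column names are placeholders, not empirical results.

```python
# Sketch of one way to examine P7 (self-report vs. performance divergence,
# moderated by task authenticity). All data below are hypothetical placeholders.

import pandas as pd
import statsmodels.formula.api as smf
from scipy.stats import pearsonr

df = pd.DataFrame({
    "self_report": [3.1, 3.8, 4.2, 2.9, 4.5, 3.6, 4.0, 3.3],          # e.g., scale means
    "performance": [0.42, 0.55, 0.61, 0.40, 0.80, 0.52, 0.70, 0.45],  # rubric-scored task
    "authentic":   [0, 0, 0, 0, 1, 1, 1, 1],                          # 0 = decontextualised, 1 = authentic
})

r, p = pearsonr(df["self_report"], df["performance"])
print(f"self-report vs. performance: r = {r:.2f}, p = {p:.3f}")

# The interaction term tests whether the self-report/performance relationship
# differs between authentic and decontextualised task conditions (moderation).
model = smf.ols("performance ~ self_report * authentic", data=df).fit()
print(model.params)
```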

8.2. Methodological Alignments and Future Research Directions

To move from conceptual propositions to empirical development, future research requires explicit alignment between hypotheses, measurable constructs, and feasible study designs. This is important because methodological choices shape what kinds of effects can be detected, how competence is interpreted, and whether observed outcomes reflect genuine learning gains, process differences, or artefact quality alone. In this context, the propositions advanced in Section 8.1 should be read as an initial empirical extension of the framework, while the present section outlines the methodological alignments and research directions needed to examine them in context.
Accordingly, Table 7 presents an indicative alignment between the propositions advanced in Section 8.1, the constructs to be measured, and suitable study designs for testing them. The table is intended to support analytical clarity by showing how the proposed architecture may be translated into empirically tractable research strategies across curricular, assessment, and governance conditions for GenAI.
Beyond this initial alignment, future research should prioritise studies that test the framework across contexts, over time, and under different governance conditions. A first priority is cross-professional comparative research examining the model in fields such as health, engineering, sustainability, justice, and telecommunications in order to identify domain-sensitive mechanisms and boundary conditions. A second priority is longitudinal evaluation of learning journeys, following learners across multiple courses to assess transfer, retention, and the development of adaptive expertise rather than short-term task performance alone.
A complementary line of work concerns governance mechanism experiments designed to isolate the contribution of traceability routines, uncertainty checks, and stakeholder validation checkpoints, both individually and in combination. In parallel, learning analytics and revision-trail research can use process data to distinguish productive from unproductive forms of GenAI use, thereby improving empirical identification of how support, overreliance, and revision behaviours shape learning outcomes [27]. A further priority is equity and inclusion research examining whether GenAI widens or narrows learning gaps and how participatory validation practices may help protect marginalised perspectives in systems framing and intervention design.
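A simple example of the kind of process indicator such analytics might compute from the traceability records sketched in Section 7.3 is shown below; the flagging threshold is an illustrative assumption rather than an evidence-based cut-off.

```python
# Sketch of a process indicator over GenAI traceability records (see Section 7.3):
# the share of GenAI suggestions accepted without any recorded justification.
# The 0.7 flag threshold is an illustrative assumption, not an evidence-based cut-off.

def unedited_acceptance_rate(entries):
    """entries: iterable of dicts with 'decision' and 'justification' keys."""
    uses = [e for e in entries if e["decision"] in {"accepted", "revised", "rejected"}]
    if not uses:
        return None
    unedited = [e for e in uses if e["decision"] == "accepted" and not e["justification"].strip()]
    return len(unedited) / len(uses)

log = [
    {"decision": "accepted", "justification": ""},
    {"decision": "accepted", "justification": "cross-checked against triage data"},
    {"decision": "revised",  "justification": "narrowed the claim to the night shift"},
    {"decision": "rejected", "justification": "no supporting evidence from stakeholders"},
]
rate = unedited_acceptance_rate(log)
print(f"unedited acceptance rate: {rate:.2f}")  # 0.25 in this toy log
if rate is not None and rate > 0.7:
    print("flag for instructor review: possible overreliance pattern")
```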
A longer-term extension of this agenda concerns the institutional conditions under which systems-thinking and governance practices related to GenAI become durable beyond individual courses. In this regard, future research could examine how organisational learning processes, leadership, and programme-level coordination shape the uptake and sustainability of these practices across courses and programmes. This line of inquiry extends the framework from course design toward institutional embedding without displacing its central focus on the proposed learning architecture.
Taken together, these methodological alignments and future directions are intended to guide cumulative empirical refinement of the framework rather than to define a complete research programme. Their purpose is to indicate where the proposed architecture is most in need of contextual testing, measurement development, and comparative validation.

9. Conclusions

Systems thinking is increasingly central to professional practice, yet professional education often teaches it as a fragmented set of tools. This paper contributes more than a synthesis of existing scholarship. It offers a design-oriented conceptual framework that reorganises systems thinking for professional education around four iterative practices, links those practices to capabilities, artefacts, curriculum pathways, and assessment criteria, and specifies how GenAI should be governed when the learning goal is causal reasoning, boundary critique, and reflective judgement. By also linking the framework to propositions, measurable constructs, and candidate instruments, the paper provides not only conceptual integration but a clearer basis for cumulative empirical testing.
At the same time, the argument developed here has important limitations that define the scope of its contribution and the next steps for research. Because the paper is conceptual, it prioritises integrative clarity over empirical specificity. The proposed architecture is intended to be transferable across professions, but its implementation will require domain-sensitive adaptation and validation in local curricular contexts. In addition, the rapid evolution of GenAI means that governance recommendations cannot be treated as fixed; they will need periodic revision as tools, institutional policies, and educational practices change. Finally, important measurement challenges remain, particularly in relation to causal reasoning, boundary critique, and reflective judgement, where robust instruments, scoring procedures, and evaluator calibration still require further development. These limitations do not weaken the framework’s value; rather, they underscore the need for cumulative empirical work capable of testing, refining, and contextualising the model across professional education settings.

Author Contributions

Conceptualization, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; methodology, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; validation, L.P.-R., K.A.-G. and E.R.-P.; formal analysis, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; investigation, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; resources, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; writing—original draft preparation, L.P.-R. and C.M.-F.; writing—review and editing, L.P.-R., K.A.-G. and E.R.-P.; visualization, L.P.-R., K.A.-G., E.R.-P. and C.M.-F.; supervision, L.P.-R.; project administration, L.P.-R.; funding acquisition, L.P.-R. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

No new data were created or analyzed in this study.

Acknowledgments

The authors would like to thank ANID for its support through Fondecyt projects 1220568 and 1260766. ANID has given permission to be named in this acknowledgement.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
GenAI: Generative Artificial Intelligence
AI: Artificial Intelligence
CLD: Causal Loop Diagram

References

1. Sterman, J.D. Business Dynamics: Systems Thinking and Modeling for a Complex World; Irwin/McGraw-Hill: Boston, MA, USA, 2000.
2. Senge, P. The Fifth Discipline: The Art and Practice of the Learning Organization; Doubleday/Currency: New York, NY, USA, 1990.
3. Meadows, D.H. Thinking in Systems: A Primer; Chelsea Green Publishing: White River Junction, VT, USA, 2008.
4. Checkland, P.B. Systems Thinking, Systems Practice; John Wiley & Sons: New York, NY, USA, 1981.
5. Ulrich, W. Critical Heuristics of Social Planning: A New Approach to Practical Philosophy; John Wiley & Sons: Chichester, UK, 1983.
6. Khanna, P.; Roberts, C.; Lane, A.S. Designing health professional education curricula using systems thinking perspectives. BMC Med. Educ. 2021, 21, 20.
7. Peretz, R. Integrating systems thinking into sustainability education: An overview with educator-focused guidance. Educ. Sci. 2025, 15, 1685.
8. Dolansky, M.A.; Moore, S.M.; Palmieri, P.A.; Singh, M.K. Development and validation of the Systems Thinking Scale. J. Gen. Intern. Med. 2020, 43, 436–447.
9. Frantz, C.M.; Blotner, J.; Petersen, J.E. How do we measure and increase systems thinking? Comparing self-reported and performative metrics in response to building causal loop models. Syst. Res. Behav. Sci. 2025, 1–21.
10. Kasneci, E.; Sessler, K.; Küchemann, S.; Bannert, M.; Dementieva, D.; Fischer, F.; Gasser, U.; Groh, G.; Günnemann, S.; Hüllermeier, E.; et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learn. Individ. Differ. 2023, 103, 102274.
11. Kinder, A.; Briese, F.J.; Jacobs, M.; Dern, N.; Glodny, N.; Jacobs, S.; Leßmann, S. Effects of adaptive feedback generated by a large language model: A case study in teacher education. Comput. Educ. Artif. Intell. 2025, 8, 100349.
12. Buçinca, Z.; Malaya, M.B.; Gajos, K.Z. To trust or to think: Cognitive forcing functions can reduce overreliance on AI in AI-assisted decision-making. In Proceedings of the ACM on Human-Computer Interaction; Association for Computing Machinery: New York, NY, USA, 2021; pp. 1–21.
13. Ji, Z.; Lee, N.; Frieske, R.; Yu, T.; Su, D.; Xu, Y.; Ishii, E.; Bang, Y.J.; Madotto, A.; Fung, P. Survey of hallucination in natural language generation. ACM Comput. Surv. 2023, 55, 248.
14. OECD. OECD Digital Education Outlook 2026: Exploring Effective Uses of Generative AI in Education; OECD Publishing: Paris, France, 2026.
15. Vuorio, E.; Pernaa, J.; Aksela, M. A pedagogical model for teaching systems thinking in a sustainable chemistry course: A design-based research approach. J. Chem. Educ. 2025, 102, 3878–3892.
16. National Institute of Standards and Technology. Artificial Intelligence Risk Management Framework (AI RMF 1.0); NIST AI 100-1; National Institute of Standards and Technology: Gaithersburg, MD, USA, 2023.
17. UNESCO. Guidance for Generative AI in Education and Research; UNESCO: Paris, France, 2023.
18. Jaakkola, E. Designing conceptual articles: Four approaches. AMS Rev. 2020, 10, 18–26.
19. Jackson, M.C. Critical Systems Thinking and the Management of Complexity: Responsible Leadership for a Complex World; Wiley: Oxford, UK, 2019.
20. Veldhuis, G.A.; Smits-Clijsen, E.M.; van Waas, R.P.M.; Hof, T.; Maccatrozzo, V.; Rouwette, E.A.J.A.; Kerstholt, J.H. The influence of causal loop diagrams on systems thinking and information utilization in complex problem-solving. Comput. Hum. Behav. Rep. 2025, 17, 100613.
21. Liou, C.-F. Determining the developmental pathway of system thinking in undergraduate nursing students using a simulation education program. Nurse Educ. Pract. 2025, 89, 104626.
22. Kurent, B.; Avsec, S. Systems thinking in the role of fostering technological and engineering literacy. Systems 2026, 14, 5.
23. Kastens, K.A.; Wakeland, W.; Shipley, T.F. Developing and Field-Testing a Rubric for Evaluating Students' Causal Loop Diagrams. 2024. Available online: https://proceedings.systemdynamics.org/2024/papers/P1354.pdf (accessed on 24 January 2026).
24. Garrison, D.R.; Anderson, T.; Archer, W. Critical inquiry in a text-based environment: Computer conferencing in higher education. Internet High. Educ. 1999, 2, 87–105.
25. Nasr, N.R.; Tu, C.-H.; Werner, J.; Bauer, T.; Yen, C.-J.; Sujo-Montes, L. Exploring the impact of generative AI ChatGPT on critical thinking in higher education: Passive AI-directed use or human–AI supported collaboration? Educ. Sci. 2025, 15, 1198.
26. Schemmer, M.; Bartos, A.; Spitzer, P.; Hemmer, P.; Kühl, N.; Liebschner, J.; Satzger, G. Towards effective human-AI decision-making: The role of human learning in appropriate reliance on AI advice. arXiv 2023, arXiv:2310.02108.
27. Mazzullo, E.; Bulut, O.; Wongvorachan, T.; Tan, B. Learning analytics in the era of large language models. Analytics 2023, 2, 877–898.
28. Lee, J.D.; See, K.A. Trust in automation: Designing for appropriate reliance. Hum. Factors 2004, 46, 50–80.
29. Parasuraman, R.; Riley, V. Humans and automation: Use, misuse, disuse, abuse. Hum. Factors 1997, 39, 230–253.
30. Gerlich, M. AI tools in society: Impacts on cognitive offloading and the future of critical thinking. Societies 2025, 15, 6.
31. Kosmyna, N.; Hauptmann, E.; Yuan, Y.T.; Situ, J.; Liao, X.-H.; Beresnitzky, A.V.; Braunstein, I.; Maes, P. Your brain on ChatGPT: Accumulation of cognitive debt when using an AI assistant for essay writing task. arXiv 2025, arXiv:2506.08872.
32. Bender, E.M.; Gebru, T.; McMillan-Major, A.; Shmitchell, S. On the dangers of stochastic parrots: Can language models be too big? In FAccT '21: Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency; Association for Computing Machinery: New York, NY, USA, 2021; pp. 610–623.
33. Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E. On the opportunities and risks of foundation models. arXiv 2021, arXiv:2108.07258.
34. European Parliament and Council of the European Union. Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 Laying Down Harmonised Rules on Artificial Intelligence and Amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and Directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (Artificial Intelligence Act); Official Journal of the European Union: Luxembourg, 2024.
Figure 1. Iterative learning cycle for professional systems thinking, with GenAI as a cross-cutting support/risk layer and governance conditions for responsible use.
Table 1. Core systems concepts and implications for professional practice.
Concept | Working Definition | Typical Representations | Professional Implication
Feedback | Causal influence that returns to affect its own drivers (reinforcing/balancing). | Causal loop diagrams (CLD); stock-flow models | Anticipate growth, tipping, oscillations; avoid one-shot "fixes."
Causality | Explanatory structure linking actions/conditions to outcomes. | Causal maps; narrative causal accounts | Move from correlation to mechanisms; justify interventions.
Delays | Time lags between cause and observable effect. | Delay marks in models; time-series reasoning | Prevent premature conclusions; anticipate short-term versus long-term trade-offs.
Emergence | System-level patterns not reducible to single components. | Behaviour-over-time graphs; scenario narratives | Expect unintended consequences; design for adaptation.
Boundaries | Judgements about what is inside/outside the system of concern. | Boundary statements; rich pictures; scope notes | Reveal value assumptions; avoid hidden exclusions.
Stakeholders | Actors and groups shaping and affected by system behaviour. | Stakeholder maps; influence-interest matrices | Improve legitimacy, feasibility, and equity of interventions.
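To make the interplay of feedback and delays in Table 1 concrete, the following minimal sketch simulates a single balancing loop in which corrective action responds to delayed information. It is illustrative only: the variable names and parameter values (TARGET, ADJUST_FRACTION, DELAY_STEPS) are hypothetical and are not drawn from any model discussed in this paper.

# Illustrative sketch (hypothetical parameters): a balancing feedback loop with
# a perception delay. Without the delay the stock approaches the target smoothly;
# with it, the stock overshoots and oscillates, illustrating why one-shot "fixes"
# in Table 1 are unreliable.

TARGET = 100.0          # desired level of the stock (e.g., staffing, inventory)
ADJUST_FRACTION = 0.5   # share of the perceived gap corrected each period
DELAY_STEPS = 3         # periods before the decision-maker perceives the level

def simulate(periods=20):
    stock = 40.0
    history = [stock] * DELAY_STEPS   # perceived level lags the actual level
    trajectory = []
    for _ in range(periods):
        perceived = history[-DELAY_STEPS]                     # delayed information
        correction = ADJUST_FRACTION * (TARGET - perceived)   # balancing loop
        stock += correction
        history.append(stock)
        trajectory.append(round(stock, 1))
    return trajectory

if __name__ == "__main__":
    # Prints a trajectory that overshoots the target and then oscillates back.
    print(simulate())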
Table 2. From the four practices to professional capabilities and artefacts.
Practice | Professional Capability | Typical Learning Artefacts | Example Performance Indicators
Sensemaking and boundary setting | Problem framing; boundary critique; stakeholder engagement | Boundary statement; stakeholder map; problem framing | Defensible scope; explicit assumptions; inclusion of affected stakeholders
Co-modelling and causal representation | Causal reasoning; feedback literacy; modelling communication | CLD; narrative causal explanation | Coherent loops; explicit delays; plausible mechanisms; alternative explanations
Intervention reasoning | Trade-off analysis; scenario planning; leverage-point reasoning | Policy memo; scenario brief; intervention portfolio | Clear model-to-action link; anticipation of unintended effects; feasibility/ethics rationale
Meta-learning | Reflective critique; uncertainty awareness; adaptive expertise | Reflection log; revision trail; model critique | Revision quality; calibrated confidence; evidence seeking; ethical awareness
Table 3. Example learning journey across a module or programme.
Stage | Design Focus | Typical Activities | Expected Artefacts
Novice | Language of systems; boundaries; basic causality | Guided stakeholder mapping; boundary critique prompts; simple causal maps | Boundary note; stakeholder map; short causal narrative
Intermediate | Feedback and delays; model critique | Draft CLD; peer review; compare alternative framings | CLD v1 and v2 with revision rationale; critique memo
Advanced | Intervention and trade-offs; validation | Scenario planning; leverage-point analysis; stakeholder review workshop | Intervention portfolio; scenario brief; stakeholder feedback log
Capstone | Integration and transfer | Apply cycle to new domain; demonstrate adaptive reasoning | Final model + memo; reflection on transfer and uncertainty
Table 4. Assessment blueprint: criteria, evidence, and instruments.
Criterion | Evidence (Artefacts/Process) | Indicators | Candidate Measures/Instruments
Boundary and stakeholder quality | Boundary statement; stakeholder map; justification notes | Defensible scope; explicit exclusions; plural perspectives; power/values | Boundary-critique checklist; stakeholder engagement ratings (locally defined)
Causal reasoning quality | CLD + narrative; alternative models | Feedback loops; delays; coherent polarity; plausible mechanisms; emergent patterns over time | CLD rubric; performative tasks [9]; system dynamics concept tasks
Intervention logic | Policy memo; scenario brief; trade-off matrix | Clear model-to-action link; unintended consequences; feasibility and ethics; emergent system-level consequences | Scenario reasoning rubric; decision-justification scoring
Meta-learning and uncertainty | Reflection log; revision history; confidence notes | Calibrated uncertainty; evidence seeking; revision quality; limits of model claims made explicit | Reflection rubric; uncertainty annotation quality
Systems thinking mindset (supplementary) | Survey | Interdependency orientation; holistic framing | Systems Thinking Scale [8]
Table 5. Causal loop diagram rubric.
Dimension | Emerging (1) | Developing (2) | Proficient (3) | Advanced (4)
Variables | Vague, redundant, or poorly defined | Mostly clear; some ambiguity | Clear, measurable, relevant variables | Variables well-scoped; includes key latent constructs where justified
Causal links | Many unsupported or inconsistent links | Mostly plausible; occasional leaps | Plausible mechanisms; coherent polarity | Mechanisms justified with evidence; alternative links considered
Feedback structure | Few/no loops; linear chains dominate | At least one loop; limited coherence | Multiple coherent reinforcing/balancing loops | Loop dominance and interactions discussed; potential nonlinearities noted
Delays and dynamics | Delays absent | Some delays implied | Delays explicitly included where relevant | Delays used to explain time patterns and unintended consequences
Boundary and stakeholders | Boundary implicit; stakeholders absent | Some stakeholder variables | Stakeholders integrated into causal structure | Boundary critique explicit; stakeholder power/values represented
Explanation and use | Minimal narrative explanation | Partial explanation | Clear narrative linking structure to behaviour | Behavioural implications and intervention leverage points articulated
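As one possible aid to evaluator calibration, the sketch below encodes the Table 5 dimensions as a simple data structure and aggregates 1-4 ratings into a weighted score. The function name score_cld, the equal default weights, and the example ratings are illustrative assumptions, not a prescribed scoring procedure.

# Illustrative sketch: encoding the Table 5 rubric for consistent scoring and
# rater comparison. Weights and example ratings are hypothetical.

RUBRIC_DIMENSIONS = [
    "variables",
    "causal_links",
    "feedback_structure",
    "delays_and_dynamics",
    "boundary_and_stakeholders",
    "explanation_and_use",
]

def score_cld(ratings, weights=None):
    """Aggregate 1-4 ratings per rubric dimension into a weighted mean score."""
    weights = weights or {d: 1.0 for d in RUBRIC_DIMENSIONS}
    missing = [d for d in RUBRIC_DIMENSIONS if d not in ratings]
    if missing:
        raise ValueError(f"Missing ratings for: {missing}")
    total_weight = sum(weights[d] for d in RUBRIC_DIMENSIONS)
    return sum(ratings[d] * weights[d] for d in RUBRIC_DIMENSIONS) / total_weight

# Example: two evaluators rate the same diagram; the gap flags a calibration need.
rater_a = {"variables": 3, "causal_links": 2, "feedback_structure": 3,
           "delays_and_dynamics": 2, "boundary_and_stakeholders": 3,
           "explanation_and_use": 3}
rater_b = {**rater_a, "causal_links": 4, "delays_and_dynamics": 4}
print(round(score_cld(rater_a), 2), round(score_cld(rater_b), 2))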
Table 6. GenAI across the four-practice architecture: productive roles, risks, and governance.
Learning Practice | Productive GenAI Role | Typical Risk | Governance Mechanism
Sensemaking and boundary setting | Alternative framings; stakeholder perspective generation; contrastive problem definitions | Bias; missing marginalised voices; premature closure | Stakeholder validation; bias-reflection prompts; traceability notes
Co-modelling and causal representation | CLD critique; identification of missing loops, delays, and assumptions; explanatory clarification | Spurious causality; hallucination; weak evidentiary grounding | Evidence requirement; uncertainty checks; model justification
Intervention reasoning | Scenario generation; counterfactual comparison; trade-off exploration | Overconfidence in narratives; weak feasibility reasoning | Explicit assumptions; justification before acceptance; counterfactual comparison
Meta-learning | Reflective prompting; revision support; uncertainty articulation | Overreliance; false mastery; reduced effort | Cognitive forcing functions; process-based grading; revision logs; confidence notes
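The governance mechanisms in Table 6 (traceability notes, evidence requirements, revision logs, confidence notes) could be operationalised as a lightweight traceability log kept alongside each learning artefact. The sketch below is one hypothetical schema; the class name GenAITraceEntry, its fields, and the flagging rule are assumptions chosen for illustration rather than an institutional standard.

# Illustrative sketch of a traceability log for GenAI-assisted work. Accepted
# suggestions that lack a justification or supporting evidence are flagged for
# discussion before grading, echoing the process-based assessment in Table 6.

from dataclasses import dataclass, field
from typing import List

@dataclass
class GenAITraceEntry:
    practice: str               # e.g., "co-modelling", "intervention reasoning"
    prompt_summary: str         # what was asked of the GenAI tool
    suggestion_summary: str     # what the tool proposed (e.g., a missing loop)
    accepted: bool              # whether the learner adopted the suggestion
    justification: str = ""     # learner's reasoning for accepting/rejecting
    evidence: List[str] = field(default_factory=list)  # sources cited
    confidence: str = "medium"  # learner's stated confidence: low/medium/high

def flag_ungoverned_entries(log: List[GenAITraceEntry]) -> List[GenAITraceEntry]:
    """Return accepted suggestions that lack a justification or any evidence."""
    return [e for e in log if e.accepted and (not e.justification or not e.evidence)]

# Example: the second entry would be flagged for review.
log = [
    GenAITraceEntry("co-modelling", "Ask for missing balancing loops",
                    "Add a staff-burnout balancing loop", True,
                    justification="Matches interview data",
                    evidence=["Stakeholder interview 2"]),
    GenAITraceEntry("intervention reasoning", "Generate counterfactual scenario",
                    "Assume funding doubles next year", True),
]
print(len(flag_ungoverned_entries(log)))  # -> 1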
Table 7. Indicative alignment between propositions, measurable constructs, and candidate study designs.
Proposition | Key Constructs | Measures/Instruments (Examples) | Recommended Designs
P1 | Systems performance; transfer | CLD rubric (Table 5); performative metrics [9]; transfer tasks | Longitudinal quasi-experiment across modules
P2 | Boundary robustness; feasibility | Boundary-critique scoring; stakeholder feedback ratings | Mixed methods; stakeholder panel comparisons
P3 | Information utilisation; systems reasoning | Think-aloud protocol + coding [20] | Lab-in-field experiments with professionals/students
P4 | Model quality with AI | CLD rubric + traceability score; revision quality | Randomised controlled trial with governance versus none
P5 | Automation bias; transfer | Appropriateness of Reliance metric [26]; novel-case performance | Controlled experiment; delayed post-test
P6 | Overreliance reduction | Error-override rates; time-on-task; justification quality | Cognitive forcing interventions [12]
P7 | Disposition versus performance | Systems Thinking Scale [8] versus modelling performance | Correlational + mediation models
P8 | Critical thinking mode | Cognitive presence coding [24]; interaction logs | Mixed methods; script analysis [25]