How Perspectives of a System Change Based on Exposure to Positive or Negative Evidence

Abstract: The system that shapes a problem can be represented using a map, in which relevant constructs are listed as nodes, and salient interrelationships are provided as directed edges which track the direction of causation. Such representations are particularly useful to address complex problems which are multi-factorial and may involve structures such as loops, in contrast with simple problems which may have a clear root cause and a short chain of causes-and-effects. Although students are often evaluated based on either simple problems or simplified situations (e.g., true/false, multiple choice), they need systems thinking skills to eventually deal with complex, open-ended problems in their professional lives. A starting point is thus to construct a representation of the problem space, such as a causal map, and then to identify and contrast solutions by navigating this map. The initial step of abstracting a system into a map is challenging for students: unlike seasoned experts, they lack a detailed understanding of the application domain, and hence struggle in capturing its key concepts and interrelationships. Case libraries can remedy this disadvantage, as they can transfer the knowledge of experts to novices. However, the content of the cases can impact the perspectives of students. For example, their understanding of a system (as reflected in a map) may differ when they are exposed to case studies depicting successful or failed interventions in a system. Previous studies have abundantly documented that cases can support students, using a variety of metrics such as test scores. In the present study, we examine the ways in which the representation of a system (captured as a causal map) changes as a function of exposure to certain types of evidence. Our experiments across three cohorts at two institutions show that providing students with cases tends to broaden their coverage of the problem space, but the knowledge afforded by the cases is integrated in the students' maps differently depending on the type of case, as well as the cohort of students.


Introduction
Decisions that are in appearance simple, such as whether we eat an expensive walnut salad or an affordable apple pie, have antecedents (is the person trying to lose or gain weight? are there allergies? is the cost a concern?) and consequents (is there then a budget left for other items on the menu? should we exercise to make up for the calories?). As individuals, we are accustomed to forming such decisions by navigating a multiplicity of interacting systems in our minds. However, by making such decisions only in our mind, we may not be aware of our biases or inconsistencies. An implicit approach to decision-making also lacks transparency, and challenges the comparison or evaluation of decisions among individuals. Consequently, an elicitation process serves to externalize the mental models or 'perspectives' held internally by individuals. There are many possible processes to externalize models, depending on the aim of the study or the familiarity of facilitators with certain methods [1]. Such processes may thus result in: (i) textual artefacts, for example as a narrative that necessarily finds a linear arrangement of the elements of the system [2,3]; (ii) rich pictures, which provide a freeform tool building on a rich iconography [4]; or (iii) maps [5], in which the elements of the system are abstracted in the form of labelled or illustrated nodes, and their interrelations are captured through edges. Mapping approaches can also be subdivided into numerous alternatives (Figure 1), including causal loop diagrams [6], Novakian concept maps [7], or causal maps (also known as cognitive maps) [8,9]. In this paper, we focus on the representation of a system in the form of causal maps, which consist of labelled nodes and directed edges representing either a positive causation (i.e., an increase in the source causes an increase in the target) or a negative causation (i.e., an increase in the source causes a decrease in the target). 
Causal maps have been an extensive subject of research for several decades among many fields of applications, with examples including socio-ecological systems [10][11][12] or health [13][14][15]. The creation of a causal map is either a precursor to the development of a simulation model in which scenarios can be tested and quantitatively evaluated (e.g., a Fuzzy Cognitive Map), or the final representation for the mental models of various stakeholders. In this paper, we examine how mental models of a system are shaped by the evidence that individuals are exposed to; hence, we collect causal maps as proxies to tracking changes in the representation of a system. Our study is motivated by a practical problem in the application domain of educational technology, which strongly shapes our data collection protocol and the applicability of our findings. The application domain is briefly described here; readers interested in this domain are referred to seminal works by [16][17][18] or our previous works [19,20] for complementary overviews of the field.
Once they become professionals, students will have to choose one out of many decisions in complex open-ended problems. From a systems perspective, they will consider multiple viable reasoning paths through a system. In order to prepare them for this future challenge, Problem-Based Learning (PBL) exposes students to scenarios that admit multiple solutions (known as 'ill-structured problems'). Unlike in scenarios with a few valid answers (e.g., true/false, multiple choice questions), students can produce very different answers that are equally admissible. Hence, the expectation is that students can construct an accurate problem space [21][22][23], that is, a representation of the system in which the problem is situated. For example, the decision on an intervention to reduce obesity would start by representing the complex system of obesity, including eating and physical activity as well as psychological or social constructs; then, this system can be explored to propose and evaluate solutions by accounting for alternatives, expected results, and potential unintended consequences [14,24]. As such, a problem space not only depicts the major concepts (variables) that have a role in the cause(s) or the solution of the problem, but also provides an underlying explanation for the problem through the causal relationships among the variables. For example, in medical education, a problem space should "include[s] all the causal mechanisms that account for the patient's signs and symptoms" ([22], p. 26). For students, the construction of a problem space is also useful, as it provides an opportunity to practice scientific problem-solving processes and to integrate the knowledge they have learned into a conceptual framework. Working on an understanding and representation of a system does not only help to solve one immediate problem; it also provides a schema that may be used to solve similar problems in the future [17,25].
An important element in problem-based learning is how learners are able to construct the problem space. Learners must be able to identify not only the concepts that are germane to the problem but also the causal mechanisms through which the variables impact each other [16,26,27]. The literature suggests that experts' mental models are more systems level [28,29], whereas those of novices tend to be more segmented and loosely connected [30]. From an educational perspective, Jonassen notes that "there is good reason to believe that there is a dynamic and reciprocal relationship between internal mental models and the external models that students construct" ([23], p. 311). There are multiple ways to represent this relationship. An approach often used in the literature is concept mapping [31], which allows learners to depict how they connect ideas, that is, how they list variables and cluster them based on their relationships or other shared characteristics. Causal mapping is similar, but specifically allows the learners to visualize multiple cause-effect patterns and their impact in a linear fashion. In doing so, it uniquely allows learners to engage in decision-making and pattern recognition as they visualize various pathways during problem-solving [32]. In contrast to more text-based assessments (e.g., argumentation), the visualization afforded by mapping also allows learners to focus on specific segments during their causal reasoning, which is beneficial from a cognitive load perspective [33]. Concept maps can also be used alongside text-based approaches, for example, to give automated feedback to students using the CohViz system [34,35].
According to the theory of Case-Based Reasoning (CBR), students solve problems based on their prior experience. However, this presents a challenge for problem-solving instructional strategies, as novices have limited experience to solve ill-structured problems. This experience can be provided by a digital case library, which consists of narratives describing the ways in which practitioners have solved problems [36,37]. In order to support problem-based learning, instructors can thus give students access to a case library, or equip students with software (e.g., recommender systems) to identify a relevant case [38]. As this library will shape one's view of a system, it is essential that students properly interpret the case to transfer its lessons to their own problem. In particular, cases found within digital libraries can depict successes or failures, which are handled differently in episodic memories [39]. Failure represents a situation where the expectations do not meet the goal requirement. While failure may seem negative, it can incite students to search for explanations; it can thus help to set the groundwork for future reasoning.
While many studies have explored the degree to which learning outcomes differ in various scaffolding approaches, these studies are often performed within a single case or post-hoc learning outcomes (e.g., post-test scores, final argumentation scores). However, theories and studies describe how problem-solving is an iterative process, as students refine their understanding of the system. As such, there is a need for studies which explore how views of a system change as a function of exposure to certain forms of evidence. In this paper, our objective is to quantify the ways in which exposure to success or failure cases will change a student's representation of a system. Our contributions are twofold:

1. We examine two representations of a system (before/after exposure to a case) using causal maps, which allows us to precisely track the structural evolution of a system instead of relying on test scores.

2. We repeat our experiments three times, at two different institutions, thus gathering diverse student profiles to support the generalizability of the findings.
We started this paper by explaining the elicitation of knowledge as a causal map and how these causal maps can be analysed, as well as their relation to case-based reasoning. The remainder of this paper is organized as follows. In the Materials and Methods section, we detail how our data was collected, prepared, and analyzed through network algorithms. Next, we present our results, consisting of the key structural differences before and after exposure to failure/success cases in each of our three cohorts. Lastly, we contextualize our results and conclude with suggestions for further research on the representation of systems in the face of changing or conflicting evidence.

Materials and Methods
Our overall approach is summarized in Figure 2 and detailed in the next subsection. Given the complexity of the data pre-processing, we provide it as a separate schema (Figure 3).

Data Collection
Our data consists of causal maps, which are represented as graphs consisting of labelled nodes and directed, weighted edges. Each causal map was produced by a student. We collected causal maps over three semesters at two different American institutions. Northern Illinois University is a nationally-ranked public Midwestern university, in which the data collection took place through a fourth-year elective on Network Science. Furman University is a nationally-ranked private liberal arts college in the Southeast, where the data collection similarly took place during a third-year elective on Artificial Intelligence. At both institutions, the students first learned how to structure knowledge into causal maps [5], and then they were given the same baseline description of a hypothetical town ('Pleasantville') in which they had to advise the mayor on an open-ended dilemma (see Supplementary Material S1). In our study, the ill-structured problem required students to either accept a retailer, which would transform the city's park into warehouses but bring jobs and tax revenues, or to decline the retailer and preserve the park but struggle with expenses such as teachers' salaries. The students also had to consider the impact of pollution, the available natural resources, and other variables. The students then produced a causal map with the assistance of software (Python in the fourth year course or Actionable Systems in the third year course).
After the students completed their base map without scaffolding, they were introduced to success or failure cases which were relevant to the problem to solve, and were asked to revise their causal maps. The instructions for the failure case are provided in Supplementary Material S2, while the success case is given as Supplementary Material S3. For example, the successful case helped the students manage the effects of rapid population growth, while the failure cases depicted the negative effects of urbanization on the environment.

Data Cleaning/Pre-Processing
Our overall data cleaning process is shown in Figure 3. We converted the students' submissions into one file format (CSV) for analysis. As each file encodes one causal map, we ensured that the files were consistently formatted to describe a network as a list of edges. That is, each line lists the starting node, end node, and type of the causal edge (−1 for causal decrease and 1 for causal increase). This process of ensuring that the maps were properly entered in the file was semi-automatized by using ActionableSystems [40] to detect the presence of formatting errors (e.g., missing header line, extra spaces) and then solving them manually. A common problem was the presence of typos (e.g., "Parks -> Environmental well-being", "Parks ->Enviornmental well-being"), which would mislead the analysis into counting two distinct factors, "Environmental well-being" and "Enviornmental well-being". Such typos were identified via ActionableSystems, as they result in a map that is visibly disconnected; they were then fixed manually. As participants constructed their revised maps, rather than having them provide a discrete numeric value, they were asked to simply provide a nominal label (e.g., 'very strong') to describe the relations between the concepts. After the general data cleaning, we shifted to transforming the nominal values intuitively chosen by the participating students to numeric values (Table 1). These values are only used in the subsequent analysis to identify whether an edge is negative or positive.
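As an illustration of the edge-list format and of the typo check described above, the following minimal Python sketch parses one map and counts its disconnected pieces. The header names (source, target, type), the sample rows, and the helper names load_causal_map and weak_components are assumptions made for this example only. The study itself relied on ActionableSystems for this step, whereas the sketch uses the standard library alone.

```python
import csv
import io

# Hypothetical file content: one causal edge per line, with a deliberate typo
# ("Enviornmental") that splits the map into two disconnected pieces.
SAMPLE = (
    "source,target,type\n"
    "Retailer,Jobs,1\n"
    "Retailer,Parks,-1\n"
    "Parks,Environmental well-being,1\n"
    "Enviornmental well-being,Health,1\n"
)

def load_causal_map(text):
    """Parse rows into (source, target, sign) tuples, with sign in {-1, 1}."""
    edges = []
    for row in csv.DictReader(io.StringIO(text)):
        source = row["source"].strip()  # strip stray spaces around labels
        target = row["target"].strip()
        sign = 1 if float(row["type"]) > 0 else -1
        edges.append((source, target, sign))
    return edges

def weak_components(edges):
    """Group the nodes into connected pieces, ignoring edge direction."""
    neighbours = {}
    for source, target, _ in edges:
        neighbours.setdefault(source, set()).add(target)
        neighbours.setdefault(target, set()).add(source)
    seen, components = set(), []
    for start in neighbours:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:
            node = stack.pop()
            if node not in component:
                component.add(node)
                stack.extend(neighbours[node] - component)
        seen |= component
        components.append(component)
    return components

edges = load_causal_map(SAMPLE)
print(len(weak_components(edges)))  # more than one piece suggests a typo to inspect
```

A map that splits into several pieces is not necessarily wrong, so a check of this kind can only flag candidates for the manual inspection described above.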

Data Analysis
When analysing the changes between the two maps, there are several metrics which denote a more complex understanding of the problem. For example, the presence of chains typically denotes a less complex map, while the presence of cycles shows a stronger understanding of the relationships between various concepts [41]. A participant's coverage of the problem space can be determined by the number of nodes and edges, as well as the length of their map's diameter. Maps containing more nodes and edges have a broader scope, which may show that the participant is examining the problem in a larger context, but could also be a result of the participant going on a tangential train of thought [42,43]. The complexity of a concept map can be analysed by finding the number of cycles and chains, as well as by looking at their lengths [41].
The metrics chosen in this study to contrast the base map with the success or failure maps were selected based on their relation to the complexity of the map and the coverage of the problem space. As discussed by Frerichs et al. [44], the metrics cover the breadth, depth, and structural complexity of conceptualizations of the problem space. Note that our objective is to characterize the structure of the maps rather than to aggregate or compare the content of the maps, which would require us to either resolve linguistic variability (i.e., the possibility that students use different terms to cover the same concept) or constrain students to use a bank of concepts [45] instead of letting students express concepts as they see fit (i.e., using an open-ended approach [46]). We used each metric for the following reasons:

• Number of nodes and number of edges, which represent the number of concepts and causal relationships, respectively. These simple and common measures [42] serve to evaluate whether a map covers more of the problem space.

• Average and maximum number of edges per node, also known as the 'degree', which measure the connectivity. Higher values are preferable, as they indicate that students connect their factors extensively within the problem space, exhaustively considering their causal impact.

• Diameter, which measures the furthest away that two concepts can be (i.e., the length of the longest shortest path). A very large diameter carries a risk of going on a tangent (as students may go beyond the boundaries of the problem space), whereas a very small diameter may be indicative of an early map, in which most concepts are tightly packed around a focal point. A medium diameter is thus most desirable.

• Total number and average lengths of cycles. Cycles or 'feedback loops' are essential structures in complex problems [47]. In contrast with the simple notion of a 'root cause' that would need to be solved for the solution to straightforwardly permeate to a whole system, a cycle recognizes that an intervention will eventually affect us in return [48], possibly in unintended ways. As cycles are often present in systems, but are harder to capture in maps due to cognitive limitations [49], seeing cycles in maps is associated with demonstrating a better understanding of the problem. We characterize cycles both in numbers (the more, the better) and in their average length (the longer, the better).

• Total number, longest length, and average length of chains. Chains or 'paths' indicate chains of reasoning. They reveal how students combine concepts into a logical sequence, thus giving an insight into how students associate terms and causal reasoning [27,50]. We characterized chains by their number, and by their maximum and average length. Lower numbers are associated with more refined maps.

• Percentage of positive causations. Because each causal edge was categorized as either positive (i.e., an increase in A causes an increase in B) or negative (i.e., an increase in A causes a decrease in B), we measured the percentage of positive edges. The positive and negative weights are known as 'signs' in the framework of signed graphs, which has been the subject of recent studies in control and systems theory [51,52]. Some of the notions above (e.g., cycles) can be extended within the context of signed graphs, such as characterizing a 'balanced network' as having no cycle of which the product of signs is negative [53,54]. However, these refinements are more applicable to the study of polarizing phenomena [55] in social networks, in which nodes refer to individuals, and edges characterize their interactions (e.g., cooperative or antagonistic), than in the mapping of the problem spaces studied here.
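The simpler metrics in the list above (node and edge counts, degree, and percentage of positive causations) can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the example edges, the function name basic_metrics, and the choice to count both incoming and outgoing edges towards a node's degree are assumptions.

```python
from collections import Counter

# Hypothetical signed edge list: (source, target, sign), with sign in {-1, 1}.
EXAMPLE_EDGES = [
    ("Retailer", "Jobs", 1),
    ("Retailer", "Parks", -1),
    ("Parks", "Pollution", -1),
    ("Jobs", "Tax revenue", 1),
    ("Tax revenue", "Teacher salaries", 1),
]

def basic_metrics(edges):
    """Return coverage and sign metrics for a non-empty signed edge list."""
    degree = Counter()
    for source, target, _ in edges:
        degree[source] += 1  # outgoing edge (assumed to count towards degree)
        degree[target] += 1  # incoming edge (assumed to count towards degree)
    positive = sum(1 for _, _, sign in edges if sign > 0)
    return {
        "nodes": len(degree),
        "edges": len(edges),
        "average_degree": sum(degree.values()) / len(degree),
        "max_degree": max(degree.values()),
        "percent_positive": 100 * positive / len(edges),
    }

metrics = basic_metrics(EXAMPLE_EDGES)
print(metrics["nodes"], metrics["edges"], metrics["percent_positive"])  # 6 5 60.0
```

Computing these values on a student's base map and again on the revised map gives the before/after pairs that the comparison relies on.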
These metrics were primarily computed using the NetworkX Python library within a Jupyter notebook. We note that this library offers sufficiently efficient implementations given the network sizes encountered in this paper, but readers interested in much larger networks may consider alternative approaches, such as the use of spectral graph theory to quickly characterize loops [56,57]. Graphs were generated from .csv files into NetworkX DiGraph objects, which were then analysed using a mixture of built-in NetworkX functions and our own implementations of classic graph algorithms. The implementation of the algorithms is described in Table 2.

Table 2. Description of each metric and what they measure, along with how they are implemented.

| Measure | Metric | Implementation |
| --- | --- | --- |
| Complexity | Average length and number of cycles | Found by using NetworkX's simple_cycles function and sorting the nodes in the cycle before passing it into a set to remove duplicates. |
| Complexity | Average length, longest, and number of chains | Found by getting all shortest paths between nodes using NetworkX's shortest_simple_paths function and counting those whose paths contain only nodes of degree two or higher. |
| Coverage of Problem Space | Number of nodes and edges | Counted the number of nodes and edges in the graph. |
| Coverage of Problem Space | Diameter | Found the longest of the shortest paths found using NetworkX's shortest_simple_paths function. |
| Type of causations | Percentage of positive causations | Found by reading the last column of the .csv file and counting the rows whose value is above zero, then dividing that count by the total number of rows in the file. |
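To make the cycle and diameter rows of Table 2 more concrete, the sketch below enumerates elementary cycles and computes the longest shortest path on a toy directed graph. It is a stand-in under stated assumptions, not the authors' code. A plain depth-first search replaces NetworkX's simple_cycles so that the example has no dependencies. The diameter is taken over reachable pairs only, and the names build_adjacency, elementary_cycles, and diameter are hypothetical.

```python
from collections import deque

# Hypothetical directed edges; signs are irrelevant for these two metrics.
EXAMPLE_EDGES = [("A", "B"), ("B", "C"), ("C", "A"),
                 ("B", "A"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    adjacency = {}
    for source, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set())  # keep sink nodes as keys
    return adjacency

def elementary_cycles(adjacency):
    """List each elementary cycle once, rooted at its smallest node."""
    order = {node: i for i, node in enumerate(sorted(adjacency))}
    cycles = []

    def extend(path, on_path):
        for nxt in adjacency[path[-1]]:
            if nxt == path[0]:
                cycles.append(list(path))  # closed a loop back to the root
            elif nxt not in on_path and order[nxt] > order[path[0]]:
                path.append(nxt)
                on_path.add(nxt)
                extend(path, on_path)
                on_path.remove(nxt)
                path.pop()

    for root in sorted(adjacency):
        extend([root], {root})
    return cycles

def diameter(adjacency):
    """Longest shortest path over all reachable ordered pairs (BFS per node)."""
    longest = 0
    for source in adjacency:
        dist, queue = {source: 0}, deque([source])
        while queue:
            node = queue.popleft()
            for nxt in adjacency[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        longest = max(longest, max(dist.values()))
    return longest

adjacency = build_adjacency(EXAMPLE_EDGES)
cycles = elementary_cycles(adjacency)
# As in Table 2: sort the nodes of each cycle and use a set to drop duplicates.
unique = {tuple(sorted(cycle)) for cycle in cycles}
print(len(unique), sum(len(c) for c in unique) / len(unique), diameter(adjacency))  # 2 2.5 4
```

Note that deduplicating on sorted node sets, as Table 2 describes, treats two distinct cycles that visit the same nodes in a different order as a single cycle.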

Results
The results are summarized in Tables 3 and 4, while the sample distributions for the number of nodes and edges are shown in Figures 4 and 5; all of the other distributions are provided as supplementary online material. In the first and third cohort, there is a noticeable increase in the number of edges added. This displays an increased coverage of the problem space, hinting that the participants were considering the relationships between their concepts more heavily than in their initial maps. This increase was especially pronounced in the success groups for these two cohorts. In the third cohort, this may have been a result of several outliers, who displayed a significantly larger increase in their number of edges than the others in their group. This same spike in the number of edges is likely the cause of the massive increase in cycles for the third cohort, in which the total number of cycles grew from 68 to 747 cycles after their revision. This increase stands out, as the first cohort group did not experience an increase in cycles anywhere near as significant (only 22 to 36 cycles) despite also having significant growth in the number of edges.

Table 3. Relative differences of the students' maps for each group (success, failure), and for all students combined. The first two cohorts were formed at Northern Illinois University in a fourth-year elective on network science, while the third cohort originated from Furman University through a third-year course on artificial intelligence. + is a greater than 25% increase from the previous map, ++ is a greater than 50% increase from the previous map. - is a greater than 25% decrease from the previous map, -- is a greater than 50% decrease from the previous map. = is a change of less than 25% (either increase or decrease) from the previous map. * None of the four students had chains to start with. In the success group, two of them added chains: one added one chain (of length 2), and the other added five chains (of length 2 to 3, with 2.2 on average).

The number of chains and their length saw similar changes across both the success and failure group in the first two cohorts. For the first cohort, we noted a decrease in the number of chains, and their average length shortened significantly after their revisions, while the second cohort saw a small increase across both groups. The third cohort, however, saw the introduction of chains in the success group where there were previously no chains, and the failure group saw a spike in the number of chains (4 to 38) and their length (4 to 11).
The connectivity of the graphs did not change significantly across each group, with few exceptions: the failure group in the first cohort and the success group in the third cohort, which saw a minor increase in diameter; and the failure group of the third cohort, which had a decrease in the average degree across the graph. This decrease in the average degree may be linked to the spike in chains, because an increase in chains should lower the average degree if the only new connections were those chains.
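The symbols used in the legend of Table 3 follow from a simple relative-change rule, sketched below in Python. The function name change_symbol is hypothetical. Two choices are assumptions made for this example, since the legend does not specify them: a change of exactly 25% is treated as '=', and None is returned when the earlier value is zero, a situation the table handles with a footnote instead.

```python
def change_symbol(before, after):
    """Map a before/after pair to the symbols of the Table 3 legend."""
    if before == 0:
        return None  # relative change is undefined; reported separately
    change = (after - before) / before
    if change > 0.50:
        return "++"
    if change > 0.25:
        return "+"
    if change < -0.50:
        return "--"
    if change < -0.25:
        return "-"
    return "="  # within 25% in either direction

# Cycle totals quoted in the text above.
print(change_symbol(68, 747))  # ++
print(change_symbol(22, 36))   # ++
```

Both cycle counts quoted above fall in the '++' band, although the underlying growth differs by an order of magnitude, which is why the text reports the raw totals alongside the symbols.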

Rationale for the Study and Application Context
When we do not have expertise in an application domain and still need to make decisions, our understanding of the domain can be improved by being exposed to an evidence base in the form of case studies (i.e., a 'case library'). However, the content of the cases can have an impact on our perspectives, such that our mental model may be shaped differently depending on whether the case studies depict successful interventions or failures. Previous studies have abundantly documented that cases can support students, using a variety of metrics such as test scores. In the present study, we took one step further in examining the ways in which the perceptions of a system change as a function of exposure to certain types of evidence. Specifically, our study measured the representation of knowledge through causal maps before and after exposure to a case. We focused on causal maps as they have been used repeatedly in the literature on participatory modelling to elicit mental models with crowds that did not necessarily have prior experience in systems thinking. We acknowledge that other representations may yield different results, as mapping techniques capture specific aspects of a student's knowledge, and hence are not interchangeable [58].

Main Takeaways from the Analysis
We found that exposure to cases leads to a higher number of nodes, edges, cycles, and a higher average degree. The diameter is not changed significantly, which provides evidence of knowledge integration. However, these numbers are increased differently depending on the case and student cohort (either Northern Illinois University fourth year students for cohorts 1-2, or Furman University third year students for cohort 3). Similarly, the findings are nuanced by case and semester for the two key structures consisting of chains and cycles. As discussed earlier, the presence of cycles is tied to a more complex understanding of the problem space [41,47,48], and chains typically denote a naïve understanding. In the first two cohorts, students used the success case to reinforce their own map, thus leading to a sharp increase in the number of cycles and their average length, while these two metrics did not noticeably change in the third cohort. The first two cohorts doubled the number of concept nodes when exposed to a failure case, as they thought of concepts which were previously unaccounted for, while the number of nodes in the third cohort increased slightly regardless of the case. Finally, the first two cohorts had noticeably more positive causal connections when they were exposed to the success case than the failure case, but this relation reversed in the third cohort.

Limitations
A potential limitation of this study was the size of the data sets. First, the activity of representing a system through several maps is time-consuming for students; hence, it needs to be part of the relevant learning outcomes for the course. We thus tied it to courses on network science (focusing on the structure of knowledge as a map) or artificial intelligence (echoing the core notion of knowledge representation), which means that we performed data collection in upper-level electives which generally have small enrolment numbers at the two institutions considered, in comparison with introductory courses. We addressed this concern by collecting data for three successive semesters, thus achieving a sample size of n = 28 for the analysis. Although there is no typical number of participants for studies that elicit systems in the form of maps, we note that our sample size is comparable to several other works, with n = 27 [59] or n = 30 [60] participants. The number of participants can differ widely depending on the objectives of the study: our analysis of maps provided solely by experts had as few as n = 7 participants [61], while our examination of mental models in the community involved as many as n = 264 individuals [10]. The specific data collection protocol also plays an important role in determining whether a study can scale to many participants. Studies with very few participants can involve long one-on-one semi-structured interviews which are then transcribed and turned into maps [42], whereas studies with many participants consist of self-administered surveys in which even the choice of names for each factor is limited. The intermediate order of magnitude of our sample is in line with the intermediate option taken in this study: students can independently create maps representing their view of the system, without being forced to choose from limited options.

Conclusions
Effective problem solving starts with a systematic process to define the problem space and articulate the causal relationships among the variables identified. By mapping the major variables and their causal relationships, the plausible solutions will be more easily and efficiently identified [16]. In problem-based instruction, these mapping processes also help students to construct the knowledge acquired into a conceptual framework for that problem. Therefore, constructing a representation of a system by mapping the problem spaces using causal reasoning is a critical cognitive process not only to successfully solve a given problem, but also for the refinement of the students' conceptual knowledge and problem-solving skills.
Our study shows that providing students with cases tends to broaden their coverage of the problem space, as evidenced by an increase in the number of nodes and causal edges. However, the knowledge afforded by the cases is integrated into the students' maps differently depending on the type of case (success vs. failure) as well as the cohort of students.
Future studies could expand the data sets by continuing to recruit cohorts. Furthermore, our findings that some of the structural changes depend on the cohort suggest that additional factors may underpin the ways in which students process cases. Consequently, it would be of particular interest for future studies to also collect data about the profile of the students, and to test for the presence of mediating factors (e.g., using structural equation models) in the way that maps change when they are exposed to certain cases.

Supplementary Materials:
The following are available online at https://www.mdpi.com/article/10.3390/systems9020023/s1: activity statement distributed to all students (S1) followed by exposure to the failure case (S2) and success case (S3); distributions across cohorts of average degree, highest degree, and percentage of positive edges (S4); distributions across cohorts of the number of cycles and average cycle length (S5); distributions across cohorts of the number of chains, average and highest chain lengths (S6); distributions across cohorts of the diameter (S7).
Author Contributions: P.J.G. and A.A.T. jointly designed the study. P.J.G. collected the data and directed the analysis. P.J.G. and A.A.T. jointly wrote and approved the manuscript. All authors have read and agreed to the published version of the manuscript.