Complexity in Education for Sustainable Consumption—An Educational Data Mining Approach using Mysteries

Systems thinking is one of the skills necessary for sustainable behavior, especially regarding sustainable consumption. Students are faced with complexity and uncertainty in consumption as in other aspects of daily life. There is a need to foster their competence in this field. From a classroom point of view, the mystery method is one way of implementing education for sustainable consumption and working with complex and uncertain content. With the mystery method, students construct an influence diagram, which consists of concepts and requires several skills, especially in decision-making. Using these diagrams as a form of assessment is desirable but also very difficult because of the complexity and uncertainty that are part of the task itself. The study presented here tackles this problem by creating an expert-based reference diagram that has been constructed with the help of educational data mining. The result shows that it is possible to derive such a reference even if parts remain ambiguous due to the inherent complexity. The reference may now be used to assess students’ systems thinking abilities, which will be undertaken in future research. Besides this, the reference can be used as a reflective tool in lessons, so students can compare their own content knowledge with the experts’ reference and discuss differences.


Introduction
The world in which we live is facing increasing challenges, usually carrying a high degree of uncertainty. Fensham [1] compiles an overview of what he terms Grand Challenges and Opportunities. He observes that, on the one hand, these challenges must be described in scientific terms but, on the other hand, they always contain social-moral aspects to be addressed as well. For example, the consequences of consumption choices can be described in terms of ecological consequences (such as emissions, consumption of resources, etc.), but also in terms of social ones (wage dumping, exploitation, etc.). At this interface, science and society meet, and this is why reference is also made to socio-scientific issues (SSIs) [2]. SSIs can be utilized for Education for Sustainable Development (ESD) [3]. These SSIs also constitute complex issues [1]. Based on the model of sustainability competences by Rost, et al. [4], three competencies are involved in sustainable behavior: first, one has to know, then one has to value, and finally one can take action. Consumption in particular follows this order, as many will have experienced themselves. In school lessons, for example, students can analyze the consumption patterns of a conventional good (knowing), then value this pattern based on the three layers of sustainable development (valuing), and finally develop sustainable patterns for this good and hopefully carry them out themselves (taking action). This article dedicates itself to the first part, the knowing, which is the basis for the others. In this context, Rieckmann [5] cites cognitive learning goals, which also emphasize understanding and knowledge, based on Sustainable Development Goal 12: "responsible consumption and production" [6]. Rost, Lauströer and Raack [4] specify this part by referring to competencies regarding systems. Two examples of such competencies are system competence [7] and systems thinking [8].
Sustainability 2019, 11, 722
It is crucial for teaching sustainable education, and therefore Education for Sustainable Consumption (ESC), too, to have tools for assessing students' abilities in those competencies. This paper proposes such a tool, which is based on a teaching method that was optimized by the authors for the purpose of being used as an assessment tool. For now, the methodological process of the tool's empirical development is the focus as it has been a complex endeavor in itself. The overall goal of this research project is to empower teachers and educators, so they are able to foster their students based on the tool, so the students may be qualified to value a decision and finally execute it.
We are using the mystery method [9] (or mystery for short) as a basis, because we consider it suitable for learning processes regarding sustainable consumption. A mystery consists of facts or statements usually written on cards. Students are faced with an initial question or problem that seems odd (or mysterious) at first glance. They then need to arrange the cards in a way that explains this mysterious question or problem. In this study, we are using a self-developed mystery focusing on the (non-sustainable) consumption of tomatoes and providing information about ecological, economic, and social aspects. This example represents an SSI as described above, since it combines scientific and social-moral aspects. The result or product of the mystery method is a type of concept map (see below), also called an influence diagram by Schuler, Fanta, Rosenkraenzer and Rieß [8]. We also use this term, since the structure generated by students contains justified interrelations between ecological, economic, and social aspects of the selected content. As will be explained in the following section, these interrelations are externalized concepts, which can also be regarded as aspects of systems thinking [8].
This article describes the development of a tool for analyzing these influence diagrams in order to assess systems thinking. We are focusing on deriving an expert-based reference that teachers may use for an easy assessment of a mystery. The chosen approach is based on expertise research and comprises several empirical steps, which will be explained in the methodology chapter. It is based on the methods of relatedness judgement [10] and educational data mining [11]. The result of the research will be a reference, which will allow comparison between students and experts and/or among students exclusively. However, this step will not be conducted in this paper, due to the need for an elaborate description of the conducted research, as it has not yet been undertaken in this specific setup. The reference will illustrate the experts' rating of the perceived interrelations of single aspects within the consumption of tomatoes. Therefore, we intend to clarify with the study presented here whether it is possible to generate an expert reference map for a complex consumption pattern. The assessment, with the reference's application, will take place in a subsequent research project.
In the following, we first describe the theoretical basis and then present how complexity can be handled within the framework of ESD and how systems thinking contributes to ESD. Based on this, a more precise description of the mystery method is presented, together with its general potential for assessment as derived from related mapping techniques.

Addressing Socio-Scientific Issues in Education for Sustainable Consumption
Fensham [1] provides a current overview of the urgent global problems, or Grand Challenges, and compares these with the prevailing limitations in the subject areas, which are also encountered in classroom teaching. These challenges are strongly interdisciplinary in character, however, making them impossible to resolve within one discipline alone [2]. The structure of Grand Challenges is defined by the poles of science and society. This can be demonstrated using the example of consumption. Inexpensive consumer goods can, on the one hand, be produced under conditions which pose hazards to the environment (e.g., overexploitation), a factor which can be described scientifically. On the other hand, however, these products also have societal aspects (such as wage dumping). Such areas are of a more uncertain nature and cannot be measured by scientific means [1]. Within the context of the societal and scientific domain of Grand Challenges the term socio-scientific issues (SSIs) is used to describe such areas of friction [2]. There are numerous examples of SSIs, such as a consumption-oriented lifestyle, which has a deleterious effect on environmental and social systems [12,13]. The consequences of these consumption patterns are multilayered and thus can be considered uncertain and complex [14,15]. The degree of complexity in this case is defined by the presence of elements within a sector of reality, which are interrelated. The more elements there are and the higher their degree of interconnection, the more complex the situation appears [16]. This is accompanied by a decrease in predictability and an increase in uncertainty. As an example, many elements of the production cycle of a consumer product have non-linear reciprocal effects. Feedback effects, an interaction of various elements, and multiple causes may also exist. Several plausible solution pathways always present themselves, which is typical of SSIs [2,17]. 
Understanding these reciprocal dependencies is a challenge, but nevertheless important, if we are to be capable of shaping future developments in such a way as to provide equal developmental opportunities to all generations, both those living now (intragenerationality) and those in the future (intergenerationality) [18]. This points to the obvious link to the model of sustainable development (SD), in which these ideas are united [19,20]. The interconnection of the layers economy, ecology, and society with one another is a central characteristic. As the layers also contain societal and/or scientific elements, the model itself is an SSI as well [21] and therefore complex [22]. This situation has a corresponding effect on ESD [3], where SSI and complexity play important roles, so one of the central elements in the objective of ESD is dealing with complexity: learners should be made capable of understanding complex systems [15,23]. Learners are also expected to recognize their own influence on the system, such as in making choices regarding consumption [13,24]. Here the connection to the construct of sustainable consumption becomes clear, which is described "as individual acts of satisfying needs in different areas of life by acquiring, using and disposing goods and services that do not compromise the ecological and socioeconomic conditions of all people (currently living or in the future) to satisfy their own needs" [25]. Systems thinking must be viewed as a necessary skill [8]. Within the construct of systems thinking, among other things, the recognition of the elements of a system and their effects on and among one another is emphasized. The embeddedness of systems thinking in ESD is crucial: "Knowledge and understanding of the complex global relations in major natural, social, and economic systems are important for implementing sustainable development" [8].
Rieckmann [5] explains that the understanding of production and consumption patterns as a learning approach within ESC must be regarded as an important building block and that Sustainable Development Goal 12 is thus fulfilled. In the context of SSIs, literature suggests that learning methods contain particular elements. These include processes that initiate information processing [22], working in small groups [26], a detective-like approach [17], and working with real life problems [18]. Methods are needed which activate students and are capable of portraying complexity [27]. The mystery method [9] can fulfill all of these requirements. Regarding the possibilities of the mystery method in the area of ESC, Rieckmann [5] lists numerous learning goals that are central to responsible consumption and production and which are aligned with the method: Learners should perform Life Cycle Analysis, change their roles to put themselves in different circumstances, and fundamentally understand the production and consumption patterns.
When working on the mystery method, students work in small groups to create an influence diagram (Figure 1) about an SSI using prepared information cards (mystery cards; see Appendix A), with the aim of answering a mysterious key question (described below). The students discuss with each other, argue about causes and consequences within the SSI, and handle the information from the mystery cards. The challenge of the task is to evaluate the partly uncertain information from the cards and to connect the presented aspects of the SSI consensually with labelled arrows. In this way, the mystery method facilitates informal reasoning, which is crucial for successfully working with SSIs [2,28,29,30].
After the mystery has been worked through, the students are able to write an answer to the mysterious question from the beginning by using the information from the cards and the interrelations from the influence diagram prepared, as it depicts the reciprocal effects of the various aspects of the system being studied [18]. Until now, not much research has been performed on the learning results and the learning processes when using mysteries, although the method is known internationally, e.g., [9,31,32]. The structure created in laying out mystery cards is similar to that generated with other structure-creating techniques, such as concept maps [33]. A mystery consists of elements, which can be connected to one another. In the case of a mystery, a connection indicates a causal relationship perceived by the user as linking the facts presented on the cards. Similarly, a concept map consists of connected elements. In this case, a connection between two concepts indicates a "fact" (proposition). Concept maps can be used to externalize and evaluate the cognitive structures of learners. In analogy to a concept map, the structure laid with mystery cards will be referred to in the following as a "mystery map", both in conformity with and in distinction to concept maps or other types of networks.

Empirical Results on the Mystery Method and Mystery Map
Leat [9] describes his mystery method as an activating tool for teaching situations, particularly for the visualization of learning processes. While working with a mystery, the cards are connected, webbed, sorted, and shifted. However, Leat [9] describes his insights only on the basis of experience and observations made during teaching. Empirical evidence is rare [34][35][36]. Empirical investigations performed to date indicate that mysteries are capable of fostering geographic and/or networked thinking [34] and are considered suitable for making complex topics understandable [27]. A further study addresses the influence of mysteries, in combination with other methods, on geographic thinking [37]. This investigation was able to demonstrate that students always pass through a series of individual stages. The cognitive skills of the students determine which stage they achieve. Leat and Nichols [38] create an association between these and Piaget's developmental stages. These studies address the guiding role of the teacher. Recognition of difficulties forms one aspect of the diagnostic activity in the classroom, accompanied by the detection of successful work. According to Leat and Nichols [38] both of these become visible when mystery maps are constructed, since the arrangement of the cards renders the students' thought processes transparent. Thus, the diagram gains a particular relevance from the viewpoint of diagnostic questions. Investigations as to how diagnosis can be achieved by mysteries have recently been reported by Karkdijk, van der Schee and Admiraal [35], who work with a SOLO-based design using the numbers of connections in the mystery besides other measurements like group discussions subsequent to the mystery task. The connections were controlled based on an "'ideal' concept map" [35], generated by two experienced geography teachers in discussion with the mystery-designer. 
The researchers completed the test groups' mysteries, based on the information from the group discussions. Afterwards "'total' concept map[s]" [35] were constructed, based on all mysteries from the test. Based on the established way of surveying concept maps with standard solutions (or references), it needs to be asked how those references are constructed and who constructs them. Generating a single reference map seems unreasonably strict given the open and complex nature of SSIs and generating them from inside the test group is less effective than working with expertise. In addition, in case of mysteries, the actual use of mystery solutions for this purpose may cause problems rooted in lack of space during working on mysteries or other problems. This may be the reason for the completion of the concept maps by Karkdijk, van der Schee and Admiraal [35] and is the starting point for the research approach applied in the present study. The use of influence diagrams as a diagnostic instrument is a timesaving option as no additional teaching time is required. Mysteries can be part of regular classroom instruction and be used subsequently for analysis. The aim of this article is to present such an analysis, based on comparison with an expert reference. In doing so, associated techniques are applied in generating diagrams, which are then evaluated using data mining methods to establish a reference, which can be used in diagnosing systems thinking.

ESD Mystery Construction and Analysis
In principle, mysteries can be constructed around a number of lesson topics, but they are particularly suitable for use with complex and multi-layered ones. Our thematic point of reference is the field of ESD, which contains a system component in the sense of a human-environment relationship [4,8,39]. The mystery was constructed around the SSI of global water shortage [40]. The content concerns the topic of export-oriented, water-intense agriculture in Almería, Spain. This real-life issue helps to bring significance from the lesson beyond the classroom [2,41].
Few instructions exist on the construction of mysteries, e.g., [9], and those that do have too little relation to theory to be suitable for use in conducting an empirical investigation regarding the complexity of ESD. Therefore, we followed an objective and standardized method in creating the mystery for this study. In accordance with the SD model [20] and Wu and Tsai [29], who based parts of a mixed-method reasoning analysis on the layers of SD, our mystery consists of three groups of cards, one for each layer of SD (economy, ecology, society) [42,43]. In addition, cards explicitly designed to relate to intra- or intergenerationality are included for each dimension. Fundamental research into the literature on the chosen SSI for this study provided a basis of knowledge, which was then condensed to a total of 18 cards. Six cards, including one card each bearing a close relationship to intra- and intergenerationality per dimension (Figure 2), thus represent each layer of SD. Following an initial review in the form of a practical test of the mystery performed by students of teaching based on the key question, "Why can cheap tomatoes cause thirst?", the cards were modified and linguistically simplified. A further review corroborated an improvement in the cards from the standpoint of practical usage. An overview of all the cards, and how they were numbered, appears in Appendix A at the end of this article. From a diagnostic perspective, analysis of students' mystery maps is very useful since they reflect, as do concept maps, the cognitive structures of the students [38]. Such influence diagrams can be constructed in a wide variety of ways, just as concept maps can be, because the learners apply prior knowledge and the results of discussions conducted during the work phase [9]. This diversity is also typical of solution pathways for complex problems [17].
It is interesting to note that the ability to construct influence diagrams is considered to be a sub-capability of systems thinking [8]. Therefore, we assume that this ability plays a role in the creation of a mystery map. Utilizing mystery maps for assessment and thus making statements on the ability of the students constructing it, is the step intended to allow teachers to make diagnoses of their own students in the field of systems thinking as part of ESD. At the same time, the map can be used to perform research based on an assessment. To achieve this objective, this article describes a way of collecting and analyzing expert input to be subsequently used for assessing solutions to mystery maps in order to enable description of basic differences with regard to systems thinking.

Method
As described above, mystery maps are, to some degree, similar to concept maps. For those, many different methods of assessment have been suggested over time. A typical approach is the quantification of elements present within the maps [44]. Typically, this leads to a map with many links and concepts (nodes) being seen to be of higher quality than a map having only a few such elements. Kinchin, et al. [45] argue, however, that the number of elements is not the most important factor, but rather the structure of the pattern created, in other words, the quality of their interconnection. Furthermore, the quality of each element per se (that is, particularly the correctness of the propositions) can be included in the evaluation. This is often performed with the aid of a reference generated by experts [46]. This is based on the assumption that experts structure their knowledge similarly to each other, which indicates that there is a "correct" or at least "typical" way of interrelating concepts of a particular subject-domain [10]. For mysteries dealing with SSIs, there are, however, different ways of connecting the cards in a reasonable manner as the system itself is complex and open in nature. Also, in contrast to concept maps, a more densely connected influence diagram is not necessarily better than a sparse one, as the students' task was simply to identify a causal explanation of the mystery question and not to include as many causal links as they could identify. Many different sparsely connected solutions may therefore be considered equally valid and useful. Consequently, a single reference for judging correctness is not adequate; the complexity of the issue at hand needs to be addressed.
Our method does not ask experts to create a complete reference for the mystery, but instead focusses on the causal relations between the statements of the cards. Experts are asked to judge whether any form of causal relation exists between each pair of statements. The resulting data therefore encompasses a quantification of the degree of certainty that a group of experts possesses regarding any element that may appear as part of the students' solution to the mystery. The data collection is similar to relatedness judgements in psychology (cf. [10]), where persons are asked to rate how related two words are for multiple pairs of words. In our case, the resulting data is modeled as a weighted graph that can be used for automated assessments and also analyzed using graph-theoretic methods, e.g., [47], in order to create a reference as a visual representation that can subsequently be used, e.g., by teachers for a manual approach to assessing mysteries. The goals align with the aspects of "knowledge engineering" and "distillation of data for human judgement" that Baker (2010) describes as typical aspects of educational data mining.

Sample
Eight experts participated in the survey. The participants were educational researchers from the field of ESD, teachers with more than 10 years of professional experience, and students of teaching (high school) with their focus on ESD and consumption. All of them were selected as experts because they (1) have expertise in the content handled in the mystery (agriculture in Almería). They possess sound expert knowledge through participation in courses and/or intensive study of specialized literature on the topic prior to the survey. The experts have also (2) dealt with the topic from an educational perspective as researchers and teachers, which requires in-depth analysis of the content [8]. Additionally, the participants (3) possess knowledge concerning both the mystery method and the requirements and content of ESD.
Overall, it would have been desirable if test persons could have been drawn from a large total number of experts by randomization. Since participation in the research project is voluntary, we have selected those persons in our existing network who meet the above criteria. On the one hand, one could say that this number can be considered small. On the other hand, the data obtained show a high degree of stability, so that it can be assumed that the data situation would not change significantly if further experts were involved.

Design
This study aims to determine, with the aid of the experts, all correct (and incorrect) interconnections that can be generated within a mystery map, as well as the degree of certainty that experts possess for each of these connections. This information is taken as the basis for assessing mystery maps from students, either automatically in further studies or by teachers in the classroom.
When a mystery map is created, typically only a portion of all the possible meaningful, logical correlations between the given facts are utilized, since in practice only one answer among several potential solutions to the key question is being sought. Analogously, a single mystery map generally provides no information on those connections that were not selected but are nevertheless possible in principle. These connections may not have been selected, for instance, because their content has no connection to the key question and they were not required for the solution being sought, or because they do not reflect the knowledge level or specialty of the individual test person. A possible solution is to collect a large number of expert solutions and to expect that a "saturation" will be reached at which information about all possible connections has been collected. In order to extract the maximal information from each expert, however, we chose a different approach. Instead of having experts solve a mystery, we presented them with all possible pairs of statements from our mystery and asked them to establish (if possible) the causal relationship between the information on these two cards. In line with a focus on the principle of cause and effect, the test persons can choose between four possible types of relations: (1) A causes B, (2) B causes A, (3) A causes B and B causes A, (4) no relation. A computer-based survey lends itself to this type of design and also has the advantage of saving time.
For the present survey, two of the 18 cards were not used directly. These cards do not contain any cause or effect but are essential for teaching purposes and for understanding the content. Their fundamental purpose is to provide basic introductory information. In the survey, these cards were used as an example to demonstrate how the survey works. The experts were required to decide systematically, for each of the 16 × 15/2 = 120 pairs formed from the remaining 16 cards and presented to them in the online assessment, whether there was a logical cause-and-effect relationship. In this way, insights are obtained into the degree of consensus between the experts for all areas of the mystery, as well as into where uncertainties lie. The survey took the experts about 45 minutes to complete.
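The size of the survey follows directly from the number of cards. As a minimal sketch (the card labels 1 to 16 and the relation codes are placeholders chosen for illustration, not the identifiers used in the actual survey software), the list of survey items could be generated as follows:

```python
from itertools import combinations

# Hypothetical labels for the 16 cards used in the survey
# (the two introductory cards are excluded).
cards = list(range(1, 17))

# The four answer options offered for each pair (A, B); the codes are placeholders.
RELATIONS = ("A->B", "B->A", "both", "none")

# Every unordered pair of cards appears exactly once as a survey item.
pairs = list(combinations(cards, 2))

print(len(pairs))  # 16 * 15 / 2 = 120
```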
Based on the responses, a network or graph can be generated: the 16 information cards become 16 nodes between which 16 × 15 = 240 directed links can exist. Since a causal relationship between two facts is often not valid in both directions, the links must carry a specific direction: they are directed. As additional information, they also carry a weight: the number of experts who considered a particular connection to be valid, sometimes also called the support for that connection. Based on this general model of a directed, weighted graph, a broad spectrum of assessments and visualizations can be performed. In particular, one can derive the "typical" expert solution to the mystery by using the pathfinder algorithm [48] with the values q = 15 and r = ∞. With these values, the pathfinder algorithm removes all links for which a "better" indirect path exists. A discussion of the selection of the parameters q and r within the context of knowledge structures can be found in [49]. Evaluating the data is a time-consuming process, as different hypotheses are developed and tested according to the data mining approach.
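One way to picture the step from survey responses to the directed, weighted graph is the following sketch. The response format (one dictionary per expert, keyed by card pair) and the relation codes are assumptions made for illustration; the toy answers are invented and are not study data.

```python
from collections import Counter

def aggregate(responses):
    """Count, for every directed link (source, target), how many experts support it.

    `responses` is a list with one dict per expert, mapping a card pair (a, b)
    to one of the assumed codes "A->B", "B->A", "both" or "none".
    """
    support = Counter()
    for answers in responses:
        for (a, b), relation in answers.items():
            if relation in ("A->B", "both"):
                support[(a, b)] += 1
            if relation in ("B->A", "both"):
                support[(b, a)] += 1
    return support

# Invented answers of two hypothetical experts for two card pairs.
expert_1 = {(1, 2): "A->B", (1, 3): "none"}
expert_2 = {(1, 2): "both", (1, 3): "B->A"}
support = aggregate([expert_1, expert_2])
# Link 1 -> 2 has weight 2; links 2 -> 1 and 3 -> 1 have weight 1.
```

Links that no expert selects never enter the counter, which mirrors the way unsupported links are absent from the network.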
The method of data collection was tested in a pilot survey conducted with students of teaching and then slightly modified.

Results
A first overview of the data can be provided in the form of a network (Figure 3) in which each node represents a specific mystery card and each link reflects a causal association perceived by at least one expert, represented by an arrow indicating the direction of causation: A → B means that A causes B. To improve the legibility of the figures, all mystery cards are assigned a number (circles in Figures 3 and 5, nodes). The corresponding content of the cards is presented in Appendix A. This list also presents the relation of each card to the dimensions of sustainability. The data clearly indicate that the experts found associations between nodes from all dimensions of the sustainability model. The data analysis proceeds step by step, with each stage of data processing building on the previous one until a reference is obtained.

Network of all Associations Perceived
All experts agreed that 151 of the 240 possible causal links are not reasonable; these links thus do not appear in the network. This leaves 89 links that a varying number of experts indicate as possible causal connections. Figure 3a shows the network depicting all of these links. One notable aspect in the figure is that node 16 is directly connected to only one other node and thus can be considered only weakly linked. In contrast, node 7 has 17 incoming and outgoing links. Node 15 is the only node that solely receives incoming links, whereas node 16 has only a single outgoing link.
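Observations of this kind (nodes that only receive links, or nodes that only send them) can be read off the link list mechanically. The following is a minimal sketch with invented node labels and links, not the study's data:

```python
from collections import Counter

def degrees(links):
    """Return in-degree and out-degree counters for a list of (source, target) links."""
    indeg, outdeg = Counter(), Counter()
    for source, target in links:
        outdeg[source] += 1
        indeg[target] += 1
    return indeg, outdeg

def sinks_and_sources(links):
    """Sinks only receive links; sources only send links."""
    indeg, outdeg = degrees(links)
    nodes = set(indeg) | set(outdeg)
    sinks = sorted(n for n in nodes if outdeg[n] == 0)
    sources = sorted(n for n in nodes if indeg[n] == 0)
    return sinks, sources

# Invented toy network.
toy_links = [("a", "b"), ("b", "c"), ("d", "b")]
sinks, sources = sinks_and_sources(toy_links)
# sinks == ["c"], sources == ["a", "d"]
```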

Network of Weighted Associations
One interesting question regards the degree to which the network changes its structure when containing only links that most experts have agreed on. Figure 3b depicts a network containing only those links supported by at least half of the experts. The figure also indicates the weight of the links both directly as a number on the relevant link and indirectly through the thickness of the lines. This network contains a significantly smaller number of links. Node 16 is isolated. Also, some nodes are solely initial or final points. It also becomes evident here that there can be more than one path connecting an initial and a final point, for example between nodes 14 and 4.
It is apparent that only two of the links are supported by all eight experts: from node 8 to 4 and node 12 to 3 (for content see Appendix A).
Lowering the threshold for links under consideration to seven experts leads to an additional four links. This is of particular interest in the case of node 2, as this is the point of contact between the two eightfold links at nodes 8 and 12. Furthermore, it is possible to differentiate between nodes having the same number of links by taking the weight of these links into consideration. For example, although nodes 13 and 7 have four links each, the links at node 13 have a higher average weight. Figure 4 shows cards 5 and 13 with their medium- to high-weight connections and their respective contents.
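The thresholding described above amounts to a simple filter on the link weights. The sketch below assumes the support counts are held in a dictionary keyed by directed link; the node labels and weights are invented for illustration, while the number of eight experts is taken from the text.

```python
def filter_by_support(support, min_experts):
    """Keep only links supported by at least `min_experts` experts."""
    return {link: w for link, w in support.items() if w >= min_experts}

N_EXPERTS = 8  # number of experts reported in the text

# Invented weights for a toy network.
toy_support = {("a", "b"): 8, ("b", "c"): 7, ("a", "c"): 4, ("c", "d"): 2}

at_least_half = filter_by_support(toy_support, N_EXPERTS // 2)
at_least_seven = filter_by_support(toy_support, 7)
# at_least_half drops only c -> d; at_least_seven keeps a -> b and b -> c.
```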

Approach for Creating a Reference
While these depictions make some structural characteristics more easily visible, further-reaching statements can be derived by applying additional data processing methods. One way of enhancing the visibility of the core interrelationships is to generate the pathfinder network (PfNet) [48]. Of the 89 links originally present, this network omits all those for which there is a "better", indirect link. In our case, better means having a higher weight. For example, a direct link exists from node 2 to 3 as well as an indirect link from 2, via 12, to 3 (Figure 3b). Comparison with the PfNet shown in Figure 5 shows that the direct link from 2 to 3 has been eliminated, because the indirect link via 12 has a higher weight. In other words, the pathfinder algorithm thins out the links appearing in a graph.
It is also apparent that the PfNet takes two different aspects of the graph into account. On the one hand, the original components of the graph are maintained, which is why node 16 remains connected to the rest of the network despite its link having a comparatively low weight. On the other hand, where alternatives exist, the heavily weighted links are used preferentially. From the perspective of content, the network thus generated can be viewed as a type of prototypical solution for the mystery as created by the group of experts.
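The thinning step can be illustrated with a small sketch. It assumes the common reading of the pathfinder parameters: with r = ∞ the quality of a path is determined by its weakest link, and with q = n − 1 paths of any length are considered. Pathfinder is usually formulated on distances; because only the rank order matters for r = ∞, the sketch works directly on the support weights. The function name and the toy weights are illustrative assumptions, not the authors' implementation or data.

```python
def pathfinder_inf(weights, nodes):
    """Sketch of PFNET(r = inf, q = n - 1) on similarity weights.

    `weights` maps directed links (source, target) to a positive support count.
    A link is dropped if some indirect path has a higher weakest-link weight.
    """
    # best[i][j]: highest achievable weakest-link weight over all paths i -> j
    best = {i: {j: weights.get((i, j), 0) for j in nodes} for i in nodes}
    for k in nodes:  # Floyd-Warshall style update for bottleneck paths
        for i in nodes:
            for j in nodes:
                via_k = min(best[i][k], best[k][j])
                if via_k > best[i][j]:
                    best[i][j] = via_k
    # Keep a link only if no path beats its own weight.
    return {(i, j): w for (i, j), w in weights.items() if w >= best[i][j]}

# Invented weights, loosely modelled on the example of nodes 2, 12 and 3.
toy_weights = {(2, 3): 5, (2, 12): 7, (12, 3): 8, (16, 2): 1}
pfnet = pathfinder_inf(toy_weights, nodes=[2, 3, 12, 16])
# The direct link 2 -> 3 is removed because the path 2 -> 12 -> 3 has a
# weakest link of 7; the low-weight link 16 -> 2 stays, as it has no alternative.
```

In this toy case the weakly supported link from node 16 survives because it is the only way to reach that node, which corresponds to the observation that the original components of the graph are maintained.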

Discussion
The results reflect the ratings of the experts surveyed based on the given mystery cards.
The following sections discuss whether the results are reliable, which conclusions can be drawn from the findings for research and practice using the mystery method in Education for Sustainable Consumption, and what the findings tell us from the point of view of systems thinking. Beyond this, critical aspects of the method are illustrated.

Structural Aspects
The mystery used in this research is based on the consumption of unsustainable tomatoes, which stand as an example for various consumption goods. By solving the mystery, students discover causes and effects of such unsustainable behavior. Moreover, they show exactly what they discover by constructing the mystery map; its structure reveals insights into the students' decision-making [49]. Teachers need the support of a reference map in order to determine whether their students have found proper connections. With it, they are able to support their students based on the information from the reference. The structure of the reference is therefore of crucial importance for this process, and the following structural analysis shows what the experts told us about the causes and effects.
With regard to the 151 links not drawn, it should be noted that the experts unanimously saw no causal relationship here. This is unsurprising, since not every fact can be causally linked to every other.
There is a notable trend from a few heavily weighted links to a larger number of less heavily weighted ones (with the exception of the weight 4; Table 1). There is thus indeed variability, or even disagreement, among the experts, as also noted by [50], for example because no consensus has yet been reached on these "frontier science" aspects [51]. It may also indicate different (sub-)areas of expertise among our group of experts. This is not an indication of "noise" in the data, but a measure of the uncertainty inherent in the SSI [17], which arises despite expertise. In any case, this is true only for parts of the mystery. In the case of the high-quality links there is a (more or less clear) consensus. Ultimately, the degree of correlation between some pairs of cards remains unclear, but our results allow us to quantify the degree of uncertainty and also show that it affects only certain sub-aspects of the mystery. Interestingly, the mystery cards associated with the layers of SD (Figure 2, Appendix A) vary in their degree of webbing. It is apparent, for instance, that although the economic cards have the largest number of links on average, their average weight is not as high as that of the ecologic cards. This reflects the assumption that the ecologic cards yield more "scientifically accepted and well known" facts, while the economic and social cards carry a higher degree of uncertainty. The differences between the various disciplines in formulating system terms also make this apparent [52].
The pathways are a further important focus of the evaluation. As described earlier, there can be several alternative pathways from one node to another; the example given was the path from node 14 to 4 (see description of Figure 4). This fact indicates that the multitude of solution pathways seen in mysteries and the different mystery maps are created not only under the influence of group processes as reported by Leat [9]. It is more likely that this is a further effect of the complexity and uncertainty inherent in the SSI [17]. Again, our method makes this inherent complexity accessible for automated or manual assessment of mysteries, as we can use it to judge the plausibility of a solution based on the experts' degree of certainty regarding the various possible pathways.
One particularly interesting feature is the complex at node 2. As already noted, this node connects the two eightfold links by means of two sevenfold links (Figure 4). The significance of the card behind the node becomes apparent when we consider the fact that this card forms a central point of a substructure containing the most heavily weighted cards and thus is still present in the PfNet. The relevance of the information on card 2 ('Tomatoes require sufficient water, nutrients, light, and heat. Simple plastic greenhouses are used to provide these.') for the entire mystery is high and represents a core aspect of the SSI. It appears that adequate webbing of this card is a prerequisite to successfully solving the mystery. By implication then, if this card has no links to other cards, the test persons would be in need of aid.

Methodological Aspects
We were successful in fulfilling the objective of obtaining a "superset" of all expected mystery maps in the form of relatedness judgements. Given the small number of experts, adding further answers may shift some of the structural aspects described above, but we assume that a larger group would not produce significant deviations, as the results are in line with our expectations.
Regarding the preparation of the mystery cards, the result from card 16, whose node appears in all figures as poorly webbed, is revealing. Care was taken during construction of the cards to ensure that the information was capable of being webbed so that links could be generated in the first place. The fact that only a few of the experts were able to associate the information on card 16 indicates that there is still potential for further improvement. The information on the card may simply be incapable of being connected to other cards, so that the principle of cause and effect does not apply here, as was the case with the two cards eliminated beforehand. Conversely, this means that the survey method presented here can also be applied in evaluating the quality of the mystery itself. This approach would bring significant improvement and professionalization to the current method of constructing mysteries [9].
The implementation of data mining as a fundamental approach and data processing method is very helpful for establishing a reference for possible solutions of the mystery. It is, for example, possible to provide teachers with a spreadsheet that contains the reference data and allows them to enter a student map and obtain a judgement of its correctness. The PfNet can be used as a "guide" but may not yet prove to be a practical approach for classrooms. In actual classroom use, the mystery maps will have significantly fewer links. In line with our objective, the reference was intended from the start to be a superset of all mystery maps that might be created in learning situations, so that they can be evaluated in terms of their factual correctness.
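How such a lookup against the reference might work can be sketched in a few lines. The scoring rule shown here (the share of experts supporting each directed link drawn by a student) is an assumption made purely for illustration; the text does not specify a rule, and the node labels and weights are invented.

```python
def check_student_map(student_links, support, n_experts):
    """Return, for each directed student link, the share of experts supporting it.

    Links that no expert selected get a share of 0.0.
    """
    return {link: support.get(link, 0) / n_experts for link in student_links}

# Invented reference weights and an invented student map.
toy_support = {("a", "b"): 8, ("b", "c"): 4}
student_map = [("a", "b"), ("c", "b"), ("b", "c")]

report = check_student_map(student_map, toy_support, n_experts=8)
# a -> b: 1.0 (full support), c -> b: 0.0 (reversed direction), b -> c: 0.5
```

Because the lookup is keyed by direction, a student arrow drawn the wrong way round is treated as a different link from the one the experts supported.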
Disengaging the data survey from practical application may be viewed in a critical light. After all, work in groups is one of the central elements of this method. Our survey has nevertheless generated an objective environment within which mysteries can be analyzed without the inherent influence of group dynamics. This lays the foundation called for by any scientific survey. Further research can now be performed on this basis.
Finally, a critical reflection of the expertise itself should take place. Dorussen, et al. [53] rightly ask to what extent one can rely on the statements of one individual, especially in relation to interviews with experts. It does seem risky to rely on one or a few experts. However, our method is innovative in this respect, because by examining a number of experts, the weight of the individual is reduced. Outliers remain visible in the data but no longer have a decisive effect. The method is therefore well suited for obtaining a basic picture of the expertise. What is more, the type of data resulting from data mining theoretically allows not only each card to be viewed with its connections but also the individual experts and their knowledge networks. This is an interesting possibility for future approaches.

Conclusions
Use of the mystery method in ESD classroom instruction is already well established. At the same time, only a limited number of empirical studies show findings on its effectiveness. This article is intended to contribute within this context by describing a method for obtaining a reference for any mystery from experts that allows for the inherent complexity and uncertainty of topics derived from SSIs. The approach involved applying various methods, which to date have rarely been used in educational research. The result of integrating the various lines of research in the present study is a reference, ready to be used as an assessment tool. In contrast to existing scoring methods, it is not bound to simple numerical values or counting of structural elements devoid of content, but to the mystery cards utilized and to the direction of the arrows linking them. In light of the uncertainties involved in SSIs [2,17], this is an advantage in ESD, since the uncertainties are linked to facts and thus naturally do not appear in the data by chance. The method presented here is an example of how to handle not only mysteries, but other comparable teaching methods as well. Their essential common characteristic is inherent complexity and uncertainty, which the present method is capable of diminishing.

Prospects for Dealing with Complexity in Education for Sustainable Consumption
Nowadays, consumption patterns contain complexity and uncertainty and are often unsustainable. If students' abilities to analyze those patterns are fostered, they can see what is unsustainable and may change a pattern into a sustainable one. The complexity is a major issue and must be dealt with effectively, especially at the classroom level. The reference presented here first of all allows a diagnosis of systems thinking sub-traits, as this in part involves dealing with the interplay between the elements of a complex system [8] and is externalized through the work with a mystery. The precise method by which a correlation can be drawn between a student's mystery map and the reference as well as the results of a test of systems thinking is still to be conceived, e.g., [54][55][56]. In the end, this undertaking makes it possible for teachers to diagnose their students' need for further assistance regarding systems thinking as an essential part of sustainable consumption [39].

The Importance of Data Mining in Educational Contexts
Automated collection and analysis of data has become prevalent in many areas of science. For educational research, educational data mining deals with the specific data found in those contexts. In the study presented here, automation has been used to create the survey, collect the data, and analyze the results. Doing so manually would have been possible, but tedious. Incorporating additional answers by experts is very easily possible in an automated setting allowing us to further refine our findings in the future. Also, collecting data for different mysteries is easily possible in an automated setting and only requires recruiting experts. One of the core aspects of (educational) data mining methods is dealing with noise and uncertainty [57,58]. This makes the methods particularly suitable for our task as incorporating the uncertainty of experts into the reference was one of the core aspects of our work.

Practical Teaching Purposes
The reference can have an additional function in practical classroom teaching when it comes to experiencing uncertainty. A single student's mystery map does not display any uncertainty, because its links carry no weights. The uncertainties visible in the expert map can therefore be reviewed in a meta-reflexive discussion by comparing the propositions and the weighting of links from experts and students. Comparison of the reference to their own mystery map thus allows the students to gain knowledge on an individual level and therefore contributes to their judgement of SSI topics. Complexity can therefore be integrated into classroom teaching in a constructive manner, because the students are able to widen their focus by shifting from their own point of view to the wider view of the expert perspective.
Finally, we would like to point out that the mystery method has many special advantages from a didactic and practical teaching perspective (e.g., openness). The mystery is still used in its original form in teaching and is not altered by our suggestions. The level of a future assessment is secondary and the teaching process remains unaffected.