Procedural Knowledge of Primary School Teachers in Madagascar for Teaching and Learning towards Land-Use- and Health-Related Sustainable Development Goals

Abstract: Achieving the Sustainable Development Goals (SDGs) requires the empowerment of learners through Education for Sustainable Development (ESD), already at primary level. Teacher education for the SDGs is a focus of ESD. However, many teachers in Madagascar are underqualified and show knowledge gaps regarding ESD. This paper aims at identifying starting points for an ESD-oriented further development of teacher training, considering regionally relevant issues. Teaching Sustainable Development issues requires procedural knowledge. This paper reports on (i) Malagasy primary school teachers’ (n = 286) teaching and learning prerequisites regarding land-use and health issues compared to expert knowledge, (ii) modeling teachers’ respective procedural knowledge with the Rasch Partial Credit Model and validation studies, and on (iii) comparison of groups of teachers differentiated by diversity dimensions, e.g., teaching at rural or urban schools. The teachers underestimated land-use and health courses of action regarding effectiveness and possibility of implementation, compared to experts. IRT modeling resulted in two distinct knowledge dimensions, i.e., land use and health (latent correlation: 0.31). Rural teachers showed higher procedural land-use knowledge than urban teachers. No differences occurred regarding health knowledge. The paper argues for ESD-focused reorientation of teacher training, considering regional specificities of land-use topics, e.g., regarding vanilla and rice cultivation in North-East Madagascar, and health topics.


Introduction
Within the Agenda 2030, the UN member states, among them Madagascar, adopted the 17 Sustainable Development Goals (SDGs) addressing the most pressing issues of Sustainable Development (SD) [1]. SDG 4, Quality Education, explicitly addressing Education for Sustainable Development (ESD), plays a prominent role in making the SDGs viable. For Madagascar, this applies particularly to primary education, as most of the population only completes primary school [2]. To promote ESD, teacher education has been highlighted as a priority area, as teachers are seen as multipliers of knowledge [3]. However, teachers and pre-service teachers worldwide show knowledge gaps regarding ESD and the SDGs [4-9]; among them are teachers in Madagascar [10]. These knowledge gaps call for a reorientation of teacher education in Madagascar towards ESD that allows teachers to act as change agents to achieve the SDGs. The identification of learning objectives for ESD-relevant teacher qualification remains a "complex task" [11] (p. 631). Such learning objectives can be explicitly linked to the 17 SDGs [12], under consideration of context or lifeworld orientation [13]. The latter is of particular relevance for countries such as Madagascar, as

ESD-Related Knowledge of Malagasy Teachers
In Madagascar, ESD-related research predominantly focuses on environmental education [10,19-22]. Studies revealed that primary school teachers have difficulty relating environmental issues to regional examples [10,23]. For example, teachers have limited knowledge about lemurs, their diversity and role for conservation [23,24]. Furthermore, primary school teachers in the northeastern Alaotra region had a Eurocentric perspective in teaching, as they referred to bush fires and charcoal rather than local invasive fish species as major environmental threats [10]. In addition, the researchers obtained the impression that the interviewed teachers "were citing facts without deeper understanding of interrelationships or reasons" [10] (p. 79). This is similar to South Africa, where teachers "[fail] to develop deeper conceptual depth and understanding of environment and sustainability, as issues-based knowledge dominates: For example, knowledge of climate change as an issue will be shared, but teachers [ . . . ] fail to consider what can be done about it" [7] (p. 32). Such knowledge gaps constitute major hindering factors for effectively implementing environmental education in Madagascar [24]. To date, little is known about teaching prerequisites of Malagasy primary school teachers beyond the field of environmental education. To adequately prepare Malagasy primary school teachers for ESD, they need to be equipped with ESD competencies, including ESD-relevant knowledge that is closely linked to the local context [12]. For a reorientation of teacher education towards ESD in NE Madagascar, data on teacher knowledge for primary education about regionally relevant SD issues are required.

Teacher Qualifications in Madagascar
In the past, the educational system in Madagascar underwent constant changes [25], as did teacher education, leading to a diversity of teacher qualifications in Malagasy primary schools [25]. Current certificates qualifying for teaching in primary schools include, e.g., the Certificat d'Aptitude d'Enseignement (CAE), the Certificat d'Aptitude Pédagogique (CAP) or the Certificat Fin de Formation Pédagogique (CFFP), that can be obtained in the Regional Centers of the National Institute of Pedagogical Training (CRINFP) [25].
However, not every primary school teacher has necessarily had an initial pedagogical training. Since the year 2000, parent associations (FRAM) have been recruiting teachers for public primary schools in order to decrease the high student-teacher ratio [26]. The so-called "enseignants non-fonctionnaires (ENF)" do not necessarily have an initial qualification but receive in-service training [25].
In 2019, 59% of the Malagasy primary school teachers had not had an initial pedagogical training, the highest share among 14 Sub-Saharan African states [27]. In Madagascar, NGOs contribute to teacher education, e.g., by environmental training [20]. Yet, the growing share of underqualified teachers in Madagascar contributes to a decreasing educational quality [10,27,28]. These impacts are even stronger in rural than in urban areas, as rural teachers appear to have lower education and qualifications [29]. A recent study in the SAVA region in NE Madagascar showed that teachers in rural areas significantly more often have a (non-subsidized) ENF status, attended initial pedagogical training less often, have a higher absence rate during lean seasons (annual hunger gap in Madagascar, [30]) and a lower personal satisfaction than teachers in urban schools [31].
Another difference between teachers in urban and rural areas concerns the gender distribution, with more women teaching in urban areas and more men in rural areas. Ratompomalala et al. [29] argue that the "hazardous security in rural regions and difficult access (discourage) women from other regions to work there" [29] (p. 164).
Due to the diversity of teacher qualifications, the teaching and learning prerequisites of Malagasy primary school teachers to teach ESD are probably heterogeneous. To identify starting points for reorienting teacher education towards ESD, the analysis of teacher prerequisites requires the consideration of different diversity dimensions.

Malagasy Primary Education and Education in the SAVA Region in NE Madagascar
The formal educational system in Madagascar is mainly based on the French system, but only five years of primary education are compulsory. As in many developing countries, Madagascar has a public-private education system [27]. Community schools represent the third school type, but are attended by fewer than 3% of all school children [28]. In 2018, the primary school completion rate in Madagascar was only 56% (27% for lower secondary education) [2], indicating that most Malagasy children only complete primary school.
Hurdles in the Malagasy education system are manifold, including insufficient material [29,32], poor condition of buildings and school yards [29,32,33], and long distances to school [32,34]. These hurdles differ predominantly between school locations (urban vs. rural) [27,28,32,33] as well as between school types (public vs. private) [27,28], being higher for rural and/or public schools. As a result, children show higher performance in urban compared to rural and in private compared to public schools [27,28]. These differences between schools of different types and locations are likewise present in the SAVA region [31]. However, the primary school completion rate in the province Antsiranana (87.3%), to which the SAVA region belongs, is higher than in other Malagasy provinces (41.0-72.3%), except the capital Antananarivo (87.5%) [28].
The different schooling conditions in rural compared to urban areas as well as public compared to private institutions suggest that the school type and school location should be considered in educational research in Madagascar.

Land-Use and Health Issues Relevant for ESD in the SAVA Region
ESD teaching needs to be adapted to the regional realities. This applies in particular to countries with challenging living conditions such as Madagascar [35]. In a previous study, we identified courses of action regarding regionally relevant land-use practices in the SAVA region (land-use context) and health-protective behavior (health context) for ESD teaching in Malagasy primary education [15].
The two contexts cover highly relevant SD issues that are present in the SAVA region. The protection of the unique biodiversity in Madagascar while sustaining local livelihoods is a key challenge for SD [36]. Coping with this challenge requires sustainable land use such as the sustainable management of cultivations and soils [37,38]. Compared to intensively managed monocultures such as rice paddies, agroforestry as practiced for vanilla (with tutor trees and shade trees) has an increased potential for biodiversity conservation [39,40]. In the SAVA region, rice production and vanilla agroforestry are common land-use practices [41].
Coping with health issues therefore requires improved health-conscious behavior [49].
Teaching the land-use and health issues relevant in the SAVA region in primary education increases the relevance of ESD teaching in NE Madagascar [15] and can contribute to achieving the SDGs 2, 3, 6, 12, and 15. Thus, learning more about teacher knowledge as prerequisites for teaching these land-use and health issues in schools is required.

Measuring Procedural Knowledge to Identify Teaching and Learning Prerequisites for Facilitating ESD
Facilitating teaching and learning with respect to the land-use- and health-related SDGs is a complex task for ESD. Previous studies investigating SD-related knowledge as teaching and learning prerequisites build on the model of knowledge types of de Jong and Ferguson-Hessler [16,18,50-53]. The knowledge model explicitly focuses on problem-solving and knowledge-in-use [50]. It differentiates between different knowledge types, among them procedural knowledge. Procedural knowledge is action-oriented and refers to knowledge about specific strategies or solutions and their implementation for problem-solving [50]. In contrast to other knowledge types, procedural knowledge requires the consideration of different perspectives, e.g., for evaluating a solution strategy [54]. For ESD, where teachers and learners are often confronted with complex challenges that lack clear-cut solutions, the consideration of different perspectives such as the ecological, economic and social perspectives, is highly relevant [12,55]. Therefore, procedural knowledge is an essential knowledge type to be fostered in ESD teaching [18,56]. For measuring procedural knowledge of Indonesian and German university students and pre-service teachers as (teaching and) learning prerequisites, Koch et al. [16] and Richter-Beuschel and Bögeholz [18] investigated the effectiveness of solution strategies ("courses of action" cf. [57]) to solve specific SD challenges. Due to the lack of clear-cut solutions, procedural knowledge that contributes to solving complex SD challenges is an expert question. Therefore, both studies [16,18] used expert benchmarks previously developed in Delphi studies. To assess teaching and learning prerequisites of pre-service teachers regarding ESD, Richter-Beuschel and Bögeholz [18] compared the mean expert ratings with the mean pre-service teacher ratings. This procedure was later refined for Item-Response-Theory (IRT) modeling. For this purpose, the rounded expert ratings served as a benchmark to allow modeling of teacher knowledge [17].
Following the approach of Koch et al. [16] and Richter-Beuschel et al. [56], we also conducted a Delphi study to define procedural knowledge for primary education for contributing to land-use- and health-related SDGs. The Delphi study provides the required benchmark for assessing corresponding teaching and learning prerequisites of Malagasy primary school teachers [15]. As a further development of the instruments of Koch et al. [16] and Richter-Beuschel et al. [56], the courses of action were estimated regarding their possibility of implementation in addition to the effectiveness estimations [15]. For the implementation ratings, the study participants considered extant routines, beliefs and resources that exist among the regional population [15]. Thereby, we could generate regionally adapted courses of action.
In addition, a think-aloud study with 10 primary school teachers in the SAVA region was conducted in parallel to the present study [58]. The study included five male and five female teachers working in rural (n = 6) or urban (n = 4) areas. The teachers of the think-aloud study were between 31 and 53 years old (mean: 37.8; standard deviation: 9). The think-aloud study revealed insights into the cognitive processes of the teachers. Especially, the study provides explanations of primary school teachers for their estimations of possibility of implementation regarding land-use and health courses of action [58].
As procedural knowledge is essential for problem-solving, it is highly relevant for coping with SD challenges that lack clear-cut solutions [16,18]. The expert benchmark and its underlying questionnaire, developed in the Delphi study by Niens et al. [15], allow us to assess teaching and learning prerequisites of primary school teachers for ESD by measuring procedural knowledge regarding land-use- and health-related SDGs.

Research Questions
Primary school teachers, especially in Madagascar, play a key role for ESD, and thus for education towards the SDGs. Among the SDGs, the land-use- and health-related SDGs 2, 3, 6, 12, and 15 are of curricular importance for primary education [59]. Therefore, fostering SD-related procedural knowledge in teachers constitutes a promising avenue for ESD. In this respect, the promotion of SD-related knowledge needs to be connected to regionally relevant SD issues [14,55]. A regionally focused reorientation of ESD teacher training requires learning more about teachers' prerequisites regarding SD-related procedural knowledge.
In a previous study, procedural knowledge of pre-service teachers was assessed by comparing the effectiveness estimations of study participants with an expert benchmark [18]. Applying this approach to investigate Malagasy primary teacher knowledge can give insights into teaching and learning prerequisites for procedural knowledge regarding the investigated contexts of land use and health. This leads to our first research question:
• RQ 1: In which ways do primary school teachers' estimations of effectiveness and implementation of courses of action differ from the expert benchmark?
The procedure for measuring procedural knowledge has been continuously refined since the first approach of Koch et al. [16-18]. The latest approach applied IRT modeling using dichotomous items. For dichotomizing the items, the expert ratings had to be rounded to the nearest integer, affecting the precision of the expert benchmark [17]. This leads to the second research question, regarding a further improvement of modeling procedural knowledge:
• RQ 2: In which ways can the procedural knowledge of Malagasy primary teachers regarding land use and health be adequately modeled?
For the questionnaire on procedural knowledge regarding land use and health, qualitative expert comments [15] contributed to content validity. However, a validation of the measurement instrument on procedural knowledge with other constructs is still missing. This leads to research question three regarding validation purposes:
• RQ 3: In what way do(es) the resulting dimension(s) of procedural knowledge modeling (e.g., land-use and health knowledge) correlate(s) with (i) self-efficacy beliefs regarding environmental and health teaching, (ii) teaching experience, (iii) self-rated knowledge on agricultural cultivation, and (iv) age?
Once we achieve a reliable and valid measurement of procedural knowledge, the questionnaire data provides differentiated information on teachers' procedural knowledge regarding land use and health. This leads to research question four:
• RQ 4: What are the strengths and weaknesses of Malagasy primary teachers regarding land-use and health procedural knowledge?
To identify starting points for further development of teacher training, it is valuable to identify groups of teachers with particular needs, as well as groups with particular potential for facilitating ESD. This leads to research question five:
• RQ 5: In which way does the procedural knowledge of primary school teachers differ regarding diversity dimensions such as school education, teacher training, school type (public vs. private) and school location (rural vs. urban) as well as gender?

Measurement Instruments
The present study focused on teachers' procedural knowledge regarding land-use (SDGs 12 and 15) and health (SDGs 2, 3, and 6) issues that are regionally relevant. Following the procedure of Koch et al. [16] and Richter-Beuschel and Bögeholz [18], the questionnaire of the Delphi study of Niens et al. [15] can be used for measuring procedural knowledge of Malagasy primary school teachers (Figure 1). Therefore, we take the expert answers as a benchmark [15]. In a previous study, Richter-Beuschel and Bögeholz [17] defined the procedural knowledge of pre-service teachers by measuring the differences of their estimations from the expert benchmark using dichotomous coding. In the present study, we apply polytomous coding to define the teacher knowledge by the deviations of the teacher estimations from the expert estimations. Thus, we use the Partial Credit Model [60] for modeling teacher procedural knowledge.
Figure 1. Approach for measuring and modeling land-use and health procedural knowledge of Malagasy primary teachers. Grey: Delphi study [15]; blue: teacher studies on procedural knowledge, including a questionnaire and a think-aloud study [58].
The questionnaire on procedural knowledge includes land-use and health issues [15]. The land-use part of the questionnaire comprises 20 courses of action in three topics: Management of vanilla cultivations, Management of cultivations other than vanilla, and Soil management (Appendix A). The topic Management of vanilla cultivations contains courses of action regarding the cultivation of vanilla, a liana that is cultivated in agroforestry systems in the SAVA region, with tutor trees and shade trees. The courses of action in the topic Management of cultivations other than vanilla refer to sustainable rice cultivation as well as sustainable management of cultivations in general. The topic Soil management includes courses of action relating to sustainable soil management in vanilla cultivations, rice cultivations and arable land in general. The health part of the questionnaire contains 21 courses of action in four topics: Consideration of clean water, sanitation, and hygiene; Consideration of food hygiene and healthy diet; Prevention of (serious) illness; and Risk avoidance (Appendix B). The topic Consideration of clean water, sanitation, and hygiene includes courses of action relating to daily hygiene practices such as hand washing and teeth brushing, latrine use, as well as drinking water treatment. Courses of action in the topic Consideration of food hygiene and healthy diet relate to the preparation of safe and healthy meals that support a balanced diet. In the topic Prevention of (serious) illness, the courses of action refer to, e.g., malaria prevention, as well as behavior in case of illness or injury, such as consultation of doctors or a health center. The topic Risk avoidance includes courses of action relating to, e.g., traffic risks, avoidance of polluted air, and body protection during pesticide use.
Each course of action was estimated regarding its effectiveness in one or two field/s of action and one or two implementation setting/s (see Tables 1 and 2). This results in nine subscales for the land-use context and twelve subscales in the health part (for further description of item composition see Section 2.4). As in previous studies on SD-related procedural knowledge [16,18,56], the answering format was a four-point Likert scale (1: ineffective, 2: little effective, 3: effective, 4: very effective/1: impossible to implement, 2: difficult to implement, 3: possible to implement, 4: easy to implement).
Prior to the assessment of procedural knowledge, the teachers answered questions on socio-demographic data regarding age, sex, type of school, employment status, years of teaching experience, and educational background (Figure 1). Furthermore, they rated their own knowledge regarding vanilla cultivation as well as their knowledge regarding rice cultivation on a Likert scale from 1: no knowledge at all to 5: very good knowledge (Figure 1).
In addition to this, we used further instruments for validation, i.e., a questionnaire on self-efficacy beliefs developed by Moseley et al. [61] (Figure 1). The original instrument, the Environmental Education Teaching Efficacy Beliefs Instrument (EETEBI), consists of 13 items regarding Personal Environmental Education Teaching Efficacy (PEETE) and 7 items regarding Environmental Education Outcome Expectancy (EEOE). The PEETE focuses on teacher perceptions regarding their own capabilities to teach environmental education. The EEOE, in contrast, focuses on the general outcome that teachers expect from environmental education, independent from their own capabilities. For construct validation of land-use procedural knowledge, an instrument of self-efficacy beliefs needs to have certain similarities [62]. Therefore, and in order to keep the study questionnaire short, the present study only considers the items on PEETE. The items were translated into Malagasy and adapted to the local context. Furthermore, the items were adapted with a focus on health education (Personal Health Education Teaching Efficacy, PHETE) and added to the questionnaire, resulting in 26 items in total. In most items of the PHETE, the term "environment/environmental" was simply replaced by "health". Item 16 from the original instrument, referring to outdoor teaching skills for environmental education (cf. [61]), was replaced by "I have teaching skills to teach about illnesses that often affect students and their relatives to effectively teach about health". Thereby, we kept the PHETE as close as possible to the PEETE regarding structure and content. According to Moseley et al. [61], the study participants could answer on a six-point Likert scale (1: strongly disagree, 2: disagree, 3: somewhat disagree, 4: somewhat agree, 5: agree, 6: strongly agree). All study participants, except one, answered the items on PEETE and PHETE (n = 285).

Sample Composition
In total, 286 teachers from rural and urban areas participated in the study. The village choice for the teachers from rural areas was based on 60 randomly selected villages in the SAVA region [41]; all villages were at most 10 km from tertiary roads [41]. Based on official school lists provided by the Direction Régionale de l'Éducation Nationale SAVA, we randomly selected 300 teachers from primary schools from 30 different villages and an additional 40 teachers from primary schools in the four cities of the SAVA region (Figure 2). The selected participants were invited at least one week in advance. However, 110 of the selected teachers were absent on the day of study conduction; they were mainly replaced by colleagues of the same school.
This led to a sample composition of 250 teachers in rural and 36 teachers in urban schools. In total, 161 participants are male (age span: 17-65; mean: 37.65; standard deviation: 10.90) and 123 participants are female (20-67; 38.23; 11.56), two not indicated. The teachers work in public (n = 173), private (n = 103) or community schools (n = 8), two not indicated. The highest school degree is either the Certificat d'Étude Primaire Élémentaire (CEPE, primary) (n = 5), the Brevet d'Étude Primaire Complémentaire (BEPC, lower secondary) (n = 182) or the Baccalauréat (BACC, higher secondary) (n = 99). Only n = 74 teachers have a diploma or certificate following an initial pedagogical training, such as the CAE, CAP, or CFFP (Section 1.2). A further n = 206 teachers have not had such training or only had in-service training, e.g., the training provided for ENF teachers. Six teachers did not provide any information about initial teacher training. Through the random selection of study participants, we aimed at drawing a sample that corresponds to the socio-demographic distribution of primary school teachers in the SAVA region.
Due to the high number of 110 replacements of the original random sample (see above), we only reached an approximate representation. For example, based on information of official documents provided by the Direction Régionale de l'Éducation Nationale SAVA, 65% of the primary school teachers in the region are males and 35% are females [66] (our sampling: 56% male; 43% female). Furthermore, 82% of the primary teachers teach in public and 18% in private schools [67] (our sampling: 60% public; 38% private).

Study Conduction
The questionnaire study was conducted with tablets and the open-source KoBo Collect app [68], using XLS programming. A team of seven local assistants and one local team leader were involved in data collection. This allowed one-to-one settings with the teachers in which the assistants read out loud the quantitative questionnaire and entered the study participants' answer on the tablet. The team was accompanied by a doctoral student. Following a one-week training, the team conducted a pilot study with seven primary school teachers. During a subsequent reflection workshop the team got final instructions to ensure a standardized procedure in data collection. The study was conducted entirely in Malagasy.
Each participant got a written data protection declaration that was explained by the assistant and contained contact information. The interviews only started after informed consent.
As a first step, the assistants explained the questionnaire structure and handed over printed Likert scales of the questionnaire to the teachers. To increase standardization, the assistants were trained to read out the courses of action with an intonation that increases the comprehensibility. Courses of action with unfamiliar terms were supported by drawings (i.e., a tippy-tap, an installation for hands-free hand washing, and a water hyacinth, an invasive plant). In case of uncertainty, the teachers had the option to skip a question. Thus, n = 47 teachers did not estimate the eight vanilla-related courses of action of the questionnaire. The total processing time of the questionnaire, including the socio-demographic questions, self-rated knowledge, the questionnaire on procedural knowledge, and the questionnaire on self-efficacy beliefs, was approximately 30 to 50 min. All participants received a small gift for participation.

Data Analysis
In the questionnaire on procedural knowledge, the study participants estimated the 41 courses of action (land use: 20; health: 21) for three purposes (Tables 1 and 2). This results in three subscales per topic, leading to nine subscales in the land-use and twelve subscales in the health context [15]. For answering RQ 1 on how the teachers' answers differed from the expert estimations [15], we compared the mean estimations per subscale (Appendices A and B), using two-tailed t-tests. Additionally, we calculated Cohen's d for reporting the effect size [69].
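The group comparison described above can be illustrated with a minimal sketch. It assumes two independent groups of ratings, uses the pooled-standard-deviation form of Cohen's d, and computes Welch's t statistic; the text does not state which t-test variant was applied, so the Welch form is an assumption, and all rating values below are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d based on the pooled standard deviation of two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def welch_t(group_a, group_b):
    """Welch's t statistic (unequal variances); an assumed variant for illustration."""
    n_a, n_b = len(group_a), len(group_b)
    standard_error = sqrt(variance(group_a) / n_a + variance(group_b) / n_b)
    return (mean(group_a) - mean(group_b)) / standard_error


# Hypothetical per-person mean estimations on one subscale (four-point Likert scale).
expert_estimations = [3.5, 3.75, 3.25, 3.5, 3.0]
teacher_estimations = [3.0, 2.75, 3.25, 2.5, 3.0, 2.5]

print("Cohen's d:", cohens_d(expert_estimations, teacher_estimations))
print("Welch t:", welch_t(expert_estimations, teacher_estimations))
```

A positive d in this sketch would indicate that the first group (here, the experts) rated the subscale higher on average than the second group.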
For IRT modeling of the teachers' procedural knowledge (RQ 2), we used the expert ratings from Niens et al. [15] as a benchmark. For each of the 21 subscales of the questionnaire, we created a ranking of the courses of action based on the mean estimations by the experts per course of action (Table 3). Accordingly, we created rankings per subscale for each teacher that participated in the study (Table 3).

Table 3. Example of a mean expert and a teacher ranking for a subscale with four courses of action.

Rank   Mean Expert Ranking (Benchmark)   Example Teacher Ranking
1      Course of action A                Course of action A
2      Course of action B                Course of action C
3      Course of action C                Course of action B
4      Course of action D                Course of action D

In the next step, we compared the teacher rankings with the expert benchmark and checked if the teachers produced an appropriate ranking order. Therefore, we looked at each course of action separately and checked if the ranking relation to the other courses of action of the subscale displays the relations of the expert ranking. Thus, each subscale equals one item for IRT modeling.
In the following, the applied scoring will be explained with the example illustrated in Table 3:

• Based on the expert benchmark, course of action A has a higher rank than course of action B (A > B). If the teacher ranked A > B (no deviation) or A = B (small deviation), she/he got 1 point. If the teacher ranked A < B (large deviation), she/he got 0 points.
• For the few special cases where the expert rankings included courses of action with the same rank (A = B), we referred to the concrete estimations of the courses of action on the Likert scale to differentiate between no/small deviations (1 point) and large deviations (0 points). If the difference between both estimations of the same course of action on the Likert scale was 0 (no deviation) or 1 (small deviation), the teacher got 1 point. If the difference between both estimations of the same course of action was 2 or more (large deviation), the teacher got 0 points.
The resulting total score per item was standardized by the number of ranking relations per item, expressed as a percentage score from 0 to 100 for each item. For example, an item (a subscale) with four courses of action has six ranking relations (Table 3): A > B; A > C; A > D; B > C; B > D; C > D. Thus, the maximum total score is six points. For the standardized score, the total score is thus divided by six. In the teacher example, five of the six relations are correct; only the relation between B and C is reversed (the teacher ranked C > B). Thus, the total score is five points, leading to a standardized score of 83. If a teacher did not estimate a course of action, she/he got zero points for the respective ranking relations. This was the case for 47 teachers that did not estimate the vanilla-related courses of action (see Section 2.3). Therefore, missing answers regarding the estimation of the courses of action led to a lower ranking score, associated with lower procedural knowledge.
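The scoring of ranking relations can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure as implemented in the study: the function name and data layout are hypothetical, and the special case of tied expert ranks (resolved via the Likert estimations) is deliberately left out.

```python
from itertools import combinations


def ranking_score(expert_rank, teacher_rank):
    """Percentage of pairwise ranking relations matching the expert benchmark.

    Both arguments map a course of action to its rank (1 = highest rank).
    Assumption: expert ranks are strict; tied expert ranks are not handled.
    A course of action missing from teacher_rank earns 0 points for its pairs.
    """
    points = 0
    relations = 0
    for a, b in combinations(expert_rank, 2):
        if expert_rank[a] == expert_rank[b]:
            raise ValueError("tied expert ranks are not covered by this sketch")
        relations += 1
        # Order the pair so that `high` is the action the experts ranked higher.
        high, low = (a, b) if expert_rank[a] < expert_rank[b] else (b, a)
        if high not in teacher_rank or low not in teacher_rank:
            continue  # skipped estimation: 0 points for this relation
        if teacher_rank[high] <= teacher_rank[low]:
            points += 1  # same order (no deviation) or tie (small deviation)
    return 100 * points / relations


# Example from Table 3: the teacher swaps courses of action B and C.
expert = {"A": 1, "B": 2, "C": 3, "D": 4}
teacher = {"A": 1, "C": 2, "B": 3, "D": 4}
print(round(ranking_score(expert, teacher)))  # 83
```

Leaving out one course of action in the teacher ranking removes all points for the relations that involve it, which mirrors the treatment of skipped estimations described above.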
For analyzing the data on procedural knowledge regarding land use and health, we applied the Partial Credit Model [60]. We conducted the Item Response Theory (IRT) analysis using Acer ConQuest 4 [70]. To create polytomous items for IRT modeling, we divided the standardized scores from the rankings into four competence levels (categories): 0: "0-70", 1: ">70-80", 2: ">80-90" and 3: ">90-100". The high upper boundary of 70 for the lowest category is empirically justified, as the ranking scores largely scatter above 50. Therefore, we combined all scores up to 70 into the lowest category.
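The mapping of standardized scores to the four categories can be written as a short function. This is a minimal sketch; the function name is hypothetical, and the boundaries are taken directly from the category definitions above.

```python
def competence_category(score):
    """Map a standardized ranking score (0 to 100) to a category from 0 to 3."""
    if not 0 <= score <= 100:
        raise ValueError("score must lie between 0 and 100")
    if score <= 70:
        return 0  # "0-70"
    if score <= 80:
        return 1  # ">70-80"
    if score <= 90:
        return 2  # ">80-90"
    return 3      # ">90-100"


# Five of six correct ranking relations (about 83.3%) fall into category 2.
print(competence_category(100 * 5 / 6))  # 2
```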
First, we conducted one-dimensional (1D) modeling for the land-use and the health context separately, using an item-centered analysis. We checked for item misfits based on Weighted Mean Squares (wMNSQ) and item discrimination. We considered a wMNSQ value between 0.8 and 1.2 [71] and item discrimination >0.20 as good item fit [72,73]. No item had to be excluded due to misfit. However, some categories had to be collapsed to ensure a minimum number of responses (>5%) per category and the increase of the average person ability from low to high categories (Appendix C).
Second, all items on procedural knowledge in both contexts were modelled in 1D- and 2D-models. To compare the fit of the two models, we computed the deviance, Akaike's information criterion (AIC) [74], as well as the Bayesian information criterion (BIC) [75]. In addition, we tested for a significant difference between the 1D- and 2D-models using a χ² test. We furthermore calculated the Expected A-Posteriori reliability/Plausible Values (EAP/PV reliability; comparable to Cronbach's alpha used in classical test theory [76]) and person separation reliabilities based on Weighted Likelihood Estimates (WLE) [77]. Additionally, in the 2D-model, we examined the latent correlation between both dimensions.
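For orientation, both information criteria can be derived from the deviance with their standard definitions, as sketched below. The function names and all numerical values are hypothetical and serve only as an illustration; they are not the fit statistics of this study.

```python
import math


def aic(deviance, n_params):
    """Akaike's information criterion: deviance + 2 * number of parameters."""
    return deviance + 2 * n_params


def bic(deviance, n_params, n_persons):
    """Bayesian information criterion: deviance + parameters * ln(sample size)."""
    return deviance + n_params * math.log(n_persons)


# Hypothetical deviances and parameter counts for a 1D- and a 2D-model.
models = {"1D": (9000.0, 40), "2D": (8950.0, 42)}
for name, (deviance, n_params) in models.items():
    print(name, aic(deviance, n_params), round(bic(deviance, n_params, 286), 1))
```

For both criteria, the model with the lower value is preferred. The χ² test compares the difference between the two deviances, with degrees of freedom equal to the difference in the number of estimated parameters.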
To check for potentially biased items regarding our selected criteria for group comparisons (RQ 5), we applied Differential Item Functioning analyses (DIF) regarding gender, school education, school location (urban vs. rural), school type (public vs. private), and initial teacher training [78]. Based on Pohl and Carstensen [79], we did not consider differences below 0.4.
For validation (RQ 3), we analyzed latent correlations between the resulting dimension(s) of procedural knowledge and self-efficacy beliefs for environmental education (PEETE) and self-efficacy beliefs for health education (PHETE), using Partial Credit Modeling. First, we conducted two 1D-models for PEETE and PHETE. We considered the same thresholds for item misfit as for the modeling of procedural knowledge of land use and health, except for less restrictive boundaries of wMNSQ (0.7-1.3, [71]). Accepting less restrictive boundaries, we aimed to ensure a better representation of the self-efficacy constructs. For each model, one item had to be excluded due to misfit. The item addresses the idea to invite the principal into their own lessons, to evaluate environmental/health teaching (cf. Item 17 in the EETEBI, [61]). From 286 teachers, 36 refused to answer this question. This indicates that the question was sensitive in the SAVA region. To ensure a minimum of 5% of the participants per answer category, some categories had to be collapsed.
For analyzing the intended latent correlations between procedural knowledge and self-efficacy beliefs, we aimed at conducting one 3D-model with procedural knowledge and PEETE and PHETE (in case of one-dimensional procedural knowledge). In case of two procedural knowledge dimensions (e.g., land use and health), we aimed at conducting a 4D-model with both knowledge dimensions and PEETE and PHETE.
For further validation, we correlated procedural knowledge with age, teaching experience as well as self-rated knowledge regarding vanilla and rice cultivation. Therefore, we z-standardized the WLE person abilities of procedural knowledge with IBM SPSS Statistics 26. Three persons showed significant outliers (standardized WLE beyond ± 3.29 [80]) in the health context. Due to not carefully filled out questionnaires (conspicuous, too uniform response patterns), the data of these three persons were removed for further analyses. As the Shapiro-Wilk test indicated that the data was not normally distributed (p < 0.05), we used Spearman's ρ for correlation analysis.
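The outlier criterion can be illustrated with a short sketch. It assumes z-standardization with the sample standard deviation (n − 1 in the denominator); the function names and the example data are hypothetical.

```python
import statistics


def z_standardize(values):
    """Return z-scores based on the mean and the sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(value - mean) / sd for value in values]


def flag_outliers(values, cutoff=3.29):
    """Return the indices of values whose absolute z-score exceeds the cutoff."""
    return [i for i, z in enumerate(z_standardize(values)) if abs(z) > cutoff]


# Hypothetical person abilities: 19 identical values and one extreme value.
abilities = [0.0] * 19 + [10.0]
print(flag_outliers(abilities))  # [19]
```

In the study, flagged cases were removed only after inspection of the response patterns, so a check of this kind identifies candidates rather than deciding on exclusion.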
To answer RQ 5, we compared the standardized WLE person abilities of different groups of teachers, using two-tailed t-tests and Cohen's d [69]. Teachers that did not indicate their gender (n = 2) or their school type (n = 2) or did not provide information about initial teacher training (n = 6) (Section 2.2) were not considered in the respective group comparisons. As three of the standardized WLE person abilities of procedural knowledge were removed in the health context, the total sample size in the group comparison differs between the land-use and the health contexts.

Results
In the following section, the results are presented according to the five research questions. The numbers of the subsections correspond to the number of the research questions, e.g., Section 3.1 addresses the first research question and Section 3.2 the second research question.

Comparison of Teachers' Estimations with Expert Benchmark
The comparison of the teacher estimations regarding the effectiveness and the possibility of implementation of the courses of action showed significant differences from the expert benchmark in both contexts, land use and health (RQ 1, Table 4). Overall, the teachers systematically underestimated the courses of action (Appendices A and B). On subscale level, the differences between experts and teachers were significant for almost all fields of action and implementation settings, with medium to large effect sizes (Cohen's d [69]) (Table 4).

Table 4. T-test between estimations of experts and teachers (n of land-use experts = 15; n of health experts = 14, n of teachers = 286).

In the following section, the differences between teacher and expert estimations are described in more detail: first, the effectiveness estimations on scale and subscale levels, followed by the estimations of possibility of implementation, on scale and subscale levels.

We first take a closer look at the effectiveness ratings. In the land-use context, on scale level, the estimations regarding biodiversity conservation and regarding agronomic productivity showed high differences of a large effect size (Cohen's d ≥ 0.8; [69]). This pattern likewise occurred in the subscales of the land-use context, except for Soil management with a medium effect size (Cohen's d ≥ 0.5 < 0.8) for agronomic productivity (Table 4). In the health context, on scale level, the differences of the effectiveness ratings were significant and showed a medium effect size (Cohen's d ≥ 0.5 < 0.8). This likewise applies to the differences of effectiveness ratings on subscale level, except for Consideration of clean water, sanitation, and hygiene with a large effect size (Cohen's d ≥ 0.8) and Risk avoidance, where no difference occurred (Table 4).

Regarding the differences of the estimations of implementation, a significant difference with large effect size (Cohen's d ≥ 0.8) occurred on land-use scale level. The significant differences likewise appeared on subscale level, however, only with medium effect size (Cohen's d ≥ 0.5 < 0.8). In the health context, on scale level, the differences between the implementation ratings were significant with large effect size for both settings. On health subscale level, however, the differences showed a heterogeneous pattern and no difference occurred regarding the estimations of implementation in urban areas for Consideration of clean water, sanitation, and hygiene (in six out of eight cases large effect size and in one case medium effect size, Table 4).

IRT Modeling-Dimensionality of Procedural Knowledge Regarding Land Use and Health
To examine in which way the procedural knowledge can be adequately modeled (RQ 2), we first modeled land-use and health procedural knowledge in two separate 1D-models. The modeling was successful as all items of the 1D-models met the requested requirements of fit (Section 2.4).
Regarding the comparison of 1D- and 2D-modeling of procedural knowledge, the deviance, the AIC and the BIC indicated a better fit of the 2D-model (Table 5). The χ² test showed significant results. The latent correlation between the land-use and the health context in 2D-modeling was 0.31 (n = 286). The different characteristics of test quality showed satisfactory values for both dimensions of the model (Table 6). The EAP/PV values (land use: 0.78; health: 0.56) and WLE reliabilities (land use: 0.70; health: 0.55) showed acceptable values for both dimensions. The item fit was good for both models (wMNSQ between 0.8 and 1.2, [81]) and the discrimination reached good values above 0.3 for both dimensions [82]. However, the variance of 0.15 for the health dimension indicates that there was a low differentiation within the primary teachers in contrast to the land-use context (variance: 0.53). All in all, the reported results provide evidence for two dimensions of SD-related procedural knowledge.

Table 5. Comparison of fit statistics between one- and two-dimensional (1D, 2D) modeling of procedural knowledge regarding land use and health with Rasch Partial Credit Model (n = 286).

Figures 3 and 4 give insights into the item difficulties and person abilities of the two procedural knowledge dimensions (Figure 3: land-use context, Figure 4: health context). The modelings are based on polytomous items in the land-use context as well as dichotomous and polytomous items in the health context. The Wright Maps are presented with item steps.

Differential Item Functioning
As land use and health turned out to form two dimensions of procedural knowledge, we applied the DIF for each dimension separately in two 1D-models. The 1D-model of land-use procedural knowledge showed no considerable DIF (<0.4 [79]) regarding school education (maximum logit difference: 0.07), initial teacher training (0.10), gender (0.18), school type (public vs. private) (0.15) or school location (rural vs. urban) (0.20).
Additionally, no items in the 1D-model of health procedural knowledge showed considerable DIF regarding the tested groups of school education (maximum logit difference: 0.17), initial teacher training (0.33), gender (0.20), school type (0.15) and school location (0.11).
Thus, the test instrument is suitable for all investigated subgroups and allows analyzing the addressed diversity dimensions.

Validation
For the validation (RQ 3), we aimed at conducting a 4D-model with both dimensions of procedural knowledge (land use and health) and the two constructs of self-efficacy beliefs on environmental education (PEETE) and health education (PHETE). However, the multidimensional modeling with both constructs on self-efficacy beliefs resulted in a non-positive covariance matrix, hindering 4D-modeling. Therefore, we conducted two separate 3D-models, i.e., one 3D-model with procedural knowledge on land use and health with PEETE and one 3D-model with procedural knowledge on land use and health with PHETE. Furthermore, the WLE person abilities in procedural knowledge regarding land use and health were correlated with age, teaching experience, and self-rated knowledge.

Validation of Procedural Knowledge Regarding Land Use and Health with Self-Efficacy Beliefs
The item-centered 1D-modelings of PEETE and PHETE showed acceptable test characteristics (Table 7). The EAP/PV and WLE reliabilities are good, the item discrimination displayed good values above 0.3 [72], and the item fit is acceptable for both models (wMNSQ 0.7-1.3, [71]). The two 3D-models of the two dimensions of procedural knowledge (land use or health) with self-efficacy beliefs regarding environmental education and health education (PEETE and PHETE) showed low latent correlations (Table 8).

Table 8. Latent correlations of two 3D-models of (a) Procedural knowledge regarding land use and health with self-efficacy beliefs regarding environmental education (PEETE) and (b) Procedural knowledge regarding land use and health with self-efficacy beliefs regarding health education (PHETE) (n = 285).

Land-use procedural knowledge did not correlate with age or teaching experience. In contrast, health procedural knowledge did, indicating that the higher the age and the teaching experience, the higher the health procedural knowledge was (p < 0.01; Table 9).

Table 9. Manifest correlations between person abilities of the land-use (n = 286) and of the health (n = 283) procedural knowledge dimension and age, teaching experience, self-rated knowledge regarding vanilla cultivation, and self-rated knowledge regarding rice cultivation.

Land-use procedural knowledge showed low to medium positive correlations with self-rated knowledge on vanilla cultivation and self-rated knowledge on rice cultivation (p < 0.001; low correlation: r > 0.1 ≤ 0.3; medium correlation: r > 0.3 ≤ 0.5; high correlation: r > 0.5; [69]). Regarding health procedural knowledge, no correlations with self-rated knowledge on vanilla and rice cultivation existed (Table 9).

Strengths and Weaknesses Regarding Land-Use and Health Procedural Knowledge
The land-use dimension covers three different topics of sustainable land use in the SAVA region: Management of vanilla cultivation refers to vanilla agroforestry and Management of cultivations other than vanilla to rice cultivations as well as cultivations in general. Soil management includes sustainable soil management practices for vanilla cultivation, rice cultivation, as well as arable land in general. In the land-use procedural knowledge dimension, the average item difficulties ranged from −0.26 to +0.51 logits ( Figure 5, Table 10).
Regarding all three topics, it was the most challenging for the teachers to "correctly" estimate the possibility of implementation of land-use courses of action in rural life (most divergent rankings compared to the expert benchmark). The corresponding items have the highest item difficulties (OTHERc, VANc, SOILc; underlined in Figure 5 and Table 10). In contrast, in all three land-use topics, the items regarding the effectiveness on biodiversity conservation had the lowest item difficulties (VANa, OTHERa, SOILa; italic in Figure 5 and Table 10). Therefore, it was mostly easier for teachers to estimate the effectiveness of courses of action for biodiversity conservation ("correct" rankings compared to the expert benchmark) than to estimate the effectiveness of courses of action for agronomic productivity (VANb, OTHERb, SOILb; Table 10).
Comparing the three land-use topics, the three items regarding Soil management (SOIL) had lower item difficulties compared to Management of vanilla cultivations (VAN) and Management of cultivations other than vanilla (OTHER) ( Table 10).
The health dimension covers four different topics regarding health prevention in the SAVA region: Consideration of clean water, sanitation, and hygiene covers daily hygiene routines such as hand washing and teeth brushing, Consideration of food hygiene and healthy diet comprises the preparation of safe and balanced meals, Prevention of (serious) illness refers to general health care such as consulting a doctor and malaria prevention, and Risk avoidance covers the avoidance of traffic-related risks, protection against harmful substances and polluted air. The average item difficulties in the health procedural knowledge dimension ranged from −0.46 to +0.70 logits (Figure 5 and Table 10). As in the land-use dimension, the most difficult item in each of the four topics was the one referring to the implementation in rural life (WASHb, FOODb, ILLb, RISKb; underlined in Figure 5 and Table 10). Unlike the land-use dimension, the corresponding health items end with b instead of c due to the different questionnaire structure. The easiest item in all four topics was the item referring to the effectiveness on good health and well-being (WASHa, FOODa, ILLa, RISKa; italic in Figure 5). This indicates that it was easier for the teachers to give effectiveness estimations that result in "correct" rankings (compared to the expert benchmark) than to give estimations of possibility of implementation in rural life, which resulted in much more divergent rankings.
On a descriptive level, the three items of Consideration of clean water, sanitation, and hygiene (WASH), as well as the three items of Prevention of (serious) illness (ILL) had the lowest item difficulties compared to the other items of the health dimension (Table 10). The low item difficulties indicate that it was easier for teachers to make estimations for WASH and ILL items that result in rankings corresponding to the expert benchmark. In contrast, the three items of Risk avoidance (RISK) had the highest item difficulties (Table 10). The three items of Consideration of food hygiene and healthy diet (FOOD) had medium item difficulties (Table 10).

Comparison of Different Groups of Teachers
We compared the land-use and health procedural knowledge of different groups of teachers according to selected diversity dimensions (RQ 5), i.e., school education, teacher training, school type (public vs. private) and school location (rural vs. urban) as well as gender.
Due to diverging sample sizes in the analyses of the teacher groups regarding diversity dimensions, we give detailed information on the sample size in the following paragraphs.
The land-use and health procedural knowledge (WLE person abilities) did not differ between teachers with different educational background; teachers with BEPC (lower secondary degree) as highest school certificate did not show lower procedural knowledge than teachers holding a BACC (higher secondary degree) certificate (land use: 182 teachers with BEPC vs. 99 teachers with BACC; health: 180 with BEPC vs. 98 BACC; Table 11). Likewise, no difference between teachers with initial teacher training and without such training appeared (land use: 74 teachers with training vs. 206 teachers without training; health: 73 vs. 204; Table 11). Furthermore, there were no differences between land-use and health procedural knowledge of teachers working in public and private institutions (land use: 173 teachers in public institutions vs. 103 teachers in private institutions; health: 170 vs. 103) (Table 11).

Table 11. T-tests between person abilities of land-use (n = 286) and health (n = 283) procedural knowledge of different groups of teachers with effect size (-: Cohen's d not calculated due to missing significance).

[69]) and 0.54 logits higher in land-use procedural knowledge than female teachers (medium effect, Cohen's d ≥ 0.5 < 0.8).

Further differences with a large effect size (Cohen's d ≥ 0.8) appear between teachers in rural and urban schools: teachers from rural schools scored 0.72 logits higher in land-use procedural knowledge, resulting in a significant difference between both groups (land use: 250 teachers in rural schools vs. 36 teachers in urban schools) (Table 11). No difference is given with respect to health procedural knowledge (health: 247 teachers in rural schools vs. 36 teachers in urban schools).
According to the statistical distribution of male and female teachers in SAVA region (see Section 1.2), our study sample included more female teachers in urban schools and more male teachers in rural schools. Thus, we analyzed the differences between male and female teachers as well as teachers at urban and rural schools in more detail.
In a first step, we conducted a separate analysis for male and female teachers. The differences between urban and rural teachers (Table 11) likewise occurred as a tendency among male teachers; male teachers working in urban schools tended to show lower land-use knowledge than male teachers working in rural schools (p = 0.10). Regarding female teachers, a significant difference with medium effect size [69] occurred (p = 0.002; Cohen's d: −0.75); female teachers working in rural schools outperformed female teachers working in urban schools.
In a second step, we divided the sample by school location (urban vs. rural). The large differences between male and female teachers displayed in Table 11 likewise occurred for teachers working in rural schools: Males had a higher land-use knowledge compared to females (p < 0.001; Cohen's d: 0.51). Among teachers working in urban schools, no gender differences occurred (p = 0.264). However, the sample size of urban teachers was only n = 36.

Discussion
This study presents the first approach to analyzing Malagasy primary school teachers' procedural knowledge with IRT modeling. The study covers two contexts that are relevant for NE Malagasy primary education: land use and health. Thereby, this study provides starting points for including SDGs 2, 3, 6, 12, and 15 transversally in education, in teacher education as well as in regionally adapted ESD on primary level. We successfully applied the results of the preceding Delphi study as a benchmark for modeling teacher land-use and health procedural knowledge [15].
In the following section, the results of the study are discussed in the order of the research questions, focusing on comparison of expert and teacher estimations, dimensionality and test quality of modeling land-use and health procedural knowledge, validation, strengths and weaknesses of primary teacher procedural knowledge, and comparison of procedural knowledge between different groups of teachers considering relevant diversity dimensions.

Teacher Prerequisites for Land-Use and Health Procedural Knowledge for ESD
The direct comparison of teacher and expert estimations of effectiveness and possibility of implementation allowed insights in teaching and learning prerequisites for ESD-relevant procedural knowledge of primary school teachers (RQ 1).
The results revealed significant differences between teachers and experts in both the land-use and the health contexts. The differences are in line with deviations in effectiveness estimations regarding courses of action in selected SD issues comparing expert and pre-service teacher ratings [16,18]. The teachers in the present study predominantly underestimated the effectiveness and possibility of implementation of SD-related courses of action compared to the experts. It has to be taken into account that experts participating in Delphi studies tend to overestimate in their field of expertise [83].
In the land-use dimension, large deviations between teachers and experts appeared regarding the effectiveness estimations for biodiversity conservation and the effectiveness estimations regarding agronomic productivity. In the health dimension, large deviations occurred regarding the estimations of implementation in rural and urban life. Regarding the effectiveness estimations in the land-use context, the teachers might not be aware of how to determine long-term effects on biodiversity and agronomic productivity. This could lead to the significant underestimations of effectiveness. Regarding the implementation ratings in the health dimension, the teachers might have seen more barriers for the implementation than the experts were aware of. The think-aloud study with ten primary school teachers that complemented the present study revealed that teachers often referred to contextual factors (e.g., access to resources such as water, fruits, and vegetables, access to markets, or the local infrastructure) when estimating the possibility of the implementation of health courses of action [58]. This indicates that the teachers' estimations of implementation were closely connected to the teachers' local surroundings. Most teachers that participated in the study worked in rural areas (Section 2.2). In contrast, the experts mostly came from cities and made general estimations regarding the whole SAVA region. This might have reinforced the known overestimation effect that experts tend to underestimate barriers and obstacles (i.e., overestimate the possibility of implementation), such as cost barriers or infrastructural restrictions, that can appear in the inquired domain [81].

Dimensionality, Test Quality and Differential Item Functioning
In the present study, we successfully modeled land-use and health procedural knowledge with IRT, using Partial Credit Modeling (RQ 2). The test statistics support a 2D-model solution with land use and health as two dimensions of ESD-relevant procedural knowledge. The latent correlation of 0.31 between the land-use and health procedural knowledge dimensions supports the assumption that the two contexts represent distinct knowledge dimensions. Furthermore, the 2D-model showed satisfactory statistics of item fit, item reliability and item discrimination without any item misfit. The two 1D-models showed no considerable DIF for different groups of teachers (highest school education, initial teacher training, gender, school type, i.e., public vs. private, and school location, i.e., rural vs. urban). Thus, the instruments used are suitable to investigate teacher knowledge along these relevant diversity dimensions.
The Wright Map of the land-use procedural knowledge dimension in Figure 3 displays the item difficulties considering item steps. The distribution shows areas of very precise measurement, but also areas of imprecise measurement of person abilities, particularly in the area above +0.83 logits and below −0.72 logits. The Wright Map of the health procedural knowledge dimension in Figure 4 displays a better distribution of item difficulties including item steps compared to the land-use procedural knowledge. It mostly shows areas of precise measurement, except for person abilities above +1.06 logits.
The successful 2D-modeling provides a reliable measurement instrument for Malagasy primary teacher education for ESD. The reliability of the measurement of the land-use procedural knowledge is good with WLE: 0.7 and EAP/PV: 0.78. In contrast, the reliability of the health procedural knowledge is restricted but acceptable with WLE: 0.55 and EAP/PV: 0.56. It is above the "critical value of 0.50" that other studies with Partial Credit Modeling apply, e.g., for competencies of ESD (0.53 for the competence dimension "quantitative modeling" of decision-making competence, [82] (p. 16)).

Validation
The results give evidence for the validity of the interpretation of the test values gained by measuring land-use and health procedural knowledge (RQ 3). Two-dimensional modeling indicated two separate procedural knowledge dimensions (land use and health): thus, latent correlation revealed context-specificity of procedural knowledge.
The low latent correlations between procedural knowledge (in each of the two dimensions, land use and health) and self-efficacy beliefs for environmental education and health education in both 3D-models indicate validity. While the two instruments on self-efficacy beliefs (PEETE and PHETE) covered environmental education and health education as broad fields, the procedural knowledge in the land-use context and in the health context was measured with the courses of action belonging to selected, relevant, and concrete topics. The low correlation is in line with low correlations between procedural knowledge on biodiversity and climate change issues and self-efficacy beliefs for ESD teaching of German pre-service teachers [17].
The correlations of land-use and health procedural knowledge with age and teaching experience only showed significant results for the health dimension. Given the fact that the health procedural knowledge comprises courses of action that refer to daily life activities, the correlation of this knowledge with age is plausible. Furthermore, previous research indicates that teachers' Subject Matter Knowledge, which is comparable to the investigated procedural knowledge, increases with teaching experience [84]. Similarly, the scores in mathematics and reading comprehension of Malagasy primary school teachers significantly increase with teaching experience [27]. Considering the strong presence of health-related learning objectives regarding SDGs 2, 3, and 6 compared to the lower prevalence of land-use-related learning objectives regarding SDGs 12 and 15 in Malagasy primary school curricula [59], the results are plausible.
In addition to the latent correlations (modeling of procedural knowledge dimensions and self-efficacy beliefs), manifest correlations were computed with self-rated knowledge. The significant positive correlations of land-use procedural knowledge with self-rated knowledge regarding vanilla cultivation and self-rated knowledge regarding rice cultivation are plausible, since many courses of action in the land-use context refer to vanilla or rice cultivation (Appendix A). Thus, for procedural knowledge and self-rated knowledge, knowledge regarding equal topics was measured. Therefore, the significant positive correlations indicate validity.

Strengths and Weaknesses of Teachers Regarding Land-Use and Health Procedural Knowledge
A closer look at the item difficulties (Figure 5) revealed land-use- and health-related items belonging to procedural knowledge that were easier and those that were more difficult for Malagasy primary school teachers (RQ 4). The discussion points are illustrated by citations from the think-aloud teacher study [58] that was conducted with ten primary teachers in parallel with the questionnaire study on procedural knowledge. All selected citations represent an opinion shared by at least three teachers and are followed by the individual code of the study participant.
Regarding land-use procedural knowledge, the teachers performed well for items that included rankings of effectiveness on biodiversity conservation (Items VANa, OTHERa, SOILa; Figure 5). Considering the long tradition of environmental education in Madagascar and the various trainings on environmental education provided for Malagasy teachers by NGOs, this appears plausible [20]. In contrast, programs promoting agricultural school education (addressing, among others, agronomic productivity) seem to be rare.
Comparing the three different land-use topics, the three items of Soil management showed a lower item difficulty than the items in Management of vanilla cultivations and Management of cultivations other than vanilla, e.g., rice. This is surprising, as courses of action for Soil management that depend on sufficient land access (L.1-Natural vegetation development, L.12-Crop rotation, L.19-Fertilization of hill rice, and L.20-Recommended soil recovery; Appendix A) turned out not to completely fit the regional realities due to land scarcity [15]. Furthermore, the experts of the preceding Delphi study often mentioned the need for technical supervision for the correct implementation of L.21-Monitoring the soil quality [15]. Nevertheless, it was relatively easy for teachers to make good estimations for all three Soil management items, resulting in rankings that corresponded to the expert benchmark. The teachers are well aware of land scarcity as a hindering factor for implementing Soil management courses of action; in the think-aloud study, available arable land was often mentioned as a relevant factor for the implementation [58]. In addition, the teachers often mentioned existing habits as relevant factors for the implementation of sustainable soil management [58]. For example, one teacher explains: "Sometimes, it is the crops they are planting since always, it is still this what they will plant. [ . . . ] Following the soil quality, this is difficult for them!" [AE-01].
The easiest item in the land-use context was Item VANa (Figure 5). For this item, teachers had to create ranking orders for courses of action regarding Management of vanilla cultivations according to their effectiveness on biodiversity conservation. The courses of action refer to cultivation practices in vanilla agroforestry, the most important cash crop in the SAVA region [41] (e.g., L.4-Having a diversity of tutor trees, L.9-Cultivation of other crops on vanilla plantations, Appendix A). A qualitative study from 2016 with ten Malagasy primary school teachers indicates that some teachers cultivate vanilla besides their job at school [85]. Vanilla agroforestry has the potential to be a biodiversity-friendly land-use option [39,40]. It might be that teachers with personal experience in vanilla cultivation are aware of effects that the vanilla-related courses of action have on biodiversity, even if they underestimate these effects (see Section 4.1). In the think-aloud study, several study participants demonstrated detailed associations reflecting their profound land-use-related knowledge, particularly regarding vanilla cultivation [58]. For example, one male teacher at a rural school explains his estimation of L.6 regarding shade regulation on vanilla plantations: "Especially when it comes to vanilla, which it is about here, one needs always to prune the trees, but we need precision and time to do so. Like now, for example, [ . . . ] it is the rainy season. Before the next rainy season, one needs to prune them. Even if the vanilla plantation is more enlightened and the shades disappear, the vanilla lianas do not wither. [ . . . ] If the vanilla pods are mature, it does not pose any problem but if it is still small like this, one needs to well regulate the shade on the vanilla plantations." [AC-14]
However, topics in the land-use context, and in particular the vanilla-related courses of action, were challenging for other teachers. In total, 47 out of 286 teachers (16.4%) did not feel confident enough to estimate the eight vanilla-related courses of action (see Section 2.3). Among the teachers with missing answers regarding vanilla, 16 work in urban schools (out of a total of 36 urban teachers in the study sample). The missing answers might have led to the generally lower person abilities in the land-use dimension compared to the health dimension (0.34 logits difference of average person abilities in Table 6, Figure 5). In the health dimension, in contrast, only five teachers had missing values, with a maximum of two each.
In the health dimension, items that include rankings regarding effective courses of action for good health and well-being (Items WASHa, FOODa, ILLa, RISKa; Figure 5) were easier for teachers compared to rankings of courses of action on possibility of implementation. All courses of action in the health dimension refer to daily life practices (Appendix B). The results indicate that teachers have procedural knowledge regarding the effectiveness of such practices, but there are still knowledge gaps regarding the implementation possibilities.
Comparing the knowledge in the four different health topics, the teachers showed high performance for the three items regarding Consideration of clean water, sanitation, and hygiene (Items WASHa, WASHb, WASHc). The courses of action in this topic refer to daily hygiene routines such as hand washing and teeth brushing, but also latrine use instead of open defecation and preparation of safe drinking water. The low item difficulty of the effectiveness estimation item WASHa and the (rather) intermediate item difficulties of the implementation possibility in urban and rural life are plausible, since these courses of action are widely promoted in the SAVA region. This promotion takes place, for example, through water, sanitation and hygiene (WASH) supplies by UNICEF [86] or education on health and hygiene practices by vanilla exporters [87]. In the think-aloud study, two teachers explicitly mentioned such trainings [58]. One teacher explained his estimation and elaborated on the conditions that are beneficial for making use of the effective measure H.12-Having constructions for hands-free hand washing: "But still, as [the constructions] bring good health, it is effective if one does more sensitization for it. Things like this [tippy-tap] are not seen very often in the villages, but as it brings health, one makes an effort to sensitize the people." [AC-14]
Furthermore, the teachers showed high performance in the three items regarding Prevention of (serious) illness (Items ILLa, ILLb, ILLc). Primary schools play a crucial role in Madagascar for health prevention as the state provides medical treatment and health training through public primary schools [44]. Several courses of action included in the topic Prevention of (serious) illness are explicitly mentioned in the school curriculum (H.9-Avoiding mosquito nesting sites, H.11-The use of mosquito nets, H.16-Consultation of a doctor; Appendix B) [88].
Regarding malaria prevention (H.11-The use of mosquito nets), the Malagasy state distributed impregnated mosquito nets to a great share of the population [89]. Experts of the Delphi study described the use of mosquito nets as common practice in the SAVA region [15]. Accordingly, many teachers who participated in the think-aloud study referred to the free provision of mosquito nets [58], e.g.: "The possibility of implementation [of the use of mosquito nets], this is also easy to implement, because the state has already distributed mosquito nets to everybody" [AH-02].
The most difficult knowledge items belong to the Risk avoidance topic (RISKb and RISKc). The topic comprises, inter alia, courses of action regarding traffic education (H.22-Paying attention to fast vehicles, H.23-Respecting the security rules for driving) and pesticide use (H.21-The safe use of pesticides; Appendix B). The corresponding courses of action are considered to be highly relevant for regional primary education despite not yet being integrated into current school curricula [15]. The high difficulties of the first item steps of RISKb and RISKc already indicate that, for many teachers, basic procedural knowledge regarding the implementation of Risk avoidance courses of action deviates substantially from expert knowledge.

Teacher Procedural Knowledge Differences Regarding Educational Background, Gender, and School Location
The comparison of different teacher groups (RQ 5) reflecting diversity dimensions indicates that there are no differences in procedural knowledge between teachers with different educational backgrounds, be it school education (BEPC vs. BACC) or initial teacher training (no training vs. training). This is in line with a previous study on German pre-service teachers' SD-related procedural knowledge, where no difference in procedural knowledge between Bachelor and Master level student teachers occurred [18]. These results indicate that existing school education and teacher training in Madagascar do not yet explicitly promote land-use and health procedural knowledge. However, we did not differentiate between the different initial teacher trainings that currently exist or existed before the latest educational reforms (Section 1.2).
Among all group comparisons, the difference in land-use procedural knowledge between male and female teachers was the most striking. This difference can be explained with the fact that females generally have less access to agricultural training, e.g., provided by NGOs [41]. Furthermore, traditional gender roles are still prevalent in Madagascar [90], so that presumably men have more responsibility for productive agricultural work [91]. When it comes to vanilla in the SAVA region, men are rather involved in cultivation, while women are rather responsible for flowering and processing of the vanilla bean (personal observation). The questionnaire only covered the cultivation of vanilla, but not flowering and processing. This might have led to a better performance of male study participants in the land-use dimension. The differences between male and female teachers also occurred when the sample was separated by school location (urban vs. rural; see Section 3.5).
The high land-use procedural knowledge of teachers in rural schools can be explained by the high proportion of the rural population that is working in the agricultural sector. As a result, land use is a highly relevant topic in rural areas. Teachers in rural schools might have a greater connection to land use, compared to urban teachers. However, even if the primary school curricula include some learning objectives related to land use [59], the higher knowledge of rural teachers in the land-use context seems to have little effect on learning outcomes; in the SAVA region, rural schools have lower completion rates and face more hurdles compared to urban schools [31]. The phenomenon is plausible regarding the conception of the CEPE (primary school final exam). The CEPE refers to the nationally standardized curricula. In contrast, our investigated land-use knowledge, at least for vanilla and other cultivations, is regionally adapted to the SAVA region. Despite the higher land-use procedural knowledge of rural primary teachers, they are less qualified than teachers at urban schools, and they display lower job satisfaction (Section 1.2).
A regionally adapted school curriculum would not only enhance the relevance of the teaching content, but also allow primary school teachers to bring their land-use knowledge into the classroom and thereby increase quality education. Therefore, regionally adapted education could result in a substantial benefit for the society in the SAVA region.

Limitations
It is noteworthy that in both dimensions of procedural knowledge (land use and health), the items referring to the possibility of implementation in rural life displayed the highest item difficulties. This coincides with a special sample composition of the Delphi study, where most experts came from urban regions [15]. According to Niens et al. [15], this could have led to a potential bias of the expert benchmark. This might reduce the explanatory power of items regarding the possibility of implementation in rural life. However, as teachers in rural schools outperformed teachers from urban schools regarding land-use procedural knowledge (Section 3.5), the effect of the Delphi study sampling with mostly urban experts was probably small.
Regarding the collapsing of categories for IRT modeling, items with a low number of courses of action showed a reduced number of item steps (Section 3.2.1). For example, all three items regarding Soil management, containing nine courses of action, showed three item steps. In contrast, two topics in the health context only comprise four courses of action (Consideration of food hygiene and healthy diet and Prevention of (serious) illness) and two of the corresponding items were only dichotomous (Section 3.2.1).
More courses of action per item could have led to more differentiation of the ranking scores and thus more item steps, increasing the precision of the measurement instrument. Further studies should therefore consider a higher number of courses of action per item. Still, the IRT modeling in the present study was successful with a minimum of four courses of action per item, without any item misfit.
The IRT modeling of procedural knowledge resulted in two dichotomous (scoring 0/1), twelve trichotomous (0/1/2), and seven quadrotomous (0/1/2/3) items (Appendix C). Therefore, it has to be taken into account that items with higher maximum score values have more influence on the IRT model. For example, Soil management items with a maximum score of three had more influence on the model than trichotomous items. In contrast, the two dichotomous Items FOODc and ILLc with a maximum score of one had less influence on the model. As dichotomous items only occurred in the health dimension, the land-use dimension is less affected by the partly unbalanced analysis procedure. This is also reflected in the higher reliability values of the land-use dimension (0.78) compared to the health dimension (0.56) (Table 6). Still, the elaborated ranking procedure led to differentiated information for each single item and, thus, good (land use) and acceptable (health) reliability values (Section 4.2). Therefore, the analysis without weighting did not affect the general study results regarding the validation (Section 3.3), strengths and weaknesses of primary teachers (Section 3.4), and group comparisons (Section 3.5). However, to further increase the precision of the measurement procedure, future analyses could consider a weighting of the items, as done, for example, by Joachim et al. [92].
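To illustrate why items with more score categories can carry more weight, the following minimal sketch computes category probabilities and expected scores under the standard Partial Credit Model. It is not the analysis code of this study; the function names and the step difficulties are hypothetical values chosen only for illustration.

```python
import math

def pcm_category_probabilities(theta, step_difficulties):
    """Partial Credit Model: probability of each score 0..m for one item.

    theta: person ability in logits.
    step_difficulties: list of m step parameters (illustrative values only).
    """
    # Cumulative sums of (theta - delta_k); the sum for score 0 is defined as 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(theta, step_difficulties):
    """Expected item score for a person with ability theta."""
    probs = pcm_category_probabilities(theta, step_difficulties)
    return sum(score * p for score, p in enumerate(probs))

# Hypothetical items: one dichotomous (0/1), one with four categories (0/1/2/3).
dichotomous_steps = [0.0]
four_category_steps = [-1.0, 0.0, 1.0]

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(expected_score(theta, dichotomous_steps), 2),
          round(expected_score(theta, four_category_steps), 2))
```

In this sketch, the expected score of the four-category item spans a range of up to three points across abilities, whereas the dichotomous item can contribute at most one point to the raw score.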
Furthermore, the approach for data analysis shows a limitation. Under the procedure we applied for calculating the ranking scores (Section 2.4), teachers always received a full point when they gave the same rating for two courses of action in an item (because two courses of action on the same rank always constitute a small deviation, no matter which course of action was rated higher by the experts). Thus, teachers received the full ranking score if they gave the same rating on the Likert scale to all courses of action in one item. On average, this phenomenon occurred in 20% of the cases per item (predominantly for items with a low number of courses of action). However, it needs to be considered that the courses of action in the study questionnaire were not ordered according to the different topics (items). Furthermore, the study was conducted in an interview setting where the assistant read the courses of action out loud and filled out the questionnaire according to the teachers' answers. This reduces the probability that undifferentiated responses in an item (leading to high ranking scores) were caused by unreflected response behavior. Nevertheless, we have further improved the measurement instrument for future studies, so that the study participants explicitly rank the presented courses of action.
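The tie effect described above can be made concrete with a small sketch. The exact scoring rule is defined in Section 2.4 and is not reproduced here; the code below only assumes a simplified pairwise rule in which each pair of courses of action earns one point if the teacher's order matches the expert order or if the teacher rated both equally. The function name and all ratings are hypothetical, and the expert ratings are assumed to be distinct.

```python
from itertools import combinations

def sign(x):
    """Return -1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)

def pairwise_ranking_score(teacher, expert):
    """Hypothetical pairwise ranking score (an assumption, not the study's rule).

    One point per pair of courses of action if the teacher's order agrees
    with the expert order, or if the teacher gave both the same rating.
    """
    score = 0
    for i, j in combinations(range(len(expert)), 2):
        teacher_order = sign(teacher[i] - teacher[j])
        expert_order = sign(expert[i] - expert[j])
        if teacher_order == 0 or teacher_order == expert_order:
            score += 1
    return score

# Illustrative Likert ratings for an item with four courses of action.
expert = [5, 4, 2, 1]
differentiated = [4, 5, 2, 1]    # one pair ordered against the experts
undifferentiated = [3, 3, 3, 3]  # same rating for every course of action

print(pairwise_ranking_score(differentiated, expert))    # 5 of 6 pairs
print(pairwise_ranking_score(undifferentiated, expert))  # 6 of 6 pairs
```

Under this assumed rule, the undifferentiated response pattern reaches the maximum score, while a mostly correct but differentiated pattern scores lower, which is the limitation discussed in the text.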

Conclusions
Land use (relating to SDGs 12 and 15) and health (relating to SDGs 2, 3, and 6) are two facets of SD that provide promising starting points for including SDGs in ESD in the SAVA region [15]. The same applies to Malagasy primary education at the national level. Regarding both contexts, the land-use courses of action predominantly meet the needs of the SAVA region, whereas the courses of action in the health context address nationally relevant issues.
The present research gives valuable insights into primary school teachers' corresponding procedural knowledge as a prerequisite for facilitating ESD. We successfully applied the Partial Credit Model, identifying the two contexts, land use and health, as two dimensions of procedural knowledge. To this end, we successfully applied the results of the preceding Delphi study as a benchmark [15]. The approach represents a further development of the instruments for measuring university students' procedural knowledge by Koch et al. [16] and pre-service teachers' procedural knowledge by Richter-Beuschel and Bögeholz [17,18]. In particular, the application of rankings in data analyses for defining the deviation of teacher knowledge from expert knowledge is a new contribution, in addition to the results regarding primary teacher knowledge for ESD in the SAVA region.
To date, school education and initial teacher trainings seem to have little influence on teachers' highly relevant procedural knowledge on SD issues. This indicates that ESD is only marginally integrated in school and teacher training curricula. Besides the official state institutions that are responsible for teacher education, external bodies such as NGOs and bilateral development aid are a major driver for ESD [3,93]. In the SAVA region, the promotion of clean water and hygiene [86,87] as well as environmental education [94] seems to have an influence on teachers' corresponding procedural knowledge, as related WASH items as well as items regarding biodiversity conservation show lower item difficulties.
However, there are still knowledge gaps that need to be filled. To increase teachers' awareness on how effective sustainable land use can be for biodiversity conservation and agronomic productivity, stakeholders and teacher training institutions should consider integrating courses of action that are effective for both in teacher training. Given the higher item difficulties of items regarding agronomic productivity, stakeholders and teacher training institutions should furthermore consider broadening their training foci and promoting teachers' procedural knowledge in this field. Such trainings should explicitly target women, as they show lower land-use procedural knowledge.
Vanilla cultivation, as a special characteristic of the agricultural landscape in the SAVA region [41] and with potential as biodiversity-friendly land-use option [39,40], can provide examples for regionally relevant ESD teaching [15]. The results of the present study support the introduction of vanilla topics into primary education, as teachers showed high performance on vanilla-related items. However, it should be considered that not all teachers are familiar with vanilla cultivation, as 47 teachers did not answer vanilla-related courses of action (31 from rural schools and 16 from urban schools).
Teaching related to Soil management requires the consideration of regional factors, such as land scarcity [15]. The think-aloud study indicates that teachers are aware of land scarcity as a limiting factor for sustainable soil management (cf. [58]). Accordingly, the teachers showed high performance regarding Soil management procedural knowledge. These results speak for the integration of Soil management already in primary teaching [95]. Under consideration of regional factors, e.g., land scarcity, school children should learn why to conserve the soil whenever possible.
The group comparisons regarding the diversity dimension of school location showed that teachers from rural schools have particularly higher procedural knowledge regarding land use, despite being disadvantaged in other areas, compared to urban schools [31]. Many Malagasy teachers are not only teachers but also farmers (e.g., in the SAVA region, cultivating vanilla [85]). The agriculture-related knowledge of these teachers should be valued as an asset that they can bring into the classroom for regionally relevant ESD teaching, a potential that is not yet fully exploited. The importance of connecting teaching content to locally relevant examples has been echoed in several contexts of educational research, among them research on ESD in Africa [14,96,97]. Therefore, teachers should be encouraged to bring their existing knowledge of regionally relevant SD issues into the classroom, e.g., through agricultural excursions that are currently rare in the SAVA region [31].
Regarding health procedural knowledge, particularly the Risk avoidance topic needs more attention in the further development of teacher training. Despite its high relevance for primary teaching [15], it is to date absent in primary school curricula (cf. [88]), and corresponding procedural knowledge is low. Two of five courses of action in the Risk avoidance topic refer to traffic education. Despite the large distance to cities, fast vehicles such as motorcycles and corresponding risks likewise exist in rural areas. This applies particularly to the SAVA region, where vanilla cultivation and marketing have recently improved the economic situation of many households and thus enable people to own motorbikes [41]. As a result, teacher knowledge for teaching traffic education should be promoted, so that teachers are able to prepare children for related health risks. Furthermore, as the health items regarding implementation had higher item difficulties and the teachers' estimations had large deviations from the expert benchmark, health education and training should focus on implementation possibilities, linked to the direct environment of the schools. Linking the implementation possibilities to the local context could increase awareness of how to perform health-protective behavior.
As a result, ESD-oriented development of initial as well as continuous teacher education should not only promote knowledge in the promising fields of land use and health (related to SDGs 2, 3, 6, 12, and 15) but also provide tools for teachers on how to connect their local environment to the curricular content, and thereby enhance the relevance of their teaching. A (further) regional adaptation of parts of the national school curricula could significantly enrich the teachers' possibilities to connect their teaching to local realities, as current curricula "[lack] site-specific content" [98] (p. 36).
The identified regionally relevant land-use and health topics are not only suitable for teaching ESD in primary education; these highly relevant ESD topics also provide real-world learning opportunities for teacher education, e.g., through project-oriented learning [99,100]. Together with local stakeholders and supervising experts, teachers could address land-use and health challenges and "produce a workable contribution to solutions" [100] (p. 312) by applying sustainability concepts [100] and ESD approaches [12]. Such project-oriented learning opportunities could also be linked to agricultural excursions, e.g., to vanilla cultivations, as well as to projects related to road safety education. Respective participatory teaching and learning methods are proposed by the Global Action Programme [101] and are known to foster sustainability competencies such as knowledge regarding problem-solving, i.e., procedural knowledge [100,102].
In sum, the present study allows an evidence-based further development of primary school curricula as well as teacher training curricula and, thus, facilitates a transversal inclusion of SDGs into Malagasy education.
So far, educational research in Madagascar has predominantly focused on environmental education [10,20,21]. To the best of the authors' knowledge, the present study is the first contribution that explicitly focuses on ESD, covering two different facets, land use and health, that relate to highly relevant SDGs. In doing so, it uses recognized IRT modeling, which has rarely been used for ESD competence research up to now. The present study opens the field of educational research in Madagascar and encourages future work to broaden the view on ESD and its multiple facets that are worth being addressed in primary education.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data cannot be shared at this time, as further analyses are awaited. The data can be made available in due time upon request.

Acknowledgments:
Special thanks go to Béatrice Rasoanirina for her crucial support during data collection by leading the local assistant team and managing the field visits. Furthermore, we thank the local team members that were involved in quantitative data collection. We are grateful for the teachers who dedicated their time by contributing to the study, as well as the school directors of the selected schools who made the data collection possible. We also thank Rebecca Schneider for analyzing the data of the think-aloud study within the scope of her Master thesis. This publication was supported financially by the Open Access Grant Program of the German Research Foundation (DFG) and the Open Access Publication Fund of the University of Göttingen.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Table A1. Rated courses of action for procedural knowledge in the land-use context. Mean and standard deviation of teacher estimations (n = 286) and mean difference between estimations of teachers (n = 286) and experts (n = 15). M: mean, SD: standard deviation.

Table A2. Rated courses of action for procedural knowledge in the health context. Mean and standard deviation of teacher estimations (n = 286) and mean difference between estimations of teachers (n = 286) and experts (n = 14). M: mean, SD: standard deviation.