Assessing Environmental Attitudes and Cognitive Achievement within 9 Years of Informal Earth Education

Given the multitude of attitude scales, we examined the relationship between the 2-Major Environmental Values model (2-MEV) and the New Environmental Paradigm scale (NEP) based on a 6585 child sample over a 9-year period. The students participated in a three-day outdoor earth education program at field centers in three different US states (Arizona, Pennsylvania, and Louisiana). We further investigated the scales’ sensitivity to program effects, relating cognitive achievement and attitude with respect to a pro-environmental indicator of behavior (Y key). The NEP and Preservation correlated highly, while the subscales Utilization and Preservation showed a strong inverse relationship. Based on further reliability and validity scores, and in line with the literature, this pointed to a unidimensional Preservation of Nature scale as a concise attitude measurement. In structural equation modelling, Preservation related to knowledge gains and the Y key, and effects from Preservation on knowledge held true for all three states. This suggests Preservation as one factor influencing cognitive achievement and environmentally conscious performance. Regarding program effects, the Earthkeepers program seemed to induce pro-environmental shifts based on knowledge gains and attitude changes (Preservation increasing and Utilization decreasing). Pro-environmental shifts were most prominent for those who received the Y key.


Introduction
Environmental attitude is among the key concepts of environmental psychology, often referring to environmental concern. It forms a set of beliefs, affects, and behavior intentions towards the environment [1]. Eagly and Chaiken [2] refer to attitude as a person's evaluation of or affection towards an object or topic, which is associated with cognitive, psychosocial, and demographic variables [3]. There is consensus about the multidimensionality of environmental attitudes on the primary level, while the second order structure is less evident [4,5]. Some argue for one higher order factor [6,7]. Xiao and Dunlap [8], for instance, proposed eight primary level factors pointing toward one higher order factor, which is environmental concern. Others advocate two dimensions. The 2-Major Environmental Values model (2-MEV) reflects the dilemma of whether to protect or exploit nature on two orthogonal scales [9]. Milfont and Duckitt [10] confirmed the two-dimensional structure but added primary level factors. Thompson and Barton's [11] ecocentrism-anthropocentrism scales showed a similar pattern; they designed an instrument with three underlying perspectives but later merged the altruistic and egoistic dimensions. Building on Stern and Dietz's [12] value-basis theory, which derives attitude from values, others differentiated between the three higher order dimensions of the self (egoistic), other people (altruistic), and the biosphere [13,14]. Next to the question of dimensionality, there is a multiplicity of program-specific scales, which often lack comparability because of their application in isolated settings or with small sample sizes [15]. In research, questionnaires on attitude are often time-consuming. Particularly for children, they need to be compact so that a large number of items does not come at the expense of response quality.
We contrast two widely used instruments based on a large, age-homogeneous but state-diverse sample, with the New Environmental Paradigm scale (NEP) pointing towards one higher order factor and the 2-MEV pointing towards two, aiming at a concise environmental attitude scale. We then investigate program effects of an established earth education program to see to what extent the attitude scales are sensitive to its effects, and finally test the predictive power of the attitude scales on knowledge and behavior.
In the 1990s, the 2-MEV was developed, as there had been no appropriate tool to measure environmental attitude of adolescents; previous studies were mainly conducted with adults [9]. While at that time many scales were competing, the 2-MEV received the privilege of independent cross-cultural confirmation from different disciplines (e.g., [5,10,16,17,18]). Subsequently, the model has been used around the globe and translated into 33 languages, the very basis for its consistency being those bi-national studies; recently, e.g., Regmi et al. [19] applied the 2-MEV with 200 rural Nepalese students and confirmed its validity and reliability. The instrument's worldwide acceptance is rooted in its compact format and two-dimensionality, which covers preservation of nature (a biocentric perspective to protect our natural environment) and utilization of nature (an anthropocentric perspective to exploit nature) [20]. It allows the placement of individuals on a biocentric and an anthropocentric scale independently. This means individuals could score high in Preservation although they show tendencies to exploit nature (Utilization). Johnson and Manoli [21] and Schneller et al. [22] adapted the 2-MEV to American fifth to sixth graders who participated in earth education programs and cut the original number of items. This reduced scale showed stability and validity in an 8-year comparison [23]. Despite modifications regarding locality or length, the instrument has been shown to be internally consistent and reliable (e.g., [14]).
The NEP is among the most frequently used environmental attitude instruments, as it is less topic-specific than comparable ones introduced at that time, e.g., the Environmental Concern scale [24] or the Ecology scale [25]. Dunlap and Van Liere [26] had developed the 12-item NEP but eventually revised it to assess a key set of ecological worldviews, to balance pro- and anti-environmental items, to update terminology [27], and because the scale had faced disputes about its dimensionality. The modified 15-item NEP aimed at the new relationship between humans and nature in Western countries and placed individuals on a scale between feeling separated from or as an internal part of nature through its five interrelated domains. Since it relies on the social-psychology theory and on Rokeach [28], it reflects primitive or fundamental environmental beliefs and allows for predicting attitude or behavior [29]. Several studies yielded two to four latent factors, and with an increasing number of items, more but not necessarily reasonable factors are likely to emerge; if the eigenvalue of the main factor far exceeds the remaining ones, the scale should be treated as unidimensional. According to Dunlap [29], the NEP's dimensionality depends on the study purpose, and the potential to be used either way turned into a strength as it reflected the cohesion of a person's belief system. Based on this, we treat the NEP unidimensionally and use its modified version for children [30].
In a preceding paper, Manoli et al. [31] related the 2-MEV to the NEP and concluded that the two-dimensional 2-MEV allowed assessment and prediction of attitudes more accurately than the unidimensional NEP. They suggested that a person with high Preservation and low Utilization scores (2-MEV) is placed at the upper end of the NEP, which corresponds with a biocentric worldview. Low Preservation and high Utilization scores are at the opposite end of the NEP, which reflects an anthropocentric view on nature. Consequently, the NEP is restricted to place an individual on a continuum between biocentrism (pro-environmental) and anthropocentrism (anti-environmental), whereas the 2-MEV allocates a person to one of four quadrants (high Preservation and high Utilization scores, high Preservation and low Utilization scores, low Preservation and low Utilization scores, or low Preservation and high Utilization scores). The 2-MEV acknowledges that a person could score low or high in both Preservation or Utilization, while those aspects would merge on the NEP and result in a medium or zero score. Although the NEP is a valid and reliable instrument to assess environmental perceptions and useful to determine extremes (biocentrism vs. anthropocentrism), the 2-MEV can identify attitude sets, which allows for designing learning activities accordingly. We build on those findings by further relating both instruments.
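The four-quadrant placement described above can be illustrated with a short Python sketch. It is only an illustration: the function name `mev_quadrant`, the use of mean subscale scores on the five-point scale, and the cut-off at the scale midpoint (3.0) are assumptions made here and are not part of the 2-MEV or of the cited study.

```python
def mev_quadrant(preservation: float, utilization: float, midpoint: float = 3.0) -> str:
    """Place a respondent in one of the four 2-MEV quadrants.

    Scores are assumed to be mean subscale scores on a 1-5 scale.
    The midpoint cut-off is a hypothetical choice for illustration only.
    """
    p_level = "high" if preservation > midpoint else "low"
    u_level = "high" if utilization > midpoint else "low"
    return f"{p_level} Preservation / {u_level} Utilization"


# Hypothetical respondents: two people with the same Preservation score
# can still land in different quadrants, which a single continuum cannot show.
print(mev_quadrant(4.2, 1.8))
print(mev_quadrant(4.2, 4.1))
```

On a single continuum such as the NEP, the second respondent's two tendencies would offset each other, whereas the quadrant view keeps them apart.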
Environmental knowledge faces the issue of measurement and conceptualization. One approach is to regard environmental knowledge from three perspectives [32]: System knowledge covers facts and fundamental concepts, which reflect a basic, factual understanding. Action-related knowledge refers to understanding and implementing individual (pro-) environmental behaviors (e.g., turning off the lights when you leave the room, or taking a shower instead of a bath). Effectiveness knowledge is more abstract and enables students to reflect on the efficiency of pro-environmental actions on a broader scale (e.g., reducing carbon emissions by promoting public transportation). In general, factual knowledge about environmental issues appears to be low; yet it is challenging to draw conclusions as many studies use program-specific ad-hoc questionnaires, which often lack comparability, validity, or reliability (e.g., [33]). Providing such quality criteria requires time-consuming piloting and stepwise revision, which promotes the development of a universal scale. We therefore use a broadly applicable environmental knowledge scale that covers basic conceptual understanding (system knowledge), with action-related and effectiveness knowledge to be added in future studies.
From a traditional approach, knowledge is essential to environmental attitudes (e.g., [3,34,35]). In this linear model, knowledge raises awareness, which evokes attitude and, finally, ecological behavior (e.g., [36,37]). This is reasonable given that, e.g., people are more willing to reduce carbon emissions if they realize the impact of rising carbon dioxide levels. There is also reason to look at knowledge from the opposite direction, that is, knowledge resulting from attitude, as people are more willing to learn if they have a certain mindset [38]. Pro-environmental behavior is the main goal of environmental education, but there is uncertainty about which parameters contribute most [39]. Many researchers claim only a low to moderate relationship between attitude and behavior, but this perceived relationship might stem from the unclear dimensionality of environmental attitude or from methodology (e.g., [3]). Based on this, we investigated attitude effects on knowledge.
We conducted our study at similar earth education centers in three US states, where fourth to sixth grade students experienced the same three-day outdoor program, Earthkeepers. This is important because the more intense an informal learning program, the more likely it is to induce attitude changes [33,40,41]; once formed, attitudes are enduring and resistant to change [35,42,43]. Meaningful outdoor learning is presumed to enhance affectively based attitudes, which are associated with intrinsically motivated behavior, while the traditional school setting instead establishes cognitive attitudes associated with extrinsically motivated behavior [44]. Moreover, the younger the students are, the more fruitful it is to introduce pro-environmental attitudes [36,45,46], as positive attitude towards science declines in secondary schooling and is difficult to generate in older students [47,48]. This renders the earth education program not only the foundation of our study, but a key feature to further investigate environmental learning.

Objectives of the Study
Based on the large sample size, the present study focused on a multi-perspective approach to compare two established attitude scales (the 2-MEV and the NEP). This refers to (1) assessing the quality of the scales; (2) relating both scales; (3) determining their sensitivity to program effects; and (4) relating them to knowledge scores and the Y key, while we tested if our postulated structural equation model held true for three subsamples (students in Arizona, Pennsylvania, and Louisiana). This further allowed us to investigate the program effects of Earthkeepers. Our research questions were:

1. How do the 2-MEV subscales and the NEP relate based on a large, age-homogeneous but state-diverse sample?
2. What changes does the three-day residential earth education program induce on attitude and knowledge scores with regard to the behavioral indicator Y key?
3. How does attitude affect knowledge scores and behavior (Y key)? To what extent can we confirm our postulated structural equation model in a state comparison (AZ, PA, and LA)?

Participants and Procedures
We compiled the data of 6585 4th to 6th grade students (age: M ± SD: 9.96 ± 0.812) in three different states (27.2% in Arizona (AZ), 25.8% in Pennsylvania (PA), and 47% in Louisiana (LA)), all of whom participated in the same three-day outdoor earth education program, Earthkeepers. We further included control subjects from the same schools, who did not participate in Earthkeepers. Gender showed an even distribution with 40.9% females and 39.2% males (19.9% provided no information). It was not possible to collect demographics on race/ethnicity or economic status from individual students. The school districts, however, provided us with general information: most students came from a low to middle socioeconomic environment. In Arizona, the students were from 33 urban schools with a majority of Hispanic and non-Hispanic white students. Louisiana is represented with 34 schools, which are mostly from an urban area with a majority of African-American students, with others from suburban areas with a majority of white students. In Pennsylvania, predominantly white students from 19 elementary and middle, (sub-) urban and rural schools participated in the study. The questionnaire was administered 1-2 weeks prior to and 4-6 weeks after the three-day outdoor experience, which was a residential program in AZ and PA while students in LA commuted daily to the earth education center.

Earthkeepers Program
Framed as a "magical learning adventure", the earth education program Earthkeepers aims at gaining a deeper understanding of our natural environment and at developing pro-environmental behavior through outdoor activities. Organized around the word KEYS (for Knowledge, Experience, Yourself, and Sharing), the program consists of (1) a three-day outdoor experience at an environmental learning center and (2) follow-through back in the classroom (for detailed information, see [49]). The students can earn four keys (K, E, Y and S) to become an Earthkeeper. Students are invited to participate through a letter they receive from a mysterious character known only by the initials E.M. Each time a student earns one of the keys, they are able to open a locked box to reveal one of the secret meanings of E.M.'s name. For example, opening the K box shows that E.M. means Energy and Materials, the overall focus of the Knowledge portion of the program. Earning keys and opening boxes helps to create a sense of magic and adventure while also reinforcing the main points of the program.
During the three days at the center, participants receive two keys, Knowledge (K) and Experience (E), for learning about ecological concepts and for engaging in experiences like observations or discoveries. There are four Knowledge activities, each a participatory experience outdoors lasting 75-90 min, with a focus on big picture understanding. One focuses on the flow of energy in ecosystems, one on the cycling of materials, one on the interrelationships between living and nonliving things, and the final one on change over time. All aim to make abstract ecological concepts more concrete for the learners, and each includes an application component. Experience also consists of four outdoor, participatory activities, focusing on observation, discovery, immersion and solitude. These activities aim to help the students develop environmentally positive feelings, values, and attitudes. Other activities during the three days help the students to process what they are learning and experiencing and learn about ways that they can lessen their impact on the environment when they return home to work on the next two keys.
Back at school, the students earn the Y (Yourself) key if they lessen their impact on and deepen their feelings for the earth. They select individual tasks to save energy and to save materials and spend time in nature, all on a regular basis for at least one month to form new habits. Some suggestions are introduced during the program, e.g., saving energy by riding a bicycle instead of getting a ride in a car. Examples to reduce resources refer to water (taking a shower instead of a bath), electricity (turning off lights or heating), packaging (choosing environmentally friendly products at grocery stores), or recycling. The S (Sharing) key is for sharing experiences with others. It is important to note that not all students earn the Y and S keys; only those who provide evidence that they have done the required tasks receive the last two keys and become an Earthkeeper. Teachers encourage the students to do so, but completion is not mandatory.

Instruments
We used The Environment Questionnaire (TEQ) (e.g., [50]), which comprised the modified 16-item 2-MEV [21] and the 10-item NEP [30] slightly modified to use with children in the US. Language and content were simplified without changing the meaning of the items, and the 2-MEV was shortened to nine Preservation and seven Utilization items. In line with the literature, the students rated their answers on a five-point scale ranging from strongly agree to strongly disagree with a neutral middle of "not sure" [23,51].
Since using both scales resulted in a longer questionnaire, we restricted the knowledge items to system knowledge, as this is the focus of most environmental knowledge studies (e.g., [32]). The 13 items are applicable to the Earthkeepers program and address basic ecological concepts [52]: Energy (e.g., "Which of the following shows a food chain in proper order?"), materials cycling (e.g., "Take a look at your pencil. Were the materials that make your pencil ever part of something else?"), interrelationships (e.g., "I can do just one thing without affecting anything else."), and change (e.g., "If a place is now a desert it may someday be an ocean") (Table A1, Appendix A).
Environmental behavior is difficult to measure, particularly in children, and studies show little association between children's environmental attitude and self-reported behavior (e.g., [53]). In the present study, we used whether or not students earned the Y key as a proxy for environmental behavior, because they need to provide evidence that they have made behavior changes to earn the key.

Statistical Analyses
We used the Statistical Package for the Social Sciences (SPSS, version 24) and IBM SPSS AMOS version 24 [54]. Confirmatory factor analyses (CFA) and Structural Equation Models (SEM) visualize standardized values of correlations (double-headed arrows) and regression weights (single-headed arrows). Since we did not have complete information on the Y key, we used a maximum-likelihood solution, which is the recommended method for treating missing data. The models include random measurement errors, synonymous with error terms or unobserved variables; they help to differentiate the true score from unobserved parameters by compensating for unobserved distractors to allow a more accurate measurement, and it is possible to draw correlations among those to define further relationships [55]. Good models should provide a range of fit indices, e.g., a low Chi-Square (χ²) relative to the degrees of freedom followed by a non-significant p value (testing the null hypothesis). The Root Mean Square Error of Approximation (RMSEA) should be below 0.07, and values smaller than 0.03 present ideal fit. We chose the Comparative Fit Index (CFI) as a baseline comparison because it is not overly sensitive to sample size. The CFI should exceed 0.9 [56]. Though the Chi-Square is a valuable indicator, it is sensitive to sample size [57]. We therefore report the relative or normed Chi-Square, which relativizes the impact of sample size by dividing the Chi-Square by the degrees of freedom. Ideally, the score is below 5.0 [55]. There is no standard referring to the ratio of SEM complexity and sample size, but Kline [58] recommends a 10:1 (number of subjects: number of parameters) relationship, allowing us to draw complex models. For those, we first modelled CFAs for the scales separately to then relate them in a SEM, and to draw parameters among the latent variables to reflect our hypothesized relationships between attitude, knowledge, and behavior.
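The fit criteria named in this section can be collected in a small checking routine. The following Python sketch only encodes the thresholds stated above (normed Chi-Square below 5.0, CFI above 0.9, RMSEA below 0.07, and the 10:1 ratio of subjects to parameters); the function names and all numeric inputs in the example calls are hypothetical, and the sketch does not reproduce any AMOS computation.

```python
def normed_chi_square(chi_square: float, df: int) -> float:
    """Relative (normed) Chi-Square: Chi-Square divided by degrees of freedom."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return chi_square / df


def check_fit(chi_square: float, df: int, cfi: float, rmsea: float) -> dict:
    """Compare fit indices with the cut-offs named in the text."""
    return {
        "normed_chi_square_below_5": normed_chi_square(chi_square, df) < 5.0,
        "cfi_above_0.9": cfi > 0.9,
        "rmsea_below_0.07": rmsea < 0.07,
    }


def meets_ten_to_one(n_subjects: int, n_parameters: int) -> bool:
    """Rule of thumb: at least 10 subjects per estimated parameter."""
    return n_subjects >= 10 * n_parameters


# Hypothetical model output, for illustration only.
print(check_fit(chi_square=420.0, df=100, cfi=0.95, rmsea=0.04))
# 6585 is the sample size of this study; 120 parameters is an invented figure.
print(meets_ten_to_one(n_subjects=6585, n_parameters=120))
```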

Results
We examined the relationship between the two attitude measurements (2-Major Environmental Values model [2-MEV], New Environmental Paradigm scale [NEP]) based on the earth education program Earthkeepers. We first related the two instruments using CFA, correlation coefficients, and further validity and reliability scores. We then investigated program effects on attitude and knowledge to elaborate on the attitude scales when applied to an environmental education program which demonstrated success, to then position them in the attitude-knowledge-behavior relationship. Since we had a large sample size, we ran several analyses with subsamples to provide measurement invariance tests.

How Do the 2-MEV Subscales and the NEP Relate Based on a Large, Age-Homogeneous but State-Diverse Sample?
The CFA confirmed the theoretically proposed structures of the 2-MEV with its proposed two orthogonal subscales and the NEP (Figure 1). The primary factors (items) formed the higher order factors of Preservation, Utilization, and the NEP (one-dimensional construct of environmental concern), which we correlated to reveal their relationships. The factor analyses enabled us to estimate the relationships among the latent constructs (ellipses), which are assessed via multiple indicators (items or manifest variables and the error terms associated with them) [55]. Construct validity consists of convergent and discriminant validity [59]. Convergent validity (AVE) should exceed 0.5, which was true for Preservation (0.716) and Utilization (0.541) in contrast to the NEP (0.369), indicating that the 2-MEV subscales are similar enough to fall under one dimension. Discriminant validity is the square root of the AVE and should exceed the correlation (Preservation: 0.846, Utilization: 0.735, and NEP: 0.609). The correlations of the 2-MEV subscales with the NEP exceeded the score, while the correlation between Preservation and Utilization was below the threshold, indicating insufficient disparity among both. Pearson coefficients (r) confirmed the AMOS correlations, which were r = −0.874 (p < 0.001) for Preservation and Utilization, r = −0.831 (p < 0.001) for Utilization and the NEP, and r = −0.817 (p < 0.001) for Preservation and the NEP. With both latent variables inversely correlating, we ran an exploratory factor analysis in SPSS and a CFA in AMOS for the 2-MEV. The first yielded two factors explaining 51.94% of the total variance (Bartlett's test of Sphericity: p < 0.001, KMO = 0.915) with good loadings between 0.520 and 0.844 for Preservation but inconsistent loadings for Utilization (0.139 to −0.625), supporting the results of Figure 1. The eigenvalue of Preservation (6.845) exceeded the eigenvalue of Utilization (1.466), indicating one-dimensionality.
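The two validity criteria used here can be written down compactly. The following Python sketch assumes the common definition of the AVE as the mean of the squared standardized loadings and compares its square root with the absolute value of a construct correlation; the function names, the example loadings, and the example correlations are hypothetical, and taking the absolute value is an assumption of this sketch rather than a statement about how the values above were evaluated.

```python
import math


def average_variance_extracted(loadings: list[float]) -> float:
    """AVE as the mean of squared standardized factor loadings."""
    return sum(loading ** 2 for loading in loadings) / len(loadings)


def discriminant_validity_ok(ave_a: float, ave_b: float, correlation: float) -> bool:
    """True if sqrt(AVE) of both constructs exceeds their absolute correlation."""
    threshold = min(math.sqrt(ave_a), math.sqrt(ave_b))
    return abs(correlation) < threshold


# Hypothetical standardized loadings of a three-item construct.
print(round(average_variance_extracted([0.8, 0.7, 0.6]), 3))

# Square root of the AVE reported for Preservation (0.716).
print(round(math.sqrt(0.716), 3))

# Hypothetical construct pairs: a weak and a strong inverse correlation.
print(discriminant_validity_ok(0.60, 0.55, -0.40))
print(discriminant_validity_ok(0.60, 0.55, -0.80))
```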
The CFA with all items pointing at one factor to test one-dimensionality presented acceptable fit indices (CMIN/DF = 15.134, CFI = 0.964, RMSEA = 0.046), but three loadings from Utilization were below the internal consistency threshold [60]. The CFA thus pointed at one-dimensionality and inconsistencies of the latent construct Utilization. To elaborate on the constructs' reliability, we calculated McDonald's Omega (ω), as it does not require tau-equivalence, which we did not expect within our scales [61]. Omega coefficients for Utilization were low for the pre (ω = 0.642) and acceptable for the post (ω = 0.716) data. After excluding one item from the Utilization scale, which is "Weeds should be killed because they take up space from plants we need", reliabilities increased (pre: ω = 0.677, post: ω = 0.734), so we excluded it from further analyses, which is in line with previous findings having eliminated the same item (e.g., [17]). Preservation (pre: ω = 0.878, post: ω = 0.921) and the 2-MEV (pre: ω = 0.922, post: ω = 0.972) showed strong reliabilities, while the NEP (pre: ω = 0.785, post: ω = 0.771) yielded weaker Omega coefficients. Cronbach's alpha depends on the assumptions that each item contributes equally to its latent factor and that error variances are uncorrelated; if those are violated, reliabilities are underestimated [61]. This is supported by the lower alpha scores (Utilization: pre α = 0.632, post α = 0.710; Preservation: pre α = 0.877, post α = 0.911; 2-MEV: pre α = 0.914, post α = 0.922; NEP: pre α = 0.709, post α = 0.711), indicating that particularly the NEP fell less under one-dimensionality, whereas the proposed orthogonal scales of the 2-MEV pointed at one-dimensionality. Based on the recommendation of Manoli et al. [31] to use the 2-MEV instead of the NEP, on the high correlation between the 2-MEV subscales and the NEP, which indicates that both measure an identical construct, and on further reliability and validity scores, we proceeded with the 2-MEV as the more robust tool.
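To make the difference between the two reliability coefficients concrete, the following Python sketch computes Cronbach's alpha from raw item responses and McDonald's Omega from standardized loadings. It is a simplified illustration: the Omega formula shown assumes a one-factor model with uncorrelated error terms, the function names are hypothetical, and the example data are invented; the coefficients reported above were not obtained with this code.

```python
from statistics import variance


def cronbach_alpha(rows: list[list[float]]) -> float:
    """Cronbach's alpha from a respondents-by-items table of scores."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def mcdonald_omega(loadings: list[float]) -> float:
    """Omega for a one-factor model with standardized loadings.

    Assumes uncorrelated errors, so each error variance is 1 - loading**2.
    """
    common = sum(loadings) ** 2
    unique = sum(1 - loading ** 2 for loading in loadings)
    return common / (common + unique)


# Invented responses of four students to three items on a five-point scale.
responses = [[4, 5, 4], [2, 3, 2], [5, 5, 4], [3, 3, 3]]
print(round(cronbach_alpha(responses), 3))

# Invented unequal loadings, i.e., a scale that is not tau-equivalent.
print(round(mcdonald_omega([0.85, 0.70, 0.45]), 3))
```

Unlike alpha, the Omega formula weights each item by its own loading, which is why it does not rely on all items contributing equally.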

What Changes Does the Three-Day Residential Earth Education Program Induce on Attitude and Knowledge Scores with Regard to the Behavioral Indicator Y Key?
In a pre-post comparison with listwise exclusion, t-tests revealed statistically significant program effects on students who received the Y key and on those for whom we had no information on the Y key (Table 1). The students who did not earn the Y key showed smaller increases (Preservation and knowledge) or decreases (Utilization). Knowledge gains were most prominent, and Table 2 illustrates the item difficulties. Lower numbers indicate more difficult and higher numbers easier questions, so the items covered a wide range of difficulties, aligning with a good distribution [62]. Since Table 1 merges all nine years, we provided examples of two subsamples in Table 3, which illustrate the same trend of pro-environmental shifts. We included but are not reporting exhaustive details on control groups as controls have repeatedly shown no considerable changes for either the Earthkeepers (e.g., [63]) or other environmental programs (e.g., [37,64]). Furthermore, pro-environmental shifts are mostly prevalent right after program completion but drop later, so we deem it more important to focus on long-term effects than on controls. We therefore used the same questionnaire 6-8 weeks after the outdoor program to assess longer-term effects. We will nevertheless

How Does Attitude Affect Knowledge Scores and Behavior (Y Key)? To What Extent Can We Confirm Our Postulated Structural Equation Model in a State Comparison (AZ, PA, and LA)?
In structural equation modelling, path coefficients (beta regression weights; b) reflect the predictive power of explanatory variables. Figure 2 confirmed the high negative correlation between Preservation and Utilization already revealed in Figure 1. Both latent factors strongly related to their post scores, indicating program effects. The model showed a statistically significant effect from Utilization to Knowledge. By pointing identical but inversely correlated constructs at the same variable, a suppression effect might appear where the stronger construct represses the regression weight of the second one [65,66]. The model showed a plausible regression weight (b = −0.14) from Utilization but repressed the effect from Preservation and caused it to be statistically non-significant. The proxy of behavior (Y key) is the model's last instance. All ellipses are factor-analyzed latents. The preknowledge score is also factor-analyzed, but its item structure is not depicted due to space restrictions. The error terms (uv) are correlated and illustrate the first higher order structure of Preservation and Utilization, which is Support, Care with Resources, and Enjoyment of Nature (Preservation), or Altering Nature, and Human Dominance (Utilization). Regarding Knowledge, we excluded item no. 9 from the factor analysis due to its low loading, its non-significant increase in a pre-post comparison, and because it was by far the most difficult item (Table 2). We then correlated the error terms of the knowledge items belonging together (energy, materials cycling, interrelationships, and change). In a further step, we excluded all non-significant error-term correlations.
Preservation and Utilization correlated highly, which suggested that Utilization is the reversed measurement of Preservation. Because Preservation also showed stronger reliability and validity scores, Utilization yielded weaker and inconsistent factor loadings, and the evidence pointed to one-dimensionality, we proceeded with Preservation only. Figure 3 illustrates the tailored structural equation model without Utilization. Fit indices increased and the effect of Preservation on knowledge became visible. Our large sample size might produce marginal but still significant effects, so we split our cohort into three subsamples (participants in Arizona, Pennsylvania, and Louisiana) to further test for measurement invariance. Multigroup analyses in AMOS provide simultaneous fit indices, so all four models relied on the indices in Figure 3 and indicated that the data fit the model well and thus had good configural invariance. Table 4 shows the regression weights for each state. The regression coefficients of the pre-post comparison (Preservation and Knowledge) were statistically significant throughout all groups. The relationship from Preservation to knowledge was only non-significant for Pennsylvania (p = 0.061). Coefficients from knowledge to the Y key were statistically significant for two states and slightly above the threshold (p = 0.083) in the merged model.
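The suppression effect mentioned above can be illustrated with a deliberately simplified case. The following Python sketch uses the textbook formula for standardized regression weights with two correlated predictors; it is an ordinary two-predictor regression, not the structural equation model of this study. Only the correlation between Preservation and Utilization (r = −0.874) is taken from the results; the function name and the two correlations with knowledge (0.12 and −0.15) are hypothetical values chosen for illustration.

```python
def standardized_betas(r_y1: float, r_y2: float, r_12: float) -> tuple[float, float]:
    """Standardized weights of two correlated predictors of one outcome.

    r_y1, r_y2: correlations of predictor 1 and 2 with the outcome.
    r_12: correlation between the two predictors.
    """
    denominator = 1.0 - r_12 ** 2
    if denominator <= 0:
        raise ValueError("predictors must not be perfectly correlated")
    beta_1 = (r_y1 - r_y2 * r_12) / denominator
    beta_2 = (r_y2 - r_y1 * r_12) / denominator
    return beta_1, beta_2


# Predictor 1 = Preservation, predictor 2 = Utilization, outcome = knowledge.
# The correlations with knowledge are invented; r_12 is the reported value.
beta_pres, beta_util = standardized_betas(r_y1=0.12, r_y2=-0.15, r_12=-0.874)
print(round(beta_pres, 3), round(beta_util, 3))

# With Utilization removed, the weight of Preservation in a simple
# regression equals its correlation with knowledge (here 0.12).
```

In this invented case the positive association of Preservation with knowledge shrinks to a small negative weight once the strongly correlated Utilization enters the model, which is the kind of repression described above.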

Preservation as the Strong Subscale of the 2-MEV-NEP Comparison
Both scales, the 2-MEV and the NEP, appear to measure the same environmental attitude construct. The negative correlation of the NEP with Utilization, which in turn is inversely correlated with Preservation, supports the conformity of both tools. We found only one previous study that identified a correlation between the 2-MEV and the NEP [67]; the reported relationship is substantive (0.51) but not robust, as it rests on just 100 participants. It might therefore be sufficient to apply either scale when assessing environmental attitude. The NEP is an established tool that can discriminate between environmentalists and non-environmentalists cross-culturally [51] and can be used with children [30], adolescents, and adults [29] alike; however, we chose the 2-MEV (for further arguments, see below). The NEP consisted of more items, yet the 2-MEV subscales showed superior validity and reliability scores (confirmed, e.g., by [68]), with the NEP scoring worse than in previous studies [29]. Comparisons of the NEP, however, are difficult because many researchers do not use a standard number of items and sometimes merge the old and new NEP versions [51]; the old 12 NEP items and the new 15 NEP items differ substantially, though both scales fall under the same acronym. Typically, unidimensional scales provide strong reliability values because all items contribute to one higher order factor. It thus seems that the NEP yielded more latent factors, which might turn the strength of its potential multidimensionality into a weakness; the instrument's dimensionality appears to depend too much on the setting and participants, although there is consensus to use the NEP unidimensionally [29]. Moreover, the factor-analyzed 2-MEV, including its construct validity, appears to be more stable than the NEP, which showed several weak factor loadings. Validity and reliability scores might improve when low-loading items are cut; however, others have also raised concerns about the NEP. According to Kaiser et al. [69], the unidimensional NEP insufficiently captures ecological behavior. Schultz and Zelezny [13] deem the NEP deficient as it only reflects environmental concern and disregards further attitudinal concepts, while the 2-MEV might produce a clearer analysis since it assigns each person both a Preservation and a Utilization score [31]. Regarding educational recommendations, these two perspectives provide a dual approach. From an anthropocentric perspective (utilization of nature), personal rewards (e.g., money for recycling) can lead to environmentally friendly actions [70]. From an eco-centric or social-altruistic perspective (preservation of nature), an incentive to care for others or the environment can be helpful [13].
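The reliability comparison above depends on an internal consistency coefficient. As a minimal sketch, assuming Cronbach's alpha is the coefficient of interest (this section does not name the statistic), the calculation could look as follows. The function name and the response data are hypothetical and serve illustration only.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items in the scale
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses (rows: students, columns: items)
example = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 5, 5],
    [3, 2, 3],
    [1, 2, 1],
]
print(round(cronbach_alpha(example), 3))
```

Because alpha rises when items covary strongly, a scale whose items all contribute to one factor tends to score higher than a scale of equal length that splits into several latent factors.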
When relating the 2-MEV subscales Preservation and Utilization to knowledge, our structural equation model supported the postulated model. Since both subscales strongly and inversely correlated, and Utilization repressed the regression weight from Preservation on knowledge, we ran the same model without Utilization. Then, the repressed score between Preservation and knowledge turned into a statistically significant positive effect and fit indices improved. Our decision to proceed without Utilization is based on its weaker reliability and validity scores and on the improved model fit indices without Utilization, and it is in line with the literature. Milfont and Duckitt [10], for instance, confirmed the two-dimensional structure but argued that Preservation had more predictive power. They also found a strong inverse correlation, which pointed towards a bipolar scale with Preservation and Utilization at opposite ends. Yet the authors did not confine themselves to one-dimensionality, as the semantic network technique supported two-dimensionality despite slight overlaps of both constructs; however, even with this approach, participants were more familiar with Preservation, as they showed a more coherent, exhaustive, and positive semantic network for it than for Utilization [71]. Others like Stokols [72] provided a similar but unidimensional scale. He differentiated between a spiritual and an instrumental view of the environment. The first regards the environment as an end in itself, associated with conservation, protection, and preservation. An instrumental perspective points at an exploitative use of nature where humans dominate the object-like environment. Furthermore, there is a trend of decreasing anthropocentrism and increasing biocentrism with age [73], rendering Utilization alone insufficient to broadly capture environmental attitude. 
Gardner [74] claimed that good instruments need to be unidimensional and internally consistent, and factor analysis provides the best evidence for dimensionality [75]; loadings with Preservation outnumber those with Utilization [20], indicating the strength of the underlying construct. Higher reliability scores of Preservation have been shown several times (e.g., [10,67]), and some Utilization items were difficult to use with children, probably due to misunderstandings (e.g., [37]). The 2-MEV with its two higher order factors showed moderate fit indices, as Preservation and Utilization appear associated rather than orthogonal. As a solution, the Utilization scale could either be modified so that it correlates perfectly inversely with Preservation or be omitted. In this study, and in the interest of a concise attitude scale, we proceeded with Preservation only.
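The repression of the Preservation weight by Utilization can be illustrated with the textbook formula for standardized regression weights of two correlated predictors. This is a simplified sketch for observed variables rather than the latent-variable model estimated in this study, and all correlation values below are hypothetical.

```python
def standardized_betas(r_y1, r_y2, r_12):
    """Standardized regression weights for two correlated predictors.

    r_y1, r_y2: correlations of the outcome with predictors 1 and 2
    r_12: correlation between the two predictors
    """
    if abs(r_12) >= 1:
        raise ValueError("predictors are perfectly collinear")
    denom = 1 - r_12 ** 2
    beta_1 = (r_y1 - r_y2 * r_12) / denom
    beta_2 = (r_y2 - r_y1 * r_12) / denom
    return beta_1, beta_2


# Hypothetical correlations, not values from this study
r_kp = 0.20   # knowledge with Preservation
r_ku = -0.22  # knowledge with Utilization
r_pu = -0.80  # Preservation with Utilization (strong inverse relationship)

beta_p, beta_u = standardized_betas(r_kp, r_ku, r_pu)
print(f"Preservation alone: {r_kp:.2f}; with Utilization in the model: {beta_p:.2f}")
```

With these illustrative inputs, the Preservation weight shrinks from 0.20 to roughly 0.07 once Utilization enters the model, because the two predictors share most of their variance. Dropping the overlapping predictor restores the simple relationship.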

Effects of Attitude and a Behavior-Indicator (Y Key) on Knowledge Gains, and the Y Key as a Behavior-Measurement for Upper Elementary School Students
Our third model is restricted to Preservation, knowledge, and the Y key, to reflect the components of the knowledge-attitude-behavior relationship. The model was stable and the results were persistent for all states with slight variations. The overall positive effect of Preservation on cognitive achievement is not new [36] and indicates that the stronger a student's pro-environmental attitude, the more they are likely to learn in the areas of energy, materials cycling, interrelationships, and ecosystem changes. This renders Preservation a useful tool to analyze and strengthen pro-environmental attitude. We suggest the moderate effect in the present study could be caused by several factors: US students showed a weaker pro-environmental attitude in a cross-cultural comparison [13]. This corresponds with international polls stating that US citizens were less concerned about environmental problems [53]. We also face the issue of a restricted system knowledge scale, which covers thirteen items of conceptual knowledge but disregards action-related and effectiveness knowledge, which might relate to stronger attitude and behavior [38].
Although the effects on the Y key as a proxy of behavior were statistically significant for the Arizona and Pennsylvania subsamples, they were not for Louisiana. The score for the whole sample is therefore slightly above the significance level. There are various approaches to assess behavior, and we chose the Y key as it is not a momentary self-reported single response but is earned over the course of a month. This is the time required to manifest new habits. The longer and more deeply a person is immersed, and the more opportunities there are to practice action skills, the more pro-environmental attitude and behavior are encouraged [76]. Evans et al. [53] found no substantive relationship between attitude and behavior among 6- to 8-year-old students cross-culturally, and they reason it might be too difficult to assess behavior within that age group developmentally and methodologically. We suggest that the Y key is a useful tool with children, as it allows them to choose and implement pro-environmental engagement on individual, more flexible terms, and as it involves actual performance instead of self-reported behavior. Self-reports imply the risk of social desirability [37,77], and correlations between environmental attitude and behavior are deemed higher with actual behavior [3]. The Y key also demands parent and teacher engagement and collaborative group work, which are contributing factors to environmental education [33]. Indeed, all students who earned the Y key considerably increased their Preservation and knowledge scores, while the Utilization level decreased, indicating a shift from anthropocentrism to ecocentrism. There were only marginal changes for those who did not gain the Y key. It thus might be a motivator to induce pro-environmental behavior. We nevertheless recognize the additional work required to introduce the Y key and collect data a month later.
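The contrast between students who earned the Y key and those who did not can be expressed as mean pre-post gains per group. The following is a minimal sketch of that bookkeeping; the field names (`pre`, `post`, `y_key`) and the scores are hypothetical and do not reproduce the study's data.

```python
from statistics import mean


def mean_gain_by_group(records):
    """Mean post-minus-pre gain, grouped by whether the Y key was earned."""
    gains = {}
    for rec in records:
        gains.setdefault(rec["y_key"], []).append(rec["post"] - rec["pre"])
    return {group: mean(values) for group, values in gains.items()}


# Hypothetical knowledge scores (0 to 13 points) before and after the program
students = [
    {"pre": 6, "post": 10, "y_key": True},
    {"pre": 7, "post": 10, "y_key": True},
    {"pre": 6, "post": 7, "y_key": False},
    {"pre": 8, "post": 8, "y_key": False},
]
print(mean_gain_by_group(students))
```

The same grouping applies to Preservation and Utilization scores; a descriptive comparison of this kind says nothing about statistical significance, which requires a separate test.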
We had anticipated a stronger relationship between knowledge and the Y key. The modest effect might relate to how the Y key was measured, or to our use of a narrow knowledge scale, which insufficiently captured cognitive achievement. We also have to be aware that neither knowledge nor attitude translates directly into behavior; both are mediated, for instance, through situational constraints (e.g., perceived social pressure, task difficulty) [78] and are only preconditions to behavior [3]. Kaiser et al. [69] refer to such constraints as performance costs and regard knowledge as only one factor that insufficiently determines attitude or behavior. Nevertheless, knowledge is a more proximal outcome of learning, while attitude and behavior develop, e.g., by putting knowledge into practice [39]. In future studies, we aim to extend the knowledge items at the expense of attitude items in order to incorporate the tripartite environmental knowledge scale introduced by Frick et al. [32], which includes action-related and effectiveness knowledge items.

Effects of the Earth Education Program Earthkeepers on Attitude and Knowledge
In a pre-post comparison based on a three-day outdoor earth education program and follow-up pro-environmental engagement, knowledge scores considerably improved. Attitude became more environmentally friendly with Preservation increasing and Utilization decreasing. This is in line with previous findings on the Earthkeepers program (e.g., [52]), on extensive exposure to a 10-day environmental program [36], and on other long- or short-term educational out-of-class experiences [39,64,79,80]. The earth education program Earthkeepers aims at meaningful outdoor learning and underlines human dependence on nature. It is more effective for students to engage in activities that induce personal relevance, particularly when dealing with complex topics such as (global) environmental issues [81], and feeling dependent on nature increases personal commitment to its welfare [82]. Attitudes are then formed by personal experience [83], and outdoor activities are often cited as a catalyst for engagement [76], especially with long-term experiences [33], as attitude and behavior change slowly [35]. Direct learning opportunities prompt affective learning and contextualize cognitive knowledge [44], which supports the knowledge gains and attitude changes of our study. Encouraging others to behave in an environmentally conscious way is thought to be another useful strategy for long-lasting attitude changes [15], which is assessed with the Earthkeepers S key. This renders the Earthkeepers and similar programs fruitful educational programs to engage students in meaningful learning and to induce attitude shifts with the ultimate goal of fostering pro-environmental performance.

Study Limitations
As indicated in the previous paragraphs, study limitations are mainly methodological. At the expense of using two attitude instruments (2-MEV and NEP), we used a narrow knowledge scale, which was confined to system knowledge and disregarded other types of knowledge (action-related and effectiveness knowledge). Shorter scales generally fall short on predictive power, and those other types of knowledge might relate more strongly to pro-environmental behavior. The main purpose of this study was to suggest a concise attitude scale, so in future studies, we aim to elaborate on an environmental knowledge scale. Secondly, though the Y key as a proxy of behavior appears to be a promising predictor for use with children, it is time-consuming to introduce the concept and to collect data one month after program completion. The Y key further relies on teacher and parent support, which might impact the Y key performance. We therefore lack knowledge on whether the children's pro-environmental engagement was intrinsically motivated, or whether teachers and parents acted as extrinsic motivators. This means we cannot differentiate between students having understood or merely having applied knowledge. Third, we used a large sample collected over 9 years of assessment. Although this allows us to draw robust conclusions, as our results hold true for individual years and the total cohort, it entails some drawbacks. Each yearly cohort was exposed to various environmental issues, e.g., through the media, which might have affected the results. Some tests might produce marginal but still statistically significant effects; we therefore ran several analyses with subsamples (years, states, or schools) to guard against such an effect. Lastly, the Earthkeepers program was implemented at three different centers in three different areas of the country. 
While the activities and overall program did not differ, there are certainly some differences in the details of how the program was enacted, the natural environment and climate, and the backgrounds of the participating teachers and students. Those differences do not appear to be substantial enough to cause significant differences in results between the three areas. This way, having three locations turned into a strength as it implies the program itself contributed to pro-environmental changes.
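The concern that large samples render marginal effects statistically significant can be made concrete with a short calculation. The sketch below uses Fisher's z transformation with a normal approximation to obtain a two-sided p-value for a Pearson correlation; the correlation of 0.03 is a hypothetical value, the sample sizes other than 6585 are arbitrary, and the approximation is a swapped-in illustration rather than a test used in this study.

```python
from math import atanh, sqrt
from statistics import NormalDist


def correlation_p_value(r, n):
    """Two-sided p-value for a Pearson correlation via Fisher's z (normal approximation)."""
    if n <= 3:
        raise ValueError("need more than three observations")
    z = atanh(r) * sqrt(n - 3)  # approximately standard normal under H0: rho = 0
    return 2 * (1 - NormalDist().cdf(abs(z)))


weak_r = 0.03  # hypothetical, practically negligible correlation
for n in (100, 1000, 6585):
    print(n, round(correlation_p_value(weak_r, n), 3))
```

The same negligible correlation is far from significant in a sample of 100 but falls below the conventional 0.05 threshold at the full sample size, which is why checking results in smaller subsamples and reporting effect sizes alongside p-values is informative.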

Conclusions
Since there are various measurement tools to assess environmental attitude, and since there is uncertainty about its dimensionality, we aimed to relate two established tools, namely the two-dimensional 2-Major Environmental Values model (2-MEV) and the one-dimensional New Environmental Paradigm scale (NEP), based on data from the earth education program Earthkeepers, which has demonstrated success. We used a quantitative approach with a large sample from three US states (Arizona, Pennsylvania, and Louisiana) to determine the better attitude scale. Although there is sufficient evidence that the 2-MEV and the NEP are valid, reliable tools to assess environmental attitude, in the interest of a concise scale, our study points to the advantages of confining measurement to Preservation (a 2-MEV subscale) as a unidimensional measure. In a stepwise reduction based on AMOS modelling, reliability and validity tests, and supported by the sensitivity to program effects, we reduced the 2-MEV-NEP comparison to Preservation, while both latent factors (Preservation and Utilization) appear to be bipolar rather than orthogonal. As such, Preservation showed a positive effect on system knowledge, rendering it a contributing factor to cognitive achievement. System knowledge related to an indicator of pro-environmental behavior (Y key), and we confirmed our structural equation model in a state comparison with three subsamples (AZ, PA, and LA).
Investigating effects of the three-day earth education program, shifts towards pro-environmental attitude were most prevalent for those who earned the Y key and thus showed pro-environmental engagement within a month after program completion. This suggests assessing the attitude-behavior relationship for children through actual behavior; it also renders the outdoor experience and its incorporation in the classroom setting helpful to induce knowledge and attitude changes towards an environmentally friendly, sustainable lifestyle. Attitude and knowledge assessment follow the goal of program development to promote pro-environmental behavior. We suggest measuring attitude with the Preservation scale (or additional perfectly inversely correlated Utilization items) in order to create more room for further variables within future assessment designs, in which the length of questionnaires is a crucial issue.

Institutional Review Board Statement: Ethical review and approval were waived for this study, due to a determination that the data were from a pre-existing, de-identified data set and so not considered to be human subjects research.
Informed Consent Statement: Participant consent was waived for this study, due to a determination that the data were from a pre-existing, de-identified data set and so not considered to be human subjects research.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to EU data protection.

Acknowledgments:
We are grateful to the UA Earth Education Research and Evaluation team, to the outreach staff involved at all earth education centers, and to all teachers and students who participated in our study.

Conflicts of Interest:
The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Table A1. Thirteen system knowledge items and their multiple choice answers; there is always one correct answer, so the students could score up to thirteen points.

Questions
Multiple Choice Answers
• It could never be an ocean.
• It may someday be an ocean.
• It will for sure someday be an ocean.